  #46 (permalink)  
Old 08-10-2012, 12:36 AM
Paul Rubin
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

Alex McDonald <blog@rivadpm.com> writes:
> Look, if you're happy with backups to large TB desktop class drives
> and can afford the time and effort to do it several times to avoid the
> lottery that are unrecoverable disk errors, good on you. I'll withdraw
> my "best of luck" comment and reserve it for the companies that take
> your approach but go down the pan while footering around looking for
> an end to end accurate & readable copy to do a restore.


I don't understand what the big deal is.

1) If your data is valuable, you need multiple backups in physically
dispersed locations in case of earthquake, meteor, etc. regardless.

2) The issue of disk errors is handled by redundancy within the
backup set (RAID and maybe some ECC applied within the dump streams),
plus storing checksums in the metadata and doing a verification pass
after writing the data. This is surely more cost-effective than using
drives that are 2x as expensive so you can get by with a few percent
less redundancy.
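
To illustrate the checksum-plus-verification idea in point 2, here is a minimal Python sketch. The function names `write_with_checksum` and `verify` and the choice of SHA-256 are assumptions made for illustration; the post does not name a particular tool or hash. A real verification pass would also have to make sure the re-read comes from the disk rather than from the operating system's cache.

```python
import hashlib
import os


def write_with_checksum(data: bytes, path: str) -> str:
    """Write data to path and return its SHA-256 digest (the stored metadata)."""
    digest = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the data out of Python and OS write buffers
    return digest


def verify(path: str, expected_digest: str, chunk_size: int = 1 << 20) -> bool:
    """Verification pass: re-read the file in chunks and compare digests."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == expected_digest
```

A backup script would store the returned digest next to the dump and call `verify` after writing, and again before relying on the copy for a restore.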

  #47 (permalink)  
Old 08-10-2012, 11:13 AM
Alex McDonald
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

On Aug 10, 1:36 am, Paul Rubin <no.em...@nospam.invalid> wrote:
> Alex McDonald <b...@rivadpm.com> writes:
> > Look, if you're happy with backups to large TB desktop class drives
> > and can afford the time and effort to do it several times to avoid the
> > lottery that are unrecoverable disk errors, good on you. I'll withdraw
> > my "best of luck" comment and reserve it for the companies that take
> > your approach but go down the pan while footering around looking for
> > an end to end accurate & readable copy to do a restore.

>
> I don't understand what the big deal is.
>
> 1) If your data is valuable, you need multiple backups in physically
> dispersed locations in case of earthquake, meteor, etc. regardless.
>
> 2) The issue of disk errors is handled by redundancy within the
> backup set (RAID and maybe some ECC applied within the dump streams),
> plus storing checksums in the metadata and doing a verification pass
> after writing the data. This is surely more cost-effective than using
> drives that are 2x as expensive so you can get by with a few percent
> less redundancy.


We've been over a lot of ground (probably OT for CLF, but even so more
interesting than Gavino on-topic).

I haven't advocated "2x more expensive drives" because I'm paid a
penny on every sale. There was also some discussion about the
bandwidth of shipping data that got lost in airline timetables and the
quality of coffee but I haven't suggested that the airlines should
drop their prices or that datacenters should be near sources of fine
Arabica beans either (well, perhaps I did tongue in cheek to Anton).

All I'm advocating is a robust backup (and I provided some information
to explain what can mitigate the issues of data corruption or loss),
and disk dumps to large multi-TB desktop drives are a no-no in my
book. The rest fell out of that discussion.
  #48 (permalink)  
Old 08-10-2012, 03:57 PM
Anton Ertl
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

Alex McDonald <blog@rivadpm.com> writes:
>On Aug 9, 2:44 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>wrote:
>> Alex McDonald <b...@rivadpm.com> writes:
>> >On Aug 9, 7:00 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>> >wrote:
>> >> Sounds like you swallowed some horror stories some people like to
>> >> spin. Why should spin down exacerbate these problems?

>>
>> >Several reasons.

>>
>> >Rated start/stop cycles; 250 average on/off cycles per year at the
>> >expected population AFR of 0.55% (Seagate Cheetah 15.7, enterprise
>> >class drive).

>>
>> What does AFR have to do with the horror stories about corrupted data?

>
>AFR includes corrupted data.


It includes other failure modes, so this says nothing about spin-down
exacerbating disk corruption.

>> And anyone who uses "enterprise class" drives for backup has too much
>> money.

>
>Why? Since many operations value data integrity greater than the cost,
>this is an economic argument, not one of wealth causing stupidity.


In backup and also in RAIDs, we increase safety/reliability through
redundancy. For a given amount of money, we get more
safety/reliability by using more cheap drives instead of fewer
expensive drives.
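
A rough way to see this argument is to compare the chance of losing every copy, assuming drive failures are independent (which drives sharing one enclosure may not be). The 0.55% figure is the AFR quoted earlier in the thread and the 2x price ratio comes from Paul Rubin's post; the 2% AFR for the cheap drive is a hypothetical value chosen only for illustration.

```python
def loss_probability(annual_failure_rate: float, copies: int) -> float:
    """Chance that all `copies` independent mirrors fail within a year."""
    return annual_failure_rate ** copies


# One expensive drive versus two cheap drives for the same money (2x price ratio).
expensive = loss_probability(0.0055, copies=1)  # AFR quoted in the thread
cheap_pair = loss_probability(0.02, copies=2)   # hypothetical AFR for a cheap drive

print(f"one expensive drive: {expensive:.4%}")
print(f"two cheap drives:    {cheap_pair:.4%}")
```

The model ignores repair time and correlated failures, so it only shows the direction of the effect, not realistic absolute numbers.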

>And AFR includes corrupted data. I'm mystified; where did I say that
>corrupted data was the only issue?


You spun horror stories about data corruption as if it was the main
issue. In my experience it's a minor issue.

>> >Drives vary; SATA drives at 5k RPM spin up faster than high RPM SAS
>> >drives at 15K, which may take minutes to stabilize at operating speed.
>> >During that time, the disk isn't usable, and I stand by my assertion
>> >that spin up wastes as much power as several minutes of full
>> >operation.

>>
>> Sure, if a drive takes several minutes to spin up, it will consume as
>> much power as several minutes of full operation.
>>
>> But who in his right mind uses an expensive and power-hungry high-RPM
>> drive that takes forever to spin up for a storage solution that
>> requires low power and fast spin-up? Ok, a sales guy selling to a
>> clueless and rich customer will do it, but not because of technical
>> merit.

>
>I was giving an example of slow spin up to counterpoint the "10
>seconds and you're good to go" example you gave.


It's an irrelevant example, because nobody in his right mind will use
such drives for such a design. The 10s example is an ordinary 7200rpm
drive. If somebody wanted to use special drives for a spin-down
system and spin-up time is of any relevance, they will choose drives
that spin up at least as fast as the one I measured.

>To spin up a RAID group of say 14 drives on a shelf of disks will
>require that the drives are turned on serially in small groups.


No. If the hardware cannot spin them up at the same time, one will
not choose such a large RAID group. Conversely, if RAID groups of 14
disks are desired, the hardware should be designed so that the group
can be spun up at the same time. For a system that contains 480 disk
drives, dimensioning the power supply such that it can spin up 14
drives at once should be no problem.

>> >I don't know where you got the idea that 480 tape drives was the
>> >equivalent to 480 disk drives, but it's not an assertion I made and
>> >certainly qualifies as insane.

>>
>> You claimed that lots of disks had to be spun up for bandwidth
>> reasons, and you wrote:
>>
>> |It's the economics of competing with tape; big power supplies to
>> |support 480 disks packed in a single rack cost lots of money.
>>
>> which suggest that you think that a backup solution needs 480 disks
>> spun up for bandwidth reasons.

>
>No, that was the COPAN solution. (IIRC it was the smallest COPAN
>system you could buy.)


And you claimed that their solution was insufficient because they
could only spin 25% of the disks, and that that was insufficient
because it limits the bandwidth too much.

>> It's nonsense, because we are backing up to disks with a total of 10s
>> of TB, and it's workable, and if we wanted to back up to more disks,
>> we would just use more disks. And the main bandwidth limit is, as you
>> write, getting the data off the main storage.

>
>That was my point. If you want off-server backup, then the bandwidth
>off the server is the issue.


For our servers, the bandwidth off the server disks is the limit most
of the time, because there is a lot of seeking during backups, and
also, the data is already compressed when it goes off the server.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2012: http://www.euroforth.org/ef12/
  #49 (permalink)  
Old 08-10-2012, 08:41 PM
Mat
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

Hello,
I don't understand why you burden your Forth system (and possible
users) with block access to cassette storage. In my opinion it
would be both simpler and better to stream the whole memory back to
cassette on demand, as load and save times would be slow anyway.

Implementing threaded dispatch for 6502-class CPUs is a bad idea in
my opinion. It would be better to implement a simple native-code
compiler for this processor type.

But please finish your project and show me I'm wrong with these two
cents.

chitselb wrote:
> I'm working on a retro computing project, a 6502 Forth implementation for the Commodore PET 2001. https://github.com/chitselb/pettil if you're curious. The goal is for the language to be fast, tight, and capable of running on the actual hardware. For development I'm using the viceteam.org PET emulator with the xa65 cross-assembler, on Linux.
>
> Since most of us back then (1980) didn't have disk drives, I am going to use the cassette tape for mass storage. These are a few ways I'm considering:
>
> 1) Simulate random access using two cassette decks and copy/merge
>
> The PET cassette had two file types, sequential(data) and program.
> a) For program files, there's a long tone followed by a short header block containing the filename, and then a shorter tone followed by one continuous block of memory (two byte load address followed by the data)
> b) for data files, there's the same long tone/file name header, followed by zero or more short tone/192-byte data blocks
>
> On the PET (not the VIC-20 or C=64) there were two datassette ports, and I have two drives. Using the sequential file format and both decks, FLUSH would copy the entire virtual memory from one tape to the other in 1024-byte blocks (preceded by a 16-bit unsigned block number), inserting and replacing blocks from the memory buffers. Then rewind both tapes and go the other way. Slow, tedious, cumbersome. Welcome to my world in 1980.
>
> 2) Historically accurate
>
> Some Forth implementations from back then implemented tape storage. I have been unable to locate one for the PET but yesterday I found tape images for Datatronic Forth on the C=64 and another thing called "C=64 Forth". Both of these appear to implement some type of mass storage on tape.
>
> I'd be very interested to know what other Forth implementations of that era did as far as tape storage. What Forth words, what did they do, etc...
>
> 3) Save source code as sequential files
>
> Using native named files instead of blocks. Not very Forth-like, but possibly the most expedient.
>
> I'm very grateful for the help of this community with my earlier design considerations (circa 2010) on this project, particularly the hashed dictionary and the incredibly fast inner interpreter. Check the project link above if you're curious to see how those parts turned out.
>
> Charlie

  #50 (permalink)  
Old 08-10-2012, 09:54 PM
Coos Haak
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

On Fri, 10 Aug 2012 13:41:09 -0700 (PDT), Mat wrote:

> Hello,
> I don't understand why you burden your Forth system (and possible
> users) with block access to cassette storage. In my opinion it
> would be both simpler and better to stream the whole memory back to
> cassette on demand, as load and save times would be slow anyway.
>
> Implementing threaded dispatch for 6502-class CPUs is a bad idea in
> my opinion. It would be better to implement a simple native-code
> compiler for this processor type.
>
> But please finish your project and show me I'm wrong with these two
> cents.
>

What if you have 100 blocks (100 KB) of data? How would you load that many in
one go into the memory of an 8-bit computer?

--
Coos

CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html
  #51 (permalink)  
Old 08-10-2012, 10:41 PM
dambere@web.de
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

On Friday, 10 August 2012 23:54:32 UTC+2, Coos Haak wrote:
> What if you have 100 blocks (100 KB) of data? How would you load that many in
> one go into the memory of an 8-bit computer?


Typical PETs had 8 KB of RAM! How would you process 100 KB of data on such a platform? You would process the data in chunks of a few KB. This would also be possible with a Forth system that streams its state back to tape, because that does not prevent words that load and save processed data on demand. Instead of block access without motor control, which requires manual position adjustments for each data block regardless of its use, you would gain the freedom to format a tape to suit a specific task, so that all that is needed for processing is to press the play and stop keys.
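
The chunked processing described here can be sketched in a few lines of Python. The 4 KB chunk size and the byte sum used as a stand-in for real processing are assumptions for illustration only; `io.BytesIO` plays the role of the tape stream.

```python
import io


def process_in_chunks(stream, chunk_size=4096):
    """Consume a sequential stream piece by piece, never holding it all in memory.

    Returns (running total of all byte values, number of chunks read).
    """
    total = 0
    chunks = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # end of tape
            break
        total += sum(chunk)    # placeholder for the real per-chunk work
        chunks += 1
    return total, chunks


# 100 KB of data passes through a 4 KB buffer.
tape = io.BytesIO(bytes([1]) * (100 * 1024))
result = process_in_chunks(tape)
```

Only one chunk is resident at a time, which is the property that matters on a machine with 8 KB of RAM.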
  #52 (permalink)  
Old 08-10-2012, 11:47 PM
Coos Haak
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

On Fri, 10 Aug 2012 15:41:45 -0700 (PDT), dambere@web.de wrote:

> On Friday, 10 August 2012 23:54:32 UTC+2, Coos Haak wrote:
>> What if you have 100 blocks (100 KB) of data? How would you load that many in
>> one go into the memory of an 8-bit computer?
>
> Typical PETs had 8 KB of RAM! How would you process 100 KB of data on such a platform? You would process the data in chunks of a few KB. This would also be possible with a Forth system that streams its state back to tape, because that does not prevent words that load and save processed data on demand. Instead of block access without motor control, which requires manual position adjustments for each data block regardless of its use, you would gain the freedom to format a tape to suit a specific task, so that all that is needed for processing is to press the play and stop keys.


Of course, that's what I meant. Blocks are neat for this sort of work. I
used a cassette drive with C90 cassettes in 1981, but not for long. My ZX
Spectrum had two microdrives that I could control from within my own Forth.
Much faster and simpler than pressing buttons on the previously mentioned computer.

--
Coos

CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html
  #53 (permalink)  
Old 08-11-2012, 08:50 AM
Andrew Haley
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

Mat <dambere@web.de> wrote:

> Implementing threaded dispatch for 6502-class CPUs is a bad idea in
> my opinion. It would be better to implement a simple native-code
> compiler for this processor type.


I'm sure you're right about speed, but there isn't much memory, and
for that reason all the language implementations I came across at the
time used some sort of interpretation, whether Forth or Pascal.
6502 Pascal was even more compact than Forth, using a very tight
bytecode. Maybe a JSR-threaded Forth would be OK, but that's still a
considerable code expansion.

Andrew.
  #54 (permalink)  
Old 08-11-2012, 09:03 AM
Anton Ertl
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

Andrew Haley <andrew29@littlepinkcloud.invalid> writes:
>Mat <dambere@web.de> wrote:
>
>> Implementing threaded dispatch for 6502-class CPUs is a bad idea in
>> my opinion. It would be better to implement a simple native-code
>> compiler for this processor type.

>
>I'm sure you're right about speed,


A JSR-RTS pair is 12 cycles, and the OP mentioned 17 cycles for his
NEXT, so yes, subroutine threading would be a little faster.

> but there isn't much memory, and
>for that reason all the language implementations I came across at the
>time used some sort of interpretation, whether Forth or Pascal.
>6502 Pascal was even more compact than Forth, using a very tight
>bytecode. Maybe a JSR-threaded Forth would be OK, but that's still a
>considerable code expansion.


Going in the other direction, if the target machine has only 8KB,
there probably won't be more than 256 words anyway, so one could
represent words with bytes in interpreted code. NEXT would probably
be a little slower, though.
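
The byte-token idea can be sketched as a tiny interpreter in Python. The four primitives, their token numbers, and the `run` function are hypothetical and exist only to show the dispatch scheme; they are not taken from any particular Forth.

```python
# Each word is identified by a one-byte token, so at most 256 words can exist.
LIT, DUP, ADD, MUL = range(4)


def op_lit(stack, code, ip):
    stack.append(code[ip])      # the literal is the byte following the token
    return ip + 1


def op_dup(stack, code, ip):
    stack.append(stack[-1])
    return ip


def op_add(stack, code, ip):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)
    return ip


def op_mul(stack, code, ip):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)
    return ip


TABLE = [op_lit, op_dup, op_add, op_mul]  # token -> primitive


def run(code: bytes):
    """Inner interpreter: fetch one byte, look it up, dispatch."""
    stack, ip = [], 0
    while ip < len(code):
        token = code[ip]
        ip = TABLE[token](stack, code, ip + 1)
    return stack


# 3 DUP * 4 +  leaves 13 on the stack.
program = bytes([LIT, 3, DUP, MUL, LIT, 4, ADD])
result = run(program)
```

On the 6502 a JSR instruction occupies three bytes and a 16-bit threaded-code cell two, so one-byte tokens are the most compact of the three encodings; the extra table lookup is the reason NEXT would likely be slower.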

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2012: http://www.euroforth.org/ef12/
  #55 (permalink)  
Old 08-11-2012, 09:08 PM
Andrew Haley
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
> Andrew Haley <andrew29@littlepinkcloud.invalid> writes:
>>Mat <dambere@web.de> wrote:
>>
>>> Implementing threaded dispatch for 6502-class CPUs is a bad idea in
>>> my opinion. It would be better to implement a simple native-code
>>> compiler for this processor type.

>>
>>I'm sure you're right about speed,

>
> A JSR-RTS pair is 12 cycles, and the OP mentioned 17 cycles for his
> NEXT, so yes, subroutine threading would be a little faster.


Right, but it's a bit better than that would suggest because enter and
exit are pretty fast too. A problem with JSR threading is that it
would make multi-tasking very messy because the return stack has to
live on Page 1; that might not matter to some but would spoil it for
me. (6502 fig-FORTH had a similar problem because the data stack had
to live in Page 0.)

>>but there isn't much memory, and for that reason all the language
>>implementations I came across at the time used some sort of
>>interpretation, whether Forth or Pascal. 6502 Pascal was even more
>>compact than Forth, using a very tight bytecode. Maybe a
>>JSR-threaded Forth would be OK, but that's still a considerable code
>>expansion.

>
> Going in the other direction, if the target machine has only 8KB,
> there probably won't be more than 256 words anyway, so one could
> represent words with bytes in interpreted code.


I'm a bit baffled by all this "8kbyte PET" talk. I don't think I ever
saw one with only 8k.

Andrew.
  #56 (permalink)  
Old 08-12-2012, 03:27 AM
Paul Rubin
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

Alex McDonald <blog@rivadpm.com> writes:
> All I'm advocating is a robust backup (and I provided some information
> to explain what can mitigate the issues of data corruption or loss),
> and disk dumps to large multi-TB desktop drives are a no-no in my
> book. The rest fell out of that discussion.


OK, I'm just missing the part about what's wrong with desktop drives
compared with enterprise drives. You listed a number of issues but it
seems to me that all of them can be handled by software. When 100's or
1000's of drives are involved, a 2x cost difference per drive adds up to
a lot of cash, so it has to be justified rather rigorously.
  #57 (permalink)  
Old 08-12-2012, 07:51 AM
Elizabeth D. Rather
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

On 8/10/12 10:41 AM, Mat wrote:
> Hello,
> I don't understand why you burden your Forth system (and possible
> users) with block access to cassette storage. In my opinion it
> would be both simpler and better to stream the whole memory back to
> cassette on demand, as load and save times would be slow anyway.
>
> Implementing threaded dispatch for 6502-class CPUs is a bad idea in
> my opinion. It would be better to implement a simple native-code
> compiler for this processor type.
>
> But please finish your project and show me I'm wrong with these two
> cents.


A memory transfer is appropriate for saving a program image, but I
believe the OP wanted to use tape for source and data, as well. Native
Forths have traditionally organized mass storage as 1024-char blocks
(source blocks were formatted as 16 lines of 64 chars for display and
editing). Managing such blocks as records on a tape can be made to work,
although of course it's slow.

One of the very early (1970-72) Forth systems at NRAO had only tape for
mass storage, and another had a 64 Kb drum plus tape. These systems kept
the program image in the first (longish) record on the tape, for booting
purposes, and 1024-byte records following for source and data. The
system maintained an index of blocks on tape (beyond the program image)
for pseudo-random access. It was better than nothing :-)
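
The block layout and the tape index described above can be sketched as follows. This is only an illustration in Python under stated assumptions: the class name `TapeIndex`, its methods, and the rule that the head stops just past a record after reading it are all hypothetical, not a description of the NRAO systems.

```python
BLOCK_SIZE = 1024          # bytes per Forth block
LINES, COLUMNS = 16, 64    # source blocks display as 16 lines of 64 chars


def block_lines(block: bytes):
    """Split one 1024-byte block into its 16 display lines."""
    assert len(block) == BLOCK_SIZE
    return [block[i:i + COLUMNS] for i in range(0, BLOCK_SIZE, COLUMNS)]


class TapeIndex:
    """Maps block numbers to record positions after the boot image."""

    def __init__(self):
        self.position_of = {}  # block number -> record position on tape
        self.head = 0          # record the tape head is currently in front of

    def append(self, block_number):
        """Register a block written at the current end of the tape."""
        self.position_of[block_number] = len(self.position_of)

    def records_to_skip(self, block_number):
        """Signed number of records to wind before reading the block.

        Positive means wind forward, negative means rewind. The head is
        assumed to stop just past the record once it has been read.
        """
        target = self.position_of[block_number]
        delta = target - self.head
        self.head = target + 1
        return delta
```

With such an index, access is "pseudo-random" in the sense that any block can be requested by number, while the cost of each access still depends on how far the tape has to move.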

We were delighted to get our new PDP-11 in 1973, with a 1.25 Mb
removable disk.

Cheers,
Elizabeth



--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================
  #58 (permalink)  
Old 08-13-2012, 12:23 PM
Alex McDonald
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

On Aug 10, 4:57 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Alex McDonald <b...@rivadpm.com> writes:
> >On Aug 9, 2:44 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> >wrote:
> >> Alex McDonald <b...@rivadpm.com> writes:
> >> >On Aug 9, 7:00 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> >> >wrote:
> >> >> Sounds like you swallowed some horror stories some people like to
> >> >> spin. Why should spin down exacerbate these problems?

>
> >> >Several reasons.

>
> >> >Rated start/stop cycles; 250 average on/off cycles per year at the
> >> >expected population AFR of 0.55% (Seagate Cheetah 15.7, enterprise
> >> >class drive).

>
> >> What does AFR have to do with the horror stories about corrupted data?

>
> >AFR includes corrupted data.

>
> It includes other failure modes, so this says nothing about spin-down
> exacerbating disk corruption.
>
> >> And anyone who uses "enterprise class" drives for backup has too much
> >> money.

>
> >Why? Since many operations value data integrity greater than the cost,
> >this is an economic argument, not one of wealth causing stupidity.

>
> In backup and also in RAIDs, we increase safety/reliability through
> redundancy. For a given amount of money, we get more
> safety/reliability by using more cheap drives instead of fewer
> expensive drives.
>
> >And AFR includes corrupted data. I'm mystified; where did I say that
> >corrupted data was the only issue?

>
> You spun horror stories about data corruption as if it was the main
> issue. In my experience it's a minor issue.
>
> >> >Drives vary; SATA drives at 5k RPM spin up faster than high RPM SAS
> >> >drives at 15K, which may take minutes to stabilize at operating speed.
> >> >During that time, the disk isn't usable, and I stand by my assertion
> >> >that spin up wastes as much power as several minutes of full
> >> >operation.

>
> >> Sure, if a drive takes several minutes to spin up, it will consume as
> >> much power as several minutes of full operation.

>
> >> But who in his right mind uses an expensive and power-hungry high-RPM
> >> drive that takes forever to spin up for a storage solution that
> >> requires low power and fast spin-up? Ok, a sales guy selling to a
> >> clueless and rich customer will do it, but not because of technical
> >> merit.

>
> >I was giving an example of slow spin up to counterpoint the "10
> >seconds and you're good to go" example you gave.

>
> It's an irrelevant example, because nobody in his right mind will use
> such drives for such a design. The 10s example is an ordinary 7200rpm
> drive. If somebody wanted to use special drives for a spin-down
> system and spin-up time is of any relevance, they will choose drives
> that spin up at least as fast as the one I measured.
>
> >To spin up a RAID group of say 14 drives on a shelf of disks will
> >require that the drives are turned on serially in small groups.

>
> No. If the hardware cannot spin them up at the same time, one will
> not choose such a large RAID group. Conversely, if RAID groups of 14
> disks are desired, the hardware should be designed so that the group
> can be spun up at the same time. For a system that contains 480 disk
> drives, dimensioning the power supply such that it can spin up 14
> drives at once should be no problem.


The power supplies are relatively small, and serve a disk shelf that
is (commonly) organised in groups that can be fitted in a 19-inch rack
system. 14 drives or more is a common number, although very dense 48
drive systems are also available. In total, the loading on a rack of
such shelves (which may hold several hundred drives or more)
cannot exceed certain limits in terms of amperage due to the heat
generated; 15 kW or more of heat from a rack is difficult to dissipate.
Scaling the power supplies on a shelf to support powering up 14 drives
simultaneously means that most of the time the supplies are operating at
low loads, which is where power supplies are very inefficient; running
them near their maximum rating is preferable, when conversion efficiency
can be 90% or better.

RAID group size is (relatively) small for RAID-5 type schemes;
normally no more than 6+1 parity or so. Dual parity schemes may employ
12+2 up to around 16+2. Much higher than these limits, and RAID
rebuild times become prohibitively long and riskier because of
further failures during the rebuild; much lower, and space efficiency
suffers and performance is compromised by the loss of parallelism.
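
The space-efficiency side of this trade-off is simple arithmetic: usable capacity is the number of data drives divided by the total number of drives in the group. A short Python sketch follows, using the group sizes named above plus a hypothetical 2+1 group to show what "much lower" costs.

```python
def space_efficiency(data_drives: int, parity_drives: int) -> float:
    """Fraction of raw capacity left for data in one RAID group."""
    return data_drives / (data_drives + parity_drives)


# 6+1, 12+2 and 16+2 are from the post; 2+1 is a hypothetical small group.
for data, parity in [(6, 1), (12, 2), (16, 2), (2, 1)]:
    print(f"{data}+{parity}: {space_efficiency(data, parity):.1%} usable")
```

The calculation says nothing about rebuild time or failure risk, which are the other half of the argument.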

>
> >> >I don't know where you got the idea that 480 tape drives was the
> >> >equivalent to 480 disk drives, but it's not an assertion I made and
> >> >certainly qualifies as insane.

>
> >> You claimed that lots of disks had to be spun up for bandwidth
> >> reasons, and you wrote:

>
> >> |It's the economics of competing with tape; big power supplies to
> >> |support 480 disks packed in a single rack cost lots of money.

>
> >> which suggest that you think that a backup solution needs 480 disks
> >> spun up for bandwidth reasons.

>
> >No, that was the COPAN solution. (IIRC it was the smallest COPAN
> >system you could buy.)

>
> And you claimed that their solution was insufficient because they
> could only spin 25% of the disks, and that that was insufficient
> because it limits the bandwidth too much.


It may do. Bandwidth is a problem both during massively parallel backups
and because of the design of the shelves. Many systems employ a bus into
which the disks are plugged; disks are addressed via two or more Fibre
Channel arbitrated loops or a multi-path SAS arrangement (even for
SATA disks). Getting parallelism on such a system requires many
shelves to be active, and the RAID groups are sometimes split across
them, since a single shelf doesn't deliver max-bandwidth = (disk
bandwidth * number of disks). That is why having only 25% of the shelves
powered on in the COPAN system limited bandwidth. (Some systems employ
"active" servers supporting the 14+ disks that make up a shelf and can drive
higher sequential (but not random) bandwidth rates, but they are very
power hungry indeed.)

>
> >> It's nonsense, because we are backing up to disks with a total of 10s
> >> of TB, and it's workable, and if we wanted to back up to more disks,
> >> we would just use more disks. And the main bandwidth limit is, as you
> >> write, getting the data off the main storage.

>
> >That was my point. If you want off-server backup, then the bandwidth
> >off the server is the issue.

>
> For our servers, the bandwidth off the server disks is the limit most
> of the time, because there is a lot of seeking during backups, and
> also, the data is already compressed when it goes off the server.


A smart backup program can reduce seeks by sorting, say, a snapshot of
the disk, so that it reads blocks serially. Enterprise
class disks often support "skip read" semantics that can reduce the
requirement to seek when reading data from a single track. The order
in which the blocks are read & sent is immaterial to the construction
of a backup on the target.
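
The effect of sorting reads by position can be illustrated with a toy model in Python. The block offsets are made-up numbers, and total head travel is used as a crude stand-in for seek time; real drives and file systems are considerably more complicated.

```python
def head_travel(offsets, start=0):
    """Total distance the head moves when visiting offsets in the given order."""
    travel, position = 0, start
    for offset in offsets:
        travel += abs(offset - position)
        position = offset
    return travel


# Hypothetical block offsets in the order a file-by-file backup might hit them.
file_order = [900, 10, 500, 20, 880]

unsorted_cost = head_travel(file_order)
sorted_cost = head_travel(sorted(file_order))  # read serially across the disk
```

Because the target can store blocks in whatever order they arrive, the reader is free to pick the cheaper order.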

>
> - anton
> --
> M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
> comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
> New standard: http://www.forth200x.org/forth200x.html
> EuroForth 2012: http://www.euroforth.org/ef12/


  #59 (permalink)  
Old 08-15-2012, 03:13 PM
Anton Ertl
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

Alex McDonald <blog@rivadpm.com> writes:
>On Aug 10, 4:57 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>wrote:
>> Alex McDonald <b...@rivadpm.com> writes:
>> >On Aug 9, 2:44 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>> >wrote:
>> >> Alex McDonald <b...@rivadpm.com> writes:
>> >> >On Aug 9, 7:00 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>> >> >wrote:



>> >To spin up a RAID group of say 14 drives on a shelf of disks will
>> >require that the drives are turned on serially in small groups.

>>
>> No. If the hardware cannot spin them up at the same time, one will
>> not choose such a large RAID group. Conversely, if RAID groups of 14
>> disks are desired, the hardware should be designed so that the group
>> can be spun up at the same time. For a system that contains 480 disk
>> drives, dimensioning the power supply such that it can spin up 14
>> drives at once should be no problem.

>
>The power supplies are relatively small, and serve a disk shelf that
>is (commonly) organised in groups that can be fitted in a 19inch rack
>system. 14 drives or more is a common number, although very dense 48
>drive systems are also available. In total, the loading on a rack of
>such shelves (which may be in the mid hundreds to more of drives)
>cannot exceed certain limits in terms of amperage due to the heat
>generated; 15KW or more heat from a rack is difficult to dissipate.


If I have several power supplies, each powering a bunch of drives, I
would distribute the RAID group across these bunches, ideally one
drive per bunch. A nice side benefit is that the system can now
survive a power supply failure without needing any additional power
supply redundancy (not sure if the following rebuilding of lots of
RAID groups on power supply failure is practical, though, but if it
isn't, then we'll just have to bite the bullet and provide power
supply redundancy after all). So, to spin up a whole RAID group at
the same time, each power supply only needs to be able to support
spinning up one drive.
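
The layout proposed here, one member of each RAID group per power supply, can be sketched in Python. The functions `layout` and `max_spinup_load` and the figure of 4 drives per supply are hypothetical; only the group size of 14 comes from the discussion.

```python
from collections import Counter


def layout(num_supplies, slots_per_supply):
    """Group k is slot k on every supply, so each supply feeds one member.

    Drives are identified as (supply, slot) pairs.
    """
    return [
        [(supply, slot) for supply in range(num_supplies)]
        for slot in range(slots_per_supply)
    ]


def max_spinup_load(group):
    """Largest number of drives any single supply must spin up for this group."""
    per_supply = Counter(supply for supply, _slot in group)
    return max(per_supply.values())


groups = layout(num_supplies=14, slots_per_supply=4)
```

With this arrangement, spinning up a 14-drive group asks each supply for a single drive, and losing one supply removes exactly one member from every group.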

15KW would allow spinning up 480 drives at the same time (and would
also be necessary to let 480 LTO-5 tape drives work at the same time).

>Scaling power supplies on a shelf to support 14 drives power-up
>simultaneously means that most of the time supplies are operating at
>low loads, which is where power supplies are very inefficient; running
>them near their maximum rating is preferable, when conversion rates
>can be 90% or better.


Typical power supplies are relatively efficient across a pretty wide
range, and the highest efficiency is not at maximum load. E.g.,
looking at
http://www.anandtech.com/show/6013/3...1-cheap-psus/3,
i.e., even looking at a cheap power supply, there is relatively little
efficiency variation between 20% and 110% load, and the highest
efficiency is at 50% load. I also looked at the next one in the test
(FSP OEM 400W) and find the same pattern there.

>> For our servers, the bandwidth off the server disks is the limit most
>> of the time, because there is a lot of seeking during backups, and
>> also, the data is already compressed when it goes off the server.

>
>A smart backup program can reduce seeks by sorting, say, a snapshot of
>the disk, to reduce the seeks and read blocks serially. Enterprise
>class disks often support "skip read" semantics that can reduce the
>requirement to seek when reading data from a single track.


Any commodity drive I or my students have measured in the last 15
years or so has cached the data of several tracks for reading (I guess
but have not confirmed that in particular they cache data that they
read while waiting for the disk to rotate to the target sector, but if
there was no request right afterward, probably also the rest of the
track), and that's why some OS-side optimizations we (and others) did
were not as effective as I expected: the drives already did part of
them for us. Anyway, I had not heard that this is a marketing feature
for enterprise drives, and Google has not heard about "skip read
semantics", either.

Concerning our backup program, we are just using tar instead of a
smart one. It's good enough for our needs, but it does not optimize
disk reads the way you suggest (it would need to know more about file
systems than I find comfortable to do that), so there's quite a bit of
waiting for disk seeks despite drive caches.
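For illustration only, a cheap approximation of such sorting is to read files in inode order rather than directory order. This is a sketch under a loud assumption: that inode numbers correlate with on-disk placement, which holds only loosely and only on some file systems. It is not what tar does, and the function name files_in_inode_order is made up for this example.

```python
import os

def files_in_inode_order(root):
    """Return the paths of all files under root, sorted by inode number.

    ASSUMPTION: inode order roughly follows on-disk order. That is a
    heuristic, not a guarantee; a real tool would need the file system's
    actual block layout.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)   # lstat: do not follow symlinks
            except OSError:
                continue              # file vanished or is unreadable; skip it
            entries.append((st.st_ino, path))
    entries.sort()                    # sort by inode number first
    return [path for _ino, path in entries]
```

The resulting list could then be fed to the archiver as an explicit file list, so that it reads the files in that order.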

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2012: http://www.euroforth.org/ef12/
  #60 (permalink)  
Old 08-15-2012, 06:57 PM
Alex McDonald
Guest
 
Posts: n/a
Default Re: Implementing virtual memory on cassette tape

On Aug 15, 4:13 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Alex McDonald <b...@rivadpm.com> writes:
> >On Aug 10, 4:57 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> >wrote:
> >> Alex McDonald <b...@rivadpm.com> writes:
> >> >On Aug 9, 2:44 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> >> >wrote:
> >> >> Alex McDonald <b...@rivadpm.com> writes:
> >> >> >On Aug 9, 7:00 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> >> >> >wrote:
> >> >To spin up a RAID group of say 14 drives on a shelf of disks will
> >> >require that the drives are turned on serially in small groups.

>
> >> No. If the hardware cannot spin them up at the same time, one will
> >> not choose such a large RAID group. Conversely, if RAID groups of 14
> >> disks are desired, the hardware should be designed so that the group
> >> can be spun up at the same time. For a system that contains 480 disk
> >> drives, dimensioning the power supply such that it can spin up 14
> >> drives at once should be no problem.

>
> >The power supplies are relatively small, and serve a disk shelf that
> >is (commonly) organised in groups that can be fitted in a 19inch rack
> >system. 14 drives or more is a common number, although very dense 48
> >drive systems are also available. In total, the loading on a rack of
> >such shelves (which may be in the mid hundreds to more of drives)
> >cannot exceed certain limits in terms of amperage due to the heat
> >generated; 15KW or more heat from a rack is difficult to dissipate.

>
> If I have several power supplies, each powering a bunch of drives, I
> would distribute the RAID group across these bunches, ideally one
> drive per bunch.


As a practice, that leads to issues with failure modes at the shelf
level. For instance, a failure of a single shelf with something as
simple as a tripped power supply, where the disks in that shelf
contribute to (say) 10 RAID groups may cause 10 simultaneous RAID
rebuilds requiring the involvement of several hundred drives. I have
seen this happen on an HP EVA, where their vdisk RAID supports such a
scheme (although it was not recommended); the resulting mess is not
pretty. It also requires a very large number of spare drives for such
a rebuild.

> A nice side benefit is that the system can now
> survive a power supply failure without needing any additional power
> supply redundancy (not sure if the following rebuilding of lots of
> RAID groups on power supply failure is practical, though, but if it
> isn't, then we'll just have to bite the bullet and provide power
> supply redundancy after all). So, to spin up a whole RAID group at
> the same time, each power supply only needs to be able to support
> spinning up one drive.
>
> 15KW would allow spinning up 480 drives at the same time (and would
> also be necessary to let 480 LTO-5 tape drives work at the same time).
>
> >Scaling power supplies on a shelf to support 14 drives power-up
> >simultaneously means that most of the time supplies are operating at
> >low loads, which is where power supplies are very inefficient; running
> >them near their maximum rating is preferable, when conversion rates
> >can be 90% or better.

>
> Typical power supplies are relatively efficient across a pretty wide
> range, and the highest efficiency is not at maximum load. E.g.,
> looking at http://www.anandtech.com/show/6013/350450w-roundup-11-cheap-psus/3,
> i.e., even looking at a cheap power supply, there is relatively little
> efficiency variation between 20% and 110% load, and the highest
> efficiency is at 50% load. I also looked at the next one in the test
> (FSP OEM 400W) and find the same pattern there.


Here's an analysis of power efficiencies in a data center you might
find interesting. It confirms your measurements.
http://www.thegreengrid.org/~/media/...08.pdf?lang=en.

Take a 14 disk system with 2 power supplies. For one power supply to
support all 14 drives plus spinning up 1 requires around 17 disks'
worth of power (approx; e.g. 20W spin-up as opposed to 7W in use for a
single drive) from a single PSU running at 100%. With both running in
steady state, each supports 7 drives, and the PSUs are now running at
approx 45% load or less. Sizing a single power supply to support 14
simultaneous spin-ups would leave the pair of PSUs running nearer the
20% mark in steady state, where they are less efficient.
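The arithmetic can be sketched as follows. The 7W and 20W figures are the approximate ones given above; the two sizing rules are one reading of the description (the wording is loose), and the variable and function names are made up for the example.

```python
STEADY_W = 7     # approx. per-drive draw in use (figure from the post)
SPIN_UP_W = 20   # approx. per-drive draw while spinning up (figure from the post)
DRIVES = 14
PSUS = 2

def steady_load_fraction(psu_rating_w):
    """Load on each PSU, as a fraction of its rating, when all PSUs
    share the steady-state draw of the shelf equally."""
    per_psu_w = DRIVES * STEADY_W / PSUS
    return per_psu_w / psu_rating_w

# ASSUMED sizing A: one PSU alone carries 14 running drives plus 1 spin-up.
rating_staggered_w = DRIVES * STEADY_W + SPIN_UP_W
# ASSUMED sizing B: one PSU alone spins up all 14 drives at once.
rating_simultaneous_w = DRIVES * SPIN_UP_W

for label, rating in (("staggered", rating_staggered_w),
                      ("simultaneous", rating_simultaneous_w)):
    print(f"{label}: rating {rating} W "
          f"({rating / STEADY_W:.1f} drive-equivalents), "
          f"steady load per PSU {steady_load_fraction(rating):.1%}")
```

With these numbers the staggered sizing comes to 118W (about 17 drives' worth) and a steady load of about 41.5% per PSU, while the simultaneous sizing comes to 280W and a steady load of 17.5% per PSU.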

>
> >> For our servers, the bandwidth off the server disks is the limit most
> >> of the time, because there is a lot of seeking during backups, and
> >> also, the data is already compressed when it goes off the server.

>
> >A smart backup program can reduce seeks by sorting, say, a snapshot of
> >the disk, to reduce the seeks and read blocks serially. Enterprise
> >class disks often support "skip read" semantics that can reduce the
> >requirement to seek when reading data from a single track.

>
> Any commodity drive I or my students have measured in the last 15
> years or so has cached the data of several tracks for reading (I guess
> but have not confirmed that in particular they cache data that they
> read while waiting for the disk to rotate to the target sector, but if
> there was no request right afterward, probably also the rest of the
> track), and that's why some OS-side optimizations we (and others) did
> were not as effective as I expected: the drives already did part of
> them for us. Anyway, I had not heard that this is a marketing feature
> for enterprise drives, and Google has not heard about "skip read
> semantics", either.


See http://ps-2.kev009.com/rs6000/manual...26-7297-01.PDF
for an example (page 66) and http://www.ibmsystemsmag.com/getatta...-adce31c776ab/
for a diagram of 1 command with skip vs 2 commands with no skip.

This kind of feature is only available on enterprise class drives.

>
> Concerning our backup program, we are just using tar instead of a
> smart one. It's good enough for our needs, but it does not optimize
> disk reads the way you suggest (it would need to know more about file
> systems than I find comfortable to do that), so there's quite a bit of
> waiting for disk seeks despite drive caches.
>
> - anton
> --
> M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
> comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
> New standard: http://www.forth200x.org/forth200x.html
> EuroForth 2012: http://www.euroforth.org/ef12/

