Terastore
Introduction
Large data storage computers are very useful for radio science applications. Building such systems from inexpensive disk drives has become fairly easy. The resulting terabyte class data stores (Terastore) can be attached to high speed networks for long term data recording. Constructing a Terastore yourself from inexpensive ATA disks and off the shelf RAID controllers minimizes the cost per gigabyte of storage and also makes it possible to optimize for I/O performance. With the advent of Serial ATA (SATA) it is easier than ever to construct these machines, and a new, denser chassis is available which allows 12 drives in 2U of rack space. In general a Terastore benefits from the following capabilities:
A compact form factor.
A gigabit network interface.
A gigabit control network interface.
A high performance RAID card.
A minimum of 3 SATA disks of large size.
Dual CPUs for load balancing.
Lots of memory to help with I/O buffering.
An internal 64 bit PCI or PCI-X bus architecture for high internal bandwidth.
A 2.5 inch IDE boot drive hidden inside so all bay mounted drives are in the RAID array.
Currently we have two SATA Terastores (rev 1.1) and three ATA Terastores (rev 1.0) in use with the Millstone Hill Radar. The cost (September 2003) for the latest data store was around $5k for 1.2 TB of RAID 5 storage or $7k for 2.4 TB. Some money could be saved on the CPUs and memory. Short depth rackmount form factors might also be useful for some applications. We expect the Terastore capacity to roughly double every 18 months or so at a constant price level.
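The cost figures above can be turned into a cost per gigabyte, and the doubling estimate into a rough projection. The following is a minimal sketch in Python using only the numbers quoted in the text; the function names are ours, and the clean 18 month doubling is the stated expectation, not a measurement.

```python
# Figures quoted in the text (September 2003): approximate cost in USD, capacity in GB.
CONFIGS = {
    "1.2 TB RAID 5": (5000.0, 1200.0),
    "2.4 TB RAID 5": (7000.0, 2400.0),
}

def cost_per_gb(cost_usd, capacity_gb):
    """Cost in dollars per (decimal) gigabyte."""
    return cost_usd / capacity_gb

def projected_capacity_tb(start_tb, months, doubling_months=18.0):
    """Capacity at constant price, assuming a clean doubling every 18 months."""
    return start_tb * 2.0 ** (months / doubling_months)

if __name__ == "__main__":
    for name, (cost, gb) in CONFIGS.items():
        print(f"{name}: ${cost_per_gb(cost, gb):.2f} per GB")
    for months in (0, 18, 36, 54):
        print(f"+{months:2d} months: {projected_capacity_tb(2.4, months):.1f} TB")
```

The second array adds 1.2 TB for roughly $2k, which is why the fully populated box is noticeably cheaper per gigabyte.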
An alternative approach would be to use Network Attached Storage (NAS) boxes, which usually cost more but are easy to set up. Such systems are often somewhat Microsoft focused, run proprietary operating systems, and usually can't be used for computing applications. These traits may be an advantage or a disadvantage depending on the application.
Components
Chassis
Advanced Industrial Computers 2U Rackmount 12 Drive Chassis
quantity : 1 part number : RMC2E2-KI-XPSS
Any vendor will do and other cases can be used. Just be sure they fit your motherboard and drives.
Motherboard
Tyan Tiger i7501S Dual CPU Xeon Motherboard
quantity : 1 part number : S2725G2NR
A high quality dual CPU motherboard with multiple high speed PCI-X buses. Any motherboard can be used although low end ones may not have good performance or reliability. The CPU fans which come with the Intel CPUs work fine with this motherboard.
CPU
Intel Xeon 2.8 GHz processors
quantity : 2 part number : BX80532KE2800DU
Any CPU that the motherboard will accept is fine. Terastores are also often useful for doing some processing on the data so more power is good. Get the boxed version that comes with the fan or be sure to get CPU fans separately.
Memory
Kingston Technology 512 MB PC-2100 Registered ECC DDR DIMM
quantity : 2 part number : KVR266X72RC25/512
Any registered ECC memory that the motherboard will accept is fine. Error correcting memory is used to help ensure reliability. The Tyan motherboard requires DIMMs to be installed in pairs to feed its interleaved memory controller.
PCI Bus Risers
Rackmount Pro 64 bit PCI bus riser card for the RMC2D2-KI-XPSS case.
quantity : 1 part number : RC2-012
There are lots of these bus riser cards and most should work. One is needed to turn the PCI slot on the motherboard through a right angle so that the single RAID controller card fits in the 2U chassis. Some riser cards are better than others since they support the higher speed PCI-X bus better. Cheap riser cards can prevent the PCI-X bus from seeing the RAID controller card properly. Always place the RAID card in the slot electrically closest to the motherboard PCI controller when using multi-slot riser cards. For a multi-slot riser card this will usually be the slot closest to the motherboard itself.
RAID controller
www.3ware.com Escalade 8506 RAID Controller
quantity : 1 part number : Escalade 8506-12
A 3ware RAID card supporting Serial ATA drives on up to 12 channels. Check the compatibility listing for the card you purchase prior to getting the disk drives. The performance specs are pretty accurate for sustained reads and writes under Linux. If you really need the write performance, use the card in a RAID-0 configuration, keeping in mind that RAID-0 provides no redundancy.
Disk Drives
Western Digital 250 GB IDE Serial ATA drives
quantity : 12 part number : WD2500JD
These were the largest compatible drives at the time. New ones appear periodically. Twelve drives were used with the RAID card in two six drive RAID 5 configurations. We built two Terastores at once and purchased the drives in bulk as a 20 pack to save some money (6 drives in one, 12 in the other, and two spares).
Boot Disk Drive
Hitachi 60 GB 2.5 inch IDE Parallel ATA drives
quantity : 1 part number : 08K0939
In order to allow all the drives which fit in the chassis to be part of the RAID array we used an internal 2.5 inch IDE drive as the boot drive for the system. This drive was mounted in front of the power supply by just attaching it to the chassis using standoffs. The Tyan motherboard also has onboard Serial ATA support so a SATA 2.5 inch drive could also be used. Note that there isn't space in the chassis for a larger drive. Most any 2.5 inch drive should work fine but you will need a cable adapter for the parallel ATA drives (2.5 inch to 3.5 inch cable adapter; www.startech.com). We actually used a smaller and slower drive than that listed above but the 60 GB Hitachi is a fast 7200 RPM drive which should perform well.
The current 3ware cards and Linux kernel (2.4) don't support a single logical drive larger than 2 TB, so we split the twelve drives into two six drive RAID 5 arrays. Each array provides 1.2 TB of storage after formatting with ext3 under Linux. Since you get two of these arrays in one 2U box that is 2.4 TB total storage. Future kernels and 3ware firmware revisions will likely support larger single arrays. This is good since there will probably be larger disks available in the not too distant future.
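The arithmetic behind the split can be checked with a few lines of Python. This is a minimal sketch that assumes the standard RAID 5 rule of one drive's worth of parity per array and treats the 2 TB limit as decimal; a single twelve drive array exceeds the limit under either the decimal or the binary reading. The names are ours and the sizes ignore filesystem overhead.

```python
DRIVE_GB = 250    # WD2500JD capacity in decimal gigabytes
LIMIT_GB = 2000   # the "2 TB" limit, taken here as decimal (an assumption)

def raid5_usable_gb(n_drives, drive_gb=DRIVE_GB):
    """RAID 5 stores one drive's worth of parity, leaving n - 1 drives of data."""
    if n_drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (n_drives - 1) * drive_gb

single = raid5_usable_gb(12)   # one twelve drive array
split = raid5_usable_gb(6)     # each of two six drive arrays

print(f"12 drive array: {single} GB, exceeds limit: {single > LIMIT_GB}")
print(f"6 drive array:  {split} GB, exceeds limit: {split > LIMIT_GB}")
print(f"two 6 drive arrays: {2 * split} GB raw")
```

Splitting costs one extra drive of parity (2500 GB raw instead of 2750 GB) but keeps each array under the limit.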
Assembly
Assembly of the Terastore is similar to any other computer. If you don't have experience assembling a computer this isn't the one with which to start. In general everything is easy but there is a lot more cabling for the disk drives. We found that using Serial ATA significantly simplified the assembly since the connectors and wires are much smaller than parallel ATA cables. On top of that Serial ATA provides better performance for the individual drives. Areas where we had some difficulty, along with a few other assembly notes, were:
Our PCI riser card was a cheap one and we needed to place the 3ware card in the slot closest to the motherboard in order for the PCI bus to see the card.
Attaching the motherboard and internal 2.5 inch drive required some minor chassis modifications.
The Serial ATA connectors don't really attach very firmly. This is apparently a known problem which a new cable specification will eventually fix.
Serial ATA cables came with the 3ware controller so we didn't need to purchase separate ones.
Software Installation
After constructing the hardware it is necessary to set up the RAID controller. There is a simple interface that can be reached by rebooting the computer and holding down a hot key. The 3ware manual provides good instructions at this point. The basic process involves creating a RAID volume including the drives you want and then initializing it.
Installing the operating system was easy and a stock copy of RedHat 9.0 was used. RedHat supported the 3ware card out of the box and it shows up as a SCSI drive. The 3ware manual also has instructions for installing the remote monitoring software which allows you to see the state of the RAID store from a web browser.
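Since the arrays show up as SCSI drives, one quick check after installation is to look for them in /proc/partitions. The following is a minimal sketch in Python that parses text in that format; the sample listing, device names, and sizes are hypothetical illustrations and were not captured from a real Terastore.

```python
# Hypothetical /proc/partitions contents: an IDE boot drive plus two RAID 5 arrays.
SAMPLE = """\
major minor  #blocks  name

   3     0   19531250 hda
   3     1   19531000 hda1
   8     0 1220703125 sda
   8    16 1220703125 sdb
"""

def scsi_disks(partitions_text):
    """Return {name: size_gb} for whole sd* disks in /proc/partitions-style text."""
    disks = {}
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric major number; skip the header and blank lines.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        blocks, name = int(fields[2]), fields[3]
        # Whole disks are named sda, sdb, ...; partitions carry a trailing digit.
        if name.startswith("sd") and not name[-1].isdigit():
            disks[name] = blocks * 1024 / 1e9   # blocks are counted in 1 KiB units
    return disks

print(scsi_disks(SAMPLE))
```

On a running system the same function can be applied to the real file with scsi_disks(open("/proc/partitions").read()).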
Troubleshooting
Usually it is a cable problem.
Sometimes the PCI riser card can also cause trouble for high speed PCI-X buses. Putting the card in the earliest PCI slot on the bus can sometimes help this type of issue.
We have had at least one instance of a more serious hardware failure. If a single drive fails you will be able to recover if using a fault tolerant RAID configuration (e.g. RAID-5).
We have had some minor problems with our current units. In particular the boot drive (20 GB Hitachi) of one died. Another of our units has been suffering from problems on multiple drives after getting hit with some high power RF (don't ask). So far we have avoided really catastrophic data loss but it has resulted in painful shuffling.
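The single drive recovery mentioned above works because RAID 5 stores an XOR parity block alongside the data blocks in each stripe. The following minimal Python sketch illustrates that general principle for one stripe of a six drive array; it is not a description of the 3ware firmware, which also rotates parity across the drives, and the block contents are made up.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe across a six drive array: five data blocks plus one parity block.
data = [bytes([value] * 8) for value in (0x11, 0x22, 0x33, 0x44, 0x55)]
parity = xor_blocks(data)

# Simulate losing drive 2 and rebuilding its block from the survivors and parity.
lost = 2
survivors = [block for i, block in enumerate(data) if i != lost]
rebuilt = xor_blocks(survivors + [parity])

print("rebuilt block matches original:", rebuilt == data[lost])
```

If two drives in the same array are lost there is no longer enough information to rebuild either block, which is why multiple drive problems are so much more painful than a single failure.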
Contact Information
Frank Lind of MIT Haystack Observatory can be contacted regarding the Terastore design.