IDE RAID Without Additional Hardware: Do It Yourself With Windows 2000
These days, we expect technology to hold true to its promise of ever-increasing speeds, and so we find ourselves waiting impatiently at the PC as the system boots, launches an application, or copies a large amount of data. If time really is that critical for you, then consider optimizing your hard disk or storage performance.
There are several ways to speed up storage performance and increase data safety. The most effective is to use drives that consist entirely of RAM chips (so-called Solid-State Disks). As you can imagine, such drives are very expensive and are thus out of the question for most users.
RAID systems are a more popular and affordable option, but only if they are set up with IDE drives! SCSI is still more expensive, but more on that later. Basically, RAID is a system to increase both performance and data safety by using multiple drives (RAID: Redundant Array of Independent Drives). However, many users do not consider Mode 0 (striping) to be a viable RAID mode because it does not add redundancy, meaning that it puts your data at high risk: if one drive fails, all data on the whole drive array will be lost.
RAID 0 arrays are not meant to be a long-term storage solution due to the safety issue, but they are very well suited to fast short-term storage (swap file, temporary drive, video editing, video encoding).
Data Fragmentation: A Performance Killer
There are several reasons why we believe, or at least feel, that hard drives are far too slow. The most important factor is data fragmentation, which has a heavy influence on the hard drive’s performance. The following images explain the basic problem:
Typically, files are stored one after another.
Deleting files opens a gap between the remaining files …
… which will be closed as soon as new files are stored. If the size of the new file is larger than the capacity of the gap, the file will be split (fragmented).
You don’t need to be an engineer to understand the impact of fragmentation. As long as a file is stored in one piece, it can be read sequentially (which means at maximum performance). Fragmented files require two or more head positioning operations, which naturally interrupt the data transfer and slow down hard drive performance considerably.
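The gap-filling behaviour shown in the images can be reproduced with a small simulation. This is only a sketch: the 12-block "disk", the first-free-block allocation policy, and the helper names `store`, `delete` and `fragments` are assumptions made for illustration, not a description of how NTFS or FAT actually allocate space.

```python
def store(disk, name, size):
    """Write `size` blocks of file `name` into the first free blocks (None)."""
    if disk.count(None) < size:
        raise ValueError("not enough free space")
    remaining = size
    for i, block in enumerate(disk):
        if remaining == 0:
            break
        if block is None:
            disk[i] = name
            remaining -= 1


def delete(disk, name):
    """Free every block that belongs to file `name`."""
    for i, block in enumerate(disk):
        if block == name:
            disk[i] = None


def fragments(disk, name):
    """Count contiguous runs of `name`; each run costs one head positioning."""
    runs = 0
    previous = None
    for block in disk:
        if block == name and previous != name:
            runs += 1
        previous = block
    return runs


disk = [None] * 12          # hypothetical 12-block drive
store(disk, "A", 3)         # files are stored one after another
store(disk, "B", 2)
store(disk, "C", 3)
delete(disk, "B")           # deleting B opens a two-block gap
store(disk, "D", 4)         # D is larger than the gap, so it gets split

layout = "".join(block or "." for block in disk)
print(layout)                                   # AAADDCCCDD..
print(fragments(disk, "A"), fragments(disk, "D"))  # 1 2
```

File D ends up in two pieces on either side of file C, so reading it takes two head positioning operations instead of one.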
Although fragmentation has no direct influence on access time (it does not matter if a file is fragmented or not), the read/write heads may finish reading the last file fragment at a different position than they started. Unfortunately, cache algorithms cannot cope with this effect. At reduced efficiency, the cache memory won’t be able to speed up the hard drive any more.
It’s quite obvious that fragmentation cannot be avoided, but it can be kept low by defragmenting your hard drive regularly. How often depends on the amount of data you are moving on your hard drive: every few weeks if you regularly fill and empty your drive, or only every few months if your data turnover is small.
… it can be kept to a minimum if you defragment your hard drive regularly.
IDE vs. SCSI RAID
Although many people tend to believe that SCSI is already dead in the mainstream market, it still has several important advantages over IDE.
Feature | IDE | SCSI |
Bandwidth | 100 MB/s (UltraATA/100) | 160/320 MB/s (Ultra160, Ultra320 SCSI) |
Cable Length | 45 cm per cable | ~150 cm |
Devices per Channel | 2 | 7/15 |
Scalability | Average | Excellent |
Multitasking Performance | Depends on controller, usually average | Depends on controller, usually good |
Costs | Low | Medium to high |
CPU Load | Medium | Low |
Factors like scalability, multitasking performance and CPU load may be less important for home or office use, but trivial things like the cable length can make the whole thing a bit difficult: The IDE cable length of only 45 cm will give you a hard time trying to attach cables to several IDE drives in the upper drive bays. The same applies to high-end IDE RAID controllers like the Promise SuperTrak100: This device even supports RAID 5, but attaching the required devices (usually 5 or 6) can be quite difficult in many computer cases. With SCSI, you can attach up to 7 devices one after another to a single cable, which is usually easy to do.
RAID 0 via Software
The basic idea behind striping is not particularly new, and therefore it is no real surprise that Windows 2000 does not require a hardware RAID controller to set up RAID 0. You can do this just with software, and in fact, Windows 2000 supports not only single drives and drive spanning (setting up several drives to act as a single drive), but also striping. Sceptics may object that running drives in striping mode via software puts an extra load on the CPU. It certainly does, but today’s processors should be able to handle this, and actually, the amount of processing power used by a software RAID solution is not much different from that used by a hardware RAID controller.
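To make the idea behind striping concrete, here is a minimal sketch of how a position in the striped volume can be mapped to a drive. The stripe unit of 64 KB and the function names `locate` and `drives_touched` are assumptions chosen for illustration; this is not taken from the Windows 2000 implementation.

```python
STRIPE_UNIT = 64 * 1024  # bytes per stripe unit (assumed value for illustration)


def locate(offset, num_drives, stripe_unit=STRIPE_UNIT):
    """Map a byte offset in the volume to (drive index, byte offset on that drive)."""
    unit = offset // stripe_unit      # which stripe unit the byte falls into
    drive = unit % num_drives         # units are dealt out round-robin
    row = unit // num_drives          # how many full rows precede this unit
    return drive, row * stripe_unit + offset % stripe_unit


def drives_touched(offset, length, num_drives, stripe_unit=STRIPE_UNIT):
    """Return the sorted drive indices involved in one contiguous transfer."""
    first = offset // stripe_unit
    last = (offset + length - 1) // stripe_unit
    return sorted({unit % num_drives for unit in range(first, last + 1)})


print(locate(0, 2))                          # (0, 0)
print(locate(64 * 1024, 2))                  # (1, 0)
print(locate(128 * 1024, 2))                 # (0, 65536)
print(drives_touched(0, 256 * 1024, 4))      # [0, 1, 2, 3]
print(drives_touched(0, 4 * 1024, 4))        # [0]
```

A large sequential transfer is spread over all drives, which can then work in parallel, while a small request lands on a single drive and gains nothing from the array. The mapping itself is only a handful of integer operations, which is why the CPU load of a software stripeset stays modest.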
Software RAID Requirements
Windows 2000 is capable of striping any desired number of drives, regardless of their interface. You could, for example, stripe three IDE drives and one SCSI drive. While spanning lets you combine several drives in order to use their total capacity, striping is limited by the smallest drive. For example, three 40 GB drives and one 8 GB drive will result in a total capacity of 32 GB (capacity of the smallest drive multiplied by the number of drives). Therefore, in order to achieve the best capacity and performance, you should stick to using drives of the same or similar types.
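The capacity rule is easy to check with a few lines. The function names below are made up for this sketch; the drive sizes are the ones from the example above.

```python
def striped_capacity(sizes_gb):
    """Striping uses only as much of each drive as the smallest one offers."""
    return min(sizes_gb) * len(sizes_gb)


def spanned_capacity(sizes_gb):
    """Spanning simply adds up the capacities of all member drives."""
    return sum(sizes_gb)


drives = [40, 40, 40, 8]            # three 40 GB drives and one 8 GB drive
print(striped_capacity(drives))     # 32
print(spanned_capacity(drives))     # 128
print(striped_capacity([40, 40, 40]))  # 120
```

Leaving the 8 GB drive out of the stripeset raises the usable capacity from 32 GB to 120 GB, which is why mixing very different drive sizes rarely makes sense.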
An additional IDE controller such as this Promise Ultra100 TX2 isn’t as expensive as a RAID controller and will enhance both your configuration possibilities and number of useable drives for software stripesets.
How To Set Up A Software RAID with Windows 2000
First of all you have to install the mass storage controller(s) and hard drives you want to use. After that, launch the Windows 2000 Management Console (simply run “compmgmt.msc /s”) and open the Disk Management Folder inside the Storage category.
The lower frame on the right displays all drives and their partitions. Before you can create a stripeset, you have to remove all partitions on the drives that you want to use (right-click the partition to get the options menu). Now you have to convert those drives into dynamic drives. Right-click the field to the left of the partition area and select “Convert To Dynamic Drive”.
After restarting the system (required), you can right-click on the partition area of any drive you want in order to create a new partition. The Wizard allows you to create simple partitions as well as span or stripe the drives. Next, specify the sector size and disk label, have it automatically formatted, and you are ready to go.
Pitfalls of the Software RAID
So far, everything sounds very promising. However, we found that there is no way to run Windows 2000 itself on a software RAID array. After setting up a RAID array specifically for installing Windows 2000, the setup program did not recognize the stripeset drive. A possible reason is that the standard drivers do not support Windows 2000’s dynamic drive model. If you still choose one of the RAID drives to install the OS, Windows 2000 will prompt you to format it.
It was quite interesting to see that Windows 2000 does not care how you connect the members of a stripeset, as long as they are present. For example, changing a drive from tertiary master (that’s the primary channel of any additional IDE controller) to secondary slave does not cause any errors; it merely has a small impact on performance. Removing any stripeset member, however, will make the whole array inaccessible, so be careful!
Putting the Stripeset to Use
It would have been nice to be able to put Windows 2000 onto a RAID array, but that won’t work, and anyway, there are other uses for a fast drive array. For one thing, you can assign the Windows swap file to it. Make sure that you do this before putting any data on it though. And also, be sure to specify the same numbers for initial and maximum size, so that the swap file won’t be fragmented. Another use for the stripeset drive is to use it as a temporary drive for the more demanding applications, such as Adobe Photoshop, etc. If possible, also specify a particular size for the temp files.
Test Setup
Test System | |
CPU | Intel Pentium III, 866 MHz |
Motherboard | Asus CUSL2, i815 Chipset |
RAM | 128 MB SDRAM, 7ns (Crucial/Micron) CL2 |
IDE Interface | i815 UltraDMA/100 Interface (ICH2), Promise FastTrak100, Promise Ultra100 |
Graphics Card | i815 On-Board Graphics |
Network | 3COM 905TX PCI 100 MBit |
Operating Systems | Windows 2000 Pro 5.00.2195 SP2 |
Benchmarks and Measurements | |
Office Applications Benchmark | ZD WinBench 99 – Business Disk Winmark 2.0 |
Highend Applications Benchmark | ZD WinBench 99 – High-End Disk Winmark 2.0 |
Data Transfer | ZD WinBench 99 – Disk Inspection Test 2.0 |
Performance | SiSoft Sandra 2001 |
Settings | |
Graphics Drivers | Intel i815 Reference Drivers 4.3 |
IDE Drivers | Intel Bus Master DMA Drivers 6.03, Promise Ultra100 Driver 1.60, Promise FastTrak100 Driver 2.00 |
DirectX Version | 8.0a |
Screen Resolution | 1024×768, 16 Bit, 85 Hz Refresh |
Data Transfer Diagrams
Single Drive
Most of you should be familiar with the transfer diagram of a hard drive like the DTLA.
Two Drives, One Channel
This diagram looks pretty much the same, even though two drives are used instead of one. Only the scale is different: While the single drive provides a maximum of 36 MB/s, the two-drive configuration starts at approximately 70 MB/s, which is double the transfer performance.
Two Drives, Two Channels
Using two instead of only one IDE channel hardly has any impact on the data transfer performance.
Three Drives, Two Channels
Attaching a third drive to the setup above does not raise the maximum beyond ~75 MB/s, but it makes sure that the transfer rate stays constant at > 55 MB/s. By comparison, the two-drive setup drops to 38 MB/s, which is quite a difference for demanding applications.
Data Transfer Diagrams, Continued
Three Drives, Three Channels
Here comes the real McCoy! Attaching each of the three drives as master to their own IDE channel results in a utilization of the full bandwidth of the 32 Bit PCI bus!
Four Drives, Two Channels
Some users may want to use four drives on only two channels in order to have a larger capacity. Surprisingly, this does not lead to a maximum transfer rate that is higher than 75 MB/s. However, the minimum rate of 72 MB/s is amazing.
Four Drives, Three Channels
Splitting the stripeset to three channels improves performance to almost 100 MB/s and a minimum of 76 MB/s.
Four Drives, Four Channels
This seems to be the best configuration for power users. Still it is not much faster than the configuration with three drives, each with their own channel.
Data Transfer Comparison Chart
Here you can see the minimum and maximum transfer performance of all configurations in one single chart. The four drive, four channel setup is indeed almost four times faster than the single DTLA.
Disk Access Time
It was predictable that the access time would not be quicker in a stripeset. Though the data throughput is considerably higher, the drives cannot possibly position their heads any faster. This applies not only to software RAID, but also to all kinds of hardware solutions. Expensive adapters may have RISC controllers with a lot of cache memory and sophisticated algorithms, but they cannot solve the basic problem. Conclusion: If you are running demanding applications that frequently access databases (e.g. a webserver), try to get drives that have quick access times.
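A rough model shows why access time matters more than throughput for small, scattered requests. The 12 ms access time is a hypothetical figure chosen for illustration, and the two transfer rates are rounded from the diagrams above (36 MB/s for the single drive, roughly 100 MB/s for a fast stripeset). The model is deliberately generous to the stripeset, since it assumes even a tiny request enjoys the full array transfer rate.

```python
ACCESS_MS = 12.0      # hypothetical access time, identical for both setups
SINGLE_MB_S = 36.0    # single-drive maximum from the diagrams
STRIPED_MB_S = 100.0  # approximate rate of a fast stripeset from the diagrams


def read_time_ms(size_kb, access_ms, rate_mb_s):
    """One head positioning plus the time to transfer `size_kb` kilobytes."""
    transfer_ms = size_kb / 1024 / rate_mb_s * 1000
    return access_ms + transfer_ms


def speedup(size_kb):
    """How many times faster the stripeset finishes a request of this size."""
    single = read_time_ms(size_kb, ACCESS_MS, SINGLE_MB_S)
    striped = read_time_ms(size_kb, ACCESS_MS, STRIPED_MB_S)
    return single / striped


for size_kb in (4, 64, 1024, 100 * 1024):
    print(f"{size_kb:>7} KB request: stripeset is {speedup(size_kb):.2f}x faster")
```

Under these assumptions a 4 KB database record is read practically no faster from the stripeset, while a 100 MB sequential file comes close to the full ratio of the two transfer rates.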
Application Benchmark: WinBench 99 Disk WinMarks
Business Disk WinMark
It is pretty obvious that standard applications like Microsoft’s Word or Excel do not benefit much from a faster storage subsystem.
Highend Disk WinMark
As opposed to Business Disk WinMark, the Highend Disk WinMark clearly shows the difference it makes to have a fast storage system. Here, the results are about 50% faster with a setup that uses at least three IDE channels.
SiSoft Sandra 2001 Performance Index
Sandra shows results similar to those of the Disk WinMarks. Using two drives will increase your overall storage performance, while anything beyond that makes a smaller impact.
Conclusion
The stripeset support in Windows 2000 proves to be a mighty utility. Depending on the number of drives, you can boost data transfer rates up to the limit of the PCI bus (133 MB/s in theory). In systems with 64 Bit PCI, you can reach even higher performance, provided that more drives and more controllers are deployed. Keep in mind, however, that many applications are written to run mostly out of the processor cache or main memory, so hard drive performance becomes secondary for them.
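The theoretical PCI figure follows directly from bus width and clock rate. The sketch below uses the nominal PCI clocks of 33.33 and 66.66 MHz; the function name is made up for this example, and real transfers carry protocol overhead, so measured rates stay below these peaks.

```python
def pci_peak_mb_s(bus_width_bits, clock_mhz):
    """Peak bandwidth: bytes moved per clock cycle times cycles per second."""
    return bus_width_bits / 8 * clock_mhz


print(round(pci_peak_mb_s(32, 33.33)))  # 133  (standard 32 Bit PCI)
print(round(pci_peak_mb_s(64, 33.33)))  # 267  (64 Bit PCI at 33 MHz)
print(round(pci_peak_mb_s(64, 66.66)))  # 533  (64 Bit PCI at 66 MHz)
```

With a single drive peaking at 36 MB/s, it takes about four drives to approach the theoretical limit of standard PCI, which matches the point at which the diagrams stop scaling.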
Software stripesets are just perfect for the high-performance temporary storage required by demanding applications. You could, for instance, use stripeset drives for Windows swap files or for data that tends to remain static and is frequently backed up (e.g., software that is regularly installed over a LAN). Stripesets are not recommended for long-term data storage, since the data is at a much greater risk than with single-drive configurations or redundant drive arrays, so don’t go storing large databases and other important files on striped drive arrays.