2:47 AM - 4k alignment on WD HDDs
I'm attempting to get a WD EARS drive working properly in one of my servers. There are a number of issues with these new Advanced Format drives:
1. The default alignment causes significant performance problems. Some users have reported 1-8Mbps read speeds.
2. sysinstall doesn't let you change the offset in bsdlabel in a convenient way.
3. Many people have problems with these drives, but few people have good solutions under BSD.
Things I've found out so far:
An offset of 1 is helpful (so the first partition starts on sector 64 instead of 63). Most people using these drives set them up as one large data drive. I need to boot off this one, which means bsdlabel has several entries. I haven't found anyone else trying this yet.
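A quick sketch of why 64 works and 63 doesn't. It assumes the drive reports 512-byte logical sectors on top of 4096-byte physical sectors; the function name is just for illustration.

```python
LOGICAL = 512    # bytes per sector as reported by the drive (assumed)
PHYSICAL = 4096  # bytes per physical sector on an Advanced Format drive (assumed)

def is_4k_aligned(start_sector):
    """True if a partition starting at this logical sector begins on a physical sector boundary."""
    return (start_sector * LOGICAL) % PHYSICAL == 0

for start in (63, 64):
    # Show the byte offset and whether it falls on a 4096-byte boundary.
    print(start, start * LOGICAL, is_4k_aligned(start))
```

Sector 63 begins at byte 32256, which is not a multiple of 4096, while sector 64 begins at byte 32768, which is.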
First, I thought I'd get clever and try gpart/GPT. I actually got the drive set up, and some testing showed decent performance, but I forgot that I hadn't ported the boot code yet. Doh!
Next, I went to plan B. I used fdisk as normal and focused instead on making the appropriate changes in bsdlabel. WD says that as long as partition start sectors are divisible by 8 you are OK, but an interesting analysis showed performance improvements from using a block size of 32768 and a sector size of 4096 (with newfs). That meant I had to be a little more careful in bsdlabel so that everything lined up nicely.
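Here is a rough sketch of how the bsdlabel offsets could be planned so that every partition lands on a 4 KiB boundary. It assumes 512-byte logical sectors and label offsets that are relative to the start of the slice; `align_up` and `plan_label` are hypothetical helpers, not part of any BSD tool, and the sizes in the example are made up.

```python
SECTOR = 512  # logical sector size in bytes (assumed)

def align_up(sector, boundary_bytes=4096):
    """Round a sector number up to the next multiple of boundary_bytes."""
    step = boundary_bytes // SECTOR
    return -(-sector // step) * step

def plan_label(slice_start, sizes, boundary_bytes=4096):
    """Return (offset_in_slice, size) pairs whose absolute start is aligned.

    sizes are the requested partition sizes in sectors. Offsets are assumed
    to be relative to the slice start, so alignment is checked against
    slice_start + offset, not against the offset alone.
    """
    plan = []
    cursor = slice_start
    for size in sizes:
        start = align_up(cursor, boundary_bytes)
        plan.append((start - slice_start, size))
        cursor = start + size
    return plan

# Hypothetical partition sizes in sectors, for a slice that starts at sector 63.
for offset, size in plan_label(63, [2097152, 1000001, 4194304]):
    print(offset, size, (63 + offset) % 8 == 0)
```

With a slice that starts at sector 63, the first label entry comes out at offset 1, which matches the "offset of 1" trick. Passing `boundary_bytes=32768` would line partitions up on the larger filesystem block size instead.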
DES looked into this for FreeBSD and wrote a handy test utility called phybs. It doesn't do performance testing, but you can see the effect of alignment.
Using that utility, I found that the fdisk setup was slower than GPT for some reason, and I suspect things are still not optimal. However, a quick and dirty test of moving some files around showed it was running better than the horror stories I've read.
One test of copying files from a 7200 RPM Seagate HDD to two different green drives (a Samsung and the WD) showed that the Samsung drive was slightly faster (about 1 MBps). diskinfo shows the WD drive faster on the inner part of the disk but slightly slower on the outer part.
I'll post real numbers later. It's 3 AM.
Here are the results under GPT:
./phybs -r /dev/ad8p1
   count    size  offset    step    msec     tps    kBps
  131072    1024       0    4096   18198    7202    7202
  131072    1024     512    4096   18026    7271    7271
   65536    2048       0    8192   10233    6404   12808
   65536    2048     512    8192   11135    5885   11770
   65536    2048    1024    8192   11304    5797   11594
   32768    4096       0   16384    7508    4364   17456
   32768    4096     512   16384    8394    3903   15613
   32768    4096    1024   16384    8789    3728   14913
   32768    4096    2048   16384    8458    3873   15495
   16384    8192       0   32768    5672    2888   23107
   16384    8192     512   32768    5723    2862   22900
   16384    8192    1024   32768    5999    2730   21846
   16384    8192    2048   32768    5867    2792   22337
   16384    8192    4096   32768    5735    2856   22852
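For reading the table: the tps and kBps columns look like they are derived from count, size and msec. The relationship below is inferred from the numbers themselves, not taken from the phybs source, so treat it as an assumption. The rows are copied from the output above.

```python
# (count, size, offset, msec, reported_tps) copied from the phybs output above
rows = [
    (131072, 1024, 0, 18198, 7202),
    (32768, 4096, 0, 7508, 4364),
    (32768, 4096, 512, 8394, 3903),
    (16384, 8192, 0, 5672, 2888),
]

def rates(count, size, msec):
    """Return (transactions per second, kilobytes per second)."""
    tps = count * 1000 / msec
    kbps = tps * size / 1024
    return tps, kbps

for count, size, offset, msec, reported in rows:
    tps, kbps = rates(count, size, msec)
    print(size, offset, int(tps), reported, round(kbps))

# Relative cost of a 512-byte misalignment for 4096-byte transfers.
aligned_msec, misaligned_msec = 7508, 8394
slowdown = misaligned_msec / aligned_msec - 1
print(f"{slowdown:.1%}")
```

The computed tps values agree with the reported column to within rounding, and the 4096-byte run at offset 512 takes roughly 12% longer than the aligned one.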
# gpart show
=>        34  1953525101  ad8  GPT  (1000.2GB)
          34        2014       - free -  (1031.2KB)
        2048     2097136    1  freebsd-ufs  (1073.7MB)
     2099184    16572032    2  freebsd-swap  (8.5GB)
    18671216     2097136    3  freebsd-ufs  (1073.7MB)
    20768352   268435456    4  freebsd-ufs  (137.4GB)
   289203808  1664320808    5  freebsd-ufs  (852.1GB)
  1953524616         519       - free -  (265.7KB)
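To double-check the layout above, here is a small sketch that tests each partition's start sector against the divisible-by-8 rule. It assumes 512-byte logical sectors, and the row text is pasted by hand from the gpart output with the size-in-parentheses column dropped; `check_alignment` is a made-up helper name.

```python
# start, size, index, type for each partition in the gpart output above
GPART_ROWS = """\
2048 2097136 1 freebsd-ufs
2099184 16572032 2 freebsd-swap
18671216 2097136 3 freebsd-ufs
20768352 268435456 4 freebsd-ufs
289203808 1664320808 5 freebsd-ufs
"""

def check_alignment(text, sectors_per_block=8):
    """Return (partition index, aligned?) for each row of pasted gpart output."""
    results = []
    for line in text.splitlines():
        start, size, index, ptype = line.split()[:4]
        results.append((int(index), int(start) % sectors_per_block == 0))
    return results

for index, aligned in check_alignment(GPART_ROWS):
    print(index, "aligned" if aligned else "MISALIGNED")
```

All five partitions start on a multiple of 8 sectors, so each begins on a 4 KiB boundary.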
Here's some rsync data for the samsung drive:
sent 4858801691 bytes received 41302 bytes 28497612.86 bytes/sec
total size is 4858055400 speedup is 1.00
51.771u 35.499s 2:50.08 51.3% 459+1906k 41623+36549io 4pf+0w
And the rsync data for the WD drive:
sent 4858801691 bytes received 41302 bytes 27685715.06 bytes/sec
total size is 4858055400 speedup is 1.00
55.324u 36.006s 2:54.74 52.2% 457+1899k 41572+36276io 0pf+0w
This is not scientific at all. I was copying tarballs from the last magus run.
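To put the two rsync runs side by side, here is a short sketch that converts the reported bytes/sec into MB/s and a percentage difference. The two rates are copied from the rsync output above; MB here means 1,000,000 bytes, which is my choice of unit.

```python
samsung = 28497612.86  # bytes/sec reported by rsync for the Samsung drive
wd = 27685715.06       # bytes/sec reported by rsync for the WD drive

def mb_per_s(bytes_per_sec):
    """Convert bytes/sec to megabytes/sec (1 MB = 1,000,000 bytes)."""
    return bytes_per_sec / 1_000_000

diff = mb_per_s(samsung) - mb_per_s(wd)
pct = (samsung - wd) / wd * 100

print(f"Samsung {mb_per_s(samsung):.1f} MB/s, WD {mb_per_s(wd):.1f} MB/s")
print(f"difference {diff:.2f} MB/s ({pct:.1f}%)")
```

That works out to a gap of a bit under 1 MB/s, or about 3%, in the Samsung's favor for this one copy.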
fdisk for the drive:
fdisk -v ad8
******* Working on device /dev/ad8 *******
parameters extracted from in-core disklabel are:
cylinders=1938021 heads=16 sectors/track=63 (1008 blks/cyl)
Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=1938021 heads=16 sectors/track=63 (1008 blks/cyl)
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/MidnightBSD/NetBSD/386BSD)
    start 64, size 1953525105 (953869 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 2;
        end: cyl 613/ head 0/ sector 1
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
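As a sanity check on the fdisk output, the beg/end CHS values can be converted back to sector numbers using the geometry fdisk prints (16 heads, 63 sectors per track). This is a sketch using the standard CHS formula; the assumption that the end cylinder is stored modulo 1024 (a 10-bit field) is mine.

```python
HEADS = 16
SECTORS_PER_TRACK = 63

def chs_to_lba(cyl, head, sector):
    """Convert cylinder/head/sector to a sector number (CHS sectors start at 1)."""
    return (cyl * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)

def lba_to_chs(lba):
    """Convert a sector number back to (cylinder, head, sector)."""
    cyl, rem = divmod(lba, HEADS * SECTORS_PER_TRACK)
    head, sec0 = divmod(rem, SECTORS_PER_TRACK)
    return cyl, head, sec0 + 1

# beg: cyl 0/ head 1/ sector 2 should match "start 64"
start = chs_to_lba(0, 1, 2)

# Last sector of the slice: start + size - 1
end_cyl, end_head, end_sector = lba_to_chs(64 + 1953525105 - 1)

# Assumption: the partition table keeps only the low 10 bits of the cylinder.
print(start, end_cyl % 1024, end_head, end_sector)
```

The start comes out at sector 64, and the end lands on head 0, sector 1 with a cylinder that wraps to 613, matching what fdisk shows.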