iSCSI performance problems (VMware)

807557 · Apr 2 2010 (edited Apr 2 2010)
I apologize, but it has been a while since I have played with Solaris, so this may be a very basic question.

Basically, I have set up iSCSI targets on ZFS volumes for use in a VMware test environment. I have been able to get the environment up and running, but I ran into performance problems: while doing IOPS testing I found that I am only getting about 8 IOPS, which has been causing my VM to time out and die.

With that said, I am looking for ways to find the bottleneck and resolve it, so any suggestions would be greatly appreciated.

My Solaris environment is as follows:
SunOS eccoiscsi 5.10 Generic_141445-09 i86pc i386 i86pc
Running on an HP DL180 with 4 x 300 GB SATA drives (I know these are slow, but they shouldn't be 8-IOPS slow), plus a dedicated OS drive (64 GB, I think)
I have a dedicated 4-port Gigabit TOE NIC for the iSCSI traffic.
I did a Core group installation to trim off any excess that may be running (this might be a mistake on my part).
I installed SSH and the iSCSI target packages after the install (sshcu, sshr, sshu, sshdr, sshdu, iscsir, iscsitgtr, iscsitgtu, iscsiu)
I created my zpool using raidz across all 4 disks (I have disabled the hardware RAID)
I created two 1 TB ZFS volumes (zvols) to be used as backing stores for the targets
I created 2 iSCSI LUNs (SAN2 LUN0 and SAN2 LUN1)
I set up the TPGT to span 2 of the ports on the NIC and attached it to the iSCSI target; the commands were roughly as sketched just after this list. I did not set the ports up as an aggregate, and I have been debating whether I want to do that or not. (Opinions?)
Once I figure out the performance problem I will probably enable the other 2 ports.
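
For reference, the setup was roughly along these lines (a sketch from memory; the pool name "tank", the zvol names, the disk device names, and the IP addresses are placeholders, not my exact values):

# raidz pool across the 4 data disks (hardware RAID disabled)
zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0

# two 1 TB zvols to back the LUNs
zfs create -V 1T tank/lun0
zfs create -V 1T tank/lun1

# one target (SAN2) with the zvols as LUN 0 and LUN 1
iscsitadm create target -b /dev/zvol/rdsk/tank/lun0 SAN2
iscsitadm create target -b /dev/zvol/rdsk/tank/lun1 -u 1 SAN2

# TPGT 1 spanning two of the NIC ports, attached to the target
iscsitadm create tpgt 1
iscsitadm modify tpgt -i 10.0.0.1 1
iscsitadm modify tpgt -i 10.0.0.2 1
iscsitadm modify target -p 1 SAN2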

The network is just a dedicated 24-port Cisco 2960 Gigabit switch.

I am using VMware ESXi on an HP DL380 with 2 dedicated iSCSI ports (Gigabit TOE), which gives me 4 paths to the iSCSI host.
Originally I had left the iSCSI configuration at the VMware defaults, but I recently enabled Round Robin to see if that would help performance. After enabling Round Robin, my IOPS actually went down to about 7.5.
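
For what it's worth, I switched the path policy with something like the following (a sketch assuming the vSphere 4.x esxcli syntax; the naa device ID is a placeholder for my LUN):

# show the devices and their current path selection policy
esxcli nmp device list

# switch the iSCSI LUN to Round Robin
esxcli nmp device setpolicy --device naa.600144f0deadbeef --psp VMW_PSP_RR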

All of my tests were run on a Windows 2008 R2 VM using IOMeter with a 2 MB transfer size, 33% write / 67% read, 100% random, against a 5 GB partition.
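
For scale, that access pattern means even 8 IOPS moves a fair amount of data (quick arithmetic, using only the numbers above and the iostat output below):

8 IOPS x 2 MB per transfer = about 16 MB/s delivered to the guest

which is roughly consistent with the ~36 MB/s of combined read+write traffic iostat shows across the 4 raidz disks, once parity and read-modify-write overhead are accounted for. So the disks are moving data; the question is why it translates into so few I/Os at the guest.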

Here is the output of iostat during the IOMeter test:
# iostat -xtc 5 2
                  extended device statistics                   tty         cpu
device    r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b  tin tout  us sy wt id
sd0       0.0    0.0     0.0     0.0  0.0  0.0   19.8   0   0    0    0   0  1  0 99
sd1       0.0    0.0     0.1     0.0  0.0  0.0    3.3   0   0
sd2       1.1   35.5    14.8  1038.5  0.0  0.5   14.5   0   5
sd3       1.1   35.7    14.5  1037.0  0.0  0.5   14.6   0   6
sd4       1.1   35.5    14.8  1038.5  0.0  0.5   14.5   0   5
sd5       1.1   35.7    14.5  1037.0  0.0  0.5   14.7   0   6
                  extended device statistics                   tty         cpu
device    r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b  tin tout  us sy wt id
sd0       0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0    0  112   1  7  0 92
sd1       0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd2      97.2  204.1  4498.7  4629.9  0.0  3.2   10.6   0  60
sd3      95.6  204.5  4300.9  4629.9  0.0  2.8    9.2   0  55
sd4      94.8  203.9  4379.1  4625.7  0.0  2.6    8.6   0  52
sd5      95.2  204.3  4336.8  4625.7  0.0  2.8    9.2   0  58

And here is the output of vmstat during the IOMeter test
# vmstat 5
 kthr      memory            page            disk          faults        cpu
 r b w    swap    free re  mf pi po fr de sr s0 s1  s2  s3    in    sy    cs us sy  id
 0 0 0  574060  663584  0   0  0  0  0  0  0  0  0   2   2   438    95   169  0  0 100
 0 0 0  574060  663584  0   0  0  0  0  0  0  0  0   2   2   444    97   179  0  0 100
 0 0 0  574056  663580  0   0  0  0  0  0  0  0  0   4   4   436   117   206  0  0 100
 0 0 0  574052  663576  0   0  0  0  0  0  0  0  0   2   2   450    97   178  0  0 100
 0 0 0  574052  663576  0   0  0  0  0  0  0  0  0  69  64  1195   344  1209  0  1  99
 0 0 0  715900  805424  0   0  0  0  0  0  0  0  0 282 276  7126  3995  9943  1  6  93
 2 0 0  719832  809356  0   0  0  0  0  0  0  0  0 309 294  4005  1416  5025  0  7  92
 0 0 0  707268  796792  0   0  0  0  0  0  0  0  0 224 219  4368  3130  7875  0  5  95
 0 0 0  705052  794576  0   0  0  0  0  0  0  0  0 323 313  4573  2417  7040  0  8  92
 0 0 0  697524  787048  0   0  0  0  0  0  0  0  0 285 267  2827   525  3046  0  3  97
 0 0 0  697112  786636  0   0  0  0  0  0  0  0  0 227 229  4688  3327  8446  0  5  95
 0 0 0  696384  785908  0   0  0  0  0  0  0  0  0 364 358  5089  2353  7225  0  9  91
 1 0 0  695140  784664  5  41  0  0  0  0  0  0  0 205 187  3516  2184  5722  0  4  96
 0 0 0  694500  784024  5  41  0  0  0  0  0  0  0 284 282  4947  3453  8951  0  5  94
 0 0 0  691092  780616  0   0  0  0  0  0  0  0  0 278 248  4227  1787  5474  0  9  90
 0 0 0  688860  778384  0   0  0  0  0  0  0  0  0 256 257  4871  3517  8908  1  7  92

I am not averse to making changes to my Solaris environment, but I can't do a full rebuild, because I have some engineers running a couple of tests on the iSCSI SAN. As I said, this is a test environment, and once the engineers' tests are complete I can rebuild the whole thing. Originally this was going to be a proof of concept, but I was pressured to get it "functional" before I had time to fully play with it.

Does anyone have suggestions for tests to find where the bottleneck is? I am assuming the bottleneck is VMware, but I have not found a way to prove this yet.
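
If it helps, I can also capture the ZFS-level view while a test runs (standard ZFS tooling, nothing special about my setup):

# per-vdev breakdown of the pool's I/O, refreshed every 5 seconds
zpool iostat -v 5

and post that output if anyone thinks it would narrow things down.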