I believe this to be either a Solaris or hardware issue, but including napp-it just in case. Plus to help summon gea.
I've got two Solaris 11.3 boxes running Napp-it Pro (January build). One is a physical box, the other a VM. Physical box replicates to VM. Outside of that, configuration/software-wise they are identical; the only service they are running is Comstar iSCSI, which feeds LUNs to my ESXi host. They've both been running smoothly for ages. I generally log into the Web-GUI every week or two just to check system health. When I did this Monday, the login was super slow. Once I got in, I saw a red light on the disk real time monitor. One of the spindles in the Z2 array was pegged at 100%. So I swapped the drive. Everything seemed to go back to normal at that point.
Tuesday, I try to sign in again. Sign in page loads normally. I sign in, it authenticates the credentials (meaning if I put bad creds in, it errors out accordingly), then hangs. The storage itself seems to be working fine, as the box is the underlying storage for my ESXi host. So I console into the box and restart the napp-it Web-GUI. No change. Solaris System Monitor looks normal. Pool looks normal. Try to reboot the box via menu. Nothing happens. Attempt to open a terminal to reboot via CLI. Terminal won't open.
At this point, I stop the replication job, take the replication pool out of read-only, and send all my iSCSI traffic to the VM, which is running strong (yay for home DR, lol). I'll be digging into the issue more on the weekend, but I'm not quite sure where to start, as there haven't been any configuration changes and everything looks normal except that it's non-responsive to most actions. I'll be pulling the OS/rpool and ZIL SSDs out to check SMART health, but beyond that I'm not sure where to go.