technical support

sourceninja

Diamond Member
Mar 8, 2005
8,805
65
91
I've been on the same phone call for 19 hours. Support reps get to go home. I don't until this is fixed :-(
 

Williz

Member
Jan 3, 2014
145
1
0
I'd feign a disconnect and go home at around 2-3 hours into a single call like that...
 
Feb 25, 2011
16,823
1,493
126
critical systems outages are the best.

If it was 19 hours, either it wasn't that critical or they're going out of business already.

Either way, no reason to stay on the call like that. It's easier to troubleshoot if you've had a good night's sleep anyway.
 

ch33zw1z

Lifer
Nov 4, 2004
38,003
18,350
146
If it was 19 hours, either it wasn't that critical or they're going out of business already.

Either way, no reason to stay on the call like that. It's easier to troubleshoot if you've had a good night's sleep anyway.

19 hours means critical, otherwise why the eff would you stay that long?

We need more details, but I've been on service calls that have lasted close to that. It's brutal.

edit: I should add, those long service calls are never on x86 products.
 
Last edited:

Exterous

Super Moderator
Jun 20, 2006
20,431
3,537
126
19 hours means critical, otherwise why the eff would you stay that long?

Hourly employee with a wage bump after 8 hours on duty, and access to entertainment/games/movies while on hold?

"Sorry guys, I would like to help you re-image all those XP computers / educate users on using a mouse, but I am still on this very important support call."
 

NuclearNed

Raconteur
May 18, 2001
7,837
310
126
I've been on the same phone call for 19 hours. Support reps get to go home. I don't until this is fixed :-(

While 19 hours of phone sex using your mom's credit card is simultaneously sad and epic, trying to convince ATOT it's "technical support" leans heavily towards just sad.
 
Last edited:

sourceninja

Diamond Member
Mar 8, 2005
8,805
65
91
So here's the breakdown.

Our vCenter environment is not functioning. It spikes to 100% CPU and just stops working. I've tried all the usual support things, and nothing is wrong with the database server (all the other databases on that server are fine). It's the vCenter process, vpxd.exe, that is at 100%. I should also point out that I've been using vSphere through versions 3-5 and I'm a current VCP holder (going to take my VCAP soon).
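For anyone following along at home, confirming which process is pegging the box is easy enough with a quick psutil script on the vCenter server. This is a generic sketch, nothing VMware-specific, just the same picture Task Manager gives you:

```python
# Generic sketch: list the top CPU consumers on the vCenter Windows server
# to confirm vpxd.exe is the culprit. Requires psutil (pip install psutil).
import time
import psutil

procs = list(psutil.process_iter(["name"]))

# First cpu_percent() call primes the counters; sample again after a pause.
for p in procs:
    try:
        p.cpu_percent(None)
    except psutil.NoSuchProcess:
        pass

time.sleep(2)

usage = []
for p in procs:
    try:
        usage.append((p.cpu_percent(None), p.info["name"] or "?"))
    except psutil.NoSuchProcess:
        pass

# Values can exceed 100% on multi-core boxes (per-core percentages add up).
for cpu, name in sorted(usage, key=lambda t: t[0], reverse=True)[:5]:
    print(f"{name:<25} {cpu:6.1f}%")
```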

This is preventing all backups from running, as they need vCenter to initiate them. Two days ago, in the morning, I log a ticket with VMware. I get a call shortly after and the tech starts to do his thing. I tend to let techs try what they think is right; I've worked phone tech support and I remember those asshole admins who think they know everything, yet had to dial support. After a few hours he is no closer to the problem, but it's time for him to go home, so he transfers me to a new tech in another timezone. This tech of course has to start over, because in IT the other guy is obviously an idiot. He suggests we reinstall vCenter. I allow it because I honestly don't know what the hell is wrong. Nothing is solved.

Fast forward another two hours and it's this guy's time to go home. I get transferred yet again, this time to India, I think (around the world in one phone call?). This guy quickly realizes he needs help and involves a senior engineer. The senior engineer checks all the things I checked before calling (which I understand) and then suggests we reinstall vCenter. I'm hesitant because we have done this before. He explains that the last guy probably didn't uninstall it properly. I relent and we reinstall vCenter. This brings no relief. He continues working and eventually settles on the idea that the problem is a new ESXi host we added to the cluster a few days ago. We can't remove it, because we can't log in to vCenter. He gets another engineer and we go into the database and SQL out the ESXi host. At first CPU usage drops, but then the problem comes right back. At this point it's almost 3am my time and I've been on the call well over 15 hours. I suggest we shelve it until morning; obviously they need time to work on the issue. They download some logs and dmp files from vCenter and we call it quits. I go get a burger at Steak 'n Shake; it was pretty gross, and it was the only food I'd eaten since 11am. Bedtime is around 4:30am.
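(For the curious, "SQL out the host" just means going straight into the vCenter database. It's roughly the lookup below, except support then deletes the related rows. The table and column names here, VPX_HOST and DNS_NAME, are assumptions about the vCenter schema, and the connection details and host name are placeholders, so treat this as purely illustrative rather than what the engineer actually ran.)

```python
# Purely illustrative: find the row for a misbehaving host in the vCenter
# database. VPX_HOST / DNS_NAME are assumed schema names and may differ by
# version; server, database, account, and host name below are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={SQL Server};SERVER=sqlserver01;DATABASE=vcenterdb;"
    "UID=vpxuser;PWD=changeme"
)
cursor = conn.cursor()

# Read-only lookup. The actual removal support performs touches many related
# tables, so don't hand-roll DELETEs against a production vCenter database.
cursor.execute(
    "SELECT ID, DNS_NAME FROM VPX_HOST WHERE DNS_NAME = ?",
    "esxi-new-01.example.local",
)
for row in cursor.fetchall():
    print(row.ID, row.DNS_NAME)

conn.close()
```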

I get into work about 9am the next day (yesterday), the day of my wedding anniversary. vCenter is still down and my boss is not terribly happy. I call VMware and they begin their work again around 10am. New tech, as the old tech can't be reached. He again explains that everyone else must be an idiot and begins the process anew. He decides it must also be a host issue, and we go through a tedious process of rebooting all the ESXi hosts. To my surprise (and, I'm pretty sure, unrelated), vCenter starts to run slightly better, even though the CPU usage stays pegged around 80-90%. He then suspects it's a storage issue and brings in a storage expert. They work on the system until about 5pm (my wife brought me lunch so I could eat!) and nothing is resolved. They download the log bundle for vCenter and all hosts and leave so they can analyze the issue.

The result is that I still can't run backups, I still can't manage my VMs through vCenter, and I still have no idea what is causing this. I suspect today will be more of the same.
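For what it's worth, the hosts themselves still respond while vCenter is down, so you can at least see and poke at VMs by connecting to each host directly. A rough pyVmomi sketch below (host name and credentials are placeholders, purely illustrative):

```python
# Rough sketch: talk straight to an ESXi host with pyVmomi (pip install pyvmomi)
# while vCenter is unusable. Host name and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # skips cert checks; fine for a lab, not prod
si = SmartConnect(host="esxi01.example.local", user="root",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True
    )
    for vm in view.view:
        print(vm.name, vm.runtime.powerState)
    view.Destroy()
finally:
    Disconnect(si)
```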

To the support representatives' credit, they have been very polite and have tried very hard to fix my problem. I really appreciate their hard work. At this point I could have just rebuilt the server; the reason I haven't is that I don't want this problem to pop up again. I want to know the root cause.

In any case, I made that post simply because I needed to vent to someone and my phone was handy.
 

saratoga172

Golden Member
Nov 10, 2009
1,564
1
81
Ouch. Sounds like a slightly different problem I had with a vCenter server a couple of months ago. We were doing a VDI deployment for one of our divisions and moved the equipment from our corporate office, where it was staged and running without issues, to our division office.

We get there and we can't get into vCenter. Everything is hooked up as it should be, with no other signs of failure or issue. I spent two days on the phone with VMware before finally going with my initial hunch and checking the SQL connection. That was after reinstalling vCenter three times, rebooting the hosts and servers, and talking to three different techs. We ended up just rebuilding the vCenter server to start fresh, but the issue was a bad connection to the SQL database: the account was not authorized. After some digging we found out that one of the engineers working on the solution for us had made some changes, nothing got documented, and the systems weren't rebooted and tested. A simple reboot would have caught the problem.
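A quick way to prove that kind of thing out is a one-off connection test with the vCenter service account; a rough pyodbc sketch (all connection details are placeholders):

```python
# Minimal check: can the vCenter service account still authenticate to its
# SQL Server database? All connection details below are placeholders.
import sys
import pyodbc

conn_str = (
    "DRIVER={SQL Server};SERVER=sqlserver01;DATABASE=vcenterdb;"
    "UID=vpxuser;PWD=changeme"
)

try:
    conn = pyodbc.connect(conn_str, timeout=10)
except pyodbc.Error as err:
    # A "Login failed for user ..." error here is the unauthorized-account symptom.
    sys.exit(f"DB connection failed: {err}")

conn.cursor().execute("SELECT 1").fetchone()
print("vCenter DB account authenticated OK")
conn.close()
```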

Issues can be frustrating, and sometimes you feel like you aren't making progress, but I've always had helpful VMware techs, and they're willing to exhaust every option when working with you.
 

saratoga172

Golden Member
Nov 10, 2009
1,564
1
81
You say you can't get into vCenter. Are you getting a specific error message? Curious what the root cause is, for future reference, in case the error ever pops up in one of my environments.
 

PliotronX

Diamond Member
Oct 17, 1999
8,883
107
106
Servers are so damn time consuming! It'll feel good once it's water under the bridge and you know what caused it, so you can resolve it in the future and hopefully pass the fix along to others.
 

sourceninja

Diamond Member
Mar 8, 2005
8,805
65
91
You say you can't get into vCenter. Are you getting a specific error message? Curious what the root cause is, for future reference, in case the error ever pops up in one of my environments.

No error. You can log in, but the entire interface is unresponsive; whatever is driving the CPU so high is essentially lagging it out. Interestingly, if you log in to the web client you'll find it only has half the data (and is equally unresponsive).
 

ch33zw1z

Lifer
Nov 4, 2004
38,003
18,350
146
So here's the breakdown.

Our vCenter environment is not functioning. It spikes to 100% CPU and just stops working. [...]

In any case, I made that post simply because I needed to vent to someone and my phone was handy.

I would push for RCA from them, but the obvious questions:

When did it start?
What recent changes were made prior to the issue?
 

sourceninja

Diamond Member
Mar 8, 2005
8,805
65
91
I would push for RCA from them, but the obvious questions:

When did it start?
What recent changes were made prior to the issue?

That's the frustrating part. It started three days ago, in the middle of the day. No change tickets were logged, so if anyone made any changes, they aren't speaking up about it. The only recent change was the addition of a new host and a new datastore about three days prior.

I'm starting to suspect it might be a Windows problem. We used 2012 R2 for this server, and I'm wondering if vCenter has issues with R2. I've noticed a direct correlation between network traffic and CPU load in Task Manager: when network traffic is higher (15-30 Mbps) the CPU is maxed out, and when it drops to 300-500 Kbps the CPU load drops to around 50-60% and the interface is somewhat responsive.
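A small psutil logger can track that correlation instead of eyeballing Task Manager; a rough sketch (generic example, nothing VMware-specific):

```python
# Rough sketch: sample total NIC throughput and vpxd.exe CPU together every
# few seconds to see whether they track each other. Generic psutil example.
import time
import psutil

vpxd = next(
    (p for p in psutil.process_iter(["name"])
     if p.info["name"] and p.info["name"].lower() == "vpxd.exe"),
    None,
)
if vpxd is None:
    raise SystemExit("vpxd.exe not found")

vpxd.cpu_percent(None)            # prime the per-process CPU counter
prev = psutil.net_io_counters()   # prime the NIC byte counters

while True:
    time.sleep(5)
    cur = psutil.net_io_counters()
    delta = (cur.bytes_sent + cur.bytes_recv) - (prev.bytes_sent + prev.bytes_recv)
    prev = cur
    mbps = delta * 8 / 5 / 1_000_000
    print(f"net {mbps:6.2f} Mbps   vpxd.exe {vpxd.cpu_percent(None):5.1f}% CPU")
```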
 
Feb 25, 2011
16,823
1,493
126
I'm just a vCenter n00b, and I understand Linux guys hate reinstalling stuff, but if you'd just installed a new vCenter instance and pointed it at the old DB, wouldn't it have:

1) Probably fixed it (assuming you're right and the DB is fine)
2) Taken you a couple hours, tops.

?

Have you upgraded to 5.5 yet? I think those support multiple vCenter servers in an HA thingy.

If you haven't upgraded to 5.5, do your hosts first. Upgrading to vCenter 5.5 with a bunch of 5.1 hosts exposed this bug for us. Random hosts started KPing. It was a couple days of hell.
 

imagoon

Diamond Member
Feb 19, 2003
5,199
0
0
I'm just a vCenter n00b, and I understand Linux guys hate reinstalling stuff, but if you'd just installed a new vCenter instance and pointed it at the old DB, wouldn't it have:

1) Probably fixed it (assuming you're right and the DB is fine)
2) Taken you a couple hours, tops.

?

Have you upgraded to 5.5 yet? I think those support multiple vCenter servers in an HA thingy.

If you haven't upgraded to 5.5, do your hosts first. Upgrading to vCenter 5.5 with a bunch of 5.1 hosts exposed this bug for us. Random hosts started KPing. It was a couple days of hell.

Strange. I can reproduce that PSOD with a standalone ESXi host. I am fairly certain that is a host bug that has nothing to do with vCenter itself, and I'm a bit doubtful that the upgrade order of vCenter vs. the hosts would mask the issue. I mean, I wouldn't upgrade to vCenter 5.5 until I had put ESXi 5.1u2 on the hosts first via vCenter 5.1.
 

ElFenix

Elite Member
Super Moderator
Mar 20, 2000
102,425
8,388
126
And here I thought getting my old Windows XP install converted to a VM was a pain in the ass. :beer: for you when you get this worked out.

 