Common Errors while Running Condor's VM Universe

From Ben's Writing


VMware

Windows Services

Sometimes Windows services crash: it's a fact of life. Whether the cause is the program or the OS, something will eventually go wrong. Fortunately, there are many ways to deal with this. But we are not interested in those solutions here; we only care about the side effects of such crashes, which can be silent killers that make it very difficult to figure out what's going on.

For instance, I ran across this the other day:

2/13 15:30:45 VMGAHP[1332]: Failed to execute my_system: C:\Perl\bin\PERL.EXE C:\condor\bin\CONDOR~1.PL check

It struck me as a rather odd failure, since the script does little more than wrap the vmrun.exe functionality. It turned out that running the command by hand failed too:

> C:\Perl\bin\PERL.EXE C:\condor\bin\CONDOR~1.PL check
Error: The system returned an error. Communication with the virtual machine may have been interrupted.
Error: The system returned an error. Communication with the virtual machine may have been interrupted
(ERROR) Can't execute C:\Program Files\VMware\VMware VIX\vmrun

But it did give me some more helpful information: namely, that vmrun.exe could not communicate with the VM server (well, I guessed that part). What had happened was that the VMware Registration Service (aka VMware VirtualCenter Agent: vmserverdWin32.exe) had stopped running for some reason. All I had to do was restart the VMware service first and then the Condor service to get VM Universe back up and running.
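
For what it's worth, the recovery can be scripted. What follows is only a rough sketch in Python, not something I ran at the time. The two service names are assumptions (check the Services console for the names on your machine), and the helper names parse_sc_state, restart_plan and restart_services are made up for illustration. It reads the STATE line from the text that `sc query` prints, and it builds the `net stop` / `net start` commands in the order that worked for me: VMware first, Condor second.

```python
import re
import subprocess
from typing import List, Optional

# Assumed service names; the real ones may differ on your installation.
VMWARE_SERVICE = "VMware Registration Service"
CONDOR_SERVICE = "condor"


def parse_sc_state(sc_output: str) -> Optional[str]:
    """Return the state word (e.g. RUNNING, STOPPED) from `sc query` text."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def restart_plan(vmware: str = VMWARE_SERVICE,
                 condor: str = CONDOR_SERVICE) -> List[List[str]]:
    """Build the restart commands: the VMware service first, then Condor."""
    plan = []
    for service in (vmware, condor):
        plan.append(["net", "stop", service])
        plan.append(["net", "start", service])
    return plan


def restart_services(run=subprocess.run) -> None:
    """Run each command in order and report its exit code.

    A failing `net stop` is not treated as fatal, because the service
    may already be stopped (which is exactly the situation described above).
    """
    for command in restart_plan():
        result = run(command)
        print(" ".join(command), "-> exit code", result.returncode)
```

Calling restart_services() from an administrator prompt on the affected machine would run the four commands in turn. Nothing runs on import, so the plan can be inspected with restart_plan() before anything is touched.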
