Hello,
I am receiving this message whenever I try to run an image:
[Sun May 23 09:49:09 2010][002440][EUCAINFO ] vrun(): [//usr/lib/eucalyptus/euca_rootwrap //usr/share/eucalyptus/partition2disk /var/lib/eucalyptus/instances//admin/i-441B07E1/disk 512 12580]
[Sun May 23 09:49:10 2010][002440][EUCAINFO ] preparing images for instance i-441B07E1...
[Sun May 23 09:49:10 2010][002440][EUCAINFO ] running tune2fs on the root file system at using (/var/lib/eucalyptus/instances//admin/i-441B07E1/disk)
[Sun May 23 09:49:10 2010][002440][EUCAINFO ] vrun(): [//usr/lib/eucalyptus/euca_rootwrap //usr/share/eucalyptus/add_key.pl //usr/lib/eucalyptus/euca_mountwrap 32256 /var/lib/eucalyptus/instances//admin/i-441B07E1/disk ]
[Sun May 23 09:49:10 2010][002440][EUCADEBUG ] system_output(): [//usr/lib/eucalyptus/euca_rootwrap //usr/share/eucalyptus/gen_kvm_libvirt_xml --ramdisk --ephemeral]
[Sun May 23 09:49:10 2010][002440][EUCAINFO ] currently running/booting: i-441B07E1
[Sun May 23 09:49:12 2010][002440][EUCAERROR ] libvirt: operation failed: failed to retrieve chardev info in qemu with 'info chardev' (code=9)
[Sun May 23 09:49:12 2010][002440][EUCAFATAL ] hypervisor failed to start domain
[Sun May 23 09:49:12 2010][002440][EUCADEBUG ] doDescribeResource() invoked
[Sun May 23 09:49:12 2010][002440][EUCADEBUG ] doDescribeInstances() invoked
[Sun May 23 09:49:12 2010][002440][EUCAERROR ] libvirt: Domain not found: no domain with matching name 'i-441B07E1' (code=42)
[Sun May 23 09:49:12 2010][002440][EUCAINFO ] vrun(): [rm -rf /var/lib/eucalyptus/instances//admin/i-441B07E1/]
[Sun May 23 09:49:13 2010][002440][EUCAINFO ] stopping the network (vlan=10)
I have CC + CSC + Walrus on one computer and the NC on another.
The image has worked before. All I have done is add a network card to the CC in order to make it dual homed. I have reinstalled the latest Eucalyptus Ubuntu packages on the respective machines.
Can you give me pointers on how to proceed with troubleshooting this?
Hello,
Which distro are you using, and which versions of the distro and Eucalyptus? Did you find any clue in the libvirt logs or in the system logs? Did you upgrade libvirt lately?
I have not seen that error before, but it seems related to the libvirt/hypervisor configuration.
cheers
graziano
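For the libvirt side of that question, here is a minimal sketch. libvirt writes one log per domain, and the directory that appears elsewhere in this thread is /var/log/libvirt/qemu/. The last line of each log is often the message qemu printed before exiting. The function name last_log_lines is hypothetical, and the default directory is an assumption about a stock Ubuntu install.

```shell
#!/bin/sh
# Hypothetical helper: for each per-domain log in DIR, print "<file name>: <last line>".
# DIR defaults to /var/log/libvirt/qemu (assumed default location).
last_log_lines() {
    dir=${1:-/var/log/libvirt/qemu}
    for log in "$dir"/*.log; do
        # Skip the unexpanded glob when the directory holds no .log files.
        [ -f "$log" ] || continue
        printf '%s: %s\n' "$(basename "$log")" "$(tail -n 1 "$log")"
    done
}
```

Calling last_log_lines with no argument as root on the NC would list the final message for every instance libvirt tried to start.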
Ubuntu 10.01 64-bit
# dpkg -l|grep euca
ii eucalyptus-common 1.6.2-0ubuntu30 Elastic Utility Computing Architecture - Com
ii eucalyptus-gl 1.6.2-0ubuntu30 Elastic Utility Computing Architecture - Log
ii eucalyptus-nc 1.6.2-0ubuntu30 Elastic Utility Computing Architecture - Nod
------
I thought the problem may be apparmor, so I purged the apparmor packages & rebooted.
I set MANUAL_INSTANCES_CLEANUP=1
I tried to run the instance by hand using the following script:
curdir=`pwd`
sudo kvm \
-m 896 \
-smp 1 \
-name i-4FB70849 \
-uuid 42b63e48-750f-f200-068e-c5fec9412822 \
-nographic \
-drive file=${curdir}/disk,if=scsi,index=0,format=qcow,boot=on \
-net nic,macaddr=d0:0d:4f:b7:08:49,vlan=0,model=e1000,name=e1000.0 \
-parallel none -usb \
-vnc :1 \
-M pc-0.12 \
-enable-kvm \
-initrd ${curdir}/ramdisk \
-kernel ${curdir}/kernel
Here is the output:
root@node01:/var/lib/eucalyptus/instances/admin/i-3B870780# ./r.sh
++ pwd
+ curdir=/var/lib/eucalyptus/instances/admin/i-3B870780
+ sudo kvm -m 896 -smp 1 -name i-4FB70849 -uuid 42b63e48-750f-f200-068e-c5fec9412822 -nographic -drive file=/var/lib/eucalyptus/instances/admin/i-3B870780/disk,if=scsi,index=0,format=qcow,boot=on -net nic,macaddr=d0:0d:4f:b7:08:49,vlan=0,model=e1000,name=e1000.0 -parallel none -usb -vnc :1 -M pc-0.12 -enable-kvm -initrd /var/lib/eucalyptus/instances/admin/i-3B870780/ramdisk -kernel /var/lib/eucalyptus/instances/admin/i-3B870780/kernel
qemu: could not open disk image /var/lib/eucalyptus/instances/admin/i-3B870780/disk: Success
------
Libvirt logs
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 896 -smp 1 -name i-3B870780 -uuid eda5e41d-ffed-ba1e-dc34-474ccf8c6ff6 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/i-3B870780.monitor,server,nowait -monitor chardev:monitor -boot c -kernel /var/lib/eucalyptus/instances//admin/i-3B870780/kernel -initrd /var/lib/eucalyptus/instances//admin/i-3B870780/ramdisk -append root=/dev/sda1 console=ttyS0 -drive file=/var/lib/eucalyptus/instances//admin/i-3B870780/disk,if=ide,index=0,boot=on -net nic,macaddr=d0:0d:3b:87:07:80,vlan=0,model=e1000,name=e1000.0 -net tap,fd=48,vlan=0,name=tap.0 -chardev file,id=serial0,path=/var/lib/eucalyptus/instances//admin/i-3B870780/console.log -serial chardev:serial0 -parallel none -usb -vnc 10.1.0.20:0 -vga cirrus
chardev: opening backend "file" failed
root@node01:/var/lib/eucalyptus/instances/admin/i-3B870780# virsh define libvirt.xml
Domain i-3B870780 defined from libvirt.xml
root@node01:/var/lib/eucalyptus/instances/admin/i-3B870780# LIBVIRT_DEBUG=1 virsh start i-3B870780
02:26:04.743: debug : virInitialize:336 : register drivers
02:26:04.743: debug : virRegisterDriver:837 : registering Test as driver 0
02:26:04.743: debug : virRegisterNetworkDriver:675 : registering Test as network driver 0
02:26:04.743: debug : virRegisterInterfaceDriver:706 : registering Test as interface driver 0
02:26:04.743: debug : virRegisterStorageDriver:737 : registering Test as storage driver 0
02:26:04.743: debug : virRegisterDeviceMonitor:768 : registering Test as device driver 0
02:26:04.743: debug : virRegisterSecretDriver:799 : registering Test as secret driver 0
02:26:04.743: debug : virRegisterDriver:837 : registering Xen as driver 1
02:26:04.743: debug : virRegisterDriver:837 : registering OPENVZ as driver 2
02:26:04.743: debug : vboxRegister:109 : VBoxCGlueInit failed, using dummy driver
02:26:04.743: debug : virRegisterDriver:837 : registering VBOX as driver 3
02:26:04.743: debug : virRegisterNetworkDriver:675 : registering VBOX as network driver 1
02:26:04.743: debug : virRegisterStorageDriver:737 : registering VBOX as storage driver 1
02:26:04.743: debug : virRegisterDriver:837 : registering remote as driver 4
02:26:04.743: debug : virRegisterNetworkDriver:675 : registering remote as network driver 2
02:26:04.743: debug : virRegisterInterfaceDriver:706 : registering remote as interface driver 1
02:26:04.743: debug : virRegisterStorageDriver:737 : registering remote as storage driver 2
02:26:04.743: debug : virRegisterDeviceMonitor:768 : registering remote as device driver 1
02:26:04.743: debug : virRegisterSecretDriver:799 : registering remote as secret driver 1
02:26:04.743: debug : virConnectOpenAuth:1337 : name=qemu:///system, auth=0x7f8f8712ab80, flags=0
02:26:04.743: debug : do_open:1106 : name "qemu:///system" to URI components:
scheme qemu
opaque (null)
authority (null)
server (null)
user (null)
port 0
path /system
02:26:04.743: debug : do_open:1116 : trying driver 0 (Test) ...
02:26:04.743: debug : do_open:1122 : driver 0 Test returned DECLINED
02:26:04.743: debug : do_open:1116 : trying driver 1 (Xen) ...
02:26:04.743: debug : do_open:1122 : driver 1 Xen returned DECLINED
02:26:04.743: debug : do_open:1116 : trying driver 2 (OPENVZ) ...
02:26:04.743: debug : do_open:1122 : driver 2 OPENVZ returned DECLINED
02:26:04.743: debug : do_open:1116 : trying driver 3 (VBOX) ...
02:26:04.743: debug : do_open:1122 : driver 3 VBOX returned DECLINED
02:26:04.743: debug : do_open:1116 : trying driver 4 (remote) ...
02:26:04.743: debug : doRemoteOpen:564 : proceeding with name = qemu:///system
02:26:04.743: debug : remoteIO:8455 : Do proc=66 serial=0 length=28 wait=(nil)
02:26:04.743: debug : remoteIO:8517 : We have the buck 66 0x7f8f87161010 0x7f8f87161010
02:26:27.778: debug : remoteIODecodeMessageLength:7939 : Got length, now need 64 total (60 more)
02:26:27.778: debug : remoteIOEventLoop:8381 : Giving up the buck 66 0x7f8f87161010 (nil)
02:26:27.778: debug : remoteIO:8548 : All done with our call 66 (nil) 0x7f8f87161010
02:26:27.778: debug : remoteIO:8455 : Do proc=1 serial=1 length=56 wait=(nil)
02:26:27.778: debug : remoteIO:8517 : We have the buck 1 0x6d7ff0 0x6d7ff0
02:26:27.779: debug : remoteIODecodeMessageLength:7939 : Got length, now need 56 total (52 more)
02:26:27.779: debug : remoteIOEventLoop:8381 : Giving up the buck 1 0x6d7ff0 (nil)
02:26:27.779: debug : remoteIO:8548 : All done with our call 1 (nil) 0x6d7ff0
02:26:27.779: debug : doRemoteOpen:917 : Adding Handler for remote events
02:26:27.779: debug : doRemoteOpen:924 : virEventAddHandle failed: No addHandleImpl defined. continuing without events.
02:26:27.779: debug : do_open:1122 : driver 4 remote returned SUCCESS
02:26:27.779: debug : do_open:1142 : network driver 0 Test returned DECLINED
02:26:27.779: debug : do_open:1142 : network driver 1 VBOX returned DECLINED
02:26:27.779: debug : do_open:1142 : network driver 2 remote returned SUCCESS
02:26:27.779: debug : do_open:1161 : interface driver 0 Test returned DECLINED
02:26:27.779: debug : do_open:1161 : interface driver 1 remote returned SUCCESS
02:26:27.779: debug : do_open:1181 : storage driver 0 Test returned DECLINED
02:26:27.779: debug : do_open:1181 : storage driver 1 VBOX returned DECLINED
02:26:27.779: debug : do_open:1181 : storage driver 2 remote returned SUCCESS
02:26:27.779: debug : do_open:1201 : node driver 0 Test returned DECLINED
02:26:27.779: debug : do_open:1201 : node driver 1 remote returned SUCCESS
02:26:27.779: debug : do_open:1228 : secret driver 0 Test returned DECLINED
02:26:27.779: debug : do_open:1228 : secret driver 1 remote returned SUCCESS
02:26:27.779: debug : virDomainLookupByName:1974 : conn=0x6d2a00, name=i-3B870780
02:26:27.779: debug : remoteIO:8455 : Do proc=23 serial=2 length=44 wait=(nil)
02:26:27.779: debug : remoteIO:8517 : We have the buck 23 0x6d7ff0 0x6d7ff0
02:26:27.779: debug : remoteIODecodeMessageLength:7939 : Got length, now need 92 total (88 more)
02:26:27.779: debug : remoteIOEventLoop:8381 : Giving up the buck 23 0x6d7ff0 (nil)
02:26:27.779: debug : remoteIO:8548 : All done with our call 23 (nil) 0x6d7ff0
02:26:27.779: debug : virGetDomain:345 : New hash entry 0x6d0c10
02:26:27.779: debug : virDomainGetID:2617 : domain=0x6d0c10
02:26:27.779: debug : virDomainCreate:4635 : domain=0x6d0c10
02:26:27.779: debug : remoteIO:8455 : Do proc=9 serial=3 length=64 wait=(nil)
02:26:27.779: debug : remoteIO:8517 : We have the buck 9 0x6d7ff0 0x6d7ff0
02:26:57.805: debug : remoteIODecodeMessageLength:7939 : Got length, now need 192 total (188 more)
02:26:57.805: debug : remoteIOEventLoop:8381 : Giving up the buck 9 0x6d7ff0 (nil)
02:26:57.805: debug : remoteIO:8548 : All done with our call 9 (nil) 0x6d7ff0
02:26:57.805: debug : virDomainGetName:2524 : domain=0x6d0c10
error: Failed to start domain i-3B870780
02:26:57.805: debug : virDomainFree:2062 : domain=0x6d0c10
02:26:57.805: debug : virUnrefDomain:422 : unref domain 0x6d0c10 i-3B870780 1
02:26:57.805: debug : virReleaseDomain:376 : release domain 0x6d0c10 i-3B870780
02:26:57.805: debug : virReleaseDomain:392 : unref connection 0x6d2a00 2
error: monitor socket did not show up.: Connection refused
02:26:57.805: debug : virConnectClose:1355 : conn=0x6d2a00
02:26:57.805: debug : virUnrefConnect:259 : unref connection 0x6d2a00 1
02:26:57.805: debug : remoteIO:8455 : Do proc=2 serial=4 length=28 wait=(nil)
02:26:57.805: debug : remoteIO:8517 : We have the buck 2 0x6d7ff0 0x6d7ff0
02:26:57.806: debug : remoteIODecodeMessageLength:7939 : Got length, now need 56 total (52 more)
02:26:57.806: debug : remoteIOEventLoop:8381 : Giving up the buck 2 0x6d7ff0 (nil)
02:26:57.806: debug : remoteIO:8548 : All done with our call 2 (nil) 0x6d7ff0
02:26:57.806: debug : virReleaseConnect:216 : release connection 0x6d2a00
Sorry, I meant Ubuntu 10.04 not 10.01
I removed the 'serial' definition, and I was able to start the images using virsh.
The forum code removes anything that looks like an XML tag, so I can't post directly.
Here is what I removed: http://slexy.org/raw/s20HiyArHA
Here is my full libvirt.xml before the removal: http://slexy.org/raw/s2KPFinewG
Weird!
I'm still stuck because I need the serial ports for username/passwords on the image, but at least I'm a step closer to the solution. Would appreciate any insight you can give.
Hello,
hmm .. this is very weird indeed: can you check the permissions on that directory? You can remove the serial definition from the generated libvirt.xml by modifying gen_kvm_libvirt_xml; that modification will of course prevent Eucalyptus from getting the console output, among other things.
It really seems to be a permission problem. You mentioned that you started the instance with virsh: did you use the eucalyptus user to start it? If you start it as root, do you still have the problem with the serial option?
cheers
graziano
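A minimal sketch of the permission check suggested above, assuming the process that opens console.log needs to create files in the instance directory. The function name check_console_dir is hypothetical. It only tests classic permission bits for whoever runs it, so it says nothing about AppArmor or other confinement.

```shell
#!/bin/sh
# Hypothetical helper: report whether the CURRENT user could create a file
# (such as console.log) inside the given instance directory.
check_console_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi
    # Creating a file needs write and search (execute) permission on the directory.
    if [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "writable: $dir"
        return 0
    fi
    echo "not writable: $dir"
    return 1
}
```

To answer the question for a specific account, the function would have to run as that account, for example from a shell started with sudo -u eucalyptus, and be pointed at a directory such as /var/lib/eucalyptus/instances/admin/i-3B870780.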
I was running virsh as root. I'm also pretty sure it is an access/permissions related issue. Maybe this is related to cgroups somehow...
But I don't know how to resolve it.
Hello,
we talked a bit on IRC: if you find anything new on this one, please follow up here so we can leave a trace.
cheers
graziano
I've opened a bug since there is now suspicion that this may be a libvirtd issue: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/585964
Thank you very much for your assistance. After our IRC session, I was able to isolate the issue to libvirt-bin and thus exclude Eucalyptus.
After further investigation, it seems the problem is with one of the Ubuntu packages (either libvirt or apparmor). I am now able to reliably reproduce this issue.
The problem begins after apparmor is purged. Thus, as a temporary workaround, I'd suggest that Eucalyptus recommend its users not to uninstall, purge, or disable apparmor. (See bug report below.)
At least this has been my experience. YMMV.
With regards,
Hovanes
Hello,
thanks for the investigative work and for filing a bug!
cheers
graziano
I found that I could only start up one instance of an image. Subsequent attempts resulted in this obscure error in nc.log:
libvirt: operation failed: failed to retrieve chardev info in qemu with 'info chardev' (code=9)
This was caused (in my case) by having the following line in my gen_kvm_libvirt_xml file:
graphics type='vnc' listen='192.168.20.2' port='5904'
Once this was removed...SOLVED.
Granted... it was dumb for me to have the above tag in my xml when I'm trying to run more than one instance...it was left over from previous woes....
That being said... A slightly more informative error message might be useful. May I propose:
"libvirt: operation failed: failed to retrieve chardev info in qemu with 'info chardev' (code=9)\n
possible causes: 'graphics' tag in gen_kvm_libvirt_xml"
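A fixed VNC port can only be bound by one qemu process at a time, which fits the "only one instance starts" symptom described in this post. Below is a rough sketch for spotting such a line in the template. The function name fixed_vnc_ports is hypothetical. It treats the file as plain text and flags any line mentioning graphics with a purely numeric port value, so port='-1' (libvirt's way of asking for automatic allocation) is not reported.

```shell
#!/bin/sh
# Hypothetical helper: list lines (with line numbers) that mention "graphics"
# and pin the port to a fixed number, e.g. port='5904'.
fixed_vnc_ports() {
    grep -n "graphics" "$1" | grep -E "port=['\"][0-9]+['\"]"
}
```

Running it against /usr/share/eucalyptus/gen_kvm_libvirt_xml prints nothing when no fixed port is configured.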
Hello there, I think I have stumbled into the same problem, but I was not able to fix it by following the advice above.
I am running a cloud based on Ubuntu 10.10 with all the latest packages: CLC, CC, SC and Walrus on one server and the NC on a separate server, both with dual NICs configured for multi-clustering. Everything was working almost fine. The only problem was that the network would randomly get totally messed up when I attempted to run several instances as different users. Often the instances were inaccessible, or the instance login brought me to the CLC public interface, or the storage volumes were inaccessible or messed up.
We had already troubleshot the network configuration in a different topic, and you said that everything in the config was fine. Neil and Graziano then concluded that the problem was probably AoE related. I have already had several problems with AppArmor: with different applications, weeks of troubleshooting ended with AppArmor turning out to be the only problem. So I decided to purge it and test how the cloud would behave without it, given that everything is supposed to be fine according to the config you saw.
After removing AppArmor, everything seemed fine, but now the instances go from pending to terminated with the same error messages HM had.
[Mon Nov 15 15:45:27 2010][001422][EUCAERROR ] libvirt: monitor socket did not show up.: Connection refused (code=38)
[Mon Nov 15 15:45:27 2010][001422][EUCAFATAL ] hypervisor failed to start domain
[Mon Nov 15 15:45:32 2010][001422][EUCAERROR ] libvirt: Domain not found: no domain with matching name 'i-45C90874' (code=42)
[Mon Nov 15 15:45:32 2010][001422][EUCAINFO ] vrun(): [rm -rf /var/lib/eucalyptus/instances//user01/i-45C90874/]
and
cat /var/log/libvirt/qemu/i-45C90874.log
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 1900 -smp 1 -name i-45C90874 -uuid 0e9c8a9d-7059-f0e9-0eed-08a74730803e -nographic -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/i-45C90874.monitor,server,nowait -monitor chardev:monitor -boot c -kernel /var/lib/eucalyptus/instances//user01/i-45C90874/kernel -initrd /var/lib/eucalyptus/instances//user01/i-45C90874/ramdisk -append root=/dev/sda1 console=ttyS0 -drive file=/var/lib/eucalyptus/instances//user01/i-45C90874/disk,if=scsi,index=0,boot=on,format=raw -net nic,macaddr=d0:0d:45:c9:08:74,vlan=0,model=e1000,name=e1000.0 -net tap,fd=46,vlan=0,name=tap.0 -chardev file,id=serial0,path=/var/lib/eucalyptus/instances//user01/i-45C90874/console.log -serial chardev:serial0 -parallel none -usb
chardev: opening backend "file" failed
I don't have any graphics parameter in /usr/share/eucalyptus/gen_kvm_libvirt_xml, and even setting security_driver = "none" in qemu.conf did not solve the problem.
I also attempted to reinstall eucalyptus-nc + dependencies with no success. I rebooted the cloud several times of course.
Since I do not trust apparmor behavior I would like to test my cloud without any extra security profiles.
Is apparmor really essential to run libvirt and eucalyptus?
What is this backend file that libvirt is not able to open?
It might be a simple permission issue after removing the apparmor profile, but your guidance for solving this problem would be much appreciated.
Regards,
TritoLux
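On the "backend file" question: the qemu command line in the log above contains -chardev file,id=serial0,path=.../console.log, so the "file" backend is the instance's console.log, and the error means qemu could not open that path. Here is a small sketch for pulling the path out of a libvirt qemu log. The function name chardev_file_paths is hypothetical, and it relies on grep -o, which GNU and BSD grep provide.

```shell
#!/bin/sh
# Hypothetical helper: read a qemu command line on stdin and print the path=
# value of every "-chardev file,..." option (socket chardevs are ignored).
chardev_file_paths() {
    grep -o -- '-chardev file,[^ ]*' | sed -n 's/.*path=\([^, ]*\).*/\1/p'
}
```

Feeding it /var/log/libvirt/qemu/i-45C90874.log on stdin would print the console.log path whose directory permissions are worth checking.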
After removing AppArmor, I realized that qemu.conf had been modified and both user and group had been set back to root.
I have now changed it back to the following, and instances can start again:
security_driver = "none"
user = "eucalyptus"
#user = "libvirt-qemu"
#user = "root"
group = "kvm"
#group = "root"
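Since a package operation can silently rewrite these values, a quick check of which settings are actually active may help. This is only a sketch: the function name active_setting is hypothetical, it prints every uncommented double-quoted value for a key, and the usual location /etc/libvirt/qemu.conf is an assumption not stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print each uncommented, double-quoted value assigned
# to KEY in a qemu.conf-style file. Lines starting with '#' are skipped.
active_setting() {
    key=$1
    file=$2
    sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*\"\\([^\"]*\\)\".*/\\1/p" "$file"
}
```

For the lines above, active_setting user on that file prints eucalyptus and active_setting group prints kvm.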
I have not tested everything again, but now I have the following error message in the instance log:
# cat /var/log/libvirt/qemu/i-42AA07D1.log
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 1900 -smp 1 -name i-42AA07D1 -uuid c27c246b-c109-5fa8-db96-2bf7d526f0d5 -nographic -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/i-42AA07D1.monitor,server,nowait -monitor chardev:monitor -boot c -kernel /var/lib/eucalyptus/instances//user01/i-42AA07D1/kernel -initrd /var/lib/eucalyptus/instances//user01/i-42AA07D1/ramdisk -append root=/dev/sda1 console=ttyS0 -drive file=/var/lib/eucalyptus/instances//user01/i-42AA07D1/disk,if=scsi,index=0,boot=on,format=raw -net nic,macaddr=d0:0d:42:aa:07:d1,vlan=0,model=e1000,name=e1000.0 -net tap,fd=47,vlan=0,name=tap.0 -chardev file,id=serial0,path=/var/lib/eucalyptus/instances//user01/i-42AA07D1/console.log -serial chardev:serial0 -parallel none -usb
pci_add_option_rom: failed to find romfile "pxe-e1000.bin"
I could not find much info on the above error message, apart from the following bug:
https://bugs.launchpad.net/ubuntu/+source/etherboot/+bug/566832
Is it something I should be worried about? How can I properly fix it?
It seems that I am now back to the problem below, which I had hoped to solve by removing AppArmor and which is still not solved:
http://open.eucalyptus.com/forum/clarification-about-dual-nic-uec
I wonder if the pci_add_option_rom: failed to find romfile "pxe-e1000.bin" is related to it.