networking not working

Bug #204010 reported by fromport
Affects            Status      Importance   Assigned to   Milestone
linux (Ubuntu)     Confirmed   Undecided    Unassigned
    Declined for Hardy by Steve Langasek
xen-3.2 (Ubuntu)   Confirmed   Undecided    Unassigned
    Declined for Hardy by Steve Langasek

Bug Description

After adding the missing modules to the initrd, I was finally able
to access the guest on my Hardy server. However, networking is not working.
So close, but still no victory ;-)

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=8.04
DISTRIB_CODENAME=hardy
DISTRIB_DESCRIPTION="Ubuntu hardy (development branch)"
ARCH=AMD64

What I did:

=========================
echo "deb http://ppa.launchpad.net/zulcss/ubuntu hardy main" >> /etc/apt/sources.list
apt-get update && apt-get dist-upgrade

echo "xenblk" >> /etc/initramfs-tools/modules
echo "blktap" >> /etc/initramfs-tools/modules
echo "blkbck" >> /etc/initramfs-tools/modules
echo "xennet" >> /etc/initramfs-tools/modules

update-initramfs -u

[reboot]

echo "extra='xencons=tty'" >> /etc/xen-tools/xm.tmpl

Then I created the guest:

xen-create-image --hostname=etch64 --ip=192.168.42.212

xm create -c /etc/xen/etch64.cfg

=========================
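
The `echo ... >>` lines above append a module name every time they are run, so repeating the steps leaves duplicate entries in /etc/initramfs-tools/modules. A minimal sketch of an idempotent variant follows. The helper name add_initramfs_module is made up for illustration, and the module names in the usage comment are copied as-is from the report; they have not been checked against the modules the kernel actually ships.

```shell
#!/bin/sh
# Append a module name to a modules file only if it is not already listed.
# $1 = modules file (e.g. /etc/initramfs-tools/modules), $2 = module name.
# add_initramfs_module is a hypothetical helper, not part of initramfs-tools.
add_initramfs_module() {
    file=$1
    module=$2
    # -q quiet, -x match the whole line, -F treat the name as a fixed string
    if ! grep -qxF "$module" "$file" 2>/dev/null; then
        echo "$module" >> "$file"
    fi
}

# Usage (as root on the machine whose initramfs is rebuilt):
#   for m in xenblk blktap blkbck xennet; do
#       add_initramfs_module /etc/initramfs-tools/modules "$m"
#   done
#   update-initramfs -u
```

Running the loop twice leaves the file unchanged the second time, so the step can be repeated safely after a kernel upgrade.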

I'm able to log in to the guest, but I have no network connectivity
in the guest.

Information about the dom0:
---------------
# The primary network interface
auto eth0
iface eth0 inet static
address 192.168.42.45
netmask 255.255.255.0
gateway 192.168.42.1
--------------
relevant lines from /etc/xen/xend-config.sxp:
(network-script network-bridge)
(vif-script vif-bridge)
--------------
contents of /etc/xen/etch64.cfg

kernel = '/boot/vmlinuz-2.6.24-12-xen'
ramdisk = '/boot/initrd.img-2.6.24-12-xen'
memory = '128'
root = '/dev/sda2 ro'
disk = [
                  'phy:/dev/VG/etch64-swap,sda1,w',
                  'phy:/dev/VG/etch64-disk,sda2,w',
              ]
name = 'etch64'
vif = [ 'ip=192.168.42.212,mac=00:16:3E:C4:28:6C' ]
on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'
extra='xencons=tty'

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
On the domU (Debian etch 64-bit, installed with debootstrap)

/etc/network/interfaces
# The primary network interface
auto eth0
iface eth0 inet static
 address 192.168.42.212
 gateway 192.168.42.1
 netmask 255.255.255.0

etch64:~# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:16:3E:C4:28:6C
          inet addr:192.168.42.212 Bcast:192.168.42.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

etch64:~# ping -c10 192.168.42.1
PING 192.168.42.1 (192.168.42.1) 56(84) bytes of data.
From 192.168.42.212 icmp_seq=1 Destination Host Unreachable
From 192.168.42.212 icmp_seq=2 Destination Host Unreachable
From 192.168.42.212 icmp_seq=3 Destination Host Unreachable
From 192.168.42.212 icmp_seq=4 Destination Host Unreachable
From 192.168.42.212 icmp_seq=5 Destination Host Unreachable
From 192.168.42.212 icmp_seq=6 Destination Host Unreachable
From 192.168.42.212 icmp_seq=7 Destination Host Unreachable
From 192.168.42.212 icmp_seq=8 Destination Host Unreachable
From 192.168.42.212 icmp_seq=9 Destination Host Unreachable
From 192.168.42.212 icmp_seq=10 Destination Host Unreachable

--- 192.168.42.1 ping statistics ---
10 packets transmitted, 0 received, +10 errors, 100% packet loss, time 9025ms , pipe 3
etch64:~# arp -an
? (192.168.42.1) at <incomplete> on eth0

-----------------------------

No networking...

Attached: xend.log & more useful information

fromport (ubuntu-dth) wrote :
Chris Cohen (kildau-ml) wrote :
Dirk Opfer (dirk-do13) wrote :

Same problem here. Can you run ifconfig in dom0 and domU?

I have:
domu:
~# ifconfig
eth0 Link encap:Ethernet HWaddr 00:16:3E:5D:DC:2A
          inet addr:192.168.200.240 Bcast:192.168.200.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:51 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b) TX bytes:2142 (2.0 KB)

So no packets received.

dom0:
.....
peth0 Link encap:Ethernet HWaddr 00:1b:fc:ff:5a:a6
          inet6 addr: fe80::21b:fcff:feff:5aa6/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
          RX packets:9844 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6152 errors:0 dropped:0 overruns:0 carrier:1
          collisions:0 txqueuelen:1000
          RX bytes:10245463 (9.7 MB) TX bytes:792792 (774.2 KB)

vif1.0 Link encap:Ethernet HWaddr fe:ff:ff:ff:ff:ff
          inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
          RX packets:51 errors:0 dropped:0 overruns:0 frame:0
          TX packets:109 errors:0 dropped:777 overruns:0 carrier:0
          collisions:0 txqueuelen:32
          RX bytes:1428 (1.3 KB) TX bytes:9707 (9.4 KB)

After 109 TX packets all other packets are dropped.
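
The counters quoted above are the quickest way to spot the symptom: the domU transmits but its RX count stays at zero, while the vif in dom0 reports dropped TX packets. Below is a small sketch that pulls those numbers out of classic net-tools ifconfig output for a single interface. The function names are hypothetical, and the parsing assumes the `RX packets:N ... dropped:N` layout shown in this thread.

```shell
#!/bin/sh
# Each function reads `ifconfig <iface>` output for ONE interface on stdin.
# The names rx_tx_packets, tx_dropped and is_one_way are illustrative only.

# Print "<rx_packets> <tx_packets>".
rx_tx_packets() {
    awk '
        /RX packets:/ { split($2, a, ":"); rx = a[2] }
        /TX packets:/ { split($2, a, ":"); tx = a[2] }
        END { print rx + 0, tx + 0 }
    '
}

# Print the "dropped" counter from the TX line.
tx_dropped() {
    awk '/TX packets:/ { split($4, a, ":"); print a[2] + 0 }'
}

# Exit 0 when packets were sent but none were received.
is_one_way() {
    set -- $(rx_tx_packets)
    [ "$1" -eq 0 ] && [ "$2" -gt 0 ]
}
```

For example, `ifconfig eth0 | is_one_way && echo "TX only"` in the domU, and `ifconfig vif1.0 | tx_dropped` in dom0.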

Maurício Magalhães (mauriciommagalhaes) wrote :

Hi all,

I have the same problem here.

I have tried everything and nothing works.

Thanks

Chris Cohen (kildau-ml) wrote :

Did some more research today...
Not even the latest hypervisor, Xen 3.3-unstable (Debian 3.3-unstable+hg17192-1), works.
The Debian version installs without any problem, so it must be something with the Ubuntu Xen kernel?

Chris Cohen (kildau-ml) wrote :

One more ;)
2.6.24-12-xen Dom0 + 2.6.22-14-xen DomU -> network is working

Mike Baker (mbm) wrote :

Another "me too" - I'm running into the same bugs trying to deploy a server here
- xen modules are not included in the initramfs
- xen networking cannot receive packets in domU

The first bug is annoying but easily fixed; it's the second bug that's got me stuck. I've tried rx_copy and rx_flip with no luck, I've even tried booting a Hardy domU under a Gutsy dom0/xen 3.1 with no luck.

Todd Deshane (deshantm)
Changed in xen-3.2:
status: New → Confirmed
Todd Deshane (deshantm) wrote :

could you include the output of brctl show?

Also take a look here:

https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/199533

a manual bridge (i.e. br0, that KVM sometimes uses) has sometimes worked better than the default ethX bridge.

I will try to confirm if that is true tomorrow.

fromport (ubuntu-dth) wrote :

brctl show on dom0
bridge name     bridge id               STP enabled     interfaces
eth0            8000.001617a97350       no              peth0
                                                        vif1.0
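
When comparing setups, it helps to reduce `brctl show` to just the ports attached to one bridge, so it is obvious whether both peth0 and the guest's vif are enslaved. A rough sketch, assuming the column layout shown above (a header line, one line per bridge, and continuation lines that contain only an interface name); bridge_ports is a made-up helper name.

```shell
#!/bin/sh
# Read `brctl show` output on stdin and print the ports of bridge $1,
# one per line. bridge_ports is an illustrative name, not a brctl feature.
bridge_ports() {
    awk -v br="$1" '
        NR == 1 { next }                  # skip the header line
        NF >= 3 {                         # a line that starts a new bridge
            cur = $1
            if (NF >= 4 && cur == br) print $4
            next
        }
        NF == 1 && cur == br { print $1 } # continuation line: one more port
    '
}
```

Typical use would be `brctl show | bridge_ports eth0`, which for the output above lists peth0 and vif1.0.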

Todd Deshane (deshantm) wrote :

what happens when you do

ifdown eth0
ifup eth0

Do you get errors like:

SIOCSIFFLAGS: Invalid argument

AND/OR

send_packet: Network is down
receive_packet failed on eth0: Network is down

Danny (dth) wrote : Re: [Bug 204010] Re: networking not working

Quoting Todd Deshane (<email address hidden>):
> what happens when you do
> ifdown eth0
> ifup eth0

root@etch:~# ifdown eth0; sleep 1; ifup eth0
ifdown: interface eth0 not configured
Internet Systems Consortium DHCP Client V3.0.6
Copyright 2004-2007 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/

Listening on LPF/eth0/00:16:3e:45:73:dd
Sending on LPF/eth0/00:16:3e:45:73:dd
Sending on Socket/fallback
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 3
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 14
No DHCPOFFERS received.
No working leases in persistent database - sleeping.
root@hardy64:~# ifconfig
eth0 Link encap:Ethernet HWaddr 00:16:3e:45:73:dd
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B) TX bytes:2394 (2.3 KB)

It's obvious it is sending packets.
From my dhcp server log:
Mar 24 18:56:06 firewall dnsmasq[23084]: DHCPDISCOVER(eth1) 00:16:3e:45:73:dd
Mar 24 18:56:06 firewall dnsmasq[23084]: DHCPOFFER(eth1) 192.168.42.163 00:16:3e:45:73:dd
Mar 24 18:56:06 firewall dnsmasq[23084]: DHCPDISCOVER(eth1) 00:16:3e:45:73:dd
Mar 24 18:56:06 firewall dnsmasq[23084]: DHCPOFFER(eth1) 192.168.42.163 00:16:3e:45:73:dd
[ad infinitum]
It's being offered an IP address, but that packet never reaches the dom0.
The bridge is a "one-way street".


Marcus Grieger (mgrieger) wrote :

> 2.6.24-12-xen Dom0 + 2.6.22-14-xen DomU -> network is working

I can confirm this. The base install on amd64 (FSC Primergy RX220) was Ubuntu dapper, dist-upgraded to hardy. With the hardy xen kernel running in domU, I could see outgoing network traffic on eth0 in domU and on the vif in dom0. I could also see incoming packets on the vif in dom0, which got lost on their way to domU.

Todd Deshane (deshantm) wrote :

Did anyone try making a manual bridge? For example, by adding something like the following to /etc/network/interfaces:

auto br0
iface br0 inet static
  address 192.168.133.7
  netmask 255.255.255.0
  broadcast 192.168.133.255
  bridge_ports eth0
  bridge_fd 9
  bridge_hello 2
  bridge_maxage 12
  bridge_stp off

Then in the guest config make sure you have bridge=br0 in the vif parameter:

e.g. vif=['mac=xx:xx:xx:xx:xx:xx,bridge=br0']
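
When writing a static stanza like the one above by hand, the broadcast value is easy to get wrong. It is the address with every host bit set, i.e. 192.168.133.255 for 192.168.133.7 with netmask 255.255.255.0. A small sketch that derives it from the address and netmask in plain POSIX shell arithmetic; broadcast_addr is a hypothetical helper name.

```shell
#!/bin/sh
# Print the broadcast address for a dotted-quad address ($1) and netmask ($2).
# broadcast_addr is an illustrative helper, not an existing tool.
broadcast_addr() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2          # split both arguments into eight octets
    IFS=$old_ifs
    # For each octet: keep the address bits and set all host bits.
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
}
```

For instance, `broadcast_addr 192.168.42.212 255.255.255.0` gives the same 192.168.42.255 that ifconfig reports as Bcast in the original report.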

Chris Cohen (kildau-ml) wrote :

> Did anyone try making a manual bridge. For example adding something like the following to /etc/network/interfaces

Yes, I did. And it didn't change anything. (But I didn't set bridge_fd, _hello or _maxage)

Todd Deshane (deshantm) wrote :

From this post:
http://xen.markmail.org/search/?q=creating%20xenbr0%20manually#query:creating%20xenbr0%20manually+page:1+mid:y3gyxx3kh25eogdz+state:results

I found the following suggestion:

auto xen-br1
iface xen-br1 inet manual
 pre-up /sbin/ip link set vif0.1 arp off multicast off
 pre-up /sbin/vconfig add peth0 2 || true
 bridge_ports vif0.1 peth0.2
 bridge_maxwait 0
 post-down /sbin/vconfig rem peth0 2 || true

deti (deti) wrote :

Same problem here. I was wondering why network packets from the domU can be seen in dom0 but not in the other direction. It seems that no packets can be received by domU instances. This behaviour is independent of any bridging enabled in dom0.

Deti

Sam Bashton (sam-bashton) wrote :

Todd: Tried this suggestion, no joy - this seems to be for adding a separate bridge on a different vlan, so perhaps not surprising it's doing nothing for us.

Packets are getting out from the domU OK, and replies are getting as far as the bridge interface, but not the domU itself, for example:

(on dom0)
sudo tcpdump -i eth0 arp
09:27:19.180137 arp who-has ganymede.lan tell 192.168.1.249
09:27:19.180220 arp reply ganymede.lan is-at 00:30:1b:43:48:76 (oui Unknown)

(192.168.1.249 is the domU)

Gugiwuz (manuel-laggner-gmail) wrote :

I noticed that the virtual interface (vifx.0) is dropping packets:

- eth0 (domU) to vifx.0 (dom0) is working
- vifx.0 (dom0) to destination (i.e. ping to router) over bridge is working
- reply from router to vifx.0 (dom0) over bridge is working
- vifx.0 (dom0) to eth0 (domU) is not working -> vifx.0 is dropping packets

There are no iptables DROP rules; only the standard Xen rules are set.

I am at work now. In the evening I could post an example with tcpdump/ifconfig/iptables if needed.

Dirk Opfer (dirk-do13) wrote :

I don't think this has anything to do with the network bridge. Tcpdump shows the packets on vifx.0 in dom0, but the packets don't reach the domU.

Using a 2.6.22-12-xen kernel for domU and the stock 2.6.24-12 for dom0 without changing the configuration, networking works out of the box.

Todd Deshane (deshantm) wrote :

Dirk: could you post the config (differences) for those two kernels? Maybe we are missing a module or, as suggested above, maybe an iptables or similar configuration is broken by either the kernel or a Xen setting that relies on something in the kernel.

Dirk Opfer (dirk-do13) wrote :

Todd: Attached the complete diff.

dirk@linux-host:~$ more config.diff | grep -i "xen"
--- config-2.6.22-14-xen 2008-02-12 05:41:08.000000000 +0100
+++ config-2.6.24-12-xen 2008-03-13 01:35:22.000000000 +0100
-# Linux kernel version: 2.6.22-14-xen
+# Linux kernel version: 2.6.24-12-xen
+# CONFIG_X86_XEN is not set
+CONFIG_X86_64_XEN=y
-CONFIG_X86_64_XEN=y
-CONFIG_X86_XEN_GENAPIC=y
+CONFIG_X86_XEN_GENAPIC=y
 CONFIG_XEN_PCIDEV_FRONTEND=y
 # CONFIG_XEN_PCIDEV_FE_DEBUG is not set
 CONFIG_NETXEN_NIC=m
-# CONFIG_TCG_XEN is not set
+CONFIG_TCG_XEN=m
 CONFIG_XEN=y
-CONFIG_XEN_INTERFACE_VERSION=0x00030205
+CONFIG_XEN_INTERFACE_VERSION=0x00030207
 # XEN
 # CONFIG_XEN_UNPRIVILEGED_GUEST is not set
 CONFIG_XEN_PRIVCMD=y
 CONFIG_XEN_XENBUS_DEV=y
-CONFIG_XEN_BACKEND=y
-CONFIG_XEN_BLKDEV_BACKEND=y
-CONFIG_XEN_BLKDEV_TAP=y
-CONFIG_XEN_NETDEV_BACKEND=y
+CONFIG_XEN_BACKEND=m
+CONFIG_XEN_BLKDEV_BACKEND=m
+CONFIG_XEN_BLKDEV_TAP=m
+CONFIG_XEN_NETDEV_BACKEND=m
 # CONFIG_XEN_NETDEV_PIPELINED_TRANSMITTER is not set
-CONFIG_XEN_NETDEV_LOOPBACK=y
-CONFIG_XEN_PCIDEV_BACKEND=y
+CONFIG_XEN_NETDEV_LOOPBACK=m
+CONFIG_XEN_PCIDEV_BACKEND=m
 CONFIG_XEN_PCIDEV_BACKEND_VPCI=y
 # CONFIG_XEN_PCIDEV_BACKEND_PASS is not set
 # CONFIG_XEN_PCIDEV_BACKEND_SLOT is not set
+# CONFIG_XEN_PCIDEV_BACKEND_CONTROLLER is not set
 # CONFIG_XEN_PCIDEV_BE_DEBUG is not set
-# CONFIG_XEN_TPMDEV_BACKEND is not set
-CONFIG_XEN_BLKDEV_FRONTEND=y
-CONFIG_XEN_NETDEV_FRONTEND=y
+CONFIG_XEN_TPMDEV_BACKEND=m
+CONFIG_XEN_BLKDEV_FRONTEND=m
+CONFIG_XEN_NETDEV_FRONTEND=m
+CONFIG_XEN_GRANT_DEV=m
 CONFIG_XEN_FRAMEBUFFER=y
 CONFIG_XEN_KEYBOARD=y
 CONFIG_XEN_CONSOLE=y
 CONFIG_XEN_SCRUB_PAGES=y
-CONFIG_XEN_DISABLE_SERIAL=y
+# CONFIG_XEN_DISABLE_SERIAL is not set
 CONFIG_XEN_SYSFS=y
-CONFIG_XEN_COMPAT_030002_AND_LATER=y
-# CONFIG_XEN_COMPAT_030004_AND_LATER is not set
+# CONFIG_XEN_COMPAT_030002_AND_LATER is not set
+CONFIG_XEN_COMPAT_030004_AND_LATER=y
+# CONFIG_XEN_COMPAT_030100_AND_LATER is not set
 # CONFIG_XEN_COMPAT_LATEST_ONLY is not set
-CONFIG_XEN_COMPAT=0x030002
+CONFIG_XEN_COMPAT=0x030004
 CONFIG_XEN_SMPBOOT=y
+CONFIG_XEN_BALLOON=y
+CONFIG_XEN_DEVMEM=y
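
One pattern in this diff is easy to miss by eye: several Xen options, including the block and network frontends, changed from built-in (=y) to module (=m), which would fit with the missing-modules-in-initramfs problem mentioned earlier in the thread. Below is a sketch that lists such options from a unified diff of two kernel configs. It assumes each removed (-) line appears before its added (+) counterpart, as in the output above; builtin_to_module is a made-up name.

```shell
#!/bin/sh
# Read a unified diff of two kernel .config files on stdin and print the
# options that went from built-in (=y) to module (=m).
# builtin_to_module is an illustrative helper name.
builtin_to_module() {
    awk '
        { sub(/[ \t\r]+$/, "") }          # tolerate trailing whitespace
        /^-CONFIG_[A-Za-z0-9_]+=y$/ {
            name = substr($0, 2)
            sub(/=y$/, "", name)
            was_builtin[name] = 1
        }
        /^[+]CONFIG_[A-Za-z0-9_]+=m$/ {
            name = substr($0, 2)
            sub(/=m$/, "", name)
            if (name in was_builtin) print name
        }
    '
}
```

Run as `builtin_to_module < config.diff`; options that are new in the second config (added as =m without a matching removed =y line) are deliberately not reported.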

Mark Goldfinch (4-launchpad-g-org-nz) wrote :

Perhaps I'm stating the obvious here, but Xen was merged upstream in 2.6.23. Doesn't it strike anyone as odd that kernels since then appear to have problems with their networking?

In the mainline kernel's Xen support, 64-bit support is currently missing, so perhaps we have a conflict between the mainline source and the patches available from XenSource?

Meanwhile, I'm looking into debugging the xennet driver. Since packets can be emitted from the domU into dom0 but reply packets from dom0 into the domU disappear, it would seem that the problem lies somewhere within the Xen network device?

Thanks,
Mark.

Todd Deshane (deshantm) wrote :

Mark: Interesting point you bring up, I hadn't thought of that. DomU support should be in mainline Linux, although dom0 support is still lacking until Red Hat finishes their port.

Would you be willing to bring some of the key points and information to the xen-users list [1] for a broader audience and get some ideas on debugging it?

Also, has anybody tried this on 32-bit? Is it a problem with both 32-bit and 64-bit?

It is also interesting that a 2.6.22-14-xen Ubuntu xen kernel has been reported to work.

[1] http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users

Clemens Hupka (clemens) wrote :

Mark, I can confirm that the problem is present on amd64 AND i386... Tested both of them a few days ago.

Both (amd64 and i386) are working with 2.6.22-14-xen from gutsy on domU and 2.6.24-12-xen from hardy-beta on dom0 with xen3.2 following the steps described in the initial bug report....

And quite stable for a couple of days now...

In my opinion, the xennet driver has a problem. I found that packets are going out from domU (TX packets reach the bridge, ARP who-has at least) but no replies are coming into the domU (RX stays at 0, even though the replies reach the bridge, as well as a couple of broadcasts).

So long,
Clemens.

deti (deti) wrote :

Same problem still exists with new kernel package 2.6.24-14.

Marcus Grieger (mgrieger) wrote :

> Same problem still exists with new kernel package 2.6.24-14.

confirmed.

Todd Deshane (deshantm) wrote :

can you give the following command a shot:

ethtool -K eth0 tx off

I plan to spend as much time as I can in ##xen on freenode and also
#xen on irc.oftc.net to try to get some advice.

I would like to see a working system before hardy releases...

Mark Goldfinch (4-launchpad-g-org-nz) wrote :

Todd Deshane wrote:
> can you give the following command a shot:
>
> ethtool -K eth0 tx off
>
> I plan to spend as much time as I can in ##xen on freenode and also
> #xen on irc.oftc.net to try to get some advice.
>
> I would like to see a working system before hardy releases...
I have yet to receive a reply to my message on the xen-users list about our
problems with xennet and bridged packets disappearing on reply back into
the DomU.

Do you want the ethtool command above run on Dom0 or a DomU?

Thanks,
Mark.

Todd Deshane (deshantm) wrote :

I tried the ethtool command on domU. It didn't work for me.

I also tried to put the card in promiscuous mode with ifconfig eth0 promisc on the guest.

I am in #xen on OFTC trying to get some ideas; I will hang out in both ##xen and #xen like I said before.

I was told that it is the frontend driver in domU, which is not news. We need a way to be able to either fix that driver and/or get some more visibility so we can get someone to fix it.

SirYorik (sua) wrote :

Same problem still exists with new kernel package 2.6.24-15

vv-etch64:~# uname -a
Linux vv-etch64 2.6.24-15-xen #1 SMP Fri Apr 4 05:04:49 UTC 2008 x86_64 GNU/Linux

vv-etch64:~# ping -c 1 192.168.3.100
PING 192.168.3.100 (192.168.3.100) 56(84) bytes of data.
From 192.168.3.174 icmp_seq=1 Destination Host Unreachable

vv-etch64:~# ifconfig
eth0 Link encap:Ethernet HWaddr AA:00:00:00:00:D2
          inet addr:192.168.3.174 Bcast:192.168.3.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          ----> RX bytes:0 (0.0 b) <---- TX bytes:252 (252.0 b)

Elie De Brauwer (elie) wrote :

Also encountering the same problem on 2.6.24-15-xen

The bridges are created, a win2k3/xp virtual host works perfectly (after fixing the qemu-ifup script (see bug 201765 )). However when I run a centos 4.6 virtual host I see the same symptoms discussed in this thread. The bridges get created, the devices get added to the bridge, the interfaces are all up.
...
but nothing gets through. I tried it with a manually created bridge and with a bridge I let Xen create.

On my domU I see the following after a ping -f:

[root@centosbean ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 00:16:3E:15:13:66
          inet addr:192.168.253.228 Bcast:192.168.253.255 Mask:255.255.255.0
          inet6 addr: fe80::216:3eff:fe15:1366/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b) TX bytes:846 (846.0 b)

(not really impressive eh)

When I run xentop I see the following line concerning my virtual instance:

      NAME  STATE CPU(sec) CPU(%)  MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS VBD_OO VBD_RD VBD_WR SSID
    centos --b---       26    0.0 3584000   85.6   3584000      85.6     4    1        0       11    1      0   3488   1041   32
Net0 RX: 11370bytes 109pkts 0err 14509drop TX: 636bytes 15pkts 0err 0drop

Hence a lot of drops
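
To watch the drop counters without reading the whole xentop screen, the per-network line can be reduced to just those two numbers. A sketch, assuming the `NetN RX: ... Ndrop TX: ... Ndrop` layout shown above; net_drops is a hypothetical helper that only reformats text and says nothing about which side is doing the dropping.

```shell
#!/bin/sh
# Read xentop text on stdin and, for every "NetN" line, print the RX and TX
# drop counters. net_drops is an illustrative name, not an xentop option.
net_drops() {
    awk '
        /^ *Net[0-9]+ / {
            rx = ""; tx = ""; dir = ""
            for (i = 2; i <= NF; i++) {
                if ($i == "RX:" || $i == "TX:") { dir = $i; continue }
                if ($i ~ /^[0-9]+drop$/) {
                    n = $i
                    sub(/drop$/, "", n)
                    if (dir == "RX:") rx = n
                    else if (dir == "TX:") tx = n
                }
            }
            print $1 " rx_drop=" rx " tx_drop=" tx
        }
    '
}
```

Fed the Net0 line above, this prints `Net0 rx_drop=14509 tx_drop=0`.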

Bart Heinsius (bheinsius) wrote :

I have exactly the same as Elie.
Every ping to the DomU results in a drop.

Wido den Hollander (wido) wrote :

I can also confirm the same symptoms.

I am running the 2.6.24-15-xen kernel with a 64-Bit host and a 32-Bit guest (Hardy also).

Traffic gets out of the domU, but it doesn't get back into it.

kochab (kochab-gmail) wrote :

The same here,
with hardy 64 host and etch 64 guests, running 2.6.24-14-xen.

James Blackwell (jblack) wrote : Workaround Procedure

I was able to get things working by doing the following:

PREREQUISITES:
==========
* The domUs are able to start
* The domUs have interfaces
* The dom0 and domUs are unable to ping each other.
* "brctl show" should show something like:
   bridge name     bridge id               STP enabled     interfaces
   eth0            8000.001ec93cb620       no              peth0
                                                           vif1.0
* dom0 can run anything up to and including 2.6.24-16. Later kernels
   have not been tested.

(note: the /gutsy is not a typo. It's a way to tell apt-get to
use a different release of Ubuntu)

PROCEDURE:
=========
1. apt-get install linux-ubuntu-modules-2.6.22-14-xen/gutsy
2. apt-get install linux-image-2.6.22-14-xen/gutsy
3. apt-get install linux-restricted-modules-2.6.22-14-xen/gutsy
4. Edit the /etc/xen/*.cfg file, and replace:
    kernel = '/boot/vmlinuz-2.6.24-15-xen'
    ramdisk = '/boot/initrd.img-2.6.24-15-xen'
with:
    kernel = '/boot/vmlinuz-2.6.22-14-xen'
    ramdisk = '/boot/initrd.img-2.6.22-14-xen'
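
Step 4 can also be scripted when several guest configs need the same change. A minimal sketch: switch_domu_kernel is a hypothetical filter that rewrites only the kernel and ramdisk lines and prints the result to stdout, so the original file is never touched until the output has been reviewed. The version strings are used as plain sed patterns, so a dot matches any character, which is harmless for names like these.

```shell
#!/bin/sh
# Read a domU config on stdin and print it with the kernel/ramdisk lines
# pointing at another kernel version.
# $1 = version currently referenced, $2 = version to use instead.
# switch_domu_kernel is an illustrative helper, not part of xen-tools.
switch_domu_kernel() {
    old=$1
    new=$2
    sed -e "/^[[:space:]]*kernel[[:space:]]*=/s|vmlinuz-$old|vmlinuz-$new|" \
        -e "/^[[:space:]]*ramdisk[[:space:]]*=/s|initrd.img-$old|initrd.img-$new|"
}
```

For example: `switch_domu_kernel 2.6.24-15-xen 2.6.22-14-xen < /etc/xen/etch64.cfg > /tmp/etch64.cfg`, then compare the two files with diff before replacing the original.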

deti (deti) wrote :

linux-image-2.6.24-16-xen does not seem to contain any XEN modules which makes this kernel unusable for XEN operation. I did not find any other package that provides the missing modules. Any clues?

Jan Evert van Grootheest (j-e-van-grootheest) wrote :

Another "me too".

Deti, in -16 they finally decided to build the front- and backends into the kernel instead of as modules.
So the /etc/initramfs-tools/modules workaround is no longer needed.

Jan Evert van Grootheest (j-e-van-grootheest) wrote :

Perhaps somebody should try a Red Hat kernel. Their status page (http://fedoraproject.org/wiki/Features/XenPvops) says it works...
After that it's just the same game: spot the difference.

kochab (kochab-gmail) wrote :

using gutsy kernel it works! (following https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/204010/comments/35)
waiting for a fix of hardy kernel...

Jan Evert van Grootheest (j-e-van-grootheest) wrote :

So I tried it myself. It took actually a lot less effort than I expected!

I downloaded the current xen kernel from (1), search for kernel-xen, and used alien to extract it (alien -g).
The package also contains xen itself, I think, so don't convert and install!
I manually copied part of /boot from the rpm and /lib/modules/2.6.25-something to the appropriate places.

The xen block driver is a module again, xen-blkfront. So I had to add that to /etc/initramfs-tools/modules.
First boot I had to find out that the block device is now named xvdb1 instead of sdb1.
Second boot worked just fine. Including the network

That all took less than 15 minutes after I found the rpm.

Now if somebody would compare the code in Ubuntu with the srpm from (2)...

(1)ftp://download.fedora.redhat.com/pub/fedora/linux/development/x86_64/os/Packages/
(2)ftp://download.fedora.redhat.com/pub/fedora/linux/development/source/SRPMS/

Jan Evert van Grootheest (j-e-van-grootheest) wrote :

I just realized that this bug is on xen-3.2. But I guess this is wrong.
I have an etch(64) domU with 2.6.18-5 on it.
I have another domU with etch(64) userspace which I use for playing and testing. I've used this with the etch kernel (2.6.18), the feisty or gutsy 2.6.22 and now the hardy 2.6.24.

The thing is, I don't use the Ubuntu Xen, but the etch backports one.
And still the hardy kernel doesn't do networking. So that's another hint that there's something with the kernel, not the hypervisor.

I would say this is a problem of the hardy xen kernel.

Paul Wagland (paul-kungfoocoder) wrote :

Nothing really new to add... but another cry of yes, this bug is hitting me too. 2.6.22-14-xen works perfectly, but 2.6.24-16 does not work for me, with exactly the same problem as described above.

Jan Evert van Grootheest (j-e-van-grootheest) wrote :

Given that there are two reports that there's a setup where other kernel versions work but this one does not, I'd say it is confirmed that there's a bug in this kernel.

Changed in linux-source-2.6.24:
status: New → Confirmed
Wido den Hollander (wido) wrote :

I tried the Gutsy kernel for my domU.

But now the guest gets stuck at "Setting up the system clock"

Adding HWCLOCK=no to /etc/default/rcS in the domU solves this, but then the guest gets stuck a little bit further while booting.

So this really should be fixed in Hardy, asap! The release is just 2 weeks away, and you cannot release a broken Xen in an LTS!

Todd Deshane (deshantm) wrote :

Can somebody send in a kernel patch to the xennet driver that fixes the problem and that can make it past the kernel freeze?

Thiago Martins (martinx) wrote :

I agree with the assessment that this isn't a hypervisor or xen-3.2 bug. I have tested the Hardy kernel linux-2.6.24 with the hypervisor and tools from the original xen-3.2.0.tar.gz, and I got the same network problem with the hypervisor/tools compiled from source, so this isn't a xen-3.2 bug. And of course I purged xen-hypervisor-3.2 and xen-utils-3.2 before doing "make install-hypervisor;make install-tools".

And if networking works with other kernels like 2.6.22 from Gutsy or even 2.6.18.8-xen from xen.org, it's clear to me that something in the Hardy kernel is incompatible.

And with only 8 days left until the Hardy release, /etc/init.d/xendomains has horrible, basic errors! See: https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/216761 .. I'm feeling fear... fear... what more can we expect!?

Maybe forget about Xen and join KVM?! All the options are on the table.

HIRANO Takahito (hiranotaka) wrote :
Todd Deshane (deshantm) wrote :

HIRANO: are you saying that this won't work with a hardy dom0?

If so, why?

I will try to do some testing this weekend.

whs (wolfram-heinen) wrote :

Today I installed the 8.04rc with kernel 2.6.24-16-xen on one of my servers (HP ML110) and the networking problem in domU still exists.

Todd Deshane (deshantm) wrote :

whs (and others): did you try the packages as listed in HIRANO's comment number 47 above?

https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/204010/comments/47

HIRANO Takahito (hiranotaka) wrote :

I just meant that the patch does not affect Dom0 at all.
It fixes bugs on DomU.

> HIRANO: are you saying that this won't work with a hardy dom0?
>
> If so, why?
>
> I will try to do some testing this weekend.

Wido den Hollander (wido) wrote :

At the moment I only have an i386 system to test on, so I won't be able to test HIRANO's kernel. Could somebody please test it so this can hopefully be fixed before hardy comes out?

Marcus Grieger (mgrieger) wrote :

root@ldap-master:~# uname -a
Linux ldap-master 2.6.24-16-xen #1 SMP Sat Apr 19 05:03:28 JST 2008 x86_64 GNU/Linux
root@ldap-master:~# ping www.google.de
PING www.l.google.com (209.85.137.99) 56(84) bytes of data.
64 bytes from mg-in-f99.google.com (209.85.137.99): icmp_seq=1 ttl=245 time=23.1 ms

This is in a domU with both the dom0 and domU running on HIRANO's kernel.

Thank you!

HIRANO Takahito (hiranotaka) wrote :

I tried to cross-compile the i386 version:
http://www.il.is.s.u-tokyo.ac.jp/~hiranotaka/linux-image-2.6.24-16-xen_2.6.24-16.30zng1_i386.deb

Note I haven't tested this at all.

James Blackwell (jblack) wrote :

Hirano,

Thanks much for chasing down the fix to this bug. You rock. :)

Paul Wagland (paul-kungfoocoder) wrote :

Hirano, I can confirm that your images fix the problem for me. I hope that this fix can be implemented for Hardy, since it is a kernel change that only affects the XEN kernel, and turns the xen kernel from having non-working networking to having a working networking stack.

Thanks for chasing this down Hirano!

HIRANO Takahito (hiranotaka) wrote :

James posted the suggestion on this bug to the mailing list. Thanks.
https://lists.ubuntu.com/archives/kernel-team/2008-April/002309.html

Leandro Pereira de Lima e Silva (leandro-limaesilva) wrote :

Since Hirano's patch fixes this bug, would it be a duplicate of #218126?

forall (forall-stalowka) wrote :

Hi

Does anyone have kernel 2.6.24-16-xen with Hirano's patch?

Albert

deti (deti) wrote : Re: [Bug 204010] Re: networking not working

> Someone have kernel 2.6.24-16-xen with Hirano's patch?
Yes and it's working fine. Now we hope his patch will be included in the
8.04 release...

forall (forall-stalowka) wrote :

deti, could you put this kernel somewhere so I can download it?

Albert

Wido den Hollander (wido) wrote :

"Declined for Hardy by Steve Langasek "

That's what the report says!

Like what!? A serious bug like this won't be fixed in Hardy? How can they do this? So many servers will go down when people upgrade from Gutsy to Hardy, not knowing this bug exists.

And how could this ever exist, don't people test their software? This is really madness! Network not working in a server edition, that's like not having graphical support in a desktop edition!

deti (deti) wrote :

forall wrote:
> deti, could you put this kernel somewhere so I can download it?
See posting above:
https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/204010/comments/47

Paul Wagland (paul-kungfoocoder) wrote :

Marking as a duplicate of bug 218126, since that bug is the same as this, and has had a fix committed. I am not sure if this means that a fix will make it for Hardy, but I am hopeful.

Revision history for this message
Jesper Terkelsen (jesper-terkelsen) wrote :

Hirano, thank you for the patch. I just tested the i386 version that you compiled, and it works for me.

Revision history for this message
Jesper Terkelsen (jesper-terkelsen) wrote :

I forgot to say that I run both Dom0 and DomU with the Hardy version.

Revision history for this message
driver (driver-megahappy) wrote :

+1 for Hirano's kernel

Works great... Hope he didn't install some kind of backdoor... ;)

Revision history for this message
Wolfgang (ubuntu-launchpad-wkraft) wrote :

Hello,

I was fighting with the same error described here. I installed Hirano's kernel (32-bit version) on my AMD Athlon X2 5600+ server.
With that kernel I am now able to bring up my network; however, network connectivity is still broken.

Right now I use Hardy Heron for Dom0 and my DomU clients, all with the patched kernel. However, it makes no difference if I use the original kernel in Dom0.

Every outgoing TCP connection stalls and finally hangs as soon as more than a couple of bytes are going out.

Doing an "ls -laR /" in an SSH session is already enough. As soon as I transfer a large file in a second session from my DomU guest via FTP to another physical server on the internet, IP connectivity to the DomU is no longer possible until I kill both processes and wait some time.

I am trying to debug what's going on at the TCP level.

Has anyone reading this had a similar experience?

Thanks for any hints.

Regards

  Wolfgang

Revision history for this message
terii (quad3datwork) wrote :

http://www.il.is.s.u-tokyo.ac.jp/~hiranotaka/

I see Hirano has uploaded new packages today...

Hirano, what's the fix?

Revision history for this message
HIRANO Takahito (hiranotaka) wrote :

These *untested* packages are built from Hardy's latest development source tree.
In addition to this Xen bug, some other issues have been fixed.
See changelog in the package for details.

Revision history for this message
Re Persina (r99990) wrote :

Takahito:
I tried your latest kernel (linux-image-2.6.24-19-xen_2.6.24-19.33~zng1_amd64.deb) but it did not solve the domU networking problem. As with the stock Hardy kernel, the guest OS does not receive any network traffic when using it.

Using your older kernel (linux-image-2.6.24-16-xen_2.6.24-16.30zng1_amd64.deb) works great, all networking is functioning.

Thank you.

Revision history for this message
HIRANO Takahito (hiranotaka) wrote :

Hmm, I cannot figure out the reason from your description of your environment...
As far as I have tested it, there were no problems with Xen networking.

Revision history for this message
mattsteven (matthew-matts) wrote :

Hirano, using the older[1] image you provided I have networking operating again (thank you!). However, the guest created with this kernel often crashes when it is assigned more than one vcpu. The program triggering the crash can vary. If it is of interest to you or other developers, I have attached the log.

1. linux-image-2.6.24-16-xen_2.6.24-16.30zng1_i386.deb

Limiting the vcpus to 1 seems to prevent it.
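
For anyone wanting to try the same workaround, a minimal sketch follows. It assumes a guest config such as /etc/xen/etch64.cfg from the original report and uses the standard xm `vcpus` setting; the exact file and the rest of its contents will differ per setup.

```python
# Fragment of an xm guest config (these files use Python syntax).
# Assumption: added to an existing guest config, e.g. /etc/xen/etch64.cfg.
# Restricts the guest to a single virtual CPU as a workaround for the crash.
vcpus = 1
```

The guest has to be shut down and started again with xm create for the change to take effect.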

Revision history for this message
HIRANO Takahito (hiranotaka) wrote :

I've been testing my amd64 kernels with several vcpus, but have not encountered such a problem.
It may be i386-specific.

According to the log, the crash happened at the PageForeignDestructor macro.
The index member of the page structure was made invalid for some reason,
but I haven't identified the cause.

Revision history for this message
mattsteven (matthew-matts) wrote :

Hirano, thanks a million for putting your kernels out there. Using your -19 kernel fixed a major instability problem that was plaguing one of my systems. You've really helped me a lot!

Revision history for this message
jist anidiot (jistanidiot) wrote :

This bug is still in 8.04.

The URLs in
https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/204010/comments/47
no longer seem to be valid.

Is there somewhere else I can get the patches to make Xen work?

Revision history for this message
mattsteven (matthew-matts) wrote :

jist anidiot wrote:
> This bug is still in 8.04.

This particular bug is fixed, but many bugs remain, including one I
submitted months ago (Bug 236389) where the Ubuntu Xen kernel crashes
regularly if you have >4GB of memory.

To get around this I built a kernel from the one supplied by XenSource;
they have a 2.6.18.8 kernel that is stable for high-memory machines.

The maintainers must be very busy and not too interested in getting a
fix in for this anytime soon, so you'll likely have to go it on your own.

Good luck!

--
Matthew Steven
http://matts.org/
515 230 7930
