
FIX: CentOS 6.6 on X8SIE-F: NIC links go down, all packet counters climb wildly, no networking, or: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out

July 12, 2015

Bug: on a new CentOS 6.6 install on a SuperMicro X8SIE-F, after some time the NIC links go down, all packet counters climb wildly, and networking stops.

     That is, if you manage to install CentOS 6.6 over the network at all 🙂

Cause: It’s ALL about ASPM (PCIe Active State Power Management)

Logs:

Jul 10 23:01:41 localhost kernel: Hardware name: X8SIE
Jul 10 23:01:41 localhost kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Jul 10 23:01:41 localhost kernel: Modules linked in: ipv6 iTCO_wdt iTCO_vendor_support serio_raw i2c_i801 i2c_core sg lpc_ich mfd_core e1000e ptp pps_core ext4 jbd2 mbcache raid1 sd_mod crc_t10dif pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
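
To check whether a machine is hitting the same problem, the watchdog message can be counted in the system log. This is only a sketch: the helper name `count_watchdog_timeouts` is made up for illustration, and it assumes the default CentOS 6 syslog location.

```shell
# Count e1000e transmit-queue watchdog timeouts in syslog text read on stdin.
# (count_watchdog_timeouts is a hypothetical helper name, not an existing tool.)
count_watchdog_timeouts() {
    grep -c 'NETDEV WATCHDOG: .*(e1000e): transmit queue [0-9]* timed out'
}

LOG=/var/log/messages   # assumption: default syslog file on CentOS 6
if [ -r "$LOG" ]; then
    echo "watchdog timeouts: $(count_watchdog_timeouts < "$LOG")"
fi
```

A non-zero count together with "Reset adapter unexpectedly" lines from e1000e matches the symptom described here.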

Fixing at boot time:

Append to the GRUB kernel boot line: pcie_aspm=off
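
One possible way to make the parameter permanent and to confirm it after a reboot is sketched below. It assumes `grubby` is available (it normally is on CentOS 6). The helper `has_aspm_off` is a made-up name for illustration. The `grubby` call is left commented out because it changes the boot configuration.

```shell
# Return 0 if the given kernel command line contains pcie_aspm=off.
# (has_aspm_off is a hypothetical helper name.)
has_aspm_off() {
    case " $1 " in
        *" pcie_aspm=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: grubby is installed. Run as root to add the parameter
# to every installed kernel entry:
# grubby --update-kernel=ALL --args="pcie_aspm=off"

# After rebooting, check that the running kernel actually received it.
if has_aspm_off "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "pcie_aspm=off is active"
else
    echo "pcie_aspm=off is NOT on the running kernel command line"
fi
```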

Fixing after boot (as root, lasts until the next reboot):

 echo "performance" > /sys/module/pcie_aspm/parameters/policy
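
To see which policy is currently in effect, the same sysfs file can be read back; the kernel marks the active policy with square brackets. A minimal sketch, where `active_aspm_policy` is a hypothetical helper name:

```shell
POLICY_FILE=/sys/module/pcie_aspm/parameters/policy

# Extract the bracketed (currently active) entry from the policy string,
# e.g. "default [performance] powersave" -> "performance".
# (active_aspm_policy is a hypothetical helper name.)
active_aspm_policy() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

if [ -r "$POLICY_FILE" ]; then
    echo "active ASPM policy: $(active_aspm_policy "$(cat "$POLICY_FILE")")"
else
    echo "$POLICY_FILE not found"
fi
```

If the write is rejected, ASPM may already be disabled on the kernel command line, in which case there is nothing left to change at runtime.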

Fixing from the BIOS:

Go to: Advanced – Chipset Configuration (most probably; otherwise look for something similar in the main menus)

Set: Active State Power Management = Disabled
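
Whichever of the three fixes is used, the result can be checked on the NIC itself with `lspci`. The sketch below assumes the device address 03:00.0 taken from the e1000e lines in the log in this post, and it assumes the usual `LnkCtl: ASPM ...;` layout of `lspci -vv` output, which may differ between pciutils versions. `aspm_state` is a made-up helper name.

```shell
# Print the ASPM state from the LnkCtl line of `lspci -vv` output read on stdin.
# (aspm_state is a hypothetical helper name.)
aspm_state() {
    sed -n 's/.*LnkCtl:[[:space:]]*ASPM \([^;]*\);.*/\1/p'
}

# Assumption: 03:00.0 is the e1000e device, as in the log; adjust as needed.
# Run as root, otherwise lspci may hide the PCIe capability details.
if command -v lspci >/dev/null 2>&1; then
    lspci -vv -s 03:00.0 2>/dev/null | aspm_state
fi
```

After a successful fix the line printed should read "Disabled".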

Here is the /var/log/messages log:

Jul 10 23:00:12 localhost dhclient[1173]: DHCPREQUEST on eth0 to 192.168.203.1 port 67 (xid=0x7e9add19)
Jul 10 23:00:24 localhost dhclient[1173]: DHCPREQUEST on eth0 to 192.168.203.1 port 67 (xid=0x7e9add19)
Jul 10 23:00:43 localhost dhclient[1173]: DHCPREQUEST on eth0 to 192.168.203.1 port 67 (xid=0x7e9add19)
Jul 10 23:00:58 localhost dhclient[1173]: DHCPREQUEST on eth0 to 192.168.203.1 port 67 (xid=0x7e9add19)
Jul 10 23:01:09 localhost dhclient[1173]: DHCPREQUEST on eth0 to 192.168.203.1 port 67 (xid=0x7e9add19)
Jul 10 23:01:20 localhost dhclient[1173]: DHCPREQUEST on eth0 to 192.168.203.1 port 67 (xid=0x7e9add19)
Jul 10 23:01:34 localhost dhclient[1173]: DHCPREQUEST on eth0 to 192.168.203.1 port 67 (xid=0x7e9add19)
Jul 10 23:01:41 localhost kernel: ------------[ cut here ]------------
Jul 10 23:01:41 localhost kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26b/0x280() (Not tainted)
Jul 10 23:01:41 localhost kernel: Hardware name: X8SIE
Jul 10 23:01:41 localhost kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Jul 10 23:01:41 localhost kernel: Modules linked in: ipv6 iTCO_wdt iTCO_vendor_support serio_raw i2c_i801 i2c_core sg lpc_ich mfd_core e1000e ptp pps_core ext4 jbd2 mbcache raid1 sd_mod crc_t10dif pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Jul 10 23:01:41 localhost kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.el6.x86_64 #1
Jul 10 23:01:41 localhost kernel: Call Trace:
Jul 10 23:01:41 localhost kernel: <IRQ> [<ffffffff81074df7>] ? warn_slowpath_common+0x87/0xc0
Jul 10 23:01:41 localhost kernel: [<ffffffff81014a29>] ? sched_clock+0x9/0x10
Jul 10 23:01:41 localhost kernel: [<ffffffff81074ee6>] ? warn_slowpath_fmt+0x46/0x50
Jul 10 23:01:41 localhost kernel: [<ffffffff8147df7b>] ? dev_watchdog+0x26b/0x280
Jul 10 23:01:41 localhost kernel: [<ffffffff810aaac9>] ? ktime_get+0x69/0xf0
Jul 10 23:01:41 localhost kernel: [<ffffffff81087125>] ? internal_add_timer+0xb5/0x110
Jul 10 23:01:41 localhost kernel: [<ffffffff8147dd10>] ? dev_watchdog+0x0/0x280
Jul 10 23:01:41 localhost kernel: [<ffffffff81087db7>] ? run_timer_softirq+0x197/0x340
Jul 10 23:01:41 localhost kernel: [<ffffffff810a375b>] ? hrtimer_interrupt+0x14b/0x260
Jul 10 23:01:41 localhost kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0
Jul 10 23:01:41 localhost kernel: [<ffffffff810eaa90>] ? handle_IRQ_event+0x60/0x170
Jul 10 23:01:41 localhost kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
Jul 10 23:01:41 localhost kernel: [<ffffffff8100fc15>] ? do_softirq+0x65/0xa0
Jul 10 23:01:41 localhost kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90
Jul 10 23:01:41 localhost kernel: [<ffffffff81533b45>] ? do_IRQ+0x75/0xf0
Jul 10 23:01:41 localhost kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11
Jul 10 23:01:41 localhost kernel: <EOI> [<ffffffff812ea5ee>] ? intel_idle+0xde/0x170
Jul 10 23:01:41 localhost kernel: [<ffffffff812ea5d1>] ? intel_idle+0xc1/0x170
Jul 10 23:01:41 localhost kernel: [<ffffffff81425b97>] ? cpuidle_idle_call+0xa7/0x140
Jul 10 23:01:41 localhost kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
Jul 10 23:01:41 localhost kernel: [<ffffffff81522e2c>] ? start_secondary+0x2be/0x301
Jul 10 23:01:41 localhost kernel: ---[ end trace cf60bf900efaa126 ]---
Jul 10 23:01:41 localhost kernel: e1000e 0000:03:00.0: eth0: Reset adapter unexpectedly
Jul 10 23:01:42 localhost kernel: e1000e 0000:03:00.0: eth0: Timesync Tx Control register not set as expected
Jul 10 23:01:50 localhost dhclient[1173]: DHCPREQUEST on eth0 to 192.168.203.1 port 67 (xid=0x7e9add19)
Jul 10 23:02:05 localhost dhclient[1173]: DHCPREQUEST on eth0 to 192.168.203.1 port 67 (xid=0x7e9add19)

Also, keep the software up to date: check kernel.org for a newer kernel.

Here is a good write-up of the issue: http://www.hv23.net/2011/09/intel-82756-gigethernet-on-centosfedora-hang-how-to-fix/
