1317772919 Q * dowdle Remote host closed the connection
1317773092 Q * nicholi Read error: Connection reset by peer
1317773356 J * nicholi ~nicholi@rrcs-76-79-196-34.west.biz.rr.com
1317776767 Q * clopez Ping timeout: 480 seconds
1317790495 J * sannes1 ~ace@cm-84.209.106.118.getinternet.no
1317792670 J * derjohn_mob ~aj@213.238.45.2
1317793632 Q * sladen Quit: Changing server
1317793787 J * sladen ~paul@starsky.19inch.net
1317797738 J * ghislain ~AQUEOS@adsl2.aqueos.com
1317800588 Q * ghislain Quit: Leaving.
1317801924 J * ghislain ~AQUEOS@adsl2.aqueos.com
1317803313 Q * ghislain Quit: Leaving.
1317803533 J * ghislain ~AQUEOS@adsl2.aqueos.com
1317804534 J * thierryp ~thierry@zankai.inria.fr
1317805766 J * kir ~kir@swsoft-msk-nat.sw.ru
1317806270 J * clopez ~clopez@155.99.117.91.static.mundo-r.com
1317806559 P * kir Leaving.
1317807259 N * Bertl_zZ Bertl
1317807263 M * Bertl morning folks!
1317807444 M * Bertl Mr_Smoke: so it took 4 months for that bug to happen?
1317807683 M * Mr_Smoke Approximately yeah
1317807689 M * Mr_Smoke maybe a little more
1317807761 M * Bertl then I'd definitely try with an updated kernel unless you find a way to reproduce/trigger it faster
1317807928 M * Bertl and I'd suggest to compile in debug info in that new kernel
1317808508 M * Bertl to me it looks like a race or problem in the I/O scheduler
1317808574 M * Bertl the interesting part is that __bad_area_nosemaphore() points to a userspace problem interaction maybe cause by reading the sensor values (but that's just guessing)
1317808779 M * Mr_Smoke Okay then
1317808788 M * Bertl the pagefault seems to happen early in cfq_completed_request()
1317808822 M * Bertl I can only guess the locations, but somewhere between struct cfq_queue *cfqq = and now = jiffies;
1317808841 M * Mr_Smoke I thought this kernel had debug info
1317808864 M * Bertl then try to reverse the following with addr2line -e vmlinux
1317808870 M * Bertl c1226d42
1317808876 M * Mr_Smoke 'k, sec
1317808905 M * Mr_Smoke /usr/src/linux/block/cfq-iosched.c:3402
1317808928 M * Bertl and what is in that line of your kernel source?
1317808950 M * Mr_Smoke if (cfqd->rq_in_driver > cfqd->hw_tag_est_depth)
1317808960 M * Mr_Smoke in function static void cfq_update_hw_tag(struct cfq_data *cfqd)
1317808974 M * Bertl that doesn't match
1317808984 M * Bertl it has to be in cfq_completed_request
1317808999 M * Mr_Smoke hm
1317809002 M * Mr_Smoke wrong vmlinux maybe
1317809004 M * Mr_Smoke hangon
1317809047 M * Bertl 2.6.38.7-vs2.3.0.37-rc15 #2
1317809120 M * Mr_Smoke well, that's the one lright
1317809121 M * Mr_Smoke Oo
1317809175 M * Bertl does that happen on only one machine, or several machines with the same hardware?
1317809198 M * Mr_Smoke only one, for the moment
1317809213 M * Bertl because it could as well be a hardware failure
1317809267 M * Mr_Smoke I've had a disk mishap in the (hw) raid array after reboot
1317809300 M * Mr_Smoke Would you suggest I go with the latest ? 2.3.1-pre10.1 ?
1317809342 M * Bertl a 3.x kernel won't hurt, but of course, with 4 month timespan between 'incidents' it's not easy to test
1317809383 M * Mr_Smoke I mean, is the 3.x/2.3.x preferable over the 2.6.39/2.3.x ?
1317809396 M * Mr_Smoke besides the fact that it's more recent
1317809400 J * fisted_ ~fisted@xdsl-87-78-213-121.netcologne.de
1317809406 M * Bertl the 2.6.39 is a backport of 3.x, and it is almost untested
1317809413 M * Mr_Smoke Yikes
1317809424 M * Mr_Smoke So either 2.6.38 or 3.0 then
1317809435 M * Bertl so 3.x is definitely the way to go if you do not want to stick to 2.6.38
1317809479 M * Mr_Smoke D'uh, a poorly reviewed grub.cfg made me boot on an even older 2.6.38 this time
1317809548 M * Mr_Smoke The thing is, with nearly all kernels since 2.6.36, I've had issues
1317809560 M * Mr_Smoke This panic is the first one for which I managed to get a log, thanks to netconsole
1317809587 M * Mr_Smoke In the past, these issues would happen on two servers with very similar hardware
1317809616 M * Mr_Smoke Since the rc15 milestone, they were both stable, and that issue from a few days ago was the first in months
1317809660 Q * fisted Ping timeout: 480 seconds
1317809666 M * Mr_Smoke So here I am, wondering.
1317809731 M * Bertl I don't see any relevant changes in rc15 (compared to rc14)
1317809752 M * Bertl http://vserver.13thfloor.at/ExperimentalT/delta-unhash-fix01.diff
1317809759 M * Bertl that's the only one actually
1317809789 M * Bertl it might have caused instability, but it definitely isn't I/O or cfq related
1317809817 M * Bertl it's more guest start/stop related
1317810028 M * Mr_Smoke 'k
1317810663 J * BenG ~bengreen@212.183.140.21
1317810967 M * Bertl off for a quick nap ... bbl
1317810974 N * Bertl Bertl_zZ
1317814197 Q * BenG Ping timeout: 480 seconds
1317814833 J * BenG ~bengreen@cpc12-aztw24-2-0-cust146.aztw.cable.virginmedia.com
1317815387 Q * BenG Quit: I Leave
1317819087 J * BenG ~bengreen@cpc12-aztw24-2-0-cust146.aztw.cable.virginmedia.com
1317819813 Q * BenG Quit: I Leave
1317820641 N * Bertl_zZ Bertl_oO
1317823609 Q * thierryp Remote host closed the connection
1317825244 N * Bertl_oO Bertl
1317825248 M * Bertl back now ...
1317826734 J * dowdle ~dowdle@scott.coe.montana.edu
1317827138 Q * ncopa Quit: Leaving
1317828455 Q * ntrs Ping timeout: 480 seconds
1317828944 J * ntrs ~ntrs@vault08.rosehosting.com
1317829873 J * hijacker_ ~hijacker@cable-84-43-136-96.mnet.bg
1317830722 M * Bertl translocating .. bbl
1317830728 N * Bertl Bertl_oO
1317831184 J * bonbons ~bonbons@2001:960:7ab:0:4425:c1ef:6036:932c
1317831588 Q * derjohn_mob Ping timeout: 480 seconds
1317831847 Q * Romster Ping timeout: 480 seconds
1317832085 J * Romster ~romster@202.168.100.149.dynamic.rev.eftel.com
1317832973 Q * Romster Ping timeout: 480 seconds
1317833077 J * Romster ~romster@202.168.100.149.dynamic.rev.eftel.com
1317836987 N * Bertl_oO Bertl
1317836991 M * Bertl back again ...
1317837437 Q * wurtel__ Ping timeout: 480 seconds
1317839449 M * Bertl and off to bed ... have to get up early tomorrow
1317839455 N * Bertl Bertl_zZ
1317840515 Q * Rockj Quit: *fluffles*
1317844190 J * wurtel__ ~paul@gw-office.telegraaf.net
1317845334 Q * cuba33ci Read error: Connection reset by peer
1317845423 J * cuba33ci ~cuba33ci@111-240-165-121.dynamic.hinet.net
1317845538 Q * ntrs Ping timeout: 480 seconds
1317845997 J * ntrs ~ntrs@vault08.rosehosting.com
1317846185 J * petzsch ~markus@p57B672E1.dip.t-dialin.net
1317846677 Q * hijacker_ Quit: Leaving
1317846861 Q * clopez Ping timeout: 480 seconds
1317849200 Q * bonbons Quit: Leaving
1317849637 J * clopez ~clopez@238.10.117.91.dynamic.mundo-r.com
1317853228 Q * petzsch Quit: Leaving.
1317854785 Q * fisted_ Read error: Connection reset by peer
1317855237 J * fisted ~fisted@xdsl-87-78-213-41.netcologne.de
1317855794 Q * dowdle Remote host closed the connection