1224376199 Q * morrigan Ping timeout: 480 seconds
1224376199 Q * Bertl Ping timeout: 480 seconds
1224376199 Q * BobR_zZ Ping timeout: 480 seconds
1224376568 T * * http://linux-vserver.org/ |stable 2.2.0.7, devel 2.3.0.34, grsec 2.2.0.7|util-vserver-0.30.215|libvserver-1.0.2|vserver-utils-1.0.3| He who asks a question is a fool for a minute; he who doesn't ask is a fool for a lifetime -- share the gained knowledge on the Wiki, and we forget about the minute.
1224376568 T * ChanServ -
1224376613 J * Bertl herbert@IRC.13thfloor.at
1224376800 J * BobR_zZ odie@IRC.13thfloor.at
1224377107 J * simleocas ~simleocas@190.13.230.134
1224377126 P * simleocas
1224378168 N * frootat frootat___away
1224379111 J * ktwilight__ ~ktwilight@189.72-66-87.adsl-dyn.isp.belgacom.be
1224379381 Q * ktwilight[m] Ping timeout: 480 seconds
1224379392 Q * ktwilight_ Ping timeout: 480 seconds
1224379423 J * ktwilight[m] ~ktwilight@189.72-66-87.adsl-dyn.isp.belgacom.be
1224386393 M * Bertl off to bed now ... have a good one everyone!
1224386399 N * Bertl Bertl_zZ
1224386533 M * Supaplex sleep well Bertl
1224387476 Q * derjohn_foo Ping timeout: 480 seconds
1224390359 Q * Loki|muh Remote host closed the connection
1224391691 Q * hparker Remote host closed the connection
1224391932 J * hparker ~hparker@linux.homershut.net
1224392413 Q * nenolod Quit: Leaving
1224392459 J * nenolod ~nenolod@ip70-189-74-62.ok.ok.cox.net
1224395173 J * frootat ~joern@dyndsl-080-228-177-062.ewe-ip-backbone.de
1224395514 Q * frootat___away Ping timeout: 480 seconds
1224395525 N * frootat frootat___away
1224398027 J * mtg ~mtg@dialbs-088-079-143-204.static.arcor-ip.net
1224399077 Q * hparker Quit: Read error: 104 (Peer reset by connection)
1224399924 Q * mtg Quit: Verlassend
1224400124 J * derjohn_mob ~aj@e180198075.adsl.alicedsl.de
1224400417 J * mtg ~mtg@dialbs-088-079-143-204.static.arcor-ip.net
1224400645 J * jmcaricand jm@172.252.192-77.rev.gaoland.net
1224403699 P * jmcaricand
1224405452 J * bonbons ~bonbons@2001:960:7ab:0:2c0:9fff:fe2d:39d
1224407447 N * frootat___away frootat
1224409578 N * frootat frootat___away
1224416122 Q * FireEgl resistance.oftc.net weber.oftc.net
1224416122 Q * sardyno_ resistance.oftc.net weber.oftc.net
1224416122 Q * Hollow resistance.oftc.net weber.oftc.net
1224416137 J * FireEgl FireEgl@173-16-9-10.client.mchsi.com
1224416137 J * sardyno_ ~me@pool-96-235-18-120.pitbpa.fios.verizon.net
1224416137 J * Hollow ~hollow@proteus.croup.de
1224416572 Q * Hollow Ping timeout: 480 seconds
1224417070 J * Hollow ~hollow@proteus.croup.de
1224417679 Q * Aiken Remote host closed the connection
1224419586 N * Bertl_zZ Bertl
1224419591 M * Bertl morning folks!
1224420245 J * ktwilight ~ktwilight@189.72-66-87.adsl-dyn.isp.belgacom.be
1224420271 Q * ktwilight__ Read error: Connection reset by peer
1224421815 J * Loki|muh loki@satanix.de
1224427066 M * Guy- hi, sorry about being off-topic, but anyone been programming with glib around here and willing to help me out a bit? There's something about GSequence I don't get, and the gnome channels aren't very responsive
1224427204 M * pmjdebru1jn Guy-: you probably couldn't have picked a more unrelated channel :)
1224427260 M * Guy- yes, well, I tend to hang out here and I know people are normally helpful :)
1224427270 M * Guy- (I try to be too, when something I know about catches my eye)
1224427387 J * ktwilight_ ~ktwilight@189.72-66-87.adsl-dyn.isp.belgacom.be
1224427418 Q * ktwilight Remote host closed the connection
1224427535 N * frootat___away frootat
1224428293 Q * ktwilight_ Remote host closed the connection
1224428503 J * ktwilight_ ~ktwilight@189.72-66-87.adsl-dyn.isp.belgacom.be
1224428503 Q * ktwilight_ Remote host closed the connection
1224429340 J * Pazzo ~ugelt@sadsl-246059.rol.raiffeisen.net
1224429371 Q * Pazzo
1224429402 J * hparker ~hparker@linux.homershut.net
1224431241 Q * mtg Quit: Verlassend
1224434805 J * ktwilight ~ktwilight@189.72-66-87.adsl-dyn.isp.belgacom.be
1224436097 M * daniel_hozac Bertl: hmm. notagcheck is broken on 2.6.27, right?
1224436160 M * Bertl not intentionally, but could be
1224436335 M * daniel_hozac the only dx_permission check i see calls it with a NULL nd.
1224436356 M * daniel_hozac which makes sense, since inode_permission no longer has the nd available either.
1224436391 J * doener ~doener@i577AD09B.versanet.de
1224436438 M * daniel_hozac it does mean we can't really do notagcheck though...
1224436483 M * Bertl ah, right ..
1224436498 Q * doener_ Ping timeout: 480 seconds
1224436557 M * Bertl so either we move that check further up, or we propagate the notag info down (which sounds messy)
1224436579 M * daniel_hozac both of them sound really messy.
1224436591 M * daniel_hozac i think the easiest way is to make notagcheck a superblock option.
1224436646 M * Bertl or we drop that feature completely, but yeah, sb option sounds fine to me
1224436712 M * daniel_hozac well, both me personally and PlanetLab do need it.
1224436750 M * Bertl okay, no problem with that .. working on patches?
1224436777 M * daniel_hozac yeah, i was forward-porting my permission cleanup patches when i noticed it.
1224436787 M * daniel_hozac i'll fix this first.
1224436814 M * Bertl okay, let me know when you have something or just want to discuss some stuff
1224438345 J * frootat_ ~joern@dyndsl-091-096-046-036.ewe-ip-backbone.de
1224438412 J * jmcaricand jm@172.252.192-77.rev.gaoland.net
1224438717 Q * frootat Ping timeout: 480 seconds
1224439364 P * jmcaricand
1224441158 M * Bertl nap attack ... bbl
1224441162 N * Bertl Bertl_zZ
1224442534 Q * cga Quit: WeeChat 0.2.6
1224444624 Q * pmenier Quit: Konversation terminated!
1224445986 J * faheem ~faheem@cpe-071-077-007-143.nc.res.rr.com
1224446058 M * faheem Hi. Just upgraded to Debian lenny (upcoming release). A vserver took ages to start, and gave the following warning.
1224446061 M * faheem Enabling additional executable binary formats: binfmt-supportFATAL: Could not load /lib/modules/2.6.26-1-vserver-686/modules.dep: No such file or directory
1224446064 M * faheem update-binfmts: warning: Couldn't load the binfmt_misc module.
1224446081 M * faheem What should I do about this, if anything?
1224446111 M * daniel_hozac uh, clean up your guest's initscripts.
1224446130 M * faheem The file in question does exist.
1224446182 M * faheem daniel_hozac: What is wrong with the initscripts, and how do I clean them up? Should have mentioned the vserver guests in question were created using lenny. Is there any upgrade procedure I need to do?
1224446251 M * daniel_hozac your guests' initscripts shouldn't be attempting to load modules.
1224446256 M * daniel_hozac disable any such scripts.
1224446263 J * dna ~dna@p54BCFAC0.dip.t-dialin.net
1224446301 M * faheem daniel_hozac: Oh. I never added anything to the initscripts.
1224446322 M * daniel_hozac but you've installed something which did.
1224446773 M * faheem Oh, I see. It is trying to load a module from inside the guest.
1224446786 M * harry i've put a test grsec + vserver online on "another" site, because i'm not fully sure that the patch works as well as it should...
1224446791 M * harry if someone wants to test:
1224446796 M * harry http://harry.enzoverder.be/patch-2.6.26.6-vs2.3.0.35.6-grsec2.1.12-20081010.diff
1224446810 M * faheem And, of course, there is no kernel there.
1224446841 M * harry if i get some OK tests, i'll put it online on vserver page, if not, i'd like to know what the problems are, so i can fix them
1224446861 M * harry don't have the hardware nor the time to do extensive testing myself
1224446864 M * harry (new job, ...)
1224447040 M * daniel_hozac Bertl_zZ: http://people.linux-vserver.org/~dhozac/p/k/delta-notagcheck-fix01.diff
1224447081 M * harry you broke it? ;)
1224447123 M * daniel_hozac Bertl_zZ: i also fixed the option parsing, i have no idea how that ever worked.
1224447266 Q * ktwilight Remote host closed the connection
1224447281 J * ktwilight ~ktwilight@189.72-66-87.adsl-dyn.isp.belgacom.be
1224447505 Q * ktwilight
1224447823 Q * hparker Quit: Read error: 104 (Peer reset by connection)
1224448469 Q * bonbons Quit: Leaving
1224448820 J * nou Chaton@2001:6f8:328:bbc:6666:6667::
1224448856 J * Aiken ~Aiken@ppp118-208-28-181.lns2.bne1.internode.on.net
1224449175 Q * dna Quit: Verlassend
1224449871 M * Guy- vserver hashify just caused itself, pdflush and the jfsCommit kernel thread to be frozen in D state in 2.6.22.19+vs2.2.0.7 - is this a known problem?
1224449902 M * daniel_hozac could you get a kernel trace of that?
1224449907 M * Guy- pdflush and vhashify are in lmGroupCommit
1224449919 M * Guy- jfsCommit is in __get_metapage
1224449924 M * Guy- what else should I find out and how?
1224449954 M * daniel_hozac echo t > /proc/sysrq-trigger
1224450054 M * Guy- OK, I have a trace
1224450059 M * Guy- shall I paste it somewhere?
1224450063 M * daniel_hozac yes, please.
1224450111 M * Guy- just these processes or the whole thing?
1224450140 M * daniel_hozac those processes should suffice.
1224450699 M * Guy- daniel_hozac: http://pastebin.be/14411 (I couldn't find jfsCommit in dmesg for some reason)
1224450957 Q * frootat_ Quit: :(){ :|:&};:
1224451239 J * joern42 ~joern@dyndsl-091-096-046-036.ewe-ip-backbone.de
1224451251 N * joern42 frootat
1224453344 J * yarihm ~yarihm@77-56-182-18.dclient.hispeed.ch
1224455138 M * Guy- any immediate insight? :)
1224455536 M * Guy- also, is there anything more I can do to help in debugging before I reboot?
1224455543 M * daniel_hozac doesn't strike me as caused-by-vserver. does the filesystem still work?
1224455547 M * Guy- (I assume there is no other way out)
1224455554 M * Guy- I can read from it but not write to it
1224455578 M * Guy- (writing results in the same kind of lockup)
1224455604 M * Guy- I'll have to reboot -f -n because sync hangs as well
1224455852 M * Guy- I've seen similar hangs with raid5 and write intent bitmaps, but this is on raid10
1224455860 M * Guy- I'll try to reproduce it with no bitmap
1224455878 N * Bertl_zZ Bertl
1224455883 M * Bertl back now ...
1224455911 M * Guy- moin
1224455918 M * Bertl jfs filesystem?
1224455960 M * Guy- yes
1224455986 M * Bertl looks like an umount/sync caused your D state
1224456035 M * Bertl do you have anything in dmesg/log from before that?
1224456057 M * Guy- I did a vserver stop, immediately followed by a vserver start and then started vhashify right away (from a script)
1224456077 M * Bertl I think the problem occurred before the vhashify, the vhashify just got caught
1224456077 M * Guy- no, my kernel logging was broken, I only fixed it now to obtain the trace
1224456151 M * Bertl you had that with raid5 before?
1224456166 M * Bertl on the very same machine?
1224456182 M * Guy- no, different machine
1224456212 M * Guy- (only similarity is that it was also amd64)
1224456253 M * Guy- I can't swear that jfsCommit was hanging in the same syscall in that case too; I just know that I had a similar hang
1224456268 M * Guy- high i/o (such as a dirvish backup) would trigger it
1224456280 M * Guy- and removing the write intent bitmap from the raid5 "fixed" it
1224456328 M * Bertl I think it just lowered the chances to hit it
1224456357 M * Bertl looks to me like a locking issue in jfs, or some intrinsic problem with the I/O path, which could be hardware related
1224456358 M * Guy- could be; it hasn't happened since and it's been a year or so
1224456392 M * Guy- the i/o path is somewhat convoluted in all cases because it's jfs on lvm on luks on softraid on sata
1224456408 M * Bertl http://article.gmane.org/gmane.linux.kernel/731223
1224456452 M * Bertl (just to get an idea :)
1224456498 M * Guy- yes, looks vaguely similar
1224456518 M * Guy- unfortunately I can't recall what gave me the idea to remove the write intent bitmap
1224456526 M * Guy- I saw the problem described somewhere, probably
1224456571 M * Bertl as I said, I think the bitmap removal is just papering over the issue, by making it very unlikely to appear, for whatever reason
1224456596 M * Bertl write intent bitmaps have worked flawlessly here with ext3 for almost a year now
1224456642 M * Guy- it works for me on raid1 too
1224456649 M * Guy- I only had problems with raid5, and now raid10
1224456694 M * Guy- and luks may also be involved because I have a box with raid5 and jfs, without luks, with bitmap, and it doesn't hang
1224456735 M * Guy- but sure, it is conceivable that by disabling the bitmap I'm just lowering the chances of hitting the bug
1224456743 M * Bertl luks just provides the keys, dm does the main work part, no?
1224456753 M * Guy- dm-crypt, yes
1224456780 M * Guy- oh, could also be SMP related
1224456793 M * daniel_hozac is it a UP box?
1224456794 M * Guy- the boxes that hang are SMP, the one that doesn't is single cpu
1224456803 M * Bertl i.e. the more indirection and async I/O you do, the more likely the effect shows up, and if it is a locking issue, it definitely needs SMP :)
1224456813 M * daniel_hozac SMP should lower the odds of deadlocks, not increase them.
1224456915 M * yarihm Bertl, do you have a moment?
1224456961 M * Bertl sure
1224457081 M * Guy- while trying to stop a vserver, /bin/bash /usr/sbin/vserver ----nonamespace vbox-jayhawk stop /dev/null ran away and is eating my CPU
1224457098 M * Guy- it's doing
1224457099 M * Guy- rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
1224457099 M * Guy- rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
1224457106 M * Guy- in a seemingly endless loop
1224457323 J * hparker ~hparker@linux.homershut.net
1224457616 M * daniel_hozac Bertl: did you see http://people.linux-vserver.org/~dhozac/p/k/delta-notagcheck-fix01.diff?
1224460771 Q * nou Remote host closed the connection
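
The hang discussed in this log left vhashify, pdflush and jfsCommit in uninterruptible sleep (state D). As a rough sketch, not something anyone in the channel posted, the Python below lists such tasks by reading /proc/<pid>/stat. The function names are made up for illustration, and it only looks at top-level /proc entries, not per-thread entries under /proc/<pid>/task.

```python
import os


def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    head, _, rest = line.partition("(")
    comm, _, tail = rest.rpartition(")")
    return int(head), comm, tail.split()[0]


def uninterruptible_tasks(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for processes currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'sys'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or stat was unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

Once the stuck PIDs are known, `echo t > /proc/sysrq-trigger` (as daniel_hozac suggests) dumps the task stacks to the kernel log, where those PIDs can be looked up.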