1447719248 J * Ghislain ~aqueos@adsl1.aqueos.com
1447721444 Q * Ghislain Read error: Connection reset by peer
1447721714 J * Ghislain ~aqueos@adsl1.aqueos.com
1447721961 Q * fstd Remote host closed the connection
1447721972 J * fstd ~fstd@xdsl-87-78-17-177.netcologne.de
1447723066 Q * Ghislain Quit: Leaving.
1447737017 M * Bertl_oO off to bed now ... have a good one everyone!
1447737018 N * Bertl_oO Bertl_zZ
1447742880 J * derjohn_mob ~aj@88.128.80.58
1447745957 Q * derjohn_mob Ping timeout: 480 seconds
1447748918 J * derjohn_mob ~aj@fw.gkh-setu.de
1447750617 J * Ghislain ~aqueos@adsl1.aqueos.com
1447752326 J * Gremble ~Gremble@cpc29-aztw22-2-0-cust128.18-1.cable.virginm.net
1447753115 Q * derjohn_mob Ping timeout: 480 seconds
1447753723 J * derjohn_mob ~aj@fw.gkh-setu.de
1447756819 N * Bertl_zZ Bertl
1447756822 M * Bertl morning folks!
1447756858 M * Ghislain hi
1447756911 M * Ghislain i was testing 4.1.13 and let my machine here, it crash ( my windows box) and this morning i see the server has a vserver enter at 100% cpu zombi state :(
1447756956 M * Bertl 4.1.13 crashed your windows box?
1447756962 M * Ghislain lol
1447756990 M * Ghislain no i was on the host logged into the guest with a vserver xx enter. then my computer crashed and i forget about it
1447757012 M * Ghislain this morning i come back on the server and see the vserver enter is at 100% cpu zombi mode process
1447757055 M * Bertl well, maybe it is still trying to send data to your machine
1447757085 M * Bertl shouldn't happen, but I don't think that is a typical scenario :)
1447757111 M * Ghislain well if i can crash a whole server just because my ssh box crashed then it is a concern :)
1447757147 M * Bertl you can do the same with a bash fork loop from your handy
1447757167 M * Bertl for me it falls under the category "don't do it"
1447757180 P * undefined
1447757195 M * Bertl if you expect your machine to crash over night, don't leave a connction open or put it into a screen
1447757197 M * Ghislain well the day adsl will be 100% uptime and windows will be too this will certainly be easier
1447757246 M * Bertl but I don't have a problem to investigate this if you can reliably reproduce it :)
1447757261 J * undefined ~undefined@00011a48.user.oftc.net
1447757267 M * Ghislain crashing is easy ;p will try
1447758287 M * Ghislain ok i reproduced it: connect to the host, vserver enter the guest, do a top
1447758293 M * Ghislain kill the putty windows
1447758313 M * Ghislain 3906 root 20 0 41016 3348 2960 R 99.3 0.0 0:52.49 sudo vserver jessieguest enter
1447758367 M * Bertl and the resulting process is unklillable on the host?
1447758404 M * Bertl i.e. vkill doesn't work on it?
1447758437 M * Bertl what util-vserver version?
1447758451 M * Bertl daniel_hozac: anything which comes to your mind regarding signal handling?
1447758517 M * Ghislain Kernel: 4.1.13-vs2.3.8.3aq, util-vserver: 0.30.216-pre3120
1447758538 M * Ghislain yes i cannot kill the process at all kill kill -9 vkill etc...
1447758697 M * Ghislain http://pastebin.com/raw.php?i=yuAbN1Lt
1447758748 N * ensc Guest9079
1447758758 J * ensc ~irc-ensc@p54ADC9DA.dip0.t-ipconnect.de
1447758940 M * Ghislain undefined: have you the same behavior if you do this ? or anyone for that matter ^^
1447759045 M * Ghislain lsof show /dev/pts/0 (deleted)
1447759167 Q * Guest9079 Ping timeout: 480 seconds
1447759706 Q * Gremble Ping timeout: 480 seconds
1447761100 N * ensc Guest9084
1447761110 J * ensc ~irc-ensc@p54ADD975.dip0.t-ipconnect.de
1447761381 J * Gremble ~Gremble@cpc87151-aztw31-2-0-cust755.18-1.cable.virginm.net
1447761435 J * beng_ ~Gremble@cpc87151-aztw31-2-0-cust755.18-1.cable.virginm.net
1447761436 Q * beng_
1447761503 Q * Guest9084 Ping timeout: 480 seconds
1447762779 Q * Gremble Quit: I Leave
1447765161 Q * fstd Remote host closed the connection
1447765172 J * fstd ~fstd@xdsl-81-173-191-136.netcologne.de
1447771948 M * undefined Ghislain, Bertl: just tested it here
1447771952 M * undefined it's specific to sudo
1447771970 M * undefined can't reproduce without using sudo
1447771977 M * Ghislain oh we made a leap there ! good catch
1447771989 M * undefined and upon using sudo we get a kernel message
1447771993 M * undefined let me capture it
1447772012 M * Ghislain it says "got you hey !"
1447772051 M * undefined no, something about "signal... xid mismatch"
1447772067 M * Ghislain i got plenty oif those
1447772068 M * undefined let me retry it attached to the serial console
1447772080 M * Ghislain seems to happen when you change the resolution of the window
1447772109 M * Ghislain from a conversation with daniel_hozack but i can be mangling conversations
1447772113 M * Ghislain ok
1447772215 M * undefined [ 143.408622] vxD: ffff88001712f330: signal 28[ffff880012bfbe98] xid mismatch ffff8800176bb570[#0,2222] xid=#139
1447772215 M * undefined [ 143.429821] vxD: ffff88001712f330: signal 28[ffff880012bfbe98] xid mismatch ffff8800176bb570[#0,2222] xid=#139
1447772255 M * undefined maybe a red herring
1447772259 M * Bertl well, it all boils down to: "do not use vserver enter" except for emergency cases
1447772300 M * Bertl use normal ssh for regular access to the guest
1447772302 M * undefined "vserver enter" works fine without sudo
1447772359 M * undefined i'll try this on 3.18
1447772364 M * undefined (just out of curiousity)
1447772373 M * Bertl most likely the terminal is not just across contexts but also across permission domains (i.e. between user and root)
1447772402 M * Bertl and signals (like the WINCHNG) do not propagate upward for security reasons
1447772424 M * Bertl in this case, it seems that Linux-VServer is blocking it
1447772779 M * undefined yeah, issue of sudo hanging doesn't exist on 3.18
1447772815 M * Ghislain ok so this is 4.1 news
1447772925 M * undefined still get the "signal 28[...] xid mismatch" message when executing sudo, but i don't get the "rcu_sched self-detected stall on CPU" when disconnecting the ssh connection ("~.")
1447772933 M * undefined under 3.18
1447772941 M * Ghislain on my particular case i sudo in a shell then vserver tart/stop and bang...kernel explode
1447773005 M * Bertl kernel explode = system reboots
1447773021 M * Ghislain i mean it goes to 100%cpu
1447773042 M * Bertl a process hogging the cpu trying to send a signal is harmless
1447773080 M * Bertl I'm not sure why you can't terminate it yet, but the rest is userspace only
1447773116 M * Ghislain seems strange that sending a signal goes in such a loop
1447773155 M * Bertl well, the signal is not delivered (for security reasons) and the userspace task goes crazy because of that
1447773177 M * Bertl (probably because it never happened before and nobody considered this case)
1447773208 M * Ghislain would there be a way to respond "whatever man" instead of just blocking ? is there a standard answer that can be delivered when the signal is not accepted
1447773211 M * Ghislain oh ok
1447773229 M * Ghislain you know like a "file not found 404" :p
1447773284 M * Bertl try to sudo a screen first, then do the enter
1447773297 M * Bertl I wouldn't be surprised if that worked just fine
1447773313 M * undefined interesting, the problem doesn't happen when stracing sudo
1447773655 M * Ghislain oh yes i had this too on a related issue that locked me, surely related
1447773875 M * Ghislain under tmux i got the same 100%cpu
1447773899 M * Ghislain hum not all the same
1447773908 M * Ghislain i can ctrl-c it in tmux(screen)
1447776740 M * Ghislain ps auxwf
1447776743 M * Ghislain oups
1447777783 Q * AbyssOne_ Ping timeout: 480 seconds
1447778765 J * AbyssOne_ ~jelle3@62.27.85.48
1447785855 J * sannes ~ace@2a02:fe0:c120:9660:d058:f197:599b:6d9
1447791017 M * Bertl off for now ... bbl
1447791018 N * Bertl Bertl_oO
1447791401 M * daniel_hozac Ghislain: which process do you see using 100% cpu?
1447791407 M * daniel_hozac the parent or the child?
1447792808 Q * derjohn_mob Ping timeout: 480 seconds
1447793358 Q * Ghislain Quit: Leaving.
1447794734 J * bonbons ~bonbons@2001:a18:22b:9901:11ee:9672:1422:6927
1447794811 J * thierryp ~thierry@2a01:e35:2e2b:e2c0:bc43:a50f:380b:1778
1447794930 Q * thierryp Remote host closed the connection
1447795237 J * derjohn_mob ~aj@p578b6aa1.dip0.t-ipconnect.de
1447795872 J * thierryp ~thierry@82.226.190.44
1447796277 Q * bonbons Quit: Leaving
1447796371 Q * thierryp Quit: ciao folks
1447796477 J * thierryp ~thierry@82.226.190.44
1447796825 Q * thierryp Remote host closed the connection
1447796889 J * thierryp ~thierry@82.226.190.44
1447797454 M * undefined daniel_hozac: 1. "sudo vserver enter" is at 100% 2. "vcontext" is zombied
1447797484 M * undefined i'm guessing vcontext is child as it is one PID off from "sudo vserver"
1447797621 Q * thierryp Remote host closed the connection
1447797823 Q * sannes Quit: Leaving.
1447797941 J * thierryp ~thierry@2a01:e35:2e2b:e2c0:4c62:9094:9987:2a56
1447798006 M * daniel_hozac undefined: what process is 1 at that point? can you check /proc//exe and such?
1447798109 Q * thierryp Remote host closed the connection
1447798470 M * daniel_hozac it looks like it'd be sudo, but i don't know why sudo wouldn't have exec'd at that point.
1447798835 M * daniel_hozac i seem to be able to reproduce it though.
1447800088 M * undefined it appears sudo doesn't exec the requested command, if only to log it (ie "sudo: pam_unix(sudo:session): session closed for user ...")
1447800135 M * undefined well, the leaving of the elevated session