1425516336 Q * Ghislain Quit: Leaving.
1425517204 Q * fstd Remote host closed the connection
1425517222 Q * bonbons Quit: Leaving
1425517245 J * fstd ~fstd@xdsl-87-78-181-190.netcologne.de
1425517331 Q * dustinm` graviton.oftc.net charm.oftc.net
1425517331 Q * BWare graviton.oftc.net charm.oftc.net
1425517331 Q * PowerKe graviton.oftc.net charm.oftc.net
1425517331 Q * geb graviton.oftc.net charm.oftc.net
1425517331 Q * ggherdov__ graviton.oftc.net charm.oftc.net
1425517331 Q * gamingrobot_ graviton.oftc.net charm.oftc.net
1425517331 Q * jrayhawk graviton.oftc.net charm.oftc.net
1425517331 Q * kshannon graviton.oftc.net charm.oftc.net
1425517331 Q * guerby graviton.oftc.net charm.oftc.net
1425517331 Q * webhat_ graviton.oftc.net charm.oftc.net
1425517331 Q * hparker graviton.oftc.net charm.oftc.net
1425517331 Q * funnel graviton.oftc.net charm.oftc.net
1425517336 J * jrayhawk ~jrayhawk@nursie.omgwallhack.org
1425517336 J * dustinm` ~dustinm`@2607:5300:100:200::160d
1425517336 J * kshannon ~kris@server.kris.shannon.id.au
1425517342 J * PowerKe ~tom@d54C6A573.access.telenet.be
1425517342 J * BWare ~itsme@31.25.99.5
1425517342 J * guerby ~guerby@ip165-ipv6.tetaneutral.net
1425517343 J * funnel ~funnel@81.4.123.134
1425517350 J * geb ~geb@mars.gebura.eu.org
1425517354 J * gamingrobot_ sid10990@2604:8300:100:200b:6667:2:0:2aee
1425517360 J * webhat ~quassel@31.25.99.5
1425517363 J * ggherdov__ sid11402@2604:8300:100:200b:6667:3:0:2c8a
1425517369 J * hparker ~hparker@0000fb24.user.oftc.net
1425531240 Q * jrklein Quit: Quitting
1425531256 J * jrklein ~cloud@proxy.dnihost.net
1425533885 Q * Aiken Remote host closed the connection
1425534021 J * Aiken ~Aiken@d63f.h.jbmb.net
1425535443 J * derjohn_mob ~aj@p578b6aa1.dip0.t-ipconnect.de
1425537116 N * Bertl_zZ Bertl
1425537117 M * Bertl morning folks!
1425541354 J * Ghislain ~aqueos@adsl1.aqueos.com
1425541933 Q * derjohn_mob Ping timeout: 480 seconds
1425542087 J * wicope ~wicope@0001fd8a.user.oftc.net
1425544447 J * derjohn_mob ~aj@fw.gkh-setu.de
1425544475 J * BenG ~BenG@cpc29-aztw22-2-0-cust128.18-1.cable.virginm.net
1425548224 J * bonbons ~bonbons@2001:a18:20b:c701:e151:7e68:e0ab:9e23
1425548283 J * nikolayK ~nkichukov@199.91.137.248
1425549227 M * nikolayK morning fellows
1425549240 N * nikolayK hijacker
1425550712 M * Bertl morning hijacker!
1425556160 Q * Aiken Remote host closed the connection
1425557670 J * benl ~benl@dockoffice.sonassihosting.com
1425557677 M * benl Hey troops
1425557688 M * benl I have a question I'm certainly sure I'll know the answer to.
1425557720 M * benl Got a guest with hundreds of processes in a D state. Load averages have skyrocketed (although CPU % remains low).
1425557740 M * benl How can the guest be forcibly stopped without having to reboot the host?
1425560403 Q * fstd Remote host closed the connection
1425560446 J * fstd ~fstd@xdsl-87-78-188-113.netcologne.de
1425560865 M * hijacker benl, you have to reboot the host
1425560884 M * hijacker or wait for the disk operations to complete, if waiting is an option
1425560983 M * Bertl the question is why they are in 'D' state
1425561005 M * Bertl it means that they are waiting for some I/O operation to complete
1425561024 M * Bertl this can, for example, be because a network drive disappeared
1425561036 M * Bertl or because the disk subsystem is busy/defective
1425561052 M * Bertl or just because the guest was limited in I/O bandwidth/operations
1425561079 M * Bertl in any case, removing the reason, i.e. making those I/O waits complete, will bring the guest back to live
1425561083 M * Bertl *life
1425561113 M * Bertl at which point it can be simply shut down or killed
1425561132 M * Bertl (note that bringing back a defective disk is not trivial :)
1425563286 M * benl That's where it gets confusing then.
1425563295 M * benl There is no network disk/removable media etc.
1425563307 M * benl It's a fixed single disk (nothing special). No failures etc.
1425563317 M * benl Other guests using the same mountpoints are more than happy
1425563341 M * benl I think there's no helping these guests.
1425564441 M * hijacker benl, this is strange
1425564473 M * hijacker Bertl mentioned the guest being limited in I/O bandwidth/operations, is this the case?
1425564491 M * hijacker why would all guests except for this single one not have processing entering the D state?
1425564594 M * hijacker processes*
1425564600 M * hijacker they share the same subsystem...
1425564619 M * hijacker maybe this one is doing more IO-intensive operations + IO limits?
1425566180 M * ard it might as well be an ext4 bug ;-)
1425566931 M * benl There are no I/O constraints
1425566948 M * benl The funny thing is that the processes hanging in the guest are really simple operations
1425566964 M * benl And the irony is that they are hanging during the first line of the script - the one that checks if the script is already running
1425566970 M * benl i.e. during a `pidof xxx`
1425567055 M * benl There are no I/O caps - there are almost no restrictions on this particular vm - I tend not to use any as, sadly, linux-vserver tends to behave ridiculously when a limit is hit
1425567086 M * benl When CPU/memory limits have been applied to guests, and they hit their limit - it brings the entire host system down. So the resource caps are pretty pointless
1425567101 M * benl So, all the guests are run without restriction
1425567193 M * benl FYI. I'll be restarting this system later today - so if you want me to run anything in the meantime for diagnosis - let me know before 4.30pm GMT
1425567539 M * benl For the record, I've got a feeling this issue is caused by memory exhaustion
1425567719 M * hijacker benl, can you read the contents of any file from the /proc file system?
1425567731 M * benl Probably
1425567741 M * benl Despite the fact the load is at 400 - it's still responsive
1425567761 M * hijacker ls /proc
1425567763 M * hijacker for example?
1425567783 M * hijacker I suspect pidof might be reading the /proc file system... not sure though
1425567785 M * AlexanderS especially a stat on some /proc/*/exe files, that may reside on hanging NFS mounts, could be an issue... (pidof does something like that)
1425567787 M * benl I assume in the guest itself
1425567794 M * benl (there's no NFS)
1425567803 M * hijacker yes, in the guest
1425567807 M * benl Works just fine
1425567835 M * benl `ls /proc/*/exe` on the other hand makes it hang
1425567876 M * AlexanderS so some executable of a running process could not be read...
1425567886 M * benl Sounds like it
1425567903 M * benl I'm sure the machine started swapping
1425567908 M * benl which caused it to crash
1425567938 M * benl This is one of the issues I mentioned earlier, and the primary reason I don't put memory caps on vserver guests, as it causes load to skyrocket when they run out of RAM
1425567974 M * Bertl so make the limit go away and the swapping will stop
1425567984 M * Bertl and shortly after that, the guest will normalize
1425568000 M * benl There is no limit
1425568008 M * benl That's what I'm saying. I don't use limits as they don't work.
1425568012 M * Bertl so you hit the real host limit?
1425568015 M * benl Yup.
1425568023 M * Bertl that is bad
1425568030 M * benl Probably
1425568039 M * benl But given guest limits don't work anyway - it makes no difference
1425568078 Q * undefined Quit: Closing object
1425568297 M * AlexanderS you can try to find the problematic process with something like that: for i in /proc/[0-9]* ; do strings $i/cmdline | head -1 ; timeout 2 ls -l $i/exe || break; done
1425568318 M * benl I was going to do something like that
1425568327 M * benl But even `timeout` won't survive the dreaded D state
1425568359 J * undefined ~undefined@00011a48.user.oftc.net
1425571124 Q * BenG Quit: I Leave
1425572810 J * fstd_ ~fstd@xdsl-87-78-188-113.netcologne.de
1425572812 Q * fstd Read error: Connection reset by peer
1425572826 N * fstd_ fstd
1425573350 Q * derjohn_mob Ping timeout: 480 seconds
1425575051 Q * hijacker Quit: Leaving
1425579936 Q * Ghislain Ping timeout: 480 seconds
1425583534 Q * undefined reticulum.oftc.net charon.oftc.net
1425583534 Q * ggherdov__ reticulum.oftc.net charon.oftc.net
1425583534 Q * webhat reticulum.oftc.net charon.oftc.net
1425583534 Q * gamingrobot_ reticulum.oftc.net charon.oftc.net
1425583534 Q * funnel reticulum.oftc.net charon.oftc.net
1425583534 Q * macmaN reticulum.oftc.net charon.oftc.net
1425583534 Q * gnarface reticulum.oftc.net charon.oftc.net
1425583534 Q * zerick reticulum.oftc.net charon.oftc.net
1425583534 Q * karasz reticulum.oftc.net charon.oftc.net
1425583534 Q * Hunger reticulum.oftc.net charon.oftc.net
1425583534 Q * fstd reticulum.oftc.net charon.oftc.net
1425583534 Q * bonbons reticulum.oftc.net charon.oftc.net
1425583534 Q * jrklein reticulum.oftc.net charon.oftc.net
1425583534 Q * swenTjuln reticulum.oftc.net charon.oftc.net
1425583534 Q * ntrs reticulum.oftc.net charon.oftc.net
1425583534 Q * ex reticulum.oftc.net charon.oftc.net
1425583534 Q * CcxCZ reticulum.oftc.net charon.oftc.net
1425583534 Q * FloodServ reticulum.oftc.net charon.oftc.net
1425583534 Q * benl reticulum.oftc.net charon.oftc.net
1425583534 Q * BWare reticulum.oftc.net charon.oftc.net
1425583534 Q * kshannon reticulum.oftc.net charon.oftc.net
1425583534 Q * jrayhawk reticulum.oftc.net charon.oftc.net
1425583534 Q * ensc|w reticulum.oftc.net charon.oftc.net
1425583534 Q * Carpoon reticulum.oftc.net charon.oftc.net
1425583534 Q * Bertl reticulum.oftc.net charon.oftc.net
1425583534 Q * Defaultti reticulum.oftc.net charon.oftc.net
1425583534 Q * daniel_hozac reticulum.oftc.net charon.oftc.net
1425583534 Q * click reticulum.oftc.net charon.oftc.net
1425583534 Q * yert reticulum.oftc.net charon.oftc.net
1425583534 Q * kwork_ reticulum.oftc.net charon.oftc.net
1425583536 Q * dustinm` Max SendQ exceeded
1425583634 J * benl ~benl@dockoffice.sonassihosting.com
1425583634 J * BWare ~itsme@31.25.99.5
1425583634 J * kshannon ~kris@server.kris.shannon.id.au
1425583634 J * jrayhawk ~jrayhawk@nursie.omgwallhack.org
1425583634 J * ensc|w ~ensc@62.153.82.27
1425583634 J * Carpoon ~Carpoon@5400B033.dsl.pool.telekom.hu
1425583634 J * Bertl herbert@IRC.13thfloor.at
1425583634 J * Defaultti defaultti@lakka.kapsi.fi
1425583634 J * daniel_hozac ~daniel@h149n2-spaa-a12.ias.bredband.telia.com
1425583634 J * click click@ice.vcon.no
1425583634 J * yert trey@trey.hu
1425583634 J * kwork_ ~user@no.life.ee
1425583658 J * funnel ~funnel@81.4.123.134
1425583658 J * dustinm` ~dustinm`@2607:5300:100:200::160d
1425583658 J * fstd ~fstd@xdsl-87-78-188-113.netcologne.de
1425583658 J * undefined ~undefined@00011a48.user.oftc.net
1425583658 J * bonbons ~bonbons@2001:a18:20b:c701:e151:7e68:e0ab:9e23
1425583658 J * jrklein ~cloud@proxy.dnihost.net
1425583658 J * ggherdov__ sid11402@2604:8300:100:200b:6667:3:0:2c8a
1425583658 J * webhat ~quassel@31.25.99.5
1425583658 J * gamingrobot_ sid10990@2604:8300:100:200b:6667:2:0:2aee
1425583658 J * macmaN ~chezburge@145-183-35-213.dyn.estpak.ee
1425583658 J * swenTjuln ~Swen@195.95.173.243
1425583658 J * CcxCZ ~ccxCZ@asterix.te2000.cz
1425583658 J * Hunger hunger@proactivesec.com
1425583658 J * karasz ~karasz@00015555.user.oftc.net
1425583658 J * ntrs ~ntrs@vault08.rosehosting.com
1425583658 J * ex ~ex@valis.net.pl
1425583658 J * gnarface ~gnarface@108-227-52-42.lightspeed.irvnca.sbcglobal.net
1425583658 J * zerick ~zerick@irc.quassel.zerick.me
1425583661 Q * funnel Max SendQ exceeded
1425583930 J * funnel ~funnel@81.4.123.134
1425584250 J * Aiken ~Aiken@d63f.h.jbmb.net
1425584435 Q * benl Quit: HydraIRC -> http://www.hydrairc.com <- \o/
1425587258 J * Ghislain ~aqueos@adsl1.aqueos.com
1425588021 J * derjohn_mob ~aj@tmo-109-97.customers.d1-online.com
1425588774 Q * Ghislain Quit: Leaving.
1425589608 J * AndrewLe1 ~andrew@210.240.39.201
1425589724 Q * AndrewLee Ping timeout: 480 seconds
1425590001 J * FloodServ services@services.oftc.net
1425592841 M * Bertl off for a nap ... bbl
1425592850 N * Bertl Bertl_zZ
1425599503 Q * wicope Remote host closed the connection
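
Note on the D-state hunt above: benl found that `ls /proc/*/exe` hangs, and that wrapping the probe in `timeout` does not help once the wrapped command is itself stuck in uninterruptible sleep. What follows is a minimal sketch of a different approach, not something anyone in the channel proposed: it reads only the state field from /proc/<pid>/stat and never touches the exe links. The helper names `stat_state` and `list_d_state` are made up for illustration, and the sketch assumes that reading stat does not block the way resolving exe did on benl's guest.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the log).

# Print the state letter from one line in /proc/<pid>/stat format.
stat_state() {
    rest=${1##*") "}    # drop "pid (comm) "; longest match copes with ')' inside comm
    printf '%s\n' "${rest%% *}"
}

# Print "PID<TAB>name" for every task whose state is D (uninterruptible sleep).
# Assumption: reading stat does not itself block; exe links are never touched.
list_d_state() {
    root=${1:-/proc}
    for dir in "$root"/[0-9]*; do
        [ -r "$dir/stat" ] || continue
        line=$(cat "$dir/stat" 2>/dev/null) || continue   # task may have exited
        [ -n "$line" ] || continue
        if [ "$(stat_state "$line")" = "D" ]; then
            comm=${line#*"("}      # text after the first '('
            comm=${comm%")"*}      # text before the last ')'
            printf '%s\t%s\n' "${dir##*/}" "$comm"
        fi
    done
}

list_d_state "${1:-/proc}"
```

Run inside the guest, this prints one line per task in state D. Passing another directory as the first argument points it at a fake tree for testing.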