1375750006 J * hparker ~hparker@2001:470:1f0f:32c:beae:c5ff:fe01:b600
1375752093 Q * jrklein Remote host closed the connection
1375752098 J * jrklein ~osx@proxy.dnihost.net
1375754133 J * Fog_Watch ~ian@21.55.70.115.static.exetel.com.au
1375754383 M * Fog_Watch Hello. Am I correct in thinking that this chat is a better way of communicating about vserver problems, rather than the mailing list?
1375755098 M * Bertl yes, but I already read your email, just didn't answer yet
1375755161 M * Bertl basically when you use system level isolation, and you have something 'critical' running on the host, you want to make sure to reserve some resources
1375755194 M * Bertl like for example reserving a specific CPU for host only use
1375755916 M * Fog_Watch Sorry to be pre-emptive.
1375755927 M * Bertl no problem :)
1375755943 M * Fog_Watch "a specific CPU for host only use" sounds exactly like what I'm after.
1375755965 M * Fog_Watch Is there anything in http://linux-vserver.org/Documentation on this?
1375756015 M * Bertl you know the 'great flower page'?
1375756091 M * Fog_Watch Mmmmm, no.
1375756119 M * Bertl just google for the term, it should give you the util-vserver config page as result :)
1375756203 M * Bertl basically what you want to do is use CPU sets to restrict the guests to certain CPUs (or HT siblings) keeping one (or maybe two) for the host
1375756205 M * Fog_Watch Sorry, yes I was there (http://www.nongnu.org/util-vserver/doc/conf/configuration.html) some time ago.
1375756249 M * Bertl and then configure all network interrupts to be handled by those CPUs (to reduce any switching overhead)
1375756310 M * Fog_Watch I'll have a look. Cheers.
1375756364 M * Bertl you're welcome! let me know if you have any problems with that
1375759758 M * Fog_Watch To allocate CPU(s) to a guest, do I need to create the three files /etc/vservers/vserver-name/cpuset/{name,cpus,mems}?
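A minimal Python sketch of the setup Bertl describes: pin a guest to some CPUs through its cpuset config and compute the interrupt affinity mask for the CPUs kept back for the host. Everything concrete here is an assumption for illustration: the guest name "guest1", the CPU lists, and the idea that util-vserver accepts the kernel's cpuset list syntax ("1-3") in the `cpus` file. The sketch writes only `cpus`; `name` and `mems` are left out on purpose. The mask helper only produces the hex format used by /proc/irq/<n>/smp_affinity; it does not touch any IRQ.

```python
from pathlib import Path
import tempfile


def parse_cpu_list(spec):
    """Expand a cpuset-style list such as "1-3,5" into [1, 2, 3, 5]."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def irq_affinity_mask(cpus):
    """Return the hex bitmask form used by /proc/irq/<n>/smp_affinity.

    Limited to CPUs 0-31: larger masks need comma-separated 32-bit
    groups, which this helper does not produce.
    """
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            raise ValueError("this sketch only handles CPUs 0-31")
        mask |= 1 << cpu
    return format(mask, "x")


def write_guest_cpuset(guest_dir, cpus):
    """Write <guest_dir>/cpuset/cpus and return the path of the file.

    guest_dir stands in for /etc/vservers/<guest>; the layout is taken
    from the question in the log, the content format is an assumption.
    """
    cpuset_dir = Path(guest_dir) / "cpuset"
    cpuset_dir.mkdir(parents=True, exist_ok=True)
    cpus_file = cpuset_dir / "cpus"
    cpus_file.write_text(cpus + "\n")
    return cpus_file


if __name__ == "__main__":
    # Use a scratch directory instead of the real /etc/vservers.
    scratch = Path(tempfile.mkdtemp())
    path = write_guest_cpuset(scratch / "guest1", "1-3")  # hypothetical guest
    print("guest CPUs:", parse_cpu_list(path.read_text().strip()))
    print("host IRQ mask for CPU 0:", irq_affinity_mask([0]))
```

On a real host the `cpus` file would live under /etc/vservers/<guest>/cpuset/, and the mask would be written to each network IRQ's smp_affinity file as root.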
1375762118 M * Bertl the cpus part should suffice
1375762150 M * Bertl (basically cpusets are a mainline feature and you can find more documentation in the kernel)
1375762179 M * Bertl I'm off to bed now ... have a good one everyone!
1375762183 N * Bertl Bertl_zZ
1375766958 J * imcsk8 ~ichavero@200.66.107.177
1375768937 P * spindritf
1375773525 P * Fog_Watch
1375777521 J * tbenita ~tbenita@men13-6-88-182-112-162.fbx.proxad.net
1375780001 Q * tbenita Ping timeout: 480 seconds
1375780857 J * tbenita ~tbenita@51-117-190-109.dsl.ovh.fr
1375781012 Q * nkukard Remote host closed the connection
1375781046 Q * _br_ Ping timeout: 480 seconds
1375781112 J * nkukard ~nkukard@197.87.137.247
1375783602 Q * Hunger Ping timeout: 480 seconds
1375784854 J * Hunger hunger@proactivesec.com
1375785022 J * _br_ ~bjoern_of@213-239-215-232.clients.your-server.de
1375786766 Q * ircuser-1 Ping timeout: 480 seconds
1375789299 J * ircuser-1 ~ircuser-1@35.222-62-69.ftth.swbr.surewest.net
1375789365 Q * Aiken Remote host closed the connection
1375795136 Q * nkukard Ping timeout: 480 seconds
1375795756 J * TheSeer ~theseer@sliph.netpirates.net
1375795816 M * TheSeer hey hey :)
1375796022 Q * ncopa Quit: Leaving
1375796127 M * TheSeer Just wondering whether using http://rpm.hozac.com/dhozac/rhel/6/vserver/x86_64/ as yum repo for a centos6/rhel box is still recommended or if i missed a change ;)
1375796145 J * ncopa ~test@3.203.202.84.customer.cdi.no
1375796204 J * nkukard ~nkukard@41-133-203-34.dsl.mweb.co.za
1375796250 M * daniel_hozac is it not working for you?
1375796313 M * TheSeer I was just confused by yum trying to update my kernel to a non-vserver kernel ;)
1375796328 M * TheSeer checking, i have "still" 2.6.32-279.9.1.el6.vs2.3.0.36.29.6.23.x86_64 on that box
1375796396 M * TheSeer rpm -qa |grep kernel shows the latest version i have installed to be kernel-2.6.32-358.2.1.el6.vs2.3.0.36.29.6.31.x86_64 ...
1375796409 M * TheSeer so i'm basically missing a reboot for that one
1375796451 M * TheSeer yum update wanted to install 2.6.32-358.14.1.el6 though
1375796485 M * daniel_hozac yeah, i haven't done the latest one apparently.
1375796516 M * TheSeer k, no problem. Just wanted to make sure i'm not missing anything
1375796547 M * TheSeer I'm way too much on the road these days so i don't have a fixed chat client online anymore to check for updates.. Makes me feel a bit detached at times...
1375796637 M * TheSeer what about the util-vserver stuff? i have 0.30.216-1.pre3034, topic suggests pre3038 would be latest. Not sure if there are any important changes
1375796674 M * daniel_hozac i have been neglecting util-vserver in general lately.
1375796692 M * daniel_hozac not much has happened
1375796781 M * TheSeer makes sense to me, I wouldn't know what to be honest except maybe security stuff
1375796787 M * TheSeer i mean, it's working fine otherwise
1375796805 M * TheSeer what would be helpful would be a patched yum ;)
1375796822 J * BenG ~BenG@cpc35-aztw23-2-0-cust207.18-1.cable.virginmedia.com
1375797647 M * TheSeer is there a pre-patched yum to get rid of the 5 second delay?
1375797687 M * daniel_hozac not that i'm aware of.
1375797973 J * ffrank ~ffrank@g230123015.adsl.alicedsl.de
1375798051 M * ffrank hi. with vserver 2.3.2.13 and utils 0.30.216-pre3029 i'm facing a weird issue with a vserver not starting
1375798068 M * ffrank i will issue "vserver start" and it will block immediately, apparently
1375798085 M * daniel_hozac "block"?
1375798100 M * ffrank there is a child process in D state: /usr/lib/util-vserver/secure-mount -a --chroot --fstab /etc/vservers/
1375798102 M * ffrank ...
1375798150 M * ffrank ah, which I now realise won't die when i kill their parents...ugh
1375798211 M * ffrank dammit, how am i going to get rid of those now...
1375798247 M * TheSeer no parent process anymore?
1375798300 M * daniel_hozac what does your fstab contain?
1375798328 M * ffrank fstab:
1375798331 M * ffrank none /proc proc defaults 0 0
1375798331 M * ffrank none /dev/pts devpts gid=5,mode=620 0 0
1375798359 M * ffrank the ppid has "successfully" transferred to 1, but the process remains in sleep
1375798424 M * daniel_hozac what does your dmesg contain?
1375798546 M * ffrank nothing that I deem relevant. the working contexts issue warnings about proc accesses that are forbidden, there is drbd noise, but not from this specific vserver's backing device
1375798974 M * ffrank note, an strace trying to attach to the secure-mount process becomes unresponsive to sigterm as well (but can be killed)
1375799021 M * daniel_hozac /proc//wchan is probably your best bet.
1375799063 M * TheSeer the /vservers is located on drbd device?
1375799077 M * TheSeer since you mentioned drbd noise..
1375799089 M * ffrank theocrite: it is
1375799148 M * ffrank daniel_hozac: what would a "0" in wchan be conveying?
1375799151 M * TheSeer does the noise contain any errors or something that leads to the assumption there is a blocking problem?
1375799201 M * ffrank the drbd is fine, luckily, the noise is from other drbd devices losing and re-establishing their peer link
1375799203 M * TheSeer i had some very bizarre issues with a drbd device the other day causing deadlocks..
1375799210 M * TheSeer ah, okay
1375799270 M * ffrank no wait - the resource in question is now disconnected as well. weird. shouldn't cause i/o issues, but i'll look into it
1375799565 M * TheSeer depends on how you configured drbd
1375799595 M * TheSeer if it's set to sync mode it will sit and wait until the peer is back
1375799809 M * ffrank uhm, no, that's a misconception ;)
1375799844 M * TheSeer i totally agree :)
1375799865 M * TheSeer but people do that for whatever bizarre reason
1375799921 M * ffrank people do what now?
1375800041 M * TheSeer having multiple servers/services run in sync mode and complain about deadlocks or bad performance
1375800119 M * ffrank oh right, well, performance can be painful of course, yes
1375800151 M * TheSeer neither has to do with your issue i presume though..
1375800159 M * ffrank about wchan, that probably won't serve me due to https://lkml.org/lkml/2008/11/6/12 i fear
1375800381 M * ffrank btw, the parent of my culprit was '/bin/bash /usr/sbin/vserver ----nonamespace start'
1375800391 M * ffrank is that --nonamespace to be expected?
1375800397 M * daniel_hozac yes.
1375800412 M * ffrank humm
1375800549 M * ffrank aside from those secure-mount processes, there seems to be a stuck kernel thread, khugepaged
1375800878 M * ffrank ooh, i managed to get stacktraces
1375800880 M * ffrank http://pastebin.com/dFYPJyyD
1375801050 N * Bertl_zZ Bertl
1375801054 M * Bertl morning folks!
1375801140 M * ffrank moin
1375801663 M * ffrank Bertl: i'm a little more stumped than usual. did you have a chance to read backwards a page or two? ;)
1375801708 M * Bertl I somewhat skimmed over it
1375801833 M * Bertl but the kernel trace is incomplete, and I presume it is one of several
1375801852 M * ffrank indeed
1375801864 M * Bertl (most likely caused by the dreadful stuck task debugging)
1375801935 M * ffrank i.e. my slaughter of parent processes? yeah... :(
1375802033 M * Bertl not necessarily, but network filesystems, especially when they operate in some kind of sync mode (nfs, ocfs2, ..) or network block layer (aoe, iscsi, drbd ...) often cause rather lengthy delays
1375802046 M * Bertl (especially when there is some kind of network problem)
1375802073 M * ffrank i see
1375802076 M * Bertl and as a result, the kernel thinks a task waiting for I/O (i.e. in D state) has stopped working
1375802114 M * ffrank should i try unmounting this block device for a change?
1375802115 M * Bertl and when debugging such 'hangs' is enabled, the kernel will give a stack dump of the task
1375802155 M * Bertl and for whatever reason, it has been my experience, that such tasks will never get unstuck again (no idea why, IMHO a mainline bug)
1375802227 M * ffrank i usually do have my kernel hand me that dump. i'm not sure why that's not happening on this particular box at the moment
1375802276 Q * TheSeer Quit: Client exiting
1375802427 M * Bertl so my general advice is: if you do use a network filesystem or a network block layer, either make it work in some delayed sync mode or make damn sure that the network connection is working 24/7
1375802689 M * ffrank i shall keep that in mind, thanks
1375802864 M * ffrank ah, sure enough - fuser reports that the hung tasks are frolicking about the drbd device, not letting me deconfigure it
1375803028 M * Bertl yes, unfortunately that's what I meant with 'will never get unstuck again'
1375803047 M * Bertl so IMHO the only way to recover is to reboot such a kernel
1375803055 M * ffrank joy :)
1375803087 M * Bertl if you have a setup where you can basically trigger this, it might be worth testing with the 'hung task' stuff disabled
1375803125 M * Bertl and see if it causes the same issues or just 'long' delays when the network has a problem
1375803206 M * ffrank i basically agree, sadly the vservers triggering this are production instances of my customer's, so i probably won't get much play out of them
1375803863 M * Bertl yeah, production is always problematic
1375806967 Q * imcsk8 Ping timeout: 480 seconds
1375807075 J * bonbons ~bonbons@ppp-156-141.adsl.restena.lu
1375807704 Q * tbenita Quit: Ex-Chat
1375808720 Q * jrklein Remote host closed the connection
1375808766 Q * ffrank Quit: Leaving
1375809741 J * jrklein ~osx@proxy.dnihost.net
1375810988 J * hijacker_ ~hijacker@cable-84-43-134-121.mnet.bg
1375818487 Q * BenG Quit: I Leave
1375818773 Q * hijacker_ Quit: Leaving
1375821320 J * Aiken ~Aiken@2001:44b8:2168:1000:21f:d0ff:fed6:d63f
1375822356 Q * bonbons Quit: Leaving
1375824299 N * l0kit Guest2535
1375824307 J * l0kit ~1oxT@0001b54e.user.oftc.net
1375824704 Q * Guest2535 Ping timeout: 480 seconds
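The hunt for ffrank's stuck secure-mount processes above can be sketched in Python: walk /proc, pick out tasks whose state is D (uninterruptible sleep), and show each one's wchan entry, as daniel_hozac suggested. This is only an illustration under assumptions: it reads the standard Linux procfs files `stat` and `wchan`, and it prints the `kernel.hung_task_timeout_secs` sysctl as a guess at the 'hung task' debugging Bertl refers to (that file exists only on kernels built with hung task detection, and a value of 0 disables the check). As ffrank notes, wchan may simply read "0" on some kernels.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # ')', so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left])
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def read_text(path):
    """Read a small procfs file, or return None if it is unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


def d_state_tasks(proc="/proc"):
    """List (pid, comm, wchan) for every task currently in D state."""
    tasks = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        line = read_text(os.path.join(proc, entry, "stat"))
        if not line:
            continue  # the process exited while we were scanning
        pid, comm, state = parse_stat(line)
        if state == "D":
            wchan = read_text(os.path.join(proc, entry, "wchan"))
            tasks.append((pid, comm, wchan))
    return sorted(tasks)


if __name__ == "__main__":
    for pid, comm, wchan in d_state_tasks():
        print(pid, comm, "wchan:", wchan)
    # None here means the kernel has no hung task detection built in.
    print("hung_task_timeout_secs:",
          read_text("/proc/sys/kernel/hung_task_timeout_secs"))
```

Reading these files changes nothing on the host; it only reports which tasks are stuck, which per the discussion above usually still ends in a reboot.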