1344643713 Q * bonbons Quit: Leaving
1344657342 J * clopez ~clopez@65.27.165.83.dynamic.mundo-r.com
1344658523 Q * clopez Ping timeout: 480 seconds
1344658542 N * Bertl_zZ Bertl
1344658550 M * Bertl back now ...
1344658657 M * Bertl T23
1344658749 M * Bertl ooops, wrong window
1344659740 N * Bertl Bertl_oO
1344663719 Q * DoberMann Ping timeout: 480 seconds
1344664667 J * ghislain ~AQUEOS@adsl2.aqueos.com
1344671012 J * pawels ~pluto@84-10-16-164.dynamic.chello.pl
1344673006 Q * ghislain Quit: Leaving.
1344674027 N * ensc Guest2577
1344674036 J * ensc ~irc-ensc@p54ADF7E2.dip.t-dialin.net
1344674205 J * bonbons ~bonbons@2001:960:7ab:0:28c3:d559:6e8c:7245
1344674218 J * dejanT ~dejan@mccain.rosehosting.com
1344674409 M * dejanT Hi, I have a problem: I cannot ping, for example, 192.168.1.14 from inside a virtual server. The kernel on the host server is 3.2.23-vs2.3.2.12. Is there some option that needs to be enabled in the kernel in order to fix my problem?
1344674438 Q * Guest2577 Ping timeout: 480 seconds
1344677706 M * daniel_hozac and 192.168.1.14 is what?
1344677713 M * daniel_hozac what's your networking setup?
1344678851 N * Bertl_oO Bertl
1344678870 M * Bertl I cannot ping 192.168.1.14 either!! damn, something must be broken!
1344680134 M * fback Bertl: I can ping 10.1.1.1 instead :)
1344680141 J * distemper ~gfckyrslf@2a01:198:2ee:0:2c:3165:84c4:2f88
1344680168 M * Bertl then maybe not all hope is lost ... :)
1344680324 M * distemper Bertl, could you re-sync the 3.5 patch to 3.5.1 if you got some time? there's a reject in include/linux/net.h
1344680327 M * dejanT 192.168.1.14 is a local IP of another host server
1344680414 M * Bertl and your guest has what ip?
1344680446 M * Bertl i.e. what does 'ip a l' report inside the guest?
1344680457 M * dejanT my guest has two IPs, one public IP and one private IP (192.168.1.xx)
1344680624 M * Bertl okay, so both show up in 'ip a l' inside the guest I presume?
1344680636 M * dejanT Yes.
This is the output: 1: lo: mtu 16436 qdisc noqueue
1344680637 M * dejanT link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
1344680637 M * dejanT inet 127.0.0.1/8 scope host lo
1344680637 M * dejanT 2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
1344680637 M * dejanT link/ether 00:30:48:73:e5:2d brd ff:ff:ff:ff:ff:ff
1344680638 M * dejanT inet xxx.xxx.xxx.81/24 brd xxx.xxx.xxx.255 scope global secondary eth0:vpsname
1344680638 M * dejanT 3: eth1: mtu 1500 qdisc pfifo_fast qlen 1000
1344680640 M * dejanT link/ether 00:30:48:73:e5:2c brd ff:ff:ff:ff:ff:ff
1344680640 M * dejanT inet 192.168.1.23/24 brd 192.168.1.255 scope global secondary eth1:vpsname2
1344680672 M * Bertl okay, and eth1 is the interface configured on the host for 192.168.1.x/24, yes?
1344680690 M * Bertl distemper: will do shortly
1344680696 M * dejanT yes
1344680718 M * distemper thx, Bertl
1344680794 M * Bertl and ping 192.168.1.14 on the guest gives you? an error or just no reply?
1344680961 J * dejanT2 ~dejan@mccain.rosehosting.com
1344680969 Q * dejanT2
1344681178 Q * dejanT Ping timeout: 480 seconds
1344681199 J * dejanT2 ~dejan@mccain.rosehosting.com
1344681204 M * Bertl wb
1344681224 M * Bertl did you read my last question?
1344681263 M * dejanT2 no
1344681266 M * dejanT2 I got disconnected
1344681269 M * Bertl and ping 192.168.1.14 on the guest gives you? an error or just no reply?
1344681274 M * dejanT2 no reply
1344681311 M * Bertl okay, what about ping -I 192.168.1.23 192.168.1.14 ?
1344681474 M * dejanT2 please wait
1344681538 M * Bertl take your time, no need to hurry
1344682733 Q * nlm_ Ping timeout: 480 seconds
1344683271 Q * dejanT2
1344683779 Q * Bertl Ping timeout: 480 seconds
1344683977 J * Bertl herbert@IRC.13thfloor.at
1344684050 M * Bertl did I miss anything?
1344684280 M * fback nope
1344684950 J * dejanT ~dejan@mccain.rosehosting.com
1344686108 J * eyck ~eyck@77.79.198.60
1344687697 Q * dejanT
1344694480 M * Bertl off for a nap ...
bbl
1344694485 N * Bertl Bertl_zZ
1344702711 J * clopez ~clopez@65.27.165.83.dynamic.mundo-r.com
1344707795 N * Bertl_zZ Bertl
1344708114 Q * DLange Quit: rebooting. Just for fun.
1344708229 Q * ensc|w Remote host closed the connection
1344708239 J * ensc|w ~ensc@www.sigma-chemnitz.de
1344708260 J * DLange ~DLange@dlange.user.oftc.net
1344712994 M * Guy- hmmm, on an old box running 2.6.36-vs2.3.0.36.38 I have a vserver guest that contains only zombie processes
1344713004 M * Guy- how can I get rid of it without rebooting the host?
1344713020 M * Bertl by kicking the reaper :)
1344713039 M * Guy- Bertl: how? there's not even init running in the guest anymore
1344713050 M * Guy- 131 149 0 96.9M 0m00s00 0m00s00 639d10h14 gallery
1344713055 M * Bertl that's most likely the problem
1344713057 M * Guy- this is what vserver-stat says about it
1344713081 M * Guy- that's 149 apache2 threads
1344713086 M * Bertl i.e. if nobody reaps the zombies, there is no way to get rid of them
1344713101 M * Guy- so I'm stuck and I have to reboot the host?
1344713107 M * Bertl you can change the xid of the guest
1344713119 M * Bertl and start a new one while leaving the zombies around
1344713130 M * Bertl (they won't use up many resources)
1344713159 M * Guy- well, the original problem was that the load was 150 with no i/o and no cpu activity
1344713176 M * Guy- (it's a problem because it confuses loadwatch)
1344713187 M * Guy- I tried to fix it by restarting the guests
1344713223 M * Guy- this didn't help but brought up the problem with the zombies :)
1344713379 M * Bertl yeah, well, in this case probably the reboot is the best choice
1344713566 M * Guy- 21:32:40 up 639 days, 10:36, 11 users, load average: 149.28, 149.71, 150.00
1344713568 M * Guy- *sob*
1344713591 M * Bertl you should have killed apache first :)
1344713616 M * Guy- I issued a vserver gallery restart; normally that works :)
1344715826 Q * pawels Quit: leaving
1344716319 M * Guy- btw, I can't just start a new instance of the guest after changing its xid because util-vserver still complains that it's already running
1344716346 M * Guy- I suppose I must remove its run file too
1344717828 M * Guy- and, interestingly, I can't bind to 0.0.0.0:8480 in the new instance of the old guest because the socket is already in use - by the zombies
1344717869 M * Guy- Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
1344717872 M * Guy- tcp 0 0 0.0.0.0:8480 0.0.0.0:* LISTEN -
1344717900 M * Guy- I can even establish a tcp session
1344717984 M * Guy- tcp 456 0 10.74.91.5:8480 10.74.91.2:45051 ESTABLISHED -
1344717987 M * Guy- cute :)
1344718164 M * Guy- and indeed, ps axms shows that I have 149 threads of 6 zombies that are all in D state (the threads), which explains the load
1344718189 M * Bertl the question is, why are they in D state
1344718222 M * Guy- do_lookup according to wchan
1344718274 M * Guy- trying to get a backtrace
1344718369 M * Guy- t > sysrq-trigger doesn't seem to show these threads
1344718491 M * Guy- they're apparently all
stuck in do_lookup() according to wchan, but they don't show up in sysrq trace output
1344718782 M * Guy- Bertl: any other way of getting a backtrace on them
1344718783 M * Guy- ?
1344718832 M * Guy- echo w >sysrq-trigger seems to have done the trick
1344718884 M * Guy- http://pastebin.com/sz4H9Kgk - this is a sample (all blocked apache2 threads seem to be like this)
1344719171 J * cuba33ci_ ~cuba33ci@114-25-199-80.dynamic.hinet.net
1344719221 Q * cuba33ci Read error: Operation timed out
1344719229 N * cuba33ci_ cuba33ci
1344720615 M * daniel_hozac did you lose an NFS mount?
1344720644 M * Guy- no
1344720661 M * Guy- this box never ever mounted anything via NFS from anywhere
1344720686 M * daniel_hozac any other kind of filesystem that was lost?
1344720724 M * Guy- no
1344720738 M * Guy- this guest only had bind mounts of local filesystems, all of which are still there
1344720903 M * Guy- the load started rising sometime yesterday and peaked at 150 when all apache2 threads apache was willing to start were locked up
1344720910 M * Guy- the other guests seemed unaffected
1344721191 M * daniel_hozac odd
1344721230 M * Bertl well, it's a 2.6.36 kernel
1344721316 M * daniel_hozac true
1344721327 M * daniel_hozac i do remember that being a mainline problem not too long ago, around that time.
1344725067 Q * bonbons Quit: Leaving
1344726711 Q * morfoh Read error: Connection reset by peer
1344726797 Q * karasz_ Ping timeout: 480 seconds
1344727029 J * karasz ~karasz@shell.opensde.net
1344727059 J * morfoh ~morfoh@shell.opensde.net
1344728328 Q * hparker Remote host closed the connection
1344728904 J * hparker ~hparker@2001:470:1f0f:32c:beae:c5ff:fe01:b647
1344729365 M * Bertl off to bed now ... have a good one everyone!
1344729371 N * Bertl Bertl_zZ