1395274602 Q * Ghislain Quit: Leaving.
1395277201 Q * fisted Remote host closed the connection
1395277211 J * fisted ~fisted@xdsl-81-173-184-62.netcologne.de
1395279417 J * fisted_ ~fisted@xdsl-87-78-187-55.netcologne.de
1395279664 Q * zerick Remote host closed the connection
1395279857 Q * fisted Ping timeout: 480 seconds
1395279857 N * fisted_ fisted
1395280527 J * SteeleNivenson ~SteeleNiv@pool-108-29-139-222.nycmny.fios.verizon.net
1395280593 M * Bertl off to bed now ... have a good one everyone!
1395280601 N * Bertl Bertl_zZ
1395286998 N * l0kit Guest3886
1395287004 J * l0kit ~1oxT@0001b54e.user.oftc.net
1395287407 Q * Guest3886 Ping timeout: 480 seconds
1395292411 J * undefined ~undefined@00011a48.user.oftc.net
1395297737 J * Ghislain ~aqueos@adsl1.aqueos.com
1395303076 N * Bertl_zZ Bertl
1395303081 M * Bertl morning folks!
1395303099 M * karasz morning Bertl
1395303102 M * Ghislain hello
1395305939 J * beng_ ~BenG@cpc29-aztw22-2-0-cust128.18-1.cable.virginm.net
1395311193 J * benl ~benl@dockoffice.sonassihosting.com
1395311199 M * benl Hey all
1395311221 M * benl I'm investigating an issue with file descriptor usage in a guest and wanted a bit of advice if possible
1395311279 M * Bertl let's here
1395311281 M * Bertl *hear
1395311989 M * benl A daemon in a guest had a runaway use of file descriptors (into the hundreds of thousands) and flagged a few warnings
1395312106 M * benl So the normal thing was to check "/proc/sys/fs/file-nr"
1395312126 M * benl Then check the usage of each guest in "/proc/virtual/X/limit"
1395312138 M * benl The guest in question was identified, and restarted
1395312171 M * benl But what I'm trying to evaluate is whether these handles have been released or not
1395312192 M * benl "/proc/sys/fs/file-nr" is still showing the peak limit (which IIRC is since boot), rather than current usage
1395312252 M * benl But more importantly, open inodes are still fairly high
1395312303 M * benl A quick (hack) check of all files with lsof across all guests (http://paste.linux-vserver.org/58544) looks "normal"
1395312308 M * Bertl well, file descriptors per se are usually released ... but there is accounting per guest, so maybe check there?
1395312351 M * benl Am I correct in assuming "/proc/sys/fs/file-nr" won't reflect the current usage?
1395312413 M * Bertl it probably will for a given namespace
1395312431 M * Bertl cat /proc/virtual//limit
1395312439 M * benl Yeah, I checked that
1395312454 M * benl Those figures are "normal" following the restart of the guest
1395312467 M * benl but the host is still showing the high value in "/proc/sys/fs/file-nr"
1395312510 M * Bertl what does your /proc/sys/fs/file-nr show?
1395312523 M * benl 528960 0 724289
1395312548 M * benl (where the previous limit was ~524288)
1395312574 M * Bertl well, that is quite a lot actually
1395312580 M * benl It's a huge amount
1395312589 M * Bertl any zombies around?
1395312602 M * benl vps ax | grep Z -- doesn't show much
1395312624 M * benl correction, there's a few apps
1395312636 M * benl munin, a watchdog daemon and a cron script
1395312645 M * Bertl maybe processes in 'D' state or stopped processes as well?
1395312656 M * Bertl anything unusual in dmesg?
1395312692 M * benl prior to the limit being reached or after?
1395312705 M * Bertl both
1395312716 M * benl Not really
1395312727 M * benl In fact, nothing of concern (mostly iptables entries)
1395312767 Q * undefined Quit: Closing object
1395312856 M * benl FYI. nothing in D or T states
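
[editor's note: a minimal sketch of the checks discussed above, assuming the usual Linux-VServer /proc layout where the numeric directories under /proc/virtual are context IDs; field details may differ by kernel version]

```sh
# Host-wide handle accounting: "allocated  unused  maximum".
cat /proc/sys/fs/file-nr

# Per-guest FILES accounting, one directory per context ID (xid).
for ctx in /proc/virtual/[0-9]*; do
    echo "== xid ${ctx##*/} =="
    grep FILES "$ctx/limit"
done
```
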
1395313045 M * benl I guess at this point
1395313057 M * benl All I want to identify is if the host is at or near its fs limit
1395313072 M * benl And whether the figure in /proc/sys/fs/file-nr can be trusted as a real-time, current figure
1395313132 M * Bertl it should be accurate, especially if the file handles have been freed up, they should show as unused (second column)
1395313150 M * Bertl or not at all (if completely disposed)
1395313166 M * benl So when you see
1395313167 M * benl 528400 0 724289
1395313194 M * Bertl I'd say there are roughly half a million allocated file handles
1395313206 M * benl Yeah, that's the worrying thing then
1395313220 M * benl That single guest was responsible for ~500k of that
1395313229 M * Bertl no limit there?
1395313234 M * benl (there is now!)
1395313247 M * benl I was under the impression the guest restart should clean up its file handles
1395313293 M * Bertl it should at least log any discrepancies regarding exiting with non-zero file handle counts
1395313316 M * benl Where to on a Debian system?
1395313321 M * benl dmesg/kern.log ?
1395313331 M * Bertl it doesn't actively free up resources though, so it is left to the kernel to clean up when processes die
1395313350 M * benl So I'm assuming that when the vserver was restarted, some processes never stopped
1395313358 M * benl and are still running under the same context ID?
1395313370 M * benl But no longer showing in the /proc/virtual/X/limit accounts?
1395313382 M * benl FYI. For the guest in question, it is currently
1395313382 M * benl FILES: 6379 0/ 6379 -1/ -1 0
1395313390 M * benl Prior to the restart it was 500K+
1395313392 M * Bertl well, /proc/virtual/X/limit will be reset on a restart
1395313400 M * benl Okay, that makes sense
1395313408 M * Bertl but no process can survive a context restart
1395313418 M * benl Then I'm baffled!
1395313433 M * Bertl still, for example if those file handles are actually sockets, they may hang around for quite a while
1395313449 M * Bertl sockets do not disappear immediately when a process is terminated
1395313459 M * benl Is there a way to identify that?
1395313494 M * Bertl they usually show up as waiting/timeout sockets
1395313558 M * benl http://paste.linux-vserver.org/58545
1395313564 M * benl doesn't look to be the case
1395313591 M * benl FYI. this is on 3.9.5-vs2.3.6.5-beng / 0.30.216-pre3038
1395313788 M * Bertl they are probably part of the context
1395313795 M * Bertl i.e. the network context
1395313842 M * benl How do you enter the network context for a guest again?
1395313856 M * Bertl ncontext
1395313928 M * benl I think I'm failing at vserver foo here
1395313931 M * benl vnamespace -e 101 --net ss ?
1395314195 Q * Aiken Remote host closed the connection
1395316303 M * Bertl ncontext --migrate --nid /bin/bash
1395316764 M * beng_ "FYI. this is on 3.9.5-vs2.3.6.5-beng / 0.30.216-pre3038" 3.9.5, why that one?
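
[editor's note: combining Bertl's ncontext hint with the goal of spotting lingering sockets, a sketch might look like the following; the nid 101 is a placeholder borrowed from benl's vnamespace attempt, and the exact ncontext flags should be verified against your util-vserver version]

```sh
# Enter the guest's network context with a shell (nid 101 is a placeholder).
ncontext --migrate --nid 101 /bin/bash

# Inside that shell: summarize socket usage and list sockets
# still waiting out their TIME-WAIT timeouts.
ss -s
ss -tn state time-wait
```
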
1395316812 M * beng_ also, 0.30.216-pre3038 - there is a more up-to-date one in the repository I believe
1395317028 J * undefined ~undefined@66-190-97-211.dhcp.unas.tx.charter.com
1395318821 Q * ircuser-1 Ping timeout: 480 seconds
1395320401 Q * fisted Remote host closed the connection
1395320411 J * fisted ~fisted@xdsl-87-78-187-55.netcologne.de
1395321322 J * ircuser-1 ~ircuser-1@35.222-62-69.ftth.swbr.surewest.net
1395322679 P * undefined
1395322774 M * Bertl yes, the tools are a little outdated, the kernel as well, but I don't think anything significant has changed in the behaviour since
1395323312 J * fisted_ ~fisted@xdsl-84-44-146-68.netcologne.de
1395323754 Q * fisted Ping timeout: 480 seconds
1395323754 N * fisted_ fisted
1395324933 M * ard is it possible to get to the root mount/fs namespace once you are in the initialize part of vserver ... start ?
1395325006 M * ard I have an ip netns add in the initialize part, but since vserver.start is started in a separate mount/fs namespace I'm trying to get back
1395325061 M * Bertl not really, but why would you want to 'go back'?
1395325065 M * ard I can always do a vspace -e --net mount -o bind /proc/self/ns/net /run/netns/ after the start, but I actually want it to work without wrappers
1395325099 M * ard the bind mount of that file is the handle for ip netns
1395325137 M * Bertl and why doesn't that work in initialize?
1395325160 M * ard Because: + exec /usr/sbin/vspace --mount --fs --new -- /usr/sbin/vserver ----nonamespace v1500 start
1395325174 M * ard is what vserver .... start does
1395325198 M * ard so I lost my mount migration
1395325226 M * ard I can always touch nonamespace, but I don't know what happens then 8-D
1395325273 M * ard I actually want to do the ip netns add without having wrappers around vserver ... start
1395325323 M * Bertl wouldn't it make sense to integrate your netns setup into util-vserver?
1395325363 M * ard on the other hand, people here are already used to vspace -e ... --net tcpdump ... but for me this is just the last bits
1395325401 M * ard Yes, it would :-). But I need a working version first, so that you guys understand my setup and accept patches ;-).
1395325412 M * Bertl AFAIK, network namespaces have been integrated to some degree already
1395325419 M * ard I wouldn't accept my patches without knowing what I am doing :-)
1395325448 M * ard I saw some netns stuff, but that doesn't work, and it is also not flexible
1395325479 M * ard What I actually do is create a namespace and redefine _IP to be _IP netns exec :-)
1395325483 M * Bertl well, if it "doesn't work", you should file a bug report and/or send patches to daniel_hozac_ :)
1395325622 M * ard The biggest problem is actually that you want to have a network namespace, create interfaces, move them inside that network namespace and then configure them
1395325651 M * daniel_hozac_ all of which is supported.
1395325654 M * ard With shared network namespaces, you want the interface construction to be done in the right namespace
1395325684 M * daniel_hozac_ for which you just set shared.
1395325689 M * daniel_hozac_ and name
1395325709 M * ard and the _IP gets done in the wrong namespace
1395325775 M * daniel_hozac_ have you actually tried it?
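
[editor's note: for reference, the wrapper ard describes (run after vserver ... start) amounts to something like the sketch below; v1500 is the guest name taken from his trace and stands in for any guest, and the exact vspace arguments may differ per util-vserver version]

```sh
# Enter the guest's network namespace and publish it to iproute2 by
# bind-mounting the namespace file where "ip netns" looks for handles.
mkdir -p /run/netns
touch /run/netns/v1500
vspace -e v1500 --net mount -o bind /proc/self/ns/net /run/netns/v1500

# From then on the namespace is usable without util-vserver wrappers:
ip netns exec v1500 ip addr show
```
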
1395325781 M * ard yes
1395325799 M * ard And I've tried your suggestions earlier, sent patches for them
1395325816 M * ard then I worked around them like you and bertl said, so the patches weren't necessary
1395325913 M * ard With the workarounds I can have network namespaces set up, interfaces migrated to the right namespace, have standard util-vserver handle the IPs and interfacing, and be happy
1395326041 M * ard But the moment I can use ip netns is in initialize, which is behind the new mount namespace, so after vserver start has finished, I've lost the bind mount to the file.
1395326047 M * ard I can recreate it.
1395326080 M * ard Anyway: I'd love to send patches, but then we must have an understanding that what I do is sane, and that my modifications don't hurt others...
1395326097 M * daniel_hozac_ i don't think any modification is necessary honestly.
1395326100 M * ard Because I love what you guys do
1395326226 M * ard I am almost finished with complete scripts, separated into namespace handling and a few minor interface creation and migration bits
1395326249 M * daniel_hozac_ so you use /etc/vservers//netns/interfaces?
1395326261 M * ard No, I use spaces/net/
1395326269 M * ard eh, without the trailing /
1395326292 M * daniel_hozac_ which is not the netns integration.
1395326297 M * ard I tried netns/interfaces but it doesn't do what I want
1395326316 A * ard sighs with relief :-)
1395326338 M * ard because at that point /run/netns is unmounted, if I am correct 8-D
1395326381 M * ard I use the vserver name in net, which makes it shared and not a new namespace
1395326404 M * ard and then I rewrite the vspace -e --net into ip netns exec on the launch
1395326415 M * ard (in pre-start)
1395326491 M * ard For a new namespace I rewrite the _IP:
1395326497 M * ard _HIP="$_IP"
1395326497 M * ard _IP="$_HIP netns exec ${2} $_HIP"
1395326523 M * ard But for a shared namespace I rewrite the _IP like:
1395326526 M * ard _IP="$_VSPACE --enter $USE_NETNAMESPACE --net $_HIP"
1395326544 M * ard where USE_NETNAMESPACE is the name of the vserver to use
1395326555 M * daniel_hozac_ how is that different from setting name and shared in netns?
1395326600 M * ard the netns part leaves no room for creating interfaces, it just makes a veth
1395326612 M * ard no hooks
1395326632 M * daniel_hozac_ you can create whatever interface you want.
1395326636 M * ard I use enableInterfaces to set IPs on the interfaces
1395326766 M * ard According to _handleNetNS in https://github.com/linux-vserver/util-vserver/blob/master/scripts/vserver.functions
1395326773 M * ard i can only have a veth
1395326857 M * ard setupNetNS is called at a point in vserver.start where /run/netns is probably unmounted already
1395326901 M * daniel_hozac_ no, veth is the default. you can set type to whatever you want.
1395326923 M * ard not every type needs a peer name...
1395326955 M * daniel_hozac_ then submit patches for that.
1395327013 M * ard But is it working? Because I tried setting it up like that, but it bailed out somewhere...
1395327027 M * daniel_hozac_ works for me.
1395327060 M * ard Ok, because that's important...
1395327102 M * ard I need to know what kind of patches you will accept
1395327191 M * ard I don't use veth; I create vlans on bonds, and I want vserver to set up the links for that...
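
[editor's note: collecting ard's quoted fragments above into one place, the pre-start hook boils down to the following; _IP, _HIP, _VSPACE, and USE_NETNAMESPACE are the names used in the transcript, and ${2} is assumed to be the namespace name passed to the hook]

```sh
# New namespace: prefix every ip(8) invocation with "ip netns exec <name>".
_HIP="$_IP"
_IP="$_HIP netns exec ${2} $_HIP"

# Shared namespace (the alternative case): enter the named vserver's
# network context instead; USE_NETNAMESPACE holds that vserver's name.
_IP="$_VSPACE --enter $USE_NETNAMESPACE --net $_HIP"
```
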
1395327213 M * ard But I think the most important patch will be a way to hook into the interface creation
1395329526 M * benl Hey Bertl
1395329535 M * benl those handles never "cleaned up"
1395329545 M * benl still sat at 500k+
1395329750 M * Bertl so chances are good that they will not be freed up without a reboot
1395332317 M * benl Shoot. That's a nuisance
1395332329 M * benl So is this a bug in vserver itself? The kernel?
1395332336 M * benl What are the best steps to follow if it happens again?
1395332706 M * Bertl good questions, I think it isn't a bug per se in Linux-VServer, probably an 'acceptable loss' like with certain memory resources in the kernel
1395332746 M * Bertl best steps ... stopping the processes first, i.e. terminating them properly from inside the guest, would be a good idea I guess
1395333315 M * benl Will do
1395335274 J * bonbons ~bonbons@2001:a18:225:a901:c583:4a9e:2d7:e58d
1395335870 J * zerick ~eocrospom@190.187.21.53
1395338453 M * arekm what's the best way to detect that isolation networking is in use in a guest?
1395338466 M * arekm some time ago looking for net: in /proc/self/nsproxy
1395338473 M * arekm worked, but now there is no such file
1395338761 M * arekm ok, /proc/self/ninfo
1395339192 M * Bertl /proc/self/ninfo is still there, unless it is disabled via config
1395339257 M * arekm added ninfo detection. previously nsproxy only
1395340125 Q * [Guy] Remote host closed the connection
1395340129 J * Guy- ~korn@elan.rulez.org
1395341442 J * Aiken ~Aiken@2001:44b8:2168:1000:21f:d0ff:fed6:d63f
1395345853 Q * benl Read error: Connection reset by peer
1395346715 Q * beng_ Quit: I Leave
1395349497 M * Bertl off for a nap ... bbl
1395349508 N * Bertl Bertl_zZ
1395354171 Q * bonbons Quit: Leaving
1395354748 Q * Ghislain Quit: Leaving.
1395358710 Q * SteeleNivenson Ping timeout: 480 seconds
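
[editor's note: as a closing note on arekm's detection question, a guest-side probe following the transcript might look like the sketch below; it assumes newer kernels expose /proc/self/ninfo (unless disabled via config) while older ones carried a net: entry in /proc/self/nsproxy]

```sh
# Detect isolation networking from inside a guest.
if [ -e /proc/self/ninfo ]; then
    echo "isolation networking: /proc/self/ninfo present"
elif grep -q 'net:' /proc/self/nsproxy 2>/dev/null; then
    echo "isolation networking: net: entry in /proc/self/nsproxy"
else
    echo "no isolation networking detected"
fi
```
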