1595896911 N * Bertl_zZ Bertl_oO
1595896950 M * Bertl_oO Hurga: did you give enough capabilities to the guest?
1595906219 M * Bertl_oO off to bed now ... have a good one everyone!
1595906220 N * Bertl_oO Bertl_zZ
1595913022 J * Ghislain ~ghislain@adsl2.aqueos.com
1595920314 M * Ghislain so bertl how the 5.8 patch is going ? lol
1595920365 Q * [Guy]
1595920369 J * Guy- ~korn@0002809d.user.oftc.net
1595920377 M * Guy- Hurga: are you running elogind in a guest or on the host?
1595928430 M * Hurga Guy-: both actually
1595928460 M * Guy- Hurga: and the error you pasted occurs where?
1595928467 M * Hurga guest
1595928533 M * Guy- the guest needs some capability or flag in order to be able to manage cgroups
1595928538 M * Guy- I don't know which
1595928574 M * Hurga I've been told before the guest shouldn't have to mount anything...
1595928616 M * Hurga But I've given the guest quite a few permissions already.
1595928893 M * Guy- strace -ff -s200 -yy /lib/elogind/elogind 2>&1 | fgrep EPERM
1595928902 M * Guy- what's it trying and failing to do?
1595929543 M * Aiken I am wondering if it is the same I have been seeing with a gentoo guest with elogind failing when it can not do anything in /sys/fs/cgroup
1595929660 M * Hurga Guy-: https://pastebin.com/cym6cJYR
1595929670 M * Hurga Aiken: likely.
1595929740 M * gnarface you guys know you don't need elogind right?
1595929787 M * gnarface you could uninstall it and have guests run startx
1595929801 M * Hurga gnarface: if everything fails, I'll try without.
1595929824 M * gnarface i assume there'd still be permissions to work out but i'm not sure elogind can be made to work at all
1595929839 M * Aiken this is a build server for several desktops and laptops.
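The `strace ... | fgrep EPERM` filter suggested above only shows raw lines. A minimal Python sketch of the same idea, grouping failed calls by syscall and errno, might look like the following. The sample lines are hypothetical and are not taken from the pastebin. The regular expression assumes the usual `name(args) = -1 ERRNO (text)` layout of strace output, with an optional `[pid N]` prefix.

```python
import re
from collections import Counter

# Assumed strace line layout, e.g.:
#   [pid  4242] mount("cgroup", ...) = -1 EPERM (Operation not permitted)
FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+)?(?P<syscall>\w+)\(.*\)\s+=\s+-1\s+(?P<errno>E[A-Z0-9]+)\b"
)


def summarize_failures(lines, errnos=("EPERM", "ENOENT")):
    """Count failed syscalls per (syscall, errno) pair in strace output."""
    counts = Counter()
    for line in lines:
        match = FAILED_CALL.match(line.strip())
        if match and match.group("errno") in errnos:
            counts[(match.group("syscall"), match.group("errno"))] += 1
    return counts


# Hypothetical sample lines for illustration only.
SAMPLE = [
    '[pid  4242] mount("cgroup", "/sys/fs/cgroup/elogind", "cgroup", 0, "none") = -1 EPERM (Operation not permitted)',
    'openat(AT_FDCWD, "/sys/fs/cgroup/elogind/cgroup.procs", O_RDONLY) = -1 ENOENT (No such file or directory)',
    'read(3, "", 4096) = 0',
]

if __name__ == "__main__":
    for (syscall, errno), count in summarize_failures(SAMPLE).items():
        print(f"{syscall:10s} {errno:8s} {count}")
```

Fed with real strace output (for instance by iterating over `sys.stdin`), this shows at a glance whether the guest fails on `mount` with EPERM, as in Hurga's case, or on file access with ENOENT, as in Aiken's.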
1595929848 M * Aiken I commented out the elogind line in /etc/pam.d/system-auth
1595929853 M * Hurga But I also have an Ubuntu 16.04 where the graphical updater fails because of cgroup issues
1595929893 M * Guy- Hurga: perhaps you can set up the requisite mounts elogind needs from the guest's fstab
1595929905 M * Guy- and then elogind itself doesn't need to mount anything
1595929907 M * Guy- worth a try
1595929934 M * Aiken Hurga, where you have EPERM I have ENOENT but still dies trying to do things in /sys/fs/cgroup like
1595929989 M * Aiken I only have elogind installed because of the recent gentoo change away from consolekit which was annoying in its own right
1595929996 M * Hurga Guy-: Tried that too, but it still looked like elogind tried to mount stuff, failed and died
1595930102 N * Bertl_zZ Bertl
1595930106 M * Bertl morning folks!
1595930110 M * Aiken hi
1595930169 M * AlexanderS Hurga: Can you try to give "SECURE_MOUNT" and "BINARY_MOUNT" to the guest? If it is working when the guest mounts the cgroup, then you could try to mount it during startup without capabilities.
1595930171 M * Bertl Hurga: why not try with all capabilities for the guest to see if it is a 'guest problem' or a capability issue?
1595930177 M * gnarface morning Bertl!
1595930300 M * Bertl Hurga: for me it looks like there is no cgroup tree mounted at /sys/fs/cgroup
1595930333 M * Bertl and the guest does not have the permissions to do mounts
1595930813 M * Hurga This will take some time... I'm in a totally different network setup today
1595931162 M * Bertl no problem
1595931253 M * Hurga AlexanderS: like this?
1595931258 M * Hurga cat /etc/vservers/gui2/ccaps
1595931258 M * Hurga secure_mount
1595931258 M * Hurga binary_mount
1595931334 M * AlexanderS I think it is: /etc/vservers/$name/ccapabilities
1595931456 M * AlexanderS You can check with "vattribute --xid $name --get" if the capabilities are applied correctly.
1595931584 M * Aiken between secure_mount binary_mount sys_admin can not do anything in /sys/fs/cgroup
1595932034 M * Hurga Aiken: sys_admin?
1595932089 M * Hurga Even with all the ccapabilities given it's the same error
1595932212 M * Bertl add the bcapabilities as well
1595932278 M * Aiken sys_admin in bcapabilities
1595932283 M * Aiken did not help
1595932340 M * Bertl because if it fails with all c/bcaps added, then it is a problem with the host cgroup configuration and not the guest
1595932352 M * Bertl (or the guest limitations that is)
1595932637 M * Hurga ok, something works now, but I lost net access when stopping the vserver
1595932649 M * Hurga (good thing it's a test machine)
1595932697 M * Bertl well, that just means the guest (un)configures the network
1595932706 M * Hurga thought so
1595932735 M * Bertl so elogind works fine there now?
1595932857 M * Hurga well it starts.
1595932918 M * Bertl try a login and see how that goes
1595932941 M * Hurga on a standard file system /sys/fs/cgroup/ mount BTW
1595932971 M * Bertl if all goes well and elogind is working as expected, then we can figure out the required capabilities
1595933013 M * Bertl you can probably start with taking away net_admin from the bcaps :)
1595933024 M * Bertl off for now ... bbl
1595933029 N * Bertl Bertl_oO
1595933145 M * Aiken Hurga, what did you set?
1595933148 M * Hurga yes works.
1595933184 M * Hurga Aiken: too much. :) I'll strip it down, hang on
1595934281 M * Hurga Aiken: bcapabilities SYS_ADMIN is sufficient
1595934293 M * Hurga but way too powerful..
1595934307 M * Hurga makes the guest mount ro on exit :P
1595934529 M * Aiken can you make new entries under /sys/fs/cgroup/ ?
1595934598 M * Aiken SYS_ADMIN does not work for me so we have something a bit different happening. The link is both are in cgroups
1595934789 M * Hurga yes, there are entries, even with the corresponding PIDs for the login processes
1595934812 M * Hurga how did you set up your cgroups?
1595934851 M * Aiken mine in the guest is empty
1595934902 M * Aiken never put any thought into cgroups in a guest until elogind appeared
1595935000 M * Ghislain if you give SYS_ADMIN to a guest then it owns the host too so perhaps there is no point in isolating it :)
1595935221 M * Hurga Ghislain: I know. Any ideas?
1595935461 M * Ghislain if this is cgroup related perhaps https://linuxcontainers.org/lxcfs/introduction/ can work IF it works with vserver (never tried) but if not then i will use real virtual machines to isolate it as it needs too many rights to work securely in a vserver guest
1595935476 M * Ghislain but that depends where you draw the line with the isolation requirement
1595935563 M * Ghislain cgroups is a real problem in a guest
1595935712 M * Hurga So far I had the impression that util-vserver can setup cgroups (they definitely do something) on the host and then it either works somehow, or you have to bind mount the host cgroups in the guest via (guest) fstab.
1595935784 M * Hurga But if the guest insists on doing the mounts itself, and that is impossible without SYS_ADMIN, that isn't worth much.
1595935797 M * Ghislain the issue is when a guest tries to create or modify cgroups like systemd, i remember trying to mount /dev/cgroups/vserver/guestname/ as /dev/cgroups in a guest but failed. Perhaps it changed since then (was several years ago i think)
1595935821 M * Ghislain yes that's another issue when the guest wants to do the mounting itself too
1595935897 M * Ghislain in init scripts you can get by by editing them but if this is systemd or a daemon it is very hard
1595935961 M * Hurga Ghislain: Your setup is what I would have tried too. Paths are a bit different today, more like /sys/fs/cgroup instead of /dev/cgroup, but hey.
1595935970 M * Hurga But check this patch....
1595935978 M * Hurga https://github.com/linux-vserver/util-vserver/issues/32
1595936018 M * Hurga the paths it set up look like /sys/fs/cgroup/cpuacct/vserver/mail/...
1595936077 M * Hurga what of it am I supposed to mount to the guest? there are different subsystems like cpuacct higher in the hierarchy than the guest name.
1595936367 M * Ghislain well i prefer the unified hierarchy so the guest cgroup is under one directory. With this distributed hierarchy you cannot easily mount it in the guest as it is multiple directories.
1595936378 M * Ghislain i dont find any particular setup i have done for this
1595936389 M * Ghislain vserver on /dev/cgroup type cgroup (rw,relatime,cpuset,cpu,cpuacct,blkio,memory,devices,freezer,net_cls,perf_event,net_prio,hugetlb,pids)
1595936403 M * Ghislain /etc/vservers/.defaults/cgroup/base
1595936411 M * Ghislain =>vservers
1595936635 M * Hurga I would understand that better, admittedly. But I'm trying to understand the tools as they are - there's likely some reason to it
1595936649 J * x1450 ~x1450@77-240-13-1.rdns.melbourne.co.uk
1595936683 M * x1450 o/
1595936698 M * Hurga (others likely understand the ramifications better than I do)
1595936861 M * x1450 Just have a couple of questions, I'm testing a Jessie -> Buster upgrade using the same kernel and I've noticed that my vservers aren't accounting CPU individually even though `VIRT_CPU` is set, however load avg and mem do appear to be accounted for per container. I have noticed some vxw output on the buster machine in syslog that references vserver_cpu, vserver_resourc and vserver_loadavg: https://pastebin.com/Gcnek27R
1595936900 M * x1450 also with regards to the vxW messages, they appear to print some strange characters that make grep think the log is binary, is there a way of changing the delimiter used?
1595936950 M * x1450 i can find a reference to it here: http://www.paul.sladen.org/vserver/archives/201009/0067.html but unsure if there's a specific reason
1595937169 M * x1450 well apart from "> I guess Herbert put them there to be able to handle weird executable
1595937169 M * x1450 > names containing one of the other separators..."
But how common is this and isn't it possible to mitigate it via a cleaner method?
1595938891 Q * Aiken Remote host closed the connection
1595939795 M * sladen morning
1595942914 M * Ghislain x1450: what do you mean by accounting cpu individually ?
1595943061 M * x1450 ok so if I run stress-ng in one container, I can see that CPU time is burned by using top in that container whereas other containers aren't affected
1595943105 M * Ghislain oh ok
1595943108 M * x1450 i.e. like this https://pastebin.com/3e93A3h0
1595943238 M * x1450 so that's the existing working setup (Jessie), whereas on buster it appears as so: https://pastebin.com/xMtyPTWa
1595943258 M * Ghislain yes same here
1595943270 M * Ghislain they must have changed the api call to a new one
1595943389 M * x1450 in userspace I'm guessing as the kernel is the exact same
1595943424 M * Ghislain yes, linux loves to have 20983098 calls for the same thing, accounting of the number of processes for example
1595943429 M * x1450 :joy:
1595943436 M * x1450 ahh i forgot this is IRC :|
1595943448 M * x1450 too used to discord/slack/mattermost XD
1595944223 M * Ghislain this seems related to /proc/stat
1595944231 M * x1450 yeah i just straced both
1595944512 M * Ghislain the patch limits only the cpu you can see, i dont see code to limit the usage count for each cpu
1595944611 M * Ghislain on a debian 7 i still have the whole cpu stats it seems so are you sure the behavior changed ?
1595944716 J * fstd ~fstd@xdsl-87-78-47-45.nc.de
1595945183 Q * fstd_ Ping timeout: 480 seconds
1595945883 M * x1450 Pastebin links above show the differing CPU lines from top
1595946072 M * x1450 it's not just top though, if I use M/Monit - I can see that on the Jessie machine, high cpu usage is only reported in the container where it's coming from
1595946089 M * x1450 On the Buster machine, the CPU usage rise appears on all containers
1595946614 M * Bertl_oO x1450: regarding 'strange characters' ...
that's your personal preference :)
1595946669 M * Bertl_oO because everybody has different taste (and different opinions) about the characters used for quoting paths, we have two config options for that
1595946685 M * Bertl_oO CONFIG_QUOTES_UTF8 and CONFIG_QUOTES_ASCII
1595946717 M * Bertl_oO if you do not set either of them, the default will be ISO8859-15
1595946764 M * x1450 ahh that may explain it then
1595946793 M * Bertl_oO regarding CPU 'isolation' ... on recent kernels this is done by cgroups, so make sure that your cgroup setup isolates the guests properly (including scheduler limits etc)
1595946834 M * Bertl_oO the Linux-VServer patch only 'virtualizes' the kernel/userspace information (depending on patch version and feature diff you added or not)
1595946887 M * Bertl_oO if there is indeed a place where we do not virtualize the info properly, you need to figure out where the 'wrong' information comes from and create a simple test case so that we can check and fix it in a future patch
1595946922 M * Bertl_oO and yes, there are many ways to get this information from the kernel so it is not unlikely that we are missing one or the other :)
1595947079 M * x1450 does it have to be a non-interactive test case?
1595947106 M * Bertl_oO no, but it should be something I can run here with a bunch of util-vserver commands
1595947138 M * Bertl_oO and not something like 'install a youbuntu whatnot guest and run ngtopwossname'
1595947231 M * Bertl_oO if you can't figure out what syscall or procfs entry contains the 'non virtualized' information, then creating a guest with minimal binaries and libraries is fine too (should be a few megabytes at most)
1595947357 M * x1450 Debian and small don't go well XD
1595947383 M * Bertl_oO you can find libraries used by a binary with 'ldd'
1595947407 M * Bertl_oO if you install only those, you should be able to cut it down significantly
1595947435 M * x1450 yup
1595947458 M * Bertl_oO but as I said, first check with 'strace' to see what the binary does to get the info
1595947490 M * Bertl_oO in case the info comes from a /proc entry, you can also check with cat
1595947516 M * Bertl_oO in case it uses some syscall API, a simple C program should do as well
1595947526 M * Bertl_oO (or python)
1595947602 M * x1450 i did have a strace diff between the two however I need to sanitize any identifying names in them
1595947634 M * Bertl_oO don't bother, just find the place where the wrong information is retrieved
1595947651 M * Bertl_oO it will boil down to a proc entry or syscall
1595948247 M * Bertl_oO (or if it is very fance, netlink :)
1595948253 M * Bertl_oO *fancy
1595949447 M * x1450 https://www.diffnow.com/report/ws87p (Clip 1/Jessie [works] while Clip 2/Buster [not working as expected]) - bearing in mind, I'm not super skilled but from what i can see the main difference is that it appears to call `arch_prctl(ARCH_SET_FS, 0x7fa8fa47e7c0) = 0` (line 98, clip 2), `set_tid_address(0x7fa8fa47ea90) = 12633` (line 112, clip 2), `set_robust_list(0x7fa8fa47eaa0, 24) = 0` (line 113, clip 2)
1595949454 M * x1450 it still reads `/proc/stat` though
1595949608 M * Bertl_oO so does /proc/stat contain the
correct or the wrong information in your guest?
1595950577 M * x1450 Hmm now I've confused myself as I'm seeing the same behaviour on a jessie machine
1595950601 M * x1450 vs my dev machine which is also jessie and where the discrepancy popped up
1595951006 M * x1450 I need to find the difference here however just for a record. On my dev machine (Jessie) - using the AWK example: https://rosettacode.org/wiki/Linux_CPU_utilization#AWK to read /proc/stat results in 100% in the container with the high CPU usage and a different value 7-10% in a different container running other stuff but not actively smashing the cpu
1595951033 M * Bertl_oO sounds good
1595951178 M * x1450 yup i'm an idiot
1595951210 M * Bertl_oO don't be too hard on yourself ... so everything is fine?
1595951279 M * x1450 so basically I had a cgroup/cpuset in the container i was running the cpu load from, when i removed this then it started behaving like the other jessie machine and buster (so it appears the behaviour hasn't changed)
1595951319 M * x1450 which I guess would lead to me asking - is it possible to have per container CPU metrics?
1595951354 M * Bertl_oO meaning?
1595951546 M * x1450 so for example say you have guest1 and it was running a process where cpu time was reported at 100%, whereas guest2 is just idle - guest1 would report it's utilizing 100% whereas guest2 would report it's idle?
1595951570 M * x1450 I'm not deeply familiar with how linux handles CPU accounting in that manner or whether it's even possible
1595951619 M * Bertl_oO well, yes, if you have a scheduling cgroup, that's how it should show up
1595951758 M * x1450 to the wiki, i guess :D
1595951814 M * Bertl_oO we used to have our own scheduler, but as it was quite intrusive, we dropped it when the cgroup got usable
1595952040 M * x1450 i don't suppose you know which kernel version that would have been around?
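The check discussed above (does `/proc/stat` inside a guest show the guest's own CPU use or the host's?) can be repeated with a short Python script instead of the AWK example. This is a minimal sketch, not the Rosetta Code program: it takes two samples of the aggregate `cpu` line and reports the busy share in between. Counting `iowait` as idle time is an assumption made here.

```python
import os
import time


def parse_cpu_totals(stat_text):
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            ticks = [int(value) for value in fields[1:]]
            # Field order: user nice system idle iowait irq softirq steal guest guest_nice
            idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
            # guest and guest_nice are already contained in user and nice
            total = sum(ticks[:8])
            return idle, total
    raise ValueError("no aggregate 'cpu' line found")


def utilization(before, after):
    """Busy percentage between two (idle, total) samples."""
    idle_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        return 0.0
    return 100.0 * (total_delta - idle_delta) / total_delta


def sample(path="/proc/stat", interval=1.0):
    """Read the file twice, `interval` seconds apart, and return the busy percentage."""
    with open(path) as handle:
        first = parse_cpu_totals(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        second = parse_cpu_totals(handle.read())
    return utilization(first, second)


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    print(f"cpu busy: {sample():.1f}%")
```

Running it at the same time in a loaded guest and in an idle guest shows whether the two see different numbers, which is the comparison x1450 describes.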
1595952058 M * x1450 as the information i have that this was functional may have been from around the 3.10 days or something
1595952416 M * Bertl_oO was removed around end of 2010/early 2011
1595952551 M * x1450 hmm 3.10 is after that so i don't think that's the case then
1595952554 J * fstd_ ~fstd@xdsl-87-78-47-45.nc.de
1595952663 M * Ghislain x1450: you are not on 4.9 ?
1595952676 M * x1450 I'm on 4.9
1595952681 M * Ghislain oh ok
1595952719 M * x1450 it's just the information I have may have been based on the 3.10 or 3.16 kernels (running some tests to confirm functionality) on buster and the 4.9 kernel
1595952762 M * x1450 as the previous person was used to vserver on those kernels and I haven't really looked into that until now
1595952837 M * x1450 so with cgroups, i don't really want to give greater/lower shares - if I set the cgroup cpu shares to 1024 per each container then I'd get my expected result?
1595952858 M * x1450 I've jotted down that I really need to read up more on this so I apologize for any stupid questions
1595952918 M * Ghislain i dont think cgroup virtualise the cpu usage so vserver will not show the real cpu usage of each guest but only each cpu.
Here for cache performance we limit the cpuset per guest
1595952935 M * Ghislain we share some cpu but we have some that are dedicated to the guest
1595952999 Q * fstd Ping timeout: 480 seconds
1595953027 M * x1450 it's mainly for load identification, as each container has monit for service supervision - M/Monit has an overview of all the monit nodes and displays CPU/RAM
1595953056 M * x1450 so assuming you had per container cpu usage, you could look at the dashboard and say oh container1 is where the cpu usage is at
1595953095 M * x1450 whilst not having SSH access to the said container or relying on a different method
1595953099 M * Ghislain you should monitor the cpuacct cgroup metrics for that
1595953212 M * Ghislain this means specific monitoring instead of using the basic way but this should be what you seek
1595953380 M * x1450 i suspect m/monit or monit doesn't support that
1595954128 M * Bertl_oO load is virtualized (based on cgroup accounting) on 4.9.x patches
1595954142 M * Bertl_oO you just need to enable it
1595954146 M * x1450 and it turns out I may have been misled *sigh* - yeah loadavg appears to be the one we care about testing and that works fine
1595954797 M * x1450 i appreciate the help anyhow and apologize for the wasted time
1595954807 M * x1450 at least i've got new stuff to read up on :)
1595954814 M * Bertl_oO you're welcome! no problem!
1595954960 M * x1450 Also I meant to ask, if you decide and announce a crowdfunding campaign - where would it be announced?
1595955022 M * Bertl_oO most likely on the mailing list and here
1595955748 M * x1450 kk i'll keep an eye out :)
1595956051 M * x1450 have a good evening all o/
1595956054 Q * x1450 Quit: Leaving
1595961358 Q * Ghislain Quit: Leaving.
1595961398 J * Ghislain ~ghislain@adsl2.aqueos.com
1595965422 J * Aiken ~Aiken@b951.h.jbmb.net
1595968569 Q * Ghislain Quit: Leaving.
1595976404 M * Bertl_oO off to bed now ... have a good one everyone!
1595976406 N * Bertl_oO Bertl_zZ
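Ghislain's suggestion to monitor the cpuacct cgroup metrics could be tried from the host with a few lines of Python. The sketch below is an illustration under two assumptions: a cgroup v1 cpuacct hierarchy is in use, and each guest has a directory under `/sys/fs/cgroup/cpuacct/vserver/`, following the paths Hurga quoted earlier. It reads `cpuacct.usage` (cumulative CPU time in nanoseconds) twice and converts the difference into a percentage of one CPU. Guest names are passed on the command line.

```python
import os
import sys
import time

# Assumed location; on other setups the hierarchy may be mounted elsewhere.
CPUACCT_ROOT = "/sys/fs/cgroup/cpuacct/vserver"


def read_usage_ns(guest, root=CPUACCT_ROOT):
    """Cumulative CPU time of one guest's cgroup, in nanoseconds."""
    with open(os.path.join(root, guest, "cpuacct.usage")) as handle:
        return int(handle.read().strip())


def cpu_percent(usage_before_ns, usage_after_ns, elapsed_s):
    """CPU use over the interval as a percentage of one CPU."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return 100.0 * (usage_after_ns - usage_before_ns) / (elapsed_s * 1e9)


def sample_guests(guests, interval=1.0, root=CPUACCT_ROOT):
    """Return {guest: percent} measured over roughly `interval` seconds."""
    start = time.monotonic()
    before = {guest: read_usage_ns(guest, root) for guest in guests}
    time.sleep(interval)
    after = {guest: read_usage_ns(guest, root) for guest in guests}
    elapsed = time.monotonic() - start
    return {guest: cpu_percent(before[guest], after[guest], elapsed) for guest in guests}


if __name__ == "__main__":
    names = sys.argv[1:]
    if not names:
        print("usage: cpuacct_sample.py GUEST [GUEST ...]")
    else:
        for guest, percent in sample_guests(names).items():
            print(f"{guest}: {percent:.1f}% of one CPU")
```

A value above 100% means the guest kept more than one CPU busy during the interval. Feeding these numbers into monit or M/Monit would need a custom check, which is outside this sketch.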