1440637953 M * Bertl_oO marcfiu: hey, LTNS!
1440638043 M * marcfiu hey there
1440638043 M * Bertl_oO did you consider that the process might have already disappeared (terminated) between the ps output and the vps lookup?
1440638056 M * marcfiu have not considered that
1440638105 M * Bertl_oO it might make sense for vps to check if the process is still there when getting that lookup error
1440638185 M * marcfiu indeed
1440638559 M * marcfiu is it possible for a process that is exiting but not yet gone to have detached from a context?
1440638659 M * marcfiu unfortunately I cannot poke at an active system that has this condition… I was informed of it with some output of "vps -AF | grep ERR".
1440638692 M * marcfiu My sense is these processes are still around at the time vps is trying to get the context information.
1440638705 M * marcfiu But as I mentioned, without poking at a live system it is all just conjecture.
1440638745 M * Bertl_oO in theory, it is possible, but I would consider it more likely that they simply terminated after ps took the snapshot and are gone at the time vps checks the xid
1440638919 M * marcfiu true
1440638934 M * marcfiu though this is weird to be part of the "vps -AF | grep ERR" output:
1440638936 M * marcfiu root 122450 ERR 109536 0 2063 552 2 12:48 pts/0 00:00:00 vps -AF
1440638936 M * marcfiu root 122451 ERR 109536 0 25810 848 22 12:48 pts/0 00:00:00 grep ERR
1440638936 M * marcfiu root 122452 ERR 122450 0 27557 1192 18 12:48 pts/0 00:00:00 ps -AF
1440638972 M * Bertl_oO well, vps is just a wrapper which takes the ps output and annotates it accordingly
1440638974 M * marcfiu without eyeballing the vps.c code, I'd say that vps itself AND the grep process should still eb out.
1440638987 M * marcfiu … be around
1440638995 M * marcfiu I can see how "ps" might not be around.
1440639032 M * marcfiu this is just a subset of the processes for which the xid is ERR
1440639114 M * Bertl_oO in this particular case, I would suspect the vps/ps/grep to be in the host context, no?
1440639134 M * Bertl_oO what kernel/patch is this exactly?
1440639608 M * marcfiu yes it is running in the host context
1440639623 M * marcfiu but it is also reporting ERR for some processes in a guest context
1440639716 M * Bertl_oO well, we can most likely rule out that vps has terminated while annotating ps output, so if that is indeed the very same vps process giving that output, then it looks like an issue with the Linux-VServer syscall function
1440639822 M * marcfiu It is either a CentOS 6.5 or 6.6 kernel with vs2.3.0.36.29.6 patch applied
1440639856 M * marcfiu so either 2.6.32-431.11.2.el6, 2.6.32-504.1.3.el6, or 2.6.32-504.12.2.el6
1440639870 M * marcfiu as mentioned, I do not have access to the system right now.
1440639918 M * Bertl_oO I do not recall providing patches for those kernels, who adjusted the patches to the centos kernels?
1440639918 M * marcfiu you mean the vc_get_task_xid() that I referred to before?
1440639937 M * marcfiu We likely did it internally
1440640017 M * marcfiu I can share the patch itself, if it helps
1440640047 M * Bertl_oO well, most likely the patches centos added are relevant too
1440640053 M * marcfiu yeah
1440640056 M * marcfiu :)
1440640063 M * marcfiu note this doesn't happen all the time
1440640073 M * marcfiu but we've gone into this state
1440640086 M * Bertl_oO so I wouldn't consider this a Linux-VServer problem unless we see it on a mainline kernel
1440640087 M * marcfiu though without any significant impact on the operational state of the system
1440640105 M * marcfiu no worries…
1440640112 M * marcfiu How would I map the vc_get_task_xid() syscall to the relevant code in the patch?
1440640139 M * Bertl_oO the syscall function will be called vc_get_task_xid, so should be rather easy
1440640159 M * marcfiu vx_get_proc_task?
1440640392 M * Bertl_oO +int vc_task_xid(uint32_t id)
1440640438 M * Bertl_oO searches for the task's pid and returns the xid or ESRCH if not found
1440640462 M * Bertl_oO vc_get_task_xid() is probably the util-vserver wrapper
1440641029 M * marcfiu thx
1440641042 M * Bertl_oO you're welcome!
1440641048 M * marcfiu if this happens again is there some sort of debugging that one can enable to get more insight
1440641064 M * marcfiu hate chasing stuff like this without access to a system that is in this state
1440641091 M * marcfiu first, I suppose I can amend vps.c to double check that the process still exists.
1440641118 M * marcfiu but I am wondering if there is some sort of kernel level vserver debug such that one can get more info
1440641120 M * Bertl_oO there is a debug entry which logs the syscall command switch actions, but most likely that will only show you what we already know, that the vc_task_xid() checks a pid and doesn't find it
1440641280 M * marcfiu yeah… I was reading the patch code and didn't see any relevant vxfprintk that one could trigger
1440641438 M * marcfiu +#define find_task_by_real_pid(pid) find_task_by_pid_ns(pid, &init_pid_ns)
1440641443 M * Bertl_oO + vxdprintk(VXD_CBIT(switch, 0),
1440641446 M * marcfiu just for clarification
1440641469 M * Bertl_oO will log the syscall command switch invocation
1440641489 M * marcfiu searching for task by real pid in the init_pid_ns because that is the global namespace for all process ids, right?
1440641496 M * Bertl_oO + vxdprintk(VXD_CBIT(switch, 1),
1440641505 M * Bertl_oO will log the exit code
1440641556 M * Bertl_oO @real_pid: yes, that's the idea
1440641587 M * marcfiu thx
1440642114 M * Bertl_oO np
1440645442 Q * marcfiu Quit: Leaving.
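[Editor's note] The check discussed above (have vps confirm that a process still exists when the xid lookup fails) could be sketched roughly as follows. This is not the vps.c code. `process_exists`, `classify_lookup`, `xid_lookup_fn` and the `xid_status` values are hypothetical names introduced for illustration, and the real util-vserver lookup is deliberately replaced by a caller-supplied callback. Only the standard POSIX `kill(pid, 0)` existence probe is assumed. Under Linux-VServer, whether `kill()` can see a given pid may also depend on the context the caller runs in, so treat the result as an approximation.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Possible outcomes when annotating one ps line with a context id. */
enum xid_status { XID_OK, XID_GONE, XID_ERR };

/* Hypothetical lookup callback: returns 0 and fills *xid on success,
 * -1 on failure. A stand-in for whatever lookup vps really performs. */
typedef int (*xid_lookup_fn)(pid_t pid, unsigned int *xid);

/* Returns 1 if a process with this pid currently exists, 0 otherwise.
 * kill() with signal 0 performs only the existence/permission check. */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    /* EPERM: the process exists but we may not signal it. */
    return errno == EPERM;
}

/* Distinguish "process vanished after the ps snapshot" from
 * "process is still there but the lookup failed". */
enum xid_status classify_lookup(pid_t pid, xid_lookup_fn lookup,
                                unsigned int *xid)
{
    if (lookup(pid, xid) == 0)
        return XID_OK;
    return process_exists(pid) ? XID_ERR : XID_GONE;
}
```

With a split like this, only the `XID_ERR` case (lookup failed while the pid is still present) would point at the syscall path rather than at the ps/vps race. A pid can in principle be reused between the two checks, so even that result is a hint rather than proof.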
1440646133 Q * yang charon.oftc.net helix.oftc.net
1440646133 Q * tokkee charon.oftc.net helix.oftc.net
1440646133 Q * DLange charon.oftc.net helix.oftc.net
1440646210 J * tokkee tokkee@osprey.tokkee.org
1440646210 J * DLange ~DLange@dlange.user.oftc.net
1440646210 J * yang yang@yang.netrep.oftc.net
1440650188 M * Bertl_oO off to bed now ... have a good one everyone!
1440650190 N * Bertl_oO Bertl_zZ
1440658008 J * Ghislain ~aqueos@adsl1.aqueos.com
1440659654 J * nikolay ~Nikolay@84.207.230.65
1440659722 Q * nikolay Read error: Connection reset by peer
1440659744 J * nikolay ~Nikolay@199.91.137.248
1440660052 Q * derjohn_mob Ping timeout: 480 seconds
1440661741 J * derjohn_mob ~aj@fw.gkh-setu.de
1440662591 M * Ghislain marcfiu: do you have honor privacy guest turned ON? i wonder if that can be it
1440674535 N * Bertl_zZ Bertl
1440674537 M * Bertl morning folks!
1440676761 Q * fstd Remote host closed the connection
1440676773 J * fstd ~fstd@xdsl-87-78-80-90.netcologne.de
1440677028 Q * Roomster Remote host closed the connection
1440677054 J * Romster ~Romster@202.168.100.149.dynamic.rev.eftel.com
1440680919 M * Ghislain yop
1440685570 J * marcfiu ~Adium@pool-98-110-124-150.cmdnnj.fios.verizon.net
1440686420 Q * marcfiu Read error: No route to host
1440687234 Q * derjohn_mob Ping timeout: 480 seconds
1440688647 J * bonbons ~bonbons@2001:a18:224:1:e82e:380:87cc:e0d6
1440690294 J * derjohn_mob ~aj@88.128.80.134
1440691487 Q * nikolay Ping timeout: 480 seconds
1440693432 Q * derjohn_mob Ping timeout: 480 seconds
1440705880 J * derjohn_mob ~aj@tmo-110-233.customers.d1-online.com
1440706555 Q * bonbons Quit: Leaving
1440709281 Q * Ghislain Quit: Leaving.
1440709736 M * clopez Bertl: what happened to the website? http://linux-vserver.org/
1440709869 M * daniel_hozac looks okay to me?
1440709963 M * clopez ok.. now is ok.. just some minutes ago it was showing a fritz!box default webpage
1440710054 M * clopez by the way.. i got some folks asking for the removal of util-vserver from debian because "it only supports kernels up to 3.18"... http://bugs.debian.org/797064
1440710081 M * clopez of course i'm not going to let that happen... but i would like to know future plans about supporting newer kernels
1440710089 M * clopez just to reply them
1440710099 M * clopez if you have any idea, i would like to know it
1440711033 M * Bertl well, debian was always behind
1440711054 M * Bertl if somebody builds a recent util-vserver version, it surely supports 4.x kernels as well
1440711097 M * Bertl (so not a problem on our side)
1440712145 M * Bertl off for now ... bbl
1440712151 N * Bertl Bertl_oO
1440719962 Q * fstd Remote host closed the connection
1440719973 J * fstd ~fstd@xdsl-87-78-10-11.netcologne.de