1540181920 J * DelTree ~deplagne@2a00:c70:1:213:246:56:18:2
1540181921 Q * DelTree_ Read error: Connection reset by peer
1540188373 J * _pa ~pav@46-185.dsl.iskon.hr
1540193373 J * hijacker ~nikolay@external.oldum.net
1540195091 M * Ghislain the issue is that 4.4 taking the config from my 4.9 does not boot, so for now my test is not working.
1540195904 M * Ghislain a friend tried the stress-ng thing on a non-vserver kernel and it does not crash the shell or any other process
1540198441 M * Ghislain anyone got a debian stretch non-vserver and can try "stress-ng --metrics-brief --verbose --timeout 60 --resources <put here nbcore x3>" as a non-root user and tell me if it works or crashes their shell ?
1540198997 M * AlexanderS Ghislain: I can check this.
1540199007 M * Ghislain cool
1540199117 M * AlexanderS (But I currently only have a VirtualBox guest.)
1540199194 M * AlexanderS Ghislain: Seems to work: https://pastebin.com/0hy9v67c (Linux debian 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u6 (2018-10-08) x86_64 GNU/Linux)
1540199218 M * Ghislain seems a vserver thing then
1540199508 M * AlexanderS The crash happens on the host with vserver kernel?
1540199523 M * AlexanderS Or only inside a guest?
1540199546 M * Ghislain the issue is that i am unable to get a stretch host to work
1540199551 M * Ghislain my hosts are jessie
1540199565 M * Ghislain so the stress-ng in the host is not the same version
1540199655 M * AlexanderS Maybe just chroot inside a (not running) vserver-/ and use the stretch userland there?
1540199663 M * Ghislain btw 4.4.165: compile, boot, testfs, testme, ss -l ok
1540199673 M * Ghislain will try that
1540199735 M * Ghislain yes, same, it crashes the shell
1540199823 M * AlexanderS Hmm, I just booted 4.9.135 and stress-ng seems to work, too: https://pastebin.com/1eBMmASV
1540199871 M * Ghislain in host ?
1540199882 M * Ghislain or in guest ?
1540200010 M * Ghislain you have only one core, could you try 6 ? stress-ng --metrics-brief --verbose --timeout 60 --resources 6
1540200193 M * AlexanderS Host.
1540200355 M * AlexanderS Oh "--resources 6" indeed crashes the shell.
1540200379 M * Ghislain oh so you get it on the host too, ok
1540200408 M * Ghislain so vserver guest and host have the issue.
1540200425 M * Ghislain could you retry on a non-vserver one ?
1540200427 M * AlexanderS Just trying again on the stock debian kernel.
1540200443 M * Ghislain thx
1540200519 M * Ghislain you can raise the number as high as you like, it just puts more load on the system
1540200661 M * AlexanderS The stock kernel does not seem to crash.
1540201078 M * Ghislain atomic_t is integer, is it not a 64bit/32b issue ?
1540202008 M * Ghislain bertl, in cvirt_proc.h, vx_info_proc_cvirt returns an int but the formula adds (unsigned long long)cvirt->bias_uptime.tv_sec, so should it not return a long long unsigned int ? (sorry i dont know c)
1540202050 M * Ghislain that is probably just me, because it seems to be a string
1540202071 M * Ghislain but the definition says int vx_info_proc_cvirt, i must be missing something
1540202568 M * Ghislain atomic_t total_forks; /* number of forks so far */
1540202586 M * Ghislain unsigned long total_forks; /* Handle normal Linux uptimes. */ fork.c mainline
1540202597 M * Ghislain could it be a 32/64b issue there ?
1540203117 Q * Ghislain Ping timeout: 480 seconds
1540207149 Q * romster Quit: Leaving
1540208314 J * romster ~romster@158.140.215.184
1540213782 Q * _pa Ping timeout: 480 seconds
1540215550 J * _pa ~pav@141-136-143-69.dsl.iskon.hr
1540216557 Q * _pa Ping timeout: 480 seconds
1540216943 J * _pa ~pav@141-136-157-103.dsl.iskon.hr
1540219573 J * Ghislain ~ghislain@81.56.195.31
1540220202 M * Ghislain hi, sorry my internet was down for hours
1540220222 M * Ghislain if anyone responded to my talk i did not get the answer
1540220238 M * AlexanderS Nobody responded. :-(
1540220430 Q * _pa Read error: Connection reset by peer
1540220457 J * _pa ~pav@141-136-157-103.dsl.iskon.hr
1540220459 M * AlexanderS Ghislain: The (unsigned long long) part in vx_info_proc_cvirt is not part of the calculation of length. It's part of a value in the string in the buffer.
1540220815 M * Bertl_oO Ghislain: an unsigned long long difference is printed into a buffer, the number of characters used for that is returned
1540220839 M * Bertl_oO there is no (real) relation between those two
1540220843 J * _pa_ ~pav@43-155.dsl.iskon.hr
1540220870 Q * _pa_
1540221118 M * Ghislain ok thx
1540221184 M * Ghislain i think the stress-ng crash is a 32b/64b issue, it feels like a counter overflow. did you see the remark about atomic_t total_forks; should it be atomic64_t total_forks; ?
1540221247 Q * _pa Ping timeout: 480 seconds
1540221322 M * Bertl_oO the cvirt structures do not apply to the host context, so if you observe the issue on the host too, it cannot be a problem with the cvirt limits or accounting
1540221521 M * AlexanderS It might be a problem of the oom-killer. Are there any vserver specific changes?
1540221548 M * Bertl_oO yes, there are, and that is my suspicion as well
1540221560 M * Bertl_oO it might be a good start to enable related debug messages
1540222251 M * AlexanderS Ghislain: All processes of the user running stress-ng are killed, can you confirm this?
1540222273 J * _pa ~pav@32-206.dsl.iskon.hr
1540222276 M * Ghislain every process it can kill is killed
1540222288 M * Ghislain so root kills everything, a user kills all its own processes
1540222517 M * Ghislain i think _vx_cvirt could be reviewed. I see that load is unsigned long, which sounds like 64bit on a 64b system, and the _vx_cvirt load is atomic_t
1540222571 M * Bertl_oO well, that's the mysterious part, because the OOM killer is not restricted to 'reachable' processes
1540222637 M * Ghislain its way out of my league but how can i help here :) ?
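An aside on the vx_info_proc_cvirt question above. The following is a minimal C sketch of the point AlexanderS and Bertl_oO make: a formatter can print an unsigned long long into a buffer and still return a plain int, because the return value is the character count, not the printed number. The helper name format_uptime and the "uptime: " label are hypothetical and chosen for illustration; this is not the Linux-VServer code, whose body is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a /proc-style formatter (an assumption, not
 * the real vx_info_proc_cvirt). It prints a 64-bit value into buf and
 * returns how many characters the formatted text occupies.
 */
static int format_uptime(char *buf, size_t size, unsigned long long seconds)
{
    assert(buf != NULL && size > 0);

    /*
     * snprintf returns the length of the formatted text as an int.
     * The width of `seconds` affects which digits are printed, not
     * the type of the return value.
     */
    return snprintf(buf, size, "uptime: %llu\n", seconds);
}
```

Calling format_uptime with 5000000000ULL, a value that does not fit in 32 bits, writes "uptime: 5000000000" plus a newline and returns 19, the length of that text.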
1540222715 M * Ghislain load average is also an unsigned long from what i read here https://github.com/torvalds/linux/blob/master/kernel/sched/sched.h
1540222728 M * Ghislain feels like reading chinese to me :)
1540222774 M * Bertl_oO is there any real evidence that there might be a 32/64bit issue anywhere?
1540222798 M * Ghislain i dont think this is OOM, it is that the program kills an id > 32bit that is changed to 0xFFFFFFFF
1540222813 M * Ghislain and does kill on 0xFFFFFF kill all processes ?
1540222841 M * Ghislain the trace shows it kills the thread id and then starts to kill 0xfffffffffffff
1540222844 M * Bertl_oO the OOM killer picks the 'best choice' and kills it
1540222858 M * Ghislain there is no oom here i think
1540222867 M * Bertl_oO if that doesn't help, it will continue killing processes
1540222896 M * Bertl_oO there is nothing in the Linux-VServer code which 'adds' a 'kill process'
1540222964 M * Ghislain stress-ng does, but what if it "kills" a process with a 64bit number that is > the 32bit space
1540222997 M * Ghislain what happens? it is converted to 32bit in _vx_cvirt, no ? then it is 0xFFFFFFFFF
1540223064 M * Bertl_oO as I mentioned before, the entire *cvirt part does not apply to the host context
1540223085 M * Bertl_oO the host context does not even have a *cvirt struct to begin with
1540223119 M * Ghislain ok i see what you mean
1540223122 M * Bertl_oO so, if it happens on the host, forget about any context related structures, they are not active and not applicable
1540223165 M * Bertl_oO it might still be a problem with the signal code path changes or the OOM killer changes
1540223195 M * Bertl_oO most of those changes only 'block' actions and prevent specific tasks from receiving signals or being OOM killed
1540223257 M * Bertl_oO I think the only way to narrow this down is to remove certain parts of the Linux-VServer patch and test without the modifications
1540223276 M * Bertl_oO i.e. remove all the scheduler stuff, test, remove all the signal handling stuff, test
1540223333 Q * hijacker Quit: Leaving
1540223375 M * Ghislain ok i see, independently of that the cvirt struct should be consistent with the non-virt ones, no ?
1540223380 M * Ghislain i'll get into some testing
1540223701 M * Ghislain ok i rebuilt stress-ng from source to have the same version on host and guest
1540223708 M * Ghislain host did not crash
1540223731 M * Ghislain trying the guest
1540223867 M * Ghislain did not crash the guest.. damn ! the package version does crash it, not the compiled version from the github
1540224785 M * AlexanderS Same version?
1540224928 M * Bertl_oO so maybe debian 'improved' something :)
1540224941 M * Bertl_oO (wouldn't be the first time :)
1540225042 M * DelTree debian only improves the prng, do they not ?
1540228708 J * _pa_ ~pav@12-85.dsl.iskon.hr
1540228723 Q * _pa Ping timeout: 480 seconds
1540229083 N * _pa_ _pa
1540234827 Q * _pa Quit: Leaving
1540235021 J * _pa ~pav@12-85.dsl.iskon.hr
1540237412 J * icecold ~icecold@58.6.82.144
1540238528 Q * icecold Quit: Leaving
1540241329 Q * any0n Ping timeout: 480 seconds
1540241461 J * any0n ~k@7YZAAAQ1P.tor-irc.dnsbl.oftc.net
1540243137 Q * any0n Remote host closed the connection
1540243174 J * any0n ~k@7YZAAAQ2F.tor-irc.dnsbl.oftc.net
1540250475 J * fstd_ ~fstd@xdsl-84-44-228-193.netcologne.de
1540250922 Q * fstd Ping timeout: 480 seconds
1540250922 N * fstd_ fstd
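On Ghislain's question of whether a kill on 0xFFFFFFFF hits all processes, here is a small C sketch under stated assumptions. pid_t is a signed 32-bit int on Linux, so the bit pattern 0xFFFFFFFF stored in a pid_t reads back as -1, and POSIX specifies that kill(-1, sig) sends the signal to every process the caller has permission to signal. That would match the symptom reported in the channel (root kills everything, a user kills its own processes), but it is not a confirmed diagnosis of the stress-ng crash. The helper names pid_from_bits and classify_kill_target are hypothetical, and nothing here actually sends a signal.

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Reinterpret a 32-bit pattern as a pid_t, the way a narrowing store
 * would. Out-of-range conversion to a signed type is implementation
 * defined in C; GCC and Clang wrap modulo 2^32, so 0xFFFFFFFF gives -1.
 */
static pid_t pid_from_bits(unsigned int bits)
{
    /* This sketch assumes a 32-bit pid_t, as on Linux. */
    assert(sizeof(pid_t) == sizeof(unsigned int));
    return (pid_t)bits;
}

/* Who a kill(pid, sig) call would address, following the POSIX rules. */
enum kill_target {
    ONE_PROCESS,      /* pid > 0: exactly that process                */
    OWN_GROUP,        /* pid == 0: the caller's process group         */
    EVERY_PERMITTED,  /* pid == -1: every process the caller may hit  */
    ONE_GROUP         /* pid < -1: the process group numbered -pid    */
};

static enum kill_target classify_kill_target(pid_t pid)
{
    if (pid > 0)
        return ONE_PROCESS;
    if (pid == 0)
        return OWN_GROUP;
    if (pid == -1)
        return EVERY_PERMITTED;
    return ONE_GROUP;
}
```

With these helpers, classify_kill_target(pid_from_bits(0xFFFFFFFFu)) yields EVERY_PERMITTED, while an ordinary positive pid yields ONE_PROCESS.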