1155427495 J * lylix ~eric@dynamic-acs-24-154-53-234.zoominternet.net
1155427557 M * lylix anyone successfully using the token bucket scheduler?
1155427577 Q * Wonka Ping timeout: 480 seconds
1155427597 M * Skram at one point I did
1155427640 M * lylix should the actual number of tokens be reflected in /proc/virtual/XID/sched?
1155427658 M * Skram To tell the truth, I forget
1155427677 M * lylix k, so prolly no pointers on the config/setup
1155427690 M * doener lylix: I think they should
1155427705 M * lylix i have valid settings in /etx/vservers/$VSERVER/schedule
1155427711 M * Skram Im looking for the page that says it
1155427721 M * lylix as it gets properly populated in the proc/sched file
1155427749 M * lylix and also have 'sched_hard' in /etc/vservers/BLAH/flags
1155427791 M * lylix and i ran an infinite loop w/ bzip and the tokens in proc never changed
1155427809 M * lylix actually... not really a loop... just the silly thing from the wiki page:
1155427812 M * doener lylix: which kernel version?
1155427817 M * lylix cat /dev/zero | bzip2 | bzip2 | bzip2 > /dev/null
1155427829 M * lylix 2.6.15
1155427835 M * lylix vserver 2.0.1
1155427898 M * ekc2 do you have CONFIG_VSERVER_HARDCPU=y in kernel config?
1155427916 M * lylix oh boy.. checking...
1155427980 M * ekc2 at least in 2.1.1 you need that set for sched_hard
1155427996 M * lylix i see it in 2.0.1
1155428001 M * lylix not set, so thatll do it
1155428014 M * lylix now... sched_prio stills works right?
1155428053 M * ekc2 yes. depending on what flags you have set: sched_hard, sched_prio
1155428110 M * lylix k, how effective is the priority scheduling compared to hard limit?
1155428194 M * ekc2 haven't used sched_prio. but, if you use it you should limit nproc
1155428252 M * lylix makes sense... k, since you are using sched_hard, is it advisable to use the "Limit the IDLE task" option?
1155428300 J * Roey ~katz@h-69-3-4-130.mclnva23.covad.net
1155428306 M * ekc2 i'm trying to get that working right now. it's not enabled by default unless you set a flag from userspace
1155428323 M * ekc2 so, it doesn't hurt to enable it
1155428357 M * lylix i know the vsched prog fron bertl can handle setting idle fill/interval, is this one in the same?
1155428414 M * ekc2 bertl said that you need that option enabled in the devel branch to set the idle fill/interval
1155428444 M * lylix k, not ready for devel branch yet... how stable is it?
1155428456 M * lylix i havent had time to bunk it on an extra box and test
1155428473 M * ekc2 it's very solid for me. works great
1155428486 M * lylix hmmm, food for thought then.
1155428515 M * lylix the only annoyance i get sometimes is the mountpoints in namespaces
1155428535 Q * dna Quit: Verlassend
1155428537 M * lylix wrote a quick script to handle that, becuz we use LVM partition per vserver guest
1155428564 M * lylix and destorying a vserver/lvm is sometimes a real pain
1155428633 M * lylix nothing more than a vnamespace -e BLAH umount $DIR thru all running vservers found in /var/run/vservers
1155428699 M * lylix would be nice if the vserver BLAH stop command unmounted non-shared mounts for that vserver in all namespaces automatically
1155428869 Q * ekc2
1155428886 J * ekc2 ~EKC@netblock-66-245-252-180.dslextreme.com
1155429348 M * ekc2 Does anyone know why idle time fails to advance (stuck at 0) when idle_rate/ide_interval=50/300 but works fine when idle_rate/idle_interval=1/6?
1155429431 M * ekc2 very puzzling
1155430646 M * bj arg... problems with "cannot set rlimit: Operation not permitted" already tired ULIMIT="-H -u 256 -n 65536", CAP_SYS_RESOURCE but it won't run ;/
1155430865 Q * Skram Remote host closed the connection
1155431033 M * bj anyone got an idea to that ?
1155431203 J * Skram ~Mark@hermes.sentiensystems.com
1155431392 Q * meandtheshell Remote host closed the connection
1155432110 N * Bertl_oO Bertl
1155432115 M * Bertl evening folks!
1155432140 M * Bertl bj: who is trying to set what rlimit?
1155432195 M * Bertl lylix: I think it is better to unmount excessive mounts on startup, than on shutdown
1155432217 M * Bertl lylix: this is what the tools with daniel_hozac's patches should do
1155432223 M * bj hey Bertl :)
1155432254 M * Bertl ekc2: sounds strange, are you sure about that?
1155432259 M * bj Bertl: the bad application called "Postfix Policy Daemon" .oO(apt-cache show postfix-policyd)
1155432280 M * Bertl postfix has a policy daemon?
1155432309 M * Bertl anyway, which rlimit does it hit?
1155432317 M * bj found a discussion where you already discussed a similar thing with somebody but I didn't manage to get it running (http://www.paul.sladen.org/vserver/irc-logs/200312/vserver.2003-12-02.txt)
1155432337 M * bj Bertl: yep the thing is actually quite neat but bad(tm) in vserver env ;/
1155432351 M * Bertl well, that was two and a half year ago, I doubt that anything from back then is still valid
1155432363 M * Bertl (the entire limit stuff changed a lot since back then)
1155432376 M * bj Bertl: dunno just get the message : cannot set rlimit: Operation not permitted
1155432385 M * bj ahso
1155432392 M * Bertl okay, then let's get strace working and check it out
1155432408 M * bj hm good idea I'll give it a try
1155432436 M * ekc2 bertl: yes, positive. i was going nuts trying to get idle time to advance with rate/interval = 1/1000 and rate_idle/interval_idle=50/300; on a whim I tried rate_idle/interval_idle=1/6 and it worked.
1155432448 M * ekc2 behaviour is repeatable
1155432492 M * ekc2 bucket minimum=50; max size=500
1155432502 M * Bertl okay, sounds like a bug then ... do you have vserver debugging enabled?
1155432509 M * ekc2 dual cpu system
1155432511 M * ekc2 yes, i do
1155432521 M * ekc2 what would you like me to do?
1155432530 M * Bertl give me a moment to check which messages are of interest
1155432565 M * bj Bertl: setrlimit(RLIMIT_NOFILE, {rlim_cur=4097, rlim_max=4097}) = -1 EPERM (Operation not permitted)
1155432573 M * bj Bertl: thats what I get from the strace
1155432582 M * Bertl bj: okay, that sounds like a limit from your host
1155432586 M * bj hmm
1155432603 M * Bertl but let me put it this way, what does postfix policyd do with 4k file handles?
1155432617 M * bj i got no clue ;(
1155432621 M * Bertl bj: check it on the host with ulimit -aH
1155432627 M * bj Bertl: -n: file descriptors 1024
1155432642 M * Bertl so that is your upper limit inherited from the host
1155432652 M * bj already did that and also tried a echo 65536 > /proc/sys/fs/file-max
1155432654 M * Bertl if you raise that, then restart the guest, it will allow that
1155432673 M * Bertl is independant from the kernel parameters (i.e. sys)
1155432680 M * bj hmm
1155432695 Q * Skram Remote host closed the connection
1155432696 M * Bertl try doing ulimit -HS -n 8192 on the host
1155432703 M * bj ahhh ok I'll give it a try
1155432717 M * Bertl that's debian, right?
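The EPERM in the strace output above fits the usual setrlimit rules: an unprivileged process may move its soft limit anywhere up to the hard limit, but raising the hard limit needs CAP_SYS_RESOURCE. The Python sketch below only models that rule and never changes a limit itself. The helper name `setrlimit_allowed` and its `privileged` flag are made up for illustration and are not part of any vserver tool.

```python
import math
import resource


def _norm(limit):
    """Map RLIM_INFINITY to math.inf so limits compare numerically."""
    return math.inf if limit == resource.RLIM_INFINITY else limit


def setrlimit_allowed(req_soft, req_hard, cur_hard, privileged=False):
    """Hypothetical model of whether setrlimit would accept a request."""
    req_soft, req_hard, cur_hard = _norm(req_soft), _norm(req_hard), _norm(cur_hard)
    if req_soft > req_hard:
        return False  # the kernel would answer EINVAL
    if req_hard > cur_hard and not privileged:
        return False  # the kernel would answer EPERM
    return True


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("current RLIMIT_NOFILE (soft, hard):", soft, hard)
    # The request from the strace, against the hard limits seen in the log.
    print("4097/4097 with hard limit 1024:", setrlimit_allowed(4097, 4097, 1024))
    print("4097/4097 with hard limit 8192:", setrlimit_allowed(4097, 4097, 8192))
```

With the inherited hard limit of 1024 the request for 4097 is refused, which is why raising the limit on the host before starting the guest is the suggested way out.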
1155432738 M * bj jepp
1155432754 M * Bertl if you figure where that default is applied, please drop me a note
1155432765 M * Bertl (it's not the first time that we hit this issue)
1155432793 M * bj Bertl: already read that in faq (http://linux-vserver.org/ProblematicPrograms) and posted at the end of the doc something regarding that
1155432810 M * bj Bertl: weird why all thos debs seem to need CAP_SYS_RESOURCE ;(
1155432851 M * Bertl yeah, strange, especially as it is a user resource and should be assigned per user
1155432858 J * Skram ~Mark@hermes.sentiensystems.com
1155432920 M * bj Bertl: hmm so ulimit -Ha (host) gives me the 8192 but after a restart of the guest the app still has the issue and ulimit is giving me back still the 1024 ;(
1155432936 Q * nokoya Server closed connection
1155432956 J * nokoya young@hi-230-82.tm.net.org.my
1155432962 M * Bertl what does ulimit -Sa say on the host?
1155433010 M * Bertl and how do you restart the guest? i.e. from the same account you changed the limits or from a different one?
1155433034 J * Skram_ ~Mark@hermes.sentiensystems.com
1155433077 M * bj Bertl: file descriptors 8192, -c: core file size (blocks) 0, -s: stack size (kbytes) 8192, the rest is unlimited on the host
1155433083 M * Skram sowwy
1155433090 Q * Skram_
1155433095 M * bj Bertl: vserver $name restart (same account where I changed the limits)
1155433118 M * Bertl and you didn't add any ulimit file to the config, yes?
1155433194 M * bj oh interesting, I restarted it from one pseudo tty but entered it in another (there was the app still with 1024) if I enter from the one where I restarted I get the 8192
1155433200 M * bj different error now :))
1155433222 M * Bertl good, which one?
1155433233 M * bj Bertl: tcp 0 0 10.0.0.4:10031 0.0.0.0:* LISTEN 17000/postfix-polic
1155433238 M * bj Bertl: solved it thanks a lot :)
1155433247 M * Bertl okay, great! :)
1155433253 M * bj !!!! :)))
1155433263 M * bj ok lets update the wiki
1155433299 M * Bertl please do that, but keep in mind that the settings on the host might different from distro to distro
1155433336 M * Bertl ekc2: have to check that, we currently have no good debug statements where I would need them, except for the scheduler monitor, which would be overkill
1155433359 M * Bertl ekc2: give me a few minutes to try to reproduce it here
1155433383 M * bj Bertl: hm ok I'll try to be verbose hope it helps the others
1155433428 M * ekc2 bertl: ok. well, i'm running the latest dev branch. so, if you want me to run a 'debug' version, i can
1155433443 M * Bertl okay, good to know ...
1155433486 M * Bertl do you have the kernel tree sitting there in a compiled form? i.e. how long would it take to (re)compile with a few minor changes?
1155433508 M * ekc2 ~10 min?
1155433522 M * Bertl okay, so tree is there, and probably built?
1155433527 M * ekc2 yes
1155433540 M * ekc2 but, i'm pxe-booting so i have to do a few other things
1155433553 M * Bertl okay, so it should be far less than, as the kernel only compiles what has changed ...
1155433615 M * ekc2 yes, but i have some kernel dependencies i have to rebuild. it's all automated. but it takes about 10min
1155433636 M * Bertl okay, let me first check if I can reproduce it, because that would be better anyways
1155433658 M * Bertl scheduler args set with vsched-0.03 I presume?
1155433670 M * ekc2 yup
1155433685 M * ekc2 /vnfs/scripts/vsched/vsched -f -i -x 12 -u -1 -I 300 -R 50 -M 50 -S 500
1155433693 M * ekc2 /vnfs/scripts/vsched/vsched -f -x 12 -u -1 -I 1000 -R 1 -M 50 -S 500 -i
1155433702 M * ekc2 (entered in reverse order)
1155434924 M * bj does anyone from you know where I gut the hideous "ulimit -HS -n 8192" ? Already tried /etc/vserver/$vserver/$vservername.conf or /etc/vserver/$vserver/ulimit or /etc/vserver/$vserver/ulimits (of course with in them ULIMIT="-HS -n 8192")
1155434930 M * bj s/gut/put/
1155434988 M * Bertl ekc2: what's that -1 supposed to do?
1155435026 M * ekc2 i want the command to apply to all cpu's (i have two); is that not how I address all cpu's?
1155435051 M * Bertl ah, okay, got it ...
1155435259 M * Bertl and which R/I settings make the idle time stop for you?
1155435278 M * Bertl 50/300?
1155435323 M * Bertl yep, seems so
1155435471 M * Bertl it seems to be running quite fine here, could you show me a few iterations of /proc/virtual//sched when it is stopped for you?
1155435506 M * Bertl and more important, does the guest do enough to 'cause' idle time?
1155435510 Q * ekc2 Ping timeout: 480 seconds
1155435557 J * ekc2 ~EKC@netblock-66-245-252-180.dslextreme.com
1155435576 M * ekc2 bertl: the 50/300 idle time causes the problems
1155435596 M * Bertl okay, did you read/get my other comments?
1155435605 M * ekc2 yes, doing that now
1155435615 M * ekc2 the guest is running three cpuhog threads
1155435627 M * Bertl okay, that sounds good :)
1155436212 M * ekc2 here's what /proc/virtual//sched is showing me while the guest is running cpuhog: http://rafb.net/paste/results/phrpRL35.html
1155436310 M * Bertl funny thing ...
1155436317 M * ekc2 and here's what happens after i change rate_idle/interval_idle from 50/300 to 1/6: http://rafb.net/paste/results/nDVsfj69.html
1155436351 M * ekc2 and when I change it back to 50/300, idle time freezes and both cpu's go on hold (for that guest)
1155436365 M * Bertl what if you 'just' change it to 49/300 or 50/299?
1155436442 M * ekc2 same behaviour. idle time freezes
1155436478 M * ekc2 1/300 works, though
1155436558 M * ekc2 very odd. 1/10 suits my needs, though. so, it's not a show-stopper.
1155436589 M * Bertl strange thing is, that it doesn't do that here
1155436616 M * Bertl is that x86 or x86_64?
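The rate/interval pairs tried so far can be compared as plain ratios. The sketch below assumes the usual token bucket reading, in which a context receives `fill_rate` tokens every `interval` ticks and one token pays for one tick of CPU; the function name `cpu_share` is made up for illustration. It shows that 50/300 and 1/6 describe the same long-run share, so whatever separates the stalling case from the working one is the size of each refill step, not the ratio.

```python
from fractions import Fraction


def cpu_share(fill_rate, interval):
    """Long-run fraction of ticks a context may run (one token per tick assumed)."""
    return Fraction(fill_rate, interval)


# Settings mentioned in the log, written as fill_rate/interval.
settings = {
    "1/1000": (1, 1000),
    "50/300": (50, 300),
    "1/6": (1, 6),
    "1/300": (1, 300),
    "1/10": (1, 10),
}

for label, (rate, interval) in settings.items():
    share = cpu_share(rate, interval)
    print(f"{label:>7}: {float(share):7.2%} of one CPU, "
          f"{rate} token(s) every {interval} ticks")
```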
1155436639 M * ekc2 x86_64
1155436647 M * Bertl hmm
1155436648 M * ekc2 debian_amd64
1155436706 M * Bertl CONFIG_VSERVER_IDLELIMIT is disabled or enabled?
1155436727 M * ekc2 enabled
1155436769 M * ekc2 hold on. testing something
1155436947 M * ekc2 ok. when I freshly boot a system and start a vserver with 50/300 idle time, this behaviour appears consistently.
1155436980 M * Bertl but? (that sounds like a but :)
1155437036 M * ekc2 well, when you had me try 49/300, 51/300, i also tried 1/300 and many other values. then I went back to 50/300 and it worked. i just rebooted my system and idle time is stalling again
1155437056 M * Bertl ah, that could be a valuable hint
1155437075 M * ekc2 yes, but i can't get 50/300 working now.
1155437219 M * Bertl okay, let's try a few commands on your system, you got the vsched 0.03 and vcmd too?
1155437267 M * ekc2 ok. got it working again: here's what I did after I had 50/300 set: http://rafb.net/paste/results/5p91j526.html
1155437307 M * Bertl and that made it going again?
1155437315 M * ekc2 yes
1155437330 M * ekc2 yes, i've got vcmd and vsched-0.3
1155437431 M * Bertl okay, let's try the following:
1155437439 M * Bertl vcmd -i 42 -BC ctx_create .flagword=^34^8 -- cpuhog &
1155437505 M * Bertl vsched -f -x 42 -u -1 -I 1000 -R 1 -M 50 -S 500
1155437520 M * Bertl vsched -f -i -x 42 -u -1 -I 300 -R 50
1155437552 M * Bertl this gives me here happily advancing idle time
1155437590 M * Bertl (you might want to limit that to a single cpu for testing)
1155437613 M * ekc2 nope.idle time is stalled at 0
1155437697 M * ekc2 see: http://rafb.net/paste/results/fCYj2a56.html
1155437700 M * Bertl interesting, so token time advances at ~ 1 per second or so
1155437720 M * Bertl but idle time not, and the context is on hold, right?
1155437733 M * ekc2 yes. context is on hold. same as before
1155437737 M * ekc2 HZ=1000
1155437757 M * Bertl okay, let me try a few things here, maybe I can make my setup match that
1155437859 M * Bertl ah, and the system _is_ actually idle, yes?
1155437907 M * ekc2 vtop -i is listing cpuhog and top
1155438203 M * Bertl and plenty of idle time, I presume
1155438225 M * ekc2 how would I measure that?
1155438232 M * Bertl Cpu(s): 1.0% us, 2.9% sy, 0.0% ni, 95.5% id, 0.0% wa, 0.6% hi, 0.0% si
1155438237 M * Bertl id = idle
1155438247 M * Bertl (in the vtop view)
1155438309 M * Bertl (you might need to toggle that with 't', if you default hides it)
1155438419 M * ekc2 id=97.5%
1155438436 M * ekc2 cpuhog is halted, of course
1155438436 M * Bertl okay, so that actually gives a good clue
1155438457 M * Bertl it seems that the conditions for idle time skipping are not met (due to a bug :)
1155438488 M * Bertl which can actually have two causes, a) the check for unholding gets some calculations wrong
1155438515 M * Bertl or b) the criteria, like actually ahving tasks on hold are not met/checked
1155438542 M * ekc2 hmm. very subtle bug, though. because I can fiddle around with the idle rate/interval and after a several attempts it starts working
1155438578 M * Bertl yep, well, if it would be obvious, we would have had hit it earlier :)
1155438593 M * ekc2 yup. true.
1155438603 M * Bertl so I'm opting for the calculations going wrong somehow
1155438658 M * Bertl nevertheless, we should be able to track this down quite easy, as we have a 'special' trigger (the idle task) at hand
1155438740 M * ekc2 yes.
1155438743 M * Bertl we also know that the idle time does not advance at that point
1155438750 M * ekc2 right
1155438757 M * Bertl while the token time _and_ the tokens increase
1155438761 M * matti Bertl: ;-)
1155438768 M * Bertl hey matti!
1155438801 M * ekc2 right. there's no problem with the non-idle tokens increasing. they always go up by 1/second
1155438819 M * Bertl ekc2: so, now can we wait until the min value (50) is reached and see what happens then? probably nothing, i.e. the tokens are substracted and the process gets a short period of cpu, right?
1155438828 M * matti I need to ask.
1155438839 M * matti What both of you doing? :)
1155438846 M * matti Looks interesting.
1155438850 M * ekc2 right. i tried that. and nothing changes. cpuhog consumes all the tokens and the guest returns to halt
1155438872 M * Bertl matti: we are debugging the scheduler by 'thinking' about it :)
1155438879 M * ekc2 haha
1155438892 M * matti ;-)
1155438944 M * Bertl well, most people fire up a debugger or something and spend hour over hour looking for anomalies
1155438960 M * ekc2 ooooo. i found something
1155438967 M * Bertl 'we' on the contrary, use our intelectual ... yeah?
1155438990 M * ekc2 rate_idle/idle_interval = 50/300 is stuck at zero
1155439003 M * ekc2 so, i changed rate/interval to 10/300 and idle time advances
1155439020 M * ekc2 then I change rate/interval back to 50/300 and idle time advances
1155439044 M * ekc2 let me try this again to confirm
1155439086 M * doener hm, certain values cause it to die?
1155439137 M * Bertl hey doener! nobody diet yet :)
1155439151 M * doener everyone's getting fat? SCNR ;)
1155439157 M * Bertl lol
1155439193 M * matti LOL
1155439242 M * Bertl ekc2: do you have more than one context running or just a single one?
1155439247 M * ekc2 just one
1155439269 M * Bertl could you try to start a second one with the same values?
1155439273 M * ekc2 tried 10/300 again. 10/300 worked, 50/300 did not. and I had to go through half a dozen different values before 50/300 worked again
1155439275 M * doener Bertl: how to parse the cpu lines in /proc/..../sched?
1155439283 M * ekc2 ok
1155439335 M * Bertl doener: see sched_proc.h
1155439356 M * doener ah, .h, just grepped the .c ones...
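The hold and release cycle described above (tokens trickle in at 1 per second, the guest runs briefly once the minimum of 50 is reached, then halts again) can be reproduced with a toy model. This is a simplified sketch, not the kernel code: the `Bucket` class and `simulate` function are made up for illustration, it assumes one token is consumed per tick of CPU, and it ignores idle time skipping entirely.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    fill_rate: int         # tokens added per refill
    interval: int          # ticks between refills
    tokens_min: int        # tokens needed before a held context may run again
    tokens_max: int        # bucket capacity
    tokens: int = 0
    on_hold: bool = True


def simulate(bucket, ticks):
    """Advance the bucket tick by tick; return how many ticks the context ran."""
    ran = 0
    for tick in range(1, ticks + 1):
        if tick % bucket.interval == 0:
            bucket.tokens = min(bucket.tokens + bucket.fill_rate, bucket.tokens_max)
        if bucket.on_hold and bucket.tokens >= bucket.tokens_min:
            bucket.on_hold = False       # enough tokens saved up: release
        if not bucket.on_hold:
            bucket.tokens -= 1           # a CPU hog burns one token per tick
            ran += 1
            if bucket.tokens == 0:
                bucket.on_hold = True    # bucket empty: back on hold
    return ran


# rate/interval = 1/1000 with min 50 and max 500, as in the log; HZ=1000,
# so 60000 ticks stand for one minute.
bucket = Bucket(fill_rate=1, interval=1000, tokens_min=50, tokens_max=500)
ran = simulate(bucket, 60_000)
print("ticks run:", ran, "tokens left:", bucket.tokens, "on hold:", bucket.on_hold)
```

In this model the guest gets a 50 tick burst after 50 seconds of waiting and is then held again, which matches the observation that cpuhog consumes all the tokens and the guest returns to halt.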
1155439392 M * ekc2 with two contexts running, same behaviour
1155439413 M * ekc2 both have rate/interval=1/1000 and idle_rate/idle_interval=50/300
1155439422 M * Bertl okay, good, I thought it might be related to a scheduler starvation
1155439496 M * Bertl I think I know what happens, but I have to double check that with the source code ...
1155439968 M * Bertl yep, I guess I found it
1155439990 M * ekc2 yeah?
1155439998 M * Bertl the 'magic' condition is that the 'missing' tokens have to be greater than the fill_rate
1155440034 M * Bertl kernel/vserver/sched.c ~177
1155440050 M * Bertl seems I 'forgot' the else case :)
1155440094 M * Bertl let's see if that changes something for you :)
1155440105 M * ekc2 wow. i would have never found that
1155440124 A * doener was looking about 20 lines above that :)
1155440148 M * Bertl but let me also double check that I didn't miss something here
1155440174 M * Bertl this code part is quite tricky, and I did put unusually many comments there
1155440285 M * Bertl okay, doener , let's go through it with a few examples, shall we?
1155440328 A * doener is still trying to grasp that thing... never looked at it before
1155440350 M * Bertl okay, line ~100
1155440373 M * Bertl let's assume delta = 40 for a start
1155440387 M * Bertl interval[1] = 300
1155440406 M * Bertl so we set delta_min[1] to 40
1155440458 M * Bertl now we skip to ~175 (because nothing else happens before)
1155440483 M * Bertl tokens = tokens_min -tokens
1155440494 M * Bertl let's say we have 10 tokens in the bucket, 50 min
1155440504 M * Bertl so we get tokens = 40 too
1155440551 M * Bertl fill_rate[1] = 50, so nothing is changed here
1155440584 M * Bertl therefore it should lead to an idle time advancement of 40
1155440631 M * Bertl that's what I expected .. and which is correct
1155440668 M * Bertl now let's do another run, with delta > interval
1155440682 M * Bertl let's assume the delta is already 300
1155440702 M * Bertl >= is sufficient here
1155440707 M * doener where does that advancement happen? or is this what you intend to fix? (ie. missing else case)
1155440742 M * Bertl in the previous case, idle time was advanced by 40
1155440759 M * Bertl that would have had happened several times before we reach/cross 300
1155440784 M * Bertl (possibly unholding tasks and such, but let's ignore that for now)
1155440798 M * doener idle time as in sched_pc->idle_time?
1155440822 M * Bertl as in *idle_time, the sched_pc->idle_time is just for the delta
1155440846 M * Bertl so we have delta >= 300 here and do math/update
1155440857 M * Bertl tokens = 1
1155440872 M * Bertl integral = 1*50
1155440889 M * Bertl 1*300 (it's late :)
1155440906 M * Bertl tokens = 1*50
1155440922 M * Bertl delta_min[1] = 300-300 = 0
1155440951 M * Bertl sched_pc->idle_time is advanced, same with tokens
1155440981 M * Bertl this is a case we can never hit here, as we would see it as an idle time advancement, right?
1155441001 M * doener hm, the "idle time was advanced by 40" is as in "was advanced from the outside", yeah?
1155441010 M * Bertl yes
1155441017 M * doener ok, now I get it :)
1155441073 M * Bertl so, IMHO we must hit a case where the following is true:
1155441102 M * Bertl delta < sched_pc->interval[1] (300)
1155441136 M * Bertl in which case delta_min[1] = delta
1155441164 M * Bertl and then either tokens > sched_pc->fill_rate[1] (50)
1155441189 M * Bertl or nothing is changed
1155441207 M * Bertl now we know that tokens are _usually_ below that in ekc2's case
1155441243 M * doener what are these delta_min values used for?
1155441245 M * Bertl so the delta itself must be 0 here
1155441268 M * Bertl the delta_min values contain the _skip deltas
1155441282 M * Bertl i.e. how much the idle time is advanced for example
1155441322 M * doener ok, the last call in vx_try_unhold probably updates the rq->idle_time using that value, right?
1155441345 M * doener (vxm_rq_max_min)
1155441354 M * Bertl that's just the monitor
1155441382 M * Bertl vx_try_skip does that
1155441399 M * doener ok
1155441427 M * Bertl but only if the delta is non zero
1155441443 M * Bertl so we basically have a deadlock here :)
1155441475 M * doener if we hit a zero delta, we stay there, right?
1155441498 M * Bertl yep, time is not advanced, the queue is not unlocked
1155441525 M * Bertl if it is slightly off by something, everything starts working
1155441553 M * Bertl now, one important thing is that we actually _can_ advance idle time unconditionally when we get idle
1155441586 M * Bertl regardless of the fact that maybe no single context can be freed
1155441633 M * Bertl the important question here is now, how much time _should_ we advance in the optimal case
1155441666 M * Bertl and this is what the recalc() is supposed to provide, but it doesn't (in this case)
1155441814 M * Bertl so what we need to do is a single interval in the else case
1155441862 M * Bertl so I'd say we want at line 180
1155441887 M * Bertl else delta_min[1] += sched_pc->interval[1]
1155441930 M * Bertl (or probably, to be precise)
1155441946 M * Bertl delta_min[1] = sched_pc->interval[1] - delta_min[1];
1155441979 M * Bertl i.e. the required delta until we get at least one integral interval
1155442011 M * doener sounds reasonable
1155442013 M * Bertl does that make sense?
1155442081 M * doener just fast-forward to loose as little time as possible without handing out too many tokens
1155442085 M * Bertl testing an artificially 'configured' setup here without and with fix ... will take a minute or too
1155442097 M * Bertl doener: yep, precisely
1155442119 M * Bertl and the 'smallest' value over all suspended contexts is the winner
1155442165 M * Bertl ekc2: still awake? :)
1155442173 M * ekc2 yup. following along closely
1155442188 M * Bertl excellent, what's your opinion? does that match your case?
1155442205 M * ekc2 i think so. building kernel for test now
1155442217 M * Bertl excellent! tx!
1155442279 M * Bertl of course, we should do similar for the 'normal' hard scheduling case too, which suffers from the same issue, except for the fact that it is not visible as time advances without help :)
1155442380 M * Bertl looks good here, I can reproduce the issue and it is fixed by the change :)
1155442683 M * ekc2 bertl: amazing
1155442686 M * ekc2 it works!!
1155442690 M * ekc2 thanks :)
1155442714 M * Bertl excellent!, thank you!
1155442752 M * Bertl without your testing we wouldn't have been able to fix it ...
1155442790 M * ekc2 i guess. but i would never ever have found that bug. so subtle
1155442816 M * ekc2 i guess idle time isn't being used that widely yet
1155442834 M * Bertl no, naturally because of the missing tool support :/
1155442860 M * Bertl here is the complete patch for both cases:
1155442863 M * Bertl http://vserver.13thfloor.at/Experimental/delta-sched-fix01.diff
1155442873 M * ekc2 awesome
1155442890 M * Bertl but as I already said, the hardcpu hase does not really show up
1155442902 M * ekc2 right
1155442924 M * Bertl it just takes a little longer or slightly misadjusts the idle task
1155442959 M * Bertl nevertheless, would be great if you could give it a thorough spin with debugging enabled
1155442969 M * ekc2 oh, you fixed that, too. nice.
1155442985 M * ekc2 ok. will definitely do that tomorrow.
1155443000 M * Bertl just to see if something unusual pops up, scheduler stuff is always black magic ... youturn a knob and the machine changes 100deg
1155443029 M * Bertl yeah, no need to hurry on that, I think we fixed it properly
1155443058 M * doener somehow knob + machine + 100deg made "centrigrade" pop up in my brain :)
1155443063 M * ekc2 well, i'm going to be making heavy use of hard cpu limits with idle time. so, i'll let you know if i find anything
1155443094 M * Bertl please do so, we are always glad to receive feedback of any kind
1155443106 M * ekc2 thanks again, bertl! g'night all!
1155443112 Q * ekc2
1155443134 M * Bertl that's actually a good idea .. I guess I'm off to bed too :)
1155443145 M * doener good night then, will do the same :)
1155443152 M * Bertl yeah, cya!
1155443158 N * Bertl Bertl_zZ
1155444602 Q * s0undt3ch Ping timeout: 480 seconds
1155444627 M * ebiederm sleep well.
1155448813 J * daniel15 ~dansoftau@220-245-135-167-vic-pppoe.tpgi.com.au
1155448833 M * daniel15 Is there a way to install Redhat Linux on a VServer if the host is Debian?
1155449436 M * daniel15 ??
1155449691 M * ebiederm Why would it be a problem?
1155449760 M * daniel15 Well, I'm not sure how to
1155449776 M * daniel15 If I'm doing a Debian VServer, I'd do something like vserver test build ............. -m debootstrap -- -d etch
1155449797 M * daniel15 But I'm not sure of how to install other Linux distributions
1155449849 M * lylix do you need "redhat", or something like fedora/centos?
1155449930 M * daniel15 Well, I haven't decided what I'm doing yet (and don't know much about Redhat). Just something that's free, and works fine on a VServer, I suppose :)
1155449964 M * lylix you knwo how to use tar to untar .tar.gz files?
1155449969 M * daniel15 Yes
1155449981 M * lylix http://lylix.net/templates.html
1155449996 M * lylix there are images there to select from
1155450023 M * lylix all you have to do is untar them in a vserver directry, ie. /vserver/NAME/
1155450045 M * lylix and run the setup command to popluate /etc/vservers, and away you go
1155450058 M * daniel15 Oooh
1155450060 M * daniel15 Thank you :)
1155450068 M * lylix ;)
1155450083 M * lylix you dont need to boostrap
1155450092 M * lylix they are already built, etc
1155450104 M * daniel15 Just out of curiosity, if I wanted to install a distribution manually (from an installation CD), how would I do that?
1155450148 M * lylix heh... idk what the "best" way is... but we typically just install it on a spare box and transfer it into a tarball
1155450172 M * lylix but many times the created image requires many tweaks
1155450185 M * daniel15 OK, so I'm probably best off using your images
1155450188 M * daniel15 Thanks anyways
1155450189 M * daniel15 :)
1155450214 M * lylix unless its something uber-special, the images will get you started with a basic or LAMP enviro
1155450225 M * lylix and the package tools are there to build the system up
1155450232 M * lylix to do anything you want
1155450239 M * daniel15 OK, Thanks
1155450331 M * lylix oh, and dont untar them over your hosts root partition ;)
1155450384 M * lylix had a guy do that today actually... LOL!
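The warning about the host's root partition can be turned into a small guard. The sketch below builds, but does not run, a tar command that unpacks a template image into a guest directory and refuses a target that resolves to `/`. The function name `untar_command` is made up for illustration, and the tar flags (`-xzpf` with `--numeric-owner`) are a common choice for system images, not something taken from the log.

```python
import os


def untar_command(image, vdir_base, name):
    """Build (not run) a tar command that unpacks `image` into vdir_base/name."""
    if not name or "/" in name or name in (".", ".."):
        raise ValueError(f"bad guest name: {name!r}")
    target = os.path.realpath(os.path.join(vdir_base, name))
    if target == os.path.realpath("/"):
        raise ValueError("refusing to unpack over the host root")
    # -p keeps permissions; --numeric-owner keeps the uid/gid numbers
    # stored in the image instead of mapping names on the host.
    return ["tar", "--numeric-owner", "-xzpf", image, "-C", target]


# Hypothetical image path; /vserver/NAME is the layout mentioned in the log.
cmd = untar_command("/tmp/image.tar.gz", "/vserver", "NAME")
print(" ".join(cmd))
```

Returning the argument list rather than executing it leaves a chance to inspect the target before anything is written.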
1155450430 M * daniel15 :P
1155450436 M * daniel15 Anyways, talk to you later
1155450437 M * daniel15 I have to go
1155450440 Q * daniel15
1155450769 M * Skram :P
1155451086 Q * tokkee Server closed connection
1155451088 J * tokkee tokkee@casella.verplant.org
1155452988 J * Viper0482 ~Viper0482@p5497658A.dip.t-dialin.net
1155453792 Q * gerrit Ping timeout: 480 seconds
1155454351 J * rgl Rui@217.129.151.190
1155454355 M * rgl hilo
1155454372 Q * matled_ Server closed connection
1155454372 J * matled ~matled@85.131.246.184
1155456162 J * bonbons ~bonbons@83.222.36.236
1155458134 Q * Curus_ Remote host closed the connection
1155458293 J * dna ~naucki@p54BCF423.dip.t-dialin.net
1155459818 J * debugger_ ~Rui@217.129.151.190
1155460232 J * pisco ~pampel@p5087A5F9.dip0.t-ipconnect.de
1155460236 P * pisco
1155460270 Q * rgl Ping timeout: 480 seconds
1155464510 Q * michal_ Ping timeout: 480 seconds
1155465438 J * michal_ ~michal@www.rsbac.org
1155465522 Q * DreamerC Quit: leaving
1155465541 J * DreamerC ~dreamerc@59-112-7-8.dynamic.hinet.net
1155465577 Q * DreamerC
1155465594 J * DreamerC ~dreamerc@59-112-7-8.dynamic.hinet.net
1155465626 Q * DreamerC
1155465718 J * DreamerC ~dreamerc@59-112-7-8.dynamic.hinet.net
1155467009 J * rgl Rui@217.129.151.190
1155467450 Q * debugger_ Ping timeout: 480 seconds
1155467874 J * coocoon ~coocoon@p54A07985.dip.t-dialin.net
1155468080 J * yarihm ~yarihm@84-74-17-70.dclient.hispeed.ch
1155468759 J * meandtheshell ~markus@85-124-37-184.dynamic.xdsl-line.inode.at
1155471022 J * dna_ ~naucki@p54BCF423.dip.t-dialin.net
1155471281 Q * dna Ping timeout: 480 seconds
1155471761 J * gerrit ~kvirc@dslb-084-060-254-162.pools.arcor-ip.net
1155471802 J * dna ~naucki@p54BCF423.dip.t-dialin.net
1155472106 Q * dna_ Ping timeout: 480 seconds
1155473146 J * dna_ ~naucki@p54BCF423.dip.t-dialin.net
1155473492 Q * dna Ping timeout: 480 seconds
1155474220 J * debugger_ ~Rui@217.129.151.190
1155474407 J * dna___ ~naucki@p54BCF423.dip.t-dialin.net
1155474516 J * dna ~naucki@p54BCD70C.dip.t-dialin.net
1155474680 Q * rgl Ping timeout: 480 seconds
1155474837 Q * dna_ Ping timeout: 480 seconds
1155474892 Q * dna___ Ping timeout: 480 seconds
1155476852 Q * phedny Ping timeout: 480 seconds
1155476873 J * s0undt3ch yllxilkc@bl7-242-233.dsl.telepac.pt
1155478448 N * Bertl_zZ Bertl
1155478452 M * Bertl morning folks!
1155478879 M * doener mornign Bertl
1155478886 M * doener s/ign/ing/
1155479805 Q * coocoon Quit: KVIrc 3.2.0 'Realia'
1155480041 J * BoSS ~RJ@85.104.50.113
1155480099 M * Bertl welcome BoSS!
1155480105 M * BoSS thx
1155480109 M * BoSS asl bertl
1155480110 M * BoSS ;D
1155480124 M * BoSS where i am now?
1155480133 M * BoSS who will tell me?
1155480136 M * mnemoc O_O
1155480155 M * Bertl BoSS: depends, you could be right or wrong :)
1155480205 M * BoSS :)
1155480207 M * BoSS o yea
1155480207 M * BoSS :D
1155480208 M * BoSS depends
1155480338 M * Bertl well, check the topic for hints .. maybe have a look at the web site :)
1155480523 Q * BoSS Quit: ««< S O H B E T >»» www.Sohbet.Net
1155481098 M * waldi Bertl: can you recheck the patch?
1155481114 M * Bertl sure, sec
1155481201 M * Bertl could you do the diff with -NurpP ?
1155481255 M * waldi nope
1155481276 M * Bertl hmm, then it will take a little, have to figure where the code goes
1155481308 M * waldi i have to regenerate the tree ...
1155481359 M * debugger_ hey guys :D
1155481361 N * debugger_ rgl
1155481363 Q * michal_ Ping timeout: 480 seconds
1155481410 J * debugger_ Rui@217.129.151.190
1155481700 M * doener which patch?
1155481707 M * Bertl http://194.39.182.225/linux/vserver-bindmount.patch
1155481749 M * Bertl waldi: one comment right now: the first check should be
1155481764 M * Bertl (old_nd.mnt->mnt_flags & MNT_NODEV) &&
1155481775 M * Bertl vx_ccaps(VXC_SECURE_MOUNT)
1155481778 N * debugger_ rgl_
1155481802 M * Bertl waldi: i.e. first the old check, then the caps, and both on a separate line
1155481811 M * rgl_ Bertl, just to check, did you receive my email?
1155481830 M * waldi hmm
1155481834 M * Bertl rgl_: iirc. I even replied to it :)
1155481862 M * rgl_ Bertl, oh, if you did, I didn't get :(
1155481863 M * Roey hey all
1155481865 Q * rgl Ping timeout: 480 seconds
1155481865 M * Roey Bertl: guess what,
1155481881 M * Roey Bertl: I figured that over the other choices, vserver would be easiest.
1155481888 M * Roey Bertl: since I don't need to do the openvpn thing
1155481891 M * Bertl how so?
1155481904 M * Roey Bertl: we'll have openvpn over a different dedicated server.
1155481943 M * rgl_ Bertl, odd, its not even on my spam folder :( can you please resend it?
1155481974 M * Roey Bertl: that leaves us with postfix proxy, external dns, and web services. These can be hosted fine on vserver (in fact, I've set this up long ago on the server--I just haven't switched it on s the usual configure thingie --- I just haven't switched it 'on' yet)
1155481977 M * Roey Bertl: HI BERTL!
1155481979 M * Roey I've had coffee
1155482017 J * michal_ ~michal@www.rsbac.org
1155482035 M * doener Bertl: hm, is that do_mount and do_remount?
1155482055 M * doener (or whatever the remount thing was called)
1155482080 M * Bertl probably, will have to check, that's why I asked for a proper diff
1155482354 M * waldi why does util-vserver nuke /var/run?
1155482397 M * waldi bah
1155482702 M * sid3windr what's required for raw_icmp ccap?
1155482720 M * sid3windr I upgraded to 0.30.210 finally for utils, so now it accepts it, but mtr/traceroute still don't work from the vserver
1155482720 Q * Roey Ping timeout: 480 seconds
1155482876 M * Bertl sid3windr: traceroute does not use raw icmp
1155482911 M * Bertl sid3windr: don't know about mtr, but I suspect it requires raw sockets too (just like traceroute), use tracepath instead
1155482923 M * sid3windr yeah
1155482926 M * sid3windr so that doesn't help?
1155482935 M * sid3windr mtr uses raw sockets too.. 1155482944 M * sid3windr no hope for that unless full network access is given? 1155482947 M * Bertl no, raw sockets mean that the tool can spoof/capture any packets 1155482962 M * Bertl at least I do not want that on my guests :) 1155482987 M * Bertl if you give CAP_NET_RAW, it'll work 1155483122 J * coocoon ~coocoon@p54A07985.dip.t-dialin.net 1155483140 M * daniel_hozac hello everyone! 1155483307 M * daniel_hozac waldi: FHS compliance, IIRC. and only for sysv initstyle where the sysinit stage of the guest's boot isn't run. 1155483469 M * Bertl hey daniel_hozac! how was your vacation? 1155483505 M * daniel_hozac it was great. 1155483561 M * Bertl good to hear! 1155483572 M * daniel_hozac so what did i miss? :) 1155483636 M * Bertl we fixed a scheduler bug :) 1155483673 M * Bertl the wiki (mediawiki) has progressed a lot 1155483674 J * Roey ~katz@h-69-3-4-130.mclnva23.covad.net 1155483689 M * Bertl quota testing is rolling again 1155483801 M * daniel_hozac ah, good. 1155483951 M * Bertl and I almost finished combing through 2.0.2 :) 1155483998 M * daniel_hozac cool, so we should a release RSN? ;) 1155484055 J * phedny ~mark@volcano.p-bierman.nl 1155484095 M * Bertl I'd say so, probably one run through PLM and after that, we should be fine 1155484693 M * Bertl okay, off for now, back a little later ... 1155484708 N * Bertl Bertl_oO 1155485054 M * Hollow heya daniel_hozac! 1155485477 M * daniel_hozac hey! 1155485564 M * Savvy hey, anyone got this warning : http://pastebin.ca/129631 ? 1155485728 Q * rgl_ Quit: Fui embora 1155486136 Q * coocoon Quit: KVIrc 3.2.0 'Realia' 1155486415 M * ebiederm Q: Is setting a larger value of pid_max anything that users are doing with vserver? 
1155487101 J * coocoon ~coocoon@p54A055B6.dip.t-dialin.net 1155487494 N * _fs fs 1155488039 Q * coocoon Quit: KVIrc 3.2.0 'Realia' 1155489604 J * pisco ~pampel@p5087A6E1.dip0.t-ipconnect.de 1155489956 J * _mcp ~hightower@wolk-project.de 1155490013 Q * mcp Read error: Connection reset by peer 1155490015 N * _mcp mcp 1155493878 J * ntrs_ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com 1155493878 Q * ntrs__ Read error: Connection reset by peer 1155493965 Q * shedi Read error: Connection reset by peer 1155495034 J * ekc ~ekc@netblock-66-245-252-180.dslextreme.com 1155495608 J * dna_ ~naucki@p54BCD70C.dip.t-dialin.net 1155496015 Q * dna Ping timeout: 480 seconds 1155496371 J * liquid3649_ ~Viper0482@p5497658A.dip.t-dialin.net 1155496391 J * dna ~naucki@p54BCD70C.dip.t-dialin.net 1155496555 Q * Viper0482 Ping timeout: 480 seconds 1155496555 J * shedi ~siggi@inferno.lhi.is 1155496680 Q * dna_ Ping timeout: 480 seconds 1155496695 J * Viper0482 ~Viper0482@p5497658A.dip.t-dialin.net 1155496865 Q * liquid3649_ Ping timeout: 480 seconds 1155497128 J * dna_ ~naucki@p54BCFA68.dip.t-dialin.net 1155497224 J * dna___ ~naucki@p54BCFA68.dip.t-dialin.net 1155497466 Q * dna Ping timeout: 480 seconds 1155497569 J * dna ~naucki@p54BCFA68.dip.t-dialin.net 1155497616 Q * dna_ Ping timeout: 480 seconds 1155497765 Q * yarihm Quit: Leaving 1155497784 Q * Savvy Quit: IceChat - Keeping PC's cool since 2000 1155497799 N * Belu_zZz Belu 1155497961 Q * dna___ Ping timeout: 480 seconds 1155497984 J * dna_ ~naucki@p54BCFA68.dip.t-dialin.net 1155498206 Q * dna Ping timeout: 480 seconds 1155498795 J * dna ~naucki@p54BCFA68.dip.t-dialin.net 1155499056 Q * dna_ Ping timeout: 480 seconds 1155499124 Q * gerrit Remote host closed the connection 1155499300 J * shedii ~siggi@inferno.lhi.is 1155499341 J * s4edi ~siggi@inferno.lhi.is 1155499458 M * cehteh oh .. anyone applied for a vserver booth for the linuxtag essen? 
1155499621 Q * shedi Ping timeout: 480 seconds 1155499786 Q * shedii Ping timeout: 480 seconds 1155500013 M * mnemoc anyone at essen? :) 1155500076 Q * ntrs_ Read error: Connection reset by peer 1155500083 J * ntrs_ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com 1155500235 Q * ekc Ping timeout: 480 seconds 1155500326 J * ntrs__ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com 1155500326 Q * ntrs_ Read error: Connection reset by peer 1155500449 J * ekc ~ekc@netblock-66-245-252-180.dslextreme.com 1155500486 Q * schimmi Quit: Verlassend 1155500804 J * shedii ~siggi@inferno.lhi.is 1155500913 Q * bonbons Quit: Leaving 1155501136 Q * s4edi Ping timeout: 480 seconds 1155501189 Q * shedii Quit: Leaving 1155501674 Q * Zaki[] Remote host closed the connection 1155501998 Q * Viper0482 Remote host closed the connection 1155502000 J * Zaki ~Zaki@212.118.121.51 1155503330 J * Wonka produziert@chaos.in-kiel.de 1155503338 J * s0undt3ch_ nxdify@bl7-243-151.dsl.telepac.pt 1155503427 Q * s0undt3ch Ping timeout: 480 seconds 1155503430 N * s0undt3ch_ s0undt3ch 1155503651 A * Belu is away (i'll be back later...) 1155503652 N * Belu Belu_zZz 1155503819 M * derjohn ebiederm, i've never heard about users needing more than 65k processes on a host. 1155504063 M * derjohn cehteh, essen is too far from here, but I would offer one day of my time. 1155504077 J * ntrs_ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com 1155504080 Q * ntrs__ Read error: Connection reset by peer 1155504126 M * Wonka re 1155504206 J * ntrs__ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com 1155504206 Q * ntrs_ Read error: Connection reset by peer 1155504242 M * daniel_hozac Bertl_oO: did you track down Aiken's COW NOFILE issue? 
1155504381 J * coocoon ~coocoon@p54A06E11.dip.t-dialin.net 1155504504 J * ntrs_ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com 1155504504 Q * ntrs__ Read error: Connection reset by peer 1155504892 J * Aiken ~james@tooax8-072.dialup.optusnet.com.au 1155504961 M * michal_ hey guys...point me please to some unification howto 1155505163 M * ebiederm derjohn: Thanks. Do you know if having several vservers going gives you a higher number of processes than a normal machine? 1155505264 M * daniel_hozac ebiederm: what does normal machine mean? what are the vservers doing? 1155505288 M * ebiederm Restating my question. 1155505303 M * daniel_hozac michal_: http://linux-vserver.org/alpha+util-vserver has a section on it. 1155505314 M * ebiederm I have noticed some scalability problems with the data structures that are used for pids. 1155505346 M * ebiederm In general machines running vservers do more work and are thus pushed harder than other machines. 1155505377 M * ebiederm So my question is does running vservers on a machine tend to up the average number of processes on a machine. 1155505393 M * ebiederm I expect it does but I have no real world numbers to work from. 1155505397 M * daniel_hozac well, it would have to. 1155505426 M * daniel_hozac a context will disappear unless there's a process in it, or it has the persistent flag set. 1155505445 M * daniel_hozac and running a guest without processes seems pretty useless. 1155505477 M * daniel_hozac (kind of like buying a computer and never turning it on...) 1155505580 M * ebiederm Right I'm just trying to gauge how many processes are typical on a machine running vservers or other containers. 1155505622 M * ebiederm If I can spot a real trend where the number of processes is higher than in other situations the data structures need to be fixed. 1155505638 M * ebiederm Otherwise I can leave the data structures that are more optimal for small pid counts in place. 
1155505645 M * ebiederm And otherwise be lazy :) 1155505661 J * dna_ ~naucki@p54BCFA68.dip.t-dialin.net 1155505739 M * derjohn ebiederm, well, I run usually not more than 10-15 guests on a host with current COTS hardware. that's roughly 5000-6000 processes per guest. Can there be a machine that stands more? 192 CPU Suns? I would guess the machine would be thrashing anyway .... 1155505801 M * daniel_hozac 5000-6000 processes per guest? wow. 1155505820 M * sid3windr 1155505820 M * derjohn daniel_hozac, 65K/10-15 guests .... is? 1155505825 M * sid3windr woops 1155505868 M * sid3windr I tried booting 2.6.17.7 with the patch from experimental and it just panics on boot 1155505870 M * daniel_hozac 4369. 1155505872 M * sid3windr :( 1155505879 M * derjohn daniel_hozac, this was only meant as an approx maximum .... (to avoid misunderstanding). a ps faxu |wc -l would probably show not more than 100 per guest 1155505887 M * ebiederm derjohn: So you don't have many if any unused pids at all. Interesting. 1155505899 M * daniel_hozac derjohn: i guess ebiederm is looking for reality, not theory though ;) 1155505922 Q * dna Ping timeout: 480 seconds 1155505937 M * daniel_hozac ebiederm: i'd guess your typical guest runs at the very least 10 processes. 1155505990 M * derjohn ebiederm: usually less than 50, on heavily loaded guests maybe 100. (@daniel_hozac my cyrus spawns 30 'spare' children for pop etc.) 1155506013 M * daniel_hozac yeah, apache will do similar things. 1155506019 M * derjohn i may add that I only run typical web-environments (whatever 1155506027 M * derjohn typical may be ;)) 1155506058 M * ebiederm Basically anything less than 4K processes doesn't really have hash collisions so the hash table works perfectly. After that things begin to degrade. 1155506085 M * ebiederm It sounds like things are still below 4K for most of the vserver users anyway. 1155506126 M * derjohn 4K per guest or per host? 1155506131 M * ebiederm per host. 
1155506175 M * derjohn hm, well strive still towards a patch that makes it into the mainline kernel= 1155506177 M * derjohn ? 1155506328 J * dna ~naucki@p54BCFA68.dip.t-dialin.net 1155506336 M * ebiederm This is the mainline kernel data structure I'm looking at. I'm wondering if the current data structures will fall down because of increased load. 1155506403 M * ebiederm On 64bit machines it is possible to push pid_max to 4194304 or 4M processes. If anyone uses a significant fraction of that the performance of our current data structures is not very good. 1155506405 M * derjohn ebiederm, there are surely people/ISPs out there who run several hundred guests on a 16GB/Quad Opteron setup. At least here in .de there are some who do so with openvz. if they run 50 processes per guest that would create collisions .... 1155506462 M * cehteh derjohn: quite far for me too ... but if i find a place where i can stay overnight maybe i come 1155506501 M * ebiederm derjohn: Yes. Actually anything short of increasing pid_max is livable. The worst case hash chain with a 32K pid_max (the default) is only 9 entries. 1155506557 M * ebiederm Unfortunately at 4M processes the worst case hash chain is 1024 entries. But it doesn't sound like anyone except people with huge cpu counts is coming anywhere near that today. 1155506577 M * derjohn ebiederm, these are structures not related to vserver at all? i.e. you are talking about general hashes for processes? maybe it's time for a config option at compile time: "use biederman hashes for process lookup (use this option for machines with a large number of processes)"? 1155506614 M * ebiederm derjohn: Exactly. These structures are not vserver related. 1155506649 M * derjohn ebiederm, maybe the openvz developers have such hashes already? Their target market is IMVHO large ISPs ... 
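The chain lengths ebiederm quotes can be roughly reproduced with simple arithmetic. The sketch below assumes a pid hash table of 4096 buckets; that number is not stated in the log and is only inferred from the remark that collisions start above about 4K processes. Under that assumption the 4M case gives the quoted 1024 entries, while the 32K case gives 8 rather than the quoted 9; the log does not explain the difference.

```python
import math

# Assumption (not stated in the log): the pid hash table has 4096 buckets.
ASSUMED_BUCKETS = 4096


def chain_length_when_full(pid_count: int, buckets: int = ASSUMED_BUCKETS) -> int:
    """Entries per bucket if pid_count pids are in use and spread evenly.

    By the pigeonhole principle, at least one chain is at least this long.
    """
    return math.ceil(pid_count / buckets)


for pid_max in (4096, 32 * 1024, 4 * 1024 * 1024):
    print(pid_max, chain_length_when_full(pid_max))
```

A lookup walks one chain, so the cost grows linearly with this figure once the process count passes the bucket count.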
1155506673 M * michal_ daniel_hozac: i cannot see in vhashify configuration how it would know which vserver is a reference vserver 1155506693 M * daniel_hozac michal_: hashification doesn't use a reference guest. 1155506699 M * michal_ daniel_hozac: i already have one guest - and would like to 'clone' it 1155506708 Q * renihs Quit: Leaving 1155506711 M * daniel_hozac devel with COW? 1155506717 M * michal_ yep 1155506735 M * ebiederm derjohn: The data structures are good enough and pids used rarely enougn that unless someone was looking they probably would not have noticed a performance hit. 1155506737 Q * dna_ Ping timeout: 480 seconds 1155506760 M * ebiederm I'm just trying to see if the trend will push us over the edge without realizing it. 1155506765 M * daniel_hozac personally i just find /vservers/a -type f -print0 | xargs -0 setattr --iunlink; cp -al /vservers/{a,b} for that. 1155506815 M * daniel_hozac (of course, hashification is a more long-term solution, allowing you to shrink the sizes of the guests once they've updated files etc.) 1155506829 M * daniel_hozac but you could always run that before. 1155506834 M * michal_ i'll definitely will stay with hashification 1155506865 M * daniel_hozac AFAIK there's no COW-knowing clone available yet. 1155506884 M * michal_ ok. we missuderstod. 1155506898 M * daniel_hozac i've been meaning to hack one together because i'm lazy, but the above has worked so far. 1155506914 M * daniel_hozac (mostly because my real hosts are still running stable) 1155506916 M * lylix when does bertl usually pop back in? 1155506931 M * daniel_hozac whenever he's done with whatever it is he's doing. 1155506936 M * lylix yoi 1155506941 M * daniel_hozac :) 1155506957 M * lylix k, think we've located a token bucketing bug in x86_64... have to do some further testing though 1155506987 M * lylix gentoo hosts running same kernel/vserver tree and totally diff behaviors 1155507015 M * michal_ i've got full/working vserver-sarge. 
now have created an _empty_ vserver and am wondering how to 'fill' it using hashification. that's all :) 1155507071 M * daniel_hozac michal_: hashification won't help you, neither will unification (AFAIK, i've never actually used it). 1155507110 M * daniel_hozac michal_: but basically, the idea is to traverse the tree of the base guest, and then for every iunlink|immutable file you link to it, and for every other file, you copy it. 1155507175 M * daniel_hozac michal_: should be a pretty trivial script, and if you hashify the base guest prior to doing it, the new guest will be all hashified when you're done. 1155507201 M * michal_ what's the point of hashifying it then? 1155507314 M * daniel_hozac hmm? you mean if it can't help you clone guests? 1155507333 M * michal_ nah 1155507336 M * daniel_hozac the decreased disk and therefore memory usage is usually the biggest reason. 1155507342 M * michal_ after doing it all manually 1155507358 M * michal_ hashing is being used for... 1155507359 M * michal_ COW? 1155507368 M * daniel_hozac see above. 1155507402 Q * dna Ping timeout: 480 seconds 1155507426 M * daniel_hozac one of the benefits of COW is that you're able to do exactly what i said above (a cp -al after setting iunlink|immutable). 1155507476 M * michal_ cp -al after iunlink|immutable are set by definition will give me 'the decreased disk and therefore memory usage' ;) 1155507528 M * daniel_hozac exactly. 1155507564 M * michal_ mhm. so...where is the place hashing goes into action? 1155507804 M * daniel_hozac during hashification. 1155507826 M * doener assume you have a bunch of vservers, not necessarily cloned, maybe updated for a number of times (say originally it was debian sarge, now some are sid) 1155507829 M * daniel_hozac so it won't have to compare the file contents of every single file. 
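The "pretty trivial script" daniel_hozac outlines (walk the base guest, hardlink every iunlink|immutable file, copy everything else) might look roughly like the sketch below. It is not part of util-vserver. `clone_guest` and the `is_unified` callback are hypothetical names; the callback stands in for the attribute check, which this sketch does not implement. Device nodes, FIFOs, ownership and directory permissions are not handled.

```python
import os
import shutil
from typing import Callable


def clone_guest(base: str, target: str, is_unified: Callable[[str], bool]) -> None:
    """Clone the tree under base into target.

    Files for which is_unified(path) is true are hardlinked, all other
    regular files are copied. is_unified is a hypothetical stand-in for
    an "iunlink and immutable are both set" check.
    """
    for dirpath, dirnames, filenames in os.walk(base):
        rel = os.path.relpath(dirpath, base)
        dest_dir = os.path.normpath(os.path.join(target, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in dirnames:
            src = os.path.join(dirpath, name)
            if os.path.islink(src):
                # os.walk does not descend into symlinked directories,
                # so recreate the symlink itself.
                os.symlink(os.readlink(src), os.path.join(dest_dir, name))
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dest_dir, name)
            if os.path.islink(src):
                os.symlink(os.readlink(src), dst)
            elif is_unified(src):
                os.link(src, dst)  # shared with the base guest
            else:
                shutil.copy2(src, dst)  # private copy for the new guest
```

Hardlinks only work within one filesystem, so base and target have to live on the same one.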
1155507849 M * doener hashifying allows you to replace common files with hardlinks and make them COW 1155507865 M * michal_ hey, but i can do it maually 1155507869 M * michal_ *manually 1155507899 M * doener the cp -al thing only works for cloning 1155507906 Q * pisco Ping timeout: 480 seconds 1155507919 M * doener hashifying always works, no matter what state the vservers are in 1155507975 M * michal_ ok...so. what are steps vhashify is doing? it takes a file, checks if it is not on a exclude list. it is not, than it calcuclates the hash and...? 1155508104 M * daniel_hozac if the file is already present in the hash directory, it creates a link to that file. 1155508130 M * daniel_hozac if it's not, a new file is created to which the old file is copied, iunlink|immutable set, and then linked to the old location. 1155508145 M * daniel_hozac (IIRC, doener knows more of the details) 1155508262 M * doener daniel_hozac: you know that I'm scared by util-vserver source code ;) 1155508519 M * michal_ ok. 
it all does not make any sense for me and is a-logical 1155508527 M * michal_ maybe i'll think about it tommorow 1155508591 Q * ekc Ping timeout: 480 seconds 1155508915 M * michal_ thx for help guys anyway 1155508929 M * michal_ i'll back on this topic untill i'll know evryting about it ;p 1155508976 J * ekc ~ekc@netblock-66-245-252-180.dslextreme.com 1155509002 Q * coocoon Quit: KVIrc 3.2.0 'Realia' 1155509239 M * michal_ anyway 1155509265 M * michal_ COW link breaking does not seem to work like it should here 1155509270 M * michal_ platinum:~ # ls -i /vservers/{debian_sarge,sarge2}/etc/ssh/sshd_config 1155509270 M * michal_ 33819174 /vservers/debian_sarge/etc/ssh/sshd_config 33819174 /vservers/sarge2/etc/ssh/sshd_config 1155509273 M * michal_ platinum:~ # ls -i /vservers/{debian_sarge,sarge2}/etc/ssh/sshd_config 1155509276 M * michal_ 33819174 /vservers/debian_sarge/etc/ssh/sshd_config 33819174 /vservers/sarge2/etc/ssh/sshd_config 1155509290 M * michal_ first one is before, second one after modification of that file 1155509298 Q * ntrs_ Read error: Connection reset by peer 1155509309 J * ntrs_ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com 1155509320 M * michal_ /vservers/debian_sarge/etc/ssh/sshd_config:ListenAddress 192.168.1.8 1155509320 M * michal_ /vservers/sarge2/etc/ssh/sshd_config:ListenAddress 192.168.1.8 1155509324 M * doener what does showattr say about these files? 1155509324 M * michal_ bad 1155509349 M * michal_ ----ui- /vservers/debian_sarge/etc/ssh/sshd_config 1155509349 M * michal_ ----ui- /vservers/sarge2/etc/ssh/sshd_config 1155509362 M * michal_ have run vhashify, than cp -al 1155509381 M * doener vhashify on the old vserver? 1155509406 M * michal_ Linux platinum 2.6.16.27-vs2.1.1-rc22 #1 PREEMPT Sat Aug 12 11:37:30 CEST 2006 i686 i686 i386 GNU/Linux 1155509445 M * doener ie. on the one that was copied? 
1155509489 M * michal_ michal@platinum:~/code/vserver/linux-2.6.16.27-vs2.1.1-rc22> grep COW .config 1155509490 M * michal_ # CONFIG_BLK_DEV_COW_COMMON is not set 1155509490 M * michal_ CONFIG_VSERVER_COWBL=y 1155509524 M * michal_ hm, yes 1155509534 M * michal_ did vhashify on source vserver 1155509541 M * michal_ than copied 1155509659 M * doener maybe /etc is excluded by default? 1155509689 M * doener just doing cp -al is a bad idea unless you actually force immutable|iunlink 1155509713 J * Johnnie ~john@dynamic-acs-24-154-53-237.zoominternet.net 1155509732 M * michal_ showattr tells they are present 1155509762 M * doener hu? they were pretty non-present in the output you pasted 1155509794 M * michal_ so what is u and i ? 1155509813 M * doener no unlink and no immutable, U and I would mean that they are set 1155509838 M * doener - = impossible for this file, lower case: possible, but not set, upper case: set 1155509851 M * michal_ ups 1155510074 M * michal_ cool 1155510085 M * michal_ after tweaking it link was broken 1155510148 M * michal_ ok, so to sum it all up 1155510224 M * michal_ vhashify calculated hash for a files, say it was bin/ls. made a hardlink in /vservers/.hash 1155510304 Q * Johnnie Quit: G'bye! 1155510332 M * michal_ so, where those .hash dir is beeing actualy used? 1155510362 M * michal_ i could easily setattr -iunlink on files 1155510366 M * michal_ run cp -al 1155510370 M * michal_ would got the same :) 1155510407 M * doener assume you have a third vserver that is being hashified 1155510429 M * doener the hash is calculated and used to lookup the file in /vservers/.hash 1155510452 M * doener otherwise you'd have no fast way to find identical files 1155510461 J * Johnnie ~john@dynamic-acs-24-154-53-237.zoominternet.net 1155510461 Q * Johnnie 1155510484 J * Johnnie ~john@dynamic-acs-24-154-53-237.zoominternet.net 1155510560 M * michal_ so how vunify did it? 
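doener's reading of the showattr output (dash: not possible for this file, lower case: possible but not set, upper case: set) can be expressed as a tiny decoder. This is only an illustration of that case convention; `decode_flags` is a hypothetical helper, and the second input string is a made-up example of what set flags would look like.

```python
def decode_flags(flags: str) -> dict:
    """Map each flag letter to True (set) or False (possible but not set)."""
    state = {}
    for ch in flags:
        if ch == "-":
            continue  # flag not applicable to this file
        state[ch.lower()] = ch.isupper()
    return state


print(decode_flags("----ui-"))  # the output michal_ pasted: u and i both clear
print(decode_flags("----UI-"))  # hypothetical output with both flags set
```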
1155510592 M * doener it relies on being provided with information on the files that are to be unified 1155510603 M * doener it can utilize rpm specs IIRC 1155510617 M * doener for non-rpm distros, using vunify is said to be quite hard 1155510656 M * doener and it obviously cannot handle differing filenames and/or locations within the vservers 1155510662 M * michal_ and - in which part vserver needs to know files are identical? say i'm a kernel COW code (lol;) - i see a write request to a file. it is clearly unified, i make a new one, copy over it, make this write, say thank you :) 1155510742 M * doener the kernel knows by looking at the hardlink count, but to create hardlinks you need to know if files are identical... you can't just randomly replace files with hardlinks... 1155510764 M * doener cp -la just doesn't work to re-unify a vserver... 1155510987 M * doener daniel_hozac: wow, thread hijacking flood... 1155510995 M * michal_ last question and i'm heading to sleep. how do i upgrade 10 unified vservers with COW in place? and say, add some app to all of them (but with hardlinked-hashified way!)? 1155511026 M * daniel_hozac on the ml? i stopped sorting by threads when i realized too many people don't know how to properly use email... 1155511037 M * doener michal_: install it in everyone and then hashify them again 1155511119 M * daniel_hozac sid3windr: at least the trace is required. which init? the host's or the guest's? 1155511175 Q * ekc Ping timeout: 480 seconds 1155511333 M * michal_ wow 1155511338 M * michal_ it does work like a charm 1155511345 M * michal_ and now i can see it all clearly :) 1155511354 M * michal_ and vhashify role also 1155511390 M * doener great :) 1155511433 M * michal_ last - i've removed unified file - how do i remove it from .hash db? 1155511441 M * michal_ (yes, really last ;) 1155511533 M * daniel_hozac i usually run find /vservers/.hash -links 1 -print0 | xargs -0 rm -f after hashifying my guests. 
1155511541 M * michal_ ok 1155511585 Q * PowerKe Quit: Oops, wrong button 1155511662 M * michal_ ok, sorry to bother you too long&happy to finally get it all. cya :) have a good whatever ;) 1155511671 M * doener cya
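The hashification steps described above (look the file up by content hash in the hash directory, link to it if present, otherwise copy it there first) and the `-links 1` cleanup can be sketched as follows. This is a simplified illustration, not vhashify's actual code: the flat store layout, the choice of SHA-1 and the function names are assumptions, and setting iunlink|immutable on the stored copy is left out.

```python
import hashlib
import os
import shutil


def file_digest(path: str) -> str:
    """Hex digest of the file contents (SHA-1 chosen only for illustration)."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def hashify_file(path: str, store: str) -> None:
    """Replace path with a hardlink to a content-addressed copy in store."""
    os.makedirs(store, exist_ok=True)
    stored = os.path.join(store, file_digest(path))
    if not os.path.exists(stored):
        shutil.copy2(path, stored)
        # A real tool would set iunlink|immutable on the stored copy here.
    if os.path.samefile(path, stored):
        return  # already linked to the stored copy
    tmp = path + ".hashify-tmp"
    os.link(stored, tmp)
    os.replace(tmp, path)  # swap the link into place


def prune_store(store: str) -> int:
    """Remove stored files that no guest links to any more (link count 1)."""
    removed = 0
    for name in os.listdir(store):
        p = os.path.join(store, name)
        if os.stat(p).st_nlink == 1:
            os.remove(p)
            removed += 1
    return removed
```

Because the lookup key is the content hash, identical files are found without comparing every file against every other, regardless of their names or locations inside the guests.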