1155427495 J * lylix ~eric@dynamic-acs-24-154-53-234.zoominternet.net
1155427557 M * lylix anyone successfully using the token bucket scheduler?
1155427577 Q * Wonka Ping timeout: 480 seconds
1155427597 M * Skram at one point I did
1155427640 M * lylix should the actual number of tokens be reflected in /proc/virtual/XID/sched?
1155427658 M * Skram To tell the truth, I forget
1155427677 M * lylix k, so prolly no pointers on the config/setup
1155427690 M * doener lylix: I think they should
1155427705 M * lylix i have valid settings in /etc/vservers/$VSERVER/schedule
1155427711 M * Skram Im looking for the page that says it
1155427721 M * lylix as it gets properly populated in the proc/sched file
1155427749 M * lylix and also have 'sched_hard' in /etc/vservers/BLAH/flags
1155427791 M * lylix and i ran an infinite loop w/ bzip and the tokens in proc never changed
1155427809 M * lylix actually... not really a loop... just the silly thing from the wiki page:
1155427812 M * doener lylix: which kernel version?
1155427817 M * lylix cat /dev/zero | bzip2 | bzip2 | bzip2 > /dev/null
1155427829 M * lylix 2.6.15
1155427835 M * lylix vserver 2.0.1
1155427898 M * ekc2 do you have CONFIG_VSERVER_HARDCPU=y in kernel config?
1155427916 M * lylix oh boy.. checking... 
1155427980 M * ekc2 at least in 2.1.1 you need that set for sched_hard
1155427996 M * lylix i see it in 2.0.1
1155428001 M * lylix not set, so thatll do it
1155428014 M * lylix now... sched_prio still works right?
1155428053 M * ekc2 yes. depending on what flags you have set: sched_hard, sched_prio
1155428110 M * lylix k, how effective is the priority scheduling compared to hard limit?
1155428194 M * ekc2 haven't used sched_prio. but, if you use it you should limit nproc
1155428252 M * lylix makes sense... k, since you are using sched_hard, is it advisable to use the "Limit the IDLE task" option?
1155428300 J * Roey ~katz@h-69-3-4-130.mclnva23.covad.net
1155428306 M * ekc2 i'm trying to get that working right now. it's not enabled by default unless you set a flag from userspace
1155428323 M * ekc2 so, it doesn't hurt to enable it
1155428357 M * lylix i know the vsched prog from bertl can handle setting idle fill/interval, is this one and the same?
1155428414 M * ekc2 bertl said that you need that option enabled in the devel branch to set the idle fill/interval
1155428444 M * lylix k, not ready for devel branch yet... how stable is it?
1155428456 M * lylix i havent had time to bunk it on an extra box and test
1155428473 M * ekc2 it's very solid for me. works great
1155428486 M * lylix hmmm, food for thought then.
1155428515 M * lylix the only annoyance i get sometimes is the mountpoints in namespaces
1155428535 Q * dna Quit: Leaving
1155428537 M * lylix wrote a quick script to handle that, becuz we use LVM partition per vserver guest
1155428564 M * lylix and destroying a vserver/lvm is sometimes a real pain
1155428633 M * lylix nothing more than a vnamespace -e BLAH umount $DIR thru all running vservers found in /var/run/vservers
1155428699 M * lylix would be nice if the vserver BLAH stop command unmounted non-shared mounts for that vserver in all namespaces automatically
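The cleanup script lylix describes above could look roughly like the following dry-run sketch. It only builds the `vnamespace -e NAME umount DIR` command lines quoted in the log and prints them. The assumption that every entry in /var/run/vservers is named after a running guest, the helper names, and the example mount point are all hypothetical and not taken from the log.

```python
import os


def running_guests(run_dir="/var/run/vservers"):
    """List entries in run_dir; ASSUMES each entry is named after a running guest."""
    try:
        return sorted(os.listdir(run_dir))
    except OSError:
        return []  # directory missing or unreadable: treat as no running guests


def build_umount_commands(guests, mount_dir):
    """Build one `vnamespace -e GUEST umount DIR` argv list per guest."""
    return [["vnamespace", "-e", guest, "umount", mount_dir] for guest in guests]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # "/mnt/guest-lv" is a placeholder mount point, not a path from the log.
    for argv in build_umount_commands(running_guests(), "/mnt/guest-lv"):
        print(" ".join(argv))
```

Printing first and executing only after review seems the safer choice here, since an unmount issued in the wrong namespace is hard to undo.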
1155428869 Q * ekc2 
1155428886 J * ekc2 ~EKC@netblock-66-245-252-180.dslextreme.com
1155429348 M * ekc2 Does anyone know why idle time fails to advance (stuck at 0) when idle_rate/idle_interval=50/300 but works fine when idle_rate/idle_interval=1/6?
1155429431 M * ekc2 very puzzling
1155430646 M * bj arg... problems with "cannot set rlimit: Operation not permitted", already tried ULIMIT="-H -u 256 -n 65536" and CAP_SYS_RESOURCE but it won't run ;/
1155430865 Q * Skram Remote host closed the connection
1155431033 M * bj anyone got an idea to that ?
1155431203 J * Skram ~Mark@hermes.sentiensystems.com
1155431392 Q * meandtheshell Remote host closed the connection
1155432110 N * Bertl_oO Bertl
1155432115 M * Bertl evening folks!
1155432140 M * Bertl bj: who is trying to set what rlimit?
1155432195 M * Bertl lylix: I think it is better to unmount excessive mounts on startup, than on shutdown
1155432217 M * Bertl lylix: this is what the tools with daniel_hozac's patches should do
1155432223 M * bj hey Bertl :)
1155432254 M * Bertl ekc2: sounds strange, are you sure about that?
1155432259 M * bj Bertl: the bad application called "Postfix Policy Daemon" .oO(apt-cache show postfix-policyd)
1155432280 M * Bertl postfix has a policy daemon?
1155432309 M * Bertl anyway, which rlimit does it hit?
1155432317 M * bj found a discussion where you already discussed a similar thing with somebody but I didn't manage to get it running (http://www.paul.sladen.org/vserver/irc-logs/200312/vserver.2003-12-02.txt)
1155432337 M * bj Bertl: yep the thing is actually quite neat but bad(tm) in vserver env ;/
1155432351 M * Bertl well, that was two and a half years ago, I doubt that anything from back then is still valid
1155432363 M * Bertl (the entire limit stuff changed a lot since back then)
1155432376 M * bj Bertl: dunno just get the message : cannot set rlimit: Operation not permitted
1155432385 M * bj ahso
1155432392 M * Bertl okay, then let's get strace working and check it out
1155432408 M * bj hm good idea I'll give it a try
1155432436 M * ekc2 bertl: yes, positive. i was going nuts trying to get idle time to advance with rate/interval = 1/1000 and rate_idle/interval_idle=50/300; on a whim I tried rate_idle/interval_idle=1/6 and it worked.
1155432448 M * ekc2 behaviour is repeatable
1155432492 M * ekc2 bucket minimum=50; max size=500
1155432502 M * Bertl okay, sounds like a bug then ... do you have vserver debugging enabled?
1155432509 M * ekc2 dual cpu system 
1155432511 M * ekc2 yes, i do
1155432521 M * ekc2 what would you like me to do?
1155432530 M * Bertl give me a moment to check which messages are of interest
1155432565 M * bj Bertl: setrlimit(RLIMIT_NOFILE, {rlim_cur=4097, rlim_max=4097}) = -1 EPERM (Operation not permitted)
1155432573 M * bj Bertl: thats what I get from the strace
1155432582 M * Bertl bj: okay, that sounds like a limit from your host
1155432586 M * bj hmm
1155432603 M * Bertl but let me put it this way, what does postfix policyd do with 4k file handles?
1155432617 M * bj i got no clue ;(
1155432621 M * Bertl bj: check it on the host with ulimit -aH
1155432627 M * bj Bertl: -n: file descriptors           1024
1155432642 M * Bertl so that is your upper limit inherited from the host
1155432652 M * bj already did that and also tried a echo 65536 > /proc/sys/fs/file-max
1155432654 M * Bertl if you raise that, then restart the guest, it will allow that
1155432673 M * Bertl it is independent of the kernel parameters (i.e. /proc/sys)
1155432680 M * bj hmm
1155432695 Q * Skram Remote host closed the connection
1155432696 M * Bertl try doing ulimit -HS -n 8192 on the host
1155432703 M * bj ahhh ok I'll give it a try
1155432717 M * Bertl that's debian, right?
1155432738 M * bj jepp
1155432754 M * Bertl if you figure where that default is applied, please drop me a note
1155432765 M * Bertl (it's not the first time that we hit this issue)
1155432793 M * bj Bertl: already read that in faq (http://linux-vserver.org/ProblematicPrograms) and posted at the end of the doc something regarding that
1155432810 M * bj Bertl: weird why all those debs seem to need CAP_SYS_RESOURCE ;(
1155432851 M * Bertl yeah, strange, especially as it is a user resource and should be assigned per user
1155432858 J * Skram ~Mark@hermes.sentiensystems.com
1155432920 M * bj Bertl: hmm so ulimit -Ha (host) gives me the 8192 but after a restart of the guest the app still has the issue and ulimit is giving me back still the 1024 ;(
1155432936 Q * nokoya Server closed connection
1155432956 J * nokoya young@hi-230-82.tm.net.org.my
1155432962 M * Bertl what does ulimit -Sa say on the host?
1155433010 M * Bertl and how do you restart the guest? i.e. from the same account you changed the limits or from a different one?
1155433034 J * Skram_ ~Mark@hermes.sentiensystems.com
1155433077 M * bj Bertl: file descriptors           8192, -c: core file size (blocks)    0, -s: stack size (kbytes)        8192, the rest is unlimited on the host
1155433083 M * Skram sowwy
1155433090 Q * Skram_ 
1155433095 M * bj Bertl: vserver $name restart (same account where I changed the limits)
1155433118 M * Bertl and you didn't add any ulimit file to the config, yes?
1155433194 M * bj oh interesting, I restarted it from one pseudo tty but entered it in another (there was the app still with 1024) if I enter from the one where I restarted I get the 8192
1155433200 M * bj different error now :))
1155433222 M * Bertl good, which one?
1155433233 M * bj Bertl: tcp        0      0 10.0.0.4:10031          0.0.0.0:*               LISTEN     17000/postfix-polic
1155433238 M * bj Bertl: solved it thanks a lot :)
1155433247 M * Bertl okay, great! :)
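The EPERM that bj saw from strace matches the general Linux rule that an unprivileged process may lower its hard limit, or move its soft limit up to the hard limit, but may not raise the hard limit it inherited. A minimal model of that check, assuming plain Linux semantics and ignoring vserver-specific capability handling (the function name and string return values are illustrative only):

```python
def setrlimit_result(current_hard, new_soft, new_hard, privileged=False):
    """Model the permission check setrlimit() applies to one resource."""
    if new_soft > new_hard:
        return "EINVAL"  # the soft limit may never exceed the hard limit
    if new_hard > current_hard and not privileged:
        return "EPERM"   # raising the hard limit needs CAP_SYS_RESOURCE
    return "OK"


if __name__ == "__main__":
    import resource  # Unix-only standard library module

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("inherited RLIMIT_NOFILE soft/hard:", soft, hard)
    # The request from the strace line in the log, checked against our own hard limit.
    print("request 4097/4097 ->", setrlimit_result(hard, 4097, 4097))
```

With the inherited hard limit of 1024 from the log, the 4097 request is refused. Once the host shell that starts the guest has a hard limit of 8192, the same request passes.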
1155433253 M * bj !!!! :))) 
1155433263 M * bj ok lets update the wiki
1155433299 M * Bertl please do that, but keep in mind that the settings on the host might differ from distro to distro
1155433336 M * Bertl ekc2: have to check that, we currently have no good debug statements where I would need them, except for the scheduler monitor, which would be overkill
1155433359 M * Bertl ekc2: give me a few minutes to try to reproduce it here
1155433383 M * bj Bertl: hm ok I'll try to be verbose hope it helps the others 
1155433428 M * ekc2 bertl: ok. well, i'm running the latest dev branch. so, if you want me to run a 'debug' version, i can 
1155433443 M * Bertl okay, good to know ...
1155433486 M * Bertl do you have the kernel tree sitting there in a compiled form? i.e. how long would it take to (re)compile with a few minor changes?
1155433508 M * ekc2 ~10 min?
1155433522 M * Bertl okay, so tree is there, and probably built?
1155433527 M * ekc2 yes
1155433540 M * ekc2 but, i'm pxe-booting so i have to do a few other things
1155433553 M * Bertl okay, so it should be far less than that, as the kernel only compiles what has changed ...
1155433615 M * ekc2 yes, but i have some kernel dependencies i have to rebuild. it's all automated. but it takes about 10min
1155433636 M * Bertl okay, let me first check if I can reproduce it, because that would be better anyways
1155433658 M * Bertl scheduler args set with vsched-0.03 I presume?
1155433670 M * ekc2 yup
1155433685 M * ekc2 /vnfs/scripts/vsched/vsched -f -i -x 12 -u -1 -I 300 -R 50 -M 50 -S 500
1155433693 M * ekc2 /vnfs/scripts/vsched/vsched -f -x 12 -u -1 -I 1000 -R 1 -M 50 -S 500 -i
1155433702 M * ekc2 (entered in reverse order)
1155434924 M * bj does anyone from you know where I gut the hideous "ulimit -HS -n 8192" ? Already tried /etc/vserver/$vserver/$vservername.conf or /etc/vserver/$vserver/ulimit or /etc/vserver/$vserver/ulimits (of course with in them ULIMIT="-HS -n 8192")
1155434930 M * bj s/gut/put/
1155434988 M * Bertl ekc2: what's that -1 supposed to do?
1155435026 M * ekc2 i want the command to apply to all cpu's (i have two); is that not how I address all cpu's?
1155435051 M * Bertl ah, okay, got it ...
1155435259 M * Bertl and which R/I settings make the idle time stop for you?
1155435278 M * Bertl 50/300?
1155435323 M * Bertl yep, seems so
1155435471 M * Bertl it seems to be running quite fine here, could you show me a few iterations of /proc/virtual/<xid>/sched when it is stopped for you?
1155435506 M * Bertl and more important, does the guest do enough to 'cause' idle time?
1155435510 Q * ekc2 Ping timeout: 480 seconds
1155435557 J * ekc2 ~EKC@netblock-66-245-252-180.dslextreme.com
1155435576 M * ekc2 bertl: the 50/300 idle time causes the problems
1155435596 M * Bertl okay, did you read/get my other comments?
1155435605 M * ekc2 yes, doing that now
1155435615 M * ekc2 the guest is running three cpuhog threads
1155435627 M * Bertl okay, that sounds good :)
1155436212 M * ekc2 here's what /proc/virtual/<xid>/sched is showing me while the guest is running cpuhog: http://rafb.net/paste/results/phrpRL35.html
1155436310 M * Bertl funny thing ...
1155436317 M * ekc2 and here's what happens after i change rate_idle/interval_idle from 50/300 to 1/6: http://rafb.net/paste/results/nDVsfj69.html
1155436351 M * ekc2 and when I change it back to 50/300, idle time freezes and both cpu's go on hold (for that guest)
1155436365 M * Bertl what if you 'just' change it to 49/300 or 50/299?
1155436442 M * ekc2 same behaviour. idle time freezes
1155436478 M * ekc2 1/300 works, though
1155436558 M * ekc2 very odd. 1/10 suits my needs, though. so, it's not a show-stopper.
1155436589 M * Bertl strange thing is, that it doesn't do that here
1155436616 M * Bertl is that x86 or x86_64?
1155436639 M * ekc2 x86_64
1155436647 M * Bertl hmm
1155436648 M * ekc2 debian_amd64
1155436706 M * Bertl CONFIG_VSERVER_IDLELIMIT is disabled or enabled?
1155436727 M * ekc2 enabled
1155436769 M * ekc2 hold on. testing something
1155436947 M * ekc2 ok. when I freshly boot a system and start a vserver with 50/300 idle time, this behaviour appears consistently.
1155436980 M * Bertl but? (that sounds like a but :)
1155437036 M * ekc2 well, when you had me try 49/300, 51/300, i also tried 1/300 and many other values. then I went back to 50/300 and it worked. i just rebooted my system and idle time is stalling again
1155437056 M * Bertl ah, that could be a valuable hint
1155437075 M * ekc2 yes, but i can't get 50/300 working now. 
1155437219 M * Bertl okay, let's try a few commands on your system, you got the vsched 0.03 and vcmd too?
1155437267 M * ekc2 ok. got it working again: here's what I did after I had 50/300 set: http://rafb.net/paste/results/5p91j526.html
1155437307 M * Bertl and that made it going again?
1155437315 M * ekc2 yes
1155437330 M * ekc2 yes, i've got vcmd and vsched-0.3
1155437431 M * Bertl okay, let's try the following:
1155437439 M * Bertl vcmd -i 42 -BC ctx_create .flagword=^34^8 -- cpuhog &
1155437505 M * Bertl vsched -f -x 42 -u -1 -I 1000 -R 1 -M 50 -S 500
1155437520 M * Bertl vsched -f -i -x 42 -u -1 -I 300 -R 50
1155437552 M * Bertl this gives me here happily advancing idle time
1155437590 M * Bertl (you might want to limit that to a single cpu for testing)
1155437613 M * ekc2 nope.idle time is stalled at 0
1155437697 M * ekc2 see: http://rafb.net/paste/results/fCYj2a56.html
1155437700 M * Bertl interesting, so token time advances at ~ 1 per second or so
1155437720 M * Bertl but idle time not, and the context is on hold, right?
1155437733 M * ekc2 yes. context is on hold. same as before
1155437737 M * ekc2 HZ=1000
1155437757 M * Bertl okay, let me try a few things here, maybe I can make my setup match that
1155437859 M * Bertl ah, and the system _is_ actually idle, yes?
1155437907 M * ekc2 vtop -i is listing cpuhog and top
1155438203 M * Bertl and plenty of idle time, I presume
1155438225 M * ekc2 how would I measure that?
1155438232 M * Bertl Cpu(s):  1.0% us,  2.9% sy,  0.0% ni, 95.5% id,  0.0% wa,  0.6% hi,  0.0% si
1155438237 M * Bertl id = idle
1155438247 M * Bertl (in the vtop view)
1155438309 M * Bertl (you might need to toggle that with 't', if your default hides it)
1155438419 M * ekc2 id=97.5%
1155438436 M * ekc2 cpuhog is halted, of course
1155438436 M * Bertl okay, so that actually gives a good clue
1155438457 M * Bertl it seems that the conditions for idle time skipping are not met (due to a bug :)
1155438488 M * Bertl which can actually have two causes, a) the check for unholding gets some calculations wrong
1155438515 M * Bertl or b) the criteria, like actually having tasks on hold are not met/checked
1155438542 M * ekc2 hmm. very subtle bug, though. because I can fiddle around with the idle rate/interval and after several attempts it starts working
1155438578 M * Bertl yep, well, if it would be obvious, we would have hit it earlier :)
1155438593 M * ekc2 yup. true.
1155438603 M * Bertl so I'm opting for the calculations going wrong somehow
1155438658 M * Bertl nevertheless, we should be able to track this down quite easily, as we have a 'special' trigger (the idle task) at hand
1155438740 M * ekc2 yes. 
1155438743 M * Bertl we also know that the idle time does not advance at that point
1155438750 M * ekc2 right
1155438757 M * Bertl while the token time _and_ the tokens increase
1155438761 M * matti Bertl: ;-)
1155438768 M * Bertl hey matti!
1155438801 M * ekc2 right. there's no problem with the non-idle tokens increasing. they always go up by 1/second
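The "1/second" figure follows from the settings quoted earlier in the log (rate/interval = 1/1000 with HZ=1000, bucket minimum 50). A small sketch of that arithmetic, assuming the interval is counted in timer ticks (the log implies this but does not say so); the function names are made up for illustration:

```python
def tokens_per_second(fill_rate, interval, hz):
    """fill_rate tokens are added every `interval` ticks; hz is ticks per second."""
    return fill_rate * hz / interval


def seconds_to_reach(tokens_min, tokens_now, fill_rate, interval, hz):
    """Seconds a held context waits until its bucket is back at tokens_min."""
    missing = max(0, tokens_min - tokens_now)
    return missing / tokens_per_second(fill_rate, interval, hz)


if __name__ == "__main__":
    print(tokens_per_second(1, 1000, 1000))        # normal refill rate
    print(seconds_to_reach(50, 0, 1, 1000, 1000))  # empty bucket up to the minimum
```

Under these assumptions an empty bucket needs 50 seconds of wall-clock time to reach the minimum again, which is the wait that idle-time skipping is meant to shorten.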
1155438819 M * Bertl ekc2: so, now can we wait until the min value (50) is reached and see what happens then? probably nothing, i.e. the tokens are subtracted and the process gets a short period of cpu, right?
1155438828 M * matti I need to ask.
1155438839 M * matti What are both of you doing? :)
1155438846 M * matti Looks interesting.
1155438850 M * ekc2 right. i tried that. and nothing changes. cpuhog consumes all the tokens and the guest returns to halt
1155438872 M * Bertl matti: we are debugging the scheduler by 'thinking' about it :)
1155438879 M * ekc2 haha
1155438892 M * matti ;-)
1155438944 M * Bertl well, most people fire up a debugger or something and spend hour after hour looking for anomalies
1155438960 M * ekc2 ooooo. i found something
1155438967 M * Bertl 'we' on the contrary, use our intellectual ... yeah?
1155438990 M * ekc2 rate_idle/idle_interval = 50/300 is stuck at zero
1155439003 M * ekc2 so, i changed rate/interval to 10/300 and idle time advances
1155439020 M * ekc2 then I change rate/interval back to 50/300 and idle time advances
1155439044 M * ekc2 let me try this again to confirm
1155439086 M * doener hm, certain values cause it to die?
1155439137 M * Bertl hey doener! nobody diet yet :)
1155439151 M * doener everyone's getting fat? SCNR ;)
1155439157 M * Bertl lol
1155439193 M * matti LOL
1155439242 M * Bertl ekc2: do you have more than one context running or just a single one?
1155439247 M * ekc2 just one
1155439269 M * Bertl could you try to start a second one with the same values?
1155439273 M * ekc2 tried 10/300 again. 10/300 worked, 50/300 did not. and I had to go through half a dozen different values before 50/300 worked again
1155439275 M * doener Bertl: how to parse the cpu lines in /proc/..../sched?
1155439283 M * ekc2 ok
1155439335 M * Bertl doener: see sched_proc.h
1155439356 M * doener ah, .h, just grepped the .c ones...
1155439392 M * ekc2 with two contexts running, same behaviour
1155439413 M * ekc2 both have rate/interval=1/1000 and idle_rate/idle_interval=50/300
1155439422 M * Bertl okay, good, I thought it might be related to a scheduler starvation
1155439496 M * Bertl I think I know what happens, but I have to double check that with the source code ...
1155439968 M * Bertl yep, I guess I found it
1155439990 M * ekc2 yeah?
1155439998 M * Bertl the 'magic' condition is that the 'missing' tokens have to be greater than the fill_rate
1155440034 M * Bertl kernel/vserver/sched.c ~177
1155440050 M * Bertl seems I 'forgot' the else case :)
1155440094 M * Bertl let's see if that changes something for you :)
1155440105 M * ekc2 wow. i would have never found that
1155440124 A * doener was looking about 20 lines above that :)
1155440148 M * Bertl but let me also double check that I didn't miss something here
1155440174 M * Bertl this code part is quite tricky, and I did put unusually many comments there
1155440285 M * Bertl okay, doener , let's go through it with a few examples, shall we?
1155440328 A * doener is still trying to grasp that thing... never looked at it before
1155440350 M * Bertl okay, line ~100
1155440373 M * Bertl let's assume delta = 40 for a start
1155440387 M * Bertl interval[1] = 300
1155440406 M * Bertl so we set delta_min[1] to 40
1155440458 M * Bertl now we skip to ~175 (because nothing else happens before)
1155440483 M * Bertl tokens = tokens_min -tokens
1155440494 M * Bertl let's say we have 10 tokens in the bucket, 50 min
1155440504 M * Bertl so we get tokens = 40 too
1155440551 M * Bertl fill_rate[1] = 50, so nothing is changed here
1155440584 M * Bertl therefore it should lead to an idle time advancement of 40
1155440631 M * Bertl that's what I expected .. and which is correct
1155440668 M * Bertl now let's do another run, with delta > interval
1155440682 M * Bertl let's assume the delta is already 300
1155440702 M * Bertl >= is sufficient here
1155440707 M * doener where does that advancement happen? or is this what you intend to fix? (ie. missing else case)
1155440742 M * Bertl in the previous case, idle time was advanced by 40
1155440759 M * Bertl that would have happened several times before we reach/cross 300
1155440784 M * Bertl (possibly unholding tasks and such, but let's ignore that for now)
1155440798 M * doener idle time as in sched_pc->idle_time?
1155440822 M * Bertl as in *idle_time, the sched_pc->idle_time is just for the delta
1155440846 M * Bertl so we have delta >= 300 here and do math/update
1155440857 M * Bertl tokens = 1
1155440872 M * Bertl integral = 1*50
1155440889 M * Bertl 1*300 (it's late :)
1155440906 M * Bertl tokens = 1*50
1155440922 M * Bertl delta_min[1] = 300-300 = 0
1155440951 M * Bertl sched_pc->idle_time is advanced, same with tokens
1155440981 M * Bertl this is a case we can never hit here, as we would see it as an idle time advancement, right?
1155441001 M * doener hm, the "idle time was advanced by 40" is as in "was advanced from the outside", yeah?
1155441010 M * Bertl yes
1155441017 M * doener ok, now I get it :)
1155441073 M * Bertl so, IMHO we must hit a case where the following is true:
1155441102 M * Bertl delta < sched_pc->interval[1] (300)
1155441136 M * Bertl in which case delta_min[1] = delta
1155441164 M * Bertl and then either tokens > sched_pc->fill_rate[1] (50)
1155441189 M * Bertl or nothing is changed
1155441207 M * Bertl now we know that tokens are _usually_ below that in ekc2's case
1155441243 M * doener what are these delta_min values used for?
1155441245 M * Bertl so the delta itself must be 0 here
1155441268 M * Bertl the delta_min values contain the _skip deltas
1155441282 M * Bertl i.e. how much the idle time is advanced for example
1155441322 M * doener ok, the last call in vx_try_unhold probably updates the rq->idle_time using that value, right?
1155441345 M * doener (vxm_rq_max_min)
1155441354 M * Bertl that's just the monitor
1155441382 M * Bertl vx_try_skip does that
1155441399 M * doener ok
1155441427 M * Bertl but only if the delta is non zero
1155441443 M * Bertl so we basically have a deadlock here :)
1155441475 M * doener if we hit a zero delta, we stay there, right?
1155441498 M * Bertl yep, time is not advanced, the queue is not unlocked
1155441525 M * Bertl if it is slightly off by something, everything starts working
1155441553 M * Bertl now, one important thing is that we actually _can_ advance idle time unconditionally when we get idle
1155441586 M * Bertl regardless of the fact that maybe no single context can be freed
1155441633 M * Bertl the important question here is now, how much time _should_ we advance in the optimal case
1155441666 M * Bertl and this is what the recalc() is supposed to provide, but it doesn't (in this case)
1155441814 M * Bertl so what we need to do is a single interval in the else case
1155441862 M * Bertl so I'd say we want at line 180
1155441887 M * Bertl else delta_min[1] += sched_pc->interval[1]
1155441930 M * Bertl (or probably, to be precise)
1155441946 M * Bertl delta_min[1] = sched_pc->interval[1] - delta_min[1];
1155441979 M * Bertl i.e. the required delta until we get at least one integral interval
1155442011 M * doener sounds reasonable
1155442013 M * Bertl does that make sense?
1155442081 M * doener just fast-forward to lose as little time as possible without handing out too many tokens
1155442085 M * Bertl testing an artificially 'configured' setup here without and with fix ... will take a minute or two
1155442097 M * Bertl doener: yep, precisely
1155442119 M * Bertl and the 'smallest' value over all suspended contexts is the winner
1155442165 M * Bertl ekc2: still awake? :)
1155442173 M * ekc2 yup. following along closely
1155442188 M * Bertl excellent, what's your opinion? does that match your case?
1155442205 M * ekc2 i think so. building kernel for test now
1155442217 M * Bertl excellent! tx!
1155442279 M * Bertl of course, we should do similar for the 'normal' hard scheduling case too, which suffers from the same issue, except for the fact that it is not visible as time advances without help :)
1155442380 M * Bertl looks good here, I can reproduce the issue and it is fixed by the change :)
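The walkthrough above can be condensed into a toy model. This is not the code from kernel/vserver/sched.c, only a sketch of the reasoning in the log: `delta` is how far idle time has already moved, `delta_min` is the skip that gets requested, and a skip of zero is never applied, so the context stays on hold. The branch for `missing > fill_rate` is an assumption of this sketch, since the log does not spell out what that branch computes.

```python
def idle_skip_delta(delta, tokens, tokens_min, fill_rate, interval, with_fix):
    """Toy model of the idle-time skip requested for one held context."""
    # Whole intervals are credited as tokens elsewhere; only the remainder counts here.
    delta_min = delta % interval
    missing = tokens_min - tokens
    if missing <= 0:
        return 0  # enough tokens already, nothing to wait for
    if missing > fill_rate:
        # ASSUMPTION (not stated in the log): wait as many whole intervals
        # as it takes to refill the missing tokens.
        intervals = -(-missing // fill_rate)  # ceiling division
        return intervals * interval - delta_min
    if with_fix:
        # The 'else' case from the log: remaining time until one full interval.
        return interval - delta_min
    return delta_min  # before the fix: unchanged, and zero when delta is zero


if __name__ == "__main__":
    # Values from the log: 10 tokens, minimum 50, idle rate/interval 50/300.
    for delta in (0, 40):
        for with_fix in (False, True):
            print(delta, with_fix, idle_skip_delta(delta, 10, 50, 50, 300, with_fix))
```

In this model a bucket minimum of 50 with an idle fill rate of 50 can never have more than 50 tokens missing, so the first branch is never taken and a zero delta stays zero. That is consistent with 50/300 stalling while 1/6 and 10/300 advanced.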
1155442683 M * ekc2 bertl: amazing
1155442686 M * ekc2 it works!! 
1155442690 M * ekc2 thanks :)
1155442714 M * Bertl excellent!, thank you!
1155442752 M * Bertl without your testing we wouldn't have been able to fix it ...
1155442790 M * ekc2 i guess. but i would never ever have found that bug. so subtle
1155442816 M * ekc2 i guess idle time isn't being used that widely yet
1155442834 M * Bertl no, naturally because of the missing tool support :/
1155442860 M * Bertl here is the complete patch for both cases:
1155442863 M * Bertl http://vserver.13thfloor.at/Experimental/delta-sched-fix01.diff
1155442873 M * ekc2 awesome
1155442890 M * Bertl but as I already said, the hardcpu case does not really show up
1155442902 M * ekc2 right
1155442924 M * Bertl it just takes a little longer or slightly misadjusts the idle task
1155442959 M * Bertl nevertheless, would be great if you could give it a thorough spin with debugging enabled
1155442969 M * ekc2 oh, you fixed that, too. nice.
1155442985 M * ekc2 ok. will definitely do that tomorrow.
1155443000 M * Bertl just to see if something unusual pops up, scheduler stuff is always black magic ... you turn a knob and the machine changes 100deg
1155443029 M * Bertl yeah, no need to hurry on that, I think we fixed it properly
1155443058 M * doener somehow knob + machine + 100deg made "centigrade" pop up in my brain :)
1155443063 M * ekc2 well, i'm going to be making heavy use of hard cpu limits with idle time. so, i'll let you know if i find anything
1155443094 M * Bertl please do so, we are always glad to receive feedback of any kind
1155443106 M * ekc2 thanks again, bertl! g'night all!
1155443112 Q * ekc2 
1155443134 M * Bertl that's actually a good idea .. I guess I'm off to bed too :)
1155443145 M * doener good night then, will do the same :)
1155443152 M * Bertl yeah, cya!
1155443158 N * Bertl Bertl_zZ
1155444602 Q * s0undt3ch Ping timeout: 480 seconds
1155444627 M * ebiederm sleep well.
1155448813 J * daniel15 ~dansoftau@220-245-135-167-vic-pppoe.tpgi.com.au
1155448833 M * daniel15 Is there a way to install Redhat Linux on a VServer if the host is Debian?
1155449436 M * daniel15 ??
1155449691 M * ebiederm Why would it be a problem?
1155449760 M * daniel15 Well, I'm not sure how to
1155449776 M * daniel15 If I'm doing a Debian VServer, I'd do something like vserver test build ............. -m debootstrap -- -d etch
1155449797 M * daniel15 But I'm not sure of how to install other Linux distributions
1155449849 M * lylix do you need "redhat", or something like fedora/centos?
1155449930 M * daniel15 Well, I haven't decided what I'm doing yet (and don't know much about Redhat). Just something that's free, and works fine on a VServer, I suppose :)
1155449964 M * lylix you know how to use tar to untar .tar.gz files?
1155449969 M * daniel15 Yes
1155449981 M * lylix http://lylix.net/templates.html
1155449996 M * lylix there are images there to select from
1155450023 M * lylix all you have to do is untar them in a vserver directory, ie. /vserver/NAME/
1155450045 M * lylix and run the setup command to populate /etc/vservers, and away you go
1155450058 M * daniel15 Oooh
1155450060 M * daniel15 Thank you :)
1155450068 M * lylix ;)
1155450083 M * lylix you dont need to bootstrap
1155450092 M * lylix they are already built, etc
1155450104 M * daniel15 Just out of curiosity, if I wanted to install a distribution manually (from an installation CD), how would I do that?
1155450148 M * lylix heh... idk what the "best" way is... but we typically just install it on a spare box and transfer it into a tarball
1155450172 M * lylix but many times the created image requires many tweaks
1155450185 M * daniel15 OK, so I'm probably best off using your images
1155450188 M * daniel15 Thanks anyways
1155450189 M * daniel15 :)
1155450214 M * lylix unless its something uber-special, the images will get you started with a basic or LAMP enviro
1155450225 M * lylix and the package tools are there to build the system up
1155450232 M * lylix to do anything you want
1155450239 M * daniel15 OK, Thanks
1155450331 M * lylix oh, and dont untar them over your hosts root partition ;)
1155450384 M * lylix had a guy do that today actually... LOL!
1155450430 M * daniel15 :P
1155450436 M * daniel15 Anyways, talk to you later
1155450437 M * daniel15 I have to go
1155450440 Q * daniel15 
1155450769 M * Skram :P
1155451086 Q * tokkee Server closed connection
1155451088 J * tokkee tokkee@casella.verplant.org
1155452988 J * Viper0482 ~Viper0482@p5497658A.dip.t-dialin.net
1155453792 Q * gerrit Ping timeout: 480 seconds
1155454351 J * rgl Rui@217.129.151.190
1155454355 M * rgl hilo
1155454372 Q * matled_ Server closed connection
1155454372 J * matled ~matled@85.131.246.184
1155456162 J * bonbons ~bonbons@83.222.36.236
1155458134 Q * Curus_ Remote host closed the connection
1155458293 J * dna ~naucki@p54BCF423.dip.t-dialin.net
1155459818 J * debugger_ ~Rui@217.129.151.190
1155460232 J * pisco ~pampel@p5087A5F9.dip0.t-ipconnect.de
1155460236 P * pisco 
1155460270 Q * rgl Ping timeout: 480 seconds
1155464510 Q * michal_ Ping timeout: 480 seconds
1155465438 J * michal_ ~michal@www.rsbac.org
1155465522 Q * DreamerC Quit: leaving
1155465541 J * DreamerC ~dreamerc@59-112-7-8.dynamic.hinet.net
1155465577 Q * DreamerC 
1155465594 J * DreamerC ~dreamerc@59-112-7-8.dynamic.hinet.net
1155465626 Q * DreamerC 
1155465718 J * DreamerC ~dreamerc@59-112-7-8.dynamic.hinet.net
1155467009 J * rgl Rui@217.129.151.190
1155467450 Q * debugger_ Ping timeout: 480 seconds
1155467874 J * coocoon ~coocoon@p54A07985.dip.t-dialin.net
1155468080 J * yarihm ~yarihm@84-74-17-70.dclient.hispeed.ch
1155468759 J * meandtheshell ~markus@85-124-37-184.dynamic.xdsl-line.inode.at
1155471022 J * dna_ ~naucki@p54BCF423.dip.t-dialin.net
1155471281 Q * dna Ping timeout: 480 seconds
1155471761 J * gerrit ~kvirc@dslb-084-060-254-162.pools.arcor-ip.net
1155471802 J * dna ~naucki@p54BCF423.dip.t-dialin.net
1155472106 Q * dna_ Ping timeout: 480 seconds
1155473146 J * dna_ ~naucki@p54BCF423.dip.t-dialin.net
1155473492 Q * dna Ping timeout: 480 seconds
1155474220 J * debugger_ ~Rui@217.129.151.190
1155474407 J * dna___ ~naucki@p54BCF423.dip.t-dialin.net
1155474516 J * dna ~naucki@p54BCD70C.dip.t-dialin.net
1155474680 Q * rgl Ping timeout: 480 seconds
1155474837 Q * dna_ Ping timeout: 480 seconds
1155474892 Q * dna___ Ping timeout: 480 seconds
1155476852 Q * phedny Ping timeout: 480 seconds
1155476873 J * s0undt3ch yllxilkc@bl7-242-233.dsl.telepac.pt
1155478448 N * Bertl_zZ Bertl
1155478452 M * Bertl morning folks!
1155478879 M * doener mornign Bertl 
1155478886 M * doener s/ign/ing/
1155479805 Q * coocoon Quit: KVIrc 3.2.0 'Realia'
1155480041 J * BoSS ~RJ@85.104.50.113
1155480099 M * Bertl welcome BoSS!
1155480105 M * BoSS thx
1155480109 M * BoSS asl bertl
1155480110 M * BoSS ;D
1155480124 M * BoSS where i am now?
1155480133 M * BoSS who will tell me?
1155480136 M * mnemoc O_O
1155480155 M * Bertl BoSS: depends, you could be right or wrong :)
1155480205 M * BoSS :)
1155480207 M * BoSS o yea 
1155480207 M * BoSS :D
1155480208 M * BoSS depends
1155480338 M * Bertl well, check the topic for hints .. maybe have a look at the web site :)
1155480523 Q * BoSS Quit: < S O H B E T > www.Sohbet.Net <http://www.Sohbet.Net>
1155481098 M * waldi Bertl: can you recheck the patch?
1155481114 M * Bertl sure, sec
1155481201 M * Bertl could you do the diff with -NurpP ?
1155481255 M * waldi nope
1155481276 M * Bertl hmm, then it will take a little, have to figure where the code goes
1155481308 M * waldi i have to regenerate the tree ...
1155481359 M * debugger_ hey guys :D
1155481361 N * debugger_ rgl
1155481363 Q * michal_ Ping timeout: 480 seconds
1155481410 J * debugger_ Rui@217.129.151.190
1155481700 M * doener which patch?
1155481707 M * Bertl http://194.39.182.225/linux/vserver-bindmount.patch
1155481749 M * Bertl waldi: one comment right now: the first check should be
1155481764 M * Bertl (old_nd.mnt->mnt_flags & MNT_NODEV) &&
1155481775 M * Bertl     vx_ccaps(VXC_SECURE_MOUNT)
1155481778 N * debugger_ rgl_
1155481802 M * Bertl waldi: i.e. first the old check, then the caps, and both on a separate line
1155481811 M * rgl_ Bertl, just to check, did you receive my email?
1155481830 M * waldi hmm
1155481834 M * Bertl rgl_: iirc. I even replied to it :)
1155481862 M * rgl_ Bertl, oh, if you did, I didn't get it :(
1155481863 M * Roey hey all
1155481865 Q * rgl Ping timeout: 480 seconds
1155481865 M * Roey Bertl:  guess what,
1155481881 M * Roey Bertl:  I figured that over the other choices, vserver would be easiest.
1155481888 M * Roey Bertl:  since I don't need to do the openvpn thing
1155481891 M * Bertl how so?
1155481904 M * Roey Bertl:  we'll have openvpn over a different dedicated server.
1155481943 M * rgl_ Bertl, odd, it's not even in my spam folder :(   can you please resend it?
1155481974 M * Roey Bertl:  that leaves us with postfix proxy, external dns, and web services.  These can be hosted fine on vserver (in fact, I've set this up long ago on the server, the usual configure thingie; I just haven't switched it 'on' yet)
1155481977 M * Roey Bertl:  HI BERTL!
1155481979 M * Roey I've had coffee
1155482017 J * michal_ ~michal@www.rsbac.org
1155482035 M * doener Bertl: hm, is that do_mount and do_remount?
1155482055 M * doener (or whatever the remount thing was called)
1155482080 M * Bertl probably, will have to check, that's why I asked for a proper diff
1155482354 M * waldi why does util-vserver nuke /var/run?
1155482397 M * waldi bah
1155482702 M * sid3windr what's required for raw_icmp ccap?
1155482720 M * sid3windr I upgraded to 0.30.210 finally for utils, so now it accepts it, but mtr/traceroute still don't work from the vserver
1155482720 Q * Roey Ping timeout: 480 seconds
1155482876 M * Bertl sid3windr: traceroute does not use raw icmp
1155482911 M * Bertl sid3windr: don't know about mtr, but I suspect it requires raw sockets too (just like traceroute), use tracepath instead
1155482923 M * sid3windr yeah
1155482926 M * sid3windr so that doesn't help?
1155482935 M * sid3windr mtr uses raw sockets too..
1155482944 M * sid3windr no hope for that unless full network access is given?
1155482947 M * Bertl no, raw sockets mean that the tool can spoof/capture any packets
1155482962 M * Bertl at least I do not want that on my guests :)
1155482987 M * Bertl if you give CAP_NET_RAW, it'll work
1155483122 J * coocoon ~coocoon@p54A07985.dip.t-dialin.net
1155483140 M * daniel_hozac hello everyone!
1155483307 M * daniel_hozac waldi: FHS compliance, IIRC. and only for sysv initstyle where the sysinit stage of the guest's boot isn't run.
1155483469 M * Bertl hey daniel_hozac! how was your vacation?
1155483505 M * daniel_hozac it was great.
1155483561 M * Bertl good to hear!
1155483572 M * daniel_hozac so what did i miss? :)
1155483636 M * Bertl we fixed a scheduler bug :)
1155483673 M * Bertl the wiki (mediawiki) has progressed a lot
1155483674 J * Roey ~katz@h-69-3-4-130.mclnva23.covad.net
1155483689 M * Bertl quota testing is rolling again
1155483801 M * daniel_hozac ah, good.
1155483951 M * Bertl and I almost finished combing through 2.0.2 :)
1155483998 M * daniel_hozac cool, so we should see a release RSN? ;)
1155484055 J * phedny ~mark@volcano.p-bierman.nl
1155484095 M * Bertl I'd say so, probably one run through PLM and after that, we should be fine
1155484693 M * Bertl okay, off for now, back a little later ... 
1155484708 N * Bertl Bertl_oO
1155485054 M * Hollow heya daniel_hozac!
1155485477 M * daniel_hozac hey!
1155485564 M * Savvy hey, anyone got this warning : http://pastebin.ca/129631 ?
1155485728 Q * rgl_ Quit: Fui embora
1155486136 Q * coocoon Quit: KVIrc 3.2.0 'Realia'
1155486415 M * ebiederm Q:  Is setting a larger value of pid_max anything that users are doing with vserver?
1155487101 J * coocoon ~coocoon@p54A055B6.dip.t-dialin.net
1155487494 N * _fs fs
1155488039 Q * coocoon Quit: KVIrc 3.2.0 'Realia'
1155489604 J * pisco ~pampel@p5087A6E1.dip0.t-ipconnect.de
1155489956 J * _mcp ~hightower@wolk-project.de
1155490013 Q * mcp Read error: Connection reset by peer
1155490015 N * _mcp mcp
1155493878 J * ntrs_ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com
1155493878 Q * ntrs__ Read error: Connection reset by peer
1155493965 Q * shedi Read error: Connection reset by peer
1155495034 J * ekc ~ekc@netblock-66-245-252-180.dslextreme.com
1155495608 J * dna_ ~naucki@p54BCD70C.dip.t-dialin.net
1155496015 Q * dna Ping timeout: 480 seconds
1155496371 J * liquid3649_ ~Viper0482@p5497658A.dip.t-dialin.net
1155496391 J * dna ~naucki@p54BCD70C.dip.t-dialin.net
1155496555 Q * Viper0482 Ping timeout: 480 seconds
1155496555 J * shedi ~siggi@inferno.lhi.is
1155496680 Q * dna_ Ping timeout: 480 seconds
1155496695 J * Viper0482 ~Viper0482@p5497658A.dip.t-dialin.net
1155496865 Q * liquid3649_ Ping timeout: 480 seconds
1155497128 J * dna_ ~naucki@p54BCFA68.dip.t-dialin.net
1155497224 J * dna___ ~naucki@p54BCFA68.dip.t-dialin.net
1155497466 Q * dna Ping timeout: 480 seconds
1155497569 J * dna ~naucki@p54BCFA68.dip.t-dialin.net
1155497616 Q * dna_ Ping timeout: 480 seconds
1155497765 Q * yarihm Quit: Leaving
1155497784 Q * Savvy Quit: IceChat - Keeping PC's cool since 2000
1155497799 N * Belu_zZz Belu
1155497961 Q * dna___ Ping timeout: 480 seconds
1155497984 J * dna_ ~naucki@p54BCFA68.dip.t-dialin.net
1155498206 Q * dna Ping timeout: 480 seconds
1155498795 J * dna ~naucki@p54BCFA68.dip.t-dialin.net
1155499056 Q * dna_ Ping timeout: 480 seconds
1155499124 Q * gerrit Remote host closed the connection
1155499300 J * shedii ~siggi@inferno.lhi.is
1155499341 J * s4edi ~siggi@inferno.lhi.is
1155499458 M * cehteh oh .. anyone applied for a vserver booth for the linuxtag essen?
1155499621 Q * shedi Ping timeout: 480 seconds
1155499786 Q * shedii Ping timeout: 480 seconds
1155500013 M * mnemoc anyone at essen? :)
1155500076 Q * ntrs_ Read error: Connection reset by peer
1155500083 J * ntrs_ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com
1155500235 Q * ekc Ping timeout: 480 seconds
1155500326 J * ntrs__ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com
1155500326 Q * ntrs_ Read error: Connection reset by peer
1155500449 J * ekc ~ekc@netblock-66-245-252-180.dslextreme.com
1155500486 Q * schimmi Quit: Verlassend
1155500804 J * shedii ~siggi@inferno.lhi.is
1155500913 Q * bonbons Quit: Leaving
1155501136 Q * s4edi Ping timeout: 480 seconds
1155501189 Q * shedii Quit: Leaving
1155501674 Q * Zaki[] Remote host closed the connection
1155501998 Q * Viper0482 Remote host closed the connection
1155502000 J * Zaki ~Zaki@212.118.121.51
1155503330 J * Wonka produziert@chaos.in-kiel.de
1155503338 J * s0undt3ch_ nxdify@bl7-243-151.dsl.telepac.pt
1155503427 Q * s0undt3ch Ping timeout: 480 seconds
1155503430 N * s0undt3ch_ s0undt3ch
1155503651 A * Belu is away (i'll be back later...)
1155503652 N * Belu Belu_zZz
1155503819 M * derjohn ebiederm, i've never heard about users needing more than 65k processes on a host.
1155504063 M * derjohn cehteh, essen is too far from here , but I would offer one day of my time.
1155504077 J * ntrs_ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com
1155504080 Q * ntrs__ Read error: Connection reset by peer
1155504126 M * Wonka re
1155504206 J * ntrs__ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com
1155504206 Q * ntrs_ Read error: Connection reset by peer
1155504242 M * daniel_hozac Bertl_oO: did you track down Aiken's COW NOFILE issue?
1155504381 J * coocoon ~coocoon@p54A06E11.dip.t-dialin.net
1155504504 J * ntrs_ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com
1155504504 Q * ntrs__ Read error: Connection reset by peer
1155504892 J * Aiken ~james@tooax8-072.dialup.optusnet.com.au
1155504961 M * michal_ hey guys...point me please to some unification howto
1155505163 M * ebiederm derjohn: Thanks.  Do you know if having several vservers going gives you a higher number of processes than a normal machine?
1155505264 M * daniel_hozac ebiederm: what does normal machine mean? what are the vservers doing?
1155505288 M * ebiederm Restating my question.
1155505303 M * daniel_hozac michal_: http://linux-vserver.org/alpha+util-vserver has a section on it.
1155505314 M * ebiederm I have noticed some scalability problems with the data structures that are used for pids.
1155505346 M * ebiederm In general machines running vservers do more work and are thus pushed harder than other machines.
1155505377 M * ebiederm So my question is does running vservers on a machine tend to up the average number of processes on a machine.
1155505393 M * ebiederm I expect it does but I have no real world numbers to work from.
1155505397 M * daniel_hozac well, it would have to.
1155505426 M * daniel_hozac a context will disappear unless there's a process in it, or it has the persistent flag set.
1155505445 M * daniel_hozac and running a guest without processes seems pretty useless.
1155505477 M * daniel_hozac (kind of like buying a computer and never turning it on...)
1155505580 M * ebiederm Right, I'm just trying to gauge how many processes are typical on a machine running vservers or other containers.
1155505622 M * ebiederm If I can spot a real trend where the number of processes are higher than in other situations the data structures need to be fixed.
1155505638 M * ebiederm Otherwise I can leave the data structures that are more optimal for small pid counts in place.
1155505645 M * ebiederm And otherwise be lazy :)
1155505661 J * dna_ ~naucki@p54BCFA68.dip.t-dialin.net
1155505739 M * derjohn ebiederm, well, I usually run not more than 10-15 guests on a host with current COTS hardware. that's roughly 5000-6000 processes per guest. Can there be a machine that stands more? 192 CPU Suns? I would guess the machine would be thrashing anyway ....
1155505801 M * daniel_hozac 5000-6000 processes per guest? wow.
1155505820 M * sid3windr                                                 
1155505820 M * derjohn daniel_hozac, 65K/10-15 guests .... is?
1155505825 M * sid3windr woops
1155505868 M * sid3windr I tried booting 2.6.17.7 with the patch from experimental and it just panics on boot
1155505870 M * daniel_hozac 4369.
1155505872 M * sid3windr :(
1155505879 M * derjohn daniel_hozac, this was only meant as an approx maximum .... (to avoid misunderstanding). a ps faxu |wc -l would probably show no more than 100 per guest
1155505887 M * ebiederm derjohn:  So you don't have many if any unused pids at all.  Interesting. 
1155505899 M * daniel_hozac derjohn: i guess ebiederm is looking for reality, not theory though ;)
1155505922 Q * dna Ping timeout: 480 seconds
1155505937 M * daniel_hozac ebiederm: i'd guess your typical guest runs at the very least 10 processes.
1155505990 M * derjohn ebiederm: usually less than 50, on heavily loaded guests maybe 100. (@daniel_hozac my cyrus spawns 30 'spare' children for pop etc.)
1155506013 M * daniel_hozac yeah, apache will do similar things.
1155506019 M * derjohn i may add that I only run typical web-environments (whatever 
1155506027 M * derjohn typical may be ;))
1155506058 M * ebiederm Basically anything less than 4K processes doesn't really have hash collisions so the hash table works perfectly.  After that things begin to degrade.
1155506085 M * ebiederm It sounds like things are still below 4K for most of the vserver users anyway.
1155506126 M * derjohn 4K per guest or per host?
1155506131 M * ebiederm per host.
1155506175 M * derjohn hm, well, still striving towards a patch that makes it into the mainline kernel?
1155506328 J * dna ~naucki@p54BCFA68.dip.t-dialin.net
1155506336 M * ebiederm This is the mainline kernel data structure I'm looking at.  I'm wondering if the current data structures will fall down because of increased load.
1155506403 M * ebiederm On 64bit machines it is possible to push pid_max to 4194304 or 4M processes.  If anyone uses a significant fraction of that the performance of our current data structures is not very good.
1155506405 M * derjohn ebiederm, there are surely people/ISPs out there who run several hundred guests on a 16GB/Quad Opteron setup. At least here in .de there are some who do so with openvz. if they run 50 processes per guest that would create collisions ....
1155506462 M * cehteh derjohn: quite far for me too ... but if i find a place where i can stay overnight maybe i come
1155506501 M * ebiederm derjohn: Yes.  Actually anything short of increasing pid_max is livable.  The worst case hash chain with a 32K pid_max (the default) is only 9 entries.
1155506557 M * ebiederm Unfortunately at 4M processes the worst case hash chain is 1024 entries.  But it doesn't sound like anyone except people with huge cpu counts is coming anywhere near that today.
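The chain lengths quoted above can be sanity-checked with pigeonhole arithmetic. This is a minimal sketch, assuming a pid hash table with 4096 buckets; that bucket count is an assumption inferred from the "less than 4K processes" remark, not a figure stated in the channel, and `min_worst_chain` is a hypothetical helper name.

```python
import math

# Assumed bucket count: the log only says collisions start to matter
# once a host goes past roughly 4K processes.
ASSUMED_BUCKETS = 4096


def min_worst_chain(pid_count: int, buckets: int = ASSUMED_BUCKETS) -> int:
    """Lower bound on the longest chain when pid_count entries share `buckets` buckets."""
    return math.ceil(pid_count / buckets)


if __name__ == "__main__":
    for pid_max in (32768, 4194304):
        print(pid_max, min_worst_chain(pid_max))
```

Under that assumption a perfectly even hash gives chains of 8 at a 32K pid_max, close to the quoted worst case of 9, and 1024 at 4M, which matches the figure quoted above.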
1155506577 M * derjohn ebiederm, these are structures not related to vserver at all? i.e. you are talking about general hashes for processes? maybe it's time for a config option at compile time: "use biederman hashes for process lookup (use this option for machines with a large number of processes)"?
1155506614 M * ebiederm derjohn: Exactly.  These structures are not vserver related. 
1155506649 M * derjohn ebiederm, maybe the openvz developers have such hashes already? Their target market is IMVHO large ISPs ...
1155506673 M * michal_ daniel_hozac: i cannot see in vhashify configuration how it would know which vserver is a reference vserver
1155506693 M * daniel_hozac michal_: hashification doesn't use a reference guest.
1155506699 M * michal_ daniel_hozac: i already have one guest - and would like to 'clone' it
1155506708 Q * renihs Quit: Leaving
1155506711 M * daniel_hozac devel with COW?
1155506717 M * michal_ yep
1155506735 M * ebiederm derjohn:  The data structures are good enough and pids used rarely enough that unless someone was looking they probably would not have noticed a performance hit.
1155506737 Q * dna_ Ping timeout: 480 seconds
1155506760 M * ebiederm I'm just trying to see if the trend will push us over the edge without realizing it.
1155506765 M * daniel_hozac personally i just use find /vservers/a -type f -print0 | xargs -0 setattr --iunlink; cp -al /vservers/{a,b} for that.
1155506815 M * daniel_hozac (of course, hashification is a more long-term solution, allowing you to shrink the sizes of the guests once they've updated files etc.)
1155506829 M * daniel_hozac but you could always run that before.
1155506834 M * michal_ i'll definitely stay with hashification
1155506865 M * daniel_hozac AFAIK there's no COW-knowing clone available yet.
1155506884 M * michal_ ok. we misunderstood each other.
1155506898 M * daniel_hozac i've been meaning to hack one together because i'm lazy, but the above has worked so far.
1155506914 M * daniel_hozac (mostly because my real hosts are still running stable)
1155506916 M * lylix when does bertl usually pop back in?
1155506931 M * daniel_hozac whenever he's done with whatever it is he's doing.
1155506936 M * lylix yoi
1155506941 M * daniel_hozac :)
1155506957 M * lylix k, think we've located a token bucketing bug in x86_64... have to do some further testing though
1155506987 M * lylix gentoo hosts running same kernel/vserver tree and totally diff behaviors
1155507015 M * michal_ i've got full/working vserver-sarge. now have created _empty_ vserver and am wondering how to 'fill' it using hashification. that's all :)
1155507071 M * daniel_hozac michal_: hashification won't help you, neither will unification (AFAIK, i've never actually used it).
1155507110 M * daniel_hozac michal_: but basically, the idea is to traverse the tree of the base guest, and then for every iunlink|immutable file you link to it, and for every other file, you copy it.
1155507175 M * daniel_hozac michal_: should be a pretty trivial script, and if you hashify the base guest prior to doing it, the new guest will be all hashified when you're done.
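The "pretty trivial script" described above could look roughly like the following. This is a sketch under stated assumptions, not an existing util-vserver tool: `clone_guest` and the `is_cow_safe` callback are hypothetical names, and the callback stands in for a real check that iunlink|immutable are both set on a file, which the Python standard library cannot query.

```python
import os
import shutil
from typing import Callable


def clone_guest(base: str, target: str, is_cow_safe: Callable[[str], bool]) -> None:
    """Clone the tree at `base` into `target`.

    Files for which is_cow_safe(path) is true are hardlinked, everything
    else is copied. is_cow_safe is a hypothetical hook; on a real host it
    would report whether iunlink|immutable are set on the file.
    """
    for dirpath, dirnames, filenames in os.walk(base):
        rel = os.path.relpath(dirpath, base)
        dest_dir = os.path.normpath(os.path.join(target, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in dirnames + filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dest_dir, name)
            if os.path.islink(src):
                # Recreate symlinks instead of following them.
                os.symlink(os.readlink(src), dst)
            elif os.path.isfile(src):
                if is_cow_safe(src):
                    os.link(src, dst)        # shared inode, relies on COW link breaking
                else:
                    shutil.copy2(src, dst)   # private copy
            # Real directories are created when os.walk descends into them.
            # Device nodes, FIFOs and sockets are ignored in this sketch.
```

Both trees have to live on the same filesystem for the hardlinks to work, which is also true of the cp -al approach.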
1155507201 M * michal_ what's the point of hashifying it then?
1155507314 M * daniel_hozac hmm? you mean if it can't help you clone guests?
1155507333 M * michal_ nah
1155507336 M * daniel_hozac the decreased disk and therefore memory usage is usually the biggest reason.
1155507342 M * michal_ after doing it all manually
1155507358 M * michal_ hashing is being used for...<inserthere>
1155507359 M * michal_ COW?
1155507368 M * daniel_hozac see above.
1155507402 Q * dna Ping timeout: 480 seconds
1155507426 M * daniel_hozac one of the benefits of COW is that you're able to do exactly what i said above (a cp -al after setting iunlink|immutable).
1155507476 M * michal_ cp -al after iunlink|immutable are set by definition will give me 'the decreased disk and therefore memory usage' ;)
1155507528 M * daniel_hozac exactly.
1155507564 M * michal_ mhm. so...where is the place hashing goes into action?
1155507804 M * daniel_hozac during hashification.
1155507826 M * doener assume you have a bunch of vservers, not necessarily cloned, maybe updated for a number of times (say originally it was debian sarge, now some are sid)
1155507829 M * daniel_hozac so it won't have to compare the file contents of every single file.
1155507849 M * doener hashifying allows you to replace common files with hardlinks and make them COW
1155507865 M * michal_ hey, but i can do it maually
1155507869 M * michal_ *manually
1155507899 M * doener the cp -al thing only works for cloning
1155507906 Q * pisco Ping timeout: 480 seconds
1155507919 M * doener hashifying always works, no matter what state the vservers are in
1155507975 M * michal_ ok...so. what steps does vhashify perform? it takes a file, checks that it is not on an exclude list. if it is not, then it calculates the hash and...?
1155508104 M * daniel_hozac if the file is already present in the hash directory, it creates a link to that file.
1155508130 M * daniel_hozac if it's not, a new file is created to which the old file is copied, iunlink|immutable set, and then linked to the old location.
1155508145 M * daniel_hozac (IIRC, doener knows more of the details)
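The two cases described above (already in the hash directory, or not yet) amount to a content-addressed hardlink store. The following is a minimal sketch of that idea and not vhashify's actual implementation: the SHA-256 digest, the flat store layout, and the names `file_digest` and `hashify_file` are assumptions, and the step that sets iunlink|immutable is only marked by a comment.

```python
import hashlib
import os
import shutil


def file_digest(path: str) -> str:
    """Hex digest of the file's content (SHA-256 is an assumption of this sketch)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def hashify_file(path: str, store: str) -> None:
    """Replace `path` with a hardlink to a content-addressed copy in `store`."""
    os.makedirs(store, exist_ok=True)
    stored = os.path.join(store, file_digest(path))
    if not os.path.exists(stored):
        # First time this content is seen: copy it into the store.
        shutil.copy2(path, stored)
        # A real tool would set iunlink|immutable on `stored` here.
    elif os.path.samefile(path, stored):
        return  # already linked to the store entry, nothing to do
    tmp = path + ".hashify-tmp"
    os.link(stored, tmp)
    os.replace(tmp, path)  # swap the hardlink into the old location
```

Calling `hashify_file` on identical files in two different guests leaves both paths pointing at the same inode, without ever comparing the two guests against each other directly.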
1155508262 M * doener daniel_hozac: you know that I'm scared by util-vserver source code ;)
1155508519 M * michal_ ok. none of it makes any sense to me, it seems illogical
1155508527 M * michal_ maybe i'll think about it tomorrow
1155508591 Q * ekc Ping timeout: 480 seconds
1155508915 M * michal_ thx for help guys anyway
1155508929 M * michal_ i'll be back on this topic until i know everything about it ;p
1155508976 J * ekc ~ekc@netblock-66-245-252-180.dslextreme.com
1155509002 Q * coocoon Quit: KVIrc 3.2.0 'Realia'
1155509239 M * michal_ anyway
1155509265 M * michal_ COW link breaking does not seem to work like it should here
1155509270 M * michal_ platinum:~ # ls -i /vservers/{debian_sarge,sarge2}/etc/ssh/sshd_config
1155509270 M * michal_ 33819174 /vservers/debian_sarge/etc/ssh/sshd_config  33819174 /vservers/sarge2/etc/ssh/sshd_config
1155509273 M * michal_ platinum:~ # ls -i /vservers/{debian_sarge,sarge2}/etc/ssh/sshd_config
1155509276 M * michal_ 33819174 /vservers/debian_sarge/etc/ssh/sshd_config  33819174 /vservers/sarge2/etc/ssh/sshd_config
1155509290 M * michal_ first one is before, second one after modification of that file
1155509298 Q * ntrs_ Read error: Connection reset by peer
1155509309 J * ntrs_ ~ntrs@68-188-51-87.dhcp.stls.mo.charter.com
1155509320 M * michal_ /vservers/debian_sarge/etc/ssh/sshd_config:ListenAddress 192.168.1.8
1155509320 M * michal_ /vservers/sarge2/etc/ssh/sshd_config:ListenAddress 192.168.1.8
1155509324 M * doener what does showattr say about these files?
1155509324 M * michal_ bad
1155509349 M * michal_ ----ui- /vservers/debian_sarge/etc/ssh/sshd_config
1155509349 M * michal_ ----ui- /vservers/sarge2/etc/ssh/sshd_config
1155509362 M * michal_ have run vhashify, then cp -al
1155509381 M * doener vhashify on the old vserver?
1155509406 M * michal_ Linux platinum 2.6.16.27-vs2.1.1-rc22 #1 PREEMPT Sat Aug 12 11:37:30 CEST 2006 i686 i686 i386 GNU/Linux
1155509445 M * doener ie. on the one that was copied?
1155509489 M * michal_ michal@platinum:~/code/vserver/linux-2.6.16.27-vs2.1.1-rc22> grep COW .config
1155509490 M * michal_ # CONFIG_BLK_DEV_COW_COMMON is not set
1155509490 M * michal_ CONFIG_VSERVER_COWBL=y
1155509524 M * michal_ hm, yes
1155509534 M * michal_ did vhashify on source vserver
1155509541 M * michal_ then copied
1155509659 M * doener maybe /etc is excluded by default?
1155509689 M * doener just doing cp -al is a bad idea unless you actually force immutable|iunlink
1155509713 J * Johnnie ~john@dynamic-acs-24-154-53-237.zoominternet.net
1155509732 M * michal_ showattr tells they are present
1155509762 M * doener hu? they were pretty non-present in the output you pasted
1155509794 M * michal_ so what is u and i ?
1155509813 M * doener no unlink and no immutable, U and I would mean that they are set
1155509838 M * doener - = impossible for this file, lower case: possible, but not set, upper case: set
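The legend above can be turned into a small decoder. This is a sketch based only on the convention stated in the channel (dash, lower case, upper case) and on u/i standing for iunlink/immutable; `decode_flags` and `is_cow_ready` are hypothetical helper names, and the string "----UI-" used in the usage note is a constructed example, not output pasted in the log.

```python
def decode_flags(flags: str) -> dict:
    """Map each position of a showattr-style flag string to (letter, state)."""
    states = {}
    for pos, ch in enumerate(flags):
        if ch == "-":
            states[pos] = ("-", "not possible for this file")
        elif ch.islower():
            states[pos] = (ch, "possible, but not set")
        else:
            states[pos] = (ch.lower(), "set")
    return states


def is_cow_ready(flags: str) -> bool:
    """True when both U (iunlink) and I (immutable) are shown as set."""
    return "U" in flags and "I" in flags
```

For the pasted "----ui-" lines `is_cow_ready` returns False, which is why writing through one hardlink changed both guests; a constructed "----UI-" would return True.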
1155509851 M * michal_ ups
1155510074 M * michal_ cool
1155510085 M * michal_ after tweaking it link was broken
1155510148 M * michal_ ok, so to sum it all up
1155510224 M * michal_ vhashify calculated a hash for a file, say it was bin/ls. made a hardlink in /vservers/.hash
1155510304 Q * Johnnie Quit: G'bye!
1155510332 M * michal_ so, where is that .hash dir actually being used?
1155510362 M * michal_ i could easily setattr --iunlink on files
1155510366 M * michal_ run cp -al
1155510370 M * michal_ and would get the same :)
1155510407 M * doener assume you have a third vserver that is being hashified
1155510429 M * doener the hash is calculated and used to lookup the file in /vservers/.hash
1155510452 M * doener otherwise you'd have no fast way to find identical files
1155510461 J * Johnnie ~john@dynamic-acs-24-154-53-237.zoominternet.net
1155510461 Q * Johnnie 
1155510484 J * Johnnie ~john@dynamic-acs-24-154-53-237.zoominternet.net
1155510560 M * michal_ so how did vunify do it?
1155510592 M * doener it relies on being provided with information on the files that are to be unified
1155510603 M * doener it can utilize rpm specs IIRC
1155510617 M * doener for non-rpm distros, using vunify is said to be quite hard
1155510656 M * doener and it obviously cannot handle differing filenames and/or locations within the vservers
1155510662 M * michal_ and - in which part vserver needs to know files are identical? say i'm a kernel COW code (lol;) - i see a write request to a file. it is clearly unified, i make a new one, copy over it, make this write, say thank you :)
1155510742 M * doener the kernel knows by looking at the hardlink count, but to create hardlinks you need to know if files are identical... you can't just randomly replace files with hardlinks...
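The link-count test and the "make a new one, copy over it, make this write" sequence can be mimicked in userspace. This is only an analogy to illustrate the idea, not the kernel's COW code path; `break_link_before_write` is a hypothetical name, and the real mechanism additionally depends on the iunlink|immutable flags being set.

```python
import os
import shutil


def break_link_before_write(path: str) -> bool:
    """Give `path` a private copy if its inode is shared; return True if a copy was made."""
    if os.stat(path).st_nlink <= 1:
        return False              # not shared, safe to write in place
    tmp = path + ".cow-tmp"
    shutil.copy2(path, tmp)       # new inode with the same content and metadata
    os.replace(tmp, path)         # swap the private copy into place
    return True
```

After the call, a write to `path` no longer shows up in the other guests that still link to the original inode.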
1155510764 M * doener cp -la just doesn't work to re-unify a vserver...
1155510987 M * doener daniel_hozac: wow, thread hijacking flood...
1155510995 M * michal_ last question and i'm heading to sleep. how do i upgrade 10 unified vservers with COW in place? and say, add some app to all of them (but with hardlinked-hashified way!)?
1155511026 M * daniel_hozac on the ml? i stopped sorting by threads when i realized too many people don't know how to properly use email...
1155511037 M * doener michal_: install it in everyone and then hashify them again
1155511119 M * daniel_hozac sid3windr: at least the trace is required. which init? the host's or the guest's?
1155511175 Q * ekc Ping timeout: 480 seconds
1155511333 M * michal_ wow
1155511338 M * michal_ it does work like a charm
1155511345 M * michal_ and now i can see it all clearly :)
1155511354 M * michal_ and vhashify's role also
1155511390 M * doener great :)
1155511433 M * michal_ last - i've removed unified file - how do i remove it from .hash db?
1155511441 M * michal_ (yes, really last ;)
1155511533 M * daniel_hozac i usually run find /vservers/.hash -links 1 -print0 | xargs -0 rm -f after hashifying my guests.
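The same cleanup can be expressed in Python for readers who want to see what the find/xargs pipeline selects. A minimal sketch, assuming the store is an ordinary directory tree; `prune_store` is a hypothetical name. A link count of 1 means no guest references the stored file any more.

```python
import os


def prune_store(store: str) -> list:
    """Remove store entries that no guest links to any more; return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(store):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            if os.stat(path).st_nlink == 1:   # only the store itself holds this inode
                os.remove(path)
                removed.append(path)
    return removed
```

As in the shell version, this should run after all guests have been hashified, since an entry that is about to be linked by the next guest also has a link count of 1 until then.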
1155511541 M * michal_ ok
1155511585 Q * PowerKe Quit: Oops, wrong button
1155511662 M * michal_ ok, sorry to bother you too long&happy to finally get it all. cya :) have a good whatever ;)
1155511671 M * doener cya