1174522082 N * Bertl_oO Bertl 1174522100 M * Bertl DoberMann[ZZZzzz]: tx for the info ... 1174523108 M * slack101 Bertl: things are going real smooth 1174523268 Q * meandtheshell Quit: Leaving. 1174523450 Q * boci^ Quit: Távozom 1174525563 J * ntrs ntrs@68-188-55-120.dhcp.stls.mo.charter.com 1174528581 M * Bertl slack101: good! 1174528683 M * slack101 :P 1174532128 Q * softi42 Ping timeout: 480 seconds 1174532635 Q * phedny Ping timeout: 480 seconds 1174532735 J * softi42 ~softi@p549D5054.dip.t-dialin.net 1174536220 M * Bertl okay, off to bed now ... have a good one everyone! cya! 1174536227 N * Bertl Bertl_zZ 1174536378 Q * ensc Killed (NickServ (GHOST command used by ensc_)) 1174536388 J * ensc ~irc-ensc@p54B4EA85.dip.t-dialin.net 1174538447 P * lylix 1174539150 Q * Johnnie Remote host closed the connection 1174540727 J * DoberMann_ ~james@AToulouse-156-1-66-60.w90-16.abo.wanadoo.fr 1174540835 Q * DoberMann[ZZZzzz] Ping timeout: 480 seconds 1174547087 N * DoberMann_ DoberMann 1174547933 M * spion hmmm 1174547935 M * spion 18799 root 25 0 120 36 28 S 50.0 0.0 5517:45 vcontext 1174547935 M * spion 16581 root 25 0 120 36 28 S 48.0 0.0 196:03.54 vcontext 1174547954 M * spion how can i find out what is the cause of this load? 1174547971 M * daniel_hozac http://archives.linux-vserver.org/200703/0118.html most likely 1174548097 M * spion okay 1174548125 J * phedny ~mark@ip56538143.direct-adsl.nl 1174548159 M * spion vlogin.c is related to login from users via "vserver vserverblah enter" ? 1174548168 M * spion or when is this function called? 1174548182 M * daniel_hozac yes. 1174548202 M * spion there wasn't any terminal attached when it happened 1174548236 M * daniel_hozac which is the problem. 1174548271 M * spion the problem is that no terminal is attached? 1174548310 M * spion strange .... the context is up since 08:24:46 9 days 1174548323 M * daniel_hozac yes. when your terminal detaches from the process without HUPing the children attached to it, vcontext goes into an infinite loop of select and read. 1174548328 M * spion this night i noticed it for the first time 1174548343 M * spion okay ... let's restart the vserver 1174548397 M * spion is the patch on http://svn.linux-vserver.org/projects/util-vserver/changeset/2514?format=diff&new=2514 acknowledged to fix the problem? 1174548425 M * daniel_hozac not yet, but it's supposed to. 1174548444 M * spion hmm ... okay ... let's see what happens ;) 1174548771 M * spion is there already documentation on how to use routing with vserver instead of bridging? 1174548805 M * spion i saw some scripts in harry's ftp dir, but i would like to have some howto/docs 1174548812 J * dna ~naucki@p54bceec7.dip.t-dialin.net 1174548928 M * daniel_hozac as all the networking happens on the host, and there is no bridging or routing apart from what Linux normally does, i don't see what that'd say. 1174549364 M * spion are there any hooks to run scripts on starting and stopping a vserver? 1174549565 M * daniel_hozac that's what the /etc/vservers/<guest>/scripts directory is for, yes. 1174549675 M * spion okay ... fine, thanks 1174549689 M * spion i'll try harry's scripts :) 1174550389 M * spion hmm ... i guess the $guest is the 1st argument the scripts are called with? 1174550570 M * daniel_hozac the first is the action, the second is the guest name. 1174550627 M * spion hmm .. okay 1174551335 N * DoberMann DoberMann[PullA] 1174551434 M * spion harry: wake up!
;) 1174552745 Q * ||Cobra|| Remote host closed the connection 1174553177 J * ||Cobra|| ~cob@pc-csa01.science.uva.nl 1174553339 Q * Hunger Remote host closed the connection 1174553415 J * miller7 ~999@88.218.11.117 1174553418 M * miller7 hello everyone 1174553495 M * daniel_hozac hi 1174553537 M * miller7 do you by any chance use gentoo daniel_hozac? 1174553542 N * DoberMann[PullA] DoberMann 1174553595 J * Hunger Hunger.hu@Hunger.hu 1174553760 M * daniel_hozac miller7: no. 1174553873 Q * phedny Ping timeout: 480 seconds 1174554998 J * bonbons ~bonbons@83.222.39.9 1174555302 J * meandtheshell ~markus@85-124-207-180.dynamic.xdsl-line.inode.at 1174556415 M * harry spion: ok 1174556695 J * DavidS ~david@chello062178045213.16.11.tuwien.teleweb.at 1174556764 P * miller7 1174557866 M * harry i updated my pre-start/post-stop scripts 1174557869 M * harry hope that helps a bit 1174557873 A * harry off for a few mins now.. 1174557914 M * harry it's the pre-start and post-stop scripts you want prolly 1174557927 M * harry added docs for it 1174558135 J * chand ~chand@212.99.51.254 1174558329 M * spion harry: ah c00l :) 1174558339 M * harry :) 1174558357 M * spion hmm 1174558386 M * spion i'm blind, i cant see the docs 1174558421 M * harry http://people.linux-vserver.org/~harry/scripts/pre-start 1174558447 M * spion ah! 1174558454 M * spion my browser caches 1174558461 M * spion sorry ;) 1174558482 Q * dna Quit: Verlassend 1174558528 M * spion cool .. you added ${PREFIX} ;) 1174558530 Q * hardwire Ping timeout: 480 seconds 1174558538 M * harry and comments 1174558540 M * harry :)) 1174558550 M * spion of course 1174558568 J * dna ~naucki@p54bceec7.dip.t-dialin.net 1174558601 J * hardwire ~bip@rdbck-4271.palmer.mtaonline.net 1174558605 Q * dna 1174558632 M * spion hmmm ... will this also work with an alias, like eth0:0 ? 1174558714 Q * cdrx Remote host closed the connection 1174558813 J * dna ~naucki@p54bceec7.dip.t-dialin.net 1174558836 M * harry ? 1174558855 M * harry that's not the way iproute2 works ;) 1174558858 J * cdrx ~legoater@cap31-3-82-227-199-249.fbx.proxad.net 1174558862 M * harry vserver works with iproute 2 ;) 1174558978 M * spion hmm ... 1174559021 M * spion i thought i could use a subnet for vservers which is routed to my host 1174559052 M * spion so i have to bind anywhere my gateway 1174559171 Q * Aiken Quit: Leaving 1174559294 M * harry you can 1174559304 M * harry the reason i put those scrpts there 1174559313 M * harry is: i have lots of different virtual servers 1174559317 M * harry on lots of different networks 1174559321 M * harry on different vlans 1174559335 M * harry so i need routing for each virtual server 1174559351 M * harry making it "more per-host-networking" 1174559360 M * harry making firewalling easier too 1174559389 A * harry off to work now 1174559398 M * harry updated allmost all the scripts now 1174559591 M * harry all are updated! 1174559599 A * harry off to work now 1174559601 M * harry cya! 1174560027 Q * shedi Quit: Leaving 1174561725 Q * DavidS Ping timeout: 480 seconds 1174561894 J * lilalinux ~plasma@dslb-084-058-215-127.pools.arcor-ip.net 1174562019 J * mark ~mark@cpc1-brig1-0-0-cust3.brig.cable.ntl.com 1174562047 N * mark complexmind 1174562064 M * complexmind oops, new client :) 1174562070 M * complexmind hi everyone 1174562090 M * complexmind is anyone here using vservers over nfs ? 
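A minimal sketch of the per-guest routing hook harry and daniel_hozac talk about above; this is not harry's actual pre-start script (see http://people.linux-vserver.org/~harry/scripts/pre-start for that), and the device and address below are made up. It assumes the argument order daniel_hozac gave: action first, guest name second.

#!/bin/sh
# hypothetical /etc/vservers/<guest>/scripts/pre-start
ACTION="$1"              # e.g. "pre-start"
GUEST="$2"               # guest name
GUEST_IP="192.0.2.10"    # illustrative; a real script would read this from the guest config
GUEST_DEV="dummy0"       # illustrative device holding the routed guest address
# routed setup: the guest address lives on a local device and the upstream
# router sends the guest subnet to this host, so no bridge is needed
ip addr add "${GUEST_IP}/32" dev "${GUEST_DEV}" 2>/dev/null
echo 1 > /proc/sys/net/ipv4/ip_forward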
1174562388 J * DavidS ~david@chello062178045213.16.11.tuwien.teleweb.at 1174562747 M * harry no vservers over nfs 1174562758 M * harry there are nfs mounts inside some vservers here 1174562858 M * complexmind thanks :) I've been reading around and found some old notes on problematic stuff like postfix, heavy databases over nfs 1174562904 M * harry nfs is nfs :) 1174562914 M * complexmind :) 1174562922 M * harry locking, speed, responsiveness etc... 1174562927 M * complexmind sure 1174562927 M * harry they can all be problems on nfs 1174562935 M * complexmind nothing is without problems though :) 1174562938 M * harry so it's your choise 1174562938 M * complexmind hehe 1174562940 M * harry choice 1174562948 M * complexmind I've been testing with ataoe and iscsi too 1174562966 M * harry is there hardware that supports ataoe? 1174562983 M * harry (from more than 1 vendor? ;)) 1174562983 M * complexmind no, I have been testing with the software targets, vblade and qaoed 1174562994 M * complexmind :) 1174563020 M * complexmind vblade is really cool, but despite a lot of testing I have struggled to get any decent performance 1174563031 A * harry googles 1174563040 M * complexmind iscsi has been great in terms of performance, but it's a pig to migrate storage between head end 1174563106 M * complexmind I always avoided nfs for the obvious reasons, but from an admin perspective it would seem great - aggregate all storage on fileservers, one big nfs export which is mounted on all vserver heads 1174563121 M * harry it is... 1174563122 M * complexmind (or lots of small ones) 1174563132 M * harry i would prefer SAN storage tough :) 1174563145 M * harry multipathing etc... 1174563159 M * complexmind there is a san at the backend, with multipathing 1174563163 M * harry 1 big nfs export allways gives locking problems if you're not extremely careful 1174563192 M * complexmind but it is very expensive to connect this many machines directly to the san :) 1174563212 A * harry works at university 1174563221 M * harry we've got loads of ... ahm... hardware :) 1174563226 M * complexmind :) 1174563243 M * harry hell... i said: i need 4 quad dualcore hyperthreaded machines, 16GB ram each and 5x146GB hotplug scsi disks 15krpm 1174563247 M * harry and there i go.. :) 1174563261 M * complexmind lucky you :) 1174563265 M * harry 6 Gbit interfaces each 1174563270 M * harry well... no! it's not fun anymore! 1174563277 M * harry where's the challenge? 1174563278 M * complexmind my kit is very nice but I am trying to find a way to avoid a large san buildout 1174563290 M * complexmind if possible 1174563315 M * harry if you want cheapass performant storage, i think the only way is aoe or iscsi 1174563331 A * harry very interested in that too 1174563334 M * complexmind yeah 1174563353 M * complexmind aoe is very cool 1174563358 M * harry i want a challenge now :) i've got hardware... now i want to do as much as possible with the least possible ha 1174563362 M * harry hw 1174563362 M * complexmind very 1174563374 M * harry tuning kernel params 1174563384 M * harry aligning disks/partitions/... 
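For reference, a rough sketch of how a software AoE export like the vblade setup complexmind mentions is usually wired up and benchmarked; the shelf/slot numbers and device names are illustrative, and the dd line is destructive to the exported device.

# on the storage box: export /dev/sdb as AoE shelf 0, slot 1 via eth0
vblade 0 1 eth0 /dev/sdb &

# on the vserver host: load the initiator and look for the device
modprobe aoe
aoe-discover
ls /dev/etherd/          # e0.1 should appear here

# crude throughput check, bypassing the page cache (destroys data on e0.1!)
dd if=/dev/zero of=/dev/etherd/e0.1 bs=1M count=1000 oflag=direct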
1174563388 M * complexmind but, performance is poor, despite a lot of testing, tuneing and trying out the various patches 1174563444 M * complexmind I think it is just a case of immaturity and the fact that vblade is really a reference implementation 1174563474 M * complexmind qaoed shows some promise, but again it is fairly alpha and performance is not very good 1174563534 M * complexmind iscsi on the other hand performs well, but it is a pig to dynamically export/withdraw targets and luns gracefully 1174563655 M * harry ah... so ... waiut for cheap sans 1174563660 M * complexmind :) 1174563678 M * complexmind iscsi is supposed to be the cheap san :0 1174563682 M * complexmind lol 1174563703 M * sannes Had some bad experience with aoe + gfs performance .. 1174563757 M * complexmind really? interesting. I have gfs running on the cluster but not on the vserver head ends 1174563803 M * sannes main problem is directories with a lot of files in them 1174563813 M * complexmind did you rebuild the redhat cluster pkgs against a vanilla kernel? 1174563821 M * sannes yes 1174563835 M * complexmind cool, was that difficult? 1174563863 M * sannes not really, except it kept me at an older kernel for longer .. 1174563874 M * complexmind sure 1174563918 M * complexmind what sort of performance do you get from aoe? 1174563931 M * complexmind in theory it should be great but I have not had much luck 1174563942 M * sannes I guess that won't be a problem now that gfs2 is in the kernel .. but, I'm still worried about large directories, I have this feeling it might not just be locking, did a test the other day .. 1174563949 M * complexmind and what target are you using (coraid hw, vblade, qaoed) 1174563991 M * sannes coraid hw, when not using gfs gives me close to having it locally .. 1174563994 M * matti :) 1174563996 M * sannes well, that is what dd told me anyways 1174563996 M * matti Hi harry ;-) 1174564027 J * phedny ~mark@phedny.vps.van-cuijk.nl 1174564041 M * complexmind ah ok, yeah I suspected the coraid hw might have acceptable performance 1174564056 M * complexmind my model is slightly different 1174564099 M * sannes to see the problem with gfs/gfs2 performance you could make a directory with a lot of files, say 150000 and do an ls -al in it, takes about 14minutes .. do it on ext3 it takes about 8 seconds... 1174564132 M * sannes if you do it ls (with -l) it takes 14s with gfs and 4 or so with ext3 .. (must unmount between testing btw) 1174564165 M * complexmind I already have a decent sized (18TB RAID6, single lun) san, redundant controllers, pair of fileservers with multipath, redhat cluster/clvm for shared access to lun. This loaded into a volume group on which I create 1 logical volume per vserver 1174564181 M * complexmind interesting 1174564244 M * sannes and do it with lock_nolock or whatever it is called (can't remember) .. thing is it seems that it is doing hell of alot more reading from disk than ext3 .. so I'm thinking bad on-disk structure or something .. 1174564262 M * sannes what kind of filesystems do you use? 1174564266 M * complexmind yeah possibly 1174564268 Q * lilalinux Remote host closed the connection 1174564294 M * complexmind I use ext3 on the vserver logical volume 1174564322 M * complexmind plan was to export target via aoe/iscsi, both have advantages/drawbacks 1174564336 M * sannes complexmind: to be fair, they havn't really started to do peformance tuning and i did test on a 2.6.20, could be they have made it better in the mean time .. 
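sannes' large-directory comparison can be reproduced with something like the following; the mount point and file count are illustrative, and as he notes you have to unmount (or remount) between runs so the cache does not hide the difference.

# create 150000 files on the filesystem under test (gfs, gfs2, ext3, ...)
mkdir /mnt/test/big
cd /mnt/test/big
seq 1 150000 | xargs touch

# drop the cache by remounting, then time the listing
cd /; umount /mnt/test; mount /mnt/test    # assumes an fstab entry
time ls -al /mnt/test/big > /dev/null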
1174564356 J * lilalinux ~plasma@dslb-084-058-215-127.pools.arcor-ip.net 1174564357 N * phedny phedny_ 1174564373 M * sannes I wish I had that amount of hardware :P 1174564374 M * complexmind sure - there is rarely such a thing as a perfect solution... 1174564448 M * mjt amount of hardware isn't relevant really. one can get tons of crappy hw and the result will be crap 1174564456 M * complexmind :) 1174564459 M * complexmind true 1174564467 M * complexmind strong as the weakest link... 1174564481 M * complexmind I can get 200MB/s to the raw san storage 1174564486 M * complexmind both ways 1174564491 M * mjt my expirience with iscsi is like that -- "not ready for production use" 1174564502 Q * lilalinux Remote host closed the connection 1174564525 M * complexmind but with aoe average around 10MB/s read, 20MB/s write, figure that 1174564544 M * complexmind (that is for O_DIRECT with AOE) 1174564551 M * complexmind over Gb 1174564560 M * mjt o_direct can be different too. 1174564588 M * complexmind iscsi gives me better, 50-100MB/s 1174564590 M * mjt it VERY depends on the I/O size and whenever you do random or sequential i/o 1174564594 M * complexmind sure 1174564615 M * complexmind this is big files full of random data from 1MB -> 500MB 1174564619 M * mjt difference if several orders of magnitude for realistic sizes 1174564623 M * mjt s/if/of 1174564666 M * mjt ("realistic" = not counting sizes < 4Kb for example, which is just too small) 1174564766 M * mjt we tested a big sanrad storage system - two redundrand raid boxes, 32 hard drives on each, raid5 or raid10 1174564790 M * mjt each box had quite large amount of cache 1174564832 M * mjt but even sequential write - which starts at ~800Mb/sec (10GigE), drops to... about 10Mb/sec when the cache fills up 1174564833 M * complexmind this is what I use with 24x750GB SATA http://www.av-digital.com/a24f-R2224-1.html 1174564860 M * mjt that's over iscsi 1174564891 M * mjt expensive-like-a-space-shuttle qlogic iscsi card gives 80Mb/sec max 1174564904 M * complexmind :) 1174564913 M * mjt and it all fails badly under high load 1174564925 M * complexmind yeah iscsi scares me 1174564931 M * complexmind as do aoe and nfs :0 1174564936 M * complexmind it all scares me :P 1174564971 M * mjt we dropped those sanrad boxes and installed plain old fibre-channel stuff 1174565024 M * complexmind so now you use luns directly on the san.. no iscsi or aoe etc? 1174565033 M * mjt yup 1174565040 M * mjt no shared storage :) 1174565062 M * complexmind that would be ideal, but what are your limitations on carving up that storage? 1174565069 M * mjt (because stuff like e.g. ocfs2 or gfs is even more scary ;) 1174565070 J * zen_pebble avorian@xs3.xs4all.nl 1174565076 Q * zen_pebble 1174565112 M * complexmind ie, how many luns can you support - lvm is lovely, carving up storage at the controller can be much more painful and limiting than lvm depending on your kit 1174565141 M * mjt (i had to look up "carving" in dictionary :) 1174565145 M * complexmind sorry :0 1174565146 J * noodle ~noodle@noodle.xs4all.nl 1174565154 M * mjt fun use of this word in this context ;) 1174565157 M * complexmind :) 1174565173 M * mjt but fortunately we don't really need that much "carving" 1174565184 M * complexmind right... 1174565192 M * mjt we just needed a large disk space for a large database. 
that's all 1174565198 M * complexmind in my scenario there will likely be a larege turnover of filesystems 1174565214 M * complexmind ok, in your case it makes sense to use just san then 1174565302 M * complexmind my main requirements are the ability to move filesystems from one vserver host to another (which san is great for) but at the same time to be able to create lots, even hundreds of filesystems from 5GB to 1TB, resize them etc 1174565303 M * mjt in ideal world i'd run a cluster database - several nodes sharing the same storage. But it turned out to be almost unuseable 1174565368 M * mjt oh well 1174565380 M * mjt i see. that's umm.. difficult ;) 1174565388 M * complexmind yeah, I find the cluster stuff useful for shared access to a volume group, but probably wouldn't use it for shared logical volume/filesystem access 1174565754 M * renihs i like drbd 1174565762 M * renihs nice performance and dont see any drawbacks 1174565806 M * mjt drbd doesn't support multi-master mode 1174565825 M * complexmind I like drbd too, but it is less suitable for my needs 1174565830 M * renihs 8.0.0 does afaik 1174565841 M * renihs but didnt try active/active yet 1174565846 M * mjt oww. they're up to version 8 already ;) 1174565856 M * complexmind :) I never knew they supported active/active 1174565863 M * renihs it am too afraid for split-brain 1174565864 M * complexmind wow that is very cool 1174565865 M * renihs :) 1174565869 M * complexmind :) 1174565942 M * renihs but if you try before me be sure to leave a not :) 1174565944 M * renihs note 1174566028 M * mjt this scares me even more 1174566047 M * mjt drbd does its own locking for active/active (or at least should) 1174566065 M * mjt but on top of it, the filesystem itself should do its own locking as well 1174566080 M * renihs dunno, i followed a few of the discussion but they seem in a very early stage 1174566106 M * renihs mjt, but i would agree with the filesystem locking :) 1174566163 M * mjt for failover case, there's no need for all that crazy stuff. but sure it'd be very interesting to use some storage as true shared, so that both "primary" and "failover" nodes can do some work at the same time 1174566168 Q * PowerKe Remote host closed the connection 1174566180 J * PowerKe ~tom@d54C13E4B.access.telenet.be 1174566203 M * renihs mjt, ya i am not even sure if the active/active things are supported in the free version 1174566212 M * renihs but that company is 500m away from our company 1174566217 M * renihs i guess i should pay a visit soon 1174566345 M * mjt ..and when i last tried ocfs2 (last november), it was too buggy -- crashing left and right, and deadlocking even on a SINGLE node. 1174566355 M * renihs heh :) 1174566369 M * mjt i didn't even try to enable >1 node access 1174566573 M * mjt most famous thing from it i noticied - attempting to lseek(large_number); write(1 byte) resulted in EMFILE (!) error from ocfs2. After umount, fsck.ocfs2 finds errors which it can't repair, so the only way is to delete that damn file and start over. Starting over helps, but next file created this way gives the same issue again... 1174566742 M * DavidS did you try the sistina/red hat cluster stuff? 1174566937 M * mjt gfs you mean? 1174567171 M * renihs doesnt that require storage devices? 1174567183 Q * PowerKe Read error: Connection reset by peer 1174567428 J * shedi ~siggi@tolvudeild-195.lhi.is 1174567718 Q * noodle Quit: leaving 1174567729 M * DavidS yeah, gfs does. OCFS2 not? 
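A sketch of the one-logical-volume-per-guest scheme complexmind describes, including the resize requirement; the volume group name and sizes are illustrative, and whether resize2fs can grow the filesystem while mounted depends on the kernel and e2fsprogs in use.

# one ext3 logical volume per guest, carved out of the shared (clvm) volume group
lvcreate -L 20G -n vm5 vg_san
mkfs.ext3 /dev/vg_san/vm5
mount /dev/vg_san/vm5 /vservers/vm5

# growing a guest later
lvextend -L +10G /dev/vg_san/vm5
resize2fs /dev/vg_san/vm5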
1174567869 M * mjt the concept behind both is pretty much the same 1174567914 M * mjt (besides, it looks like pretty much any fs requires SOME storage device ;) 1174568043 J * Kubus ~student@62.87.154.157 1174568066 M * Kubus yo 1174568156 M * Kubus :) 1174568340 J * Pingwin ~student@62.87.154.157 1174568349 M * Pingwin hey hey 1174568350 M * sannes mjt : tmpfs, if you don't count memory as a storage device :P .. or /proc fs if you don't need to store anything at all ... 1174568360 M * Pingwin homies 1174568361 M * Kubus :) 1174568364 M * Kubus dang 1174568374 M * Pingwin any homies from Warsaw in here? 1174568383 M * Kubus got a pic? 1174568386 M * Pingwin do you like hip-hop? 1174568395 M * Pingwin go wash your ass, you perv 1174568402 M * Kubus :P 1174568409 Q * Pingwin Quit: BitchX-1.1-final -- just do it. 1174568423 Q * Kubus Quit: BitchX-1.1-final -- just do it. 1174569029 J * trippeh atomt@uff.ugh.no 1174571176 Q * DoberMann Ping timeout: 480 seconds 1174571328 J * DoberMann ~james@AToulouse-156-1-66-60.w90-16.abo.wanadoo.fr 1174571715 N * Bertl_zZ Bertl 1174571719 M * Bertl morning folks! 1174571797 M * complexmind hi bertl :) 1174571802 M * harry yowyow bertldude! 1174572216 M * DavidS wb, Bertl! :) 1174572615 J * Dixan ~root@81.200.131.122 1174572630 P * Dixan 1174573157 M * Bertl DavidS: btw, I couldn't trigger the procfs issues yet 1174573336 M * DavidS ? 1174573361 M * Bertl sorry, have to adjust my auto completion habit 1174573381 M * Bertl usually that would have been daniel_hozac :) 1174573385 M * daniel_hozac hehe 1174573435 M * Bertl DavidS: means, you haven't been here for quite a while .. shame on you! 1174573437 M * Bertl :) 1174573512 M * Bertl daniel_hozac: I have 20k contexts starting and stopping over and over again, doing ps auxwww in their context ... didn't trigger it for several hours 1174573534 M * daniel_hozac Bertl: hmm? 1174573543 M * daniel_hozac one context should be sufficient. 1174573558 M * Bertl yeah, well, I thought that might help ... 1174573585 M * daniel_hozac so, vserver x exec ps faux; vps faux shows the correct inits every time? 1174573599 M * daniel_hozac or are we talking about different procfs issues? 1174573608 M * daniel_hozac (bughunt2?) 1174573611 M * Bertl different ones, yep 1174573630 M * daniel_hozac okay, the do_task_stat one, yes? 1174573636 M * Bertl yep 1174573639 M * daniel_hozac (tty_devnum in 2.6.20) 1174573649 M * daniel_hozac okay, got it. 1174573662 M * Bertl but I have an idea for the pid 1 issue 1174573682 M * Bertl first I think we should make sure that we purge the entire proc entry 1174573693 M * Bertl like the task does on exit 1174573714 M * Bertl (IIRC) 1174573765 M * daniel_hozac okay. 1174574032 M * Bertl if we have that code, we do a simple plausibility check in the lookup 1174574049 M * Bertl i.e. for pid=1 lookup, is the context correct? 1174574081 M * Bertl if it isn't, we just purge the entire pid=1 entry and let the usual mechanisms recreate it 1174574226 M * Bertl but I found something strange regarding the tty issue in the code 1174574254 M * Bertl daniel_hozac: check out disassociate_ctty() in drivers/char/tty_io.c 1174574292 M * Bertl the last block of commands ... and the lock it uses, what do you think? 1174574398 M * daniel_hozac read_lock(&tasklist_lock); session_clear_tty(session);? 1174574435 M * Bertl yep 1174574448 M * Bertl or in older 2.6.19 ... p->signal->tty = NULL; 1174574481 M * daniel_hozac okay. 1174574492 M * daniel_hozac yeah, i don't see why tasklist_lock would protect that.
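For orientation: what do_task_stat() derives from ->signal->tty ends up as tty_nr, the 7th field of /proc/<pid>/stat, so the value in question can be watched from userspace:

awk '{print $7}' /proc/self/stat           # on a terminal: non-zero tty device number
setsid awk '{print $7}' /proc/self/stat    # new session, no controlling tty: prints 0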
1174574503 M * daniel_hozac do_task_stat doesn't grab that, right? 1174574513 M * Bertl especially a read_lock() :) 1174574535 M * Bertl so while I assume that do_task_stat() might be holding the lock (for read) 1174574547 M * Bertl that won't really help, would it? 1174574569 M * Bertl doener: ping? 1174574573 M * daniel_hozac right... 1174574582 M * daniel_hozac yeah, so that seems like a likely cause. 1174574694 M * Bertl let's see if we can find the guilty one ... 1174574887 M * daniel_hozac hmm? 1174574904 M * Bertl the patch which introduced this change 1174574908 M * daniel_hozac ah, okay. 1174574982 M * Bertl hmm ... here is an interesting one ... 1174574988 M * Bertl http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2ea81868d8fba0bb56d7b45a08cc5f15dd2c6bb2 1174575060 M * daniel_hozac hmm, strange. 1174575092 M * Bertl http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=24ec839c431eb79bb8f6abc00c4e1eb3b8c4d517 1174575153 M * daniel_hozac interesting. 1174575441 M * Bertl I think we should artificially extend the time between the tty check and a WARN_ON() in procfs 1174575445 M * Bertl and retest with that 1174575491 M * Bertl but I'd be much happier if I could trigger it at least sometimes right now 1174575554 M * daniel_hozac yeah, definitely. 1174575571 M * daniel_hozac tried running e.g. a forkbomd doing ps? 1174575651 M * Bertl I think we need to narrow it down to the do_task_stat() otherwise we spend too much time in uncritical entries 1174575683 M * Bertl but that shouldn't be the problem :) 1174575802 M * daniel_hozac so make it cat /proc/*/stat ;) 1174576020 M * Bertl doing this atm: while true; do bash -c 'cat /proc/[0-9]*/stat >/dev/null' & done 1174576067 M * Bertl the question is, does that get/use a tty which will be cleared off? 1174576251 M * daniel_hozac hmm, good question. 1174576496 J * gerrit ~gerrit@bi01p1.co.us.ibm.com 1174576622 M * Bertl morning gerrit! 1174576719 M * gerrit Mornin' Bertl! 1174576738 M * Bertl gerrit: do you have a moment for me? 1174576785 M * gerrit sure! 1174576836 M * Bertl we are currently looking for a kernel issue, which causes those traces: http://vserver.13thfloor.at/Stuff/BUGHUNT/bertl-0002/ 1174576857 M * Bertl (oops, oops2 and oops3) 1174576914 M * Bertl and looking at 2.6.19.7, we wonder how the locking (regarding session->tty) is supposed to work 1174577005 M * gerrit looking at the traces now - I do know that the locking in the tty layer has always been "challenging" - but we use serial a lot without these issues 1174577005 M * Bertl especially the code in disassociate_ctty() in drivers/char/tty_io.c 1174577069 M * Bertl the thing is, do_task_stat() uses the tty and does a check for its existance before reading values from it 1174577107 M * gerrit I'm assuming there is vserver code linked into this kernel? Hard to tell from the modules and backtrace what else might be in the kernel. 1174577109 M * Bertl but, the disassociate_ctty() claims to protect the clearing of ->tty via a read_lock() on the task:lock 1174577134 M * Bertl gerrit: yes, it is a Linux-VServer kernel 1174577174 M * gerrit I'm pulling up a 2.6.19 kernel now to check some things... 1174577208 M * Bertl what I wonder is, how is a read_lock() supposed to protect against procfs tripping over the change? 
1174577241 M * gerrit it may be there there is only one real open of the tty normally 1174577246 M * gerrit and perhaps you 1174577256 M * gerrit er and perhaps you have managed to get a second "real" open somehow 1174577306 M * gerrit I'm working from very old memories here, but usually the tty is only "really" opened once, so a read lock would be fine as long as there is only one process manipulating the real hardware (and possibly some other data structures for that tty) 1174577378 M * Bertl I think I'm missing something here ... 1174577492 M * gerrit I do see the code that "looks" suspicious. But I *think* the reason is that the locking code is just a little sloppy - using read_locks as a way to avoid complexity 1174577526 M * Bertl okay, but a read_lock isn't exclusive, no? 1174577534 M * gerrit read_locks have no limits on the number of times that they can be acquired - as long as you have matching lock/unlocks 1174577542 M * Bertl right 1174577556 M * Bertl so let's assume disassociate_ctty() takes the read lock 1174577559 M * gerrit correct - not exclusive - so for every thread using the tty, there is a read_lock on some data structures 1174577572 M * Bertl then do_task_stat() (or something before) takes it too 1174577576 M * gerrit so it bumps the count to make sure the structure can't be freed while it is doing its job 1174577590 J * PowerKe ~tom@d54C13E4B.access.telenet.be 1174577592 M * Bertl then task_stat() tests for ->tty != 0 1174577602 M * gerrit I bet disassociate_ctty() slams the count back to zero at the end 1174577604 M * Bertl then disassociate_ctty() clears out ->tty 1174577624 M * Bertl then do_task_stat() derefences that 1174577627 M * Bertl *bang* 1174577642 M * gerrit yep - so the additional acquire is to make sure that two disassociate_tty's shouldn't race... 1174577655 M * gerrit ohh..hmmm. let me read that once or twice. 1174577709 M * Bertl +np, and thanks for your time, btw! 1174577722 M * gerrit all of that is done inside of lock_kernel - there should only be one process in that code at a time... 1174577796 M * Bertl I see a tty_mutex in do_task_stat() but a lock_kernel()? 1174577827 M * gerrit there is one is disassociate_ctty() 1174577849 M * Bertl which shouldn't help, if the other one isn't taking it :) 1174577852 M * gerrit I missed where you think do_task_stat() is getting called from? 1174577875 M * Bertl from the proc reading on /proc/*/stat 1174577891 M * gerrit oohhh - external process doing a read... 1174577895 M * Bertl yep 1174577902 M * Bertl that is what the traces look like 1174577916 M * gerrit okay- sorry, it is taking me a minute to get context here... 1174577930 M * Bertl take your time ... 1174577933 M * gerrit what is the proc reading /proc/*/stat? ps? Or something else? 1174577971 M * Bertl iirc, in one case it is ps, in the other a Linux-VServer tool 1174578017 M * gerrit I would make sure that ps is one of the cases. 1174578027 M * Bertl pidof in one case 1174578048 M * gerrit I'm wondering if it has some implicit locking that it uses. Oh, with pidof one can hardly claim it is an application API probem. ;) 1174578080 M * gerrit so if pidof and ps both read and there is a potential race, you probably have a real problem. The question is why you are seeing it but it isn't normally visible. 
1174578091 M * Bertl yeah, oops is caused by pidof, oops2 by ps :) 1174578105 M * Bertl gerrit: no, the reading is not a problem 1174578115 M * gerrit because if this were a standard 2.6.19 kernel, running on thousands of laptops, with pidof and ps called billions of random times, this would happen more often. 1174578139 M * Bertl gerrit: the clearing out the tty (for the session) of a process which is ps/pidof-ed in a different thread is 1174578156 M * Bertl gerrit: we cannot even trigger it reliably on this specific kernel 1174578171 M * Bertl the race window is very small IMHO 1174578191 M * gerrit ps -> reading /proc/*/stat* is racing with a disassociate_ctty() leading to an illegal dereference in ps/pidof, right? 1174578198 M * Bertl yep 1174578208 M * Bertl that is my theory so far :) 1174578236 M * gerrit so a process which does a bunch of disassociate_ctty()'s in a loop (reaquiring a session/ctty in between) should be able to forcibly demonstrate the race? 1174578255 M * gerrit race that with pidof in a loop, for instance 1174578269 M * Bertl probably ... 1174578270 M * gerrit being able to generate a repeatby will help people follow that logic. 1174578291 M * gerrit I think you are correct, though, about the point of the actual race and the flaw in the locking 1174578314 M * Bertl let me note the following things here: 1174578327 M * gerrit I'm not sure what the appropriate locking *is* for that one - reading from /proc will always be one of those "racey" things but it shouldn't cause a panic 1174578335 M * gerrit it can return bad data. 1174578345 M * Bertl the code was changed recently, and newer kernels change it several times too 1174578354 M * gerrit so defensive programming in disassociate_ctty might be necessary 1174578371 M * Bertl i.e. it might be fixed in HEAD 1174578374 M * gerrit ahh - okay - I just looked at 2.6.19, not a changelog. so probably someone broke it by opening a very small race. 1174578384 Q * s0undt3ch Quit: leaving 1174578432 M * Bertl okay, so you agree with me that this looks suspicious 1174578454 M * gerrit It sounds like the best bet is to distill this down, post it to LKML and ask if anyone has seen a race there or find out who has been working on that code via git or changelog info 1174578482 M * gerrit Yeah - and the fact that you have panics, although people will probably ask to repeat without vserver built in, just to simplify their debugging 1174578485 J * s0undt3ch ~s0undt3ch@80.69.34.154 1174578507 M * gerrit so if you can throw together a little test program that forces a race there, that would help reproduce it for others in less time. 1174578545 M * Bertl do you have some code fragments somewhere to do the tty stuff? 1174578571 M * gerrit There *should* be something in LTP 1174578575 M * gerrit but I haven't looked in there in ages 1174578600 M * gerrit there should have been some sessions management tests in the testing code that IBM donated 3-5 years ago 1174578605 M * Bertl okay, that is a good starting point... 1174578624 M * Bertl thanks again for your time, we'll investigate ... 1174578673 M * gerrit sure - no prob. and gotta run to present some stuff to folks. 
cy later 1174579250 Q * gerrit Ping timeout: 480 seconds 1174579301 M * Bertl daniel_hozac: hmm, terminal ioctls seems to be sufficient 1174579334 M * Bertl drivers/char/tty_io.c line 3222 is funny :) 1174579803 M * Bertl daniel_hozac: how about this one: http://vserver.13thfloor.at/Experimental/TOOLS/notty.c 1174579877 J * gerrit ~gerrit@mobile-166-214-209-007.mycingular.net 1174582300 Q * shedi Quit: Leaving 1174584195 J * hallyn ~xa@adsl-75-2-80-15.dsl.chcgil.sbcglobal.net 1174584325 J * stefani ~stefani@tsipoor.banerian.org 1174584359 J * dhansen ~dave@pool-72-90-117-15.ptldor.fios.verizon.net 1174584734 M * Bertl wb hallyn! dhansen! hey stefani! 1174584775 M * daniel_hozac Bertl: "Factor out some common prep work"? :) 1174584804 M * daniel_hozac Bertl: so, notty exposes the problem? 1174585090 M * Bertl nope, unfortunately not 1174585102 M * Bertl I did improve it in the meantime 1174585125 M * Bertl i.e. will upload a threaded version which bangs on proc 1174585144 M * Bertl but still no luck, everything seems smooth 1174585295 M * daniel_hozac Bertl: does the if (current->signal->leader) check trigger for notty?? 1174585366 M * daniel_hozac or is it the other stuff we're after? 1174585417 M * Bertl I have no idea about the kernel pathes atm, ttys are still a mess/mystery for me 1174585441 M * Bertl I will add a bunch of printk() and prope delays to the kernel soon 1174585480 M * daniel_hozac okay 1174585583 J * Johnnie ~jdlewis@jdlewis.org 1174586827 M * Bertl wb Johnnie! 1174587184 J * pagano ~pagano@131.154.5.37 1174587228 J * pagano_ ~pagano@131.154.5.37 1174587233 M * pagano_ hi all 1174587241 M * Bertl hey pagano_! 1174587276 M * pagano_ I have a little problem :( I hope that u can give me a good suggestion :P 1174587291 J * shedi ~siggi@ftth-237-144.hive.is 1174587291 M * pagano_ centos 4.4, last stable, last util 1174587321 M * Bertl that's the problem? 1174587327 M * pagano_ when I try to buil a vserver I receive: 1174587329 M * pagano_ echo -n 'ERROR: Can not find configuration for the distribution '\''rhrelease'\'' 1174587342 M * pagano_ (before a long log) 1174587351 M * Bertl what util-vserver version? 1174587365 M * Bertl and what is your build line? 1174587383 M * pagano_ util-vserver-0.30.212 1174587411 M * pagano_ /usr/sbin/vserver vm5 build -m yum --hostname=vm5 --interface eth0=IP/24 --d fc6 1174587439 M * pagano_ Kernel: 2.6.9-42.0.2 1174587471 M * daniel_hozac well, a) you're not using a Linux-VServer kernel and b) that should be -- -d centos4 or -- -d fc6. 1174587520 M * pagano_ a) I'm using a vanilla patched with your patch 1174587564 M * pagano_ b) now I have /usr/local/etc/vservers/.defaults/vdirbase/vm5: Function not implemented 1174587580 M * daniel_hozac because you're not using a Linux-VServer kernel. 1174587592 M * daniel_hozac 2.6.9-42.0.2 is the default CentOS 4.4 kernel. 1174587601 M * pagano_ oh,,,maybe I'm a little confused 1174587644 M * pagano_ oh my god, let me reboot :) 1174587655 M * pagano_ grub fault :P 1174587822 Q * meandtheshell Quit: Leaving. 1174587844 M * Bertl evil grub! :) 1174587865 M * pagano well, now 2.6.17.13-vs2.0.2.1 1174587875 M * pagano should be ok :) 1174587898 M * pagano ERROR: Can not find configuration for the distribution 'rhrelease'; 1174587898 M * pagano please read http://linux-vserver.org/HowToRegisterNewDistributions 1174587905 M * pagano :( 1174587918 M * daniel_hozac and, what command did you use this time? 1174587999 M * pagano ehmmm... it's working :) 1174588012 M * Bertl congrats! 
:) 1174588033 M * pagano maybe is too early eheh 1174588093 M * pagano I have on the base system a centos4, can I install a guest with fc6? 1174588106 M * Bertl should work 1174588110 M * daniel_hozac sure. 1174588115 M * pagano should I only respect rpm-rpm / deb-deb 1174588124 M * pagano I guess 1174588129 M * Bertl nope, no problem with that either 1174588140 M * Bertl you can also create a debian or ubuntu guest 1174588140 M * daniel_hozac any guest ought to work on any host. 1174588152 M * pagano wow, sounds great 1174588187 M * Bertl on certain distros, you might run into issues with the host tools, but that is not really related 1174588209 M * Bertl e.g. if you do not have a dynamicly linked rpm, or so ... 1174588250 Q * chand Ping timeout: 480 seconds 1174588265 Q * hardwire Quit: Coyote finally caught me 1174588280 J * meandtheshell ~markus@85-124-232-27.work.xdsl-line.inode.at 1174588284 M * daniel_hozac Bertl: about that, i noticed that rpm-fake also traps get{pw,gr}*. 1174588302 M * daniel_hozac Bertl: which really seems like the only sane thing to do, IMHO. 1174588323 M * Bertl we could work around that in a namespace too, no? 1174588340 M * daniel_hozac oh? 1174588354 M * daniel_hozac bind mounts of /etc/{passwd,group}, or? 1174588357 M * Bertl with bind mounting the passwd/group 1174588359 M * Bertl yep 1174588369 M * daniel_hozac i suppose... 1174588398 M * daniel_hozac but what if the guest is using LDAP to store users? 1174588398 P * pagano_ Leaving 1174588401 M * Bertl but on a different note: 1174588438 M * Bertl my man rpm contains this: http://paste.linux-vserver.org/1338 1174588460 M * daniel_hozac right. 1174588478 M * Bertl so I'm not sure if that is relevant at all (for rpm) 1174588498 M * Bertl because I would expect to do the user/group stuff inside that chroot too 1174588514 M * Bertl (otherwise it probably won't work) 1174588523 M * daniel_hozac i don't think it does. 1174588554 M * daniel_hozac and i was actually lying when i said it chrooted. it doesn't. 1174588597 M * daniel_hozac the resolver (the one handling user/group lookups) does chroot, but it has nothing to do with the scriptlets. 1174588614 M * Bertl so scriptlets are run on the host atm? 1174588624 M * daniel_hozac no, it's using rpm --root. 1174588667 M * Bertl regarding resolver .. how is that a problem with a static rpm? 1174588704 M * daniel_hozac well, nothing would use it if you can't override the getpwnam/getgrnam functions in rpm ;) 1174588743 M * Bertl but it will use the std resolver interfaces, no? 1174588756 M * Bertl so a bind mount should work there quite fine 1174588769 M * daniel_hozac work where? 1174588779 M * Bertl with a static rpm 1174588793 M * daniel_hozac well, as i said, what if the guest uses LDAP? 1174588810 M * daniel_hozac i have a few guests sharing users using LDAP. 1174588853 M * Bertl okay, and those work with the preload lib? 1174588869 M * Bertl and if, how? 1174588873 M * daniel_hozac i believe that's the intention 1174588906 M * daniel_hozac hmm, seems rpm-fake-resolver is a dietlibc program though... 1174588943 M * daniel_hozac the idea is to have the library trap getpwnam/getgrnam and send the requests to the resolver, which returns the values from the guest. 1174589279 M * Bertl sounds good, and what does it actually do? 1174589329 M * matti Bertl: :) 1174590142 M * daniel_hozac Bertl: that, i think :) 1174590246 M * daniel_hozac however, as the resolver is linked against dietlibc, i assume it doesn't support /etc/nsswitch.conf and thus not LDAP etc. 
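A rough sketch of the bind-mount idea Bertl floats above, letting the host-side rpm resolve users and groups from the guest while installing with --root; the paths and package are illustrative, this wants to run in a separate mount namespace so the rest of the host is unaffected, and as noted it does nothing for guests that keep their users in LDAP.

GUEST=/vservers/vm5                          # illustrative guest root
mount --bind "$GUEST/etc/passwd" /etc/passwd
mount --bind "$GUEST/etc/group"  /etc/group
rpm --root "$GUEST" -Uvh some-package.rpm    # user/group lookups now see the guest's data
umount /etc/passwd /etc/group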
1174590306 M * Bertl sooo, how does that work with your LDAP guests then? 1174590685 M * pagano dev-men, 2 quick questions (I hope) 1174590715 M * pagano when a new vserver is built I receive: Failed to start vserver 'odev' 1174590715 M * pagano 'vserver ... suexec' is supported for running vservers only; aborting... 1174590747 M * pagano and, if I try to start the vserver I have: No device specified for '/usr/local/etc/vservers/vm5/interfaces/0' 1174590747 M * pagano Failed to start vserver 'vm5' 1174590765 M * pagano but I used the nodev option as u suggested 1174590778 M * Bertl ls /usr/local/etc/vservers/vm5/interfaces/0 1174590813 M * Bertl the vserver is named 'odev'? 1174590816 M * pagano [root@vm vshelper]# ls /usr/local/etc/vservers/vm5/interfaces/0 1174590816 M * pagano ip name prefix 1174590823 M * pagano no, it is named vm5 1174590828 M * Bertl (looks to me like a -nodev -> -n odev :) 1174590878 M * pagano /usr/local/sbin/vserver vm5 build -m yum --hostname=... --interface eth0=.../24 -nodev -- -d centos4 1174590905 M * Bertl that is just weird ... did you even check the --help output? 1174590931 M * Bertl the syntax for --interface is 1174590942 M * matti :> 1174590966 M * Bertl --interface <ip>/<prefix> (for no device, existing ip, tools will complain) 1174590987 M * Bertl --interface <dev>:<ip>/<prefix> (tools will assign ip to <dev>) 1174591013 M * Bertl --interface <name>=<dev>:<ip>/<prefix> (tools will create alias <dev>:<name>) 1174591098 M * pagano sorry, u are right ... I've been in front of this damn pc since this morning :( 1174591105 M * pagano maybe it is time to leave 1174591123 J * hardwire ~bip@rdbck-4271.palmer.mtaonline.net 1174591134 Q * hardwire 1174591147 M * Bertl in general (unless you are mr. Schilling) long options have two dashes ... 1174591175 M * Bertl (well, X is too old to know that :) 1174591238 J * hardwire ~bip@rdbck-4271.palmer.mtaonline.net 1174591417 M * Loki|muh what about imagemagick? ;) 1174591516 M * Bertl yeah, I guess there are many examples around ... 1174591541 M * Bertl let's rephrase that to: util-vserver _always_ uses double dashes for long options :) 1174591547 M * Loki|muh :-) 1174591594 M * Bertl btw, mr Schilling has seen everything and therefore uses all kinds of notations to entertain the user :) 1174591640 Q * FloodServ charon.oftc.net services.oftc.net 1174591741 J * FloodServ services@services.oftc.net 1174591786 M * pagano uhmm, I'm not lucky this evening 1174591822 M * pagano the problem is not only the interface (I'm trying all the combinations without success) 1174591842 M * pagano but Failed to start vserver 'vm5' 1174591842 M * pagano 'vserver ... suexec' is supported for running vservers only; aborting... 1174591849 M * pagano seems to be the main error 1174592020 M * Bertl what is the command line used to create it? 1174592038 M * Bertl and could you upload the output of 'vserver --debug vm5 start' 1174592044 M * Bertl (paste.linux-vserver.org) 1174592406 M * pagano uff.. I won! :) 1174592432 M * pagano the problem was in the command line options... not news :) 1174592579 M * daniel_hozac Bertl: i guess it doesn't :) 1174592632 M * Bertl okay, my suggestion would be the following: 1174592704 M * Bertl let's focus on the typical case (especially the install case), special cases like LDAP and such need special setup, in which case it is acceptable to internalize the package management 1174592725 M * daniel_hozac sure.
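Going back to the --interface forms Bertl spelled out above (and remembering that a single-dash -nodev gets parsed as -n odev), a working build line of the second form would look something like this, with an illustrative address:

/usr/local/sbin/vserver vm5 build -m yum \
    --hostname=vm5 \
    --interface eth0:192.0.2.10/24 \
    -- -d centos4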
1174592855 M * Bertl imho, if rpm doesn't get the uid/gid stuff right when called with --root, then this is a bug and should be addressed in rpm 1174592882 M * Bertl nevertheless, we can try to work around that with a few bind mounts if necessary 1174592888 M * daniel_hozac well... rpm is full of bugs :) 1174592906 M * Bertl yeah, and debootstrap is perfect, as we all know/have seen :) 1174592921 M * daniel_hozac hehe. 1174592949 M * daniel_hozac nobody seems to care enough about debootstrap to investigate/fix it though ;) 1174592971 M * Bertl is there a maintainer for that? 1174592991 M * daniel_hozac well, i meant more in the util-vserver sense. 1174593039 M * Bertl hmm, I guess micah would look into whatever debian issues there are (given enough time) when somebody reports them 1174593063 M * daniel_hozac sure. 1174593070 M * Bertl IMHO the main problem is that the bug reporting/resolution is flawed there 1174593078 M * daniel_hozac oh? 1174593102 M * Bertl well, somebody comes here and has a (debian) problem 1174593129 M * Bertl we walk him through and explain what debian gets wrong (besides being outdated) 1174593162 M * Bertl we might also ask him to file a bug report to the debian bug tracker (50%) 1174593185 M * Bertl he might actually file a bug report then (25%) 1174593204 M * Bertl which is redistributed to the maintainer of that tool 1174593241 M * Bertl who doesn't know anything about Linux-VServer in most cases ... assuming he knows (10%) 1174593264 M * Bertl he will try to find the time and recreate the issue (5%) 1174593284 M * Bertl and eventually find a good/appropriate solution (3%) 1174593304 M * Bertl which then has to be postponed as debian is in deep freeze 1174593332 M * Bertl but let's further assume it is a really important issue and the maintainer really acknowledges that (1%) 1174593363 M * daniel_hozac so we're talking about issues in non-util-vserver Debian packages, yes? 1174593370 M * Bertl then it might get fixed in one of the updates in the next 6 months (0.5%) 1174593408 M * Bertl we are talking about issues caused, triggered and/or related to linux-vserver, but not necessarily in util-vserver 1174593439 M * Bertl but even for util-vserver the upstream merge is terribly slow, don't you think so? 1174593447 M * daniel_hozac hmm, are there really that many problems? 1174593461 M * daniel_hozac well, micah is quite responsive in updating it. 1174593514 M * daniel_hozac but as Debian's policies are, well, weird, it won't make it into the stable releases. 1174593549 M * daniel_hozac maybe we should just starting hiding security problems in every release to make sure we can convince the Debian people to allow the update :) 1174593553 M * Bertl let me state that here: the debian support is a million times better than the ubuntu support (which is actually not present at all :) 1174593567 M * daniel_hozac definitely. 1174593574 M * daniel_hozac Ubuntu is in desperate need of a maintainer. 1174593606 M * Bertl so that is not a general rant against debian, this is just stating the fact that something in the debian package maintainance concept is seriously wrong 1174593615 M * daniel_hozac well, yeah. 1174593623 M * daniel_hozac but that's just the way Debian is :) 1174593689 M * Bertl yeah, not planning to change that ... just because you seemed to imply that nobody cares about debootstrap :) 1174593778 M * Bertl (as if that were something untypical for debian, that is :) 1174594059 M * DavidS Yeah .. 
Debian is disassociating itself - unintentionally i hope - from the development cycle of many packages. This is starting to really hurt debian not only as a integrator but overall ... 1174594117 M * DavidS one of the solutions might be to start a "snapshot" archive where the newest release candidates, betas, nightly snapshots of everything are available for gustation 1174594132 M * Bertl central or per project? 1174594153 M * DavidS Bertl: i don't care, but central would obviously be better. 1174594159 M * daniel_hozac umm, isn't that what unstable is? 1174594199 M * DavidS nowadays? not anymore. unstable is "stuff that should migrate to testing" i.e. things which will release sooner or later 1174594224 M * DavidS experimental should have taken over that role - and it is autobuilt these days - but hasn't taken up this role ... 1174594302 M * DavidS I often see developer complaining that nobody is testing their software before a release ... many people willing to do tests who are using debian have no chance at all to even notice(!) a release-candidate since there won't be any debian packages for it ... 1174594315 M * DavidS s/willing/who might be &/ 1174594454 M * Bertl but that would suggest a decentralized solution 1174594461 M * daniel_hozac well, if someone's interested in testing software, they should use upstream packages. 1174594475 M * DavidS hmm .. I just wanted to take gaim as bad example where currently betas are arriving weeks after the release .. but there is a beta+1 in experimental ... 1174594485 M * Bertl something where the maintainer has direct contact with the users 1174594505 M * Bertl DavidS: bet2.0.0-1? 1174594534 M * DavidS etch/sid have beta5 and experimental has beta6 1174594552 M * Bertl okay, that seems almost up-to-date 1174594565 M * Bertl mdv 2007 shipped with beta2 IIRC 1174594566 M * DavidS I'd rather not comment about having a beta in etch ... even if it works very well 1174594592 M * daniel_hozac util-vserver is alpha :) 1174594607 M * DavidS I feel the urge to install some experimental packages on my laptop :) 1174594708 M * DavidS re: contact to maintainer: yeah .. that's right .. on the other hand, there are many "upstream" packages I wouldn't touch with thick gloves .. let alone install on my workstation 1174594759 M * DavidS daniel_hozac: util-vserver is long enough alpha that i don't care :) 1174594824 M * Bertl DavidS: btw, there is no woody for amd64, right? 1174594836 M * daniel_hozac even sarge doesn't have an official release for amd64, IIRC. 1174594843 M * DavidS indeed 1174594857 M * Bertl yeah, but I found the inofficial one :) 1174594869 M * Bertl i.e. sarge64 guest installed fine :) 1174594927 M * pagano can I fix a max use of ram and cpu for each vserver ? 1174594947 M * Bertl fix means? 1174594953 M * pagano set max use 1174594954 M * DavidS brb 1174594961 Q * DavidS Quit: Leaving. 1174595002 M * pagano max amount. vserver with 256 mb ram and 1 ghz for example 1174595027 M * Bertl http://linux-vserver.org/Memory_Limits 1174595054 M * Bertl http://linux-vserver.org/CPU_Scheduler 1174595067 M * pagano perfect :) 1174595068 M * Bertl http://linux-vserver.org/Resource_Limits 1174595180 J * DavidS ~david@chello062178045213.16.11.tuwien.teleweb.at 1174595191 M * DavidS shiny! 1174595206 M * pagano see u tomorrow, Bye! 1174595226 M * pagano and thanks for all, obvoiusly! 1174595234 N * pagano pagano_out 1174595266 M * Bertl pagano_out: cya! 
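The pages Bertl links describe roughly the following kind of per-guest setup. This is only a sketch: the file names follow the util-vserver conventions those pages document, the values are illustrative, and the exact scheduler knobs differ between kernel branches.

# cap the guest's resident memory at about 256 MB (rss is counted in pages, 4 KiB here)
mkdir -p /etc/vservers/vm5/rlimits
echo 65536 > /etc/vservers/vm5/rlimits/rss.hard

# token bucket CPU scheduler: 1 token every 4 ticks, i.e. roughly 25% of one CPU
mkdir -p /etc/vservers/vm5/sched
echo 1 > /etc/vservers/vm5/sched/fill-rate
echo 4 > /etc/vservers/vm5/sched/interval
echo sched_hard >> /etc/vservers/vm5/flags   # enforce the cap instead of only prioritizing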
1174595797 M * Bertl daniel_hozac: okay, adding a bunch of debug entries shows that even my most recent notty_race doesn't excersise the interesting pathes :( 1174595846 M * Bertl http://vserver.13thfloor.at/Experimental/TOOLS/notty_race.c 1174595865 M * Bertl if you have any suggestions to improve, I'd be delighted to hear 1174595882 Q * PowerKe Remote host closed the connection 1174595884 M * Bertl doener: ping? that might be something for you too ... 1174595923 M * daniel_hozac Bertl: have you figured out if disassociate_ctty is actually run? 1174595942 M * Bertl it isnt 1174595944 M * daniel_hozac i'd expect we need a setsid or something similar for that. 1174595968 M * Bertl I have printks in all places which remove tty now 1174595972 M * daniel_hozac (i'm still really fuzzy on how all that works) 1174595986 M * daniel_hozac ok. 1174595990 M * Bertl so if you have any code to test, you can do that in princeton 1174595996 M * daniel_hozac okay. 1174596018 M * Bertl serial console is hooked up, so do your worst :) 1174596104 M * daniel_hozac setsid() should set current->signal->leader. 1174596109 M * Bertl what totally confuses me is that the TIOCNOTTY: actually should hit both 1174596126 M * daniel_hozac both what? 1174596134 M * Bertl the disassociate_ctty() in certain cases _and_ the current->signal->tty = NULL in _all_ cases 1174596150 M * Bertl but I get neither of them logged ... 1174596165 M * Bertl maybe it is time to check with strace .... 1174596176 M * daniel_hozac hmm, i.e. neither is executed? 1174596232 M * Bertl I have a printk there, nothing gets logged 1174596274 M * daniel_hozac very weird... 1174596299 M * Bertl [pid 2386] open("/dev/tty", O_RDWR) = 3 1174596299 M * Bertl [pid 2386] ioctl(3, TIOCNOTTY) = 0 1174596304 M * Bertl looks okay to me ... 1174596325 M * daniel_hozac indeed... 1174596335 M * daniel_hozac where's the test program in princeton? 1174596385 Q * dhansen Quit: Leaving 1174596424 M * Bertl daniel_hozac: in the home dir 1174596434 M * daniel_hozac okay. 1174596526 M * daniel_hozac kernel tree? 1174596556 M * Bertl /usr/src/`uname -k` or so 1174596579 M * Bertl linux-`uname -r` actually :) 1174596607 M * Bertl nah, I'm lying .. don't trust me 1174596646 M * daniel_hozac hmm, seems like the right tree to me? 1174596673 M * Bertl yeah, actually it is ... :-) 1174596698 M * Bertl it should have been -X1 ... 1174597802 M * daniel_hozac well, i have no idea. 1174597834 M * Bertl but you can confirm my observations, yes? 1174597872 M * daniel_hozac which ones? 1174597892 M * Bertl that it doesn't log when it should log :) 1174597896 M * daniel_hozac i can't get it to run disassociate_ctty. 1174597908 M * daniel_hozac (or at least not output that message) 1174597952 M * Bertl did you succeed in triggering one of the other 'zero in' messages? 1174597966 M * daniel_hozac setsid and tty_ioctl. 1174598056 M * Bertl tty_ioctl should suffice to trigger the race 1174598063 M * Bertl how did you get that one? 1174598074 M * daniel_hozac ~/notty_race 1174598090 M * Bertl hum, a modified one? 1174598092 M * daniel_hozac but that one is protected by task_lock, shouldn't that be safe? 1174598106 M * daniel_hozac slightly, i just added the setsid and setpgid(0, 0). 1174598210 Q * gerrit Ping timeout: 480 seconds 1174598268 M * Bertl okay, yes, that one is protected ... 1174598361 M * Bertl setsid() creates a new session if the calling process is not a process group leader. 1174598378 M * Bertl isn't our task a process group leader by default? 
1174598466 M * Bertl and isn't setpgid(0,0) a noop? 1174598483 M * daniel_hozac i have no idea :) 1174598512 M * daniel_hozac i was mostly trying to get at the current->signal->leader = 1 in setsid. 1174598535 M * daniel_hozac hmm, not current, group_leader... 1174598563 M * daniel_hozac i think the order should be the other way around. 1174598570 M * daniel_hozac i.e. setpgid(0,0); setsid(); 1174598597 M * Bertl http://www.seas.upenn.edu/~cse381/lectures/lec3.pdf 1174598607 M * daniel_hozac heh, thanks. 1174598747 M * Bertl so piping causes process groups? 1174598829 M * daniel_hozac apparently. 1174598991 Q * m`m`h Ping timeout: 480 seconds 1174599063 J * m`m`h ~simba@deb30.mgts.by 1174599383 M * Bertl bingo! 1174599388 M * Bertl screen ./notty_race 1174599448 M * Bertl but still no oops 1174599660 J * Aiken ~james@ppp250-73.lns2.bne4.internode.on.net 1174599749 M * Bertl morning Aiken! 1174599981 M * Bertl okay, I need a break ... back later ... 1174599990 N * Bertl Bertl_oO 1174600564 Q * DavidS Quit: Leaving. 1174600568 J * DavidS ~david@chello062178045213.16.11.tuwien.teleweb.at 1174601083 M * slack101 Bertl_oO: Hola 1174601267 J * PowerKe ~tom@d54C13E4B.access.telenet.be 1174603956 J * boci^ boci@pool-4774.adsl.interware.hu 1174603968 Q * DavidS Quit: Leaving. 1174604564 Q * infowolfe Read error: Connection reset by peer 1174604783 J * chand ~chand@vau75-3-82-224-128-66.fbx.proxad.net 1174604798 Q * chand 1174604864 J * infowolfe ~infowolfe@c-67-164-195-129.hsd1.ut.comcast.net 1174605549 P * stefani I'm Parting (the water) 1174605695 Q * michal` Ping timeout: 480 seconds 1174605981 Q * meandtheshell Quit: Leaving. 1174606069 J * michal` ~michal@www.rsbac.org 1174606278 M * matti :) 1174606748 M * slack101 whaqt up 1174606844 J * glutoman glut@no.suid.pl 1174606931 Q * glut Ping timeout: 480 seconds 1174607114 N * DoberMann DoberMann[ZZZzzz] 1174607209 Q * bonbons Quit: Leaving
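For anyone following along, the pieces above combine into roughly this reproduction attempt; the -lpthread is an assumption based on Bertl mentioning a threaded version, and the extra printk/WARN_ON output only exists in his instrumented kernel.

# grab and build Bertl's test program
wget http://vserver.13thfloor.at/Experimental/TOOLS/notty_race.c
gcc -Wall -o notty_race notty_race.c -lpthread

# hammer /proc/*/stat from the side, as in the one-liner earlier (unbounded, kill it when done)
while true; do bash -c 'cat /proc/[0-9]*/stat >/dev/null' & done &

# running the racer under screen is what finally exercised the tty-drop paths
screen ./notty_race

# watch the serial console / dmesg for the added debug output
dmesg | tail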