1137715644 J * Johnno ~jdlewis@24.154.53.16 1137715645 Q * Johnsie Read error: Connection reset by peer 1137717379 P * undefined 1137718002 Q * mkhl Quit: 1137719075 J * lilo_ ~lilo@cpe-24-167-94-255.houston.res.rr.com 1137719269 Q * lilo Killed (NickServ command used by lilo_) 1137719273 N * lilo_ lilo 1137719881 J * _Roey ~abc@pcp04370251pcs.nrockv01.md.comcast.net 1137719890 M * _Roey say 1137719894 M * _Roey Who here is Namulator ??? 1137719906 M * _Roey http://linux.slashdot.org/comments.pl?sid=174265&cid=14506897 1137719909 M * _Roey who made that post? 1137719973 N * Bertl_oO Bertl 1137719991 M * Bertl back now ... 1137720016 M * _Roey Hey Bertl 1137720031 M * _Roey Herbert, do you know who Namulator is on Slashdot by any chance? 1137720045 M * Bertl no, sorry ... 1137720047 M * _Roey http://linux.slashdot.org/comments.pl?sid=174265&cid=14497502 1137720049 M * _Roey read that 1137720051 M * _Roey he's a vserver dev 1137720194 M * Bertl sounds reasonable what he's writing, no? 1137720209 A * Bertl is reading the second one now 1137720318 M * _Roey very reasonable 1137720319 M * _Roey yes 1137720320 M * _Roey I agree. 1137720420 M * daniel_hozac Bertl: re PR, i could do that. 1137720453 M * _Roey daniel_hozac: oh? 1137720627 J * gerrit gerrit@dhcp65-74-212-252.npk.aus.wayport.net 1137720673 M * Bertl daniel_hozac: excellent, tx 1137720848 M * Bertl wb gerrit! 1137720874 M * Bertl ebiederm: I'm around now ... but no need to hurry ... 1137720909 M * daniel_hozac (assuming there is no PR-style writing needed, in which case i'd suck at it ;)) 1137720924 M * Bertl daniel_hozac: don't worry ... 1137721084 M * Bertl _Roey: so everything working on your servers by now? 1137721222 M * Bertl _Roey: I mean regarding linux-vserver of course :) 1137721238 P * meandtheshell 1137721701 J * undefined ~undefined@adsl-68-93-109-94.dsl.rcsntx.swbell.net 1137721708 M * Bertl welcome undefined! 
1137721719 M * undefined howdy Bertl 1137722404 A * Bertl .o( hmm, seems Roey doesn't talk to me anymore :( ) 1137722564 M * daniel_hozac oh, Enrico merged the chbind patch. 1137722573 M * Bertl ah, great! 1137722608 M * daniel_hozac hmm, it appears i miscommunicated the purpose of it... 1137722624 M * Bertl lol ... 1137722634 M * daniel_hozac he seems to think it's ngnet related. 1137722650 M * Bertl but he merged it? 1137722658 M * daniel_hozac yep. 1137722669 M * Bertl well, you should inform him nevertheless 1137722676 M * Bertl but that's actually a good sign ... 1137722682 M * daniel_hozac yeah. 1137722686 M * daniel_hozac i'll add that to the patch report. 1137723037 J * mkhl ~mkhl@200-153-181-238.dsl.telesp.net.br 1137723187 M * Bertl welcome mkhl! 1137723224 M * mkhl Bertl hello 1137723235 J * Aiken_ ~james@tooax7-146.dialup.optusnet.com.au 1137723279 M * Bertl hey Aiken_! 1137723549 Q * Aiken Ping timeout: 480 seconds 1137723675 M * Aiken_ hi 1137723704 M * ebiederm Bertl: I'm back. 1137723740 M * Bertl ebiederm: great, so any issues you encountered? or anything you would like to review or talk about? 1137723779 M * ebiederm At the moment I'm in decent shape. 1137723839 M * ebiederm As soon as I have a couple more moments I will have my git tree posted. 1137723853 M * ebiederm Then I can start a my code is better than your code contest with the IBM guys :) 1137723878 M * Bertl okay ... 1137723931 M * ebiederm I'm just been running talking to my coworkers so I really haven't done anything since about the time you left. 1137723964 Q * brc_ Ping timeout: 480 seconds 1137723986 M * ebiederm Did you have something you wanted to talk about. 1137723991 M * ebiederm It sounded like it earlier. 1137724035 M * Bertl well, not yet .. 
but I'm sure I will have as soon as there are patches around :) 1137724437 Q * pusling Read error: Connection reset by peer 1137724439 J * pusling_ pusling@195.215.29.124 1137725032 M * ebiederm Well now my current snapshot is up at git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-ns.git 1137725042 M * ebiederm I now get to discover what the mirroring delay is. 1137725049 M * ebiederm :) 1137725091 M * Bertl okay, is there a patch (maybe a broken down version) too? 1137725117 M * ebiederm I can make one easily enough. 1137725124 M * Bertl that'd be great! 1137725234 M * ebiederm At least on my development branch my goal is to move quickly and to leave git and people's scripts the work of generating patches. 1137725253 M * ebiederm For the final kernel inclusion I will need a clean line of development but... 1137725279 M * ebiederm So I guess I need to understand how you work so we can work well together. 1137725299 M * Bertl basically I haven't made the 'transition' to git yet, but it's on my todo 1137725323 M * Bertl for now, a unified diff (or several of them) will do perfectly fine 1137725340 M * Bertl if I have updates or fixes, then they appear as deltas 1137725348 M * Bertl you can get an idea here: 1137725367 M * Bertl http://vserver.13thfloor.at/Devel/ 1137725426 Q * pusling_ Remote host closed the connection 1137725434 J * pusling pusling@195.215.29.124 1137725563 M * ebiederm Ok. I have mostly digested that. 1137725589 M * ebiederm I honestly like that break out into logical patches as well. 1137725623 M * ebiederm Currently my work in size looks a lot like your PAT-2.0.0 directory if I break it. 1137725638 M * ebiederm Such a huge quantity of patches just floating around tends to be unwieldy. 
1137725682 M * Bertl well, breaking it down into smaller parts helps a lot with the understanding of the individula changes 1137725693 M * Bertl even if the number of patches is quite high :) 1137725698 J * _mountie ~mountie@trb229.travel-net.com 1137725705 M * ebiederm I agree and it helps with maintenance outside the tree. 1137725722 M * Bertl ebiederm: of course, breaking it down per file or such is absolute nonsense :) 1137725909 M * ebiederm So I am in the process of generating 2 things. 1137725938 M * ebiederm The first is a tarball of all of my ``patches'' and the second is an overall patch. 1137725958 M * ebiederm Then I get to figure out how to get all of this going on the latest 2.6 kernel. 1137725992 Q * mountie Ping timeout: 480 seconds 1137726002 M * Bertl 2.6.15 or 2.6.16-rc is fine for me ... 1137726075 M * ebiederm I will start with 2.6.16-rc-latest. I'm still in a development mode so older kernels aren't that interesting. 1137726077 M * locksy I just can't seem to get BME applied to 2.0.1.2... How different are 2.0 and 2.1 (i.e. should I risk going with 2.1 for a production server) 1137726110 M * Bertl ebiederm: okay, good ... 1137726138 M * Bertl locksy: well, quite different to be honest ... but the good news is, the next stable release will contain BME (on public demand) 1137726156 M * locksy Yay! 1137726166 M * ebiederm What is BME? 1137726174 M * locksy Bind Mount Extensions 1137726177 M * Bertl a fix to mainline 1137726191 M * locksy allows mount --bind -o ro 1137726195 M * Bertl ebiederm: makes --bind mounts behave 'normal' 1137726195 M * ebiederm Something I need to catch on. 1137726197 M * ebiederm Ah. 1137726198 M * locksy (among other things) 1137726249 M * locksy So, any timeline for the next stable release then :) 1137726271 M * Bertl will probably happen soon, I guess even before 2.6.16 1137726317 M * locksy I've been putting off this kernel upgrade for nearly a fortnight now... I guess I can wait a bit longer. 
1137727391 J * brc_ bruce@200141105028.user.veloxzone.com.br 1137727402 M * Bertl welcome brc_! 1137727691 M * ebiederm Ok. The easy half is done. My current snapshot against 2.6.14-rc2 is posted as patches (and as one big patch). 1137727697 M * ebiederm Now I get to forward port the sucker! 1137727712 M * Bertl okidoki :) 1137727725 Q * pusling Remote host closed the connection 1137727774 J * pusling ~pusling@195.215.29.124 1137728157 M * locksy ebiederm: the rsync from master.kernel.org to the public servers still ain't done :( Do you have the git tree accessible elsewhere or should I just be patient :) 1137728219 M * ebiederm I really don't have room elsewhere to put up a git tree. 1137728253 M * ebiederm Is anything showing up? 1137728343 M * locksy It just takes a couple of hours to show up on the public mirrors... 1137728356 M * ebiederm Ok. Sorry about that. 1137728373 M * ebiederm I can probably push my patch up to my website. 1137728374 M * locksy np 1137728390 M * ebiederm At least that would let you get the feel for what I have been doing. 1137728415 M * locksy It's alright I'll go have lunch instead. *grin* 1137728548 M * Bertl ebiederm: well, if you need, I can make some room for you on 13thfloor.at 1137728649 M * ebiederm So with kernel.org I have enough room but it obviously isn't as timely as wood be nice. 1137728688 M * ebiederm So if the delay becomes a problem that would make sense. 1137728690 M * daniel_hozac here's a newbie question for you all: do all files and directories have inodes? 1137728728 M * ebiederm Depends on the filesystem. The unix model is yes. But non-unix fs are peculiar. 1137728741 M * Bertl daniel_hozac: on linux: yes, I'd say so ... 1137728750 Q * lilalinux Remote host closed the connection 1137728794 M * daniel_hozac is ls -1aRi | awk '/^[0-9]+ / { print $1 }' | sort -u | wc -l an accurate way to get a count of how many inodes are used within a directory? 
1137728831 M * Bertl hum, no, probably not (because of hardlinks) 1137728840 M * daniel_hozac wouldn't the sort -u take care of those? 1137728858 M * ebiederm It should. 1137728869 M * Bertl hmm, yeah, probably, right .. 1137728884 M * ebiederm So assuming a sane filesystem that will work. 1137728926 M * Bertl yes, as long as it doesn't report the same inode number for actually different inodes 1137728960 M * ebiederm I believe /proc can do that if a process opens more than 64K files. 1137728983 M * Bertl daniel_hozac: i.e. for the dlimit is sounds like a nice solution, if you figure/eliminate unified files 1137729007 M * daniel_hozac that's what's on the disk limit pages now, i think. 1137729043 M * daniel_hozac i've been working a patch for util-vserver to put disk limit configuration in /etc/vservers//dlimits though, and i've been trying to create a util to count the inodes... 1137729053 M * daniel_hozac the result differs from the above, however. 1137729058 Q * pusling Remote host closed the connection 1137729070 M * Bertl daniel_hozac: in what way? 1137729079 M * daniel_hozac (by about 800 inodes) 1137729080 J * pusling pusling@195.215.29.124 1137729086 M * Bertl more or less? 1137729113 M * daniel_hozac 800 inodes more from my util. 1137729119 M * daniel_hozac http://daniel.hozac.com/tmp/util-vserver-0.30.209-dlimit-config.patch 1137729132 M * daniel_hozac i've brutalized vdu (seemed appropriate). 1137729134 J * mef ~mef@pcp09872021pcs.ewndsr01.nj.comcast.net 1137729140 M * Bertl do you walk the directories recursively? 1137729156 M * daniel_hozac yes. 1137729172 M * daniel_hozac opendir(), readdir(), if (S_ISDIR()), call self, etc. 1137729174 M * mef bertl: the machine is ready. 1137729232 M * daniel_hozac space wise i get the exact same results from the util as from regular du (well, except for the fact that the start directory isn't counted by the util, so it's off by one block). 1137729370 M * Bertl mef: excellent! tx! 
1137729554 M * Bertl daniel_hozac: the first entries in a dir are '.' and '..' 1137729562 M * Bertl daniel_hozac: you do not account them, right? 1137729588 M * daniel_hozac no. 1137729617 M * daniel_hozac those ought to be accounted through the recursiveness, no? 1137729626 M * Bertl yes, just checking ... 1137729692 M * Bertl daniel_hozac: if you take a simple directory, with just 5 files or so, is there a difference too? 1137729709 M * ebiederm If you are worried about link count . and .. make a difference. 1137729721 M * daniel_hozac no, in my controlled tests, it counted corectly. 1137729724 M * ebiederm But for inode count you should be fine. 1137729762 M * Bertl daniel_hozac: maybe vfsmount boundaries? 1137729784 M * daniel_hozac Bertl: hmm? another filesystem? 1137729809 M * Bertl could be, and could use the same inode numbers there 1137729824 M * daniel_hozac well, i check dirst.st_dev == st.st_dev for directories before recursing. 1137729853 M * daniel_hozac (dirst == struct stat for the current directory, st for the current directory entry) 1137729855 M * ebiederm On which part of the directory? dir or dir/. ? 1137729891 M * daniel_hozac just dir 1137729896 M * Bertl daniel_hozac: does the ls -R care about dev boundaries? 1137729907 M * daniel_hozac Bertl: i think you need -x for that. 1137729915 M * ebiederm Sometimes just dir will return the mount point and not the filesystem mounted on it. 1137729926 M * Bertl daniel_hozac: I don't think it has -x (with that meaning :) 1137729937 M * Bertl -x list entries by lines instead of by columns 1137729937 M * daniel_hozac indeed, -x means something entirely different. 1137729989 M * daniel_hozac well, i have no filesystems mounted in the tree i'm running it. 1137730014 M * Bertl good argument :) 1137730720 M * daniel_hozac is there any way to go from inode number to path? 1137730755 M * ebiederm Brute force. 
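The counting utility daniel_hozac describes above (opendir(), readdir(), recurse into directories, compare st_dev before descending) can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the util-vserver patch itself: the function name `count_inodes` is hypothetical, and unlike his util it also counts the start directory.

```python
import os


def count_inodes(root):
    """Count distinct inodes at and below root, staying on root's device."""
    root_stat = os.lstat(root)
    root_dev = root_stat.st_dev
    # Hard links share one inode, so a set of (device, inode) pairs
    # plays the role of `sort -u` in the shell pipeline.
    seen = {(root_stat.st_dev, root_stat.st_ino)}
    stack = [root]
    while stack:
        directory = stack.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                st = entry.stat(follow_symlinks=False)
                if st.st_dev != root_dev:
                    # Entry lives on another filesystem: skip it entirely.
                    continue
                seen.add((st.st_dev, st.st_ino))
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
    return len(seen)
```

os.scandir() never yields '.' and '..', so each directory is accounted once, through its entry in the parent, which matches the "accounted through the recursiveness" reasoning in the log.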
1137730759 Q * ebiederm Quit: Leaving 1137730789 J * ebiederm ~eric@ebiederm.dsl.xmission.com 1137730801 M * locksy ebiederm: It's there now :) 1137730807 M * ebiederm Cool. 1137730841 M * ebiederm Now you can look at my code and say horrible things about it :) 1137730935 M * ebiederm Berl: You said earlier reading the thread on linux-kernel you got an aha moment? 1137730956 M * ebiederm No I can't even spell peoples names! 1137731025 J * stefani ~stefani@c-24-19-46-211.hsd1.wa.comcast.net 1137731081 M * daniel_hozac oh darn, it was something simple as that. 1137731098 M * daniel_hozac the awk pattern wouldn't match padded inode numbers. 1137731152 M * daniel_hozac sorry for wasting all of yours time, heh. 1137731169 M * Bertl np 1137731173 M * Bertl welcome stefani! 1137731184 M * stefani hola.! 1137731191 M * Bertl ebiederm: yes, regarding the 'real pid' elimination ... 1137731275 M * Bertl ebiederm: so did I miss a patch, or do I have to get the entire kernel tree? 1137731317 M * Bertl ebiederm: or am I just too impatient? 1137731342 M * ebiederm Not sure... 1137731399 M * ebiederm No I just didn't post the url for the patches. 1137731413 M * ebiederm See: ftp://ftp.kernel.org/pub/linux/kernel/people/ebiederm/namespaces/2.6.14-rc2/ 1137731445 M * Bertl ah, great, tx! 1137731491 M * Bertl I assume they are GPL and I can copy them somewhere (public) 1137731529 M * ebiederm Yes. GPLv2 and I just put them somewhere public so.... 1137731561 M * ebiederm However I suspect the sequence of patches looks like a drunken walk at the moment. 1137731906 Q * mef Quit: using sirc version 2.211+KSIRC/1.3.12 1137732513 M * ebiederm So it appears I have stunned the audience with my giant pile of patches.... 1137732652 M * Bertl well, just looking through them right now ... 
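The padded-inode problem daniel_hozac reports above can be reproduced with a small sketch. ls right-aligns inode numbers within a directory listing, so shorter numbers get leading blanks and a pattern anchored at column 0 skips them. The listing below is made up for illustration (it is not from the log), and the names `STRICT`, `PADDED` and `unique_inodes` are hypothetical.

```python
import re

# Same anchoring as the awk pattern in the log: digits must start in column 0.
STRICT = re.compile(r"^([0-9]+) ")
# Variant that tolerates the leading blanks used to right-align the numbers.
PADDED = re.compile(r"^ *([0-9]+) ")

# Made-up listing in the style of `ls -1aRi` output.
SAMPLE = """\
.:
   1201 .
     96 ..
   1305 notes.txt
1048577 sub

./sub:
1048577 .
   1201 ..
   1305 hardlink-to-notes
"""


def unique_inodes(listing, pattern):
    """Return the set of inode numbers the pattern picks out of the listing."""
    found = set()
    for line in listing.splitlines():
        match = pattern.match(line)
        if match:
            found.add(int(match.group(1)))
    return found
```

On this sample the strict pattern only finds the widest inode number while the padded one finds all four, which is consistent with the shell pipeline reporting fewer inodes than the recursive utility. An awk pattern such as `/^ *[0-9]+ /` would presumably behave like the padded variant.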
1137732656 M * Bertl http://www.13thfloor.at/NGNET/ebiederm/2.6.14-rc2/broken-out/ 1137732664 M * ebiederm :) 1137732698 M * Bertl but NGNET is probably the wrong name here, as it includes all different virtualizations 1137732707 M * Bertl *kinds 1137732801 M * Bertl nevertheless, is there a case where do_fork doesn't do pid_to_user()? 1137732817 M * ebiederm Hm... 1137732840 M * ebiederm This is where you get to see my drunken walk. 1137732875 M * ebiederm I started down that path, saw how little use it was and how much a problem it was and then kill the whole pid_to_user idea. 1137732906 M * Bertl okay, so it's probably better to break up the whole patch anew then 1137732919 M * Bertl that's why I don't like the auto generated git/commit breakdowns 1137732947 M * ebiederm I agree and that is the next step. 1137732956 M * Bertl okay, great! :) 1137732978 M * ebiederm Basically I need that if I am going to update to the latest kernel sanely anyway. 1137732989 M * ebiederm But it is going to be a couple of days before that is done. 1137732997 M * ebiederm As it is you can get some sense of what I was thinking. 1137733068 M * ebiederm It is at least slightly interesting to see which wrongs paths I went down, because I have good reasons why I retreated from all of them. 1137733123 M * ebiederm One of the reasons I have problems with task_pid is that my equivalent didn't help. And the IBM guys at least last time around missed the do_fork case. 1137733147 M * Bertl okay, np, any chance to 'priortize' the network stuff, as we want (IIRC?) to include that asap? 1137733293 M * ebiederm Just a little. 1137733306 M * ebiederm the sysvipc stuff is pretty trivial so it doesn't much count. 1137733327 M * ebiederm The process stuff is actually a lot simpler than the network stuff (despite how many revs it took me to get there) 1137733346 M * ebiederm So it will probably still come at the end of the list but that should not take too long. 
1137733389 M * ebiederm But there is one interesting question about the network stuff I have been asking myself back and forth. 1137733409 M * ebiederm Does utsname belong to the network? 1137733430 M * Bertl no, definitely not 1137733431 M * ebiederm The code still exists even without CONFIG_NET and it appears to be a completely separate and distinct animal. 1137733459 M * ebiederm Ok. With hostname in there it feels like it because they are always used together. 1137733476 M * ebiederm So I need to break that out into the world's tiniest namespace. 1137733477 M * Bertl not really ... 1137733496 M * Bertl networking is not about 'names' except for the interfaces :) 1137733533 M * ebiederm Well and the ports. 1137733541 M * ebiederm And the IP addresses. 1137733547 M * Bertl they are not _names_ :) 1137733573 M * Bertl at least not in my book ... 1137733590 M * ebiederm Well not in the sense a person usually thinks of them. 1137733604 M * ebiederm But they are labels you find things by so close enough. 1137733636 M * Bertl well ... in this case you probably have to put everything under networking (if I follow your logic :) 1137733682 M * Bertl but no, seriously, we would never put utsname virtualization (which is basically identical to yours) under networking contexts 1137733705 M * ebiederm Ok. That settles it then. 1137733734 M * ebiederm The biggest piece you could help with is to look at the permission issues. 1137733747 M * ebiederm So far all I have looked at is correctness. 1137733776 M * Bertl I will look into all kind of stuff sooner or later .. but as I said, we should have some priorities here ... 1137733836 M * Bertl and, we should definitely clarify if your goal is to 'get something in' or 'just satisfy your needs' or 'make it usable by current virtualization projects' ... 1137733989 M * ebiederm So my first priority is to get my general scheme of breaking everything into namespaces accepted. 
1137734050 M * ebiederm Once the basic form of the code is worked out the rest of the pieces get easier. 1137734076 M * Bertl okay, but this already brings some questions ... 1137734104 M * Bertl for example, linux-vserver currently knows and uses two context identifier 1137734129 M * ebiederm One of my needs is long term maintainability. And that means a clean solution that works for the greatest number of people. 1137734131 M * Bertl to make a transition simple, we would probably map the 'namespaces' to them 1137734166 M * Bertl of course, this has to happen with a) no overhead for the user, and b) in a transparent way 1137734198 M * Bertl ebiederm: did you think about this yet? 1137734210 M * ebiederm And I am currently looking at 4 context id's. 1137734230 M * ebiederm So for your problem this is where I don't get in the way but I don't help too much. 1137734257 M * ebiederm There is the trivial solution of putting your context id in the appropriate version of my structures. 1137734286 M * ebiederm I don't know what that does for maintaining patches though. 1137734335 M * Bertl well, could we just play what-if with the utsname stuff for example? 1137734362 M * ebiederm Ok. 1137734389 M * Bertl so, what is (will be) the interface to your utsname virtualization (which is very similar to mine) 1137734409 M * Bertl probably a proc entry, where you write something, right? 1137734427 M * ebiederm Well adding anything to /proc is a problem. 1137734441 M * Bertl (was just an idea, what will it be?) 1137734442 M * ebiederm I would through another clone flag. 1137734457 M * Bertl okay, so we add the CLONE_UTSNAME 1137734458 M * ebiederm And then controll it with the normal sethostname etc. 1137734478 M * Bertl okay, but that is not good for linux-vserver folks 1137734500 M * Bertl why? 
simple, because some things should (read must) not be changed from _inside_ 1137734513 M * Bertl like for example the hardware architecture or kernel version :) 1137734543 M * ebiederm Do you support ever changing those fields? 1137734545 M * Bertl but regarding host name (or domain name) for example 1137734560 M * Bertl we have some flags, which decide if the guest can change it or not 1137734574 M * Bertl ebiederm: yes we do support that, and for good reason 1137734592 M * Bertl ebiederm: just think i386 on x86_64 or 2.4 guest on 2.6 1137734613 M * ebiederm Agreed. 1137734635 M * ebiederm It is useful to change the extra fields. 1137734647 M * ebiederm Currently the kernel doesn't actually have a mechanism for that. 1137734665 M * Bertl so .. how to map that, with a condition like the beforementioned flags to your namespace? 1137734670 M * ebiederm I would love to throw capability bits at the problem, and just worry about getting that right. 1137734700 M * Bertl good point, that could even work ... one reason why I suggested to extend the capability system years ago 1137734702 M * ebiederm However I believe we used up all 32 capability bits already. 1137734715 M * Bertl almost, but yes 1137734765 M * Bertl but we have 'per context' flags and capabilities too, so this is handled there (to some extend) 1137734785 M * Bertl (does not help mainline :) 1137734812 M * ebiederm Well it at least answers how that problem is solved today. 1137734850 M * ebiederm Stopping for a moment I'm really not too comfortable about utsname as an example for permissions as it is pretty ugly. 1137734867 M * ebiederm My idea is to have one capability per namespace. 1137734877 M * Bertl okay, let's pick something else, the pid virtualization, yes? 1137734919 M * ebiederm My wishlist is to be able to have most of the capabilities inside a namespace and not be able to do anything. 1137734936 M * ebiederm So for pid virtualization the capability for power is CAP_SYS_KILL. 
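The utsname permission rule Bertl outlines above (the host may set any field, the guest may change host or domain name only if a per-context flag allows it, and never the architecture or kernel version) can be modelled as a toy. This is a hedged Python sketch, not linux-vserver code: the flag names, the `UtsContext` class and its `set_field` method are all invented for illustration.

```python
from dataclasses import dataclass
from enum import Flag, auto


class GuestMay(Flag):
    """Hypothetical per-context permission flags (names invented here)."""
    NOTHING = 0
    SET_HOSTNAME = auto()
    SET_DOMAINNAME = auto()


# Which flag, if any, lets a guest change a given field from the inside.
GUEST_WRITABLE = {
    "nodename": GuestMay.SET_HOSTNAME,
    "domainname": GuestMay.SET_DOMAINNAME,
}

FIELDS = ("nodename", "domainname", "machine", "release")


@dataclass
class UtsContext:
    nodename: str
    domainname: str
    machine: str
    release: str
    flags: GuestMay = GuestMay.NOTHING

    def set_field(self, name, value, from_inside):
        """Set a uts field; requests from inside the guest are checked."""
        if name not in FIELDS:
            raise AttributeError(name)
        if from_inside:
            needed = GUEST_WRITABLE.get(name)
            # machine and release have no enabling flag, so a guest
            # can never change them; the host (from_inside=False) can.
            if needed is None or not (self.flags & needed):
                raise PermissionError(f"guest may not change {name}")
        setattr(self, name, value)
```

With this model the host can present an x86_64 machine as i686 to a guest, while the guest itself can at most rename its host or domain.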
1137734970 M * Bertl which does en-/disable what? 1137735010 M * ebiederm Actually it is CAP_KILL. 1137735030 M * ebiederm All it does is allow you to send signals processes owned by different users. 1137735052 M * ebiederm Not that useful. But there is not much administration going on with pids right now. 1137735074 M * Bertl well, for example we have different 'init' models 1137735079 M * ebiederm Ok. 1137735112 M * Bertl and I assume that this might become interesting even for mainline 1137735148 M * Bertl or at least I would wonder about 'pid namespaces' which do not have an init, but still work fine ... 1137735175 M * ebiederm What are your init modes? 1137735203 M * Bertl they are formed by two different flags, one which allows to blend-through the host (master) init 1137735221 M * Bertl and a second one which makes a specified pid the init process 1137735235 M * Bertl (the second one is your default) 1137735251 M * ebiederm What is blend-through? 1137735265 M * Bertl you can see init from the host, but you cannot touch it 1137735296 M * Bertl it's an important feature if you want to have a large number of (small) guests 1137735314 M * Bertl you basically save the init which is not used at all 1137735388 M * ebiederm Ok. I almost follow. 1137735411 M * Bertl it's basically a shallow guest, without the init process 1137735413 M * ebiederm You are setting up a context that shares pids with the host? 1137735425 M * Bertl only one pid, the init 1137735437 M * ebiederm Ok. You have a single pid guest. 1137735449 M * Bertl and it doesn't even have to be a real init, because it doesn't matter 1137735481 M * Bertl (just showing something which looks like init would be enough) 1137735513 M * ebiederm So the important part is that it can't send signals to anyone else? 1137735525 M * Bertl what? 1137735554 M * ebiederm I'm trying to understand your mini guest. 
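The two init models Bertl describes above can be pictured with a toy model of what a guest sees as pid 1. This is only a sketch of the idea as stated in the log, not of how linux-vserver implements it: the flag names `SHOW_HOST_INIT` and `OWN_INIT` and both helper functions are hypothetical.

```python
from enum import Flag, auto


class InitMode(Flag):
    """Hypothetical names for the two flags mentioned in the log."""
    NONE = 0
    SHOW_HOST_INIT = auto()  # "blend-through": host init visible, untouchable
    OWN_INIT = auto()        # a specified guest process is presented as pid 1


def visible_pids(own_pids, mode, init_pid=None):
    """Map each pid the guest sees to the real pid behind it."""
    view = {pid: pid for pid in own_pids}
    if InitMode.OWN_INIT in mode and init_pid in own_pids:
        # The chosen guest process shows up as pid 1 inside the guest.
        del view[init_pid]
        view[1] = init_pid
    elif InitMode.SHOW_HOST_INIT in mode:
        # The host's init shows through, but belongs to no guest.
        view[1] = 1
    return view


def may_signal(own_pids, real_pid):
    """A guest may only signal processes that belong to its own context."""
    return real_pid in own_pids
```

In the blend-through case pid 1 is visible, so tools that expect an init find one, yet `may_signal` still refuses it because the host init is not part of the guest's context.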
1137735562 M * Bertl okay, a simple example 1137735575 M * Bertl consider a guest which just consists of sshd running 1137735594 M * Bertl you can logon to that guest (via sshd) and do all kinds of stuff 1137735607 M * Bertl if you do 'pstree' for example, it will fail 1137735645 M * ebiederm Because pstree cannot see init? 1137735651 M * Bertl yep, precisely 1137735669 M * Bertl pid 1 has to be there. period. 1137735687 M * ebiederm Is it important that this guest not see other processes? 1137735704 M * Bertl well, yes, it's a fully fledged linux-vserver :) 1137735732 M * ebiederm And having sshd be pid 1 would be too silly. 1137735752 M * Bertl the problem is, it would not even work in certain cases 1137735774 M * ebiederm Where does it fall down? 1137735797 M * Bertl some tools assume that pid=1 means init, so they behave differently 1137735819 M * Bertl a very simple example is busybox 1137735839 M * Bertl another one is the init binary itself 1137735847 M * Bertl (which usually is the same as telinit :) 1137735920 M * ebiederm Ok. I begin to understand. 1137735983 M * ebiederm I'm not certain it is worth the special case code in the kernel. 1137736023 M * Bertl well, no problem with that, but if it is hard to add it (or requires significant changes later, it's not very usable for us) 1137736051 M * ebiederm I suspect a special static init gets about the same benefits. 1137736067 M * Bertl I'm not pessimistic here, I just try to point out some things we should consider ... 1137736078 M * Bertl ebiederm: special static init? 1137736080 M * ebiederm And I appreciate it. 1137736101 M * ebiederm Sit all day in a waitpid loop and reap children. 1137736122 M * ebiederm Just before entering that loop fork and exec something else. 1137736133 M * Bertl yes, that would be an option, question is, how much overhead is that when you have 200 guests? 1137736155 M * Bertl (because _when_ you have 200 guests, you care about overhead :) 1137736174 M * ebiederm Yes. 
500K per bash instance can be noticable. 1137736211 M * ebiederm I think early on one of my test cases was to nest about 100 guests (recursively) and kill them all simultaneously to look for races. 1137736290 M * ebiederm What do you usually figure as the per guest overhead? 1137736331 M * Bertl the structures allocated (usually one for each guest) and the additional space used by pointers required for those structures 1137736371 M * Bertl in your case, the static init would add with 100% (as it would not be there) 1137736406 M * ebiederm Not quite as it would share all of the namespaces with the other process. 1137736441 M * Bertl yes, but task struct and such would be in addition 1137736453 M * Bertl (not saying it would be much) 1137736477 M * ebiederm Do you have a rule of thumb resource utilization for small guests with processes in them? 1137736526 M * ebiederm I suspect the additional overhead for my approach may come close to 32K (task_struct 4k init etc) on a modern machine. 1137736557 M * Bertl typical 1-3M RSS 1137736591 M * Bertl that's ~5 processes actually doing something 1137736628 M * Bertl what would that add regarding scheduling, and more important, how would it affect scheduling? 1137736662 M * ebiederm If you are sleeping you shouldn't even show up to the scheduler. 1137736674 M * Bertl would it be possible to have some kernel thread 'posing' as init for those contexts? (i.e. one for _all_ of them)? 1137736710 M * ebiederm A good question. 1137736733 P * stefani parting (is such sweet sorrow) 1137736759 Q * ebiederm Quit: Leaving 1137736788 J * ebiederm ~eric@ebiederm.dsl.xmission.com 1137736794 M * Bertl wb ebiederm! 1137736797 M * ebiederm Sorry I think I fat fingered something. 1137736810 M * ebiederm So one init for all processes. 
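The "special static init" ebiederm proposes above (start the real workload, then sit in a waitpid loop reaping children) can be sketched in a few lines. This is a minimal Python sketch of the idea only; a real one would presumably be a small static C binary. The names `run_as_init` and `exit_code` are hypothetical, and the loop here stops once no children remain so that the sketch terminates, whereas an actual pid 1 would keep waiting.

```python
import os


def exit_code(status):
    """Turn a waitpid status into an exit code (negative for signals)."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


def run_as_init(argv):
    """Fork and exec argv, then reap every child until none are left."""
    pid = os.fork()
    if pid == 0:
        # Child: become the real workload; exit 127 if exec fails.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)
    reaped = {}
    while True:
        try:
            # Block until any child changes state, like an idle init does.
            child, status = os.waitpid(-1, 0)
        except ChildProcessError:
            break  # no children left; a real init would not stop here
        reaped[child] = exit_code(status)
    return pid, reaped
```

A process blocked in waitpid() is asleep, which fits ebiederm's remark that a sleeping task should not show up to the scheduler; the remaining per-guest cost is the memory of the extra task itself.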
1137736818 M * ebiederm s/processes/contexts/ 1137736837 M * Bertl for example, I would not care about the overhead there 1137736860 M * ebiederm With init on the inside of the guest I don't think I currently can do that. 1137737004 M * Bertl the 'future' pid model is actually pid-less in the kernel, no? 1137737015 M * ebiederm I take the point for light weight guests though. 1137737027 M * ebiederm A very good question. 1137737040 M * ebiederm I don't see how we implement weak references with a pidless model. 1137737069 M * Bertl okay, next question: what about session borders? 1137737099 M * Bertl i.e. how do you 'enter' a process space and get the data to your (still?) host terminal= 1137737100 M * ebiederm Sessions such as login sessions? Which are a superset of process groups. 1137737104 M * Bertl s/=/? 1137737107 P * undefined 1137737107 J * jpacheco ~justin@CPE00146c1608af-CM0f0099806976.cpe.net.cable.rogers.com 1137737112 M * jpacheco hey guys 1137737117 M * Bertl hey jpacheco! 1137737126 M * jpacheco sup Bertl :) 1137737149 M * Bertl everything fine, having a nice chat here :) 1137737150 M * jpacheco anyone else get this problem installing a vserver with gentoo 1137737156 M * jpacheco Selected arch not supported, or profile does not exist! 1137737168 M * Bertl what arch do you have/try? 1137737175 M * jpacheco i686 1137737191 M * Bertl hmm, maybe try i386 or i586 1137737238 M * jpacheco weird 1137737241 M * jpacheco tried x86 1137737241 M * Bertl ebiederm: so, consider you 'enter' the guest from the host, and execute something and later you put it into the background ... 1137737244 M * jpacheco and now it works 1137737263 M * ebiederm Bertl: ok. 1137737305 M * jpacheco Bertl: so, any word on that acl for tagged contexts? 1137737323 Q * pusling Remote host closed the connection 1137737325 M * Bertl jpacheco: not yet ... 
1137737331 J * pusling pusling@195.215.29.124 1137737335 M * jpacheco :( 1137737339 M * jpacheco that's cool 1137737368 J * balbir ~balbir@59.145.136.1 1137737488 M * jpacheco anyone ever get this when starting the vserver service 1137737492 M * jpacheco chdir(): No such file or directory 1137737506 M * Bertl no, but sounds like there are some dirs missing 1137737512 Q * brc_ Quit: No windows for this server 1137737516 M * Bertl (probably even the root dir :) 1137737563 M * jpacheco /vservers exists 1137737570 M * jpacheco fresh install 1137737574 M * Bertl ebiederm: how is it planned to enter the 'namespaces' at all? 1137737580 M * jpacheco of everything i mean 1137737591 M * Bertl jpacheco: how did you install the guest? 1137737599 M * jpacheco vserver-new 1137737620 M * Bertl that's probably a gentoo specific tool, so you have to ask Hollow ... 1137737623 M * jpacheco i can chroot into the vserver 1137737632 M * ebiederm Bertl: Only once so far. 1137737633 M * jpacheco using chroot 1137737656 M * ebiederm Bertl: Then you run a management process if you need it that talks to the outside world. 1137737659 M * Bertl jpacheco: that's a good start, what about the reverse entries in /var/run/vserver... 1137737681 M * ebiederm Bertl: Using unix domain socets, or pipes or something like that. 1137737705 M * Bertl ebiederm: well, that's nice, but not practicable ... especially if you have to maintain some of those namespaces from outside (in a secure manner) 1137737712 M * jpacheco Bertl: just found the problem, thanks though 1137737744 M * Bertl ebiederm: just take the filesystem namespaces as example, when you want to unmount something _for_ a guest, which has no permission to do that on it's own ... 1137737752 M * Bertl jpacheco: what was it? 
1137737778 M * jpacheco i just remembered that i haven't done a fresh install for quite some time 1137737793 M * jpacheco and i forgot that if i want the vservers all to start when i restart the service 1137737800 M * jpacheco i have to tell the config file that :\ 1137737813 M * jpacheco please, please 1137737815 M * jpacheco allow me 1137737819 M * jpacheco noooooooooooooooooooob 1137737887 M * Bertl :) 1137737920 M * ebiederm Bertl: I can see the problem. 1137737958 M * ebiederm Although mounting is different than unmounting. 1137737998 M * Bertl well, doesn't matter much if there is no permission :) 1137738044 M * ebiederm Actually in the general case I would like things to work well enough that I can give permission to my guest 1137738068 M * Bertl that's something the providers don't like very much :) 1137738071 M * ebiederm to unmount, bind, and at least on a subset of trusted filesystesm mount. 1137738099 J * id23 ~id@p54A03E8A.dip0.t-ipconnect.de 1137738110 M * id23 morning #vserver 1137738122 M * Bertl ebiederm: aside from that, what is the purpose of a 'security context' which can do everything you can do from outside, inside? 1137738133 M * Bertl morning id23! 1137738154 M * ebiederm Bertl: If that security context can't use the same hardware.... 1137738177 M * Bertl then it's nice, (very similar to xen) but not more secure ... 1137738209 M * ebiederm No just cheaper on the resources. 1137738233 M * Bertl linux-vserver (and some other projects in this area) are not just about the hardware abstraction but also about increased security 1137738250 M * ebiederm So this is a piece I need to understand. 1137738263 M * Bertl the granted capabilities ensure that you do not do certain things inside a guest 1137738285 M * Bertl but OTOH, you do not want to lose the ability to do it _for_ the guest 1137738325 M * Bertl for example, regarding the mount, we have a bunch of flags which control what can be done with mount/umount 1137738348 M * Bertl i.e. 
can the guest mount filesystems, and if, with what flags ... 1137738373 M * _Roey ooh, meiner erd epfel. 1137738381 M * Bertl ebiederm: can the guest mount kernel/network filesystems? 1137738391 M * _Roey gnight Bertl 1137738399 M * Bertl nigh Roey! 1137738410 M * ebiederm Bertl: I have yet to touch that aspect. 1137738423 M * _Roey Bertl: btw I reallly really don't like how teh inclusion of OpenVZ into the Linux mainstream kernel would make YOUR job much more difficult. 1137738437 M * _Roey b/c openvz and vserver are substitutes for one another. 1137738444 M * ebiederm So far all I have done is make /proc mount differently for the guest. 1137738445 M * _Roey unlike xen/openvz or xen/vserver. 1137738486 M * Bertl _Roey: I appreciate that, and I'm pretty sure that OpenVZ will not be integrated (and if, then it is time to start a new kernel tree :) 1137738502 M * _Roey yeah. 1137738516 M * _Roey that's a company pushing its stuff into the kernel. 1137738517 M * _Roey I don't like that. 1137738539 M * Bertl well, nothing against that if it is really useable ... 1137738591 M * Bertl Roey: but seriously, just check the patches they have, I cannot imagine Linus or Andrew merging that ... 1137738600 M * ebiederm _Roey: so how is openvz pushing it's stuff into the kernel? I haven't seen that kind of activity. 1137738638 M * Bertl http://linux.slashdot.org/article.pl?sid=06/01/17/2251233 1137738986 M * ebiederm Posturing. Unless someone can find confirmation from redhat. 1137739014 M * Bertl well, not just that, jumping to conclusions too ... 1137739098 M * Bertl but hey, it's a company, and recently everything in marketing is evaluated regarding hearsay ... 1137739115 M * Bertl s/regarding/according/ 1137739192 M * ebiederm Ok. Then looking at things. 1137739234 M * Bertl but, considered that they were silently violating the GPL and are now using the 'Open Source' train to get cheap betatesting and workers ... 
they are doing a great job regarding PR :) 1137739282 M * ebiederm So my vision extended to essentially a light-weight version of xen. It looks like you have the whole kernel to yourself. 1137739321 M * Bertl that's nice, but as I already pointed out, is just a small part of what is required to make it _useful_ 1137739323 M * ebiederm vserver goes a littler farther to extra light weight guests. And extra secure guests. 1137739351 M * ebiederm Bert: (Just recap for the moment) 1137739356 M * Bertl k 1137739394 M * ebiederm So I think for extra light weight I have a handle on the problem. 1137739423 M * ebiederm For extra security that is something I need to digest more. 1137739435 M * Bertl it's late for me (almost 8am) so I'm not that snappy ... 1137739448 M * ebiederm Ok. 1137739460 M * Bertl for me the important points for you to think about are: 1137739488 M * ebiederm On the network side I have firewall rules in the ethernet bridging code which can limit all kinds of things. 1137739509 M * ebiederm So at least I have a familiar place to start when thinking about filtering activities. 1137739510 M * Bertl - how do I avoid additional overhead for 'shortcuts' we take/know now (and if you do not plan to do that, which is perfectly fine for me, how could I make it simple to do that for others, like me) 1137739557 M * Bertl - how do I allow to enter/leave a namespace (or several namespaces), without jumping through hoops ... 1137739594 M * Bertl - how do I handle capabilities (old and especially many new ones) 1137739609 M * Bertl - how do I manage resources (both for accounting and for limiting) 1137739661 M * Bertl for linux-vserver all of this is basically described in my paper, so you might have a look there to get a few ideas/concepts we use now 1137739694 M * Bertl of course, it's not really up-to-date (atm) so there is probably more to consider ... 1137739709 J * menomc ~amery@200.75.27.5 1137739718 M * Bertl welcome mnemoc! 1137739757 M * ebiederm Bertl: Ok. 
1137739798 M * ebiederm I am going to suggest we may have cases where we use a selection of two different mechanisms. 1137739815 Q * mnemoc Ping timeout: 480 seconds 1137739815 N * menomc mnemoc 1137739819 M * ebiederm One when we want things to be extremely light-weight/filtered. 1137739846 M * ebiederm One when we want to support more of the general case. 1137739865 M * Bertl what is the general case? 1137739871 M * ebiederm Basically like we get with my network code versus just limiting a vserver to a single ip. 1137739981 M * Bertl for example, regarding your network code, I'm not sure which case I would call the general one ... 1137740022 M * ebiederm General case may be the wrong word. A better fill in would be to say it looks like you have your own machine to do with as you will. 1137740066 M * Bertl but without limits it is only of very limited interest to anybody, no? 1137740103 M * Bertl I don't see a real application where you would want complete access ... 1137740129 M * Bertl for me, it's similar to creating a machine with 'just' root logons ... 1137740163 M * Bertl several of them, but nevertheless, and then you go and tell the users, please do not destroy anything ... 1137740199 M * ebiederm I said looks. 1137740245 M * Bertl okay, final comment here: 1137740250 M * Bertl (for today :) 1137740260 M * ebiederm If on your machine where everyone has a root login, none of the roots has any capabilities and every root is put in their own chroot they would have a very hard time hurting anything. 1137740272 M * ebiederm I'm not at all agaist super ulimits. 1137740282 M * ebiederm Ok good night. 1137740290 M * ebiederm I need to head to bed as well. 1137740293 M * Bertl sometimes trying to 'virtualize' everything is not what you want ... 1137740306 M * Bertl at least not what the 'customer' wants 1137740345 M * Bertl ebiederm: good night, and sweet dreams :) 1137740363 M * Bertl good night folks, cya all tomorrow! 
1137740372 N * Bertl Bertl_zZ 1137740515 Q * ebiederm Quit: Leaving 1137741499 J * Aiken__ ~james@tooax6-177.dialup.optusnet.com.au 1137741844 Q * Aiken_ Ping timeout: 480 seconds 1137743420 M * jpacheco what's up boyoz 1137743618 Q * id23 Quit: Leaving 1137745629 Q * shedi Quit: Leaving 1137747798 J * undefined ~undefined@adsl-68-93-109-94.dsl.rcsntx.swbell.net 1137747832 P * undefined 1137748872 J * shedi ~siggi@tolvudeild-205.lhi.is 1137750130 N * nokoya nokoyaz 1137750219 Q * FireEgl Ping timeout: 480 seconds 1137750383 J * Milf ~Miranda@ipsio469.ipsi.fraunhofer.de 1137750493 J * Smutje ~Smutje@xdsl-84-44-247-116.netcologne.de 1137750599 Q * Smutje_ Ping timeout: 480 seconds 1137751025 J * prae ~prae@ezoffice.mandriva.com 1137752454 J * id23 ~id@p54A01C87.dip0.t-ipconnect.de 1137752491 J * FireEgl Atlantica@2001:5c0:84dc:: 1137752686 J * Psy0rz_ ~psy0rz@lounge.datux.nl 1137752776 Q * Psy0rz Ping timeout: 480 seconds 1137753977 J * ntrs_ ~ntrs@68-188-50-87.dhcp.stls.mo.charter.com 1137754049 Q * ntrs__ Read error: Connection reset by peer 1137754395 J * lilalinux ~plasma@80.69.35.186 1137755957 Q * marl__ Quit: Leaving 1137756571 J * tudenbart ~willi@xdsl-213-196-249-254.netcologne.de 1137756934 Q * BWare Ping timeout: 480 seconds 1137756996 Q * dothebart Ping timeout: 480 seconds 1137758473 J * Viper0482 ~Viper0482@p5497744F.dip.t-dialin.net 1137758669 Q * Aiken__ Ping timeout: 480 seconds 1137759119 Q * Viper0482 Remote host closed the connection 1137764093 J * mountie ~mountie@CPEdeaddeaddead-CM000a739acaa4.cpe.net.cable.rogers.com 1137764554 Q * _mountie Ping timeout: 480 seconds 1137765458 M * lonewolff hey all 1137765625 Q * mkhl Quit: 1137766651 J * meandtheshell ~markus@85-125-230-137.dynamic.xdsl-line.inode.at 1137766701 J * mkhl ~mkhl@200-148-40-46.dsl.telesp.net.br 1137767368 Q * gerrit Ping timeout: 480 seconds 1137770213 J * gerrit ~gerrit@pixpat.austin.ibm.com 1137770963 J * complexmind ~Frank@cpc1-brig3-6-0-cust194.brig.cable.ntl.com 1137771369 J 
* vrwttnmtu ~eryktyktu@82.69.161.137 1137771386 M * vrwttnmtu Happy 2006 to you all 1137771410 M * vrwttnmtu Hello Hollow , Bertl_zZ 1137771745 M * Roey hey have any of you by any chance used Nessus 3.0 on Debian/SID? 1137771892 Q * tudenbart Quit: be root reboot 1137772044 Q * balbir Ping timeout: 480 seconds 1137772465 J * ebiederm ~eric@ebiederm.dsl.xmission.com 1137772620 M * Roey http://www.debian-administration.org/articles/328#comment_20 1137772622 M * Roey ebiederm: hi 1137772625 M * Roey I don't get it. 1137772639 M * Roey People don't seem to realize that Xen and VServer are complementary 1137772643 M * ebiederm good morning 1137772645 M * Roey not substitutes 1137772649 M * Roey ebiederm: good morning to you!!!!! 1137772656 M * Roey ebiederm: your nick sounds medical 1137772716 M * ebiederm I don't know it is what you get when you take Eric Biederman do the standard first letter of first name + last name then truncate it to 8 characters. 1137772731 M * ebiederm I stick with it because it is fairly unique. 1137772734 M * Roey =) 1137772756 M * Roey sounds like it would be part of a broader medical term 1137772757 M * Roey like 1137772772 M * Roey lateral ebiedermic cells 1137772774 M * Roey or something 1137772782 M * Roey ebiedermic abrasion therapyu 1137772783 M * Roey *therapy 1137772810 M * Roey the amoeboa viservosa ebiedermoa 1137772836 A * ebiederm laughs 1137772861 M * Roey what, what : 1137772861 M * Roey :) 1137772872 M * Roey Transverse Bertloscsopy 1137772879 M * Roey *Bertloscopy 1137772895 M * Roey vrwttnmtuic acid 1137772915 M * Roey the Hozac Flu of 1903 1137772917 M * Roey etc. 1137772929 M * ebiederm I roey my boat down the river... 1137772933 M * Roey hahaha 1137772970 M * Roey where are you from, with a name like Eric Biederman? 1137772980 M * ebiederm :) 1137772993 M * ebiederm The united states. 1137772997 M * Roey were your grandparents Ashkenazim? 
1137773031 J * jeeves ~Bob@c-24-11-171-10.hsd1.mi.comcast.net 1137773036 M * ebiederm No. The Biederman come from ancestors from switzerland. 1137773066 M * ebiederm Where it was spelled Biedermann 1137773083 M * Roey ahhhhhh 1137773095 M * Roey oh, ok; you're not Jewish, I see. 1137773152 M * ebiederm No. not Jewish. 1137773160 M * Roey aye 1137773171 M * Roey ebiederm: do you develop vserver 1137773171 M * Roey ? 1137773182 M * ebiederm I hadn't even made the connection with the Jewish dialects of english. 1137773186 M * Roey I'm just an admin who's using it in a production environment. 1137773186 M * ebiederm Not exactly. 1137773213 M * jeeves I am good old fashioned, red, Heintz 57, and lov'in it. 1137773223 M * jeeves ok, vserver question 1137773247 M * ebiederm I'm a developer who is developing code which should be useful for vserver. 1137773255 M * ebiederm That I intend to merge into the kernel. 1137773282 M * ebiederm So I am looking at vserver for prioir art and testing. 1137773286 M * jeeves What are the lessons learned for copying a vserver? I have read about the vserver-copy script, and some other things. What is the BEST (easiest) way to copy. 1137773305 Q * shedi Quit: Leaving 1137773341 M * Milf Esiest way? Stop the vserver, tar the whole tree? 1137773342 M * Roey ebiederm: do you work on OpenVZ? 1137773353 M * ebiederm Nope. 1137773359 M * Roey jeeves: you have to change a bunch of things. 1137773365 M * jeeves what about the conf's? 1137773374 M * jeeves Thats what I was afraid of. 1137773379 M * Roey jeeves: I would copy just /usr/lib and select files from /etc 1137773384 M * Hollow ebiederm: you're working on the pid virtualization right? (at least i concluded it from the backlog with bertl ;) 1137773389 M * jeeves should I just make new vservsers? 1137773389 M * Roey jeeves: I mean it's not so overly complicated that you can't do it 1137773398 M * Roey jeeves: do that. 1137773398 M * Roey yes 1137773403 M * Milf Waht do you want to do? 
Setup a copy of a vserver on another machine as standby or copy a vserver to make a different one of it? 1137773406 M * ebiederm Hollow: Among others. 1137773412 M * jeeves ok. Thats what I wanted to hear. 1137773422 M * Hollow ebiederm: others would be..? 1137773426 M * jeeves here is the situation:..... 1137773438 M * ebiederm So far I have just done a proof of concept implementation to see where uglies are. 1137773455 M * ebiederm sysvipc and networking so far. 1137773465 A * Roey imagines a Dr. Mario for OSS programmers 1137773475 M * Hollow ic, i.e. you're playing around with ngnet & co? 1137773491 M * jeeves I am writing a distributed application that can run on multiple machines. I currently have 1 vserver and the parent that are distributing jobs fine. Now I want to copy the vserver to test more distribution. 1137773525 M * ebiederm Hollow: Well I showed up and my network implementation was voted NGNET.... 1137773528 M * Hollow jeeves: cd /vserver && cp -ra vs1 vs2 1137773564 M * Hollow ebiederm: do you have a link to the implementation? 1137773586 M * ebiederm It's up on kernel.org 1137773612 M * lonewolff evening all 1137773616 M * jeeves Hollow: what about the conf files in /etc 1137773632 M * ebiederm And I think Bertl has mirroed some of it, as well. 1137773636 M * jeeves name, ip, interface-name that kind of stuff 1137773651 M * Hollow jeeves: cp the and adapt them 1137773700 M * jeeves thanx 1137773906 J * Viper0482 ~Viper0482@p5497744F.dip.t-dialin.net 1137774018 M * Milf Jeeves: Better: create a template to copy around, that has most adaptations already done. 1137774190 Q * id23 Quit: Leaving 1137774320 M * jeeves Milf: That is what I am going to do. I would like to run probably 6 or so. 1137774351 J * Doener doener@i5387EC59.versanet.de 1137774435 M * Milf jeeves: Yeah gogogo. 1137774534 N * Bertl_zZ Bertl 1137774541 M * Doener hi Bertl 1137774549 M * Bertl morning folks! 1137774560 M * complexmind hi Bertl ! 
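[annotation] Hollow's advice to jeeves above (cp -ra the guest tree, then copy and adapt the conf files) can be sketched roughly as follows. This is only an illustration, not how util-vserver itself does it: the helper names `copy_guest` and `stale_links` are made up, and the assumption that leftover symlinks naming the old guest are what needs fixing is a guess at a common pitfall rather than something the tools document.

```python
import os
import shutil


def copy_guest(src, dst):
    """Copy a directory tree, keeping symlinks as symlinks (similar to cp -a)."""
    shutil.copytree(src, dst, symlinks=True)


def stale_links(root, old_name):
    """Return sorted (path, target) pairs for symlinks under root whose
    target still has old_name as one of its path components."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk lists symlinks to directories in dirnames and all other
        # symlinks (including dangling ones) in filenames.
        for entry in dirnames + filenames:
            path = os.path.join(dirpath, entry)
            if os.path.islink(path):
                target = os.readlink(path)
                if old_name in target.split(os.sep):
                    found.append((path, target))
    return sorted(found)
```

Running `stale_links` on the copied config directory with the old guest name gives a list of links to re-point by hand; name, ip and interface files still have to be adapted separately, as discussed above.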
1137774569 M * ebiederm Good morning Bertl. 1137774609 M * Bertl hey Doener! Hollow! complexmind! ebiederm! :) 1137774632 M * Hollow hey Bertl! 1137774633 M * jeeves Hey B. 1137774702 M * Bertl Hollow: http://www.13thfloor.at/NGNET/ebiederm/2.6.14-rc2/broken-out/ 1137774728 M * Bertl (but it's a mess right now, nevertheless ebiederm is working on it :) 1137774743 M * Hollow yeah.. already found it on kernel.org 1137774916 M * ebiederm Bertl: With regards to resources (accounting and limiting) vservers except for their context id's aren't particullary special. 1137774927 Q * Milf Quit: Miranda IM! Smaller, Faster, Easier. http://miranda-im.org 1137774988 M * ebiederm So I think I am going to leave that up to generic kernel accounting... 1137775037 M * ebiederm Getting a basic infrastructure in should set the stage for modifications to account by context. 1137775063 M * ebiederm I keep hearing rumors of CKRM but I haven't had a chance to check it out yet. 1137775144 M * jeeves where is the pid file. "vserver 'test2' is already running" 1137775271 M * Bertl ebiederm: well, we already thought of using ckrm many times, PlanetLab even used it (for a short period of time) it's basically unusable ... 1137775308 M * Bertl ebiederm: it is very complicated, has significant overhead and does not even remotely do what is required ... 1137775328 M * Bertl ebiederm: aside from that, it break in strange ways (stability issues) 1137775349 M * ebiederm Bertl: I had a hunch. 1137775374 M * Bertl ebiederm: but I do not consider that an issue for the virtualization, as we already do that, so no need to re-invent it atm :) 1137775382 M * ebiederm Bertl: There had to be a reason CKRM has been around for so long and has not yet been merged. 1137775433 M * Bertl the important part for 'our' work (i.e. mainline virtualization) is that we do not forget about such things ... 1137775446 M * ebiederm Bertl: Agreed. 
1137775462 M * ebiederm Picky a solution that would not allow for it could be problematic. 1137775464 M * Bertl another issue I'd like to address is (once again) the API to the virtualizations 1137775510 M * Bertl AFAICT the CLONE_* flags should not pose any issues (even if used inside a guest) 1137775527 M * Bertl would just allow guests to use that functionality too 1137775571 M * Bertl nevertheless, we might consider some kind of clone_mask to limit it? 1137775590 M * Bertl (similar to the capability masks) 1137775607 M * ebiederm Possibly. 1137775629 M * ebiederm I was thinking the simple case of you can't create a new namespace unless you are allowed to modify the old one. 1137775647 M * ebiederm Which we largely already have capability bits for. 1137775737 M * Bertl please elaborate! 1137775784 M * Bertl sidenote: would this be a good chance to extend the capability system to at least 64bit (preferable unlimited bits)? 1137775816 M * ebiederm Bertl: I think that needs to be part of the conversation. 1137775830 M * ebiederm Basically for networking we have CAP_NET_ADMIN. 1137775847 M * ebiederm For ipc we have CAP_IPC_OWNER 1137775883 M * ebiederm For processes we have CAP_KILL. 1137775899 M * ebiederm I'm not certain they all map that well. But that was my first reaction. 1137775926 M * Bertl hmm .. I'd definitely prefer separate flags 1137775931 M * ebiederm Either that or we setup appropriate rlimits and deny it with a 0 rlimit. 1137775953 M * Bertl history has shown that the multiple use for single flags was a bad decision 1137775979 M * ebiederm Bertl: Agreed. Clean sematics are a requirement. 1137775989 M * Bertl also rlimits do not fit this purpose very well at least IMHO 1137776027 M * ebiederm Bertl: At least rlimits are not arbitrarily limited. 
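[annotation] The two gating ideas floated above, Bertl's clone_mask and ebiederm's "you may only create a new namespace if you could modify the old one", can be compared in a small model. This is a sketch under stated assumptions, not kernel code: the NS_* bits are hypothetical placeholders (not real CLONE_* values), and the capability mapping is only ebiederm's tentative first reaction from the log, which he himself says may not map well.

```python
# Hypothetical namespace-creation bits, for illustration only.
NS_IPC = 0x1
NS_NET = 0x2
NS_PID = 0x4

# Tentative mapping taken from the discussion above.
REQUIRED_CAP = {
    NS_NET: "CAP_NET_ADMIN",
    NS_IPC: "CAP_IPC_OWNER",
    NS_PID: "CAP_KILL",
}


def allowed_by_mask(requested, clone_mask):
    """clone_mask style: every requested bit must be present in the mask."""
    return (requested & ~clone_mask) == 0


def allowed_by_caps(requested, caps):
    """Capability style: each requested namespace needs its mapped capability."""
    for bit, cap in REQUIRED_CAP.items():
        if requested & bit and cap not in caps:
            return False
    return True
```

The mask variant keeps the decision separate from the existing capability bits, which is what Bertl argues for; the capability variant reuses bits that already carry other meanings.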
1137776038 M * Bertl http://linux-vserver.org/Resource+Limits 1137776058 M * Bertl here you can see what we use for rlimits (ulimit and per context) 1137776088 M * Bertl 'reusing' rlimits for boolean decisions seems somewhat hackish to me 1137776108 M * Bertl in this case, I'd prefer to add a completely new interface 1137776212 M * Bertl ebiederm: btw, what I might not have mentioned yet, linux-vserver has a syscall with a perfectly? arch independant multiplexor (the syscall is also registered on all archs). if it makes sens, we could easily use some of those commands for setting or querying additional flagwords, etc ... 1137776213 M * ebiederm Actually I was thinking that for some of these limits are actually sane. 1137776286 M * Bertl of course, once the stuff gets in, a new syscall would not be a problem 1137776387 M * Bertl ebiederm: okay, one important thing which I can't mentione often enough is: we need some way to move (enter and leave) between arbitrary 'namespaces' 1137776395 M * Bertl *mention 1137776448 M * ebiederm Bertl: I understand the problem you are solving. 1137776482 M * ebiederm And moving between arbitrary namespaces does appear sane. (So long as you have the appropriate permissions). 1137776515 M * Bertl yes, of course ... 1137776539 M * ebiederm The case that disturbs me about that kind of GOD MODE usage, is that can potentially be a security problem. 1137776544 M * Bertl which again points to having separated capabilities 1137776559 M * ebiederm Bertl: Sort of. 1137776596 M * ebiederm In an ideal scenario privlege separation is done by contexts. 1137776652 M * ebiederm Part of the problem with capabilities is the don't map well to most real world problem domains. 1137776701 M * ebiederm If you have lots and lots of users having a single all powerful users can be a problem. 
1137776716 M * Bertl a 'secure' way to move between contexts (or namespaces) is an important feature of linux-vserver compared to xen 1137776772 M * Bertl we could live with a single almighty user ... but it might make sense to have this adjusted for hierarchy 1137776807 M * Bertl i.e. have one flag 'permitting' the change per se, and another one which allows you to move upwards too (not just down and back) 1137776817 M * ebiederm When I was reading up on BSD jails the thing that most impressed me was the idea of using the jails instead making a complicated capability hierarchy. 1137776859 M * Bertl please elaborate a little on that ... 1137776906 M * ebiederm Trying. The BSD guys said it better. 1137777022 M * ebiederm I think this is the paper: http://phk.freebsd.dk/pubs/sane2000-jail.pdf 1137777036 M * derjohn Bertl, I did not get answer form Nils about Linuxtag, I assume you didnt get one yet, too? 1137777122 M * ebiederm Bertl: The way I look at is which is trumps which. Capabilites trump the namespace, or namespaces trump capabilities. 1137777141 M * Bertl derjohn: no, nothing yet ... 1137777187 A * Bertl is reading the paper now ... 1137777210 J * undefined ~undefined@adsl-68-93-109-94.dsl.rcsntx.swbell.net 1137777221 M * Bertl welcome undefined! 1137777273 M * jeeves ok, I copied a vserver and modified some conf files. I must be missing something like a pid file. I can start the new or the old but not both. "Server 'test' already running." thoughts? 1137777274 M * derjohn Hello All! My daily question: Who wants to join me on LinuxTag 2006 if I reserve am area for Linux-Vserver? I wont do it, if I am alone there (looks bad), but ..... 1137777302 M * derjohn jeeves, have a look at /etc/vserver/*/conext 1137777313 M * Bertl jeeves: check in /var/run/vservers* (and also with vserver-stat) 1137777406 M * derjohn Bertl, ... should be created when starting? So how can one make a mistake when copying a vserver? 
*shrug* 1137777521 M * Bertl derjohn: existing data, left over running processes, wrongly copied symlinks ... 1137777535 M * Bertl ebiederm: did you read the 'Jail Implementation' part? 1137777537 M * derjohn Bertl, symlinks was the hint. thx 1137777544 M * jeeves <-makes mistakes all the time 1137777570 M * Bertl ebiederm: the long list of '... is prohibited' :) 1137777600 M * Bertl ebiederm: once you permit one of those, trouble comes in big leaps ... 1137777642 M * ebiederm Bertl: You may be right. 1137777677 M * ebiederm This paper dates back to when the work was first being done on BSD. 1137777741 M * Bertl but basically we use a similar scheme for ensuring that guests are safe ... we limit the capabilities to some mask and take away the administrative power (for the context mechanisms) ... of course, this only works once 1137777742 M * ebiederm I'm not certain the BSD got it right. But the idea of using jail like entities for privelege separation is what intrigues me. 1137777794 M * Bertl but, I would not make a single flag (and nothing more is the 'condition' jail) to restrain _all_ functionality inside a guest 1137777854 M * Bertl there are many cases where folks intentionally give some 'insecure' capabilities because they manage the jail too 1137777915 M * ebiederm Bertl: It is clear we aren't where the BSD guys were. 1137777930 M * ebiederm However 1137777950 M * Bertl ebiederm: but let's start with the current state, let's assume we have one deity here, the super-root user or as we call it, the host administrator 1137777998 M * Bertl let's look at the various namespaces (would be nice to identify them once again) and what 'powers' would be required to manage a bunch of them 1137778007 Q * complexmind Quit: Leaving 1137778015 M * Bertl ebiederm: does that sound sane? 1137778053 M * ebiederm I think a question to ask is how can we head in a direction where privelege separation falls out into the multi unix model. 
1137778079 M * ebiederm Bertl: Yes I think it makes sense to look at the namespaces and see what makes sense to administer them. 1137778128 M * Bertl okay, the first and most natural one is the filesystem namespace, no? 1137778134 M * ebiederm Bertl: And equally let's look at what administration we can't allow the user of them to administer. 1137778171 M * Bertl the 'user' in this case is the 'guest' admin, right? 1137778179 M * ebiederm Bertl: Yes. 1137778216 M * Bertl okay, so filesystem namespaces allow guest root to access everything they see. period. 1137778218 M * ebiederm I think of it at least sometimes as the mount namespace. 1137778236 M * ebiederm The primary limiting factor are suid executables. 1137778256 M * ebiederm At least that is the limit to too much change. 1137778291 M * Bertl doesn't concern guest root, as 'he' must get the maximum powers granted to the 'guest' (which is not aprt of the namespace though) 1137778291 M * ebiederm So we have a couple of operations. 1137778338 M * ebiederm mount, unmont, mount --bind, use. 1137778341 M * Bertl imho we have to separate namespaces and what you call 'jail' and I call 'guest' or context ... 1137778364 P * undefined 1137778374 M * ebiederm Bertl: Yes. 1137778376 M * Bertl yes, and along the lines of the jail-paper, mount/unmount has to be forbidden 1137778387 M * ebiederm Namespaces are just one of the mechanism used to build up the whole. 1137778402 M * Bertl you can then make same exceptions to the rule, under certain conditions ... 1137778424 M * ebiederm Let's go through the reasons for limit mount and unmount. 1137778442 M * ebiederm Reason 1 suid executables can get confused. 1137778447 J * undefined ~undefined@adsl-68-93-109-94.dsl.rcsntx.swbell.net 1137778451 M * ebiederm Not a problem for the guest admin. 1137778454 P * undefined 1137778458 M * Bertl ah, short side-question here, just because you mentioned it ... 
do you want to provide the context structure or just the building blocks in mainline? 1137778534 M * ebiederm Bertl: I think just the building blocks, but this assumes we can build the equivalent of a context structure out of them. 1137778550 J * undefined ~undefined@adsl-68-93-109-94.dsl.rcsntx.swbell.net 1137778572 M * jeeves ok, got them copied. Lesson learned. There are to symbolic links to fix. 1137778587 M * jeeves two* 1137778589 M * ebiederm We can't allow mount because that is a back door to allowing mknod. 1137778624 M * Bertl precisely or some other evilö things 1137778625 M * ebiederm We can't allow mount because the kernel filesystem implementation might have bugs. 1137778659 M * Bertl there is also DoS via network filesystems 1137778662 Q * pusling Remote host closed the connection 1137778672 J * pusling pusling@195.215.29.124 1137778701 M * Bertl and a minor resource issue with --bind and --rbind 1137778750 M * ebiederm --bind and --rbind probably make sense to resoruce limit. They are about the same cost as a file descriptor. 1137778790 M * ebiederm What are the problems with unmount? 1137778821 M * Bertl only one, the guest admin might screw up and unmount something he cannot get back easily 1137778867 M * Bertl (assumed that you cannot unmount stuff outside your namespace which is currently guaranteed) 1137778976 M * ebiederm Ok. We are half way there. 1137779156 M * ebiederm So if we have an uncrashable fs we are much closer to allow the user to mount things. 1137779164 M * Bertl hmm, IMHO we haven't even touched the question why and how does the host admin access the namespace 1137779190 M * Bertl but let's continue for now ... 
1137779247 M * Bertl yes, assumed we have a 'secure' filesystem which does not expose device nodes, we can allow mounts inside a guest (within some limits again) 1137779267 M * dhansen Bertl: there was a talk about what we need to allow user mounts at OLS last year 1137779275 M * Bertl this seems immediately given for --move, --bind and --rbind 1137779277 M * dhansen I'm trying to think who gave it. They talked about plan9 a lot 1137779288 M * Bertl dhansen: ah, some url would be great! 1137779333 M * ebiederm brb 1137779609 M * ebiederm Before we get too far into user mounts I'm going to need to read through the latest copy of namespace.c 1137779625 M * ebiederm But mount --bind and --rbind is generally safe. 1137779661 M * ebiederm mounting of a device node requires permissions to that device node. 1137779696 M * ebiederm likewise with a network filesystem. 1137779711 M * ebiederm And of course you must be mounting bug free filesystems. 1137779741 M * ebiederm Now to see if I can get a handle on shared mount subtrees. 1137779814 M * Bertl okay, I agree, but that leaves us with two obvious, and one not so obvious questions 1137779832 M * Bertl - what decides about 'permissions' to a device 1137779850 M * Bertl - how to handle permissions on networking (filesystems) 1137779882 M * Bertl - how to allow for guest creation (involving mounts inside the namespace which would be otherwise forbidden?) 1137779963 M * Bertl I'm intentionally leaving out the question "how to enter the namespace to fix something" as IMHO this could be similar to question 3 1137780002 Q * michal` Ping timeout: 480 seconds 1137780005 J * michal` ~michal@www.rsbac.org 1137780012 M * Bertl wb michal`! 1137780049 M * dhansen Bertl: http://www.linuxsymposium.org/2005/linuxsymposium_procv2.pdf 1137780056 M * dhansen "Glen or Glenda" 1137780077 M * ebiederm So for question 3. We can set a lot of things up in the parent before going much farther. 
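[annotation] The mount rules the two have converged on so far (move is harmless, bind and rbind are fine but worth a resource limit, and a real mount needs both a trusted filesystem type and permission on the device) can be written down as a toy policy check. Everything here is hypothetical naming for illustration; `GuestPolicy` and `may_mount` are not part of linux-vserver or the kernel, and the open questions Bertl raises next (who decides device permissions, network filesystems, guest creation) are not answered by it.

```python
from dataclasses import dataclass, field


@dataclass
class GuestPolicy:
    trusted_fs: set = field(default_factory=set)       # fs types assumed bug-free
    allowed_devices: set = field(default_factory=set)  # device nodes the guest may use
    max_binds: int = 0                                 # resource limit on bind mounts


def may_mount(policy, kind, fstype=None, device=None, current_binds=0):
    """Return True if a guest operation of the given kind would be permitted."""
    if kind == "move":
        # Moving an existing mount adds nothing new.
        return True
    if kind in ("bind", "rbind"):
        # Cheap, roughly like a file descriptor, so only count them.
        return current_binds < policy.max_binds
    if kind == "mount":
        # A real mount needs a trusted filesystem and access to the device.
        return fstype in policy.trusted_fs and device in policy.allowed_devices
    return False
```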
1137780117 M * ebiederm To the second part we may be able to setup the childs namespace as a shared namespace so that we can touch it from the outside of the guest as well as from inside. 1137780121 M * Bertl ebiederm: a lot of things, but not everything, and honestly, it complicates setup a lot (including all kind of races) 1137780156 Q * prae Quit: Execute Order 69 ! 1137780174 M * Bertl ebiederm: the shared (inherited) namespace is an interesting approach, but not always desired 1137780201 M * ebiederm Bertl: Do you use namespaces now or are you using chroots? 1137780208 M * Bertl namespaces and chroots 1137780223 M * ebiederm Ok. 1137780231 J * shedi ~siggi@inferno.lhi.is 1137780251 M * Bertl legacy uses only chroots, new style a namespace with --rbind and chroot 1137780365 M * ebiederm Ok. 1137780404 T * Bertl http://linux-vserver.org/ | latest stable 2.01, 1.2.10, 1.2.11-rc1, devel 2.1.0, exp 2.1.0.5, 2.0.1.2 | util-vserver-0.30.209 | vserver-utils-1.0.2 | He who asks a question is a fool for a minute; he who doesn't ask is a fool for a lifetime -- share the gained knowledge on the wiki, and we'll forget about the minute ;) 1137780408 M * ebiederm So until I catch up on all of this shared/private/slave stuff that has been recently been added to mount I can't comment much more. 1137780435 M * Bertl @all 2.6.15-vs2.1.0.5 (be careful, new stuff for limits) 1137780458 M * ebiederm For devices. We need some kind of filtering. 1137780477 M * ebiederm With sysfs mknod has been broken. 1137780489 M * Bertl we currently use the simple 'do not provide a node' filtering technique 1137780495 M * ebiederm As far as a limiting mechanism to devices anyway. 1137780540 M * ebiederm Bertl: And it make be possible to not provide nodes, and to simply mount --bind the pieces of sysfs you want. 
1137780572 M * ebiederm s/make/may/ 1137780574 M * Bertl but, I thought about having some kind of userspace policy/helper to tag the device nodes either for all guests or per guest 1137780584 M * Bertl ebiederm: sysfs is not allowed inside a guest (for now) 1137780641 M * ebiederm Bertl: Agreed. 1137780647 M * ebiederm But it can't stay that way. 1137780664 M * Hollow Bertl: http://phpfi.com/97335 :) 1137780705 M * ebiederm mount --bind on a subset of sysfs may be as good as limiting mknod but I haven't had a chance to look at that in detail yet. 1137780717 M * Bertl ad sysfs: well ... depends ... for linux-vserver I guess it could stay this way ... 1137780729 M * Bertl Hollow: great! 1137780750 M * Hollow will be in the next release of vserver-utils 1137780752 M * Bertl Hollow: that's something ebiederm wants to destroy immediately :) 1137780760 M * Hollow hehe 1137780778 M * Bertl ebiederm: okay, back on topic ... 1137780782 M * Hollow guess it could need some auditing at least 1137780788 M * ebiederm What is that I want to destroy immediately? 1137780809 M * Bertl transition from host context to a guest 1137780818 M * ebiederm Ah. 1137780856 M * ebiederm It is more something I would rather not do if it isn't necessary. 1137780856 M * Bertl let's assume we have some (probably complicated) mechanism to tell good device nodes from bad ones 1137780868 M * ebiederm I am very much a minimalist in my way of thinking. 1137780875 M * Hollow ebiederm: yeah, i wouldn't do it either, but folks are requesting it 1137780887 M * Bertl and let's further assume we can differenciate between mounting and raw access on them 1137780923 M * ebiederm Bertl: I'm not certain the distinction is necessary. 1137780941 M * ebiederm Well for the shared case I guess the distinction is neeeded but not for unshared. 1137780950 M * Bertl well, let's say it this way, you probably want to avoid raw access to a device which can be moutned 1137780982 M * ebiederm Bertl: Yes. 
But if it is your data it is your foot. 1137780986 M * Bertl otherwise chances for a kernel panic are very high (think evil guest root creating a broken filesystem) 1137781001 M * ebiederm Bertl: Yes. 1137781025 M * Bertl usually the provider does not like those :) 1137781070 M * ebiederm I think what I want is a pallete of filesystems I can mount, probably already mounted to some directory in /dev and I can then just mount --bind them where I want 1137781107 M * Bertl what about loop mounts then? 1137781116 Q * pusling Remote host closed the connection 1137781125 J * pusling pusling@195.215.29.124 1137781146 M * Bertl (pushing the limits, but this is an actual demand) 1137781167 M * ebiederm Bertl: There are two distinct classes. Mounts I am willing to share. Mounts of trusted filesystems on raw devices. 1137781260 M * ebiederm In my mythical pallete area. Where the host admin controls it and not the guest admin, but both can see it. 1137781286 M * ebiederm You might be able to set up an automounter if premounting everything is a problem. 1137781330 M * Bertl okay, let's consider tempfs (we mount that to /tmp) and the procfs which is mounted too 1137781336 M * ebiederm There are also significant issues about what uid/gid values mean accross guets as well. 1137781350 M * ebiederm True kernel virtual filesystems. 1137781355 M * Bertl let's further assume we want the guest to be completely secure, and do not allow mount/umount at all 1137781379 M * Bertl how do we setup or destroy the namespace? 1137781416 M * ebiederm The easy way is for init to set it up with init and then drop the capability. 1137781435 Q * pusling Remote host closed the connection 1137781437 M * Bertl hmm, so you would have to trust guest init, no? 1137781444 J * pusling pusling@195.215.29.124 1137781451 M * Bertl pusling: ping! :) 1137781481 M * ebiederm Bertl: Not exactly. You trust the program that later execs the guest init. 
1137781514 M * ebiederm It is the same one that created all of the namespaces etc so it must have priveledges. 1137781518 M * Bertl okay, so once the guest is set up, you're dead (i.e. you cannot change anything there) 1137781561 M * ebiederm So far that is where my thinking is. Unless you leave a host admin process with privelges hanging around. 1137781575 M * ebiederm Although I have thought of some nasty hacks with ptrace as well. 1137781576 M * Bertl inside the guest ... 1137781592 M * ebiederm Bertl: yes. 1137781609 M * Bertl that would, compared to the current init-less light weight guests add two processes, right? 1137781633 M * ebiederm Well if you were in init-less emulation it should only be one process. 1137781649 M * Bertl almighty init inside the guest? 1137781676 M * Bertl with scoket interfaces to change stuff? 1137781689 M * Bertl doesn't sound very secure to me ... 1137781699 M * ebiederm Bertl: That was my thought but I agree. 1137781729 M * ebiederm Basically security is easy if all you do is drop capabilities. 1137781739 M * ebiederm Securely adding capaibilities is hard. 1137781753 M * Bertl okay, let me rephrase the entire issue, because I think that you are intuitively trying to avoid something which should not be an issue at all, except for the pid (but we will come to that later) 1137781771 M * ebiederm I don't think I have figured it out yet. I simply have a few ideas on how to go up the creek and still have something that resembles a paddle. 1137781784 M * Bertl the namespace is an (often shared) property of a task 1137781806 M * Bertl the task struct has a pointer to the namespace it uses 1137781848 M * Bertl moving between namespaces should be as easy as safely changing this pointer, no? 1137781863 M * ebiederm Bertl: Yes. 1137781869 M * Bertl now the important part: who is allowed to move and where? 
1137781899 M * Bertl in the simple (current) case, we just allow somebody holding the CONTEXT capability to arbitrarily move 1137781912 M * ebiederm I think I have an idea. 1137781929 M * Bertl in a more elaborated case, we could allow to move 'down' but not 'up' 1137781945 M * Bertl (down being towards child namespaces) 1137781970 M * Bertl this could be made even more secure using a stack like structure 1137781983 M * Bertl (which would also allow to move up again) 1137781993 M * ebiederm So what if we encapsulated that pointer to a namespace in a filedescriptor. 1137782028 M * ebiederm I pass it to you using unix domain sockets you can use it. 1137782028 M * ebiederm I don't give it to you you can't. 1137782046 M * ebiederm We might also be able to allow safely creating it from files in /proc. 1137782155 M * Bertl well, I know this kind of 'token' or key passing seems to come into fashion, but actually I do not see the immediate advantage and/or purpose 1137782204 M * Bertl why would I want to pass on such capabilities, the only thing I would like to be able is to drop them ... 1137782206 M * ebiederm Bertl: It's still a brain storm. 1137782253 M * ebiederm Basically this make the namespace problem look like the familiar problem of chroot+fchdir. 1137782255 M * Bertl and if you are talking about 'accessing' those namespaces (in form of specifying which namespace I want to enter) 1137782280 M * Bertl then IMHO some unique identifier should be more than enough 1137782287 M * Bertl (a topic we should talk about too) 1137782358 M * Bertl another question we should look at first is: are we aiming for a flat or hierarchical model? 1137782411 M * ebiederm There are two questions. 1137782422 M * ebiederm 1) Which namespace to I want to swich to my own. 1137782441 M * ebiederm 2) How do you access controll that namespace. 1137782468 M * ebiederm Names can be anything and are not a big deal as you pointed out. 
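The descriptor-passing idea ebiederm floats above (hand a handle over a unix domain socket, and whoever never receives it cannot use it) can be sketched with ordinary SCM_RIGHTS passing. This is only an illustration, not anything from linux-vserver or the kernel: the "namespace handle" is a plain pipe standing in for a real handle, and the names `pass_handle`, `receive_handle` and `demo` are hypothetical. It assumes Python 3.9+ on a Unix system.

```python
import os
import socket


def pass_handle(sender, fd):
    """Send one descriptor over a unix domain socket (SCM_RIGHTS)."""
    # The kernel duplicates fd into the receiving process; the one-word
    # payload is only there because at least one byte must be sent.
    socket.send_fds(sender, [b"ns"], [fd])


def receive_handle(receiver):
    """Receive the payload and exactly one descriptor."""
    msg, fds, _flags, _addr = socket.recv_fds(receiver, 16, 1)
    return msg, fds[0]


def demo():
    # Two connected unix domain sockets stand in for host and guest side.
    host_end, guest_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    # ASSUMPTION: a pipe stands in for the namespace handle from the chat.
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"token")
    os.close(write_fd)

    pass_handle(host_end, read_fd)
    msg, received_fd = receive_handle(guest_end)

    # The receiver can use the handle without ever having been told a name.
    data = os.read(received_fd, 16)

    for fd in (read_fd, received_fd):
        os.close(fd)
    host_end.close()
    guest_end.close()
    return msg, data
```

Calling `demo()` passes the read end of the pipe across the socket pair and reads the stored bytes back through the received descriptor. Access control here is purely a matter of who was handed the descriptor, which is the property ebiederm is after and the point Bertl questions when he suggests a unique identifier instead.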
1137782497 M * Bertl yes, but they might be strongly related, i.e. you might only 'see' and therefore be able to reach child namespaces for example 1137782533 M * Bertl of course, those are two different elements 1137782548 M * ebiederm If I don't always want an all powerful sysadmin giving use a filedescriptor would seem to give me stronger filtering power. 1137782582 M * Bertl but taking away priviledges from the 'parent' admin doesn't sound useful to me 1137782608 M * Bertl (or short-circuiting across several namespace hierarchies) 1137782612 M * ebiederm Except for the pid namespace there is no need for a hierarchy. 1137782624 M * Bertl ebiederm: huh? 1137782639 M * Bertl like chroot() :) 1137782641 Q * Doener Ping timeout: 480 seconds 1137782654 M * ebiederm Bertl: At the implementation level (Sorry your question of the hierarchical model from earlier). 1137782680 J * Doener doener@i5387C8EC.versanet.de 1137782684 M * ebiederm namespaces are essentially independent entities. 1137782703 M * Bertl I think there is, especially for filesystem and networking you actually want hierarchies 1137782723 M * ebiederm As for taking away privleges from the parent admin. Suppose I want to process secret information in my guest. 1137782725 M * Bertl of course, every hierarchy can be expressed as a flat model 1137782756 M * Bertl (expressed or implemented) 1137782833 M * ebiederm Bertl: There are certainly relationships, between namespaces but I don't think in general a hierarchy makes sense. 1137782858 M * ebiederm In general I don't think a hierarchy is fundamental. But this is minor. 1137782887 M * ebiederm The key question is how do I have a setup which allows me to process secret information in a guest that the host admin shouldn't be able to spy on. 1137782916 M * Bertl well, the key question is, do I want that? 1137782928 M * ebiederm Good question. 1137782969 M * ebiederm It is a question of how you configure your guests, so it may be something you never enable. 
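Bertl's two points above, that one might only be allowed to move 'down' towards child namespaces and that every hierarchy can be expressed as a flat model, fit together in a small sketch. The table contents and the function name `may_enter` are hypothetical; this models the policy being discussed, not any existing kernel interface.

```python
# Flat table: every namespace is an independent entry that merely records
# its parent. ASSUMPTION: the names below are made up for illustration.
PARENT = {
    "host": None,
    "guest-a": "host",
    "guest-a-sub": "guest-a",
    "guest-b": "host",
}


def may_enter(table, current, target):
    """Allow a move only 'down', i.e. into a strict descendant of current."""
    node = table[target]
    while node is not None:
        if node == current:
            return True
        node = table[node]
    return False
```

With this table, `may_enter(PARENT, "host", "guest-a-sub")` is allowed, while moving from `guest-a` up to `host` or sideways to `guest-b` is refused. The stack-like structure Bertl mentions for moving up again would need extra state and is not modelled here.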
1137782976 M * Bertl I'm a real fan of privacy and looking at (older) solutions like TCFS and of course, the gpg aspects 1137783019 M * Bertl I think it's worth the efford, but, as you already mentioned, it's a very special feature, which often requires you to do very complex stuff 1137783067 M * Bertl and I wonder if it wouldn't be much easier to 'just' make separate (unpriviledged) init processes for this purpose ... 1137783104 M * Bertl i.e. have the kernel start a new 'unpriviledged' partition :) 1137783121 M * Bertl and run the host 'partition' unpriviledged too 1137783154 M * Bertl (something which in theory might even work without complicating the design) 1137783279 M * ebiederm Coming full circle secrets in guests is why I am leary about changing namespaces. 1137783329 M * ebiederm Having a privlege model that works without lots of aditional capabilites. 1137783341 M * Bertl okay, a simple but effective suggestion here: 1137783362 M * Bertl - let's add a bunch of 'new' capabilities to control the transition 1137783373 M * Bertl - add a kernel boot time mask which restricts them 1137783415 M * Bertl (we could even have a compile time default) 1137783498 M * ebiederm hmm. This is why I wanted to simply pass file descriptors. 1137783514 M * ebiederm Regardless we are getting ahead of our selves. 1137783549 M * ebiederm I'm going to propose using namespaces to control this and see if I can get on talking terms with the IBM guys. 1137783559 M * Bertl okay, but you agree that entering certain namespaces is very important to certain applications 1137783587 M * Bertl (while the strict secparation is also important for others :) 1137783603 M * ebiederm Bertl: Is it a question of entering namespaces or administering namespaces? 1137783625 M * Bertl IMHO both ... for me the administration is just the special case 1137783645 M * ebiederm Ok. What kinds of useful things do you do besides administration? 
1137783652 M * ebiederm Just trying to wrap my head around the problem. 1137783687 M * ebiederm Although I am wondering if we solve this if we can remove some of the weird complexity from namespace.c 1137783688 M * Bertl real world scenario: customer calls the provider: I have some problem starting my foobar app 1137783720 M * Bertl reply: sorry pal, you're on your own, I have no access to your machine ... wrong! 1137783735 M * Bertl (at least for 90% of the providers) 1137783757 M * Bertl now here is something which makes linux-vserver (and similar) interesting for the provider 1137783780 M * ebiederm Interesting a back door root. 1137783793 M * Bertl as he can simply 'enter' the guest (maybe even stay invisible or at least without disturbing the guest) 1137783829 M * ebiederm I guess that really is the model customers frequently want with hand holding support. 1137783838 M * Bertl but, at the same time he has to get as close as possible to a guest process, while maintaining security 1137783844 J * stefani ~stefani@superquan.apl.washington.edu 1137783855 M * Bertl welcome stefani! 1137783916 M * ebiederm Is there another use where this would be used in an automated routing kind of fashion? 1137783951 M * Bertl ebiederm: honestly I already tried to convince folks (our customers :) that something like 'enter' should go away ... (and I guess all developers agree here, that those back-doors are bad) 1137783992 M * Bertl but fact is, those admin powers are one of the many advantages a VPS system has over Xen and/or real machines 1137784036 M * ebiederm Ok. I just wanted to make certain I had the use model down. 1137784038 M * Bertl some folks implement light-weight guests without sshd (or any other way to reach the guest= 1137784052 M * Bertl just for efficient service separation 1137784076 M * Bertl but they want to have some way to maintain the services, if something goes wrong 1137784108 M * ebiederm Actually that makes sense. 
1137784115 M * ebiederm Just using guests as a supre chroot. 1137784132 M * Bertl yes 1137784162 M * ebiederm Just a place to put a subset of your priveleges so if you are hacked things won't get out of control. 1137784170 M * Bertl exactly 1137784190 M * lonewolff win 12 1137784193 M * Bertl this, btw, is the second most used kind of application 1137784195 M * lonewolff oops sorry 1137784201 M * Bertl np 1137784224 M * ebiederm What is the most used kind of application? 1137784257 M * Bertl the provider scenario, definitely (i.e. VPS or host consolidation) 1137784267 Q * sizo Quit: reboot 1137784269 M * ebiederm Ok. Got it. 1137784300 M * ebiederm I've got another interesting one. 1137784305 M * Bertl at third place is distro testing and such 1137784330 M * ebiederm A higher availability guest. 1137784348 M * ebiederm You virtualize it's surroundings so you can start it on a backup machine if the primary fails. 1137784369 M * Bertl yes, that's basically part of the first application area 1137784394 M * ebiederm Yea, I guess it would be. 1137784394 M * Bertl many providers have failover scenarios working 1137784422 M * Bertl sometimes with shared filesystems, sometimes with drbd or shared devices, sometimes simply with rsync backups 1137784445 A * ebiederm shudders 1137784461 M * ebiederm I used drbd once.... 1137784474 M * ebiederm Then I had a bug and looked at the code. 1137784489 M * Bertl lol 1137784501 A * lonewolff uses nas storage for production boxes 1137784535 M * Bertl ebiederm: nevertheless it works for many folks (as does reiserfs, to my constant surprise) 1137784541 M * ebiederm If I have to do things like drbd I will use nbd + md. 1137784574 M * ebiederm Bertl: I haven't found a bug in the code but last I looked the error handling was provably poor. 
1137784579 M * Bertl ebiederm: if you have a larger distrance between your machine and backup, nbd+md will be really slow 1137784607 M * ebiederm Bertl: I think they have finally implemented journally (like drbd) in the md layer. 1137784639 M * ebiederm s/journally/journalling/ 1137784679 M * ebiederm Anyway thanks for the good conversation, and someone to bounce ideas off of. 1137784693 M * Bertl the pleasure was mine ... 1137784708 M * Bertl hope we can do that again soon ... 1137784740 M * ebiederm Right. Now to see if the rest of the kernel developers like the idea of using clone for namespaces. 1137784755 M * ebiederm Just a sec... 1137784841 M * ebiederm One of the things I have been anticpating, but I have not done yet is, a namespace for UID/GID type management. 1137784864 M * Bertl hmm, what would it be used for? 1137784926 M * ebiederm So in the VPS type scenario does everyone agree to use the same UID's for their users? 1137784937 Q * pusling Remote host closed the connection 1137784947 J * pusling pusling@195.215.29.124 1137785025 M * ebiederm the key part is that the kernel has struct user_struct for tracking information per users. 1137785048 M * ebiederm This has per user rlimits and the like. 1137785104 M * ebiederm I think vserver is currently using the high bits of the uid for handling that problem right now. 1137785194 M * ebiederm All I'm certain of is that guests using the same UID for different users is something that needs to be handled at some level. 1137785204 M * Bertl well, the user structs are isolated too in linux-vserver 1137785218 M * Bertl (and virtualized in the hash tables) 1137785227 M * Bertl so basically every guest has it's own 1137785256 M * Bertl so yes, this _is_ a separate namespace 1137785298 M * ebiederm Ok. Does that have anything to with your on disk tagging mechanism? 1137785319 M * Bertl no, not really 1137785327 M * ebiederm Ok thanks. 
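The per-guest user accounting Bertl describes (user structs isolated per guest and virtualized in the hash tables) amounts to keying the lookup on the guest context as well as the UID. A minimal sketch, assuming a context id called `xid` and a single per-user process counter; the class and method names are hypothetical and do not reflect the actual linux-vserver code.

```python
from dataclasses import dataclass


@dataclass
class UserStruct:
    # Stand-in for the kernel's per-user accounting; only one counter here.
    processes: int = 0


class UserTable:
    """Hash table keyed by (xid, uid) instead of uid alone."""

    def __init__(self):
        self._table = {}

    def lookup(self, xid, uid):
        # The same uid in two guests maps to two independent entries.
        return self._table.setdefault((xid, uid), UserStruct())

    def charge_process(self, xid, uid, limit):
        """Account one more process; refuse once the per-user limit is hit."""
        user = self.lookup(xid, uid)
        if user.processes >= limit:
            return False
        user.processes += 1
        return True
```

With a limit of one process, UID 1000 in guest 1 can be charged once and is then refused, while UID 1000 in guest 2 is still accepted because it has its own entry.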
1137785340 M * Bertl np 1137785349 M * ebiederm For light weight guests in super-chroot mode I don't think that namespace is necessary. 1137785376 M * ebiederm I will have to look at the vserver implementation of the UID/GID namespace at some point in the future then. 1137785464 M * Hollow Bertl: what re the changes to 2.1.0.5? 1137785502 M * Bertl soft limits should be there (also atomic64 ops on 64bit platforms) 1137785519 M * Bertl page fault and slab accounting 1137785560 M * Hollow ok, thanks 1137785614 M * Bertl there will be a fixup shortly (as I'm currently testing the thing) 1137787349 J * liquid3649_ ~Viper0482@p549761A9.dip.t-dialin.net 1137787573 M * yang How do I remove vserver and directories? 1137787606 M * Bertl rm -rf ? 1137787626 M * Bertl make sure it is stopped though 1137787788 Q * Viper0482 Ping timeout: 480 seconds 1137787803 M * yang rm -rf /etc/vservers/vserver-name /var/lib/vservers/vserver-name ? 1137787927 M * lonewolff that would do it 1137788683 Q * pusling Ping timeout: 480 seconds 1137788903 J * pusling pusling@195.215.29.124 1137789687 M * Bertl off for a little, back later ... 1137789696 N * Bertl Bertl_oO 1137790643 Q * liquid3649_ Quit: bin raus, 1137793553 Q * jeeves Quit: Leaving 1137794177 J * bonbons ~bonbons@83.222.39.249 1137794829 Q * bonbons Quit: Leaving 1137795552 Q * Roey Quit: Leaving 1137795827 P * meandtheshell 1137795900 Q * Doener Quit: Leaving 1137796493 P * stefani I'm Parting (the water) 1137797143 J * Smutje_ ~Smutje@xdsl-84-44-186-62.netcologne.de 1137797269 Q * Smutje Ping timeout: 480 seconds 1137797767 Q * gerrit Ping timeout: 480 seconds 1137798444 Q * pusling Remote host closed the connection 1137798446 J * pusling_ pusling@195.215.29.124 1137800197 P * vrwttnmtu Leaving 1137800605 J * monrad ~mikkel@213083190131.sonofon.dk 1137800606 P * undefined