1104365064 M * Bertl kernel/vserver/context.c:374 1104365064 M * Bertl include/asm/uaccess.h:421 1104365064 M * Bertl fs/stat.c:325 1104365064 M * Bertl include/asm/bitops.h:44 1104365064 M * Bertl fs/stat.c:354 1104365066 M * Bertl include/linux/fs.h:1043 1104365069 M * Bertl fs/readdir.c:226 1104365072 M * Bertl fs/readdir.c:284 1104365074 M * Bertl fs/readdir.c:226 1104365077 M * Bertl first one is EIP 1104365188 M * Bertl added it to the OOPS 1104365217 M * Doener return __loc_vx_info(id, &err); ? 1104365253 M * Doener either my source is messed up, or your kernel version is wrong... 1104365302 M * Bertl why? that is the line here too ... 1104365420 M * Bertl hmm .. sec 1104365468 M * Doener EIP is at get_xid_list+0x30/0x60 (maybe i don't understand that one right...) 1104365516 M * Bertl hmm, right, that is odd ... 1104365554 M * Bertl let me disassemble the relevant code .. sec 1104365625 M * Bertl okay, pls update the OOPS 1104365703 M * Bertl hmm, interesting we are at least 20 (hex) off .. strange ... 1104365788 M * Bertl comparing the bytecode .. I'd say EIP should be at 801389a0 1104365795 M * Bertl sec, checking the kernel ... 1104365933 M * Bertl hmm .. seems like I recompiled the kernel tree yesterday ... 1104365950 M * Bertl but I guess I can 'adjust' the offsets, sec ... 1104366198 M * dominance Bertl: hope that ipv6 hands on tour was enough.. just yell for me if you need more.. 1104366376 M * Bertl hum, there was an ipv6 hands on tour? 1104366408 M * Bertl Doener: machine is unreachable atm :/ but I managed to correct the EIP address 1104366451 M * Bertl ah, machine is reachable again ... 1104366550 M * dominance Bertl: on the ml? 1104366577 M * dominance Bertl: found it or want an extra copy? *g* 1104366613 M * Bertl ah, okay, haven't managed to read it yet ... 
1104366646 M * dominance well, you have too fast RTT on mails usually, so i presume you scan them at SMTP time already straight into the brain :-P 1104366770 Q * flock Ping timeout: 480 seconds 1104366826 M * Bertl Doener: okay, updated the first 6 addresses 1104366942 M * Doener ok 1104367028 M * Doener 80138980 (801389a0) include/asm/processor.h:636 ? /usr/include/...? 1104367044 A * Doener got no include/asm/... in his tree... 1104367088 M * Bertl it is a symlink to include/asm- in this case x86 1104367106 M * Bertl so look in include/asm-i386/processor.h 1104367118 M * Doener line numbers don't make sense there 1104367132 M * Doener 636: { 1104367167 M * Bertl it's 1104367168 M * Bertl extern inline void prefetch(const void *x) 1104367197 M * Bertl this is inline asm and has no valid line numbers 1104367207 M * Doener ah, i c 1104367217 M * Bertl and it matches the pattern, we are traversing the list 1104367229 M * Bertl and hitting a list poison ... 1104367255 M * Bertl I can go back the address and see where we end ... sec 1104367291 M * Bertl # addr2line -e vmlinux 8013899f 1104367291 M * Bertl kernel/vserver/context.c:390 1104367347 M * Bertl but to simplify it, the bug is pretty obvious 1104367422 M * Bertl get_xid_list() is still assuming rcu rules ... 1104367439 M * Bertl (where the context disposal is non-rcu for now) 1104367548 M * Bertl dominance: okay, scanned your email ... what about 'special' stuff like local, private, martian? does that exist in ipv6? 1104367685 M * dominance martian is a kernel thingy that's neither v6 nor v4.. 1104367700 M * dominance all martian is telling that your input device is not the device your routing table your send a replty to 1104367706 M * dominance thus that should exist alto with ipv6 1104367726 M * dominance and private doesn't really exist, only link-local and site-locate AFAIR.. 
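The oops-address bookkeeping discussed above can be sketched in a few lines: an oops reports a location as `symbol+0xoffset/0xsize` (here `get_xid_list+0x30/0x60`), and because the tree had been rebuilt, the logged addresses were 0x20 off, so 80138980 became 801389a0. The helper names `parse_oops_location` and `adjust` are hypothetical and not part of any kernel tool; the delta is only the value quoted in the chat.

```python
import re

# Matches the "symbol+0xOFFSET/0xSIZE" form printed in an oops line.
_LOC = re.compile(
    r"^(?P<sym>[A-Za-z_]\w*)\+0x(?P<off>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)$"
)


def parse_oops_location(text):
    """Split 'sym+0xoff/0xsize' into (symbol, offset, size)."""
    m = _LOC.match(text.strip())
    if m is None:
        raise ValueError("not a symbol+offset/size location: %r" % text)
    return m.group("sym"), int(m.group("off"), 16), int(m.group("size"), 16)


def adjust(addr, delta):
    """Shift a logged address by a known delta between two kernel builds."""
    return addr + delta


if __name__ == "__main__":
    sym, off, size = parse_oops_location("get_xid_list+0x30/0x60")
    print(sym, hex(off), hex(size))
    # Address from the log, shifted by the 0x20 mentioned in the chat.
    print(hex(adjust(0x80138980, 0x20)))
```

The adjusted address is what one would then hand to `addr2line -e vmlinux`, as Bertl does in the log.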
1104367741 M * Bertl Doener: okay, what I actually wanted to address is, that we need to find a clean solution for context creation and context destruction ... 1104367765 M * Bertl dominance: hmm, what's that? 1104367767 M * dominance but off the hand i gotta say there's not all 128bit available.. thus there's *MANY* reserved spaces still to give away 1104367779 M * dominance that's "LOCAL" addresses as the name points out.. 1104367786 M * dominance link local should stop at your next router 1104367794 M * dominance and site local at your next border router 1104367804 M * dominance i.e. the "internet-uplink" 1104367807 M * Bertl how can they be identified? 1104367826 M * dominance by the special prefix group ff80 and fe80 iric.. 1104367828 M * dominance iirc.. 1104367842 M * dominance yet i'm not 100% sure that's settled yet 1104367851 M * dominance and still that's only a routing matter 1104367861 M * dominance for the notation and thus the handling it should be entirely transparent.. 1104367864 M * Bertl yeah, but we will have to do routing with ngnet 1104367873 M * dominance yet if you need that info i can get it to you .. 1104367880 M * Bertl please do so ... 1104367899 M * dominance ok, hold on a sec 1104368153 M * dominance http://www.join.uni-muenster.de/Dokumente/Howtos/IPv6_for_Beginners.php 1104368155 M * Doener ok, so what do we need there? 1104368156 M * dominance sec 3.4 1104368167 M * dominance there's the link-local and site-local plus multicast 1104368208 M * dominance but please acknowledge: 1104368210 M * dominance Attention: The IETF has long been working on abolishing the "site-local scope" (see draft-ietf-ipv6-deprecate-site-local-02.txt). However, a new revision of the "IPv6 Addressing Architecture" (RFC3513) will only appear once a replacement for these addresses has been devised that covers the use cases "island systems" and "private addresses". This discussion is still in full swing.
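On Bertl's question of how these address classes can be identified: dominance quotes the prefixes from memory, so treat them as approximate. In the addressing architecture the well-known ranges are fe80::/10 for link-local, fec0::/10 for site-local (since deprecated by RFC 3879) and ff00::/8 for multicast. A minimal sketch using Python's standard `ipaddress` module follows; the function name `ipv6_scope` and its return labels are made up for illustration.

```python
import ipaddress


def ipv6_scope(text):
    """Classify an IPv6 address by the scopes mentioned in the chat."""
    addr = ipaddress.IPv6Address(text)
    if addr.is_multicast:       # ff00::/8
        return "multicast"
    if addr.is_link_local:      # fe80::/10
        return "link-local"
    if addr.is_site_local:      # fec0::/10, deprecated
        return "site-local (deprecated)"
    return "other"


if __name__ == "__main__":
    for sample in ("fe80::1", "fec0::1", "ff02::1", "2001:1b18:202::2"):
        print(sample, "->", ipv6_scope(sample))
```

The classification only looks at the prefix, which fits dominance's point that scope is a routing matter and does not change the notation.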
1104368212 M * dominance *g* 1104368244 M * dominance what may or may not be needed is the autoconf based on MAC addresses of your ethernet adapter.. 1104368336 M * Bertl Doener: well, the main question is, when do we declare a context dead, and how ... 1104368388 M * Bertl I tend to make an explicit destruct syscall, which kind of reaps a context, once the last (but one) reference is gone ... 1104368449 M * Bertl it might also be a solution to 'just' mark a context dead, (like with zombies) and wait until their reference goes down to zero ... in which case we can dispatch it (the original idea) 1104368476 M * dominance bertl: and in case you have a specific question, try mailing: wp5@6net.org, cc: join@uni-muenster.de .. those are the mls for the working-groups which provide quite cool help with ipv6 questions. =) 1104368549 M * Bertl hmm, okay .. thanks ... 1104368593 M * dominance np. hth. 1104368612 M * Bertl Doener: a quick fix for the oops itself would be to reintroduce a context list lock 1104368634 M * dominance Bertl: if you need a tunnel to poke around and can't get one from anyone else, lemme know ;) 1104368669 M * Doener i'd prefer the latter... though the former may have advantages as well... 1104368880 M * Bertl dominance: k, thanks again ... 1104369059 M * Bertl Doener: currently we have two different kinds of references 1104369080 M * Doener use and reference count, right? 1104369085 M * Bertl yep 1104369121 M * Doener 'use' = context members (processes), 'ref' everything using the context? 1104369141 M * Bertl not sure why I actually introduced them ... they look somehow redundant to me .. 1104369148 M * Bertl (now) 1104369199 M * Bertl i.e. the last process gone could be easily tracked by the nr_threads 1104369245 M * Bertl and OTOH, it doesn't mean that much, because a process could join the context in just that moment (no protection against that yet) 1104369444 M * Loki|muh OTOH? 
1104369451 M * Doener on the other hand 1104369480 M * Loki|muh ah :) 1104370474 J * flock ~restless@l192-117-111-12.broadband.actcom.net.il 1104370779 M * Bertl welcome flock! 1104370801 M * Bertl Doener: okay, the basic questions are: 1104370817 M * Bertl a) when does a context become visible to the outside 1104370830 M * Bertl b) when does a context die, and what happens afterwards 1104370849 M * Bertl c) what about references which outlive the context 1104370868 M * Bertl d) what about synchronization ala wait() 1104370879 M * Doener c) is /proc/virtual and friends? 1104370900 M * Bertl yes, for example or much more important networking stuff, sockets, etc 1104370912 M * Doener oops... read "outlive" as "live out of"... 1104370940 A * Doener switches from scanning to reading... 1104370956 M * Bertl e) can a context be created right after the previous one died? 1104370974 M * Bertl e1) if so, how to handle more than one context with the same xid 1104370995 M * Bertl e2) if not, how to notify userspace of the dead context 1104371021 M * Bertl f) if synchronization is important (according to enrico it is) how to do it? 1104371042 M * Bertl f1) leave the context dangling around (dead) until somebody 'collects' the info 1104371055 M * Bertl f2) clean it up, unless somebody waits() for it .. 1104371072 M * Bertl f3) do some versioning and keep the info, but not the context 1104371222 M * Bertl and finally g) how do we map whatever we find appropriate to the legacy stuff? 1104371274 M * Loki|muh is it necessary to support the legacy stuff? 1104371292 M * Bertl well, a good question ... 1104371316 M * Bertl probably not, we can make the new tools a requirement 1104371546 M * Doener a) where does the 'outside' begin? 
1104371630 M * Bertl simple, process A is creating context 100, process B is doing something with contexts 1104371660 M * Bertl and the second case: the kernel itself is doing something with contexts 1104371668 M * Doener ok, so outside is everything but the process that creates the context 1104371685 M * Bertl yes 1104371918 M * Doener then i'd say, asap, so that we can hold a lock for a short period (creation + insertion into the list of contexts) and avoid duplicate contexts with that lock. The context may be in a special state, so that other processes may not join it yet... 1104371956 M * Bertl well, we can't use a global lock, context creation can relatively long in kernel terms 1104371979 M * Bertl and a context local lock would not help us with the visibility 1104372065 M * Bertl okay, let's postpone this and do a not-so-dirty fix for the oops first 1104372073 M * Bertl (but keep thinking about those questions) 1104372101 M * Bertl kernel/vserver/context.c line 210 1104372117 M * Bertl this is the currently problematic put() 1104372127 M * Doener 1.9.3.14 or 1.9.3.11 ? 1104372133 M * Bertl 1.9.3.14 1104372197 M * Bertl actually it shows that I started to convert the stuff back from rcu to normal locking but kept the rcu stuff in place ... 1104372241 M * Bertl as my initial issues with rcu where a kernel issue (the rcu stuff was buggy) I'm inclined to give rcu a second chance here ;) 1104372314 M * Bertl this further means that we would disable the vx_info_hash_lock completely 1104372345 M * Bertl and put the second part of __unhash_vx_info() into an rcu callback 1104372397 M * Doener re-activating the call in free_vx_info? 1104372404 M * Bertl nope 1104372450 M * Bertl give me a minute, I'll do a patch ... 1104372456 M * Doener ok 1104372474 M * Doener i'll continue looking into the above then 1104372485 M * Bertl excellent! 
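The lifecycle questions above can be made concrete with a toy model. It combines Doener's suggestion (insert the context early under a short lock, in a special state that nobody can join yet) with Bertl's zombie idea (mark a context dead and dispose of it once the last reference is dropped). Everything here is an assumption for illustration: the class `ContextTable`, the state names and the method names are hypothetical and do not correspond to the vserver kernel code.

```python
import threading


class ContextTable:
    """Toy xid table; the lock is held only for table updates."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ctx = {}  # xid -> {"state": str, "refs": int}

    def create(self, xid):
        """Reserve xid in 'setup' state; refuse duplicates, dead ones included."""
        with self._lock:
            if xid in self._ctx:
                return False
            self._ctx[xid] = {"state": "setup", "refs": 1}  # creator's reference
            return True

    def activate(self, xid):
        """Long-running setup is done; others may now join."""
        with self._lock:
            self._ctx[xid]["state"] = "active"

    def get(self, xid):
        """Take a reference, but only on an active context."""
        with self._lock:
            ctx = self._ctx.get(xid)
            if ctx is None or ctx["state"] != "active":
                return False
            ctx["refs"] += 1
            return True

    def mark_dead(self, xid):
        """Zombie state: existing references stay valid, new joins fail."""
        with self._lock:
            self._ctx[xid]["state"] = "dead"

    def put(self, xid):
        """Drop a reference; return True if this disposed of the context."""
        with self._lock:
            ctx = self._ctx[xid]
            ctx["refs"] -= 1
            if ctx["refs"] == 0 and ctx["state"] == "dead":
                del self._ctx[xid]
                return True
            return False
```

In this model question e) is answered with "no": an xid cannot be reused until the dead context has lost its last reference. That is one possible policy, not a conclusion reached in the discussion.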
1104373528 Q * flock Ping timeout: 480 seconds 1104373563 M * Bertl Doener: http://vserver.13thfloor.at/Experimental/delta-2.6.10-vs1.9.3.14-vs1.9.3.14.1.diff 1104373618 J * flock ~restless@l192-117-111-12.broadband.actcom.net.il 1104373655 M * Bertl wb flock! 1104374526 M * Doener hm, how does synchronize_kernel() solve the problem? 1104374594 M * Bertl let's assume the refcount is one, when we do the unhash 1104374648 M * Bertl what happens in __unhash_vx_info() is that the 1104374667 M * Bertl vxinfo is removed from the list with hlist_del_rcu() 1104374717 M * Bertl and right after that, we hit the put_vx_info() 1104374732 M * Bertl which in turn calls free_vx_info() 1104374781 M * Bertl now the refcnt is 1 which brings ut to 1104374787 M * Bertl __dealloc_vx_info() 1104374807 M * Bertl where we do: 1104374811 M * Bertl vxi->vx_hlist.next = LIST_POISON1; 1104374833 M * Bertl (exactly to catch this kind of error the hard way ;) 1104374879 M * Bertl now the synchronize_kernel() waits until all rcu_read_lock() sections are over ... and we can free the stuff without further thinking ... 1104374984 M * Bertl (does that sound reasonable for you?) 1104375171 M * Doener why can we be sure, that noone else started using it again? 1104375218 M * Bertl doesn't matter ... if somebody uses it again, fine, we will not reach the free in that case 1104375236 M * Bertl (because the put will not lead to the free) 1104375302 M * Bertl every path _walking_ the list is protected with the rcu_read_lock() 1104375336 M * Bertl so we can say for sure that nobody _reading_ the hash is doing so, when synchronize_kernel() returns 1104375592 M * Doener ok, got some coffee... that helps a lot ;) 1104375634 M * Loki|muh i will get some sleep ;) 1104375636 M * Loki|muh gn8 1104375652 M * Bertl Loki|muh: night! 1104375661 M * Doener night Loki|muh! 1104375668 M * Bertl Doener: the rcu stuff isn't really easy to understand 1104375674 A * Doener can't sleep... 
1104375688 M * Doener a) i'd miss my 'meeting' at the dentist 1104375697 M * Doener b) thinking of dentist keeps me from sleeping 1104375848 M * Doener Bertl: yeah, the last time i had 2 or 3 articles on that topic on screen to help me going through the source 1104375932 M * Bertl the theory sounds really trivial ... and easy to do ... but when it comes to using it in various places and considering the interactions .. it's not easy at all ... the locking stuff is trivial compared to that ;) 1104375973 M * Bertl but maybe the following chain of thought clarifies it a little ... 1104376040 M * Bertl let's assume there are some sections/threads walking the first hash list 1104376064 M * Bertl one of them is a few places before 'our' context, and the other right behind it 1104376140 M * Bertl now we do hlist_del_rcu() which does nothing more but set prev->next to this->next 1104376191 M * Bertl so the thread before our context walks right over it, and the one after it is not affected at all 1104376217 M * Bertl now let's add a third one, which is precisely reding 'this' context 1104376240 M * Bertl it will see the context, until it decides to move on ... 1104376266 M * Bertl as all list walks are protected by the rcu_read_lock() statement 1104376293 M * Bertl the synchronize_kernel() will force a decision: 1104376319 M * Bertl either the context was referenced (i.e. refcnt > 1 because we keep a reference too) 1104376336 M * Bertl or it was ignored, which is fine too ... 1104376378 M * Bertl any read_rcu_lock() started right after our synchronize_kernel() will have no effect, and work 'just' fine ... 1104376403 M * Doener because the context was already removed from the list... 1104376421 M * Bertl exactly ... 1104376445 M * Doener and those having read this context will have finished reading, so we may modify this context. 1104376456 M * Bertl that's it ... 1104376465 M * Doener ok, thanks! 1104376476 M * Bertl my pleasure! 
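Bertl's chain of thought can be replayed in a small userspace model. This is a sketch of the list mechanics only, not kernel code: it assumes a plain singly linked list, models `hlist_del_rcu()` as "unlink but leave the removed node's `next` pointer intact", and uses a sentinel `POISON` in place of `LIST_POISON1`. All names (`Node`, `build`, `find`, `del_rcu`, `walk_from`) are hypothetical.

```python
POISON = object()  # stand-in for LIST_POISON1


class Node:
    def __init__(self, xid):
        self.xid = xid
        self.next = None


def build(xids):
    """Return a dummy head followed by one node per xid."""
    head = Node(None)
    cur = head
    for xid in xids:
        cur.next = Node(xid)
        cur = cur.next
    return head


def find(head, xid):
    node = head.next
    while node is not None and node.xid != xid:
        node = node.next
    return node


def del_rcu(head, node):
    """Unlink node; node.next is left alone so readers on it can move on."""
    prev = head
    while prev.next is not node:
        prev = prev.next
    prev.next = node.next


def walk_from(node):
    """Continue a reader's walk from its current position."""
    seen = []
    while node is not None:
        if node is POISON:
            raise RuntimeError("reader hit list poison")
        if node.xid is not None:
            seen.append(node.xid)
        node = node.next
    return seen


if __name__ == "__main__":
    head = build([1, 2, 3, 4])
    before, victim = head.next, find(head, 3)
    after = victim.next
    del_rcu(head, victim)
    print(walk_from(before))  # reader ahead of the victim skips it
    print(walk_from(victim))  # reader on the victim still moves on
    print(walk_from(after))   # reader behind the victim is unaffected
    victim.next = POISON      # safe only after all old readers are done
    print(walk_from(head))    # new readers never reach the victim
```

Poisoning `victim.next` while a reader is still positioned on the victim makes `walk_from(victim)` raise, which corresponds to the oops discussed earlier; waiting for the readers first is the role `synchronize_kernel()` plays in the patch.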
1104376584 M * Doener well, the rest of the patch is quite straight forward :) so if you want my opinion on it, i guess it's fine 1104376663 M * Bertl yeah, I hope so .. currently facing a different issue ... 1104376675 M * Bertl ..TIMER: vector=0x31 pin1=0 pin2=-1 1104376675 M * Bertl ..MP-BIOS bug: 8254 timer not connected to IO-APIC 1104376675 M * Bertl ...trying to set up timer (IRQ0) through the 8259A ... failed. 1104376675 M * Bertl ...trying to set up timer as Virtual Wire IRQ... works. 1104376688 M * Bertl (with 2.6.10-vs1.9.3.14) 1104376706 M * Bertl kernel hangs after that ... 1104377400 M * Bertl ah, okay, my fault ... should not test more than one thing at a time ;) 1104377408 M * Doener hehe 1104377426 M * Bertl 20kHz is pushing the limits ... 1104377490 M * Doener HZ value? 1104377504 M * Bertl yep 20000 currently ... 1104377568 M * Doener hm, why would you want such high values? doesn't that cause some nice overhead? 1104377627 M * Bertl yes, it does, but the aim is not to run at 20kHz but to make the setting variable ... 1104377649 M * Bertl currently the range is between 20Hz and 20kHz which I consider appropriate 1104377669 M * Bertl I'll do some tests on the overhead in the next few days (weeks?) 1104377691 M * Bertl but this requires a working setup at all those frequencies ... 1104377722 M * Bertl (too many places hardcode the values or do funny assumptions) 1104379305 M * Bertl well, that gives us a nice oops with killer-03 ... 1104379366 M * Bertl actually two oopses (on two different cpus) 1104379374 M * Doener nice ;) 1104379388 M * Bertl and in the same place ;) 1104379492 M * Bertl http://vserver.13thfloor.at/Experimental/OOPS/OOPS-02.txt 1104379740 A * Doener needs to calm down a little... 1104379747 M * Doener damn dentist... ;) 1104379996 M * Bertl hah! we managed to hit 1104379997 M * Bertl printk(KERN_ERR "bad: scheduling from the idle thread!\n"); 1104380085 M * Bertl so synchronize_kernel() is not an option there ... 
1104380617 M * Bertl http://vserver.13thfloor.at/Experimental/delta-2.6.10-vs1.9.3.14.1-vs1.9.3.14.2.diff 1104380624 M * Bertl (let's see if that fixes it ;) 1104381485 M * Doener i'll go and read a book or something... i'm going crazy :( 1104381495 N * Doener Doener|gone 1104381517 M * Bertl okay, cya 1104381922 J * nox- ~nox@c135242.adsl.hansenet.de 1104382269 Q * nox Ping timeout: 480 seconds 1104382282 N * nox- nox 1104383595 J * rs ~rs@imhotep.rhapsodyk.net 1104383600 M * rs re 1104383638 M * Bertl hey rs! 1104383657 M * rs hey bertl, hwo are you ? 1104383668 M * Bertl fine thanks, just sent you email ... 1104383675 M * rs is it evening for you ?:) 1104383680 M * rs yeah just answered 1104383687 M * Bertl well, yes, I'm going to bed soon ... 1104383712 M * Bertl ad email: okay, thanks! 1104384001 M * Bertl okay, I'm off to bed ... the router will probably be fine when I get up later ... 1104384363 M * Bertl night everyon! 1104384373 N * Bertl Bertl_zZ 1104386177 Q * rs Quit: Lost terminal 1104393870 Q * _are_ Quit: Disconnecting 1104395425 Q * DuckMaster Quit: Client exiting 1104397119 J * rs rs@ice.aspic.com 1104397190 Q * SiD3WiNDR Ping timeout: 480 seconds 1104397225 J * SiD3WiNDR luser@bastard-operator.from-hell.be 1104401234 J * berni_ ~berni@obelix.ipv6.birkenwald.de 1104403153 J * sebd ~sebd@lesdeveloppementsdurables.org 1104405073 Q * flock Ping timeout: 480 seconds 1104406827 J * flock ~restless@l192-117-111-12.broadband.actcom.net.il 1104412112 Q * berni Ping timeout: 480 seconds 1104412900 J * berni ~berni@2001:1b18:202::2 1104415422 M * sannes hm, 1.9.3.14.2 panics on me when stopping a vserver.. 1104417327 M * sannes well it used to until last reboot, but nevermind that.. 1104417338 M * sannes now I can't make it crash reliably.. 1104417357 M * sannes was worse with 1.9.3.14 .. 
1104417845 Q * sannes Read error: Connection reset by peer 1104418063 J * sannes ~ace@home.skarby.no 1104421427 M * meebey can I access /dev/kmem inside a vserver if the nod exists? 1104421677 M * meebey nevermind 1104421688 M * meebey vpn_galilei:/# cat /dev/kmem 1104421688 M * meebey cat: /dev/kmem: Operation not permitted 1104421693 M * meebey that answers it 1104422389 M * TheSeer accessing kmem in a vserver sounds like a bad idea anyway ;> 1104422592 M * eyck ihmm, you shouldn't put kmem node in /dev in vserver... 1104422614 M * eyck you should get : # cat /dev/kmem -> no such file or directory 1104423012 Q * grecea Quit: Leaving 1104423814 M * meebey that was a security test 1104423830 M * meebey some rootkit use kmem to load modules 1104423840 M * meebey without using the module loader of the kernel 1104424503 M * eyck wow, this already 'vbeen automated? 1104425405 A * sannes whines.. 2.6.10 with vserver must be the most unstable kernel I ever have run.. heh 1104425503 M * eyck have you ever ran a stable 2.6.x ? 1104427029 M * TheSeer uptime -> 18:27:02 up 36 days, 4:29 ... 1104427037 M * TheSeer hmm.. not too bad ;> 1104427055 M * TheSeer where the 36 days is pretty much exactly the day i did install that box 1104427376 M * Zoiah eyck: yes. 1104427409 M * Zoiah eyck: http://www.phrack.org/phrack/58/p58-0x07 1104427500 P * Plug 1104427513 M * eyck Zoiah: I read that phrack, quite some time ago, but I haven't seen any rootkit using this technique 1104427514 J * Plug ~plug@datadot.net 1104427532 M * Zoiah eyck: read it again. :) 1104427549 M * Zoiah eyck: it includes a working implementation that's been spotted in the wild. :) 1104427783 Q * SiD3WiNDR Ping timeout: 480 seconds 1104428075 T * services.oftc.net http://linux-vserver.org/ | latest stable 1.29, devel 1.3.9, 1.9.3, ng8.7 1104428090 N * berni Guest2 1104428327 M * sannes eyck : actually yes, have a router running on 2.6 for months without a hitch .. 
thing is on my desktop computer I run mm-sources and I havn't had any troubles with it.. 1104428361 M * sannes the mm-sources have always been; either it works like a charm or it is completly broken.. 1104430738 N * Bertl_zZ Bertl 1104430742 M * Bertl morning folks! 1104430808 M * sannes morning :) 1104430823 M * Bertl hey, you have issues with 2.6.10-vs1.9.3.14? 1104430838 M * sannes yes, and 1.9.3.14.2 1104430851 M * Bertl ah, don't touch the 1.9.3.14.x 1104430874 M * Bertl (they are known to be buggy) 1104430888 M * sannes well, I get oopses and kernel panics on both.. 1104430906 M * Bertl okay, do you have some kernel panics/oopses from 2.6.10-vs1.9.3.14? 1104430922 M * Bertl (they could be _very_ helpful) 1104430922 M * sannes sure 1104430925 M * sannes ugh 1104430929 M * sannes not the output though.. 1104430937 M * sannes but they happen often enough.. 1104430955 M * sannes just going to need to patch up for a little bit of netlogging.. 1104430961 M * Bertl similar to this: http://vserver.13thfloor.at/Experimental/OOPS/OOPS-01.txt 1104431078 M * sannes not really.. I'll make it happen again and check.. 1104431111 M * Bertl okay, please try to get a trace (maybe from syslog or so?) but be careful with netconsole, it breaks some nics 1104431121 M * Bertl (i.e. it causes oopses itself) 1104431125 M * sannes ah 1104431143 M * sannes well, I don't think I got anything fromt he syslog.. I run syslog-ng maybe it isn't so good at it? 1104431143 M * Bertl at least gigabit nics e1000 and tg3 are affected 1104431163 M * Bertl really depends on when and how the oops happens ... 1104431179 M * sannes heh, can't get my e1000 to work anyways 1104431181 M * Bertl on a dual cpu system you usually have a 50% chance to get it into the log 1104431256 M * sannes well, I'm currently running it on my laptop .. tried the 1.9.3.14.2 on my production server for 30 minutes, but after a lot of kernel panics stopping and starting vservers .. well, heh.. I downgraded.. 
heh back to 2.4 for me.. until next time, that is.. 1104431288 M * sannes but atleast I have a working setup just waiting for a stable kernel.. heh :> 1104431319 M * Bertl well, if you can capture any oops on your laptop or wherever, please make it available ... 1104431343 M * Bertl (and if it just is a snapshot with your camera ;) 1104431346 M * sannes recompiling a vanlilla 2.6.10 + vs1.9.3.14 1104431350 M * sannes heh 1104431392 M * Bertl do you have preemption enabled, btw? 1104431422 M * sannes yes, on the laptop, but turned if off on the production server.. 1104431441 M * Bertl okay, maybe the oops looked like this then? 1104431444 M * Bertl http://vserver.lauft.net/2.6.10-vs1.9.3.14/kernel-first-oops.jpg 1104431449 M * sannes but it was really weird some times on the production server.. it had a pagefault loop of some kind.. 1104431466 M * Bertl with 1.9.3.14 or 1.9.3.14.x ? 1104431511 M * sannes .x 1104431528 M * Bertl well, 1.9.3.14.x crashed on my test scenario withn 10 seconds 1104431539 M * sannes are the things I should be avoiding the setup? and should I put some debug stuff on? 1104431564 M * Bertl the oops from ndim, seems vserver unrelated to me 1104431581 M * Bertl and it also seems that he had preemption on ... 1104431601 M * sannes I have the sneaking suspicion that 2.6.10 isn't very stable at all.. 1104431615 M * Bertl there are a few things which should be considered regarding config on 2.6.10 1104431630 M * Bertl one is to avoid preemption and npapi 1104431655 M * miekb OK - this is probably a really simpel question - when you enter a vserver wit hvserver XXX enter - why doesn't it run /etc/bashrc or .bashrc? 1104431656 M * Bertl on 64bit machines most filesystems seem to be broken too 1104431668 M * sannes but I probably came to the wrong conclusion that vserver had anything to with it when I ran 14.x because then it atleast oopsed and paniced more easily when starting and stopping vserver.. 
1104431710 M * Bertl well, as I said, the 1.9.3.14.x are some testing stuff to narrow down an issue, they _will_ break very easily 1104431720 M * sannes avoided preemption and npapi on the server, but heh, it was .x .. stupid stupid stupid.. 1104431750 M * sannes so, going to get the panic again.. so should I kill off preemption to make sure then? 1104431751 M * Bertl yeah, it's basically my fault, I should have labeled them differently 1104431784 M * Bertl sannes: try with preemption off ... if you fail to get the oops, it's probably not vserver related 1104431892 M * sannes hey, had any time to look at vroot porting yet? have a test setup ready.. 1104431906 M * sannes :) 1104431910 M * Bertl yeah, started porting it ... 1104432061 M * sannes :) anyways, .x seemed to panic when a vserver exited.. (in case you wanted to know) 1104432072 M * Bertl yes I know, but thanks ... 1104432115 M * Bertl btw, your BME patch looks good to me so far ... 1104432137 M * sannes :) cool! 1104432148 Q * serving Ping timeout: 480 seconds 1104432159 M * sannes and thanks for looking at it :) 1104432194 M * Bertl I'll do a more in depth comparison next year, and you should consider submitting it to lkml ... 1104432231 M * sannes The generic part? because that is straightforward, but BME is 0.1 % patch from me and 99.9 % your BME patch.. 1104432274 M * Bertl yeah, well, maybe we do some 'cooperation' on that ... or you mention me (once the patch get's accepted ;) 1104432406 M * sannes got some time before my semester starts, so I could do it so you can work more on vserver stuff :) but, I need to go more carefully through it then, because I mostly looked at the parts that broke when trying to use your patch with 2.6.10 .. 1104432442 M * sannes .. your 2.6.8.1 bme patch with 2.6.10 that is.. 1104432508 M * Bertl maybe al viro likes this one better ... 1104432521 M * Bertl (I would be happy if the stuff made it into mainline) 1104432565 M * sannes so would I .. 
and I'm quite fresh when it comes to writing kernel code that others can look at .. (just finnished my OS course now.. heh) 1104432572 J * mikmu hadge@h64-5-199-35.gtcust.grouptelecom.net 1104432578 M * Bertl welcome mikmu! 1104432588 M * mikmu thanks Bertl 1104432608 M * Bertl sannes: I'm pretty sure Al will have many things you need to change, but maybe he'll tell you in more detail _what_ he expects 1104432633 M * Bertl (in any case, reading up on the discussions on lkml might be a good thing to do) 1104432651 M * sannes yeah, have read them earlier, going to revisit them now.. 1104432748 M * mikmu I was wondering, I'm having through lately with a vserver built under 1.3.9. It seems that USR1 and HUP signals aren't working well. Init scripts using them to restart daemons don't work. The daemon is shut down, but not restarted. This is happening with both apache and spamd (spamassassin). Has anyone ever come up on this before? My other vservers are working fine, but this one is iffy. Apache is installed from deb package, spamd 1104432814 M * Bertl hmm, did you verify that the same binary/config work without linux-vserver? 1104432848 M * Bertl (it is easily possible that 1.3.9 messes up the signaling) 1104432877 M * sannes Bertl : Al is albeiro ? 1104432886 M * mikmu I don't have the exact same apache config on the base host, but it's the same version and it does restart well 1104432889 M * Bertl Al is Al Viro ;) 1104432906 M * Bertl (should probably go by Viro but that is so unpersonal) 1104432911 M * sannes ok, no #vserver irc junkie? :> 1104432931 M * Bertl no, but he is sometimes on #kernelnebies 1104433005 M * Bertl mikmu: would it be possible to start the machine with a non-vserver kernel (i.e. same kernel but not vserver patched) and start the apache within a chroot (into that vserver dir) and test if it works fine there? 1104433063 M * mikmu hmm, I'll look into it. 
got a few people using other vservers on that same machine 1104433083 M * Bertl not necessarily right now, when you get around ... 1104433120 M * mikmu yeah, In the meantime, I'll just get the scripts to do a shutdown and restart instead of hup 1104433160 M * mikmu I'll build the latest kernel too. I beleive I have .10rc3 running 1104433192 M * Bertl hmm, so you are actually talking about 1.9.3 not 1.3.9, ... 1104433287 M * mikmu whoops, yes indeed 1104433309 M * Bertl which makes it _more_ interesting for me ;) 1104433334 M * mikmu heh 1104433353 M * Bertl (and less likely that the signalling is messed up) 1104433378 M * mikmu yeah, doesn't seem to be problems on my other vservers. 1104433462 M * mikmu I don't have the first clue as to where to start figuring out my problem here though. Maybe something screwed up in /proc? 1104433547 M * mikmu I did mv this vserver to another location so I could format an lvm partition in reiserfs to hold it, then moved the files back in to the new partition 1104433671 M * Bertl you could strace the apache daemon (with strace -fF -o apache.trace) 1104433694 M * Bertl and try to do nothing else but sendign the HUP 1104433707 M * Bertl then upload the strace output somewhere ... 1104433716 M * mikmu half the time, when I vserver start this vserver, I get errors starting syslog. I may just reinstall the whole thing 1104433726 M * mikmu Bertl: Will do 1104434055 Q * monrad Quit: Leaving 1104434065 M * mikmu http://spensa.mindsweep.ca/apache.trace.txt 1104434243 M * mikmu http://spensa.mindsweep.ca/spam.trace.txt 1104436331 M * Bertl okay, back later ... 1104436336 N * Bertl Bertl_oO 1104436503 J * jiggabo ~black9@cblmdm206-107-239-213.buckeye-express.com 1104436518 P * jiggabo 1104436771 J * _are_ ~are@dsl-084-056-154-110.arcor-ip.net 1104436817 N * weasel weaselTM 1104436821 N * weaselTM weasel 1104436960 M * _are_ hi 1104437066 M * sannes :) morning 1104437089 A * sannes ponders.. he hasn't crashed yet after he turned of preempt .. 
*celebrates* 1104438049 M * _are_ no crashes here for 2d8h so far and this with 32 and 64 bit vservers running. seems it stabilizes slowly 1104438085 M * _are_ never had preemt in, here. it said 'experimental' and it said 'very experimental' with amd64, scared me away a bit ;) 1104438187 M * sannes hehe :) 1104438460 P * mikmu 1104438959 J * serving ~serving@213.186.187.133 1104439689 J * duckx ~Duck@dyn-83-157-200-131.ppp.tiscali.fr 1104440049 M * miekb _are_: Did you get Samba running ho wyou wanted it? 1104440070 A * miekb got it going last night - minimal fuss though browsing is still messed - though that may be due to my WinNT server still being online 1104440082 M * miekb but \\IP Addr\share works great 1104440139 M * _are_ miekb: nope, not yet, but have not tried to, either. Ignoring the problem for the moment 1104440151 M * miekb _are_: LOL Been there done that 1104440156 M * _are_ always use a wins server with windows clients 1104440160 N * miekb mikeb 1104440163 M * mikeb stupid typo 1104440180 M * _are_ well, the samba server runs 1104440208 M * mikeb Using interfaces =, bind interfaces only = yes, and socket address = Vserver IP sem to do the trick 1104440212 M * mikeb for me anyay 1104440228 M * _are_ and any applications that might require a working locking will only be used from January 10th. as that is far away, e.g. next year, i decided not to care as of now ;) 1104440234 M * mikeb but the browsing is a puzzle, but until I shutdown the WinNT and make the vserver master of all - I won't knwo if its config, vserver, or browser conflict 1104440288 M * _are_ umpf. it is december 30th, 22:00 and some user is using the samba server there. gnn. 
1104440310 M * _are_ and it is a company file server, not some university 1104440395 M * mikeb _are_: Heh been in company and academic environs and they both have cube rats that are there at the most bizarre hours :) 1104440434 M * mikeb But in academia I get to tell the rga dstudents who call my cell phone at 4AM that I'm gonna kill then slowly if their faculty PI has < X million in grant funding :0 1104440439 M * mikeb s/rga/grad/ 1104440466 M * _are_ well, i recognize the username and it is none of those cube rats 1104441723 J * tanjix tanjix@pD9FAC835.dip.t-dialin.net 1104441728 M * tanjix hi everyone 1104441748 M * tanjix when chkrootkit says: "Possible LKM Trojan installed" - is that because of a vserver? 1104442774 M * _are_ well, i have a very likely clean server, shall I test? 1104443463 M * tanjix if you like :) 1104443813 M * no_maam_ well, I had a lot of problems with that 1104443818 M * no_maam_ just ignore this message 1104446063 M * tanjix ok, thank you 1104446824 M * no_maam_ but of course only if you are sure that there is no rootkit 1104446869 M * no_maam_ it is very likely that ckrootkit is wrong but there is a little chance that it is not wrong 1104449084 M * Hollow there seems to be a bug in util-vserver 0.30.196: start-vservers [...] 
--stop does not work:
1104449097 M * Hollow stoned linux # /usr/lib/util-vserver/start-vservers -m default -j 1 --start --all
1104449097 M * Hollow stoned linux # vserver-stat
1104449097 M * Hollow CTX PROC VSZ RSS userTIME sysTIME UPTIME NAME
1104449097 M * Hollow 0 106 3.5G 76.3K 3h08m11 38m09s94 10d16h09 root server
1104449097 M * Hollow 11 2 3.1M 387 0m00s00 0m00s00 0m57s17 mentor
1104449099 M * Hollow 12 2 3.8M 436 0m00s00 0m00s00 0m41s25 ns1
1104449099 M * Hollow 13 2 3.7M 418 0m00s00 0m00s00 0m36s28 ns2
1104449101 M * Hollow 111 59 3.8G 93.5K 0m00s30 0m00s10 0m25s36 www1
1104449103 M * Hollow 112 59 3.8G 93.1K 0m00s30 0m00s10 0m19s80 www2
1104449105 M * Hollow 113 1 1.5M 146 0m00s00 0m00s00 0m09s72 www3
1104449107 M * Hollow 121 12 373.8M 14.5K 0m00s40 0m00s40 0m54s22 sql1
1104449111 M * Hollow 122 8 220.4M 8.3K 0m00s50 0m00s40 0m31s36 sql2
1104449113 M * Hollow 123 1 1.5M 146 0m00s00 0m00s00 0m28s15 sql3
1104449115 M * Hollow 131 29 51.8M 3.5K 0m00s50 0m00s20 0m45s42 mx1
1104449117 M * Hollow 190 1 1.5M 146 0m00s00 0m00s00 0m51s20 ftp1
1104449119 M * Hollow 191 1 1.5M 146 0m00s00 0m00s00 0m48s24 jabber1
1104449121 M * Hollow 65535 2 0 0 0m00s00 0m04s86 10d16h09
1104449124 M * Hollow stoned linux # /usr/lib/util-vserver/start-vservers -m default -j 1 --stop --all
1104449126 M * Hollow Makefile:6: *** target pattern contains no `%'. Stop.
1104449127 M * Hollow stoned linux #
1104450415 N * Bertl_oO Bertl_zZ
1104450498 Q * rs Quit: leaving