--- Log opened Sun Jun 14 00:00:43 2009 20090614 00:23:40-!- thespaceinvader [n=chatzill@wesnoth/artist/thespaceinvader] has quit ["night all"] 20090614 01:11:43-!- zookeeper [n=l@wesnoth/developer/zookeeper] has quit [] 20090614 01:32:35-!- grzywacz [n=grzywacz@wesnoth/developer/grzywacz] has quit [Remote closed the connection] 20090614 02:05:46-!- vjoe [n=vjoe@hh088c.halls.manchester.ac.uk] has quit [Client Quit] 20090614 03:06:39-!- happygrue [n=George@wesnoth/developer/wintermute] has quit [Read error: 113 (No route to host)] 20090614 07:51:17-!- Blueblaze [n=nick@c-98-199-143-139.hsd1.tx.comcast.net] has quit [Remote closed the connection] 20090614 07:52:48-!- Blueblaze [n=nick@c-98-199-143-139.hsd1.tx.comcast.net] has joined #wesnoth-mp 20090614 08:28:21-!- Turuk [n=Turuk@wesnoth/forumsith/turuk] has quit [Read error: 110 (Connection timed out)] 20090614 08:45:03-!- Blueblaze [n=nick@c-98-199-143-139.hsd1.tx.comcast.net] has quit [Remote closed the connection] 20090614 08:55:02-!- Sirp [n=me@wesnoth/developer/dave] has quit ["leaving"] 20090614 09:04:11-!- mordante [n=mordante@wesnoth/developer/mordante] has joined #wesnoth-mp 20090614 10:18:33-!- zookeeper [n=l@wesnoth/developer/zookeeper] has joined #wesnoth-mp 20090614 11:40:41-!- Noyga [n=lame-z@wesnoth/developer/noyga] has joined #wesnoth-mp 20090614 12:00:55-!- vjoe [n=vjoe@hh088c.halls.manchester.ac.uk] has joined #wesnoth-mp 20090614 12:12:30-!- thespaceinvader [n=chatzill@wesnoth/artist/thespaceinvader] has joined #wesnoth-mp 20090614 13:37:35-!- Pietro_S [n=sobotkap@tom.bn-ulm.de] has joined #wesnoth-mp 20090614 13:37:38-!- Pietro_S [n=sobotkap@tom.bn-ulm.de] has quit [Client Quit] 20090614 14:11:01-!- thespaceinvader [n=chatzill@wesnoth/artist/thespaceinvader] has quit ["ChatZilla 0.9.84 [Firefox 3.0.11/2009060215]"] --- Log opened Sun Jun 14 15:18:11 2009 20090614 15:18:20-!- lobby [n=wesnoth@wesnoth/bot/lobby] has joined #wesnoth-mp 20090614 15:18:20-!- Topic for #wesnoth-mp: 
http://coc.wesnoth.org | stats: http://wesnothd.wesnoth.org | alternate MP servers: basilic.tuxfamily.org and gonzo.dicp.de | proxy: server2.wesnoth.org:3724 | replay archive: http://www.wesnoth.org/forum/viewtopic.php?t=6471 | http://wesnoth.org/replays | #wesnoth-mp-lobby-1.6 | #wesnoth-mp-lobby-dev | http://wesnoth.org/irclogs 20090614 15:18:20-!- Topic set by Soliton [] [Fri May 22 21:36:09 2009] 20090614 15:18:20[Users #wesnoth-mp] 20090614 15:18:20[ cjhopman] [ lobby ] [ Noyga ] [ Soliton] [ vjoe ] 20090614 15:18:20[ grzywacz] [ mordante] [ shadowmaster] [ Turuk ] [ zookeeper] 20090614 15:18:20-!- Irssi: #wesnoth-mp: Total of 10 nicks [0 ops, 0 halfops, 0 voices, 10 normal] 20090614 15:18:32-!- Channel #wesnoth-mp created Tue Jan 27 06:32:14 2009 20090614 15:19:17-!- Irssi: Join to #wesnoth-mp was synced in 65 secs 20090614 15:45:36-!- Turuk [n=Turuk@wesnoth/forumsith/turuk] has quit [] 20090614 15:46:04-!- ^Noyga^ [n=lame-z@AVelizy-151-1-7-169.w82-120.abo.wanadoo.fr] has joined #wesnoth-mp 20090614 15:47:03-!- Noyga [n=lame-z@wesnoth/developer/noyga] has quit [Nick collision from services.] 20090614 15:47:13-!- ^Noyga^ is now known as Noyga 20090614 15:50:52-!- thespaceinvader [n=chatzill@wesnoth/artist/thespaceinvader] has joined #wesnoth-mp 20090614 16:31:18-!- Sirp [n=me@pool-173-74-23-130.dllstx.fios.verizon.net] has joined #wesnoth-mp 20090614 17:01:28-!- Sirp [n=me@pool-173-74-23-130.dllstx.fios.verizon.net] has quit [Read error: 104 (Connection reset by peer)] 20090614 17:12:36-!- Sirp [n=me@wesnoth/developer/dave] has joined #wesnoth-mp 20090614 18:19:25< Sirp> Soliton: is there an in-game interface for viewing/downloading the replays on the server? 20090614 18:19:47< Soliton> Sirp: nope. 20090614 18:21:47< Soliton> some people are or at least wanted to work on a nicer web interface but an in-game interface would be even better. 
20090614 18:22:14< Sirp> Soliton: yeah.....I mean with a web interface you have to download it then open Wesnoth, locate it in Wesnoth, etc etc 20090614 18:23:33< Soliton> you could maybe make your browser open wesnoth --load ... but yeah. :-) 20090614 18:28:07< Sirp> still a pain and not so accessible to most people. :) 20090614 18:28:42< Soliton> Sirp: btw, the server still aborts occasionally with ENOMEM and i have no idea why. (unlike i guessed before, replay saving does not seem involved.) 20090614 18:29:05< grzywacz> Sirp, but that would pretty much make sense with registered nicks only, no? 20090614 18:29:08< Soliton> occasionally means about once every week: http://www.wesnoth.org/cgi-bin/collection3/bin/index.cgi?hostname=wesnoth.wesnoth.org&plugin=memory&plugin=processes&plugin=tcpconns&timespan=2678400&action=show_selection&ok_button=OK 20090614 18:29:45< Soliton> you can tell from the memory consumption in the wesnothd-1.6/ps_rss graph. 20090614 18:30:04< Sirp> Soliton: okay. Do we think this behavior has been introduced at some point? 20090614 18:30:20< Sirp> grzywacz: why would it only make sense with registered nicks? 20090614 18:30:32< Soliton> Sirp: well, i thought it was introduced around when i implemented the replay save function. 20090614 18:31:01< grzywacz> Sirp, because you'd get aliasing if you uploaded some replays as unregistered nick and then someone else registered it 20090614 18:31:17< Soliton> but i put a try/catch around that and the bad alloc exception is not thrown there. 20090614 18:31:17< grzywacz> Or you'd get your replays deleted by someone else. 20090614 18:31:37< Soliton> grzywacz: we're talking about wesnoth.org/replays 20090614 18:31:40< Sirp> grzywacz: I have no idea how the automatic uploading works, but that's already all in place courtesy of Soliton. 
I'm just talking about a way to access and download replays 20090614 18:32:18< grzywacz> Sirp, ah, thank you 20090614 18:32:42< Sirp> Soliton: well, (1) bad alloc is usually not thrown when out of memory occurs; (2) the code where memory exhaustion occurs is typically the victim of a memory leak, not the cause. 20090614 18:32:48< grzywacz> Soliton, how long are they stored? 20090614 18:33:07< Soliton> grzywacz: till we run out of space and i have to come up with something. 20090614 18:33:23< grzywacz> ok :) 20090614 18:33:41< Soliton> it's years away from what i projected at some point. 20090614 18:33:50< Sirp> Soliton: if we really want the replays we might just see how much it would cost for olm to install a huge hdd for us 20090614 18:34:01< Sirp> Soliton: ahh that's cool then. :) 20090614 18:34:14< Sirp> Soliton: next time we redo our server with olm we'll just make sure we have enough space. 20090614 18:34:35< Soliton> and there is a lot of room to delete obviously uninteresting replays. like less than 5 turns, etc. 20090614 18:34:47< Sirp> Soliton: the way that malloc() in C and new in C++ were designed to handle out of memory is out of sync with how modern OS's actually do it. 20090614 18:35:02< Soliton> Sirp: i see. 20090614 18:35:24< Soliton> Sirp: so what can we do to narrow the cause down? 20090614 18:35:37< Sirp> Soliton: the model that C/C++ were designed with is, "ask the OS for memory" -> OS says no memory left, sorry -> malloc() returns NULL or new throws bad_alloc 20090614 18:36:06< Soliton> right, that's how i thought it'd work. 20090614 18:36:22< grzywacz> That design ^ breaks down when overcommit is turned on on Linux.... 
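The model Sirp outlines at 18:35:37 (malloc() returns NULL, new throws bad_alloc) can be sketched as below. This is only an illustration: `malloc_succeeds` and `new_succeeds` are hypothetical helper names, not wesnothd code, and as grzywacz notes, with overcommit enabled on Linux a "successful" result for an ordinary size says little about whether physical memory is really available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical helper (not from wesnothd): true if malloc() satisfied the request.
bool malloc_succeeds(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p == nullptr) {
        return false;  // classic contract: failure is reported as NULL
    }
    // Touch one byte through a volatile pointer so the allocation is really performed.
    *static_cast<volatile char*>(p) = 0;
    std::free(p);
    return true;
}

// Hypothetical helper (not from wesnothd): true if operator new[] satisfied the request.
bool new_succeeds(std::size_t bytes) {
    try {
        char* p = new char[bytes];
        *static_cast<volatile char*>(p) = 0;
        delete[] p;
        return true;
    } catch (const std::bad_alloc&) {
        return false;  // classic contract: failure is reported as an exception
    }
}
```

A small request such as `malloc_succeeds(4096)` should report success, while an impossible one such as `malloc_succeeds(static_cast<std::size_t>(-1))` should report failure at allocation time, because no address space of that size exists to hand out.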
20090614 18:36:50< Sirp> Soliton: the way it actually works is "ask the OS for memory" -> OS says "oh sure here's some address space" -> you start to actually use the addresses -> OS says "oh they're actually *using* the addresses I gave them, now I have to find some physical memory to map the addresses to" -> OS finds there is no physical memory left -> OS decides it better kill the process using the most memory 20090614 18:37:37< Sirp> i.e. it'll never fail when you allocate because when you allocate you are just given addresses; only when you start to use the addresses does the OS have to map to physical memory, and that is the stage where failure will occur 20090614 18:37:45< Soliton> ok, but wesnothd really dies with a bad alloc exception. it's not getting killed by the oom killer. 20090614 18:38:23< Sirp> oh, really? 20090614 18:38:46< Soliton> 20090608 20:26:49 error server: Ran out of memory. Aborting. 20090614 18:39:06< Soliton> } catch(std::bad_alloc&) { 20090614 18:39:06< Soliton> ERR_SERVER << "Ran out of memory. Aborting.\n"; 20090614 18:39:06< Soliton> return ENOMEM; 20090614 18:39:08< Sirp> Soliton: okay. remove the catch(bad_alloc) and let the exception propagate out of main 20090614 18:39:09< grzywacz> Soliton, ulimit? 20090614 18:39:11< Sirp> turn cores on 20090614 18:39:29< Sirp> it'll dump core. We can see what's happening. 20090614 18:39:36-!- Turuk [n=Turuk@wesnoth/forumsith/turuk] has joined #wesnoth-mp 20090614 18:39:46< Sirp> Soliton: if it's actually throwing bad_alloc that might well mean that someone is doing something crazy, like malloc(-1) 20090614 18:40:19< Soliton> alright. 20090614 18:41:45-!- shadowmaster [n=ignacio@wesnoth/developer/shadowmaster] has quit [Remote closed the connection] 20090614 18:46:48< Soliton> Sirp: curious is also that it takes several hours for the server to die when it dies from that exception. 20090614 18:47:18< Sirp> Soliton: so the server is down during all that time? 
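Sirp's suggestion (drop the catch(bad_alloc), let the exception leave main, turn cores on) might look roughly like the sketch below. It assumes a POSIX system; `enable_core_dumps`, `run_server` and `server_main` are hypothetical names used for illustration and are not taken from the wesnothd source. In a shell, the usual way to turn cores on is `ulimit -c unlimited` before starting the server.

```cpp
#include <cassert>
#include <sys/resource.h>  // POSIX: getrlimit, setrlimit

// Hypothetical helper: raise the soft core-file size limit to the hard limit,
// which is roughly what `ulimit -c` adjusts from a shell.
bool enable_core_dumps() {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    lim.rlim_cur = lim.rlim_max;  // raising soft up to hard needs no privileges
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}

// Hypothetical stand-in for the real server loop.
int run_server() {
    return 0;
}

// Sketch of an entry point with no catch (std::bad_alloc&) handler.
// If bad_alloc escapes, std::terminate() runs and by default calls abort(),
// which raises SIGABRT and can leave a core file for a debugger.
int server_main() {
    enable_core_dumps();
    return run_server();
}
```

With no handler in the way, the core file should record where the exception was thrown, which is the information the catch block in the log above discards by returning ENOMEM.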
20090614 18:47:38< Sirp> Soliton: it's not even dumping core, right? Just exiting? 20090614 18:47:47< Sirp> I guess when it gets to that stage it might be massively swapped out..... 20090614 18:47:49< Soliton> Sirp: yes, though i'm usually there and restart it then. 20090614 18:48:01< Soliton> Sirp: yes, just exiting. 20090614 18:49:08< Soliton> Sirp: well, you can look at the graphs: the server is not really out of memory. actually it might just be trying to disconnect everyone.. one after another, waiting for timeouts or something. 20090614 18:49:33 * Sirp nods 20090614 18:49:35< Sirp> yeah that sounds likely 20090614 18:50:04< Sirp> Soliton: so yeah have it dump core; my guess is someone is trying a crazy large alloc, like passing -1 as the size of an STL container 20090614 18:50:17< Soliton> yeah, it sounds like it. 20090614 18:50:26< Sirp> that is a common way to make things go nuts, as the -1 will be converted to an unsigned number and is huge 20090614 18:50:43< Sirp> good thing we had the conversation about bad_alloc and oom. :) 20090614 18:51:05< Soliton> yes, didn't know that. 
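The failure mode Sirp guesses at (a -1 used as a container size) can be reproduced with a minimal sketch like the one below. `as_size` and `try_make_buffer` are hypothetical names for illustration, not wesnothd functions. Depending on the standard library, the oversized request may surface as std::length_error rather than std::bad_alloc, so the sketch catches both.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <new>
#include <stdexcept>
#include <string>
#include <vector>

// What a signed value turns into when it is used as a size:
// -1 wraps around to the largest value std::size_t can hold.
std::size_t as_size(int requested) {
    return static_cast<std::size_t>(requested);
}

// Hypothetical helper: try to build a buffer of the requested size
// and report which outcome occurred.
std::string try_make_buffer(int requested) {
    try {
        std::vector<char> buffer(as_size(requested));
        return "ok";
    } catch (const std::length_error&) {
        return "length_error";  // some standard libraries reject the size up front
    } catch (const std::bad_alloc&) {
        return "bad_alloc";     // others fail inside the allocator
    }
}
```

Here `try_make_buffer(16)` succeeds, while `try_make_buffer(-1)` fails immediately no matter how much memory is free, which would fit Soliton's observation that the graphs do not show the server running out of memory.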
20090614 19:03:20-!- mordante [n=mordante@wesnoth/developer/mordante] has quit ["Leaving"] 20090614 19:13:54-!- Sirp [n=me@wesnoth/developer/dave] has quit ["leaving"] 20090614 19:27:14-!- Turuk [n=Turuk@wesnoth/forumsith/turuk] has quit [] 20090614 19:49:40-!- cjhopman [n=chris@wesnoth/developer/cjhopman] has quit [Remote closed the connection] --- Log closed Sun Jun 14 19:53:45 2009 --- Log opened Sun Jun 14 19:53:45 2009 20090614 19:56:54-!- Blueblaze [n=nick@c-98-199-143-139.hsd1.tx.comcast.net] has joined #wesnoth-mp --- Log closed Sun Jun 14 20:00:22 2009 --- Log opened Sun Jun 14 20:00:22 2009 --- Log closed Sun Jun 14 20:03:15 2009 --- Log opened Sun Jun 14 20:03:15 2009 --- Log closed Sun Jun 14 20:04:58 2009 --- Log opened Sun Jun 14 20:04:58 2009 --- Log closed Sun Jun 14 20:06:43 2009 --- Log opened Sun Jun 14 20:06:43 2009 20090614 20:12:21-!- noy [n=Noy@wesnoth/developer/noy] has joined #wesnoth-mp 20090614 20:14:47-!- Sirp [n=me@wesnoth/developer/dave] has joined #wesnoth-mp --- Log closed Sun Jun 14 20:21:57 2009 --- Log opened Sun Jun 14 20:21:57 2009 --- Log closed Sun Jun 14 20:25:49 2009 --- Log opened Sun Jun 14 20:25:49 2009 --- Log closed Sun Jun 14 20:26:24 2009 --- Log opened Sun Jun 14 20:26:24 2009 --- Log closed Sun Jun 14 20:31:57 2009 --- Log opened Sun Jun 14 20:31:57 2009 --- Log closed Sun Jun 14 20:33:00 2009 --- Log opened Sun Jun 14 20:33:00 2009 20090614 21:10:59-!- happygrue [n=George@wesnoth/developer/wintermute] has joined #wesnoth-mp 20090614 21:55:21-!- Turuk [n=Turuk@wesnoth/forumsith/turuk] has joined #wesnoth-mp 20090614 21:56:52-!- Espreon [n=espreon@wesnoth/developer/espreon] has joined #wesnoth-mp 20090614 22:03:14-!- Turuk [n=Turuk@wesnoth/forumsith/turuk] has quit [] 20090614 22:16:49-!- cjhopman [n=chris@wesnoth/developer/cjhopman] has joined #wesnoth-mp 20090614 23:06:35-!- Noyga [n=lame-z@wesnoth/developer/noyga] has quit [Read error: 110 (Connection timed out)] 20090614 23:07:57-!- Noyga 
[n=lame-z@wesnoth/developer/noyga] has joined #wesnoth-mp 20090614 23:45:04-!- shadowmaster [n=ignacio@wesnoth/developer/shadowmaster] has joined #wesnoth-mp --- Log closed Mon Jun 15 00:00:12 2009