Jump to content
Eternal Lands Official Forums
bluap

Client crash on map change

Recommended Posts

I have a repeatable crash when changing maps from Southern Redmoon Mines [171,110] (top of the stairs) back into the main cave. I have to be facing about due west and zoomed fully out. I'm using latest CVS RC build options, in Linux debian etch. My graphics is a GeForce4 Ti 4200. However, the crash also happens on the same machine using Ubuntu 7.10 and even if I run the windows RC145 under wine. It does not happen on my daughters laptop which uses Intel graphics. So I suspect by graphics card is to blame but bare with me. I'm also crash free if I build without CLUSTER_INSIDES defined.

 

Neither gdb nor valgrind reveal anything useful so I set about sprinkling the code with printf() and finally tracked it down a glDrawArrays() call at line 763 of reflection.c in the draw_water_quad_tiles() function. I added some extra debug:

Index: reflection.c
===================================================================
RCS file: /cvsroot/elc/elc/reflection.c,v
retrieving revision 1.123
diff -a -u -r1.123 reflection.c
--- reflection.c		2 Oct 2007 07:29:23 -0000	   1.123
+++ reflection.c		3 Nov 2007 22:01:47 -0000
@@ -760,7 +760,11 @@
			}
			size++;
	}
+printf("draw_water_quad_tiles final glDrawArrays start=%d stop=%d idx=%d size=%d id=%d\n", start, stop, idx, size, water_id );
+fflush(stdout);
	glDrawArrays(GL_QUADS, idx * 4, size * 4);
+printf("glDrawArrays done\n");
+fflush(stdout);
}

void draw_lake_tiles()

This shows that things are fine until size gets big. Here's the last few lines of debug output. Size is mostly zero, often 1 or 5 but can't cope with 19 or more it would appear.

draw_water_quad_tiles final glDrawArrays start=0 stop=0 idx=0 size=0 id=90
glDrawArrays done
draw_water_quad_tiles final glDrawArrays start=489 stop=508 idx=0 size=0 id=90
glDrawArrays done
draw_water_quad_tiles final glDrawArrays start=0 stop=0 idx=0 size=0 id=90
glDrawArrays done
draw_water_quad_tiles final glDrawArrays start=489 stop=508 idx=0 size=0 id=90
glDrawArrays done
draw_water_quad_tiles final glDrawArrays start=0 stop=0 idx=0 size=0 id=90
glDrawArrays done
draw_water_quad_tiles final glDrawArrays start=489 stop=508 idx=0 size=19 id=90
Segmentation fault

I'm not familiar enough with openGL calls to know how to proceed but will happily add more debug if it helps.

Share this post


Link to post
Share on other sites

This shows that things are fine until size gets big. Here's the last few lines of debug output. Size is mostly zero, often 1 or 5 but can't cope with 19 or more it would appear.

More testing. I've had a crash with size=1. I also note that size can be big and things are fine while walking around the top cave. Size may not be important...

Share this post


Link to post
Share on other sites

Can you try to get the values of water_buffer_reflectiv_index, water_buffer_usage, and idx when it crashes? We should have 0 <= idx < water_buffer_reflectiv_index for the first call to draw_water_quad_tiles() in draw_lake_tiles() (line 841) and water_buffer_reflectiv_index <= idx < water_buffer_usage for the second call (line 934).

Share this post


Link to post
Share on other sites

Can you try to get the values...

Thanks Grum. It's always the second call to draw_water_quad_tiles() that causes the crash and always the three values are all zero. I guess that breaks your conditions. Interestingly, all three values of zero do not always cause a crash and can occur for both calls. The crash only happens on the map change. I added a check for your conditions and could not cause the crash any more. I think you're on to something :medieval:

Share this post


Link to post
Share on other sites

Do you happen to use Mesa? ...

Yes I have it installed (Debian Etch) and a preview of apt-get remove shows its rather central to many applications. Its version 6.5.1-0.6:

dpkg --list *mesa* | grep ^ii
ii  libgl1-mesa-dev	 6.5.1-0.6 ...
ii  libgl1-mesa-dri	 6.5.1-0.6 ...
ii  libgl1-mesa-glx	 6.5.1-0.6 ...
ii  libglu1-mesa		6.5.1-0.6 ...
ii  libglu1-mesa-dev	6.5.1-0.6 ...
ii  mesa-common-dev	 6.5.1-0.6 ...
ii  mesa-utils		  6.3.2-2.1 ...

 

Does the crash still hapen if you protect the calls to glDrawArrays() with an

if (size > 0)

Yes, unfortunately it does, good find though. In fact idx is 0, size is 15 when the crash happens. The only way I've found to prevent the crash is to test your "water_buffer_reflectiv_index <= idx < water_buffer_usage" condition and only call draw_water_quad_tiles() if that test is good.

 

Index: reflection.c
===================================================================
RCS file: /cvsroot/elc/elc/reflection.c,v
retrieving revision 1.123
diff -a -u -r1.123 reflection.c
--- reflection.c		2 Oct 2007 07:29:23 -0000	   1.123
+++ reflection.c		13 Nov 2007 20:37:53 -0000
@@ -931,7 +931,8 @@
	}

	get_intersect_start_stop(main_bbox_tree, TYPE_REFLECTIV_WATER, &start, &stop);
-	   draw_water_quad_tiles(start, stop, water_buffer_reflectiv_index, water_id);
+	   if (water_buffer_reflectiv_index < water_buffer_usage)
+			   draw_water_quad_tiles(start, stop, water_buffer_reflectiv_index, water_id);

#ifdef USE_SHADER
	if (water_shader_quality > 0)

Share this post


Link to post
Share on other sites

I seem to be unable to reproduce the crash. I have an idea, which is probably crazy, but may cause the crash: your current actor is being set between the call to build_water_buffer() and draw_water_quad_tiles(), or his position is updated in that period. In that case, the actor cluster in build_water_buffer() (where the arrays to be drawn are computed) will be different from those that are actually drawn in draw_water_quad_tiles().

 

Can you print the actor cluster (variable cluster) in those two functions, and see if they're different when it crashes?

Share this post


Link to post
Share on other sites

I seem to be unable to reproduce the crash. I have an idea, which is probably crazy, but may cause the crash: your current actor is being set between the call to build_water_buffer() and draw_water_quad_tiles(), or his position is updated in that period. In that case, the actor cluster in build_water_buffer() (where the arrays to be drawn are computed) will be different from those that are actually drawn in draw_water_quad_tiles().

 

Can you print the actor cluster (variable cluster) in those two functions, and see if they're different when it crashes?

Standing at the top of the mine stairs I get a value of 12, as I change maps, the value goes to 0, then to 11 just before the crash:

....
build_water_buffer() cluster=12
build_water_buffer() cluster=12
build_water_buffer() cluster=12
draw_water_quad_tiles() cluster=12
draw_water_quad_tiles() cluster=12
build_water_buffer() cluster=0
build_water_buffer() cluster=0
build_water_buffer() cluster=0
draw_water_quad_tiles() cluster=0
draw_water_quad_tiles() cluster=0
build_water_buffer() cluster=0
build_water_buffer() cluster=0
build_water_buffer() cluster=0
draw_water_quad_tiles() cluster=0
draw_water_quad_tiles() cluster=0
build_water_buffer() cluster=0
build_water_buffer() cluster=0
build_water_buffer() cluster=0
draw_water_quad_tiles() cluster=0
draw_water_quad_tiles() cluster=0
build_water_buffer() cluster=0
build_water_buffer() cluster=0
build_water_buffer() cluster=0
draw_water_quad_tiles() cluster=0
draw_water_quad_tiles() cluster=0
build_water_buffer() cluster=0
build_water_buffer() cluster=0
build_water_buffer() cluster=0
draw_water_quad_tiles() cluster=0
draw_water_quad_tiles() cluster=0
build_water_buffer() cluster=11
build_water_buffer() cluster=11
build_water_buffer() cluster=11
draw_water_quad_tiles() cluster=11
draw_water_quad_tiles() cluster=11
idx=0, size+28, water_buffer_usage=0
Segmentation fault

Note this is using your modified version from CVS with the exit(1) removed. Not sure if you meant to commit those changes with the server.lst stuff... Anyhow, if you change that check so that prevents the call to glDrawArrays() the code does not crash.

Share this post


Link to post
Share on other sites

Oh crap.... No, that wasn't supposed to be committed. Anyway, my idea was wrong :D I just can't figure out why water_buffer_usage (or size) is wrong...

 

*Goes off to fix a broken commit*

Share this post


Link to post
Share on other sites

Oh crap.... No, that wasn't supposed to be committed. Anyway, my idea was wrong :( I just can't figure out why water_buffer_usage (or size) is wrong...

Ah well, but thanks for looking into this. It's tempting to just do the check and avoid the crash. May be this is an option for the release version if it never gets triggered without causing a crash. We could add a message to the error log. I can't be the only person this might be an issue for....

 

*Goes off to fix a broken commit*

I was just typing a thing about interface.c but I note you've already seen to it. At work, we refer to these incidents as swabs left in the patient :D

Share this post


Link to post
Share on other sites

Grmbl... I have the feeling I'm missing something really obvious.

 

In the first part of build_water_buffer(), the bbox tree is checked to see if the tile buffer needs to be updated. Can you check if the client gets past this check (i.e. skip the return statement on line ~133) and the buffer is really rebuilt with the correct visibility cluster before it's drawn?

Share this post


Link to post
Share on other sites

Grmbl... I have the feeling I'm missing something really obvious.

 

In the first part of build_water_buffer(), the bbox tree is checked to see if the tile buffer needs to be updated. Can you check if the client gets past this check (i.e. skip the return statement on line ~133) and the buffer is really rebuilt with the correct visibility cluster before it's drawn?

 

Yes it does, sometimes, but it appears unrelated to the crash. Your opinion may be different so I've added lots of debug and written the output to the log file (I'll send you the URL via PM shortly). The cvs diff reflection.c output is append to the log too.

 

I started in the top mine, moved to the lower mine, moved back without crashing (by keeping my zoom low), back down to the bottom mine, then up again allowing a crash. Location information is also written to the log. The crash happens when the cluster tests in build_water_buffer() think that there is nothing to do, but the same tests in draw_water_quad_tiles() think there is. This should be obvious in the debug.

 

Another thing I noticed that I had not before. To avoid the crash I change perspective, direction and/or zoom level. The fine line between crashing and not, exactly corresponds to the window background being black (no crash) or blue (crash). Weird, I'm off to hunt for where that's done....

Share this post


Link to post
Share on other sites
the window background being black (no crash) or blue (crash). Weird, I'm off to hunt for where that's done....
Great! Grum and I haven't been able to find it. There are bugs in it which Roja has wanted fixed (it changes) so if you can find them she'll be very happy. :-) Edited by Torg

Share this post


Link to post
Share on other sites
the window background being black (no crash) or blue (crash). Weird, I'm off to hunt for where that's done....
Great! Grum and I haven't been able to find it. There are bugs in it which Roja has wanted fixed (it changes) so if you can find them she'll be very happy. :-)

Don't get your hopes too high ;) You probably already know this but the blue is really some texture being rendered not just a colour fill. I've just loaded the map editor and it shows the same issues. Some camera angles have a black background (low CPU) then suddenly the whole background is filed with blue (high CPU). This is for the map I'm having trouble with cont2map16_insides. Its the same blue as the water in the cave and appears/disappears at the same time but its not the same textured look. Its as if the water had spilled out from the lake....

Share this post


Link to post
Share on other sites
Don't get your hopes too high :P You probably already know this but the blue is really some texture being rendered not just a colour fill.
Oh! I didn't know that. I was looking in the completely wrong place then. I still have my hopes high. ;-)

 

I've just loaded the map editor and it shows the same issues. Some camera angles have a black background (low CPU) then suddenly the whole background is filed with blue (high CPU). This is for the map I'm having trouble with cont2map16_insides. Its the same blue as the water in the cave and appears/disappears at the same time but its not the same textured look. Its as if the water had spilled out from the lake....
Well the blue is correct from what I understand. It is meant to be the sky meeting with the water edge. There is some grey in there too, which is wrong iirc.

 

It would be best to ask Roja about the exact details.

Share this post


Link to post
Share on other sites

Well, since this is the only remaining bug that holds the update, maybe we should just not use CLUSTER_INSIDES for the official client, and just release an optional patch with that included when it becoems stable.

Share this post


Link to post
Share on other sites

Well, since this is the only remaining bug that holds the update, maybe we should just not use CLUSTER_INSIDES for the official client, and just release an optional patch with that included when it becoems stable.

OK, a few facts/beliefs to help decide.

 

1) I only crash if CLUSTER_INSIDES is enabled. It's in the map transition between SM lower cave and upper. I have not experience the same crash elsewhere. Though I guess it's possible.

 

2) The blue/black background switching issue occurs whether or not CLUSTER_INSIDES is used. It also occurs in the map editor. It does not cause a crash but is a but annoying. When its blue, CPU usage rises dramatically for me at least.

 

3) The link between the blue/black issues is that if I have a black the background, I can go back and forth without crashing. If I have the blue background, I always almost always crash. The exception is if I click on the sign from lower down the stairs (I've only just realised this exception).

 

4) A simple check of state before doing a particular glDrawArrays() will prevent the crash. That, along with the fact excluding CLUSTER_INSIDES does not fix the blue/black issue would suggest to me we might like to go with this trap instead or excluding CLUSTER_INSIDES.

 

Index: reflection.c
===================================================================
RCS file: /cvsroot/elc/elc/reflection.c,v
retrieving revision 1.125
diff -a -u -r1.125 reflection.c
--- reflection.c		15 Nov 2007 21:35:32 -0000	  1.125
+++ reflection.c		17 Nov 2007 12:35:19 -0000
@@ -760,7 +760,10 @@
			}
			size++;
	}
-	   glDrawArrays(GL_QUADS, idx * 4, size * 4);
+	   if (size && idx >= water_buffer_usage)
+			   printf("%s(): crash averted idx=%d, size+%d, water_buffer_usage=%d\n", __FUNCTION__, idx, size, water_buffer_usage);
+	   else
+			   glDrawArrays(GL_QUADS, idx * 4, size * 4);
}

void draw_lake_tiles()

Share this post


Link to post
Share on other sites

I don't really care about blue and black issues, all I care about is not to crash.

So if that check prevents the crash, please just commit it and let's go on with the update.

Share this post


Link to post
Share on other sites

I don't really care about blue and black issues, all I care about is not to crash.

So if that check prevents the crash, please just commit it and let's go on with the update.

OK, done. An message is written to the error log if the trap is used.

Share this post


Link to post
Share on other sites

Very instructive (and extensive!) log, thanks.

build_water_buffer() if is true, building

build_water_buffer() 3 cluster test if start=486 stop=512 l=102413 x=13 y=25 tile_cluster=11 cluster=12

...

build_water_buffer() water_buffer_usage=0

build_water_buffer() if is false, returning

build_water_buffer() if is false, returning

...

draw, don't rebuild (times many)

...

draw_water_quad_tiles() 1 cluster test if start=486 stop=512 l=98326 x=22 y=24 tile_cluster=11 cluster=0

draw_water_quad_tiles() OK idx=0, size=0, water_buffer_usage=0 cluster=0

build_water_buffer() if is false, returning

build_water_buffer() if is false, returning

build_water_buffer() if is false, returning

draw_water_quad_tiles() OK idx=0, size=0, water_buffer_usage=0 cluster=11

draw_water_quad_tiles() 1 cluster test else start=486 stop=512 l=102413 x=13 y=25 tile_cluster=11 cluster=11

So the last time the water buffer is built, the actor is still on cluster 12. Drawing is ok while the actor is in limbo (cluster==0), because the test tile_cluster==cluster always fails. Finally, when the actor reappears, he's on cluster 11, but the buffer isn't rebuilt. The code tries to draw the tiles on cluster 11, but fails because it doesn't have the necessary tiles.

 

So it looks like the ide_changed flag on the bbox tree isn't set when the actor teleports (at least not for types TYPE_REFLECTIV_WATER and TYPE_NO_REFLECTIV_WATER). I'll try to track it down, but I have to leave now.

Share this post


Link to post
Share on other sites

Does this help?

Index: new_actors.c
===================================================================
RCS file: /cvsroot/elc/elc/new_actors.c,v
retrieving revision 1.124
diff -u -r1.124 new_actors.c
--- new_actors.c		19 Nov 2007 19:31:58 -0000	  1.124
+++ new_actors.c		25 Nov 2007 13:54:58 -0000
@@ -133,7 +133,14 @@
	}

	if (actor_id == yourself)
+	   {
+			   // We have just returned from limbo (after teleport,
+			   // map change, disconnect, etc.) Since our position may
+			   // have changed, tell the bbox tree to update so
+			   // that our environment is recomputed.
+			   set_all_intersect_update_needed (main_bbox_tree);
			set_our_actor (our_actor);
+	   }

	actors_list[i]=our_actor;

Share this post


Link to post
Share on other sites

Does this help?

You're a genius. Yes it does. With this change, the trap is no longer triggered and when the trap is removed the client does not crash! Nice one :D

Share this post


Link to post
Share on other sites

This has been working fine since I applied it to my client so I've just committed Grum's fix and removed the previous trap code.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

×