Eternal Lands Official Forums
bluap

Issues with Eye Candy - For example overly intense or window darkness after cast


Out of curiosity, I got myself a second-hand Pi3b and, after a failed attempt with Arch Linux[*], installed Raspberry Pi OS[**] and compiled the client. After twiddling my thumbs for a good while and playing Township on my phone for a little while more, I was able to run the client.

 

Well, it's certainly not the most performant system I have, but it works better than I expected. So good, in fact, that I was not able to trigger the lighting problem. I tried both the "Fake" and the "Full" KMS drivers[***], and the most I was able to achieve was some lagging when casting too many TPTRs in quick succession. (caveat: I was operating the GUI on my laptop through VNC, but I don't think that should affect the OpenGL rendering). Now, I realize this system is not the same as @bluap's Pi400 (fun system, that), but I had rather hoped to be able to reproduce it at least. So that's another plan down the drain.

 

As I'm not going to spend more money on this (esp. with current hardware prices), I will unfortunately have to throw in the towel on this problem.

 

One last question:

On 10/24/2021 at 7:43 PM, bluap said:

Cautiously optimistic that the problem is due to invalid particle positions, I'm just looking for the underlying cause or if I can just trap the values.

I wonder about that. I can see how colour values outside [0,1] could cause driver/hardware-dependent issues, but what constitutes an invalid position? Aside from NaN values, all coordinates are valid, right?

 

EDIT: the CPU on my Pi is being throttled when I run EL, because I'm not using a good enough power supply (an old phone charger ATM). I doubt it makes a difference, but I will try again when I liberate a better power supply (i.e. once the household here is done watching Netflix).

 

[*]: To be fair, alarm (Arch Linux for ARM) installed without a hitch; I just wasn't able to get 3D acceleration working, nor a proper screen resolution without a monitor attached.

[**]: that was not as smooth as it sounds, but at least I got it working in the end

[***]: Full KMS also only wants to run at 720x480 resolution for some reason

Edited by Grum


Here is an update on my efforts.


Since an Arch Linux OS update a couple of weeks ago, I can no longer run EL on my main PC, where I was seeing the issue. I can launch the client and play for a bit, but the allocated kernel dynamic memory quickly eats all available RAM and swap space. I only see this happening when running EL.


In light of that I've been trying to recreate the black screen issue on a couple other PCs using the test server in the cave where Burn spawned the grizzly bears.  I previously mentioned a newer laptop and I still cannot recreate the issue there.

 

I also tried on an old IBM Thinkpad T60 laptop, 32 bit system.
Video card: ATI RV515
OpenGL Version: 2.1 Mesa 21.2.1

This is a weak system so I tried smashing the mana drain and TPTR effects, getting the framerate to drop to 4fps (with the camera zoomed out) or 1fps (with the camera zoomed in). I also played with adjusting the min effects frame rate so that it's both below and above my actual frame rate.  I tried compiling the client with and without Grum's latest commit. In any case I could not reproduce the black screen.

 

 

Edited by Nogrod


Sorry I've been quiet on this for a few days, just too much RL stuff this week.  I'm using Ubuntu 21.10 on the Raspberry Pi 400 by the way.

I'm still thinking it's the particle positions that are going astray. They are not NaN but very large: e+30s for one or more of the coords is typical when the blackout happens.  It is totally predictable that a blackout will happen if these large values are seen, and the blackouts are reliably prevented if I "continue" the "// Draw Lights." loop in eye_candy.cpp when I see such values.  A quick hack that logs and excludes any coords with fabs() > 1e12 stops the blackouts.  As a minimum workaround I will look at the threshold some more, and we could use that to prevent the blackouts.  My preferred option is to find the underlying reason for the large values and fix that.  I will be able to spend time this weekend on this but would be more than happy if someone else found the underlying fault before me :)  Here is my hacky patch with debug.

diff --git a/eye_candy/eye_candy.cpp b/eye_candy/eye_candy.cpp
index b1781240..01128d54 100644
--- a/eye_candy/eye_candy.cpp
+++ b/eye_candy/eye_candy.cpp
@@ -2023,6 +2023,11 @@ namespace ec
 					std::min(p->color[2] * brightness, 1.0f),
 					0.0
 				};
+				if (!(p->pos.is_valid()) || (fabsf(p->pos.x) > 1e12) || (fabsf(p->pos.y) > 1e12) || (fabsf(p->pos.z) > 1e12))
+				{
+					std::cout << "Invalid Pos " << p->pos.x << " " << p->pos.y << " " << p->pos.z << std::endl;
+					continue;
+				}
 				glEnable(light_id);
 				glLightfv(light_id, GL_POSITION, light_pos);
 				glLightfv(light_id, GL_DIFFUSE, light_color);
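
For illustration only, the same test can be written as a small standalone helper. This is a minimal sketch, not the client's code: `Vec3`, `kMaxCoord` and `is_plausible_position` are hypothetical names, and the 1e12 limit is simply the value from the quick hack above, which may still change.

```cpp
#include <cmath>

// Hypothetical stand-in for the client's 3D vector type (assumption).
struct Vec3 {
    float x, y, z;
};

// Limit borrowed from the quick hack above; the right value is still open.
constexpr float kMaxCoord = 1e12f;

// Returns true when every coordinate is finite (not NaN, not +/-inf)
// and its magnitude does not exceed `limit`.
inline bool is_plausible_position(const Vec3& p, float limit = kMaxCoord)
{
    const float coords[3] = {p.x, p.y, p.z};
    for (float c : coords) {
        if (!std::isfinite(c) || std::fabs(c) > limit)
            return false;
    }
    return true;
}
```

A light loop could then skip any particle for which `is_plausible_position` returns false, which is what the `continue` in the patch does.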

 

8 hours ago, bluap said:

I'm still thinking it's the particle positions that are going astray, they are not NaN but very large e+30s

 

Heh, interesting.

 

I thought the difference would be that the OpenGL drivers on my laptop would handle such large values better[*], but instead in a few hundred TPTRs I did not trigger your test. I'll try and see if I can figure out how such large values are created.

 

[*] They do, though: when I set pos->x explicitly to 1e30, I simply don't see the light, and there is no blackening of the screen.

20 minutes ago, Grum said:

but instead in a few hundred TPTRs I did not trigger your test.

 

Well, should've tried harder. Turns out I can trigger the invalid coordinates. I will probably take another look tonight, RL priorities now.

 


Question for you all: when you are talking about getting a "black screen", do you mean the entire monitor goes black or just the EL window?  I ask because I've had a handful of times my entire monitor went black and chalked it up to monitor troubles but now I wonder...since it doesn't happen often but always when actively doing something in EL.  (I run dual monitors and it only happened on the one I was actively doing something in EL on so I just figured my monitor was going bad.  It hasn't happened in ages though.)


Found the cause of the invalid particle positions. In TargetMagicParticle::idle(), an effect-dependent shift is added to the particle's position. A pointer to the effect is stored in the particle itself. This pointer is upcast to a pointer to a TargetMagicEffect2 and the shift is extracted from the cast pointer. Thing is, the stored effect is not an instance of TargetMagicEffect2, but of TargetMagicEffect, so the shift returned is whatever just happens to be in that particular memory location in the TargetMagicEffect object. Here's a patch to show the issue:

diff --git a/eye_candy/effect_targetmagic.cpp b/eye_candy/effect_targetmagic.cpp
index 853f54f6..26c118cd 100644
--- a/eye_candy/effect_targetmagic.cpp
+++ b/eye_candy/effect_targetmagic.cpp
@@ -260,7 +260,22 @@ namespace ec
                                        break;
                                }
                        }
+
+                       bool was_invalid = std::abs(pos.x) > 1e12
+                               || std::abs(pos.y) > 1e12
+                               || std::abs(pos.z) > 1e12;
+                       TargetMagicEffect *cast_effect = dynamic_cast<TargetMagicEffect*>(effect);
+                       TargetMagicEffect2 *cast_effect2 = dynamic_cast<TargetMagicEffect2*>(effect);
                        pos += ((TargetMagicEffect2*)effect)->shift;
+                       if (!was_invalid
+                               && (std::abs(pos.x) > 1e12 || std::abs(pos.y) > 1e12 || std::abs(pos.z) > 1e12))
+                       {
+                               std::cerr << "Invalid after adding effect->shift = "
+                                       << ((TargetMagicEffect2*)effect)->shift
+                                       << "\neffect = " << effect
+                                       << ", cast effect = " << cast_effect
+                                       << ", cast effect2 = " << cast_effect2 << "\n";
+                       }
                }
 
                //  std::cout << "B) " << this << ": " << velocity << ", " << pos << std::endl;

which prints something like

Invalid after adding effect->shift = <-6.56338e-26, 3.09267e-41, 3.69173e+19>
effect = 0x563695b1f0c0, cast effect = 0x563695b1f0c0, cast effect2 = 0

 

Ugh, this thing is a mess. I don't understand the code well enough yet to propose a proper fix. I can say, though, that the only place a real instance of TargetMagicEffect2 is created is in the same function, a few lines higher up. Why the idle function for a single particle is adding an effect, I don't know. I probably don't want to know. bluap, for a temporary fix, could you try commenting out the offending line and see if it prevents the problem:

diff --git a/eye_candy/effect_targetmagic.cpp b/eye_candy/effect_targetmagic.cpp
index 853f54f6..4111cccd 100644
--- a/eye_candy/effect_targetmagic.cpp
+++ b/eye_candy/effect_targetmagic.cpp
@@ -260,7 +260,7 @@ namespace ec
                                        break;
                                }
                        }
-                       pos += ((TargetMagicEffect2*)effect)->shift;
+//                     pos += ((TargetMagicEffect2*)effect)->shift;
                }
 
                //  std::cout << "B) " << this << ": " << velocity << ", " << pos << std::endl;

I will try to figure out the logic behind this. I'm not too hopeful it will happen anytime soon, though :(
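
As an aside, the mismatch can be illustrated in isolation. The sketch below is not the eye candy code and not a proposed fix: `Effect`, `PlainEffect`, `ShiftedEffect` and `apply_shift` are hypothetical stand-ins, and it assumes the two effect types are siblings under a common polymorphic base. It shows a guarded variant that adds the shift only when the stored pointer really refers to the type that has one.

```cpp
// Hypothetical, simplified stand-ins for the eye candy classes (assumption:
// both effect types derive from one polymorphic base).
struct Vec3 {
    float x, y, z;
};

struct Effect {
    virtual ~Effect() = default;
};

struct PlainEffect : Effect {
    int some_state = 0;                  // has no shift member
};

struct ShiftedEffect : Effect {
    Vec3 shift = {0.0f, 0.0f, 0.0f};
};

// Adds the shift only when `effect` really points to a ShiftedEffect.
// Returns true if the shift was applied, false otherwise (including nullptr).
inline bool apply_shift(Vec3& pos, Effect* effect)
{
    if (auto* shifted = dynamic_cast<ShiftedEffect*>(effect)) {
        pos.x += shifted->shift.x;
        pos.y += shifted->shift.y;
        pos.z += shifted->shift.z;
        return true;
    }
    return false;
}
```

With a `PlainEffect` the position is left untouched instead of picking up whatever bytes happen to sit where a `shift` would be.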

 

 

 

Edited by Grum

34 minutes ago, Aislinn said:

Question for you all: when you are talking about getting a "black screen", do you mean the entire monitor goes black or just the EL window?

I obviously can't speak for everyone, but I am referring to just the 3D scene in the EL window. Even the actor names are still drawn properly, just the 3D world goes black. The rest of my screen is unaffected.

4 hours ago, Grum said:

I obviously can't speak for everyone, but I am referring to just the 3D scene in the EL window. Even the actor names are still drawn properly, just the 3D world goes black. The rest of my screen is unaffected.

Same here: everything goes black except char names, mob names, HP bars, exp counters and bags on the floor, and there is an insanely bright purple collar when the spell is cast.

 

image.png

image.png


Band-aid fix checked in. Though the entire eye candy code could probably use some refactoring, this should hold for now :unsure:

3 hours ago, Grum said:

Band-aid fix checked in. Though the entire eye candy code could probably use some refactoring, this should hold for now :unsure:

Yep, that looks to have fixed it.  Pity the original author didn't use proper C++ casting as this would never have slipped through.  I checked a couple of the other uses of ->shift with casting and they would appear to always be a valid type.  There are plenty of other C style casts of objects in the code though and I wonder if we shouldn't spend a bit of time checking and replacing those too.

On 29/10/2021 at 5:02 PM, Grum said:

This pointer is upcast

Downcast, BTW. Brainfart.

 

3 hours ago, bluap said:

Yep, that looks to have fixed it. 

Oh great, it would be nice if we could finally put an end to this issue.

 

3 hours ago, bluap said:

Pity the original author didn't use proper C++ casting as this would never have slipped through.

Mwah, yes, if a dynamic_cast had been used. A static_cast (which I think a C-style cast is?), or worse, reinterpret_cast, wouldn't have caught it.
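
To make that concrete: for pointers between classes related by inheritance, a C-style cast does behave like a static_cast, and it only falls back to reinterpret_cast when the types are unrelated. The sketch below uses hypothetical stand-in types (`Effect`, `PlainEffect`, `ShiftedEffect`), not the real eye candy classes. The two casts are kept inside `decltype`, so they are only type-checked and never executed.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in types (assumption), not the real eye candy classes.
struct Effect {
    virtual ~Effect() = default;
};
struct PlainEffect : Effect {};
struct ShiftedEffect : Effect {
    float shift = 0.0f;
};

// The compiler accepts both spellings of the base-to-derived downcast without
// complaint. decltype leaves the operand unevaluated, so nothing unsafe runs.
static_assert(
    std::is_same<decltype(static_cast<ShiftedEffect*>(std::declval<Effect*>())),
                 ShiftedEffect*>::value,
    "static_cast downcast is accepted at compile time");
static_assert(
    std::is_same<decltype((ShiftedEffect*)std::declval<Effect*>()),
                 ShiftedEffect*>::value,
    "C-style downcast is accepted at compile time");

// Only dynamic_cast looks at the object's dynamic type at run time.
inline bool is_shifted(Effect* e)
{
    return dynamic_cast<ShiftedEffect*>(e) != nullptr;
}
```

So replacing the C-style casts with static_cast would document intent but would not have caught this bug; a dynamic_cast with a null check would.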

 

3 hours ago, bluap said:

I checked a couple of the other uses of ->shift with casting and they would appear to always be a valid type. 

Oh good, I was so glad to have found the issue at last, that I didn't think to check for other uses. Thanks!

 

3 hours ago, bluap said:

There are plenty of other C style casts of objects in the code though and I wonder if we shouldn't spend a bit of time checking and replacing those too.

Probably.  However, I feel the back pointer to the effect should not be in the Particle base class in the first place. I would like to study the code a bit further, see if we can make things a bit more transparent than they are now. That would be for after the release, though.

On 10/20/2021 at 7:40 AM, Burn said:

 

 

I've just invaded about 2k bears in DP crystal caves on Test server. They are capped at 1 so anyone can safely walk in there. Given the cave size that'll satisfy the heavy mob density requirement. My map testing is done so there's no reason for radu to restart it killing the invasion now.

 

If you have a char on Test that can do MD (the chars there are very, very old) or maybe someone else can go with you and MD near you, the bears can still be MD'ed despite the cap so that can be tested on them.

 

Just note that the bear will attack if you MD it, despite the cap.

 

ANYONE with a character old enough to be able to MD from the 1.9.3 client days should try to head to Test server with the last bluap client build and try this.

 

I've been keeping up making my own builds from the git code, been in many invasions and invances with it, and tested some of these bears as well. Haven't seen the issue on my part. (Linux on Intel with NVIDIA GeForce GTX 660 with latest drivers)

 

 

[Map note: Should be okay from IP to DP CC, but the maps on Test server are for 1.9.6, so you may find issues with mapwalking, or you drop into "holes" in random places, usually along edges of mountains/rocks/buildings. Don't be concerned with these, they're expected with the 1.9.5 client. ALWAYS CARRY TELE ESSIES OR A RING IN CASE OF EMERGENCY.]

 

 

 

Just a reminder, these critters are still invaded in DP CC for testing purposes.


My results: I did 100 or so mana drains and about 50 tele-to-range, everything seems okay. I would have seen the lighting issue multiple times with that many before.

 

 

Seems okay here.


I had several scenarios yesterday where I would have got a black screen in the past, and everything was normal. While it's hard to say a bug is fixed just because you no longer see a random effect, I think it looks good on my side.


I got my main PC working again and have been testing back and forth with and without the fix.
Without the fix I'm able to consistently trigger the black screen with the Eye Candy DEBUG window.
With the fix I cannot replicate the black screen. I'm confident that the issue is solved.

