Petri has done it again. Here is a self-contained executable for FGRPB1G, which Petri refers to as an "AIO". I'll try not to confuse this with the TBar AIO package, though, which we also colloquially call "AIO".
FGRPB1G is coming to an end soon, so let's make a mad rush on it :).
From my testing and others', running 2x tasks per GPU should give the best production, but be aware of memory usage if you try 2x on the v1.0 app (see the notes; 8GB GPU minimum).
Expect a 50-80% speed boost over the stock v1.28 Nvidia app, depending on your exact configuration.
Requirements:
- This requires Linux. There will not be a Windows version.
- This requires, and is only applicable to, modern Nvidia GPUs (only tested with Pascal and up). There will not be an AMD version.
- This requires the OpenCL 3.0 Nvidia drivers; you need the 470 branch or later.
- This requires GLIBC 2.29 or newer (Ubuntu 19.04 and up).
- This requires a CPU with AVX2 capability (Intel Haswell/Broadwell and up, AMD Ryzen and up).
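The requirements above can be checked roughly with a few lines of Python. This is only a sketch: the helper names (`version_tuple`, `has_avx2`, `glibc_ok`, `driver_ok`) are made up for illustration, and the driver check assumes you pass in the version string yourself (for example, the one `nvidia-smi` reports).

```python
import os
import platform
import re


def version_tuple(text):
    """Turn a string like '470.82.01' into (470, 82, 1) for numeric comparison."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def has_avx2(cpuinfo_text):
    """True if any 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and "avx2" in line.split():
            return True
    return False


def glibc_ok(version_text, minimum=(2, 29)):
    """True if the glibc version string is at least 2.29."""
    return version_tuple(version_text)[:2] >= minimum


def driver_ok(version_text, minimum_branch=470):
    """True if the Nvidia driver version string is from the 470 branch or later."""
    return version_tuple(version_text)[0] >= minimum_branch


if __name__ == "__main__":
    # platform.libc_ver() returns ('glibc', '2.31') or similar on most Linux systems.
    _, libc_version = platform.libc_ver()
    print("glibc:", libc_version or "unknown",
          "ok" if libc_version and glibc_ok(libc_version) else "too old or unknown")
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as fh:
            print("AVX2:", "ok" if has_avx2(fh.read()) else "missing")
```

It does not check the GPU generation or whether the OpenCL 3.0 driver components are actually installed, so treat a pass as a first sanity check only.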
Get it here:
v1.0: Einstein_Special v1.0
v0.95: Einstein_Special v0.95
(you can ignore the run_manager file, it's leftover from some testing)
EDIT: oh and since this involves moving to Anonymous Platform (like how we ran the special app on SETI), it might be a good idea to set NNT on Einstein and deplete your cache before making the switch. I do not know if making the switch with pre-existing tasks in your queue will nuke them all. I ran my queue down and submitted all work before making the switch.
I'll summarize the instructions here:
- I've organized the package to place the files where they "should" go on your system.
- **Backup your existing stock hsgamma executable to somewhere safe outside of the einstein directory**
- Place the five (5) alternate kernel files in your main boinc directory (the same location that has the slots and projects folders)
- Place the app_info.xml file in your einstein project folder. Make sure to double check it and edit it for your setup if necessary. You should be able to use your existing app_config.xml if you have one.
- Place the new HSgamma executable in your einstein project folder. Double check that the execute bit is set; it should be already.
- ***new for v1.0*** Place the EAH_SLEEP file from the v1.0 package in your BOINC directory. You NEED this file, as it contains tuning parameters for the application (do not edit this file at all). Then just launch BOINC via boincmgr as usual.
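For anyone who prefers to script the backup and the execute-bit check from the steps above, here is a minimal sketch. The function names and any paths you pass in are hypothetical; the real locations depend on how BOINC is installed on your system.

```python
import os
import shutil
import stat


def backup_file(src, backup_dir):
    """Copy src into backup_dir (created if needed) and return the new path."""
    os.makedirs(backup_dir, exist_ok=True)
    return shutil.copy2(src, backup_dir)


def ensure_executable(path):
    """Add the execute bits for user, group and others if any are missing."""
    mode = os.stat(path).st_mode
    wanted = mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if wanted != mode:
        os.chmod(path, wanted)
    return os.access(path, os.X_OK)


# Example with placeholder paths (adjust to your own install):
#   backup_file("/path/to/einstein/project/stock_hsgamma_binary", "/path/to/safe/backup")
#   ensure_executable("/path/to/einstein/project/new_hsgamma_binary")
```

Run it as a user that has write access to the BOINC directories, and stop the client first so nothing is in use while files are moved.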
Enjoy :)
v0.95 notes:
- available from the above link
- EAH_SLEEP file default is "1" for low CPU use. change to "0" for higher CPU use, but a bit better app performance.
- for best performance, I recommend trying to stagger the tasks so that only one task is in the 90-100% portion at a time per GPU
- you only need to update the HSgamma executable and app_info file if you're already using a previous release. no need to change or update the 5 alt fft files. but i've included the whole package for those who might be starting fresh.
v1.0 notes:
- available from the above link
- another ~8-10% improvement over v0.95
- for best performance, I recommend trying to stagger the tasks so that only one task is in the 90-100% portion at a time per GPU
- Make sure you use the NEW EAH_SLEEP file, do not re-use your old v0.95 one. do not edit this file.
- you only need to update the HSgamma executable and app_info file if you're already using a previous release. no need to change or update the 5 alt fft files. but i've included the whole package for those who might be starting fresh.
- There have been some instances of people getting errors on some Ryzen based systems. The cause is not known yet, and it only seems to affect some hosts. If you have these problems, stick with v0.95.
- v1.0 uses a LOT of your system and GPU resources. Each task will use ~3.5GB of GPU memory and ~2GB of system memory, so make sure you have enough of both. With lower end GPUs it might be better to stick with the v0.95 app.
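As a rough budget for the v1.0 memory figures above, the arithmetic can be sketched like this. The per-task numbers are the approximate ones from the notes, the function names are just for illustration, and the calculation ignores whatever GPU memory your desktop or other projects already use.

```python
import math

GPU_GB_PER_TASK = 3.5  # approximate v1.0 GPU memory per task, from the notes
SYS_GB_PER_TASK = 2.0  # approximate v1.0 system memory per task, from the notes


def max_tasks_per_gpu(gpu_mem_gb, per_task_gb=GPU_GB_PER_TASK):
    """How many v1.0 tasks fit in the given amount of GPU memory."""
    return math.floor(gpu_mem_gb / per_task_gb)


def system_memory_needed(num_gpus, tasks_per_gpu, per_task_gb=SYS_GB_PER_TASK):
    """System memory (GB) used by the v1.0 tasks across all GPUs."""
    return num_gpus * tasks_per_gpu * per_task_gb


print(max_tasks_per_gpu(8))        # 2 tasks fit on an 8GB card
print(max_tasks_per_gpu(6))        # only 1 task fits on a 6GB card
print(system_memory_needed(2, 2))  # 8.0 GB for two GPUs running 2x each
```

This is where the 8GB minimum for running 2x on v1.0 comes from: two tasks need about 7GB of GPU memory.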
_________________________________________________________________________
Thanks for sharing!
I am one of the people that has trouble running the v1.0 version of the app on my Ryzen hosts.
Watch out for tasks that quickly cascade into errors for brief periods, recurring anywhere from every 10 minutes to every 2 hours, with this snippet in stderr.txt:
.
.
.
The same hosts that have issues with v1.0 run v0.95 with no issues, just with a slight performance loss compared to v1.0.
Disregard the (Device 0) enumeration. All cards are identified as Device 0, so any card could be the originator of the error. It does not indicate that any card is faulty, just some as yet undiagnosed flaw in the v1.0 application on certain hosts.
Thanks to Petri and Ian for the help. I'm currently running v0.95. So far so good. Got to watch the invalids closely.
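pututu wrote: Thanks to Petri and Ian for the help. I'm currently running v0.95. So far so good. Got to watch the invalids closely.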
They won't be invalids. They will be straight errors.
I think he's talking about increased invalids vs the stock apps. He won't (or shouldn't) see any errors with the v0.95 app.
I see about 3-4% invalid with v0.95 and v1.0, which is about on par with what you get from the stock 1.28 app and a little higher than you'd see with the 1.22 app if you don't have the OpenCL 3.0 drivers. But the 1.22 app is dog slow by comparison, so the speed gain far outweighs the increased invalids.
_________________________________________________________________________
All but one of my PCs are Zen based and the one that isn't just has a 1070 so I added the 0.95 app on a PC with a 3070Ti.
1.28 2x ~299 seconds
0.95 2x ~220 seconds
I didn't delete the old app and no tasks aborted after restarting the client.
Some nice improvements!
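To put those two run times in the same terms as the "speed boost" figures quoted earlier in the thread, here is the arithmetic as a small sketch (the function names are just for illustration):

```python
def throughput_gain(old_seconds, new_seconds):
    """Fractional increase in tasks completed per unit time."""
    return old_seconds / new_seconds - 1.0


def time_saved(old_seconds, new_seconds):
    """Fractional reduction in run time per task."""
    return 1.0 - new_seconds / old_seconds


# 3070Ti running 2x: ~299 s on the stock 1.28 app, ~220 s on v0.95
print(f"{throughput_gain(299, 220):.1%}")  # 35.9%
print(f"{time_saved(299, 220):.1%}")       # 26.4%
```

So v0.95 on this card gives roughly 36% more throughput than stock, which is about a 26% shorter run time per task.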
Very nice. You should give the v1.0 app a try anyway. Just monitor for errors.
I am the only team member who was majorly bothered by the v1.0 app on my Ryzen hosts. My Epyc hosts were immune, as were most team members.
One or two other members saw very few errors on their Ryzen hosts, or none. I also run more projects alongside Einstein Gamma-Ray on the same hosts. I did extensive troubleshooting but could never pin the problem down, so I reverted to 0.95 to eliminate the errors, at a slight loss in production compared to v1.0.
I'm currently testing Petri's latest v1.0 iteration on the Ryzen hosts. In the 6 hours of testing so far I have seen no errors on one host and only two errors on a second host, which would have produced a dozen or so errors with the original app in the same time.
I just put the new app on the most troublesome host and will monitor it to see whether the latest iteration has mostly squashed the errors I was seeing.
The speedup from v0.95 to v1.0 makes it worth a try. I got another 40 seconds of speedup with v1.0.
Works fine for me. Good job :)
That app is insane :)
tito wrote: That app is insane :)
Just keep an eye on your error rate. If you see a lot more errors, it might make sense to swap over to the 0.95 app.
_________________________________________________________________________