My test bench is using a vanilla version of the BOINC client that I compiled from the latest source (7.17.0). The only thing "edited" is the coproc file, and that was done for testing purposes (as I mentioned, no GPU <3 GB), using YOUR method. You don't trust your own methods? Not that the results would be any different with another working client.
Nice try deflecting though lol. It's hysterical to see you try this hard to disagree with me when we are saying the same things LOL. Do you have anything other than strawman arguments?
If you didn't catch the estimates that were pushed for tasks that ended up using more than 3 GB, then you need to keep watching for them, which is what I'm doing. It's only the test bench running one GPU with one WU at a resource share of 0, so the host only has one WU at a time. That makes it easy to cross-reference the WU being processed against the WU in the log, since the log seems to get wiped on every connection. Just yesterday they were sending out those big tasks that failed on my 3 GB 1060; today they seem to be sending out only the <2 GB ones. Ah well, all I can do is wait I suppose.
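For anyone wondering what the "edited coproc file" amounts to: it means changing the GPU properties the client reports to the scheduler in coproc_info.xml. A rough sketch of the relevant fragment follows; the element names are recalled from the BOINC client source and should be checked against your own file, and the byte value shown is 3 GiB:

```xml
<coproc_cuda>
    <name>GeForce GTX 1060 3GB</name>
    <!-- totalGlobalMem is reported in bytes; 3221225472 = 3 GiB.
         Changing this value changes how much VRAM the scheduler
         believes the card has when matching tasks to it. -->
    <totalGlobalMem>3221225472</totalGlobalMem>
</coproc_cuda>
```

Note that the client may regenerate this file on restart, so the edit typically has to be reapplied (or the file made read-only) for testing.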
Here we go, this is the kind of proof we're looking for. My hunch was right: the scheduler thinks the task only needs ~1800 MB of GPU RAM, but this is one that actually tries to use ~3200 MB. Either the scheduler is hard-coded for this task type, or there's a bug in the code that estimates how much memory a task needs.
Screenshot proof with the scheduler log entry, nvidia-smi output showing full memory use on a 3 GB card, and the corresponding WU task name shown. There can be no doubt this is what is happening.
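To make the mismatch concrete, here is a trivial sketch of the fit check the scheduler would need to get right. The figures are taken from the log above; the function name and signature are my own illustration, not BOINC code:

```python
def fits_on_gpu(task_needs_mb: int, gpu_total_mb: int) -> bool:
    """Would this task fit in the GPU memory the card actually has?"""
    return task_needs_mb <= gpu_total_mb

# The scheduler estimated ~1800 MB, so it happily sends
# the task to a 3 GB (3072 MB) card...
print(fits_on_gpu(1800, 3072))   # True -> task gets scheduled

# ...but the task really tries to allocate ~3200 MB at runtime:
print(fits_on_gpu(3200, 3072))   # False -> allocation fails on a 3 GB card
```

The decision is only as good as the estimate fed into it, which is exactly where the log shows it going wrong.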
Excellent work. Please post this somewhere as a bug report; I don't think there are any tech guys in this thread.
If this page takes an hour to load, reduce posts per page to 20 in your settings so the tinpot 486 Einstein uses can handle it.
I already posted the info for "the people who matter" (Bernd) in the Technical News forum, in the relevant GW thread where he was talking about the scheduler. Hopefully he sees it there.
Someone could PM him directly, but I think he'll probably see it sooner or later since he was posting in that thread.
I have a question about what is better for the project. I have a host with two GTX 1060 3GB cards. Obviously all the high-frequency tasks bomb out in very short order; the rest process with no problem. It doesn't bother me, because little time is wasted and a lot of good work is being done.
Is the project better off with this host doing what it can, or should I exclude it from this search so the servers don't have to deal with the carnage?
Personally I would just exclude that card from the GW search. The tasks that failed to run on my 3 GB 1060 would sit there cycling over and over, not really doing anything until they hit the timeout, all the while reporting 100% progress. Very odd behavior. I think it's better for the project not to have to deal with all the resends, which might go out to another 3 GB GPU, hit the same problem all over again, and then be resent again. The scheduler doesn't do a proper job of estimating the GPU memory required, as outlined in my previous post.
If you remove your 3 GB card from the GW search, that's just one less card causing resends of failed tasks. There seem to be a lot of them; nearly all of the GW tasks I've processed recently have been resends.
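Excluding a card from one application can be done with an `<exclude_gpu>` entry in the client's cc_config.xml; this is standard BOINC client configuration, but the app name below is an assumption for the GW search and should be checked against the application list on the project site:

```xml
<cc_config>
  <options>
    <exclude_gpu>
      <url>https://einsteinathome.org/</url>  <!-- use the exact project URL from your client -->
      <device_num>0</device_num>              <!-- which GPU to exclude -->
      <type>NVIDIA</type>
      <app>einstein_O2MD1</app>               <!-- assumed GW app name; verify before using -->
    </exclude_gpu>
  </options>
</cc_config>
```

Restart the client (or tell it to re-read the config files) for the exclusion to take effect. Alternatively, deselecting the GW search in the project's web preferences for that host's venue achieves much the same thing.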
That's one viewpoint and may very well be valid, but I want to hear what the project wants.
Edit: I looked, and that host has 431 valid tasks showing, 217 errors showing, and 10 pending which will all validate.
Erroring out 1/3 of your tasks seems like a bit much. Hosts like that are probably the reason there are so many GW tasks needing validation and so many resends, which inevitably delays the science results getting back to the project.
But if you want to hear from someone actually at the project, you might need to reach out to them directly; apparently they don't post on the user forums too often.
A GW GPU task has started on my GTX 1060 with its 3 GB of video RAM, and it looks OK in GPU-Z too. Memory used is 1962 MB. The task has completed and is pending. 11 GB of system memory used.
Tullio
Another task has completed and a third is running. Yesterday I had a cumulative Windows 1903 update on my PC, Home edition.
Six of my GW tasks are completed and waiting for validation. Their wingman uses a GTX 750 Ti board with 2 GB of video RAM. Since GPU-Z says the project uses 1962 MB, no wonder they all fail on the 750 Ti.
Tullio
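A quick headroom calculation shows why: on paper 1962 MB fits in 2048 MB, but some VRAM is always reserved by the driver and the desktop. The 100 MB reservation below is an assumption for illustration; the real figure varies by system and is often larger on a card driving a display:

```python
card_total_mb = 2048        # GTX 750 Ti: 2 GB of VRAM
task_needs_mb = 1962        # GW task footprint reported by GPU-Z
reserved_mb = 100           # assumed driver/desktop reservation; varies by system
usable_mb = card_total_mb - reserved_mb   # 1948 MB actually available
print(task_needs_mb > usable_mb)          # True -> the task cannot fit
```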