Write your own Einstein@home screensaver

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6590

Credit: 319161643

RAC: 413448

4 Feb 2013 8:24:08 UTC

Message 78258

Matters are progressing, it's just that I'm in the rather slow/rigorous phase of profiling and heap/leak testing and whatnot ... :-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Mike Hewson

13 Mar 2013 6:36:11 UTC

Message 78259

I've done some research but I'll throw this out for any feedback :

Does anyone know of any tools to profile GPU memory usage on a per-application/process or per-OpenGL-context basis ?

Linux/Windows/NVIDIA/AMD/whatever ?

[ obviously I want to gauge how much on-card resources my screensaver may use .... even to within ballpark error would be good ]

Cheers, Mike.

Mike Hewson

25 Mar 2013 4:15:01 UTC

Message 78260

This is cute:

Quote:

glGetError: return error information

GLenum glGetError(void);

GL_NO_ERROR : No error has been recorded. The value of this symbolic constant is guaranteed to be 0.

combined with

Quote:

If glGetError itself generates an error, it returns 0.

Thus one can't disambiguate between a non-error state of the current OpenGL context on the one hand, and a faulty error reporting facility on the other !! :-) :-)

Now who thought that one out .... not ....

Cheers, Mike.

( edit ) Yes, I do realise this choice avoids endless recursion if glGetError is placed in a loop ( typical usage ), BUT I was wondering if there was a better way ...
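For what it's worth, the usual drain loop can at least be capped, so that a context which keeps reporting errors cannot stall the caller. Below is a minimal C++ sketch, assuming `get_error` is a stand-in for `glGetError` (passed in so the snippet compiles without the GL headers); `drain_errors` and `max_reads` are hypothetical names. It does not resolve the ambiguity of a 0 return, it only bounds the loop.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Value fixed by the OpenGL specification for GL_NO_ERROR.
constexpr unsigned kGlNoError = 0;

// Read error codes through `get_error` (a stand-in for glGetError) until it
// reports no error, giving up after `max_reads` calls so that a misbehaving
// context cannot keep the caller spinning.
std::vector<unsigned> drain_errors(const std::function<unsigned()>& get_error,
                                   int max_reads = 16) {
    assert(max_reads > 0);
    std::vector<unsigned> seen;
    for (int i = 0; i < max_reads; ++i) {
        const unsigned code = get_error();
        if (code == kGlNoError) {
            break;  // queue is empty, or the query itself failed
        }
        seen.push_back(code);
    }
    return seen;
}
```

In a real program the call would look something like `drain_errors([] { return static_cast<unsigned>(glGetError()); })`, placed just before and just after the call under suspicion.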

( edit - thinking out loud ) My trouble is that I'm getting occasional GL_INVALID_OPERATION ( 0x0502 ) returned but the text message obtained for that - using gluErrorString() - says 'No error'. Typically, but not always, the program then exits due to failure to allocate video memory for an image. This only occurs after very prolonged cycling ( days ) through the pulsar tour, involving much acquisition and release of buffer objects ( ~80KB per allocation ). So I'm thinking the OpenGL implementation struggles with 'garbage collection' or fragmentation/consolidation of many small memory blocks ( on the video card I mean ) after some time. This is without any E@H GPU WUs running concurrently, but so far has only happened on my Linux machine which has dual monitors. I might disable the second one and repeat the investigations. Trouble is I don't want to emit a screensaver product that falls over on prolonged use, as by definition really it should hold up. Sigh .... obtuse :-)

robertmiles

Joined: 8 Oct 09

Posts: 127

Credit: 29440881

RAC: 21040

25 Mar 2013 5:34:55 UTC

Message 78261 in response to message 78260

Is there a maximum buffer size? If so, you might consider allocating an array of buffers that size, and keeping track of which buffers within that array are in use.

That way, the garbage collector would only come into play when you free the entire array of buffers.

Note - this assumes a maximum number of such buffers in use at once, unless you prefer to be ready to allocate another array whenever the first one runs out.
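As a rough illustration of the idea, here is a host-side C++ sketch of a pool of equal-sized slots with an in-use flag per slot. It uses ordinary memory rather than video memory, and `SlotPool` and its members are hypothetical names; the point is only that the allocator is touched once, at construction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pool of equal-sized slots. All storage is allocated once up front; slots
// are then handed out and returned without any further allocation.
class SlotPool {
public:
    SlotPool(std::size_t slot_count, std::size_t slot_bytes)
        : slot_bytes_(slot_bytes),
          storage_(slot_count * slot_bytes),
          in_use_(slot_count, false) {
        assert(slot_count > 0 && slot_bytes > 0);
    }

    // Finds a free slot and writes its index to `slot_out`.
    // Returns false if every slot is already in use.
    bool acquire(std::size_t& slot_out) {
        for (std::size_t i = 0; i < in_use_.size(); ++i) {
            if (!in_use_[i]) {
                in_use_[i] = true;
                slot_out = i;
                return true;
            }
        }
        return false;
    }

    // Marks a previously acquired slot as free again.
    void release(std::size_t slot) {
        assert(slot < in_use_.size() && in_use_[slot]);
        in_use_[slot] = false;
    }

    // Start of the bytes belonging to `slot`.
    unsigned char* data(std::size_t slot) {
        assert(slot < in_use_.size());
        return storage_.data() + slot * slot_bytes_;
    }

private:
    std::size_t slot_bytes_;
    std::vector<unsigned char> storage_;
    std::vector<bool> in_use_;
};
```

If the pool runs out, `acquire` simply reports failure; allocating a second array at that point, as mentioned above, would be one way to extend it.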

Mike Hewson

25 Mar 2013 6:15:52 UTC

Message 78262 in response to message 78261

Quote:

Is there a maximum buffer size? If so, you might consider allocating an array of buffers that size, and keeping track of which buffers within that array are in use.

That way, the garbage collector would only come into play when you free the entire array of buffers.

Note - this assumes a maximum number of such buffers in use at once, unless you prefer to be ready to allocate another array whenever the first one runs out.

Thank you Robert.

I agree, I think something along these lines will be the tactic. The minimum guaranteed simultaneous buffer count is approx 32K ( per OpenGL standards ), and my release code returns each identifier/handle to the pool. But maybe the implementation doesn't conform to spec, who can say?

I've been somewhat obsessed by shaving to the absolute minimum of video card memory use - that is at any moment during execution only the bare minimum is loaded server side.

I think I ought to relax on that, which won't be too terrible, as some 50 or so pulsar profile images @ ~80KB each is only 4MB in total. If we find another 50 then that's 8MB. Load them all to video buffer ( array ) once at startup, display in sequence during the tour, then release on exit.

Cheers, Mike.

( edit ) Yeah, that makes even more sense now that I think of it. 32K identifiers for 40-something distinct images over 4 days is around seven cycles of the entire pulsar tour per hour. That's about the rate it displays ie. around 10 minutes for a tour. So if the identifiers aren't being returned to pool ( or otherwise not capable of legitimate re-use ) then eventually I'll get a scenario fitting error 0x0502 : "the set of state for a command is not legal for the parameters given to that command."

Now that I can certainly test with some well placed debug code! :-)
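The back-of-envelope figure above can be checked with a few lines of C++. This is only a sketch of the arithmetic: the image count of 45 is an assumed stand-in for "40-something", and `tours_per_hour` is a hypothetical helper name.

```cpp
#include <cassert>

// Rough inputs taken from the post above. The image count is an assumption
// standing in for "40-something".
constexpr double kHandles = 32.0 * 1024.0;  // ~32K identifiers
constexpr double kImagesPerTour = 45.0;     // assumed
constexpr double kHours = 4.0 * 24.0;       // four days of running

// Tours per hour needed to consume every handle exactly once in `hours`,
// if each displayed image uses up one handle that is never reused.
double tours_per_hour(double handles, double images_per_tour, double hours) {
    assert(images_per_tour > 0.0 && hours > 0.0);
    return handles / images_per_tour / hours;
}
```

With these inputs the result is about 7.6 tours per hour, or roughly eight minutes per tour, which is in the same ballpark as the observed tour time.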

robertmiles

26 Mar 2013 5:38:57 UTC

Message 78263 in response to message 78262

Another idea to consider: Allocate buffers for only two pulsars, the one currently being displayed and the next one. Switch between these buffers as needed. That way, each buffer can be the maximum size required, and any buffer no longer in use can be overwritten. This should help in situations where little memory is available.

Also, if the buffers are of varying sizes, expect memory fragmentation eventually - a situation where there is more than enough free memory to allocate another buffer, but it is broken into pieces too small to put the buffer in just one piece.
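A minimal C++ sketch of the two-buffer idea follows, again using host memory as a stand-in for video memory. `DoubleBuffer`, `stage` and `swap` are hypothetical names, and both buffers are assumed to be sized for the largest image.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Two equal-sized buffers: one holds the image on display, the other is
// free to be overwritten with the next image. Nothing is allocated or
// freed after construction, so fragmentation cannot build up over time.
class DoubleBuffer {
public:
    explicit DoubleBuffer(std::size_t bytes)
        : front_(bytes, 0), back_(bytes, 0) {
        assert(bytes > 0);
    }

    // Copy the next image into the hidden buffer. It must fit.
    void stage(const unsigned char* src, std::size_t n) {
        assert(src != nullptr && n <= back_.size());
        std::memcpy(back_.data(), src, n);
    }

    // Make the staged image current. The old one becomes overwritable.
    void swap() { front_.swap(back_); }

    // The image currently on display.
    const std::vector<unsigned char>& current() const { return front_; }

private:
    std::vector<unsigned char> front_;
    std::vector<unsigned char> back_;
};
```

Staging the next pulsar while the current one is displayed means the swap itself is just an exchange of two vectors.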

Mike Hewson

26 Mar 2013 7:41:49 UTC

Message 78264

Firstly, I pronounce the gluErrorString() routine officially crap. All this time it has been returning the string 'No error' for the enumerant GL_OUT_OF_MEMORY !! So I'll flick gluErrorString() entirely because it serves precisely no purpose at all, then roll my own solution ie. directly match the enumerants to the phrases used in the OpenGL standard, eg. in this case "There is not enough memory left to execute the command".
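A hand-rolled lookup of that kind might look like the C++ sketch below. The numeric codes are the standard OpenGL values, written out so the snippet compiles without the GL headers. The wording for 0x0502 and 0x0505 is the wording quoted in this thread; the other phrases are paraphrased, and `gl_error_text` and `gl_error_report` are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Map an OpenGL error code to a fixed phrase, without relying on GLU.
const char* gl_error_text(unsigned code) {
    switch (code) {
        case 0x0000: return "No error has been recorded";                 // GL_NO_ERROR
        case 0x0500: return "An enumerated argument has an unacceptable value";  // GL_INVALID_ENUM
        case 0x0501: return "A numeric argument is out of range";         // GL_INVALID_VALUE
        case 0x0502: return "The set of state for a command is not legal "
                            "for the parameters given to that command";   // GL_INVALID_OPERATION
        case 0x0503: return "A stack push would overflow the stack";      // GL_STACK_OVERFLOW
        case 0x0504: return "A stack pop would underflow the stack";      // GL_STACK_UNDERFLOW
        case 0x0505: return "There is not enough memory left to execute "
                            "the command";                                // GL_OUT_OF_MEMORY
        case 0x0506: return "The framebuffer object is not complete";     // GL_INVALID_FRAMEBUFFER_OPERATION
        default:     return "Unrecognised error code";
    }
}

// Build a one-line report of the form "<where>: <phrase> (0x....)".
std::string gl_error_report(const char* where, unsigned code) {
    assert(where != nullptr);
    char hex[16];
    std::snprintf(hex, sizeof hex, "0x%04X", code);
    return std::string(where) + ": " + gl_error_text(code) + " (" + hex + ")";
}
```

Printing the raw hex code alongside the phrase keeps the report useful even if a driver returns a value the table doesn't know about.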

Secondly, the program finally exits when a request for a buffer pointer ( image use, my choice of making that a fatal ) is denied by the state machine. This is consistent with the prior analysis regarding the range limit of the buffer identifier/handles ( I put debug error checks just before and just after the relevant glGenTextures() call, so this should be an exact and unambiguous diagnosis ). This is definitely off OpenGL spec and thus, my guess would be, the driver is marking the handles as in use on acquire, but not returning to pool on release. But I will further test/debug that hypothesis too ....

Thirdly, I like your suggestion of a double buffer technique and as it turns out I can make each pulse profile image exactly the same size anyway. That way I'd just overwrite a given buffer with a new pattern ( these are kept as byte level arrays in program's memory anyway ) as the tour proceeds. I can simply allow the HUDImage class to accept a new resource identifier ( see ResourceFactory class ) after object construction, which then loads a new bit pattern on the fly.

Thank you very much Robert ! I do appreciate bouncing this stuff off you ... :-)

What do you get up to at NCSU ?

Cheers, Mike.

( edit ) Some brief research indicates overall unhappiness with OpenGL error reporting generally - as practised by implementations. So I'll stick to just examining the returned enumerants and respond from there.

Mike Hewson

2 Apr 2013 23:54:07 UTC

Message 78265

All fixed now. Diagnosis : my own stupidity :-) :-)

I had forgotten a base class behaviour that I wrote a while ago ( which I did adequately annotate/comment at the time, but hadn't re-read ). The effect was to allocate three texture 'units' ( on the video card ) per single image to be displayed. Only one of the three allocated gets released during cycling, so you would naturally expect to run out of such texture units eventually over a prolonged span of operation. This escapes the valgrind debugger ( in memcheck mode ) as that only inspects general memory.
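Since a host-memory checker cannot see this kind of leak, one cheap safeguard is to count the server-side handles yourself. The following is a minimal C++ sketch under that assumption; `HandleLedger` is a hypothetical name, and in practice `acquired` and `released` would be called next to the matching generate and delete calls.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Records which server-side handles are currently held, so that a leak of
// GPU-side objects shows up as a steadily growing count.
class HandleLedger {
public:
    // Call after a handle has been generated.
    void acquired(unsigned handle) {
        const bool inserted = live_.insert(handle).second;
        assert(inserted && "handle reported as acquired twice");
        (void)inserted;
    }

    // Call after a handle has been deleted.
    void released(unsigned handle) {
        const std::size_t erased = live_.erase(handle);
        assert(erased == 1 && "released a handle that was never acquired");
        (void)erased;
    }

    // Number of handles acquired but not yet released.
    std::size_t live() const { return live_.size(); }

private:
    std::set<unsigned> live_;
};
```

Logging `live()` once per tour would have shown the count climbing by two per image in the situation described above, well before the driver ran out.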

While testing I have also optimised the code dealing with re-acquisition of OGLFT font resources when an AbstractGraphicsEngine is re-initialised or recycled. The issue is that for Windows machines only, the entire OpenGL context is lost on a window resize. In that case you have to re-construct the prior settings of the state machine and then emit a frame. For Linux/Mac the context is retained and one just emits another animation frame. Now an OpenGL context is effectively the combination of some client screen area ( typically a component of some 'window' as defined by the OS, but could be the whole screen un-windowed ) plus given settings of the state machine.

So I've appropriately factorised the relevant section of code ( dealing with OGLFT font initialisation ) into a function and just conditionally compile it ( using preprocessor directives keyed on a makefile defined symbol ). Thus far this produces the right behaviours, and no errors, so I may consider factorising other server side resource acquisitions to likewise conditionally compile.
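The general shape of such a conditional compile could be something like the C++ sketch below. It is not the screensaver's code: `SAVER_CONTEXT_LOST_ON_RESIZE` is a hypothetical symbol that a makefile might pass with `-D`, and `FontState`, `init_fonts` and `handle_resize` are stand-ins for the real font handling.

```cpp
#include <cassert>

// Stand-in for the font resources tied to an OpenGL context.
struct FontState {
    int init_count = 0;  // how many times the fonts have been (re)built
};

// Stand-in for building font resources against the current context.
void init_fonts(FontState& fonts) { ++fonts.init_count; }

// Called after a window resize. Returns true if resources were rebuilt.
// SAVER_CONTEXT_LOST_ON_RESIZE is a hypothetical makefile-defined symbol
// for platforms where a resize discards the OpenGL context.
bool handle_resize(FontState& fonts) {
    assert(fonts.init_count > 0);  // fonts must have been set up once already
#ifdef SAVER_CONTEXT_LOST_ON_RESIZE
    init_fonts(fonts);  // context was lost: rebuild before drawing
    return true;
#else
    (void)fonts;        // context survived: just draw the next frame
    return false;
#endif
}
```

Keeping the rebuild in one function means the platform difference is confined to a single `#ifdef` rather than scattered through the rendering code.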

So as I keep saying - not long now! ;-)

Cheers, Mike.

robertmiles

10 May 2013 15:01:48 UTC

Message 78267 in response to message 78264

Quote:

Thirdly, I like your suggestion of a double buffer technique and as it turns out I can make each pulse profile image exactly the same size anyway. That way I'd just overwrite a given buffer with a new pattern ( these are kept as byte level arrays in program's memory anyway ) as the tour proceeds. I can simply allow the HUDImage class to accept a new resource identifier ( see ResourceFactory class ) after object construction, which then loads a new bit pattern on the fly.

Thank you very much Robert ! I do appreciate bouncing this stuff off you ... :-)

What do you get up to at NCSU ?

Cheers, Mike.

You're welcome.

It's been many years since I was at NCSU, but I studied electronic engineering there. I also learned some computer programming there (for mainframes only).

I have done a lot of programming since, and have taken a few online computer programming classes (C++ and CUDA) recently.

Mike Hewson

12 May 2013 23:46:28 UTC

Message 78268

Update : obscure bug being tracked down. Heap overrun within OGLFT library code in Windows builds only, and alas then only sometimes. Related to use of C-style string constructs ie. char * and an assumed '\0' ending for the Face::draw() function. It took me five (5) weeks to define just that. Workaround : attempting to present single characters serially to the OGLFT interface. Sigh. I think there is some 80/20 rule or somesuch : 20% of the bugs take 80% of the time. I'd make it 98/02 ..... :-)
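The character-at-a-time workaround can be sketched in C++ roughly as follows. The drawing call is passed in as a callback, because the exact OGLFT call used is not shown here; `draw_char_by_char` and `draw_cstr` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Feed `text` to `draw_cstr` one character at a time. Each call receives a
// two-byte buffer that is always NUL-terminated, so the callee never has to
// trust the terminator of a longer string.
void draw_char_by_char(const std::string& text,
                       const std::function<void(const char*)>& draw_cstr) {
    assert(draw_cstr);  // a callable must be supplied
    char one[2] = {'\0', '\0'};
    for (char c : text) {
        one[0] = c;
        draw_cstr(one);
    }
}
```

A real renderer would also have to advance the pen position between characters, which this sketch leaves to the callback.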

@Robert : I too have done some online classes, which probably gave me the impression I could succeed in this venture. Programming is far more than merely typing code in some language paradigm though! There's a level of rigor required and good tools are essential. My sympathies to our E@H developers ....

Cheers, Mike.
