https://gitlab.synchro.net/main/sbbs/-/commit/b5488bb3082147c8d3ee3d6d
Modified Files:
src/build/Common.gmake src/conio/bitmap_con.c scale.c x_events.c
Log Message:
Optimizations:
1) Keep a rectangle updated per-screen rather than regenerate each time
2) Strip palette info when putting pixels into rectangles rather than
during scaling
3) Tighten up the screen locks a bit
4) Don't require a full resend of both screens on an update request
5) Only force a redraw for cursor movement when the cursor is visible
(And force it whenever the cursor changes)
6) Avoid doubles in interpolation
7) Heavily optimize interpolate_height()
interpolate_width() likely doesn't need it because it's generally not
used, and it also reads from the next pixel in memory, making the
prefetcher's job easier.
8) Fix some memory-leak-on-error issues
9) For ARGB8 XImages, manipulate the data directly rather than through
XPutPixel()
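As an illustration of item 6, here is a minimal sketch of blending two pixels with integer fixed-point weights instead of doubles. It is not the code from scale.c: the function name blend_fixed(), the 0x00RRGGBB pixel layout, and the 0..256 weight range are all assumptions made for this example.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from scale.c): blend two 0x00RRGGBB
 * pixels using only integer math.
 * weight is in the range 0..256: 0 returns a, 256 returns b.
 */
static uint32_t blend_fixed(uint32_t a, uint32_t b, uint32_t weight)
{
	uint32_t inv = 256 - weight;

	/* Red and blue share one multiply; each channel's product fits in
	 * 16 bits, so blue cannot spill into red and red cannot overflow. */
	uint32_t rb = (((a & 0x00FF00FF) * inv + (b & 0x00FF00FF) * weight) >> 8)
	    & 0x00FF00FF;

	/* Green is handled separately so it has headroom of its own. */
	uint32_t g = (((a & 0x0000FF00) * inv + (b & 0x0000FF00) * weight) >> 8)
	    & 0x0000FF00;

	return rb | g;
}
```

With this layout a weight of 128 gives a truncating midpoint, so blending 0x000000 with 0xFFFFFF yields 0x7F7F7F. The shift by 8 replaces the floating-point multiply and the conversion back to an integer.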
At this point, the scaling and X11 output time is heavily dominated by
cache misses. The only really effective way to reduce this hit is to
spread the work across all the L3 caches in the system or move it into
the GPU.
With the latest updates, at the SyncTERM menu, over 90% of the time is
spent in the rendering pipeline, and over 90% of that time is spent
thrashing the caches... the only real easy win left is vectorizing, but
that's highly compiler specific.
To that end, I've switched to -O3 for release builds. There was a comment
that -finline-functions broke Baja "badly", but that's clearly false since
-finline-functions has been part of -O2 for quite a while now, and Baja
doesn't seem any more broken than it ever was.