Over a year ago, the Guix Build Coordinator was building lots of things for substitutes https://octodon.social/@cbaines/104297876310550244
It's taken a while, but now it should be benefiting general users of #guix
I did another round of Guix Build Coordinator database improvements, and the results are pretty good.
The graph here shows the 95th percentile time for a few database operations. The blue line (allocating builds to agents) has gone from often taking up to a second to barely registering (5ms or less).
I think fitting WAL checkpoints in between allocations, prompting SQLite to "optimize", and not caching query plans forever are the main changes that brought this about.
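For anyone curious what those three changes look like in practice, here's a minimal sketch using Python's stdlib sqlite3 module. The function names and the database filename are hypothetical, not the Coordinator's actual code (which is written in Guile); this just illustrates the pragmas involved.

```python
import sqlite3

# Hypothetical sketch of the three SQLite tweaks mentioned above.
conn = sqlite3.connect("coordinator-sketch.db")
conn.execute("PRAGMA journal_mode=WAL")

def after_allocation(conn):
    # Fit a WAL checkpoint in between build allocations, so checkpointing
    # happens when it won't delay an allocation that's in progress.
    conn.execute("PRAGMA wal_checkpoint(PASSIVE)")

def periodic_maintenance(conn):
    # Prompt SQLite to refresh its planner statistics, so query plans
    # adapt as the data distribution changes.
    conn.execute("PRAGMA optimize")
```

The third change, not caching query plans forever, corresponds to limiting how long prepared statements are reused; in Python's sqlite3 that cache is controlled by the `cached_statements` argument to `connect()`.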
I've had this hardware (i5-2500k with a capable cooler) for ages now, and I haven't been using it for many years.
It still works though, and I've managed to overclock the processor to 5GHz.
I haven't got around to measuring the single core performance, though at some point I want to see how it compares to my i7-8700K.
#Guix Build Coordinator progress, I made some database schema changes, and the size of the guix.cbaines.net database dropped from ~43GB to ~11.5GB!
When I started writing it, I used natural keys: UUIDs for the builds, and /gnu/store/... names to identify derivations. This was fine at small scale, but with lots of builds and derivations, it made for a much bigger database, and slower queries.
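The shape of that schema change can be sketched like this (hypothetical table and column names, SQLite via Python; not the Coordinator's actual schema): the long store file name is stored once, and everything else references a small integer id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Before: every referencing row repeats the full /gnu/store/... name
# (dozens of bytes), and indexes on it are correspondingly large.
conn.execute("""
CREATE TABLE build_inputs_old (
  build_uuid TEXT,        -- UUID as the build's natural key
  derivation_name TEXT    -- full /gnu/store/... file name
)""")

# After: the name is stored once in its own table, and other tables
# reference a compact INTEGER surrogate key, shrinking both the rows
# and the indexes, and making joins cheaper.
conn.executescript("""
CREATE TABLE derivations (
  id INTEGER PRIMARY KEY,
  name TEXT UNIQUE NOT NULL
);
CREATE TABLE build_inputs (
  build_id INTEGER NOT NULL,
  derivation_id INTEGER NOT NULL REFERENCES derivations(id)
);
""")
```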
Whoo, it's pretty rough around the edges, but this is probably the first #Guix Build Coordinator agent build on the GNU Hurd!
I probably should be doing something more important than fixing thumbnails for the GNOME Font Viewer in #Guix, but that's what I find myself doing...
How many inputs does an average #guix derivation have? Around 31 is my guesstimate from the Guix Data Service data.
This sort of chart might show some more nuances though; there's probably some reason why 3, 6, 10 and 25 are quite popular...
Wooo! Turns out that adding (reverse ...) in a couple of places was enough to get the #guix Build Coordinator ordering builds in a pretty sensible way.
I'm pretty sure that change meant that suddenly agents were mostly allocated builds that they could actually attempt, rather than ones where the inputs were missing.
It turns out the "encoding" issue I've been chasing, where PostgreSQL says "invalid byte sequence for encoding" when trying to do a query, probably has nothing to do with incorrect encoding...
My latest theory is that something about the way the query is executed isn't garbage collector safe, and when the garbage collector runs, the memory regions containing the parameters get altered...
This theory at least explains why I see this issue intermittently.
So... after 3 days, it still hadn't finished. I gave up, stopped the DELETE, added an index on a table referencing the table I was deleting from, and then it finished in a few minutes...
It's the "Trigger for constraint" bit that apparently was taking forever.
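The underlying problem: deleting a parent row forces the database to check that no rows still reference it, and without an index on the child table's foreign key column, each of those checks is a full scan. In PostgreSQL that cost shows up as "Trigger for constraint" time in EXPLAIN ANALYZE. Here's the principle sketched self-contained with SQLite (hypothetical table names, not the actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY);
CREATE TABLE child (
  id INTEGER PRIMARY KEY,
  parent_id INTEGER NOT NULL REFERENCES parent(id)
);
-- The fix: an index on the referencing column, so the foreign-key
-- check on each DELETE becomes an index lookup instead of a scan
-- of the whole child table.
CREATE INDEX child_parent_id_idx ON child (parent_id);
""")
```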
Actually made it through a pretty big refactor of how the Guix Data Service handles database connections for HTTP requests: https://git.savannah.gnu.org/cgit/guix/data-service.git/commit/?id=c3c9c07f9a208633882a21004d30c5ee29026cb1
There should be less chance of the server being slow handling requests now, and some requests should even be faster due to extra parallelism.
I'm pretty sure I must have broken something somewhere though...