It has been a while since the last progress report, hasn't it?
Over the last few months I've been focusing on the MoarVM Heap Snapshot Profiler. The new format that I explained in the last post, "Intermediate Progress Report: Heap Snapshots", is available in the master branch of MoarVM, and it has learned a few new tricks, too.
The first thing I usually did when opening a Heap Snapshot in the heapanalyzer (the older command-line based one) was to select a Snapshot, ask for the "summary", and then for the "top objects by size", "top objects by count", "top frames by size", and/or "top frames by count" to see if anything immediately caught my eye. To make more sense of the results, I would repeat those commands for one or more other Snapshots.
Snapshot  Heap Size          Objects  Type Objects  STables  Frames  References
========  =================  =======  ============  =======  ======  ==========
0         46,229,818 bytes   331,212  686           687      1,285   1,146,426
25        63,471,658 bytes   475,587  995           996      2,832   1,889,612
50        82,407,275 bytes   625,958  1,320         1,321    6,176   2,741,066
75        97,860,712 bytes   754,075  1,415         1,416    6,967   3,436,141
100       113,398,840 bytes  883,405  1,507         1,508    7,837   4,187,184

Snapshot  Heap Size          Objects    Type Objects  STables  Frames  References
========  =================  =========  ============  =======  ======  ==========
125       130,799,241 bytes  1,028,928  1,631         1,632    9,254   5,036,284
150       145,781,617 bytes  1,155,887  1,684         1,685    9,774   5,809,084
175       162,018,588 bytes  1,293,439  1,791         1,792    10,887  6,602,449
Realizing that the most common use case should be simple to achieve, I first implemented the command "summary all" and later the command "summary every 10" to get the heapanalyzer to give the summaries of multiple Snapshots at once, and to be able to get summaries (relatively) quickly even if there are several hundred snapshots in one file.
Sadly, this still requires the parser to go through the entire file to do the counting and adding up. That's obviously not optimal: even though this is an Embarrassingly Parallel task that can use every CPU core in your machine, it's still a whole lot of work just for the summary.
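To illustrate why the summary work parallelizes so easily, here is a minimal Python sketch. It is not the heapanalyzer's code (which is not Python); the snapshot representation as a list of (kind, size) pairs and the names summarize and summarize_all are assumptions made up for this example. The point is only that each snapshot can be added up without looking at any other snapshot.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(snapshot):
    """Add up sizes and counts for ONE snapshot; touches no shared state."""
    return {
        "heap_size": sum(size for _kind, size in snapshot),
        "items": len(snapshot),
    }


def summarize_all(snapshots, workers=4):
    """Summarize every snapshot independently, keeping the input order.

    A thread pool keeps the sketch short; a real tool would pick whatever
    worker mechanism actually spreads the load over all CPU cores.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(summarize, snapshots))


if __name__ == "__main__":
    # Hypothetical toy data: three tiny "snapshots".
    snapshots = [[("object", 24), ("frame", 100)], [("object", 8)], []]
    for index, summary in enumerate(summarize_all(snapshots)):
        print(index, summary)
```

Even with all workers busy, every item of every snapshot still has to be read once, which is exactly the cost the paragraph above complains about.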
For this reason I decided to shift the responsibility for this task to MoarVM itself, to be done while the snapshot is taken. In order to record everything that goes into the Snapshot, MoarVM already differentiates between Object, Type Object, STable, and Frame, and it stores all references anyway. I figured it shouldn't have a performance impact to just add up the numbers and make them available in the file.
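The idea of adding up the numbers while the snapshot is being recorded can be sketched roughly like this. MoarVM itself is written in C, so this Python version is purely illustrative; the class SnapshotSummary, its record method and the kind strings are hypothetical names, not MoarVM's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class SnapshotSummary:
    """Running totals for one snapshot, updated as each item is recorded."""
    heap_size: int = 0
    objects: int = 0
    type_objects: int = 0
    stables: int = 0
    frames: int = 0
    references: int = 0

    def record(self, kind: str, size: int, num_refs: int) -> None:
        """Account for one heap item at the moment it is written out."""
        if kind == "object":
            self.objects += 1
        elif kind == "type_object":
            self.type_objects += 1
        elif kind == "stable":
            self.stables += 1
        elif kind == "frame":
            self.frames += 1
        else:
            raise ValueError(f"unknown kind: {kind!r}")
        self.heap_size += size
        self.references += num_refs


if __name__ == "__main__":
    summary = SnapshotSummary()
    summary.record("object", size=24, num_refs=2)
    summary.record("frame", size=100, num_refs=5)
    print(summary)
```

Since the recorder already has to know the kind, size and references of each item in order to write it, the extra cost per item is a handful of integer additions, and the finished totals can be stored in the file for the analyzer to read directly.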
The result is that the summary table as shown further above is available only milliseconds after loading the heap snapshot file, rather than after an explicit request and sometimes a lengthy wait period.
The next step was to see if "top objects by size" and friends could be made faster in a similar way.
I decided that adding an optional "statistics collection" feature inside of MoarVM's heap snapshot profiler would be worthwhile. If it turns out that the performance impact of summing up sizes and counts on a per-type and per-frame basis makes capturing a snapshot too slow, it could be turned off.
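As a rough sketch of what such an optional statistics collection might look like, here is a small Python example that sums sizes and counts per name and can be switched off. The class name HighscoreStats, the enabled flag and the method names are assumptions for illustration only and do not reflect MoarVM's real implementation.

```python
from collections import defaultdict


class HighscoreStats:
    """Per-type or per-frame totals, collected while a snapshot is taken."""

    def __init__(self, enabled=True):
        # Hypothetical switch: when False, recording costs next to nothing.
        self.enabled = enabled
        self.bytes_by_name = defaultdict(int)
        self.count_by_name = defaultdict(int)

    def record(self, name, size):
        """Add one object (or frame) of the given size to its bucket."""
        if not self.enabled:
            return
        self.bytes_by_name[name] += size
        self.count_by_name[name] += 1

    def top_by_size(self, n=15):
        """Names with the largest total size, biggest first."""
        ranked = sorted(self.bytes_by_name.items(),
                        key=lambda item: item[1], reverse=True)
        return ranked[:n]

    def top_by_count(self, n=15):
        """Names with the most instances, most frequent first."""
        ranked = sorted(self.count_by_name.items(),
                        key=lambda item: item[1], reverse=True)
        return ranked[:n]


if __name__ == "__main__":
    stats = HighscoreStats()
    # Made-up names and sizes, not real measurements.
    for name, size in [("TypeA", 40), ("TypeB", 100), ("TypeA", 40)]:
        stats.record(name, size)
    print(stats.top_by_size(2))
    print(stats.top_by_count(2))
```

With the totals already grouped by name, answering a "top ... by size" or "top ... by count" request only needs a sort over the buckets rather than a pass over every object in the snapshot.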
> snapshot 50
Loading that snapshot. Carry on...
> top frames by size
Wait a moment, while I finish loading the snapshot...

Name                                  Total Bytes
====================================  =============
finish_code_object (World.nqp:2532)   201,960 bytes
moarop_mapper (QAST.nqp:1764)         136,512 bytes
!protoregex (QRegex.nqp:1625)         71,760 bytes
new_type (Metamodel.nqp:1345)         40,704 bytes
statement (Perl6-Grammar.nqp:951)     35,640 bytes
termish (Perl6-Grammar.nqp:3641)      34,720 bytes
<anon> (Perl6-BOOTSTRAP.c.nqp:1382)   29,960 bytes
EXPR (Perl6-Grammar.nqp:3677)         27,200 bytes
<mainline> (Perl6-BOOTSTRAP.c.nqp:1)  26,496 bytes
<mainline> (NQPCORE.setting:1)        25,896 bytes
EXPR (NQPHLL.nqp:1186)                25,760 bytes
<anon> (<null>:1)                     25,272 bytes
declarator (Perl6-Grammar.nqp:2189)   23,520 bytes
<anon> (<null>:1)                     22,464 bytes
<anon> (<null>:1)                     22,464 bytes
Showing the top objects or frames for a single snapshot is fairly straightforward in the command-line UI, but how would you display how the value of a type or frame develops across many snapshots?
Instead of figuring out the best way to display this data in the commandline, I switched focus to the Moarperf Web Frontend. The most obvious way to display data like this is a Line Graph, I believe. So that's what we have now!
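Getting from per-snapshot "highscore" lists to something a line graph can draw is mostly a matter of pivoting the data: one series per type or frame, with one point per snapshot. The following Python sketch shows that reshaping step under assumed names (build_series, and a dict keyed by snapshot index); it is not Moarperf's actual code, and the numbers in the usage example are invented.

```python
def build_series(per_snapshot):
    """Pivot {snapshot: {name: value}} into {name: [(snapshot, value), ...]}.

    A name that is missing from a snapshot's list gets the value 0 there,
    so every line in the graph has a point for every snapshot.
    """
    snapshots = sorted(per_snapshot)
    names = set()
    for totals in per_snapshot.values():
        names.update(totals)
    return {
        name: [(snap, per_snapshot[snap].get(name, 0)) for snap in snapshots]
        for name in sorted(names)
    }


if __name__ == "__main__":
    # Invented totals for two hypothetical frames across three snapshots.
    per_snapshot = {
        0: {"frame_a": 100},
        25: {"frame_a": 150, "frame_b": 30},
        50: {"frame_b": 90},
    }
    for name, points in build_series(per_snapshot).items():
        print(name, points)
```

Treating a missing entry as 0 is a simplification: a name can also be absent merely because it fell off the bottom of that snapshot's list, so a real frontend may prefer to leave a gap in the line instead.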
And of course you also get to see the data from each snapshot's Summary in graph format:
And now for the reason behind this blog post's Title.
Using Jonathan's module Concurrent::Progress (with a slight modification), I sprinkled the code that parses a snapshot with matching calls of .increment. The resulting progress reports (throttled to at most one per second) are then forwarded to the browser via the WebSocket connection that already delivers many other bits of data.
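The throttling idea can be sketched independently of any particular library. The Python class below is a stand-in written for this example and is not the API of Concurrent::Progress; the names ThrottledProgress, send and min_interval are assumptions. It counts increments and passes a report on at most once per interval, while always letting the final report through.

```python
import time


class ThrottledProgress:
    """Count work items and report progress at most once per interval."""

    def __init__(self, target, send, min_interval=1.0, clock=time.monotonic):
        self.target = target              # total number of expected increments
        self.send = send                  # callback, e.g. one that writes to a socket
        self.min_interval = min_interval  # minimum seconds between reports
        self.clock = clock                # injectable so the class can be tested
        self.value = 0
        self._last_sent = None

    def increment(self, by=1):
        """Record finished work; forward a report only if one is due."""
        self.value += by
        now = self.clock()
        finished = self.value >= self.target
        due = (self._last_sent is None
               or now - self._last_sent >= self.min_interval)
        if finished or due:
            self._last_sent = now
            self.send(self.value, self.target)


if __name__ == "__main__":
    progress = ThrottledProgress(
        target=1000,
        send=lambda value, target: print(f"{value}/{target}"),
    )
    for _ in range(1000):
        progress.increment()
```

In the usage example the loop finishes almost instantly, so only the first and the final report are likely to be printed; in a long-running parse there would be roughly one report per second in between.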
The result can be seen in this tiny screencast:
The recording is rather choppy because the heapanalyzer code was squeezing every last drop of performance out of my CPU while it was also trying to capture my screen.
There's obviously a lot still missing from the heap snapshot analyzer frontend GUI, but I feel like this is a good start, and even provides useful features already. The graphs for the summary data are nicer to read than the table in the commandline UI, and it's only in this UI that you can get a graphical representation of the "highscore" lists.
I think a lot of the remaining features will already be useful after just the initial pieces are in place, so a little work should go a long way.
Bits and Bobs
I didn't spend the whole time between the last progress report and now working directly on the features shown here. Apart from Life Intervening™, I worked on fixing many frustrating bugs related to both of the profilers in MoarVM. I added a small subsystem I call VMEvents that allows user code to be notified about GC runs and other interesting bits from inside MoarVM itself. I've also been assisting other developers by answering questions and looking over their contributions. And of course there's the occasional video-game-development related experiment, for example with the GTK Live Coding Tool.
Finally, here's a nice little screencap of that same tool displaying a hilbert curve:
That's already everything I have for this time. A lot has happened (and had to happen) behind the scenes to get to this point, but now there is finally something to look at (and touch, if you grab the source code and go through the needlessly complicated build process yourself).
Thank you for reading and I hope to see you in the next one!