Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
00:00 timo left, TempIRCLogger joined, discord-raku-bot joined, timo joined 00:02 reportable6 left 00:03 reportable6 joined 01:03 quotable6 joined 02:17 rakuUser left 04:03 quotable6 left, notable6 left, nativecallable6 left, evalable6 left, statisfiable6 left, tellable6 left, squashable6 left, bloatable6 left, greppable6 left, unicodable6 left, linkable6 left, coverable6 left, committable6 left, benchable6 left, sourceable6 left, reportable6 left, bisectable6 left, releasable6 left, shareable6 left 04:04 evalable6 joined, benchable6 joined 04:05 committable6 joined, coverable6 joined, statisfiable6 joined, linkable6 joined 04:06 bloatable6 joined, unicodable6 joined, nativecallable6 joined, tellable6 joined, greppable6 joined, releasable6 joined 05:04 shareable6 joined 05:07 reportable6 joined 06:02 reportable6 left 06:03 reportable6 joined 06:04 notable6 joined 06:05 bisectable6 joined 06:06 squashable6 joined 06:09 squashable6 left 06:11 squashable6 joined
nine So, the current implementation of the fallback method on Junctions relies on AUTOTHREAD being kinda stupid and threading on every Junction in the arguments. 07:21
07:31 frost joined
nine The obvious solution is to just not use AUTOTHREAD there at all. We already know that the first argument is the Junction we want to thread over. So no need to look for it and no need to check the parameter's type. 08:02
This fixes the $junction.some-method infiniloop and improves performance of method calls on junctions.
08:06 sourceable6 joined
nine Alas...the real problem emerges: multis. The auto-generated protos claim Mu as argument types. Accordingly we don't actualy autothread, but call the multi with a Junction. Which then causes a bind failure which triggers autothreading and we end up in yet another infinite recursion 08:07
Btw. CALL-ME is actually even simpler than the fallback method. It's just call each of the eigenstates and collect their results into a new Junction. 08:37
By punting on the multi issue (i.e. not trying to be smarter with them and instead autothread always) I can get through make test and spectest without failures. First "All tests successful." new-disp has seen on spectest :) 08:38
Of course that's not a full solution. OTOH I'm pretty sure it's an improvement over what we have in master. 08:39
I fear for a full solution we actually need to replace AUTOTHREAD with a dispatcher. Or several dispatchers. Or even worse, integrate it into the existing ones. I don't see how else we could get out of the multi infiniloop 08:42
09:55 evalable6 left, linkable6 left 09:56 linkable6 joined
Geth MoarVM/new-disp: 478 commits pushed by (Jonathan Worthington)++, (Timo Paulssen)++, (Daniel Green)++, (Stefan Seifert)++, (Nicholas Clark)++
review: github.com/MoarVM/MoarVM/compare/a...735164299a
nine PSA: I have rebased and force pushed the new-disp branches, so we won't chase bugs that were alreay fixed on master again. Also a merge is coming nearer anyway. I have taken the opportunity to remove the commit/revert pairs. 10:00
timo your first message today means when we have a function that takes one Mu and one Any and both get a junction we will still thread both of them? 10:11
Nicholas nine: did you remove the revert pairs? 10:12
oh, wait, you did
timo it sounds like he did, from that last message
Nicholas I can't read far enough
fixed in post!
nine timo: yes (to the junction question) 10:13
timo last time i looked into junctions / dispatchers i was wondering about that exact thing, but i was assuming our current implementation already solved that somehow 10:15
nine Pushed my Junction fix to rakudo new-disp, fixing the last remaining spec test 10:16
timo i think there must be cases that we literally can't do correctly. like when there's one multi candidate that takes Mu, Any and one that takes Any, Mu
what can we even do in that case; disambiguate like in other cases where multiple multis match?
nine timo: well, the upside of this endeavour is, that now I know enough to have a sensible discussion :) 10:17
timo boop the user on the snoot and tell them "dont do that"?
nine Both sound sensible to me 10:18
10:28 frost left
nine m: for ^10 { Thread.start: -> { mkdir "/tmp/testdir"; CATCH { default { note $_ } } } } 10:30
camelia Failed to create directory '/tmp/testdir' with mode '0o777': Failed to mkdir: File exists
in block at <tmp> line 1

Failed to create directory '/tmp/testdir' with mode '0o777': Failed to mkdir: File exists
in block at <tmp> line 1
nine This ^^^ is actually a TOCTOU race condition in MoarVM but only on non-Windows 10:31
timo i think there is another T in that acronym 10:33
or i guess it doesn't have to be
nine Well, true. One could argue that it's not critical though :) 10:35
timo doesn't that want like a sleep or so in order not to quit too early? 10:36
nine Well, it managed to demonstrate the bogus "Failed to mkdir: File exists", so I claim it's good enough :) 10:37
timo we normally return success for mkdir when the folder already exists? 10:41
nine We do so on Windows and did so before commit 8e9afcf52e79ef58026fdf32bdc01078e6ca5afd Tue Jun 15 11:41:17 2021 +0700 10:42
did so everywhere else
10:42 linkable6 left 10:44 linkable6 joined
MasterDuke huh, i don't recall ever seeing that happen 10:45
huh, i don't recall ever seeing that happen 10:46
btw, if anybody feels like some light reading, i'm not sure why gist.github.com/MasterDuke17/07eaa...f898dda43c doesn't fix the example of `use MONKEY-SEE-NO-EVAL; use YAMLish; my $buf = buf8.allocate(100_000); my $raku = $buf.raku; my $yaml = save-yaml $buf; for ^5 { say $_; try EVAL $raku; load-yaml $yaml }` 10:47
Geth MoarVM/new-disp: ffd33a736c | (Stefan Seifert)++ | src/io/dirops.c
Fix TOCTTOU race condition in mkdir on non-Windows

mkdir should report successfully creating a directory if it already exists. We do so on Windows and before commit 8e9afcf52e79ef58026fdf32bdc01078e6ca5afd did so reliably everywhere else. However that commit introduced a race condition. Since we first check if the directory already exists and only if not try to create it, this leaves open a window of opportunity for some other process or ... (10 more lines)
10:55 evalable6 joined
timo MasterDuke: you're working on this with master branches or with new-disp? 10:57
MasterDuke master 11:05
11:06 quotable6 joined
timo what number does it usually reach before it crashes and burns? 11:11
MasterDuke never gets past 1 11:13
timo on new-disp it does 11:24
MasterDuke huh. with or without my patch? 11:27
dogbert11 does anyone else get a hang when running 'MVM_SPESH_NODELAY=1 ./rakudo-m -Ilib t/02-rakudo/99-misc.t' under new-disp? 11:37
timo without your patch 11:41
yeah it does look like it hangs
who wants to build the incredibly tricky to implement "dump backtraces for all threads in gdb" 11:45
nine What hangs is actually this: MVM_SPESH_NODELAY=1 gdb --args ./rakudo-m --profile -e 'my %functions = (1 => sub (@args) { [-] @args }, 4 => sub (@args) { [+] @args }); sub foo(@ast) { %functions{@ast[0]}(@ast.map: {$_ ~~ Array ?? foo $_ !! $_} ); }([4, [1, 1, 2]]) for ^250;' 11:47
Ah, it's pretty straight forward: MVM_profile_instrumented_start waits on tc->instance->cond_spesh_sync without marking the thread blocked, while the spesh thread is waiting for the other thread to join the GC run. 11:49
dogbert11 deadlock? 11:50
nine yes
Geth MoarVM/new-disp: 9429bdcb16 | (Stefan Seifert)++ | src/profiler/instrument.c
Fix a potential deadlock when starting to profile

MVM_profile_instrumented_start waited on tc->instance->cond_spesh_sync without marking the thread blocked. If the spesh thread was at that point waiting for the other thread to join the GC run, we would end up in a dead lock.
Fix by marking the thread blocked while waiting for the mutex and cond var.
12:02 reportable6 left
dogbert11 nine++, shouldn't we have seen this problem before? 12:02
or have we just been 'lucky' 12:03
Geth MoarVM/new-disp: a234d1dab0 | (Stefan Seifert)++ | src/profiler/instrument.c
Fix a potential deadlock when ending profiling

MVM_profile_instrumented_end waited on tc->instance->cond_spesh_sync without marking the thread blocked. If the spesh thread was at that point waiting for the other thread to join the GC run, we would end up in a dead lock.
Fix by marking the thread blocked while waiting for the mutex and cond var.
dogbert11 oh, the fixes are coming fast :) 12:05
nine dogbert11: I think we've just been lucky. After all the deadlock required a GC run just at the wrong time
timo any changes for like startup can make a noticeable difference to gc load 12:06
we can do a lot of startup with not a lot of GC by now, right?
nine no idea 12:07
dogbert11 I can't but wonder if it would be possible to sweep the codebase for problems like this 12:08
timo (gdb) print tc->nursery_alloc_limit - tc->nursery_tospace 12:09
$6 = 4194304
(gdb) print tc->nursery_alloc - tc->nursery_tospace
$7 = 4010880
this is from rakudo -e 'exit()'
so when we've parsed a tiny program and have compiled and run it, we're about 180 kbytes before the first GC run 12:10
the good thing about that is when the entire compiler and parser stuff has already become garbage by the time we're in user code, we'll probably have a very empty nursery after that
that's just theorycrafting, though 12:11
ok my program is now use nqp; nqp::force_gc(); sleep(5); and the numbers i get are 4194304 and 23128 12:22
so right after that forced gc the nursery is about half a percent used
m: say 0x03d20000 - 0x00405000 12:25
camelia 59879424
timo m: say (0x03d20000 - 0x00405000) / (1024 * 1024) 12:26
camelia 57.105469
timo (just the heap size of the process when it's just about finished)
i had that gdb python script that analyzes the heck out of the nursery and gen2 in terms of how many objects of what repr and type and such, but it was incredibly slow. maybe i'll rewrite it, or at least a part of it, in C at some point 12:33
lizmat where did it get its data from ? 12:35
timo it essentially walked all the slots in the gen2 and walked from object to object in the nursery 12:36
lizmat aah... /me missed the "gdb" part :-)
timo slot_type = int(STable['REPR_data'].cast(gdb.lookup_type("MVMArrayREPRData").pointer())['slot_type'])
*shudders* 12:37
the tricky bit about the gen2 is that we don't have the info about if a slot is occupied or not directly, but we do have a free list that points from one slot to the next 12:38
i guess we could check if the first 8 bytes of the slot is a pointer into one of the gen2 pages of the same size 12:39
my implementation back then would go collect all the pages, have a bit (well, a Bool) for every slot, then walk the entire free list and punches holes, and only after that will it go through all the objects that it now knows to be valid
not actually sure if this benefits literally anybody :) 12:55
13:05 reportable6 joined
timo i also had it display occupancy of gen2 pages so you'd get a little graphic that shows how holes are distributed 13:24
i mean, tracy has much prettier display for that, but still 13:28
MasterDuke nine++ also fixed that warning in the mkdir code 13:36
timo: interesting re new-disp not MVM_oops'ing 13:37
timo could be related to us not optimizing quite as much
i assume you've already checked like turning spesh pieces off 13:38
MasterDuke well, it happens with jit disabled. i think nine diagnosed it as pretty straightforward. two mutexes are locked, one of which is set as the ex_release_mutex. an exception is thrown (in bytecode validation iirc), the ex_release_mutex is unlocked, the other is gc'ed while still locked 13:42
timo why do we explode in bytecode validation in the first place 13:50
like, that's for when we're reading bad bytecode
nine MasterDuke: this doesn't look right: gist.github.com/MasterDuke17/07eaa...-patch-L63 13:55
MasterDuke m: use MONKEY-SEE-NO-EVAL; EVAL "say buf8.new({(^66_000).join(q|,|)})"
camelia Bytecode validation error at offset 52, instruction 10:
callsite expects 257 more positionals
in block <unit> at <tmp> line 1
nine MasterDuke: you're not releasing ex_release_mutex[0] if ex_release_mutex[1] is set?
MasterDuke right. i'm assuming they work like a stack 13:56
i.e., set ex_release; <code>; set ex_release; <code>; release; <code>; release 13:58
nine But if an exception is thrown, we unravel the whole C stack. So all of those ex_release_mutex should be released 13:59
Ah, not in the manually called MVM_tc_set_ex_release_mutex. 14:01
MasterDuke right, and there's a simple fix i'll just add, call MVM_tc_release_ex_release_mutex twice at src/core/exceptions.c:996
nine MasterDuke: but in src/core/exceptions.c:996
14:02 tealecloud left
MasterDuke i'm not sure why it doesn't happen on new-disp. something else must have changed, but did it actually fix the problem or just mask it somehow? 14:05
ugh. still `MoarVM panic: Tried to garbage-collect a locked mutex` on master even with the second MVM_tc_release_ex_release_mutex added 14:07
timo ah, the call to the , operator is too large 14:12
well, we did change how invocation works in new-disp
MasterDuke i didn't try that shortened example on new-disp, anyone have it built right now? 14:13
timo yes, give me a moment
Bytecode validation error at offset 54, instruction 9: 14:14
register operand index 54 out of range 0..3
in block <unit> at -e line 1
MasterDuke hm. a similar error, but not exactly the same 14:29
but shouldn't there still be a second mutex getting collected? 14:30
since there is in fact a bytecode validation error
timo probably if you turn on --full-cleanup haha 14:36
jnthnwrthngtn nine++ # passing spectest on new-disp \o/ 14:38
dogbert11 nine: are you still around 15:24
could this be a deadlock similar to the one you fixed a few hours ago? gist.github.com/dogbert17/8b4ce8bd...cf9f5d4be5 15:28
15:28 tealecloud joined
nine dogbert11: yes, indeed! 15:49
timo was clean spectests our last task before merging to master or do we switch over nativecall first? 15:59
nine I think Blin is the final frontier 16:00
timo ah, of course
16:24 tealecloud left
nine dogbert11: so, you know how to fix it :) 16:35
dogbert11 nine: yes, I believe so but it seems as if github has changed a bit, my username password no longer works when pushing, sigh 16:51
they talking some kind of token rubbish 16:53
Nicholas dogbert11: I've not read it properly yet, but they're talking about stuff here: github.blog/2021-09-01-improving-g...ty-github/ 17:11
dogbert11 Nicholas: thx, I've now managed to follow their instructions 17:21
Geth MoarVM: dogbert17++ created pull request #1536:
Fix another potential deadlock when starting to profile
dogbert11 nine: done, I shamelessly stole some of your text from a previous commit 17:33
18:02 reportable6 left
nine dogbert11: no worry. I stole from my first commit for the second one as well :D 18:11
Geth MoarVM/new-disp: d17593de7c | (Jan-Olof Hendig)++ | src/profiler/profile.c
Fix another potential deadlock when starting to profile

Very much like commit 9429bdc. The difference here was that MVM_profile_start waited for tc->instance->cond_spesh_sync without marking the thread blocked. A deadlock can occur if the spesh thread is at this point waiting for the other thread to join the GC run.
Fix by marking the thread blocked while waiting for the mutex and cond var.
MoarVM/new-disp: 0306ed31a8 | niner++ (committed using GitHub Web editor) | src/profiler/profile.c
Merge pull request #1536 from dogbert17/fix-possible-deadlock

Fix another potential deadlock when starting to profile
MasterDuke so i think jnthnwrthngtn has some inlining work planned, but other than that it's just a clean Blin run new-disp is waiting for? 18:37
19:04 reportable6 joined 20:21 discord-raku-bot left 20:22 discord-raku-bot joined
jnthnwrthngtn I'm not sure "just" is quite the right word for getting Blin clean enough to merge, but yes, what's left is 1) fix regressions, 2) get us able to inline Raku multi/method calls again 21:20
I think probably sometime soonish after the next release would be a good time, if we're looking good on those 21:21
Then we've a release cycle for feedback on anything really pressing
timo before inlining happens again, is benchmarking a silly idea? 21:26
maybe comparing current new-disp with master but inlining disabled? 21:27
jnthnwrthngtn I didn't restore spesh linking yet either, so even that's not quite comparable 21:29
Geth MoarVM/new-disp: 17c21493d4 | (Jonathan Worthington)++ | 2 files
Make some frame walker functions static inlines

During compilation, dynamic variable lookup consumes a lot of cycles, and the spesh frame walker used to implement it is thus very hot. In fact, it accounted for 9.21% of CPU instruction fetches during CORE.setting compilation. Wich this change, we save ~7 billion of those instruction fetches, lowering it to 8.78%.
jnthnwrthngtn So worth waiting for that at least, I'd say
timo holy wow 21:31
i'm very surprised that so many things aren't being inlined automatically already 21:32
jnthnwrthngtn Me too 21:38
Many of these are cross-compilation-unit
But I'd still expect LTO to do better 21:39
So far as dynamic variables go, I suspect we really do look up more of them now
And create pressure on caching of them
timo i don't remember what we recently added in terms of dynamics 21:40
jnthnwrthngtn $*MAST_FRAME is newly needed for numerous ops (hllize, istype, etc.)
timo there's not something we can do about that? 21:41
i imagine it's a dynamic var because we use the stack to have a stack of them while recursing? 21:42
so can't just use an attribute on some object
jnthnwrthngtn Well, we can keep a stack on the object or some such also 21:45
And cache the current one in an attribute 21:46
I think the compiler isn't even an instance at the moment
But we could make it one
timo have we run the dynamic variable caching analysis script in the last few years? 21:47
22:04 evalable6 left, linkable6 left 22:05 linkable6 joined 23:06 evalable6 joined