| IRC logs at
Set by AlexDaniel on 12 June 2018.
00:02 dogbert11 joined 00:06 dogbert17 left 01:39 Kaiepi left, Kaiepi joined 01:52 leont left 01:56 leont joined 02:38 Kaiepi left 02:39 Kaiepi joined 03:51 evalable6 left, unicodable6 left, bloatable6 left, shareable6 left, tellable6 left, linkable6 left, squashable6 left, committable6 left, releasable6 left, statisfiable6 left, notable6 left, sourceable6 left, nativecallable6 left, benchable6 left, quotable6 left, coverable6 left, bisectable6 left, greppable6 left, bisectable6 joined, releasable6 joined, greppable6 joined 03:52 tellable6 joined, committable6 joined, unicodable6 joined, bloatable6 joined, sourceable6 joined, statisfiable6 joined 03:53 linkable6 joined, coverable6 joined 03:54 notable6 joined, nativecallable6 joined, squashable6 joined, evalable6 joined, benchable6 joined, quotable6 joined, shareable6 joined 07:31 japhb left 07:53 japhb joined
MasterDuke nine: re i assume i need to introduce a mutex as part of moving them to the instance? 08:00
08:12 linkable6 left 08:14 linkable6 joined
nwc10 good *, #moarvm 08:14
japhb Good * 08:35
08:57 linkable6 left 09:00 linkable6 joined 09:56 Kaiepi left, Kaiepi joined
Geth MoarVM: 9b5d14c0c9 | (Daniel Green)++ | src/core/exceptions.c
Save malloc+free per frame when creating backtrace

By using a fixed size char array. The number of instructions reported by callgrind do vary a bit, even with `MVM_SPESH_BLOCKING=1`, but it seems to save ~1m instructions for `my $a; $a = do given "hi" { when 3 {
   "three" }; when 5 { "five" }; when "hi" { "hello" }; when * > 40 {
   "> 40" }; default { "default" } } for ^1_000; say $a`.
MoarVM: 63a69af5c2 | (Daniel Green)++ | 3 files
Use the FSA for MVMActiveHandlers

A lot of them get quickly allocated and freed when generating backtraces, so use the FSA to hopefully avoid the actual malloc/free.
MoarVM: 42959e4645 | MasterDuke17++ (committed using GitHub Web editor) | 3 files
Merge pull request #1457 from MasterDuke17/minor_optimizations_when_generating_backtraces
nine MasterDuke: I don't think so? 09:58
If anything moving them into the instance would make accessing them (a bit) safer.
MasterDuke nine: hm. a naive moving them causing a pretty quick segv 09:59 don't have time to debug now, but this causes a segv with `raku --full-cleanup -e ''` 10:01
10:14 linkable6 left 10:17 linkable6 joined 10:35 linkable6 left, linkable6 joined
dogbert11 just verified that is still present in case anyone is interested :) 10:41
dogbert11 worst sales pitch ever 10:42
nine dogbert11: how can I reproduce it? 11:07
dogbert11 nine: running it in a loop with ASAN and FSA_SIZE_DEBUG 1. It does not happen all the time. 11:10
nine Oh... at some point I removed the --asan from my debug-on script 11:15
dogbert11 this one is a bit tedious atm. I can try to narrow it down a bit, i.e. is it a specific test causing the problem. 11:17
there's also 11:19
it tends to crop up quite often when running the script in a loop 11:20
if we get a message like 'Cannot find method 'sink' on 'BOOTCode': no method cache and no .^find_method' does that mean that MoarVM has called MVM_exception_throw_adhoc? 11:27
MasterDuke MVM_exception_throw_adhoc_free in this case 11:28
dogbert11 MasterDuke: ah cool, so that's where a breakpoint should be placed then
11:38 linkable6 left, evalable6 left
nine dogbert11: I usually look for the error message ("no method cache" in this case) and set a breakpoint on that exact line. 11:38
Setting a break point on MVM_exception_throw_adhoc doesn't gain much since we do throw a lot of exceptions even in normal operations 11:39
11:39 evalable6 joined 11:40 linkable6 joined
dogbert11 nine: thx, the stacktrace looks like this 11:42
(gdb) bt
#0 MVM_6model_find_method (tc=0x55555555a180, obj=0x555558102cb0, name=0x55555889c860, res=0x5555582d7558, throw_if_not_found=1) at src/6model/6model.c:142
#1 0x00007ffff784e785 in MVM_interp_run (tc=0x55555555a180, initial_invoke=0x7ffff79cee08 <toplevel_initial_invoke>, invoke_data=0x5555556090e8, outer_runloop=0x0) at src/core/interp.c:1833
#2 0x00007ffff79cef8e in MVM_vm_run_file (instance=0x555555559660, filename=0x5555555595e0 "/home/dogbert/.rakudobrew/versions/moar-master/install/share/perl6/runtime/perl6.moarvm") at src/moar.c:504
#3 0x00005555555558e8 in main ()
nine Which unfortunately doesn't help much, since that's just how executing a findmeth op looks like 11:44
FWIW I've already sunk a couple hours on this issue. It's clearly caused by spesh. Yet it is also quite elusive. For example I've yet to see it fail with MVM_SPESH_INLINE_DISABLE=1 but that doesn't mean anything since pretty much everything you could do makes it go away 11:45
dogbert11 I guess dumping the bytecode doesn't help much either 11:48
11:49 frost-lab joined
dogbert11 it failed now and the bytecode looks the same as in the original report so that part is consistent at least 11:49
multitasking a bit, I got the nonblocking await bug again, it seemed to fail after test 21 had completed 11:57
yeah, test 22 (Server survived) might possibly be a suspect 11:58
12:01 linkable6 left
dogbert11 nine: I just made the following fail, 'while MVM_SPESH_INLINE_DISABLE=1 perl6 --ll-exception -Ilib repos/rakudo/t/spec/S17-supply/return-in-tap.t; do :; done' 12:01
12:04 linkable6 joined
nine Thanks for proving my point about how devious this bug is :) 12:05
dogbert11 any tips about how to proceed?
nine Anyway this weekend's debugging made me think of another thing: hash randomization. Maybe we can find a hash salt that makes it fail more consistently 12:06
12:06 linkable6 left
nine MVM_SPESH_BLOCKING + disabled hash randomization should lead to very consistent runs 12:06
dogbert11 and where do I turn off randomization
nine MVM_HASH_RANDOMIZE in src/moar.h 12:07
dogbert11 thx, will try to turn it off 12:08
12:08 linkable6 joined
nine Then there's the constant hash salt in src/core/str_hash_table.c:151 which by default is 0 but can be set to other values to try different hash orderings. 12:09
Actually needs an addition at line 172 to set control->salt in that case, too
12:10 linkable6 left
dogbert11 so the salt needs to be set in two places? 12:12
nine yes
12:13 linkable6 joined
dogbert11 btw, I managed to golf issue 1333 a bit and wrote a comment in the case 12:14
12:16 linkable6 left, linkable6 joined 12:35 linkable6 left 12:36 linkable6 joined 12:40 domidumont joined
nine dogbert11: I've tested salts from 0 to 200 with 10 runs each. Haven't seen a single failure yet :( 12:44
dogbert11 :( 12:49
is the hash randomization affecting spesh then? 12:50
13:05 frost-lab left
nine Not directly. It can affect order of operations where hashes are involved which leads to different order of spesh logs (as calls are made in different order) which can mean that e.g. there may or may not be a speshed candidate available for inlining when we spesh a caller 13:20
lizmat perhaps it would make sense to not use hashes there ? 13:30
or perhaps have an interface to some dual list structure that would look like a hash, but wouldn't actually hash but do a serial lookup and be thus repeatable ? 13:31
not sure what the keys on those hashes are
nine lizmat: where? 14:10
lizmat spesh ? 14:11
you're saying hashes are being used in spesh logs, and as such are a source of unrepeatability
nine Spesh isn't using hashes for anything. But we use hashes for lots of things in Raku. Which means that the order of operations of lots of things depends on hash order. The spesh log just reflects that order of operations. 14:12
lizmat aaah... ok
sorry, I've misunderstood then
nine np 14:13
lizmat fwiw, I've replaced a lot of hashes in the RakuAST grammar by actual classes / objects 14:14
well, no, actually, I've made it possible to do so :-)
they still need to be hacked in :-) 14:15
nine That must be a win :) 14:16
dogbert11 heh, this bug is annoying as hell 14:28
tried something different, i.e. 'while MVM_SPESH_LOG=1 perl6 --ll-exception -Ilib repos/rakudo/t/spec/S17-supply/return-in-tap.t; do :; done' 14:29
ok 1 - return in a Supply quit handler works fine
1..1 14:30
ok 1 - return in a Supply quit handler works fine
Segmentation fault (core dumped)
nine Did you by any chance catch that core dump? 14:31
And have a build with debug symbols?
dogbert11 I might, let me see 14:32
nine: 14:34
I suspect you'll be able to replicate the above quite easily 14:37
lizmat And another Rakudo Weekly News hits the Net: 14:43
dogbert11 lizmat++
14:47 zakharyas joined
nine dogbert11: actually, I have yet to see a segfault with this. At least I got the "Cannot find method 'sink'" for the first time today 15:21
Since even with MVM_SPESH_BLOCKING=1 and hash randomization disabled the bug is...well, quite random, my only other idea is uninitialized memory 15:22
Which fits the bug apparently going away with just different debugging/logging options set 15:23
dogbert11 nine: don't you even get any problems when running with the spesh log 15:35
16:00 Kaiepi left
nine It seems to fail more often with spesh log and less often with spesh blocking 16:16
dogbert11 but you get neither SEGV's nor output like "free(): corrupted unsorted chunks" 16:21
nine nope 16:33
dogbert11 bizarre 16:37
the SEGV's disappear if MVM_SPESH_BLOCKING is set or if MoarVM is compiled with --no-optimize 16:38
I'm running on 'Rakudo(tm) v2021.03-116-g0b91b21af. Built on MoarVM version 2021.03-24-g1538f98da.' 16:42
dogbert11 wipes and reinstalls rakudo 16:55
nine Disabling ASLR does not seem to make any difference 17:09
dogbert11 got the crash on my reinstalled rakudo 17:19
dogbert@dogbert-VirtualBox:~/repos/rakudo$ while MVM_SPESH_LOG=1 ./rakudo-m -Ilib t/spec/S17-supply/return-in-tap.t; do :; done 17:20
ok 1 - return in a Supply quit handler works fine
free(): corrupted unsorted chunks1..1
17:20 domidumont left
dogbert11 ok 1 - return in a Supply quit handler works fine 17:20
Segmentation fault (core dumped)
17:47 zakharyas left
nine huh... 17:47
Meanwhile I've tried if memory poisoning from GC_DEBUG=3 makes a difference: no failure so far... 17:48
17:56 harrow left
dogbert11 something is very fishy but I don't know what. 18:02
the SEGV's only show up if I enable the spesh log, weird 18:03
nine always the same backtrace? 18:05
18:09 Kaiepi joined 18:16 harrow joined
dogbert11 nine: sorry for the late reply, was eating dinner. The backtrace (on my vanilla Rafudo install look like this) 18:44
MasterDuke also mentions MVM_SPESH_LOG 18:56
19:03 zakharyas joined
dogbert11 MasterDuke: the backtrace in 1434 looks similar to the second one posted just above your comment 19:04
MasterDuke: does anything strange happen if you run 'while MVM_SPESH_LOG=1 ./rakudo-m -Ilib t/spec/S17-supply/return-in-tap.t; do :; done' 19:05
or is it only on my system
[Coke] ls 19:12
... ww
nine MasterDuke: how can I reproduce that one from 1434? 19:16
Ah, revert a5ca53a8b4bfd789ef5329218900737bd566a841 19:18
19:18 linkable6 left 19:19 linkable6 joined 19:21 linkable6 left 19:22 linkable6 joined 19:24 linkable6 left 19:25 linkable6 joined
nine And yet again...the fix is to simply run things in rr 19:26
19:26 linkable6 left
nine rr is the magic "make MoarVM perfectly stable" tool 19:26
19:27 linkable6 joined 19:28 linkable6 left 19:31 linkable6 joined 19:32 linkable6 left 19:35 linkable6 joined 19:49 linkable6 left, linkable6 joined 19:51 linkable6 left 19:52 linkable6 joined
nine After 295 runs with --chaos, i.e. random scheduling decisions, it did end in a segfault. But a different one 19:53
dogbert11 sounds like a hard won success 19:55
19:55 linkable6 left
nine My segfault is deep inside vprintf when printing what appears to be a rather mundane guard_tree string 19:56
19:57 linkable6 joined
dogbert11 I've seen that as well 20:00
#3 _IO_new_file_xsputn (f=0x55a9b753a9d0, data=<optimized out>, n=7) at fileops.c:1197 20:02
#4 0x00007fe5cafa6af2 in __vfprintf_internal (s=0x55a9b753a9d0, format=0x7fe5cb4adcc9 "After:\n%s", ap=0x7fe5ca94d970, mode_flags=2) at ../libio/libioP.h:948
#5 0x00007fe5cb3ec5fe in MVM_spesh_debug_printf () from //home/dogbert/repos/rakudo/install/lib/
20:09 linkable6 left 20:10 linkable6 joined 20:11 linkable6 left 20:12 linkable6 joined
nine Ok, vfprintf is segfaulting because we already closed the spesh log file handle in MVM_vm_exit. It's just sloppy cleanup 20:12
20:13 linkable6 left 20:14 linkable6 joined
Geth MoarVM/fix_segfault_on_exit_with_spesh_log: af5b7e4790 | (Stefan Seifert)++ | src/moar.c
Fix possible segfault on exit when using spesh log

A normal exit without --full-cleanup did not stop the spesh thread. So spesh may actually still be active and try to print things to the spesh log when MVM_vm_exit closes the spesh log file handle. This would lead to a segfault in vfprintf. Fix by stoping and joining the spesh thread even in MVM_vm_exit if spesh logging is active.
MoarVM: niner++ created pull request #1458:
Fix possible segfault on exit when using spesh log
nine MasterDuke, dogbert11: please try ^^^
20:22 linkable6 left
nine Lesson of the day: in multi threaded programs, always post backtraces of all threads 20:23
20:25 linkable6 joined
nine dogbert11: regarding issue 1333 try switching lines 885 and 886 in src/core/frame.c, i.e. set returner->extra = NULL before freeing it. 20:30
20:32 zakharyas left 20:36 linkable6 left 20:38 linkable6 joined 20:47 linkable6 left 20:50 linkable6 joined
MasterDuke nine: was triggered by the now removed t/09-moar/10-inline-array-pos.t 20:55
[Coke] updated listing in, broke it out by merged vs. unmerged. shows how old, last committer... 20:56
If someone OKs it, I can delete all the merged branches on the list. 20:57
weird, there's a branch that travis committed to?
20:58 linkable6 left 21:00 linkable6 joined
[Coke] nwc10: - any idea what platform that might be? 21:03
21:03 linkable6 left 21:05 linkable6 joined
MasterDuke nine: nm, i see you mention reverting the commit that removed them. and yes, after restoring the tests, master segfaults pretty quickly, your branch hasn't segfaulted after a while longer 21:06
[Coke]: fwiw, might be OBE if/when is ever merged 21:13
21:16 linkable6 left
dogbert11 looks like nine++ solved the speshlog problem 21:16
21:17 linkable6 joined 22:02 dogbert17 joined 22:05 dogbert11 left