00:05 AlexDaniel joined, AlexDaniel left, AlexDaniel joined 01:10 evalable6 left 01:12 evalable6 joined 07:24 MasterDuke joined 07:28 MasterDuke left
nine I don't get it. Why does a spesh log of a program run with MVM_SPESH_BLOCKING=1 MVM_SPESH_NODELAY=1 and disabled hash randomization (by fixing tc->hashSecret) record specializations of different methods? 08:11
What other source of randomness could there be? 08:12
The program is also not threaded
08:39 domidumont joined 08:47 chloekek joined 09:10 domidumont left 10:25 evalable6 left 10:27 evalable6 joined
timotimo where does the difference begin? 10:30
in the spesh log i mean 10:32
in the statistics, in the plan, or in the speshing?
nine The differences in the 6.5 million line spesh log seem to start around line 185K with a different hit count for gen/moar/Perl6-BOOTSTRAP.c.nqp:4096 10:52
timotimo telemeh has a few messages related to spesh 11:12
it could be a difference when exactly a spesh log gets freed for additional logging?
i.e. maybe the region that's locked is just a little too short? 11:29
12:40 evalable6 left, evalable6 joined 13:43 squashable6 joined 14:32 lucasb joined 15:01 patrickb joined 15:07 sena_kun joined 15:08 sena_kun left 15:09 sena_kun joined 15:10 squashable6 left, squashable6 joined 17:35 domidumont joined, domidumont left 19:39 brrt joined
brrt \o 19:41
timotimo brrt: you can already put devirtualization in because it's a tree-based optimization, right? and we will end up having two optimizers in the jit? 19:42
brrt ohai timotimo 19:46
devirtualization can be done
but I'm not sure what you mean with two optimizers? 19:48
timotimo one working on a tree form, one working on a linear IR?
brrt no, there'll be one, working on a linear IR
timotimo oh 19:49
so how is the devirtualization done?
brrt 'cause I think that to be safe, you need to know the order and control flow dependency, and the tree just doesn't encode that
or at least not in a way that's directly done
heh, so, by smoke and mirrors 19:50
what I'm working on today, is in fact using the tree IR
but what I intend to do, is to have the devirtualization use templates 19:51
timotimo oh?
then how will double-devirt work? :P
brrt wonders a bit how to explain the plan
what do you mean by double-devirt?
timotimo maybe you just implement it and i'll never have to ask about it again :D
brrt on the other hand, maybe by asking questions you'll force me to clarify my own muddy thinking 19:52
so the idea I had in mind was this:
- I add a REPR method 'jit' that takes the expr tree, instruction, graph, and input operand buffer
This allows us to do something clever and 'open' for instance for array-push, array-get, etc 19:53
timotimo "open"? 19:54
brrt an open interface, i.e. one that allows the REPR to do arbitrary things, open a shell etc
timotimo ah 19:55
launch nethack
brrt but I was mostly thinking of using it to have specific codegen
19:55 sena_kun left
brrt - Then, implement a 'default devirtualization' routine, that would do pretty much the same as what devirt in graph.c does, i.e. it builds the correct call tree for a given pointer 19:56
However, the intention would be for that to use a 'named template', a template that the precompiler builds but isn't associated with an opcode
and it'd be applied via the template application mechanism 19:57
so from the point of view of the devirtualization routine, the template itself is opaque
*then*, change the expression IR so that we build a linear IR from templates 19:58
(and then, when that is done, build out the optimizer)
ehm... does that sound like a plan?
timotimo the individual devirtualizations can be rather different for the different opcodes; any way to make that less annoying? 19:59
brrt I hope templates are not terribly inconvenient 20:00
timotimo we'll be writing them in a .expr file, right? how do we "target" them in the devirtualization routines?
brrt the template compiler will generate an enum or a #define that will have the right name 20:02
timotimo OK
we'll end up with a switch/case in the default devirt repr implementation that points at each generated name in turn? 20:03
brrt something like that yes 20:04
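The generated-name-plus-switch idea discussed above can be sketched roughly as follows. This is only an illustration: the enum members, the opcode names, and `default_devirt_template` are hypothetical stand-ins, not identifiers from the actual template compiler or from graph.c.

```c
#include <stddef.h>

/* Hypothetical: names the template compiler might emit into a generated
 * header, one per named template (not tied to any opcode). */
typedef enum {
    TEMPLATE_NONE = 0,
    TEMPLATE_DEVIRT_ATTR_GET,
    TEMPLATE_DEVIRT_POS_PUSH,
    TEMPLATE_DEVIRT_POS_GET
} NamedTemplate;

/* Hypothetical stand-ins for the opcodes being devirtualized. */
typedef enum { OP_GETATTR, OP_PUSH, OP_ATPOS, OP_OTHER } Opcode;

/* Default devirtualization: pick the named template for an opcode.
 * TEMPLATE_NONE means "no devirtualization available". */
NamedTemplate default_devirt_template(Opcode op) {
    switch (op) {
    case OP_GETATTR: return TEMPLATE_DEVIRT_ATTR_GET;
    case OP_PUSH:    return TEMPLATE_DEVIRT_POS_PUSH;
    case OP_ATPOS:   return TEMPLATE_DEVIRT_POS_GET;
    default:         return TEMPLATE_NONE;
    }
}
```

A REPR with its own 'jit' method could bypass this default entirely and emit specialised nodes instead.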
timotimo and there's a simple way to plop a function's address in the template, i assume 20:06
brrt oh, sure, just a CONST_PTR node 20:09
timotimo mhm mhm 20:10
i don't know how such a value gets moved from C to the template
or does the #define of the template get included right there and can just access the variable in question? 20:11
i.e. is the template compiler a little like dynasm?
brrt not so much like dynasm, because dynasm is integrated into the typical 'body' of the C file, whereas the template compiler just generates a header 20:13
but otherwise, yes, you'd include the generated header and it'd include the array
but the C devirt function would push a CONST_PTR node to the argument list for the template, and then the template would refer to that as argument to the CALL operator 20:14
timotimo ah, argument lists 20:15
that makes sense
brrt the 'regular' expression templates also (implicitly) get arguments 20:16
so it'd be mostly the same mechanism... 20:17
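A toy model of the argument-list mechanism described above, under heavy assumptions: the `Tree`, `Node`, and function names here are invented for illustration, and the real expression tree is not laid out this way. The point is only that the C side pushes a CONST_PTR node, and the template refers to it by argument slot when building the CALL.

```c
#include <stdint.h>
#include <stddef.h>

typedef enum { NODE_CONST_PTR, NODE_CALL } NodeKind;

typedef struct {
    NodeKind  kind;
    uintptr_t ptr;    /* payload for CONST_PTR nodes */
    int       callee; /* for CALL: index of the node holding the address */
} Node;

#define TREE_MAX 16
typedef struct { Node nodes[TREE_MAX]; int count; } Tree;

/* Append a node; returns its index, or -1 if the tree is full. */
int tree_push(Tree *t, Node n) {
    if (t->count >= TREE_MAX) return -1;
    t->nodes[t->count] = n;
    return t->count++;
}

/* Hypothetical "template": builds a CALL whose callee is whatever node
 * sits in the given argument slot. The template never sees the pointer. */
int apply_call_template(Tree *t, const int *args, int nargs, int callee_slot) {
    if (callee_slot < 0 || callee_slot >= nargs) return -1;
    Node call = { NODE_CALL, 0, args[callee_slot] };
    return tree_push(t, call);
}

/* Hypothetical devirt routine: push the function address as a CONST_PTR
 * node, then hand it to the template through the argument list. */
int devirt_call(Tree *t, void (*fn)(void)) {
    Node c = { NODE_CONST_PTR, (uintptr_t)fn, -1 };
    int args[1];
    args[0] = tree_push(t, c);
    if (args[0] < 0) return -1;
    return apply_call_template(t, args, 1, 0);
}

/* Dummy target so callers have an address to pass in. */
void example_target(void) {}
```

In this sketch the template stays opaque to the devirt routine and vice versa: the only shared contract is which argument slot holds the callee.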
current problem is that I tried to 'optimize' for cheap cleanup when we can't find a template 20:28
timotimo is that more than just using the area allocator? 20:29
brrt expr jit mostly isn't using the spesh allocator 20:30
that's because the internal structures use realloc a lot, and realloc is wasteful for the spesh allocator 20:31
timotimo ah, dang
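To make the realloc point concrete, here is a minimal sketch assuming the spesh allocator behaves like a simple bump allocator that cannot free individual blocks. The `Arena` type and all function names are hypothetical and do not reflect MoarVM's actual allocator code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bump allocator: hands out memory front to back and never
 * frees a single block, only the whole arena at once. */
typedef struct { unsigned char buf[4096]; size_t used; } Arena;

void *arena_alloc(Arena *a, size_t n) {
    if (n > sizeof a->buf - a->used) return NULL;
    void *p = a->buf + a->used;
    a->used += n;
    return p;
}

/* "realloc" on such an arena: allocate a new block and copy. The old
 * block cannot be returned, so it becomes dead space until teardown. */
void *arena_grow(Arena *a, void *old, size_t old_n, size_t new_n) {
    void *p = arena_alloc(a, new_n);
    if (p && old) memcpy(p, old, old_n);
    return p;
}

/* Grow a buffer by doubling from 16 bytes up to `target` and report how
 * many arena bytes were consumed in total. */
size_t arena_bytes_after_doubling(size_t target) {
    Arena a = { {0}, 0 };
    size_t cap = 16;
    void *p = arena_alloc(&a, cap);
    while (p && cap < target) {
        p = arena_grow(&a, p, cap, cap * 2);
        cap *= 2;
    }
    return a.used;
}
```

Growing to 1024 bytes this way consumes 16 + 32 + ... + 1024 = 2032 arena bytes for 1024 live bytes, so roughly half the space is dead copies.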
brrt well, I have no idea just how premature that optimization really is. And in the happy case, we *do* allocate a tree 20:32
timotimo we should be able to measure stuff :) 20:33
brrt yep :-) 20:35
but... tbh I have no idea how to start measuring the overhead of malloc
timotimo hmm 20:39
malloc has the "benefit" that the rest of the program can swoop in to use any holes
also, jemalloc and blah 20:40
brrt it's funny, actually 20:42
in that there's a certain class of programmers that really doesn't want to use garbage-collected languages, because these are complex runtime systems with complex performance characteristics
but malloc also is a pretty complex runtime system with complex performance characteristics, especially in long-running programs 20:43
but nobody minds using that
well, almost nobody
hmm, I just realized that there is a 'side effect' bit in the expression templates, that isn't strictly necessary, because we kind of know which operators yield values and which do not 20:44
timotimo i know that game developers do a lot with flyweights, if i remember the right term 20:45
i.e. instead of having a whole bunch of individual instances, you have one array with all x coordinates, one with all y coordinates etc? 20:46
brrt heh, that sounds a lot like columnar data stores
timotimo aye
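The layout described above (one array per field rather than one struct per instance) is commonly called structure-of-arrays. A small sketch in C, with all type and function names made up for illustration:

```c
#include <stddef.h>

#define MAX_ENTITIES 4

/* Array-of-structures: one struct per entity (shown only for contrast). */
typedef struct { float x, y; } EntityAoS;

/* Structure-of-arrays: one array per field, columnar like a data store. */
typedef struct {
    float  x[MAX_ENTITIES];
    float  y[MAX_ENTITIES];
    size_t count;
} EntitiesSoA;

/* Touches only the x column; the y values are never loaded. */
void move_all_x(EntitiesSoA *e, float dx) {
    for (size_t i = 0; i < e->count; i++)
        e->x[i] += dx;
}
```

Because a pass over one field reads a single contiguous array, this layout tends to be friendlier to caches and to vectorisation than walking an array of structs.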
brrt game developers arguably have a decent reason not to want GC 20:47
then again, minecraft is a huge success
timotimo just recently i saw flecs on reddit, it's an entity component system in C99 (or was it C11)
minecraft's performance was notoriously bad
there was this project called "optifine" that optimized it somewhat
while putting in shiny
brrt I never played it tbh 20:48
timotimo i got on it surprisingly early
i didn't play terribly much, though
brrt and I seem to recall that microsoft wanted to rewrite it in C++
timotimo i believe they did
did you see what they're doing with minecraft and hololens? 20:49
brrt did not
timotimo essentially putting the game world, including other players' characters, in the real world, anchored on a table or something
just the typical AR thing
and i think something with voice commands? 20:50
brrt heh, that sounds pretty funny actually
brrt is going to sleep... speak you later
timotimo i want to write an ECS in perl6, base it off of flecs' design, and call it pecs
good night brrt!
20:55 brrt left
timotimo oh damn 21:09
my registration date is in my mojang account (minecraft)
21:58 patrickb left 22:50 AlexDaniel` joined 22:52 chloekek left 23:31 lucasb left