00:00:46 -!- sparr has quit (Changing host).
00:00:46 -!- sparr has joined.
00:01:43 <esowiki> [[ACL]] https://esolangs.org/w/index.php?diff=64820&oldid=64819 * Hanzlu * (+145)
00:10:34 -!- arseniiv has quit (Ping timeout: 246 seconds).
00:15:58 <esowiki> [[ACL]] https://esolangs.org/w/index.php?diff=64821&oldid=64820 * Hanzlu * (+143)
00:27:24 -!- xkapastel has quit (Quit: Connection closed for inactivity).
00:45:49 -!- xkapastel has joined.
01:11:27 -!- FreeFull has joined.
02:03:55 <tswett[m]> Awrighty, I think I've decided how I want to implement Forth for my weird PC project.
02:04:03 <tswett[m]> Uh, I've decided *some* of it, anyway.
02:04:14 <tswett[m]> Here's what my keyboard handler looks like currently...
02:04:20 <tswett[m]> 29 DB B8 10 01 8E D8 B8 00 B8 8E C0 29 C0 E4 60 D7 B4 07 AB B0 20 E6 20 CF
02:05:04 <tswett[m]> sub bx, bx; mov ax, 0110; mov ds, ax; mov ax, b800; mov es, ax; sub ax, ax; in al, 60; xlat; mov ah, 07; stosw; mov al, 20; out 20, al; iret
02:05:19 <tswett[m]> The new keyboard handler is going to be:
02:05:20 -!- adu_ has joined.
02:05:26 <tswett[m]> b800 set-es 60 input-byte 500 xlat 700 or 404 load store-es 404 load inc inc 404 store 20 20 output-byte iret
02:05:30 <tswett[m]> Which is obviously so much more readable.
02:05:43 -!- adu has quit (Ping timeout: 246 seconds).
02:05:43 -!- adu_ has changed nick to adu.
02:06:07 <tswett[m]> And which compiles to 107 bytes instead of the 25 bytes of the original. :D
02:06:38 <tswett[m]> Now I just merely have to write the compiler.
02:11:11 <kmc> how do you know it's 107 bytes if you haven't written the compiler yet?
02:11:50 <tswett[m]> Well, the compiler is just going to take each of those words, look up the corresponding piece of assembly code, and write that assembly code into the compiled output.
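[editor's note: a minimal sketch of the word-to-snippet compiler tswett[m] describes, where each word is looked up in a table and its canned machine-code bytes are appended to the output. The dictionary below is illustrative: 0xCF really is iret, the rest is made up.]

    #include <stddef.h>
    #include <string.h>

    struct word { const char *name; const unsigned char *code; size_t len; };

    static const unsigned char CODE_IRET[]  = { 0xCF };        /* iret        */
    static const unsigned char CODE_DUMMY[] = { 0x90, 0x90 };  /* placeholder */

    static const struct word dict[] = {
        { "iret",  CODE_IRET,  sizeof CODE_IRET  },
        { "dummy", CODE_DUMMY, sizeof CODE_DUMMY },
    };

    /* Append the canned bytes for one word; returns bytes written (0 = unknown word). */
    static size_t compile_word(const char *w, unsigned char *out) {
        for (size_t i = 0; i < sizeof dict / sizeof dict[0]; i++)
            if (strcmp(w, dict[i].name) == 0) {
                memcpy(out, dict[i].code, dict[i].len);
                return dict[i].len;
            }
        return 0;
    }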
02:17:19 <kmc> sounds like you need an optimizing compiler :)
02:29:54 <int-e> tswett[m]: wah, you have to restore those segment registers!
02:30:15 <int-e> (and the other ones as well)
02:31:17 <tswett[m]> Oh yeah, that's probably a good idea.
02:33:32 <tswett[m]> I guess I'm assuming that no code will be running when an interrupt happens.
02:33:43 <tswett[m]> Which... is an assumption that may prove to be false. :D
02:34:55 <int-e> Oh the interrupt handler will be fine... only other programs will suffer ;)
02:38:05 <tswett[m]> I have another interrupt handler that ends with "eb fe", which is a jump to itself.
02:45:35 <shachaf> I wrote a little assembler the other day, it's great.
02:46:08 <shachaf> I probably won't bother with multipass assembly to get the smallest instruction sizes.
03:34:28 -!- nfd9001 has quit (Ping timeout: 276 seconds).
04:17:23 -!- xkapastel has quit (Quit: Connection closed for inactivity).
04:18:57 -!- FreeFull has quit.
04:47:14 <esowiki> [[What Mains Numbers?]] https://esolangs.org/w/index.php?diff=64822&oldid=64813 * A * (+1498) /* The implementation */
04:49:00 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64823&oldid=64822 * A * (-1363) /* The implementation */
04:51:29 <HackEso> 526) <elliott> Second Life is like... real life, modelled by people who've READ about real life, you know, in books.
04:51:36 <HackEso> 1199) <b_jonas> oerjan: the original purpose was to make a language in which I write ugly source code, and it's compiled to readable standard ml and readable prolog code; but I sort of ran out of time and the readable part got dropped so now the compiled code is even more ugly than the original
04:52:35 -!- Melvar has quit (Quit: WeeChat 2.4).
04:53:33 <shachaf> Are there uses for the REX byte without any of W,R,X,B?
05:07:52 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64824&oldid=64823 * A * (+0) /* The implementation */
05:08:16 -!- doesthiswork has quit (Ping timeout: 258 seconds).
05:08:17 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64825&oldid=64824 * A * (+2) /* What Mains Numbers? */
05:11:09 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64826&oldid=64825 * A * (+33) /* The implementation */
05:19:31 -!- Melvar has joined.
05:28:39 <esowiki> [[What Mains Numbers?]] https://esolangs.org/w/index.php?diff=64827&oldid=64826 * A * (-1) /* The implementation */
05:31:44 <esowiki> [[What Mains Numbers?]] https://esolangs.org/w/index.php?diff=64828&oldid=64827 * A * (+0) /* The implementation */
05:46:25 -!- Sgeo__ has joined.
05:49:28 -!- Sgeo_ has quit (Ping timeout: 244 seconds).
06:57:28 <shachaf> `` doag quotes | grep 'standard definition'
06:57:30 <HackEso> 9233:2016-10-11 <shachäf> addquote <ais523> (btw, "q = 1-p" should be the standard definition of q, IMO)
06:59:03 <esowiki> [[Pxem]] https://esolangs.org/w/index.php?diff=64829&oldid=64735 * YamTokTpaFa * (+27) /* References */
07:01:16 <esowiki> [[User:YamTokTpaFa]] https://esolangs.org/w/index.php?diff=64830&oldid=62025 * YamTokTpaFa * (+22)
07:03:50 <esowiki> [[User:YamTokTpaFa/sandbox4]] N https://esolangs.org/w/index.php?oldid=64831 * YamTokTpaFa * (+33) Created page with "'''Pxemf'''(Pronunciation: ) is"
07:09:01 <esowiki> [[Talk:ACL]] N https://esolangs.org/w/index.php?oldid=64832 * JonoCode9374 * (+290) Created page with "= Python Interpreter = Is it alright if I make a direct port of the Java interpreter in Python? Because I am unable to access anything that could compile Java, but I do have..."
07:10:24 <esowiki> [[Talk:ACL]] M https://esolangs.org/w/index.php?diff=64833&oldid=64832 * JonoCode9374 * (+29) /* Python Interpreter */
07:12:11 <esowiki> [[User:JonoCode9374]] https://esolangs.org/w/index.php?diff=64834&oldid=63904 * JonoCode9374 * (+90)
07:28:17 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64835&oldid=64828 * A * (+61) /* Example programs */
07:28:25 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64836&oldid=64835 * A * (+1) /* = Hello, world! program */
07:30:01 -!- cpressey has joined.
07:31:49 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64837&oldid=64836 * A * (+6) /* The implementation */
07:36:40 <cpressey> Good morning. It occurs to me that "ghci" is an acronym for "Glasgow Haskell Compiler Interpreter". I approve of this. The world needs more compiler interpreters.
07:36:44 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64838&oldid=64837 * A * (+1492) No it is, you probably don't know JavaScript.
07:45:12 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64839&oldid=64838 * A * (+166) /* Infinite loop */
07:57:03 <int-e> Oh. I finally got the etymology of "bison".
07:57:33 <int-e> And I wish I didn't. Puns are only good when they're your own ;-)
08:00:32 <esowiki> [[What Mains Numbers?]] https://esolangs.org/w/index.php?diff=64840&oldid=64839 * A * (+623) /* Infinite loop */
08:02:27 -!- Lord_of_Life has quit (Ping timeout: 268 seconds).
08:03:52 <esowiki> [[What Mains Numbers?]] https://esolangs.org/w/index.php?diff=64841&oldid=64840 * A * (+303) /* Infinite loop */
08:05:10 -!- Lord_of_Life has joined.
08:13:50 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64842&oldid=64841 * A * (+224) /* Infinite loop */ I should have explained it further when Ais523 is a JavaScript beginner. (But indeed it is obvious enough.)
08:16:09 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64843&oldid=64842 * A * (+35) /* Infinite loop */ Minor improvement
08:16:46 <esowiki> [[What Mains Numbers?]] https://esolangs.org/w/index.php?diff=64844&oldid=64843 * A * (+29) /* Infinite loop */
08:22:12 <esowiki> [[Talk:ACL]] M https://esolangs.org/w/index.php?diff=64845&oldid=64833 * A * (+84) Stub reply
08:25:52 -!- GeekDude has quit (Ping timeout: 272 seconds).
08:28:30 -!- GeekDude has joined.
08:43:53 -!- wob_jonas has joined.
08:48:42 <wob_jonas> "<shachaf> Are there uses for the REX byte without any of W,R,X,B?" => rarely. you can use them to encode the DIL, SIL, BPL, SPL byte register operands
08:54:35 <wob_jonas> Those registers don't exist in x86_16 or x86_32, but then you rarely need them; you can do almost everything with either the other byte registers or full-word operations
08:56:32 <int-e> shachaf: I miss the one byte inc/dec register operations :P
08:56:39 <shachaf> I see. Without the rex prefix it encodes the upper half of the [abcd]x registers.
02:56:59 <int-e> (Not really. But that's where they stole those prefixes from.)
09:00:28 <shachaf> So 010x is inc and 011x is dec.
09:01:53 <shachaf> And in long mode inc %eax is encoded with... ff?
09:03:05 <int-e> (I don't know what you mean by 010x and 011x though)
09:03:10 <shachaf> Oh, which was already inc r/m
09:03:26 <shachaf> I mean octal 0100|reg and 0110|reg
09:03:48 <int-e> right, makes sense
09:04:00 <shachaf> octal is the way to go, it's great
09:04:40 <shachaf> The mod r/m byte is great. 03xy encodes register x and register y, and so on.
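[editor's note: a quick sketch of the octal reading being used here, assuming plain 32-bit encodings; in long mode the 0100-0117 bytes become REX prefixes, as discussed above.]

    #include <stdio.h>

    /* Register-direct ModRM byte: octal 3xy, reg field x, r/m field y. */
    static unsigned char modrm_reg(unsigned reg, unsigned rm) {
        return (unsigned char)(0300 | (reg << 3) | rm);
    }

    int main(void) {
        printf("inc eax: %03o\n", (unsigned)(0100 | 0));   /* 0x40, octal 0100|reg */
        printf("dec ecx: %03o\n", (unsigned)(0110 | 1));   /* 0x49, octal 0110|reg */
        /* opcode 01 /r (add r/m32, r32): reg=edx(2), rm=ecx(1) -> add ecx, edx */
        printf("add ecx, edx ModRM: %03o\n", (unsigned)modrm_reg(2, 1));
        return 0;
    }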
09:24:37 -!- ais523 has joined.
09:25:06 <ais523> well, I wrote my slightly crazy malloc: http://nethack4.org/pastebin/animalloc.tgz
09:25:39 <ais523> it compiles to a .so file that you can use with LD_PRELOAD
09:26:33 <ais523> I've been trying it on various programs, it seems to work (in particular, a full build+test of C-INTERCAL works, with malloc replacements in all the programs invoked as part of that, including make, gcc, and friends)
09:33:51 <ais523> shachaf: it's based around asking the kernel for a huge amount of address space and then using pagefaults for the actual allocation
09:34:37 <ais523> it also uses a lock-free algorithm (specifically, a lock-free stack) to make it async-signal safe, because I hate the fact that standard malloc isn't
09:35:18 <ais523> there's also some code in there to fill freed memory with 0xAA, I should probably remove that because this really isn't a malloc about safety
09:35:59 <ais523> OK, updated with that line taken out
09:36:35 <ais523> note that if you try to allocate too much memory and then use it, you'll likely get a segfault, but that's no different from standard mallocs on Linux
09:36:53 <ais523> (also, this assumes Linux and a 64-bit processor; it doesn't assume x86_64 specifically, although that's what most people are likely to use)
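[editor's note: a minimal sketch of the reserve-then-fault idea ais523 describes, not animalloc itself. It assumes a 64-bit Linux target, as stated above; MAP_NORESERVE and the sizes are illustration only.]

    #include <stddef.h>
    #include <sys/mman.h>

    #define ARENA_BYTES (1ULL << 40)   /* 1 TiB of address space, not of RAM */

    static unsigned char *arena;
    static size_t arena_used;

    /* Reserve a huge range up front; pages are only backed when first touched. */
    static int arena_init(void) {
        arena = mmap(NULL, ARENA_BYTES, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        return arena == MAP_FAILED ? -1 : 0;
    }

    /* Bump allocation: untouched tail pages cost nothing until written. */
    static void *arena_alloc(size_t size) {
        size = (size + 511) & ~(size_t)511;   /* 512-byte alignment, as in the log */
        if (size > ARENA_BYTES - arena_used) return NULL;
        void *p = arena + arena_used;
        arena_used += size;
        return p;
    }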
09:37:50 <wob_jonas> ais523: you should probably implement the various aligned allocation functions too, even if by just stubs that abort
09:38:23 <ais523> I just realised that while looking at competitors
09:38:33 <ais523> you're right, it's important to have the entire set of alloc functions implemented at once
09:38:39 <ais523> so that you don't mismatch malloc and free impls
09:39:15 <wob_jonas> see https://www.gnu.org/software/libc/manual/html_node/Replacing-malloc.html#Replacing-malloc
09:39:49 <esowiki> [[User:YamTokTpaFa/sandbox4]] https://esolangs.org/w/index.php?diff=64846&oldid=64831 * YamTokTpaFa * (+942)
09:41:06 <ais523> ooh, "malloc_usable_size" is what that function's called
09:41:07 <wob_jonas> filling with 0xAA can make sense, but make sure to do that only for small allocations, or at least small parts of large allocations, because people can use malloc to allocate huge blocks that will be paged in on demand too
09:41:56 <ais523> yes, it was limited to blocks of 32768 bytes at most
09:42:00 <ais523> but I removed it anyway
09:44:51 <wob_jonas> aligned allocation is another of those messes that we got because everyone added their own incompatible apis, and now we have to support all of them
09:45:13 <ais523> animalloc uses 512-byte alignment for its huge allocation (I didn't want the alignment to be too coarse as that screws up ASLR)
09:46:04 <ais523> so I guess the correct way to implement an aligned alloc is to increase the size to the alignment if it's smaller and we have less than 512 bytes of alignment requirement
09:46:13 <esowiki> [[Ruby]] https://esolangs.org/w/index.php?diff=64847&oldid=38207 * YamTokTpaFa * (+19)
09:46:13 <wob_jonas> shachaf mentioned a few days ago how many different ways you can tell libc to run a function early in the program, before main. that's one of those messes too, because C doesn't have a standard mechanism.
09:46:24 <ais523> or to use the "over-allocate and filter" method if the request is for an alignment larger than 512 bytes
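[editor's note: a generic over-allocate-and-align sketch on top of plain malloc, not how animalloc does it. It allocates alignment-1 extra bytes plus room for a back-pointer, rounds up, and stashes the original pointer just before the aligned block. Assumes alignment is a power of two and at least sizeof(void *).]

    #include <stdint.h>
    #include <stdlib.h>

    void *aligned_alloc_overalloc(size_t alignment, size_t size) {
        void *raw = malloc(size + alignment - 1 + sizeof(void *));
        if (!raw) return NULL;
        uintptr_t p = (uintptr_t)raw + sizeof(void *);
        p = (p + alignment - 1) & ~(uintptr_t)(alignment - 1);
        ((void **)p)[-1] = raw;               /* back-pointer for the free path */
        return (void *)p;
    }

    void aligned_free_overalloc(void *ptr) {
        if (ptr) free(((void **)ptr)[-1]);
    }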
09:46:54 <wob_jonas> and for aligned allocation, I'm tempted to add a new api too if I ever make my own malloc:
09:49:06 <ais523> hmm, a quick check with a malloc benchmarker is showing that my malloc is both faster than glibc malloc, and also less memory-hungry
09:49:25 <ais523> although of course, real-world programs may act in a very different way from a malloc benchmarker
09:49:39 <wob_jonas> I'd like a four-argument function void *alloc4(size_t size, size_t alignment, size_t before, size_t after) which allocates a block of length at least size bytes, aligned to a boundary of alignment bytes (must be a power of two), with at least before bytes readable before the block and at least after bytes readable after the nominal end of the block
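[editor's note: wob_jonas's proposed interface written out as a C prototype for readability; the parameter names follow the message above and nothing here is an existing API.]

    #include <stddef.h>

    void *alloc4(size_t size,       /* at least this many usable bytes             */
                 size_t alignment,  /* power of two; block starts on this boundary */
                 size_t before,     /* at least this many readable bytes before it */
                 size_t after);     /* readable bytes past the nominal end         */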
09:50:01 <ais523> this is most likely because animalloc is completely immune to fragmentation
09:50:31 <ais523> wob_jonas: I assume that the readable bytes are allowed to be part of some other allocation, and not necessarily writable
09:50:40 <ais523> so that they're basically just a "legal overrun"?
09:51:10 <wob_jonas> and they can contain administration data used by the malloc implementation itself
09:51:30 <ais523> that'd be pretty trivial with animalloc, tbh, as long as your allocation is at least 9 bytes long you have several gigabytes of readable overrun space :-)
09:52:29 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64848&oldid=64844 * A * (+139)
09:52:30 <wob_jonas> in fact maybe I should make the api even more general, adding an alignment offset too, though that doesn't come up often
09:52:42 <ais523> <man aligned_alloc> The function aligned_alloc() is the same as memalign(), except for the added restriction that size should be a multiple of alignment.
09:52:49 <ais523> but it doesn't report what error it's supposed to return if it isn't :-(
09:52:57 <ais523> wob_jonas: even backwards, that's what the 9-byte minimum is for
09:53:26 <ais523> because there's 4GiB of address space reserved for objects that are 8 bytes and smaller
09:54:00 <ais523> (allocating objects that small is normally a mistake, but there are likely uses for it)
09:57:31 <esowiki> [[What Mains Numbers?]] M https://esolangs.org/w/index.php?diff=64849&oldid=64848 * A * (+1) /* = Computational class */
09:58:23 <shachaf> ais523: The GHC RTS maps 1TB of address space at startup nowadays, I hear.
09:58:51 <shachaf> Sometimes I wonder whether disabling overcommit would be better.
09:59:37 <ais523> I think overcommit should be under the control of the program allocating the memory
10:00:01 <ais523> in the understanding that if a program overcommits willingly, it's signing up to be OOM-killed
10:01:17 <shachaf> I've been writing a bunch of C code that hardly ever calls malloc.
10:02:14 -!- ais523 has quit (Quit: sorry for my connection).
10:02:32 -!- ais523 has joined.
10:03:14 <shachaf> Do you think passing custom allocators to libraries is a plausible thing to do?
10:03:48 <shachaf> One thing that I'm a bit skeptical of is that different allocators can have different APIs and usage patterns, it's not malloc and free.
10:05:34 <wob_jonas> shachaf: sure! in some cases, like for interpreters or haskell runtimes that allocate a lot of small objects with known type (that are either not arrays, or arrays where the actual size has to be known because there are destructors), it's worth using a library with a sized free, which means the allocation function needn't store the size of the block
10:05:42 <esowiki> [[Pxem]] https://esolangs.org/w/index.php?diff=64850&oldid=64829 * YamTokTpaFa * (+14)
10:06:34 <wob_jonas> in this case, you pass the size and alignment of the object to both the allocate and the free function, and it's UB if you pass a different size or alignment when freeing
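[editor's note: a sketch of the sized-allocation interface wob_jonas describes, with made-up names. The caller hands the same size and alignment to both calls, so the allocator needn't store a per-block header; the stubs below just forward to malloc/free, whereas a real implementation would pick a size-class pool instead.]

    #include <stddef.h>
    #include <stdlib.h>

    void *sized_alloc(size_t size, size_t alignment) {
        (void)alignment;                 /* stub: malloc's alignment is enough here */
        return malloc(size);
    }

    /* UB if size/alignment differ from those passed to the allocating call. */
    void sized_free(void *ptr, size_t size, size_t alignment) {
        (void)size; (void)alignment;
        free(ptr);
    }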
10:07:02 <esowiki> [[User:YamTokTpaFa/sandbox4]] https://esolangs.org/w/index.php?diff=64851&oldid=64846 * YamTokTpaFa * (+794)
10:09:54 <shachaf> Yes, that would certainly be a better API option.
10:10:36 <shachaf> But also sometimes you want to allocate in an arena and free all at once, in which case you don't want to call free on each individual allocation even if it's a no-op.
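[editor's note: and a minimal arena sketch for that allocate-everything-then-free-it-all-at-once pattern; illustrative only, names are made up.]

    #include <stddef.h>
    #include <stdlib.h>

    typedef struct { unsigned char *base; size_t used, cap; } Arena;

    int arena_create(Arena *a, size_t cap) {
        a->base = malloc(cap);
        a->used = 0;
        a->cap  = cap;
        return a->base ? 0 : -1;
    }

    void *arena_push(Arena *a, size_t size) {
        size = (size + 15) & ~(size_t)15;        /* keep 16-byte alignment */
        if (size > a->cap - a->used) return NULL;
        void *p = a->base + a->used;
        a->used += size;
        return p;
    }

    /* No per-object free: the whole arena goes away in one call. */
    void arena_release(Arena *a) {
        free(a->base);
        a->base = NULL;
        a->used = a->cap = 0;
    }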
10:10:53 <ais523> wob_jonas: animalloc stores the size of the block in the returned pointer :-)
10:11:12 <esowiki> [[Pxem]] https://esolangs.org/w/index.php?diff=64852&oldid=64850 * YamTokTpaFa * (+209)
10:11:18 <ais523> plenty of address space, after all, to use some of the bits of the returned pointer to record the size
10:19:49 <ais523> this works well because there are only ten allocation sizes used anyway, in terms of distance between returned pointers (the entire space is mapped)
10:20:13 <wob_jonas> ais523: it might also help to add an explicit -std to the compiler command in the makefile
10:21:13 <ais523> OK, I added -std=gnu11
10:21:21 <ais523> this is hopelessly Linux-specific anyway
10:21:39 <ais523> haven't uploaded yet, am still implementing the remaining functions
10:22:07 <shachaf> Did you see my fancy gnu11 printf?
10:22:12 <wob_jonas> ais523: you could also consider mprotecting all the space you don't use as unreadable, both the large allocations that you make the kernel throw away with MADV_FREE, and the tail part of the spaces for small allocations until first used. that way it's more likely that some pointer mistakes are caught by segfaults.
10:22:19 <shachaf> http://slbkbs.org/tmp/fmt.txt
10:23:13 <shachaf> I don't remember whether this is C11. I think I concluded it wasn't.
10:23:44 <ais523> huh, this thing compiles a few bytes smaller with -O3 than with -Os under clang (gcc's -Os is larger)
10:25:46 <shachaf> I wonder what the smallest program that's true of is.
10:29:25 <ais523> the -Os /code/ is way shorter, so I wonder why the file is larger
10:30:01 <ais523> (this relationship is true both stripped and unstripped)
10:32:42 <shachaf> Oh, that's more interesting.
10:33:41 <ais523> after looking at readelf for a while I figured it out: there's some sort of alignment requirement which causes the -Os output to lose its advantage due to having to pad the code up to the same length as the -O3 output within the .so file
10:34:36 <ais523> but the -O3 output inlines a call to an exported function, whereas the -Os output calls the function directly, and that function call creates one additional relocation
10:34:44 <ais523> the relocation is in a part of the ELF file that /does/ affect the file length
10:35:34 <ais523> OK, I uploaded the new version
10:37:19 <shachaf> Dynamic linking is so complicated.
10:37:26 <shachaf> And there are hardly any benefits to it.
10:38:17 <ais523> <clang -Os> a6f: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax ← oh come on, even I can beat that
10:38:43 <ais523> in "outoptimize the compiler" wars, -Os is more fun as a target because it's much more objective as to what code is better than what other code
10:39:01 <ais523> that's my competing program
10:39:22 <cpressey> I once worked on some code where the structs were 8 bytes so I passed them directly to functions, and I was told that I should be allocating them on the heap and passing pointers to them instead
10:39:36 <shachaf> Did you see that heap-compacting malloc?
10:39:51 <ais523> admittedly they probably have different effects on flags but the context doesn't care about the flags
10:40:03 <shachaf> https://github.com/plasma-umass/mesh
10:40:35 <ais523> cpressey: many programmers don't have a good intuition about when it's best to pass things by value
10:40:47 <wob_jonas> ais523: -Os doesn't always try to produce the smallest code if it's slower
10:40:57 <cpressey> you're Just Not Supposed to Do That with a Struct
10:41:29 <ais523> wob_jonas: is there an "as small as possible" optimization option? analogously to the "as fast as possible" -Ofast?
10:41:40 <shachaf> ais523: gcc -Os has a four-byte solution, it seems.
10:41:44 <wob_jonas> I don't know, but I don't care. I don't use the compiler for golf.
10:42:14 <shachaf> Sign-extended or with 0xff
10:42:19 <ais523> (-Ofast isn't standards-compliant)
10:42:30 <wob_jonas> you can try to add __attribute__((cold)) or whatever that is to functions if you want
10:44:29 <ais523> compilers typically don't do much with hot/cold
10:44:51 <wob_jonas> ais523: and if you're willing to pay for two instructions where the second one depends on the first one, you can just use an OR instruction with an 8-bit immediate, which would be three bytes long and have a false dependency that the cpu doesn't know is false
10:45:05 <ais523> and the normal optimisations for hot don't help much unless you have an entire loop entirely in hot code, which is pretty inappropriate for a malloc
10:45:41 <shachaf> I do wish the amd64 calling convention wasn't different for values and singleton structs.
10:46:05 <shachaf> Or is it? I can't remember for sure now.
10:47:07 <ais523> my experience with clang vs. gcc optimisation wars is that gcc knows many more optimisations, but clang is better at working out when to use them
10:48:23 <ais523> gcc sometimes goes a little mad with the optimisation and makes the code slower/longer as a result
10:49:09 <wob_jonas> ais523: doesn't it go a little mad only when people pass inappropriate compiler options?
10:49:47 <ais523> wob_jonas: well, that's what -O3 is asking for, yes; but I'd be surprised if -O2 could compete
10:50:23 <ais523> hmm, something that's been bothering me: there are two encodings of the ret instruction on x86_64; branching /directly/ to the shorter encoding has a huge performance penalty, the longer one doesn't have that issue
10:50:44 <wob_jonas> ais523: yes, the optimization manual describes that
10:50:52 <ais523> I'm trying to figure out what sort of insanity would be happening in the hardware to make that happen
10:51:25 <ais523> like, I can understand why branching to a ret might be slow, but why would a repret be any faster? is there a reason not to just use the code underlying repret by default, or whenever there's just been a branch, etc.?
10:51:50 <wob_jonas> ais523: the cpu is trying to cache the likely target of jump instructions, for which it stores the address of the jump, but every other jump instruction is at least two bytes long... I don't know the details, but it doesn't seem too strange
10:52:23 <shachaf> People use rep ret to avoid that, right?
10:52:29 <shachaf> I remember reading about this.
10:52:34 <ais523> yes, rep ret is the recommended long encoding
10:52:41 <ais523> nop ret works too but is slower, for obvious reasons
10:53:43 <ais523> (incidentally, you have to use rep rather than the more commonly seen data16 because data16 would actually affect the ret instruction and make it pop only 16 bits!)
10:56:29 <shachaf> ais523: http://repzret.org/p/repzret/ talks about the details of the branch predictor that cause this.
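[editor's note: for reference, the two encodings being discussed, as raw bytes; whether the longer form actually helps depends on the microarchitecture, per the link above.]

    /* 0xC3       -> ret
       0xF3 0xC3  -> rep ret: the REP prefix is effectively ignored by ret, but
                     the two-byte encoding sidesteps the branch-predictor penalty
                     on the AMD cores described at repzret.org. */
    static const unsigned char ret_short[] = { 0xC3 };
    static const unsigned char ret_long[]  = { 0xF3, 0xC3 };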
10:56:59 <ais523> another thing that surprised me: I assumed that commands like rep movsd were only worthwhile for small moves, and that bigger more complex code could do a large move faster
10:57:28 <ais523> but it's actually the opposite, rep movsd (and particularly rep movsq) has a large setup cost but runs very quickly when it gets going, and thus is only efficient for large moves
10:57:44 <ais523> presumably it's being recompiled into some sort of vector operation behind the scenes
10:58:04 <ais523> also, I'm not surprised that there's a blog post about repret, but am surprised that it has its own website
10:58:40 <wob_jonas> ais523: the rep movsd thing is complicated, it depends a lot on which cpu brand and core you have
10:58:59 <shachaf> I think whether rep movsd is better or worse than alternatives has changed over the years in different uarchs.
10:59:14 <wob_jonas> you have to look up the details in the optimization manuals and Agner's optimization docs and Agner's memcpy implementation if you want to know the details
10:59:39 <ais523> the optimization guides basically say "don't try to figure it out yourself, the compiler manufacturers know what to do, so memcpy or __builtin_memcpy are the best options for copying things"
11:00:10 <wob_jonas> but the opt guides also give assembly source codes for them I think
11:00:23 <shachaf> Aren't the optimization guides written for compiler manufacturers?
11:01:17 <ais523> AMD's is written for anyone who cares but assumes that compiler manufacturers will be a major audience
11:01:48 <shachaf> I wonder how important compiler optimizations are.
11:02:01 <ais523> some of them are pretty important
11:02:13 <ais523> like storing things in registers rather than memory
11:02:45 <ais523> nowadays people typically forget that that's an optimisation, but on -O0 most compilers will use registers only for the course of one instruction and put everything back into memory in between
11:02:45 <wob_jonas> yeah, that's one of the earliest optimizations that compilers implemented
11:03:18 <wob_jonas> way back when, what would now count as a non-optimizing compiler of a low-level language counted as an optimizing compiler of a high-level language
11:03:27 <shachaf> Anything that requires solving an NP-complete problem is certainly an optimization.
11:04:29 <shachaf> OK, the claim turned out much bolder than I intended.
11:04:39 <shachaf> For "anything" substitute "This particular thing".
11:04:54 <int-e> register allocation?
11:05:05 <ais523> doing it optimally is NP-complete, I think
11:05:13 <ais523> but doing better than nothing is much easier
11:05:28 <wob_jonas> it's still hard to do it well enough
11:05:44 <shachaf> What are the most important optimizations for a compiler?
11:05:50 <shachaf> Register allocation seems important.
11:05:50 <wob_jonas> I don't understand how that part of compilers work
11:05:53 <shachaf> Inlining is probably important.
11:06:19 <ais523> constant folding is important because without it most of the other optimisations don't work either
11:06:20 <shachaf> Though it's certainly much more important in a language like C++ than in C.
11:06:46 <shachaf> Ah, constant folding, sure.
11:07:11 <shachaf> I've said before that I think some optimizations are a bad idea, like tail call "optimization".
11:07:30 <wob_jonas> shachaf: constant folding, "peephole optimizations" (a very general term), optimizing integer divisions by a compile-time constant and repeated integer divisions by the same divisor
11:07:41 <int-e> it's a great idea. it should be mandated by the language standard.
11:08:13 <int-e> strength reduction, bounds check elimination, loop unrolling
11:08:27 <shachaf> I think tail calls should be marked explicitly.
11:08:52 <int-e> (bounds check elimination is *very* important because without it, performance-hungry people will not use safe languages at all.)
11:08:56 <shachaf> If they are it should be an error to use an extra stack frame for them. But it shouldn't be implicit.
11:09:00 <wob_jonas> outputting shorter forms of instructions (on x86),
11:09:52 <cpressey> shachaf: I largely agree. I like Perl's use of goto for this.
11:09:54 <shachaf> C compilers don't do bounds check elimination so how important can it be?
11:09:56 <ais523> I'm planning to write a program in asm at the moment, I've spent over a day just on trying to optimise the register allocation to allow for shorter instruction forms
11:10:03 <ais523> and I haven't even started writing the program yet
11:10:13 <ais523> shachaf: I think they do
11:10:33 <ais523> although they might see it as removing redundant conditionals rather than anything specifically related to bounds checking
11:12:28 <shachaf> cpressey: I don't know how that works. Does "goto &f" mean the same as "return f(@_)" but without the stack frame?
11:13:00 <shachaf> When I try to look this up everyone is talking about "tail recursion optimization" which is double silly because tail recursion is the most useless kind of tail call.
11:13:54 <cpressey> shachaf: I think that's basically it.
11:14:42 <shachaf> I think in a language like C tail recursive functions are much more clearly written with a loop.
11:15:23 <shachaf> Useful cases of tail calls are mutually recursive functions like parsers, I guess.
11:15:59 <wob_jonas> also being able to emit the common instructions in all addressing modes, using indexed memory access for reading or writing when possible, immediates for compile-time constant operands, swapping arguments of commutative operations
11:16:37 <cpressey> shachaf: Yes, finite state machines especially.
11:17:15 <wob_jonas> tracking which local variables are non-volatile and never have their address taken, so the compiler knows that indirect assignments can't affect them
11:17:18 <shachaf> Are these actually important or are you just listing optimizations?
11:17:30 <wob_jonas> indirect reads too, so you can store them in registers
11:17:40 <wob_jonas> shachaf: I don't really know, I haven't made a compiler yet
11:17:41 <shachaf> I think I meant something like, if you only have time to implement a few optimizations, which are the most important?
11:18:02 <ais523> <shachaf> I think in a language like C tail recursive functions are much more clearly written with a loop. ← things like iterating over a binary tree look really ugly when one branch is recursive and the other is iterative, although of course the tail call doesn't save stack height there, just a bit of time
11:18:41 -!- FreeFull has joined.
11:19:58 <shachaf> void print_node(Node *n) { if (!n) return; print_node(n->left); print(n->value); print_node(n->right); }
11:20:03 <esowiki> [[User:YamTokTpaFa]] https://esolangs.org/w/index.php?diff=64853&oldid=64830 * YamTokTpaFa * (+114)
11:20:32 <shachaf> void print_node(Node *n) { while (1) { if (!n) return; print_node(n->left); print(n->value); n = n->right; } }
11:20:55 <shachaf> I suppose you do lose some symmetry.
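[editor's note: a sketch of the mutually recursive case shachaf and cpressey mention, where a simple loop doesn't help. Each state function tail-calls the other, so without guaranteed tail-call elimination the stack grows with the input length; illustrative code, not from the channel.]

    #include <stddef.h>

    static int state_b(const char *s);

    static int state_a(const char *s) {
        if (*s == '\0') return 1;
        if (*s == 'a') return state_b(s + 1);   /* tail call */
        return 0;
    }

    static int state_b(const char *s) {
        if (*s == '\0') return 0;
        if (*s == 'b') return state_a(s + 1);   /* tail call */
        return 0;
    }

    /* Accepts strings of the form (ab)*: state_a("abab") == 1. */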
11:21:27 <wob_jonas> well, it's closer to return &f(@_); which can matter if f is a prototyped function
11:21:42 <esowiki> [[User:YamTokTpaFa/sandbox4]] https://esolangs.org/w/index.php?diff=64854&oldid=64851 * YamTokTpaFa * (+185) /* Language overview */
11:21:45 <shachaf> I don't know Perl at all so I made up some syntax.
11:21:48 <cpressey> shachaf: If I had limited time and only wanted to implement one optimization technique it would probably be peephole optimization, but that's also because I typically don't want to think hard about the code I generate
11:22:20 <cpressey> so, like, push ax; pop ax -> get rid of that
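[editor's note: a toy version of that peephole pass, operating on instruction strings for brevity (a real pass would work on an IR): drop a push immediately followed by a pop of the same register.]

    #include <stdio.h>
    #include <string.h>

    static size_t peephole(const char *in[], size_t n, const char *out[]) {
        size_t m = 0;
        for (size_t i = 0; i < n; i++) {
            if (i + 1 < n &&
                strncmp(in[i], "push ", 5) == 0 &&
                strncmp(in[i + 1], "pop ", 4) == 0 &&
                strcmp(in[i] + 5, in[i + 1] + 4) == 0) {
                i++;                      /* skip both halves of the pair */
                continue;
            }
            out[m++] = in[i];
        }
        return m;
    }

    int main(void) {
        const char *prog[] = { "mov ax, 1", "push ax", "pop ax", "add ax, 2" };
        const char *opt[4];
        size_t m = peephole(prog, 4, opt);
        for (size_t i = 0; i < m; i++) puts(opt[i]);   /* push/pop pair removed */
        return 0;
    }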
11:22:57 <cpressey> Is register allocation in and of itself an optimization, relative to stack allocation?
11:23:27 <shachaf> push ax; pop ax isn't a no-op but I guess you can assume it doesn't matter.
11:23:28 <wob_jonas> cpressey: does it count as an optimization to be able to emit arithmetic instructions directly on memory operands, for both input and output, and arithmetic instructions with immediate operands?
11:23:33 <ais523> storing things in registers rather than on the stack is a huge optimization and one of the most important
11:23:54 <ais523> shachaf: if you're a compiler, you almost certainly don't want the side effect it has
11:25:42 <ais523> and if you /do/ want that effect, you'd likely write code like orb $0x0,-0x2(%rsp) (and would probably use a more negative number than -2)
11:26:56 <Taneb> What's the side effect?
11:27:12 <shachaf> It writes ax to the memory past the end of the stack.
11:27:26 <ais523> Taneb: pagefault (possibly upgraded to a segfault) if you're at the end of the paged-in portion of the stack
11:27:46 <Taneb> Ah, also of course
11:27:59 <shachaf> I imagine the most important optimizations are the ones that save on the most important resources.
11:28:02 <ais523> and of course on x86_64, you're writing actual valid memory there (specifically, redzone memory)
11:28:24 <ais523> on 32-bit x86 the ABI treats the memory there as undefined
11:28:29 <shachaf> Presumably executing a few extra instructions doesn't matter nearly as much as avoiding cache misses.
11:28:32 <Taneb> I feel like I should spend more time learning about this sort of thing
11:28:36 <ais523> so most people don't use things above the top of the stack there
11:28:49 <shachaf> Or mispredictions or something.
11:29:03 <shachaf> Taneb: You should write a fancy x86 assembler for me.
11:29:10 <shachaf> I wrote one but it's very simple and bad.
11:29:39 <Taneb> I don't want to do that
11:30:00 <ais523> back in 2005, when my copy of the optimization guide was written, the most important factors were latencies in dependency chains, throughput of various units on the CPU, and things that caused the whole processor to stall
11:30:06 <cpressey> Register allocation is such an important optimization that people don't think of it as a compiler optimization
11:30:31 <cpressey> Some of these same people are happy with stack-based VMs
11:30:43 <ais523> for example, most instructions that did nontrivial memory accesses would not stall anything and would not use much of the CPU's resources, but the result wouldn't be available for 4 cycles
11:30:53 <wob_jonas> shachaf: http://yasm.tortall.net/ fancy x86 assembler
11:30:59 <shachaf> Man, generating code for a fancy out-of-order processor is ridiculous.
11:31:23 <shachaf> Taneb: I don't need it anyway because the point of it is to write it myself.
11:31:26 <ais523> I actually had a really good idea for an instruction set design in the last couple of days
11:31:27 <cpressey> Level one: just generate some code that doesn't crash
11:32:06 <ais523> the idea is, many of the registers are actually shift registers, you read from the end of them but write not at the end
11:32:18 <ais523> (either "the value this register will have after one shift" or with higher numbers than "one")
11:32:31 <ais523> and there's a shift instruction to shift everything all at once
11:32:43 <shachaf> Dan Bernstein once posted about how the x87 stack instructions, where you get a free swap after every instruction, let you express things that register machines don't.
11:32:47 <ais523> this gives most of the advantages of VLIW but is much less specific to an individual processor
11:33:44 <ais523> basically, because the processor can decide for itself how to split the VLIW up, in the knowledge that two commands that happen between a register shift can't possibly interfere with each other (because no position in the shift register is simultaneously writeable and readable)
11:34:52 <wob_jonas> ais523: is that sort of like those old SIMD cpus where if you write a register, you can't read it in the next instruction?
11:34:52 <ais523> jumps have their own jump target register, the jump instruction just jumps to the address in the jump target register, but you can load that an arbitrary distance beforehand
11:35:08 <ais523> wob_jonas: yes, except you control how much delay there is
11:35:36 <ais523> the nice thing about this is that temporaries don't need any nontrivial register allocation at all
11:36:04 <ais523> you know when the instruction that uses the temporary is coming up, so you just place it in the appropriate point of the shift register
11:36:13 <ais523> and more normal registers are only needed for values that you want to copy or persist
11:36:44 <ais523> (even then, a command to store in multiple shift registers would likely be both useful and easy to implement, to save on copying)
11:37:08 <ais523> my current plan is for immediates to be part of the instruction, but addresses (both store and load) to be taken from shift registers
11:37:27 <ais523> so that the processor can prefetch an address, or lock it into L1 cache, if it sees that an instruction is going to use it soon
11:37:59 <ais523> this also makes double ll/sc really easy to implement, which means efficient lock-free algorithms
11:39:08 <wob_jonas> there are a lot of advantages you can get without any of these fancy instruction set innovations, by just breaking compatibility and designing a new instruction set that is like the current practices but without the historical cruft
11:40:14 <ais523> I guess my aim here is to design an instruction set that won't hurt optimisations in future processors and makes the optimisations done by current processors easier
11:41:39 <ais523> that said, although this is intellectually interesting, I'm unlikely to practically get the chance to design an instruction set that CPUs will actually use
12:18:19 <cpressey> Register allocation does not strike me as fun (YMMV). One reason I would want to target LLVM is so that it can do register allocation for me.
12:20:10 <shachaf> Hmm, one thing I want is precise control and information over stack frame allocation, which I think might be tricky to get with LLVM.
12:20:33 <shachaf> For example, I want a function to be able to get its maximum stack usage at compiletime.
12:21:14 <cpressey> Yes, that might be impossible to know with certainty, with something like LLVM.
12:24:23 <cpressey> Maybe I'll target CIL instead. I already did that for one toy project. It's not terrible.
12:25:27 <cpressey> It's backed by a standard, it runs on at least two platforms, where it has half-decent JIT compilers.
12:27:13 <shachaf> Another related thing I want is efficient coroutines, which is very similar code to stack frame allocation.
12:27:23 <shachaf> I don't know how much LLVM can help you with that.
12:29:02 <cpressey> I would be surprised if LLVM could not do coroutines reasonably efficiently.
12:29:13 <int-e> There's https://en.wikipedia.org/wiki/SPIR-V as well which sounds interesting in that it stores a control-flow graph of SSA basic blocks.
12:29:35 <shachaf> Ugh, why are GPU APIs all such a mess?
12:29:44 <shachaf> Maybe not all, but at least the common ones.
12:29:50 <int-e> But I don't know whether it can do general purpose programming at this point.
12:32:10 <shachaf> Well, LLVM has an implementation of coroutines, but if they're like C++2038's coroutines I probably don't want them.
12:33:44 <int-e> I've probably forfeited my right to complain but now you're just being silly.
12:33:45 <ais523> C++2038 actually came out in 1901 but the date underflowed
12:33:54 <cpressey> What are the last two digits of ω
12:34:37 <shachaf> The C++ coroutine proposal last I saw it does some heap allocations sometimes.
12:35:03 <shachaf> But it said something like, don't worry, with only $x0 million of investment in compilers, we expect that optimizers can usually eliminate these allocations.
12:35:06 <int-e> I guess if you make a decimal representation of ordinals, the last ω digits of ω will be zero.
12:35:38 <int-e> And only the ω-th digit will be one.
12:36:14 <int-e> (counting from the left)
12:36:21 <int-e> (counting from the right)
12:36:29 <int-e> I should say "counting from the end" :P
12:39:06 <int-e> shachaf: I think recent GPU APIs are messy because they are common interfaces to a very wide range of different hardware, and they expose a lot of iffy details for performance reasons. (The original OpenGL had a pretty different attitude.)
12:43:55 <wob_jonas> ``` set -e; cd wisdom; printf "%s/ " *ium
12:43:56 <HackEso> amnesium/ belgium/ corium/
12:44:09 <HackEso> An amnesium is a school where you forget everything you learned after each test.
12:44:10 <HackEso> Corium is the material that a nuclear reactor's core dump is made of.
12:44:13 <HackEso> Alumni is a compromise spelling suggested to solve the aluminum vs aluminium debate that never really caught on, except in a few big colleges.
12:45:21 <wob_jonas> ``` set -e; cd wisdom; printf "%s/ " *ion
12:45:22 <HackEso> abbreviation/ action/ algebraic chess notation/ bessel function/ cat elimination/ cat introduction/ cipation/ citation/ civilization/ communication/ composition/ cut elimination/ damnation/ defenestration/ degeneration/ dereduntantation/ detonation/ eurovision/ hallucination/ hppavilion/ identity function/ implication/ indentity function/ intersection/ invention/ just intonation/ last-class function/ lion/ natural transformation/ nnection/ onion/ operation
12:45:47 <shachaf> I wonder how much more true that is of GPUs than CPUs.
12:46:25 <shachaf> Hmm, someone should name an element "belgium".
12:46:40 <HackEso> The plural form of "Belgium" is "Belgia".
12:48:59 <esowiki> [[Talk:ACL]] https://esolangs.org/w/index.php?diff=64855&oldid=64845 * Hanzlu * (+14) /* Python Interpreter */
12:49:25 <esowiki> [[Talk:ACL]] https://esolangs.org/w/index.php?diff=64856&oldid=64855 * Hanzlu * (-14)
12:50:59 <esowiki> [[Talk:ACL]] https://esolangs.org/w/index.php?diff=64857&oldid=64856 * Hanzlu * (+30)
13:03:53 <esowiki> [[ACL]] https://esolangs.org/w/index.php?diff=64858&oldid=64821 * Hanzlu * (+28)
13:05:37 <esowiki> [[Truth-machine]] https://esolangs.org/w/index.php?diff=64859&oldid=63439 * Hanzlu * (+27)
13:07:30 <esowiki> [[ACL]] https://esolangs.org/w/index.php?diff=64860&oldid=64858 * Hanzlu * (+6)
13:16:47 -!- doesthiswork has joined.
13:23:58 <esowiki> [[Talk:Keg]] M https://esolangs.org/w/index.php?diff=64861&oldid=63982 * A * (-109)
13:37:23 -!- xkapastel has joined.
13:59:18 <esowiki> [[Talk:ACL]] https://esolangs.org/w/index.php?diff=64862&oldid=64857 * A * (+6)
14:09:38 -!- ais523 has quit (Ping timeout: 248 seconds).
14:21:45 <esowiki> [[Brace For Impact]] N https://esolangs.org/w/index.php?oldid=64863 * Areallycoolusername * (+1483) Created page with "'''Brace For Impact''' is a [[Stack]]-based [[esoteric programming language]] made by [[User: Areallycoolusername|Areallycoolusername]]. It has featured derived from Lisp,thou..."
14:21:50 <esowiki> [[ACL]] https://esolangs.org/w/index.php?diff=64864&oldid=64860 * Hanzlu * (+128)
14:29:07 <esowiki> [[What Mains Numbers?]] https://esolangs.org/w/index.php?diff=64865&oldid=64849 * A * (+1) /* What Mains Numbers? */
14:41:08 <esowiki> [[What Mains Numbers?]] https://esolangs.org/w/index.php?diff=64866&oldid=64865 * A * (+825) /* Implementations */
15:32:07 -!- oklopol has quit (Read error: Connection reset by peer).
15:35:13 -!- wob_jonas has quit (Remote host closed the connection).
16:09:27 -!- john_metcalf has joined.
16:29:45 <cpressey> https://github.com/catseye/Castile/blob/master/src/castile/stackmac.py
16:30:48 -!- cpressey has quit (Quit: A la prochaine.).
17:17:09 <esowiki> [[ACL]] https://esolangs.org/w/index.php?diff=64867&oldid=64864 * Hanzlu * (-419)
17:19:05 <esowiki> [[ACL]] https://esolangs.org/w/index.php?diff=64868&oldid=64867 * Hanzlu * (+2)
17:22:43 <esowiki> [[ACL]] https://esolangs.org/w/index.php?diff=64869&oldid=64868 * Hanzlu * (+153)
17:25:47 <esowiki> [[ACL]] https://esolangs.org/w/index.php?diff=64870&oldid=64869 * Hanzlu * (+137)
17:26:22 -!- ais523 has joined.
17:42:31 <esowiki> [[Brace For Impact]] https://esolangs.org/w/index.php?diff=64871&oldid=64863 * Areallycoolusername * (+1)
17:54:35 -!- b_jonas has joined.
18:18:05 <b_jonas> ha, I caught this stupid fly that somehow got in here
18:19:21 <esowiki> [[Esolang:Categorization]] M https://esolangs.org/w/index.php?diff=64872&oldid=58894 * Areallycoolusername * (-157) Most of these proped categories are too obvious to be added. A few need to be discussed on the talk pag, not on the page itself.
18:45:50 <esowiki> [[Esolang:Categorization]] https://esolangs.org/w/index.php?diff=64873&oldid=64872 * Ais523 * (+157) Undo revision 64872 by [[Special:Contributions/Areallycoolusername|Areallycoolusername]] ([[User talk:Areallycoolusername|talk]]): those aren't proposed categories, but listings of possibilities that are too common to be categorised (for exhaustiveness)
18:54:42 <esowiki> [[Brace For Impact]] https://esolangs.org/w/index.php?diff=64874&oldid=64871 * Areallycoolusername * (+606)
19:04:49 -!- ais523 has quit (Quit: quit).
19:16:07 <esowiki> [[Brace For Impact]] https://esolangs.org/w/index.php?diff=64875&oldid=64874 * Areallycoolusername * (+623) /* (a'),(s;),(d.), (f=), and (y~) */
20:00:48 -!- Lord_of_Life_ has joined.
20:04:07 -!- lldd_ has joined.
20:04:19 -!- Lord_of_Life has quit (Ping timeout: 248 seconds).
20:04:19 -!- Lord_of_Life_ has changed nick to Lord_of_Life.
20:15:51 -!- Phantom_Hoover has joined.
20:29:24 -!- howlands has joined.
20:48:59 <b_jonas> ``` set -e; cd wisdom; echo *[ae]nce
20:49:00 <HackEso> ance insurance intelligence persistence reference science sentience this sentence
20:49:06 <HackEso> Insurance is a closed loop.
20:49:18 <HackEso> Semi-automatic text generation.
20:49:20 <HackEso> sentience is the primary goal of wisdom. wisdom is the primary goal of sentience.
20:49:23 <HackEso> This sentence is just. Taneb invented it.
20:49:28 <HackEso> Taneb invented persistence long ago, and it's been around ever since.
20:58:54 -!- lldd_ has quit (Quit: Leaving).
21:00:24 <HackEso> /srv/hackeso-code/multibot_cmds/lib/limits: line 5: exec: hatis: not found
21:00:27 <HackEso> ld.so(8) - dynamic linker/loader
21:08:27 <esowiki> [[User:Hanzlu]] N https://esolangs.org/w/index.php?oldid=64876 * Hanzlu * (+64) Created page with "Some random guy who made [[ACL]]. What more do you need to know?"
21:53:27 -!- b_jonas has quit (Quit: leaving).
22:44:38 -!- Phantom_Hoover has quit (Ping timeout: 245 seconds).
23:09:55 <doesthiswork> suppose that the time complexity of adding two 2 digit integers is linear in the smaller term. But you can choose N sums to memoize to improve the worst case. How should you choose the N sums?
23:16:10 -!- Melvar has quit (Ping timeout: 258 seconds).
23:16:37 -!- Melvar has joined.