2019-08-18 - freenode:#esoteric

00:05:27 <andrewtheircer> e

00:32:56 -!- andrewtheircer has quit (Remote host closed the connection).

00:38:16 -!- b_jonas has quit (Ping timeout: 272 seconds).

00:38:42 -!- b_jonas has joined.

01:11:03 -!- b_jonas has quit (Remote host closed the connection).

01:20:51 -!- FreeFull has quit.

01:25:34 -!- ais523 has joined.

01:25:49 <ais523> ugh, I'm dealing with a really annoying missed optimisation atm

01:25:52 <ais523> ptrdiff_t notneg(unsigned char b) { if (b <= 0xC0) __builtin_unreachable(); return -(ptrdiff_t)(unsigned char)~b; }

01:29:00 <ais523> the correct optimisation is «return 1 + (ptrdiff_t)(signed char)b;» but neither gcc nor LLVM finds it (you could also do the increment first)

01:32:37 <ais523> huh, they don't find it even with a cast to signed char in there, although clang finds it if you put a cast to signed /int/ in there (of course, none of these casts make a difference because ~b has to be a positive number in the 0..30 range)
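
For reference, a minimal C sketch of the equivalence under discussion. It puts the function as posted next to the hand-optimised form and checks that they agree on every input the `__builtin_unreachable()` hint permits (0xC1 through 0xFF). The names `notneg_opt` and `notneg_forms_agree` are made up for illustration, and the sketch assumes GCC or Clang with the usual two's-complement conversion to `signed char`.

```c
#include <stddef.h>

/* The function as posted: complement the byte, zero-extend, negate. */
ptrdiff_t notneg(unsigned char b)
{
    if (b <= 0xC0)
        __builtin_unreachable();   /* GCC/Clang hint: b is 0xC1..0xFF */
    return -(ptrdiff_t)(unsigned char)~b;
}

/* Hand-optimised form (hypothetical name): sign-extend, then add one. */
ptrdiff_t notneg_opt(unsigned char b)
{
    return 1 + (ptrdiff_t)(signed char)b;
}

/* Returns 1 if both forms agree on every permitted input, else 0. */
int notneg_forms_agree(void)
{
    for (int b = 0xC1; b <= 0xFF; b++)
        if (notneg((unsigned char)b) != notneg_opt((unsigned char)b))
            return 0;
    return 1;
}
```

Both forms reduce to b - 255 on the permitted inputs, which is why the increment and the complement-then-negate coincide.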

01:33:16 <zzo38> LLVM allows you to specify the range, but I think only for loading from memory.

01:35:05 <ais523> I just get annoyed when compilers produce asm that's worse than the asm I'm expecting

01:35:30 <ais523> (and I'm the sort of person who actually checks that when I write code like -(ptrdiff_t)(unsigned char)~b…)

01:36:17 <zzo38> Is there any command to have a inline LLVM code in a C code?

01:37:03 <zzo38> Also, in Ghostscript, the "forall" command doesn't seem to work properly for strings longer than 800 bytes. (Other commands work (including the "length" command), and it works fine for arrays, but not for strings.) Why does it do that?

01:38:29 <ais523> there's no standard C method to use inline LLVM; clang could theoretically support it as an extension to __asm__ but I don't think it does

01:40:19 <ais523> hmm, so my next problem is, should I write it as 1 + (ptrdiff_t)(signed char)b in the original source, even though that's a lot less clear (the operation I'm using is conceptually a combination of a bitwise-complement and a unary minus, the fact that that does the same thing as an increment is just an intentionally engineered coincidence)

01:42:45 <shachaf> ais523: Oddly enough if you write out a lookup table gcc finds a good implementation.

01:44:59 <ais523> what, if you special-case all 31 possible inputs? :-D

01:45:28 <shachaf> Yes, if you write a switch over the entire range.

01:46:21 <shachaf> (I mean "range of inputs" and not "codomain" or "image", which people refer to as "range" for some reason.)

01:47:05 <zzo38> One possibility is to write a comment.

01:47:55 <zzo38> (Also, which implementation is better might depend on the target computer, maybe?)

01:48:33 <shachaf> What's the context of this function?

01:48:45 <ais523> decoder for a file format

01:49:13 <ais523> it stores references to earlier locations in the file as 1, 2, 4, or 8 bytes, I'm writing a separate function for each

01:49:21 <shachaf> And the speed of this operation is relevant?

01:49:25 <ais523> but the references are bitwise-complemented

01:49:51 <ais523> the format's designed to be extremely fast, faster to use a memory image of the format than actually parsing it

01:50:12 <shachaf> Hmm, which format is it?

01:50:15 <ais523> this operation needs to be able to compete with pointer dereferences to do that

01:50:21 <ais523> it's a format I'm inventing atm

01:50:34 <shachaf> Aha.

01:51:02 <ais523> (the idea is that the +1 ends up getting inlined into the dereference of the resulting pointer)

01:51:53 <ais523> I'd store the offsets as with the +1 added if I could, but I can't because 0 is a valid offset and there is a fairly esoteric constraint that the offsets cannot be valid UTF-8

01:52:15 <ais523> #esoteric is definitely the right channel for this :-D

01:52:31 <shachaf> So that's the reason for the 0xc0. I was vaguely wondering before whether it was related to UTF-8.

01:52:48 <shachaf> What's this for?

01:52:59 <shachaf> Maybe you're being deliberately vague.

01:53:28 <ais523> not because it's secret, mostly because it's hard to explain

01:53:50 <ais523> the format's designed for the output of a parser, i.e. the in-memory representation of a parse tree

01:54:10 <ais523> but to be generic enough to be usable in much the same situations as XML is

01:54:50 <ais523> but the idea is that the parser output is just an image in memory, it's not something that you parse into a linked tree of structures like you normally do

01:55:26 <ais523> linked lists are bad performance-wise, and linked trees are very common but probably also beatable in performance for the same reasons, so I wanted to try

01:56:17 <zzo38> XML is often used in cases where it shouldn't be, anyways, I think

01:56:19 <shachaf> That makes sense.

01:56:44 <shachaf> I suspect the gains from a more efficient memory encoding are way bigger than the gains from a couple of extra instructions to dereference a pointer.

01:56:47 <ais523> zzo38: yes

01:57:00 <ais523> shachaf: that's what I'm hoping, at least

01:57:32 <ais523> (not just more efficient, but more compact too, meaning it makes better use of the cache; also less malloc overhead)

01:58:39 <shachaf> I think it's plausible that even with a regular parser, you should have only one or a few calls to malloc for the entire tree, which you free all at once.

01:59:04 <shachaf> There isn't much benefit to mallocing individual nodes.

02:00:33 <ais523> all the parsers I've seen so far malloc individual nodes for the output tree, although I agree there isn't much benefit to it

02:00:46 <ais523> I think the idea is to let you free parts of the tree individually

02:01:17 <shachaf> I suspect there's no real idea, it's just that people learn that that's how you get memory.

02:01:21 <ais523> but the normal operation on a tree is to traverse it, and that lets you build a new tree as you go

02:02:14 <zzo38> In some programs you would free one branch of the tree separately from the rest

02:02:16 <ais523> this program is actually being written in Rust (I translated it into C to test it on gcc), and rust has a #[no_std] annotation that means your program has no support at all from the operating system, it's used for purely algorithmic libraries

02:03:11 <ais523> that means you don't get memory allocation unless the caller gives you a memory allocation callback, and so I try to avoid allocations out of habit, thinking carefully about if I need them

02:03:19 <ais523> often, I don't need much more than one or two for the program

02:03:38 <shachaf> I'd like to know more about possible allocation strategies.

02:04:12 <shachaf> I think I've talked about that in here before.

02:04:29 <ais523> well, the one I'm working on for this program (for the parsers, tree operations, etc.) is a purely linear allocation strategy where everything is written in memory left to right in order, meaning that your algorithms have to be able to manage that

02:05:11 <shachaf> Does this memory go left to right in order of increasing addresses?

02:05:13 <ais523> many algorithms can't, but for a parser it's fairly hopeful because parsing algorithms and tree-traversal algorithms both naturally work like that

02:05:48 <ais523> yes, increasing addresses (decreasing is possible I guess but it's sufficiently alien that you normally get less support from the standard library, OS, processor, etc.)

02:06:15 <shachaf> Unless it's the call stack.

02:06:28 <ais523> writing left to right unfortunately means that reading right to left is sometimes required, but reads on a parse tree are generally more random-access than writes as it is (you're often trying to match both branches of the tree to something)

02:06:36 <shachaf> I like the idiom of having a global (or thread-local) arena that can be used effectively as a stack but with the frames manually managed.

02:06:43 <ais523> the call stack in x86 is the wrong way up and I am very upset at this

02:07:27 <shachaf> So functions that produce variable-size output that's only needed briefly can just write into there.

02:07:32 <shachaf> E.g. sprintf.
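
A rough sketch of the arena idiom described above, combining the left-to-right (increasing address) allocation with manually managed frames. This is one possible shape under stated assumptions, not code from either project: the `Arena` type and the `arena_*` names are hypothetical, the buffer is supplied by the caller (in the spirit of the #[no_std] discussion), and `align` is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* A bump ("linear") arena over a caller-supplied buffer. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} Arena;

void arena_init(Arena *a, void *buffer, size_t capacity)
{
    a->base = buffer;
    a->capacity = capacity;
    a->used = 0;
}

/* Allocate size bytes aligned to align (a power of two); NULL if full. */
void *arena_alloc(Arena *a, size_t size, size_t align)
{
    uintptr_t start = (uintptr_t)a->base + a->used;
    uintptr_t aligned = (start + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t offset = (size_t)(aligned - (uintptr_t)a->base);
    if (offset > a->capacity || size > a->capacity - offset)
        return NULL;
    a->used = offset + size;
    return a->base + offset;
}

/* Remember the current top so that a "frame" can be popped later. */
size_t arena_mark(const Arena *a) { return a->used; }

/* Pop everything allocated since the matching arena_mark(). */
void arena_release(Arena *a, size_t mark) { a->used = mark; }
```

A function with short-lived, variable-size output would take a mark, allocate its scratch space, and release back to the mark once the caller is done with the result.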

02:07:33 <zzo38> I wrote one program which needs the caller to specify a memory allocation function, although you can just put realloc and it will work, which is normally what would be done, I think.

02:07:39 <ais523> there's one /really big/ technical advantage to having the call stack the other way round, which is that in x86, overflows beyond the end of a stack-allocated array overwrite the function's return address by default, which is just about the worst possible thing to overwrite

02:08:05 <shachaf> ais523: I've heard that brought up as a security benefit and I don't really buy it.

02:08:18 <zzo38> In some instruction sets the call stack is not addressable

02:08:22 <ais523> if the call stack goes the other way, the situations in which that happens are much less common, requiring you to be writing to an array of a function which called the current one

02:08:36 <ais523> that said, having a separate call stack and data stack is a much better idea for multiple reasons

02:08:40 <shachaf> char buf[n]; sprintf(buf, "%s", ...); seems pretty common to me.

02:08:49 <shachaf> Yes, I agree that that makes more sense.

02:09:51 <ais523> shachaf: well, that example is broken regardless of which way up the stack goes (although with a stack that goes upwards, sprintf would at least be able to detect that something was wrong; it knows where its own return address is in memory, but not where its caller's is)

02:10:14 <moony> yea.. that seems like a security /issue/

02:10:17 <ais523> I'm not sure if anyone would write sprintf like that, but maybe they would

02:10:18 <moony> that's bad

02:10:29 <moony> in fact, that's exactly how ROP works

02:10:41 <shachaf> ais523: Or, I mean, char buf[n]; gets(buf); or whatever.

02:11:07 <shachaf> Many of the classic buffer overflow problems involve calling another function so changing the stack direction doesn't help much.

02:11:08 <ais523> moony: well, ROP is one of the most effective ways to exploit this, but the exploit isn't ROP, it's just that the exploit lets you use ROP

02:11:30 <ais523> gets should be simple enough to be inlined

02:11:46 <moony> also thumbs-up for using rust. If you're trying to allow custom allocators, use what is already provided

02:11:48 <shachaf> I would be pretty surprised if gets was inlined.

02:12:01 <shachaf> Rust seems to be pretty bad at custom allocators or unusual allocation strategies.

02:12:05 -!- botnick has joined.

02:12:20 <ais523> so would I, mostly because that function is such a bad idea that I doubt much effort has been put into optimising it

02:12:34 <ais523> I'm not sure if Rust's #[global_allocator] is stable yet, but that makes custom allocators very easy

02:12:49 <ais523> unusual allocation strategies are IME either very easy or very hard

02:12:57 <moony> mmm..

02:13:06 <shachaf> I think the main reason I think that is that Rust is all about destructors, which muddle deallocation with other code.

02:13:10 <moony> i thought rust's allocation choice could be done per function

02:13:11 <moony> guess not

02:13:15 <moony> i mean

02:13:17 <moony> i see why

02:13:46 <ais523> right, the issue there is not so much allocation but deallocation

02:13:59 <moony> shachaf: there is the Drop trait, which can be useful. Could possibly implement a trait that keeps tabs on what allocator was used to make a object

02:13:59 <shachaf> Unfortunately those issues are connected.

02:14:06 <ais523> although, it wouldn't be too hard to give the various allocators their own portions of address space and use the pointer value to tell them apart

02:14:32 <shachaf> I feel like you just want to do the right thing statically.

02:14:39 <ais523> btw, separately from this project, I had an idea about a very eso allocation strategy

02:14:45 <moony> oooo

02:14:51 <shachaf> If you're adding a bunch of dynamic behavior like that there should be a good reason for it.

02:14:57 <ais523> the idea was that all the pointers in the entire program would be doubly-linked

02:15:13 <ais523> like, to point from one object to another, you need to point to a pointer field that points back at you

02:15:20 <moony> i think the cache would commit suicide

02:15:40 -!- botnick has quit (Read error: Connection reset by peer).

02:15:43 <ais523> now, the great thing about this strategy is that you can move objects around in memory at will

02:15:49 <moony> i mean

02:15:52 <ais523> because you already know where all the inbound pointers are and can just update them

02:15:55 <moony> that's the strategy a lot of VMs use

02:15:57 <moony> like the JVM

02:16:17 <ais523> yes, I was inspired by compacting garbage collectors, the idea was to do the same thing without a GC

02:16:30 <ais523> I'm not sure if this is useful, and it probably isn't, which is why I described it as very eso
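
To make the "every pointer is doubly linked" idea concrete, here is a small C sketch. It is only an illustration under assumptions: the `Link` and `Node` types and the function names are invented here, and each object carries just one link. The point it shows is that an object can be copied elsewhere and its single inbound pointer repaired without searching memory.

```c
#include <stddef.h>
#include <string.h>

/* One end of a doubly-linked pointer: it points at the other end. */
typedef struct Link { struct Link *peer; } Link;

/* Hypothetical object type carrying a payload and one link. */
typedef struct { int value; Link link; } Node;

/* Join two links so that each points back at the other. */
void link_connect(Link *a, Link *b) { a->peer = b; b->peer = a; }

/* Call after the object containing l has been copied to a new address. */
void link_fixup(Link *l) { if (l->peer) l->peer->peer = l; }

/* Move a node: copy the bytes, then repair the one inbound pointer. */
void node_move(Node *dst, const Node *src)
{
    memcpy(dst, src, sizeof *dst);
    link_fixup(&dst->link);
}
```

The cost is visible even in this toy: every pointer takes two words, and every move writes to the memory of each object that points at the moved one.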

02:16:51 <moony> reducing memory fragmentation could be useful if you keep memory organized at the same time

02:17:01 <moony> and do a bit of the memory usage predictor's work for it

02:17:06 <shachaf> Did you see that malloc that merges pages which don't have overlapping allocations?

02:17:07 <ais523> yes, this is one of those cases where the advantages are clear but the disadvantages are terrifying

02:17:11 <ais523> shachaf: yes

02:17:34 <ais523> at first I thought "surely that isn't necessary", but then came up with some pathological programs where it is

02:17:57 <shachaf> I think my current attitude is that if your program is doing zillions of little mallocs and frees everywhere it's probably doing things badly anyway.

02:18:05 <moony> ^

02:18:14 <ais523> e.g. imagine a program that allocates 100 million 8-byte objects, deallocates 99 out of every 100, then allocates 1 million 800-byte objects

02:18:20 <moony> if it comes to that, use a slab allocator for some of those objects

02:18:42 <moony> assuming they're all uniform

02:18:56 <ais523> my thoughts on the matter are that the only difficult case is what happens if a program deallocates a lot of small objects, then needs to allocate large objects that are too large for the remaining spaces

02:19:21 <ais523> all the other cases are trivial by comparison

02:19:56 <ais523> I don't think this is a remotely common pattern for long-running interactive applications, but I can see a batch process doing something like that when moving on from one phase of computation to another

02:20:13 <moony> I think Dwarf Fortress has an issue with that

02:20:18 <moony> albeit with larger objects

02:20:20 <ais523> maybe, if you have a random malloc pattern, there should be some sort of compaction phase whenever your program shifts modes

02:21:01 <moony> in The Powder Toy everything is preallocated because malloc pauses are unacceptable

02:21:09 <ais523> anyway, this is what inspired the "everything is doubly-linked" idea because it's one of the only ways, short of a GC, to solve the problem

02:21:30 <ais523> (that said, preventing the problem happening in the first place by using a more predictable malloc/free scheme is probably better!)

02:21:57 <moony> If you know the max number of objects, then just allocate that, the linux kernel won't mind

02:22:10 <ais523> moony: are they preallocated /and/ prefaulted?

02:22:38 <moony> not prefaulted, but they generally get faulted very early in the app's lifespan so it's not an issue

02:22:41 <ais523> animalloc doesn't have any pauses other than page faults, and those will happen even with a preallocation strategy unless you prefault all your pages too

02:22:56 <moony> could add prefaulting for fun

02:23:11 <shachaf> Did you know that merely reading from an mmapped page in Linux will only fault it in as a CoW zero page rather than giving you your own memory?

02:23:16 <ais523> moony: it's one line of code

02:23:21 <ais523> shachaf: yes

02:23:35 <moony> ais523: and that is?

02:24:18 <ais523> actually I'm wrong, I thought madvise could do it on arbitrary memory, but the most aggressive prefault it supports is a request to preload into /cache/ faster

02:24:42 <moony> ok nvm it is prefaulted, we run the simulation clear no matter what

02:24:53 <ais523> mmap can do it with MAP_POPULATE but that requires you to actually be getting the memory from mmap

02:24:54 <moony> so the action of wiping the sim should cause prefaulting

02:25:21 <moony> as we technically "use" the memory

02:25:43 <ais523> …why are mmap and madvise not synchronized in what requests they support
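
A Linux-specific sketch of the MAP_POPULATE route mentioned above, for the case where the memory really does come straight from mmap. The helper names `alloc_prefaulted` and `free_prefaulted` are hypothetical. Note that MAP_POPULATE is a request to the kernel rather than a guarantee, so the sketch only shows how to ask for prefaulting.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_ANONYMOUS and MAP_POPULATE */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes of zeroed memory and ask the kernel to fault it in now. */
void *alloc_prefaulted(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Return the mapping to the kernel; len must match the allocation. */
void free_prefaulted(void *p, size_t len)
{
    munmap(p, len);
}
```

For memory that comes from malloc instead, writing to each page once (as the simulation-clear pass described above does) has a similar effect.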

02:26:28 <moony> ..prolly should work on reducing allocations in the game's graphics

02:26:52 <moony> the sim is effectively malloc free

02:26:59 <moony> but graphics are pretty CPU hungry too

02:27:59 <moony> ..gah

02:28:07 <moony> we still haven't squashed the malloc in INST code

02:28:27 <ais523> OK, I've now implemented the 1, 2, 4, and 8 byte versions of this function

02:28:56 <ais523> the -~ gets optimized to +1 only in the 8-byte case (presumably because LLVM can't optimize the - movzx ~ sequence when the number is known to be negative)

02:30:56 <ais523> one interesting thing is that the optimizer implements the comparison with 0xC100000000000000 using a 56-bit shift and a comparison with 0x000000C1; I can believe that all those leading zeroes are actually fastest, but think there might be a better way

02:31:34 <moony> that sounds like it would be faster on Nehalem era CPUs, but not sure about newer things

02:31:44 <moony> (Nehalem has some.. weird constraints)

02:32:58 <ais523> right, I can see why a comparison of the bottom 8 bits with 0xC1 (which would be mathematically correct) might be slower on some processors

02:33:13 <Hooloovo0> I could also see it being smaller code

02:33:20 <moony> ^ check if it's smaller

02:33:36 <moony> fitting as many instructions in 16 bytes as possible is usually a goal

02:33:38 <Hooloovo0> shift-right is probably 1 instruction, and compare 8 bit is another

02:34:03 <Hooloovo0> and fitting 0xc100000... as an immediate is going to be longer

02:34:38 <ais523> you'd simply have to replace "cmp $0xc0,%rsi" (7 bytes) with "cmp $0xc0,%sil" (4 bytes)

02:34:54 <ais523> as Hooloovo0 says, the only difference is the leading zeroes on the immedite

02:34:56 <ais523> *immediate

02:35:05 <moony> there's a subtle difference here

02:35:26 <moony> the 32-bit registers and 64-bit ones are different, if i recall right

02:35:27 <Hooloovo0> oh, it's a 32-bit compare? weird

02:35:49 <moony> but..

02:35:50 <moony> hm

02:35:53 <moony> can i see ASM

02:35:53 <ais523> Hooloovo0: it's a 64-bit compare using a 32-bit immediate, x86_64 doesn't use 64-bit immediates apart from one instruction

02:36:01 <Hooloovo0> oh, ok

02:36:17 <moony> ais523: can you make a quick listing of the ASM here?

02:36:23 <ais523> mov %rdx,%rsi \ shr $0x38,%rsi \ cmp \ $0xc0,%rsi

02:36:29 <ais523> err, I added an extra slash

02:36:39 <ais523> mov %rdx,%rsi \ shr $0x38,%rsi \ cmp $0xc0,%rsi \ ja 2e

02:36:47 <ais523> (then the value in %rsi is no longer used, nor are the flags)
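
The three ways of phrasing this test can be written out in C to see that they accept exactly the same inputs. This is a sketch that assumes the source-level condition is "x is at least 0xC100000000000000"; the function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Source-level form: would need a 64-bit constant in the instruction. */
bool above_wide(uint64_t x) { return x >= UINT64_C(0xC100000000000000); }

/* What the listing does: shift the top byte down, then compare 64-bit. */
bool above_shift(uint64_t x) { return (x >> 56) > 0xC0; }

/* The narrower alternative: compare only the low byte after the shift. */
bool above_byte(uint64_t x) { return (uint8_t)(x >> 56) > 0xC0; }
```

After the 56-bit shift the upper 56 bits are zero, so the 8-bit and 64-bit comparisons cannot disagree; only the encoding size differs.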

02:36:52 <moony> can't downgrade to a 32-bit register.

02:37:02 <ais523> moony: %esi is just the bottom half of %rsi

02:37:18 <moony> No, they're treated different if i remember right

02:37:36 <ais523> the complication you're thinking of is that if you ever assign to %esi directly, it gets automatically zero-extended into %rsi

02:37:45 <moony> ahg

02:37:47 <moony> ah

02:37:54 <moony> alright, i'm being a derp

02:38:01 <ais523> i.e. the /top half/ of %rsi is like a register of its own with its own weird rules

02:38:28 <ais523> (meanwhile, assigning to %si or %sil does not clear any other part of %rsi, the rule's only for the %e* variant of registers specifically)
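
The register-write rules just described can be modelled in C as pure functions on a 64-bit value. This is a model of the semantics only, not real register access, and the `write_r*` names are invented for this sketch.

```c
#include <stdint.h>

/* Write to a 32-bit register (e.g. %esi): upper 32 bits become zero. */
uint64_t write_r32(uint64_t old, uint32_t v)
{
    (void)old;                 /* nothing of the old value survives */
    return v;
}

/* Write to a 16-bit register (e.g. %si): upper 48 bits are preserved. */
uint64_t write_r16(uint64_t old, uint16_t v)
{
    return (old & ~UINT64_C(0xFFFF)) | v;
}

/* Write to a low 8-bit register (e.g. %sil): upper 56 bits are preserved. */
uint64_t write_r8(uint64_t old, uint8_t v)
{
    return (old & ~UINT64_C(0xFF)) | v;
}
```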

02:39:05 <shachaf> I saw a post recently about which amd64 registers aren't special-cased in some way. It's only a few of them.

02:39:05 <moony> that last bit i did /not/ know

02:39:06 <Hooloovo0> what

02:39:12 <ais523> …which is why xchg %ax, %ax is a no-op, because that's a 16-bit write

02:39:23 <Hooloovo0> whose idea was this

02:39:25 <moony> Hooloovo0: yes, welcome to x86

02:39:35 <ais523> shachaf: six of r8..r15 are not special-cased, I think

02:39:36 <moony> it's all in the name of (misguided) backwards compatibility

02:39:44 <ais523> err, r9..r16

02:40:04 <ais523> I can't remember which two it is that are, it's either 10 and 11, 11 and 12, or 12 and 13

02:40:12 <shachaf> Yes, https://twitter.com/rygorous/status/1162078329706405888

02:40:18 <Hooloovo0> man, fuck computers

02:40:30 <shachaf> ais523: Wait, writing to ax doesn't zero the top half of rax?

02:40:35 <ais523> shachaf: right

02:40:43 <Hooloovo0> nor eax

02:40:46 <ais523> this is what I was missing when we had our nop discussion a while back

02:41:24 <moony> Hooloovo0: i await the day RISC-V takes over the world with sane design

02:41:26 <ais523> so 90 is a nop because it's special-cased, even though that instruction would normally decode to xchg eax, eax (this forces you to use a different encoding if you actually want the swap)

02:41:45 <ais523> but 66 90 is a nop semantically, it swaps ax with itself and that doesn't have side effects

02:42:02 <ais523> (although it's almost certainly special-cased anyway, that's just for performance reasons)

02:42:03 <shachaf> Oh man.

02:42:13 <shachaf> I feel like I came across this fact once before but I completely forgot about it.

02:42:16 <shachaf> What a mess.

02:42:24 <moony> I want RISC-V to take over the world

02:42:44 <ais523> this is why disassemblers can disassemble 66 90 to xchgw ax, ax without being incorrect

02:43:08 <Hooloovo0> yeah, risc-v is nice

02:43:17 * Hooloovo0 has designed a risc-v cpu

02:44:11 <shachaf> Until now I thought that anything that only wrote to the lower 32 bits automatically zeroed the upper 32 bits.

02:44:13 <moony> I want a decently priced linux capable RISC-V board

02:44:15 <moony> then i will be happy

02:45:12 <ais523> fwiw, the special-casing on r12 is almost forgivable, the issue is that some notation is needed to be able to specify not using a register; x86 sacrificed the ability to write an address that requires multiplying the stack pointer by a constant (somewhat understandably) to give the encodings it needed

02:45:38 <ais523> and r12 is encoded by specifying "stack pointer, except second set of registers" and that would therefore need a special-case decode to allow you to multiply it by a constant

02:46:11 <shachaf> Yes, both r12 and r13 are inheriting restrictions from rsp and rbp.

02:47:55 <ais523> I came up with a new optimisation idea a while back: the idea is that when you're implementing an array of structures as parallel arrays (e.g. for alignment reasons), you take a pointer into the array with smallest elements

02:48:16 <ais523> and index into the others by multiplying the /pointer/ by a constant and adding another constant with an appropriately calculated value

02:48:35 <ais523> (or if the arrays don't have fixed addresses, e.g. in a re-entrant function, another register)

02:48:55 <ais523> in the non-fixed-address scenario this saves one register over the obvious way of writing things and isn't any longer or slower

02:49:16 <ais523> although it's really highly illegal in C, you can do it in x86 asm just fine
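
A sketch of the address arithmetic behind this trick, for a byte array paired with an array of 4-byte elements. It goes through `uintptr_t`, so it relies on implementation-defined pointer/integer conversions and a flat address space; it illustrates the arithmetic and is not portable C. The function names are hypothetical.

```c
#include <stdint.h>

/* Precompute k so that 4*p + k is the address of the uint32_t element
   paired with the uint8_t element at address p. */
uintptr_t companion_offset(const uint8_t *narrow, const uint32_t *wide)
{
    return (uintptr_t)wide - 4 * (uintptr_t)narrow;
}

/* Scale the pointer itself, then add the precomputed constant. */
uint32_t *companion(const uint8_t *p, uintptr_t k)
{
    return (uint32_t *)(4 * (uintptr_t)p + k);
}
```

With p = narrow + i, the expression 4*p + k works out to wide + 4*i, which is the shape an x86 scaled addressing mode can compute in a single operand.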

02:49:19 <moony> Hooloovo0: if you think the base x86-64 stuff is insane

02:49:22 <moony> mess with SSE

02:49:42 <ais523> AVX seems to have hit a sweet spot of performance and sanity, IMO

02:49:50 <moony> Yea, AVX is ok

02:49:50 <ais523> then it started going back downhill from AVX2 onwards

02:49:59 <moony> AVX-512 ;-;

02:50:07 <shachaf> What was wrong with AVX2?

02:50:13 <ais523> I've come to the decision to target AVX1 as my baseline instruction set

02:50:23 <shachaf> It gave a lot of integer instructions that were missing in AVX1.

02:50:25 <ais523> shachaf: nothing massively wrong, it just has some jarring inconsistencies and missing features

02:50:48 <moony> also the useless instructions

02:50:52 <moony> that only have one use

02:50:56 <moony> and are junk for anything else

02:51:00 <shachaf> A while ago I wanted to write vectorized code and I couldn't use AVX.

02:51:07 <ais523> e.g. there are no instructions to operate on 256-bit values as a whole, only pairs of 128-bit values

02:51:17 <ais523> you can't shift the top half of a ymm value into the bottom half

02:51:41 <ais523> which is weird, because you can shift the top half and bottom half of a ymm value separately into the top and bottom halves respectively of a second ymm value

02:51:42 <moony> RISC-V's vector extension is looking really good

02:51:47 <moony> you seen it, ais523?

02:51:53 <ais523> no

02:52:13 <moony> https://github.com/riscv/riscv-v-spec

02:52:59 <moony> it's great

02:53:07 <moony> completely vector register width agnostixc

02:53:10 <moony> *agnostic

02:53:19 <moony> so that old code can utilize larger register sizes easily

02:55:02 <Hooloovo0> should x86 be considered an esolang?

02:55:17 <moony> prolly

02:55:23 <moony> a esolang that everyone uses

02:56:20 <shachaf> Much less than C++, I think.

03:06:09 <zzo38> What I would want to have is the bitwise operations that MMIX has. There is a bit operation extension for RISC-V, but I think the bit operations of MMIX is better.

03:06:27 <zzo38> (At least, for a 64-bit system, it is better. Perhaps for a 32-bit system maybe it isn't better.)

03:06:34 <moony> zzo38: are MAK and EXT (MAKe and EXTract) in that list?

03:06:43 <moony> they're both bitfield opts

03:06:45 <moony> *ops

03:06:58 <moony> really powerful, enough that they can replace shiftleft/shiftright instructions

03:07:36 <zzo38> What are MAK and EXT doing?

03:08:07 <zzo38> (I also like the Muxcomp operation, described in esolang wiki)

03:08:09 * moony pulls up MC88100 desc of them

03:10:10 <moony> zzo38: http://bitsavers.trailing-edge.com/components/motorola/88000/MC88100_RISC_Microprocessor_Users_Manual_2ed_1990.pdf pages 3-44, 3-45, 3-46, 3-47, 3-70, and 3-71,

03:10:26 <moony> it has diagrams describing operation, alongside text

03:11:27 <zzo38> Do you have the order number of the page?

03:11:55 <moony> one sec

03:12:34 <moony> MAK is on 125..

03:13:05 <moony> EXT on 99.

03:13:54 <moony> that good, zzo38?

03:14:25 <moony> I own the physical manual, forgot that the PDF is annoying to navigate

03:15:10 <zzo38> Yes, that is good

03:15:24 <zzo38> (I didn't read all of it yet though)

03:20:59 <zzo38> That MAK and EXT is good. (MMIX doesn't have it, but does have MOR and MXOR and SADD (MOR is very useful), and Muxcomp is even more general to do)

03:21:45 <zzo38> (A problem with Muxcomp though, is that it requires a lot of operands.)

03:23:38 <zzo38> Muxcomp is: Form a 5-bit number from the low bits of each of the first five operands to select which bit of the last operand to copy to the low bit of the result, and then the same for the next bit position, and so on.

03:24:49 <zzo38> (Or six bits for 64-bit registers)

03:26:05 <ais523> zzo38: doesn't that effectively implement an arbitrary five-input bitwise operator?

03:26:11 <ais523> with the last operand specifying the truth table?

03:26:20 <kolontaev> what do you use MOR for? if it's useful...

03:27:17 <zzo38> ais523: I suppose so, yes. (You can also perform a shift or rotate of the last operand by setting the other operands properly)
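
A C sketch of Muxcomp as described above, for 32-bit registers. The description does not say which operand supplies which bit of the 5-bit index, so the ordering here (first operand is the least significant index bit) is an assumption, as is the name `muxcomp32`.

```c
#include <stdint.h>

/* For each bit position i, build a 5-bit index from bit i of a..e and
   copy that bit of table into bit i of the result. */
uint32_t muxcomp32(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                   uint32_t e, uint32_t table)
{
    uint32_t result = 0;
    for (int i = 0; i < 32; i++) {
        unsigned index = ((a >> i) & 1u)
                       | ((b >> i) & 1u) << 1
                       | ((c >> i) & 1u) << 2
                       | ((d >> i) & 1u) << 3
                       | ((e >> i) & 1u) << 4;
        result |= ((table >> index) & 1u) << i;
    }
    return result;
}
```

With table 0x80000000 this computes the AND of the five operands, and with 0xFFFFFFFE their OR, which matches the truth-table reading. Passing 0xAAAAAAAA, 0xCCCCCCCC, 0xF0F0F0F0, 0xFF00FF00, 0xFFFF0000 as the selectors makes the index equal to the bit position, so the result is the table itself; permuting those patterns gives shifts and rotates.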

03:27:20 <moony> MAK and EXT have the advantage of practicality

03:27:43 <zzo38> kolontaev: One use is endian switch (including PDP-endian).

03:27:52 <zzo38> But there are some other uses, too.

03:28:01 <kolontaev> zzo38: aha

03:28:41 <ais523> a while back in here we were discussing how Intel had added INTERCAL's select operator ~ to the x86 instruction set

03:29:34 <ais523> the paper discussing it mentioned an instruction that's effectively a bitwise keysort; you stable-sort the bits of one operand using the bits of the other operand as keys

03:30:04 <ais523> that hasn't been implemented but it seems like it could be useful

03:30:12 <ais523> (a select is basically a bitwise-keysort followed by an AND)

03:30:19 <ais523> err, preceded by an AND
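
A software sketch of the select operation for 32-bit values: bits of `x` at the positions where `mask` is 1 are packed, in order, into the low end of the result. The x86 instruction referred to is presumably BMI2's PEXT, which has this behaviour; the name `select_bits` is invented for this sketch.

```c
#include <stdint.h>

/* Gather the bits of x selected by mask and pack them to the right,
   keeping their relative order. */
uint32_t select_bits(uint32_t x, uint32_t mask)
{
    uint32_t result = 0;
    unsigned out = 0;                    /* next free output position */
    for (unsigned i = 0; i < 32; i++) {
        if ((mask >> i) & 1u) {
            result |= ((x >> i) & 1u) << out;
            out++;
        }
    }
    return result;
}
```

For example, selecting from 0x12345678 with mask 0xFF00FF00 packs the second and fourth bytes together and gives 0x1256.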

03:34:36 -!- ARCUN has joined.

03:35:55 <ARCUN> cpressey: Do you know where I could get an actual Commodore 64? I need one for a project.

03:36:03 -!- ARCUN has quit (Remote host closed the connection).

03:37:25 <moony> uh

03:37:26 <moony> huh

03:39:21 <zzo38> ais523: Yes, I think it can be useful

03:42:26 <esowiki> [[Commercial]] https://esolangs.org/w/index.php?diff=65539&oldid=65535 * Dtuser1337 * (+8) Formatting codebase.

03:43:44 <zzo38> MMIX doesn't have a "count trailing zero" instruction, but it can easily be done with two instructions (the SADD instruction is popcount(x AND NOT y))

03:43:46 <esowiki> [[Commercial]] https://esolangs.org/w/index.php?diff=65540&oldid=65539 * Dtuser1337 * (-6) continue formatting codebase and move category to bottom

03:45:47 <esowiki> [[Commercial]] https://esolangs.org/w/index.php?diff=65541&oldid=65540 * Dtuser1337 * (-52) /* Implementations */

03:46:20 <ais523> zzo38: is that (x-1) sadd x?

03:46:45 <esowiki> [[ORK]] https://esolangs.org/w/index.php?diff=65542&oldid=53698 * Dtuser1337 * (+13) /* External resources */

03:47:35 <zzo38> Yes.
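
The two-instruction count-trailing-zeros just confirmed, written out in C. `sadd` follows the definition given above (popcount of x AND NOT y); the portable popcount loop and the name `ctz_via_sadd` are additions for this sketch.

```c
#include <stdint.h>

/* Portable population count: clear the lowest set bit until none remain. */
static unsigned popcount64(uint64_t v)
{
    unsigned n = 0;
    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/* SADD as described: popcount(x AND NOT y). */
unsigned sadd(uint64_t x, uint64_t y) { return popcount64(x & ~y); }

/* (x-1) sadd x: x-1 turns the trailing zeros into ones, and AND NOT x
   keeps only those bits. */
unsigned ctz_via_sadd(uint64_t x) { return sadd(x - 1, x); }
```

For x = 0 the subtraction wraps to all ones and the result is 64, which is a reasonable answer for "no set bit".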

04:54:20 <esowiki> [[Camouflage]] https://esolangs.org/w/index.php?diff=65543&oldid=30836 * Dtuser1337 * (+24)

04:54:49 <esowiki> [[ZOMBIE]] https://esolangs.org/w/index.php?diff=65544&oldid=53713 * Dtuser1337 * (+20)

05:08:36 -!- ais523 has quit (Quit: quit).

06:35:06 -!- nfd9001 has joined.

07:19:01 -!- rain1 has joined.

08:24:12 -!- Lord_of_Life has quit (Ping timeout: 245 seconds).

08:27:28 -!- Lord_of_Life has joined.

08:32:07 -!- AnotherTest has joined.

08:52:51 -!- nfd9001 has quit (Ping timeout: 264 seconds).

08:55:08 -!- nfd9001 has joined.

09:02:01 -!- b_jonas has joined.

09:05:41 <b_jonas> ais523: "~b has to be a positive number in the 0..30 range" => I think you mean in the 0..0x3E inclusive range

09:07:43 <b_jonas> "should I write it as 1 + (ptrdiff_t)(signed char)b in the original source" => use an ifdef, with the original version inactive and the optimized version active, write a comment that it's an optimization, and then check that the optimized version is compiled the way you want

09:08:03 <b_jonas> that said, isn't it possible that this is a case where the optimization doesn't matter?

09:10:58 <esowiki> [[Edition]] N https://esolangs.org/w/index.php?oldid=65545 * A * (+874) Created page with "Edition language Double speak example I"$R. C"F"0@i1@i Explanation I Take input "$ Copy the input's length to clipboard R. Read the current file's content into the..."

09:11:12 <b_jonas> "the in-memory representation of a parse tree" => that wasn't my guess

09:12:18 -!- arseniiv has joined.

09:13:41 <b_jonas> "having a separate call stack and data stack is a much better idea" => is it still, now that CPUs cache the top of the call stack as an optimization?

09:16:41 <b_jonas> ais523: moving objects around but without a gc, "at first I thought "surely that isn't necessary", but then came up with some pathological programs where it is" => in the kernel, trying to free up pages when it has lots of individually allocated small structures fragmented around memory

09:17:26 <b_jonas> kernel people are actually working on that, and yes, the disadvantages are terrifying, although I think they're doing it more memory conservingly than doubly linking every individual pointer

09:19:48 <arseniiv> hi. Nice it’s calm and wise here as usual :)

09:20:50 <arseniiv> (I was without IRC for two weeks and for some reason I’m still unpacking)

09:22:40 <b_jonas> "comparison with 0xC100000000000000" => that's because you can't have immediates larger than 32 bits, so it has to be two instructions anyway. that it's a shift of the input is still a bit surprising to me, but apparently that's how you can get it in just two instructions, rather than just three as in loading the constant in two instructions.

09:23:37 <b_jonas> may also depend on the context

09:24:23 <arseniiv> hm I played a game called Weave where you place tiles on a hexagonal grid, each tile having two points marked on each side and connecting these points by curved segments in some fashion as to make various intertwined paths when many tiles are placed together. Now I suddenly wonder if something can be esolanged from there

09:25:26 <b_jonas> "you'd simply have to replace "cmp $0xc0,%rsi" (7 bytes) with "cmp $0xc0,%sil" (4 bytes)" => why? isn't there an encoding of "cmp $0xc0,%rsi" that has a 1-byte immediate?

09:25:57 <arseniiv> turns and “knottiness” of paths could be used for semantics, and there are only so many tiles available to make the composition not so trivial

09:27:00 <arseniiv> also I think I have already seen those tiles somewhere even before that game, but don’t remember where

09:28:06 <arseniiv> and it could be downplayed to square tiles if necessary. If anybody would be interested, I’ll link to that game in G.Play and/or draw examples of tiles

09:28:14 <b_jonas> how would gcc even get the assembler to emit the encoding with the 4-byte immediate?

09:31:34 <int-e> b_jonas: 0xc0 is too big for a one-byte immediate

09:31:52 <int-e> (because they are sign-extended)

09:32:29 <int-e> and I thought picking immediate sizes was the assembler's job, not gcc's.

09:33:16 <int-e> Oh, but choosing operand sizes isn't... that's the point.

09:33:52 <b_jonas> int-e: ah ok

09:34:00 <int-e> And intel only offers the choice between one byte immediates and full byte immediates.

09:34:15 <int-e> I meant s/full byte/full size/
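
A small sketch of int-e's point, modelling the sign extension in plain C rather than in assembly. The function names are hypothetical. It assumes, as stated in the discussion, that a 64-bit cmp can take either a one-byte immediate or a four-byte one, and that both are sign-extended to the operand size.

```c
#include <assert.h>
#include <stdint.h>

/* Models how an x86 imm8 is widened to a 64-bit operand: sign extension. */
int64_t sign_extend_imm8(uint8_t imm)
{
    /* Computed arithmetically to avoid implementation-defined narrowing casts. */
    return (imm & 0x80) ? (int64_t)imm - 256 : (int64_t)imm;
}

/* A constant can use the one-byte form only if it lies in -128..127. */
int fits_in_imm8(int64_t value)
{
    return value >= -128 && value <= 127;
}

/* Immediate bytes for a 64-bit `cmp $imm, %reg`: 1 if imm8 fits, else 4. */
int cmp_imm_bytes(int64_t value)
{
    /* Constants outside the imm32 range need a separate load instruction. */
    assert(value >= INT32_MIN && value <= INT32_MAX);
    return fits_in_imm8(value) ? 1 : 4;
}
```

With these definitions the byte 0xC0 widens to -64, not 192, so 192 needs the four-byte immediate. That is consistent with the 7-byte encoding of "cmp $0xc0,%rsi" quoted above.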

09:34:23 <b_jonas> int-e: gcc would know what size the instruction has, in fact the whole part where it outputs assembly rather than the object code directly is mostly for human-debuggability I think

09:35:08 <int-e> I'm not sure that is the case. gcc probably has a good idea how big instructions will be, of course.

09:35:24 <int-e> But I would hope it doesn't rely on always being correct.

09:46:00 <b_jonas> int-e: it certainly doesn't have to know for rare slow instructions, because it can handle unknown instructions in asm statements, but for when it generates fast code, it probably knows the instruction size for optimization

09:46:14 <b_jonas> (for x86_64)

09:46:40 <int-e> b_jonas: Sorry, I'll take that as speculation unless you have proof.

09:46:54 <b_jonas> unrelated, but http://www.wyrdplay.org/AlanBeale/CAAPR-ref.html is a small English pronunciation dictionary by the author of 12dicts that seems to be well compiled

09:47:36 <int-e> (For example, a key purpose of things like .align is that the compiler can align code without knowing exactly how big the individual instructions are.)

09:50:56 <b_jonas> int-e: again, gcc generally doesn't know how long the emitted instructions from an asm statement are, which is documented in the gcc docs because it also says that gcc assumes an upper bound and what sneaky things you shouldn't do in the asm statement to violate that

09:51:16 <b_jonas> int-e: and gcc probably can't tell the length of some jump instructions in the first pass

09:51:40 <b_jonas> that said, some of the assembly syntax is probably for humans, not for compilers

09:51:56 <b_jonas> like, do you think the compiler needs both decimal and hex integer syntax?

09:52:32 <b_jonas> the .align too can be useful to make the assembly output more readable, even in the case when the compiler does happen to know how many bytes it emits

09:52:53 <b_jonas> plus .align 16 is shorter than the assembly mnemonic of a 9-byte nop instruction

09:55:31 <b_jonas> or looks nicer, rather

09:55:33 <b_jonas> shorter doesn't matter

09:56:14 -!- FreeFull has joined.

10:10:54 -!- botnick has joined.

10:11:24 -!- botnick has quit (Read error: Connection reset by peer).

11:11:57 -!- FreeFull has quit.

11:12:37 <b_jonas> `card-by-name Hecatomb

11:12:38 <HackEso> Hecatomb \ 1BB \ Enchantment \ When Hecatomb enters the battlefield, sacrifice Hecatomb unless you sacrifice four creatures. \ Tap an untapped Swamp you control: Hecatomb deals 1 damage to any target. \ IA-R, 5E-R, 6E-R, MED-R

11:12:58 <b_jonas> ^ It seems a bit underwhelming that they use the word "hecatomb" for the sacrifice of a mere four creatures

11:13:39 <b_jonas> `card-by-name Epic Struggle

11:13:39 <HackEso> Epic Struggle \ 2GG \ Enchantment \ At the beginning of your upkeep, if you control twenty or more creatures, you win the game. \ JUD-R

11:17:43 -!- livesex_0699 has joined.

11:17:46 -!- livesex_0699 has left.

11:26:39 <b_jonas> (The helix pineapple requires you to pay mana only, not sacrifice creatures.)

12:17:51 -!- andrewtheircer has joined.

12:17:55 <andrewtheircer> hi

12:18:47 <andrewtheircer> idea: programming language only usable from last thursday to next thursday

12:27:33 -!- dog_star_ has changed nick to dog_star.

12:29:12 -!- dog_star has quit.

12:29:33 -!- dog_star has joined.

12:30:07 <andrewtheircer> hi doh

13:03:56 -!- xkapastel has joined.

13:10:41 -!- tromp_ has quit (Remote host closed the connection).

13:36:24 <esowiki> [[~English]] https://esolangs.org/w/index.php?diff=65546&oldid=37171 * Dtuser1337 * (+24) this is a high level language due to how it looked like.

13:36:57 <esowiki> [[Commercial]] https://esolangs.org/w/index.php?diff=65547&oldid=65541 * Dtuser1337 * (+24)

13:37:30 <esowiki> [[Drive-In Window]] https://esolangs.org/w/index.php?diff=65548&oldid=65536 * Dtuser1337 * (+23)

13:42:59 -!- arseniiv_ has joined.

13:42:59 -!- arseniiv has quit (Read error: Connection reset by peer).

13:48:03 -!- Frater_EST has joined.

13:49:46 -!- Frater_EST has left.

13:50:54 -!- tromp has joined.

13:54:17 -!- MDude has quit (Quit: Going offline, see ya! (www.adiirc.com)).

13:55:51 -!- tromp has quit (Ping timeout: 264 seconds).

14:04:21 -!- tromp has joined.

15:34:10 -!- Sgeo__ has joined.

15:37:07 -!- Sgeo_ has quit (Ping timeout: 248 seconds).

16:10:16 -!- xkapastel has quit (Quit: Connection closed for inactivity).

16:51:12 -!- tromp has quit (Remote host closed the connection).

16:53:04 <zzo38> Is there a metric-only font format in PostScript?

17:01:57 -!- tromp has joined.

17:10:26 -!- xkapastel has joined.

17:13:57 <andrewtheircer> hi zzo

17:14:25 <zzo38> Hello

17:15:54 <andrewtheircer> what is your favorite eso you've made

17:16:00 <andrewtheircer> your crowning achievement in a sense

17:21:20 <zzo38> I don't know.

17:21:38 <zzo38> (I think that I do not make a "crowning achievement")

17:21:47 <b_jonas> zzo38: you mean like for overlaying text on a scanned image of a text? I think you could use a vector font where all the glyphs are blank

17:22:22 <andrewtheircer> well then zzo

17:22:22 -!- Sgeo has joined.

17:23:19 <andrewtheircer> are you david madore , b

17:23:24 <zzo38> b_jonas: Actually I mean for use with an output driver that will provide its own glyphs or convert to some other format that will then be processed by something which provides its own glyphs, although it would also work to overlay text on a scanned image of text too

17:23:32 <zzo38> andrewtheircer: No, I am Aaron Black.

17:23:43 <andrewtheircer> well then

17:24:00 -!- Sgeo__ has quit (Ping timeout: 272 seconds).

17:24:02 <andrewtheircer> does everyone know you are aaron black

17:24:38 <andrewtheircer> if not

17:24:40 <andrewtheircer> this is big revel

17:25:07 <zzo38> I don't know, but I think it doesn't really matter so much. I am just known as "zzo38", and will not confuse with other people also named Aaron Black, I think.

17:25:21 <b_jonas> zzo38: ok. I'd think a font with blank or dummy vector images would work for that too

17:25:39 <andrewtheircer> nice stuff aaron

17:25:54 <andrewtheircer> also:i was asking b_jonas if they were david madore

17:26:40 <b_jonas> andrewtheircer: of course not

17:26:44 <zzo38> b_jonas: Yes, that would work, I suppose

17:26:48 <andrewtheircer> d'oof

17:28:15 <b_jonas> zzo38: but maybe otf already has a way to indicate that a font has metrics only. few people know how font formats actually work. maybe oren knows.

17:34:38 <andrewtheircer> it's a low voodoo about font formats

17:36:05 <b_jonas> it's not surprising that nobody understands them: there's a lot of historical baggage of evolving technology, and a lot of useful optimization

17:44:06 -!- andrewtheircer has quit (Remote host closed the connection).

17:46:25 -!- tromp has quit (Remote host closed the connection).

17:58:55 -!- tromp has joined.

18:09:09 -!- tromp has quit (Remote host closed the connection).

18:17:12 <arseniiv_> I am also not David Madore

18:17:16 -!- arseniiv_ has changed nick to arseniiv.

18:39:40 -!- tromp has joined.

18:44:05 -!- tromp has quit (Ping timeout: 244 seconds).

18:54:05 <esowiki> [[PureStack]] https://esolangs.org/w/index.php?diff=65549&oldid=41588 * Dtuser1337 * (+27)

18:55:11 <esowiki> [[Super Stack!]] https://esolangs.org/w/index.php?diff=65550&oldid=37526 * Dtuser1337 * (+24)

18:55:44 <esowiki> [[Super Stack!]] https://esolangs.org/w/index.php?diff=65551&oldid=65550 * Dtuser1337 * (-24)

18:57:48 <esowiki> [[Super Stack!]] https://esolangs.org/w/index.php?diff=65552&oldid=65551 * Dtuser1337 * (+23)

18:59:16 -!- adu has joined.

19:01:28 <zzo38> I think the font formats with METAFONT is good.

19:14:04 <esowiki> [[Emoji]] https://esolangs.org/w/index.php?diff=65553&oldid=46750 * Dtuser1337 * (+104) Adding some categories.

19:19:13 -!- Sgeo has quit (Ping timeout: 244 seconds).

19:21:10 <esowiki> [[Emoji]] https://esolangs.org/w/index.php?diff=65554&oldid=65553 * Dtuser1337 * (+24) Added output only category due to potential no input.

19:22:42 -!- Sgeo has joined.

19:26:39 -!- tromp has joined.

19:27:54 <esowiki> [[Hello world program in esoteric languages]] https://esolangs.org/w/index.php?diff=65555&oldid=65405 * Dtuser1337 * (+39)

19:32:31 <esowiki> [[Hello world program in esoteric languages]] https://esolangs.org/w/index.php?diff=65556&oldid=65555 * Dtuser1337 * (+746) /* Emoji */

19:38:01 <esowiki> [[Hello world program in esoteric languages]] https://esolangs.org/w/index.php?diff=65557&oldid=65556 * Dtuser1337 * (+91) /* suicide */

19:39:25 -!- tromp has quit (Remote host closed the connection).

19:40:31 <esowiki> [[Hello world program in esoteric languages]] https://esolangs.org/w/index.php?diff=65558&oldid=65557 * Dtuser1337 * (+152) /* MATL */

19:40:36 -!- adu has quit (Quit: adu).

19:41:09 -!- FreeFull has joined.

19:54:19 -!- sombrero has joined.

19:55:35 <sombrero> :0 the style has changed

19:56:05 <int-e> style?

19:56:34 <int-e> Oh the webchat I guess. IRC is a pure text-based protocol, it has no style.

19:56:48 -!- tromp has joined.

19:56:50 <sombrero> yep the frontend

19:57:38 -!- Sgeo_ has joined.

19:58:41 <int-e> sombrero: you're the only user of the webchat here, it seems.

19:59:16 <sombrero> ???

20:00:52 <int-e> (But I'm not sure, some people are cloaked which hides this information.) In any case, there are many IRC clients. The Freenode webchat frontend is just one of them.

20:01:04 -!- Sgeo has quit (Ping timeout: 272 seconds).

20:01:28 <int-e> https://en.wikipedia.org/wiki/Comparison_of_Internet_Relay_Chat_clients <-- many

20:01:51 <zzo38> Yes, and I use one I wrote by myself. Others may also use programs not mentioned on Wikipedia, too

20:06:57 -!- tromp has quit (Remote host closed the connection).

20:08:47 <sombrero> :v anyway, I would be very happy if you took a look at a compilation of drawings http://vixra.org/abs/1907.0332 and gave your criticism. It would be pretty nice if this inspires some good ideas to make new stuff ;)

20:10:26 <sombrero> (esoteric answers and comment are well accepted ;P)

20:17:23 <b_jonas> int-e: on freenode, the webchat cloak overrides the other cloaks, so you can't hide it that way

20:18:18 <b_jonas> hmm

20:18:42 <b_jonas> sorry, apparently that doesn't apply to kiwi (oh I hate that client by the way), only some other webchats

20:18:53 <b_jonas> because sombrero's hostname isn't a cloak

20:22:22 -!- tromp has joined.

20:23:24 -!- Lord_of_Life_ has joined.

20:25:52 -!- Lord_of_Life has quit (Ping timeout: 244 seconds).

20:25:53 -!- Lord_of_Life_ has changed nick to Lord_of_Life.

20:33:41 -!- sombrero has quit (Remote host closed the connection).

20:40:34 -!- Phantom_Hoover has joined.

21:21:55 <zzo38> Priests of Sun {1W} Creature - Human Cleric (1/1) ;; {1W}: Target creature gets +0/+1 until end of turn. ;; {T}: Tap target blocking creature. That creature gets +1/+2 until end of turn.

21:22:22 <zzo38> Priests of Worm {1B} Creature - Human Cleric (1/1) ;; {(B/P)}: ~ gains first strike and haste until end of turn. ;; {T}, Sacrifice a non-artifact creature: Add {BB}.

21:39:13 -!- AnotherTest has quit (Ping timeout: 250 seconds).

21:39:25 -!- Phantom_Hoover has quit (Quit: Leaving).

22:02:07 -!- andrewtheircer has joined.

22:06:17 -!- Bowserinator has quit (Ping timeout: 245 seconds).

22:06:28 -!- Bowserinator_ has joined.

22:28:41 -!- Sgeo has joined.

22:29:52 -!- Sgeo_ has quit (Ping timeout: 244 seconds).

22:36:07 -!- Sgeo_ has joined.

22:39:13 -!- Sgeo has quit (Ping timeout: 245 seconds).

22:47:11 <zzo38> Do you like these cards?

22:47:25 <andrewtheircer> who is cards

22:48:26 <zzo38> Magic: the Gathering cards

22:48:47 <andrewtheircer> og

22:48:50 -!- andrewtheircer has quit (Remote host closed the connection).

22:50:04 -!- xkapastel has quit (Quit: Connection closed for inactivity).

23:18:03 -!- nfd9001 has quit (Ping timeout: 264 seconds).

23:18:34 -!- nfd9001 has joined.

23:39:04 -!- tromp_ has joined.

23:40:59 -!- tromp has quit (Ping timeout: 250 seconds).

23:44:31 <arseniiv> @tell sombrero what are polysigns mentioned in the abstract of the thing you linked?

23:44:32 <lambdabot> Consider it noted.

23:45:15 -!- ais523 has joined.

23:45:23 -!- FreeFull has quit.

23:46:48 <ais523> b_jonas: the kernel uses intrusive lists a lot so it may be that a large proportion of the pointers are already doubly linked, that makes doubly-linking all of them a little less insane

23:47:52 <ais523> <b_jonas> "you'd simply have to replace "cmp $0xc0,%rsi" (7 bytes) with "cmp $0xc0,%sil" (4 bytes)" => why? isn't there an encoding of "cmp $0xc0,%rsi" that has a 1-byte immediate? ← no, there isn't, 1-byte immediates only exist as offsets of memory addresses (which is why lea is useful) and on byte instructions

23:48:09 <ais523> at least, on instructions like cmp that use a fairly normal encoding

23:49:13 -!- arseniiv has quit (Ping timeout: 245 seconds).

23:49:22 <b_jonas> ais523: they do exist on ordinary word instructions like cmp, but int-e says that they're signed so it won't work here -- I haven't checked that, it's really hard to figure out from intel manuals which things are sign extended and which ones are zero extended

23:50:03 <ais523> incidentally, it crosses my mind that you could go down to three bytes by using bl/cl/dl rather than sil, or even two bytes if al were available

23:50:18 <ais523> b_jonas: ah, OK

23:50:21 <shachaf> I think it's often the case that instead of pointers you want to use some alternative like indices, if you have a lot of them. That already allows for some kinds of relocation.
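
A minimal sketch of the index-based alternative shachaf mentions, assuming a singly linked list stored in one growable array. All names here (Node, Pool, pool_push, pool_sum, NO_NODE) are hypothetical. The point it illustrates is that realloc may move the whole array, yet every link stays valid because links are offsets into the array rather than addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NO_NODE ((size_t)-1)

/* Nodes link to each other by index into one array, not by pointer. */
typedef struct {
    int value;
    size_t next;            /* index of the next node, or NO_NODE */
} Node;

typedef struct {
    Node *nodes;
    size_t len, cap;
} Pool;

/* Appends a node and returns its index, or NO_NODE if allocation fails. */
size_t pool_push(Pool *p, int value, size_t next)
{
    if (p->len == p->cap) {
        size_t new_cap = p->cap ? p->cap * 2 : 4;
        Node *moved = realloc(p->nodes, new_cap * sizeof *moved);
        if (!moved)
            return NO_NODE;
        p->nodes = moved;   /* the storage may have moved; indices are unaffected */
        p->cap = new_cap;
    }
    p->nodes[p->len] = (Node){ value, next };
    return p->len++;
}

/* Walks the list from head and sums the values. */
int pool_sum(const Pool *p, size_t head)
{
    int sum = 0;
    for (size_t i = head; i != NO_NODE; i = p->nodes[i].next) {
        assert(i < p->len); /* every stored index must refer to a live node */
        sum += p->nodes[i].value;
    }
    return sum;
}
```

A zero-initialised Pool is a valid empty pool, and the caller frees pool.nodes when done. This only covers relocating the pool as a whole; compacting individual nodes would still require rewriting the indices that refer to them.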

23:50:28 <ais523> there's a simple rule for this, one-byte immediates in x86 are /always/ signed

23:51:00 <b_jonas> they also exist for the shortcut %eax arithmetic instructions, but annoyingly not for mov, despite that a word mov with one-byte immediate would be often very convenient

23:51:02 <ais523> so the only time they're treated as unsigned is if the instruction treats its operand as unsigned /and/ it only has a 1-byte operand, in which case it gets sign-extended to its own length and then treated as unsigned

23:51:40 <b_jonas> ais523: including one-byte offsets for memory operands?

23:51:47 <ais523> b_jonas: yes

23:51:52 <b_jonas> ok

23:52:00 <ais523> this is why the x86_64 ABI has a redzone, it's to make use of the positive offsets on SP

23:52:17 <b_jonas> ok

23:52:54 <ais523> anyway, a signed immediate /would/ work here, you'd simply need to do a signed shift rather than an unsigned shift

23:53:20 <b_jonas> hmm

23:53:39 <ais523> or, hmm, maybe not

23:53:55 <ais523> because with the signed interpretation, you're checking for values between 0 and a negative constant

23:53:59 <ais523> which isn't all at one end of the range

23:54:10 <ais523> ugh

23:58:12 <b_jonas> I just realized that it's funny how you first say that x86 doesn't have one-byte immediates for word instructions, then said that they're all signed

23:58:53 <ais523> that's the way that memory works, I think

23:59:10 <moony> Insert complaint about x86's insanity here.

23:59:35 <ais523> don't lawyers have a phrase "in the alternative" for describing situations like this?

23:59:41 <b_jonas> but yes, it is signed
