2019-08-18
00:05:27 <andrewtheircer> e
00:32:56 -!- andrewtheircer has quit (Remote host closed the connection).
00:38:16 -!- b_jonas has quit (Ping timeout: 272 seconds).
00:38:42 -!- b_jonas has joined.
01:11:03 -!- b_jonas has quit (Remote host closed the connection).
01:20:51 -!- FreeFull has quit.
01:25:34 -!- ais523 has joined.
01:25:49 <ais523> ugh, I'm dealing with a really annoying missed optimisation atm
01:25:52 <ais523> ptrdiff_t notneg(unsigned char b) { if (b <= 0xC0) __builtin_unreachable(); return -(ptrdiff_t)(unsigned char)~b; }
01:29:00 <ais523> the correct optimisation is «return 1 + (ptrdiff_t)(signed char)b;» but neither gcc nor LLVM finds it (you could also do the increment first)
01:32:37 <ais523> huh, they don't find it even with a cast to signed char in there, although clang finds it if you put a cast to signed /int/ in there (of course, none of these casts make a difference because ~b has to be a positive number in the 0..30 range)
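The identity under discussion can be checked by brute force. The sketch below compares the complement-then-negate form with the increment form for every input above 0xC0. The names `notneg_original`, `notneg_increment` and `notneg_forms_agree` are made up for illustration, the `__builtin_unreachable()` guard is left out, and the cast to `signed char` assumes the two's-complement wraparound that gcc and clang provide.

```c
#include <stddef.h>

/* Form from the log: bitwise complement, then unary minus. */
ptrdiff_t notneg_original(unsigned char b)
{
    return -(ptrdiff_t)(unsigned char)~b;
}

/* Form the compilers were expected to find: sign-extend, then add 1.
   The conversion to signed char is implementation-defined for b > 127;
   this sketch assumes it wraps, as it does on gcc and clang. */
ptrdiff_t notneg_increment(unsigned char b)
{
    return 1 + (ptrdiff_t)(signed char)b;
}

/* Hypothetical helper: returns 1 if both forms agree for all b in 0xC1..0xFF. */
int notneg_forms_agree(void)
{
    for (unsigned b = 0xC1; b <= 0xFF; b++) {
        if (notneg_original((unsigned char)b) != notneg_increment((unsigned char)b))
            return 0;
    }
    return 1;
}
```

Both forms map 0xFF to 0 and count downwards from there, so the result is never positive.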
01:33:16 <zzo38> LLVM allows you to specify the range, but I think only for loading from memory.
01:35:05 <ais523> I just get annoyed when compilers produce asm that's worse than the asm I'm expecting
01:35:30 <ais523> (and I'm the sort of person who actually checks that when I write code like -(ptrdiff_t)(unsigned char)~b…)
01:36:17 <zzo38> Is there any command to have inline LLVM code in C code?
01:37:03 <zzo38> Also, in Ghostscript, the "forall" command doesn't seem to work properly for strings longer than 800 bytes. (Other commands work (including the "length" command), and it works fine for arrays, but not for strings.) Why does it do that?
01:38:29 <ais523> there's no standard C method to use inline LLVM; clang could theoretically support it as an extension to __asm__ but I don't think it does
01:40:19 <ais523> hmm, so my next problem is, should I write it as 1 + (ptrdiff_t)(signed char)b in the original source, even though that's a lot less clear (the operation I'm using is conceptually a combination of a bitwise-complement and a unary minus, the fact that that does the same thing as an increment is just an intentionally engineered coincidence)
01:42:45 <shachaf> ais523: Oddly enough if you write out a lookup table gcc finds a good implementation.
01:44:59 <ais523> what, if you special-case all 31 possible inputs? :-D
01:45:28 <shachaf> Yes, if you write a switch over the entire range.
01:46:21 <shachaf> (I mean "range of inputs" and not "codomain" or "image", which people refer to as "range" for some reason.)
01:47:05 <zzo38> One possibility is to write a comment.
01:47:55 <zzo38> (Also, which implementation is better might depend on the target computer, maybe?)
01:48:33 <shachaf> What's the context of this function?
01:48:45 <ais523> decoder for a file format
01:49:13 <ais523> it stores references to earlier locations in the file as 1, 2, 4, or 8 bytes, I'm writing a separate function for each
01:49:21 <shachaf> And the speed of this operation is relevant?
01:49:25 <ais523> but the references are bitwise-complemented
01:49:51 <ais523> the format's designed to be extremely fast, faster to use a memory image of the format than actually parsing it
01:50:12 <shachaf> Hmm, which format is it?
01:50:15 <ais523> this operation needs to be able to compete with pointer dereferences to do that
01:50:21 <ais523> it's a format I'm inventing atm
01:50:34 <shachaf> Aha.
01:51:02 <ais523> (the idea is that the +1 ends up getting inlined into the dereference of the resulting pointer)
01:51:53 <ais523> I'd store the offsets with the +1 added if I could, but I can't because 0 is a valid offset and there is a fairly esoteric constraint that the offsets cannot be valid UTF-8
01:52:15 <ais523> #esoteric is definitely the right channel for this :-D
01:52:31 <shachaf> So that's the reason for the 0xc0. I was vaguely wondering before whether it was related to UTF-8.
01:52:48 <shachaf> What's this for?
01:52:59 <shachaf> Maybe you're being deliberately vague.
01:53:28 <ais523> not because it's secret, mostly because it's hard to explain
01:53:50 <ais523> the format's designed for the output of a parser, i.e. the in-memory representation of a parse tree
01:54:10 <ais523> but to be generic enough to be usable in much the same situations as XML is
01:54:50 <ais523> but the idea is that the parser output is just an image in memory, it's not something that you parse into a linked tree of structures like you normally do
01:55:26 <ais523> linked lists are bad performance-wise, and linked trees are very common but probably also beatable in performance for the same reasons, so I wanted to try
01:56:17 <zzo38> XML is often used in cases where it shouldn't be, anyways, I think
01:56:19 <shachaf> That makes sense.
01:56:44 <shachaf> I suspect the gains from a more efficient memory encoding are way bigger than the gains from a couple of extra instructions to dereference a pointer.
01:56:47 <ais523> zzo38: yes
01:57:00 <ais523> shachaf: that's what I'm hoping, at least
01:57:32 <ais523> (not just more efficient, but more compact too, meaning it makes better use of the cache; also less malloc overhead)
01:58:39 <shachaf> I think it's plausible that even with a regular parser, you should have only one or a few calls to malloc for the entire tree, which you free all at once.
01:59:04 <shachaf> There isn't much benefit to mallocing individual nodes.
02:00:33 <ais523> all the parsers I've seen so far malloc individual nodes for the output tree, although I agree there isn't much benefit to it
02:00:46 <ais523> I think the idea is to let you free parts of the tree individually
02:01:17 <shachaf> I suspect there's no real idea, it's just that people learn that that's how you get memory.
02:01:21 <ais523> but the normal operation on a tree is to traverse it, and that lets you build a new tree as you go
02:02:14 <zzo38> In some programs you would free one branch of the tree separately from the rest
02:02:16 <ais523> this program is actually being written in Rust (I translated it into C to test it on gcc), and rust has a #[no_std] annotation that means your program has no support at all from the operating system, it's used for purely algorithmic libraries
02:03:11 <ais523> that means you don't get memory allocation unless the caller gives you a memory allocation callback, and so I try to avoid allocations out of habit, thinking carefully about if I need them
02:03:19 <ais523> often, I don't need much more than one or two for the program
02:03:38 <shachaf> I'd like to know more about possible allocation strategies.
02:04:12 <shachaf> I think I've talked about that in here before.
02:04:29 <ais523> well, the one I'm working on for this program (for the parsers, tree operations, etc.) is a purely linear allocation strategy where everything is written in memory left to right in order, meaning that your algorithms have to be able to manage that
02:05:11 <shachaf> Does this memory go left to right in order of increasing addresses?
02:05:13 <ais523> many algorithms can't, but for a parser it's fairly hopeful because parsing algorithms and tree-traversal algorithms both naturally work like that
02:05:48 <ais523> yes, increasing addresses (decreasing is possible I guess but it's sufficiently alien that you normally get less support from the standard library, OS, processor, etc.)
02:06:15 <shachaf> Unless it's the call stack.
02:06:28 <ais523> writing left to right unfortunately means that reading right to left is sometimes required, but reads on a parse tree are generally more random-access than writes as it is (you're often trying to match both branches of the tree to something)
02:06:36 <shachaf> I like the idiom of having a global (or thread-local) arena that can be used effectively as a stack but with the frames manually managed.
02:06:43 <ais523> the call stack in x86 is the wrong way up and I am very upset at this
02:07:27 <shachaf> So functions that produce variable-size output that's only needed briefly can just write into there.
02:07:32 <shachaf> E.g. sprintf.
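The two ideas above (a purely linear allocator, and an arena used like a stack with manually managed frames) fit in a few lines. This is only a sketch under assumptions: the `Arena` type and the `arena_*` functions are hypothetical names, the backing buffer is supplied by the caller, and allocations are rounded up to 16 bytes relative to the start of the buffer.

```c
#include <stddef.h>

/* Hypothetical bump arena: memory is handed out left to right,
   in order of increasing addresses. */
typedef struct {
    unsigned char *base;  /* caller-provided backing storage */
    size_t used;          /* bytes handed out so far */
    size_t cap;           /* total size of the backing storage */
} Arena;

void arena_init(Arena *a, void *buf, size_t cap)
{
    a->base = buf;
    a->used = 0;
    a->cap = cap;
}

/* Returns NULL when the request does not fit in what is left. */
void *arena_alloc(Arena *a, size_t n)
{
    size_t start = (a->used + 15) & ~(size_t)15;  /* round up to 16 bytes */
    if (start > a->cap || n > a->cap - start)
        return NULL;
    a->used = start + n;
    return a->base + start;
}

/* A "frame" is just a remembered offset. */
size_t arena_mark(const Arena *a)
{
    return a->used;
}

/* Drops everything allocated since the matching arena_mark(). */
void arena_release(Arena *a, size_t mark)
{
    a->used = mark;
}
```

A function producing short-lived variable-size output would take a mark, allocate freely, and release the mark once the caller is done with the result.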
02:07:33 <zzo38> I wrote one program which needs the caller to specify a memory allocation function, although you can just put realloc and it will work, which is normally what would be done, I think.
02:07:39 <ais523> there's one /really big/ technical advantage to having the call stack the other way round, which is that in x86, overflows beyond the end of a stack-allocated array overwrite the function's return address by default, which is just about the worst possible thing to overwrite
02:08:05 <shachaf> ais523: I've heard that brought up as a security benefit and I don't really buy it.
02:08:18 <zzo38> In some instruction sets the call stack is not addressable
02:08:22 <ais523> if the call stack goes the other way, the situations in which that happens are much less common, requiring you to be writing to an array of a function which called the current one
02:08:36 <ais523> that said, having a separate call stack and data stack is a much better idea for multiple reasons
02:08:40 <shachaf> char buf[n]; sprintf(buf, "%s", ...); seems pretty common to me.
02:08:49 <shachaf> Yes, I agree that that makes more sense.
02:09:51 <ais523> shachaf: well, that example is broken regardless of which way up the stack goes (although with a stack that goes upwards, sprintf would at least be able to detect that something was wrong; it knows where its own return address is in memory, but not where its caller's is)
02:10:14 <moony> yea.. that seems like a security /issue/
02:10:17 <ais523> I'm not sure if anyone would write sprintf like that, but maybe they would
02:10:18 <moony> that's bad
02:10:29 <moony> in fact, that's exactly how ROP works
02:10:41 <shachaf> ais523: Or, I mean, char buf[n]; gets(buf); or whatever.
02:11:07 <shachaf> Many of the classic buffer overflow problems involve calling another function so changing the stack direction doesn't help much.
02:11:08 <ais523> moony: well, ROP is one of the most effective ways to exploit this, but the exploit isn't ROP, it's just that the exploit lets you use ROP
02:11:30 <ais523> gets should be simple enough to be inlined
02:11:46 <moony> also thumbs-up for using rust. If you're trying to allow custom allocators, use what is already provided
02:11:48 <shachaf> I would be pretty surprised if gets was inlined.
02:12:01 <shachaf> Rust seems to be pretty bad at custom allocators or unusual allocation strategies.
02:12:05 -!- botnick has joined.
02:12:20 <ais523> so would I, mostly because that function is such a bad idea that I doubt much effort has been put into optimising it
02:12:34 <ais523> I'm not sure if Rust's #[global_allocator] is stable yet, but that makes custom allocators very easy
02:12:49 <ais523> unusual allocation strategies are IME either very easy or very hard
02:12:57 <moony> mmm..
02:13:06 <shachaf> I think the main reason I think that is that Rust is all about destructors, which muddle deallocation with other code.
02:13:10 <moony> i thought rust's allocation choice could be done per function
02:13:11 <moony> guess not
02:13:15 <moony> i mean
02:13:17 <moony> i see why
02:13:46 <ais523> right, the issue there is not so much allocation but deallocation
02:13:59 <moony> shachaf: there is the Drop trait, which can be useful. Could possibly implement a trait that keeps tabs on what allocator was used to make an object
02:13:59 <shachaf> Unfortunately those issues are connected.
02:14:06 <ais523> although, it wouldn't be too hard to give the various allocators their own portions of address space and use the pointer value to tell them apart
02:14:32 <shachaf> I feel like you just want to do the right thing statically.
02:14:39 <ais523> btw, separately from this project, I had an idea about a very eso allocation strategy
02:14:45 <moony> oooo
02:14:51 <shachaf> If you're adding a bunch of dynamic behavior like that there should be a good reason for it.
02:14:57 <ais523> the idea was that all the pointers in the entire program would be doubly-linked
02:15:13 <ais523> like, to point from one object to another, you need to point to a pointer field that points back at you
02:15:20 <moony> i think the cache would commit suicide
02:15:40 -!- botnick has quit (Read error: Connection reset by peer).
02:15:43 <ais523> now, the great thing about this strategy is that you can move objects around in memory at will
02:15:49 <moony> i mean
02:15:52 <ais523> because you already know where all the inbound pointers are and can just update them
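A minimal sketch of the doubly-linked-pointer idea, assuming each object carries exactly one link field (the `Link` and `Obj` types and both function names are hypothetical). Every link points at a link field in another object, and that field points straight back, so a moved object can repair its inbound pointer without any search.

```c
#include <stddef.h>

/* A link field; its peer is the link field in the other object,
   which in turn points back at this one. */
typedef struct Link {
    struct Link *peer;
} Link;

/* Hypothetical object with a payload and a single link field. */
typedef struct {
    int payload;
    Link ref;
} Obj;

/* Join two link fields so that each points at the other. */
void link_pair(Link *a, Link *b)
{
    a->peer = b;
    b->peer = a;
}

/* Copy src into dst, then fix the one inbound pointer, which is
   reachable through the object's own link field. */
void relocate(Obj *dst, const Obj *src)
{
    *dst = *src;
    if (dst->ref.peer != NULL)
        dst->ref.peer->peer = &dst->ref;
}
```

With several link fields per object the fix-up step would loop over all of them, which is where the cache cost raised in the log would show up.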
02:15:55 <moony> that's the strategy a lot of VMs use
02:15:57 <moony> like the JVM
02:16:17 <ais523> yes, I was inspired by compacting garbage collectors, the idea was to do the same thing without a GC
02:16:30 <ais523> I'm not sure if this is useful, and it probably isn't, which is why I described it as very eso
02:16:51 <moony> reducing memory fragmentation could be useful if you keep memory organized at the same time
02:17:01 <moony> and do a bit of the memory usage predictor's work for it
02:17:06 <shachaf> Did you see that malloc that merges pages which don't have overlapping allocations?
02:17:07 <ais523> yes, this is one of those cases where the advantages are clear but the disadvantages are terrifying
02:17:11 <ais523> shachaf: yes
02:17:34 <ais523> at first I thought "surely that isn't necessary", but then came up with some pathological programs where it is
02:17:57 <shachaf> I think my current attitude is that if your program is doing zillions of little mallocs and frees everywhere it's probably doing things badly anyway.
02:18:05 <moony> ^
02:18:14 <ais523> e.g. imagine a program that allocates 100 million 8-byte objects, deallocates 99 out of every 100, then allocates 1 million 800-byte objects
02:18:20 <moony> if it comes to that, use a slab allocator for some of those objects
02:18:42 <moony> assuming they're all uniform
02:18:56 <ais523> my thought on the matter is that the only difficult case is what happens if a program deallocates a lot of small objects, then needs to allocate large objects that are too large for the remaining spaces
02:19:21 <ais523> all the other cases are trivial by comparison
02:19:56 <ais523> I don't think this is a remotely common pattern for long-running interactive applications, but I can see a batch process doing something like that when moving on from one phase of computation to another
02:20:13 <moony> I think Dwarf Fortress has an issue with that
02:20:18 <moony> albeit with larger objects
02:20:20 <ais523> maybe, if you have a random malloc pattern, there should be some sort of compaction phase whenever your program shifts modes
02:21:01 <moony> in The Powder Toy everything is preallocated because malloc pauses are unacceptable
02:21:09 <ais523> anyway, this is what inspired the "everything is doubly-linked" idea because it's one of the only ways, short of a GC, to solve the problem
02:21:30 <ais523> (that said, preventing the problem happening in the first place by using a more predictable malloc/free scheme is probably better!)
02:21:57 <moony> If you know the max number of objects, then just allocate that, the linux kernel won't mind
02:22:10 <ais523> moony: are they preallocated /and/ prefaulted?
02:22:38 <moony> not prefaulted, but they generally get faulted very early in the app's lifespan so it's not an issue
02:22:41 <ais523> animalloc doesn't have any pauses other than page faults, and those will happen even with a preallocation strategy unless you prefault all your pages too
02:22:56 <moony> could add prefaulting for fun
02:23:11 <shachaf> Did you know that merely reading from an mmapped page in Linux will only fault it in as a CoW zero page rather than giving you your own memory?
02:23:16 <ais523> moony: it's one line of code
02:23:21 <ais523> shachaf: yes
02:23:35 <moony> ais523: and that is?
02:24:18 <ais523> actually I'm wrong, I thought madvise could do it on arbitrary memory, but the most aggressive prefault it supports is a request to preload into /cache/ faster
02:24:42 <moony> ok nvm it is prefaulted, we run the simulation clear no matter what
02:24:53 <ais523> mmap can do it with MAP_POPULATE but that requires you to actually be getting the memory from mmap
02:24:54 <moony> so the action of wiping the sim should cause prefaulting
02:25:21 <moony> as we technically "use" the memory
02:25:43 <ais523> …why are mmap and madvise not synchronized in what requests they support
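For memory that does come from mmap, the MAP_POPULATE route looks roughly like this. It is a Linux-only sketch; the function name `alloc_prefaulted` is hypothetical, and the mapping is anonymous and private, which is an assumption about the use case rather than anything stated in the log.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Linux-only sketch: an anonymous private mapping whose pages are
   populated up front (MAP_POPULATE) instead of on first touch.
   Returns NULL on failure. */
void *alloc_prefaulted(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The caller releases the region with `munmap(p, len)`.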
02:26:28 <moony> ..prolly should work on reducing allocations in the game's graphics
02:26:52 <moony> the sim is effectively malloc free
02:26:59 <moony> but graphics are pretty CPU hungry too
02:27:59 <moony> ..gah
02:28:07 <moony> we still haven't squashed the malloc in INST code
02:28:27 <ais523> OK, I've now implemented the 1, 2, 4, and 8 byte versions of this function
02:28:56 <ais523> the -~ gets optimized to +1 only in the 8-byte case (presumably because LLVM can't optimize the - movzx ~ sequence when the number is known to be negative)
02:30:56 <ais523> one interesting thing is that the optimizer implements the comparison with 0xC100000000000000 using a 56-bit shift and a comparison with 0x000000C1; I can believe that all those leading zeroes are actually fastest, but think there might be a better way
02:31:34 <moony> that sounds like it would be faster on Nehalem era CPUs, but not sure about newer things
02:31:44 <moony> (Nehalem has some.. weird constraints)
02:32:58 <ais523> right, I can see why a comparison of the bottom 8 bits with 0xC1 (which would be mathematically correct) might be slower on some processors
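The two ways of doing this comparison can be written side by side. In the sketch below (function names are hypothetical), the shifted form isolates the top byte first, which is what lets the constant shrink to something that fits in a small immediate.

```c
#include <stdint.h>

/* Direct form: compare the whole 64-bit value against the full constant. */
int above_direct(uint64_t x)
{
    return x >= UINT64_C(0xC100000000000000);
}

/* Shifted form: move the top byte down 56 bits, then compare it.
   "top byte > 0xC0" is the same test as "top byte >= 0xC1". */
int above_shifted(uint64_t x)
{
    return (x >> 56) > 0xC0;
}
```

The lower 56 bits cannot change the outcome, since the constant has zeroes in all of them.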
02:33:13 <Hooloovo0> I could also see it being smaller code
02:33:20 <moony> ^ check if it's smaller
02:33:36 <moony> fitting as many instructions in 16 bytes as possible is usually a goal
02:33:38 <Hooloovo0> shift-right is probably 1 instruction, and compare 8 bit is another
02:34:03 <Hooloovo0> and fitting 0xc100000... as an immediate is going to be longer
02:34:38 <ais523> you'd simply have to replace "cmp $0xc0,%rsi" (7 bytes) with "cmp $0xc0,%sil" (4 bytes)
02:34:54 <ais523> as Hooloovo0 says, the only difference is the leading zeroes on the immedite
02:34:56 <ais523> *immediate
02:35:05 <moony> there's a subtle difference here
02:35:26 <moony> the 32-bit registers and 64-bit ones are different, if i recall right
02:35:27 <Hooloovo0> oh, it's a 32-bit compare? weird
02:35:49 <moony> but..
02:35:50 <moony> hm
02:35:53 <moony> can i see ASM
02:35:53 <ais523> Hooloovo0: it's a 64-bit compare using a 32-bit immediate, x86_64 doesn't use 64-bit immediates apart from one instruction
02:36:01 <Hooloovo0> oh, ok
02:36:17 <moony> ais523: can you make a quick listing of the ASM here?
02:36:23 <ais523> mov %rdx,%rsi \ shr $0x38,%rsi \ cmp \ $0xc0,%rsi
02:36:29 <ais523> err, I added an extra slash
02:36:39 <ais523> mov %rdx,%rsi \ shr $0x38,%rsi \ cmp $0xc0,%rsi \ ja 2e
02:36:47 <ais523> (then the value in %rsi is no longer used, nor are the flags)
02:36:52 <moony> can't downgrade to a 32-bit register.
02:37:02 <ais523> moony: %esi is just the bottom half of %rsi
02:37:18 <moony> No, they're treated different if i remember right
02:37:36 <ais523> the complication you're thinking of is that if you ever assign to %esi directly, it gets automatically zero-extended into %rsi
02:37:45 <moony> ahg
02:37:47 <moony> ah
02:37:54 <moony> alright, i'm being a derp
02:38:01 <ais523> i.e. the /top half/ of %rsi is like a register of its own with its own weird rules
02:38:28 <ais523> (meanwhile, assigning to %si or %sil does not clear any other part of %rsi, the rule's only for the %e* variant of registers specifically)
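The write rules described here can be modelled in portable C. This is only a model of the semantics as stated in the log, not real register access, and the function names are hypothetical.

```c
#include <stdint.h>

/* Model of a write to %esi: the result is zero-extended, so the old
   upper half of %rsi is discarded. */
uint64_t write32(uint64_t reg, uint32_t v)
{
    (void)reg;  /* nothing of the old value survives */
    return (uint64_t)v;
}

/* Model of a write to %si: only the low 16 bits change. */
uint64_t write16(uint64_t reg, uint16_t v)
{
    return (reg & ~UINT64_C(0xFFFF)) | v;
}

/* Model of a write to %sil: only the low 8 bits change. */
uint64_t write8(uint64_t reg, uint8_t v)
{
    return (reg & ~UINT64_C(0xFF)) | v;
}
```

Under this model, swapping a 16-bit register with itself leaves all 64 bits untouched, while a 32-bit self-swap would clear the upper half.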
02:39:05 <shachaf> I saw a post recently about which amd64 registers aren't special-cased in some way. It's only a few of them.
02:39:05 <moony> that last bit i did /not/ know
02:39:06 <Hooloovo0> what
02:39:12 <ais523> …which is why xchg %ax, %ax is a no-op, because that's a 16-bit write
02:39:23 <Hooloovo0> whose idea was this
02:39:25 <moony> Hooloovo0: yes, welcome to x86
02:39:35 <ais523> shachaf: six of r8..r15 are not special-cased, I think
02:39:36 <moony> it's all in the name of (misguided) backwards compatibility
02:39:44 <ais523> err, r9..r16
02:40:04 <ais523> I can't remember which two it is that are, it's either 10 and 11, 11 and 12, or 12 and 13
02:40:12 <shachaf> Yes, https://twitter.com/rygorous/status/1162078329706405888
02:40:18 <Hooloovo0> man, fuck computers
02:40:30 <shachaf> ais523: Wait, writing to ax doesn't zero the top half of rax?
02:40:35 <ais523> shachaf: right
02:40:43 <Hooloovo0> nor eax
02:40:46 <ais523> this is what I was missing when we had our nop discussion a while back
02:41:24 <moony> Hooloovo0: i await the day RISC-V takes over the world with sane design
02:41:26 <ais523> so 90 is a nop because it's special-cased, even though that instruction would normally decode to xchg eax, eax (this forces you to use a different encoding if you actually want the swap)
02:41:45 <ais523> but 66 90 is a nop semantically, it swaps ax with itself and that doesn't have side effects
02:42:02 <ais523> (although it's almost certainly special-cased anyway, that's just for performance reasons)
02:42:03 <shachaf> Oh man.
02:42:13 <shachaf> I feel like I came across this fact once before but I completely forgot about it.
02:42:16 <shachaf> What a mess.
02:42:24 <moony> I want RISC-V to take over the world
02:42:44 <ais523> this is why disassemblers can disassemble 66 90 to xchgw ax, ax without being incorrect
02:43:08 <Hooloovo0> yeah, risc-v is nice
02:43:17 * Hooloovo0 has designed a risc-v cpu
02:44:11 <shachaf> Until now I thought that anything that only wrote to the lower 32 bits automatically zeroed the upper 32 bits.
02:44:13 <moony> I want a decently priced linux capable RISC-V board
02:44:15 <moony> then i will be happy
02:45:12 <ais523> fwiw, the special-casing on r12 is almost forgivable, the issue is that some notation is needed to be able to specify not using a register; x86 sacrificed the ability to write an address that requires multiplying the stack pointer by a constant (somewhat understandably) to give the encodings it needed
02:45:38 <ais523> and r12 is encoded by specifying "stack pointer, except second set of registers" and that would therefore need a special-case decode to allow you to multiply it by a constant
02:46:11 <shachaf> Yes, both r12 and r13 are inheriting restrictions from rsp and rbp.
02:47:55 <ais523> I came up with a new optimisation idea a while back: the idea is that when you're implementing an array of structures as parallel arrays (e.g. for alignment reasons), you take a pointer into the array with smallest elements
02:48:16 <ais523> and index into the others by multiplying the /pointer/ by a constant and adding another constant with an appropriately calculated value
02:48:35 <ais523> (or if the arrays don't have fixed addresses, e.g. in a re-entrant function, another register)
02:48:55 <ais523> in the non-fixed-address scenario this saves one register over the obvious way of writing things and isn't any longer or slower
02:49:16 <ais523> although it's really highly illegal in C, you can do it in x86 asm just fine
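The arithmetic behind the pointer-scaling trick can be shown with integer addresses only, which sidesteps the part that is illegal in C. In this sketch (hypothetical names throughout) the address of an element in the narrow array is multiplied by the size ratio `k` and a precomputed bias is added, giving the address of the matching element in the wide array. It assumes a flat address space where pointer-to-`uintptr_t` conversion is linear, and it never dereferences the computed value.

```c
#include <stdint.h>

/* bias = wide_base - k * narrow_base, computed once per pair of arrays.
   Unsigned wraparound is harmless here because the later addition undoes it. */
uintptr_t scaling_bias(uintptr_t narrow_base, uintptr_t wide_base, uintptr_t k)
{
    return wide_base - k * narrow_base;
}

/* Address of wide[i], derived purely from the address of narrow[i].
   k is sizeof(wide element) / sizeof(narrow element). */
uintptr_t scaled_address(uintptr_t narrow_elem, uintptr_t k, uintptr_t bias)
{
    return narrow_elem * k + bias;
}
```

With one-byte and four-byte elements, `k` is 4 and the whole computation has the shape of a scaled index plus a constant.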
02:49:19 <moony> Hooloovo0: if you think the base x86-64 stuff is insane
02:49:22 <moony> mess with SSE
02:49:42 <ais523> AVX seems to have hit a sweet spot of performance and sanity, IMO
02:49:50 <moony> Yea, AVX is ok
02:49:50 <ais523> then it started going back downhill from AVX2 onwards
02:49:59 <moony> AVX-512 ;-;
02:50:07 <shachaf> What was wrong with AVX2?
02:50:13 <ais523> I've come to the decision to target AVX1 as my baseline instruction set
02:50:23 <shachaf> It gave a lot of integer instructions that were missing in AVX1.
02:50:25 <ais523> shachaf: nothing massively wrong, it just has some jarring inconsistencies and missing features
02:50:48 <moony> also the useless instructions
02:50:52 <moony> that only have one use
02:50:56 <moony> and are junk for anything else
02:51:00 <shachaf> A while ago I wanted to write vectorized code and I couldn't use AVX.
02:51:07 <ais523> e.g. there are no instructions to operate on 256-bit values as a whole, only pairs of 128-bit values
02:51:17 <ais523> you can't shift the top half of a ymm value into the bottom half
02:51:41 <ais523> which is weird, because you can shift the top half and bottom half of a ymm value separately into the top and bottom halves respectively of a second ymm value
02:51:42 <moony> RISC-V's vector extension is looking really good
02:51:47 <moony> you seen it, ais523?
02:51:53 <ais523> no
02:52:13 <moony> https://github.com/riscv/riscv-v-spec
02:52:59 <moony> it's great
02:53:07 <moony> completely vector register width agnostixc
02:53:10 <moony> *agnostic
02:53:19 <moony> so that old code can utilize larger register sizes easily
02:55:02 <Hooloovo0> should x86 be considered an esolang?
02:55:17 <moony> prolly
02:55:23 <moony> an esolang that everyone uses
02:56:20 <shachaf> Much less than C++, I think.
03:06:09 <zzo38> What I would want to have is the bitwise operations that MMIX has. There is a bit operation extension for RISC-V, but I think the bit operations of MMIX is better.
03:06:27 <zzo38> (At least, for a 64-bit system, it is better. Perhaps for a 32-bit system maybe it isn't better.)
03:06:34 <moony> zzo38: are MAK and EXT (MAKe and EXTract) in that list?
03:06:43 <moony> they're both bitfield opts
03:06:45 <moony> *ops
03:06:58 <moony> really powerful, enough that they can replace shiftleft/shiftright instructions
03:07:36 <zzo38> What are MAK and EXT doing?
03:08:07 <zzo38> (I also like the Muxcomp operation, described in esolang wiki)
03:08:09 * moony pulls up MC88100 desc of them
03:10:10 <moony> zzo38: http://bitsavers.trailing-edge.com/components/motorola/88000/MC88100_RISC_Microprocessor_Users_Manual_2ed_1990.pdf pages 3-44, 3-45, 3-46, 3-47, 3-70, and 3-71,
03:10:26 <moony> it has diagrams describing operation, alongside text
03:11:27 <zzo38> Do you have the order number of the page?
03:11:55 <moony> one sec
03:12:34 <moony> MAK is on 125..
03:13:05 <moony> EXT on 99.
03:13:54 <moony> that good, zzo38?
03:14:25 <moony> I own the physical manual, forgot that the PDF is annoying to navigate
03:15:10 <zzo38> Yes, that is good
03:15:24 <zzo38> (I didn't read all of it yet though)
03:20:59 <zzo38> That MAK and EXT is good. (MMIX doesn't have it, but does have MOR and MXOR and SADD (MOR is very useful), and Muxcomp is even more general to do)
03:21:45 <zzo38> (A problem with Muxcomp though, is that it requires a lot of operands.)
03:23:38 <zzo38> Muxcomp is: Form a 5-bit number from the low bits of each of the first five operands to select which bit of the last operand to copy to the low bit of the result, and then the same for the next bit position, and so on.
03:24:49 <zzo38> (Or six bits for 64-bit registers)
03:26:05 <ais523> zzo38: doesn't that effectively implement an arbitrary five-input bitwise operator?
03:26:11 <ais523> with the last operand specifying the truth table?
03:26:20 <kolontaev> what do you use MOR for? if it's useful...
03:27:17 <zzo38> ais523: I suppose so, yes. (You can also perform a shift or rotate of the last operand by setting the other operands properly)
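Muxcomp as described can be sketched for 32-bit registers as follows. The function name `muxcomp32` is hypothetical, and the choice that the first operand supplies the least significant bit of the 5-bit index is an assumption, since the log does not say which operand is which.

```c
#include <stdint.h>

/* For each bit position k, build a 5-bit index from bit k of a..e and
   copy that bit of `table` into bit k of the result.
   Assumption: a supplies index bit 0 and e supplies index bit 4. */
uint32_t muxcomp32(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                   uint32_t e, uint32_t table)
{
    uint32_t result = 0;
    for (unsigned k = 0; k < 32; k++) {
        unsigned idx = ((a >> k) & 1u)
                     | (((b >> k) & 1u) << 1)
                     | (((c >> k) & 1u) << 2)
                     | (((d >> k) & 1u) << 3)
                     | (((e >> k) & 1u) << 4);
        result |= ((table >> idx) & 1u) << k;
    }
    return result;
}
```

A table of 0x80000000 gives the five-way AND, since only index 31 selects a 1. Feeding in operands whose bits spell out each position's own number makes the index at position k equal to k, so the table comes back unchanged; other operand patterns give shifts and rotates of it.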
03:27:20 <moony> MAK and EXT have the advantage of practicality
03:27:43 <zzo38> kolontaev: One use is endian switch (including PDP-endian).
03:27:52 <zzo38> But there are some other uses, too.
03:28:01 <kolontaev> zzo38: aha
03:28:41 <ais523> a while back in here we were discussing how Intel had added INTERCAL's select operator ~ to the x86 instruction set
03:29:34 <ais523> the paper discussing it mentioned an instruction that's effectively a bitwise keysort; you stable-sort the bits of one operand using the bits of the other operand as keys
03:30:04 <ais523> that hasn't been implemented but it seems like it could be useful
03:30:12 <ais523> (a select is basically a bitwise-keysort followed by an AND)
03:30:19 <ais523> err, preceded by an AND
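The relationship between select and a bitwise keysort can be written out in software. In this sketch the function names are hypothetical, and the keysort is assumed to place key-1 bits at the low end with key-0 bits above them; the log does not fix that direction, but it is the one that makes "AND, then keysort" equal to select.

```c
#include <stdint.h>

/* Select: the bits of x at positions where mask is 1, packed toward
   bit 0 in their original order. */
uint32_t select_bits(uint32_t x, uint32_t mask)
{
    uint32_t out = 0;
    unsigned n = 0;
    for (unsigned k = 0; k < 32; k++) {
        if ((mask >> k) & 1u)
            out |= ((x >> k) & 1u) << n++;
    }
    return out;
}

/* Stable bitwise keysort: bits of x whose key bit is 1 come first
   (low end), followed by the bits whose key bit is 0, each group
   keeping its original order. */
uint32_t keysort_bits(uint32_t x, uint32_t key)
{
    uint32_t out = 0;
    unsigned n = 0;
    for (unsigned k = 0; k < 32; k++) {
        if ((key >> k) & 1u)
            out |= ((x >> k) & 1u) << n++;
    }
    for (unsigned k = 0; k < 32; k++) {
        if (!((key >> k) & 1u))
            out |= ((x >> k) & 1u) << n++;
    }
    return out;
}
```

Masking x first zeroes every bit that would land in the upper group, so `keysort_bits(x & mask, mask)` and `select_bits(x, mask)` give the same result.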
03:34:36 -!- ARCUN has joined.
03:35:55 <ARCUN> cpressey: Do you know where I could get an actual Commodore 64? I need one for a project.
03:36:03 -!- ARCUN has quit (Remote host closed the connection).
03:37:25 <moony> uh
03:37:26 <moony> huh
03:39:21 <zzo38> ais523: Yes, I think it can be useful
03:42:26 <esowiki> [[Commercial]] https://esolangs.org/w/index.php?diff=65539&oldid=65535 * Dtuser1337 * (+8) Formatting codebase.
03:43:44 <zzo38> MMIX doesn't have a "count trailing zero" instruction, but it can easily be done with two instructions (the SADD instruction is popcount(x AND NOT y))
03:43:46 <esowiki> [[Commercial]] https://esolangs.org/w/index.php?diff=65540&oldid=65539 * Dtuser1337 * (-6) continue formatting codebase and move category to bottom
03:45:47 <esowiki> [[Commercial]] https://esolangs.org/w/index.php?diff=65541&oldid=65540 * Dtuser1337 * (-52) /* Implementations */
03:46:20 <ais523> zzo38: is that (x-1) sadd x?
03:46:45 <esowiki> [[ORK]] https://esolangs.org/w/index.php?diff=65542&oldid=53698 * Dtuser1337 * (+13) /* External resources */
03:47:35 <zzo38> Yes.
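The two-instruction count-trailing-zeroes sequence can be modelled in C. The `sadd` function below follows the definition given in the log, popcount(x AND NOT y); the loop-based popcount and the name `ctz_via_sadd` are illustrative choices.

```c
#include <stdint.h>

/* SADD as described in the log: the number of 1 bits in x AND NOT y. */
unsigned sadd(uint64_t x, uint64_t y)
{
    uint64_t v = x & ~y;
    unsigned count = 0;
    while (v != 0) {
        v &= v - 1;  /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* x - 1 turns the trailing zeroes of x into ones and clears the lowest
   set bit; AND NOT x keeps exactly those new ones. */
unsigned ctz_via_sadd(uint64_t x)
{
    return sadd(x - 1, x);
}
```

For x equal to 0 the subtraction wraps to all ones and the result is 64.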
04:54:20 <esowiki> [[Camouflage]] https://esolangs.org/w/index.php?diff=65543&oldid=30836 * Dtuser1337 * (+24)
04:54:49 <esowiki> [[ZOMBIE]] https://esolangs.org/w/index.php?diff=65544&oldid=53713 * Dtuser1337 * (+20)
05:08:36 -!- ais523 has quit (Quit: quit).
06:35:06 -!- nfd9001 has joined.
07:19:01 -!- rain1 has joined.
08:24:12 -!- Lord_of_Life has quit (Ping timeout: 245 seconds).
08:27:28 -!- Lord_of_Life has joined.
08:32:07 -!- AnotherTest has joined.
08:52:51 -!- nfd9001 has quit (Ping timeout: 264 seconds).
08:55:08 -!- nfd9001 has joined.
09:02:01 -!- b_jonas has joined.
09:05:41 <b_jonas> ais523: "~b has to be a positive number in the 0..30 range" => I think you mean in the 0..0x3E inclusive range
09:07:43 <b_jonas> "should I write it as 1 + (ptrdiff_t)(signed char)b in the original source" => use an ifdef, with the original version inactive and the optimized version active, write a comment that it's an optimization, and then check that the optimized version is compiled the way you want
09:08:03 <b_jonas> that said, isn't it possible that this is a case where the optimization doesn't matter?
09:10:58 <esowiki> [[Edition]] N https://esolangs.org/w/index.php?oldid=65545 * A * (+874) Created page with "Edition language Double speak example I"$R. C"F"0@i1@i Explanation I Take input "$ Copy the input's length to clipboard R. Read the current file's content into the..."
09:11:12 <b_jonas> "the in-memory representation of a parse tree" => that wasn't my guess
09:12:18 -!- arseniiv has joined.
09:13:41 <b_jonas> "having a separate call stack and data stack is a much better idea" => is it still, now that CPUs cache the top of the call stack as an optimization?
09:16:41 <b_jonas> ais523: moving objects around but without a gc, "at first I thought "surely that isn't necessary", but then came up with some pathological programs where it is" => in the kernel, trying to free up pages when it has lots of individually allocated small structures fragmented around memory
09:17:26 <b_jonas> kernel people are actually working on that, and yes, the disadvantages are terrifying, although I think they're doing it more memory conservingly than doubly linking every individual pointer
09:19:48 <arseniiv> hi. Nice it’s calm and wise here as usual :)
09:20:50 <arseniiv> (I was without IRC for two weeks and for some reason I’m still unpacking)
09:22:40 <b_jonas> "comparison with 0xC100000000000000" => that's because you can't have immediates larger than 32 bits, so it has to be two instructions anyway. that it's a shift of the input is still a bit surprising to me, but apparently that's how you can get it in just two instructions, rather than just three as in loading the constant in two instructions.
09:23:37 <b_jonas> may also depend on the context
09:24:23 <arseniiv> hm I played a game called Weave where you place tiles on a hexagonal grid, each tile having two points marked on each side and connecting these points by curved segments in some fashion as to make various intertwined paths when many tiles are placed together. Now I suddenly wonder if something can be esolanged from there
09:25:26 <b_jonas> "you'd simply have to replace "cmp $0xc0,%rsi" (7 bytes) with "cmp $0xc0,%sil" (4 bytes)" => why? isn't there an encoding of "cmp $0xc0,%rsi" that has a 1-byte immediate?
09:25:57 <arseniiv> turns and “knottiness” of paths could be used for semantics, and there are only so many tiles available to make the composition not so trivial
09:27:00 <arseniiv> also I think I have already seen those tiles somewhere even before that game, but don’t remember where
09:28:06 <arseniiv> and it could be downplayed to square tiles if necessary. If anybody would be interested, I’ll link to that game in G.Play and/or draw examples of tiles
09:28:14 <b_jonas> how would gcc even get the assembler to emit the encoding with the 4-byte immediate?
09:31:34 <int-e> b_jonas: 0xc0 is too big for a one-byte immediate
09:31:52 <int-e> (because they are sign-extended)
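A minimal sketch of int-e's point, assuming the usual two's-complement conversion that mainstream compilers use: a one-byte immediate is widened by sign extension, so the byte 0xC0 stands for a negative value and cannot encode the 64-bit constant 0xC0. The helper names sign_extend_imm8, fits_imm8 and self_check are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model how a one-byte immediate is widened to 64 bits: the byte is
 * read as a signed 8-bit value and then sign-extended.
 * Assumption: converting an out-of-range value to int8_t wraps in the
 * two's-complement way (implementation-defined before C23, but what
 * common compilers do). */
static uint64_t sign_extend_imm8(uint8_t imm) {
    return (uint64_t)(int64_t)(int8_t)imm;
}

/* A 64-bit constant fits in a one-byte immediate only if truncating it
 * to one byte and sign-extending it again gives the same value back. */
static int fits_imm8(uint64_t value) {
    return sign_extend_imm8((uint8_t)value) == value;
}

/* Hypothetical self-check covering the values discussed in the log. */
static void self_check(void) {
    assert(sign_extend_imm8(0xC0) == 0xFFFFFFFFFFFFFFC0ULL);
    assert(!fits_imm8(0xC0));  /* 0xC0 needs the wider immediate */
    assert(fits_imm8(0x7F));   /* largest positive one-byte immediate */
}
```

Under this model, any constant from 0x80 to 0xFF fails the round trip, which is consistent with the 64-bit compare against 0xc0 being emitted with a four-byte immediate.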
09:32:29 <int-e> and I thought picking immediate sizes was the assembler's job, not gcc's.
09:33:16 <int-e> Oh, but choosing operand sizes isn't... that's the point.
09:33:52 <b_jonas> int-e: ah ok
09:34:00 <int-e> And intel only offers the choice between one byte immediates and full byte immediates.
09:34:15 <int-e> I meant s/full byte/full size/
09:34:23 <b_jonas> int-e: gcc would know what size the instruction has, in fact the whole part where it outputs assembly rather than the object code directly is mostly for human-debuggability I think
09:35:08 <int-e> I'm not sure that is the case. gcc probably has a good idea how big instructions will be, of course.
09:35:24 <int-e> But I would hope it doesn't rely on always being correct.
09:46:00 <b_jonas> int-e: it certainly doesn't have to know for rare slow instructions, because it can handle unknown instructions in asm statements, but for when it generates fast code, it probably knows the instruction size for optimization
09:46:14 <b_jonas> (for x86_64)
09:46:40 <int-e> b_jonas: Sorry, I'll take that as speculation unless you have proof.
09:46:54 <b_jonas> unrelated, but http://www.wyrdplay.org/AlanBeale/CAAPR-ref.html is a small English pronunciation dictionary by the author of 12dicts that seems to be well compiled
09:47:36 <int-e> (For example, a key purpose of things like .align is that the compiler can align code without knowing exactly how big the individual instructions are.)
09:50:56 <b_jonas> int-e: again, gcc generally doesn't know how long the emitted instructions from an asm statement are, which is documented in the gcc docs because it also says that gcc assumes an upper bound and what sneaky things you shouldn't do in the asm statement to violate that
09:51:16 <b_jonas> int-e: and gcc probably can't tell the length of some jump instructions in the first pass
09:51:40 <b_jonas> that said, some of the assembly syntax is probably for humans, not for compilers
09:51:56 <b_jonas> like, do you think the compiler needs both decimal and hex integer syntax?
09:52:32 <b_jonas> the .align too can be useful to make the assembly output more readable, even in the case when the compiler does happen to know how many bytes it emits
09:52:53 <b_jonas> plus .align 16 is shorter than the assembly mnemonic of a 9-byte nop instruction
09:55:31 <b_jonas> or looks nicer, rather
09:55:33 <b_jonas> shorter doesn't matter
09:56:14 -!- FreeFull has joined.
10:10:54 -!- botnick has joined.
10:11:24 -!- botnick has quit (Read error: Connection reset by peer).
11:11:57 -!- FreeFull has quit.
11:12:37 <b_jonas> `card-by-name Hecatomb
11:12:38 <HackEso> Hecatomb \ 1BB \ Enchantment \ When Hecatomb enters the battlefield, sacrifice Hecatomb unless you sacrifice four creatures. \ Tap an untapped Swamp you control: Hecatomb deals 1 damage to any target. \ IA-R, 5E-R, 6E-R, MED-R
11:12:58 <b_jonas> ^ It seems a bit underwhelming that they use the word "hecatomb" for the sacrifice of a mere four creatures
11:13:39 <b_jonas> `card-by-name Epic Struggle
11:13:39 <HackEso> Epic Struggle \ 2GG \ Enchantment \ At the beginning of your upkeep, if you control twenty or more creatures, you win the game. \ JUD-R
11:17:43 -!- livesex_0699 has joined.
11:17:46 -!- livesex_0699 has left.
11:26:39 <b_jonas> (The helix pineapple requires you to pay mana only, not sacrifice creatures.)
12:17:51 -!- andrewtheircer has joined.
12:17:55 <andrewtheircer> hi
12:18:47 <andrewtheircer> idea: programming language only usable from last thursday to next thursday
12:27:33 -!- dog_star_ has changed nick to dog_star.
12:29:12 -!- dog_star has quit.
12:29:33 -!- dog_star has joined.
12:30:07 <andrewtheircer> hi doh
13:03:56 -!- xkapastel has joined.
13:10:41 -!- tromp_ has quit (Remote host closed the connection).
13:36:24 <esowiki> [[~English]] https://esolangs.org/w/index.php?diff=65546&oldid=37171 * Dtuser1337 * (+24) this is a high level language due to how it looked like.
13:36:57 <esowiki> [[Commercial]] https://esolangs.org/w/index.php?diff=65547&oldid=65541 * Dtuser1337 * (+24)
13:37:30 <esowiki> [[Drive-In Window]] https://esolangs.org/w/index.php?diff=65548&oldid=65536 * Dtuser1337 * (+23)
13:42:59 -!- arseniiv_ has joined.
13:42:59 -!- arseniiv has quit (Read error: Connection reset by peer).
13:48:03 -!- Frater_EST has joined.
13:49:46 -!- Frater_EST has left.
13:50:54 -!- tromp has joined.
13:54:17 -!- MDude has quit (Quit: Going offline, see ya! (www.adiirc.com)).
13:55:51 -!- tromp has quit (Ping timeout: 264 seconds).
14:04:21 -!- tromp has joined.
15:34:10 -!- Sgeo__ has joined.
15:37:07 -!- Sgeo_ has quit (Ping timeout: 248 seconds).
16:10:16 -!- xkapastel has quit (Quit: Connection closed for inactivity).
16:51:12 -!- tromp has quit (Remote host closed the connection).
16:53:04 <zzo38> Is there a metric-only font format in PostScript?
17:01:57 -!- tromp has joined.
17:10:26 -!- xkapastel has joined.
17:13:57 <andrewtheircer> hi zzo
17:14:25 <zzo38> Hello
17:15:54 <andrewtheircer> what is your favorite eso you've made
17:16:00 <andrewtheircer> your crowning achievement in a sense
17:21:20 <zzo38> I don't know.
17:21:38 <zzo38> (I think that I do not make a "crowning achievement")
17:21:47 <b_jonas> zzo38: you mean like for overlaying text on a scanned image of a text? I think you could use a vector font where all the glyphs are blank
17:22:22 <andrewtheircer> well then zzo
17:22:22 -!- Sgeo has joined.
17:23:19 <andrewtheircer> are you david madore , b
17:23:24 <zzo38> b_jonas: Actually I mean for use with an output driver that will provide its own glyphs or convert to some other format that will then be processed by something which provides its own glyphs, although it would also work to overlay text on a scanned image of text too
17:23:32 <zzo38> andrewtheircer: No, I am Aaron Black.
17:23:43 <andrewtheircer> well then
17:24:00 -!- Sgeo__ has quit (Ping timeout: 272 seconds).
17:24:02 <andrewtheircer> does everyone know you are aaron black
17:24:38 <andrewtheircer> if not
17:24:40 <andrewtheircer> this is a big reveal
17:25:07 <zzo38> I don't know, but I think it doesn't really matter so much. I am just known as "zzo38", and will not be confused with other people also named Aaron Black, I think.
17:25:21 <b_jonas> zzo38: ok. I'd think a font with blank or dummy vector images would work for that too
17:25:39 <andrewtheircer> nice stuff aaron
17:25:54 <andrewtheircer> also:i was asking b_jonas if they were david madore
17:26:40 <b_jonas> andrewtheircer: of course not
17:26:44 <zzo38> b_jonas: Yes, that would work, I suppose
17:26:48 <andrewtheircer> d'oof
17:28:15 <b_jonas> zzo38: but maybe otf already has a way to indicate that a font has metrics only. few people know how font formats actually work. maybe oren knows.
17:34:38 <andrewtheircer> it's a low voodoo about font formats
17:36:05 <b_jonas> it's not surprising that nobody understands them: there's a lot of historical baggage of evolving technology, and a lot of useful optimization
17:44:06 -!- andrewtheircer has quit (Remote host closed the connection).
17:46:25 -!- tromp has quit (Remote host closed the connection).
17:58:55 -!- tromp has joined.
18:09:09 -!- tromp has quit (Remote host closed the connection).
18:17:12 <arseniiv_> I am also not David Madore
18:17:16 -!- arseniiv_ has changed nick to arseniiv.
18:39:40 -!- tromp has joined.
18:44:05 -!- tromp has quit (Ping timeout: 244 seconds).
18:54:05 <esowiki> [[PureStack]] https://esolangs.org/w/index.php?diff=65549&oldid=41588 * Dtuser1337 * (+27)
18:55:11 <esowiki> [[Super Stack!]] https://esolangs.org/w/index.php?diff=65550&oldid=37526 * Dtuser1337 * (+24)
18:55:44 <esowiki> [[Super Stack!]] https://esolangs.org/w/index.php?diff=65551&oldid=65550 * Dtuser1337 * (-24)
18:57:48 <esowiki> [[Super Stack!]] https://esolangs.org/w/index.php?diff=65552&oldid=65551 * Dtuser1337 * (+23)
18:59:16 -!- adu has joined.
19:01:28 <zzo38> I think the font formats with METAFONT are good.
19:14:04 <esowiki> [[Emoji]] https://esolangs.org/w/index.php?diff=65553&oldid=46750 * Dtuser1337 * (+104) Adding some categories.
19:19:13 -!- Sgeo has quit (Ping timeout: 244 seconds).
19:21:10 <esowiki> [[Emoji]] https://esolangs.org/w/index.php?diff=65554&oldid=65553 * Dtuser1337 * (+24) Added output only category due to potential no input.
19:22:42 -!- Sgeo has joined.
19:26:39 -!- tromp has joined.
19:27:54 <esowiki> [[Hello world program in esoteric languages]] https://esolangs.org/w/index.php?diff=65555&oldid=65405 * Dtuser1337 * (+39)
19:32:31 <esowiki> [[Hello world program in esoteric languages]] https://esolangs.org/w/index.php?diff=65556&oldid=65555 * Dtuser1337 * (+746) /* Emoji */
19:38:01 <esowiki> [[Hello world program in esoteric languages]] https://esolangs.org/w/index.php?diff=65557&oldid=65556 * Dtuser1337 * (+91) /* suicide */
19:39:25 -!- tromp has quit (Remote host closed the connection).
19:40:31 <esowiki> [[Hello world program in esoteric languages]] https://esolangs.org/w/index.php?diff=65558&oldid=65557 * Dtuser1337 * (+152) /* MATL */
19:40:36 -!- adu has quit (Quit: adu).
19:41:09 -!- FreeFull has joined.
19:54:19 -!- sombrero has joined.
19:55:35 <sombrero> :0 the style has changed
19:56:05 <int-e> style?
19:56:34 <int-e> Oh the webchat I guess. IRC is a pure text-based protocol, it has no style.
19:56:48 -!- tromp has joined.
19:56:50 <sombrero> yep the frontend
19:57:38 -!- Sgeo_ has joined.
19:58:41 <int-e> sombrero: you're the only user of the webchat here, it seems.
19:59:16 <sombrero> ???
20:00:52 <int-e> (But I'm not sure, some people are cloaked which hides this information.) In any case, there are many IRC clients. The Freenode webchat frontend is just one of them.
20:01:04 -!- Sgeo has quit (Ping timeout: 272 seconds).
20:01:28 <int-e> https://en.wikipedia.org/wiki/Comparison_of_Internet_Relay_Chat_clients <-- many
20:01:51 <zzo38> Yes, and I use one I wrote by myself. Others may also use programs not mentioned on Wikipedia, too
20:06:57 -!- tromp has quit (Remote host closed the connection).
20:08:47 <sombrero> :v anyway, I would be very happy if you took a look at a compilation of drawings http://vixra.org/abs/1907.0332 and gave your critiques. It would be pretty nice if this inspired some good ideas for making new stuff ;)
20:10:26 <sombrero> (esoteric answers and comments are welcome ;P)
20:17:23 <b_jonas> int-e: on freenode, the webchat cloak overrides the other cloaks, so you can't hide it that way
20:18:18 <b_jonas> hmm
20:18:42 <b_jonas> sorry, apparently that doesn't apply to kiwi (oh I hate that client by the way), only some other webchats
20:18:53 <b_jonas> because sombrero's hostname isn't a cloak
20:22:22 -!- tromp has joined.
20:23:24 -!- Lord_of_Life_ has joined.
20:25:52 -!- Lord_of_Life has quit (Ping timeout: 244 seconds).
20:25:53 -!- Lord_of_Life_ has changed nick to Lord_of_Life.
20:33:41 -!- sombrero has quit (Remote host closed the connection).
20:40:34 -!- Phantom_Hoover has joined.
21:21:55 <zzo38> Priests of Sun {1W} Creature - Human Cleric (1/1) ;; {1W}: Target creature gets +0/+1 until end of turn. ;; {T}: Tap target blocking creature. That creature gets +1/+2 until end of turn.
21:22:22 <zzo38> Priests of Worm {1B} Creature - Human Cleric (1/1) ;; {(B/P)}: ~ gains first strike and haste until end of turn. ;; {T}, Sacrifice a non-artifact creature: Add {BB}.
21:39:13 -!- AnotherTest has quit (Ping timeout: 250 seconds).
21:39:25 -!- Phantom_Hoover has quit (Quit: Leaving).
22:02:07 -!- andrewtheircer has joined.
22:06:17 -!- Bowserinator has quit (Ping timeout: 245 seconds).
22:06:28 -!- Bowserinator_ has joined.
22:28:41 -!- Sgeo has joined.
22:29:52 -!- Sgeo_ has quit (Ping timeout: 244 seconds).
22:36:07 -!- Sgeo_ has joined.
22:39:13 -!- Sgeo has quit (Ping timeout: 245 seconds).
22:47:11 <zzo38> Do you like these cards?
22:47:25 <andrewtheircer> who is cards
22:48:26 <zzo38> Magic: the Gathering cards
22:48:47 <andrewtheircer> og
22:48:50 -!- andrewtheircer has quit (Remote host closed the connection).
22:50:04 -!- xkapastel has quit (Quit: Connection closed for inactivity).
23:18:03 -!- nfd9001 has quit (Ping timeout: 264 seconds).
23:18:34 -!- nfd9001 has joined.
23:39:04 -!- tromp_ has joined.
23:40:59 -!- tromp has quit (Ping timeout: 250 seconds).
23:44:31 <arseniiv> @tell sombrero what are polysigns mentioned in the abstract of the thing you linked?
23:44:32 <lambdabot> Consider it noted.
23:45:15 -!- ais523 has joined.
23:45:23 -!- FreeFull has quit.
23:46:48 <ais523> b_jonas: the kernel uses intrusive lists a lot so it may be that a large proportion of the pointers are already doubly linked, that makes doubly-linking all of them a little less insane
23:47:52 <ais523> <b_jonas> "you'd simply have to replace "cmp $0xc0,%rsi" (7 bytes) with "cmp $0xc0,%sil" (4 bytes)" => why? isn't there an encoding of "cmp $0xc0,%rsi" that has a 1-byte immediate? ← no, there isn't, 1-byte immediates only exist as offsets of memory addresses (which is why lea is useful) and on byte instructions
23:48:09 <ais523> at least, on instructions like cmp that use a fairly normal encoding
23:49:13 -!- arseniiv has quit (Ping timeout: 245 seconds).
23:49:22 <b_jonas> ais523: they do exist on ordinary word instructions like cmp, but int-e says that they're signed so it won't work here -- I haven't checked that, it's really hard to figure out from intel manuals which things are sign extended and which ones are zero extended
23:50:03 <ais523> incidentally, it crosses my mind that you could go down to three bytes by using bl/cl/dl rather than sil, or even two bytes if al were available
23:50:18 <ais523> b_jonas: ah, OK
23:50:21 <shachaf> I think it's often the case that instead of pointers you want to use some alternative like indices, if you have a lot of them. That already allows for some kinds of relocation.
23:50:28 <ais523> there's a simple rule for this, one-byte immediates in x86 are /always/ signed
23:51:00 <b_jonas> they also exist for the shortcut %eax arithmetic instructions, but annoyingly not for mov, despite that a word mov with one-byte immediate would be often very convenient
23:51:02 <ais523> so the only time they're treated as unsigned is if the instruction treats its operand as unsigned /and/ it only has a 1-byte operand, in which case it gets sign-extended to its own length and then treated as unsigned
23:51:40 <b_jonas> ais523: including one-byte offsets for memory operands?
23:51:47 <ais523> b_jonas: yes
23:51:52 <b_jonas> ok
23:52:00 <ais523> this is why the x86_64 ABI has a redzone, it's to make use of the positive offsets on SP
23:52:17 <b_jonas> ok
23:52:54 <ais523> anyway, a signed immediate /would/ work here, you'd simply need to do a signed shift rather than an unsigned shift
23:53:20 <b_jonas> hmm
23:53:39 <ais523> or, hmm, maybe not
23:53:55 <ais523> because with the signed interpretation, you're checking for values between 0 and a negative constant
23:53:59 <ais523> which isn't all at one end of the range
23:54:10 <ais523> ugh
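A small sketch of why the signed reading is awkward, written for a single byte rather than the shifted 64-bit value from the discussion (that simplification is an assumption of this note, as are the hypothetical helper names). Read as unsigned, the test b > 0xC0 is a one-sided comparison. Read as signed, the same set of bytes lies strictly between -64 and 0, so it needs two bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned view: the accepted bytes 0xC1..0xFF sit at the top of the
 * range, so one comparison is enough. */
static bool in_range_unsigned(uint8_t b) {
    return b > 0xC0;
}

/* Signed view: the same bytes are -63..-1, bounded on both sides.
 * Assumption: the conversion to int8_t wraps in the two's-complement
 * way, as on common compilers. */
static bool in_range_signed(uint8_t b) {
    int8_t s = (int8_t)b;
    return s > -64 && s < 0;
}

/* Exhaustively confirm that both views accept exactly the same bytes. */
static bool interpretations_agree(void) {
    for (int v = 0; v <= 0xFF; v++) {
        if (in_range_unsigned((uint8_t)v) != in_range_signed((uint8_t)v)) {
            return false;
        }
    }
    return true;
}

/* Hypothetical self-check for the boundary values. */
static void self_check_ranges(void) {
    assert(interpretations_agree());
    assert(in_range_signed(0xC1) && in_range_signed(0xFF));
    assert(!in_range_signed(0xC0) && !in_range_signed(0x00));
}
```

The two-sided signed test is what "isn't all at one end of the range" refers to: a single signed comparison against the sign-extended constant cannot separate these values from the non-negative ones.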
23:58:12 <b_jonas> I just realized that it's funny how you first say that x86 doesn't have one-byte immediates for word instructions, then said that they're all signed
23:58:53 <ais523> that's the way that memory works, I think
23:59:10 <moony> Insert complaint about x86's insanity here.
23:59:35 <ais523> don't lawyers have a phrase "in the alternative" for describing situations like this?
23:59:41 <b_jonas> but yes, it is signed