00:00:36 <b_jonas> well, you could say that you forgot that the 8-byte immediates are available for the general encoding of the arithmetic, as opposed to the shortcut accumulator destination encoding
00:01:40 <ais523> fwiw, I think modern processors in general are insane because of register renaming
00:02:04 <ais523> compilers go to all this effort to pack their nicely SSAed programs into a set of registers, which is an NP-complete problem
00:02:17 <ais523> then the processor goes to a lot more effort to unpack all the register names back into SSA and disregard the actual registers
00:03:05 <shachaf> The Mill people describe their belt system as "hardware SSA".
00:03:41 <ais523> yes, it's pretty similar, but also somewhat limiting I think
00:03:57 <ais523> I prefer doing things the other way round, where each instruction says where the data should be routed to, rather than where the data came from
00:04:22 <ais523> because that allows you to pipeline high-latency instructions, which is half the reason for the register renaming in the first place
00:05:33 <shachaf> I wish I had a better x86 REPL thing.
00:05:46 <shachaf> I could probably write something reasonably easily.
00:06:00 <ais523> asm repl, that's an interesting idea
00:06:31 <ais523> I remember debug.com, arguably that counts, but it was a bit weird in how it worked
00:08:03 <shachaf> Are there any good debuggers for Linux?
00:08:12 <ais523> what do you mean by "good"?
00:08:33 <ais523> gdb is easily good enough for what I want, but may well not count as "good"
00:08:39 <shachaf> Hmm, I'm not sure. It's probably mostly a UI thing?
00:08:51 <ais523> ddd is a GUI version of gdb, but is incredibly dated
00:08:53 <shachaf> I can figure out the things I want with gdb but it often takes a lot of overhead.
00:09:10 <ais523> there are also a lot of IDEs that run on Linux, of course; probably most of them do
00:09:17 <shachaf> People say Microsoft's debugger is good, but I haven't used it.
00:09:49 <ais523> well, it's integrated with Visual Studio, which IMO automatically makes anything insane
00:10:17 <shachaf> I've barely used Visual Studio. I don't know much about it.
00:10:44 <b_jonas> ais523: yes, and that's made even more insane by how the intel x86 optimization manual says that the longer nop instructions use execution units and a gp register from the register file, so the compiler sometimes has to figure out which register a nop instruction should reference
00:10:47 <ais523> shachaf: the last-but-one time I attempted to install Visual Studio, I ended up needing to entirely reinstall Windows
00:10:59 <shachaf> The debugging APIs in Windows are certainly better than ptrace.
00:11:25 <ais523> although it's a bit slow in terms of how many system calls are needed
00:11:26 <b_jonas> it's strange, you'd think they solve that in the decoder, but no
00:11:38 <shachaf> ptrace does approximately the minimum possible.
00:11:50 <ais523> b_jonas: fwiw, I think there should be two different .align pseudoinstructions
00:11:59 <shachaf> I don't remember all the things Windows does but they generally seem useful.
00:12:00 <ais523> one which uses NOP padding, and the other of which uses ud2
00:12:07 <shachaf> For example, Win32 VirtualAllocEx lets you allocate memory in another process's address space.
00:12:26 <ais523> shachaf: ugh, that seems outright dangerous
00:12:35 <ais523> what if the other process is using a different allocator to the expected allocator?
00:12:39 <shachaf> When you're debugging a process?
00:12:48 <shachaf> VirtualAllocEx is the equivalent of mmap.
00:12:48 <ais523> the ptrace equivalent is to force the other process to call malloc, which seems safer
00:13:07 <shachaf> In general I think many Windows system calls let you specify a process handle.
00:13:27 <shachaf> How do you even do a system call, when you attach to some unknown process?
00:14:01 <shachaf> You at least need to find a syscall instruction, which could mean waiting for the process to do a system call, searching its address space for a syscall instruction, or writing one into its address space.
00:14:35 <ais523> it's documented what you do if the ptrace is stopping at a syscall instruction, you just rewind the IP two bytes
00:14:46 <ais523> if the process isn't stopped at a system call, though, you have to change the IP to point at one
00:14:49 <shachaf> But you need it to be in an executable page, which is often read-only. ptrace lets you write to read-only mappings, but if a file is mapped into memory, it'll secretly convert it to a private mapping from a shared mapping.
00:14:58 <ais523> (there is a system call instruction in the vDSO, that's the one I use for Web of Lies)
00:15:17 <shachaf> That's true, maybe finding the vdso page is your best bet.
00:15:26 <shachaf> (Unless the debuggee unmapped it?)
00:15:36 <ais523> well, you just use /proc/*/maps to find where it is
00:15:51 <shachaf> I guess writing a debugger for hostile programs that don't want to be debugged is quite different from writing one for your own programs.
00:15:57 <ais523> unmapping the vDSO is interesting, I didn't think of that
00:16:24 <ais523> also, ptrace doesn't work recursively, which means that you can make yourself almost debugger-immune simply by ptracing yourself or a bunch of child processes
00:16:32 <ais523> (for a debugger to hide from that, it needs to /simulate/ all the ptrace calls itself)
00:17:17 <b_jonas> ais523: I don't think padding with repeated UD2 is a good idea. it encodes to db 0x0F,0x0B and if you happen to jump to that with the odd alignment, it reads the bytes 0x0B,0x0F which is a perfectly valid two-byte instruction
00:17:49 <ais523> does x86 have an instruction sequence that's invalid from any offset?
00:18:09 <ais523> other than the last byte, possibly, as there are no guaranteed one-byte illegal instructions
00:18:43 <shachaf> Why do you want an illegal instruction?
00:19:39 <b_jonas> ais523: there are like sixteen one-byte instructions that are currently illegal in x86_64, but they're all just reserved, not guaranteed to be illegal forever
00:19:40 <ais523> because the padding isn't meant to be executed
00:20:35 <b_jonas> yes, padding with 0xCC is probably the best
00:20:39 <shachaf> Or 0xF4 if you're not in ring 0.
00:21:20 <shachaf> I guess I should start using octal instead of hexadecimal.
00:21:29 <ais523> LOCK LOCK LOCK … LOCK NOP is illegal at any offset other than the last
00:21:57 <shachaf> So int3 is 0314 and hlt is 0364.
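[A small sketch of the padding-byte values being discussed, in Rust; the octal renderings are exactly shachaf's, and the comment about ud2 restates b_jonas's point above.]

```rust
fn main() {
    // int3 (trap to debugger) and hlt (privileged halt) as padding bytes,
    // written in both hex and octal as in the discussion above:
    assert_eq!(0xCCu8, 0o314); // int3
    assert_eq!(0xF4u8, 0o364); // hlt
    // ud2 encodes as the two bytes 0F 0B; a stream of them read from an odd
    // offset yields 0B 0F, which decodes as a valid "or r32, r/m32", so
    // ud2 padding is not illegal at every offset.
    let ud2_padding: [u8; 4] = [0x0F, 0x0B, 0x0F, 0x0B];
    assert_eq!(&ud2_padding[1..3], &[0x0B, 0x0F]);
    println!("int3 = {:o}, hlt = {:o}", 0xCCu8, 0xF4u8);
}
```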
00:22:09 <ais523> (once you have more than 15 lock prefixes it becomes illegal for a different reason, but that's OK)
00:22:49 <ais523> max length of an instruction is 16 I think
00:23:05 <shachaf> Wait, really? I thought it was 15.
00:23:11 <ais523> …wait, why is LOCK NOP even illegal? NOP is a sort of XCHG instruction, which is one of the few that /can/ be locked
00:23:38 <ais523> oh, because there are two registers, LOCK requires memory to be mentioned
00:23:41 <shachaf> It makes sense for reg-reg xchg to be illegal.
00:24:40 <ais523> while messing around with atomics I discovered that x86 has an XADD instruction
00:24:56 <ais523> which is (r, m) = (m, r+m)
00:25:17 <ais523> notable mostly because it's lockable and has pretty useful semantics for a lockable instruction
00:26:28 <zzo38> I think it can be useful sometimes, yes.
00:26:29 <moony> XADD can also be used to perform the Fibonacci sequence in very little space
00:26:43 <ais523> just two registers, I assume
00:26:56 <zzo38> Yes, I thought that too
00:27:35 <moony> yup, 2 registers fib.
00:28:00 <ais523> now the fun part is: can you do it with /one/ register?
00:28:01 <moony> but 3rd is for loop if you want it to actually halt
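[A sketch of the two-register XADD Fibonacci just described, emulating the instruction's semantics `(r, m) = (m, r+m)` in Rust rather than actual assembly.]

```rust
// Emulate x86 XADD: the source register receives the old destination,
// and the destination receives the sum.
fn xadd(m: &mut u64, r: &mut u64) {
    let t = *m + *r;
    *r = *m;
    *m = t;
}

fn main() {
    // Two registers, one XADD per loop iteration, as moony describes.
    let (mut a, mut b) = (1u64, 0u64);
    for _ in 0..10 {
        xadd(&mut a, &mut b);
    }
    assert_eq!(a, 89); // F(11): 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89
}
```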
00:28:02 <shachaf> But it's not as efficient as repeated squaring, I suppose.
00:28:43 <ais523> one loop iteration, that is
00:28:44 <moony> ais523: quickly thought over the various ways to do addition on x86. My guess is no
00:28:58 <ais523> moony: it'd have to be a vector register I think
00:29:03 <shachaf> Two unbounded registers are enough for any computation.
00:29:16 <moony> I was thinking of vectors too
00:29:20 <shachaf> Man, I wish I had a computer with even one unbounded register.
00:29:38 <b_jonas> ais523: do you need it modulo 2**32, or just the first fifty or so terms?
00:29:38 <moony> you do. it's called RAM. (/s)
00:29:59 <b_jonas> if the latter, then you can do a compare conditional jump thing to do it in one register
00:30:18 <b_jonas> and possibly even some smart hash lookup table thing
00:30:28 <b_jonas> but if you need it forever, then it gets harder
00:30:50 <b_jonas> I think you could use one 64-bit register to compute it mod 2**32:
00:31:02 <moony> fib with a 1 byte value should be doable in 1 reg
00:31:08 <b_jonas> as long as you can have a constant in memory
00:31:21 <moony> because upper lower halves
00:31:25 <zzo38> Why the PostScript binary format does not include dictionaries?
00:31:30 <ais523> b_jonas: I was thinking 32-bit integers, or even floats
00:31:40 <b_jonas> multiply it by a constant 0x00000001_00000000, then swap the upper and lower parts using rotate
00:32:36 <b_jonas> multiply it by 0x00000001_00000001
00:33:17 <b_jonas> I think you could do that even without a constant memory operand, because 0x00000001_00000001 is composite
00:33:53 <b_jonas> multiply by 641 then multiply by 6700417 then rotate right by 32
00:34:15 <b_jonas> low 32 bits gives the fibonacci number
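[b_jonas's one-register trick, checked in Rust: pack F(n-1) in the high 32 bits and F(n) in the low 32 bits of a 64-bit register, multiply by 2**32 + 1 (= 641 × 6700417, the factorisation of the Fermat number F5), then rotate right by 32. Each step maps (hi, lo) to (lo, hi+lo), i.e. one Fibonacci step mod 2**32.]

```rust
fn main() {
    let mut x: u64 = 1; // F(0) = 0 in the high half, F(1) = 1 in the low half
    let mut fibs = Vec::new();
    for _ in 0..10 {
        // imul by the constant, then ror 32 — exactly two instructions on x86
        x = x.wrapping_mul(0x0000_0001_0000_0001).rotate_right(32);
        fibs.push(x & 0xFFFF_FFFF); // low 32 bits hold the Fibonacci number
    }
    assert_eq!(fibs, [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]);
}
```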
00:35:11 <shachaf> I'm wondering about a language feature which is sort of the opposite of the ones I've been talking about wanting recently.
00:35:20 <b_jonas> https://www.perlmonks.com/?node_id=715414 is slightly relevant
00:35:39 <shachaf> I want a type which can have values that are either known at runtime or at compiletime, and can mostly be treated uniformly.
00:35:56 <shachaf> For example an array's length might either be known statically or be a field on a struct.
00:36:31 <shachaf> You might like to be able to write code that works uniformly in both cases.
00:36:35 <ais523> actually AVX is almost certainly enough, you can treat %xmm1 or whatever as four floats, then vhaddps gives you the ability to add and vpermilps lets you move them around inside the register
00:36:40 <b_jonas> ais523: in each step, multiply by phi*2**32, add 2**31, then shift right by 32
00:36:59 <shachaf> So you'd have a "field" on the struct which just refers to a constant compiletime value, or something.
00:36:59 <b_jonas> also works with floats: multiply by phi, round to integer
00:37:06 <ais523> this doesn't work with ints because vpermd can't take input from an immediate for some reason
00:37:10 <shachaf> Are there languages that do that?
00:37:28 <b_jonas> can only give the series starting from the second 1 though
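[The float variant b_jonas describes, sketched in Rust: repeatedly multiply by phi and round to the nearest integer. As noted, it produces the series only from the second 1 onward, and only while f64 precision holds out.]

```rust
fn main() {
    let phi = (1.0 + 5.0f64.sqrt()) / 2.0; // golden ratio
    let mut f = 1.0f64;
    let mut seq = Vec::new();
    for _ in 0..10 {
        seq.push(f as u64);
        f = (f * phi).round(); // next Fibonacci number, no second register needed
    }
    assert_eq!(seq, [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]);
}
```

The integer form ("multiply by phi*2**32, add 2**31, shift right by 32") is the same computation in 32.32 fixed point, with the add performing the round-to-nearest.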
00:37:39 <ais523> shachaf: Rust does several things which are incredibly similar to that
00:37:55 <ais523> but might not quite have the syntax you want
00:37:58 <shachaf> Hmm, do you have an example?
00:38:38 <ais523> you can say print!("{}", obj.field()); and depending on the type of obj, that can either get a value of a field or else compile down to a constant
00:39:14 <ais523> in this case the function containing that would have a type parameter describing the type of obj so that the compiler knew which implementation to use
00:39:43 <shachaf> Another version of this might be a function that can either take an argument at compiletime or at runtime. (In the former case it could be specialized for that argument.)
00:40:27 <ais523> several of the more modern C-replacements (although not Rust, I think) have syntax in which you just write a function and can use it at compile-time if everything it does is safe then
00:40:47 <shachaf> I guess if you have a length() "method" in your language, and you monomorphize polymorphic functions, that more or less accomplishes the same thing, with enough optimizations.
00:40:54 <ais523> fwiw, even gcc compiling C will generate specialised functions if the code implies that they might be useful, at a sufficiently high optimisation level
00:41:29 <ais523> shachaf: right; one weird thing in Rust is that it's idiomatic to rely on this actually happening
00:41:46 <b_jonas> ais523: yeah, but that's another of those optimizations that are hard to get right in the compiler
00:41:47 <shachaf> That sounds like something straight out of C++.
00:41:48 <ais523> e.g. arrays have a compile-time known length but you'd normally write .len() to get at their length anyway
00:42:09 <ais523> because Rust monomorphises whenever you don't explicitly tell it not to
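[A minimal sketch of ais523's point: the same call site either reads a run-time field or becomes a compile-time constant, depending on the type parameter. The trait and type names here are invented for illustration.]

```rust
trait Len {
    fn len_of(&self) -> usize;
}

struct Fixed;                  // length known statically
struct Dynamic { len: usize }  // length stored in the object

impl Len for Fixed {
    fn len_of(&self) -> usize { 10 } // constant-folds after monomorphisation
}
impl Len for Dynamic {
    fn len_of(&self) -> usize { self.len } // a real field load
}

// One source-level function; the compiler specialises it per concrete T.
fn report<T: Len>(obj: &T) -> usize {
    obj.len_of()
}

fn main() {
    assert_eq!(report(&Fixed), 10);
    assert_eq!(report(&Dynamic { len: 3 }), 3);
}
```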
00:42:24 <shachaf> C++ people are all about generating ridiculously inefficient code with recursive nested structs that the compiler can optimize to something efficient.
00:42:29 <b_jonas> the compiler would have to try to inline every call, recursively, to figure out which sets of inlinings result in simplifications
00:42:35 <ais523> shachaf: yes, that's very Rust as well
00:42:48 <shachaf> I think that's not that great an approach.
00:42:52 <b_jonas> just inlining single calls isn't enough, because sometimes everything is hidden behind several layers of functions
00:42:54 <ais523> with the exception that Rust relies a bit less on the optimiser for it
00:43:04 <ais523> the language semantics are such that the optimisation is more or less forced
00:44:52 <shachaf> Imagine you could say struct Arr<T, length = -1> { T* data; if length < 0 { int len; } else { compiletime int len = length; } };, or something.
00:45:38 <shachaf> Of course there are more things about an array that you might want to be either compiletime or runtime.
00:45:49 <shachaf> For example you could track strides for a multidimensional array.
00:46:10 -!- nfd has joined.
00:46:48 <b_jonas> shachaf: sure, Eigen does that with array lengths, and I think some other libraries do too
00:47:12 <shachaf> In general I feel like there are so many different arrayish types you might want, which is kind of annoying.
00:47:21 <ais523> well, the Rust way to do that would be to write [T; 10] for a fixed ten-element array and [T] (which is a different type) for an array of unknown length, but it's trivial to write a function that's generic over both of those
00:47:48 <ais523> (e.g. by specifying the type of your argument as Into<&[T]>)
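[A sketch of being generic over fixed-length and dynamic-length arrays, as ais523 describes; using the `AsRef<[T]>` bound, which is the more common spelling than `Into<&[T]>`.]

```rust
// Accept anything that can be viewed as a slice, whether the length is
// a compile-time constant or only known at run time.
fn total(xs: impl AsRef<[u32]>) -> u32 {
    xs.as_ref().iter().sum()
}

fn main() {
    let fixed: [u32; 3] = [1, 2, 3];       // [T; N]: length in the type
    let dynamic: Vec<u32> = vec![4, 5, 6]; // length only known at run time
    assert_eq!(total(fixed), 6);
    assert_eq!(total(dynamic), 15);
}
```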
00:48:19 <shachaf> For example: Array with static length; array with dynamic length; array with dynamic length and (static or dynamic?) capacity, which can't grow past that capacity; array with dynamic length and dynamic capacity that can be grown with an allocator (do you store the allocator too?)
00:48:38 <ais523> I have come to the conclusion that it's generally incorrect for a method to require a specific type from its arguments, rather than a description of what the type needs to support
00:49:01 <ais523> but that description doesn't have to be duck typing, something like Rust's traits or Haskell's typeclasses probably works better
00:49:51 -!- nfd9001 has quit (Ping timeout: 264 seconds).
00:50:09 <ais523> Rust has three arrayish things, [T; len] (compiletime fixed length); [T] (runtime fixed length); Vec<T> (growable)
00:50:22 <b_jonas> (more than three but yeah)
00:50:25 <shachaf> I think doing too much monomorphizing everywhere isn't that great either.
00:50:27 <ais523> that said, I think the "growable up to a fixed capacity" would be useful, but I don't think it's in the standard library
00:50:34 <ais523> b_jonas: three really well-known ones
00:50:41 <shachaf> For example you get a lot of generated code, which isn't great.
00:50:42 <ais523> what sort of minor arrayish things does it have?
00:50:43 -!- nfd has quit (Ping timeout: 248 seconds).
00:51:09 <b_jonas> yeah, those are mentioned on the front page of the standard library documentation
00:51:12 <shachaf> ais523: Vec<T> is only growable with a call to a global allocator, right?
00:51:27 <shachaf> In general I'd like my libraries not to depend on things like a global allocator.
00:51:51 <b_jonas> at https://doc.rust-lang.org/nightly/std/index.html#containers-and-collections that is
00:52:12 <b_jonas> shachaf: sure, but if you want a growable type, what should it grow into?
00:52:33 <shachaf> Well, that's one reason I want "growable up to a fixed capacity".
00:52:34 <b_jonas> do you want one that uses a buffer you give it?
00:52:52 <b_jonas> hmm yes, I don't know if the library has such a type
00:52:54 <shachaf> You could tell your caller that you need more space.
00:52:58 <b_jonas> you could define one though
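[A minimal sketch of the "growable up to a fixed capacity" type b_jonas suggests defining yourself; crates like `arrayvec` do this properly (with `MaybeUninit` storage rather than the simplistic `Option` array used here). The `CapVec` name is invented.]

```rust
struct CapVec<T, const N: usize> {
    buf: [Option<T>; N], // capacity fixed at compile time; no allocator needed
    len: usize,
}

impl<T, const N: usize> CapVec<T, N> {
    fn new() -> Self {
        CapVec { buf: std::array::from_fn(|_| None), len: 0 }
    }
    // Report failure to the caller instead of growing via a global allocator.
    fn push(&mut self, v: T) -> Result<(), T> {
        if self.len == N {
            return Err(v);
        }
        self.buf[self.len] = Some(v);
        self.len += 1;
        Ok(())
    }
    fn len(&self) -> usize { self.len }
}

fn main() {
    let mut v: CapVec<u32, 2> = CapVec::new();
    assert!(v.push(1).is_ok());
    assert!(v.push(2).is_ok());
    assert!(v.push(3).is_err()); // capacity reached; the caller decides what to do
    assert_eq!(v.len(), 2);
}
```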
00:53:09 <ais523> ugh, I've been trying to remove doc.rust-lang.org from my browser history
00:53:25 <ais523> I shouldn't have clicked that link
00:53:30 <ais523> moony: well, I often develop Rust offline
00:53:33 <b_jonas> ais523: use the forget feature
00:53:39 <ais523> so I have a local copy of the Rust documentation
00:53:41 <shachaf> I do all my browsing in incognito mode by default so nothing goes in my history.
00:53:42 <ais523> b_jonas: I did, it's just a pain
00:54:15 <ais523> actually, the really annoying thing is that I can't use file:/// URLs for Rust documentation any more, because the page crashes if you don't have cookies/localstorage enabled
00:54:27 <shachaf> ais523: There's also the separate distinction of whether a type "owns" the memory it's referring to or not.
00:54:28 <ais523> and I don't think file:/// supports that
00:54:35 <ais523> so I'm now running a local webserver just for the Rust documentation
00:54:37 <moony> ais523: `rustup doc`
00:54:42 <moony> does that not work
00:55:14 <ais523> b_jonas: it's not that ouch, I've had a local webserver running on here for years because I couldn't figure out how to get rid of it
00:55:14 <shachaf> I guess Rust is full of ways to handle that.
00:55:17 <ais523> now it at least has some purpose
00:55:23 <ais523> (it's only accessible on localhost, at least)
00:55:28 <shachaf> But I think e.g. Box<> assumes you're using the global allocator?
00:55:53 <ais523> shachaf: pretty much the entirety of Rust is about whether types own their memory or not
00:56:09 <b_jonas> shachaf: yes, or at least its destructor does
00:56:12 <ais523> many methods have to be written three times for the three possible ownership relationships
00:56:26 <ais523> Box<> is basically the main API for accessing the global allocator
00:56:39 <ais523> so them being tied to the global allocator really isn't surprising
00:56:40 <b_jonas> and there's not much point using it without
00:56:42 <shachaf> And yet they don't seem to have great support for a lot of allocation strategies.
00:56:49 <ais523> you could have a MyLocalBox or whatever for a different allocator
00:57:13 <ais523> basically nothing in Rust's standard library assumes you're using a Box, it's always done with traits
00:57:39 <ais523> there are standard trait combinations to use to say "this is a custom pointer-like thing" and then it can be used in all the same contexts that Box can
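[A sketch of the "traits, not Box" point: an API can take anything pointer-like via `Deref` rather than requiring `Box` specifically, so a custom-allocator box would slot in the same way. The `describe` function is invented for illustration.]

```rust
use std::ops::Deref;
use std::rc::Rc;

// Generic over the pointer type: Box, Rc, plain references, or a hypothetical
// MyLocalBox all work, as long as they deref to str.
fn describe<P: Deref<Target = str>>(p: P) -> usize {
    p.len()
}

fn main() {
    assert_eq!(describe(Box::<str>::from("boxed")), 5);
    assert_eq!(describe(Rc::<str>::from("rc")), 2);
    assert_eq!(describe("plain"), 5); // &str derefs to str too
}
```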
00:57:53 <shachaf> Last I tried Rust much Box was called ~, I think (or maybe @?).
00:58:05 <shachaf> I think @ was for garbage-collected or maybe reference-counted cells.
00:58:36 <ais523> nowadays ~ is called Box and @ was split into Rc and Arc (a Gc was planned but they never got around to it)
00:58:47 <ais523> the Rc/Arc difference is that Arc is thread-safe but slower
00:59:57 <ais523> (Rc can be used in multithreaded programs but Rust won't let you send one between threads, it's confined to a single thread)
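[A small demonstration of the Rc/Arc split just described: Rc counts references within one thread, while Arc (atomically counted, hence slower) is the one that may cross thread boundaries.]

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

fn main() {
    // Rc: cheap, single-threaded reference counting.
    let r = Rc::new(5);
    let r2 = Rc::clone(&r);
    assert_eq!(Rc::strong_count(&r), 2);
    drop(r2);
    assert_eq!(Rc::strong_count(&r), 1);

    // Arc: the thread-safe counterpart; sending an Rc here would not compile.
    let a = Arc::new(5);
    let a2 = Arc::clone(&a);
    thread::spawn(move || assert_eq!(*a2, 5)).join().unwrap();
}
```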
01:00:38 <b_jonas> shachaf: that's what cpressey said too ("https://esolangs.org/logs/2019-07-23.html#l2c")
01:00:41 <shachaf> Pervasive reference-counting doesn't seem like a great allocation strategy to me.
01:01:12 <shachaf> b_jonas: No, I'm not annoyed by the language changing as long as I'm not using it.
01:01:28 <shachaf> I'd rather they make it better rather than try to ensure backwards compatibility.
01:02:18 <ais523> shachaf: pervasive reference-counting is one of the reasons that Perl is probably unrecoverable from a performance point of view
01:02:47 <ais523> I found out fairly recently (in the last few days) that Python uses pervasive reference-counting + a cycle-breaker, which seems even worse somehow (especially in a language which could easily use its own VM)
01:03:07 <b_jonas> ais523: what does "use its own VM" even mean?
01:03:16 <ais523> b_jonas: like Java has the JVM
01:03:27 <ais523> OCaml's allocation strategy is really interesting and apparently not widely used
01:04:05 <b_jonas> ais523: in Java's case that means that the programs are compiled down to an executable that the VM can run, and you don't need the compiler, only the VM, to run the program
01:04:06 <ais523> everything is 32-bit aligned, every 32-bit chunk has a tag bit saying whether or not it's a pointer
01:04:17 <b_jonas> but how is that relevant for what you've said above?
01:04:32 <shachaf> ais523: I think this was changed at one point, but for a long time Python objects that had a reference cycle and destructors were just not collected.
01:04:34 <ais523> this means that exact garbage collection is possible (not just conservative) and doesn't need to know anything about the structure of memory apart from the tag bits
01:05:08 <ais523> b_jonas: well, the JVM is an example of something that can implement exact reference counting, and even things like compaction, because it has full control over the structure of all the memory stored there
01:05:25 <ais523> in Java you can swap out GC algorithms without changing the performance of the program
01:05:27 <b_jonas> ais523: doesn't it also mean that you can't easily have an array of numbers in a dynamically allocated thingy though?
01:05:37 <ais523> and I don't think reference-counting + cycle-breaker can possibly be superior to a GC
01:05:54 <ais523> b_jonas: OCaml numbers are only 31 bits wide so that there's room for the tag bits
01:06:14 <b_jonas> ais523: right, so you can't have an array of proper numbers, only of OCaml numbers
01:06:21 <ais523> OCaml has to do fairly insane things to make floats work with this, which is a major downside
01:07:03 <ais523> shachaf: something that detects a situation in which a nonzero reference count exists only because of objects recursively referencing each other
01:07:10 <ais523> and not because the objects are actually referenced
01:07:15 <ais523> and frees all the objects in the cycle
01:07:38 <zzo38> Fix the Rust documentation so that it does not use cookies/localstorage. A documentation page shouldn't require that anyways.
01:07:39 <b_jonas> shachaf: a would-be-thief that decides that if he can't have your bicycle, then you can't either
01:08:13 <shachaf> I mean, how does it work other than effectively doing general GC?
01:08:34 <ais523> shachaf: that's why I think it's ridiculous; it's basically most of the way to a general GC with all the disadvantages of an Rc
01:10:21 <b_jonas> that said, python can work with a pure GC, and I think the python implementations that run in the jvm do that; and if you know that no dependency of your code needs the gc, then you can run cpython with just the refcounting, disabling the gc
01:11:41 <b_jonas> you can use destructors and/or weak references to make the pure refcounting work
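[The cycle problem being discussed, illustrated with Rust's Rc, which, unlike CPython, has no cycle-breaker: a reference cycle keeps its objects alive forever, which is exactly what a cycle-breaker (or weak references, as b_jonas says) exists to prevent.]

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b)); // a -> b -> a: a cycle
    assert_eq!(Rc::strong_count(&a), 2);
    drop(b);
    // The cycle still holds a reference to a, so nothing was freed:
    // dropping a as well would leak both nodes. Weak references (rc::Weak)
    // for the back-pointer are the usual fix.
    assert_eq!(Rc::strong_count(&a), 2);
}
```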
01:12:20 <b_jonas> https://docs.python.org/3/library/gc.html#gc.disable
01:12:35 <b_jonas> (don't click on that if you want to use a local copy of the docs)
01:12:51 <ais523> I don't think I have a local copy of the Python docs
01:13:01 <ais523> maybe I should, but I hardly program in Python
01:13:26 <shachaf> If you program in a language all the time, you probably don't need the documentation.
01:13:50 <ais523> anyway, I had an idea wrt the OCaml way of doing things: for each type, identify the ratio of pointers to non-pointers in it (in terms of bytes), and allocate each type with a given ratio in its own arena
01:14:22 <ais523> /but/, you allocate the pointer and non-pointer parts in separate arenas too (the addresses of the two parts are mathematically related because of the constant ratio)
01:14:56 <ais523> now, you have all the pointers in memory blocks of their own, so that you can GC them really easily, and don't need to waste any bits on tagging things
01:16:28 <ais523> another benefit of this is that it statically guarantees that all pointers are correctly aligned, without needing to waste any bytes on padding
01:16:58 <ais523> I'm not sure what the cache effect would be, there are reasons to think it would help but also reasons to think it would hinder
01:18:51 <ais523> actually, this is almost strictly better than the OCaml approach (which is already pretty good), the only downside is related to unions and other sum types
01:19:02 <ais523> which are, of course, heavily used in OCaml, so that might be a problem
01:19:35 <ais523> unless, hmm, perhaps sum types could be references and the tag is stored in your choice of arena to point the pointer into
01:21:12 <ais523> OCaml is the sort of language that really wants a GC, because it heavily relies on immutable value types that it wants to avoid copying, /but/ also contains mutable data
01:22:09 <ais523> (I think that languages which mutate a lot normally benefit from manual memory management, and with languages which don't mutate at all, reference counting is a more attractive possibility than it would be otherwise)
01:24:00 <shachaf> I think languages should mutate a lot when they can, but they always seem to focus on backwards compatibility.
01:24:16 <shachaf> This is why you should never release your software.
01:24:38 -!- Sgeo_ has quit (Ping timeout: 245 seconds).
01:25:07 <pikhq> I'm increasingly of the conclusion the transistor was a mistake.
01:26:28 <shachaf> do you want to design my language for me twh
01:26:31 -!- Sgeo has joined.
01:26:57 <shachaf> also do you want to see pictures of cute kittens
01:26:59 <ais523> shachaf: Rust is instructive in that but I'm not sure in what direction
01:27:05 <pikhq> Between work and general life demands I'm doing well to keep up on developments in programming languages, really
01:27:12 <pikhq> Also, of course I do. Cute kittens are great.
01:27:22 <ais523> lots of people got upset that it changed so much before being stabilised, but OTOH it ended up really benefitting from the time
01:27:54 <ais523> I do think it's beneficial to really work on a project's specification before the first release, though, making sure it's perfect
01:28:52 <pikhq> There's still some things in the Rust stdlib that I think are kinda questionable...
01:29:58 <zzo38> What things is that?
01:30:23 <pikhq> Allocation failure is generally an unreported and unhandlable error.
01:30:45 <pikhq> Which, yes, I know is more _ergonomic_, but it means you have limited options for handling that error condition.
01:30:53 <shachaf> I think it'd be a pretty reasonable world for every sufficiently large company and so on to use its own programming language, rather than everyone standardizing on a few.
01:31:52 <ais523> pikhq: my experience with allocation failures is that almost every attempt I've seen to handle them is either equivalent to a crash, or more user-hostile than a crash would be
01:31:55 <shachaf> Unfortunately there's a lot of nonsense involved in making programming languages which probably shouldn't be necessary. And also cross-language interoperability is often very bad.
01:32:19 <pikhq> And it feels a touch silly in a language that offers decent error handling approaches, and doesn't have semantics necessitating always-crash behavior
01:32:20 <ais523> arguably /every/ attempt
01:32:34 <pikhq> ais523: For many programs, that is indeed the correct decision.
01:33:12 <pikhq> In Rust it grates because Rust is trying to handle problem spaces where that might _not_ be the best decision.
01:33:21 <ais523> pikhq: in programs where you'd want to do something else, you probably need safety against other sorts of crashes too, or even power failure
01:34:16 <ais523> (also, my guess is that allocation failure in Rust is a panic, which is just about possible to handle, and I'm guessing that programs that care about allocation failure recovery care about panic recovery too)
01:34:38 <pikhq> I believe there is an outstanding RFC for _making_ it a panic.
01:34:57 <ais523> oh, it isn't atm? presumably that's due to a fear that destructors might allocate memory
01:35:13 <pikhq> And yeah, having allocation failure be a panic is probably a reasonable strategy for a lot of programs.
01:35:16 <ais523> I think it would be obviously incorrect to make it anything less than a panic
01:35:43 <ais523> especially as it's almost impossible, on most computers, to reach the point of memory exhaustion
01:35:53 <ais523> memory is a shared resource, so the computer just runs slower and slower and slower as it gets shorter on memory
01:36:12 <ais523> the point of true memory exhaustion doesn't actually get reached because the user has started force-quitting things and even doing hard power offs before then
01:36:18 <shachaf> On my computer if a program uses too much memory, it just makes the kernel kill some other random program.
01:36:18 <pikhq> Pretty frequently the hypothetical ideal allocation failure behavior is for a given task to abort, not for the program as a whole to.
01:36:28 <pikhq> (but of course that's harder to achieve)
01:36:30 <ais523> so an actual memory exhaustion only happens when it's a quota that got exhausted rather than physical memory
01:36:51 <shachaf> Are destructors a good idea? I can't tell.
01:36:58 <pikhq> shachaf: On Windows on the other hand, the kernel just reports memory exhaustion.
01:36:58 <ais523> (my life got a lot better when I realised that I could just set a per-program RAM usage quota)
01:37:14 <pikhq> It has strict commit charge tracking.
01:37:36 <pikhq> (though it's still hard to really get this to come into play, because the swap file scales in size)
01:37:57 <shachaf> It's kind of bizarre that Linux still uses swap partitions instead of files.
01:38:24 <pikhq> To be honest, probably the most common case where allocation is going to report failure is exhaustion of address space on 32-bit systems.
01:38:38 <pikhq> And that one's pretty easy to hit.
01:39:07 <pikhq> Easier still to have an attempt to allocate that would exhaust address space, while still having plenty after the allocation failed.
01:39:18 <ais523> shachaf: re destructors: my belief is no, except when it's syntactic sugar for something else (as is often the case with RAII), but for a slightly strange reason: if you have the sort of object that benefits from a destructor, you probably need to be tracking/organising all the references to the object anyway to be able to use it in a semantically correct way, in which case calling the destructor manually would be trivial
01:39:48 <ais523> re: swap: Linux can use swap files just fine, people are just used to setting up partitions
01:39:59 <shachaf> Can it suspend-to-disk to a swap file?
01:40:09 <shachaf> That's my main reason for having a swap partition.
01:40:47 <pikhq> I wouldn't be surprised if using a swap file ends up having a performance penalty over a swap partition, just because the Linux swap code is pretty poo.
01:41:07 <shachaf> I also don't know why swapoff takes half an hour.
01:41:10 <zzo38> Also maybe the file is fragmented
01:41:15 <pikhq> Not that swapping is _ever_ going to be fast, but Linux seems to do it a page at a time, synchronously.
01:41:26 <ais523> (fwiw, I set my current laptop up with no swap, neither partition nor file, and have so far not regretted that decision at all)
01:41:27 <shachaf> I guess the reason is what pikhq just said.
01:42:29 <ais523> even then, you still get swapping-like behaviour at times of high memory pressure, but the kernel isn't swapping data out to disk, but rather unloading pages from memory that happen to equal the corresponding page on disk
01:43:01 <shachaf> I think there are two main uses for destructors, which RAII wants to unify:
01:43:28 <shachaf> One is objects on the stack, where things get auto-cleaned-up at the end of a scope.
01:43:58 <shachaf> This is convenient but I think something like Python's "with" or Go's "defer" might address it better. It's mostly a control flow thing, not an object thing.
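The scope-cleanup case can be sketched with Python's standard contextlib (the `acquired` helper and the `events` list are invented for illustration):

```python
import contextlib

@contextlib.contextmanager
def acquired(name, log):
    # cleanup is tied to control flow leaving the block,
    # not to any object's lifetime (no destructor involved)
    log.append(f"acquire {name}")
    try:
        yield name
    finally:
        log.append(f"release {name}")

events = []
with acquired("lock", events):
    events.append("body")
# events == ["acquire lock", "body", "release lock"]
```

The finally clause runs however control leaves the block, including via exceptions, which is the same guarantee "defer" and try-with-resources give.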
01:44:24 <ais523> does Java's try-with-resources fall into the same category?
01:44:37 <shachaf> The other is objects that contain other objects that contain other objects that have destructors, or something.
01:45:09 <ais523> (the semantics: you write try (expression) {}, and when control leaves the {} for any reason at all, the ".close()" method is called on the expression; this includes exceptions and all control flow structures, in addition to just falling off the end naturally)
01:45:12 <shachaf> Where the whole tree is automatically traversed for you when you destruct the outermost object.
01:46:13 <ais523> of course, things like System.exit() can beat the try-with-resources and prevent .close() from running, but the semantics are that the process can't continue until your destructor has run
01:46:20 <shachaf> I guess there's also the special case where you e.g. use a lock-acquiring-object and then tail-call another function and give it that object.
01:46:32 <shachaf> That's probably not handled with a thing like "with".
01:47:08 <ais523> semantically the only issue is that it wouldn't be a tail-call any more
01:47:29 <shachaf> I mean the case where you pass ownership of the lock-object to the other function.
01:47:40 <shachaf> So it can presumably destruct it at any point.
01:48:03 <ais523> hmm… isn't the lock-object basically just a continuation?
01:48:36 <ais523> I guess it's more like a callback
01:50:59 <ais523> anyway, I have one very strong opinion about memory management, which is: for immutable types that never change, programming languages (possibly unless they're /very/ low level) should provide an API which from the programmer's point of view looks like the objects are copied everywhere and never have multiple references, and should optimise it behind the scenes (which may involve garbage collection or reference counting of a single copy or whatever),
01:51:01 <ais523> but should /never/ allow such objects to mix with any sort of memory allocation scheme that's explicitly controlled by the programmer
01:52:48 <ais523> because the two cases are basically entirely different in terms of how you need to optimise them, and trying to treat them the same way makes both of them much more difficult
01:53:25 <ais523> in particular, the programmer should never need to track immutable things, you can pass them around at will without any semantic issues, forget about them, whatever
01:53:47 <ais523> things that can be mutated are both rarer, and need a lot more care, typically you'll have some very regimented rules for using them already
01:54:13 <shachaf> Maybe your "/very/ low level" boundary is different from mine.
01:54:46 <ais523> for example, in NetHack, keeping track of the memory management for every string in the program is a lot of ridiculous effort, especially when you want to be able to pass the same string to multiple functions or to the same function multiple times
01:54:52 <ais523> so a garbage collector for strings would be really nice
01:55:27 <ais523> OTOH, a garbage collector for in-game items, monsters, etc. would just be semantically wrong, because you want those things to stick around until you explicitly destroy them, and the destruction has a lot of /logic/ impacts that need consideration by the programmer
01:55:31 <zzo38> Is there a driver to use with Ghostscript to write to a DVI file?
01:56:06 <ais523> e.g. if the monster is holding an item, do you want to destroy that too? what if it's a plot-critical item that simply cannot be destroyed? you need to know where the monster "should have been" to place the item in the right location after the monster is gone
01:56:36 <ais523> so things like monsters are part of the game logic and managing their memory is trivial because you need to manage their state in just as much detail, and the memory management is easy to tack onto that
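The monster case can be sketched in Python (`Monster`, `destroy_monster`, and the `floor` list are invented names): destruction is explicit and has game-logic consequences, so there is nothing for a garbage collector to decide:

```python
class Monster:
    def __init__(self, pos, items):
        self.pos = pos
        self.items = list(items)

def destroy_monster(monster, floor_items):
    # explicit destruction with logic impacts: the inventory is dropped
    # where the monster "should have been", so even a plot-critical item
    # is never silently lost
    for item in monster.items:
        floor_items.append((monster.pos, item))
    monster.items.clear()

floor = []
destroy_monster(Monster((3, 4), ["Amulet of Yendor"]), floor)
# floor == [((3, 4), "Amulet of Yendor")]
```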
01:57:55 <ais523> (I've also concluded that there's actually a third category here, things like "the internal state of an algorithm" that are mutable but self-contained and used only temporarily, but those are nearly always either stack-allocated or effectively-stack-allocated)
01:59:16 <ais523> I guess you could make exceptions for, say, treesort, which could in theory be stack-allocated but nearly always isn't
01:59:18 <zzo38> (Also, is it possible to add drivers to Ghostscript without recompiling?)
02:00:20 <shachaf> Why would treesort not be stack-allocated?
02:00:31 <shachaf> Or allocated in some kind of temporary memory.
02:00:37 <ais523> the tree you're building is a recursive data structure the same size as the original list
02:01:03 <ais523> trying to express a stack-allocation of that, especially if the list is coming from a streaming source and you don't know how large it is, is incredibly difficult in most languages
02:01:30 <shachaf> Well, you can allocate in some temporary arena or something with effectively stack semantics.
02:01:46 <ais523> using a recursive function to do the stack allocation using its own call frames would work, but is inefficient due to the return addresses cluttering up the stack
02:02:07 <shachaf> Mergesort also allocates a linear amount of memory.
02:02:10 <ais523> so you'd either need an alloca-alike or else a temporary arena
02:02:22 <Hooloovo0> could you do something fancy with tail recursion?
02:02:37 <shachaf> Tail recursion is pretty pointless if you have iteration.
02:03:44 <shachaf> By the way: I realized that "if" evaluates its argument exactly once, at the time it's executed, so it's a lot like a function parameterized on a block.
02:03:56 <shachaf> But "while" re-evaluates its argument so it's not very function-like.
02:04:21 <shachaf> Is "while" an exception among control flow keywords?
02:05:23 <ais523> <Hooloovo0> could you do something fancy with tail recursion? ← I was actually just thinking that, and did some experiments; my conclusion is yes in theory, but no using the x86_64 calling convention
02:05:43 <shachaf> Actually, my real question is: If my language has user-definable control flow constructs, should they be functions, which is enough to support "if", and if so how should they handle "while"?
02:06:29 <ais523> you'd need a calling convention in which the called function cleaned up the section of the stack between stack and base pointer for the caller
02:06:51 <ais523> this calling convention is trivial to define but I don't see any benefit except in this one massively specific case, so I doubt it'd ever be used
02:07:33 <ais523> shachaf: the mathematical solution is call-by-name, and the practical solution most languages use for that is to take a callback parameter, meaning that your language is generally call-by-value but you call-by-name just this one small part
02:07:53 <shachaf> ais523: There might be more benefit in non-recursive tail calls? But probably you should just manage the memory more explicitly if you're doing that.
02:08:16 <ais523> so, e.g., in C, the prototype for while would look like while(bool(*condition)(), void(*body)())
02:08:33 <ais523> because it's a function pointer not a value, the function you're calling can run it multiple times if it wants to
02:08:36 <shachaf> Except that presumably you're thinking of a closure and not a function pointer.
02:08:48 <ais523> it would normally be a closure in practice
02:09:01 <ais523> doesn't have to be, though, it could just be a function constant
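In Python, this call-by-name "while" can be written as an ordinary function taking two callables (`while_` is an invented name; the closures play the role of the C function pointers above):

```python
def while_(condition, body):
    # condition and body are zero-argument callables; because they are
    # passed unevaluated, while_ can re-run the condition each iteration
    while condition():
        body()

# usage: the closures capture the mutable counter n
n = [0]
seen = []
def step():
    seen.append(n[0])
    n[0] += 1

while_(lambda: n[0] < 3, step)
# seen == [0, 1, 2]
```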
02:09:14 <shachaf> Anyway, I'd want this for an efficient language, where the predicate argument is guaranteed to be inlined at compiletime.
02:09:53 <shachaf> If you have a notion of passing multiple blocks to a function, you could just write "while {p} {body}", which maybe wouldn't be so bad.
02:10:08 <zzo38> I think the user defined control structures should be macros and not functions, and operate at compile time.
02:10:09 <ais523> heh, well I designed Verity a while back: intended for efficiency, call-by-name, callbacks are guaranteed to be inlined (in fact, absolutely everything is guaranteed to be inlined, which in turn means that the language doesn't support recursion)
02:10:25 <ais523> or, well, it does actually support recursion but it does so using a recursion combinator
02:10:35 <zzo38> (The macro expansion will then put in all of the correct code to make it work properly at run time)
02:10:56 <ais523> it doesn't support recursion via having a function call itself because that would just inline it
02:11:15 <shachaf> zzo38: I'd like them to be something between macros and functions.
02:11:21 <ais523> shachaf: hmm, I think that might be valid syntax in Perl (once you change "while" to something that isn't a keyword)
02:11:26 <shachaf> Maybe this just means hygienic macros, but I think it's a bit more than that.
02:11:40 <shachaf> For example, I'd like a block to optionally take parameters.
02:11:49 <shachaf> So you might write "for(xs) {\x; ... }"
02:12:08 <shachaf> That's not just an AST, exactly, it's a compiletime object that can be called.
02:12:31 <zzo38> Yes, that makes sense, it could be hygienic macros that you can add blocks to take parameters. So, you can have anonymous macros perhaps?
02:12:55 <shachaf> I guess you could, though it doesn't seem that useful.
02:13:18 <zzo38> I think it may be useful.
02:14:02 <b_jonas> ais523: impossible to reach memory exhaustion => unless you deliberately put a memory limit to catch it early, whether by setrlimit or through the memory allocator function itself
02:14:02 <shachaf> So my old language idea is, you have an operator that captures "the rest of block" and passes it as an argument to an expression.
02:14:23 <shachaf> So instead of writing "if(p) { body }", you can write "{ if(p)`; body }"
02:14:42 <b_jonas> but I do agree that it's usually not worth to handle an out of memory condition other than guaranteeing that it will crash properly rather than randomly corrupt memory
02:14:53 <ais523> b_jonas: yes, this is why I have a setrlimit on every program I start from the command line by default nowadays
02:15:08 <shachaf> And also instead of writing "for(xs) {\x; ... }", you can write "{ x := for(xs)`; ... }"
02:15:16 <ais523> (to catch fast memory leaks in programs I write, which I'd nearly always be running from the command line)
02:15:57 <shachaf> If you're using this idea, you could maybe have "while" pass a special block that breaks from the loop when you pass it false. Then you could have "{ while`(p); body }".
02:16:11 <shachaf> I don't really like this, though, it seems pretty complicated.
02:16:33 <ais523> shachaf: last time we discussed this, I think we had a debate about if it was a monad or not
02:16:44 <ais523> my current thoughts on this is that it works very well for if-like things but much less well for while-like things
02:16:50 <shachaf> What about for-like things?
02:17:15 <ais523> unsure, but currently leaning towards working
02:17:39 <shachaf> I think it works quite well for "{ loop`; ... }" and "{ x := for(xs)`; ... }"
02:17:56 <shachaf> I think this is nicer than almost any "for" construct in any language.
02:18:26 <shachaf> For example I don't think anyone has a nice way to express "{ x := for(ns)` + 1; ... }"
02:18:28 <ais523> fwiw, I think "each" is a good name for it in this context
02:19:18 <ais523> this also generalises to "any" loops pretty well, but those aren't in common use (maybe they should be)
02:19:19 <shachaf> You can also write "{ x := for(xs)`; y := for(ys)`; if(x < y)`; ... }" and get something like list comprehensions.
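The backtick notation can be approximated in plain Python by passing "the rest of the block" as a callback (a continuation-passing sketch; `for_` and `if_` are invented names):

```python
def for_(xs, rest):
    # "x := for(xs)`" : run the rest of the block once per element
    for x in xs:
        rest(x)

def if_(p, rest):
    # "if(p)`" : run the rest of the block only if p holds
    if p:
        rest()

# "{ x := for(xs)`; y := for(ys)`; if(x < y)`; ... }"
pairs = []
for_([1, 2, 3], lambda x:
    for_([2, 3], lambda y:
        if_(x < y, lambda:
            pairs.append((x, y)))))
# pairs == [(1, 2), (1, 3), (2, 3)], like the list comprehension
# [(x, y) for x in [1, 2, 3] for y in [2, 3] if x < y]
```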
02:19:36 <ais523> the loop exits unless there's an exception in the loop body
02:19:47 <ais523> if there is an exception it just moves onto the next loop iteration
02:19:53 <shachaf> How do you express the exception?
02:20:22 <ais523> normally it would be some sort of assertion failure, these loops are normally only used in declarative languages
02:20:32 <shachaf> You can also write "{ x := for(for(xss)`)`; ... }"
02:20:40 <ais523> but I've found myself writing them in, e.g., Java
02:20:54 <shachaf> I've hardly seen loops like this, or maybe I just haven't recognized them.
02:21:10 <ais523> where they're an ugly sort of "for (a : …) try { … ; break } catch (Exception ex) {}"
02:21:40 <ais523> you basically use them when you have multiple approaches that might potentially work, and just want to find one that works
02:21:51 <shachaf> I think just writing a break at the last line of your loop is simple enough.
02:21:55 <ais523> oh, that's not quite right, because if /every/ iteration fails you want to throw an exception
02:22:15 <ais523> ideally that's a combination of all the others, but I haven't yet seen a language that does that
02:22:35 <shachaf> Probably related to Python-style for-else.
02:22:37 <ais523> "could not communicate by TCP because the socket is closed, nor via carrier pigeon because no pigeons were available"
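That "any" loop with a combined failure message can be sketched as follows (`any_loop` and the attempt functions are invented names; the combining is done by hand since, as noted, no language does it natively):

```python
def any_loop(attempts):
    # try each approach in turn; return the first success,
    # otherwise raise one error combining every failure
    failures = []
    for attempt in attempts:
        try:
            return attempt()
        except Exception as e:
            failures.append(str(e))
    raise RuntimeError(", nor ".join(failures))

def by_tcp():
    raise IOError("could not communicate by TCP because the socket is closed")

def by_pigeon():
    raise IOError("via carrier pigeon because no pigeons were available")

# any_loop([by_tcp, lambda: "sent"]) == "sent"
# any_loop([by_tcp, by_pigeon]) raises RuntimeError with both messages
```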
02:22:50 <ais523> I keep forgetting how for-else works
02:23:03 <shachaf> I used to think it was bizarre but now I think it's very natural.
02:23:13 <ais523> is it that the else block only runs if the loop has no iterations?
02:23:15 <shachaf> Though possibly it's even more natural to express it directly with my language idea.
02:23:32 <shachaf> The else block runs if the loop exits via the condition, instead of a break.
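Concretely, in Python (`find_gt` is an invented example):

```python
def find_gt(ns, bound):
    # the else clause runs only when the loop exits via the condition
    # (ns exhausted) rather than via break
    for n in ns:
        if n > bound:
            break
    else:
        return None   # no break: nothing found
    return n          # break: n is the first match

# find_gt([1, 5, 2], 3) == 5
# find_gt([1, 2], 3) is None
```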
02:24:02 <ais523> the terminology is insane though
02:24:14 <ais523> but the semantics are useful
02:24:33 <ais523> oddly, I think it's the "break" I disagree with here rather than the "else"
02:24:44 <ais523> because "break" is indicating "found"/"success" which the word "break" doesn't really imply
02:24:54 <shachaf> In my language thing, you can label blocks with, say, @, and the label lets you exit them.
02:24:56 <ais523> (likewise, "continue" is indicating failure)
02:25:05 <ais523> perhaps "done" is a good name for "break"
02:25:24 <b_jonas> "I keep forgetting how for-else works" => isn't that because there are at least two unrelated things with a name similar to them?
02:25:44 <shachaf> So "{ x := for(xs)`; ... }" actually means something like: { break := @`; x := for(xs)`; continue := @`; ... }
02:27:20 <b_jonas> ais523: I'd prefer "break" or "last" (especially if you have the full quadruplet "retry/redo/next/last"), because "done" is already used in a conflicting sense in sh
02:27:26 <shachaf> So if you wrote "for (xs) { body } else { notfound }" explicitly, it would be something like: { done := @`; { x := for(xs)`; continue := @`; ... }; notfound }
02:28:04 <shachaf> Who cares about sh? The only good thing about sh syntax is the way it handles variables.
02:30:23 <ais523> b_jonas: what does retry do?
02:31:25 <b_jonas> ais523: jumps to right before the start of the loop. not the start of the loop body, the start of the whole loop
02:33:06 <ais523> neat, that's a possibility I hadn't thought of
02:33:21 <ais523> now I sort-of want a control flow operator to go back to the previous loop iteration
02:33:47 <b_jonas> ais523: jumps to right before the start of the loop. not the start of the loop body, the start of the whole loop
02:34:08 <ais523> I think you sent the wrong line?
02:35:05 * moony tries to wrap head around this
02:35:37 <ais523> shachaf: repeats the current loop iteration
02:35:55 <ais523> so it's basically a goto to the start of the block, without increasing the loop counter or anything like that
02:36:30 <ais523> I've used it a few times but it doesn't seem massively useful
02:36:56 <shachaf> I like the way both break and continue are forward jumps, and only the loop construct itself does a backward jump.
02:37:22 <zzo38> I implemented Z-machine in C, JavaScript, and Glulx; now I do in PostScript. Later, I could try other stuff, such as PC, and Famicom (which I have started to do some time ago but haven't finished it), and possibly others too. What others will you implement Z-machine on?
02:37:28 <ais523> it's debatable where continue jumps to
02:37:52 <ais523> I'm not sure it's observable whether it's a forwards or backwards jump (and at the asm level, backwards is normally more efficient unless the loop entry has been unrolled)
02:38:04 <shachaf> The C standard defines it as a forward jump.
02:38:17 <shachaf> I also think that's a much more sensible definition.
02:38:40 <zzo38> Yes, a forward jump makes sense, and then the optimizer could alter it
02:39:21 <shachaf> Let me see if I can label all the points in a loop to support these behaviors.
02:39:43 <ais523> in BF, is ] a conditional or unconditional jump?
02:39:53 <shachaf> { @break; forever`; @retry; { @thing; forever`; @redo; x := for(xs)`; { @continue; BODY; }; thing; }; break; }
02:39:59 <ais523> (you can even make the argument that ] is conditional but [ is unconditional)
02:40:45 <ais523> what does thing do? I'm still getting used to this notation
02:41:17 <shachaf> It exits the loop when you haven't used redo.
02:41:43 <shachaf> There's probably a better way to express this.
02:43:20 <ais523> I can see why that operation doesn't have a standard name :-D
02:43:56 <ais523> in a more normal notation, the labels go here: retry: while(condition) { redo: BODY; continue:; } break:;
02:44:53 <shachaf> I like break-from-named-block much more than goto for control flow.
02:45:15 <shachaf> It probably doesn't clarify things here, though.
02:45:35 <shachaf> (I think this is a good argument for retry/redo being confusing.)
02:46:26 <zzo38> I prefer goto rather than named continue/break.
02:46:54 <moony> I prefer named continue/break
02:46:58 <moony> goto has too many ways to be evil
02:47:43 <shachaf> I even wonder whether you should just not have continue/break and be explicit about writing the labels when you want them.
02:48:28 <zzo38> moony: Well, so does any other feature, I think.
02:48:30 <ais523> hmm, the only situation in which I find myself tempted to use goto, and it clearly isn't a standin for a missing control structure, is when you have lots of if statements with the same body but they're scattered around the function so you can't combine the conditions, and they're too tightly tied to local variables to extract into a function
02:48:38 <ais523> you can use temporary booleans for that instead but I think the goto is clearer
02:48:58 <zzo38> Yes, and I think there are also other situations where goto is clearer.
02:48:59 <shachaf> ais523: This sounds like a use case for the "blocks" I was describing earlier.
02:49:04 <ais523> shachaf: I think the issue is that the loops are too generically named
02:49:43 <zzo38> I have once written how to convert a program with goto so that it only uses labeled continue/break, which, for example can be used with JavaScript.
02:49:50 <ais523> something like Python's for-else has a clearly defined intended use, Haskell's map also has a clearly defined intended use, but both operations become a for loop in most languages even though they function very differently
02:50:17 <ais523> things like continue and break have unclear semantics because the loops they're short-circuiting/exiting have varying semantics
02:50:59 <shachaf> With your goto notation, for (x in xs) { BODY } else { ELSE } is "for (x in xs) { BODY; continue: }; ELSE; break: "
02:51:56 <ais523> and a label before the ELSE should probably be called "fail"
02:53:43 <shachaf> Man, { x := for(xs)`; if(valid(x))`; switch(x)`; { case(A)`; ... }; { case(B)`; ... } }
02:54:21 <shachaf> for (x in xs) { if (valid(x)) { switch(x) { Case A: ...; Case B: ... } } }
02:55:35 <ais523> hmm, that's semantically correct but isn't it basically an if/else chain?
02:55:58 <ais523> the original purpose of a switch was to hint to a compiler that it might want to make a jump table
02:56:13 <ais523> but I guess that nowadays, maybe switches are just syntactic sugar for the if/else chain
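The jump-table reading of switch can be sketched in Python with a dict dispatch next to the equivalent if/else chain (the `classify_*` functions and the table are invented):

```python
def classify_chain(x):
    # if/else chain: one comparison per case, top to bottom
    if x == "A":
        return 1
    elif x == "B":
        return 2
    else:
        return 0

CASES = {"A": 1, "B": 2}   # jump-table analogue: a single lookup

def classify_table(x):
    return CASES.get(x, 0)  # default case

# both agree: classify_chain("B") == classify_table("B") == 2
```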
02:59:24 <shachaf> Anyway I don't believe in exceptions so you'd need some other way to express "any".
02:59:52 <shachaf> I think it might be reasonable to say that the loop breaks by default and you can explicitly tell it to continue instead.
03:00:16 <shachaf> But it seems infrequent enough that you can probably just write out the control flow yourself?
03:00:24 <ais523> well, I'm using "exception" in a general sense, it's any situation where the code says "OK, this won't work"
03:00:39 <ais523> I think people normally just use "continue" for failures and "break" for successes
03:00:50 <ais523> then an any loop is just a regular for loop
03:00:57 <shachaf> Maybe you can require an "else" clause for "any" loops.
03:18:56 -!- ais523 has quit (Quit: quit).
04:03:47 -!- Lord_of_Life has quit (Ping timeout: 245 seconds).
04:09:42 -!- Lord_of_Life has joined.
04:11:20 -!- kolontaev has quit (Quit: leaving).
04:24:20 <esowiki> [[ADxc]] N https://esolangs.org/w/index.php?oldid=65559 * A * (+467) Created page with "==Example== Suppose your snippets were AD, xc, 123, and ;l. Then: * AD should produce 1 * [[ADxc]] should produce 2 * ADxc123 should produce 3 * and ADxc123;l should produce..."
05:22:59 <esowiki> [[JUMP]] https://esolangs.org/w/index.php?diff=65560&oldid=41105 * Dtuser1337 * (+66) Adding some category.
05:24:15 <esowiki> [[Tarpit]] https://esolangs.org/w/index.php?diff=65561&oldid=40466 * Dtuser1337 * (+22)
05:28:18 <esowiki> [[JUMP]] https://esolangs.org/w/index.php?diff=65562&oldid=65560 * Dtuser1337 * (+34) Woosh
05:31:33 <esowiki> [[Truth-machine]] https://esolangs.org/w/index.php?diff=65563&oldid=64960 * Dtuser1337 * (+84) /* Jug */
05:32:33 <esowiki> [[Truth-machine]] https://esolangs.org/w/index.php?diff=65564&oldid=65563 * Dtuser1337 * (+2) /* JUMP */
06:16:02 <esowiki> [[Truth-machine]] https://esolangs.org/w/index.php?diff=65565&oldid=65564 * Dtuser1337 * (+207) My implementation of truth machine in emoji-gramming.
06:27:39 <esowiki> [[Turth-machine]] https://esolangs.org/w/index.php?diff=65566&oldid=63931 * Dtuser1337 * (+80) formatting codebase and adding categories.
06:30:17 <esowiki> [[Hello world program in esoteric languages]] https://esolangs.org/w/index.php?diff=65567&oldid=65558 * Dtuser1337 * (+43) /* Trigger */
06:31:41 <esowiki> [[Truth-machine]] https://esolangs.org/w/index.php?diff=65568&oldid=65565 * Dtuser1337 * (-6) /* Turth-machine */
06:35:56 <esowiki> [[Turth-machine]] https://esolangs.org/w/index.php?diff=65569&oldid=65566 * Dtuser1337 * (+45)
07:03:59 <esowiki> [[Capuirequiem]] https://esolangs.org/w/index.php?diff=65570&oldid=58513 * Dtuser1337 * (-1) /* External resources */
07:05:09 <esowiki> [[RETURN]] https://esolangs.org/w/index.php?diff=65571&oldid=38078 * Dtuser1337 * (+13) /* External resources */
07:06:53 <esowiki> [[Nil]] https://esolangs.org/w/index.php?diff=65572&oldid=52918 * Dtuser1337 * (+1) /* External resources */
07:11:10 -!- Sgeo_ has joined.
07:14:56 -!- Sgeo has quit (Ping timeout: 272 seconds).
07:15:52 <esowiki> [[CopyPasta Language]] https://esolangs.org/w/index.php?diff=65573&oldid=56021 * Dtuser1337 * (+13)
07:53:24 -!- AnotherTest has joined.
08:25:27 -!- Lord_of_Life has quit (Ping timeout: 245 seconds).
08:26:12 -!- Sgeo__ has joined.
08:27:41 -!- Lord_of_Life has joined.
08:29:51 -!- Sgeo_ has quit (Ping timeout: 268 seconds).
09:09:16 -!- arseniiv has joined.
09:53:49 <b_jonas> yes, that's how I think of it. imagine added labels like A: for (init; cond; step) { B: body; C: } D:
09:54:07 <b_jonas> then retry/goto jumps to A, redo jumps to B, next/continue jumps to C, last/break jumps to D
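The four jump targets can be emulated in Python with marker exceptions (a sketch; `Retry`/`Redo`/`Next`/`Last` and `run_loop` are invented names):

```python
class Retry(Exception): pass   # jump to A: restart the loop, re-running init
class Redo(Exception): pass    # jump to B: repeat the current body
class Next(Exception): pass    # jump to C: skip ahead to the step
class Last(Exception): pass    # jump to D: leave the loop

def run_loop(init, cond, step, body):
    while True:                      # A: retry re-enters here
        state = init()
        try:
            while cond(state):
                while True:          # B: redo re-enters here
                    try:
                        body(state)
                    except Redo:
                        continue     # repeat the body without stepping
                    except Next:
                        pass         # fall through to the step
                    break
                step(state)          # C: next/continue lands here
        except Retry:
            continue
        except Last:
            pass
        return state                 # D: last/break lands here

# sum 0..4 but use Next to skip 2: total == 0 + 1 + 3 + 4 == 8
def body(s):
    if s["i"] == 2:
        raise Next
    s["total"] += s["i"]

out = run_loop(lambda: {"i": 0, "total": 0},
               lambda s: s["i"] < 5,
               lambda s: s.update(i=s["i"] + 1),
               body)
# out == {"i": 5, "total": 8}
```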
10:43:43 <arseniiv> ah, these polysigns are simply amateur algebra (bordering crackpottery), bleh. We take a vector space spanned by e[0], …, e[n−1], add a product e[i] e[j] = e[(i + j) mod n] and factor it all so e1 + … + en = 0. Why don’t people read good math textbooks and speak the consensus language
10:45:32 <arseniiv> though an analysis paper by Hagen von Eitzen doesn’t go in this straightforward way so IDK, maybe one can’t factor an algebra this way
10:46:58 <arseniiv> Taneb: a person named sombrero posted a link on some images hosted on vixra ≈yesterday. I didn’t look at them but I saw that word in the abstract and went investigating what it is
10:47:27 <arseniiv> unfortunately, my expectations were confirmed
10:48:16 <arseniiv> (amateur math with crackpotish philosophical claims)
10:48:42 <arseniiv> and also an old one: the analysis paper is 2009
10:49:08 <shachaf> b_jonas: Where that expands further to A: init; while (cond) { B: body; C: step; } D: presumably
10:49:35 <arseniiv> linear algebra is sufficient to do many many things people try to invent
10:50:05 <arseniiv> (of course it’s a trivial statement here)
10:50:21 <b_jonas> shachaf: well actually more like A: { init; while (cond) { B: body; C: step; } } D: if you care about the scope of the variables declared in init
10:52:18 <shachaf> One of the nice things about my system is that many things get correctly scoped by default.
10:52:48 <shachaf> Instead of Go-style "if x := e; p(x) { ... }" you can just write "{ x := e; if(p(x))`; ... }"
10:53:16 <arseniiv> (ah, now I see the author uses R[x] as a base because its multiplication is almost right)
10:54:07 <b_jonas> shachaf: "if x:= e; p(x) { ... }" is actual syntax?
10:54:40 <b_jonas> that looks odd to me because in ruby (and julia too) the semicolon would end the condition and start the loop body, but I guess it makes sense because sh works that way too
10:55:11 <b_jonas> the sh syntax for while loops allows you to put multiple statements in the condition too, which means you can put the test anywhere in the middle of the loop, you don't need a separate do-while loop or if-break
10:55:21 <shachaf> I think something like "if (auto x = e; p(x)) { ... }" is syntax in recent C++ too.
10:55:45 <b_jonas> shachaf: dunno, I can't follow that
10:56:17 <b_jonas> in perl you can write { my $x = e; p($x) or last; body }
10:56:29 <shachaf> It just declares something that's only in the scope of the if.
10:58:30 <shachaf> Presumably it means the same as { x := e; if (p(x)) { ... } }
11:00:20 -!- heroux has quit (Ping timeout: 248 seconds).
11:07:55 -!- heroux has joined.
11:12:52 <int-e> Hmm, this looks like a good preamble for a SMETANA program to me :) Step 1. Swap step 1 with step 2. Step 2. Go to step 1.
11:18:54 <b_jonas> Imagine an alternate universe where English didn't become a lingua franca, so there are entire libraries where all the identifiers are in German, ones where all of them are Russian, ones where all of them are in French,
11:19:19 <b_jonas> but you sometimes end up with names mixing words from different languages because the concept has a name popularized by an earlier library,
11:19:39 <b_jonas> and perhaps the oldest libraries with the worst names, like ncurses in our world, would even be all in latin.
11:20:17 <shachaf> It's ironic that you're calling English a "lingua franca" in that description.
11:20:35 <b_jonas> and then some libraries with English names would appear too.
11:21:10 <b_jonas> and you'd have to learn the basics of three or four different languages to understand source code, which was basically the status of scientific research in the 1960s
11:23:46 <b_jonas> yes, you'd probably have to know not only the root words, but also conjugation in latin, french, italian, german, and russian to be able to guess the spelling of identifiers if you want to write code
11:24:01 <Taneb> Sounds fun, we should make this happen
11:24:39 <b_jonas> Taneb: sure, but do it in a modern way, where latin is excluded but instead we have chinese identifiers too
11:25:07 <b_jonas> I mean using latin directly is excluded, we can still have the french and italian and Portuguese identifiers of course
11:28:17 <Taneb> Hmm, a lot of libraries are American, so would naturally be written in English
11:28:24 <Taneb> Or maybe Spanish or Navajo or something
11:29:09 <b_jonas> and we may have some short identifiers that are ambiguous because they mean something different depending on what language you're reading them in
11:29:16 <b_jonas> I wonder if we can engineer such a situation in common keywords
11:30:43 <HackEso> signs of the zodiac? ¯\(°_o)/¯
11:31:34 <int-e> Oh I just realized that SMETANA may just be powerful enough to write a quine in. (But I'm targeting SMETANA To Infinity! for now.)
11:32:28 <int-e> Well, if we steal the output instruction from S2I that is.
11:32:39 <int-e> No output, no quine... :)
11:32:57 <shachaf> What if you just put the quine in memory?
11:33:05 <shachaf> For example, in the text section.
11:33:42 <int-e> shachaf: Maybe you should refresh your memory (no pun intended) on what SMETANA is.
11:35:17 <b_jonas> the handle transform tool in gimp 2.10 is such a blessing, it was so hard to do proper alignment of multiple pictures with affine or perspective transforms before that
11:35:34 <int-e> (SMETANA suffers from being a finite state machine; all data has to be encoded somehow in the program itself as written. So you need some serious compression to allow the program to contain a representation of itself, which stretches the term "quine" beyond the limits I'm willing to allow.)
11:37:22 <b_jonas> how does it stretch the term "quine"?
11:37:23 <int-e> shachaf: Of course we could define other output conventions without extending the language... like looking at the trace and mapping certain predetermined addresses to characters.
11:37:38 <shachaf> int-e: I didn't even know about SMETANA until I looked it up 25 minutes ago.
11:38:17 <shachaf> But I was making a general joke that is equally bad in any language.
11:38:50 <shachaf> (The joke is, define "output" to be some region of memory. At program startup, the program is loaded into memory, and therefore you can define that region to be the output and call it a quine.)
11:39:27 <int-e> b_jonas: Well, I don't know how to delineate decompression and a BASIC-style 'LIST' function that makes quines trivial.
11:40:12 <lambdabot> Local time for shachaf is Mon Aug 19 04:40:08 2019
11:40:28 <int-e> it's getting worse
11:40:39 <fungot> b_jonas: not very quickly, and it has a non-real-time gc, but that's just what the user expected type character, got ' ( 1 2 3)
11:42:57 <int-e> fungot's almost making sense again (though I guess that last part is closer to Lisp than to J?)
11:42:58 <fungot> int-e: lms that doesn't violate her fnord, either. probably still aren't. is slib standard?
11:43:30 <int-e> oh well, the moment of clarity has passed
11:46:28 <shachaf> fungot: Was that Lisp or J?
11:46:28 <fungot> shachaf: ( define ( get-move-console) count)). the funge-98 standard is also written in scheme
11:46:44 <b_jonas> fungot: the reason why people don't take you seriously is that you aren't using the One True Style for formatting your code. don't put whitespace after the left parenthesis or after the quote
11:46:44 <fungot> b_jonas: goes where no other music fnord or has been in jail. we went to another bar. then we only need to cons stuff onto the top of
12:07:52 <b_jonas> pesky arthropods, thinking that I invited them just because I left food open on the counter
12:08:26 <b_jonas> if you want food, buy your own, darn it
12:10:45 -!- j-bot has joined.
12:16:45 -!- andrewtheircer has joined.
12:30:12 <b_jonas> is there a strongly typed golf language where, if your code isn't well-typed, it tries to insert extra stuff like parentheses into your code until it's well-typed, so if you use the type system well, you can omit a lot of things from the parse tree?
12:30:28 <b_jonas> would need a good built-in library and good heuristics for what to insert
12:31:00 <b_jonas> and might be harder to program in than Jelly
12:33:30 <b_jonas> also, is there a golf language that is basically the same as haskell but with a much more golfier syntax?
12:33:46 <b_jonas> I don't think blsq counts as such
12:52:53 <b_jonas> oh, I get it! if a C compiler environment isn't made for hard realtime programs, then the library defines the PRIoMAX macro to "llo", meaning that the maximum priority that your threads will be able to run is very low
12:57:40 <b_jonas> ``` gcc -O -Wall -o /tmp/a -x c - <<<$'#include<inttypes.h>\n#include<stdio.h>\n int main(void){ printf("maximum thread priority: %s\\n", PRIoMAX); return 0; }' && /tmp/a
12:57:41 <HackEso> maximum thread priority: lo
12:57:49 <b_jonas> that's slightly better, it's only low, not very low
13:15:24 -!- andrewtheircer has quit (Quit: Ping timeout (120 seconds)).
13:15:48 -!- arseniiv_ has joined.
13:17:33 -!- andrewtheircer has joined.
13:17:34 -!- arseniiv has quit (Ping timeout: 246 seconds).
13:54:05 <esowiki> [[Special:Log/newusers]] create * Mid * New user account
14:16:34 <b_jonas> apparently the We The Robots comic has ended its long hiatus, except it's not at the old address "http://www.chrisharding.net/wetherobots/" but at "http://www.wetherobots.com/" now
14:16:40 <b_jonas> I hadn't noticed until now
14:16:46 <b_jonas> this is what happens when you break urls
14:29:55 -!- sleepnap has joined.
14:30:57 <esowiki> [[ABCD]] https://esolangs.org/w/index.php?diff=65574&oldid=46025 * Dtuser1337 * (+18)
14:39:34 <b_jonas> question. is it possible to make a terminfo entry for irc that tells how to output colors and bold and italic with the mirc codes, and if so, will gcc output colored error messages in HackEso?
14:40:04 <b_jonas> I don't know if gcc's colored error messages respects that
14:40:54 <b_jonas> I made a terminfo entry once, but only by editing a single value in an existing entry
14:44:18 <b_jonas> I should check the ncurses sources in case it already has such an entry though
14:54:45 <b_jonas> whoa... are the terminfo thingies no longer distributed with ncurses? where do they come from then?
15:01:57 <b_jonas> I'll check what debian thinks
15:04:46 <b_jonas> the compiled entries come from the ncurses-term package, now how do I find what the source package is for that?
15:06:01 <b_jonas> and the source package is apparently ncurses
15:07:13 <b_jonas> called ncurses-6.1/misc/terminfo.src
15:08:02 <b_jonas> now the question is, can terminal libraries handle that the same code toggles bold on and off?
15:08:14 <b_jonas> and also the fact that you can't put a digit right after a color code?
15:08:46 <b_jonas> if not, we need a filter after it, which would be ugly
15:09:13 <b_jonas> I guess for the latter, I can just force put two bold codes after a color code
15:09:21 <b_jonas> and for the former, just hope it works
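The filter b_jonas is talking himself into could look something like this. A hedged sketch only: the SGR-to-mIRC colour mapping and the function name are made up for illustration. It translates a few ANSI escape sequences to mIRC codes, and after each colour code it appends two bold toggles (a no-op pair) so that a literal digit following the colour can't be misparsed as part of the colour number:

```python
import re

# mIRC control codes: \x02 toggles bold, \x03NN[,NN] sets colours,
# \x0f resets all formatting.
BOLD, COLOR, RESET = "\x02", "\x03", "\x0f"

# Hypothetical mapping from a few ANSI SGR colour parameters to mIRC
# colour numbers (illustrative, not a standard table).
SGR_TO_MIRC = {30: 1, 31: 4, 32: 3, 33: 8, 34: 2, 35: 6, 36: 10, 37: 0}

def ansi_to_mirc(text):
    """Translate a small subset of ANSI SGR escapes to mIRC codes.

    After a colour code we emit two bold toggles (which cancel out) so a
    following literal digit can't be swallowed into the colour number --
    the workaround proposed in the discussion above.
    """
    out, pos = [], 0
    for m in re.finditer(r"\x1b\[([0-9;]*)m", text):
        out.append(text[pos:m.start()])
        pos = m.end()
        params = [int(p) for p in m.group(1).split(";") if p] or [0]
        for p in params:
            if p == 0:
                out.append(RESET)
            elif p == 1:
                out.append(BOLD)  # caveat: mIRC bold is a toggle, not "on"
            elif p in SGR_TO_MIRC:
                out.append("%s%02d%s%s" % (COLOR, SGR_TO_MIRC[p], BOLD, BOLD))
    out.append(text[pos:])
    return "".join(out)

print(repr(ansi_to_mirc("\x1b[31m1 error\x1b[0m")))
```

The bold-toggle issue from earlier in the conversation is exactly the caveat in the comment: since \x02 toggles rather than sets, a terminfo entry can describe "enter bold" with it but has no independent "exit bold" code.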
15:17:32 -!- xkapastel has joined.
15:19:35 <b_jonas> that debian package ncurses-term, by the way, contains the _additional_ compiled terminfo entries, the decades of legacy terminals that nobody uses anymore
15:23:09 <Taneb> Imagine if people started adding footnotes* to their IRC messages
15:23:39 <b_jonas> it's actually split around the directory tree too, so the important terminfo entries are in /lib/terminfo , the rest are in /usr/share/terminfo , in case you're mounting usr late
15:32:32 <b_jonas> fungot, do you have a calendar? has SGDQ 2019 started yet?
15:32:32 <fungot> b_jonas: wtf is that? that problem haunts me. i don't think
16:03:19 -!- joast has quit (Quit: Leaving.).
16:03:58 -!- joast has joined.
16:52:42 <fizzie> fungot: I take it you don't watch speedruns then?
16:52:42 <fungot> fizzie: if i remove the fan, heatsink, installation instructions and finally the body is ( are) returned. this expression type is used at a german bank for financial simulations. but that's a mere one character :) i'm not complaining
16:52:58 <b_jonas> fungot, 45 minutes is how many hours?
16:52:59 <fungot> b_jonas: you could just select another transport module system in the works to start off by reading sicp or htdp.
16:53:19 <b_jonas> fungot: don't remove the fan and heatsink! the cpu will overheat
16:53:19 <fungot> b_jonas: is that what you mean
17:04:13 -!- Phantom_Hoover has joined.
17:20:25 <HackEso> The password of the month is surprising.
17:21:15 -!- Sgeo has joined.
17:24:12 -!- Sgeo__ has quit (Ping timeout: 272 seconds).
17:38:36 -!- FreeFull has joined.
17:53:49 -!- andrewtheircer has quit (Remote host closed the connection).
18:02:45 -!- arseniiv_ has changed nick to arseniiv.
18:09:56 <arseniiv> Taneb: I¹ proved² the Riemann hypothesis³ false⁴
18:10:30 <arseniiv> but the page is too short for the footnote section so no one will know what I meant
18:12:41 <arseniiv> though one footnote can be crammed into the margin (¹ I — Roman numeral 1)
18:12:56 <int-e> . o O ( "I proved the Riemann hypothesis" [is] false. )
18:13:31 <b_jonas> int-e: ah yes, newspaper headline grammar
18:13:48 <int-e> Oh. We can claim that this is a portmanteau.
18:14:04 <int-e> (hypothesis and is -> hypothesis)
18:14:35 <HackEso> friend:friend is a portmanteau of fritter and rend \ ism:Isms are philosophies, religions or ideologies that have branched off from older ones, such as Leninism or Buddhism. Etymologically "ism" is a backformation from portmanteaus on "schism". \ portmanteau:«Portmanteau» is the French spelling of “port man toe”.
18:39:37 <arseniiv> int-e: :D but is omission of this kind grammatical?
18:40:10 <arseniiv> <int-e> (hypothesis and is -> hypothesis) => ah, quite clever
18:40:47 <arseniiv> one can even claim that hypotheses < hypothesises :D
18:41:21 <arseniiv> or not, it seems different phonetically?
18:41:38 <int-e> arseniiv: FWIW I initially intended to provide a footnote 4 that would redefine "false", but didn't find any nice formulation...
18:42:49 <arseniiv> though this is not nice at all
18:43:37 <arseniiv> I had in mind conflating unprovedness, unprovability and proved negation
18:44:09 <int-e> arseniiv: https://www.youtube.com/watch?v=8keZbZL2ero is somehow relevant
18:44:11 <arseniiv> but decided to leave it vague (v) for a time
18:55:55 -!- Melvar has quit (Quit: WeeChat 2.4).
19:26:05 -!- ais523 has joined.
19:26:18 -!- ais523 has quit (Client Quit).
19:26:31 -!- ais523 has joined.
19:26:55 <ais523> <b_jonas> is there a strongly typed golf language where, if your code isn't well-typed, it tries to put extra stuff like parentheses and other extra stuff into your code until it's well-typed, so if you use the type system well, you can omit a lot of things from the parse tree? <b_jonas> also, is there a golf language that is basically the same as haskell but with a much more golfier syntax? ← yes to both, and it's the same language: Husk
19:27:37 <b_jonas> hmm, there doesn't seem to be an article on the wiki
19:29:58 <ais523> CGCC users rarely bother with creating wiki articles for their golflangs :-(
19:30:14 <ais523> they normally just link to the github repo to let people know what the language is
19:30:27 <ais523> https://github.com/barbuz/Husk in this case
19:30:38 -!- Sgeo_ has joined.
19:31:04 <esowiki> [[Husk]] N https://esolangs.org/w/index.php?oldid=65575 * B jonas * (+158) stub
19:31:33 <esowiki> [[Language list]] https://esolangs.org/w/index.php?diff=65576&oldid=65520 * B jonas * (+11) Husk
19:32:16 <b_jonas> oh no, not yet another golf language with its own custom eight-bit character set
19:33:02 <ais523> most golf languages do that
19:33:09 <ais523> because control codes are /really/ hard to read and type
19:33:34 <ais523> (a golf language I'm sporadically working on has its own /six/-bit character set, which has the advantage that I can make it a subset of ASCII)
19:33:48 -!- Sgeo has quit (Ping timeout: 245 seconds).
19:34:34 <b_jonas> I guess it's still better than if the golf language has a tricky variable length compression, so reading and writing it is more difficult than just translating characters
19:34:48 <b_jonas> like a Huffman encoding or even worse
19:35:28 <b_jonas> of course Jelly's compressed string modes are sort of like that
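As a toy illustration of the point (this is plain Huffman coding, not Jelly's actual compression scheme), a variable-length code ties every symbol's representation to the global frequency table, so nothing lines up on character boundaries the way a simple 8-bit character-set translation does:

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code table for the symbols of text (toy version)."""
    heap = [(n, i, {c: ""}) for i, (c, n) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    order = len(heap)  # tiebreaker so the dicts are never compared
    while len(heap) > 1:
        na, _, ca = heapq.heappop(heap)
        nb, _, cb = heapq.heappop(heap)
        merged = {c: "0" + bits for c, bits in ca.items()}
        merged.update({c: "1" + bits for c, bits in cb.items()})
        heapq.heappush(heap, (na + nb, order, merged))
        order += 1
    return heap[0][2]

# Frequent symbols get short codes, so editing one symbol can shift the
# bit boundaries of everything after it -- unlike a custom 8-bit
# character set, where one glyph is always exactly one byte.
table = huffman_code("abracadabra")
bits = "".join(table[c] for c in "abracadabra")
```

Reading or writing `bits` by hand means tracking the whole prefix tree in your head, which is the "more difficult than just translating characters" b_jonas describes.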
19:45:06 -!- heroux has quit (Ping timeout: 268 seconds).
19:46:59 -!- adu has joined.
19:51:39 -!- heroux has joined.
20:03:32 <arseniiv> ais523: thanks for an interesting language!
20:04:02 <arseniiv> though golflangs make my head spin, there are so many details because of the need to compress
20:04:31 <arseniiv> nondeterministic typing there is cool
20:05:20 <arseniiv> (hopefully it’s implemented in such a way as to not blow up combinatorially)
20:08:09 <arseniiv> I’d type overloaded multimethods using union typing, but am I right it’s not easily added to Hindley–Milner-like inference?
20:08:16 <ais523> it might blow up at compile time but that's probably forgivable, golflang programs are short
20:09:12 -!- Melvar has joined.
20:10:01 <b_jonas> I wonder if there's a golf language that has a built-in that gets the OEIS sequence from its index, and to compute its terms, tries to automatically run code in the OEIS entry, so requires Maple and Mathematica plus ten gigabytes of other software to run.
20:10:22 -!- sleepnap has quit (Quit: Leaving.).
20:10:52 -!- sleepnap has joined.
20:11:23 <b_jonas> and of course you download a snapshot of the OEIS at the time you build the compiler, so that it doesn't cheat by looking info up on the internet that may be newer than the language
20:13:25 -!- sleepnap has quit (Client Quit).
20:19:13 -!- Melvar has quit (Ping timeout: 245 seconds).
20:19:37 -!- Melvar has joined.
20:21:10 <ais523> b_jonas: I think someone tried that once but failed
20:21:47 <ais523> a better approach would probably be to encourage people to submit OEIS programs in a standard machine code; WebAssembly comes to mind
20:21:48 <b_jonas> yeah, this does seem like an interesting language
20:22:11 <b_jonas> ais523: does WebAssembly have bigints?
20:22:32 <ais523> it's a machine code, so not natively, in the same sense that x86 doesn't have native bigints
20:22:38 <ais523> but you can run GMP or the like on it easily enough
20:23:03 <b_jonas> ais523: I mean as a standard library that's usually accessible or something, not as a "built-in"
20:23:27 <ais523> WebAssembly doesn't have standard libraries, you're supposed to compile your own language's standard library onto it
20:23:31 <b_jonas> so that you don't have to bundle a copy of the bigint multiplication routine with the code of every quickly growing sequence
20:23:57 <ais523> a typical WebAssembly program is shipped with a decent proportion of libc
20:23:58 <b_jonas> (though I presume you'd allow a single object file that implements multiple OEIS sequences)
20:24:17 <ais523> I guess object files might make more sense than executables, in that case
20:25:24 <esowiki> [[Husk]] https://esolangs.org/w/index.php?diff=65577&oldid=65575 * B jonas * (+465)
20:25:46 <b_jonas> it could be executables, that's not the difference I care about here
20:25:56 <b_jonas> just that there are bunches of OEIS sequences that are closely related
20:26:08 <b_jonas> so it would be redundant to copy the code to each one
20:26:32 -!- Lord_of_Life_ has joined.
20:27:39 -!- Lord_of_Life has quit (Ping timeout: 268 seconds).
20:29:27 -!- Lord_of_Life_ has changed nick to Lord_of_Life.
20:36:44 -!- ais523 has quit (Ping timeout: 272 seconds).
21:08:56 -!- ais523 has joined.
21:10:27 <ais523> well, executables are self-contained, but object files don't have to be
21:11:58 <shachaf> I wrote a program to generate ELF executables but it seems to me object files might actually be trickier
21:12:26 <shachaf> Hmm, maybe just differently tricky.
21:13:02 <shachaf> The tricky thing is that you need a bunch of information in sections for the linker to interpret. But it can handle relocations and so on for you, I suppose.
21:13:46 <b_jonas> shachaf: do you also emit basic debug information like the code span of each function?
21:14:17 <b_jonas> I guess that is sort of redundant because gdb can guess the function from the function symbols, without the debug info
21:14:30 <shachaf> (Because the program just emits some fixed handwritten x86 code.)
21:14:57 <shachaf> I do emit function information, I think. But the only function is _start.
21:15:36 <HackEso> \ tmp/out.a: file format elf64-x86-64 \ \ \ Disassembly of section .text: \ \ 0000000000000178 <_start>: \ 178:48 31 ed xor %rbp,%rbp \ 17b:49 89 d2 mov %rdx,%r10 \ 17e:48 b8 66 69 6e 61 6c movabs $0xa796c6c616e6966,%rax \ 185:6c 79 0a \ 188:50 push %rax \ 189:b8 01 00 00 00 mov $0x1,%eax \ 18e:bf 01 00 00 00 mov $0x1,%edi \ 193:48 89 e6 mov
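A minimal sketch of the kind of generator shachaf describes: packing fixed, handwritten x86-64 machine code into a static ELF64 executable. The layout here (a single PT_LOAD segment covering the headers plus the code, entry point immediately after the program header) is one common minimal scheme, not necessarily shachaf's, and the code bytes are just an exit(0) stub chosen for illustration:

```python
import struct

def make_elf(code, vaddr=0x400000):
    """Pack a minimal ELF64 executable: ELF header, one program header,
    then the machine code, all mapped in a single R+X PT_LOAD segment."""
    ehsize, phsize = 64, 56
    entry = vaddr + ehsize + phsize      # code starts right after headers
    ehdr = struct.pack(
        "<4sBBBB8xHHIQQQIHHHHHH",
        b"\x7fELF", 2, 1, 1, 0,          # ELF64, little-endian, SysV ABI
        2, 0x3E,                         # e_type ET_EXEC, e_machine x86-64
        1, entry,                        # e_version, e_entry
        ehsize, 0,                       # e_phoff, e_shoff (no sections)
        0, ehsize, phsize, 1,            # e_flags, e_ehsize, e_phentsize, e_phnum
        0, 0, 0)                         # e_shentsize, e_shnum, e_shstrndx
    filesz = ehsize + phsize + len(code)
    phdr = struct.pack(
        "<IIQQQQQQ",
        1, 5,                            # PT_LOAD, flags PF_R | PF_X
        0, vaddr, vaddr,                 # p_offset, p_vaddr, p_paddr
        filesz, filesz, 0x1000)          # p_filesz, p_memsz, p_align
    return ehdr + phdr + code

# xor %edi,%edi; mov $60,%eax; syscall  -- exit(0) on Linux x86-64
code = bytes([0x31, 0xFF, 0xB8, 0x3C, 0x00, 0x00, 0x00, 0x0F, 0x05])
elf = make_elf(code)
```

Writing `elf` to a file and marking it executable yields a 129-byte program; as the chat notes, an object file would instead need symbol and relocation sections for the linker to interpret, which is the "differently tricky" part.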
21:16:00 <ais523> what sort of calling convention is that? :-D
21:16:10 <ais523> I'm used to seeing functions starting out by messing around with sp and bp
21:16:13 <ais523> but zeroing bp is weird
21:16:20 <ais523> I'm guessing it's just being used as a general-purpose register?
21:16:29 <shachaf> That's standard in the amd64 ABI.
21:16:39 <shachaf> To mark the outermost stack frame.
21:17:05 <ais523> couldn't the outermost stack frame actually need it as a base pointer, though?
21:17:56 <shachaf> I think _start normally calls another entry point almost immediately.
21:17:57 <ais523> oh, that string says "finally"
21:18:17 <ais523> I initially misread it as "fitally", I'm not as good at converting hex to ascii in my head as I'd like
21:18:58 <ais523> hmm… do you have an opinion on caller-saved versus callee-saved registers?
21:19:17 <b_jonas> `perl -eprint pack"q",0xa796c6c616e6966 # let's ask a computer to do that, just to check
21:19:41 <shachaf> I don't have a strong opinion, at least. Maybe I had one in the past.
21:20:27 <shachaf> I think someone who knows more about register renaming and things than I do should give me an opinion.
21:27:45 -!- asdfbot has joined.
21:29:15 <shachaf> I think it's reasonable for _start to be special in this way because it's not actually a function.
21:29:41 <moony> =rasm xor rax, rax
21:29:49 <moony> =rasm2 xor rax, rax
21:30:54 <moony> =rasm2 -s att xor %rax, %rax
21:31:06 <moony> =rasm2 -satt xor %rax, %rax
21:31:24 <shachaf> Which order should I write the operands in in my assembler?
21:31:34 <moony> radare2 on hackeso when
21:31:42 <b_jonas> shachaf: use NASM syntax, it's better than either intel or att
21:32:00 <moony> it's intel-like yes
21:32:03 <b_jonas> that is if this is an assembler for x86
21:32:58 <shachaf> Presumably I want to target a bunch of platforms
21:33:34 <moony> shachaf: https://github.com/asmotor/asmotor contribute to this instead then
21:34:31 -!- LBPHacker has joined.
21:34:38 -!- BWBellairs has joined.
21:34:43 <moony> `welcome LBPHacker
21:34:44 <b_jonas> you should probably use xor %eax,%eax though, because it encodes shorter
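For reference, these are the standard x86-64 encodings (opcode 31 /r is XOR r/m32, r32; the REX.W prefix widens it to 64 bits), and since a write to a 32-bit register zero-extends into the full 64-bit register, the two instructions have the same effect on rax:

```python
# xor %eax,%eax -- 2 bytes; xor %rax,%rax -- 3 bytes (extra REX.W prefix).
XOR_EAX_EAX = bytes([0x31, 0xC0])
XOR_RAX_RAX = bytes([0x48, 0x31, 0xC0])

# Both leave rax == 0, because writing a 32-bit register implicitly
# zeroes the upper 32 bits; the shorter form is the idiomatic way to
# zero a register on x86-64.
print(len(XOR_EAX_EAX), len(XOR_RAX_RAX))
```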
21:34:44 <HackEso> LBPHacker: Welcome to the international hub for esoteric programming language design and deployment! For more information, check out our wiki: <https://esolangs.org/>. (For the other kind of esoterica, try #esoteric on EFnet or DALnet.)
21:34:45 <LBPHacker> blame moony for everything I do here :P
21:35:44 <shachaf> Why contribute to that instead?
21:35:55 <moony> good, multi-CPU assembler.
21:36:09 <moony> more assemblers is just more competing standards right now
21:36:29 <b_jonas> see http://yasm.tortall.net/ for x86
21:37:03 <b_jonas> or invent your own syntax that looks similar to the others but is incompatible in subtle ways that are hard to debug
21:37:22 -!- sleepnap has joined.
21:37:24 <shachaf> I primarily wanted a library rather than something with a parser anyway
21:37:48 <moony> capstone not work?
21:38:19 <moony> oh, right, capstone's a disassembler
21:38:38 <moony> http://www.keystone-engine.org/
21:39:04 <moony> that's its assembler counterpart
21:39:19 <b_jonas> for disassembling, you can try Agner's disassembler
21:39:39 <moony> or http://www.capstone-engine.org/ as i mentioned
21:41:27 <shachaf> There's only one application I previously wanted a general disassembler library for, and it was kind of ridiculous.
21:42:12 <b_jonas> did it involve malware research or kernel debugging?
21:42:44 <shachaf> No, the goal was to make a variant of strace that traces I/O to memory mapped files precisely.
21:43:36 <shachaf> In order to trace exactly which bytes were read from or written to, you need to disassemble the instruction.
21:44:41 <shachaf> I think there are some debuggers that have this feature.
21:44:41 <b_jonas> that, or mprotect all of the mapped pages to 0 permissions after every access, and catch the segfault and see what the siginfo says
21:45:17 <shachaf> That's the approach I'm proposing.
21:45:17 <b_jonas> wait, can't intel's built-in debug faults already tell what's read and written, and doesn't the kernel expose that in siginfo?
21:45:34 <shachaf> Certainly not stepping through every instruction, that seems way too slow.
21:45:50 <shachaf> But when you get the SEGV you need to figure out exactly which octets were read or written.
21:45:52 * ais523 has been reading the x86_64 ABI
21:46:02 <ais523> it gives advice on how to implement exabyte-size functions
21:46:10 <ais523> I wonder what circumstances would require you to write one of those
21:46:24 <moony> I can not imagine a sane one
21:47:02 <moony> why do me and ais523 have the same name color in weechat
21:47:22 <moony> we're alphabetical miles apart
21:47:35 <b_jonas> I don't know how all that x86 stuff works
21:47:48 <moony> b_jonas: it works via a generator that runs on hot garbage
21:48:06 <b_jonas> ais523: a compiler may have to, when it's forced to compile firefox
21:48:33 <moony> maybe you unroll a massive loop?
21:49:12 <moony> LBPHacker: help i need ideas for why one would use an exabyte-sized function
21:49:38 <ais523> I still have trouble imagining exabytes of data, although there are probably some companies who store that much nowadays
21:49:41 <b_jonas> though I think they don't have functions larger than gigabyte sized
21:49:42 <ais523> but exabytes of /code/?
21:50:06 <ais523> I guess when you're writing an ABI you need to take all eventualities into account
21:50:48 <b_jonas> ais523: right, so that twenty years later, when the required hardware is available, different groups of people don't start inventing incompatible extensions
21:50:59 <shachaf> ais523: I doubt there's any company that stores an exabyte of data on one machine.
21:51:29 <ais523> my guess is that nobody does but I'm not sure
21:51:30 <moony> I wonder what happens if you give GCC infinite (to the max x86-64 allows) RAM and have it compile an exabyte-sized function
21:51:43 <moony> ais523: the most we can store per machine right now is about 1PB i think
21:52:10 <moony> so 1024 machines per exabyte
21:52:18 <ais523> there are definitely use cases in which you want as much in-memory storage as possible in one machine and don't care if it's lost to a crash
21:52:39 <b_jonas> moony: even a petabyte is a lot, yeah
21:53:02 <b_jonas> you'd need 32 hard disks, each 32 terabytes in size, or something
21:53:12 <LBPHacker> moony: I'm missing a lot of context here but iirc x86-64 supports an address space of 2**52
21:53:12 <moony> ais523: the most RAM per machine right now is 8TB, assuming an AMD EPYC-based server.
21:53:19 <b_jonas> ais523: such as video games, sure
21:53:28 <ais523> moony: so not even in the petabyte range
21:53:33 <b_jonas> or trying to break cryptography stuff by building huge tables
21:53:42 <moony> and this is as of just a few days ago btw
21:53:44 <LBPHacker> or are we talking possibly out of memory
21:53:56 <moony> before the latest EPYC units were released, it was a max of 4TB per machine
21:54:15 <moony> LBPHacker: possibly out of memory i bet
21:54:20 <moony> memory paging/swapping/whatever
21:54:32 <b_jonas> LBPHacker: I disagree, https://esolangs.org/logs/2019-08-05.html#luc
21:54:59 <b_jonas> LBPHacker: it's the 5-level one that would support 52 bits I think
21:55:12 <b_jonas> the current cpus support only up to 48 bits
21:55:39 <LBPHacker> well I mean I remember 9+9+9+9+12, which is the 4-level
21:55:41 <b_jonas> moony: what's the most fast solid state storage you can have in a machine?
21:55:50 <LBPHacker> how many address pins your cpu has is another matter :P
21:55:58 <moony> I dunno, i know some servers with 48+ NVME slots
21:56:11 <ais523> even if your CPU is short on address pins you could always use… bank switching!
21:56:12 <LBPHacker> see this is why I don't do addition
21:56:36 <ais523> although it tends not to play very nicely with the concept of an operating system
21:56:48 <moony> b_jonas: with a modern EPYC system, I think it'd max out at 128 NVMe devices, assuming you somehow utilized all 128 PCIe lanes from both CPUs
21:57:18 <b_jonas> LBPHacker: I gave up on arithmetic years ago, when I debugged a segfault for hours, then found that I allocated 8092 bytes instead of 8192
21:57:20 <moony> (Yes, EPYC servers are currently the most capable. Perfect.)
21:57:34 <b_jonas> these days I'd let the computer figure out the size from a multiplication or shift
21:57:41 <LBPHacker> lol. yes, that is why I don't do decimal either
21:58:07 <b_jonas> moony: and how large can those NVMe devices be?
21:58:07 <moony> how many SATA devices can you have per PCIe lane
21:58:34 <b_jonas> shachaf: yes, that too, though sometimes I mess that up too
21:58:59 <b_jonas> moony: also how many ways is the largest possible NUMA in a machine?
22:00:44 <ais523> haha, the PLT uses one format for the first 102261125 entries and changes to a different format from the 102261126th onwards
22:00:54 <b_jonas> computer hardware is getting ridiculously powerful. good thing I have my programmable calculator, with 2 kilobytes of RAM, battery-backed to make it persistent, on the shelf
22:00:55 <moony> I think EPYC zen 2 server processors are one NUMA node each
22:01:07 <ais523> I wonder how many programs a) care about the PLT format at all and b) are unaware of that detail
22:01:21 <moony> b_jonas: 128 cores per server \o/
22:02:11 <moony> 128 cores, with slots for 4 accelerator cards
22:02:28 <ais523> (the reason is that there's some shared code between the entries that they normally reach via a jump, but you can't write the jump instruction when you're too far from the start of the PLT)
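A hedged back-of-the-envelope check of that cutoff: assuming each entry in this PLT format occupies 21 bytes (an assumption; the entry layout isn't quoted here) and the shared code at the top of the PLT is reached with a signed rel32 jump, the entries that can still reach it run out almost exactly at ais523's figure:

```python
REL32_REACH = 2 ** 31   # a signed 32-bit displacement spans +/- 2 GiB
ENTRY_SIZE = 21         # assumed size of one entry in this PLT format

# Entries past this index sit too far from the start of the PLT for a
# rel32 jump back to the shared code, so the entry format has to change.
cutoff = REL32_REACH // ENTRY_SIZE
print(cutoff)  # 102261126
```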
22:02:50 <b_jonas> moony: nice. have you ever written a program for those that spawns 128 parallel processes to speed up something, but some library you call in them tries to be automatically smart and spawns 128 parallel threads in each of them?
22:03:21 <shachaf> If I generate an amd64 program, should I just not use a PLT?
22:03:30 <b_jonas> that's happened to me on 12 cores only, I haven't seen a 128 core machine
22:03:42 <moony> b_jonas: 128 core machines are on the market as of a few days ago
22:03:46 <ais523> shachaf: I'm sceptical about the value of the PLT and GOT
22:03:46 <moony> they're 2-way as well
22:03:52 <shachaf> Also, should I take Microsoft's approach and say "x64" instead of "x86-64" or "amd64"? It's shorter.
22:04:08 <shachaf> ais523: Hmm, what's the alternative to the GOT in ELF files?
22:04:13 <ais523> x64 probably refers to an unrelated CPU
22:04:18 <moony> b_jonas: they're also impressively cheap, 7k for the highest end 64-core CPU from AMD
22:04:26 <moony> and it absolutely kills in terms of performance
22:04:51 <moony> somethingsomething50kfor56coresfromintel
22:04:55 <ais523> shachaf: well, the GOT basically lets you find addresses that aren't compile-time constants, but doing arithmetic on %rip can also do that if you know that the entire program has moved the same amount
22:05:04 <b_jonas> I'd like a powerful computer, but not that powerful
22:05:11 <ais523> so it'd only be useful when connecting between two parts of the program that are ASLRed by different amounts
22:05:30 <moony> b_jonas: AMD has 16-core processors for desktop for about $800 i think, coming to market real soon
22:05:46 <shachaf> ais523: Right, but isn't every library loaded at a random address?
22:06:39 <ais523> shachaf: yes, but there isn't a whole lot of actual communication done between libraries through, e.g., shared global constants
22:07:26 <shachaf> But what about calls to library functions?
22:07:31 <ais523> so I'm not sure that I see any particular difference between absolute and position-independent code; you need the dynamic linker to update function calls in one library to point to the other anyway
22:07:58 <ais523> the purpose of the PLT is pretty much just so that you can avoid editing the .text segment, which forces you to use a private rather than shared map
22:08:09 <shachaf> Right, so the PLT seems kind of pointless.
22:08:24 <shachaf> But isn't the GOT the alternative that doesn't edit .text?
22:08:39 <ais523> the GOT and PLT are very similar
22:09:09 <ais523> the difference is that GOT is general-purpose and the PLT is just wrappers for function calls
22:09:23 <ais523> (this lets you give a location in the PLT when you need to give someone a function pointer for a callback)
22:09:45 <moony> LBPHacker: exascale R316 when
22:09:56 <ais523> that said, position-independent code doesn't move while it's running, so if you need to give someone a function pointer, you could probably just lea it yourself and pass them the resulting absolute address
22:10:01 <b_jonas> why do we have to have these interesting conversations during the night when I have to get up early the next day? the sleep cycle of this channel is messed up
22:10:14 <shachaf> Maybe I'm saying GOT when I mean something else.
22:10:15 <moony> it's just the afternoon here
22:10:28 <moony> GOT sounds an awful lot like GDT
22:10:39 <shachaf> What's the way you're proposing to call a dynamically linked function?
22:10:42 <moony> I know GDT, but not GOT..
22:12:11 <shachaf> I think you should have something like "movq some_offset(%rip), %rax; callq *%rax", where that offset is an offset into a segment the dynamic linker populates with the correct address at load time.
22:12:47 <shachaf> Or maybe that's not what I think?
22:14:29 <ais523> shachaf: my suggestion is callq 0(%rip) in the executable, the dynamic linker edits the 0 to the actual distance when the executable is loaded
22:14:54 <ais523> that works well for executable → library calls as long as they're both in the first 31 bits (they typically will be)
22:15:08 <shachaf> But that requires editing .text such that it can't be shared, right?
22:15:12 <ais523> it doesn't work as well for library → library calls, though, because you'd ideally want to be able to store the libraries in a shared mapping
22:15:22 <ais523> right, that prevents a shared .text
22:15:23 <shachaf> Maybe you're saying that's irrelevant.
22:15:32 <ais523> I consider it insufficiently relevant
22:15:48 <shachaf> One thing I like being able to do is load a new copy of a .so at runtime repeatedly.
22:16:04 <shachaf> I guess that's still possible with that approach.
22:16:17 <shachaf> But it might require a page to be W|X.
22:16:21 <ais523> but if it's a problem, the next step would be to have a separate section that groups together jumps to functions, do calls by calling that section, and have the dynamic linker edit the jump targets
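The load-time fixup ais523 is describing amounts to patching the 32-bit displacement of a direct call once the real distance is known. A sketch under illustrative assumptions (the offsets and the callee address are made up; E8 is the direct near call opcode, whose displacement is counted from the end of the 5-byte instruction):

```python
import struct

CALL_REL32 = 0xE8  # direct near call: opcode + signed rel32, 5 bytes total

def patch_call(image, call_offset, target):
    """Rewrite the placeholder rel32 at call_offset so the call lands on
    target (both are offsets within the same mapped image).

    This mimics the load-time edit described above: the static linker
    emits a displacement of 0, and the loader fills in the real distance
    between caller and callee once both addresses are known.
    """
    assert image[call_offset] == CALL_REL32, "not a direct call"
    next_insn = call_offset + 5
    rel = target - next_insn            # relative to end of the call insn
    assert -2**31 <= rel < 2**31, "callee out of rel32 range"
    return (image[:call_offset + 1]
            + struct.pack("<i", rel)
            + image[next_insn:])

# A placeholder call at offset 0, aimed at a callee at offset 0x1000:
image = bytes([CALL_REL32, 0, 0, 0, 0]) + bytes(0x1000 - 5)
patched = patch_call(image, 0, 0x1000)
```

The downside, as the surrounding discussion notes, is exactly that this write dirties the page, so the patched .text can no longer be shared between processes.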
22:16:43 <shachaf> I was going to say that sounds like the PLT, but I guess the PLT has an extra level of indirection.
22:17:07 <ais523> I don't understand why it has that extra level
22:17:23 <shachaf> I think it's to allow lazy loading.
22:17:45 <shachaf> I think lazy loading is probably a bad idea, though.
22:18:23 -!- AnotherTest has quit (Ping timeout: 245 seconds).
22:18:23 <ais523> the remaining hard case is when one library (or the executable) refers to a global variable in another library (or the executable)
22:18:35 <ais523> my preferred solution to this would be simply to ban it :-D
22:18:47 <shachaf> Hmm, I use this in the aforementioned use case.
22:19:27 <shachaf> Specifically my program is made of two parts, a loader binary and an application .so.
22:19:29 <ais523> in cases like this, getter/setter methods would normally be a better API
22:19:58 <shachaf> The .so refers to some global variables defined in the loader to share state across reloads.
22:20:08 <shachaf> I suppose the loader could just pass it a pointer.
22:21:04 <ais523> ooh, I think I know why the PLT might be useful: it's to avoid invalidating function pointers into a library after it's reloaded
22:21:34 <shachaf> I think those are invalidated anyway.
22:21:47 <ais523> I guess they have to be
22:21:48 <shachaf> Well, I'm loading the library with dlopen.
22:22:05 <shachaf> (In the loader->library direction.)
22:27:15 -!- arseniiv has quit (Ping timeout: 248 seconds).
22:27:25 <ais523> anyway, my current hobby is being annoyed at compiler optimisers
22:27:37 <shachaf> One annoying thing about structuring the program this way is that the .so can't use any global variables (that survive across reloads).
22:27:54 <shachaf> Most of the global variables I'm importing from the loader really belong in the library anyway.
22:28:21 <ais523> unsigned short f(unsigned long long *p) { unsigned long long rp = *p; if ((rp & 0xFF00000000000000LLU) != 0x8000000000000000LLU) return 0; return (unsigned short)((rp ^ 0x8000000000000000LLU) >> 48); }
22:28:33 <ais523> ^ that is the test function that I've been working on optimising
22:28:53 <shachaf> I feel like, if you want to make your programs fast, optimizing compilers are rarely the right place to look.
22:29:22 <ais523> this compiles to 45 bytes with gcc -Os, 34 bytes with clang -Os
22:29:48 <ais523> I got it down to 13 bytes of hand-written asm
22:30:04 <shachaf> Oh man. That's a lot of bytes.
22:30:07 <ais523> the reason I'm annoyed is that I want to be able to write clear code and rely on compiler optimisations
22:30:52 <ais523> I had a go at trying to generate good asm via repeatedly hand-optimising the C file
22:31:18 <ais523> that got it down to 24 bytes on clang and 17 on gcc, but at the cost of requiring -fno-strict-aliasing
22:32:01 <b_jonas> "<ais523> haha, the PLT uses one format for the first 102261125 entries and changes to a different format from the 102261126th onwards" => I wonder if I should addquote that
22:32:48 <ais523> helps on clang, at least
22:33:05 <ais523> you can hit an LLVM optimiser idiom and get back to strictly conforming C, which is nice
22:33:06 <b_jonas> getter/setter methods? why not a function that returns the address of the variable instead?
22:33:31 <ais523> I guess that works in most cases
22:33:44 <ais523> I like getter/setter because it abstracts away the way in which the variable is stored internally
22:34:09 <ais523> but many of the cases you'd need that, e.g. if you need to move the variable to thread-local-storage, still let you take references to it
22:34:20 <ais523> `` printf "%x" 102261125
22:34:59 <b_jonas> ais523: isn't the extra indirection so that you can take a function pointer to a function in another library, and equality-compare it in C to the same function pointer taken from a third library, and have them compare equal?
22:35:06 <b_jonas> the extra indirection in the PLT that is
22:35:27 <shachaf> How does the PLT let you do that?
22:35:41 <b_jonas> ais523: you have made sure that you aren't using a too old compiler or an MS compiler, right?
22:36:06 <ais523> shachaf: ^ this is a really good example of my belief that immutable things should be treated as indistinguishable from values, with any references managed behind the scenes
22:36:52 <ais523> reference-== on immutable things is not a useful operation (except possibly as an optimisation hint) and if you accidentally expose it in your language semantics, a lot of contortions are needed to avoid breaking programs that rely on it
22:37:02 <b_jonas> ais523: for a thread-local variable, you call the function again each time you're not sure you're in the same thread. that's how errno works.
22:37:18 <ais523> in retrospect, errno was a mistake
22:37:23 <ais523> but it took a while for that to become apparent
22:37:32 <moony> errno is a big mistake
22:37:39 <b_jonas> ``` gcc -E - <<<$'#include<errno.h>\nerrno' | tail
22:37:40 <HackEso> # 25 "/usr/include/x86_64-linux-gnu/bits/errno.h" 2 3 4 \ # 50 "/usr/include/x86_64-linux-gnu/bits/errno.h" 3 4 \ \ # 50 "/usr/include/x86_64-linux-gnu/bits/errno.h" 3 4 \ extern int *__errno_location (void) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__)); \ # 36 "/usr/include/errno.h" 2 3 4 \ # 58 "/usr/include/errno.h" 3 4 \ \ # 2 "<stdin>" 2 \ (*__errno_location ())
22:38:00 <ais523> that "leaf" in the header file is curious
22:38:04 <b_jonas> moony: yes, and C99 at least partly gets rid of it by allowing such fast functions as sqrt to not necessarily set it
22:38:12 <ais523> when would a function need to know that it's /calling/ a leaf function?
22:38:47 <ais523> hmm, apparently it means "this function will never call a callback function, nor longjmp"
22:39:09 <ais523> which does seem potentially useful as an optimisation hint
22:42:20 <shachaf> So when foo is a dynamically linked function and it's used as a function pointer, what does that pointer point to?
22:43:15 <b_jonas> shachaf: I don't know. it might not even work the way I explained, maybe it points to different places depending on which library you take the function pointer in. I never tested it.
22:43:16 <shachaf> Apparently it's not the PLT entry.
22:44:17 <shachaf> I made a test program but now I gotta figure out what it's actually doing.
22:45:08 <shachaf> A PLT entry isn't even generated when you get a function pointer. Only when you call the function directly.
22:45:30 <shachaf> Which is pretty bizarre because "calling the function directly" nominally happens through a function pointer, in C.
22:46:34 <b_jonas> shachaf: does it really? I thought the function (reference) just decays to a function pointer when used as such, just like how an array decays to a pointer
22:46:47 <b_jonas> and when you call it, that doesn't happen
22:47:21 <shachaf> That's what the C standard says.
22:47:44 <b_jonas> not that that's an observable difference, and in any case, this is something that can and will be optimized
22:47:49 <shachaf> When I get back to the computer I'll test it more directly.
22:48:59 <b_jonas> but still, I think compilers will likely optimize it, because direct function calls are pretty common
22:49:10 <shachaf> Hmm, maybe it is observable, for the reason you said (making function pointers equal across libraries).
22:49:39 <shachaf> Since each library has its own PLT.
22:50:37 <b_jonas> shachaf: does C even guarantee the thing about making function pointers equal?
22:52:12 <b_jonas> I'm not sure you are even allowed to compare two function pointers without invoking undefined behavior unless one of them is a null pointer
22:52:57 <b_jonas> for equal comparisons, you don't get UB
22:53:21 <b_jonas> but I don't know if two function pointers to the same function are guaranteed to be equal, even within a compilation unit
22:53:25 <ais523> you can compare for equality, but not for < and >
22:53:44 <b_jonas> ais523: right, but what result do you get?
22:54:02 <b_jonas> does f == f have to return true if f is a function?
22:54:03 <ais523> who knows, this is C we're talking about :-D
22:54:19 <shachaf> Yes, with optimizations it turns into a PLT call in gcc.
22:55:03 <b_jonas> I think you can do equal comparisons on function pointers only because some APIs treat null pointers in a special way
22:55:27 -!- b_jonas has quit (Quit: leaving).
22:56:05 <shachaf> Yes, the function pointer comes from the GOT.
23:00:57 <shachaf> Ugh, dynamic linking is so complicated.
23:02:08 -!- Phantom_Hoover has quit (Ping timeout: 245 seconds).
23:09:40 <shachaf> I'm kind of annoyed that when you look up something about dynamic linking or ELF files or whatever most of the search results are about random programs that print error messages involving those things rather than anything about how those things actually work.
23:17:10 <shachaf> At least writing an emitter is probably way better than writing a loader (which needs to handle all the old things no one uses anymore).
23:17:15 -!- xkapastel has quit (Quit: Connection closed for inactivity).
23:19:13 <shachaf> I'm confused by the last paragraph in https://www.airs.com/blog/archives/549
23:19:21 <shachaf> "It is possible to design a statically linked PIE, in which the program relocates itself at startup time."
23:19:31 <shachaf> Why does the program need to relocate itself? Doesn't the kernel do that?
23:35:18 -!- sleepnap has quit (Quit: Leaving.).
23:36:56 <pikhq> I'd have to look to check, but I think this is "relocation" in the sense of "processing ELF symbol relocations", not in the sense of "mapping into memory"
23:37:34 <pikhq> Because a static PIE binary is more-or-less a shared object that happens to have an entry point and no external dependencies.
23:37:58 <shachaf> Why would ELF symbol relocations be relevant for a statically linked executable?
23:39:34 <pikhq> The executable's GOT and PLT needs populated.
23:40:15 <shachaf> What, are you from Pittsburgh now?
23:40:29 <shachaf> I don't see why a statically linked executable would have a GOT or PLT.
23:41:06 <pikhq> Because you've implemented a static PIE binary by emitting, essentially, a shared object that happens to have an entry point.
23:41:43 <shachaf> Yes, but that's just a technicality due to the kernel only randomizing ET_DYN files.
23:41:58 <shachaf> There's no need to emit a DYNAMIC segment.
23:43:58 <pikhq> Trying to remember exactly how musl does static PIE binary support...
23:45:00 <ais523> <rustc LLVM> movzwl -0x2(%rdi,%rax,1),%ecx \ mov %ecx,%edx \ xor $0x8060,%edx \ movzwl %dx,%edx
23:45:12 <ais523> that last movzwl is just some really obvious dead code
23:45:37 <ais523> the top 16 bits of %edx didn't stop being 0 just because you XORed the bottom 16
23:47:18 <ais523> I need to stop taking it personally when compilers generate ridiculous code, but still, that one really hurts
23:48:17 <ais523> meanwhile, clang optimised (x ^ 0x807F) >= 0x00A0 into (x ^ 0x8060) >= 0x00A0 which doesn't really change anything but is amusing
23:49:29 <pikhq> Looks like musl's implementation is that it links in the dynamic linker to static PIE binaries.
23:50:18 <pikhq> Or, rather, a portion of it.
23:50:56 <pikhq> https://git.musl-libc.org/cgit/musl/tree/ldso/dlstart.c It's this portion.
23:51:24 <pikhq> Just the startup code for the dynamic linker, not the full thing.
23:52:25 <pikhq> Oh, it's doing very very little.
23:52:34 <pikhq> Not even really processing relocations.
23:55:02 <ais523> later on there's also the beautiful "xor %r8d,%r8d \ movzbl %r8b,%eax" (with, AFAICT %r8 dying immediately afterwards)
23:59:39 -!- tromp_ has quit (Remote host closed the connection).