00:00:36 <b_jonas> well, you could say that you forgot that the 8-byte immediates are available for the general encoding of the arithmetic, as opposed to the shortcut accumulator destination encoding
00:01:40 <ais523> fwiw, I think modern processors in general are insane because of register renaming
00:02:04 <ais523> compilers go to all this effort to pack their nicely SSAed programs into a set of registers, which is an NP-complete problem
00:02:17 <ais523> then the processor goes to a lot more effort to unpack all the register names back into SSA and disregard the actual registers
00:03:05 <shachaf> The Mill people describe their belt system as "hardware SSA".
00:03:41 <ais523> yes, it's pretty similar, but also somewhat limiting I think
00:03:57 <ais523> I prefer doing things the other way round, where each instruction says where the data should be routed to, rather than where the data came from
00:04:22 <ais523> because that allows you to pipeline high-latency instructions, which is half the reason for the register renaming in the first place
00:05:33 <shachaf> I wish I had a better x86 REPL thing.
00:05:46 <shachaf> I could probably write something reasonably easily.
00:06:00 <ais523> asm repl, that's an interesting idea
00:06:31 <ais523> I remember debug.com, arguably that counts, but it was a bit weird in how it worked
00:08:03 <shachaf> Are there any good debuggers for Linux?
00:08:12 <ais523> what do you mean by "good"?
00:08:33 <ais523> gdb is easily good enough for what I want, but may well not count as "good"
00:08:39 <shachaf> Hmm, I'm not sure. It's probably mostly a UI thing?
00:08:51 <ais523> ddd is a GUI version of gdb, but is incredibly dated
00:08:53 <shachaf> I can figure out the things I want with gdb but it often takes a lot of overhead.
00:09:10 <ais523> there are also a lot of IDEs that run on Linux, of course; probably most of them do
00:09:17 <shachaf> People say Microsoft's debugger is good, but I haven't used it.
00:09:49 <ais523> well, it's integrated with Visual Studio, which IMO automatically makes anything insane
00:10:17 <shachaf> I've barely used Visual Studio. I don't know much about it.
00:10:44 <b_jonas> ais523: yes, and that's made even more insane by how the intel x86 optimization manual says that the longer nop instructions use execution units and a gp register from the register file, so the compiler sometimes has to figure out which register a nop instruction should reference
00:10:47 <ais523> shachaf: the last-but-one time I attempted to install Visual Studio, I ended up needing to entirely reinstall Windows
00:10:59 <shachaf> The debugging APIs in Windows are certainly better than ptrace.
00:11:25 <ais523> although it's a bit slow in terms of how many system calls are needed
00:11:26 <b_jonas> it's strange, you'd think they solve that in the decoder, but no
00:11:38 <shachaf> ptrace does approximately the minimum possible.
00:11:50 <ais523> b_jonas: fwiw, I think there should be two different .align pseudoinstructions
00:11:59 <shachaf> I don't remember all the things Windows does but they generally seem useful.
00:12:00 <ais523> one which uses NOP padding, and the other of which uses ud2
00:12:07 <shachaf> For example, Win32 VirtualAllocEx lets you allocate memory in another process's address space.
00:12:26 <ais523> shachaf: ugh, that seems outright dangerous
00:12:35 <ais523> what if the other process is using a different allocator to the expected allocator?
00:12:39 <shachaf> When you're debugging a process?
00:12:48 <shachaf> VirtualAllocEx is the equivalent of mmap.
00:12:48 <ais523> the ptrace equivalent is to force the other process to call malloc, which seems safer
00:13:07 <shachaf> In general I think many Windows system calls let you specify a process handle.
00:13:27 <shachaf> How do you even do a system call, when you attach to some unknown process?
00:14:01 <shachaf> You at least need to find a syscall instruction, which could mean waiting for the process to do a system call, searching its address space for a syscall instruction, or writing one into its address space.
00:14:35 <ais523> it's documented what you do if the ptrace is stopping at a syscall instruction, you just rewind the IP two bytes
00:14:46 <ais523> if the process isn't stopped at a system call, though, you have to change the IP to point at one
00:14:49 <shachaf> But you need it to be in an executable page, which is often read-only. ptrace lets you write to read-only mappings, but if a file is mapped into memory, it'll secretly convert it to a private mapping from a shared mapping.
00:14:58 <ais523> (there is a system call instruction in the vDSO, that's the one I use for Web of Lies)
00:15:17 <shachaf> That's true, maybe finding the vdso page is your best bet.
00:15:26 <shachaf> (Unless the debuggee unmapped it?)
00:15:36 <ais523> well, you just use /proc/*/maps to find where it is
00:15:51 <shachaf> I guess writing a debugger for hostile programs that don't want to be debugged is quite different from writing one for your own programs.
00:15:57 <ais523> unmapping the vDSO is interesting, I didn't think of that
00:16:24 <ais523> also, ptrace doesn't work recursively, which means that you can make yourself almost debugger-immune simply by ptracing yourself or a bunch of child processes
00:16:32 <ais523> (for a debugger to hide from that, it needs to /simulate/ all the ptrace calls itself)
00:17:17 <b_jonas> ais523: I don't think padding with repeated UD2 is a good idea. it encodes to db 0x0F,0x0B and if you happen to jump to that with the odd alignment, it reads the bytes 0x0B,0x0F which is a perfectly valid two-byte instruction
00:17:49 <ais523> does x86 have an instruction sequence that's invalid from any offset?
00:18:09 <ais523> other than the last byte, possibly, as there are no guaranteed one-byte illegal instructions
00:18:43 <shachaf> Why do you want an illegal instruction?
00:19:39 <b_jonas> ais523: there are like sixteen one-byte instructions that are currently illegal in x86_64, but they're all just reserved, not guaranteed to be illegal forever
00:19:40 <ais523> because the padding isn't meant to be executed
00:20:35 <b_jonas> yes, padding with 0xCC is probably the best
00:20:39 <shachaf> Or 0xF4 if you're not in ring 0.
00:21:20 <shachaf> I guess I should start using octal instead of hexadecimal.
00:21:29 <ais523> LOCK LOCK LOCK … LOCK NOP is illegal at any offset other than the last
00:21:57 <shachaf> So int3 is 0314 and hlt is 0364.
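[A small sketch of the padding-byte values being discussed, in Rust; the octal renderings are exactly shachaf's, and the comment about ud2 restates b_jonas's point above.]

```rust
fn main() {
    // int3 (trap to debugger) and hlt (privileged halt) as padding bytes,
    // written in both hex and octal as in the discussion above:
    assert_eq!(0xCCu8, 0o314); // int3
    assert_eq!(0xF4u8, 0o364); // hlt
    // ud2 encodes as the two bytes 0F 0B; a stream of them read from an odd
    // offset yields 0B 0F, which decodes as a valid "or r32, r/m32", so
    // ud2 padding is not illegal at every offset.
    let ud2_padding: [u8; 4] = [0x0F, 0x0B, 0x0F, 0x0B];
    assert_eq!(&ud2_padding[1..3], &[0x0B, 0x0F]);
    println!("int3 = {:o}, hlt = {:o}", 0xCCu8, 0xF4u8);
}
```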
00:22:09 <ais523> (once you have more than 15 lock prefixes it becomes illegal for a different reason, but that's OK)
00:22:49 <ais523> max length of an instruction is 16 I think
00:23:05 <shachaf> Wait, really? I thought it was 15.
00:23:11 <ais523> …wait, why is LOCK NOP even illegal? NOP is a sort of XCHG instruction, which is one of the few that /can/ be locked
00:23:38 <ais523> oh, because there are two registers, LOCK requires memory to be mentioned
00:23:41 <shachaf> It makes sense for reg-reg xchg to be illegal.
00:24:40 <ais523> while messing around with atomics I discovered that x86 has an XADD instruction
00:24:56 <ais523> which is (r, m) = (m, r+m)
00:25:17 <ais523> notable mostly because it's lockable and has pretty useful semantics for a lockable instruction
00:26:28 <zzo38> I think it can be useful sometimes, yes.
00:26:29 <moony> XADD can also be used to perform the Fibonacci sequence in very little space
00:26:43 <ais523> just two registers, I assume
00:26:56 <zzo38> Yes, I thought that too
00:27:35 <moony> yup, 2 registers fib.
00:28:00 <ais523> now the fun part is: can you do it with /one/ register?
00:28:01 <moony> but 3rd is for loop if you want it to actually halt
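[A sketch of the two-register XADD Fibonacci just described, emulating the instruction's semantics `(r, m) = (m, r+m)` in Rust rather than actual assembly.]

```rust
// Emulate x86 XADD: the source register receives the old destination,
// and the destination receives the sum.
fn xadd(m: &mut u64, r: &mut u64) {
    let t = *m + *r;
    *r = *m;
    *m = t;
}

fn main() {
    // Two registers, one XADD per loop iteration, as moony describes.
    let (mut a, mut b) = (1u64, 0u64);
    for _ in 0..10 {
        xadd(&mut a, &mut b);
    }
    assert_eq!(a, 89); // F(11): 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89
}
```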
00:28:02 <shachaf> But it's not as efficient as repeated squaring, I suppose.
00:28:43 <ais523> one loop iteration, that is
00:28:44 <moony> ais523: quickly thought over the various ways to do addition on x86. My guess is no
00:28:58 <ais523> moony: it'd have to be a vector register I think
00:29:03 <shachaf> Two unbounded registers are enough for any computation.
00:29:16 <moony> I was thinking of vectors too
00:29:20 <shachaf> Man, I wish I had a computer with even one unbounded register.
00:29:38 <b_jonas> ais523: do you need it modulo 2**32, or just the first fifty or so terms?
00:29:38 <moony> you do. it's called RAM. (/s)
00:29:59 <b_jonas> if the latter, then you can do a compare conditional jump thing to do it in one register
00:30:18 <b_jonas> and possibly even some smart hash lookup table thing
00:30:28 <b_jonas> but if you need it forever, then it gets harder
00:30:50 <b_jonas> I think you could use one 64-bit register to compute it mod 2**32:
00:31:02 <moony> fib with a 1 byte value should be doable in 1 reg
00:31:08 <b_jonas> as long as you can have a constant in memory
00:31:21 <moony> because upper lower halves
00:31:25 <zzo38> Why the PostScript binary format does not include dictionaries?
00:31:30 <ais523> b_jonas: I was thinking 32-bit integers, or even floats
00:31:40 <b_jonas> multiply it by a constant 0x00000001_00000000, then swap the upper and lower parts using rotate
00:32:36 <b_jonas> multiply it by 0x00000001_00000001
00:33:17 <b_jonas> I think you could do that even without a constant memory operand, because 0x00000001_00000001 is composite
00:33:53 <b_jonas> multiply by 641 then multiply by 6700417 then rotate right by 32
00:34:15 <b_jonas> low 32 bits gives the fibonacci number
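[b_jonas's one-register trick, checked in Rust: pack F(n-1) in the high 32 bits and F(n) in the low 32 bits of a 64-bit register, multiply by 2**32 + 1 (= 641 × 6700417, the factorisation of the Fermat number F5), then rotate right by 32. Each step maps (hi, lo) to (lo, hi+lo), i.e. one Fibonacci step mod 2**32.]

```rust
fn main() {
    let mut x: u64 = 1; // F(0) = 0 in the high half, F(1) = 1 in the low half
    let mut fibs = Vec::new();
    for _ in 0..10 {
        // imul by the constant, then ror 32 — exactly two instructions on x86
        x = x.wrapping_mul(0x0000_0001_0000_0001).rotate_right(32);
        fibs.push(x & 0xFFFF_FFFF); // low 32 bits hold the Fibonacci number
    }
    assert_eq!(fibs, [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]);
}
```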
00:35:11 <shachaf> I'm wondering about a language feature which is sort of the opposite of the ones I've been talking about wanting recently.
00:35:20 <b_jonas> https://www.perlmonks.com/?node_id=715414 is slightly relevant
00:35:39 <shachaf> I want a type which can have values that are either known at runtime or at compiletime, and can mostly be treated uniformly.
00:35:56 <shachaf> For example an array's length might either be known statically or be a field on a struct.
00:36:31 <shachaf> You might like to be able to write code that works uniformly in both cases.
00:36:35 <ais523> actually AVX is almost certainly enough, you can treat %xmm1 or whatever as four floats, then vhaddps gives you the ability to add and vpermilps lets you move them around inside the register
00:36:40 <b_jonas> ais523: in each step, multiply by phi*2**32, add 2**31, then shift right by 32
00:36:59 <shachaf> So you'd have a "field" on the struct which just refers to a constant compiletime value, or something.
00:36:59 <b_jonas> also works with floats: multiply by phi, round to integer
00:37:06 <ais523> this doesn't work with ints because vpermd can't take input from an immediate for some reason
00:37:10 <shachaf> Are there languages that do that?
00:37:28 <b_jonas> can only give the series starting from the second 1 though
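[The float variant b_jonas describes, sketched in Rust: repeatedly multiply by phi and round to the nearest integer. As noted, it produces the series only from the second 1 onward, and only while f64 precision holds out.]

```rust
fn main() {
    let phi = (1.0 + 5.0f64.sqrt()) / 2.0; // golden ratio
    let mut f = 1.0f64;
    let mut seq = Vec::new();
    for _ in 0..10 {
        seq.push(f as u64);
        f = (f * phi).round(); // next Fibonacci number, no second register needed
    }
    assert_eq!(seq, [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]);
}
```

The integer form ("multiply by phi*2**32, add 2**31, shift right by 32") is the same computation in 32.32 fixed point, with the add performing the round-to-nearest.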
00:37:39 <ais523> shachaf: Rust does several things which are incredibly similar to that
00:37:55 <ais523> but might not quite have the syntax you want
00:37:58 <shachaf> Hmm, do you have an example?
00:38:38 <ais523> you can say print!("{}", obj.field()); and depending on the type of obj, that can either get a value of a field or else compile down to a constant
00:39:14 <ais523> in this case the function containing that would have a type parameter describing the type of obj so that the compiler knew which implementation to use
00:39:43 <shachaf> Another version of this might be a function that can either take an argument at compiletime or at runtime. (In the former case it could be specialized for that argument.)
00:40:27 <ais523> several of the more modern C-replacements (although not Rust, I think) have syntax in which you just write a function and can use it at compile-time if everything it does is safe then
00:40:47 <shachaf> I guess if you have a length() "method" in your language, and you monomorphize polymorphic functions, that more or less accomplishes the same thing, with enough optimizations.
00:40:54 <ais523> fwiw, even gcc compiling C will generate specialised functions if the code implies that they might be useful, at a sufficiently high optimisation level
00:41:29 <ais523> shachaf: right; one weird thing in Rust is that it's idiomatic to rely on this actually happening
00:41:46 <b_jonas> ais523: yeah, but that's another of those optimizations that are hard to get right in the compiler
00:41:47 <shachaf> That sounds like something straight out of C++.
00:41:48 <ais523> e.g. arrays have a compile-time known length but you'd normally write .len() to get at their length anyway
00:42:09 <ais523> because Rust monomorphises whenever you don't explicitly tell it not to
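[A minimal sketch of ais523's point: the same call site either reads a run-time field or becomes a compile-time constant, depending on the type parameter. The trait and type names here are invented for illustration.]

```rust
trait Len {
    fn len_of(&self) -> usize;
}

struct Fixed;                  // length known statically
struct Dynamic { len: usize }  // length stored in the object

impl Len for Fixed {
    fn len_of(&self) -> usize { 10 } // constant-folds after monomorphisation
}
impl Len for Dynamic {
    fn len_of(&self) -> usize { self.len } // a real field load
}

// One source-level function; the compiler specialises it per concrete T.
fn report<T: Len>(obj: &T) -> usize {
    obj.len_of()
}

fn main() {
    assert_eq!(report(&Fixed), 10);
    assert_eq!(report(&Dynamic { len: 3 }), 3);
}
```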
00:42:24 <shachaf> C++ people are all about generating ridiculously inefficient code with recursive nested structs that the compiler can optimize to something efficient.
00:42:29 <b_jonas> the compiler would have to try to inline every call, recursively, to figure out which sets of inlinings result in simplifications
00:42:35 <ais523> shachaf: yes, that's very Rust as well
00:42:48 <shachaf> I think that's not that great an approach.
00:42:52 <b_jonas> just inlining single calls isn't enough, because sometimes everything is hidden behind several layers of functions
00:42:54 <ais523> with the exception that Rust relies a bit less on the optimiser for it
00:43:04 <ais523> the language semantics are such that the optimisation is more or less forced
00:44:52 <shachaf> Imagine you could say struct Arr<T, length = -1> { T* data; if length < 0 { int len; } else { compiletime int len = length; } };, or something.
00:45:38 <shachaf> Of course there are more things about an array that you might want to be either compiletime or runtime.
00:45:49 <shachaf> For example you could track strides for a multidimensional array.
00:46:10 -!- nfd has joined.
00:46:48 <b_jonas> shachaf: sure, Eigen does that with array lengths, and I think some other libraries do too
00:47:12 <shachaf> In general I feel like there are so many different arrayish types you might want, which is kind of annoying.
00:47:21 <ais523> well, the Rust way to do that would be to write [T; 10] for a fixed ten-element array and [T] (which is a different type) for an array of unknown length, but it's trivial to write a function that's generic over both of those
00:47:48 <ais523> (e.g. by specifying the type of your argument as Into<&[T]>)
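[A sketch of being generic over fixed-length and dynamic-length arrays, as ais523 describes; using the `AsRef<[T]>` bound, which is the more common spelling than `Into<&[T]>`.]

```rust
// Accept anything that can be viewed as a slice, whether the length is
// a compile-time constant or only known at run time.
fn total(xs: impl AsRef<[u32]>) -> u32 {
    xs.as_ref().iter().sum()
}

fn main() {
    let fixed: [u32; 3] = [1, 2, 3];       // [T; N]: length in the type
    let dynamic: Vec<u32> = vec![4, 5, 6]; // length only known at run time
    assert_eq!(total(fixed), 6);
    assert_eq!(total(dynamic), 15);
}
```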
00:48:19 <shachaf> For example: Array with static length; array with dynamic length; array with dynamic length and (static or dynamic?) capacity, which can't grow past that capacity; array with dynamic length and dynamic capacity that can be grown with an allocator (do you store the allocator too?)
00:48:38 <ais523> I have come to the conclusion that it's generally incorrect for a method to require a specific type from its arguments, rather than a description of what the type needs to support
00:49:01 <ais523> but that description doesn't have to be duck typing, something like Rust's traits or Haskell's typeclasses probably works better
00:49:51 -!- nfd9001 has quit (Ping timeout: 264 seconds).
00:50:09 <ais523> Rust has three arrayish things, [T; len] (compiletime fixed length); [T] (runtime fixed length); Vec<T> (growable)
00:50:22 <b_jonas> (more than three but yeah)
00:50:25 <shachaf> I think doing too much monomorphizing everywhere isn't that great either.
00:50:27 <ais523> that said, I think the "growable up to a fixed capacity" would be useful, but I don't think it's in the standard library
00:50:34 <ais523> b_jonas: three really well-known ones
00:50:41 <shachaf> For example you get a lot of generated code, which isn't great.
00:50:42 <ais523> what sort of minor arrayish things does it have?
00:50:43 -!- nfd has quit (Ping timeout: 248 seconds).
00:51:09 <b_jonas> yeah, those are mentioned on the front page of the standard library documentation
00:51:12 <shachaf> ais523: Vec<T> is only growable with a call to a global allocator, right?
00:51:27 <shachaf> In general I'd like my libraries not to depend on things like a global allocator.
00:51:51 <b_jonas> at https://doc.rust-lang.org/nightly/std/index.html#containers-and-collections that is
00:52:12 <b_jonas> shachaf: sure, but if you want a growable type, what should it grow into?
00:52:33 <shachaf> Well, that's one reason I want "growable up to a fixed capacity".
00:52:34 <b_jonas> do you want one that uses a buffer you give it?
00:52:52 <b_jonas> hmm yes, I don't know if the library has such a type
00:52:54 <shachaf> You could tell your caller that you need more space.
00:52:58 <b_jonas> you could define one though
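[A minimal sketch of the "growable up to a fixed capacity" type b_jonas suggests defining yourself; crates like `arrayvec` do this properly (with `MaybeUninit` storage rather than the simplistic `Option` array used here). The `CapVec` name is invented.]

```rust
struct CapVec<T, const N: usize> {
    buf: [Option<T>; N], // capacity fixed at compile time; no allocator needed
    len: usize,
}

impl<T, const N: usize> CapVec<T, N> {
    fn new() -> Self {
        CapVec { buf: std::array::from_fn(|_| None), len: 0 }
    }
    // Report failure to the caller instead of growing via a global allocator.
    fn push(&mut self, v: T) -> Result<(), T> {
        if self.len == N {
            return Err(v);
        }
        self.buf[self.len] = Some(v);
        self.len += 1;
        Ok(())
    }
    fn len(&self) -> usize { self.len }
}

fn main() {
    let mut v: CapVec<u32, 2> = CapVec::new();
    assert!(v.push(1).is_ok());
    assert!(v.push(2).is_ok());
    assert!(v.push(3).is_err()); // capacity reached; the caller decides what to do
    assert_eq!(v.len(), 2);
}
```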
00:53:09 <ais523> ugh, I've been trying to remove doc.rust-lang.org from my browser history
00:53:25 <ais523> I shouldn't have clicked that link
00:53:30 <ais523> moony: well, I often develop Rust offline
00:53:33 <b_jonas> ais523: use the forget feature
00:53:39 <ais523> so I have a local copy of the Rust documentation
00:53:41 <shachaf> I do all my browsing in incognito mode by default so nothing goes in my history.
00:53:42 <ais523> b_jonas: I did, it's just a pain
00:54:15 <ais523> actually, the really annoying thing is that I can't use file:/// URLs for Rust documentation any more, because the page crashes if you don't have cookies/localstorage enabled
00:54:27 <shachaf> ais523: There's also the separate distinction of whether a type "owns" the memory it's referring to or not.
00:54:28 <ais523> and I don't think file:/// supports that
00:54:35 <ais523> so I'm now running a local webserver just for the Rust documentation
00:54:37 <moony> ais523: `rustup doc`
00:54:42 <moony> does that not work
00:55:14 <ais523> b_jonas: it's not that ouch, I've had a local webserver running on here for years because I couldn't figure out how to get rid of it
00:55:14 <shachaf> I guess Rust is full of ways to handle that.
00:55:17 <ais523> now it at least has some purpose
00:55:23 <ais523> (it's only accessible on localhost, at least)
00:55:28 <shachaf> But I think e.g. Box<> assumes you're using the global allocator?
00:55:53 <ais523> shachaf: pretty much the entirety of Rust is about whether types own their memory or not
00:56:09 <b_jonas> shachaf: yes, or at least its destructor does
00:56:12 <ais523> many methods have to be written three times for the three possible ownership relationships
00:56:26 <ais523> Box<> is basically the main API for accessing the global allocator
00:56:39 <ais523> so them being tied to the global allocator really isn't surprising
00:56:40 <b_jonas> and there's not much point using it without
00:56:42 <shachaf> And yet they don't seem to have great support for a lot of allocation strategies.
00:56:49 <ais523> you could have a MyLocalBox or whatever for a different allocator
00:57:13 <ais523> basically nothing in Rust's standard library assumes you're using a Box, it's always done with traits
00:57:39 <ais523> there are standard trait combinations to use to say "this is a custom pointer-like thing" and then it can be used in all the same contexts that Box can
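[A sketch of the "traits, not Box" point: an API can take anything pointer-like via `Deref` rather than requiring `Box` specifically, so a custom-allocator box would slot in the same way. The `describe` function is invented for illustration.]

```rust
use std::ops::Deref;
use std::rc::Rc;

// Generic over the pointer type: Box, Rc, plain references, or a hypothetical
// MyLocalBox all work, as long as they deref to str.
fn describe<P: Deref<Target = str>>(p: P) -> usize {
    p.len()
}

fn main() {
    assert_eq!(describe(Box::<str>::from("boxed")), 5);
    assert_eq!(describe(Rc::<str>::from("rc")), 2);
    assert_eq!(describe("plain"), 5); // &str derefs to str too
}
```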
00:57:53 <shachaf> Last I tried Rust much Box was called ~, I think (or maybe @?).
00:58:05 <shachaf> I think @ was for garbage-collected or maybe reference-counted cells.
00:58:36 <ais523> nowadays ~ is called Box and @ was split into Rc and Arc (a Gc was planned but they never got around to it)
00:58:47 <ais523> the Rc/Arc difference is that Arc is thread-safe but slower
00:59:57 <ais523> (Rc can be used in multithreaded programs but Rust won't let you send one between threads, it's confined to a single thread)
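[A small demonstration of the Rc/Arc split just described: Rc counts references within one thread, while Arc (atomically counted, hence slower) is the one that may cross thread boundaries.]

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

fn main() {
    // Rc: cheap, single-threaded reference counting.
    let r = Rc::new(5);
    let r2 = Rc::clone(&r);
    assert_eq!(Rc::strong_count(&r), 2);
    drop(r2);
    assert_eq!(Rc::strong_count(&r), 1);

    // Arc: the thread-safe counterpart; sending an Rc here would not compile.
    let a = Arc::new(5);
    let a2 = Arc::clone(&a);
    thread::spawn(move || assert_eq!(*a2, 5)).join().unwrap();
}
```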
01:00:38 <b_jonas> shachaf: that's what cpressey said too ("https://esolangs.org/logs/2019-07-23.html#l2c")
01:00:41 <shachaf> Pervasive reference-counting doesn't seem like a great allocation strategy to me.
01:01:12 <shachaf> b_jonas: No, I'm not annoyed by the language changing as long as I'm not using it.
01:01:28 <shachaf> I'd rather they make it better rather than try to ensure backwards compatibility.
01:02:18 <ais523> shachaf: pervasive reference-counting is one of the reasons that Perl is probably unrecoverable from a performance point of view
01:02:47 <ais523> I found out fairly recently (in the last few days) that Python uses pervasive reference-counting + a cycle-breaker, which seems even worse somehow (especially in a language which could easily use its own VM)
01:03:07 <b_jonas> ais523: what does "use its own VM" even mean?
01:03:16 <ais523> b_jonas: like Java has the JVM
01:03:27 <ais523> OCaml's allocation strategy is really interesting and apparently not widely used
01:04:05 <b_jonas> ais523: in Java's case that means that the programs are compiled down to an executable that the VM can run, and you don't need the compiler, only the VM, to run the program
01:04:06 <ais523> everything is 32-bit aligned, every 32-bit chunk has a tag bit saying whether or not it's a pointer
01:04:17 <b_jonas> but how is that relevant for what you've said above?
01:04:32 <shachaf> ais523: I think this was changed at one point, but for a long time Python objects that had a reference cycle and destructors were just not collected.
01:04:34 <ais523> this means that exact garbage collection is possible (not just conservative) and doesn't need to know anything about the structure of memory apart from the tag bits
01:05:08 <ais523> b_jonas: well, the JVM is an example of something that can implement exact reference counting, and even things like compaction, because it has full control over the structure of all the memory stored there
01:05:25 <ais523> in Java you can swap out GC algorithms without changing the performance of the program
01:05:27 <b_jonas> ais523: doesn't it also mean that you can't easily have an array of numbers in a dynamically allocated thingy though?
01:05:37 <ais523> and I don't think reference-counting + cycle-breaker can possibly be superior to a GC
01:05:54 <ais523> b_jonas: OCaml numbers are only 31 bits wide so that there's room for the tag bits
01:06:14 <b_jonas> ais523: right, so you can't have an array of proper numbers, only of OCaml numbers
01:06:21 <ais523> OCaml has to do fairly insane things to make floats work with this, which is a major downside
01:07:03 <ais523> shachaf: something that detects a situation in which a nonzero reference count exists only because of objects recursively referencing each other
01:07:10 <ais523> and not because the objects are actually referenced
01:07:15 <ais523> and frees all the objects in the cycle
01:07:38 <zzo38> Fix the Rust documentation so that it does not use cookies/localstorage. A documentation page shouldn't require that anyways.
01:07:39 <b_jonas> shachaf: a would-be-thief that decides that if he can't have your bicycle, then you can't either
01:08:13 <shachaf> I mean, how does it work other than effectively doing general GC?
01:08:34 <ais523> shachaf: that's why I think it's ridiculous; it's basically most of the way to a general GC with all the disadvantages of an Rc
01:10:21 <b_jonas> that said, python can work with a pure GC, and I think the python implementations that run in the jvm do that; and if you know that no dependency of your code needs the gc, then you can run cpython with just the refcounting, disabling the gc
01:11:41 <b_jonas> you can use destructors and/or weak references to make the pure refcounting work
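[The cycle problem being discussed, illustrated with Rust's Rc, which, unlike CPython, has no cycle-breaker: a reference cycle keeps its objects alive forever, which is exactly what a cycle-breaker (or weak references, as b_jonas says) exists to prevent.]

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b)); // a -> b -> a: a cycle
    assert_eq!(Rc::strong_count(&a), 2);
    drop(b);
    // The cycle still holds a reference to a, so nothing was freed:
    // dropping a as well would leak both nodes. Weak references (rc::Weak)
    // for the back-pointer are the usual fix.
    assert_eq!(Rc::strong_count(&a), 2);
}
```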
01:12:20 <b_jonas> https://docs.python.org/3/library/gc.html#gc.disable
01:12:35 <b_jonas> (don't click on that if you want to use a local copy of the docs)
01:12:51 <ais523> I don't think I have a local copy of the Python docs
01:13:01 <ais523> maybe I should, but I hardly program in Python
01:13:26 <shachaf> If you program in a language all the time, you probably don't need the documentation.
01:13:50 <ais523> anyway, I had an idea wrt the OCaml way of doing things: for each type, identify the ratio of pointers to non-pointers in it (in terms of bytes), and allocate each type with a given ratio in its own arena
01:14:22 <ais523> /but/, you allocate the pointer and non-pointer parts in separate arenas too (the addresses of the two parts are mathematically related because of the constant ratio)
01:14:56 <ais523> now, you have all the pointers in memory blocks of their own, so that you can GC them really easily, and don't need to waste any bits on tagging things
01:16:28 <ais523> another benefit of this is that it statically guarantees that all pointers are correctly aligned, without needing to waste any bytes on padding
01:16:58 <ais523> I'm not sure what the cache effect would be, there are reasons to think it would help but also reasons to think it would hinder
01:18:51 <ais523> actually, this is almost strictly better than the OCaml approach (which is already pretty good), the only downside is related to unions and other sum types
01:19:02 <ais523> which are, of course, heavily used in OCaml, so that might be a problem
01:19:35 <ais523> unless, hmm, perhaps sum types could be references and the tag is stored in your choice of arena to point the pointer into
01:21:12 <ais523> OCaml is the sort of language that really wants a GC, because it heavily relies on immutable value types that it wants to avoid copying, /but/ also contains mutable data
01:22:09 <ais523> (I think that languages which mutate a lot normally benefit from manual memory management, and with languages which don't mutate at all, reference counting is a more attractive possibility than it would be otherwise)
01:24:00 <shachaf> I think languages should mutate a lot when they can, but they always seem to focus on backwards compatibility.
01:24:16 <shachaf> This is why you should never release your software.
01:24:38 -!- Sgeo_ has quit (Ping timeout: 245 seconds).
01:25:07 <pikhq> I'm increasingly of the conclusion the transistor was a mistake.
01:26:28 <shachaf> do you want to design my language for me twh
01:26:31 -!- Sgeo has joined.
01:26:57 <shachaf> also do you want to see pictures of cute kittens
01:26:59 <ais523> shachaf: Rust is instructive in that but I'm not sure in what direction
01:27:05 <pikhq> Between work and general life demands I'm doing well to keep up on developments in programming languages, really
01:27:12 <pikhq> Also, of course I do. Cute kittens are great.
01:27:22 <ais523> lots of people got upset that it changed so much before being stabilised, but OTOH it ended up really benefitting from the time
01:27:54 <ais523> I do think it's beneficial to really work on a project's specification before the first release, though, making sure it's perfect
01:28:52 <pikhq> There's still some things in the Rust stdlib that I think are kinda questionable...
01:29:58 <zzo38> What things is that?
01:30:23 <pikhq> Allocation failure is generally an unreported and unhandlable error.
01:30:45 <pikhq> Which, yes, I know is more _ergonomic_, but it means you have limited options for handling that error condition.
01:30:53 <shachaf> I think it'd be a pretty reasonable world for every sufficiently large company and so on to use its own programming language, rather than everyone standardizing on a few.
01:31:52 <ais523> pikhq: my experience with allocation failures is that almost every attempt I've seen to handle them is either equivalent to a crash, or more user-hostile than a crash would be
01:31:55 <shachaf> Unfortunately there's a lot of nonsense involved in making programming languages which probably shouldn't be necessary. And also cross-language interoperability is often very bad.
01:32:19 <pikhq> And it feels a touch silly in a language that offers decent error handling approaches, and doesn't have semantics necessitating always-crash behavior
01:32:20 <ais523> arguably /every/ attempt
01:32:34 <pikhq> ais523: For many programs, that is indeed the correct decision.
01:33:12 <pikhq> In Rust it grates because Rust is trying to handle problem spaces where that might _not_ be the best decision.
01:33:21 <ais523> pikhq: in programs where you'd want to do something else, you probably need safety against other sorts of crashes too, or even power failure
01:34:16 <ais523> (also, my guess is that allocation failure in Rust is a panic, which is just about possible to handle, and I'm guessing that programs that care about allocation failure recovery care about panic recovery too)
01:34:38 <pikhq> I believe there is an outstanding RFC for _making_ it a panic.
01:34:57 <ais523> oh, it isn't atm? presumably that's due to a fear that destructors might allocate memory
01:35:13 <pikhq> And yeah, having allocation failure be a panic is probably a reasonable strategy for a lot of programs.
01:35:16 <ais523> I think it would be obviously incorrect to make it anything less than a panic
01:35:43 <ais523> especially as it's almost impossible, on most computers, to reach the point of memory exhaustion
01:35:53 <ais523> memory is a shared resource, so the computer just runs slower and slower and slower as it gets shorter on memory
01:36:12 <ais523> the point of true memory exhaustion doesn't actually get reached because the user has started force-quitting things and even doing hard power offs before then
01:36:18 <shachaf> On my computer if a program uses too much memory, it just makes the kernel kill some other random program.
01:36:18 <pikhq> Pretty frequently the hypothetical ideal allocation failure behavior is for a given task to abort, not for the program as a whole to.
01:36:28 <pikhq> (but of course that's harder to achieve)
01:36:30 <ais523> so an actual memory exhaustion only happens when it's a quota that got exhausted rather than physical memory
01:36:51 <shachaf> Are destructors a good idea? I can't tell.
01:36:58 <pikhq> shachaf: On Windows on the other hand, the kernel just reports memory exhaustion.
01:36:58 <ais523> (my life got a lot better when I realised that I could just set a per-program RAM usage quota)
01:37:14 <pikhq> It has strict commit charge tracking.
01:37:36 <pikhq> (though it's still hard to really get this to come into play, because the swap file scales in size)
01:37:57 <shachaf> It's kind of bizarre that Linux still uses swap partitions instead of files.
01:38:24 <pikhq> To be honest, probably the most common case where allocation is going to report failure is exhaustion of address space on 32-bit systems.
01:38:38 <pikhq> And that one's pretty easy to hit.
01:39:07 <pikhq> Easier still to have an attempt to allocate that would exhaust address space, while still having plenty after the allocation failed.
01:39:18 <ais523> shachaf: re destructors: my belief is no, except when it's syntactic sugar for something else (as is often the case with RAII), but for a slightly strange reason: if you have the sort of object that benefits from a destructor, you probably need to be tracking/organising all the references to the object anyway to be able to use it in a semantically correct way, in which case calling the destructor manually would be trivial
01:39:48 <ais523> re: swap: Linux can use swap files just fine, people are just used to setting up partitions
01:39:59 <shachaf> Can it suspend-to-disk to a swap file?
01:40:09 <shachaf> That's my main reason for having a swap partition.
01:40:47 <pikhq> I wouldn't be surprised if using a swap file ends up having a performance penalty over a swap partition, just because the Linux swap code is pretty poo.
01:41:07 <shachaf> I also don't know why swapoff takes half an hour.
01:41:10 <zzo38> Also maybe the file is fragmented
01:41:15 <pikhq> Not that swapping is _ever_ going to be fast, but Linux seems to do it a page at a time, synchronously.
01:41:26 <ais523> (fwiw, I set my current laptop up with no swap, neither partition nor file, and have so far not regretted that decision at all)
01:41:27 <shachaf> I guess the reason is what pikhq just said.
01:42:29 <ais523> even then, you still get swapping-like behaviour at times of high memory pressure, but the kernel isn't swapping data out to disk, but rather unloading pages from memory that happen to equal the corresponding page on disk
01:43:01 <shachaf> I think there are two main uses for destructors, which RAII wants to unify:
01:43:28 <shachaf> One is objects on the stack, where things get auto-cleaned-up at the end of a scope.
01:43:58 <shachaf> This is convenient but I think something like Python's "with" or Go's "defer" might address it better. It's mostly a control flow thing, not an object thing.
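The scope-cleanup case can be sketched with Python's standard contextlib (the `acquired` helper and the `events` list are invented for illustration):

```python
import contextlib

@contextlib.contextmanager
def acquired(name, log):
    # cleanup is tied to control flow leaving the block,
    # not to any object's lifetime (no destructor involved)
    log.append(f"acquire {name}")
    try:
        yield name
    finally:
        log.append(f"release {name}")

events = []
with acquired("lock", events):
    events.append("body")
# events == ["acquire lock", "body", "release lock"]
```

The finally clause runs however control leaves the block, including via exceptions, which is the same guarantee "defer" and try-with-resources give.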
01:44:24 <ais523> does Java's try-with-resources fall into the same category?
01:44:37 <shachaf> The other is objects that contain other objects that contain other objects that have destructors, or something.
01:45:09 <ais523> (the semantics: you write try (expression) {}, and when control leaves the {} for any reason at all, the ".close()" method is called on the expression; this includes exceptions and all control flow structures, in addition to just falling off the end naturally)
01:45:12 <shachaf> Where the whole tree is automatically traversed for you when you destruct the outermost object.
01:46:13 <ais523> of course, things like System.exit() can beat the try-with-resources and prevent .close() from running, but the semantics are that the process can't continue until your destructor has run
01:46:20 <shachaf> I guess there's also the special case where you e.g. use a lock-acquiring-object and then tail-call another function and give it that object.
01:46:32 <shachaf> That's probably not handled with a thing like "with".
01:47:08 <ais523> semantically the only issue is that it wouldn't be a tail-call any more
01:47:29 <shachaf> I mean the case where you pass ownership of the lock-object to the other function.
01:47:40 <shachaf> So it can presumably destruct it at any point.
01:48:03 <ais523> hmm… isn't the lock-object basically just a continuation?
01:48:36 <ais523> I guess it's more like a callback
01:50:59 <ais523> anyway, I have one very strong opinion about memory management, which is: for immutable types that never change, programming languages (possibly unless they're /very/ low level) should provide an API which from the programmer's point of view looks like the objects are copied everywhere and never have multiple references, and should optimise it behind the scenes (which may involve garbage collection or reference counting of a single copy or whatever),
01:51:01 <ais523> but should /never/ allow such objects to mix with any sort of memory allocation scheme that's explicitly controlled by the programmer
01:52:48 <ais523> because the two cases are basically entirely different in terms of how you need to optimise them, and trying to treat them the same way makes both of them much more difficult
01:53:25 <ais523> in particular, the programmer should never need to track immutable things, you can pass them around at will without any semantic issues, forget about them, whatever
01:53:47 <ais523> things that can be mutated are both rarer, and need a lot more care, typically you'll have some very regimented rules for using them already
01:54:13 <shachaf> Maybe your "/very/ low level" boundary is different from mine.
01:54:46 <ais523> for example, in NetHack, keeping track of the memory management for every string in the program is a lot of ridiculous effort, especially when you want to be able to pass the same string to multiple functions or to the same function multiple times
01:54:52 <ais523> so a garbage collector for strings would be really nice
01:55:27 <ais523> OTOH, a garbage collector for in-game items, monsters, etc. would just be semantically wrong, because you want those things to stick around until you explicitly destroy them, and the destruction has a lot of /logic/ impacts that need consideration by the programmer
01:55:31 <zzo38> Is there a driver to use with Ghostscript to write to a DVI file?
01:56:06 <ais523> e.g. if the monster is holding an item, do you want to destroy that too? what if it's a plot-critical item that simply cannot be destroyed? you need to know where the monster "should have been" to place the item in the right location after the monster is gone
01:56:36 <ais523> so things like monsters are part of the game logic and managing their memory is trivial because you need to manage their state in just as much detail, and the memory management is easy to tack onto that
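The monster case can be sketched in Python (`Monster`, `destroy_monster`, and the `floor` list are invented names): destruction is explicit and has game-logic consequences, so there is nothing for a garbage collector to decide:

```python
class Monster:
    def __init__(self, pos, items):
        self.pos = pos
        self.items = list(items)

def destroy_monster(monster, floor_items):
    # explicit destruction with logic impacts: the inventory is dropped
    # where the monster "should have been", so even a plot-critical item
    # is never silently lost
    for item in monster.items:
        floor_items.append((monster.pos, item))
    monster.items.clear()

floor = []
destroy_monster(Monster((3, 4), ["Amulet of Yendor"]), floor)
# floor == [((3, 4), "Amulet of Yendor")]
```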
01:57:55 <ais523> (I've also concluded that there's actually a third category here, things like "the internal state of an algorithm" that are mutable but self-contained and used only temporarily, but those are nearly always either stack-allocated or effectively-stack-allocated)
01:59:16 <ais523> I guess you could make exceptions for, say, treesort, which could in theory be stack-allocated but nearly always isn't
01:59:18 <zzo38> (Also, is it possible to add drivers to Ghostscript without recompiling?)
02:00:20 <shachaf> Why would treesort not be stack-allocated?
02:00:31 <shachaf> Or allocated in some kind of temporary memory.
02:00:37 <ais523> the tree you're building is a recursive data structure the same size as the original list
02:01:03 <ais523> trying to express a stack-allocation of that, especially if the list is coming from a streaming source and you don't know how large it is, is incredibly difficult in most languages
02:01:30 <shachaf> Well, you can allocate in some temporary arena or something with effectively stack semantics.
02:01:46 <ais523> using a recursive function to do the stack allocation using its own call frames would work, but is inefficient due to the return addresses cluttering up the stack
02:02:07 <shachaf> Mergesort also allocates a linear amount of memory.
02:02:10 <ais523> so you'd either need an alloca-alike or else a temporary arena
02:02:22 <Hooloovo0> could you do something fancy with tail recursion?
02:02:37 <shachaf> Tail recursion is pretty pointless if you have iteration.
02:03:44 <shachaf> By the way: I realized that "if" evaluates its argument exactly once, at the time it's executed, so it's a lot like a function parameterized on a block.
02:03:56 <shachaf> But "while" re-evaluates its argument so it's not very function-like.
02:04:21 <shachaf> Is "while" an exception among control flow keywords?
02:05:23 <ais523> <Hooloovo0> could you do something fancy with tail recursion? ← I was actually just thinking that, and did some experiments; my conclusion is yes in theory, but no using the x86_64 calling convention
02:05:43 <shachaf> Actually, my real question is: If my language has user-definable control flow constructs, should they be functions, which is enough to support "if", and if so how should they handle "while"?
02:06:29 <ais523> you'd need a calling convention in which the called function cleaned up the section of the stack between stack and base pointer for the caller
02:06:51 <ais523> this calling convention is trivial to define but I don't see any benefit except in this one massively specific case, so I doubt it'd ever be used
02:07:33 <ais523> shachaf: the mathematical solution is call-by-name, and the practical solution most languages use for that is to take a callback parameter, meaning that your language is generally call-by-value but you call-by-name just this one small part
02:07:53 <shachaf> ais523: There might be more benefit in non-recursive tail calls? But probably you should just manage the memory more explicitly if you're doing that.
02:08:16 <ais523> so, e.g., in C, the prototype for while would look like while(bool(*condition)(), void(*body)())
02:08:33 <ais523> because it's a function pointer not a value, the function you're calling can run it multiple times if it wants to
02:08:36 <shachaf> Except that presumably you're thinking of a closure and not a function pointer.
02:08:48 <ais523> it would normally be a closure in practice
02:09:01 <ais523> doesn't have to be, though, it could just be a function constant
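In Python, this call-by-name "while" can be written as an ordinary function taking two callables (`while_` is an invented name; the closures play the role of the C function pointers above):

```python
def while_(condition, body):
    # condition and body are zero-argument callables; because they are
    # passed unevaluated, while_ can re-run the condition each iteration
    while condition():
        body()

# usage: the closures capture the mutable counter n
n = [0]
seen = []
def step():
    seen.append(n[0])
    n[0] += 1

while_(lambda: n[0] < 3, step)
# seen == [0, 1, 2]
```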
02:09:14 <shachaf> Anyway, I'd want this for an efficient language, where the predicate argument is guaranteed to be inlined at compiletime.
02:09:53 <shachaf> If you have a notion of passing multiple blocks to a function, you could just write "while {p} {body}", which maybe wouldn't be so bad.
02:10:08 <zzo38> I think the user defined control structures should be macros and not functions, and operate at compile time.
02:10:09 <ais523> heh, well I designed Verity a while back: intended for efficiency, call-by-name, callbacks are guaranteed to be inlined (in fact, absolutely everything is guaranteed to be inlined, which in turn means that the language doesn't support recursion)
02:10:25 <ais523> or, well, it does actually support recursion but it does so using a recursion combinator
02:10:35 <zzo38> (The macro expansion will then put in all of the correct code to make it work properly at run time)
02:10:56 <ais523> it doesn't support recursion via having a function call itself because that would just inline it
02:11:15 <shachaf> zzo38: I'd like them to be something between macros and functions.
02:11:21 <ais523> shachaf: hmm, I think that might be valid syntax in Perl (once you change "while" to something that isn't a keyword)
02:11:26 <shachaf> Maybe this just means hygienic macros, but I think it's a bit more than that.
02:11:40 <shachaf> For example, I'd like a block to optionally take parameters.
02:11:49 <shachaf> So you might write "for(xs) {\x; ... }"
02:12:08 <shachaf> That's not just an AST, exactly, it's a compiletime object that can be called.
02:12:31 <zzo38> Yes, that makes sense, it could be hygienic macros that you can add blocks to take parameters. So, you can have anonymous macros perhaps?
02:12:55 <shachaf> I guess you could, though it doesn't seem that useful.
02:13:18 <zzo38> I think it may be useful.
02:14:02 <b_jonas> ais523: impossible to reach memory exhaustion => unless you deliberately put a memory limit to catch it early, whether by setrlimit or through the memory allocator function itself
02:14:02 <shachaf> So my old language idea is, you have an operator that captures "the rest of block" and passes it as an argument to an expression.
02:14:23 <shachaf> So instead of writing "if(p) { body }", you can write "{ if(p)`; body }"
02:14:42 <b_jonas> but I do agree that it's usually not worth to handle an out of memory condition other than guaranteeing that it will crash properly rather than randomly corrupt memory
02:14:53 <ais523> b_jonas: yes, this is why I have a setrlimit on every program I start from the command line by default nowadays
02:15:08 <shachaf> And also instead of writing "for(xs) {\x; ... }", you can write "{ x := for(xs)`; ... }"
02:15:16 <ais523> (to catch fast memory leaks in programs I write, which I'd nearly always be running from the command line)
02:15:57 <shachaf> If you're using this idea, you could maybe have "while" pass a special block that breaks from the loop when you pass it false. Then you could have "{ while`(p); body }".
02:16:11 <shachaf> I don't really like this, though, it seems pretty complicated.
02:16:33 <ais523> shachaf: last time we discussed this, I think we had a debate about if it was a monad or not
02:16:44 <ais523> my current thoughts on this is that it works very well for if-like things but much less well for while-like things
02:16:50 <shachaf> What about for-like things?
02:17:15 <ais523> unsure, but currently leaning towards working
02:17:39 <shachaf> I think it works quite well for "{ loop`; ... }" and "{ x := for(xs)`; ... }"
02:17:56 <shachaf> I think this is nicer than almost any "for" construct in any language.
02:18:26 <shachaf> For example I don't think anyone has a nice way to express "{ x := for(ns)` + 1; ... }"
02:18:28 <ais523> fwiw, I think "each" is a good name for it in this context
02:19:18 <ais523> this also generalises to "any" loops pretty well, but those aren't in common use (maybe they should be)
02:19:19 <shachaf> You can also write "{ x := for(xs)`; y := for(ys)`; if(x < y)`; ... }" and get something like list comprehensions.
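The backtick notation can be approximated in plain Python by passing "the rest of the block" as a callback (a continuation-passing sketch; `for_` and `if_` are invented names):

```python
def for_(xs, rest):
    # "x := for(xs)`" : run the rest of the block once per element
    for x in xs:
        rest(x)

def if_(p, rest):
    # "if(p)`" : run the rest of the block only if p holds
    if p:
        rest()

# "{ x := for(xs)`; y := for(ys)`; if(x < y)`; ... }"
pairs = []
for_([1, 2, 3], lambda x:
    for_([2, 3], lambda y:
        if_(x < y, lambda:
            pairs.append((x, y)))))
# pairs == [(1, 2), (1, 3), (2, 3)], like the list comprehension
# [(x, y) for x in [1, 2, 3] for y in [2, 3] if x < y]
```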
02:19:36 <ais523> the loop exits unless there's an exception in the loop body
02:19:47 <ais523> if there is an exception it just moves onto the next loop iteration
02:19:53 <shachaf> How do you express the exception?
02:20:22 <ais523> normally it would be some sort of assertion failure, these loops are normally only used in declarative languages
02:20:32 <shachaf> You can also write "{ x := for(for(xss)`)`; ... }"
02:20:40 <ais523> but I've found myself writing them in, e.g., Java
02:20:54 <shachaf> I've hardly seen loops like this, or maybe I just haven't recognized them.
02:21:10 <ais523> where they're an ugly sort of "for (a : …) try { … ; break } catch (Exception ex) {}"
02:21:40 <ais523> you basically use them when you have multiple approaches that might potentially work, and just want to find one that works
02:21:51 <shachaf> I think just writing a break at the last line of your loop is simple enough.
02:21:55 <ais523> oh, that's not quite right, because if /every/ iteration fails you want to throw an exception
02:22:15 <ais523> ideally that's a combination of all the others, but I haven't yet seen a language that does that
02:22:35 <shachaf> Probably related to Python-style for-else.
02:22:37 <ais523> "could not communicate by TCP because the socket is closed, nor via carrier pigeon because no pigeons were available"
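That "any" loop with a combined failure message can be sketched as follows (`any_loop` and the attempt functions are invented names; the combining is done by hand since, as noted, no language does it natively):

```python
def any_loop(attempts):
    # try each approach in turn; return the first success,
    # otherwise raise one error combining every failure
    failures = []
    for attempt in attempts:
        try:
            return attempt()
        except Exception as e:
            failures.append(str(e))
    raise RuntimeError(", nor ".join(failures))

def by_tcp():
    raise IOError("could not communicate by TCP because the socket is closed")

def by_pigeon():
    raise IOError("via carrier pigeon because no pigeons were available")

# any_loop([by_tcp, lambda: "sent"]) == "sent"
# any_loop([by_tcp, by_pigeon]) raises RuntimeError with both messages
```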
02:22:50 <ais523> I keep forgetting how for-else works
02:23:03 <shachaf> I used to think it was bizarre but now I think it's very natural.
02:23:13 <ais523> is it that the else block only runs if the loop has no iterations?
02:23:15 <shachaf> Though possibly it's even more natural to express it directly with my language idea.
02:23:32 <shachaf> The else block runs if the loop exits via the condition, instead of a break.
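Concretely, in Python (`find_gt` is an invented example):

```python
def find_gt(ns, bound):
    # the else clause runs only when the loop exits via the condition
    # (ns exhausted) rather than via break
    for n in ns:
        if n > bound:
            break
    else:
        return None   # no break: nothing found
    return n          # break: n is the first match

# find_gt([1, 5, 2], 3) == 5
# find_gt([1, 2], 3) is None
```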
02:24:02 <ais523> the terminology is insane though
02:24:14 <ais523> but the semantics are useful
02:24:33 <ais523> oddly, I think it's the "break" I disagree with here rather than the "else"
02:24:44 <ais523> because "break" is indicating "found"/"success" which the word "break" doesn't really imply
02:24:54 <shachaf> In my language thing, you can label blocks with, say, @, and the label lets you exit them.
02:24:56 <ais523> (likewise, "continue" is indicating failure)
02:25:05 <ais523> perhaps "done" is a good name for "break"
02:25:24 <b_jonas> "I keep forgetting how for-else works" => isn't that because there are at least two unrelated things with a name similar to them?
02:25:44 <shachaf> So "{ x := for(xs)`; ... }" actually means something like: { break := @`; x := for(xs)`; continue := @`; ... }
02:27:20 <b_jonas> ais523: I'd prefer "break" or "last" (especially if you have the full quadruplet "retry/redo/next/last"), because "done" is already used in a conflicting sense in sh
02:27:26 <shachaf> So if you wrote "for (xs) { body } else { notfound }" explicitly, it would be something like: { done := @`; { x := for(xs)`; continue := @`; ... }; notfound }
02:28:04 <shachaf> Who cares about sh? The only good thing about sh syntax is the way it handles variables.
02:30:23 <ais523> b_jonas: what does retry do?
02:31:25 <b_jonas> ais523: jumps to right before the start of the loop. not the start of the loop body, the start of the whole loop
02:33:06 <ais523> neat, that's a possibility I hadn't thought of
02:33:21 <ais523> now I sort-of want a control flow operator to go back to the previous loop iteration
02:33:47 <b_jonas> ais523: jumps to right before the start of the loop. not the start of the loop body, the start of the whole loop
02:34:08 <ais523> I think you sent the wrong line?
02:35:05 * moony tries to wrap head around this
02:35:37 <ais523> shachaf: repeats the current loop iteration
02:35:55 <ais523> so it's basically a goto to the start of the block, without increasing the loop counter or anything like that
02:36:30 <ais523> I've used it a few times but it doesn't seem massively useful
02:36:56 <shachaf> I like the way both break and continue are forward jumps, and only the loop construct itself does a backward jump.
02:37:22 <zzo38> I implemented Z-machine in C, JavaScript, and Glulx; now I do in PostScript. Later, I could try other stuff, such as PC, and Famicom (which I have started to do some time ago but haven't finished it), and possibly others too. What others will you implement Z-machine on?
02:37:28 <ais523> it's debatable where continue jumps to
02:37:52 <ais523> I'm not sure it's observable whether it's a forwards or backwards jump (and at the asm level, backwards is normally more efficient unless the loop entry has been unrolled)
02:38:04 <shachaf> The C standard defines it as a forward jump.
02:38:17 <shachaf> I also think that's a much more sensible definition.
02:38:40 <zzo38> Yes, a forward jump makes sense, and then the optimizer could alter it
02:39:21 <shachaf> Let me see if I can label all the points in a loop to support these behaviors.
02:39:43 <ais523> in BF, is ] a conditional or unconditional jump?
02:39:53 <shachaf> { @break; forever`; @retry; { @thing; forever`; @redo; x := for(xs)`; { @continue; BODY; }; thing; }; break; }
02:39:59 <ais523> (you can even make the argument that ] is conditional but [ is unconditional)
02:40:45 <ais523> what does thing do? I'm still getting used to this notation
02:41:17 <shachaf> It exits the loop when you haven't used redo.
02:41:43 <shachaf> There's probably a better way to express this.
02:43:20 <ais523> I can see why that operation doesn't have a standard name :-D
02:43:56 <ais523> in a more normal notation, the labels go here: retry: while(condition) { redo: BODY; continue:; } break:;
02:44:53 <shachaf> I like break-from-named-block much more than goto for control flow.
02:45:15 <shachaf> It probably doesn't clarify things here, though.
02:45:35 <shachaf> (I think this is a good argument for retry/redo being confusing.)
02:46:26 <zzo38> I prefer goto rather than named continue/break.
02:46:54 <moony> I prefer named continue/break
02:46:58 <moony> goto has too many ways to be evil
02:47:43 <shachaf> I even wonder whether you should just not have continue/break and be explicit about writing the labels when you want them.
02:48:28 <zzo38> moony: Well, so does any other feature, I think.
02:48:30 <ais523> hmm, the only situation in which I find myself tempted to use goto, and it clearly isn't a standin for a missing control structure, is when you have lots of if statements with the same body but they're scattered around the function so you can't combine the conditions, and they're too tightly tied to local variables to extract into a function
02:48:38 <ais523> you can use temporary booleans for that instead but I think the goto is clearer
02:48:58 <zzo38> Yes, and I think there are also other situations where goto is clearer.
02:48:59 <shachaf> ais523: This sounds like a use case for the "blocks" I was describing earlier.
02:49:04 <ais523> shachaf: I think the issue is that the loops are too generically named
02:49:43 <zzo38> I have once written how to convert a program with goto so that it only uses labeled continue/break, which, for example can be used with JavaScript.
02:49:50 <ais523> something like Python's for-else has a clearly defined intended use, Haskell's map also has a clearly defined intended use, but both operations become a for loop in most languages even though they function very differently
02:50:17 <ais523> things like continue and break have unclear semantics because the loops they're short-circuiting/exiting have varying semantics
02:50:59 <shachaf> With your goto notation, for (x in xs) { BODY } else { ELSE } is "for (x in xs) { BODY; continue: }; ELSE; break: "
02:51:56 <ais523> and a label before the ELSE should probably be called "fail"
02:53:43 <shachaf> Man, { x := for(xs)`; if(valid(x))`; switch(x)`; { case(A)`; ... }; { case(B)`; ... } }
02:54:21 <shachaf> for (x in xs) { if (valid(x)) { switch(x) { Case A: ...; Case B: ... } } }
02:55:35 <ais523> hmm, that's semantically correct but isn't it basically an if/else chain?
02:55:58 <ais523> the original purpose of a switch was to hint to a compiler that it might want to make a jump table
02:56:13 <ais523> but I guess that nowadays, maybe switches are just syntactic sugar for the if/else chain
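The jump-table reading of switch can be sketched in Python with a dict dispatch next to the equivalent if/else chain (the `classify_*` functions and the table are invented):

```python
def classify_chain(x):
    # if/else chain: one comparison per case, top to bottom
    if x == "A":
        return 1
    elif x == "B":
        return 2
    else:
        return 0

CASES = {"A": 1, "B": 2}   # jump-table analogue: a single lookup

def classify_table(x):
    return CASES.get(x, 0)  # default case

# both agree: classify_chain("B") == classify_table("B") == 2
```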
02:59:24 <shachaf> Anyway I don't believe in exceptions so you'd need some other way to express "any".
02:59:52 <shachaf> I think it might be reasonable to say that the loop breaks by default and you can explicitly tell it to continue instead.
03:00:16 <shachaf> But it seems infrequent enough that you can probably just write out the control flow yourself?
03:00:24 <ais523> well, I'm using "exception" in a general sense, it's any situation where the code says "OK, this won't work"
03:00:39 <ais523> I think people normally just use "continue" for failures and "break" for successes
03:00:50 <ais523> then an any loop is just a regular for loop
03:00:57 <shachaf> Maybe you can require an "else" clause for "any" loops.
03:18:56 -!- ais523 has quit (Quit: quit).
04:03:47 -!- Lord_of_Life has quit (Ping timeout: 245 seconds).
04:09:42 -!- Lord_of_Life has joined.
04:11:20 -!- kolontaev has quit (Quit: leaving).
04:24:20 <esowiki> [[ADxc]] N https://esolangs.org/w/index.php?oldid=65559 * A * (+467) Created page with "==Example== Suppose your snippets were AD, xc, 123, and ;l. Then: * AD should produce 1 * [[ADxc]] should produce 2 * ADxc123 should produce 3 * and ADxc123;l should produce..."
05:22:59 <esowiki> [[JUMP]] https://esolangs.org/w/index.php?diff=65560&oldid=41105 * Dtuser1337 * (+66) Adding some category.
05:24:15 <esowiki> [[Tarpit]] https://esolangs.org/w/index.php?diff=65561&oldid=40466 * Dtuser1337 * (+22)
05:28:18 <esowiki> [[JUMP]] https://esolangs.org/w/index.php?diff=65562&oldid=65560 * Dtuser1337 * (+34) Woosh
05:31:33 <esowiki> [[Truth-machine]] https://esolangs.org/w/index.php?diff=65563&oldid=64960 * Dtuser1337 * (+84) /* Jug */
05:32:33 <esowiki> [[Truth-machine]] https://esolangs.org/w/index.php?diff=65564&oldid=65563 * Dtuser1337 * (+2) /* JUMP */
06:16:02 <esowiki> [[Truth-machine]] https://esolangs.org/w/index.php?diff=65565&oldid=65564 * Dtuser1337 * (+207) My implementation of truth machine in emoji-gramming.
06:27:39 <esowiki> [[Turth-machine]] https://esolangs.org/w/index.php?diff=65566&oldid=63931 * Dtuser1337 * (+80) formatting codebase and adding categories.
06:30:17 <esowiki> [[Hello world program in esoteric languages]] https://esolangs.org/w/index.php?diff=65567&oldid=65558 * Dtuser1337 * (+43) /* Trigger */
06:31:41 <esowiki> [[Truth-machine]] https://esolangs.org/w/index.php?diff=65568&oldid=65565 * Dtuser1337 * (-6) /* Turth-machine */
06:35:56 <esowiki> [[Turth-machine]] https://esolangs.org/w/index.php?diff=65569&oldid=65566 * Dtuser1337 * (+45)
07:03:59 <esowiki> [[Capuirequiem]] https://esolangs.org/w/index.php?diff=65570&oldid=58513 * Dtuser1337 * (-1) /* External resources */
07:05:09 <esowiki> [[RETURN]] https://esolangs.org/w/index.php?diff=65571&oldid=38078 * Dtuser1337 * (+13) /* External resources */
07:06:53 <esowiki> [[Nil]] https://esolangs.org/w/index.php?diff=65572&oldid=52918 * Dtuser1337 * (+1) /* External resources */
07:11:10 -!- Sgeo_ has joined.
07:14:56 -!- Sgeo has quit (Ping timeout: 272 seconds).
07:15:52 <esowiki> [[CopyPasta Language]] https://esolangs.org/w/index.php?diff=65573&oldid=56021 * Dtuser1337 * (+13)
07:53:24 -!- AnotherTest has joined.
08:25:27 -!- Lord_of_Life has quit (Ping timeout: 245 seconds).
08:26:12 -!- Sgeo__ has joined.
08:27:41 -!- Lord_of_Life has joined.
08:29:51 -!- Sgeo_ has quit (Ping timeout: 268 seconds).
09:09:16 -!- arseniiv has joined.
09:53:49 <b_jonas> yes, that's how I think of it. imagine added labels like A: for (init; cond; step) { B: body; C: } D:
09:54:07 <b_jonas> then retry/goto jumps to A, redo jumps to B, next/continue jumps to C, last/break jumps to D
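The four jump targets can be emulated in Python with marker exceptions (a sketch; `Retry`/`Redo`/`Next`/`Last` and `run_loop` are invented names):

```python
class Retry(Exception): pass   # jump to A: restart the loop, re-running init
class Redo(Exception): pass    # jump to B: repeat the current body
class Next(Exception): pass    # jump to C: skip ahead to the step
class Last(Exception): pass    # jump to D: leave the loop

def run_loop(init, cond, step, body):
    while True:                      # A: retry re-enters here
        state = init()
        try:
            while cond(state):
                while True:          # B: redo re-enters here
                    try:
                        body(state)
                    except Redo:
                        continue     # repeat the body without stepping
                    except Next:
                        pass         # fall through to the step
                    break
                step(state)          # C: next/continue lands here
        except Retry:
            continue
        except Last:
            pass
        return state                 # D: last/break lands here

# sum 0..4 but use Next to skip 2: total == 0 + 1 + 3 + 4 == 8
def body(s):
    if s["i"] == 2:
        raise Next
    s["total"] += s["i"]

out = run_loop(lambda: {"i": 0, "total": 0},
               lambda s: s["i"] < 5,
               lambda s: s.update(i=s["i"] + 1),
               body)
# out == {"i": 5, "total": 8}
```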
10:43:43 <arseniiv> ah, these polysigns are simply amateur algebra (bordering crackpottery), bleh. We take a vector space spanned by e[0], …, e[n−1], add a product e[i] e[j] = e[(i + j) mod n] and factor it all so e1 + … + en = 0. Why don’t people read good math textbooks and speak the consensus language
10:45:32 <arseniiv> though an analysis paper by Hagen von Eitzen doesn’t go in this straightforward way so IDK, maybe one can’t factor an algebra this way
10:46:58 <arseniiv> Taneb: a person named sombrero posted a link on some images hosted on vixra ≈yesterday. I didn’t look at them but I saw that word in the abstract and went investigating what it is
10:47:27 <arseniiv> unfortunately, my expectations were confirmed
10:48:16 <arseniiv> (amateur math with crackpotish philosophical claims)
10:48:42 <arseniiv> and also an old one: the analysis paper is 2009
10:49:08 <shachaf> b_jonas: Where that expands further to A: init; while (cond) { B: body; C: step; } D: presumably
10:49:35 <arseniiv> linear algebra is sufficient to do many many things people try to invent
10:50:05 <arseniiv> (of course it’s a trivial statement here)
10:50:21 <b_jonas> shachaf: well actually more like A: { init; while (cond) { B: body; C: step; } } D: if you care about the scope of the variables declared in init
10:52:18 <shachaf> One of the nice things about my system is that many things get correctly scoped by default.
10:52:48 <shachaf> Instead of Go-style "if x := e; p(x) { ... }" you can just write "{ x := e; if(p(x))`; ... }"
10:53:16 <arseniiv> (ah, now I see the author uses R[x] as a base because its multiplication is almost right)
10:54:07 <b_jonas> shachaf: "if x:= e; p(x) { ... }" is actual syntax?
10:54:40 <b_jonas> that looks odd to me because in ruby (and julia too) the semicolon would end the condition and start the loop body, but I guess it makes sense because sh works that way too
10:55:11 <b_jonas> the sh syntax for while loops allows you to put multiple statements in the condition too, which means you can put the test anywhere in the middle of the loop, you don't need a separate do-while loop or if-break
10:55:21 <shachaf> I think something like "if (auto x = e; p(x)) { ... }" is syntax in recent C++ too.
10:55:45 <b_jonas> shachaf: dunno, I can't follow that
10:56:17 <b_jonas> in perl you can write { my $x = e; p($x) or last; body }
10:56:29 <shachaf> It just declares something that's only in the scope of the if.
10:58:30 <shachaf> Presumably it means the same as { x := e; if (p(x)) { ... } }
11:00:20 -!- heroux has quit (Ping timeout: 248 seconds).
11:07:55 -!- heroux has joined.
11:12:52 <int-e> Hmm, this looks like a good preamble for a SMETANA program to me :) Step 1. Swap step 1 with step 2. Step 2. Go to step 1.
11:18:54 <b_jonas> Imagine an alternate universe where English didn't become a lingua franca, so there are entire libraries where all the identifiers are in German, ones where all of them are Russian, ones where all of them are in French,
11:19:19 <b_jonas> but you sometimes end up with names mixing words from different languages because the concept has a name popularized by an earlier library,
11:19:39 <b_jonas> and perhaps the oldest libraries with the worst names, like ncurses in our world, would even be all in latin.
11:20:17 <shachaf> It's ironic that you're calling English a "lingua franca" in that description.
11:20:35 <b_jonas> and then some libraries with English names would appear too.
11:21:10 <b_jonas> and you'd have to learn the basics of three or four different languages to understand source code, which was basically the status of scientific research in the 1960s
11:23:46 <b_jonas> yes, you'd probably have to know not only the root words, but also conjugation in latin, french, italian, german, and russian to be able to guess the spelling of identifiers if you want to write code
11:24:01 <Taneb> Sounds fun, we should make this happen
11:24:39 <b_jonas> Taneb: sure, but do it in a modern way, where latin is excluded but instead we have chinese identifiers too
11:25:07 <b_jonas> I mean using latin directly is excluded, we can still have the french and italian and Portuguese identifiers of course
11:28:17 <Taneb> Hmm, a lot of libraries are American, so would naturally be written in English
11:28:24 <Taneb> Or maybe Spanish or Navajo or something
11:29:09 <b_jonas> and we may have some short identifiers that are ambiguous because they mean something different depending on what language you're reading them in
11:29:16 <b_jonas> I wonder if we can engineer such a situation in common keywords
11:30:43 <HackEso> signs of the zodiac? ¯\(°_o)/¯
11:31:34 <int-e> Oh I just realized that SMETANA may just be powerful enough to write a quine in. (But I'm targeting SMETANA To Infinity! for now.)
11:32:28 <int-e> Well, if we steal the output instruction from S2I that is.
11:32:39 <int-e> No output, no quine... :)
11:32:57 <shachaf> What if you just put the quine in memory?
11:33:05 <shachaf> For example, in the text section.
11:33:42 <int-e> shachaf: Maybe you should refresh your memory (no pun intended) on what SMETANA is.
11:35:17 <b_jonas> the handle transform tool in gimp 2.10 is such a blessing, it was so hard to do proper alignment of multiple pictures with affine or perspective transforms before that
11:35:34 <int-e> (SMETANA suffers from being a finite state machine; all data has to be encoded somehow in the program itself as written. So you need some serious compression to allow the program to contain a representation of itself, which stretches the term "quine" beyond the limits I'm willing to allow.)
11:37:22 <b_jonas> how does it stretch the term "quine"?
11:37:23 <int-e> shachaf: Of course we could define other output conventions without extending the language... like looking at the trace and mapping certain predetermined addresses to characters.
11:37:38 <shachaf> int-e: I didn't even know about SMETANA until I looked it up 25 minutes ago.
11:38:17 <shachaf> But I was making a general joke that is equally bad in any language.
11:38:50 <shachaf> (The joke is, define "output" to be some region of memory. At program startup, the program is loaded into memory, and therefore you can define that region to be the output and call it a quine.)
11:39:27 <int-e> b_jonas: Well, I don't know how to delineate decompression and a BASIC-style 'LIST' function that makes quines trivial.
11:40:12 <lambdabot> Local time for shachaf is Mon Aug 19 04:40:08 2019
11:40:28 <int-e> it's getting worse
11:40:39 <fungot> b_jonas: not very quickly, and it has a non-real-time gc, but that's just what the user expected type character, got ' ( 1 2 3)
11:42:57 <int-e> fungot's almost making sense again (though I guess that last part is closer to Lisp than to J?)
11:42:58 <fungot> int-e: lms that doesn't violate her fnord, either. probably still aren't. is slib standard?
11:43:30 <int-e> oh well, the moment of clarity has passed
11:46:28 <shachaf> fungot: Was that Lisp or J?
11:46:28 <fungot> shachaf: ( define ( get-move-console) count)). the funge-98 standard is also written in scheme
11:46:44 <b_jonas> fungot: the reason why people don't take you seriously is that you aren't using the One True Style for formatting your code. don't put whitespace after the left parenthesis or after the quote
11:46:44 <fungot> b_jonas: goes where no other music fnord or has been in jail. we went to another bar. then we only need to cons stuff onto the top of
12:07:52 <b_jonas> pesky arthropods, thinking that I invited them just because I left food open on the counter
12:08:26 <b_jonas> if you want food, buy your own, darn it
12:10:45 -!- j-bot has joined.
12:16:45 -!- andrewtheircer has joined.
12:30:12 <b_jonas> is there a strongly typed golf language where, if your code isn't well-typed, it tries to insert extra stuff like parentheses into your code until it's well-typed, so if you use the type system well, you can omit a lot of things from the parse tree?
12:30:28 <b_jonas> would need a good built-in library and good heuristics for what to insert
12:31:00 <b_jonas> and might be harder to program in than Jelly
12:33:30 <b_jonas> also, is there a golf language that is basically the same as haskell but with a much more golfier syntax?
12:33:46 <b_jonas> I don't think blsq counts as such
12:52:53 <b_jonas> oh, I get it! if a C compiler environment isn't made for hard realtime programs, then the library defines the PRIoMAX macro to "llo", meaning that the maximum priority that your threads will be able to run is very low
12:57:40 <b_jonas> ``` gcc -O -Wall -o /tmp/a -x c - <<<$'#include<inttypes.h>\n#include<stdio.h>\n int main(void){ printf("maximum thread priority: %s\\n", PRIoMAX); return 0; }' && /tmp/a
12:57:41 <HackEso> maximum thread priority: lo
12:57:49 <b_jonas> that's slightly better, it's only low, not very low
13:15:24 -!- andrewtheircer has quit (Quit: Ping timeout (120 seconds)).
13:15:48 -!- arseniiv_ has joined.
13:17:33 -!- andrewtheircer has joined.
13:17:34 -!- arseniiv has quit (Ping timeout: 246 seconds).
13:54:05 <esowiki> [[Special:Log/newusers]] create * Mid * New user account
14:16:34 <b_jonas> apparently the We The Robots comic has ended its long hiatus, except it's not at the old address "http://www.chrisharding.net/wetherobots/" but at "http://www.wetherobots.com/" now
14:16:40 <b_jonas> I hadn't noticed until now
14:16:46 <b_jonas> this is what happens when you break urls
14:29:55 -!- sleepnap has joined.
14:30:57 <esowiki> [[ABCD]] https://esolangs.org/w/index.php?diff=65574&oldid=46025 * Dtuser1337 * (+18)
14:39:34 <b_jonas> question. is it possible to make a terminfo entry for irc that tells how to output colors and bold and italic with the mirc codes, and if so, will gcc output colored error messages in HackEso?
14:40:04 <b_jonas> I don't know if gcc's colored error messages respects that
14:40:54 <b_jonas> I made a terminfo entry once, but only by editing a single value in an existing entry
14:44:18 <b_jonas> I should check the ncurses sources in case it already has such an entry though
14:54:45 <b_jonas> whoa... are the terminfo thingies no longer distributed with ncurses? where do they come from then?
15:01:57 <b_jonas> I'll check what debian thinks
15:04:46 <b_jonas> the compiled entries come from the ncurses-term package, now how do I find what the source package is for that?
15:06:01 <b_jonas> and the source package is apparently ncurses
15:07:13 <b_jonas> called ncurses-6.1/misc/terminfo.src
15:08:02 <b_jonas> now the question is, can terminal libraries handle that the same code toggles bold on and off?
15:08:14 <b_jonas> and also the fact that you can't put a digit right after a color code?
15:08:46 <b_jonas> if not, we need a filter after it, which would be ugly
15:09:13 <b_jonas> I guess for the latter, I can just force put two bold codes after a color code
15:09:21 <b_jonas> and for the former, just hope it works
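The filter b_jonas is talking himself into could look something like this. A hedged sketch only: the SGR-to-mIRC colour mapping and the function name are made up for illustration. It translates a few ANSI escape sequences to mIRC codes, and after each colour code it appends two bold toggles (a no-op pair) so that a literal digit following the colour can't be misparsed as part of the colour number:

```python
import re

# mIRC control codes: \x02 toggles bold, \x03NN[,NN] sets colours,
# \x0f resets all formatting.
BOLD, COLOR, RESET = "\x02", "\x03", "\x0f"

# Hypothetical mapping from a few ANSI SGR colour parameters to mIRC
# colour numbers (illustrative, not a standard table).
SGR_TO_MIRC = {30: 1, 31: 4, 32: 3, 33: 8, 34: 2, 35: 6, 36: 10, 37: 0}

def ansi_to_mirc(text):
    """Translate a small subset of ANSI SGR escapes to mIRC codes.

    After a colour code we emit two bold toggles (which cancel out) so a
    following literal digit can't be swallowed into the colour number --
    the workaround proposed in the discussion above.
    """
    out, pos = [], 0
    for m in re.finditer(r"\x1b\[([0-9;]*)m", text):
        out.append(text[pos:m.start()])
        pos = m.end()
        params = [int(p) for p in m.group(1).split(";") if p] or [0]
        for p in params:
            if p == 0:
                out.append(RESET)
            elif p == 1:
                out.append(BOLD)  # caveat: mIRC bold is a toggle, not "on"
            elif p in SGR_TO_MIRC:
                out.append("%s%02d%s%s" % (COLOR, SGR_TO_MIRC[p], BOLD, BOLD))
    out.append(text[pos:])
    return "".join(out)

print(repr(ansi_to_mirc("\x1b[31m1 error\x1b[0m")))
```

The bold-toggle issue from earlier in the conversation is exactly the caveat in the comment: since \x02 toggles rather than sets, a terminfo entry can describe "enter bold" with it but has no independent "exit bold" code.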
15:17:32 -!- xkapastel has joined.
15:19:35 <b_jonas> that debian package ncurses-term, by the way, contains the _additional_ compiled terminfo entries, the decades of legacy terminals that nobody uses anymore
15:23:09 <Taneb> Imagine if people started adding footnotes* to their IRC messages
15:23:39 <b_jonas> it's actually split around the directory tree too, so the important terminfo entries are in /lib/terminfo , the rest are in /usr/share/terminfo , in case you're mounting usr late
15:32:32 <b_jonas> fungot, do you have a calendar? has SGDQ 2019 started yet?
15:32:32 <fungot> b_jonas: wtf is that? that problem haunts me. i don't think
16:03:19 -!- joast has quit (Quit: Leaving.).
16:03:58 -!- joast has joined.
16:52:42 <fizzie> fungot: I take it you don't watch speedruns then?
16:52:42 <fungot> fizzie: if i remove the fan, heatsink, installation instructions and finally the body is ( are) returned. this expression type is used at a german bank for financial simulations. but that's a mere one character :) i'm not complaining
16:52:58 <b_jonas> fungot, 45 minutes is how many hours?
16:52:59 <fungot> b_jonas: you could just select another transport module system in the works to start off by reading sicp or htdp.
16:53:19 <b_jonas> fungot: don't remove the fan and heatsink! the cpu will overheat
16:53:19 <fungot> b_jonas: is that what you mean
17:04:13 -!- Phantom_Hoover has joined.
17:20:25 <HackEso> The password of the month is surprising.
17:21:15 -!- Sgeo has joined.
17:24:12 -!- Sgeo__ has quit (Ping timeout: 272 seconds).
17:38:36 -!- FreeFull has joined.
17:53:49 -!- andrewtheircer has quit (Remote host closed the connection).
18:02:45 -!- arseniiv_ has changed nick to arseniiv.
18:09:56 <arseniiv> Taneb: I¹ proved² the Riemann hypothesis³ false⁴
18:10:30 <arseniiv> but the page is too short for the footnote section so no one will know what I meant
18:12:41 <arseniiv> though one footnote can be crammed into the margin (¹ I — Roman numeral 1)
18:12:56 <int-e> . o O ( "I proved the Riemann hypothesis" [is] false. )
18:13:31 <b_jonas> int-e: ah yes, newspaper headline grammar
18:13:48 <int-e> Oh. We can claim that this is a portmanteau.
18:14:04 <int-e> (hypothesis and is -> hypothesis)
18:14:35 <HackEso> friend:friend is a portmanteau of fritter and rend \ ism:Isms are philosophies, religions or ideologies that have branched off from older ones, such as Leninism or Buddhism. Etymologically "ism" is a backformation from portmanteaus on "schism". \ portmanteau:«Portmanteau» is the French spelling of “port man toe”.
18:39:37 <arseniiv> int-e: :D but is omission of this kind grammatical?
18:40:10 <arseniiv> <int-e> (hypothesis and is -> hypothesis) => ah, quite clever
18:40:47 <arseniiv> one can even claim that hypotheses < hypothesises :D
18:41:21 <arseniiv> or not, it seems different phonetically?
18:41:38 <int-e> arseniiv: FWIW I initially intended to provide a footnote 4 that would redefine "false", but didn't find any nice formulation...
18:42:49 <arseniiv> though this is not nice at all
18:43:37 <arseniiv> I had in mind conflating unprovedness, unprovability and proved negation
18:44:09 <int-e> arseniiv: https://www.youtube.com/watch?v=8keZbZL2ero is somehow relevant
18:44:11 <arseniiv> but decided to leave it vague (v) for a time
18:55:55 -!- Melvar has quit (Quit: WeeChat 2.4).
19:26:05 -!- ais523 has joined.
19:26:18 -!- ais523 has quit (Client Quit).
19:26:31 -!- ais523 has joined.
19:26:55 <ais523> <b_jonas> is there a strongly typed golf language where, if your code isn't well-typed, it tries to put extra stuff like parentheses and other extra stuff into your code until it's well-typed, so if you use the type system well, you can omit a lot of things from the parse tree? <b_jonas> also, is there a golf language that is basically the same as haskell but with a much more golfier syntax? ← yes to both, and it's the same language: Husk
19:27:37 <b_jonas> hmm, there doesn't seem to be an article on the wiki
19:29:58 <ais523> CGCC users rarely bother with creating wiki articles for their golflangs :-(
19:30:14 <ais523> they normally just link to the github repo to let people know what the language is
19:30:27 <ais523> https://github.com/barbuz/Husk in this case
19:30:38 -!- Sgeo_ has joined.
19:31:04 <esowiki> [[Husk]] N https://esolangs.org/w/index.php?oldid=65575 * B jonas * (+158) stub
19:31:33 <esowiki> [[Language list]] https://esolangs.org/w/index.php?diff=65576&oldid=65520 * B jonas * (+11) Husk
19:32:16 <b_jonas> oh no, not yet another golf language with its own custom eight-bit character set
19:33:02 <ais523> most golf languages do that
19:33:09 <ais523> because control codes are /really/ hard to read and type
19:33:34 <ais523> (a golf language I'm sporadically working on has its own /six/-bit character set, which has the advantage that I can make it a subset of ASCII)
19:33:48 -!- Sgeo has quit (Ping timeout: 245 seconds).
19:34:34 <b_jonas> I guess it's still better than if the golf language has a tricky variable length compression, so reading and writing it is more difficult than just translating characters
19:34:48 <b_jonas> like a Huffman encoding or even worse
19:35:28 <b_jonas> of course Jelly's compressed string modes are sort of like that
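As a toy illustration of the point (this is plain Huffman coding, not Jelly's actual compression scheme), a variable-length code ties every symbol's representation to the global frequency table, so nothing lines up on character boundaries the way a simple 8-bit character-set translation does:

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code table for the symbols of text (toy version)."""
    heap = [(n, i, {c: ""}) for i, (c, n) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    order = len(heap)  # tiebreaker so the dicts are never compared
    while len(heap) > 1:
        na, _, ca = heapq.heappop(heap)
        nb, _, cb = heapq.heappop(heap)
        merged = {c: "0" + bits for c, bits in ca.items()}
        merged.update({c: "1" + bits for c, bits in cb.items()})
        heapq.heappush(heap, (na + nb, order, merged))
        order += 1
    return heap[0][2]

# Frequent symbols get short codes, so editing one symbol can shift the
# bit boundaries of everything after it -- unlike a custom 8-bit
# character set, where one glyph is always exactly one byte.
table = huffman_code("abracadabra")
bits = "".join(table[c] for c in "abracadabra")
```

Reading or writing `bits` by hand means tracking the whole prefix tree in your head, which is the "more difficult than just translating characters" b_jonas describes.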
19:45:06 -!- heroux has quit (Ping timeout: 268 seconds).
19:46:59 -!- adu has joined.
19:51:39 -!- heroux has joined.
20:03:32 <arseniiv> ais523: thanks for an interesting language!
20:04:02 <arseniiv> though golflangs make my head spin, there are so many details because of the need to compress
20:04:31 <arseniiv> nondeterministic typing there is cool
20:05:20 <arseniiv> (hopefully it’s implemented in such a way as to not blow up combinatorially)
20:08:09 <arseniiv> I’d type overloaded multimethods using union typing, but am I right it’s not easily added to Hindley–Milner-like inference?
20:08:16 <ais523> it might blow up at compile time but that's probably forgivable, golflang programs are short
20:09:12 -!- Melvar has joined.
20:10:01 <b_jonas> I wonder if there's a golf language that has a built-in that gets the OEIS sequence from its index, and to compute its terms, tries to automatically run code in the OEIS entry, so requires Maple and Mathematica plus ten gigabytes of other software to run.
20:10:22 -!- sleepnap has quit (Quit: Leaving.).
20:10:52 -!- sleepnap has joined.
20:11:23 <b_jonas> and of course you download a snapshot of the OEIS at the time you build the compiler, so that it doesn't cheat by looking info up on the internet that may be newer than the language
20:13:25 -!- sleepnap has quit (Client Quit).
20:19:13 -!- Melvar has quit (Ping timeout: 245 seconds).
20:19:37 -!- Melvar has joined.
20:21:10 <ais523> b_jonas: I think someone tried that once but failed
20:21:47 <ais523> a better approach would probably be to encourage people to submit OEIS programs in a standard machine code; WebAssembly comes to mind
20:21:48 <b_jonas> yeah, this does seem like an interesting language
20:22:11 <b_jonas> ais523: does WebAssembly have bigints?
20:22:32 <ais523> it's a machine code, so not natively, in the same sense that x86 doesn't have native bigints
20:22:38 <ais523> but you can run GMP or the like on it easily enough
20:23:03 <b_jonas> ais523: I mean as a standard library that's usually accessible or something, not as a "built-in"
20:23:27 <ais523> WebAssembly doesn't have standard libraries, you're supposed to compile your own language's standard library onto it
20:23:31 <b_jonas> so that you don't have to bundle a copy of the bigint multiplication routine with the code of every quickly growing sequence
20:23:57 <ais523> a typical WebAssembly program is shipped with a decent proportion of libc
20:23:58 <b_jonas> (though I presume you'd allow a single object file that implements multiple OEIS sequences)
20:24:17 <ais523> I guess object files might make more sense than executables, in that case
20:25:24 <esowiki> [[Husk]] https://esolangs.org/w/index.php?diff=65577&oldid=65575 * B jonas * (+465)
20:25:46 <b_jonas> it could be executables, that's not the difference I care about here
20:25:56 <b_jonas> just that there are bunches of OEIS sequences that are closely related
20:26:08 <b_jonas> so it would be redundant to copy the code to each one
20:26:32 -!- Lord_of_Life_ has joined.
20:27:39 -!- Lord_of_Life has quit (Ping timeout: 268 seconds).
20:29:27 -!- Lord_of_Life_ has changed nick to Lord_of_Life.
20:36:44 -!- ais523 has quit (Ping timeout: 272 seconds).
21:08:56 -!- ais523 has joined.
21:10:27 <ais523> well, executables are self-contained, but object files don't have to be
21:11:58 <shachaf> I wrote a program to generate ELF executables but it seems to me object files might actually be trickier
21:12:26 <shachaf> Hmm, maybe just differently tricky.
21:13:02 <shachaf> The tricky thing is that you need a bunch of information in sections for the linker to interpret. But it can handle relocations and so on for you, I suppose.
21:13:46 <b_jonas> shachaf: do you also emit basic debug information like the code span of each function?
21:14:17 <b_jonas> I guess that is sort of redundant because gdb can guess the function from the function symbols, without the debug info
21:14:30 <shachaf> (Because the program just emits some fixed handwritten x86 code.)
21:14:57 <shachaf> I do emit function information, I think. But the only function is _start.
21:15:36 <HackEso> \ tmp/out.a: file format elf64-x86-64 \ \ \ Disassembly of section .text: \ \ 0000000000000178 <_start>: \ 178:48 31 ed xor %rbp,%rbp \ 17b:49 89 d2 mov %rdx,%r10 \ 17e:48 b8 66 69 6e 61 6c movabs $0xa796c6c616e6966,%rax \ 185:6c 79 0a \ 188:50 push %rax \ 189:b8 01 00 00 00 mov $0x1,%eax \ 18e:bf 01 00 00 00 mov $0x1,%edi \ 193:48 89 e6 mov
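A minimal sketch of the kind of generator shachaf describes: packing fixed, handwritten x86-64 machine code into a static ELF64 executable. The layout here (a single PT_LOAD segment covering the headers plus the code, entry point immediately after the program header) is one common minimal scheme, not necessarily shachaf's, and the code bytes are just an exit(0) stub chosen for illustration:

```python
import struct

def make_elf(code, vaddr=0x400000):
    """Pack a minimal ELF64 executable: ELF header, one program header,
    then the machine code, all mapped in a single R+X PT_LOAD segment."""
    ehsize, phsize = 64, 56
    entry = vaddr + ehsize + phsize      # code starts right after headers
    ehdr = struct.pack(
        "<4sBBBB8xHHIQQQIHHHHHH",
        b"\x7fELF", 2, 1, 1, 0,          # ELF64, little-endian, SysV ABI
        2, 0x3E,                         # e_type ET_EXEC, e_machine x86-64
        1, entry,                        # e_version, e_entry
        ehsize, 0,                       # e_phoff, e_shoff (no sections)
        0, ehsize, phsize, 1,            # e_flags, e_ehsize, e_phentsize, e_phnum
        0, 0, 0)                         # e_shentsize, e_shnum, e_shstrndx
    filesz = ehsize + phsize + len(code)
    phdr = struct.pack(
        "<IIQQQQQQ",
        1, 5,                            # PT_LOAD, flags PF_R | PF_X
        0, vaddr, vaddr,                 # p_offset, p_vaddr, p_paddr
        filesz, filesz, 0x1000)          # p_filesz, p_memsz, p_align
    return ehdr + phdr + code

# xor %edi,%edi; mov $60,%eax; syscall  -- exit(0) on Linux x86-64
code = bytes([0x31, 0xFF, 0xB8, 0x3C, 0x00, 0x00, 0x00, 0x0F, 0x05])
elf = make_elf(code)
```

Writing `elf` to a file and marking it executable yields a 129-byte program; as the chat notes, an object file would instead need symbol and relocation sections for the linker to interpret, which is the "differently tricky" part.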
21:16:00 <ais523> what sort of calling convention is that? :-D
21:16:10 <ais523> I'm used to seeing functions starting out by messing around with sp and bp
21:16:13 <ais523> but zeroing bp is weird
21:16:20 <ais523> I'm guessing it's just being used as a general-purpose register?
21:16:29 <shachaf> That's standard in the amd64 ABI.
21:16:39 <shachaf> To mark the outermost stack frame.
21:17:05 <ais523> couldn't the outermost stack frame actually need it as a base pointer, though?
21:17:56 <shachaf> I think _start normally calls another entry point almost immediately.
21:17:57 <ais523> oh, that string says "finally"
21:18:17 <ais523> I initially misread it as "fitally", I'm not as good at converting hex to ascii in my head as I'd like
21:18:58 <ais523> hmm… do you have an opinion on caller-saved versus callee-saved registers?
21:19:17 <b_jonas> `perl -eprint pack"q",0xa796c6c616e6966 # let's ask a computer to do that, just to check
21:19:41 <shachaf> I don't have a strong opinion, at least. Maybe I had one in the past.
21:20:27 <shachaf> I think someone who knows more about register renaming and things than I do should give me an opinion.
21:27:45 -!- asdfbot has joined.
21:29:15 <shachaf> I think it's reasonable for _start to be special in this way because it's not actually a function.
21:29:41 <moony> =rasm xor rax, rax
21:29:49 <moony> =rasm2 xor rax, rax
21:30:54 <moony> =rasm2 -s att xor %rax, %rax
21:31:06 <moony> =rasm2 -satt xor %rax, %rax
21:31:24 <shachaf> Which order should I write the operands in in my assembler?
21:31:34 <moony> radare2 on hackeso when
21:31:42 <b_jonas> shachaf: use NASM syntax, it's better than either intel or att
21:32:00 <moony> it's intel-like yes
21:32:03 <b_jonas> that is if this is an assembler for x86
21:32:58 <shachaf> Presumably I want to target a bunch of platforms
21:33:34 <moony> shachaf: https://github.com/asmotor/asmotor contribute to this instead then
21:34:31 -!- LBPHacker has joined.
21:34:38 -!- BWBellairs has joined.
21:34:43 <moony> `welcome LBPHacker
21:34:44 <b_jonas> you should probably use xor %eax,%eax though, because it encodes shorter
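For reference, these are the standard x86-64 encodings (opcode 31 /r is XOR r/m32, r32; the REX.W prefix widens it to 64 bits), and since a write to a 32-bit register zero-extends into the full 64-bit register, the two instructions have the same effect on rax:

```python
# xor %eax,%eax -- 2 bytes; xor %rax,%rax -- 3 bytes (extra REX.W prefix).
XOR_EAX_EAX = bytes([0x31, 0xC0])
XOR_RAX_RAX = bytes([0x48, 0x31, 0xC0])

# Both leave rax == 0, because writing a 32-bit register implicitly
# zeroes the upper 32 bits; the shorter form is the idiomatic way to
# zero a register on x86-64.
print(len(XOR_EAX_EAX), len(XOR_RAX_RAX))
```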
21:34:44 <HackEso> LBPHacker: Welcome to the international hub for esoteric programming language design and deployment! For more information, check out our wiki: <https://esolangs.org/>. (For the other kind of esoterica, try #esoteric on EFnet or DALnet.)
21:34:45 <LBPHacker> blame moony for everything I do here :P
21:35:44 <shachaf> Why contribute to that instead?
21:35:55 <moony> good, multi-CPU assembler.
21:36:09 <moony> more assemblers is just more competing standards right now
21:36:29 <b_jonas> see http://yasm.tortall.net/ for x86
21:37:03 <b_jonas> or invent your own syntax that looks similar to the others but is incompatible in subtle ways that are hard to debug
21:37:22 -!- sleepnap has joined.
21:37:24 <shachaf> I primarily wanted a library rather than something with a parser anyway
21:37:48 <moony> capstone not work?
21:38:19 <moony> oh, right, capstone's a disassembler
21:38:38 <moony> http://www.keystone-engine.org/
21:39:04 <moony> that's its assembler counterpart
21:39:19 <b_jonas> for disassembling, you can try Agner's disassembler
21:39:39 <moony> or http://www.capstone-engine.org/ as i mentioned
21:41:27 <shachaf> There's only one application I previously wanted a general disassembler library for, and it was kind of ridiculous.
21:42:12 <b_jonas> did it involve malware research or kernel debugging?
21:42:44 <shachaf> No, the goal was to make a variant of strace that traces I/O to memory mapped files precisely.
21:43:36 <shachaf> In order to trace exactly which bytes were read from or written to, you need to disassemble the instruction.
21:44:41 <shachaf> I think there are some debuggers that have this feature.
21:44:41 <b_jonas> that, or mprotect all of the mapped pages to 0 permissions after every access, and catch the segfault and see what the siginfo says
21:45:17 <shachaf> That's the approach I'm proposing.
21:45:17 <b_jonas> wait, can't intel's built-in debug faults already tell what's read and written, and doesn't the kernel expose that in siginfo?
21:45:34 <shachaf> Certainly not stepping through every instruction, that seems way too slow.
21:45:50 <shachaf> But when you get the SEGV you need to figure out exactly which octets were read or written.
21:45:52 * ais523 has been reading the x86_64 ABI
21:46:02 <ais523> it gives advice on how to implement exabyte-size functions
21:46:10 <ais523> I wonder what circumstances would require you to write one of those
21:46:24 <moony> I can not imagine a sane one
21:47:02 <moony> why do me and ais523 have the same name color in weechat
21:47:22 <moony> we're alphabetical miles apart
21:47:35 <b_jonas> I don't know how all that x86 stuff works
21:47:48 <moony> b_jonas: it works via a generator that runs on hot garbage
21:48:06 <b_jonas> ais523: a compiler may have to, when it's forced to compile firefox
21:48:33 <moony> maybe you unroll a massive loop?
21:49:12 <moony> LBPHacker: help i need ideas for why one would use an exabyte-sized function
21:49:38 <ais523> I still have trouble imagining exabytes of data, although there are probably some companies who store that much nowadays
21:49:41 <b_jonas> though I think they don't have functions larger than gigabyte sized
21:49:42 <ais523> but exabytes of /code/?
21:50:06 <ais523> I guess when you're writing an ABI you need to take all eventualities into account
21:50:48 <b_jonas> ais523: right, so that twenty years later, when the required hardware is available, different groups of people don't start inventing incompatible extensions
21:50:59 <shachaf> ais523: I doubt there's any company that stores an exabyte of data on one machine.
21:51:29 <ais523> my guess is that nobody does but I'm not sure
21:51:30 <moony> I wonder what happens if you give GCC infinite (to the max x86-64 allows) RAM and have it compile an exabyte-sized function
21:51:43 <moony> ais523: the most we can store per machine right now is about 1PB i think
21:52:10 <moony> so 1024 machines per exabyte
21:52:18 <ais523> there are definitely use cases in which you want as much in-memory storage as possible in one machine and don't care if it's lost to a crash
21:52:39 <b_jonas> moony: even a petabyte is a lot, yeah
21:53:02 <b_jonas> you'd need 32 hard disks, each 32 terabytes in size, or something
21:53:12 <LBPHacker> moony: I'm missing a lot of context here but iirc x86-64 supports an address space of 2**52
21:53:12 <moony> ais523: the most RAM per machine right now is 8TB, assuming an AMD EPYC-based server.
21:53:19 <b_jonas> ais523: such as video games, sure
21:53:28 <ais523> moony: so not even in the petabyte range
21:53:33 <b_jonas> or trying to break cryptography stuff by building huge tables
21:53:42 <moony> and this is as of just a few days ago btw
21:53:44 <LBPHacker> or are we talking possibly out of memory
21:53:56 <moony> before the latest EPYC units were released, it was a max of 4TB per machine
21:54:15 <moony> LBPHacker: possibly out of memory i bet
21:54:20 <moony> memory paging/swapping/whatever
21:54:32 <b_jonas> LBPHacker: I disagree, https://esolangs.org/logs/2019-08-05.html#luc
21:54:59 <b_jonas> LBPHacker: it's the 5-level one that would support 52 bits I think
21:55:12 <b_jonas> the current cpus support only up to 48 bits
21:55:39 <LBPHacker> well I mean I remember 9+9+9+9+12, which is the 4-level
21:55:41 <b_jonas> moony: what's the most fast solid state storage you can have in a machine?
21:55:50 <LBPHacker> how many address pins your cpu has is another matter :P
21:55:58 <moony> I dunno, i know some servers with 48+ NVME slots
21:56:11 <ais523> even if your CPU is short on address pins you could always use… bank switching!
21:56:12 <LBPHacker> see this is why I don't do addition
21:56:36 <ais523> although it tends not to play very nicely with the concept of an operating system
21:56:48 <moony> b_jonas: with a modern EPYC system, I think it'd max out at 128 NVMe devices, assuming you somehow utilized all 128 PCIe lanes from both CPUs
21:57:18 <b_jonas> LBPHacker: I gave up on arithmetic years ago, when I debugged a segfault for hours, then found that I allocated 8092 bytes instead of 8192
21:57:20 <moony> (Yes, EPYC servers are currently the most capable. Perfect.)
21:57:34 <b_jonas> these days I'd let the computer figure out the size from a multiplication or shift
21:57:41 <LBPHacker> lol. yes, that is why I don't do decimal either
21:58:07 <b_jonas> moony: and how large can those NVMe devices be?
21:58:07 <moony> how many SATA devices can you have per PCIe lane
21:58:34 <b_jonas> shachaf: yes, that too, though sometimes I mess that up too
21:58:59 <b_jonas> moony: also how many ways is the largest possible NUMA in a machine?
22:00:44 <ais523> haha, the PLT uses one format for the first 102261125 entries and changes to a different format from the 102261126th onwards
22:00:54 <b_jonas> computer hardware is getting ridiculously powerful. good thing I have my programmable calculator, with 2 kilobytes of RAM, battery-backed to make it persistent, on the shelf
22:00:55 <moony> I think EPYC zen 2 server processors are one NUMA node each
22:01:07 <ais523> I wonder how many programs a) care about the PLT format at all and b) are unaware of that detail
22:01:21 <moony> b_jonas: 128 cores per server \o/
22:02:11 <moony> 128 cores, with slots for 4 accelerator cards
22:02:28 <ais523> (the reason is that there's some shared code between the entries that they normally reach via a jump, but you can't write the jump instruction when you're too far from the start of the PLT)
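A hedged back-of-the-envelope check of that cutoff: assuming each entry in this PLT format occupies 21 bytes (an assumption; the entry layout isn't quoted here) and the shared code at the top of the PLT is reached with a signed rel32 jump, the entries that can still reach it run out almost exactly at ais523's figure:

```python
REL32_REACH = 2 ** 31   # a signed 32-bit displacement spans +/- 2 GiB
ENTRY_SIZE = 21         # assumed size of one entry in this PLT format

# Entries past this index sit too far from the start of the PLT for a
# rel32 jump back to the shared code, so the entry format has to change.
cutoff = REL32_REACH // ENTRY_SIZE
print(cutoff)  # 102261126
```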
22:02:50 <b_jonas> moony: nice. have you ever written a program for those that spawns 128 parallel processes to speed up something, but some library you call in them tries to be automatically smart and spawns 128 parallel threads in each of them?
22:03:21 <shachaf> If I generate an amd64 program, should I just not use a PLT?
22:03:30 <b_jonas> that's happened to me on 12 cores only, I haven't seen a 128 core machine
22:03:42 <moony> b_jonas: 128 core machines are on the market as of a few days ago
22:03:46 <ais523> shachaf: I'm sceptical about the value of the PLT and GOT
22:03:46 <moony> they're 2-way as well
22:03:52 <shachaf> Also, should I take Microsoft's approach and say "x64" instead of "x86-64" or "amd64"? It's shorter.
22:04:08 <shachaf> ais523: Hmm, what's the alternative to the GOT in ELF files?
22:04:13 <ais523> x64 probably refers to an unrelated CPU
22:04:18 <moony> b_jonas: they're also impressively cheap, 7k for the highest end 64-core CPU from AMD
22:04:26 <moony> and it absolutely kills in terms of performance
22:04:51 <moony> somethingsomething50kfor56coresfromintel
22:04:55 <ais523> shachaf: well, the GOT basically lets you find addresses that aren't compile-time constants, but doing arithmetic on %rip can also do that if you know that the entire program has moved the same amount
22:05:04 <b_jonas> I'd like a powerful computer, but not that powerful
22:05:11 <ais523> so it'd only be useful when connecting between two parts of the program that are ASLRed by different amounts
22:05:30 <moony> b_jonas: AMD has 16-core processors for desktop for about $800 i think, coming to market real soon
22:05:46 <shachaf> ais523: Right, but isn't every library loaded at a random address?
22:06:39 <ais523> shachaf: yes, but there isn't a whole lot of actual communication done between libraries through, e.g., shared global constants
22:07:26 <shachaf> But what about calls to library functions?
22:07:31 <ais523> so I'm not sure that I see any particular difference between absolute and position-independent code; you need the dynamic linker to update function calls in one library to point to the other anyway
22:07:58 <ais523> the purpose of the PLT is pretty much just so that you can avoid editing the .text segment, which forces you to use a private rather than shared map
22:08:09 <shachaf> Right, so the PLT seems kind of pointless.
22:08:24 <shachaf> But isn't the GOT the alternative that doesn't edit .text?
22:08:39 <ais523> the GOT and PLT are very similar
22:09:09 <ais523> the difference is that GOT is general-purpose and the PLT is just wrappers for function calls
22:09:23 <ais523> (this lets you give a location in the PLT when you need to give someone a function pointer for a callback)
22:09:45 <moony> LBPHacker: exascale R316 when
22:09:56 <ais523> that said, position-independent code doesn't move while it's running, so if you need to give someone a function pointer, you could probably just lea it yourself and pass them the resulting absolute address
22:10:01 <b_jonas> why do we have to have these interesting conversations during the night when I have to get up early the next day? the sleep cycle of this channel is messed up
22:10:14 <shachaf> Maybe I'm saying GOT when I mean something else.
22:10:15 <moony> it's just the afternoon here
22:10:28 <moony> GOT sounds an awful lot like GDT
22:10:39 <shachaf> What's the way you're proposing to call a dynamically linked function?
22:10:42 <moony> I know GDT, but not GOT..
22:12:11 <shachaf> I think you should have something like "movq some_offset(%rip), %rax; callq *%rax", where that offset is an offset into a segment the dynamic linker populates with the correct address at load time.
22:12:47 <shachaf> Or maybe that's not what I think?
22:14:29 <ais523> shachaf: my suggestion is callq 0(%rip) in the executable, the dynamic linker edits the 0 to the actual distance when the executable is loaded
22:14:54 <ais523> that works well for executable → library calls as long as they're both in the first 31 bits (they typically will be)
22:15:08 <shachaf> But that requires editing .text such that it can't be shared, right?
22:15:12 <ais523> it doesn't work as well for library → library calls, though, because you'd ideally want to be able to store the libraries in a shared mapping
22:15:22 <ais523> right, that prevents a shared .text
22:15:23 <shachaf> Maybe you're saying that's irrelevant.
22:15:32 <ais523> I consider it insufficiently relevant
22:15:48 <shachaf> One thing I like being able to do is load a new copy of a .so at runtime repeatedly.
22:16:04 <shachaf> I guess that's still possible with that approach.
22:16:17 <shachaf> But it might require a page to be W|X.
22:16:21 <ais523> but if it's a problem, the next step would be to have a separate section that groups together jumps to functions, do calls by calling that section, and have the dynamic linker edit the jump targets
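The load-time fixup ais523 is describing amounts to patching the 32-bit displacement of a direct call once the real distance is known. A sketch under illustrative assumptions (the offsets and the callee address are made up; E8 is the direct near call opcode, whose displacement is counted from the end of the 5-byte instruction):

```python
import struct

CALL_REL32 = 0xE8  # direct near call: opcode + signed rel32, 5 bytes total

def patch_call(image, call_offset, target):
    """Rewrite the placeholder rel32 at call_offset so the call lands on
    target (both are offsets within the same mapped image).

    This mimics the load-time edit described above: the static linker
    emits a displacement of 0, and the loader fills in the real distance
    between caller and callee once both addresses are known.
    """
    assert image[call_offset] == CALL_REL32, "not a direct call"
    next_insn = call_offset + 5
    rel = target - next_insn            # relative to end of the call insn
    assert -2**31 <= rel < 2**31, "callee out of rel32 range"
    return (image[:call_offset + 1]
            + struct.pack("<i", rel)
            + image[next_insn:])

# A placeholder call at offset 0, aimed at a callee at offset 0x1000:
image = bytes([CALL_REL32, 0, 0, 0, 0]) + bytes(0x1000 - 5)
patched = patch_call(image, 0, 0x1000)
```

The downside, as the surrounding discussion notes, is exactly that this write dirties the page, so the patched .text can no longer be shared between processes.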
22:16:43 <shachaf> I was going to say that sounds like the PLT, but I guess the PLT has an extra level of indirection.
22:17:07 <ais523> I don't understand why it has that extra level
22:17:23 <shachaf> I think it's to allow lazy loading.
22:17:45 <shachaf> I think lazy loading is probably a bad idea, though.
22:18:23 -!- AnotherTest has quit (Ping timeout: 245 seconds).
22:18:23 <ais523> the remaining hard case is when one library (or the executable) refers to a global variable in another library (or the executable)
22:18:35 <ais523> my preferred solution to this would be simply to ban it :-D
22:18:47 <shachaf> Hmm, I use this in the aforementioned use case.
22:19:27 <shachaf> Specifically my program is made of two parts, a loader binary and an application .so.
22:19:29 <ais523> in cases like this, getter/setter methods would normally be a better API
22:19:58 <shachaf> The .so refers to some global variables defined in the loader to share state across reloads.
22:20:08 <shachaf> I suppose the loader could just pass it a pointer.
22:21:04 <ais523> ooh, I think I know why the PLT might be useful: it's to avoid invalidating function pointers into a library after it's reloaded
22:21:34 <shachaf> I think those are invalidated anyway.
22:21:47 <ais523> I guess they have to be
22:21:48 <shachaf> Well, I'm loading the library with dlopen.
22:22:05 <shachaf> (In the loader->library direction.)
22:27:15 -!- arseniiv has quit (Ping timeout: 248 seconds).
22:27:25 <ais523> anyway, my current hobby is being annoyed at compiler optimisers
22:27:37 <shachaf> One annoying thing about structuring the program this way is that the .so can't use any global variables (that survive across reloads).
22:27:54 <shachaf> Most of the global variables I'm importing from the loader really belong in the library anyway.
22:28:21 <ais523> unsigned short f(unsigned long long *p) { unsigned long long rp = *p; if ((rp & 0xFF00000000000000LLU) != 0x8000000000000000LLU) return 0; return (unsigned short)((rp ^ 0x8000000000000000LLU) >> 48); }
22:28:33 <ais523> ^ that is the test function that I've been working on optimising
22:28:53 <shachaf> I feel like, if you want to make your programs fast, optimizing compilers are rarely the right place to look.
22:29:22 <ais523> this compiles to 45 bytes with gcc -Os, 34 bytes with clang -Os
22:29:48 <ais523> I got it down to 13 bytes of hand-written asm
22:30:04 <shachaf> Oh man. That's a lot of bytes.
22:30:07 <ais523> the reason I'm annoyed is that I want to be able to write clear code and rely on compiler optimisations
22:30:52 <ais523> I had a go at trying to generate good asm via repeatedly hand-optimising the C file
22:31:18 <ais523> that got it down to 24 bytes on clang and 17 on gcc, but at the cost of requiring -fno-strict-aliasing
22:32:01 <b_jonas> "<ais523> haha, the PLT uses one format for the first 102261125 entries and changes to a different format from the 102261126th onwards" => I wonder if I should addquote that
22:32:48 <ais523> helps on clang, at least
22:33:05 <ais523> you can hit an LLVM optimiser idiom and get back to strictly conforming C, which is nice
22:33:06 <b_jonas> getter/setter methods? why not a function that returns the address of the variable instead?
22:33:31 <ais523> I guess that works in most cases
22:33:44 <ais523> I like getter/setter because it abstracts away the way in which the variable is stored internally
22:34:09 <ais523> but many of the cases you'd need that, e.g. if you need to move the variable to thread-local-storage, still let you take references to it
22:34:20 <ais523> `` printf "%x" 102261125
22:34:59 <b_jonas> ais523: isn't the extra indirection so that you can take a function pointer to a function in another library, and equality-compare it in C to the same function pointer taken from a third library, and have them compare equal?
22:35:06 <b_jonas> the extra indirection in the PLT that is
22:35:27 <shachaf> How does the PLT let you do that?
22:35:41 <b_jonas> ais523: you have made sure that you aren't using a too old compiler or an MS compiler, right?
22:36:06 <ais523> shachaf: ^ this is a really good example of my belief that immutable things should be treated as indistinguishable from values, with any references managed behind the scenes
22:36:52 <ais523> reference-== on immutable things is not a useful operation (except possibly as an optimisation hint) and if you accidentally expose it in your language semantics, a lot of contortions are needed to avoid breaking programs that rely on it
22:37:02 <b_jonas> ais523: for a thread-local variable, you call the function again each time you're not sure you're in the same thread. that's how errno works.
22:37:18 <ais523> in retrospect, errno was a mistake
22:37:23 <ais523> but it took a while for that to become apparent
22:37:32 <moony> errno is a big mistake
22:37:39 <b_jonas> ``` gcc -E - <<<$'#include<errno.h>\nerrno' | tail
22:37:40 <HackEso> # 25 "/usr/include/x86_64-linux-gnu/bits/errno.h" 2 3 4 \ # 50 "/usr/include/x86_64-linux-gnu/bits/errno.h" 3 4 \ \ # 50 "/usr/include/x86_64-linux-gnu/bits/errno.h" 3 4 \ extern int *__errno_location (void) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__)); \ # 36 "/usr/include/errno.h" 2 3 4 \ # 58 "/usr/include/errno.h" 3 4 \ \ # 2 "<stdin>" 2 \ (*__errno_location ())
22:38:00 <ais523> that "leaf" in the header file is curious
22:38:04 <b_jonas> moony: yes, and C99 at least partly gets rid of it by allowing such fast functions as sqrt to not necessarily set it
22:38:12 <ais523> when would a function need to know that it's /calling/ a leaf function?
22:38:47 <ais523> hmm, apparently it means "this function will never call a callback function, nor longjmp"
22:39:09 <ais523> which does seem potentially useful as an optimisation hint
22:42:20 <shachaf> So when foo is a dynamically linked function and it's used as a function pointer, what does that pointer point to?
22:43:15 <b_jonas> shachaf: I don't know. it might not even work the way I explained, maybe it points to different places depending on which library you take the function pointer in. I never tested it.
22:43:16 <shachaf> Apparently it's not the PLT entry.
22:44:17 <shachaf> I made a test program but now I gotta figure out what it's actually doing.
22:45:08 <shachaf> A PLT entry isn't even generated when you get a function pointer. Only when you call the function directly.
22:45:30 <shachaf> Which is pretty bizarre because "calling the function directly" nominally happens through a function pointer, in C.
22:46:34 <b_jonas> shachaf: does it really? I thought the function (reference) just decays to a function pointer when used as such, just like how an array decays to a pointer
22:46:47 <b_jonas> and when you call it, that doesn't happen
22:47:21 <shachaf> That's what the C standard says.
22:47:44 <b_jonas> not that that's an observable difference, and in any case, this is something that can and will be optimized
22:47:49 <shachaf> When I get back to the computer I'll test it more directly.
22:48:59 <b_jonas> but still, I think compilers will likely optimize it, because direct function calls are pretty common
22:49:10 <shachaf> Hmm, maybe it is observable, for the reason you said (making function pointers equal across libraries).
22:49:39 <shachaf> Since each library has its own PLT.
22:50:37 <b_jonas> shachaf: does C even guarantee the thing about making function pointers equal?
22:52:12 <b_jonas> I'm not sure you are even allowed to compare two function pointers without invoking undefined behavior unless one of them is a null pointer
22:52:57 <b_jonas> for equal comparisons, you don't get UB
22:53:21 <b_jonas> but I don't know if two function pointers to the same function are guaranteed to be equal, even within a compilation unit
22:53:25 <ais523> you can compare for equality, but not for < and >
22:53:44 <b_jonas> ais523: right, but what result do you get?
22:54:02 <b_jonas> does f == f have to return true if f is a function?
22:54:03 <ais523> who knows, this is C we're talking about :-D
22:54:19 <shachaf> Yes, with optimizations it turns into a PLT call in gcc.
22:55:03 <b_jonas> I think you can do equal comparisons on function pointers only because some APIs treat null pointers in a special way
22:55:27 -!- b_jonas has quit (Quit: leaving).
22:56:05 <shachaf> Yes, the function pointer comes from the GOT.
23:00:57 <shachaf> Ugh, dynamic linking is so complicated.
23:02:08 -!- Phantom_Hoover has quit (Ping timeout: 245 seconds).
23:09:40 <shachaf> I'm kind of annoyed that when you look up something about dynamic linking or ELF files or whatever most of the search results are about random programs that print error messages involving those things rather than anything about how those things actually work.
23:17:10 <shachaf> At least writing an emitter is probably way better than writing a loader (which needs to handle all the old things no one uses anymore).
23:17:15 -!- xkapastel has quit (Quit: Connection closed for inactivity).
23:19:13 <shachaf> I'm confused by the last paragraph in https://www.airs.com/blog/archives/549
23:19:21 <shachaf> "It is possible to design a statically linked PIE, in which the program relocates itself at startup time."
23:19:31 <shachaf> Why does the program need to relocate itself? Doesn't the kernel do that?
23:35:18 -!- sleepnap has quit (Quit: Leaving.).
23:36:56 <pikhq> I'd have to look to check, but I think this is "relocation" in the sense of "processing ELF symbol relocations", not in the sense of "mapping into memory"
23:37:34 <pikhq> Because a static PIE binary is more-or-less a shared object that happens to have an entry point and no external dependencies.
23:37:58 <shachaf> Why would ELF symbol relocations be relevant for a statically linked executable?
23:39:34 <pikhq> The executable's GOT and PLT needs populated.
23:40:15 <shachaf> What, are you from Pittsburgh now?
23:40:29 <shachaf> I don't see why a statically linked executable would have a GOT or PLT.
23:41:06 <pikhq> Because you've implemented a static PIE binary by emitting, essentially, a shared object that happens to have an entry point.
23:41:43 <shachaf> Yes, but that's just a technicality due to the kernel only randomizing ET_DYN files.
23:41:58 <shachaf> There's no need to emit a DYNAMIC segment.
23:43:58 <pikhq> Trying to remember exactly how musl does static PIE binary support...
23:45:00 <ais523> <rustc LLVM> movzwl -0x2(%rdi,%rax,1),%ecx \ mov %ecx,%edx \ xor $0x8060,%edx \ movzwl %dx,%edx
23:45:12 <ais523> that last movzwl is just some really obvious dead code
23:45:37 <ais523> the top 16 bits of %edx didn't stop being 0 just because you XORed the bottom 16
23:47:18 <ais523> I need to stop taking it personally when compilers generate ridiculous code, but still, that one really hurts
23:48:17 <ais523> meanwhile, clang optimised (x ^ 0x807F) >= 0x00A0 into (x ^ 0x8060) >= 0x00A0 which doesn't really change anything but is amusing
23:49:29 <pikhq> Looks like musl's implementation is that it links in the dynamic linker to static PIE binaries.
23:50:18 <pikhq> Or, rather, a portion of it.
23:50:56 <pikhq> https://git.musl-libc.org/cgit/musl/tree/ldso/dlstart.c It's this portion.
23:51:24 <pikhq> Just the startup code for the dynamic linker, not the full thing.
23:52:25 <pikhq> Oh, it's doing very very little.
23:52:34 <pikhq> Not even really processing relocations.
23:55:02 <ais523> later on there's also the beautiful "xor %r8d,%r8d \ movzbl %r8b,%eax" (with, AFAICT %r8 dying immediately afterwards)
23:59:39 -!- tromp_ has quit (Remote host closed the connection).