00:01:24 -!- Phantom_Hoover has quit (Read error: Connection reset by peer).
00:36:42 -!- Melvar has quit (Ping timeout: 268 seconds).
00:41:49 -!- Essadon has quit (Quit: Qutting).
01:06:06 -!- mniip has quit (Ping timeout: 630 seconds).
01:09:15 -!- Melvar has joined.
01:56:08 <ais523> I'm surprisingly annoyed at the lack of boxed-slice constructors in Rust
01:56:34 <ais523> you have to go via a temporary Vec if you want one (assuming that the size isn't known at compile time)
01:56:40 <ais523> even if you know the size at runtime
01:56:43 -!- nfd9001 has joined.
02:00:43 <ais523> even if you know the size at runtime
02:00:51 <ais523> whoops, meant to do that in my shell, not IRC
02:02:51 <zzo38> Do you know how to play poker with tarot cards or mahjong with pokemon?
02:05:10 -!- nfd9001 has quit (Ping timeout: 250 seconds).
02:05:22 -!- xkapastel has quit (Quit: Connection closed for inactivity).
02:08:48 <esowiki> [[+-]] https://esolangs.org/w/index.php?diff=58559&oldid=58522 * Cortex * (+69) /* Examples */
02:15:17 <esowiki> [[Omgrofl]] https://esolangs.org/w/index.php?diff=58560&oldid=58453 * Cortex * (-110)
02:15:39 <esowiki> [[Template:Dead memes]] N https://esolangs.org/w/index.php?oldid=58561 * Cortex * (+123) Created page with "'''WARNING''': This article contains high amounts of dead memes and outdated jokes. Please remember that before proceeding."
02:16:16 <esowiki> [[Surround notation]] https://esolangs.org/w/index.php?diff=58562&oldid=22152 * Cortex * (+16) /* Examples */
03:01:50 <paul2520> I've been meaning to find a use for mine.
03:04:43 <zzo38> There are a number of different games played with tarot. Originally the games were trick taking games, somewhat like whist or bridge, except there is one suit of permanent trumps. In some games the Fool is the highest trump, and in some other games, you can play it at any time even if you are able to follow suit but you will always lose the trick if you play it (it is sometimes called "Excuse" in this case).
03:06:27 <zzo38> (Even though it is either the highest trump or has no numerical value, still it seems it is commonly labeled zero. Someone told me that it is zero because it represents the beginning of a journey; I can see how, although that still does not seem a good enough reason to call it zero when its value in the game isn't actually zero.)
03:14:22 <zzo38> There are a few different games you can find, actually. If you have Latin-suited cards and the rules mention French-suited or vice-versa, you can use the following correspondences: spades=swords, clubs=rods/wands, diamonds=coins (sometimes they have pentagrams printed on them; it is fine if they do), hearts=cups
03:16:52 -!- sebbu has quit (Ping timeout: 246 seconds).
03:30:45 -!- mniip has joined.
03:34:35 -!- sebbu has joined.
04:37:32 -!- ais523 has quit (Remote host closed the connection).
04:38:45 -!- ais523 has joined.
04:51:14 -!- zzo38 has quit (Ping timeout: 246 seconds).
04:56:52 -!- zzo38 has joined.
05:10:02 -!- doesthiswork has quit (Quit: Leaving.).
05:29:30 <zzo38> O, good, I didn't miss any
05:38:51 -!- ais523 has quit (Quit: quit).
05:40:32 -!- zzo38 has quit (Ping timeout: 250 seconds).
05:40:42 -!- zzo38 has joined.
06:35:25 <esowiki> [[HARSH]] https://esolangs.org/w/index.php?diff=58563&oldid=58549 * ShareMan * (+1059) Fixed computation classification, and just generally made things cleaner, easier to understand, etc.
07:20:28 -!- arseniiv has quit (Ping timeout: 245 seconds).
07:50:06 -!- imode has quit (Ping timeout: 250 seconds).
08:07:24 -!- xkapastel has joined.
08:19:17 -!- doesthiswork has joined.
08:41:33 -!- oerjan has joined.
09:53:23 -!- Vorpal has quit (Ping timeout: 245 seconds).
09:58:07 -!- wob_jonas has joined.
10:04:14 -!- Vorpal has joined.
10:19:14 -!- doesthiswork has quit (Quit: Leaving.).
10:43:37 <wob_jonas> otppenztarak.hu has one of the worst authentication systems I've ever seen on a website
10:44:44 <wob_jonas> the start seems sane: you enter your email address, which identifies your account, and your password, which you have set earlier. the password has some stupid restrictions on it: I think it must contain at least two digits or something stupid like that.
10:45:08 <wob_jonas> but then, it sends you a one-time token in email that is valid for 10 minutes only. who the heck thinks that everyone can receive email within 10 minutes?
10:53:27 -!- oerjan has quit (Quit: Later).
11:34:10 <esowiki> [[User:Salpynx]] https://esolangs.org/w/index.php?diff=58564&oldid=58302 * Salpynx * (-119)
11:35:37 <esowiki> [[Language list]] https://esolangs.org/w/index.php?diff=58565&oldid=58534 * Salpynx * (+11) /* A */ A-DU
11:36:21 <wob_jonas> `bobadventureslist http://bobadventures.comicgenesis.com/d/20181204.html
11:36:22 <HackEso> bobadventureslist http://bobadventures.comicgenesis.com/d/20181204.html: b_jonas
11:37:04 <esowiki> [[Language list]] https://esolangs.org/w/index.php?diff=58566&oldid=58565 * Salpynx * (+15) /* G */
11:41:41 <esowiki> [[Joke language list]] https://esolangs.org/w/index.php?diff=58567&oldid=58426 * Salpynx * (+80) /* General languages */ 42
11:44:10 <wob_jonas> ais523: re rust boxed slice constructors, I was confused about that at first, but it turns out that there's a really good reason why it's done this way
11:45:21 <wob_jonas> ais523: the problem is that a slice or boxed slice has the type invariant that all the elements of the slice are initialized, and you can't initialize a general type, so you can't just make a general boxed slice constructor for any type
11:47:13 <wob_jonas> ais523: however, you can initialize the elements one by one from left to right, keeping track of how much is initialized. in fact, Vec does exactly that, and the rust std guarantees that a Vec allocates its contents the same way as a boxed slice. so there's a safe method that converts a Vec to a boxed slice, and it's guaranteed not to reallocate the memory if the Vec's size is equal to its capacity.
11:48:43 <wob_jonas> the Vec is effectively just a start pointer, a size, and a capacity; a boxed slice is effectively just a start pointer and a size; and that pointer, size, and capacity are local variables that the optimizer can see directly when it optimizes the code that fills the Vec.
11:50:36 <wob_jonas> So the general way to create a boxed slice is to create a Vec, possibly reserving the full capacity in advance if you know the size, then push the elements from left to right, then convert it to a boxed slice, which at that point is free. You can even abort partway by just throwing the Vec away.
11:51:42 <wob_jonas> There are some convenience abstractions around this in std, but perhaps not enough, and that part of your complaint might be valid.
11:51:53 <wob_jonas> But you can write any missing abstractions yourself.
11:52:22 <wob_jonas> If you want to initialize the slice in an order different from left to right, then you're somewhat screwed though.
11:52:29 <wob_jonas> There are ways to do that, but they're not easy.
11:53:06 <wob_jonas> And they're unsafe because there's no way to prove to the compiler that you kept track of which elements are initialized correctly.
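The Vec-to-boxed-slice route described above can be sketched as follows (a minimal example; the element type `String` and the helper name `make_boxed_slice` are just for illustration):

```rust
fn make_boxed_slice(n: usize) -> Box<[String]> {
    // Reserve the exact capacity up front so the final conversion
    // does not need to reallocate or copy.
    let mut v = Vec::with_capacity(n);
    for i in 0..n {
        // Initialize elements strictly left to right; Vec tracks
        // how many are initialized via its length field.
        v.push(format!("item {}", i));
    }
    // Free when len == capacity: Vec and Box<[T]> use the same
    // allocation layout, so this just drops the capacity field.
    v.into_boxed_slice()
}

fn main() {
    let s = make_boxed_slice(3);
    assert_eq!(s.len(), 3);
    assert_eq!(s[2], "item 2");
}
```

If the loop panics partway through, the partially filled Vec is simply dropped, which is the "abort by throwing the Vec away" case.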
11:53:16 -!- Lord_of_Life has quit (Ping timeout: 268 seconds).
11:55:52 -!- Lord_of_Life has joined.
11:55:52 -!- Lord_of_Life has quit (Changing host).
11:55:52 -!- Lord_of_Life has joined.
11:57:58 -!- zzo38 has quit (Ping timeout: 250 seconds).
12:02:06 <esowiki> [[APLBAONWSJAS]] https://esolangs.org/w/index.php?diff=58568&oldid=57861 * Salpynx * (+154) /* Interpreter */ Python3 interpreter silliness
12:03:50 <esowiki> [[APLBAONWSJAS]] M https://esolangs.org/w/index.php?diff=58569&oldid=58568 * Salpynx * (+37) /* In other languages */
12:08:38 <esowiki> [[User:Salpynx]] https://esolangs.org/w/index.php?diff=58570&oldid=58564 * Salpynx * (+133) /* Code for languages created by others */
12:12:45 <wob_jonas> ais523: I think if you want to initialize the elements in an order other than left to right, then you're screwed. I don't see a sane way to do that and get a Box<T> from it.
12:14:08 <wob_jonas> You can use a custom container, which uses Box<[std::mem::MaybeUninit<T>]> for allocation and deallocation, fill the elements in place in whatever order you want, and once you're sure that all elements are initialized to a valid state, you can expose the contents as a &mut [T]; you can implement this as an unsafe method that calls mem::transmute on the slice (not on the box)
12:16:50 -!- AnotherTest has joined.
12:21:59 <wob_jonas> nope, it's even worse if the type has a destructor
12:22:43 -!- AnotherTest has quit (Ping timeout: 244 seconds).
12:23:40 <wob_jonas> you need a Box<[std::mem::MaybeUninit<std::mem::ManuallyDrop<T>>]> for that
12:24:01 <esowiki> [[Grime MC]] https://esolangs.org/w/index.php?diff=58571&oldid=58464 * Salpynx * (+632) Notes on macros
12:33:11 <wob_jonas> but probably only for a year or two more, I think people already want to add a single wrapper for std::mem::MaybeUninit<std::mem::ManuallyDrop<T>> because it's too common
12:33:34 <wob_jonas> I bet you'll find one in std within two years
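A sketch of the out-of-order initialization discussed above, using `std::mem::MaybeUninit` under its stabilized name (the helper `fill_backwards` is invented for the example; the final pointer cast carries exactly the hand-checked "every slot is written" obligation described here):

```rust
use std::mem::MaybeUninit;

// Fill a boxed slice right to left, an order Vec cannot express.
fn fill_backwards(n: usize) -> Box<[u32]> {
    // Allocate n uninitialized slots with the Box<[T]> layout.
    let mut buf: Box<[MaybeUninit<u32>]> =
        (0..n).map(|_| MaybeUninit::uninit()).collect();
    for i in (0..n).rev() {
        buf[i] = MaybeUninit::new(i as u32); // write without reading/dropping
    }
    // Safety: all n slots are now initialized, and MaybeUninit<u32>
    // has the same layout as u32, so the slice may be reinterpreted.
    unsafe { Box::from_raw(Box::into_raw(buf) as *mut [u32]) }
}

fn main() {
    assert_eq!(&*fill_backwards(3), &[0, 1, 2]);
}
```

The compiler cannot verify the "all slots written" claim, which is why this has to be `unsafe`, matching the point made above.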
12:45:04 -!- Bowserinator has quit (Ping timeout: 268 seconds).
12:45:51 -!- Bowserinator has joined.
12:49:01 -!- AnotherTest has joined.
13:02:31 -!- arseniiv has joined.
13:02:57 -!- AnotherTest has quit (Ping timeout: 268 seconds).
13:35:55 -!- Essadon has joined.
13:36:10 -!- Essadon has quit (Max SendQ exceeded).
13:36:35 -!- Essadon has joined.
14:00:29 -!- doesthiswork has joined.
15:19:34 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58572&oldid=46762 * Gamer * (+15) /* Truth-machine */
15:19:53 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58573&oldid=58572 * Gamer * (+0) /* Truth-machine */
15:20:43 <esowiki> [[Element]] M https://esolangs.org/w/index.php?diff=58574&oldid=58573 * Gamer * (+0) /* Truth-machine */
15:22:22 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58575&oldid=58574 * Gamer * (-46) /* Nth Triangular Number */
15:22:40 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58576&oldid=58575 * Gamer * (-2) /* Digital root calculator */
15:26:45 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58577&oldid=58576 * Gamer * (-113) /* N Factorial */
15:28:15 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58578&oldid=58577 * Gamer * (+0) /* Nth Triangular Number */
15:33:30 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58579&oldid=58578 * Gamer * (-121) /* GCD of two numbers */
15:33:42 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58580&oldid=58579 * Gamer * (+24) /* N Factorial */
15:33:44 -!- siraben has left ("Kicked by @appservice-irc:matrix.org : issued !quit command").
15:33:55 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58581&oldid=58580 * Gamer * (+24) /* Digital root calculator */
15:36:29 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58582&oldid=58581 * Gamer * (+46) /* N Factorial */
15:37:06 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58583&oldid=58582 * Gamer * (+36) /* Truth-machine */
15:37:46 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58584&oldid=58583 * Gamer * (-40) /* GCD of two numbers */
15:39:24 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58585&oldid=58584 * Gamer * (-45) /* Nth Fibonacci number */
15:39:46 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58586&oldid=58585 * Gamer * (-34) /* GCD of two numbers */
15:42:12 <esowiki> [[Element]] https://esolangs.org/w/index.php?diff=58587&oldid=58586 * Gamer * (+76) /* Digital root calculator */
15:44:11 <myname> talk about exeggerating
15:55:22 -!- hexfive has joined.
16:11:12 -!- wob_jonas has quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client).
17:14:02 -!- zzo38 has joined.
18:29:13 <esowiki> [[4]] https://esolangs.org/w/index.php?diff=58588&oldid=46060 * Gamer * (-13) /* Hello World! */
18:49:51 <oren> DYK? "lockheed" is pronounced /lɒxid/
18:51:29 <shachaf> What's the c doing in there?
18:56:14 -!- Phantom_Hoover has joined.
19:05:56 -!- hexfive has quit (Quit: WeeChat 2.2).
19:10:42 <esowiki> [[Alphuck]] https://esolangs.org/w/index.php?diff=58589&oldid=57853 * Gamer * (-28) /* Hello, world! program */
19:18:25 -!- b_jonas has joined.
19:19:45 -!- arseniiv has quit (Ping timeout: 252 seconds).
19:31:19 -!- b_jonas has quit (Quit: Reconnecting).
19:31:35 -!- b_jonas has joined.
20:07:40 <b_jonas> ``` for x in 0 1; do quote; done
20:07:41 <HackEso> 120) <AnMaster> cpressey, oh go to zzo's website. He is NIH <Phantom_Hoover> AnMaster, really? I was strongly under the impression that zzo was invented here. \ 157) <oklopol> well i just ate some stuff and watched family guy <oklopol> and i own a piano <oklopol> and i'm not wearing socks
20:07:52 <b_jonas> ``` for x in 0 1; do wisdom; done
20:07:53 <HackEso> afk//Afk wrote a famous story about hang. \ lunacy//LUNacy is wisdom generated by a neu^Wlayered unit net. Ask Warrigal for details.
20:17:46 <HackEso> 2 quote/1:/bin/sed: -e expression #1, char 4: extra characters after command \ /hackenv/bin/n: line 1: 2 quote: syntax error in expression (error token is "quote")
20:20:16 <HackEso> 1/1:412) <NihilistDandy> Non sequitur is my forte <NihilistDandy> On-topic discussion is my piano <Taneb> Bowls of sugary breakfast cereal is my mezzoforte <Taneb> Full fat milk is my pianissimo <Taneb> On which note, I'm hungry \ 937) <Bike> Are you sure this isn't the Sims <kallisti> can you get married to your variables? <kallisti> this is a feature I find lacking in most languages
20:26:33 -!- Melvar has quit (Quit: rebooting).
20:41:26 -!- Melvar has joined.
20:44:40 -!- b has joined.
20:44:48 -!- b has quit (Client Quit).
21:20:10 -!- tromp has quit (Remote host closed the connection).
21:20:47 -!- tromp has joined.
21:22:02 -!- ais523 has joined.
21:23:01 <ais523> does anyone know of a data structure which lets me do this?: we can add elements to the structure and it records when each element was added (new additions replace earlier additions of the previous element), and we can efficiently discover which elements have been added since a specific element was last added
21:23:28 <ais523> something like a linked deque + an index into it would work, I think, but I'm wondering if there's anything simpler
21:37:12 -!- gurmble has changed nick to grumble.
21:41:17 -!- tromp has quit (Remote host closed the connection).
21:41:28 -!- tromp has joined.
21:43:42 <ais523> on another note, how many languages use "if … else" (with no "then" clause) for what Perl/Ruby call "unless"?
21:46:53 -!- AnotherTest has joined.
21:48:51 <FireFly> huh, that's an interesting syntactic idea, and makes enough sense
21:51:38 -!- zzo38 has quit (Ping timeout: 250 seconds).
21:51:53 <oren> if(x == 30);else{ }
21:52:56 <ais523> yes, it's easy enough to stick a dummy "then" part in
21:57:15 <b_jonas> ais523: hi. see logs for today about your rust question.
21:58:32 <b_jonas> ais523: I don't see how a linked deque and index would work, you'd have to update a lot of indexes. a heap and indexes could work, because you can delete any element and you only have to update log(size) number of indexes.
21:59:08 <b_jonas> hm wait, what operations exactly do you want to do
22:01:32 <ais523> the index would be a map from an element to the object representing that element inside the deque (that's why it has to be a /linked/ deque)
22:02:11 <ais523> the operations I need that can't trivially be abstracted out into a separate structure are adding an element to the structure, and getting a list of all elements added more recently than a specific element
22:02:25 <ais523> also if I add the same element repeatedly, earlier additions should be forgotten rather than leaking memory
22:02:41 <ais523> in case it helps, I know all possible elements that could be added in advance
22:03:04 <ais523> elements can be compared for equality easily (in the situation I want to use this they're just integers)
22:03:22 <ais523> I was hoping that there'd be some existing structure which supports this so that I could look for an implementation in libraries
22:03:51 <ais523> re: the Rust question discussion in logs, I was expecting something that created an /initialised/ boxed slice, e.g. by copying a given element a given number of times (like vec! does)
22:05:07 <b_jonas> ais523: oh, you mean a deque plus a dictionary with the element value as key and an iterator into the deque as the value. that works, I was confused by "index" then
22:05:24 <b_jonas> a linked list deque that is
22:05:29 <b_jonas> so that the elements have stable nodes
22:05:32 <ais523> I meant it in the sense of "database index"
22:05:44 <ais523> of course, databases are good at this, because they're basically a general-purpose data structure
22:05:51 <ais523> but it feels like something simpler should work too
22:05:57 <esowiki> [[+-]] https://esolangs.org/w/index.php?diff=58590&oldid=58559 * Cortex * (+34293) /* Examples */
22:06:11 <ais523> (also I'm writing this program in Rust, and a linked deque is one of the simplest things that Rust is /really/ bad at…)
22:06:29 <b_jonas> I was thinking of a heap (such as a binary or quaternary heap) and a dictionary from the key to the index into the heap. I know you can efficiently maintain a dictionary from the element to the index in a binary heap, I've implemented that.
22:06:34 <b_jonas> it's not clear which of these is better here
22:07:36 <b_jonas> ais523: no way. you never delete elements, so you can put the deque nodes in a Vec that will never shrink, so you never waste memory, and use indexes (even 32-bit ones) into that Vec for the links.
22:08:08 <ais523> b_jonas: oh, that does work, it's weird to think about using indexes rather than pointers though
22:08:22 <ais523> especially as the bounds checks happen at runtime not compile-time
22:08:29 <b_jonas> using indexes is generally a good idea because they're easy to automatically bounds-check, which catches some (not all) silly pointer mistakes, and they let you use 32-bit indexes in a 64-bit address space
22:08:49 <b_jonas> no need to allocate each node separately
22:09:30 <b_jonas> this is the main reason why I think x32 is a silly idea. if in the structures or loops where you really need to optimize for performance, you just use 32-bit indexes, then the 64-bit pointers never bother you.
22:10:03 <b_jonas> admittedly the rust std isn't too well-suited for that, but ideally if you wrote everything in machine language, that would work, and rust or C++ allow good enough approximations.
22:11:05 <b_jonas> rust is at a slight disadvantage here, but in the very rare cases when that causes a performance problem, you just optimize the inner loop by reimplementing it in something that isn't rust.
22:11:19 <b_jonas> or give dirty hints to rust to speed it up
22:11:47 <ais523> doesn't using pointers directly, rather than index + base pointer, reduce register pressure?
22:11:51 <b_jonas> probably eventually in future rust that won't be necessary, they just add enough stuff in the compiler and std
22:12:18 <b_jonas> ais523: perhaps, but usually register pressure is less of a problem on x86_64 than L1 cache and L2 cache throughput
22:12:25 <b_jonas> and decoder throughput and stuff like that
22:12:46 <b_jonas> especially on modern x86_64
22:13:49 <ais523> now I'm wondering if simple dereferences like mov [reg], reg take up less instruction-cache pressure than complex ones like mov [base+index*4], reg
22:13:59 <b_jonas> and register throughput problems are something that has some chance of automatically fixing itself without changing your code when you install a new compiler with a better optimizer, whereas cache throughput is less likely to do that
22:14:33 <b_jonas> ais523: that really depends on the compiler
22:15:34 <b_jonas> but you aren't likely to get [base+index*4] or [base+index*8] indexes here unless you premultiply your indexes or use avx512, because your element size is greater than 8 bytes
22:15:38 <ais523> what's "that" referring to? if it's my most recent comment, I'm talking about at the machine code → micro-op level of abstraction, well below the compiler
22:16:35 <b_jonas> instead, you'll get a silly 64-bit shift instruction to do the multiplication, and you won't be able to convince rust to optimize that to a 32-bit multiplication, because you can't easily tell the compiler that the array size is less than 2^32 bytes.
22:17:15 <ais523> oh, I see, you're claiming that the more condensed instructions might not even be generatable
22:17:18 <b_jonas> "that" is referencing "register throughput problems"
22:17:54 <b_jonas> ais523: they might be if you premultiply your indexes or use avx512, but you rarely care that much about performance
22:18:08 <b_jonas> microoptimize your code only when it's really the bottleneck
22:21:51 <b_jonas> I'd like to note that in this case, since you always insert new elements whose weights are larger than every existing element's, inserting a new element takes constant time: you're not moving anything, just pushing an element with the next weight onto the end of the array
22:22:04 <b_jonas> you probably need to keep a counter that tracks the next weight separately
22:22:36 <b_jonas> if you rarely update elements, then that's the simplest one, but since you didn't mention deleting elements, I think you want to update elements often
22:23:19 <b_jonas> mind you, adding a new element to the front of a deque is about as simple
22:23:37 <b_jonas> and the dictionary insert has about the same cost in either case
22:23:53 <ais523> you can't just keep inserting at the end of an array
22:23:57 <b_jonas> the heap won't work at all
22:24:00 <ais523> because then you'll end up growing the array forever
22:24:35 <b_jonas> ais523: but you need to grow the array when you insert a new element, as opposed to updating an element with an existing key
22:25:27 <b_jonas> ok, I think you're right. use a deque
22:27:48 <b_jonas> hmm wait, there's a trickier way
22:28:59 <ais523> I've been looking for existing Rust modules which have the right functionality; one of them uses a really clever trick
22:29:15 <ais523> the elements I'm adding are small integers
22:29:49 <ais523> and this data structure is designed for them; it basically has an array which stores the next integer and previous integer for each integer n at the nth slot
22:29:53 <b_jonas> ais523: trickier way, slower but less memory: use a singly linked list with pointers towards older elements, and a dictionary whose values are iterators to the element before the keyed one. when you remove an element, look up the next element by its value in the dictionary, and update that dict entry too, as well as the dict entry for the element you move to the front.
22:30:28 <ais523> next /element/, previous /element/ rather than next /pointer/, previous /pointer/ (or simulating pointers using indexes) simplifies things a lot
22:30:42 <b_jonas> ais523: how many elements do you expect to have? if it's very small, like 8, then there are tricks
22:31:09 <ais523> it scales based on the size of user input
22:31:10 <b_jonas> ais523: that works only if your elements (keys) are of small size
22:31:18 <ais523> but it's consecutive from 0
22:31:28 <b_jonas> ais523: ok, so it's not like always less than 16
22:31:42 <ais523> right, in practice it'll usually be small but I don't want a hard limit
22:32:05 <ais523> (the values themselves are `usize`s)
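The next-element/prev-element trick described above might look roughly like this in Rust (the names `Recency`, `touch`, and `added_since` are invented for the sketch, not taken from any crate):

```rust
const NIL: usize = usize::MAX; // sentinel: "no neighbour"

/// Recency list over the integers 0..n: slot i of the prev/next
/// arrays belongs to element i itself, so no separate nodes,
/// allocations, or pointers are needed.
struct Recency {
    prev: Vec<usize>,
    next: Vec<usize>,
    in_list: Vec<bool>,
    head: usize, // most recently added element, NIL if empty
}

impl Recency {
    fn new(n: usize) -> Self {
        Recency { prev: vec![NIL; n], next: vec![NIL; n],
                  in_list: vec![false; n], head: NIL }
    }

    /// Record that `x` was just added; any earlier addition of `x`
    /// is unlinked first, so memory stays O(n) forever.
    fn touch(&mut self, x: usize) {
        if self.in_list[x] {
            let (p, nx) = (self.prev[x], self.next[x]);
            if p != NIL { self.next[p] = nx; } else { self.head = nx; }
            if nx != NIL { self.prev[nx] = p; }
        }
        // push x at the front (newest end)
        self.prev[x] = NIL;
        self.next[x] = self.head;
        if self.head != NIL { self.prev[self.head] = x; }
        self.head = x;
        self.in_list[x] = true;
    }

    /// Elements added more recently than `x`, newest first; runs in
    /// time proportional to the size of the answer.
    fn added_since(&self, x: usize) -> Vec<usize> {
        let mut out = Vec::new();
        let mut cur = self.head;
        while cur != NIL && cur != x {
            out.push(cur);
            cur = self.next[cur];
        }
        out
    }
}

fn main() {
    let mut r = Recency::new(5);
    r.touch(0); r.touch(1); r.touch(2);
    r.touch(0); // re-add: the earlier addition of 0 is forgotten
    assert_eq!(r.added_since(2), vec![0]);
    assert_eq!(r.added_since(1), vec![0, 2]);
}
```

Because the elements index the arrays directly, both `touch` and the per-result cost of `added_since` are O(1), with no hashing and no per-node allocation.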
22:32:36 <b_jonas> ais523: there's a crazy structure used in hardware caches to handle the 8-way caching that stores binom(8,2) bits and lets the hardware update one of the eight slots to make it the newest one, and find the oldest slot so it knows which element to drop when it has to insert a new one
22:33:26 <ais523> I probably won't use that, but am curious about how it works
22:33:45 <b_jonas> ais523: if it's between 8 and 64, then you might not care about asymptotic performance, and implement this as a simple vector of the integer values where you search the value with a linear search, and memmove part of the array to insert or delete something
22:34:08 <b_jonas> ais523: there's a bit for each pair of the 8 elements, and it tells whether the first one is newer than the second
22:35:17 <b_jonas> inserting a new element to a slot needs to update 8 of those 28 bits, which is somewhat easy and fast in hardware in this case
22:35:45 <ais523> oh yes, I can see how that would work now
22:36:33 <b_jonas> so much so that I guess the bottleneck is not that age array, but the dispatch logic to find the one of the eight elements matching the 6-bit address
22:36:53 <ais523> to be honest a linear search would probably be fast enough for me too, but it was a set of performance properties I hadn't seen before in a data structure, so it prompted me to ask here
22:38:05 <b_jonas> you need to do that address compare in each of the eight elements, then three layers of if-else to forward one of them, and then a comparison to the upper bits of the address when it arrives from the page table cache (it's not called "page table cache", it has a fancier name that I keep forgetting, but that's what it really is)
22:38:24 <b_jonas> and all of that must go through within a clock cycle ideally
22:38:36 <b_jonas> you get slightly more slack in the L2 cache
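The pairwise-bit age matrix described above can be modelled in software like this (a sketch only; real hardware updates all the bits in parallel, and the names `AgeMatrix`, `touch`, and `oldest` are illustrative):

```rust
/// One bit per unordered pair of the 8 slots (28 meaningful bits):
/// newer[i][j] with i < j records "slot i was used more recently
/// than slot j". Initially every bit is 0, a consistent total order.
struct AgeMatrix {
    newer: [[bool; 8]; 8], // only entries with i < j are used
}

impl AgeMatrix {
    fn new() -> Self { AgeMatrix { newer: [[false; 8]; 8] } }

    /// Mark slot k most recently used: exactly 7 bit writes, which
    /// is what makes this cheap to do in hardware.
    fn touch(&mut self, k: usize) {
        for j in k + 1..8 { self.newer[k][j] = true; }  // k newer than every j > k
        for i in 0..k { self.newer[i][k] = false; }     // no i < k is newer than k
    }

    /// The slot that every other slot is newer than: the LRU victim.
    fn oldest(&self) -> usize {
        (0..8)
            .find(|&k| (0..k).all(|i| self.newer[i][k])
                && (k + 1..8).all(|j| !self.newer[k][j]))
            .unwrap()
    }
}

fn main() {
    let mut m = AgeMatrix::new();
    for k in 0..8 { m.touch(k); } // use every slot in order
    assert_eq!(m.oldest(), 0);    // slot 0 is now least recently used
    m.touch(0);
    assert_eq!(m.oldest(), 1);
}
```

Since each `touch` preserves a total order on the slots, `oldest` always has a unique answer, unlike approximate LRU schemes that only track a few age bits per slot.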
22:38:56 <ais523> this is a program I'm heavily optimising partly just for practice/curiosity/the sake of it (it's an esolang interpreter, people hyperoptimise those for brainfuck…)
22:39:06 -!- AnotherTest has quit (Ping timeout: 272 seconds).
22:40:24 <b_jonas> first you do a 64-way dispatch to the right set of 8 cache entries using the 6 middle bits of the address, and THEN, when the address arrives from the page table, only then can you compare the upper address bits to the eight slots and do the eight-way dispatch
22:40:56 <b_jonas> that's why it's hard to make the latency low, since the page table cache also has a latency
22:41:11 <ais523> memory latency is a huge problem for today's computers
22:43:08 <b_jonas> I think the big problem is that until we design an entirely new architecture where we can be sure that no software depends on pages being allowed to be 4K-sized, we can't have an L1 cache larger than 32K (4K times eight ways). we've had CPUs with 32K L1 data cache and 32K L1 code cache for almost a decade, even though the operating systems mostly want to use 8K pages,
22:43:33 <b_jonas> and we can't fix this while we want x86 compatibility.
22:43:56 <b_jonas> it's one of these stupid historical things. 4k pages made sense back in 386 when computers had a few megabytes of memory.
22:44:09 <b_jonas> sometimes just one or two megabytes
22:44:28 <ais523> the issue isn't really so much the size of the pages, as the minimum granularity of the page table
22:44:38 <b_jonas> and larger L1 caches would help a lot
22:44:39 <esowiki> [[Doreq]] M https://esolangs.org/w/index.php?diff=58591&oldid=58377 * Unitan * (+35)
22:44:54 <b_jonas> ais523: what do you mean by the granularity? the cache line size? I think that's fine being 64 bytes as is.
22:44:59 <ais523> if you said "the pages are 512 bytes but you can only map them in units of megabytes" people would be fine with that
22:45:05 <ais523> b_jonas: minimum size/alignment of a page map
22:45:29 <b_jonas> ais523: if you can only map them in units of megabytes, then what would the 512 byte mean? in what sense would it be a page?
22:46:40 <ais523> presumably it'd be used for things like bounds checks
22:47:00 <b_jonas> ais523: hmm, so you'd have a size entry in each page? why's that useful?
22:47:12 <b_jonas> I mean, it could be done, but I don't see the point
22:47:50 <ais523> well, what's the purpose of a page?
22:48:09 <ais523> sometimes it's for memory allocation purposes, sometimes it's to get useful side effects from the page faults
22:48:18 <ais523> you could divorce the two, I think
22:48:39 <b_jonas> ais523: for a cpu page, which is what we're talking about here, it's mapping from virtual memory address space (of a process) to "physical" memory address space in a way that the cpu dispatches the mapping automatically at every ordinary memory access
22:49:45 <b_jonas> "physical" address ideally means what's sent to the main memory, but it could be different if there's virtualization or legacy 32 bit large memory shenanigans involved
22:50:30 <ais523> right, and I don't see why virtual→physical mapping would need a small granularity nowadays unless you're intentionally overlapping pages in weird ways
22:51:33 <b_jonas> ais523: anyway, if you add a size field to a page, and you set it to less than maximum size, then you're probably wasting the rest of the physical page, because it's unlikely that you'll quickly find another partial page that you can allocate in the same place that has just the right page size alignment
22:52:23 <ais523> b_jonas: OK, so what about a setup that works like this: you can set up virtual→physical maps which have a large minimum size and alignment, but you can "mask" parts of a virtual page so that they pagefault when accessed
22:52:31 <ais523> rather than going to the physical page behind
22:52:37 <b_jonas> it could perhaps be useful for catching some stack overflows
22:52:57 <ais523> then you can effectively split a physical page between multiple virtual pages as long as the low bits in the virtual addresses are all distinct from each other
22:53:00 <b_jonas> but I don't think that's likely useful
22:53:17 <b_jonas> it would just mean that we have to automatically extend the stack more often than once every few pages
22:53:54 <b_jonas> ais523: make part of a page masked to unreadable? perhaps, but I don't really think it's worth the cost.
22:54:32 <ais523> b_jonas: well, the point is you have to pay this cost anyway when you have small pages
22:54:38 <ais523> so this is saving part of the cost
22:54:44 <b_jonas> ais523: I mean, then you'll need extra bits in the page table cache, which is something that has to be really close to the L1 cache and core on the cpu die for latency
22:54:55 <ais523> and it's definitely useful to be able to mask, say, the code section and data section of an executable map differently
22:55:24 <b_jonas> ais523: not really. if you actually use smaller pages, rather than just allow smaller pages, then you pay the page table cost
22:56:03 <ais523> make it so that the high bits of a pointer specify what sort of accesses can be used via it; then instead of having permissions specified separately in the page table, they're specified indirectly by what virtual address you mapped the physical page to
22:56:09 <b_jonas> ais523: but currently, the cpu allows 4k pages, but most pages are actually larger, megabyte sized (we don't have 8K or 16K pages on x86_64 I think, but we could in the future), which is better because they take up less space in the page table cache
22:56:19 <b_jonas> well, not yet, but OSes are getting better with large pages
22:56:30 <b_jonas> eventually we will be using mostly megabyte-sized pages on x86_64
22:56:46 <b_jonas> and then you don't pay for the cost of the small pages in the page table cache, but only in the L1
22:57:32 <b_jonas> and once you require larger pages, with no mask bits, you have a page table cache that is about as efficient as currently, only slightly more because it doesn't handle small pages at all, and an L1 cache that can finally break the 32 kilobyte barrier
22:58:08 <b_jonas> it's still tricky, you can't just increase the L1 cache to as large as you want, because that also increases the latency of that cache
22:59:24 <b_jonas> you might even end up with an L1 cache and an L1.5 cache, the L1.5 cache being slower than the L1 cache, but still works like the current L1 cache in that it can tell which group of 8 elements to use before the page table cache tells the address
23:00:14 <b_jonas> you might even drop the L2 cache, and have only an L1 cache, and L1.5 cache, and what's currently called an L3 cache
23:00:37 <b_jonas> at this point I just don't have the hardware competence
23:02:40 <ais523> I think the future of hardware is mostly moving towards trying to make the memory structure less general so that it can be faster in special cases
23:03:04 <ais523> for example, GPUs typically use batch memory transfer mechanisms and suspend threads while waiting for the memory values to arrive
23:03:21 <ais523> and NUMA allows different CPU cores to have different RAM
23:04:16 <b_jonas> I'm a software guy who wants to optimize stuff on existing or near future consumer PC cpus, and only trying to understand cpus and compilers from that direction
23:05:32 <b_jonas> ais523: sure, the point of NUMA is that if each half of the cpu is accessing the RAM that's close to it, then the two halves can do memory transfers faster, and possibly with slightly lower latency too
23:06:50 <b_jonas> this is useful if you have workloads that parallelize well twofold, as in, the two parts are very independent in what memory they use, which is common; then you get higher memory performance with cheaper hardware
23:07:08 <b_jonas> and it's common to have tasks that partition to two easily
23:07:53 <b_jonas> I've used a 2-way numa server to compress several videos in parallel, the bulk of each video compression runs on one core
23:08:08 <b_jonas> one cpu core that is, with some gpu magic involved too
23:09:00 <b_jonas> and the raw video is read to the half of the main RAM that compresses that video, and then the intermediate storage used for the compression is in that half too
23:09:31 <b_jonas> I sort of have to trust the OS to do the Sane Thing in the common case
23:11:33 <b_jonas> compressing half of several videos still takes so much memory that it probably overflows the L3 cache
23:12:03 <ais523> CPU → GPU memory transfers are really really slow (the other direction is IIRC even slower, but less useful)
23:14:10 <b_jonas> ais523: sure, but for video compression it's still worth it, because a part of the video compression can be really sped up by a GPU
23:14:26 <b_jonas> assuming you have a modern GPU of course
23:14:57 <b_jonas> it's the kind of task GPUs are optimized for
23:19:11 <b_jonas> and I think if you do both the finding the nearby similar square in previous frames thing on the GPU and the discrete cosine/wavelet/etc transform on the GPU too, then you can keep all the data in the deeper parts of the GPU between those two
23:20:21 <b_jonas> so ideally you only transfer each frame to GPU only once and do all the tasks there, then transfer the result back to CPU which does further processing on it and writes the file
23:21:05 <b_jonas> there are some other operations besides those two, like possibly partly decoding the frames from the input format if it's already compressed, subtraction, color space transform, etc
23:21:48 <b_jonas> or even resizing to lower resolution or other operations done at the same time as the encoding
23:22:07 <b_jonas> or, heck, deinterlacing and resizing to lower resolution
23:40:29 -!- Phantom_Hoover has quit (Read error: Connection reset by peer).
23:54:42 -!- Lord_of_Life has quit (Ping timeout: 250 seconds).
23:56:24 -!- Lord_of_Life has joined.
23:56:24 -!- Lord_of_Life has quit (Changing host).
23:56:24 -!- Lord_of_Life has joined.
23:59:18 -!- sleepnap has quit (Quit: Leaving.).