00:01:24 -!- Phantom_Hoover has quit (Read error: Connection reset by peer). 00:36:42 -!- Melvar has quit (Ping timeout: 268 seconds). 00:41:49 -!- Essadon has quit (Quit: Qutting). 01:06:06 -!- mniip has quit (Ping timeout: 630 seconds). 01:09:15 -!- Melvar has joined. 01:56:08 I'm surprisingly annoyed at the lack of boxed-slice constructors in Rust 01:56:34 you have to go via a temporary Vec if you want one (assuming that the size isn't known at compile time) 01:56:40 even if you know the size at runtime 01:56:43 -!- nfd9001 has joined. 02:00:43 even if you know the size at runtime 02:00:51 whoops, meant to do that in my shell, not IRC 02:02:51 Do you know how to play poker with tarot cards or mahjong with pokemon? 02:05:10 -!- nfd9001 has quit (Ping timeout: 250 seconds). 02:05:22 -!- xkapastel has quit (Quit: Connection closed for inactivity). 02:08:48 [[+-]] https://esolangs.org/w/index.php?diff=58559&oldid=58522 * Cortex * (+69) /* Examples */ 02:15:17 [[Omgrofl]] https://esolangs.org/w/index.php?diff=58560&oldid=58453 * Cortex * (-110) 02:15:39 [[Template:Dead memes]] N https://esolangs.org/w/index.php?oldid=58561 * Cortex * (+123) Created page with "'''WARNING''': This article contains high amounts of dead memes and outdated jokes. Please remember that before proceeding." 02:16:16 [[Surround notation]] https://esolangs.org/w/index.php?diff=58562&oldid=22152 * Cortex * (+16) /* Examples */ 03:00:36 Do you like this? 03:01:36 poker with tarot, interesting 03:01:50 I've been meaning to find a use for mine. 03:04:43 There are a number of different games played with tarot. Originally the games were trick taking games, somewhat like whist or bridge, except there is one suit of permanent trumps. In some games the Fool is the highest trump, and in some other games, you can play it at any time even if you are able to follow suit but you will always lose the trick if you play it (it is sometimes called "Excuse" in this case). 
03:06:27 (Even though it is either the highest trump or has no numerical value, still it seems it is commonly labeled zero. Someone told me that it is zero because it represents the beginning of a journey; I can see how, although that still does not seem a good enough reason to call it zero when its value in the game isn't actually zero.) 03:14:22 There are a few different games you can find, actually. If you have Latin-suited cards and the rules mention French-suited or vice-versa, you can use the following correspondences: spades=swords, clubs=rods/wands, diamonds=coins (sometimes they have pentagrams printed on them; it is fine if they do), hearts=cups 03:16:52 -!- sebbu has quit (Ping timeout: 246 seconds). 03:24:26 very interesting 03:30:45 -!- mniip has joined. 03:34:35 -!- sebbu has joined. 04:37:32 -!- ais523 has quit (Remote host closed the connection). 04:38:45 -!- ais523 has joined. 04:51:14 -!- zzo38 has quit (Ping timeout: 246 seconds). 04:56:52 -!- zzo38 has joined. 04:58:25 Hello 05:10:02 -!- doesthiswork has quit (Quit: Leaving.). 05:28:44 Hello 05:28:50 Sorry 05:29:30 O, good, I didn't miss any 05:38:51 -!- ais523 has quit (Quit: quit). 05:40:32 -!- zzo38 has quit (Ping timeout: 250 seconds). 05:40:42 -!- zzo38 has joined. 06:35:25 [[HARSH]] https://esolangs.org/w/index.php?diff=58563&oldid=58549 * ShareMan * (+1059) Fixed computation classification, and just generally made things cleaner, easier to understand, etc. 07:20:28 -!- arseniiv has quit (Ping timeout: 245 seconds). 07:50:06 -!- imode has quit (Ping timeout: 250 seconds). 08:07:24 -!- xkapastel has joined. 08:19:17 -!- doesthiswork has joined. 08:41:33 -!- oerjan has joined. 09:10:35 oh they've removed +r 09:53:23 -!- Vorpal has quit (Ping timeout: 245 seconds). 09:58:07 -!- wob_jonas has joined. 10:04:14 -!- Vorpal has joined. 10:19:14 -!- doesthiswork has quit (Quit: Leaving.). 
10:43:37 otppenztarak.hu has one of the worst authentication systems I've ever seen on a website 10:44:44 the start seems sane: you enter your email address, which identifies your account, and your password, which you have set earlier. the password has some stupid restrictions on it: I think it must contain at least two digits or something stupid like that. 10:45:08 but then, it sends you a one-time token in email that is valid for 10 minutes only. who the heck thinks that everyone can receive email in 10 minutes? 10:53:27 -!- oerjan has quit (Quit: Later). 11:34:10 [[User:Salpynx]] https://esolangs.org/w/index.php?diff=58564&oldid=58302 * Salpynx * (-119) 11:35:37 [[Language list]] https://esolangs.org/w/index.php?diff=58565&oldid=58534 * Salpynx * (+11) /* A */ A-DU 11:36:21 `bobadventureslist http://bobadventures.comicgenesis.com/d/20181204.html 11:36:22 bobadventureslist http://bobadventures.comicgenesis.com/d/20181204.html: b_jonas 11:37:04 [[Language list]] https://esolangs.org/w/index.php?diff=58566&oldid=58565 * Salpynx * (+15) /* G */ 11:41:41 [[Joke language list]] https://esolangs.org/w/index.php?diff=58567&oldid=58426 * Salpynx * (+80) /* General languages */ 42 11:44:10 ais523: re rust boxed slice constructors, I was confused about that at first, but it turns out that there's a really good reason why it's done this way 11:45:21 ais523: the problem is that a slice or boxed slice has the type invariant that all the elements of the slice are initialized, and you can't initialize a general type, so you can't just make a general boxed slice constructor for any type 11:47:13 ais523: however, you can initialize the elements one by one from left to right, keeping track of how much is initialized. in fact, Vec does exactly that, and the rust std guarantees that a Vec allocates the contents the same way as a boxed slice.
so there's a safe method that converts a Vec to a boxed slice that is guaranteed not to reallocate the memory if the Vec's size is equal to its capacity. 11:48:43 the vec is effectively just a start pointer, a size and a capacity; and a boxed slice is effectively a start pointer and a size; and this pointer and size and capacity are local variables that the optimizer can see directly when optimizing your program when you fill a vec. 11:50:36 So the general way to create a boxed slice is to create a Vec, possibly reserving the full capacity up front if you know the size in advance, then push the elements from left to right, then convert it to a boxed slice, which is at that point free. You can even abort by throwing away the vec. 11:51:42 There are some convenience abstractions around this in std, but perhaps not enough, and that part of your complaint might be valid. 11:51:53 But you can write any missing abstractions yourself. 11:52:22 If you want to initialize the slice in an order different from left to right, then you're somewhat screwed though. 11:52:29 There are ways to do that, but they're not easy. 11:53:06 And they're unsafe because there's no way to prove to the compiler that you kept track of which elements are initialized correctly. 11:53:16 -!- Lord_of_Life has quit (Ping timeout: 268 seconds). 11:55:52 -!- Lord_of_Life has joined. 11:55:52 -!- Lord_of_Life has quit (Changing host). 11:55:52 -!- Lord_of_Life has joined. 11:57:58 -!- zzo38 has quit (Ping timeout: 250 seconds).
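The push-then-convert pattern described above can be sketched as a minimal Rust example (the function name here is made up for illustration; `Vec::into_boxed_slice` is the safe conversion being discussed, and it does not reallocate when the length equals the capacity):

```rust
// Build a Box<[u64]> whose size is only known at runtime, going via a Vec.
fn boxed_slice_of_squares(n: usize) -> Box<[u64]> {
    // Reserve the exact capacity up front so len == capacity at the end,
    // making the final conversion free (no reallocation, no copy).
    let mut v: Vec<u64> = Vec::with_capacity(n);
    for i in 0..n {
        v.push((i as u64) * (i as u64)); // initialize left to right
    }
    v.into_boxed_slice()
}

fn main() {
    let s = boxed_slice_of_squares(4);
    assert_eq!(&*s, &[0, 1, 4, 9][..]);
    println!("{:?}", s);
}
```

If the loop is abandoned partway through, the Vec is simply dropped and nothing unsafe has happened, which is the "abort by throwing away the vec" point above.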
12:02:06 [[APLBAONWSJAS]] https://esolangs.org/w/index.php?diff=58568&oldid=57861 * Salpynx * (+154) /* Interpreter */ Python3 interpreter silliness 12:03:50 [[APLBAONWSJAS]] M https://esolangs.org/w/index.php?diff=58569&oldid=58568 * Salpynx * (+37) /* In other languages */ 12:08:38 [[User:Salpynx]] https://esolangs.org/w/index.php?diff=58570&oldid=58564 * Salpynx * (+133) /* Code for languages created by others */ 12:12:45 ais523: I think if you want to initialize the elements in an order other than left to right, then you're screwed. I don't see a sane way to do that and get a Box from it. 12:14:08 You can use a custom container, which uses Box<[std::mem::MaybeUninit<T>]> for allocation and deallocation, fill the elements in place in whatever order you want, and once you're sure that all elements are initialized to a valid state, you can expose the contents as a &mut [T], and you can implement this as an unsafe method that calls mem::transmute on the slice (not on the box) 12:16:50 -!- AnotherTest has joined. 12:21:59 nope, it's even worse if the type has a destructor 12:22:43 -!- AnotherTest has quit (Ping timeout: 244 seconds). 12:23:40 you need a Box<[std::mem::MaybeUninit<T>]> for that 12:24:01 [[Grime MC]] https://esolangs.org/w/index.php?diff=58571&oldid=58464 * Salpynx * (+632) Notes on macros 12:33:11 but probably only for a year or two more, I think people already want to add a single wrapper for std::mem::MaybeUninit because it's too common 12:33:34 I bet you'll find one in std within two years 12:45:04 -!- Bowserinator has quit (Ping timeout: 268 seconds). 12:45:51 -!- Bowserinator has joined. 12:49:01 -!- AnotherTest has joined. 13:02:31 -!- arseniiv has joined. 13:02:57 -!- AnotherTest has quit (Ping timeout: 268 seconds). 13:35:55 -!- Essadon has joined. 13:36:10 -!- Essadon has quit (Max SendQ exceeded). 13:36:35 -!- Essadon has joined. 14:00:29 -!- doesthiswork has joined.
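The out-of-order initialization described above might look like this on current Rust, where the type was stabilized as `MaybeUninit` (one plausible formulation under that assumption, not the only one; the function name is made up):

```rust
use std::mem::MaybeUninit;

// Build a Box<[u32]>, filling the elements right to left, an order that a
// plain Vec::push loop cannot do. The final conversion is unsafe: nothing
// proves to the compiler that every element really got initialized.
fn filled_right_to_left(n: usize) -> Box<[u32]> {
    let mut buf: Box<[MaybeUninit<u32>]> =
        (0..n).map(|_| MaybeUninit::uninit()).collect();
    for i in (0..n).rev() {
        buf[i].write(i as u32); // out-of-order (here: reverse) initialization
    }
    // SAFETY: the loop above initialized all n elements.
    unsafe { Box::from_raw(Box::into_raw(buf) as *mut [u32]) }
}

fn main() {
    assert_eq!(&*filled_right_to_left(3), &[0, 1, 2][..]);
}
```

Note the caveat from the log: for element types with destructors this gets hairier, since a panic mid-fill must not drop uninitialized slots; `MaybeUninit` never runs the inner destructor, which is exactly why it is the right wrapper here.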
15:19:34 [[Element]] https://esolangs.org/w/index.php?diff=58572&oldid=46762 * Gamer * (+15) /* Truth-machine */ 15:19:53 [[Element]] https://esolangs.org/w/index.php?diff=58573&oldid=58572 * Gamer * (+0) /* Truth-machine */ 15:20:43 [[Element]] M https://esolangs.org/w/index.php?diff=58574&oldid=58573 * Gamer * (+0) /* Truth-machine */ 15:22:22 [[Element]] https://esolangs.org/w/index.php?diff=58575&oldid=58574 * Gamer * (-46) /* Nth Triangular Number */ 15:22:40 [[Element]] https://esolangs.org/w/index.php?diff=58576&oldid=58575 * Gamer * (-2) /* Digital root calculator */ 15:26:45 [[Element]] https://esolangs.org/w/index.php?diff=58577&oldid=58576 * Gamer * (-113) /* N Factorial */ 15:28:15 [[Element]] https://esolangs.org/w/index.php?diff=58578&oldid=58577 * Gamer * (+0) /* Nth Triangular Number */ 15:33:30 [[Element]] https://esolangs.org/w/index.php?diff=58579&oldid=58578 * Gamer * (-121) /* GCD of two numbers */ 15:33:42 [[Element]] https://esolangs.org/w/index.php?diff=58580&oldid=58579 * Gamer * (+24) /* N Factorial */ 15:33:44 -!- siraben has left ("Kicked by @appservice-irc:matrix.org : issued !quit command"). 
15:33:55 [[Element]] https://esolangs.org/w/index.php?diff=58581&oldid=58580 * Gamer * (+24) /* Digital root calculator */ 15:36:29 [[Element]] https://esolangs.org/w/index.php?diff=58582&oldid=58581 * Gamer * (+46) /* N Factorial */ 15:37:06 [[Element]] https://esolangs.org/w/index.php?diff=58583&oldid=58582 * Gamer * (+36) /* Truth-machine */ 15:37:46 [[Element]] https://esolangs.org/w/index.php?diff=58584&oldid=58583 * Gamer * (-40) /* GCD of two numbers */ 15:39:24 [[Element]] https://esolangs.org/w/index.php?diff=58585&oldid=58584 * Gamer * (-45) /* Nth Fibonacci number */ 15:39:46 [[Element]] https://esolangs.org/w/index.php?diff=58586&oldid=58585 * Gamer * (-34) /* GCD of two numbers */ 15:42:12 [[Element]] https://esolangs.org/w/index.php?diff=58587&oldid=58586 * Gamer * (+76) /* Digital root calculator */ 15:44:11 talk about exeggerating 15:55:22 -!- hexfive has joined. 16:11:12 -!- wob_jonas has quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client). 17:14:02 -!- zzo38 has joined. 18:29:13 [[4]] https://esolangs.org/w/index.php?diff=58588&oldid=46060 * Gamer * (-13) /* Hello World! */ 18:49:51 DYK? "lockheed" is pronounced /lɒxid/ 18:51:29 What's the c doing in there? 18:56:14 -!- Phantom_Hoover has joined. 19:05:56 -!- hexfive has quit (Quit: WeeChat 2.2). 19:10:42 [[Alphuck]] https://esolangs.org/w/index.php?diff=58589&oldid=57853 * Gamer * (-28) /* Hello, world! program */ 19:18:25 -!- b_jonas has joined. 19:19:45 -!- arseniiv has quit (Ping timeout: 252 seconds). 19:31:19 -!- b_jonas has quit (Quit: Reconnecting). 19:31:35 -!- b_jonas has joined. 20:07:22 `2 quote 20:07:23 2/1: 20:07:26 no no 20:07:40 ``` for x in 0 1; do quote; done 20:07:41 120) cpressey, oh go to zzo's website. He is NIH AnMaster, really? I was strongly under the impression that zzo was invented here. 
\ 157) well i just ate some stuff and watched family guy and i own a piano and i'm not wearing socks 20:07:52 ``` for x in 0 1; do wisdom; done 20:07:53 ​afk//Afk wrote a famous story about hang. \ lunacy//LUNacy is wisdom generated by a neu^Wlayered unit net. Ask Warrigal for details. 20:17:46 `n 2 quote 20:17:46 2 quote/1:/bin/sed: -e expression #1, char 4: extra characters after command \ /hackenv/bin/n: line 1: 2 quote: syntax error in expression (error token is "quote") 20:18:01 hm 20:18:14 that's not right 20:20:15 `` \`^ 2 quote 20:20:16 1/1:412) Non sequitur is my forte On-topic discussion is my piano Bowls of sugary breakfast cereal is my mezzoforte Full fat milk is my pianissimo On which note, I'm hungry \ 937) Are you sure this isn't the Sims can you get married to your variables? this is a feature I find lacking in most languages 20:26:33 -!- Melvar has quit (Quit: rebooting). 20:41:26 -!- Melvar has joined. 20:44:40 -!- b has joined. 20:44:48 -!- b has quit (Client Quit). 21:20:10 -!- tromp has quit (Remote host closed the connection). 21:20:47 -!- tromp has joined. 21:22:02 -!- ais523 has joined. 21:23:01 does anyone know of a data structure which lets me do this?: we can add elements to the structure and it records when each element was added (new additions replace earlier additions of the same element), and we can efficiently discover which elements have been added since a specific element was last added 21:23:28 something like a linked deque + an index into it would work, I think, but I'm wondering if there's anything simpler 21:24:44 `unidecode µ 21:24:44 ​[U+00B5 MICRO SIGN] 21:37:12 -!- gurmble has changed nick to grumble. 21:41:17 -!- tromp has quit (Remote host closed the connection). 21:41:28 -!- tromp has joined. 21:43:42 on another note, how many languages use "if … else" (with no "then" clause) for what Perl/Ruby call "unless"? 21:46:53 -!- AnotherTest has joined.
21:48:51 huh, that's an interesting syntactic idea, and makes enough sense 21:51:38 -!- zzo38 has quit (Ping timeout: 250 seconds). 21:51:53 if(x == 30);else{ } 21:52:56 yes, it's easy enough to stick a dummy "then" part in 21:57:15 ais523: hi. see logs for today about your rust question. 21:58:32 ais523: I don't see how a linked deque and index would work, you'd have to update a lot of indexes. a heap and indexes could work, because you can delete any element and you only have to update log(size) number of indexes. 21:59:08 hm wait, what operations exactly do you want to do 22:01:32 the index would be a map from an element to the object representing that element inside the deque (that's why it has to be a /linked/ deque) 22:02:11 the operations I need that can't trivially be abstracted out into a separate structure are adding an element to the structure, and getting a list of all elements added more recently than a specific element 22:02:25 also if I add the same element repeatedly, earlier additions should be forgotten rather than leaking memory 22:02:41 in case it helps, I know all possible elements that could be added in advance 22:03:04 elements can be compared for equality easily (in the situation I want to use this they're just integers) 22:03:22 I was hoping that there'd be some existing structure which supports this so that I could look for an implementation in libraries 22:03:51 re: the Rust question discussion in logs, I was expecting something that created an /initialised/ boxed slice, e.g. by copying a given element a given number of times (like vec! does) 22:05:07 ais523: oh, you mean a deque plus a dictionary with the element value as key, and an iterator into the deque as the value.
that works, I was confused by "index" then 22:05:24 a linked list deque that is 22:05:29 so that the elements have stable nodes 22:05:31 yes, that works 22:05:32 I meant it in the sense of "database index" 22:05:44 of course, databases are good at this, because they're basically a general-purpose data structure 22:05:51 but it feels like something simpler should work too 22:05:57 [[+-]] https://esolangs.org/w/index.php?diff=58590&oldid=58559 * Cortex * (+34293) /* Examples */ 22:06:11 (also I'm writing this program in Rust, and a linked deque is one of the simplest things that Rust is /really/ bad at…) 22:06:29 I was thinking of a heap (such as a binary or quaternary heap) and a dictionary from the key to the index into the heap. I know you can efficiently maintain a dictionary from the element to the index in a binary heap, I've implemented that. 22:06:34 it's not clear which of these is better here 22:07:36 ais523: no way. you never delete elements, so you can put the deque nodes in a Vec that will never shrink, and you never waste memory, and use indexes (even 32-bit indexes) into that Vec for the links. 22:08:08 b_jonas: oh, that does work, it's weird to think about using indexes rather than pointers though 22:08:22 especially as the bounds checks happen at runtime not compile-time 22:08:29 using indexes is generally a good idea because they're automatically bounds checked, which catches some (not all) silly mistakes with pointer stupidity, and allows you to use 32-bit indexes in a 64-bit address space 22:08:49 no need to allocate each node separately 22:09:30 this is the main reason why I think x32 is a silly idea. if, in the structures or loops where you really need to optimize for performance, you just use 32-bit indexes, then the 64-bit pointers never bother you. 22:10:03 admittedly the rust std isn't too well-suited for that, but ideally if you wrote everything in machine language, that would work, and rust or C++ allow good enough approximations.
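The Vec-backed node arena being discussed can be sketched like this (all names here are made up for illustration): nodes live in a Vec that never shrinks, links are 32-bit indexes instead of pointers, a HashMap maps each value to its node, re-adding a value moves it to the front, and `added_since` answers the original query, newest first.

```rust
use std::collections::HashMap;

const NIL: u32 = u32::MAX; // sentinel "null" index

struct Node {
    value: usize,
    prev: u32,
    next: u32, // link towards older entries
}

/// Records the order in which values were last added. Nodes live in a Vec
/// that never shrinks; links are 32-bit indexes rather than pointers.
struct Recency {
    nodes: Vec<Node>,
    head: u32,                  // node of the most recently added value
    index: HashMap<usize, u32>, // value -> node index
}

impl Recency {
    fn new() -> Self {
        Recency { nodes: Vec::new(), head: NIL, index: HashMap::new() }
    }

    /// Add `value`; a repeated add moves it to the front instead of leaking.
    fn add(&mut self, value: usize) {
        if let Some(&i) = self.index.get(&value) {
            self.unlink(i);
            self.link_front(i);
        } else {
            let i = self.nodes.len() as u32;
            self.nodes.push(Node { value, prev: NIL, next: NIL });
            self.index.insert(value, i);
            self.link_front(i);
        }
    }

    /// All values added more recently than `value`, newest first.
    /// (If `value` was never added, this returns everything.)
    fn added_since(&self, value: usize) -> Vec<usize> {
        let stop = self.index.get(&value).copied();
        let mut out = Vec::new();
        let mut cur = self.head;
        while cur != NIL && Some(cur) != stop {
            out.push(self.nodes[cur as usize].value);
            cur = self.nodes[cur as usize].next;
        }
        out
    }

    fn unlink(&mut self, i: u32) {
        let p = self.nodes[i as usize].prev;
        let n = self.nodes[i as usize].next;
        if p != NIL { self.nodes[p as usize].next = n } else { self.head = n }
        if n != NIL { self.nodes[n as usize].prev = p }
    }

    fn link_front(&mut self, i: u32) {
        let old = self.head;
        self.nodes[i as usize].prev = NIL;
        self.nodes[i as usize].next = old;
        if old != NIL { self.nodes[old as usize].prev = i }
        self.head = i;
    }
}

fn main() {
    let mut r = Recency::new();
    r.add(1); r.add(2); r.add(3);
    assert_eq!(r.added_since(1), [3, 2]);
    r.add(2); // re-adding moves 2 to the front
    assert_eq!(r.added_since(1), [2, 3]);
}
```

Adding and move-to-front are O(1) plus the hash lookup, and `added_since` is linear in the size of its answer, which matches the performance profile asked about.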
22:11:05 rust is at a slight disadvantage here, but in the very rare cases when that causes a performance problem, you just optimize the inner loop by reimplementing it in something that isn't rust. 22:11:19 or give dirty hints to rust to speed it up 22:11:47 doesn't using pointers directly, rather than index + base pointer, reduce register pressure? 22:11:51 probably eventually in future rust that won't be necessary, they'll just add enough stuff in the compiler and std 22:12:18 ais523: perhaps, but usually register pressure is less of a problem on x86_64 than L1 cache and L2 cache throughput 22:12:25 and decoder throughput and stuff like that 22:12:46 especially on modern x86_64 22:13:49 now I'm wondering if simple dereferences like mov [reg], reg take up less instruction-cache pressure than complex ones like mov [base+index*4], reg 22:13:59 and register throughput problems are something that has some chance of automatically fixing themselves without changing your code when you install a new compiler with a better optimizer, whereas cache throughput is less likely to do that 22:14:33 ais523: that really depends on the compiler 22:15:34 but you aren't likely to get [base+index*4] or [base+index*8] addressing here unless you premultiply your indexes or use avx512, because your element size is greater than 8 bytes 22:15:38 what's "that" referring to? if it's my most recent comment, I'm talking about at the machine code → micro-op level of abstraction, well below the compiler 22:16:35 instead, you'll get a silly 64-bit shift instruction to do the multiplication, and you won't be able to convince rust to optimize that to a 32-bit multiplication, because you can't easily tell the compiler that the array size is less than 2^32 bytes.
22:17:15 oh, I see, you're claiming that the more condensed instructions might not even be generatable 22:17:18 "that" is referencing "register throughput problems" 22:17:54 ais523: they might be if you premultiply your indexes or use avx512, but you rarely care that much about performance 22:18:08 microoptimize your code only when it's really the bottleneck 22:19:46 yes 22:21:51 I'd like to note that in this case, since you always insert new elements whose weights are larger than every existing element's, inserting a new element takes constant time, you're not moving anything, just pushing an element with the next weight at the end of the array 22:22:04 you probably need to keep a counter that tracks the next weight separately 22:22:36 if you rarely update elements, then that's the simplest one, but since you didn't mention deleting elements, I think you want to update elements often 22:23:19 mind you, adding a new element to the front of a deque is about as simple 22:23:37 and the dictionary insert has about the same cost in either case 22:23:46 hmm wait 22:23:53 you can't just keep inserting at the end of an array 22:23:57 the heap won't work at all 22:24:00 because then you'll end up growing the array forever 22:24:15 or at least, not easily 22:24:35 ais523: but you need to grow the array when you insert a new element, as opposed to updating an element with an existing key 22:25:27 ok, I think you're right.
use a deque 22:25:33 a linked list deque 22:27:48 hmm wait, there's a trickier way 22:28:59 I've been looking for existing Rust modules which have the right functionality; one of them uses a really clever trick 22:29:15 the elements I'm adding are small integers 22:29:49 and this data structure is designed for them; it basically has an array which stores the next integer and previous integer for each integer n at the nth slot 22:29:53 ais523: trickier way, slower but less memory: use a singly linked list, with pointers towards older elements, and a dictionary mapping each element's value to an iterator pointing at the previous element. when you remove an element, look up the next element by its value in the dictionary, and update that dict entry too, as well as the dict entry for the element you move to the front. 22:30:28 next /element/, previous /element/ rather than next /pointer/, previous /pointer/ (or simulating pointers using indexes) simplifies things a lot 22:30:42 ais523: how many elements do you expect to have?
if it's very small, like 8, then there are tricks 22:31:09 it scales based on the size of user input 22:31:10 ais523: that works only if your elements (keys) are of small size 22:31:18 but it's consecutive from 0 22:31:28 ais523: ok, so it's not like always less than 16 22:31:42 right, in practice it'll usually be small but I don't want a hard limit 22:32:05 (the values themselves are `usize`s) 22:32:36 ais523: there's a crazy structure used in hardware caches to handle the 8-way caching that stores binom(8,2) bits and lets the hardware update one of the eight slots to make it the newest one, and find the oldest slot so it knows which element to drop when it has to insert a new one 22:33:26 I probably won't use that, but am curious about how it works 22:33:45 ais523: if it's between 8 and 64, then you might not care about asymptotic performance, and implement this as a simple vector of the integer values where you search for a value with a linear search, and memmove part of the array to insert or delete something 22:34:08 ais523: there's a bit for each pair of the 8 elements, and it tells whether the first one is newer than the other 22:35:17 inserting a new element to a slot needs to update 7 of those 28 bits, which is somewhat easy and fast in hardware in this case 22:35:45 oh yes, I can see how that would work now 22:36:33 so much so that I guess the bottleneck is not that age array, but the dispatch logic to find the one of the eight elements matching the 6-bit address 22:36:53 to be honest a linear search would probably be fast enough for me too, but it was a set of performance properties I hadn't seen before in a data structure, so it prompted me to ask here 22:38:05 you need to do that address compare in each of the eight elements, then three layers of if-else to forward one of them, and then a comparison to the upper bits of the address when it arrives from the page table cache (it's not called "page table cache", it has a fancier name that I keep forgetting, but
that's what it really is) 22:38:24 and all of that must go through within a clock cycle ideally 22:38:28 in the L1 cache that is 22:38:36 you get slightly more slack in the L2 cache 22:38:56 this is a program I'm heavily optimising partly just for practice/curiosity/the sake of it (it's an esolang interpreter, people hyperoptimise those for brainfuck…) 22:39:06 -!- AnotherTest has quit (Ping timeout: 272 seconds). 22:39:07 no wait 22:39:11 that's not how it works 22:40:24 first you do a 64-way dispatch on the 6 middle bits of the address to find the right set of 8 cache entries, and THEN, when the address arrives from the page table, only then can you compare the upper address bits to the eight slots and do the eight-way dispatch 22:40:28 sorry 22:40:33 that sounds even harder 22:40:56 that's why it's hard to make the latency low, since the page table cache also has a latency 22:41:11 memory latency is a huge problem for today's computers 22:43:08 I think the big problem is that until we design an entirely new architecture where we can be sure that no software depends on pages being allowed to be 4k sized, we can't have an L1 cache larger than 32K (4K times eight). we've had CPUs with 32K L1 data cache and 32K L1 code cache for almost a decade, even though the operating systems mostly want to use 8K pages, 22:43:33 and we can't fix this while we want x86 compatibility. 22:43:56 it's one of these stupid historical things. 4k pages made sense back on the 386, when computers had a few megabytes of memory. 22:44:09 sometimes just one or two megabytes 22:44:28 the issue isn't really so much the size of the pages, as the minimum granularity of the page table 22:44:38 and larger L1 caches would help a lot 22:44:39 [[Doreq]] M https://esolangs.org/w/index.php?diff=58591&oldid=58377 * Unitan * (+35) 22:44:54 ais523: what do you mean by the granularity? the cache line size? I think that's fine at 64 bytes as is.
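The binom(8,2)-bit age matrix described above can be modelled in software like this (a hypothetical sketch of the idea, not how any particular CPU wires it): one bit per pair (i, j) with i < j records whether way i is newer than way j; touching a way rewrites the 7 bits involving it, and the oldest way is the one that is newer than no other.

```rust
/// One bit per unordered pair (i, j), i < j: set iff way i is newer than way j.
struct AgeMatrix {
    bits: u32, // 28 = binom(8,2) bits used
}

/// Position of pair (i, j) in the flat 28-bit layout:
/// (0,1)..(0,7), then (1,2)..(1,7), and so on.
fn pair_index(i: usize, j: usize) -> usize {
    debug_assert!(i < j && j < 8);
    i * 8 - i * (i + 1) / 2 + (j - i - 1)
}

impl AgeMatrix {
    fn new() -> Self {
        AgeMatrix { bits: 0 } // all-zero encodes the order 7 > 6 > ... > 0
    }

    /// Mark way `w` as the most recently used. Only the 7 bits whose pair
    /// contains `w` change (hardware can write them all in parallel).
    fn touch(&mut self, w: usize) {
        for o in 0..8 {
            if o == w {
                continue;
            }
            let b = 1u32 << pair_index(w.min(o), w.max(o));
            if w < o {
                self.bits |= b; // bit means "w newer than o": set it
            } else {
                self.bits &= !b; // bit means "o newer than w": clear it
            }
        }
    }

    /// The least recently used way: the one that is older than all others.
    fn oldest(&self) -> usize {
        (0..8)
            .find(|&w| {
                (0..8).filter(|&o| o != w).all(|o| {
                    // true iff w is older than o
                    if w < o {
                        self.bits & (1 << pair_index(w, o)) == 0
                    } else {
                        self.bits & (1 << pair_index(o, w)) != 0
                    }
                })
            })
            .expect("the newer-than relation always has a minimum")
    }
}

fn main() {
    let mut m = AgeMatrix::new();
    for w in 0..8 {
        m.touch(w); // use the ways in order 0..7
    }
    assert_eq!(m.oldest(), 0);
    m.touch(0); // 0 becomes the newest
    assert_eq!(m.oldest(), 1);
}
```

The appeal in hardware is that a touch is pure bit-writes with no read-modify of the other pairs' bits, so updating recency never serializes behind finding the victim.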
22:44:59 if you said "the pages are 512 bytes but you can only map them in units of megabytes" people would be fine with that 22:45:05 b_jonas: minimum size/alignment of a page map 22:45:29 ais523: if you can only map them in units of megabytes, then what would the 512 bytes mean? in what sense would it be a page? 22:46:40 presumably it'd be used for things like bounds checks 22:47:00 ais523: hmm, so you'd have a size entry in each page? why's that useful? 22:47:12 I mean, it could be done, but I don't see the point 22:47:50 well, what's the purpose of a page? 22:48:09 sometimes it's for memory allocation purposes, sometimes it's to get useful side effects from the page faults 22:48:18 you could divorce the two, I think 22:48:39 ais523: for a cpu page, which is what we're talking about here, it's a mapping from the virtual memory address space (of a process) to the "physical" memory address space in a way that the cpu dispatches the mapping automatically at every ordinary memory access 22:49:45 "physical" address ideally means what's sent to the main memory, but it could be different if there's virtualization or legacy 32-bit large memory shenanigans involved 22:50:30 right, and I don't see why virtual→physical mapping would need a small granularity nowadays unless you're intentionally overlapping pages in weird ways 22:51:33 ais523: anyway, if you add a size field to a page, and you set it to less than the maximum size, then you're probably wasting the rest of the physical page, because it's unlikely that you'll quickly find another partial page that you can allocate in the same place that has just the right page size alignment 22:52:23 b_jonas: OK, so what about a setup that works like this: you can set up virtual→physical maps which have a large minimum size and alignment, but you can "mask" parts of a virtual page so that they pagefault when accessed 22:52:31 rather than going to the physical page behind 22:52:37 it could perhaps be useful for catching some stack overflows
22:52:57 then you can effectively split a physical page between multiple virtual pages as long as the low bits in the virtual addresses are all distinct from each other 22:53:00 but I don't think that's likely useful 22:53:17 it would just make it so that we have to automatically extend the stack more often than once every few pages 22:53:54 ais523: make part of a page masked to unreadable? perhaps, but I don't really think it's worth the cost. 22:54:32 b_jonas: well, the point is you have to pay this cost anyway when you have small pages 22:54:38 so this is saving part of the cost 22:54:44 ais523: I mean, then you'll need extra bits in the page table cache, which is something that has to be really close to the L1 cache and core on the cpu die for latency 22:54:55 and it's definitely useful to be able to mask, say, the code section and data section of an executable map differently 22:55:03 although, hmm 22:55:07 I just had an idea 22:55:24 ais523: not really. if you actually use smaller pages, rather than just allow smaller pages, then you pay the page table cost 22:56:03 make it so that the high bits of a pointer specify what sort of accesses can be used via it; then instead of having permissions specified separately in the page table, they're specified indirectly by what virtual address you mapped the physical page to 22:56:09 ais523: but currently, the cpu allows 4k pages, but most pages are actually larger, megabyte sized (we don't have 8K or 16K pages on x86_64 I think, but we could in the future), which is better because they take up less space in the page table cache 22:56:19 well, not yet, but OSes are getting better with large pages 22:56:30 eventually we will be using mostly megabyte-sized pages on x86_64 22:56:46 and then you don't pay for the cost of the small pages in the page table cache, but only in the L1 22:57:32 and once you require larger pages, with no mask bits, you have a page table cache that is about as efficient as currently, only slightly
more because it doesn't handle small pages at all, and an L1 cache that can finally break the 32 kilobyte barrier 22:58:08 it's still tricky, you can't just increase the L1 cache to as large as you want, because that also increases the latency of that cache 22:59:24 you might even end up with an L1 cache and an L1.5 cache, the L1.5 cache being slower than the L1 cache, but still working like the current L1 cache in that it can tell which group of 8 elements to use before the page table cache tells it the address 23:00:14 you might even drop the L2 cache, and have only an L1 cache, an L1.5 cache, and what's currently called an L3 cache 23:00:20 though that's not likely 23:00:31 I dunno really 23:00:37 at this point I just don't have the hardware competence 23:00:40 I can't predict 23:02:40 I think the future of hardware is mostly moving towards trying to make the memory structure less general so that it can be faster in special cases 23:03:04 for example, GPUs typically use batch memory transfer mechanisms and suspend threads while waiting for the memory values to arrive 23:03:21 and NUMA allows different CPU cores to have different RAM 23:04:16 I'm a software guy who wants to optimize stuff on existing or near-future consumer PC cpus, and I'm only trying to understand cpus and compilers from that direction 23:05:32 ais523: sure, the point of NUMA is that if each half of the cpu is accessing the RAM that's close to it, then the two halves can do memory transfers faster, and possibly with slightly lower latency too 23:05:52 yes 23:06:50 this is useful if you have processes that are really well parallelizable into two parts that are very independent in what memory they use, which is common; then you get higher memory performance with cheaper hardware 23:07:08 and it's common to have tasks that partition in two easily 23:07:53 I've used a 2-way numa server to compress several videos in parallel, the bulk of each video compression runs on one core 23:08:08 one cpu
core that is, with some gpu magic involved too 23:09:00 and the raw video is read into the half of the main RAM that compresses that video, and then the intermediate storage used for the compression is in that half too 23:09:21 well, ideally 23:09:31 I sort of have to trust the OS to do the Sane Thing in the common case 23:11:33 compressing half of the videos still takes so much memory that it probably overflows the L3 cache 23:12:03 CPU → GPU memory transfers are really really slow (the other direction is IIRC even slower, but less useful) 23:14:10 ais523: sure, but for video compression it's still worth it, because a part of the video compression can be really sped up by a GPU 23:14:26 assuming you have a modern GPU of course 23:14:57 it's the kind of task GPUs are optimized for 23:16:34 indeed 23:19:11 and I think if you do both the finding-the-nearby-similar-square-in-previous-frames thing on the GPU and the discrete cosine/wavelet/etc transform on the GPU too, then you can keep all the data in the deeper parts of the GPU between those two 23:20:21 so ideally you transfer each frame to the GPU only once and do all the tasks there, then transfer the result back to the CPU, which does further processing on it and writes the file 23:21:05 there are some other operations besides those two, like possibly partly decoding the frames from the input format if it's already compressed, subtraction, color space transform, etc 23:21:48 or even resizing to lower resolution or other operations done at the same time as the encoding 23:22:07 or, heck, deinterlacing and resizing to lower resolution 23:40:29 -!- Phantom_Hoover has quit (Read error: Connection reset by peer). 23:54:42 -!- Lord_of_Life has quit (Ping timeout: 250 seconds). 23:56:24 -!- Lord_of_Life has joined. 23:56:24 -!- Lord_of_Life has quit (Changing host). 23:56:24 -!- Lord_of_Life has joined. 23:59:18 -!- sleepnap has quit (Quit: Leaving.).