00:45:50 WDYT, if an IRC client sends a periodic PING to avoid the dreaded "TCP connection was lost but the client has nothing to write" issue, should it bother to also verify that the server responds with a corresponding PONG, or is that superfluous? It's definitely not necessary for the TCP thing, but hypothetically there might be a server that continues to speak TCP but not respond to commands. And
00:45:52 often it's used to estimate latency, but that's a different feature.
00:48:44 I check manually. I have the F2 key bound to PING and then I can see if PONG is received or not. (For automated IRC clients, it could check automatically)
00:50:21 Yes, this would be for an automaton. Just wondering if it's a failure mode that really needs worrying about, assuming I don't care about estimating the latency to try to jump servers if it's too high or w/e.
00:55:00 At least in my experience, if I try to PING and it isn't working, there will eventually be a connection error anyway. However, it might be worth checking after some (configurable) timeout.
00:56:09 It has also happened to me that I was able to receive but not send. In this case, eventually the server will disconnect me due to a ping timeout.
00:56:16 What if it continues to speak TCP and respond to pings, but not to other commands?
00:57:09 Then I would think that the server is defective, probably.
00:57:47 I guess I could have it privmsg me to solve a CAPTCHA. But then what if the server sends that message to some other human who responds to it?
00:58:11 -!- chiselfuse has quit (Remote host closed the connection).
00:58:25 -!- chiselfuse has joined.
00:58:46 You can write a question that you do not expect anyone else to know the answer to
01:31:14 -!- earendel has joined.
01:53:17 -!- dutch has joined.
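The liveness check discussed above (send a PING, require a PONG within a configurable timeout) could be sketched for an automated client roughly like this; the class and method names are hypothetical, not from any real bot:

```python
import time

class PingWatchdog:
    """Tracks an outstanding PING and decides when the link is dead.

    The server is considered unresponsive if no PONG arrives within
    `timeout` seconds of the oldest unanswered PING we sent.
    """

    def __init__(self, timeout=60.0):
        self.timeout = timeout
        self.ping_sent_at = None  # time of the oldest unanswered PING, if any

    def on_ping_sent(self, now=None):
        # Only remember the oldest unanswered PING; later ones don't
        # extend the deadline.
        if self.ping_sent_at is None:
            self.ping_sent_at = now if now is not None else time.monotonic()

    def on_pong_received(self):
        self.ping_sent_at = None

    def is_dead(self, now=None):
        if self.ping_sent_at is None:
            return False
        now = now if now is not None else time.monotonic()
        return now - self.ping_sent_at > self.timeout
```

The caller would send `PING :token` on a timer, call `on_ping_sent()` each time, call `on_pong_received()` when a matching PONG arrives, and reconnect once `is_dead()` turns true. This catches both the plain TCP-loss case and the hypothetical "speaks TCP but ignores commands" server.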
05:27:01 fizzie: I always thought the client is exactly supposed to respond to PING with PONG, not just send periodic PINGs on its own
05:31:28 also, if you believe this line https://github.com/Nakilon/nakiircbot/blob/43bf3dfa932e78f19b656520d29629c9bf94c5bc/lib/nakiircbot.rb#L99 Quakenet used this command for measuring the latency too
05:33:34 I mean, when I was making this comment I was reusing some old Quakenet bot that IIRC had the timestamp parsing in it
05:33:57 but as it says, in the case of Libera there is just the server name there
06:08:49 [[School]] https://esolangs.org/w/index.php?diff=87989&oldid=87986 * AceKiron * (+391) Added the PUSH and POP memory operants
07:54:15 [[Matrix (data structure)]] N https://esolangs.org/w/index.php?oldid=87990 * AceKiron * (+174) Created page with "A **matrix** is a data structure that can serve as an programming language's memory. The number of stacks may vary. Many languages have other methods of data storing as well."
07:54:27 [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87991&oldid=87990 * AceKiron * (+3)
07:55:43 [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87992&oldid=87991 * AceKiron * (+105)
07:56:17 fizzie: I check PONG replies anyway to know when the server has processed my previous commands, which I need to know so that I don't send the server more commands than fit in its buffer, or else it would quit me.
07:56:29 and at that point you probably want a timeout too
07:58:22 [[Category:Matrix-based]] N https://esolangs.org/w/index.php?oldid=87993 * AceKiron * (+181) Created page with "Languages primarily using one or more [[Matrix_(data_structure)|matrix]]s for storage. ==See also== * [[:Category:Queue-based]] * [[:Category:Stack-based]] Category:Langu..."
07:59:15 [[School]] https://esolangs.org/w/index.php?diff=87994&oldid=87989 * AceKiron * (+225)
07:59:58 [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87995&oldid=87992 * AceKiron * (+10)
08:01:00 [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87996&oldid=87995 * AceKiron * (+46)
08:06:20 -!- hendursa1 has joined.
08:08:51 -!- hendursaga has quit (Ping timeout: 276 seconds).
08:21:22 [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87997&oldid=87996 * AceKiron * (+1)
08:47:39 [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87998&oldid=87997 * AceKiron * (+64)
08:51:10 [[School]] https://esolangs.org/w/index.php?diff=87999&oldid=87994 * AceKiron * (-2)
09:05:37 -!- spruit11_ has quit (Quit: https://quassel-irc.org - Chat comfortably. Anywhere.).
09:05:59 -!- spruit11 has joined.
09:31:40 -!- Koen_ has joined.
09:32:53 -!- Sgeo has quit (Read error: Connection reset by peer).
09:48:52 [[School]] https://esolangs.org/w/index.php?diff=88000&oldid=87999 * AceKiron * (+15) /* Memory operants */
09:51:30 -!- Trieste_ has joined.
09:51:46 -!- Trieste has quit (Ping timeout: 240 seconds).
09:58:02 -!- Oshawott has joined.
10:01:31 -!- archenoth has quit (Ping timeout: 252 seconds).
11:04:25 [[Special:Log/newusers]] create * Bsoelch * New user account
11:20:40 -!- hanif has joined.
11:26:01 Yes, I mean, the client does need to respond to PING with a PONG, but that's a different thing.
11:30:30 does IRC need ping and pong? doesn't TCP already have this, basically
11:34:04 TCP has an *optional* keepalive option. But I don't think it's very popular compared to application-protocol heartbeats.
11:38:24 As for not sending too many things, I'm using a credit-based system (each byte costs so and so, some commands have an extra surcharge, the client gets credit at a fixed rate capped to some maximum value) to approximate that. That's what ircd (at least the real one, the one used at IRCnet) does on the server side. Of course it's not exactly exact due to network latency and so on, but it's been
11:38:26 working just fine.
11:40:23 On keepalive, IIRC the default timeouts tend to be huge (hours), and configurable only system-wide.
11:48:12 [[Meow]] https://esolangs.org/w/index.php?diff=88001&oldid=87959 * Martsadas * (+20) /* fixed mistakes*/
11:49:14 [[Meow]] M https://esolangs.org/w/index.php?diff=88002&oldid=88001 * Martsadas * (+27)
12:15:51 -!- hanif has quit (Ping timeout: 276 seconds).
12:50:37 -!- earendel has quit (Quit: Connection closed for inactivity).
12:52:38 [[Matrix]] M https://esolangs.org/w/index.php?diff=88003&oldid=42721 * PythonshellDebugwindow * (+50) Confusion
12:52:48 [[Matrix (data structure)]] M https://esolangs.org/w/index.php?diff=88004&oldid=87998 * PythonshellDebugwindow * (+50) Confusion
12:52:57 [[Matrix (data structure)]] M https://esolangs.org/w/index.php?diff=88005&oldid=88004 * PythonshellDebugwindow * (-17) m
12:58:01 -!- hanif has joined.
13:12:27 -!- hendursa1 has quit (Quit: hendursa1).
13:12:53 -!- hendursaga has joined.
13:40:19 [[Special:Log/newusers]] create * 4gboframram * New user account
13:47:28 [[Esolang:Introduce yourself]] https://esolangs.org/w/index.php?diff=88006&oldid=87982 * 4gboframram * (+184) /* Introductions */
14:17:17 -!- delta23 has joined.
14:32:41 -!- Koen_ has quit (Remote host closed the connection).
14:33:22 -!- velik has quit (Remote host closed the connection).
14:34:00 -!- velik has joined.
14:40:11 -!- velik has quit (Remote host closed the connection).
14:40:29 -!- velik has joined.
14:41:55 -!- velik has quit (Remote host closed the connection).
14:42:13 -!- velik has joined.
14:45:28 -!- velik has quit (Remote host closed the connection).
14:47:31 -!- velik has joined.
15:03:05 -!- normsaa has joined.
15:03:22 https://pastebin.com/px6HUCLV how can this binary be decoded?
15:03:28 In any esoteric lang?
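The credit-based send throttle described earlier (each byte costs credit, some commands carry a surcharge, credit refills at a fixed rate up to a cap) is essentially a token bucket. A minimal sketch, with made-up cost constants (the real ircd numbers differ):

```python
class CreditThrottle:
    """Token-bucket style send throttle: each byte costs one credit,
    some commands carry an extra surcharge, and credit accrues at a
    fixed rate capped at a maximum value."""

    def __init__(self, rate=100.0, cap=200.0, surcharge=None):
        self.rate = rate      # credits gained per second
        self.cap = cap        # maximum stored credit
        self.credit = cap     # start with a full bucket
        self.last = 0.0       # time of the last refill
        # hypothetical per-command surcharges
        self.surcharge = surcharge or {"JOIN": 100.0, "PRIVMSG": 20.0}

    def _refill(self, now):
        self.credit = min(self.cap, self.credit + (now - self.last) * self.rate)
        self.last = now

    def try_send(self, line, now):
        """Return True if `line` may be sent now, charging its cost;
        False means the caller should queue it and retry later."""
        self._refill(now)
        command = line.split(" ", 1)[0].upper()
        cost = len(line.encode("utf-8")) + self.surcharge.get(command, 0.0)
        if cost <= self.credit:
            self.credit -= cost
            return True
        return False
```

As noted in the discussion, this only approximates the server's own accounting (network latency makes the client's view lag), so the cap should be chosen conservatively.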
15:06:38 normsaa: Where did it come from?
15:07:14 Corbin: A friend
15:07:30 He wrote it
15:09:46 Any ideas?
15:23:26 . o O ( it's too bad that the flag doesn't identify the CTF this is from )
15:23:39 normsaa: tell your "friend" to solve the problem properly, by themselves.
15:39:04 Also tell your friend to fix the overlapping assignment, which probably breaks the script.
15:51:39 -!- hanif has quit (Ping timeout: 276 seconds).
15:56:32 -!- hanif has joined.
16:03:37 -!- Koen_ has joined.
16:11:38 riv: yes, IRC sort of requires PING and PONG for at least three reasons. some servers (not freenode, I haven't looked at libera yet) require that you send *one* PONG after connecting, copying an unpredictable code from the PING that the server sends, as a sort of anti-spam measure. second, some servers, including freenode (again, haven't looked at libera yet) require that the client sends something
16:11:44 every five minutes, to ensure that the server can drop clients that are disconnected. it ensures that clients do this by sending pings; you don't technically need to reply to those, but replying to pings is an easy way to satisfy this requirement.
16:14:38 thirdly, you can use pings for flow control. the way IRC works is that the server has a very small input buffer for each client, and if the client sends more than that input buffer over what the server has handled locally, it disconnects the client. the server handles commands for one client in series, so if you pay attention to local replies (replies from that server, not other servers), you can
16:14:44 sometimes tell how much the server has handled, and so how full the queue is. but not all commands have local replies, or the local reply isn't always easy to identify, so sometimes you want to send a command just to force a local reply. the best command for that is a local PING (as opposed to a PING to a different server), since that does nothing but send you a reply.
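The flow-control trick just described (send a local PING to force a local reply; the matching PONG tells you the server has processed everything sent before it) could be sketched like this. It is a simplification: a real client would ping far less often than once per line, and the buffer size is a guess, not a protocol constant.

```python
from collections import deque

class PingFlowControl:
    """Sketch of PING-based flow control: after each line we send a local
    PING; the matching PONG confirms the server has processed everything
    sent before it, telling us how full its small input buffer is.
    (A real client would ping less often; one per line keeps this short.)"""

    def __init__(self, buffer_size=512):
        self.buffer_size = buffer_size  # estimate of the server's input buffer
        self.in_flight = 0              # bytes sent but not yet confirmed processed
        self.marks = deque()            # bytes covered by each outstanding PING
        self.queue = deque()            # lines waiting for buffer space

    def send(self, line, raw_send):
        self.queue.append(line)
        self._flush(raw_send)

    def _flush(self, raw_send):
        while self.queue:
            line = self.queue[0]
            ping = "PING :mark"
            cost = len(line) + 2 + len(ping) + 2  # the +2s account for CRLF
            if self.in_flight + cost > self.buffer_size:
                break  # would overflow the server's buffer; wait for a PONG
            self.queue.popleft()
            raw_send(line)
            raw_send(ping)
            self.in_flight += cost
            self.marks.append(cost)

    def on_pong(self, raw_send):
        # Everything up to the matching PING has now been processed.
        if self.marks:
            self.in_flight -= self.marks.popleft()
        self._flush(raw_send)
```

`raw_send` stands in for whatever actually writes a line to the socket; queued lines drain automatically as PONGs come back.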
16:20:54 -!- velik has quit (Remote host closed the connection).
16:21:23 -!- velik has joined.
16:22:00 -!- velik has quit (Remote host closed the connection).
16:22:17 -!- velik has joined.
16:23:59 int-e: wait, do you actually recognize what that is, or do you just know it's homework from what it looks like and how they simultaneously cross-post on multiple channels?
16:30:29 int-e: I was half-expecting a web search for the flag value to tell you where it's from (surely all of those have answers posted online?), but apparently it doesn't.
16:31:05 fizzie: yeah. which /could/ indicate that it's an ongoing one, or just that it's very obscure
16:31:27 "It looks like there aren't many great matches for your search. Tip: Try using words that might appear on the page that you're looking for. For example, 'cake recipes' instead of 'how to make a cake'."
16:31:40 Mmm, cake.
16:32:57 . o ( glados instead of cake )
16:38:06 google gave me this video https://www.youtube.com/watch?v=JMrd8PoxvPc, but the author doesn't do this challenge in the video
16:38:06 it's a piece of cake to bake a pretty cake
16:38:36 and the ctf site linked is dead and unarchived
16:44:25 normsaa
16:46:43 fizzie: there's technically a third, most unlikely case: that it's from a site like Advent of Code that gives every logged-in user a different test input
16:59:58 -!- Koen_ has quit (Remote host closed the connection).
17:00:26 -!- j-bot has quit (Remote host closed the connection).
17:00:40 -!- j-bot has joined.
17:07:26 -!- oerjan has joined.
17:21:23 -!- arseniiv has joined.
17:30:35 -!- normsaa90 has joined.
17:33:15 -!- normsaa has quit (Ping timeout: 256 seconds).
17:36:23 -!- normsaa has joined.
17:39:29 -!- normsaa90 has quit (Ping timeout: 256 seconds).
17:40:12 -!- hanif has quit (Ping timeout: 276 seconds).
17:41:39 -!- normsaa91 has joined.
17:41:45 -!- normsaa has quit (Ping timeout: 256 seconds).
18:02:07 -!- immibis has quit (Remote host closed the connection).
18:05:43 -!- immibis has joined.
18:05:47 -!- Sgeo has joined.
18:12:21 -!- normsaa91 has quit (Ping timeout: 256 seconds).
18:14:59 -!- Guest81 has joined.
18:14:59 h
18:15:41 -!- Guest81 has quit (Client Quit).
18:16:02 -!- normsaa has joined.
18:31:02 "Site compatible with IE 10 or above, Mozila [sic], ..." is probably not a good sign.
18:33:30 ow
18:34:14 `? tlwtnt
18:34:17 tlwtnt? ¯\(°​_o)/¯
18:34:30 ow!
18:37:04 \wp ow
18:37:06 OW -- Wikimedia disambiguation page https://en.wikipedia.org/wiki/OW
18:38:23 looks like sometimes there is a default page and sometimes not
18:38:53 fizzie: does it also have a link to where you can download Acrobat Reader to view their PDFs, and the Java and Adobe Flash plugins, without mentioning Oracle for Java?
18:39:31 also, do they recommend at least a 256-color and at least 1024x768 pixel resolution display for best viewing?
18:40:54 very long ago I made a script to say "best viewed with Mozilla" or "best viewed with Internet Explorer", always the one other than what the viewer is using
18:41:19 lol
18:41:49 b_jonas: do you know who the chukchas are?
18:41:59 no
18:42:22 ethnic Siberians who live deep in the tundra with deer
18:42:27 there is a joke
18:43:02 b_jonas: What will it do if neither is in use?
18:43:22 smth like: "to keep a chukcha busy, give him a paper with 'read on the other side' written on both sides"
18:43:31 b_jonas: lol, evil
18:44:40 \wp chukcha
18:44:42 Chukchi people -- ethnic group https://en.wikipedia.org/wiki/Chukchi_people
18:45:21 oh, not even Siberia
18:51:00 zzo38: one of them was the default. I don't remember which.
18:52:19 It didn't have those other things. Maybe it would have elsewhere on the site.
18:57:13 [[School]] https://esolangs.org/w/index.php?diff=88007&oldid=88000 * AceKiron * (-80)
19:00:25 -!- ais523 has joined.
19:01:08 I think the reason why some servers ping during connection, and don't connect until they receive a matching pong, is to prevent non-IRC-related programs from being tricked into connecting to IRC
19:01:33 if your ircd ignores invalid commands (and many do), it isn't hard to put a segment of valid IRC commands in the middle of, say, an HTTP POST request
19:01:39 that is a good reason, but it only requires one PING right at the start
19:01:55 I remember a spam attack on Freenode that worked by exactly that mechanism
19:02:01 so you can create a web page with a script that causes the viewers to spam IRC, and this has been used to create IRC worms in the past
19:02:21 it would POST a set of IRC commands that cause the user to join a bunch of channels and spam them with the URL of the page
19:02:24 right
19:02:25 pretty funny really
19:02:43 HAMRADIO
19:02:50 Postel's Law sounds good but is absolutely terrible for security
19:03:05 Postel's Law is bad
19:03:14 also bad for long-term maintainability
19:03:22 i remember that spam attack, that was funny
19:03:34 \wp Postel's Law
19:03:35 robustness principle -- design guideline for software that states: "be conservative in what you do, be liberal in what you accept from others" https://en.wikipedia.org/wiki/Robustness_principle
19:05:54 as users expect the "best guess" behavior of implementations to continue working forever
19:06:11 leading to the codification of insanely complex behavior, as exemplified by the WHATWG HTML spec
19:06:48 I do actually like what that HTML spec has done, though
19:07:06 because it means that there are now set boundaries for exactly what you are and aren't allowed to do in HTML
19:07:13 right
19:07:29 it's a regrettable necessity based on the early days of the web being dominated by ad hoc systems and Postel's law
19:07:34 it is Postellish in some respects, too, e.g. saying that web pages must be in UTF-8 but giving long complicated instructions for what to do if they aren't
19:08:46 -!- arseniiv has quit (Ping timeout: 260 seconds).
19:09:45 actually, one related problem I've been having recently, which may be unsolvably difficult, and Stack Overflow has not been helpful:
19:10:06 given a URL, which characters in it can be safely percent-decoded without changing the meaning of the URL
19:10:10 ?
19:11:42 I'm trying to write an HTML sanitizer and would prefer to avoid letting people put obfuscated URLs through it, but it's so hard to figure out the rules for what will and what won't work
19:12:28 generate a string of all chars and escape it with some very common library used for that need
19:12:35 to see what chars it will process
19:14:10 nakilon: that basically means assuming that the library is correct, which it probably won't be
19:14:15 pretty sure all libraries will process different sets of chars )
19:14:28 there is probably no correct library
19:14:32 that said, I have been trying various test strings on various browsers and one httpd, to see what happens
19:14:46 maybe some Chrome is implemented "correctly" but it won't provide a library
19:14:50 (testing a wide range of httpds would be frustrating)
19:15:22 one thing I did learn throughout all this is that the URL path component %2e%2e is in fact equivalent to .. and will cancel out the previous component
19:15:38 which seems like an unwise decision from a security point of view; that's just asking for path traversal vulnerabilities
19:15:43 also the achievable "correctness" of your tool is limited by how correct the servers are
19:16:01 many of them behave differently with regard to URL escaping
19:16:13 right
19:16:30 I think the only real option here is to have a parameter for what sort of dubious-looking escapings the user wants to exclude
19:16:37 also additional rules and bugs in redirects
19:22:08 teaching velik Wolfram Alpha; somehow it took the whole day to make 10 tests, and it's only a piece of the Math examples; there are three other topics, maybe I'll make most of them tomorrow
19:32:17 ais523: "prevent non-IRC-related programs being tricked into connecting to IRC" => yes, that might be part of the reason.
19:33:46 "avoid allowing people to put obfuscated URLs through it" => yeah, that's probably impossible
20:13:55 Normally I don't pay *that* much attention to update sizes, but now updating blender wants to install "libembree3-3", which will take half a gigabyte of disk.
20:15:45 that's a pretty big library!
20:16:18 b_jonas: it just seems so wrong to let people post arbitrary URLs on, say, forums or the like, when you're supposed to be sanitising the content
20:16:40 "Intel® Embree is a collection of high performance ray tracing kernels that helps graphics application engineers to improve the performance of their photorealistic rendering application." I feel like they've probably got versions specifically tuned for a bazillion different (Intel) CPU models.
20:18:21 Heh, it's a single 485223648-byte .so file.
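Returning to the percent-decoding question above: RFC 3986 says only escapes of *unreserved* characters (letters, digits, `-._~`) are always equivalent to the literal character, so those are the only ones a sanitizer can decode without ever changing a URL's meaning; everything else should stay escaped. A sketch of that rule (it deliberately does not try to model individual servers' quirks):

```python
import re

# RFC 3986 section 2.3: the unreserved characters.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)

def decode_safe_escapes(url):
    """Decode only percent-escapes whose octet is an RFC 3986 unreserved
    character; other escapes are left escaped, merely normalised to
    upper-case hex (also a meaning-preserving transformation)."""
    def repl(match):
        octet = chr(int(match.group(1), 16))
        if octet in UNRESERVED:
            return octet
        return "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, url)
```

Note that `.` is unreserved, so `%2e%2e` decodes to `..` under this rule, which is consistent with the observation in the discussion that servers treat the two spellings as equivalent; it is exactly the reserved characters (`/`, `?`, `&`, `=`, `%`, ...) whose decoding would change the URL's structure.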
20:18:47 ais523: I'm not sure I see why, except for the part where you might sanitize the protocol part (the part before the first colon) and add a max length
20:19:43 fizzie: I suspect that you only need around eight different versions of your code to get peak performance on all 64-bit Intel CPUs
20:19:49 ais523: though most of the time you'd probably throw in that HTML attribute that hints to search engines that this is a link from a third-party submission and the search engine shouldn't think your site is deliberately linking to kiddy porn
20:20:02 and maybe another four or five more for AMD
20:20:22 b_jonas: I've already been looking through the list of rel= attributes
20:20:27 it won't take away all the responsibility for what links you host, but you can't generally fix that by just looking at the URL
20:20:42 I think probably at least nofollow and noreferrer should be in there for external links by default
20:21:17 you absolutely want to whitelist protocols, though, because of javascript: links
20:21:21 although noreferrer is interesting because you can also set an HTTP header that tells the browser to noreferrer everything
20:21:40 I explicitly turned it off on my website (as in, I outright said in the headers that I know this header exists and I'm choosing not to use it), which makes some security checkers really annoyed
20:22:00 lol "only need around eight different versions of your code to get peak performance on all 64-bit Intel CPUs"
20:22:13 b_jonas: I mean that there isn't a combinatorial explosion
20:22:32 and then you want AMD and code running on GPU and a port for ARM64 etc
20:22:56 ais523: yes, that's because it's really hard to make CPUs, so there are only two or three companies making x86 CPUs at a time
20:23:07 . o O ( rel=dontclickfortheloveofgod )
20:23:08 although, in practice nowadays, I think you can get decent performance for CPU-bound code by writing a post-AVX2 version for most people, and a pre-AVX2 version (aiming for maximum compatibility) for people who are running on really old computers
20:23:22 and they mostly develop at most three lines of them in parallel each
20:23:33 one expensive, one home, and one low-power laptop one
20:23:34 optimising for AMD does seem to be significantly different from optimising for Intel, though
20:24:01 in particular, if the program isn't memory-bound, the next most relevant bottleneck on Intel is normally instruction dispatch, whereas on AMD it's usually something else
20:24:07 ais523: and more importantly, you only need to make different versions of a few performance-critical functions, not everything in your code
20:25:01 there might still be a combinatorial explosion if you want versions of your code that differ in ways other than the CPU hardware
20:25:09 I came up against this in the fizzbuzz I've been writing over the last year or so
20:25:40 I want to read a vector from memory, then do two instructions with that vector as one argument and a vector in a register as the second argument
20:25:52 how is the fizzbuzz coming??
20:25:54 on Intel, it's optimal to read the vector from memory twice
20:26:08 on AMD, you want to read it into a register and then use it from there
20:26:33 this is because Intel is bottlenecked on instruction decode, so simply using fewer instructions is a gain; the L1 cache can handle the second read
20:26:53 Well, yes. It is a C++ project. It's possible there's a combinatorial explosion of templates instead. There isn't that much code in terms of source code lexically.
20:27:00 on AMD the instruction decode is faster but the L1 cache has less bandwidth, so you can spare an extra instruction to read into a register to spare the cache bandwidth
20:27:10 ais523: that might change for future AMD cpus...
20:27:16 riv: I think I have a plan, the issue is just finding the time to write this code
20:27:46 b_jonas: it's possible, but AMD seems to have been going down the path of using hyperthreading to make use of the extra instruction dispatch capability
20:28:05 (sorry, I meant dispatch, not decode; AMD is bottlenecked on decode too, but that only matters if you aren't in a loop, because of the µop cache)
20:28:22 that said, I agree that bottlenecking on either memory access or instruction dispatch is typical these days; the execution times don't matter as much, unless you are specifically writing matrix multiplication inner loops or things like that
20:29:08 even matrix multiplication is bottlenecked on memory access; most of the fast techniques for it are based on trying to avoid cache spills
20:30:05 ais523: yes, so it's only the inner loops where you actually have to care about the execution times of these floating-point multiply-add instructions.
20:30:22 it's hard to think of something that wouldn't bottleneck on memory access – maybe things like prime factorization, or pathfinding
20:31:00 b_jonas: oh, fused multiply-adds are fast, but that doesn't really matter; they're more beneficial in terms of accuracy than they are in terms of speed
20:31:35 multiply then add is 2 cycles latency, fused multiply-add is 1 cycle latency, and they both have enormous throughput (values correct for recent Intel and also recent AMD)
20:33:55 ais523: yes, the current CPUs are so optimized for that that you basically can't run out of multiplication units. I remember there was a time when the CPU was better at fused multiply-adds than additions
20:34:25 integer additions do actually beat all the floating-point stuff on most modern CPUs, though
20:35:02 on Intel, this is primarily because the execution unit that handles jumps and branches can be used to do integer additions/subtractions if it isn't needed for the jump
20:35:13 ais523: while floating-point multiplications beat integer multiplications, yes
20:35:19 (because it can handle fused compare-jump, fused subtract-jump, and friends)
20:35:38 and yes, floating-point multiplication performance is better than integer multiplication (although normally not that much better)
20:35:45 only 64-bit ones though, because the mantissa is bigger
20:35:54 Intel actually has two different floating-point multipliers with different performance characteristics
20:36:04 one has higher throughput, the other lower latency
20:36:36 ais523: used for the same instructions? I didn't know that
20:36:38 actually, the main throughput bottleneck I tend to hit is vector shuffles
20:36:49 b_jonas: I think they're mostly used for different instructions
20:37:05 Intel normally has only one vector shuffler; it's fast but you can only use it once per cycle
20:37:14 and lots of useful instructions fall into the "vector shuffle" group
20:37:43 yeah
20:38:30 there's also the infamous lane-crossing penalty (especially on AMD, but I think it affects Intel too)
20:39:15 where it costs something like 3 cycles to do anything that combines the top half of a register and the bottom half of the register, when the register is "sufficiently large" (normally a recently introduced vector size)
20:39:58 this is why lots of vector instructions are incapable of mixing the top and bottom halves of a YMM register; they're instead basically designed as two XMM instructions in SIMD (even if they aren't normally SIMD instructions)
20:40:24 ais523: yeah
20:41:35 -!- arseniiv has joined.
20:42:04 VPSHUFB for ymm registers specifically
20:42:50 that was the example I was going to use
20:43:17 I am following this because AVX2 is now available on lots of CPUs
20:43:38 and there's an annoying lack of backwards compatibility, too – even if Intel or AMD figure out how to make five bits of the index useful, they won't be able to make their VPSHUFB instructions actually handle them
20:43:41 (including my new home computer)
20:43:43 because it would break backwards compatibility
20:43:53 I've had an AVX2-capable computer for a few years now
20:44:56 ais523: they just add a new instruction for that. they're adding lots of new vector instructions all the time anyway.
20:45:30 but what do they even name it?
20:45:33 VPSHUFB5?
20:45:41 no, I think it's VPERMsomething
20:45:44 rather than SHUF
20:45:46 (with a VPSHUFB6 coming in a few years after AVX-512 is more mature?)
20:45:49 it already exists
20:45:53 or so I think
20:46:00 let me look it up, I think it's later than AVX2
20:46:00 oh, the PERM stuff normally has worse granularity than SHUF, this will be confusing
20:46:38 no, there is now a VPERMB that is a full byte-level shuffle even on a zmm register
20:46:47 the lower granularity was a thing of the past
20:47:08 knowing how AVX-512 is going, this is likely to have been specified by Intel but not actually implemented by anything
20:47:08 well, a thing of the past that's still in many CPUs that we're using now
20:47:10 but you know
20:47:15 "very long ago I made a script to say 'best viewed with Mozilla' or 'best viewed with Internet Explorer', always the one other than what the viewer is using" => rofl oh my
20:47:24 there's a lot of AVX-512 which was specified but with no implementations
20:47:34 ais523: quite possible.
20:48:28 this may end up leading to another FMA3/FMA4 debacle some time in the future
20:48:52 ais523: there's even a VPERMI2B instruction to byte-level permute *two* zmm registers
20:48:59 (FMA got specified prior to being implemented, with Intel and AMD proposing different plans; each then implemented the *other's* specification, leaving them incompatible)
20:49:00 which means 128 slots
20:49:08 wait what?
20:49:14 each implemented only the other's specification?
20:49:15 I think AMD implemented Intel's specification because they wanted to be compatible
20:49:18 I didn't follow that
20:49:26 I know they implemented incompatible stuff
20:49:30 but I didn't know they swapped
20:49:32 and Intel implemented AMD's specification because they couldn't get their own to work, it needed too much internal rearchitecting
20:49:34 that is odd
20:49:44 wow
20:49:49 (presumably this is why AMD came up with their version in the first place, it would be easier to implement)
20:49:57 but 3DNow was AMD's specification that was never in Intel, right?
20:50:08 b_jonas: yes, although a couple of 3DNow commands survived
20:50:34 admittedly, SSE is much better-designed than 3DNow was, although both are dubious in terms of encoding
20:51:07 I never really looked into the details of what 3DNow does. it was obsolete by the time I could have cared.
20:51:31 we already had SSE4.1 by the time I started to care about SIMD instruction stuff
20:52:00 b_jonas: think SSE with 64-bit-wide vectors
20:52:17 no, that's MMX
20:52:24 well, not quite
20:52:27 I thought MMX wasn't vectorised at all
20:52:37 3DNow is, as long as you want a pair of single-precision floats
20:53:17 it was the first vector unit; it simply just wasn't a very good one
20:53:25 MMX has the drawback that it shares state with the FPU, and you have to do a slow switch of the FPU between MMX and traditional mode each time you want to use it, since the existing ABI expects the FPU to be in non-MMX mode
20:53:50 MMX is "vectorized" in that it can handle two 32-bit floats in a 64-bit register
20:54:11 hmm, maybe I got them muddled then
20:54:20 but two floats per register is still a big help
20:54:21 or maybe 3DNow uses the MMX registers for its vectors
20:54:39 it also handles packed integers
20:54:52 https://en.wikipedia.org/wiki/3DNow!
20:55:02 no, I'm wrong
20:55:03 right, 3DNow! seems to be an extension to use the MMX registers as vector registers
20:55:14 apparently MMX *only* handles integers
20:55:31 that's even more useful
20:56:23 oh, so MMX does int vectorisation and 3DNow! does float vectorisation?
20:56:25 I know these days MMX is only useful to get a few extra registers that you can sometimes access with shorter encodings than anything in the later instruction sets, and it's basically never worth using
20:56:49 I have no idea what 3DNow does
20:56:55 I'm actually vaguely surprised that MMX didn't become the standard for non-vectorised floating point
20:57:15 ais523: what do you mean "the standard"?
20:57:23 it is saner than x87, and supported by all 64-bit CPUs
20:57:27 b_jonas: in the ABI
20:57:54 like, the ABI passes floats in MMX registers, assumes MMX mode at call boundaries, and the like
20:58:05 ais523: which ABI? we can't change the x86_32 ABI, it's too late for that, and x86_64 always comes with SSE2, so by that time the point is moot
20:58:17 b_jonas: x86_64
20:58:19 also, if MMX only handles integers then that can't work
20:58:34 no, MMX definitely does floats
20:58:55 -!- Lord_of_Life has quit (Ping timeout: 260 seconds).
20:58:59 no, I'm wrong
20:59:01 it doesn't do floats, only ints
20:59:07 that's why people don't use it for float maths :-)
20:59:19 -!- Lord_of_Life has joined.
20:59:39 ais523: also, SSE2 is the standard for passing floats in the x86_64 ABI, and that's a good thing
21:00:02 because with SSE2 there, MMX is almost never useful
21:00:21 and SSE2 adds advantages, both wider vectors and a better instruction set
21:00:23 so it looks like we have three sets of registers: integer; x87/MMX/3DNow!; and XMM/YMM/ZMM
21:00:35 oh, 3DNow also uses the x87 registers?
21:00:43 (also random special-purpose stuff like flags, but I'm not counting those)
21:00:45 b_jonas: right
21:00:56 also (sigh) we have AVX-512 mask registers.
21:01:21 on AVX-512-capable CPUs, that is
21:01:29 x87 interprets the registers as one of three float formats (long double, plus formats which are almost but not quite the same as float and double); MMX as 64-bit integer vectors; and 3DNow! always as two floats
21:01:43 b_jonas: to be fair those are really helpful for some applications
21:02:42 ais523: no, x87 specifically stores 80-bit floats, not long doubles. there's a difference because long double is 64-bit floats in the MSVC ABI
21:03:08 well, yes, but they're what has been known as "long double" for ages on Intellish processors
21:03:23 but they got deprecated with the change to 64-bit
21:03:40 because SSE2 handles 64-bit floats, yes
21:05:43 [[Cabra]] M https://esolangs.org/w/index.php?diff=88008&oldid=81202 * PythonshellDebugwindow * (+0) /* Language Definition */ Fix typo
21:05:46 Wikipedia says that 3DNow! invented SFENCE
21:05:47 wtf, there's a KADDB/KADDW/KADDD/KADDQ AVX-512 instruction? I never noticed that
21:06:30 ais523: I admit I don't follow how the fence instructions work. I leave them to slightly higher-level libraries.
21:06:31 but that seems unlikely to me, because my understanding of the x86 memory model is that an SFENCE is only useful with non-temporal writes or write-combining memory, and I didn't think those were implemented at that point
21:07:00 b_jonas: I can describe the general (non-x86-specific) implementation fairly easily
21:07:15 imagine loads and stores not happening instantly, but being spread out over time
21:07:46 an lfence stops a load crossing it (it has to happen entirely before if it's before the lfence, or entirely after if it's after the lfence)
21:07:51 likewise, an sfence stops a store crossing it
21:08:13 I think PPC conventionally has a double-double as its `long double` type.
21:08:44 if one thread is storing two pieces of data, and another thread is loading them, then you need to sfence between the stores and lfence between the loads if you want to prevent the loading thread seeing the new value of the second store, but the old value of the first store
21:09:45 spread out over time how? you mean they happen at different times at different layers of the cache hierarchy, going down the hierarchy if either the smaller caches need to free up space or to make the value known to other CPUs?
21:10:14 b_jonas: imagine that you send a "request to write memory" but then continue executing before the request has been handled
21:10:20 and let the motherboard respond to the request at some later time
21:10:21 ais523: "stops a load crossing it" at what levels of the hierarchy?
21:11:01 it's a logical rather than physical barrier, it's not bound to a specific level of hierarchy 21:11:08 so you have to match an lfence on one thread with an sfence on another 21:11:11 ok 21:11:32 I still think I don't need to know the details of this, what I do need to know is the atomic and mutex abstractions over them that libraries provide me 21:11:50 on x86-64 specifically I think it's handled as part of the cache coherency mechanism 21:11:56 because I don't think I write inter-thread (or inter-process) communication code that is at a lower level than those 21:12:09 right 21:12:34 nor CPU-level code that handles memory mapped to the video card or other memory-mapped IO 21:12:43 one way to think about it is that sfence is one of the two main mechanisms for implementing the "release" atomic ordering, and lfence is one of the two main mechanisms for implementing the "acquire" atomic ordering 21:13:20 atomic release is sfence then write; atomic acquire is read then lfence 21:14:03 only, x86 has extra guarantees that most processors don't, so sfence is usually a no-op and I think many atomic libraries leave it out there on x86-64 (even though they would use it on other processors) 21:14:18 (lfence is not a no-op, though, and is important in atomic code) 21:15:19 and as far as I understand, the compilers need to know about both fences and atomics, because they have a meaning not only on the CPU level, but for what the optimizer isn't allowed to do, and current compilers indeed do this. (in contrast, I think the compiler needn't know about mutexes directly.) 
21:16:22 the compiler does need to know about the acquire/release rules on mutexes, but either it can see the atomic read/write in the function, or else it can't see anything at all and thus has to assume the worst 21:16:51 oh, this reminded me of a weird case of wanting a compiler barrier specifically 21:17:13 yes, the fast (non-contended) cases mutex functions must be fast so the optimizer will see into the functions when necessary 21:17:19 s/cases/cases of/ 21:17:21 the idea would be in functions that undropped permissions, did a system call with checks, then dropped them again 21:17:43 to do the checks with permissions raised, and to compiler-barrier to ensure that the undropping is done before the checks 21:18:01 this sounds like it violates the least-permissions principle, but the point is to protect the checks from return-oriented programming 21:18:11 in order to get permission to do the system call, the code would need to run through the checks too 21:18:46 ais523: what kind of permission checks? aren't those undropping, permission checking, and dropping three system calls, and the compiler already mustn't reorder system calls? 21:19:01 it also reminds me of something similar by the way 21:19:29 b_jonas: say, you want an mprotect() wrapper that checks that you aren't making anything executable if it was previously nonexecutable 21:19:55 ais523: ah, so by permission checking you just mean accessing memory that may or may not have read/write/execute permissions? 21:20:07 hmm, that might be difficult 21:20:18 I didn't mean permission checking, just checking in general 21:20:42 but even so, can't a system call basically write anything to anywhere in your user memory, so you usually can't reorder memory accesses around them anyway? 21:21:11 oh, that's interesting – the point being that compilers wouldn't optimise system calls anyway due to not knowing what they do?
21:21:30 ais523: yes, except maybe a few specific system calls of which they know the meaning 21:21:51 there are system calls like pread that can write even to memory that you didn't pass a pointer to to the system call 21:21:58 I know there are some functions that can make system calls, and that the compiler treats specially 21:22:01 malloc, for example 21:22:30 -!- delta23 has quit (Quit: Leaving). 21:23:07 no, not pread, sorry. 21:23:15 preadv 21:24:35 preadv can write anywhere, and a compiler has to assume that an unknown system call can do things worse than that 21:24:53 preadv seems so specific 21:25:18 I can see why it could be useful – it saves the overhead of making multiple system calls when you want to do that operation specifically – but I'm unclear on how common that particular operation would be 21:25:25 it's specific in that it's a particularly badly behaving system call, that's why I'm giving it here as an example 21:25:37 most system calls are tamer than that, but the compiler can't easily rely on that 21:26:14 I meant, I was thinking on a different line of thought when you mentioned preadv 21:26:25 like, what was the motivation behind adding that to the kernel? who needed it, and what do they do with it? 21:26:52 I don't really know.
21:27:21 99% of programs would just read into a large buffer and then copy the data into the appropriate final locations, rather than spending time coming up with a big description for preadv 21:27:29 although, preadv is faster because it reduces cache pressure 21:28:21 I think it might be there because they wanted to get asynchronous regular file reading to work, which turned out quite hard and they're still struggling with it, but anyway the interface of the async reading allows similar scatter-gather read because it has to allow multiple reads at the same time, so they added a normal non-async interface at that point 21:29:03 but maybe someone just used a cyclical buffer and wanted to micro-optimize the number of system calls, since the context change for system calls used to be slower than now 21:29:19 preadv is very old, you have to remember that 21:29:25 so it can have some odd historical reason 21:29:33 …now I'm wondering if preadv is faster than mmap + memcpy 21:29:47 it could be, I guess? because the physical memory you mmap into has to be cached 21:31:24 ais523: yeah, look, the manpage says "these system calls first appeared in 4.2BSD" 21:31:34 so old it's hard to speculate about it 21:32:55 ais523: readv is older than pread if the system call numbering can be believed 21:33:25 b_jonas: while searching about uses of preadv, I found some mailing list archive mentions which implied that readv was newer 21:33:31 -!- oerjan has quit (Quit: Nite). 21:37:26 ais523: anyway, what this reminded me of is the SSE floating point control word. it controls the rounding mode, the exception mask, and two bits to change denormal inputs and results to zero in floating point instructions because those denormals would cause a big slowdown on certain CPUs.
anyway, the compiler *should* know about the semantics of the SSE floating point control word to the extent that 21:37:32 it's not allowed to reorder floating point arithmetic around changing the control word, but current compilers don't yet know this, so it's not quite clear how you can write to the floating point control word in a useful way without potential undefined behavior. 21:37:46 the situation is similar to the atomic operations back when multithreading was new and compilers didn't yet know much about it 21:38:02 or when SMP was new. 21:39:49 what's the performance of changing the floating-point control word like? 21:39:57 I can easily imagine algorithms which want to change it a lot 21:40:20 and no, you can't just change the floating point control word in a non-inlinable function, partly because the ABI says that the rounding mode etc has to be in its default state between function calls, and more importantly because the compiler is normally allowed to reorder a floating point operation around an unknown function call. 21:40:34 IIRC AVX-512 dedicates a couple of bits of the instruction to override parts of the FPU control word 21:41:17 fwiw, on gcc you could probably get away with an asm volatile that takes the results of previous FPU instructions and inputs of subsequent FPU instructions as read-write parameters and then doesn't change them 21:41:38 would be annoying to write, but gcc would be forced to put the control-word-changing operation in the right place 21:41:49 ais523: there are two cases when you want to change the floating point control word a lot.
one is if you want to use a non-default floating point control word, but also do function calls or returns to code that you don't control since technically you have to restore the default control word because any library function is allowed to do floating-point instructions; the other is interval arithmetic which can 21:41:55 change the rounding mode a lot 21:42:09 ais523: but more likely you just want to change the control word once, then do a lot of float operations 21:43:35 well, admittedly there's a third case, if you want to read the exception flags and have to reset them for that reason 21:43:45 I was thinking of interval arithmetic 21:43:52 exception flags might also be relevant in some algorithms 21:44:13 and I don't know what performance writing the control word has, I'm mostly concerned about cases when that doesn't matter 21:50:07 apparently you can get slowdowns for denormal results in both Intel and AMD, and the optimization manuals for the two brands detail when these can and can't happen and what you should do about them 21:50:35 the actual rules do differ between Intel and AMD, at least for some of their CPUs 21:52:15 I think for many float operations, denormals are emulated in firmware rather than having dedicated hardware 21:52:19 so the performance is terrible 21:52:58 ais523: that's the gist of it, but the details are complicated. not all instructions with a denormal input or output give a slowdown. 21:53:24 that's why there's an optimization manual 21:53:30 I don't claim to really understand the rules 21:53:55 -!- normsaa has quit (Ping timeout: 256 seconds). 21:53:57 I just have to know where to look if this becomes important 21:54:27 fungot, does a circle have no end? 21:54:28 b_jonas: yeah you said that geographic location has to do with the code than by using only the functional aspects of scheme systems support arbitrary-size vectors... 22:16:19 fungot: alright, but at least it should have a coend, does it?
(I don’t know what a coend is, that’s something from category theory) 22:16:20 arseniiv: for example because it pollutes the default namespace 22:17:31 hm that’s a bit too philosophical 22:18:55 fungot: What is a coend but a colimit over a bifunctor? 22:18:56 Corbin: i suppose fnord is the original? 22:19:21 fungot: Bi-fnord-tor? 22:19:21 Corbin: no, he refuses to give me access to your harddrive.) 22:20:36 fungot always tries to fool you with words when you ask significant questions 22:20:36 arseniiv: even the ' web-sivu' in the xvid format... would it be for me making return values in your helper function seem to always be using fd3 itself, though 22:21:31 fungot: at least tell me this: xvid or divx? 22:21:31 arseniiv: what, thin and unnecessarily bendy? and!... but is that what you recorded was to later evaluate ( quux zot) 22:21:54 exactly as I said, no definite answer :′( 22:23:54 > fix fungot 22:23:54 arseniiv: what with my poor grammar and ' be's all over the state. so you can do 22:23:55 *Exception: Can't be done 22:36:08 -!- Cale has quit (Remote host closed the connection). 22:38:37 -!- Cale has joined. 22:39:16 -!- chiselfuse has quit (Write error: Connection reset by peer). 22:39:16 -!- hendursaga has quit (Write error: Connection reset by peer). 23:46:08 [[Esolang:Sandbox]] M https://esolangs.org/w/index.php?diff=88009&oldid=87590 * PythonshellDebugwindow * (+18) rd 23:46:38 [[Esolang:Sandbox]] M https://esolangs.org/w/index.php?diff=88010&oldid=88009 * PythonshellDebugwindow * (+1) Rd 23:46:51 [[Esolang:Sandbox]] M https://esolangs.org/w/index.php?diff=88011&oldid=88010 * PythonshellDebugwindow * (+1) :