2021-09-11 - libera.chat:#esolangs

00:45:50 <fizzie> WDYT, if an IRC client sends a periodic PING to avoid the dreaded "TCP connection was lost but the client has nothing to write" issue, should it bother to try to also verify the server responds with a corresponding PONG, or is that superfluous? It's definitely not necessary for the TCP thing, but hypothetically there might be a server that continues to speak TCP but not respond to commands. And

00:45:52 <fizzie> often it's used to estimate latency, but that's a different feature.

00:48:44 <zzo38> I check manually. I have the F2 key bound to PING and then I can see if PONG is received or not. (For automated IRC clients, it could check automatically)

00:50:21 <fizzie> Yes, this would be for an automaton. Just wondering if it's a failure mode that really needs worrying about, assuming I don't care about estimating the latency to try to jump servers if it's too high or w/e.

00:55:00 <zzo38> At least in my experience, if I try to PING and it isn't working, there will eventually be a connection error anyways. However, it might be worth checking after some (configurable) timeout anyways.

00:56:09 <zzo38> It has also happened to me that I was able to receive but not send. In this case, eventually the server will disconnect me due to a ping timeout.

00:56:16 <shachaf> What if it continues to speak TCP and respond to pings, but not to other commands?

00:57:09 <zzo38> Then I would think that the server is defective, probably.

00:57:47 <fizzie> I guess I could have it privmsg me to solve a CAPTCHA. But then what if the server sends that message to some other human who responds to it?

00:58:11 -!- chiselfuse has quit (Remote host closed the connection).

00:58:25 -!- chiselfuse has joined.

00:58:46 <zzo38> You can write a question that you do not expect anyone else to know the answer

01:31:14 -!- earendel has joined.

01:53:17 -!- dutch has joined.

05:27:01 <nakilon> fizzie I always thought client is exactly supposed to respond to PING with PONG, not just send periodic PING on their own

05:31:28 <nakilon> also if I believe this line https://github.com/Nakilon/nakiircbot/blob/43bf3dfa932e78f19b656520d29629c9bf94c5bc/lib/nakiircbot.rb#L99 Quakenet used this command for measuring the latency too

05:33:34 <nakilon> I mean when I was making this comment I was reusing some old Quakenet bot that IIRC it had the timestamp parsing in it

05:33:57 <nakilon> but as it says in case of Libera there is just server name there

06:08:49 <esolangs> [[School]] https://esolangs.org/w/index.php?diff=87989&oldid=87986 * AceKiron * (+391) Added the PUSH and POP memory operants

07:54:15 <esolangs> [[Matrix (data structure)]] N https://esolangs.org/w/index.php?oldid=87990 * AceKiron * (+174) Created page with "A **matrix** is a data structure that can serve as an programming language's memory. The number of stacks may vary. Many languages have other methods of data storing as well."

07:54:27 <esolangs> [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87991&oldid=87990 * AceKiron * (+3)

07:55:43 <esolangs> [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87992&oldid=87991 * AceKiron * (+105)

07:56:17 <b_jonas> fizzie: I check PONG replies anyway to know when the server has processed my previous commands, which I need to know to not send more commands to the server than fit in its buffer, or else it would quit me.

07:56:29 <b_jonas> and at that point you probably want a timeout too

07:58:22 <esolangs> [[Category:Matrix-based]] N https://esolangs.org/w/index.php?oldid=87993 * AceKiron * (+181) Created page with "Languages primarily using one or more [[Matrix_(data_structure)|matrix]]s for storage. ==See also== * [[:Category:Queue-based]] * [[:Category:Stack-based]] Category:Langu..."

07:59:15 <esolangs> [[School]] https://esolangs.org/w/index.php?diff=87994&oldid=87989 * AceKiron * (+225)

07:59:58 <esolangs> [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87995&oldid=87992 * AceKiron * (+10)

08:01:00 <esolangs> [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87996&oldid=87995 * AceKiron * (+46)

08:06:20 -!- hendursa1 has joined.

08:08:51 -!- hendursaga has quit (Ping timeout: 276 seconds).

08:21:22 <esolangs> [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87997&oldid=87996 * AceKiron * (+1)

08:47:39 <esolangs> [[Matrix (data structure)]] https://esolangs.org/w/index.php?diff=87998&oldid=87997 * AceKiron * (+64)

08:51:10 <esolangs> [[School]] https://esolangs.org/w/index.php?diff=87999&oldid=87994 * AceKiron * (-2)

09:05:37 -!- spruit11_ has quit (Quit: https://quassel-irc.org - Chat comfortably. Anywhere.).

09:05:59 -!- spruit11 has joined.

09:31:40 -!- Koen_ has joined.

09:32:53 -!- Sgeo has quit (Read error: Connection reset by peer).

09:48:52 <esolangs> [[School]] https://esolangs.org/w/index.php?diff=88000&oldid=87999 * AceKiron * (+15) /* Memory operants */

09:51:30 -!- Trieste_ has joined.

09:51:46 -!- Trieste has quit (Ping timeout: 240 seconds).

09:58:02 -!- Oshawott has joined.

10:01:31 -!- archenoth has quit (Ping timeout: 252 seconds).

11:04:25 <esolangs> [[Special:Log/newusers]] create * Bsoelch * New user account

11:20:40 -!- hanif has joined.

11:26:01 <fizzie> Yes, I mean, the client does need to respond to PING with a PONG, but that's a different thing.

11:30:30 <riv> does IRC need ping and pong? doesn't TCP already have this basically

11:34:04 <fizzie> TCP has an *optional* keepalive option. But I don't think it's very popular compared to application protocol heartbeats.

11:38:24 <fizzie> As for not sending too many things, I'm using a credit-based system (each byte costs so and so, some commands have an extra surcharge, the client gets credit at a fixed rate capped to some maximum value) to approximate that. That's what ircd (at least the real one, the one used at IRCnet) does on the server side. Of course it's not exactly exact due to network latency and so on, but it's been

11:38:26 <fizzie> working just fine.
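
The credit scheme described above behaves much like a token bucket. A minimal Python sketch follows, under stated assumptions: the class name `SendCredit`, the rate, the cap, the per-byte cost, and the surcharge table are all illustrative and are not the values any real ircd uses.

```python
import time


class SendCredit:
    """Credit-based send limiter: credit accrues at a fixed rate, up to a cap."""

    def __init__(self, rate=100.0, cap=1000.0, per_byte=1.0,
                 surcharge=None, clock=time.monotonic):
        self.rate = rate                  # credit gained per second (assumed value)
        self.cap = cap                    # maximum credit that can be stored
        self.per_byte = per_byte          # cost of each byte of a line
        self.surcharge = surcharge or {}  # extra cost per command, e.g. {"JOIN": 20.0}
        self.clock = clock                # injectable so the limiter can be tested
        self.credit = cap
        self.last = clock()

    def _refill(self):
        # Add the credit earned since the last call, capped at the maximum.
        now = self.clock()
        self.credit = min(self.cap, self.credit + (now - self.last) * self.rate)
        self.last = now

    def cost(self, line):
        # The first word of a client line is taken to be the command name.
        command = line.split(" ", 1)[0].upper()
        return len(line.encode("utf-8")) * self.per_byte + self.surcharge.get(command, 0.0)

    def try_send(self, line):
        """Return True and deduct the cost if the line is affordable right now."""
        self._refill()
        price = self.cost(line)
        if price > self.credit:
            return False
        self.credit -= price
        return True
```

A caller would queue any line for which `try_send` returns False and retry it later. As noted in the discussion, this only approximates the server's view because of network latency.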

11:40:23 <fizzie> On keepalive, IIRC the default timeouts tend to be huge (hours), and configurable only system-wide.
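
For comparison, on Linux at least, the keepalive timers can also be overridden per socket instead of only system-wide. A hedged Python sketch: the helper name `enable_keepalive` and the timer values are assumptions, and the `TCP_KEEP*` options are applied only where the platform's `socket` module exposes them.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=3):
    """Turn on TCP keepalive and, where supported, set per-socket timers.

    idle:     seconds of silence before the first probe
    interval: seconds between probes
    count:    unanswered probes before the connection is dropped
    Returns a dict of the timer options that were actually applied.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {}
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        option = getattr(socket, name, None)  # not every platform defines these
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
    return applied
```

This does not replace an application-level PING, since keepalive only shows that the peer's TCP stack is alive, not that the server is still processing commands.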

11:48:12 <esolangs> [[Meow]] https://esolangs.org/w/index.php?diff=88001&oldid=87959 * Martsadas * (+20) /* fixed mistakes*/

11:49:14 <esolangs> [[Meow]] M https://esolangs.org/w/index.php?diff=88002&oldid=88001 * Martsadas * (+27)

12:15:51 -!- hanif has quit (Ping timeout: 276 seconds).

12:50:37 -!- earendel has quit (Quit: Connection closed for inactivity).

12:52:38 <esolangs> [[Matrix]] M https://esolangs.org/w/index.php?diff=88003&oldid=42721 * PythonshellDebugwindow * (+50) Confusion

12:52:48 <esolangs> [[Matrix (data structure)]] M https://esolangs.org/w/index.php?diff=88004&oldid=87998 * PythonshellDebugwindow * (+50) Confusion

12:52:57 <esolangs> [[Matrix (data structure)]] M https://esolangs.org/w/index.php?diff=88005&oldid=88004 * PythonshellDebugwindow * (-17) m

12:58:01 -!- hanif has joined.

13:12:27 -!- hendursa1 has quit (Quit: hendursa1).

13:12:53 -!- hendursaga has joined.

13:40:19 <esolangs> [[Special:Log/newusers]] create * 4gboframram * New user account

13:47:28 <esolangs> [[Esolang:Introduce yourself]] https://esolangs.org/w/index.php?diff=88006&oldid=87982 * 4gboframram * (+184) /* Introductions */

14:17:17 -!- delta23 has joined.

14:32:41 -!- Koen_ has quit (Remote host closed the connection).

14:33:22 -!- velik has quit (Remote host closed the connection).

14:34:00 -!- velik has joined.

14:40:11 -!- velik has quit (Remote host closed the connection).

14:40:29 -!- velik has joined.

14:41:55 -!- velik has quit (Remote host closed the connection).

14:42:13 -!- velik has joined.

14:45:28 -!- velik has quit (Remote host closed the connection).

14:47:31 -!- velik has joined.

15:03:05 -!- normsaa has joined.

15:03:22 <normsaa> https://pastebin.com/px6HUCLV how can this binary be decoded?

15:03:28 <normsaa> In any esoteric lang?

15:06:38 <Corbin> normsaa: Where did it come from?

15:07:14 <normsaa> Corbin A friend

15:07:30 <normsaa> He wrote it

15:09:46 <normsaa> Any ideas?

15:23:26 <int-e> . o O ( it's too bad that the flag doesn't identify the CTF this is from )

15:23:39 <int-e> normsaa: tell your "friend" to solve the problem properly, by themselves.

15:39:04 <Corbin> Also tell your friend to fix the overlapping assignment, which probably breaks the script.

15:51:39 -!- hanif has quit (Ping timeout: 276 seconds).

15:56:32 -!- hanif has joined.

16:03:37 -!- Koen_ has joined.

16:11:38 <b_jonas> riv: yes, IRC sort of requires PING and PONG for at least three reasons. some servers (not freenode, I haven't looked at libera yet) require that you send *one* pong after connecting, copying an unpredictable code from the PING that the server sends, as a sort of anti-spam measure. second, some servers, including freenode (again, haven't looked at libera yet) require that the client sends something

16:11:44 <b_jonas> every five minutes, to ensure that it can drop clients that are disconnected. it ensures that clients do this by sending pings, you don't need to reply to those technically, but replying to pings is an easy way to satisfy this requirement.

16:14:38 <b_jonas> thirdly, you can use pings for flow control. the way IRC works is that the server has a very small input buffer for each client, and if the client sends more than that input buffer over what the server has handled locally, it disconnects the client. the server handles commands for one client in series, so if you pay attention to local replies (replies from that server, not other servers), you can

16:14:44 <b_jonas> sometimes tell how much the server handled, and so how full the queue is. but not all commands have local replies, or the local reply isn't always easy to identify, so sometimes you want to send a command just to force a local reply. the best command for that is a local PING (as opposed to a PING to a different server), since that does nothing but send you a reply.
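
The flow-control idea above can be sketched as bookkeeping around tagged PINGs. This is an illustration under assumptions, not any client's actual code: the class name `PingFlowControl`, the token format, and the 512-byte budget are made up, and extracting the token from the server's PONG line is left to the caller.

```python
import itertools
import time
from collections import deque


class PingFlowControl:
    """Track how many bytes the server may still have queued, using local PINGs."""

    def __init__(self, budget=512, clock=time.monotonic):
        self.budget = budget     # assumed size of the server's per-client input buffer
        self.clock = clock
        self.total_sent = 0      # bytes written so far
        self.total_acked = 0     # bytes known to have been processed
        self.pending = deque()   # (token, total_sent after the PING, send time)
        self._ids = itertools.count(1)

    @property
    def in_flight(self):
        return self.total_sent - self.total_acked

    @staticmethod
    def _size(line):
        return len(line.encode("utf-8")) + 2  # +2 for the trailing CRLF

    def can_send(self, line):
        return self.in_flight + self._size(line) <= self.budget

    def record_sent(self, line):
        self.total_sent += self._size(line)

    def make_ping(self):
        """Build a PING whose reply proves everything sent before it was handled."""
        token = f"fc{next(self._ids)}"
        line = f"PING {token}"
        self.record_sent(line)
        self.pending.append((token, self.total_sent, self.clock()))
        return line

    def on_pong(self, token):
        """Mark everything up to the matching PING as processed."""
        if not any(t == token for t, _, _ in self.pending):
            return False  # not one of ours
        while self.pending:
            t, mark, _ = self.pending.popleft()
            self.total_acked = mark
            if t == token:
                break
        return True

    def overdue(self, timeout):
        """True if the oldest unanswered PING is older than `timeout` seconds."""
        return bool(self.pending) and self.clock() - self.pending[0][2] > timeout
```

Because the server handles one client's commands in series, a PONG for a given token implies that every earlier line has been consumed. The `overdue` check covers the timeout mentioned earlier in the discussion.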

16:20:54 -!- velik has quit (Remote host closed the connection).

16:21:23 -!- velik has joined.

16:22:00 -!- velik has quit (Remote host closed the connection).

16:22:17 -!- velik has joined.

16:23:59 <b_jonas> int-e: wait, do you actually recognize what that is, or do you just know it's homework from what it looks like and how they simultaneously cross-post on multiple channels?

16:30:29 <fizzie> int-e: I was half-expecting doing a web search for the flag value would tell you where it's from (surely all of those have answers posted online?), but apparently it doesn't.

16:31:05 <int-e> fizzie: yeah. which /could/ indicate that it's an ongoing one, or just that it's very obscure

16:31:27 <fizzie> "It looks like there aren't many great matches for your search. Tip: Try using words that might appear on the page that you’re looking for. For example, 'cake recipes' instead of 'how to make a cake'."

16:31:40 <fizzie> Mmm, cake.

16:32:57 <int-e> . o ( glados instead of cake )

16:38:06 <hanif> google gave me this video https://www.youtube.com/watch?v=JMrd8PoxvPc, but the author doesn't do this challenge in the video

16:38:06 <keegan> it's a piece of cake to bake a pretty cake

16:38:36 <hanif> and the ctf site linked is dead and unarchived

16:44:25 <riv> normsaa

16:46:43 <b_jonas> fizzie: there's technically a third, most unlikely case: that it's from a site like Advent of Code that gives every logged in user a different test input

16:59:58 -!- Koen_ has quit (Remote host closed the connection).

17:00:26 -!- j-bot has quit (Remote host closed the connection).

17:00:40 -!- j-bot has joined.

17:07:26 -!- oerjan has joined.

17:21:23 -!- arseniiv has joined.

17:30:35 -!- normsaa90 has joined.

17:33:15 -!- normsaa has quit (Ping timeout: 256 seconds).

17:36:23 -!- normsaa has joined.

17:39:29 -!- normsaa90 has quit (Ping timeout: 256 seconds).

17:40:12 -!- hanif has quit (Ping timeout: 276 seconds).

17:41:39 -!- normsaa91 has joined.

17:41:45 -!- normsaa has quit (Ping timeout: 256 seconds).

18:02:07 -!- immibis has quit (Remote host closed the connection).

18:05:43 -!- immibis has joined.

18:05:47 -!- Sgeo has joined.

18:12:21 -!- normsaa91 has quit (Ping timeout: 256 seconds).

18:14:59 -!- Guest81 has joined.

18:14:59 <Guest81> h

18:15:41 -!- Guest81 has quit (Client Quit).

18:16:02 -!- normsaa has joined.

18:31:02 <fizzie> "Site compatible with IE 10 or above, Mozila [sic], ..." is probably not a good sign.

18:33:30 <arseniiv> ow

18:34:14 <arseniiv> `? tlwtnt

18:34:17 <HackEso> tlwtnt? ¯\(°_o)/¯

18:34:30 <arseniiv> ow!

18:37:04 <nakilon> \wp ow

18:37:06 <velik> OW -- Wikimedia disambiguation page https://en.wikipedia.org/wiki/OW

18:38:23 <nakilon> looks like sometimes there is a default page and sometimes not

18:38:53 <b_jonas> fizzie: does it also have a link to where you can download Acrobat Reader to view their PDFs and the Java and Adobe Flash plugins, without mentioning Oracle for Java?

18:39:31 <b_jonas> also do they recommend at least a 256-color and at least 1024x768 pixel resolution display for best view?

18:40:54 <b_jonas> very long ago I made a script to say "best viewed with Mozilla" or "best viewed with Internet Explorer", always the other one than the viewer is using

18:41:19 <nakilon> lol

18:41:49 <nakilon> b_jonas do you know who chukchas are?

18:41:59 <b_jonas> no

18:42:22 <nakilon> ethnic Siberians who live deep in tundra with deers

18:42:27 <nakilon> there is a joke

18:43:02 <zzo38> b_jonas: What will it do if neither is in use?

18:43:22 <nakilon> smth like: "to keep chukcha busy give him a paper with 'read on the other side' written on both sides"

18:43:31 <keegan> b_jonas: lol, evil

18:44:40 <nakilon> \wp chukcha

18:44:42 <velik> Chukchi people -- ethnic group https://en.wikipedia.org/wiki/Chukchi_people

18:45:21 <nakilon> oh, even not Siberia

18:51:00 <b_jonas> zzo38: one of them was the default. I don't remember which.

18:52:19 <fizzie> It didn't have those other things. Maybe it would have elsewhere on the site.

18:57:13 <esolangs> [[School]] https://esolangs.org/w/index.php?diff=88007&oldid=88000 * AceKiron * (-80)

19:00:25 -!- ais523 has joined.

19:01:08 <ais523> I think the reason why some servers ping during connection, and don't connect until they receive a matching pong, is to prevent non-IRC-related programs being tricked into connecting to IRC

19:01:33 <ais523> if your ircd ignores invalid commands (and many do), it isn't hard to put a segment of valid IRC commands in the middle of, say, an HTTP POST request

19:01:39 <riv> that is a good reason but only requires one PING right at the start

19:01:55 <keegan> I remember a spam attack on Freenode that worked by exactly that mechanism

19:02:01 <ais523> so you can create a web page with a script that causes the viewers to spam IRC, and this has been used to create IRC worms in the past

19:02:21 <keegan> it would POST a set of IRC commands that cause the user to join a bunch of channels and spam them with the URL of the page

19:02:24 <keegan> right

19:02:25 <keegan> pretty funny really

19:02:43 <nakilon> HAMRADIO

19:02:50 <keegan> Postel's Law sounds good but is absolutely terrible for security

19:03:05 <riv> Postel's Law is bad

19:03:14 <keegan> also bad for long term maintainability

19:03:22 <riv> i remember that spam attack, that was funny

19:03:34 <nakilon> \wp Postel's Law

19:03:35 <velik> robustness principle -- design guideline for software that states: "be conservative in what you do, be liberal in what you accept from others" https://en.wikipedia.org/wiki/Robustness_principle

19:05:54 <keegan> as users expect the "best guess" behavior of implementations will continue working forever

19:06:11 <keegan> leading to the codification of insanely complex behavior as exemplified by the WHATWG HTML spec

19:06:48 <ais523> I do actually like what that HTML spec has done, though

19:07:06 <ais523> because it means that there are now set boundaries for exactly what you are and aren't allowed to do in HTML

19:07:13 <keegan> right

19:07:29 <keegan> it's a regrettable necessity based on the early days of the web being dominated by ad hoc systems and postel's law

19:07:34 <ais523> it is Postellish in some respects, too, e.g. saying that web pages must be in UTF-8 but giving long complicated instructions for what to do if they aren't

19:08:46 -!- arseniiv has quit (Ping timeout: 260 seconds).

19:09:45 <ais523> actually, one related problem I've been having recently, which may be unsolvably difficult, and Stack Overflow has not been helpful:

19:10:06 <ais523> given a URL, which characters in it can be safely percent-decoded without changing the meaning of the URL

19:10:10 <ais523> ?

19:11:42 <ais523> I'm trying to write an HTML sanitizer and would prefer to avoid allowing people to put obfuscated URLs through it, but it's so hard to figure out the rules for what will and what won't work
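
One conservative starting point for the question above is RFC 3986, which treats a percent-escape of an unreserved character (letters, digits, `-`, `.`, `_`, `~`) as equivalent to the character itself. The Python sketch below decodes only those escapes and leaves every other escape encoded. `decode_unreserved` is a hypothetical helper, and it makes no claim about how any particular browser or httpd treats the remaining escapes.

```python
import re
import string

# RFC 3986 section 2.3: characters that never need percent-encoding.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(url):
    """Decode %XX only when XX is an unreserved character; uppercase the rest."""
    def replace(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        # Keep the escape, normalising its hex digits to uppercase.
        return "%" + match.group(1).upper()
    return _ESCAPE.sub(replace, url)
```

Each escape is examined once, so `%2541` stays `%2541` and is not double-decoded into `A`. Note that `%2e%2e` becomes `..`, which makes a hidden dot-segment visible to whatever check runs next.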

19:12:28 <nakilon> generate string of all chars and escape it with some very common library used for that need

19:12:35 <nakilon> to see what chars it will process

19:14:10 <ais523> nakilon: that basically means assuming that the library is correct, which it probably won't be

19:14:15 <nakilon> pretty sure all libraries will process different set of chars )

19:14:28 <nakilon> there is probably no correct library

19:14:32 <ais523> that said, I have been trying various test strings on various browsers and one httpd, to see what happens

19:14:46 <nakilon> maybe some Chrome is implemented "correctly" but it won't provide a library

19:14:50 <ais523> (testing a wide range of httpds would be frustrating)

19:15:22 <ais523> one thing I did learn throughout all this is that the URL path component %2e%2e is in fact equivalent to .. and will cancel out the previous component

19:15:38 <ais523> which seems like an unwise decision from a security point of view, that's just asking for path traversal vulnerabilities

19:15:43 <nakilon> also the possible achievable "correctness" of your tool is limited by how correct the servers are

19:16:01 <nakilon> many of them work differently about URL escaping

19:16:13 <ais523> right

19:16:30 <ais523> I think the only real option here is to have a parameter for what sort of dubious-looking escapings the user wants to exclude

19:16:37 <nakilon> also additional rules and bugs in redirects

19:22:08 <nakilon> teaching velik wolfram alpha, somehow it took the whole day to make 10 tests, and it's only a piece of Math examples; there are three other topics, maybe I'll make most of them tomorrow

19:32:17 <b_jonas> ais523: "prevent non-IRC-related programs being tricked into connecting to IRC" => yes, that might be part of the reason.

19:33:46 <b_jonas> "avoid allowing people to put obfuscated URLs through it" => yeah, that's probably impossible

20:13:55 <fizzie> Normally I don't pay *that* much attention to update sizes, but now updating blender wants to install "libembree3-3" that will take half a gigabyte of disk.

20:15:45 <ais523> that's a pretty big library!

20:16:18 <ais523> b_jonas: it just seems so wrong to let people post arbitrary URLs on, say, forums or the like, when you're supposed to be sanitising the content

20:16:40 <fizzie> "Intel® Embree is a collection of high performance ray tracing kernels that helps graphics application engineers to improve the performance of their photorealistic rendering application." I feel like they've probably got versions specifically tuned for a bazillion different (Intel) CPU models.

20:18:21 <fizzie> Heh, it's a single 485223648-byte .so file.

20:18:47 <b_jonas> ais523: I'm not sure I see why, except for the part where you might sanitize the protocol part (the part before the first colon) and add a max length

20:19:43 <ais523> fizzie: I suspect that you only need around eight different versions of your code to get peak performance on all 64-bit Intel CPUs

20:19:49 <b_jonas> ais523: though most of the time you'd probably throw in that HTML attribute that hints to search engines that this is a link by a third party submission and the search engine shouldn't think your site is deliberately linking to kiddy porn

20:20:02 <ais523> and maybe another four or five more for AMD

20:20:22 <ais523> b_jonas: I've already been looking through the list of rel= attributes

20:20:27 <b_jonas> it won't take away all the responsibility about what links you host, but you can't generally fix that by just looking at the URL

20:20:42 <ais523> I think probably at least nofollow and noreferrer should be in there for external links by default

20:21:17 <b_jonas> you absolutely want to whitelist protocols though because of javascript: links though
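
A scheme whitelist of the kind suggested above can be sketched in a few lines of Python. The function name `has_allowed_scheme` and the allowed set are assumptions for illustration. The check is deliberately strict: it rejects relative references, and it rejects any URL containing whitespace or control characters, since browsers are known to ignore some of those inside a scheme.

```python
import re

ALLOWED_SCHEMES = frozenset({"http", "https", "mailto"})  # illustrative choice

# RFC 3986 scheme syntax: a letter followed by letters, digits, "+", "-" or ".".
_SCHEME = re.compile(r"([A-Za-z][A-Za-z0-9+.-]*):")
# ASCII control characters, space, and DEL anywhere in the URL are refused.
_SUSPICIOUS = re.compile(r"[\x00-\x20\x7f]")


def has_allowed_scheme(url, allowed=ALLOWED_SCHEMES):
    """Return True only for absolute URLs whose scheme is on the whitelist."""
    if _SUSPICIOUS.search(url):
        return False
    match = _SCHEME.match(url)
    return bool(match) and match.group(1).lower() in allowed
```

A sanitizer that also wants to permit relative links would need a separate, equally explicit rule for them.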

20:21:21 <ais523> although noreferrer is interesting because you can also set an HTTP header that tells the browser to noreferrer everything

20:21:40 <ais523> I explicitly turned it off on my website (as in, I outright said in the headers that I know this header exists and I'm choosing not to use it), which makes some security checkers really annoyed

20:22:00 <b_jonas> lol "only need around eight different versions of your code to get peak performance on all 64-bit Intel CPUs"

20:22:13 <ais523> b_jonas: I mean that there isn't a combinatorial explosion

20:22:32 <b_jonas> and then you want AMD and code running on GPU and a port for ARM64 etc

20:22:56 <b_jonas> ais523: yes, that's because it's really hard to make CPUs so there are only two or three companies making x86 cpus at a time

20:23:07 <oerjan> . o O ( rel=dontclickfortheloveofgod )

20:23:08 <ais523> although, in practice nowadays, I think you can get decent performance for CPU-bound code by writing a post-AVX2 version for most people, and a pre-AVX2 version (aiming for maximum compatibility) for people who are running on really old computers

20:23:22 <b_jonas> and they mostly develop at most three lines of them in parallel each

20:23:33 <b_jonas> one expensive, one home, and one low-power laptop one

20:23:34 <ais523> optimising for AMD does seem to be significantly different from optimising for Intel, though

20:24:01 <ais523> in particular, if the program isn't memory-bound, the next most relevant bottleneck on Intel is normally instruction dispatch, whereas on AMD it's usually something else

20:24:07 <b_jonas> ais523: and more importantly, you only need to make different versions of a few performance-critical functions, not everything in your code

20:25:01 <b_jonas> there might still be a combinatorial explosion if you want versions of your code that differ in ways other than the CPU hardware

20:25:09 <ais523> I came up against this in the fizzbuzz I've been writing over the last year or so

20:25:40 <ais523> I want to read a vector from memory, then do two instructions with that vector as one argument and a vector in a register as a second argument

20:25:52 <riv> how is the fizzbuzz coming??

20:25:54 <ais523> on Intel, it's optimal to read the vector from memory twice

20:26:08 <ais523> on AMD, you want to read it into a register and then use it from there

20:26:33 <ais523> this is because Intel is bottlenecked on instruction decode so simply using fewer instructions is a gain, the L1 cache can handle the second read

20:26:53 <fizzie> Well, yes. It is a C++ project. It's possible there's a combinatorial explosion of templates instead. There isn't that much code in terms of source code lexically.

20:27:00 <ais523> on AMD the instruction decode is faster but the L1 cache has less bandwidth, so you can spare an extra instruction to read into a register to spare the cache bandwidth

20:27:10 <b_jonas> ais523: that might change for future AMD cpus...

20:27:16 <ais523> riv: I think I have a plan, the issue is just finding the time to write this code

20:27:46 <ais523> b_jonas: it's possible, but AMD seem to have been going down the path of using hyperthreading to make use of the extra instruction dispatch capability

20:28:05 <ais523> (sorry, I meant dispatch not decode, AMD is bottlenecked on decode too but that only matters if you aren't in a loop because of the µop cache)

20:28:22 <b_jonas> that said, I agree that bottlenecking on either memory access or instruction dispatch is typical these days, the execution times don't matter as much, unless you are specifically writing matrix multiplication inner loops or things like that

20:29:08 <ais523> even matrix multiplication is bottlenecked on memory access, most of the fast techniques for it are based on trying to avoid cache spills

20:30:05 <b_jonas> ais523: yes, so it's only the inner loops where you actually have to care about the execution times of these floating point multiplication-add instructions.

20:30:22 <ais523> it's hard to think of something that wouldn't bottleneck on memory access – maybe things like prime factorization, or pathfinding

20:31:00 <ais523> b_jonas: oh, fused multiply-adds are fast, but that doesn't really matter, they're more beneficial in terms of accuracy than they are in terms of speed

20:31:35 <ais523> multiply then add is 2 cycles latency, fused multiply-add is 1 cycle latency, and they both have enormous throughput (values correct for recent Intel and also recent AMD)

20:33:55 <b_jonas> ais523: yes, the current CPUs are so optimized for that that you basically can't run out of multiplication units. I remember there was a time when the CPU was better at fused multiply-add than additions

20:34:25 <ais523> integer additions do actually beat all the floating-point stuff on most modern CPUs, though

20:35:02 <ais523> on Intel, this is primarily because the execution unit that handles jumps and branches can be used to do integer additions/subtractions if it isn't needed for the jump

20:35:13 <b_jonas> ais523: while floating point multiplications beat integer multiplications, yes

20:35:19 <ais523> (because it can handle fused compare-jump, fused subtract-jump, and friends)

20:35:38 <ais523> and yes, floating point multiplication performance is better than integer multiplication (although normally not that much better)

20:35:45 <b_jonas> only 64-bit ones though, because the mantissa is bigger

20:35:54 <ais523> Intel actually has two different floating point multipliers with different performance characteristics

20:36:04 <ais523> one has higher throughput, the other lower latency

20:36:36 <b_jonas> ais523: used for the same instructions? I didn't know that

20:36:38 <ais523> actually, the main throughput bottleneck I tend to hit is vector shuffles

20:36:49 <ais523> b_jonas: I think they're mostly used for different instructions

20:37:05 <ais523> Intel normally has only one vector shuffler, it's fast but you can only use it once per cycle

20:37:14 <ais523> and lots of useful instructions fall into the "vector shuffle" group

20:37:43 <b_jonas> yeah

20:38:30 <ais523> there's also the infamous lane-crossing penalty (especially on AMD, but I think it affects Intel too)

20:39:15 <ais523> where it costs something like 3 cycles to do anything that combines the top half of a register and the bottom half of the register, when the register is "sufficiently large" (normally a recently introduced vector size)

20:39:58 <ais523> this is why lots of vector instructions are incapable of mixing the top and bottom half of a YMM register, they're instead basically designed as two XMM instructions in SIMD (even if they aren't normally SIMD instructions)

20:40:24 <b_jonas> ais523: yeah

20:41:35 -!- arseniiv has joined.

20:42:04 <b_jonas> VPSHUFB for ymm registers specifically

20:42:50 <ais523> that was the example I was going to use

20:43:17 <b_jonas> I am following this because AVX2 is now available on lots of CPUs

20:43:38 <ais523> and there's an annoying lack of backwards compatibility, too – even if Intel or AMD figure out how to make five bits of the index useful, they won't be able to make their VPSHUFB instructions actually handle them

20:43:41 <b_jonas> (including my new home computer)

20:43:43 <ais523> because it would break backwards compatibility
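
The lane restriction being discussed can be modelled in a few lines of Python. This is a behavioural sketch of the 256-bit form of VPSHUFB, not real SIMD code, and `vpshufb_ymm` is a name chosen for illustration. Each output byte selects from its own 16-byte half using the low four bits of the index, and a set top bit yields zero.

```python
def vpshufb_ymm(src, idx):
    """Model VPSHUFB on 32-byte operands: two independent 16-byte shuffles."""
    if len(src) != 32 or len(idx) != 32:
        raise ValueError("both operands must be exactly 32 bytes")
    out = bytearray(32)
    for i in range(32):
        lane_base = i & 0x10  # 0 for bytes 0..15, 16 for bytes 16..31
        if idx[i] & 0x80:
            out[i] = 0        # top bit set: the result byte is zeroed
        else:
            # Only bits 0..3 of the index are used; bits 4..6 are ignored,
            # so a byte can never be fetched from the other lane.
            out[i] = src[lane_base + (idx[i] & 0x0F)]
    return bytes(out)
```

With `src = bytes(range(32))`, an index of `0x10` in every position selects byte 0 in the lower half and byte 16 in the upper half, which shows the fifth index bit being ignored.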

20:43:53 <ais523> I've had an AVX2-capable computer for a few years now

20:44:56 <b_jonas> ais523: they just add a new instruction for that. they're adding lots of new vector instructions all the time anyway.

20:45:30 <ais523> but what do they even name it?

20:45:33 <ais523> VPSHUFB5?

20:45:41 <b_jonas> no, I think it's VPERMsomething

20:45:44 <b_jonas> rather than SHUF

20:45:46 <ais523> (with a VPSHUFB6 coming in a few years after AVX-512 is more mature?)

20:45:49 <b_jonas> it already exists

20:45:53 <b_jonas> or so I think

20:46:00 <b_jonas> let me look it up, I think it's later than AVX2

20:46:00 <ais523> oh, the PERM stuff normally has worse granularity than SHUF, this will be confusing

20:46:38 <b_jonas> no, there is now a VPERMB that is a full byte level shuffle even on a zmm register

20:46:47 <b_jonas> the lower granularity was a thing of the past

20:47:08 <ais523> knowing how AVX-512 is going, this is likely to have been specified by Intel but not actually implemented by anything

20:47:08 <b_jonas> well, thing of the past that's still in many CPUs that we're using now

20:47:10 <b_jonas> but you know

20:47:15 <arseniiv> <b_jonas> very long ago I made a script to say "best viewed with Mozilla" or "best viewed with Internet Explorer", always the other one than the viewer is using => rofl oh my

20:47:24 <ais523> there's a lot of AVX-512 which was specified but with no implementations

20:47:34 <b_jonas> ais523: quite possible.

20:48:28 <ais523> this may end up leading to another FMA3/FMA4 debacle some time in the future

20:48:52 <b_jonas> ais523: there's even a VPERMI2B instruction to byte level permute *two* zmm registers

20:48:59 <ais523> (FMA got specified prior to being implemented, with Intel and AMD proposing different plans; each then implemented the *other's* specification, leaving them incompatible)

20:49:00 <b_jonas> which means 128 slots

20:49:08 <b_jonas> wait what?

20:49:14 <b_jonas> each implemented only the other's specification?

20:49:15 <ais523> I think AMD implemented Intel's specification because they wanted to be compatible

20:49:18 <b_jonas> I didn't follow that

20:49:26 <b_jonas> I know they implemented incompatible stuff

20:49:30 <b_jonas> but I didn't know they swapped

20:49:32 <ais523> and Intel implemented AMD's specification because they couldn't get their own to work, it needed too much internal rearchitecturing

20:49:34 <b_jonas> that is odd

20:49:44 <b_jonas> wow

20:49:49 <ais523> (presumably this is why AMD came up with their version in the first place, it would be easier to implement)

20:49:57 <b_jonas> but 3DNow was AMD's specification that was never in Intel, right?

20:50:08 <ais523> b_jonas: yes, although a couple of 3DNow commands survived

20:50:34 <ais523> admittedly, SSE is much better-designed than 3DNow was, although both are dubious in terms of encoding

20:51:07 <b_jonas> I never really looked into the details of what 3DNow does. it was obsolete by the time I could have cared.

20:51:31 <b_jonas> we already had SSE4.1 by the time I started to care about SIMD instruction stuff

20:52:00 <ais523> b_jonas: think SSE with 64-bit-wide vectors

20:52:17 <b_jonas> no, that's MMX

20:52:24 <b_jonas> well, not quite

20:52:27 <ais523> I thought MMX wasn't vectorised at all

20:52:37 <ais523> 3DNow is, as long as you want a pair of single-precision floats

20:53:17 <ais523> it was the first vector unit; it simply just wasn't a very good one

20:53:25 <b_jonas> MMX has the drawback that it shares state with the FPU, and you have to do a slow switch of the FPU between MMX and traditional mode each time you want to use it, since the existing ABI expects the FPU to be in non-MMX mode

20:53:50 <b_jonas> MMX is "vectorized" in that it can handle two 32-bit floats in a 64-bit register

20:54:11 <ais523> hmm, maybe I got them muddled then

20:54:20 <b_jonas> but two floats per register is still a big help

20:54:21 <ais523> or maybe 3DNow uses the MMX registers for its vectors

20:54:39 <b_jonas> it also handles packed integers

20:54:52 <ais523> https://en.wikipedia.org/wiki/3DNow!

20:55:02 <b_jonas> no, I'm wrong

20:55:03 <ais523> right, 3DNow! seems to be an extension to use the MMX registers as vector registers

20:55:14 <b_jonas> apparently MMX *only* handles integers

20:55:31 <b_jonas> that's even more useful

20:56:23 <ais523> oh, so MMX does int vectorisation and 3DNow! does float vectorisation?

20:56:25 <b_jonas> I know these days MMX is only useful to get a few extra registers that you can sometimes access with shorter encodings than anything in the later instruction sets, and basically never worth using

20:56:49 <b_jonas> I have no idea what 3DNow does

20:56:55 <ais523> I'm actually vaguely surprised that MMX didn't become the standard for non-vectorised floating point

20:57:15 <b_jonas> ais523: what do you mean "the standard"?

20:57:23 <ais523> it is saner than x87, and supported by all 64-bit CPUs

20:57:27 <ais523> b_jonas: in the ABI

20:57:54 <ais523> like, the ABI passes floats in MMX registers, assumes MMX mode at call boundaries, and the like

20:58:05 <b_jonas> ais523: which ABI? we can't change the x86_32 ABI, it's too late for that, and x86_64 comes with always SSE2 so by that time the point is moot

20:58:17 <ais523> b_jonas: x86_64

20:58:19 <b_jonas> also if MMX only handles integers then that can't work

20:58:34 <ais523> no, MMX definitely does floats

20:58:55 -!- Lord_of_Life has quit (Ping timeout: 260 seconds).

20:58:59 <ais523> no, I'm wrong

20:59:01 <ais523> it doesn't do floats, only ints

20:59:07 <ais523> that's why people don't use it for float maths :-)

20:59:19 -!- Lord_of_Life has joined.

20:59:39 <b_jonas> ais523: also SSE2 is the standard for passing floats in the x86_64 ABI, and that's a good thing

21:00:02 <b_jonas> because with SSE2 there, MMX is almost never useful

21:00:21 <b_jonas> and SSE2 adds advantages, both wider vectors and a better instruction set

21:00:23 <ais523> so it looks like we have three sets of registers: integer; x87/MMX/3DNow!; and XMM/YMM/ZMM

21:00:35 <b_jonas> oh, 3DNow also uses the x87 registers?

21:00:43 <ais523> (also random special-purpose stuff like flags, but I'm not counting those)

21:00:45 <ais523> b_jonas: right

21:00:56 <b_jonas> also (sigh) we also have AVX512 mask registers.

21:01:21 <b_jonas> on AVX512-capable CPUs that is

21:01:29 <ais523> x87 interprets the registers as one of three float formats (long double, plus formats which are almost but not quite the same as float and double); MMX as 64-bit integer vectors; and 3DNow! always as two floats

21:01:43 <ais523> b_jonas: to be fair those are really helpful for some applications

21:02:42 <b_jonas> ais523: no, x87 specifically stores 80-bit floats, not long doubles. there's a difference because long double is 64-bit floats in the MSVC ABI

21:03:08 <ais523> well, yes, but they're what has been known as "long double" for ages on Intellish processors

21:03:23 <ais523> but they got deprecated with the change to 64-bit

21:03:40 <b_jonas> because SSE2 handles 64-bit floats, yes

21:05:43 <esolangs> [[Cabra]] M https://esolangs.org/w/index.php?diff=88008&oldid=81202 * PythonshellDebugwindow * (+0) /* Language Definition */ Fix typo

21:05:46 <ais523> Wikipedia says that 3DNow! invented SFENCE

21:05:47 <b_jonas> wtf there's a KADDB/KADDW/KADDD/KADDQ AVX512 instruction? I never noticed that

21:06:30 <b_jonas> ais523: I admit I don't follow how the fence instructions work. I leave them to slightly higher level libraries.

21:06:31 <ais523> but that seems unlikely to me, because my understanding of the x86 memory model is that an SFENCE is only useful with non-temporal writes or write-combining memory, and I didn't think those were implemented at that point

21:07:00 <ais523> b_jonas: I can describe the general (non-x86-specific) implementation fairly easily

21:07:15 <ais523> imagine loads and stores as not happening instantly, but being spread out over time

21:07:46 <ais523> an lfence stops a load crossing it (it has to happen entirely before if it's before the lfence, or entirely after if it's after the lfence)

21:07:51 <ais523> likewise, an sfence stops a store crossing it

21:08:13 <fizzie> I think PPC conventionally has a double-double as its `long double` type.

21:08:44 <ais523> if one thread is storing two pieces of data, and another thread is loading them, then you need to sfence between the stores and lfence between the loads if you want to prevent the loading thread seeing the new value of the second store, but the old value of the first store

21:09:45 <b_jonas> spread out over time how? you mean they happen at different times to different layers of the cache hierarchy, going down the hierarchy if either the smaller caches need to free up space or to make the value known to other CPUs?

21:10:14 <ais523> b_jonas: imagine that you send a "request to write memory" but then continue executing before the request has been handled

21:10:20 <ais523> and let the motherboard respond to the request at some later time

21:10:21 <b_jonas> ais523: "stops a load crossing it" at what levels of the hierarchy?

21:11:01 <ais523> it's a logical rather than physical barrier, it's not bound to a specific level of hierarchy

21:11:08 <ais523> so you have to match an lfence on one thread with an sfence on another

21:11:11 <b_jonas> ok

21:11:32 <b_jonas> I still think I don't need to know the details of this, what I do need to know is the atomic and mutex abstractions over them that libraries provide me

21:11:50 <ais523> on x86-64 specifically I think it's handled as part of the cache coherency mechanism

21:11:56 <b_jonas> because I don't think I write inter-thread (or inter-process) communication code that is at a lower level than those

21:12:09 <ais523> right

21:12:34 <b_jonas> nor CPU-level code that handles memory mapped to the video card or other memory-mapped IO

21:12:43 <ais523> one way to think about it is that sfence is one of the two main mechanisms for implementing the "release" atomic ordering, and lfence is one of the two main mechanisms for implementing the "acquire" atomic ordering

21:13:20 <ais523> atomic release is sfence then write; atomic acquire is read then lfence

21:14:03 <ais523> only, x86 has extra guarantees that most processors don't, so sfence is usually a no-op and I think many atomic libraries leave it out there on x86-64 (even though they would use it on other processors)

21:14:18 <ais523> (lfence is not a no-op, though, and is important in atomic code)
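
The release/acquire pairing described above can be sketched with C11 atomics. This is a minimal illustration, not anything from the conversation: the names `payload`, `ready`, `publish` and `try_consume` are hypothetical, and the compiler is left to choose whatever fence instructions (if any) the target needs for each ordering.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: ordinary data plus an atomic flag guarding it. */
static int payload;
static atomic_bool ready = false;

/* Writer side: store the data first, then publish it with a release store.
   The release ordering keeps the payload write from moving after the flag write. */
void publish(int value) {
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: an acquire load of the flag, then the data read.
   If the flag is seen as true, the payload written before it is visible too. */
bool try_consume(int *out) {
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

In a real program `publish` would run on one thread and `try_consume` would be polled on another; the reader can then never observe the flag as set while still seeing the old payload.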

21:15:19 <b_jonas> and as far as I understand, the compilers need to know about both fences and atomics, because they have a meaning not only on the CPU level, but for what the optimizer isn't allowed to do, and current compilers indeed do this. (in contrast, I think the compiler needn't know about mutexes directly.)

21:16:22 <ais523> the compiler does need to know about the acquire/release rules on mutexes, but either it can see the atomic read/write in the function, or else it can't see anything at all and thus has to assume the worst

21:16:51 <ais523> oh, this reminded me of a weird case of wanting a compiler barrier specifically

21:17:13 <b_jonas> yes, the fast (non-contended) cases mutex functions must be fast so the optimizer will see into the functions when necessary

21:17:19 <b_jonas> s/cases/cases of/

21:17:21 <ais523> the idea would be in functions that undropped permissions, did a system call with checks, then dropped them again

21:17:43 <ais523> to do the checks with permissions raised, and to compiler-barrier to ensure that the undropping is done before the checks

21:18:01 <ais523> this sounds like it violates the least-permissions principle, but the point is to protect the checks from return-oriented programming

21:18:11 <ais523> in order to get permission to do the system call, the code would need to run through the checks too

21:18:46 <b_jonas> ais523: what kind of permission checks? aren't those undropping, permission checking, and dropping three system calls, and the compiler already mustn't reorder system calls?

21:19:01 <b_jonas> it also reminds me of something similar by the way

21:19:29 <ais523> b_jonas: say, you want an mprotect() wrapper that checks that you aren't making anything executable if it was previously nonexecutable

21:19:55 <b_jonas> ais523: ah, so by permission checking you just mean accessing memory that may or may not have read/write/execute permissions?

21:20:07 <b_jonas> hmm, that might be difficult

21:20:18 <ais523> I didn't mean permission checking, just checking in general

21:20:42 <b_jonas> but even so, can't a system call basically write anything to anywhere in your user memory, so you usually can't reorder memory accesses around them anyway?

21:21:11 <ais523> oh, that's interesting – the point being that compilers wouldn't optimise system calls anyway due to not knowing what they do?

21:21:30 <b_jonas> ais523: yes, except maybe a few specific system calls of which they know the meaning

21:21:51 <b_jonas> there are system calls like pread that can write even to memory that you didn't pass a pointer to to the system call

21:21:58 <ais523> I know there are some functions that can system call, and that the compiler treats specially

21:22:01 <ais523> malloc, for example

21:22:30 -!- delta23 has quit (Quit: Leaving).

21:23:07 <b_jonas> no, not pread, sorry.

21:23:15 <b_jonas> preadv

21:24:35 <b_jonas> preadv can write anywhere, and a compiler has to assume that an unknown system call can do things worse than that

21:24:53 <ais523> preadv seems so specific

21:25:18 <ais523> I can see why it could be useful – it saves the overhead of making multiple system calls when you want to do that operation specifically – but I'm unclear on how common that particular operation would be

21:25:25 <b_jonas> it's specific in that it's a particularly badly behaving system call, that's why I'm giving it here as an example

21:25:37 <b_jonas> most system calls are tamer than that, but the compiler can't easily rely on that

21:26:14 <ais523> I meant, I was thinking on a different line of thought when you mentioned preadv

21:26:25 <ais523> like, what was the motivation behind adding that to the kernel? who needed it, and what do they do with it?

21:26:52 <b_jonas> I don't really know.

21:27:21 <ais523> 99% of programs would just read into a large buffer and then copy the data into the appropriate final locations, rather than spending time coming up with a big description for preadv

21:27:29 <ais523> although, preadv is faster because it reduces cache pressure
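
For reference, a scatter read with `preadv` looks roughly like the sketch below. It assumes Linux with glibc; `scatter_read` and `make_temp_file` are hypothetical helper names made up for this illustration. The kernel writes through the pointers stored in the `iovec` array, which is why the destinations need not be passed to the call directly.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes preadv() and mkstemp() in glibc headers */
#endif
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Read na bytes at `offset` into a and the following nb bytes into b,
   using a single system call. Returns the total byte count or -1. */
ssize_t scatter_read(int fd, off_t offset, void *a, size_t na, void *b, size_t nb) {
    struct iovec iov[2] = {
        { .iov_base = a, .iov_len = na },
        { .iov_base = b, .iov_len = nb },
    };
    return preadv(fd, iov, 2, offset);
}

/* Demo helper (hypothetical): create an unlinked temp file holding `text`.
   Returns an open file descriptor, or -1 on failure. */
int make_temp_file(const char *text) {
    char path[] = "/tmp/preadv_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */
    size_t len = strlen(text);
    if (write(fd, text, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling `scatter_read(fd, 0, head, 6, body, 8)` on a file containing `headerBODYBODY` fills `head` with the first six bytes and `body` with the remaining eight, without an intermediate buffer or a second copy.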

21:28:21 <b_jonas> I think it might be there because they wanted to get asynchronous regular file reading to work, which turned out quite hard and they're still struggling with it, but anyway the interface of the async reading allows similar scatter-gather read because it has to allow multiple reads at the same time, so they added a normal non-async interface at that point

21:29:03 <b_jonas> but maybe someone just used a cyclical buffer and wanted to micro-optimize the number of system calls, since the context change for system calls used to be slower than now

21:29:19 <b_jonas> preadv is very old, you have to remember that

21:29:25 <b_jonas> so it can have some odd historical reason

21:29:33 <ais523> …now I'm wondering if preadv is faster than mmap + memcpy

21:29:47 <ais523> it could be, I guess? because the physical memory you mmap into has to be cached

21:31:24 <b_jonas> ais523: yeah, look, the manpage says "these system calls first appeared in 4.2BSD"

21:31:34 <b_jonas> so old it's hard to speculate about it

21:32:55 <b_jonas> ais523: readv is older than pread if the system call numbering can be believed

21:33:25 <ais523> b_jonas: while searching about uses of preadv, I found some mailing list archive mentions which implied that readv was newer

21:33:31 -!- oerjan has quit (Quit: Nite).

21:37:26 <b_jonas> ais523: anyway, what this reminded me of is the SSE floating point control word. these control the rounding mode, the exception mask, and two bits to change denormal inputs and results to zero in floating point instructions because those denormals would cause a big slowdown on certain CPUs. anyway, the compiler *should* know about the semantics of the SSE floating point control word to the extent that

21:37:32 <b_jonas> it's not allowed to reorder floating point arithmetic around changing the control word, but current compilers don't yet know this, so it's not quite clear how you can write to the floating point control in a useful way without potential undefined behavior.

21:37:46 <b_jonas> the situation is similar to the atomic operations back when multithreading was new and compilers didn't yet know much about it

21:38:02 <b_jonas> or when SMP was new.

21:39:49 <ais523> what's the performance of changing the floating-point control word like?

21:39:57 <ais523> I can easily imagine algorithms which want to change it a lot

21:40:20 <b_jonas> and no, you can't just change the floating point control word in a non-inlinable function, partly because the ABI says that the rounding mode etc has to be in its default state between function calls, and more importantly because the compiler is normally allowed to reorder a floating point operation around an unknown function call.

21:40:34 <ais523> IIRC AVX-512 dedicates a couple of bits of the instruction to override parts of the FPU control word

21:41:17 <ais523> fwiw, on gcc you could probably get away with an asm volatile that takes the results of previous FPU instructions and inputs of subsequent FPU instructions as read-write parameters and then doesn't change them

21:41:38 <ais523> would be annoying to write, but gcc would be forced to put the control-word-changing operation in the right place

21:41:49 <b_jonas> ais523: there are two cases when you want to change the floating point control word a lot. one is if you want to use a non-default floating point control word, but also do function calls or returns to code that you don't control since technically you have to restore the default control word because any library function is allowed to do floating point instructions; the other is interval arithmetic which can

21:41:55 <b_jonas> change the rounding mode a lot
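
The interval arithmetic case can be sketched as follows, assuming x86-64 with GCC or Clang, where `double` division is done with SSE2 and so obeys the rounding bits of the SSE control word (MXCSR). The function name `div_interval` is hypothetical. Because the compiler does not model the control word, the operands and result are routed through `volatile` variables so that each division stays between the two control-word writes; this is a simpler stand-in for the inline asm trick mentioned above, not the same technique.

```c
#include <xmmintrin.h>

/* Compute an enclosure [*lo, *hi] of num / den by dividing once with
   rounding toward -infinity and once with rounding toward +infinity. */
void div_interval(double num, double den, double *lo, double *hi) {
    /* volatile accesses cannot be moved across the control-word writes,
       so each division is pinned between them */
    volatile double n = num, d = den, q;
    unsigned int saved = _MM_GET_ROUNDING_MODE();

    _MM_SET_ROUNDING_MODE(_MM_ROUND_DOWN);
    q = n / d;
    *lo = q;

    _MM_SET_ROUNDING_MODE(_MM_ROUND_UP);
    q = n / d;
    *hi = q;

    /* restore the caller's rounding mode, as the ABI expects */
    _MM_SET_ROUNDING_MODE(saved);
}
```

For an inexact quotient such as 1.0 / 3.0 the two bounds differ by one unit in the last place; for an exact one such as 1.0 / 4.0 they coincide.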

21:42:09 <b_jonas> ais523: but more likely you just want to change the control word once, then do a lot of float operations

21:43:35 <b_jonas> well, admittedly there's a third case, if you want to read the exception flags and have to reset them for that reason

21:43:45 <ais523> I was thinking of interval arithmetic

21:43:52 <ais523> exception flags might also be relevant in some algorithms

21:44:13 <b_jonas> and I don't know about what performance writing the control word has, I'm mostly concerned about cases when that doesn't matter

21:50:07 <b_jonas> apparently you can get slowdowns for denormal results in both Intel and AMD, and the optimization manuals for the two brands detail when these can and can't happen and what you should do about them
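
One of the two denormal-related control bits mentioned earlier, flush-to-zero, can be demonstrated with the sketch below. It again assumes x86-64 with GCC or Clang and SSE2 arithmetic; `third_of_smallest_normal` is a hypothetical name, and the `volatile` variables are only there to keep the division between the control-word writes.

```c
#include <float.h>
#include <xmmintrin.h>

/* Divide the smallest normal double by 3, which underflows into the
   denormal range. With flush-to-zero enabled the result becomes 0.0. */
double third_of_smallest_normal(int flush_to_zero) {
    volatile double x = DBL_MIN, d = 3.0, q;
    unsigned int saved = _mm_getcsr(); /* remember the whole control word */

    _MM_SET_FLUSH_ZERO_MODE(flush_to_zero ? _MM_FLUSH_ZERO_ON
                                          : _MM_FLUSH_ZERO_OFF);
    q = x / d;

    _mm_setcsr(saved); /* restore mode bits and exception flags */
    return q;
}
```

With the bit clear the function returns a small positive denormal; with it set the same division returns zero, which is the trade of accuracy for speed that the optimization manuals discuss.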

21:50:35 <b_jonas> the actual rules do differ between Intel and AMD, at least for some of their CPUs

21:52:15 <ais523> I think for many float operations, denormals are emulated in firmware rather than having dedicated hardware

21:52:19 <ais523> so the performance is terrible

21:52:58 <b_jonas> ais523: that's the gist of it, but the details are complicated. not all instructions with a denormal input or output give a slowdown.

21:53:24 <b_jonas> that's why there's an optimization manual

21:53:30 <b_jonas> I don't claim to really understand the rules

21:53:55 -!- normsaa has quit (Ping timeout: 256 seconds).

21:53:57 <b_jonas> I just have to know where to look if this becomes important

21:54:27 <b_jonas> fungot, does a circle have no end?

21:54:28 <fungot> b_jonas: yeah you said that geographic location has to do with the code than by using only the functional aspects of scheme systems support arbitrary-size vectors...

22:16:19 <arseniiv> fungot: alright, but at least it should have a coend, does it? (I don’t know what a coend is, that’s something from category theory)

22:16:20 <fungot> arseniiv: for example because it pollutes the default namespace

22:17:31 <arseniiv> hm that’s a bit too philosophical

22:18:55 <Corbin> fungot: What is a coend but a colimit over a bifunctor?

22:18:56 <fungot> Corbin: i suppose fnord is the original?

22:19:21 <Corbin> fungot: Bi-fnord-tor?

22:19:21 <fungot> Corbin: no, he refuses to give me access to your harddrive.)

22:20:36 <arseniiv> fungot always tries to fool you with words when you ask significant questions

22:20:36 <fungot> arseniiv: even the ' web-sivu' in the xvid format... would it be for me making return values in your helper function seem to always be using fd3 itself, though

22:21:31 <arseniiv> fungot: at least tell me this: xvid or divx?

22:21:31 <fungot> arseniiv: what, thin and unnecessarily bendy? and!... but is that what you recorded was to later evaluate ( quux zot)

22:21:54 <arseniiv> exactly as I said, no definite answer :′(

22:23:54 <arseniiv> > fix fungot

22:23:54 <fungot> arseniiv: what with my poor grammar and ' be's all over the state. so you can do

22:23:55 <lambdabot> *Exception: Can't be done

22:36:08 -!- Cale has quit (Remote host closed the connection).

22:38:37 -!- Cale has joined.

22:39:16 -!- chiselfuse has quit (Write error: Connection reset by peer).

22:39:16 -!- hendursaga has quit (Write error: Connection reset by peer).

23:46:08 <esolangs> [[Esolang:Sandbox]] M https://esolangs.org/w/index.php?diff=88009&oldid=87590 * PythonshellDebugwindow * (+18) rd

23:46:38 <esolangs> [[Esolang:Sandbox]] M https://esolangs.org/w/index.php?diff=88010&oldid=88009 * PythonshellDebugwindow * (+1) Rd

23:46:51 <esolangs> [[Esolang:Sandbox]] M https://esolangs.org/w/index.php?diff=88011&oldid=88010 * PythonshellDebugwindow * (+1) :
