00:00:46 so you can send broken utf-8, but everyone interprets what you send as utf-8 by default, and may complain about broken messages if you send non-utf8 bytes, or may ignore it silently but think you're doing something wrong with your client config
00:01:09 before that people assumed every channel may have a different ascii-compatible encoding
00:01:46 -!- tromp has quit (Remote host closed the connection).
00:02:31 even though clients mostly just supported one encoding, but that's like how every computer and operating system supported one encoding back before about windows 95, the one byte encoding or shifting encoding that your region uses.
00:02:39 region and operating system brand.
00:03:31 I think it was web browsers that made computers start to be able to at least regularly convert from any input encoding
00:04:07 but even then it was a lot of time until other applications could do that, together with windows getting modified to be based on early unicode and utf-16
00:04:41 but even then for a while programs couldn't just decode an encoding, just the one "operating system default code page" that's activated or something
00:05:04 or two, one for windows and one in terminals only
00:06:35 on UNIXalikes, the way it was meant to work is that the user would set LC_CTYPE, and all programs would assume input was in that encoding and produce output in that encoding
00:07:00 which works fine for pipes, less well for files (unless you try to ensure all your files are in the same encoding), and not really over networks
00:07:03 ais523: yeah, but they didn't have any unicode encoding to convert to internally
00:07:08 (although network traffic should have a specified encoding)
00:07:11 so programs could barely implement that
00:07:22 they only knew which characters are whitespace and all that
00:07:26 they had wchar_t, just didn't know what it meant
00:07:27 perhaps a locale-aware wcwidth
00:07:33 but not much more
00:07:40 actually wcwidth is kind-of broken
00:07:56 true width calculation is more complex than you'd expect
00:08:11 that means programs could tell letters from punctuation with locale-aware isupper/islower, but not much else
00:08:24 for libuncursed2 I've worked out an algorithm that correctly calculates the width of well-formed Unicode
00:08:31 but can produce bizarre results on malformed input
00:09:30 ais523: and the difference isn't even something about following new unicode standards quickly even if the user doesn't update their libc every year and so doesn't get newer data backing the wcwidth, right?
00:09:50 it's not just tables for newer characters, I mean
00:11:03 wob_jonas: it's because multiple adjacent codepoints can merge into a single grapheme
00:11:28 yeah, in retrospect now I'm glad we got rid of that and everyone uses utf-8 or utf-16 and unicode ONLY, but it took lots of years to get there
00:11:37 * ais523 is careful to avoid the word "character" when this level of distinction matters, as both "codepoint" and "grapheme" are possible concepts that correspond to it
00:11:52 ais523: right, that's why you have to pull in a full libicu for unicode stuff
00:12:41 which can do normalization, or all those computations on any utf-16-nativeendian string without visible normalization, or a few of those computations on utf-8 strings as a bonus (the main interface is utf-16 native).
00:14:20 well, libuncursed2 uses unicode, not utf-8
00:14:30 strings are stored as ucs-4 internally
00:14:50 but utf-8 is used for communicating with applications
00:14:59 and also localization-aware sorting and uppercasing (turkish i) and all that crazy stuff if you want, but WITHOUT unix locales needed anywhere.
00:15:04 utf-8 is a good transmission and storage format but bad for other things
00:15:30 Iıİi
00:15:53 ais523: right, that's why libicu and python and many other libraries use utf-16-native when they want anything unicode or localization-related
00:15:53 What operators is UTF-8 bad for?
00:16:00 operations*
00:16:03 IMO Unicode should have used different codepoints for the two Turkish Is than for the Latin I
00:16:31 ais523: that wouldn't have worked either
00:16:36 Lymia: finding the nth codepoint of a string
00:16:50 ... when is this an operation you do?
00:16:55 wob_jonas: presumably because of compatibility with legacy encodings that don't distinguish
00:16:55 ais523: then on a turkish machine people wouldn't be able to type an ascii i from the keyboard, or only with some really special key combo that most users don't use
00:17:16 Lymia: well in libuncursed2 we have an array of grapheme clusters, each of which is an array of codepoints
00:17:18 ais523: and then they can't use normal forms that aren't unicode-aware but would accept simple ascii strings with an i
00:17:25 -!- tromp has joined.
00:18:00 indexing within the clusters is important
00:18:09 so that's taking the nth codepoint of a string, effectively
00:18:35 Rust, at least, gets away just fine with UTF-8 strings internally. In my experience, indexing by a codepoint in particular is never what you actually want except in rare cases.
00:18:49 it's what you want if you have a single grapheme
00:18:51 You want to index a particular known point in the string, in which case, a byte index suffices.
00:19:01 normally you want to index by grapheme or by visual width
00:19:16 but if you only have the one grapheme then a codepoint is probably what you're looking for when you're indexing
00:20:12 When would you need to say "I want the second codepoint period in O(1) time"?
00:22:12 when you're trying to determine which of the many sorts of grapheme clusters you have
00:22:34 -!- tromp has quit (Ping timeout: 256 seconds).
00:22:47 e.g. if it's a Korean syllable, it follows different rules from if it's an emoji
00:22:57 You can't do that with a one-pass iterator?
00:23:28 well, I'm assuming you found the cluster itself via indexing an array of grapheme clusters, so now you have the cluster
00:23:46 ais523: my opinion is that with the turkish I, which was btw established shortly after Kemal Atatürk so definitely by the second world war, we're screwed and we would have been screwed anyway no matter what we do, because it's the one special case you need for language-dependent case folding, which would otherwise not be a thing and programs didn't need to support it.
00:24:40 there's really nothing you can do about it that doesn't require that special case in programs.
00:25:02 I'm OK with a special case, but would like a special case that actually solves the problem
00:25:06 you need language-dependent processing for lots of other operations, like collation and search and comparison
00:25:11 -!- xkapastel has joined.
00:25:17 but not for case folding if it weren't for the turkish i
00:25:19 if you just have a "uses turkic rules" versus "uses latin rules" global switch, you can't casefold a string that mixes English and Turkish
00:25:22 it complicates the programs
00:25:33 ais523: exactly
00:25:45 so what would you do instead?
00:25:48 Anyway, my experience in Rust has been that, even with very complex tasks like parsing/etc, the UTF-8 encoding has not been an issue.
00:25:49 so the string itself needs inbound signalling as to what language it's in
00:26:01 make the Turkish i a different code point? then Turkish users can't use any ascii program without workarounds
00:26:12 you need this for Japanese versus Chinese too; it's OK to say "these ideograms are the same" but the two languages need to render them slightly differently
00:26:15 when they communicate to ascii programs, they need to filter their text through a normalization or something
00:26:24 Because, in practice, all the complaints about there being no O(1) indexing are rendered moot by... indexing by byte index instead of some concept of character indexes.
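The two positions in this exchange (finding the nth codepoint versus indexing by a known byte offset) can be sketched as follows. This is only an illustration in Python, not code from libuncursed2 or from Rust's standard library; the helper names `nth_codepoint_utf8` and `split_at_marker` are hypothetical, and the sketch assumes well-formed UTF-8 input.

```python
def nth_codepoint_utf8(data: bytes, n: int) -> str:
    """Return the n-th codepoint (0-based) of well-formed UTF-8 by scanning.

    This is O(len(data)): codepoints take 1 to 4 bytes, so every lead byte
    before the target has to be visited.
    """
    seen = -1
    for i, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # not a continuation byte: a codepoint starts here
            seen += 1
            if seen == n:
                end = i + 1
                # Extend over the continuation bytes belonging to this codepoint.
                while end < len(data) and data[end] & 0xC0 == 0x80:
                    end += 1
                return data[i:end].decode("utf-8")
    raise IndexError(n)


def split_at_marker(data: bytes, marker: bytes) -> tuple:
    """Split at a known point using a byte offset; no codepoint counting needed."""
    offset = data.find(marker)  # byte offset, found in one pass
    if offset < 0:
        raise ValueError("marker not found")
    return data[:offset], data[offset + len(marker):]


text = "naïve=café".encode("utf-8")
key, value = split_at_marker(text, b"=")
```

Searching for an ASCII marker directly in the bytes is safe because ASCII byte values never occur inside a multi-byte UTF-8 sequence, which is what makes byte offsets usable as "known points" in the string.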
00:26:40 ais523, cjk now takes 2x the codepoints
00:26:42 gratz
00:26:46 was it worth it?
00:26:56 ais523: yes, font rendering too, not just Chinese but Cyrillic has the same problem
00:27:12 Lymia: this is actually the solution some Japanes users use
00:27:26 however it would probably be better to have some sort of bracketing, like there is for bidi
00:27:31 *Japanese
00:27:35 there are two to five different cyrillic scripts (I can't tell how many) that should have been encoded separately because they're never mixed, but they didn't for compatibility with legacy 8-bit encodings,
00:28:11 like the ascii hyphen or quotation mark
00:28:23 yes, I was going to say, this is a problem even in ASCII
00:28:31 at least they added a whole set of digits
00:28:36 only they never invented the non-compatibility characters then, because every program and font accepted only the unicode compatibility cyrillic
00:28:39 many typewriters didn't have a 0 or 1
00:28:40 ais523: right
00:28:58 ais523: and the hyphens and quotation marks, but those at least mostly have a character
00:29:01 (one or two missing)
00:29:10 and spaces
00:29:14 `unidecode -
00:29:14 ​[U+002D HYPHEN-MINUS]
00:29:20 hyphen-minus is a mess of a character
00:29:25 but for a while some fonts didn't have a character for some of those
00:29:36 the newer ones
00:29:43 anyway, it's very late and I gtg now
00:29:44 goodbye
00:29:49 night
00:29:51 -!- wob_jonas has quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client).
00:30:08 it's rare to use a hyphen before a digit and fairly rare to use a minus before a letter, so perhaps they should have been split in some automated way?
00:30:20 people don't have much trouble using a separate character for dash, after all
00:37:50 but I don't even know how to type a minus sign
00:55:22 -!- tromp has joined.
00:59:35 -!- tromp has quit (Ping timeout: 240 seconds).
01:00:49 -!- ais523 has quit (Quit: quit).
01:06:49 -!- zzo38 has joined.
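For reference, the separate characters that Unicode provides alongside the ASCII hyphen-minus can be listed with Python's standard `unicodedata` module. This is a small sketch, not the channel bot's implementation; `describe` and `DASH_LIKE` are hypothetical names, and the output format merely imitates the bot's reply above.

```python
import unicodedata

# Hyphen-minus plus a few of the dedicated characters it stands in for,
# written as escapes since most keyboards cannot type them directly.
DASH_LIKE = ["\u002D", "\u2010", "\u2212", "\u2013"]


def describe(ch: str) -> str:
    """Format one character as [U+XXXX NAME]."""
    return f"[U+{ord(ch):04X} {unicodedata.name(ch)}]"


descriptions = [describe(ch) for ch in DASH_LIKE]
for line in descriptions:
    print(line)
```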
01:22:43 You have an Abelian group with binary operator # and identity @ and one more binary operator $ satisfying: a$(b#c) = (a$b)$c; (a#b)$c = (a$c)#(b$c); a$@ = a; @$a = @. What is a mathematical structure called if it satisfies this? (I know at least one example, but what is it called in general?)
01:26:27 -!- arseniiv has quit (Ping timeout: 240 seconds).
01:29:19 -!- variable has joined.
01:58:49 -!- imode has joined.
02:27:14 -!- variable has quit (Ping timeout: 276 seconds).
02:32:08 [[Undefined behavior]] M https://esolangs.org/w/index.php?diff=55800&oldid=55796 * Oerjan * (-1) typo
02:33:46 -!- variable has joined.
02:37:42 -!- sprocklem has quit (Ping timeout: 245 seconds).
02:39:05 -!- sprocklem has joined.
02:43:46 -!- tromp has joined.
02:47:46 -!- variable has quit (Quit: /dev/null is full).
02:48:21 -!- tromp has quit (Ping timeout: 264 seconds).
02:50:54 https://ptpb.pw/fM-x/text
02:52:10 -!- MDude has quit (Read error: No route to host).
02:52:12 I like how I can exploit the concurrency of string rewriting with this approach. though there's probably a better way of doing it.
02:52:49 -!- MDude has joined.
02:53:06 you can, for example, interleave the current global state between every two symbols and have it propagate from either end.
02:59:59 i.e, you'd have one cursor per two symbols of space, ideally, including the start and end markers.
03:02:53 -!- imode-desktop has quit (Quit: WeeChat 2.1).
03:03:12 -!- imode-desktop has joined.
03:11:04 I like this because I can have "builtins" that behave like hidden rules, like arithmetic ops that are done in one logical step, as well as other computationally heavy things or I/O.
03:14:16 the issue is flow control, and the many apparent ways that that's possible. fork-join-style (I guess?) control is demonstrated above.
03:14:55 which is useful for sorting here, but I really want to abstract upwards towards random-access.
03:15:19 which makes me think I have to think more parallel than anything.
03:16:20 though this begs the question, what's underneath string rewriting? multiset rewriting? it looks a lot like linear logic, but I'm still not sure how you'd simulate things like a TM's tape. I guess godel numbering helps...
03:17:06 that's "too low", though, and I feel you wouldn't get the naive concurrency that you would with traditional string rewriting.
03:19:19 it'd be interesting to be proven wrong, though...
03:20:03 if you were to simulate a tape, you'd have to jump through a hoop to either something akin to a counter machine, or give every tape location a unique element of the multiset, with the multiplicity of that element being the symbol stored.
03:20:45 which, I mean, whatever, but this means in order to do something halfway useful with it, you need to plan out your storage requirements ahead of time, and the ruleset would be _complicated_.
03:23:02 at that point you've gone below the minimum structure required for "convenience".
04:04:11 -!- Naergon_ has joined.
04:04:22 -!- Naergon has quit (Ping timeout: 245 seconds).
04:31:53 -!- tromp has joined.
04:36:27 -!- tromp has quit (Ping timeout: 245 seconds).
04:37:26 [[English Binerdy]] N https://esolangs.org/w/index.php?oldid=55801 * Iamcalledbob * (+1814) Created page with "'''English Binerdy''' was created by [[user:Iamcalledbob]] to make [[Binerdy]] easier to program. ==Instructions== English Binerdy supports the following instructions: {| c..."
04:59:35 [[Esoteric programming language]] https://esolangs.org/w/index.php?diff=55802&oldid=55752 * Iamcalledbob * (+14) /* Obfuscation */
05:11:43 -!- variable has joined.
05:12:05 -!- variable has quit (Client Quit).
05:16:07 -!- variable has joined.
05:21:26 -!- variable has quit (Ping timeout: 255 seconds).
05:28:48 [[Unhappy]] N https://esolangs.org/w/index.php?oldid=55803 * Iamcalledbob * (+833) Created page with "==:(== The title is "Unhappy" because of tecnical limitations. ==Overview== Commands start with or. The commands should be one line. Program runs every command one-by-one. =..."
05:28:59 -!- xkapastel has quit (Quit: Connection closed for inactivity).
06:06:06 imode: generalized monoid rewriting?
06:10:05 alercah: can you clarify?
06:10:31 Do you like ZZT game? Now I completed made up XYZABCDE.ZZT game.
06:11:26 [[Unhappy]] https://esolangs.org/w/index.php?diff=55804&oldid=55803 * Iamcalledbob * (-111) /* Overview */
06:12:35 [[Unhappy]] https://esolangs.org/w/index.php?diff=55805&oldid=55804 * Iamcalledbob * (-27) /* Commands */
06:13:19 alercah: why monoids and not semigroups?
06:19:54 -!- tromp has joined.
06:24:33 -!- tromp has quit (Ping timeout: 248 seconds).
06:35:07 [[Unhappy]] https://esolangs.org/w/index.php?diff=55806&oldid=55805 * Iamcalledbob * (-2) /* Examples */
06:35:41 [[Unhappy]] https://esolangs.org/w/index.php?diff=55807&oldid=55806 * Iamcalledbob * (+2) /* Examples */
06:43:07 -!- imode has quit (Ping timeout: 265 seconds).
06:55:54 -!- variable has joined.
07:01:28 -!- variable has quit (Ping timeout: 268 seconds).
07:11:02 -!- variable has joined.
07:13:08 -!- variable has quit (Client Quit).
07:14:16 -!- variable has joined.
07:17:41 -!- variable has quit (Client Quit).
07:20:53 -!- tromp has joined.
07:34:11 -!- wob_jonas has joined.
07:34:28 great! they silently released svn 1.10 in 2018-04 and I never noticed:
07:35:34 it's not in debian stable obviously, I haven't noticed so I haven't installed manually, I have installed it at work but I'm working only since 2018-05 after a long hiatus between 2015-02 and 2015-05,
07:35:50 at work I've installed latest svn a month ago but didn't notice the version number until today,
07:36:52 and their compatibility policy (summarized in the release notes http://subversion.apache.org/docs/release-notes/1.10 ) is actually so generous that you can almost always silently upgrade just the software, and get some but not all the new features immediately,
07:38:05 and every minor version bump adds cool new features, and the only compat that you sometimes notice immediately is that the checkout (working copy) format changes completely between some but not all minor version upgrades (like every second minor version upgrade or so, but not deterministically)
07:38:42 and this time it didn't change between 1.10 and 1.9 which is why I didn't even notice that I'm not running 1.9 until now
07:40:28 apache svn http://subversion.apache.org/ is great software, and I can really recommend it especially to the sort of people like me or zzo38 who are annoyed by a lot of user-facing software interface changing seriously like every two or three years now, and even a lot of backend libraries changing every five years so all the tech you learn is obsolete in five years even if there's technically compatibility guarantees
07:41:13 and most technology is unrecognizable every ten years with these occasional updates
07:42:14 svn is one of those software like sqlite that has had strong compatibility guarantees and only kept improving
07:45:43 and svn stability goes back to version 1.0 in 2004, its C api is forward compatible since then, and existing repositories keep working forever in a compatible way (you lose some new features if you don't upgrade repository, but don't win), so you can mostly upgrade seamlessly, the only caveats are that
07:48:46 (a) working copy format sometimes changes incompatibly, so you may have to recreate checkouts, you can't upgrade them from older formats, but even that only changed like five or six times since svn 1, and (b) you can't downgrade or use mixed versions of svn on a client because of this,
07:49:41 (c) you can't always easily create new repositories in formats readable by the oldest servers, so on a server you can't downgrade minor versions ever if you have created a repository
07:50:48 That's slightly worse than sqlite, which still has an option to create a database that sqlite 3.0 and every version since can read, even though creating such repositories isn't the default
07:51:22 s/repositories/databases/
07:52:49 It's interesting that svn 1.0 and sqlite 3.0, the respective first version that offered strong compatibility guarantees decades into the future, were both released in 2004.
07:55:18 and for all the incompatibilities, they detail them nicely in release notes
07:55:58 some of them are even repeats: these release notes say
07:56:05 "Since "1.10.0" is smaller than "1.9.0" when considered as ASCII strings, scripts that compare Subversion versions as strings may fail to correctly determine which of "1.10.0" and "1.9.0" is the more recent one. Such scripts should be changed to compare Subversion version numbers correctly:"
08:01:13 -!- wob_jonas has quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client).
08:11:48 -!- teuchter has joined.
08:15:21 -!- choochter has quit (Ping timeout: 260 seconds).
09:57:55 -!- SopaXorzTaker has joined.
11:16:03 -!- Kai_Bruneji has joined.
12:09:54 -!- SopaXorzTaker has quit (Ping timeout: 256 seconds).
12:36:52 -!- arseniiv has joined.
12:47:20 -!- moony has quit (Ping timeout: 245 seconds).
12:47:53 -!- moony has joined.
12:55:02 -!- SopaXorzTaker has joined.
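The string-comparison pitfall quoted from the release notes can be illustrated with a short sketch. The quote is cut off before the release notes' own recommendation, so this is not that recommendation, just one common approach; `parse_version` and `is_newer` are hypothetical helpers, and the sketch assumes purely numeric dotted versions (no suffixes such as "-rc1").

```python
def parse_version(version: str) -> tuple:
    """Turn "1.10.0" into (1, 10, 0) so components compare as integers."""
    return tuple(int(part) for part in version.split("."))


def is_newer(a: str, b: str) -> bool:
    """True if version a is more recent than version b."""
    return parse_version(a) > parse_version(b)


# As plain strings the comparison goes the wrong way, because "1" sorts before "9".
string_order_says_older = "1.10.0" < "1.9.0"
# Compared component by component as integers, 10 > 9 and the order is right.
numeric_order_says_newer = is_newer("1.10.0", "1.9.0")
```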
13:20:20 [[Esoteric programming language]] https://esolangs.org/w/index.php?diff=55808&oldid=55802 * Ais523 * (-14) Undo revision 55802 by [[Special:Contributions/Iamcalledbob|Iamcalledbob]] ([[User talk:Iamcalledbob|talk]]): Small isn't particularly hard to read as esolangs go; don't assume everything was written for obfuscation just because it uses only a few symbols
13:30:46 -!- ais523 has joined.
13:32:05 -!- oerjan has joined.
13:36:47 -!- ais523 has quit (Remote host closed the connection).
13:38:01 -!- ais523 has joined.
13:43:23 -!- Kai_Bruneji has quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/).
14:17:20 i suppose it wouldn't do for agatha to visit britain and *not* have a murder mystery.
14:17:57 They do seem to happen a lot here
14:18:02 Almost as much as Denmark
14:20:09 oh, i never got into that nordic stuff.
14:20:33 (not that i'm into crime mysteries in general either)
14:56:33 -!- SopaXorzTaker has quit (Remote host closed the connection).
15:02:27 argh, spoiler!
15:08:46 -!- imode has joined.
15:12:43 https://play.rust-lang.org/?gist=cb7244f41c040db41fc447d491031263&version=nightly&mode=debug
15:13:04 Here's some weirdness with Rust's type system at compiletime and two nightly features
15:14:31 -!- Kai_Bruneji has joined.
15:28:47 -!- SopaXorzTaker has joined.
15:28:55 -!- impomatic has joined.
15:29:23 -!- imode has quit (Ping timeout: 255 seconds).
15:42:42 -!- xkapastel has joined.
15:47:41 -!- ais523 has quit (Quit: quit).
15:48:19 -!- variable has joined.
15:48:34 -!- Kai_Bruneji has quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/).
16:03:27 int-e: hey, it was obvious in the _previous_ comic, and pretty much telegraphed since a page after the first mention of the guy.
16:07:18 -!- variable has quit (Quit: Found 1 in /dev/zero).
16:08:31 -!- imode has joined.
16:12:51 -!- imode has quit (Ping timeout: 240 seconds).
16:12:59 -!- imode has joined.
16:12:59 -!- imode has quit (Client Quit).
16:28:49 -!- oerjan has quit (Quit: Later).
16:41:26 xkcd is a groaner
17:22:49 -!- SopaXorzTaker has quit (Remote host closed the connection).
17:38:16 -!- MDude has quit (Read error: No route to host).
17:46:02 -!- MDude has joined.
17:50:20 -!- Kai_Bruneji has joined.
18:02:40 -!- AnotherTest has joined.
18:32:22 -!- xkapastel has quit (Quit: Connection closed for inactivity).
18:41:59 -!- Remavas has joined.
18:42:40 -!- Remavas has quit (Client Quit).
18:44:14 -!- Remavas has joined.
19:00:41 -!- AnotherTest has quit (Ping timeout: 265 seconds).
19:08:57 I found a problem with the documentation of SQLite. The type for the xCreate/xConnect methods of virtual tables says that the "argv" argument is "char**argv" but actually the correct type is "const char*const*argv".
19:15:35 <\oren\> zzo38: ehhh?
19:16:28 Did you use SQLite?
19:17:38 -!- xkapastel has joined.
19:23:47 -!- Remavas has quit (Read error: Connection reset by peer).
19:24:23 -!- Remavas has joined.
20:25:16 -!- sftp has quit (Ping timeout: 265 seconds).
21:05:57 [[Andromeda]] https://esolangs.org/w/index.php?diff=55809&oldid=55315 * ZM * (+37)
21:18:55 -!- j-bot has joined.
21:25:03 -!- Remavas has quit (Quit: Leaving).
21:28:35 -!- zzo38 has quit (Ping timeout: 245 seconds).
21:39:22 [[Special:Log/newusers]] create * GiratronKode * New user account
21:41:49 -!- AnotherTest has joined.
21:42:57 -!- Kai_Bruneji has quit (Read error: Connection reset by peer).
21:48:42 -!- zzo38 has joined.
21:51:21 -!- AnotherTest has quit (Ping timeout: 264 seconds).
21:51:31 The address has changed because a new modem has been installed.
22:05:43 [[Equipage]] https://esolangs.org/w/index.php?diff=55810&oldid=55763 * Ais523 * (+2734) /* Computational class */ proof via Minsky machines
22:06:01 [[Equipage]] https://esolangs.org/w/index.php?diff=55811&oldid=55810 * Ais523 * (-12) /* External resources */ recat, computational class is now known
22:07:36 [[Equipage]] https://esolangs.org/w/index.php?diff=55812&oldid=55811 * Ais523 * (+8) /* See also */ add 7, which is very similar to Equipage but uses a different set of commands
22:10:09 -!- boily has joined.
22:18:56 [[Esolang:Introduce yourself]] https://esolangs.org/w/index.php?diff=55813&oldid=55656 * GiratronKode * (+234)
22:19:21 [[InfSt]] N https://esolangs.org/w/index.php?oldid=55814 * GiratronKode * (+3702) Created page with "Hello, I'm GiratronKode and this is my first attempt to do an esolang, please have that in mind. ==Usage== InfSt has 3 commands: ''search stones'' => searches for an infini..."
22:47:47 -!- sftp has joined.
22:56:30 -!- arseniiv has quit (Ping timeout: 260 seconds).
22:59:43 -!- variable has joined.
23:00:53 -!- trout has joined.
23:04:22 -!- variable has quit (Ping timeout: 245 seconds).
23:26:31 -!- trout has quit (Ping timeout: 265 seconds).