00:02:49 -!- rain1 has quit (Quit: WeeChat 3.0).
00:39:12 <esowiki> [[Clue (oklopol)/Quicksort]] M https://esolangs.org/w/index.php?diff=79929&oldid=20726 * PythonshellDebugwindow * (+6) Cat, rm redundant nowiki tag
00:40:10 <esowiki> [[Clue (oklopol)/SKI calculus]] M https://esolangs.org/w/index.php?diff=79930&oldid=20724 * PythonshellDebugwindow * (+6) Cat, rm redundant nowiki tag
01:07:47 -!- copumpkin has joined.
01:24:55 -!- ArthurStrong has quit (Quit: leaving).
01:33:43 -!- copumpkin has quit (Changing host).
01:33:43 -!- copumpkin has joined.
04:00:00 -!- Taneb has quit (Quit: I seem to have stopped.).
04:01:30 -!- Taneb has joined.
04:36:14 -!- harha has quit (Quit: ZNC 1.8.2 - https://znc.in).
04:46:55 -!- harha_ has joined.
06:37:37 -!- MDude has quit (Quit: Going offline, see ya! (www.adiirc.com)).
06:40:45 -!- Sgeo has quit (Read error: Connection reset by peer).
06:45:44 -!- Sgeo has joined.
07:07:58 <Sgeo> I usually go to sleep later than this. Probably not a good habit
07:10:22 <lambdabot> Local time for Sgeo is Mon Jan 11 02:10:20
08:08:23 <shachaf> I tried to reconstruct KMP string search from "memory" (I never actually looked at the details so it's not really memory), and ended up with something else instead.
08:08:42 <shachaf> With a lookup table of size |pattern| * 256
08:23:40 -!- Sgeo has quit (Read error: Connection reset by peer).
08:39:21 -!- LKoen has joined.
08:42:59 -!- imode has joined.
08:47:54 <int-e> shachaf: Yeah KMP builds a very compactly represented NFA, not a DFA.
08:48:45 <shachaf> Right, I saw something about that.
08:49:51 <shachaf> Or, hmm, this book says that KMP uses a deterministic automaton.
08:50:18 <shachaf> And compares it to shift-and which uses a nondeterministic automaton, it says.
08:50:23 <int-e> KMP has epsilon transitions
08:51:26 <shachaf> Epsilon transitions? I must be thinking of something else then.
08:51:50 <shachaf> Or maybe it's just about how you're thinking of it.
08:51:53 <int-e> Maybe I'm taking a too detailed view on KMP.
08:53:01 <shachaf> If you have /.*pattern/, you can think of the NFA where state 0 has a transition to state 1 on p, and also to state 0 on every character.
08:53:12 <int-e> In that view, each time you compare a letter from the haystack with a letter from the needle, a transition is made.
08:54:13 <int-e> And each time these are not equal, the haystack letter is not consumed... so that makes it an epsilon transition to my mind.
08:54:37 <int-e> But then again it lacks the annoying property of NFAs that you have to keep track of several states...
08:55:02 <shachaf> I was sort of thinking you operate by always taking one character at a time and doing your transitions, which is why I ended up with a DFA.
08:55:09 <int-e> So it's a weird beast inbetween :)
08:55:16 <shachaf> But the trick is that it's allowed to choose not to consume characters.
08:55:29 <shachaf> So a single character can take multiple transitions (possibly all the way back to 0).
08:56:15 <int-e> Mainly this was/is me trying to make sense of the table that KMP builds.
08:56:24 <shachaf> I think I vaguely remember something about that now.
08:56:31 <int-e> Which if it is viewed as a DFA, is hard.
08:57:22 <shachaf> But it's still guaranteed to take linear time.
08:58:09 <int-e> Oh yes. Each DFA transition takes amortized constant time.
08:58:45 <int-e> But you almost certainly know that.
08:59:04 <shachaf> I "know" it but it's not immediately obvious why.
08:59:10 <shachaf> I guess it's some typical amortized argument.
08:59:21 <shachaf> To be able to jump back you must have gone forward some number of steps.
08:59:30 <int-e> Each epsilon transition goes back in the needle, so is paid for by a previous transition that advanced in the needle.
09:00:18 <int-e> So that's your cost per character: Advance the needle, plus a potential epsilon transition that skips back.
09:02:15 <shachaf> Right, that's the sort of thing I meant.
09:03:04 <shachaf> So, hmm, you get at most 2n transitions or something?
09:03:11 <int-e> (and then, of course, there's building the table)
09:03:55 <shachaf> Where "n" is the length of the haystack, not the needle, despite the confusing name.
09:04:34 <shachaf> Anyway, this book doesn't actually get into the details of KMP. It says it's mainly useful for short needles, and there are better algorithms for those.
09:05:31 <int-e> Hmm. Somehow I've never studied Boyer-Moore.
09:06:28 <int-e> Beyond the very basic idea (skip ahead a full needle's length; if you're lucky the character you find is none of the needle characters, and then you'll process the string much faster)
09:07:02 <shachaf> Well, according to this book, Boyer-Moore is slower and more complicated than its Horspool simplification.
09:07:15 <int-e> I've never even heard of that one
09:07:43 <int-e> But that's consistent with what I said :)
09:07:54 <shachaf> It's what GNU memmem uses, apparently.
09:08:38 <int-e> (consistent: the difference is in the details that I never studied)
09:10:24 <shachaf> Shift-And looks really simple.
09:28:38 -!- imode has quit (Ping timeout: 260 seconds).
09:28:40 <shachaf> This algorithm is just simulating an NFA in parallel with bitwise operations.
09:28:57 <shachaf> So the transition is: state = ((state << 1) | 1) & table[c];
09:30:02 -!- imode has joined.
09:37:57 <shachaf> And to be slightly trickier you can invert all the bits to get shift-or.
09:38:15 <shachaf> Then shifting left gives you a 0 for free, so the transition is just: state = (state << 1) | table[c];
09:41:28 -!- LKoen has quit (Remote host closed the connection).
10:03:28 <shachaf> Hmm, maybe Horspool is just what I thought Boyer-Moore was.
10:10:51 <int-e> If it's a simplification...
10:11:04 <int-e> ...it's likely to get taught.
10:12:14 <shachaf> If that's true, why does anyone teach bubble sort ever?
10:12:23 <shachaf> What a terrible algorithm.
10:15:48 <int-e> Because of the name...
10:16:04 <int-e> And it's so easy to implement in place.
10:16:23 <int-e> Also it's really a family of algorithms.
10:19:05 <int-e> https://en.wikipedia.org/wiki/Sorting_network#Insertion_and_Bubble_networks
10:20:11 <shachaf> Yes, I know it's the same sorting network as insertion sort.
10:20:22 <shachaf> But that's not an advantage over insertion sort. Insertion sort is just better.
10:20:55 <shachaf> It's not just more efficient, it's simpler and more obviously correct.
10:21:14 <int-e> I really think it's the evocative name, and the physical analogy, that makes bubble sort popular.
10:24:54 <int-e> from a practical perspective I'd probably start with bucket or radix sort on a deck of cards :P
10:30:33 -!- rain1 has joined.
10:51:21 -!- LKoen has joined.
11:00:54 -!- sftp has quit (Excess Flood).
11:01:26 -!- sftp has joined.
11:14:25 <shachaf> Oh, this Horspool thing is actually not what I was thinking.
11:14:57 <shachaf> It places a window at a particular location, but then it just checks whether the window matches, which you can do either backward or forward.
11:15:14 <shachaf> Then if there's a mismatch it decides how to move the window based on the last byte in the window. That's it.
11:16:47 <esowiki> [[Special:Log/newusers]] create * Txlyre * New user account
11:39:03 <esowiki> [[Esolang:Introduce yourself]] https://esolangs.org/w/index.php?diff=79931&oldid=79926 * Txlyre * (+161) Add my introduction.
12:16:37 -!- imode has quit (Quit: WeeChat 3.0).
12:39:07 <esowiki> [[Length]] https://esolangs.org/w/index.php?diff=79932&oldid=79812 * Nailuj29 * (+99) add C# compiler
12:45:28 -!- txlyre has joined.
12:58:47 -!- LKoen has quit (Quit: “It’s only logical. First you learn to talk, then you learn to think. Too bad it’s not the other way round.”).
13:03:35 -!- txlyre has quit (Quit: -a- IRC for Android 2.1.59).
13:15:37 -!- xelxebar_ has joined.
13:16:23 -!- xelxebar has quit (Ping timeout: 240 seconds).
13:27:16 -!- heroux has quit (Ping timeout: 240 seconds).
13:28:03 -!- user3456 has quit (Ping timeout: 258 seconds).
13:33:50 -!- user3456 has joined.
13:34:07 -!- heroux has joined.
13:44:20 -!- LKoen has joined.
13:44:34 <b_jonas> shachaf: yes. it's an algorithm that almost never comes up anymore on modern machines. it made a bit more sense back when RAMs read one byte at a time but were the same frequency as the CPU and almost no latency. you'll find a more precise description in Knuth volume 5. it isn't even described in Cormen, or in Rónyai–Ivanyos–Szabó, ... hmm
13:45:19 <b_jonas> it must be in some book. I know I was supposed to understand this (and the other two string search algorithms) for an exam.
13:49:20 <b_jonas> I don't think it's in ed. Iványi either
13:53:25 <b_jonas> is Boyer-Moore the same as that algorithm?
13:55:16 <b_jonas> https://regi.tankonyvtar.hu/hu/tartalom/tamop425/0046_algoritmusok/ch11.html this book lists three different nontrivial string search algorithms
14:05:56 -!- adu has joined.
14:07:37 -!- adu has quit (Client Quit).
14:10:47 -!- arseniiv has joined.
14:10:58 -!- Arcorann has quit (Ping timeout: 260 seconds).
14:41:33 -!- mmmattyx has joined.
14:56:07 -!- ArthurStrong has joined.
15:00:49 -!- Sgeo has joined.
15:09:50 -!- LKoen has quit (Remote host closed the connection).
15:31:46 -!- LKoen has joined.
15:39:52 <esowiki> [[Special:Log/newusers]] create * Shahryar * New user account
15:47:27 <esowiki> [[Esolang:Introduce yourself]] https://esolangs.org/w/index.php?diff=79933&oldid=79931 * Shahryar * (+325) /* Introductions */
15:57:14 <esowiki> [[Language list]] https://esolangs.org/w/index.php?diff=79934&oldid=79904 * Shahryar * (+16) /* Non-alphabetic */
15:59:13 <esowiki> [[Language list]] https://esolangs.org/w/index.php?diff=79935&oldid=79934 * Shahryar * (+16) /* P */
16:12:34 <esowiki> [[Plutonium]] N https://esolangs.org/w/index.php?oldid=79936 * Shahryar * (+408) Plutonium Programming Language Intro
16:16:39 <esowiki> [[User:Shahryar]] N https://esolangs.org/w/index.php?oldid=79937 * Shahryar * (+194) Created page with "Hi, I am Shahryar Ahmad.I am a self taught teenage programmer.I love programming.I created my own programming language plutonium.I code in C/C++ and these are my favourite pro..."
16:30:53 <esowiki> [[Length]] https://esolangs.org/w/index.php?diff=79938&oldid=79932 * Nailuj29 * (+72)
16:41:56 <esowiki> [[Length]] https://esolangs.org/w/index.php?diff=79939&oldid=79938 * Nailuj29 * (-4)
16:48:04 <esowiki> [[Length]] https://esolangs.org/w/index.php?diff=79940&oldid=79939 * Nailuj29 * (+83)
16:54:51 <shachaf> b_jonas: The algorithm I'm describing is certainly in the Boyer-Moore family.
16:56:09 <shachaf> The book I'm reading divides string-in-string search algorithms into approximately three families, KMP-like, BM-like, and ones based on substrings (which it calls factors).
16:56:35 <shachaf> Rabin-Karp is yet another method, which it hasn't even mentioned yet.
16:56:57 <shachaf> Maybe it'll mention it if don't-care characters come up later.
16:57:51 <shachaf> Anyway, BNDM -- Backward Nondeterministic Dawg Matching -- is an example of the factor algorithm. https://www-igm.univ-mlv.fr/~lecroq/string/bndm.html looks like a link for it?
17:00:25 -!- xelxebar_ has quit (Remote host closed the connection).
17:00:49 -!- xelxebar has joined.
17:00:53 <b_jonas> shachaf: Rabin-Karp is mentioned in the book that I linked, and in Cormen
17:01:11 <b_jonas> shachaf: isn't there an additional family of randomized (hashing) algorithms?
17:01:18 <shachaf> Yes, I know what Rabin-Karp is, I just mean that so far it hasn't mentioned rolling hashes or anything.
17:01:50 <b_jonas> ah right, Rabin-Karp is the randomized one
17:02:13 <shachaf> My vague recollection of this topic is that optimal time for searching with don't-care characters in the pattern is pretty tricky to achieve with conventional algorithms.
17:02:40 <shachaf> And that maybe a rolling hash method does best at it, or something.
17:03:40 <b_jonas> shachaf: isn't that only if you insist on theoretical asymptotics though, while most of the practical input data that you want to search for is much easier, though you have to be careful when users can give you text and/or search queries of course?
17:04:02 <shachaf> I think that's right. This book is pretty practically-minded.
17:04:49 <b_jonas> well, in the end we'll just have to wait for Knuth vol 5 for a clear summary and final word
17:06:37 <shachaf> I quite like these methods that store a set of states in a machine word.
17:07:06 <shachaf> Even if they're limited to search for patterns of size 64 or something.
17:08:58 <shachaf> One thing this book doesn't cover at all is offline algorithms, where you can build an index on the text.
17:09:20 <shachaf> I'd read another book about those because there are so many interesting tricks there.
17:18:35 -!- sprocklem has joined.
17:20:58 <esowiki> [[Special:Log/newusers]] create * Chibiningen * New user account
17:24:19 <esowiki> [[Esolang:Introduce yourself]] https://esolangs.org/w/index.php?diff=79941&oldid=79933 * Chibiningen * (+159)
17:25:21 <esowiki> [[User:Chibiningen]] N https://esolangs.org/w/index.php?oldid=79942 * Chibiningen * (+13) Created page with "Irashaimasen."
17:28:09 <esowiki> [[Talk:Imaginary function]] https://esolangs.org/w/index.php?diff=79943&oldid=43070 * Chibiningen * (+164)
17:31:04 -!- mmmattyx has quit (Quit: Connection closed for inactivity).
17:48:14 -!- delta23 has joined.
18:19:43 -!- privateger has joined.
18:33:55 <esowiki> [[V]] M https://esolangs.org/w/index.php?diff=79944&oldid=79887 * Bo Tie * (+0) oops
18:34:44 -!- FreeFull has joined.
18:36:35 <esowiki> [[V]] M https://esolangs.org/w/index.php?diff=79945&oldid=79944 * Bo Tie * (+0) oops again
19:00:32 -!- Lord_of_Life_ has joined.
19:02:37 -!- Lord_of_Life has quit (Ping timeout: 264 seconds).
19:03:22 -!- Lord_of_Life_ has changed nick to Lord_of_Life.
19:11:00 -!- delta23 has quit (Remote host closed the connection).
19:11:30 -!- delta23 has joined.
19:11:33 -!- delta23 has quit (Read error: Connection reset by peer).
19:12:14 -!- delta23 has joined.
19:46:44 -!- jess has joined.
19:47:40 -!- jess has quit (Client Quit).
19:48:15 -!- jess has joined.
19:52:26 <rain1> file format for packing multiple files togethr: (filename\0data\0)* works with files who do not contain \0
19:55:10 -!- MDude has joined.
19:55:44 -!- also_uplime has quit (Disconnected by services).
19:55:57 -!- also_uplime has joined.
19:57:31 <zzo38> Yes, although other than text files, many files will contain \0
19:58:20 <kmc> https://en.wikipedia.org/wiki/Consistent_Overhead_Byte_Stuffing
20:03:14 <zzo38> That will work, although that makes it a more complicated format. (Although for some applications that might still be helpful, I suppose.)
20:09:28 <zzo38> Although I generally prefer the Hamster archive format, which is like what rain1 described except instead of adding a null byte after the data, add the 32-bit PDP-endian data size before the data. (Of course, different formats have their own advantages and disadvantages, such as this won't work if the data size won't fit in a 32-bit number.)
20:09:34 <kmc> if the goal is just to pack files together then i reckon a length-prefixed format is better than a delimited format
20:10:18 <zzo38> Yes, I think so too.
20:10:22 <kmc> the length-prefixed format doesn't require processing the file data at all, and it allows a reader to easily skip files that are not of interest, assuming the archive is on a seekable medium
20:12:14 <zzo38> Yes, I did think of that too (and have taken advantage of that too).
20:14:34 <kmc> COBS is good for something like a serial data link where a) you don't necessarily know the length of a packet when you start transmitting it and b) you want to be able to jump into the middle of a stream and resynchronize as soon as you reach the end of a packet
20:15:19 <kmc> I am thinking the next time I build an embedded system which needs to send structured data over a serial link, I might use CBOR + COBS
20:17:53 <zzo38> Yes, I believe you; that makes sense.
20:23:16 <kmc> serialization is a surprisingly hard problem
20:23:48 <kmc> it often seems stupid that the world has so many serialization formats, but it's surprisingly tricky to design a good one, and there are a lot of conflicting requirements such that there isn't necessarily one best choice
20:53:05 <shachaf> Man, this string searching loop is so good: for (int i = 0; i < text_size; i++) { state = (state >> 1) | table[text[i]]; if ((state & 1) == 0) { /* found */ } }
20:55:29 <rain1> does table just put how many chars left are need as bits?
20:56:45 <shachaf> I guess you can put it that way?
20:57:43 <rain1> it seems ok, very basic no skip aheads
20:58:45 <shachaf> Yes, it's only for small patterns.
21:00:45 <shachaf> If you want skipahead, Horspool is also really simple (way simpler than Boyer-Moore) and good.
21:02:38 <zzo38> kmc: I think that different formats can be good for different purposes, although it is true there are some problem with some of them
21:02:39 <rain1> thanks i didn't know about it
21:03:05 <shachaf> window_left := 0; while (window_left <= end - pattern_size) { if (s[window_left:window_left+pattern_size] == pattern) { /* found */ } else { window_left += table[text[window_left + pattern_size - 1]]; }
21:03:14 <shachaf> Where that if is doing string comparison, of course.
21:04:15 <shachaf> And the table just has the rightmost occurrence of each character (excluding the last one).
21:06:54 <zzo38> I think XML is too often used as a general purpose serialization format when it isn't very good for that; what XML is good for is stuff like HTML (and avoids some of the problems of HTML).
21:07:16 <kmc> XML has a lot of problems
21:08:06 <zzo38> Yes, it does have a lot of its own problems too
21:16:17 <zzo38> One serialization format that is often missed is the format produced by the printobject operator in PostScript.
21:16:26 <kmc> I do like some things that XML can do, such as namespacing of tags, ability to embed one type of XML document in another, and schemas to check validity
21:16:32 <kmc> but these things are often not used or used improperly
21:16:37 <kmc> and the concrete syntax of XML is very cumbersome
21:16:48 <kmc> which defeats the purpose of a "human-readable" format
21:17:03 <kmc> and if it's not going to be human-readable/editable then it could be a more efficient and easy to parse binary format
21:17:19 <kmc> zzo38: what is that format?
21:17:43 <zzo38> XML is way too often misused I think. The good things XML can do is good for things like HTML, not for other kind of stuff, I think.
21:18:14 <zzo38> kmc: Here is a description: http://fileformats.archiveteam.org/wiki/PostScript_binary_object_format
21:18:57 <esowiki> [[Plutonium]] M https://esolangs.org/w/index.php?diff=79946&oldid=79936 * Tetrapyronia * (+53) added link (needs formatting and stuff)
21:19:24 <esowiki> [[Plutonium]] M https://esolangs.org/w/index.php?diff=79947&oldid=79946 * Tetrapyronia * (-9)
21:23:12 <kmc> HTML isn't even proper XML
21:23:24 <kmc> and the project to turn it into proper XML failed and was abandoned
21:24:14 <zzo38> Yes, I know, HTML isn't even proper XML.
21:24:25 <b_jonas> kmc: it wasn't abandonned. while we don't transmit HTML written as XML, browsers and their Javascript DOM interface effectively expose a view of the live internal state of HTML document that is basically an XML tree
21:24:58 <b_jonas> so XML is a good way to describe how the semantics works
21:25:48 <kmc> it still differs from XML
21:25:54 <b_jonas> we just don't want to apply the restrictions of XML to the HTML files that we transmit because that'd be pointless. like, no \x00 characters? it'd just be a stupid extra requirement on the side that serves the XHTML, when the browser side will always have to be able to parse full HTML anyway.
21:26:31 <b_jonas> kmc: sure, it differs, but I don't think that counts as abandonned
21:26:34 <kmc> I agree that HTML is similar to XML and some of the same concepts apply when working with a DOM
21:26:41 <b_jonas> the XML thing wasn't a dead-end
21:26:49 <b_jonas> it just led to the DOM interface that isn't quite XML
21:27:02 <kmc> the idea of serializing HTML pages in an XML compatible format basically went nowhere
21:27:12 <kmc> there is actually an XML compatible serialization of HTML5 documents (XHTML5)
21:27:15 <kmc> but I don't think it's used much
21:27:22 <kmc> it does not parse as standard HTML5, I don't think
21:27:27 <kmc> it would have a different content-type
21:27:31 <kmc> and a different, much stricter parser
21:28:02 <kmc> versus the HTML5 parser which is a precisely specified complicated ball of garbage meant to parse any vaguely correct HTML-ish thing anyone's ever written since 1990
21:32:23 <fizzie> "-//W3C//DTD XHTML 1.0 Transitional//EN" for life
21:33:34 <kmc> <!DOCTYPE html> is so much shorter
21:33:39 <fizzie> Remember when web pages had that button at the bottom where it proudly proclaimed which standard it validates at, and when you clicked it, you got a validator report with at least a dozen errors?
21:42:32 -!- user24 has joined.
22:12:54 -!- user24 has quit (Quit: Leaving).
22:16:37 -!- privateger has quit (Quit: Leaving.).
22:23:37 -!- arseniiv_ has joined.
22:24:58 -!- arseniiv has quit (Ping timeout: 246 seconds).
22:46:55 <oren> Did they ever fix the vertical alignment problem in HTML?
22:47:18 <oren> (where there is no portable way to vertically centre anything, other than using tables)
22:47:57 <shachaf> Does flex-whatever not do it nowadays?
22:48:29 <zzo38> Isn't there some CSS command to position something as though it is a table cell?
22:50:02 <zzo38> (Although maybe I am wrong; my use of CSS is mostly limited to correcting the bad designs of other CSS writers.)
22:51:29 <oren> oh apparently display:flex with align-items:center does vertical centring
22:53:47 -!- delta23 has quit (Quit: Leaving).
23:08:32 -!- FreeFull has quit (Read error: Connection reset by peer).
23:08:33 -!- LKoen has quit (Quit: “It’s only logical. First you learn to talk, then you learn to think. Too bad it’s not the other way round.”).
23:22:06 -!- rain1 has quit (Quit: WeeChat 3.0).
23:25:45 -!- FreeFull has joined.
23:37:45 <fizzie> They did a multicolumn CSS thing too, right?
23:47:44 -!- ArthurStrong has quit (Quit: leaving).
23:54:22 <kmc> zzo38: yes, table layout is fully specified by CSS
23:54:57 <kmc> that is, there is nothing special about <table> and <td> etc. tags, you could use <div class="table"> and <div class="td"> and so forth and get the same results with an appropriate stylesheet
23:55:26 <kmc> fun fact: you can even style normally invisible tags such as <script> to be visible
23:55:39 <kmc> try injecting "script { display: block; }" as user CSS on your favorite website and see what happens
23:55:47 <kmc> talk about literate programming!
23:57:08 <zzo38> Yes, although <TABLE> is still useful for some purposes, such as, if the user disabled CSS, if you use a browser that doesn't have CSS, or so that a browser that implements some special commands for use with tables can deal with tables, such as sorting, SQL queries, export, etc.
23:59:49 <kmc> as far as i know, the use of <table> is still considered correct for things that are actually semantically tables
23:59:54 <kmc> just not for random bits of site layout