←2025-12-19 2025-12-20 2025-12-21→ ↑2025 ↑all
01:11:52 <Sgeo> The existence of id :: Void -> Void shows that in Haskell's type system, 0^0=1. Its math operators agree. What languages do we have in here, I wonder if any disagree
01:17:11 -!- ais523 has joined.
01:21:07 <ais523> <Sgeo> The existence of id :: Void -> Void shows that in Haskell's type system, 0^0=1. ← it's hard to imagine a type system that includes an uninhabited type but *doesn't* allow defining the identity function on it
01:23:17 -!- amby has quit (Remote host closed the connection).
01:24:20 <ais523> the equivalent of x^0 is easy to define, it's absurd (i.e. a function with type Void -> a) – so the x^0==1 identity, when translated into type system terms, effectively states that there is exactly one correct way to define absurd
01:25:02 <ais523> meanwhile, the equivalent of 0^x is a function with type a -> Void – for most choices of a there are no such functions, but for uninhabited a such functions do exist (so you get 0^x == 0 except when x == 0)
01:26:17 <Sgeo> I feel like someone somewhere has tried to figure out what negative and fractional types are. And I'm suddenly now curious about imaginary types
01:27:30 <sorear> well -1/12 = 1 + 2 + 3 + ...
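The arithmetic being translated above can be checked directly by counting functions between finite sets: a function A -> B picks one element of B for each element of A, so there are |B| ** |A| of them. A minimal sketch (the set representations here are my own illustration, not anything from the discussion):

```python
# Counting functions between finite sets mirrors the type-level
# exponential: |B| ** |A| functions of type A -> B.
from itertools import product

def functions(a, b):
    """Enumerate all functions from finite set a to finite set b,
    each represented as a dict {element of a: element of b}."""
    a, b = list(a), list(b)
    return [dict(zip(a, choice)) for choice in product(b, repeat=len(a))]

void, bool_ = [], [False, True]

print(len(functions(void, void)))   # 0^0 = 1: the empty function, i.e. id on Void
print(len(functions(void, bool_)))  # 2^0 = 1: exactly one absurd-style function
print(len(functions(bool_, void)))  # 0^2 = 0: none from an inhabited type to Void
```

The single inhabitant of `functions(void, void)` is the empty function, matching the claim that there is exactly one identity on Void.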
01:28:59 <Sgeo> I assume there are unsigned infinite amount of functions from a negative type to Void
01:30:42 <ais523> I've thought about negative types in the past because the joke esolang TURKEY BOMB defines one (without specifying how it actually works)
01:31:17 <ais523> but TURKEY BOMB's is more of a fractional type, I think (it has more than 0 but fewer than 1 possibility)
01:31:58 <ais523> OK, so the simplest such type is 1/2, which is a -1-bit integer – if you store it together with an 8-bit integer the resulting type fits in 7 bits
01:33:10 <ais523> 2^-1 == 1/2, so a function from the type -1 to bool should return a -1-bit integer
01:33:45 <ais523> wait, no, it returns a bool, the -1-bit integer is how you represent all such functions
01:34:34 <ais523> my guess is that because -1^1 = -1, -1^2 = 1, -1^3 = -1, etc., it is impossible to come up with a sensible definition of this
01:35:47 <ais523> how does it make sense for a function a -> b (for varying a and fixed b) to be either definable or antidefinable based on whether a contains an even number of possible values or not?
01:44:43 <sorear> non-classical logic, maybe
02:15:12 -!- pool has joined.
03:35:00 <esolangs> [[User:PrySigneToFry/Silicon dioxide in a polypropylene box/Mirror to Esolangist's Chess Games/Xiangqi]] N https://esolangs.org/w/index.php?oldid=170878 * PrySigneToFry * (+1213) Created page with "This is the scene of a Chinese chess match. You can freely add, modify, or delete content related to the symbols. = Notations = Red pieces are represented by lowercase le
03:49:59 <b_jonas> "<ais523> it's hard to imagine a type system that includes an uninhabited type but *doesn't* allow defining the identity function on it" => C++ constexpr begs to differ, I think you can't define a constexpr function with an uninhabited parameter type, or maybe you can define one only using templates, I'm not sure how this works
06:51:46 <esolangs> [[Xaxa]] https://esolangs.org/w/index.php?diff=170879&oldid=105195 * Yayimhere2(school) * (+11) /* Commands */ I can see ^ is 0 indexed
08:00:06 -!- tromp has joined.
08:02:16 <esolangs> [[Lacc]] https://esolangs.org/w/index.php?diff=170880&oldid=170603 * Yayimhere2(school) * (+0)
08:04:37 <esolangs> [[Lacc]] https://esolangs.org/w/index.php?diff=170881&oldid=170880 * Yayimhere2(school) * (+12) /* Command set */
08:12:00 -!- tromp has quit (Ping timeout: 245 seconds).
08:35:33 <esolangs> [[]] https://esolangs.org/w/index.php?diff=170882&oldid=156410 * Yayimhere2(school) * (-40) /* Find zero to the right (invalid) */ its not actually invalid, as ive Benn told
08:36:54 <ais523> hmm, I'm wondering if the main difference between paradigms is with respect to things like "in imperative languages, loops desugar into goto, but in functional languages, loops desugar into recursion"
08:37:10 <ais523> this isn't *quite* objective but seems to be a hard-line qualitative difference
08:50:34 -!- Sgeo has quit (Read error: Connection reset by peer).
08:56:23 <b_jonas> I can believe the latter, but I think there can be imperative languages where loops desugar into recursion
09:03:22 <ais523> hmm, maybe this is a consequence thing
09:03:57 <ais523> along the lines of "paradigms are basically a set of design decisions that are commonly made together, people who are designing an imperative language will therefore usually design it with jump-based loops"
09:06:13 <b_jonas> I mean, I admit that I'm trying to design Enchain to have if-gotos as the main way to make a loop, so in that respect I'm guilty.
09:06:40 <b_jonas> but I don't think that's the only possibility for an imperative language
09:07:54 <b_jonas> also we were recently talking about BASIC, and in at least some versions of BASIC, NEXT doesn't desugar into a GOTO because it can jump to different FOR statements depending on the path you arrived from
09:08:29 <b_jonas> (it also doesn't desugar to recursion though)
09:10:24 <b_jonas> in intercal it's not clear what kind of loop you want to desugar, it doesn't have sugar that tastes exactly like a loop
09:11:33 <b_jonas> come from mostly desugars into goto, but next doesn't
09:17:42 <b_jonas> oh yeah, TeX. I think TeX counts as imperative with loops desugaring to recursion
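The goto-vs-recursion desugaring under discussion can be made concrete for one loop; both versions below are illustrative rewrites of the same summation, not any particular language's actual lowering:

```python
# Imperative style: the loop becomes conditional jumps. The dispatcher
# loop stands in for goto, with pc as the "label" being jumped to.
def sum_to_n_goto(n):
    total, i, pc = 0, 0, "loop"
    while pc != "done":
        if pc == "loop":
            pc = "body" if i <= n else "done"
        elif pc == "body":
            total, i = total + i, i + 1
            pc = "loop"
    return total

# Functional style: the same loop becomes tail recursion, with the
# mutable variables threaded through as arguments instead.
def sum_to_n_rec(n, i=0, total=0):
    if i > n:
        return total
    return sum_to_n_rec(n, i + 1, total + i)
```

Both compute the same function; the qualitative difference is exactly where the loop state lives (mutable variables plus a jump target, versus call arguments).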
09:18:37 <ais523> NEXT from INTERCAL is interesting because it's unclear from the NEXT itself whether it's a subroutine call or a jump – it's subsequent code that retroactively decides for it
09:19:31 <ais523> although INTERCAL feels very imperative, partly because functions don't have "official" arguments (you have to use dynamic scoping to simulate it)
09:21:01 <ais523> hmm… using CREATE on operators, INTERCAL has functions which are close to first-class (and with arguments and a return), just with a weird calling syntax (but they work like C function pointers in that they are not closures)
09:21:02 <b_jonas> hehe, that's true of scan as well
09:21:10 <b_jonas> it has functions but they don't take arguments
09:21:38 <ais523> the basic idea is that you CREATE an operator and immediately use it in the next line, as a way of doing a function call
09:21:56 <ais523> and because C-INTERCAL supports computed CREATE you can do this with a computed line number
09:22:00 <b_jonas> though that's only because it's incomplete and abandoned, the functions should have arguments, I just never implemented that
09:30:30 <esolangs> [[User guessed]] https://esolangs.org/w/index.php?diff=170883&oldid=167629 * PrySigneToFry * (+10)
09:34:51 <esolangs> [[User:PrySigneToFry]] https://esolangs.org/w/index.php?diff=170884&oldid=169704 * PrySigneToFry * (+29)
10:10:21 <esolangs> [[User guessed]] https://esolangs.org/w/index.php?diff=170885&oldid=170883 * I am islptng * (+76)
10:10:44 <esolangs> [[User guessed]] https://esolangs.org/w/index.php?diff=170886&oldid=170885 * I am islptng * (-2)
10:11:47 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170887&oldid=170844 * Yayimhere2(school) * (+0) /* Semantics */
10:15:27 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170888&oldid=170887 * Yayimhere2(school) * (-3) /* Semantics */
10:27:48 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170889&oldid=170888 * Yayimhere2(school) * (+226) /* Examples */
10:29:12 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170890&oldid=170889 * Yayimhere2(school) * (-2)
10:33:48 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170891&oldid=170890 * Yayimhere2(school) * (+68) /* Examples */
10:42:28 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170892&oldid=170891 * Yayimhere2(school) * (+32) /* Syntax */
10:49:32 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170893&oldid=170892 * Yayimhere2(school) * (+1) /* Examples */
10:57:56 <esolangs> [[User guessed]] https://esolangs.org/w/index.php?diff=170894&oldid=170886 * Hammy * (+40) /* Commands */
10:59:50 <esolangs> [[Talk:Kava]] https://esolangs.org/w/index.php?diff=170895&oldid=148397 * Hammy * (+357)
11:04:56 <esolangs> [[Contains everything]] https://esolangs.org/w/index.php?diff=170896&oldid=170613 * C++DSUCKER * (+9)
11:09:08 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170897&oldid=170893 * Yayimhere2(school) * (+1537)
11:10:19 <sorear> "it's subsequent code that retroactively decides for it" riscv JAL says hi
11:10:54 <sorear> there's a fairly natural variant of SSA where basic blocks take named arguments instead of having phi nodes, and all jumps are syntactically tail recursion
11:20:17 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170898&oldid=170897 * Yayimhere2(school) * (+17) /* Examples */
11:25:45 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170899&oldid=170898 * Yayimhere2(school) * (+38) /* Examples */
11:28:11 <ais523> when I think of SSA, I think of the version that uses arguments rather than phi nodes – but I also think of it as using jumps rather than tail calls (there's nothing fundamentally strange to me about a jump that has arguments)
11:30:24 <b_jonas> what is "SSA"?
11:34:21 <ais523> static single assignment
11:35:29 <ais523> it's a coding style in which all variables are assigned a value on declaration and can never have that value change, but the name SSA is normally only used in cases where that is done in a compiler's intermediate representation
11:42:09 <b_jonas> oh, so that's why you want tail recursion
11:42:51 <b_jonas> and you might need that not only for loops but for conditionals
11:43:29 <ais523> b_jonas: right
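The SSA variant sorear describes — basic blocks taking named arguments, with every jump passing values explicitly like a tail call — can be sketched with a trampoline (Python has no tail-call elimination). The block names and the factorial example are invented for illustration:

```python
# Each basic block is a function; a "jump" returns the target block and
# the argument values to pass, instead of relying on phi nodes to merge
# values at the destination.
def entry(n):
    return (loop, (n, 1))            # jump loop(i=n, acc=1)

def loop(i, acc):
    if i <= 1:
        return (exit_, (acc,))       # jump exit(acc)
    return (loop, (i - 1, acc * i))  # jump loop(i-1, acc*i)

def exit_(acc):
    return (None, acc)               # leave the trampoline with a result

def run(block, args):
    # The trampoline makes every jump a constant-space "tail call".
    while block is not None:
        block, args = block(*args)
    return args

print(run(entry, (5,)))  # → 120
```

Note that every variable here is assigned exactly once per block activation, which is the single-assignment property; the block arguments play the role phi nodes would.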
12:29:12 <esolangs> [[Special:Log/newusers]] create * Choas * New user account
12:41:43 <APic> Hi
12:56:30 -!- amby has joined.
13:33:00 <esolangs> [[User guessed]] https://esolangs.org/w/index.php?diff=170900&oldid=170894 * PrySigneToFry * (+253)
13:33:50 <esolangs> [[User guessed]] https://esolangs.org/w/index.php?diff=170901&oldid=170900 * PrySigneToFry * (+26)
13:34:50 <esolangs> [[User guessed]] https://esolangs.org/w/index.php?diff=170902&oldid=170901 * PrySigneToFry * (-63) Oops!
14:00:06 <esolangs> [[User guessed]] https://esolangs.org/w/index.php?diff=170903&oldid=170902 * Hammy * (+73) /* Commands */
14:08:16 <esolangs> [[Third Party Contractor Accused Of A Robbery]] https://esolangs.org/w/index.php?diff=170904&oldid=81927 * Kaveh Yousefi * (+1534) Introduced an examples section, comprehending as its incipial and aefauld member a single letter printer, supplemented a hyperlink to my implementation of the language on GitHub, and altered the Unimplemented page category tag to Impleme
14:34:07 -!- ais523 has quit (Quit: quit).
14:40:39 -!- impomatic has joined.
14:46:38 <esolangs> [[User guessed]] https://esolangs.org/w/index.php?diff=170905&oldid=170903 * PrySigneToFry * (+211)
15:31:02 <APic> cu
15:57:37 -!- slavfox has quit (Ping timeout: 264 seconds).
16:00:58 <esolangs> [[Esolang:Introduce yourself]] https://esolangs.org/w/index.php?diff=170906&oldid=170859 * Choas * (+182) /* Introductions */ my intro
16:28:35 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170907&oldid=170899 * Yayimhere2(school) * (+37) /* Examples */
16:49:23 <esolangs> [[Apraxia]] https://esolangs.org/w/index.php?diff=170908&oldid=170907 * Yayimhere2(school) * (+60) /* Semantics */
16:55:25 <esolangs> [[Blur]] N https://esolangs.org/w/index.php?oldid=170909 * Choas * (+9590) create Blur page for release 0.1.0
16:57:58 <esolangs> [[Language list]] https://esolangs.org/w/index.php?diff=170910&oldid=170864 * Choas * (+11) /* B */ Blur
17:16:14 <esolangs> [[Blur]] https://esolangs.org/w/index.php?diff=170911&oldid=170909 * Corbin * (+29) Tag slop.
17:16:56 <korvo> I'm not going to bother putting my evidence on-wiki anymore. In this case, the reference implementation is Claude-generated and contains a WIKI.md which has the same contents as the article. The article itself still has some memetic tells, too.
17:20:30 <sorear> somewhat unfortunate that the IRC bot now posts links that are useless without wiki accounts
17:23:49 <korvo> The curse of Mediawiki. It's not a cheap wiki to operate. It's no accident that most Mediawikis are run by one of like three groups.
17:42:09 <fizzie> Heh, I didn't think of those links.
17:43:57 <fizzie> Honestly I think they might as well just link to the page instead of the specific diff, but it doesn't have any settings, it's just "the IRC format".
17:47:12 <int-e> hmm, though there's a chance that the numerical ids bypass the problem of encoding unicode characters?
17:47:43 <int-e> `unidecode ◧◨
17:47:46 <HackEso> ​[U+25E7 SQUARE WITH LEFT HALF BLACK] [U+25E8 SQUARE WITH RIGHT HALF BLACK]
17:52:03 <fizzie> Well, you could have a numeric-ID link to a page as well; https://esolangs.org/w/index.php?curid=24423 in that case. Though it does make more sense for a recent changes feed to link to the change, this whole crawling thing is just unfortunate.
18:03:51 <esolangs> [[2 poets, 1 poem]] https://esolangs.org/w/index.php?diff=170912&oldid=139833 * Hammy2 * (+111)
18:14:49 -!- impomatic has quit (Quit: Client closed).
18:48:05 -!- Sgeo has joined.
19:33:49 -!- Lord_of_Life has quit (Ping timeout: 260 seconds).
19:37:19 -!- Lord_of_Life has joined.
19:44:26 -!- ais523 has joined.
19:44:45 <ais523> Blur is a good example of a language somehow missing its *own* point
19:46:16 <ais523> fwiw, the AI tells in the article itself, while present, are much weaker than in most AI-generated articles, which makes it a better article – at least I get a clear idea of how the language is specified from the description
19:47:34 <korvo> It's Claude. We're most used to detecting ChatGPT. The difference is that ChatGPT's RL is RLHF while Claude uses their own in-house RL called "Constitutional AI". Instead of human feedback, RL is evaluated by a policy drafted and verified by multiple smaller bots.
19:47:35 <ais523> although the scope of "sharp for"'s unblurring is unclear when you use it for something more complicated than a loop over a range
19:48:20 <ais523> the language would be much better without sharp for and without the 1000-iteration limit on regular for loops
19:48:50 <ais523> but maybe whoever generated the language didn't think of that
19:49:18 <ais523> (we've had similar issues in human-written languages before, e.g. Java2K's main gimmick has no interesting way of working around it, you just have to write the code multiple times and hope at least one copy works)
19:50:36 <ais523> anyway, part of my thoughts are that the easier it is to tell that an article is AI-generated, the sloppier it is – slop can be considered undesirable either because LLM use in general is considered undesirable or because it's hard to read and not very good at conveying information
19:51:17 <ais523> but if the article doesn't look AI-generated then it's usually conveying information quite well (because if it weren't, you would be able to tell it were AI based on that), so the second reason why it might be undesirable is less significant
19:52:12 <ais523> of course, this may well have the usual AI problem of the documentation not really matching the implementation at all (I haven't checked), but the documentation does at least describe an internally consistent (if unfortunately designed) esolang
19:52:17 <korvo> I don't think that the user came up with the idea for the language directly. I think that they prompted Claude with something like, "What are some ideas for a chaotic esoteric programming language?"
19:52:37 <ais523> maybe?
19:52:42 <ais523> the core idea is interesting, though
19:52:52 <ais523> at least from the TCness proof point of view
19:53:18 <ais523> the restriction is fairly easy to work around for booleans, but finitely many booleans is not enough
19:53:21 <korvo> They don't mention inigos/exponential decay or Welford's algorithm, so I don't think that they were doing any of the background research that might lead one to want this sort of language feature. They certainly don't understand that an efficient implementation would have to *summarize history*; Claude's implementation keeps full history instead.
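Welford's algorithm, named above as the history-summarising approach, keeps only a count, a running mean, and a running sum of squared deviations — O(1) space, no stored samples. A standard sketch:

```python
# Welford's online algorithm: numerically stable running mean and
# variance, updated one sample at a time with no history retained.
class Welford:
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)   # note: uses the updated mean

    def variance(self):
        # Population variance; use m2 / (n - 1) for the sample variance.
        return self.m2 / self.n if self.n else 0.0
```

This is the sense in which an efficient implementation summarizes history: the full sample list is never needed to answer mean/variance queries.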
19:54:10 <ais523> you could also work around the restriction by allocating continuously in order to avoid ever having to assign to a variable more than once
19:54:21 <korvo> And the issues around loops suggest that they don't know about probabilistic programming and how that's normally solved via sampling strategies.
19:54:48 <ais523> actually, there's a neat reverse summarisation trick, but it only works if you have arbitrary-precision rationals
19:56:01 <ais523> <korvo> They don't mention inigos/exponential decay or Welford's algorithm, so I don't think that they were doing any of the background research that might lead one to want this sort of language feature. ← I think this is expecting too much, human-created esolangs also often don't know the mathematical basis and are naively implemented
19:57:29 <ais523> e.g. look at https://esolangs.org/wiki/Feed_the_Chaos, I actually *did* the research into implementing that in an optimised way, but I didn't write an interpreter that uses it
19:58:25 <korvo> ais523: I'm not trying to be mean. I'm genuinely trying to understand what they did. I know that asking them "How did you make this?" is futile.
19:59:08 <ais523> korvo: from my point of view, what I'm trying to understand is "is this the sort of article we want on the wiki or not" and I don't know the answer, so I'm exploring subquestions
20:00:11 <korvo> Oh, sure. I mean, yes. But we'd delete it if it were a copyright violation. The policy wags the enforcement.
20:01:03 <korvo> LLMs are simulators. Claude will simulate an entire menagerie of computer-science theories if given space and time. It will even implement them in code. But that doesn't imply Naur-style theory building.
20:02:14 <ais523> even bad ideas can be inspirational sometimes: IMAGERY is clearly ridiculous but it got me thinking
20:03:19 <ais523> specifically the rule 110 example: I was trying to work out an esolang that, given a picture of a 1D cellular automaton history, would expand it to produce more history lines (without being told what the automaton actually is)
20:03:23 <ais523> and that would give you a TC language
20:04:45 <korvo> Sure. And I won't pressure you into deleting this or any other slop page. I just want to stress: Claude doesn't have bad ideas. Claude doesn't have ideas. We don't actually have Conway's Law and etc. without Naur's theory-building; we do actually need *people* to have the ideas at hand.
20:07:17 <ais523> korvo: so I like to compare LLMs to random number generators
20:08:02 <ais523> if you ask an LLM to try to find an input that makes a program crash, this is to me conceptually similar to fuzz-testing, except that the LLM weights its attempts in a way that might or might not be useful
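The fuzz-testing comparison can be made concrete with a toy random fuzzer; `program_under_test` and its crash condition are invented stand-ins, and a real fuzzer would weight its candidate inputs (by coverage, as afl does) rather than sampling uniformly:

```python
# Unweighted fuzzing as random search for a crashing input.
import random

def program_under_test(s):
    # Hypothetical buggy program: crashes on long-enough inputs
    # that begin with "ab".
    if s.startswith("ab") and len(s) > 3:
        raise ValueError("crash")

def fuzz(seed=0, tries=10000):
    rng = random.Random(seed)
    alphabet = "ab"
    for _ in range(tries):
        s = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 6)))
        try:
            program_under_test(s)
        except ValueError:
            return s          # found a crashing input
    return None               # gave up
```

An LLM asked the same question is, on this view, a sampler with very different (and not necessarily useful) weights over the input space.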
20:08:53 <ais523> on codegolf stack exchange, we were debating a couple of situations in which a program had defined all possible esolangs (by seedably generating them randomly, sort-of like Seed but it generates the interpreter rather than the program)
20:09:04 <korvo> afl also does weights. Or I suppose it does weighted paths. It rewards itself for exploring new paths, for finding crashy paths, and maybe one or two other scenarios.
20:09:39 <ais523> do those languages exist or not? (at the time, it mattered for CGCC rules) My argument was "the language is created at the point that a human sees it and decides that it is interesting"
20:10:10 <ais523> all the possible languages were there, but the creative step is deciding which particular ones are worth investigating
20:11:00 <ais523> and I think LLM-generated "ideas" are similar: the LLM isn't having an idea, but sometimes its output inspires an idea in a human reading it, now the idea has been had (even if it didn't occur organically)
20:11:28 <korvo> Tangent: https://lobste.rs/s/oysxby/functional_genetic_programming is probably the best paper for folks wanting to use ML to generate functional programs. It covers rewards, genes, tournament schedules, all sorts of goodies.
20:11:48 <korvo> Well, yeah. It's inception.
20:13:06 <korvo> The LLM is a pile of memes. We shake it and ask it for some memes, giving it a seed to grow from. It gives us likely memes. We don't really think of the output as memetic because the model's large enough to have learned grammar.
20:13:18 <ais523> korvo: something that's really been fascinating me recently is "there have been a lot of impressive-sounding claims for what LLMs can do recently – how many of the impressive LLM results could be reproduced using a smart fuzzer instead?"
20:14:02 <korvo> The problem is that those memes are average, laundered, corporate-approved. What is OpenAI going to tell you besides blandness that is, in terms of sentiment, neutral-to-slightly-positive about OpenAI?
20:14:24 <ais523> it seems to me like many apparent LLM successes can be explained using a mixture of fuzzing and plagiarism, but I'm interested in whether they contribute anything beyond that
20:14:30 <korvo> ais523: It's all search. That's a Bitter Sublesson.
20:14:49 <ais523> korvo: which meaning of "search" are you using here?
20:15:43 <korvo> TBH I'm not even seeing the successes. LLMs are just larger variants of pre-existing language models. I'm worried by the fact that people seem to think that these outputs are good instead of average.
20:16:20 <korvo> ais523: Like, *the* meaning. Search trees. Search spaces. All verification is proof search, which is grammatical tree search.
20:16:27 <ais523> (I have been thinking about the bitter lesson quite a bit, though – the original version of it seems plausible but it seems to have been misinterpreted by many AI companies, e.g. they seem to be interpreting it as "more training data will make the AI smarter" and I think that's neither true nor a correct interpretation)
20:17:03 <ais523> korvo: I have been using "search-based AI" to distinguish it from the neural network / machine learning style of AI
20:17:11 <ais523> using the term, that is
20:17:32 <korvo> Almost nobody's actually read the Bitter Lesson. I happen to know it by heart, mostly because I'm tired of people getting it wrong. In particular, all it can really tell you is that you're working at too-high a level of abstraction and you'll never make artificial humans that way, which is not what AI researchers ever want to hear.
20:17:55 <ais523> korvo: I read it (because I saw people talking about it and decided to go to the original source)
20:18:20 <ais523> I am not sure whether it's right or not, but suspect that if it is right it is narrow in application
20:19:15 <ais523> in any case, I think it's very debatable whether any AI output is actually good – but indisputable that a large number of people believe it to be good which is an interesting fact on its own to analyze
20:19:17 <korvo> I don't see a big difference at a high level. Like, I believe NNAEPR: Neural Nets Are Essentially Polynomial Regression. But also, LLM interpretability tells us that Transformers can count how many characters are left in a line and how many characters each token will cost, as well as other sorts of counters. Also they evolve trigonometric tools and convolutional tools.
20:20:02 <korvo> And the Bitter Lesson papers over all of that neatly by simply saying that it's Moore's Law or equivalent scaling to blame. It's not the fault of the researchers; rather, it's more like good software engineering practice is also good ML practice. Update your machines and forward-port your code to newer libraries.
20:21:04 <korvo> Biologists would tell us that it's an example of supernormal stimulus; the LLM stimulates us like a chat conversation would, but *moreso*.
20:21:29 <ais523> one thing I've been looking at recently is double-dummy solving in bridge: this is a game tree search with an objectively correct answer, and modern double-dummy solvers can search the entire tree in a second or two
20:21:36 -!- impomatic has joined.
20:21:51 <ais523> but that requires lots of optimisations to make it work, a naive approach (even using good algorithms) can take tens of seconds or a few minutes
20:22:38 <korvo> My analysis here (https://awful.systems/post/5000835) is that humans prefer memes to original thoughts; if given the chance to think for ourselves or to be part of a hivemind, most humans prefer the latter. I could speculate that it's easier or conditioned or cultural or etc. but the fact that it happens is sufficient for me for now.
20:22:46 <ais523> just leaving this to Moore's Law is a possibility, *but* there are use cases where it would need to be several orders of magnitude faster (milliseconds or microseconds) and I'm wondering whether it would make sense to try to develop those optimisations
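Exact game-tree search of the double-dummy kind can be miniaturised to single-pile Nim, with `functools.lru_cache` standing in loosely for a transposition table — one of the optimisations alluded to above. This is an illustrative toy, not how bridge solvers are actually built:

```python
# Exact minimax over a trivial game: single-pile Nim where a move takes
# 1-3 objects and taking the last object wins. Memoisation collapses
# the exponential tree to one evaluation per position.
from functools import lru_cache

@lru_cache(maxsize=None)
def winning(pile):
    """True iff the player to move wins with perfect play."""
    return any(not winning(pile - take) for take in (1, 2, 3) if take <= pile)
```

Without the cache the tree has exponentially many nodes; with it, each position is solved once — the same kind of speedup (reuse of equivalent positions) that makes full-tree double-dummy search feasible in seconds.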
20:23:14 <ais523> korvo: I agree that is true but also suspect that I am an exception
20:25:06 <korvo> I think some optimizations do need to be engineered. Parallelizing, for example. Moore's Law and its friends are really only good explanations over multiple decades, not over one or two years.
20:28:54 <korvo> ais523: BTW, I'm not going to pull it up, but you might find the Bitter Lesson examples compelling. Chess is one of them IIRC. In a real sense, the biggest difference between a modern image classifier like CLIP and the 1950s image classifiers is the number of pixels and neurons and features and categories; the decades have given us compute and RAM.
20:29:50 <korvo> "Deep Learning" is really just the fact that in the 1990s we could afford more than one layer of neurons. It turns out that vision-shaped models, like image classifiers, *always* develop the first couple layers into general-purpose convolutional tools; but we didn't know that then.
20:30:27 <ais523> korvo: I think the chess example is interestingly clarifying: improvements in search depth made position evaluation less important (but it was still pretty important), but things like observing symmetries gives a permanent speedup that isn't lost to the bitter lesson
20:32:41 <korvo> ais523: I'm trying to remember which grandmaster said, "I only look ahead one move -- but it's a really good one!" We're starting to understand that, to a reasonable degree, a large Markov model will have internal predictions for multiple subsequent tokens, even though those predictions could be spoiled by a surprising token.
20:33:36 <korvo> Position evaluation is really important, but it requires more than one scalar. Say, a vector space with multiple scalars for multiple facets. Say, a 180k-dimensional latent space with an embedding that has learned quite a few simple data-manipulating circuits~
20:34:25 <korvo> A *really good* lookahead, even only one move, might as well be a lookahead that speculates on more than one move. I have no idea how to square this with traditional ways of thinking about chess other than to suggest that humans also do tree search by default.
20:34:32 * korvo sucks at chess
20:35:07 <ais523> korvo: chess is interesting in a way because a really good neural network evaluator can play pretty well without any lookahead, *and* a really good lookahead algorithm can play at above grandmaster level even with a seriously flawed evaluator
20:35:38 <ais523> but the current top chess engines combine both, and thus only other top chess programs can realistically compete against them
20:36:02 <ais523> it's kind-of unclear what half of this situation the bitter lesson is supposed to apply to
20:36:12 <korvo> Yeah. More generally and topologically, we can't summarize an arc/sector of a hyperbolic system just by looking at it and describing what we see when zoomed out and blurry.
20:37:08 <korvo> Maybe it's *wrong* to use a (pseudo?) Euclidean embedding for LLMs. There's too much detail to capture.
20:37:58 <korvo> Or maybe intelligence (whatever it may be) is inherently emergent from composition and a hybrid engine is always going to have a higher capacity for benchmark performance.
20:38:20 <ais523> the situation with top Go engines is even more interesting – they play much better than the top humans in almost every respect, but there is a counterstrategy using a series of terrible moves that the engines have serious trouble beating
20:38:28 <korvo> But, like, that's part of what makes the lesson so Bitter; time spent wondering about this is time spent not upgrading hardware or OS or libraries.
20:39:33 <ais523> the basic issue is that one important strategic factor in Go is the number of empty intersections next to a connected group of stones, and you beat the engine by trying to engineer it into creating a group of stones that's approximately annulus-shaped
20:40:12 <ais523> the developers have theorised that their neural networks are really bad at counting the number of intersections next to an annulus because it doesn't have a clear end to count from
20:40:37 <ais523> (and even adding this sort of group to the training data hasn't helped much)
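The quantity the engines reportedly miscount — empty intersections adjacent to a connected group — is exactly computable by flood fill, annulus or not. The board representation below is a hypothetical toy, not any engine's internals:

```python
# Count the liberties (adjacent empty intersections) of the connected
# group containing `start`, by flood-filling same-coloured stones.
# Board: list of strings, '.' = empty, any other char = a stone colour.
def liberties(board, start):
    rows, cols = len(board), len(board[0])
    colour = board[start[0]][start[1]]
    seen, libs, stack = {start}, set(), [start]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                if board[nr][nc] == '.':
                    libs.add((nr, nc))
                elif board[nr][nc] == colour and (nr, nc) not in seen:
                    seen.add((nr, nc))
                    stack.append((nr, nc))
    return len(libs)
```

The contrast is the point: a thirty-line exact procedure for a quantity that the pattern-matching neural network apparently cannot learn to count reliably on unusual shapes.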
20:44:11 <korvo> Maybe it just takes longer than we think. Like, up until last year, we weren't even using the correct projections for how long gradient descent takes to get good, and that's something that was proven to be good (and biologically analogous to the Hebbian principle) a long time ago.
20:44:27 <korvo> Or maybe it takes a bigger network than we think. Like, humans aren't perfect at Go either.
20:45:45 <esolangs> [[8ial]] M https://esolangs.org/w/index.php?diff=170913&oldid=169517 * Ractangle * (+30) /* Syntax */
20:45:46 <ais523> (to clarify, because I wasn't explicit: the errors made by the Go engines in that situation are clearly related to miscounting the intersections)
20:46:45 <korvo> Oh! Very surprising and curious.
20:50:21 <b_jonas> ais523: so what happened to that? did the devs manage to improve the baduk engines to at least counter that one strategy? how did the arms race continue?
20:51:03 <b_jonas> it's not clear to me if there'd be just a very few specific strategies found and after that the engine becomes unbeatable by humans, or if there'd be new strategies found for countering each engine for a long time
21:01:59 <ais523> b_jonas: so I don't have up-to-the-minute information on this, but the last time I checked, they added a lot of patterns using that strategy to the training data, and the humans had fought back by modifying the inside of the annulus slightly so that it didn't match the patterns in the training data any more
21:02:16 <ais523> I assume that could very easily get into an arms race but don't know the result
21:20:44 <b_jonas> I mean if the problem were really that the neural nets are "bad at counting the number of intersections next to an annulus because it doesn't have a clear end to count from" that sounds like the devs could help that specifically
21:21:50 <ais523> a "this is the number of intersections next to the big annulus-shaped group" input? the problem is that big annulus-shaped groups don't normally arise naturally, so it's unclear what should be done in cases where there are none of them, or two
21:22:00 <ais523> and you would have to somehow define them objectively
21:22:28 <ais523> otherwise neither your training nor your inference will be able to supply the inputs properly
21:24:25 -!- ais523 has quit (Quit: quit).
22:00:33 -!- impomatic has quit (Quit: Client closed).