  • Don’t hate AI

    “People hated books when they first came out. They hated horseless carriages. So any new disruptive technology will take a little while for society to digest it, also to evolve.” (Matt Mullenweg, 2026) I’ve seen people hate a lot of things when they came out, like credit cards, crypto, JavaScript private elements syntax, or even

    Read more

  • Prefix compression of thousands of similar strings

    In the Mojibake library I use the data from the Unicode standard’s UnicodeData.txt file. That file is automatically generated and full of similar strings. The aim of Mojibake is to be as small as possible, and the name column of the unicode_data table is by far the largest piece of data in my SQLite database.

    Read more
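
The excerpt does not say which scheme the post settles on, so the following is only a minimal sketch of one common approach, front coding: sort the strings, then store each one as the number of leading characters it shares with its predecessor plus the remaining suffix. The function names `front_encode` and `front_decode` are hypothetical, and the three sample names are ordinary Unicode character names chosen for illustration, not data taken from the post.

```python
from os.path import commonprefix  # character-wise longest common prefix


def front_encode(names):
    """Front-code a sorted list as (shared_prefix_length, suffix) pairs."""
    encoded = []
    previous = ""
    for name in names:
        shared = len(commonprefix([previous, name]))
        encoded.append((shared, name[shared:]))
        previous = name
    return encoded


def front_decode(encoded):
    """Rebuild the original list from (shared_prefix_length, suffix) pairs."""
    names = []
    previous = ""
    for shared, suffix in encoded:
        name = previous[:shared] + suffix
        names.append(name)
        previous = name
    return names


# Hypothetical sample input, already sorted.
names = [
    "LATIN CAPITAL LETTER E WITH ACUTE",
    "LATIN CAPITAL LETTER E WITH GRAVE",
    "LATIN SMALL LETTER E WITH ACUTE",
]
encoded = front_encode(names)
decoded = front_decode(encoded)
```

With this input the second name is stored as just `(28, "GRAVE")`. The cost of the scheme is that reading one entry means decoding forward from the last entry that was stored in full.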

  • A new BibLaTeX WordPress block

    I found that BibLaTeX nowadays supports URLs, Unicode characters, and all the modern stuff. I’ve always been a great LaTeX fan, since my university days. My book had a very complex, millimeter-precise pagination. So, why not create a WordPress plugin that generates a BibLaTeX entry of the current post? But, even funnier,

    Read more

  • When a software project needs to stop

    Some years ago, I was at one of those shops where you can buy useless, but funny, objects that you can forget in one of your desk drawers. There was one handmade out of wood, one of those pretty but creepy ones with movable parts, and that 99% of the time has been positioned to flip

    Read more

  • It’s Not Wrong that (for HN) “🤦🏼‍♂️”.length == 36

    Hey! Unintentional clickbait! I am not talking about how a space character has length 36 in Hacker News! If you are coming here from HN, the above 🤦🏼‍♂️ emoji has been replaced with a space! Edit: Success! HN has renamed my entry to It’s Not Wrong that (for HN) “[facepalm emoji]”.length == 36. At least

    Read more
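
The excerpt does not explain where the value 36 comes from, so this sketch makes no attempt to reproduce it. It only illustrates the general point behind the title: the "length" of one emoji depends on the unit you count in. JavaScript's `.length` counts UTF-16 code units, while Python's `len` counts code points. The variable names are illustrative.

```python
# The facepalm emoji from the title, spelled out as escapes so the example does
# not depend on how this page's text was encoded:
# FACE PALM, skin tone modifier, ZERO WIDTH JOINER, MALE SIGN, VARIATION SELECTOR-16.
facepalm = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"

code_points = len(facepalm)                            # what Python's len() counts
utf16_units = len(facepalm.encode("utf-16-le")) // 2   # what JavaScript's .length counts
utf8_bytes = len(facepalm.encode("utf-8"))             # what C's strlen() would count

print(code_points, utf16_units, utf8_bytes)  # prints: 5 7 17
```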

  • Unicode, buffering and printf

    One day, during a weekend, I was writing some code for my Mojibake library when I saw a strange output in my CLI for the U+10C0 codepoint, chosen at random. Uh? The Unicode Georgian block (MJB_BLOCK_GEORGIAN) has ID 36, and not 0. Even stranger, the name of the block was correct. Just the ID was

    Read more

  • Shrink SQLite amalgamation

    In Mojibake, the low-level Unicode library I am writing, the first rule is: be small. I am using the SQLite amalgamation to index the hundreds of thousands of codepoints/characters/etc. These are the numbers on SQLite 3.50.2, standard amalgamation. Once built with a very basic run of clang: We can do better because in Mojibake I

    Read more

  • Obscure projects and AI

    I like parts of development that are not very trendy nowadays. Whatever LLM I use makes a lot of very basic mistakes, produces slow code, and generally makes code that makes me smile. I had time for myself for the last two days, and I asked Cursor to count the commas in a string, in

    Read more

  • Perl, my old friend

    I was a Perl programmer many years ago. I remember having a program called good.pl, as in “good morning,” that at login launched my terminal, my text editor, SVN, the browser, etc. The 80-character line was a big deal, and with Perl, it was easy; that’s one of the reasons I really liked it. I

    Read more

  • The WhyUPLUSC388 curse

    In Italy we have something that I call the WhyUPLUSC388 curse, AKA the Alt+0200 curse, AKA “how do I write PERCHÉ”. That LATIN CAPITAL LETTER E WITH GRAVE and LATIN CAPITAL LETTER E WITH ACUTE, together with a few other accented characters and the euro symbol, are the only characters of the Latin-1 encoding used all

    Read more
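
As a small check on the two characters named above, the sketch below prints their code points and encoded bytes. Reading "C388" in the curse's name as the UTF-8 bytes of È, and "0200" as that letter's decimal code in Windows-1252, is my assumption; the excerpt does not spell it out.

```python
import unicodedata

e_grave = "\u00C8"  # È
e_acute = "\u00C9"  # É, the letter that PERCHÉ actually needs

for ch in (e_grave, e_acute):
    print(
        unicodedata.name(ch),
        f"U+{ord(ch):04X}",                  # code point
        ord(ch),                             # decimal value
        ch.encode("utf-8").hex().upper(),    # UTF-8 bytes
        ch.encode("cp1252").hex().upper(),   # Windows-1252 byte
    )

grave_utf8 = e_grave.encode("utf-8").hex().upper()
acute_utf8 = e_acute.encode("utf-8").hex().upper()
```

Under that reading, Alt+0200 types È (decimal 200, UTF-8 bytes C3 88), which is one code point away from the É that the word needs.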