Dear sir, you have built a compiler (2022)

255 points | 132 comments | 4 days ago
quantadev

In software development it's pretty important to know when to build "on top" of something else, and when to start from scratch.

Lots of developers will find it much more interesting, challenging, rewarding and just plain fun to develop something from scratch, even when there are better things that already exist.

They'll cleverly manipulate and convince the boss, against the better discretion of their elder developers, that they can do it, and if they're one of the better developers, the boss won't want to risk losing them so they'll agree to the escapade.

Then said escapade turns into a shambles, as predicted by the elder devs, and the developer who created the mess simply quits and moves to some other job, in search of more fun and greener pastures. Any developer with decades of experience has probably seen this same pattern multiple times.

pjungwir

I've seen this a lot when someone wants to add "workflow automation" or "scripting" to their app. The most success I've had is embedding either Lua or JavaScript (preferably Lua) with objects/functions from the business domain available to the user's script. This is what games do too. I think it's a great way to dodge most of the work. For free you can support flow control, arbitrary boolean expressions, math, etc.

burnt-resistor

<old-guy-high-school-glory-days-and-nobody-today>

Reminds me of the pain of intentionally building a Java 2 (subset) to MIPS compiler by writing out each AST node class by hand. And I did it twice, once in C++03 with bison and flex and again in Java 2 with CUP and JFlex. Each was developed to build and run portably as a host across Solaris (sparc), Linux (x86), HP-UX (68k), SGI (MIPS), and Windows (x86), with the compiled targets run on the SPIM emulator. It did have dead code, dead string, and dead variable elimination, but that was as far as my optimization passes went. I recall the only build tool I used for each was the portable subset of make without GNU extensions.

Speaking of reinventing the wheel, in 1998, I built a flexible almost-framework for a "portable" generic installer using Java 2, JWT (native GUI controls), and JNI on Windows to create a program group and desktop shortcut icon. The hilarious part was shipping a full JRE on a CD. It took forever to load, but the additional time seemed impressive for expensive, niche software, in a way similar to the now fake "loading..." delayed progress bar.

</old-guy-high-school-glory-days-and-nobody-today>

iamthepieman

I get the solution for this and I know what all the terms mean. But I don't understand the problem. Whether it's facetious or hyperbole or whatever, I just don't get who or what circumstances this is addressing.

This is written like a Jeopardy answer. I just don't know what the question is.

Can anyone enlighten me?

DHaldane

It's ok to build a compiler sometimes -- it's just very important to make that choice intentionally

Pedro_Ribeiro

Having recently built 90% of a compiler by mistake, I felt like this post was written specifically about me. Hilarious writing, congrats to the author.

tda

So what do you use to know if you need to build it yourself or if there is already something out there? Not being able to find a tool for the problem does not mean it doesn't exist, just that you haven't found it. Especially when you lack the familiarity with the problem to know the correct keywords.

I find ChatGPT to be of great help to explore the area, find relevant keywords or the name of the research domain. Sometimes you really need to know exactly what you are looking for before you can find the link to that one super helpful GitHub library that solves your problem. Then of course the next step is figuring out if you want to take on the dependency or not...

I have wasted hours searching for an (analytical) inverse kinematics library for robotic arms. There are tons of slow non-analytical libraries out there, and some horrible ones like ikfast, which is effectively a code generator that spits out C that can be compiled with Python bindings. I eventually did find https://github.com/Jmeyer1292/opw_kinematics, which someone ported to Rust (for which it was easy to create Python bindings).

PittleyDunkin

I don't think building compilers is that bad, tbh. It's very difficult to do this without realizing it.

I've written a dozen different programs that might be considered compilers; some very simple, others very complex and whose life continued after I left the organization. Writing a functional compiler that meets the organization's needs where existing tooling doesn't takes discipline and a focus on what you actually want to accomplish. I don't know what "defining a struct inside a loop" might mean, and this strikes me as, very obviously, having no clue what you actually want to build.

Perhaps the issue is not building a compiler but rather the lack of focus to begin with.

praptak

The conclusion is similar to the Greenspun quote

"Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp."

swyx

I wrote a similar piece recently: Oops! you built a database https://news.ycombinator.com/item?id=34941650

direct link https://dx.tips/oops-database

benrutter

This was a fun read! It has a link at the bottom to "if architects had to work like software engineers" which sounds fun, but the link no longer works, and searching doesn't bring anything up.

Anyone here know where I can find it?

neilv

I had this kind of risk in mind when I wrote a server-side "HTML template" feature for Racket.

The template language intentionally only handles static chunks of HTML, escaping of values, and a few safety guards.

Everything else (including the usual template language behavior like iterating over a collection/stream, such as from a database query result) is done with arbitrary normal Racket language, which the template feature's implementation doesn't have to know about nor handle specially.

https://www.neilvandyke.org/racket/html-template/
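For readers who don't know Racket, here is a rough Python analogue of that split. It is only a sketch of the idea, not how the Racket library works: the "template" functions do nothing but emit static HTML and escape values, while filtering and iteration are plain host-language code. The names `render_row`, `render_list`, and the sample `users` data are made up for illustration.

```python
from html import escape

def render_row(user):
    # "Template" part: static HTML chunks plus escaped values, nothing else.
    return f"<li>{escape(user['name'])} ({escape(user['email'])})</li>"

def render_list(users):
    # Iteration and filtering are ordinary Python, not template features.
    rows = "".join(render_row(u) for u in users if u["active"])
    return f"<ul>{rows}</ul>"

# Hypothetical data standing in for a database query result.
users = [
    {"name": "Ada <admin>", "email": "ada@example.com", "active": True},
    {"name": "Bob", "email": "bob@example.com", "active": False},
]

page = render_list(users)
print(page)
```

Because the template layer has no loops or conditionals of its own, there is no mini-language to grow into an accidental compiler.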

More recently (for employability reasons, or under-resourced startup pragmatics), doing Python with Flask, JavaScript with SvelteKit, and Swift with SwiftUI, I still miss the clean simplicity and available power that I had with Scheme/Racket.

taeric

I am not clear on why reaching for an existing compiler's AST would ever be top of list?

Don't get me wrong. I think many language design points should be used more. But starting from scratch makes a ton of sense. Skip the parsing stage and build up supported AST style constructs of your own.

Done simply, this is basically the command pattern. Keep execution separate from declaration and you should be fine?

Sure, you may want a parser for a dedicated serialization language some day. Hard to think you need to start there?

But starting with the full AST of an existing language feels like a terrible idea. In any world.
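One way to picture the "skip the parser, keep execution separate from declaration" approach is the minimal Python sketch below. The node types (`Const`, `Field`, `Add`, `GreaterThan`) and the order-total rule are hypothetical, chosen only to illustrate the shape: the tree is declared directly in host code, and a separate function walks it.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical node types for a tiny rule language (declaration only).
@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class Field:
    name: str

@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class GreaterThan:
    left: "Expr"
    right: "Expr"

Expr = Union[Const, Field, Add, GreaterThan]

def evaluate(expr, record):
    """Execution lives here, apart from the declarations above."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Field):
        return record[expr.name]
    if isinstance(expr, Add):
        return evaluate(expr.left, record) + evaluate(expr.right, record)
    if isinstance(expr, GreaterThan):
        return evaluate(expr.left, record) > evaluate(expr.right, record)
    raise TypeError(f"unknown node: {expr!r}")

# Declaration: build the tree directly in host code, no parser involved.
rule = GreaterThan(Add(Field("subtotal"), Field("shipping")), Const(100))

# Execution: run the same declaration against different records.
result_small = evaluate(rule, {"subtotal": 40, "shipping": 5})
result_large = evaluate(rule, {"subtotal": 120, "shipping": 5})
print(result_small, result_large)
```

If a text syntax is ever needed, a parser can be added later that produces these same nodes, leaving `evaluate` untouched.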

dgfitz

Man, the Yocto framework could do with a read-through of this.

tn1

Many older .NET applications saved programmers from this by providing "C# scripts". The framework includes the compiler and then it's trivial to use the compiled artifact. You can still do it by including the Roslyn libraries. I don't see it as much anymore, or it's some half-baked Python or Lua interface.

vishnugupta

There’s an inside joke at Uber that if you start out building a configuration manager you’ll end up with a full-blown version control system.

layer8

But can it send email?

torginus

I do not understand this rant. If you have the vaguest pretensions of being an actual software engineer, and your file format isn't brain-dead simple, the way to parse it is tokenize -> grammar-based parser -> AST binding phase. ASTs are simple recursive data structures; if you handle them correctly, it doesn't matter if they contain 50 or 5000 nodes or how they nest, as long as the code is correct.

SSA is a nice-ish format for representing program code, but it's not the only choice and may or may not be appropriate for your domain. For example, if your language describes data instead of control flow, imo SSA is a bad choice.

I have done this and if you take care to do things right, you won't need to bother with these hacky corner cases.
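For anyone who hasn't seen the tokenize -> parse -> AST pipeline in miniature, here is a hedged Python sketch for a toy arithmetic grammar. The grammar, the tuple-shaped AST, and every function name are assumptions picked for illustration; a real file format would need its own grammar plus a binding phase.

```python
import operator
import re

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    """Turn the input string into a flat list of token strings."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    """Recursive descent parser producing nested (op, left, right) tuples."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expr(self):  # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            node = (self.take(), node, self.term())
        return node

    def term(self):  # term := atom (('*' | '/') atom)*
        node = self.atom()
        while self.peek() in ("*", "/"):
            node = (self.take(), node, self.atom())
        return node

    def atom(self):  # atom := NUMBER | '(' expr ')'
        token = self.take()
        if token == "(":
            node = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing token {parser.peek()!r}")
    return tree

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(node):
    """Walk the AST recursively; nesting depth needs no special cases."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))

tree = parse("1 + 2 * (3 - 1)")
print(tree, evaluate(tree))
```

The evaluator never inspects how deep the tree is, which is the point being made above: once the structure is recursive, 50 nodes and 5000 nodes go through the same code path.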

brunospars

every config parser is a compiler. if platforms (e.g. programming languages) made run-time plugins easier, we wouldn't even have config files.

Imagine a config file with type checking and control flow. You have it-- it's your programming language. you just need to load the code at runtime, like erlang.
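As a rough illustration of "the config file is just code loaded at runtime", here is a Python sketch using the standard library's `runpy.run_path`. The file name `app_config.py`, the setting names, and the convention of reading only upper-case names are all assumptions for the example. It also executes whatever is in the file, so it only makes sense for config you trust.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical config written in the host language itself.
CONFIG_SOURCE = '''
WORKERS = 4
DEBUG = False
# Control flow and arithmetic come for free from the host language.
TIMEOUT_SECONDS = 30 if DEBUG else 5 * WORKERS
'''

def load_config(path):
    """Run a Python file and return its upper-case names as settings."""
    namespace = runpy.run_path(str(path))
    return {k: v for k, v in namespace.items() if k.isupper()}

# Write the config to a temporary file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "app_config.py"
    config_path.write_text(CONFIG_SOURCE)
    config = load_config(config_path)

print(config)
```

There is no parser or evaluator to maintain here; the trade-off is that the config can do anything the language can.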

teaearlgraycold

I know of someone that did this for a bespoke form definition language to drive onboarding. Tens of thousands of lines, months of delays, and a bus factor of 1 later it was all eventually ripped out and replaced with plain old page templates. When your 10 question onboarding flow has a back-end class named “PredicateEvaluator” something is wrong.

bsenftner

Back in the earlier days of AI, not that early, but the late 80's I was the lead developer for an AI research program being jointly conducted by 3 business professors from MIT, Harvard, and Boston University. We were working on "frame based knowledge representation" - frame of reference based node links between nodes containing something: a number, a word, a sentence, or a "function that combines linked nodes into a new frame of reference".

Long story short, we thought we were making a new type of N-dimensional spreadsheet, but after 3 semesters of work one of the advisors at MIT told us we need to meet his colleague, and that guy informed us we had a working compiler for a hybrid of Lisp and C.

ris

Terraform

akshayshah

I enjoyed the article, but the unintentional Easter egg at the end left me in stitches: the link to “If Architects had to work like Programmers” just 404s, which feels spot on.

fragmede

So at what point does Kubernetes become justified?
