Debugging Exercise: Homebrew, Notmuch, and the Missing Manpages

Recently I’ve been tinkering with the email setup on my MBP. When I installed notmuch, I encountered a bug. Notmuch is a project that sports a majestic Unix beard, so naturally among the forms of documentation they provide are manpages. A quick brew install notmuch gave me a working notmuch, but no manpages. Figuring out why the manpages didn’t install was mildly tricky, so I’m writing it down here in case anyone else (possibly Future Me) has the same problem.¹

When you’re looking at a problem with command-line tools, switching them to verbose mode is always a good place to start. Homebrew normally suppresses the output of installer programs, but its --verbose flag makes that output visible. The average Makefile can produce a lot of output, though, so I used grep to see if there was any low-hanging fruit. There was:

brew install --verbose notmuch | egrep -i 'man.?page'
# => Checking if sphinx is available and supports nroff output... No (so will not install man pages).

I was puzzled: Sphinx is a widely-used tool, my system does have it installed, and it does support nroff output.

This is the point where the problem went from “am I doing the right thing?” to “why did the right thing fail to happen?” When problems come up, be sure to look at the possibility that the failure is your fault. We’ve all made errors, and humility is an important life skill. The Sphinx error told me there was probably a bug in the code involved, rather than in my understanding of them. All of the code involved is freely available (thank you RMS) so I downloaded it and took a look:

git clone git://notmuchmail.org/git/notmuch
cd notmuch
git grep -l 'supports nroff output'
# => configure

Looking for the error message led me to the configure script. It’s part of a fairly complex Makefile infrastructure, but the two-part test it uses to search for Sphinx is easy to reproduce: if command -v sphinx-build > /dev/null && ${python} -m sphinx.writers.manpage > /dev/null 2>&1 ; The first half reproduces with a quick copy and paste: command -v foo is similar to which foo, but in addition to asking whether there’s an executable file foo in $PATH (as which does), it also looks at builtins, shell functions, and aliases. To reproduce the second half, I need the value of ${python}, which an earlier part of the script defines by looking for a Python interpreter under various names. Usually the value will be just python, so I used that.

command -v sphinx-build > /dev/null
echo "$?"
# => 0
python -m sphinx.writers.manpage > /dev/null 2>&1
echo "$?"
# => 0

Running the configure script’s test confirms that yes, I have a Sphinx install that’s capable of generating manpages. The next question is, why is that Sphinx install not visible when the configure script is running during installation? Answering that question is what the site module is best at. It’s imported by default when you run Python, and it’s responsible for “adding all the standard site-specific directories to the module search path,” which in turn is a critical part of what makes the import statement work.

I used find $(brew --cache) -iname 'notmuch*' and brew formula notmuch to find the install source and the install script, then started editing. First, I commented out the sha256 "deadbeef0000" lines in the install script. Homebrew checks the SHA256 hash of sources during a normal install, which is a good and correct security feature that needs to be turned off for this. Then I edited the notmuch configure script in the install source, added a ${python} -m site invocation, saved it, and ran the installation again.

sys.path = [
    '/private/tmp/notmuch-20171027-17288-bpiisc/notmuch-0.25.1',
    '/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python36.zip',
    '/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6',
    '/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload',
    '/usr/local/lib/python3.6/site-packages',
    '/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages',
]
USER_BASE: '/private/tmp/notmuch-20171027-17288-bpiisc/notmuch-0.25.1/.brew_home/Library/Python/3.6' (doesn't exist)
USER_SITE: '/private/tmp/notmuch-20171027-17288-bpiisc/notmuch-0.25.1/.brew_home/Library/Python/3.6/lib/python/site-packages' (doesn't exist)

Success! Comparing this to the same invocation run from my terminal immediately points out a problem, further highlighted by site helpfully adding a little “(doesn’t exist)” note.

sys.path = [
    '~/projects/notmuch',
    '/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python36.zip',
    '/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6',
    '/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload',
    '~/Library/Python/3.6/lib/python/site-packages',
    '/usr/local/lib/python3.6/site-packages',
    '/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages',
]
USER_BASE: '~/Library/Python/3.6' (exists)
USER_SITE: '~/Library/Python/3.6/lib/python/site-packages' (exists)
ENABLE_USER_SITE: True

To confirm that this mismatch is causing a problem, I asked the system where my Sphinx install is.

which -a sphinx-build
# => ~/Library/Python/3.6/bin/sphinx-build
pip3 show sphinx | grep -i 'location'
# => Location: ~/Library/Python/3.6/lib/python/site-packages

This is progress: I have a narrow answer to the “why did the right thing fail to happen?” question. The ~/Library/Python/3.6/lib/python/site-packages path for Sphinx tells me that I installed it via pip install --user sphinx.² The /private/tmp entries in the module search path tell me that during installation, the configure script is sandboxed in a temporary directory and using that directory as $HOME. When invoked from the configure script, Python can only find packages that were installed system-wide, and Sphinx isn’t one of them. I took a quick trip into Homebrew’s source code to look for sandboxing. Because it’s written in Ruby, which makes it very easy to access environment variables like $HOME, it’s very easy to search for idiomatic use of environment variables. A quick cd $(brew --repository) && git grep -E 'ENV\W{2}HOME\W{2}' turned up an old_home = ENV["HOME"] assignment in the stage() function, which does indeed assign a new $HOME during installs.

As gratifying as it is to figure out why something failed, there’s still work to do. There are two main tradeoffs to make after characterizing a problem: specific versus general and workaround versus solution. Among other costs, things closer to the “general” and “solution” poles tend to require more control of the underlying elements and things closer to the “specific” and “workaround” poles tend to not be helpful to other people. With that in mind, here are some ways to address the problem I started with.

Roll my own: I already established that in my regular environment, the configure script can find Sphinx just fine, and brew install prints out the invocation it uses. It’s only a few steps more to compile the manpages myself:
```
cd ~/projects/notmuch
PYTHON=$(which python3) ./configure \
    --prefix=/usr/local/Cellar/notmuch/0.25.1 --with-emacs \
    --emacslispdir=/usr/local/Cellar/notmuch/0.25.1/share/emacs/site-lisp/notmuch \
    --emacsetcdir=/usr/local/Cellar/notmuch/0.25.1/share/emacs/site-lisp/notmuch
make V=1 install-man
brew unlink notmuch
brew link notmuch
```
This is pretty much all the way out the “specific” and “workaround” axes: it isn’t very reproducible and it doesn’t do anything about the underlying issue.
Interactive install: This is a tiny step further towards being reproducible: Homebrew’s sandboxing of installs is turned off if you pass the --interactive flag, so if I used brew install --interactive notmuch, I could run the same installation commands in my normal shell. This still requires doing work by hand, though, so it’s not very appealing.
Do it live: I could install Sphinx as a system-scope package, rather than as a user-scope package. This is a solution that doesn’t require doing things by hand and which might be helpful to others. Unfortunately, it requires messing with system-level packages, which is not something I want to do or recommend that others do.
Save the environment: In addition to importing site, Python uses the $PYTHONPATH environment variable to find modules. If I added $HOME/Library/Python/3.6/lib/python/site-packages, subsequent Python invocations should be able to find packages installed to that directory. I’d like to avoid setting $PYTHONPATH if I can; it’s prone to causing problems. For example, if you have both Python 2 and Python 3 installed, as many developers do, setting $PYTHONPATH will cause both versions of Python to look at the given path for modules. That’s good when you’re actively trying to develop against both versions of Python, but bad when you’re trying to repair the site-packages path.
Eat the $PATH: Another problem with changing $PYTHONPATH is that doing so only makes the Sphinx test halfway pass. As part of its sandboxing, Homebrew also drastically restricts $PATH, leaving the sphinx-build executable unfindable during the installation.³ Homebrew does have an affordance, however, for turning off the $PATH restrictions. You can add env :userpaths to the formula or pass --env=std on the command line. Combining these two approaches gets us to something that approaches being a good workaround:
```
export PYTHON=python3.6 && export PYTHONPATH=$($PYTHON -msite --user-site)
brew install --env=std notmuch
export PYTHON=‘’ && export PYTHONPATH=‘’
```
This isn’t perfect, but it’s got good reproducibility, so it’s what I ended up doing.

At this point I’m not entirely sure whether Homebrew’s behavior here is a bug. I don’t like that it discourages people from installing packages as --user, and it already has the setup_home() function (clumsily) patching the module search path. Needing to perpetrate $PYTHONPATH shenanigans is a bad sign. The superenv approach does make installs much easier and more reproducible, so it’s a very good thing overall, but it could be improved.

What is clearly a bug, though, is an issue in notmuch that I stumbled on while digging through all this. You can set the $PYTHON environment variable to tell the installation where your preferred Python install is. The installer ignores this information when it goes to run Sphinx: it instead takes the first sphinx-build it finds on your $PATH. Similar to the problems with $PYTHONPATH, this can lead to problems when you have both Python 3 Sphinx and Python 2 Sphinx installed. The workaround for this is to use command -v sphinx-build to check which version is first on your $PATH and to use that version. This won’t work indefinitely, but it should work for as long as notmuch can be built with both Python 2 and Python 3.

Per the genre conventions of debugging posts, I’m eliding almost all of the dead ends and unproductive attempts from this and instead writing about how I would have solved the problem if I were staring out the window on a pleasant foggy morning with a tasty cup of coffee beside me and a good night’s sleep behind me.↩
There have already been plenty of posts about this, so I’ll say this very quickly: you should almost never sudo pip install anything; the right way to install in almost all circumstances is pip install --user.↩
It would still be findable if Sphinx had been installed as a system-level module. Notice a theme?↩

How to Make SQLAlchemy Pass Connection Flags to a SQLite Database via File:// URI Query Parameters

August 10th, 2016

(for just the code promised by the title, see this GitHub gist.)

Making software is more work than just sitting down and writing code. For the code you’ve written to matter, you have to make it available to others. Like prose or visual art, there is no “done” with software projects: there is only Good Enough. Unlike prose or visual art, a software project can raise its standard of Good Enough over time by releasing new versions of the code with bug-fixes and new features. This is where the “make it available to others” part starts being very difficult. Active projects, by releasing new versions, always end up in a state of heterogeneity where not all of the project’s users are using the same version of the project. When active projects are incorporated into larger projects, it exacerbates the problem. It’s very common for active projects to advance very far as a standalone project, but to lag very far behind that advancement as part of a larger project. Sometimes this is very difficult for users. But it is not any one person’s fault: it is, overwhelmingly, the emergent result of how projects interact with each other. Today I’m going to show how this process leads to the preëminent database/ORM library in the Python world, SQLAlchemy, being unable to take advantage of a nearly decade-old feature of SQLite, a widely-used database.

Let’s start with SQLite. Like most database systems, SQLite lets you provide connection flags (extra information) when you connect to a database. By sending connection flags (e.g. SQLITE_OPEN_READONLY or SQLITE_OPEN_CREATE) you can have the database itself enforce restrictions on your interactions with it. Being able to send connection flags is very helpful for programmers. Having the database enforce the restrictions that the connection flags signify means you don’t need to write your own enforcement code. It also eliminates the chance of making mistakes while writing enforcement code. You gain productivity because the time that writing and checking enforcement code would take, you can instead spend on writing other code.

SQLite added connection flags in version 3.5.0 (Fall 2007). However, SQLite is a C project, not a Python project. The connection flags are concepts that exist in SQLite’s C code. For them to exist in other languages, those languages (or their ecosystems) must provide a way of interacting with SQLite that permits specifying “please send the following connection flags when you connect to the SQLite database.”

Plenty of other languages already had tools for interacting with SQLite in 2007, based on a function named sqlite3_open(). Because there was already plenty of software using sqlite3_open() and relying on its existing behavior (SQLite’s first public release was Spring 2000), SQLite 3.5.0 also introduced a new function, sqlite3_open_v2(), that understood connection flags. This allowed users to keep using sqlite3_open() until they were ready to change their code to use sqlite3_open_v2(). Once they began using the new function, they’d be able to use the new features. In version 3.7.7 (Summer 2011), SQLite made it easier still to use the new features by teaching both the old and new versions of sqlite3_open() to, with a little coaxing, understand file:// URLs¹ as well as file paths. File paths are specific to a particular operating system or family of them, but file:// URLs are OS-independent. This made life slightly easier in general, but using file:// URLs had a more important benefit as well. Using them made it much easier to send connection flags, because SQLite permitted users to put connection flags in the file:// URL. Effectively, both versions of sqlite3_open() could now understand connection flags. SQLite also added some connection flags that could only be used by embedding them in a file:// URL.

If you were writing Python instead of C, though, you couldn’t count on having access to SQLite’s improvements. Python’s support for SQLite comes from periodically incorporating code from the independent pysqlite project. The sqlite3 module in Python’s standard library (introduced with Python 2.5’s release, Fall 2006) is a lightly modified version of pysqlite. Python 2.7 (Summer 2010) contained version 2.6.0 of pysqlite (Spring 2010). This version remains the core of sqlite3 as of Python 2.7.11 (Winter 2015) and Python 3.5.2 (Summer 2016). There does not yet exist a version of Python where the following code works²:

import sqlite3
sqlite3.connect("file:///path/to/my/database.db")

There are workarounds, but they show how challenging it can be to get new versions of software projects into users' hands. Fundamentally, the Python code above fails because SQLite, in the process of teaching the sqlite3_open() functions to understand file:// URLs, chose to make the new feature opt-in (similar to how they distinguished between sqlite3_open() and sqlite_v2_open). There are three times when you can opt into having SQLite understand file:// URIs: during its compilation, when it launches, and when you call it. The sqlite3 module, in its pysqlite version 2.6.0 incarnation, avails itself of none of them. It also provides no way for users to opt in.³ As an independent project, pysqlite released version 2.8.2 (Spring 2015), which added a way for users to send any connection flags SQLite understands.⁴ This version is not part of Python, however, and is only available for use as a standalone module when using 2.x versions of Python. Early versions of Python 3 were also stuck with the no-URIs behavior. Python 3.4 (Spring 2014) introduced a way to tell sqlite3.connect() that it should treat its input as a URL.⁵ Unlike pysqlite’s improved version, the Python 3.4 change didn’t add a general way to send flags (though it did open up the “send flags as part of a file:// URL” path). Still, by mid-2015, if you were using sqlite3, you had a fairly good chance of being able to use connection flags.

There are a lot of people using SQLite who aren’t using sqlite3, though, at least not directly. Because of how easy it is to create bugs, some of which will be disastrous security holes, and because of how tedious it can be to write raw SQL queries, the overwhelming (and correct) consensus of the Python community is that you should use SQLAlchemy to interact with your database. SQLAlchemy also connects to databases via URLs, but given that its decision to use URLs predates SQLite’s by years (SQLAlchemy version 0.1.0, Winter 2005-06), it should be unsurprising that the two usages clash. SQLAlchemy wants users to identify databases by URLs with the database name as the schema. So the database in our example above would be sqlite:///path/to/my/database.db. SQLAlchemy’s database-URL schemas can have extra information (query parameters) in them, like SQLite’s file:// URLs, which tell it how to connect to the database. The connection isn’t done by SQLAlchemy, though, it’s done by an external library. SQLAlchemy is a layer on top of modules like sqlite3 that understand how to directly interact with databases. Under the hood, SQLAlchemy extracts a file path from the database URL and hands that file path to the underlying database module. This structure, though, eliminates the possibility of asking SQLite to open a file:// URI! It can only send file paths to SQLite, and so the extra capabilities that SQLite activates when it sees the right prompts in a file:// URL cannot be activated through SQLite. SQLAlchemy does try to pass on extra arguments that it finds in the database URL, but it passes those on to underlying database modules like frozen-in-amber-since-2010 sqlite3.⁶ Such extra arguments change the details of sqlite3’s behavior, but do not change the way it tries to connect to SQLite. On older Python versions, pysqlite 2.8.2 or later can be substituted for the built-in sqlite3, but because pysqlite is not available on modern Python versions, this is not a satisfactory solution.

We are in a situation, nearly 10 years after SQLite introduced its connection flags and file:// URLs, where taking advantage of those features from Python code is impossible to accomplish with the tools provided by the latest version of Python’s best database library, running on the latest version of Python itself. It’s important to note that none of this is malfeasance or incompetence on the part of library or language maintainers. Projects like the Python language, SQLAlchemy, and SQLite, prize stability very, very highly. They are infrastructure projects: they want to build durably and to provide durable materials for others to build atop, and they are not wrong in how often they value this above convenience. The power of defaults is very important here, too: although many OSes ship with Python and/or SQLite built in, those projects in turn have their own release cycles and their own stability concerns. The first version of OS X that shipped with a SQLite version able to understand file:// URIs was summer 2012’s OS X 10.8 “Mountain Lion” (Summer 2012). Debian-stable didn’t ship with such a SQLite until midway through wheezy’s patch sequence (2014). Ubuntu picked it up faster, incorporating SQLite 3.7.7 in their Oneiric Ocelot release (Fall 2011). All of these infrastructure projects, reasonably enough, tend to defer building support for any particular thing until they are sure that their users can and want to use it. Frustratingly, they can unintentionally enable each other in delaying support. But there is no archfiend actively obstructing the uptake of new versions, just a collection of overworked engineers trying to build things that won’t fall apart too easily.

Fortunately, individual programmers writing brand-new projects have no old versions to be bound to. We can, by investing a little work, make different decisions about stability than project maintainers. This brings us around to the promise in this post’s title. Python, sqlite3, and SQLAlchemy were all written by clever people with an interest in flexibility. The tools that they’re eager to give us, the defaults, are not the only tools they can give us: there are others. Let’s use those others.

The code below follows a fairly straightforward strategy. Given a file path and some arguments to pass to SQLite, it begins with some basic plausibility checks. It ignores the arguments and uses only the path if the caller has an ancient version of SQLite or wants SQLite’s :memory: database. Otherwise, it turns the file path and the arguments into a file:// URL, then tries to connect to that URL. First it tries in the way that Python 3.4+ allows, with a uri=True parameter. If trying to connect that way is met with “I know no such parameter as uri”, we know we’re on an earlier version of Python. But since we know that SQLite and sqlite3 are available, we use ctypes to reach into the Python/C communication channel that the sqlite3 library has already set up. We prod the variable in C code that tells SQLite we’re opting into our input being treated as a URL, then connect again with our URL. Finally, we create a SQLAlchemy engine. We tell it that we’re connecting to an empty URL⁷, but we also tell it “when you go to connect to that URL, use this object we’re handing you in order to establish that connection.” The object we hand it is the SQLite connection we just established, which SQLAlchemy doesn’t know how to create by itself.

This strategy has some limitations: it definitely won’t work as-is on PyPy or Jython, and it’s superfluous if you know your project will run only on Python 3.4+. However, if you want your project to run on multiple versions of Python and to have access to nice SQLite features on all of them, this function will, I hope, get you to that point. I should also note that I drew inspiration from this GitHub issue and this Gist: the digging is all me, however, as is the unit test. I’m firmly of the opinion that if you tell someone you have code that can solve their problem, the code you provide should be tested.

URIs, if you’re picky.↩
Specifically, you’ll get a sqlite3.OperationalError that signifies a C-level SQLITE_CANTOPEN return code.↩
To be fair, Python can’t (and shouldn’t!) compile SQLite for you.↩
Mostly by switching from sqlite3_open() to sqlite3_open_v2().↩
Also by switching from sqlite3_open() to sqlite3_open_v2().↩
See sqlalchemy.dialects.pysqlite.SQLiteDialect_pysqlite.create_connect_args() for the implementation.↩
Normally this gets you connected to the :memory: DB.↩

☈ An Innovative Business Model

Every so often Pantone blows my mind. Their business model is that they sell colors—not that they sell paint or ink or dye, no, they sell colors! It’s a little unbelievable. Of course if you venture into the world of printing, publishing, and large-scale content creation in general, color is just one of many important details that have to not only be right, but right in a way that you can talk to others sensibly about. So Pantone can be thought of as a standards body like ISO— and like ISO, they aren’t cheap, and they look rather surreal from the outside.

They’re a great lesson in that way, though: sometimes to understand odd things, you have to meet them on their own terms, not on yours. Otherwise you’ll be looking at “Emerald is the color of the year!” and be staring at your monitor mouthing “what the devil does that mean, who decided that, how the heck is a color ‘lush’?” But it makes perfect sense on its own terms. Like many other things in the world, it exists for itself and those who know it, not for strangers.

Of course, you may want to treat my opinion on the matter with skepticism since I’m the kind of person who finds the Emerald Pantone iPhone Case mildly fetching.

Two Traps

Joel Spolsky famously opined that you should look for two things when hiring a programmer: Smart, and Gets Things Done. Now, there are a lot of smart people in the world (with a nod in the direction of the fox/hedgehog debate), and plenty of them spend some time programming. Even though there are also a great many people who think themselves smarter than they are, there is a thriving body of lore on filtering that demographic out. So most of the task of evaluating programmers is about evaluating how and if they Get Things Done.

Evaluating yourself this way is a good skill to have; lately I’ve been trying to build more and tinker more on account of being less-than-fully satisfied with what I see in my self-evaluation. I’ve also noticed two failure modes that smart programmers may fall into (one of which I’m doing my darndest to avoid), running parallel to the symbols-versus-understanding sides of the Chinese Room Argument. The argument, oversimplified, is about whether or not it’s possible to infer that an unseen conversational partner who manipulates symbols correctly, understands the communication.

A self-taught programmer is proficient in manipulating symbols, but is vulnerable to the failure mode of not understanding them, of having a myopia about methods and goals. On the other hand, engineers who’ve been involved in trying to hire from university computer science departments can attest that CS shops produce a certain proportion of people who understand why all of those symbol-manipulation rules are what they are, and who know a great deal about the rules and how they are implemented, but who are curiously unable to actually perform the manipulation of symbols and the latching-together of symbols into structures— no GitHub account, no projects of their own, no open-source contributions.

I’m nothing like the first person to notice these failure modes, but I think that identifying them as such (not as “all self-taught programmers are flaky” or “a CS degree is superfluous”) is helpful. Identifying a problem opens up the possibility of solving it. For me, it was humbling to stumble on the Chinese Room Argument and realize that while I’m good at manipulating the symbols, that is not the same as fluency and I have a lot of work to do ahead of me.

☈ Negative 100 Points

A short post by Eric Gunnerson about designing C#, nearly a decade old now, has stuck with me for a long time.

[the question] implies that we started with an existing language (C++ and Java are the popular choices here), and then started removing features until we got to a point where we liked. That’s not how the language got designed. One of the big reasons we didn’t do this is that it’s really hard to remove complexity when you take a subtractive approach, as removing a feature in one area may not allow you to revisit low-level design decisions, nor will it allow you to remove complexity elsewhere, in places where it support the now-removed feature. We decided on the additive approach instead, and worked hard to keep the complexity down. One way to do that is through the concept of “minus 100 points.” Every feature starts out in the hole by 100 points, which means that it has to have a significant net positive effect on the overall package for it to make it into the language. Some features are okay features for a language to have, they just aren’t quite good enough to make it into the language.

If this sounds familiar, it should: this is another lens on the design philosophy, popularized in the developer community by Apple, that good design requires saying “no.” Saying “no” a lot. I applaud Apple for applying this philosophy so rigorously— but it’s important to remember that they’re not the only people who use it, and their way is not the only way. What you say “no” to, defines you.

The Prisoner of Zend.php

June 18th, 2013

It is surely not news to you that PHP is awful: there is a thriving sub-genre of tech blog posts about how very, very bad PHP is. It should tell you something about what a horrid clown rodeo PHP is that even in the presence of Eevee’s magisterial “PHP: A Fractal Of Bad Design” article, so many of us feel compelled to contribute to the vast body of PHP-criticizing literature anyhow. For my part, even after acknowledging excellent works (written by pals smarter than me) like Fractal Of Bad Design, man-of-mystery Pi’s “PHP is a low-level programming language at the wrong level,” and Watts Martin’s trenchant “PHP is not an acceptable COBOL,” I still think there’s more that needs saying. There are plenty of languages that one may dislike, and there are plenty of warts on any language one does like — and yet, PHP is sui generis in its terribleness.

Before starting in on my own complaints, I’m going to cite a rant from outside the programming world. During Leonard Pierce’s massively acerbic chronicle of hating Billy Joel there is an aside that I’m gonna use to answer the question “why do people hate PHP in a way that people almost never hate JavaScript, C++, or Visual Basic, deeply flawed languages all?”

Just as one can argue that there were better World Series teams than the 1927 New York Yankees, one can argue that various performers have written worse songs than those produced from the depressingly fertile mind of Billy Joel. […] But while there are those who can honestly contend that the ‘27 Bronx Bombers were not the greatest of all World Series teams, no one — not even those who hate the Yankees with a soul-scorching fire, as do all right-thinking humans — can argue that they are not the best baseball franchise ever. The numbers simply speak for themselves. No other team has even remotely come close to topping their total number of world championships. Similarly, no other performer or group has ever had so many horrible songs become so successful on the charts as has Billy Joel. Others have been worse; others have been bigger. But no one has been bigger and badder at the same time than Billy Joel.

No one has been bigger and badder at the same time than PHP. That’s why.

To expand lightly on the criteria Fractal uses, a programming language is a tool for thinking about a problem space and for expressing solutions to particular problems in that space. The writeups that I’ve cited do great work on talking about this, but I think there’s a little more that needs to be said. We usually take this for granted, but a tool for task X should, as the very least, most basic requirement, help you accomplish X more often than it hinders you in trying to accomplish X. PHP fails at this. Additionally, software engineering does not happen in a vacuum. Choices we engineers make affect others, including our future selves. Software inherently has a social context, and how it interacts with that context, matters (this is where I think Jeff Atwood deeply misjudged PHP). So here’s what I want to add to the conversation:

PHP is not just a sub-optimal or distasteful tool, it’s a treacherous one
In addition to being treacherous for those writing it, PHP pollutes the commons
Because PHP is a treacherous tool whose use pollutes the commons, it should be torn out and demolished like an unsafe bridge or building

A Tool That Fits No Hand

Part of why “Fractal Of Bad Design” commands attention is the sheer volume of issues with PHP it collects and contextualizes. The gotchas, pitfalls, and boilerplate-chunks in PHP combine to produce an environment where simple, easy-to-read code is often wrong. In turn, this means that code written with diligence and caution in PHP, is harder to give a close reading to. You don’t need to intensely scrutinize code every time you read it, but when you pick up your own code that you haven’t worked with for a while, when you’re reviewing code in a security-focused state of mind, or when you’re deciding whether external changes require altering the code in front of you, giving the code a close reading is extremely important. But when the simple way is often wrong, a close reading is far harder than it should be. You will ask yourself many questions, individually small and not particularly difficult, but enormous in number and potential consequences.

Which specific major, minor, and patch version of PHP was this file written against?
Does this function require a particular ini_set() invocation that could be clobbered elsewhere?
Does this if-block behave correctly when the result of an expression is 0 instead of FALSE?
Does all the code use === instead of ==?
Does this function behave acceptably if one of its variables gets clobbered by a global?
Does this block handle sleep()’s many possible return values correctly?
Does this library perpetrate asynchronous atrocities behind my back?

The burden of dealing with these questions means that PHP does not just make it possible write bad code, but that its quirks actively make it harder to write good code and more likely that you will write bad code. You can write good code in PHP, but the path is a fearful one. Compared to other languages, you will write more lines code to do the same tasks, it’s harder to know or prove that the code you’ve written is good, and the language ecosystem is so burdened with dubious code that good code cannot be quickly brought into projects of any significant age. One of Perl’s design goals is to “make easy things easy and hard things possible.” PHP, as though coming from a mirror universe with a sinister goatee, makes easy things hard and hard things impossible. In a total inversion of good language design, a concise and readable piece of PHP is more likely to have bugs, not less. This is what pushes PHP from “a tool that I have distaste for” to “a tool that is bad” — when I say that it is “treacherous,” I’m talking about this property where simple code is prone not just to being wrong, but to being wrong in a way that tends to fail silently and to fail with extremely dangerous effects.

We have a specific term for “past technical decisions are making it harder to make the right technical decisions in the present”: technical debt. Languages change over time. Production environments often achieve stability specifically by slowing down their update cycle. This much is normal. However, the volume of PHP’s technical debt makes updates much more of a problem for PHP than in the general case. Because something like Python’s virtualenv or Ruby’s rbenv doesn’t exist in the PHP world as of 2013, incremental updates (either of PHP itself or of any library or C module you may happen to be using) are very difficult: the difficulty of using new versions of PHP is dominated by the most out-of-date libraries or language features a project uses. Because of how hard it is to make them incremental, updates are risky: it is extremely difficult to fully understand and accurately predict their effects, especially in judging security and stability issues. One of the ways that PHP fails as a tool is that when improvements in the language or in libraries come along, it makes it hard to take advantage of those improvements.

When the question of PHP’s quality comes up, inevitably someone tries to use Wikipedia, Facebook, and WordPress as examples of PHP’s success. Even if you leave aside how that’s like saying that most American universities are Harvard, it ignores that Wikipedia, Facebook, and WordPress all have significant problems that are directly attributable to their decision to use PHP! If you are not prepared to deal with those problems, then you had better not use PHP. To argue that PHP is a good tool because these large, successful projects have been built with PHP while ignoring that all of these projects had to make extraordinary investments in technical infrastructure, is to advocate that other people waste tremendous quantities of time and money. More precisely, the fact that Wikipedia, Facebook, and WordPress all used PHP is insufficient to demonstrate that you personally should use PHP for anything: you must know how those projects work and what tradeoffs they made in order to to know whether their use of PHP means it’s a good idea to use PHP for your application.

Wikipedia is the easiest example to pick on here, because they provide all the evidence themselves. Go and check out a copy of the MediaWiki source code (I’m going to treat “Wikipedia” and “MediaWiki” as synonymous) and take a look at it. Reflect on how many engineer-hours it took to get the project to that state, and how many more hours are being requested. Reflect on the contents of their “Annoying Large Bugs” and “Annoying Little Bugs” pages. If you want to use Wikipedia as a role model, being blind to Wikipedia’s flaws is a terrible idea.

Because Wikipedia is such a high-profile target (huge PageRank points, huge repository of user-generated content, huge mindshare) there’s a steady record of vulnerabilities with MediaWiki. If you get into the plumbing of Wikipedia, get under the layer that just presents pages to visitors, get familiar with the greasy-handed wiki-gnomes, you’ll find all kinds of interesting infrastructure designed to cope with this. As a social project, Wikipedia is not a bad project: it’s an amazingly good one. It’s a triumph of the cooperative open-source ethos and an incalculably valuable community resource. But as an engineering project, you should be very careful about emulating it. You should make sure that you can invest proportionate engineer-hours into security and maintenance — and that you account for how a PHP-based project needs far more of those hours than other kinds of project.

Speaking of gigantic quantities of engineer-hours, there’s Facebook. Facebook is an even worse choice as an example of PHP’s success, because Facebook has effectively re-built PHP from the ground up. Look at their HipHop PHP project: it’s replacing the default PHP interpreter wholesale and replacing Apache’s mod_php as well. You shouldn’t use Facebook as evidence that your project should use PHP, because the way you use PHP is not like the way that Facebook uses PHP. Facebook ended up writing not just their own PHP toolchain, but their own entire PHP runtime. This is probably not the way you want to go for your project: it’s expensive and optimizes for solving problems that you don’t have.

On top of that, there are ways in which Facebook’s usage of PHP is dubious, or at least suggests that they would rather not be using PHP. Before the current version of HipHop, which is a VM that executes PHP, they were cross-compiling to C++. When “cross-compile to C++” makes your project less painful, that’s a bad sign. This emphasizes the earlier point about technical debt: Facebook at this point is trapped in PHP and making the best of it. They’re up to the point where they’re custom-compiling PHP and doing static-analysis optimization on it — which is to say, they are doing original compsci research, because PHP’s internals are that much of a mess.

Nor is WordPress a good PHP role model. It’s gotten better over time, but the direction of its evolution is away from “blog” and towards “maximalist content management system,” which massively expands the number of things that can go wrong. WordPress has a huge difference from Wikipedia and Facebook: rather than being a giant application hosted and administered by someone else, WordPress is a PHP application that you can download, install, and investigate for yourself. They’ve invested a lot of effort in making that part easy. Unfortunately, “easy PHP” is pretty much always “insecure PHP.” So WordPress has a long track record of nasty vulnerabilities. It also has a well-earned reputation as a tool spammers love. Because it’s a platform that you can set up yourself with no gatekeeper (compare to Movable Type, professionally hosted WordPress installations, or Blogger instances), it’s become the best choice for spammers (who want to programmatically deploy large numbers of WordPress instances). Then there’s the architecture matter: maybe this is just taste, but I find things like rewind_posts() inherently suspect (and there are unproven allegations of grotesque features lurking in the codebase). More substantially, there’s mutable global state lurking all over the place (on top of the distressing action-at-a-distance issues PHP inherently has — see Eevee’s writeup for more about that), the app buys into the “sanitize input” voodoo, and like most PHP apps, it requires a bunch of read-and-write access to its environment that other language ecosystems . Wordpress' engineering problems lead to persistent and near-intractable security problems, and those problems affect more than just the people running WordPress blogs.

The Superfund Site Of Programming Languages

Because of the friction discussed earlier, problems fixed or mitigated in new versions of PHP (tremendous improvements on versions like PHP4) have a very long half-life before they’re no longer found in the wild. Obstacles to upgrading software don’t have to be insurmountable to keep users on old versions, they just have to exist. There’s a big difference between “easy enough that people can do it” and “easy enough that people actually do it,” and PHP is on the wrong side of that difference. The design & usability world has known for a long time that if the right thing and the easy thing are different, your users will almost never do the right thing. PHP’s legacy of technical debt means that maintaining PHP code has far too much friction for maintainers to always do the right thing. I throw the epithet “avatar of technical debt” at PHP sometimes, because this dynamic means that to use PHP at all is to incur a wallop of technical debt. Worse, this technical debt is almost always an externality, a cost that the person writing the code doesn’t have to pay. Instead, the cost is borne by unknown future engineers and users. Beware of externalities! If you are not paying the real, full costs of your decisions, you will be led to make worse decisions. Because PHP fails so hugely at making the right thing easy, it tends to make the wrong thing the default — and the costs of dealing with the wrong thing are all too often externalized, whether that’s from today’s coder to the same person tomorrow, from an engineer to a sysadmin, or from the vendor to the users of a piece of software.

That it’s hard to update PHP projects wouldn’t matter if those projects were only relevant to their creators and users. This is not the case: those projects are relevant to the public good. As programmers, do not create, modify, or use software in isolation. We interact with software in a social context, in a technological context, and in a networked context. Similar to how herd immunity in medicine means that the chance of catching a particular disease is unevenly distributed, software vulnerabilities are dangerous even to people who aren’t running the affected software. The most common thing that an attacker might do with a compromised machine is suborn its resources, using it to propagate further attacks (e.g. having it join a botnet). This is why it matters that PHP is so big and so bad: even if I don’t write any PHP code and don’t operate anything based on PHP (or on MySQL, its co-conspirator in suckitude), PHP is still a severe and frequent problem for me!

In the recent past:

A security researcher finds that of sites vulnerable to password dumps, most are built on PHP.
A remote-code-execution vulnerability in two of the most popular WordPress plugins is discovered — and the subsequent patches have an utterly dismal uptake rate.
There are a multitude of PHP-based server control panels that have deeply disturbing security problems of grave severity.
A search on GitHub reveals a multitude of PHP projects open to a trivial SQL injection attack.
A bug in parsing URLs — surely an action that should be a core competency for a “web language”! — turns out to be implemented in the shoddiest way.

Returning to WordPress in particular, WordPress' popularity exacerbates these security problems: WordPress has become a platform as much as it is an app. Going from app to platform is both difficult in general and difficult particularly in the security context. A WordPress setup’s susceptibility to attack comes not just from problems in code its users write nor just from problems in code that WordPress' creators write, but by those potential problems multiplied by the worst code in any plugin or theme being used. There are a huge number of WordPress themes and plugins, and they can do anything they like. For example, there’s RePress, which staples a web proxy onto the side of your blog for the use of folks in locales where services like Google and Wikipedia are blocked. Whatever one thinks of RePress, it’s only possible for it to exist because WordPress just picks up plugin code and lets it do whatever it asks. WordPress is a particularly acute example because its target audience is non-engineering users. Someone who sets up an instance of MediaWiki, Joomla, or Drupal faces a higher barrier to entry than a WordPress user, who is the beneficiary of vigorous and successful efforts to make WordPress accessible to a wide audience. Unfortunately, that experience of easy-to-install software ends up re-enacting the Windows 9x era: it’s very easy to install things that create opportunities for attackers, and almost impossible to tell ahead of time which things are safe to install. In WordPress' case, some of its most high-profile plugins, like the TimThumb image resizer and the popular caching plugins, have seen remote-code-execution vulnerabilities that can be exploited at scale, by botnets — and which are particularly likely to succeed against users of WordPress whose blogs and their upkeep are not an every-waking-moment concern.

I worked with Magento professionally for a while, and one thing that gave me massive creepy-crawlies about it was that it has the same kind of wild and problematic plugin ecosystem as WordPress, but centered around an app that’s meant to be handling people’s credit-card information. “All the security of WordPress, also people use it to handle money!” does not inspire confidence (though with eBay now running the show, there’s a good chance that Magento will have the budget to shape up security-wise).

If the problems I’ve been talking about only affected the people actually running that software, I’d care far less. It’s important for people have to the right to make their own dang mistakes. But these things don’t happen in a vacuum. Facebook is the ultimate example: a steady trickle of facebook vulnerabilities make their way to light over time, and there are over a billion Facebook users who can be very directly affected by them. Every unpatched MediaWiki install sitting around, every forgotten WordPress instance, every homebrew app quietly chugging away, is susceptible to becoming part of a botnet and worsening the state of the entire Internet. Every machine that gets rooted, is another machine conducting attacks of one kind or another — and even all of my own servers run on an imaginary free-ponies-with-awesome-sparkles-and-no-security-vulnerabilities-ever language, a legion of zombie PHP-running boxes can still just throw denial-of-service attacks my way until it doesn’t matter what I’m running.

This is why it matters that PHP is both big and bad: by being both ubiquitous and insecure, it pollutes the commons. It adds unncessary cost and friction to any project we undertake that’s connected to the Internet — which is to say, to everything. Every server that connects to the Internet has its attack surface artificially enlarged because PHP’s own attack surface is so vast. Programming doesn’t happen in a vacuum, it happens in an ecosystem — an ecosystem that PHP-based systems have a long and terrifying track record of dumping nuclear waste into.

Public Hazard

Its being sub-optimal, distateful to me, or outright poorly designed, wouldn’t remotely justify my spending time and heartache on telling people not to use PHP. Likewise, I don’t think that PHP will make you a worse programmer except in the extremely boring sense that it’ll waste a lot of your time and thus make it harder to rack up the quantities of deliberate-focused-practice time that one needs for mastery, which is absolutely not a sense worth picking a fight over: everyone has a right to their own yak-shaving. There are plenty of people out there being total jackasses “in defense of” PHP, but those people are freely deciding to be jackasses: their social deficiencies are very much separate from their choice of programming language (plus, Ruby is an amazing language and its community has no shortage of tremendous jackasses). As a programmer who cares about craft and tools, I think other languages will reward your time & effort far better than PHP, but if you don’t use those, oh well. I have zero interest in picking a fight over PHP on that basis. As someone who cares about the Internet being safe and functional enough for me to buy music, check my credit card balance, and communicate with my friends, I want you to stop using PHP and replace existing PHP code — like, yesterday — and I think you should be restrained from using PHP for new projects. I’m willing to pick a fight about PHP on the basis of its decade-and-counting track record of design problems that cause security problems that cause “you don’t write or use PHP but this is going to mess up your day anyhow!” problems.

Software can’t be isolated from its social context any more than it can be isolated from its technological context. The social and technological context of modern software is the Internet. With a large enough userbase, any software project is de facto infrastructure (especially if it participates in the Internet). As builders of infrastructure, we have a moral responsibility to not build hazardous, shoddy infrastructure because doing so hurts everyone who uses or depends on that infrastructure, even indirectly. PHP’s track record demonstrates that it is a grossly deficient tool for building infrastructure. When you undertake to build or maintain infrastructure, you take on a responsibility to everyone affected by the quality & functionality of that infrastructure. Choosing to use grossly deficient tools like PHP is irresponsible and unethical for builders of infrastructure, especially if it’s justified in terms of ease or of being able to build a thing swiftly. By definition, infrastructure projects require that you prioritize durability and certainty over ease and swiftness! Nor is there an argument to be had on a “the other tools are also flawed” basis: none of the other tools have PHP’s decade-long track record of massive deficiency, nor do their maintainers have the indifference towards fixing deficiencies that PHP’s maintainers display. It is only by combining its track record of problems with the long reach that those problems have, that PHP crosses the threshold of “should people be restrained from using this tool?” No-one should lower the requirements for that kind of thing: we should be very, very wary of doing so. PHP has met those requirements: there are no other tools in such wide use whose problems are so many, are inflicted on so many people beyond the tool’s users, have gone unfixed for so long, have so few virtues to excuse them, and are the responsibility of maintainers who have done so little to fix them. Nothing short of that should prompt the programming community to say “no, this tool is not okay to use, stop.”

How to attain the elimination of PHP is a question I don’t have a good answer for, and it’s obvious that the programming community as a whole hasn’t yet come up with a good answer. It’s especially important to demand that a scheme for reducing & eliminating PHP not make it more difficult to get into programming. PHP offers “you can just write code and see it work!” and that’s a hugely, hugely important feature for making programming accessible — the problem is that PHP offers this feature at a ruinously high cost and smudges the ink on the metaphorical price tag. I also think it’s going to be very difficult in general: I’ve been comparing PHP to pollution-causing industrial tactics here, but America has not done at all a good job of holding people who cause pollution responsible for its harmful effects.

I look forward to a future where we’ve invested the collective effort in building tools that fit our hands gracefully and that don’t sabotage our efforts to build durable, predictable, world-improving infrastructure. Software has both a social and a technological context: this means that the apparently-social problem of eliminating PHP also is a technical problem. The technical problem is “how do we build something better than PHP?” and the tremendous numbers of beautiful & useful solutions we’ve already come up with for that problem, give me every confidence that we can handle that part. Now let’s work on the social part.

Note: this post was updated in summer 2016.

First Date With Ruby

Last night I took a notion into my head and wound up spending a solid few hours with Ruby. I’m happy with how that went! There’s some first-time-with-a-new-language friction, but nothing out of the ordinary. Here’s what I came up with, and afterwards, why I chose that and what I think it shows that I accomplished that.

This creates a new Liquid tag, {% music %}, which can be inserted in page templates. I added it to my footer.html after the byline, timestamp, and categories. The tag checks whether the post’s YAML front-matter has data for a musician and a track name. If the post has that data, the plugin attempts to create a link to the iTunes Store for the given track. With makeItunesTarget() it puts together a URL that is a query to the iTunes Store Search API, with getFromItunes() it loads the query URL and hands off the response to the standard library’s JSON parser, and with makeAnchorFromItunesData() it takes the first search result and generates text to use for an <a> tag and a URL to use for the tag’s href attribute (if you have an affiliate code for the iTunes store, it’ll be inserted). Finally, there’s a convenience function, getMusic(), that just composes the previous three.

Part of why this worked well is that it’s another project with limited scope: I had a specific objective in mind, so I was able to keep moving gradually towards it. However, that limited scope was a way of making progress towards the broad goal of “learn Ruby” and also took on the medium-scope goal of “learn the iTunes Store Search API.” As a practical matter, learning to work with other people’s APIs, whether they’re libraries, services, or daemons, is an important skill for a working programmer; toy projects that include cultivating that skill are good uses of my time. Learning new languages is also a career-long thing: for all the talk of Lisp being “the hundred-year language,” no-one now working as a programmer will be programming in just one language for the rest of their days. There are shell scripts and libraries and wrappers: there is a fragmented world that despite the friction of fragmentation, would not actually be better-served by a language monoculture. In addition, there are plenty of exciting things out there whose roots are in Ruby, so I was enthusiastic about picking up a smattering of Ruby.

I’m definitely fond of Ruby so far. Part of this is because I’m getting to the point where I’m seeing parallels with other languages and able to make good guesses about how a new language will behave. I was able to guess from reading source “oh okay, Ruby is one of the languages where the return value of a function, if not explicit, is the value of the last statement evaluated in its body,” was pleasantly surprised that it has the same tuple-packing return-multiple-values feature as Python, and noticed “oh hey neat, there’s a Scheme-like function!() naming convention for functions that mutate their parameters.” So that’s all good stuff.

Part of choosing Ruby, too, is that I’m currently blogulating via Octopress, which is built on Ruby. Most of why I chose it is that Wordpress is awful (on the axes I care about), but now that I’ve chosen it, I want to have a grasp of how it works. That means learning Ruby and tinkering—which I’m looking forward to.

As a supplemental note, if this stuff sounds to you like a good attitude for a programmer to have, you should hire me.

☈ Don't Let Survivorship Bias Lie to You

In an interview recently, David Kirtley pointed out that in business school there’s this point made that if you interview rich people who have won the lottery, you might come to believe that playing the lottery is the only way to become rich. I thought that was interesting. One of the things I’m constantly trying to point out is that we’re not doing nearly enough to highlight both median and failure modes, because that’s where the real lessons lie. As for myself, I find message boards where new writers struggle to sell more than a few copies interesting, and where I harvest data about the low end.

Tobias Buckell is writing for writers— but as someone with both a writer-brain and an engineer-brain, I read him as someone talking about startups as well. Looking at the home-run billion-dollar-valuation startups will certainly tell you some things worth knowing, but it won’t and can’t tell you all the things worth knowing. There are also a lot of things worth knowing that you will only find out by becoming a student of failure.

This is part of what calls to me about Lean Startup stuff, about the developing startup-culture communal knowledge about how to best learn from failure and how to rapidly iterate such that you have a lot of relevant but non-terminal failures to learn from. Life is not all success, and one of the things that you have to do to set yourself up for earned success, is to learn from failure.

☈ A Real Thing That Exists

An Afro-Celtic fusion cusine food truck named “ÉireTrea.” Oh, San Francisco. ♥

Stories Mattering

One of the things that a startup needs is a story. Trust me (and Seth Godin), you desperately need a story. The fact that you need a story is a whole genre of blog post, but right now I’m here to share a very short anecdote about it.

Here is how you can tell that stories get into people’s heads: if you ask people, “Are vampires real?” they will answer No. But if you ask those same people “Can vampires can be killed with a wooden stake?” they will answer Yes. That is why stories matter.

(h/t to Fred Clark for the vampire question)

← Older Blog Archives

Strongly Emergent

What comes from combining humans, computers, and narrative