trapped in the technologist factory

The first time I was paid to use Clojure was only two years after the language was released. Needless to say, it was at an early-stage startup.

The idea behind Runa was that we’d do dynamic pricing for smaller online retailers, offering targeted discounts based on individual shopper behavior. In practice, we offered a fixed discount to anyone who abandoned their shopping cart. By treating a small percentage of customers as a control group, we could calculate our effect on the retailer’s sales, and take a cut. Our biggest problem was convincing retailers to let us control their prices, closely followed by convincing them the math behind our calculated “lift” was sound.

We told ourselves a big contract was always just about to close, permanently increasing our traffic by an order of magnitude. We used HBase as our primary data store, because it was designed to scale, just like us.¹ In the future, we would power a loosely federated Amazon, composed of independent online retailers, knit together by our software. If we built it, they would come.

But they didn’t. The only thing that kept us afloat was a bespoke product we had built for eBay, because our CEO knew an executive there. Later, after I left, that same executive moved to Staples and convinced them to acquire the startup outright. Nothing we had built was useful to Staples; it was just evidence of our ability to “innovate”. The resulting “skunk works” team has since been disbanded.

My next job was at Factual, a more established startup that had built its product using Java and embedded JavaScript, and was beginning to adopt Clojure. Their product was an index of all the places in the world, generated by applying hundreds of different heuristics to web crawl data. Each entry had a canonical id and various properties like name, address, and hours. More interestingly, it also had the equivalent ids from different platforms like Yelp and Foursquare, and we offered an API that could resolve an incomplete (or even partially incorrect) set of properties to the corresponding entity in our index.

Our product could stitch together thousands of disparate datasets, so long as they had some geographic component; I liked to describe it as a “foreign key for the world”. Unfortunately, there’s not much money in ontologies. By the time I had joined, the dataset had already been used as the basis for Facebook Places and as part of Apple’s then-new map offering. These were large contracts, but there are only so many big tech firms in the world, and even fewer with a geographic component to their data. No matter how much we tried to make up the difference with our smaller customers, we didn’t have a path to a billion dollar valuation.

We began to offer services powered by our data. Internally, this was presented as less a pivot than a demonstration of the intrinsic value of our data; once people saw what we could do with our data, surely they’d have some ideas of their own. Our first customers were adtech firms, hoping to do better geographic targeting for mobile ads. We hired salespeople and marketing executives with a background in digital advertising, and when sales started to take off we hired more. Even as I built services to ingest billions of daily ad impressions and bespoke almost-databases to power on-premise services for our adtech partners, I told myself it was all incidental; our business was data, not ads.

I was wrong, of course. The advertising side of the company was where all the growth was happening, and our product direction was whatever helped them close deals. But I wasn’t the only person who was slow to acknowledge this fact. In my exit interview with the CEO, just after I had given notice, I told him that he should be more direct with everyone that Factual’s focus had permanently shifted. He told me what I had only just stopped telling myself: ads were a temporary diversion, and we’d return to the underlying data before too long. In the years that followed, however, Factual became increasingly focused on location-based advertising, until it finally merged with Foursquare.

When a startup takes venture funding, its singular goal becomes growth; it will either reach a billion dollar valuation or die trying. With that growth comes ever-greater technical challenges. Some of these are obvious: more users will not only require more machines, but a more complex infrastructure.² Others are less obvious: as the userbase grows, each individual becomes increasingly abstracted in the business logic of our software. To deal with the first challenge, we must have technical acumen. To deal with the second, we must have a deep understanding of our business domain.

This means that as long as a startup continues to grow, its engineers can (and must) continue to improve their knowledge and acumen. But most startups fail. Some of those failures are because the company couldn’t keep up with its own growth, but these are the exception. The common, boring reason for failure is that the startup barely grew at all.

Confronted with stagnation, an engineering organization can have two reactions: they can try to help reinvent the company, or they can continue to prepare for the growth that must, inevitably, come. At Runa, we chose the latter.

At first, the engineering leadership adopted promising, poorly-understood technologies (Clojure! HBase!) which would give them the necessary leverage down the line. But once they launched, and the stagnation set in, that preparation became a need for constant readiness. Any early technical mistakes, however regrettable, were baked in. Trying to revisit them, to build something better, would just leave us flat-footed when the growth finally came.

I joined Runa only a few months before the engineering organization entered its defensive crouch. I was too inexperienced and too distracted³ to question it. Every day we played Jenga, trying to jostle the existing codebase as little as possible with each new commit. Later, as I realized I wasn’t proud of any of the work I had done so far, I began to energetically push for rebuilding key pieces of the system.

After a few months, I was told to either drop it or take three weeks severance. I took the severance.

Of course, even if I had gotten my way, nothing would have changed. All of us were carefully ignoring the actual problem, which is that our business model didn’t really work. Our product and pricing model both required unjustifiable levels of trust from our prospective customers, but none of us saw that as our problem to solve. We were downstream of the business model; our job was simply to wait and prepare for its eventual success.

I joined Factual because I found the problem space fascinating, but most of it had been solved by the time I joined. About six months later, the leadership began to acknowledge that our business model, while successful, wasn’t going to get us to a billion dollar valuation. So began my descent into adtech.

I wanted to prove that I had learned my lesson; if the current business model wasn’t working, I’d help them build a new one. I built services to consume the raw data, and on-premise servers to provide the processed result to our customers. I attended sales meetings, wrote design docs, and painstakingly analyzed packet dumps to prove to partners that latency spikes were occurring elsewhere in their system.

In the end, I learned that I didn’t like the advertising industry very much. When a brand wanted to run a digital ad campaign, they went to an agency, which was responsible for spending their budget to the greatest effect. The agency then went to a demand-side platform (DSP), which was responsible for making real-time bids on individual ad impressions on an exchange. To inform those decisions, the DSP went to a company like Factual, which could distill a phone’s current and past locations into binary features relevant to the campaign.

Crucially, when you’re not selling something online, it’s almost impossible to measure the efficacy of a digital ad campaign.⁴ This means that the agency could say almost anything, so long as it closed the deal. The DSP, in turn, sold whatever they thought would help agencies close their deals. And at the end of this chain was Factual, selling whatever we thought would entice the salespeople enticing the salespeople enticing the brands.

I thought I had become product-focused, ready to solve any problem that would help the company succeed. In truth, I was only really paying attention to the technical problems. Once they were largely solved, it became harder to ignore my ambivalence about what the product actually did. Eventually, I left.

I learned two lessons from this time in my life. The first was personal: I am, at heart, a technologist. I like to generalize, to abstract. While I believe it’s crucial to understand the context around your software, I’m happiest when that context is other people’s software.

The second lesson was broader, and less obvious:⁵ our industry is designed to foster people like me.

This was surprising because it seems so clearly against our own interest. In almost every case, companies fail because they build the wrong thing. Unless your customers are themselves engineers, I’m the wrong person to help with that. You want someone comfortable at the periphery of your system, who wants to learn about the competitive landscape, who wants to talk to customers. You want a product engineer.

But consider our standard interview questions: data structures, recursion, and computational complexity. It always feels a little strange when I see someone arguing that these don’t reflect “real” day-to-day software tasks; I write recursive functions all the time. But that’s a consequence of the abstraction; what might be a simple nested lookup on any specific datatype becomes recursion when you try to generalize over a set of possible datatypes.
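As a small illustration of that last point, here is a minimal Python sketch. The function names (`city_of`, `find_key`) and the sample data are hypothetical, invented for this example. The first function is a fixed lookup on one known shape; the second has to recurse because it makes no assumption about how mappings and sequences are nested.

```python
from collections.abc import Mapping, Sequence


def city_of(order):
    """Specific datatype: the shape is known, so the lookup is a fixed chain."""
    return order["customer"]["address"]["city"]


def find_key(data, key):
    """General case: yield every value stored under `key`, at any depth.

    The nesting is not known in advance, so the function calls itself on
    each mapping value and each sequence item it encounters.
    """
    if isinstance(data, Mapping):
        for k, v in data.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(data, Sequence) and not isinstance(data, (str, bytes)):
        for item in data:
            yield from find_key(item, key)


# Hypothetical sample data, for illustration only.
order = {"customer": {"address": {"city": "Oakland"}}}
nested = {"orders": [order, {"shipping": [{"city": "Reno"}]}]}

print(city_of(order))
print(list(find_key(nested, "city")))
```

The specific version breaks as soon as the shape changes; the general version pays for its flexibility with recursion.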

Likewise, I’ve seen it argued that the difference between, say, O(log N) and O(N) isn’t important, because in practice N tends to be small enough. That may be true for some domains, maybe even most of them, but if you’re building a general-purpose tool you have to focus on the pathological cases.

Of course, others will argue this first-year CS material is the essence of what we do. This was certainly the attitude when I was at Google;⁶ since their existing employees were going to any length to avoid writing JavaScript, they decided that anyone with frontend experience would be held to a lower technical standard. These hires were widely viewed as second-class; people who were only allowed in because they were willing to do the work no one else wanted.

But if we want to hire product engineers, what questions should we ask instead? It’s impractical, given the sprawling scope of our industry, to only consider candidates with prior experience in our exact product space. Likewise, it’s unrealistic to expect that we’d have expertise in a candidate’s prior product spaces. We lack a common vocabulary, a common understanding of the nuances that separate good design choices from middling ones.

And so we continue to search for our keys under the streetlamp; they could be anywhere, but this is where the light is. If we’re lucky, the technologists we hire will also have all the other skills necessary to make humane, useful software.

This means that even if you’re not a technologist, you have to learn how to pretend.⁷ And this brings us back to the beginning, when it was obvious that my first Clojure job, only two years after the language was released, would have been at an early-stage startup.

Most new technologies, if they get adopted at all, follow a familiar path: first come the hobbyists, and then the companies which employ those hobbyists. This second step is easiest when the company has little to no process around adopting a new technology; in other words, an early-stage startup. The standard narrative around this phenomenon is that certain technologies are simply more powerful, but only startups are daring enough to hire from the niche pool of talent that knows how to wield them.

But new technologies⁸ don’t have power; for that they’d need a community, documentation, and a thriving ecosystem of ancillary technology. What they have is potential, which resonates with the potential within the startup and the early adopter; perhaps they can all, over time, grow together.

As many have pointed out, this is not a rational strategy for building a company. It is, however, a phenomenal way to train new technologists. Chesterton notwithstanding, the fastest way to learn why a fence exists is to tear it down and see what happens.

Seen from this perspective, it doesn’t seem so irrational; most startups fail before reaching a scale that has any existential technical risks. The only guaranteed benefit they can offer their engineers is the freedom to invent their own challenges, and learn through iterative failure.⁹ If the startup fails, there’s no harm done. If the startup gets traction, the engineers can apply their newfound wisdom.

This means startups don’t adopt new technologies despite their immaturity; they adopt them because of that immaturity. This drives a constant churn of novelty and obsolescence, which amplifies the importance of a technologist’s skillset, which drives startups to adopt new technologies.

This flywheel has been spinning for a long time, and won’t stop simply because I’ve pointed out that we’re conflating novelty with technological advancement. Hopefully we can slow it down, though, because I believe it’s causing real harm.

By introducing abstraction into every problem we solve, we distance ourselves from how our work is ultimately used. We tell ourselves we’re in the business of building sharp knives; if we made them safer, they’d be useless for everything except spreading butter. We float above the effects of what we’ve created, treating them as inexorable consequences of progress.

It’s true we can’t encode our values into general-purpose software,¹⁰ but we’re not simply atomized technologists, and our worlds are not bounded by the interfaces we expose. We share a collective responsibility for what we create, and are capable of collectively acting on that responsibility.

But what does a belief in collective responsibility mean, in practical terms? What actions does it entail? Honestly, I don’t know. All I know is that we can’t stay under the streetlamp forever. At some point, we’ll have to see what’s out there in the dark.

¹ In a twist that will surprise no one, the way we used it could never scale. To support ordered scans over keys, HBase partitions on contiguous ranges in the keyspace. Since we wanted to scan over the events for a given day, all of our keys began with a timestamp. This meant every write for a given time period was focused on a single shard, which would drive HBase to constantly repartition its shards in a futile attempt to equally distribute the load. Our HBase cluster was always on the cusp of falling over.
² Even in the exceedingly rare case where your core services can horizontally scale, processing the increasing volumes of data they generate will introduce complexity elsewhere in your system.
³ My first year at Runa coincided with a bout of mostly-unrelated depression, and the one-foot-in-front-of-the-other approach to software development perfectly mirrored how I was approaching pretty much everything else in my life.
⁴ The only possible exception to this is Google. If you have an Android phone, Google can attribute your visit to a Walmart store to an ad for Walmart they showed you some number of days earlier. If you use Google Pay, they can even attribute your purchases. They may not be able to prove a causal link, but it’s still lightyears beyond what anyone else has to offer.
⁵ It’s difficult to reason about structural forces in tech, and especially in startups, because of the pervading myth that we are all protagonists within the company’s narrative arc. If something failed, it’s only because we didn’t work hard enough, or didn’t have a better idea. Even now, years later, thinking about this period in my career brings a twinge of guilt; couldn’t I have changed how the story ended?
⁶ This was back in 2010, but I’d be surprised if anything has substantially changed.
⁷ I know of a few ways to skip this gauntlet, but none of them have universal appeal. If someone at an early-stage startup vouches for you, they’ll often skip straight to making sure you’re a “culture fit”. This only works, however, for a candidate with a strong network, who wants to work for a small startup, and who fits the culture in question. Likewise, you can simply found your own company, but being a founder places you first and foremost at the interface between business and investor; the interface between product and user is always further down the list.
⁸ And the niche technologies adopted by startups are, almost without exception, new. Graham’s Lisp evangelism didn’t lead to widespread adoption of Common Lisp, it led to the adoption of Clojure, a brand new Lisp which was still full of potential.
⁹ Given our current situation, telling junior engineers to only use boring technologies can actually be portrayed as the older generations of engineers pulling up the ladder. They got to learn through failure, why can’t the new generation do the same?
¹⁰ I’m begging everyone who thinks a software license can delineate when something is or isn’t being used for a bad purpose to read a single book on ethics.
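
The hotspotting described in the first footnote, where timestamp-prefixed keys on a range-partitioned store send every write to one shard, can be illustrated with a toy simulation. This is only a sketch under stated assumptions: it models range partitioning with a sorted list of split points and does not use HBase or its API. The split points, the date, the key layouts, and all function names are hypothetical. The hash-prefixed layout is shown purely for contrast; it would give up the ordered per-day scan the footnote says we wanted, and nothing here implies it is what Runa did or should have done.

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical split points: each region owns one contiguous range of the
# keyspace. This is a toy model of range partitioning, not HBase's API.
SPLITS = ["4", "8", "c"]  # four regions


def region_for(key: str) -> int:
    """Return the index of the region whose key range contains `key`."""
    return bisect.bisect_right(SPLITS, key)


def timestamp_key(day: str, event_id: int) -> str:
    """Key that begins with a timestamp, so one day's events scan in order."""
    return f"{day}|{event_id:08d}"


def hashed_key(day: str, event_id: int) -> str:
    """Contrast case (assumption): key that begins with a hash of the event id."""
    digest = hashlib.md5(str(event_id).encode()).hexdigest()
    return f"{digest[:4]}|{day}|{event_id:08d}"


def write_load(make_key, day: str, n_events: int) -> Counter:
    """Count how many writes each region receives for one day of events."""
    return Counter(region_for(make_key(day, i)) for i in range(n_events))


# Hypothetical day and event count, for illustration only.
by_timestamp = write_load(timestamp_key, "20100315", 1000)
by_hash = write_load(hashed_key, "20100315", 1000)
print("timestamp-prefixed:", dict(by_timestamp))
print("hash-prefixed:", dict(by_hash))
```

With timestamp-prefixed keys, every write for the day falls into a single region no matter how many regions exist, which is the imbalance the footnote describes.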

2021