Tuesday, July 28, 2009

Making Big Problems Smaller

Software engineers, like all other mortals, operate in a continuum between ignorance and certainty. Good engineers are aware of this, and are careful about what they claim to know for sure. Hypotheses are tested by designing experiments that yield results.

...

As a first example, let's say that nearly every day around lunch-time, your web site slows to a crawl. With only that information, there is a lot you don't know. Good engineers react by first identifying what they do know, then asking questions about what else they think they need to know and can find out. Were there any alerts from any of the affected systems? What were network traffic, CPU usage, and disk usage on the system when the problems started happening? What about in the few minutes before? Are there any system errors or messages? What do the logs say?

This is a big problem because it is poorly understood. But notice that just asking "What happened?" doesn't tell you anything. You need to ask questions with answers that provide new information. Such questions transform the initial monolithic, intractable problem into a series of smaller, tractable problems. And each smaller problem in turn yields new results which help you build a hypothesis explaining the big problem.
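To make "What do the logs say?" concrete, here is a minimal sketch of one such smaller question: how many errors were logged per minute around lunch-time? The log format, the sample lines, and the name `errors_per_minute` are all made up for illustration; a real system's logs will look different.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log lines. The format (ISO timestamp, level, message)
# is an assumption for this sketch, not any real system's output.
SAMPLE_LOG = """\
2009-07-28T11:58:03 INFO request served in 120ms
2009-07-28T12:01:15 ERROR db connection pool exhausted
2009-07-28T12:01:47 ERROR db connection pool exhausted
2009-07-28T12:02:10 WARN request served in 4200ms
2009-07-28T12:02:31 ERROR db connection pool exhausted
2009-07-28T12:40:09 INFO request served in 130ms
"""


def errors_per_minute(log_text, start, end):
    """Count ERROR lines per minute between start and end (inclusive)."""
    counts = Counter()
    for line in log_text.splitlines():
        stamp, level, _message = line.split(" ", 2)
        when = datetime.fromisoformat(stamp)
        if level == "ERROR" and start <= when <= end:
            counts[when.strftime("%H:%M")] += 1
    return counts


# Narrow the question to the window where the slowdown is reported.
window_start = datetime(2009, 7, 28, 11, 45)
window_end = datetime(2009, 7, 28, 12, 30)
lunch_errors = errors_per_minute(SAMPLE_LOG, window_start, window_end)
```

The answer is small, but it is new information: either errors cluster in the window, which suggests where to look next, or they don't, which rules something out.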

In the easy case, one or more of these initial questions reveals a smoking gun. But even in the less easy case, where our first investigations don't point to a cause, we already know about a bunch of things that didn't cause the problem. With luck, there are some well-understood additional questions we can investigate. But even if there are not, we can step back and ask different ones. "How can we find out more from the system when this happens next time?" "Can we reproduce this somewhere? Once? Repeatably?" Hard problems may take longer to solve, but there are always more questions you can ask.

...

Designing software systems benefits from a different style of problem decomposition. At the start of the process you do not know everything about how a system should be built. But you know something. You likely know, for example, how users will access your program. In a browser? Using a dedicated program on the iPhone? As you work on the system, you continually confront all that you don't know -- after all, you are building what doesn't yet exist. This can cause anxiety, or worse (in some cases, actual conniption fits), but it doesn't have to.

Let's imagine a web site that lets customers check the weather in their area. A first stab at a model of this system might look like this:

[Figure: a three-part diagram of the weather site, including a data cylinder labeled "Weather Data"]


This picture is ridiculously simplistic, so much so that it may seem useless. Yet it still meets both of our decomposition criteria. It breaks up the monolithic initial problem into three chunks, and it yields results you can act on, in this case some questions you can begin to investigate. For example, how will the system know a user's location to show them weather results in their area? What are some data elements about weather it will need to store? Hey, what about data describing users? The current model labels the data cylinder "Weather Data" -- looks like you need to update the model.
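One way to start answering those questions is to write the model down as code. The sketch below is only a guess at a first pass: the names `User`, `WeatherReport`, and `WeatherStore` are hypothetical, and using a postal code to locate a user is an assumption, not a decision the picture makes for you.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class User:
    name: str
    postal_code: str  # assumed answer to "how do we know a user's location?"


@dataclass
class WeatherReport:
    postal_code: str
    temperature_f: float
    conditions: str


class WeatherStore:
    """In-memory stand-in for the data cylinder in the picture."""

    def __init__(self) -> None:
        self._reports: Dict[str, WeatherReport] = {}

    def save(self, report: WeatherReport) -> None:
        # Keep only the latest report for each postal code.
        self._reports[report.postal_code] = report

    def report_for(self, user: User) -> Optional[WeatherReport]:
        # Returns None when nothing is stored for the user's area.
        return self._reports.get(user.postal_code)


store = WeatherStore()
store.save(WeatherReport(postal_code="02139", temperature_f=84.0, conditions="sunny"))
local = store.report_for(User(name="Pat", postal_code="02139"))
```

Even this much raises the next round of questions: where does `User` data live, and what happens when `report_for` comes back empty?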

You could ask these questions in a vacuum. But as soon as you have a representation of some part of the system, however simple, you can look at that part of the system in greater detail and, crucially, with a clearer sense of "what that thing is." Guided by representations of the system, by metaphors essentially, you can drill down from more abstract to more concretely detailed and back up again without becoming disoriented. Successive decomposition brings structure to what was amorphous -- a solution literally "takes shape" as you work.

So then in the design domain you have a different continuum. Looking at the code, you have local certainty about a particular part of the application, but you are ignorant of the system as a whole. On the other hand, a picture with boxes and arrows offers a global view, at the price of total uncertainty about local behavior in any part of the system. Software engineers move between these poles as they work -- high abstraction/low detail to low abstraction/high detail, and back again.

...

It should be evident at this point that the only way to go about this business of making software is a little at a time. In the beginning, you only know enough to draw a picture and ask some questions. As you answer the questions, partially and bit by bit, you come to know enough to write some code. Back and forth and so on. More code, more questions, more answers, more code.

The current phrase we use to describe this truism is Agile software development, a method of working which assumes that each project starts out confronting the mother (and father) of all monolithic and unknown problems, "What are we building? And, how are we going to build it?" Agile decomposes big problems into problems small enough to code answers to, all the while generating new problems and questions which are decomposed in turn.
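The shape of that process can be sketched as a simple work queue. This is an illustration of the pattern only, not a description of any particular Agile method; the questions in `SUBPROBLEMS` and the function name `decompose` are invented for the example.

```python
from collections import deque

# Hypothetical breakdown of a big question into smaller ones.
# A problem with no entry here is treated as small enough to act on.
SUBPROBLEMS = {
    "What are we building?": ["Who are the users?", "What do they need first?"],
    "What do they need first?": ["Show today's weather for a location"],
}


def decompose(big_problem, subproblems):
    """Break a problem down until every remaining piece has no known parts."""
    pending = deque([big_problem])
    small_enough = []
    while pending:
        problem = pending.popleft()
        children = subproblems.get(problem)
        if children:
            # Still too big: queue its parts to be examined in turn.
            pending.extend(children)
        else:
            small_enough.append(problem)
    return small_enough


steps = decompose("What are we building?", SUBPROBLEMS)
```

In a real project the table of subproblems is not known up front. It grows as you work, which is exactly why the loop never quite finishes.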

The truth is that "What are we building and how" is never completely answered. The answer evolves over time, so all you ever have is the current best possible answer. This may sound bleak to fans of certainty, but in fact the agile process benefits those paying for and charting the course of a software project. Every time the project pauses to deliver its current best answer, stakeholders have something real to respond to. They might decide to halt the project. Or release all or part of it to users as is. Or request a series of adjustments to make between now and the next delivery of "current best answer." Again the same pattern -- break a big unsolvable problem into a series of smaller, reasonable ones, each of which yields results. Those results in turn lead to decisions about the next set of problems to be solved, questions to ask, steps to take.

...

We've now come to the portion of our show where we ask what these insights mean to those parts of our lives not governed by 0s and 1s. Well, first, notice that productivity systems like Getting Things Done essentially advocate this same simple pattern. Take the big problem of being overloaded and unorganized, and the big question of "What should I be doing now?" Decompose it by creating categories and rules for how to categorize, prioritize, and act on things.

The pattern applies just as well with less formality. Don't know what you need to get done? Make a list. (God did this in response to the question of "How should a person live their life?" You surely do not have any problems to decompose as hard as that one.) You can always start to transform the unknown into a series of more manageable, better understood steps.

Finally, we can learn from Agile. The first insight is that we must accept uncertainty; you will almost never be certain of what to do, but you will still need to decide. Fortunately Agile also suggests a path toward learning to make better decisions. Identify any actions you can take and questions you can ask. Act on what you can and pursue answers to the questions. This uncovers new actions to take and new questions to answer. Repeat. Repeat again.

"Live and learn," my mom used to say. To which I add, learn to live.