Monday, July 12, 2010

Design Facilitation

We have a problem of implementations that simply don't match our expectations: not so much the low-level detail, but an overarching problem of coherence and poor-quality design.

We've dealt with lower-level design by various means: testing and TDD, steady refactoring, and some good libraries (collections, list processing, dependency management, etc.).

But at a higher level it can be hard to distinguish the business knowledge from the contingencies of implementation. At times it feels like we've been incrementally feeling our way into the product, blindly sneaking forward without any wider view to order our route.

And we're talking a lot now about how to resolve that problem. We've got a number of new hires, and they are very good, so part of the answer is getting them to insist on thorough analysis and their best principles rather than merely going along with the flow. We also want everyone to be more conscious about design, and to align the implementation with the way we talk about the product and the way the users talk about it: basic DDD stuff.

We talk about taking time to design, having some sort of map, rather than just hacking with a machete down the path of least resistance.

Also we have a few guys who really want to improve their design skills, who want to make that a key part of their career. I was talking with one about how to get him involved with these things, and one thing that occurs to me is to have someone involved in design discussions who is more responsible for facilitating the discussion than for trying to develop the design.

Or maybe that's just a bit over the top.

Saturday, May 15, 2010

Kanban, TDD, and Iterative Development

Where I work we make a lot of use of Kanban-style systems. It's a key tool in our process and has helped us manage bottlenecks and make activity more visible.

We also favour Test Driven Development which encourages a very fine scale of iteration, and we get benefits from that.

The red green refactor cycle of TDD is pretty good at tuning interfaces and driving looser coupling between modules. Our Kanban process is pretty good at finding bottlenecks in our delivery process and helping us reallocate resources or use our time differently.

Recently we've had some serious production bugs: relatively new code (six to nine months old) that's being extended with new functionality is causing us grief. Whatever we did in our initial implementation, we were missing the secret sauce; we implemented a design that is now hard to fathom, and work in the area is blowing out well beyond initial estimates.

Our designs don't make ongoing use of the code easy.

We were treating Kanban + TDD as our iterative process and hoped we would get design improvements from it, but we weren't really getting them. At the scale of design, and of iterating over design, neither Kanban nor TDD has given us what we want.

The design bit in TDD is a bit of a lie: it's the hope that if you pay attention to the little stuff then the big stuff will look after itself. Evolution can be a cruel and terribly inefficient process, and there's no guarantee about the future fitness of its outcome. A program emerges from TDD and satisfies the tests, but nothing more is guaranteed.

Moreover, it seems to me that Kanban works against constructive iteration. Of course a larger feature may be broken down as a series of tasks on a Kanban board, and that is an iteration of sorts, but the motive for task breakdown in Kanban is managing and coordinating resources; creating opportunities to revise earlier decisions is not really what Kanban is for.

Fundamentally, Kanban is monolithic within its model of a single task. If any iteration is visible within the task cycle you see tokens being pushed back, and any way you look at that it's negative: it's regression, it's a bug, it's something being rejected. If I take iteration to mean go back and repeat, then from a Kanban perspective iteration within the cycle is a sign of problems.

I feel we're missing something between TDD and Kanban, or parallel to them, that promotes giving attention to higher-level code quality issues: quality design and integration, and implementations that align with domain models. In particular we're missing anything that promotes iteration over higher-level design decisions. I'm reluctant to just propose adding more baroque details to TDD or Kanban, because any success had with those elaborated versions is likely to be confused with the basic practice, and when people are weary or don't understand the elaborations they'll slip back to the default practice. I look for methods that support me in my weakest moments, rather than methods that require me to be on top of my game in every moment.

Around the basic development activity (which for us is TDD) I'd like to explicitly promote fine-grained analysis and design, review of progress with all sorts of relevant parties, and refactoring that explicitly improves design. Something that encourages multiple iterations for any given task, reinforcing that a single design step is probably not enough; something that highlights the positive cycle of reconsidering design. Something that revisits the integration of units and their interactions, not just interfaces that satisfy the requirements. Something that rewards the developer for going back and having another look.

These concerns are sometimes at a higher level than naive TDD. I'm concerned with how the collection of objects and services fits together, the coherence of the system as a whole, thinking about future ease of work, and a clean mapping between the implementation and our understanding of the business domain. But at times the concern is low level yet invisible to TDD: it's behind the interface. With TDD alone a thing can be tested and correct and yet have an utterly impenetrable implementation from the perspective of future development.

What might the solution look like? I don't know just yet. Perhaps a checklist with multiple columns, where each column is a fine design iteration: start the next column if you have to put a cross in a box you've already ticked, and then treat getting into more columns as a good thing. I'll be trying out something like that and seeing where it takes me.

P.S. I know I've presented a very naive view of TDD, but the name itself suggests a simple interpretation. The name TDD gives no clue that you should look for more subtle and nuanced practices, even though that's what Beck described in his books.

Tuesday, May 4, 2010

Test Driven Development and Code Quality

I've been digging into some hairy code the last couple of weeks. It's new code, so there's a lot of testing around it, but it's complex: partly of necessity, because it covers a hard part of our business, but it seems harder than it needs to be. One of the things I've noticed is that no matter how good or bad the production code is, the tests are better.

Tests are an extra layer over the production code, each sketching out a single scenario, so they are both more abstract and more specific. Any given test isn't trying to generalise all the scenarios and behaviours of the code being tested; in that sense tests are more specific. They are also at least one extra layer of abstraction above the production code.

It may also be important that tests normally form a topmost layer with no distorting pressure from their callers. By way of contrast, consider a web application: the code has a very concrete front end and back end, typically the HTTP interface and a database. Wedged between those things is our poor beleaguered layer of abstracted business logic, all too easily squeezed and distorted by both upstream and downstream pressures.

Production code, the actual program, must encompass all the possible uses; in that sense it must be generalised. I hold that generalisation is next to optimisation as a source of complexity and confusion in code. Any developer looking at a generalised piece of code must assume that it is supposed to be able to do all the things that it does: being more specific reduces the things the developer has to take into account when working, while needless or accidental generalisation forces the developer to think about things that are irrelevant or unnecessary.
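As a made-up illustration of that point (none of these names come from our code), compare a needlessly generalised routine with a specific one; a reader of the first has to consider every operation it could ever be asked to apply, while the second says exactly what it is for:

    import java.util.Map;
    import java.util.function.BiFunction;

    // Generalised: works on any map with any operation, so a reader must assume
    // it is supposed to be able to do all of that.
    class GeneralisedPricing {
        <K, V> void applyToAll(Map<K, V> table, BiFunction<K, V, V> operation) {
            for (Map.Entry<K, V> entry : table.entrySet()) {
                entry.setValue(operation.apply(entry.getKey(), entry.getValue()));
            }
        }
    }

    // Specific: does one named thing to one kind of table, nothing more to consider.
    class SpecificPricing {
        void halveAllPrices(Map<String, Double> pricesByProduct) {
            for (Map.Entry<String, Double> entry : pricesByProduct.entrySet()) {
                entry.setValue(entry.getValue() / 2);
            }
        }
    }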

It seems the pressures on production code drive it to be concrete and general, while clarity seems to be served by being abstract and specific. There are two axes here, and I think we get them mixed up at times.

And we have a model of abstract and specific code sitting right beside the production code: the tests! But it seems that the quality of those tests isn't feeding back into the production code in a way that makes the production code easier to understand or maintain. The good practice of testing is helping to produce correct code, but not necessarily sustainable code.

But we may have some pieces of the puzzle, some seeds of solutions:

  • Wrap general collections so that there is a clearly documented set of operations that communicates the business meaning of interacting with the structure. Isolate and limit misleading generalisation. (There's a sketch of this after the list.)

  • Perhaps we need principles such that a test should never use domain vocabulary that isn't also in the code. If the test wants to interact with something, then the name should be coming from the code; if the name is not there, then the code is missing an abstraction.

  • Similarly, if the test is creating a composite object, then perhaps that represents an abstraction that is missing from the code. And if the test is wrapping an action or a series of steps in a named function then perhaps that represents something missing from the code.

  • Make your implementation match the way you talk about the business: if your intuition or knowledge of the domain tells you something and the code contradicts that expectation, then unless the expectation is actually wrong the implementation is bad, because being arbitrarily different from expectation requires additional intellectual effort to work with. (That last one is an insight from Domain-Driven Design, with its emphasis on a ubiquitous language consistent with the models expressed in code.)
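To make the collection-wrapping point concrete, here's a minimal sketch in Java; OrderBook and Order are hypothetical names invented for the example, not taken from our code:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    final class Order { /* details elided for the sketch */ }

    // The wrapper documents, in business vocabulary, the only operations we mean to support.
    final class OrderBook {
        private final List<Order> pending = new ArrayList<Order>();

        // "place" says more to a reader than a bare list.add(...)
        void place(Order order) {
            pending.add(order);
        }

        // a narrow, named query instead of handing back the whole collection
        boolean hasPendingOrders() {
            return !pending.isEmpty();
        }

        // if callers really must see the orders, expose an unmodifiable view
        List<Order> pendingOrders() {
            return Collections.unmodifiableList(pending);
        }
    }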

Monday, April 26, 2010

Driven Development

There is a trend in the agile community to name development philosophies as "driven", for example TDD, BDD, and DDD: names for principles and methods to help us stay on track and to guide us toward good practices.

I thought it might be worthwhile to name some of the other development styles I’ve seen, to help us stay on track and to guide us away from bad practices, (beware irony ahead):

FFDD Favourite Feature Driven Development: when the determining factor in the code you write is some cool feature of the language you use or a preferred programming technique. Everything is a list to unrepentant lispers. Operator overloading is another favourite. They say everything is a look-up problem at one well known company. Closely related are Favourite Language Driven Development and Favourite Library Driven Development.

FPDD Favourite Pattern Driven Development: a very popular variety of the FFDD family of development methods—because everything is better with Template Methods.

PDD Personality Driven Development: where your design decisions are based on following someone else's notions, not making your own decisions nor taking into account the current circumstances. If the personality in question is outside the team it’s Guru Driven Development and from inside the team it’s Charisma Driven Development.

PoLRDD Path of Least Resistance Driven Development: polaroid development can have many symptoms: doing things because it's the way it's always been done, or wanting to commit changes in someone else's name because you've worked to their priorities. After all, it's their product; they're the owner, client, lead, etc.

DADD Decision Avoidance Driven Development: might be seen as a whole company variety of PoLRDD and is often presented as iterative development, affectionately known by some as Hot Potato Driven Development.

LoCDD Lines of Code Driven Development: and its more subtle variant, proportion of lines of code delivered to production driven development, also known as POLOC DD. The driving principle is that if you don't need it you can delete it later, but every keystroke spent on tests and infrastructure is stolen from production.

SDD Seniority Driven Development: the old guys do new stuff and the new guys do old stuff, which is often organised in new development teams and maintenance teams and accompanied by such practices as “I know this system better than anyone and I'm sure this patch is okay so just pop it into production”, and “it worked on my machine”.

CPDD Cool Puzzle Driven Development: overlapping with both FFDD and PDD, this is the philosophy, in work environments dominated by technophiles, of doing whatever you are best motivated to do. Of course some dull things have to be done, but that's how new hires learn the system.

Thursday, April 22, 2010

Now that's out of my system

The last two posts were essays that I wrote a little while ago to clarify my thoughts, a bit big for blog posts I guess, but maybe they'll be the right thing for someone sometime.

So, more bloggishly, (to keep the spelling checker happy I guess I could say, "in a more blog-like style", but I like "more bloggishly"):

I've been chatting with a friend who's been disappointed by some experiences in a new job. They have a demanding client and a backlog of bugs, the new starts are being thrown at the maintenance problems, things the experienced folk think should be simple fixes, and it's not going well.

Sadly, simple fixes are normally only simple if handled by the longest serving staff. No one else should feel confident that just patching the problem somewhere is going to be fine without good testing or some formal verification.

And, indeed, those experienced old hands may be right: these may be problems needing only simple patches to resolve. But anyone without their experience of the peculiarities of a system and its history, and without the safety net of a thorough testing process, can't make those little fixes. Without experience of a system the only professional thing to do is explore it thoroughly, test as much as possible, and proceed with caution.

I've noticed this before: people with deep experience of a system come to be unconscious of what they know, and assume that things are simple and obvious, (and sometimes get damned obnoxious when it's not so for others).

So if you want to optimise for quality and development throughput the new starts get the problems that can only be solved by thorough exploration, broad testing, and general caution. And the long serving staff get the little maintenance chores.

Heh, in my world they'd still all be collaborating closely and back-filling tests over the legacy functionality when those little maintenance tasks come up. But in my world dividing developers into an underclass of maintenance developers and an overclass of new-feature developers doesn't happen; that's optimising for hubris, a trap I escaped with my mental and physical health in tatters...

Complexity, keep it to yourself.

"Tell don't ask" is a heuristic aimed at reducing a particular kind of coupling complexity: operations by other objects on the exposed state of a target object. "Tell don't ask" solves this by pushing the operation into the target object and hiding the associated state in there as well.

The more objects that can see and act on the states of other objects, the more complex your system becomes: the exposed states of an object are multipliers of complexity. A big part of controlling complexity is limiting the exposed states of objects or, from a different angle, limiting exposure to the states of other objects. The symptom of code that could benefit from "tell don't ask" is often called "feature envy", where one object spends a lot of time looking at another to do its job.
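Here's a tiny before-and-after sketch of that in Java; the Account and billing names are invented for the illustration:

    // Ask style: the caller reads another object's state and makes the decision
    // itself; classic feature envy.
    class AskStyleBilling {
        void charge(Account account, int amountInCents) {
            if (account.getBalanceInCents() >= amountInCents) {
                account.setBalanceInCents(account.getBalanceInCents() - amountInCents);
            }
        }
    }

    class Account {
        private int balanceInCents;
        int getBalanceInCents() { return balanceInCents; }
        void setBalanceInCents(int value) { balanceInCents = value; }
    }

    // Tell style: the decision and the state it needs both move into the target.
    class TellStyleAccount {
        private int balanceInCents;

        void charge(int amountInCents) {
            if (balanceInCents >= amountInCents) {
                balanceInCents -= amountInCents;
            }
        }
    }

    class TellStyleBilling {
        void charge(TellStyleAccount account, int amountInCents) {
            account.charge(amountInCents); // nothing returned, nothing to branch on
        }
    }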

We're all aware of exposed variables as exposed state, but anything you get back through a function return value, a callback, even the things inferred from exceptions, is exposed state. Exposed state is anything about an object that can be used by another object to make a decision.

The most complete "tell don't ask" style would be a void function which throws no exceptions and has no later consequences, giving the caller no possibility of behaving differently in response to the action they've triggered; there's no visible state, no return message, nothing for the caller to act on, and that makes it much easier to reason about the correctness of the caller.

The next step up allows a boolean return value, with the caller able to follow two branches. Returning a number allows many branches, exceptions are also returns that cause branching, and so forth. It's easier to think about just two branches than about many branches. It's much easier to think about no possibility of branching (but beware of downstream consequences: if a void call now causes observably different behaviour later, then that first call is also exposing state).

If changes in the target object's state are visible, then anything with access to that object can change its behaviour in response to the initial operation, multiplying complexity according to who is exposed to the object's state and how much state is visible to act on.

There are two perspectives here: the caller and the target.

When developing a caller object you want to be able to reason about it, to assure yourself that it is correct, and to know that others looking at it later will feel sure it's right. Being exposed to less state in other objects helps keep your object simpler, so you should be trying to expose yourself to the fewest objects possible, and you should want those objects to have simple interfaces that don't allow needlessly complex responses to their operations. If nothing else, hide the state of bad objects behind facades to show how it could be done.
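As a rough sketch of that last suggestion, assuming a hypothetical legacy printer driver with a wide, stateful interface:

    // A hypothetical "bad" object we don't control: exposed state, fiddly protocol.
    class LegacyPrinterDriver {
        int statusCode;                              // exposed state
        void loadJob(String document) { statusCode = 1; }
        void start() { statusCode = 2; }
        int pollStatus() { return statusCode; }
    }

    // The facade offers callers a single void operation and absorbs the driver's
    // state, so callers have nothing to branch on.
    class PrintingFacade {
        private final LegacyPrinterDriver driver = new LegacyPrinterDriver();

        void print(String document) {
            driver.loadJob(document);
            driver.start();
            // status handling stays in here; callers never see statusCode
            assert driver.pollStatus() == 2;
        }
    }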

A developer writing an object that will be called by others should be trying to be a good citizen, trying to make it easy for the callers to be simple. Offering wide-open interfaces with access to complex state forces the callers to at least think about the possibility of responding to that complexity, and that makes their lives harder: general is not simple.

There are lots of other design heuristics, refactorings, rules of thumb, and so forth that lead to reduced complexity through reduced coupling:

  • "Tell don't ask"
  • "Just one way to do something"
  • "Don't expose state"
  • "Reduce the number of messages"
  • "Be shy" or "Defend your borders" or "Limit an object's collaborators"
  • "Be specific" or "Do one thing well"
  • "Wrap collections and simple types"
  • "Reduce the average number of imports"
  • "Generality is generally not simple"

Sometimes you have to open yourself to coupling; after all, programs are really only valuable because they respond to things. But there are ways to reduce the risks. Broadly, isolate coupling in time and prevent coupling by side effect (there's a sketch of a few of these after the list):

  • "Return the least amount of information possible"
  • "Expose less state"
  • "Expose only immutable things"
  • "Complete construction before starting operation"
  • "Defensive copy on get and set"
  • "Fail early"

Thursday, April 8, 2010

Cows, librarians, and the evolution of god-objects

In an object-oriented farm there's an object-oriented cow and object-oriented milk, so, does the object-oriented milk send an un-milk message to the object-oriented cow, or does the object-oriented cow send a de-cow message to the object-oriented milk?

A little vocabulary: in an OO system objects send messages to other objects, a message contains enough information to provoke an action or change.

In a language like Java the built-in message formats are function calls, return values, and exceptions. A simple void function is one message from caller to callee. A function call with a return value is two messages (the return value is a separate message, albeit part of a well defined dialogue). Thrown exceptions constitute a third class of messages. (Also, exposed state is message broadcasting, and to be avoided if you care about controlling complexity.)
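For example, something like this (the Gate class is invented for the illustration):

    class GateStuckException extends Exception {}

    class Gate {
        private boolean closed;
        private boolean stuck;

        // one message: caller to callee, nothing comes back to act on
        void close() { closed = true; }

        // two messages: the call, and the boolean returned to the caller
        boolean isClosed() { return closed; }

        // a third kind of message: an exception travelling back to the caller
        void open() throws GateStuckException {
            if (stuck) {
                throw new GateStuckException();
            }
            closed = false;
        }
    }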

An alternate version of the riddle: in a library do books send messages to shelves saying shelf me, or do shelves send messages to books saying you are now shelved here?

Ah-ha! we say (some might say "mu"): the riddle is broken, there's an object missing from the system. The librarian coordinates the action of putting books on shelves, sending messages to books telling them they are shelved and other messages to shelves telling them they now contain a book.

And of course the farmer fills a similar role on the farm.
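A quick sketch of the librarian as the coordinating object; all the names here are invented for the riddle:

    import java.util.ArrayList;
    import java.util.List;

    class Book {
        private String location = "returns trolley";
        void shelvedAt(String shelfLabel) { location = shelfLabel; }
    }

    class Shelf {
        private final String label;
        private final List<Book> contents = new ArrayList<Book>();
        Shelf(String label) { this.label = label; }
        String label() { return label; }
        void accept(Book book) { contents.add(book); }
    }

    // The librarian sends one message to the shelf and another to the book,
    // so neither needs to know about the other.
    class Librarian {
        void shelve(Book book, Shelf shelf) {
            shelf.accept(book);
            book.shelvedAt(shelf.label());
        }
    }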

So let's do some more stuff on the farm. And let's keep in mind that useful heuristic, "model the real world"; it helps keep us on the straight and narrow in many design problems.

Bringing in the sheep from the top paddock: the farmer opens the gates, the farmer whistles at the dog, etc. Harvesting fruit: the farmer calls for temporary workers, the farmer provides the workers with baskets, the farmer takes the workers to the orchard, the workers fill the baskets, the workers leave the baskets with the farmer, the farmer calls the market, etc. Ordering new seed stock and fertilizer: the farmer makes a list of what's in the shed, the farmer makes a plan for which fields need to be sown, the farmer dials the supply store, etc. Then there's mucking out the stable, tending the ewes at lambing time, etc.

If we follow the "model the real world" heuristic in a domain with a dominant active agent we end up with a lot of complexity in a big class representing that agent, (and we get some weedy little collaborating classes around the edges).

Sometimes we make god-objects because we don't see the world in terms of individually active objects, distribution doesn't come naturally to humans. We see the world as our play-pen filled with toys which are only interesting when we're playing with them, and we model accordingly. Fat-controller-classes correspond to normal human thinking, but they're not good for managing complexity.

Sometimes we're like rental tenants, when the plumbing's broken we call the owner, when there are electrical problems we call the owner, etc, and this owner becomes the god class for household maintenance.

The point is that a heuristic shouldn't be slavishly followed into unmanageable complexity. "Model the real world" can be watered down, and logic can be distributed into other objects, even if that arrangement diverges from real life. We have balancing heuristics to help head off runaway god-making, things like "don't model the user", "don't make manager or controller classes", and the "single responsibility principle".

Sometimes a design heuristic like "always consider another two alternatives" is useful to keep us from "the world revolves around me (and my favourite object)" type thinking.

The goal is to manage complexity, to make stable and maintainable code. We've got the reality of change giving us opportunities to find alternatives that fix problems. Good ideas are good but we should be wary of following good ideas for so long that they become bad ideas.

And sometimes we have to do odd things, like maybe writing cows that tell milk to de-cow itself.

"Mu"