failure demand

I’ve been looking for a concise way to describe:

“teams that ship a feature should be responsible for the long-term success of that feature, including the customer ops caused by stuff missing from the feature”.

I believe our teams should have an area of ownership, and being responsible (and capable, and allowed) to change anything within their area — to make the team go faster.

Failure demand vs. Debt

What I now know, is that we are talking about “failure demand”. Failure demand is demand caused by a failure to do something or do something right for the customer.

Failure demand has a bunch of different names. The common understanding of failure demand is the debt analogy: tech debt, product debt, architectural debt, and sometimes organisational debt.

I like the debt analogy, because you can think of it like economics. I borrow something now to increase what I am capable of (creating leverage), and I pay back more later. This really clearly translates into something we can understand in product and engineering organisations. I ship something smaller than is required now (borrow) and pay back more later (repayment + interest). The borrowing helps me deliver my feature to customers sooner than I would have been able to otherwise (leverage).

The debt analogy is less suited to describing how and when to ‘pay back later’. Debt generally has a ‘term’ attached to it. The term is the time over which the repayments are made.

Debt is modelled as a fixed cost over time which doesn’t accurately represent how our product and engineering orgs experience the effects of what we call ‘debt’.

Value demand and Failure demand

/img/failure-demand-equation.png

A team has a fixed capacity, which can be used on ‘value demand’ or ‘failure demand’. It’s important to call out here; failure is not bad, not learning from failure is bad.

In our teams:

  • Value demand is typically new features.
  • Failure demand is typically improvements to existing features (which we have previously called debt).

Using a debt analogy, with it’s fixed repayment term, is not an effective way of thinking about this problem. The ‘failure demand’ is entirely dependent on the choices we made when we launched the original feature (value).

The example in the image does not express whether customers got any value from our feature, but it does express that customers came back, creating more demand and consuming more of the organisations resources because the feature was ineffective.

The distinction between ‘value’ and ‘failure’ is not important, because both clearly have extra and alternate meaning attached to them beyond this post’s description. What is important is how we manage the two. Clearly we want to ship value, but if we do not address the failure demand and attempt to minimise it, we detract from a team’s capacity to deliver value. The team has a fixed capacity, and any time spent on failure demand is not spent on value demand.

Failure demand is closest to the ‘interest’ component of tech debt and product debt. Failure demand is the component that accrues and increases over time.

Team’s responsibility

As we move to a model where our teams are responsible for product and tech areas, we should also be aware and responsible for the failure demand. In the past we’ve seen failure demand as an ‘expected cost of doing business’, and tried to solve it with manual ops fixes. This is clearly wasteful, has low enduring value, and isn’t great for our customers.

I am not arguing that we ship “perfect” features. Our customers' requirements of our product are on a spectrum. A feature that is ‘perfect’ for one customer, will be lacking for another. What I am arguing for is;

  • Know which features and customers create (or have created) failure demand.
  • Recognise that the more new features we ship that are lacking, the more failure demand we create.
  • Fixing the failure demand reduces the total demand on a team’s capacity, allowing more space for the team to deliver value. (value demand + failure demand = team capacity).
  • The team that creates failure demand should be the team that experiences that demand.

Ship small

Shipping something small now can create more demand on the team’s capacity later, but shipping small and early is exactly what we want to do. It’s important to note that;

  • ‘small feature’ does not have to mean lacking value for customers
  • ‘correct feature’ does not have to mean lots of work for the team
  • ‘small and correct’ are not mutually exclusive

We should be really conscious of the failure demand we create, and the trade-offs we are making when shipping small and early. We should address failure demand quickly when our small and early feature isn’t quite right. We should not accept manual customer ops as the ‘expected cost of doing business’.