Slowing down hurts: ROSL85

ROSL85 — Reflections on startup life, week 85

There’s no end of things that keep me awake at night, but one current observation is that our release speed is slowing down. I think there are a few reasons:

Code quality and performance

When you start, you build features at break-neck pace with little concern for testing and performance. Getting it out there is “good enough”: who knows whether anyone wants to use it? If you never launch it, you’ll never know.

Then as you acquire users, some of these things come back to bite you. The cost of speed is that you sometimes make mistakes you have to go back and fix later (like with our recent performance issues).

Because we’re growing and we’re just too dumb to quit, we know we’re going to be around for a while to come. So now we want to avoid making these mistakes again in the future, which means we trade off time now against pain later and invest more time in tests, test coverage and code quality. The rule of thumb: you touch it, you improve it.

As I tweeted the other day, code refactoring feels a little bit like Inception: the deeper you go, the slower time moves.

User Experience

As we get more experience with Trunk.ly, we’ve reached the conclusion that our user experience is very poor at the moment. We don’t really need more features. What we need is to make the features we DO have more discoverable and easier for people to access. The challenge here is that implementing a functional feature quickly is generally easy, while taking that same feature and delivering an excellent user experience is much harder and far more time-consuming (yet it’s becoming more critical).

We’re here for the long haul

Stubborn, stupid, determined, call it what you will: we will be here this time next year and beyond. In addition to the code debt problem described above, where we want to spend time now to avoid hurting ourselves later, we also spend a little bit more time on self-management (i.e. taking time off on weekends). You just can’t keep up the break-neck pace every single day without huge personal and family cost. So the longer we’re in the game, the more careful we are about making sure we take breaks to avoid burnout.

Release management and “done” inflation

One thing we’ve got to get better at is release management. We tend to want to hold features until they’re “done”. The problem is that, because of all the above, the value of “done” keeps increasing. It’s harder and harder to cross that threshold. Compounding this, we still think too much in a feature-centric way: we (generally) want to release only a working “whole” (which would include the backend through to the frontend etc.). The problem is that this keeps you from releasing. We need to be more disciplined about smaller, more frequent releases.

Production maintenance

As the site matures and the user base grows, we hit more and more production issues that need to be resolved with maintenance. This requires time to resolve, either in “doing it” or “automating it”. Even with automation, as the size of the database and amount of automation grows, time is required to check and ensure it’s all running smoothly.

Lessons learnt

I know for sure that recognising something is the first step to tackling the problem.

Some of it is unavoidable; some will get easier once we have people for whom this is a core skill. For example, I suspect that one reason we struggle with the user experience components is that it’s not what we are really good at (which is different from whether we can do it).

Some of these we just need to tackle head on. We need to improve our release discipline. We need to ask the hard questions. Has the recent bad experience made us release “gun-shy”? Maybe we need to reduce the emphasis on test coverage for now: we’re better than we were, so do we need to be perfect? I’ve never heard of a startup getting funded because they had a perfect test suite, and no startup I’ve ever heard of failed because their test coverage was not 100%.

As the complexity and scope of the tasks increase, it’s inevitable that, with no change in resources, things get harder and take longer. What do you do to manage this? Where are the logical short-cuts that have to be taken? I’d love to hear your thoughts.