Summary
David started by saying "to talk about
trade-offs, you really have to understand the drawbacks, because
if there are no drawbacks there are no trade-offs." He continued by saying that TDD doesn’t force you to do
things, but it does nudge you in certain directions.
The first issue he wanted to raise was over-testing. It’s often said you shouldn’t
write a line of code without a failing test. At first this seems
reasonable, but it can lead to over-testing, such as where there
are four lines of test code for every line of production code.
This means that when you need to change behavior, you have more
code to change.
Kent has said ‘you aren’t paid to write tests,
you just write enough to be confident’ - so David asked whether Kent and
I write tests
before every line of production code.
Kent replied "it depends, and that’s going to be the beginning
to all of my answers to any question that’s
interesting". With JUnit they were very strict about test-first and were very
happy with how it turned out - so he doesn’t think you always get
over-testing when you use TDD.
Herb Derby came up with the notion of delta coverage: what
coverage does this test provide that’s unique? Tests with zero
delta coverage should be deleted unless they serve some kind
of communication purpose.
Kent said he’d often write a system-y test, write some code to
implement it, refactor a bit, and end up throwing away the
initial test. Many people freak out at throwing away tests, but
you should if they don’t buy you anything.
If the same thing is tested multiple ways, that’s coupling,
and coupling costs.
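The delta coverage idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the test names and line numbers are hypothetical, and the per-test coverage data is assumed to come from whatever coverage tool is in use. Nothing in the conversation says how Herb Derby or Kent compute it.

```python
from typing import Dict, Set


def delta_coverage(coverage: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """For each test, return the lines that no other test covers."""
    deltas: Dict[str, Set[int]] = {}
    for name, lines in coverage.items():
        covered_by_others: Set[int] = set()
        for other_name, other_lines in coverage.items():
            if other_name != name:
                covered_by_others |= other_lines
        deltas[name] = lines - covered_by_others
    return deltas


# Hypothetical data: test name -> production line numbers it executes.
coverage = {
    "test_system_checkout": {1, 2, 3, 4},
    "test_unit_price": {1, 2, 6},
    "test_unit_tax": {3, 4, 5},
}

deltas = delta_coverage(coverage)

# Tests whose delta is empty are candidates for deletion,
# unless they are kept for their communication value.
zero_delta = sorted(name for name, unique in deltas.items() if not unique)
```

Here the system-level test adds no unique coverage once the two finer-grained tests exist, which mirrors the episode of throwing away the initial system-y test. Two tests can also be redundant only with each other, so it is safer to remove one candidate at a time and recompute.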
I said that I’m sure there is over-tested code;
indeed, if anyone has it, it would be ThoughtWorks, since we have a
strong testing culture. It’s hard to get the amount just
right: sometimes you’ll overshoot and sometimes undershoot. I
would expect to overshoot from time to time, and it’s not
something to worry about unless the overshoot is too large.
On the test-every-line-of-code point I
ask the question: "if I screw up this line of code is a test
going to fail?" I sometimes deliberately comment a line out
or reverse a conditional and run the tests to ensure one
fails. My other mental test (from Kent) is to only test things that can
possibly break. I assume libraries work (unless they are really wonky). I ask
whether I can mess up my use of the library and how critical the
consequences of the mistake would be.
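The "reverse a conditional and see whether a test fails" check can be sketched as follows. The function `shipping_fee`, its threshold, and the tiny suite are hypothetical, made up purely for illustration. In practice the change would be made by hand in the real code and then reverted.

```python
from typing import Callable


def shipping_fee(total: float) -> float:
    """Hypothetical production code: orders of 50 or more ship free."""
    if total >= 50:
        return 0.0
    return 5.0


def shipping_fee_mutant(total: float) -> float:
    """The same code with the conditional deliberately reversed."""
    if total < 50:
        return 0.0
    return 5.0


def suite_passes(fee: Callable[[float], float]) -> bool:
    """Run the (tiny) test suite against an implementation."""
    return fee(100.0) == 0.0 and fee(10.0) == 5.0


# The suite should pass on the real code...
original_ok = suite_passes(shipping_fee)
# ...and at least one check should fail once the line is broken.
mutant_caught = not suite_passes(shipping_fee_mutant)
```

If `mutant_caught` were false, the broken line would be going unnoticed by the tests, which is the signal that a test is missing for something that can break.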
Kent declared that the ratio of lines of test code to lines of
production code was a bogus metric. A formative experience for him was watching Christopher Glaeser
write a compiler. He had 4 lines of test code for every line
of compiler code - but this is because compilers have lots of
coupling. A simpler system would have a much smaller
ratio.
David said that being able to detect any commented-out line of code
implies 100% test coverage. Thinking about what can break is
worth exploring: Rails’s declarative statements don’t lead to
enough breakage to be worth testing, so he’s comfortable with
significantly less than 100% coverage.
I replied that "you don’t have enough tests (or good enough
tests) if you can’t confidently change the code," and "the
sign of too much is whenever you change the code you think you
expend more effort changing the tests than changing the
code." You want to be in the Goldilocks zone, but that comes
with experience of knowing what mistakes you and your team tend to make and
which ones don’t cause a problem.
I said I like the "can I comment out a line of code" approach
when I’m unsure of my ground; it’s a starting place, but as I
work more in an environment I can come up with better heuristics.
David felt that this tuning differs between stable product teams
and consulting teams that are handing the code over
to an unknown team and thus need more tests.
Kent said that it’s good to learn the discipline of
test-first; it’s like a 4WD-low gear for tricky parts of
development.
David introduced the next issue: many people used to think
that documentation was more important than code. Now he’s
concerned that people think tests are more important than
functional code. Connected with this is an under-emphasis on
the refactor part of the TDD cycle. All this leads to
insufficient energy going into refactoring and keeping the code
clear.
Kent described how he had just gone through an episode where he
threw away some production code, kept the
tests, and reimplemented the code. He really likes that approach as
the tests tell him if the new code is working. This leads to
an interesting question: would you rather throw away the code
and keep the tests or vice-versa? In different situations you’d
answer that question differently.
I said I’d found situations where reading the tests helped me
understand what the code was doing. I didn’t think one was
more important than the other - the whole point is the double
check, where a mismatch between them signals an error.
I agreed with David that I’d sometimes sensed teams making the
bad move of putting
more energy into the testing environment than into supporting
the user; tests should be a means to an end.
I find I get a dopamine shot when I clarify code, but my biggest
thrill is when I have to add a feature, think it will be tricky,
but it turns out to be easy. That happens due to clean code, but
there is a distance between cleaning the code and getting the
dopamine shot.
Kent showed a metaphor for this from Jeff Eastman that is too
tricky to describe in text. He got his rush from big design
simplifications.
He feels that it’s easy to explain the value of a new test
working, but hard to state the value of cleaning the design.
David said we often focus on things we can quantify, but you
can’t reduce design quality to a number - so people prioritize things
that are low on the list, like test speed, coverage,
and ratios. These things are honey traps, and we need to be
aware of their siren calls.
Cucumber really gets his goat: it glorifies a testing
environment rather than production code, and is only useful in the
largely imaginary sweet spot of writing tests with
non-technical stakeholders.
It used to be important to sell TDD, but now that
it’s conquered all, we need to explore its drawbacks.
I disagreed that TDD was dominant; I hear of many places where
it has yet to gain traction.