We discuss some downsides of testing and TDD: can you do too much testing, and is there a problem with teams valuing tests more than they value the functional code?
David started by saying "to talk about trade-offs, you really have to understand the drawbacks, because if there are no drawbacks there are no trade-offs." He continued by saying that TDD doesn’t force you to do things, but it does nudge you in certain directions. The first issue he wanted to raise was over-testing. It’s often said you shouldn’t write a line of code without a failing test. At first this seems reasonable, but it can lead to over-testing, such as where there are four lines of test code for every line of production code. This means that when you need to change behavior, you have more code to change. Kent has said ‘you aren’t paid to write tests, you just write enough to be confident’ - so he asked whether Kent and I wrote tests before every line of production code.
Kent replied "it depends, and that’s going to be the beginning to all of my answers to any question that’s interesting". With JUnit they were very strict about test-first and were very happy with how it turned out - so he doesn’t think you always get over-testing when you use TDD. Herb Derby came up with the notion of delta coverage - what coverage does this test provide that’s unique? Tests with zero delta coverage should be deleted unless they provide some kind of communication purpose. He said he’d often write a system-y test, write some code to implement it, refactor a bit, and end up throwing away the initial test. Many people freak out at throwing away tests, but you should if they don’t buy you anything. If the same thing is tested multiple ways, that’s coupling, and coupling costs.
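To make the delta coverage idea concrete, here is a minimal Python sketch. It assumes you already have, from whatever coverage tool you use, the set of lines each test executes. The test names, file name, and line numbers are hypothetical, and `delta_coverage` is an illustrative helper rather than part of any library.

```python
from typing import Dict, Set, Tuple

Line = Tuple[str, int]  # (file name, line number)


def delta_coverage(coverage_by_test: Dict[str, Set[Line]]) -> Dict[str, Set[Line]]:
    """For each test, return the lines that no other test covers."""
    deltas: Dict[str, Set[Line]] = {}
    for name, lines in coverage_by_test.items():
        covered_by_others: Set[Line] = set()
        for other_name, other_lines in coverage_by_test.items():
            if other_name != name:
                covered_by_others |= other_lines
        deltas[name] = lines - covered_by_others
    return deltas


# Hypothetical per-test coverage data: one early system-level test
# plus two focused tests written later.
coverage_by_test = {
    "test_system_roundtrip": {("parser.py", 10), ("parser.py", 11), ("parser.py", 14)},
    "test_parse_empty": {("parser.py", 10), ("parser.py", 11), ("parser.py", 12)},
    "test_parse_number": {("parser.py", 10), ("parser.py", 14), ("parser.py", 15)},
}

deltas = delta_coverage(coverage_by_test)

# Tests whose every covered line is also covered by some other test.
zero_delta = sorted(name for name, unique in deltas.items() if not unique)
print(zero_delta)
```

In this made-up data the system-level test is the only one with zero delta coverage, which mirrors the case of throwing away the initial test once finer-grained ones exist. One caveat with this simple calculation: two tests that cover exactly the same lines both show zero delta, so it is safer to remove candidates one at a time and recompute. Line coverage also says nothing about whether a test communicates something useful.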
I said that I’m sure there is over-tested code; indeed, if anyone over-tests it would be ThoughtWorks, since we have a strong testing culture. It’s hard to get the amount just right: sometimes you’ll overshoot and sometimes undershoot. I would expect to overshoot from time to time and it’s not something to worry about unless the overshoot is too large. On the test-every-line-of-code point I ask the question: "if I screw up this line of code is a test going to fail?" I sometimes deliberately comment a line out or reverse a conditional and run the tests to ensure one fails. My other mental test (from Kent) is to only test things that can possibly break. I assume libraries work (unless they are really wonky). I ask whether I can mess up my use of the library and how critical the consequences of the mistake would be.
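The "reverse a conditional and see if a test fails" check can be sketched in a few lines of Python. Everything here is hypothetical: `shipping_cost` stands in for production code, `shipping_cost_mutant` is the same function with its conditional reversed by hand, and `suite_passes` stands in for running the real test suite.

```python
from typing import Callable


def shipping_cost(order_total: float) -> float:
    """Hypothetical production code: free shipping at 50 or more."""
    if order_total >= 50:
        return 0.0
    return 5.0


def shipping_cost_mutant(order_total: float) -> float:
    """The same function with the conditional deliberately reversed."""
    if order_total < 50:
        return 0.0
    return 5.0


def suite_passes(fn: Callable[[float], float]) -> bool:
    """Stand-in for the test suite: True only if every check holds."""
    return fn(80) == 0.0 and fn(20) == 5.0


original_ok = suite_passes(shipping_cost)
mutant_caught = not suite_passes(shipping_cost_mutant)
print(original_ok, mutant_caught)
```

If `mutant_caught` came out false, that would suggest the conditional is not really covered by the tests, whatever a coverage report says. In practice the edit is made directly in the real code and reverted after the test run; the two copies here only keep the sketch self-contained.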
Kent declared that the ratio of lines of test code to lines of production code was a bogus metric. A formative experience for him was watching Christopher Glaeser write a compiler. Glaeser had 4 lines of test code for every line of compiler code - but this is because compilers have lots of coupling. A simpler system would have a much smaller ratio. David said that being able to detect a commented-out line of code implies 100% test coverage. Thinking about what can break is worth exploring: Rails’s declarative statements don’t lead to enough breakage to be worth testing, so he’s comfortable with significantly less than 100% coverage.
I replied that "you don’t have enough tests (or good enough tests) if you can’t confidently change the code," and "the sign of too much is whenever you change the code you think you expend more effort changing the tests than changing the code." You want to be in the Goldilocks zone, but that comes with experience of knowing what mistakes you and your team tend to make and which ones don’t cause a problem. I said I like the "can I comment out a line of code" approach when I’m unsure of my ground. It’s a starting place, but as I work more in an environment I can come up with better heuristics. David felt that this tuning is different for stable product teams than for consulting teams that are handing the code over to an unknown team and thus need more tests. Kent said that it’s good to learn the discipline of test-first; it’s like a 4WD-low gear for tricky parts of development.
David introduced the next issue: many people used to think that documentation was more important than code. Now he’s concerned that people think tests are more important than functional code. Connected with this is an under-emphasis on the refactor part of the TDD cycle. All this leads to insufficient energy going into refactoring and keeping the code clear. Kent described how he had just gone through an episode where he threw away some production code, but kept the tests and reimplemented it. He really likes that approach as the tests tell him if the new code is working. This leads to an interesting question: would you rather throw away the code and keep the tests or vice-versa? In different situations you’d answer that question differently.
I said I’d found situations where reading the tests helped me understand what the code was doing. I didn’t think one was more important than the other - the whole point is the double check, where a mismatch between the two signals an error. I agreed with David that I’d sometimes sensed teams making the bad move of putting more energy into the testing environment than into supporting the user; tests should be a means to an end. I find I get a dopamine shot when I clarify code, but my biggest thrill is when I have to add a feature, think it will be tricky, but it turns out to be easy. That happens due to clean code, but there is a distance between cleaning the code and getting the dopamine shot. Kent showed a metaphor for this from Jeff Eastman that is too tricky to describe in text. He got his rush from big design simplifications. He feels that it’s easy to explain the value of a new test working, but hard to state the value of cleaning the design.
David said we often focus on things we can quantify, but you can’t reduce design quality to a number - so people prioritize things that are low on the list like test speed, coverage, and ratios. These things are honey traps, and we need to be aware of their siren calls. Cucumber really gets his goat - it’s a glorification of a testing environment rather than production code, only useful in the largely imaginary sweet spot of writing tests with non-technical stakeholders. It used to be important to sell TDD, but now that it’s conquered all, we need to explore its drawbacks. I disagreed that TDD was dominant, having heard of many places where it’s yet to gain traction.