NoMeansNo
No means No - how to keep testing failures meaningful in CI. Slow tests. Flaky tests. Tests that change.
Katrina Edgar
Devs writing unit tests. Testers writing integration tests. Devs make multiple check-ins per day.
Difficult to get integration tests meaningful to testers.
2 hour test suite: bad code, copy and paste. Moved setup steps out of Selenium, got down to half an hour.
Fragile – needed to change sleeps to waits.
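A minimal sketch of the "sleeps to waits" change, assuming the tests can express readiness as a function that returns a truthy value. The helper name wait_until and its parameters are illustrative, not taken from the session. Selenium's own Python bindings offer WebDriverWait for the same purpose.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Before: time.sleep(5) and hope the page has finished loading.
# After:  wait_until(page_is_loaded, timeout=5), where page_is_loaded is a
#         hypothetical check supplied by the test.
```

The wait returns as soon as the condition holds, so the suite is faster on a good day and only pays the full timeout when something is actually wrong.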
Because testers had ownership, tests would break and testers would have to fix them: a constant stream of red builds. Devs stopped paying attention to red builds.
Integration tests were only helpful to the tester, and results ended up being interpreted the same way as in a manual environment.
Loop of death - different reasons for tests to go red each time. Green light once a week, celebrated with coffee, acknowledged as a problem (“squirrel dance” at TIM).
Andrew – same problem but worse, not part of deployment pipeline
5 devs, 2 testers – needs more testers than usual
Martin – integration tests owned by devs, testing team tests from external interfaces. Large amount of manual testing. Embedded radio systems, automated environments with hand-held units, tests for call quality etc.
Same problem with Jenkins turning red and slow feedback time.
Looking to run a smoke test on each build, then the rest on a less frequent or nightly build.
Painful to install stuff onto the device.
Dev check-in – 30 minutes for CI tests to complete, some tests only nightly.
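One way to split a suite into a per-build smoke run and a nightly run is to label each test with a tier and let the CI job pick the tier. The sketch below uses Python's standard unittest module. The tier decorator, the suite_for helper and the test names are all hypothetical, and the test bodies are placeholders.

```python
import unittest


def tier(name):
    """Hypothetical decorator: label a test method as 'smoke' or 'nightly'."""
    def mark(func):
        func.tier = name
        return func
    return mark


class RadioTests(unittest.TestCase):
    @tier("smoke")
    def test_unit_powers_on(self):
        self.assertTrue(True)  # placeholder for a fast check

    @tier("nightly")
    def test_call_quality(self):
        self.assertTrue(True)  # placeholder for a slow, device-bound check


def suite_for(tier_name, case=RadioTests):
    """Build a suite containing only the tests labelled with `tier_name`."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for method_name in loader.getTestCaseNames(case):
        method = getattr(case, method_name)
        if getattr(method, "tier", None) == tier_name:
            suite.addTest(case(method_name))
    return suite
```

The per-check-in job would run suite_for("smoke") and the nightly job suite_for("nightly"), which keeps the 30 minute feedback loop from growing as slow tests are added.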
Julian – canary tests for risky areas
How to tackle continual red builds? Potential for a “hearts and minds” approach. Katrina tried gold star charts, which were a good motivation for devs. Also got devs writing the tests.
Ward – failures are personal, need to make it fun, if you're not writing bugs, you're not writing code
HTML tests are always going to be flaky – “because browsers suck”. Write tests at the service layer (subcutaneous testing). Devs already had responsibility for the quick tests.
Would you throw away GUI tests? They tend to be repetitive, need to refactor down.
False negatives - Claim plugin – assigned “cake points”, if you didn't claim you had to bring in cake. Plus cafe bonuses
Jenkins game plugin – useful to start people getting interested in it, but could lead to bad behaviour (e.g. checking in meaningless tests to get points).
Stop the line on broken builds.
Make a developer responsible for checking the build and doing triage of failures – can make you feel crap to always have to go back to the same person.
What was better – picking on one person or stopping the whole team?
Reverting check-ins. Validated merge plugin. Git plugin – merge to branch on successful build. Gerrit.
Source code management
10 teams checking in on branches then merging to trunk. Teams have to wait when trunk is broken.
Visibility of breakage: build radiator, USB tower of LEDs that showed breakages.
Build radiator also showed message of the day, jokes etc to act as central source of information
Have to slow down before you speed up. Not doing CI if you're not stopping when it breaks
Look at definition of done criteria – can't claim points until it's green
CD is powerful – can't deploy until working
ATDD – check-in of incomplete features, use Pending/Expected to fail flag on acceptance tests while developing the feature and checking in successful unit tests
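A minimal sketch of the "expected to fail" flag, using the expectedFailure decorator from Python's standard unittest module. The checkout_total function, the discount rule and the test names are hypothetical stand-ins for a feature that is still in progress.

```python
import unittest


def checkout_total(prices):
    """Hypothetical feature under development: bulk discount not implemented yet."""
    return sum(prices)


class CheckoutUnitTests(unittest.TestCase):
    def test_total_sums_prices(self):
        # Unit test for the part that already works; must stay green.
        self.assertEqual(checkout_total([2, 3]), 5)


class CheckoutAcceptanceTests(unittest.TestCase):
    @unittest.expectedFailure
    def test_bulk_discount_applied(self):
        # Pending acceptance test: fails until the discount is implemented,
        # but is reported as an expected failure instead of a red build.
        self.assertEqual(checkout_total([10] * 10), 90)


def run_suite():
    """Run both test classes and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(CheckoutUnitTests))
    suite.addTests(loader.loadTestsFromTestCase(CheckoutAcceptanceTests))
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Once the feature is finished the flagged test starts passing, and unittest reports it as an unexpected success, which counts as an unsuccessful run. That is the prompt to remove the flag.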
Pushback from devs on creating and running integration tests. Breaks “flow”.
Acceptance tests shouldn't fail if sufficient testing at a lower level.
Test on own machine – can pass, then still fail on build server due to environmental issues
By time of failure, multiple commits have been picked up so hard to ascertain blame. Potential changes – slowing down commits, concurrent builds, spin-up multiple environments in the cloud to run tests that require environment
Devs unwilling to run tests, needs extra environmental setup
Jan - 30 minutes integration test time. Devs run tests over lunch or at end of day. Commit every 6-8 working hours. Daphne – use Git for tiny commits.
Delete integration tests that never fail.
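Candidates for deletion could be found from past build results. The sketch below assumes the history is available as a list of runs, each a dict mapping test name to a pass/fail boolean. That format and the function name never_failed are assumptions, not something described in the session.

```python
def never_failed(history):
    """Return sorted names of tests that passed in every run in which they appear.

    `history` is a list of runs; each run maps test name -> True (pass) or False (fail).
    """
    seen = set()
    failed = set()
    for run in history:
        for name, passed in run.items():
            seen.add(name)
            if not passed:
                failed.add(name)
    return sorted(seen - failed)


# Hypothetical history of two builds.
builds = [
    {"login": True, "checkout": False},
    {"login": True, "checkout": True, "search": True},
]
candidates = never_failed(builds)
```

A short history is weak evidence, so the result is a list to review, not a list to delete blindly.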
Devs saw the tests as someone else's code, tester owned. Having the tester pair with a developer might have resulted in shared ownership. Co-location helps.
Make it fun – devs will stick around longer and do extra stuff.
Team ownership of broken builds.
Silos can cause friction – e.g. different reporting lines for devs and testers, turf war. Needs buy-in from management, and focus on better working relationships.
Silo books -
- Silos, Politics and Turf Wars: A leadership fable about destroying the barriers that turn colleagues into competitors, Lencioni
- Bust the silos, Hunter Hastings and Jeff Saperstein
- Silo Busting, conference workshop by Tom Perry and Lourdes Vidueira
- The Robbers Cave Experiment, Sherif
Coding dojos – get team working together on shared goal that's not production code, can use CI approach and ensure CI principles are followed
In a large org, meeting 2-3 times a week across a scrum of scrums helps everyone understand what is being committed and leads to fewer broken builds