It has been my experience that one of the toughest things to manage about the testing phase of development is the creation and maintenance of the schedule. One result of the test->fix->test process is that testers are invariably pulled around the product: test feature a, encounter a bug that prevents forward movement, test feature b, etc. This pattern can, and often does, occur more than once a day, especially early in the phase. Getting pulled back and forth like this can easily derail a project schedule, as people report having worked “about a day” or “until lunch” as the duration. Okay, these are extreme examples, but even “half-day” is pretty vague. Did they work a 10 hour day, or a 6 hour day? Most software shops have embraced the notion of flex-hours, where as long as you get your scheduled commitments completed you can work whenever you want, so there is no way to really, REALLY know how long a task took. This knowledge is critical, as new projects will use the previous project’s actuals as the basis for their estimates.

So Adam, what do you suggest to try and rope in a test schedule? Glad you asked…

  • Measure in hours – Stop thinking in terms of “man days”. Think instead in terms of hours and roll things up into days later. Management is not going to care that task x takes 80 hours; what they want to know is how many days it will take (10, or 2 single-person work weeks). Testers then report their progress against tasks in hours. This is a more precisely defined unit of time, and it allows people to work on different tasks during the day while still providing a very clear measurement. For input into the project plan, just divide the reported hours by the “standard” work day length. Or, if you are reporting via a spreadsheet template, do the calculation right in there. Note: Once you start this, do not start watching the total number of hours a person logs in a week and forcing them to meet a target. Only use that number if things are ridiculously out of whack one way or another. This is similar to not measuring your testers against the number of bugs they find in a set period.
  • Task independence – While not always possible, the schedule should be designed in such a way that each testing activity affects exactly one line item in the schedule. If a testing task can be applied to more than one item, how are you supposed to accurately measure the time taken? Apply the same time to each? Well, you just injected a fictitious time period into the mix. Split the time between them? How do you decide what percentage accurately reflects the time taken? It could be easy to figure out in some cases, but might not be in others.
  • Task granularity – Most testing tasks can be broken into smaller components. These smaller components should be tracked as sub-tasks that time can be recorded against. This allows for clear trending and task separation. Example: It might take me 5 hours to run the automated framework against server platform x. But what if 3.5 hours of that was environmental, spent trying to get an instance of the LDAP server I am interested in running correctly? Without a high level of granularity, we cannot see that by the end of the project I spent 6 days (once the reported hours were aggregated) fighting unrelated software. And if we had a better process in place to handle it, then either we could have pulled the schedule in a bit, or I could have done more exploratory testing.
  • Sufficient detail – Yes, the project schedule is a “living document” and does not live in isolation. It should, however, be able to stand on its own though. Each item should have a title descriptive enough to allow the testing activity to commence without having to consult 2 or 3 documents to figure out what the heck the activity is supposed to be. Example:
    Bad – Upgraded Oracle LDAP Support
    Good – Upgraded Oracle LDAP (9.0.4) Support
    By just adding the version number to the task item, the value is increased significantly, as you can instantly tell what the upgraded version is. Without it, you need to find the version number by combing through the requirements and the supported platform matrix, and both documents may be inaccurate as things change. As people are ultimately measured on how they do against the commitments in the schedule, it is the schedule that should be considered law.
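
The hours-based reporting and sub-task breakdown described above can be sketched in a few lines of Python. This is only an illustration, not a prescribed tool: the 8-hour “standard” day, the function names and the sample task names are all assumptions made for the example. It totals reported hours per task and per sub-task, then converts the totals into days (so 80 hours comes out as 10 days).

```python
from collections import defaultdict

# Assumption: an 8-hour "standard" work day; use whatever your shop has agreed on.
STANDARD_DAY_HOURS = 8.0


def hours_to_days(hours, day_length=STANDARD_DAY_HOURS):
    """Convert reported hours into standard work days."""
    return hours / day_length


def roll_up(entries, day_length=STANDARD_DAY_HOURS):
    """Total the hours per task and per (task, sub-task), expressed in days.

    `entries` is an iterable of (task, sub_task, hours) tuples, one per
    time report. Each report is logged against exactly one line item.
    """
    task_hours = defaultdict(float)
    sub_task_hours = defaultdict(float)
    for task, sub_task, hours in entries:
        task_hours[task] += hours
        sub_task_hours[(task, sub_task)] += hours
    task_days = {k: hours_to_days(v, day_length) for k, v in task_hours.items()}
    sub_task_days = {k: hours_to_days(v, day_length) for k, v in sub_task_hours.items()}
    return task_days, sub_task_days


# Hypothetical reports mirroring the 5-hour automated run example.
reports = [
    ("Automated run on server platform x", "LDAP environment setup", 3.5),
    ("Automated run on server platform x", "Framework execution", 1.5),
]

task_days, sub_task_days = roll_up(reports)
for (task, sub_task), days in sub_task_days.items():
    print(f"{task} / {sub_task}: {days:.2f} days")
```

Keeping the sub-task as part of the key is what makes the environmental time visible on its own line instead of disappearing into the parent task’s total.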

By doing these things, the numbers at the end of the schedule will be much more realistic. Which in turn helps your estimates for the next project be more realistic. Which means less of a “crunch” at the end where you are working silly long hours. And no one really wants to work silly long hours (now that the myth of making a killing during an IPO has been quashed).