This is part three in a series of posts about getting control of distressed projects.
For this particular project, the development team started off working in a waterfall approach. However, things ended up breaking down into an ad-hoc approach once development began and the BUFR analysis document became invalidated by uncontrolled change. From there, it didn’t take long for the line from the code back to the requirements to become blurred and then eventually erased.
Initially the team was composed of members that were matrixed in from their respective functional silos (analysts, developers, and testers). Team members were brought on to play their specific role in the designated phase (analysis, development, testing). There was no sense of shared responsibility for the quality and delivery of the system, and that isn’t surprising since the team did not operate as a cohesive unit from the start of the project.
The profile of the work assignment was a classic push system with a very large batch size. The development team was pushed a large requirements artifact from the analysis team. The test team was pushed a large code base by the development team plus the very large requirements artifact from the analysis team. During testing, when defects were uncovered, they were pushed to specific developers by a manager. When the fixes were made, they were pushed to specific testers by a manager.
This approach resulted in a lot of waste. In the analysis phase, a vast number of unimplemented requirements accumulated. In the development phase, a vast amount of untested code was developed. In the test phase, a vast number of unimplemented and improperly implemented requirements were identified.
The previous post discussed how a large number of defects escaped development and made their way in front of the client. The origins of this phenomenon can be traced back in large degree to the batch size that the team was working with. The team simply did not have the capacity to move the entire batch forward as a unit within the timeframe while maintaining the integrity of the batch as a whole.
Trying to move such large work products forward resulted in each team becoming a bottleneck on the team that needed to perform the subsequent tasks. In the end, it became a Sisyphean task that ended up sending the work products collapsing backwards from testing to development to analysis. And the process kept repeating.
Breaking the destructive cycle and getting the team to a state where they could actually close defects and keep them closed meant changing the team structure and fundamentally changing the way that the team moved new work from Pending to Done. In fact, it meant changing their definition of the word “done”.
The fact that the “team” members did not share a common sense of ownership of the solution was a major impediment. The first thing we did was dissolve the matrix organization. All team members were now just part of the development team. Along with this, we directly set the expectation that it was their collective responsibility to deliver a quality product and that they all owned quality, not just the testers.
The next thing we did was to remove tasking authority from the project managers. That is, managers could no longer push work to the team. As mentioned, the team was operating in an interrupt-driven mode, wherein a manager would task team members ad hoc, which made it unlikely that the previous task actually got completed. In the next post, I’ll describe what we ended up doing with the managers.
On top of all this, we physically co-located all the team members in a large conference room. This further reinforced the fact that they were a single team with shared purpose. It also meant that now they had to actually talk to one another instead of emailing each other from one cubicle to another.
To put some structure to the development effort and seal the exits on defects escaping development, we introduced a pull system using Kanban. The Kanban approach forced the team to work with smaller batch sizes. Initially, this was easy because all of the work items that needed to be completed were defects, which are usually pretty small to begin with. It also forced us to define the word “done”.
With this approach, work items (depicted as cards) made their way from left to right through a series of states from Pending to Accepted. Different team members had to apply their particular skill at the right places to keep the cards flowing. A work item was not “done” until it had passed through each of these states and ended up in the Accepted state. Done = Accepted.
In many cases a Kanban board can be drawn on a wall and the work items can be represented with sticky notes. In our case, the team was geographically distributed, and I wanted to make sure that we were constantly relying on the captured metrics to make more informed decisions about how to keep making the process better. We chose Version One as our Agile project management tool. The resulting Kanban board for this project was the following:
Each column represents work that needs to be done to move the card forward. Note that the first thing that needs to be done is Develop Acceptance Tests as described in the previous post. In order to decrease coordination costs related to communication of completed tasks, we added wait states to indicate that the previous task was completed and now the next task is ready to be performed. For example, when the acceptance tests have been developed, the work item is ready to develop code.
We also introduced work-in-process (WIP) limits to control the pace at which work items could work their way through the Kanban states. We observed that the analysts were not able to produce acceptance tests at a rate that kept pace with the developers. Since the developers could not keep working on new development tasks without violating the downstream WIP limits, the analysts became a bottleneck in the system. This forced the team to work together to figure out how to even out the distribution of work to ensure a consistent flow of work items. Sometimes developers had to write and verify acceptance tests (not for their own development work). We had to add more analysts and testers to the team. In some cases we adjusted the WIP limits to work out unnatural wait states.
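To make the mechanics concrete, here is a minimal Python sketch of a pull system with WIP limits. It is only an illustration, not how our tool implemented the board: the `Column` and `Board` classes are hypothetical, the limits of 1 are arbitrary, and apart from Pending, Develop Acceptance Tests, and Accepted the column names are my assumptions about what such a board could look like.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    """One Kanban state. A wip_limit of None means the column is unlimited."""
    name: str
    wip_limit: Optional[int] = None
    cards: List[str] = field(default_factory=list)

    def has_capacity(self) -> bool:
        return self.wip_limit is None or len(self.cards) < self.wip_limit


class Board:
    """Hypothetical board: cards only move one column to the right at a time."""

    def __init__(self, columns: List[Column]) -> None:
        self.columns = columns

    def _index_of(self, card: str) -> int:
        for i, column in enumerate(self.columns):
            if card in column.cards:
                return i
        raise KeyError(card)

    def pull(self, card: str) -> bool:
        """Pull a card into the next column.

        Returns False when the card is already in the last column or when
        the next column is at its WIP limit, so the card stays where it is.
        """
        i = self._index_of(card)
        if i == len(self.columns) - 1:
            return False
        next_column = self.columns[i + 1]
        if not next_column.has_capacity():
            return False
        self.columns[i].cards.remove(card)
        next_column.cards.append(card)
        return True

    def is_done(self, card: str) -> bool:
        """Done = Accepted: a card is done only in the last column."""
        return card in self.columns[-1].cards


# Assumed column layout, including one wait state between two tasks.
board = Board([
    Column("Pending"),
    Column("Develop Acceptance Tests", wip_limit=1),
    Column("Ready to Develop Code", wip_limit=1),  # wait state
    Column("Develop Code", wip_limit=1),
    Column("Accepted"),
])
board.columns[0].cards.extend(["defect-1", "defect-2"])

first = board.pull("defect-1")   # moves into Develop Acceptance Tests
second = board.pull("defect-2")  # blocked: that column is at its limit
```

In this sketch the second pull is refused until the first card moves on, which is the same pressure that exposed the analysts as a bottleneck on our board: when work cannot be pulled forward, the team has to help clear the full column rather than start something new.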
Ultimately, the team had to start working as a real team. They had to learn how to think about how to make the work items flow through the Kanban system. This caused a big boost in morale, and the team began to own the quality of the result as a team instead of pointing the finger at each other when they were simply matrixed onto the project to perform some specialized skill and then go back to the resource pool.