Tenet #8: Monitor buffers and queues
Monitoring buffers and queues of active products under development in your system is extremely important for speed. I will first show how taking advantage of project buffers can assure an aggressive project and on-time delivery, and then show how to monitor and react to growing queues at the front of process centers in your product development factory.
An accurate project plan must take into account the complex psychologies of your team members; in particular, their tendency to overstate the time needed to complete a task can reduce schedule predictability and slow the project. This tendency is due in part to the way leaders have responded in the past to late tasks, and how team members have suffered if they underestimated the task length. Punitive actions for tardiness communicate to the teams (perhaps without leaders realizing it) that “schedule accuracy is more important than project speed”. To create a faster organization, leaders need to find a way to give the opposite message: “speed is more important than schedule accuracy”, and then reward the team accordingly.
This undercurrent of task duration overstatement is present in more project teams than you might expect. In fact, each task typically has an implicit or explicit time buffer built in to account for both technical and logistic uncertainty. The team member has often given the project manager the 95% confidence level for the task duration. A very typical way for a project team member to handle an assigned task is as follows:
- Respond to the request for a duration estimate with a 95% confidence answer for task length.
- Investigate immediately to assure that the task can be done in the allotted time. This may take 1 day or several weeks.
- If the task duration has been overestimated, do not inform the project manager, but delay starting on the task until it is nearly due, allowing for time to work on another project. (This is sometimes called the “student problem” due to the similarity to students waiting until the last minute before completing a homework assignment.)
- If the task duration has been underestimated, go back to the project manager and restate the probable duration.
Figure 8-1: “The Student Problem” - Typical effort applied by contributor for a given project task
The result of this activity is that the project schedule, built from many such tasks, has a great deal of buffer time in it (as measured by the difference between the stated task time and the most aggressive duration). Much of this time may add no value to the project; it is simply wait time. Worse, the buffer time is almost guaranteed to be used up, since tasks expand to occupy the allowed time. The situation is exacerbated if individual contributors have their time split across many projects, since this multitasking brings extra uncertainty and extra time to switch attention and effort from one project to another. Task estimators add time for this uncertainty as well, and may even double-count it in their estimates.
Tasks seldom finish early, but usually take up the allotted and agreed-upon time. From the individual contributor viewpoint, as long as their given task is completed in the time promised, the job has been done well, even if it could have been completed sooner. But large project surprises happen, and when they do these individual task buffers are all but invisible and useless to absorb the impact. They can’t help an overall project stay on track. So when these unexpected events occur, the project is forced to slip its schedule.
Let’s now contrast this with a project managed with attention to these buffers. This method is often called “project buffer management” and, if done right, can result in more predictable schedules and early delivery. It has its roots in a method known as Critical Chain Project Management. We start by asking each person on the project for both a 50% confidence and a 90% confidence duration for each task, and we build and manage a schedule based on the 50% confidence duration for each task. This is a very aggressive schedule, and we know from the beginning that the chances of hitting it are very small. But by keeping this aggressive schedule in front of the project team and managing to it, the message given to the team is “finish your tasks early if you can, and we will take advantage of that effort”.
Next we will take the buffer time for each task estimate (the difference between the 90% and 50% confidence durations) and combine them into an overall project buffer B. From statistical theory, combining task buffers into a project buffer is not a straight sum; the result is smaller than a straight sum because slippage in one task is treated as independent of slippage in the others. An often favored method is to use a “sum of squares” algorithm. With this method, take the difference between the 90% and 50% confidence duration for a given task and square it. Do this for all tasks and add them up. Finally, take the square root of this result. In equation form, with t_{90,i} and t_{50,i} the 90% and 50% confidence durations of task i and N the number of tasks, this looks as shown below:
$$B = \sqrt{\sum_{i=1}^{N} \left(t_{90,i} - t_{50,i}\right)^{2}}$$
B in the equation above represents the amalgamation of all individual task buffers into one overall project buffer, tacked on to the end of the project and monitored for its “burn down”. This becomes a measure of the overall ability of the project to maintain an aggressive schedule and meet or ideally beat the published project schedule. (See Figure 8-2. In this example, this sum of squares method gave a project buffer 33% smaller than would a straight addition of the task buffers.)
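The sum of squares combination described above can be sketched in a few lines of Python. This is only an illustration: the function name `project_buffer` and the task durations are hypothetical and not taken from Figure 8-2.

```python
import math

def project_buffer(estimates):
    """Combine task buffers by root-sum-of-squares.

    estimates: iterable of (t50, t90) pairs, the 50% and 90%
    confidence durations for each task, in the same time unit.
    """
    return math.sqrt(sum((t90 - t50) ** 2 for t50, t90 in estimates))

# Hypothetical (t50, t90) durations in days for four tasks.
tasks = [(5, 8), (10, 14), (3, 5), (8, 12)]

straight_sum = sum(t90 - t50 for t50, t90 in tasks)
buffer_days = project_buffer(tasks)

print(f"straight sum of task buffers: {straight_sum} days")
print(f"sum-of-squares project buffer: {buffer_days:.1f} days")
```

How much smaller the project buffer is than the straight sum depends on the number of tasks and on how evenly their buffers are spread, so the saving will differ from project to project.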
While the project team is managed to the 50% confidence schedule, your sponsors may be given expectations from the 90% confidence schedule. In a mature organization, full discussion of two schedules and the buffer consumption chart (also known as a “Fever” chart) showing the use of the project buffer will create greater understanding of the methods used and obstacles faced in product development. (See Figure 8-3).
Project buffers also mitigate the “student problem” discussed above. Each team member will want to avoid having their task eat into the project buffer, and so will not delay completion until the last minute.
Figure 8-2: Simple example of a schedule with individual Task Buffers, and schedule with task buffers taken out and replaced by a single Project Buffer. (50% task estimate in green; the difference from the 90% estimate, i.e. the task buffer, in tan)
Figure 8-3: Example of a Project Buffer Consumption Chart. Black line is the expected project buffer consumption rate assuming a linear burn. Red line is the actual buffer consumption. Being in the green region of the chart is good, while in the red is bad.
Carefully examine your individual task buffers, turn them into a single project buffer, and then monitor the consumption of this project buffer throughout your project. Doing so will give you a much greater chance of driving an aggressively fast project, delivering on time, and often delivering earlier than the schedule communicated to your sponsors.
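As a rough sketch of the monitoring step, the check below compares actual buffer use with the linear expected burn described for Figure 8-3. The function name `buffer_status`, the two-zone green/red rule, and the choice to measure progress as the fraction of the 50% confidence schedule completed are all assumptions made for illustration.

```python
def buffer_status(buffer_days, fraction_complete, buffer_used_days):
    """Return "green" if buffer use is at or below a linear burn, else "red".

    buffer_days: total project buffer B.
    fraction_complete: share of the 50% schedule completed, 0.0 to 1.0
        (an assumed progress measure).
    buffer_used_days: project buffer consumed so far.
    """
    if not 0.0 <= fraction_complete <= 1.0:
        raise ValueError("fraction_complete must be between 0 and 1")
    expected_used = buffer_days * fraction_complete
    return "green" if buffer_used_days <= expected_used else "red"

# Hypothetical project: 10-day buffer, half of the schedule completed.
print(buffer_status(10, 0.5, 3))  # 3 days used against 5 expected
print(buffer_status(10, 0.5, 7))  # 7 days used against 5 expected
```

A team could run a check like this at each status review and plot the two values over time to produce a buffer consumption chart.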
Queues in new product development (NPD) are those spots in your process where the product under development sits waiting inactively for some important piece of information or unavailable resource. Remember that the goal of the NPD organization is to create the information needed to manufacture a product consistently and reliably. Examples of queues can be seen:
- in front of your centralized PC Board layout center
- in front of your key resource for developing DSP algorithms
- in front of your centralized technical writing staff
- in front of your key resource for user interface design
To measure such queues of work-in-process (WIP) in manufacturing, you might count the number of PC boards in front of a work area and assign worth by summing the WIP value of each board. But in new product development, the value generated is in the information created by the process, not in the physical objects. This information value is very hard to estimate at a specific time in the development cycle. How then do you estimate the value of a queue in front of a work area in NPD?
The best answer to this dilemma is to use the cost of delay (COD) as discussed in “Tenet #5: Sharpen Your Prioritization”. Recall that in that tenet, I described a way to prioritize each job in a queue. Each waiting job was assigned a number, estimated as COD*Throughput Time (TT) for that work area. Rather than the value of the job, this describes the potential loss if the job is not addressed in a unit of time. We can use this same number to calculate the total potential loss represented in a queue. Our queue size (or potential loss L represented by the queue size) would then be the sum of COD*TT for all M jobs waiting in front of the work area:
$$L = \sum_{j=1}^{M} \mathrm{COD}_j \times \mathrm{TT}_j$$
It is also illuminating to do the same calculation for all projects in progress (WIP) inside the same work area. So you get two numbers for a given work area:
- Potential loss due to cost of delay in queue
- Potential loss due to cost of delay in WIP
While this tenet is about queues, the second number can be quite instructive as well and watching it may help determine action to improve your product development pipeline.
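The two numbers can be computed with a short Python sketch. The `Job` class, its field names, the units (dollars per week and weeks), and all job data are hypothetical and only serve to illustrate the sum of COD*TT.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    cod_per_week: float  # cost of delay, in dollars per week (assumed unit)
    tt_weeks: float      # throughput time at this work area, in weeks

def potential_loss(jobs):
    """Sum COD * TT over a list of jobs."""
    return sum(job.cod_per_week * job.tt_weeks for job in jobs)

# Hypothetical jobs at a PC board layout center.
queue = [Job("sensor board", 20_000, 3), Job("display board", 5_000, 2)]
wip = [Job("power board", 12_000, 4)]

print("potential loss in queue:", potential_loss(queue))
print("potential loss in WIP:  ", potential_loss(wip))
```

Tracking both totals for each work area, period by period, gives the raw data for the charts and limits discussed next.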
Watching queue levels is a first step. But it does little good unless you (a) determine values for a good and bad level and (b) put plans in place for reacting to levels that are too high. Finding the right levels to set as expected queue and WIP levels at a given work area will take time and experience, but it is essential to keeping your pipeline flowing well.
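One simple way to act on these levels is to flag every reading that exceeds a limit you have chosen. The sketch below assumes a single fixed upper limit per work area; the function name `breaches`, the limit, and the weekly readings are hypothetical.

```python
def breaches(readings, upper_limit):
    """Return the (label, value) readings that exceed the upper limit."""
    return [(label, value) for label, value in readings if value > upper_limit]

# Hypothetical weekly potential-loss readings for one work area, in dollars.
weekly_queue_loss = [
    ("week 1", 40_000),
    ("week 2", 55_000),
    ("week 3", 90_000),
    ("week 4", 120_000),
]
upper_limit = 80_000  # assumed limit; set from your own experience

for label, value in breaches(weekly_queue_loss, upper_limit):
    print(f"{label}: {value} exceeds limit {upper_limit}, trigger the planned reaction")
```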
Once you have expected queue levels determined, have set up control limits, and are charting these levels, how do you address rising queues at a given work area? While the general philosophy in manufacturing might be to reduce bottlenecks (the constraint in a system that limits the flow) by duplication of equipment, knowledge workers and other similar means, this is not always practical in NPD for at least two reasons:
- Bottlenecks move too often and quickly in NPD
- The elimination of a bottleneck by throwing money at it is often impractical due to the rarity of certain resources and/or the time to duplicate a given resource
Bottlenecks move too often because few projects through the pipeline are the same or execute as expected. One month the bottlenecks may be in front of mechanical design and PC design, and the next month they are in front of power supply design and analog output stage design. Additionally, your bottlenecks may be in front of a world expert in a given area, and growing another expert may take too long and cost too much money.
However, it is possible to create appropriate reactions to breaches of your queue control limits at a given work area. And measuring those queue levels will give great insight into where to place your scarce dollars over the long term. For instance, you may find that cross-training, while potentially taking a lot of time (depending on the skill level needed), may be one of the best ways to be prepared to react to high queue levels. Another method often used in NPD is to have access to specialized contract labor. This contract labor may mainly serve to offload some of your more highly critical resources so they are free to perform tasks for which you cannot find a replacement.
As an example from my organization a few years ago, my examination of queue levels made it clear where I should invest in a product development organization: it was in contract engineering labor and in cross-training in general, but especially for PC design. We had gone through several rounds of improvement in product development and we were doing very well, but we knew we could be faster in our throughput times. Measuring queues at key process centers was one of the later improvements we made to our process. The data were clearly showing us bottlenecks at PC design and in a variety of other areas depending on the project and the collisions of projects requiring the same resources.
We started by generating a skills matrix, a matrix showing every person in the department and how they rated on various skills, from low signal-level analog design, to GUI design, to PC design, to test system engineering. We then used our queue level data to target specific areas where cross-training would help us. In some cases we were lucky, like having a PC designer who had gained training and education and was now a firmware designer. In a pinch, this engineer could act as a PC designer if the queue levels got too high. Having this engineer on call was our reaction to our PC design queue control limit. In other cross-training cases, we needed to invest over the next couple of years to duplicate our most precious competencies. Several contract labor houses were also vetted and qualified for resource areas where we saw chronic spikes in demand leading to shortages. Of course, when developing cross-training programs, we tried whenever we could to fit with each individual’s career desires. The result of this planning showed up in disasters averted, flow maintained and improved throughput times.
In this blog, I have tackled a two-part tenet to improve your product development speed. The first part argued for the need to turn your task buffers into a project buffer and then monitor the consumption of that project buffer. The second part admonished you to watch your queue levels at key process areas and be prepared with a clear reaction to levels above limits you have set.
In my next blog, “Tenet #9: Communicate the purpose and status”, I will discuss the need for, and best-practice methods of, keeping the purpose and status of projects and the performance of the department as a whole in front of your product development organization. This visibility is important to maintaining a sense of urgency, as well as applauding small successes.
Originally part of the work done by Eliyahu M. Goldratt. See his book Theory of Constraints, 1997
Note: Some groups use an Economic Value Added [EVA] calculation to estimate this value, but EVA really only considers the investment into the project, not its real value on the market.