Capturing and analysing value streams is one of the most widely used and appreciated methods in the lean process improvement toolbox. In a sense we all grew up, as lean coaches, reading Mike Rother and John Shook's brilliant book “Learning to See” and applying it in every possible situation. Most of us are also familiar with objections of the type “our process is far too complex for a value stream analysis to work” and have learned to work around them, mostly by eliminating unnecessary complexity from the analysis. I think there is a consensus among us lean coaches that value streams work very well and are the most important step in analysing a process, be it manufacturing or administration.

We must recognize, though, that the straightforward application of a value stream (VS) rests on two premises:

- Each process step is executed by dedicated resources who work only in that process step
- The processes described by the value stream are standardized to the extent that they show little variation, so that the mean values describe the process well

**The Formula**

If we are looking at a manufacturing operation, like a production line, both of these premises are almost certainly true. However, as soon as we move to administrative processes the situation starts to look shakier. It is common knowledge that resources are not dedicated to a single task but have several tasks related to different value streams: a person answering customer enquiries about new products might also be responsible for handling customer complaints, someone managing finished goods deliveries will also work on planning the production, a maintenance engineer will classify incoming defects and also repair parts, and so on. To make matters worse, in many of these cases the processing times will be wildly variable, ranging from minutes to days for the same type of task.

An important question for any VS specialist is how to handle these situations. One obvious answer would be to just explain the premises and fall back to “normal” process mapping like the swim lane. This approach has the downside that it represents, rather than eliminates, the “unnecessary” complexity of the process; indeed, one goal of the mapping exercise would be to show everyone how complex the process is in order to create an impetus towards simplifying it. So a swim lane is a great tool to shock the stakeholders into action, but much less suitable for actually analysing a process.

A better approach is to extend the value stream methodology to handle deviations from the two premises. Two steps are needed, corresponding roughly to the two premises.

If the problem is that the people working in one of the boxes of the value stream also perform several unrelated tasks, and several people work on those tasks in parallel, we can extend the concept of the processing time to what we call the “effective processing time”: the average time between two finished products leaving the process step, provided the step is well supplied (i.e. it does not have to wait for materials, input, etc.). There is a nice formula for the general case of several resources with differing efficiencies and allocations. The derivation of the formula, for those interested, can be found here:

and it looks like this:

PTeff = 1 / (A1/P1 + A2/P2 + … + An/Pn)

where A1 is the fraction of time resource 1 works effectively on the task related to our value stream, and P1 is the processing time: the time it takes resource 1 to accomplish the task, provided there are no interruptions.

For example, imagine we have two maintenance engineers working on repairs. The first one is more experienced and averages 2 hours per repair; the less experienced colleague averages 2.5 hours. However, the experienced engineer also has to manage the suppliers of spare parts, which takes about 3 hours of his day, while the less experienced one is dedicated solely to repair jobs. What will be the effective processing time of the repair step?

Using the formula with P1 = 2 and P2 = 2.5: the first engineer can work only (8-3)/8 = 0.6 (60%) of his time on repairs, the less experienced one 100%. Putting it all together, the PTeff of the step will be

1/(0.6/2 + 1/2.5) = 1/0.7 ≈ 1.43 hours.

So roughly every one and a half hours the team finishes a repair job. In a value stream map we can represent this step exactly as if we had a single resource finishing one repair every 1.43 hours. By applying the formula we have eliminated the complexity generated by the unequal processing times, by having more than one resource in the process step, and by the additional tasks unrelated to the value stream, basically resolving the problems related to the first premise.
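As a quick sanity check, the formula is easy to put into code. The sketch below (the function name and structure are my own, not from the original) computes the effective processing time for any number of resources:

```python
def effective_pt(allocations, processing_times):
    """Effective processing time of a step staffed by several resources.

    allocations[i]      -- fraction of time resource i works on this task
    processing_times[i] -- time resource i needs for one uninterrupted task
    """
    # each resource contributes a rate of A_i / P_i finished items per hour;
    # the step's effective processing time is the inverse of the summed rate
    rate = sum(a / p for a, p in zip(allocations, processing_times))
    return 1 / rate

# the two maintenance engineers from the example:
# engineer 1: 2 h per repair, only (8 - 3) / 8 = 60% available for repairs
# engineer 2: 2.5 h per repair, fully dedicated
print(effective_pt([0.6, 1.0], [2, 2.5]))  # ≈ 1.43 hours
```

With a single fully dedicated resource the formula collapses, as it should, to the plain processing time.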

**Simulation**

The second premise also raises problems in practice. One of the most frequently heard comments when teaching lean, and especially value streams, is that it only applies to car manufacturing, precisely because their processes are so highly standardized. In cases with too many random influences, our analysis of the value stream will miss important effects, because we concentrate only on the average behaviour.

The problem is also that we have very little intuitive understanding of how a value stream behaves, and especially of how random effects influence a process. This would require a dynamic view of the value stream, while our mapping is essentially static. The way out has been known for a long time: build and analyse a simulation of the value stream. The problem is (or rather was) that simulation software used to be expensive and specialized. As far as I know, there was no standardized and cheap way of easily building a simulation.

This changed, as so much else in statistical analysis, with the advent of R (and, to be fair, Python as well). Today we have open-source, widely used software that is a de facto standard for system simulations. This means that any value stream we build in the traditional static way can easily be transformed into a dynamic view. A dynamic view means we can build a much better intuition of what the value stream is doing, get a picture of the effects of random variations, and answer hypothetical questions about how the value stream would change if we introduced specific changes in the process.

As an example I will take an interesting process proposed by an especially talented trainee group we work with: a visit to the doctor. The process has four steps, plus one interrupting task:

- The nurse receives the patient and prepares the patient file for the doctor
- The doctor examines the patient
- The nurse updates the patient file
- The doctor signs the documents
- During the day, the nurse also has to answer random calls from patients about future appointments

As we can see, this is by no means a complex process, yet it already violates both premises. In order to map the process we need to work out the effective processing times for each step, and for that we need some average values. In a real case these would have to be measured or estimated; for the sake of the analysis let us simply assume them as follows:

- Step 1 takes on average 2 minutes
- Step 2 8 minutes
- Step 3 4 minutes
- Step 4 0.5 minutes
- A call takes on average 4 minutes, and one call arrives about every 10 minutes

We also assume one patient arriving every 9 minutes.

Using the formula from before we can calculate the effective processing times for the nurse. For Step 1 she can spend 2/(2+4+4) = 20% of her time. The processing time is 2 minutes, so the effective processing time is 2/0.2 = 10 minutes: on average she can prepare one patient file every 10 minutes. The effective processing time of the doctor is 8.5 minutes. The standard value stream analysis tells us that patients will queue waiting for the nurse, while there will be no queue waiting for the doctor.
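Spelled out in code (a tiny sketch; the variable names are my own), the two numbers come out directly:

```python
# nurse allocation to Step 1: 2 min of every (2 + 4 + 4) min of her workload
# (file prep + file update + roughly one call per patient cycle)
a_step1 = 2 / (2 + 4 + 4)        # = 0.2, i.e. 20% of her time
pt_eff_step1 = 2 / a_step1       # processing time / allocation = 10 minutes

# the doctor works only on this value stream: exam plus signing per patient
pt_eff_doctor = 8 + 0.5          # = 8.5 minutes
print(pt_eff_step1, pt_eff_doctor)
```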

Can we get a better view of what is going on by using a simulation? Using the R library simmer we can easily build one and check the queue lengths over a working day. If we do not consider any randomness, the result looks like **Case 1** on the graph.

We can see that the real behaviour is a bit more complex than our static view. Even in the absence of random effects we might see a small queue at the doctor, but essentially our view is correct: there is no build-up at the doctor, but a steadily increasing queue at the nurse.
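The article builds the simulation in R with simmer (that code is not shown here). To make the mechanics concrete, here is a minimal deterministic discrete-event sketch of the same four-step process using only the Python standard library; all names and the exact bookkeeping are my own assumptions, not the original model:

```python
import heapq
from collections import deque

# processing times in minutes, taken from the article's assumptions
DUR = {"prep": 2.0, "exam": 8.0, "update": 4.0, "sign": 0.5, "call": 4.0}
RES = {"prep": "nurse", "exam": "doctor", "update": "nurse",
       "sign": "doctor", "call": "nurse"}                    # who does what
NEXT = {"prep": "exam", "exam": "update", "update": "sign"}  # patient flow
PATIENT_EVERY, CALL_EVERY, DAY = 9.0, 10.0, 480.0  # one 8-hour day

def simulate(with_calls=True):
    queues = {"nurse": deque(), "doctor": deque()}
    free_at = {"nurse": 0.0, "doctor": 0.0}
    events, seq, done = [], 0, 0

    def schedule(time, kind, job):
        nonlocal seq
        heapq.heappush(events, (time, seq, kind, job))  # seq breaks ties
        seq += 1

    t = 0.0
    while t < DAY:                       # one patient every 9 minutes
        schedule(t, "ready", "prep")
        t += PATIENT_EVERY
    if with_calls:
        t = CALL_EVERY                   # one phone call every 10 minutes
        while t < DAY:
            schedule(t, "ready", "call")
            t += CALL_EVERY

    def start_if_free(res, now):
        if free_at[res] <= now and queues[res]:
            job = queues[res].popleft()
            free_at[res] = now + DUR[job]
            schedule(free_at[res], "finish", job)

    while events:
        now, _, kind, job = heapq.heappop(events)
        if now > DAY:                    # stop at the end of the working day
            break
        if kind == "ready":
            queues[RES[job]].append(job)
        elif job in NEXT:                # finished: item moves to next step
            queues[RES[NEXT[job]]].append(NEXT[job])
        elif job == "sign":
            done += 1                    # one patient fully processed
        start_if_free("nurse", now)
        start_if_free("doctor", now)
    return done, len(queues["nurse"])

done, nurse_queue = simulate(with_calls=True)
print(f"patients done: {done}, nurse queue at day's end: {nurse_queue}")
```

Running it reproduces the static prediction: the nurse's queue grows over the day while the doctor keeps up; running it with `with_calls=False` previews the outsourcing scenario discussed later.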

Now let us introduce some randomness into the process. To do this properly we would need detailed information about the distribution of the processing times for each step, which would mean detailed measurements over a longer data collection period. However, there is a quick and dirty way of introducing such assumptions into a simulation: the so-called triangular distributions. These are defined by 3 numbers: the minimum, the maximum and the most frequently occurring value (the mode). The shape of the distribution is triangular, so we will miss the finer details, but for a first impression the details are generally not that important, and they can be refined in later steps if necessary.
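Python's standard library happens to ship exactly this distribution, so the quick-and-dirty assumption is a one-liner (the parameters below are just an illustration: a task that takes between 5 and 14 minutes, most often around 8):

```python
import random

random.seed(42)
# a triangular distribution needs only (minimum, maximum, mode)
samples = [random.triangular(5, 14, 8) for _ in range(100_000)]

# the theoretical mean of a triangular distribution is (min + max + mode) / 3
mean = sum(samples) / len(samples)
print(round(mean, 2))  # close to (5 + 14 + 8) / 3 = 9
```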

Let us take an example: assume that the doctor's exam time varies according to the triangular distribution (5, 14, 8). The mean time is then 9 minutes per exam, with variations between 5 and 14 minutes. Let us also assume that the patients arrive randomly, described by the distribution (5, 15, 8), that is, on average one every 9.33 minutes. To keep things simple, let us assume the nurse works in a standardized way, that is, her working time in each phase is constant, with no variation.

Now that we have introduced randomness into the simulation, we get a different picture at each run. One example can be seen in **Case 2**. Even though the doctor has enough time statically, we see a build-up developing at the doctor around mid-day. This is purely bad luck: the doctor had a few patients who took longer and/or arrived earlier than expected. This effect is hard to predict from the static value stream alone. The nurse is still overworked: she can finish one patient in 10 minutes (all phases considered) while patients arrive every 9.33 minutes, so by the end of the day we predictably have a queue in front of the nurse. The doctor, however, managed to eliminate the queue, which was to be expected over the longer term.
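This run-to-run variation is easy to reproduce even without a full simulation framework. For a single server such as the doctor, each patient's waiting time obeys the classic Lindley recursion: wait(n) = max(0, wait(n-1) + service(n-1) - arrival gap(n)). The sketch below (structure and names are mine, not the article's R code) draws service times and arrival gaps from the triangular distributions assumed above and shows queues building up and dissolving purely by chance:

```python
import random

random.seed(1)
N = 50  # patients in one simulated stretch

# doctor's exam ~ triangular(5, 14, 8), arrival gaps ~ triangular(5, 15, 8)
service = [random.triangular(5, 14, 8) for _ in range(N)]
gaps = [random.triangular(5, 15, 8) for _ in range(N)]

# Lindley recursion: a patient waits if the previous exam outlasted the gap
waits = [0.0]
for n in range(1, N):
    waits.append(max(0.0, waits[n - 1] + service[n - 1] - gaps[n]))

print(f"max wait at the doctor: {max(waits):.1f} min")
```

Although the mean service time (9 min) is below the mean arrival gap (9.33 min), some patients still wait, and a different seed gives a different pattern, which is exactly the effect visible in Case 2.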

Just to illustrate how this analysis could continue, let us consider the idea of outsourcing the incoming calls and letting the nurse work only with the patients. The result is seen in **Case 3.** By applying the formulas we could have more or less predicted this result, but it is still nice to see our prediction realised.

Now there is room for further ideas. Obviously the new bottleneck is the doctor, so what would we need to do in order to reduce waiting times and queuing?

The above is just a simple example of combining a more detailed value stream analysis with simulation. But imagine the power of this method in a real workshop, where the people actually working on the process sit together with an analyst and a simulation, trying out ideas and hypotheses on the fly, coming up with new scenarios and quickly answering questions like the one above. That would be a whole new level of understanding of the processes we work with, and we should definitely adopt the method as a new standard of digital value stream analysis.