What do you think works for your lean initiative?

The title is a bit misleading, as I do not want to talk about our experiences of which factors make an initiative a success. Rather, I want to show a method for measuring what the employees of different organizations think makes the initiative successful. And when I say think, I really mean what they believe, not what they would say if asked this question directly.

The idea is to apply a statistical method employed in psychology (but in other fields as well) to evaluate a carefully designed survey, in order to quantify the strength of the respondents' beliefs concerning the links between different factors and our outcome variable. This sounds unnecessarily complicated, so let me give a concrete example.

Imagine we want to measure which factors people think determine the success of a lean initiative. Just asking this simple question is not very helpful, because we are near certain to get all kinds of incommensurable answers: everybody understands “success” differently, and the factors mentioned will also be wildly different, from the very concrete to the philosophical. So, obviously, we will have to ask more specific questions.

Concerning success, for example, we might have our own definition of what a successful lean initiative looks like: many Kaizen events, Obeya rooms regularly visited, waste elimination, 5S in place, etc. All these elements REFLECT the status of the initiative, meaning that if we ask questions about them in a successful lean company we will generally get high scores, and vice versa. We do not need to capture ALL that defines a successful implementation, though; we just need to be reasonably sure that all successful companies will have reasonably high scores on most of our questions.

To put it in more formal terms, we assume that there is a hidden variable at the company that we can call Maturity of Lean. We cannot measure this directly; indeed, we do not even have an operational definition of it. However, we can ask questions that we expect to reflect the state of our Maturity of Lean hidden variable, a bit like looking at mosaic stones and trying to figure out the whole picture. To get this information, we define one question in our survey for each aspect of Maturity we came up with.

Using the same logic, we can assume that there is another hidden variable at the company called Management Commitment, another one called Tool Proficiency, and so on. To get an idea of the state of each of these, we design several questions that we believe will reflect the status of the hidden variable. For Tool Proficiency, for example, we might decide to ask about the number of successful Kaizen events, the number of employees involved, the amount of money saved, the number of areas with visual management present, and so on. In the same way we may define a number of questions around Management Commitment, and so on.

As a side issue, wherever possible, we should use Likert scales for the answers to facilitate the analysis.

Now, once we have collected the answers, we will want to analyse the relationships between these hidden variables and their effect on our similarly hidden outcome variable. In principle, it would be possible to analyse more detailed effects, like the impact of 5S on the number of Kaizen events, but this means building a large number of correlations (one between each success component and each influencing factor), which is statistically unsound unless we apply some correction (like the Bonferroni correction, if we want to keep it simple), and is really also way too detailed. Another problem is that many of our independent variables (the answers to our questions) will be correlated, which makes a traditional regression analysis very difficult.
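As a minimal illustration of the multiple-comparison problem (all numbers invented): with the simple Bonferroni correction, the family-wise significance level is divided by the number of tests, so each individual correlation must clear a much stricter threshold.

```python
# Sketch of the Bonferroni correction mentioned above: testing m correlations
# at a family-wise error rate alpha means each individual test must be judged
# against a much stricter per-test threshold.
def bonferroni_threshold(alpha, m):
    """Per-test significance level when running m comparisons."""
    return alpha / m

# e.g. 5 success components x 4 influencing factors = 20 correlations
print(bonferroni_threshold(0.05, 20))  # each test now needs p < 0.0025
```

This quickly becomes impractical: with dozens of question pairs, hardly any individual correlation survives the corrected threshold.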

Anyway, the real questions are at the level of the hidden variables: does Tool Proficiency contribute to the success of the initiative, and if yes, how strong is its effect? Once we get answers at this level, we can go one step deeper and analyse the contribution of each component to the outcome: does 5S contribute strongly to Tool Proficiency, or is it Visual Management? And the like.

The statistical method to analyse our survey is called PLS-SEM (Partial Least Squares – Structural Equation Modelling). Without delving into the mathematics, it has essentially three steps.

  1. We describe which survey questions relate to which hidden variable. As we designed the survey with hidden variables in mind, this will not be a difficult exercise. Based on this information, the system will optimally construct the hidden synthetic variables as linear combinations of the respective inputs. That is, roughly, the PLS part of the method.
  2. We start with some broad assumptions on which hidden variable can impact which other hidden variable. E.g. we can assume that Management Commitment has an impact on Tool Proficiency, or that Leadership has an impact on Management Commitment and also on Tool Proficiency, etc. Based on these assumptions, the system will calculate the strength of the influence of one hidden variable on another – that is the SEM part.

With these two elements we can run the PLS-SEM model, and then comes the third step: the interpretation of the model. Here we can quality-check the structure of our hidden variables and see whether we picked the categories correctly. Then, if all our hidden variables are correctly built, we can check the model as if it were a normal multiple regression and reduce it based on the p-values in the usual way. What we end up with is a statistically sound, high-level description (or model) of what the survey participants think of such a complex issue as the success of a lean initiative.
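To make the two building blocks a bit more concrete, here is a toy sketch in Python. It is emphatically NOT a PLS-SEM implementation (real PLS estimates the indicator weights iteratively, and dedicated packages exist for this); it only illustrates the idea: hidden variables built as linear combinations of survey items, then a regression between the hidden variables. All respondents and answers are invented.

```python
# Toy sketch, NOT real PLS-SEM: each latent ("hidden") variable is simply the
# equal-weight average of its survey items, and the structural part is a
# single ordinary-least-squares slope between two latent scores.
def latent_score(items):
    """Combine one respondent's Likert answers into one latent score."""
    return sum(items) / len(items)

def ols_slope(x, y):
    """Strength of the influence of latent x on latent y (simple regression)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

# Hypothetical survey: 4 respondents, 3 items for "Tool Proficiency" and
# 2 items for "Maturity of Lean" (1-5 Likert answers, made up).
tool_items = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [1, 2, 1]]
maturity_items = [[4, 4], [3, 2], [5, 5], [2, 1]]

tool = [latent_score(r) for r in tool_items]        # measurement model
maturity = [latent_score(r) for r in maturity_items]
print(ols_slope(tool, maturity))                    # structural model
```

In a real analysis one would use a dedicated PLS-SEM package and check indicator reliability before interpreting the path coefficients; the sketch only shows where the "linear combinations" and the "strength of influence" live.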

The method is not limited to survey evaluation, though. In any industry where we have some customer requirement that can be described by several measured values, we can apply the idea of hidden variables that are reflected by these measurements. Indeed, my first exposure to the method was at a time we worked for a number of companies manufacturing paint. One characteristic of paint that is of interest to the customers is how “shiny” it is. This is measured by a lab instrument at several angles, say, at 20, 40, 60 and 80 degrees. We can model the customer requirement as the hidden variable “Shine”, reflected by the measurement values at the different angles. Then we can use the hidden variable as the Y in a regression model that has manufacturing parameters, components of recipes and the like as inputs, giving us a lot of insight into the way we can improve our process. The same logic can be applied in many places in the food industry as well.

In summary, we know that the most important step in a process improvement is the accurate capture of the voice of the customer. As Six Sigma has a reputation of relying strongly on measurements and statistics, adopting a method that links our strength to our aspiration is definitely something we should do, and do more often.

Lean and Building a Robot

I am not thinking of actual physical robots here, like the ones building cars in a Toyota plant, though that would be a very interesting topic in itself. Rather, I would like to share some experiences I had in building software robots of RPA fame at an organisation which is at the start of its process excellence journey, with or without robots.

This is a very interesting experiment, given the ongoing debates concerning the right way to choose in such a situation.

Do we have to implement pure Lean first, as advocated by the purists and, of course, most traditional lean consultancies? There is a famous lean precept saying that automated waste is still waste, maybe even worse than non-automated waste. This would mean that we should postpone implementing RPA until the processes are optimized and waste mostly eliminated, then run an RPA initiative to gain the best of both worlds: sleek lean processes that also run for the most part automatically, with the help of SW robots.

There is an alternative way to look at this situation, of course. You might say that applying lean to a crusty old process that has resisted optimization for many years is a waste of effort. Just pick the parts that lend themselves to automation, implement robots there and reap the benefits. Naturally, this is the view of the RPA vendors. No expertise is needed other than implementing robots on the particular platform, and benefits will start pouring in a few weeks after the end of the implementation, without any need to worry about obscure things like “cultural change” or “value streams”.

I am in a particular position, having been both a SW developer and a lean consultant for basically the same (long) amount of time. Still, having had my share of failures and successes in process improvement (and software development), I naturally lean towards the second view: implement the robots quickly and either get the benefits or quickly discover the mistakes we made in the process. This is, funnily enough, the view of the people who tried to apply lean principles (or just common sense) to software development and came up with the Agile methodology in the 1990s.

So, to come to the real story: I showed up at the organization in question for one week of voluntary work (they are a well-known NGO) to start them on the path of robotic process automation. I was very much focused on getting at least one robot functional and useful by the end of the week – being naïve and hopelessly optimistic, as most developers are, I actually promised five. I also completely neglected any preparation work, such as creating process maps and value streams – this time on purpose – thinking that we could not make a process map at the level of each keypress and mouse-click, the way I needed for the robot, and also because my receiving organization would not have had the time to prepare such maps even at higher levels of detail.

The first hurdle was to find the first (and, as it turned out, only) process step to automate. Here a detailed process landscape and process maps would have helped, but lacking those, we picked, sensibly enough, something where the process worker was available and which the team deemed sufficiently important. (Of course, during the robot development we did have doubts about whether we had made the right choice, but by then it was too late to change.)

The second hurdle was that I completely underestimated the impact of the non-standard environment in which the robot had to work. As an NGO, they had several IT solutions, all delivered by enthusiasts or volunteers, that were well suited for human operators, who will never care about window tags and such, but a lot less standardised in behaviour than a system built from standard Windows components. This made the development work a lot more strenuous, as I had no way of knowing whether some trick I needed would even be possible within this framework, like going through a list of push-buttons and clicking them automatically in a given order. On the other hand, this is exactly where an RPA system becomes so much better than any traditional macro – and we often hear that RPA is nothing BUT a better developed macro. Well, that “better developed” part, and the possibility to manipulate window descriptors, may be a small step as programming goes, but it is a huge step for an implementer faced with an unknown system, poorly (or not at all) specified interfaces and a tight deadline. The RPA paradigm means in these cases the difference between “impossible” and “tough but doable”.

So, not without some hassle, and way later than I had planned, I managed to build what I thought was a working and useful robot. Then, of course, came the third hurdle: it turned out that the process, as described by the colleague I was working with on the specification of the robot, was by no means the only way, or even the accepted way, of working on that process step. In short, my robot turned out to be not useful at all until the team worked out the right way of making that step. This is where, you could think, the lack of correct Lean methods came back to bite us. Had we had a Lean workshop first, we would have discovered the discrepancies and the lack of standard work before wasting time on the robot, and we would have fixed the problem BEFORE we started programming. Well, this is true, in a way. In an ideal world the procedure should definitely have been like this: run a Lean workshop, build a process map, agree on standard work, implement the robot for the standard version.

But I have my doubts. We do not live in this ideal world, and NGOs in particular do not have the time and money, nor the motivation, to spend a lot of time building process maps. The colleagues I was talking to honestly believed that they had only one standard way of doing the process I tried to automate. Not because they were ignorant, but because they have many other priorities, and as long as the process functions they have better ways of spending their time (and overtime – I basically did not see an 8-hour day during my time with them). Building the robot changes the optics: suddenly the process we are looking at will be performed by a mindless entity that does not have the intelligence to discover and correct deviations from the standard. It will create chaos instead. So, if we want (or must) apply robots, we must make sure that these deviations do not happen anymore.

But this means that the strong motivation to look at processes and to standardize them comes AFTER the first robots are built.

It is, in my opinion, next to impossible to instil this motivation as long as humans still perform even the most boring process steps.

So, was my week of work a failure? I do not think so. We all learned – and the organization I worked for definitely learned that robots will only be as good as the standardization of the process steps we apply them to. As a Toyota manager I once had the chance to interview stubbornly repeated: “No standard, no improvement”. This is even more true and important in the new era of Lean Robots than it used to be.

Now, to close – did I learn that we need a complete Lean Transformation before we implement robots? No, I still do not believe that. However, being careful pays off. I would still build a robot first, but I would emphasize a lot more (as befits any Agile methodology) that we are building version 0 only – the first, minimally useful product. THAT I would try to build even faster than I did in this case. The moment this version is ready, the reality of what a robot can and will do hits home with the team, and then our users, customers and developers will raise a lot of legitimate objections and discover errors, which, with any luck, will ensure that the next version is something almost useful – which will in turn generate a new wave of tests and objections, and so the cycle moves on towards something legitimately useful. I strongly believe that this is the practical way of going forward with RPA: building more and more useful robots and learning at each step of the way.

Robotic Process Automation versus Software Development – the case for Agile RPA

Recently I had the great opportunity to learn and then apply RPA in a real-life project. I think the method is nothing short of revolutionary and will definitely change the way we think about process improvements and, more importantly, how we actually improve processes. I also plan to write several blog posts about my experiences, as they will probably be useful, or at least entertaining, to those who embark on the same path.

I spent about 15 years of my professional life as a software developer at various big companies like Siemens and GE, so I was naturally intrigued (well, the actual term should be more like pissed off) by the advertisements of several RPA providers who all claim that scripting robots is NOT software development – that it is, in fact, something anyone can do without previous knowledge of programming.

This claim is based on the fact that RPA systems allow one to capture user activities and replay them, so it is indeed possible to develop a script without having to “program”. Unfortunately, there are some basic problems with this view.

The first is the confusion between “typing code” and programming. The idea that clicking on icons and dragging them to various places on a screen is somehow different from, and easier than, actually writing a program looks very tempting at first sight. After all, programmers type gibberish that only they understand most of the time, while business analysts and managers build presentations using pretty pictures – or so the stereotypes go. So, if we can do away with the gibberish (and with the typing), then probably business analysts and even managers will be able to produce robots, and we will need no expensive programmers, whom no one understands anyway – this feels like a very tempting proposition.

Unfortunately, this idea is very far from true. At the risk of stating the obvious: programming is not about typing. Programming means describing procedures in a highly structured way, so that no other knowledge, of the kind humans have, is necessary to execute them – and even more importantly, keeping these descriptions (aka “programs”) in a state where they can be updated, modified and generally used by people who did not participate in the development. This is absolutely necessary, as programmers tend to wander off to other projects, customers discover new needs, and bugs rear their ugly heads. Programs are never finished in the sense that bridges, for instance, are, so they need constant care, which would be next to impossible if the program were not developed with this in mind.

In Software Engineering terms, this means that programming is mostly concerned with the famous “ilities”: usability, reliability and maintainability. Usability means developing a program (or script) that the customers find useful and are willing to pay for; reliability, that the program can run most of the time; and maintainability, that the developed code is easy to understand and modify without the risk of breaking it by introducing changes that have unforeseen effects.

Achieving these goals has absolutely nothing to do with the way a program is developed – by typing text or by assembling pretty little pictures. In this sense, the message “RPA is absolutely not like programming” is wrong and probably dangerous. My uncomfortable feeling is that many companies will translate this marketing message into something like: “we can now finally forget all the lessons SW Engineering learned in the last 50 years and just work spontaneously as we see fit, because THIS IS NOT PROGRAMMING”. I think the first lesson to be forgotten will be Agile development.

To be fair, Agile has already taken some hits due to the hype of the past few years. Most companies have developed an uneasy relationship with the concept, which makes it all the riskier to take the position that RPA initiatives absolutely need Agile as their development methodology. Let me show why it is a must, using the “ilities” as examples:


How do we make sure we develop robots that do things people find useful? The only way I can imagine is to send the RPA developer to sit with the people who actually do the work that will be (at least partially) automated, observe what they do and discuss with them what they need. (Remember the lean term Gemba? The Agile idea of including customers in the development team? This is the same, only ten times more necessary.) This is only possible in loops: ideas are captured, robots with minimally useful functionality implemented, and feedback from the direct customers (the office workers whose work will be made easier) collected for the next version. And the next version will be developed the next week, so that the customers see the effects of their participation immediately. This also means that in the early phases errors are tolerated, or even welcome. After all, each error discovered in this phase is one error less in the final robot.

Unfortunately, if the organization is unaware of (or wilfully forgets) what we all learned about SW development (and remember, they bought into RPA with the idea that it is NOT SW), the spontaneous way to develop will be the well-known waterfall. We will send an e-mail to the process experts asking them to please describe their processes in the greatest detail possible, and when the spec is ready, somebody who has probably never seen a live process worker – somewhere in the basement of the IT organization, or maybe somewhere even farther away – will develop a robot. The robot will be tested on a number of (ideally well-chosen) test cases and then deployed. It will also quickly fail – because requirements change, the users forgot to mention the odd extraordinary case, and so on… we all know the examples from real-life projects. However, by this time the developers have other robots to develop, and the whole deployment degenerates into an acrimonious discussion of who is to blame for what. We have all been there and done that – and we risk starting the cycle once again.


The second “ility”, reliability, is about testing. Again, the Agile way – developing small useful chunks, immediately letting the users test them, and repeating the cycle – is the best way of achieving this. The traditional way of developing a number of test cases and running them cold has the weakness that we never know for sure whether the test cases really cover all the eventualities, and whether a “tested” SW is really safe to deploy or not. I once worked in a project where a team spent one year writing test cases, and when we checked later, it turned out that all the cases covered less than 20% of all eventualities. Designing tests is an important part of SW Engineering know-how – but remember, RPA is NOT SW? This way we risk ending up with the worst of both worlds: no timely, direct feedback from the people who use the robot, and no really usable test cases either.


Maintainability is where I find the marketing line that RPA is not SW the most dangerous. In the effort to sell RPA as NOT SOFTWARE, most RPA providers embrace a visual programming style. This is all very nice and easy in a marketing show, where anybody can drag up to five icons onto a screen and visually link them with nice arrows – but then real life will necessarily kick in after the purchase. It is no accident that other industries have already experimented with visual programming and then returned to text. The problem is that in a visual program a LOT of essential information is hidden in small dialog boxes attached to these nice icons. And by essential I mean things like delays before a mouse click, waiting times before a terminal message is sent, or even the input parameters of what goes for a function in this visual paradigm. Now, imagine the simplest of maintenance actions: find the places where a timeout has a given small value and change it to something longer. In the good old text-programming world this would be a simple search-and-replace operation. Maybe there is a better way in an RPA visual program, but the only way I can see now is opening each and every program that uses this timeout and clicking through each and every icon to edit each dialog box separately. Good luck doing this with a few hundred icons (aka lines of code), and especially good luck finding people willing to do this brainless work for days on end. (Well, maybe we could write code-maintenance robots to do it.)

And this is just the tip of the iceberg, and a pretty much trivial task at that. Once robots are deployed in numbers, there will be many such tasks, and more complicated ones. We know from SW development that in big organizations code maintenance takes up 80% or more of a developer's time. Unless we can freeze the processes with robots down to the last mouse-click, I see no reason why this percentage should be different for robots.

So, where does this leave us? Is RPA bad for the companies?

I definitely do not think so. The message that RPA is easy, and thus not like SW, is on the other hand dangerous and damaging. If anything, we need to be more agile in RPA development than in “normal” SW development, and this definitely needs planning and organization before the deployment. Call me a maniac, but I strongly believe that Lean and its SW offshoot, Agile, are the answer to most of the problems. It only takes the will of the organizations to implement them – and not to fall for the siren song of “this is easy, anyone can do it” from the marketing types. It is not easy, and many will fail if they implement it mindlessly – but it has a huge potential to make life better for the people working in processes, and we know how to do it right. So, as in so many other things in life: PLAN, DO, CHECK, ACT – and reap the benefits.


Value Stream Analysis in a Digital World

Capturing and analysing value streams is one of the most used and liked methods in the lean process improvement methodology. In a sense, we all grew up as lean coaches reading Mike Rother's brilliant “Learning to See” and applying it in all possible situations. Most of us are also familiar with objections of the type “our process is far too complex for a value stream analysis to work” and have learned how to work around them, mostly by eliminating unnecessary complexity from the analysis. I think there is a consensus among us lean coaches that value streams work very well and are the most important step in analysing a process, be it manufacturing or administration.

We must recognize though that the easy application of a VS rests on a few premises:

  1. Each process step is executed by dedicated resources who only work in that process step
  2. The processes described by the value stream are standardized, to the extent that the variation around their mean values is small enough for those means to describe the process well.

The Formula

If we are looking at a manufacturing operation, like a production line, both of these premises are almost certainly true. However, as soon as we move to administrative processes, the situation starts to look a bit shakier. It is common knowledge that resources are not dedicated to a single task but have several tasks related to different value streams: e.g. a person answering customer enquiries about new products might also be responsible for handling customer complaints, someone managing finished goods deliveries will also work on planning the production, a maintenance engineer will classify incoming defects and also repair parts, and so on. To add insult to injury, in many of these cases the processing times will be wildly variable, ranging from minutes to days for the same type of task.

An important question for any VS specialist is how we can handle these situations. One obvious answer would be to just explain the premises and revert to “normal” process mapping, like the swim lane. This approach has the downside that it represents, rather than eliminates, the “unnecessary” complexity of the process; indeed, one of the goals of the mapping exercise will be to show everyone how complex the process is, in order to create an impetus towards simplifying it. So, a swim lane is a great tool to shock the stakeholders into action, but much less suitable for actually analysing a process.

A better approach would be to extend the value stream methodology to handle deviations from the two premises. Two steps are needed to do this, corresponding roughly to the two premises.

If the problem is that the people working in each of the boxes of the value stream perform several unrelated tasks, and more than one person works on those tasks in parallel, we can extend the concept of processing time to something we call “effective processing time”: the average time between two finished products leaving the process step, provided the step is well supplied (i.e. it does not have to wait for materials, input, etc.). We have a nice formula for the general case of several resources with differing efficiency, allocated to different tasks as well. The derivation of the formula, for those interested, can be found here:


and it looks like this:

PTeff = 1/(A1/P1 + A2/P2 + … + An/Pn)

where A1 is the fraction of time resource 1 works efficiently on the task related to our value stream, and P1 is the processing time: the time it takes resource 1 to accomplish the task, provided there are no interruptions during the task.

For example, imagine we have two maintenance engineers working on repairs. The first one is more experienced and averages 2 hours per repair; the less experienced colleague averages 2.5 hours. However, the experienced engineer also has to manage the suppliers of spare parts, which takes about 3 hours of his day; the less experienced one is dedicated to repair jobs only. What will be the effective processing time of the repair step?

Using the formula with P1 = 2 and P2 = 2.5: the first engineer can only work (8−3)/8 = 0.625 (62.5%) of his time on repairs, the less experienced one 100%. Putting it all together, the PTeff of the step will be

1/(0.625/2 + 1/2.5) = 1/0.7125 ≈ 1.4 hours.

So, roughly every 1.4 hours the team finishes a repair job. In a value stream map we could represent this step the same way as if we had one resource that could finish one repair every 1.4 hours. By applying the formula we eliminated the complexity generated by the unequal processing times, more than one resource in the process step, and the additional tasks not related to the value stream – basically eliminating the problems related to the first premise.
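The formula can be wrapped in a small helper function; the numbers below are the worked example from the text, with Python computing the senior engineer's availability (8 − 3)/8 = 0.625 exactly.

```python
# The effective-processing-time formula as a small helper: each resource i
# works a fraction A_i of its time on this step, taking P_i hours per unit
# when uninterrupted.
def effective_pt(resources):
    """resources: list of (A_i, P_i) pairs; returns PTeff."""
    return 1.0 / sum(a / p for a, p in resources)

# Worked example from the text: the senior engineer loses 3 of his 8 daily
# hours to managing spare-part suppliers.
a_senior = (8 - 3) / 8                        # 0.625
pt = effective_pt([(a_senior, 2.0), (1.0, 2.5)])
print(round(pt, 2))                           # roughly 1.4 hours per repair
```

A single fully dedicated resource reduces to the ordinary processing time: `effective_pt([(1.0, 2.0)])` gives 2.0 hours.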


The second premise also raises problems in practice. One of the most frequently heard comments when training lean, and especially value streams, is that this only applies to car manufacturing, exactly because they have very highly standardized processes. In cases where we have too many random influences, our analysis of the value stream will miss important effects, because we only concentrate on the average behaviour.

The problem is also that we have very little intuitive understanding of how a value stream behaves, and especially of how random effects influence a process. This would require a dynamic view of the value stream, and our mapping is essentially static. The way out of this has been known for a long time: build and analyse a simulation of the value stream. The problem is (or rather was) that simulation software used to be expensive and specialized. As far as I know, there was no standardized and cheap way of easily building a simulation.

This changed, as so much else in statistical analysis, with the advent of R (and, to be fair, Python as well). Today we have open source, widely used software that is a de facto standard for system simulations. This means that any value stream we build in the traditional static way can easily be transformed into a dynamic view. A dynamic view also means that we can build a much better intuition of what the value stream is doing, get a picture of the effects of random variations, and moreover answer hypothetical questions about how our value stream would change if we introduced specific changes in the process.

As an example I will take an interesting process proposed by an especially talented trainee group we work with: a visit to the doctor. The process has four steps, plus one recurring side task:

  1. The nurse receives the patient and prepares the patient file for the doctor
  2. The doctor examines the patient
  3. The nurse updates the patient file
  4. The doctor signs the documents
  5. During the day the nurse also has to answer random calls from patients booking future appointments.

As we can see, this is far from a complex process, yet it already violates both premises. In order to map the process we need to work out the effective processing times for each step, and for this we need some average values. In a real case these would have to be measured or estimated; for the sake of the analysis let us simply assume:

  1. Step 1 takes on average 2 minutes
  2. Step 2 takes 8 minutes
  3. Step 3 takes 4 minutes
  4. Step 4 takes 0.5 minutes
  5. A call takes on average 4 minutes, and one call arrives roughly every 10 minutes

We also assume one patient arriving every 9 minutes.

Using the formula from before we can calculate the effective processing times for the nurse. On Step 1 she can spend 2/(2+4+4) = 20% of her time. The processing time is 2 minutes, so the effective processing time is 2/0.2 = 10 minutes: on average she can prepare one patient file every 10 minutes. The effective processing time of the doctor is 8.5 minutes. The standard value stream analysis therefore tells us that patients will queue waiting for the nurse, while there will be no queue waiting for the doctor.
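The calculation above can be sketched in a few lines. This is a minimal Python stand-in (the article itself works in R); the helper function simply applies the share-of-time formula to the assumed averages from the list above.

```python
def effective_time(step_time, all_times):
    """Effective processing time of one step for a shared resource: the
    resource can only spend step_time / sum(all_times) of its time on
    this step, so the step behaves as if it took step_time / share."""
    share = step_time / sum(all_times)
    return step_time / share

# Nurse: Step 1 (2 min), Step 3 (4 min), calls (~4 min per patient cycle)
nurse_step1 = effective_time(2, [2, 4, 4])
print(nurse_step1)  # 10.0 minutes per patient file

# Doctor: exam (8 min) plus signing (0.5 min)
doctor = 8 + 0.5
print(doctor)  # 8.5 minutes per patient
```

Note that the effective time of a step always works out to the total time the resource spends per cycle, which is exactly the point of the formula: one shared resource behaves like a single slower step.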

Can we get a better view of what is going on by using a simulation? Using the R library simmer we can easily build one and check the queue length over a working day. If we do not consider any randomness, it will look like Case 1 on the graph.
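As a language-agnostic sketch of what such a simulation does (the article uses R's simmer), the deterministic Case 1 can be replayed with a simple event loop. It collapses the nurse's phases into one effective 10-minute step and the doctor's into 8.5 minutes, with one patient arriving every 9 minutes over a 480-minute day, so it is a simplification, not the full model.

```python
ARRIVAL, NURSE, DOCTOR = 9.0, 10.0, 8.5  # minutes, from the static analysis
DAY = 480.0                              # one working day in minutes

nurse_free = doctor_free = 0.0
nurse_waits, doctor_waits = [], []

t = 0.0
while t < DAY:
    start_nurse = max(t, nurse_free)          # wait if the nurse is busy
    nurse_waits.append(start_nurse - t)
    nurse_free = start_nurse + NURSE
    start_doc = max(nurse_free, doctor_free)  # wait if the doctor is busy
    doctor_waits.append(start_doc - nurse_free)
    doctor_free = start_doc + DOCTOR
    t += ARRIVAL

# The nurse falls behind by one minute per patient, so the waiting time
# grows steadily; the doctor never accumulates a queue in this sketch.
print(round(nurse_waits[-1], 1), round(max(doctor_waits), 1))
```

Running this shows the last patient of the day already waiting the better part of an hour for the nurse, while the doctor's queue stays empty, matching the static prediction.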


We can see that the real behaviour is a bit more complex than our static view. Even in the absence of random effects we might see a bit of a queue at the doctor, but essentially our view is correct: we see no build-up at the doctor but a steadily increasing queue at the nurse.

Now let us introduce some randomness into the process. To do this properly we would need more detailed information about the distribution of the processing times for each step – that would mean detailed measurements and a longer period of data collection. However, there is a quick and dirty way of introducing such assumptions into a simulation by using so-called triangular distributions. These are defined by three numbers: the minimum, the maximum and the most frequently occurring value (aka the mode). The shape of the distribution is triangular, so we will miss the finer details, but for a first impression the details are generally not that important, and they can be refined in later steps if necessary.
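Triangular distributions are available out of the box in most environments; here is a quick check in Python's standard library, using the doctor's assumed exam time from the next paragraph (minimum 5, maximum 14, mode 8 minutes). The mean of a triangular distribution is simply (min + mode + max) / 3.

```python
import random

random.seed(42)

# Sample a triangular distribution with min 5, max 14, mode 8.
# Its theoretical mean is (5 + 8 + 14) / 3 = 9 minutes.
samples = [random.triangular(5, 14, 8) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(round(mean, 1))  # close to 9.0
```

The same three numbers are all a workshop team needs to agree on, which is precisely what makes this distribution so handy for quick-and-dirty simulation assumptions.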

Let us take an example: assume the doctor's exam time varies according to the triangular distribution (5, 14, 8). The mean time is then 9 minutes per exam, with variation between 5 and 14 minutes. Let us also assume that the patients arrive randomly according to the distribution (5, 15, 8), that is, one patient every 9.33 minutes on average. To keep things simple, let us assume the nurse works in a standardized way, that is, her working time is constant in each phase with no variation.

Now that we have introduced some randomness into the simulation, we get a different picture at each run. One example can be seen in Case 2. Even though statically the doctor has time to spare, we see a build-up developing around mid-day at the doctor. This is purely bad luck: the doctor had a few patients who took longer and/or some arrived earlier than expected. This effect is hard to predict from the static value stream alone. The nurse is still overworked – she can finish one patient in 10 minutes (all phases considered) while patients arrive every 9.3 minutes, so by the end of the day we predictably have a queue in front of the nurse. The doctor, however, managed to eliminate the queue, which was to be expected over the longer term.


Just to illustrate how this analysis would continue, let us consider the idea of outsourcing the incoming calls and letting the nurse work only with the patients. The result is seen in Case 3. By applying the formulas we could have more or less predicted this result, but it is still nice to see the prediction realised.

Now there is room for further ideas. Obviously the new bottleneck is the doctor, so what would we need to do in order to reduce waiting times and queuing?

The above is just a simple example of combining a more detailed value stream analysis with simulations. But imagine the power of this method in a real workshop, where the people actually working on the process sit together with an analyst and a simulation, trying out ideas and hypotheses on the fly, coming up with new scenarios and quickly answering questions like the one above. This would be a whole new level of understanding of the processes we work with, and we should definitely apply the method as a new standard of digital value stream analysis.


Designed Experiment to Re-engage Silent Customers

In the spring I had the chance to work on a project with a very special problem. We had to convince the customers of an energy company to stay at home for a day so that the company could upgrade a meter in their home. The problem was special because the upgrade was mandated by government policy but offered few advantages to the customers.

Obviously this is a great challenge for the customer care organization – they need to contact as many customers as they can and convince them to take a day off and wait at home for the upgrade. The organization needs to send out huge numbers of messages in the hope that enough customers will react. This necessarily means that we also get a great number of so-called “silent customers” – people who decide not to react to our first message in any way.

As we obviously do not have an infinite number of customers to convince, silent customers have great value – at least they have not said no yet. The question is: how do we make them respond? If we learn how to activate at least some of them, we can use this knowledge in the first contact message and make our communication more effective.

The problem is of more general interest than this particular project – just think of NGOs who depend on donors. Learning how to make prospective donors more interested at the first contact has a very definite value for them as well.

So, how do we go about this? Coming from the Lean/Six Sigma world, our first idea was to actually LEARN what is of interest to the customers. Previously there had been many discussions and many hypotheses floating around, mostly based on personal experience and introspection. Some had already been tried, but none were really successful.

We changed the game by first admitting that we did not know what was of interest to our customer base – the customers had wildly differing demographic, age and income profiles, which made all these discussions quite difficult. Once we admit ignorance (not an easy thing to do, by the way), our task becomes much simpler. Instead of the many questions we used to debate, along the lines of “how do we interest hipsters or families with small children?”, just one question is left in the room: how do we learn what the customer preferences actually are? Coming from the Lean Six Sigma world there is just one answer: we run a designed experiment to find out.

It is important to realize that we ran the experiment to LEARN and not to improve anything. Confusing the two is an error in industrial settings as well, but in this project managing expectations was even more important. As we stuck to our goal of learning about the customer, designing the experiment became much simpler, because we avoided useless discussions about what would be beneficial and what not. Every time an objection came up about the possible usefulness of an experimental setting we could give our standard answer: we do not know, but if you are right, the experiment will prove it.

As we went on designing the experiment we realized that we only needed (and were only allowed) to use two factors: communication channels and message types. All the previously bothersome issues of age distribution, locality and such we solved by requiring large random samples across all these factors. Having large samples was, unlike in manufacturing, no problem at all: we could decide to send an email to one thousand customers or two thousand without any great difficulty or cost. As we were expecting weak effects anyway, large sample sizes were essential to the success of the experiment.

Finally we decided on the following: we used two communication channels, e-mail and SMS, and three message types. One message targeted the geeks by describing how much cooler the new meter is, one targeted the greens by describing how the new meters contribute to saving the environment, and one appealed to our natural laziness by describing how much easier the new meter will be to read. So, in the end we had a 2×3 design: two channels times three message types. And this is where our problems started.
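A full-factorial design like this is just the cross product of the factor levels. A tiny sketch (the short message labels are my shorthand for the three message types, not the actual texts we sent):

```python
from itertools import product

channels = ["e-mail", "SMS"]
messages = ["geek", "green", "lazy"]  # shorthand labels for the 3 types

# Full-factorial 2x3 design: every channel paired with every message.
design = list(product(channels, messages))
for channel, message in design:
    print(channel, message)
print(len(design))  # 6 experimental cells
```

Each of the six cells then receives its own large random sample of customers, which is what lets the demographic differences cancel out.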

Customer contacts differ from the settings on a complex machine in the sense that everybody has an opinion about them – and for the machine settings you do not need to talk to the legal and marketing departments before changing one. We had several weeks of difficult negotiations trying to convince every real or imagined stakeholder that what we intended to do would not harm the company – and at every level it would have been far easier to just give up than to trudge on. It is a tribute to the negotiation skills and commitment of our team members that we managed to actually run the experiment. I tend to think that this political hassle is the single greatest reason why we do not see more experiments done in customer-related businesses.

For 3 weeks we sent, every week, about 800 e-mails and about 300 SMSes per message type. We had several choices of how to measure the results. For the e-mails we could count how many customers actually clicked on the link to the company web-site, but for the SMSes it was only possible to see whether a customer booked an appointment or not. This was definitely not optimal, because we could not directly measure the efficiency of the messages except for the e-mails. To put it simply: whether a customer clicks on the link in the message is mostly influenced by the message content, while whether the customer books an appointment depends on many other factors. This is where randomization helps – with our sample sizes and randomization we could hope that these other factors would statistically cancel each other out, so that the effect of the message would still be visible, if a little more dimly.

Our results were finally worth the effort. A first learning was that basically no one reacted to the SMS messages. Looking back, this has a quite plausible explanation – our message directed the recipient to click on a link to the company web-site, and people are generally much more reluctant to open a web-site on a mobile phone than on a computer (at least that is what I think). Fact is, our SMSes were completely unsuccessful, despite being more expensive than the e-mails.

On the e-mails we had a response of 3.5–4% for the messages appealing to natural laziness, compared to less than 2% for the other message types. As the contacted people were silent customers, who had once already decided to ignore our message, getting about 4% of them to answer was a sizeable success. With the sample sizes we had, proving statistical significance was a no-brainer.
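To see why significance is a no-brainer at these sample sizes, here is a two-proportion z-test with numbers close to those in the text (roughly 2,400 e-mails per message type over the three weeks, 4% versus 2% response; the exact counts are my illustrative assumptions, not the project data):

```python
from math import sqrt

n1, x1 = 2400, 96   # "laziness" message: 96 clicks = 4%
n2, x2 = 2400, 48   # another message:    48 clicks = 2%

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)          # pooled proportion under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
print(round(z, 2))  # well above the 1.96 threshold for 5% significance
```

With a z-statistic around 4, the difference between the message types would be highly significant; in manufacturing, samples this large are a luxury.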

The fly in the ointment was that we failed to translate these clicks into confirmed appointments – we had basically the same, very low percentage of confirmations irrespective of channel or message type. Does this mean that our experiment failed to identify any possible improvement? At the risk of being self-defensive, I would say it does not. Making a binding confirmation depends on many factors outside the first priming message we were experimenting with. The content of the web-site our customers go to, to mention just one, should be in sync with the priming message, which was not the case here. So the experiment delivered valuable knowledge about how to make a customer come to our web-site, but not about how to convert that interest into a confirmed appointment – and that is OK. This was exactly what we set out to investigate. As mentioned before, managing expectations is a very important element here.

What would be the next steps? Obviously we would need to set up a new experiment to investigate which factors impact the customers' willingness to accept our offer. I am certain that this is what the team will do in the next phase – after all, we learned quite a lot about our customers with ridiculously low effort (excepting the negotiations), so why not keep on learning?

Theory of Constraints meets Big Data part 2

I would like to continue the story of the hunt for the constraint, using a lot of historical data and the invaluable expertise of the local team. There is a lot of hype around big data and data being the new oil – and there is also a lot of truth in it. However, I find that ultimately the success of a data mining operation depends on the intimate process knowledge of the team. The local team will generally not have the expertise to mine the data using the appropriate tools, which is absolutely fine, given that data mining is not their daily job. On the other hand, a data specialist will be absolutely blind to the fine points of the operation of the process – so cooperation is an absolute must to achieve results. The story of our hunt for the constraint illustrates this point nicely, in my opinion.

After having found proof that we had a bottleneck in the process, our task was to find it, or at least gain as much knowledge about its nature as possible. This might seem an easy task to hardcore ToC practitioners in manufacturing, where the constraint is generally a process step or even a physical entity such as a machine. In our process of 4 different regions, about 100 engineers per region, intricate long- and short-term planning and erratic customer behaviour, few of the known methods for finding the bottleneck seemed relevant. For starters, there was no shop-floor we could have visited and no WIP lying around giving us clues about the location of the bottleneck. The behaviour of all regions seemed quite similar, which pointed us in the direction of a systematic or policy constraint. I have read much about those, but a procedure for identifying one was sorely missing from my reading list.

So, we went back to our standard behaviour in process improvement: “when you do not know what to do, learn more about the process”. A hard-core lean practitioner would have instructed us to go to the Gemba, which, I have no doubt, would in time have provided us with adequate knowledge. But we did not have enough time, so our idea was to learn more about the process by building a model of it. This is nicely in line with the CRISP-DM methodology, and it was also our only possibility given the short period we had to complete the job.

The idea (or maybe I should call it a bet) was to build a well-behaved statistical model of the installation process and then check the residuals. If we have a constraint, we shall either be able to identify it with the model or (even better) we shall observe that the actual numbers are always below the model predictions and thus we can pinpoint where and how the bottleneck manifests itself.

Using the tidyverse (https://www.tidyverse.org/) packages from R it was easy to summarize the daily data into weekly averages. Then, taking the simplest approach, we built a linear regression model. After some tweaking and adjusting we came up with a model with four variables and an amazing adjusted R-squared of 96.5%. Such high R-squared values are in fact bad news in themselves – they are an almost certain sign of overfitting, that is, of a model tracking the data too faithfully and incorporating even random fluctuations. To test this, we used the model to predict the number of successful installs in Q1 2018. If we had overfitted the 2017 data, the 2018 predictions should be off the mark – God knows, there was enough random fluctuation in 2017 to lead the model astray.

But we were lucky – our predictions fit the new data to within +/- 5%. This meant that the fundamental process did not change between 2017 and 2018, and also that our model was good enough to be investigated for the bottleneck. Looking at the variables we saw that two of them had a large impact and were process related: the average number of jobs an operator is given per week, and the percentage of cases where the operator was given access to the meter by the customer. The first is a thinly disguised measure of the utilisation of our capacity, and the other a measure of the quality of our “raw material” – the customers. Looking at this with a process eye, we found a less than earth-shaking conclusion: for a high success rate we need high utilisation and high-quality raw materials.

Looking at the model in more detail we found another consequence – there were many different combinations of these two parameters that led to the same number of successes: low utilisation combined with high quality was just as successful as high utilisation combined with much lower quality. If we plotted the contour lines of equal numbers of successes we got, unsurprisingly, a set of parallel straight lines running from the lower left corner to the upper right corner of the graph. This delivered the message – again, not an earth-shaking discovery – that in order to increase the number of successes we need to increase utilisation AND quality at the same time.
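Why the contour lines are parallel straight lines follows directly from the linearity of the model: every (utilisation, quality) pair with the same value of a·u + b·q sits on one line. A tiny numeric sketch, with coefficients made up purely for illustration:

```python
# Hypothetical linear model: successes = a * utilisation + b * quality + c
a, b, c = 40.0, 120.0, 5.0

def successes(utilisation, quality):
    return a * utilisation + b * quality + c

# Two very different operating points on the same iso-success line:
# trading 30 points of utilisation against 10 points of quality
# (0.3 * 40 == 0.1 * 120) leaves the predicted successes unchanged.
low_util_high_qual = successes(0.60, 0.80)
high_util_low_qual = successes(0.90, 0.70)
print(round(low_util_high_qual, 6), round(high_util_low_qual, 6))
```

Sliding along such a line trades utilisation against quality without gaining a single success, which is exactly the pattern the 2017 data turned out to follow.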

To me the real surprise came when we plotted the weekly data from 2017 over this graph of parallel lines – and it was a jaw-dropping one. All weekly performance data for the whole of 2017 (and 2018) moved parallel to one of the constant-success lines. This meant that all the different improvements and ideas tried during the whole year were either improving utilisation while reducing quality, or improving quality while reducing utilisation – sliding up and down along a line of a constant number of successes (see attached graph).

This is a clear case of a policy constraint – there is no physical law forcing the process to move along that single line (well, two lines actually), but there is something that forces the company to stay there. As long as the policies keep the operation on this one line (or two), it will look exactly like a physical constraint.

This is about the most we can achieve with data analysis. The job is not yet done – the most important step is now for the local team to identify the policy constraint and to move the company from sliding parallel to the constant-success lines to a mode where it moves perpendicular to them. We can provide the data, the models and the graphs, but now we need passion, convincing power and commitment – and this is the way data mining can actually deliver on the hype. In the end it is about people able and willing to change the way a company operates, and about the company empowering them to investigate, draw conclusions and implement the right changes. So, business as usual in the process improvement world.

(Graph: historical 2017 data by week.)


Theory of Constraints meets Big Data

The Theory of Constraints is the oldest and probably the simplest (and most logical) of the great process optimization methodologies. One must also add that it is probably the most difficult to sell nowadays, as everybody has already heard about it and is also convinced that it is not applicable to their particular operation. Most often we hear the remark “we have dynamic constraints”, meaning that the constraint is randomly moving from one place in the process to another. Given that the ToC postulates one fixed constraint in any process, the method is clearly not applicable to such complex operations.

This is an easily refutable argument, though it undoubtedly points to a missing link in the original theory: if there is too much random variation in the process steps, this variation will generate fake bottlenecks that seem to move unpredictably from one part of the process to another. Obviously, we need a more standardized process with less variation in the steps to even recognize where the true bottleneck is, and this leads us directly to Lean with its emphasis on Mura reduction (no typo: Mura is the excessive variation in a process, recognized as just as bad as its better-known counterpart Muda). This probably eliminates, or at least reduces, the need to directly apply the Theory of Constraints as a first step.

There are other situations as well. Recently I was working for a large utilities company on a project where they needed to gain access to their customers' homes to execute an upgrade of a meter – an obligation prescribed by law. So, the process starts with convincing customers to grant access to their site and actually be present during the upgrade, then allocating the job to an operator with sufficient technical knowledge, getting the operator to the site on time and executing the necessary work. There is a lot of locality- and time-based variation in this process – different regions have different demographics that react differently to the request for access, people tend to be more willing to grant access outside working hours, but not too late in the day, and so on.


On the other hand, this process looks like a textbook example of the Theory of Constraints: we have a clear goal defined by the law, to upgrade X meters in two years. Given a clear goal, the next question is: what is keeping us from reaching it? Whatever we identify here will be our bottleneck, and once the bottleneck is identified we can apply the famous 5 improvement steps of the ToC:

1. Identify the constraint

2. Exploit the constraint

3. Subordinate all processes to the constraint

4. Elevate the constraint

5. Go back to step 1

In a traditional, very much silo-based organization, steps 1-3 would already be very valuable. By observing the processes in their actual state we saw that each silo was working hard on improving its own part of the process. We literally had tens of uncoordinated improvement initiatives per silo, all trying their best to move closer to the goal. The problem with this understandable approach is nicely summarized in the ToC principle: any improvement at a non-constraint is nothing but an illusion. As long as we do not know where the bottleneck is, running around starting improvement projects will be a satisfying but vain activity. It is clearly a difficult message to send concerned managers that their efforts are mostly generating illusions, but I believe this is a necessary first step in getting to a culture of process (as opposed to silo) management.

The obvious first requirement, then, is to find the bottleneck. In a production environment we would most probably start with a standardization initiative to eliminate the Mura – to clear the smoke-screen that does not allow us to see. But what can we do in a geographically and organizationally diverse, huge organization? In this case our lucky break was that the organization already collected huge amounts of data – and this is where my second theme, “big data”, comes in. One of the advantages of having a lot of data points – several hundred per region per month – is that smaller individual random variations even out, and even in the presence of Mura we might be able to see the most important patterns.

In this case the first basic question was: “do we have a bottleneck?” This might seem funny to someone steeped in ToC, but in practice people need positive proof that a bottleneck exists in their process – or, to put it differently, that the ToC concepts are applicable. Having a large and varied dataset, we could start with several rounds of exploratory data analysis to find the signature of the bottleneck. Exploratory data analysis means that we run through many cycles of looking at the process in detail, setting up a hypothesis, trying to find proof of the hypothesis and repeating the cycle. The proof is at the beginning mostly of a graphical nature – in short, we try to find a representation that tells the story in an easy-to-interpret way, without worrying too much about statistical significance.

In order to run these cycles there are a few pre-requisites in terms of people and tools. We need some team members who know the processes deeply and are not caught in the traditional silo-thinking. They should also be open and able to interpret and translate the graphs for the benefit of others. We also need at least one team member who can handle the data analysis part – has a good knowledge of the different graphical possibilities and has experience with telling a story through data. And finally we need the right tools to do the work.

In terms of tools I have found Excel singularly ill-suited to this task – it handles several hundred thousand lines badly (loading, saving and searching all take ages) and its graphical capabilities are poor and cumbersome. For a task like this I use R with the tidyverse packages and of course the ggplot2 graphics library. This is a very handy and fast environment – using pipes with a few well-chosen filtering and processing functions, and directing the output straight into the ggplot graphics system, allows the generation of hypotheses and publication-quality graphs on the fly during a discussion with the process experts. It does have its charm to have a process expert announce a hypothesis and to show them a high-quality graph testing it within one or two minutes of the announcement. It is also the only practical way to proceed in such a case.

Most of the hypotheses and graphs end up on the dung-heap of history, but some will not. They become the proofs that we do have a bottleneck, and bring us closer to identifying it. Once we are close enough we can take the second step of the exploratory data analysis and complete a first CRISP-DM cycle (https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining) by building a statistical model and generating predictions. If we are lucky, our predictions will overestimate our performance in terms of the goal – thus pointing towards a limiting factor (aka a bottleneck), because we achieve LESS than would be expected based on the model. From here we try some new, more concrete hypotheses, generate new graphs and models, and see how close we get to the bottleneck.

So, where are we in real life today? In this concrete example we are through the first cycle, and our latest model, though overoptimistic, predicts the performance towards the goal to within about -10%. We are in the second iteration now, trying to find the last piece of the puzzle that will give us the full picture – and of course we already have a few hypotheses.

In conclusion, I think that the oldest and most venerable process optimization methodology might get a new infusion of life by adopting the most modern and up-to-date one. This is a development to watch, and I will definitely keep my fingers crossed.