Sustainable Lean Initiatives and the Law of Organizational Behavior

One of the many upsides of being on vacation is that one can step back a bit from current work, reflect, and look at issues from a greater distance and a perhaps more theoretical point of view. With a bit of luck, one can do this in some non-conventional and possibly more pleasant surroundings than the normal office environment.

This is how, this week, I spent a lot of time immersed in the nicely cool waters of the Danube, thinking about discussions, tensions and possible strategies related to one of our customers and their large-scale lean implementation in several countries. The relaxed atmosphere, or the cool water, or maybe both, somehow made me come up with something I would now call the fundamental law of organizational behavior (pompous, I know).

The law can be stated in the following way: over the medium or long term, members of an organization will behave in the way that is easiest for them. I know it does not seem like a big discovery, but it does have some nice consequences for any process improvement we try to make. (Also, as a disclaimer: I am pretty sure I am not the first to discover this; my only claim is that I discovered it independently of others.)

If you have ever read forums of people involved in process improvement (Lean or Six Sigma or other), the recurrent theme in all of them is the complaint about the lack of management support. Innumerable times have we all said that this or that initiative would have succeeded if only management (especially middle management) had supported us better. Now, with the Law, this problem suddenly becomes much clearer. People will behave in the way that is easiest for them. In many cases our process improvements are not designed to make life easier for the process workers – they are designed to make the process more efficient, less costly and so on, but not easier. So, in accordance with the Law, we need middle management to supervise the new process and make sure that not respecting it will make life definitely harder for those involved in it.

Let us take 5S as an example: from the point of view of a factory worker, maintaining order and cleanliness for the sake of an abstract 5S process defined at headquarters is definitely NOT at the top of their priorities. So, we expect middle management to regularly inspect the premises and to make enough fuss about it that it becomes easier to adhere to the process than not to.

But the Law applies to middle management as well – and not creating unnecessary conflicts with the workforce over secondary topics (after all, production targets are the important ones) definitely makes THEIR life easier. And so, the Lean implementer is left with the ruins of a 5S initiative and the complaint about the lack of management support. “If only they had followed through with the 5S audits – we could have been sooo successful.”

I just picked 5S as the most frequently occurring debacle, but the same reasoning applies to SMED initiatives, team boards, regular lean meetings or any other improvement we might think of. The Law simply says: if it does not visibly make life easier for the people involved, the initiative will die.

So, sitting in the cool (and blue) Danube, the next question was obvious: what can we do about this? Is there a way to make an initiative sustainable?

I think the answer lies in the methodology developed by Mike Rother in his Toyota Kata books. He very boldly states that the objective of Lean initiatives should not be the implementation of any combination of Lean tools. What we should aim for is to teach a way of thinking about problems and of continuously solving them. The vehicle for this is the definition of a target condition – a state of the process we want to achieve – and a method of continuously building our way towards this target, together with all involved process workers and managers. (This is the time-honored PDCA, by the way.) There are many possible target conditions that can be defined, but once we have consensus on our target, we have the method and the way of thinking to guarantee (which is a big word here) that we will, with time, get ever closer to it.

Now, to bring the Law of Organizational Behavior to act in our favor, all we have to do is define the target condition PRIMARILY as a process that is easier for the people involved than the process we started with. This thought has been hidden in the original ideas of the Lean pioneers – the concepts of Mura and Muri point very much in this direction, or I could also cite Taiichi Ohno here: “Why not make the work easier and more interesting so that people do not have to sweat? The Toyota style is not to create results by working hard. It is a system that says there is no limit to people’s creativity. People don’t go to Toyota to ‘work’, they go there to ‘think’.”

However, in selling Lean we were so focused on measurable process improvement – percent setup time reduction, lead time reduction, OEE improvement and so on – that we neglected this part of the equation.

Do I mean that we should turn Lean into a feel-good initiative with no regard for hard-core benefits? Definitely not – if we do not improve the processes in a quantifiable way, then we have no justification for even being present in an organization. Instead, I am arguing the other side of the coin – we definitely should aim for those improvements, but we have to be fully aware that unless we also make the process easier for the process workers, all our improvements will be ephemeral illusions.

So, to come back to our 5S example – how could we make the improvement stick? I would say we should start by identifying the process workers’ needs – what do they need to make their lives easier, less stressful, even more pleasant? Tools that get lost or even stolen? Explain the concept of shadow boards. A broken window on the shop floor? Fix it as the first step of the new 5S implementation and define a process so that next time nobody has to wait a month before a window is fixed. Gloves missing? Buy some and define a process (Kanban, for instance) to make ordering new gloves immediate. Is it stressful to have a big boss come to a 5S inspection every month? Introduce self-audits and coach the boss to only audit the audits – you get the idea.

This way it is possible to turn an initiative that people see as an additional burden into something they will all be proud of. And when a delegation from a different machine comes asking management to introduce 5S in their area as well, you will know that you have succeeded and that the Law is now on your side. And after all, this is all we need to make the initiative a success.


The Glass Drinking Horn and Remote Trainings

A few years ago I saw the funniest object in the British Museum – a drinking horn made of glass. It was a perfect piece – the artisan had absolutely correctly reproduced the horn of an ox, shape, curvature and all. No doubt it was a great achievement and a highly prized object in the collection of some Saxon ruler in the 7th century. I can – as an avid reader/watcher of The Last Kingdom series of books – vividly imagine how the king raised the glass horn to toast Odin while all the others only had their simple, natural horns to drink from.

Why do I find this funny, you might ask? Well, today we know full well that this piece is a complete misuse of technology. Oxen grow their horns in a natural way, and there is not much anyone can do about their shape. Glass, in contrast, is much more flexible; we can design pretty much any shape we or our customers want. A flat bottom, for example, might appeal to some, so they can conveniently place the glass on the table. We can make it larger or smaller, cylindrical or more rounded, and so on and on… But, obviously, the artisans making this horn did not realize the opportunities offered by the new technology; they just kept to business as usual even though they could have used the opportunity to make something better.

The image came to me as we were discussing ways to design and develop new, remote trainings. Filming a trainer in front of a flipchart and putting the film on a webpage à la YouTube is the equivalent of a glass drinking horn. We might even think that we do this because our customers want it this way, just as the Saxon king no doubt ordered a glass drinking horn. But our job as training specialists is to embrace the new technology, understand it and provide a wealth of new alternatives to our customers. We, like our customers, are hindered in this by our past years spent in various classrooms, as children, students and later trainers. The format we inherited, like the drinking horn, is “what we have always been doing”, for the last two millennia at least. Teacher up front at a flipchart, more or less bored trainees on the benches receiving wisdom from the mouth of the teacher – this is how it should be, right? Why would we want to change this venerable format just because the pupils sit in front of a screen instead of on the benches?

It turns out there are many reasons why we should change. The old format is not like this because it is in any way optimal; it is like this because there were no alternatives to it from the times of Aristotle’s Lyceum to the advent of the Internet. Now that we have the means to do things differently and a strong incentive to change, we should be bold enough to look for alternatives.

Take interactivity and cooperation as an example. Customers often tell us that they would still like to have a traditional training, at some point after corona, because they value the exchange and cooperation between the participants very highly. This is a very natural wish, but the logic is a bit flawed. It is by no means guaranteed that the best way to exchange ideas and to cooperate is by sitting for hours in a classroom listening to a teacher. I still remember the shock I felt about 15 years (!) ago when I first saw my child go on a dragon hunt with a team of Swedes, Americans, French and one child from South Africa. For those untrained in World of Warcraft – a dragon hunt is a very challenging activity against a powerful opponent, requiring fine-tuned, real-time cooperation between participants of different skill levels and skill sets, where every mistake can be – well, sort of deadly, throwing the whole team back to the starting position. Do you think those kids could have done better if they had sat in a classroom, even assuming that were possible?

If we want the participants in our training to practice cooperation and the matching skills, should we flood them with theory and give them maybe one opportunity to practice in a classroom exercise – or can we design our training to be a dragon-hunting experience for the whole session? Which one would you prefer as a customer? Even more importantly – how will our trainees exercise those skills in real life? Getting everybody into the same room to solve a problem is a nice concept, but generally not practicable due to cost and time constraints in a modern environment. So, why not train in a realistic environment, which is increasingly the remote one?

But we have no dragons to hunt, one might object. Are we sure? Precisely in our domain, can we not turn the identification of a bottleneck into a dragon hunt for our team? The identification of the 8 types of waste into a detective game? Finding the significant Xs into an enjoyable cooperative experience? I know, not think, that this is possible and highly desirable. It is not as if we need to invent everything from scratch – gamification and programmed education are already known concepts. We need to embrace them more, and of course we need to change our mindset. We should not be the teachers of old, standing in front of the class and enunciating deep truths – our role should be a cross between a coach and a wilderness guide. We should take a team and help them navigate a treacherous environment until we all successfully reach our goal at the end of the training session. We also need to be the architects who build those environments for our customers in the first place.

Customizing the “dragon” and the environment is an added opportunity in this new world. We should not just sell a more or less generic piece of knowledge – we should sell an experience: the experience of learning new things and successfully applying them in a team to solve real-life (or very similar) problems that are relevant to our customers. This means that one size will no longer fit all; we have to have a flexible way to quickly build customer-specific simulations and games that will be used maybe just once and then thrown away – but will, for that one time, be decisive in the success of a training.

This is a whole new world of designing experiences and addressing customer needs in a completely new way. It is also, as I see it, far closer to what our customers really need than the old classroom-style training. We have the methods and we have the knowledge to use them – and with the corona epidemic we also have the opportunity to prove the concept works – this is really the perfect opportunity to improve and to better serve our customers. After the lockdown we shall look back at these times and wonder: why did we not change earlier?

What do you think works for your lean initiative?

The title is a bit misleading, as I do not want to talk about our experience of which factors make an initiative a success. I would rather show a method for measuring what the employees of different organizations think makes an initiative successful. And when I say think, I really mean what they believe, not what they would say if asked this question directly.

The idea is to apply a statistical method employed in psychology (and other fields as well) to evaluate a carefully designed survey, quantifying the strength of the respondents’ beliefs about the links between different factors and our outcome variable. This sounds unnecessarily complicated, so let me give a concrete example.

Imagine we want to measure which factors people think determine the success of a lean initiative. Just asking this simple question is not very helpful, because there is a near certainty that we will get all kinds of incommensurable answers – everybody understands “success” differently, and the factors mentioned will also be wildly different, from the very concrete to the philosophical. So, obviously, we will have to ask more specific questions.

Concerning success, for example, we might have our own definition of what a successful lean initiative looks like – many Kaizen events, Obeya rooms regularly visited, waste elimination, 5S in place etc. All these elements REFLECT the status of the initiative, meaning that if we ask questions about them in a successful lean company, we will generally get high scores, and vice versa. We do not need to capture ALL that defines a successful implementation, though; we just need to be reasonably sure that all successful companies will have reasonably high scores on most of our questions.

To put it in more technical terms, we assume that there is a hidden variable out there at the company that we can call Maturity of Lean. We cannot measure this directly; indeed, we do not even have an operational definition of it. However, we can ask questions that we expect will reflect the state of our Maturity of Lean hidden variable, a bit like looking at mosaic stones and trying to figure out the whole picture. To get this information, we define one question in our survey for each aspect of Maturity we came up with.

Using the same logic, we can assume that there is another hidden variable at the company called Management Commitment, another called Tool Proficiency, and so on. To get an idea of the state of each of these, we design several questions that we believe will reflect the status of the hidden variable. For Tool Proficiency, for example, we might decide to ask for the number of successful Kaizen events, the number of employees involved, the amount of money saved, the number of areas with visual management present and so on. In the same way we may define a number of questions around Management Commitment, and so on.

As a side issue, wherever possible, we should use Likert scales for the answers to facilitate the analysis.

Now, once we have collected the answers, we will want to analyse the relationships between these hidden variables and their effect on our similarly hidden outcome variable. In principle, it would be possible to analyse more detailed effects, like the impact of 5S on the number of Kaizen events, but this means building a large number of correlations (one between each success component and each influencing factor), which is statistically unsound unless we apply some correction (like the Bonferroni correction, if we want to keep it simple), and it is really also way too detailed. Another problem is that many of our independent variables (the answers to our questions) will be correlated, which makes a traditional regression analysis very difficult.
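To make the multiple-testing problem concrete, here is a minimal sketch of the Bonferroni correction in plain Python; the p-values are invented purely for illustration.

```python
# Sketch: many pairwise tests inflate the chance of false positives.
# The Bonferroni correction tests each p-value against alpha / number_of_tests,
# which keeps the family-wise error rate at alpha. The p-values are made up.

alpha = 0.05
p_values = [0.004, 0.020, 0.030, 0.045, 0.120, 0.300]  # one per tested pair

# Without correction, four of the six tests look "significant" ...
naive_hits = [p for p in p_values if p < alpha]

# ... but after Bonferroni only the strongest one survives.
threshold = alpha / len(p_values)  # 0.05 / 6, roughly 0.0083
corrected_hits = [p for p in p_values if p < threshold]

print(len(naive_hits), "significant without correction")   # 4
print(len(corrected_hits), "significant with Bonferroni")  # 1
```

The price of the correction is power: with dozens of pairwise tests, the per-test threshold becomes so strict that real effects are easily missed, which is one more reason to work at the level of a few hidden variables instead.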

Anyway, the real questions are at the level of the hidden variables – e.g. does Tool Proficiency contribute to the success of the initiative, and if yes, how strong is its effect? Once we get answers at this level, we can go one step deeper and analyse the contribution of each component to the outcome, like: does 5S contribute strongly to Tool Proficiency, or is it Visual Management? And the like.

The statistical method to analyse our survey is called PLS-SEM (Partial Least Squares – Structural Equation Modelling). Without delving into the mathematics, it has essentially three steps.

  1. We describe which survey questions relate to which hidden variable. As we designed the survey with hidden variables in mind, this will not be a difficult exercise. Based on this information, the system will optimally construct the hidden synthetic variables as linear combinations of the respective inputs. That is roughly the PLS part of the method.
  2. We state some broad assumptions about which hidden variable can impact which other hidden variable. E.g. we can assume that Management Commitment has an impact on Tool Proficiency, or that Leadership has an impact on Management Commitment and also on Tool Proficiency, etc. Based on these assumptions, the system will calculate the strength of the influence of one hidden variable on another – that is the SEM part.

With these two elements we can run the PLS-SEM model, and then comes the third step, the interpretation of the model. Here we can quality-check the structure of our hidden variables and see whether we picked the categories correctly. Then, if all our hidden variables are correctly built, we can check the model as if it were a normal multiple regression and reduce it based on the p-values in the usual way. What we end up with is a statistically sound, high-level description (or model) of what the survey participants think of such a complex issue as the success of a lean initiative.
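To give a feel for what happens under the hood, here is a deliberately simplified stand-in for the two modelling steps: each latent variable is scored as the standardized mean of its questions (real PLS-SEM iterates these outer weights instead of fixing them), and the path strengths are then estimated by ordinary regression. All data, latent names and effect sizes are invented for the sketch.

```python
# Simplified PLS-SEM sketch: outer model = equal-weight composites of the
# survey questions, inner model = ordinary regressions between composites.
# Simulated Likert data; the causal chain Commitment -> Proficiency -> Maturity
# is built into the simulation so the recovered paths come out positive.
import numpy as np

rng = np.random.default_rng(0)
n = 200  # respondents

# Simulated Likert answers (1..7): three questions per latent variable.
commitment_q = rng.integers(1, 8, size=(n, 3)).astype(float)
proficiency_q = np.clip(commitment_q.mean(axis=1, keepdims=True)
                        + rng.normal(0, 1.5, size=(n, 3)), 1, 7)
maturity_q = np.clip(proficiency_q.mean(axis=1, keepdims=True)
                     + rng.normal(0, 1.5, size=(n, 3)), 1, 7)

def latent_score(block):
    """Outer model (simplified): equal-weight composite, standardized."""
    s = block.mean(axis=1)
    return (s - s.mean()) / s.std()

commitment = latent_score(commitment_q)
proficiency = latent_score(proficiency_q)
maturity = latent_score(maturity_q)

def path_coef(x, y):
    """Inner model: slope of y on standardized x (simple OLS)."""
    return float(np.polyfit(x, y, 1)[0])

print("Commitment -> Tool Proficiency:", round(path_coef(commitment, proficiency), 2))
print("Tool Proficiency -> Maturity:  ", round(path_coef(proficiency, maturity), 2))
```

A real analysis would use a dedicated PLS path-modelling package, which also reports the indicator loadings needed for the quality check of step three; the point here is only the two-layer structure of the model.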

The method is not limited to survey evaluation, though. In any industry where we have some customer requirement that can be described by several measured values, we can apply the idea of hidden variables that are reflected by these measured values. Indeed, my first exposure to the method came at a time when we worked for a number of companies manufacturing paint. One characteristic of paint that is of interest to customers is how “shiny” it is. This is measured by a lab instrument at several angles, say at 20, 40, 60 and 80 degrees. We can model the customer requirement as the hidden variable “Shine”, reflected by the measurement values at the different angles. Then we can use the hidden variable as the Y in a regression model whose inputs can be manufacturing parameters, recipe components and the like, giving us a lot of insight into the ways we can improve our process. The same logic can be applied in many places in the food industry as well.
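The paint example can be sketched the same way: extract the hidden Shine as the first principal component of the gloss readings, then regress it on a process parameter. Everything below (the resin ratio, the per-angle sensitivities, the noise levels) is a made-up illustration, not real paint data.

```python
# Sketch: gloss is measured at 20/40/60/80 degrees; all four readings reflect
# one hidden "Shine" variable, extracted here as the first principal component.
# The resin ratio and the angle sensitivities are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 150

resin_ratio = rng.uniform(0.2, 0.8, n)       # hypothetical recipe parameter
true_shine = 5 * resin_ratio + rng.normal(0, 0.2, n)

angle_sensitivity = np.array([1.0, 0.9, 0.7, 0.5])  # one weight per angle
gloss = true_shine[:, None] * angle_sensitivity + rng.normal(0, 0.1, (n, 4))

# First principal component of the centered readings = estimated Shine.
centered = gloss - gloss.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
shine = centered @ vt[0]
# SVD leaves the sign arbitrary; orient Shine to follow the 20-degree reading.
shine *= np.sign(np.corrcoef(shine, centered[:, 0])[0, 1])

# Ordinary regression of the latent Shine on the process parameter.
slope = np.polyfit(resin_ratio, shine, 1)[0]
print("Effect of resin ratio on Shine:", round(float(slope), 2))
```

With the latent Shine in hand, any number of manufacturing parameters can be added to the regression, which is exactly the insight step described above.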

In summary, we know that the most important step in a process improvement is the accurate capture of the voice of the customer. As Six Sigma has a reputation of relying strongly on measurements and statistics, adopting a method that links our strength to our aspiration is definitely something we should do, and do more often.

Lean and Building a Robot

I am not thinking of actual physical robots here, like the ones building cars in a Toyota plant, though that would be a very interesting topic in itself. Rather, I would like to share some experiences I had building software robots of RPA fame at an organisation that is at the start of its process excellence journey, with or without robots.

This is a very interesting experiment, given the ongoing debates concerning the right way to choose in such a situation.

Do we have to implement pure Lean first, as advocated by the purists and of course by almost all traditional lean consultancies? There is a famous lean precept saying that automated waste is still waste, maybe even worse than non-automated waste. This would mean that we should postpone implementing RPA until the processes are optimized and waste mostly eliminated, then run an RPA initiative to gain the best of both worlds: sleek lean processes that also run for the most part automatically with the help of SW robots.

There is an alternative way to look at this situation, of course. You might say that applying lean to a crusty old process that has resisted optimization for many years is a waste of effort. Just pick the parts that lend themselves to automation, implement robots there and reap the benefits. Naturally, this is the view of the RPA vendors. No expertise is needed other than the ability to implement robots on the particular platform, and benefits will start pouring in a few weeks after the end of the implementation, without any need to worry about obscure things like “cultural change” or “value streams”.

I am in a particular position, having been both a SW developer and a lean consultant for basically the same (long) amount of time. Still, having had my share of failures and successes in process improvement (and software development), I naturally lean towards the second view – implement the robots quickly and either get the benefits or quickly discover the mistakes we made in the process. This is, funnily enough, the view of the people who tried to apply lean principles (or just common sense) to software development and came up with the Agile methodology in the ’90s.

So, to come to the real story: I showed up at the organization in question for one week of voluntary work (they are a well-known NGO) to start them on the path of robotic process automation. I was very much focused on getting at least one robot functional and useful by the end of the week – being naïve and hopelessly optimistic, as most developers are, I actually promised five. I also completely neglected any preparation work, such as creating process maps and value streams – this time on purpose – thinking that we could not make a process map at the level of each keypress and mouse click, the way I needed for the robot, and also because my host organization would not have had the time to prepare such maps even at higher levels of detail.

The first hurdle was to find the first (and, as it turned out, only) process step to automate. Here a detailed process landscape and process maps would have helped, but lacking those, we picked, sensibly enough, something for which the process worker was available and which the team deemed sufficiently important. (Of course, during the robot development we did have doubts about whether we had made the right choice, but by then it was too late to change.)

The second hurdle was that I completely underestimated the impact of the non-standard environment in which the robot had to work. As an NGO, they had several IT solutions, all delivered by enthusiasts or volunteers, that were well suited for human operators, who never care about window tags and such, but a lot less standardised in behaviour than a system built from standard Windows components. This made the development work a lot more strenuous, as I had no way of knowing whether some trick I needed would even be possible within this framework, like going through a list of push-buttons and clicking them automatically in a given order. On the other hand, this is exactly where an RPA system becomes so much better than any traditional macro – and we often hear that RPA is nothing BUT a better-developed macro. Well, that “better-developed” part, and the possibility to manipulate window descriptors, may be a small step as programming goes, but it is a huge step for an implementer who is faced with an unknown system with poorly (or not at all) specified interfaces and a tight deadline. The RPA paradigm means in these cases the difference between “impossible” and “tough but doable”.

So, not without some hassle, and way later than I had planned, I managed to build something that was, as I thought, a working and useful robot. Then, of course, came the third hurdle: it turned out that the process, as described by the colleague I was working with on the specification of the robot, was by no means the only way, or even the accepted way, of performing that process step. In short – my robot turned out to be not useful at all until the team had worked out the right way of doing that step. So this is where, you could think, the lack of correct Lean methods came back to bite us. Had we held a Lean workshop first, we would have discovered the discrepancies and the lack of standard work before wasting time on the robot, and we would have fixed the problem BEFORE we started programming. Well… this is true, in a way. In an ideal world the procedure should definitely have been like this: run a Lean workshop, build a process map, agree on standard work, implement the robot for the standard version.

But I have my doubts. We do not live in this ideal world, and NGOs in particular have neither the time and money nor the motivation to spend a lot of effort building process maps. The colleagues I was talking to honestly believed that they had only one standard way of doing the process I tried to automate. Not because they were ignorant, but because they have many other priorities, and as long as the process functions they have better ways of spending their time (and overtime – I basically did not see an 8-hour day during my time with them). Building the robot changes the optics – suddenly the process we are looking at will be performed by a mindless entity, which does not have the intelligence to discover and correct deviations from the standard. It will definitely create chaos instead, so if we want (or need) to apply robots, we must make sure that these deviations no longer happen.

But this means that the strong motivation to look at processes and to standardize them comes AFTER the first robots are built. It is, in my opinion, next to impossible to instill this motivation as long as humans still perform even the most boring process steps.

So, was my week of work a failure? I do not think so. We all learned – and the company I worked for definitely learned that robots will only be as good as the standardization of the process steps we apply them to. As a Toyota manager I once had the chance to interview stubbornly repeated: “No standard, no improvement”. This is even more true and important in the new era of Lean Robots than it used to be.

Now, to close – did I learn that we need a complete Lean Transformation before we implement robots? No, I still do not believe that. However, being careful pays off. I would still build a robot first, but I would emphasize a lot more (as befits any Agile methodology) that we are building only version 0 – the first, minimally useful product. THAT I would try to build even faster than I did in this case. The moment this version is ready, the reality of what a robot can and will do hits home with the team – and then our users, customers and developers will raise a lot of legitimate objections and discover errors, which, with any luck, will ensure that the next version is something almost useful, which will in turn generate a new wave of tests, objections and such, and so the cycle moves on towards something legitimately useful. I strongly believe that this is the practical way forward with RPA – building more and more useful robots and learning at each step of the way.

Robotic Process Automation versus Software Development – the Case for Agile RPA

Recently I had the great opportunity to learn and then to apply RPA in a real-life project. I think the method is nothing short of revolutionary and will definitely change the way we think about process improvements and, more importantly, how we actually improve processes. I also plan to write several blog posts about my experiences, as they will probably be useful, or at least entertaining, to those who embark on the same path.

I spent about 15 years of my professional life as a software developer at various big companies like Siemens and GE, so I was naturally intrigued (well, the actual term should be more like pissed off) by the advertisements of several RPA providers, who all claim that scripting robots is NOT software development – it is in fact something anyone can do without previous knowledge of programming.

This claim is based on the fact that RPA systems allow one to capture user activities and replay them, so it is indeed possible to develop a script without having to “program”. Unfortunately, there are some basic problems with this view.

The first is the confusion between “typing code” and programming. The idea that clicking on icons and dragging them to various places on a screen is somehow different from, and easier than, actually writing a program looks very tempting at first sight. After all, programmers spend most of their time typing gibberish that only they understand, while business analysts and managers build presentations using pretty pictures – or so the stereotypes go. So, if we can do away with the gibberish (and with the typing), then probably business analysts and even managers will be able to produce robots, and we will need no expensive programmers whom no one understands anyway – this feels like a very tempting proposition.

Unfortunately, this idea is very far from true. At the risk of stating the obvious – programming is not about typing. Programming means describing procedures in a highly structured way, so that no other knowledge of the kind humans have is necessary to execute them, and, even more importantly, keeping these descriptions (aka “programs”) in a state where they can be updated, modified and generally used by people who did not participate in the development. This is absolutely necessary, as programmers tend to wander off to other projects, customers discover new needs, and bugs raise their ugly heads. Programs are never finished in the sense that bridges, for instance, are, so they need constant care, which would be next to impossible if the program were not developed with this in mind.

In Software Engineering terms, this means that programming is mostly concerned with the famous “ibilities” – usability, reliability and maintainability. Usability means developing a program (or script) that the customers find useful and are willing to pay for; reliability, that the program runs correctly most of the time; and maintainability, that the developed code is easy to understand and to modify without the risk of breaking it by introducing changes that have unforeseen effects.

Achieving these goals has absolutely nothing to do with the way a program is developed – by typing text or by assembling pretty little pictures. In this sense the message “RPA is absolutely not like programming” is wrong and probably dangerous. My uncomfortable feeling is that many companies will translate this marketing message into something like “we can now finally forget all the lessons Software Engineering learned in the last 50 years and just work spontaneously as we see fit, because THIS IS NOT PROGRAMMING”. I suspect the first lesson to be forgotten will be Agile development.

To be fair, Agile has already taken some hits due to the hype that has surrounded it for some years now. Most companies have developed an uneasy relationship with the concept, which makes it all the riskier to take the position that RPA initiatives absolutely need Agile as their development methodology. Let me show why this is a must, using the “ibilities” as examples:


How do we make sure we develop robots that people find useful? The only way I can imagine is to send the RPA developer to sit with the people who actually do the work that will be (at least partially) automated, observe what they do and discuss with them what they need. (Remember the lean term Gemba? The Agile idea of including customers in the development team? This is the same, only ten times more necessary.) This is only possible in loops – ideas are captured, robots with minimally useful functionality implemented, and feedback from the direct customers (the office workers whose work will be made easier) collected for the next version. And the next version will be developed the next week, so that the customers see the effects of their participation immediately. This also means that in the early phases errors are tolerated or even welcome. After all, each error discovered in this phase is one error less in the final robot.

Unfortunately, if the organization is unaware of (or wilfully forgets) what we all learned about software development (and remember, they bought into RPA with the idea that it is NOT software), the spontaneous way to develop will be the well-known waterfall. We will send an e-mail to process experts asking them to please describe their processes in as much detail as possible, and when the spec is ready somebody who has probably never seen a live process worker, somewhere in the basement of the IT organization, or maybe somewhere even farther away, will develop a robot. The robot will be tested on a number of (ideally well chosen) test cases and then deployed. It will also quickly fail – because requirements change, the users forgot to mention the odd extraordinary case and so on… we all know the examples from real life projects. By this time, however, the developers have other robots to develop, and the whole deployment degenerates into an acrimonious discussion about who is to be blamed for what. We have all been there and done that – and we risk starting the cycle once again.


The second “ibility”, reliability, is about testing. Again, the Agile way – develop small useful chunks, immediately let the users test them, and repeat the cycle – is the best way of achieving it. The traditional way of developing a number of test cases and running them cold has the weakness that we never know for sure whether the test cases really cover all the eventualities and whether a “tested” piece of software is really safe to deploy. I once worked on a project where a team spent one year writing test cases, and when we checked later it turned out that all the cases together covered less than 20% of the eventualities. Designing tests is an important part of the software engineer's know-how – but remember, RPA is NOT software? This way we risk ending up with the worst of both worlds: no timely, direct feedback from the people who use the robot, and no really usable test cases either.


The third “ibility”, maintainability, is where I find the marketing line of RPA not being software the most dangerous. In the effort to sell RPA as NOT SOFTWARE, most RPA providers embrace a visual programming style. This is all very nice and easy at a marketing show, where anybody can drag up to five icons onto a screen and visually link them with nice arrows – but then real life will necessarily kick in after the purchase. It is no accident that other industries have already experimented with visual programming and then returned to text. The problem is that in a visual program a LOT of essential information is hidden in small dialog boxes attached to these nice icons. And by essential I mean things like delays before a mouse click, waiting times before a terminal message is sent, or even the input parameters given to what passes for a function in this visual paradigm. Now, imagine the simplest of maintenance actions – find the places where a timeout has a given small value and change it to something longer. In the good old text-programming world this would be a simple search and replace operation. Maybe there is a better way in an RPA visual program, but the only way I can see now is opening each and every program that uses this timeout and clicking through each and every icon to separately edit each dialog box. Good luck doing this with a few hundred icons (aka lines of code), and especially good luck finding people willing to do this brainless work for days on end. (Well, maybe we could write code maintenance robots to do it.)
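To make the contrast concrete, here is a minimal Python sketch of what such a maintenance action looks like when robots are plain text. The file names, the pseudo-script syntax and the parameter name `timeout_ms` are all made up for illustration – the point is only that a text representation turns the change into a one-line sweep.

```python
import re

# Hypothetical text-based robot scripts; "timeout_ms" is an illustrative
# assumption, not a setting from any real RPA product.
scripts = {
    "invoice_robot.txt": "click(submit_button)\nwait(timeout_ms=500)\nread(result_field)",
    "billing_robot.txt": "open(billing_app)\nwait(timeout_ms=500)\nwait(timeout_ms=2000)",
}

def raise_short_timeouts(source: str, old_ms: int, new_ms: int) -> str:
    """Replace every occurrence of a given timeout value in one sweep."""
    return re.sub(rf"timeout_ms={old_ms}\b", f"timeout_ms={new_ms}", source)

updated = {name: raise_short_timeouts(text, 500, 1500)
           for name, text in scripts.items()}
print(updated["billing_robot.txt"])
```

With a visual tool, the equivalent change means opening every affected dialog box by hand.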

And this is just the tip of the iceberg, and a pretty trivial task at that. Once robots are deployed in numbers there will be many such tasks, and more complicated ones – we know from software development that in big organizations code maintenance takes up 80% or more of a developer's time. Unless we can freeze the processes with robots down to the last mouse click, I see no reason why this percentage should be different for robots.

So, where does this leave us? Is RPA bad for the companies?

I definitely do not think so. The message that RPA is easy, and so not like software, is on the other hand dangerous and damaging. If anything, we need to be more agile in RPA development than in “normal” software development, and this definitely needs planning and organization before the deployment. Call me a maniac, but I strongly believe that Lean and its software offshoot Agile are the answer to most of the problems. It only takes the will of the organizations to implement it – and to not fall for the siren song of “this is easy, anyone can do it” from the marketing types. It is not easy, and many will fail if they implement it mindlessly – but it has a huge potential to make life better for the people working in processes, and we know how to do it right. So, as in so many other things in life: Plan, Do, Check, Act – and reap the benefits.


Value Stream Analysis in a Digital World

Capturing and analysing value streams is one of the most used and liked methods in the lean process improvement methodology. In a sense we all grew up, as lean coaches, reading Mike Rother's brilliant book “Learning to See” and applying it in all possible situations. Most of us are also familiar with objections of the type “our process is far too complex for a value stream analysis to work” and have learned how to work around them, mostly by eliminating unnecessary complexity from the analysis. I think there is a consensus among us lean coaches that value streams work very well and are the most important step in analysing a process, be it manufacturing or administration.

We must recognize though that the easy application of a VS rests on a few premises:

  1. Each process step is executed by dedicated resources who only work in that process step
  2. The processes described by the value stream are standardized to the extent that they have little enough variation that their mean values describe the process well.

The Formula

If we are looking at a manufacturing operation, like a production line, both of these premises are almost certainly true. However, as soon as we move to administrative processes the situation starts to look a bit shakier. It is common knowledge that resources are not dedicated to a single task, but have several tasks related to different value streams: e.g. a person answering customer enquiries about new products might also be responsible for handling customer complaints, someone managing finished goods deliveries will also work on planning the production, a maintenance engineer will classify incoming defects and also repair parts, and so on. To add insult to injury, in many of these cases the processing times will be wildly variable, ranging from minutes to days for the same type of task.

An important question for any VS specialist is how to handle these situations. One obvious answer would be to just explain the premises and fall back on “normal” process mapping like the swim lane. This approach has the downside that it represents, rather than eliminates, the “unnecessary” complexity of the process; indeed, one of the goals of the mapping exercise is to show everyone how complex the process is, in order to create an impetus towards simplifying it. So, a swim lane is a great tool to shock the stakeholders into action, but much less suitable for actually analysing a process.

A better approach would be to extend the value stream methodology to handle deviations from the two premises. There are two steps needed to do this, mostly corresponding to each of the two premises.

If the problem is that the people working in one of the boxes of the value stream also perform several unrelated tasks, and that more than one person works on those tasks in parallel, we can extend the concept of the processing time to something we call the “effective processing time”: the average time between two finished products leaving the process step, provided the step was well supplied (i.e. it did not have to wait for materials, input, etc.). We also have a nice formula for the general case of several resources with differing efficiency, allocated to different tasks. The derivation of the formula, for those interested, can be found here:

and it looks like this:

PTeff = 1/(A1/P1 + A2/P2 + … + An/Pn)

where A1 is the fraction of time resource 1 is working efficiently on the task related to our value stream and P1 is its processing time, i.e. the time it takes resource 1 to accomplish the task, provided there are no interruptions during the task.

For example, imagine we have two maintenance engineers working on repairs. The first one is more experienced, and he averages 2 hours per repair, the less experienced colleague averages 2.5 hours. However the experienced engineer will also have to manage the suppliers of spare parts which takes about 3 hours of his day, the less experienced one is only dedicated to repair jobs. What will be the effective processing time of the repair step?

Using the formula with P1=2, P2=2.5: the first engineer can only work (8-3)/8 = 0.625 (62.5%) of his time on repairs, the less experienced one 100%. Putting it all together, the PTeff of the step will be

1/(0.625/2 + 1/2.5) = 1/0.7125 ≈ 1.4 hours.

So, roughly every 1.4 hours the team finishes a repair job. In a value stream map we could represent this step the same way as if we had one resource that could finish one repair every 1.4 hours. By applying the formula we managed to eliminate the complexity generated by the unequal processing times, the more than one resource in the process step, and the additional tasks not related to the value stream – basically eliminating the problems related to the first premise.
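The calculation is easy to wrap into a small helper. The following Python sketch is my own illustration, not part of any standard lean toolkit; it implements the formula and reproduces the two-engineer example:

```python
def effective_processing_time(allocations, processing_times):
    """PTeff = 1/(A1/P1 + ... + An/Pn): the average time between two
    finished items leaving a well-supplied process step."""
    return 1.0 / sum(a / p for a, p in zip(allocations, processing_times))

# Senior engineer: 2 h per repair, but 3 of 8 hours go to supplier
# management, so only 5/8 = 62.5% of the day is left for repairs.
# Junior engineer: 2.5 h per repair, fully dedicated.
pt_eff = effective_processing_time([5 / 8, 1.0], [2.0, 2.5])
print(round(pt_eff, 2))  # roughly 1.4 hours per finished repair
```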


The second premise also raises problems in practice. One of the most frequently heard comments when teaching lean, and especially value streams, is that all this only applies to car manufacturing, precisely because they have very highly standardized processes. In cases where there are too many random influences, our analysis of the value stream will miss important effects, because we only concentrate on the average behaviour.

The problem is also that we have very little intuitive understanding of how a value stream will behave, and especially of how random effects will influence a process. This would require a dynamic view of the value stream, and our mapping is essentially static. The way out has been known for a long time: build and analyse a simulation of the value stream. The problem is (or rather was) that simulation software used to be expensive and specialized. As far as I know, there was no standardized and cheap way of easily building a simulation.

This changed, as so much else in statistical analysis, with the advent of R (and, to be fair, Python as well). Today we have open source, widely used software that is a de facto standard for system simulations. This means that any value stream we build in the traditional static way can easily be transformed into a dynamic view. A dynamic view also means that we can build a much better intuition of what the value stream is doing, get a picture of the effects of random variations and, moreover, answer hypothetical questions about how our value stream would change if we introduced specific changes in the process.

As an example I will take an interesting process proposed by an especially talented trainee group we work with: a visit to the doctor. The process has four steps, plus an interrupting task:

  1. The nurse receives the patient and prepares the patient file for the doctor
  2. The doctor examines the patient
  3. The nurse updates the patient file
  4. The doctor signs the documents
  5. During the day, random calls from patients asking for future appointments also have to be answered by the nurse.

As we can see, this is by no means a complex process, yet it already violates both premises. In order to map the process we need to work out the effective processing times for each step, and to do this we need some average values. These would need to be measured or estimated in a real case; for now, for the sake of the analysis, let us just assume them as follows:

  1. Step 1 takes on average 2 minutes
  2. Step 2 8 minutes
  3. Step 3 4 minutes
  4. Step 4 0.5 minutes
  5. A call takes on average 4 minutes, and one call arrives about every 10 minutes

We also assume one patient arriving every 9 minutes.

Using the formula from before we can calculate the effective processing times for the nurse. For Step 1 she can spend 2/(2+4+4) = 20% of her time (steps 1 and 3 plus, on average, one 4-minute call per patient cycle). The processing time is 2 minutes, so the effective processing time is 2/0.2 = 10 minutes: on average she can prepare one patient file every 10 minutes. The effective processing time of the doctor is 8.5 minutes. The standard value stream analysis will tell us that the patients will queue waiting for the nurse, and there will be no queue waiting for the doctor.

Can we get a better view of what is going on by using a simulation? Using the R library simmer we can easily build one and check the queue lengths over a working day. Without considering any randomness, it will look like Case 1 on the graph.
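The post builds the simulation with R's simmer; for readers without R, the deterministic case can be sketched in a few lines of plain Python. The numbers come from the example above (one patient every 9 minutes, an effective 10 minutes of nurse time per patient, all phases considered); the 8-hour day is my assumption.

```python
def nurse_queue_at_end(interarrival=9.0, effective_pt=10.0, day=480.0):
    """Deterministic single-server queue for the nurse: patients arrive
    every `interarrival` minutes and she needs `effective_pt` minutes
    per patient, so unfinished work slowly piles up over the day."""
    arrivals, t = [], 0.0
    while t < day:
        arrivals.append(t)
        t += interarrival
    next_free, finish_times = 0.0, []
    for a in arrivals:
        start = max(a, next_free)        # wait until the nurse is free
        next_free = start + effective_pt
        finish_times.append(next_free)
    # patients who have arrived but are not finished by closing time
    return sum(1 for f in finish_times if f > day)

print(nurse_queue_at_end())  # a handful of patients still waiting at closing
```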


We can see that the real behaviour is a bit more complex than our static view. Even in the absence of random effects we might see a bit of a queue at the doctor, but essentially our view is correct: there is no build-up at the doctor but a steadily increasing queue at the nurse.

Now let us introduce some randomness into the process. To do this properly we would need more detailed information about the distribution of the processing times for each step – that would mean detailed measurements and a longer period of data collection. However, there is a quick and dirty way of introducing such assumptions into a simulation by using so-called triangular distributions. These are defined by 3 numbers: the minimum, the maximum and the most frequently occurring value (aka the mode). The shape of the distribution is triangular, so we will miss the finer details, but for a first impression the details are generally not that important, and they can be refined in later steps if necessary.

Let us take an example: assume that the doctor's examination time varies according to the triangular distribution (5, 14, 8). The mean time would then be 9 minutes per exam, with variations between 5 and 14 minutes. Let us also assume that the patients arrive randomly, described by the distribution (5, 15, 8), that is, on average one patient every 9.33 minutes. To keep things simple, let us assume the nurse works in a standardized way, that is, her working time in each phase is constant, with no variation.
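These assumptions are easy to play with even without specialized software. Below is a rough Python stand-in for the simmer model – Python's standard library happens to ship a triangular sampler – using the classic Lindley recursion for the waiting time in a single-server queue. It covers only the doctor's examination step and ignores the nurse, so it is a sketch of the mechanism rather than the full model:

```python
import random

random.seed(1)  # reproducible run; each seed gives a different "day"

def doctor_waits(n_patients=50):
    """Waiting time before the doctor via the Lindley recursion:
    W[k+1] = max(0, W[k] + S[k] - A[k+1]), where S is the exam time
    and A the gap to the next arrival."""
    waits = [0.0]
    for _ in range(n_patients - 1):
        service = random.triangular(5, 14, 8)  # exam time: min, max, mode
        gap = random.triangular(5, 15, 8)      # inter-arrival time
        waits.append(max(0.0, waits[-1] + service - gap))
    return waits

w = doctor_waits()
print(max(w))  # on some simulated days a noticeable queue builds up by chance
```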

Now that we have introduced some randomness into the simulation, we get a different picture at each run. One example can be seen in Case 2. Even though, statically seen, the doctor has time, we see a build-up developing around mid-day at the doctor. This is purely bad luck: the doctor had a few random patients who took longer, and/or some arrived earlier than expected. This effect is hard to predict from the static value stream alone. The nurse is still overworked – she can finish one patient in 10 minutes (all phases considered) while patients arrive every 9.3 minutes, so by the end of the day we predictably have a queue in front of the nurse. The doctor, however, managed to eliminate the queue, which was to be expected over the longer term.


Just to illustrate how this analysis would continue, let us consider the idea of outsourcing the incoming calls, letting the nurse work only with the patients. The result is seen in Case 3. By applying the formulas we could have more or less predicted this result, but it is still nice to see our prediction realised.

Now there is room for further ideas. Obviously the new bottleneck is the doctor, so what would we need to do in order to reduce waiting times and queuing?

The above is just a simple example of combining a more detailed value stream analysis with simulations. But imagine the power of this method in a real workshop, where we have the people who actually work on the process together with an analyst and a simulation, where we can try ideas and hypotheses on the fly, coming up with new scenarios and quickly answering questions like the one above. This is a whole new level of understanding the processes we work with, and we should definitely apply the method as a new standard of digital value stream analysis.


Designed Experiment to Re-engage Silent Customers

In the spring I had the chance to work on a project with a very special problem. We had to convince the customers of an energy company to stay at home for a day, so that the company could upgrade a meter in their home. The problem was special because the upgrade was mandated by government policy but offered hardly any advantages to the customers.

Obviously this is a great challenge for the customer care organization – they need to contact as many customers as they can and convince them to take a day off and wait at home for the upgrade. The organization needs to send out huge numbers of messages in the hope that enough customers will react. This necessarily means that we also get a great number of so-called “silent customers” – people who decide not to react to our first message in any way.

As we obviously do not have an infinite number of customers to convince, silent customers have great value – at least they have not said no yet. The question is how to make them respond. If we learn how to activate at least some of them, we can use this knowledge for the first contact message and make our communication more effective.

The problem is of more general interest than this special project – just think of NGOs who depend on donors. Learning how to make prospective donors more interested at the first contact has a very definite advantage for them as well.

So, how do we go about this? Coming from the Lean/Six Sigma world, our first idea was to actually LEARN what is of interest to the customers. Previously there had been many discussions, and many hypotheses were floating around, mostly based on personal experience and introspection. Some had already been tried, but none were really successful.

We changed the game by first admitting that we did not know what is of interest to our customer base – the customers had wildly differing demographic, age and income profiles, which made all these discussions quite difficult. Once we admit ignorance though (not an easy thing to do, by the way), our task becomes much simpler. Just one question is left in the room: how do we learn what the customer preferences are? – instead of the many questions we used to debate, along the lines of “how do we interest hipsters, or families with small children?” and so on. Coming from the Lean Six Sigma world there is just one answer to this question: we run a designed experiment to find out.

It is important to realize that we run the experiment to LEARN, not to improve anything. Confusing the two is an error in industrial settings as well, but in this project managing expectations was even more important. As we stuck to our goal of learning about the customer, designing the experiment became much simpler, because we avoided useless discussions about what would be beneficial and what not. Every time an objection came up about the possible usefulness of an experimental setting, we could just give our standard answer: we do not know, but if you are right, the experiment will prove it.

As we went on designing the experiment, we realized that we only needed (and were only allowed) to use two factors: communication channel and message type. All the previously so bothersome issues of age distribution, locality and such we solved by requiring large random samples across all these factors. Having large samples was, unlike in manufacturing, no problem at all: we could decide to send an e-mail to a thousand customers or two thousand without any great difficulty or cost. As we were expecting weak effects anyway, large sample sizes were essential to the success of the experiment.

Finally we decided on the following: we used two communication channels, e-mail and SMS, and three message types. One message targeted the geeks by describing how much cooler the new meter is, one targeted the greens by describing how the new meters contribute to saving the environment, and one appealed to our natural laziness by describing how much easier it will be to read the meter. So, in the end we had a 2×3 design: two channels times three message types. And this is where our problems started.

Customer contacts are different from the settings of a complex machine in the sense that everybody has an opinion about them – and for the machine you do not need to talk to the legal and marketing departments before changing a setting. We had several weeks of difficult negotiations, trying to convince every real or imagined stakeholder that what we intended to do would not harm the company – and at every level it would have been far easier to just give up than to trudge on. It is a tribute to the negotiation skills and commitment of our team members that we managed to actually run the experiment. I rather think that this political hassle is the single greatest reason why we do not see more experiments done in customer-related businesses.

For 3 weeks we sent, every week, about 800 e-mails and about 300 SMSes per message type. We had several choices of how to measure the results. With the e-mails we could count how many customers actually clicked on the link to the company web-site, but for the SMSes it was only possible to see whether a customer chose to book an appointment or not. This was definitely not optimal, because except for the e-mails we could not directly measure the efficiency of the messages. To put it simply: whether a customer clicks on the link in the message is mostly influenced by the message content, while whether the customer books an appointment depends on many other factors. This is where randomization helps – with our sample sizes and randomization we could hope that these other factors would statistically cancel each other out, so that the effect of the message would still be visible, if a little more dimly.

Our results were finally worth the effort. A first learning was that basically no one reacted to the SMS messages. Looking back, this has a quite clear explanation – our message directed the recipient to click on a link to the company web-site, and people are generally much more reluctant to open a web-site on a mobile phone than on a computer (at least that is what I think). Fact is, our SMSes were completely unsuccessful, though more expensive than the e-mails.

On the e-mails we had a response of 3.5–4% for the ones appealing to natural laziness, as compared to less than 2% for the other message types. As the contacted people were silent customers, who had already once decided to ignore our message, getting around 4% of them to answer was a sizeable success. With the sample sizes we had, proving statistical significance was a no-brainer.
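For the record, the significance check itself takes only a few lines. The sketch below uses the normal approximation for two proportions; the counts are illustrative, back-of-the-envelope numbers derived from the rates quoted above (about 2,400 e-mails per message type over the three weeks), not the project's actual data.

```python
from math import sqrt, erfc

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """Normal-approximation test that two click rates differ."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p = (clicks_a + clicks_b) / (n_a + n_b)   # pooled rate
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))          # two-sided p-value
    return z, p_value

# Illustrative counts: ~4% of 2400 "laziness" mails vs ~2% of 2400 others
z, p_value = two_proportion_z(96, 2400, 48, 2400)
print(round(z, 1), p_value < 0.001)
```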

The fly in the ointment was that we failed to translate these clicks into confirmed appointments – we had basically the same, very low percentage of confirmations irrespective of channel or message type. Does this mean that our experiment failed to identify any possible improvement? At the risk of being self-defensive here, I would say that it does not. Making a binding confirmation depends on many factors outside the first priming message we were experimenting with. The content of the web-site our customers go to, to mention just one, should be in sync with the priming message, which was not the case here. So, the experiment delivered valuable knowledge about how we can make a customer come to our web-site, but not about how to make the customer accept our offer – and this is OK. It was exactly what we set out to investigate. As mentioned before, managing expectations is a very important element here.

What would be the next steps? Obviously we need to set up a new experiment to investigate what factors impact the customers' willingness to accept our offer. I am certain that this is what the team will do in the next phase – after all, we learned quite a lot about our customers with ridiculously low effort (excepting the negotiations), so why not keep on learning?

Theory of Constraints meets Big Data part 2

I would like to continue the story of the hunt for the constraint, using a lot of historical data and the invaluable expertise of the local team. There is a lot of hype around big data and data being the new oil – and there is also a lot of truth in it. However, I find that ultimately the success of a data mining operation depends on the intimate process knowledge of the team. The local team will generally not have the expertise to mine the data using the appropriate tools, which is absolutely fine, given that data mining is not their daily job. On the other hand, a data specialist will be absolutely blind to the fine points of the operation of the process – so cooperation is an absolute must to achieve results. The story of our hunt for the constraint illustrates this point nicely, in my opinion.

After having found proof that we had a bottleneck in the process, our task was to find it, or at least gain as much knowledge about its nature as possible. This might seem an easy task for hardcore ToC practitioners in manufacturing, where the constraint is generally a process step or even a physical entity, such as a machine. In our process of 4 different regions, about 100 engineers per region, intricate long and short term planning and erratic customer behaviour, few of the known methods for finding the bottleneck seemed to be relevant. For starters, there was no shop floor we could have visited and no WIP lying around giving us clues about the location of the bottleneck. The behaviour of all regions seemed quite similar, which pointed us in the direction of a systemic or policy constraint. I have read much about those, but a procedure for identifying one was sorely missing from my reading list.

So, we went back to our standard behaviour in process improvements: “when you do not know what to do, learn more about the process”. A hard-core lean practitioner would have instructed us to go to the Gemba, which, I have no doubt, would have provided us with adequate knowledge in time. But we did not have enough time, so our idea was to learn more about the process by building a model of it. This is nicely in line with the CRISP-DM methodology, and it was also our only possibility given the short time we had to complete the job.

The idea (or maybe I should call it a bet) was to build a well-behaved statistical model of the installation process and then check the residuals. If we have a constraint, we shall either be able to identify it with the model or (even better) we shall observe that the actual numbers are always below the model predictions and thus we can pinpoint where and how the bottleneck manifests itself.

Using the tidyverse packages from R, it was easy to summarize the daily data into weekly averages. Then, taking the simplest approach, we built a linear regression model. After some tweaking and adjusting we came up with a model with 4 variables that had an amazing adjusted R-squared of 96.5%. Such high R-squared values are in fact bad news in themselves – they are an almost certain sign of overfitting, that is, of the model tracking the data too faithfully and incorporating even random fluctuations. To test this, we used the model to predict the number of successful installs of Q1 2018. If we had overfitted the 2017 data, then the 2018 predictions should be off the mark – god knows there was enough random fluctuation in 2017 to lead the model astray.
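The hold-out check deserves a small illustration. The real model and data are not public, so the Python sketch below fits a one-variable regression on synthetic “2017” weeks and then scores it on unseen “2018” weeks – the same logic as above, miniaturized and with an assumed linear relationship:

```python
import random

random.seed(7)

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

def r_squared(xs, ys, intercept, slope):
    my = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic "2017" weeks: successes driven by utilisation plus noise
train_x = [random.uniform(0.5, 1.0) for _ in range(52)]
train_y = [100 * x + random.gauss(0, 3) for x in train_x]
b0, b1 = fit_line(train_x, train_y)

# The honest test: predict unseen "2018" weeks generated the same way
test_x = [random.uniform(0.5, 1.0) for _ in range(13)]
test_y = [100 * x + random.gauss(0, 3) for x in test_x]
print(round(r_squared(test_x, test_y, b0, b1), 2))  # holds up out of sample
```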

But we were lucky – our predictions fit the new data to within +/- 5%. This meant that the fundamental process did not change between 2017 and 2018, and also that our model was good enough to be investigated for the bottleneck. Looking at the variables we used, we saw two that had a large impact and were process related: the average number of jobs an operator was given per week, and the percentage of cases where the operator was given access to the meter by the customer. The first was a thinly disguised measure of the utilisation of our capacity, the other a measure of the quality of our “raw material” – the customers. Looking at this with a process eye, we found a less than earth-shaking conclusion: for a high success rate we need high utilisation and high quality raw materials.

Looking at the model in more detail we found another consequence – there were many different combinations of these two parameters that led to the same number of successes: low utilisation combined with high quality was just as successful as high utilisation combined with much lower quality. If we plotted the contour lines of equal numbers of successes, we got, unsurprisingly, a number of parallel straight lines running from the lower left corner to the upper right corner of the graph. This delivered the message – again, not an earth-shaking discovery – that in order to increase the number of successes we need to increase utilisation AND quality at the same time.
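The geometry behind this is worth spelling out: for a linear model, the set of (utilisation, quality) pairs giving the same predicted number of successes is a straight line with slope −b1/b2, regardless of the level – hence the parallel lines. The coefficients below are made-up round numbers, used only to demonstrate the point:

```python
# Made-up coefficients: successes = b0 + b1*utilisation + b2*quality
b0, b1, b2 = 10.0, 60.0, 40.0

def quality_needed(success_level, utilisation):
    """Solve b0 + b1*u + b2*q = success_level for q: linear in u."""
    return (success_level - b0 - b1 * utilisation) / b2

# The change in required quality per step in utilisation is the contour
# slope; it equals -b1/b2 at every success level, so the contours are parallel.
slope_60 = quality_needed(60, 0.9) - quality_needed(60, 0.8)
slope_80 = quality_needed(80, 0.9) - quality_needed(80, 0.8)
print(abs(slope_60 - slope_80) < 1e-9)
```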

To me the surprise came when we plotted the weekly data from 2017 over this graph of parallel lines – and it was a jaw-dropping surprise. All weekly performance data for the whole of 2017 (and 2018) moved parallel to one of the constant-success lines. This meant that all the different improvements and ideas tried during the whole year were either improving the utilization while reducing the quality, or improving the quality while reducing the utilization – sliding up and down along a line of a constant number of successes (see the attached graph).

This is a clear case of a policy constraint – there is no physical law forcing the process to move along that single line (well, two lines actually), but there is something that forces the company to stay there. As long as the policies keep the operation on these lines, the situation will look exactly the same as a physical constraint.

This is about the most we can achieve with data analysis. The job is not yet done – the most important step is now for the local team to identify the policy constraint and to move the company from its current mode of sliding parallel to the constant-success lines to a mode where it moves perpendicular to them. We can provide the data, the models and the graphs, but now we need passion, convincing power and commitment – and this is the way data mining can actually deliver on the hype. In the end it is about people able and willing to change the way a company operates, and about the company empowering them to investigate, draw conclusions and implement the right changes. So, business as usual in the process improvement world.

[Figure: historical 2017 weekly data plotted over the constant-success lines]


Theory of Constraints meets Big Data

The Theory of Constraints is the oldest and probably the simplest (and most logical) of the great process optimization methodologies. One must also add that it is probably the most difficult to sell nowadays, as everybody has already heard about it and is also convinced that for their particular operation it is not applicable. Most often we hear the remark “we have dynamic constraints”, meaning that the constraint is randomly moving from one place in the process to another. Given that the ToC postulates one fixed constraint in any process, the method is clearly not applicable to such complex operations. This is an easily refutable argument, though it undoubtedly points to a missing link in the original theory: if there is too much random variation in the process steps, this variation will generate fake bottlenecks that seem to move unpredictably from one part of the process to another. Obviously, we need a more standardized process with less variation in the steps to even recognize where the true bottleneck is, and this leads us directly to Lean with its emphasis on Mura reduction (no typo – Mura is the excessive variation in a process, recognized as just as harmful as its better-known counterpart Muda). This probably eliminates, or at least reduces, the need to directly apply the Theory of Constraints as a first step.

There are other situations as well. Recently I was working with a large utilities company on a project where they need to gain access to their customers’ homes to execute an upgrade of a meter – an obligation prescribed by law. The process starts with convincing customers to grant access to their site and actually be present during the upgrade, continues with allocating the job to an operator with sufficient technical knowledge to execute it, and ends with getting the operator to the site on time to do the necessary work. There is a lot of locality- and time-based variation in this process – different regions have different demographics that react differently to the request for access, people tend to be more willing to grant access outside working hours, but not too late in the day, and so on.


On the other hand this process looks like a textbook example of the Theory of Constraints: we have a clear goal defined by the law, to upgrade X meters in two years. Given a clear goal, the next question is: what is keeping us from reaching it? Whatever we identify here will be our bottleneck, and once the bottleneck is identified we can apply the famous five improvement steps of the ToC:

1. Identify the constraint

2. Exploit the constraint

3. Subordinate all processes to the constraint

4. Elevate the constraint

5. Go back to step 1

In a traditional, very much silo-based organization, steps 1–3 would already be very valuable. By observing the processes in their actual state we saw that each silo was working hard on improving its part of the process. We literally had tens of uncoordinated improvement initiatives per silo, all trying their best to move closer to the goal. The problem with this understandable approach is nicely summarized in the ToC principle: any improvement at a non-constraint is nothing but an illusion. As long as we do not know where the bottleneck is, running around starting improvement projects is a satisfying but vain activity. It is clearly a difficult message to send to concerned managers that their efforts are mostly generating illusions, but I believe this is a necessary first step in getting to a culture of process (as opposed to silo) management.

The obvious first requirement, then, is to find the bottleneck. In a production environment we would most probably start with a standardization initiative to eliminate the Mura – to clear the smoke-screen that does not allow us to see. But what can we do in a geographically and organizationally diverse, huge organization? In this case our lucky break was that the organization already collected huge amounts of data – and this is where my second theme, “big data”, comes in. One of the advantages of having a lot of data points – several hundred per region per month – is that smaller individual random variations are evened out, and even in the presence of Mura we might be able to see the most important patterns.
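This averaging effect can be illustrated with a small simulation – a purely hypothetical process where every job succeeds with probability 0.6, nothing more:

```python
import random

random.seed(42)  # reproducible illustration

# Hypothetical process: each job independently succeeds with probability 0.6.
def weekly_success_rate(n_jobs):
    """Observed success rate of one simulated week with n_jobs jobs."""
    return sum(random.random() < 0.6 for _ in range(n_jobs)) / n_jobs

# With a handful of jobs per week the observed rate jumps around;
# with hundreds or thousands of jobs the noise averages out and the
# underlying 60% pattern shows through despite the variation.
small_weeks = [weekly_success_rate(10) for _ in range(5)]
large_weeks = [weekly_success_rate(1000) for _ in range(5)]
print(small_weeks)  # typically noisy
print(large_weeks)  # all values close to 0.6
```

The same logic applies to the real dataset: with several hundred jobs per region and month, the true patterns survive the Mura.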

In this case the first basic question was: “do we have a bottleneck?” This might seem funny to someone steeped in ToC, but in practice people need positive proof that a bottleneck exists in their process – or, to put it differently, that the ToC concepts are applicable. Having a large and varied dataset, we could start with several rounds of exploratory data analysis to find the signature of the bottleneck. Exploratory data analysis means that we run through many cycles of looking at the process in detail, setting up a hypothesis, trying to find proof of the hypothesis, and repeating the cycle. The proof is, at the beginning, mostly graphical in nature – in short, we try to find a representation that tells the story in an easy-to-interpret way, without worrying too much about statistical significance.

In order to run these cycles there are a few pre-requisites in terms of people and tools. We need some team members who know the processes deeply and are not caught in the traditional silo-thinking. They should also be open and able to interpret and translate the graphs for the benefit of others. We also need at least one team member who can handle the data analysis part – has a good knowledge of the different graphical possibilities and has experience with telling a story through data. And finally we need the right tools to do the work.

In terms of tools I have found that Excel is singularly ill-suited to this task – it handles several hundred thousand lines badly (loading, saving and searching all take ages) and its graphical capabilities are poor and cumbersome. For a task like this I use R with the “tidyverse” packages and, of course, the ggplot2 graphics package. This is a very handy and fast environment – using pipes with a few well-chosen filtering and processing functions, and directing the output straight into the ggplot graphics system, allows the generation of hypotheses and publication-quality graphs on the fly during a discussion with the process experts. It does have its charm to have a process expert announce a hypothesis and to show a high-quality graph testing it within one or two minutes of the announcement. It is also the only practical way to proceed in such a case.
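For readers without R, the shape of such a filter–group–summarise pipeline can be sketched in plain Python as well (the records, field names and numbers here are invented for illustration):

```python
from itertools import groupby

# Invented job records standing in for the real dataset.
jobs = [
    {"region": "South", "week": 1, "access": True},
    {"region": "South", "week": 2, "access": True},
    {"region": "South", "week": 2, "access": False},
    {"region": "North", "week": 1, "access": False},
]

# Filter one region, group by week, summarise the access rate – the same
# pipeline shape the text describes with tidyverse pipes.
south = sorted((j for j in jobs if j["region"] == "South"),
               key=lambda j: j["week"])
access_rate = {
    week: sum(j["access"] for j in grp) / len(grp)
    for week, grp in ((w, list(g))
                      for w, g in groupby(south, key=lambda j: j["week"]))
}
print(access_rate)  # {1: 1.0, 2: 0.5}
```

In R the graph follows directly from the pipe; here one would hand `access_rate` to any plotting library to test the hypothesis visually.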

Most of the hypotheses and graphs end on the dung-heap of history, but some will not. They become the proofs that we do have a bottleneck, and bring us closer to identifying it. Once we are close enough we can take the second step in the exploratory data analysis and complete a first CRISP-DM cycle by building a statistical model and generating predictions. If we are lucky, our predictions will overestimate our performance in terms of the goal – thus pointing towards a limiting factor (aka the bottleneck), because we achieve LESS than what would be expected based on the model. Once here, we try some new, more concrete hypotheses, generate new graphs and models, and see how close we get to the bottleneck.
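The “overestimating model” signal can be made concrete with a toy example. Suppose – purely hypothetically, with invented coefficients – that a modelling cycle produced a linear model of weekly successes from utilisation and quality:

```python
# Hypothetical model from a completed modelling cycle (invented coefficients):
def predicted_successes(utilisation, quality):
    return 200 * utilisation + 300 * quality + 50

# A new week: the model expects about 430 successes...
y_pred = predicted_successes(0.85, 0.7)   # 170 + 210 + 50 = 430
# ...but the process delivered only 400.
y_actual = 400
shortfall = (y_pred - y_actual) / y_pred  # roughly 7% below prediction

# A shortfall recurring week after week is the signature of a limiting
# factor (the bottleneck) that the model's variables do not capture.
print(round(shortfall, 3))  # prints 0.07
```

A one-off miss would just be noise; it is the systematic, same-direction deviation over many weeks that points at a bottleneck.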

So, where are we in real life today? In this concrete example we are through the first cycle, and our latest model, though overoptimistic, predicts the performance towards the goal to within -10%. We are at the second iteration now, trying to find the last piece of the puzzle that gives us the full picture – and of course we already have a few hypotheses.

In conclusion, I think that the oldest and most venerable process optimization methodology might get a new infusion of life by adopting the most modern and up-to-date one. This is a development to watch, and I will definitely keep my fingers crossed.


Rapid Action Workouts – Lean for NGOs

There are situations where we would like to improve a process but do not have the luxury of working inside a full-fledged lean initiative. This means, most of all, that we cannot build on previous training, lean awareness or a changing culture that the teams already know about. Also, in these cases the expectation is to achieve rapid successes, as the effort of getting the teams together cannot be justified by long-term positive evolution alone. In short, the activity has to pay for itself.


In my experience these situations can arise in two ways – either there is simply a need to improve one process in the organization and no thought (yet) of a long-term improvement initiative, or the exercise is intended by a promoter of a lean initiative as an appetizer, to convince the organization to start a deeper and more serious lean effort. Either way, it is important to be successful in the allocated short time.

To respond to this need we at ifss developed a rapid process improvement methodology. The methodology addresses several of the constraints we see in this scenario:

  1. The teams are not trained in, indeed not even aware of, the lean way of thinking and solving problems
  2. The costs of the workshop need to be minimal, so the action needs to be fast

Our idea is to select the minimal effective subset of the lean toolset. Each day starts with a short training (short meaning a maximum of one hour) focusing only on the lean tools that will be needed that day. The rest of the day is spent applying the tools the team learned that day to the problem that needs to be solved. Throughout the day the team has access to the coach, but they have to apply the tools themselves. At the end of the day the results are summarized and the roadmap for the next day is discussed.

Of course, for this to work, problem selection and expectation management are key. The coach has to work with the organization to understand the problem before the RAW and to help the organization select an appropriate one. It would be totally disrespectful to assume that we, as lean coaches, can solve any organizational problem within a workshop of four days, but in most cases we can suggest improvements, achieve team buy-in and design a roadmap to be followed. Thus, we must work with the organization to define a problem where this improvement justifies the effort of organizing the workshop. In optimal cases we have the tools to help them with this – Intermediate Objectives Maps or Prioritization Matrices, to name just a few. Nevertheless, the ultimate decision, and the most important one at that, is in the end the responsibility of the target organization.

The second step the coach needs to take is to select the right tools for the RAW workshop. This can be, in theory, different for each client and problem. In practice we have a set of tools that can be used well in many different situations – SIPOC, Process Mapping, Root Cause Analysis, Future State Definition, Risk Analysis and Improvement Plan will (in this order) generally work. I capitalized the methods, much like chapter titles, because we have a fair number of different methods for each “chapter”, and the coach will have to pick the one best suited to the problem and the team.

For Root Cause Analysis, for example, the coach might pick an Ishikawa diagram if she judges the causes to be simple (and uncontroversial), or dig deep with an Apollo chart if the contrary. Of course, the training for the day on which the team starts to apply the tool will have to be adapted to the choice the coach made.

Because we generally do not get to finish all the actions, and we definitely aim for a sustained improvement effort, I will always discuss PDCA as well – and make sure that the team defines a rhythm in which the PDCA cycles will be performed and presented to the local management.

This is all nice in theory, but does it really work? I have had the privilege of working for several years with two NGOs improving processes in Africa and, recently, in the Middle East. The constraints I mentioned above apply very strongly to them, and I found that this approach of combining minimal training with process improvement work met with enthusiastic support and was successful. So, hopefully, we will be able to refine this approach further in the future.