Quotation Central
The famous artist Edgar Degas probably had it right when he said: “What a delightful thing is the conversation of specialists!” That sounds fine as a statement on its own, as specialists do get passionate about their specialism [1]. But Degas went on: “One understands absolutely nothing, and it’s charming.” His point is that we may nod appreciatively, knowing that the specialists are talking, communicating, whatever… but do we have any idea what they are really saying? Likely not…
And there we have the essence of the communication problem: how do we ensure our message is getting through as intended? That it means the same for the receiver as it does for the sender? In terms of managing technical assets, I believe it was the Australian engineer John Hardwick who put it most succinctly [2]: “The role of the asset manager is to talk technical to technical people and to talk finance to financial people, and be able to translate for the two groups.”
Simple Examples
There is a requirement in Wales that road signs are in both Welsh and English, as shown in the pictures [3]. Place names in Welsh do not need to be translated – and there are two in picture 1. (Quite how one is supposed to pronounce the place names is a separate challenge…)
Picture 1: Welsh/English Roadsign
Picture 2: English/Welsh Roadsign
And a famous example of the need for ‘someone’ to understand both languages is given in picture 2. The local authority was putting up a new sign to restrict large vehicles from entering an estate and needed a translation into Welsh [4]. So, they sent the text to their in-house translation service, and the service responded with the Welsh you see in the sign: the text was used exactly as received, to make sure there were no copying mistakes… Unfortunately, the Welsh actually says: “I am not in the office at the moment. Please send any work to be translated." The sign was taken down within days. Nobody involved in the production of the sign seemed to speak both languages.
Amusing though the road sign may seem, there are cases where a mistranslation could, in practice, be dangerous. Take the sign in picture 3: if you speak just one of the two languages there is clarity of message; however, if you speak both you’ll know that the Welsh actually says “Pedestrians Look Left”, which is the exact opposite of the English [5].
And that mistranslation could be highly dangerous: what if you cross a busy street having looked the wrong way?
The Curse of Knowledge
If you have ever played the game ‘Charades’, the miming game where one participant tries to silently act out the title of a book, or a song, or a movie… you will know it can be SO frustrating when it’s perfectly clear in your own mind how your actions relate to the words of the title, but the audience is ‘just not getting the message’. This is known as ‘The Curse of Knowledge’: the audience doesn’t know what you know, or at least hasn’t made the connection yet [6].
A similar effect was seen in a game run at Stanford University where one person, the ‘tapper’, tapped out a specific and well-known tune with a pencil on a table, and the other person, the ‘listener’, tried to guess what the tune was [7]. Tappers were asked what percentage of the time they thought the listeners would be successful, and estimated about 50% of the time. In practice the listeners guessed correctly about 2.5% of the time. Why were the tappers’ estimates so poor? Because they had the tune in their heads, and once you know something it’s hard to ‘not know’ it, which means we tend to assume that the listener will also know it. That, once again, is the curse of knowledge.

In the business world, managers and employees, marketers and customers, corporate headquarters and the front line all rely on ongoing communication but suffer from enormous information imbalances, just like the tappers and listeners. Leaders can thwart the curse of knowledge by “translating” their strategies into concrete language. What do we mean by ‘concrete’? Common, well-understood and agreed terms, with continual feedback to make sure they stay common and well understood.
One thing that may bring the curse of knowledge into focus is a review of high school lesson effectiveness: teachers who present lessons almost invariably rate those lessons as more effective than do the students who receive them!
The cases presented here are both real and practical: when living through events in real time, under uncertainty and without the benefit of hindsight, decisions are often complex and outcomes unpredictable.
Alerts/Alarms May Be Straightforward, but Context Is Important
There’s a difference between condition monitoring for ‘detection’ and condition monitoring for ‘diagnostics’: the former seeks to identify that we have ‘a problem’, while the latter also seeks to identify what the problem is. In both cases, contextual information is important.
Take, for example, a domestic fire alarm going off at 8 in the morning, while breakfast is being prepared, and someone is making toast… do we call the fire brigade immediately, or go check the state of the toast in the toaster? Likely the latter: the condition being monitored is smoke and the toast is, well, probably toast. What about the fire alarm going off in a hotel at 2 a.m.? My response would be to leave immediately – probably taking ID, wallet and laptop with me as they’re light, and an overcoat in case of rain/snow. And, yes, this has happened to me several times (not always at exactly 2 a.m.) and I am always surprised that I leave my room and see other people, heads poking out of their doorways, asking the question “is it a real alarm?” Well, I don’t know, but the downside of staying where I am in my room may be so bad that I’ll go with the discomfort of leaving the hotel via the fire escape and standing outside for a while, waiting for the all clear.
Fire alarms are almost always detectors of particular conditions, but they do ‘communicate’ that they have found something; we do well to ‘listen’ and act, and context is important in our response.
And what if we have a rising trend for hydrogen in a power transformer from an online DGA detector? Hopefully, limits for alerts/alarms were agreed and implemented when the monitor was installed, limits which reflect both industry standards for DGA levels and our expectations for this particular unit. And hopefully, there is an agreed response plan for every alert/alarm from every monitor on every asset… and we ‘just’ follow the plan. That would be ‘ideal’! Everyone knows what the monitor does, what the alert means and what the interventions will be, from the technical folks who make recommendations to the spreadsheet/finance folks who make decisions. Unfortunately, it doesn’t always work like that in practice: the limits aren’t agreed or set well, the response plan isn’t detailed, and even where it is detailed it may not be followed. To get the best from a monitor we need to understand what it does and what the data means, and have a plan which is agreed by all stakeholders and followed: and someone needs to know what’s going on [8]!
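As a sketch of what that ‘ideal’ might look like in practice, the agreed limits and the agreed responses can be captured in something as simple as the fragment below. The threshold values and the named actions are purely illustrative assumptions on my part; real limits would come from the relevant DGA standards and from expectations for the particular unit.

```python
# A minimal sketch of an agreed alert/alarm scheme for online hydrogen (H2) DGA data.
# The limits and responses below are illustrative placeholders only, not standard values.

HYDROGEN_LIMITS_PPM = {"alert": 100, "alarm": 300}  # hypothetical agreed limits for this unit

RESPONSE_PLAN = {
    "normal": "continue routine monitoring",
    "alert": "increase sampling rate, notify the transformer engineer, confirm with a lab DGA sample",
    "alarm": "convene the cross-functional review, plan an outage/inspection",
}

def classify_hydrogen(ppm: float) -> str:
    """Map a hydrogen reading to the agreed state for this unit."""
    if ppm >= HYDROGEN_LIMITS_PPM["alarm"]:
        return "alarm"
    if ppm >= HYDROGEN_LIMITS_PPM["alert"]:
        return "alert"
    return "normal"

for reading in (35, 120, 340):  # example readings in ppm
    state = classify_hydrogen(reading)
    print(f"H2 = {reading} ppm -> {state}: {RESPONSE_PLAN[state]}")
```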
Bushing Power Factor Deteriorates, and Then Improves?
But It Got Better!
What happens if an alert comes in from an online bushing monitor, indicating that the power factor of a bushing is rising gently [9]? Again, hopefully, there is a plan for how to respond – usually to take the transformer offline and test the bushings to confirm the readings. An outage is planned, but before it happens, the bushing seems to improve: the power factor is falling gently, back towards an acceptable value! A non-technical decision might be that, as the readings have recovered and the monitor is no longer giving an alert, there is nothing to be concerned about. A technical decision would more likely identify a scenario whereby the deterioration has continued but changed character, and there is now internal tracking on the surface of the conductor within the bushing. But how do we convince the asset manager that the monitor is perfectly OK, yet taking the readings at face value means we are deceiving ourselves? Someone has to know what is, or may be, going on. But how do they communicate that successfully to someone who knows very little about bushings?
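One way to make that distinction concrete: an alert rule that only compares the latest reading against a limit will ‘clear’ when the power factor falls back, while a rule that remembers the recent behavior keeps the flag raised. The readings and the limit in the sketch below are invented for illustration; the point is the difference between taking the latest number at face value and looking at the behavior.

```python
# Illustrative only: invented power factor readings (%) from an online bushing monitor.
readings = [0.45, 0.48, 0.55, 0.62, 0.70, 0.66, 0.58, 0.52]  # rise, then apparent recovery
LIMIT = 0.60  # hypothetical alert limit

latest_ok = readings[-1] < LIMIT                   # "face value" view: latest reading looks fine
ever_alerted = any(r >= LIMIT for r in readings)   # trend-aware view: the limit was breached

if latest_ok and ever_alerted:
    print("Reading has 'recovered' after breaching the limit: investigate a possible change "
          "in failure mode rather than simply standing down.")
elif not latest_ok:
    print("Reading above limit: follow the agreed response plan.")
else:
    print("No alert: continue monitoring.")
```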
Asset Health Indices (AHI) and Data Compression
What is an asset health index for in the first place? Is it for asset replacement, maintenance, refurbishment or something else? How does the index address this? What does the index mean? So, let’s take an example where we have an index, based on an analysis of available data, which assigns a percent health score with 100 being perfect and 0 being the opposite: no health at all [10]. What would 60% mean? Is that a 40% chance of failure? In the next week? Month? Year? What is the raw data which means that we have a less-than-perfect specimen? Knowing the index tells us very little about the asset unless we can dig into the data, the analyses and the algorithms, and work out what it means in practice: detection of a problem and diagnostics thereafter.
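To make the ‘data compression’ point concrete, consider the toy index in the sketch below. The condition factors, weights and scores are entirely invented for illustration: two quite different assets can land on the same 60%, and the single number cannot tell us why.

```python
# Toy asset health index: a weighted average of normalized condition factors (0-100).
# Factors, weights and scores are invented purely to illustrate the compression of data.

WEIGHTS = {"dga": 0.4, "oil_quality": 0.2, "bushings": 0.2, "load_history": 0.2}

asset_a = {"dga": 30, "oil_quality": 90, "bushings": 80, "load_history": 70}  # poor gassing picture
asset_b = {"dga": 80, "oil_quality": 40, "bushings": 30, "load_history": 70}  # tired oil and bushings

def health_index(scores: dict) -> float:
    """Collapse the individual condition scores into a single percentage."""
    return sum(weight * scores[factor] for factor, weight in WEIGHTS.items())

print(health_index(asset_a), health_index(asset_b))  # 60.0 60.0 - same score, different problems
```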
An AHI is not the asset itself; it is not even the true condition of the asset. It is a model of an unknown value, ‘asset condition’, and the model has its own imprecision and inaccuracy, a point the late Tom Rhodes of Duke Energy made well [11]. And there lies the problem: the technical interpretation of an AHI comes with an understanding of the inherent imprecision, the likely inaccuracy, and the need for interpretation of any index, because it is a model of the asset. And as all models are wrong, and only some are useful, we need to know how far wrong our model is before it becomes unacceptably wrong [12]. To quote, “The map is not the territory”, or maybe “the chart is not the patient”… [13]. If we only know the numbers, and not where they came from or how they were derived, we may confuse the model with the asset, the map with the territory.
If It’s in the Red Box It Must Be Worse Than the Ones in the Orange Box
One big advantage of a risk matrix is the simplicity of the matrix itself: straightforward axes for probability and consequence, easy-to-interpret colors, sometimes with qualitative ‘small – medium – large’ categories rather than quantitative numbers we need to calculate with. Easy? Actually, not easy! If we use a risk matrix to communicate where we need to focus, and which risks to address first, we may be sending exactly the wrong message.
To identify issues with communicating risk, I’d suggest starting with “The Risk of Using Risk Matrices”, a discussion of how risk matrices can be misleading [14]. There is also a technical paper which shows how a matrix may fail to prioritize risks by risk magnitude, resulting in more urgent situations being de-prioritized [15].
In the risk matrix shown in picture 4, the axes are given numerical ranges, which is good, but we can still be misled by the way the matrix works. Asset 1 has the higher risk magnitude but falls in an orange box, while Asset 2, with a lower risk magnitude, is in the red. Once we have that situation it is very difficult to convince someone that they should address Asset 1 first, because Asset 1 is only in an orange box while Asset 2 is in the red!
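The effect is easy to reproduce with numbers. The probabilities, consequences and band edges below are invented for illustration (they are not the values behind picture 4), but they show how risk magnitude, taken as probability times consequence, can rank two assets one way while the colored boxes rank them the other.

```python
# Invented example: risk magnitude (probability x consequence) versus the matrix box.
# Band edges, colors and asset values are illustrative only, not those in picture 4.

def band(value, edges):
    """Return 0, 1 or 2 depending on which band the value falls into."""
    return sum(value > edge for edge in edges)

PROB_EDGES = (0.1, 0.5)    # low / medium / high probability bands
CONS_EDGES = (1.0, 10.0)   # consequence bands in $M
COLOR = [["green", "yellow", "orange"],
         ["yellow", "orange", "red"],
         ["orange", "red", "red"]]

assets = {
    "Asset 1": (0.45, 8.0),   # medium band on both axes
    "Asset 2": (0.60, 1.4),   # just over the probability edge, modest consequence
}

for name, (probability, consequence) in assets.items():
    magnitude = probability * consequence
    color = COLOR[band(probability, PROB_EDGES)][band(consequence, CONS_EDGES)]
    print(f"{name}: magnitude = {magnitude:.2f} $M, box = {color}")

# Asset 1: magnitude = 3.60 $M, box = orange
# Asset 2: magnitude = 0.84 $M, box = red
```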
It was Sapolsky who noted in a lecture that we, as humans, like to think in categories, and that comes at a cost: we tend to think that all things within a given category are similar to each other but also very different to things in other categories [16]. And it becomes very easy to inappropriately categorize risks with a simple colored risk matrix.
So, if we wish to communicate risk priorities from a technical point of view to a less technical audience, we may be in trouble if we use a risk matrix: especially if the matrix has qualitative values for the axes rather than quantitative values, as there is another layer of interpretation required. But even then – how do we know that our message is getting through? Sidorenko suggests managing risks individually and looking at the details of each one, rather than just putting them in boxes – at least then we are looking at the raw data and can see the precision, the accuracy, and the pitfalls of the colored boxes.
Picture 4: Example Risk Matrix and inappropriate prioritization
Good Decision, Bad Outcome?
The following case illustrates good communication in the way technical data was analyzed, contextual and industry data were used to support conclusions, and a specific asset risk was managed by a cross-functional team [17].
One of two Self Contained Fluid Filled (SCFF) transmission cables serving an urban station failed, putting the system at N-1 and the company at risk of having to shed load. An analysis was performed by a cross-functional team looking at the available data including, but not limited to: spares on hand, lead times for new replacements, second-hand cable availability, and the condition of the present cables on the system.
With no spare readily available, the system was held at N-1 while a scenario analysis of the system was performed to help identify and plan mitigations should the second cable fail. In parallel, analysis of historic and industry SCFF cable data indicated an annual failure probability of ~3.4%.
A replacement cable of XLPE construction was ordered, with extra spare cables for contingencies. Three weeks later the second cable failed, bringing the system to N-2 and requiring network system operators to implement a mitigation, back-feeding to support load and placing extra stress on a number of system assets. The scenario analyses were good, and the failure probabilities low, but the failure hazard was realized and had to be managed.
This is an interesting case, as the cross-functional teamwork meant that any issues regarding communication of meaning, risk and analysis were identified early and addressed promptly. The team all had a common understanding that the probability of a second failure was low, and that 3.4% is roughly equivalent to correctly predicting the roll of a 29-sided die. When the risk was realized, the mitigation plans were followed, and load shedding was not required.
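As a back-of-the-envelope check (this is my own illustrative simplification, not the team’s actual model), assuming a constant hazard rate so that the annual figure can be scaled to a shorter window, a ~3.4% annual failure probability corresponds to a very small chance over the three weeks in question; and yet it happened.

```python
# Back-of-the-envelope sketch: scale an annual failure probability to a three-week window,
# assuming a constant hazard rate (an illustrative simplification, not the team's model).
p_annual = 0.034                                   # ~1 in 29, the "29-sided die"
p_three_weeks = 1 - (1 - p_annual) ** (3 / 52)
print(f"{p_three_weeks:.4f}")                      # ~0.0020, i.e. about a 0.2% chance
```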
Plan for the worst, and hope for the best, but a good decision can still have a poor outcome.
Discussion
We likely only know we have a ‘failure to communicate’ when something goes wrong unexpectedly. Confirming that a message has been received and understood requires continual checking and rechecking that, as they say, ‘we are on the same page’: we must do more than talk, we have to listen, and listen carefully.
A few years ago, while making an asset management presentation in the USA on power transformer assessment and ranking using health indices, I realized that most of the audience didn’t really know what a power transformer was – many of them were nodding sagely, but I could tell: yes, they’d seen my pictures and heard me talk about 660 MVA units, but these devices were still abstract to them. I had to make it ‘real’, make it ‘concrete’, something which they could relate to. So, I talked about big cars, like a GMC Yukon SUV which weighs about 2 tons… and imagine trying to push one to the top of the Empire State Building in New York City. That would take a lot of effort… and now imagine doing that with 99 more! Then dropping all 100 cars onto the streets below, resulting in a lot of damage when they hit the ground. That energy, which causes all the damage, is the amount of energy which goes through a 660 MVA transformer every second. The car analogy, and the work of pushing it to the top of the building, made things much more relatable, more real, and I could see it in their response.
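For the curious, the arithmetic behind the analogy roughly holds up: taking the Empire State Building roof at about 380 m (my assumed height) and treating 660 MVA as roughly 660 MJ of energy per second, the potential energy of 100 two-ton cars at the top is of the same order.

```python
# Rough check of the analogy: potential energy of 100 two-ton cars at the top of the
# Empire State Building versus the energy through a 660 MVA transformer each second.
g = 9.81                      # gravitational acceleration, m/s^2
mass_kg = 100 * 2000          # 100 cars at roughly 2 metric tons each
height_m = 380                # approximate roof height of the Empire State Building
potential_energy_MJ = mass_kg * g * height_m / 1e6
print(f"~{potential_energy_MJ:.0f} MJ, versus ~660 MJ through a 660 MVA unit every second")
# ~746 MJ - the same order of magnitude
```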
And one last quotation which really sums up the need to check and recheck that what we think of as clear communication is in fact clear; to paraphrase William Whyte, who also coined the term ‘groupthink’ [18]: “The great enemy of communication is the illusion that it has taken place.”
Acknowledgments
With thanks to the many colleagues who reviewed and improved this article!
References
[1] https://en.wikipedia.org/wiki/Edgar_Degas
[2] “Aspects of Asset Management at Energy Australia” J. Hardwick, Energy Australia, 76th Annual International Conference of Doble Clients, Boston, USA, 2007
[3] https://en.wikipedia.org/wiki/Road_signs_in_Wales
[4] http://news.bbc.co.uk/1/hi/7702913.stm
[5] https://therockyroadtowelsh.weebly.com/blog/welsh-road-signs
[6] https://en.wikipedia.org/wiki/Curse_of_knowledge
[7] “The Curse of Knowledge”, Harvard Business Review, https://hbr.org/2006/12/the-curse-of-knowledge
[8] K. Wyper et al, “Condition Monitoring in the Real World”, Doble Client Conference, 2012
[9] better
[10] “Deriving a Useful Asset Health Index”, McGrail et al, 83rd International Conference of Doble Clients
[11] “Practical Machine Learning Applications”, Tom Rhodes, Imene Mitiche et al, CIGRE Paris, 2022
[12] Box, G. E. P. (1976), "Science and Statistics" (PDF), Journal of the American Statistical Association
[13] https://en.wikipedia.org/wiki/Map%E2%80%93territory_relation
[14] Bratvold R. et al “The Risk of Using Risk Matrices”, Society of Petroleum Engineers, Economics & Management, 2014
[15] Guillon et al, “Asset Health Indices and Risk Matrices”, 90th International Conference of Doble Clients
[16] “Introduction to Human Behavioral Biology”, Prof. R. Sapolsky, Stanford, https://www.youtube.com/watch?v=NNnIGh9g6fA
[17] Dhir R. et al “Everyone has a Plan Until they get Punched in the Mouth”, IAM N. America Conference, 2022
[18] https://en.wikipedia.org/wiki/William_H._Whyte
Tony McGrail is Doble Engineering Company’s Solutions Director for Asset Management & Monitoring Technology, providing condition, criticality and risk analysis for utility companies. Previously Tony spent over 10 years with National Grid in the UK and the US; he has been both a substation equipment specialist and subsequently a substation asset manager, identifying risks and opportunities for investment in an aged infrastructure. Tony is a Fellow of the IET, a member of the IEEE, CIGRE, ASTM, ISO and the IAM, and is currently active on the Doble Client Committee on Asset and Maintenance Management and a contributor to SFRA, condition monitoring and asset management standards. His initial degree was in Physics, supplemented by an MS and a PhD in EE, followed by an MBA.