Assessing the Cost of Unreliability in a Gas Plant to Have Sustainable Operation – Part 1 (2)


With ever-increasing competitive pressure worldwide, it is essential that gas and petrochemical plants operate with high reliability and safety to maximize the return on capital investment. For the business, the financial side of reliability is controlling the cost of unreliability (COUR): the money wasted on equipment and process failures.

Fernando Vicente

Asset Integrity and Reliability


ABB Service


This situation is typical of industries where plants have high equipment redundancy and where only the availability figure is tracked. Strong competition between companies and the current world financial crisis are forcing organizations to explore ways to reduce operating costs and reach an economically sustainable operation. The cost of unreliability index is a simple and practical reliability tool for converting failure data into cost. It helps end users and OEMs put reliability into a business context so that operational costs are reduced, instead of just cutting expenditure on equipment maintenance.

Cost of Unreliability (COUR)

The term “cost of unreliability” means the overall cost resulting from all situations caused by reliability-related failures. This cost includes the cost of repairing equipment after failure and the value of lost production (quality and quantity); these are known as direct costs. There are also indirect costs:

  1. The cost of being a reactive organization, that is, the cost of having to be prepared to respond to failures. An organization performing a lot of reactive maintenance needs to be larger than a proactive organization. It needs people both to keep things running and to respond to failures.
  2. The cost of lost business, an impact from missing deliveries or making poor products while affected by poor reliability. The cost of unreliability includes all costs resulting in any manner from a poor reliability programme and bad maintenance practices.

Since indirect costs are very hard or impossible to estimate, we will concentrate on estimating the direct costs.

The cost of unreliability is a tool from the reliability toolbox that gives a big-picture view of the annualized failure costs of a process plant.

This kind of reliability tool is very useful for top management because the problem can be described in terms of business risk ($) and explained on paper, which helps them make the right decisions to solve critical problems within the process plant.

The following Equations (1) and (2) and Figure 1 describe costs related to an equipment process failure.


Figure 1. Cost of unreliability tree.


Figure 2. COUR methodology process.

COUR = DC + IC (1)

DC = Ec + Lc + Pc (2)


Ec = equipment and spare part costs

Lc = labour costs for repair

Pc = production costs (lost production and off-spec product costs)

DC = direct costs

IC = indirect costs
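Equations (1) and (2) can be sketched in a few lines of Python. The function names and the dollar figures below are illustrative assumptions, not values from the plant study, and the indirect costs are set to zero because only direct costs are estimated here.

```python
def direct_cost(equipment_cost, labour_cost, production_cost):
    """Equation (2): DC = Ec + Lc + Pc."""
    return equipment_cost + labour_cost + production_cost


def cost_of_unreliability(dc, ic=0):
    """Equation (1): COUR = DC + IC (IC defaults to zero, i.e. not estimated)."""
    return dc + ic


# Hypothetical figures for a single failure event, in dollars.
dc = direct_cost(equipment_cost=12_000, labour_cost=3_000, production_cost=90_000)
cour = cost_of_unreliability(dc)
print(f"DC = ${dc:,}  COUR = ${cour:,}")
```

In this made-up case the production term dominates the direct cost, which is why lost production is tracked alongside repair cost.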

There are two main and important advantages for the organization in calculating the cost of unreliability:

a) It is critical for general managers to know the total cost of unreliability from a business perspective in order to understand the value of the entire opportunity accurately. Without this information, general managers may conclude that a corrective or improvement action is too expensive. With good information on the cost of unreliability, top management can make the right decisions based on reliable data, knowing that the improvement will be a good investment.

b) It is important to know the total cost of unreliability to understand the total loss of money that results from poor reliability, which helps to understand the business risks. It is also an easy and auditable system for insurance companies; if the process plant has a high cost of unreliability, the business risk is high and the insurance fees will be high as well.

Methodology for Assessing the COUR

The reason for reducing unreliability in process plants is very well known: money. High reliability reduces equipment failure costs, since failures decrease production and limit the gross margin of the business. A simple way to quantify the costs related to process equipment failures is the cost of unreliability (COUR). Finding the COUR helps to understand the big picture of what is going on in the plant in three main respects:

a) Where is the major cost problem, which equipment, what section of the plant?

b) What is the magnitude of the problem?

c) What is the major type of the occurring problem?

A methodology for assessing the COUR for an NGL (Natural Gas Liquids) Recovery Gas Plant was developed and implemented on site to produce the desired business result. This methodology consists of the following steps:

  1. Gathering information
  2. Equipment Pareto Chart
  3. Reliability Block Diagram (RBD) of the plant
  4. Reliability Equipment Analysis
  5. Root Cause Failure Analysis
  6. Improvement actions.

This systematic methodology treats an NGL Recovery Gas Plant as links in a chain forming a reliability system, and studies the costs incurred when the plant fails to perform its intended function or produce the desired results. The methodology process is shown in Figure 2.

 1) Gathering Information

Every month the equipment and process failures, time to repair, labour time, spare part consumption and production losses are collected from the CMMS (Computerized Maintenance Management System). In most organizations acquiring reliable data is seldom an easy task; data is often missing or is not considered important by technicians, so a lot of work is required to change the culture.

A good tool for registering failures and their causes systematically is the “Failure Tree”, which is uploaded to the CMMS for every equipment type. For the oil and gas industry the reference standard for doing this is API-689, “Collection and Exchange of Reliability and Maintenance Data for Equipment”, which is equivalent to ISO-14224:1999. This standard provides a comprehensive basis for the collection of reliability and maintenance data in a standard format for equipment in all facilities and operations within the petroleum, natural gas and petrochemical industries during the operational life cycle of the equipment.

The failure modes defined in the normative part of this international standard can be used as a “reliability thesaurus” for various quantitative as well as qualitative applications. Standardization of data-collection practices facilitates the exchange of information between parties, e.g. plants, OEMs, end users and contractors. If there is no data, or the data is not reliable, organizations are running blind and all decisions are guesses. Many of the major wins came from data highlighting previously unknown problems. Without data, reliability engineers cannot show the value of improvements, so they cannot show the return on the reliability programme.

 2) Equipment Pareto Chart

The Pareto principle, also known as the 80-20 rule, states that for many events roughly 80 % of the effects come from 20 % of the causes. In a typical Pareto Chart, such as the one shown in Figure 3, the right vertical axis is the cumulative percentage of the total, measured in number of occurrences, cost or another unit of measure. In this case a Pareto Chart for the top ten equipment items and process areas of the month should be created by selecting the Bad Actors (equipment and process areas with high failure rates and costs). This chart is useful for determining where the reliability engineering analysis should start, where maintenance efforts should be applied and which problems need attention first to get the best business results. These process area costs comprise equipment failure costs, maintenance labour costs and the costs associated with production losses.
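The ranking behind a Pareto Chart can be sketched as follows. This is a minimal illustration, not the plant's actual tool: the function name `pareto_table` and the monthly cost figures are assumptions made up for the example.

```python
def pareto_table(costs):
    """Rank items by cost, largest first, with the cumulative percentage of the total."""
    total = sum(costs.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    ranked = sorted(costs.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0.0
    for name, cost in ranked:
        running += cost
        rows.append((name, cost, 100.0 * running / total))
    return rows


# Hypothetical monthly failure costs per process area, in dollars.
monthly_costs = {"Area A": 40_000, "Area B": 50_000, "Area C": 10_000}

for name, cost, cumulative_pct in pareto_table(monthly_costs):
    print(f"{name:<8}{cost:>10,.0f}{cumulative_pct:>8.1f}%")
```

The items at the top of the table, where the cumulative percentage rises fastest, are the Bad Actors to analyse first.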

 3) Plant Reliability Block Diagram (RBD) Analysis

A Reliability Block Diagram is a very important tool to support the assessment of unreliability costs. An RBD supports reliability and availability analysis of large and complex systems, using block diagrams to show the network relationships. The RBD structure defines the logical interaction of failures between the functions that are required to sustain system operation. Each block represents a component, subsystem or assembly, and the blocks are connected by directed lines, in a “series” or “parallel” configuration, that represent the reliability relationships. In a series configuration, a failure of any component results in the failure of the entire system. In a simple parallel system, at least one of the units must succeed for the system to succeed.
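The series and parallel rules can be illustrated with the standard textbook formulas, assuming the blocks fail independently of each other (an assumption that the article does not state). The reliability values used are hypothetical.

```python
from math import prod


def series_reliability(reliabilities):
    """All blocks must work: R = R1 * R2 * ... * Rn."""
    return prod(reliabilities)


def parallel_reliability(reliabilities):
    """At least one block must work: R = 1 - (1 - R1) * ... * (1 - Rn)."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical block reliabilities over the same mission time.
blocks = [0.90, 0.90]
print(f"series:   {series_reliability(blocks):.4f}")
print(f"parallel: {parallel_reliability(blocks):.4f}")
```

The same two blocks give a lower system reliability in series than either block alone, and a higher one in parallel, which is why redundancy masks equipment problems when only availability is tracked.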


Figure 3. Pareto Chart for process area cost.


Figure 4. NGL gas plant money lost due to system failures.

RBD analysis is an essential step in the COUR assessment process. Key process areas are treated as a series reliability model: links in a chain of events that deliver success or failure. Logical block diagrams of the major steps or systems are drawn, and failure costs are calculated by process area and by type. For each major block (process area), the following data is collected monthly to identify significant contributors to the COUR:

  • Total failures
  • Losses from partial failures
  • Quality problems due to failures
  • Repair costs

The RBD analysis helps the maintenance and plant managers see the big picture of the process situation in terms of business losses due to equipment breakdowns. Top management sets the reliability and maintenance strategy for protection against losses. Reliability and maintenance efforts should be focused on the equipment and process areas where the impact on gross margin is highest. The strategy should be based on which equipment needs protection, where the protection is required and what type of actions are needed to get the best results. The following simple example based on the gas plant RBD illustrates the approach.


A section of an NGL gas plant consists of three major blocks that carry the production load to produce the desired business results. The blocks are arranged in series, so if any of them fails, the plant fails. Data is collected every month to calculate the failure rate for each process area. As this is a series system, the failure rates are summed to calculate the overall failure rate for the plant over a one-year period (365 days or 8760 hours). Figure 4 shows the RBD for the gas plant with a 10-year study interval. Assuming that the NGL price is $300/ton and normal production is 150 tons/hour, each hour of lost production costs $45,000, and a simple spreadsheet can be used to find the annual cost of unreliability (COUR).
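The spreadsheet calculation can be sketched as below. Only the NGL price, the production rate and the 8760-hour year come from the text; the failure rates, repair times and repair costs are hypothetical placeholders and are not the values in Figure 4. The sketch also assumes that the plant would otherwise run for the full year and that every failure stops production completely.

```python
HOURS_PER_YEAR = 8760
NGL_PRICE = 300.0          # $/ton, from the text
PRODUCTION_RATE = 150.0    # tons/hour, from the text
LOSS_PER_HOUR = NGL_PRICE * PRODUCTION_RATE  # $45,000 per hour of downtime

# HYPOTHETICAL inputs per block:
# (failure rate per hour, mean repair hours per failure, repair cost per failure in $)
AREAS = {
    "Compression": (4.0e-4, 2.0, 5_000.0),
    "NGL recovery": (1.0e-4, 10.0, 8_000.0),
    "Block 3": (0.5e-4, 4.0, 3_000.0),
}


def annual_cour(failure_rate, repair_hours, repair_cost):
    """Annual direct cost of unreliability for one block, in dollars."""
    failures = failure_rate * HOURS_PER_YEAR   # expected failures per year
    downtime = failures * repair_hours         # expected lost hours per year
    return failures * repair_cost + downtime * LOSS_PER_HOUR


# Series system: the plant failure rate is the sum of the block failure rates.
plant_failure_rate = sum(rate for rate, _, _ in AREAS.values())
costs = {name: annual_cour(*data) for name, data in AREAS.items()}

for name, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14}${cost:>12,.0f}")
print(f"Plant failure rate: {plant_failure_rate:.2e} failures/hour")
print(f"Total annual COUR:  ${sum(costs.values()):,.0f}")
```

With these made-up inputs, the block with the longest repair time carries the highest annual cost even though it fails less often than the compression block, which shows why both failure rate and time to repair have to be converted into money before setting priorities.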

The compression area shows a high failure rate compared with the other areas, so a reliability programme will be focused on this sector to reduce the number of failures. Figure 4 also shows that the NGL Recovery area has a maintainability problem, as it takes too long to repair a failure, so a reliability programme will be focused on this plant sector to reduce the time to repair.

This kind of analysis helps the reliability engineers set priorities, but the real challenge is to turn these reliability values into money. Managers will be concerned about those plant sectors where the business is being reactive or wasting money. This section of an NGL gas plant is a $500,000 per year problem, and a prioritized maintenance management strategy is required to reduce this cost. The major contributor to the cost of unreliability in this process plant is the NGL Recovery area, followed by the Compression section.

This type of analysis shows the financial state of the process plant in a simple spreadsheet, which is an easy way for top management to understand and justify investments for plant performance improvement and budget development.

Once the main process plant areas contributing to the COUR have been identified, a deeper and more detailed analysis of each problematic area should be carried out to understand which equipment and which causes are mainly responsible for increasing the maintenance costs. Several tools can be applied in this phase, such as Weibull analysis and Root Cause Failure Analysis, which will be described in the second part of this article, to be published in the next issue of MaintWorld.