Abstraction for Safe Closed-Loop Control by Cloud Systems in Mining
What happens when we close the loop in mining operations, and Internet-exposed services are compromised and send malicious instructions back into our safety-critical operations? How do we prevent that from happening?
Andrew Ginter
Central “cloud” services have substantial benefits, especially in mining and other industries where large industrial sites are situated at a distance from population centers. Unidirectional gateways safely and easily send information from these distant industrial sites out to cloud services, but what happens when we close the loop, when Internet-exposed services are compromised and send malicious instructions back into our safety-critical operations?
Inbound Unidirectional Gateways
Part of the risk can be addressed with a second Unidirectional Gateway deployed to send information back into the mine, carefully designed to be “distant” from and independent of the outbound Unidirectional Gateway. This configuration prevents the targeted command-and-control loops that are the favorite attack tool of high-end ransomware and nation-state adversaries. To address the remaining risk of a compromised cloud service sending unsafe instructions back into our mine or other remote industrial operation, we need data validation, which becomes enormously easier and safer when we can thoroughly abstract our data.
Mineral Processing Example
What does that mean? Consider the example of a primary mineral processing operation at a mine site. The loaders and trucks are instrumented to measure the characteristics of every load of ore and transmit those measurements to a cloud-based expert system. The expert system or even an AI in the cloud uses the measurements to determine optimal mineral processing steps for the load at the mine site. Worker safety and equipment protection are priorities. For example, assume that primary processing includes steps such as:
- a crusher with only three safe speed settings,
- an electrostatic separator with a maximum safe voltage level, beyond which electrical insulation starts breaking down with a risk of arcing and fires at the site, and
- a high-speed gravity-separator centrifuge that must avoid certain speeds because of the risk of harmonic frequencies creating dangerous vibrations in the device.
Imagine that this and other equipment at the primary processing site is monitored and controlled by a dozen PLCs. Optimally cost-efficient processing means changing equipment settings regularly, based on the characteristics of each load of ore and of the previous load, part of which may still be in progress when the next load starts processing.
Abstracting Control Signals
To address the risk of a compromised cloud sending unsafe instructions, we need to define or restrict communications through the gateway to combinations of values and settings that are known to be safe. One can imagine the cloud service sending a stream of time stamps and PLC register numbers and values into a validation-checking system that looks at all the hundreds or thousands of PLC values and tries to determine whether the settings are likely to result in unsafe conditions. In practice, such testing systems are very difficult to design so that:
- They are not easily confused, and
- They are comprehensive as to the unsafe conditions they prevent.
A more secure and reliable system is one that uses a high degree of information abstraction – sending control information into the industrial network in a simpler format, one that is designed to be easily verified for safety, both by human inspectors and by software systems. In our example, the control information could be encoded as a standard document schema, such as an XML or JSON schema, designed to express safe variations of primary processing conditions. For example:
- Since the crusher has only three safe speeds, express those speeds as the text values “low,” “medium,” or “high.”
- Since the electrostatic separator has a maximum safe voltage, send the voltage setpoint as a percentage of that maximum, a value between “0%” and “100%”. Forbid any value greater than “100%” in the separator voltage instruction.
- Since the centrifuge has both safe and unsafe speeds, select a range of safe speeds – 10 speeds in this example – and express the centrifuge speed setting as either the string “off” or a number between “1” and “10.”
Instead of sending the processing system hundreds of PLC values and timestamps and scratching our heads as to whether these values are safe, we rely on a process engineering team that has determined the safe operating modes of the processing facility in advance. The control instructions are then encoded as abstract points within that “space” of safe settings.
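The three abstractions above can be sketched as a small validator. This is a minimal illustration, not the author's implementation: the field names `crusher_speed`, `separator_voltage_pct` and `centrifuge_speed` are hypothetical, and the sketch assumes the instruction has already been parsed into a Python dictionary with the voltage percentage carried as a plain number rather than a string such as “80%”.

```python
from typing import Any

# Hypothetical field names and encodings; the article does not define a concrete schema.
CRUSHER_SPEEDS = ("low", "medium", "high")
FIELDS = {"crusher_speed", "separator_voltage_pct", "centrifuge_speed"}


def is_number(value: Any) -> bool:
    """True for int or float, but not bool (bool is a subclass of int in Python)."""
    return type(value) in (int, float)


def validate_instruction(doc: Any) -> list[str]:
    """Return a list of violations; an empty list means the instruction is allowed."""
    if not isinstance(doc, dict):
        return ["instruction must be a mapping of field names to values"]

    errors = []
    if set(doc) != FIELDS:
        errors.append(f"fields must be exactly {sorted(FIELDS)}")

    # Crusher: only the three named speeds can be expressed.
    if doc.get("crusher_speed") not in CRUSHER_SPEEDS:
        errors.append("crusher_speed must be low, medium or high")

    # Separator: a percentage of the maximum safe voltage, never above 100.
    voltage = doc.get("separator_voltage_pct")
    if not (is_number(voltage) and 0 <= voltage <= 100):
        errors.append("separator_voltage_pct must be between 0 and 100")

    # Centrifuge: "off" or one of ten pre-approved speed steps.
    speed = doc.get("centrifuge_speed")
    if not (speed == "off" or (type(speed) is int and 1 <= speed <= 10)):
        errors.append("centrifuge_speed must be 'off' or an integer from 1 to 10")

    return errors
```

Because every field is either a short enumeration or a bounded number, a human inspector can review the whole rule set in a few minutes, which is much harder to do for hundreds of raw PLC register values.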
Schema Verification
A syntax checker on the external network then verifies that the incoming commands comply with the XML, JSON, CSV or other schema describing allowed values. Any commands that fail to comply are logged as errors and rejected. A second syntax checker repeats the process on the internal network, after the instructions have been sent through the Unidirectional Gateway. Any schema failures identified inside the primary processing network represent misconfigurations that defeated our first level of filtering. These instructions are again rejected, and here they are logged as high-priority errors, since they most likely indicate an attack in progress. Finally, all accepted and twice-validated control instructions are passed to a Manufacturing Execution System (MES), batch manager or comparable system in the mine’s OT network, where they are translated into PLC values and communicated to the devices controlling the physical process.
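The flow described here might look roughly like the following sketch, assuming JSON-encoded instructions. Everything named in it is hypothetical: the field names, the function names, and in particular the RPM and kilovolt figures, which are placeholders standing in for tables that the site's process engineers would own. The gateway itself is hardware and is not modeled; the sketch shows only the checks on either side of it and the final translation step.

```python
import json
import logging

log = logging.getLogger("inbound_control")  # hypothetical logger name


def conforms(doc) -> bool:
    """Compact stand-in for a schema check on the abstract instruction."""
    return (
        isinstance(doc, dict)
        and set(doc) == {"crusher_speed", "separator_voltage_pct", "centrifuge_speed"}
        and doc["crusher_speed"] in ("low", "medium", "high")
        and type(doc["separator_voltage_pct"]) in (int, float)
        and 0 <= doc["separator_voltage_pct"] <= 100
        and (
            doc["centrifuge_speed"] == "off"
            or (type(doc["centrifuge_speed"]) is int and 1 <= doc["centrifuge_speed"] <= 10)
        )
    )


def parse(raw: str):
    """Parse JSON text; return None if it is not well-formed."""
    try:
        return json.loads(raw)
    except (ValueError, RecursionError):
        return None


def external_check(raw: str):
    """First check, on the external network. Returns the text to forward, or None."""
    if not conforms(parse(raw)):
        log.error("instruction rejected on external network")
        return None
    return raw


def internal_check(raw: str):
    """Second check, inside the OT network. A failure here is high priority."""
    doc = parse(raw)
    if not conforms(doc):
        log.critical("schema failure inside OT network: possible attack in progress")
        return None
    return doc


# Placeholder engineering tables; real values would come from the process engineers.
CRUSHER_RPM = {"low": 200, "medium": 300, "high": 400}
SEPARATOR_MAX_SAFE_KV = 30.0
CENTRIFUGE_RPM = (600, 800, 1000, 1500, 1700, 1900, 2400, 2600, 2800, 3000)


def to_setpoints(doc: dict) -> dict:
    """MES-style translation of a twice-validated instruction into device setpoints."""
    speed = doc["centrifuge_speed"]
    return {
        "crusher_rpm": CRUSHER_RPM[doc["crusher_speed"]],
        "separator_kv": SEPARATOR_MAX_SAFE_KV * doc["separator_voltage_pct"] / 100,
        "centrifuge_rpm": 0 if speed == "off" else CENTRIFUGE_RPM[speed - 1],
    }
```

Note that the translation tables live entirely inside the OT network. The cloud never sees or sends an RPM or a voltage, only a point in the abstract space, so a compromised cloud has no way to name an unsafe value.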
Safe Closed-Loop Cloud Control
This example highlights the network engineering discipline’s focus on the difference between monitoring information, which leaves the site, and control information, which enters the site. In this example, an attacker tampering with monitoring information that goes out to the cloud can at worst confuse the cloud. A confused or maliciously compromised cloud can at worst communicate an unprofitable configuration for the primary processor back into the industrial system, because the schema for the incoming control information is designed so that it is not possible to express unsafe configurations.
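One consequence of this design is that the set of expressible commands can be small enough to enumerate and review exhaustively. The sketch below illustrates that idea under added assumptions that are not in the example itself: the voltage percentage is restricted to whole numbers, the field names are hypothetical, and `MAX_SAFE_KV` is a placeholder figure.

```python
from itertools import product

CRUSHER = ("low", "medium", "high")
VOLTAGE_PCT = range(0, 101)            # assumption: whole-percent steps only
CENTRIFUGE = ("off", *range(1, 11))    # "off" plus ten approved speed steps
MAX_SAFE_KV = 30.0                     # placeholder limit, not a real equipment rating


def command_space():
    """Yield every instruction the abstract schema can express."""
    for crusher, pct, centrifuge in product(CRUSHER, VOLTAGE_PCT, CENTRIFUGE):
        yield {
            "crusher_speed": crusher,
            "separator_voltage_pct": pct,
            "centrifuge_speed": centrifuge,
        }


def separator_kv(pct: float) -> float:
    """Hypothetical translation from percentage to separator voltage."""
    return MAX_SAFE_KV * pct / 100


def unsafe_commands() -> list:
    """Every expressible command whose separator voltage would exceed the limit."""
    return [
        c for c in command_space()
        if separator_kv(c["separator_voltage_pct"]) > MAX_SAFE_KV
    ]


if __name__ == "__main__":
    print(sum(1 for _ in command_space()), "expressible commands")
    print(len(unsafe_commands()), "exceed the separator limit")
```

Under these assumptions the space holds 3 × 101 × 11 = 3,333 commands, and none of them can exceed the separator limit. A stream of raw PLC register writes offers no comparable finite space to check.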
Malicious configurations are at worst ill-formed and easily detected and rejected, or well-formed but unprofitable. Unprofitable configurations will eventually be noticed and investigated, and the compromise of the cloud system discovered. Most industrial operations regard short periods of unprofitability as acceptable business consequences for which we can buy insurance, unlike safety incidents, environmental disasters or equipment damage, which are unacceptable. Inbound Unidirectional Gateways, coupled with control information abstraction and strict validity checking, make closed-loop cloud-based control of mining and other industrial operations safe.
For more information about network engineering, please request a free copy of my latest book Engineering-Grade OT Security: A manager’s guide.