Would You Rather Use a Control System That’s Proven Correct? – Episode 136

For safety-critical operations or for critical national infrastructures, would you rather base your system on a code that people have tested as best they can, or would you rather base your system on a platform that has been proven correct? Daly Brown and Nick Foubert of Metropolitan Technologies look at a new approach to designing OT systems.

For more episodes, follow us on:

Share this podcast:

“With digital transformation there’s a lot of possible new cyber threats to the existing infrastructure and architectures of OT networks.” – Daily Brown

Transcript of Would You Rather Use a Control System That’s Proven Correct? | Episode 137

Please note: This transcript was auto-generated and then edited by a person. In the case of any inconsistencies, please refer to the recording as the source.

Nate Nelson
Welcome listeners to the Industrial Security Podcast. My name is Nate Nelson. I’m here with Andrew Ginter, the Vice President of Industrial Security at Waterfall Security Solutions, who’s going to introduce the subject and guest of our show today. Andrew, how’s it going?

Andrew Ginter
I’m very well thank you. Our guests today – we have two of them – are Daily Brown and Nick Foubert. Daily is the CEO. Nick is the chief technology officer, CTO. They are both co-founders of Metropolitan Technologies and they are working on a new and more secure platform, a whole new way of doing industrial control systems.

Nate Nelson
Then with without further ado, let’s jump right into your interview.

Andrew Ginter
Hello, Daly. Hello, Nick. Welcome to the podcast. you know Thank you for joining us. Before we get started, Ken, I ask each of you to tell our listeners a little bit about yourselves, about your background, and about the good work that you’re doing at Metropolitan Technologies.

Daly Brown
Yeah, sounds good. So I guess I’ll start. Thanks for having us on, Andrew. I’m Daly. I’m the co-founder and CEO of Metropolitan Technologies. Just a bit about my background. So I have about 15 years of experience in the aerospace and defense industry as a system and software engineer. Prior to founding our startup, I worked for a big company called General Dynamics Mission Systems, and I had roles such as product owner, technical lead, and senior engineer there.

Daly Brown
I was responsible for design, development, testing, and deployment of various mission-critical applications on airborne platforms. And I also led research collaborations with academia in artificial intelligence, machine learning, target tracking, and information fusion. And I’ll let Nick introduce himself.

Nick Foubert
Yeah. Hi, Andrew. Uh, thanks for having us on the podcast. I’m Nick Fuber. I’m the other co-founder and the CTO of metropolitan technologies. I have a roughly 14 to 15 years of experience in industry as a systems and software developer. And like Daly before founding metropolitan, um, I also worked at general dynamics mission systems as a technical lead and senior software developer.  

I worked on mission critical applications on airborne platforms and Daly. And I actually worked pretty closely together there for four or five years before leaving to start metropolitan. And before general dynamics, I had done some other work in biometric security and biomedical applications of machine learning. So yeah, that’s me.

Daly Brown
Yes, I guess I’ll talk a bit about our startup or I’ll introduce it anyways. So Metropolitan Technologies is an early stage Canadian startup. We’re based in Ottawa, Ontario, Canada. We founded the company to build reliable and secure commercial products based on our experience in the aerospace and defense industry. When we first started the company, our first idea was to build smart city products with a focus on privacy and equity.

And after doing a bit of market research for for a platform to build upon, we didn’t find anything that met yeah actual our actual standards for security, privacy, and reliability. So we decided to build our own. It also became apparent at the time that the market opportunities for our own platform were greater than the smart city applications we first thought to build. So we did a bit of a pivot and to focus our concentration on building up the platform and expand beyond cities to the broader critical infrastructure and industrial security market.

And so consequently, we ended up building a secure by design and default decentralized IoT connectivity and cybersecurity platform to help organizations connect and secure their OT networks. And yeah, so that’s a brief introduction about us.

Andrew Ginter
So thanks for that. And you know you guys are doing some stuff that sounds really interesting. I mean, this is why I i asked you to to to join us on the show. You’ve mentioned Zero Trust and you know IoT, Internet of Things. You’re talking about a platform. you know I know that you’re also touching on stuff like mesh networks and formal methods. It’s a bit of a buzzword soup, but you know if you’re doing this all, I thought, I gotta have these people on. Can you break it down for us? you know You have a product line. What do you have? what What is it? What does it do?

Daly Brown
Yeah, so I’ll start a bit about that. I get carried away actually sometimes with the buzzwords and slogans. I think it’s indoctrinated in me for my time working in the defense industry. I’m pretty sure there’s conversations we actually used to have where we only spoke in acronyms or buzzy slogans. And so old habits die hard, but I’ll try to speak like more of a human so people can understand and be cognizant of it. So like I mentioned in the opening, we’re building a decentralized IoT connectivity and OT cybersecurity platform.

We’re also working on a pre-integrated edge appliance to simplify the integration of our technology with existing OT networks. And so many of the ideas that are built into our platform are based on our experience in defense and aerospace industry. We wanted to combine the resilience and security of tactical battlefield networks and the reliability of safety reability and safety of avionics software that we used to work on. And so it just so happens that equation leads to a parade of buzzwords. And I’ll let Nick maybe go into a bit more about what that means right now.

Nick Foubert
Yeah, so we’re using a number of techniques to build our platform. And I guess these are kind of getting back into buzzwords here, but we can dive into some of them as we continue the conversation. But so things like zero trust, security model, data-centric encryption, formal methods, mesh networking, among other things. We’re using similar types of technologies you might find in battlefield networks in the military.

So one example of that, we’re a software platform, but we’re building our platform to support different networking topologies like mesh networks that you don’t typically find in a traditional enterprise IT network. So our our software components communicate peer to peer and that helps us minimize the amount of centralized infrastructure we need to rely on.

So even when the peers in the network need to communicate with a central service, like a cloud service, for example, and they might need to do that to do things like help discover other peers in the network. Even those more centralized services are designed to be distributed and and tolerant to network outages and network changes.

The data plane in our platform uses a publish-subscribe paradigm. What that means is that our software components don’t need to know specifically which other components they’re communicating with, but rather they just advertise what types of data they can provide and what types of data that they want to consume. And then our software takes care of actually moving the data between the components. This lets us do things like tune quality of service and enforce security controls among a bunch of other benefits.

And the reason we’re building it this way is because we want our platform to be usable in the most challenging industrial environments, whether that’s on a battlefield, deep in a mine, or monitoring some remote asset in the Arctic. So yeah, I think that kind of gives you a high level picture of what we’re doing as far as mesh networking is concerned.

Nate Nelson
Okay. So Daily Just There promised us a parade of buzzwords and they very much followed through. One of them that stood out to me, Andrew, the publish-subscribe paradigm. Do you want to help me out with that one?

Andrew Ginter
Sure. I got sort of two things out of it, MeSH and publish subscribe. Let me start with MeSH first and then I’ll come to PubSub. MeSH is sort of the concept that not everything, I mean, on an IT t network, generally speaking, give or take a firewall, everything can talk to everything. Anything wants to send a message to anything else, you open a connection and there you are with with industrial networks, frequently you have devices in the field, but solar powered. They might not be on the cellular network. They might have sort of a local radio link to an aggregator that’s on the cellular network across blah, blah, blah. It gets complicated.

And so it’s generally not the case that things can talk to each, rather can talk to anything. They can talk to their neighbors and they can pass messages along. And this is the the essence of me mesh networking. You see it used sometimes in meter reading, where not every meter might have a SIM card that lets you on the Internet or not every meter might be wired into something.

And a local community of of smart meters might talk to each other and find the one that has the connection to those the fiber optic backbone and send all of their readings into there. So this this idea of a mesh network where devices cooperate in a challenging communications environment is something you see in sort of limited applications of industrial networks. and These folks at Metropolitan Technology, they want to make this sort of intrinsic. It’s the natural way that everything in their system communicates. and When they communicate,

They don’t make connections like TCP connections. They don’t connect to each other. It’s hard to figure out who to connect to. When they communicate, they just say, here’s what I’ve got. I’ve got meter readings for this address. Here they are. And anybody who wants that information subscribes to it. So sending it out is called publishing. You give it a tag, meter reading dash four dash this address and you push it out. And anyone who wants that subscribes, they say, I want all of the tags that start with meter reading. 

And because this would be the the consumer, the accounting system. So this is the concept of of mesh, this is the concept of publish, subscribe, you don’t connect to things, you just push out what you have and other people, consume it. And this The PubSub is also used in industrial networks. Some, a small number of of big name industrial vendors like 20, maybe even 30 years ago when PubSub just sort of was invented, adopted PubSub wholesale and one or two of the big vendors out there use it internally everywhere.

But even everybody else in the industrial world most of them use Publish Subscribe to talk to the cloud because the MQTT protocol has emerged as the the dominant way to talk to the industrial cloud or the Internet of Things cloud. And MQTT is intrinsically Pub-Sub. So when, I don’t know, a smartwatch talks to the cloud or any smart device talks to the cloud, three times out of four, it’s using MQTT, it’s using Pub-Sub. So these are not alien concepts in industrial automation. But what metropolitan technology has said is, this is the right way to do this. And they’ve made it the default. It is the way their stuff communicates.

Andrew Ginter
So, if if I may, a clarifying question. I heard you use the phrase data plane and I hear people use that phrase from time to time and I’ve never figured out really what it means. In my understanding, in deep history, it it might have come from telecoms where you tended to have maybe two sets of wires, one for communicating the voice communications, the phone calls back 50 years ago, and a separate kind of communications mechanism for communicating metadata like call setup or billing or who knows what. Today, we’re talking the internet. What’s the difference between a data plane and any other plane in the internet if you can?

Daly Brown
Yeah, so I can jump in right here. So we we kind of divide our system up into two different planes, or we call them planes. We call it, one’s a data plane, and one’s a management and control plane. And the reason we differentiate this is because the actual means of communication are slightly different. So when we say our systems decentralized, we mean the data plane of our system decentralized. And the we consider the data plane all the operational technology data. So sensor data, actuator and control data,

Another type of data you would find when when, for example, working with an industrial control system. The management and control plane, we we we consider that things like access control, security policies, remote, the health and monitoring of the actual application, things like that. And so that part of our platform can be centrally managed and locally enforced, but but that’s why we differentiate between the two. And I don’t know, Nick, did you want to jump in there and clarify anything?

Nick Foubert
Yeah, I think you covered most of the points and and we’ll get into a little bit more about our zero trust security model. But I think Daly mentioned our platform is a software platform, but we organize it in two layers as far as the communication infrastructure goes. And we have like an outer layer that kind of provides a level of perimeter security. So there’s an encrypted outer layer.

Nick Foubert
And then the inner layer of that is another separately encrypted layer of data. And that’s just that’s typically what we refer to when when we’re talking about the data plane.

Andrew Ginter
Okay, so so just so I’m clear, like I said, I’m i’m old school. i never i For decades, I’ve wondered what this stuff means. What I’m hearing you say is that in the modern world, it’s more or less all internet protocol at the core, but we’re talking maybe a different set of TCP ports or maybe even UDP ports, different set of protocols for the data plane, which is sort of a collection of protocols and services that are focused on moving the data around or managing the data, and the management plane that is more a set of protocols and services focused on managing the system, configuring the system rather than dealing second by second with the data. Is is that fair in terms of terminology?

Daly Brown
Yeah, I’d say that’s a pretty good way of describing it.

Andrew Ginter
Okay, sweet. Because I’ve always wondered if it’s something much more exotic. So thank you for that.

Daly Brown
Well, you have it right. And it’s a good the example you gave earlier about the wires and the metadata for the telephone system. And then the actual voiceover, the voice part of it is is a pretty good analog to that. So I think that describes it pretty well.

Andrew Ginter
So let’s dig into some of your your your other concepts. So, you’ve you’ve mentioned mesh. you’ve mentioned mesh, you’ve mentioned cloud. You’ve defined a mesh communication system. What are you using it for? Sort of what’s the next layer of of functionality you’re building in?

Daly Brown
Yeah, so let’s maybe talk about the Zero Trust security model of it, because that’ll that’ll describe a bit about the kind of the the shell of how it works, and then we can kind of dive deeper into some of the other technologies that we’re applying to make the system secure. And so i just related to Zero Trust, our platform, we built it from the beginning to have a Zero Trust model. And the idea of relying on something like perimeter security for a software system never sat well with me, even before the term Zero Trust became sort of mainstream.

And so it’s one of the reasons we sought to build our own activity platform is that when we’re doing market research, we found that there’s too much implicit trust in many of the existing solutions. So we built our platform, as Nick mentioned, to consist of two independent security layers, each with their own zero trust security model. The the outer mesh network layer that we’ve built can overlay untrusted and unsecured networks. And so each node in that network is mutually authenticated continuously, whether it’s a service, user, or device. and so This provides broad protection against what we’d call like outside threats to an OT network. 

Then we have an inner peer-to-peer layer that provides data-centric encryption, ensuring that all data in our OT network is accessed on a need-to-know basis. And so this inner layer provides very fine-grained control over who sees what and who can connect to whom, helping protect what we’d say the OT network against insider threats. And so these access control policies are locally enforced at each node in our network,

And this ensures that the zero trust security model is enforced, even if we face network partitions or outages. So we take the idea of zero trust even further than that, actually, further than the conventional definition that includes only users, assets and resources. And we bring it to the actual granular data level so that even within a software process, there is zero trust regarding what data can be read or right read read or written by that software process.

And I’ll just say there’s actually some other practical benefits of our approach to this too, especially at the convergence of IT and OT networks. The outer layer provides like a well-defined perimeter in which to hunt for threats. So nodes in the and the network outside of our our software-defined perimeter, I’ll say, can be considered outside our threats. And we are mostly concerned about preventing sort of like denial-of-service attacks there and those types of of those types of threats. 

Then nodes that inside the perimeter it could be considered insider threats. In addition to this sort of denial of service attacks, we’re concerned about the confidentiality and integrity of that data that’s being passed around. But because of our security model, it makes a problem of intrusion detection easier inside are inside our mesh network. The data centric security part of our system preserves the integrity of the data, even if somehow some sort of node were to be compromised. And so we have a pretty, our defense in depth architecture provides, a lot of, I won’t say guarantees, but as much to the extent as possible, protecting the actual integrity and and of the data that’s being passed around.

Nate Nelson
At this point, I think the term zero trust has been used at least a dozen times in your interview, Andrew. And it always makes me cringe because of how that term is thrown around in IT. Are we using the term zero trust here in the same way that I’m used to it in IT spaces?

Andrew Ginter
Zero Trust has evolved. My original understanding of Zero Trust was basically what you do on the internet. If you want to log in if you want to connect to anything sensitive that that involves money, you have to give a password. You have to have an encrypted connection. That was Zero Trust. you don’t give someone money just because you like their IP address, the way you might trust someone on an IT t network, old school. The modern sort of vision for Zero Trust has evolved into something that’s loosely yeah sort of marketing talk for Active Directory.

If you look at a lot of the zero trust standards that are coming out of NIST, the US National Institute of Standards and Technology, they’re talking about zero trust as single sign on, where instead of sending your username and password around every to every machine, everything you want to do, you sign on once to a zero trust broker. know which is an Active Directory server, and you get back a Kerberos ticket, which is sort of an encrypted thing that represents you and your credentials. 

And now you can go to the print server or you can go to the SAP server and and not have not be challenged for another password. That’s sort of the the thing on IT networks.

What I heard Daly and Nick talking about here was sort of zero trust taken to an extreme, where every message that is exchanged in the system is authenticated somehow, not just every log in, every message. Not just every connection, there are no connections. It’s all published, subscribed. 

So you can do really fine-grained permissions. You can say, these kinds of users have permission to subscribe to these topics, this data that I’m putting out, but not that data that I’m putting out. Every topic can have different permissions. And every message where you exchange a topic and a piece of data has to be signed. And so in a sense, it’s the IT t style of of zero trust, let’s say taken to an extreme.

Andrew Ginter
So thanks for that. So we’ve we’ve talked about mesh, we’re talking about authentication, constant authentication within the mesh. This is, in a sense, the the nature of zero trust. You’ve also, but you I know that you guys touch on formal methods and this drew my attention. I mean, correct me if I’m wrong. Can you tell us, what are you doing with formal methods? 

If you might sort of correct my misconception if there is one. When I hear formal methods, I imagine we’re proving things correct. Now, I know formal methods might actually be a little different from that. Can you talk about, for you, what is formal methods? And what are you applying it to? What’s what’s the goal? What are you achieving with that?

Nick Foubert
Yeah, so in short, you’re you’re you’re right. In essence, formal methods is about proving things correct. So formal methods essentially are rigorous mathematical techniques to specify, analyze, design, implement, or or verify a system.

And I know a lot of people get scared or maybe bored when the topic of math comes up. So I’ll try to keep it more high level here for your listeners. So under the umbrella of formal methods is another concept called formal verification. And that’s that’s where you’re proving the correctness of the system with respect to some formal specification or or property of that system. So and examples of that typically are like safety or security properties.

And traditionally, outside of academia anyways, formal methods have been used mainly in the most safety and security critical systems like like avionics. And that’s mainly due to the expertise and the specialized tools that are needed or or were needed anyways. But we’re starting to see more traction across industry as awareness grows of these tools and and the tools are actually becoming easier to use and to apply. So one example of of their use in industry actually comes from our home province of Ontario, where formal methods were used from the design through the verification of a software-based shutdown system for the Darlington Nuclear Generating Station. 

So I think they spent about a decade working on formal methods, again, across design through verification to ensure that the shutdown system was in fact safe. So there’s many different techniques that fall under the umbrella of formal methods. and today, there’s lots of tools that actually help developers automate those techniques. 

Some of the ones we’re using at Metropolitan include things like static analysis, model checking and automated proof tools. And I’ll dig into these a little bit so your listeners understand a bit what those are. So static analysis is is quite widely known actually in the software development world.

And it essentially consists of automatically checking certain properties of a computer program at compile time. So before you’re actually running it in a test environment or running it in a production environment. So examples of the types of properties that can be checked by static analysis include things like that the programmer who wrote the program isn’t using language features that are potentially unsafe.

Because programming languages in general allow you to do a lot of things with the with the underlying hardware and in safety critical systems, sometimes you don’t want to use that that full feature set because you can introduce errors into the program that can cause safety or security issues. Some other things you can check with static analysis are that well-known program weaknesses or vulnerabilities aren’t actually in the program.

So theres databases that are available publicly that actually list the types of weaknesses and vulnerabilities that are often found in software. And some of these static analysis tools can actually check those databases and check it against and your source code to make sure that the programs aren’t actually implementing those weaknesses or vulnerabilities. 

Also things like linters, which essentially means analysis tools that look at the source code of your program and make sure that they need a certain quality specification. So this is often used to make sure that the the source code can be read, maintained, and updated by somebody other than the original programmer. And this kind of thing is really important in software systems that have to live for quite a long time. So the original developers of the system might not be around anymore.

And you need programmers to make updates or to make fixes to the software. So you want to make sure that they can actually read and maintain that software. So we have tools that help us ensure the quality even of the source code. So another one that I mentioned was model checking. So that refers to automatically and exhaustively checking that certain logical properties of a system are true or false for all reachable states in that system or for some specific state of that system.

And usually we use simplified models of the system, but we ensure that they include the like essential states of that system and and how they might change over time. So in model checking, one type of property you would want to prove is called a safety property. And that’s something that the system should not do, should never do. So examples of that might be like in a financial transaction system, you want to prevent double spending. So someone can’t spend the same money twice simultaneously on different transactions.

In an electronic medical record system, you want to ensure that you’re not leaking private personal data. Or in an industrial control system, you want to make sure that you’re not issuing your system is not issuing a command that could cause physical harm or system damage. And then automated proof tools layer on top of that, and they can do things like automatically prove the absence of any runtime errors in your software implementation.

So most people familiar with software development are aware of software testing, and software developers spend a lot of time writing tests against their software, and that’s great. But software testing cannot be exhaustive, but we do have proof tools that actually can prove conclusively and exhaustively that there are no bugs in your software that will cause runtime errors.

So, and you can prove other things, like specified data dependencies and flows in the program. So you want to ensure that certain software modules data only flows from one to the other and doesn’t leak back out the other way. Or you can prove that a software implementation is correct with respect to like a functional specification. So your customer has some functional specification of what they want the piece of software to do. You can write that specification in a programming language.

And then that same programming language could actually be used to prove that your implementation of that is correct with respect to that specification. So again, typically that’s done with with testing and demonstration to your customers and sometimes by analysis or or inspection of the the code itself. So these proof tools add another level of and assurance that because they can do sound reasoning, you can you can get assurance that your software actually does what the specification wants it to do.

So the last thing I want to say about formal methods is that awareness of formal methods is growing pretty rapidly in industry. So for instance, last year, the White House released a report called Back to the Building Blocks, which is part of the United States national cybersecurity strategy. And it explicitly calls on industry to start adopting formal methods into their software engineering. And if your listeners do a quick Google search, they’ll be able to find examples of how formal methods are being adopted by some of the giants in computing, like Amazon, Google, Nvidia and of course, Metropolitan Technologies.

Andrew Ginter
If I might dumb it down, I mean, if I have a superpower, it’s dumbing things down, sorry. If I might dumb it down, in my understanding, it’s very difficult to apply formal methods to very large software artifacts. I mean, the the Windows operating system is said to have, I don’t know, something like 100 million lines of code in it. A web browser has, I think, more than 10 million lines of code in it.

The industrial control system products I worked on back 20 years ago in my youth were two, three million lines of code. It’s difficult for me to imagine applying formal methods to a body of of of software that large and so in my dim understanding, dumbing it down, we use these techniques on smaller software artifacts or possibly even with a larger artifact with a specific goal. So for example, waterfall produces a unidirectional gateway. Now, we don’t, to my knowledge, I’m i’m not part of the dev team, to my knowledge, we’re not doing formal methods, but we, we have been certified against but common criteria. Common criteria is a kind of certification. 

The way it works is the vendor makes a claim. We claim the hardware is truly unidirectional. It does not matter what you do to the software. It’s not physically possible to send any information back in the other direction. So you make a very specific claim. 

The common criteria tests the system against that claim. In my understanding, it’s analogous you know, correct me if I’m wrong, to to formal methods where you say this is what we’re gonna prove and you state that thing, whatever it is, reasonably simply. It has to be reasonably simple to to have a high degree of confidence in your in your proof. And then you set about using formal methods to establish that property. 

Can I ask you, what properties have you focused on in your use of formal methods? What is it that you believe that your customers or the marketplace are going to want to to to have a high degree of assurance for?

Daly Brown
Yeah, absolutely. So you you make a good point. Formally proving properties of a whole industrial control system or an operating system with millions and millions of lines of code is an enormous undertaking. And so we’re very aware of that. We started with proving the absence of runtime errors of what we identified as critical components of our software. 

Our long-term view is to prove as much as humanly possible that our code has no vulnerabilities. As that means like making a foundation, building a foundation, a software foundation and in libraries, in small building blocks that we formally proven that we could build on top of. And so we we believe that proving at least part of your code is a step in the right direction for security. Is this one of the best techniques available to us for us to reason about software? And so I’ll let Nick elaborate a bit about a couple of focus areas that we’re working on to prove and and where we think that’s valuable.

Nick Foubert
Yeah, to get a little bit more specific. So yeah, we’re focused right now, but essentially in in two areas. Uh, first at the points where our software consumes or ingest data from other systems and where our software produces and publishes data at other systems. So, because interaction between software components is generally governed by protocol and format specification. They can either be standardized or sometimes they’re proprietary.

But most software developers will know that poor input validation, whether that inputs from from a human in in a GUI or from another piece of software, is the source of many vulnerabilities and it’s a common attack vector. So a core part of our platform’s value is in connecting and translating between many different communication protocols and formats. 

So we’re using specification and code generation tools that can prove that our data input and output models will only accept and create valid data, and that those modules themselves are free of runtime errors. so For instance, you can’t bad input data won’t cause a buffer overrun in the software and cause the software to crash. The second area we’re focused on really is our overall software engineering tooling and practice.

So for example, all of our own software is written using the Spark programming language. And just to not confuse your listeners, it’s not the Apache Spark tool, but it’s it’s a programming language that’s been around for for quite a while. It’s part of a subset of what’s known as the Ada programming language, which was originally designed for the United States Department of Defense. 

It’s a mature but modern language designed for the most safety and security critical systems. So using Spark as a programming language gives us a high level of software assurance as a baseline because the language itself was designed for critical systems and it includes a bunch of features that help programmers write error-free programs. But on top of that, it also allows you to progressively apply additional methods to prove critical data flows, program integrity, and functional correctness, things I was talking about earlier.

And so the ability to progressively apply these methods means that you don’t have to fully prove a program or an entire system from the get go. You can focus on your critical components first. So again, in our case, we’re mainly focused on where data comes in and how data goes out, because those are aspects of the system that we have less control over. 

We’re using the higher assurance proof tools on those components first, but we’re building all of our software using this language that’s going to allow us to progressively increase the assurance in areas that we deem to be of higher criticality.

Andrew Ginter
Let me ask you though a hard question. I remember this was maybe close to a decade ago. The SSH protocol, a bunch of mathematicians proved the protocol correct. What does that mean? They proved that if an implementation complies with the protocol, if there’s no defects in the implementation, then the protocol could not leak key information into the internet, into anybody, so that the the keys for the the communication would always be secure. A year later, they broke the protocol, not an implementation.

They broke the protocol. they they they They demonstrated how every compliant implementation of the SSH protocol could be manipulated into leaking the first 14 bits of the key information. How did they do that? It turned out that the mathematical proof proved an assertion. No key information leaked. Proved the assertion at a certain level of abstraction.

What they did was they said, well, we’re using the internet, we’re using TCP, but and they they modeled TCP mathematically as sort of a tube where you stick characters in one end, and they come out the other end either in the same order, the same characters, or the connection is torn down. That’s sort of the that that the assumption, the mathematical model they had of of the TCP protocol. And of course, everyone knows TCP is messages and you can fit multiple characters into a single message. It’s not one character at a time, it’s sort of messages at a time. 

They used a timing attack to prove that every compliant SSH implementation could leak the first 14 bits by taking advantage of information about messages that that are are sent back and forth and the time stamps in those messages.

And so really, they they they attacked the protocol below the level of the proof, but but below that level of abstraction. Can you talk about Can you talk about sort of applying this principle to your system? At what level of abstraction are your proofs what bluntly? Should we believe these proofs?

Daly Brown
Yeah, so you make some good points. Formal methods can be applied at multiple levels. So like you mentioned, at the specification of the protocol, in this case, SSH, you can apply them at the systems level, for example, app proving a distributed system has various properties. And you can apply them at the implementation level. And this is where we are concentrating. But also, like you mentioned, they’re only as good as the specification in which they are proving. So it’s not a panacea.

One of our main assumptions though is the cost of integrating formal methods into our engineering processes are coming down to the point where the benefits outweigh the costs, even for a startup. So even though you might, it’s true that if there’s a flawed assumption in the specification, then even the implementation improving, improving implementation itself against that specification could also have flaws. 

We are human after all writing these specifications. It does though give us the best defense to to the extent possible, that we’re guarding against certain security and and and safety vulnerabilities. and And the tooling is getting better to allow us to identify some of these. And maybe, Nick, you can talk a bit about that.

Nick Foubert
Yeah, so I mean, the the tooling has improved to the point where you can progressively adopt that the techniques of formal methods. But it is best to start as early as possible. So at Metropolitan, we we started right from the beginning, from day zero. And our platform is not fully proven yet, as we mentioned before. But I mean, think about it. Would you rather use a tool that has been proven to be correct with respect to some set of assumptions and where the developers are working on proving more and more of it correct with respect to even more conservative assumptions, or would you rather use something that is just as secure as the human developers can think to make it before they move on to their next task. 

So we don’t use, well sorry, where we don’t use formal verification yet, we cover everything else with automated testing to help ensure the reliability of our platform. And we think this hybrid approach of mixing formal methods on the most critical components first and filling in gaps with automated testing is the most practical approach for us right now.

So I think maybe to get back to the the example you give and just just bring this back up to to a high level, one thing I think is important to bear in mind is that I think adopting formal methods into your engineering process forces you to do a bit of thinking in advance. 

For instance, in the cybersecurity space, your your formal proofs are only going to be as good as the models you’ve developed to eventually develop your software and prove that software correct.

But when you’re developing your model, you need to make sure that you’re accounting for things like your threat model. So there are things there are unknown unknowns in the world, and we have to do some deep thinking at the start of our design to try to capture the full threat landscape as best as we can.

Um, and these tools will help you prove that your software implementation is robust against those threats. But again, like Daly said, it’s not a panacea there again are unknown, unknowns. And sometimes your models don’t account for certain threats that you could not have predicted.

Andrew Ginter
So there’s been some long answers here, Nate. I asked a complicated question. I got a bit of a complicated answer that was very carefully couched saying, well, nothing is ever perfect. We’re not guaranteeing that our stuff is perfect. they had sort of the bottom line buried in a paragraph. I wanted to emphasize, in my mind, the bottom line is this.

When you are looking for a software platform, a technology platform to build an important function on, a safety critical function, a function that’s essential to critical infrastructure, when you’ve got something important to do, you have two choices. You can either use a bunch of software as your foundation that a bunch of developers have worked on and then moved away, did what they could and moved on. 

Or you could use a bunch of software that has been proven correct under a bunch of assumptions against a bunch of threats in light of a bunch of of contingencies. And the development team continues to prove the thing correct and evolve it to become more correct against more threats and and more conditions.

Which would you rather? The one that people kind of do what they do and move on? Or the one where you’re determinedly proving it against a bigger and bigger body of of potential gotchas? I’d rather use the platform that’s been proven correct. No, it’s not perfect, but it sure is a big step in the right direction is what it sounds like to me.

Andrew Ginter
What you’re building is, in a sense, a platform. It is software that other developers can use to develop industrial control systems. And when you have sort of infrastructure like that, in my experience of the marketplace, in what vendors I’ve worked for, you tend to see programming platforms adopted by vendors who are building industrial control systems when there’s sort of a compelling need. The compelling need might be the threat environment changes. I don’t know. a gazille A gazillion Stuxnet’s hit everybody next year. 

It might be that there is a new kind of application, I don’t know, smart car chargers or something that demand a new way of doing things. Or, it, it, it might be that it solves a long standing industry problem. And the whole industry looks at it and says, that’s the right way to solve this problem. And all of the new development switches to the new platform.

You know, Java was a little bit like that. Can I ask you, how’s it going in terms of adoption? Do you see a triggering event? Do you see a new industry? Do you see a killer app for this platform that you’re producing?

Daly Brown
I’ll say that, the digital transformation of industry, is, is happening. And so, historically, a lot of these OT networks, industrial control systems, et cetera, have been isolated and they’ve been off the internet and for valid reasons, for cybersecurity and other reasons as well. But there’s, there’s new requirements coming out and, and I think it’s, you can’t ignore the fact that Increasingly, 

OT networks are becoming connected to the internet, and and there’s reasons people want to do this. know Compliance with new cybersecurity requirements, asset management, so remote monitoring and and management of all the assets in the network, applications like predictive maintenance become possible. And so with this digital transformation, though, there’s a lot of new like threat cyber threats possible to to the existing infrastructure and architectures of of these OT networks.

And so in addition to that, there’s incoming macro triggers and and one, for example, could be quantum computers and the threat of quantum computers against traditional cryptography. And so having a platform that can address some of these emerging requirements, connected OT networks, and some of the existing vulnerabilities of the other solutions that exist today, I think is valuable.

And so we’re trying to address that. And at the same time, we’re also cognizant and sensitive to to being as non-disruptive as possible to the existing networks. Because we know people, organizations buy these and they expect to operate their their networks, their OT devices for decades. And so the the sales cycle and and the lifetime of these of these devices and and systems is very long. And so we have a platform, but we also have a means of less disruptive integration of our technologies, and we’ve kind of alluded it to it earlier, and we’ll talk a bit a bit more about this in a bit, about having it pre-integrated on an edge appliance where you can just sit in front of an OT network to get a lot of the benefits that we’ve talked about yeah without having before having to fully disrupt the current operations of the OT environment. And so to that point, yeah, I think there’s triggers coming, and I think there’s there’s value in our platform moving forward, and there’s ways of getting the benefits of it without being very disruptive to the existing networks and systems.

Andrew Ginter
You folks are a startup. You’ve got a bunch of stuff that you’re done. You’ve got a bunch of stuff that that you’re doing. How’s it coming? Uh, have you, is there, is there adoption? Is there technology? how are things going?

Daly Brown
Yeah, so as a startup founder, it’s a roller coaster. But it’s rewarding that you’re building a product that you think will make a difference, especially in an industry that has such an impact on people’s day-to-day lives. And so in terms of where we’re at, we’re doing ongoing customer discoveries. so We’re talking to industrial control system vendors. We’re talking to operational technology operators. We’re making sure we provide a solution that’s desirable and feasible and trying to best position ourselves in the market. We’re also building partnerships with industries. so These include multinational system integrators, security companies, and other technology partnerships to go to market with. And Nick, you want to talk a bit about the R&D stuff we’re doing in the piloting?

Nick Foubert
Yeah, so we’re still in the research and development and piloting phase of our product right now. We have a beta release that’s actually out in the wild in a controlled test environment, and that allows us to rapidly iterate and test our platform. We’re working towards what you’d call maybe a version one release later this year, probably in the fourth quarter. 

We’re also working on a pre-integrated edge appliance that can sit in your OT network just in front of a device like a bump in the wire to provide a non as non disruptive as possible integration of our edge technology. And we feel like this is the path of least resistance to getting our technology deployed in the most critical of infrastructures where people tend to be a lot more risk averse.

Andrew Ginter
And so that all makes sense. this has been, this has been great. I’m, I love learning stuff, thank you for this, but this is in a sense so different from what I do every day. I’m not even sure I’ve asked you the right questions. So let me ask you an open question. what else do you want to talk about? What have I missed? What didn’t I ask you that I should have?

Daly Brown
Yeah, actually, that’s a good question. And there’s a really cool pilot project that we’re working on. So we’re working on a quantum state version of our platform, specifically for industrial control systems. And we’re using the DNP3 ecosystem, which is a pretty standard protocol used in utilities, as the first use case for this new configuration of our platform. And we’re ensuring the protocol is secure against existing and emerging threats, such as quantum computers.

It’s a very exciting project and we’re working with really bleeding edge security technologies and we’re integrating a brand new state of the art cryptography, which has been independently proven to have information theoretical security. And so what that means is it’s as a system that’s secure against adversaries with unlimited computing resources and time, such as a quantum computer. And I just want to clarify, this is not the same thing as post quantum cryptography, which has been quite a bit in the news lately and which NIST has recently standardized and also which OT operators will be expected to adopt in the future and which we also actually also support. 

But it’s actually using a different type of key distribution technology altogether. And so we’re we’re part of a consortium of a bunch of world-class companies and universities delivering this capability. And I’m just going to shout out our partners. And so it’s Thales Digital Identity tallli digital identity identity Services, which is a group from the Thales multinational french company, Quantum Bridge, which is a Canadian startup, and the University of Toronto. And funding has been contributed by one of Canada’s most prestigious funds as well. And there’s actually going to be a public announcement in the next few weeks about the project.

Andrew Ginter
Well, Nick Daley, this has been educational. I love episodes where I learn something. Before we let you go, can can I ask you, can you sum up for us? What so what what should we be be thinking about, learning about in this space?

Nick Foubert
Yeah, thanks a lot for having us, Andrew. This has been a great experience. So I guess what I’ll end off saying, though I might be preaching to the converted, is that cybersecurity shouldn’t be an afterthought, especially in critical infrastructure systems. These systems are essential to our day-to-day life. And I think that that we, as in the general public, often take their reliability for granted. It should be table stakes that cybersecurity is intrinsic in the development of any operational technology. And we take that very seriously when we say that we are secure by design and default. I don’t know, Daly, if you want to wrap it up.

Daly Brown
Yeah, so I’ll just say, the vision of our company is to be the digital backbone of the world’s critical infrastructure. And we don’t consider ourselves like a traditional cybersecurity company. We’re a connectivity company that takes cybersecurity seriously. And we aim to market a product that’s easy to integrate and to secure from the ground up, allowing organizations to reap the full benefits of their digital transformation and at the same time increase the resilience of their infrastructure. 

So I’ll just end by inviting all the listeners to connect with us on LinkedIn or send us an email to schedule a demo or just chat some more about the technology we’re building. And we’re always looking to collaborate, partner, or pilot our technologies with other organizations. So we hope that what we’ve discussed resonates with some of the listeners out there, and we can continue this conversation. And so thanks again for having us, Andrew. And we’d love to come back on the podcast in the future to catch up. And that’s it.

Nate Nelson
Andrew, that just about concludes your interview with Daily Brown and Nick Foubert from Metropolitan Technologies. Do you want to take us out now with a final word for our listeners?

Andrew Ginter
Sure. I love episodes like this where I learn stuff and I’ve heard of formal methods for a long time and they slowly are making are making inroads in different industries. I was keenly interested to see how how these folks were using it in the industrial control system space. To me, the the challenge and in in the in the interview here, it’s clear that that Nick and Daily understand the challenge. The challenge business-wise with this kind of endeavor is that you’re not selling the platform to end users, okay, to power companies. Power companies don’t buy this kind of platform because as much as possible, power companies really try hard to avoid writing code. They wanna buy stuff, not write stuff themselves. You gotta sell this stuff to people who write a lot of code.

So know who in the industrial space is writing a lot of code right now? Well, to me, the huge opportunity is renewables. It’s taking what the oil and gas sector that is being transformed, taking the electric sector that’s being transformed, there’s a lot of new code being written, not just existing control systems that have a new feature or six. We’re talking brand new stuff for electric vehicle, high speed chargers and load balancers within the grid for the high speed chargers, a whole bunch of new code being written. 

To me, that’s that’s an opportunity. These developers, these businesses who write in that code are are looking around for the platforms they’re going to use. Is it the same old or is there something new and better? Because when you start the process, that’s the the huge opportunity for embedding a new platform and a new way of thinking about it. So that that sector strikes me as as ripe for innovation like this. And it’s lovely to see that there’s options like this in the marketplace. We’ll we’ll have to see how they shake out over time.

Nate Nelson
Well, thanks to Daly and Nick for elucidating all that. Andrew, thank you for speaking with me.

Andrew Ginter
It’s always a pleasure. Thank you, Nate.

Nate Nelson
This has been the Industrial Security Podcast from Waterfall. Thanks to everyone out there listening.

Stay up to date

Subscribe to our blog and receive insights straight to your inbox