Businesses increasingly rely on IoT devices to help expand and streamline operations, optimize processes and better engage customers. The trend is so significant that the value of the IoT market is predicted to exceed $1.3 trillion by 2026.
But relying on IoT devices also means managing them – maintaining, monitoring and updating them. Provisioning, management and telemetry control of diverse IoT devices is a complex task – especially when your fleet includes hundreds, or thousands, of individual devices. There is no single IoT device fleet management solution that addresses all possible business scenarios and needs, but there are some solutions and providers leading the pack.
In this five-part series, we’ll detail key elements of IoT device fleet management, including architecture, over-the-air (OTA) update capabilities, device provisioning, and cybersecurity. And we'll explore three popular platforms:
- Torizon is an open-source OTA platform built with the Yocto Project and based on the Toradex Embedded Linux BSP.
- Balena is a complete set of tools for building, deploying, and managing fleets of connected Linux devices.
- Mender is a secure, robust, open-source OTA software update manager for IoT devices.
To get started, let’s take a 30,000-foot view of what makes a successful solution.
Characteristics of a Winning Fleet Management Solution
To deliver business value, an IoT fleet management solution must give the enterprise 24/7 visibility into its entire device fleet, including location and technical specs. The solution must also allow remote access to every device in the fleet in order to monitor the devices’ state, run diagnostics, perform minor repairs such as device resets, and of course perform over-the-air (OTA) updates.
If any device in the fleet crashes or malfunctions, the solution must be able to send alerts. Of course, the solution must encompass robust cybersecurity tools and processes to safeguard the software in every fleet device at all times. It should also support full automation, so that fleet management can operate optimally without human intervention.
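To make the alerting requirement concrete, here is a minimal sketch of heartbeat-based failure detection. It assumes each device reports in periodically and that silence beyond a timeout should raise an alert. The `FleetMonitor` class, the device IDs and the 60-second threshold are all hypothetical and not taken from any of the platforms discussed in this series.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FleetMonitor:
    """Tracks the last heartbeat per device and reports silent ones."""

    timeout_s: float = 60.0  # assumed threshold; tune per deployment
    last_seen: dict[str, float] = field(default_factory=dict)

    def heartbeat(self, device_id: str, now: float) -> None:
        # Record the time (in seconds) at which the device last checked in.
        self.last_seen[device_id] = now

    def stale_devices(self, now: float) -> list[str]:
        # A device is stale once it has been silent for longer than the timeout.
        return sorted(
            device_id
            for device_id, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = FleetMonitor(timeout_s=60.0)
monitor.heartbeat("sensor-001", now=0.0)
monitor.heartbeat("sensor-002", now=50.0)
for device_id in monitor.stale_devices(now=100.0):
    print(f"ALERT: {device_id} has missed its heartbeat window")
```

In a real deployment the heartbeat would likely arrive over the telemetry channel and the alert would go to an on-call or ticketing system rather than standard output.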
A reference architecture outlines recommended integrations of IT products and services to form a solution in a way that embodies accepted industry best practices. For IoT device fleet management, no single architecture will suit the requirements of all use cases. Still, a scalable architecture flexible enough to support specific requirements across a spectrum of use cases is highly valuable, as it gives developers a starting point for creating effective solutions.
Keep in mind that a reference architecture is an abstract presentation of the system, without implementation specifics and low-level details (protocols, storage, APIs, etc.). It is built from software structures defined by the relationships between the components of its subsystems.
In general, IoT systems are distinguished by their distributed (or edge) devices and a central IoT core, which is managed by one of the major cloud providers. An effective IoT reference architecture conforms to the client-server architectural pattern and is characterized by Zero Trust security for connected devices, horizontal scalability, rapid application deployment, AI/ML processing, etc. (Zero Trust is a security framework that mandates that all users, both inside and outside the enterprise’s network, are authenticated, authorized and continuously validated for security configuration and posture before being granted access to applications and data.)
Here’s what this reference architecture might look like.
The client (IoT device) consists of various hardware and software components, including sensors, transport certificates, firmware, etc.
The server (cloud infrastructure) communicates with the client (a set of devices) and facilitates device registration, data processing and software updates using cloud managed services (IoT core, device registration tool) or installed software (OTA platform).
OTA (over-the-air update) capability is considered a major quality attribute and is mandatory in the design of ICS’ IoT device management solutions.
The IoT core can update a group of devices, or even a single device, rather than the whole fleet. Generally, IoT devices act as subscribers: following the Pub/Sub pattern, they subscribe to events or updates on the server side via MQTT topics. However, the IoT core cannot update or recover a device after a severe failure, because the device has become a brick. Device update operations therefore need to be safe and reliable. (OTA platforms like Balena, Torizon and Mender, which we’ll explore in detail in part 4 of this series, offer an alternative to the distilled IoT core.)
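The group-versus-single-device targeting described above usually comes down to how topics are laid out. The sketch below simulates the idea without a broker: each device subscribes to a device topic, a group topic and a fleet-wide topic, and the server picks its audience by choosing which topic to publish to. The `updates/...` topic layout, the device IDs and the group names are assumptions made for illustration, not a scheme prescribed by any particular IoT core.

```python
from __future__ import annotations


def subscriptions_for(device_id: str, group: str) -> set[str]:
    # Hypothetical layout: one topic per device, one per group, one for the fleet.
    return {
        f"updates/device/{device_id}",
        f"updates/group/{group}",
        "updates/fleet",
    }


def receivers(publish_topic: str, fleet: dict[str, str]) -> list[str]:
    """Devices whose subscriptions include the topic the server publishes to."""
    return sorted(
        device_id
        for device_id, group in fleet.items()
        if publish_topic in subscriptions_for(device_id, group)
    )


# Device ID -> group membership (illustrative data only).
FLEET = {"dev-1": "line-a", "dev-2": "line-a", "dev-3": "line-b"}

print(receivers("updates/device/dev-2", FLEET))  # ['dev-2']
print(receivers("updates/group/line-a", FLEET))  # ['dev-1', 'dev-2']
print(receivers("updates/fleet", FLEET))         # ['dev-1', 'dev-2', 'dev-3']
```

With a real broker, the matching is done broker-side and the device simply registers these subscriptions when it connects.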
Choosing between architectural patterns is a trade-off between system qualities. One such trade-off relates to the security principle of fleet management design, which dictates the selection of mTLS (mutual TLS) or a similar authentication/authorization protocol. This protocol affects the complexity and usability of the device provisioning scenario, but in this case security wins out: mTLS must take precedence over other IoT device protocols. That makes mTLS something to consider when making decisions around architecture.
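For readers who have not configured mTLS before, here is a minimal sketch using Python's standard `ssl` module. It shows the one setting that makes TLS "mutual" (the server demands a client certificate) and why provisioning gets harder (every device needs its own certificate and key). The function names and the file-path parameters are hypothetical, and the paths are optional so the sketch runs without real certificates.

```python
from __future__ import annotations

import ssl


def make_server_context(
    server_cert: str | None = None,
    server_key: str | None = None,
    device_ca: str | None = None,
) -> ssl.SSLContext:
    """Build a server-side context that rejects clients without a valid certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # The "mutual" part: the server requires and verifies a client certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if server_cert and server_key:
        ctx.load_cert_chain(certfile=server_cert, keyfile=server_key)
    if device_ca:
        # Only device certificates that chain to this CA will be accepted.
        ctx.load_verify_locations(cafile=device_ca)
    return ctx


def make_device_context(
    device_cert: str | None = None,
    device_key: str | None = None,
    server_ca: str | None = None,
) -> ssl.SSLContext:
    """Build a device-side context that verifies the server and presents its own identity."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=server_ca)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if device_cert and device_key:
        # This per-device credential is what provisioning has to deliver.
        ctx.load_cert_chain(certfile=device_cert, keyfile=device_key)
    return ctx
```

The provisioning cost mentioned above is visible in `make_device_context`: without a certificate and key installed on the device, the handshake against a server built with `make_server_context` will fail.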
Here’s another trade-off: balancing the importance of OTA updates against cybersecurity. A successful IoT solution will typically provide the capacity to remotely obtain and update its own software, for instance a new UI feature, a background task or an operating system update. (For ICS’ IoT fleet management solutions, OTA is listed as one of the major capabilities, or quality attributes, of the fleet management system.)
But support for OTA capabilities conflicts with a quality attribute that is likely more prized: security. Cybersecurity is always a concern, because using IoT devices increases the system’s attack surface.
Once an IoT device is aware that OTA updates are available, the device should attempt to download the updates. But how? One approach is to connect to a dedicated server and download a released artifact. However, since an IoT device is already connected to the cloud via a secure telemetry channel, which typically operates using the Message Queuing Telemetry Transport (MQTT) protocol, use of a separate mechanism for OTA updates increases the attack surface for hackers.
A workaround is to download new updates via the MQTT channel. Using the MQTT channel is also more memory-efficient as there is no need for an HTTP client or an additional Transport Layer Security (TLS) channel. (However, this is mostly beneficial when the microcontroller is executing the application and networking stack in the same memory space.) Whether the OTA update downloads via the MQTT channel or from a separate HTTPS server, the device will need to support protocols such as TLS when it first establishes a secure connection.
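Delivering an update over the MQTT channel generally means splitting the image into message-sized chunks and reassembling it on the device. The sketch below simulates that flow in memory, with no broker involved. The `OtaReceiver` class, the chunk size and the idea of sending the expected SHA-256 digest ahead of the chunks are assumptions for illustration; a production system would typically also verify a cryptographic signature, since a digest alone only detects corruption.

```python
from __future__ import annotations

import hashlib


def split_into_chunks(image: bytes, chunk_size: int) -> list[bytes]:
    # Server side: cut the image into payloads small enough for one message each.
    return [image[i:i + chunk_size] for i in range(0, len(image), chunk_size)]


class OtaReceiver:
    """Device side: collects chunks, then verifies the image before use."""

    def __init__(self, total_chunks: int, expected_sha256: str) -> None:
        self.total_chunks = total_chunks
        self.expected_sha256 = expected_sha256
        self.chunks: dict[int, bytes] = {}

    def on_chunk(self, index: int, payload: bytes) -> None:
        # Keyed by index, so duplicate or out-of-order deliveries are harmless.
        if 0 <= index < self.total_chunks:
            self.chunks[index] = payload

    def missing(self) -> list[int]:
        return [i for i in range(self.total_chunks) if i not in self.chunks]

    def finalize(self) -> bytes:
        if self.missing():
            raise ValueError(f"missing chunks: {self.missing()}")
        image = b"".join(self.chunks[i] for i in range(self.total_chunks))
        if hashlib.sha256(image).hexdigest() != self.expected_sha256:
            raise ValueError("image digest mismatch, refusing to install")
        return image


firmware = bytes(range(256)) * 4  # stand-in for a real firmware image
chunks = split_into_chunks(firmware, chunk_size=100)
receiver = OtaReceiver(len(chunks), hashlib.sha256(firmware).hexdigest())
for index in reversed(range(len(chunks))):  # simulate out-of-order delivery
    receiver.on_chunk(index, chunks[index])
image = receiver.finalize()
print(f"received {len(image)} bytes in {len(chunks)} chunks")
```

Because the receiver only hands back the image after the digest check passes, a partially received or corrupted download never reaches the install step.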
Although remote attacks like “man in the middle” are the most common security threats, a successful IoT device fleet management solution must also incorporate protection against physical attack – especially because large deployments may attract more sophisticated attacks.
To hinder physical attacks, the IoT device should prevent attackers from easily reading everything in memory. For example, JTAG ports should not be open for use on a production device. The IoT device must also store security credentials and code images in encrypted form. Leaving them unencrypted in memory poses a threat, since attackers might find a way to read from or write to that memory.
In part 2 of our series, we’ll explore specific requirements around OTA platforms. (Read more about OTA updates.) Looking ahead to part 3, we’ll delve deeper into cybersecurity concerns and identify ways to secure your device fleet.