Over-the-Air Updates, Part 1: Introduction

The Qt for Device Creation offering has been successful at bringing many new and exciting products to the market by significantly reducing time-to-market with its pre-configured software stack and toolchains for rapid UI development. We would like to take this a step further by providing an opt-in feature for Over-the-Air (OTA) system updates.

What is an OTA update?

An OTA update is a mechanism for distributing software updates over a wireless network without requiring physical access to a device. For a target device to be able to update wirelessly, its software must support this.

The significance of OTA updates.

With everything connected to the Internet in the Internet of Things era, and with everyone owning a smartphone, users expect more from their devices. As a result, OTA has become an increasingly important component of modern embedded devices. Embedded software now demands full lifecycle support, including fixes for software bugs. Software has to continuously provide new features to attract new users and retain existing ones. Being connected to the Internet also means higher exposure to security exploits, which need to be addressed in a timely fashion.

With OTA, updates can be sent to all users from a central location - software updates no longer require expensive product recalls, trips to customers or climbing up a tower to gain physical access to a device. Technology and trends change quickly, so it is vital to be able to update devices after they have been shipped to customers. Otherwise they will soon end up in a junk box. We can see OTA updates in all sorts of devices these days, even in cars, which traditionally required a trip to the mechanic whenever software needed to be updated. The times when it was sufficient to deploy the software once, make the medium read-only and ship it are over (of course, there are still use cases where this approach is perfectly fine).

System updates are complex.

Many things can go wrong during a software update and leave the system in an inconsistent state. A failed update could render the system unusable: a device that does not boot or goes into an infinite reboot cycle, applications that do not start properly, or missing configuration files. Updates could also be compromised or tampered with, so the security aspect should be addressed as well. As you can see, there are a lot of failure modes to think about. Countless solutions to this problem exist, but they are often ad hoc, difficult to customize, incomplete or distribution-specific, or they do not meet all the desired requirements. While looking at the many existing update solutions, we have gathered a list of requirements that we consider essential for an OTA update system:

    • Flexible and Reusable.

An update solution should not lock you into a specific partition layout, filesystem type or distribution. Porting to new target devices should be straightforward.

    • Atomic Updates.

All or nothing. It should be safe to interrupt an update without leaving the system in an inconsistent state. If the update did not fully complete, the currently running system should remain unmodified.

    • Atomic Rollbacks.

Atomically switch back to the previous version if the installed update has unwanted side effects.

    • Update Processing in the Background.

The update process should not require downtime. Users should be able to use the system while an update is being applied in the background.

    • Secure.

Secure transmission of updates with authentication and update integrity verification.

    • Efficient Handling of Disk Space.

Many update solutions are based on the partition swap pattern, which leaves a lot of duplicated files on disk. This might not be a big issue, but ideally there is a better solution (see below).

    • Bandwidth Optimized.

Updates should be as small as possible. Binary-delta technologies achieve this by taking advantage of how executable files change between versions. Only the files that have changed should be downloaded, instead of a complete image file.

    • Handle Poor Connectivity and Transmission Failures.

When resuming from an interrupted download, only the missing files should be fetched.

    • Fail-safe and Resilient.

Have a built-in mechanism for recovering from a disaster. This depends a lot on the specific use case, and therefore should be highly customizable.

    • Fixed-Purpose System vs Application Store Model.

It should support both fixed-purpose systems and systems where OS updates (via OTA) form the base and an application delivery mechanism of your choice sits on top. Third-party applications need to live in synergy with the base OS and should be able to update independently. This requirement was inspired by the new Minimalist Operating System concept explained in this blog post. Android OS is another good example of this.

    • Versioned System.

OTA client devices should replicate content assembled on the server side, instead of resolving dependencies on the target device during a software update. This results in a predictable, reproducible and reliable environment - we always know which files are part of a given system version and we can test the exact combination of libraries that will be available on a target device (kind of like snapshotting the system). Third-party applications can either use system libraries or run in containers. There should be a user-writable location which is shared between versioned systems.

    • Extensible with beautiful Qt/QML APIs.

More on this will follow in another blog post.
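
Several of the requirements above (bandwidth optimization, resuming interrupted downloads and integrity verification) can be illustrated together. The following Python snippet is a minimal sketch under stated assumptions, not the design of any particular update system: the manifest format (file path mapped to a SHA-256 checksum), the `sync_objects` function and the `fetch` callback are all hypothetical names invented for this illustration. Authentication of the manifest itself (for example, by signature) is out of scope here.

```python
import hashlib
from typing import Callable, Dict


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 checksum of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def sync_objects(
    manifest: Dict[str, str],
    local_store: Dict[str, bytes],
    fetch: Callable[[str], bytes],
) -> int:
    """Fetch objects named in manifest that are missing from local_store.

    manifest maps file path -> expected checksum (hypothetical format).
    local_store maps checksum -> content and survives interrupted runs.
    Returns the number of objects downloaded.
    """
    downloaded = 0
    for path, checksum in manifest.items():
        if checksum in local_store:
            continue  # already present: unchanged file or earlier partial run
        data = fetch(checksum)
        if sha256_hex(data) != checksum:
            raise ValueError(f"integrity check failed for {path}")
        local_store[checksum] = data
        downloaded += 1
    return downloaded


if __name__ == "__main__":
    # Hypothetical server content, keyed by checksum.
    server = {sha256_hex(c): c for c in (b"app v2", b"libfoo")}
    manifest = {
        "/usr/bin/app": sha256_hex(b"app v2"),
        "/usr/lib/libfoo.so": sha256_hex(b"libfoo"),
    }
    # The device already holds libfoo from the previous version.
    store = {sha256_hex(b"libfoo"): b"libfoo"}
    print(sync_objects(manifest, store, server.__getitem__))
```

Because the store is keyed by checksum, running `sync_objects` again after an interruption skips everything that already arrived, and an unchanged file is never downloaded twice.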

Meet OSTree and why we selected it as a back-end.

OSTree is a tool that combines a git-like model for committing and downloading bootable filesystem trees with a layer for deploying them and managing the bootloader configuration. OSTree is like git in that it checksums individual files and has a content-addressed object store. It is unlike git in that it checks out the files via hardlinks from the OSTree repository. Having this hardlink farm means that system versions are deduplicated against each other; an upgrade only costs disk space proportional to the new files, plus some small fixed overhead.
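
To make the idea of a content-addressed store with hardlink checkouts more concrete, here is a minimal Python sketch. It only illustrates the general technique described above and is not OSTree's actual repository format or API: `commit_file`, `checkout` and the flat one-file-per-checksum layout are assumptions made for this example. It also assumes a filesystem that supports hardlinks, with the repository and the checkouts on the same filesystem.

```python
import hashlib
import os
import tempfile


def commit_file(repo: str, data: bytes) -> str:
    """Store data under its SHA-256 checksum; return the checksum."""
    checksum = hashlib.sha256(data).hexdigest()
    obj = os.path.join(repo, checksum)
    if not os.path.exists(obj):  # identical content is stored only once
        with open(obj, "wb") as f:
            f.write(data)
    return checksum


def checkout(repo: str, tree: dict, dest: str) -> None:
    """Materialize tree (relative path -> checksum) in dest via hardlinks."""
    for rel_path, checksum in tree.items():
        target = os.path.join(dest, rel_path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        os.link(os.path.join(repo, checksum), target)  # no data is copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        repo = os.path.join(root, "repo")
        os.makedirs(repo)
        lib = commit_file(repo, b"shared library")
        v1 = {"usr/lib/libfoo.so": lib, "usr/bin/app": commit_file(repo, b"app v1")}
        v2 = {"usr/lib/libfoo.so": lib, "usr/bin/app": commit_file(repo, b"app v2")}
        checkout(repo, v1, os.path.join(root, "v1"))
        checkout(repo, v2, os.path.join(root, "v2"))
        a = os.stat(os.path.join(root, "v1", "usr/lib/libfoo.so"))
        b = os.stat(os.path.join(root, "v2", "usr/lib/libfoo.so"))
        # Compare inode numbers: equal means both versions share one copy on disk.
        print(a.st_ino == b.st_ino)
```

In this sketch, the second version adds only one new object (the changed application binary) to the repository; the unchanged library is just another directory entry pointing at the same data.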

In the OSTree model, operating systems no longer live in the physical "/" root directory. Instead, they are installed in parallel under the new top-level /ostree directory. At boot time, one of the parallel-installed trees is switched to be the real "/" root. This is the basis for the atomic update and rollback features. The filesystem layout in a booted OSTree system does not look much different from a non-OSTree system. OSTree is in many ways evolutionary: it builds on concepts and ideas introduced by many different projects. In addition to the OTA update feature, this allows us to bring interesting concepts such as the stateless system to Qt for Device Creation.
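
The reason parallel-installed trees make atomic updates and rollbacks possible can be shown with a small, generic sketch: if "the current system" is just a pointer to one of several complete trees, switching versions means repointing it in a single atomic step. The Python snippet below demonstrates this pattern with a symlink that is replaced via rename. It is an illustration of the general idea only, not a description of how OSTree performs the switch at boot; `switch_deployment` and the `deploy/v1`, `deploy/v2` and `current` paths are hypothetical, and a POSIX system is assumed.

```python
import os
import tempfile


def switch_deployment(link: str, target: str) -> None:
    """Atomically repoint the symlink `link` at `target`.

    A new symlink is created under a temporary name and then renamed over
    the old one. On POSIX, rename replaces the destination atomically, so a
    reader sees either the old target or the new one, never a missing link.
    """
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)  # clean up after an earlier interrupted switch
    os.symlink(target, tmp)
    os.replace(tmp, link)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for version in ("v1", "v2"):
            os.makedirs(os.path.join(root, "deploy", version))
        current = os.path.join(root, "current")
        switch_deployment(current, os.path.join("deploy", "v1"))  # initial install
        switch_deployment(current, os.path.join("deploy", "v2"))  # update
        switch_deployment(current, os.path.join("deploy", "v1"))  # rollback
        print(os.readlink(current))
```

An update that is interrupted before the final rename leaves `current` untouched, and a rollback is the same operation pointed at the previous tree.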

OSTree addresses most (and is a direct source for some) of the requirements from the list above and serves as a great base for building OTA update tooling. Currently we are working on APIs and the corresponding tools that bring OTA capability to Qt for Device Creation and make it simple to integrate with your Qt-based application. OSTree is an open source project (LGPLv2), with the source code available on GitHub.

Conclusion.

A solution for OTA in Qt for Device Creation could further reduce time-to-market and provide the means to conveniently and continuously improve shipped products. A technology preview for this will be available with Qt 5.7 for Device Creation. In part two of this blog post series, I will write about device integration, the API and the features we have planned for future releases, and I will elaborate on how OSTree fits into the OTA solution.

