Another key feature of Transitive is its notion of full-stack packages, called capabilities. These packages bundle code for the robot, the cloud, and the web. Bundling them like this assures that these pieces of software are versioned together and hence work correctly together. This is not a given: separate devices tend to be upgraded to new versions at different times, and as fleets of robots and other devices grow larger, it is not uncommon for them to become highly heterogeneous, i.e., for different robots to run different versions of the software. This joint versioning, together with the versioned data namespaces provided by MQTTSync, solves the cross-device version dependency problem.
What are cross-device version dependencies?
These dependencies exist when Version A of the robot software is not compatible with Version B of the cloud software. In an ideal world, all robots would always run the same version, and that version would be compatible with the version running in the cloud. In practice that is hardly ever the case. When different robots run different versions, it can be challenging to ensure they all continue to function correctly when the cloud software gets upgraded. And that's assuming just one cloud.
In practice there may even be several cloud deployments, such as dev, staging, and production, each running a different version of the software. Having to switch a robot's software version just to be able to use it with a different cloud deployment is a nightmare. Transitive saves us from this by enabling parallel operation of several versions of the same software on the cloud, running in separate containers and communicating with robots in separate namespaces.
Anatomy of a package
Transitive packages are npm packages, and it is up to the package author how to subdivide the content of the package for deployment on robot, cloud, and web. In practice, however, the Transitive capability starter code provides a pre-defined structure of sub-packages and a utility script, `subScript.sh`, that elegantly delegates npm commands to the appropriate sub-package depending on the execution location. This structure is as follows:
├── robot # The robot sub-package.
│ ├── main.js
│ └── package.json
├── cloud # The cloud sub-package.
│ ├── main.js
│ └── package.json
├── web # The web sub-package, i.e., front-end components.
│ ├── device.jsx # Per-device UI components.
│ └── fleet.jsx # Fleet UI components.
├── docs # Any images in this folder are shown on the package page.
├── Dockerfile # Dockerfile used to build the cloud container.
├── docker.sh # Shell script to test the docker container in dev.
├── esbuild.js # Used to build the web front-end bundle.
├── generate_certs.sh # Used to generate dev certificates.
├── package.json # The npm package definition.
├── publish.sh # Script to publish the package to the capability registry.
├── rundev.sh # Script to run the robot component in dev.
├── subScript.sh # Used by the top-level package to delegate to sub-packages.
└── tmux.sh # A shell script to start everything in dev in tmux.
The top-level `package.json` must define two scripts (already present in the template): the `start` script starts the robot component, while `cloud` starts the cloud component. When Transitive executes these in their respective environments, it signals the execution location to the scripts via an environment variable, which `subScript.sh` uses to delegate to the `start` script of the correct sub-package.
In the provided package layout, the robot component lives in the `robot/` sub-folder. It has its own `package.json` specifying npm dependencies and scripts. Robot components run in a sandbox on the robot and are able to install additional `apt` packages within that sandbox. This is facilitated by the Transitive agent, which exposes a local API to the capabilities through which the installation of such `apt` dependencies can be requested. This typically happens in a `preinstall` script defined in the `package.json`.
Each capability is started by the agent in its own sandbox, but these sandboxes all share a common file-tree that includes the locally installed `apt` packages. That file-tree lives in `~/.transitive`. Note that the agent is able to install `apt` packages in that file-tree without the need for `sudo`. This sets it apart from various other fleet-management solutions that require `sudo` to install and run, thereby exposing a robotic fleet to additional vulnerabilities.
Transitive is designed to allow third-party capabilities to be installed and hence needs security measures that prevent such third parties from stealing sensitive information. When designing Transitive we considered various ways to isolate and sandbox capabilities on the robot. The reason we decided against Docker was two-fold: 1) it requires `sudo` (or membership in the `docker` group, which is essentially equivalent to having `sudo` access), and 2) it would be challenging to share `apt` dependencies between the system and all capabilities, and we didn't want to unnecessarily bloat the size of capabilities on the robot. The sandbox mechanism we designed uses many of the same Linux container technologies as Docker (namespaces, overlayfs), but in a much more fine-grained fashion and, again, without the need for `sudo`.
Robot capabilities do not connect to the cloud-hosted MQTT broker directly, but are relayed via an MQTT broker run locally on the robot by the agent. This is part of the sandboxing and authentication solution and further increases bandwidth efficiency.
Robot capabilities only have access to their device- and capability-specific namespace, i.e., the topic prefix identifying their user, device, and capability. Given this restriction, the local MQTT broker actually strips this prefix, such that inside a robot capability the `/` topic corresponds to that prefix on the global broker.
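The effect of this prefix stripping can be illustrated with a small helper. This is a sketch only; the exact prefix layout used in the example topics is an assumption:

```javascript
// Sketch of the prefix stripping performed by the local MQTT broker:
// inside a capability's sandbox, topics are relative to the capability's
// namespace on the global broker. The example prefix layout is an assumption.
function toLocalTopic(globalTopic, prefix) {
  if (!globalTopic.startsWith(prefix)) {
    throw new Error("topic outside this capability's namespace");
  }
  // e.g. '/org/device/scope/cap/foo/bar' -> '/foo/bar'
  return globalTopic.slice(prefix.length) || '/';
}

function toGlobalTopic(localTopic, prefix) {
  // Inverse mapping: '/' maps back to the bare prefix.
  return prefix + (localTopic === '/' ? '' : localTopic);
}
```

Topics the capability publishes or subscribes to locally are thus automatically confined to its own slice of the global namespace.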
Robot capabilities run as a separate process from the Transitive agent. This is, of course, required for sandboxing, but it also has the advantage that the agent can be restarted independently from the capabilities, which can continue running.
The cloud component, typically living in `cloud/`, is run inside a Docker container that is built on demand by the Transitive portal application (`transitive/app/`). Demand for a cloud capability arises when a device running that capability and version starts up and registers with the portal. The containers are named according to their capability scope, name, and version namespace, e.g., `transitive-robotics.test.0.3`. Transitive uses these names to determine whether the demanded container is already running or needs to be built and started.
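A minimal sketch of this naming scheme, assuming (for illustration only) that the version namespace consists of the major and minor version components:

```javascript
// Sketch: derive a cloud container name following the
// scope.name.version-namespace pattern from the text, e.g.
// 'transitive-robotics.test.0.3'. Treating the version namespace as
// major.minor is an assumption made for this illustration.
function containerName(scope, name, version) {
  const [major, minor] = version.split('.');
  return `${scope}.${name}.${major}.${minor}`;
}
```

Under this assumption, patch releases would map to the same container name and hence share one running container, while minor or major releases would demand a new one.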
Unlike robot capabilities, cloud capabilities have access to their capability-specific namespace for all users and devices, which can be expressed using MQTT wildcards.
This is required in order to avoid running a separate container per robot, which obviously wouldn't scale, and in order to perform aggregations across multiple robots, e.g., to prepare the data needed by a fleet-overview UI.
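For illustration, here is a minimal sketch of how a cloud capability might match topics against such a wildcard pattern and aggregate per-device data for a fleet overview. The topic layout (user at the first level, device at the second) is an assumption:

```javascript
// Sketch: MQTT-style wildcard matching ('+' = one level, '#' = rest),
// as a cloud capability might use to select all topics in its namespace
// across all users and devices. Topic layout is an assumption.
function matchesWildcard(topic, pattern) {
  const t = topic.split('/');
  const p = pattern.split('/');
  for (let i = 0; i < p.length; i++) {
    if (p[i] === '#') return true;                    // multi-level wildcard
    if (t[i] === undefined) return false;             // topic too short
    if (p[i] !== '+' && p[i] !== t[i]) return false;  // literal mismatch
  }
  return t.length === p.length;
}

// Aggregate the latest value per device, e.g. to prepare the data
// needed by a fleet-overview UI. Device id at the second topic level
// is an assumption for this sketch.
function fleetOverview(messages, pattern) {
  const byDevice = {};
  for (const { topic, value } of messages) {
    if (!matchesWildcard(topic, pattern)) continue;
    const device = topic.split('/')[2]; // '/user/device/...' -> index 2
    byDevice[device] = value;
  }
  return byDevice;
}
```

A single cloud container can thus serve the whole fleet: one wildcard subscription covers every robot running that capability.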
The web UI components included in Transitive packages are proper Web Components, i.e., they define custom elements that can be embedded anywhere. They authenticate with the MQTT broker using the JWT provided by the user (and auto-generated on-demand on the Portal for testing).
Web components are first-class participants in MQTTSync, i.e., the cloud capability is not required to implement any form of API to expose data to the front-end. Instead, web components connect to the MQTT broker directly using secure websockets and use the same MQTTSync class as the code running on the robot and the cloud to access and share data.
You are free to implement these web components using whatever framework you choose (or none at all). But we strongly recommend React, which is what we use at Transitive Robotics and what Transitive has the best support for.
As with the robot ↔️ cloud relationship, the versions of web components also need to be compatible with the code running on the robot and in the cloud, so the same versioning and namespace considerations apply to them. When the UI for a running robot is requested, the portal, which serves the front-end bundles from the capabilities, dynamically checks which version the robot is running and serves the corresponding version of the front-end. This guarantees that the UI component will always work with the data coming from the robot and, conversely, that the data written by the front-end will be processed correctly by the robot.
For fleet UI components, where a 1:1 mapping of robot version to UI version is not always possible, the portal always serves the UI of the highest version found among all robots. It remains the responsibility of the fleet UI author to ensure it can process all the data it subscribes to. Note however that this is made significantly easier by the fact that all data it receives is "tagged" with the version it was produced from. This makes it easy to implement version-specific transformations that ensure correct merging of data from different versions.
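Such a version-specific transformation might look like the following sketch, in which the versions, field names, and the unit change are all hypothetical:

```javascript
// Sketch: normalizing version-tagged fleet data in a fleet UI. Each
// record carries the version that produced it; records from older
// versions are upgraded to the latest shape before merging. The
// versions and the battery-unit change below are hypothetical.
const transforms = {
  // Suppose (hypothetically) 0.2.x reported battery level as 0..1,
  // while 0.3.x reports a percentage.
  '0.2': (data) => ({ ...data, battery: data.battery * 100 }),
  '0.3': (data) => data, // latest shape, nothing to do
};

function normalize(record) {
  const transform = transforms[record.version];
  if (!transform) throw new Error(`unsupported version ${record.version}`);
  return transform(record.data);
}
```

Because every incoming record is tagged with its producing version, the fleet UI can apply the right transform per record and then merge data from mixed-version fleets uniformly.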
Capabilities auto-update when a new version is available. On the cloud, the update check is performed every five minutes; on the robot, once an hour as well as on every restart of the capability.