· 8 min read
Christian Fritz

We are thrilled to announce the self-hosted version of Transitive, the open-source framework for full-stack robotic software.

Transitive has been used in production every day by many robotics startups for over a year now, but until today the only way to use it was via the hosted version on transitiverobotics.com. The hosted version is ideal for any robotics company that wants to quickly add new capabilities to its fleet, such as low-latency video streaming or configuration management. But Transitive's vision has always been to create and support an ecosystem of developers who want to develop and share their own full-stack robotic capabilities. We believe this has the potential to accelerate the development of new and exciting robotic applications, similar to how ROS has advanced this industry before.

Before ROS, developing your own robotics product meant also developing your own robotics communication middleware and many of the necessary software modules, including SLAM and navigation. With the advent of ROS and its ecosystem of packages, startups could not only move faster, they could also get to market with a lot less capital and reach profitability sooner. We aspire to give robotics startups another boost of this kind by creating a framework that makes it easy to connect robots to the cloud and to web front-ends, provides the same level of openness as ROS, and supports sharing of packages, which we call capabilities.

What is Transitive?

The Transitive framework makes it easy to build robot cloud portals. Even with all the great open-source tools for web development and device management, building such cloud portals for robots is still not an easy task. There are several reasons for this, but a big one is that robots are different from regular servers, despite many people's attempts to treat them as such. They go offline a lot, have limited network bandwidth, and each robot in a fleet may run a different software version and require a different configuration. Robots also generate a lot of data, some of which needs to be synced in real time with the cloud and web front-ends for processing and visualization, some of which needs to be recorded, and some of which can be discarded. In addition, robots roam insecure areas and are connected over networks outside of the startup's control, which requires tight authentication and authorization.

Transitive solves many of these issues. It provides a reliable, real-time data synchronization protocol that operates on top of MQTT, called MQTTSync. MQTTSync seamlessly synchronizes stateful data between robot, cloud, and web, instead of just passing messages. Transitive also provides the notion of full-stack packages that implement encapsulation and versioning of software components for all three systems (robot, cloud, and web) and uses MQTTSync's namespaced data model to reliably communicate and operate even when different robots run different versions of a package. The robot and cloud components run in sandboxes to isolate them from the rest of the system, and the web components can be embedded in any web page, including existing robot cloud portals. And of course, all of this is secured, using SSL for transport-level security, client certificates and JSON Web Tokens for authentication, and authorization based on MQTT topics. Taken together, this lays a solid foundation for building new full-stack capabilities with ease.

Note that Transitive is not a replacement for ROS, and in fact many of our capabilities run ROS nodes on the robot. Neither is Transitive a fleet management system. It just makes it easy for you to build your own!

Demo of Live Data Sync

This short video demonstrates a capability you can build in just minutes using Transitive: it shows ROS data from a robot live on the web, here simulated via turtlesim.

Self-hosting

By self-hosting you can use Transitive to build your own cloud portal much faster than starting from scratch. It allows you to develop your own capabilities and run the entire stack on your own cloud instances or on-prem servers. Just like with the hosted version, there are no switching costs if you already have your own portal: you can build capabilities and embed them in your existing portal.

Getting Started

Getting started with self-hosting is easy. On an Ubuntu 20.04+ machine with docker and docker-compose installed, run

curl https://transitiverobotics.com/deploy | bash

This will pull and start the latest docker images of Transitive's micro-services. Once everything is running, go to http://portal.$HOSTNAME.local, which is your locally running Transitive portal, and use the curl command provided there to add your dev machine as a robot.

You should then see your dev machine as a device on the portal. Next, you can create a new capability and start it:

npm init @transitive-sdk@latest mycap
cd mycap
npm start

This creates a new capability "mycap" from our template and starts its components in tmux. Once it is running, you will see it in the list of running capabilities on your device in the portal. If you want to replicate the live-data example from our video above, just uncomment the ROS-related code in mycap/robot/main.js, including this:

// Subscribe to the turtlesim pose topic (ROS 1) and mirror each field into MQTTSync.
ros.subscribe('/turtle1/pose', 'turtlesim/Pose', (msg) => {
  // msg carries the turtlesim/Pose fields: x, y, theta, linear_velocity, angular_velocity
  _.forEach(msg, (value, key) => {
    // publish each field under its own sub-topic, e.g. /device/pose/x
    mqttSync.data.update(`/device/pose/${key}`, value);
  });
});

This subscribes to the turtlesim pose and adds each field of the message separately to MQTTSync. MQTTSync publishes field values only when they actually change, which significantly reduces bandwidth and allows for better real-time performance.
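To see why this matters, here is a purely conceptual sketch of the publish-on-change idea; it is not the actual MQTTSync implementation, just an illustration of how syncing per-field values avoids resending unchanged data:

// Conceptual sketch only -- not the actual MQTTSync code.
// Remember the last value sent per path and skip updates that didn't change it.
const lastSent = {};
const updateIfChanged = (publish, path, value) => {
  if (lastSent[path] === value) return;  // unchanged: nothing to send
  lastSent[path] = value;
  publish(path, value);                  // changed: send the new value
};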

Before restarting the robot component, make sure you have a ROS 1 master running and start turtlesim and teleop to generate some data, as shown below. You should then be able to see your turtle's pose update live on your portal.
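For reference, assuming you have ROS 1 and the turtlesim package installed, these are the standard commands (each in its own terminal):

roscore
rosrun turtlesim turtlesim_node
rosrun turtlesim turtle_teleop_key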

Embedding in your cloud portal

Once you have developed your capability, you can embed its web components in your cloud portal, just like you would with the hosted version. These components are authenticated and authorized via JWTs that you generate for your web users on the fly, granting them permission to see and use the capabilities you embed in the pages you show them.
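For illustration, here is a minimal sketch of generating such a token on your portal's backend using the jsonwebtoken npm package. The payload fields shown (id, device, capability, validity) and the capability name are examples only; check the documentation for the exact schema and obtain the signing secret from your deployment:

const jwt = require('jsonwebtoken');

// Example payload -- verify the field names against the Transitive docs for your deployment.
const token = jwt.sign(
  {
    id: 'myorg',                                      // your Transitive account/org id
    device: 'd_mydevice123',                          // the device this token grants access to
    capability: '@transitive-robotics/webrtc-video',  // example: the capability being embedded
    validity: 86400,                                  // how long the token is valid, in seconds
  },
  process.env.TRANSITIVE_JWT_SECRET                   // signing secret from your deployment
);

// Pass the token to the embedded web component, e.g., as an attribute on its custom element.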

Help us build!

The list of full-stack capabilities robotics startups need to operate their fleets is long, and it's getting longer every day as new startups find new applications for robotics. As such, we believe it would be futile for any one company to try to build a one-size-fits-all system to meet those needs. Instead, we invite you to come and join us in building a library of capabilities together. Capabilities you share on our platform can be free, or you can charge a monthly fee per robot, similar to how Android lets app developers make a living off their apps. We believe that this is necessary to ensure quality and long-term support of apps/capabilities and to create a thriving ecosystem of developers and users.

If you are looking for ideas of what to build with Transitive, we are well aware of the need for many other capabilities in addition to the ones we have already built, such as:

  • Live map viewing and editing,
  • Map and AI model management,
  • Software deployment management,
  • Test automation and test result visualizations,
  • Logging,
  • Alerting,
  • Anomaly detection,
  • Task/mission queuing, tracking, and reporting,
  • Various dashboards for internal and external (customer-facing) use,
  • ROS bag and MCAP recording and uploading,
  • Data ingest into various data stores like MongoDB, Prometheus, ElasticSearch, or ClickHouse,
  • Integration with third-party software tools like PagerDuty, Slack, Foxglove, or Twilio,
  • Inter-operation with other control systems like Open-RMF,
  • Integrations with infrastructure like elevators, automated doors, or phone systems,
  • Tools for internationalization,
  • Tools for robot bring-up and deployment,
  • Tools for sensor calibration,

or just web UIs for existing ROS packages, similar to the package-specific UIs and RViz plugins that exist for RTAB-Map and slam_toolbox. For instance, we would love to see a capability that makes Cartographer easier to use.

Next Steps

In our documentation you can learn more about Transitive, get more detailed instructions for self-hosting, and find answers to frequent questions. If you just want to see Transitive in action first, we suggest creating a free account on our hosted version and trying out some of our capabilities on your robots. For questions, feedback, or if you just want to chat, please email us or join our community Slack. Finally, please star ⭐ us on GitHub!

· 13 min read
Christian Fritz

There are many reasons and circumstances that require a robot operator to see through the eyes of a robot from afar. This is obviously the case for robots that are remote controlled, but the need also arises with autonomous robots. Examples include incident resolution, where a robot calls for help because it is aware of a problem it cannot resolve on its own, or after a customer reports an issue with the robot. Other examples include routine fleet monitoring, applications where robots are used for remote inspection, and AI applications that process video data in the cloud. In fact, sight is such a fundamental sense for humans that the ability to "see at a distance" is deeply enabling for robotics customers; whenever possible and appropriate, a robotics company is well advised to offer it to its users.

Requirements and Criteria

In practice, robotics companies have taken a number of different approaches to stream live video from their robots for these purposes, but before we discuss these approaches and their pros and cons let us first enumerate some of the requirements and criteria of a good video streaming solution.

Low latency

In our conversations with robotics companies, as well as in our own experience, low latency is one of the most important requirements. Remotely tele-operating a robot when there is significant lag in the video stream is not only frustrating and exhausting for the operator, who can then usually only proceed in small steps, it is also unsafe in most cases, since the surroundings of the robot may change faster than the operator is able to react. Similarly, other applications of video streaming become less valuable when the video is delayed. As a practical reference, latency should be less than 500 ms.

Bandwidth efficient

No matter how your robots are connected to the Internet, your available bandwidth will almost certainly be limited. This is especially true when they are connected over a cellular connection, in which case data usage also incurs significant cost: a 5 GB/month data plan on a 5G network in the US costs anywhere from $30 to $50 per robot, and video data can get large.

Ability to operate on unreliable networks (robust)

Network connections of robots are typically unreliable, especially as they move around. Wi-Fi connections tend to suffer from varying degrees of coverage, poor roaming support, and simple configuration mistakes by customers. This happens at customers small and large, which is why many robotics companies prefer to rely on a cellular connection even indoors. However, cellular connections have their own problems. Both indoors and outdoors there can be gaps in coverage as well as significant fluctuations in available bandwidth. A reliable video streaming solution needs to handle these situations and in particular gracefully handle packet loss, disconnect and reconnect events, as well as said fluctuations in available bandwidth.

Reasonable frame-rate (fast)

Together with low latency, many applications of video streaming also benefit from a reasonable frame rate -- at least 5 frames per second, better 15, and sometimes 30 is required. Note also that at a frame rate of 10 Hz, up to 100 ms of additional latency is introduced merely by the time between frame updates.

Accessible

Access to the video stream is as important as the stream itself. To enable a large number of users of varying professional roles and backgrounds to watch the video, it is essential to remove hurdles. This means reducing the software that needs to be installed and configured on a watcher's device, but also simplifying authentication and authorization as much as possible. Ideally, a user would be able to start watching without having to set up any new software on their device or configure a VPN to gain access.

Secure

Last but definitely not least, the stream needs to be secure. End-to-end encryption is ideal as it maximizes privacy for the robotics customer. At the very least, transport-level encryption needs to be used to secure the video as it streams through the systems of network and cloud providers, and any required cloud proxies should be configured to relay the stream without decrypting and re-encrypting it. This means that a simple proxy that maintains two SSL connections, one with the robot and one with the watcher, is not sufficient in all cases.

Approaches

While many, and probably the majority of, robotics companies use ROS to write the software running on their robots, there is no standard platform like ROS for functionality like video-streaming that requires code running not just on the robot, but also in the cloud and on the front-end. As a result, different robotics companies have implemented their own robotic cloud stack and with it their own video-streaming solutions. We'll describe some common approaches and discuss their pros and cons. After looking at what people do in practice we'll evaluate these approaches against the requirements and criteria described above.

Sending individual images

This is by far the most straightforward approach and very often the first one taken by robotics companies. The robot captures still images from its cameras and transmits them as such over TCP to a cloud server. The cloud relays the images to web clients, where they are displayed by repeatedly updating an HTML <img> tag. The primary benefit of this approach is that it is relatively quick to implement and doesn't require a lot of knowledge of video formats. In fact it requires none of that, which is also its greatest shortcoming: it is horribly inefficient. As many readers will already know and as we will see shortly, modern video compression algorithms are able to reduce the size of the stream by two orders of magnitude, i.e., save bandwidth and use the available network bandwidth more efficiently. Without this compression, most practical implementations can only support very low frame rates -- typically 1-2 frames per second -- and often resort to grayscale images to reduce the size. This approach is also prone to man-in-the-middle attacks: in fact, in the naive approach, several parties besides the robotics company itself may be able to tap into the video stream.
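For a rough sense of what this approach looks like in the browser, here is an illustrative sketch that displays incoming JPEG frames by updating an <img> element; the WebSocket endpoint and element id are placeholders:

// Illustrative sketch: show incoming JPEG frames by swapping the src of an <img>.
const img = document.getElementById('camera');          // an <img id="camera"> in the page
const ws = new WebSocket('wss://example.com/frames');    // hypothetical frame endpoint
ws.binaryType = 'arraybuffer';
ws.onmessage = (event) => {
  const blob = new Blob([event.data], {type: 'image/jpeg'});
  const url = URL.createObjectURL(blob);
  img.onload = () => URL.revokeObjectURL(url);           // release the blob once displayed
  img.src = url;                                         // swap in the newest frame
};

Every frame travels as a full JPEG, which is exactly why this approach consumes so much more bandwidth than a compressed video stream.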

RViz via ssh-tunnel or VPN

RViz is a powerful visualization tool for ROS built on Qt. It is primarily meant for developers, but it does support visualizing video feeds from a robot's cameras, and hence some companies use it for remote video streaming. To make this work, the watching RViz client needs to be on a common network with the robot, which implies some form of VPN. This makes the approach cumbersome to set up, especially on non-Ubuntu computers -- forget about mobile devices. It also suffers from the terrible inefficiency of the previous approach, since ROS transmits these streams as sequences of individual camera images, too.

Foxglove via a remote websocket connection

Foxglove can be described as a modern, web-based replacement for RViz that, by being available on all major operating systems as well as in the browser, elegantly solves the accessibility problem RViz suffers from. This enables a much greater variety of users to see robotic data, including video. Nevertheless, to use Foxglove for video streaming one still requires a VPN or a custom-made cloud proxy to establish the connection between the robot and the client. As of this writing, Foxglove doesn't natively support video compression yet either, so it suffers from the same bandwidth inefficiency as the previous two approaches, since again individual images are transmitted. Foxglove is a big step in the right direction though, and the team is working on adding support for h264-compressed video as well.

ROS web_video_server

The web_video_server is a ROS package that has been available for many years. It consumes images from a ROS topic and compresses them into a video stream using modern compression algorithms, including VP8 and h264. It then exposes an HTTP server on the robot from which these streams can be consumed. This solves the problem of bandwidth inefficiency. However, HTTP, which uses TCP, is a sub-optimal choice for streaming over unreliable networks: TCP guarantees the delivery of packets even when there are interruptions -- a property one actually doesn't want with video, because it results in lag building up with each network interruption. The better policy is to drop frames that are too old and show the most recent video frames instead. The server is also not able to perform any sort of congestion control, i.e., reduce picture quality or frame rate when available bandwidth fluctuates. Lastly, in order to connect to this HTTP server from afar, again a VPN is required, or a cloud proxy that forwards this (unencrypted) stream.
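The drop-old-frames policy is easy to picture with a small conceptual sketch (illustrative only, not part of web_video_server): incoming frames overwrite a single slot instead of queueing, so after a network hiccup the viewer jumps straight to the newest frame rather than replaying a backlog.

// Conceptual sketch: 'latest frame wins' instead of a TCP-style backlog.
let latestFrame = null;
const onFrame = (frame) => { latestFrame = frame; };  // overwrite, never queue
const render = () => {
  if (latestFrame) draw(latestFrame);                 // draw() is a hypothetical renderer
  requestAnimationFrame(render);
};
render();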

WebRTC

WebRTC, Web Real-Time Communication, is still a relatively new protocol that so far is primarily implemented by browsers for use in video conferencing. It is supported by all modern web browsers, making it readily accessible on any device without additional setup, and, given its purpose, it has many desirable properties and features for live video streaming. These include support for modern video compression algorithms like VP8 and h264 (plus VP9 and h265 if desired), graceful handling of packet loss, use of the ICE framework for automatically finding the best network route between the two peers (robot and browser in our case), congestion control to dynamically adjust picture quality to account for fluctuations in available bandwidth, and end-to-end encryption. WebRTC by default uses UDP, and typical latency is around 200ms.
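To give a sense of what the receiving side looks like in a browser, here is a minimal sketch using the standard WebRTC browser API. The signaling channel -- how the offer, answer, and ICE candidates are exchanged with the robot -- is application-specific, so sendToRobot and onRobotMessage below are placeholders:

// Minimal sketch of a browser peer receiving a video track from a robot.
const pc = new RTCPeerConnection({iceServers: [{urls: 'stun:stun.l.google.com:19302'}]});

// Render the incoming video track in a <video id="video" autoplay> element.
pc.ontrack = (event) => {
  document.getElementById('video').srcObject = event.streams[0];
};

// Exchange ICE candidates over your own signaling channel (placeholder function).
pc.onicecandidate = (event) => event.candidate && sendToRobot({candidate: event.candidate});

// We only want to receive video, not send any.
pc.addTransceiver('video', {direction: 'recvonly'});

// Create an offer, send it to the robot, and apply its answer when it arrives.
pc.createOffer()
  .then((offer) => pc.setLocalDescription(offer))
  .then(() => sendToRobot({offer: pc.localDescription}));

onRobotMessage(async (msg) => {                        // placeholder signaling receiver
  if (msg.answer) await pc.setRemoteDescription(msg.answer);
  if (msg.candidate) await pc.addIceCandidate(msg.candidate);
});

Most of the real work -- and most of the difficulty discussed below -- lies on the robot side, where no browser engine is available to provide this API.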

If you've paid attention, you will have noticed that WebRTC scores high on all the requirements and criteria we've laid out above. So naturally everyone in the robotics industry is using it for video streaming, right? Wrong! As we will see, WebRTC is not yet used very much in robotics and the reason for that is actually quite simple: it's still a hell of a task to implement on an end device, i.e., not a browser. Even building your own WebRTC application for use between two browsers is not trivial. But outside of browsers there aren't many libraries one can use and those that exist are still very much work in progress. There has been some tremendous progress on such libraries in recent years, too, but in order to benefit from many of the great features of WebRTC named above, one still has to do a lot of work oneself. Because the truth about WebRTC is that it is not one protocol or standard, it is a loose collection of several RFCs, each proposing and specifying approaches for different aspects of what, as a whole, is needed for video conferencing. On top of that, robotics companies often run versions of Ubuntu that are already a few years old -- either because upgrading a whole fleet is not easy, or because they still run ROS 1 which is not supported beyond Ubuntu 20. This is a problem for implementing WebRTC since the libraries included in those older versions of Ubuntu are still missing a lot of the fixes and features that make WebRTC so desirable.

A report from the field

We polled roboticists on two different online forums to find out what people currently use for video streaming from their robotic fleet in practice. Here are the results for those respondents who do use video-streaming on their robots (around ⅔ of all respondents).

We invited people who built their own solution in house to comment with details. None of those who commented used WebRTC -- which is not to say that people don't use it. Through other channels we have heard from several companies that have already made the switch to WebRTC. It seems fair to say, though, that those are still the exception.

Evaluation

Summing it all up, we evaluate the five approaches along the requirements and criteria laid out above as follows. Disclaimer: this evaluation is meant as a practical guide for someone choosing between these approaches and is definitely subjective in some regards.

               individual images   RViz      Foxglove      web_video_server   WebRTC
low latency    usually             usually   usually       usually            yes
efficient      –                   –         –             yes                yes
robust         –                   –         –             –                  yes
fast           –                   –         –             yes                yes
accessible     yes                 –         with effort   with effort        yes
secure         depends             depends   depends       depends            yes

Note that low latency is only provided by the first four approaches so long as there are no network disruptions, packet loss, or dips in available bandwidth. If any of these do happen then significant latency can build up -- even as much as 30 seconds.

Foxglove and web_video_server can be made accessible without each watcher having to configure and join a VPN, provided an appropriate cloud proxy is configured.

Regarding security, it seems only fair to say that all of these approaches can be made almost as secure as WebRTC by choosing appropriate network-level encryption. However, at least the robotics company itself typically still retains the ability to watch ongoing streams, i.e., there is no privacy between the two peers. This last aspect may very well be a deciding factor when choosing a solution, especially when the robot is used in any kind of security or surveillance context, where customers may not want third parties to be able to tap into the stream.

Conclusion

After many years of robotics companies trying a variety of approaches to streaming video from their robots, there now exists a clear winner: WebRTC. It provides many features required for reliable, low-latency streaming that many of us were not even aware of as we were exploring other means in the past. The primary downside of WebRTC is the complexity involved in implementing it, but we believe it is worth it.

Still not convinced? Try it on your own robots by installing our ready-to-go WebRTC Video capability -- which you can do for free when you register for an account on our hosted offering of Transitive. It only takes a few minutes, and the capability includes everything you need: the code for the robot to tap the video stream and a web UI component for displaying the video. All required cloud services are provided by us. Afterwards you can still decide to implement your own if you want, or just continue using ours by embedding the provided UI component in your own robot web portal, a.k.a. fleet management system.

Comments? Questions? We would love to hear them! Email us or join our Slack community and we'll be happy to help and discuss.

About Transitive

Transitive is an open-source framework for full-stack robotic applications with a modular architecture. It is developed and maintained by Transitive Robotics, which also offers commercially supported capabilities ("apps") that run on Transitive such as video-streaming, remote-teleop, and health-monitoring. Transitive accelerates the development of commercial robotics applications by providing robotics companies with a growing collection of capabilities they can readily integrate into their fleet management systems.

· One min read
Christian Fritz

In this episode, Audrow Nash speaks to Christian Fritz, CEO and founder of Transitive Robotics. Transitive Robotics makes software for building full-stack robotics applications. In this conversation, they talk about how Transitive Robotics' software works, their business model, sandboxing for security, creating a marketplace for robotics applications, and web tools in general.

Sense Think Act Podcast: 15. Full-stack Robotics and Growing an App Marketplace, with Christian Fritz