
WebRTC Video

Provides low-latency video streaming over an end-to-end encrypted WebRTC connection. Once installed you can use the provided embedding instructions to embed the video widget for that robot in your own web application.


  • Typical latency of just 200ms
  • Allows setting desired bit rate (in KB/s), e.g., for choosing between high definition and lower bandwidth cost
  • Multi-camera support
    • Can be easily arranged into layouts using CSS
  • Supports various video sources:
    • Video4linux cameras, i.e., virtually all USB cameras
      • allows you to select from the list of resolutions and frame rates supported by your cameras
    • ROS and ROS 2 image topics, incl. bayer encoded ones, and pre-encoded h264 streams
    • RTSP sources, e.g., from IP cameras
    • custom GStreamer source pipelines
  • Utilizes h264 for video compression
  • Hardware acceleration on Nvidia platforms (e.g., Jetsons), and Rockchip based platforms (e.g., Orange Pi, Firefly)
  • Robust against even heavy packet loss
  • Congestion control: automatically adjusts bitrate to account for network conditions
  • Automatically reconnects after network loss
  • Works in all modern browsers (Chrome recommended)
  • Encrypted end-to-end
  • No bandwidth cost when sender and receiver are on the same network
  • Audio streaming
  • Supervisor UI:
    • A UI component listing all ongoing sessions on a robot
    • Allows joining an ongoing session without increasing bandwidth usage on the robot

To learn more about the security model of WebRTC and why it is safe see, e.g., here.


Requires gstreamer 1.16 (Ubuntu 20.04) or later.

During the installation process the Transitive agent will try to install all required dependencies into its sandbox environment. If this fails or if your build and deployment process makes it preferable you can pre-install them manually:

sudo apt install build-essential pkg-config fontconfig git gobject-introspection gstreamer1.0-x gstreamer1.0-libav gstreamer1.0-nice gstreamer1.0-plugins-bad gstreamer1.0-plugins-base-apps gstreamer1.0-plugins-good gstreamer1.0-plugins-ugly gstreamer1.0-tools libgstreamer1.0-0 libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev libgstreamer-plugins-bad1.0-dev libgirepository1.0-dev libc-dev libcairo2 libcairo2-dev


If you are running Transitive inside a Docker container and want to use USB cameras, then be sure to add the following to your docker run command (or similarly for docker-compose):

-v /run/udev:/run/udev # required by gst-device-monitor-1.0 to enumerate available devices and supported resolutions


The easiest way to configure the capability is to use the UI in the Transitive portal, which lets you choose the video source to use as input, along with any parameters you can set on it, e.g., resolution and frame rate on v4l devices, and the bit rate. Once configured, the portal shows the video as well as the attributes you need to add to the embedding HTML snippet to use the configuration you selected.

Alternatively you can configure a default source (or multi-source layout) and parameters in your ~/.transitive/config.json file, e.g.:

{
  "global": {},
  "@transitive-robotics/webrtc-video": {
    "default": {
      "streams": [
        {
          "videoSource": {
            "type": "rostopic",
            "rosVersion": "1",
            "value": "/tracking/fisheye1/image_raw"
          },
          "complete": true
        },
        {
          "videoSource": {
            "type": "v4l2src",
            "value": "/dev/video0",
            "streamType": "image/jpeg",
            "resolution": {
              "width": "432",
              "height": "240"
            },
            "framerate": "15/1"
          },
          "complete": true
        },
        {
          "videoSource": {
            "type": "videotestsrc"
          },
          "complete": true
        }
      ]
    }
  }
}

and then add use-default=true as an attribute in the embedding HTML to use this instead.
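
If you generate this configuration from a script, the stream list can be assembled programmatically. The following is a minimal sketch only: the helper name `buildDefaultConfig` is hypothetical, the field names are copied from the example above, and placing `complete` next to `videoSource` in each stream entry is an assumption based on that example.

```javascript
// Hypothetical helper (not part of webrtc-video): wraps a list of video
// sources into the value of the "default" key, one stream entry per source.
// Assumption: each entry carries "complete": true next to "videoSource".
function buildDefaultConfig(videoSources) {
  return {
    streams: videoSources.map((videoSource) => ({ videoSource, complete: true })),
  };
}

const defaultConfig = buildDefaultConfig([
  { type: 'rostopic', rosVersion: '1', value: '/tracking/fisheye1/image_raw' },
  { type: 'videotestsrc' },
]);

// Print the fragment so it can be merged into ~/.transitive/config.json by hand.
console.log(JSON.stringify({ default: defaultConfig }, null, 2));
```

The sketch only prints the fragment; merging it into an existing config file is left to your deployment tooling.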

Deciding on a bitrate

In order to find an appropriate bitrate for your streams, we suggest 0.01 bytes per pixel as a rule of thumb. For example, one stream of 640x480 at 15 fps results in 4,608,000 pixels per second, multiplied by 0.01 bytes/pixel gives 46 KB/s as a suggested bitrate. You can set it lower if necessary, but the quality will really suffer. Of course, if your connection allows it, you can always set it higher to increase image quality.
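
The rule of thumb can be written as a small helper. This is an illustrative sketch: the function name `suggestedBitrateKBps` and its parameters are not part of the capability, and it assumes 1 KB = 1000 bytes, which matches the arithmetic in the example above.

```javascript
// Hypothetical helper (not part of webrtc-video): suggests a bitrate in KB/s
// from resolution and frame rate, using 0.01 bytes per pixel by default.
function suggestedBitrateKBps(width, height, fps, bytesPerPixel = 0.01) {
  const pixelsPerSecond = width * height * fps;
  const bytesPerSecond = pixelsPerSecond * bytesPerPixel;
  // Assumes 1 KB = 1000 bytes, as in the worked example.
  return Math.round(bytesPerSecond / 1000);
}

console.log(suggestedBitrateKBps(640, 480, 15)); // 46
```

For several streams, sum the per-stream suggestions to estimate the total bandwidth you need.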


When using a v4l source, i.e., a USB camera, you can choose any of the framerates provided by the hardware. When using ROS topics, the default framerate is 15/1, i.e., 15 frames per second. This can be changed by providing the framerate option in the embedding code, e.g., framerate="10/1".
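
Framerates are written as fractions. If you need the value as a number, for instance to estimate a bitrate, a conversion could look like the sketch below. The helper name `framerateToFps` is hypothetical and not part of the capability.

```javascript
// Hypothetical helper: converts a framerate fraction such as "15/1" into
// frames per second. A missing denominator is treated as 1.
function framerateToFps(framerate) {
  const [numerator, denominator = '1'] = framerate.split('/');
  const fps = Number(numerator) / Number(denominator);
  if (!Number.isFinite(fps) || fps <= 0) {
    throw new Error(`invalid framerate: ${framerate}`);
  }
  return fps;
}

console.log(framerateToFps('15/1')); // 15
console.log(framerateToFps('10/1')); // 10
```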


Each <video> element generated by the front-end web component will have a unique class name webrtc-videoN, where N enumerates the elements starting at 0. This makes it easy to arrange and style these elements using CSS. For instance, these CSS rules would create a layout where one camera, e.g., the front camera, is large on top, and at the bottom we have left, back, and right-viewing cameras.

webrtc-video-device video { position: absolute }
webrtc-video-device .webrtc-video0 { width: 960px; height: 720px; }
webrtc-video-device .webrtc-video1 { top: 720px; }
webrtc-video-device .webrtc-video2 { top: 720px; left: 640px; }
webrtc-video-device .webrtc-video3 { top: 720px; left: 320px; width: 320px; }

In addition, the div element immediately wrapping these video elements has the class name webrtc-video-wrapper. This makes it possible to apply various CSS layout features such as flexbox or grid layouts, e.g., the following would create a layout where the front facing camera is large in the middle, left and right cameras are to the sides on top, and the backward facing camera is in the bottom left. The bottom right is left black here but would make for a good place to show a map component.

.webrtc-video-wrapper {
  display: grid;
  grid-gap: 10px;
  grid-template-columns: 1fr 1fr 1fr 1fr;
  grid-template-rows: 1fr 1fr;
  grid-template-areas:
    "left front front right"
    "back front front .";
}
.webrtc-video0 { grid-area: front; }
.webrtc-video1 { grid-area: left; }
.webrtc-video2 { grid-area: right; }
.webrtc-video3 { grid-area: back; }
video {
  width: 100%;
  height: 100%;
  object-fit: cover;
}

React callbacks

In React, this capability can be embedded using the TransitiveCapability tag from the @transitive-sdk/utils-web package. When doing so, you can provide an additional object as props containing callbacks you would like to receive, e.g.:

<TransitiveCapability jwt={jwt} timeout="1800"
// receive connection state events:
onConnectionStateChange: (state) => console.log({state}),
// continuously receive lag information:
onLag: (lag) => console.log({lag}),
// continuously receive stats from the webrtc connection:
onStats: (stats) => console.log({stats}),
// receive tracks as they are created
onTrack: (track, mid) => console.log({track, mid}),
// be notified when the user clicks on the video
onClick: (clickEvent) => console.log({clickEvent}),
}} />

RTSP streams (IP cameras)

To use an RTSP stream as video source, such as those provided by many IP cameras, set the type in your embedding HTML to rtsp, and set the source to the RTSP URL provided by your camera. For instance:

  <webrtc-video-device id="superbot" host="" ssl="true"

This assumes that your stream is already h264 encoded, as is common with IP cameras, and that you do not wish to transcode the stream. This is by far the most CPU efficient option but also implies that the webrtc-video capability will not be able to provide congestion control or set the bitrate. If you do want to decode the stream and re-encode it in order to get congestion control and bitrate control back, you can use type="rtsp-transcode" instead.

ROS topics carrying pre-encoded h264 streams

While we do not recommend this, some users may choose to multiplex their h264 video streams via ROS topics. ROS does not have a standard message format for video streams, but it is possible to "abuse" any ROS message type with a binary array data type, e.g., sensor_msgs/CompressedImage.

Assuming you have such a stream in a topic called /camera/h264, you can use that as your video input source in embedded code like this:

<webrtc-video-device id="superbot" host="" ssl="true"

or in React:

<TransitiveCapability jwt={jwt} timeout="1800" count="1"
rosversion="1" type="rostopic-h264" source="/camera/h264" />

where jwt, as always, is a JSON Web Token carrying the payload described in the Embed instructions of this capability on the portal. For ROS 2 just change the rosversion.

Note that whenever you use pre-encoded h264 streams as input, you lose a very significant and important feature: congestion control. Hence, if your input stream uses a higher bitrate than your network can support, there will be no way for webrtc-video to adjust the bitrate to ensure low latency. We therefore recommend against using pre-encoded h264 streams. When you do use them, please make sure they use a bitrate that is appropriate for your Internet connection.

Custom Video Source Pipelines

In addition to various video sources, the capability supports the specification of custom GStreamer source pipelines. In your embedding HTML you can specify type custom and set as source your custom pipeline. For example:

  <webrtc-video-device id="superbot" host="" ssl="true"
source="videotestsrc is-live=true ! video/x-raw,framerate=(fraction)15/1,width=640,height=480"

This assumes that the sink of your pipeline can be fed into a videoconvert element for conversion to video/x-raw. If your pipeline produces a stream that is already h264 encoded and you don't want to decode and re-encode this stream, then you can set the type to custom-h264. Note that in that case the capability will not implement any congestion control for you.

This is an advanced feature and only meant for users who are familiar with GStreamer pipelines. It is also more difficult to debug. We recommend testing your custom source pipeline first using gst-launch-1.0 and autovideosink or fakesink.
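
The trade-off described in the sections above is the same for every pre-encoded source type: rtsp, rostopic-h264, and custom-h264 pass the h264 stream through without re-encoding, so the capability cannot apply congestion control or set the bitrate. As a sketch, a front-end could use this to warn operators. The helper name `isPreEncodedPassThrough` is hypothetical; only the type strings come from this page.

```javascript
// Source types that, per the sections above, are forwarded without
// re-encoding and therefore lose congestion control and bitrate control.
const PASS_THROUGH_TYPES = new Set(['rtsp', 'rostopic-h264', 'custom-h264']);

// Hypothetical helper (not part of webrtc-video): true if the given source
// type is one of the pre-encoded pass-through types listed above.
function isPreEncodedPassThrough(type) {
  return PASS_THROUGH_TYPES.has(type);
}

for (const type of ['rtsp', 'rtsp-transcode', 'custom', 'custom-h264']) {
  const note = isPreEncodedPassThrough(type)
    ? 'no congestion control, check the source bitrate'
    : 'not a pass-through type';
  console.log(`${type}: ${note}`);
}
```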

Local Recording

The capability supports recording all outgoing video to disk in a rolling buffer. This feature, similar to black-box recorders on airplanes, can be useful when investigating recent incidents after the fact. The buffer restarts each time a new session is started and currently records a maximum of ten minutes or 1 GB, whichever comes first. Add record="true" to your HTML embedding code to enable this. The recordings will be in /tmp/stream_*.mov.
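
To pick up the most recent recording after an incident, a script running on the robot could list the files matching that pattern. The following Node.js sketch is an illustration only: the helper name `listRecordings` is hypothetical, and it relies solely on the file name pattern stated above.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of webrtc-video): returns the paths of all
// files named stream_*.mov in `dir`, newest first by modification time.
function listRecordings(dir = '/tmp') {
  return fs.readdirSync(dir)
    .filter((name) => /^stream_.*\.mov$/.test(name))
    .map((name) => {
      const file = path.join(dir, name);
      return { file, mtimeMs: fs.statSync(file).mtimeMs };
    })
    .sort((a, b) => b.mtimeMs - a.mtimeMs)
    .map((entry) => entry.file);
}
```

Calling `listRecordings()` on the robot returns an array of paths whose first element is the latest recording, which you can then copy off the device before the next session overwrites the buffer.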

Custom Video Sink Pipelines

In many cases, streaming video is only one of several usages of a robot's video sources. When this is the case it is useful to be able to "tee" the stream at several locations in the pipeline to feed it into auxiliary processing pipelines. Local recording is an example of this, but a very specific one. To support any other such applications, webrtc-video gives the ability to specify additional sink pipelines that can be injected after the encoding step of the pipeline. This is done per-stream.


  <webrtc-video-device id="superbot" host="" ssl="true"
encodedpipe="splitmuxsink location=/tmp/video0.mp4 max-size-time=60000000000 max-files=10"
encodedpipe_1="splitmuxsink location=/tmp/video1.mp4 max-size-time=60000000000 max-files=10"
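
In the example above, max-size-time is given in nanoseconds, so 60000000000 means one-minute segments, and max-files=10 keeps at most ten of them on disk. If you build these attribute values in code, a sketch like the following avoids counting zeros by hand. The helper name `splitmuxPipe` is hypothetical and not part of the capability.

```javascript
// Hypothetical helper: builds a splitmuxsink description like the ones in
// the example above from a segment length in seconds.
function splitmuxPipe(location, segmentSeconds, maxFiles) {
  // splitmuxsink's max-size-time property is expressed in nanoseconds.
  const maxSizeTimeNs = segmentSeconds * 1e9;
  return `splitmuxsink location=${location} max-size-time=${maxSizeTimeNs} max-files=${maxFiles}`;
}

console.log(splitmuxPipe('/tmp/video0.mp4', 60, 10));
// splitmuxsink location=/tmp/video0.mp4 max-size-time=60000000000 max-files=10
```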

Device Sharing

The camera streams on robots are often used for multiple purposes, remote streaming only being one of them. When this is the case, the stream needs to be shared. This is trivial for ROS topics and RTSP streams, but not for USB cameras (video4linux2 devices). This is because under Linux only one process can open such a device at a time. Fortunately there are a few options for sharing the streams from these devices.

Virtual devices (v4l2loopback)

The v4l2loopback kernel module allows you to create virtual v4l2 devices which can be shared. Using ffmpeg or gstreamer you can then forward a physical device's stream to one or more such virtual devices.


# create the virtual devices
sudo modprobe v4l2loopback devices=2 video_nr=10,11 exclusive_caps=1,1 card_label="front_camera,back_camera"
# forward a physical device's stream to virtual device
ffmpeg -f v4l2 -input_format h264 -video_size 640x480 -framerate 30 -i /dev/video3 -c:v copy -f v4l2 /dev/video10

In this example we are forwarding an h264 stream provided by the camera (requires ffmpeg v6+), but the same approach also works for raw and jpeg streams.

RTP over UDP

In this approach we feed the stream coming from the camera to multiple UDP sockets using the RTP protocol. These UDP sockets can then be accessed using gstreamer's udpsrc element.

# producer
gst-launch-1.0 -e v4l2src device=/dev/video2 ! video/x-h264,width=640,height=480,framerate=30/1 ! h264parse config-interval=-1 ! rtph264pay ! multiudpsink clients=,
# Client 1
gst-launch-1.0 udpsrc port=9001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264depay ! decodebin ! autovideosink
# Client 2
gst-launch-1.0 udpsrc port=9002 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264depay ! decodebin ! autovideosink

Shared Memory

Similar to the UDP approach, we can use shared memory to share the stream:

# Producer
gst-launch-1.0 -e v4l2src device=/dev/video4 ! video/x-raw,width=640,height=480,framerate=30/1 ! shmsink socket-path=/tmp/foo2 shm-size=2000000 wait-for-connection=false sync=true
# Clients
gst-launch-1.0 shmsrc socket-path=/tmp/foo2 do-timestamp=true is-live=true ! video/x-raw,width=640,height=480,framerate=30/1,format=YUY2 ! autovideosink

Or, for an h264 camera stream:

# producer
gst-launch-1.0 -e v4l2src device=/dev/video2 ! h264parse config-interval=-1 ! shmsink socket-path=/tmp/foo shm-size=2000000 wait-for-connection=false sync=true
# consumer
gst-launch-1.0 shmsrc socket-path=/tmp/foo do-timestamp=true is-live=true ! 'video/x-h264,profile=baseline,framerate=30/1' ! h264parse config-interval=-1 ! decodebin ! autovideosink

Supervisor UI

The supervisor UI component allows you to join an ongoing session. It does so by forwarding the streams being watched directly from the watcher's computer, i.e., it does not increase resource or bandwidth usage on the robot itself.

Like all UI components, the Supervisor UI can be embedded on other pages as well. When doing so, you can set the following options to automatically join the latest ongoing session on the device, and, optionally, to select only a subset of all streams (when the session has more than one stream/camera):

  • auto: set this if you want to auto-join the latest session
  • streams: a comma separated list of stream indices, e.g., "0,2"

This allows you, for instance, to create a page where you are showing one stream for each ongoing session. That page will auto-update as sessions start and stop, giving you a bird's-eye overview of what is happening with your fleet.
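
When such an overview page is generated from code, the value of the streams option can be built from, and checked against, a list of indices. This is a minimal sketch with hypothetical helper names; only the comma-separated format, e.g., "0,2", comes from the option description above.

```javascript
// Hypothetical helper: turns an array of stream indices into the
// comma-separated form used by the streams option, e.g., [0, 2] -> "0,2".
function formatStreamIndices(indices) {
  return indices.join(',');
}

// Hypothetical helper: parses a value such as "0,2" back into numbers and
// rejects anything that is not a non-negative integer.
function parseStreamIndices(streams) {
  return streams
    .split(',')
    .map((part) => part.trim())
    .filter((part) => part !== '')
    .map((part) => {
      const index = Number(part);
      if (!Number.isInteger(index) || index < 0) {
        throw new Error(`invalid stream index: ${part}`);
      }
      return index;
    });
}

console.log(formatStreamIndices([0, 2])); // 0,2
console.log(parseStreamIndices('0,2')); // [ 0, 2 ]
```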



  • Support for ROS topics carrying pre-encoded h264 streams
  • v0.20.1:
    • Avoids high memory usage when establishing the connection takes long (e.g., due to packet loss) when using ROS topics


  • Supervisor UI, embedding:
    • auto-join last active session
    • only subscribe to subset of session streams
  • Made ready for upgrade to Node 20


  • Replace video-streams with an error message if they fail to start within a timeout (currently 2 seconds). This is particularly useful when you stream multiple cameras but one fails to start, e.g., due to USB issues.


  • Improved congestion control: accounting for static network delay
  • New debug option: disable congestion control
  • Increased default bitrate to 100KB/s


  • New feature: support for streaming audio as well
  • New feature: supervisor UI
  • v0.16.4:
    • Fixes a problem with h264 streams via v4l2loopback
  • v0.16.8:
    • Fixes reactivity, e.g., when switching devices (in JWT) or input sources
  • v0.16.10:
    • Fixes an issue regarding the installation of ROS 2 bindings
  • v0.16.11:
    • Time out if ROS topic is not published, but still show other streams


  • New feature: ability to record locally to disk on robot/device
  • New feature: ability to specify custom source and sink pipelines
  • New feature: support for Bayer encoded ROS image topics
  • v0.15.1:
    • Automatically recover from interruptions during updates


  • New feature: hardware acceleration on Rockchip based boards
  • Fixed a bug preventing the connection to establish from mobile browsers
  • Fixed a bug preventing the embedding of video streams using ROS 2 topics as video source
Version 0.20.1, published 6/27/2024, 1:07:30 AM
$15/robot/month Includes 10 GB of relay traffic. $0.3/GB after that. Only applies when video needs to be relayed through the cloud because no peer-to-peer connection could be established.