Document: Digital Capture Requirements

Digital Capture

Assuming we have consumer/prosumer equipment that provides a clean HDMI signal, we will need to convert that signal into a digital form that a computer can use.

This task is done by a capture card. It connects to the camera’s HDMI output, samples the signal, and converts it into a format ready for a PC to ingest. It may also have an HDMI pass-through, so that the same video stream can feed further downstream devices, such as a TV.

There is a delay between the image arriving at the capture card and the converted version of the frame being available to software running on the PC. This delay is called the latency. There will also be slight variations in this conversion time from frame to frame. These variations are called jitter.
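As a rough illustration of the two terms, the sketch below computes both from per-frame timestamps. The timestamps are made up, the function name is hypothetical, and jitter is taken here as the standard deviation of the per-frame latency, which is only one of several definitions in use.

```python
from statistics import mean, pstdev

def latency_stats(arrival_ms, available_ms):
    """Return (mean latency, jitter) in milliseconds.

    Jitter is taken here as the population standard deviation of the
    per-frame latencies; other definitions (e.g. peak-to-peak) exist.
    """
    if len(arrival_ms) != len(available_ms) or not arrival_ms:
        raise ValueError("need two equal-length, non-empty timestamp lists")
    # Latency of each frame: time it became available minus time it arrived.
    latencies = [out - inp for inp, out in zip(arrival_ms, available_ms)]
    return mean(latencies), pstdev(latencies)

# Hypothetical timestamps for four frames at roughly 30 fps.
arrival = [0.0, 33.3, 66.7, 100.0]
available = [50.0, 85.3, 114.7, 152.0]
avg, jitter = latency_stats(arrival, available)
print(f"mean latency {avg:.1f} ms, jitter {jitter:.1f} ms")
```

In practice the arrival time is hard to observe from the PC side, so measurements of this kind usually rely on an external reference such as filming a running clock.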

Many capture cards are designed for capturing the output of video games, either from a second PC or from a game console. Their focus is on keeping the latency between the HDMI input and the HDMI pass-through as low as possible, so as not to affect the millisecond timing needed for some games. However, the latency between an image arriving at the card and the captured frame being available to the PC may be held to a much looser standard.

For video conferencing/live video production we also need low latency from the HDMI input (from the camera) to the frame being available on the PC. For this reason, many video game capture cards are not well suited for video conferencing: there is a noticeable, and distracting, delay between the live action and the transmitted stream. This can be further complicated if sound is being captured using a different system that has a different (usually lower) latency.
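When the audio path is the faster one, the usual remedy is to delay the audio by the difference between the two latencies. A minimal sketch, with hypothetical latency figures and a hypothetical function name:

```python
def audio_delay_ms(video_latency_ms, audio_latency_ms):
    """Delay to add to the audio path so it lines up with the video.

    Returns 0 when the audio is already the slower path; in that case
    the video would need delaying instead.
    """
    return max(0.0, video_latency_ms - audio_latency_ms)

# Hypothetical figures: 120 ms through the capture card,
# 20 ms through a separate audio interface.
offset = audio_delay_ms(120.0, 20.0)
print(f"delay audio by {offset:.0f} ms")
```

OBS, for example, exposes a per-source sync offset in its advanced audio properties where a value like this can be entered. Note that delaying the audio restores sync but does not reduce the overall delay of the stream.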

The captured data can be presented to the computer by several different physical connections:

  • PCI-e (internal expansion card)
  • USB (2+)

Some caveats with USB:

  • USB 2 may not have enough bandwidth
  • Capture devices should be on their own bus (e.g. you don’t want your mouse on the same bus, as moving your mouse could cause random delays to data being sent from the capture card)
  • The port may not provide enough power (usually fixable by interposing a powered hub)
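The bandwidth caveat can be checked with some quick arithmetic. The sketch below compares uncompressed video data rates with the nominal USB signalling rates; the function name and the chosen video modes are illustrative, and real usable throughput is lower than the nominal figures because of protocol overhead.

```python
USB2_MBPS = 480        # USB 2.0 High Speed nominal signalling rate, Mbit/s
USB3_GEN1_MBPS = 5000  # USB 3.0 / 3.2 Gen 1 nominal signalling rate, Mbit/s

def raw_video_mbps(width, height, fps, bits_per_pixel):
    """Uncompressed video data rate in Mbit/s (1 Mbit = 10**6 bits)."""
    return width * height * fps * bits_per_pixel / 1e6

# 8-bit 4:2:2 (e.g. YUY2) averages 16 bits per pixel.
modes = [("720p30", 1280, 720, 30),
         ("1080p30", 1920, 1080, 30),
         ("1080p60", 1920, 1080, 60)]
for name, w, h, fps in modes:
    rate = raw_video_mbps(w, h, fps, 16)
    fits_usb2 = rate < USB2_MBPS
    fits_usb3 = rate < USB3_GEN1_MBPS
    print(f"{name}: {rate:.0f} Mbit/s, "
          f"under USB 2 nominal: {fits_usb2}, under USB 3 nominal: {fits_usb3}")
```

Uncompressed 1080p already exceeds the USB 2 nominal rate at 30 fps, which is why USB 2 capture devices generally have to compress the video (commonly as MJPEG) before sending it.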

Capture devices expose their data through one of the following software interfaces:

  • Video capture device
  • UVC webcam
  • Virtual (UVC) camera
  • Proprietary interface

Devices that present themselves as a UVC webcam are preferable. Virtually all conferencing software accepts UVC devices, as does most video editing/capture software.

Next are “virtual camera” interfaces. These are usually needed for cameras that communicate via WiFi, Ethernet, or something similar. A special piece of software bridges the camera’s network protocol to a software-only virtual UVC webcam. This is less desirable, as it increases latency. The software may also be proprietary, so long-term support has to be carefully evaluated. Examples are RTMP-based cameras (usually security cameras) and action cameras (e.g. GoPro).

Notable exception: NDI. NDI is a low-latency, professional broadcast-capable protocol for sending video over regular IP networks. It is developed by NewTek (of Video Toaster fame). It is proprietary, but NewTek has provided “binary blobs” that allow developers, including open source projects such as OBS, to include this functionality in their products. There are several bridge programs that go from NDI to a virtual UVC camera, and the potential exists for conferencing apps to use the NDI protocol directly.

Many of the video capture device APIs pre-date the USB UVC class of devices. There are several, and they work differently. They are often well supported in video editing software, but less so for conferencing tools.

Proprietary interfaces (hardware and/or software) should be avoided if possible, as they’re likely to be the most fragile and may not have long-term support.

Asides:

  • Lower-cost, low-latency HDMI-USB adapters are 8 bit, have no HDR, and capture 4:2:2 or 4:2:0
  • Blackmagic and other prosumer/professional adapters can capture 10 bit, even 4:4:4
  • Capturing at 4K 4:2:2, downsampling the luminance, and keeping chroma at the captured resolution gives close to 1080p 4:4:4, which is excellent for (recorded) green screen work. See this detailed forum post.
  • Elgato 4k capture studio has a secret menu (accessed via ctrl+shift+alt+u) that can, for some models, allow 4:2:2 capture. In OBS choose YUY2 as the color format. Be careful with any of the settings in that menu! Ref
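The 4K-to-1080p point follows from the sizes of the luma and chroma planes. The sketch below works them out; the helper name and the divisor table are illustrative and are not taken from any particular library.

```python
# (horizontal divisor, vertical divisor) applied to the chroma planes
CHROMA_DIVISORS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}

def plane_sizes(width, height, subsampling):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for one frame."""
    div_x, div_y = CHROMA_DIVISORS[subsampling]
    return (width, height), (width // div_x, height // div_y)

# Source: 4K (UHD) captured as 4:2:2. Target: 1080p with full chroma.
luma_4k, chroma_4k = plane_sizes(3840, 2160, "4:2:2")
luma_hd, chroma_hd = plane_sizes(1920, 1080, "4:4:4")

print("4K 4:2:2    luma", luma_4k, "chroma", chroma_4k)
print("1080p 4:4:4 luma", luma_hd, "chroma", chroma_hd)
```

Under these assumptions the captured chroma planes are already 1920 samples wide, the full width of a 1080p frame, and have twice the needed height. Halving the luma in both directions therefore leaves a chroma sample for every output pixel, with only a vertical reduction of the chroma required.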

Key Observations

  • Direct connection, high-quality (good resolution, frame rate, and reasonable lossy compression) UVC USB from the camera would be wonderful
  • USB UVC capture interface is desirable
  • Low latency is needed
  • 4:4:4 or 4:2:2 chroma sub-sampling, 10 bit color, HDR capture capability would be nice

So this is an excellent statement of what you are recommending after some considerable study. Can we make a decision on a London Drugs purchase of a camera so we can do basic compatibility testing with a computer and some appropriate software?