Tracking API

The Spectacular AI tracking API (a.k.a. “VIO” API) consists of the common output types relevant for real-time 6-degree-of-freedom pose tracking.

Coordinate systems

The SDK uses the following coordinate conventions, which are also illustrated in the diagram below:

  • World coordinate system: Right-handed Z-is-up

  • Camera coordinate system: OpenCV convention (see here for a nice illustration), which is also right-handed

These conventions differ from those of, e.g., the Intel RealSense SDK (cf. here), ARCore, Unity and most OpenGL tutorials, most of which use a “Y-is-up” coordinate system, often different camera coordinate systems, and sometimes different pixel (or “NDC”) coordinate conventions.
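As a rough illustration of what converting between such conventions involves, the sketch below maps a point from a right-handed Z-is-up world frame to a right-handed Y-is-up frame and back. The axis mapping used here (x is kept, the old z becomes y, the old y becomes negative z) is only one common choice and is an assumption; the helper names are hypothetical and not part of the SDK. Check the documentation of the target framework for its actual convention.

```python
def z_up_to_y_up(point):
    """Map (x, y, z) from a right-handed Z-up frame to a right-handed Y-up frame.

    Assumed mapping: x' = x, y' = z, z' = -y, i.e. a rotation of -90 degrees
    about the x-axis. Handedness is preserved because this is a pure rotation.
    """
    x, y, z = point
    return (x, z, -y)


def y_up_to_z_up(point):
    """Inverse of z_up_to_y_up: x = x', y = -z', z = y'."""
    x, y, z = point
    return (x, -z, y)


# A point one meter above the origin in the Z-up frame...
p_z_up = (0.0, 0.0, 1.0)
# ...ends up on the positive y-axis of the Y-up frame.
p_y_up = z_up_to_y_up(p_z_up)
```

Orientations need the same change of basis applied to their rotation matrices; converting only positions is not enough for full poses.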

By default, the spectacularAI.Pose object returned by the Spectacular AI SDK uses the left camera as the local reference frame. To get the pose for another camera, use spectacularAI.VioOutput.getCameraPose() or, on OAK-D devices, spectacularAI.depthai.Session.getRgbCameraPose().

SDK coordinate systems

6-DoF pose output

class spectacularAI.Camera(self: spectacularAI.Camera, arg0: List[List[float[3]][3]], arg1: int, arg2: int)

Represents the intrinsic parameters of a particular camera. If the input image is distorted, the camera and projection matrices correspond to the undistorted / rectified image.

Build a pinhole camera

getIntrinsicMatrix(self: spectacularAI.Camera) numpy.ndarray

3x3 intrinsic camera matrix (OpenCV convention, undistorted).
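To illustrate how a 3x3 intrinsic matrix of this form is typically used, the sketch below projects a point given in OpenCV-style camera coordinates (x right, y down, z forward) to pixel coordinates with the standard pinhole model. The matrix values are made up for the example, zero skew is assumed, and project_pinhole is a hypothetical helper rather than an SDK function.

```python
def project_pinhole(K, point_cam):
    """Project a camera-frame point (x, y, z) to pixel coordinates (u, v).

    K is a 3x3 matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] (zero skew assumed).
    Returns None for points at or behind the camera plane (z <= 0).
    """
    x, y, z = point_cam
    if z <= 0.0:
        return None
    fx, fy = K[0][0], K[1][1]
    cx, cy = K[0][2], K[1][2]
    return (fx * x / z + cx, fy * y / z + cy)


# Made-up intrinsics for a 640x480 image (illustration only).
K_example = [
    [500.0, 0.0, 320.0],
    [0.0, 500.0, 240.0],
    [0.0, 0.0, 1.0],
]

# A point 2 m in front of the camera, slightly right of and above the axis.
pixel = project_pinhole(K_example, (0.2, -0.1, 2.0))
```

A negative y in camera coordinates lands above the principal point because the image y-axis points down in this convention.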

getProjectionMatrixOpenGL(self: spectacularAI.Camera, arg0: float, arg1: float) numpy.ndarray

4x4 projection matrix for OpenGL (undistorted)

pixelToRay(self: spectacularAI.Camera, arg0: spectacularAI.PixelCoordinates) object

Convert pixel coordinates to camera coordinates

Parameters:

arg0 PixelCoordinates: pixel coordinates.

Returns:

Vector3d ray in camera coordinates on successful conversion. None otherwise.

rayToPixel(self: spectacularAI.Camera, arg0: spectacularAI.Vector3d) object

Convert camera coordinates to pixel coordinates

Parameters:

arg0 Vector3d: ray in camera coordinates.

Returns:

PixelCoordinates pixel coordinates on successful conversion (note that the pixel can be outside image boundaries). None otherwise.

class spectacularAI.CameraPose

Represents the pose (position & orientation) and other parameters of a particular camera.

property camera

Camera: camera parameters

getCameraToWorldMatrix(self: spectacularAI.CameraPose) numpy.ndarray

4x4 homogeneous camera-to-world matrix

getPosition(self: spectacularAI.CameraPose) spectacularAI.Vector3d

Vector3d position of the camera

getWorldToCameraMatrix(self: spectacularAI.CameraPose) numpy.ndarray

4x4 homogeneous world-to-camera matrix
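The two 4x4 matrices are inverses of each other. The sketch below shows, with plain nested lists and a made-up camera pose, how a homogeneous camera-to-world matrix moves a point into world coordinates and how the world-to-camera matrix of a rigid transform can be derived from it. Both helper functions are hypothetical and not part of the SDK, which returns these matrices directly as numpy arrays.

```python
def transform_point(T, p):
    """Apply a 4x4 homogeneous transform T to a 3D point p."""
    x, y, z = p
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3]
                 for i in range(3))


def invert_rigid(T):
    """Invert a rigid transform [R t; 0 1], giving [R^T, -R^T t; 0 1]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]  # transpose of R
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


# Made-up pose: camera at (1, 2, 3), rotated 90 degrees about the world z-axis.
camera_to_world = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
world_to_camera = invert_rigid(camera_to_world)

# The camera origin maps to the camera position in world coordinates.
camera_position = transform_point(camera_to_world, (0.0, 0.0, 0.0))
# A round trip through both matrices returns the original point.
p_world = transform_point(camera_to_world, (1.0, 0.0, 0.0))
p_camera = transform_point(world_to_camera, p_world)
```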

pixelToWorld(self: spectacularAI.CameraPose, arg0: spectacularAI.PixelCoordinates) object

Convert pixel coordinates to rays in world coordinates

Parameters:

arg0 PixelCoordinates: pixel coordinates to convert.

Returns:

A tuple with the Vector3d origin of the ray in world coordinates and the Vector3d direction of the ray from the origin on successful conversion. None otherwise.
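A world-space ray of this kind (origin plus direction) is often intersected with known geometry. As a minimal sketch, assuming a flat floor at a known height in the Z-is-up world frame, the hypothetical helper below finds where such a ray hits the horizontal plane. The origin and direction are given as plain tuples with made-up values; with the SDK they would come from the x, y and z properties of the two returned Vector3d objects.

```python
def intersect_ray_with_horizontal_plane(origin, direction, plane_z=0.0):
    """Intersect the ray origin + t * direction (t >= 0) with the plane z = plane_z.

    Returns the intersection point as a tuple, or None if the ray is parallel
    to the plane or the plane lies behind the ray origin.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if abs(dz) < 1e-12:
        return None  # ray runs parallel to the plane
    t = (plane_z - oz) / dz
    if t < 0.0:
        return None  # intersection would be behind the origin
    return (ox + t * dx, oy + t * dy, plane_z)


# Made-up ray: camera 1.5 m above the floor, looking forward and down.
hit = intersect_ray_with_horizontal_plane((0.0, 0.0, 1.5), (1.0, 0.0, -1.0))
```

The direction does not need to be normalized for this computation, since only the ratio of its components matters.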

property pose

latest Pose

property velocity

Vector3d instantaneous velocity vector (xyz) of the camera center in m/s

worldToPixel(self: spectacularAI.CameraPose, arg0: spectacularAI.Vector3d) object

Convert world coordinates to pixel coordinates

Parameters:

arg0 Vector3d: point in world coordinates.

Returns:

PixelCoordinates pixel coordinates on successful conversion (note that the pixel can be outside image boundaries). None otherwise.

class spectacularAI.VioOutput

Main output structure

property angularVelocity

angular velocity vector in SI units (Vector3d)

asJson(self: spectacularAI.VioOutput) str

a JSON representation of this object

getCameraPose(self: spectacularAI.VioOutput, arg0: int) spectacularAI.CameraPose

CameraPose corresponding to a camera whose index is given as the parameter. Index 0 corresponds to the primary camera and index 1 the secondary camera.

property globalPose

GnssVioOutput Global pose, only returned if GNSS information is provided via Session.addGnss(…)

property pose

latest Pose

property poseTrail

trail of smoothed historical poses (list of Pose objects)

property positionCovariance

position uncertainty, 3x3 covariance matrix
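A 3x3 covariance matrix can be summarized as per-axis standard deviations by taking the square roots of its diagonal entries. The sketch below does this for a made-up covariance given as nested lists; axis_std_devs is a hypothetical helper, and the off-diagonal correlations are ignored in this simplification.

```python
import math


def axis_std_devs(cov):
    """Return (sigma_x, sigma_y, sigma_z) from the diagonal of a 3x3 covariance.

    Small negative diagonal values caused by numerical noise are clamped to zero.
    """
    return tuple(math.sqrt(max(cov[i][i], 0.0)) for i in range(3))


# Made-up covariance, in squared position units (illustration only).
example_cov = [
    [0.04, 0.0, 0.0],
    [0.0, 0.09, 0.0],
    [0.0, 0.0, 0.25],
]
sigmas = axis_std_devs(example_cov)
```

The resulting values are in the same units as the quantity itself, which makes them easier to compare against an application's accuracy requirements than the raw covariance.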

property status

current TrackingStatus

property tag

input tag from addTrigger. Set to 0 for other outputs.

property velocity

current velocity (Vector3d)

property velocityCovariance

velocity uncertainty, 3x3 covariance matrix

class spectacularAI.GnssVioOutput

GNSS-VIO output

property angularVelocity

current Vector3d instantaneous angular velocity in ENU coordinates

property coordinates

current WgsCoordinates

property enuPositionCovariance

ENU position uncertainty, 3x3 covariance matrix

getEnuCameraPose(self: spectacularAI.GnssVioOutput, arg0: int, arg1: spectacularAI.WgsCoordinates) spectacularAI::CameraPose

Get the global pose of a particular camera. The “world” coordinate system of the camera pose is an East-North-Up system, whose origin is at the given WGS84 coordinates.

property orientation

current Quaternion

property velocity

current Vector3d instantaneous velocity in ENU coordinates

property velocityCovariance

velocity uncertainty, 3x3 covariance matrix

Common types

class spectacularAI.Pose

Represents the pose (position & orientation) of a device at a given time. This typically corresponds to the pose of the IMU (configurable). See CameraPose for exact poses of the cameras.

asMatrix(self: spectacularAI.Pose) numpy.ndarray

4x4 matrix that converts homogeneous local coordinates to homogeneous world coordinates
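For reference, a local-to-world matrix of this shape can be assembled from a position and a unit quaternion with the standard Hamilton-convention formula. The sketch below is a plain-Python illustration with hypothetical helper names; whether the SDK builds its matrix in exactly this way is an assumption.

```python
import math


def quaternion_to_rotation_matrix(w, x, y, z):
    """Rotation matrix of a Hamilton-convention quaternion (normalized first)."""
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def pose_to_matrix(position, quaternion_wxyz):
    """Build a 4x4 local-to-world matrix [R t; 0 1] from position and orientation."""
    R = quaternion_to_rotation_matrix(*quaternion_wxyz)
    return [R[i] + [position[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


# Made-up pose: 90 degree rotation about the z-axis, positioned at (1, 2, 3).
half_angle = math.pi / 4
local_to_world = pose_to_matrix(
    (1.0, 2.0, 3.0),
    (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle)),
)
```

With this pose, the local x-axis points along the world y-axis, which is what the first column of the rotation part expresses.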

fromMatrix(self: float, arg0: List[List[float[4]][4]]) spectacularAI.Pose

Create a pose from a timestamp and 4x4 local-to-world matrix

property orientation

Quaternion orientation of the IMU / camera, local-to-world

property position

Vector3d position of the IMU / camera

property time

float timestamp in seconds, synchronized with device monotonic time (not host)

class spectacularAI.Vector3d(*args, **kwargs)

Vector in R^3. Can represent, e.g., velocity, position or angular velocity. Each property is a float.

Overloaded function.

  1. __init__(self: spectacularAI.Vector3d) -> None

  2. __init__(self: spectacularAI.Vector3d, arg0: float, arg1: float, arg2: float) -> None

property x
property y
property z
class spectacularAI.Vector3f(*args, **kwargs)

Vector in R^3. Single precision.

Overloaded function.

  1. __init__(self: spectacularAI.Vector3f) -> None

  2. __init__(self: spectacularAI.Vector3f, arg0: float, arg1: float, arg2: float) -> None

property x
property y
property z
class spectacularAI.Quaternion

Quaternion representation of a rotation. Hamilton convention. Each property is a float.

property w
property x
property y
property z
class spectacularAI.TrackingStatus(self: spectacularAI.TrackingStatus, value: int)

Members:

INIT

TRACKING

LOST_TRACKING

INIT = <TrackingStatus.INIT: 0>
LOST_TRACKING = <TrackingStatus.LOST_TRACKING: 2>
TRACKING = <TrackingStatus.TRACKING: 1>
property name
property value
class spectacularAI.ColorFormat(self: spectacularAI.ColorFormat, value: int)

Members:

NONE

GRAY

RGB

RGBA

GRAY16

GRAY = <ColorFormat.GRAY: 1>
GRAY16 = <ColorFormat.GRAY16: 7>
NONE = <ColorFormat.NONE: 0>
RGB = <ColorFormat.RGB: 2>
RGBA = <ColorFormat.RGBA: 3>
property name
property value
class spectacularAI.Bitmap

Represents a grayscale or RGB bitmap

getColorFormat(self: spectacularAI.Bitmap) spectacularAI.ColorFormat

ColorFormat

getHeight(self: spectacularAI.Bitmap) int

int bitmap height

getWidth(self: spectacularAI.Bitmap) int

int bitmap width

toArray(self: spectacularAI.Bitmap) numpy.ndarray

Returns array representation of the bitmap

class spectacularAI.Frame

A camera frame with a pose

property cameraPose

CameraPose corresponding to this camera

property image

Bitmap that will contain bitmap image if it’s available

property index

Camera index

class spectacularAI.WgsCoordinates(self: spectacularAI.WgsCoordinates)

Represents a global position in WGS84 coordinates (latitude, longitude and altitude).

property altitude
property latitude
property longitude
class spectacularAI.PixelCoordinates(*args, **kwargs)

Coordinates of an image pixel (x, y), subpixel accuracy (float).

Overloaded function.

  1. __init__(self: spectacularAI.PixelCoordinates) -> None

  2. __init__(self: spectacularAI.PixelCoordinates, arg0: float, arg1: float) -> None

property x
property y
class spectacularAI.FeaturePoint(self: spectacularAI.FeaturePoint)

Sparse 3D feature point observed from a certain camera frame.

property id

An int ID to identify same points in different frames.

property pixelCoordinates

PixelCoordinates of the observation in the camera frame.

property position

Vector3d global position of the feature point.