Python API Reference#

, **subprocess_kwargs) CompletedProcess#

Send commands to an Android device using ADB.

This is a convenience function. It will construct an AdbDevice with the default parameters (taken from your config files) and call AdbDevice.adb with the parameters given here.

class, adb_server=None, adb_binary=None, tcpip=None)#

Send commands to an Android device using ADB.

Default values for each parameter can be specified in your “stbt.conf” config file under the “[android]” section.

  • address (string) – IP address (if using Network ADB) or serial number (if connected via USB) of the Android device. You can get the serial number by running adb devices -l. If not specified, this is read from “device_under_test.ip_address” in your Node-specific configuration files.

  • adb_server (string) – The ADB server (that is, the PC connected to the Android device). Defaults to localhost.

  • adb_binary (string) – The path to the ADB client executable. Defaults to “adb”.

  • tcpip (bool) – The ADB server communicates with the Android device via TCP/IP, not USB. This requires that you have enabled Network ADB access on the device. Defaults to True if address is an IP address, False otherwise.


Currently, the Stb-tester Nodes don’t support ADB over USB. You must enable Network ADB on the device-under-test, and you must configure the IP address of the device-under-test by specifying device_under_test.ip_address in the Node-specific configuration files for each Stb-tester Node.

) CompletedProcess#

Run any ADB command.

For example, the following code will use “adb shell am start” to launch an app on the device:

d = AdbDevice(...)
d.adb(["shell", "am", "start", "-S",

Any keyword arguments are passed on to


subprocess.CompletedProcess from


subprocess.CalledProcessError if check is true and the adb process returns a non-zero exit status.


subprocess.TimeoutExpired if timeout is specified and the adb process doesn’t finish within that number of seconds.

Raises: if adb connect fails.
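The exception types listed above are the standard ones from Python's subprocess module. As a rough sketch of what the check and timeout keyword arguments do, the snippet below runs the Python interpreter itself as a stand-in for the adb binary, so it works without a device. It assumes the keyword arguments behave like those of subprocess.run; run_command is a hypothetical helper, not part of stbt.

```python
import subprocess
import sys


def run_command(args, **subprocess_kwargs):
    """Hypothetical helper: forward keyword arguments to subprocess.run.

    The Python interpreter stands in for the adb executable here.
    """
    return subprocess.run([sys.executable, "-c"] + args, **subprocess_kwargs)


# A zero exit status returns a CompletedProcess.
ok = run_command(["print('hello')"], capture_output=True, text=True, check=True)

# check=True turns a non-zero exit status into CalledProcessError.
try:
    run_command(["import sys; sys.exit(3)"], check=True)
    failed_status = None
except subprocess.CalledProcessError as e:
    failed_status = e.returncode

# timeout=... raises TimeoutExpired if the process runs for too long.
try:
    run_command(["import time; time.sleep(5)"], timeout=0.5)
    timed_out = False
except subprocess.TimeoutExpired:
    timed_out = True
```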

devices() str#

Output of adb devices -l.

get_frame(coordinate_system=None) stbt.Frame#

Take a screenshot using ADB.

If you are capturing video from the Android device via another method (namely, HDMI capture), it can sometimes be useful to capture a frame via ADB for debugging. This function will manipulate the ADB screenshot (scale and/or rotate it) to match the screenshots from your main video-capture method as closely as possible.


A stbt.Frame, that is, an image in OpenCV format. Note that the time attribute won’t be very accurate (probably to within 0.5s or so).

press(key) None#

Send a keypress.


key (str) – An Android keycode as listed in <>. Also accepts standard Stb-tester key names like “KEY_HOME” and “KEY_BACK”.

swipe(start_position, end_position) None#

Swipe from one point to another point.

  • start_position – A stbt.Region or (x, y) tuple of coordinates at which to start.

  • end_position – A stbt.Region or (x, y) tuple of coordinates at which to stop.


d.swipe((100, 100), (100, 400))

tap(position) None#

Tap on a particular location.


position – A stbt.Region, or an (x,y) tuple.


d.tap((100, 20))

logcat(filename='logcat.log', logcat_args=None)#

Run adb logcat and stream the logs to filename.

This is a context manager. See Capturing logs from the device-under-test for the recommended way to use it.

  • filename (str) – Where the logs are written.

  • logcat_args (list) – Optional arguments to pass on to adb logcat, such as filter expressions. For example: logcat_args=["ActivityManager:I", "MyApp:D", "*:S"]. See the logcat documentation.


Bases: Exception

Exception raised by AdbDevice.adb.

  • returncode (int) – Exit status of the adb command.

  • cmd (list) – The command that failed, as given to AdbDevice.adb.

  • output (str) – The output from adb.

  • devices (str) – The output from “adb devices -l” (useful for debugging connection errors).

class stbt.AppleTV(address: str | None = None)#

Control an AppleTV device using pyatv.

pyatv is an open source tool for controlling Apple TV devices. It uses AirPlay and other network protocols supported by the Apple TV. This class makes it easy to use pyatv in Stb-tester test scripts.

pyatv has additional dependencies that are not installed by default in the Stb-tester v34 environment. For instructions on how to install these dependencies, and how to configure the Apple TV’s IP address, see Controlling Apple TV with pyatv.


We have done the “integration” work to make pyatv work on the Stb-tester Node, but we are not the authors of pyatv, and we don’t provide support for bugs in pyatv itself.


address (str) – The IP address of the Apple TV device. If not specified, this is read from “device_under_test.ip_address” in your Node-specific configuration files.

launch_app(name: str)#

Launches the specified app.


name – The name or bundle ID of the app (for example “Netflix” or “”).

list_apps() dict[str, str]#

Returns a dict of id: name with all the apps installed on the Apple TV device.

set_text(text: str)#

Set the text in a focused text field.

This only works if a text field is currently focused (for example a search or login field).

text: str,
corrections: dict[Pattern | str, str] | None = None,
) str#

Applies the same corrections as stbt.ocr’s corrections parameter.

This is available as a separate function so that you can use it to post-process old test artifacts using new corrections.


Context manager that replaces test failures with test errors.

Stb-tester’s reports show test failures (that is, UITestFailure or AssertionError exceptions) as red results, and test errors (that is, unhandled exceptions of any other type) as yellow results. Note that wait_for_match, wait_for_motion, and similar functions raise a UITestFailure when they detect a failure. By running such functions inside an as_precondition context, any UITestFailure or AssertionError exceptions they raise will be caught, and a PreconditionError will be raised instead.

When running a single testcase hundreds or thousands of times to reproduce an intermittent defect, it is helpful to mark unrelated failures as test errors (yellow) rather than test failures (red), so that you can focus on diagnosing the failures that are most likely to be the particular defect you are looking for. For more details see Test failures vs. errors.


message (str) – A description of the precondition. Word this positively: “Channels tuned”, not “Failed to tune channels”.


PreconditionError if the wrapped code block raises a UITestFailure or AssertionError.


def test_that_the_on_screen_id_is_shown_after_booting():
    channel = 100

    with stbt.as_precondition("Tuned to channel %s" % channel):
        assert channels.is_on_channel(channel)
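To illustrate the exception conversion described above, here is a minimal sketch of a context manager with the same shape. It is not stbt's implementation: the two exception classes are local stand-ins for the types named in the text, and as_precondition_sketch is a hypothetical name.

```python
import contextlib


class UITestFailure(Exception):
    """Local stand-in for the test-failure exception named above."""


class PreconditionError(Exception):
    """Local stand-in for the error raised in place of a failure."""


@contextlib.contextmanager
def as_precondition_sketch(message):
    """Re-raise test failures from the wrapped block as PreconditionError."""
    try:
        yield
    except (UITestFailure, AssertionError) as original:
        raise PreconditionError(
            "Didn't meet precondition %r (original exception: %r)"
            % (message, original)) from original
```

Exceptions of any other type pass through unchanged, since they would already be reported as test errors.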


Low-level API to get raw audio samples.

audio_chunks returns an iterator of AudioChunk objects. Each one contains 100ms to 5s of mono audio samples (see AudioChunk for the data format).

audio_chunks keeps a buffer of 10s of audio samples. time_index allows the caller to access these old samples. If you read from the returned iterator too slowly you may miss some samples. The returned iterator will skip these old samples and silently re-sync you at -10s. You can detect this situation by comparing the .end_time of the previous chunk to the .time of the current one.


time_index (int or float) – Time from which audio samples should be yielded. This is an epoch time compatible with time.time(). Defaults to the current time as given by time.time().


An iterator yielding AudioChunk objects

Return type: Iterator[AudioChunk]


class stbt.AudioChunk#

A sequence of audio samples.

An AudioChunk object is what you get from audio_chunks. It is a subclass of numpy.ndarray. An AudioChunk is a 1-D array containing audio samples in 32-bit floating point format (numpy.float32) between -1.0 and 1.0.

In addition to the members inherited from numpy.ndarray, AudioChunk defines the following attributes:

  • time (float) – The wall-clock time of the first audio sample in this chunk, as number of seconds since the unix epoch (1970-01-01T00:00:00Z). This is the same format used by the Python standard library function time.time.

  • rate (int) – Number of samples per second. This will typically be 48000.

  • duration (float) – The duration of this audio chunk in seconds.

  • end_time (float) – time + duration.

AudioChunk supports slicing using Python’s [x:y] syntax, so the above attributes will be updated appropriately on the returned slice.
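The gap check mentioned under audio_chunks (comparing the end_time of the previous chunk with the time of the current one) could look roughly like this. FakeChunk is a hypothetical stand-in that carries only the timing attributes listed above, and the tolerance value is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FakeChunk:
    """Hypothetical stand-in with the timing attributes of an AudioChunk."""
    time: float
    duration: float

    @property
    def end_time(self):
        return self.time + self.duration


def find_gaps(chunks, tolerance=0.001):
    """Yield (gap_start, gap_end) wherever consecutive chunks aren't contiguous."""
    previous = None
    for chunk in chunks:
        if previous is not None and chunk.time - previous.end_time > tolerance:
            yield (previous.end_time, chunk.time)
        previous = chunk


chunks = [FakeChunk(0.0, 1.0), FakeChunk(1.0, 1.0), FakeChunk(5.0, 1.0)]
gaps = list(find_gaps(chunks))
```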

class stbt.BGRDiff(
min_size: tuple[int, int] | None = None,
threshold: float = 25,
erode: bool = True,

Compares 2 frames by calculating the color distance between them.

The algorithm calculates the Euclidean distance in BGR colorspace between each pair of corresponding pixels in the 2 frames. This distance is then binarized using the specified threshold: values smaller than the threshold are ignored. Then, an “erode” operation removes differences that are only 1 pixel wide or high. If any differences remain, the 2 frames are considered different.

This is the default diffing algorithm for detect_motion, wait_for_motion, press_and_wait, find_selection_from_background, and ocr’s text_color.
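The three steps described above (per-pixel distance, thresholding, erosion) can be sketched in plain Python on nested lists of (b, g, r) tuples. This is an illustration only, not the library's code: the 3x3 erosion kernel and the treatment of pixels outside the image as "no difference" are assumptions, and all function names are hypothetical.

```python
import math


def bgr_distance(p, q):
    """Euclidean distance between two (b, g, r) pixels."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def binarize(frame1, frame2, threshold):
    """True where the colour distance reaches the threshold."""
    return [[bgr_distance(p, q) >= threshold for p, q in zip(row1, row2)]
            for row1, row2 in zip(frame1, frame2)]


def erode_mask(mask):
    """Assumed 3x3 erosion: keep a pixel only if all 9 neighbours are set."""
    height, width = len(mask), len(mask[0])

    def is_set(y, x):
        return 0 <= y < height and 0 <= x < width and mask[y][x]

    return [[all(is_set(y + dy, x + dx)
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(width)]
            for y in range(height)]


def frames_differ(frame1, frame2, threshold=25, erode=True):
    """True if any difference survives thresholding (and erosion)."""
    mask = binarize(frame1, frame2, threshold)
    if erode:
        mask = erode_mask(mask)
    return any(any(row) for row in mask)
```

With this sketch a single changed pixel is discarded by the erosion step, while a 3x3 block of changed pixels is still reported as a difference.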

class stbt.Color(hexstring: str)#
class stbt.Color(blue: int, green: int, red: int, alpha: int | None = None)
class stbt.Color(bgr: tuple[int, int, int])
class stbt.Color(bgra: tuple[int, int, int, int])

A BGR color, optionally with an alpha (transparency) value.

A Color can be created from an HTML-style hex string:

>>> Color('#f77f00')

Or from Blue, Green, Red values in the range 0-255:

>>> Color(0, 127, 247)

Note: When you specify the colors in this way, the BGR order is the opposite of the HTML-style RGB order. This is for compatibility with the way OpenCV stores colors.

Any stbt APIs that take a Color will also accept a string or tuple in the above formats, so you don’t need to construct a Color explicitly.
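As a small illustration of the ordering difference noted above, the following sketch converts an HTML-style hex string into a (blue, green, red) tuple. hex_to_bgr is a hypothetical helper written for this example, not part of stbt.

```python
def hex_to_bgr(hexstring):
    """Convert '#rrggbb' to a (blue, green, red) tuple of ints."""
    digits = hexstring.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a string like '#f77f00'")
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return (blue, green, red)


bgr = hex_to_bgr("#f77f00")  # the same colour as Color(0, 127, 247) above
```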


Calculate Euclidean color distance in a perceptually uniform colorspace.

Calculates the distance of each pixel in frame against the color specified in background_color or foreground_color. The output is a binary (black and white) image.

  • frame (stbt.Frame) – The video frame to process.

  • background_color (Color) – The color to diff against. Output pixels will be white where the color distance is greater than threshold. Use this to remove a background of a particular color.

  • foreground_color (Color) – The color to diff against. Output pixels will be white where the color distance is smaller than threshold. Use this to find a foreground feature of a particular color, such as text or the selection/focus.

  • threshold (float) – Binarization threshold in the range [0., 1.]. Foreground pixels will be set to white, background pixels to black. A value of 0.01 means a barely-noticeable difference to human perception. To disable binarization set threshold=None; the output will be a grayscale image.

  • erode (bool) – Run the thresholded differences through an erosion algorithm to remove noise or small differences (less than 3px).

Return type:



Binary (black & white) image, or grayscale image if threshold=None.

Added in v33.

exception stbt.ConfigurationError#

Bases: Exception

An error with your stbt configuration file.

class stbt.ConfirmMethod#

An enum. See MatchParameters for documentation of these values.

NONE = 'none'#
ABSDIFF = 'absdiff'#
NORMED_ABSDIFF = 'normed-absdiff'#
stbt.crop(frame: Frame, region: Region) Frame#

Returns an image containing the specified region of frame.


frame (stbt.Frame or numpy.ndarray) – An image in OpenCV format (for example as returned by frames, get_frame and load_image, or the frame parameter of MatchResult).


An OpenCV image (numpy.ndarray) containing the specified region of the source frame. This is a view onto the original data, so if you want to modify the cropped image call its copy() method first.

timeout_secs: float = 10,
noise_threshold: int | None = None,
mask: Mask | Region | str = Region.ALL,
region: Region = Region.ALL,
frames: Iterator[Frame] | None = None,
) Iterator[MotionResult]#

Generator that yields a sequence of one MotionResult for each frame processed from the device-under-test’s video stream.

The MotionResult indicates whether any motion was detected.

Use it in a for loop like this:

for motionresult in stbt.detect_motion():
    pass  # inspect each MotionResult here

In most cases you should use wait_for_motion instead.

  • timeout_secs (int or float or None) – A timeout in seconds. After this timeout the iterator will be exhausted. That is, a for loop like for m in detect_motion(timeout_secs=10) will terminate after 10 seconds. If timeout_secs is None then the iterator will yield frames forever. Note that you can stop iterating (for example with break) at any time.

  • noise_threshold (int) –

    The difference in pixel intensity to ignore. Valid values range from 0 (any difference is considered motion) to 255 (which would never report motion).

    This defaults to 25. You can override the global default value by setting noise_threshold in the [motion] section of .stbt.conf.

  • mask (str|numpy.ndarray|Mask|Region) – A Region or a mask that specifies which parts of the image to analyse. This accepts anything that can be converted to a Mask using stbt.load_mask. See Regions and Masks.

  • region (Region) – Deprecated synonym for mask. Use mask instead.

  • frames (Iterator[stbt.Frame]) – An iterable of video-frames to analyse. Defaults to stbt.frames().

Changed in v33: mask accepts anything that can be converted to a Mask using load_mask. The region parameter is deprecated; pass your Region to mask instead. You can’t specify mask and region at the same time.

Changed in v34: The difference-detection algorithm takes color into account. The noise_threshold parameter changed range (from 0.0-1.0 to 0-255), sense (from “bigger is stricter” to “smaller is stricter”), and default value (from 0.84 to 25).

stbt.detect_pages(frame=None, candidates=None, test_pack_root='')#

Find Page Objects that match the given frame.

This function tries each of the Page Objects defined in your test-pack (that is, subclasses of stbt.FrameObject) and returns an instance of each Page Object that is visible (according to the object’s is_visible property).

This is a Python generator that yields 1 Page Object at a time. If your code only consumes the first object (like in the example below), detect_pages will try each Page Object class until it finds a match, yield it to your code, and then it won’t waste time trying other Page Object classes:

page = next(stbt.detect_pages())

To get all the matching pages you can iterate like this:

for page in stbt.detect_pages():
    pass  # handle each matching Page Object here

Or create a list like this:

pages = list(stbt.detect_pages())

  • frame (stbt.Frame) – The video frame to process; if not specified, a new frame is grabbed from the device-under-test by calling stbt.get_frame.

  • candidates (Sequence[Type[stbt.FrameObject]]) – The Page Object classes to try. Note that this is a list of the classes themselves, not instances of those classes. If candidates isn’t specified, detect_pages will use static analysis to find all of the Page Objects defined in your test-pack.

  • test_pack_root (str) – A subdirectory of your test-pack to search for Page Object definitions, used when candidates isn’t specified. Defaults to the entire test-pack.

Return type:



An iterator of Page Object instances that match the given frame.

Added in v32.

class stbt.Differ#

An algorithm that compares two images or frames to find the differences between them.

Subclasses of this class implement the actual diffing algorithms: See BGRDiff and GrayscaleDiff.

class stbt.Direction#

An enumeration.

HORIZONTAL = 'horizontal'#

Process the image from left to right

VERTICAL = 'vertical'#

Process the image from top to bottom

stbt.draw_text(text: str, duration_secs: float = 3) None#

Write the specified text to the output video.

  • text (str) – The text to write.

  • duration_secs (int or float) – The number of seconds to display the text.

stbt.find_file(filename: str) str#

Searches for the given filename relative to the directory of the caller.

When Stb-tester runs a test, the “current working directory” is not the same as the directory of the test-pack git checkout. If you want to read a file that’s committed to git (for example a CSV file with data that your test needs) you can use this function to find it. For example:

f = open(stbt.find_file("my_data.csv"))

If the file is not found in the directory of the Python file that called find_file, this will continue searching in the directory of that function’s caller, and so on, until it finds the file. This allows you to use find_file in a helper function that takes a filename from its caller.

This is the same algorithm used by load_image.
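The search order described above (the caller's directory first, then the directory of each function further up the call stack) could be sketched with the standard inspect module as follows. This is an approximation for illustration; both function names are hypothetical and the real implementation may differ in detail.

```python
import inspect
import os


def search_directories(filename, directories):
    """Return the first existing directories[i]/filename as an absolute path."""
    for directory in directories:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return os.path.abspath(candidate)
    raise FileNotFoundError(filename)


def find_file_sketch(filename):
    """Look next to the caller's source file, then next to its callers' files."""
    # inspect.stack()[0] is this function; [1] is the caller, [2] is the
    # caller's caller, and so on up the stack.
    caller_dirs = [os.path.dirname(os.path.abspath(info.filename))
                   for info in inspect.stack()[1:]]
    return search_directories(filename, caller_dirs)
```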


filename (str) – A relative filename.

Return type: str



Absolute filename.


FileNotFoundError if the file can’t be found.

Added in v33.

min_size=(20, 20),
) list[Region]#

Find contiguous regions of a particular color.

For a guide to using this API see Finding GUI elements by color.

Return type: list[Region]



A list of stbt.Region instances.

Added in v33.

image: Image | str,
max_size: tuple[int, int],
min_size: tuple[int, int] | None = None,
frame: Frame | None = None,
mask: Mask | Region | str = Region.ALL,
threshold: float = 25,
erode: bool = True,
) FindSelectionFromBackgroundResult#

Checks whether frame matches image, calculating the region where there are any differences. The region where frame doesn’t match the image is assumed to be the selection. This allows us to simultaneously detect the presence of a screen (used to implement a stbt.FrameObject class’s is_visible property) as well as finding the selection.

For example, to find the selection of an on-screen keyboard, image would be a screenshot of the keyboard without any selection. You may need to construct this screenshot artificially in an image editor by merging two different screenshots.

Unlike stbt.match, image must be the same size as frame.

  • image

    The background to match against. It can be the filename of a PNG file on disk, or an image previously loaded with stbt.load_image.

    If it has an alpha channel, any transparent pixels are masked out (that is, the alpha channel is ANDed with mask). This image must be the same size as frame.

  • max_size – The maximum size (width, height) of the differing region. If the differences between image and frame are larger than this in either dimension, the function will return a falsey result.

  • min_size – The minimum size (width, height) of the differing region (optional). If the differences between image and frame are smaller than this in either dimension, the function will return a falsey result.

  • frame – If this is specified it is used as the video frame to search in; otherwise a new frame is grabbed from the device-under-test. This is an image in OpenCV format (for example as returned by stbt.frames and stbt.get_frame).

  • mask – A Region or a mask that specifies which parts of the image to analyse. This accepts anything that can be converted to a Mask using stbt.load_mask. See Regions and Masks.

  • threshold – The minimum colour distance between corresponding pixels in image and frame for them to count as different. 0 means the colours have to match exactly. 255 would mean that even white (255, 255, 255) would match black (0, 0, 0).

  • erode – By default we pass the thresholded differences through an erosion algorithm to remove noise or small anti-aliasing differences. If your selection is a single line less than 3 pixels wide, set this to False.


An object that will evaluate to true if image and frame matched with a difference smaller than max_size. The object has the following attributes:

  • matched (bool) – True if the image and the frame matched with a difference smaller than max_size.

  • region (stbt.Region) – The bounding box that contains the selection (that is, the differences between image and frame).

  • mask_region (stbt.Region) – The region of the frame that was analysed, as given in the function’s mask parameter.

  • image (stbt.Image) – The reference image given to find_selection_from_background.

  • frame (stbt.Frame) – The video-frame that was analysed.
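The size rules for max_size and min_size can be illustrated with a short sketch that takes an already-computed difference mask (a nested list of booleans) and derives the bounding box and the matched flag. The function names are hypothetical, and treating "no differences at all" as a 0x0 selection is an assumption of this sketch.

```python
def bounding_box(diff_mask):
    """Return (x, y, width, height) of the True cells, or None if there are none."""
    points = [(x, y) for y, row in enumerate(diff_mask)
              for x, differs in enumerate(row) if differs]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def check_selection(diff_mask, max_size, min_size=None):
    """Return (matched, box), applying the max_size and min_size rules."""
    box = bounding_box(diff_mask)
    width, height = (box[2], box[3]) if box else (0, 0)
    too_big = width > max_size[0] or height > max_size[1]
    too_small = min_size is not None and (
        width < min_size[0] or height < min_size[1])
    return (not too_big and not too_small, box)
```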

stbt.find_selection_from_background was added in v32.

Changed in v33: mask accepts anything that can be converted to a Mask using load_mask (previously it only accepted a Region).

class stbt.Frame#

A frame of video.

A Frame is what you get from stbt.get_frame and stbt.frames. It is a subclass of numpy.ndarray, which is the type that OpenCV uses to represent images. Data is stored in 8-bit, 3 channel BGR format.

In addition to the members inherited from numpy.ndarray, Frame defines the following attributes:

  • time (float) – The wall-clock time when this video-frame was captured, as number of seconds since the unix epoch (1970-01-01T00:00:00Z). This is the same format used by the Python standard library function time.time.

  • width (int) – The width of the frame, in pixels.

  • height (int) – The height of the frame, in pixels.

  • region (Region) – A Region corresponding to the full size of the frame — that is, Region(0, 0, width, height).

class stbt.FrameObject#

Base class for user-defined Page Objects.

FrameObjects are Stb-tester’s implementation of the Page Object pattern. A FrameObject is a class that uses Stb-tester APIs like stbt.match() and stbt.ocr() to extract information from the screen, and it provides a higher-level API in the vocabulary and user-facing concepts of your own application.


Based on Martin Fowler’s PageObject diagram#

Stb-tester uses a separate instance of your FrameObject class for each frame of video captured from the device-under-test (hence the name “Frame Object”). Stb-tester provides additional tooling for writing, testing, and maintenance of FrameObjects.

To define your own FrameObject class:

  • Derive from stbt.FrameObject.

  • Define an is_visible property (using Python’s @property decorator) that returns True or False.

  • Define any other properties for information that you want to extract from the frame.

  • Inside each property, when you call an image-processing function (like stbt.match or stbt.ocr) you must specify the parameter frame=self._frame.

The following behaviours are provided automatically by the FrameObject base class:

  • Truthiness: A FrameObject instance is considered “truthy” if it is visible. Any other properties (apart from is_visible) will return None if the object isn’t visible.

  • Immutability: FrameObjects are immutable, because they represent information about a specific frame of video – in other words, an instance of a FrameObject represents the state of the device-under-test at a specific point in time. If you define any methods that change the state of the device-under-test, they should return a new FrameObject instance instead of modifying self.

  • Caching: Each property will be cached the first time it is used. This allows writing testcases in a natural way, while expensive operations like ocr will only be done once per frame.

For more details see Object Repository in the Stb-tester manual.
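The truthiness and caching behaviours listed above can be imitated in a few lines of plain Python. The sketch below is not how stbt implements them: MiniFrameObject and frame_property are hypothetical stand-ins, and the "frame" is just a dict so that the example runs without any video capture.

```python
import functools


def frame_property(func):
    """Hypothetical decorator: compute once per instance, then reuse."""
    name = func.__name__

    @functools.wraps(func)
    def getter(self):
        if name not in self._cache:
            if name != "is_visible" and not self.is_visible:
                self._cache[name] = None  # hidden page: other properties are None
            else:
                self._cache[name] = func(self)
        return self._cache[name]

    return property(getter)


class MiniFrameObject:
    """Hypothetical stand-in for the base class; `frame` can be any object."""

    def __init__(self, frame):
        self._frame = frame
        self._cache = {}

    def __bool__(self):
        return bool(self.is_visible)


class Dialog(MiniFrameObject):
    ocr_calls = 0  # counts how often the "expensive" step runs

    @frame_property
    def is_visible(self):
        return "dialog" in self._frame

    @frame_property
    def message(self):
        Dialog.ocr_calls += 1
        return self._frame["dialog"]


shown = Dialog({"dialog": "Are you sure?"})
hidden = Dialog({})
```

Reading shown.message twice runs the expensive step only once, and hidden.message is None because hidden is not visible.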

The FrameObject base class defines several convenient methods and attributes:


A tuple containing the names of the public properties.

__bool__() bool#

Delegates to is_visible. The object will only be considered True if it is visible.

__eq__(other) bool#

Two instances of the same FrameObject type are considered equal if the values of all the public properties match, even if the underlying frame is different. All falsey FrameObjects of the same type are equal.

__hash__() int#

Two instances of the same FrameObject type are considered equal if the values of all the public properties match, even if the underlying frame is different. All falsey FrameObjects of the same type are equal.

__init__(frame: Frame | None = None) None#

The default constructor takes an optional frame of video; if the frame is not provided, it will grab a live frame from the device-under-test.

If you override the constructor in your derived class (for example to accept additional parameters), make sure to accept an optional frame parameter and supply it to the super-class’s constructor.

__repr__() str#

The object’s string representation shows all its public properties.

We only print properties we have already calculated, to avoid triggering expensive calculations.

refresh(frame: Frame | None = None, **kwargs) FrameObject#

Returns a new FrameObject instance with a new frame. self is not modified.

refresh is used by navigation functions that modify the state of the device-under-test.

By default refresh returns a new object of the same class as self, but you can override the return type by implementing refresh in your derived class.

Any additional keyword arguments are passed on to __init__.

stbt.frames(timeout_secs: float | None = None) Iterator[Frame]#

Generator that yields video frames captured from the device-under-test.

For example:

for frame in stbt.frames():
    # Do something with each frame here.
    # Remember to add a termination condition to `break` or `return`
    # from the loop, or specify `timeout_secs` — otherwise you'll have
    # an infinite loop!

See also stbt.get_frame.


timeout_secs (int or float or None) – A timeout in seconds. After this timeout the iterator will be exhausted. That is, a for loop like for f in stbt.frames(timeout_secs=10) will terminate after 10 seconds. If timeout_secs is None (the default) then the iterator will yield frames forever, but you can stop iterating (for example with break) at any time.

Return type: Iterator[Frame]



An iterator of frames in OpenCV format (stbt.Frame).

section: str,
key: str,
type_: Callable[[str], T] = str,
) T#
section: str,
key: str,
default: DefaultT,
type_: Callable[[str], T] = str,
) T | DefaultT

Read the value of key from section of the test-pack configuration file.

For example, if your configuration file looks like this:

stbt_version = 30

backend_ip =

then you can read the value from your test script like this:

backend_ip = stbt.get_config("my_company_name", "backend_ip")

This searches in the .stbt.conf file at the root of your test-pack, and in the config/test-farm/<hostname>.conf file matching the hostname of the stb-tester device where the script is running. Values in the host-specific config file override values in .stbt.conf. See Configuration files for more details.

Test scripts can use get_config to read tags that you specify at run-time: see Automatic configuration keys. For example:

my_tag_value = stbt.get_config("result.tags", "my tag name")

Raises ConfigurationError if the specified section or key is not found, unless default is specified (in which case default is returned).

Changed in v32: Allow specifying None as the default value (previously None would be treated as if you hadn’t specified any default value).
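The lookup rules above (host-specific values override the test-pack file, default is returned when the key is missing, and None is a legitimate default) can be sketched with the standard configparser module. Everything here is illustrative: get_config_sketch and the local ConfigurationError class are stand-ins, and the IP addresses and the retries key are made-up sample values.

```python
import configparser

_NO_DEFAULT = object()  # sentinel, so that None can be a real default value


class ConfigurationError(Exception):
    """Local stand-in for the exception type named above."""


def get_config_sketch(parser, section, key, default=_NO_DEFAULT, type_=str):
    """Look up section/key, falling back to default if one was given."""
    try:
        return type_(parser.get(section, key))
    except (configparser.NoSectionError, configparser.NoOptionError):
        if default is _NO_DEFAULT:
            raise ConfigurationError(
                "No such config: [%s] %s" % (section, key))
        return default


parser = configparser.ConfigParser()
# Sources read later override earlier ones, in the same way that the
# host-specific file overrides the test-pack-wide file.
parser.read_string("[my_company_name]\nbackend_ip = 192.0.2.1\nretries = 3\n")
parser.read_string("[my_company_name]\nbackend_ip = 192.0.2.2\n")
```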

stbt.get_frame() Frame#

Grabs a video frame from the device-under-test.

Return type: Frame



The most recent video frame in OpenCV format.

Most Stb-tester APIs (stbt.match, stbt.FrameObject constructors, etc.) will call get_frame if a frame isn’t specified explicitly.

If you call get_frame twice very quickly (faster than the video-capture framerate) you might get the same frame twice. To block until the next frame is available, use stbt.frames.

To save a frame to disk pass it to cv2.imwrite. Note that any file you write to the current working directory will appear as an artifact in the test-run results.

stbt.get_rms_volume(duration_secs=3, stream=None) RmsVolumeResult#

Calculate the average RMS volume of the audio over the given duration.

For example, to check that your mute button works:

'KEY_MUTE')
time.sleep(1)  # <- give it some time to take effect
assert get_rms_volume().amplitude < 0.001  # -60 dB

  • duration_secs (int or float) – The window over which you should average, in seconds. Defaults to 3s in accordance with short-term loudness from the EBU TECH 3341 specification.

  • stream (Iterator[AudioChunk]) – Audio stream to measure. Defaults to audio_chunks().


ZeroDivisionError – If duration_secs is shorter than one sample or stream contains no samples.
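For reference, the relationship between an RMS amplitude and the decibel figure quoted in the example (0.001 corresponds to -60 dB) can be worked through in plain Python. This is a sketch of the arithmetic only, with hypothetical function names; it does not read any audio.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of samples in the range -1.0 to 1.0."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def amplitude_to_decibels(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(amplitude)


square_wave = [0.5, -0.5, 0.5, -0.5]
level = rms(square_wave)                      # 0.5
mute_limit_db = amplitude_to_decibels(0.001)  # approximately -60
```

Calling rms with an empty list divides by zero, which mirrors the ZeroDivisionError case described above.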

Return type: RmsVolumeResult


class stbt.GrayscaleDiff(
min_size: tuple[int, int] | None = None,
threshold: float = 0.84,
erode: bool = True,

Compares 2 frames by converting them to grayscale, calculating pixel-wise absolute differences, and ignoring differences below a threshold.

This was the default diffing algorithm for wait_for_motion and press_and_wait before v34.

class stbt.Grid(
region: Region,
cols: int | None = None,
rows: int | None = None,
data: Sequence[Sequence[Any]] | None = None,

A grid with items arranged left to right, then down.

For example a keyboard, or a grid of posters, arranged like this:


All items must be the same size, and the spacing between them must be consistent.

This class is useful for converting between pixel coordinates on a screen and x & y indexes into the grid.

  • region (Region) – Where the grid is on the screen.

  • cols (int) – Width of the grid, in number of columns.

  • rows (int) – Height of the grid, in number of rows.

  • data – A 2D array (list of lists) containing data to associate with each cell. The data can be of any type. For example, if you are modelling a grid-shaped keyboard, the data could be the letter at each grid position. If data is specified, then cols and rows are optional.

class Cell(index, position, region, data)#

A single cell in a Grid.

Don’t construct Cells directly; create a Grid instead.

  • index (int) – The cell’s 1D index into the grid, starting from 0 at the top left, counting along the top row left to right, then the next row left to right, etc.

  • position (Position) –

    The cell’s 2D index (x, y) into the grid (zero-based). For example in this grid “I” is index 8 and position (x=3, y=1):


  • region (Region) – Pixel coordinates (relative to the entire frame) of the cell’s bounding box.

  • data – The data corresponding to the cell, if data was specified when you created the Grid.

index: int | None = None,
position: PositionT | None = None,
region: Region | None = None,
data: Any = None,
) Cell#

Retrieve a single cell in the Grid.

For example, let’s say that you’re looking for the selected item in a grid by matching a reference image of the selection border. Then you can find the (x, y) position in the grid of the selection, like this:

selection = stbt.match("selection.png")
cell = grid.get(region=selection.region)
position = cell.position

You must specify one (and only one) of index, position, region, or data. For the meaning of these parameters see Grid.Cell.

A negative index counts backwards from the end of the grid (so -1 is the bottom right position).

region doesn’t have to match the cell’s pixel coordinates exactly; instead, this returns the cell that contains the center of the given region.


The Grid.Cell that matches the specified query; raises IndexError if the index/position/region is out of bounds or the data is not found.
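The conversions that Grid performs can be sketched with simple arithmetic. The functions below are hypothetical and assume that the cells tile the region evenly with no gaps; the region is given as an (x, y, width, height) tuple rather than a stbt.Region. A 5-column grid is used because it is consistent with the example above, where index 8 is position (x=3, y=1); the row count of 3 is an arbitrary choice.

```python
def index_to_position(index, cols, rows):
    """1D index -> (x, y); negative indexes count back from the end."""
    count = cols * rows
    if index < 0:
        index += count
    if not 0 <= index < count:
        raise IndexError("index out of bounds")
    y, x = divmod(index, cols)
    return (x, y)


def position_to_index(position, cols):
    """(x, y) -> 1D index, counting left to right, then down."""
    x, y = position
    return y * cols + x


def position_at(point, region, cols, rows):
    """Pixel (x, y) -> grid position of the cell containing that point."""
    left, top, width, height = region
    px, py = point
    if not (left <= px < left + width and top <= py < top + height):
        raise IndexError("point is outside the grid")
    return (int((px - left) * cols / width), int((py - top) * rows / height))


position = index_to_position(8, cols=5, rows=3)
```

To look up a cell by region, as described above, you would pass the center point of that region to position_at.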

class stbt.Image#

An image, possibly loaded from disk.

This is a subclass of numpy.ndarray, which is the type that OpenCV uses to represent images.

In addition to the members inherited from numpy.ndarray, Image defines the following attributes:

  • filename (str or None) – The filename that was given to stbt.load_image.

  • absolute_filename (str or None) – The absolute path resolved by stbt.load_image.

  • relative_filename (str or None) – The path resolved by stbt.load_image, relative to the root of the test-pack git repo.

Added in v32.

frame: Frame | None = None,
mask: Mask | Region | str = Region.ALL,
threshold: int | None = None,
region: Region = Region.ALL,
) _IsScreenBlackResult#

Check for the presence of a black screen in a video frame.

  • frame (Frame) – If this is specified it is used as the video frame to check; otherwise a new frame is grabbed from the device-under-test. This is an image in OpenCV format (for example as returned by frames and get_frame).

  • mask (str|numpy.ndarray|Mask|Region) – A Region or a mask that specifies which parts of the image to analyse. This accepts anything that can be converted to a Mask using stbt.load_mask. See Regions and Masks.

  • threshold (int) – Even when a video frame appears to be black, the intensity of its pixels is not always 0. To differentiate almost-black from non-black pixels, a binary threshold is applied to the frame. The threshold value is in the range 0 (black) to 255 (white). The global default (20) can be changed by setting threshold in the [is_screen_black] section of .stbt.conf.

  • region (Region) – Deprecated synonym for mask. Use mask instead.


An object that will evaluate to true if the frame was black, or false if not black. The object has the following attributes:

  • black (bool) – True if the frame was black.

  • frame (stbt.Frame) – The video frame that was analysed.

Changed in v33: mask accepts anything that can be converted to a Mask using load_mask. The region parameter is deprecated; pass your Region to mask instead. You can’t specify mask and region at the same time.
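To make the role of threshold and mask concrete, here is a minimal pure-Python sketch. It is not stbt's implementation: it assumes the frame has already been converted to grayscale, it represents the frame and mask as nested lists, and it assumes a pixel exactly equal to the threshold still counts as black. The function name is hypothetical.

```python
from typing import List, Optional

Gray = List[List[int]]  # rows of 0-255 grayscale intensities


def looks_black(frame: Gray, threshold: int = 20,
                mask: Optional[List[List[bool]]] = None) -> bool:
    """True if every masked-in pixel is at or below `threshold`."""
    for y, row in enumerate(frame):
        for x, intensity in enumerate(row):
            if mask is not None and not mask[y][x]:
                continue  # masked-out pixels are ignored
            if intensity > threshold:
                return False  # one bright pixel is enough to fail
    return True


almost_black = [[0, 3, 12], [7, 20, 1]]
with_logo = [[0, 3, 200], [7, 20, 1]]
ignore_logo = [[True, True, False], [True, True, True]]

print(looks_black(almost_black))                 # True
print(looks_black(with_logo))                    # False
print(looks_black(with_logo, mask=ignore_logo))  # True
```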

class stbt.Keyboard(*, mask: Mask | Region | str = Region.ALL, navigate_timeout: float = 60)#

Models the behaviour of an on-screen keyboard.

You customize it for the appearance & behaviour of the keyboard you’re testing by specifying two things:

  • A Directed Graph that specifies the navigation between every key on the keyboard. For example: When A is focused, pressing KEY_RIGHT on the remote control goes to B, and so on.

  • A Page Object that tells you which key is currently focused on the screen. See the page parameter to enter_text and navigate_to.

The constructor takes the following parameters:

  • mask (str|numpy.ndarray|Mask|Region) – A mask to use when calling stbt.press_and_wait to determine when the current focus has finished moving. If the search page has a blinking cursor you need to mask out the region where the cursor can appear, as well as any other regions with dynamic content (such as a picture-in-picture with live TV). See stbt.press_and_wait for more details about the mask.

  • navigate_timeout (int or float) – Timeout (in seconds) for navigate_to. In practice navigate_to should only time out if you have a bug in your model or in the real keyboard under test.

For example, let’s model the lowercase keyboard from the YouTube search page on Apple TV:

# 1. Specify the keyboard's navigation model
# ------------------------------------------

kb = stbt.Keyboard()

# The 6x6 grid of letters & numbers:
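kb.add_grid(stbt.Grid(stbt.Region(x=125, y=175, right=425, bottom=475),
                      # Assumed key layout, inferred from the 6x6 size and
                      # the bottom row (5 6 7 8 9 0) referenced below.
                      data=["abcdef",
                            "ghijkl",
                            "mnopqr",
                            "stuvwx",
                            "yz1234",
                            "567890"]))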
# The 3x1 grid of special keys:
kb.add_grid(stbt.Grid(stbt.Region(x=125, y=480, right=425, bottom=520),
                      data=[[" ", "DELETE", "CLEAR"]]))

# The `add_grid` calls (above) defined the transitions within each grid.
# Now we need to specify the transitions from the bottom row of numbers
# to the larger keys below them:
#     5 6 7 8 9 0
#     ↕ ↕ ↕ ↕ ↕ ↕
# Note that `add_transition` adds the symmetrical transition (KEY_UP)
# by default.
kb.add_transition("5", " ", "KEY_DOWN")
kb.add_transition("6", " ", "KEY_DOWN")
kb.add_transition("7", "DELETE", "KEY_DOWN")
kb.add_transition("8", "DELETE", "KEY_DOWN")
kb.add_transition("9", "CLEAR", "KEY_DOWN")
kb.add_transition("0", "CLEAR", "KEY_DOWN")

# 2. A Page Object that describes the appearance of the keyboard
# --------------------------------------------------------------

class SearchKeyboard(stbt.FrameObject):
    """The YouTube search keyboard on Apple TV"""

    @property
    def is_visible(self):
        # Implementation left to the reader. Should return True if the
        # keyboard is visible and focused.
        ...

    @property
    def focus(self):
        """Returns the focused key.

        Used by `Keyboard.enter_text` and `Keyboard.navigate_to`.

        Note: The reference image (focus.png) is carefully cropped
        so that it will match the normal keys as well as the larger
        "SPACE", "DELETE" and "CLEAR" keys. The middle of the image
        (where the key's label appears) is transparent so that it will
        match any key.
        """
        m = stbt.match("focus.png", frame=self._frame)
        if m:
            return kb.find_key(region=m.region)
        else:
            return None

    # Your Page Object can also define methods for your test scripts to
    # use:

    def enter_text(self, text):
        return kb.enter_text(text.lower(), page=self)

    def clear(self):
        page = kb.navigate_to("CLEAR", page=self)
        return page.refresh()

For a detailed tutorial, including an example that handles multiple keyboard modes (lowercase, uppercase, and symbols) see our article Testing on-screen keyboards.

stbt.Keyboard was added in v31.

Changed in v32:

  • Added support for keyboards with different modes (such as uppercase, lowercase, and symbols).

  • Changed the internal representation of the Directed Graph. Manipulating the networkx graph directly is no longer supported.

  • Removed stbt.Keyboard.parse_edgelist and stbt.grid_to_navigation_graph. Instead, first create the Keyboard object, and then use add_key, add_transition, add_edgelist, and add_grid to build the model of the keyboard.

  • Removed the stbt.Keyboard.Selection type. Instead, your Page Object’s focus property should return a Key value obtained from find_key.

Changed in v33:

  • Added class stbt.Keyboard.Key (the type returned from find_key). This used to be a private API, but now it is public so that you can use it in type annotations for your Page Object’s focus property.

  • Tries to recover from missed or double keypresses. To disable this behaviour specify retries=0 when calling enter_text or navigate_to.

  • Increased default navigate_timeout from 20 to 60 seconds.

Changed in v34:

  • The property of the page object should be called focus, not selection (for backward compatibility we still support selection).

class Key(
name: str | None = None,
text: str | None = None,
region: Region | None = None,
mode: str | None = None,
)#

Represents a key on the on-screen keyboard.

This is returned by stbt.Keyboard.find_key. Don’t create instances of this class directly.

It has attributes name, text, region, and mode. See Keyboard.add_key.

add_key(
name: str,
text: str | None = None,
region: Region | None = None,
mode: str | None = None,
) Key#

Add a key to the model (specification) of the keyboard.

  • name (str) – The text or label you can see on the key.

  • text (str) – The text that will be typed if you press OK on the key. If not specified, defaults to name if name is exactly 1 character long, otherwise it defaults to "" (an empty string). An empty string indicates that the key doesn’t type any text when pressed (for example a “caps lock” key to change modes).

  • region (stbt.Region) – The location of this key on the screen. If specified, you can look up a key’s name & text by region using find_key(region=...).

  • mode (str) – The mode that the key belongs to (such as “lowercase”, “uppercase”, “shift”, or “symbols”) if your keyboard supports different modes. Note that the same key, if visible in different modes, needs to be modelled as separate keys (for example (name=" ", mode="lowercase") and (name=" ", mode="uppercase")) because their navigation connections are totally different: pressing up from the former goes to lowercase “c”, but pressing up from the latter goes to uppercase “C”. mode is optional if your keyboard doesn’t have modes, or if you only need to use the default mode.


Returns: The added key (stbt.Keyboard.Key). This is an object that you can use with add_transition.

Raises: ValueError if the key is already present in the model.
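The default for the text parameter described above is small enough to spell out. The following is a sketch of that documented rule only; the function name is hypothetical and it is not part of stbt.

```python
from typing import Optional


def default_key_text(name: str, text: Optional[str] = None) -> str:
    """Apply the documented default for `text` when it isn't specified."""
    if text is not None:
        return text
    # Single-character names type themselves; longer names type nothing.
    return name if len(name) == 1 else ""


print(repr(default_key_text("a")))           # 'a'
print(repr(default_key_text("CLEAR")))       # ''
print(repr(default_key_text("SPACE", " ")))  # ' '
```

So a key labelled "SPACE" needs an explicit text=" " if pressing it should type a space.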

find_key(
name: str | None = None,
text: str | None = None,
region: Region | None = None,
mode: str | None = None,
) Key#

Find a key in the model (specification) of the keyboard.

Specify one or more of name, text, region, and mode (as many as are needed to uniquely identify the key).

For example, your Page Object’s focus property would do some image processing to find the position of the focus, and then use find_key to identify the focused key based on that region.


Returns: A stbt.Keyboard.Key object that unambiguously identifies the key in the model. It has “name”, “text”, “region”, and “mode” attributes. You can use this object as the source or target parameter of add_transition.

Raises: ValueError if the key does not exist in the model, or if it can’t be identified unambiguously (that is, if two or more keys match the given parameters).

find_keys(
name: str | None = None,
text: str | None = None,
region: Region | None = None,
mode: str | None = None,
) list[Key]#

Find matching keys in the model of the keyboard.

This is like find_key, but it returns a list containing any keys that match the given parameters. For example, if there is a space key in both the lowercase and uppercase modes of the keyboard, calling find_keys(text=" ") will return a list of 2 objects [Key(text=" ", mode="lowercase"), Key(text=" ", mode="uppercase")].

This method doesn’t raise an exception; the list will be empty if no keys matched.
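The difference between find_key and find_keys can be illustrated with a small stand-in model. This sketch uses plain dicts in place of stbt.Keyboard.Key, ignores region lookups, and uses hypothetical function names; it only mirrors the documented behaviour (a list that may be empty versus exactly one match or ValueError).

```python
from typing import Dict, List, Optional

Key = Dict[str, Optional[str]]  # hypothetical stand-in for stbt.Keyboard.Key

MODEL: List[Key] = [
    {"name": "a", "text": "a", "mode": "lowercase"},
    {"name": "A", "text": "A", "mode": "uppercase"},
    {"name": " ", "text": " ", "mode": "lowercase"},
    {"name": " ", "text": " ", "mode": "uppercase"},
]


def find_keys_sketch(keys: List[Key], **query: str) -> List[Key]:
    """Return every key whose attributes match all of the query fields."""
    return [k for k in keys if all(k.get(f) == v for f, v in query.items())]


def find_key_sketch(keys: List[Key], **query: str) -> Key:
    """Return the single matching key, or raise ValueError."""
    matches = find_keys_sketch(keys, **query)
    if len(matches) != 1:
        raise ValueError(f"{len(matches)} keys match {query!r}; expected exactly 1")
    return matches[0]


print(len(find_keys_sketch(MODEL, text=" ")))             # 2
print(find_keys_sketch(MODEL, text="z"))                  # []
print(find_key_sketch(MODEL, text=" ", mode="uppercase"))
```

Calling find_key_sketch(MODEL, text=" ") without a mode raises ValueError because two keys match.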

add_transition(
source: Key | dict | str,
target: Key | dict | str,
keypress: str,
mode: str | None = None,
symmetrical: bool = True,
) None#

Add a transition to the model (specification) of the keyboard.

For example: To go from “A” to “B”, press “KEY_RIGHT” on the remote control.

  • source – The starting key. This can be a Key object returned from add_key or find_key; or it can be a dict that contains one or more of “name”, “text”, “region”, and “mode” (as many as are needed to uniquely identify the key using find_key). For convenience, a single string is treated as “name” (but this may not be enough to uniquely identify the key if your keyboard has multiple modes).

  • target – The key you’ll land on after pressing the button on the remote control. This accepts the same types as source.

  • keypress (str) – The name of the key you need to press on the remote control, for example “KEY_RIGHT”.

  • mode (str) –

    Optional keyboard mode that applies to both source and target. For example, the two following calls are the same:

    add_transition("c", " ", "KEY_DOWN", mode="lowercase")
    add_transition({"name": "c", "mode": "lowercase"},
                   {"name": " ", "mode": "lowercase"},
                   "KEY_DOWN")

  • symmetrical (bool) – By default, if the keypress is “KEY_LEFT”, “KEY_RIGHT”, “KEY_UP”, or “KEY_DOWN”, this will automatically add the opposite transition. For example, if you call add_transition("a", "b", "KEY_RIGHT") this will also add the transition ("b", "a", "KEY_LEFT"). Set this parameter to False to disable this behaviour. This parameter has no effect if keypress is not one of the 4 directional keys.


Raises: ValueError if the source or target keys do not exist in the model, or if they can’t be identified unambiguously.
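The effect of the symmetrical parameter can be sketched with a plain set of edges. This is an illustration of the documented rule, not stbt's internal graph representation; the edge tuple layout and the function name are assumptions made for this example.

```python
from typing import Set, Tuple

Edge = Tuple[str, str, str]  # (source, keypress, target)

OPPOSITE = {
    "KEY_LEFT": "KEY_RIGHT",
    "KEY_RIGHT": "KEY_LEFT",
    "KEY_UP": "KEY_DOWN",
    "KEY_DOWN": "KEY_UP",
}


def add_transition_sketch(edges: Set[Edge], source: str, target: str,
                          keypress: str, symmetrical: bool = True) -> None:
    """Record source --keypress--> target, plus the reverse edge if applicable."""
    edges.add((source, keypress, target))
    if symmetrical and keypress in OPPOSITE:
        edges.add((target, OPPOSITE[keypress], source))


edges: Set[Edge] = set()
add_transition_sketch(edges, "a", "b", "KEY_RIGHT")
add_transition_sketch(edges, "caps", "CAPS", "KEY_OK")  # not directional: no reverse edge
add_transition_sketch(edges, "5", " ", "KEY_DOWN", symmetrical=False)
for edge in sorted(edges):
    print(edge)
```

Only the first call adds a reverse edge: the second uses a non-directional keypress and the third disables symmetry explicitly.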

add_edgelist(
edgelist: str,
mode: str | None = None,
symmetrical: bool = True,
) None#

Add keys and transitions specified in a string in “edgelist” format.

  • edgelist (str) –

    A multi-line string where each line is in the format <source_name> <target_name> <keypress>. For example, the specification for a qwerty keyboard might look like this:


    The name “SPACE” will be converted to the space character (" "). This is because space is used as the field separator; otherwise it wouldn’t be possible to specify the space key using this format.

    Lines starting with “###” are ignored (comments).

  • mode (str) – Optional mode that applies to all the keys specified in edgelist. See add_key for more details about modes. It isn’t possible to specify transitions between different modes using this edgelist format; use add_transition for that.

  • symmetrical (bool) – See add_transition.
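The edgelist format described above can be parsed in a few lines. The sketch below is not stbt's parser: the function name and the sample layout are hypothetical, and skipping blank lines is an assumption (the text above only says that lines starting with "###" are ignored).

```python
from typing import List, Tuple


def parse_edgelist_sketch(edgelist: str) -> List[Tuple[str, str, str]]:
    """Parse '<source_name> <target_name> <keypress>' lines into tuples."""
    transitions = []
    for lineno, line in enumerate(edgelist.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("###"):
            continue  # blank lines (an assumption) and ### comments are skipped
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        source, target, keypress = fields
        # "SPACE" stands for the space character in source and target names.
        source = " " if source == "SPACE" else source
        target = " " if target == "SPACE" else target
        transitions.append((source, target, keypress))
    return transitions


EXAMPLE = """
### Hypothetical fragment of a qwerty layout
q w KEY_RIGHT
q a KEY_DOWN
c SPACE KEY_DOWN
"""
for transition in parse_edgelist_sketch(EXAMPLE):
    print(transition)
```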

add_grid(
grid: Grid,
mode: str | None = None,
merge: bool = False,
) Grid#

Add keys, and transitions between them, to the model of the keyboard.

If the keyboard (or part of the keyboard) is arranged in a regular grid, you can use stbt.Grid to easily specify the positions of those keys. This only works if the columns & rows are all of the same size.

If your keyboard has keys outside the grid, you will still need to specify the transitions from the edge of the grid onto the outside keys, using add_transition. See the example above.

  • grid (stbt.Grid) – The grid to model. The data associated with each cell will be used for the key’s “name” attribute (see add_key).

  • mode (str) – Optional mode that applies to all the keys specified in grid. See add_key for more details about modes.

  • merge (bool) – If True, adjacent keys with the same name and mode will be merged, and a single larger key will be added in its place.


A new stbt.Grid where each cell’s data is a key object that can be used with add_transition (for example to define additional transitions from the edges of this grid onto other keys).

enter_text(
text: str,
page: FrameObject,
verify_every_keypress: bool = False,
retries: int = 2,
) FrameObject#

Enter the specified text using the on-screen keyboard.

  • text (str) – The text to enter. If your keyboard only supports a single case then you need to convert the text to uppercase or lowercase, as appropriate, before passing it to this method.

  • page (stbt.FrameObject) –

    An instance of a stbt.FrameObject sub-class that describes the appearance of the on-screen keyboard. It must implement the following:

    • focus (Key) — property that returns a Key object, as returned from find_key.

    When you call enter_text, page must represent the current state of the device-under-test.

  • verify_every_keypress (bool) –

    If True, we will read the focused key after every keypress and assert that it matches the model. If False (the default) we will only verify the focused key corresponding to each of the characters in text. For example: to get from A to D you need to press KEY_RIGHT three times. The default behaviour will only verify that the focused key is D after the third keypress. This is faster, and closer to the way a human uses the on-screen keyboard.

    Set this to True to help debug your model if enter_text is behaving incorrectly.

  • retries (int) – Number of recovery attempts if a keypress doesn’t have the expected effect according to the model. Allows recovering from missed keypresses and double keypresses.


A new FrameObject instance of the same type as page, reflecting the device-under-test’s new state after the keyboard navigation completed.

Typically your FrameObject will provide its own enter_text method, so your test scripts won’t call this Keyboard class directly. See the example above.

navigate_to(
target: Key | dict | str,
page: FrameObject,
verify_every_keypress: bool = False,
retries: int = 2,
) FrameObject#

Move the focus to the specified key.

This won’t press KEY_OK on the target; it only moves the focus there.

  • target – This can be a Key object returned from find_key, or it can be a dict that contains one or more of “name”, “text”, “region”, and “mode” (as many as are needed to identify the key using find_keys). If more than one key matches the given parameters, navigate_to will navigate to the closest one. For convenience, a single string is treated as “name”.

  • page (stbt.FrameObject) – See enter_text.

  • verify_every_keypress (bool) – See enter_text.

  • retries (int) – See enter_text.


A new FrameObject instance of the same type as page, reflecting the device-under-test’s new state after the keyboard navigation completed.
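One way to picture navigation over the keyboard model is a breadth-first search through the directed graph of transitions. The sketch below is an assumption about the general technique, not a description of stbt's actual path-finding: it interprets "closest" as "fewest keypresses", stores the graph as a plain dict, and uses a hypothetical function name.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional

Graph = Dict[str, Dict[str, str]]  # source -> {keypress: target}


def shortest_keypresses(graph: Graph, start: str,
                        targets: Iterable[str]) -> Optional[List[str]]:
    """Breadth-first search for the fewest keypresses from start to any target."""
    wanted = set(targets)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        key, path = queue.popleft()
        if key in wanted:
            return path
        for keypress, neighbour in graph.get(key, {}).items():
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [keypress]))
    return None  # no target is reachable from start


ROW: Graph = {
    "a": {"KEY_RIGHT": "b"},
    "b": {"KEY_LEFT": "a", "KEY_RIGHT": "c"},
    "c": {"KEY_LEFT": "b", "KEY_RIGHT": "d"},
    "d": {"KEY_LEFT": "c"},
}
print(shortest_keypresses(ROW, "a", ["d"]))       # ['KEY_RIGHT', 'KEY_RIGHT', 'KEY_RIGHT']
print(shortest_keypresses(ROW, "c", ["a", "d"]))  # ['KEY_RIGHT']
```

The first call reproduces the A-to-D example from enter_text: three presses of KEY_RIGHT. The second shows that, with several candidate targets, the nearer one wins.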

class stbt.Keypress#

Information about a keypress sent with

key: str#

The name of the key that was pressed.

start_time: float#

The time just before the keypress started (in seconds since the unix epoch, like time.time() and stbt.Frame.time).

end_time: float#

The time when transmission of the keypress signal completed.

frame_before: stbt.Frame#

The most recent video-frame just before the keypress started. Typically this is used by functions like stbt.press_and_wait to detect when the device-under-test reacted to the keypress.

stbt.last_keypress() Keypress | None#

Returns information about the last key-press sent to the device under test.

See the return type of

Added in v32.

class stbt.Learning#

An enumeration.

NONE = 0#

Don’t learn the menu structure for future calls to navigate_1d or navigate_grid.


Learn the menu structure to speed up future calls to navigate_1d or navigate_grid during this test-run (that is, during the lifetime of the Python process).


Learn the menu structure and save it to a persistent cache on the Stb-tester Node so that it can be used to speed up future calls to navigate_1d or navigate_grid when running tests on the same Stb-tester Node.

stbt.load_image(filename: Image | str) Image#
stbt.load_image(
filename: Image | str,
flags: int,
) Image
stbt.load_image(
filename: Image | str,
color_channels: int | tuple[int, ...],
) Image

Find & read an image from disk.

If given a relative filename, this will search in the directory of the Python file that called load_image, then in the directory of that file’s caller, and so on, until it finds the file. This allows you to use load_image in a helper function that takes a filename from its caller.

Finally this will search in the current working directory. This allows loading an image that you had previously saved to disk during the same test run.

This is the same search algorithm used by stbt.match and similar functions.
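The search order can be sketched as follows. This is a simplified illustration, not stbt's code: the real function walks the Python call stack to discover the callers' directories, whereas this sketch takes them as an explicit list (nearest caller first). The function name and parameters are hypothetical.

```python
import os
import tempfile
from typing import Iterable, Optional


def find_file_sketch(filename: str, caller_dirs: Iterable[str],
                     cwd: str) -> Optional[str]:
    """Return the first existing path for `filename`, or None."""
    if os.path.isabs(filename):
        return filename if os.path.isfile(filename) else None
    for directory in list(caller_dirs) + [cwd]:  # callers first, cwd last
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as root:
    helper_dir = os.path.join(root, "helpers")
    test_dir = os.path.join(root, "tests")
    os.makedirs(helper_dir)
    os.makedirs(test_dir)
    image = os.path.join(test_dir, "button.png")
    open(image, "wb").close()  # empty placeholder file
    found = find_file_sketch("button.png", [helper_dir, test_dir], cwd=root)
    print(found == image)  # True: found next to the helper's caller
```

Here the helper's own directory doesn't contain the file, so the search continues to the directory of the helper's caller, which is the case that lets a helper function accept a filename from its caller.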

  • filename (str) – A relative or absolute filename.

  • flags – Flags to pass to cv2.imread. Deprecated; use color_channels instead.

  • color_channels (tuple[int]) –

    Tuple of acceptable numbers of color channels for the output image: 1 for grayscale, 3 for color, and 4 for color with an alpha (transparency) channel. For example, color_channels=(3, 4) will accept color images with or without an alpha channel. Defaults to (3, 4).

    If the image doesn’t match the specified color_channels it will be converted to the specified format.

Return type:

stbt.Image

Returns: An image in OpenCV format, that is, a numpy.ndarray of 8-bit values. With the default color_channels parameter this will be 3 channels BGR, or 4 channels BGRA if the file has transparent pixels.

Raises: IOError if the specified path doesn’t exist or isn’t a valid image file.

  • Changed in v32: Return type is now stbt.Image, which is a numpy.ndarray sub-class with additional attributes filename, relative_filename and absolute_filename.

  • Changed in v32: Allows passing an image (numpy.ndarray or stbt.Image) instead of a string, in which case this function returns the given image.

  • Changed in v33: Added the color_channels parameter and deprecated flags. The image will always be converted to the format specified by color_channels (previously it was only converted to the format specified by flags if it was given as a filename, not as a stbt.Image or numpy array). The returned numpy array is read-only.

stbt.load_mask(mask: Mask | Region | str) Mask#

Used to load a mask from disk, or to create a mask from a Region.

A mask is a black & white image (the same size as the video-frame) that specifies which parts of the frame to process: White pixels select the area to process, black pixels the area to ignore.

In most cases you don’t need to call load_mask directly; Stb-tester’s image-processing functions such as is_screen_black, press_and_wait, and wait_for_motion will call load_mask with their mask parameter. This function is a public API so that you can use it if you are implementing your own image-processing functions.

Note that you can pass a Region directly to the mask parameter of stbt functions, and you can create more complex masks by adding, subtracting, or inverting Regions (see Regions and Masks).


mask (str|Region) –

A relative or absolute filename of a mask PNG image. If given a relative filename, this uses the algorithm from load_image to find the file.

Or, a Region that specifies the area to process.


A mask as used by is_screen_black, press_and_wait, wait_for_motion, and similar image-processing functions.

Added in v33.

class stbt.Mask#

Internal representation of a mask.

Most users will never need to use this type directly; instead, pass a filename or a Region to the mask parameter of APIs like stbt.wait_for_motion. See Regions and Masks.

static from_alpha_channel(image: Image | str) Mask#

Create a mask from the alpha channel of an image.


image (string or numpy.ndarray) –

An image with an alpha (transparency) channel. This can be the filename of a png file on disk, or an image previously loaded with stbt.load_image.

Filenames should be relative paths. See stbt.load_image for the path lookup algorithm.

region: Region,
color_channels: int = 1,
) tuple[ndarray | None, Region]#

Materialize the mask to a numpy array of the specified size.

Most users will never need to call this method; it’s for people who are implementing their own image-processing algorithms.

  • region (stbt.Region) – A Region matching the size of the frame that you are processing.

  • color_channels (int) – The number of channels required (1 or 3), according to your image-processing algorithm’s needs. All channels will be identical — for example with 3 channels, pixels will be either [0, 0, 0] or [255, 255, 255].

Return type:

tuple[numpy.ndarray | None, Region]


A tuple of:

  • An image (numpy array), where masked-in pixels are white (255) and masked-out pixels are black (0). The array is the same size as the region in the second member of this tuple.

  • A bounding box (stbt.Region) around the masked-in area. If most of the frame is masked out, limiting your image-processing operations to this region will be faster.

If the mask is just a Region, the first member of the tuple (the image) will be None because the bounding-box is sufficient.
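The bounding box in the second member of the tuple can be pictured with a small pure-Python sketch. It is not stbt's implementation: the mask is a nested list instead of a numpy array, the box is a hypothetical (x, y, right, bottom) tuple, and treating right and bottom as exclusive is an assumption.

```python
from typing import List, Optional, Tuple


def bounding_box(mask: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x, y, right, bottom) around the white (255) pixels, or None."""
    xs = [x for row in mask for x, value in enumerate(row) if value == 255]
    ys = [y for y, row in enumerate(mask) if 255 in row]
    if not xs:
        return None  # everything is masked out
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


MASK = [
    [0,   0,   0,   0, 0],
    [0, 255, 255,   0, 0],
    [0,   0, 255, 255, 0],
    [0,   0,   0,   0, 0],
]
print(bounding_box(MASK))  # (1, 1, 4, 3)
```

An image-processing function that only looks inside this box touches 6 pixels instead of 20, which is the speed-up that the bounding box is meant to enable.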

stbt.match(
image: Image | str,
frame: Frame | None = None,
match_parameters: MatchParameters | None = None,
region: Region = Region.ALL,
) MatchResult#

Search for an image in a single video frame.

  • image (string or numpy.ndarray) –

    The image to search for. It can be the filename of a png file on disk, or a numpy array containing the pixel data in 8-bit BGR format. If the image has an alpha channel, any transparent pixels are ignored.

    Filenames should be relative paths. See stbt.load_image for the path lookup algorithm.

    8-bit BGR numpy arrays are the same format that OpenCV uses for images. This allows generating reference images on the fly (possibly using OpenCV) or searching for images captured from the device-under-test earlier in the test script.

  • frame (stbt.Frame or numpy.ndarray) – If this is specified it is used as the video frame to search in; otherwise a new frame is grabbed from the device-under-test. This is an image in OpenCV format (for example as returned by frames and get_frame).

  • match_parameters (MatchParameters) – Customise the image matching algorithm. See MatchParameters for details.

  • region (Region) – Only search within the specified region of the video frame.


A MatchResult, which will evaluate to true if a match was found, false otherwise.

stbt.match_all(
image: Image | str,
frame: Frame | None = None,
match_parameters: MatchParameters | None = None,
region: Region = Region.ALL,
) Iterator[MatchResult]#

Search for all instances of an image in a single video frame.

Arguments are the same as match.


An iterator of zero or more MatchResult objects (one for each position in the frame where image matches).


all_buttons = list(stbt.match_all("button.png"))
for match_result in stbt.match_all("button.png"):
    # do something with match_result here
    ...

stbt.match_text(
text: str,
frame: Frame | None = None,
region: Region = Region.ALL,
lang: str | None = None,
tesseract_config: dict[str, bool | str | int] | None = None,
case_sensitive: bool = False,
upsample: bool | None = None,
text_color: Color | None = None,
text_color_threshold: float | None = None,
engine: OcrEngine | None = None,
char_whitelist: str | None = None,
) TextMatchResult#

Search for the specified text in a single video frame.

This can be used as an alternative to match, searching for text instead of an image.

  • text (str) – The text to search for.

  • frame – See ocr.

  • region – See ocr.

  • mode – See ocr.

  • lang – See ocr.

  • tesseract_config – See ocr.

  • upsample – See ocr.

  • text_color – See ocr.

  • text_color_threshold – See ocr.

  • engine – See ocr.

  • char_whitelist – See ocr.

  • case_sensitive (bool) – Ignore case if False (the default).


A TextMatchResult, which will evaluate to true if the text was found, false otherwise.

For example, to select a button in a vertical menu by name (in this case “TV Guide”):

m = stbt.match_text("TV Guide")
assert m.match
while not stbt.match('selected-button.png').region.contains(m.region):
    stbt.press('KEY_DOWN')

Added in v31: The char_whitelist parameter.

class stbt.MatchMethod#

An enum. See MatchParameters for documentation of these values.

SQDIFF = 'sqdiff'#
SQDIFF_NORMED = 'sqdiff-normed'#
CCORR_NORMED = 'ccorr-normed'#
CCOEFF_NORMED = 'ccoeff-normed'#
class stbt.MatchParameters(
match_method: MatchMethod | None = None,
match_threshold: float | None = None,
confirm_method: ConfirmMethod | None = None,
confirm_threshold: float | None = None,
erode_passes: int | None = None,
)#

Parameters to customise the image processing algorithm used by match, wait_for_match, and press_until_match.

You can change the default values for these parameters by setting a key (with the same name as the corresponding python parameter) in the [match] section of .stbt.conf. But we strongly recommend that you don’t change the default values from what is documented here.

You should only need to change these parameters when you’re trying to match a reference image that isn’t actually a perfect match – for example if there’s a translucent background with live TV visible behind it; or if you have a reference image of a button’s background and you want it to match even if the text on the button doesn’t match.

  • match_method (MatchMethod) – The method to be used by the first pass of stb-tester’s image matching algorithm, to find the most likely location of the reference image within the larger source image. For details see OpenCV’s cv2.matchTemplate. Defaults to MatchMethod.SQDIFF.

  • match_threshold (float) – Overall similarity threshold for the image to be considered a match. This threshold applies to the average similarity across all pixels in the image. Valid values range from 0 (anything is considered to match) to 1 (the match has to be pixel perfect). Defaults to 0.98.

  • confirm_method (ConfirmMethod) –

    The method to be used by the second pass of stb-tester’s image matching algorithm, to confirm that the region identified by the first pass is a good match.

    The first pass often gives false positives: It can report a “match” for an image with obvious differences, if the differences are local to a small part of the image. The second pass is more CPU-intensive, but it only checks the position of the image that the first pass identified. The allowed values are:

    ConfirmMethod.NONE

    Do not confirm the match. This is useful if you know that the reference image is different in some of the pixels. For example to find a button, even if the text inside the button is different.

    ConfirmMethod.ABSDIFF

    Compare the absolute difference of each pixel from the reference image against its counterpart from the candidate region in the source video frame.

    ConfirmMethod.NORMED_ABSDIFF

    Normalise the pixel values from both the reference image and the candidate region in the source video frame, then compare the absolute difference as with ABSDIFF.

    This method is better at noticing differences in low-contrast images (compared to the ABSDIFF method), but it isn’t suitable for reference images that don’t have any structure (that is, images that are a single solid color without any lines or variation).

    This is the default method, with a default confirm_threshold of 0.70.

  • confirm_threshold (float) –

    The minimum allowed similarity between any given pixel in the reference image and the corresponding pixel in the source video frame, as a fraction of the pixel’s total luminance range.

    Unlike match_threshold, this threshold applies to each pixel individually: any pixel whose similarity falls below this threshold will cause the match to fail (but see erode_passes below).

    Valid values range from 0 (less strict) to 1.0 (more strict). Useful values tend to be around 0.84 for ABSDIFF, and 0.70 for NORMED_ABSDIFF. Defaults to 0.70.

  • erode_passes (int) – After the ABSDIFF or NORMED_ABSDIFF absolute difference is taken, stb-tester runs an erosion algorithm that removes single-pixel differences to account for noise and slight rendering differences. Useful values are 1 (the default) and 0 (to disable this step).
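The reason for having two passes can be shown with a toy example. The sketch below is a deliberate simplification and not stbt's algorithm: the first pass here uses a mean absolute difference as a stand-in for cv2.matchTemplate, the second pass is a plain per-pixel ABSDIFF check, images are nested lists of grayscale values, and there is no erosion step (as with erode_passes=0). The function names are hypothetical.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale intensities


def first_pass_similarity(reference: Gray, candidate: Gray) -> float:
    """Average similarity across all pixels, from 0 to 1 (a stand-in metric)."""
    diffs = [abs(r - c)
             for ref_row, cand_row in zip(reference, candidate)
             for r, c in zip(ref_row, cand_row)]
    return 1 - (sum(diffs) / len(diffs)) / 255


def confirm_absdiff(reference: Gray, candidate: Gray,
                    confirm_threshold: float = 0.84) -> bool:
    """Per-pixel check: fail if any pixel differs by more than the threshold allows."""
    max_difference = (1 - confirm_threshold) * 255
    return all(abs(r - c) <= max_difference
               for ref_row, cand_row in zip(reference, candidate)
               for r, c in zip(ref_row, cand_row))


reference = [[0] * 10 for _ in range(10)]
candidate = [row[:] for row in reference]
candidate[4][4] = 255  # one completely different pixel out of 100

print(first_pass_similarity(reference, candidate) >= 0.98)  # True: average still high
print(confirm_absdiff(reference, candidate))                # False: that pixel fails
```

The averaged score hides a local difference that the per-pixel check catches. With the default erode_passes=1, a single-pixel difference like this one would be treated as noise and removed, so a real mismatch needs to cover more than one pixel.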

class stbt.MatchResult#

The result from match.

  • time (float) – The time at which the video-frame was captured, in seconds since 1970-01-01T00:00Z. This timestamp can be compared with system time (time.time()).

  • match (bool) – True if a match was found. This is the same as evaluating MatchResult as a bool. That is, if result: will behave the same as if result.match:.

  • region (Region) – Coordinates where the image was found (or of the nearest match, if no match was found).

  • first_pass_result (float) – Value between 0 (poor) and 1.0 (excellent match) from the first pass of stb-tester’s image matching algorithm (see MatchParameters for details).

  • frame (Frame) – The video frame that was searched, as given to match.

  • image (Image) – The reference image that was searched for, as given to match.

Changed in v32: The type of the image attribute is now stbt.Image. Previously it was a string or a numpy array.

exception stbt.MatchTimeout#

Bases: stbt.UITestFailure

Exception raised by wait_for_match.

  • screenshot (Frame) – The last video frame that wait_for_match checked before timing out.

  • expected (str) – Filename of the image that was being searched for.

  • timeout_secs (int or float) – Number of seconds that the image was searched for.

class stbt.MotionResult#

The result from detect_motion and wait_for_motion.

  • time (float) – The time at which the video-frame was captured, in seconds since 1970-01-01T00:00Z. This timestamp can be compared with system time (time.time()).

  • motion (bool) – True if motion was found. This is the same as evaluating MotionResult as a bool. That is, if result: will behave the same as if result.motion:.

  • region (Region) – Bounding box where the motion was found, or None if no motion was found.

  • frame (Frame) – The video frame in which motion was (or wasn’t) found.

exception stbt.MotionTimeout#

Bases: stbt.UITestFailure

Exception raised by wait_for_motion.

  • screenshot (Frame) – The last video frame that wait_for_motion checked before timing out.

  • mask (Mask or None) – The mask that was used, if any.

  • timeout_secs (int or float) – Number of seconds that motion was searched for.

class stbt.MultiPress(
key_mapping: dict[str, str] | None = None,
interpress_delay_secs: float | None = None,
interletter_delay_secs: float = 1,
)#

Helper for entering text using multi-press on a numeric keypad.

In some apps, the search page allows entering text by pressing the keys on the remote control’s numeric keypad: press the number “2” once for “A”, twice for “B”, etc.:

1.,     ABC2    DEF3
GHI4    JKL5    MNO6
PQRS7   TUV8    WXYZ9
      [space]0

To enter text with this mechanism, create an instance of this class and call its enter_text method. For example:

multipress = stbt.MultiPress()
multipress.enter_text("hello")  # example text

The constructor takes the following parameters:

  • key_mapping (dict) –

    The mapping from number keys to letters. The default mapping is:

        {
            "KEY_0": " 0",
            "KEY_1": "1.,",
            "KEY_2": "abc2",
            "KEY_3": "def3",
            "KEY_4": "ghi4",
            "KEY_5": "jkl5",
            "KEY_6": "mno6",
            "KEY_7": "pqrs7",
            "KEY_8": "tuv8",
            "KEY_9": "wxyz9",
        }

    This matches the arrangement of the letters A-Z on the digit keys from ITU E.161 / ISO 9995-8.

    The value you pass in this parameter is merged with the default mapping. For example to override the punctuation characters you can specify key_mapping={"KEY_1": "@1.,-_"}.

    The dict’s key names must match the remote-control key names accepted by stbt.press. The dict’s values are a string or sequence of the corresponding letters, in the order that they are entered when pressing that key.

  • interpress_delay_secs (float) – The time to wait between every key-press, in seconds. This defaults to 0.3, the same default as stbt.press.

  • interletter_delay_secs (float) – The time to wait between letters on the same key, in seconds. For example, to enter “AB” you need to press key “2” once, then wait, then press it again twice. If you don’t wait, the device-under-test will see three consecutive keypresses, which it interprets as the letter “C”.

enter_text(text: str) None#

Enter the specified text using multi-press on the numeric keypad.


text (str) – The text to enter. The case doesn’t matter (uppercase and lowercase are treated the same).
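
The mapping described above can be sketched in plain Python without a device. This is only an illustration of how a string translates into key-presses under the default mapping; `multipress_plan` and `DEFAULT_MAPPING` are hypothetical names, not part of the stbt API, and the real `enter_text` also handles the delays between presses.

```python
# Default mapping copied from the key_mapping documentation above.
DEFAULT_MAPPING = {
    "KEY_0": " 0", "KEY_1": "1.,", "KEY_2": "abc2", "KEY_3": "def3",
    "KEY_4": "ghi4", "KEY_5": "jkl5", "KEY_6": "mno6", "KEY_7": "pqrs7",
    "KEY_8": "tuv8", "KEY_9": "wxyz9",
}


def multipress_plan(text, key_mapping=None):
    """Return a list of (key, number_of_presses) pairs for each character.

    Hypothetical helper: it merges key_mapping over the defaults, as the
    documentation describes, and treats upper and lower case the same.
    """
    mapping = {**DEFAULT_MAPPING, **(key_mapping or {})}
    lookup = {}
    for key, letters in mapping.items():
        for presses, letter in enumerate(letters, start=1):
            # Keep the first key that offers this character.
            lookup.setdefault(letter, (key, presses))
    return [lookup[ch] for ch in text.lower()]


print(multipress_plan("AB"))
print(multipress_plan("@", key_mapping={"KEY_1": "@1.,-_"}))
```

Because “A” and “B” share the key “2”, the plan contains two consecutive entries for `KEY_2`; that is the case where `interletter_delay_secs` matters.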

stbt.navigate_1d(
target: str,
page: FrameObject,
direction: Direction,
mask: Mask | Region | str = Region.ALL,
timeout_secs: float = 120,
get_focus: Callable[[FrameObject], str] | None = None,
eq: Callable[[str, str], bool] | None = None,
press_and_wait: Callable[[str], Transition | TransitionStatus] | None = None,
recover: str | Callable[[NavigationState], FrameObject] | None = None,
retry_missed_keypresses: bool = False,
learning: Learning = Learning.PERSISTENT,
id: str | None = None,
) FrameObject#

Navigate through a 1-dimensional menu, searching for a target.

This doesn’t press KEY_OK on the target; it only moves the focus there.

navigate_1d takes a page parameter, which is a Page Object that describes the current state of the menu. The Page Object must have a property called focus. navigate_1d will search the menu until the current focus (according to the Page Object) matches target.

Typically this function is used from a Page Object’s navigate method; don’t use this directly in your test scripts. For example:

class Menu(stbt.FrameObject):
    @property
    def is_visible(self):
        ...  # implementation not shown

    @property
    def focus(self):
        '''Read the currently focused menu item.'''
        return stbt.ocr(frame=self._frame, region=...)

    def navigate_to(self, target):
        # direction is a required argument; VERTICAL is assumed here
        return stbt.navigate_1d(target, page=self,
                                direction=stbt.Direction.VERTICAL)

  • target – Navigate to this menu item. Typically a string.

  • page – An instance of a stbt.FrameObject sub-class that represents the current state of the menu. We will use this object’s focus property to determine which item is currently focused. After we have pressed up, down, left, or right, we will call this object’s refresh method to read the new state from the screen.

  • direction – Direction to move in: Either Direction.VERTICAL (for navigation with “KEY_DOWN” and “KEY_UP”) or Direction.HORIZONTAL (for navigation with “KEY_RIGHT” and “KEY_LEFT”).

  • mask – A mask to use when calling stbt.press_and_wait to determine when the current focus has finished moving. Usually this will be a Region specifying where the menu is on the screen, but it can also be a more complex mask.

  • timeout_secs – Raise NavigationFailed if we still haven’t found the target after this many seconds.

  • get_focus – By default, navigate_1d will use the page object’s focus property to determine which item is currently focused. You can override this behaviour by providing a function that takes a single argument (the page instance) and returns the focused menu item.

  • eq – A function that compares two strings (the current focus and the target), and returns True if they match. The default is stbt.ocr_eq, which ignores common OCR errors.

  • press_and_wait

    navigate_1d will call stbt.press_and_wait with the appropriate key (“KEY_DOWN”, “KEY_UP”, “KEY_LEFT”, or “KEY_RIGHT”) to move the focus and wait until it has finished moving. If you need to customise the parameters to press_and_wait, you can pass in your own function here: It should accept a key name and call the real stbt.press_and_wait with your extra parameters. For example:

    def my_press_and_wait(key):
        return stbt.press_and_wait(
            key, mask=stbt.Region(...), timeout_secs=10)
    return stbt.navigate_1d(
        target, page=self, direction=stbt.Direction.HORIZONTAL,
        press_and_wait=my_press_and_wait)

    In many cases a lambda (anonymous function) is more convenient:

    return stbt.navigate_1d(
        target, page=self, direction=stbt.Direction.HORIZONTAL,
        press_and_wait=lambda key: stbt.press_and_wait(
            key, mask=stbt.Region(...), timeout_secs=10))

  • recover

    A function that will recover if navigation “falls off” the edge of the menu onto a different page.

    “Falling off” means that page.is_visible is False after navigating up, down, left or right.

    By default, navigate_1d will try to recover by pressing the opposite key (for example, “KEY_LEFT” if it had fallen off after pressing “KEY_RIGHT”), and if that fails, it will try “KEY_BACK”. To use a different recovery strategy, specify your own function here.

    The function should take a single argument of type stbt.NavigationState. It should do the appropriate actions to get back onto the menu that we were navigating, and then return a new Page Object that represents the new state of the menu. For example:

    def my_recover(state: stbt.NavigationState):
        assert stbt.press_and_wait("KEY_BACK")
        return Menu()
    stbt.navigate_1d(..., recover=my_recover)

    If necessary, you can check the function’s parameter to see the previous Page Object (before falling off) and the last key that was pressed.

    Your recovery function doesn’t have to get back to the same menu item that was focused before falling off; it just needs to get back onto the menu. For example, if your Page Object has a staticmethod called open that will open the menu from any state, you can use that to recover, like this:

    stbt.navigate_1d(..., recover=lambda _: Menu.open())

    The recover parameter can also be a string, which is the name of a key for stbt.press. In that case navigate_1d will attempt to recover by pressing that key once.

  • learning

    We can remember the items we discovered during navigation so that the next time you call navigate_1d we can use that information to navigate faster. To disable this behaviour, pass learning=stbt.Learning.NONE. See stbt.Learning.

    If the actual menu of the device-under-test doesn’t match the learned menu structure, this function will still work — it will fall back to the searching behaviour.

  • id – A unique identifier for this menu. You only need to specify this if learning is enabled and you are using the same FrameObject class to recognize many different menus and sub-menus.


A new FrameObject instance of the same type as page, reflecting the device-under-test’s new state after the navigation completed.


NavigationFailed – If we can’t find the target.

Added in v34.
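
To make the contract between navigate_1d and the Page Object more concrete, here is a toy model in plain Python. It is a simplified sketch, not the library's search algorithm: `navigate_sketch`, `FakeMenu` and `NavigationFailedSketch` are hypothetical names, the search only moves downwards, and a press budget stands in for `timeout_secs`.

```python
class NavigationFailedSketch(AssertionError):
    """Stand-in for stbt.NavigationFailed, which also derives from AssertionError."""


def navigate_sketch(target, get_focus, press, eq=lambda a, b: a == b,
                    max_presses=20):
    """Press KEY_DOWN until eq(current focus, target) is true."""
    seen = []
    for _ in range(max_presses):
        focus = get_focus()
        if eq(focus, target):
            return focus
        seen.append(focus)
        press("KEY_DOWN")
    raise NavigationFailedSketch(f"{target!r} not found; saw {seen}")


class FakeMenu:
    """Simulated device: a vertical menu that stops at the last item."""

    def __init__(self, items):
        self.items = items
        self.index = 0

    def focus(self):
        return self.items[self.index]

    def press(self, key):
        if key == "KEY_DOWN":
            self.index = min(self.index + 1, len(self.items) - 1)


menu = FakeMenu(["Home", "Search", "Settings"])
print(navigate_sketch("Settings", menu.focus, menu.press))
```

The `get_focus` and `eq` arguments play the same role as the parameters of the same name documented above: one reads the focused item, the other decides whether it matches the target.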

stbt.navigate_grid(
target: str,
page: FrameObject,
mask: Mask | Region | str = Region.ALL,
timeout_secs: float = 300,
get_focus: Callable[[FrameObject], str] | None = None,
eq: Callable[[str, str], bool] | None = None,
press_and_wait: Callable[[str], Transition | TransitionStatus] | None = None,
recover: str | Callable[[NavigationState], FrameObject] | None = None,
retry_missed_keypresses: bool = False,
learning: Learning = Learning.PERSISTENT,
id: str | None = None,
) FrameObject#

Navigate through a 2-dimensional grid, searching for a target.

This doesn’t press KEY_OK on the target; it only moves the focus there.

See navigate_1d for documentation on all the parameters.

Added in v34.

exception stbt.NavigationFailed#

Bases: AssertionError

Raised by navigate_1d or navigate_grid if it couldn’t find the requested target.

class stbt.NavigationState#

State of the device-under-test during navigation.

Passed to the recover parameter of navigate_1d and navigate_grid.

frame: Frame#

Latest frame of video from the device-under-test — that is, its current state.

last_page: FrameObject#

The last Page Object seen when the page was still visible (before navigation “fell off” the edge of the menu onto a different page).

last_key: str#

The last key pressed during navigation, that caused it to “fall off” the edge.

stbt.ocr(
frame: Frame | None = None,
region: Region = Region.ALL,
lang: str | None = None,
tesseract_config: dict[str, bool | str | int] | None = None,
tesseract_user_words: list[str] | str | None = None,
tesseract_user_patterns: list[str] | str | None = None,
upsample: bool | None = None,
text_color: Color | None = None,
text_color_threshold: float | None = None,
engine: OcrEngine | None = None,
char_whitelist: str | None = None,
corrections: dict[Pattern | str, str] | None = None,
) str#

Return the text present in the video frame as a Unicode string.

Perform OCR (Optical Character Recognition) using the “Tesseract” open-source OCR engine.

  • frame (Frame) – If this is specified it is used as the video frame to process; otherwise a new frame is grabbed from the device-under-test.

  • region (Region) – Only search within the specified region of the video frame.

  • mode (OcrMode) – Tesseract’s layout analysis mode.

  • lang (str) – The three-letter ISO-639-3 language code of the language you are attempting to read; for example “eng” for English or “deu” for German. More than one language can be specified by joining with ‘+’; for example “eng+deu” means that the text to be read may be in a mixture of English and German. This defaults to “eng” (English). You can override the global default value by setting lang in the [ocr] section of .stbt.conf. You may need to install the tesseract language pack; see installation instructions here.

  • tesseract_config (dict) – Allows passing configuration down to the underlying OCR engine. See the tesseract documentation for details.

  • tesseract_user_words (unicode string, or list of unicode strings) – List of words to be added to the tesseract dictionary. To replace the tesseract system dictionary altogether, also set tesseract_config={'load_system_dawg': False, 'load_freq_dawg': False}.

  • tesseract_user_patterns (unicode string, or list of unicode strings) –

    List of patterns to add to the tesseract dictionary. The tesseract pattern language corresponds roughly to the following regular expressions:

    tesseract  regex
    =========  ===========
    \c         [a-zA-Z]
    \d         [0-9]
    \n         [a-zA-Z0-9]
    \p         [:punct:]
    \a         [a-z]
    \A         [A-Z]
    \*         *

  • upsample (bool) – Upsample the image 3x before passing it to tesseract. This helps to preserve information in the text’s anti-aliasing that would otherwise be lost when tesseract binarises the image. This defaults to True; you can override the global default value by setting upsample=False in the [ocr] section of .stbt.conf. You should set this to False if the text is already quite large, or if you are doing your own binarisation (pre-processing the image to make it black and white).

  • text_color (Color) – Color of the text. Specifying this can improve OCR results when tesseract’s default thresholding algorithm doesn’t detect the text, for example white text on a light-colored background or text on a translucent overlay with dynamic content underneath.

  • text_color_threshold (int) – The threshold to use with text_color, between 0 and 255. Defaults to 25. You can override the global default value by setting text_color_threshold in the [ocr] section of .stbt.conf.

  • engine (OcrEngine) – The OCR engine to use. Defaults to OcrEngine.TESSERACT. You can override the global default value by setting engine in the [ocr] section of .stbt.conf.

  • char_whitelist (str) – String of characters that are allowed. Useful when you know that the text is only going to contain numbers or IP addresses, for example, so that tesseract won’t mistake a zero for the letter “o”. Note that Tesseract 4.0’s LSTM engine ignores char_whitelist.

  • corrections (dict) –

    Dictionary of corrections to replace known OCR mis-reads. Each key of the dict is the text to search for; the value is the corrected string to replace the matching key. If the key is a string, it is treated as plain text and it will only match at word boundaries (for example the string "he saw" won’t match "the saw" nor "he saws"). If the key is a regular expression pattern (created with re.compile) it can match anywhere, and the replacement string can contain backreferences such as "\1" which are replaced with the corresponding group in the pattern (same as Python’s re.sub). Example:

    corrections={'bad': 'good',
                 re.compile(r'[oO]'): '0'}

    Plain strings are replaced first (in the order they are specified), followed by regular expressions (in the order they are specified).

    The default value for this parameter can be set with stbt.set_global_ocr_corrections. If global corrections have been set and this corrections parameter is specified, the corrections in this parameter are applied first.

Added in v31: The char_whitelist parameter.
Added in v32: The corrections parameter.
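
The ordering rules for corrections can be sketched with Python's `re` module. This is an approximation written for illustration, not stbt's implementation: `apply_corrections` is a hypothetical helper, and the use of `\b` for "word boundaries" is an assumption that works for keys made of letters and digits.

```python
import re


def apply_corrections(text, corrections):
    """Apply plain-string keys first, then compiled patterns, in dict order."""
    plain = [(k, v) for k, v in corrections.items() if isinstance(k, str)]
    patterns = [(k, v) for k, v in corrections.items() if not isinstance(k, str)]
    for key, value in plain:
        # Plain strings only match at word boundaries; the lambda keeps the
        # replacement literal (no backreference processing).
        text = re.sub(r"\b" + re.escape(key) + r"\b", lambda m, v=value: v, text)
    for pattern, value in patterns:
        # Compiled patterns can match anywhere and may use backreferences.
        text = pattern.sub(value, text)
    return text


print(apply_corrections("bad Oslo", {"bad": "good", re.compile(r"[oO]"): "0"}))
print(apply_corrections("he saw the saw", {"he saw": "she saw"}))
```

In the first call the plain string is replaced before the pattern runs, so the "o" characters introduced by "good" are also turned into zeros.
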
stbt.ocr_eq(a: str, b: str) bool#

Compare two strings for equality, ignoring common OCR errors.

stbt.ocr sometimes mistakes some characters, such as “O” instead of “0”, especially when reading short fragments of text without enough context. ocr_eq will treat such characters as equal to each other. It also ignores spaces and punctuation. For example:

>>> ocr_eq("hello", "hel 10")
True

The character mapping used by ocr_eq’s normalization algorithm is available in ocr_eq.replacements; you can modify it by adding or removing entries. The default mapping is:

>>> ocr_eq.replacements
{'0': 'o', 'O': 'o',
 '1': 'l', 'i': 'l', 'I': 'l', '|': 'l', '7': 'l',
 '2': 'z', 'Z': 'z',
 '4': 'A',
 '5': 's', 'S': 's',
 '6': 'g', 'G': 'g', '9': 'g', 'q': 'g',
 '8.': '&',
 '8': 'B',
 'f': 'r', 'F': 'r',
 'ł': 't',
 'm': 'rn',
 'C': 'c',
 'K': 'k',
 'P': 'p',
 'V': 'v',
 'W': 'w',
 'vv': 'w',
 'X': 'x',
 'Y': 'y'}

If you need to normalize a single string using this same algorithm, use ocr_eq.normalize:

>>> ocr_eq.normalize("hel 10")
'hello'

Added in v34.
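
The idea behind the normalization can be sketched as follows. This is a simplified model, not the library's algorithm: it uses only a subset of the single-character entries from the mapping above, and the order in which replacements and punctuation removal happen is an assumption. `normalize_sketch` and `eq_sketch` are hypothetical names.

```python
import string

# Subset of the default replacements listed above (single characters only).
REPLACEMENTS = {
    "0": "o", "O": "o",
    "1": "l", "i": "l", "I": "l", "|": "l",
    "5": "s", "S": "s",
}


def normalize_sketch(text, replacements=REPLACEMENTS):
    out = []
    for ch in text:
        if ch in replacements:
            # Map easily-confused characters onto one representative.
            out.append(replacements[ch])
        elif ch.isspace() or ch in string.punctuation:
            # Spaces and punctuation are ignored.
            continue
        else:
            out.append(ch)
    return "".join(out)


def eq_sketch(a, b):
    return normalize_sketch(a) == normalize_sketch(b)


print(normalize_sketch("hel 10"))
print(eq_sketch("hello", "hel 10"))
```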

class stbt.OcrEngine#

An enumeration.


TESSERACT = 0#

Tesseract’s “legacy” OCR engine (v3). Recommended.

LSTM = 1#

Tesseract v4’s “Long Short-Term Memory” neural network. Not recommended for reading menus, buttons, prices, numbers, times, etc, because it hallucinates text that isn’t there when the input isn’t long prose.


Combine results from Tesseract legacy & LSTM engines. Not recommended because it favours the result from the LSTM engine too heavily.


Default engine, based on what is installed.

class stbt.OcrMode#

Options to control layout analysis and assume a certain form of image.

For a (brief) description of each option, see the tesseract(1) man page.

RAW_LINE = 13#
stbt.pdu(uri: str | None = None) PDU#

Return a PDU (Power Distribution Unit) object to control the power to the device-under-test.


name (str|None) – The name of the PDU. This must match a PDU configured in the test-pack’s configuration files. If None then the name is taken from the device_under_test.power_outlet configuration variable.

For more details see Power Distribution Units.

stbt.pdu was added in v34.

class stbt.PDU#

API to control a specific outlet of a network-controlled Power Distribution Unit (PDU).

Use the stbt.pdu factory function to create instances of this class.

set(power: bool)#
get() bool#
class stbt.Position(x: int, y: int)#

A point with x and y coordinates.

exception stbt.PreconditionError#

Exception raised by as_precondition.

stbt.press(
key: str,
interpress_delay_secs: float | None = None,
hold_secs: float | None = None,
) Keypress#

Send the specified key-press to the device under test.

  • key (str) –

    The name of the key/button.

    If you are using infrared control, this is a key name from your lircd.conf configuration file.

    If you are using HDMI CEC control, see the available key names here. Note that some devices might not understand every CEC command in that list.

  • interpress_delay_secs (int or float) –

    The minimum time to wait after a previous key-press, in order to accommodate the responsiveness of the device-under-test.

    This defaults to 0.3. You can override the global default value by setting interpress_delay_secs in the [press] section of .stbt.conf.

  • hold_secs (int or float) – Hold the key down for the specified duration (in seconds). Currently this is implemented for the infrared, HDMI CEC, and Roku controls. There is a maximum limit of 60 seconds.


A stbt.Keypress object with information about the keypress that was sent.

  • Changed in v33: The key argument can be an Enum (we’ll use the Enum’s value, which must be a string).

stbt.press_and_wait(
key: str,
mask: Mask | Region | str = Region.ALL,
region: Region = Region.ALL,
timeout_secs: float = 10,
stable_secs: float = 1,
min_size: tuple[int, int] | None = None,
retries: int = 0,
frames: Iterator[Frame] | None = None,
) Transition#

Press a key, then wait for the screen to change, then wait for it to stop changing.

This can be used to wait for a menu selection to finish moving before attempting to OCR at the selection’s new position; or to measure the duration of animations; or to measure how long it takes for a screen (such as an EPG) to finish populating.

  • key – The name of the key to press (passed to stbt.press).

  • mask – A Region or a mask that specifies which parts of the image to analyse. This accepts anything that can be converted to a Mask using stbt.load_mask. See Regions and Masks.

  • region – Deprecated synonym for mask. Use mask instead.

  • timeout_secs – A timeout in seconds. This function will return a falsey value if the transition didn’t complete within this number of seconds from the key-press.

  • stable_secs – A duration in seconds. The screen must stay unchanged (within the specified region or mask) for this long, for the transition to be considered “complete”.

  • min_size – A tuple of (width, height), in pixels, for differences to be considered as “motion”. Use this to ignore small differences, such as the blinking text cursor in a search box.

  • retries – Press the key again (up to this number of times) if the first press didn’t have any effect (that is, if the status would have been TransitionStatus.START_TIMEOUT). Defaults to 0 (no retries).

  • frames – An iterable of video-frames to analyse. Defaults to stbt.frames().


A Transition object that will evaluate to true if the transition completed, false otherwise.

Changed in v32: Use the same difference-detection algorithm as wait_for_motion.

Added in v33: The started, complete and stable attributes of the returned value.

Changed in v33: mask accepts anything that can be converted to a Mask using load_mask. The region parameter is deprecated; pass your Region to mask instead. You can’t specify mask and region at the same time.

Added in v34: The retries and frames parameters.

Changed in v34: The difference-detection algorithm takes color into account.
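
The interaction between timeout_secs and stable_secs can be modelled without a device. The sketch below classifies a recorded sequence of frame comparisons; it illustrates the rules described above and is not stbt's implementation. `classify_transition` and the returned status strings are hypothetical (only START_TIMEOUT is named in this documentation).

```python
def classify_transition(samples, timeout_secs=10, stable_secs=1):
    """samples: iterable of (seconds_since_press, changed) pairs, where
    `changed` is True if that frame differs from the previous one."""
    last_change = None
    for seconds_since_press, changed in samples:
        if seconds_since_press > timeout_secs:
            break
        if changed:
            last_change = seconds_since_press
        elif (last_change is not None
              and seconds_since_press - last_change >= stable_secs):
            # The screen changed, then stayed unchanged for stable_secs.
            return "complete"
    # Either nothing ever changed, or it never settled within the timeout.
    return "start_timeout" if last_change is None else "stable_timeout"


print(classify_transition(
    [(0.1, False), (0.2, True), (0.4, True), (1.0, False), (1.5, False)]))
```

In this model, a result other than "complete" corresponds to the falsey return value described for timeout_secs.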

stbt.pressing(
key: str,
interpress_delay_secs: float | None = None,
) ContextManager[Keypress]#

Context manager that will press and hold the specified key for the duration of the with code block.

For example, this will hold KEY_RIGHT until wait_for_match finds a match or times out:

with stbt.pressing("KEY_RIGHT"):
    stbt.wait_for_match("last-page.png")  # placeholder image filename

The same limitations apply as stbt.press’s hold_secs parameter.

stbt.press_until_match(
key: str,
image: Image | str,
interval_secs: float | None = None,
max_presses: int | None = None,
match_parameters: MatchParameters | None = None,
region: Region | None = Region.ALL,
) MatchResult#

Call press as many times as necessary to find the specified image.

  • key – See press.

  • image – See match.

  • interval_secs (int or float) –

    The number of seconds to wait for a match before pressing again. Defaults to 3.

    You can override the global default value by setting interval_secs in the [press_until_match] section of .stbt.conf.

  • max_presses (int) –

    The number of times to try pressing the key and looking for the image before giving up and raising MatchTimeout. Defaults to 10.

    You can override the global default value by setting max_presses in the [press_until_match] section of .stbt.conf.

  • match_parameters – See match.

  • region – See match.


MatchResult when the image is found.


MatchTimeout if no match is found after max_presses key-presses.

class stbt.prometheus.Counter(name: str, description: str)#

Log a cumulative metric that increases over time, to the Prometheus database on your Stb-tester Portal.

Prometheus is an open-source monitoring & alerting tool. A Prometheus Counter tracks counts of events or running totals. See Metric Types and instrumentation best practices in the Prometheus documentation.

Example use cases for Counters:

  • Number of times the “buffering” indicator or “loading” spinner has appeared.

  • Number of frames seen with visual glitches or blockiness.

  • Number of VoD assets that failed to play.

  • name (str) – A unique identifier for the metric. See Metric names in the Prometheus documentation.

  • description (str) – A longer description of the metric.

Added in v32.

inc(value: float = 1, labels: dict[str, str] | None = None)#

Increment the Counter by the given amount.

  • value (int) – The amount to increase.

  • labels (Mapping[str,str]) –

    Optional dict of label_name: label_value entries. See Labels in the Prometheus documentation.


    Every unique combination of key-value label pairs represents a new time series, which can dramatically increase the amount of memory required to store the data on the Stb-tester Node, on the Stb-tester Portal, and on your Prometheus server. Do not use labels to store dimensions with high cardinality (many different label values), such as programme names or other unbounded sets of values.

class stbt.prometheus.Gauge(name: str, description: str)#

Log a numerical value that can go up and down, to the Prometheus database on your Stb-tester Portal.

Prometheus is an open-source monitoring & alerting tool. A Prometheus Gauge tracks values like temperatures or current memory usage.

  • name (str) – A unique identifier for the metric. See Metric names in the Prometheus documentation.

  • description (str) – A longer description of the metric.

Added in v32.

set(value: float, labels: dict[str, str] | None = None)#

Set the Gauge to the given value.

class stbt.prometheus.Histogram(name: str, description: str, buckets: list[float])#

Log measurements, in buckets, to the Prometheus database on your Stb-tester Portal.

Prometheus is an open-source monitoring & alerting tool. A Prometheus Histogram counts measurements (such as sizes or durations) into configurable buckets.

Prometheus Histograms are commonly used for performance measurements:

  • Channel zapping time.

  • App launch time.

  • Time for VoD content to start playing.

Prometheus Histograms allow reporting & alerting on particular quantiles. For example you could configure an alert if the 90th percentile of the above measurements exceeds a certain threshold (that is, the slowest 10% of requests are slower than the threshold).

  • name (str) – A unique identifier for the metric. See Metric names in the Prometheus documentation.

  • description (str) – A longer description of the metric.

  • buckets (Sequence[float]) – A list of numbers in increasing order, where each number is the upper bound of the corresponding bucket in the Histogram. With Prometheus you must specify the buckets up-front because the raw measurements aren’t stored, only the counts of how many measurements fall into each bucket.

Added in v32.

log(value: float, labels: dict[str, str] | None = None)#

Store the given value into the Histogram.
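
To see why the buckets must be chosen up-front, it can help to look at what a Prometheus histogram keeps. The following is a small model of the standard Prometheus convention (cumulative counts, with each bucket's upper bound inclusive, plus an implicit +Inf bucket); `bucket_counts` is a hypothetical helper and not part of the stbt.prometheus API.

```python
import bisect


def bucket_counts(buckets, measurements):
    """Return {upper_bound: cumulative count of measurements <= upper_bound}."""
    bounds = list(buckets) + [float("inf")]
    per_bucket = [0] * len(bounds)
    for value in measurements:
        # First bucket whose upper bound is >= value.
        per_bucket[bisect.bisect_left(bounds, value)] += 1
    cumulative = []
    total = 0
    for count in per_bucket:
        total += count
        cumulative.append(total)
    return dict(zip(bounds, cumulative))


# Hypothetical app-launch times in seconds, with buckets at 1s, 2s and 5s.
print(bucket_counts([1, 2, 5], [0.5, 1.5, 1.5, 7]))
```

Only these counts survive, so a quantile such as the 90th percentile can be estimated no more precisely than the bucket boundaries allow.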

class stbt.Region(
x: float,
y: float,
width: float | None = None,
height: float | None = None,
right: float | None = None,
bottom: float | None = None,
)#

Region(x, y, width=width, height=height) or Region(x, y, right=right, bottom=bottom)

Rectangular region within the video frame.

For example, given the following regions a, b, and c:

- 01234567890123
0 ░░░░░░░░
1 ░a░░░░░░
2 ░░░░░░░░
3 ░░░░░░░░
4 ░░░░▓▓▓▓░░▓c▓
5 ░░░░▓▓▓▓░░▓▓▓
6 ░░░░▓▓▓▓░░░░░
7 ░░░░▓▓▓▓░░░░░
8     ░░░░░░b░░
9     ░░░░░░░░░
>>> a = Region(0, 0, width=8, height=8)
>>> b = Region(4, 4, right=13, bottom=10)
>>> c = Region(10, 4, width=3, height=2)
>>> a.right
8
>>> b.bottom
10
>>> b.center
Position(x=8, y=7)
>>> b.contains(c), a.contains(b), c.contains(b), c.contains(None)
(True, False, False, False)
>>> b.contains(c.center), a.contains(b.center)
(True, False)
>>> b.extend(x=6, bottom=-4) == c
True
>>> a.extend(right=5).contains(c)
True
>>> a.width, a.extend(x=3).width, a.extend(right=-3).width
(8, 5, 5)
>>> c.replace(bottom=10)
Region(x=10, y=4, right=13, bottom=10)
>>> Region.intersect(a, b)
Region(x=4, y=4, right=8, bottom=8)
>>> Region.intersect(a, b) == Region.intersect(b, a)
True
>>> Region.intersect(c, b) == c
True
>>> print(Region.intersect(a, c))
None
>>> print(Region.intersect(None, a))
None
>>> Region.intersect(a)
Region(x=0, y=0, right=8, bottom=8)
>>> Region.intersect()
Region.ALL
>>> quadrant = Region(x=float("-inf"), y=float("-inf"), right=0, bottom=0)
>>> quadrant.translate(2, 2)
Region(x=-inf, y=-inf, right=2, bottom=2)
>>> c.translate(x=-9, y=-3)
Region(x=1, y=1, right=4, bottom=3)
>>> Region(2, 3, 2, 1).translate(b)
Region(x=6, y=7, right=8, bottom=8)
>>> Region.intersect(Region.ALL, c) == c
True
>>> Region.ALL
Region.ALL
>>> print(Region.ALL)
Region.ALL
>>> c.above()
Region(x=10, y=-inf, right=13, bottom=4)
>>> c.below()
Region(x=10, y=6, right=13, bottom=inf)
>>> a.right_of()
Region(x=8, y=0, right=inf, bottom=8)
>>> a.right_of(width=2)
Region(x=8, y=0, right=10, bottom=8)
>>> c.left_of()
Region(x=-inf, y=4, right=10, bottom=6)

x#

The x coordinate of the left edge of the region, measured in pixels from the left of the video frame (inclusive).

y#

The y coordinate of the top edge of the region, measured in pixels from the top of the video frame (inclusive).

right#

The x coordinate of the right edge of the region, measured in pixels from the left of the video frame (exclusive).

bottom#

The y coordinate of the bottom edge of the region, measured in pixels from the top of the video frame (exclusive).

width#

The width of the region, measured in pixels.

height#

The height of the region, measured in pixels.

x, y, right, bottom, width and height can be infinite — that is, float("inf") or -float("inf").

center#

A stbt.Position specifying the x & y coordinates of the region’s center.

static from_extents()#

Create a Region using right and bottom extents rather than width and height.

Typically you’d use the right and bottom parameters of the Region constructor instead, but this factory function is useful if you need to create a Region from a tuple.

>>> extents = (4, 4, 13, 10)
>>> Region.from_extents(*extents)
Region(x=4, y=4, right=13, bottom=10)
static bounding_box(*args)#

The smallest region that contains all the given regions.

>>> a = Region(50, 20, right=60, bottom=40)
>>> b = Region(20, 30, right=30, bottom=50)
>>> c = Region(55, 25, right=70, bottom=35)
>>> Region.bounding_box(a, b)
Region(x=20, y=20, right=60, bottom=50)
>>> Region.bounding_box(b, b)
Region(x=20, y=30, right=30, bottom=50)
>>> Region.bounding_box(None, b)
Region(x=20, y=30, right=30, bottom=50)
>>> Region.bounding_box(b, None)
Region(x=20, y=30, right=30, bottom=50)
>>> Region.bounding_box(b, Region.ALL)
Region.ALL
>>> print(Region.bounding_box(None, None))
None
>>> print(Region.bounding_box())
None
>>> Region.bounding_box(b)
Region(x=20, y=30, right=30, bottom=50)
>>> Region.bounding_box(a, b, c) == \
...     Region.bounding_box(a, Region.bounding_box(b, c))
True
static intersect(*args)#

The intersection of the passed regions, or None if the regions don’t intersect.

Any parameter can be None (an empty Region) so intersect is commutative and associative.
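
The intersection rule, including the treatment of None as an empty region, can be sketched with a small stand-in type. `Box` and `intersect_sketch` are hypothetical names used only for illustration; the all-plane box returned for zero arguments is an assumption standing in for Region.ALL.

```python
from typing import NamedTuple, Optional

INF = float("inf")


class Box(NamedTuple):
    """Stand-in for a region: x/y inclusive, right/bottom exclusive."""
    x: float
    y: float
    right: float
    bottom: float


def intersect_sketch(*boxes: Optional[Box]) -> Optional[Box]:
    if any(b is None for b in boxes):
        return None  # None behaves as the empty region
    if not boxes:
        return Box(-INF, -INF, INF, INF)  # assumption: the whole plane
    x = max(b.x for b in boxes)
    y = max(b.y for b in boxes)
    right = min(b.right for b in boxes)
    bottom = min(b.bottom for b in boxes)
    if x >= right or y >= bottom:
        return None  # no overlap
    return Box(x, y, right, bottom)


a = Box(0, 0, 8, 8)
b = Box(4, 4, 13, 10)
c = Box(10, 4, 13, 6)
print(intersect_sketch(a, b))
print(intersect_sketch(a, c))
```

The boxes a, b and c mirror the regions from the example at the top of the Region documentation.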

to_slice() tuple[slice, slice]#

A 2-dimensional slice suitable for indexing a stbt.Frame.

contains(other: Region) bool#

True if other (a Region or Position) is entirely contained within self.

translate(x: Region) Region#
translate(x: float | None, y: float | None) Region
translate(x: tuple[int, int]) Region

A new region with the position of the region adjusted by the given amounts. The width and height are unaffected.

translate accepts separate x and y arguments, or a single Region.

For example, move the region 1px right and 2px down:

>>> b = Region(4, 4, 9, 6)
>>> b.translate(1, 2)
Region(x=5, y=6, right=14, bottom=12)

Move the region 1px to the left:

>>> b.translate(x=-1)
Region(x=3, y=4, right=12, bottom=10)

Move the region 3px up:

>>> b.translate(y=-3)
Region(x=4, y=1, right=13, bottom=7)

Move the region by another region. This can be helpful if TITLE defines a region relative to another UI element on screen. You can then combine the two like so:

>>> TITLE = Region(20, 5, 160, 40)
>>> CELL = Region(140, 45, 200, 200)
>>> TITLE.translate(CELL)
Region(x=160, y=50, right=320, bottom=90)

extend(
x: float | None = 0,
y: float | None = 0,
right: float | None = 0,
bottom: float | None = 0,
) Region#

A new region with the edges of the region adjusted by the given amounts.

replace(
x: float | None = None,
y: float | None = None,
width: float | None = None,
height: float | None = None,
right: float | None = None,
bottom: float | None = None,
) Region#

A new region with the edges of the region set to the given coordinates.

This is similar to extend, but it takes absolute coordinates within the image instead of adjusting by a relative number of pixels.
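
The difference between relative and absolute adjustment can be sketched with a plain tuple type. `Box`, `extend_sketch` and `replace_sketch` are hypothetical stand-ins, and the sketch only handles the four edges (it ignores the width and height parameters that replace also accepts).

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Stand-in for a region: x/y inclusive, right/bottom exclusive."""
    x: float
    y: float
    right: float
    bottom: float


def extend_sketch(box, x=0, y=0, right=0, bottom=0):
    # Relative: each edge moves by the given number of pixels.
    return Box(box.x + x, box.y + y, box.right + right, box.bottom + bottom)


def replace_sketch(box, x=None, y=None, right=None, bottom=None):
    # Absolute: each given edge is set to the given coordinate.
    return Box(
        box.x if x is None else x,
        box.y if y is None else y,
        box.right if right is None else right,
        box.bottom if bottom is None else bottom,
    )


b = Box(4, 4, 13, 10)
c = Box(10, 4, 13, 6)
print(extend_sketch(b, x=6, bottom=-4) == c)
print(replace_sketch(c, bottom=10))
```

These calls reproduce two of the doctest lines from the Region example: extending b by x=6, bottom=-4 yields c, and replacing c's bottom edge with 10 yields a region ending at y=10.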

dilate(n: int) Region#

Expand the region by n px in all directions.

>>> Region(20, 30, right=30, bottom=50).dilate(3)
Region(x=17, y=27, right=33, bottom=53)
erode(n: int) Region#

Shrink the region by n px in all directions.

>>> Region(20, 30, right=30, bottom=50).erode(3)
Region(x=23, y=33, right=27, bottom=47)
>>> print(Region(20, 30, 10, 20).erode(5))
None
above(height: float = float('inf')) Region#

A new region above the current region, extending to the top of the frame (or to the specified height).

below(height: float = float('inf')) Region#

A new region below the current region, extending to the bottom of the frame (or to the specified height).

right_of(width: float = float('inf')) Region#

A new region to the right of the current region, extending to the right edge of the frame (or to the specified width).

left_of(width: float = float('inf')) Region#

A new region to the left of the current region, extending to the left edge of the frame (or to the specified width).

class stbt.RmsVolumeResult#

The result from get_rms_volume.

  • amplitude (float) – The RMS amplitude over the specified window. This is a value between 0.0 (absolute silence) and 1.0 (a full-range square wave).

  • time (float) – The start of the window, as number of seconds since the unix epoch (1970-01-01T00:00Z). This is compatible with time.time() and stbt.Frame.time.

  • duration_secs (int|float) – The window size in seconds, as given to get_rms_volume.

dBov(noise_floor_amplitude=0.0003) float#

The RMS amplitude converted to dBov.

Decibels are a logarithmic measurement; human perception of loudness is also logarithmic, so decibels are a useful way to measure loudness.

This is a value between -70 (silence, or near silence) and 0 (the loudest possible signal, a full-scale square wave).


noise_floor_amplitude – This is used to avoid ZeroDivisionError exceptions. We consider 0 amplitude to be this non-zero value instead. It defaults to ~0.0003 (-70dBov).
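
As a rough check of the numbers above, the usual amplitude-to-decibel conversion is 20·log10(amplitude), clamped at the noise floor. The function below is a sketch under that assumption and is not taken from the stbt source; `dbov_sketch` is a hypothetical name.

```python
import math


def dbov_sketch(amplitude, noise_floor_amplitude=0.0003):
    """Convert an RMS amplitude in the range 0.0 to 1.0 into dBov."""
    # Clamp to the noise floor so that silence maps to about -70 dBov
    # instead of causing a math error for log10(0).
    return 20 * math.log10(max(amplitude, noise_floor_amplitude))


print(dbov_sketch(1.0))              # full-scale square wave
print(round(dbov_sketch(0.0)))       # silence, clamped to the noise floor
print(round(dbov_sketch(0.5), 2))    # half amplitude
```

Halving the amplitude lowers the level by about 6 dB, which is why a logarithmic scale is convenient for comparing loudness.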

class stbt.Roku(address: str)#

Helper for interacting with Roku devices over the network.

This uses Roku’s External Control Protocol.

To find the Roku’s IP address and to enable the Roku’s network control protocol see Device Configuration: Roku.


address (str) – IP address of the Roku.

Or, use Roku.from_config() to create an instance using the address configured in the test-pack’s configuration files.

Added in v33.

static from_config() Roku#

Create a Roku instance from the test-pack’s configuration files.

Expects that the Roku’s IP address is specified in device_under_test.ip_address. This configuration belongs in your Stb-tester Node’s Node-specific configuration files. For example:

device_type = roku
ip_address =

ConfigurationError – If the Roku’s IP address is not configured.

filename: str = 'roku.log',
) Generator[None, None, None]#

Stream logs from the Roku’s debug console to filename.

This is a context manager. See Capturing logs from the device-under-test for the recommended way to use it.

query_apps() dict[str, str]#

Returns a dict of application_id: name with all the apps installed on the Roku device.

launch_app(id_or_name) None#

Launches the specified app. Accepts the app’s ID or name.

Use Roku.query_apps to find the IDs & names of the apps installed on the Roku.
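Accepting "ID or name" can be pictured as a lookup against the dict returned by `query_apps`. The helper below is a hypothetical sketch, not the Roku class's own code; exact, case-sensitive matching and the sample app data are assumptions made for illustration.

```python
def resolve_app_id(apps, id_or_name):
    """Return the app ID for an ID or a name.

    apps is a dict of application_id -> name, shaped like the value
    that query_apps is documented to return.
    """
    if id_or_name in apps:
        return id_or_name  # Already an ID.
    for app_id, name in apps.items():
        if name == id_or_name:
            return app_id
    raise ValueError("No such app: %r" % (id_or_name,))


# Made-up sample data, for illustration only.
apps = {"1001": "Example Video", "1002": "Example Music"}
print(resolve_app_id(apps, "1001"))
print(resolve_app_id(apps, "Example Music"))
```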

merge_px=(0, 0),
) list[Region]#

Segment (partition) the image into a list of contiguous foreground regions.

This uses an adaptive threshold algorithm to binarize the image into foreground vs. background pixels. For finer control, you can do the binarization yourself (for example with stbt.color_diff) and pass the binarized image to segment.

For a guide to using this API see Using segmentation to find GUI elements.

  • frame (Frame) – The video-frame or image to process.

  • region (Region) – Only search in this region.

  • initial_direction (Direction) – Start scanning in this direction (left-to-right or top-to-bottom).

  • steps (int) – Do another segmentation within each region found in the previous step, alternating direction between VERTICAL and HORIZONTAL each step. For example, the default values steps=1, initial_direction=stbt.Direction.VERTICAL will find lines of text; steps=2 will recursively perform segmentation horizontally within each line to find each character in the line (assuming the characters don’t overlap due to kerning; overlapping characters will be segmented as a single region).

  • narrow (bool) – At the last step, narrow each region in the opposite direction. For example: if you are segmenting lines of text with steps=1, initial_direction=stbt.Direction.VERTICAL, narrow=False you will get regions with y & bottom matching the top & bottom of each line, but with x & right set to the left & right edges of the frame (0 and the frame’s width, respectively). With narrow=True, each region’s x & right will be the leftmost / rightmost edge of the line.

  • light_background (bool) – By default, the adaptive threshold algorithm assumes foreground pixels are light-coloured and background pixels are dark. Set light_background=True if foreground pixels are dark (for example black text on a light background).

  • merge_px (int|Size) –

    Merge nearby regions that are separated by a gap of this many pixels or fewer. This is a tuple of (width, height). Regions are merged width-wise during horizontal steps, and height-wise during vertical steps. For example, to merge letters within the same word in horizontal text, use merge_px=(5, 0).

    Specifying a single integer means the same gap will be used for horizontal and vertical steps. The default is (0, 0) which means no merging.


A list of stbt.Region instances.
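The merging rule for merge_px can be illustrated in one dimension. The sketch below merges horizontal extents whose gap is at most a given number of pixels; `merge_intervals` is a hypothetical helper and the extents are made up, so treat it as an illustration of the rule rather than the algorithm segment actually uses.

```python
def merge_intervals(intervals, max_gap):
    """Merge (start, end) intervals separated by max_gap pixels or fewer."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Made-up horizontal extents of four letters; the last one is a new word.
letters = [(10, 18), (20, 28), (31, 40), (60, 70)]
print(merge_intervals(letters, max_gap=5))
print(merge_intervals(letters, max_gap=0))
```

With a gap of 5 the first three letters collapse into one word-sized interval while the distant fourth stays separate; with a gap of 0 nothing is merged.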

stbt.segment was added in v33.
The merge_px parameter was added in v34.

stbt.set_global_ocr_corrections(corrections: dict[Pattern | str, str])#

Specify default OCR corrections that apply to all calls to stbt.ocr and stbt.apply_ocr_corrections.

See the corrections parameter of stbt.ocr for more details.

We recommend calling this function from tests/ to ensure it is called before any test script is executed.

class stbt.Size(width: int, height: int)#

Size of a rectangle with width and height.

stbt.stop_job(reason: str | None = None) None#

Stop this job after the current testcase exits.

If you are running a job with multiple testcases, or a soak-test, the job will stop when the current testcase exits. Any remaining testcases (that you specified when you started the job) will not be run.


reason (str) – Optional message that will be logged.

Added in v31.

class stbt.TextMatchResult#

The result from match_text.

  • time (float) – The time at which the video-frame was captured, in seconds since 1970-01-01T00:00Z. This timestamp can be compared with system time (time.time()).

  • match (bool) – True if a match was found. This is the same as evaluating TextMatchResult as a bool. That is, if result: will behave the same as if result.match:.

  • region (Region) – Bounding box where the text was found, or None if the text wasn’t found.

  • frame (Frame) – The video frame that was searched, as given to match_text.

  • text (str) – The text that was searched for, as given to match_text.

class stbt.Transition#

The return value from press_and_wait and wait_for_transition_to_end.

This object will evaluate to true if the transition completed, false otherwise. It has the following attributes:

  • key (str) – The name of the key that was pressed.

  • frame (stbt.Frame) – If successful, the first video frame when the transition completed; if timed out, the last frame seen.

  • status (stbt.TransitionStatus) – Either START_TIMEOUT (the transition didn’t start – nothing moved), STABLE_TIMEOUT (the transition didn’t end – movement didn’t stop), or COMPLETE (the transition started and then stopped). If it’s COMPLETE, the whole object will evaluate as true.

  • started (bool) – The transition started (movement was seen after the keypress). Implies that status is either COMPLETE or STABLE_TIMEOUT.

  • complete (bool) – The transition completed (movement started and then stopped). Implies that status is COMPLETE.

  • stable (bool) – The screen is stable (no movement). Implies complete or not started.

  • press_time (float) – When the key-press completed.

  • animation_start_time (float) – When animation started after the key-press (or None if timed out).

  • end_time (float) – When animation completed (or None if timed out).

  • duration (float) – Time from press_time to end_time (or None if timed out).

  • animation_duration (float) – Time from animation_start_time to end_time (or None if timed out).

All times are measured in seconds since 1970-01-01T00:00Z; the timestamps can be compared with system time (the output of time.time).
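The relationship between status and the started, complete, and stable attributes listed above can be written out explicitly. The `Status` enum and `flags` function below are hypothetical stand-ins that restate the documented implications; they are not part of stbt.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical stand-in for stbt.TransitionStatus.
    START_TIMEOUT = "start_timeout"
    STABLE_TIMEOUT = "stable_timeout"
    COMPLETE = "complete"


def flags(status):
    """Derive the boolean attributes from the status, as documented above."""
    started = status in (Status.COMPLETE, Status.STABLE_TIMEOUT)
    complete = status is Status.COMPLETE
    stable = complete or not started
    return {"started": started, "complete": complete, "stable": stable}


for status in Status:
    print(status.name, flags(status))
```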

class stbt.TransitionStatus#

An enumeration with the following values:

  • START_TIMEOUT – The transition didn’t start (nothing moved).

  • STABLE_TIMEOUT – The transition didn’t end (movement didn’t stop).

  • COMPLETE – The transition started and then stopped.

exception stbt.UITestFailure#

Bases: Exception

The test failed because the device under test didn’t behave as expected.

Inherit from this if you need to define your own test-failure exceptions.

class stbt.VolumeChangeDirection#

An enumeration.

exception stbt.VolumeChangeTimeout#

Bases: AssertionError

keypress: Keypress | None = None,
spinner_fn: TakesFrame | None = None,
mask: Mask | Region | str = Region.ALL,
black_threshold: int | None = None,
motion_threshold: int | None = None,
consecutive_frames: int | str | None = None,
timeout_secs: float = 30,
threshold_db: float = 10.,
) list[tuple[float, str, Any]]#

Waits for content to start, taking performance measurements.

This looks for the following events in parallel:

keypress ─┬──> Screen goes black ──> No longer black ───┐
          ├──> Spinner appears ───> Spinner disappears ─┤
          ├───────────────> Sound ──────────────────────┤
          └───────────────> Motion ─────────────────────┴─> Done

It spawns a separate thread to detect each of these events in parallel. It uses, respectively, stbt.is_screen_black, a custom spinner-detection function provided by the caller, stbt.wait_for_volume_change, and stbt.wait_for_motion.

  • keypress – The Keypress object for the key-press that started content playback.

  • spinner_fn – A function that takes a frame argument and returns true if the spinner is visible in the frame, false otherwise. Defaults to None (no spinner detection).

  • mask – A mask used for motion and black-screen detection. This should mask out the spinner and any UI elements superimposed on the video.

  • black_threshold – See the threshold parameter of stbt.is_screen_black.

  • motion_threshold – See the noise_threshold parameter of stbt.wait_for_motion.

  • consecutive_frames – See the consecutive_frames parameter of stbt.wait_for_motion.

  • threshold_db – See the threshold_db parameter of stbt.wait_for_volume_change.

  • timeout_secs – Number of seconds to wait for content to start playing.


A list of 3-element tuples (time, event_type, event) in time order. Possible event types are: “press”, “black”, “not_black”, “spinner”, “not_spinner”, “audio”, and “motion”.

Example usage:

# after navigating to the right place in the UI so that the "play"
# button is focused and ready to be pressed...
keypress = stbt.press("KEY_OK")
events = wait_for_content_start(keypress, mask="mask-out-spinner.png")
start_time = keypress.start_time
for t, event, _ in events:
    print("%.3f %s" % (t - start_time, event))
content_started = min(t - start_time for t, event, _ in events
                      if event in ["not_black", "motion", "audio"])
print("Content started after %.3f seconds" % content_started)

This example code would print something like this:

0.000 press
0.406 black
2.606 motion
2.606 not_black
2.925 audio
Content started after 2.606 seconds
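Because the return value is a plain list of tuples, other measurements can be derived from it without touching the device again. The sketch below works on a hand-written list shaped like the output above; the times are relative to the key-press for readability, the third tuple element is set to None because its contents are not needed here, and `first_time` is a hypothetical helper.

```python
# Sample events shaped like the documented (time, event_type, event) tuples.
events = [
    (0.000, "press", None),
    (0.406, "black", None),
    (2.606, "motion", None),
    (2.606, "not_black", None),
    (2.925, "audio", None),
]


def first_time(events, event_type):
    """Time of the first event of the given type, or None if it is absent."""
    return next((t for t, kind, _ in events if kind == event_type), None)


black_duration = first_time(events, "not_black") - first_time(events, "black")
audio_lag = first_time(events, "audio") - first_time(events, "motion")
print("Black screen lasted %.3f seconds" % black_duration)
print("Audio started %.3f seconds after motion" % audio_lag)
```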

Added in v34.

image: Image | str,
timeout_secs: float = 10,
consecutive_matches: int = 1,
match_parameters: MatchParameters | None = None,
region: Region = Region.ALL,
frames: Iterator[Frame] | None = None,
) MatchResult#

Search for an image in the device-under-test’s video stream.

  • image – The image to search for. See match.

  • timeout_secs (int or float or None) – A timeout in seconds. This function will raise MatchTimeout if no match is found within this time.

  • consecutive_matches (int) – Forces this function to wait for several consecutive frames with a match found at the same x,y position. Increase consecutive_matches to avoid false positives due to noise, or to wait for a moving selection to stop moving.

  • match_parameters – See match.

  • region – See match.

  • frames (Iterator[stbt.Frame]) – An iterable of video-frames to analyse. Defaults to stbt.frames().
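The effect of consecutive_matches can be sketched on a list of per-frame match positions. `first_stable_match` is a hypothetical helper and the positions are invented; it only illustrates the idea of requiring the same x,y position in several frames in a row, not the function's actual implementation.

```python
def first_stable_match(positions, consecutive_matches=1):
    """Index of the frame that completes a run of identical match positions.

    positions holds one (x, y) tuple per frame, or None where nothing matched.
    Returns None if no long-enough run is found.
    """
    run_length = 0
    previous = None
    for index, position in enumerate(positions):
        if position is None:
            run_length = 0
        elif position == previous:
            run_length += 1
        else:
            run_length = 1  # A match at a new position starts a new run.
        previous = position
        if run_length >= consecutive_matches:
            return index
    return None


# A selection that slides right and then settles at x=14.
positions = [None, (10, 5), (12, 5), (14, 5), (14, 5), (14, 5)]
print(first_stable_match(positions, consecutive_matches=1))
print(first_stable_match(positions, consecutive_matches=3))
```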


MatchResult when the image is found.


MatchTimeout if no match is found after timeout_secs seconds.

timeout_secs: float = 10,
consecutive_frames: int | str | None = None,
noise_threshold: int | None = None,
mask: Mask | Region | str = Region.ALL,
region: Region = Region.ALL,
frames: Iterator[Frame] | None = None,
) MotionResult#

Search for motion in the device-under-test’s video stream.

“Motion” is a difference in pixel values between two frames.

  • timeout_secs (int or float or None) – A timeout in seconds. This function will raise MotionTimeout if no motion is detected within this time.

  • consecutive_frames (int or str) –

    Considers the video stream to have motion if there were differences between the specified number of consecutive frames. This can be:

    • a positive integer value, or

    • a string in the form “x/y”, where “x” is the number of frames with motion detected out of a sliding window of “y” frames.

    This defaults to “10/20”. You can override the global default value by setting consecutive_frames in the [motion] section of .stbt.conf.

  • noise_threshold (int) – See detect_motion.

  • mask (str|numpy.ndarray|Mask|Region) – See detect_motion.

  • region (Region) – See detect_motion.

  • frames (Iterator[stbt.Frame]) – See detect_motion.
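The “x/y” form of consecutive_frames can be read as a sliding-window vote. The sketch below parses the value and scans a list of per-frame booleans; both function names are hypothetical, and treating a plain integer n as “n/n” is an assumption of this sketch rather than something stated above.

```python
from collections import deque


def parse_consecutive_frames(value):
    """Return (motion_frames, window_size) for an int or an "x/y" string."""
    if isinstance(value, int):
        return value, value  # Assumption: n means n out of n frames.
    motion_frames, window_size = (int(part) for part in value.split("/"))
    return motion_frames, window_size


def motion_confirmed_index(frame_has_motion, consecutive_frames="10/20"):
    """Index of the frame at which enough frames in the window had motion."""
    needed, window_size = parse_consecutive_frames(consecutive_frames)
    window = deque(maxlen=window_size)  # Old frames drop out automatically.
    for index, has_motion in enumerate(frame_has_motion):
        window.append(bool(has_motion))
        if sum(window) >= needed:
            return index
    return None


print(parse_consecutive_frames("10/20"))
print(motion_confirmed_index([0, 1, 0, 1, 1], "3/5"))
```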


MotionResult when motion is detected. The MotionResult’s time and frame attributes correspond to the first frame in which motion was detected.


MotionTimeout if no motion is detected after timeout_secs seconds.

Changed in v33: mask accepts anything that can be converted to a Mask using load_mask. The region parameter is deprecated; pass your Region to mask instead. You can’t specify mask and region at the same time.

Changed in v34: The difference-detection algorithm takes color into account. The noise_threshold parameter changed range (from 0.0-1.0 to 0-255), sense (from “bigger is stricter” to “smaller is stricter”), and default value (from 0.84 to 25).

initial_frame: Frame | None = None,
mask: Mask | Region | str = Region.ALL,
region: Region = Region.ALL,
timeout_secs: float = 10,
stable_secs: float = 1,
min_size: tuple[int, int] | None = None,
frames: Iterator[Frame] | None = None,
) Transition#

Wait for the screen to stop changing.

In most cases you should use press_and_wait to measure a complete transition, but if you need to measure several points during a single transition you can use wait_for_transition_to_end as the last measurement. For example:

keypress = stbt.press("KEY_OK")  # Launch my app
m = stbt.wait_for_match("my-app-home-screen.png")
time_to_first_frame = m.time - keypress.start_time
end = wait_for_transition_to_end(m.frame)
time_to_fully_populated = end.end_time - keypress.start_time

See press_and_wait.


Wait for changes in the RMS audio volume.

This can be used to listen for the start of content, or for bleeps and bloops when navigating the UI. It returns after the first significant volume change. This function tries hard to give accurate timestamps for when the volume changed. It works best for sudden changes like a beep.

This function detects changes in volume using a rolling window. The RMS volume is calculated over a rolling window of size window_size_secs. For every sample this function compares the RMS volume in the window preceding the sample, to the RMS volume in the window following the sample. The ratio of the two volumes determines whether the volume change is significant or not.
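The before/after comparison can be sketched on a list of samples. This is a simplified illustration, not the function's implementation: the names are hypothetical, the window is measured in samples rather than seconds, the signal is synthetic, and picking the sample with the largest ratio is a simplification chosen for this sketch.

```python
import math


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def change_db(samples, index, window, noise_floor_amplitude=0.0003):
    """Ratio (in dB) of the RMS after `index` to the RMS before it."""
    before = max(rms(samples[index - window:index]), noise_floor_amplitude)
    after = max(rms(samples[index:index + window]), noise_floor_amplitude)
    return 20 * math.log10(after / before)


def loudest_change(samples, window):
    """Sample index with the largest before/after ratio, and that ratio in dB."""
    candidates = range(window, len(samples) - window + 1)
    index = max(candidates, key=lambda i: change_db(samples, i, window))
    return index, change_db(samples, index, window)


# Synthetic signal: near silence, then a square-wave "beep" at amplitude 0.5.
quiet = [0.001, -0.001] * 50
beep = [0.5, -0.5] * 50
index, difference_db = loudest_change(quiet + beep, window=20)
print(index, round(difference_db, 2))
```

The ratio peaks at the sample where the window before it is entirely quiet and the window after it is entirely loud, which is why a sudden change such as a beep gives the sharpest timestamp.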

Example: Measure the latency of the mute button:

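keypress = stbt.press('KEY_MUTE')
# Muting makes the audio quieter, so wait for a decrease in volume.
quiet = wait_for_volume_change(direction=VolumeChangeDirection.QUIETER)
print("MUTE latency: %0.3f s" % (quiet.time - keypress.start_time))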

Example: Measure A/V sync between “beep.png” being displayed and a beep being heard:

video = wait_for_match("beep.png")
audio = wait_for_volume_change(
    stream=audio_chunks(time_index=video.time - 0.5))
print("a/v sync: %i ms" % ((video.time - audio.time) * 1000))

  • direction (VolumeChangeDirection) – Whether we should wait for the volume to increase or decrease. Defaults to VolumeChangeDirection.LOUDER.

  • stream (Iterator returned by audio_chunks) – Audio stream to listen to. Defaults to audio_chunks(). Postcondition: the stream will be positioned at the time of the volume change.

  • window_size_secs (float) – The time over which the RMS volume should be averaged. Defaults to 0.4 (400ms) in accordance with momentary loudness from the EBU TECH 3341 specification. Decrease this if you want to detect bleeps shorter than 400ms duration.

  • threshold_db (float) – This controls sensitivity to volume changes. A volume change is considered significant if the ratio between the volume before and the volume afterwards is greater than threshold_db. With threshold_db=10 (the default) and direction=VolumeChangeDirection.LOUDER the RMS volume must increase by 10 dB (a factor of 3.16 in amplitude). With direction=VolumeChangeDirection.QUIETER the RMS volume must fall by 10 dB.

  • noise_floor_amplitude (float) – This is used to avoid ZeroDivisionError exceptions. The change from an amplitude of 0 to 0.1 is ∞ dB. This isn’t very practical to deal with so we consider 0 amplitude to be this non-zero value instead. It defaults to ~0.0003 (-70dBov). Increase this value if there is some sort of background noise that you want to ignore.

  • timeout_secs (float) – Timeout in seconds. If no significant volume change is found within this time, VolumeChangeTimeout will be raised and your test will fail.


VolumeChangeTimeout – If no volume change is detected before timeout_secs.


An object with the following attributes:

  • direction (VolumeChangeDirection) – This will be either VolumeChangeDirection.LOUDER or VolumeChangeDirection.QUIETER as given to wait_for_volume_change.

  • rms_before (RmsVolumeResult) – The RMS volume averaged over the window immediately before the volume change. Use result.rms_before.amplitude to get the RMS amplitude as a float.

  • rms_after (RmsVolumeResult) – The RMS volume averaged over the window immediately after the volume change.

  • difference_db (float) – Ratio between rms_after and rms_before, in decibels.

  • difference_amplitude (float) – Difference between the rms_after and rms_before amplitudes, as a plain amplitude value rather than a ratio. This is a number in the range -1.0 to +1.0.

  • time (float) – The time of the volume change, as number of seconds since the unix epoch (1970-01-01T00:00:00Z). This is the same format used by the Python standard library function time.time() and stbt.Frame.time.

  • window_size_secs (float) – The size of the window over which the volume was averaged, in seconds.

callable_: Callable[[], T],
timeout_secs: float = 10,
interval_secs: float = 0,
predicate: None = None,
stable_secs: Literal[0] = 0,
) T#
callable_: Callable[[], T],
timeout_secs: float = 10,
interval_secs: float = 0,
predicate: Callable[[T], Any] | None = None,
stable_secs: float = 0,
) T | None

Wait until a condition becomes true, or until a timeout.

Calls callable_ repeatedly (with a delay of interval_secs seconds between successive calls) until it succeeds (that is, it returns a truthy value) or until timeout_secs seconds have passed.

  • callable_ – Any Python callable (such as a function or a lambda expression) with no arguments.

  • timeout_secs (int or float, in seconds) – After this timeout elapses, wait_until will return the last value that callable_ returned, even if it’s falsey.

  • interval_secs (int or float, in seconds) – Delay between successive invocations of callable_.

  • predicate – A function that takes a single value. It will be given the return value from callable_. The return value of this function will then be used to determine truthiness. If the predicate test succeeds, wait_until will still return the original value from callable_, not the predicate value.

  • stable_secs (int or float, in seconds) – Wait for callable_’s return value to remain the same (as determined by ==) for this duration before returning. If predicate is also given, the values returned from predicate will be compared.


The return value from callable_ (which will be truthy if it succeeded, or falsey if wait_until timed out). If the value was truthy when the timeout was reached but it failed the predicate or stable_secs conditions (if any) then wait_until returns None.
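The polling behaviour and return-value rules described above can be sketched as a loop. `poll_until` is a hypothetical, simplified stand-in: it leaves out stable_secs and is not the library's implementation.

```python
import time


def poll_until(callable_, timeout_secs=10, interval_secs=0, predicate=None):
    """Simplified sketch of the polling loop described above (no stable_secs)."""
    if predicate is None:
        predicate = lambda value: value  # Plain truthiness.
    deadline = time.monotonic() + timeout_secs
    while True:
        value = callable_()
        if predicate(value):
            return value  # The original value, not the predicate's result.
        if time.monotonic() >= deadline:
            # Timed out: a falsey value is returned as-is; a truthy value
            # that failed the predicate is reported as None.
            return None if value else value
        time.sleep(interval_secs)


results = iter([None, 0, "ready"])
print(poll_until(lambda: next(results), timeout_secs=5))
print(poll_until(lambda: 5, timeout_secs=0.05, predicate=lambda x: x > 10))
```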

After you send a remote-control signal to the device-under-test it usually takes a few frames to react, so a test script like this would probably fail:

stbt.press("KEY_EPG")
assert stbt.match("guide.png")

Instead, use this:

import stbt
from stbt import wait_until

stbt.press("KEY_EPG")
assert wait_until(lambda: stbt.match("guide.png"))

wait_until allows composing more complex conditions, such as:

# Wait until something disappears:
assert wait_until(lambda: not stbt.match("xyz.png"))

# Assert that something doesn't appear within 10 seconds:
assert not wait_until(lambda: stbt.match("xyz.png"))

# Assert that two images are present at the same time:
assert wait_until(lambda: stbt.match("a.png") and stbt.match("b.png"))

# Wait but don't raise an exception if the image isn't present:
if not wait_until(lambda: stbt.match("xyz.png")):
    pass  # Handle the missing image here.

# Wait for a menu selection to change. Here ``Menu`` is a `FrameObject`
# subclass with a property called `selection` that returns the name of
# the currently-selected menu item. The return value (``menu``) is an
# instance of ``Menu``.
menu = wait_until(Menu, predicate=lambda x: x.selection == "Home")

# Wait for a match to stabilise position, returning the first stable
# match. Used in performance measurements, for example to wait for a
# selection highlight to finish moving:
keypress = stbt.press("KEY_DOWN")
match_result = wait_until(lambda: stbt.match("selection.png"),
                          predicate=lambda x: x and x.region,
                          stable_secs=2)
assert match_result
match_time = match_result.time  # this is the first stable frame
print("Transition took %s seconds" % (match_time - keypress.end_time))