24 Sep 2015. By William Manley.

In the previous blog post we discussed the frame object pattern and how it could lead to greater agility and lower maintenance costs thanks to improved self-testability. This blog post discusses a technique to get the maximum value from this testability at near-zero cost.

All software has costs associated with it. Costs to write in the first place, costs to fix bugs, and costs associated with ongoing maintenance. Having automated test-cases reduces these costs. Of course test-cases are software themselves, with their own associated costs. There’s a strong argument that we should test the tests, but at some point the marginal cost of an additional layer of self-testing will exceed its value.

Here’s how we reduce the cost of some self-tests (tests that test the test-pack itself) to near zero: We automatically generate a set of Python doctests which capture the behaviour of our frame objects based on a corpus of images.

Doctests and the frame object pattern

In the previous blog post we said that Frame Objects should define a __repr__ method to make them usable with doctests. Here’s what a doctest based on last week’s GuideFrame frame-object might look like:

import cv2

from guide import GuideFrame  # assuming GuideFrame is defined in tests/guide.py

guide = cv2.imread('selftest/screenshots/guide.png')
live_tv = cv2.imread('selftest/screenshots/live-tv.png')


def selftest_GuideFrame():
    """
    >>> GuideFrame(frame=guide)
    GuideFrame(is_visible=True, programme_title='The One Show', current_time='19:00')
    >>> GuideFrame(frame=live_tv)
    GuideFrame(is_visible=False)
    """
    pass

This is a clean, human-readable way of recording our expectations of the GuideFrame class when it is given a variety of different images. We defined GuideFrame.__repr__() to report as much information as it could extract from the frame, and that is exactly what we see here.
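
For reference, the shape of such a __repr__ might look like this. This is a minimal sketch: the property values are hard-coded stand-ins, whereas the real GuideFrame from the previous post computes them from the frame with image matching and OCR:

class GuideFrame(object):
    def __init__(self, frame):
        self.frame = frame

    # Hard-coded stand-ins for illustration; the real properties are
    # computed from self.frame:
    is_visible = True
    programme_title = 'The One Show'
    current_time = '19:00'

    def __repr__(self):
        # Report every piece of information we managed to extract; if the
        # guide isn't visible there is nothing more to say.
        if self.is_visible:
            return (
                "GuideFrame(is_visible=True, programme_title=%r, "
                "current_time=%r)" % (self.programme_title, self.current_time))
        else:
            return "GuideFrame(is_visible=False)"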

We can verify that the behaviour of the GuideFrame class is as expected with the command:

python -m doctest selftest_guide.py

We can run this command every time we make a change to the test-pack, and preferably as part of selftest CI (a subject for a future blog post). This gives us confidence that our changes haven't caused any regressions, and with that confidence comes agility. Fantastic so far…
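
If a change does break an expectation, doctest pinpoints exactly which example failed and how. A regression would be reported something like this (illustrative output):

**********************************************************************
File "selftest_guide.py", line 7, in selftest_guide.selftest_GuideFrame
Failed example:
    GuideFrame(frame=guide)
Expected:
    GuideFrame(is_visible=True, programme_title='The One Show', current_time='19:00')
Got:
    GuideFrame(is_visible=False)
**********************************************************************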

It’s not so arduous to write these tests out by hand when your test-image corpus is small and you have just a few frame objects, but the task quickly becomes tedious, particularly when we are extracting many pieces of information and supporting more and more UI variants. Adding a new property to our frame object now requires updating as many doctests as we have example frames.

Enter automation

Doctests are validated automatically, and they can be generated automatically too. We’ve created a tool to generate doctest files based on some annotations added to our test-case modules.

For example, at the beginning of guide.py we might have:

# Expressions to generate doctests from. For each screenshot, {frame} is
# replaced with the name of the variable holding that screenshot:
AUTO_REGRESSION_TEST_FUNCTIONS = [
    'GuideFrame(frame={frame})',
    'WarningDialogFrame(frame={frame})',
]

# Glob patterns (relative to selftest/screenshots/) selecting the
# screenshots to feed to each of the above expressions:
AUTO_REGRESSION_TEST_SCREENSHOTS = [
    '*dialog*.png',
    'guide*.png',
    'live-tv*.png',
]

We would then run our tool:

./auto-selftest.py -o selftest/autoselftest_guide.py tests/guide.py

and it would generate this:

guide = cv2.imread('selftest/screenshots/guide.png')
guide2 = cv2.imread('selftest/screenshots/guide2.png')
live_tv = cv2.imread('selftest/screenshots/live-tv.png')
warning_dialog = cv2.imread('selftest/screenshots/warning-dialog.png')


def selftest_GuideFrame():
    """
    >>> GuideFrame(frame=guide)
    GuideFrame(is_visible=True, programme_title='The One Show', current_time='19:00')
    >>> GuideFrame(frame=guide2)
    GuideFrame(is_visible=True, programme_title='Top Gear', current_time='15:00')
    >>> GuideFrame(frame=live_tv)
    GuideFrame(is_visible=False)
    >>> GuideFrame(frame=warning_dialog)
    GuideFrame(is_visible=False)
    """
    pass


def selftest_WarningDialogFrame():
    """
    >>> WarningDialogFrame(frame=guide)
    WarningDialogFrame(is_visible=False)
    >>> WarningDialogFrame(frame=guide2)
    WarningDialogFrame(is_visible=False)
    >>> WarningDialogFrame(frame=live_tv)
    WarningDialogFrame(is_visible=False)
    >>> WarningDialogFrame(frame=warning_dialog)
    WarningDialogFrame(is_visible=True, message='No signal')
    """
    pass
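
Conceptually the generator is simple: load each matching screenshot, evaluate each expression template against it, and record the repr of the result as the expected doctest output. Here's a minimal sketch of how such a generator might work (illustrative only, not the real auto-selftest.py; the helper names are made up):

import glob
import os
import re

import cv2


def _varname(path):
    # "selftest/screenshots/live-tv.png" -> "live_tv"
    return re.sub(r"\W", "_", os.path.splitext(os.path.basename(path))[0])


def generate_selftest(module, out):
    # Every screenshot matching the module's glob patterns:
    paths = sorted(set(
        p for pattern in module.AUTO_REGRESSION_TEST_SCREENSHOTS
        for p in glob.glob(os.path.join("selftest/screenshots", pattern))))
    frames = dict((_varname(p), cv2.imread(p)) for p in paths)

    # Emit the screenshot-loading preamble:
    for p in paths:
        out.write("%s = cv2.imread(%r)\n" % (_varname(p), p))

    # One selftest function per expression template. Each ">>>" line is the
    # template with {frame} filled in; the expected output is the repr of
    # evaluating that expression right now. In other words we record the
    # frame objects' *current* behaviour as the expectation.
    for template in module.AUTO_REGRESSION_TEST_FUNCTIONS:
        out.write('\n\ndef selftest_%s():\n    """\n' % template.split("(")[0])
        for p in paths:
            expr = template.format(frame=_varname(p))
            out.write("    >>> %s\n    %r\n"
                      % (expr, eval(expr, vars(module), frames)))
        out.write('    """\n    pass\n')

The key point is that the generator doesn't know what the right answers are; it merely pins down what the answers are today, so that any future change to those answers shows up as a diff.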

Then we commit this generated file to git. It may seem unusual to commit a generated file, but it has a fantastic effect. See this diff from a real fix that we applied to the helper function introduced in a previous blog post:

diff --git a/tests/utils.py b/tests/utils.py
--- a/tests/utils.py
+++ b/tests/utils.py
@@ -990,7 +990,7 @@ def find_contiguous_region(point, frame=None):
     if frame is None:
         frame = stbt.get_frame()
     _, rect = cv2.floodFill(
-        frame, None, point, None,
+        frame, None, point, None, (1, 1, 1), (1, 1, 1),
         flags=cv2.FLOODFILL_FIXED_RANGE | cv2.FLOODFILL_MASK_ONLY)
     region = stbt.Region(*rect)
     stbt.debug("find_contiguous_region(..., %s) -> %s" % (point, str(region)))
diff --git a/tools/auto_regression_tests/players_regression_test.py b/tools/auto_regression_tests/players_regression_test.py
--- a/tools/auto_regression_tests/players_regression_test.py
+++ b/tools/auto_regression_tests/players_regression_test.py
@@ -272,7 +272,7 @@ def regression_test_find_iplayer_ui():
     >>> find_iplayer_ui(frame=iplayer_parental_controls_dialogue)
     'dialog box'
     >>> find_iplayer_ui(frame=iplayer_playing_bbc_london_news)
-    IPlayerContentOverlay(is_visible=True, title=None)
+    IPlayerContentOverlay(is_visible=True, title=u'BBC London News')
     >>> find_iplayer_ui(frame=iplayer_playing_madagascar)
     IPlayerContentOverlay(is_visible=True, title=u'Madagascar')
     >>> find_iplayer_ui(frame=iplayer_playing_the_one_show)

By looking at the diff from one commit to the next you can see the change to the test code, and also the effect of that change! In the regression-test output you can see that we are now correctly reading “BBC London News”, and, more significantly, that we haven’t caused a regression in any of the other results.

In this instance we had added the example screenshot iplayer-playing-bbc-london-news.png in a previous commit, recording the then-incorrect behaviour. As a result the diff above captures both the bug and its fix.

This makes creating test-cases easier and cheaper and improves agility:

  • Creating Frame Objects is easier: you just need some screenshots. Checking that they work is automatic.
  • Code review is easier: the changes are captured alongside the results.
  • It’s easier to test multiple hardware or UI variants: you can be sure that any changes you make to support these variants work correctly and don’t cause any regressions.
  • The auto-regression-tests serve as a form of documentation – much like manually written doctests do.

Given this, the process we follow when fixing a defect in our test-cases is as follows (see the command sketch after the list):

  1. Find a frame of video that reproduces the issue.
  2. Commit it to our repo with the incorrect behaviour recorded in the auto-self-tests.
  3. Fix the issue.
  4. Verify the fix by checking that the incorrect recorded behaviour is now corrected and that there have been no regressions in the other results.
  5. Commit both the fix and the changes to the auto-self-tests to git.
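
Concretely, using the guide example from earlier, that might look something like this (illustrative commands; the screenshot name and commit messages are made up):

# Steps 1 & 2: add the screenshot that reproduces the issue, regenerate,
# and commit. This records the *incorrect* behaviour:
cp capture.png selftest/screenshots/guide-missing-title.png
./auto-selftest.py -o selftest/autoselftest_guide.py tests/guide.py
git add selftest/ && git commit -m "Add screenshot reproducing title bug"

# Steps 3 & 4: fix the issue in tests/guide.py, regenerate, and inspect
# the diff: the previously-wrong result should now be correct and every
# other result should be unchanged:
./auto-selftest.py -o selftest/autoselftest_guide.py tests/guide.py
git diff

# Step 5: commit the fix and the updated auto-self-tests together:
git commit -am "Fix reading of programme title in GuideFrame"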

We’re currently looking at adding this tool to stbt proper.

Update 2016-03-31:
A new command stbt auto-selftest was merged to stb-tester git master and will be in the stb-tester v25 release. It’s easier to use than described above:
  • Each FrameObject has tests generated for it automatically.
  • You don’t need to list the screenshots to be included: they all are by default.
  • You don’t need to specify the input and output files: tests are generated for each of the Python files in your test-pack.

See PR #343 for more information.