New features in v33: Using segmentation to find GUI elements
11 Nov 2022.
In image processing, “segmentation” means finding which pixels belong to foreground objects (such as text or other GUI elements) versus background pixels. Stb-tester v33 adds a new API called segment (pronounced like the verb: segMENT) that finds the location of distinct foreground elements.
For example, let’s find the location of each “poster” or “tile” in this screenshot:
By default, segment starts from the top of the screen and travels down, finding distinct rows:
frame = stbt.get_frame()
regions = stbt.segment(frame)
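To get a feel for what "scanning down the screen" means, here is a minimal sketch in plain Python that doesn't use stbt at all. It assumes the frame has already been reduced to a grid of 0s and 1s (1 meaning foreground). The helpers find_runs and find_rows are hypothetical names used for illustration only, and the real segment implementation may well work differently.

```python
def find_runs(flags):
    """Return (start, end) pairs for each run of truthy flags (end is exclusive)."""
    runs = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i                    # a new run begins here
        elif not flag and start is not None:
            runs.append((start, i))      # the run ended on the previous index
            start = None
    if start is not None:
        runs.append((start, len(flags)))  # run reaches the end of the input
    return runs


def find_rows(mask):
    """Scan top to bottom: a row is a run of lines that contain any foreground."""
    return find_runs([any(line) for line in mask])


# Hypothetical 6x5 foreground mask (assumption: already binarized).
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
]
rows = find_rows(mask)
print(rows)  # [(1, 3), (4, 5)]
```

Each pair is the top and bottom (exclusive) of one row, separated from its neighbours by lines that contain only background.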
Next, we can discard rows that are too small, and then run segment again on each remaining row, this time horizontally: starting from the left of the row and moving right.
tiles = []
for row in regions:
    if row.height > 100:
        tiles.extend(stbt.segment(
            frame, region=row,
            initial_direction=stbt.Direction.HORIZONTAL))
We can combine both of these steps into a single call to segment, by specifying steps=2. The steps alternate between vertical and horizontal:
regions = stbt.segment(frame, steps=2)
Then we can discard regions that are too small to be a tile:
tiles = [r for r in stbt.segment(frame, steps=2) if r.height > 100]
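The alternating scan and the size filter can be sketched in plain Python, again without stbt. This is only an illustration under stated assumptions: the frame is a grid of 0s and 1s, boxes are (x1, y1, x2, y2) tuples with exclusive right and bottom edges, and segment_sketch is a hypothetical function that does no narrowing. It is not how stbt implements segment.

```python
def find_runs(flags):
    """Return (start, end) pairs for each run of truthy flags (end is exclusive)."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs


def segment_sketch(mask, box, vertical=True, steps=1):
    """Split `box` along one axis, then recurse with the other axis for each step."""
    x1, y1, x2, y2 = box
    if vertical:
        # Top to bottom: one flag per line of pixels inside the box.
        flags = [any(mask[y][x1:x2]) for y in range(y1, y2)]
        found = [(x1, y1 + a, x2, y1 + b) for a, b in find_runs(flags)]
    else:
        # Left to right: one flag per column of pixels inside the box.
        flags = [any(mask[y][x] for y in range(y1, y2)) for x in range(x1, x2)]
        found = [(x1 + a, y1, x1 + b, y2) for a, b in find_runs(flags)]
    if steps <= 1:
        return found
    result = []
    for sub in found:
        # Next step: same procedure inside each region, in the other direction.
        result.extend(segment_sketch(mask, sub, not vertical, steps - 1))
    return result


mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
]
boxes = segment_sketch(mask, (0, 0, 6, 5), steps=2)
# Keep only boxes at least 2 pixels tall (the toy equivalent of height > 100).
tiles = [b for b in boxes if b[3] - b[1] >= 2]
print(tiles)  # [(1, 1, 3, 3), (4, 1, 5, 3)]
```

The first step finds two rows; the second step splits the upper row into two boxes and the lower row into one, which the height filter then discards.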
Alternatively, if we know where the tiles are, we could have limited the search by specifying the region parameter of segment, instead of searching the whole frame.
Note that some of the regions aren’t the same height, because the bottom of the poster is too similar to the background color. If we set narrow=False then segment will keep the top and bottom coordinates of the row from the first step:
tiles = [r for r in stbt.segment(frame, steps=2, narrow=False) if r.height > 100 and r.width > 40]
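To see why narrowing produces tiles of different heights, here is a small plain-Python sketch based on our reading of the description above. It assumes a 0/1 foreground grid and (x1, y1, x2, y2) boxes with exclusive right and bottom edges. The function narrow_vertically is a hypothetical helper, not part of stbt, and the library's actual behaviour may differ in detail.

```python
def narrow_vertically(mask, box):
    """Shrink the box's top and bottom to the foreground it actually contains."""
    x1, y1, x2, y2 = box
    lines = [y for y in range(y1, y2) if any(mask[y][x1:x2])]
    if not lines:
        return box  # nothing to narrow to; leave the box unchanged
    return (x1, lines[0], x2, lines[-1] + 1)


# The bottom line of the second "poster" blends into the background (all 0s).
mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

# Columns found inside the row y=1..4, still carrying the row's top and bottom.
unnarrowed = [(1, 1, 3, 4), (4, 1, 6, 4)]
narrowed = [narrow_vertically(mask, box) for box in unnarrowed]

print([b[3] - b[1] for b in unnarrowed])  # [3, 3]
print([b[3] - b[1] for b in narrowed])    # [3, 2]
```

Without narrowing, both boxes inherit the full height of the row; with narrowing, the second box loses the line where the poster is indistinguishable from the background.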