Testing on-screen keyboards, part 1: Identifying the selection
16 Oct 2019.
Stb-tester v31 added new APIs that make it much easier to navigate on-screen keyboards from your test scripts. In this series of tutorials we will create a Page Object that knows how to navigate YouTube’s search keyboard on the Apple TV.
Before we implement any navigation, we need to understand what the screen is showing us: Where is the current selection? In the screenshot above, the selection is on “a”. Any navigation function (for example to type the letter “p”) needs to know where the selection is, so that it can figure out what buttons it needs to press to get to the target.
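To see why the starting position matters, here is a minimal sketch in plain Python (it does not use Stb-tester's API). It computes the key presses needed to move from one key to another. The 6×6 lowercase layout and the key names such as `KEY_DOWN` are assumptions for illustration, and the sketch ignores the bottom row and any wrap-around behaviour.

```python
# Hypothetical 6x6 lowercase layout, assumed for illustration only.
LAYOUT = ["abcdef", "ghijkl", "mnopqr", "stuvwx", "yz1234", "567890"]


def find_key(char):
    """Return the (row, column) of `char` in LAYOUT."""
    for row, keys in enumerate(LAYOUT):
        col = keys.find(char)
        if col != -1:
            return row, col
    raise ValueError("%r is not on the keyboard" % (char,))


def presses(selection, target):
    """List the remote-control keys that move `selection` onto `target`."""
    row, col = find_key(selection)
    target_row, target_col = find_key(target)
    dy, dx = target_row - row, target_col - col
    # Move vertically first, then horizontally.
    vertical = ["KEY_DOWN" if dy > 0 else "KEY_UP"] * abs(dy)
    horizontal = ["KEY_RIGHT" if dx > 0 else "KEY_LEFT"] * abs(dx)
    return vertical + horizontal
```

With this layout, `presses("a", "p")` needs two presses down and three to the right, whereas `presses("g", "p")` needs only one press down: the same target requires different key presses depending on where the selection starts.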
In Stb-tester, we extract information from the screen by implementing a Page Object. Our “Search” Page Object will implement two properties:
- is_visible returns True if the search page is present on the screen.
- selection returns the text of the current selection.
To keep this tutorial simple, we will only support the lowercase keyboard. This wouldn’t be suitable for testing a login keyboard where you need to type in a password, but it should be enough for testing a search page. In a future tutorial we will extend this example to allow entering uppercase letters and symbols.
We’ll use image matching (stbt.match) to determine if the Search page is visible. For our reference image, we’ll take a screenshot and mask out the areas with dynamic content:
- The white selection can move anywhere within the keyboard.
- The text that you’ve typed appears at the top next to the magnifying-glass icon.
- Search results appear in the upper-right part of the page.
For more details about this masking technique, see Using match transparency: Determining which page we’re on.
Since we’re not supporting uppercase & symbol modes, I have kept the “abc”, “ABC”, and “#+-” buttons in the reference image to provide some structure. Otherwise the reference image would be relatively featureless and it might end up matching a black screen! We don’t expect the tests to move the selection onto these buttons, nor to change the mode (which would make the “ABC” or “#+-” button brighter, so the screen would stop matching our reference image).
The code looks like this:
(Aside: It’s a bit of a shame that the thing we’re most interested in —the keyboard!— is excluded from the reference image. In a future article we’ll discuss a better technique for matching reference images that have a movable selection.)
To find the selection we’ll use stbt.match again, but this time we’ll ask it to find the white selection anywhere within the keyboard’s region:
Some buttons are wider than others, so I have excluded the right edge of the selection. If your keyboard has buttons of variable height, then you should also exclude the bottom edge so that you are only matching the top left corner. Mask any areas where text can appear, by setting those pixels transparent.
Here’s the code:
So far we have found the region (position) of the selection. We still need to find the text or name of the button at that position. OCR won’t be reliable enough for this: if you give a single button’s region to the OCR engine, it will be missing a lot of the context that it normally gets from surrounding text (such as the size of lowercase vs. uppercase letters, where the baseline is, etc.). For example: Is the button to the right an “l” (lowercase L), an “I” (uppercase i), or a “|” (vertical bar)?
Instead, we will hard-code the text of each button in our Page Object, and we’ll look it up by region. stbt.Grid provides a convenient API for specifying grid-like keyboards:
The code above specifies a 6×6 grid for the first 6 rows, and a 3×1 grid for the bottom row. stbt.Grid expects regular, equal-sized cells, which is why we need a separate grid for the bottom row.
data is a 2D matrix, or list of lists. Actually it’s a list of iterables, which is why we can provide a list of strings: iterating over a string yields one character at a time. We could also have specified it like this:
…but the first way is easier to type and easier to read. For BOTTOM_GRID we do have to use a list of lists because the button names are longer than a single character.
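The equivalence between the two forms can be checked in plain Python, without Stb-tester. The key layout and the bottom-row button names below are assumptions for illustration, not necessarily the names used in this tutorial's code.

```python
# Each string is one row of the keyboard (layout assumed for illustration).
ROWS_AS_STRINGS = ["abcdef", "ghijkl", "mnopqr", "stuvwx", "yz1234", "567890"]

# Iterating over a string yields one character at a time, so expanding
# every row gives the equivalent list of lists.
ROWS_AS_LISTS = [list(row) for row in ROWS_AS_STRINGS]

# Bottom-row names are longer than one character, so each name has to be
# its own list element. These three names are hypothetical.
BOTTOM_ROW = [[" ", "DELETE", "CLEAR"]]

# Passing a plain string instead would split the name into single letters.
SPLIT_BY_MISTAKE = list("DELETE")
```

Here `ROWS_AS_LISTS[1]` is `["g", "h", "i", "j", "k", "l"]`, while `SPLIT_BY_MISTAKE` is six separate one-letter entries rather than a single “DELETE” button.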
Here is a visualisation of the two grids:
Now we can use stbt.Grid.get to look up the selection’s match position within the grid:
For the screenshot above, Search().selection would return “g”.
Note that the white selection is larger than the unselected buttons, so the selections at two adjacent positions will overlap each other. This doesn’t really matter, and you don’t have to get the grid coordinates exactly right, because stbt.Grid.get looks up the selection’s position within the grid by looking at the centre of the region.
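The centre-based lookup can be illustrated with a simplified stand-in written in plain Python. This is a sketch of the idea, not stbt.Grid's implementation: the `SimpleGrid` class, the pixel coordinates, and the bottom-row button names are all hypothetical, and returning `None` for a position outside the grid is a choice made for this sketch only.

```python
from collections import namedtuple

# Simplified rectangle type; not stbt.Region.
Region = namedtuple("Region", "x y right bottom")


class SimpleGrid:
    """Stand-in for a grid of equal-sized cells (not stbt.Grid)."""

    def __init__(self, region, data):
        self.region = region
        self.data = [list(row) for row in data]
        self.rows = len(self.data)
        self.cols = len(self.data[0])

    def get(self, region):
        """Return the data of the cell that contains the centre of `region`."""
        cx = (region.x + region.right) / 2
        cy = (region.y + region.bottom) / 2
        r = self.region
        if not (r.x <= cx < r.right and r.y <= cy < r.bottom):
            return None  # the centre lies outside this grid
        col = int((cx - r.x) * self.cols / (r.right - r.x))
        row = int((cy - r.y) * self.rows / (r.bottom - r.y))
        return self.data[row][col]


# Hypothetical coordinates: 50x50 pixel cells, then a row of 100x50 cells.
GRID = SimpleGrid(
    Region(100, 200, 400, 500),
    ["abcdef", "ghijkl", "mnopqr", "stuvwx", "yz1234", "567890"])
BOTTOM_GRID = SimpleGrid(
    Region(100, 500, 400, 550),
    [[" ", "DELETE", "CLEAR"]])


def selection_text(selection_region):
    """Look up the selection in each grid in turn."""
    for grid in (GRID, BOTTOM_GRID):
        text = grid.get(selection_region)
        if text is not None:
            return text
    return None
```

In this sketch the “g” cell spans x 100 to 150 and y 250 to 300. A selection of `Region(92, 242, 158, 308)` spills over into all of its neighbours, but its centre is still inside the “g” cell, so the lookup is unaffected by the overlap.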
You can see & debug the Page Object in the Object Repository tab of your Stb-tester Portal:
See the full code from this tutorial here.
In part 2 we will implement a function to navigate this keyboard and enter some search text.