Testing on-screen keyboards, part 2: Navigating the keyboard
04 Nov 2019.
In part 1 we created a Page Object that knows how to identify the current selection of an on-screen keyboard, just from looking at the pixels on the screen. In this article we will teach our Page Object how to navigate the keyboard — that is, how to move the selection from its current position to a target letter. In this tutorial we are using the keyboard from the YouTube app on Apple TV.
By the end of this article we will have implemented a Page Object with an
enter_text method, so that you can write a test script like this:
Modelling the keyboard
First we will specify a Directed Graph that describes the behaviour of the keyboard under test. A “graph” (in the computer-science sense of the word) consists of nodes connected by edges. Each button on our keyboard will be a node in our graph, and the possible transitions from each button to its neighbours will be the edges:
Each edge will specify the remote-control button that you need to press to make
that transition. Stb-tester will use this graph to calculate the shortest path to the
target. For example, if the current selection is on “a”, to type the letter “p”
Stb-tester might press a short sequence of direction keys ending with
KEY_DOWN, and finally
KEY_OK to type the selected letter. Note that there
can be more than one shortest path.
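To make the shortest-path idea concrete, here is a minimal sketch using a breadth-first search over a small fragment of a keyboard. The layout (a and b above h and i, which sit above o and p) and the helper name `shortest_keys` are assumptions for illustration; this is not Stb-tester's own implementation.

```python
from collections import deque

# Hypothetical fragment of a keyboard graph: node -> {key: neighbour}.
# Assumed layout:  a b
#                  h i
#                  o p
GRAPH = {
    "a": {"KEY_RIGHT": "b", "KEY_DOWN": "h"},
    "b": {"KEY_LEFT": "a", "KEY_DOWN": "i"},
    "h": {"KEY_UP": "a", "KEY_RIGHT": "i", "KEY_DOWN": "o"},
    "i": {"KEY_UP": "b", "KEY_LEFT": "h", "KEY_DOWN": "p"},
    "o": {"KEY_UP": "h", "KEY_RIGHT": "p"},
    "p": {"KEY_UP": "i", "KEY_LEFT": "o"},
}


def shortest_keys(graph, start, target):
    """Return one shortest list of key presses from start to target."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, keys = queue.popleft()
        if node == target:
            return keys
        for key, neighbour in graph[node].items():
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, keys + [key]))
    raise ValueError("no path from %r to %r" % (start, target))


# Navigate from "a" to "p", then press OK to type the letter.
path = shortest_keys(GRAPH, "a", "p") + ["KEY_OK"]
print(path)
```

In this fragment there are three equally short routes from “a” to “p”; the search simply returns whichever one it finds first.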
There are several ways to specify this graph in your test scripts. We’ll start with the conceptually simplest way:
Specifying each edge explicitly
stbt.Keyboard’s constructor takes an “edge list”: A multiline string where
each line is in the format
<start_node> <end_node> <action>. For our YouTube
keyboard the first few lines look like this:
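As an illustration of the format, the sketch below shows a few edge-list lines and how they split into fields. The specific edges assume that “b” is to the right of “a” and “h” is below it, and `parse_edgelist` is a hypothetical helper, not part of Stb-tester's API.

```python
# Illustrative lines only: the exact neighbours are an assumption.
EDGELIST = """
    a b KEY_RIGHT
    a h KEY_DOWN
    b a KEY_LEFT
    b c KEY_RIGHT
"""


def parse_edgelist(text):
    """Split each non-empty line into (start_node, end_node, action)."""
    edges = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(
                "line %d: expected 3 fields, got %r" % (lineno, line))
        edges.append(tuple(fields))
    return edges


edges = parse_edgelist(EDGELIST)
for start, end, action in edges:
    print("%s --%s--> %s" % (start, action, end))
```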
See the whole edgelist here.
This is easy to understand, but it is tedious and error-prone to type. On the plus side, you only have to type it once; and the format is nicely diffable so you can see what changed if you need to update it in the future.
We’ll see other, more convenient ways of specifying the graph in part 3 of this tutorial series.
For the space character, you can’t put a literal space in the edgelist
because Stb-tester’s edgelist parser treats whitespace as the separator between
node names and actions. So call it
SPACE like this:
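The sketch below shows why a literal space can't work in a whitespace-separated line, and one way a name like SPACE could be mapped back to the character it types. The edge `SPACE c KEY_UP`, the `NODE_TEXT` table, and `node_for` are hypothetical, used only for illustration; they are not necessarily how Stb-tester handles this internally.

```python
# With a literal space as the node name, splitting on whitespace
# yields only two fields: the node name is lost.
broken_fields = "  c KEY_UP".split()

# With a descriptive name, the line has the expected three fields.
start, end, action = "SPACE c KEY_UP".split()

# Hypothetical table from descriptive node names to the text they type.
NODE_TEXT = {"SPACE": " "}


def node_for(char):
    """Return the node name to navigate to for one character of text."""
    for name, typed in NODE_TEXT.items():
        if typed == char:
            return name
    return char  # ordinary keys are named after their own character


nodes = [node_for(c) for c in "a b"]
print(nodes)
```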
For buttons that don’t enter a character when pressed, use a descriptive name.
Pressing KEY_UP from the
SPACE button might go to
5, or it might go to
6, depending on where you came from before you landed on SPACE.
Our graph doesn’t model this behaviour. Instead, we specify all of the
possible transitions. For example, here are all of the transitions from
SPACE in our edgelist:
Note that two of those transitions use the same action (
KEY_UP). This tells
us that to get from SPACE to
5 we need to press
KEY_UP, but if we
press it we can’t be sure whether we’ll end up on 5 or
6. When Stb-tester sees
that two outbound edges have the same action, it waits to see which node it
actually lands on, and then re-calculates a new path to the target.
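Spotting these ambiguous transitions is a matter of grouping edges by start node and action. The following is a minimal sketch under assumed data: the edges out of SPACE (including the `CLEAR` neighbour) are made up for illustration, and `ambiguous_actions` is a hypothetical helper rather than anything from Stb-tester.

```python
from collections import defaultdict

# Assumed transitions out of SPACE, for illustration only.
EDGES = [
    ("SPACE", "5", "KEY_UP"),
    ("SPACE", "6", "KEY_UP"),
    ("SPACE", "CLEAR", "KEY_RIGHT"),
]


def ambiguous_actions(edges):
    """Map (start_node, action) to its targets when there is more than one."""
    targets = defaultdict(set)
    for start, end, action in edges:
        targets[(start, action)].add(end)
    return {
        key: sorted(ends)
        for key, ends in targets.items()
        if len(ends) > 1
    }


ambiguous = ambiguous_actions(EDGES)
print(ambiguous)
```

After pressing a key that appears in this result, a navigator has to look at the screen again to find out where the selection actually landed before planning the next step.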
Of course, in the implementation of the keyboard-under-test this behaviour
isn’t nondeterministic; the system will remember some state so that pressing up
from SPACE behaves consistently according to that previous state. But
stbt.Keyboard doesn’t know about the implementation, so it models the
keyboard as a nondeterministic state machine.
Internally, Stb-tester assigns a large “weight” to these edges, to stop the shortest-path algorithm from choosing a shortcut that won’t actually happen on the real keyboard under test.
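Here is one way such weighting could work, sketched with Dijkstra's algorithm. The weight value of 100, the small graph (a “z” key to the left of SPACE, with 4, 5 and 6 in the row above), and the helper names are all assumptions; the source doesn't say which value or algorithm Stb-tester actually uses.

```python
import heapq
from collections import defaultdict

# Assumed fragment: pressing KEY_UP from SPACE may land on 5 or on 6.
EDGES = [
    ("z", "4", "KEY_UP"),
    ("z", "SPACE", "KEY_RIGHT"),
    ("4", "5", "KEY_RIGHT"),
    ("5", "6", "KEY_RIGHT"),
    ("SPACE", "5", "KEY_UP"),
    ("SPACE", "6", "KEY_UP"),
]
AMBIGUOUS_WEIGHT = 100  # arbitrary "large" cost; an assumption


def build_weighted_graph(edges):
    """Give cost 1 to ordinary edges and a large cost to ambiguous ones."""
    targets = defaultdict(set)
    for start, end, action in edges:
        targets[(start, action)].add(end)
    graph = defaultdict(list)
    for start, end, action in edges:
        ambiguous = len(targets[(start, action)]) > 1
        weight = AMBIGUOUS_WEIGHT if ambiguous else 1
        graph[start].append((end, action, weight))
    return graph


def cheapest_keys(graph, start, target):
    """Dijkstra's algorithm, returning the key presses of the cheapest path."""
    heap = [(0, start, [])]
    done = set()
    while heap:
        cost, node, keys = heapq.heappop(heap)
        if node == target:
            return keys
        if node in done:
            continue
        done.add(node)
        for end, action, weight in graph.get(node, []):
            if end not in done:
                heapq.heappush(heap, (cost + weight, end, keys + [action]))
    raise ValueError("no path from %r to %r" % (start, target))


keys = cheapest_keys(build_weighted_graph(EDGES), "z", "6")
print(keys)
```

Without the weights, the two-press route through SPACE would look shortest, even though pressing KEY_UP from SPACE is not guaranteed to land on 6. With the weights, the planner prefers the three-press route that only uses predictable transitions.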
Tying it all together
Now we have all the pieces we need to navigate the on-screen keyboard:
- A way to tell which button is currently selected (the
selection property from the Page Object we made in part 1).
- A graph that tells us the path from any button to any other button.
We’ll add a method to our Page Object called
enter_text. This will take a
text parameter, and it will type the text into the on-screen keyboard by
using Stb-tester’s stbt.Keyboard class:
stbt.Keyboard.enter_text will use the
selection property of its page
parameter to see which button is currently selected. Then it will loop over
each letter in
text: find a node in the graph with that name, navigate to it, and press
KEY_OK to type the letter.
Note that we convert the text to lowercase because all of our node names are in
lowercase (a, b, etc). In a future article we will see how to model a
keyboard with separate lowercase, uppercase, and symbol modes.
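The loop described above can be simulated without a real device. This is a minimal sketch, not Stb-tester's code: `FakeDevice`, the 2×2 keyboard, and the `key_path` and `enter_text` helpers are all hypothetical stand-ins.

```python
from collections import deque

# Hypothetical 2x2 keyboard: node -> {key: neighbour}.
GRAPH = {
    "a": {"KEY_RIGHT": "b", "KEY_DOWN": "c"},
    "b": {"KEY_LEFT": "a", "KEY_DOWN": "d"},
    "c": {"KEY_RIGHT": "d", "KEY_UP": "a"},
    "d": {"KEY_LEFT": "c", "KEY_UP": "b"},
}


class FakeDevice:
    """Stand-in for the device-under-test: tracks selection and typed text."""

    def __init__(self, selection="a"):
        self.selection = selection
        self.typed = ""

    def press(self, key):
        if key == "KEY_OK":
            self.typed += self.selection
        else:
            # Keys with no edge leave the selection where it is.
            self.selection = GRAPH[self.selection].get(key, self.selection)


def key_path(start, target):
    """Breadth-first search for one shortest list of key presses."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, keys = queue.popleft()
        if node == target:
            return keys
        for key, neighbour in GRAPH[node].items():
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, keys + [key]))
    raise ValueError("no path from %r to %r" % (start, target))


def enter_text(device, text):
    """For each letter: navigate to its node, then press OK to type it."""
    for letter in text.lower():  # node names are all lowercase
        if letter not in GRAPH:
            raise ValueError("no key for %r" % letter)
        for key in key_path(device.selection, letter):
            device.press(key)
        device.press("KEY_OK")


device = FakeDevice()
enter_text(device, "Bad")
print(device.typed, device.selection)
```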
Navigating to a single button
We can also navigate to a single button using stbt.Keyboard.navigate_to. For
example, we might want to provide a
clear method so that our test scripts can
clear any text that has been entered into the Search page:
Some keyboards have an explicit “SEARCH” button that you have to press after
typing the text. For those keyboards, our Page Object’s enter_text method
would look like this:
Common mistake: Using an outdated page instance
stbt.Keyboard.enter_text and stbt.Keyboard.navigate_to take a Page Object
instance in their
page parameter. This instance has a selection property
that reflects the position of the selection at the time the instance was
created. If this instance is out of date (because the selection has moved
since that time), then
stbt.Keyboard will calculate a path from the wrong
start position to your target node.
This is because Stb-tester’s Page Objects are immutable: An instance of the Page Object reflects the state of the device-under-test at the time the instance was created.
The following code won’t work:
To get the latest state, you can create a new instance of the Page Object like this:
Or like this:
(Here self is an instance of our
Search Page Object.)
For this purpose, stbt.Keyboard.enter_text and stbt.Keyboard.navigate_to
return a new page instance that reflects the state of the device-under-test
after the text has been entered (or the navigation completed). We can use
their return value instead of calling
self.refresh(). Here’s the corrected code:
Note that we have made our method return an updated page instance. This is
consistent with stbt.Keyboard’s behaviour, and it allows any testcases that
call our enter_text method to use the same pattern.
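The stale-instance pitfall, and the pattern of returning a fresh instance, can be demonstrated with a small simulation. Everything here is hypothetical: `FakeDevice`, `SearchPage.capture`, and this `navigate_to` function only mimic the behaviour described above and are not Stb-tester's classes.

```python
from dataclasses import dataclass


class FakeDevice:
    """Stand-in for the device-under-test; only tracks the selection."""

    def __init__(self):
        self.selection = "a"


@dataclass(frozen=True)
class SearchPage:
    """Immutable snapshot of the selection at the time it was captured."""

    selection: str

    @classmethod
    def capture(cls, device):
        return cls(selection=device.selection)


def navigate_to(device, target, page):
    """Pretend to press keys from page.selection to target.

    Returns a new snapshot that reflects the state after navigating.
    """
    device.selection = target
    return SearchPage.capture(device)


device = FakeDevice()
page = SearchPage.capture(device)

stale = page                           # snapshot taken before navigating
page = navigate_to(device, "p", page)  # keep the returned, fresh snapshot

print(stale.selection, page.selection)
```

The old snapshot still reports “a” even though the selection has moved, which is why a path calculated from it would start in the wrong place. Keeping the returned instance avoids that.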
See the full code from this tutorial here.
In part 3 we’ll see other ways of specifying the model (graph) of the keyboard.