Waiting Strategies — Appium and Selenium Automation

16 min read · Dec 3, 2023

⏳ Waiting for Application State — How to Address Race Conditions 🏃‍♀️

Race conditions are a common challenge testers encounter when finding elements. What happens when we try to find an element which just isn’t there yet? It’s a prominent issue in test automation, and it’s part of a more general phenomenon in functional tests: race conditions due to poor assumptions about app state.

Why Wait?

◕ Selenium and Appium aren’t all that smart. They don’t come with our eyes or our brain. They’re really just automation robots that know how to retrieve elements from the UI and act on them. They don’t understand any larger context for our app or what we’re actually trying to do with our app by means of the automation commands.
◕ The reality of web and mobile apps is that they are constantly changing. Sometimes apps are static enough that they don’t change in between actions we take with them. But more commonly, apps are making background network requests that change the state of the page to show us real time information, or update the images in a carousel, and so on. Even if apps don’t change on their own, sometimes it can take time for an app to respond to our action. If we click a button that causes a new page to load, or that causes some network request to happen, then we need to wait for the page to load or the request to finish before proceeding with our interaction. As humans, we can usually tell when we are supposed to wait for the app before interacting with it again. We don’t just start clicking around if we can tell that the app is already working.
◕ Put these things together and we have a problem. Selenium doesn’t have our brain, and it doesn’t really know what’s going on in the app. It just does things in the app when we tell it to. So what if we try and find an element that we know will be there, but isn’t there yet? What if the app is taking more than a few milliseconds to load the element as a result of a network call, as often happens? Then we’ll end up with a NoSuchElementException. In this case, the NoSuchElementException isn’t really an error, it’s more of a mistake in the way we coded up the test. This kind of mistake is called a race condition. A race condition is when two processes are operating at the same time, but success depends on one of them finishing before the other. In our example, the race is between our Selenium test trying to find an element on one hand, and the app doing whatever it needs for the element to be displayed on the other hand.
◕ It’s up to us to make sure our scripts don’t wind up in a race condition with the app. It’s up to us to teach our scripts to wait for elements.

Race Conditions

◕ A race condition is when two processes or procedures are operating simultaneously, and either one could finish before the other. This is a problem in automated test scripts because our code usually handles only one scenario. So, in the alternative world where the unintuitive procedure finishes first, our test will fail.
◕ To use our example of finding elements: our code usually implicitly assumes that the element will be present before we try to find it. In reality, the request to find an element and the app’s own process of working to display the element are in a race. As human users of the app, we know how to gracefully lose the race: we simply wait. If we want to tap a button and it’s not yet on the screen, we wait for it to show up.
◕ Appium and Selenium are less graceful, and do exactly what the client tells them to do, even if that means trying to find an element before it has been properly rendered on the screen. This is just one of many possible examples in the category of test code assuming the app is in a certain state, but being proven wrong. It can be a particularly pesky problem because it might only show up infrequently, or only in CI environments. When we develop test code locally, we can often be tricked into making all kinds of assumptions about how races will resolve. Just because the app always wins a race (as we’d expect) when testing locally does not mean the app will behave the same in other environments.
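
To make the race concrete, here is a minimal sketch that does not use Selenium at all. FakeApp is a hypothetical stand-in for an app that renders an element only after a delay, and the element name and timings are made up for illustration.

```python
import time


class FakeApp:
    """Hypothetical stand-in for an app that renders an element after a delay."""

    def __init__(self, render_delay_seconds):
        # The element only "exists" once this moment has passed.
        self._ready_at = time.monotonic() + render_delay_seconds

    def find_element(self, name):
        if time.monotonic() < self._ready_at:
            raise LookupError(f"no such element: {name}")
        return f"<{name}>"


def try_find(app, name):
    """Return the element, or None if the test code got there before the app."""
    try:
        return app.find_element(name)
    except LookupError:
        return None


app = FakeApp(render_delay_seconds=0.3)

# The script asks immediately and wins the race, which means the find fails.
first_attempt = try_find(app, "login-button")

# A human would simply wait; once the app has caught up, the same find works.
time.sleep(0.5)
second_attempt = try_find(app, "login-button")

print(first_attempt, second_attempt)  # None <login-button>
```

The same lookup fails or succeeds purely depending on timing, which is why this kind of failure tends to appear only on slower machines or networks.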

Waiting for App States

There are three different waiting strategies we can use to teach our scripts how to wait for elements:

① Static Wait — wait for a hard-coded amount of time, e.g., using time.sleep(n).
② Implicit Wait — use the WebDriver server’s built-in element finding retry, up to a timeout.
③ Explicit Wait — use Expected Conditions to poll the app for the appropriate state.

Ultimately I think we should just use one of them, but it’s good to recognize and understand all the options. First, we have the strategy of waiting statically. Second, we have something called an implicit wait, which relies on a feature built into the WebDriver server itself. Finally, we have the recommended strategy, explicit waits, which use client-side features to poll the app for any kind of state. Let’s discuss each of these in turn.

◉ Static Waits

When we code a static wait into our test, we’re basically pausing our test script for a certain amount of time, to give the app some time to get into the state we want. We could think of it like this. Imagine that when we click a button on our app, we then immediately close our eyes and start counting down from some number. When we’re done counting, we open our eyes and try to find the next element we need to interact with. If we found it, great! If we didn’t find it, then we have an error. In code, this would look like importing Python’s time module and calling its time.sleep function.

In general I don’t recommend using static waits. One big issue is that the amount of time we need to wait for an element to appear is often not fixed. It can depend on a lot of circumstances and change a lot from test run to test run. Slow internet traffic could make it so that an element which usually takes 1–2 seconds to appear now takes 10 seconds to appear. This might be a performance issue, but it might not be considered a failure of our app from a functional perspective, and so we don’t want it to fail our test. The only way to get around this with static waits is to use really long waits.

But now, we are waiting for a certain number of seconds every time we run our test, even the times where the element shows up quickly. Depending on how many waits we have and how many tests we have, we could be wasting minutes or even hours in our test suite just blindly waiting. Additionally, static waits are usually hard to interpret while reading test code, unless we make comments in the code about why exactly the value is what it is.

The good news is there’s no reason to ever use static waits. Sometimes we use them as a way to test out whether we do actually have a race condition which is giving us problems in our tests. But we never actually add the static wait to the test suite. We always replace it before submitting our code.

Waiting “statically” just means applying a lot of good old time.sleep all over the place. This is the brute force solution to a race condition. Is your test script trying to find an element before it is present? Force your test to slow down by adding a static sleep! This is what a login test would look like using static waits:

I chose 3 seconds as the static wait amount. Why did I choose that value? I’m not sure. It worked for me locally, and solved my race condition problems. Good enough, right? Not exactly. There are some major problems with this approach:

◕ The test is now much longer than it needs to be (up to 9 seconds longer), wasting our time and our build’s time and therefore our team’s time and our company’s time. And time is money!
◕ We’ve staved off the chaos of a race condition … temporarily. Who’s to say we won’t wind up in some other scenario where 3 seconds won’t be enough? What if one of the elements shows up based on a network request, and every so often the network is just a bit slower? Our only recourse would be to keep increasing the static wait, thereby making the problem above even worse.
◕ Unless we’re diligent at experimentation and commenting, no one will know why we picked the precise values that we did. Was it random or was there a reason?

So what else can we do?

◉ Implicit Waits

Implicit waits differ in some important ways from static waits. The main thing to know about implicit waits is that they involve something called a ‘retry’. This is a retry that happens on the WebDriver server, and it happens only when trying to find elements.

The way it works is that when we have an implicit wait active, if we try to find an element using the server, it will first try to find the element right when we ask. If it can find the element, it returns it to us. If it can’t, it doesn’t immediately respond with a NoSuchElementException. Instead, it waits for a very small amount of time, then tries to find the element again. It goes through this cycle as many times as it needs to until it finds the element. Now, it doesn’t do this forever, because that would lead to problems if the element never shows up. Instead, it keeps retrying while a timeout is counting down. This timeout can be whatever we want. So if we set it for 10 seconds, then the server will keep trying to find elements for us for up to 10 seconds, and if it can’t find them by the end of that time, then it will respond with a NoSuchElementException. So it’s very different from static waits, since if the element shows up earlier than the timeout, we’ll get the element object and will be able to continue our test at that point. This is great because it means we don’t really need to waste time in our tests if the elements themselves show up quickly.
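
The retry cycle described above can be modeled in a few lines of plain Python. This is only a conceptual sketch of the behavior, not the actual WebDriver server code; find_once, ElementNotFound, and the 0.1 second retry interval are assumptions made for illustration.

```python
import time


class ElementNotFound(Exception):
    """Hypothetical stand-in for NoSuchElementException."""


def find_with_implicit_wait(find_once, implicit_wait_timeout, retry_interval=0.1):
    """Call find_once() repeatedly until it succeeds or the timeout runs out."""
    deadline = time.monotonic() + implicit_wait_timeout
    while True:
        try:
            # Success returns right away, so a fast element costs no extra time.
            return find_once()
        except ElementNotFound:
            if time.monotonic() >= deadline:
                # Only once the timeout has elapsed does the caller see the failure.
                raise
            time.sleep(retry_interval)
```

With a timeout of 0 the loop degenerates into a single attempt, which matches the behavior of a find with no implicit wait set.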

The way we use implicit waits in our test is by setting an implicit wait timeout on the server. We do this using the implicitly_wait method on the driver object, passing in the number of seconds we want the timeout to be. We only need to do this once, and the timeout will apply to all subsequent find element commands.

For this reason, implicit waits are global. It’s nice that we only have to set it once, but it’s actually a downside as well, because it means that the same timeout applies to all find commands, unless we call the implicitly_wait command with a different value. In actual practice, this means that we end up having a really high implicit wait timeout in our script, so that it can cover all our bases. This isn’t a problem when our elements do actually appear, but imagine a case where a certain element fails to appear. This is a problem that stops our test in its tracks. The question is, how long did we wait to decide that the element wouldn’t appear? There are some elements where we know that if they don’t appear within a certain amount of time, they’re certainly never going to appear, and we should stop waiting. Imagine clicking a button that simply adds a new element to the page using Javascript. In this case, if the element doesn’t show up very soon, there’s no point in waiting 10 or 20 seconds for it to appear. But if we have an implicit wait timeout set, that is what will happen. The point is that we often want to tailor the timeout to a specific instance of finding the element, and that’s not how implicit waits are designed to be used.

The other limitation of implicit waits is that they only work for finding elements. We can’t set implicit waits to wait for other kinds of application state. What other kinds of state might we want to wait for? Well, we could wait until the title becomes a certain string, for example to test that our app has updated the title in a single page app. Or, we could wait until a certain JavaScript object contains a certain value. These are all kinds of app state that I’ve found it useful to wait for while writing Selenium tests. And implicit waits don’t help at all in cases where these other state changes don’t happen instantly.

Because the designers of Selenium were well aware of element-finding race conditions, a long time ago they added the ability for the client to set an implicit wait timeout on the Selenium server (a feature Appium later copied). This timeout is remembered by the server and used in any instance of element finding. If an element can’t be found instantly, the server will keep trying to find it up to the specified timeout. The same test implemented with implicit waits would look like:

That code is a lot nicer, for one. We’ve also completely solved the problem about the ever-increasing waste of time we ran into with static waits. Because the server-side element-finding retry is on a pretty tight loop, we’re guaranteed to find the element within (say) a second of when it actually shows up, meaning we waste very little time while simultaneously making our test much more robust.

There’s still a problem or two with this approach, however:

◕ Using implicit wait, we tend to set one timeout and forget about it. Inevitably, this timeout becomes pretty high because it has to be high enough to account for the slowest element we could validly wait for. This means that for other elements, which we know would never take as long to show up, we still end up wasting time waiting for them. In other words, we still want our find element command to fail relatively quickly in the case where an element truly never makes an appearance. We don’t want to wait for a whole minute to decide when a few seconds would have done.
◕ We’ve been focusing on waiting for elements, which is what implicit waits are designed around. But an element’s presence is just one example of an app state that we might want to wait for. What about an element’s text or visibility? Implicit wait won’t help us there.

Thankfully, there’s an even more general solution that gets us past these issues as well. We come to the final waiting strategy, and the one that I recommend as a good solution for pretty much every case, namely explicit waits.

◉ Explicit Waits

Explicit waits work very similarly to implicit waits in that they involve a retry to check for the presence of an element, or to check some other kind of app state. The main difference is that explicit waits run on the client rather than on the server. In other words, when we perform an explicit wait, it’s our client library which is doing the retrying for us.

One main benefit of this strategy is that we can use explicit waits to check for any kind of app state which can be determined using WebDriver commands. And this is an awful lot. Think about commands like getting the title or the URL, or the text value of an element, and so on. Any of these can be made part of an explicit wait, so we can very easily have a nice elastic retry for finding elements, or waiting for these other kinds of app state as well.

How do explicit waits work? Thankfully, the logic for retrying (which we can also call polling, because we’re periodically asking the server for information) is built into the Python client, so we don’t have to write it ourselves. There’s a class called WebDriverWait we can import from the module selenium.webdriver.support.wait, and we can instantiate this class to get back a wait object we can use to do the waiting. It encapsulates all the retrying for us.

Once we have one of these WebDriverWait objects, we can call a special method on it called until. This until method takes as a parameter something called an expected condition. We can think of the expected condition as encoding the type of state we are waiting for. Are we waiting for an element to be present? There's an expected condition for that. Are we waiting for the title to be a certain string? There's an expected condition for that. We can import all these expected conditions from selenium.webdriver.support.expected_conditions in the Python client. Let's look at some examples.

Alright, let’s first learn how to construct an expected condition we can use to wait for an element to be present. It looks like importing the expected_conditions module, and then calling the presence_of_element_located method on it.

We can also use expected conditions to wait for the browser URL or the page title to be something we expect. It works just the same as waiting for an element.

expected_conditions.presence_of_element_located: the condition we probably use most of the time. This is an expected condition used to wait for an element to be present. Accordingly, it takes as its parameter the locator strategy and selector used to find the element. Note, however, that these are not two parameters, but rather one parameter sent in as a Python tuple. A Python tuple is a lot like a list, but it's defined using parentheses rather than square brackets. All this to say, we need to make sure to wrap the strategy and selector up as a tuple before sending it into this expected condition.

Explicit waits are just that: they make explicit what we are waiting for and how long it will take. At the cost of a little more verbosity, we get much more fine-grained control, and are able to teach our test script how to wait for just the right condition in our app before moving on. For example:

Here we use the WebDriverWait constructor to initialize a wait object with a certain timeout. We can reuse this object anytime we want the same timeout. We can configure different wait objects with different timeouts and use them for different kinds of waiting or elements. Then, we use the until method on the wait object in conjunction with something called an expected condition.

Built-in Conditions, e.g., **presence_of_element_located** or **title_contains**.

⦾ Expected Conditions:
expected_conditions.title_is('page_title')
expected_conditions.url_to_be('https://page_url.com')
expected_conditions.presence_of_element_located((By.CSS_SELECTOR, '#element_id'))

An expected condition is simply a callable that takes the driver as its argument; the wait calls it periodically until it returns something truthy. Depending on the client version, the built-in conditions are implemented either as classes with a magic __call__ method or as functions that return an inner function. The expected_conditions module has a number of useful, pre-made conditions. What's great about explicit waits, though, is that we're not limited to what comes in the box. We can make our own!

◉ Custom Explicit Waits

If the app state we want to wait for is particularly complex, we can always make our own expected condition. For example, let’s say that the click() command is terribly unreliable, and often it fails, even when our element is found. So what we want is to keep retrying both the find and click actions until they both succeed one after the other. We could make a custom expected condition, like so:

We simply return a new custom expected condition. Then we can use this in our test code, for example as in this revision of the previous test:

This may be a bit of a useless example, but it demonstrates how easy it is to create useful and reusable waits that our whole team can use.

◉ Fluent Waits

A fluent wait is the type of explicit wait where we can define the polling interval and name certain exceptions to ignore while polling, so that the wait keeps retrying until the timeout instead of failing the first time the element is not found.

When we specify a fluent wait, we provide one or more of the following parameters:
◕ maximum wait time
◕ polling interval or frequency to check the element
◕ any specific exception(s) to ignore
◕ message that should appear after timeout