XPath Pros and Cons — Mobile Locator Strategy

Lana Begunova
6 min read · Dec 8, 2023

The Use of XPath in Appium and Its Caveats

In an earlier post, we talked about the various locator strategies that can be used to find elements, and we briefly covered a locator strategy called XPath.

What is XPath?

XPath was not invented by Selenium, Appium, or any other UI automation tool. It is a query language designed for searching XML documents. An XPath engine lets us run an XPath query against a particular XML document to find the specific XML nodes we are interested in.

Every element known to Appium is displayed in the XML output of the WebDriver driver.page_source command. If an element is not listed in the output, it is not available, at least from Appium’s perspective.

The page source is an XML document representing the structure of the UI at the time we requested it.

So, Appium has an XML representation of our app. What this means is that we can use XPath queries on that XML document to specify individual UI elements in our app. Because of the way XPath works, it is an extremely flexible and powerful way of finding elements. However, it comes with some caveats and downsides that we should be aware of any time we consider using XPath.
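
To make this concrete, here is a minimal sketch that needs no device. The XML string is a hypothetical, heavily trimmed stand-in for what driver.page_source might return for a login screen (real page sources carry many more attributes), and the element names and attribute values are made up for illustration. It uses Python's standard-library ElementTree, which supports only a subset of XPath, so it shows the idea of querying the XML rather than how Appium itself evaluates queries.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed page source; in a live session this string
# would come from driver.page_source.
PAGE_SOURCE = """
<hierarchy>
  <android.widget.FrameLayout>
    <android.widget.LinearLayout content-desc="Login form">
      <android.widget.EditText content-desc="username" text="" />
      <android.widget.EditText content-desc="password" text="" />
      <android.widget.Button content-desc="loginBtn" text="Log In" />
    </android.widget.LinearLayout>
  </android.widget.FrameLayout>
</hierarchy>
"""

root = ET.fromstring(PAGE_SOURCE.strip())

# Select every Button node whose text attribute is exactly "Log In".
buttons = root.findall(".//android.widget.Button[@text='Log In']")

# Each match is an XML node; read an attribute to identify it.
button_descs = [node.get("content-desc") for node in buttons]
print(button_descs)
```

The same query, written with a leading // instead of .//, is what we would hand to Appium's XPath locator strategy.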

This post is not a complete introduction to XPath in general. It is really helpful to have some understanding of what it is and how to write basic XPath queries. If you’re totally new to XPath, I recommend checking out these GitHub repos, which have an introduction to all the important concepts: https://github.com/lana-20/selenium-locators and https://github.com/lana-20/xpath-locators-selenium.

XPath Pros

What are the reasons we’d want to use XPath to find our elements with Appium?

  • The main reason is that for any element that exists in our app hierarchy, there is guaranteed to be some XPath query that can find it. In other words, given an XML document, every node can be found by one or more XPath queries. So if we need to automate an element that doesn’t have an accessibility ID, for example, we know that we can always use XPath to find it.
  • The other main benefit is that XPath is a powerful query language that supports complex filtering criteria. For example, after we log in and get past the login prompt of our test app, we look for the ‘welcome’ or ‘logged in’ message. We don’t necessarily know what username our test will use to log in; it may be dynamically generated synthetic data. We do know, however, that part of the element’s text will contain ‘You are logged in’. So we use the XPath contains() function to find that element.
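
The contains() lookup described above could be sketched as follows. The helper names logged_in_xpath and find_logged_in_message are hypothetical, and the query assumes an Android app where the message is exposed through the text attribute (on iOS the relevant attribute would differ). The driver argument is assumed to be an already-started Appium session; nothing here starts one.

```python
def logged_in_xpath(fragment: str) -> str:
    """Build an XPath matching any element whose text contains `fragment`."""
    if '"' in fragment:
        # XPath 1.0 string literals have no escape sequence for quotes,
        # so this sketch simply refuses fragments with double quotes.
        raise ValueError("fragment must not contain double quotes")
    return f'//*[contains(@text, "{fragment}")]'


def find_logged_in_message(driver, fragment: str = "You are logged in"):
    """Locate the welcome message without knowing the username in advance."""
    # "xpath" is the strategy string behind By.XPATH / AppiumBy.XPATH.
    return driver.find_element(by="xpath", value=logged_in_xpath(fragment))


query = logged_in_xpath("You are logged in")
print(query)
```

Because the query matches on a fragment of the text, it keeps working whichever username the test data generator produces.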

XPath Cons

Unfortunately, XPath does not come without certain costs.

  • The first problem with XPath is that it can encourage us to use brittle selectors. Brittle selectors are selectors that find the element while our test is being developed, but then stop finding it on subsequent test runs or after the app is updated.
//android.widget.Layout/android.widget.Layout[2]/android.widget.Layout[3]/android.widget.TextView[5]
  • This is an example of a brittle selector. Why is it brittle? This XPath query finds the fifth TextView underneath the third layout, underneath the second layout of any other layout. When we write our test, this might very well find the element we want, but it relies completely on the hierarchical structure of our app. If the app developer changes the layout at all, or if we run the app in a tablet form factor instead of a phone form factor, it’s highly probable that this XPath query will no longer find the element we want. Even worse, it might find a completely different element, which would cause no end of confusion when we try to debug what is going on.
  • The other main reason to think twice before reaching for XPath is that finding elements using XPath within Appium can be slow. This is the case because neither Android nor iOS apps are internally represented as XML documents. When we run an XPath query against our app, Appium reads the state of the app, turns it into an XML document, runs the XPath query, finds any matching nodes, and then turns those XML nodes back into native UI element objects. It’s a lot. Each of these steps can be expensive, especially generating the XML document in the first place. Generally, the more elements we have in our app view, the longer finding anything using XPath will take. In some cases we might find that it becomes completely unusable. Of course, we can always try it and see how fast it is for our app.
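
The brittleness problem is easy to reproduce offline. In this sketch, the two XML strings are hypothetical page sources for the same screen before and after a developer adds a banner, and the helper text_found is made up for illustration. The position-only query is evaluated with Python's standard-library ElementTree, which handles this simple form of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical layouts: version 2 adds a banner TextView above the title.
LAYOUT_V1 = """
<hierarchy>
  <android.widget.LinearLayout>
    <android.widget.TextView resource-id="title" text="Inbox" />
    <android.widget.TextView resource-id="status" text="You are logged in as alice" />
  </android.widget.LinearLayout>
</hierarchy>
"""

LAYOUT_V2 = """
<hierarchy>
  <android.widget.LinearLayout>
    <android.widget.TextView resource-id="banner" text="New: dark mode" />
    <android.widget.TextView resource-id="title" text="Inbox" />
    <android.widget.TextView resource-id="status" text="You are logged in as alice" />
  </android.widget.LinearLayout>
</hierarchy>
"""

# Position-only query: "the second TextView inside a LinearLayout".
BRITTLE = ".//android.widget.LinearLayout/android.widget.TextView[2]"


def text_found(page_source: str, xpath: str):
    """Return the text of the first match, or None if nothing matches."""
    node = ET.fromstring(page_source.strip()).find(xpath)
    return None if node is None else node.get("text")


before = text_found(LAYOUT_V1, BRITTLE)  # the status message
after = text_found(LAYOUT_V2, BRITTLE)   # silently a different element
print(before)
print(after)
```

Note that the query does not fail after the layout change. It still returns an element, just not the one the test was written for, which is the harder failure to debug.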

Assuming the performance issues do not affect our case, how can we be wise users of the XPath locator strategy?

  • The main thing is to avoid brittle selectors. We can do this by anchoring our XPath query to any unique information we have. For example, if we know that the node we want has a unique attribute, we should specify that in the XPath, like this: //android.widget.TextView[@text="Unique text"]. As long as that text really is unique on the screen, this query matches only the element whose text is exactly ‘Unique text’, wherever it sits in the hierarchy.
  • We can also use unique information about a node’s ancestor. Here’s an example where we’re trying to find a TextView: //android.widget.Layout[@content-desc="Unique layout"]/android.widget.TextView[2]. The text view itself has no unique properties, but we know that its parent layout has a unique content description. So we can specify that in the XPath query to make sure that we are only looking at the second TextView of the unique layout we know it will fall under.
  • We can also query in the other direction, using unique information about a descendant. In this final example, we are looking for an element which has the text ‘bar’, but let’s imagine that lots of elements in our app have this text. In this case, we know the element we want is an ancestor of another element, which has the text ‘foo’: //ancestor::*[*[@text="foo"]][@text="bar"]. This query restricts our search for elements with the text ‘bar’ to those which have a child element with the text ‘foo’ (the inner * matches direct children only; a predicate such as [.//*[@text="foo"]] would also match deeper descendants). That should narrow our query to only the element we care about.
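
As a rough check of the first two strategies, the sketch below runs an attribute-anchored query and an ancestor-anchored query against two hypothetical versions of a screen, where the second version wraps the layout in an extra container and adds a banner. The XML, the LinearLayout class choice, and the helper texts_found are assumptions for illustration. The descendant-based query is left out because Python's standard-library ElementTree does not support nested predicates.

```python
import xml.etree.ElementTree as ET

# Hypothetical page sources before and after a redesign.
LAYOUT_V1 = """
<hierarchy>
  <android.widget.LinearLayout content-desc="Unique layout">
    <android.widget.TextView text="Inbox" />
    <android.widget.TextView text="Unique text" />
  </android.widget.LinearLayout>
</hierarchy>
"""

LAYOUT_V2 = """
<hierarchy>
  <android.widget.TextView text="New: dark mode" />
  <android.widget.FrameLayout>
    <android.widget.LinearLayout content-desc="Unique layout">
      <android.widget.TextView text="Inbox" />
      <android.widget.TextView text="Unique text" />
    </android.widget.LinearLayout>
  </android.widget.FrameLayout>
</hierarchy>
"""

# Anchored on the node's own unique attribute.
BY_ATTRIBUTE = ".//android.widget.TextView[@text='Unique text']"
# Anchored on the parent's unique content description.
BY_ANCESTOR = (
    ".//android.widget.LinearLayout[@content-desc='Unique layout']"
    "/android.widget.TextView[2]"
)


def texts_found(page_source: str, xpath: str):
    """Return the text attribute of every node the query matches."""
    root = ET.fromstring(page_source.strip())
    return [node.get("text") for node in root.findall(xpath)]


results = {
    (version, name): texts_found(source, xpath)
    for version, source in (("v1", LAYOUT_V1), ("v2", LAYOUT_V2))
    for name, xpath in (("attribute", BY_ATTRIBUTE), ("ancestor", BY_ANCESTOR))
}
for key, found in results.items():
    print(key, found)
```

Both queries keep finding the same element across the two versions. The ancestor-anchored one still depends on the order of TextViews inside the unique layout, so it is more robust than a fully positional path but not immune to change.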

In general, XPath is not something to be afraid of, but it can hurt both test maintainability and test performance if we aren’t careful. Now we know what we need to use XPath wisely.

Happy testing! I welcome any comments and contributions to the subject. Connect with me on LinkedIn, X, GitHub, or Insta.
