Is the Page Object Model overrated?

May 3, 2023

I have recently encountered the opinion that the Page Object Model is ‘overrated’. As a rational person (and not a dogmatic one), I decided to analyze this subject thoroughly. To answer the main question, we have to analyze at least two things:

What are the benefits of the Page Object Model compared to alternatives?
What are the costs of the Page Object Model compared to alternatives?

My automation experience started around 2007. In 2011, I worked intensively with Selenium IDE and Selenium RC, which gives me some perspective on the difficulties that may arise with the record-and-replay approach.

WHAT IS THE PAGE OBJECT MODEL?

The Page Object Model is a design pattern described by Martin Fowler in 2013, although it was already in use in the Selenium community before that. It moves the interaction with each page into a dedicated class called a Page Object. The main goal of the Page Object Model is to make tests easier to maintain and to reduce code duplication.

WHAT PROBLEMS DOES THE PAGE OBJECT MODEL SOLVE?

Let me demonstrate this with an example.

driver.get("https://www.google.com/");
driver.findElement(By.name("q")).sendKeys("Selenium");
driver.findElement(By.name("btnK")).click();
driver.findElement(By.xpath("//form[@id=\'tsf\']/div/div/div[2]/div/div[2]/textarea"))
    .sendKeys("Page Object Model");
driver.findElement(By.cssSelector(".zgAlFc svg")).click();
driver.findElement(By.id("gsr")).click();
driver.findElement(By.id("APjFqb")).sendKeys("Page Object Model Selenium");
driver.findElement(By.cssSelector(".zgAlFc svg")).click();

What does this test actually do? It’s hard to say at first glance because we don’t have any meaningful names here.

Let’s go further. Imagine that you have 300 such tests and the application changes: some elements of the UI are modified. Now you have to update the code in 900 places. Search-and-replace with regular expressions? It gets worse: if a similar button exists in other test cases, a regular expression can match it as well and break tests that were fine.

So much work after one simple change. It sounds like a nightmare, but that’s how record-and-replay tests worked over a decade ago.

The Page Object Model offers a simple solution. Instead of changing the same thing in 900 places, we can change it in one place. But that’s not the only benefit.

Let’s imagine that we have several pages with a “Submit” button. How should I organize the code? The Page Object Model gives a simple and very efficient answer: organize the code in the same way that your UI is organized. This way, everyone knows exactly where to look for the part of the testing code related to a specific part of the UI. But that is still not all.

Imagine that you are a newly hired tester, and you have to maintain the following code proposed in the Karate-UI project:

Feature: Move mouse
Scenario: move mouse
    * configure driver = { type: 'chrome' }
    Given driver 'https://karatelabs.github.io/karate/karate-core/'
    * delay(2000)
    * driver.maximize()
    * delay(7000)
    And click("(//a//code[text()='driver'])[1]")
    * delay(7000)

What was the functional intention of this test? What is the business logic behind it? How can we talk about possible changes in requirements? If a button is not found, then what kind of problem is it? Is it a defect in the UI, a defect in the business logic implementation, or perhaps some requirements were changed, and the tests were not updated? We do not know, and it is difficult even to say who we should ask. The BA? The frontend developer? The backend developer?

The Page Object Model helps us organize the code in a meaningful way. The code created with this model is self-explanatory. You cannot say that about Karate-UI code.
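
To make this concrete, here is a minimal sketch of the first two steps of the Google test shown earlier, rewritten with a Page Object. To keep it runnable without a real browser, it talks to a tiny hypothetical `Browser` interface that stands in for Selenium's `WebDriver`. `GoogleSearchPage`, `searchFor`, `RecordingBrowser` and the `name=...` locator notation are names I made up for illustration; only the `q` and `btnK` element names come from the raw test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Selenium's WebDriver, so the sketch runs without a browser.
interface Browser {
    void open(String url);
    void type(String locator, String text);
    void click(String locator);
}

// Page Object: the only class that knows the locators of the search page.
class GoogleSearchPage {
    private static final String URL = "https://www.google.com/";
    private static final String SEARCH_BOX = "name=q";
    private static final String SEARCH_BUTTON = "name=btnK";

    private final Browser browser;

    GoogleSearchPage(Browser browser) {
        this.browser = browser;
    }

    void open() {
        browser.open(URL);
    }

    // A meaningful name replaces two anonymous findElement(...) calls.
    void searchFor(String phrase) {
        browser.type(SEARCH_BOX, phrase);
        browser.click(SEARCH_BUTTON);
    }
}

// Fake browser that records every call; a real project would wrap WebDriver instead.
class RecordingBrowser implements Browser {
    final List<String> log = new ArrayList<>();

    public void open(String url) { log.add("open " + url); }
    public void type(String locator, String text) { log.add("type " + locator + " <- " + text); }
    public void click(String locator) { log.add("click " + locator); }
}

class SearchPageDemo {
    public static void main(String[] args) {
        RecordingBrowser browser = new RecordingBrowser();
        GoogleSearchPage page = new GoogleSearchPage(browser);

        // The test now reads as intent, not as a list of locators.
        page.open();
        page.searchFor("Selenium");

        browser.log.forEach(System.out::println);
    }
}
```

If the search button's locator changes, only the `SEARCH_BUTTON` constant has to be updated, no matter how many tests call `searchFor`.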

WHAT ARE THE BENEFITS OF THE PAGE OBJECT MODEL?

Let’s summarize the advantages of the Page Object Model:

It encapsulates browser interaction.
It separates browser interaction from the business logic behind it.
It increases code reusability (DRY).
It partially implements SOLID principles (‘partially’ because the Screenplay pattern does it better).
It organizes the code in a way that helps understand it by giving meaningful names.

WHAT ARE THE COSTS OF THE PAGE OBJECT MODEL?

In my opinion, there is no cost to using the Page Object Model when you start with this approach from the very beginning. When someone already has many tests written with repeated locators, some effort is needed to rewrite them, but I think the benefits are worth it.

ALTERNATIVES BUILT ON TOP OF THE PAGE OBJECT MODEL

I have searched the internet for competing design patterns, and here is what I have found:

Screenplay Pattern
Facade Design Pattern
Lean Page Object Model
Page Component Model
Fluent Design Pattern

Only the first one on this list is not directly based on the Page Object Model. The Facade Design Pattern puts a facade class at the test level: the test calls the facade and passes it an object that contains all the inputs needed. The Lean Page Object Model and the Fluent Design Pattern are ways of organizing the code of Page Objects. The Page Component Model focuses on smaller pieces of pages, especially when they are shared by multiple pages.

The Screenplay Pattern is to some extent similar to the Page Object Model, but it starts from a different focus. Still, we can say that it emerges when the Page Object Model is refactored.
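
The Page Component Model is probably the easiest of these to show in code. Below is a minimal sketch, assuming a `FakeDriver` that only records clicks in place of a real browser driver. `SubmitButton`, `LoginPage`, `ContactPage` and the CSS locators are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a browser driver; it only records clicks.
class FakeDriver {
    final List<String> clicks = new ArrayList<>();

    void click(String locator) {
        clicks.add(locator);
    }
}

// Page Component: a fragment of the UI that appears on more than one page.
class SubmitButton {
    private final FakeDriver driver;
    private final String formLocator;

    SubmitButton(FakeDriver driver, String formLocator) {
        this.driver = driver;
        this.formLocator = formLocator;
    }

    void click() {
        // The button's own locator lives here once, scoped to the owning form.
        driver.click(formLocator + " button[type=submit]");
    }
}

// Two Page Objects reuse the same component instead of duplicating its locator.
class LoginPage {
    final SubmitButton submit;

    LoginPage(FakeDriver driver) {
        this.submit = new SubmitButton(driver, "#login-form");
    }
}

class ContactPage {
    final SubmitButton submit;

    ContactPage(FakeDriver driver) {
        this.submit = new SubmitButton(driver, "#contact-form");
    }
}

class ComponentDemo {
    public static void main(String[] args) {
        FakeDriver driver = new FakeDriver();
        new LoginPage(driver).submit.click();
        new ContactPage(driver).submit.click();
        driver.clicks.forEach(System.out::println);
    }
}
```

The pages still mirror the UI, but the shared button is described only once.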

One cannot see the benefits of the Screenplay Pattern without seeing the benefits of the Page Object Model. If someone considers the Page Object Model useless, then the Screenplay Pattern will also be useless for that person. This is not surprising.

But what exactly is the Screenplay Pattern?

The Screenplay Pattern is a more advanced variation of the Page Object Model, proposed by Antony Marcano, Andy Palmer, Jan Molak and John Ferguson Smart in 2013. It assumes that interactions with the system under test are represented as user-driven “tasks”. These tasks can be composed of several “actions” which encapsulate interactions with Page Objects. Each task can have one or more “actors”, which represent the user or a system component that initiates the task.

The main goal of the Screenplay Pattern is to make tests more expressive and easier to understand, while also promoting collaboration between testers, developers, and business stakeholders. It encourages creating user journeys and scenarios, which model real-life usage of the system, rather than individual test cases.
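
The vocabulary above (actors, tasks, actions) can be sketched in a few classes. This is only an illustration under my own assumptions, not the API of any particular Screenplay library: `Performable`, `Actor`, `Enter`, `Click` and `SearchFor` are hypothetical names, and the actions write to a journal instead of driving a browser.

```java
import java.util.ArrayList;
import java.util.List;

// Anything an actor can perform: a low-level action or a higher-level task.
interface Performable {
    void performAs(Actor actor);
}

// The actor represents a user; here it only keeps a journal of what it did.
class Actor {
    final String name;
    final List<String> journal = new ArrayList<>();

    Actor(String name) {
        this.name = name;
    }

    void attemptsTo(Performable... activities) {
        for (Performable activity : activities) {
            activity.performAs(this);
        }
    }
}

// Actions: the smallest steps; in a real suite they would call the browser.
class Enter implements Performable {
    private final String text;
    private final String field;

    Enter(String text, String field) {
        this.text = text;
        this.field = field;
    }

    public void performAs(Actor actor) {
        actor.journal.add("enter '" + text + "' into " + field);
    }
}

class Click implements Performable {
    private final String target;

    Click(String target) {
        this.target = target;
    }

    public void performAs(Actor actor) {
        actor.journal.add("click " + target);
    }
}

// Task: a business-level step composed of actions.
class SearchFor implements Performable {
    private final String phrase;

    SearchFor(String phrase) {
        this.phrase = phrase;
    }

    public void performAs(Actor actor) {
        actor.attemptsTo(new Enter(phrase, "search box"), new Click("search button"));
    }
}

class ScreenplayDemo {
    public static void main(String[] args) {
        Actor tester = new Actor("Tester");
        tester.attemptsTo(new SearchFor("Page Object Model"));
        tester.journal.forEach(System.out::println);
    }
}
```

The test line `tester.attemptsTo(new SearchFor(...))` reads like a sentence about a user, which is the expressiveness the pattern aims for.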

Another enhancement to the Page Object Model is the use of a fluent interface. Again, this is not against the principles of the Page Object Model. It is rather a specific and very useful method of writing code for the Page Object Model.
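
A fluent Page Object could look roughly like the sketch below. `SearchPage`, `ResultsPage` and their methods are hypothetical, and a shared list of steps stands in for real browser calls so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Page Object; a list of steps stands in for real browser calls.
class SearchPage {
    private final List<String> steps;

    SearchPage(List<String> steps) {
        this.steps = steps;
    }

    // Staying on the same page: return this so that calls can be chained.
    SearchPage typeQuery(String phrase) {
        steps.add("type " + phrase);
        return this;
    }

    // Navigation: return the Page Object of the page the user lands on.
    ResultsPage submit() {
        steps.add("submit");
        return new ResultsPage(steps);
    }
}

class ResultsPage {
    private final List<String> steps;

    ResultsPage(List<String> steps) {
        this.steps = steps;
    }

    ResultsPage openResult(int position) {
        steps.add("open result " + position);
        return this;
    }
}

class FluentDemo {
    public static void main(String[] args) {
        List<String> steps = new ArrayList<>();

        // The whole user flow reads as one chained expression.
        new SearchPage(steps)
            .typeQuery("Page Object Model")
            .submit()
            .openResult(1);

        steps.forEach(System.out::println);
    }
}
```

The return types also document navigation: the compiler will not let a test call `openResult` before `submit`.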

YET ANOTHER POINT OF VIEW — ESCAPE FROM THE KINGDOM OF NOUNS

While writing this post, I came across some other viewpoints. Some testers, instead of strictly implementing the Page Object Model, prefer reusability at the level of fine-grained methods. If this approach works for someone, then it is perfectly fine. I suppose that such an approach, used well, would not be as far from the Screenplay Pattern as one may think. Fine-grained methods can be focused on tasks, interactions, abilities, and questions, and this (not classes) is the main point of the Screenplay Pattern.

If I understand Eldad Uzman correctly, his main argument against the Page Object Model is its excessive focus on nouns (represented by classes) instead of verbs describing actions. He also has some critical words about object-oriented programming in general.

Use functions to commit your high level actions and constants to store your locators, split them up to modules (or namespaces) when the need for that arise.
— Eldad Uzman

Maybe I am too optimistic, but contrary to Eldad, I think that the Screenplay Pattern changes the focus in the desired direction. And as I stated above, Eldad’s approach is not as far from the Screenplay Pattern as he thinks. The Screenplay Pattern changes the focus from pages to actors. In my opinion, such a change is enough to avoid overusing page-centric nouns.

The purpose of using nouns (“Page”) in the Page Object Model is to create a simple way of organizing multiple WebElements and methods. The Screenplay Pattern organizes them in a more granular manner.

Generally, both approaches have a lot of common points: encapsulation, reusability, meaningful names, etc. The only difference is whether we are using classes or a more procedural style.
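
For comparison, the style from the quote (functions for high-level actions, constants for locators, split into namespaces) might look like this in Java. It is my own reading of the quote, not code from Eldad Uzman: `Locators`, `SearchActions`, `Ui` and `RecordingUi` are hypothetical names, and `Ui` stands in for a real browser driver.

```java
import java.util.ArrayList;
import java.util.List;

// Locators stored as plain constants, grouped in a namespace-like class.
final class Locators {
    static final String SEARCH_BOX = "name=q";
    static final String SEARCH_BUTTON = "name=btnK";

    private Locators() {}
}

// Hypothetical stand-in for a browser driver.
interface Ui {
    void type(String locator, String text);
    void click(String locator);
}

// High-level actions as functions (static methods); no page class is involved.
final class SearchActions {
    private SearchActions() {}

    static void searchFor(Ui ui, String phrase) {
        ui.type(Locators.SEARCH_BOX, phrase);
        ui.click(Locators.SEARCH_BUTTON);
    }
}

// Fake implementation that records calls so the sketch runs without a browser.
class RecordingUi implements Ui {
    final List<String> log = new ArrayList<>();

    public void type(String locator, String text) { log.add("type " + locator + " <- " + text); }
    public void click(String locator) { log.add("click " + locator); }
}

class ActionsDemo {
    public static void main(String[] args) {
        RecordingUi ui = new RecordingUi();
        SearchActions.searchFor(ui, "Selenium");
        ui.log.forEach(System.out::println);
    }
}
```

Encapsulation, reuse and meaningful names are all still there; what changes is that the unit of organization is a verb (`searchFor`) rather than a page.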

DEEPER VIEW BEHIND REMARKS BY ELDAD UZMAN

After I wrote this text some time ago, I reread it and concluded that I had missed something. I decided to put the post aside and reconsider. So I did.

Now I have come to the conclusion that it is worth taking a closer look at Eldad Uzman’s remarks. What does it mean for him that a class is not a class? From the point of view of the Page Object Model logic, a class simply represents a page. No more, no less. If the scope of its responsibility seems strange, maybe the problem is not in the design pattern, but in the UX of the software under test? If we don’t know what the class representing the page does, do we know what the page does?

A critical look at the created classes can show us the problems in the software under test itself. This is why we should be sensitive to every code smell.

SUMMARY

After 10 years, the idea presented by Martin Fowler still holds up. It is still popular, and in fact all currently recommended models take the Page Object Model as a reference point to some extent. In my opinion, this shows that it is not overrated at all.