Playwright Visual Tests and the “Environment Flakiness”

Vinicius Gabriel Cabral Paulino
Apr 5, 2024

The Problem:

Well, this problem is not unique to Playwright. Other visual testing tools have the same “problem.” If you have to run the same tests in different environments, they might fail, because the pages render with slight differences in each one.

So, in this story, I will present a common scenario and one way to overcome this “barrier.” Although it isn’t the only solution, it can provide some insights if you have the same problem.

Let’s create a Software Development scenario that we can use as an example:

  • My project contains a CI environment that uses Docker, and the tests run inside the official image of Playwright.
  • Developers and QEs are using Windows.
  • Other developers and QEs are using macOS.
  • And some of them are also using Linux.

So, tests that compare against snapshots generated in one environment might fail in all the others. Increasing the thresholds might not be the best approach either, as it can let real issues go uncaught.
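To make the threshold concern concrete, here is a minimal sketch. It is not Playwright's comparison algorithm; `diffRatio` and `passes` are hypothetical helpers that only mimic a ratio-style tolerance (similar in spirit to Playwright's `maxDiffPixelRatio` option), and the images are tiny made-up arrays.

```typescript
// Hypothetical helper, NOT Playwright's comparison algorithm: the fraction of
// pixels that differ between two same-sized grayscale images (flat 0-255 arrays).
function diffRatio(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Images must have the same number of pixels');
  }
  if (a.length === 0) {
    return 0;
  }
  let differing = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) {
      differing++;
    }
  }
  return differing / a.length;
}

// Hypothetical check: the comparison passes when the ratio is within the tolerance.
function passes(a: number[], b: number[], maxDiffRatio: number): boolean {
  return diffRatio(a, b) <= maxDiffRatio;
}

// Baseline: a 10x10 all-white image.
const baseline: number[] = [];
for (let i = 0; i < 100; i++) {
  baseline.push(255);
}

// Regression: the same image with a 2x2 block turned black (4 of 100 pixels).
const regression: number[] = baseline.slice();
[0, 1, 10, 11].forEach((index) => {
  regression[index] = 0;
});

// With no tolerance the change is reported; with a 5% tolerance it is not.
const caughtWhenStrict = !passes(baseline, regression, 0);
const missedWhenRelaxed = passes(baseline, regression, 0.05);
```

Both `caughtWhenStrict` and `missedWhenRelaxed` end up `true`: a tolerance loose enough to absorb rendering noise between environments can also absorb a small but real visual change.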

I will create a repository with a Next.js demo application (using React) to provide a shared, easy-to-reproduce example.

npx create-next-app@latest

Then, I will install Playwright using its init command, which generates the standard configuration.

npm init playwright@latest

In the “package.json,” I added more scripts that will be used later.

{
  "scripts": {
    "start:build": "npm run build && npm run start",
    "test": "npx playwright test --config playwright.config.ts"
  }
}

Now, in the “playwright.config.ts,” configure the “webServer” to execute the “start:build” script. When the test execution starts, the application is also started.

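import { defineConfig } from '@playwright/test';

export default defineConfig({
  webServer: {
    // Build and start the app before the tests run
    command: 'npm run start:build',
    url: 'http://127.0.0.1:3000',
    timeout: 20000,
    // Locally, reuse an app that is already running; in CI, always start a fresh one
    reuseExistingServer: !process.env.CI,
  },
});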

Another important detail is that, by default, Playwright includes the platform in the snapshot file names. The documentation explains that the reason is the difference in rendering between platforms. So, when running in another environment, Playwright might report that there are no snapshots.

In the “playwright.config.ts,” it is possible to change this setting by adding a name pattern that removes the platform dependency.

export default defineConfig({
  snapshotPathTemplate: '{testDir}/{testFilePath}-snapshots/{testName}-{projectName}{ext}',
});
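To see why dropping the platform from the pattern matters, here is a small sketch of how such a template expands. This is not Playwright's implementation: `resolveTemplate` is a hypothetical helper, the token values are made up for illustration, and the second template (with `{platform}`) is only an assumed platform-dependent variant for comparison.

```typescript
type Tokens = Record<string, string>;

// Hypothetical helper: replace each {token} with its value; unknown tokens are kept as-is.
function resolveTemplate(template: string, tokens: Tokens): string {
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) => {
    const value = tokens[name];
    return value === undefined ? match : value;
  });
}

// Illustrative token values (assumptions, not taken from a real run).
function tokensFor(platform: string): Tokens {
  return {
    testDir: 'tests',
    testFilePath: 'example.spec.ts',
    testName: 'Visual-Test',
    projectName: 'chromium',
    platform: platform,
    ext: '.png',
  };
}

// The template from this article: no platform token.
const sharedTemplate = '{testDir}/{testFilePath}-snapshots/{testName}-{projectName}{ext}';
// An assumed platform-dependent variant, for comparison only.
const platformTemplate = '{testDir}/{testFilePath}-snapshots/{testName}-{projectName}-{platform}{ext}';

const sharedOnLinux = resolveTemplate(sharedTemplate, tokensFor('linux'));
const sharedOnMac = resolveTemplate(sharedTemplate, tokensFor('darwin'));
const perPlatformOnLinux = resolveTemplate(platformTemplate, tokensFor('linux'));
const perPlatformOnMac = resolveTemplate(platformTemplate, tokensFor('darwin'));
```

With the article's template, `sharedOnLinux` and `sharedOnMac` are the same path, so every environment looks for the same file. With the platform token, the two paths differ, which is why a machine on another platform would not find the snapshots generated in CI.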

I changed the generated “example.spec.ts” so that it opens the application page and takes a screenshot.

import { test, expect } from '@playwright/test';

test('Visual Test', async ({ page }) => {
  await page.goto('http://localhost:3000/');

  await expect(page).toHaveScreenshot({
    animations: 'disabled',
    fullPage: true,
    scale: 'css',
    threshold: 0.1,
  });
});

I’ve generated the snapshots in the CI environment using Docker, which looks like this:

On a macOS machine, the test fails with the following differences:

On a Windows machine, the test fails with the following differences:

It’s annoying, so people try to update the snapshots locally. That makes the tests fail for everyone else, especially in the CI environment.

The Solution:

What if everyone standardized on the same environment as the CI, which in this example is Docker? It would solve the problem, but building everything (including the app images) inside Docker sometimes takes a long time, especially for big apps with many dependencies.

There is a better and faster way: using a Playwright Remote Server. This means that you run the app and the tests locally, while the browser runs inside the server.

In the proposed scenario, the CI runs the Playwright tests with Docker and the official Playwright image. We can create a remote server that does the same.

Create a short script to launch the Playwright Servers:

const { chromium, firefox, webkit } = require('@playwright/test');

(async () => {
  // One server per browser engine, each on its own port and WebSocket path
  const serverCr = await chromium.launchServer({ headless: true, port: 1010, wsPath: 'chromium' });
  console.log(serverCr.wsEndpoint());

  const serverFr = await firefox.launchServer({ headless: true, port: 1011, wsPath: 'firefox' });
  console.log(serverFr.wsEndpoint());

  const serverWk = await webkit.launchServer({ headless: true, port: 1012, wsPath: 'webkit' });
  console.log(serverWk.wsEndpoint());
})();

Now, let’s create a Dockerfile that will serve as a base image for our Playwright Server Container:

ARG IMAGE_PLAYWRIGHT_FROM=mcr.microsoft.com/playwright
ARG IMAGE_PLAYWRIGHT_TAG=v1.42.1-focal

FROM ${IMAGE_PLAYWRIGHT_FROM}:${IMAGE_PLAYWRIGHT_TAG} AS pw-server
WORKDIR /src
ENV PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1
RUN npm init -y
RUN npm install @playwright/test@1.42.1
COPY tests/core/remoteServer.js remoteServer.js
ENTRYPOINT [ "node", "remoteServer.js" ]
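Notice that the Dockerfile pins the same version (1.42.1) in both the image tag and the npm package. As far as I know, the Playwright documentation for connecting to a remote browser says the client and server must agree on the major and minor version, so the local “@playwright/test” should stay in that range too. The sketch below is only an illustration of that rule; `majorMinor` and `compatible` are hypothetical helpers, not part of Playwright.

```typescript
// Hypothetical helper: extract "1.42" from "1.42.1" or from an image tag like "v1.42.1-focal".
function majorMinor(version: string): string | null {
  const match = /^v?(\d+\.\d+)\.\d+/.exec(version);
  if (match === null) {
    return null;
  }
  const captured = match[1];
  return captured === undefined ? null : captured;
}

// Hypothetical check: server and client are treated as compatible
// when both versions parse and share the same major.minor.
function compatible(serverVersion: string, clientVersion: string): boolean {
  const server = majorMinor(serverVersion);
  const client = majorMinor(clientVersion);
  return server !== null && server === client;
}

const samePatch = compatible('v1.42.1-focal', '1.42.1');      // same version on both sides
const otherPatch = compatible('v1.42.1-focal', '1.42.3');     // only the patch differs
const otherMinor = compatible('v1.42.1-focal', '1.43.0');     // the minor differs
```

Here `samePatch` and `otherPatch` are `true` and `otherMinor` is `false`. A check like this could run before the tests start, so a version drift shows up as a clear message instead of a connection error.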

As a “docker-compose.yml” file is already used in the CI environment, I’ll add the Playwright Server there:

version: "3.8"

services:
  pw-server:
    build:
      dockerfile: ./dockerfile
      context: .
      target: pw-server
    ports:
      - 1010:1010
      - 1011:1011
      - 1012:1012
    extra_hosts:
      - "host.docker.internal:host-gateway"
So now, running the following command will start the Docker Container:

docker-compose up pw-server

The console will print:

Attaching to pw-server-1
pw-server-1 | ws://localhost:1010/chromium
pw-server-1 | ws://localhost:1011/firefox
pw-server-1 | ws://localhost:1012/webkit

In the “playwright.config.ts” file, each project can be configured to run its browser on a remote server. You can create another configuration file for this case or manage it another way (e.g., use an environment variable to inject the configuration).

Notice that the “baseURL” also changes since the Docker Container needs to point to the app that’s running locally.

import { defineConfig, devices } from '@playwright/test';

const config = defineConfig({
  use: {
    // The browser runs inside the container, so it reaches the local app through the Docker host
    baseURL: 'http://host.docker.internal:3000',
  },
  projects: [
    {
      name: 'chromium',
      use: {
        ...devices['Desktop Chrome'],
        connectOptions: {
          wsEndpoint: 'ws://localhost:1010/chromium',
          exposeNetwork: '*',
          timeout: 30000,
        },
      },
    },
  ],
});

export default config;
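For the environment-variable option mentioned earlier, one possible approach is a small function that builds the “use” settings. This is only a sketch under my own assumptions: `PW_SERVER_WS` is a hypothetical variable name (Playwright does not read it by itself), `buildUse` is a hypothetical helper, and the local “baseURL” is assumed to be the same address the test already opens.

```typescript
// Shape of the settings this sketch produces; it mirrors the fields used in the config.
interface RemoteAwareUse {
  baseURL: string;
  connectOptions?: {
    wsEndpoint: string;
    exposeNetwork: string;
    timeout: number;
  };
}

// Hypothetical helper: PW_SERVER_WS is an assumed variable name chosen for this example.
function buildUse(env: Record<string, string | undefined>): RemoteAwareUse {
  const wsEndpoint = env['PW_SERVER_WS'];
  if (!wsEndpoint) {
    // No remote server configured: the browser runs locally against the local app.
    return { baseURL: 'http://localhost:3000' };
  }
  return {
    // The remote browser reaches the local app through the Docker host.
    baseURL: 'http://host.docker.internal:3000',
    connectOptions: { wsEndpoint: wsEndpoint, exposeNetwork: '*', timeout: 30000 },
  };
}

const localUse = buildUse({});
const remoteUse = buildUse({ PW_SERVER_WS: 'ws://localhost:1010/chromium' });
```

In “playwright.config.ts,” the result of `buildUse(process.env)` could then be spread into the project's “use” block, so a single configuration file covers both cases.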

With the Playwright Server running, the test execution starts the application and then connects to the Server. The browser inside the Server opens the pages of the app that is running locally.

Without the Playwright Server, the tests would fail, as in the following example:

The Conclusion:

This can be a solution for making these tests work and stay stable in all environments. Although a local Docker container was used in this example, you can point to any server.

In my opinion, this solution helps improve the development experience with these tests and the maintenance, and it is really easy to implement.

In this repository, you can see this example in detail and explore the solution.
