Effortless Performance Testing with k6: A Comprehensive Guide

João Coelho
17 min read · Aug 10, 2023

In today’s fast-paced digital landscape, ensuring your software’s ability to withstand challenges is no longer an option — it’s a necessity.

In our previous article, we delved into the diverse realm of performance testing, exploring the various methodologies that empower developers to gauge their applications’ responsiveness, stability, and scalability.

Now, armed with an understanding of these vital testing types, it’s time to roll up our sleeves and dive into the practical side of implementing them.

Introducing k6 — a cutting-edge tool that promises to revolutionize the way you conduct performance tests.

In this article, we’ll unravel the complexities of k6, equipping you with the skills to effortlessly execute thorough performance tests, optimize your digital offerings, and ultimately deliver impeccable user experiences.

What are the k6 use cases?

k6 caters to Developers, QA Engineers, SDETs, and SREs, offering insights into API, microservices, and website performance.

There are several k6 use cases:

  • Efficient Performance Testing — k6 excels in resource-efficient performance testing, spanning spike, stress, and soak tests.
  • Precise Browser Testing — Leverage k6’s browser module for laser-focused performance evaluations, capturing browser-specific issues often overlooked at the protocol level.
  • Chaos-Driven Resilience — k6 integrates real-world traffic patterns into chaos experiments and injects orchestrated disruptions via xk6-disruptor in Kubernetes.
  • Continuous Performance Oversight — Utilize k6’s automation prowess for scheduled, lightweight tests to monitor production environment performance and availability.

Does k6 have limitations?

Although k6 is a robust performance testing tool scripted in JavaScript, the design choices that give it its performance also impose certain limitations:

  • It doesn’t operate as a native browser, so it skips typical browser rendering. Sidestepping these resource-intensive browser activities enables higher load simulation on a single machine (check k6 browser to overcome this limitation);
  • Given JavaScript’s suboptimal performance characteristics for load generation, k6 doesn’t run natively on NodeJS. To maximize efficiency, the tool’s core is written in Go, embedding a JavaScript runtime to facilitate streamlined test scripting.

Why is k6 so popular?

There are several other performance test tools you could use, such as Gatling, JMeter or Taurus, but k6 stands out for several reasons:

  • It is a lightweight, developer-oriented, modern (released in 2017) and highly extensible application for performance testing, load testing, and even SRE tasks;
  • Scripts are developed using JavaScript, in any text editor;
  • Script sections can be reused as functions in other scripts;
  • It is a lean solution with blazing performance;
  • It natively supports HTTP/1.1 and HTTP/2, WebSockets, and gRPC (see the sketch below).
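
As a taste of the protocol support, here is a hedged sketch of a WebSocket test using the k6/ws module (the echo-server URL is a placeholder, not a real endpoint):

import ws from 'k6/ws';
import { check } from 'k6';

export default function () {
  // Placeholder endpoint; point this at your own WebSocket server
  const res = ws.connect('wss://echo.example.com/ws', function (socket) {
    socket.on('open', function () {
      socket.send('ping'); // send a test message
      socket.close();
    });
  });
  check(res, { 'switched protocols (101)': (r) => r && r.status === 101 });
}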

Also, given these advantages, see the following plot, which compares the maximum Requests Per Second (RPS) with the corresponding memory usage for each common tool:

Source: Comparing k6 and JMeter for load testing

From this plot, you can see that k6 provides a high maximum RPS rate with low memory consumption. These characteristics aren’t observed in JMeter or Gatling, which cannot reach such a high RPS rate and still consume a lot more memory.

Overall, choosing a tool for these tests really depends on the context your team is in. There isn’t a solution that best suits every team’s needs.

Looking at the plot, you could say that, for example, Vegeta, ApacheBench or Hey have better ratios of maximum RPS to memory usage.

But you should not only look at these characteristics in order to choose the best tool. For example:

  • Vegeta provides no scripting capability and no assertion support. More complex operations also result in long pipe chains that can become difficult to read, and it supports a limited set of protocols;
  • Both ApacheBench and Hey have a slow contribution rate.

How can I install k6?

In this article, we will be using macOS for the tutorial, but you can also install k6 on Linux or Windows, or use a Docker container (Installation Guide).

On macOS it is really simple, as it only requires you to have Homebrew:

brew install k6

Furthermore, if you want to execute a simple local script, run:

k6 run path_to_your_file.js
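
If you don’t have a script yet, here is a minimal sketch of a runnable one (it simply hits k6’s public test site, test.k6.io):

import http from 'k6/http';
import { sleep } from 'k6';

export default function () {
  http.get('http://test.k6.io/');
  sleep(1); // pause for one second between iterations
}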

Is it possible to add extensions to k6?

Yes, it is! Currently, there are over 75 available extensions that you can use.

The good news with this feature is that you can combine multiple extensions, depending on your use case.

For this, you’ll have to create a custom k6 binary, that can be built using xk6, a command-line tool and framework written in Go.

Regarding these extensions, they can be classified into 2 major groups:

  • JavaScript Extensions — Amplify the array of JavaScript APIs available within your test scripts. These can add support for new network protocols (check xk6-amqp for publishing and consuming messages from queues and exchanges using AMQP), boost performance beyond analogous JS libraries, or introduce new functionality.
  • Output Extensions — Redirect metrics to a custom file format or service, introducing personalized processing and distribution mechanisms (check xk6-output-prometheus-remote to publish metrics to Prometheus). This facilitates seamless integration of tailored output formats and data-handling procedures.

In order to build the binary using xk6, take a look into this Dockerfile example:

FROM golang:1.20 as builder
RUN go install go.k6.io/xk6/cmd/xk6@latest
RUN xk6 build --output /k6 \
    --with github.com/szkiba/xk6-dotenv@latest \
    --with github.com/avitalique/xk6-file@latest

FROM loadimpact/k6:latest
COPY --from=builder /k6 /usr/bin/k6

Afterwards, to run any test using this binary, simply use:

./k6 run path_to_your_file.js

You might be wondering what a k6 test file should look like. So, let’s take a look at the test lifecycle of these scripts.

What does the lifecycle of a k6 test look like?

// 1. init code

export function setup() {
  // 2. setup code
}

export default function (data) {
  // 3. VU code
}

export function teardown(data) {
  // 4. teardown code
}

The lifecycle of a k6 test is composed of 4 stages, as you can see above:

  1. init — loads local files, imports modules and declares lifecycle functions. It is a required stage, and it’s called once per Virtual User (VU);
  2. setup — sets up the test environment and generates data to be shared among VUs (illustrated in the sketch below). It is an optional stage, and it is only called once;
  3. VU code — runs the test function (usually default). It is a required stage, and it’s executed once per iteration, as many times as the test options require;
  4. teardown — processes the data generated by the setup code and concludes the test environment. It is an optional stage, and it is executed only once.
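
As a hedged sketch of how data flows between these stages (the /uuid endpoint and the token field are assumptions for illustration):

import http from 'k6/http';

export function setup() {
  // Runs once; the returned object is passed to default() and teardown()
  const res = http.get('https://httpbin.test.k6.io/uuid');
  return { token: res.json().uuid }; // hypothetical shared value
}

export default function (data) {
  // Every VU iteration receives a copy of the setup data
  http.get('http://test.k6.io/', {
    headers: { Authorization: `Bearer ${data.token}` }, // hypothetical header usage
  });
}

export function teardown(data) {
  console.log(`Test finished; the shared token was ${data.token}`);
}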

A typical VU code would look like:

import { check } from 'k6';
import http from 'k6/http';

export default function () {
  const res = http.get('http://test.k6.io/');
  check(res, {
    'is status 200': (r) => r.status === 200,
  });
}

There are 2 things we need to highlight in the code: the HTTP Request and the check for the HTTP response code.

In this code, we are performing a GET request targeting http://test.k6.io. Although get() was used here, the http module k6/http handles all kinds of HTTP requests and methods, namely the following (a usage sketch follows the list):

  • batch() — issues multiple HTTP requests in parallel;
  • del() — HTTP DELETE request;
  • head() — HTTP HEAD request;
  • options() — HTTP OPTIONS request;
  • patch() — HTTP PATCH request;
  • post() — HTTP POST request;
  • put() — HTTP PUT request;
  • request() — issues any type of HTTP request.
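
To illustrate a couple of these, here is a hedged sketch combining post() and batch() (the JSON payload and request targets are assumptions):

import http from 'k6/http';

export default function () {
  // POST a JSON body (illustrative payload)
  const payload = JSON.stringify({ name: 'k6' });
  const params = { headers: { 'Content-Type': 'application/json' } };
  http.post('https://httpbin.test.k6.io/post', payload, params);

  // Issue two GET requests in parallel
  http.batch([
    ['GET', 'http://test.k6.io/'],
    ['GET', 'http://test.k6.io/news.php'],
  ]);
}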

Regarding the check performed, its main usage is to validate boolean conditions in the test. It works similarly to typical assertions, and in this example it verified that the request’s HTTP response code was 200.

You can have one or more of these check conditions, and validate more than just the status code:

check(res, {
  'is status 200': (r) => r.status === 200,
  'body size is 11,105 bytes': (r) => r.body.length == 11105,
});

Having one or more failed check conditions won’t necessarily make a test abort or finish with a failed status. Each check creates a rate metric, and that metric can have thresholds defined that determine the test’s success (more on thresholds below).

While executing a test, k6 creates metrics that measure the performance of the system. These metrics could be either built-in or custom.

By default, at the end of the test, k6 prints summarized results to stdout, known as an end-of-test summary report.

Take a look at a standard test output report:


          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: k6_example.js
     output: -

  scenarios: (100.00%) 1 scenario, 1 max VUs, 10m30s max duration (incl. graceful stop):
           * default: 1 iterations for each of 1 VUs (maxDuration: 10m0s, gracefulStop: 30s)


✓ is status 200
✓ body size is 11,278 bytes

checks.........................: 100.00% ✓ 2 ✗ 0
data_received..................: 17 kB 34 kB/s
data_sent......................: 543 B 1.1 kB/s
http_req_blocked...............: avg=156.06ms min=97.28ms med=156.06ms max=214.85ms p(90)=203.09ms p(95)=208.97ms
http_req_connecting............: avg=96.2ms min=95.57ms med=96.2ms max=96.83ms p(90)=96.7ms p(95)=96.77ms
http_req_duration..............: avg=97.74ms min=96.36ms med=97.74ms max=99.13ms p(90)=98.85ms p(95)=98.99ms
    { expected_response:true }...: avg=97.74ms min=96.36ms med=97.74ms max=99.13ms p(90)=98.85ms p(95)=98.99ms
http_req_failed................: 0.00% ✓ 0 ✗ 2
http_req_receiving.............: avg=864.5µs min=151µs med=864.5µs max=1.57ms p(90)=1.43ms p(95)=1.5ms
http_req_sending...............: avg=639.5µs min=44µs med=639.5µs max=1.23ms p(90)=1.11ms p(95)=1.17ms
http_req_tls_handshaking.......: avg=58.66ms min=0s med=58.66ms max=117.32ms p(90)=105.59ms p(95)=111.46ms
http_req_waiting...............: avg=96.24ms min=94.97ms med=96.24ms max=97.5ms p(90)=97.25ms p(95)=97.38ms
http_reqs......................: 2 3.913649/s
iteration_duration.............: avg=510.39ms min=510.39ms med=510.39ms max=510.39ms p(90)=510.39ms p(95)=510.39ms
iterations.....................: 1 1.956825/s


running (00m00.5s), 0/1 VUs, 1 complete and 0 interrupted iterations
default ✓ [======================================] 1 VUs 00m00.5s/10m0s 1/1 iters, 1 per VU

When you are running a test, you will be able to see the test progress and some test details, but once it finishes, k6 prints the full details and summary statistics of several metrics.

These are built-in metrics, and the most common ones are, for example:

  • http_req_failed: the rate of failed requests (shown as a percentage);
  • http_req_duration: the end-to-end time of the requests (sending, waiting, and receiving combined);
  • http_reqs: the total number of requests made.

The metrics associated with time measures, like http_req_blocked, have some extra statistical values:

  • Median (med) and Average (avg) values;
  • Minimum (min) and Maximum (max) values;
  • p(90) and p(95) values by default (other percentiles, such as p(99), can be added via configuration; see the sketch below).
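
If you want more (or different) statistics in the summary, the reported trend statistics can be overridden; a minimal sketch using the summaryTrendStats option:

export const options = {
  // Report p(99) and p(99.9) in addition to the defaults
  summaryTrendStats: ['avg', 'min', 'med', 'max', 'p(90)', 'p(95)', 'p(99)', 'p(99.9)'],
};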

You could also add some custom metrics, if you feel like these are not enough, and they can belong to any of these 4 types:

  • Counter — represents a custom cumulative counter metric;
  • Gauge — represents a custom metric that holds only the latest added value;
  • Rate — represents a custom metric that keeps track of the percentage of added values that are different than 0;
  • Trend — represents a custom metric that allows computing various statistics over the added values, such as minimum, maximum, mean, or percentiles (see the sketch below).
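
Here is a hedged sketch exercising all four types (the metric names are illustrative assumptions):

import http from 'k6/http';
import { Counter, Gauge, Rate, Trend } from 'k6/metrics';

// Illustrative metric names
const errorCount = new Counter('my_errors');
const lastBodySize = new Gauge('my_body_size');
const successRate = new Rate('my_success_rate');
const waitingTime = new Trend('my_waiting_time');

export default function () {
  const res = http.get('http://test.k6.io/');
  if (res.status !== 200) {
    errorCount.add(1); // cumulative counter
  }
  lastBodySize.add(res.body.length); // gauge keeps only the latest value
  successRate.add(res.status === 200); // rate of true/non-zero values
  waitingTime.add(res.timings.waiting); // trend computes min/max/avg/percentiles
}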

How can I define pass or fail criteria?

To implement these criteria, you can use Thresholds. If your system’s test metrics do not meet the threshold conditions, the test will finish with a failed status.

Taking into consideration the code example provided earlier, let’s suppose that we have the following SLOs:

  • 95% of the requests must respond in less than 100ms;
  • Less than 1% of the requests may return an error;
  • The check conditions must pass for more than 99% of the requests.

Translating this into a k6 test script, it would look like:

import { check } from 'k6';
import http from 'k6/http';

export const options = {
  thresholds: {
    http_req_duration: ['p(95)<100'],
    http_req_failed: ['rate<0.01'],
    checks: ['rate>0.99'],
  },
};

export default function () {
  const res = http.get('http://test.k6.io/');
  check(res, {
    'is status 200': (r) => r.status === 200,
    'body size is 11,278 bytes': (r) => r.body.length == 11278,
  });
}

After the script is executed, each metric that was checked against a threshold will show either a green checkmark or a red cross on its left side, indicating a passed or failed status, respectively:

✓ checks.........................: 100.00% ✓ 2        ✗ 0
✓ http_req_duration..............: avg=93.33ms min=91.68ms med=93.33ms max=94.98ms p(90)=94.65ms p(95)=94.82ms
    { expected_response:true }...: avg=93.33ms min=91.68ms med=93.33ms max=94.98ms p(90)=94.65ms p(95)=94.82ms
✓ http_req_failed................: 0.00% ✓ 0 ✗ 2

Environment variables

In order to make the test more reusable in different contexts, rather than creating several separate scripts, you can use environment variables.

These variables can be set through:

  • -e CLI flag:
k6 run -e YOUR_VARIABLE=value path_to_your_file.js

Within your script, to access these variables, use __ENV.YOUR_VARIABLE (see the sketch after this list).

  • k6 options configurations — using these options, you can define, for example, how long the test will run or how many VUs it will use:
K6_DURATION=20s K6_VUS=5 k6 run path_to_your_file.js
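
For instance, a minimal sketch of reading a variable inside a script (BASE_URL is an assumed name, with a fallback to k6’s public test site):

import http from 'k6/http';

// Assumed variable name; defaults to the public test site when unset
const BASE_URL = __ENV.BASE_URL || 'http://test.k6.io';

export default function () {
  http.get(`${BASE_URL}/`);
}

Running k6 run -e BASE_URL=https://staging.example.com path_to_your_file.js would then point the test at that host.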

You may be wondering how you can get the plots I referenced in the previous article, showing the evolution of VUs or RPS over time with ramp-up and ramp-down phases. This can be solved by using scenarios.

What are scenarios?

Scenarios provide detailed configuration for VUs and iteration schedules, enabling versatile workload modeling in load tests.

The reasons why we should use scenarios include:

  • Enhanced test organization: Declare multiple scenarios in one script, each executing a separate JavaScript function independently.
  • Realistic traffic simulation: Different VU and iteration patterns per scenario, driven by dedicated executors, yield authenticity.
  • Parallel or sequential workloads: Scenarios are autonomous and run in parallel, with sequential execution achievable through startTime adjustment (see the sketch after this list).
  • Precise result analysis: Set distinct environment variables and metric tags per scenario, allowing meticulous analysis.
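
To make this concrete, here is a hedged sketch of two scenarios run back-to-back via startTime (the scenario names, functions, and paths are assumptions):

import http from 'k6/http';

export const options = {
  scenarios: {
    browse: {
      executor: 'constant-vus',
      vus: 10,
      duration: '30s',
      exec: 'browse', // which exported function this scenario runs
    },
    checkout: {
      executor: 'constant-vus',
      vus: 5,
      duration: '30s',
      startTime: '30s', // starts after the browse scenario finishes
      exec: 'checkout',
    },
  },
};

export function browse() {
  http.get('http://test.k6.io/');
}

export function checkout() {
  http.get('http://test.k6.io/my_messages.php');
}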

Each scenario has its VU workload scheduled by an executor. The executor is responsible for configuring the duration of the test and for defining how RPS or VUs evolve over time.

There are several executors that you can pick, based on your specific needs:

  • shared-iterations: shares iterations between the defined number of VUs. For example, if we want to schedule 100 iterations shared by 5 VUs, with a maximum test duration of 15 seconds, the options would look like:
export const options = {
  scenarios: {
    contacts: {
      executor: 'shared-iterations',
      vus: 5,
      iterations: 100,
      maxDuration: '15s',
    },
  },
};

In this case, the 100 iterations will be distributed among the 5 VUs (not necessarily evenly, since faster VUs may complete more of them). Once all iterations have been performed, the test ends.

  • per-vu-iterations: each VU will execute a specific number of iterations. If we schedule 10 VUs to execute 20 iterations each, we will have performed 200 iterations in total, and the script would look like:
export const options = {
  scenarios: {
    contacts: {
      executor: 'per-vu-iterations',
      vus: 10,
      iterations: 20,
      maxDuration: '30s',
    },
  },
};

Use this executor if you need a specific number of VUs to complete the same number of iterations. This can be useful when you have fixed sets of test data that you want to partition between VUs, as illustrated below.
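
For example, a hedged sketch that partitions a fixed user list across VUs using the built-in __VU variable (the list and query parameter are assumptions):

import http from 'k6/http';

// Illustrative fixed test data, partitioned by VU number (__VU is 1-based)
const users = ['alice', 'bob', 'carol', 'dave', 'erin'];

export default function () {
  const user = users[(__VU - 1) % users.length];
  http.get(`http://test.k6.io/?user=${user}`);
}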

  • constant-vus: each VU will execute as many iterations as possible within a certain amount of time. For example, let’s assume we want the scenario to have 20 VUs executing as many iterations as possible for a total of 30 seconds. The options would look like:
export const options = {
  scenarios: {
    contacts: {
      executor: 'constant-vus',
      vus: 20,
      duration: '30s',
    },
  },
};
  • ramping-vus: a variable number of VUs executes as many iterations as possible within a certain amount of time. The difference from the previous executor is that we now have stages that vary the number of VUs performing iterations. Let’s say that we want the number of VUs to increase gradually (ramp-up) from 0 to 10 during the first 20 seconds, and afterwards decrease gradually (ramp-down) to 0:
export const options = {
  scenarios: {
    contacts: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '20s', target: 10 },
        { duration: '10s', target: 0 },
      ],
    },
  },
};

The plot of VUs and RPS evolution would look like:

ramping-vus scenario

In this execution, we got a total of 30 seconds of execution time, with 2 different stages — ramp-up and ramp-down.

  • constant-arrival-rate: given a specific, fixed RPS rate, k6 will dynamically allocate the VUs needed. If we wanted a steady rate of 30 RPS over 30 seconds, with 50 pre-allocated VUs:
export const options = {
  scenarios: {
    contacts: {
      executor: 'constant-arrival-rate',
      duration: '30s',
      rate: 30,
      timeUnit: '1s',
      preAllocatedVUs: 50,
    },
  },
};

In this example, the timeUnit relates to the units used for the rate variable: if it was ‘1m’, the rate would be 30 requests per minute, and not per second.

constant-arrival-rate scenario
  • ramping-arrival-rate: this executor differs from the previous one because, instead of maintaining a fixed RPS rate, we can now include stages and add ramp-up and ramp-down phases. For example, let’s consider a 4-stage test consisting of:
  1. Start at 300 Requests Per Minute (RPM) and keep that rate for a minute;
  2. Evolve gradually (ramp-up), for 2 minutes, from 300 to 600 RPM;
  3. Keep a steady rate of 600 RPM for 4 minutes;
  4. Decrease gradually (ramp-down), for 2 minutes, from 600 to 60 RPM.
export const options = {
  scenarios: {
    contacts: {
      executor: 'ramping-arrival-rate',
      startRate: 300,
      timeUnit: '1m',
      preAllocatedVUs: 50,
      stages: [
        { target: 300, duration: '1m' },
        { target: 600, duration: '2m' },
        { target: 600, duration: '4m' },
        { target: 60, duration: '2m' },
      ],
    },
  },
};

The total test duration would be 9 minutes, and k6 will dynamically adjust the number of VUs needed, making the results look like the following plot:

ramping-arrival-rate scenario
  • externally-controlled: using this executor, we can control and scale an execution via k6’s REST API or the CLI.
export const options = {
  scenarios: {
    contacts: {
      executor: 'externally-controlled',
      vus: 10,
      maxVUs: 50,
      duration: '10m',
    },
  },
};

If we want the test to start at 10 VUs and go up to a maximum of 50 over 10 minutes, we can run the script and then, from another console:

  • Increase the number of currently used VUs to 15:
k6 scale --vus=15
  • Pause the execution:
k6 pause
  • Resume the execution:
k6 resume

How can I organize my tests?

A performance test often targets a service with multiple components and resources, making it challenging to identify performance issues.

To enhance result comprehension, organization, and filtering, k6 introduces the following features:

  • Tags: These categorize checks, thresholds, custom metrics, and requests, enabling detailed filtering.
  • Groups: These organize script logic into functions, tagging all the metrics emitted within each group.

Additionally, you can apply broader tags for cross-test result comparison. Beyond filtering, tags can also restrict operations analyzed by thresholds.

These tags are instrumental in classifying k6 elements and refining test outcomes, and come in 2 types:

  • System tags: Automatically assigned by k6.
  • User-defined tags: Added during script creation; they can tag entities like requests, checks, thresholds, or custom metrics:
import http from 'k6/http';
import { Trend } from 'k6/metrics';
import { check } from 'k6';

const myTrend = new Trend('my_trend');

export default function () {
  // Add tag to request metric data
  const res = http.get('https://httpbin.test.k6.io/', {
    tags: {
      my_tag: "I'm a tag",
    },
  });

  // Add tag to check
  check(res, { 'status is 200': (r) => r.status === 200 }, { my_tag: "I'm a tag" });

  // Add tag to custom metric
  myTrend.add(res.timings.connecting, { my_tag: "I'm a tag" });
}
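
Beyond filtering, tags can also scope a threshold to a subset of requests through k6’s sub-metric syntax; a minimal sketch, assuming a hypothetical type tag and static asset path:

import http from 'k6/http';

export const options = {
  thresholds: {
    // Sub-metric syntax: only samples tagged type=static are evaluated
    'http_req_duration{type:static}': ['p(95)<200'],
  },
};

export default function () {
  // Hypothetical static asset on the test site
  http.get('http://test.k6.io/static/css/site.css', { tags: { type: 'static' } });
}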

Regarding groups, we can organize the load script by functions, or even nest groups using a BDD-style approach. Let’s take a look at the syntax of a script using several groups:

import { group } from 'k6';

export default function () {
  group('visit product listing page', function () {
    // ...
  });
  group('add several products to the shopping cart', function () {
    // ...
  });
  group('visit login page', function () {
    // ...
  });
  group('authenticate', function () {
    // ...
  });
  group('checkout process', function () {
    // ...
  });
}

For each of these groups, k6 will gather the total execution time, reported in the group_duration metric.

Ways to visualize k6 results

There are several ways to visualize your results besides the standard stdout. For that, use the --out flag followed by <key>=<value>, where the key is the output type and the value is the file path or remote destination.

Each script can have one or multiple outputs:

k6 run script.js \
  --out json=test.json \
  --out influxdb=http://localhost:8086/k6 \
  --out csv=file.csv

In this example, we are sending the results in JSON format to test.json, in CSV format to file.csv, and also to an InfluxDB instance at a specific localhost port.

k6 can also act as a client for the StatsD daemon, which lets you forward the metrics to several services that can ingest StatsD.

Furthermore, k6 has built-in support for sending the output metrics to Prometheus using the Prometheus remote write protocol (PrometheusRW), and can also ship metrics to Amazon Timestream, InfluxDB, or even TimescaleDB.

If you want an HTML output file, you can also consider one of the open-source reporters, such as https://github.com/benc-uk/k6-reporter, which produces output like:

HTML k6 reporter

For this, you only need to add this module to your test script:

import { htmlReport } from "https://raw.githubusercontent.com/benc-uk/k6-reporter/main/dist/bundle.js";

And then, outside the test’s default function:

export function handleSummary(data) {
return {
"summary.html": htmlReport(data),
};
}

Is k6 free to use?

Although k6 is an open-source, free tool with which you can develop all your tests using the approaches suggested earlier, k6 also provides a premium service called k6 Cloud.

This is a commercial Software as a Service (SaaS) product, and provides extra functionalities, such as:

  • Running cloud tests;
  • Storing and visualizing test results;
  • Correlating results between different tests;
  • Detecting performance issues.

Are there downsides to using k6?

As with every tool, there might be needs that are not covered by k6, and it will really depend on your scenario.

While in my case the supported protocols are more than enough for my scenarios, this might not be true for everyone.

One struggle that I faced was with the xk6-amqp extension. The problem was that the extension made asynchronous use of the runtime, which caused panic errors:

panic: runtime error: invalid memory address or nil pointer dereference

The problem appeared once I surpassed a certain RPS rate, and it was similar to https://github.com/grafana/xk6-amqp/issues/6.

In the meantime, I discovered that there was an eventLoop branch that solved my problem, and although I asked if there was a chance of seeing it merged into main, one of the code owners told me that the branch was old and needed to be redone.

And since there were more important topics, and this involved lots of changes, they answered that it wouldn’t be done in the near future (https://github.com/grafana/xk6-amqp/issues/10).

Also, I found that, in terms of reporting, Gatling might provide a more feature-rich HTML report.

While we can get detailed results in, for example, New Relic, if we want them in an HTML document we have to rely on an external extension, which doesn’t provide detailed plots of the evolution of each metric over time.

Conclusion

In today’s rapidly evolving digital landscape, the imperative to ensure software resilience is no longer a choice, but a necessity.

Our previous exploration delved into the multifaceted realm of performance testing, encompassing methodologies to gauge application responsiveness, stability, and scalability.

Now equipped with an understanding of these crucial testing paradigms, we embark on a hands-on journey into implementation, introducing k6 — a cutting-edge tool poised to redefine your approach to performance tests.

This article has unraveled the intricacies of k6, empowering you with the skills to execute comprehensive performance tests, optimize digital offerings, and deliver impeccable user experiences.

From its diverse use cases encompassing APIs, microservices, and websites to its prowess in efficient performance testing, precise browser evaluations, chaos-driven resilience, and continuous performance oversight — k6 stands as a powerful ally in the realm of testing.

Amidst a landscape of performance testing tools, k6 distinguishes itself through lightweight, developer-centric design, extensibility, and exceptional performance. Its compatibility with various protocols, coupled with its efficiency in achieving high RPS rates with minimal memory consumption, sets it apart as a compelling choice.

As you navigate your testing endeavors, remember that tool selection hinges on the unique context of your team. While k6 excels in many facets, evaluating trade-offs and aligning tool features with your specific requirements remains crucial.

Remember, performance testing is a dynamic field, and your mastery of k6 can contribute to a future where seamless, high-performance software is the norm.

I hope you enjoyed reading this article!

My name is João Coelho, and I am currently a QA Automation Engineer at Talkdesk. Lately, I have been writing articles about automation, QA, and software engineering topics that might not be well known in the community.

If you want to follow my work, check my Linkedin and my author profile at Medium!

Furthermore, if you’re interested in further supporting me and my content creation efforts, you can do so by buying me a coffee! 😄👇

Your support goes a long way in helping me dedicate more time to researching and sharing valuable insights about automation, QA, and software engineering.
