New – Parallel Query for Amazon Aurora

Amazon Aurora is a relational database that was designed to take full advantage of the abundance of networking, processing, and storage resources available in the cloud. While maintaining compatibility with MySQL and PostgreSQL on the user-visible side, Aurora makes use of a modern, purpose-built distributed storage system under the covers. Your data is striped across hundreds of storage nodes distributed over three distinct AWS Availability Zones, with two copies per zone, on fast SSD storage. Here’s what this looks like (extracted from Getting Started with Amazon Aurora):

New Parallel Query
When we launched Aurora we also hinted at our plans to apply the same scale-out design principle to other layers of the database stack. Today I would like to tell you about our next step along that path.

Each node in the storage layer pictured above also includes plenty of processing power. Aurora is now able to make great use of that processing power by taking your analytical queries (generally those that process all or a large part of a good-sized table) and running them in parallel across hundreds or thousands of storage nodes, with speed benefits approaching two orders of magnitude. Because this new model reduces network, CPU, and buffer pool contention, you can run a mix of analytical and transactional queries simultaneously on the same table while maintaining high throughput for both types of queries.

The instance class determines the number of parallel queries that can be active at a given time:

db.r*.large – 1 concurrent parallel query session
db.r*.xlarge – 2 concurrent parallel query sessions
db.r*.2xlarge – 4 concurrent parallel query sessions
db.r*.4xlarge – 8 concurrent parallel query sessions
db.r*.8xlarge – 16 concurrent parallel query sessions
db.r4.16xlarge – 16 concurrent parallel query sessions

You can use the aurora_pq parameter to enable and disable the use of parallel queries at the global and the session level.

Parallel queries enhance the performance of over 200 types of single-table predicates and hash joins. The Aurora query optimizer will automatically decide whether to use Parallel Query based on the size of the table and the amount of table data that is already in memory; you can also use the aurora_pq_force session variable to override the optimizer for testing purposes.

Parallel Query in Action
You will need to create a fresh cluster in order to make use of the Parallel Query feature. You can create one from scratch, or you can restore a snapshot.

To create a cluster that supports Parallel Query, I simply choose Provisioned with Aurora parallel query enabled as the Capacity type:

I used the CLI to restore a 100 GB snapshot for testing, and then explored one of the queries from the TPC-H benchmark. Here’s the basic query:

SELECT
  l_orderkey,
  SUM(l_extendedprice * (1-l_discount)) AS revenue,
  o_orderdate,
  o_shippriority

FROM customer, orders, lineitem

WHERE
  c_mktsegment='AUTOMOBILE'
  AND c_custkey = o_custkey
  AND l_orderkey = o_orderkey
  AND o_orderdate < date '1995-03-13'
  AND l_shipdate > date '1995-03-13'

GROUP BY
  l_orderkey,
  o_orderdate,
  o_shippriority

ORDER BY
  revenue DESC,
  o_orderdate LIMIT 15;

The EXPLAIN command shows the query plan, including the use of Parallel Query:

+----+-------------+----------+------+-------------------------------+------+---------+------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
| id | select_type | table    | type | possible_keys                 | key  | key_len | ref  | rows      | Extra                                                                                                                          |
+----+-------------+----------+------+-------------------------------+------+---------+------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
|  1 | SIMPLE      | customer | ALL  | PRIMARY                       | NULL | NULL    | NULL |  14354602 | Using where; Using temporary; Using filesort                                                                                   |
|  1 | SIMPLE      | orders   | ALL  | PRIMARY,o_custkey,o_orderdate | NULL | NULL    | NULL | 154545408 | Using where; Using join buffer (Hash Join Outer table orders); Using parallel query (4 columns, 1 filters, 1 exprs; 0 extra)   |
|  1 | SIMPLE      | lineitem | ALL  | PRIMARY,l_shipdate            | NULL | NULL    | NULL | 606119300 | Using where; Using join buffer (Hash Join Outer table lineitem); Using parallel query (4 columns, 1 filters, 1 exprs; 0 extra) |
+----+-------------+----------+------+-------------------------------+------+---------+------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.01 sec)

Here is the relevant part of the Extras column:

Using parallel query (4 columns, 1 filters, 1 exprs; 0 extra)

The query runs in less than 2 minutes when Parallel Query is used:

+------------+-------------+-------------+----------------+
| l_orderkey | revenue     | o_orderdate | o_shippriority |
+------------+-------------+-------------+----------------+
|   92511430 | 514726.4896 | 1995-03-06  |              0 |
|  593851010 | 475390.6058 | 1994-12-21  |              0 |
|  188390981 | 458617.4703 | 1995-03-11  |              0 |
|  241099140 | 457910.6038 | 1995-03-12  |              0 |
|  520521156 | 457157.6905 | 1995-03-07  |              0 |
|  160196293 | 456996.1155 | 1995-02-13  |              0 |
|  324814597 | 456802.9011 | 1995-03-12  |              0 |
|   81011334 | 455300.0146 | 1995-03-07  |              0 |
|   88281862 | 454961.1142 | 1995-03-03  |              0 |
|   28840519 | 454748.2485 | 1995-03-08  |              0 |
|  113920609 | 453897.2223 | 1995-02-06  |              0 |
|  377389669 | 453438.2989 | 1995-03-07  |              0 |
|  367200517 | 453067.7130 | 1995-02-26  |              0 |
|  232404000 | 452010.6506 | 1995-03-08  |              0 |
|   16384100 | 450935.1906 | 1995-03-02  |              0 |
+------------+-------------+-------------+----------------+
15 rows in set (1 min 53.36 sec)

I can disable Parallel Query for the session (I can use an RDS custom cluster parameter group for a longer-lasting effect):

set SESSION aurora_pq=OFF;

The query runs considerably slower without it:

+------------+-------------+-------------+----------------+
| l_orderkey | o_orderdate | revenue     | o_shippriority |
+------------+-------------+-------------+----------------+
|   92511430 | 1995-03-06  | 514726.4896 |              0 |
...
|   16384100 | 1995-03-02  | 450935.1906 |              0 |
+------------+-------------+-------------+----------------+
15 rows in set (1 hour 25 min 51.89 sec)

This was on a db.r4.2xlarge instance; other instance sizes, data sets, access patterns, and queries will perform differently. I can also override the query optimizer and insist on the use of Parallel Query for testing purposes:

set SESSION aurora_pq_force=ON;

Things to Know
Here are a couple of things to keep in mind when you start to explore Amazon Aurora Parallel Query:

Engine Support - We are launching with support for MySQL 5.6, and are working on support for MySQL 5.7 and PostgreSQL.

Table Formats - The table row format must be COMPACT; partitioned tables are not supported.

Data Types - The TEXT, BLOB, and GEOMETRY data types are not supported.

DDL - The table cannot have any pending fast online DDL operations.

Cost - You can make use of Parallel Query at no extra charge. However, because it makes direct access to storage, there is a possibility that your IO cost will increase.

Give it a Shot
This feature is available now and you can start using it today!

— Jeff;

AWS News Blog

New – Parallel Query for Amazon Aurora

Resources

Follow