среда, 19 декабря 2012 г.

InnoDB, extended secondary keys.

It's a well-know fact that InnoDB secondary keys contain both user defined columns and the primary key columns. For example, if a table has PRIMARY KEY(pk) and secondary key k1(f1), then index k1 is internally stored as k1(f1,pk).

Prior to version 5.6.9, the MySQL optimizer could only partially use these extended primary key columns: they could be used for sorting and to provide index only access. Starting from MySQL 5.6.9, the optimizer makes full use of the extended columns. This means that 'ref' access, range access, MIN/MAX optimizations, index_merge, loose index scan etc all works as if you had created the index with all primary key columns in all secondary keys. The new feature is turned on and off by optimizer switch 'use_index_extensions' and is on by default.

Consider we have following table:

CREATE TABLE t1
(
  f1 INT NOT NULL DEFAULT '0',
  f2 INT NOT NULL DEFAULT '0',
  f3 DATE DEFAULT NULL,
  PRIMARY KEY (f1, f2),
  KEY k_f3 (f3)
) ENGINE = InnoDB;

INSERT INTO t1 VALUES (1, 1, '1998-01-01'), (1, 2, '1999-01-01'),
(1, 3, '2000-01-01'), (1, 4, '2001-01-01'), (1, 5, '2002-01-01'),
(2, 1, '1998-01-01'), (2, 2, '1999-01-01'), (2, 3, '2000-01-01'),
(2, 4, '2001-01-01'), (2, 5, '2002-01-01'), (3, 1, '1998-01-01'),
(3, 2, '1999-01-01'), (3, 3, '2000-01-01'), (3, 4, '2001-01-01'),
(3, 5, '2002-01-01'), (4, 1, '1998-01-01'), (4, 2, '1999-01-01'),
(4, 3, '2000-01-01'), (4, 4, '2001-01-01'), (4, 5, '2002-01-01'),
(5, 1, '1998-01-01'), (5, 2, '1999-01-01'), (5, 3, '2000-01-01'),
(5, 4, '2001-01-01'), (5, 5, '2002-01-01');

Lets take a look at the difference in results with "use_index_extensions=off|on":


mysql> set optimizer_switch= "use_index_extensions=off";
Query OK, 0 rows affected (0.00 sec)
mysql> FLUSH STATUS;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT COUNT(*) FROM t1 WHERE f3 = '2000-01-01' AND f1 BETWEEN 1 AND 3;
+----------+
| COUNT(*) |
+----------+
|        3 |
+----------+
1 row in set (0.01 sec)
mysql> SHOW STATUS LIKE 'handler_read_next';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Handler_read_next | 5     |
+-------------------+-------+
1 row in set (0.00 sec)


mysql> set optimizer_switch= "use_index_extensions=on";
Query OK, 0 rows affected (0.00 sec)
mysql> FLUSH STATUS;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT COUNT(*) FROM t1 WHERE f3 = '2000-01-01' AND f1 BETWEEN 1 AND 3;
+----------+
| COUNT(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)
mysql> SHOW STATUS LIKE 'handler_read_next';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Handler_read_next | 3     |
+-------------------+-------+
1 row in set (0.00 sec)

Number of handler_read_next is decreased with use_index_extensions=on. It happens because optimizer uses 'f3, f1' pair for range access. This behaviour change is indirectly can be visible is explain:


mysql> set optimizer_switch= "use_index_extensions=off";
Query OK, 0 rows affected (0.00 sec)
mysql> EXPLAIN SELECT COUNT(*) FROM t1 WHERE f3 = '2000-01-01' AND f1 BETWEEN 1 AND 3;
+----+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra                    |
+----+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
|  1 | SIMPLE      | t1    | ref  | PRIMARY,k_f3  | k_f3 | 4       | const |    5 | Using where; Using index |
+----+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)


mysql> set optimizer_switch= "use_index_extensions=on";
Query OK, 0 rows affected (0.00 sec)
mysql> EXPLAIN SELECT COUNT(*) FROM t1 WHERE f3 = '2000-01-01' AND f1 BETWEEN 1 AND 3;
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|  1 | SIMPLE      | t1    | range | PRIMARY,k_f3  | k_f3 | 8       | NULL |    3 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+

As you can see key_len field is changed (4 with "use_index_extensions=off" and 8 with "use_index_extensions=on") which means that 'f3, f1' pair is used with enabled feature.

Another example shows the improvement when using JOIN(the same table with 12800 records):


mysql> SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
|    12800 |
+----------+
1 row in set (0.07 sec)

mysql> set optimizer_switch= "use_index_extensions=off";
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT count(*) FROM t1 AS t1 JOIN t1 AS t2 ON t2.f1 = t1.f1
WHERE t1.f3 = '2000-01-01' AND t2.f3 = '2002-01-01';
+----------+
| count(*) |
+----------+
|    10240 |
+----------+
1 row in set (0.47 sec)

set optimizer_switch= "use_index_extensions=on";
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT count(*) FROM t1 AS t1 JOIN t1 AS t2 ON t2.f1 = t1.f1
WHERE t1.f3 = '2000-01-01' AND t2.f3 = '2002-01-01';
+----------+
| count(*) |
+----------+
|    10240 |
+----------+
1 row in set (0.18 sec)


As you can see execution of the query is roughly 2.5 better with enabled "use_index_extensions" switch.