Wednesday, 10 June 2020

MariaDB Explain

https://mariadb.com/kb/en/explain/

EXPLAIN

Syntax

EXPLAIN tbl_name
Or
EXPLAIN [EXTENDED | PARTITIONS] 
  {SELECT select_options | UPDATE update_options | DELETE delete_options}

Description

The EXPLAIN statement can be used either as a synonym for DESCRIBE or as a way to obtain information about how MariaDB executes a SELECT (as well as UPDATE and DELETE since MariaDB 10.0.5) statement:
  • 'EXPLAIN tbl_name' is synonymous with 'DESCRIBE tbl_name' or 'SHOW COLUMNS FROM tbl_name'.
  • When you precede a SELECT statement (or, since MariaDB 10.0.5, an UPDATE or a DELETE as well) with the keyword EXPLAIN, MariaDB displays information from the optimizer about the query execution plan. That is, MariaDB explains how it would process the SELECT, UPDATE or DELETE, including information about how tables are joined and in which order. EXPLAIN EXTENDED can be used to provide additional information.
  • EXPLAIN PARTITIONS has been available since MySQL 5.1.5. It is useful only when examining queries involving partitioned tables.
    For details, see Partition pruning and selection.
  • ANALYZE statement, which performs the query as well as producing EXPLAIN output, and provides actual as well as estimated statistics, has been available from MariaDB 10.1.0.
  • Since MariaDB 10.0.5, it has been possible to have EXPLAIN output printed in the slow query log. See EXPLAIN in the Slow Query Log for details.
Since MariaDB 10.0, SHOW EXPLAIN shows the output of a running statement. In some cases, its output can be closer to reality than EXPLAIN.
Since MariaDB 10.1, the ANALYZE statement runs a statement and returns information about its execution plan. It also shows additional columns, to check how much the optimizer's estimation about filtering and found rows are close to reality.
There is an online EXPLAIN Analyzer that you can use to share EXPLAIN and EXPLAIN EXTENDED output with others.
EXPLAIN can acquire metadata locks in the same way that SELECT does, as it needs to know table metadata and, sometimes, data as well.

Columns in EXPLAIN ... SELECT

Column nameDescription
idSequence number that shows in which order tables are joined.
select_typeWhat kind of SELECT the table comes from.
tableAlias name of table. Materialized temporary tables for sub queries are named <subquery#>
typeHow rows are found from the table (join type).
possible_keyskeys in table that could be used to find rows in the table
keyThe name of the key that is used to retrieve rows. NULL is no key was used.
key_lenHow many bytes of the key that was used (shows if we are using only parts of the multi-column key).
refThe reference that is used as the key value.
rowsAn estimate of how many rows we will find in the table for each key lookup.
ExtraExtra information about this join.
Here are descriptions of the values for some of the more complex columns in EXPLAIN ... SELECT:

"Select_type" Column

The select_type column can have the following values:
ValueDescription
DEPENDENT SUBQUERYThe SUBQUERY is DEPENDENT.
DEPENDENT UNIONThe UNION is DEPENDENT.
DERIVEDThe SELECT is DERIVED from the PRIMARY.
MATERIALIZEDThe SUBQUERY is MATERIALIZED.
PRIMARYThe SELECT is a PRIMARY one.
SIMPLEThe SELECT is a SIMPLE one.
SUBQUERYThe SELECT is a SUBQUERY of the PRIMARY.
UNCACHEABLE SUBQUERYThe SUBQUERY is UNCACHEABLE.
UNCACHEABLE UNIONThe UNION is UNCACHEABLE.
UNIONThe SELECT is a UNION of the PRIMARY.
UNION RESULTThe result of the UNION.
LATERAL DERIVEDThe SELECT uses a Lateral Derived optimization

"Type" Column

This column contains information on how the table is accessed.
ValueDescription
ALLA full table scan is done for the table (all rows are read). This is bad if the table is large and the table is joined against a previous table! This happens when the optimizer could not find any usable index to access rows.
constThere is only one possibly matching row in the table. The row is read before the optimization phase and all columns in the table are treated as constants.
eq_refA unique index is used to find the rows. This is the best possible plan to find the row.
fulltextA fulltext index is used to access the rows.
index_mergeA 'range' access is done for for several index and the found rows are merged. The key column shows which keys are used.
index_subqueryThis is similar as ref, but used for sub queries that are transformed to key lookups.
indexA full scan over the used index. Better than ALL but still bad if index is large and the table is joined against a previous table.
rangeThe table will be accessed with a key over one or more value ranges.
ref_or_nullLike 'ref' but in addition another search for the 'null' value is done if the first value was not found. This happens usually with sub queries.
refA non unique index or prefix of an unique index is used to find the rows. Good if the prefix doesn't match many rows.
systemThe table has 0 or 1 rows.
unique_subqueryThis is similar as eq_ref, but used for sub queries that are transformed to key lookups

"Extra" Column

This column consists of one or more of the following values, separated by ';'
Note that some of these values are detected after the optimization phase.
The optimization phase can do the following changes to the WHERE clause:
  • Add the expressions from the ON and USING clauses to the WHERE clause.
  • Constant propagation: If there is column=constant, replace all column instances with this constant.
  • Replace all columns from 'const' tables with their values.
  • Remove the used key columns from the WHERE (as this will be tested as part of the key lookup).
  • Remove impossible constant sub expressions. For example WHERE '(a=1 and a=2) OR b=1' becomes 'b=1'.
  • Replace columns with other columns that has identical values: Example: WHERE a=b and a=c may be treated as 'WHERE a=b and a=c and b=c'.
  • Add extra conditions to detect impossible row conditions earlier. This happens mainly with OUTER JOIN where we in some cases add detection of NULL values in the WHERE (Part of 'Not exists' optimization). This can cause an unexpected 'Using where' in the Extra column.
  • For each table level we remove expressions that have already been tested when we read the previous row. Example: When joining tables t1 with t2 using the following WHERE 't1.a=1 and t1.a=t2.b', we don't have to test 't1.a=1' when checking rows in t2 as we already know that this expression is true.
ValueDescription
const row not foundThe table was a system table (a table with should exactly one row), but no row was found.
DistinctIf distinct optimization (remove duplicates) was used. This is marked only for the last table in the SELECT.
Full scan on NULL keyThe table is a part of the sub query and if the value that is used to match the sub query will be NULL, we will do a full table scan.
Impossible HAVINGThe used HAVING clause is always false so the SELECT will return no rows.
Impossible WHERE noticed after reading const tables.The used WHERE clause is always false so the SELECT will return no rows. This case was detected after we had read all 'const' tables and used the column values as constant in the WHERE clause. For example: WHERE const_column=5 and const_column had a value of 4.
Impossible WHEREThe used WHERE clause is always false so the SELECT will return no rows. For example: WHERE 1=2
No matching min/max rowDuring early optimization of MIN()/MAX() values it was detected that no row could match the WHERE clause. The MIN()/MAX() function will return NULL.
no matching row in const tableThe table was a const table (a table with only one possible matching row), but no row was found.
No tables usedThe SELECT was a sub query that did not use any tables. For example a there was no FROM clause or a FROM DUAL clause.
Not existsStop searching after more row if we find one single matching row. This optimization is used with LEFT JOIN where one is explicitly searching for rows that doesn't exists in the LEFT JOIN TABLE. Example: SELECT * FROM t1 LEFT JOIN t2 on (...) WHERE t2.not_null_column IS NULL. As t2.not_null_column can only be NULL if there was no matching row for on condition, we can stop searching if we find a single matching row.
Open_frm_onlyFor information_schema tables. Only the frm (table definition file was opened) was opened for each matching row.
Open_full_tableFor information_schema tables. A full table open for each matching row is done to retrieve the requested information. (Slow)
Open_trigger_onlyFor information_schema tables. Only the trigger file definition was opened for each matching row.
Range checked for each record (index map: ...)This only happens when there was no good default index to use but there may some index that could be used when we can treat all columns from previous table as constants. For each row combination the optimizer will decide which index to use (if any) to fetch a row from this table. This is not fast, but faster than a full table scan that is the only other choice. The index map is a bitmask that shows which index are considered for each row condition.
Scanned 0/1/all databasesFor information_schema tables. Shows how many times we had to do a directory scan.
Select tables optimized awayAll tables in the join was optimized away. This happens when we are only using COUNT(*), MIN() and MAX() functions in the SELECT and we where able to replace all of these with constants.
Skip_open_tableFor information_schema tables. The queried table didn't need to be opened.
unique row not foundThe table was detected to be a const table (a table with only one possible matching row) during the early optimization phase, but no row was found.
Using filesortFilesort is needed to resolve the query. This means an extra phase where we first collect all columns to sort, sort them with a disk based merge sort and then use the sorted set to retrieve the rows in sorted order. If the column set is small, we store all the columns in the sort file to not have to go to the database to retrieve them again.
Using indexOnly the index is used to retrieve the needed information from the table. There is no need to perform an extra seek to retrieve the actual record.
Using index conditionLike 'Using where' but the where condition is pushed down to the table engine for internal optimization at the index level.
Using index condition(BKA)Like 'Using index condition' but in addition we use batch key access to retrieve rows.
Using index for group-byThe index is being used to resolve a GROUP BY or DISTINCT query. The rows are not read. This is very efficient if the table has a lot of identical index entries as duplicates are quickly jumped over.
Using intersect(...)For index_merge joins. Shows which index are part of the intersect.
Using join bufferWe store previous row combinations in a row buffer to be able to match each row against all of the rows combinations in the join buffer at one go.
Using sort_union(...)For index_merge joins. Shows which index are part of the union.
Using temporaryA temporary table is created to hold the result. This typically happens if you are using GROUP BY, DISTINCT or ORDER BY.
Using whereA WHERE expression (in additional to the possible key lookup) is used to check if the row should be accepted. If you don't have 'Using where' together with a join type of ALL, you are probably doing something wrong!
Using where with pushed conditionLike 'Using where' but the where condition is pushed down to the table engine for internal optimization at the row level.
Using bufferThe UPDATE statement will first buffer the rows, and then run the updates, rather than do updates on the fly. See Using Buffer UPDATE Algorithm for a detailed explanation.

EXPLAIN EXTENDED

The EXTENDED keyword adds another column, filtered, to the output. This is a percentage estimate of the table rows that will be filtered by the condition.
An EXPLAIN EXTENDED will always throw a warning, as it adds extra Message information to a subsequent SHOW WARNINGS statement. This includes what the SELECT query would look like after optimizing and rewriting rules are applied and how the optimizer qualifies columns and tables.

Examples

As synonym for DESCRIBE or SHOW COLUMNS FROM:
DESCRIBE city;
+------------+----------+------+-----+---------+----------------+
| Field      | Type     | Null | Key | Default | Extra          |
+------------+----------+------+-----+---------+----------------+
| Id         | int(11)  | NO   | PRI | NULL    | auto_increment |
| Name       | char(35) | YES  |     | NULL    |                |
| Country    | char(3)  | NO   | UNI |         |                |
| District   | char(20) | YES  | MUL |         |                |
| Population | int(11)  | YES  |     | NULL    |                |
+------------+----------+------+-----+---------+----------------+
A simple set of examples to see how EXPLAIN can identify poor index usage:
CREATE TABLE IF NOT EXISTS `employees_example` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `first_name` varchar(30) NOT NULL,
  `last_name` varchar(40) NOT NULL,
  `position` varchar(25) NOT NULL,
  `home_address` varchar(50) NOT NULL,
  `home_phone` varchar(12) NOT NULL,
  `employee_code` varchar(25) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `employee_code` (`employee_code`),
  KEY `first_name` (`first_name`,`last_name`)
) ENGINE=Aria;

INSERT INTO `employees_example` (`first_name`, `last_name`, `position`, `home_address`, `home_phone`, `employee_code`)
  VALUES
  ('Mustapha', 'Mond', 'Chief Executive Officer', '692 Promiscuous Plaza', '326-555-3492', 'MM1'),
  ('Henry', 'Foster', 'Store Manager', '314 Savage Circle', '326-555-3847', 'HF1'),
  ('Bernard', 'Marx', 'Cashier', '1240 Ambient Avenue', '326-555-8456', 'BM1'),
  ('Lenina', 'Crowne', 'Cashier', '281 Bumblepuppy Boulevard', '328-555-2349', 'LC1'),
  ('Fanny', 'Crowne', 'Restocker', '1023 Bokanovsky Lane', '326-555-6329', 'FC1'),
  ('Helmholtz', 'Watson', 'Janitor', '944 Soma Court', '329-555-2478', 'HW1');

SHOW INDEXES FROM employees_example;
+-------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table             | Non_unique | Key_name      | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| employees_example |          0 | PRIMARY       |            1 | id            | A         |           7 |     NULL | NULL   |      | BTREE      |         |               |
| employees_example |          0 | employee_code |            1 | employee_code | A         |           7 |     NULL | NULL   |      | BTREE      |         |               |
| employees_example |          1 | first_name    |            1 | first_name    | A         |        NULL |     NULL | NULL   |      | BTREE      |         |               |
| employees_example |          1 | first_name    |            2 | last_name     | A         |        NULL |     NULL | NULL   |      | BTREE      |         |               |
+-------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
SELECT on a primary key:
EXPLAIN SELECT * FROM employees_example WHERE id=1;
+------+-------------+-------------------+-------+---------------+---------+---------+-------+------+-------+
| id   | select_type | table             | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+------+-------------+-------------------+-------+---------------+---------+---------+-------+------+-------+
|    1 | SIMPLE      | employees_example | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
+------+-------------+-------------------+-------+---------------+---------+---------+-------+------+-------+
The type is const, which means that only one possible result could be returned. Now, returning the same record but searching by their phone number:
EXPLAIN SELECT * FROM employees_example WHERE home_phone='326-555-3492';
+------+-------------+-------------------+------+---------------+------+---------+------+------+-------------+
| id   | select_type | table             | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+-------------------+------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | employees_example | ALL  | NULL          | NULL | NULL    | NULL |    6 | Using where |
+------+-------------+-------------------+------+---------------+------+---------+------+------+-------------+
Here, the type is All, which means no index could be used. Looking at the rows count, a full table scan (all six rows) had to be performed in order to retrieve the record. If it's a requirement to search by phone number, an index will have to be created.
SHOW EXPLAIN example:
SHOW EXPLAIN FOR 1;
+------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra       |
+------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
|    1 | SIMPLE      | tbl   | index | NULL          | a    | 5       | NULL | 1000107 | Using index |
+------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
1 row in set, 1 warning (0.00 sec)

Example of ref_or_null Optimization

SELECT * FROM table_name
  WHERE key_column=expr OR key_column IS NULL;
ref_or_null is something that often happens when you use subqueries with NOT IN as then one has to do an extra check for NULL values if the first value didn't have a matching row.

See Also

No comments:

Post a Comment

Note: only a member of this blog may post a comment.