使用 sql 语句查询数据的时候,如果表里有好几个索引,mysql 优化器会自己判断使用哪个索引进行搜索。但是,where 条件比较复杂的时候,优化器使用的索引就不一定是最优索引了。

例如,最近有个需求,查询 4 月份每个商家的微信、支付宝订单额,sql 语句如下:

SELECT
    o.client_id,
    c.client_name,
    SUM(IF(o.pay_by IN(4, 5), o.money, 0)) AS m_alipay,
    SUM(IF(o.pay_by IN(1, 2, 3), o.money, 0)) AS m_wechat
FROM
    orders o
    JOIN clients c ON o.client_id = c.id
WHERE
    o.pay_time >= UNIX_TIMESTAMP("2020-04-01")
    AND o.pay_time < UNIX_TIMESTAMP("2020-04-30 23:59:59")
    AND c.status = 1
GROUP BY
    o.client_id

orders 订单表数据量 400 万条,clients 商家信息表 2000 条,这个 sql 语句执行花了差不多 26 秒。

通常我们会在慢查询 sql 语句前面加上 explain 关键字,查看执行计划:

mysql> EXPLAIN
    -> SELECT
    ->     o.client_id,
    ->     c.client_name,
    ->     SUM(IF(o.pay_by IN(4, 5), o.money, 0)) AS m_alipay,
    ->     SUM(IF(o.pay_by IN(1, 2, 3), o.money, 0)) AS m_wechat
    -> FROM
    ->     orders o
    ->     JOIN clients c ON o.client_id = c.id
    -> WHERE
    ->     o.pay_time >= UNIX_TIMESTAMP("2020-04-01")
    ->     AND o.pay_time < UNIX_TIMESTAMP("2020-04-30 23:59:59")
    ->     AND c.status = 1
    -> GROUP BY
    ->     o.client_id;
+----+-------------+-------+------+---------------------------------------+---------------+---------+-----------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys                         | key           | key_len | ref       | rows | Extra                                        |
+----+-------------+-------+------+---------------------------------------+---------------+---------+-----------+------+----------------------------------------------+
|  1 | SIMPLE      | c     | ALL  | PRIMARY                               | NULL          | NULL    | NULL      |  715 | Using where; Using temporary; Using filesort |
+----+-------------+-------+------+---------------------------------------+---------------+---------+-----------+------+----------------------------------------------+
|  1 | SIMPLE      | o     | ref  | idx_pay_time,idx_client_id,idx_lin... | idx_client_id | 99      | shop.c.id |  201 | Using where                                  |
+----+-------------+-------+------+---------------------------------------+---------------+---------+-----------+------+----------------------------------------------+
2 rows in set, 1 warning (0.01 sec)

orders 订单表有好几个索引,如 idx_pay_time、idx_client_id,分别是 pay_time、client_id 字段的单列索引。在执行计划中可以看到,mysql 使用订单表的 idx_client_id 索引进行搜索,预估扫描的数据是 201 条,但是 4 月份订单不可能只有这么点!可见,优化器使用的 idx_client_id 并不是最优索引。

在 mysql 中,添加 force index() 可以强制 sql 查询时使用指定索引,如 force index(idx_pay_time) 强制使用 pay_time 字段的索引,sql 改成:

SELECT
    o.client_id,
    c.client_name,
    SUM(IF(o.pay_by IN(4, 5), o.money, 0)) AS m_alipay,
    SUM(IF(o.pay_by IN(1, 2, 3), o.money, 0)) AS m_wechat
FROM
    orders o FORCE INDEX(idx_pay_time)
    JOIN clients c ON o.client_id = c.id
WHERE
    o.pay_time >= UNIX_TIMESTAMP("2020-04-01")
    AND o.pay_time < UNIX_TIMESTAMP("2020-04-30 23:59:59")
    AND c.status = 1
GROUP BY
    o.client_id

在前面加上 explain 查看执行计划:

mysql> EXPLAIN
    -> SELECT
    ->     o.client_id,
    ->     c.client_name,
    ->     SUM(IF(o.pay_by IN(4, 5), o.money, 0)) AS m_alipay,
    ->     SUM(IF(o.pay_by IN(1, 2, 3), o.money, 0)) AS m_wechat
    -> FROM
    ->     orders o FORCE INDEX(idx_pay_time)
    ->     JOIN clients c ON o.client_id = c.id
    -> WHERE
    ->     o.pay_time >= UNIX_TIMESTAMP("2020-04-01")
    ->     AND o.pay_time < UNIX_TIMESTAMP("2020-04-30 23:59:59")
    ->     AND c.status = 1
    -> GROUP BY
    ->     o.client_id;
+----+-------------+-------+--------+---------------------------------------+--------------+---------+------------------+--------+-------------------------------------------------------+
| id | select_type | table | type   | possible_keys                         | key          | key_len | ref              | rows   | Extra                                                 |
+----+-------------+-------+--------+---------------------------------------+--------------+---------+------------------+--------+-------------------------------------------------------+
|  1 | SIMPLE      | o     | range  | idx_pay_time,idx_client_id,idx_lin... | idx_pay_time | 5       | NULL             | 671630 | Using index condition; Using where; Using temporar... |
+----+-------------+-------+--------+---------------------------------------+--------------+---------+------------------+--------+-------------------------------------------------------+
|  1 | SIMPLE      | c     | eq_ref | PRIMARY                               | PRIMARY      | 98      | shop.o.client_id |      1 | Using where                                           |
+----+-------------+-------+--------+---------------------------------------+--------------+---------+------------------+--------+-------------------------------------------------------+
2 rows in set, 1 warning (0.01 sec)

这次 orders 订单表扫描的数据是 67 万多条,这个订单数才是接近真实情况的。

最后,执行 sql 查询,优化后的语句耗时 2.7 秒。