OceanBase 数据库目前只支持一种基于代价的查询改写——或展开(OR-EXPANSION)。

创新互联公司主要从事成都做网站、网站制作、网页设计、企业做网站、公司建网站等业务。立足成都服务黄埔,10余年网站建设经验,价格优惠、服务专业,欢迎来电咨询建站服务:18982081108
数据库中很多高级的改写规则(例如 complex view merge 和窗口函数改写)都需要基于代价进行改写,OceanBase 数据库后续版本会支持这些复杂的改写规则。
OR-EXPANSION 是将一个查询改写成若干个用 UNION 组成的子查询,可以为每个子查询提供更优的优化空间,但是也会导致多个子查询的执行,所以这个改写需要基于代价去判断。
OR-EXPANSION 的改写主要有如下三个作用:
允许每个分支使用不同的索引来加速查询。
如下例所示,Q1 会被改写成 Q2 的形式,其中 Q2 中的谓词 LNNVL(t1.a = 1) 保证了这两个子查询不会生成重复的结果。如果不进行改写,Q1 一般来说会选择主表作为访问路径,对于 Q2 来说,如果 t1 上存在索引(a)和索引(b),那么该改写可能会让 Q2 中的每一个子查询选择索引作为访问路径。
Q1: 
obclient>SELECT * FROM t1 WHERE t1.a = 1 OR t1.b = 1;
Q2: 
obclient>SELECT * FROM t1 WHERE t1.a = 1 UNION ALL SELECT * FROM t1.b = 1 
       AND LNNVL(t1.a = 1);
完整示例如下:
obclient>CREATE TABLE t1(a INT, b INT, c INT, d INT, e INT, INDEX IDX_a(a), 
         INDEX IDX_b(b));
Query OK, 0 rows affected (0.17 sec)
/*如果不进行 OR-EXPANSION 的改写,该查询只能使用主表访问路径*/
obclient> EXPLAIN SELECT/*+NO_REWRITE()*/ * FROM t1 WHERE t1.a = 1 OR t1.b = 1;
+--------------------------------------------------------------+
| Query Plan                                                                         |
+--------------------------------------------------------------+
| ===================================
|ID|OPERATOR  |NAME|EST. ROWS|COST|
-----------------------------------
|0 |TABLE SCAN|t1  |4        |649 |
===================================
Outputs & filters:
-------------------------------------
  0 - output([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), filter([t1.a = 1 OR t1.b = 1]),
      access([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), partitions(p0)
/*改写之后,每个子查询能使用不同的索引访问路径*/
obclient>EXPLAIN SELECT * FROM t1 WHERE t1.a = 1 OR t1.b = 1;
+------------------------------------------------------------------------+
| Query Plan                                                                                         |
+------------------------------------------------------------------------+
| =========================================
|ID|OPERATOR   |NAME     |EST. ROWS|COST|
-----------------------------------------
|0 |UNION ALL  |         |3        |190 |
|1 | TABLE SCAN|t1(idx_a)|2        |94  |
|2 | TABLE SCAN|t1(idx_b)|1        |95  |
=========================================
Outputs & filters:
-------------------------------------
  0 - output([UNION(t1.a, t1.a)], [UNION(t1.b, t1.b)], [UNION(t1.c, t1.c)], [UNION(t1.d, t1.d)], [UNION(t1.e, t1.e)]), filter(nil)
  1 - output([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), filter(nil),
      access([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), partitions(p0)
  2 - output([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), filter([lnnvl(t1.a = 1)]),
      access([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), partitions(p02
允许每个分支使用不同的连接算法来加速查询,避免使用笛卡尔联接。
如下例所示,Q1 会被改写成 Q2 的形式。对于 Q1 来说,它的联接方式只能是 NESTED LOOP JOIN (笛卡尔乘积), 但是被改写之后,每个子查询都可以选择 NESTED LOOP JOIN、HASH JOIN 或者 MERGE JOIN,这样会有更多的优化空间。
Q1: 
obclient>SELECT * FROM t1, t2 WHERE t1.a = t2.a OR t1.b = t2.b;
Q2: 
obclient>SELECT * FROM t1, t2 WHERE t1.a = t2.a UNION ALL
     SELECT * FROM t1, t2 WHERE t1.b = t2.b AND LNNVL(t1.a = t2.a);
完整示例如下:
obclient> CREATE TABLE t1(a INT, b INT);
Query OK, 0 rows affected (0.17 sec)
obclient> CREATE TABLE t2(a INT, b INT);
Query OK, 0 rows affected (0.13 sec)
/*如果不进行改写,只能使用 NESTED LOOP JOIN*/
obclient> EXPLAIN SELECT/*+NO_REWRITE()*/ * FROM t1, t2 
       WHERE t1.a = t2.a OR t1.b = t2.b;
+--------------------------------------------------------------------------+
| Query Plan                                                                                          |
+--------------------------------------------------------------------------+
| ===========================================
|ID|OPERATOR        |NAME|EST. ROWS|COST  |
-------------------------------------------
|0 |NESTED-LOOP JOIN|    |3957     |585457|
|1 | TABLE SCAN     |t1  |1000     |499   |
|2 | TABLE SCAN     |t2  |4        |583   |
===========================================
Outputs & filters:
-------------------------------------
  0 - output([t1.a], [t1.b], [t2.a], [t2.b]), filter(nil),
      conds(nil), nl_params_([t1.a], [t1.b])
  1 - output([t1.a], [t1.b]), filter(nil),
      access([t1.a], [t1.b]), partitions(p0)
  2 - output([t2.a], [t2.b]), filter([? = t2.a OR ? = t2.b]),
      access([t2.a], [t2.b]), partitions(p0)
/*被改写之后,每个子查询都使用了 HASH JOIN*/
obclient>  EXPLAIN SELECT * FROM t1, t2 WHERE t1.a = t2.a OR t1.b = t2.b;
+--------------------------------------------------------------------------+
| Query Plan                                                                                         |
+--------------------------------------------------------------------------+
|ID|OPERATOR    |NAME|EST. ROWS|COST|
-------------------------------------
|0 |UNION ALL   |    |2970     |9105|
|1 | HASH JOIN  |    |1980     |3997|
|2 |  TABLE SCAN|t1  |1000     |499 |
|3 |  TABLE SCAN|t2  |1000     |499 |
|4 | HASH JOIN  |    |990      |3659|
|5 |  TABLE SCAN|t1  |1000     |499 |
|6 |  TABLE SCAN|t2  |1000     |499 |
=====================================
Outputs & filters:
-------------------------------------
  0 - output([UNION(t1.a, t1.a)], [UNION(t1.b, t1.b)], [UNION(t2.a, t2.a)], [UNION(t2.b, t2.b)]), filter(nil)
  1 - output([t1.a], [t1.b], [t2.a], [t2.b]), filter(nil),
      equal_conds([t1.a = t2.a]), other_conds(nil)
  2 - output([t1.a], [t1.b]), filter(nil),
      access([t1.a], [t1.b]), partitions(p0)
  3 - output([t2.a], [t2.b]), filter(nil),
      access([t2.a], [t2.b]), partitions(p0)
  4 - output([t1.a], [t1.b], [t2.a], [t2.b]), filter(nil),
      equal_conds([t1.b = t2.b]), other_conds([lnnvl(t1.a = t2.a)])
  5 - output([t1.a], [t1.b]), filter(nil),
      access([t1.a], [t1.b]), partitions(p0)
  6 - output([t2.a], [t2.b]), filter(nil),
      access([t2.a], [t2.b]), partitions(p0)
允许每个分支分别消除排序,更加快速的获取 TOP-K 结果。
如下例所示,Q1 会被改写成 Q2。对于 Q1 来说,执行方式是只能把满足条件的行数找出来,然后进行排序,最终取 TOP-10 结果。对于 Q2 来说,如果存在索引(a,b), 那么 Q2 中的两个子查询都可以使用索引把排序消除,每个子查询取 TOP-10 结果,然后最终对这 20 行数据排序一下取出最终的 TOP-10 行。
Q1: 
obclient>SELECT * FROM t1 WHERE t1.a = 1 OR t1.a = 2 ORDER BY b LIMIT 10;
Q2: 
obclient>SELECT * FROM  
    (SELECT * FROM  t1 WHERE t1.a = 1 ORDER BY b LIMIT 10 UNION ALL
     SELECT * FROM  t1 WHERE t1.a = 2 ORDER BY b LIMIT 10) AS TEMP
    ORDER BY temp.b LIMIT 10;
完整示例如下:
obclient> CREATE TABLE t1(a INT, b INT, INDEX IDX_a(a, b));
Query OK, 0 rows affected (0.20 sec)
/*不改写的话,需要排序最终获取 TOP-K 结果*/
obclient> EXPLAIN SELECT/*+NO_REWRITE()*/ * FROM t1 WHERE t1.a = 1 OR t1.a = 2 
        ORDER BY b LIMIT 10;
+-------------------------------------------------------------------------+
| Query Plan                                                                                         |
+-------------------------------------------------------------------------+
| ==========================================
|ID|OPERATOR    |NAME     |EST. ROWS|COST|
------------------------------------------
|0 |LIMIT       |         |4        |77  |
|1 | TOP-N SORT |         |4        |76  |
|2 |  TABLE SCAN|t1(idx_a)|4        |73  |
==========================================
Outputs & filters:
-------------------------------------
  0 - output([t1.a], [t1.b]), filter(nil), limit(10), offset(nil)
  1 - output([t1.a], [t1.b]), filter(nil), sort_keys([t1.b, ASC]), topn(10)
  2 - output([t1.a], [t1.b]), filter(nil),
      access([t1.a], [t1.b]), partitions(p0)
/* 进行改写的话,排序算子可以被消除,最终获取 TOP-K 结果*/
obclient>EXPLAIN SELECT * FROM t1 WHERE t1.a = 1 OR t1.a = 2 
        ORDER BY b LIMIT 10;
+-------------------------------------------------------------------------+
| Query Plan                                                                                          |
+-------------------------------------------------------------------------+
| ===========================================
|ID|OPERATOR     |NAME     |EST. ROWS|COST|
-------------------------------------------
|0 |LIMIT        |         |3        |76  |
|1 | TOP-N SORT  |         |3        |76  |
|2 |  UNION ALL  |         |3        |74  |
|3 |   TABLE SCAN|t1(idx_a)|2        |37  |
|4 |   TABLE SCAN|t1(idx_a)|1        |37  |
===========================================
Outputs & filters:
-------------------------------------
  0 - output([UNION(t1.a, t1.a)], [UNION(t1.b, t1.b)]), filter(nil), limit(10), offset(nil)
  1 - output([UNION(t1.a, t1.a)], [UNION(t1.b, t1.b)]), filter(nil), sort_keys([UNION(t1.b, t1.b), ASC]), topn(10)
  2 - output([UNION(t1.a, t1.a)], [UNION(t1.b, t1.b)]), filter(nil)
  3 - output([t1.a], [t1.b]), filter(nil),
      access([t1.a], [t1.b]), partitions(p0),
      limit(10), offset(nil)
  4 - output([t1.a], [t1.b]), filter([lnnvl(t1.a = 1)]),
      access([t1.a], [t1.b]), partitions(p0),
      limit(10), offset(nil)
                本文题目:创新互联OceanBase教程:OceanBase基于代价的查询改写
                
                分享URL:http://www.csdahua.cn/qtweb/news13/54313.html
            
网站建设、网络推广公司-快上网,是专注品牌与效果的网站制作,网络营销seo公司;服务项目有等
声明:本网站发布的内容(图片、视频和文字)以用户投稿、用户转载内容为主,如果涉及侵权请尽快告知,我们将会在第一时间删除。文章观点不代表本网站立场,如需处理请联系客服。电话:028-86922220;邮箱:631063699@qq.com。内容未经允许不得转载,或转载时需注明来源: 快上网