V2EX = way to explore
V2EX 是一个关于分享和探索的地方
现在注册
已注册用户请  登录
NewConn
V2EX  ›  程序员

关于几张超大表联合查询查询 SQL 的问题

  •  
  •   NewConn · 2020-06-16 15:25:03 +08:00 · 2392 次点击
    这是一个创建于 1613 天前的主题,其中的信息可能已经有所发展或是发生改变。
    WITH E AS (
        SELECT ID
        FROM A
        WHERE type = 2
            AND DELETE_FLAG = 0
            AND uid = 41
        UNION
        SELECT a.ID
        FROM A a, B b, A c
        WHERE a.type = 3
            AND a.DELETE_FLAG = 0
            AND a.uid = 41
            AND a.ID = b.ID
            AND b.parent_id = c.ID
    )
    SELECT COUNT(1)
    FROM D d, E e, A t, F f
    WHERE d.DELETE_FLAG = 0
        AND d.ID_ = e.ID
        AND e.ID = t.ID
        AND d.aid = f.ID
        AND f.DELETE_FLAG = 0
    

    其中 D 表最大,有 37M(37,000,000)的数据,A 表 410k 数据,F 表 165k 数据

    SQL 语句是查询一个数量,在原来 Oracle 下 15s 左右就可以查询;迁移 其他数据库后需要 100s 。请问各路大神有什么好的解决方法? 目前 D 表已经是分表了,再分表也不现实; D 表的( DELETE_FLAG,ID,aid )也加了联合索引。 请问有没有其他的优化方法。 执行计划如下:

    PLAN_TABLE_OUTPUT
    Plan hash value: 244456078
     
    ------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                                | Name                      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                         |                           |     1 |    43 |       |   420K  (1)| 01:24:03 |
    |   1 |  SORT AGGREGATE                          |                           |     1 |    43 |       |            |          |
    |*  2 |   HASH JOIN                              |                           |    22M|   906M|  3104K|   420K  (1)| 01:24:03 |
    |*  3 |    TABLE ACCESS FULL                     | FULL_ATTRIBUTE_RELATION   |   151K|  1327K|       |   548   (1)| 00:00:07 |
    |*  4 |    HASH JOIN                             |                           |    21M|   707M|  7312K|   371K  (1)| 01:14:24 |
    |   5 |     INDEX FAST FULL SCAN                 | SYS_C0019687              |   415K|  2436K|       |   344   (1)| 00:00:05 |
    |*  6 |     HASH JOIN                            |                           |    21M|   582M|       |   329K  (1)| 01:05:59 |
    |   7 |      VIEW                                |                           | 69509 |   882K|       |  5236   (1)| 00:01:03 |
    |   8 |       SORT UNIQUE                        |                           | 69509 |  1177K|  1784K|  5236   (1)| 00:01:03 |
    |   9 |        UNION-ALL                         |                           |       |       |       |            |          |
    |* 10 |         VIEW                             | index$_join$_001          | 64314 |  1004K|       |  3146   (1)| 00:00:38 |
    |* 11 |          HASH JOIN                       |                           |       |       |       |            |          |
    |* 12 |           HASH JOIN                      |                           |       |       |       |            |          |
    |* 13 |            HASH JOIN                     |                           |       |       |       |            |          |
    |  14 |             BITMAP CONVERSION TO ROWIDS  |                           | 64314 |  1004K|       |    11   (0)| 00:00:01 |
    |* 15 |              BITMAP INDEX SINGLE VALUE   | NODE_PDUID_INDEX          |       |       |       |            |          |
    |  16 |             BITMAP CONVERSION TO ROWIDS  |                           | 64314 |  1004K|       |    37   (0)| 00:00:01 |
    |* 17 |              BITMAP INDEX SINGLE VALUE   | NODE_DELETE_FLAG_INDEX    |       |       |       |            |          |
    |  18 |            BITMAP CONVERSION TO ROWIDS   |                           | 64314 |  1004K|       |    38   (0)| 00:00:01 |
    |* 19 |             BITMAP INDEX SINGLE VALUE    | NODE_TYPE_INDEX           |       |       |       |            |          |
    |  20 |           INDEX FAST FULL SCAN           | SYS_C0019687              | 64314 |  1004K|       |  1579   (1)| 00:00:19 |
    |* 21 |         HASH JOIN                        |                           |  5195 |   172K|       |  1743   (1)| 00:00:21 |
    |* 22 |          HASH JOIN                       |                           |  5637 |   154K|       |  1397   (1)| 00:00:17 |
    |* 23 |           VIEW                           | index$_join$_002          |  5627 | 90032 |       |  1329   (1)| 00:00:16 |
    |* 24 |            HASH JOIN                     |                           |       |       |       |            |          |
    |* 25 |             HASH JOIN                    |                           |       |       |       |            |          |
    |* 26 |              HASH JOIN                   |                           |       |       |       |            |          |
    |  27 |               BITMAP CONVERSION TO ROWIDS|                           |  5627 | 90032 |       |     4   (0)| 00:00:01 |
    |* 28 |                BITMAP INDEX SINGLE VALUE | NODE_TYPE_INDEX           |       |       |       |            |          |
    |  29 |               BITMAP CONVERSION TO ROWIDS|                           |  5627 | 90032 |       |    11   (0)| 00:00:01 |
    |* 30 |                BITMAP INDEX SINGLE VALUE | NODE_PDUID_INDEX          |       |       |       |            |          |
    |  31 |              BITMAP CONVERSION TO ROWIDS |                           |  5627 | 90032 |       |    37   (0)| 00:00:01 |
    |* 32 |               BITMAP INDEX SINGLE VALUE  | NODE_DELETE_FLAG_INDEX    |       |       |       |            |          |
    |  33 |             INDEX FAST FULL SCAN         | SYS_C0019687              |  5627 | 90032 |       |  1579   (1)| 00:00:19 |
    |  34 |           TABLE ACCESS FULL              | FULL_REL_SPECNODE         | 26489 |   310K|       |    68   (0)| 00:00:01 |
    |  35 |          INDEX FAST FULL SCAN            | SYS_C0019687              |   415K|  2436K|       |   344   (1)| 00:00:05 |
    |* 36 |      VIEW                                | index$_join$_005          |    37M|   530M|       |   324K  (1)| 01:04:54 |
    |* 37 |       HASH JOIN                          |                           |       |       |       |            |          |
    |* 38 |        HASH JOIN                         |                           |       |       |       |            |          |
    |  39 |         BITMAP CONVERSION TO ROWIDS      |                           |    37M|   530M|       |  1687   (1)| 00:00:21 |
    |* 40 |          BITMAP INDEX SINGLE VALUE       | FULL_ITEMSINFO_41_INDEX_3 |       |       |       |            |          |
    |  41 |         INDEX FAST FULL SCAN             | FULL_ITEMSINFO_41_INDEX_1 |    37M|   530M|       |   115K  (1)| 00:23:06 |
    |  42 |        BITMAP CONVERSION TO ROWIDS       |                           |    37M|   530M|       | 12843   (1)| 00:02:35 |
    |  43 |         BITMAP INDEX FULL SCAN           | FULL_ITEMSINFO_41_INDEX_2 |       |       |       |            |          |
    ------------------------------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
    "   2 - access(""I"".""ATTRRID_""=""A"".""ID"")"
    "   3 - filter(""A"".""ATTRR_DELETEFLAG""=0)"
    "   4 - access(""N"".""NODE_ID""=""T"".""NODE_ID"")"
    "   6 - access(""I"".""NODE_ID_""=""N"".""NODE_ID"")"
    "  10 - filter(""NODE_PDUID""=41 AND ""NODE_TYPE""=2 AND ""NODE_DELETEFLAG""=0)"
      11 - access(ROWID=ROWID)
      12 - access(ROWID=ROWID)
      13 - access(ROWID=ROWID)
    "  15 - access(""NODE_PDUID""=41)"
    "  17 - access(""NODE_DELETEFLAG""=0)"
    "  19 - access(""NODE_TYPE""=2)"
    "  21 - access(""B"".""NODE_PARENTID""=""C"".""NODE_ID"")"
    "  22 - access(""A"".""NODE_ID""=""B"".""NODE_ID"")"
    "  23 - filter(""A"".""NODE_TYPE""=3 AND ""A"".""NODE_PDUID""=41 AND ""A"".""NODE_DELETEFLAG""=0)"
      24 - access(ROWID=ROWID)
      25 - access(ROWID=ROWID)
      26 - access(ROWID=ROWID)
    "  28 - access(""A"".""NODE_TYPE""=3)"
    "  30 - access(""A"".""NODE_PDUID""=41)"
    "  32 - access(""A"".""NODE_DELETEFLAG""=0)"
    "  36 - filter(""I"".""ISDELETED""=0)"
      37 - access(ROWID=ROWID)
      38 - access(ROWID=ROWID)
    "  40 - access(""I"".""ISDELETED""=0)"
    
    18 条回复    2020-06-17 21:34:01 +08:00
    liprais
        1
    liprais  
       2020-06-16 15:26:09 +08:00
    先贴执行计划
    wysnylc
        2
    wysnylc  
       2020-06-16 15:28:13 +08:00
    异步多次查询,查询结果使用 map filter 拼接
    takemeaway
        3
    takemeaway  
       2020-06-16 15:28:58 +08:00
    表倒是不大,,,就是连得有点多。
    搞不懂这样的需求,干嘛这样设计表。
    NewConn
        4
    NewConn  
    OP
       2020-06-16 15:31:21 +08:00
    ```
    ------------------------------------------------------------------------------------------------------------------------------------------
    | Id | Description | Owner | Name | Rows | Cost | Bytes | Remark |
    ------------------------------------------------------------------------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | | 1 | 7387.245 | | |
    | 1 | AGGR | | | 1 | 7387.245 | | |
    | 2 | HASH JOIN(R) | | | 340000 | 7387.245 | | |
    | 3 | HASH JOIN(R) | | | 680000 | 1666.518 | | |
    | 4 | HASH JOIN(R) | | | 3509333 | 1666.435 | | |
    | 5 | TABLE ACCESS BY INDEX ONLY | FULLDB | FULL_ITEMSINFO_41 I | 34832500 | 0.083 | | |
    | 6 | INDEX RANGE SCAN | FULLDB | FULL_ITEMSINFO_41_INDEX_3 | 34832500 | 0.083 | | |
    | 7 | SUBSELECT | | | 82850 | 1666.353 | | |
    | 8 | HASH UNION | | | 82850 | 1666.353 | | |
    | 9 | TABLE ACCESS BY INDEX ONLY | FULLDB | FULL_SPECNODEINFO | 82522 | 1420.588 | | |
    | 10 | INDEX RANGE SCAN | FULLDB | FSNI_IDX_1 | 82522 | 1420.588 | | |
    | 11 | NESTED LOOPS | | | 328 | 245.765 | | |
    | 12 | NESTED LOOPS | | | 329 | 225.038 | | |
    | 13 | TABLE ACCESS BY INDEX ONLY | FULLDB | FULL_SPECNODEINFO A | 5072 | 100.588 | | |
    | 14 | INDEX RANGE SCAN | FULLDB | FSNI_IDX_1 | 5072 | 100.588 | | |
    | 15 | TABLE ACCESS BY INDEX ONLY | FULLDB | FULL_REL_SPECNODE B | 27052 | 124.450 | | |
    | 16 | INDEX RANGE SCAN | FULLDB | IX_NODEID_PARENTID_FRS | 27052 | 124.450 | | |
    | 17 | TABLE ACCESS BY INDEX ONLY | FULLDB | FULL_SPECNODEINFO C | 415200 | 20.727 | | |
    | 18 | INDEX UNIQUE SCAN | FULLDB | PK_FULL_SPECNODEINFO | 415200 | 20.727 | | |
    | 19 | TABLE ACCESS BY INDEX ONLY | FULLDB | FULL_ATTRIBUTE_RELATION A | 118912 | 0.083 | | |
    | 20 | INDEX RANGE SCAN | FULLDB | FULL_ATTRIBUTE_RELATION_INDEX_1 | 118912 | 0.083 | | |
    | 21 | TABLE ACCESS BY INDEX ONLY | FULLDB | FULL_SPECNODEINFO T | 415200 | 5720.727 | | |
    | 22 | INDEX FAST FULL SCAN | FULLDB | PK_FULL_SPECNODEINFO | 415200 | 5720.727 | | |
    ------------------------------------------------------------------------------------------------------------------------------------------
    ```

    执行计划是这个,问题描述里面是 Oracle 的执行计划。谢谢大家
    hooopo
        5
    hooopo  
       2020-06-16 15:42:59 +08:00 via Android
    pg 吗 什么版本 cte 改掉试试
    liprais
        6
    liprais  
       2020-06-16 15:55:48 +08:00
    @hooopo gauss 是华为魔改的 pgsql 的 mpp 数据库
    fangcan
        7
    fangcan  
       2020-06-16 16:11:34 +08:00
    为什么 3 千多万的数据才 37MB ? 我们的一般都要好多 GB 了
    Still4
        8
    Still4  
       2020-06-16 17:24:36 +08:00
    最外层 id=2 这里,似乎执行了扫全表,rows=22M

    应该是建立了临时表 E,导致索引失效?
    hooopo
        9
    hooopo  
       2020-06-16 17:31:22 +08:00 via Android
    @liprais 为什么这东西有人敢用
    BadAngel
        10
    BadAngel  
       2020-06-16 17:38:09 +08:00
    @hooopo 也许就是菊厂的兄弟,有时候没办法,为了不被勒脖子,总要牺牲点啥
    zhangysh1995
        11
    zhangysh1995  
       2020-06-16 18:43:02 +08:00
    WITH E AS (
    SELECT ID
    FROM A
    WHERE type = 2
    AND DELETE_FLAG = 0
    AND uid = 41
    UNION
    SELECT a.ID
    FROM A a, B b, A c
    WHERE a.type = 3
    AND a.DELETE_FLAG = 0
    AND a.uid = 41
    AND a.ID = b.ID
    AND b.parent_id = c.ID
    )

    这里面为什么不能先把 a.type in (2,3) and a.delete_flag =0 and a.uid = 41 先选择出来然后再别的操作?这一句的 selectivity 有多少?有多少符合条件的?
    lenqu
        12
    lenqu  
       2020-06-16 19:01:41 +08:00
    我觉得联合操作有点多
    想问你们分表的时候没有具体分表逻辑,比如某一 key 的范围划分?
    CRVV
        13
    CRVV  
       2020-06-16 19:06:03 +08:00
    > D 表的( DELETE_FLAG,ID,aid )也加了联合索引

    D.ID 在这个查询里就没被用到


    这个 SQL 里面,E 就是 A,t 也是 A,最外层写了一个 A INNER JOIN A ON A.id = A.id ,确定是这样的么?


    这个 SQL 写得有点奇怪,比如还有一个问题是 cte 里面不用写 UNION,一次 SELECT 就可以了。
    先把它整理清楚了再优化吧。
    CRVV
        14
    CRVV  
       2020-06-16 19:10:38 +08:00
    @zhangysh1995
    如果他发的 SQL 是对的,那个 cte 等价于

    SELECT id
    FROM A
    LEFT JOIN B ON A. id = B. id
    LEFT JOIN A AS parent ON B.parent_id = parent. id
    WHERE A. uid = 41 AND (A.type = 2 OR (A.type = 3 AND parent. id IS NOT NULL))

    然后这个 cte 也不用写 cte,直接和下面的 JOIN 写在一起就好了


    另外这个 EXPLAIN 明显不是 PostgreSQL,我也没看出来这是什么数据库

    ```
    创建新回复过程中遇到一些问题:
    请不要在每一个回复中都包括外链,这看起来像是在 spamming
    ```
    @Livid 这个外链的检查也太敏感了吧
    liprais
        15
    liprais  
       2020-06-16 19:18:37 +08:00 via iPhone
    @hooopo 实话实话,高斯做了十年还是有点东西的,据我所知测试也很严谨,但是内斗太严重,真正有多少东西能进入最终的产品就不得而知了
    NewConn
        16
    NewConn  
    OP
       2020-06-17 10:34:12 +08:00
    @CRVV 感谢老哥。说实话我刚来不久,也不是非常清楚业务逻辑,所以最近遇到的这些 SQl 的 union 、left join,以及几千行的存储过程,我也不知道怎么从业务逻辑入手优化,只能单拎出来语句进行优化。老哥给的建议我详细看看,非常感谢
    NewConn
        17
    NewConn  
    OP
       2020-06-17 10:35:36 +08:00
    @zhangysh1995 感谢老哥,我按您的思路去梳理一下这个 SQL 逻辑
    zhangysh1995
        18
    zhangysh1995  
       2020-06-17 21:34:01 +08:00
    @NewConn 不好意思我是妹子。
    关于   ·   帮助文档   ·   博客   ·   API   ·   FAQ   ·   实用小工具   ·   5490 人在线   最高记录 6679   ·     Select Language
    创意工作者们的社区
    World is powered by solitude
    VERSION: 3.9.8.5 · 30ms · UTC 08:44 · PVG 16:44 · LAX 00:44 · JFK 03:44
    Developed with CodeLauncher
    ♥ Do have faith in what you're doing.