【Rust】sqlparser-rs ~ SQL Parser for Rust ~

■ はじめに

小ネタ。

業務で、複雑なSQLを解析するために、
sqlparser-rs を使ってParseして、
そのParse結果をDOT言語化して、
それを、コマンドもしくは、以下の関連記事でやったようなツールで
SVG画像化するといったことをやっていた。

そこで、その元となったRust製のSQL Parserである「sqlparser-rs」について
メモしておく。

https://dk521123.hatenablog.com/entry/2023/06/14/174104

目次

【1】sqlparser-rs
【2】環境設定
 0)確認用SQLを用意する
 1)Rust をセットアップする
 2)git clone する
 3)実行する
【3】サンプル
 例1:CREATE TABLE文 
 例2:INSERT INTO文

【1】sqlparser-rs

* (冒頭で既にいったが) Rust製のSQL Parser

https://github.com/sqlparser-rs/sqlparser-rs

【2】環境設定

0)確認用SQLを用意する

sample.sql

SELECT * FROM hello WHERE id='X0001';

1)Rust をセットアップする

* Rust をインストールする
 (その過程で、Visual Studioがインストールされる)

Rust ~ 環境構築編 ~
https://dk521123.hatenablog.com/entry/2023/04/22/234808

2)git clone する

# Git clone で、ソースをダウンロードする
git clone https://github.com/sqlparser-rs/sqlparser-rs.git

# ついでに「0)確認用SQLを用意する」のSQLを直下に置いておく

3)実行する

[1] Visual Studio の場合、[View]-[Terminal]-[Developer Command Prompt]を選択
[2] cargo run で実行

コマンド例

# cargo run --features json_example --example cli sample.sql
cargo run --features json_example --example cli sample.sql > output.txt

出力結果例「output.txt」

Parsing from file 'sample.sql' using GenericDialect
2023-06-15T12:17:24.381Z DEBUG [sqlparser::parser] Parsing sql 'SELECT * FROM hello WHERE id='X0001';'...
2023-06-15T12:17:24.382Z DEBUG [sqlparser::parser] parsing expr
2023-06-15T12:17:24.382Z DEBUG [sqlparser::parser] prefix: Identifier(Ident { value: "id", quote_style: None })
2023-06-15T12:17:24.382Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: Eq, location: Location { line: 0, column: 0 } }
2023-06-15T12:17:24.382Z DEBUG [sqlparser::parser] 0: = 1: 'X0001' 2: ;
2023-06-15T12:17:24.382Z DEBUG [sqlparser::parser] next precedence: 20
2023-06-15T12:17:24.382Z DEBUG [sqlparser::parser] parsing expr
2023-06-15T12:17:24.382Z DEBUG [sqlparser::parser] prefix: Value(SingleQuotedString("X0001"))
2023-06-15T12:17:24.383Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: SemiColon, location: Location { line: 0, column: 0 } }
2023-06-15T12:17:24.383Z DEBUG [sqlparser::parser] 0: ; 1: EOF 2: EOF
2023-06-15T12:17:24.383Z DEBUG [sqlparser::parser] next precedence: 0
2023-06-15T12:17:24.383Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: SemiColon, location: Location { line: 0, column: 0 } }
2023-06-15T12:17:24.383Z DEBUG [sqlparser::parser] 0: ; 1: EOF 2: EOF
2023-06-15T12:17:24.383Z DEBUG [sqlparser::parser] next precedence: 0
Round-trip:
'SELECT * FROM hello WHERE id = 'X0001''
Serialized as JSON:
[
  {
    "Query": {
      "with": null,
      "body": {
        "Select": {
          "distinct": null,
          "top": null,
          "projection": [
            {
              "Wildcard": {
                "opt_exclude": null,
                "opt_except": null,
                "opt_rename": null,
                "opt_replace": null
              }
            }
          ],
          "into": null,
          "from": [
            {
              "relation": {
                "Table": {
                  "name": [
                    {
                      "value": "hello",
                      "quote_style": null
                    }
                  ],
                  "alias": null,
                  "args": null,
                  "with_hints": []
                }
              },
              "joins": []
            }
          ],
          "lateral_views": [],
          "selection": {
            "BinaryOp": {
              "left": {
                "Identifier": {
                  "value": "id",
                  "quote_style": null
                }
              },
              "op": "Eq",
              "right": {
                "Value": {
                  "SingleQuotedString": "X0001"
                }
              }
            }
          },
          "group_by": [],
          "cluster_by": [],
          "distribute_by": [],
          "sort_by": [],
          "having": null,
          "named_window": [],
          "qualify": null
        }
      },
      "order_by": [],
      "limit": null,
      "offset": null,
      "fetch": null,
      "locks": []
    }
  }
]

【3】サンプル

例1:CREATE TABLE文

CREATE TABLE new_table
AS
SELECT
  t1.id
  ,t1.code
  ,t2.name
FROM table_1 AS t1
LEFT OUTER JOIN table_2 As t2
ON t2.id = t1.id
WHERE  t1.code = 'x0001';

出力結果例

[
  {
    "CreateTable": {
      "or_replace": false,
      "temporary": false,
      "external": false,
      "global": null,
      "if_not_exists": false,
      "transient": false,
      "name": [
        {
          "value": "new_table",
          "quote_style": null
        }
      ],
      "columns": [],
      "constraints": [],
      "hive_distribution": "NONE",
      "hive_formats": {
        "row_format": null,
        "storage": null,
        "location": null
      },
      "table_properties": [],
      "with_options": [],
      "file_format": null,
      "location": null,
      "query": {
        "with": null,
        "body": {
          "Select": {
            "distinct": null,
            "top": null,
            "projection": [
              {
                "UnnamedExpr": {
                  "CompoundIdentifier": [
                    {
                      "value": "t1",
                      "quote_style": null
                    },
                    {
                      "value": "id",
                      "quote_style": null
                    }
                  ]
                }
              },
              {
                "UnnamedExpr": {
                  "CompoundIdentifier": [
                    {
                      "value": "t1",
                      "quote_style": null
                    },
                    {
                      "value": "code",
                      "quote_style": null
                    }
                  ]
                }
              },
              {
                "UnnamedExpr": {
                  "CompoundIdentifier": [
                    {
                      "value": "t2",
                      "quote_style": null
                    },
                    {
                      "value": "name",
                      "quote_style": null
                    }
                  ]
                }
              }
            ],
            "into": null,
            "from": [
              {
                "relation": {
                  "Table": {
                    "name": [
                      {
                        "value": "table_1",
                        "quote_style": null
                      }
                    ],
                    "alias": {
                      "name": {
                        "value": "t1",
                        "quote_style": null
                      },
                      "columns": []
                    },
                    "args": null,
                    "with_hints": []
                  }
                },
                "joins": [
                  {
                    "relation": {
                      "Table": {
                        "name": [
                          {
                            "value": "table_2",
                            "quote_style": null
                          }
                        ],
                        "alias": {
                          "name": {
                            "value": "t2",
                            "quote_style": null
                          },
                          "columns": []
                        },
                        "args": null,
                        "with_hints": []
                      }
                    },
                    "join_operator": {
                      "LeftOuter": {
                        "On": {
                          "BinaryOp": {
                            "left": {
                              "CompoundIdentifier": [
                                {
                                  "value": "t2",
                                  "quote_style": null
                                },
                                {
                                  "value": "id",
                                  "quote_style": null
                                }
                              ]
                            },
                            "op": "Eq",
                            "right": {
                              "CompoundIdentifier": [
                                {
                                  "value": "t1",
                                  "quote_style": null
                                },
                                {
                                  "value": "id",
                                  "quote_style": null
                                }
                              ]
                            }
                          }
                        }
                      }
                    }
                  }
                ]
              }
            ],
            "lateral_views": [],
            "selection": {
              "BinaryOp": {
                "left": {
                  "CompoundIdentifier": [
                    {
                      "value": "t1",
                      "quote_style": null
                    },
                    {
                      "value": "code",
                      "quote_style": null
                    }
                  ]
                },
                "op": "Eq",
                "right": {
                  "Value": {
                    "SingleQuotedString": "x0001"
                  }
                }
              }
            },
            "group_by": [],
            "cluster_by": [],
            "distribute_by": [],
            "sort_by": [],
            "having": null,
            "named_window": [],
            "qualify": null
          }
        },
        "order_by": [],
        "limit": null,
        "offset": null,
        "fetch": null,
        "locks": []
      },
      "without_rowid": false,
      "like": null,
      "clone": null,
      "engine": null,
      "default_charset": null,
      "collation": null,
      "on_commit": null,
      "on_cluster": null,
      "order_by": null
    }
  }
]

例2:INSERT INTO文

INSERT INTO main_table(id, code, name)
SELECT
  t1.id
  , t1.code
  , t2.name 
FROM
  table1 t1
INNER JOIN
 table2 t2
ON
 t1.id = t2.id
WHERE
  t1.code != 'X001'
  AND t2.name is not null;

出力結果例

[
  {
    "CreateTable": {
      "or_replace": false,
      "temporary": false,
      "external": false,
      "global": null,
      "if_not_exists": false,
      "transient": false,
      "name": [
        {
          "value": "new_table",
          "quote_style": null
        }
      ],
      "columns": [],
      "constraints": [],
      "hive_distribution": "NONE",
      "hive_formats": {
        "row_format": null,
        "storage": null,
        "location": null
      },
      "table_properties": [],
      "with_options": [],
      "file_format": null,
      "location": null,
      "query": {
        "with": null,
        "body": {
          "Select": {
            "distinct": null,
            "top": null,
            "projection": [
              {
                "UnnamedExpr": {
                  "CompoundIdentifier": [
                    {
                      "value": "t1",
                      "quote_style": null
                    },
                    {
                      "value": "id",
                      "quote_style": null
                    }
                  ]
                }
              },
              {
                "UnnamedExpr": {
                  "CompoundIdentifier": [
                    {
                      "value": "t1",
                      "quote_style": null
                    },
                    {
                      "value": "code",
                      "quote_style": null
                    }
                  ]
                }
              },
              {
                "UnnamedExpr": {
                  "CompoundIdentifier": [
                    {
                      "value": "t2",
                      "quote_style": null
                    },
                    {
                      "value": "name",
                      "quote_style": null
                    }
                  ]
                }
              }
            ],
            "into": null,
            "from": [
              {
                "relation": {
                  "Table": {
                    "name": [
                      {
                        "value": "table_1",
                        "quote_style": null
                      }
                    ],
                    "alias": {
                      "name": {
                        "value": "t1",
                        "quote_style": null
                      },
                      "columns": []
                    },
                    "args": null,
                    "with_hints": []
                  }
                },
                "joins": [
                  {
                    "relation": {
                      "Table": {
                        "name": [
                          {
                            "value": "table_2",
                            "quote_style": null
                          }
                        ],
                        "alias": {
                          "name": {
                            "value": "t2",
                            "quote_style": null
                          },
                          "columns": []
                        },
                        "args": null,
                        "with_hints": []
                      }
                    },
                    "join_operator": {
                      "LeftOuter": {
                        "On": {
                          "BinaryOp": {
                            "left": {
                              "CompoundIdentifier": [
                                {
                                  "value": "t2",
                                  "quote_style": null
                                },
                                {
                                  "value": "id",
                                  "quote_style": null
                                }
                              ]
                            },
                            "op": "Eq",
                            "right": {
                              "CompoundIdentifier": [
                                {
                                  "value": "t1",
                                  "quote_style": null
                                },
                                {
                                  "value": "id",
                                  "quote_style": null
                                }
                              ]
                            }
                          }
                        }
                      }
                    }
                  }
                ]
              }
            ],
            "lateral_views": [],
            "selection": {
              "BinaryOp": {
                "left": {
                  "CompoundIdentifier": [
                    {
                      "value": "t1",
                      "quote_style": null
                    },
                    {
                      "value": "code",
                      "quote_style": null
                    }
                  ]
                },
                "op": "Eq",
                "right": {
                  "Value": {
                    "SingleQuotedString": "x0001"
                  }
                }
              }
            },
            "group_by": [],
            "cluster_by": [],
            "distribute_by": [],
            "sort_by": [],
            "having": null,
            "named_window": [],
            "qualify": null
          }
        },
        "order_by": [],
        "limit": null,
        "offset": null,
        "fetch": null,
        "locks": []
      },
      "without_rowid": false,
      "like": null,
      "clone": null,
      "engine": null,
      "default_charset": null,
      "collation": null,
      "on_commit": null,
      "on_cluster": null,
      "order_by": null
    }
  }
]

関連記事

Rust ~ 環境構築編 ~
https://dk521123.hatenablog.com/entry/2023/04/22/234808
Python + DOT言語で図作成するには
https://dk521123.hatenablog.com/entry/2023/06/14/174104
jq コマンド ~ コマンドで JSON を扱う ~
https://dk521123.hatenablog.com/entry/2020/02/01/000000