This ticket tracks the work remaining to complete adding source location information into sqlparser ## Background - @Nyrox and @iffyio introduced the foundations for storing source location information in the AST nodes in https://github.com/apache/datafusion-sqlparser-rs/pull/1435. This information can be used to provide more specific error messages, and potentially syntax highlighting among other great things. - In order to 1. minimize the disruption to downstream projects that use `sqlparser-rs` and 2. avoid a single massive PR and 3. work together as a community, we are implementing this feature incrementally over several releases. Let's use this ticket to organize needed / remaining work. If you find additional features are needed / issues, please leave a comment on this ticket ## Source Span Contributing Guidelines For contributing source spans improvement in addition to the general [contribution guidelines], please make sure to pay attention to the following: - `Ident` always have correct source spans - We try to minimize downstream breaking changes - Consider using [`Span::union`] in favor of storing spans on all nodes - Any metadata added to compute spans must not change semantics (`Eq`, `Ord`, `Hash`, etc.). See [`AttachedToken`] for more information. [contribution guidelines]: https://github.com/apache/datafusion-sqlparser-rs/blob/main/README.md#contributing [`Span::union`]: ast::Span::union [`AttachedToken`]: ast::helpers::attached_token::AttachedToken When adding support for source spans on a type, consider the impact to consumers of that type and whether your change would require a consumer to do non-trivial changes to their code. Example of a trivial change ```rust match node { ast::Query { field1, field2, location: _, // add a new line to ignored location } If adding source spans to a type would require a significant change like wrapping the type, please open an issue to discuss. # AST Node Equality and Hashes When adding tokens to AST nodes, make sure to store them using the [AttachedToken](https://docs.rs/sqlparser/latest/sqlparser/ast/helpers/struct.AttachedToken.html) (TODO UPDATE SOURCE REFERENCE)to ensure that semantically equivalent AST nodes compare as equal and hash to the same value. i.e. `select 5` and `SELECT 5` would compare as different `Select` nodes, if the select token was stored directly. f.e. ```rust struct Select { select_token: AttachedToken, // only used for spans /// remaining fields field1, field2, ... } ``` Some high level work (list from https://github.com/apache/datafusion-sqlparser-rs/pull/1435) - Store keyword `TokenWithLocation` for expressions that currently don't have them - Implement spans for the rest of the AST, namely `Statement`s ## Tasks - [ ] https://github.com/apache/datafusion-sqlparser-rs/issues/1563 - [ ] Look into reducing AST size with smaller offset sizes (e.g. `u32` rather than `usize`) - [ ] Store spans for ast::value::Value - [x] https://github.com/apache/datafusion-sqlparser-rs/issues/1858 - [ ] https://github.com/apache/datafusion-sqlparser-rs/issues/1548