Tree-sitter Parsing
Parse source code into concrete syntax trees using Tree-sitter. Based on go-tree-sitter bindings.
Tree-sitter produces syntax trees that:
- Represent the full structure of source code
- Update incrementally as code changes
- Are robust to syntax errors (partial parsing)
- Support pattern-based queries using S-expressions
Loading
local treesitter = require("treesitter")
Supported Languages
| Language | Aliases | Root Node |
|---|---|---|
| Go | go, golang |
source_file |
| JavaScript | js, javascript |
program |
| TypeScript | ts, typescript |
program |
| TSX | tsx |
program |
| Python | python, py |
module |
| Lua | lua |
chunk |
| PHP | php |
program |
| C# | csharp, cs, c# |
compilation_unit |
| HTML | html, html5 |
document |
| Markdown | markdown, md |
document |
| SQL | sql |
- |
local langs = treesitter.supported_languages()
-- {go = true, javascript = true, python = true, ...}
Quick Start
Parse Code
local code = [[
func hello() {
return "Hello!"
}
]]
local tree, err = treesitter.parse("go", code)
if err then
return nil, err
end
local root = tree:root_node()
print(root:kind()) -- "source_file"
print(root:child_count()) -- number of top-level declarations
Query Syntax Tree
local code = [[
func hello() {}
func world() {}
]]
local tree = treesitter.parse("go", code)
local root = tree:root_node()
-- Find all function names
local query = treesitter.query("go", [[
(function_declaration name: (identifier) @func_name)
]])
local captures = query:captures(root, code)
for _, capture in ipairs(captures) do
print(capture.name, capture.text)
end
-- "func_name" "hello"
-- "func_name" "world"
Parsing
Simple Parse
Parse source code into a syntax tree. Creates a temporary parser internally.
local tree, err = treesitter.parse("go", code)
| Parameter | Type | Description |
|---|---|---|
language |
string | Language name or alias |
code |
string | Source code |
Returns: Tree, error
Reusable Parser
Create a parser for repeated parsing or incremental updates.
local parser = treesitter.parser()
parser:set_language("go")
local tree1 = parser:parse("package main")
-- Incremental parse with old tree
local tree2 = parser:parse("package main\nfunc foo() {}", tree1)
parser:close()
Returns: Parser
Parser Methods
| Method | Description |
|---|---|
set_language(lang) |
Set parser language, returns boolean, error |
get_language() |
Get current language name |
parse(code, old_tree?) |
Parse code, optionally with old tree for incremental parsing |
set_timeout(duration) |
Set parse timeout (string like "1s" or nanoseconds) |
set_ranges(ranges) |
Set byte ranges to parse |
reset() |
Reset parser state |
close() |
Release parser resources |
Syntax Trees
Get Root Node
local tree = treesitter.parse("go", "package main")
local root = tree:root_node()
print(root:kind()) -- "source_file"
print(root:text()) -- "package main"
Tree Methods
| Method | Description |
|---|---|
root_node() |
Get root node of tree |
root_node_with_offset(bytes, point) |
Get root with offset applied |
language() |
Get tree's language object |
copy() |
Create deep copy of tree |
walk() |
Create cursor for traversal |
edit(edit_table) |
Apply incremental edit |
changed_ranges(other_tree) |
Get ranges that changed |
included_ranges() |
Get ranges included during parsing |
dot_graph() |
Get DOT graph representation |
close() |
Release tree resources |
Incremental Editing
Update the tree when source code changes:
local code = "func main() { x := 1 }"
local tree = treesitter.parse("go", code)
-- Mark edit: changed "1" to "100" at byte 19
tree:edit({
start_byte = 19,
old_end_byte = 20,
new_end_byte = 22,
start_row = 0,
start_column = 19,
old_end_row = 0,
old_end_column = 20,
new_end_row = 0,
new_end_column = 22
})
-- Re-parse with edited tree (faster than full parse)
local parser = treesitter.parser()
parser:set_language("go")
local new_tree = parser:parse("func main() { x := 100 }", tree)
Nodes
Nodes represent elements in the syntax tree.
Node Types
local node = root:child(0)
-- Type information
print(node:kind()) -- "package_clause"
print(node:type()) -- same as kind()
print(node:is_named()) -- true for significant nodes
print(node:grammar_name()) -- grammar rule name
Navigation
-- Children
local child = node:child(0) -- by index (0-based)
local named = node:named_child(0) -- named children only
local count = node:child_count()
local named_count = node:named_child_count()
-- Siblings
local next = node:next_sibling()
local prev = node:prev_sibling()
local next_named = node:next_named_sibling()
local prev_named = node:prev_named_sibling()
-- Parent
local parent = node:parent()
-- By field name
local name_node = func_decl:child_by_field_name("name")
local field = node:field_name_for_child(0)
Position Information
-- Byte offsets
local start = node:start_byte()
local end_ = node:end_byte()
-- Row/column positions (0-based)
local start_pt = node:start_point() -- {row = 0, column = 0}
local end_pt = node:end_point() -- {row = 0, column = 12}
-- Source text
local text = node:text()
Error Detection
if root:has_error() then
-- Tree contains syntax errors
end
if node:is_error() then
-- This specific node is an error
end
if node:is_missing() then
-- Parser inserted this to recover from error
end
S-Expression
local sexp = node:to_sexp()
-- "(source_file (package_clause (package_identifier)))"
Queries
Pattern matching using Tree-sitter's query language (S-expressions).
Create Query
local query, err = treesitter.query("go", [[
(function_declaration
name: (identifier) @func_name
parameters: (parameter_list) @params
)
]])
| Parameter | Type | Description |
|---|---|---|
language |
string | Language name |
pattern |
string | Query pattern in S-expression syntax |
Returns: Query, error
Execute Query
-- Get all captures (flattened)
local captures = query:captures(root, source_code)
for _, capture in ipairs(captures) do
print(capture.name) -- "@func_name"
print(capture.text) -- actual text
print(capture.index) -- capture index
-- capture.node is the Node object
end
-- Get matches (grouped by pattern)
local matches = query:matches(root, source_code)
for _, match in ipairs(matches) do
print(match.id, match.pattern)
for _, capture in ipairs(match.captures) do
print(capture.name, capture.node:text())
end
end
Query Control
-- Limit query scope
query:set_byte_range(0, 1000)
query:set_point_range({row = 0, column = 0}, {row = 10, column = 0})
-- Limit matches
query:set_match_limit(100)
if query:did_exceed_match_limit() then
-- More matches exist
end
-- Timeout (string duration or nanoseconds)
query:set_timeout("500ms")
query:set_timeout(1000000000) -- 1 second in nanoseconds
-- Disable patterns/captures
query:disable_pattern(0)
query:disable_capture("func_name")
Query Inspection
local pattern_count = query:pattern_count()
local capture_count = query:capture_count()
local name = query:capture_name_for_id(0)
local id = query:capture_index_for_name("func_name")
Tree Cursor
Efficient traversal without creating node objects at each step.
Basic Traversal
local cursor = tree:walk()
-- Start at root
print(cursor:current_node():kind()) -- "source_file"
print(cursor:current_depth()) -- 0
-- Navigate
if cursor:goto_first_child() then
print(cursor:current_node():kind())
print(cursor:current_depth()) -- 1
end
if cursor:goto_next_sibling() then
-- moved to next sibling
end
cursor:goto_parent() -- back to parent
cursor:close()
Cursor Methods
| Method | Returns | Description |
|---|---|---|
current_node() |
Node |
Node at cursor position |
current_depth() |
integer |
Depth (0 = root) |
current_field_name() |
string? |
Field name if any |
goto_parent() |
boolean |
Move to parent |
goto_first_child() |
boolean |
Move to first child |
goto_last_child() |
boolean |
Move to last child |
goto_next_sibling() |
boolean |
Move to next sibling |
goto_previous_sibling() |
boolean |
Move to previous sibling |
goto_first_child_for_byte(n) |
integer? |
Move to child containing byte |
goto_first_child_for_point(pt) |
integer? |
Move to child containing point |
reset(node) |
- | Reset cursor to node |
copy() |
Cursor |
Create copy of cursor |
close() |
- | Release resources |
Language Metadata
local lang = treesitter.language("go")
print(lang:version()) -- ABI version
print(lang:node_kind_count()) -- number of node types
print(lang:field_count()) -- number of fields
-- Node kind lookup
local kind = lang:node_kind_for_id(1)
local id = lang:id_for_node_kind("identifier", true)
local is_named = lang:node_kind_is_named(1)
-- Field lookup
local field_name = lang:field_name_for_id(1)
local field_id = lang:field_id_for_name("name")
Errors
| Condition | Kind | Retryable |
|---|---|---|
| Language not supported | errors.INVALID |
no |
| Language has no binding | errors.INVALID |
no |
| Invalid query pattern | errors.INVALID |
no |
| Invalid positions | errors.INVALID |
no |
| Parse failed | errors.INTERNAL |
no |
See Error Handling for working with errors.
Query Syntax Reference
Tree-sitter queries use S-expression patterns:
; Match a node type
(identifier)
; Match with field names
(function_declaration name: (identifier))
; Capture with @name
(function_declaration name: (identifier) @func_name)
; Multiple patterns
[
(function_declaration)
(method_declaration)
] @declaration
; Wildcards
(_) ; any node
(identifier)+ ; one or more
(identifier)* ; zero or more
(identifier)? ; optional
; Predicates
((identifier) @var
(#match? @var "^_")) ; regex match
See Tree-sitter Query Syntax for complete documentation.