tree_haver v3.0.0 released!
3.0.0 - 2025-12-16
- TAG: v3.0.0
- COVERAGE: 85.19% – 909/1067 lines in 11 files
- BRANCH COVERAGE: 67.47% – 338/501 branches in 11 files
- 92.93% documented
Added
Backend Requirements
- MRI Backend: Requires `ruby_tree_sitter` v2.0+ (exceptions inherit from `Exception`, not `StandardError`)
  - In ruby_tree_sitter v2.0, TreeSitter errors were changed to inherit from Exception for thread-safety
  - TreeHaver now properly handles: `ParserNotFoundError`, `LanguageLoadError`, `SymbolNotFoundError`, etc.
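One practical consequence of errors inheriting from `Exception` is that a bare `rescue` no longer catches them. The following is a minimal sketch in plain Ruby; `DemoParserError` is a hypothetical stand-in class and is not part of tree_haver or ruby_tree_sitter.

```ruby
# Hypothetical stand-in for an error that inherits from Exception
# rather than StandardError.
class DemoParserError < Exception; end

def bare_rescue
  raise DemoParserError, "parser not found"
rescue => e # a bare rescue only matches StandardError and its subclasses
  "caught by bare rescue: #{e.message}"
end

def explicit_rescue
  raise DemoParserError, "parser not found"
rescue DemoParserError => e # the class (or an ancestor) must be named
  "caught explicitly: #{e.message}"
end

bare_result =
  begin
    bare_rescue
  rescue Exception => e # the error escaped the bare rescue above
    "escaped bare rescue: #{e.class}"
  end

puts bare_result     # escaped bare rescue: DemoParserError
puts explicit_rescue # caught explicitly: parser not found
```

Code that previously relied on `rescue => e` around parsing calls would likely need to name the error class explicitly.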
Thread-Safe Backend Selection (Hybrid Approach)
- NEW: Block-based backend API - `TreeHaver.with_backend(:ffi) { ... }` for thread-safe backend selection
  - Thread-local context with proper nesting support
  - Exception-safe (context restored even on errors)
  - Fully backward compatible with existing global backend setting
- NEW: Explicit backend parameters
  - `Parser.new(backend: :mri)` - specify backend when creating parser
  - `Language.from_library(path, backend: :ffi)` - specify backend when loading language
  - Backend parameters override thread context and global settings
- NEW: Backend introspection - `parser.backend` returns the current backend name (`:ffi`, `:mri`, etc.)
- Backend precedence chain: `explicit parameter > thread context > global setting > :auto`
- Backend-aware caching - Language cache now includes backend in cache key to prevent cross-backend pollution
- Added `TreeHaver.effective_backend` - returns the currently effective backend considering precedence
- Added `TreeHaver.current_backend_context` - returns thread-local backend context
- Added `TreeHaver.resolve_backend_module(explicit_backend)` - resolves backend module with precedence
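The precedence chain and the nesting behavior can be illustrated with a small model. This is a sketch under assumptions, not TreeHaver's implementation: `BackendDemo`, its storage key, and its method bodies are hypothetical, and only the method names `with_backend` and `effective_backend` are borrowed from the notes above.

```ruby
# Hypothetical model of thread-scoped backend selection.
module BackendDemo
  KEY = :backend_demo_context

  class << self
    attr_writer :global_backend

    def global_backend
      @global_backend || :auto
    end

    # Set a thread-local backend for the duration of the block.
    def with_backend(name)
      previous = Thread.current.thread_variable_get(KEY)
      Thread.current.thread_variable_set(KEY, name)
      yield
    ensure
      # Restored even if the block raises, so nesting unwinds correctly.
      Thread.current.thread_variable_set(KEY, previous)
    end

    # explicit parameter > thread context > global setting > :auto
    def effective_backend(explicit = nil)
      explicit || Thread.current.thread_variable_get(KEY) || global_backend
    end
  end
end

BackendDemo.global_backend = :mri
results = []
other_thread = nil

BackendDemo.with_backend(:ffi) do
  results << BackendDemo.effective_backend            # :ffi
  # A different thread does not see this thread's context.
  other_thread = Thread.new { BackendDemo.effective_backend }.value
  BackendDemo.with_backend(:rust) do
    results << BackendDemo.effective_backend          # :rust (inner overrides outer)
    results << BackendDemo.effective_backend(:java)   # :java (explicit wins)
  end
  results << BackendDemo.effective_backend            # :ffi (outer restored)
end
results << BackendDemo.effective_backend              # :mri (global setting)

p results
p other_thread
```

The sketch uses `thread_variable_get`/`thread_variable_set` because `Thread.current[]` storage is fiber-local in Ruby; which of the two TreeHaver uses is not stated in these notes.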
Examples and Discovery
- Added 18 comprehensive examples demonstrating all backends and languages
  - JSON examples (5): auto, MRI, Rust, FFI, Java
  - JSONC examples (5): auto, MRI, Rust, FFI, Java
  - Bash examples (5): auto, MRI, Rust, FFI, Java
  - Citrus examples (3): TOML, Finitio, Dhall
  - All examples use bundler inline (self-contained, no Gemfile needed)
- Added `examples/run_all.rb` - comprehensive test runner with colored output
- Updated `examples/README.md` - complete guide to all examples
- Added `TreeHaver::CitrusGrammarFinder` for language-agnostic discovery and registration of Citrus-based grammar gems
  - Automatically discovers Citrus grammar gems by gem name and grammar constant path
  - Validates grammar modules respond to `.parse(source)` before registration
  - Provides helpful error messages when grammars are not found
- Added multi-backend language registry supporting multiple backends per language simultaneously
  - Restructured `LanguageRegistry` to use nested hash: `{ language: { backend_type: config } }`
  - Enables registering both tree-sitter and Citrus grammars for the same language without conflicts
  - Supports runtime backend switching, benchmarking, and fallback scenarios
- Added `LanguageRegistry.register(name, backend_type, **config)` with backend-specific configuration storage
- Added `LanguageRegistry.registered(name, backend_type = nil)` to query by specific backend or get all backends
- Added `TreeHaver::Backends::Citrus::Node#structural?` method to distinguish structural nodes from terminals
  - Uses Citrus grammar’s `terminal?` method to dynamically determine node classification
  - Works with any Citrus grammar without language-specific knowledge
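The nested registry layout `{ language: { backend_type: config } }` described above can be modeled in a few lines. This is a sketch only: `RegistryDemo` and `DemoGrammar` are hypothetical, and the return values for unknown languages are assumptions rather than documented behavior. The method signatures mirror the ones listed in these notes.

```ruby
# Hypothetical model of a registry keyed by language, then by backend type.
module RegistryDemo
  @entries = {}

  class << self
    # Store config under entries[name][backend_type].
    def register(name, backend_type, **config)
      (@entries[name.to_sym] ||= {})[backend_type.to_sym] = config
      nil
    end

    # With backend_type: that backend's config (or nil).
    # Without: a hash of every backend registered for the language (or nil).
    def registered(name, backend_type = nil)
      backends = @entries[name.to_sym]
      return nil if backends.nil?

      backend_type ? backends[backend_type.to_sym] : backends
    end
  end
end

DemoGrammar = Module.new # placeholder for a Citrus grammar module

RegistryDemo.register(:toml, :tree_sitter,
                      path: "/path/to/libtree-sitter-toml.so",
                      symbol: "tree_sitter_toml")
RegistryDemo.register(:toml, :citrus, grammar_module: DemoGrammar)

p RegistryDemo.registered(:toml).keys   # [:tree_sitter, :citrus]
p RegistryDemo.registered(:toml, :citrus)
p RegistryDemo.registered(:json)        # nil
```

Because each backend type has its own slot, registering a Citrus grammar does not disturb an existing tree-sitter entry for the same language.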
Changed
- BREAKING: All errors now inherit from `TreeHaver::Error` which inherits from `Exception`
  - see: https://github.com/Faveod/ruby-tree-sitter/pull/83 for reasoning
- BREAKING: `LanguageRegistry.register` signature changed from `register(name, path:, symbol:)` to `register(name, backend_type, **config)`
  - This enables proper separation of tree-sitter and Citrus configurations
  - Users should update to use `TreeHaver.register_language` instead of calling `LanguageRegistry.register` directly
- Updated `TreeHaver.register_language` to support both tree-sitter and Citrus grammars in single call or separate calls
  - Can now register: `register_language(:toml, path: "...", symbol: "...", grammar_module: TomlRB::Document)`
  - INTENTIONAL DESIGN: Uses separate `if` statements (not `elsif`) to allow registering both backends simultaneously
  - Enables maximum flexibility: runtime backend switching, performance benchmarking, fallback scenarios
  - Multiple registrations for same language now merge instead of overwrite
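The reason for separate `if` statements rather than `elsif` is easiest to see in code. The helper below is a hypothetical sketch, not the body of `TreeHaver.register_language`; it takes a plain hash as the registry and uses placeholder paths.

```ruby
# Hypothetical sketch: register one or both backend types from a single call.
def demo_register_language(registry, name, path: nil, symbol: nil, grammar_module: nil)
  entry = (registry[name] ||= {}) # reuse an existing entry so calls merge

  # First independent check: tree-sitter configuration.
  if path
    entry[:tree_sitter] = { path: path, symbol: symbol }
  end

  # Second independent check. Written as `elsif`, this branch would be
  # skipped whenever `path` is given, dropping the Citrus grammar.
  if grammar_module
    entry[:citrus] = { grammar_module: grammar_module }
  end

  entry
end

DemoTomlGrammar = Module.new # placeholder for a Citrus grammar module
registry = {}

# Both backends in one call.
demo_register_language(registry, :toml,
                       path: "/path/to/toml.so", symbol: "tree_sitter_toml",
                       grammar_module: DemoTomlGrammar)

# Separate calls merge instead of overwriting.
demo_register_language(registry, :json, path: "/path/to/json.so", symbol: "tree_sitter_json")
demo_register_language(registry, :json, grammar_module: DemoTomlGrammar)

p registry[:toml].keys # [:tree_sitter, :citrus]
p registry[:json].keys # [:tree_sitter, :citrus]
```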
Improved
Code Quality and Documentation
- Uniform backend API: All backends now implement `reset!` method for consistent testing interface
  - Eliminates need for tests to manipulate private instance variables
  - Provides clean way to reset backend state between tests
- Documented design decisions with inline rationale
  - FFI Tree finalizer behavior and why Parser doesn’t use finalizers
  - `resolve_backend_module` early-return pattern with comprehensive comments
  - `register_language` multi-backend registration capability extensively documented
- Enhanced YARD documentation
  - All Citrus examples now include `gem_name` parameter (matches actual usage patterns)
  - Added complete examples showing both single-backend and multi-backend registration
  - Documented backend precedence chain and thread-safety guarantees
- Comprehensive test coverage for thread-safe backend selection
  - Thread-local context tests
  - Parser backend parameter tests
  - Language backend parameter tests
  - Concurrent parsing tests with multiple backends
  - Backend-aware cache isolation tests
  - Nested block behavior tests (inner blocks override outer blocks)
  - Exception safety tests (context restored even on errors)
  - Explicit parameter precedence tests
- Updated `Language.method_missing` to automatically select appropriate grammar based on active backend
  - tree-sitter backends (MRI, Rust, FFI, Java) query `:tree_sitter` registry key
  - Citrus backend queries `:citrus` registry key
  - Provides clear error messages when requested backend has no registered grammar
- Improved `TreeHaver::Backends::Citrus::Node#type` to use dynamic Citrus grammar introspection
  - Uses event `.name` method and Symbol events for accurate type extraction
  - Works with any Citrus grammar without language-specific code
  - Handles compound rules (Repeat, Choice, Optional) intelligently
Fixed
Thread-Safety and Backend Selection
- Fixed `resolve_backend_module` to properly handle mocked backends without `available?` method
  - Assumes modules without `available?` are available (for test compatibility and backward compatibility)
  - Only rejects if module explicitly has `available?` method and returns false
  - Makes code more defensive and test-friendly
- Fixed Language cache to include backend in cache key
  - Prevents returning wrong backend’s Language object when switching backends
  - Essential for correctness with multiple backends in use
  - Cache key now: `"#{path}:#{symbol}:#{backend}"` instead of just `"#{path}:#{symbol}"`
- Fixed `TreeHaver.register_language` to properly support multi-backend registration
  - Documented intentional design: uses `if` not `elsif` to allow both backends in one call
  - Added comprehensive inline comments explaining why no early return
  - Added extensive YARD documentation with examples
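Two of the fixes in this group, the `available?` check and the backend-aware cache key, can be sketched in plain Ruby. All names below (`demo_backend_usable?`, `DemoLanguageCache`, the demo modules, the placeholder path) are hypothetical; only the cache key format comes from the notes above.

```ruby
# Hypothetical sketch; helper and class names are illustrative only.

# A module is rejected only if it defines available? and that returns false.
def demo_backend_usable?(mod)
  return true unless mod.respond_to?(:available?) # no probe: assume available

  mod.available? ? true : false
end

# Cache keyed by path, symbol, and backend, so backends never share entries.
class DemoLanguageCache
  def initialize
    @store = {}
  end

  def fetch(path, symbol, backend)
    key = "#{path}:#{symbol}:#{backend}"
    @store[key] ||= yield
  end
end

module DemoMockBackend; end # no available? method, e.g. a test double

module DemoMissingBackend
  def self.available?
    false
  end
end

cache = DemoLanguageCache.new
ffi_lang  = cache.fetch("/path/to/json.so", "tree_sitter_json", :ffi) { Object.new }
ffi_again = cache.fetch("/path/to/json.so", "tree_sitter_json", :ffi) { Object.new }
mri_lang  = cache.fetch("/path/to/json.so", "tree_sitter_json", :mri) { Object.new }

puts demo_backend_usable?(DemoMockBackend)    # true
puts demo_backend_usable?(DemoMissingBackend) # false
puts ffi_lang.equal?(ffi_again)               # true  (same backend: cache hit)
puts ffi_lang.equal?(mri_lang)                # false (different backend: separate entry)
```

With the older two-part key, the third `fetch` would have returned the FFI object to a caller expecting an MRI one.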
Backend Bug Fixes
- Fixed critical double-wrapping bug in ALL backends (MRI, Rust, FFI, Java, Citrus)
  - Backend `Parser#parse` and `parse_string` methods now return raw backend trees
  - TreeHaver::Parser wraps the raw tree in TreeHaver::Tree (single wrapping)
  - Previously backends were returning TreeHaver::Tree, then TreeHaver::Parser wrapped it again (double wrapping)
  - This caused `@inner_tree` to be a TreeHaver::Tree instead of raw backend tree, leading to nil errors
- Fixed TreeHaver::Parser to pass source parameter when wrapping backend trees
  - Enables `Node#text` to work correctly by providing source for text extraction
  - Fixes all parse and parse_string methods to include `source: source` parameter
- Fixed MRI backend to properly use ruby_tree_sitter API
  - Fixed `require "tree_sitter"` (gem name is `ruby_tree_sitter` but requires `tree_sitter`)
  - Fixed `Language.load` to use correct argument order: `(symbol_name, path)`
  - Fixed `Parser#parse` to use `parse_string(nil, source)` instead of creating Input objects
  - Fixed `Language.from_library` to implement the expected signature matching other backends
- Fixed FFI backend missing essential node methods
  - Added `ts_node_start_byte`, `ts_node_end_byte`, `ts_node_start_point`, `ts_node_end_point`
  - Added `ts_node_is_null`, `ts_node_is_named`
  - These methods are required for accessing node byte positions and metadata
  - Fixes `NoMethodError` when using FFI backend to traverse AST nodes
- Fixed GrammarFinder error messages for environment variable validation
  - Detects leading/trailing whitespace in paths and provides correction suggestions
  - Shows when `TREE_SITTER_*_PATH` is set but points to nonexistent file
  - Provides helpful guidance for setting environment variables correctly
- Fixed registry conflicts when registering multiple backend types for the same language
- Fixed `CitrusGrammarFinder` to properly handle gems with non-standard require paths (e.g., `toml-rb.rb` vs `toml/rb.rb`)
- Fixed Citrus backend infinite recursion in `Node#extract_type_from_event`
  - Added cycle detection to prevent stack overflow when traversing recursive grammar structures
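The notes do not describe how the cycle detection in `Node#extract_type_from_event` is implemented. One common approach, shown here purely as an assumption, is to track visited objects by identity during the recursive walk. The rule shape (`{ name:, children: }`) and the helper `demo_first_named_rule` are hypothetical and unrelated to Citrus's actual data structures.

```ruby
require "set"

# Hypothetical rule shape: { name: Symbol or nil, children: [rule, ...] }.
# Returns the first name found in a depth-first walk, or nil.
def demo_first_named_rule(rule, seen = Set.new)
  # Track identity, not content: hashing a self-referential structure
  # by value would itself recurse.
  return nil if seen.include?(rule.object_id) # already visited: cycle, stop

  seen << rule.object_id
  return rule[:name] if rule[:name]

  rule[:children].each do |child|
    found = demo_first_named_rule(child, seen)
    return found if found
  end
  nil
end

# Two anonymous rules that refer to each other.
rule_a = { name: nil, children: [] }
rule_b = { name: nil, children: [rule_a] }
rule_a[:children] << rule_b

cyclic_result = demo_first_named_rule(rule_a) # terminates instead of overflowing

# Add a named rule reachable after the cycle.
rule_a[:children] << { name: :value, children: [] }
named_result = demo_first_named_rule(rule_a)

p cyclic_result # nil
p named_result  # :value
```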
Known Issues
- MRI backend + Bash grammar: ABI/symbol loading incompatibility
  - The ruby_tree_sitter gem cannot load tree-sitter-bash grammar (symbol not found)
  - Workaround: Use FFI backend instead (works perfectly)
  - This is documented in examples and test runner
- Rust backend + Bash grammar: Version mismatch due to static linking
  - tree_stump statically links tree-sitter at compile time
  - System bash.so may be compiled with different tree-sitter version
  - Workaround: Use FFI backend (dynamic linking avoids version conflicts)
  - This is documented in examples with detailed explanations
Notes on Backward Compatibility
Despite the major version bump to 3.0.0 (following semver due to the breaking LanguageRegistry.register signature change), most users will experience NO BREAKING CHANGES:
Why 3.0.0?
- `LanguageRegistry.register` signature changed to support multi-backend registration
- However, most users should use `TreeHaver.register_language` (which remains backward compatible)
- Direct calls to `LanguageRegistry.register` are rare in practice
What Stays the Same?
- Global backend setting: `TreeHaver.backend = :ffi` works unchanged
- Parser creation: `Parser.new` without parameters works as before
- Language loading: `Language.from_library(path)` works as before
- Auto-detection: Backend auto-selection still works when backend is `:auto`
- All existing code continues to work without modifications
What’s New (All Optional)?
- Thread-safe block API: `TreeHaver.with_backend(:ffi) { ... }`
- Explicit backend parameters: `Parser.new(backend: :mri)`
- Backend introspection: `parser.backend`
- Multi-backend language registration
Migration Path: Existing codebases can upgrade to 3.0.0 and gain access to new thread-safe features without changing any existing code. The new features are purely additive and opt-in.