added vendor/

This commit is contained in:
2026-02-20 13:21:17 +02:00
parent dcb22b58d9
commit 7dc6f9cb08
1140 changed files with 722477 additions and 26 deletions
+533
View File
@@ -0,0 +1,533 @@
# internal/libyaml
This package provides low-level YAML processing functionality through a 3-stage
pipeline: Scanner → Parser → Emitter.
It implements the libyaml C library functionality in Go.
## Directory Overview
The `internal/libyaml` package implements the core YAML processing stages:
1. **Scanner** - Tokenizes YAML text into tokens
2. **Parser** - Converts tokens into events following YAML grammar rules
3. **Emitter** - Serializes events back into YAML text
## File Organization
### Main Source Files
- **scanner.go** - YAML scanner/tokenizer implementation
- **parser.go** - YAML parser (tokens → events)
- **emitter.go** - YAML emitter (events → YAML output)
- **api.go** - Public API for Parser and Emitter types
- **yaml.go** - Core types and constants (Event, Token, enums)
- **reader.go** - Input handling and encoding detection
- **writer.go** - Output handling
- **yamlprivate.go** - Internal types and helper functions
### Test Files
- **scanner_test.go** - Scanner tests
- **parser_test.go** - Parser tests
- **emitter_test.go** - Emitter tests
- **api_test.go** - API tests
- **yaml_test.go** - Utility function tests
- **reader_test.go** - Reader tests
- **writer_test.go** - Writer tests
- **yamlprivate_test.go** - Character classification tests
- **loader_test.go** - Data loader scalar resolution tests
- **yamldatatest_test.go** - YAML test data loading framework
- **yamldatatest_loader.go** - YAML test data loader with scalar type resolution (exported for reuse)
### Test Data Files (in `testdata/`)
- **scanner.yaml** - Scanner test cases
- **parser.yaml** - Parser test cases
- **emitter.yaml** - Emitter test cases
- **api.yaml** - API test cases
- **yaml.yaml** - Utility function test cases
- **reader.yaml** - Reader test cases
- **writer.yaml** - Writer test cases
- **yamlprivate.yaml** - Character classification test cases
- **loader.yaml** - Data loader scalar resolution test cases
## Processing Pipeline
### 1. Scanner (scanner.go)
The scanner converts YAML text into tokens.
**Input**: Raw YAML text (string or []byte)
**Output**: Stream of tokens
**Token types include**:
- `SCALAR_TOKEN` - Plain, quoted, or block scalar values
- `KEY_TOKEN`, `VALUE_TOKEN` - Mapping key/value indicators
- `BLOCK_MAPPING_START_TOKEN`, `FLOW_MAPPING_START_TOKEN` - Mapping delimiters
- `BLOCK_SEQUENCE_START_TOKEN`, `FLOW_SEQUENCE_START_TOKEN` - Sequence delimiters
- `ANCHOR_TOKEN`, `ALIAS_TOKEN` - Anchor definitions and references
- `TAG_TOKEN` - Type tags
- `DOCUMENT_START_TOKEN`, `DOCUMENT_END_TOKEN` - Document boundaries
**Responsibilities**:
- Character encoding detection (UTF-8, UTF-16LE, UTF-16BE)
- Line break normalization
- Indentation tracking
- Quote and escape sequence handling
### 2. Parser (parser.go)
The parser converts tokens into events following YAML grammar rules.
**Input**: Stream of tokens from Scanner
**Output**: Stream of events
**Event types include**:
- `STREAM_START_EVENT`, `STREAM_END_EVENT` - Stream boundaries
- `DOCUMENT_START_EVENT`, `DOCUMENT_END_EVENT` - Document boundaries
- `SCALAR_EVENT` - Scalar values
- `MAPPING_START_EVENT`, `MAPPING_END_EVENT` - Mapping boundaries
- `SEQUENCE_START_EVENT`, `SEQUENCE_END_EVENT` - Sequence boundaries
- `ALIAS_EVENT` - Anchor references
**Responsibilities**:
- Implementing YAML grammar and validation
- Managing document directives (%YAML, %TAG)
- Resolving anchors and aliases
- Tracking implicit vs explicit markers
- Style preservation (plain, single-quoted, double-quoted, literal, folded)
### 3. Emitter (emitter.go)
The emitter converts events back into YAML text.
**Input**: Stream of events
**Output**: YAML text
**Responsibilities**:
- Style selection (plain/quoted scalars, block/flow collections)
- Formatting control (canonical mode, indentation, line width)
- Character encoding
- Anchor and tag serialization
- Document marker generation (---, ...)
**Configuration options**:
- `Canonical` - Emit in canonical YAML form
- `Indent` - Indentation width (2-9 spaces)
- `Width` - Line width (-1 for unlimited)
- `Unicode` - Enable Unicode character output
- `LineBreak` - Line break style (LN, CR, CRLN)
## Testing Framework
### Test Architecture
The testing framework uses a data-driven approach:
1. **Test data** is stored in YAML files in the `testdata/` directory
2. **Test logic** is implemented in Go files (`*_test.go`)
3. **One-to-one pairing**: Each `testdata/foo.yaml` has a corresponding `foo_test.go`
**Benefits**:
- Easy to add new test cases without writing Go code
- Test data is human-readable and self-documenting
- Test logic is reusable across many test cases
- Test data is separated from test code for clarity
- Tests can become a common suite for multiple YAML frameworks
### Test Data Files
Each YAML file contains test cases for a specific component:
- **scanner.yaml** - Scanner/tokenization tests
- Token sequence verification
- Token property validation (value, style)
- Error detection
- **parser.yaml** - Parser/event generation tests
- Event sequence verification
- Event property validation (anchor, tag, value, directives)
- Error detection
- **emitter.yaml** - Emitter/serialization tests
- Event-to-YAML conversion
- Configuration options testing
- Roundtrip testing (parse → emit)
- Writer integration
- **api.yaml** - API constructor and method tests
- Constructor validation
- Method behavior and state changes
- Panic conditions
- Cleanup verification
- **yaml.yaml** - Utility function tests
- Enum String() methods
- Style accessor methods
- **reader.yaml** - Reader/input handling tests
- Encoding detection (UTF-8, UTF-16LE, UTF-16BE)
- Buffer management
- Error handling
- **writer.yaml** - Writer/output handling tests
- Buffer flushing
- Output handlers (string, io.Writer)
- Error conditions
- **yamlprivate.yaml** - Character classification tests
- Character type predicates (isAlpha, isDigit, isHex, etc.)
- Character conversion functions (asDigit, asHex, width)
- Unicode handling
- **loader.yaml** - Data loader scalar resolution tests
- Numeric type resolution (integers, floats)
- Boolean and null value handling
- String vs numeric type disambiguation
- Mixed-type collections
### Test Framework Implementation
The test framework is implemented in `yamldatatest_loader.go` and `yamldatatest_test.go`:
**Core functions**:
- `LoadYAML(data []byte) (interface{}, error)` - Parses YAML using libyaml parser with scalar type resolution (exported)
- `UnmarshalStruct(target interface{}, data map[string]interface{}) error` - Populates structs (exported)
- `LoadTestCases(filename string) ([]TestCase, error)` - Loads and parses test YAML files
- `coerceScalar(value string) interface{}` - Resolves scalar strings to appropriate Go types (int, float64, bool, nil, string)
**Core types**:
- `TestCase` struct - Umbrella structure containing fields for all test types
- Uses `interface{}` for flexible field types
- Post-processing converts generic fields to specific types
**Post-processing**:
After loading, the framework processes test data:
- Converts `Want` (interface{}) to `WantEvents`, `WantTokens`, or `WantSpecs` based on test type
- Converts `Want` (interface{}) to `WantContains` (handles both scalar and sequence)
- Converts `Checks` to field validation specifications
### Test Types
#### Scanner Tests
**scan-tokens** - Verify token sequence
```yaml
- scan-tokens:
name: Simple scalar
yaml: |-
hello
want:
- STREAM_START_TOKEN
- SCALAR_TOKEN
- STREAM_END_TOKEN
```
**scan-tokens-detailed** - Verify token properties
```yaml
- scan-tokens-detailed:
name: Single quoted scalar
yaml: |-
'hello world'
want:
- STREAM_START_TOKEN
- SCALAR_TOKEN:
style: SINGLE_QUOTED_SCALAR_STYLE
value: hello world
- STREAM_END_TOKEN
```
**scan-error** - Verify error detection
```yaml
- scan-error:
name: Invalid character
yaml: "\x01"
```
#### Parser Tests
**parse-events** - Verify event sequence
```yaml
- parse-events:
name: Simple mapping
yaml: |
key: value
want:
- STREAM_START_EVENT
- DOCUMENT_START_EVENT
- MAPPING_START_EVENT
- SCALAR_EVENT
- SCALAR_EVENT
- MAPPING_END_EVENT
- DOCUMENT_END_EVENT
- STREAM_END_EVENT
```
**parse-events-detailed** - Verify event properties
```yaml
- parse-events-detailed:
name: Anchor and alias
yaml: |
- &anchor value
- *anchor
want:
- STREAM_START_EVENT
- DOCUMENT_START_EVENT
- SEQUENCE_START_EVENT
- SCALAR_EVENT:
anchor: anchor
value: value
- ALIAS_EVENT:
anchor: anchor
- SEQUENCE_END_EVENT
- DOCUMENT_END_EVENT
- STREAM_END_EVENT
```
**parse-error** - Verify error detection
```yaml
- parse-error:
name: Error state
yaml: |
key: : invalid
```
#### Emitter Tests
**emit** - Emit events and verify output contains expected strings
```yaml
- emit:
name: Simple scalar
data:
- STREAM_START_EVENT:
encoding: UTF8_ENCODING
- DOCUMENT_START_EVENT:
implicit: true
- SCALAR_EVENT:
value: hello
implicit: true
style: PLAIN_SCALAR_STYLE
- DOCUMENT_END_EVENT:
implicit: true
- STREAM_END_EVENT
want: hello
```
**emit-config** - Emit with configuration
```yaml
- emit-config:
name: Custom indent
conf:
indent: 4
data:
- STREAM_START_EVENT:
encoding: UTF8_ENCODING
- DOCUMENT_START_EVENT:
implicit: true
- MAPPING_START_EVENT:
implicit: true
style: BLOCK_MAPPING_STYLE
# ... more events
want: key
```
**roundtrip** - Parse → emit, verify output
```yaml
- roundtrip:
name: Roundtrip
yaml: |
key: value
list:
- item1
- item2
want:
- key
- value
- item1
```
**emit-writer** - Emit to io.Writer
```yaml
- emit-writer:
name: Writer
data:
- STREAM_START_EVENT:
encoding: UTF8_ENCODING
# ... more events
want: test
```
#### API Tests
**api-new** - Test constructors
```yaml
- api-new:
name: New parser
with: NewParser
test:
- nil: [raw-buffer, false]
- cap: [raw-buffer, 512]
- nil: [buffer, false]
- cap: [buffer, 1536]
```
**api-method** - Test methods and field state
```yaml
- api-method:
name: Parser set input string
with: NewParser
byte: true
call: [SetInputString, 'key: value']
test:
- eq: [input, 'key: value']
- eq: [input-pos, 0]
- nil: [read-handler, false]
```
**api-panic** - Test methods that should panic
```yaml
- api-panic:
name: Parser set input string twice
with: NewParser
byte: true
init: [SetInputString, first]
call: [SetInputString, second]
want: must set the input source only once
```
**api-delete** - Test cleanup
```yaml
- api-delete:
name: Parser delete
with: NewParser
byte: true
init: [SetInputString, test]
test:
- len: [input, 0]
- len: [buffer, 0]
```
**api-new-event** - Test event constructors
```yaml
- api-new-event:
name: New stream start event
call: [NewStreamStartEvent, UTF8_ENCODING]
test:
- eq: [Type, STREAM_START_EVENT]
- eq: [encoding, UTF8_ENCODING]
```
#### Utility Tests
**enum-string** - Test String() methods of enums
```yaml
- enum-string:
name: Scalar style plain
enum: [ScalarStyle, PLAIN_SCALAR_STYLE]
want: Plain
```
**style-accessor** - Test style accessor methods
```yaml
- style-accessor:
name: Event scalar style
test: [ScalarStyle, DOUBLE_QUOTED_SCALAR_STYLE]
```
#### Loader Tests
**scalar-resolution** - Test scalar type resolution
```yaml
- scalar-resolution:
name: Positive integer
yaml: "42"
want: 42
- scalar-resolution:
name: Negative float
yaml: "-2.5"
want: -2.5
```
**Resolution order**:
1. Boolean (true, false)
2. Null (null keyword only)
3. Hexadecimal integer (0x prefix)
4. Float (contains .)
5. Decimal integer
6. String (fallback)
### Common Keys in Test YAML Files
Test cases use a **type-as-key** format where the test type is the map key:
```yaml
- test-type:
name: Test case name
# ... other fields
```
**Common fields**:
- **name** - Test case name (title case convention)
- **yaml** - Input YAML string to test
- **want** - Expected result (format varies by test type)
- For api-panic: string containing expected panic message substring
- For scan-error/parse-error: boolean (defaults to true if omitted; set to false if no error expected)
- For enum-string: string representing expected String() output
- For other types: varies (may be sequence or scalar)
- **data** - For emitter tests: list of event specifications to emit
- **conf** - For emitter config tests: emitter configuration options
- **with** - For API tests: constructor name (NewParser, NewEmitter)
- **call** - For API tests: method call [MethodName, arg1, arg2, ...]
- **init** - For API panic tests: setup method call before main method
- **byte** - For API tests: boolean flag to convert string args to []byte
- **test** - For API tests: list of field validation checks in format `operator: [field, value]` where operator is one of: nil, cap, len, eq, gte, len-gt.
- **test** - For style-accessor tests: array of [Method, STYLE] where Method is the accessor method (e.g., ScalarStyle) and STYLE is the style constant (e.g., DOUBLE_QUOTED_SCALAR_STYLE).
- **enum** - For enum tests: array of [Type, Value] where Type is the enum type (e.g., ScalarStyle) and Value is the constant (e.g., PLAIN_SCALAR_STYLE)
**Note on scalar type resolution**: Unquoted scalar values in test data are automatically resolved to appropriate Go types (int, float64, bool, nil) by the `LoadYAML` function. Quoted scalars remain as strings.
### Running Tests
```bash
# Run all tests in the package
go test ./internal/libyaml
# Run specific test file
go test ./internal/libyaml -run TestScanner
go test ./internal/libyaml -run TestParser
go test ./internal/libyaml -run TestEmitter
go test ./internal/libyaml -run TestAPI
go test ./internal/libyaml -run TestYAML
go test ./internal/libyaml -run TestLoader
# Run specific test case (using subtest name)
go test ./internal/libyaml -run TestScanner/Block_sequence
go test ./internal/libyaml -run TestParser/Anchor_and_alias
go test ./internal/libyaml -run TestEmitter/Flow_mapping
go test ./internal/libyaml -run TestLoader/Scientific_notation_lowercase_e
# Run with verbose output
go test -v ./internal/libyaml
# Run with coverage
go test -cover ./internal/libyaml
```