Class EarleyGenerator
A specialized version of the Earley parser that generates all possible sentences from a grammar.
Namespace: EarleyParser
Assembly: EarleyParser.dll
Syntax
public class EarleyGenerator : EarleyParser
Remarks
Instead of parsing a specific input, the generator builds the complete parse forest for every sentence of up to maxWords words.
Key features:
- Complete Parse Forest Generation:
  - Generates all possible parse trees for sentences up to maxWords length
  - Maintains the same chart structure as the parser for consistency
  - Uses "generator" tokens as placeholders for actual input
- Generation Modes:
  - Parts of Speech Sequences: Generate sequences of grammatical categories
  - Full Parse Trees: Generate complete bracketed parse tree representations
- Analysis Capabilities:
  - Counts total number of possible derivations
  - Analyzes tree properties (recursion, basic trees)
  - Supports both context-free and linear indexed grammars
This class is particularly useful for:
- Grammar analysis and validation
- Testing grammar coverage
- Generating training data
- Understanding grammar ambiguity
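To make "all possible sentences up to maxWords" concrete, here is a minimal, self-contained sketch of length-bounded generation over a toy context-free grammar. It is not the Earley chart algorithm this class uses; it is a naive leftmost-expansion enumerator, and the grammar, the `Rules` table, and the `Generate` helper are hypothetical names chosen for illustration only.

```csharp
using System;
using System.Collections.Generic;

public static class BoundedGenerationSketch
{
    // Hypothetical toy grammar: nonterminal -> alternative right-hand sides.
    // Symbols without an entry are treated as parts of speech.
    static readonly Dictionary<string, string[][]> Rules = new Dictionary<string, string[][]>
    {
        ["S"]  = new[] { new[] { "NP", "VP" } },
        ["NP"] = new[] { new[] { "D", "N" }, new[] { "PN" } },
        ["VP"] = new[] { new[] { "V" }, new[] { "V", "NP" } },
    };

    // Repeatedly expands the leftmost nonterminal until only parts of speech remain.
    // Pruning by length is safe only because no rule here has an empty right-hand side;
    // a grammar with unit-rule cycles would also need a visited check to terminate.
    public static IEnumerable<string> Generate(string start, int maxWords)
    {
        var pending = new Stack<List<string>>();
        pending.Push(new List<string> { start });
        while (pending.Count > 0)
        {
            var form = pending.Pop();
            if (form.Count > maxWords) continue; // too long already, can never shrink
            int i = form.FindIndex(s => Rules.ContainsKey(s));
            if (i < 0)
            {
                // No nonterminal left: this is a finished part-of-speech sequence.
                yield return string.Join(" ", form);
                continue;
            }
            foreach (var rhs in Rules[form[i]])
            {
                var next = new List<string>(form);
                next.RemoveAt(i);
                next.InsertRange(i, rhs);
                pending.Push(next);
            }
        }
    }

    public static void Main()
    {
        foreach (var sequence in Generate("S", 3))
            Console.WriteLine(sequence);
    }
}
```

With this toy grammar and a bound of 3, the enumerator yields three sequences: `D N V`, `PN V`, and `PN V PN`. An ambiguous grammar would yield the same sequence once per leftmost derivation, which is the kind of ambiguity a derivation count exposes.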
Constructors
EarleyGenerator(Grammar, Vocabulary, int)
Initializes a new instance of the EarleyGenerator class.
Declaration
public EarleyGenerator(Grammar g, Vocabulary v, int maxWords)
Parameters
| Type | Name | Description |
|---|---|---|
| Grammar | g | The grammar to use for generation. |
| Vocabulary | v | The vocabulary that maps words to their possible parts of speech. |
| int | maxWords | The maximum length of sentences to generate. |
Methods
AddLexicalizedRules(List<Rule>, int, bool)
Skips processing of lexicalized rules in generation mode.
Declaration
protected override void AddLexicalizedRules(List<Rule> lexicalRules, int i, bool preprocess = true)
Parameters
| Type | Name | Description |
|---|---|---|
| List<Rule> | lexicalRules | Not used in generator mode. |
| int | i | Not used in generator mode. |
| bool | preprocess | Not used in generator mode. |
Overrides
GetAllSequences(bool)
Gets all possible sequences that can be generated from the grammar.
Declaration
public string[][] GetAllSequences(bool onlyPartsOfSpeechSequences = true)
Parameters
| Type | Name | Description |
|---|---|---|
| bool | onlyPartsOfSpeechSequences | If true, returns only sequences of parts of speech. If false, returns full bracketed parse trees. |
Returns
| Type | Description |
|---|---|
| string[][] | A jagged array of sequences, where each row holds the sequences for one sentence length. |
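Because the result is a jagged array, rows can differ in size and some may be empty. The sketch below shows one way to summarize such a result. The `Summarize` helper and the sample data are hypothetical; in real use the array would come from a call such as `generator.GetAllSequences()`, and how row indices map to sentence lengths is not stated here, so the code reports the row index only.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceReport
{
    // Produces one summary line per non-empty row of a jagged result array.
    public static List<string> Summarize(string[][] rows)
    {
        var lines = new List<string>();
        for (int r = 0; r < rows.Length; r++)
        {
            if (rows[r] == null || rows[r].Length == 0) continue; // nothing generated for this row
            lines.Add($"row {r}: {rows[r].Length} sequence(s), first = {rows[r][0]}");
        }
        return lines;
    }

    public static void Main()
    {
        // Hypothetical stand-in for the array returned by GetAllSequences().
        string[][] sample =
        {
            new string[0],
            new[] { "PN V" },
            new[] { "D N V", "PN V PN" },
        };
        foreach (var line in Summarize(sample))
            Console.WriteLine(line);
    }
}
```

Passing `false` for `onlyPartsOfSpeechSequences` should leave this loop unchanged, since each entry is still a string; only its content changes from a category sequence to a bracketed tree.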
GetPossibleSyntacticCategoriesForToken(string)
Gets all possible syntactic categories for a token in generation mode.
Declaration
protected override HashSet<string> GetPossibleSyntacticCategoriesForToken(string nextScannableTerm)
Parameters
| Type | Name | Description |
|---|---|---|
| string | nextScannableTerm | Not used in generator mode. |
Returns
| Type | Description |
|---|---|
| HashSet<string> | A set of all possible syntactic categories from the vocabulary. |
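The idea behind a "generator" token is that it may stand for any category, so the lookup ignores the token and returns every category the vocabulary knows. A minimal sketch of that idea follows, assuming the vocabulary can be modeled as a plain word-to-categories dictionary. The real `Vocabulary` type and its members are not shown in this page, so `AllCategories` and the dictionary shape are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class GeneratorTokenSketch
{
    // Returns the union of all categories in a word -> categories map.
    // The token itself is never consulted, mirroring the unused parameter above.
    public static HashSet<string> AllCategories(Dictionary<string, string[]> wordToCategories)
    {
        var all = new HashSet<string>();
        foreach (var categories in wordToCategories.Values)
            all.UnionWith(categories);
        return all;
    }

    public static void Main()
    {
        // Hypothetical vocabulary used only for this example.
        var vocabulary = new Dictionary<string, string[]>
        {
            ["the"]  = new[] { "D" },
            ["duck"] = new[] { "N", "V" },
            ["saw"]  = new[] { "N", "V" },
        };
        Console.WriteLine(string.Join(", ", AllCategories(vocabulary)));
    }
}
```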
Overrides
PrepareEarleyTable(string[], int)
Prepares the Earley table for sentence generation.
Declaration
protected override EarleyColumn[] PrepareEarleyTable(string[] text, int maxWords)
Parameters
| Type | Name | Description |
|---|---|---|
| string[] | text | Not used in generator mode. |
| int | maxWords | The maximum length of sentences to generate. |
Returns
| Type | Description |
|---|---|
| EarleyColumn[] | An array of EarleyColumn objects representing the generation table. |