Subsection formatting
HexMerlin committed Jan 20, 2025
1 parent 40db597 commit ec99534
Showing 4 changed files with 23 additions and 38 deletions.
28 changes: 9 additions & 19 deletions Automata.Docs/ALANG.md
@@ -106,27 +106,17 @@ For example:

### Operation Definitions
```
Union: L₁ ∪ L₂ = { w | w ∈ L₁ or w ∈ L₂ }
Difference: L₁ - L₂ = { w | w ∈ L₁ and w ∉ L₂ }
Intersection: L₁ ∩ L₂ = { w | w ∈ L₁ and w ∈ L₂ }
Concatenation: L₁ ⋅ L₂ = { w | w = uv, u ∈ L₁, v ∈ L₂ }
Option: L? = L ∪ { ε }
Kleene Star: L* = ⋃ₙ₌₀^∞ Lⁿ, where L⁰ = { ε }, Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1
Kleene Plus: L⁺ = ⋃ₙ₌₁^∞ Lⁿ, where Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1
Complement: ᒾL = Σ* \ L
```

### Operation Definitions
```math
\text{Union: } L_1 \cup L_2 = \{ w \mid w \in L_1 \text{ or } w \in L_2 \}
\text{Difference: } L_1 - L_2 = \{ w \mid w \in L_1 \text{ and } w \notin L_2 \}
\text{Intersection: } L_1 \cap L_2 = \{ w \mid w \in L_1 \text{ and } w \in L_2 \}
\text{Concatenation: } L_1 \cdot L_2 = \{ w \mid w = uv, u \in L_1, v \in L_2 \}
\text{Option: } L? = L \cup \{ \varepsilon \}
\text{Kleene Star: } L^* = \bigcup_{n=0}^\infty L^n, \text{ where } L^0 = \{ \varepsilon \}, L^n = L \cdot L^{n-1} \text{ for } n \geq 1
\text{Kleene Plus: } L^+ = \bigcup_{n=1}^\infty L^n, \text{ where } L^n = L \cdot L^{n-1} \text{ for } n \geq 1
\text{Complement: } \neg L = \Sigma^* \setminus L
Union: L₁ ∪ L₂ = { w | w ∈ L₁ or w ∈ L₂ }
Difference: L₁ - L₂ = { w | w ∈ L₁ and w ∉ L₂ }
Intersection: L₁ ∩ L₂ = { w | w ∈ L₁ and w ∈ L₂ }
Concatenation: L₁ ⋅ L₂ = { w | w = uv, u ∈ L₁, v ∈ L₂ }
Option: L? = L ∪ { ε }
Kleene Star: L* = ⋃ₙ₌₀ⁿ Lⁿ, where L⁰ = { ε }, Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1
Kleene Plus: L⁺ = ⋃ₙ₌₁ⁿ Lⁿ, where Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1
Complement: ᒾL = Σ* \ L
```

## C# API
2 changes: 2 additions & 0 deletions Automata.Docs/Automata.Docs.csproj
@@ -5,6 +5,8 @@
<OutputType>Library</OutputType>
<!-- Define the output directory for docfx -->
<DocsOutputDir>$(SolutionDir)\docs</DocsOutputDir>
<!-- Disable NuGet package restore -->
<RestoreProjectStyle>None</RestoreProjectStyle>
</PropertyGroup>

<!-- Target to run docfx during the build process -->
29 changes: 11 additions & 18 deletions docs/ALANG.html
@@ -315,24 +315,17 @@ <h3 id="alang-expression-examples">Alang expression examples</h3>
<p><code>(x1 | x2 | x3)* - (x1 x2 x3)+</code> : All sequences constaining {x1, x2, x3}, except repetitions of &quot;x1 x2 x3&quot;.</p>
<p><code>()</code> : The empty language that does not accept anything. For example, it is the result from <code>hello - hello</code> and from <code>hello &amp; world</code>.</p>
<h3 id="operation-definitions">Operation Definitions</h3>
<pre><code>Union: L₁ ∪ L₂ = { w | w ∈ L₁ or w ∈ L₂ }
Difference: L₁ - L₂ = { w | w ∈ L₁ and w ∉ L₂ }
Intersection: L₁ ∩ L₂ = { w | w ∈ L₁ and w ∈ L₂ }
Concatenation: L₁ ⋅ L₂ = { w | w = uv, u ∈ L₁, v ∈ L₂ }
Option: L? = L ∪ { ε }
Kleene Star: L* = ⋃ₙ₌₀^∞ Lⁿ, where L⁰ = { ε }, Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1
Kleene Plus: L⁺ = ⋃ₙ₌₁^∞ Lⁿ, where Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1
Complement: ᒾL = Σ* \ L
</code></pre>
<h3 id="operation-definitions-1">Operation Definitions</h3>
<pre><code class="lang-math">\text{Union: } L_1 \cup L_2 = \{ w \mid w \in L_1 \text{ or } w \in L_2 \}
\text{Difference: } L_1 - L_2 = \{ w \mid w \in L_1 \text{ and } w \notin L_2 \}
\text{Intersection: } L_1 \cap L_2 = \{ w \mid w \in L_1 \text{ and } w \in L_2 \}
\text{Concatenation: } L_1 \cdot L_2 = \{ w \mid w = uv, u \in L_1, v \in L_2 \}
\text{Option: } L? = L \cup \{ \varepsilon \}
\text{Kleene Star: } L^* = \bigcup_{n=0}^\infty L^n, \text{ where } L^0 = \{ \varepsilon \}, L^n = L \cdot L^{n-1} \text{ for } n \geq 1
\text{Kleene Plus: } L^+ = \bigcup_{n=1}^\infty L^n, \text{ where } L^n = L \cdot L^{n-1} \text{ for } n \geq 1
\text{Complement: } \neg L = \Sigma^* \setminus L
<pre><code>### Operation Definitions

Union: L₁ ∪ L₂ = { w | w ∈ L₁ or w ∈ L₂ }
Difference: L₁ - L₂ = { w | w ∈ L₁ and w ∉ L₂ }
Intersection: L₁ ∩ L₂ = { w | w ∈ L₁ and w ∈ L₂ }
Concatenation: L₁ ⋅ L₂ = { w | w = uv, u ∈ L₁, v ∈ L₂ }
Option: L? = L ∪ { ε }
Kleene Star: L* = ⋃ₙ₌₀ⁿ Lⁿ, where L⁰ = { ε }, Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1
Kleene Plus: L⁺ = ⋃ₙ₌₁ⁿ Lⁿ, where Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1
Complement: ᒾL = Σ* \ L

</code></pre>
<h2 id="c-api">C# API</h2>
<p>The Alang parser and FSA compiler is provided by the namespace <strong>Automata.Core.Alang</strong>.</p>
2 changes: 1 addition & 1 deletion docs/index.json
@@ -2,7 +2,7 @@
"ALANG.html": {
"href": "ALANG.html",
"title": "Alang (Automata Language) | Automata Docs",
"keywords": "Alang (Automata Language) Alang is a formal language for defining finite-state automata using human-readable regular expressions. It supports many operations, such as union, intersection, complement and set difference, enabling expressions like \"(a? (b | c)* - (b b))+\". Alang's syntax is defined by the Alang Grammar which is an LL(1) context-free grammar. The Alang parser is optimized for fast parsing of very large inputs. The parser validates syntactic correctness and generates detailed error messages for invalid inputs. Alang Grammar Specification Grammar Rule Expansion AlangRegex (root) Union \uD83D\uDD39Union Difference (| Difference)* \uD83D\uDD39Difference Intersection (- Intersection)* \uD83D\uDD39Intersection Concatenation (& Concatenation)* \uD83D\uDD39Concatenation UnaryRegex+ UnaryRegex PrimaryRegex (Option ┃ KleeneStar ┃ KleenePlus ┃ Complement)* \uD83D\uDD39Option PrimaryRegex ? \uD83D\uDD39KleeneStar PrimaryRegex * \uD83D\uDD39KleenePlus PrimaryRegex + \uD83D\uDD39Complement PrimaryRegex ~ PrimaryRegex ( AlangRegex ) ┃ Symbol ┃ Wildcard ┃ EmptyLang \uD83D\uDD39Symbol SymbolChar+ \uD83D\uDD39Wildcard . \uD83D\uDD39EmptyLang () SymbolChar any character except operator characters and whitespace \uD83D\uDD39 Denotes an actual node type in the resulting AST (abstract syntax tree) outputed by the parser. Note to developers: All types marked with a \uD83D\uDD39 have corresponding classes with the exact same names in the namespace Automata.Core.Alang. For an input to be valid, the root rule AlangRegex must cover the entire input, with no residue. Operators Operators with higher precedence levels bind more tightly than those with lower levels. Operators of the same precedence level are left-associative (left-to-right). All unary operators are postfix operators and all binary operators are infix operators. 
Precedence Operation/Unit Operator Character Position & Arity 1 Union L₁ | L₂ Infix Binary 2 Difference L₁ - L₂ Infix Binary 3 Intersection L₁ & L₂ Infix Binary 4 Concatenation L₁ L₂ Infix Implicit 5 Option L ? Postfix Unary 5 Kleene Star L* Postfix Unary 5 Kleene Plus L+ Postfix Unary 5 Complement L~ Postfix Unary 6 Group ( L ) Enclosing Unary 7 EmptyLang () Empty parentheses 7 Wildcard . Terminal 7 Symbol string literal Terminal Whitespace Multiple Whitespace is allowed anywhere in the grammar, except within Symbols. Whitespace is never required anywhere - except for separating directly adjacent Symbols or operators. Thus, the parser resolves all reserved tokens as delimiters: The following are correcly delimited: hello+world or hello(world). Whitespace denotes any whitespace character (i.e. space, tab, newline, etc.). The formal whitespace definition is equivalent to .NET's char.IsWhiteSpace(char c). Symbols Symbols have a specific meaning - as formally defined by automata theory: User-defined string literals that constitute the atoms of Alang expressions. It is equivalent to symbols in finite-state automata. Can contain any characters except reserved operator characters or whitespace. They can never be empty. Symbols are strings and are not to be confused with characters, Wildcard A Wildcard is a special token denoted by a . (dot). It represents any symbol in the alphabet. For example: . - hello represents the language of all symbols except 'hello'. (. - hello).* represents the language of all sequences, except those containing 'hello'. The Empty Language ∅ and The Language containing only epsilon {ε} The Empty Language ∅ is the language that does not cotain anything. It is written in Alang using empty parentheses (). Its corresponding grammar rule is EmptyLang and the parse tree type is EmptyLang. Its automata equivalence is an automaton that does not accept anything (not even the empty string). 
In most scenarios, () is not required when writing a Alang expressions. However, many operations can result in the empty language. For example a - (a | b) is equivalent to (). The language containing only the empty string {ε} It is written in Alang as ()?, since the Option operator ? unites the operand with {ε}: L? = L ∪ { ε } Its automata equivalence is an automaton that only accepts ε. Note that () ≠ {ε}. For instance: Concatenating any language L with () => (). Concatenating any language L with {ε} => L. Alang expression examples (a? (b | c) )+ : All sequences from the set {a, b, c} where any 'a' must be followed by 'b' or 'c'. a+~ b : Complement of 'a+' - all sequences that are not 1 or more 'a's, followed by a 'b' (x1 | x2 | x3)* - (x1 x2 x3)+ : All sequences constaining {x1, x2, x3}, except repetitions of \"x1 x2 x3\". () : The empty language that does not accept anything. For example, it is the result from hello - hello and from hello & world. Operation Definitions Union: L₁ ∪ L₂ = { w | w ∈ L₁ or w ∈ L₂ } Difference: L₁ - L₂ = { w | w ∈ L₁ and w ∉ L₂ } Intersection: L₁ ∩ L₂ = { w | w ∈ L₁ and w ∈ L₂ } Concatenation: L₁ ⋅ L₂ = { w | w = uv, u ∈ L₁, v ∈ L₂ } Option: L? = L ∪ { ε } Kleene Star: L* = ⋃ₙ₌₀^∞ Lⁿ, where L⁰ = { ε }, Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1 Kleene Plus: L⁺ = ⋃ₙ₌₁^∞ Lⁿ, where Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1 Complement: ᒾL = Σ* \\ L Operation Definitions \\text{Union: } L_1 \\cup L_2 = \\{ w \\mid w \\in L_1 \\text{ or } w \\in L_2 \\} \\text{Difference: } L_1 - L_2 = \\{ w \\mid w \\in L_1 \\text{ and } w \\notin L_2 \\} \\text{Intersection: } L_1 \\cap L_2 = \\{ w \\mid w \\in L_1 \\text{ and } w \\in L_2 \\} \\text{Concatenation: } L_1 \\cdot L_2 = \\{ w \\mid w = uv, u \\in L_1, v \\in L_2 \\} \\text{Option: } L? 
= L \\cup \\{ \\varepsilon \\} \\text{Kleene Star: } L^* = \\bigcup_{n=0}^\\infty L^n, \\text{ where } L^0 = \\{ \\varepsilon \\}, L^n = L \\cdot L^{n-1} \\text{ for } n \\geq 1 \\text{Kleene Plus: } L^+ = \\bigcup_{n=1}^\\infty L^n, \\text{ where } L^n = L \\cdot L^{n-1} \\text{ for } n \\geq 1 \\text{Complement: } \\neg L = \\Sigma^* \\setminus L C# API The Alang parser and FSA compiler is provided by the namespace Automata.Core.Alang. Key class: AlangRegex Example usage: AlangRegex regex = AlangRegex.Parse(\"(a? (b | c) )+\"); // Create an Alang regex Mfa fsa = regex.Compile(); // Compile the regex to a minimal finite-state automaton For more information, see the Automata documentation"
"keywords": "Alang (Automata Language) Alang is a formal language for defining finite-state automata using human-readable regular expressions. It supports many operations, such as union, intersection, complement and set difference, enabling expressions like \"(a? (b | c)* - (b b))+\". Alang's syntax is defined by the Alang Grammar which is an LL(1) context-free grammar. The Alang parser is optimized for fast parsing of very large inputs. The parser validates syntactic correctness and generates detailed error messages for invalid inputs. Alang Grammar Specification Grammar Rule Expansion AlangRegex (root) Union \uD83D\uDD39Union Difference (| Difference)* \uD83D\uDD39Difference Intersection (- Intersection)* \uD83D\uDD39Intersection Concatenation (& Concatenation)* \uD83D\uDD39Concatenation UnaryRegex+ UnaryRegex PrimaryRegex (Option ┃ KleeneStar ┃ KleenePlus ┃ Complement)* \uD83D\uDD39Option PrimaryRegex ? \uD83D\uDD39KleeneStar PrimaryRegex * \uD83D\uDD39KleenePlus PrimaryRegex + \uD83D\uDD39Complement PrimaryRegex ~ PrimaryRegex ( AlangRegex ) ┃ Symbol ┃ Wildcard ┃ EmptyLang \uD83D\uDD39Symbol SymbolChar+ \uD83D\uDD39Wildcard . \uD83D\uDD39EmptyLang () SymbolChar any character except operator characters and whitespace \uD83D\uDD39 Denotes an actual node type in the resulting AST (abstract syntax tree) outputed by the parser. Note to developers: All types marked with a \uD83D\uDD39 have corresponding classes with the exact same names in the namespace Automata.Core.Alang. For an input to be valid, the root rule AlangRegex must cover the entire input, with no residue. Operators Operators with higher precedence levels bind more tightly than those with lower levels. Operators of the same precedence level are left-associative (left-to-right). All unary operators are postfix operators and all binary operators are infix operators. 
Precedence Operation/Unit Operator Character Position & Arity 1 Union L₁ | L₂ Infix Binary 2 Difference L₁ - L₂ Infix Binary 3 Intersection L₁ & L₂ Infix Binary 4 Concatenation L₁ L₂ Infix Implicit 5 Option L ? Postfix Unary 5 Kleene Star L* Postfix Unary 5 Kleene Plus L+ Postfix Unary 5 Complement L~ Postfix Unary 6 Group ( L ) Enclosing Unary 7 EmptyLang () Empty parentheses 7 Wildcard . Terminal 7 Symbol string literal Terminal Whitespace Multiple Whitespace is allowed anywhere in the grammar, except within Symbols. Whitespace is never required anywhere - except for separating directly adjacent Symbols or operators. Thus, the parser resolves all reserved tokens as delimiters: The following are correcly delimited: hello+world or hello(world). Whitespace denotes any whitespace character (i.e. space, tab, newline, etc.). The formal whitespace definition is equivalent to .NET's char.IsWhiteSpace(char c). Symbols Symbols have a specific meaning - as formally defined by automata theory: User-defined string literals that constitute the atoms of Alang expressions. It is equivalent to symbols in finite-state automata. Can contain any characters except reserved operator characters or whitespace. They can never be empty. Symbols are strings and are not to be confused with characters, Wildcard A Wildcard is a special token denoted by a . (dot). It represents any symbol in the alphabet. For example: . - hello represents the language of all symbols except 'hello'. (. - hello).* represents the language of all sequences, except those containing 'hello'. The Empty Language ∅ and The Language containing only epsilon {ε} The Empty Language ∅ is the language that does not cotain anything. It is written in Alang using empty parentheses (). Its corresponding grammar rule is EmptyLang and the parse tree type is EmptyLang. Its automata equivalence is an automaton that does not accept anything (not even the empty string). 
In most scenarios, () is not required when writing a Alang expressions. However, many operations can result in the empty language. For example a - (a | b) is equivalent to (). The language containing only the empty string {ε} It is written in Alang as ()?, since the Option operator ? unites the operand with {ε}: L? = L ∪ { ε } Its automata equivalence is an automaton that only accepts ε. Note that () ≠ {ε}. For instance: Concatenating any language L with () => (). Concatenating any language L with {ε} => L. Alang expression examples (a? (b | c) )+ : All sequences from the set {a, b, c} where any 'a' must be followed by 'b' or 'c'. a+~ b : Complement of 'a+' - all sequences that are not 1 or more 'a's, followed by a 'b' (x1 | x2 | x3)* - (x1 x2 x3)+ : All sequences constaining {x1, x2, x3}, except repetitions of \"x1 x2 x3\". () : The empty language that does not accept anything. For example, it is the result from hello - hello and from hello & world. Operation Definitions ### Operation Definitions Union: L₁ ∪ L₂ = { w | w ∈ L₁ or w ∈ L₂ } Difference: L₁ - L₂ = { w | w ∈ L₁ and w ∉ L₂ } Intersection: L₁ ∩ L₂ = { w | w ∈ L₁ and w ∈ L₂ } Concatenation: L₁ ⋅ L₂ = { w | w = uv, u ∈ L₁, v ∈ L₂ } Option: L? = L ∪ { ε } Kleene Star: L* = ⋃ₙ₌₀ⁿ Lⁿ, where L⁰ = { ε }, Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1 Kleene Plus: L⁺ = ⋃ₙ₌₁ⁿ Lⁿ, where Lⁿ = L ⋅ Lⁿ⁻¹ for n ≥ 1 Complement: ᒾL = Σ* \\ L C# API The Alang parser and FSA compiler is provided by the namespace Automata.Core.Alang. Key class: AlangRegex Example usage: AlangRegex regex = AlangRegex.Parse(\"(a? (b | c) )+\"); // Create an Alang regex Mfa fsa = regex.Compile(); // Compile the regex to a minimal finite-state automaton For more information, see the Automata documentation"
},
"Automata.Core.Alang.AlangCursor.html": {
"href": "Automata.Core.Alang.AlangCursor.html",

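The set-theoretic definitions in the Operation Definitions block touched by this commit can be checked on small finite languages. The Python sketch below is illustrative only and is not part of the Automata library. All function names are hypothetical. Words are modeled as tuples of symbols, since Alang symbols are strings rather than characters. The infinite unions in Kleene Star and Kleene Plus, and Σ* in Complement, are truncated at a caller-chosen bound. Union, difference and intersection need no helpers because they map directly onto Python's set operators `|`, `-` and `&`.

```python
from itertools import product

# The empty word ε, modeled as the empty tuple of symbols.
EPSILON = ()


def concat(l1, l2):
    """Concatenation: L1 . L2 = { uv | u in L1, v in L2 }."""
    return {u + v for u in l1 for v in l2}


def option(l):
    """Option: L? = L ∪ { ε }."""
    return l | {EPSILON}


def power(l, n):
    """L^0 = { ε }, L^n = L . L^(n-1) for n >= 1."""
    result = {EPSILON}
    for _ in range(n):
        result = concat(l, result)
    return result


def kleene_star(l, max_n):
    """Bounded approximation of L*: union of L^n for n = 0..max_n."""
    result = set()
    for n in range(max_n + 1):
        result |= power(l, n)
    return result


def kleene_plus(l, max_n):
    """Bounded approximation of L+: union of L^n for n = 1..max_n."""
    result = set()
    for n in range(1, max_n + 1):
        result |= power(l, n)
    return result


def sigma_star(alphabet, max_len):
    """All words over the alphabet with at most max_len symbols."""
    return {w for k in range(max_len + 1) for w in product(alphabet, repeat=k)}


def complement(l, alphabet, max_len):
    """Bounded approximation of Σ* \\ L."""
    return sigma_star(alphabet, max_len) - l
```

Under these assumptions, `concat(L, set())` yields the empty set while `concat(L, {EPSILON})` yields `L`, which matches the distinction the docs draw between `()` and `{ε}`.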