Details | Last modification | View Log | RSS feed
| Rev | Author | Line No. | Line | 
|---|---|---|---|
| 14 | pmbaty | 1 | //===- Automaton.td ----------------------------------------*- tablegen -*-===// | 
| 2 | // | ||
| 3 | // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
| 4 | // See https://llvm.org/LICENSE.txt for license information. | ||
| 5 | // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
| 6 | // | ||
| 7 | //===----------------------------------------------------------------------===// | ||
| 8 | // | ||
| 9 | // This file defines the key top-level classes needed to produce a reasonably | ||
| 10 | // generic finite-state automaton. | ||
| 11 | // | ||
| 12 | //===----------------------------------------------------------------------===// | ||
| 13 | |||
| 14 | // Define a record inheriting from GenericAutomaton to generate a reasonably | ||
| 15 | // generic finite-state automaton over a set of actions and states. | ||
| 16 | // | ||
| 17 | // This automaton is defined by: | ||
| 18 | // 1) a state space (explicit, always bits<32>). | ||
| 19 | // 2) a set of input symbols (actions, explicit) and | ||
| 20 | // 3) a transition function from state + action -> state. | ||
| 21 | // | ||
| 22 | // A theoretical automaton is defined by <Q, S, d, q0, F>: | ||
| 23 | // Q: A set of possible states. | ||
| 24 | // S: (sigma) The input alphabet. | ||
| 25 | // d: (delta) The transition function f(q in Q, s in S) -> q' in Q. | ||
| 26 | // F: The set of final (accepting) states. | ||
| 27 | // | ||
| 28 | // Because generating all possible states is tedious, we instead define the | ||
| 29 | // transition function only and crawl all reachable states starting from the | ||
| 30 | // initial state with all inputs under all transitions until termination. | ||
| 31 | // | ||
| 32 | // We define F = S, that is, all valid states are accepting. | ||
| 33 | // | ||
| 34 | // To ensure the generation of the automaton terminates, the state transitions | ||
| 35 | // are defined as a lattice (meaning every transitioned-to state is more | ||
| 36 | // specific than the transitioned-from state, for some definition of specificity). | ||
| 37 | // Concretely a transition may set one or more bits in the state that were | ||
| 38 | // previously zero to one. If any bit was not zero, the transition is invalid. | ||
| 39 | // | ||
| 40 | // Instead of defining all possible states (which would be cumbersome), the user | ||
| 41 | // provides a set of possible Transitions from state A, consuming an input | ||
| 42 | // symbol A to state B. The Transition object transforms state A to state B and | ||
| 43 | // acts as a predicate. This means the state space can be discovered by crawling | ||
| 44 | // all the possible transitions until none are valid. | ||
| 45 | // | ||
| 46 | // This automaton is considered to be nondeterministic, meaning that multiple | ||
| 47 | // transitions can occur from any (state, action) pair. The generated automaton | ||
| 48 | // is determinized, meaning that is executes in O(k) time where k is the input | ||
| 49 | // sequence length. | ||
| 50 | // | ||
| 51 | // In addition to a generated automaton that determines if a sequence of inputs | ||
| 52 | // is accepted or not, a table is emitted that allows determining a plausible | ||
| 53 | // sequence of states traversed to accept that input. | ||
| 54 | class GenericAutomaton { | ||
| 55 | // Name of a class that inherits from Transition. All records inheriting from | ||
| 56 | // this class will be considered when constructing the automaton. | ||
| 57 | string TransitionClass; | ||
| 58 | |||
| 59 | // Names of fields within TransitionClass that define the action symbol. This | ||
| 60 | // defines the action as an N-tuple. | ||
| 61 | // | ||
| 62 | // Each symbol field can be of class, int, string or code type. | ||
| 63 | // If the type of a field is a class, the Record's name is used verbatim | ||
| 64 | // in C++ and the class name is used as the C++ type name. | ||
| 65 | // If the type of a field is a string, code or int, that is also used | ||
| 66 | // verbatim in C++. | ||
| 67 | // | ||
| 68 | // To override the C++ type name for field F, define a field called TypeOf_F. | ||
| 69 | // This should be a string that will be used verbatim in C++. | ||
| 70 | // | ||
| 71 | // As an example, to define a 2-tuple with an enum and a string, one might: | ||
| 72 |   //   def MyTransition : Transition { | ||
| 73 | // MyEnum S1; | ||
| 74 | // int S2; | ||
| 75 | // } | ||
| 76 |   //   def MyAutomaton : GenericAutomaton }{ | ||
| 77 | // let TransitionClass = "Transition"; | ||
| 78 | // let SymbolFields = ["S1", "S2"]; | ||
| 79 | // let TypeOf_S1 = "MyEnumInCxxKind"; | ||
| 80 | // } | ||
| 81 | list<string> SymbolFields; | ||
| 82 | } | ||
| 83 | |||
| 84 | // All transitions inherit from Transition. | ||
| 85 | class Transition { | ||
| 86 | // A transition S' = T(S) is valid if, for every set bit in NewState, the | ||
| 87 | // corresponding bit in S is clear. That is: | ||
| 88 | // def T(S): | ||
| 89 | // S' = S | NewState | ||
| 90 | // return S' if S' != S else Failure | ||
| 91 | // | ||
| 92 | // The automaton generator uses this property to crawl the set of possible | ||
| 93 | // transitions from a starting state of 0b0. | ||
| 94 | bits<32> NewState; | ||
| 95 | } |