I made a simple regular expression engine, which supports concatenation, alternation, closure, and char a .. z
. Source code is here
The way I represent nfa and dfa is to use record:
type state = int with sexp, compare
type alphabet = char with sexp, compare
type transaction = state * alphabet option * state with sexp, compare
type d_transaction = state * alphabet * state with sexp, compare
type state_set = State_set.t
type states_set = States_set.t
type nfa = {
states : State_set.t ;
alphabets : Alphabet_set.t ;
transactions : Transaction_set.t;
start_state : state;
final_states : State_set.t;
}
type dfa = {
d_states : State_set.t ;
d_alphabets : Alphabet_set.t;
d_transactions : D_Transaction_set.t ;
d_start_state : state ;
d_final_states : State_set.t;
}
For example, a string "a*" will be parsed into Closure (Char 'a')
, and then is transformed to nfa: states: 0 1 2 3 alphabets: a transactions: 0->e->1, 1->a>2, 2->a->3, 2->e->1, 0->a->3 start_start: 1 final_states: 4
then dfa:
states: 0 1 alphabets: a transactions: 0->a->1, 1->a->1 start_start: 1 final_states: 0 1
However, I use a lot of recursion in my code. The way my program generates state number for each node in nfa and dfa is realy unpredictable. I have no idea how to verify if the generated dfa is correct without using a pen and a piece of paper to test it myself
I am trying to figure out a better way to test my code so I can add more features into my program in the future.
Can anyone please give me some suggestions?
Aucun commentaire:
Enregistrer un commentaire