How regular expression works?
There are two type of regular expression engines. They are
In NFA the regular expression pattern will be compiled into byte-code and the engine executes the code one at a time, moving from one state to another. If the result in a state is failure then it backtracks to try another string to see if it matches the input. Lets go through an example to understand it better. So the job is to find a string "that" in a sentence.
Input sentence: The one who are crazy enough to think that they can change the world are the ones who do.
The regular expression would be: /that/
Give the task to match the string "that", the javascript engine moves to the first character to see if it matches the pattern, yes it is. Then it moves forward to the next character and see if it matches, yes it is again. Then it moves to the next character "e", oops! failed to match. So it backtrack to position 2 to see if the character "h" forms the start of a pattern match, Nope, its not... Similarly, it keeps moving back and forth until it find the exact match "that" at position 39-42. The regular expression engine is a eager algorithm which return a match as soon as it find one and terminate the program, unless /g is specified in the regex to keep moving forward.
Appendix
What is finite state automata?
Its a mathematical model for computation, represented using states. At any point in time, a program can be in only one state. They are extremely powerful for parsers which decide if a given string matches a specific pattern or not.
There are two type of regular expression engines. They are
- Text-directed engine (or) Deterministic Finite Automatons (DFA)
- Regex-directed engine (or) Nondeterministic Finite Automaton (NFA)
In NFA the regular expression pattern will be compiled into byte-code and the engine executes the code one at a time, moving from one state to another. If the result in a state is failure then it backtracks to try another string to see if it matches the input. Lets go through an example to understand it better. So the job is to find a string "that" in a sentence.
Input sentence: The one who are crazy enough to think that they can change the world are the ones who do.
The regular expression would be: /that/
Give the task to match the string "that", the javascript engine moves to the first character to see if it matches the pattern, yes it is. Then it moves forward to the next character and see if it matches, yes it is again. Then it moves to the next character "e", oops! failed to match. So it backtrack to position 2 to see if the character "h" forms the start of a pattern match, Nope, its not... Similarly, it keeps moving back and forth until it find the exact match "that" at position 39-42. The regular expression engine is a eager algorithm which return a match as soon as it find one and terminate the program, unless /g is specified in the regex to keep moving forward.
Appendix
What is finite state automata?
Its a mathematical model for computation, represented using states. At any point in time, a program can be in only one state. They are extremely powerful for parsers which decide if a given string matches a specific pattern or not.
No comments:
Post a Comment