Converting an FSM to a RegEx

04 Oct 2015

Creating an Equivalent Regular Expression for a Finite State Machine;
Part of my notes from ITEC 420 "Computability Theory and Formal Languages" at Radford University, Fall 2015

Given:

Language:

Machine:

original machine

Preparation

For many smaller machines, you can get by on intuition and not do this. But this is required for the procedural, algorithmic solution to work.

1. Add additional start and end states

Replace the existing start state by adding a new start state with a single transition on epsilon to the original start state. (We will arbitrarily name this state q_s, for "start".)

Create a new accept with epsilon transitions from all the original accept states to this new state. Change the original accept states to non-accept states. (We will arbitrarily name this state q_e, for "end".)

The important property for these two new states is that the new start state will only have transitions out, and the new end state will only have transitions in.

machine after adding new states

2. Replace transitions on multiple symbols with a union

If there is more than one transition between two states, or a transition on more than one symbol (which is effectively the same thing), these should be replaced with a union RE.

For instance, in our example machine, there are two transitions ((q₂, a), q₂) and ((q₂, b), q₂), represented by a single loopback arrow on state q₂. This transition should be re-labelled with the regular expression .

machine after replacing union transitions

3. Add null transitions

Every state must be connected to every other state. Every state in the original machine must have a transition to every other state in the original machine, including itself. If one does not exist already, add a null transition (a transition on the empty RE ). Also add null transitions from the new start state added in 1 to every other state (except to the original start state, which already has an epsilon transition). Add a null transition from all the original non-accept states to the new end state (the original accept states should already have epsilon transitions to the new end state from 1).

At the end, every state will be connected somehow to every other state, except that the start state will still only have outbound transitions and the end state will still only have inbound transitions (so these two will not have loopback transitions to themselves.)

machine after adding transitions

4. Table representation

With all those null transitions, the diagram is a little nuts - and it gets even more nutes the more states there are in the machine. It may be helpful to represent the modified machine as a transition table:

	0	1	2
S
0	b	a
1	b		a
2
E

The rows are the origin states for the transitions (note the empty row for q_e), and the columns are the destination states for the transitions (note the empty column for q_s). So for example, the transition from q₀ to q_e is , and the transition from q₂ to q₂ (a loopback) is a, b.

Removing states

To generate the regular expression, we repeatedly remove original states from the machine, until only the new start and end states remain. At each step, we replace the transitions involving the state we removed with equivalent transitions labelled using a regular expression rather than a set of symbols from the machine's alphabet. At the end, there will only be two states (q_s and q_e) and one transition, labelled with a regular expression which is equivalent to the original machine.

1. Remove a state

Pick one of the original states to remove. When running on intuition, pick states that are more isolated. But when following the procedural solution exhaustively, it doesn't matter which one you pick. Call it rip (we're ripping it out). We'll start by removing state q₂.

2. Create pairs of remaining states

Determine all possible transitions in the new machine. The most straightforward way to do this is to construct a new table without the state we want to remove. For instance:

	0	1	E
S	?	?	?
0	?	?	?
1	?	?	?
E

There won't be any transitions into S, or out of E, so we need to figure out the transitions shown with a ?.

3. Determine new transitions

The formula to create a regular expression for one of these transitions is:

Where returns the existing pattern for a transition from s₁ to s₂.

If we pick an origin state p and a destination state q, then the new transition from p to q will be the direct transition from p to q unioned with the concatenation of the transition from p to the state being removed, the transition from the state being removed to itself, and the transition from the state being removed to q. (This is why we had to add all those null transitions - this formula would not work uniformly unless each of these transitions exists!)

For example, the new transition from q₀ to q₁ would be:

This is consistent because before removing the state, there was not path from q₀ to q₁ that passed through q₂. Therefore, after removing q₂ the transition from q₀ to q₁ remains unchanged.

Here are the rest of the transitions after removing q₂:

The filled in table is:

	0	1
S
0	b	a
1	b
E

A diagram of the machine after removing q₂ is:

machine after removing q2

In this case, it would be fairly straightforward to determine the new machine without explicitly working out all 9 of the transitions - they "new" transitions are the same as they were before! The real utility of using this procedure is in later steps or more complicated machines.

4. Repeat

Continue removing states, combining transitions into regular expressions, until only q_s and q_e remain. The regular expression labelling the transition from q_s to q_e should be equivalent to the original finite state machine (that is, it generates the same set of strings that the machine accepts).

Solution

Remove State 1

	S	0	E
S
0
E

machine after removing state q1

Remove State 0

	S	E
S
E

machine after removing state q0

So, the original machine is equivalent to the regular expression .