Automata IV: Introducing the NFA

01 June 2015

Introduction

Today we're going to meet the DFA's more liberal cousin the Non-Deterministic Finite Automaton, or NFA for short. Using Wikipedia as a guide, we can see that an NFA is defined as an automaton that doesn't match the following rules that a DFA must follow:

  • each of its transitions is uniquely determined by its source state and input symbol, and
  • reading an input symbol is required for each state transition.

This is a little confusing to picture just from those words, so let's look at some examples.

DFA

This is a DFA because;

  1. There are no transitions in this state machine that can occur without an input symbol.
  2. At any given state, for each "letter" in the alphabet, there is only one possible transition that can occur. For example at state 3, only one move is possible for input "s".

NFA

This, by contrast, is an NFA because:

  1. The transition between state 2 and state 4 follows what is called an "epsilon move". An epsilon move allows the state of the NFA to advance without any input.
  2. The transition between state 2 and state 3 and state 2 and state 9 both occur on the letter "e".

Building the NFA

While these examples aren't particularly interesting, they are simple enough to highlight an important difference when building an NFA as opposed to a DFA. Namely, that building an NFA requires the ability to have multiple transitions for the same input character from a given state. So, for example, in the above NFA, state 2 must allow for a transition to state 3 and state 9.

As you'll remember from the first post on automata, the way we implemented the DFA was to use a transition table, that mapped a state and an input onto the resulting state. For the NFA, we have to go one further. We need to map a state and an input onto a collection of resulting states. This means that the value for state 2, with the letter "e" would contain the collection [3, 9]

Implementation

So, to actually build the NFA using code, we need to make two major changes to how we generated our DFAs. Firstly, we need to switch from our transition table being a sequence of states containing a transition for each element of the alphabet, to a sequence of states containing a sequence of transitions for each element of the alphabet. That is, the states property of the NFA is a Seq[Seq[Seq[T]]] rather than a Seq[Seq[T]] as it was in the DFA.

That changes the behavior of the rest of the NFA as well; wherever we were expecting a two-dimensional sequence, we're now dealing with a three-dimensional sequence.

We also need to introduce the concept of an "epsilon" move. Because we have a generic NFA, our epsilon move needs to be of type T and thus we need to append the epsilon character to our alphabet when we construct the NFA.

All together, the new class looks a lot like the class we wrote for the DFA, with a few key changes:

import java.io.PrintWriter

class NFA[T](alphabet: Seq[T], val epsilon: T, val states: Seq[Seq[Seq[Int]]], val acceptanceStates: Seq[Int], val startState: Int) {

  private val fullAlphabet = alphabet :+ epsilon

  def this(alphabet: Seq[T], epsilon: T) {
    this(alphabet, epsilon, Seq.fill(2)(Seq.fill(alphabet.length + 1)(Seq(0))), Seq.empty, NFA.DefaultStartState)
  }

  def this(alphabet: Seq[T], epsilon: T, startState: Int) {
    this(alphabet, epsilon, Seq.fill(2)(Seq.fill(alphabet.length + 1)(Seq(0))), Seq.empty, startState)
  }

  def addTransition(from: Int, key: T): NFA[T] = {
    val newStates = states ++ Seq(Seq.fill(fullAlphabet.length)(Seq(DFA.DeadState)))
    val result = new NFA(alphabet, epsilon, newStates, acceptanceStates, startState)
    val firstLiveStateOfResult = result.states.length -1
    result.addTransition(from, key, firstLiveStateOfResult)
  }

  def addTransition(from: Int, key: T, to: Int): NFA[T] = {
    val newStates = states.updated(from, states(from).updated(getAlphabetIndex(key), states(from)(getAlphabetIndex(key)).filter(_ != 0) :+ to))
    new NFA(alphabet, epsilon, newStates, acceptanceStates, startState)
  }

  def addAcceptingState(state: Int) = new NFA(alphabet, epsilon, states, acceptanceStates :+ state, startState)

  private def getAlphabetIndex(item: T) = fullAlphabet.zipWithIndex.filter(a => a._1 == item)(0)._2

  def toGraphVis(path: String) {
    val out = new PrintWriter(path)
    out.print(toString())
    out.close()
  }

  override def toString() = {
    def printTransition(state1: Int, transitionIdx: Int, state2: Int) = {
      val transitionValue = fullAlphabet(transitionIdx)
      if (state2 != DFA.DeadState) s"s$state1 -> s$state2 [label = $transitionValue]"
      else ""
    }
    def getShape(node: Int) = if(acceptanceStates.contains(node)) "doublecircle" else "circle"
    def printNode(node: Int) = "node [shape = " + getShape(node) + "] s" + node + ";"

    val config = "\n\trankdir=LR;\n\tsize=8.5;\n\t\n\t"
    val stateList = "node [shape = point] s0\n\t" + states.zipWithIndex.filter(_._2 != 0).map(s => printNode(s._2)).mkString("\n\t")
    val startNode = s"s0 -> s$startState"
    val otherNodes = states.zipWithIndex.flatMap(s => s._1.zipWithIndex.flatMap(e => e._1.map(p => printTransition(s._2, e._2, p))))
    val otherNodesString = otherNodes.filter(_ != "").mkString(";\n\t")

    s"digraph nfa { $config\n\t$stateList\n\t\n\t$startNode\n\t$otherNodesString;\n}"
  }
}

object NFA {
  val DeadState = 0
  val DefaultStartState = 1
}

I still have a bit of work to do here, and I suspect things will change a lot as I continue to build on the behavior of the NFA. Working with multi-dimensional arrays has never been clean in any language I've used, but this just feels like a right mess of parentheses. I'm not familiar enough right now with Scala to be able to tell if this is because I'm not comfortable with the language, or if it is simply because working with multi-dimensional arrays is inherently messy.

The basic functionality though, does work, and we can build NFA's quite easily with this code, so for now, we'll continue on our journey.

There are a few pieces left to get in place before we can get to a Regex implementation, and they will be the topics of future blog posts. Namely, the two big pieces that are left are translating an NFA to a DFA, and translating a Regex string into an NFA. Stay tuned for these.

Tags: Compilers | Automata | Scala
Tweet