Formal Regular Expressions

08 Oct 2015

Part of my notes from ITEC 420 "Computability Theory and Formal Languages" at Radford University, Fall 2015

Syntax

A regular expression is a string of symbols that describes a language (as an arithmetic expression is a string of symbols that describes a numeric value).

For some alphabet \(\Sigma\), a valid (syntactically correct) regular expressions R is:

\(R \rightarrow \varnothing \mid \epsilon \mid symbol \mid R^* \mid R^+ \mid (R) \mid RR \mid R \cup R\)
\(symbol \rightarrow \text{any symbol in } \Sigma\)

This definition can be broken up into base cases and recursive cases:

A regular expression is:

\(\varnothing\)
\(\epsilon\)
Any single symbol in \(\Sigma\)

And if \(\alpha\) and \(\beta\) are regular expressions, then so are:

\(\alpha\beta\)
\(\alpha \cup \beta\)
\(\alpha^*\)
\(\alpha^+\)
\((\alpha)\)

Semantics

Assume L is a function that, given some regular expressions, returns the equivalent regular language:

L : reg. expr. -> language

Then each of the forms given in the syntactic definition above has the following meanings:

\(L(\varnothing) = \{\}\) (the empty language)
\(L(\epsilon) = \{\epsilon\}\) (language containing the empty string)
\(\exists c \in \Sigma \rightarrow L(c) = \{c\}\) (the language containing c)
\(L(\alpha\beta) = L(\alpha)L(\beta)\) (concatenation)
\(L(\alpha \cup \beta) = L(\alpha) \cup L(\beta)\) (union)
\(L(\alpha^*) = (L(\alpha))^*\) (Kleene star)
\(L(\alpha^+) = L(\alpha\alpha^*) = L(\alpha)(L(\alpha))^*\) (Kleene plus)
\(L((\alpha)) = L(\alpha)\) (grouping with parenthesis)

Operators

Precedence

From highest to lowest:

Kleene star
Concatenation
Union

So \(a^*b \cup c = ((a^*)b) \cup c\)

Properties

Union

Regular expressions describe sets, so union behaves as with sets...

Commutative: \(a \cup b = b \cup a\)
Associative: \((a \cup b) \cup c = a \cup (b \cup c)\)
Idempotent: \(a \cup a = a\)
\(\varnothing\) for identity: \(a \cup \varnothing = a\)

Concatenation

Associative: \((ab)c = a(bc)\)
\(\epsilon\) for identity: \(a\epsilon = a\)
\(\varnothing\) for "zero": \(a\varnothing = \varnothing\)
Distributes over union: \((a \cup b) c = (ac) \cup (bc)\), \(c (a \cup b) = (ca) \cup (cb)\)

Kleene star

(these are... less intuitive)

\(\varnothing^* = \epsilon\)
\(\epsilon^* = \epsilon\)
\((a^*)^* = a^*\)
\(a^*a^* = a^*\)
\((a \cup b)^* = (a^*b^*)^*\)
\(a^* \subseteq b^* \Rightarrow a^*b^* = b^*a^* = b^*\)
\(a \subseteq b^* \Rightarrow a^*b^* = b^*a^* = b^*\)