# Posts Tagged ‘context-free languages’

## It only looks like a homework problem…

Almost exactly one year ago (10 Sept 2010), Cem Say asked the following deceptively simple question:

Is the language $latex \{ a^{i}b^{j}c^{k} \mid i \neq j, i \neq k, j \neq k \}$ non-context-free?

Recall that a language is context-free if it can be accepted by a pushdown automaton. A pushdown automaton, in turn, is a finite automaton with access to a stack (so it has access to an unbounded amount of memory). A standard example of a context-free language that cannot be recognized by a finite automaton without a stack is {$latex a^n b^n \mid n \geq 1 $}. If a finite automaton has no access to a stack, it cannot store the value of $latex n$ if $latex n$ gets too large, so it has no way to be certain that the number of $latex b$ appearing exactly match the number of $latex a$. This intuition is formalized in the famous pumping lemma; and there is a similar tool, the pumping lemma for context-free languages, that can be used to prove that a language is not context-free.

It is easy to apply a standard technique — the pumping lemma — to show the language Cem Say asked about is not regular. (We make the exponents large enough that a finite automaton has to repeat a state, and take advantage of that confusion.) However, there doesn’t appear to be a good way to apply the pumping lemma for context-free languages to show this particular language is context-free. Initially, though, some site members thought the question was easy enough to be undergraduate homework, since the definition of the language is so close to other languages whose status can be resolved with a pumping lemma.

Fortunately, Frank Weinberg knew about Ogden’s Lemma, which is an extension of the pumping lemma for context-free languages (the pumping lemma is a special case of Ogden’s Lemma). With this tool, he was able to prove that, indeed, the language Cem Say asked about is not context-free. The proof I will give in this blog entry can be found in Frank Weinberg’s answer; I will give a version of the proof polished by Tsuyoshi Ito in the comments to that answer.

First, here is a statement of Ogden’s Lemma.

Ogden’s Lemma: If $latex L$ is a context-free language, then there exists some number $latex p>0$ such that for any string $latex w$ of length at least $latex p$ in $latex L$, and every way of “marking” $latex p$ or more of the positions in $latex w$, $latex w$ can be written as $latex w=uxyzv$ with string $latex u$, $latex x$, $latex y$, $latex z$, $latex v$, such that

- $latex xz$ has at least one marked position,
- $latex xyz$ has at most $latex p$ marked positions, and
- $latex ux^iyz^iv$ is in $latex L$ for every $latex i \geq 0$.

The pumping lemma for context-free languages is the special case where every position is marked. Armed with this tool, we can now prove that $latex \lbrace a^{i}b^{j}c^{k} \mid i \neq j, i \neq k, j \neq k \rbrace$ is not context-free.

Proof(that Cem Say’s language is not context-free): For a given $latex p$, choose $latex a^i b^p c^k$ and mark all the $latex b$’s (and nothing else). We choose $latex i$ and $latex k$ such that for every choice of how many $latex b$’s are actually pumped there is one pumping exponent such that the number of $latex b$’s is equal to $latex i$ and one where it is equal to $latex k$. Then $latex i$ and $latex k$ have to be from the set $latex S= \bigcap_{1 \leq n \leq p} \lbrace p-n + m*n \mid m \in IN_0\rbrace$ where $latex IN_0$ is the set of positive integers. The set $latex S$ is infinite, because it contains $latex p+ im$ for all $latex i \geq 0$, where $latex m$ is the least common muiltiple of $latex \lbrace 1, \ldots, p \rbrace$. So, assuming the language is context-free, we can always find an $latex i$ and $latex k$ in $latex S$ for any $latex b$, and by condition (3) of Ogden’s Lemma, the resulting string will be in $latex L$, contradicting the original definition of $latex L$. So the language cannot be context-free after all.

For questions and answers on related topics, click on the formal languages tag.