We want to recap the concepts of a grammar and a formal language in the following definition.

Let $\Sigma$ be an alphabet and $G=(V,T,R,S)$ be a grammar over this alphabet. We say a string $y\in\Sigma^*$ can be **syntactically derived** from another string $x$ using the grammar $G$ (denoted by $x\Rightarrow y$), if $$\exists u,w\in (V\cup T)^*, \exists P\to C\in R: (x=uPv\wedge y=uCv).$$

In other words, $x\Rightarrow y$ if and only the two strings have the form $x=uPw$ and $y=uCw$ (for some other strings $u,w$) and there is a rule $P\to C.$

Note that $y$ can be derived from $x$ by applying several rules, e.g. $x\Rightarrow z_1\Rightarrow\cdots\Rightarrow z_n\Rightarrow y$. It is, therefore, reasonable to require that the set $R$ of grammar rules is a transitive relation.

Denoting by $\Rightarrow^*$ the **transitive hull** of the grammar rule relation $R,$ we can now state that a language $L\subset\Sigma^*$ is said to be **generated by the grammar** $G$ if and only if

$$L=L(G)=\{y\in\Sigma^*\mid S\Rightarrow^* y\}.$$

In this case, we call $L$ a **formal language**.

- In other words, $L(G)$ consists exactly of those strings $y\in\Sigma^*$ which can be grammatically derived from the starting symbol $S$ in only finitely many steps applying the grammar rules.
- $L(G)$ consists therefore solely of strings built out of terminal symbol (elements of $T$).
- Any non-terminal symbol (element of $V$, variable) plays only a placeholder-role in $L(G)$ until it gets replaced by a rule in $R$ by something else.
- Only when we have replaced every non-terminal symbol, we have produced a string of the language $L(G).$

| | | | | created: 2020-04-23 16:21:05 | modified: 2020-05-24 07:53:07 | by: *bookofproofs* | references: [1086], [7895]

[1086] **Erk, Katrin; Priese, Lutz**: “Theoretische Informatik”, Springer Verlag, 2000, 2. Auflage

[7895] **Hoffmann, Dirk**: “Theoretische Informatik, 3. Auflage”, Hanser, 2015