Beta

Compute Nullable Non-Terminals

Description:

Background

Context-free grammars (CFGs) are used to define the structure of programming languages. Recall that a CFG has four parts:

  • a set of termial symbols
  • a set of non-terminal symbols
  • a set of productions
  • a start symbol which is one of the non-terminals

There is also a special non-terminal called 'epsilon' which represents an empty string. Here's an example. A CFG could have {A, B, C, D, E} as non-terminals, {a, b, c} as terminals, A as the start symbol, and the following productions:

  • A -> aB
  • A -> cC
  • B -> AB
  • B -> epsilon
  • C -> bc
  • D -> ab
  • D -> epsilon
  • E -> BD

A possible derivation with this CFG is: A => aB => aAB => acCB => acbcB => acbc. We started with the start symbol and used the productions to get a string of only terminal symbols. We say that 'acbc' is in the language of this grammar.

Notice that in the last step of our derivation, B seemed to disappear without a trace. This is ok, becauase the grammar specifies that B may be transformed into the empty string. When a non-terminal can be transformed into the empty string, we say that it is nullable. It is possible for a non-terminal to be nullable even if epsilon does not appear directly in one of its productions. In our grammar, E is nullable! Here's proof: E => BD => D => (empty string)

The Problem

Your challenge is to write a function that determines which non-terminals in a CFG are nullable. A grammar will be represented as a list of tuples. Each tuple corresponds to a production in the grammar. The first item in a tuple is the left hand side of the production. The second item is the right hand side. Your function should return the nullable non-terminals as a set. Here is a demo using our example CFG:

>>> grammar = [
      ('A', ['a', 'B']),
      ('A', ['c', 'C']),
      ('B', ['A', 'B']),
      ('B', ['epsilon']),
      ('C', ['b', 'c']),
      ('D', ['a', 'b']),
      ('D', ['epsilon']),
      ('E', ['B', 'D'])
    ]

>>> nullables(grammar)
{'E', 'D', 'B'}

The string 'epsilon' is reserved. All other strings can be used to label non-terminals and terminals.

Algorithms

Similar Kata:

Stats:

CreatedDec 11, 2015
PublishedDec 11, 2015
Warriors Trained117
Total Skips35
Total Code Submissions151
Total Times Completed27
Python Completions27
Total Stars4
% of votes with a positive feedback rating93% of 14
Total "Very Satisfied" Votes12
Total "Somewhat Satisfied" Votes2
Total "Not Satisfied" Votes0
Total Rank Assessments15
Average Assessed Rank
5 kyu
Highest Assessed Rank
4 kyu
Lowest Assessed Rank
7 kyu
Ad
Contributors
  • jgdodson Avatar
Ad