Tuesday, 16 March 2021

How to isolate test cases for an ANTLR parser in Python?

I want to test my ANTLR parser with a set of test cases, using Python and the ANTLR v4 runtime. The problem I ran into is test isolation. Each test case should parse a different file, but the parser (or the stream reader; I can't tell which) seems to accumulate the files read so far. In test case n, the parse result is the result of case n-1 followed by the result of case n, and so on back to the first test case.

Here is my testing code:

import unittest
import os
from antlr4 import *
from citerefparser.antlr.NumPredListener import NumPredListener
from citerefparser.antlr.NumPredLexer import NumPredLexer
from citerefparser.antlr.NumPredParser import NumPredParser
from citerefparser.numpred import NumPredListListener

class NumPredTstBase(unittest.TestCase):

    fname = None
    
    def setUp(self):
        self.parsed = []
        input_stream = FileStream(os.path.join("tests", "samples", self.fname), encoding="utf-8")
        lexer = NumPredLexer(input_stream)
        stream = CommonTokenStream(lexer)
        parser = NumPredParser(stream)
        tree = parser.numpreds()    # start rule
        printer = NumPredListListener()
        walker = ParseTreeWalker()
        walker.walk(printer, tree)
        self.parsed = printer.predecessors # get result stored in the listener

    def tearDown(self):
        self.parsed = []
        
class TestCite1(NumPredTstBase):

    fname = "cite1.txt"

    def test_parsed(self):
        self.assertEqual(
            [p.predecessor.token for p in self.parsed],
            ["Cant", "cfr"])

class TestCite2(NumPredTstBase):

    fname = "cite2.txt"

    def test_parsed(self):
        self.assertEqual(
            [p.predecessor.token for p in self.parsed],
            ['q', 'a', 'q', 'a', 'ad', 'art', 'In', 'Sent', 'art', 'sol', 'D', 'q', 'art', 'sol', 'et'])

To me, it looks like everything is newly constructed during setUp. I use python setup.py test as the test runner and get the following result:

test_parsed (tests.citerefparser.numpred.TestCite1) ... ok
test_parsed (tests.citerefparser.numpred.TestCite2) ... FAIL
...
AssertionError: Lists differ: ['Cant', 'cfr', 'q', 'a', 'q', 'a', 'ad', '[58 chars]'et'] != ['q', 'a', 'q', 'a', 'ad', 'art', 'In', 'Se[43 chars]'et']

First differing element 0:
'Cant'
'q'
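
For illustration, the same symptom can be reproduced without ANTLR when a listener keeps its results in a class-level list. This is only an assumption about NumPredListListener, whose source is not shown here; SharedListListener, FreshListListener, enter_item and run_listener are hypothetical names invented for this sketch. Note that setting self.parsed = [] in tearDown only rebinds the test's attribute and would not empty such a shared list.

```python
class SharedListListener:
    # Hypothetical listener: the list is a class attribute, so every
    # instance appends to the very same list object.
    predecessors = []

    def enter_item(self, token):
        self.predecessors.append(token)


class FreshListListener:
    # Hypothetical variant: the list is created per instance in __init__,
    # so each run starts from an empty list.
    def __init__(self):
        self.predecessors = []

    def enter_item(self, token):
        self.predecessors.append(token)


def run_listener(listener_cls, tokens):
    """Stand-in for one setUp: build a new listener and feed it tokens."""
    listener = listener_cls()
    for token in tokens:
        listener.enter_item(token)
    # Return a copy so later runs cannot change what we inspect here.
    return list(listener.predecessors)


# Two independent "test cases" with a new listener instance each time.
shared_first = run_listener(SharedListListener, ["Cant", "cfr"])
shared_second = run_listener(SharedListListener, ["q", "a"])

fresh_first = run_listener(FreshListListener, ["Cant", "cfr"])
fresh_second = run_listener(FreshListListener, ["q", "a"])

print(shared_second)  # results of the first run leak into the second
print(fresh_second)   # only the tokens of the second run
```

If the real listener is written like the first class, moving the list into __init__ would be one way to isolate the runs; if it is not, the accumulation has to come from somewhere else.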

How can I isolate the parser runs when running unit tests?

Kind regards
