automata.code_handling.py.reader.DocstringRemover

Overview

DocstringRemover is a subclass of NodeTransformer from Python's ast (Abstract Syntax Trees) module. It removes docstrings from Python code.

The class overrides the visit_AsyncFunctionDef, visit_ClassDef, visit_FunctionDef, and visit_Module methods. Each one is called while traversing the provided AST and removes expression nodes (Expr) whose value is a constant, which is how docstrings are normally represented. These methods modify the provided node in place and then continue the visit on the modified node.
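The behavior described above can be sketched with a small stand-in class. This is only an illustration under stated assumptions, not the automata implementation: the name SketchDocstringRemover and the helper _strip are hypothetical, and ast.unparse requires Python 3.9 or later.

```python
import ast


class SketchDocstringRemover(ast.NodeTransformer):
    """Hypothetical stand-in; not the automata implementation."""

    def _strip(self, node):
        # Drop every expression statement whose value is a constant.
        node.body = [
            stmt
            for stmt in node.body
            if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Constant))
        ]
        # A function or class body must not be empty, so fall back to `pass`.
        if not node.body and not isinstance(node, ast.Module):
            node.body = [ast.Pass()]
        # Continue the traversal on the modified node.
        return self.generic_visit(node)

    # Route the four node types named in the text to the same helper.
    visit_Module = _strip
    visit_ClassDef = _strip
    visit_FunctionDef = _strip
    visit_AsyncFunctionDef = _strip


source = '''
"""Module docstring."""

class Greeter:
    """Class docstring."""

    def greet(self):
        """Method docstring."""
        return "hi"
'''

tree = SketchDocstringRemover().visit(ast.parse(source))
stripped = ast.unparse(ast.fix_missing_locations(tree))
print(stripped)
```

Sharing one helper across the four visit_* methods keeps the sketch short; the real class may treat each node type differently.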

It is used mainly in source code handling where docstrings are irrelevant, such as when comparing two pieces of code for an exact match.
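As a rough illustration of the comparison use case, the sketch below treats two sources as equal when their ASTs match after constant expression statements are dropped. The functions strip_constant_exprs and same_code are hypothetical helpers written for this example; strip_constant_exprs only approximates what DocstringRemover is described as doing.

```python
import ast

BODY_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def strip_constant_exprs(tree):
    """Hypothetical stand-in for DocstringRemover().visit(tree)."""
    for node in ast.walk(tree):
        if isinstance(node, BODY_OWNERS):
            kept = [
                stmt
                for stmt in node.body
                if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Constant))
            ]
            # Keep bodies non-empty so the tree stays valid.
            node.body = kept or [ast.Pass()]
    return tree


def same_code(source_a, source_b):
    """Compare two sources while ignoring docstrings and formatting."""
    # ast.dump omits line and column attributes by default,
    # so layout differences do not affect the comparison.
    dump_a = ast.dump(strip_constant_exprs(ast.parse(source_a)))
    dump_b = ast.dump(strip_constant_exprs(ast.parse(source_b)))
    return dump_a == dump_b


documented = 'def f(x):\n    """Add one."""\n    return x + 1\n'
bare = 'def f(x):\n    return x + 1\n'
changed = 'def f(x):\n    return x + 2\n'

print(same_code(documented, bare))
print(same_code(bare, changed))
```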

Example

The following example shows how to remove docstrings with the DocstringRemover class.

import ast

# Suppose you have this piece of code with docstrings, held as a string:
code = '''
def sample_func():
    """
    Sample function docstring
    """
    print("Hello world!")
'''

# You would import DocstringRemover:
from automata.code_handling.py.reader import DocstringRemover

# You would parse the source code into an AST:
tree = ast.parse(code)

# Now, create a node transformer object and modify the AST:
transformer = DocstringRemover()
tree = transformer.visit(tree)

# Optionally, convert the AST back to source to inspect it (Python 3.9+):
print(ast.unparse(tree))

Limitations

One of the main limitations of the DocstringRemover class is that it operates at the level of the AST, which can be hard to work with for users unfamiliar with its structure. Source code must be parsed into an AST before the class can be used. It is also not designed for code whose behavior depends on its docstrings being present.

Note that this class does not check whether the constant inside an Expr node is actually a docstring; it removes all constant expressions. Depending on the program, this may lead to unintended deletions.
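One way to avoid such unintended deletions is to remove a statement only when it really is a docstring, meaning the first statement of a module, class, or function body and a string constant. The sketch below relies on ast.get_docstring for that check. StrictDocstringRemover is a hypothetical variant written for illustration and is not part of automata.

```python
import ast


class StrictDocstringRemover(ast.NodeTransformer):
    """Hypothetical variant that removes only genuine docstrings."""

    def _strip(self, node):
        # ast.get_docstring returns None unless the first statement
        # of the body is a string constant expression.
        if ast.get_docstring(node, clean=False) is not None:
            # Drop the docstring; keep the body non-empty with `pass`.
            node.body = node.body[1:] or [ast.Pass()]
        return self.generic_visit(node)

    visit_Module = visit_ClassDef = visit_FunctionDef = visit_AsyncFunctionDef = _strip


source = '''
def answer():
    """Real docstring."""
    42
    "stray string, not a docstring"
    return 42
'''

tree = StrictDocstringRemover().visit(ast.parse(source))
result = ast.unparse(tree)
print(result)
```

Here the bare 42 and the stray string survive because neither is the first statement of the body at the time of the check; only the leading docstring is dropped.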

Follow-up Questions:

  • Why does the visit_Module method not perform any modifications on the node like the other methods do?

  • Are there any plans to make a DocstringPreserver class which might undo the operations performed by the DocstringRemover class?