automata.code_handling.py.reader.DocstringRemover
Overview
DocstringRemover is a subclass of the NodeTransformer class in the
Abstract Syntax Trees (AST) module in Python. This class provides
functionality for removing docstrings from Python code.
The class implements custom versions of the visit_AsyncFunctionDef,
visit_ClassDef, visit_FunctionDef, and visit_Module methods.
These are used for traversing the provided AST and removing expression
nodes (Expr) that contain constants (which normally capture
docstrings). These methods modify the provided node ‘in place’ and then
continue the visit on the modified node.
It is used predominantly for source code handling where docstrings are not required, such as when comparing code for exact match.
Example
The DocstringRemover class allows you to remove docstrings with a
syntax similar to the following.
# Suppose you have this piece of code with docstrings:
def sample_func():
"""
Sample function docstring
"""
print("Hello world!")
# You would import DocstringRemover:
from automata.code_handling.py.reader import DocstringRemover
# You would convert the source code to AST.
tree = ast.parse(code)
# Now, create a node transformer object and modify the ast.
transformer = DocstringRemover()
tree = transformer.visit(tree)
Limitations
One of the main limitations of the DocstringRemover class is that it
operates on the level of the AST, which can be complex to deal with if
the user is not familiar with the structure. It requires that source
code be transpiled into AST prior to the usage of the class. It’s also
not designed to handle code where docstrings need to be preserved for
functionality.
Note that this class does not check if the constant contained inside the
Expr node is actually a docstring - it removes all constant
expressions. Depending on the program, this may lead to unintended
deletions.
Follow-up Questions:
Why does the visit_Module method not perform any modifications on the node like the other methods do?
Are there any plans to make a DocstringPreserver class which might undo the operations performed by the DocstringRemover class?