Graph Generation
All algorithms that are concerned with generating the import graph live in this module. On a high level, the following takes place:
- Given an absolute path to a directory, recursively collect all python files in it
- Compile the AST of each file
- Collect all import statements from the ASTs, retain those that successfully match against any of the parsed files (a module)
- From the context of the import-node within its AST, categorize each import statement
- Distribute the imports onto the modules they took place in, merge their categories if there were multiple
- From these module -> modules mappings, generate a di-graph where the nodes are all parsed modules, and the edges are the imports that take place between them
- A bidirectional edge implies an import cycle, and each edge retains the import categories as metadata to help interpret the severity of the cycle
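The first steps of this pipeline can be sketched with the standard library alone. This is a minimal illustration rather than the module's actual implementation: `collect_imports` is a hypothetical helper, it ignores relative imports and import categories, and it only tries the part between `from` and `import` for `from ... import ...` statements.

```python
import ast
import tempfile
from pathlib import Path


def collect_imports(package_dir: Path) -> dict[str, set[str]]:
    """Map each module under package_dir to the local modules it imports."""
    package_dir = package_dir.resolve()
    trees: dict[str, ast.Module] = {}
    for file in sorted(package_dir.rglob("*.py")):
        parts = list(file.relative_to(package_dir.parent).with_suffix("").parts)
        if parts[-1] == "__init__":
            parts.pop()  # a package is named after its directory
        source = file.read_text(encoding="utf-8")
        trees[".".join(parts)] = ast.parse(source, filename=str(file))

    imports: dict[str, set[str]] = {name: set() for name in trees}
    for name, tree in trees.items():
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                candidates = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                candidates = [node.module]
            else:
                continue
            # Retain only the imports that match one of the parsed files.
            imports[name].update(c for c in candidates if c in trees and c != name)
    return imports


# Build a tiny throwaway package to run the sketch against.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "a.py").write_text("import pkg.b\n")
    (pkg / "b.py").write_text("from pkg.a import x\nimport os\n")
    result = collect_imports(pkg)

print(result)
```

Here `pkg.a` and `pkg.b` import each other, which is the kind of mapping the later graph step turns into a bidirectional edge, while `import os` is dropped because it matches none of the parsed files.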
Attributes
Classes
Module
Represent a python module and the modules it imports.
Every module has links to both its parent and children modules, as well as a collection of modules that it imports in some way or another.
Parent-module imports are a bit of a special case, as their name doesn't actually exist in the child module's namespace. But since any import like from foo.bar import baz will, before baz is resolved, import foo.bar (which in turn needs an import of foo before that), they are registered first thing anyway.
Treating their reliance chain as imports models their relationship accurately for the most part, but does create the impression of cycles if a parent imports names from a child, which is a popular pattern for simplifying/exposing a public API. As a consequence, these child-parent imports should be treated differently during analysis.
See Also
discuss.python.org: Partial Modules for a technical explanation of how parent and child modules interact during imports.
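The chain of implicit parent imports can be listed from the dotted name alone. The helper below, `import_order`, is a hypothetical name used for illustration; it only mirrors the order described above and is not taken from the module's source.

```python
def import_order(dotted_name: str) -> list[str]:
    """List the modules that get imported, in order, when dotted_name is imported."""
    parts = dotted_name.split(".")
    # Every prefix of the dotted name is a parent that is imported first.
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


# `from foo.bar import baz` first needs foo, then foo.bar.
print(import_order("foo.bar"))
```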
Source code in src/byecycle/graph.py
Functions
__hash__()
Setting the hash of a module to be equal to its name's hash.
That way, module objects can be found in hash maps by searching for their name as a string. This also means that you can't mix strings and modules in said hash maps without getting very confusing bugs. But, you know, why would you ever want to do that anyway, right?
__eq__(other)
Modules of the same name should always be equal to each other.
A module comparing equal with its name enables searching for them in hash maps by their name.
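The effect of pairing these two methods can be shown with a stand-in class. `NamedModule` is a hypothetical, stripped-down substitute for Module that keeps only the name; the real class may implement the comparison differently.

```python
class NamedModule:
    """Stand-in for Module: hashes and compares like its name."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __hash__(self) -> int:
        return hash(self.name)

    def __eq__(self, other: object) -> bool:
        if isinstance(other, str):
            return self.name == other
        if isinstance(other, NamedModule):
            return self.name == other.name
        return NotImplemented


registry = {NamedModule("foo.bar"): "payload"}
found = registry["foo.bar"]  # lookup by plain string works

# The flip side: a string and a module of the same name collapse into one entry.
mixed = {"foo.bar", NamedModule("foo.bar")}
print(found, len(mixed))
```

The second half shows the confusing behaviour mentioned above: the set ends up with a single element, because the string and the module object are treated as the same key.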
add_import(import_, tag, root)
Register a local import statement and its tag to this Module.
Parameters:
- import_ (ImportStatement) – Equivalent to an import statement. If the name is set, it may or may not refer to a module.
- tag (ImportKind) – Describes the kind of import, e.g. import of module foo has the tags typing, meaning it is the import of a parent module within an if typing.TYPE_CHECKING block, which will be important information once we want to visualize the severity of certain cyclic imports.
- root (Module) – The root module, which is used to find the other module. It must point to the root module of the fully parsed source tree in order to produce correct results.
Warning
Given that an import statement of the form from foo.bar import baz can't be reasonably resolved in a static approach, a little guessing has to take place. The current idea is to try and import the full normalized statement foo.bar.baz as a module. If that fails (read, no python file was parsed which corresponds to the module name), only the part between from and import, i.e. foo.bar, is attempted, assuming that baz is an attribute within foo.bar.
As long as the import statements would not raise an ImportError, this should always produce correct results.
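The guessing order described in this warning can be sketched as a lookup against the set of parsed module names. `resolve_from_import` and the `known` set are assumptions made for illustration; the actual method resolves against the Module tree reachable from root.

```python
from typing import Optional


def resolve_from_import(module: str, name: str, known: set[str]) -> Optional[str]:
    """Guess which parsed module `from <module> import <name>` refers to."""
    full = f"{module}.{name}"
    if full in known:
        return full  # the imported name is itself a module
    if module in known:
        return module  # the name is assumed to be an attribute of the module
    return None  # not a local import


known = {"foo", "foo.bar", "foo.bar.baz"}
print(resolve_from_import("foo.bar", "baz", known))
print(resolve_from_import("foo.bar", "qux", known))
print(resolve_from_import("os", "path", known))
```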
parse(source_path) classmethod
Walks down a source tree and registers each python file as a Module.
After parsing all files recursively, all import statements in each file that import a module that resolves to any of the files that we just parsed are listed on their respective Module. Additionally, some metadata from the context of the import is retained. Specifically, the import-kind definitions are:
- vanilla: A regular import at the top-level of the module
- typing: Only executed during static type analysis
- dynamic: Scoped in a function, which might not be executed on module load
- conditional: In an if-block at the top-level of the module, so only maybe executed
- parent: Due to the module in question being a parent of the current module (in python, parent modules are imported before their children)
If a module is imported multiple times in different ways, all their metadata is aggregated on the same entry.
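Aggregating the kinds of repeated imports amounts to collecting them in a set per imported module. The sketch below assumes the kinds are plain strings and uses a hypothetical `merge_kinds` helper; the real code stores ImportKind values on the Module.

```python
from collections import defaultdict


def merge_kinds(found: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Aggregate (imported module, kind) pairs into one entry per module."""
    merged: dict[str, set[str]] = defaultdict(set)
    for target, kind in found:
        merged[target].add(kind)
    return dict(merged)


found = [("foo.bar", "parent"), ("foo.bar", "typing"), ("foo.baz", "vanilla")]
merged = merge_kinds(found)
print(merged)
```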
Parameters:
- source_path (Path) – Location of the source tree of the package that should be walked. The .name attribute of this parameter is assumed to be the name of the package in order to identify which imports are local imports.
Returns:
- Module – The top level, aka "root", module.
ImportVisitor
Bases: NodeVisitor
Collect import statements in a module and assign them an ImportKind category.
Relies on a non-standard _parent field being present in each node which contains a link to its parent node, and _module to resolve relative imports.
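Neither field exists on standard ast nodes, so they have to be attached before visiting. The sketch below shows one way to add the _parent links and to resolve a relative import. `attach_parents` and `resolve_relative` are hypothetical helpers, and passing the containing package as a string is an assumption that stands in for the _module field.

```python
import ast


def attach_parents(tree: ast.AST) -> None:
    """Give every node a _parent attribute pointing at its enclosing node."""
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            child._parent = parent


def resolve_relative(node: ast.ImportFrom, package: str) -> str:
    """Turn a possibly relative `from` target into an absolute dotted name.

    `package` is the dotted name of the package containing the importing module.
    """
    if node.level == 0:
        return node.module or ""
    parts = package.split(".")
    if node.level > len(parts):
        raise ValueError("relative import goes beyond the top-level package")
    # One leading dot means "this package"; each extra dot climbs one level.
    base = parts[: len(parts) - (node.level - 1)]
    if node.module:
        base.append(node.module)
    return ".".join(base)


tree = ast.parse("from ..core import y\n")
attach_parents(tree)
node = tree.body[0]
resolved = resolve_relative(node, "pkg.sub")
print(node._parent is tree, resolved)
```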
Functions
find_import_kind(node) classmethod
Find the ImportKind of an import statement.
A single import statement can only have a single ImportKind. This function uses information in the ast.AST-node to identify if it was dynamic, typing, conditional, or vanilla.
Parameters:
- node (Import | ImportFrom) – Node in which the import statement takes place.
Returns:
- ImportKind – The identified ImportKind, by default vanilla.
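A classification along these lines can be sketched by walking up the _parent links of an import node. The `classify` function, the string return values, and the precedence (function scope first, then a TYPE_CHECKING test, then any other if-block) are assumptions for illustration, not the library's actual rules.

```python
import ast

SOURCE = """
import typing
import json
if typing.TYPE_CHECKING:
    import decimal
if json:
    import fractions
def load():
    import csv
"""


def attach_parents(tree: ast.AST) -> None:
    """Give every node a _parent attribute pointing at its enclosing node."""
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            child._parent = parent


def classify(node: ast.AST) -> str:
    """Guess the import kind from the nodes that enclose the import."""
    ancestors = []
    current = getattr(node, "_parent", None)
    while current is not None:
        ancestors.append(current)
        current = getattr(current, "_parent", None)
    if any(isinstance(a, (ast.FunctionDef, ast.AsyncFunctionDef)) for a in ancestors):
        return "dynamic"
    # Note: this does not distinguish the if-branch from the else-branch.
    if_tests = [ast.unparse(a.test) for a in ancestors if isinstance(a, ast.If)]
    if any("TYPE_CHECKING" in test for test in if_tests):
        return "typing"
    if if_tests:
        return "conditional"
    return "vanilla"


tree = ast.parse(SOURCE)
attach_parents(tree)
kinds = {
    alias.name: classify(node)
    for node in ast.walk(tree)
    if isinstance(node, (ast.Import, ast.ImportFrom))
    for alias in node.names
}
print(kinds)
```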
Functions
build_digraph(root, **kwargs)
Turns module-import-mappings into a smart graph object.
Parameters:
- root (Module) – Gets walked to produce all Module objects that know what other local modules they import, and how.
- **kwargs (Unpack[SeverityMap], default: {}) – Override the default settings for the severity of the "how" when imports in local modules might cause import cycles.
Returns:
- DiGraph – A graph object which supports the kind of operations that you'd want when working with graph-like structures. Every edge in the graph has a tags and a cycle entry (accessible with __getitem__()) holding metadata that can help interpret how much of an issue a particular cyclic import might be.
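The shape of the result can be illustrated without a graph library by keeping edges in a plain dict. This sketch is an assumption-laden simplification: `build_edges` is a hypothetical helper, it does not return a DiGraph, it reduces the cycle entry to a boolean that is true for bidirectional edges, and it ignores the severity settings entirely.

```python
def build_edges(
    imports: dict[str, dict[str, set[str]]],
) -> dict[tuple[str, str], dict]:
    """Turn module -> {imported module: kinds} mappings into annotated edges."""
    edges: dict[tuple[str, str], dict] = {}
    for source, targets in imports.items():
        for target, tags in targets.items():
            edges[(source, target)] = {"tags": set(tags), "cycle": False}
    for (source, target), data in edges.items():
        # An edge in the opposite direction means the two modules import each other.
        data["cycle"] = (target, source) in edges
    return edges


imports = {
    "pkg.a": {"pkg.b": {"vanilla"}},
    "pkg.b": {"pkg.a": {"typing"}, "pkg.c": {"dynamic"}},
    "pkg.c": {},
}
edges = build_edges(imports)
print(edges[("pkg.b", "pkg.a")])
```

In this example the cycle between pkg.a and pkg.b is only closed by a typing import, which is the kind of detail the tags are meant to surface when judging how serious a cycle is.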