pythonCatchConflictsHook: prevent exponential worst-case

The hook performs a depth-first search on the graph defined by
propagatedBuildInputs. This traverses every path through the graph,
skipping only cycles. In the worst case, with a highly connected graph,
the search takes exponential time. In practice, this means that with
long dependency chains and multiple packages depending on the same
package, the hook can take several hours to run.

Avoid this problem by keeping track of already visited store paths and
only visiting each store path once. This makes the search complete in
time linear in the size of the graph.
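
As a rough illustration of the difference, the sketch below counts how
many nodes a depth-first search expands with and without a visited set.
The graph shape and the names make_diamond_chain and count_visits are
made up for this example and are not part of the hook.

```python
from typing import Dict, List, Optional, Set


def make_diamond_chain(layers: int) -> Dict[str, List[str]]:
    """Hypothetical graph: a root, then `layers` layers of two nodes each,
    where every node depends on both nodes of the next layer."""
    graph: Dict[str, List[str]] = {"root": ["a0", "b0"]}
    for i in range(layers):
        is_last = i == layers - 1
        children = [] if is_last else [f"a{i + 1}", f"b{i + 1}"]
        graph[f"a{i}"] = list(children)
        graph[f"b{i}"] = list(children)
    return graph


def count_visits(
    graph: Dict[str, List[str]], node: str, visited: Optional[Set[str]] = None
) -> int:
    """Count the nodes expanded by a depth-first search from `node`.

    With visited=None every path is walked; with a set, each node is
    expanded at most once."""
    if visited is not None:
        if node in visited:
            return 0
        visited.add(node)
    return 1 + sum(count_visits(graph, dep, visited) for dep in graph[node])


graph = make_diamond_chain(10)
print(count_visits(graph, "root"))         # walks every path: 2047
print(count_visits(graph, "root", set()))  # each node once: 21
```

Each extra layer doubles the first number but adds only two to the
second.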

The visible effect of this change is that, if a conflict is found, only
one dependency chain that leads to the conflicting package is printed,
rather than all the possible dependency chains.
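
The effect on the reported chains can be sketched as follows. GRAPH and
collect_chains are hypothetical stand-ins for illustration only; the
hook's real bookkeeping lives in find_packages and is not reproduced
here.

```python
from typing import Dict, List, Optional, Set

# Hypothetical graph: "app" reaches "shared" through "left" and "right".
GRAPH: Dict[str, List[str]] = {
    "app": ["left", "right"],
    "left": ["shared"],
    "right": ["shared"],
    "shared": [],
}


def collect_chains(
    node: str,
    parents: List[str],
    chains: Dict[str, List[List[str]]],
    visited: Optional[Set[str]],
) -> None:
    """Record the dependency chain leading to each node that is expanded."""
    if visited is not None:
        if node in visited:
            return  # already expanded: later chains are not recorded
        visited.add(node)
    chain = parents + [node]
    chains.setdefault(node, []).append(chain)
    for dep in GRAPH[node]:
        collect_chains(dep, chain, chains, visited)


all_chains: Dict[str, List[List[str]]] = {}
collect_chains("app", [], all_chains, None)

first_chains: Dict[str, List[List[str]]] = {}
collect_chains("app", [], first_chains, set())

print(all_chains["shared"])    # both chains, via "left" and via "right"
print(first_chains["shared"])  # only the chain found first, via "left"
```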

Changed files
+8 -1
pkgs/development/interpreters/python/catch_conflicts/catch_conflicts.py
···
 import collections
 import sys
 import os
-from typing import Dict, List, Tuple
+from typing import Dict, List, Set, Tuple
 do_abort: bool = False
 packages: Dict[str, Dict[str, List[Dict[str, List[str]]]]] = collections.defaultdict(list)
+found_paths: Set[Path] = set()
 out_path: Path = Path(os.getenv("out"))
 version: Tuple[int, int] = sys.version_info
 site_packages_path: str = f'lib/python{version[0]}.{version[1]}/site-packages'
···
 def find_packages(store_path: Path, site_packages_path: str, parents: List[str]) -> None:
     site_packages: Path = (store_path / site_packages_path)
     propagated_build_inputs: Path = (store_path / "nix-support/propagated-build-inputs")
+
+    # only visit each path once, to avoid exponential complexity with highly
+    # connected dependency graphs
+    if store_path in found_paths:
+        return
+    found_paths.add(store_path)
     # add the current package to the list
     if site_packages.exists():