aider/scripts
Claudia Pellegrino 31c4198cee
fix: let fewer conflicts occur across requirements
**tl;dr** Introduce a common umbrella constraints file (that works
across requirement extras) to avoid package version conflicts and
to reduce the need for manual pinning in `*.in` files.

Previously, spurious package version conflicts could sometimes occur
across requirements for `pip install -e .`, `pip install -e .[help]`,
`pip install -e .[playwright]`, and so on. Here’s why:

- There are five different requirement configs: the set of base
  requirements (`requirements.txt`) and four additional requirement sets
  (aka "extras"): `dev`, `help`, `browser`, and `playwright`.

- Each of those five configurations spans its own tree of dependencies
  [1]. Those five trees can overlap slightly. (For example, `greenlet`
  is a transitive requirement of both the `help` and the `playwright`
  tree.)

- If you want to resolve those dependency trees so you get concrete
  version numbers, you can’t just look at each tree independently.
  This is because when trees overlap, they sometimes pull in the same
  package for different reasons, and possibly with different version
  constraints.
  For example, the `help` tree pulls in `greenlet`, because `sqlalchemy`
  requires it. At the same time, the `playwright` tree also pulls in
  `greenlet` because it’s needed by the `playwright` package.
  Resolving those constraints strictly individually (i.e., per tree) is
  usually a mistake. It may work for a while, but occasionally you’re
  going to end up with two conflicting versions of the same package.
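
The overlap problem described in the list above can be sketched with a
toy model. Everything in it is an assumption made for illustration: the
version numbers, the constraint ranges, and the "pick the newest allowed
version" rule are hypothetical and do not come from the real packages or
from `pip-compile`.

```python
# Hypothetical toy data: releases available for one shared package.
AVAILABLE = {"greenlet": [1, 2, 3]}

# Hypothetical constraints that each tree places on the shared package.
TREES = {
    "help": {"greenlet": lambda v: v >= 1},        # any release will do
    "playwright": {"greenlet": lambda v: v <= 2},  # needs an older release
}


def resolve_independently(tree):
    """Pick the newest allowed version per package, looking at one tree only."""
    return {
        pkg: max(v for v in AVAILABLE[pkg] if allowed(v))
        for pkg, allowed in TREES[tree].items()
    }


pins = {name: resolve_independently(name) for name in TREES}

# A package is in conflict if two trees pinned it to different versions.
conflicts = sorted(
    pkg
    for pkg in AVAILABLE
    if len({tree_pins[pkg] for tree_pins in pins.values() if pkg in tree_pins}) > 1
)
```

In this sketch each tree is resolved correctly on its own, yet the two
results disagree on `greenlet`, which is the kind of spurious conflict
described above.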

To prevent those version conflicts from occurring, the five
`pip-compile` invocations were designed as a chain.
The process starts at the smallest tree (i.e., the base
`requirements.in` file). It calculates the version numbers for the tree,
remembers the result, and feeds it into the calculation of the next
tree.

The chain design helped mitigate conflicts, but not always.
That is because the chain works like a greedy algorithm:
once a decision has been made for a given package in a tree, that
decision is immediately final, and the compilation process isn’t allowed
to go back and change that decision if it learns new information.
New information comes in all the time, because larger trees usually have
more complex constraints than smaller trees, and the process visits
larger trees later, facing additional constraints as it hops from tree
to tree. Sometimes it bumps into a new constraint against a package for
which it has already made a decision earlier (i.e., it has pinned the
concrete version number in the `requirements*.txt` file of an earlier
tree).
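
The greedy behaviour can be illustrated with a small sketch. It is a toy
model under stated assumptions, not the logic of `pip-compile`: the
version numbers, the constraint ranges, and the order of the chain are
all hypothetical.

```python
# Hypothetical toy data: releases available for one shared package.
AVAILABLE = {"greenlet": [1, 2, 3]}

# Hypothetical chain order: the earlier tree is visited first.
CHAIN = [
    ("help", {"greenlet": lambda v: v >= 1}),
    ("playwright", {"greenlet": lambda v: v <= 2}),
]


def resolve_chain(chain):
    """Visit trees in order; a pin made for an earlier tree is never revised."""
    pinned = {}
    conflicts = []
    for tree, constraints in chain:
        for pkg, allowed in constraints.items():
            if pkg in pinned:
                # The decision is already final; we can only report a clash.
                if not allowed(pinned[pkg]):
                    conflicts.append((tree, pkg, pinned[pkg]))
                continue
            pinned[pkg] = max(v for v in AVAILABLE[pkg] if allowed(v))
    return pinned, conflicts


pinned, conflicts = resolve_chain(CHAIN)
```

Here the first tree pins the newest release, and the second tree then
finds that pin unacceptable, even though version 2 would have satisfied
both.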

That’s why the greedy chain-based method, even though it mostly works
just fine, can never avoid spurious conflicts entirely.
To help mitigate those conflicts, pinning entries were manually added to
`requirements.in` files on a case-by-case basis as conflicts occurred.
Those entries can make the file difficult to reason about, and they must
be kept in sync manually as packages get upgraded. That’s a maintenance
burden.

Turning the chain into an umbrella may help. Instead of hopping from
tree to tree, look at the entire forest at once, calculate all the
concrete version numbers for all trees in one fell swoop, and save the
results in a common, all-encompassing umbrella file.
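
As a rough sketch of the "whole forest at once" idea, the toy resolver
below collects the constraints of all trees before choosing any version.
The data and the selection rule are hypothetical; a real resolver also
handles transitive dependencies and backtracking, which this sketch
leaves out.

```python
# Hypothetical toy data: releases available for each package.
AVAILABLE = {"greenlet": [1, 2, 3], "sqlalchemy": [1, 2], "playwright": [1]}

# Hypothetical per-tree constraints; `greenlet` is shared by both trees.
TREES = {
    "help": {"sqlalchemy": lambda v: True, "greenlet": lambda v: v >= 1},
    "playwright": {"playwright": lambda v: True, "greenlet": lambda v: v <= 2},
}


def resolve_umbrella(trees):
    """Gather every tree's constraints first, then pick one version per package."""
    checks = {}
    for constraints in trees.values():
        for pkg, allowed in constraints.items():
            checks.setdefault(pkg, []).append(allowed)
    return {
        pkg: max(v for v in AVAILABLE[pkg] if all(ok(v) for ok in oks))
        for pkg, oks in checks.items()
    }


umbrella = resolve_umbrella(TREES)

# Every tree's own constraints hold for the shared result.
satisfied = all(
    allowed(umbrella[pkg])
    for constraints in TREES.values()
    for pkg, allowed in constraints.items()
)
```

Because no version is chosen until all constraints are known, the
resolver settles on a `greenlet` release that both trees accept.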

Armed with the umbrella file (called `common-constraints.txt`), visit
each tree (in any order – it no longer matters) and feed it just the
umbrella file as a constraint, along with its own `*.in` file as the
input.
Chaining is no longer necessary, because the umbrella file already
contains all version constraints for all the packages one tree could
possibly need, and then some.
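
The two steps could be driven roughly as follows. This is a sketch that
only builds the command lines and does not run them. The directory
layout and the `requirements-<extra>.in` file names are assumptions;
only `common-constraints.txt` and the names of the extras come from the
text above. The `--constraint` and `--output-file` options are assumed
to be available in the installed version of `pip-compile`.

```python
from pathlib import PurePosixPath

# Assumed layout; only the umbrella file name appears in the commit message.
REQ_DIR = PurePosixPath("requirements")
UMBRELLA = REQ_DIR / "common-constraints.txt"
EXTRAS = ["dev", "help", "browser", "playwright"]


def input_files():
    """Base requirements plus one hypothetical `*.in` file per extra."""
    base = [REQ_DIR / "requirements.in"]
    return base + [REQ_DIR / f"requirements-{extra}.in" for extra in EXTRAS]


def umbrella_command():
    """Step 1: compile all `*.in` files together into the umbrella file."""
    return ["pip-compile", f"--output-file={UMBRELLA}", *map(str, input_files())]


def tree_command(in_file):
    """Step 2: compile one tree, constrained only by the umbrella file."""
    out_file = in_file.with_suffix(".txt")
    return [
        "pip-compile",
        f"--constraint={UMBRELLA}",
        f"--output-file={out_file}",
        str(in_file),
    ]


commands = [umbrella_command()] + [tree_command(f) for f in input_files()]
```

The per-tree commands are independent of each other, so they could be
passed to `subprocess.run` in any order once the umbrella command has
finished.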

This technique should reduce manual pinning inside `*.in` files and
ensure that computed version numbers no longer contradict each other
across trees.

[1]: From a graph theory point of view, I’m being blatantly incorrect
here; those dependency graphs are usually not trees, because they have
cycles. I’m still going to call them "trees" for the sake of this
discussion, because the word "tree" feels less abstract and intimidating
and hopefully more relatable.
2025-03-02 02:50:03 +01:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | copy | 2024-12-13 12:55:33 -08:00 |
| `blame.py` | copy | 2025-02-26 09:05:16 -08:00 |
| `Dockerfile.jekyll` | feat: add verbose and trace flags for Jekyll debugging | 2024-11-26 05:59:52 -08:00 |
| `history_prompts.py` | include author in history updates | 2025-01-09 12:03:05 -08:00 |
| `issues.py` | style: Apply linter formatting to issues.py script | 2025-02-10 11:37:35 -08:00 |
| `jekyll_build.sh` | renamed dockerfile | 2024-05-15 11:29:34 -07:00 |
| `jekyll_run.sh` | update jekyll to aider/website/ | 2024-07-05 13:30:18 -03:00 |
| `my_models.py` | copy | 2025-01-10 15:53:17 -08:00 |
| `pip-compile.sh` | fix: let fewer conflicts occur across requirements | 2025-03-02 02:50:03 +01:00 |
| `update-blame.sh` | feat: Allow version arg for blame script, default to v0.1.0 | 2024-12-13 11:49:15 -08:00 |
| `update-docs.sh` | copy | 2025-01-13 14:28:34 -08:00 |
| `update-history.py` | feat: include author info in git log output for history updates | 2025-01-09 11:57:23 -08:00 |
| `versionbump.py` | cleanup cog of toml | 2024-12-16 12:08:08 -08:00 |
| `yank-old-versions.py` | style: Run linter on Python script | 2024-09-23 12:27:13 -07:00 |