From 47810310c903d2cdf756b4a1fa96e6dc448535dc Mon Sep 17 00:00:00 2001 From: Paul Gauthier Date: Wed, 22 May 2024 11:55:10 -0700 Subject: [PATCH] Added linting article --- _posts/2024-05-22-linting.md | 141 +++++++++++++++++++++++++++++++++++ 1 file changed, 141 insertions(+) create mode 100644 _posts/2024-05-22-linting.md diff --git a/_posts/2024-05-22-linting.md b/_posts/2024-05-22-linting.md new file mode 100644 index 000000000..b2f0954c6 --- /dev/null +++ b/_posts/2024-05-22-linting.md @@ -0,0 +1,141 @@ +--- +title: Linting code for LLMs with tree-sitter +excerpt: Aider now lints code after every LLM edit and automatically fixes errors, using tree-sitter and AST-aware code context. +--- + +# Linting code for LLMs with tree-sitter + +Aider now lints your code after every LLM edit, and offers to automatically fix +any linting errors. +You can also use aider's lint-and-fix functionality on your source files any time +you like, to speedily resolve issues with code written by humans. + +Aider shows linting errors to the LLM in a novel format, +using the abstract syntax tree (AST) to display relevant code context for each +error. +This increases the ability of the LLM to understand the problem and +make the correct changes to resolve it. + +Aider ships with basic linters built with tree-sitter that support +[most popular programming languages](https://github.com/paul-gauthier/grep-ast/blob/main/grep_ast/parsers.py). +These built in linters will detect syntax errors and other fatal problems with the code. + +You can also configure aider to use your preferred linters. +This allows aider to check for a larger class of problems, keep the code style +aligned with the rest of your team, etc. + +## Linting and fixing your code + +Aider now lints each source file after it applies the edits +suggested by an LLM. +If problems are found, aider will ask if you'd like it to +attempt to fix the errors. +If so, aider will send the LLM a report of the lint errors +and request changes to fix them. This process may iterate a few times +as the LLM works to fully resolve all the issues. + +You can also lint and fix files any time, on demand from within the aider chat or via the +command line: + +- The in-chat `/lint` command will lint and fix all the files which have +been added to the chat by default. Or you can name any files +in your git repo as arguments. +- From the command line, you can run `aider --lint` to lint and fix +all the dirty files in the repo. +Or you can specify specific filenames on the command line. + + +## An LLM-friendly lint report + +Most linting tools produce terse and cryptic output, +which is one reason many engineers appreciate IDEs that highlight +linting errors. +LLM's don't have the luxury of using an IDE, so aider sends +the linting errors in an LLM friendly format. + +Here's an example of raw output of the `flake8` python linter: + +``` +app.py:23:36: F821 undefined name 'num' +app.py:41:16: F541 f-string is missing placeholders +``` + +This sort of output depends on the user to reference line numbers to find and fix +each reported error. +LLMs are quite bad at working with source code line numbers, often +making off-by-one errors and other mistakes even when provided with +a fully numbered code listing. + +Aider augments the raw linter by +displaying and +highlighting the lines that have errors within their +containing functions, methods, classes. +To do this, aider uses tree-sitter to obtain the code's AST and analyzes it +in light of the linting errors. +LLMs are more effective at editing code that's provided +with context like this. + +``` +app.py:23:36: F821 undefined name 'num' +app.py:41:16: F541 f-string is missing placeholders + +app.py: +...⋮... + 6│class LongNum: + 7│ def __init__(self, num): + 8│ """ + 9│ Initialize the number. + 10│ """ +...⋮... + 19│ def __str__(self): + 20│ """ + 21│ Render the number as a string. + 22│ """ + 23█ return str(num) + 24│ + 25│ + 26│@app.route('/subtract//') +...⋮... + 38│@app.route('/divide//') + 39│def divide(x, y): + 40│ if y == 0: + 41█ return f"Error: Cannot divide by zero" + 42│ else: + 43│ result = x / y + 44│ return str(result) + 45│ +...⋮... +``` + +## Basic linters for most popular languages + +Aider comes batteries-included with built in linters for +[most popular programming languages](https://github.com/paul-gauthier/grep-ast/blob/main/grep_ast/parsers.py). +This provides wide support for linting without requiring +users to manually install a linter and configure it to work with aider. + +Aider's built in language-agnostic linter uses tree-sitter to parse +the AST of each file. +When tree-sitter encounters a syntax error or other fatal issue +parsing a source file, it inserts an AST node with type `ERROR`. +Aider simply uses these `ERROR` nodes to identify all the lines +with syntax or other types of fatal error, and displays +them in the LLM friendly format described above. + +## Configuring your preferred linters + +You can optionally configure aider to use +your preferred linters with the `--lint-cmd` switch. + +``` +# To lint javascript with jslint +aider --lint-cmd javascript:jslint + +# To lint python with flake8 using some specific args: +aider --lint-cmd "python:flake8 --select=E9,F821,F823..." +``` + +You can provide multiple `--lint-cmd` switches +to set linters for various languages. +You can also durably set linters in your `.aider.conf.yml` file. +