fix: Improve code-in-json blog post

This commit is contained in:
Paul Gauthier 2024-08-15 10:21:42 -07:00 committed by Paul Gauthier (aider)
parent 8f0cc731fd
commit 31e7a75f0e

View file

@ -134,9 +134,9 @@ document.addEventListener('DOMContentLoaded', function () {
## Abstract ## Abstract
The newest LLMs have support for returning properly formatted JSON responses, The newest LLMs have support for returning properly formatted JSON responses,
making it easy for client applications to parse complex responses. making it easier for clients to parse complex responses.
This makes it tempting for AI coding applications to This makes it tempting for AI coding applications to
use tool function calls or other structured reply formats to use JSON replies to
receive code from LLMs. receive code from LLMs.
Unfortunately, Unfortunately,
LLMs write worse code when asked to wrap it in JSON, harming their ability LLMs write worse code when asked to wrap it in JSON, harming their ability
@ -145,21 +145,19 @@ On a variant of the aider code editing benchmark,
JSON-wrapping code JSON-wrapping code
often significantly harms coding often significantly harms coding
performance performance
compared to returning code as plain (markdown) text. compared to returning code as plain text.
This holds true across many top coding LLMs, This holds true across many top coding LLMs,
and even OpenAI's newest gpt-4o-2024-08-06 with "strict" JSON support including OpenAI's new gpt-4o-2024-08-06
suffers from this code-in-JSON handicap. which has strong JSON support.
## Introduction ## Introduction
A lot of people wonder why aider doesn't tell LLMs to A lot of people wonder why aider doesn't use LLM tools for code editing.
use tools or function calls to
specify code edits.
Instead, aider asks for code edits in plain text, like this: Instead, aider asks for code edits in plain text, like this:
```` ````
greeting.py greeting.py
```python ```
<<<<<<< SEARCH <<<<<<< SEARCH
def greeting(): def greeting():
print("Hello") print("Hello")
@ -173,7 +171,7 @@ def greeting():
People expect that it would be easier and more reliable to use tool calls, People expect that it would be easier and more reliable to use tool calls,
which would return a structured JSON response: which would return a structured JSON response:
``` ```json
{ {
"filename": "greeting.py", "filename": "greeting.py",
"search": "def greeting():\n print(\"Hello\")\n" "search": "def greeting():\n print(\"Hello\")\n"
@ -188,8 +186,8 @@ For example, OpenAI recently announced the ability to
[strictly enforce that JSON responses will be syntactically correct [strictly enforce that JSON responses will be syntactically correct
and conform to a specified schema](https://openai.com/index/introducing-structured-outputs-in-the-api/). and conform to a specified schema](https://openai.com/index/introducing-structured-outputs-in-the-api/).
But producing valid (schema compliant) JSON is not sufficient for this use case. But producing valid (schema compliant) JSON is not sufficient for working with AI generated code.
The JSON also has to contain valid, high quality code. The code inside the JSON has to be valid and high quality too.
Unfortunately, Unfortunately,
LLMs write worse code when they're asked to LLMs write worse code when they're asked to
wrap it in JSON. wrap it in JSON.
@ -202,7 +200,7 @@ newlines `\n`
mixed into the code. mixed into the code.
Imagine the additional Imagine the additional
complexity complexity
if the code itself contained JSON or other quoted strings, if the code itself contained quoted strings
with their with their
own escape sequences. own escape sequences.
@ -264,7 +262,7 @@ two parameters, as shown below.
For maximum simplicity, the LLM didn't have to specify the filename, For maximum simplicity, the LLM didn't have to specify the filename,
since the benchmark operates on one source file at a time. since the benchmark operates on one source file at a time.
``` ```json
{ {
"explanation": "Here is the program you asked for which prints \"Hello\"", "explanation": "Here is the program you asked for which prints \"Hello\"",
"content": "def greeting():\n print(\"Hello\")\n" "content": "def greeting():\n print(\"Hello\")\n"