mirror of
https://github.com/Aider-AI/aider.git
synced 2025-06-01 18:25:00 +00:00
parent afc7cc8f21
commit 5e82455c85
1 changed file with 13 additions and 10 deletions
@@ -3,16 +3,14 @@
 
 
 
-Aider is an open source command line chat tool that lets you ask GPT to edit
-code in your local git repos.
-You can use aider to ask GPT to add features, write tests or make other changes and
-improvements to your code.
+Aider is an open source command line chat tool that lets you work with GPT to edit
+code in your local git repo.
+You can use aider to have GPT add features, write tests or make other changes to your code.
 
 The ability for GPT to reliably edit local source files is
-crucial for this functionality.
-Much of this depends on the "edit format", which is an important component of the
-system prompt.
-The edit format specifies how GPT should structure code edits in its
+crucial for this functionality, and depends mainly on the "edit format".
+The edit format is an important component of the system prompt,
+which specifies how GPT should structure code edits in its
 responses.
 
 Aider currently uses simple text based editing formats, but
@@ -242,12 +240,17 @@ The benchmark results have me fairly convinced that the new
 `gpt-3.5-turbo-0613` and `gpt-3.5-16k-0613` models
 are a bit worse at code editing than
 the older `gpt-3.5-turbo-0301` model.
-This is especially visible in the "first coding attempt"
+
+This is visible in the "first coding attempt"
 portion of each result, before GPT gets a second chance to edit the code.
 Look at the horizontal white line in the middle of the first three blue bars.
-
 Performance with the `whole` edit format was 46% for the
 February model and only 39% for the June models.
+
+But also note how much the solid green `diff` bars
+degrade between the February and June GPT-3.5 models.
+They drop from 30% down to about 19%.
+
 I saw other signs of this degraded performance
 in earlier versions of the
 benchmark as well.