Mirror of https://github.com/Aider-AI/aider.git
Commit 8c73a7be35 (parent 6610a8310c)
1 changed file with 2 additions and 2 deletions
@@ -35,8 +35,8 @@ I ran the benchmark
 on almost all the ChatGPT models, using a variety of edit formats.
 This produced some interesting observations:

-- Asking GPT to just return an updated copy of the whole file in a normal fenced code block is by far the most reliable way to have it edit code. This is true across all gpt-3.5 and gpt-4 models. Keeping the output format dead simple seems to leave GPT with more brain power to devote to the actual coding task. GPT is also less likely to mangle this simple output format.
-- Using the new function calling API is worse than returning whole files in markdown. GPT writes worse code and frequently mangles the output format, even though OpenAI introduced the function calling API to make structured output formatting more reliable. This was a big surprise.
+- Asking GPT to just return an updated copy of the whole file in a normal fenced code block is by far the most reliable edit format. This is true across all gpt-3.5 and gpt-4 models. Keeping the output format dead simple seems to leave GPT with more brain power to devote to the actual coding task. GPT is also less likely to mangle this simple output format.
+- Using the new function calling API is worse than the above whole file method, for all models. GPT writes worse code and frequently mangles this output format, even though OpenAI introduced the function calling API to make structured output formatting more reliable. This was a big surprise.
 - The new June (`0613`) versions of `gpt-3.5-turbo` are worse at code editing than the older Feb (`0301`) version. This was unexpected.
 - The gpt-4 models are much better at code editing than the gpt-3.5 models. This was expected.
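For context on the two formats the changed bullets compare, below is a minimal sketch of how a whole-file request differs from a function calling request against the 0613 models. It is not taken from aider's source: the prompts, the write_file function name, and its schema are illustrative assumptions, and only the request payloads are built (no API call is made).

# Sketch contrasting the two edit formats discussed above: asking for the whole
# updated file in a fenced code block vs. requesting the same edit through the
# OpenAI function calling API. Prompts and the write_file schema are hypothetical.
import json

FILE = "hello.py"
TASK = "Add a docstring to the main() function."

# Whole file format: the model replies with ordinary markdown, and the updated
# file is parsed out of the fenced code block in its response.
whole_file_request = {
    "model": "gpt-3.5-turbo-0613",
    "messages": [
        {
            "role": "user",
            "content": (
                f"{TASK}\n"
                f"Return the full updated contents of {FILE} "
                "inside a single fenced code block."
            ),
        }
    ],
}

# Function calling format: the model is asked to return the edit as structured
# arguments to a (hypothetical) write_file function.
function_call_request = {
    "model": "gpt-3.5-turbo-0613",
    "messages": [{"role": "user", "content": f"{TASK} Edit {FILE}."}],
    "functions": [
        {
            "name": "write_file",
            "description": "Replace the contents of a file.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string"},
                    "content": {"type": "string"},
                },
                "required": ["path", "content"],
            },
        }
    ],
    "function_call": {"name": "write_file"},
}

print(json.dumps(whole_file_request, indent=2))
print(json.dumps(function_call_request, indent=2))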