This commit is contained in:
Paul Gauthier 2023-07-01 15:21:21 -07:00
parent 2c87b5b9a1
commit 51bc71446e

@@ -43,7 +43,7 @@ The results were quite interesting:
- Using the new function calling API performed worse than the above whole file method for all models. GPT-3.5 especially produced inferior code and frequently mangled this output format. This was surprising, as the functions API was introduced to enhance the reliability of structured outputs. The results from these `func` edit methods are shown as patterned bars in the graph (both green and blue).
- As expected, the GPT-4 models outperformed the GPT-3.5 models in code editing.
-The quantitative benchmark results align with my developing intuition
+The quantitative benchmark results align with my intuitions
about prompting GPT for complex tasks like coding. It's beneficial to
minimize the "cognitive overhead" of formatting the response, allowing
GPT to concentrate on the task at hand. As an analogy, asking a junior