laziness24-turbo-udiff-never2x

This commit is contained in:
Paul Gauthier 2023-12-18 18:43:15 -08:00
parent fd4e890217
commit 308007a8e9
3 changed files with 17 additions and 6 deletions

View file

@ -5,7 +5,9 @@ from .base_prompts import CoderPrompts
class UnifiedDiffPrompts(CoderPrompts): class UnifiedDiffPrompts(CoderPrompts):
main_system = """Act as an expert software developer. main_system = """Act as an expert software developer.
You are diligent and tireless, and you always COMPLETELY IMPLEMENT the needed code. You are diligent and tireless!
You NEVER leave comments describing code without implementing it!
You always COMPLETELY IMPLEMENT the needed code!
Always use best practices when coding. Always use best practices when coding.
Respect and use existing conventions, libraries, etc that are already present in the code base. Respect and use existing conventions, libraries, etc that are already present in the code base.
@ -95,6 +97,10 @@ Delete the entire existing version with `-` lines and then add a new, updated ve
This will help you generate correct code and correct diffs. This will help you generate correct code and correct diffs.
To make a new file, show a diff from `--- /dev/null` to `+++ path/to/new/file.ext`. To make a new file, show a diff from `--- /dev/null` to `+++ path/to/new/file.ext`.
You are diligent and tireless!
You NEVER leave comments describing code without implementing it!
You always COMPLETELY IMPLEMENT the needed code!
""" """
files_content_prefix = "These are the *read-write* files:\n" files_content_prefix = "These are the *read-write* files:\n"

View file

@ -132,7 +132,10 @@ def find_non_self_methods(path):
non_self_methods = [] non_self_methods = []
for filename in python_files: for filename in python_files:
with open(filename, "r") as file: with open(filename, "r") as file:
node = ast.parse(file.read(), filename=filename) try:
node = ast.parse(file.read(), filename=filename)
except:
pass
checker = SelfUsageChecker() checker = SelfUsageChecker()
checker.visit(node) checker.visit(node)
for method in checker.non_self_methods: for method in checker.non_self_methods:
@ -145,7 +148,7 @@ def process(entry):
fname, class_name, method_name, class_children, method_children = entry fname, class_name, method_name, class_children, method_children = entry
if method_children > class_children / 2: if method_children > class_children / 2:
return return
if method_children < 100: if method_children < 250:
return return
fname = Path(fname) fname = Path(fname)
@ -154,7 +157,7 @@ def process(entry):
print(f"{fname} {class_name} {method_name} {class_children} {method_children}") print(f"{fname} {class_name} {method_name} {class_children} {method_children}")
dname = Path("tmp.benchmarks/refactor-benchmark") dname = Path("tmp.benchmarks/refactor-benchmark-pylint")
dname.mkdir(exist_ok=True) dname.mkdir(exist_ok=True)
dname = dname / f"{fname.stem}_{class_name}_{method_name}" dname = dname / f"{fname.stem}_{class_name}_{method_name}"

View file

@ -79,8 +79,8 @@ code edits, because it's the
default output format of `git diff`: default output format of `git diff`:
```diff ```diff
--- a/hello.py --- a/greeting.py
+++ b/hello.py +++ b/greeting.py
@@ -1,5 +1,5 @@ @@ -1,5 +1,5 @@
def main(args): def main(args):
# show a greeting # show a greeting
@ -246,6 +246,7 @@ They exhibit a variety of problems:
- GPT forgets things like comments, docstrings, blank lines, etc. Or it skips over some code that it doesn't intend to change. - GPT forgets things like comments, docstrings, blank lines, etc. Or it skips over some code that it doesn't intend to change.
- GPT forgets the leading *plus* `+` character to mark novel lines that it wants to add to the file. It incorrectly includes them with a leading *space* as if they were already there. - GPT forgets the leading *plus* `+` character to mark novel lines that it wants to add to the file. It incorrectly includes them with a leading *space* as if they were already there.
- GPT outdents all of the code, removing all the leading white space which is shared across the lines. So a chunk of deeply indented code is shown in a diff with only the leading white space that changes between the lines in the chunk.
- GPT jumps ahead to show edits to a different part of the file without starting a new hunk with a `@@ ... @@` divider. - GPT jumps ahead to show edits to a different part of the file without starting a new hunk with a `@@ ... @@` divider.
As an example of the first issue, consider this source code: As an example of the first issue, consider this source code:
@ -285,6 +286,7 @@ If a hunk doesn't apply cleanly, aider uses a number of strategies:
- Normalize the hunk, by taking the *minus* `-` and *space* lines as one version of the hunk and the *space* and *plus* `+` lines as a second version and doing an actual unified diff on them. - Normalize the hunk, by taking the *minus* `-` and *space* lines as one version of the hunk and the *space* and *plus* `+` lines as a second version and doing an actual unified diff on them.
- Try and discover new lines that GPT is trying to add but which it forgot to mark with *plus* `+` markers. This is done by diffing the *minus* `-` and *space* lines back against the original file. - Try and discover new lines that GPT is trying to add but which it forgot to mark with *plus* `+` markers. This is done by diffing the *minus* `-` and *space* lines back against the original file.
- Try and apply the hunk using "relative leading white space", so we can match and patch correctly even if the hunk has been uniformly indented or outdented.
- Break a large hunk apart into an overlapping sequence of smaller hunks, which each contain only one contiguous run of *plus* `+` and *minus* `-` lines. Try and apply each of these sub-hunks independently. - Break a large hunk apart into an overlapping sequence of smaller hunks, which each contain only one contiguous run of *plus* `+` and *minus* `-` lines. Try and apply each of these sub-hunks independently.
- Vary the size and offset of the "context window" of *space* lines from the hunk that are used to localize the edit to a specific part of the file. - Vary the size and offset of the "context window" of *space* lines from the hunk that are used to localize the edit to a specific part of the file.
- Combine the above mechanisms to progressively become more permissive about how to apply the hunk. - Combine the above mechanisms to progressively become more permissive about how to apply the hunk.