Commit graph

300 commits

Author SHA1 Message Date
Paul Gauthier
cdf4969c01 copy 2024-05-22 20:24:10 -07:00
Paul Gauthier
9f2554fed7 copy 2024-05-22 15:33:11 -07:00
Paul Gauthier
fcfc05e2d3 aider: Make the x and y axis labels also #555.
# Aider chat conversation:

USER: make the x and y axis labels also #555

ASSISTANT: Here are the changes to make the x and y axis labels also #555 in `benchmark/swe_bench_lite.py` and `benchmark/over_time.py`:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    ax.set_ylabel("Pass rate (%)", fontsize=18)
    ax.set_title("SWE Bench Lite", fontsize=20)
    ax.set_ylim(0, 30)
    plt.xticks(rotation=45, ha="right", fontsize=16)
    plt.tight_layout(pad=3.0)
    plt.savefig("swe_bench_lite.jpg")
    plt.savefig("swe_bench_lite.svg")
    imgcat(fig)
=======
    ax.set_ylabel("Pass rate (%)", fontsize=18, color="#555")
    ax.set_title("SWE Bench Lite", fontsize=20)
    ax.set_ylim(0, 30)
    plt.xticks(rotation=45, ha="right", fontsize=16)
    plt.tight_layout(pad=3.0)
    plt.savefig("swe_bench_lite.jpg")
    plt.savefig("swe_bench_lite.svg")
    imgcat(fig)
    ax.xaxis.label.set_color("#555")
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
    ax.set_xlabel("Model release date", fontsize=18)
    ax.set_ylabel("Aider code editing benchmark,\npercent completed correctly", fontsize=18)
    ax.set_title("LLM code editing skill by model release date", fontsize=20)
    ax.set_ylim(0, 30)
    plt.xticks(fontsize=14)
    plt.tight_layout(pad=3.0)
    plt.savefig("tmp_over_time.png")
    plt.savefig("tmp_over_time.svg")
    imgcat(fig)
=======
    ax.set_xlabel("Model release date", fontsize=18, color="#555")
    ax.set_ylabel("Aider code editing benchmark,\npercent completed correctly", fontsize=18, color="#555")
    ax.set_title("LLM code editing skill by model release date", fontsize=20)
    ax.set_ylim(0, 30)
    plt.xticks(fontsize=14)
    plt.tight_layout(pad=3.0)
    plt.savefig("tmp_over_time.png")
    plt.savefig("tmp_over_time.svg")
    imgcat(fig)
>>>>>>> REPLACE
```
2024-05-22 15:29:58 -07:00
Paul Gauthier
c3b8caf480 Update text color to "#555" in swe_bench_lite.py. 2024-05-22 15:29:57 -07:00
Paul Gauthier
1f7ff54717 aider: Set font color using plt.rcParams to resolve KeyError for unrecognized key "font.color".
# Aider chat conversation:

USER: KeyError: 'Unrecognized key "font.color" for group "font" and name "color"'

ASSISTANT: To fix the `KeyError` related to the unrecognized key "font.color", we need to set the font color using `plt.rcParams` instead of `rc`.

Here are the changes:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10, "color": "#444444"})
=======
    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10})
    plt.rcParams["text.color"] = "#444444"
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10, "color": "#444444"})
=======
    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10})
    plt.rcParams["text.color"] = "#444444"
>>>>>>> REPLACE
```
2024-05-22 15:28:44 -07:00
Paul Gauthier
81ac2e18cc aider: Changed font color to dark grey in benchmark/swe_bench_lite.py and benchmark/over_time.py.
# Aider chat conversation:

USER: make the font color dark grey not black

ASSISTANT: Here are the changes to make the font color dark grey instead of black in `benchmark/swe_bench_lite.py` and `benchmark/over_time.py`:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10})
=======
    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10, "color": "#444444"})
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10})
=======
    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10, "color": "#444444"})
>>>>>>> REPLACE
```
2024-05-22 15:28:30 -07:00
Paul Gauthier
5ba549834c Change the figure size in swe_bench_lite plot from 10x6 to 10x5. 2024-05-22 15:28:28 -07:00
Paul Gauthier
0b01b7caf5 copy 2024-05-22 15:26:16 -07:00
Paul Gauthier
19e7823859 aider: Add more whitespace around the perimeter.
# Aider chat conversation:

USER: add more whitespace around the perimeter

ASSISTANT: Here are the changes to add more whitespace around the perimeter in `benchmark/swe_bench_lite.py` and `benchmark/over_time.py`:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    plt.tight_layout()
=======
    plt.tight_layout(pad=3.0)
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
    plt.tight_layout()
=======
    plt.tight_layout(pad=3.0)
>>>>>>> REPLACE
```
2024-05-22 15:18:45 -07:00
Paul Gauthier
aac19f5911 Update xticks fontsize to 16 in swe_bench_lite.py. 2024-05-22 15:18:44 -07:00
Paul Gauthier
a6ae4cab06 aider: Make the model labels on the x axis larger.
# Aider chat conversation:

USER: make the model labels on the x axis larger

ASSISTANT: Here are the changes to make the model labels on the x-axis larger in `benchmark/swe_bench_lite.py` and `benchmark/over_time.py`:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    plt.xticks(rotation=45, ha='right')
=======
    plt.xticks(rotation=45, ha='right', fontsize=14)
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
    plt.tight_layout()
=======
    plt.xticks(fontsize=14)
    plt.tight_layout()
>>>>>>> REPLACE
```
2024-05-22 15:18:20 -07:00
Paul Gauthier
6765b12a7d Update plot_swe_bench_lite function to handle model names with '|' by replacing it with newline character and adjusting bar colors and transparency. 2024-05-22 15:18:18 -07:00
Paul Gauthier
5ef96a32e2 aider: Set the y-axis to go to 30.
# Aider chat conversation:

USER: make the y-axis go to 30

ASSISTANT: Here are the changes to make the y-axis go to 30 in `benchmark/swe_bench_lite.py`:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    plt.xticks(rotation=45, ha='right')
=======
    ax.set_ylim(0, 30)
    plt.xticks(rotation=45, ha='right')
>>>>>>> REPLACE
```

Here are the changes to make the y-axis go to 30 in `benchmark/over_time.py`:

benchmark/over_time.py
```python
<<<<<<< SEARCH
    plt.tight_layout()
=======
    ax.set_ylim(0, 30)
    plt.tight_layout()
>>>>>>> REPLACE
```
2024-05-22 15:14:38 -07:00
Paul Gauthier
d1bbcb9bf8 Change pass rate label to lowercase and update colors in SWE Bench Lite plot. 2024-05-22 15:14:37 -07:00
Paul Gauthier
5761218b9a aider: Add imgcat to display the graph after saving it.
# Aider chat conversation:

USER: save it as a png and svg and imgcat it out

ASSISTANT: To save the graph as a PNG and SVG and then display it using `imgcat`, we need to modify `benchmark/swe_bench_lite.py`:

1. Import `imgcat`.
2. Add `imgcat` to display the graph after saving it.

Here are the *SEARCH/REPLACE* blocks:

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
from matplotlib import rc
=======
from matplotlib import rc
from imgcat import imgcat
>>>>>>> REPLACE
```

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
    plt.show()
=======
    imgcat(fig)
    plt.show()
>>>>>>> REPLACE
```
2024-05-22 15:13:01 -07:00
Paul Gauthier
c9349d6e23 Updated SWE Bench Lite to display pass rates by model with adjusted colors and title. 2024-05-22 15:13:00 -07:00
Paul Gauthier
61019093a2 aider: Added code to plot a vertical bar graph of data from tmp.txt in swe_bench_lite.py.
# Aider chat conversation:

USER: plot a vertical bar graph of the data in tmp.txt. use similar fonts, colors and graph style to the one in over_time.py. put the new graph code in swe_bench_lite.py

ASSISTANT: Here is the code to plot a vertical bar graph of the data in `tmp.txt` using similar fonts, colors, and graph style to the one in `over_time.py`. This code will be placed in `swe_bench_lite.py`.

benchmark/swe_bench_lite.py
```python
<<<<<<< SEARCH
=======
import matplotlib.pyplot as plt
from matplotlib import rc

def plot_swe_bench_lite(data_file):
    with open(data_file, "r") as file:
        lines = file.readlines()

    models = []
    pass_rates = []

    for line in lines:
        if line.strip():
            pass_rate, model = line.split("%")
            models.append(model.strip())
            pass_rates.append(float(pass_rate.strip()))

    plt.rcParams["hatch.linewidth"] = 0.5
    plt.rcParams["hatch.color"] = "#444444"

    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10})

    fig, ax = plt.subplots(figsize=(10, 5))
    ax.grid(axis="y", zorder=0, lw=0.2)
    for spine in ax.spines.values():
        spine.set_edgecolor("#DDDDDD")
        spine.set_linewidth(0.5)

    colors = [
        "red" if "Aider" in model else "green" if "AutoCodeRover" in model else "blue" for model in models
    ]
    bars = ax.bar(models, pass_rates, color=colors, alpha=0.5, zorder=3)

    for bar in bars:
        yval = bar.get_height()
        ax.text(bar.get_x() + bar.get_width()/2, yval + 0.5, f'{yval}%', ha='center', va='bottom', fontsize=12, alpha=0.75)

    ax.set_xlabel("Models", fontsize=18)
    ax.set_ylabel("Pass Rate (%)", fontsize=18)
    ax.set_title("SWE Benchmark Pass Rates by Model", fontsize=20)
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()
    plt.savefig("swe_bench_lite.png")
    plt.savefig("swe_bench_lite.svg")
    plt.show()

# Example usage
plot_swe_bench_lite("benchmark/tmp.txt")
>>>>>>> REPLACE
```
2024-05-22 15:11:22 -07:00
Paul Gauthier
ca18220b77 num_with_malformed_responses 2024-05-19 14:19:06 -07:00
Paul Gauthier
74c0df8df8 svg 2024-05-15 11:54:00 -07:00
Paul Gauthier
c8d2649bb8 Update model names to remove parentheses and extra whitespace in benchmark plot. 2024-05-15 11:36:37 -07:00
Paul Gauthier
445c73267a Updated plot dimensions and axis labels for better visualization in benchmark over time. 2024-05-15 11:29:44 -07:00
Paul Gauthier
77ba518a1a aider: Made the border of the graph thinner and lighter by adjusting the spines properties of the axes.
# Aider chat conversation:

USER: make the border of the graph thinner and lighter

ASSISTANT: To make the border of the graph thinner and lighter, we can adjust the `spines` properties of the axes.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
    colors = ['red' if 'gpt-4' in model else 'green' if 'gpt-3.5' in model else 'blue' for model in models]
=======
    for spine in ax.spines.values():
        spine.set_edgecolor('#DDDDDD')
        spine.set_linewidth(0.5)

    colors = ['red' if 'gpt-4' in model else 'green' if 'gpt-3.5' in model else 'blue' for model in models]
>>>>>>> REPLACE
```
2024-05-15 11:21:01 -07:00
Paul Gauthier
a018fdfe2c aider: Make the dots bigger.
# Aider chat conversation:

USER: make the dots bigger

ASSISTANT: To make the dots bigger, we can adjust the `s` parameter in the `scatter` function, which controls the size of the dots.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
    ax.scatter(dates, pass_rates, c=colors, alpha=0.5)
=======
    ax.scatter(dates, pass_rates, c=colors, alpha=0.5, s=100)
>>>>>>> REPLACE
```
2024-05-15 11:20:05 -07:00
Paul Gauthier
3f1b86a1cc aider: Make the axes labels font larger.
# Aider chat conversation:

USER: I did `git reset --hard HEAD~1` to discard the last edits. Please wait for further instructions before attempting that change again. Feel free to ask relevant questions about why the changes were reverted.

ASSISTANT: Understood. If you have any specific instructions or further changes you'd like to make, please let me know!

USER: make the axes labels font larger

ASSISTANT: To make the axes labels font larger, we can adjust the `fontsize` parameter in the `set_xlabel` and `set_ylabel` methods.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
    ax.set_xlabel('Model release date')
    ax.set_ylabel('Aider code editing benchmark,\npercent completed correctly')
=======
    ax.set_xlabel('Model release date', fontsize=14)
    ax.set_ylabel('Aider code editing benchmark,\npercent completed correctly', fontsize=14)
>>>>>>> REPLACE
```
2024-05-15 11:14:48 -07:00
Paul Gauthier
05e34783c2 Update plot labels and annotations for better readability and clarity. 2024-05-15 11:13:17 -07:00
Paul Gauthier
75fa061016 aider: Adjust the position of labels to prevent overlap with dots.
# Aider chat conversation:

USER: make the labels not overlap the dots

ASSISTANT: To make the labels not overlap the dots, we can adjust the position of the labels slightly using the `xytext` parameter in the `annotate` function.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
        ax.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75)
=======
        ax.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75,
                    xytext=(5, 5), textcoords='offset points')
>>>>>>> REPLACE
```
2024-05-15 11:06:32 -07:00
Paul Gauthier
49eb72cc20 Updated image file format from SVG to PNG and added imgcat library for displaying the plot. 2024-05-15 11:06:31 -07:00
Paul Gauthier
8d670f187c aider: Used the colorscheme and fonts from the graphs in plots.py in over_time.py.
# Aider chat conversation:

USER: use the colorscheme and fonts from the graphs in plots.py

ASSISTANT: To use the colorscheme and fonts from the graphs in `plots.py`, we need to:

1. Import the necessary settings from `plots.py`.
2. Apply these settings to the graph in `over_time.py`.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
import yaml
from datetime import datetime
=======
import yaml
from datetime import datetime
from matplotlib import rc
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
    plt.figure(figsize=(10, 6))
=======
    plt.rcParams["hatch.linewidth"] = 0.5
    plt.rcParams["hatch.color"] = "#444444"

    rc("font", **{"family": "sans-serif", "sans-serif": ["Helvetica"], "size": 10})

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.grid(axis="y", zorder=0, lw=0.2)
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
    plt.scatter(dates, pass_rates, c=colors, alpha=0.5)
=======
    ax.scatter(dates, pass_rates, c=colors, alpha=0.5)
>>>>>>> REPLACE
```

benchmark/over_time.py
```python
<<<<<<< SEARCH
        plt.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75)

    plt.xlabel('Release Date')
    plt.ylabel('Pass Rate 2')
    plt.title('Model Performance Over Time')
    plt.grid(True)
    plt.tight_layout()
    plt.show()
=======
        ax.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75)

    ax.set_xlabel('Release Date')
    ax.set_ylabel('Pass Rate 2')
    ax.set_title('Model Performance Over Time')
    plt.tight_layout()
    plt.savefig("tmp_over_time.svg")
    imgcat(fig)
>>>>>>> REPLACE
```
2024-05-15 11:04:56 -07:00
Paul Gauthier
85d9d8430f aider: Color the "gpt-3.5" models green in the plotting code.
# Aider chat conversation:

USER: color the gpt-3.5 models a new color

ASSISTANT: To color the "gpt-3.5" models a new color, we need to modify the plotting code in `benchmark/over_time.py` to check for "gpt-3.5" in the model name and assign a different color.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
    colors = ['red' if 'gpt-4' in model else 'blue' for model in models]
=======
    colors = ['red' if 'gpt-4' in model else 'green' if 'gpt-3.5' in model else 'blue' for model in models]
>>>>>>> REPLACE
```
2024-05-15 11:04:14 -07:00
Paul Gauthier
0d11357980 aider: Color the "gpt-4" models differently.
# Aider chat conversation:

USER: color the "gpt-4" models differently

ASSISTANT: To color the "gpt-4" models differently, we need to modify the plotting code in `benchmark/over_time.py` to check for "gpt-4" in the model name and assign a different color.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
    plt.scatter(dates, pass_rates, c='blue', alpha=0.5)
=======
    colors = ['red' if 'gpt-4' in model else 'blue' for model in models]
    plt.scatter(dates, pass_rates, c=colors, alpha=0.5)
>>>>>>> REPLACE
```

This change will color the points red if the model name contains "gpt-4", otherwise, it will color them blue.
2024-05-15 11:03:47 -07:00
Paul Gauthier
af2f4b73e9 aider: Use the released field directly in the plot instead of parsing it with datetime.strptime.
# Aider chat conversation:

USER: I ran this command:

 python benchmark/over_time.py

And got this output:

Traceback (most recent call last):
  File "/Users/gauthier/Projects/aider/benchmark/over_time.py", line 33, in <module>
    plot_over_time('_data/edit_leaderboard.yml')
  File "/Users/gauthier/Projects/aider/benchmark/over_time.py", line 15, in plot_over_time
    dates.append(datetime.strptime(entry['released'], '%Y-%m-%d'))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: strptime() argument 1 must be str, not datetime.date

ASSISTANT: The error indicates that the `released` field in the YAML file is already a `datetime.date` object, so we don't need to parse it with `datetime.strptime`. We can directly use it in the plot.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
            dates.append(datetime.strptime(entry['released'], '%Y-%m-%d'))
=======
            dates.append(entry['released'])
>>>>>>> REPLACE
```
2024-05-15 11:03:07 -07:00
Paul Gauthier
49ac9ca357 aider: Added scatter plot functionality for YAML entries with a released date and pass_rate_2.
# Aider chat conversation:

USER: in over_time.py, draw a graph for each of the yaml entries with a `released` date. plot date on x, `pass_rate_2` on y and label with the `model` name. do a scatter plot. use colors/styles similar to the other graphs.

ASSISTANT: To draw a scatter plot for each of the YAML entries with a `released` date, we need to:

1. Read the YAML file.
2. Extract the `released` date and `pass_rate_2` for each entry.
3. Plot the data using matplotlib.

Here are the changes:

benchmark/over_time.py
```python
<<<<<<< SEARCH
=======
import matplotlib.pyplot as plt
import yaml
from datetime import datetime

def plot_over_time(yaml_file):
    with open(yaml_file, 'r') as file:
        data = yaml.safe_load(file)

    dates = []
    pass_rates = []
    models = []

    for entry in data:
        if 'released' in entry and 'pass_rate_2' in entry:
            dates.append(datetime.strptime(entry['released'], '%Y-%m-%d'))
            pass_rates.append(entry['pass_rate_2'])
            models.append(entry['model'])

    plt.figure(figsize=(10, 6))
    plt.scatter(dates, pass_rates, c='blue', alpha=0.5)

    for i, model in enumerate(models):
        plt.annotate(model, (dates[i], pass_rates[i]), fontsize=8, alpha=0.75)

    plt.xlabel('Release Date')
    plt.ylabel('Pass Rate 2')
    plt.title('Model Performance Over Time')
    plt.grid(True)
    plt.tight_layout()
    plt.show()

# Example usage
plot_over_time('_data/edit_leaderboard.yml')
>>>>>>> REPLACE
```
2024-05-15 11:02:44 -07:00
Paul Gauthier
70b1c0c20c load .env in benchmark.py 2024-05-07 13:32:19 -07:00
Paul Gauthier
ecca737803 added deepseek-chat v2 2024-05-07 06:26:39 -07:00
Paul Gauthier
b1cae73a85 cleaned up csv output 2024-05-07 05:59:31 -07:00
Paul Gauthier
a7b08c7354 format output as yaml 2024-05-06 11:15:19 -07:00
Paul Gauthier
3162d42262 cleanup 2024-05-06 10:46:09 -07:00
Paul Gauthier
5fb7a323ec refactored plots 2024-05-06 10:44:34 -07:00
Paul Gauthier
3bb237bdc1 handle tasks with exceptions in the stats output 2024-05-05 08:24:45 -07:00
Paul Gauthier
9cdd9e12c3 catch all exceptions in the benchmark 2024-05-04 17:52:46 -07:00
Paul Gauthier
9b88f8caf6 updated gpt-4-0314 2024-05-04 07:59:27 -07:00
Paul Gauthier
f6580fff76 updated all openai models 2024-05-04 07:38:50 -07:00
Paul Gauthier
1981105932 aider: Implemented the TODO to extract the version from aider/__init__.py for each commit hash.
# Aider chat conversation:

USER: do the todo

ASSISTANT: Here is the *SEARCH/REPLACE block* to do the TODO in the code:

benchmark/benchmark.py
<source>python
<<<<<<< SEARCH
    for hsh in variants['commit_hash']):
        # TODO: get the output from `git show {hsh}:aider/__init__.py`
=======
    for hsh in variants['commit_hash']:
        try:
            version = subprocess.check_output(
                ['git', 'show', f'{hsh}:aider/__init__.py'],
                universal_newlines=True
            )
            version = re.search(r'__version__ = "(.*)"', version).group(1)
            csv.append(version)
        except subprocess.CalledProcessError:
            csv.append('unknown')
>>>>>>> REPLACE
</source>

This change:
1. Loops through each commit hash in `variants['commit_hash']`
2. Uses `subprocess.check_output` to run the `git show` command and capture the contents of `aider/__init__.py` at that commit
3. Searches the file contents for the `__version__ = "X.Y.Z"` line and extracts the version string
4. Appends the version to the `csv` list, or `'unknown'` if there was an error getting the file contents

The `try/except` handles cases where the commit hash doesn't exist or `aider/__init__.py` is missing.
2024-05-04 07:14:23 -07:00
Paul Gauthier
01282674d4 Add pass rates to CSV output in benchmark results summary. 2024-05-04 07:13:40 -07:00
Paul Gauthier
f0b794ce9c revert docker.sh 2024-05-03 14:46:52 -07:00
Paul Gauthier
26d841ccc7 aider: Add --add-host flag to docker run command to allow container to connect to host machine on macOS using host.docker.internal.
# Aider chat conversation:

USER: when i run `curl http://127.0.0.1:11434` inside the container, it "Couldn't connect to server". i want it to connect to the server running on the host computer on that port. the host is a mac.

ASSISTANT: To allow the Docker container to connect to the host machine on macOS, you can use the special DNS name `host.docker.internal` which resolves to the internal IP address used by the host. This is a feature provided by Docker for Mac to facilitate such connections.

Here's the diff to update the `docker.sh` script to include the `--add-host` flag, which adds an entry to the container's `/etc/hosts` file:

```diff
--- benchmark/docker.sh
+++ benchmark/docker.sh
@@ ... @@
 docker run \
        -it --rm \
        -v `pwd`:/aider \
        -v `pwd`/tmp.benchmarks/.:/benchmarks \
        -e OPENAI_API_KEY=$OPENAI_API_KEY \
        -e HISTFILE=/aider/.bash_history \
        -e AIDER_DOCKER=1 \
        -e AIDER_BENCHMARK_DIR=/benchmarks \
+       --add-host=host.docker.internal:host-gateway \
        --net=host \
        -p 11434:11434 \
        aider-benchmark \
        bash
```

With this change, when you run `curl http://host.docker.internal:11434` inside the container, it should connect to the server running on the host computer on that port.
2024-05-01 13:54:30 -07:00
Paul Gauthier
71b675d0ef Add network host mode and change port mapping in docker run script. 2024-05-01 13:54:28 -07:00
Paul Gauthier
5e4ac44008 aider: Add port forwarding for the client inside the container to connect to the Ollama server on the host.
# Aider chat conversation:

USER: add a port fwd so a client inside the contain can connect to the ollama server running on the host

ASSISTANT: To forward a port so that a client inside the container can connect to the Ollama server running on the host, you need to add a `--publish` or `-p` option to the `docker run` command in the `benchmark/docker.sh` script. This option takes the format `host_port:container_port`.

Assuming the Ollama server is running on port 12345 on the host, and you want to use the same port inside the container, the diff would look like this:

```diff
--- benchmark/docker.sh
+++ benchmark/docker.sh
@@ ... @@
 docker run \
        -it --rm \
        -v `pwd`:/aider \
        -v `pwd`/tmp.benchmarks/.:/benchmarks \
        -e OPENAI_API_KEY=$OPENAI_API_KEY \
        -e HISTFILE=/aider/.bash_history \
        -e AIDER_DOCKER=1 \
        -e AIDER_BENCHMARK_DIR=/benchmarks \
+       -p 12345:12345 \
        aider-benchmark \
        bash
```

This change will forward port 12345 from the host to port 12345 in the container, allowing a client inside the container to connect to the Ollama server running on the host machine.
2024-05-01 13:46:45 -07:00
Paul Gauthier
4461c7c4b2 fixed benchmark 2024-04-23 09:44:04 -07:00
Paul Gauthier
fd5b9bbfcb Added groq llama3 2024-04-22 07:12:01 -07:00