aider

mirror of https://github.com/Aider-AI/aider.git synced 2025-05-29 08:44:59 +00:00

Author	SHA1	Message	Date
Paul Gauthier	e42005a78e	plumbed through to /web	2024-02-08 11:47:28 -08:00
Paul Gauthier	5b78d929a6	Roughly working scraper	2024-02-08 11:44:55 -08:00
Paul Gauthier	9bf3a6e0c6	aider: Implemented HTTP request using `httpx` library in `scrape_with_httpx` method. # Aider chat conversation: USER: finish `scrape_with_httpx` ASSISTANT: To finish the `scrape_with_httpx` method, I will implement the HTTP request using the `httpx` library. This will involve making a GET request to the provided URL and returning the content of the response. Here is the diff for the changes: ```diff --- aider/scrape.py +++ aider/scrape.py @@ ... @@ def scrape_with_httpx(self, url): - pass + import httpx + headers = { + 'User-Agent': aider_user_agent + } + try: + with httpx.Client(headers=headers) as client: + response = client.get(url) + response.raise_for_status() + return response.text + except httpx.HTTPError as http_err: + self.print_error(f'HTTP error occurred: {http_err}') + except Exception as err: + self.print_error(f'An error occurred: {err}') + return None ``` This implementation uses `httpx.Client` to send a GET request with a custom User-Agent header. It handles any HTTP errors by printing them out and returns `None` in case of an error. If the request is successful, it returns the text content of the response.	2024-02-08 11:29:46 -08:00
Paul Gauthier	e04187d1ad	Refactor scraper to use Playwright for web scraping and handle user agent string.	2024-02-08 11:29:44 -08:00
Paul Gauthier	f10a9d2e7b	aider: Retrieve the user agent from the browser context after creating a new page in the `scrape_with_playwright` function. # Aider chat conversation: USER: determine the user_agent ASSISTANT: To determine the `user_agent` in the `scrape_with_playwright` function, we can use the `browser.new_context()` method which allows us to specify a `user_agent` string. We will set a common user agent string for a desktop browser. Here is the diff for the changes: ```diff --- aider/scrape.py +++ aider/scrape.py @@ ... @@ def scrape_with_playwright(url): with sync_playwright() ^C KeyboardInterrupt USER: I don't want to SET it. I want to determine what it is ASSISTANT: To determine the `user_agent` being used by the browser in the `scrape_with_playwright` function, we can retrieve it from the browser context after creating a new page. Here is the diff for the changes: ```diff --- aider/scrape.py +++ aider/scrape.py @@ ... @@ def scrape_with_playwright(url): with sync_playwright() as p: try: browser = p.chromium.launch() except Exception as e: print(repr(e)) return + page = browser.new_page() + user_agent = page.evaluate("navigator.userAgent") + print(f"User Agent: {user_agent}") - page = browser.new_page() page.goto(url) content = page.content() browser.close() return content ```	2024-02-08 11:02:54 -08:00
Paul Gauthier	65111591d7	Add scraper module for data extraction.	2024-02-08 11:02:52 -08:00
Paul Gauthier	e2acc028d0	aider: Set the user agent to 'Aider v0.24.0-dev'. # Aider chat conversation: USER: set the user agent to 'Aider v0.24.0-dev' ASSISTANT: ```diff --- playw.py +++ playw.py @@ ... @@ def main(url): # Check if Chromium is installed, if not, install it with sync_playwright() as p: - p.chromium.launch() + browser = p.chromium.launch(user_agent='Aider v0.24.0-dev') + page = browser.new_page() + page.goto(url) + content = page.content() + print(content) + browser.close() if __name__ == "__main__": if len(sys.argv) < 2: ``` USER: UnifiedDiffNoMatch: hunk failed to apply! playw.py does not contain lines that match the diff you provided! Try again. DO NOT skip blank lines, comments, docstrings, etc! The diff needs to apply cleanly to the lines in playw.py! playw.py does not contain these 7 exact lines in a row: ``` def main(url): # Check if Chromium is installed, if not, install it with sync_playwright() as p: p.chromium.launch() if __name__ == "__main__": if len(sys.argv) < 2: ``` ASSISTANT: I apologize for the oversight. Here is the corrected diff: ```diff --- playw.py +++ playw.py @@ ... @@ with sync_playwright() as p: - browser = p.chromium.launch() + browser = p.chromium.launch(user_agent='Aider v0.24.0-dev') page = browser.new_page() page.goto(url) #page.wait_for_load_state('networkidle') content = page.content() print(content) browser.close() ```	2024-02-08 07:59:12 -08:00
Paul Gauthier	5a06508e0e	Install Chromium if not already installed before launching it.	2024-02-08 07:58:59 -08:00
Paul Gauthier	09de0f5d43	aider: Add check to see if Playwright needs to install Chromium and its dependencies. # Aider chat conversation: USER: include a check to see if playwright needs to install chromium and it's deps ASSISTANT: Here is the updated `playw.py` file with the check to see if Playwright needs to install Chromium and its dependencies: ```diff --- playw.py +++ playw.py @@ ... @@ import sys +from playwright.__main__ import main as playwright_install def main(url): + # Check if Chromium is installed, if not, install it + try: + with sync_playwright() as p: + p.chromium.launch() + except Exception as e: + print("Chromium is not installed. Installing necessary dependencies...") + playwright_install(['install', 'chromium']) + with sync_playwright() as p: browser = p.chromium.launch() page = browser.new_page() page.goto(url) content = page.content() print(content) browser.close() if __name__ == "__main__": if len(sys.argv) < 2: print("Usage: python playw.py <URL>") sys.exit(1) main(sys.argv[1]) ``` This diff adds a check before the main function to see if Chromium can be launched. If an exception occurs, it assumes Chromium is not installed and proceeds to install it using Playwright's command-line interface.	2024-02-08 07:24:46 -08:00
Paul Gauthier	d1782d1c87	aider: Remove asynchronous code and use synchronous Playwright API. # Aider chat conversation: USER: don't use async ASSISTANT: Here is the updated `playw.py` file without using async: ```diff --- playw.py +++ playw.py @@ ... @@ -import asyncio -from playwright.async_api import async_playwright +from playwright.sync_api import sync_playwright import sys -async def main(url): - async with async_playwright() as p: - browser = await p.chromium.launch() - page = await browser.new_page() - await page.goto(url) - content = await page.content() - print(content) - await browser.close() +def main(url): + with sync_playwright() as p: + browser = p.chromium.launch() + page = browser.new_page() + page.goto(url) + content = page.content() + print(content) + browser.close() if __name__ == "__main__": if len(sys.argv) < 2: print("Usage: python playw.py <URL>") sys.exit(1) - asyncio.run(main(sys.argv[1])) + main(sys.argv[1]) ``` This diff removes the asynchronous code and replaces it with the synchronous version provided by Playwright's `sync_api`. The `async` and `await` keywords are removed, and the `async_playwright` context manager is replaced with `sync_playwright`. The script should now run synchronously.	2024-02-08 07:21:23 -08:00
Paul Gauthier	a2649cabe9	aider: Added Playwright script to fetch text from a supplied URL. # Aider chat conversation: USER: write a playwright script to fetch the text from a url supplied on the cmd line ASSISTANT: To create a Playwright script that fetches text from a URL supplied on the command line, I will write the necessary code in the `playw.py` file. The script will use Playwright's Python package to launch a browser, navigate to the provided URL, and print the text content of the page to the console. Here is the diff for the new `playw.py` file: ```diff --- /dev/null +++ playw.py @@ ... @@ +import asyncio +from playwright.async_api import async_playwright +import sys + +async def main(url): + async with async_playwright() as p: + browser = await p.chromium.launch() + page = await browser.new_page() + await page.goto(url) + content = await page.content() + print(content) + await browser.close() + +if __name__ == "__main__": + if len(sys.argv) < 2: + print("Usage: python playw.py <URL>") + sys.exit(1) + asyncio.run(main(sys.argv[1])) ``` This script will require the Playwright package to be installed. You can install it using `pip install playwright` and then run `playwright install` to install the necessary browser binaries. To run the script, use the command `python playw.py <URL>`, replacing `<URL>` with the actual URL you want to fetch text from.	2024-02-08 07:20:50 -08:00
Paul Gauthier	1758937042	wip	2024-02-08 07:18:39 -08:00
Paul Gauthier	61531b9430	aider: Add chromedriver-autoinstaller to requirements.in. # Aider chat conversation: USER: I added these read-write files: requirements.in ASSISTANT: It seems there might be a misunderstanding. You've indicated that you've added the `requirements.in` file, but you haven't specified any changes that need to be made to it or any other files. If you need any modifications or have any specific requests regarding the `requirements.in` or any other file, please let me know, and I'll be happy to assist you with the necessary changes. USER: add chromedriver-autoinstller to .in not .txt ASSISTANT: Here is the diff to add `chromedriver-autoinstaller` to the `requirements.in` file: ```diff --- requirements.in +++ requirements.in @@ ... @@ webdriver_manager selenium +chromedriver-autoinstaller ``` This change adds `chromedriver-autoinstaller` to the `requirements.in` file, which is used to generate the `requirements.txt` file with `pip-compile`. After adding this line, you would run `pip-compile requirements.in` to update the `requirements.txt` file accordingly.	2024-02-07 16:30:30 -08:00
Paul Gauthier	99be34fb61	aider: Add chromedriver-autoinstaller to requirements.txt for automatic installation of ChromeDriver. # Aider chat conversation: USER: does the user need to install anything beyond the pip requirements to have selenium drive chrome? ASSISTANT: To have Selenium drive Chrome, the user needs to have the Chrome browser installed on their system and also the ChromeDriver executable, which allows Selenium to interface with Chrome. The ChromeDriver version must be compatible with the version of Chrome installed on the system. The `requirements.txt` file already includes `selenium`, but it does not include `chromedriver-autoinstaller`, which can automatically download and install the correct version of ChromeDriver for the user's version of Chrome. To facilitate this, we can add `chromedriver-autoinstaller` to the `requirements.txt` file. Here is the diff to add `chromedriver-autoinstaller` to the `requirements.txt`: ```diff --- requirements.txt +++ requirements.txt @@ ... @@ +selenium==4.17.2 +chromedriver-autoinstaller==0.3.1 # via -r requirements.in ``` By adding `chromedriver-autoinstaller`, the user can ensure that the correct version of ChromeDriver is installed and updated as needed, without manual intervention.	2024-02-07 16:29:45 -08:00
Paul Gauthier	ffdd034b7b	Add new dependencies and update existing ones.	2024-02-07 16:29:44 -08:00
Paul Gauthier	29fa63cd3b	aider: Add bs4 to requirements. # Aider chat conversation: USER: add bs4 ASSISTANT: ```diff --- requirements.in +++ requirements.in @@ ... @@ soundfile +bs4 PyYAML Pillow ```	2024-02-07 16:13:09 -08:00
Paul Gauthier	aa54fdd5f7	aider: Add the missing requirements for selenium and related packages. # Aider chat conversation: USER: add the missing requirements for selenium and related packages ASSISTANT: ```diff --- requirements.in +++ requirements.in @@ ... @@ +webdriver_manager +selenium ``` The above diff adds the missing requirements for Selenium and the WebDriver Manager to the `requirements.in` file.	2024-02-07 14:23:24 -08:00
Paul Gauthier	ef12107bd7	aider: Add command /web to use headless selenium to scrape a webpage and add the content to the chat. # Aider chat conversation: USER: Add a command /web to use headless selenium to scrape a webpage and add the content to the chat ASSISTANT: To implement the `/web` command, we will need to add a new method to the `Commands` class in the `aider/commands.py` file. This method will use Selenium with a headless browser to scrape the content of a given webpage and add the content to the chat. Here are the diffs for the changes: ```diff --- aider/commands.py +++ aider/commands.py @@ ... @@ from pathlib import Path +from selenium import webdriver +from selenium.webdriver.chrome.options import Options +from selenium.webdriver.chrome.service import Service +from selenium.webdriver.common.by import By +from webdriver_manager.chrome import ChromeDriverManager import re import subprocess @@ ... @@ class Commands: voice = None + web_driver = None def __init__(self, io, coder, voice_language=None): self.io = io @@ ... @@ self.voice_language = voice_language self.tokenizer = coder.main_model.tokenizer + self.initialize_web_driver() + def initialize_web_driver(self): + chrome_options = Options() + chrome_options.add_argument("--headless") + chrome_options.add_argument("--disable-gpu") + chrome_options.add_argument("--no-sandbox") + chrome_options.add_argument("--disable-dev-shm-usage") + self.web_driver = webdriver.Chrome( + service=Service(ChromeDriverManager().install()), + options=chrome_options + ) + + def cmd_web(self, args): + "Use headless selenium to scrape a webpage and add the content to the chat" + url = args.strip() + if not url: + self.io.tool_error("Please provide a URL to scrape.") + return + + try: + self.web_driver.get(url) + page_content = self.web_driver.find_element(By.TAG_NAME, "body").text + self.io.tool_output(f"Content from {url}:\n{page_content}") + return page_content + except Exception as e: + self.io.tool_error(f"Error scraping {url}: {e}") + def is_command(self, inp): return inp[0] in "/!" @@ ... @@ ``` This diff adds the necessary imports for Selenium and the WebDriver manager, initializes a headless Chrome WebDriver when the `Commands` class is instantiated, and adds a new command method `cmd_web` that takes a URL as an argument, scrapes the webpage, and outputs the content to the chat.	2024-02-07 14:22:02 -08:00
Paul Gauthier	8d9c2ba60b	Added links to openai docs/pricing	2024-02-07 14:17:24 -08:00
Paul Gauthier	49f81ac784	ssh	2024-02-07 12:13:32 -08:00
Paul Gauthier	35f812859e	Rationalized openai model attributes #458	2024-02-07 12:07:30 -08:00
Paul Gauthier	873a388693	Fixed the ! shortcut for /run	2024-02-06 16:14:27 -08:00
Paul Gauthier	926e393529	set version to 0.23.1-dev	2024-02-03 08:40:50 -08:00
Paul Gauthier	b0fd819241	version bump to 0.23.0	2024-02-03 08:40:17 -08:00
Paul Gauthier	174702babe	Updated gpt-4 turbo model switches	2024-02-03 08:39:19 -08:00
Paul Gauthier	bca265abfc	Updated HISTORY	2024-02-03 08:31:28 -08:00
Paul Gauthier	e2e2ee83a4	Added support for gpt-4-0125-preview	2024-02-03 08:27:31 -08:00
Paul Gauthier	f5887a5098	tweaking graph labels	2024-02-03 08:25:19 -08:00
Paul Gauthier	da14474cbe	Updated HISTORY	2024-01-31 17:55:22 -08:00
Paul Gauthier	1317ee1ace	Added --openrouter	2024-01-31 17:33:50 -08:00
Paul Gauthier	e79732dcf7	Added env_vars for all the OPENAI_ env vars	2024-01-31 15:26:53 -08:00
Paul Gauthier	885e33c2b3	copy	2024-01-26 13:51:03 -08:00
Paul Gauthier	edcf9b146b	updated graph	2024-01-25 15:33:52 -08:00
Paul Gauthier	bae834cc2f	updated graph	2024-01-25 15:12:03 -08:00
Paul Gauthier	0fbd702bfa	updated graph	2024-01-25 14:06:14 -08:00
Paul Gauthier	506c366b16	copy	2024-01-25 13:33:03 -08:00
Paul Gauthier	09bd4d7caf	updated graph	2024-01-25 13:31:43 -08:00
Paul Gauthier	a75e7c8472	copy	2024-01-25 13:17:18 -08:00
Paul Gauthier	cd59f0d63a	copy	2024-01-25 13:11:14 -08:00
Paul Gauthier	af9bab9643	copy	2024-01-25 13:05:22 -08:00
Paul Gauthier	9033be74bf	Initial benchmark results for 0125	2024-01-25 13:00:16 -08:00
Paul Gauthier	13e9a01f3b	copy	2024-01-24 10:19:50 -08:00
Paul Gauthier	448735a604	copy	2024-01-24 10:07:56 -08:00
Paul Gauthier	18c3b67ec2	copy	2024-01-24 09:59:41 -08:00
Paul Gauthier	22a73efec4	copy	2024-01-24 09:56:14 -08:00
Paul Gauthier	f6075605c6	copy	2024-01-24 09:53:04 -08:00
Paul Gauthier	065bc63e33	Merge remote-tracking branch 'origin/main'	2024-01-23 13:07:56 -08:00
Paul Gauthier	0c314144d2	Added /quit #451	2024-01-23 13:07:24 -08:00
paul-gauthier	247cab56f0	Merge pull request #453 from zackees/patch-1 fix https://github.com/paul-gauthier/aider/issues/452	2024-01-23 13:05:15 -08:00
Paul Gauthier	c31f2313ba	updated docs	2024-01-23 12:58:29 -08:00

3432 commits