Commit graph

1688 commits

Author SHA1 Message Date
Paul Gauthier
a3285edc4f aider: Add --openai-organization-id param to set OpenAI organization ID.
# Aider chat conversation:

USER: add --openai-organization-id param that sets OpenAI(organization=...)

ASSISTANT: ```diff
--- aider/main.py
+++ aider/main.py
@@ ... @@
     model_group.add_argument(
         "--openai-api-deployment-id",
         metavar="OPENAI_API_DEPLOYMENT_ID",
         env_var="OPENAI_API_DEPLOYMENT_ID",
         help="Specify the deployment_id",
     )
+    model_group.add_argument(
+        "--openai-organization-id",
+        metavar="OPENAI_ORGANIZATION_ID",
+        env_var="OPENAI_ORGANIZATION_ID",
+        help="Specify the OpenAI organization ID",
+    )
     model_group.add_argument(
         "--openrouter",
         dest="openai_api_base",
         action="store_const",
@@ ... @@
     if args.openai_api_type == "azure":
         client = openai.AzureOpenAI(
             api_key=args.openai_api_key,
             azure_endpoint=args.openai_api_base,
             api_version=args.openai_api_version,
             azure_deployment=args.openai_api_deployment_id,
         )
     else:
         kwargs = dict()
         if args.openai_api_base:
             kwargs["base_url"] = args.openai_api_base
             if "openrouter.ai" in args.openai_api_base:
                 kwargs["default_headers"] = {
                     "HTTP-Referer": "http://aider.chat",
                     "X-Title": "Aider",
                 }
+        if args.openai_organization_id:
+            kwargs["organization"] = args.openai_organization_id

         client = openai.OpenAI(api_key=args.openai_api_key, **kwargs)

@@ ... @@
     if args.verbose:
         show = scrub_sensitive_info(parser.format_values())
         io.tool_output(show)
         io.tool_output("Option settings:")
         for arg, val in sorted(vars(args).items()):
             io.tool_output(f"  - {arg}: {scrub_sensitive_info(str(val))}")
+        if args.openai_organization_id:
+            io.tool_output(f"  - openai_organization_id: {args.openai_organization_id}")
```
This diff adds a new command-line argument `--openai-organization-id` to the `aider/main.py` file, which allows the user to specify the OpenAI organization ID. It also ensures that if this argument is provided, it is passed to the `openai.OpenAI` client initialization as the `organization` parameter. Additionally, it outputs the organization ID if verbose mode is enabled.
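For reference, here is a minimal runnable sketch of the client construction this diff produces. The `organization` keyword is accepted by `openai.OpenAI` in the v1 SDK; the key and ID values below are placeholders, not real credentials:

```python
import openai

# Placeholder values; in aider these come from --openai-api-key and
# --openai-organization-id (or their OPENAI_* env vars).
api_key = "sk-..."
organization_id = "org-example123"

kwargs = dict()
if organization_id:
    kwargs["organization"] = organization_id

# Requests made through this client are attributed to the organization.
client = openai.OpenAI(api_key=api_key, **kwargs)
```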
2024-02-24 07:11:58 -08:00
Paul Gauthier
69d1107b31 set version to 0.24.2-dev 2024-02-11 05:51:40 -08:00
Paul Gauthier
d26693e636 version bump to 0.24.1 2024-02-11 05:51:09 -08:00
Paul Gauthier
72e6cb901c Fixed bug with cost computations when --no-stream in effect #475 2024-02-11 05:48:32 -08:00
Paul Gauthier
0fa2505ac5 Delete pandoc installer 2024-02-10 08:48:22 -08:00
Paul Gauthier
3d948788dd set version to 0.24.1-dev 2024-02-10 07:35:36 -08:00
Paul Gauthier
08a22b9cf7 version bump to 0.24.0 2024-02-10 07:35:02 -08:00
Paul Gauthier
1e70b3fb05 Updated HISTORY 2024-02-10 07:34:21 -08:00
Paul Gauthier
d376767ca3 Use gpt-3.5-turbo-0125 for commit messages 2024-02-10 07:32:26 -08:00
Paul Gauthier
6ddfc894e7 Updated HISTORY 2024-02-10 07:31:04 -08:00
Paul Gauthier
bdef4308fe Simpler calls to pypandoc 2024-02-08 16:14:17 -08:00
Paul Gauthier
efff174f9a Use download_pandoc, which works everywhere including arm64 2024-02-08 15:56:00 -08:00
Paul Gauthier
2dee76378b keep hrefs 2024-02-08 15:19:00 -08:00
Paul Gauthier
05c74202ec copy 2024-02-08 14:20:49 -08:00
Paul Gauthier
cc36329691 output with pandoc, cleanup with bs and re 2024-02-08 14:14:42 -08:00
Paul Gauthier
be60b785a4 aider: Remove all data: URLs from slimdown_html.
# Aider chat conversation:

USER: in `slimdown_html` remove all data: urls

ASSISTANT: ```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
 def slimdown_html(soup):
     # Remove all <img> tags
     for img in soup.find_all('img'):
         img.decompose()
+    # Remove all elements with data: URLs
+    for tag in soup.find_all(href=lambda x: x and x.startswith('data:')):
+        tag.decompose()
+    for tag in soup.find_all(src=lambda x: x and x.startswith('data:')):
+        tag.decompose()
     # Remove all per-element CSS styles
     for tag in soup.find_all(True):
         tag.attrs.pop('style', None)
     # Remove all internal anchor elements
     for anchor in soup.find_all('a', href=True):
         if anchor['href'].startswith('#'):
             anchor.decompose()
     return soup
```
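As a self-contained illustration of the same attribute-filter technique, here is a small sketch that runs against an invented HTML snippet. `find_all` accepts a callable as an attribute filter, which is what makes the `data:` prefix test possible:

```python
from bs4 import BeautifulSoup

html = """
<p><a href="data:text/html;base64,AAAA">inline data link</a></p>
<p><img src="data:image/png;base64,BBBB"></p>
<p><a href="https://example.com">normal link</a></p>
"""

soup = BeautifulSoup(html, "html.parser")
# Drop any element whose href or src attribute is a data: URL.
for tag in soup.find_all(href=lambda x: x and x.startswith("data:")):
    tag.decompose()
for tag in soup.find_all(src=lambda x: x and x.startswith("data:")):
    tag.decompose()

print(soup)  # only the https://example.com link survives
```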
2024-02-08 12:12:03 -08:00
Paul Gauthier
84ae51c005 Add functionality to write scraped content to a temporary file and optimize HTML parsing. 2024-02-08 12:11:59 -08:00
Paul Gauthier
33a62c8dc6 aider: finish slimdown_html. remove all <img> tags and all per-element css-styles and all internal anchor elements
# Aider chat conversation:

USER: finish `slimdown_html`. remove all <img> tags and all per-element css-styles and all internal anchor elements

ASSISTANT: ```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
-def slimdown_html(page_source: str) -> str:
-    soup = BeautifulSoup(page_source, "html.parser")
-    # ...
+def slimdown_html(page_source: str) -> str:
+    soup = BeautifulSoup(page_source, "html.parser")
+    # Remove all <img> tags
+    for img in soup.find_all('img'):
+        img.decompose()
+    # Remove all per-element CSS styles
+    for tag in soup.find_all(True):
+        tag.attrs.pop('style', None)
+    # Remove all internal anchor elements
+    for anchor in soup.find_all('a', href=True):
+        if anchor['href'].startswith('#'):
+            anchor.decompose()
+    return str(soup)
```
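A quick usage sketch of the finished function on a toy document (the markup is invented; the expected result is shown as a comment):

```python
from bs4 import BeautifulSoup

def slimdown_html(page_source: str) -> str:
    soup = BeautifulSoup(page_source, "html.parser")
    for img in soup.find_all("img"):
        img.decompose()
    for tag in soup.find_all(True):
        tag.attrs.pop("style", None)
    for anchor in soup.find_all("a", href=True):
        if anchor["href"].startswith("#"):
            anchor.decompose()
    return str(soup)

html = '<p style="color:red">Hi <img src="x.png"/><a href="#top">top</a><a href="/docs">docs</a></p>'
print(slimdown_html(html))
# -> <p>Hi <a href="/docs">docs</a></p>
```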
2024-02-08 12:10:12 -08:00
Paul Gauthier
1520bb976a Change html_to_text to html_to_markdown and add slimdown_html function. 2024-02-08 12:10:10 -08:00
Paul Gauthier
1509e21698 aider: Implemented html_to_markdown using pypandoc.
# Aider chat conversation:

USER: use pypandoc to implement `html_to_markdown`

ASSISTANT: To implement `html_to_markdown` using `pypandoc`, I will:

1. Add an import for `pypandoc`.
2. Implement the `html_to_markdown` function using `pypandoc.convert_text`.

Here are the diffs for those changes:

```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
 import sys
+import pypandoc
 import httpx
 from bs4 import BeautifulSoup
 from playwright.sync_api import sync_playwright
@@ ... @@
 def html_to_markdown(page_source: str) -> str:
-    pass
+    return pypandoc.convert_text(page_source, 'md', format='html')
```

Please ensure that `pypandoc` is installed in your environment to use this implementation. If it's not installed, you can install it using `pip install pypandoc`.
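A runnable sketch of that call, assuming the pandoc binary itself is available (pypandoc is only a wrapper around it; the sample HTML is invented):

```python
import pypandoc

html = "<h1>Title</h1><p>Some <em>emphasized</em> text.</p>"
# The target format comes first; the source format is passed via format=.
markdown = pypandoc.convert_text(html, "md", format="html")
print(markdown)  # heading style in the output depends on the pandoc version
```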
2024-02-08 12:02:45 -08:00
Paul Gauthier
75e7f2bd4e Add html_to_markdown function to convert HTML to Markdown. 2024-02-08 12:02:42 -08:00
Paul Gauthier
681f26d010 Print playwright instructions after the content is displayed, so they are not lost 2024-02-08 12:01:18 -08:00
Paul Gauthier
e42005a78e plumbed through to /web 2024-02-08 11:47:28 -08:00
Paul Gauthier
5b78d929a6 Roughly working scraper 2024-02-08 11:44:55 -08:00
Paul Gauthier
9bf3a6e0c6 aider: Implemented HTTP request using httpx library in scrape_with_httpx method.
# Aider chat conversation:

USER: finish `scrape_with_httpx`

ASSISTANT: To finish the `scrape_with_httpx` method, I will implement the HTTP request using the `httpx` library. This will involve making a GET request to the provided URL and returning the content of the response.

Here is the diff for the changes:

```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
     def scrape_with_httpx(self, url):
-        pass
+        import httpx
+        headers = {
+            'User-Agent': aider_user_agent
+        }
+        try:
+            with httpx.Client(headers=headers) as client:
+                response = client.get(url)
+                response.raise_for_status()
+                return response.text
+        except httpx.HTTPError as http_err:
+            self.print_error(f'HTTP error occurred: {http_err}')
+        except Exception as err:
+            self.print_error(f'An error occurred: {err}')
+        return None
```

This implementation uses `httpx.Client` to send a GET request with a custom User-Agent header. It handles any HTTP errors by printing them out and returns `None` in case of an error. If the request is successful, it returns the text content of the response.
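The same pattern works as a standalone script; the user agent string below is an invented stand-in for `aider_user_agent`:

```python
import httpx

aider_user_agent = "Aider/0.x +https://aider.chat"  # illustrative stand-in

def scrape_with_httpx(url):
    headers = {"User-Agent": aider_user_agent}
    try:
        # The Client applies its default headers to every request it sends.
        with httpx.Client(headers=headers) as client:
            response = client.get(url)
            response.raise_for_status()  # raise on 4xx/5xx responses
            return response.text
    except httpx.HTTPError as http_err:
        print(f"HTTP error occurred: {http_err}")
    return None

text = scrape_with_httpx("https://example.com")
if text:
    print(text[:80])
```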
2024-02-08 11:29:46 -08:00
Paul Gauthier
e04187d1ad Refactor scraper to use Playwright for web scraping and handle user agent string. 2024-02-08 11:29:44 -08:00
Paul Gauthier
f10a9d2e7b aider: Retrieve the user agent from the browser context after creating a new page in the scrape_with_playwright function.
# Aider chat conversation:

USER: determine the user_agent

ASSISTANT: To determine the `user_agent` in the `scrape_with_playwright` function, we can use the `browser.new_context()` method which allows us to specify a `user_agent` string. We will set a common user agent string for a desktop browser.

Here is the diff for the changes:

```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
 def scrape_with_playwright(url):
     with sync_playwright()
```
^C KeyboardInterrupt

USER: I don't want to SET it. I want to determine what it is

ASSISTANT: To determine the `user_agent` being used by the browser in the `scrape_with_playwright` function, we can retrieve it from the browser context after creating a new page. Here is the diff for the changes:

```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
 def scrape_with_playwright(url):
     with sync_playwright() as p:
         try:
             browser = p.chromium.launch()
         except Exception as e:
             print(repr(e))
             return
+        page = browser.new_page()
+        user_agent = page.evaluate("navigator.userAgent")
+        print(f"User Agent: {user_agent}")
-        page = browser.new_page()
         page.goto(url)
         content = page.content()
         browser.close()

     return content
```
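As a standalone sketch, the same technique looks like this (requires the playwright package and a one-time `playwright install chromium`):

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    # Ask the browser itself, via JavaScript, what user agent it sends.
    user_agent = page.evaluate("navigator.userAgent")
    print(f"User Agent: {user_agent}")
    browser.close()
```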
2024-02-08 11:02:54 -08:00
Paul Gauthier
65111591d7 Add scraper module for data extraction. 2024-02-08 11:02:52 -08:00
Paul Gauthier
1758937042 wip 2024-02-08 07:18:39 -08:00
Paul Gauthier
ef12107bd7 aider: Add command /web to use headless selenium to scrape a webpage and add the content to the chat.
# Aider chat conversation:

USER: Add a command /web to use headless selenium to scrape a webpage and add the content to the chat

ASSISTANT: To implement the `/web` command, we will need to add a new method to the `Commands` class in the `aider/commands.py` file. This method will use Selenium with a headless browser to scrape the content of a given webpage and add the content to the chat.

Here are the diffs for the changes:

```diff
--- aider/commands.py
+++ aider/commands.py
@@ ... @@
 from pathlib import Path
+from selenium import webdriver
+from selenium.webdriver.chrome.options import Options
+from selenium.webdriver.chrome.service import Service
+from selenium.webdriver.common.by import By
+from webdriver_manager.chrome import ChromeDriverManager
 import re
 import subprocess
@@ ... @@
 class Commands:
     voice = None
+    web_driver = None

     def __init__(self, io, coder, voice_language=None):
         self.io = io
@@ ... @@
         self.voice_language = voice_language
         self.tokenizer = coder.main_model.tokenizer
+        self.initialize_web_driver()

+    def initialize_web_driver(self):
+        chrome_options = Options()
+        chrome_options.add_argument("--headless")
+        chrome_options.add_argument("--disable-gpu")
+        chrome_options.add_argument("--no-sandbox")
+        chrome_options.add_argument("--disable-dev-shm-usage")
+        self.web_driver = webdriver.Chrome(
+            service=Service(ChromeDriverManager().install()),
+            options=chrome_options
+        )
+
+    def cmd_web(self, args):
+        "Use headless selenium to scrape a webpage and add the content to the chat"
+        url = args.strip()
+        if not url:
+            self.io.tool_error("Please provide a URL to scrape.")
+            return
+
+        try:
+            self.web_driver.get(url)
+            page_content = self.web_driver.find_element(By.TAG_NAME, "body").text
+            self.io.tool_output(f"Content from {url}:\n{page_content}")
+            return page_content
+        except Exception as e:
+            self.io.tool_error(f"Error scraping {url}: {e}")
+
     def is_command(self, inp):
         return inp[0] in "/!"
@@ ... @@
```

This diff adds the necessary imports for Selenium and the WebDriver manager, initializes a headless Chrome WebDriver when the `Commands` class is instantiated, and adds a new command method `cmd_web` that takes a URL as an argument, scrapes the webpage, and outputs the content to the chat.
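A trimmed, standalone sketch of the same headless scraping flow, assuming the selenium and webdriver-manager packages are installed (the URL is illustrative):

```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")

driver = webdriver.Chrome(
    service=Service(ChromeDriverManager().install()),
    options=chrome_options,
)
try:
    driver.get("https://example.com")
    # Grab the rendered text of the page body, as cmd_web does.
    print(driver.find_element(By.TAG_NAME, "body").text[:200])
finally:
    driver.quit()
```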
2024-02-07 14:22:02 -08:00
Paul Gauthier
8d9c2ba60b Added links to openai docs/pricing 2024-02-07 14:17:24 -08:00
Paul Gauthier
49f81ac784 ssh 2024-02-07 12:13:32 -08:00
Paul Gauthier
35f812859e Rationalized openai model attributes #458 2024-02-07 12:07:30 -08:00
Paul Gauthier
873a388693 Fixed the ! shortcut for /run 2024-02-06 16:14:27 -08:00
Paul Gauthier
926e393529 set version to 0.23.1-dev 2024-02-03 08:40:50 -08:00
Paul Gauthier
b0fd819241 version bump to 0.23.0 2024-02-03 08:40:17 -08:00
Paul Gauthier
174702babe Updated gpt-4 turbo model switches 2024-02-03 08:39:19 -08:00
Paul Gauthier
e2e2ee83a4 Added support for gpt-4-0125-preview 2024-02-03 08:27:31 -08:00
Paul Gauthier
1317ee1ace Added --openrouter 2024-01-31 17:33:50 -08:00
Paul Gauthier
e79732dcf7 Added env_vars for all the OPENAI_ env vars 2024-01-31 15:26:53 -08:00
Paul Gauthier
065bc63e33 Merge remote-tracking branch 'origin/main' 2024-01-23 13:07:56 -08:00
Paul Gauthier
0c314144d2 Added /quit #451 2024-01-23 13:07:24 -08:00
paul-gauthier
247cab56f0
Merge pull request #453 from zackees/patch-1
fix https://github.com/paul-gauthier/aider/issues/452
2024-01-23 13:05:15 -08:00
Paul Gauthier
c31f2313ba updated docs 2024-01-23 12:58:29 -08:00
Paul Gauthier
3e96fda71a catch shutdown errors 2024-01-23 10:45:58 -08:00
Paul Gauthier
af4f224c00 cleanup 2024-01-23 10:02:07 -08:00
Paul Gauthier
580c52bd85 set mdargs on init 2024-01-23 09:58:57 -08:00
Paul Gauthier
da131da427 aider: Add __del__ method to MarkdownStream to call self.live.stop().
# Aider chat conversation:

USER: make sure self.live.stop() is called no matter how the object gets destroyed

ASSISTANT: To ensure that `self.live.stop()` is called no matter how the object gets destroyed, we can use Python's built-in `__del__` method. This method is called when an object is about to be destroyed. We can override this method in our `MarkdownStream` class to stop the `Live` instance.

Here are the steps to make this change:

1. Add a `__del__` method to the `MarkdownStream` class that calls `self.live.stop()`.

Here is the *SEARCH/REPLACE* block:

aider/mdstream.py
<source>python
<<<<<<< SEARCH
class MarkdownStream:
    live = None
    when = 0
    min_delay = 0.050
    live_window = 6

    def __init__(self):
        self.printed = []

        self.live = Live(Text(''), refresh_per_second=1./self.min_delay)
        self.live.start()
=======
class MarkdownStream:
    live = None
    when = 0
    min_delay = 0.050
    live_window = 6

    def __init__(self):
        self.printed = []

        self.live = Live(Text(''), refresh_per_second=1./self.min_delay)
        self.live.start()

    def __del__(self):
        self.live.stop()
>>>>>>> REPLACE
</source>
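Outside aider, a minimal sketch of the pattern. Note that `__del__` fires when the object is garbage collected but is not guaranteed to run at interpreter shutdown, which is why a context manager or try/finally is the more deterministic way to release resources:

```python
class Resource:
    def __init__(self):
        self.active = True
        print("started")

    def stop(self):
        # Guard against double-stop if __del__ runs after an explicit stop().
        if self.active:
            self.active = False
            print("stopped")

    def __del__(self):
        self.stop()

r = Resource()
del r  # drops the last reference, so __del__ calls stop()
```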
2024-01-23 09:50:27 -08:00
Paul Gauthier
24f1e01177 Refactored MarkdownStream to use rich.live for real-time updates and improved text rendering. 2024-01-23 09:50:23 -08:00
Paul Gauthier
b143bc56ac switch to mdstream 2024-01-23 09:20:31 -08:00