Commit graph

3475 commits

Author SHA1 Message Date
Paul Gauthier
265e71010f Make assets/ reference absolute 2024-02-11 07:58:25 -08:00
Paul Gauthier
ae3373eb89 fixed redirect 2024-02-11 07:57:50 -08:00
Paul Gauthier
e630e51668 better excerpt logic; fixed svg ref 2024-02-11 07:55:44 -08:00
Paul Gauthier
d7476bb303 aider: Update blog index to display post excerpt if available, otherwise use the first 100 words.
# Aider chat conversation:

USER: use the excerpt if one exists, otherwise keep using the first 100 words

ASSISTANT: ```diff
--- blog/index.html
+++ blog/index.html
@@ ... @@
 <div class="entry">
-    {{ post.content | strip_html | truncatewords: 100 }}...
+    {{ post.excerpt | default: post.content | strip_html | truncatewords: 100 }}...
 </div>
```
2024-02-11 07:52:29 -08:00
Paul Gauthier
a3f9f6c5c6 aider: Added an excerpt for the speed benchmarks report.
# Aider chat conversation:

USER: add an excerpt

ASSISTANT: ```diff
--- _posts/2023-11-06-benchmarks-speed-1106.md
+++ _posts/2023-11-06-benchmarks-speed-1106.md
@@ ... @@
 ---
 title: Speed benchmarks of GPT-4 Turbo and gpt-3.5-turbo-1106
+excerpt: This report provides a detailed comparison of the speed of GPT-4 Turbo and gpt-3.5-turbo-1106 models based on the aider benchmarking suite.
 redirect_from:
   - /docs/benchmarks-speed-1106.md
 ---
```
2024-02-11 07:51:38 -08:00
Paul Gauthier
81ad2f18fe moved benchmarks-speed-1106 article 2024-02-11 07:50:18 -08:00
Paul Gauthier
2f8109ece1 copy 2024-02-11 06:36:47 -08:00
Paul Gauthier
a6f8d7ec6f aider: Add the redirect plugin to the _config.yml.
# Aider chat conversation:

USER: add the redirect plugin

ASSISTANT: To add the Jekyll Redirect From plugin to your `_config.yml`, you need to add the plugin to the `plugins` list. Here's the diff for the change:

```diff
--- _config.yml
+++ _config.yml
@@ ... @@
 theme: jekyll-theme-cayman
 url: "https://aider.chat"
+plugins:
+  - jekyll-redirect-from
+
 defaults:
   - scope:
       path: "README.md"
       type: "pages"
     values:
       description: "A command-line chat tool for coding with GPT"
```
2024-02-11 06:34:01 -08:00
Paul Gauthier
c8a732dde2 copy 2024-02-11 06:31:57 -08:00
Paul Gauthier
0904e8e1a0 copy 2024-02-11 06:29:40 -08:00
Paul Gauthier
c95a4ae171 copy 2024-02-11 06:28:16 -08:00
Paul Gauthier
027faa384d whitespace 2024-02-11 06:21:11 -08:00
Paul Gauthier
f948c855ac renamed 2024-02-11 06:18:35 -08:00
Paul Gauthier
ac22c314d1 use first 100 words for excerpt 2024-02-11 06:16:21 -08:00
Paul Gauthier
69c32c2903 aider: Add frontmatter for 301 redirect from old URL to new URL.
# Aider chat conversation:

USER: this post used to be live at /docs/aider-20.html. add frontmatter so that that old url does a 301 redirect to here.

ASSISTANT: ```diff
--- _posts/2024-02-11-aider-20.html
+++ _posts/2024-02-11-aider-20.html
@@ ... @@
+
+---
+redirect_from:
+  - /docs/aider-20.html
+---
+
 # Aider v0.20.0 is live

 Aider v0.20.0 is now available in pypi.
```
2024-02-11 06:15:54 -08:00
Paul Gauthier
f317915e8a Added new blog post for February 11, 2024. 2024-02-11 06:15:53 -08:00
Paul Gauthier
3fb698cdca added blog 2024-02-11 06:07:29 -08:00
Paul Gauthier
69d1107b31 set version to 0.24.2-dev 2024-02-11 05:51:40 -08:00
Paul Gauthier
d26693e636 version bump to 0.24.1 2024-02-11 05:51:09 -08:00
Paul Gauthier
19d1a57e5b Updated HISTORY 2024-02-11 05:49:30 -08:00
Paul Gauthier
72e6cb901c Fixed bug with cost computations when --no-stream in effect #475 2024-02-11 05:48:32 -08:00
Paul Gauthier
a1a2cf7ac2 Added pipx faq entry 2024-02-10 10:40:12 -08:00
Paul Gauthier
0fa2505ac5 Delete pandoc installer 2024-02-10 08:48:22 -08:00
Paul Gauthier
3d948788dd set version to 0.24.1-dev 2024-02-10 07:35:36 -08:00
Paul Gauthier
08a22b9cf7 version bump to 0.24.0 2024-02-10 07:35:02 -08:00
Paul Gauthier
1e70b3fb05 Updated HISTORY 2024-02-10 07:34:21 -08:00
Paul Gauthier
d376767ca3 Use gpt-3.5-turbo-0125 for commit messages 2024-02-10 07:32:26 -08:00
Paul Gauthier
6ddfc894e7 Updated HISTORY 2024-02-10 07:31:04 -08:00
Paul Gauthier
bdef4308fe Simpler calls to pypandoc 2024-02-08 16:14:17 -08:00
Paul Gauthier
efff174f9a Use download_pandoc, which works everywhere including arm64 2024-02-08 15:56:00 -08:00
Paul Gauthier
2dee76378b keep hrefs 2024-02-08 15:19:00 -08:00
Paul Gauthier
62c67a1e23 copy 2024-02-08 14:41:51 -08:00
Paul Gauthier
115bf10978 Added /web to commands doc 2024-02-08 14:39:18 -08:00
Paul Gauthier
50b6a00331 bs4 -> beautifulsoup4 2024-02-08 14:29:56 -08:00
Paul Gauthier
05c74202ec copy 2024-02-08 14:20:49 -08:00
Paul Gauthier
cc36329691 output with pandoc, cleanup with bs and re 2024-02-08 14:14:42 -08:00
Paul Gauthier
be60b785a4 aider: Remove all data: URLs from slimdown_html.
# Aider chat conversation:

USER: in `slimdown_html` remove all data: urls

ASSISTANT: ```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
 def slimdown_html(soup):
     # Remove all <img> tags
     for img in soup.find_all('img'):
         img.decompose()
+    # Remove all elements with data: URLs
+    for tag in soup.find_all(href=lambda x: x and x.startswith('data:')):
+        tag.decompose()
+    for tag in soup.find_all(src=lambda x: x and x.startswith('data:')):
+        tag.decompose()
     # Remove all per-element CSS styles
     for tag in soup.find_all(True):
         tag.attrs.pop('style', None)
     # Remove all internal anchor elements
     for anchor in soup.find_all('a', href=True):
         if anchor['href'].startswith('#'):
             anchor.decompose()
     return soup
```
2024-02-08 12:12:03 -08:00
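For reference, a minimal standalone sketch of the slimdown_html helper as it reads after this hunk is applied on top of the initial implementation in the commit further down: it takes and returns a BeautifulSoup object. The bs4 import and the sample HTML in the __main__ guard are illustrative assumptions, not repo code.

```python
# Hedged sketch of slimdown_html after both commits; sample HTML is illustrative.
from bs4 import BeautifulSoup


def slimdown_html(soup):
    # Remove all <img> tags
    for img in soup.find_all("img"):
        img.decompose()
    # Remove any element whose href or src is a data: URL
    for tag in soup.find_all(href=lambda x: x and x.startswith("data:")):
        tag.decompose()
    for tag in soup.find_all(src=lambda x: x and x.startswith("data:")):
        tag.decompose()
    # Strip per-element CSS styles
    for tag in soup.find_all(True):
        tag.attrs.pop("style", None)
    # Drop internal anchor links (href="#...")
    for anchor in soup.find_all("a", href=True):
        if anchor["href"].startswith("#"):
            anchor.decompose()
    return soup


if __name__ == "__main__":
    html = '<p style="color:red"><img src="x.png"><a href="#top">top</a>keep</p>'
    print(slimdown_html(BeautifulSoup(html, "html.parser")))
```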
Paul Gauthier
84ae51c005 Add functionality to write scraped content to a temporary file and optimize HTML parsing. 2024-02-08 12:11:59 -08:00
Paul Gauthier
33a62c8dc6 aider: finish slimdown_html. remove all <img> tags and all per-element css-styles and all internal anchor elements
# Aider chat conversation:

USER: finish `slimdown_html`. remove all <img> tags and all per-element css-styles and all internal anchor elements

ASSISTANT: ```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
-def slimdown_html(page_source: str) -> str:
-    soup = BeautifulSoup(page_source, "html.parser")
-    # ...
+def slimdown_html(page_source: str) -> str:
+    soup = BeautifulSoup(page_source, "html.parser")
+    # Remove all <img> tags
+    for img in soup.find_all('img'):
+        img.decompose()
+    # Remove all per-element CSS styles
+    for tag in soup.find_all(True):
+        tag.attrs.pop('style', None)
+    # Remove all internal anchor elements
+    for anchor in soup.find_all('a', href=True):
+        if anchor['href'].startswith('#'):
+            anchor.decompose()
+    return str(soup)
```
2024-02-08 12:10:12 -08:00
Paul Gauthier
1520bb976a Change html_to_text to html_to_markdown and add slimdown_html function. 2024-02-08 12:10:10 -08:00
Paul Gauthier
1509e21698 aider: Implemented html_to_markdown using pypandoc.
# Aider chat conversation:

USER: use pypandoc to implement `html_to_markdown`

ASSISTANT: To implement `html_to_markdown` using `pypandoc`, I will:

1. Add an import for `pypandoc`.
2. Implement the `html_to_markdown` function using `pypandoc.convert_text`.

Here are the diffs for those changes:

```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
 import sys
+import pypandoc
 import httpx
 from bs4 import BeautifulSoup
 from playwright.sync_api import sync_playwright
@@ ... @@
 def html_to_markdown(page_source: str) -> str:
-    pass
+    return pypandoc.convert_text(page_source, 'md', format='html')
```

Please ensure that `pypandoc` is installed in your environment to use this implementation. If it's not installed, you can install it using `pip install pypandoc`.
2024-02-08 12:02:45 -08:00
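A minimal, self-contained sketch of the pypandoc-based conversion described in this commit; the sample HTML is an illustrative assumption. Note that `pip install pypandoc` ships only the wrapper, so the pandoc binary still needs to be available on the system.

```python
# Hedged sketch of converting scraped HTML to markdown with pypandoc.
# Requires the pandoc binary to be installed and on the PATH.
import pypandoc


def html_to_markdown(page_source: str) -> str:
    return pypandoc.convert_text(page_source, "md", format="html")


if __name__ == "__main__":
    print(html_to_markdown("<h1>Hello</h1><p>Some <em>scraped</em> text.</p>"))
```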
Paul Gauthier
75e7f2bd4e Add html_to_markdown function to convert HTML to Markdown. 2024-02-08 12:02:42 -08:00
Paul Gauthier
681f26d010 Print playwright instructions after the content is displayed, so they are not lost 2024-02-08 12:01:18 -08:00
Paul Gauthier
e42005a78e plumbed through to /web 2024-02-08 11:47:28 -08:00
Paul Gauthier
5b78d929a6 Roughly working scraper 2024-02-08 11:44:55 -08:00
Paul Gauthier
9bf3a6e0c6 aider: Implemented HTTP request using httpx library in scrape_with_httpx method.
# Aider chat conversation:

USER: finish `scrape_with_httpx`

ASSISTANT: To finish the `scrape_with_httpx` method, I will implement the HTTP request using the `httpx` library. This will involve making a GET request to the provided URL and returning the content of the response.

Here is the diff for the changes:

```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
     def scrape_with_httpx(self, url):
-        pass
+        import httpx
+        headers = {
+            'User-Agent': aider_user_agent
+        }
+        try:
+            with httpx.Client(headers=headers) as client:
+                response = client.get(url)
+                response.raise_for_status()
+                return response.text
+        except httpx.HTTPError as http_err:
+            self.print_error(f'HTTP error occurred: {http_err}')
+        except Exception as err:
+            self.print_error(f'An error occurred: {err}')
+        return None
```

This implementation uses `httpx.Client` to send a GET request with a custom User-Agent header. It handles any HTTP errors by printing them out and returns `None` in case of an error. If the request is successful, it returns the text content of the response.
2024-02-08 11:29:46 -08:00
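The same fetch, written as a standalone function for easier testing. The aider_user_agent value is borrowed from the "Set the user agent" commit below, and the plain print() error reporting stands in for self.print_error; both are assumptions for illustration, since in the repo this is a method on the scraper class.

```python
# Hedged sketch: standalone version of the httpx fetch with a custom User-Agent.
import httpx

aider_user_agent = "Aider v0.24.0-dev"  # illustrative value from a later commit


def scrape_with_httpx(url):
    headers = {"User-Agent": aider_user_agent}
    try:
        with httpx.Client(headers=headers) as client:
            response = client.get(url)
            response.raise_for_status()
            return response.text
    except httpx.HTTPError as http_err:
        print(f"HTTP error occurred: {http_err}")
    except Exception as err:
        print(f"An error occurred: {err}")
    return None


if __name__ == "__main__":
    print(scrape_with_httpx("https://example.com"))
```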
Paul Gauthier
e04187d1ad Refactor scraper to use Playwright for web scraping and handle user agent string. 2024-02-08 11:29:44 -08:00
Paul Gauthier
f10a9d2e7b aider: Retrieve the user agent from the browser context after creating a new page in the scrape_with_playwright function.
# Aider chat conversation:

USER: determine the user_agent

ASSISTANT: To determine the `user_agent` in the `scrape_with_playwright` function, we can use the `browser.new_context()` method which allows us to specify a `user_agent` string. We will set a common user agent string for a desktop browser.

Here is the diff for the changes:

```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
 def scrape_with_playwright(url):
     with sync_playwright()
```
^C KeyboardInterrupt

USER: I don't want to SET it. I want to determine what it is

ASSISTANT: To determine the `user_agent` being used by the browser in the `scrape_with_playwright` function, we can retrieve it from the browser context after creating a new page. Here is the diff for the changes:

```diff
--- aider/scrape.py
+++ aider/scrape.py
@@ ... @@
 def scrape_with_playwright(url):
     with sync_playwright() as p:
         try:
             browser = p.chromium.launch()
         except Exception as e:
             print(repr(e))
             return
+        page = browser.new_page()
+        user_agent = page.evaluate("navigator.userAgent")
+        print(f"User Agent: {user_agent}")
-        page = browser.new_page()
         page.goto(url)
         content = page.content()
         browser.close()

     return content
```
2024-02-08 11:02:54 -08:00
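Putting the pieces of this conversation together, a minimal sketch of scrape_with_playwright that reports the browser's own user agent via page.evaluate(); the example URL is an assumption, and Chromium must already be installed (playwright install chromium).

```python
# Hedged sketch assembled from the hunks above; not verbatim repo code.
from playwright.sync_api import sync_playwright


def scrape_with_playwright(url):
    with sync_playwright() as p:
        try:
            browser = p.chromium.launch()
        except Exception as e:
            print(repr(e))
            return None
        page = browser.new_page()
        # Ask the browser itself which user agent string it is sending
        user_agent = page.evaluate("navigator.userAgent")
        print(f"User Agent: {user_agent}")
        page.goto(url)
        content = page.content()
        browser.close()
    return content


if __name__ == "__main__":
    html = scrape_with_playwright("https://example.com")
    print(html[:200] if html else "no content")
```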
Paul Gauthier
65111591d7 Add scraper module for data extraction. 2024-02-08 11:02:52 -08:00
Paul Gauthier
e2acc028d0 aider: Set the user agent to 'Aider v0.24.0-dev'.
# Aider chat conversation:

USER: set the user agent to 'Aider v0.24.0-dev'

ASSISTANT: ```diff
--- playw.py
+++ playw.py
@@ ... @@
 def main(url):
     # Check if Chromium is installed, if not, install it
     with sync_playwright() as p:
-        p.chromium.launch()
+        browser = p.chromium.launch(user_agent='Aider v0.24.0-dev')
+        page = browser.new_page()
+        page.goto(url)
+        content = page.content()
+        print(content)
+        browser.close()

 if __name__ == "__main__":
     if len(sys.argv) < 2:
```

USER: UnifiedDiffNoMatch: hunk failed to apply!

playw.py does not contain lines that match the diff you provided!
Try again.
DO NOT skip blank lines, comments, docstrings, etc!
The diff needs to apply cleanly to the lines in playw.py!

playw.py does not contain these 7 exact lines in a row:
```
def main(url):
    # Check if Chromium is installed, if not, install it
    with sync_playwright() as p:
        p.chromium.launch()

if __name__ == "__main__":
    if len(sys.argv) < 2:
```

ASSISTANT: I apologize for the oversight. Here is the corrected diff:

```diff
--- playw.py
+++ playw.py
@@ ... @@
     with sync_playwright() as p:
-        browser = p.chromium.launch()
+        browser = p.chromium.launch(user_agent='Aider v0.24.0-dev')
         page = browser.new_page()
         page.goto(url)
         #page.wait_for_load_state('networkidle')
         content = page.content()
         print(content)
         browser.close()
```
2024-02-08 07:59:12 -08:00
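Worth noting: as the first assistant reply above mentions, Playwright's Python API takes user_agent as an option on browser.new_context() / browser.new_page() rather than on launch(), so a working version of this playw.py change would look roughly like the sketch below. The usage message and URL handling are assumptions for illustration.

```python
# Hedged sketch: pass user_agent to new_page() (a context option), not launch().
import sys

from playwright.sync_api import sync_playwright


def main(url):
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(user_agent="Aider v0.24.0-dev")
        page.goto(url)
        print(page.content())
        browser.close()


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: playw.py <url>")
        sys.exit(1)
    main(sys.argv[1])
```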