Roughly working scraper

Paul Gauthier 2024-02-08 11:44:55 -08:00
parent 9bf3a6e0c6
commit 5b78d929a6
6 changed files with 77 additions and 208 deletions

@@ -77,6 +77,28 @@ Put a line in it like this to specify your api key:
openai-api-key: sk-...
```
## Enable Playwright
Aider supports adding web pages to the chat with the `/web <url>` command.
When you add a URL to the chat, aider fetches the page and scrapes its
content.
By default, aider uses the `httpx` library to scrape web pages, but this only
works on a subset of web pages.
Some sites explicitly block requests from tools like httpx.
Others rely heavily on JavaScript to render the page content,
which httpx cannot do because it does not execute scripts.
Aider can scrape the widest range of web pages if you install
Playwright's Chromium browser and its dependencies:
```
playwright install --with-deps chromium
```
See the
[Playwright for Python documentation](https://playwright.dev/python/docs/browsers#install-system-dependencies)
for additional information.
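
To illustrate the difference between the two approaches, here is a minimal
sketch of a scraper that tries a Playwright-driven Chromium first and falls
back to `httpx`. It is not aider's actual implementation: the function names
(`fetch_with_httpx`, `fetch_with_playwright`, `scrape`), the fallback order,
and the 10-second timeout are all assumptions made for this example.

```python
from typing import Callable, Iterable, Optional


def fetch_with_httpx(url: str) -> Optional[str]:
    """Plain HTTP fetch. Returns None if the request fails or is rejected."""
    import httpx  # imported lazily so the sketch loads without httpx installed

    try:
        response = httpx.get(url, follow_redirects=True, timeout=10.0)
        response.raise_for_status()
    except httpx.HTTPError:
        return None
    # No scripts are executed, so JavaScript-rendered content is missing here.
    return response.text


def fetch_with_playwright(url: str) -> Optional[str]:
    """Fetch via headless Chromium. Returns None if Playwright is unavailable."""
    try:
        from playwright.sync_api import Error as PlaywrightError
        from playwright.sync_api import sync_playwright
    except ImportError:
        return None

    try:
        with sync_playwright() as p:
            browser = p.chromium.launch()
            try:
                page = browser.new_page()
                page.goto(url)
                # content() returns the HTML after the page has loaded.
                return page.content()
            finally:
                browser.close()
    except PlaywrightError:
        # Covers a missing browser install, navigation errors, and timeouts.
        return None


def scrape(
    url: str,
    fetchers: Iterable[Callable[[str], Optional[str]]] = (
        fetch_with_playwright,
        fetch_with_httpx,
    ),
) -> Optional[str]:
    """Hypothetical helper: return HTML from the first fetcher that succeeds."""
    for fetch in fetchers:
        html = fetch(url)
        if html:
            return html
    return None
```

Calling `scrape("https://example.com")` returns the page HTML, or `None` if
every fetcher fails. Passing the fetchers as a parameter keeps the fallback
order easy to change and lets the logic be exercised without network access.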
## Enable voice coding (optional)