Commit graph

44 commits

Author SHA1 Message Date
Paul Gauthier
cbcbff341b style: format voice.py and suppress syntax warnings 2024-12-08 18:14:30 -08:00
Philippe de Reynal
e11faadf39 feat: ability to select audio input device 2024-11-30 11:24:34 +01:00
Paul Gauthier
3edcd71a44 fix: move pydub import to avoid warnings 2024-09-24 09:29:00 -07:00
Paul Gauthier (aider)
ab786279e6 style: Fix linter warnings 2024-09-24 09:28:31 -07:00
Paul Gauthier (aider)
e267dc13af fix: Move imports to top of file 2024-09-24 09:28:28 -07:00
Paul Gauthier (aider)
d0e6dc2c1e style: Wrap long lines in voice.py 2024-09-24 09:28:16 -07:00
Paul Gauthier (aider)
da752bb00c fix: Suppress warning about missing ffmpeg or avconv 2024-09-24 09:28:13 -07:00
Mike Bailey
1cc30a22f9 feat: Option to compress audio files by ~90%
Add option to reduce bandwidth (and potentially latency) by converting
voice recordings (wav) into a compressed audio format (webm or mp3).

Default behaviour is unchanged.

> File uploads are currently limited to 25 MB and the following input file
> types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm.
>
> - https://platform.openai.com/docs/guides/speech-to-text
2024-09-22 22:20:07 +10:00
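
As a rough sketch of the idea in this commit rather than its actual diff, a wav recording could be re-encoded with pydub before upload; the helper name, default format, and bitrate below are hypothetical choices, and ffmpeg or avconv must be available on the PATH.

```python
# Hedged sketch: shrink a wav voice recording into a compressed format
# (mp3 here, webm is another option) before sending it for transcription.
# Requires pydub plus an ffmpeg/avconv install.
from pydub import AudioSegment

def compress_recording(wav_path, out_format="mp3"):
    """Re-encode a wav file as mp3 (or webm) and return the new path."""
    audio = AudioSegment.from_wav(wav_path)
    out_path = wav_path.rsplit(".", 1)[0] + "." + out_format
    audio.export(out_path, format=out_format, bitrate="32k")  # bitrate is illustrative
    return out_path
```
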
Paul Gauthier
aedba59c6e feat: add error handling for transcription in Voice class 2024-09-03 08:59:58 -07:00
Paul Gauthier (aider)
130579ea3c style: Format error message in Voice class for better readability 2024-08-30 07:04:13 -07:00
Paul Gauthier (aider)
e88ec36a81 feat: improve error handling for audio device issues 2024-08-30 07:04:09 -07:00
Paul Gauthier
c598cdd576 Catch self.sd.PortAudioError gracefully #849 2024-07-13 07:57:12 +01:00
Paul Gauthier
9d02628cf8 streamlit borks sys.path, causes import("litellm") to load our litellm.py; fix 2024-07-03 21:32:50 -03:00
Paul Gauthier
ed35af44b3 defer numpy, bs4 and jsonschema 2024-07-03 13:35:33 -03:00
Paul Gauthier
73c6af8851 aider: Implemented a more flexible approach to handle different microphone sample rates in aider/voice.py.
# Aider chat conversation:

USER: https://github.com/paul-gauthier/aider/issues/612

https://github.com/paul-gauthier/aider/issues/612:

# Unsupported Audio Sample Rate #612

Open

[aaronson2012](/aaronson2012) opened this issue May 24, 2024 · 0 comments

Labels: [enhancement](/paul-gauthier/aider/labels/enhancement) New feature or request

## Comments

**[aaronson2012](/aaronson2012)** commented [May 24, 2024](#issue-2314049815)

### Issue

aider was crashing when I ran the /voice command, and I realized that my computer's microphone did not support a sample rate of 16kHz. It was an easy fix for me, I just changed the sample_rate variable to one my mic supported, but this isn't a very elegant solution. Could there be a better way so all mics are supported?

### Version and model info

Latest version

[paul-gauthier](/paul-gauthier) added the [enhancement](/paul-gauthier/aider/labels/enhancement) label [Jun 4, 2024](#event-13039727805)

ASSISTANT: Based on the issue description, I believe the following file is most likely to need changes to support different microphone sample rates:

aider/voice.py

This file contains the Voice class which handles recording and transcribing audio. The sample_rate variable is hardcoded in this file.

To make the suggested change, we would need to modify how the sample rate is determined in aider/voice.py.

Please add aider/voice.py to the chat so I can propose the specific code changes needed. Let me know if you would like me to explain my reasoning further or if you have any other questions!

USER: I added these files to the chat: aider/voice.py.

If you need to propose edits to other existing files not already added to the chat, you *MUST* tell me their full path names and ask me to *add the files to the chat*. End your reply and wait for my approval. You can keep asking if you then decide you need to edit more files.

ASSISTANT: Here are the proposed changes to aider/voice.py to support different microphone sample rates:

aider/voice.py
```python
<<<<<<< SEARCH
        sample_rate = 16000  # 16kHz
=======
        try:
            sample_rate = int(self.sd.query_devices(None, 'input')['default_samplerate'])
        except (TypeError, ValueError):
            sample_rate = 16000  # fallback to 16kHz if unable to query device
>>>>>>> REPLACE
```

This change:

1. Attempts to query the default sample rate of the input device using sounddevice's `query_devices()` function.
2. If successful, it uses that sample rate.
3. If it fails to get a valid sample rate, it falls back to the original 16kHz.

This should allow aider to automatically adapt to the microphone's supported sample rate in most cases, while still having a reasonable fallback.

Let me know if you would like me to explain or modify the changes further.
2024-06-04 10:27:22 -07:00
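
The fallback logic proposed above can be exercised on its own; this small check assumes the sounddevice package is installed and that the machine has a default input device.

```python
# Query the default input device's native sample rate, falling back to 16kHz,
# mirroring the SEARCH/REPLACE block in the conversation above.
import sounddevice as sd

try:
    device_info = sd.query_devices(None, "input")  # default input device
    sample_rate = int(device_info["default_samplerate"])
except (TypeError, ValueError):
    sample_rate = 16000  # fall back to 16kHz if the device can't be queried

print(f"Would record at {sample_rate} Hz")
```
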
Paul Gauthier
9ff6770a04 refactored litellm to avoid duplicating workarounds 2024-05-08 08:05:15 -07:00
Paul Gauthier
01cf038bb5 Quiet litellm 2024-04-23 10:37:43 -07:00
Paul Gauthier
f1c09ececf Switched voice to litellm 2024-04-18 14:13:26 -07:00
Paul Gauthier
c770fc4380 cleaned up client refs 2024-04-17 15:47:07 -07:00
Paul Gauthier
6ebc142377 roughed in openai 1.x 2023-12-05 07:37:05 -08:00
Paul Gauthier
22eb344192 disable /voice if import soundfile fails #274 2023-10-12 07:00:51 -07:00
Paul Gauthier
963bc83a9c Late-bind importing sounddevice, as it was slowing down aider startup 2023-08-19 15:18:46 -07:00
Paul Gauthier
c07013abf4 finished roughing in --voice-language 2023-08-14 14:18:37 -03:00
Paul Gauthier
bddcdd39a8 Refactor record_and_transcribe method in voice.py to accept a language parameter. 2023-08-14 14:11:31 -03:00
Paul Gauthier
23579d4741 Refactor voice recording and transcription to allow for specifying the language. 2023-08-14 14:11:03 -03:00
Paul Gauthier
fcb209c5aa refac 2023-08-11 11:16:31 -03:00
Paul Gauthier
b6b271b70c pass the input history as a prompt to whisper 2023-08-11 11:03:37 -03:00
Paul Gauthier
c38a877d81 clip low volumes to 0 2023-08-11 10:33:20 -03:00
Paul Gauthier
b59e5bf298 copy 2023-08-10 22:27:18 -03:00
Paul Gauthier
24f574a7dc better bar 2023-08-10 22:21:39 -03:00
Paul Gauthier
bfa7fedd18 catch ^c 2023-08-10 22:18:41 -03:00
Paul Gauthier
053751cb43 show loudness bar 2023-08-10 22:17:31 -03:00
Paul Gauthier
4dc704272e aider: Updated callback function to calculate the root mean square (RMS) of the input audio data.
# Aider chat conversation:

USER: what is `indata` ?

ASSISTANT: `indata` is a parameter in the callback function defined within the `record_and_transcribe` function in the `aider/voice.py` file. This parameter represents the input audio data that is being recorded.

In the context of the `sounddevice` library, which is used for audio recording in this script, `indata` is a 2D numpy array containing the recorded audio data. The rows in the array represent individual frames, while the columns represent channels.

In this specific script, the audio is being recorded in mono (single channel), so there will be only one column. The data in `indata` is being put into a queue and also being dumped for debugging purposes.

USER: how can i estimate the avg volume based on indata?

ASSISTANT: To estimate the average volume of the audio data, you can calculate the root mean square (RMS) of the `indata` array. The RMS represents the square root of the average of the squares of the values, and it's a common way to measure the magnitude of a varying quantity, such as audio signals.

Here's how you can modify the `record_and_transcribe` function to calculate the RMS:

```python
aider/voice.py
<<<<<<< HEAD
    def callback(indata, frames, time, status):
        """This is called (from a separate thread) for each audio block."""
        q.put(indata.copy())
        dump(indata)
        dump(indata.mean())
=======
    import numpy as np

    def callback(indata, frames, time, status):
        """This is called (from a separate thread) for each audio block."""
        q.put(indata.copy())
        rms = np.sqrt(np.mean(indata**2))
        dump(rms)
>>>>>>> updated
```

In this change, we import the numpy library and replace the line `dump(indata.mean())` with `rms = np.sqrt(np.mean(indata**2))` and `dump(rms)`. This calculates the RMS of the `indata` and dumps it for debugging purposes.
2023-08-10 21:57:42 -03:00
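
For reference, here is the RMS idea from this conversation applied to a made-up block of mono samples, with a crude text bar of the kind the later "show loudness bar" commit adds; the sample values and scaling factor are invented for illustration.

```python
# Compute the RMS loudness of one audio block and render a rough meter.
import numpy as np

block = np.array([[0.02], [-0.01], [0.04], [-0.03]])  # pretend mono frames
rms = float(np.sqrt(np.mean(block ** 2)))             # average block loudness
bar = "#" * int(min(rms * 10, 1.0) * 40)              # arbitrary scaling to 0..40 chars
print(f"rms={rms:.4f} |{bar}")
```
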
Paul Gauthier
aba2bb610d proper temp file 2023-08-10 21:54:04 -03:00
Paul Gauthier
4812d17372 prettier 2023-08-10 17:12:19 -03:00
Paul Gauthier
7b993a9bbb gracefully handle missing portaudio 2023-08-10 16:55:45 -03:00
Paul Gauthier
fed24aa444 lint 2023-08-10 16:53:34 -03:00
Paul Gauthier
d74a3ce123 Refactor record_and_transcribe function to remove unnecessary api_key parameter and import dump module. 2023-08-10 16:13:23 -03:00
Paul Gauthier
9b526d51e4 Refactored voice recording to use a queue and write audio to a file. 2023-08-10 16:10:02 -03:00
Paul Gauthier
1f42b0839f aider: Fixed issue with passing the callback argument to the sd.rec function in record_and_transcribe function.
# Aider chat conversation:

USER: I ran this command:

 python -m aider.voice

And got this output:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/gauthier/Projects/aider/aider/voice.py", line 42, in <module>
    print(record_and_transcribe(api_key))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/aider/voice.py", line 22, in record_and_transcribe
    recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, callback=on_key_press)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/.venv/lib/python3.11/site-packages/sounddevice.py", line 276, in rec
    ctx.start_stream(InputStream, samplerate, ctx.input_channels,
TypeError: _CallbackContext.start_stream() got multiple values for argument 'callback'
Recording started, press any key to stop...

ASSISTANT: It seems like there's an issue with the way the `callback` argument is being passed to the `sd.rec` function in the `record_and_transcribe` function in the `aider/voice.py` file.

The `sd.rec` function is a convenience function that internally creates an instance of `sd.InputStream` and starts recording. The `callback` argument should be passed to the `sd.InputStream` constructor, not the `sd.rec` function.

Here's how we can fix this:

1. Modify the `aider/voice.py` file.

```python
aider/voice.py
<<<<<<< HEAD
    recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, callback=on_key_press)
=======
    # Create an instance of InputStream with the callback
    stream = sd.InputStream(samplerate=sample_rate, channels=1, callback=on_key_press)
    stream.start()
    recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1)
>>>>>>> updated
```

This change creates an instance of `sd.InputStream` with the `on_key_press` callback and starts the stream before calling `sd.rec`. This should ensure that the callback is correctly passed to the `InputStream` and not the `rec` function.
2023-08-10 15:49:26 -03:00
Paul Gauthier
c7d23f1b5a aider: Moved the api_key setting into the main function and pulled it from os.environ.
# Aider chat conversation:

USER: mov the `api_key` setting into main, and pull it from os.environ

ASSISTANT: To move the `api_key` setting into the main function and pull it from `os.environ`, we need to modify the `aider/voice.py` file.

Here are the changes:

```python
aider/voice.py
<<<<<<< HEAD
def record_and_transcribe():
    # Set your OpenAI API key
    openai.api_key = 'your-api-key'
=======
import os

def record_and_transcribe(api_key):
>>>>>>> updated
```

```python
aider/voice.py
<<<<<<< HEAD
if __name__ == "__main__":
    print(record_and_transcribe())
=======
if __name__ == "__main__":
    api_key = os.getenv('OPENAI_API_KEY')
    if not api_key:
        raise ValueError("Please set the OPENAI_API_KEY environment variable.")
    print(record_and_transcribe(api_key))
>>>>>>> updated
```

These changes remove the hard-coded API key from the `record_and_transcribe` function and instead pass it as an argument. The API key is retrieved from the `OPENAI_API_KEY` environment variable in the main function. If the environment variable is not set, an error is raised.
2023-08-10 15:47:18 -03:00
Paul Gauthier
c0f6cf6977 aider: Added main function to call record_and_transcribe if the script is run directly.
# Aider chat conversation:

USER: add a main func to call this if the script is run directly

ASSISTANT: To add a main function that calls `record_and_transcribe` if the script is run directly, we need to modify the `aider/voice.py` file.

Here are the changes:

```python
aider/voice.py
<<<<<<< HEAD
    # Return the transcription
    return response['choices'][0]['text']
=======
    # Return the transcription
    return response['choices'][0]['text']

if __name__ == "__main__":
    print(record_and_transcribe())
>>>>>>> updated
```

This change adds a check to see if the script is being run directly (i.e., not imported as a module). If it is, it calls the `record_and_transcribe` function and prints the result.
2023-08-10 15:46:31 -03:00
Paul Gauthier
1053d1781c aider: Added a function record_and_transcribe to aider/voice.py that records audio until the user presses a key, then submits it to the Whisper API to get the text back.
# Aider chat conversation:

USER: what file formats does the openai whisper api accept?

ASSISTANT: The OpenAI Whisper ASR API accepts audio data in the following formats:

- FLAC: Free Lossless Audio Codec. This is a lossless audio format that doesn't lose any data during compression. It's the recommended format for best performance.
- WAV: Waveform Audio File Format. This is a standard audio format used mainly in Windows PCs.
- MULAW: μ-law algorithm (or mu-law) is a companding algorithm, primarily used in the digital telecommunication systems of North America and Japan.

Please note that the audio data must be single-channel (mono), and the sample rate must be 16 kHz for FLAC and WAV formats, and 8 kHz for MULAW format.

USER: make a script that records audio until the user presses a key, then submits it to whisper to get the text back

ASSISTANT: To accomplish this, we need to create a new Python script that uses the `sounddevice` library to record audio, the `keyboard` library to listen for a key press, and the `openai.Whisper` API to transcribe the audio.

Here is the *edit block* for this new script:

```python
newfile/record_and_transcribe.py
<<<<<<< HEAD

=======
import sounddevice as sd
import numpy as np
import keyboard
import openai
import io
import os

# Set your OpenAI API key
openai.api_key = 'your-api-key'

# Set the sample rate and duration for the recording
sample_rate = 16000  # 16kHz
duration = 10  # in seconds

# Create a callback function to stop recording when a key is pressed
def on_key_press(e):
    print("Key pressed, stopping recording...")
    sd.stop()

# Start the recording
print("Recording started, press any key to stop...")
recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, callback=on_key_press)

# Wait for a key press
keyboard.wait()

# Convert the recording to bytes
recording_bytes = io.BytesIO()
np.save(recording_bytes, recording, allow_pickle=False)
recording_bytes = recording_bytes.getvalue()

# Transcribe the audio using the Whisper API
response = openai.Whisper.asr.create(audio_data=recording_bytes)

# Print the transcription
print("Transcription: ", response['choices'][0]['text'])
>>>>>>> updated
```

Please replace `'your-api-key'` with your actual OpenAI API key. This script records audio for a maximum of 10 seconds, but it stops recording as soon as any key is pressed. The recorded audio is then transcribed using the Whisper API, and the transcription is printed to the console.

Please note that you need to install the `sounddevice`, `numpy`, `keyboard`, and `openai` Python libraries to run this script. You can install them using pip:

```bash
pip install sounddevice numpy keyboard openai
```

Also, please note that the `keyboard` library requires root privileges to work, so you might need to run your script as a superuser.

USER: make a function that records audio until the user presses a key, then submits it to whisper to get the text back

ASSISTANT: To create a function that accomplishes this, we need to modify the `aider/voice.py` file. The function should be named `record_and_transcribe` and it should use the `sounddevice` and `keyboard` libraries to record audio and listen for a key press, respectively. It should also use the `openai.Whisper` API to transcribe the audio.

Here is the *edit block* for this change:

```python
aider/voice.py
<<<<<<< HEAD

=======
import sounddevice as sd
import numpy as np
import keyboard
import openai
import io

def record_and_transcribe():
    # Set your OpenAI API key
    openai.api_key = 'your-api-key'

    # Set the sample rate and duration for the recording
    sample_rate = 16000  # 16kHz
    duration = 10  # in seconds

    # Create a callback function to stop recording when a key is pressed
    def on_key_press(e):
        print("Key pressed, stopping recording...")
        sd.stop()

    # Start the recording
    print("Recording started, press any key to stop...")
    recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, callback=on_key_press)

    # Wait for a key press
    keyboard.wait()

    # Convert the recording to bytes
    recording_bytes = io.BytesIO()
    np.save(recording_bytes, recording, allow_pickle=False)
    recording_bytes = recording_bytes.getvalue()

    # Transcribe the audio using the Whisper API
    response = openai.Whisper.asr.create(audio_data=recording_bytes)

    # Return the transcription
    return response['choices'][0]['text']
>>>>>>> updated
```

Please replace `'your-api-key'` with your actual OpenAI API key. This function records audio for a maximum of 10 seconds, but it stops recording as soon as any key is pressed. The recorded audio is then transcribed using the Whisper API, and the transcription is returned by the function.

Please note that you need to install the `sounddevice`, `numpy`, `keyboard`, and `openai` Python libraries to use this function. You can install them using pip:

```bash
pip install sounddevice numpy keyboard openai
```

Also, please note that the `keyboard` library requires root privileges to work, so you might need to run your script as a superuser.
2023-08-10 15:45:49 -03:00
Paul Gauthier
fdb1cac20b aider: Added aider/voice.py 2023-08-10 15:41:42 -03:00