aider

mirror of https://github.com/Aider-AI/aider.git synced 2025-06-08 05:34:59 +00:00

Author	SHA1	Message	Date
Paul Gauthier	c7d23f1b5a	aider: Moved the `api_key` setting into the main function and pulled it from `os.environ`. # Aider chat conversation: USER: mov the `api_key` setting into main, and pull it from os.environ ASSISTANT: To move the `api_key` setting into the main function and pull it from `os.environ`, we need to modify the `aider/voice.py` file. Here are the changes: ```python aider/voice.py <<<<<<< HEAD def record_and_transcribe(): # Set your OpenAI API key openai.api_key = 'your-api-key' ======= import os def record_and_transcribe(api_key): >>>>>>> updated ``` ```python aider/voice.py <<<<<<< HEAD if __name__ == "__main__": print(record_and_transcribe()) ======= if __name__ == "__main__": api_key = os.getenv('OPENAI_API_KEY') if not api_key: raise ValueError("Please set the OPENAI_API_KEY environment variable.") print(record_and_transcribe(api_key)) >>>>>>> updated ``` These changes remove the hard-coded API key from the `record_and_transcribe` function and instead pass it as an argument. The API key is retrieved from the `OPENAI_API_KEY` environment variable in the main function. If the environment variable is not set, an error is raised.	2023-08-10 15:47:18 -03:00
Paul Gauthier	c0f6cf6977	aider: Added main function to call `record_and_transcribe` if the script is run directly. # Aider chat conversation: USER: add a main func to call this if the script is run directly ASSISTANT: To add a main function that calls `record_and_transcribe` if the script is run directly, we need to modify the `aider/voice.py` file. Here are the changes: ```python aider/voice.py <<<<<<< HEAD # Return the transcription return response['choices'][0]['text'] ======= # Return the transcription return response['choices'][0]['text'] if __name__ == "__main__": print(record_and_transcribe()) >>>>>>> updated ``` This change adds a check to see if the script is being run directly (i.e., not imported as a module). If it is, it calls the `record_and_transcribe` function and prints the result.	2023-08-10 15:46:31 -03:00
Paul Gauthier	1053d1781c	aider: Added a function `record_and_transcribe` to `aider/voice.py` that records audio until the user presses a key, then submits it to the Whisper API to get the text back. # Aider chat conversation: USER: what file formats does the openai whisper api accept? ASSISTANT: The OpenAI Whisper ASR API accepts audio data in the following formats: - FLAC: Free Lossless Audio Codec. This is a lossless audio format that doesn't lose any data during compression. It's the recommended format for best performance. - WAV: Waveform Audio File Format. This is a standard audio format used mainly in Windows PCs. - MULAW: μ-law algorithm (or mu-law) is a companding algorithm, primarily used in the digital telecommunication systems of North America and Japan. Please note that the audio data must be single-channel (mono), and the sample rate must be 16 kHz for FLAC and WAV formats, and 8 kHz for MULAW format. USER: make a script that records audio until the user presses a key, then submits it to whisper to get the text back ASSISTANT: To accomplish this, we need to create a new Python script that uses the `sounddevice` library to record audio, the `keyboard` library to listen for a key press, and the `openai.Whisper` API to transcribe the audio. 
Here is the edit block for this new script: ```python newfile/record_and_transcribe.py <<<<<<< HEAD ======= import sounddevice as sd import numpy as np import keyboard import openai import io import os # Set your OpenAI API key openai.api_key = 'your-api-key' # Set the sample rate and duration for the recording sample_rate = 16000 # 16kHz duration = 10 # in seconds # Create a callback function to stop recording when a key is pressed def on_key_press(e): print("Key pressed, stopping recording...") sd.stop() # Start the recording print("Recording started, press any key to stop...") recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, callback=on_key_press) # Wait for a key press keyboard.wait() # Convert the recording to bytes recording_bytes = io.BytesIO() np.save(recording_bytes, recording, allow_pickle=False) recording_bytes = recording_bytes.getvalue() # Transcribe the audio using the Whisper API response = openai.Whisper.asr.create(audio_data=recording_bytes) # Print the transcription print("Transcription: ", response['choices'][0]['text']) >>>>>>> updated ``` Please replace `'your-api-key'` with your actual OpenAI API key. This script records audio for a maximum of 10 seconds, but it stops recording as soon as any key is pressed. The recorded audio is then transcribed using the Whisper API, and the transcription is printed to the console. Please note that you need to install the `sounddevice`, `numpy`, `keyboard`, and `openai` Python libraries to run this script. You can install them using pip: ```bash pip install sounddevice numpy keyboard openai ``` Also, please note that the `keyboard` library requires root privileges to work, so you might need to run your script as a superuser. USER: make a function that records audio until the user presses a key, then submits it to whisper to get the text back ASSISTANT: To create a function that accomplishes this, we need to modify the `aider/voice.py` file. 
The function should be named `record_and_transcribe` and it should use the `sounddevice` and `keyboard` libraries to record audio and listen for a key press, respectively. It should also use the `openai.Whisper` API to transcribe the audio. Here is the edit block for this change: ```python aider/voice.py <<<<<<< HEAD ======= import sounddevice as sd import numpy as np import keyboard import openai import io def record_and_transcribe(): # Set your OpenAI API key openai.api_key = 'your-api-key' # Set the sample rate and duration for the recording sample_rate = 16000 # 16kHz duration = 10 # in seconds # Create a callback function to stop recording when a key is pressed def on_key_press(e): print("Key pressed, stopping recording...") sd.stop() # Start the recording print("Recording started, press any key to stop...") recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, callback=on_key_press) # Wait for a key press keyboard.wait() # Convert the recording to bytes recording_bytes = io.BytesIO() np.save(recording_bytes, recording, allow_pickle=False) recording_bytes = recording_bytes.getvalue() # Transcribe the audio using the Whisper API response = openai.Whisper.asr.create(audio_data=recording_bytes) # Return the transcription return response['choices'][0]['text'] >>>>>>> updated ``` Please replace `'your-api-key'` with your actual OpenAI API key. This function records audio for a maximum of 10 seconds, but it stops recording as soon as any key is pressed. The recorded audio is then transcribed using the Whisper API, and the transcription is returned by the function. Please note that you need to install the `sounddevice`, `numpy`, `keyboard`, and `openai` Python libraries to use this function. You can install them using pip: ```bash pip install sounddevice numpy keyboard openai ``` Also, please note that the `keyboard` library requires root privileges to work, so you might need to run your script as a superuser.	2023-08-10 15:45:49 -03:00
Paul Gauthier	fdb1cac20b	aider: Added aider/voice.py	2023-08-10 15:41:42 -03:00
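
The newest commit listed above (c7d23f1b5a) removes the hard-coded key from `record_and_transcribe` and reads `OPENAI_API_KEY` from the environment instead. A minimal sketch of that pattern follows. It is not the repository's code: the helper name `load_api_key` is hypothetical, and `record_and_transcribe` is a stub that does no recording and makes no API call, so the snippet runs without audio hardware or network access.

```python
import os


def load_api_key(environ=None):
    """Return the OpenAI API key from the environment, or raise if it is unset.

    `environ` is a hypothetical parameter added so the lookup can be tested
    with a plain dict; it defaults to the real process environment.
    """
    if environ is None:
        environ = os.environ
    api_key = environ.get("OPENAI_API_KEY")
    if not api_key:
        # Same error message as the one used in the commit message above.
        raise ValueError("Please set the OPENAI_API_KEY environment variable.")
    return api_key


def record_and_transcribe(api_key):
    """Stub standing in for the real function (assumption for illustration).

    The real function would record audio and send it for transcription;
    this one only demonstrates that the key arrives as an argument.
    """
    return f"transcription requested with a key of {len(api_key)} characters"
```

A script entry point would then call `record_and_transcribe(load_api_key())`, so a missing variable fails before any recording starts.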


54 commits