aider

mirror of https://github.com/Aider-AI/aider.git synced 2025-06-04 19:55:00 +00:00

Author	SHA1	Message	Date
Paul Gauthier (aider)	1d672616be	fix: correct file export path in voice format conversion	2025-01-07 09:43:49 -08:00
Paul Gauthier	01ef2351b3	refactor: Use temporary variable for audio format to avoid state mutation	2025-01-04 12:08:05 -08:00
Paul Gauthier	de1d566e9e	refactor: Simplify audio file handling and conversion logic	2025-01-04 12:06:07 -08:00
Paul Gauthier (aider)	19114a61ae	style: Format print statement in voice.py for better readability	2025-01-04 12:04:13 -08:00
Paul Gauthier (aider)	36e5599ead	feat: Add file size logging after audio conversion	2025-01-04 12:04:09 -08:00
Paul Gauthier (aider)	f5a82e575c	style: Format print statement in voice.py for better readability	2025-01-04 12:03:32 -08:00
Paul Gauthier (aider)	1851de323d	feat: add error handling and feedback for audio conversion	2025-01-04 12:03:25 -08:00
Paul Gauthier	f9408640a3	refactor: Simplify audio file size check and format conversion logic	2025-01-04 12:03:24 -08:00
Paul Gauthier (aider)	73837730fa	feat: Add file size check and mp3 conversion for large audio files	2025-01-04 11:59:42 -08:00
Paul Gauthier	9b46991721	refactor: Add comment for handling large file uploads in voice.py	2025-01-04 11:59:40 -08:00
Paul Gauthier	cbcbff341b	style: format voice.py and suppress syntax warnings	2024-12-08 18:14:30 -08:00
Philippe de Reynal	e11faadf39	feat: ability to select audio input device	2024-11-30 11:24:34 +01:00
Paul Gauthier	3edcd71a44	fix: move pydub import to avoid warnings	2024-09-24 09:29:00 -07:00
Paul Gauthier (aider)	ab786279e6	style: Fix linter warnings	2024-09-24 09:28:31 -07:00
Paul Gauthier (aider)	e267dc13af	fix: Move imports to top of file	2024-09-24 09:28:28 -07:00
Paul Gauthier (aider)	d0e6dc2c1e	style: Wrap long lines in voice.py	2024-09-24 09:28:16 -07:00
Paul Gauthier (aider)	da752bb00c	fix: Suppress warning about missing ffmpeg or avconv	2024-09-24 09:28:13 -07:00
Mike Bailey	1cc30a22f9	feat: Option to compress audio files by ~90% Add option to reduce bandwidth (and potentially latency) by converting voice recordings (wav) into a compressed audio format (webm or mp3). Default behaviour is unchanged. > File uploads are currently limited to 25 MB and the following input file > types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm. > > - https://platform.openai.com/docs/guides/speech-to-text	2024-09-22 22:20:07 +10:00
Paul Gauthier	aedba59c6e	feat: add error handling for transcription in Voice class	2024-09-03 08:59:58 -07:00
Paul Gauthier (aider)	130579ea3c	style: Format error message in Voice class for better readability	2024-08-30 07:04:13 -07:00
Paul Gauthier (aider)	e88ec36a81	feat: improve error handling for audio device issues	2024-08-30 07:04:09 -07:00
Paul Gauthier	c598cdd576	Catch self.sd.PortAudioError gracefully #849	2024-07-13 07:57:12 +01:00
Paul Gauthier	9d02628cf8	streamlit borks sys.path, causes import("litellm") to load our litellm.py; fix	2024-07-03 21:32:50 -03:00
Paul Gauthier	ed35af44b3	defer numpy, bs4 and jsonschema	2024-07-03 13:35:33 -03:00
Paul Gauthier	73c6af8851	aider: Implemented a more flexible approach to handle different microphone sample rates in aider/voice.py. # Aider chat conversation: USER: https://github.com/paul-gauthier/aider/issues/612 https://github.com/paul-gauthier/aider/issues/612: # Unsupported Audio Sample Rate #612 [aaronson2012](/aaronson2012) opened this issue May 24, 2024 · 0 comments Labels [enhancement](/paul-gauthier/aider/labels/enhancement) ### Issue aider was crashing when i ran the /voice command and I realized that my computers microphone did not support a sample rate of 16kHz. It was an easy fix for me i just changed the sample_rate variable to one my mic supported but this isn't a very elegant solution. Could there be a better way so all mics are supported? ### Version and model info Latest verision ASSISTANT: Based on the issue description, I believe the following file is most likely to need changes to support different microphone sample rates: aider/voice.py This file contains the Voice class which handles recording and transcribing audio. The sample_rate variable is hardcoded in this file. To make the suggested change, we would need to modify how the sample rate is determined in aider/voice.py. Please add aider/voice.py to the chat so I can propose the specific code changes needed. Let me know if you would like me to explain my reasoning further or if you have any other questions! USER: I added these files to the chat: aider/voice.py. If you need to propose edits to other existing files not already added to the chat, you MUST tell the me their full path names and ask me to add the files to the chat. End your reply and wait for my approval. You can keep asking if you then decide you need to edit more files. ASSISTANT: Here are the proposed changes to aider/voice.py to support different microphone sample rates: aider/voice.py ```python <<<<<<< SEARCH sample_rate = 16000 # 16kHz ======= try: sample_rate = int(self.sd.query_devices(None, 'input')['default_samplerate']) except (TypeError, ValueError): sample_rate = 16000 # fallback to 16kHz if unable to query device >>>>>>> REPLACE ``` This change: 1. Attempts to query the default sample rate of the input device using sounddevice's `query_devices()` function. 2. If successful, it uses that sample rate. 3. If it fails to get a valid sample rate, it falls back to the original 16kHz. This should allow aider to automatically adapt to the microphone's supported sample rate in most cases, while still having a reasonable fallback. Let me know if you would like me to explain or modify the changes further.	2024-06-04 10:27:22 -07:00
Paul Gauthier	9ff6770a04	refactored litellm to avoid duplicating workarounds	2024-05-08 08:05:15 -07:00
Paul Gauthier	01cf038bb5	Quiet litellm	2024-04-23 10:37:43 -07:00
Paul Gauthier	f1c09ececf	Switched voice to litellm	2024-04-18 14:13:26 -07:00
Paul Gauthier	c770fc4380	cleaned up client refs	2024-04-17 15:47:07 -07:00
Paul Gauthier	6ebc142377	roughed in openai 1.x	2023-12-05 07:37:05 -08:00
Paul Gauthier	22eb344192	disable /voice if import soundfile fails #274	2023-10-12 07:00:51 -07:00
Paul Gauthier	963bc83a9c	Late-bind importing sounddevice, as it was slowing down aider startup	2023-08-19 15:18:46 -07:00
Paul Gauthier	c07013abf4	finished roughing in --voice-language	2023-08-14 14:18:37 -03:00
Paul Gauthier	bddcdd39a8	Refactor record_and_transcribe method in voice.py to accept a language parameter.	2023-08-14 14:11:31 -03:00
Paul Gauthier	23579d4741	Refactor voice recording and transcription to allow for specifying the language.	2023-08-14 14:11:03 -03:00
Paul Gauthier	fcb209c5aa	refac	2023-08-11 11:16:31 -03:00
Paul Gauthier	b6b271b70c	pass the input history as a prompt to whisper	2023-08-11 11:03:37 -03:00
Paul Gauthier	c38a877d81	clip low volumes to 0	2023-08-11 10:33:20 -03:00
Paul Gauthier	b59e5bf298	copy	2023-08-10 22:27:18 -03:00
Paul Gauthier	24f574a7dc	better bar	2023-08-10 22:21:39 -03:00
Paul Gauthier	bfa7fedd18	catch ^c	2023-08-10 22:18:41 -03:00
Paul Gauthier	053751cb43	show loudness bar	2023-08-10 22:17:31 -03:00
Paul Gauthier	4dc704272e	aider: Updated callback function to calculate the root mean square (RMS) of the input audio data. # Aider chat conversation: USER: what is `indata` ? ASSISTANT: `indata` is a parameter in the callback function defined within the `record_and_transcribe` function in the `aider/voice.py` file. This parameter represents the input audio data that is being recorded. In the context of the `sounddevice` library, which is used for audio recording in this script, `indata` is a 2D numpy array containing the recorded audio data. The rows in the array represent individual frames, while the columns represent channels. In this specific script, the audio is being recorded in mono (single channel), so there will be only one column. The data in `indata` is being put into a queue and also being dumped for debugging purposes. USER: how can i estimate the avg volume based on indata? ASSISTANT: To estimate the average volume of the audio data, you can calculate the root mean square (RMS) of the `indata` array. The RMS represents the square root of the average of the squares of the values, and it's a common way to measure the magnitude of a varying quantity, such as audio signals. Here's how you can modify the `record_and_transcribe` function to calculate the RMS: ```python aider/voice.py <<<<<<< HEAD def callback(indata, frames, time, status): """This is called (from a separate thread) for each audio block.""" q.put(indata.copy()) dump(indata) dump(indata.mean()) ======= import numpy as np def callback(indata, frames, time, status): """This is called (from a separate thread) for each audio block.""" q.put(indata.copy()) rms = np.sqrt(np.mean(indata**2)) dump(rms) >>>>>>> updated ``` In this change, we import the numpy library and replace the line `dump(indata.mean())` with `rms = np.sqrt(np.mean(indata**2))` and `dump(rms)`. This calculates the RMS of the `indata` and dumps it for debugging purposes.	2023-08-10 21:57:42 -03:00
Paul Gauthier	aba2bb610d	proper temp file	2023-08-10 21:54:04 -03:00
Paul Gauthier	4812d17372	prettier	2023-08-10 17:12:19 -03:00
Paul Gauthier	7b993a9bbb	gracefully handle missing portaudio	2023-08-10 16:55:45 -03:00
Paul Gauthier	fed24aa444	lint	2023-08-10 16:53:34 -03:00
Paul Gauthier	d74a3ce123	Refactor record_and_transcribe function to remove unnecessary api_key parameter and import dump module.	2023-08-10 16:13:23 -03:00
Paul Gauthier	9b526d51e4	Refactored voice recording to use a queue and write audio to a file.	2023-08-10 16:10:02 -03:00
Paul Gauthier	1f42b0839f	aider: Fixed issue with passing the callback argument to the sd.rec function in record_and_transcribe function. # Aider chat conversation: USER: I ran this command: python -m aider.voice And got this output: Traceback (most recent call last): File "<frozen runpy>", line 198, in _run_module_as_main File "<frozen runpy>", line 88, in _run_code File "/Users/gauthier/Projects/aider/aider/voice.py", line 42, in <module> print(record_and_transcribe(api_key)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/gauthier/Projects/aider/aider/voice.py", line 22, in record_and_transcribe recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, callback=on_key_press) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/gauthier/Projects/aider/.venv/lib/python3.11/site-packages/sounddevice.py", line 276, in rec ctx.start_stream(InputStream, samplerate, ctx.input_channels, TypeError: _CallbackContext.start_stream() got multiple values for argument 'callback' Recording started, press any key to stop... ASSISTANT: It seems like there's an issue with the way the `callback` argument is being passed to the `sd.rec` function in the `record_and_transcribe` function in the `aider/voice.py` file. The `sd.rec` function is a convenience function that internally creates an instance of `sd.InputStream` and starts recording. The `callback` argument should be passed to the `sd.InputStream` constructor, not the `sd.rec` function. Here's how we can fix this: 1. Modify the `aider/voice.py` file. 
```python aider/voice.py <<<<<<< HEAD recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, callback=on_key_press) ======= # Create an instance of InputStream with the callback stream = sd.InputStream(samplerate=sample_rate, channels=1, callback=on_key_press) stream.start() recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1) >>>>>>> updated ``` This change creates an instance of `sd.InputStream` with the `on_key_press` callback and starts the stream before calling `sd.rec`. This should ensure that the callback is correctly passed to the `InputStream` and not the `rec` function.	2023-08-10 15:49:26 -03:00
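
Commit 4dc704272e describes estimating input volume as the root mean square (RMS) of each `indata` block. A minimal sketch of that calculation, assuming NumPy is installed; the helper name `block_rms` is hypothetical and is not taken from aider/voice.py:

```python
import numpy as np


def block_rms(indata):
    """Return the root mean square of one audio block.

    `indata` is expected to be shaped (frames, channels), as described in
    the commit message. `block_rms` is a hypothetical helper name.
    """
    samples = np.asarray(indata, dtype=np.float64)
    if samples.size == 0:
        # An empty block has no energy; avoid taking the mean of nothing.
        return 0.0
    # Square each sample, average the squares, then take the square root.
    return float(np.sqrt(np.mean(samples**2)))
```

Squaring before averaging matters here: a plain `indata.mean()` of a signal that swings around zero is close to zero regardless of how loud it is, which is why the commit swaps the mean for the RMS.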

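Commit 73c6af8851 replaces the hardcoded 16 kHz sample rate with the input device's default rate, falling back to 16 kHz when the query fails. The sketch below isolates that logic so it can run without audio hardware. `pick_sample_rate` and its `query_devices` parameter are hypothetical names, and catching `KeyError` is an added assumption beyond the two exceptions listed in the commit:

```python
def pick_sample_rate(query_devices, fallback=16000):
    """Return the default input sample rate, or `fallback` if it is unavailable.

    `query_devices` is any callable with the same shape as
    sounddevice.query_devices(device, kind); it is passed in so the logic
    can be exercised without a microphone. Hypothetical helper, not aider's code.
    """
    try:
        info = query_devices(None, "input")
        return int(info["default_samplerate"])
    except (TypeError, ValueError, KeyError):
        # Assumption: a missing key is treated like an unusable value.
        return fallback


# Stand-in for a device that reports 44.1 kHz.
rate = pick_sample_rate(lambda device, kind: {"default_samplerate": 44100.0})
```

In real use the callable would be `sounddevice.query_devices`, which accepts a device and a kind argument.
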
54 commits