Paul Gauthier (aider)
1d672616be
fix: correct file export path in voice format conversion
2025-01-07 09:43:49 -08:00
Paul Gauthier
01ef2351b3
refactor: Use temporary variable for audio format to avoid state mutation
2025-01-04 12:08:05 -08:00
Paul Gauthier
de1d566e9e
refactor: Simplify audio file handling and conversion logic
2025-01-04 12:06:07 -08:00
Paul Gauthier (aider)
19114a61ae
style: Format print statement in voice.py for better readability
2025-01-04 12:04:13 -08:00
Paul Gauthier (aider)
36e5599ead
feat: Add file size logging after audio conversion
2025-01-04 12:04:09 -08:00
Paul Gauthier (aider)
f5a82e575c
style: Format print statement in voice.py for better readability
2025-01-04 12:03:32 -08:00
Paul Gauthier (aider)
1851de323d
feat: add error handling and feedback for audio conversion
2025-01-04 12:03:25 -08:00
Paul Gauthier
f9408640a3
refactor: Simplify audio file size check and format conversion logic
2025-01-04 12:03:24 -08:00
Paul Gauthier (aider)
73837730fa
feat: Add file size check and mp3 conversion for large audio files
2025-01-04 11:59:42 -08:00
Paul Gauthier
9b46991721
refactor: Add comment for handling large file uploads in voice.py
2025-01-04 11:59:40 -08:00
Paul Gauthier
cbcbff341b
style: format voice.py and suppress syntax warnings
2024-12-08 18:14:30 -08:00
Philippe de Reynal
e11faadf39
feat: ability to select audio input device
2024-11-30 11:24:34 +01:00
Paul Gauthier
3edcd71a44
fix: move pydub import to avoid warnings
2024-09-24 09:29:00 -07:00
Paul Gauthier (aider)
ab786279e6
style: Fix linter warnings
2024-09-24 09:28:31 -07:00
Paul Gauthier (aider)
e267dc13af
fix: Move imports to top of file
2024-09-24 09:28:28 -07:00
Paul Gauthier (aider)
d0e6dc2c1e
style: Wrap long lines in voice.py
2024-09-24 09:28:16 -07:00
Paul Gauthier (aider)
da752bb00c
fix: Suppress warning about missing ffmpeg or avconv
2024-09-24 09:28:13 -07:00
Mike Bailey
1cc30a22f9
feat: Option to compress audio files by ~90%
...
Add option to reduce bandwidth (and potentially latency) by converting
voice recordings (wav) into a compressed audio format (webm or mp3).
Default behaviour is unchanged.
> File uploads are currently limited to 25 MB and the following input file
> types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm.
>
> - https://platform.openai.com/docs/guides/speech-to-text
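A minimal sketch of the idea described above, not the code from this commit. It assumes the quoted 25 MB limit means 25 * 1024 * 1024 bytes, and the helper names `needs_compression` and `compress_wav` are hypothetical. The conversion uses pydub's `AudioSegment.from_wav(...).export(...)`, which needs ffmpeg available at runtime.

```python
import os

# Assumption: the 25 MB upload limit quoted above, read as 25 * 1024 * 1024 bytes.
UPLOAD_LIMIT_BYTES = 25 * 1024 * 1024


def needs_compression(path, limit_bytes=UPLOAD_LIMIT_BYTES):
    """Return True when the file at `path` is larger than `limit_bytes`."""
    return os.path.getsize(path) > limit_bytes


def compress_wav(wav_path, target_format="mp3"):
    """Convert a wav recording to `target_format` and return the new path.

    pydub is imported inside the function so that importing this sketch
    does not itself require pydub or ffmpeg.
    """
    from pydub import AudioSegment

    out_path = os.path.splitext(wav_path)[0] + "." + target_format
    AudioSegment.from_wav(wav_path).export(out_path, format=target_format)
    return out_path
```

A caller could compress unconditionally when the option is enabled, or only when `needs_compression(path)` is true.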
2024-09-22 22:20:07 +10:00
Paul Gauthier
aedba59c6e
feat: add error handling for transcription in Voice class
2024-09-03 08:59:58 -07:00
Paul Gauthier (aider)
130579ea3c
style: Format error message in Voice class for better readability
2024-08-30 07:04:13 -07:00
Paul Gauthier (aider)
e88ec36a81
feat: improve error handling for audio device issues
2024-08-30 07:04:09 -07:00
Paul Gauthier
c598cdd576
Catch self.sd.PortAudioError gracefully #849
2024-07-13 07:57:12 +01:00
Paul Gauthier
9d02628cf8
streamlit borks sys.path, causes import("litellm") to load our litellm.py; fix
2024-07-03 21:32:50 -03:00
Paul Gauthier
ed35af44b3
defer numpy, bs4 and jsonschema
2024-07-03 13:35:33 -03:00
Paul Gauthier
73c6af8851
aider: Implemented a more flexible approach to handle different microphone sample rates in aider/voice.py.
...
# Aider chat conversation:
USER: https://github.com/paul-gauthier/aider/issues/612
https://github.com/paul-gauthier/aider/issues/612 :
::: {role="main"}
# Unsupported Audio Sample Rate #612
Open
[aaronson2012](/aaronson2012) opened this issue May 24, 2024 · 0
comments
Labels
[enhancement](/paul-gauthier/aider/labels/enhancement) New feature or
request
### Issue
aider was crashing when i ran the /voice command and I realized that
my computers microphone did not support a sample rate of 16kHz. It
was an easy fix for me i just changed the sample_rate variable to one
my mic supported but this isn't a very elegant solution. Could there
be a better way so all mics are supported?
### Version and model info
Latest version
[paul-gauthier](/paul-gauthier) added the
[enhancement](/paul-gauthier/aider/labels/enhancement) New feature or
request label [Jun 4, 2024](#event-13039727805)
:::
ASSISTANT: Based on the issue description, I believe the following file is most likely to need changes to support different microphone sample rates:
aider/voice.py
This file contains the Voice class which handles recording and transcribing audio. The sample_rate variable is hardcoded in this file.
To make the suggested change, we would need to modify how the sample rate is determined in aider/voice.py.
Please add aider/voice.py to the chat so I can propose the specific code changes needed. Let me know if you would like me to explain my reasoning further or if you have any other questions!
USER: I added these files to the chat: aider/voice.py.
If you need to propose edits to other existing files not already added to the chat, you *MUST* tell me their full path names and ask me to *add the files to the chat*. End your reply and wait for my approval. You can keep asking if you then decide you need to edit more files.
ASSISTANT: Here are the proposed changes to aider/voice.py to support different microphone sample rates:
aider/voice.py
```python
<<<<<<< SEARCH
sample_rate = 16000  # 16kHz
=======
try:
    sample_rate = int(self.sd.query_devices(None, 'input')['default_samplerate'])
except (TypeError, ValueError):
    sample_rate = 16000  # fallback to 16kHz if unable to query device
>>>>>>> REPLACE
```
This change:
1. Attempts to query the default sample rate of the input device using sounddevice's `query_devices()` function.
2. If successful, it uses that sample rate.
3. If it fails to get a valid sample rate, it falls back to the original 16kHz.
This should allow aider to automatically adapt to the microphone's supported sample rate in most cases, while still having a reasonable fallback.
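The fallback logic can be sketched on its own, under a few assumptions: `pick_sample_rate` is a hypothetical helper, and the query function is passed in as an argument so the sketch runs without sounddevice or an audio device. In real use the argument would be `sounddevice.query_devices`. Catching `KeyError` is an addition here, not part of the change above.

```python
def pick_sample_rate(query_devices, fallback=16000):
    """Return the default input device's sample rate, or `fallback`.

    `query_devices` is expected to behave like sounddevice.query_devices:
    called as query_devices(None, "input"), it returns a dict that has a
    "default_samplerate" entry.
    """
    try:
        info = query_devices(None, "input")
        return int(info["default_samplerate"])
    except (TypeError, ValueError, KeyError):
        # Could not obtain a usable rate: keep the original 16 kHz default.
        return fallback


def fake_query(device, kind):
    # Stand-in for a microphone whose default rate is 44.1 kHz.
    return {"default_samplerate": 44100.0}


def broken_query(device, kind):
    # Stand-in for a failed device query.
    raise ValueError("no input device")


rate_ok = pick_sample_rate(fake_query)
rate_fallback = pick_sample_rate(broken_query)
```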
Let me know if you would like me to explain or modify the changes further.
2024-06-04 10:27:22 -07:00
Paul Gauthier
9ff6770a04
refactored litellm to avoid duplicating workarounds
2024-05-08 08:05:15 -07:00
Paul Gauthier
01cf038bb5
Quiet litellm
2024-04-23 10:37:43 -07:00
Paul Gauthier
f1c09ececf
Switched voice to litellm
2024-04-18 14:13:26 -07:00
Paul Gauthier
c770fc4380
cleaned up client refs
2024-04-17 15:47:07 -07:00
Paul Gauthier
6ebc142377
roughed in openai 1.x
2023-12-05 07:37:05 -08:00
Paul Gauthier
22eb344192
disable /voice if import soundfile fails #274
2023-10-12 07:00:51 -07:00
Paul Gauthier
963bc83a9c
Late-bind importing sounddevice, as it was slowing down aider startup
2023-08-19 15:18:46 -07:00
Paul Gauthier
c07013abf4
finished roughing in --voice-language
2023-08-14 14:18:37 -03:00
Paul Gauthier
bddcdd39a8
Refactor record_and_transcribe method in voice.py to accept a language parameter.
2023-08-14 14:11:31 -03:00
Paul Gauthier
23579d4741
Refactor voice recording and transcription to allow for specifying the language.
2023-08-14 14:11:03 -03:00
Paul Gauthier
fcb209c5aa
refac
2023-08-11 11:16:31 -03:00
Paul Gauthier
b6b271b70c
pass the input history as a prompt to whisper
2023-08-11 11:03:37 -03:00
Paul Gauthier
c38a877d81
clip low volumes to 0
2023-08-11 10:33:20 -03:00
Paul Gauthier
b59e5bf298
copy
2023-08-10 22:27:18 -03:00
Paul Gauthier
24f574a7dc
better bar
2023-08-10 22:21:39 -03:00
Paul Gauthier
bfa7fedd18
catch ^c
2023-08-10 22:18:41 -03:00
Paul Gauthier
053751cb43
show loudness bar
2023-08-10 22:17:31 -03:00
Paul Gauthier
4dc704272e
aider: Updated callback function to calculate the root mean square (RMS) of the input audio data.
...
# Aider chat conversation:
USER: what is `indata` ?
ASSISTANT: `indata` is a parameter in the callback function defined within the `record_and_transcribe` function in the `aider/voice.py` file. This parameter represents the input audio data that is being recorded.
In the context of the `sounddevice` library, which is used for audio recording in this script, `indata` is a 2D numpy array containing the recorded audio data. The rows in the array represent individual frames, while the columns represent channels.
In this specific script, the audio is being recorded in mono (single channel), so there will be only one column. The data in `indata` is being put into a queue and also being dumped for debugging purposes.
USER: how can i estimate the avg volume based on indata?
ASSISTANT: To estimate the average volume of the audio data, you can calculate the root mean square (RMS) of the `indata` array. The RMS represents the square root of the average of the squares of the values, and it's a common way to measure the magnitude of a varying quantity, such as audio signals.
Here's how you can modify the `record_and_transcribe` function to calculate the RMS:
```python
aider/voice.py
<<<<<<< HEAD
    def callback(indata, frames, time, status):
        """This is called (from a separate thread) for each audio block."""
        q.put(indata.copy())
        dump(indata)
        dump(indata.mean())
=======
    import numpy as np

    def callback(indata, frames, time, status):
        """This is called (from a separate thread) for each audio block."""
        q.put(indata.copy())
        rms = np.sqrt(np.mean(indata**2))
        dump(rms)
>>>>>>> updated
```
In this change, we import the numpy library and replace the `dump(indata)` and `dump(indata.mean())` lines with `rms = np.sqrt(np.mean(indata**2))` and `dump(rms)`. This calculates the RMS of `indata` and dumps it for debugging purposes.
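To illustrate why RMS is a better loudness estimate than the plain mean, here is a small dependency-free sketch. It uses plain Python in place of the NumPy call so it runs anywhere, and `mean_and_rms` is a hypothetical helper. For a signal that swings symmetrically around zero, the mean is close to zero however loud the signal is, while the RMS tracks its amplitude.

```python
import math


def mean_and_rms(samples):
    """Return (mean, root mean square) of a sequence of audio samples."""
    n = len(samples)
    mean = sum(samples) / n
    rms = math.sqrt(sum(x * x for x in samples) / n)
    return mean, rms


# One full period of a sine wave with amplitude 0.5.
tone = [0.5 * math.sin(2 * math.pi * k / 100) for k in range(100)]
mean, rms = mean_and_rms(tone)

# mean is close to 0 because positive and negative halves cancel;
# rms is close to 0.5 / sqrt(2), the RMS of a sine with amplitude 0.5.
```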
2023-08-10 21:57:42 -03:00
Paul Gauthier
aba2bb610d
proper temp file
2023-08-10 21:54:04 -03:00
Paul Gauthier
4812d17372
prettier
2023-08-10 17:12:19 -03:00
Paul Gauthier
7b993a9bbb
gracefully handle missing portaudio
2023-08-10 16:55:45 -03:00
Paul Gauthier
fed24aa444
lint
2023-08-10 16:53:34 -03:00
Paul Gauthier
d74a3ce123
Refactor record_and_transcribe function to remove unnecessary api_key parameter and import dump module.
2023-08-10 16:13:23 -03:00
Paul Gauthier
9b526d51e4
Refactored voice recording to use a queue and write audio to a file.
2023-08-10 16:10:02 -03:00
Paul Gauthier
1f42b0839f
aider: Fixed issue with passing the callback argument to the sd.rec function in record_and_transcribe function.
...
# Aider chat conversation:
USER: I ran this command:
python -m aider.voice
And got this output:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/gauthier/Projects/aider/aider/voice.py", line 42, in <module>
    print(record_and_transcribe(api_key))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/aider/voice.py", line 22, in record_and_transcribe
    recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, callback=on_key_press)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gauthier/Projects/aider/.venv/lib/python3.11/site-packages/sounddevice.py", line 276, in rec
    ctx.start_stream(InputStream, samplerate, ctx.input_channels,
TypeError: _CallbackContext.start_stream() got multiple values for argument 'callback'
Recording started, press any key to stop...
ASSISTANT: It seems like there's an issue with the way the `callback` argument is being passed to the `sd.rec` function in the `record_and_transcribe` function in the `aider/voice.py` file.
The `sd.rec` function is a convenience function that internally creates an instance of `sd.InputStream` and starts recording. The `callback` argument should be passed to the `sd.InputStream` constructor, not the `sd.rec` function.
Here's how we can fix this:
1. Modify the `aider/voice.py` file.
```python
aider/voice.py
<<<<<<< HEAD
recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, callback=on_key_press)
=======
# Create an instance of InputStream with the callback
stream = sd.InputStream(samplerate=sample_rate, channels=1, callback=on_key_press)
stream.start()
recording = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1)
>>>>>>> updated
```
This change creates an instance of `sd.InputStream` with the `on_key_press` callback and starts the stream before calling `sd.rec`. This should ensure that the callback is correctly passed to the `InputStream` and not the `rec` function.
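For comparison, a callback-driven recording can also be sketched without `sd.rec` at all, by letting the `InputStream` callback push each audio block onto a queue. This is a sketch under stated assumptions, arranged differently from the diff above: `make_callback` and `record_blocks` are hypothetical names, and sounddevice is imported inside the function so the module loads even when PortAudio is missing.

```python
import queue


def make_callback(q):
    """Build an InputStream callback that copies each audio block into `q`."""

    def callback(indata, frames, time, status):
        # Runs on the audio thread; copy because the buffer is reused.
        q.put(indata.copy())

    return callback


def record_blocks(sample_rate, seconds):
    """Record mono audio for `seconds` and return the captured blocks."""
    import sounddevice as sd  # late import: needs PortAudio at runtime

    q = queue.Queue()
    with sd.InputStream(samplerate=sample_rate, channels=1, callback=make_callback(q)):
        sd.sleep(int(seconds * 1000))  # keep the stream open while recording

    blocks = []
    while not q.empty():
        blocks.append(q.get())
    return blocks
```

The callback only receives data and hands it off, which keeps the audio thread fast; stopping on a key press would be handled by the code that owns the stream rather than by the callback itself.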
2023-08-10 15:49:26 -03:00