aider/aider/voice.py
Paul Gauthier 73c6af8851 aider: Implemented a more flexible approach to handle different microphone sample rates in aider/voice.py.
# Aider chat conversation:

USER: https://github.com/paul-gauthier/aider/issues/612

https://github.com/paul-gauthier/aider/issues/612:

[Skip to content](#start-of-content)

## Navigation Menu

Toggle navigation

[](https://github.com/)

[Sign
in](/login?return_to=https%3A%2F%2Fgithub.com%2Fpaul-gauthier%2Faider%2Fissues%2F612)

-   Product

    -   [](https://github.com/features/actions)

        Actions

        Automate any workflow

    -   [](https://github.com/features/packages)

        Packages

        Host and manage packages

    -   [](https://github.com/features/security)

        Security

        Find and fix vulnerabilities

    -   [](https://github.com/features/codespaces)

        Codespaces

        Instant dev environments

    -   [](https://github.com/features/copilot)

        GitHub Copilot

        Write better code with AI

    -   [](https://github.com/features/code-review)

        Code review

        Manage code changes

    -   [](https://github.com/features/issues)

        Issues

        Plan and track work

    -   [](https://github.com/features/discussions)

        Discussions

        Collaborate outside of code

    Explore
    -   [All features](https://github.com/features)
    -   [Documentation](https://docs.github.com)
    -   [GitHub Skills](https://skills.github.com)
    -   [Blog](https://github.blog)

-   Solutions

    For
    -   [Enterprise](https://github.com/enterprise)
    -   [Teams](https://github.com/team)
    -   [Startups](https://github.com/enterprise/startups)
    -   [Education](https://education.github.com)

    By Solution
    -   [CI/CD & Automation](https://github.com/solutions/ci-cd)
    -   [DevOps](https://github.com/solutions/devops)
    -   [DevSecOps](https://resources.github.com/devops/fundamentals/devsecops)

    Resources
    -   [Learning Pathways](https://resources.github.com/learn/pathways)
    -   [White papers, Ebooks, Webinars](https://resources.github.com)
    -   [Customer Stories](https://github.com/customer-stories)
    -   [Partners](https://partner.github.com)

-   Open Source

    -   [](/sponsors)

        GitHub Sponsors

        Fund open source developers

    -   [](https://github.com/readme)

        The ReadME Project

        GitHub community articles

    Repositories
    -   [Topics](https://github.com/topics)
    -   [Trending](https://github.com/trending)
    -   [Collections](https://github.com/collections)

-   Enterprise

    -   [](/enterprise)

        Enterprise platform

        AI-powered developer platform

    Available add-ons
    -   [](https://github.com/enterprise/advanced-security)

        Advanced Security

        Enterprise-grade security features

    -   [](/features/copilot#enterprise)

        GitHub Copilot

        Enterprise-grade AI features

    -   [](/premium-support)

        Premium Support

        Enterprise-grade 24/7 support

-   [Pricing](https://github.com/pricing)

Search or jump to\...

# Search code, repositories, users, issues, pull requests\...

Search

Clear

[Search syntax
tips](https://docs.github.com/search-github/github-code-search/understanding-github-code-search-syntax)

# Provide feedback

We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Cancel

Submit feedback

# Saved searches

## Use saved searches to filter your results more quickly

Name

Query

To see all available qualifiers, see our
[documentation](https://docs.github.com/search-github/github-code-search/understanding-github-code-search-syntax).

Cancel

Create saved search

[Sign
in](/login?return_to=https%3A%2F%2Fgithub.com%2Fpaul-gauthier%2Faider%2Fissues%2F612)

[Sign
up](/signup?ref_cta=Sign+up&ref_loc=header+logged+out&ref_page=%2F%3Cuser-name%3E%2F%3Crepo-name%3E%2Fvoltron%2Fissues_fragments%2Fissue_layout&source=header-repo&source_repo=paul-gauthier%2Faider)

You signed in with another tab or window. [Reload]() to refresh your
session. You signed out in another tab or window. [Reload]() to refresh
your session. You switched accounts on another tab or window. [Reload]()
to refresh your session.

Dismiss alert

{{ message }}

::: {role="main"}

[paul-gauthier](/paul-gauthier) / **[aider](/paul-gauthier/aider)**
Public

-   [Notifications](/login?return_to=%2Fpaul-gauthier%2Faider) You must
    be signed in to change notification settings

-   [Fork 1k](/login?return_to=%2Fpaul-gauthier%2Faider)

-

    [ Star 10.5k](/login?return_to=%2Fpaul-gauthier%2Faider)

-   [Code ](/paul-gauthier/aider)
-   [Issues 57](/paul-gauthier/aider/issues)
-   [Pull requests 14](/paul-gauthier/aider/pulls)
-   [Actions ](/paul-gauthier/aider/actions)
-   [Projects 0](/paul-gauthier/aider/projects)
-   [Security](/paul-gauthier/aider/security)
-   [Insights ](/paul-gauthier/aider/pulse)

Additional navigation options

-   [ Code ](/paul-gauthier/aider)
-   [ Issues ](/paul-gauthier/aider/issues)
-   [ Pull requests ](/paul-gauthier/aider/pulls)
-   [ Actions ](/paul-gauthier/aider/actions)
-   [ Projects ](/paul-gauthier/aider/projects)
-   [ Security ](/paul-gauthier/aider/security)
-   [ Insights ](/paul-gauthier/aider/pulse)

New issue

**Have a question about this project?** Sign up for a free GitHub
account to open an issue and contact its maintainers and the community.

[Sign up for
GitHub](/signup?return_to=%2Fpaul-gauthier%2Faider%2Fissues%2Fnew%2Fchoose)

By clicking "Sign up for GitHub", you agree to our [terms of
service](https://docs.github.com/terms) and [privacy
statement](https://docs.github.com/privacy). We'll occasionally send you
account related emails.

Already on GitHub? [Sign
in](/login?return_to=%2Fpaul-gauthier%2Faider%2Fissues%2Fnew%2Fchoose)
to your account

[Jump to bottom](#issue-comment-box)

# Unsupported Audio Sample Rate #612

Open

[aaronson2012](/aaronson2012) opened this issue May 24, 2024 · 0
comments

Open

# [Unsupported Audio Sample Rate](#top) #612

[aaronson2012](/aaronson2012) opened this issue May 24, 2024 · 0
comments

Labels

[enhancement](/paul-gauthier/aider/labels/enhancement) New feature or
request

## Comments

[](/aaronson2012)

Copy link

###

**[aaronson2012](/aaronson2012)** commented [May 24,
2024](#issue-2314049815)

+-----------------------------------------------------------------------+
| ### Issue                                                             |
|                                                                       |
| aider was crashing when i ran the /voice command and I realized that  |
| my computers microphone did not support a sample rate of 16kHz. It    |
| was an easy fix for me i just changed the sample_rate variable to one |
| my mic supported but this isn\'t a very elegant solution. Could there |
| be a better way so all mics are supported?                            |
|                                                                       |
| ### Version and model info                                            |
|                                                                       |
| Latest verision                                                       |
+-----------------------------------------------------------------------+
|                                                                       |
|                                                                       |
| The text was updated successfully, but these errors were encountered: |
|                                                                       |
|                                                                       |
+-----------------------------------------------------------------------+

All reactions

[paul-gauthier](/paul-gauthier) added the
[enhancement](/paul-gauthier/aider/labels/enhancement) New feature or
request label [Jun 4, 2024](#event-13039727805)

[Sign up for free](/join?source=comment-repo) **to join this
conversation on GitHub**. Already have an account? [Sign in to
comment](/login?return_to=https%3A%2F%2Fgithub.com%2Fpaul-gauthier%2Faider%2Fissues%2F612)

Assignees

No one assigned

Labels

[enhancement](/paul-gauthier/aider/labels/enhancement) New feature or
request

Projects

None yet

Milestone

No milestone

Development

No branches or pull requests

2 participants

:::

## Footer

[](https://github.com) © 2024 GitHub, Inc.

### Footer navigation

-   [Terms](https://docs.github.com/site-policy/github-terms/github-terms-of-service)
-   [Privacy](https://docs.github.com/site-policy/privacy-policies/github-privacy-statement)
-   [Security](https://github.com/security)
-   [Status](https://www.githubstatus.com/)
-   [Docs](https://docs.github.com/)
-   [Contact](https://support.github.com?tags=dotcom-footer)
-   Manage cookies
-   Do not share my personal information

You can't perform that action at this time.

ASSISTANT: Based on the issue description, I believe the following file is most likely to need changes to support different microphone sample rates:

aider/voice.py

This file contains the Voice class which handles recording and transcribing audio. The sample_rate variable is hardcoded in this file.

To make the suggested change, we would need to modify how the sample rate is determined in aider/voice.py.

Please add aider/voice.py to the chat so I can propose the specific code changes needed. Let me know if you would like me to explain my reasoning further or if you have any other questions!

USER: I added these files to the chat: aider/voice.py.

If you need to propose edits to other existing files not already added to the chat, you *MUST* tell the me their full path names and ask me to *add the files to the chat*. End your reply and wait for my approval. You can keep asking if you then decide you need to edit more files.

ASSISTANT: Here are the proposed changes to aider/voice.py to support different microphone sample rates:

aider/voice.py
```python
<<<<<<< SEARCH
        sample_rate = 16000  # 16kHz
=======
        try:
            sample_rate = int(self.sd.query_devices(None, 'input')['default_samplerate'])
        except (TypeError, ValueError):
            sample_rate = 16000  # fallback to 16kHz if unable to query device
>>>>>>> REPLACE
```

This change:

1. Attempts to query the default sample rate of the input device using sounddevice's `query_devices()` function.
2. If successful, it uses that sample rate.
3. If it fails to get a valid sample rate, it falls back to the original 16kHz.

This should allow aider to automatically adapt to the microphone's supported sample rate in most cases, while still having a reasonable fallback.

Let me know if you would like me to explain or modify the changes further.
2024-06-04 10:27:22 -07:00

107 lines
2.8 KiB
Python

import os
import queue
import tempfile
import time
import numpy as np
from aider.litellm import litellm
try:
import soundfile as sf
except (OSError, ModuleNotFoundError):
sf = None
from prompt_toolkit.shortcuts import prompt
from .dump import dump # noqa: F401
class SoundDeviceError(Exception):
pass
class Voice:
max_rms = 0
min_rms = 1e5
pct = 0
threshold = 0.15
def __init__(self):
if sf is None:
raise SoundDeviceError
try:
print("Initializing sound device...")
import sounddevice as sd
self.sd = sd
except (OSError, ModuleNotFoundError):
raise SoundDeviceError
def callback(self, indata, frames, time, status):
"""This is called (from a separate thread) for each audio block."""
rms = np.sqrt(np.mean(indata**2))
self.max_rms = max(self.max_rms, rms)
self.min_rms = min(self.min_rms, rms)
rng = self.max_rms - self.min_rms
if rng > 0.001:
self.pct = (rms - self.min_rms) / rng
else:
self.pct = 0.5
self.q.put(indata.copy())
def get_prompt(self):
num = 10
if np.isnan(self.pct) or self.pct < self.threshold:
cnt = 0
else:
cnt = int(self.pct * 10)
bar = "" * cnt + "" * (num - cnt)
bar = bar[:num]
dur = time.time() - self.start_time
return f"Recording, press ENTER when done... {dur:.1f}sec {bar}"
def record_and_transcribe(self, history=None, language=None):
try:
return self.raw_record_and_transcribe(history, language)
except KeyboardInterrupt:
return
def raw_record_and_transcribe(self, history, language):
self.q = queue.Queue()
filename = tempfile.mktemp(suffix=".wav")
try:
sample_rate = int(self.sd.query_devices(None, 'input')['default_samplerate'])
except (TypeError, ValueError):
sample_rate = 16000 # fallback to 16kHz if unable to query device
self.start_time = time.time()
with self.sd.InputStream(samplerate=sample_rate, channels=1, callback=self.callback):
prompt(self.get_prompt, refresh_interval=0.1)
with sf.SoundFile(filename, mode="x", samplerate=sample_rate, channels=1) as file:
while not self.q.empty():
file.write(self.q.get())
with open(filename, "rb") as fh:
transcript = litellm.transcription(
model="whisper-1", file=fh, prompt=history, language=language
)
text = transcript.text
return text
if __name__ == "__main__":
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
raise ValueError("Please set the OPENAI_API_KEY environment variable.")
print(Voice().record_and_transcribe())