init
24
requirements.md
Normal file
@@ -0,0 +1,24 @@
### 🧩 **Requirement: Speech-to-Text POC (No 3rd-Party APIs)**
#### **Goal**
Build a simple proof of concept (POC) that captures live microphone audio from the browser, sends it to a backend server, converts the audio to text using an open-source/local library, and displays the text on the UI.
#### **Key Points**
* A basic `index.html` page to:

  * Start/stop microphone recording.
  * Stream audio to the backend.
  * Display the transcribed text in real time or after processing.
* A backend server (e.g., Node.js or Python) that:

  * Receives the audio stream.
  * Uses a **local speech-to-text library** (e.g., [Vosk](https://alphacephei.com/vosk/)), with **no external APIs**.
  * Sends the transcribed text back to the frontend.
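
The backend steps above could be sketched in Python roughly as follows. This is a minimal illustration, not a prescribed design: the helper names `make_vosk_recognizer` and `transcribe_chunks` are hypothetical, and the sketch assumes the frontend delivers raw 16-bit mono PCM chunks at the recognizer's sample rate (16 kHz here). How the chunks arrive (WebSocket, HTTP upload, etc.) is left open.

```python
import json
from typing import Iterable, Iterator, Tuple


def make_vosk_recognizer(model_path: str, sample_rate: int = 16000):
    """Build a local Vosk recognizer (hypothetical helper).

    Assumes `pip install vosk` and a model directory downloaded to model_path.
    """
    # Imported lazily so the rest of this module works without Vosk installed.
    from vosk import Model, KaldiRecognizer

    return KaldiRecognizer(Model(model_path), sample_rate)


def transcribe_chunks(recognizer, chunks: Iterable[bytes]) -> Iterator[Tuple[str, str]]:
    """Feed audio chunks to the recognizer and yield (kind, text) pairs.

    kind is "partial" for an in-progress guess and "final" for a finished
    utterance. Assumption: each chunk is 16-bit mono PCM audio.
    """
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            # An utterance ended; Result() returns JSON with a "text" field.
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                yield ("final", text)
        else:
            # Still mid-utterance; PartialResult() returns JSON with "partial".
            partial = json.loads(recognizer.PartialResult()).get("partial", "")
            if partial:
                yield ("partial", partial)

    # Flush whatever audio is still buffered once the stream ends.
    tail = json.loads(recognizer.FinalResult()).get("text", "")
    if tail:
        yield ("final", tail)
```

A server handler would call `transcribe_chunks(make_vosk_recognizer("path/to/model"), incoming_chunks)` and forward each pair to the page, which can show "partial" text as a live preview and append "final" text to the transcript. The model path here is a placeholder.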
#### **Note**
* I am using the fish shell.
* The solution should run locally and use the local machine's hardware.
* Avoid any third-party cloud services.