CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a speech-to-text proof of concept that runs entirely locally, with no third-party APIs. The system captures live microphone audio in the browser, streams it to a backend server, and converts it to text using an open-source library such as Vosk.

Architecture

The project consists of two main components:

Frontend: Basic HTML page with JavaScript for microphone capture and audio streaming
Backend: Server (Node.js or Python) that receives audio streams and performs speech-to-text conversion using local libraries

Development Environment

The user's terminal shell is fish
All processing must be local (no cloud services)
The system should use local hardware for speech recognition

Key Implementation Requirements

Real-time or near-real-time audio streaming from browser to backend
Local speech-to-text processing using libraries like Vosk
Display transcribed text on the frontend UI
Start/stop recording functionality
WebSocket or similar real-time communication between frontend and backend
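One step these requirements imply is converting captured audio into a format the recognizer accepts. Web Audio delivers samples as 32-bit floats in the range [-1, 1], while Vosk recognizers are normally fed 16-bit signed mono PCM. The sketch below shows one way to do that conversion before sending a chunk over the socket. The function name `floatTo16BitPCM` is hypothetical, and doing the conversion in the browser rather than on the server is an assumption, not something this repository specifies.

```javascript
// Hypothetical helper (not part of this repository): convert Web Audio
// Float32 samples in [-1, 1] to 16-bit signed PCM.
function floatTo16BitPCM(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    // The Int16 range is asymmetric: -32768..32767.
    pcm[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return pcm;
}
```

The resulting `Int16Array` exposes its bytes through `.buffer`, which can be passed to a WebSocket `send` call as a binary frame.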

Development Commands

Docker (Recommended)

docker-compose up --build - Build and start the application
docker-compose down - Stop the application

Local Development

yarn install - Install dependencies (Yarn is the configured package manager)
yarn start - Start the server
yarn dev - Start with nodemon for development

Technology Stack

Backend: Node.js with Express and WebSocket server
Frontend: HTML5 + JavaScript with AudioWorklet for audio capture
Speech Recognition: Vosk library (Python) for local processing
Communication: WebSocket for real-time audio streaming and transcription
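On the transcription side of the WebSocket, the frontend has to merge incoming results into the text it displays. Vosk recognizers report interim hypotheses as JSON with a `partial` key and completed utterances with a `text` key. The sketch below assumes the backend forwards those JSON strings to the browser unchanged, which this file does not confirm; the function name `applyTranscriptMessage` and the state shape are hypothetical.

```javascript
// Hypothetical reducer (not part of this repository): fold one transcription
// message into the text shown in the UI.
// Assumption: messages are Vosk-style JSON, {"partial": "..."} or {"text": "..."}.
function applyTranscriptMessage(state, rawMessage) {
  let msg;
  try {
    msg = JSON.parse(rawMessage);
  } catch {
    return state; // Ignore frames that are not valid JSON.
  }
  if (msg === null || typeof msg !== 'object') return state;

  if (typeof msg.text === 'string') {
    // A final result: append it to the committed text and clear the partial.
    const text = msg.text.trim();
    if (!text) return { finalText: state.finalText, partialText: '' };
    const finalText = state.finalText ? state.finalText + ' ' + text : text;
    return { finalText, partialText: '' };
  }
  if (typeof msg.partial === 'string') {
    // An interim hypothesis: replace the previous partial, keep committed text.
    return { finalText: state.finalText, partialText: msg.partial };
  }
  return state; // Unknown message shape: leave the state untouched.
}
```

Starting from `{ finalText: '', partialText: '' }`, the UI can call this function from its WebSocket message handler and render `finalText` followed by `partialText`.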

Setup Requirements

Download a Vosk model into the ./vosk-model/ directory
The server runs at http://localhost:3000
The WebSocket API is available at ws://localhost:3000 for external clients
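Because the HTTP server and the WebSocket API share the same host and port, a client can derive the socket address from the page address instead of hard-coding it. A minimal sketch, assuming the socket is served at the root path as the URLs above suggest; the helper name `toWebSocketUrl` is hypothetical.

```javascript
// Hypothetical helper (not part of this repository): map an http(s) origin
// to the matching ws(s) origin, e.g. for the server address listed above.
function toWebSocketUrl(httpUrl) {
  if (!/^https?:\/\//i.test(httpUrl)) {
    throw new TypeError('Expected an http:// or https:// URL, got: ' + httpUrl);
  }
  // "http" becomes "ws" and "https" becomes "wss"; the rest is left unchanged.
  return httpUrl.replace(/^http/i, 'ws');
}
```

In the browser this could be called as `toWebSocketUrl(window.location.origin)` and the result passed to the `WebSocket` constructor.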
