# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
This is a speech-to-text proof of concept that runs entirely locally, without third-party APIs. The system captures live microphone audio in the browser, streams it to a backend server, and converts it to text with the open-source Vosk library.
## Architecture
The project consists of two main components:
- **Frontend**: Basic HTML page with JavaScript for microphone capture and audio streaming
- **Backend**: Server that receives the audio stream and performs speech-to-text conversion locally (Node.js handles transport, Python runs Vosk; see Technology Stack below); a minimal recognition sketch follows this list
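
To make the backend's recognition step concrete, here is a minimal, transport-agnostic sketch of local transcription with the Python Vosk library. The file name `sample.wav` and the chunk size are illustrative assumptions; the model path matches the `./vosk-model/` directory described under Setup Requirements, and the input is assumed to be 16-bit mono PCM.

```python
import json
import wave

from vosk import Model, KaldiRecognizer

# "sample.wav" is a hypothetical test recording (16-bit mono PCM).
wf = wave.open("sample.wav", "rb")

model = Model("vosk-model")                      # model directory from Setup Requirements
rec = KaldiRecognizer(model, wf.getframerate())  # recognizer bound to the file's sample rate

while True:
    data = wf.readframes(4000)                   # read in small chunks, as a live stream would arrive
    if len(data) == 0:
        break
    if rec.AcceptWaveform(data):                 # True when Vosk closes an utterance
        print(json.loads(rec.Result())["text"])
print(json.loads(rec.FinalResult())["text"])     # flush whatever audio is left
```

The same `KaldiRecognizer` loop applies to live audio; only the source of the chunks changes.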
## Development Environment
- The user works in a fish shell
- All processing must stay local (no cloud services)
- Speech recognition should run on local hardware
## Key Implementation Requirements
- Real-time or near-real-time audio streaming from browser to backend
- Local speech-to-text processing using libraries like Vosk
- Display transcribed text on the frontend UI
- Start/stop recording functionality
- WebSocket or similar real-time communication between frontend and backend (see the streaming sketch after this list)
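
These requirements combine into one loop: receive binary audio chunks over a WebSocket, feed them to Vosk, and push partial and final results back to the client. The sketch below is one possible shape of the Python side, assuming the `websockets` package (10.1 or newer, where handlers take a single connection argument), 16 kHz 16-bit mono PCM input, and a hypothetical internal port 2700; the public endpoint described under Setup Requirements stays at `ws://localhost:3000`.

```python
import asyncio

import websockets                      # assumed transport library, not prescribed by the repo
from vosk import Model, KaldiRecognizer

SAMPLE_RATE = 16000                    # assumption: the frontend sends 16 kHz 16-bit mono PCM
model = Model("vosk-model")            # model directory from Setup Requirements

async def recognize(websocket):
    """Feed incoming binary PCM chunks to Vosk and stream JSON results back."""
    rec = KaldiRecognizer(model, SAMPLE_RATE)
    async for chunk in websocket:
        if not isinstance(chunk, (bytes, bytearray)):
            continue                                       # ignore text frames in this sketch
        if rec.AcceptWaveform(bytes(chunk)):
            await websocket.send(rec.Result())             # final text for a closed utterance
        else:
            await websocket.send(rec.PartialResult())      # interim hypothesis for live display
    await websocket.send(rec.FinalResult())                # flush remaining audio on disconnect

async def main():
    # Port 2700 is a hypothetical internal port for the recognition service.
    async with websockets.serve(recognize, "0.0.0.0", 2700):
        await asyncio.Future()                             # run until cancelled

if __name__ == "__main__":
    asyncio.run(main())
```

Sending `PartialResult()` on every chunk is what makes the UI feel real-time: the frontend can overwrite the partial line and append only final results.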
## Development Commands
### Docker (Recommended)
- `docker-compose up --build` - Build and start the application
- `docker-compose down` - Stop the application
### Local Development
- `yarn install` - Install dependencies (yarn is configured)
- `yarn start` - Start the server
- `yarn dev` - Start with nodemon for development
## Technology Stack
- **Backend**: Node.js with Express and WebSocket server
- **Frontend**: HTML5 + JavaScript with AudioWorklet for audio capture
- **Speech Recognition**: Vosk library (Python) for local processing; Vosk expects 16-bit mono PCM, so raw AudioWorklet samples need conversion (see the sketch after this list)
- **Communication**: WebSocket for real-time audio streaming and transcription
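
One glue detail implied by this stack: AudioWorklet nodes produce Float32 samples in the range [-1, 1], while Vosk consumes 16-bit little-endian PCM. If the frontend ships raw Float32 buffers (an assumption; it could equally convert before sending), the backend needs a conversion like the sketch below, which also assumes NumPy is available and that the sample rate already matches the recognizer.

```python
import numpy as np

def float32_to_pcm16(chunk: bytes) -> bytes:
    """Convert raw Float32 AudioWorklet samples into the 16-bit PCM Vosk expects."""
    samples = np.frombuffer(chunk, dtype=np.float32)   # interpret the bytes as 32-bit floats
    clipped = np.clip(samples, -1.0, 1.0)              # guard against out-of-range samples
    return (clipped * 32767).astype(np.int16).tobytes()
```

Resampling (for example, 48 kHz capture down to the model's rate) is a separate concern not covered here.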
## Setup Requirements
- Download a Vosk model into the `./vosk-model/` directory
- Server runs on http://localhost:3000
- WebSocket API available at `ws://localhost:3000` for external clients (example client sketch below)
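
For a quick end-to-end check against the running server, a client along these lines can stream a local WAV file to `ws://localhost:3000` and print the transcripts. It assumes the server accepts raw 16-bit mono PCM frames and replies with one Vosk-style JSON message per chunk (as in the server sketch above); `sample.wav` is a hypothetical 16 kHz test recording.

```python
import asyncio
import json
import wave

import websockets

async def transcribe(path: str) -> None:
    # Assumes one JSON reply per audio chunk; adjust if the server batches replies differently.
    async with websockets.connect("ws://localhost:3000") as ws:
        with wave.open(path, "rb") as wf:
            while True:
                data = wf.readframes(4000)
                if not data:
                    break
                await ws.send(data)                    # raw PCM frames as binary WebSocket messages
                reply = json.loads(await ws.recv())
                if reply.get("text"):                  # print only finalized segments
                    print(reply["text"])

if __name__ == "__main__":
    asyncio.run(transcribe("sample.wav"))              # hypothetical test file
```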