Simple OpenAI Whisper Java Client
whisperer is a Micronaut CLI that records microphone input until you press Enter, forwards the WAV file to the OpenAI Audio Transcriptions API, then prints and copies the transcript. The project lives in <your-repo>/examples/whisperer and targets macOS users who want a fast, keyboard-friendly dictation workflow.
Why Another Whisper Client?
pbcopy integration keeps hands on the keyboard: no manual copy/paste after dictation.
GraalVM native image builds let the CLI start instantly and access CoreAudio without a JVM startup delay.
The Micronaut runtime keeps the code modular while staying lightweight for a CLI tool.
Command Flow
The entry point is the Picocli-powered command in <your-repo>/examples/whisperer/src/main/java/com/iqbalaissaoui/WhispererCommand.java.
Key runtime behavior:
Sets java.home when it is missing so Java Sound can discover CoreAudio providers inside a native image.
Shows a "● recording" prompt, waits for Enter, and then joins the background writer thread before calling OpenAI.
Sends failures to stderr, prints the temp WAV path for inspection, and cleans up files after a successful clipboard copy.
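The java.home guard and the Enter-to-stop handoff described above could look roughly like this. It is a minimal sketch, not the code in WhispererCommand: the class name CommandFlowSketch, both method names, and the choice to pass the fallback path and a stop callback as parameters are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the real command may organize this differently.
final class CommandFlowSketch {

    /** Sets java.home only when it is absent or blank; returns the value in effect afterwards. */
    static String ensureJavaHome(String fallback) {
        String current = System.getProperty("java.home");
        if (current != null && !current.isBlank()) {
            return current; // already set, leave it alone
        }
        System.setProperty("java.home", fallback);
        return fallback;
    }

    /**
     * Prints the recording prompt, blocks until a line arrives on the given input,
     * then asks the recorder to stop and waits for the writer thread to finish.
     */
    static void recordUntilEnter(InputStream input, Runnable stopRecording, Thread writer)
            throws IOException, InterruptedException {
        System.out.println("\u25CF recording");
        new BufferedReader(new InputStreamReader(input, StandardCharsets.UTF_8)).readLine();
        stopRecording.run();
        writer.join(); // the WAV file is complete once this returns
    }
}
```

Joining the writer before the upload matters because the WAV file is only complete once the writer thread has returned.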
Capturing Microphone Audio
AudioRecorder walks through possible sample rates, channels, and endianness to find a working TargetDataLine. It streams audio to WAV using Java Sound’s AudioSystem.
Highlights:
Runs the WAV writer on a dedicated daemon thread so Picocli stays responsive.
When the mixer cannot satisfy the requested format, it reopens the line to discover the mixer’s default format.
Offers helpful diagnostics if no microphone is available or permissions are missing.
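As an illustration of the probing and the daemon writer thread, the sketch below enumerates candidate formats, asks Java Sound whether a TargetDataLine supports each one, and writes a WAV file in the background. The class name FormatProbeSketch, the specific sample rates, the 16-bit signed PCM encoding, and the probing order are assumptions; AudioRecorder may use different values and also falls back to the mixer's default format, which is not shown here.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

// Hypothetical helper class, not the project's AudioRecorder.
final class FormatProbeSketch {

    /** Builds every combination of the assumed sample rates, channel counts, and byte orders. */
    static List<AudioFormat> candidates() {
        float[] rates = {16_000f, 44_100f, 48_000f}; // assumed values
        int[] channelCounts = {1, 2};
        boolean[] bigEndian = {false, true};
        List<AudioFormat> formats = new ArrayList<>();
        for (float rate : rates) {
            for (int channels : channelCounts) {
                for (boolean be : bigEndian) {
                    formats.add(new AudioFormat(rate, 16, channels, true, be)); // 16-bit signed PCM
                }
            }
        }
        return formats;
    }

    /** Returns the first candidate for which the system reports a supported capture line. */
    static Optional<AudioFormat> firstSupported() {
        for (AudioFormat format : candidates()) {
            if (AudioSystem.isLineSupported(new DataLine.Info(TargetDataLine.class, format))) {
                return Optional.of(format);
            }
        }
        return Optional.empty();
    }

    /**
     * Streams an opened and started line to a WAV file on a daemon thread.
     * AudioSystem.write returns once the line has been stopped and closed.
     */
    static Thread startWriter(TargetDataLine line, File wav) {
        Thread writer = new Thread(() -> {
            try (AudioInputStream in = new AudioInputStream(line)) {
                AudioSystem.write(in, AudioFileFormat.Type.WAVE, wav);
            } catch (IOException e) {
                System.err.println("WAV write failed: " + e.getMessage());
            }
        }, "wav-writer");
        writer.setDaemon(true);
        writer.start();
        return writer;
    }
}
```

A caller would open and start the line with the chosen format, call startWriter, and later call line.stop() and line.close() before joining the returned thread.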
Talking to the OpenAI Audio API
TranscriptionService uses Micronaut’s HttpClient to POST a multipart request to /v1/audio/transcriptions.
Reads the binary WAV into memory for the multipart upload (typical mic captures are tiny, so this stays lightweight).
Accepts overrides via Micronaut properties: openai.model defaults to whisper-1, and openai.prompt defaults to "Output in English".
Throws a descriptive IllegalStateException when OpenAI responds with non-2xx status codes to surface API errors in the CLI.
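To show the shape of the request, here is a sketch that builds the multipart body by hand and wraps it in a JDK java.net.http.HttpRequest. It deliberately swaps in the JDK client types so the example has no dependencies; TranscriptionService itself uses Micronaut's HttpClient. The class name TranscriptionRequestSketch, its method names, and the error message text are assumptions. The model, prompt, and file form fields and the Bearer Authorization header follow the public OpenAI API.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not the project's TranscriptionService.
final class TranscriptionRequestSketch {

    /** Encodes the model, prompt, and WAV bytes as a multipart/form-data body held in memory. */
    static byte[] multipartBody(String boundary, String model, String prompt,
                                String fileName, byte[] wav) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeField(out, boundary, "model", model);
        writeField(out, boundary, "prompt", prompt);
        String fileHeader = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: audio/wav\r\n\r\n";
        out.writeBytes(fileHeader.getBytes(StandardCharsets.UTF_8));
        out.writeBytes(wav);
        out.writeBytes(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    /** Writes one plain-text form field as a multipart part. */
    private static void writeField(ByteArrayOutputStream out, String boundary,
                                   String name, String value) {
        String part = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + name + "\"\r\n\r\n"
                + value + "\r\n";
        out.writeBytes(part.getBytes(StandardCharsets.UTF_8));
    }

    /** Builds the POST request; nothing is sent until a client executes it. */
    static HttpRequest request(String apiKey, String boundary, byte[] body) {
        return HttpRequest.newBuilder(URI.create("https://api.openai.com/v1/audio/transcriptions"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    /** Returns the response body for 2xx statuses and throws for everything else. */
    static String requireSuccess(int status, String responseBody) {
        if (status < 200 || status >= 300) {
            throw new IllegalStateException("OpenAI returned HTTP " + status + ": " + responseBody);
        }
        return responseBody;
    }
}
```

Including the response body in the exception message keeps the API's own error description visible on the command line.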
Clipboard Integration
ClipboardService shells out to pbcopy, letting the transcript land directly on the macOS clipboard. Any non-zero exit code triggers a warning, making it easy to diagnose if pbcopy is unavailable.
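Shelling out to pbcopy can be sketched with ProcessBuilder as follows. The class name ClipboardSketch, the method names, the warning text, and the choice of UTF-8 for the piped bytes are assumptions, not details taken from ClipboardService.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper class, not the project's ClipboardService.
final class ClipboardSketch {

    /** Starts the command, writes the text to its stdin, and returns its exit code. */
    static int pipeTo(List<String> command, String text)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        try (OutputStream stdin = process.getOutputStream()) {
            stdin.write(text.getBytes(StandardCharsets.UTF_8)); // assumed encoding
        } // closing stdin signals end of input to the child process
        return process.waitFor();
    }

    /** Copies the transcript via pbcopy; warns on stderr and returns false on any failure. */
    static boolean copyToClipboard(String transcript) {
        try {
            int exitCode = pipeTo(List.of("pbcopy"), transcript);
            if (exitCode != 0) {
                System.err.println("warning: pbcopy exited with code " + exitCode);
                return false;
            }
            return true;
        } catch (IOException e) {
            // Typically means pbcopy is not on the PATH (for example, not on macOS).
            System.err.println("warning: could not run pbcopy: " + e.getMessage());
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Keeping the command as a parameter of pipeTo makes it possible to exercise the piping logic with another program on machines without pbcopy.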
Running Locally
Export the OpenAI API key:
export OPENAI_API_KEY=sk-...
Launch the CLI in JVM mode:
./mvnw exec:exec -Dexec.mainClass=com.iqbalaissaoui.WhispererCommand
Speak, press Enter, and the transcript appears in both the terminal output and your clipboard.
Building the Native Image
The Maven build is preconfigured for GraalVM via -Dpackaging=native-image.
Tips:
Ensure JAVA_HOME or GRAALVM_HOME points to your GraalVM distribution before running the native binary; otherwise Java Sound cannot bootstrap.
Use mvn -Pnative package if you prefer Maven’s native profile instead of the packaging flag.
Testing Strategy
WhispererCommandTest runs Picocli inside a Micronaut context marked with Environment.TEST. The command short-circuits before touching the microphone, which keeps CI runs hermetic while still checking that Picocli wiring and the -v flag work.
macOS Automator Launcher
To avoid running the binary manually, create an Automator "Quick Action" that executes the bundled whisperer.sh script. Assign it a keyboard shortcut, and macOS will start whisperer and begin recording whenever you trigger the shortcut.