Iqbal´s DLQ Help

Simple OpenAI Whisper Java Client

whisperer is a Micronaut CLI that records microphone input until you press Enter, forwards the WAV file to the OpenAI Audio Transcriptions API, then prints and copies the transcript. The project lives in <your-repo>/examples/whisperer and targets macOS users who want a fast, keyboard-friendly dictation workflow.

Why Another Whisper Client?

  • pbcopy integration keeps your hands on the keyboard: no manual copy/paste after dictation.

  • GraalVM native image builds let the CLI start instantly and access CoreAudio without a JVM startup delay.

  • The Micronaut runtime keeps the code modular while staying lightweight for a CLI tool.

Command Flow

The entry point is the Picocli-powered command in <your-repo>/examples/whisperer/src/main/java/com/iqbalaissaoui/WhispererCommand.java.

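    @Command(name = "whisperer",
             description = "Press-to-stop voice capture that sends audio to OpenAI's transcription API")
    public class WhispererCommand implements Callable<Integer> {

        // Abridged excerpt: tempFile, audioRecorder, transcriptionService, and
        // clipboardService are declared elsewhere in the class; error handling
        // and cleanup are omitted here.
        @Override
        public Integer call() throws Exception {
            tempFile = Files.createTempFile("whisperer-", ".wav");
            audioRecorder.start(tempFile);      // capture runs on a background thread
            waitForEnter();                     // block until the user presses Enter
            audioRecorder.stop();
            audioRecorder.awaitCompletion();    // join the WAV writer before uploading
            String transcript = transcriptionService.transcribe(tempFile);
            System.out.println(transcript);
            clipboardService.copyToClipboard(transcript);
            return 0;                           // exit code reported by Picocli
        }
    }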

Key runtime behavior:

  • Forces java.home when missing so Java Sound can discover CoreAudio providers inside a native image.

  • Shows a "● recording" prompt, waits for Enter, and then joins the background writer thread before calling OpenAI.

  • Sends failures to stderr, prints the temp WAV path for inspection, and cleans up files after a successful clipboard copy.
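The java.home fallback from the first bullet can be sketched as follows. This is a minimal illustration, not the project's actual code: the class name JavaHomeFallback, both method names, and the choice to prefer JAVA_HOME over GRAALVM_HOME are assumptions made for the example.

```java
import java.util.Map;
import java.util.Optional;

final class JavaHomeFallback {

    // Pick the first non-blank candidate. Preferring JAVA_HOME over
    // GRAALVM_HOME is an assumption of this sketch.
    static Optional<String> resolve(Map<String, String> env) {
        for (String key : new String[] {"JAVA_HOME", "GRAALVM_HOME"}) {
            String value = env.get(key);
            if (value != null && !value.isBlank()) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    // Set java.home only when the runtime has not already provided one,
    // which is the situation described for the native image.
    static void ensureJavaHome(Map<String, String> env) {
        String current = System.getProperty("java.home");
        if (current == null || current.isBlank()) {
            resolve(env).ifPresent(home -> System.setProperty("java.home", home));
        }
    }
}
```

Calling JavaHomeFallback.ensureJavaHome(System.getenv()) before the first Java Sound call would apply the fallback. On a regular JVM the property is already set, so the call changes nothing.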

Capturing Microphone Audio

AudioRecorder walks through possible sample rates, channels, and endianness to find a working TargetDataLine. It streams audio to WAV using Java Sound’s AudioSystem.

    // Try each candidate format against every mixer until one line opens.
    outer:
    for (AudioFormat candidate : candidateFormats()) {
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, candidate);
        for (Mixer.Info mixerInfo : mixers) {
            Mixer mixer = AudioSystem.getMixer(mixerInfo);
            if (!mixer.isLineSupported(info)) {
                continue; // this mixer cannot capture in the candidate format
            }
            TargetDataLine candidateLine = (TargetDataLine) mixer.getLine(info);
            candidateLine.open(candidate);
            candidateLine.start();
            line = candidateLine;
            selectedFormat = candidate;
            break outer; // stop at the first line that opens successfully
        }
    }
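The excerpt calls candidateFormats() without showing it. A minimal sketch of such a helper follows, assuming 16-bit signed PCM and a small, hypothetical set of sample rates. The specific rates and the preference order are illustrative choices, not values taken from the project.

```java
import javax.sound.sampled.AudioFormat;
import java.util.ArrayList;
import java.util.List;

final class CandidateFormats {

    // Assumed preference order: speech-friendly rates first, mono before
    // stereo, little-endian before big-endian. The real list may differ.
    private static final float[] SAMPLE_RATES = {16_000f, 44_100f, 48_000f};
    private static final int[] CHANNELS = {1, 2};
    private static final boolean[] BIG_ENDIAN = {false, true};

    static List<AudioFormat> candidateFormats() {
        List<AudioFormat> formats = new ArrayList<>();
        for (float rate : SAMPLE_RATES) {
            for (int channels : CHANNELS) {
                for (boolean bigEndian : BIG_ENDIAN) {
                    // 16 bits per sample, signed PCM
                    formats.add(new AudioFormat(rate, 16, channels, true, bigEndian));
                }
            }
        }
        return formats;
    }
}
```

Ordering the list from most to least preferred matters because the search stops at the first line that opens.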

Highlights:

  • Runs the WAV writer on a dedicated daemon thread so Picocli stays responsive.

  • When the mixer cannot satisfy the requested format, it reopens the line to discover the mixer’s default format.

  • Offers helpful diagnostics if no microphone is available or permissions are missing.
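The daemon writer thread from the first highlight might look roughly like this. The class name WavWriterThread and its methods are hypothetical; only AudioSystem.write and the daemon-thread idea come from the text above. In the real recorder the source would presumably be new AudioInputStream(line), wrapping the opened TargetDataLine.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.IOException;
import java.nio.file.Path;

final class WavWriterThread {

    private final Thread thread;
    private volatile IOException failure;

    WavWriterThread(AudioInputStream source, Path target) {
        thread = new Thread(() -> {
            // AudioSystem.write blocks until the stream reports end of data,
            // which for a microphone line happens after it is stopped and closed.
            try (source) {
                AudioSystem.write(source, AudioFileFormat.Type.WAVE, target.toFile());
            } catch (IOException e) {
                failure = e; // rethrown to the caller in awaitCompletion()
            }
        }, "wav-writer");
        thread.setDaemon(true); // never keeps the CLI alive on its own
    }

    void start() {
        thread.start();
    }

    // Join the writer so the WAV file is complete before it is uploaded.
    void awaitCompletion() throws IOException, InterruptedException {
        thread.join();
        if (failure != null) {
            throw failure;
        }
    }
}
```

Joining the thread before the upload is what guarantees the WAV header and data are fully written to disk.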

Talking to the OpenAI Audio API

TranscriptionService uses Micronaut’s HttpClient to POST a multipart request to /v1/audio/transcriptions.

    MultipartBody body = MultipartBody.builder()
            .addPart("model", model)
            .addPart("response_format", "text")
            .addPart("prompt", prompt)
            .addPart("file", fileName, MediaType.of("audio/wav"), audioBytes)
            .build();

    MutableHttpRequest<?> request = HttpRequest.POST(URI.create(API), body)
            .accept(MediaType.TEXT_PLAIN_TYPE)
            .contentType(MediaType.MULTIPART_FORM_DATA_TYPE)
            .header(HttpHeaders.AUTHORIZATION, "Bearer " + apiKey);

  • Reads the binary WAV into memory for the multipart upload (typical mic captures are tiny, so this stays lightweight).

  • Accepts overrides via Micronaut properties: openai.model defaults to whisper-1, and openai.prompt defaults to "Output in English".

  • Throws a descriptive IllegalStateException when OpenAI responds with non-2xx status codes to surface API errors in the CLI.
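The error-surfacing rule in the last bullet can be illustrated with a small, framework-free helper. The class name ApiErrors, the method requireSuccess, and the message wording are hypothetical. With Micronaut's blocking client, error statuses typically arrive as an HttpClientResponseException, so the real service likely wraps that exception instead; how it does so is not shown in the source.

```java
final class ApiErrors {

    // Return the trimmed body for 2xx responses; otherwise fail with a
    // message that carries both the status code and the response body.
    static String requireSuccess(int status, String body) {
        if (status < 200 || status >= 300) {
            String detail = (body == null || body.isBlank()) ? "<empty body>" : body.strip();
            throw new IllegalStateException(
                    "OpenAI transcription request failed with HTTP " + status + ": " + detail);
        }
        return body == null ? "" : body.strip();
    }
}
```

Including the response body in the message helps because OpenAI error responses usually explain the cause, such as an invalid API key or an unsupported file.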

Clipboard Integration

ClipboardService shells out to pbcopy, letting the transcript land directly on the macOS clipboard. Any non-zero exit code triggers a warning, making it easy to diagnose if pbcopy is unavailable.
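A minimal sketch of the pbcopy hand-off, assuming the transcript is piped to the process's standard input as UTF-8. The class ClipboardSketch and the helper pipeTo are hypothetical names, and the real ClipboardService may differ in how it reports problems.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class ClipboardSketch {

    // Pipe text to the command's stdin and return its exit code, or -1 if
    // the command could not be started or the wait was interrupted.
    static int pipeTo(List<String> command, String text) {
        try {
            Process process = new ProcessBuilder(command)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .redirectError(ProcessBuilder.Redirect.DISCARD)
                    .start();
            // Closing stdin signals end of input so the command can finish.
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return process.waitFor();
        } catch (IOException e) {
            return -1; // for example, pbcopy is not installed or not on PATH
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    static void copyToClipboard(String text) {
        int exit = pipeTo(List.of("pbcopy"), text);
        if (exit != 0) {
            System.err.println("warning: pbcopy exited with code " + exit
                    + "; transcript was not copied");
        }
    }
}
```

Keeping the command as a parameter of pipeTo makes it easy to swap in another clipboard tool on a different platform, although the article itself targets macOS only.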

Running Locally

  1. Export the OpenAI API key:

    export OPENAI_API_KEY=sk-...
  2. Launch the CLI in JVM mode:

    ./mvnw exec:exec -Dexec.mainClass=com.iqbalaissaoui.WhispererCommand

    Speak, press Enter, and the transcript appears in both the terminal output and your clipboard.

Building the Native Image

The Maven build is preconfigured for GraalVM via -Dpackaging=native-image.

    ./mvnw -Dpackaging=native-image package
    ./target/whisperer

Tips:

  • Ensure JAVA_HOME or GRAALVM_HOME points to your GraalVM distribution before running the native binary; otherwise Java Sound cannot bootstrap.

  • Use mvn -Pnative package if you prefer Maven’s native profile instead of the packaging flag.

Testing Strategy

WhispererCommandTest runs Picocli inside a Micronaut context marked with Environment.TEST. The command short-circuits before touching the microphone, which keeps CI runs hermetic while still checking that Picocli wiring and the -v flag work.

macOS Automator Launcher

To avoid running the binary manually, create an Automator "Quick Action" that executes the bundled whisperer.sh script. Assign it a keyboard shortcut, and macOS will start whisperer and begin recording whenever you trigger the shortcut.

01 October 2025