Image to Audio (Spectrogram Synth)
Drop an image. Each column becomes a slice of audio whose frequencies are determined by the column's pixel brightness.
Drop your image file here
or click to browse for a file
High-contrast images, text or logos work best. Source is downsampled to 256×128 internally.
About this image-to-audio synth
This tool reverses a spectrogram. It reads the image, treats each column as a moment in time, and uses the column's pixel brightness as a recipe for which frequencies should sound at that moment. A bright pixel near the top of a column makes a high frequency loud at that moment; a bright pixel near the bottom does the same for a low frequency.
The result is a piece of audio that, when fed into a real spectrogram viewer, will visually reveal your original image. This was once a popular Easter-egg trick in songs (Aphex Twin, Venetian Snares); now it's mostly used for ML-generated audio art and curious noise experiments.
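The column-to-frequency mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not this tool's actual code: the function name `synthesize`, the linear frequency spacing, and the peak normalisation are all assumptions made for the example.

```python
import math

def synthesize(grid, duration=10.0, sample_rate=44100, f_min=100.0, f_max=8000.0):
    """Turn a brightness grid into audio samples by summing sine waves.

    grid[row][col] is a brightness in 0.0..1.0; row 0 is the TOP of the image.
    Returns a list of samples scaled so the loudest one has magnitude 1.0.
    """
    rows, cols = len(grid), len(grid[0])
    total = int(duration * sample_rate)
    # Assumption: linear spacing, top row = f_max, bottom row = f_min.
    freqs = [f_max - (f_max - f_min) * r / max(rows - 1, 1) for r in range(rows)]
    samples = []
    for n in range(total):
        col = min(n * cols // total, cols - 1)  # which image column is "now"
        t = n / sample_rate
        # Each bright pixel adds a sine at its row's frequency, scaled by brightness.
        s = sum(grid[r][col] * math.sin(2 * math.pi * freqs[r] * t)
                for r in range(rows) if grid[r][col] > 0)
        samples.append(s)
    peak = max((abs(s) for s in samples), default=0.0)
    return [s / peak for s in samples] if peak > 0 else samples
```

A pure-Python loop like this is slow for a full 256×128 grid at 44.1 kHz; it is meant to show the idea on small inputs, and a practical version would likely vectorise the inner sum.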
How to convert an image to audio
- 01
Drop in an image
PNG / JPG. High-contrast images with simple shapes work best.
- 02
Pick a duration + frequency range
10 s and 100-8000 Hz are good defaults. A longer duration gives each pixel column more time.
- 03
Render and download
Output is a WAV. Open it in a spectrogram viewer (or our /spectrogram tool) to see the image emerge.
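To make the duration and output steps concrete, here is a small sketch using only the Python standard library. It assumes the 256-column grid mentioned on this page and a mono 16-bit PCM file; the helper names `column_duration_ms` and `write_wav` are made up for the example and are not part of this tool.

```python
import struct
import wave

def column_duration_ms(duration_s, columns=256):
    """Time each image column occupies in the output, in milliseconds."""
    return duration_s * 1000.0 / columns

def write_wav(path, samples, sample_rate=44100):
    """Write floats in -1.0..1.0 as a mono 16-bit PCM WAV file.

    Values outside that range are clipped rather than wrapped.
    """
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(frames)

print(column_duration_ms(10.0))  # 39.0625
```

With the 10 s default, each of the 256 columns gets about 39 ms of audio; doubling the duration doubles that, which is why longer renders tend to show up more cleanly in a spectrogram viewer.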
Why use image-to-audio
- Free, private, no install
- Adjustable duration, sample rate, and frequency range
- Output is uncompressed WAV — best for spectrogram playback
- No upload, no signup
- Useful for music Easter eggs, ML datasets, weird-noise experiments
- Round-trip with our /spectrogram tool to verify the image survived
Image-to-audio FAQ
Will it sound good?
No, and that's the point. The output is intentionally noisy and glitchy because the audio is built from the image, not the other way around. Open the WAV in a spectrogram viewer to see the image; that's the payoff.
What images work best?
Simple, high-contrast designs — text, logos, geometric shapes. Photos work but the spectrogram comes out muddy. Use white-on-black or grayscale for the cleanest result.
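If you want to prepare an image yourself, one way to get a clean grayscale input is to convert each pixel to a brightness value and stretch the contrast to the full range. The sketch below is an assumption about a reasonable preprocessing step, not a description of what this tool does internally; `to_brightness` is a hypothetical helper, and the weights are the standard Rec. 709 luma coefficients.

```python
def to_brightness(rgb_rows):
    """Convert rows of (r, g, b) tuples in 0..255 to brightness in 0.0..1.0,
    then stretch so the darkest pixel is 0.0 and the brightest is 1.0."""
    # Rec. 709 luma weights: green contributes most to perceived brightness.
    luma = [[(0.2126 * r + 0.7152 * g + 0.0722 * b) / 255.0 for r, g, b in row]
            for row in rgb_rows]
    lo = min(min(row) for row in luma)
    hi = max(max(row) for row in luma)
    if hi == lo:
        # A completely flat image has no contrast to stretch; treat it as black.
        return [[0.0 for _ in row] for row in luma]
    return [[(v - lo) / (hi - lo) for v in row] for row in luma]
```

After stretching, a washed-out gray-on-gray design uses the whole 0.0 to 1.0 range, so the bright parts stand out more against the background.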
How big can the image be?
The tool downsamples the image to a fixed resolution (256×128 by default), so any source size works. Detail finer than that grid is averaged away, so large, bold features in the source come through most clearly in the spectrogram.
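As an illustration of what downsampling to a fixed grid does, here is a minimal box-averaging sketch. The averaging method is an assumption (the tool may use a different resampling filter), and `downsample` is a hypothetical name chosen for the example.

```python
def downsample(gray, out_w=256, out_h=128):
    """Shrink a grayscale grid to out_w x out_h by averaging source blocks.

    gray[y][x] holds brightness values; any source size of at least 1x1 works.
    If the source is smaller than the target, pixels are simply repeated.
    """
    src_h, src_w = len(gray), len(gray[0])
    out = []
    for oy in range(out_h):
        y0 = oy * src_h // out_h
        y1 = max((oy + 1) * src_h // out_h, y0 + 1)  # always at least one row
        row = []
        for ox in range(out_w):
            x0 = ox * src_w // out_w
            x1 = max((ox + 1) * src_w // out_w, x0 + 1)  # at least one column
            block = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

# Large blocks survive; a 1-pixel checkerboard averages to flat gray.
bold = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
fine = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
print(downsample(bold, 2, 2))  # [[1.0, 0.0], [0.0, 1.0]]
print(downsample(fine, 2, 2))  # [[0.5, 0.5], [0.5, 0.5]]
```

The two toy grids show why simple shapes work best: the bold pattern keeps its contrast, while the fine pattern collapses to uniform gray.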
How do I see the image in the audio?
Drop the resulting WAV into our /spectrogram tool, or any audio editor with a spectrogram view (Audacity, iZotope RX). Your image will appear in the visualisation.
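If you would rather check a render in code than by eye, one option is to measure how much energy a slice of audio has at a specific frequency. The sketch below uses the Goertzel algorithm for that; `tone_power` is a hypothetical helper, and this is only a spot check for one frequency, not a full spectrogram.

```python
import math

def tone_power(samples, sample_rate, freq):
    """Goertzel algorithm: squared magnitude of `samples` at one frequency."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

# A 1000 Hz test tone should show far more power at 1000 Hz than at 3000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(800)]
print(tone_power(tone, rate, 1000) > tone_power(tone, rate, 3000))  # True
```

Applied to the samples that belong to one image column, a high value at the frequency of a bright row and a low value at the frequency of a dark row suggests that part of the image survived the round trip.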