Neural Audio Intelligence

Speech Emotion Recognition

Decoding human emotion from raw voice — one waveform at a time.

87.4% Accuracy
7,442 Audio Clips
4 Datasets

Listening to the shape of sound

Before any prediction, the model listens. It processes raw audio at 22,050 Hz, trims silence, and pads to a fixed 3-second window — isolating the emotional signature embedded in pitch, rhythm, and amplitude.
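The preprocessing described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the page's actual code: the 22,050 Hz rate and 3-second window come from the text, while the peak-relative silence threshold (`threshold_db`) and the right-side zero padding are assumptions, and the function names are hypothetical.

```python
import numpy as np

SAMPLE_RATE = 22_050                        # from the page
WINDOW_SECONDS = 3                          # from the page
TARGET_LEN = SAMPLE_RATE * WINDOW_SECONDS   # 66,150 samples

def trim_silence(y, threshold_db=30.0):
    """Drop leading/trailing samples more than `threshold_db` below the peak.

    The threshold rule is an assumption; the page only says silence is trimmed.
    """
    peak = np.max(np.abs(y)) if len(y) else 0.0
    if peak == 0:
        return y[:0]                        # nothing but silence
    cutoff = peak * 10 ** (-threshold_db / 20)
    loud = np.flatnonzero(np.abs(y) >= cutoff)
    return y[loud[0]: loud[-1] + 1]

def fix_length(y, target_len=TARGET_LEN):
    """Zero-pad on the right, or truncate, to exactly `target_len` samples."""
    if len(y) >= target_len:
        return y[:target_len]
    return np.pad(y, (0, target_len - len(y)))

def preprocess(y):
    """Trim silence, then force the clip into the fixed 3-second window."""
    return fix_length(trim_silence(np.asarray(y, dtype=np.float32)))
```

Fixing the length up front means every clip yields a spectrogram of the same shape, which is what lets a CNN with a fixed input size consume it.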

What the CNN sees

The mel spectrogram turns audio into a time-frequency image with 128 mel bands. The CNN branch reads it like a photograph, detecting texture, rhythm, and tonal contours that correlate with specific emotional states.
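To make the "image" concrete, here is a NumPy-only sketch of a log-mel spectrogram. Only the sample rate and the 128 mel bands come from the page; the FFT size (2048), hop length (512), Hann window, HTK-style mel formula, and dB scaling are assumptions, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    lower, center, upper = edges[:-2, None], edges[1:-1, None], edges[2:, None]
    rising = (fft_freqs - lower) / (center - lower)
    falling = (upper - fft_freqs) / (upper - center)
    return np.maximum(0.0, np.minimum(rising, falling))

def log_mel_spectrogram(y, sr=22_050, n_fft=2048, hop=512, n_mels=128):
    """Return a (n_mels, n_frames) log-power mel spectrogram in dB."""
    y = np.pad(y, n_fft // 2, mode="reflect")            # centre each frame
    n_frames = 1 + (len(y) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = y[idx] * np.hanning(n_fft)                   # windowed frames
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft//2+1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T     # (n_mels, n_frames)
    return 10.0 * np.log10(np.maximum(mel, 1e-10))        # floor avoids log(0)
```

The number of time frames depends on the hop length and padding, which the page does not state. With the assumed settings, a 66,150-sample clip produces 130 frames, so the 131 quoted on the page presumably reflects slightly different settings.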

Dual-branch signal processing

The CNN branch reads the mel spectrogram, and the MLP branch reads 645 hand-crafted acoustic features. The two streams are fused to classify emotion in real time.
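As a rough sketch of the fusion step, the snippet below concatenates one embedding per branch, applies a 256-unit dense layer, and ends in a 4-way softmax, following the pipeline summary on this page. The embedding sizes (128 and 64), the ReLU activation, and the random weights are all assumptions for illustration. CBAM attention and dropout are omitted, and `init_params` and `fuse_and_classify` are hypothetical names.

```python
import numpy as np

EMOTIONS = ["angry", "happy", "neutral", "sad"]   # class list from the page

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)          # stabilise before exp
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def init_params(cnn_dim, mlp_dim, hidden=256, n_classes=4, seed=0):
    """Random stand-in weights; a trained model would load learned values."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0, 0.05, (cnn_dim + mlp_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0, 0.05, (hidden, n_classes)),
        "b2": np.zeros(n_classes),
    }

def fuse_and_classify(cnn_emb, mlp_emb, params):
    """Concat -> Dense 256 (ReLU assumed) -> Dense 4 -> softmax."""
    fused = np.concatenate([cnn_emb, mlp_emb], axis=-1)
    hidden = np.maximum(0.0, fused @ params["W1"] + params["b1"])
    return softmax(hidden @ params["W2"] + params["b2"])

# Usage with hypothetical branch outputs
params = init_params(cnn_dim=128, mlp_dim=64)
rng = np.random.default_rng(1)
probs = fuse_and_classify(rng.normal(size=128), rng.normal(size=64), params)
label = EMOTIONS[int(np.argmax(probs))]            # top prediction
confidence = dict(zip(EMOTIONS, probs.round(3)))   # per-class confidence scores
```

Concatenation keeps the two views separate until the dense layer, which is then free to learn how much weight each branch deserves.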

01 / INPUT
Raw Audio
22,050 Hz · mono
silence-trimmed · 3 s window
02 / CNN
Mel Spectrogram
128 mel bands × 131 frames
3-channel multiscale
03 / MLP
Flat Feature Vector
645 dims · MFCC + chroma
spectral · tonnetz · pitch
04 / FUSION
Feature Fusion
Concat → Dense 256
CBAM attention · Dropout
05 / OUTPUT
Classification
Softmax(4) · Focal Loss
angry · happy · neutral · sad
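The output stage lists focal loss as the training objective. For reference, the standard multi-class form is FL = -(1 - p_t)^gamma * log(p_t), where p_t is the probability the model assigns to the true class. The sketch below assumes gamma = 2 and no per-class weighting, neither of which is stated on the page.

```python
import numpy as np

def focal_loss(probs, targets, gamma=2.0):
    """Mean multi-class focal loss over a batch.

    probs:   (batch, n_classes) softmax outputs
    targets: (batch,) integer class indices
    gamma:   focusing parameter (2.0 is an assumed value)
    """
    probs = np.asarray(probs, dtype=np.float64)
    targets = np.asarray(targets)
    p_t = probs[np.arange(len(targets)), targets]   # prob. of the true class
    p_t = np.clip(p_t, 1e-12, 1.0)                  # guard against log(0)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))
```

With `gamma=0` this reduces to ordinary cross-entropy. Larger values shrink the loss on clips the model already classifies confidently, so training concentrates on the harder ones.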

Four emotional states, one model

Angry
89.2%
class accuracy
Happy
85.7%
class accuracy
Neutral
84.1%
class accuracy
Sad
90.8%
class accuracy
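For readers who want to reproduce figures of this kind on their own data, per-class accuracy can be read off a confusion matrix. The sketch assumes "class accuracy" means the fraction of clips of each true class that were predicted correctly (per-class recall). The labels in the usage lines are toy values for illustration, not results from this model.

```python
import numpy as np

EMOTIONS = ["angry", "happy", "neutral", "sad"]

def per_class_accuracy(y_true, y_pred, n_classes=4):
    """Fraction of each true class predicted correctly (per-class recall)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)  # rows = true class
    support = cm.sum(axis=1)
    return cm.diagonal() / np.maximum(support, 1)   # avoid dividing by zero

# Toy labels only, not the model's predictions
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 0]
scores = dict(zip(EMOTIONS, per_class_accuracy(y_true, y_pred)))
```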

Try it yourself


Upload an audio clip or record 3 seconds of your voice. The model processes it in real time and returns an emotion prediction with confidence scores.

Drop your audio file here

supports .wav · .mp3 · .flac · .ogg

