Reality Defender's audio detection analyzes speech content using multiple models to identify signs of AI generation or manipulation. Each audio file receives a single overall judgement and score, produced by combining signals across models that ran on that content.
What's included in audio analysis
Audio detection includes signals from the following individual models, combined by an ensemble:
rd-everest-aud (Advanced) — uses general-purpose embeddings to discriminate synthetic from authentic speech
rd-marconi-aud (Foundational) — uses embeddings extracted from an internal proprietary model trained to expose a variety of generative artifacts in synthetic speech
rd-slim-aud (Generalizable) — detects mismatches in style and linguistic patterns that are otherwise unique to real human speech
The models listed above contribute to a single rd-aud-ensemble score, which represents the combined assessment of manipulation likelihood for the file.
What you'll see in the API response
Each model returns its own result in the models[] array, including a status, finalScore, and predictionNumber. The top-level resultsSummary contains the overall verdict and aggregated score.
Models that are not applicable to a given media type (for example, video or image models on an audio file) will return "status": "NOT_APPLICABLE" with null score fields. This is expected behavior.
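As a rough illustration of the response shape described above, the following Python sketch filters the models[] array down to the entries that actually ran on an audio file. The field names models, status, finalScore, and predictionNumber come from the description above. The name key, the status value "ANALYZED", the non-audio model name, and all numeric values are assumptions made for this example and may differ from the real payload.

```python
from typing import Any


def applicable_models(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the models[] entries that were applicable to the media type."""
    return [
        model
        for model in response.get("models", [])
        if model.get("status") != "NOT_APPLICABLE"
    ]


# Hypothetical payload: the "name" key, the "ANALYZED" status, the
# "example-image-model" entry, and every number are invented for illustration.
sample_response = {
    "models": [
        {"name": "rd-everest-aud", "status": "ANALYZED",
         "finalScore": 0.91, "predictionNumber": 0.91},
        {"name": "rd-marconi-aud", "status": "ANALYZED",
         "finalScore": 0.42, "predictionNumber": 0.42},
        {"name": "example-image-model", "status": "NOT_APPLICABLE",
         "finalScore": None, "predictionNumber": None},
    ]
}

ran = applicable_models(sample_response)
ran_names = [model["name"] for model in ran]
```

Entries with "status": "NOT_APPLICABLE" are skipped rather than treated as errors, since null scores on those entries are expected.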
How to interpret audio model results
The models use different approaches to detect synthetic speech. Divergence between them is expected: one model may detect manipulation that another does not, depending on how the audio was generated. The ensemble accounts for this by combining the individual signals into a single calibrated score.
Audio models analyze content in 3-second segments, with an aggregation model producing the final decision for the full file. The score you receive reflects the aggregated assessment across the entire audio.
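To make the segment-then-aggregate idea concrete, here is a minimal Python sketch. It only illustrates the general pattern. The actual aggregation is performed by a model whose behavior is not documented here, so the plain mean below is a stand-in and not the real method. The function names, the handling of a shorter final segment, and the example scores are all assumptions.

```python
import math
from statistics import fmean


def segment_bounds(duration_s: float, window_s: float = 3.0) -> list[tuple[float, float]]:
    """Split [0, duration_s] into consecutive windows of window_s seconds.

    Assumption: a trailing partial window is kept as a shorter segment.
    """
    count = math.ceil(duration_s / window_s)
    return [
        (i * window_s, min((i + 1) * window_s, duration_s))
        for i in range(count)
    ]


def aggregate_scores(segment_scores: list[float]) -> float:
    """Stand-in aggregation: a plain mean, NOT the production aggregation model."""
    return fmean(segment_scores)


bounds = segment_bounds(7.5)
# Hypothetical per-segment scores, one per window above.
file_score = aggregate_scores([0.2, 0.4, 0.9])
```

The point of the sketch is that a single file-level score summarizes many short windows, so a brief manipulated passage inside a long recording can be diluted or emphasized depending on how the aggregation weighs segments.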
For most use cases, the overall audio judgement and score in resultsSummary should be used as the primary decision signal. Individual model results are available for debugging, auditing, or deeper investigation.
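The guidance above could be applied along these lines: read the decision from resultsSummary, and keep per-model rows only for auditing. This is a sketch under stated assumptions. resultsSummary and models are named in this document, but the status and finalScore keys inside resultsSummary, the name key on each model, the status values, and the sample numbers are hypothetical.

```python
from typing import Any, Optional


def primary_signal(response: dict[str, Any]) -> tuple[Optional[str], Optional[float]]:
    """Return (verdict, score) from the top-level resultsSummary.

    Assumption: the verdict and score live under "status" and "finalScore".
    """
    summary = response.get("resultsSummary") or {}
    return summary.get("status"), summary.get("finalScore")


def audit_rows(response: dict[str, Any]) -> list[tuple[Any, Any, Any]]:
    """Collect per-model details for debugging or auditing, not for the decision."""
    return [
        (model.get("name"), model.get("status"), model.get("finalScore"))
        for model in response.get("models", [])
    ]


# Hypothetical payload; all keys inside resultsSummary and all values are invented.
example = {
    "resultsSummary": {"status": "MANIPULATED", "finalScore": 0.88},
    "models": [
        {"name": "rd-everest-aud", "status": "MANIPULATED", "finalScore": 0.91},
        {"name": "rd-marconi-aud", "status": "AUTHENTIC", "finalScore": 0.42},
    ],
}

verdict, score = primary_signal(example)
rows = audit_rows(example)
```

In this invented example the two models disagree, yet the decision still comes from the summary. The per-model rows are recorded only to explain the result later.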