Sloka does zero-shot detection of the most popular audio cloning tools. That means it was trained on just 4 of the tools below, yet it can detect content created by tools it never saw in training.
All open-source and commercial cloning tools are built on one of these known types of ML model:
Generative Adversarial Networks (GANs)
WaveNet
Autoregressive Models
Transformer-based models
Variational Autoencoders (VAEs)
Cartesia (https://cartesia.ai) recently debuted a model built on a different architecture: State Space Models (SSMs). It has impressive performance (< 200 ms latency) and matches ElevenLabs or Play.ht in how it sounds.
How does Sloka perform on Cartesia-cloned content?
Sloka flags all Cartesia-generated audio with the same 100% accuracy it achieves on, say, ElevenLabs or Play.ht. This is the strength of Sloka's zero-shot detection: you don't need to train it on new ML models. It has a very clear understanding of real audio and can pick up even subtle traces of GenAI content.
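To make the idea concrete, here is a toy sketch of detection that learns only what real audio looks like and flags anything that deviates, regardless of which generator produced it. This is an illustration of the general principle, not Sloka's actual method: the class name RealAudioModel, the three-number feature vectors, and the z-score threshold are all hypothetical, and a real system would work on learned audio features rather than hand-typed numbers.

```python
from statistics import mean, stdev
from typing import List, Sequence


class RealAudioModel:
    """Toy one-class detector (hypothetical): fitted on real audio features only."""

    def __init__(self, threshold: float = 3.0) -> None:
        # Assumed cutoff: flag clips more than 3 standard deviations from real audio.
        self.threshold = threshold
        self.means: List[float] = []
        self.stds: List[float] = []

    def fit(self, real_features: Sequence[Sequence[float]]) -> "RealAudioModel":
        # Learn per-feature mean and spread from real clips; no cloned audio is used.
        columns = list(zip(*real_features))
        self.means = [mean(col) for col in columns]
        self.stds = [max(stdev(col), 1e-9) for col in columns]  # avoid divide-by-zero
        return self

    def score(self, features: Sequence[float]) -> float:
        # Largest absolute z-score across all features.
        return max(
            abs(x - m) / s for x, m, s in zip(features, self.means, self.stds)
        )

    def is_synthetic(self, features: Sequence[float]) -> bool:
        return self.score(features) > self.threshold


# Made-up feature vectors standing in for real recordings (illustrative only).
real_training = [
    [0.50, 1.20, 0.30],
    [0.52, 1.18, 0.31],
    [0.48, 1.22, 0.29],
    [0.51, 1.19, 0.30],
    [0.49, 1.21, 0.32],
]
detector = RealAudioModel().fit(real_training)

real_clip = [0.50, 1.20, 0.31]              # close to the real distribution
unseen_generator_clip = [0.50, 1.20, 0.45]  # subtle artifact in one feature

print(detector.is_synthetic(real_clip))              # False
print(detector.is_synthetic(unseen_generator_clip))  # True
```

Because the detector is never shown cloned audio, a clip from a generator it has never encountered is caught the same way as one from a familiar tool: it simply fails to look real.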
Audio
Sanjay Schwab (Cloned)
Sanjay WF (Real)
Newsman WF (Cloned)