Audiogrep is fun. Grab some audio, automatically transcribe it, then chop it apart and stitch it all back together again in amusing ways. I fed it my talk from the Interactive Pasts Conference (video archive).
My video:
[singing over video of Caesar’s assassination] The more we get together, together, together; the more we get together, the happier we’ll be. For your friends are my friends and my friends are your friends- the more we get together the happier we’ll be [fade out]
[Shawn] Let us kill some Romans…
The automatic transcription:
ma’am and no are you looking at a piano and again this law and a manner that have arrived yes that i thought that fact what mood he and and that
So ok, a bit dodgy, yes? From time to time it almost gets it right, or at least, something close to the kinds of things I might say:
as it happens stemmed from bricks fossilized knows social interaction in selecting becomes so straight for or as simulation on social hour
Anyway. Once the audio is ‘transcribed’, you can then use Audiogrep to search the text for words or patterns. It’ll then use the timestamps in the transcription to stitch together the audio. It’s quite amazing, really. So now I’m grabbing all of the audio from the entire conference and we’ll see what happens. Once I have the transcription files, I *think* I can use Videogrep (which requires .srt files, but maybe I can just swap?) to stitch together a video supercut of, say, every time someone says ‘game’ or whatever.
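To make the search-then-stitch idea a bit more concrete, here is a minimal sketch in Python. It is not Audiogrep's actual code. It assumes the transcription has already been parsed into (word, start, end) tuples with times in seconds; the function name find_segments, the padding parameter, and the sample data are all made up for illustration.

```python
import re
from typing import List, Tuple

# Assumed shape of one transcribed word: (text, start_seconds, end_seconds).
# This is a hypothetical structure, not Audiogrep's own transcription format.
Word = Tuple[str, float, float]


def find_segments(words: List[Word], pattern: str,
                  padding: float = 0.0) -> List[Tuple[float, float]]:
    """Return merged (start, end) time ranges for words matching pattern."""
    regex = re.compile(pattern, re.IGNORECASE)

    # Collect the time range of every matching word, widened by the padding.
    hits = [
        (max(0.0, start - padding), end + padding)
        for text, start, end in words
        if regex.search(text)
    ]

    # Merge ranges that touch or overlap, so the same audio is not cut twice.
    merged: List[Tuple[float, float]] = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Invented sample data standing in for a real transcription.
words = [
    ("let", 0.0, 0.25), ("us", 0.25, 0.5), ("play", 0.5, 1.0),
    ("a", 1.0, 1.25), ("game", 1.25, 1.75), ("games", 2.0, 2.5),
    ("are", 2.5, 2.75), ("fun", 2.75, 3.25),
]

segments = find_segments(words, r"^games?$", padding=0.25)
print(segments)
```

Each (start, end) pair that comes back is a slice of the original recording. Something like ffmpeg could then cut those slices out and join them end to end to make the supercut.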
Things I used: youtube-dl to grab the audio:
youtube-dl --extract-audio --audio-format mp3 -l VIDEO-URL-HERE
and ffmpeg and all of its dependencies.