Playing with the exact same use case, I was blown away by how well Gemini (2.5 Flash, IIRC) transcribed podcasts with speaker identification and handled the common "overlaps" in conversations. I can't remember which local Ollama models I tried, but I was not very impressed with them.