

jamez on Nov 2, 2022 | on: An AI generated, never-ending discussion between W...

The TTS model is trained on two things: speech samples and their transcripts. If you insert a sniffle symbol into the transcript each time a sniffle occurs in the speech, and there are enough examples, I am confident the model would pick up on it. Then you would be able to reproduce a sniffle at generation time. The more time-consuming bit would be adding those sniffle symbols to the language model's training data, so that they show up organically in the text-generation phase. But seriously, it's not worth it. I think he's a brilliant man with an idiosyncratic way of speaking; let's leave it at that.
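To make the transcript-annotation idea concrete, here is a minimal sketch in Python. It assumes you already have word-level timestamps for each transcript and a list of times at which sniffles were detected in the audio; neither is described in the thread. The `<sniffle>` token, the `Word` class, and the `annotate` function are hypothetical names chosen for illustration, not part of any particular TTS toolkit.

```python
from dataclasses import dataclass

# Hypothetical marker; any token that never occurs in the real transcripts would do.
SNIFFLE = "<sniffle>"


@dataclass
class Word:
    text: str
    start: float  # word onset in seconds (assumed to come from a forced aligner)
    end: float    # word offset in seconds


def annotate(words: list[Word], sniffle_times: list[float]) -> str:
    """Insert SNIFFLE before the first word starting at or after each sniffle time."""
    pending = sorted(sniffle_times)
    out: list[str] = []
    i = 0
    for word in words:
        # Emit every sniffle that happened before this word begins.
        while i < len(pending) and pending[i] <= word.start:
            out.append(SNIFFLE)
            i += 1
        out.append(word.text)
    # Sniffles that occur after the last word go at the end.
    out.extend(SNIFFLE for _ in pending[i:])
    return " ".join(out)


if __name__ == "__main__":
    words = [
        Word("so", 0.0, 0.2),
        Word("consciousness", 0.9, 1.6),
        Word("is", 1.7, 1.8),
    ]
    print(annotate(words, sniffle_times=[0.5, 2.0]))
```

The same annotated transcripts could, in principle, also be used as training text for the language model, which is the second step described above; how well either model would actually learn the token is an open question here.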

steve_adams_86 on Nov 3, 2022

I agree. I personally don't hear his sniffles when I'm listening to him intently, so to me they're irrelevant. I was mostly curious whether and how, generally speaking, a model could be trained to sniffle. Now that you've described it, though, it seems fairly clear, so thanks!

nortonham on Nov 3, 2022

In all seriousness, how do you not hear them?
