I prefer listening to real people. No matter how good AI voices become, I still like knowing that the one reading the book to me understands what they are saying.
The issue is that there are a million books out there with no audio version, and there never will be. I’m OK with AI doing readings of books that wouldn’t otherwise get an audio version.
With a machine voice that makes no attempt to imitate human intonation, yes.
Hey for the deaf and people who need the info on the page, robot voice is better than nothing.
Just pretend the book is being narrated by Stephen Hawking!
WHY WOULD YOU SAY THAT. ROBOTS CAN SHOW EMOTION.
tiktok voice:
hate. let me tell you how much i’ve come to hate you since i began to live. there are 387.44 million miles of printed circuits in wafer thin layers that fill my complex…
The professional ai voices are amazing
unironically, that is a character that could use an uncanny robotic AI voice.
Surely I can just do that myself with an epub and a free AI.
Glad I binned my Audible subscription many years ago.
I can get that for free. There are apps that will read an ebook to you already. The whole point of paying the premium on Audible is the superior reading/acting, not putting up with mispronounced words, weird cadence and an inability to handle acronyms.
I thought people mainly paid for the large library
I’ve tried one that works surprisingly well. Each sentence had great pacing, cadence, and correct enunciation; it even had the tone right when someone was shouting or angry or sad.
I wouldn’t really recommend it, though. While I couldn’t pick out any single thing that was wrong, overall it just didn’t quite flow. It’s like watching someone act who is technically doing everything right, but it just isn’t good. It basically didn’t understand the greater context of the story and was just saying lines.
It was uncanny valley, but exclusively with voice.
Looking for iOS recommendations, preferably without a subscription that can read epub/pdf
I’m an Android user, so I’m not sure if it’s on iOS, but I’ve used ReadEra.
It’s on iOS.
Is there an offline tool that generates realistic audio for epubs as MP3? Something like the free AI tool Vibe, which is for transcription. Is there something similar for TTS that runs locally without complicated setup? (Most are complicated, using Python etc. just for installation.)
edit: it needs to be close to realistic, or at least have accurate pronunciation, because I am using the audio from books to learn languages, to improve listening comprehension while reading the book.
Great question! I need to come back to this thread to see if something is suggested.
I’ve loaded epubs into the app ReadEra, which lets you read it like any other novel app or will, in real time, read it to you. It’s not the most natural of speech, but was good enough for my commute when I was in the midst of a compelling book.
Download TTS Server, and change the engine in ReadEra to use it. Use the Microsoft Azure settings in TTS Server; they’re much more realistic. It’s a little slow, though, which is my only complaint: it sends/receives a paragraph at a time, resulting in a pause now and again.
How do I do that? I have both ReadEra and TTS Server on a Samsung Galaxy.
Beautiful, it works. Why not.
It’s Amazon, what did you expect? Enshittification and monopoly abuse, no surprise.
Idk, they have pretty good stats that nobody will listen to an audio book if they don’t like the narrator, so being able to choose your own narrator on the fly isn’t really shitty
Enshittification isn’t adding new features that people want, it’s gradually lowering the quality of the product. So here if Audible is solely adding more possibilities, never at the cost of higher quality ones degrading, then indeed I’m wrong.
If, though, they hire fewer people to do good voice acting, then it’s really shitty.
I genuinely hope I’m wrong and they are ONLY adding new capabilities… but my entire experience with capitalism is that obtaining a monopolistic position is not done to improve quality but rather to increase margins regardless of how.
We’ll see!
trained on stolen books? then I guess I can download these from anywhere I may find for free as well, right?
Yep, copyright doesn’t apply to AI generated content.
(edit: the original book copyright would still apply however… So would only be public domain if the book itself was also public domain)
How about I spin up an AI model that outputs a near 1:1 copy of the training data?
Does that circumvent the copyright?
Dunno, probably to some extent, similarly to how remixes of music sometimes have to pay royalties to the source of the sample if it’s recognisable…?
Actually, it would probably be more similar to the George Carlin AI impersonation lawsuit, but they settled, so idk.
free AI read audiobooks coming up
you couldn’t pay me to listen to an AI narrated book
Me too: there’s just something about how repetitive their cadence can be, and how they put random inflections and stresses on words where it doesn’t make sense.
AI voices are not trained on books.
The ethical issue there is more around cloning celebrities
but AI itself is
Not sure what you are trying to say here. AI itself is an equation.
AI models have been trained on copyright-protected books illegally. Maybe the voice models have not.
Well, yeah, you can. Whoever told you that you can’t, don’t believe them, they are probably being paid to say it. You could also pay for the book to support the author, but most likely your money will not go to the author, so don’t bother.
This has actually got me thinking differently about AI altogether.
The best use for AI needs to be for the individual. I want MY ai to read books or research with or complete tasks for me.
I don’t want another company to do it for me or monetize it or steal content with it.
isn’t the current law not recognising AI stuff for copyright?
I.e., downloading their audiobooks illegally is impossible, as they are by default in the public domain.
I like your way of thinking
youtube already does it.
And it’s shit
YouTube is crawling with it. It’s unlistenable shit. The prosody is badly implemented, pronunciation is infuriatingly bad, and a lot of the text that these TTS are reading appears to be AI-generated. Otherwise, already dire standards of literacy are getting worse at an accelerating rate.
like how fans got obsessed with AI-generated DBZ (what ifs)
Is voice AI trained on stolen data? I was under the impression that was LLMs.
Pretty much anything handling unstructured data (audio, video, text) is using training data that has copyrighted content.
This is clearly the future despite the outrage here.
There are at least 389 living languages with over 1M speakers. That alone means it’s impossible to reach some people and they get left out. Most of these languages don’t even have enough professional voice actors to cover the bandwidth.
There are thousands of books released every year. That’s impossible to cover even in English alone.
It’s an objective net good to have more accessible audiobooks, and the privileged people who do care about this stuff can very much afford to vote with their wallets for non-AI voices.
In fact, since the AI moat is so minimal, this will very quickly be adopted by open source solutions, providing audiobook access to millions if not billions of people for whom this was not an option. It’s amazing.
don’t even have enough professional voice actors to cover the bandwidth
I’m pretty sure there’d be a lot more people ready to do that job if there was good remuneration. Heck, that sounds a lot more fun than a LOT of jobs out there!
Sure but that’s not how free markets work. If there’s only 3 million consumers you can’t afford 3 million voice actors but you can afford 3 million AI renders.
I’m not an economist but… 1 voice actor can serve 3 million consumers if they listen to the same content.
Anyway that’s not even my point, my point is that it is possible to cover, we as a society, driven both by VC with strategies of capturing markets (so precisely going against “free” market as an ideal) and consumers are making choices (like when one buys from the local farmer market vs Amazon deliveries). If though we, while fully understanding the consequence of such choice (namely how the sausage is made, here how AI models are trained and then run), believe it’s not valuable then sure, we can make that choice.
I’m just warning consumers then that if they don’t pay for quality content made a certain way, they can’t complain that they in turn don’t get the job they wanted because nobody out there is ready to pay for it.
2 sides of the same coin.
Most of these languages don’t even have enough professional voice actors to cover the bandwidth.
And you think anyone is training AI voice models for those languages? Have you even seen how long it takes even large companies like Google to support the languages with hundreds of millions of speakers?
That’s the benefit of using AI and machine learning - once you have enough source material, you can throw it all in and it’ll eventually spit out a model.
Which is exactly what Meta did with their Massively Multilingual Speech project, which supports text-to-speech and speech-to-text for 1107 different languages. Is it actually any good in 99% of them? I don’t have a clue, but it exists.
Seems more like a proof of concept project for that paper than something they are pursuing seriously judging by the GitHub location in some example folder that hasn’t seen any significant updates in over a year. If it is so great I would assume they would pursue it more actively and replace existing models with it two years later.
It becomes easier and cheaper every day. Today’s open source LLMs are better than last year’s best model.
You’re fundamentally misunderstanding the comment you replied to, they are not saying that voice AI are bad, they are saying there is not enough training data to improve the AI for these languages. How will it improve without good training data?
That’s not how AI training works, and even then there’s absolutely enough data. Also, training data can be created and even synthesized. There are many techniques to extract training value from datasets, and we discover more every year - it’s really not the problem you think it is.
I’m genuinely confused how AI illiterate users here are. It’s just blind leading the blind.
Is it? I just tried again yesterday for a simple script since coding is the one thing apparently AI will replace people like me and it could not put together a working JavaScript script.
I have yet to see tangible results not announced by the people with sunken cost exploding their balls.
Sounds like a skill issue my dude. While you struggle to get a js script people are putting out entire programs with AI assistants so sure - you’re right and they’re wrong
yeah, I guess I didn’t prompt right lol
Yes, to effectively use AI you actually have to understand the medium you’re in to describe the problem you’re trying to solve. You can get there with prompting but it’ll take you much longer if you just don’t understand code yourself.
That’s why most senior software devs are not afraid of LLMs: they need strong oversight, and that’s exactly what years of software dev experience train you to do.
I’m a programmer with 22 years of experience. I understood the code well enough having written the solution myself the day before; I was precisely trying to see if AI would be useful with this example as it was a tad above basic stuff but not niche at all…
It failed miserably. The code ran but didn’t do anything at all, or it did the wrong thing (updating the wrong column, for example). It would often ignore my requirements in favour of something easier.
The worst part is it kept saying it “got it” and telling me some bs about why it didn’t work, just to not correct it.
That’s why most senior software devs are not afraid of LLMs: they need strong oversight, and that’s exactly what years of software dev experience train you to do.
What’s the point of this? If it cannot provide clean code and I have to check every line myself, I’d rather work with a junior who would usually do better, actually learn from my feedback and their experience, and eventually become an independent asset.
Stop drinking the kool aid
but for a service like audible.
So you can take the square root of that:
5x+7integral from 5z to 9x derivative of deltaT minus minus multiply times 3. Figure 1
Figure 1 shows a typical lizard living in a square root.
For now at least I bet this’ll be pretty mediocre. I’m a big audiobook fan and voice actors have a massive impact on the quality of the finished product. A great voice actor can make a mediocre book fun and engaging, a bad one can make a great book unlistenable. The best do great voice differentiation. As an example I’ve really enjoyed Andrea Parsneau’s work in The Wandering Inn series.
Imagine not liking the voicing of a book, so you just pick a different one.
You seem to be implying that’s ridiculous, but it is indeed exactly like that, though it’s not like I’m expecting every performance to be a masterpiece.
It’s also pretty subjective, for example folks either seem to love or hate R. C. Bray. My mother can’t stand the guy’s style, I think he’s okay.
Patrick Tull’s Aubrey/Maturin series is fucking amazing.
This is dumb as hell… if I wanted AI to read a book poorly to me, I’d just use screen reading accessibility features.
Are there any good ones nowadays that don’t sound like a robot?
Sure there are. ElevenLabs is one. You can probably tell they’re not human but they’re really decent.
Just tried it. Still a machine, but much better than default TTS.
In 10 years it’s probably gonna be really impressive.
They still don’t understand the context of what they’re reading though so they can’t apply tone correctly.
From what I’ve been able to hear it’s not that bad. They’re pretty good at having a general tone. But they may fail when it comes to emotional tones, like anger or sadness. But for just reading a book aloud there shouldn’t be any issue.
Fair. Definitely some awkward phrasing, but it’ll get better.
Speechify is probably the best option for this particular use case.
No
Why would they, when you can just plug any epub into a program and use Google TTS? I’ve listened to about a book a day for the past few years doing this and I love it. Yeah, it took getting used to, but once you find an AI voice you like and figure out which words to auto-replace to sound right, it’s honestly better than an audiobook. Well, at least to me it is; I could never stand when the reader would change their voice for different characters.
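For anyone wondering what that "auto replace" step could look like, here is a rough Python sketch. It is not how any particular reader app does it; the table entries and the function names are made up for illustration. The idea is just to swap spellings the voice reads badly for respellings it reads well before the text goes to the TTS engine.

```python
import re

# Hypothetical replacement table: spellings a TTS voice tends to misread,
# mapped to respellings it reads correctly. These entries are invented examples.
REPLACEMENTS = {
    "epub": "e-pub",
    "TTS": "text to speech",
    "Dr.": "Doctor",
}

def build_pattern(replacements):
    """Compile one regex that matches any key as a whole word."""
    # Longest keys first so a longer phrase wins over a shorter one it contains.
    keys = sorted(replacements, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in keys)
    # Lookarounds instead of \b so keys ending in punctuation ("Dr.") still match.
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)

def apply_replacements(text, replacements=REPLACEMENTS):
    """Return text with every whole-word match swapped for its respelling."""
    lookup = {k.lower(): v for k, v in replacements.items()}
    pattern = build_pattern(replacements)
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)

if __name__ == "__main__":
    print(apply_replacements("Dr. Smith loaded the EPUB into a TTS app."))
```

Mapping a phrase to an empty string would drop it entirely, which is one way to get rid of a catchphrase you are tired of hearing.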
This is what I don’t get from a business standpoint. Why would anyone buy an AI read audiobook for $20 when they can get the exact same audio by buying the ebook for $0.99 and running it through AI?
My experience is these systems never get the intonation and stresses right. It drives me nuts and I can’t listen to it.
Idk how much experience you have with this type of thing, but when I listen to my books I use my imagination to picture and hear things the way I want, just like when I read a book normally. I’ve read well over 1000 books doing so, and that doesn’t count rereads, and having the ability and willingness to use this method has drastically increased the amount I read but also my enjoyment doing so. The app I use also allows me to edit words and phrases throughout the book, so I can correct how things are pronounced. Hell, there’s a series that has this stupid catchphrase that I completely removed from all 20 books because it was annoying. I’m sure I’m only a single person that likes this method, but if I can find it enjoyable, then when real AI gets put to work it’ll capture others.
It was bound to happen. I’m okay with ones that were never going to be turned into audiobooks to begin with… but they likely will use that as the norm for all books… I guess unless the author/publisher says not to.
Yeah currently contracts require the author’s or publisher’s consent. If anyone is a writer make sure to triple check your contracts for this shit.
And unless you are Stephen King or the like, exactly how are you going to get the publishing cartel (I think they’ve consolidated down to 3-4 publishers now) to change their contract to not include this? Their response will almost certainly be either “that’s non-negotiable” or “ok then you get half as much money”.
Publishers will at least retain the right to use AI audio books for themselves. And it’s much easier for an author to get a piece of something the publisher does than it is for them to get money for books Amazon recorded without their consent.
I’ve listened to a couple of audiobooks where the author did the voice and I liked them. They know how phrases need to sound better than an AI, I would assume.