Keynote: AI Voice Cloning & Localization
Speaker : Dr. Winn Worawutkunchai, Founder & CEO BOTNOI Group
Event : SCBX Unlocking AI EP1, Thailand Path to AI opportunities
Collaboration : SCBX and Insiderly.ai
Venue : SCBX NextTech, Siam Paragon, 4th Floor
Today's AI can do far more than answer questions on the spot.
It can also create images from just a few keywords.
Voice is another thing AI can generate. That may sound daunting, but when it is used positively to help others, it can be of great benefit.
BOTNOI is a company that uses AI to generate and clone voices, with the aim of putting them to good use.

Dr. Winn Worawutkunchai, Founder & CEO of BOTNOI Group, introduced the development of voice cloning with the example of Andrew Ng, the AI expert, who found that someone had cloned his voice on LinkedIn. The question put to listeners was: which voice is the one cloned by AI?
People could hardly tell which voice was the real Andrew Ng.
Some websites, such as ThaiPBS, now go further by offering a "Read to Listen" service: the reader presses a button and a cloned news-anchor voice reads the article aloud. It suits people who would rather listen than read.
BOTNOI cloned the voice of a ThaiPBS reporter so that, with one click, the website reads the news as if the reporter were actually reading it.
The result is impressive, but ThaiPBS was also widely criticized. Even with BOTNOI's AI reading the news, the AI still could not read abbreviations correctly, such as the abbreviation for Inspector General, and listeners assumed the journalist had misread the word.
That reaction shows how convincing the cloned voice is. In the end, a disclaimer had to be added stating that the voice is AI-generated, not a real announcer, to prevent misunderstandings.
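The abbreviation problem described above is commonly handled with a text normalization step that expands abbreviations before the text reaches the speech synthesizer. The following is a minimal sketch of that idea, not a description of how BOTNOI or ThaiPBS actually handle it. The lookup table and the function name `expand_abbreviations` are hypothetical, and English abbreviations stand in for the Thai ones.

```python
import re

# Hypothetical lookup table: abbreviation -> spoken form.
# A real system would need a much larger, language-specific table.
ABBREVIATIONS = {
    "Insp. Gen.": "Inspector General",
    "Dr.": "Doctor",
    "Gov.": "Governor",
}

# Try longer abbreviations first so "Insp. Gen." wins over any shorter entry.
_alternatives = "|".join(
    re.escape(abbr) for abbr in sorted(ABBREVIATIONS, key=len, reverse=True)
)
# Lookarounds keep the match from starting or ending inside a longer word.
_pattern = re.compile(r"(?<!\w)(" + _alternatives + r")(?!\w)")


def expand_abbreviations(text):
    """Replace each known abbreviation with its spoken form."""
    return _pattern.sub(lambda match: ABBREVIATIONS[match.group(1)], text)


print(expand_abbreviations("The Insp. Gen. met Dr. Lee."))
```

Any abbreviation missing from the table passes through unchanged, which is exactly the kind of gap that produces a misreading.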
Humans have always tried to imitate nature, from building airplanes in the shape of birds to shaping swords and knives after tigers.
Some former limitations, such as cars being unable to drive themselves, are no longer obstacles: today we can build a "brain" for a car and let it drive on its own without a driver controlling it.

BOTNOI applies the same concept to voice generation. It imitates the structure of the human brain to arrive at an input-output mapping that produces realistic speech.
Dr. Winn said that humans remember nothing from before the age of 4.
He observed how his own son responded to the sounds around him. What he learned was that, at that age, the brain of a child listening to its mother's voice tries to connect what it hears with what it sees, a lesson he applies to BOTNOI's work.
To clone someone's voice, BOTNOI has that person read about 200 sentences aloud, then feeds the recordings and the matching text into the system for the AI to learn from.
The early experiments did not sound good, but the quality rose significantly as the system was improved.
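The data preparation described above, pairing each recorded sentence with its text, can be sketched as follows. This is an illustration under stated assumptions, not BOTNOI's pipeline: the `Utterance` type, the `0001.wav` file naming, and the pipe-delimited manifest format are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Utterance:
    audio_path: Path  # hypothetical location of the recording
    text: str         # sentence the speaker read aloud


def build_manifest(sentences, audio_dir):
    """Pair sentence number i with <audio_dir>/<i as 4 digits>.wav (assumed naming)."""
    manifest = []
    for index, sentence in enumerate(sentences, start=1):
        text = sentence.strip()
        if not text:
            raise ValueError(f"sentence {index} is empty")
        manifest.append(Utterance(Path(audio_dir) / f"{index:04d}.wav", text))
    return manifest


def to_lines(manifest):
    """Render one 'file|text' line per utterance for a training tool to read."""
    return [f"{u.audio_path.name}|{u.text}" for u in manifest]


# Two sentences stand in for the roughly 200 mentioned in the talk.
manifest = build_manifest(["Hello there.", "Good morning."], "recordings")
for line in to_lines(manifest):
    print(line)
```

Keeping audio and text aligned one-to-one matters because the model learns the mapping from text to sound; a single mismatched pair teaches it the wrong pronunciation.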
Beyond voice cloning there are also deepfakes, which clone a person's face along with the voice and imitate the person's mouth movements. The person may never have said those words, and the fake is very hard to spot.
BOTNOI has also developed the technology to the point where a person's voice can speak another language while keeping the speaker's accent and identity, an achievement to be proud of and to build on. One application is movie dubbing.
If the technology advances far enough, the lead actor in a Hollywood movie might one day speak Thai in his own voice and accent.
Another case study comes from the COVID-19 pandemic, when sales at small shops declined and Cadbury created a campaign that used a famous Bollywood actor to advertise the shops and lift their sales.
The campaign used the voice of the famous actor Shah Rukh Khan and drew a great deal of attention, because the voice and image could be modified and tailored to sell different products without limit.
However, we must also watch for misuse. Despite being a creator of the technology himself, Dr. Winn has been cloned by others in order to deceive people.
It became a lesson for BOTNOI and raises the question of where this will lead, especially in an era when call center scam gangs are rampant.
Prevention is still difficult today, and there are few effective methods. One option is to hide a watermark in the AI voice at a frequency humans cannot hear, but a more robust approach is still needed, because a scammer who intends to misuse the voice can simply filter out that frequency and use the voice anyway.
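The inaudible-frequency watermark, and the reason it is fragile, can be sketched in a few lines. The example below is a minimal illustration, not BOTNOI's method. The 48 kHz sample rate, the 20 kHz marker tone, its amplitude, the detection threshold, and the 220 Hz tone standing in for a voice are all assumptions chosen for the demonstration. It adds a quiet high-frequency tone, detects it with the Goertzel algorithm, then applies a crude moving-average low-pass filter of the kind an attacker might use.

```python
import math

SAMPLE_RATE = 48_000    # samples per second (assumed)
MARK_HZ = 20_000        # assumed marker frequency, near the upper limit of hearing
MARK_AMPLITUDE = 0.01   # assumed: very quiet compared with the voice
WINDOW = 4_800          # analysis window: 0.1 s of audio
THRESHOLD = 0.005       # assumed detection threshold on estimated amplitude
FILTER_TAPS = 12        # moving-average length; 12 taps cancel 20 kHz at 48 kHz


def tone(freq_hz, n, amplitude):
    """Generate n samples of a sine wave."""
    step = 2 * math.pi * freq_hz / SAMPLE_RATE
    return [amplitude * math.sin(step * i) for i in range(n)]


def embed_watermark(voice):
    """Add the quiet marker tone to every sample."""
    mark = tone(MARK_HZ, len(voice), MARK_AMPLITUDE)
    return [v + m for v, m in zip(voice, mark)]


def tone_amplitude(samples, freq_hz):
    """Estimate the amplitude of one frequency with the Goertzel algorithm."""
    window = samples[:WINDOW]
    coeff = 2 * math.cos(2 * math.pi * freq_hz / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in window:
        s1, s2 = x + coeff * s1 - s2, s1
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2
    return 2 * math.sqrt(max(power, 0.0)) / len(window)


def is_watermarked(samples):
    return tone_amplitude(samples, MARK_HZ) > THRESHOLD


def moving_average(samples, taps):
    """Crude low-pass filter: mean of each run of `taps` consecutive samples."""
    return [
        sum(samples[i:i + taps]) / taps
        for i in range(len(samples) - taps + 1)
    ]


# A 220 Hz tone stands in for a voice recording (assumption for the demo).
voice = tone(220, WINDOW + FILTER_TAPS - 1, 0.5)
marked = embed_watermark(voice)
stripped = moving_average(marked, FILTER_TAPS)

print("clean voice flagged:   ", is_watermarked(voice))
print("marked voice flagged:  ", is_watermarked(marked))
print("filtered voice flagged:", is_watermarked(stripped))
```

After filtering, the detector no longer finds the marker even though the low-frequency content survives almost unchanged, which is the weakness the paragraph above describes.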
Dr. Winn hopes that all relevant agencies and AI developers will be aware of this and help ensure that cloned voices are used in the right way, while closing the gaps that criminals exploit as far as possible.
Image Credit : SCBX