Fake news 2.0: AI will soon be able to mimic any human voice
The biggest loss caused by AI will be the complete destruction of trust in anything
you see or hear
By William Wesler
In 2018, fears of fake news will pale in comparison to new technology that can fake the human
voice.
This could create security nightmares.
Worse still, it could strip away from each of us a part of our uniqueness.
But companies, universities and governments are already working furiously to decode the
human voice for many applications.
These range from better integration of our internet-of-things devices to enabling more
natural interactions between humans and machines.
Technologically adept nation states (the US, China and Estonia) have waded into this space
and tech giants such as Google, Amazon, Apple and Facebook also have special projects on
voice.
It's not that hard to develop an artificial voice, then model and reproduce spoken words and phrases.
I remember being amazed when my original Apple Macintosh informed me of the date and time
in a dry, digital tone.
Making a natural-sounding voice involves algorithms that are far more complex and computationally
expensive.
But that technology is available now.
As any speech pathologist will attest, the human voice is far more than vocal-cord vibrations.
These vibrations are caused by air escaping our lungs and forcing open our vocal folds,
a process that produces tones as unique as a fingerprint because of the thousands of
waveforms that are conjured simultaneously and in chorus.
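To make that idea concrete, here is a purely illustrative Python sketch (the pitch and harmonic weights are invented for the example, and NumPy is assumed to be available). It builds two tones at the same fundamental frequency out of the same simultaneous sine waveforms, weighted differently; the differing weights are a crude stand-in for why two voices at the same pitch still sound distinct.

    import numpy as np

    SAMPLE_RATE = 16_000   # audio samples per second
    DURATION = 1.0         # seconds of sound to generate
    F0 = 120.0             # fundamental frequency of the vocal folds, in Hz

    t = np.linspace(0.0, DURATION, int(SAMPLE_RATE * DURATION), endpoint=False)

    def voiced_tone(harmonic_weights):
        # A voiced sound is not one sine wave: the folds produce a fundamental
        # plus many harmonics, and the vocal tract weights each one. Summing
        # the weighted harmonics "in chorus" yields the composite waveform.
        tone = np.zeros_like(t)
        for n, weight in enumerate(harmonic_weights, start=1):
            tone += weight * np.sin(2.0 * np.pi * n * F0 * t)
        return tone / np.max(np.abs(tone))  # normalise to the range [-1, 1]

    voice_a = voiced_tone([1.0, 0.6, 0.4, 0.2, 0.1])   # one harmonic balance
    voice_b = voiced_tone([1.0, 0.1, 0.5, 0.05, 0.3])  # same pitch, new timbre

A real voice involves thousands of such components shifting from moment to moment, which is why it behaves like a fingerprint rather than a formula.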
But a voice's uniqueness is also tied to qualities we rarely consider, such as intonation, inflection and pacing.
These aspects of our speech are situational and often subconscious, and they make all the difference to the listener.
They tell us whether a phrase such as "Wow, that outfit is something!" should be interpreted as mean-spirited, sarcastic, loving or indifferent.
This challenge explains the early use of emoji in text messages.
They were needed to clarify the intent of a written message because it is extremely difficult to interpret the true meaning of conversational speech when it is written rather than spoken.
Details such as intonation, inflection and pacing are particularly difficult to model, but we are getting there.
Adobe's Project Voco is developing what is essentially a Photoshop of soundwaves.
It works by substituting waveforms for pixels to produce something that sounds natural.
The company is betting that, if enough of a person's speech can be recorded (or data mined), it will require little more than a cut-and-paste action to alter a recording of their voice.
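Adobe has not published Voco's internals, but the cut-and-paste idea can be sketched at the waveform level. The following Python fragment (file names and time offsets are hypothetical, and NumPy and SciPy are assumed) splices a word from one recording into another exactly as an image editor moves a selection of pixels:

    import numpy as np
    from scipy.io import wavfile

    # Hypothetical recordings: target.wav is the sentence to alter,
    # donor.wav has the same speaker saying the word to splice in.
    rate, target = wavfile.read("target.wav")
    _, donor = wavfile.read("donor.wav")

    def samples(seconds):
        return int(seconds * rate)  # convert a time offset to a sample index

    # Cut: lift the word out of the donor recording (offsets are made up).
    word = donor[samples(0.8):samples(1.4)]

    # Paste: overwrite a span of the target with it, treating audio samples
    # the way an image editor treats pixels.
    start = samples(2.0)
    edited = np.concatenate([target[:start], word, target[start + len(word):]])

    wavfile.write("edited.wav", rate, edited.astype(target.dtype))

Raw splicing like this leaves audible clicks and pitch jumps at the seams; the hard part, and what makes tools in this class unnerving, is smoothing intonation, inflection and pacing across the joins so that no seam can be heard.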
Adobe's initial results from Voco are eerie, as well as awe-inspiring.
The prowess of the prototype suggests how soon ordinary citizens will be unable to distinguish between real voices and spoofed ones.
If you have enough samples stored in your data library, then you can make anyone appear
to say almost anything.
Technology companies and investors are betting on the idea that these systems will eventually
have tremendous commercial value.
Even before that situation arises, though, this particular type of technology will present
big risks.
By 2018, a nefarious actor may easily be able to create a good enough vocal impersonation
to trick, confuse, enrage or mobilise the public.
Most citizens around the world will be simply unable to discern the difference between a
fake Trump or Putin soundbite and the real thing.
When you consider the widespread distrust of the media, institutions and expert gatekeepers,
audio fakery could be more than disruptive.
It could start wars.
Imagine the consequences of manufactured audio of a world leader making bellicose remarks,
supported by doctored video.
In 2018, will citizens, or military generals, be able to determine that it's fake?