I Challenged My AI Clone to Replace Me for 24 Hours | WSJ

Video Statistics and Information

Reddit Comments

Joanna Stern is missing a critical piece of information.

👍︎︎ 1 👤︎︎ u/Upset_Aide 📅︎︎ May 03 2023 🗫︎ replies
Captions
- Today we're going to create an avatar that looks and moves like me. (camera snapping) We begin by smiling into the camera. (dramatic music) I love to smile. Smiling's my favorite. I'm breathing gently for a short second. Is this what breathing looks like? (dramatic music) I'm Joanna Stern, and I'm excited to host this video. No, I am the real Joanna. Okay, so I cloned myself. Kinda. Howdy. Why? Well, the latest AI tools that generate text and images already make it hard to tell the difference between what's real and what's fake. What's coming next with AI-generated voice and video is gonna blur the lines even more. So I came up with a challenge. Can I replace myself with AI for the day? Yes, I came up with four challenges to see if AI me could sub in for real me so real me had more time for me things. (tranquil music) Or at least that's how I wanted it to go.
- Still a little creeped out that I'm looking at a frozen Joanna.
- Okay, let's do this.
- Scene three, take two, calibration. (board clapping)
- Before we get into the challenges, let's talk about my AI avatar, which was made by a startup called Synthesia. Going to make my avatar. At a professional studio in New York, the company recorded me doing a series of head movements. I feel like I'm at the eye doctor. Okay. And reading through a rather odd pre-written script. Positive thinking will help you believe in yourself and fill you with self-esteem and confidence. After that, I headed to an audio studio where I recorded another script for about an hour. My name is Joanna Stern and I hereby consent to this audio recording to create a custom voice. The company took that all and used it as training data and ran it through their AI neural networks. (dramatic music) (text buzzing) Hello, Joanna. You don't mind if I call you Joanna? Do you? Okay. So the voice isn't the best. A tool called ElevenLabs produced something better after my producer Kenny uploaded two hours of my previous recordings.
I am the real Joanna. I am the real Joanna. I am the real Joanna. Both Synthesia and ElevenLabs work similarly. Type in anything and AI Joanna just says it right back. Synthesia is aimed at companies that want to make internal videos. It charges at least $1,000 to create a custom avatar. Creating a voice clone with ElevenLabs is $5 a month. Challenge one: phone calls. I happened to have a call scheduled that day with Evan Spiegel, the CEO of Snap. The company recently released My AI, a chatbot within the popular app. Hey Evan, it's Joanna. Do you worry that if we chat with AI all day, we'll stop talking to our real friends?
- [Evan] Definitely not what we've been seeing. I think that's one of the real benefits of our sort of testing and learning approach. So far, I think if anything, it's gonna become a conversation enhancement and improve the way that people communicate with their friends and family.
- Did you think by any chance that my question to you was generated by an AI voice? (Evan chuckling)
- [Evan] No. No. I mean, the first word or two was a little bit of a giveaway but I thought maybe you were extra serious today. (Joanna chuckling)
- [Joanna] Even my own sister was pretty fooled when I called her about her dead fish.
- [Julia] Hello?
- Hey, Jules. I just heard about Swimmy Dimi and I wanted to let you know how sorry I am for your loss. Did you think it was me?
- [Julia] At first, yes. And then no. Like it sounds, it's obviously exactly like you, but just with the fact that like, it doesn't pause for talking back.
- Challenge one: pass. Challenge two: create a TikTok. I asked ChatGPT to write a TikTok script in the voice of Joanna Stern about an obscure iOS 16 tip. The hardest thing was getting ChatGPT to write the truth. It just made stuff up. Finally, I got a good one. Although the writing certainly was not very me. I pasted the script into Synthesia, put a green screen behind my avatar and exported it. While the WSJ TikTok team edited, I.
(pleasant piano music) (Joanna snoring) I was pretty impressed with the final TikTok. TikTok fam, it's Joanna Stern, your iOS wizard. Today we're unearthing the hidden world of back tap gestures. I love that I did not have to shoot this. I did not have to put on nice clothes, do my hair, do my makeup, say these lines. But TikTok was less impressed. They picked up on the fact that the avatar never moves its arms, that the mouth movements don't always match the audio and that there's little facial expression. Synthesia has already started to improve a lot of this in beta versions of its avatars.
- Look, I can nod my head. (dramatic music)
- Challenge two: fail. Challenge three: bank biometrics. Instead of asking security questions, some banks use your voice to confirm it's you before transferring you to a customer service rep.
- [System] This call will be monitored and recorded and your voice may be used for verification. Please speak your first and last name, followed by your mailing address.
- Joanna Stern. (beeping sound)
- [Nikki] This is Nikki with Chase credit card services.
- It worked. Chase confirmed the voice and put me straight through to a service rep. No additional questions asked. Later in the day, I asked our intern Slav to try to do his best impression of me to see what would happen.
- [System] Please speak your first and last name, followed by your mailing address. (beeping sound)
- Joanna Stern. (beeping sound)
- [System] Please enter the last three digits printed on the signature panel on the back of your card.
- See, in Slav's case, the voice biometric system didn't buy it. It asked for further verification. When I reached out to Chase, a spokeswoman said, "We use voice biometrics, along with a variety of other methods to authenticate customers who call us." She added that to complete requests, customers must provide additional information. Challenge three: pass. Challenge four: video calls.
I asked ChatGPT to generate some generic meeting phrases and exported videos of my avatar saying them. Then I installed some software on my Mac to pump that video into my Google Meet calls. That sounds good.
- [Caller 1] Oh, you're muted, Joanna. My God, is this the real Joanna?
- Yeah, this looks like a fake. It sounds good.
- She looks, yeah, what is happening here?
- How did you know that it wasn't me?
- It looked like a hologram version of you.
- [Caller 2] It was the posture for me.
- She also didn't make any jokes.
- Challenge four: big fail. So what did we learn today? We learned that video clones aren't going to fool anyone yet but AI voices are quite good. We also learned that while you could use these to save time, people could also misuse them. Do I wanna avoid going to the studio some days? Yep. Do I fear scammers using our voices to call banks or our families? Yep. Synthesia says it requires those creating avatars to give verbal consent. ElevenLabs requires you check a box saying you have permission to use the voice and the company says it's capable of identifying its voices if they are misused. Either way, it means we're all going to have to be on high alert to tell the real versus the AI.
- And finally, stay human, everyone. Good luck. I am inevitable. (fingers snapping)
Info
Channel: Wall Street Journal
Views: 761,959
Keywords: ai, ai clone, synthesia, ai avatar, 11 labs, ai voice, ai clone review, ai clone bonus, artificial intelligence, voice cloning, ai clone demo, ai clone voice, clone, ai voice cloning, ai voice clone, ai voice generator, ai generated, machine learning, ai voice changer, generative ai, synthesia review, synthesia ai artificial intelligence app review, synthesia tutorial, synthesia ai, synthesia video creator, ai video, i challenged my ai clone, wsj, joanna stern, techy
Id: t52Bi-ZUZjA
Length: 7min 34sec (454 seconds)
Published: Fri Apr 28 2023