This is a demo of long context
understanding, an experimental feature
in our newest model, Gemini 1.5 Pro. We'll walk through some example prompts
using the three.js example code, which comes out to over 800,000 tokens. We extracted the code
for all of the three.js examples and put it together into this text file,
which we brought into Google AI Studio over here. We asked the model to find three examples
for learning about character animation. The model looked across
hundreds of examples and picked out these three: one
about blending skeletal animations, one about poses, and one about morph
targets for facial animations. All good choices based on our prompt. In this test, the model took around 60 seconds
to respond to each of these prompts. But keep in mind that latency times
might be higher or lower, as this is an experimental feature
we're optimizing. Next, we asked, “What controls the animations on the Littlest Tokyo demo?” As you can see here, the model was able to find that demo, and it explained that the animations are embedded within the glTF model.
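For context, here's roughly what that pattern looks like in three.js: the animation clips travel inside the glTF file, and an AnimationMixer plays them back on the loaded scene. This is a simplified sketch rather than the demo's actual code, and the file path and variable names are just placeholders.

```js
import * as THREE from 'three';
import { GLTFLoader } from 'three/addons/loaders/GLTFLoader.js';

const scene = new THREE.Scene();
const clock = new THREE.Clock();
let mixer;

// The clips ship inside the glTF file itself; an AnimationMixer
// plays them back on the loaded scene graph.
new GLTFLoader().load( 'models/LittlestTokyo.glb', ( gltf ) => { // path is illustrative
	scene.add( gltf.scene );
	mixer = new THREE.AnimationMixer( gltf.scene );
	mixer.clipAction( gltf.animations[ 0 ] ).play();
} );

// Then, inside the render loop:
// if ( mixer ) mixer.update( clock.getDelta() );
```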
Next, we wanted to see if it could customize this code for us, so we asked, “Show me some code to add a slider to control the speed of the animation. Use that same kind of GUI the other demos have.” This is what it looked like before
on the original three.js site, and here's the modified version. It's the same scene, but it added
this little slider to speed up, slow down,
or even stop the animation on the fly. It used this GUI library the other demos have, set a new parameter
called animation speed, and wired it up to the mixer in the scene.
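Here's a minimal sketch of what that kind of slider wiring looks like with the lil-gui library the three.js demos use. The parameter name, range, and import path here are our assumptions rather than the model's verbatim output:

```js
import { GUI } from 'three/addons/libs/lil-gui.module.min.js';

// Assumes `mixer` is the THREE.AnimationMixer already driving the scene's clips.
const params = { animationSpeed: 1.0 };

const gui = new GUI();
gui.add( params, 'animationSpeed', 0, 2, 0.01 )
	.name( 'animation speed' )
	.onChange( ( value ) => {
		// 0 freezes the animation, 1 is normal speed, 2 is double speed.
		mixer.timeScale = value;
	} );
```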
Like all generative models, responses aren't always perfect. There's actually not an init() function in this demo like
there is in most of the others. However, the code it gave us
did exactly what we wanted. Next, we tried a multimodal input
by giving it a screenshot of one of the demos. We didn't tell it anything
about this screenshot and just asked where we could find the code
for this demo scene over here. As you can see, the model was able to look through the hundreds of demos
and find the one that matched the image. Next, we asked the model
to make a change to the scene, asking, “How can I modify the code
to make the terrain flatter?” The model was able to zero in on one
particular function called generateHeight, and showed us
the exact line to tweak. Below the code, it clearly explained how the change
works. Over here, in the updated version, you can see that the terrain is indeed
flatter, just like we asked.
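The gist of the change is scaling down the height values that generateHeight produces. Here's a hedged sketch of that idea; the scale factor and surrounding variable names are our own placeholders, not the model's exact edit:

```js
// generateHeight() fills an array of height samples for the terrain mesh.
// Scaling every sample down after it's generated flattens the whole terrain.
const FLATNESS = 0.25; // illustrative: 1.0 keeps the original relief, smaller = flatter

const data = generateHeight( worldWidth, worldDepth );

for ( let i = 0; i < data.length; i ++ ) {
	data[ i ] *= FLATNESS;
}
```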
We tried one more code modification task using this 3D text demo over here. We asked, “I'm looking at the text geometry
demo and I want to make a few tweaks. How can I change the text to say, ‘goldfish’
and make the mesh materials look really shiny and metallic?” You can see the model identified
the correct demo and showed the precise lines in it
that need to be tweaked. Further down, it explained these material properties, metalness and roughness, and how to change them to get a shiny effect.
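As a rough illustration of what those tweaks boil down to, here's a hedged sketch. The font path, size, and color are our assumptions, and it uses a MeshStandardMaterial, the three.js material type that exposes metalness and roughness:

```js
import * as THREE from 'three';
import { FontLoader } from 'three/addons/loaders/FontLoader.js';
import { TextGeometry } from 'three/addons/geometries/TextGeometry.js';

// Swap the string to 'goldfish' and use a metalness/roughness material
// to get the shiny, metallic look. `scene` is the demo's existing scene.
new FontLoader().load( 'fonts/helvetiker_bold.typeface.json', ( font ) => {
	const geometry = new TextGeometry( 'goldfish', {
		font: font,
		size: 70,
		curveSegments: 4
	} );

	const material = new THREE.MeshStandardMaterial( {
		color: 0xffd700, // illustrative gold tint
		metalness: 1.0,  // fully metallic
		roughness: 0.1   // low roughness = sharp, shiny reflections
	} );

	scene.add( new THREE.Mesh( geometry, material ) );
} );
```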
We did feel like it could have maybe reused the same material instance, but you can see that it definitely pulled off the task, and the text looks a lot shinier now. These are just a couple examples of what's possible with a context window of up to 1 million
multimodal tokens in Gemini 1.5 Pro.