[MUSIC PLAYING] JEN PERSON: The MediaPipe
object detector task lets you detect the
presence and location of multiple classes of objects
within images or videos. For example, an object
detector can locate dogs within an image. There are APIs available
for Android, Python, iOS, and the web. To get started using the object
detection task for the web, first take a look at
the available object detection models. You can see this list
in the documentation linked in the description. This list might change over
time, so definitely check the docs for the latest. There are three recommended
models listed here with two available
formats each-- EfficientDet-Lite0,
EfficientDet-Lite2, and SSD MobileNetV2. All three models were trained
using the COCO dataset, a large-scale object
detection dataset that contains over 1.5
million object instances and 80 object labels. I've linked to a list
of the COCO classes so you can see what
labels are available. The EfficientDet-Lite0
model is recommended because it strikes a balance
between latency and accuracy. It is both accurate
and lightweight enough for many use cases. The EfficientDet-Lite2
model is generally more accurate than EfficientDet-Lite0 but
is also slower and more memory intensive. This model is
appropriate for use cases where accuracy is a greater
priority than speed and size. The SSD MobileNetV2 model
is faster and lighter than EfficientDet-Lite0 but
also generally less accurate. It's appropriate
for use cases that require a fast,
lightweight model that sacrifices some accuracy. For more details on each of
these models, check the docs. If your use case requires a
more unique object detection solution, then you can customize
a model using MediaPipe Model Maker. I've linked to a
guide from the docs, but if you'd like a
video on getting started, let me know in the comments. Now that you've chosen a
model, install the Tasks Vision package. You can download the
package using npm and bundle it with a
tool like webpack, or you can import the
package using a CDN.
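Getting the package into your project might look like this (the CDN URL is an assumption, so check the docs for the current one):

```javascript
// Assuming the @mediapipe/tasks-vision npm package (install with
//   npm install @mediapipe/tasks-vision
// and bundle with a tool like webpack), import the task classes:
import { FilesetResolver, ObjectDetector } from "@mediapipe/tasks-vision";

// Or, without a bundler, import the same classes from a CDN-hosted
// ES module (URL is an assumption -- check the docs for the current one):
// import { FilesetResolver, ObjectDetector } from
//   "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest";
```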
MediaPipe for the web uses WebAssembly or WASM,
a binary instruction format for a stack-based VM. You don't need to be an expert
on the ins and outs of WASM to use MediaPipe
Solutions for the Web. In simplest terms, WASM
allows non-web-based code to run on the web. For the best user
experience, you don't want to bundle
your model or WASM binary into your website. Instead, you will store them
server side and provide links when initializing
your object detector. So let's explore
the code for this. Here we have a function
createObjectDetector. First we configure our
WASM binary loading using the FilesetResolver
forVisionTasks method. Then we create the
object detector using the ObjectDetector
createFromOptions method, passing the fileset resolver
we just created and the model. You can also provide
optional parameters like a score threshold,
which indicates on a scale from 0 to 1 how
confident the model must be to return a detection,
and the running mode for the inference, which
is either image or video. Image is the default value. To run object
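Put together, the setup described above might look like this minimal sketch. The WASM and model URLs are placeholders, so point them at wherever you host those files server side:

```javascript
import { FilesetResolver, ObjectDetector } from "@mediapipe/tasks-vision";

async function createObjectDetector() {
  // Configure WASM binary loading (files hosted server side, not bundled).
  const vision = await FilesetResolver.forVisionTasks(
    "https://example.com/tasks-vision/wasm" // hypothetical hosting path
  );
  // Create the detector, passing the fileset resolver and the model.
  return ObjectDetector.createFromOptions(vision, {
    baseOptions: {
      // Hypothetical model URL -- use your own hosted model file.
      modelAssetPath: "https://example.com/models/efficientdet_lite0.tflite",
    },
    scoreThreshold: 0.5, // only keep detections scored at least 0.5
    runningMode: "IMAGE", // the default; use "VIDEO" for frame-by-frame detection
  });
}
```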
detection on an image, use the
ObjectDetector.detect method, passing the image source. This method is
synchronous, blocking until inference completes, which
is good to keep in mind when designing your UI. The source can be an HTML canvas
element, HTML video element, HTML image element, ImageData
object, or ImageBitmap. The detect method returns an
ObjectDetectorResult object. It contains a list of
detections in order of how confident the model
is that the detected object belongs to the given category. In this example,
the first detection has a class name of dog and a
confidence score of 0.73828. The next detection is also
a dog with a confidence score of 0.73047. We can be reasonably
sure that there are two dogs in this image,
which sounds like a great image to me. You can access these
detection results using
detectionResult.detections. So if you want to
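Here is that access pattern on a mock result shaped like the output described above. In real code, detectionResult would come from objectDetector.detect(image); the scores match the example, and the structure follows the result format:

```javascript
// Mock ObjectDetectorResult, shaped like the real output
// (in real code: const detectionResult = objectDetector.detect(image)).
const detectionResult = {
  detections: [
    { categories: [{ categoryName: "dog", score: 0.73828 }] },
    { categories: [{ categoryName: "dog", score: 0.73047 }] },
  ],
};

// First detection (index 0), first category (index 0), then categoryName.
const firstName = detectionResult.detections[0].categories[0].categoryName;
console.log(firstName); // "dog"

// Iterate through results to handle multiple detections.
for (const detection of detectionResult.detections) {
  const { categoryName, score } = detection.categories[0];
  console.log(`${categoryName}: ${score.toFixed(5)}`);
}
```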
access the display name of the first
result, you would use detectionResult.detections,
grab the first result, which is of course the zeroth, get
the first category, which is also the zeroth, and then
get the category name property. You can iterate through results
to handle multiple detections. To detect objects in
frames of a video, get the current time
using performance.now(). Then get the object
detection result using the objectDetector
detectForVideo method, passing the video element
and the current time. And that's it. With this code, you can get
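In a render loop, those video steps might look like the following sketch. The element id, the preexisting objectDetector created with runningMode "VIDEO", and the requestAnimationFrame loop are all assumptions:

```javascript
// Assumes a <video> element and an objectDetector created with
// runningMode: "VIDEO" (see createObjectDetector above).
const video = document.getElementById("webcam"); // hypothetical element id

function renderLoop() {
  // Timestamps passed to detectForVideo must be monotonically increasing.
  const nowInMs = performance.now();
  const detectionResult = objectDetector.detectForVideo(video, nowInMs);
  // ...draw detectionResult.detections onto your UI here...
  requestAnimationFrame(renderLoop);
}
requestAnimationFrame(renderLoop);
```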
started with object detection in your own web app. You can check out a complete
code example on CodePen and view all the available
solutions on the MediaPipe website, or get hands-on with
solutions in MediaPipe Studio. All these great resources
are linked in the video. Now that you have what
you need to get started, I want to know
what's next for you. Tell me what you're working on. Tell me what you learned. Tell me what you
still want to know. Drop a comment here on
YouTube or on LinkedIn. I can't wait to
see what you build. [MUSIC PLAYING]