Text recognition in Flutter in 2023 - create a OCR scanner app

Captions

In this video we are going to see how to create an application that recognizes text in images taken by the camera and prints it on the screen. To do this we are going to use the camera plugin to open a video feed directly from the device's camera, and the google_mlkit_text_recognition plugin to scan the camera frames for text.

The first thing I'm going to do is create a new Flutter app for Android and iOS. Then I'm going to open it in Android Studio, although you can use Visual Studio Code if you prefer. I'm going to open pubspec.yaml and remove all the unnecessary comments. Now I am going to add all the dependencies needed to detect text: first the camera plugin to see the device's camera feed, then google_mlkit_text_recognition to detect the text, and finally permission_handler to request permission to open the camera. We now have the dependencies installed and added in the pubspec.yaml file.

Now we are going to make some adjustments in the native part of Android. Open the build.gradle file and set compileSdkVersion to 33. I use version 33 because it is the latest version, although a more recent version may already have come out by the time you watch this video. The next step is to set minSdkVersion to 22 and targetSdkVersion to 32. I use 32 and not 33 because there is currently a bug with the camera plugin that causes problems when targeting 33. It is possible that by the time you watch this video this bug has already been fixed. You can try version 33 or higher if you want; if you have problems you can come back here and set 32. Finally, open the AndroidManifest file and declare that you are going to use the camera in your application. This step is necessary for the permission request to work correctly.

We will do the exact same thing on iOS: open Info.plist and add this key-value pair that will let the system know that our app needs access to the camera.
One last necessary step is to modify the Podfile as indicated in the permission_handler documentation. If we go to its page we will see a block of code that must be added to set up iOS. You can find this link in the video description. Copy and paste this block of code. Let's uncomment the camera part and delete the rest, since we don't need it for this example. At this point you can try to compile the app on iOS. If it doesn't work, it is possibly because a minimum version of iOS has to be set. We can do this by uncommenting this first line of the Podfile.

Try to build the app on both Android and iOS to verify that there are no build errors. If the counter app is launched on both platforms we are good to go.

Now we are going to add the code needed to request permission to open the camera. Open main.dart and delete the widget that comes by default. I'm going to make some changes in MyApp to adjust it to my liking. Now I'll create a new widget called MainScreen that will act as the main widget. It's important that it inherits from StatefulWidget, since we're going to have to manage the lifecycle of various components. Add this state variable, _isPermissionGranted, which will track whether the camera permission has been granted. Define this Future, which will be the first thing executed, to request this permission. Now I am going to create a private method in charge of requesting the permission and updating the state according to the result.

Note that this way of asking for permissions is simplified. I do it this way to avoid adding complexity to the video, but it's not the best. I leave you in the video description another video of mine where I explain how to request permissions in an optimal way, managing all the possible outcomes, such as when the user denies the permission or clicks on deny forever.

Now fill the build() method with the following tree.
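The permission step described above could look roughly like the following minimal sketch. It is not the code shown in the video: MainScreen and _isPermissionGranted come from the narration, while _MainScreenState, _future, _requestCameraPermission and the on-screen strings are names assumed here for illustration. It uses the simplified request flow the narration describes and does not handle permanent denial.

```dart
import 'package:flutter/material.dart';
import 'package:permission_handler/permission_handler.dart';

class MainScreen extends StatefulWidget {
  const MainScreen({super.key});

  @override
  State<MainScreen> createState() => _MainScreenState();
}

class _MainScreenState extends State<MainScreen> {
  // Tracks whether the camera permission has been granted.
  bool _isPermissionGranted = false;

  // Created once in initState() so the request is not repeated on rebuilds.
  // (The field name is an assumption.)
  late final Future<void> _future;

  @override
  void initState() {
    super.initState();
    _future = _requestCameraPermission();
  }

  // Simplified request: only distinguishes "granted" from everything else.
  Future<void> _requestCameraPermission() async {
    final status = await Permission.camera.request();
    _isPermissionGranted = status == PermissionStatus.granted;
  }

  @override
  Widget build(BuildContext context) {
    // FutureBuilder rebuilds when the permission request completes,
    // at which point _isPermissionGranted holds the result.
    return FutureBuilder<void>(
      future: _future,
      builder: (context, snapshot) {
        return Scaffold(
          appBar: AppBar(title: const Text('Text Recognition Sample')),
          body: Center(
            child: Text(
              _isPermissionGranted
                  ? 'Camera permission granted'
                  : 'Camera permission denied',
            ),
          ),
        );
      },
    );
  }
}
```

Storing the future in a field, instead of calling the request method inside build(), keeps the permission dialog from being triggered again every time the widget rebuilds.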
Note that the first widget is a FutureBuilder, which will execute the future that requests the permission at the beginning. For now we are going to show a text on the screen that will indicate whether the permission has been granted or not. Run the app and grant the permission to see that the correct text is displayed. Uninstall it and run it again, this time denying the permission, to see that the correct text is also displayed.

The next step is to display the camera feed in our app. For this, we will need to have control of the lifecycle of our main widget. We can achieve this by adding the WidgetsBindingObserver mixin. Add this line in initState() to turn MainScreenState into a lifecycle observer, and don't forget to remove it in dispose(). Thanks to this, we can override the didChangeAppLifecycleState() method, which will tell us whether the app is in the foreground or has been sent to the background. We will populate it later.

For now, add this variable of type CameraController that will be used to control the camera. Now we are going to create a method whose purpose is to initialize the camera controller. This method receives the list of cameras that the device has. First, we check whether the _cameraController variable has already been initialized; if so, we exit the method. Now we have to choose which camera we want. To do this, we are going to iterate over all the cameras and choose the first one that faces backwards. Now that we know which camera we want, we are going to pass it on to another method that does not exist yet but that we are going to create right now.

This method, _cameraSelected(), is responsible for initializing the variables with the camera already chosen. To do this, initialize CameraController as follows. We set the resolution to maximum, since that will help with the text detection. We don't need audio. Call the asynchronous method initialize() to perform the initialization.
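As a rough sketch of the camera selection and initialization just described, the two steps can be written as standalone functions. The function names selectBackCamera and createController are assumptions made for this example (the video uses private methods on the state class, including _cameraSelected()); the camera plugin calls themselves are the plugin's public API.

```dart
import 'package:camera/camera.dart';

/// Returns the first rear-facing camera in [cameras],
/// or null if the device reports none.
CameraDescription? selectBackCamera(List<CameraDescription> cameras) {
  for (final camera in cameras) {
    if (camera.lensDirection == CameraLensDirection.back) {
      return camera;
    }
  }
  return null;
}

/// Creates and initializes a controller for [camera] at the maximum
/// resolution preset and with audio disabled, as the narration suggests.
Future<CameraController> createController(CameraDescription camera) async {
  final controller = CameraController(
    camera,
    ResolutionPreset.max,
    enableAudio: false,
  );
  await controller.initialize();
  return controller;
}

/// Example wiring: list the device cameras, pick the rear one, initialize it.
/// Returns null when no rear camera is available.
Future<CameraController?> openBackCamera() async {
  final cameras = await availableCameras();
  final backCamera = selectBackCamera(cameras);
  if (backCamera == null) return null;
  return createController(backCamera);
}
```

Inside a State class, the returned controller would typically be stored in the _cameraController field, followed by a setState() call guarded by a mounted check so the preview can be drawn.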
Then add the following code, which will be in charge of refreshing the state in order to show the initialized camera. Now I am going to create a method to open the camera and another to close it. In both cases it is necessary to check that the variable _cameraController is not null. Now let's fill in didChangeAppLifecycleState() to show the camera if the app is in the foreground, and to release it if it's no longer in the foreground. First, check that _cameraController is not null and that it has been initialized. If the state is inactive, we stop the camera. If the state is resumed, that is, the app has been brought back to the foreground, we open the camera.

Finally we can add the camera widget to the tree. However, I'm going to do it right behind the main Scaffold, inside a Stack like this. The reason I do it like this is that if we add the camera inside the Scaffold it is likely that empty bands will be seen around it, considering that the camera has approximately the same aspect ratio as the mobile screen. I think it is better to put the camera in fullscreen, but behind everything, to minimize this effect. In any case you can add this widget inside the Scaffold and see for yourself how it looks.

Make the adjustments that you see on the screen to show the camera feed, or the text that indicates that the camera permission has not been granted. I am also going to add a button in the interface that will later be in charge of executing the text detection. If you need to see this code more clearly, you have a link to the open source project of this video in the description. Try running the app now; you should see the camera preview in the app.

With all that done, we are ready to carry out the scanning of the text. But first we need a place to display it. Create a new file called result_screen.dart that defines a ResultScreen widget that accepts a String parameter. This parameter represents the text after it has been scanned.
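The lifecycle handling described above might be sketched as follows. This is an illustration under assumptions, not the video's code: the widget name CameraLifecycleHost and the helper names _startCamera and _stopCamera are invented here, and this version releases the camera by disposing the controller and creating a new one on resume, which may differ from how the video's open and close methods work.

```dart
import 'package:camera/camera.dart';
import 'package:flutter/widgets.dart';

class CameraLifecycleHost extends StatefulWidget {
  const CameraLifecycleHost({super.key, required this.camera});

  // The camera chosen earlier (for example, the rear camera).
  final CameraDescription camera;

  @override
  State<CameraLifecycleHost> createState() => _CameraLifecycleHostState();
}

class _CameraLifecycleHostState extends State<CameraLifecycleHost>
    with WidgetsBindingObserver {
  CameraController? _cameraController;

  @override
  void initState() {
    super.initState();
    // Register for app lifecycle callbacks.
    WidgetsBinding.instance.addObserver(this);
    _startCamera();
  }

  @override
  void dispose() {
    WidgetsBinding.instance.removeObserver(this);
    _stopCamera();
    super.dispose();
  }

  @override
  void didChangeAppLifecycleState(AppLifecycleState state) {
    if (state == AppLifecycleState.inactive) {
      // Leaving the foreground: release the camera and hide the preview.
      setState(_stopCamera);
    } else if (state == AppLifecycleState.resumed) {
      // Back in the foreground: open the camera again.
      _startCamera();
    }
  }

  Future<void> _startCamera() async {
    if (_cameraController != null) return; // already open
    final controller = CameraController(
      widget.camera,
      ResolutionPreset.max,
      enableAudio: false,
    );
    _cameraController = controller;
    try {
      await controller.initialize();
    } on CameraException {
      // Initialization failed or was interrupted; leave the preview empty.
      return;
    }
    // Only refresh if this controller is still the current one.
    if (mounted && _cameraController == controller) setState(() {});
  }

  void _stopCamera() {
    final controller = _cameraController;
    _cameraController = null;
    controller?.dispose();
  }

  @override
  Widget build(BuildContext context) {
    final controller = _cameraController;
    if (controller == null || !controller.value.isInitialized) {
      return const SizedBox.shrink();
    }
    return CameraPreview(controller);
  }
}
```

To get the full-screen layout the narration prefers, a widget like this would be placed as the first child of a Stack, with the Scaffold drawn on top of it using a transparent background.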
We can display this text in the body of the screen with a bit of margin on the sides. Now go back to main.dart and add this variable of type TextRecognizer that will take care of recognizing text in the camera images. It is important to close it in the dispose() method.

Add this method called _scanImage, which will take care of scanning text. First, check that _cameraController is not null. Now get a reference to the Navigator. We will use this reference later to direct the user to the ResultScreen. Add this try-catch block where we will run the detection, displaying an error message in a SnackBar if something goes wrong. Now we will take a picture with the camera, define a variable of type File with its path, and with this create another variable of type InputImage. We then pass this variable to the processImage() method of _textRecognizer. At this point we can send the user to the ResultScreen, passing the recognized text as its input parameter. Finally, all that remains is to use this method as the callback for the button that we created before.

Now run the app, point the rear camera of your device towards some printed text and press this button. If everything went well, the application should identify that text and show it to you on the screen.

Here ends this basic tutorial on how to recognize text using OCR technology in Flutter. I hope you have found it useful. In the video description you have a link to the working open source project made in this video, as well as other links of interest related to what I have explained here. If this video has been useful to you, please give it a like, and consider subscribing for more content about Flutter and mobile development in general. I hope you have a nice rest of your day. I'll see you in the next video. Goodbye.
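For reference, the core of the scanning step described in the transcript can be sketched as a single function. The name scanText and its parameters are assumptions made for this example; in the video the logic lives in the _scanImage method of the state class, which also handles navigation and error display.

```dart
import 'dart:io';

import 'package:camera/camera.dart';
import 'package:google_mlkit_text_recognition/google_mlkit_text_recognition.dart';

/// Takes a picture with [controller], runs text recognition on it with
/// [recognizer], and returns the recognized text.
///
/// The caller is expected to own both objects, and to call
/// recognizer.close() when it is done with the recognizer.
Future<String> scanText(
  CameraController controller,
  TextRecognizer recognizer,
) async {
  // Capture a still image; the plugin returns a file handle for it.
  final XFile picture = await controller.takePicture();

  // Wrap the image file so ML Kit can read it.
  final file = File(picture.path);
  final inputImage = InputImage.fromFile(file);

  // Run on-device recognition and return the full text.
  final RecognizedText recognized = await recognizer.processImage(inputImage);
  return recognized.text;
}
```

In a widget, a call to this function would sit inside a try-catch block, with the Navigator obtained before the first await, so that errors can be shown in a SnackBar and the result passed to ResultScreen.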

Info

Channel: David Serrano

Views: 21,941

Keywords: flutter, flutterdev, flutterdeveloper, dart, app, apps, mobile, flutterapp, software, android, ios, ocr, scanner app, text scanner, pdf scanner, photo scan, ocr scanner, text recognition, extract text, character recognition, mobile scanner, ocr reader, ocr text scanner, text detection

Id: hQ7tj8wouVM

Length: 13min 52sec (832 seconds)

Published: Tue Jan 10 2023