How to Convert an Image to Text with Python

Hello dear reader! In this post we will teach you how to convert an image to text using Python. So sit back, relax, and enjoy!

Introduction on how to convert image to text

Image-to-text conversion is a process in which text inside an image is extracted and converted into a word processor-compatible format. This was not possible until OCR (optical character recognition) was developed.

OCR is an application of machine learning. An OCR system has a model trained to recognize characters from a character set such as ASCII or Unicode. The model looks at pixel configurations and determines whether they form characters. That’s how OCR systems can extract text from an image.

Such systems are extremely useful for bridging the gap between paper and digital media. You can take pictures of physical documents and convert them into digital format using image-to-text conversion.

In this article, we will learn how Python developers can convert an image to text by writing a simple program in Python 3.

How to Convert an Image to Text in Python

Python is one of the most popular languages for developing AI applications and programs. This is due to its simple yet powerful syntax and the huge community support it gets. Nowadays, there are plenty of OCR libraries available in Python. With their help, creating an image-to-text conversion program is very easy.

We have listed the steps to create such a program below.

Open Google Colab Notebook

The Google Colab notebook is a free Python notebook that runs on Google servers. We prefer to use it because it does not require a lengthy installation or setup. You can simply open it and start programming.

To open a Google Colab notebook, open a browser on your PC and search for “Google Colab.” Then open the Google Colab link that appears in the search results.

Once you navigate to the web page, ignore everything and click on “New Notebook.” This will open an empty notebook.

Install Dependencies

For this tutorial, we are going to use an OCR library called Pytesseract. It is a Python wrapper around the Tesseract OCR engine, so the installation has two parts: the Python package goes into your runtime, and the Tesseract engine goes onto your system.

Here are the commands for doing that.

pip install pytesseract

And

!sudo apt install tesseract-ocr

The second command installs the Tesseract engine itself with apt, which works because Google Colab runs on Linux. The leading “!” tells the notebook to run the line as a shell command. If you are following this tutorial in an IDE on a Linux machine that uses apt, run the same command in your terminal without the “!”. On Windows, apt is not available, so the installation command will be different.

In Colab, though, you can just type it in its own code block and run it.

It may take a while for the dependencies to install, so don’t worry if it takes longer than you anticipated.
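Once both commands have finished, it can help to confirm that the pieces are actually in place. The snippet below is a minimal sketch that uses only the Python standard library; the function name check_ocr_setup is our own and is not part of Pytesseract. It assumes the Tesseract binary is called "tesseract" and is on your PATH, which is the usual result of the apt install above.

```python
import importlib.util
import shutil


def check_ocr_setup():
    """Report which pieces of the OCR toolchain are available."""
    return {
        # Path to the Tesseract binary, or None if it is not on PATH
        "tesseract_binary": shutil.which("tesseract"),
        # True when the Python package can be found by the import system
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        "pillow": importlib.util.find_spec("PIL") is not None,
    }


status = check_ocr_setup()
for name, value in status.items():
    print(f"{name}: {value}")
```

If tesseract_binary prints None, the system-level install did not succeed. If pytesseract prints False, the pip install needs to be run again.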

Install Pillow

Since we are going to be working with images, we need to install the Pillow library. This library provides important functions that enable us to import and manipulate images inside Python. Without this library, it would be very difficult to add an image to the run-time and process it.

To install Pillow, simply run the following command in a code block.

pip install pillow

This will install the Pillow library. That may look like this:

Add the Image to the Program Directory

You will need to add all images that need to be converted to text inside the program directory. You can do that very easily.

Look at the left pane on the webpage. It should have a series of icons in it.

Click on the icon that looks like a folder and then click on the upload icon.

This will open a navigation menu of your file explorer. Find and select the image from which you want to extract text. It will be uploaded to the Colab directory. You can even see it in your files folder.
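If you want to confirm from code that the upload worked, you can list the image files in the working directory. This is a small sketch using only the standard library. The helper name find_images and the set of extensions are our own assumptions, so adjust them to match the files you use.

```python
from pathlib import Path

# Extensions treated as images here; this list is an assumption, adjust as needed
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def find_images(directory="."):
    """Return the image files directly inside `directory`, sorted by path."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


# Print the names of the images found in the current working directory
for image_path in find_images():
    print(image_path.name)
```

Your uploaded picture should appear in the printed list. If it does not, check that the upload finished and that the file has one of the listed extensions.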

Create a Function to Convert Image to Text

Now, you can create a function to extract text from your image. For our example, we used the following image.

So, here is what your code is going to look like.

import pytesseract
from PIL import Image

# Open the image file
image = Image.open('image.png')

# Perform OCR using PyTesseract
text = pytesseract.image_to_string(image)

# Print the extracted text
print(text)

As you can see, we import the Pytesseract library into our program, and then we import the Image module from Pillow. Using its open function, we open our uploaded picture. The specific code is:

image = Image.open('image.png')

Here, you need to replace “image.png” with the name of your uploaded picture.

Then we use Tesseract to do OCR and finally, we print the text.

Here is what our output looked like:

That’s it: you have now successfully created a program that can extract text from an image using Python.
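If you plan to convert more than one picture, the same steps can be wrapped in a reusable function. The version below is a sketch rather than the only way to do it. The name image_to_text and the early file check are our own additions, and the lang parameter is passed straight to pytesseract.image_to_string, where "eng" is the code for English. The two libraries are imported inside the function so that a wrong file name is reported before anything else runs.

```python
from pathlib import Path


def image_to_text(path, lang="eng"):
    """Extract text from the image at `path` using Tesseract."""
    image_path = Path(path)
    # Fail early with a clear message if the file name is wrong
    if not image_path.is_file():
        raise FileNotFoundError(f"No image found at {image_path}")

    # Imported here so the path check above runs first
    import pytesseract
    from PIL import Image

    # The with-block closes the image file once OCR has finished
    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=lang)


# Example call, assuming 'image.png' was uploaded as described earlier:
# print(image_to_text("image.png"))
```

Calling the function with a different file name is then a one-line change, which keeps the OCR logic in a single place.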

Alternative Way of Converting Images to Text

If you think that writing a program for text extraction is too difficult or time-consuming, then you can use a different approach. That is, you can use online image-to-text converters.

There are plenty of OCR tools available online, and many of them are free. Some examples of such tools include Image to text converter.net.

Such tools are usually free and often don’t require any registration. They are also much easier to use compared to writing an entire program in Python.

Here’s how you can use a tool to convert images into text.

Open an Image-to-Text Converter

Open the webpage of a text extractor. You can do that by doing a Google search for any of the following terms.

OCR tool
Convert image to text
Extract text from the image

From the search results, open any tool that appears in the top five results (be careful not to open a sponsored or advertised link). Make sure the tool is accessible and easy to use.

Input Your Image

Most tools let you input your image by either dragging and dropping the image into their interface or by uploading it from your device. Some even allow you to enter the URL of an image.

Click Extract

There should be a button that says “Extract text,” “Convert to text,” or something along those lines. Click that button to start the extraction process. It will take a few seconds, and then your output will be displayed.

Proofread and Save the Output

Proofread the output to make sure that the text was extracted correctly. Sometimes, OCR tools can make mistakes if the text or the image is not very clear. Weirdly styled fonts can also result in poor text recognition. That’s why proofreading is required.
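Part of this cleanup can be automated before you start reading. The sketch below tidies stray whitespace, which is a common kind of noise in OCR output. The function name clean_ocr_text is our own, and it does not correct misrecognized characters, so it supports proofreading rather than replacing it. Note that it also drops blank lines, so paragraph breaks are not preserved.

```python
import re


def clean_ocr_text(text):
    """Tidy common whitespace noise in OCR output without changing the words."""
    cleaned_lines = []
    for line in text.splitlines():
        # Collapse runs of spaces and tabs into a single space
        line = re.sub(r"[ \t]+", " ", line).strip()
        # Skip lines that are empty after trimming
        if line:
            cleaned_lines.append(line)
    return "\n".join(cleaned_lines)


sample = "Hello   world \n\n\n  This is\tOCR output.  \n"
print(clean_ocr_text(sample))
```

The same function works on text copied from an online tool or on the text returned by a Python OCR program.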

Once that is done, you can save the output by downloading it or copying it to a file.

Most tools provide shortcuts for doing this, so use them for convenience. That’s it. You have now extracted text from an image using a tool.

Conclusion on how to convert an image to text

Python is a powerful language that can utilize OCR libraries to extract text from images. In this article, we saw how a simple program can be written in a Google Colab notebook for text extraction.

However, a more convenient method is to use an online tool instead. We checked out how you can do that in a few simple steps. So, now you know both methods of converting an image to text and can use either one of them whenever you like.

We hope you liked our article. To keep expanding your knowledge, check out our Machine Learning books. Have a great day and keep rocking!