Updated on 2023-11-13 GMT+08:00

What Is OCR?

Optical Character Recognition (OCR) detects and extracts text from images and returns the recognition results as structured JSON that you can edit and process.

OCR provides open APIs that you can call from programming languages such as Python and Java to extract text from images. This lets you automate the collection of key data and build an intelligent service system that improves efficiency. For details about the APIs, see the Optical Character Recognition API Reference.
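As a rough illustration of calling an OCR API from Python, the sketch below builds (but does not send) an HTTP request that carries an image. The endpoint URL, the `image` field name, and the `X-Auth-Token` header are assumptions made for this example, not values confirmed by this page; take the actual ones from the Optical Character Recognition API Reference.

```python
import base64
import json
import urllib.request

# Hypothetical placeholder endpoint; replace it with the URL from the API Reference.
OCR_ENDPOINT = "https://ocr.example.com/v1/general-text"


def build_ocr_request(image_bytes: bytes, token: str) -> urllib.request.Request:
    """Build a POST request whose JSON body holds a base64-encoded image.

    Assumptions: the API accepts {"image": "<base64>"} and a token header.
    """
    body = json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        OCR_ENDPOINT,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # assumed authentication header
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send it; the response body would then be the JSON recognition result.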

OCR also provides software development kits (SDKs) for multiple programming languages. For details about how to use SDKs, see the Optical Character Recognition SDK Reference.

Before You Start

You will need some basic programming skills. Familiarity with Java, Python, iOS, Android, or Node.js development is recommended.

To use OCR, you need to call its APIs and then either transmit the results to your service system or convert them from JSON to TXT or Excel format.
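The JSON-to-TXT or Excel conversion mentioned above could look roughly like the following sketch. The response layout (`result`, `words_block_list`, `words`) is an assumption made for illustration and should be checked against the API Reference. CSV is used here as a simple format that Excel can open.

```python
import csv
import io
import json

# Sample response using an assumed layout; the real schema may differ.
SAMPLE_RESPONSE = (
    '{"result": {"words_block_list": '
    '[{"words": "Hello"}, {"words": "World"}]}}'
)


def extract_lines(response_json: str) -> list[str]:
    """Return the recognized text of each block, in response order."""
    data = json.loads(response_json)
    blocks = data.get("result", {}).get("words_block_list", [])
    return [block["words"] for block in blocks if "words" in block]


def to_txt(lines: list[str]) -> str:
    """Join recognized lines into plain text, one block per line."""
    return "\n".join(lines)


def to_csv(lines: list[str]) -> str:
    """Render recognized lines as CSV with a header row and line numbers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["line", "text"])
    for number, text in enumerate(lines, start=1):
        writer.writerow([number, text])
    return buffer.getvalue()
```

Writing the output of `to_txt` or `to_csv` to a file produces a result that a text editor or spreadsheet application can open directly.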

OCR Capabilities

  • General OCR

    OCR automatically identifies text in images, including web images.

  • Card OCR

    OCR automatically identifies information in images of certificates such as passports, ID cards, and driving licenses, and converts the information into editable text.

Using OCR for the First Time

If you are a first-time user, the following sections are a good place to start: