CNIC Image Fields Detection and Identification through YOLOv8
Automating identity verification has become a key requirement across many industries, from finance to government services. One common task is extracting structured information from national identity cards, such as CNIC images. In this blog post, we share an approach that leverages YOLOv8, a state-of-the-art object detection model, to detect specific fields (like Name, CNIC Number, and dates) and extract them from scanned CNIC images.
What is YOLOv8?
YOLOv8 is the latest version of the YOLO (You Only Look Once) object detection model developed by Ultralytics. It offers faster performance, improved accuracy, and a more streamlined architecture compared to its predecessors. It has built-in support for tasks like detection, segmentation, and classification. Its lightweight variants make it ideal for edge devices, while still delivering strong results on complex datasets.
How to extract CNIC data using YOLOv8?
The objective is to automatically detect specific fields in a CNIC image, such as Name, CNIC Number, and Date of Birth, and accurately extract the text written under those fields. This involves identifying the correct regions on the CNIC and applying OCR to extract structured data reliably.
Steps to Extract CNIC Data
A complete pipeline was developed to detect and extract structured fields from CNIC images. The process began with dataset collection and annotation using Roboflow, followed by splitting the data into training, validation, and test sets. YOLOv8 was then fine-tuned on the annotated images to detect key fields such as Name and CNIC Number. After training, inference was performed to obtain bounding boxes and labels, which were used to crop regions of interest. OCR (using Tesseract) was subsequently applied to extract text from each cropped field. Through this end-to-end approach, accurate field-level text extraction from scanned CNICs was achieved.
We chose YOLOv8 due to its:
⦁ High speed-to-accuracy ratio
⦁ Improved performance on small text regions
⦁ Compatibility with diverse resolutions
⦁ Lightweight architecture (especially the yolov8n variant) for mobile and edge deployment
⦁ Well-maintained ecosystem and active community support
All of these make YOLOv8 an excellent choice for real-time object detection tasks such as CNIC field extraction.
Steps required:
Eight steps are required to perform this process successfully.
⦁ Dataset Extraction
⦁ Dataset Annotation
⦁ Dataset Splitting
⦁ YOLO Model Loading
⦁ Fine-tuning YOLO Model
⦁ Saving/Exporting Model
⦁ Inference on Final Model
⦁ Text Extraction
Since the pretrained YOLOv8 model is trained on the COCO dataset and has never seen CNIC images, we need to custom-train it for our own project.
1. Dataset Extraction
To begin with, a dataset containing CNIC images was needed. After exploring platforms such as GitHub, Kaggle, and Roboflow, a relevant Pakistani CNIC dataset was found on Roboflow. If identity cards from a different country are being used, a dataset specific to that region can be created or sourced accordingly.
Note: The Roboflow dataset contained annotations for all text regions, but we created custom annotations tailored to our use case.
Using a custom dataset gives much better results if you can make one available. In this example, a dataset containing CNIC images of different Pakistani individuals was used. It also came with annotations for all of the text in each image, which was not very useful for our use case.
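If the dataset lives on Roboflow, it can also be pulled programmatically instead of being downloaded by hand. Below is a minimal sketch using the roboflow Python package; the API key, workspace, project name, and version number are placeholders you would replace with your own:
pip install roboflow
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")                              # placeholder API key
project = rf.workspace("your-workspace").project("cnic-fields")    # hypothetical workspace/project names
dataset = project.version(1).download("yolov8")                    # downloads images and YOLOv8-format labels
print(dataset.location)                                            # local folder where the dataset was saved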
2. Dataset Annotation
Secondly, we needed to annotate the dataset to suit our project. We used Roboflow to manually annotate the images accordingly.
Annotating the dataset comprises two steps. After uploading the dataset, we:
⦁ Draw Bounding Boxes
⦁ Label the Bounding Boxes
For each bounding box we draw, we label it with the respective field, e.g. Name or Identity Number. This is an example annotation for the CNIC images, with each annotation highlighted in a different colour.
The annotations follow the YOLO format for each image. The final format of each text field in the annotation file consists of:
⦁ Class ID number
⦁ Bounding box coordinates (box center (x, y), box width, and box height)
An example format is as follows:
<class_id> <x_center> <y_center> <width> <height>
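For illustration, a hypothetical label file for one CNIC image could look like the two lines below, where each line is one annotated field and all box values are normalized to the 0-1 range (the class IDs here, 0 for Name and 1 for CNIC Number, are just an assumed ordering):
0 0.512 0.338 0.295 0.048
1 0.498 0.552 0.310 0.052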
3. Dataset Splitting
After completing the annotations and extracting the dataset to our local system, we will manually split the dataset into train, val, and test folders. We can use the 70-20-10 rule for splitting the dataset, meaning 70% for training, 20% for validation, and 10% for testing. For this study we have organized the images in:
⦁ images/train/
⦁ images/val/
⦁ images/test/
We created annotation files (YOLO format) in:
⦁ labels/train/
⦁ labels/val/
This step is essential for fine-tuning and training our YOLOv8 model later on.
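If you prefer to script the 70-20-10 split instead of moving files by hand, a minimal sketch along these lines works; the source_images and source_labels folder names are placeholders for wherever the exported dataset sits:
import os, random, shutil

random.seed(42)  # make the split reproducible
files = [f for f in os.listdir("source_images") if f.lower().endswith((".jpg", ".png"))]
random.shuffle(files)

n = len(files)
splits = {"train": files[:int(0.7 * n)],
          "val": files[int(0.7 * n):int(0.9 * n)],
          "test": files[int(0.9 * n):]}

for split, names in splits.items():
    os.makedirs(f"images/{split}", exist_ok=True)
    os.makedirs(f"labels/{split}", exist_ok=True)
    for name in names:
        shutil.copy(os.path.join("source_images", name), f"images/{split}/{name}")
        label = os.path.splitext(name)[0] + ".txt"
        if os.path.exists(os.path.join("source_labels", label)):  # test images may have no labels
            shutil.copy(os.path.join("source_labels", label), f"labels/{split}/{label}")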
Next we will also create a data.yaml file. A ‘data.yaml’ file points to the project directory and helps the YOLO model locate the images and labels for each split of the dataset.
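A minimal data.yaml for this setup could look like the example below; the class count and class names are assumptions for illustration and must match whatever fields you actually annotated:
path: /content/drive/MyDrive/My First Project.v1i.yolov8   # dataset root used later during training
train: images/train
val: images/val
test: images/test
nc: 4                                                      # number of annotated field classes (assumed)
names: ["Name", "CNIC_Number", "Date_of_Birth", "Date_of_Expiry"]   # example field labels (assumed)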
4. YOLO Model Loading
Next we will install the ultralytics package and import YOLO from it.
pip install ultralytics
from ultralytics import YOLO
After that we will load the YOLOv8 nano variant (yolov8n). You can use any other variant as well, depending on your scope of work and requirements.
YOLOv8 has several versions (like n, s, m, l, x) designed for different performance needs. Pick the one that suits your project best. You can read more about them here.
model = YOLO("yolov8n.pt")
5. Fine-tuning YOLO Model
Now we will load the ‘data.yaml’ file for training. Please change the path if your YAML file is located elsewhere.
model.train(data='/content/drive/MyDrive/My First Project.v1i.yolov8/data.yaml', epochs=50, imgsz=640, batch=16)
You can adjust the number of epochs accordingly, increasing them for better results while keeping hardware constraints in mind.
This will effectively fine-tune the YOLO model on our personalized dataset.
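Once training completes, Ultralytics normally writes the run (weights, logs, and plots) to a runs/detect/ folder. As a quick sanity check, the fine-tuned model can be evaluated on the validation split defined in data.yaml, for example:
metrics = model.val()       # runs evaluation on the 'val' split from data.yaml
print(metrics.box.map50)    # mAP at IoU 0.50
print(metrics.box.map)      # mAP averaged over IoU 0.50-0.95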
6. Saving/Exporting Model
yolo_model = model.export(format='pb')
This will essentially export and save the fine-tuned model on our local system. This ensures we can use our model later on for multiple use cases.
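The fine-tuned PyTorch weights themselves are typically saved to runs/detect/train/weights/best.pt (the exact run folder name may differ on your machine), so reloading the model later is as simple as, for example:
model = YOLO("runs/detect/train/weights/best.pt")   # adjust the path to your actual training run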
7. Inference on Final Model
results = model('/content/drive/MyDrive/ID_CARD_TEXT.v3i.yolov8/test', stream=True)
This will store the inference output in the results variable (a generator, since stream=True). Each prediction in results will contain:
⦁ Bounding box coordinates
⦁ Class labels
⦁ Confidence scores
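Because stream=True yields the results lazily, we iterate over them to read these values. A minimal sketch of that loop, using the attribute names from the Ultralytics results API:
for result in results:                              # one result per test image
    for box in result.boxes:
        cls_id = int(box.cls[0])                    # predicted class index
        conf = float(box.conf[0])                   # confidence score
        x1, y1, x2, y2 = map(int, box.xyxy[0])      # bounding box corners in pixels
        print(model.names[cls_id], round(conf, 2), (x1, y1, x2, y2))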
8. Text Extraction (OCR)
For text extraction (OCR), we can use any engine such as Pytesseract or EasyOCR, depending on which gives higher accuracy on our dataset. Since the coordinates and labels for every prediction are stored in the ‘results’ variable, each detected region is already mapped to its field label, something that was not possible before fine-tuning and that the YOLO model now makes possible.
So for each image, we can use the predicted coordinates to crop out each region and pass the crop to the OCR engine.
import pytesseract

# 'image' is the original image as a NumPy array (e.g. result.orig_img or loaded with cv2.imread),
# and 'boxes' are result.boxes from the inference step above
for box in boxes:
    x1, y1, x2, y2 = map(int, box.xyxy[0])
    cls_id = int(box.cls[0])
    crop = image[y1:y2, x1:x2]
    text = pytesseract.image_to_string(crop, config='--psm 6')
    print(f"{model.names[cls_id]}: {text.strip()}")
--psm 6 was ideal for single-line text.
You can clean the output to remove unwanted characters using post-processing techniques (e.g., regex, string cleaning).
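For example, a Pakistani CNIC number follows a fixed 5-7-1 digit pattern with dashes, so a small regex helper (the function name and sample string below are purely illustrative) can pull it out of noisy OCR output:
import re

def clean_cnic_number(raw_text):
    # Extract a 13-digit CNIC in the #####-#######-# pattern from noisy OCR text
    match = re.search(r"\d{5}-\d{7}-\d", raw_text)
    return match.group(0) if match else raw_text.strip()

print(clean_cnic_number("Identity Number 35202-1234567-8\n"))   # -> 35202-1234567-8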