CV Parsing for recruitment

Digitise data, accelerate recruitment processes and save time with CV parsing and information extraction.

Why use CV Parsing and information extraction

Artificial intelligence can reduce the time spent managing applications and transferring large amounts of data, starting from the earliest recruitment stages. Working alongside the recruiter, Information Extraction and CV Parsing (also called Resume Parsing) speed up the extraction of relevant candidate information by automating both the completion of the application form and the storage of data. As a result, the recruiter can process applications faster, whether in digital or paper format, even at career days, trade fairs or recruiting events. With a quick scan of the CV, the information is digitised, extracted and added to a database or transferred to other management software. The whole application process benefits: the candidate experience improves and the dropout rate during the application phase falls.

Parsing of documents

Parsing solutions don’t just apply to the HR industry or the search and selection process! Operating in a generalised manner, the parsers automatically recognise and analyse every type of digital document, adapting to the specific internal and sector needs of each company.

What is the information extraction and CV Parsing process

Starting from a CV in text or image format, Inda’s Information Extraction and Resume/CV Parsing extract a candidate’s unstructured data and convert it into structured information or documents in XML format. These automatic processes can replace the manual filling in of the application form, accelerating recruiting activities. The process runs in four steps:

  • 1. Recognition of the CV file
  • 2. Text extraction
  • 3. Identification of structured information
  • 4. Data mapping

Through Document Layout Analysis (DLA) and Optical Character Recognition (OCR) techniques, it is possible to distinguish documents in textual format from those in image format. The analysis of the layout, structure and sections makes it possible to differentiate the various types of CVs.

File extensions supported by Inda: ['pdf', 'doc', 'docx', 'odt', 'txt', 'html', 'pptx', 'rtf', 'jpg', 'jpeg', 'png', 'tif', 'tiff'].
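
As a rough illustration of the first step, the sketch below sorts a file name into a handling path based on the extensions listed above. The split into text and image formats, the separate "ambiguous" bucket for PDF (which may contain embedded text or scanned pages) and the function name `classify_cv_file` are assumptions made for this example, not a description of how Inda works internally.

```python
from pathlib import Path

# Extensions from the list above, grouped by an ASSUMED handling path.
TEXT_EXTENSIONS = {"doc", "docx", "odt", "txt", "html", "pptx", "rtf"}
IMAGE_EXTENSIONS = {"jpg", "jpeg", "png", "tif", "tiff"}
AMBIGUOUS_EXTENSIONS = {"pdf"}  # may hold embedded text or scanned pages


def classify_cv_file(filename: str) -> str:
    """Return 'text', 'image', 'ambiguous' or 'unsupported' for a file name."""
    # Normalise the extension: drop the leading dot and ignore case.
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in TEXT_EXTENSIONS:
        return "text"
    if ext in IMAGE_EXTENSIONS:
        return "image"
    if ext in AMBIGUOUS_EXTENSIONS:
        return "ambiguous"
    return "unsupported"
```

In a real pipeline the extension would only be a first hint; the layout analysis described above would still be needed to decide whether OCR is required.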

After analysing the CV format, it is possible to proceed with the extraction of well-ordered text from the source document.

CV/Resume Parsing analyses a CV and exports its data: it captures the text of the CV and converts it into structured information or documents in XML format.
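
To make the idea of converting extracted data into XML concrete, here is a minimal sketch using Python's standard library. The element names (`candidate`, `name`, `surname`, `email`), the sample values and the helper `cv_to_xml` are hypothetical; the actual XML schema produced by Inda is not specified in this page.

```python
import xml.etree.ElementTree as ET


def cv_to_xml(candidate: dict) -> str:
    """Serialise a flat dict of extracted fields into an XML string."""
    root = ET.Element("candidate")  # hypothetical root element name
    for field, value in candidate.items():
        # One child element per extracted field, e.g. <email>...</email>.
        child = ET.SubElement(root, field)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up sample data standing in for fields extracted from a CV.
extracted = {"name": "Ada", "surname": "Rossi", "email": "ada.rossi@example.com"}
xml_doc = cv_to_xml(extracted)
print(xml_doc)
```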

Through Named Entity Recognition (NER) techniques it is possible to distinguish specific entities such as "name", "surname", "job". Thanks to Relation Extraction (RE) it is also possible to understand the type of relationship that these semantic entities have with specific sections of the CV.
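
The sketch below gives a feel for what recognising an entity and linking it to a CV section means. It is deliberately a rule-based stand-in (a regular expression and a fixed list of headings), not the NER and RE models described above; the heading names and function names are assumptions for illustration.

```python
import re

# Simple pattern for email addresses; a stand-in for a trained NER model.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Hypothetical section headings; real CVs use many more variants.
SECTION_HEADINGS = {"experience", "education", "skills"}


def tag_lines_with_sections(cv_text: str) -> list[tuple[str, str]]:
    """Pair every non-empty content line with the section heading above it."""
    current = "header"  # lines before the first heading
    tagged = []
    for raw in cv_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.lower() in SECTION_HEADINGS:
            current = line.lower()  # a new section starts here
            continue
        tagged.append((current, line))
    return tagged


def find_emails(cv_text: str) -> list[str]:
    """Return every substring that looks like an email address."""
    return EMAIL_RE.findall(cv_text)
```

A learned model replaces both the pattern and the heading list, which is what lets it cope with free-form CVs that do not follow a fixed template.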

Among the entities recognised and extracted by Inda:

Personal data

  • Name
  • Surname
  • Email
  • Address (address; city; state; postcode)
  • Age
  • Date of birth
  • Average level of experience (number of previous positions, total experience)
  • Links
  • Phone number

Professional experiences

  • Start date
  • End date
  • Seniority
  • Company
  • Location (city; state)
  • Position

Educational qualifications

  • Start date
  • End date
  • Duration
  • Location (city; state)
  • Institute
  • Title
  • Disciplinary field

Skills and language

  • Skills (competence; score)
  • Language (mother tongue; foreign language)

Additional information

  • Photo
  • CV language


As a last step, the extracted information is mapped into predefined fields. Application forms generally contain drop-down menus with multiple options, so each extracted value must be mapped to the most similar of the available options.
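
One simple way to picture this mapping is plain string similarity, sketched here with `difflib` from the Python standard library. This is an assumption for illustration: the page does not say how Inda measures similarity, and a production system may well rely on semantic rather than character-level matching. The function name `map_to_option`, the cutoff value and the sample options are hypothetical.

```python
from difflib import get_close_matches
from typing import Optional


def map_to_option(value: str, options: list[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the drop-down option most similar to value, or None if none is close."""
    # Compare in lower case, but return the option with its original spelling.
    lowered = {opt.lower(): opt for opt in options}
    matches = get_close_matches(value.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


# Made-up drop-down options for an "education level" field.
education_options = ["Bachelor's degree", "Master's degree", "Doctorate", "High school diploma"]
print(map_to_option("master degree", education_options))
```

Returning `None` below the cutoff leaves the field empty rather than forcing a poor match, so the recruiter or candidate can fill it in manually.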

How do information extraction and CV Parsing work

Information Extraction mainly combines technologies from the fields of Computer Vision and Natural Language Processing.

Document Layout Analysis (DLA), part of Computer Vision technology, is the process of identifying and analysing the layout and geometric sections of a document.

Optical Character Recognition (OCR), integrated with Computer Vision techniques and based on Natural Language Processing algorithms, is a system that recognises sequences of characters within a document in image format.

Language detection is a tool that recognises the language in which the content of a document (for example a CV) is written.
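
As a toy illustration of recognising the language of a document, the sketch below counts how many common function words of each language appear in the text. The stopword lists are tiny, hand-picked samples and the function `guess_language` is hypothetical; this is a naive heuristic, not the method used by Inda.

```python
import re
from typing import Optional

# Tiny, hand-picked stopword samples; an assumption for illustration only.
STOPWORDS = {
    "en": {"the", "and", "of", "in", "with", "for"},
    "it": {"il", "di", "e", "con", "per", "che"},
    "fr": {"le", "et", "de", "avec", "pour", "dans"},
}


def guess_language(text: str) -> Optional[str]:
    """Return the language code whose stopwords occur most often, or None."""
    # Split into alphabetic tokens (Unicode letters only, no digits).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(tok in words for tok in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Short texts and mixed-language CVs defeat a heuristic this small, which is why dedicated language identification models are used in practice.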

Text extraction starts from the analysis of the format and layout geometry (DLA) of a document and provides the recognition and extraction of well-ordered text.

Named Entity Recognition (NER), based on Deep Learning algorithms at the heart of Natural Language Processing (NLP), is an information extraction process that recognises entities in a text.

Relation Extraction (RE), based on Deep Learning and Natural Language Processing (NLP) techniques, identifies relationships between the entities recognised by NER.

Do you want to know more about Inda's features?

Request a demo and find out how to make the most of Information Extraction and CV Parsing to optimise your recruitment.

The benefits of information extraction and CV Parsing

Improve the Candidate Experience
Increase Return on Investment (ROI)
Increase the Conversion Rate
Enhance Candidate Attraction

Contact us

+39 0371 5948800
Corso Duca d’Aosta, 1 – Turin
Via Caviglia 11 –  Milan

Copyright © 2021 Inda

Inda is a solution by Intervieweb S.r.l. part of the Zucchetti Group P.IVA: 10067590017
