Read PDF File Using Python in Robot Framework – Devstringx


Task: Read data from a PDF file and check whether the text “Testing” is present in it.

Create a function to read data from PDF File using Python

First, install the pdfminer and Pdf2TextLibrary libraries on your system by following the steps below:

  1. Open a command prompt.
  2. Run the “pip install pdfminer” command to install the pdfminer library. On Python 3, the maintained fork is published as pdfminer.six, so “pip install pdfminer.six” is usually the one you need.
  3. Run the “pip install robotframework-pdf2textlibrary” command to install Pdf2TextLibrary.
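
Before moving on, it can help to confirm that the install actually worked. The snippet below is a minimal sketch using only the standard library; the helper name missing_modules is made up for this example, and it assumes the package is importable under the top-level name pdfminer (which is what the import lines in the library file rely on).

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # "pdfminer" is the top-level package the library file imports from.
    missing = missing_modules(["pdfminer"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("pdfminer is importable")
```

If the script reports that pdfminer is not installed, check that pip installed into the same Python interpreter that Robot Framework uses.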

Now create a Python file. You can give it any name, as long as you save it with the .py extension.

I have named my Python file Pdf2TextLibrary.py.

from io import StringIO

from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage


class Pdf2TextLibrary(object):
    # One library instance is shared across the whole test run.
    ROBOT_LIBRARY_SCOPE = 'GLOBAL'

    def __init__(self):
        print('pdf to text library')

    def convert_pdf_to_txt(self, path):
        """Return the text content of the PDF at `path` as a single string."""
        rsrcmgr = PDFResourceManager()
        retstr = StringIO()
        codec = 'utf-8'
        laparams = LAParams()
        device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
        interpreter = PDFPageInterpreter(rsrcmgr, device)
        password = ""
        maxpages = 0       # 0 means no page limit
        caching = True
        pagenos = set()    # an empty set means all pages
        # The with block closes the file even if a page fails to parse.
        with open(path, 'rb') as fp:
            for page in PDFPage.get_pages(fp, pagenos, maxpages=maxpages, password=password,
                                          caching=caching, check_extractable=True):
                interpreter.process_page(page)
        device.close()
        text = retstr.getvalue()
        retstr.close()
        return text

Here, the convert_pdf_to_txt function converts the PDF data to text and returns it as a string.
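
If you only need the plain text, a shorter route may be enough. The sketch below assumes the pdfminer.six fork is installed, since its pdfminer.high_level module provides an extract_text helper; the function name pdf_to_text is just an illustrative choice and is not part of the library above.

```python
def pdf_to_text(path):
    """Return all text in the PDF at `path` as one string."""
    # Imported inside the function so this file can still be loaded when
    # pdfminer.six is not installed; the call itself then raises ImportError.
    from pdfminer.high_level import extract_text

    return extract_text(path)
```

Calling pdf_to_text with the path of a PDF from a Python shell is a quick way to see what text the extraction produces before wiring anything into Robot Framework.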


Call the Python function from Robot Framework to read the PDF data

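# Import the file in which you created the function to read data from a PDF file.
*** Settings ***
# OperatingSystem provides the List Files In Directory keyword.
Library        OperatingSystem
Library        ../Scripts/Pdf2TextLibrary.py

*** Test Cases ***
Read PDF File Data
    # Open the downloaded PDF and read its data.
    # List Files In Directory returns a list, so the first entry is used below.
    ${file_names}=    List Files In Directory    ${EXECDIR}/Files/Downloads
    # ${DWNLDFOLDER} is assumed to be defined elsewhere as the download folder path, ending with a path separator.
    ${string}=    convert_pdf_to_txt    ${DWNLDFOLDER}${file_names}[0]
    Should Contain    ${string}    Testing    # check that the expected text is present in the PDF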
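
A note on naming: Robot Framework matches keyword names case-insensitively and ignores spaces and underscores, so the Python method convert_pdf_to_txt can also be written as Convert Pdf To Txt in a test case. The snippet below is a simplified illustration of that matching rule, not Robot Framework's actual code; normalize_keyword is a made-up helper.

```python
def normalize_keyword(name):
    """Lowercase `name` and drop spaces and underscores."""
    return name.replace(" ", "").replace("_", "").lower()


# Both spellings reduce to the same normalized name.
same = normalize_keyword("Convert Pdf To Txt") == normalize_keyword("convert_pdf_to_txt")
```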

Save this file as “TestPDF.robot”.

Run the above file by using the following command:
>robot TestPDF.robot

Output: The test passes if the text is present in the downloaded PDF.
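
One thing to watch for: text extracted from a PDF often contains line breaks and extra spaces where the page layout wrapped, so a phrase that spans two lines may not match exactly. A minimal sketch of a whitespace-tolerant check is shown below; contains_text is a hypothetical helper that you could add to your own Python library file, not something the libraries above provide.

```python
def contains_text(extracted, expected):
    """Return True if `expected` occurs in `extracted`, ignoring whitespace layout."""
    # Collapse every run of whitespace (spaces, tabs, newlines) to one space.
    normalized_text = " ".join(extracted.split())
    normalized_expected = " ".join(expected.split())
    return normalized_expected in normalized_text
```

For a single word such as “Testing”, the plain Should Contain check is normally sufficient; the normalization matters mostly for multi-word phrases.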

FAQs

  • What is the purpose of Robot Framework?

Robot Framework is an open-source, generic automation framework built on Python. It can be used for test automation and robotic process automation (RPA). It is supported by the Robot Framework Foundation, and many industry-leading companies use it in their software development.

  • Does Robot Framework need to be programmed?

Robot Framework is simple to understand and use because tests are written with keywords, so the user is not required to write complex code.

  • What makes Robot Framework superior to Selenium?

There is a significant distinction between the two: Robot Framework is a test framework that uses test libraries (both built-in and external) to carry out tests, whereas Selenium is only a WebDriver library that depends on a test runner to carry out tests.

  • Is Robot Framework compatible with Windows?

Yes. With Python installed, open a command prompt and install Robot Framework using pip: pip install robotframework

  • How do I use Robot Framework to read PDFs in python?

The robot script is explained step by step below.

[Arguments] ${pdf file name}: The argument for the keyword is the file name of a PDF document.

${text}= Get Text From Pdf ${pdf file name}: Using the Get Text From Pdf keyword offered by the RPA Framework PDF library (RPA.PDF), we extract the text from the PDF file.

Create File ${OUTPUT DIR}${/}${pdf file name}: A file is created in the output directory.

