getText Action
The getText action extracts text, images, or both from a specified range of pages in a PDF file.
Use it when an automation workflow needs the content of particular pages rather than the whole document.
Example: You need to extract the text from pages 1 to 5 of the Report.pdf file located in “C:\User\Documents\”.
Steps to Configure:
- Add a new step.
- Select Set a Variable Value from the Action dropdown.
- Enter a variable name in Element Key (e.g., ExtractedText). This variable will store the extracted text.
- Click on Form, select Functions, and choose PDF File Handler Functions.
- In the Action dropdown, select getText.
- Provide Parameters:
  - FPath: Specify the full path of the PDF file to extract text from (e.g., C:\User\Documents\Report.pdf).
  - Start: Define the starting page number for the extraction (e.g., 1).
  - End: Define the ending page number for the extraction (e.g., 5).
  - Page Contains: Select what type of content to include from the dropdown:
    - Text: Extract only text content.
    - Images: Extract only images.
    - All: Extract both text and images.
- Click Save.
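
The parameters in the steps above can be pictured as a small record with a 1-based, inclusive page range. The Python sketch below is only an illustration, not the tool's implementation: the class name `GetTextParams`, its field names, and the validation rules (Start of at least 1, End not before Start) are assumptions made for this example.

```python
from dataclasses import dataclass

# Options offered by the Page Contains dropdown.
PAGE_CONTENT_OPTIONS = ("Text", "Images", "All")


@dataclass(frozen=True)
class GetTextParams:
    """Hypothetical model of the getText form fields."""
    fpath: str                  # FPath: full path of the PDF file
    start: int                  # Start: first page to extract (1-based)
    end: int                    # End: last page to extract (inclusive)
    page_contains: str = "Text"

    def __post_init__(self):
        # Assumed validation; the tool's real checks are not documented here.
        if self.start < 1:
            raise ValueError("Start must be 1 or greater")
        if self.end < self.start:
            raise ValueError("End must not be less than Start")
        if self.page_contains not in PAGE_CONTENT_OPTIONS:
            raise ValueError("Page Contains must be Text, Images, or All")

    def page_numbers(self):
        """Return every page number covered by Start..End, inclusive."""
        return list(range(self.start, self.end + 1))


params = GetTextParams(r"C:\User\Documents\Report.pdf", 1, 5, "Text")
print(params.page_numbers())  # [1, 2, 3, 4, 5]
```

Treating End as inclusive matches the example, where pages 1 to 5 are requested with Start 1 and End 5.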


Outcome on execution:
- For example, with Start set to 1 and End set to 3, suppose the file contains the following text on pages 1 to 3:
  - Page 1: This is a test document.
  - Page 2: It demonstrates the GetText function.
  - Page 3: Automation simplifies text extraction.
- The variable ${ExtractedText} will then store the text from page 1 to page 3 as shown:
  - This is a test document.
  - It demonstrates the GetText function.
  - Automation simplifies text extraction.
- Use the variable syntax ${VariableName} (e.g., ${ExtractedText}) to access this text in later steps.
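
The `${VariableName}` placeholder style happens to match Python's `string.Template`, so the outcome above can be sketched outside the tool. This is a rough illustration only: the `get_text` helper, the newline separator between pages, and the message wording are assumptions, and the tool's own variable resolver may behave differently.

```python
from string import Template

# Sample page texts, copied from the example above.
pages = {
    1: "This is a test document.",
    2: "It demonstrates the GetText function.",
    3: "Automation simplifies text extraction.",
}


def get_text(pages, start, end):
    """Join the text of pages start..end (inclusive).

    Joining pages with a newline is an assumption for this sketch.
    """
    return "\n".join(pages[n] for n in range(start, end + 1))


# Store the result under the Element Key chosen in the steps.
variables = {"ExtractedText": get_text(pages, 1, 3)}

# A later step refers to the stored text with ${ExtractedText}.
message = Template("Report contents:\n${ExtractedText}").substitute(variables)
print(message)
```

`substitute` raises `KeyError` when a placeholder has no matching variable, which makes a misspelled variable name easy to spot while experimenting.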