Redacting Data in PDF Files
HuLoop’s Redaction function, part of its Intelligent Document Processing capabilities, enables you to automatically identify and conceal sensitive data such as Personally Identifiable Information (PII) within PDF files. The redacted information is visually removed from the document using bounding box detection and replaced with blank or blacked-out sections.
The Redaction function identifies the data points defined by the user and redacts them without requiring manual intervention. It’s especially useful for handling documents with confidential information that needs to be securely shared or archived.
Note: Ensure that the source PDF file is closed before running the automation. If the file is open during execution, the process may fail.
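If you want to confirm the file is free before the automation starts, a pre-flight check along these lines may help. This is a minimal Python sketch that runs outside HuLoop; the function name `looks_available` is hypothetical. It assumes a Windows-style environment, where a PDF viewer often holds a lock that causes a read/write open to fail. On other operating systems an open file may not be detected this way.

```python
from pathlib import Path


def looks_available(pdf_path: str) -> bool:
    """Return True if the file exists and can be opened for read/write.

    Hypothetical helper, not part of HuLoop. Opening in "r+b" mode does not
    change the file; on Windows it typically raises PermissionError when
    another program (for example a PDF viewer) holds a lock on the file.
    """
    path = Path(pdf_path)
    if not path.is_file():
        return False
    try:
        with open(path, "r+b"):
            return True
    except OSError:
        return False
```

A scheduler or wrapper script could call `looks_available(...)` and skip or delay the run when it returns False.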
Steps to Configure:
- Add a new step in your automation workflow.
- Select Set a Variable Value from the Action dropdown.
- Enter a variable name in Element Key (e.g., RedactedFilePath). This will store the path of the redacted output file.
- Click Form, select Functions, and choose Redaction from the dropdown.
- Provide the following parameters:
- Source: Full path to the input PDF file that needs redaction
- Destination: Path where the redacted file should be saved
- Auto-Redact: Specify true to automatically redact all matching fields, or false to selectively redact fields defined in the custom group.
- Start-Page: (Optional) Page number at which redaction begins.
- End-Page: (Optional) Page number at which redaction stops.
- Custom-Groups: Name of the group that includes the fields/entities to be redacted (e.g., PII_Group).
Note: Make sure the required identifiers are selected in the chosen PII group. If no identifiers are selected, the function may return empty results.
You can select multiple PII groups, both standard and custom, during function setup to extract or redact a wider set of information in one step.
- Click Save.
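The parameter set above can be sanity-checked before you type it into the form. The following is a minimal Python sketch, not a HuLoop API: the class `RedactionParams` and its `problems` method are hypothetical, and the rules it applies (page numbers start at 1, Start-Page must not exceed End-Page) are assumptions rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RedactionParams:
    """Hypothetical mirror of the Redaction function's form fields."""
    source: str
    destination: str
    auto_redact: bool = True
    start_page: Optional[int] = None   # optional, assumed to be 1-based
    end_page: Optional[int] = None     # optional, assumed to be 1-based
    custom_groups: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means none were found."""
        issues = []
        if not self.source.lower().endswith(".pdf"):
            issues.append("Source should point to a PDF file")
        if not self.destination:
            issues.append("Destination is required")
        for name, page in (("Start-Page", self.start_page),
                           ("End-Page", self.end_page)):
            if page is not None and page < 1:
                issues.append(f"{name} must be 1 or greater")
        if (self.start_page is not None and self.end_page is not None
                and self.start_page > self.end_page):
            issues.append("Start-Page must not exceed End-Page")
        return issues
```

For example, `RedactionParams(source="notes.txt", destination="").problems()` reports both the non-PDF source and the missing destination.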


Expected Outcome on Execution: After execution, the variable defined in the Element Key field stores the path to the redacted file, which can be used in later steps for processing or validation using the syntax ${Variable Name}.
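To illustrate how a ${Variable Name} reference resolves to a stored value, here is a small Python sketch. It only imitates the substitution idea and is not HuLoop's implementation; the function `resolve` and the stored path are hypothetical, since the name of the output file is not specified here.

```python
import re

# Matches ${...} and captures the variable name between the braces.
_REFERENCE = re.compile(r"\$\{([^}]+)\}")


def resolve(text: str, variables: dict) -> str:
    """Replace each ${Name} in text with its value from variables.

    Hypothetical helper: unknown names are left untouched so that a
    missing variable is easy to spot in the result.
    """
    def replace(match):
        name = match.group(1).strip()
        return str(variables.get(name, match.group(0)))

    return _REFERENCE.sub(replace, text)


# Hypothetical value; the real path is whatever the Redaction function returns.
stored = {"RedactedFilePath": r"C:\HuLoop\redacted.pdf"}
message = resolve("Attach ${RedactedFilePath} to the email", stored)
```

Here `message` contains the stored path in place of the `${RedactedFilePath}` reference.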
Example 1: Redacting PII from a Customer Profile PDF Using a Standard PII Group
Scenario: You have a file named CustomerProfile.pdf that contains sensitive details like Name and Email Address. You want to redact these fields before sharing the document.

Steps to Configure:
- In Element Key, enter RedactedFile.
- In Form, choose the Redaction function.
- Set these values in the parameters:
- Source=C:\Data\CustomerProfile.pdf
- Destination=C:\HuLoop
- Auto-Redact=true
- Custom-Groups=PII_Group
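If you keep parameter sets like the one above in a text file or spreadsheet, a short script can parse them for review before you enter them in the form. This Python sketch is illustrative only: `parse_params` is a hypothetical helper, and it assumes each entry follows the Key=Value layout shown above, with optional spaces around the equals sign.

```python
def parse_params(lines):
    """Turn 'Key=Value' lines into a dict, tolerating spaces around '='."""
    params = {}
    for line in lines:
        # Split on the first '=' only, so values may contain '=' themselves.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"Expected Key=Value, got: {line!r}")
        params[key.strip()] = value.strip()
    return params


example_1 = parse_params([
    r"Source=C:\Data\CustomerProfile.pdf",
    r"Destination=C:\HuLoop",
    "Auto-Redact=true",
    "Custom-Groups=PII_Group",
])

# Interpret the Auto-Redact text as a boolean for later checks.
auto_redact = example_1["Auto-Redact"].lower() == "true"
```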


Expected Outcome:
- HuLoop scans the document, identifies PII based on the selected PII_Group, and redacts it.
- A new file is generated with the redacted content, ensuring secure handling of sensitive information.
- The path to this redacted file is stored in the defined variable and can be referenced in later steps.

Example 2: Redacting Information Using a Custom PII Group
Scenario: You want to redact sensitive data like Salary Band, Bonus Eligibility, Performance Score, and Employee Grade from the file EmployeeConfidential_CustomPII.pdf. These fields are not part of HuLoop’s standard PII group, so you must create a Custom PII Group that includes these specific identifiers.

Steps to Configure:
- Add a New Step to your automation.
- From the Action drop-down, select Set a Variable Value.
- In the Element Key, enter a variable name to store the redaction status (e.g., RedactionStatus).
- Click on Form, choose Functions, and select Redaction.
- Fill in the Parameters:
- Source = C:\Documents\EmployeeConfidential_CustomPII.pdf
- Destination = C:\Documents
- Auto-Redact = true
- Custom-Groups = Employee Confidential (This group should include Salary Band, Bonus Eligibility, Performance Score, and Employee Grade, and every entity to be redacted must be selected in the group.)
- Click Save.


You can select multiple custom groups to redact multiple types of entities at once.
Note: Ensure that the file to be redacted is not open on your system during execution, as this may interfere with the automation.
Expected Outcome on Execution:
- The system redacts all occurrences of the specified custom PII fields (e.g., “Salary Band”, “Bonus Eligibility”) in the PDF.
- The redacted version of the file is saved to the specified destination.
- The variable RedactionStatus returns true if the redaction is successful.
