
Overview

Using AnyParser, you can extract personally identifiable information (PII) from your documents, including:
  • Name
  • Phone Number
  • Address
  • Email Address
  • LinkedIn URL
  • GitHub URL
  • Summary

Setup

Refer to the Quickstart guide to install the AnyParser SDK and get your API key. First, set up your AnyParser client.
anyparser_pii.py
from any_parser import AnyParser

ap = AnyParser(api_key="...")
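Rather than hard-coding the key in the script, you may prefer to read it from the environment. The following is a minimal sketch, not part of the SDK: the helper name load_api_key and the environment variable name ANYPARSER_API_KEY are assumptions chosen for illustration.

```python
import os


def load_api_key(env_var: str = "ANYPARSER_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical convention, not one defined by AnyParser.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your AnyParser API key."
        )
    return key


# Usage (requires the any_parser package from the Quickstart guide):
# from any_parser import AnyParser
# ap = AnyParser(api_key=load_api_key())
```

Failing early with a clear message tends to be easier to debug than an authentication error returned later by the service.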
Then, call the extract_pii method, passing in the following:
  • file_path (str): the path to the local file
anyparser_extract_pii.py
pii_result, total_time = ap.extract_pii(file_path="/path/to/your/file")
The call returns two values:
  • pii_result (dict): Dictionary with the keys corresponding to PII types, and the values extracted from the document
  • total_time (str): the time elapsed in seconds

Full Code

anyparser_extract_pii.py
from any_parser import AnyParser

ap = AnyParser(api_key="...")

local_file_path = "/path/to/your/file"

pii_result, total_time = ap.extract_pii(file_path=local_file_path)
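Because file_path must point to a local file, it can help to validate the path before calling the SDK. This is a sketch under that assumption only; checked_path is a hypothetical helper, not an AnyParser function.

```python
from pathlib import Path


def checked_path(file_path: str) -> str:
    """Expand '~' and confirm the path refers to an existing regular file."""
    path = Path(file_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No file found at: {path}")
    return str(path)


# Usage with the client created above:
# pii_result, total_time = ap.extract_pii(file_path=checked_path(local_file_path))
```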

Output

A dictionary containing Personally Identifiable Information (PII).
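Since the result is a plain dictionary, it can be inspected with ordinary Python. The sketch below filters out empty entries and prints the rest. The sample dictionary and its key names are made up for illustration; the actual keys returned by extract_pii may differ.

```python
def present_fields(pii_result: dict) -> dict:
    """Keep only the entries whose value is not empty."""
    empty_values = (None, "", [], {})
    return {key: value for key, value in pii_result.items() if value not in empty_values}


# Hypothetical sample; key names are illustrative, not the SDK's actual schema.
sample = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "phone_number": "",
    "github_url": None,
}

for field, value in present_fields(sample).items():
    print(f"{field}: {value}")
```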

Full Notebook Examples

Check out these notebooks for more detailed examples of using both sync and async AnyParser.

AnyParser Sync Extract PII Example

Extracting key-values from a fake W2 document.

AnyParser Async Extract PII Example

Extracting key-values from a fake W2 document.