🔬 Extract structured data from D&B company reports with GPT-4o


Description

Pull a Dun & Bradstreet Business Information Report (PDF) by DUNS number, convert the response into a binary PDF file, extract the readable text, and use OpenAI to return a clean, flat JSON object containing only the key fields you care about (e.g., report date, Paydex, viability score, credit limit). Sticky Notes are included for quick setup help and guidance.


✅ What this template does


🧩 How it works (node-by-node)

  1. Manual Trigger: runs the workflow on demand (“When clicking ‘Execute workflow’”).
  2. D&B Report (HTTP Request): calls the D&B Reports API for a Business Information Report (PDF).
  3. Convert to PDF File (Convert to File): turns the D&B response payload into a binary PDF.
  4. Extract Binary (Extract from File): extracts the text content from the PDF.
  5. OpenAI Chat Model: supplies the language model that the analyzer uses.
  6. Analyze PDF (AI Agent): reads the extracted text and applies strict rules to produce a flat JSON output.
  7. Structured Output (AI Structured Output Parser): enforces a schema and validates/auto-fixes the JSON shape.
  8. (Optional) Get Bearer Token (HTTP Request): template guidance for OAuth token retrieval (shown as disabled; included for reference if you prefer Bearer flows).
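
To make step 3 in the list above concrete, here is a minimal Python sketch of the same idea outside n8n. It assumes the D&B response carries the PDF as a base64-encoded string; the template does not state the payload layout, so treat that as an assumption and check it against your own response. The function name `payload_to_pdf_bytes` is hypothetical.

```python
import base64


def payload_to_pdf_bytes(encoded: str) -> bytes:
    """Decode a base64 string and confirm the result looks like a PDF.

    Assumption: the report arrives as base64 text. This is not confirmed
    by the template; adjust if your payload is shaped differently.
    """
    # validate=True rejects characters outside the base64 alphabet
    pdf_bytes = base64.b64decode(encoded, validate=True)
    # Every PDF file begins with the "%PDF-" signature
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("decoded payload does not start with the PDF signature")
    return pdf_bytes
```

Checking the signature before text extraction is a cheap way to catch the case where the API returned an error body instead of a report.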

🛠️ Setup instructions (from the JSON)

1) D&B Report (HTTP Request)

2) Convert to PDF File (Convert to File)

3) Extract Binary (Extract from File)

4) OpenAI Model(s)

5) Analyze PDF (AI Agent)

6) Structured Output (AI Structured Output Parser)

{
  "report_date": "",
  "company_name": "",
  "duns": "",
  "dnb_rating_overall": "",
  "composite_credit_appraisal": "",
  "viability_score": "",
  "portfolio_comparison_score": "",
  "paydex_3mo": "",
  "paydex_24mo": "",
  "credit_limit_conservative": ""
}
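
If you post-process the result outside n8n, a shape check along these lines may be useful. It is only a sketch of what "flat JSON with these ten string fields" means, not the parser's actual implementation; `EXPECTED_KEYS` mirrors the schema above and `check_flat_report` is a hypothetical helper name.

```python
from __future__ import annotations

# Keys copied from the schema example above
EXPECTED_KEYS = (
    "report_date",
    "company_name",
    "duns",
    "dnb_rating_overall",
    "composite_credit_appraisal",
    "viability_score",
    "portfolio_comparison_score",
    "paydex_3mo",
    "paydex_24mo",
    "credit_limit_conservative",
)


def check_flat_report(obj: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    missing = [key for key in EXPECTED_KEYS if key not in obj]
    extra = [key for key in obj if key not in EXPECTED_KEYS]
    if missing:
        problems.append(f"missing keys: {missing}")
    if extra:
        problems.append(f"unexpected keys: {extra}")
    # "Flat" here means every value is a plain string, never a list or object
    for key, value in obj.items():
        if not isinstance(value, str):
            problems.append(f"{key}: expected string, got {type(value).__name__}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to see everything that is wrong with a single model response.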

7) (Optional) Get Bearer Token (HTTP Request): disabled example

If you prefer fetching tokens dynamically:

> In this template, the D&B Report node uses a Header Auth credential instead. Use one strategy consistently (credentials are recommended for security).
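
For readers who do take the dynamic-token route, the sketch below builds a generic OAuth 2.0 client-credentials request in Python without sending it. It follows the general pattern in RFC 6749 (HTTP Basic client authentication, form-encoded `grant_type=client_credentials`); the token URL, the body format D&B expects, and the function name `build_token_request` are all assumptions here, so confirm the details in the D&B API documentation before relying on it.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request.

    Assumption: the token endpoint accepts HTTP Basic client authentication
    and a form-encoded body, as in generic OAuth 2.0. D&B may differ.
    """
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        token_url,  # placeholder: supply the endpoint from your D&B documentation
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Keeping the secret in a credential store and passing it in at call time, rather than hard-coding it, matches the template's advice to prefer credentials.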


🧠 Output schema (flat JSON)

The analyzer + parser return a single flat object like:

{
  "report_date": "2024-12-31",
  "company_name": "Example Corp",
  "duns": "123456789",
  "dnb_rating_overall": "5A2",
  "composite_credit_appraisal": "Fair",
  "viability_score": "3",
  "portfolio_comparison_score": "2",
  "paydex_3mo": "80",
  "paydex_24mo": "78",
  "credit_limit_conservative": "25000"
}
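
Every value in the object above is a string. If a downstream system needs numbers, a conversion step like the following Python sketch is one option. Which fields are numeric is an assumption based only on the example values shown here, and `to_typed` is a hypothetical helper name.

```python
# Assumption: these fields hold whole numbers, judging by the example above
NUMERIC_FIELDS = (
    "viability_score",
    "portfolio_comparison_score",
    "paydex_3mo",
    "paydex_24mo",
    "credit_limit_conservative",
)


def to_typed(report: dict) -> dict:
    """Copy the report, turning numeric-looking strings into ints.

    Values that are empty or not purely decimal digits (for example
    "25,000" or "N/A") become None instead of raising.
    """
    typed = dict(report)
    for key in NUMERIC_FIELDS:
        raw = str(report.get(key, "")).strip()
        typed[key] = int(raw) if raw.isdecimal() else None
    return typed
```

Mapping unparseable values to `None` keeps a single odd field from failing the whole record; stricter handling may suit your use case better.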

🧪 Test flow

  1. Click Execute workflow (Manual Trigger).
  2. Confirm D&B Report returns the PDF response.
  3. Check Convert to PDF File for a binary file.
  4. Verify Extract from File produces a text field.
  5. Inspect Analyze PDF → Structured Output for valid JSON.

🔐 Security notes


🧩 Customize


🩹 Troubleshooting


🗒️ Sticky Notes (included)


📬 Contact

Need help customizing this (e.g., routing the PDF to Drive, mapping JSON to your CRM, or expanding the schema)?

📧 robert@ynteractive.com
🔗 https://www.linkedin.com/in/robert-breen-29429625/
🌐 https://ynteractive.com

🔗 Nodes Used

HTTP Request, AI Agent, OpenAI Chat Model, Structured Output Parser, Convert to File, Extract from File

📥 Import

Download workflow.json and import it into n8n: Workflow menu → Import from File
