Cleaning, Validation, and Calculation Blocks

Welcome to the support page for Docsumo’s Field Setup! This guide walks you through using AI-powered, plain-English prompts to extract, transform, and validate data from documents, with no coding required.


Overview

Docsumo’s Field Setup allows you to:

  • Create extraction rules using natural-language prompts
  • Normalize, calculate, and validate document data with AI
  • Test and iterate your rules instantly before production

Key Features & Prompt Examples

1. No-Code Extraction with Natural Language

Define what you want to extract in everyday English.
Prompt Examples:

  • Extract the "Invoice Number" and "Invoice Date" from the document.
  • Find and extract the "Customer Name" and "Total Amount Due".
  • Extract all line items including "Description", "Quantity", and "Unit Price".

More detail: Extraction block and best practices


2. Cleaning

Standardize data to fit your database conventions, even if vendors use different formats.

Prompt Examples:

  • Normalize all variations of vendor names such as "Acme Corp", "ACME CORPORATION", or "Acme Co." to "ACME CORP".
  • Convert all dates to the YYYY-MM-DD format.
  • Map payment terms like "Net 30", "30 days", and "Due in 30" to "Net 30".
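
To make the expected output of these cleaning prompts concrete, here is a minimal Python sketch of equivalent rules. It is not how Docsumo works internally and you do not need it to use Field Setup. The function names, lookup tables, and list of accepted input date formats are hypothetical, chosen only to mirror the prompt examples above.

```python
from datetime import datetime

# Hypothetical lookup tables mirroring the prompt examples above.
VENDOR_ALIASES = {
    "acme corp": "ACME CORP",
    "acme corporation": "ACME CORP",
    "acme co.": "ACME CORP",
}
PAYMENT_TERMS = {
    "net 30": "Net 30",
    "30 days": "Net 30",
    "due in 30": "Net 30",
}
# Assumed input formats. Note that "%m/%d/%Y" treats 03/04/2024 as March 4;
# swap it for "%d/%m/%Y" if your vendors write the day first.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %B %Y", "%B %d, %Y")


def normalize_vendor(name: str) -> str:
    """Return the canonical vendor name, or the trimmed input if unknown."""
    cleaned = name.strip()
    return VENDOR_ALIASES.get(cleaned.lower(), cleaned)


def normalize_terms(terms: str) -> str:
    """Map known payment-term variants to one label; leave others unchanged."""
    cleaned = terms.strip()
    return PAYMENT_TERMS.get(cleaned.lower(), cleaned)


def normalize_date(text: str) -> str:
    """Convert a date in one of the assumed formats to YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"Unrecognized date format: {text!r}")
```

A sketch like this can be handy for spot-checking cleaned values after export: unknown vendors and payment terms pass through unchanged, while an unrecognized date raises an error rather than being guessed.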

3. Calculations

Let Docsumo compute values on the fly, directly from your documents.
Prompt Examples:

  • Calculate the total sum of all {{field name}} and extract as {{field name}}.
  • Extract all {{field name}} values and calculate the average.
  • Sum all {{field name}} across the invoice and output as {{field name}}.
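
If you want to double-check a calculated field outside Docsumo, the arithmetic behind these prompts can be sketched in a few lines of Python. This is an illustration only, not Docsumo's implementation; `sum_field` and `average_field` are hypothetical names, and the sketch assumes the extracted values arrive as plain numeric strings such as "10.50".

```python
from decimal import Decimal


def sum_field(values):
    """Add up extracted values given as numeric strings."""
    # Decimal avoids the rounding surprises of binary floats with currency.
    return sum((Decimal(v) for v in values), Decimal("0"))


def average_field(values):
    """Return the arithmetic mean of the extracted values."""
    if not values:
        raise ValueError("Cannot average an empty list of values")
    return sum_field(values) / len(values)
```

For example, `sum_field(["10.50", "4.25", "0.25"])` adds the three line amounts, and `average_field` divides that total by the number of values.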

4. Validation

Ensure extracted data meets your requirements before further processing.
Prompt Examples:

  • Validate that the "Account Number" field contains exactly 11 digits.
  • Check if "Invoice Date" is not later than "Due Date".
  • Ensure "Total Amount" is a positive number and not blank.
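
The three validation prompts above correspond to simple checks. The Python sketch below shows one way to express them, in case you want to mirror the same rules in a downstream system. The function names are hypothetical and the code is not part of Docsumo; it assumes the account number and amount are strings and the dates have already been parsed.

```python
import re
from datetime import date
from decimal import Decimal, InvalidOperation


def is_valid_account_number(value: str) -> bool:
    """True if the value consists of exactly 11 ASCII digits."""
    return re.fullmatch(r"[0-9]{11}", value) is not None


def dates_in_order(invoice_date: date, due_date: date) -> bool:
    """True if the invoice date is not later than the due date."""
    return invoice_date <= due_date


def is_positive_amount(value) -> bool:
    """True if the value is present and parses to a finite number above zero."""
    if value is None or not value.strip():
        return False  # blank or missing
    try:
        amount = Decimal(value.strip())
    except InvalidOperation:
        return False  # not a number
    return amount.is_finite() and amount > 0
```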

5. Instant Testing & Iteration

Test your prompts and see the results immediately. Adjust as needed for accuracy.
Prompt Examples:

  • Extract "PO Number" and validate it starts with "PO-" followed by 6 digits.
  • Test extraction of "Vendor Email" and ensure it matches a valid email format.
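
When you iterate on prompts like these, it helps to keep a small set of sample values and note which ones should pass. The Python sketch below illustrates the idea; it is not a Docsumo feature. The patterns are assumptions: the PO check treats "PO-" plus 6 digits as the whole value, and the email check is a deliberately loose format test, not full address validation. All names are hypothetical.

```python
import re

# Assumed formats, taken from the prompt examples above.
PO_PATTERN = re.compile(r"PO-[0-9]{6}")
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_valid_po_number(value: str) -> bool:
    """True if the value is "PO-" followed by exactly 6 digits."""
    return PO_PATTERN.fullmatch(value.strip()) is not None


def is_valid_email(value: str) -> bool:
    """True if the value looks like name@domain.tld (loose check)."""
    return EMAIL_PATTERN.fullmatch(value.strip()) is not None


def failing_samples(check, samples):
    """Return the sample values that do not pass the given check."""
    return [s for s in samples if not check(s)]
```

Calling `failing_samples(is_valid_po_number, ["PO-123456", "PO-12345"])` returns only the values that break the rule, which makes it easy to see which documents need a prompt adjustment.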


Best Practices & Tips

  • Be specific in your prompts: name the exact field label and the expected format (for example, "Invoice Date" in YYYY-MM-DD) for higher accuracy.
  • Use validation to catch errors before they enter your workflow.
  • Iterate and test with a variety of sample documents.
  • Combine extraction, normalization, and validation in a single workflow for efficiency.