Cleaning, Validation, and Calculation Blocks

Welcome to your comprehensive support page for Docsumo’s Field Setup! This guide will walk you through the process of using AI-powered, plain-English prompts to extract, transform, and validate data from documents—no coding required.


Overview

Docsumo’s Field Setup allows you to:

  • Create extraction rules using natural-language prompts
  • Normalize, calculate, and validate document data with AI
  • Test and iterate your rules instantly before production

Key Features & Prompt Examples

1. No-Code Extraction with Natural Language

Define what you want to extract in everyday English.
Prompt Examples:

  • Extract the "Invoice Number" and "Invoice Date" from the document.
  • Find and extract the "Customer Name" and "Total Amount Due".
  • Extract all line items including "Description", "Quantity", and "Unit Price".

2. Cleaning

Use Cleaning when extracted data needs to be reformatted or standardized before it's saved. This gives you the data in the format your database conventions expect.

Common cases:

  • Make text uppercase, lowercase, or title case.
  • Remove unwanted characters: currency symbols (such as $), extra spaces, line breaks.
  • Standardize date formats (e.g. 01/02/2025 → 2025-01-02).
  • Normalize variations of the same thing: "Acme Corp", "ACME Corporation", "acme co." → "ACME CORP".
  • Strip prefixes or suffixes: "Invoice #INV-1234" → "INV-1234".
  • Use condition-based default values — set a field to a specific value when a condition is met (e.g., if a field has value X, replace it with Y).

Note: To reference a field in a prompt, press /.

Prompt examples

  • Convert date to YYYY-MM-DD format.
  • Convert the vendor_name to uppercase.
  • Remove the dollar sign and any commas from the total_amount.
  • Map "Net 30", "30 days", and "Due in 30" to "Net 30" in self.
  • If vendor_name contains "Amazon", set category to "E-commerce".
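
This guide doesn't show the code Docsumo generates from these prompts. Purely as an illustration, the Python sketch below shows the kind of logic the cleaning prompts above describe. The function names and the assumed MM/DD/YYYY input format are hypothetical and not part of Docsumo.

```python
from datetime import datetime

# Hypothetical helpers for illustration; Docsumo's generated code may differ.

def clean_date(value: str) -> str:
    """Convert a date from MM/DD/YYYY (assumed input format) to YYYY-MM-DD."""
    return datetime.strptime(value.strip(), "%m/%d/%Y").strftime("%Y-%m-%d")

def clean_amount(value: str) -> str:
    """Remove the dollar sign, commas, and surrounding whitespace."""
    return value.replace("$", "").replace(",", "").strip()

def clean_payment_terms(value: str) -> str:
    """Map known 30-day variants to "Net 30"; leave other values unchanged."""
    variants = {"net 30", "30 days", "due in 30"}
    return "Net 30" if value.strip().lower() in variants else value

print(clean_date("01/02/2025"))          # 2025-01-02
print(clean_amount("$1,234.50"))         # 1234.50
print(clean_payment_terms("Due in 30"))  # Net 30
```

Note that a value like 01/02/2025 is ambiguous between January 2 and February 1, so it can help to state the source date format in your prompt.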

When cleaning fails

If the cleaning code can't run because of an execution error, you'll see a message in this format:

Custom code execution failed.
Details: <message>

The Details: line tells you exactly what went wrong so you can update the prompt to fix it.




3. Calculations

Use Calculations when you need to compute a value rather than read it straight from the document. A calculation can work on the field's own extracted value, on other fields you reference, or on a combination of both.

Common cases:

  • Transform the field's own value — double it, halve it, round it, take a percentage of it.
  • Combine the field's value with a constant — multiply by a tax rate, add a fixed fee, convert units (e.g. inches × 2.54 for centimetres).
  • Combine two or more fields — multiply Quantity × Unit Price for a row subtotal, subtract Discount from Subtotal.
  • Aggregate a column of line items — sum, average, max, or minimum of Line Total, Quantity, etc.
  • Work across sections — reference a field from a different section in the calculation.
  • Date math — find the number of days between two dates (e.g. Invoice Date and Due Date).

Note: To reference a field in a prompt, press /.

Prompt examples

  • Multiply self value by 2.
  • Multiply "Subtotal" by 0.18 to get the tax amount.
  • Multiply "Quantity" by "Unit Price" for each line item.
  • Sum all "Line Total" values and return the result.
  • Calculate the average of all "Unit Price" values.
  • Calculate the number of days between "Invoice Date" and "Due Date".
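
To make the arithmetic behind these prompts concrete, here is a minimal Python sketch. The sample line items, the function names, and the choice of Decimal for money values are assumptions made for illustration; the code Docsumo generates may look different.

```python
from datetime import date
from decimal import Decimal

# Hypothetical sample data; the field names mirror the prompt examples.
line_items = [
    {"Quantity": Decimal("2"), "Unit Price": Decimal("10.00")},
    {"Quantity": Decimal("3"), "Unit Price": Decimal("4.50")},
]

def line_total(item):
    """Quantity x Unit Price for one line item."""
    return item["Quantity"] * item["Unit Price"]

def tax_amount(subtotal, rate=Decimal("0.18")):
    """Subtotal x tax rate, rounded to two decimal places."""
    return (subtotal * rate).quantize(Decimal("0.01"))

def days_between(invoice_date, due_date):
    """Number of days from invoice_date to due_date."""
    return (due_date - invoice_date).days

totals = [line_total(item) for item in line_items]
invoice_subtotal = sum(totals, Decimal("0"))

print(invoice_subtotal)                                  # 33.50
print(tax_amount(invoice_subtotal))                      # 6.03
print(days_between(date(2025, 1, 2), date(2025, 2, 1)))  # 30
```

Decimal is used here so that currency values do not pick up binary floating-point rounding errors.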

When calculation fails

If the calculation code can't run — for example, dividing by zero or hitting a missing value — you'll see a message in this format:

Custom code execution failed.
Details: <message>

The Details: line tells you exactly what went wrong so you can update the prompt to fix it.
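
As an illustration of the fix you might ask for, the Python sketch below guards a division against a zero or missing value. The function name and the choice to return None (rather than 0 or a blank) are assumptions; say in your prompt which fallback you want.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None if a value is missing or the divisor is zero."""
    if numerator is None or denominator is None:
        return None  # a value was not extracted
    if denominator == 0:
        return None  # avoid ZeroDivisionError
    return numerator / denominator

print(safe_divide(50, 200))   # 0.25
print(safe_divide(50, 0))     # None
print(safe_divide(None, 5))   # None
```

A prompt such as "Divide Discount by Subtotal; if Subtotal is 0 or blank, return blank" describes the same guard in plain English.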


4. Validation

Use it when you want to check that extracted, cleaned, or calculated data is correct before it leaves Docsumo.

Common cases:

  • Make sure a required field isn't blank.
  • Check a value follows a pattern — invoice number starts with INV-, account number is exactly 11 digits, email looks like an email.
  • Confirm a number is in a sensible range — total is positive, quantity isn't zero.
  • Cross-check between fields — Due Date is after Invoice Date, sum of line items matches Invoice Total.

Note: To reference a field in a prompt, press /.

Prompt examples

  • Validate that "Account Number" contains exactly 11 digits.
  • Check that "Invoice Date" is not later than "Due Date".
  • Ensure "Total Amount" is positive and not blank.
  • Validate that the sum of "Line Total" matches "Invoice Total".
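
The Python sketch below illustrates the kind of check these prompts describe, with each check returning a pass/fail flag together with a reason. The function names and the (is_valid, reason) return shape are assumptions made for illustration, not Docsumo's actual interface.

```python
import re
from decimal import Decimal

def validate_account_number(value):
    """Check that the value is exactly 11 digits; return (is_valid, reason)."""
    text = (value or "").strip()
    if not text:
        return False, "Account number is blank"
    if not re.fullmatch(r"[0-9]+", text):
        return False, "Account number contains non-digit characters"
    if len(text) != 11:
        return False, f"Account number has {len(text)} digits, expected 11"
    return True, ""

def validate_line_totals(line_totals, invoice_total):
    """Check that the line totals add up to the invoice total."""
    total = sum(line_totals, Decimal("0"))
    if total != invoice_total:
        return False, f"Sum of line totals is {total}, expected {invoice_total}"
    return True, ""

print(validate_account_number("1234567890"))
# (False, 'Account number has 10 digits, expected 11')
print(validate_line_totals([Decimal("20.00"), Decimal("13.50")], Decimal("33.50")))
# (True, '')
```

Returning a specific reason for each failure is what makes a failed validation actionable.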

When validation fails

A validation can fail for two reasons, and the message you see tells you which:

  • The validation rule returned false. The field will show why it failed — for example, "Account number has 10 digits, expected 11" — so you can fix the prompt to address those issues and generate the custom code again.

  • The validation code couldn't run (an execution error). You'll see a message in this format:

    Custom code execution failed.
    Details: <ExceptionType>: <message>

    The Details: line tells you exactly what went wrong so you can update the prompt to fix it.
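
To show how a Python error maps onto the <ExceptionType>: <message> shape, here is a minimal sketch. The run_block wrapper is hypothetical. Docsumo's internals are not described in this guide, so treat this only as a way to read the message.

```python
def run_block(func, *args):
    """Run a block function; on error, return a message in the documented format."""
    try:
        return func(*args)
    except Exception as exc:
        # type(exc).__name__ is the exception type; str(exc) is its message.
        return f"Custom code execution failed.\nDetails: {type(exc).__name__}: {exc}"

# A division by zero produces a Details line that starts with "ZeroDivisionError:".
print(run_block(lambda: 10 / 0))
```

The exception type (for example ZeroDivisionError, ValueError, or KeyError) is usually the quickest hint about what to change in the prompt.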



5. Testing and iterating

Once you've added your blocks, click Run Test to try them on a sample document. The result appears in line with each block, including any errors.

  • If the test passes — the extracted, cleaned, or calculated value is shown.
  • If the test fails — you'll see one of:
    • A validation reason telling you exactly what didn't match. Fix the prompt and re-run.
    • An execution error with the Python error in the Details: line. This can happen on Cleaning, Calculation, Validation, or Custom Code blocks — any block that runs code under the hood.
    • A referenced-field error — the field this block depends on failed. Fix that field first.

For Cleaning, Calculation, and Validation, you may also need to click Generate and Run to regenerate the underlying code after a prompt change.


Common problems and fixes

My validation always returns False. How do I see why? Validation prompts return a reason on failure. Ask for a clear failure reason in your prompt, and that reason will be shown whenever the validation fails.

My custom code says ZeroDivisionError. What now? A value you're dividing by is 0 or missing on some documents. Copy the error you see and ask the AI, in your prompt, to handle that case.

A field I reference shows an error, and my block won't run. That's expected: Docsumo skips blocks that depend on broken fields and surfaces the upstream error. Fix the referenced field first. Docsumo tells you which referenced fields failed so you can handle them correctly.
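
The skip-on-upstream-failure behavior can be pictured with the Python sketch below. It is a simplified model built on assumptions: the run_fields function, the field definitions, and the error wording are all hypothetical, and fields are assumed to be listed before the blocks that reference them.

```python
def run_fields(fields):
    """fields maps a name to (dependencies, function), evaluated in insertion order."""
    results, errors = {}, {}
    for name, (deps, func) in fields.items():
        failed = [d for d in deps if d in errors]
        if failed:
            # Skip this block and point at the upstream field instead.
            errors[name] = f"Referenced field failed: {', '.join(failed)}"
            continue
        try:
            results[name] = func(*(results[d] for d in deps))
        except Exception as exc:
            errors[name] = f"{type(exc).__name__}: {exc}"
    return results, errors

fields = {
    "Quantity": ([], lambda: int("two")),  # fails with a ValueError
    "Unit Price": ([], lambda: 4.5),
    "Line Total": (["Quantity", "Unit Price"], lambda q, p: q * p),
}
results, errors = run_fields(fields)
print(results)               # {'Unit Price': 4.5}
print(errors["Line Total"])  # Referenced field failed: Quantity
```

In this model, fixing "Quantity" is enough for "Line Total" to run again, which is why the referenced field should be fixed first.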

I changed the prompt but the result hasn't changed. Click Run Test again. Validation and Custom Code blocks may also need you to click Generate and Run to regenerate the underlying code.

Best practices

  • Be specific in prompts. "Extract the invoice number that starts with INV-" beats "Extract the invoice number".
  • Name sections and tables descriptively. Docsumo uses these names as context.
  • Validate the things that matter. A failing validation with a clear reason is far cheaper than a bad value reaching your downstream system.
  • Combine blocks freely — extract, clean, calculate, validate. Use only the blocks the field actually needs.
  • Test on a variety of documents before going to production. Edge cases (missing fields, unusual formats) often only show up on real samples.