Additional Parameters

Additional parameters for customising the Bank Statement document type.

These parameters can be set on the document type to enable or disable features, or to configure how documents are extracted.

All Additional Parameters:

The Bank Statement document type supports the following additional parameters:

| Parameter name | Description | Default value | Possible values |
|---|---|---|---|
| date_format | The date format used to parse dates in transaction tables and account info (start and end date). | mmddyy | ddmmyy, mmddyy, auto-ddmmyy, auto-mmddyy |
| limit | The number of pages to process. | 20 | max: 75 |
| region | The region-specific config to apply. | general | general or a country code (see Region below) |
| strict_rerun | Include missing signs, decimals, and daily balance checks in the re-run validation step. | false | true, false |
| use_digital | Use the digital data if it is present in the document; otherwise process with the internal OCR only. | false | true, false |
| strict_stp | Include the date tally check when determining STP for a document. | false | true, false |
| enable_category | Enable transaction data enrichment, which populates categories, subcategories, and merchants for each transaction in the tables. | false | true, false |
| world_bs | Enable support for bank statements in languages other than English. Transaction tables and account information are extracted in their native languages. | false | true, false |
| llm_kv | Enable an LLM for picking KV data. | false (true when world_bs is set to true) | true, false |
| detect_fraud | Enable bank statement-level fraud checks on documents. | false | true, false |
| model_path | The path of the KV model to be used. | | Path to the model in the bucket |
| enable_new_year_correction_logic | Apply advanced year correction logic to transaction table dates based on the start and end dates. | false | true, false |
| auto_date_parsing_threshold | The number of dates in the opposite of the required date format above which auto-flipping and correction of the dates is triggered. | Set dynamically as 5% of the dates extracted in the document | Any value from 0 to the number of dates in the document |
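
As a rough illustration, the parameters listed above could be collected and sanity-checked before use. This is only a sketch: the `DEFAULTS` dictionary and the `build_params` helper are hypothetical names, not part of the product, and it assumes that booleans and numbers are passed as native values. How the parameters are actually submitted is not covered here.

```python
# Hypothetical helper: mirrors the documented defaults, not an official API.
DEFAULTS = {
    "date_format": "mmddyy",
    "limit": 20,
    "region": "general",
    "strict_rerun": False,
    "use_digital": False,
    "strict_stp": False,
    "enable_category": False,
    "world_bs": False,
    "llm_kv": False,
    "detect_fraud": False,
    "enable_new_year_correction_logic": False,
}
# Parameters without a fixed default in the list above.
OPTIONAL = {"model_path", "auto_date_parsing_threshold"}
DATE_FORMATS = {"ddmmyy", "mmddyy", "auto-ddmmyy", "auto-mmddyy"}
MAX_LIMIT = 75


def build_params(**overrides):
    """Merge overrides into the documented defaults and run basic checks."""
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    if params["date_format"] not in DATE_FORMATS:
        raise ValueError(f"unsupported date_format: {params['date_format']}")
    # The lower bound of 1 is an assumption; only the maximum is documented.
    if not 1 <= params["limit"] <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    # Documented behaviour: llm_kv is enabled when world_bs is true.
    if params["world_bs"]:
        params["llm_kv"] = True
    return params
```

For example, `build_params(world_bs=True, region="fr")` returns the defaults with `world_bs`, `llm_kv`, and `region` changed, while `build_params(limit=76)` raises a `ValueError`.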


Description of Additional Parameters

  1. Date Format
    • The format in which dates are parsed and displayed in the transaction tables and in account info/key-value pairs (start and end dates).
    • The default value is mmddyy; the other fixed option is ddmmyy.
    • You can also set auto-ddmmyy or auto-mmddyy. In these modes the extractor corrects the dates when the document uses the opposite format (for example, the document is in ddmm format but the required format is mmdd), based on the number of such dates in the document.
  2. Limit
    • The number of pages that should be processed.
  3. Region
    • The region-specific config for header mapping of the columns in transaction tables
    • The default value is general. Other possible values are country codes such as:
      • us for United States of America (general)
      • in for India
      • it for Italy
      • nl for Netherlands
      • es for Spain
      • de for Germany
      • at for Austria
      • dk for Denmark
      • fr for France
      • gr for Greece
      • no for Norway
      • se for Sweden
      • ch for Switzerland
      • ie for Ireland
      • ph for Philippines
      • id for Indonesia
  4. Strict Re-Run
    • During re-run validation, enable the checks for:
      • Missing Signs
      • Missing Decimals
      • Daily Balance Check
    • This makes re-run validation more robust against mistakes made while editing fields.
  5. Use Digital
    • Pick the digital data in the document for processing.
    • Only if digital data is not present is the data from the internal OCR used.
    • The digital data is given priority because OCR extraction might not be accurate enough on low-quality images or documents. OCR can miss signs, decimals, or characters that matter for the field being extracted.
  6. Strict STP
    • Take the date tally checks into consideration as well when determining STP for the document.
    • By default, only documents that have passed tallying and have high confidence in all key-value (KV) fields/transaction table rows are set to the Processed status, which is Straight Through Processing (STP). When strict_stp is enabled, the document also undergoes date tally checks.
  7. Enable Category
    • Enable population of the category, subcategory, and merchant for each transaction row in the table.
  8. World BS
    • Enable parsing and populating data/transaction tables in languages other than English.
    • If the document is in a different language, such as French or German, the headers for transaction tables and the key fields for key-value (KV) data need to be determined based on the language detected in the document. Therefore, if the world_bs flag is set, the tables and KV data are mapped according to the language extracted from the document.
  9. LLM KV
    • Populate the KV data with the help of an LLM.
    • This flag is enabled internally when world_bs is set to true.
  10. Detect Fraud
    • Enable checks on the document for potential fraud or anomaly.
  11. Model Path
    • The path of the model that will be used for getting the key-value (KV) fields.
  12. Enable Year Correction
    • This parameter, when set to true, enables specific year correction logic for transaction table dates based on start and end dates. This logic adjusts dates if their years are incorrect or missing.
  13. Enable Auto Date Parsing
    • When date_format is auto-ddmmyy or auto-mmddyy, use the auto_date_parsing_threshold parameter to control when dates are corrected automatically. The correction is applied when the number of dates in the opposite format to the one set exceeds the specified threshold. Example threshold values: 10, 31, 365.
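
The interaction between the auto date formats and auto_date_parsing_threshold can be sketched as follows. This is a minimal illustration and not the extractor's actual implementation. It makes several assumptions: dates arrive as strings like `13/01/24`, a date counts as "opposite" when the field expected to hold the month is greater than 12 while the other field is not, two-digit years belong to the 2000s, and the `parse_dates` function name is hypothetical.

```python
from datetime import date


def parse_dates(raw_dates, date_format="auto-mmddyy", threshold=None):
    """Parse 'xx/xx/yy' strings, flipping day/month order in auto modes.

    Returns a list of date objects; entries that are invalid under the
    final day/month order are returned as None.
    """
    auto = date_format.startswith("auto-")
    month_first = date_format.endswith("mmddyy")
    triples = [tuple(int(part) for part in raw.split("/")) for raw in raw_dates]

    # Documented default: 5% of the dates extracted in the document.
    if threshold is None:
        threshold = 0.05 * len(triples)

    def is_opposite(first, second):
        # Assumption: a date is "opposite" when it is only valid in the other order.
        month, day = (first, second) if month_first else (second, first)
        return month > 12 and day <= 12

    opposite = sum(1 for first, second, _ in triples if is_opposite(first, second))

    # Flip the order only in auto modes and only when the count exceeds the threshold.
    if auto and opposite > threshold:
        month_first = not month_first

    parsed = []
    for first, second, yy in triples:
        month, day = (first, second) if month_first else (second, first)
        try:
            parsed.append(date(2000 + yy, month, day))  # assumes years 2000-2099
        except ValueError:
            parsed.append(None)
    return parsed
```

With the default threshold, `parse_dates(["13/01/24", "14/01/24", "02/01/24"])` treats the whole document as day-first, because two of the three dates cannot be month-first. With `threshold=5` the same input is left month-first, and the dates that are invalid in that order come back as `None`.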