Additional Parameters

The additional parameters to customise the bank statement document type

These parameters can be set on the document type to enable or disable features, or to configure how documents are extracted.

All Additional Parameters:

The Bank Statement document type supports the following additional parameters:

Parameter name | Description | Default value | Possible values
--- | --- | --- | ---
date_format | Set the date format used to parse dates in transaction tables or account info (start and end date). | mmddyy | ddmmyy, mmddyy, auto-ddmmyy, auto-mmddyy
limit | The number of pages to process. | 20 | max: 75
region | The config support for specific regions. | general |
strict_rerun | Include missing signs, decimals, and daily balance checks in the re-run validation step. | false | true, false
use_digital | Pick the digital data if present in the document; only if it is not present, process with the internal OCR. | false | true, false
strict_stp | Include the date tally check when determining STP for a document. | false | true, false
enable_category | Enable transaction data enrichment. It populates categories, subcategories, and merchants for each transaction in the tables. | false | true, false
world_bs | Support for World Bank statements enables the processing of documents in languages other than English. This feature can extract transaction tables and account information in their native languages. | false | true, false
llm_kv | Enable LLM for picking KV data. | false (true when world_bs is set to true) | true, false
detect_fraud | Enable bank statement-level fraud checks on documents. | false | true, false
model_path | The path of the KV model to be used. | | Path to the model in the bucket
enable_new_year_correction_logic | Determines whether to apply advanced year correction logic to transaction table dates based on start and end dates. | false | true, false
auto_date_parsing_threshold | The maximum number of dates that are opposite to the required date format to trigger the auto-flipping and correction of the dates. | Set dynamically as 5% of the dates extracted in the document | Any value from 0 to the number of dates in the document
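As a rough illustration, the parameters in the table could be collected in a plain Python dictionary and sanity-checked before use. This is only a sketch: how the parameters are actually submitted is not described here, and the helper name `validate_params`, the dictionary layout, the use of Python booleans for true/false, and the lower bound of 1 page are all assumptions. Only the parameter names, defaults, and allowed values come from the table.

```python
# Hypothetical helper (not part of the product): checks a parameter
# dictionary against the values listed in the table above.
ALLOWED_DATE_FORMATS = {"ddmmyy", "mmddyy", "auto-ddmmyy", "auto-mmddyy"}
MAX_PAGE_LIMIT = 75  # "max: 75" from the table
BOOLEAN_PARAMS = [
    "strict_rerun", "use_digital", "strict_stp", "enable_category",
    "world_bs", "llm_kv", "detect_fraud", "enable_new_year_correction_logic",
]


def validate_params(params):
    """Return a list of problems found in params (an empty list means OK)."""
    problems = []

    # date_format defaults to mmddyy when it is not supplied.
    date_format = params.get("date_format", "mmddyy")
    if date_format not in ALLOWED_DATE_FORMATS:
        problems.append(f"date_format: unsupported value {date_format!r}")

    # limit defaults to 20 pages; the lower bound of 1 is an assumption.
    limit = params.get("limit", 20)
    is_int = isinstance(limit, int) and not isinstance(limit, bool)
    if not is_int or not 1 <= limit <= MAX_PAGE_LIMIT:
        problems.append(f"limit: expected an integer from 1 to {MAX_PAGE_LIMIT}")

    # Every flag in the table accepts only true or false.
    for name in BOOLEAN_PARAMS:
        if name in params and not isinstance(params[name], bool):
            problems.append(f"{name}: expected true or false")

    return problems


# Example: a statement in a non-English language, processed up to 50 pages.
example = {"date_format": "auto-ddmmyy", "limit": 50, "world_bs": True}
print(validate_params(example))
```

Checking values up front like this keeps a typo in a flag name's value or an over-large page limit from reaching the extraction step.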

Description of Additional Parameters

  1. Date Format
    • The format in which dates are parsed and displayed in the transaction tables and in the account info/key-value pairs (start and end dates).
    • The default value is mmddyy; the other option is ddmmyy.
    • You can also set auto-ddmmyy or auto-mmddyy. In that case the extractor corrects the dates when the document uses the opposite format (for example, the document is in ddmm format and the required format is mmdd), based on the number of such dates in the document.
  2. Limit
    • The number of pages that should be processed.
  3. Region
    • The region-specific config for mapping the column headers in transaction tables.
    • The default value is general. Other options are country codes such as:
      • us for United States of America (general)
      • in for India
      • it for Italy
      • nl for Netherlands
      • es for Spain
      • de for Germany
      • at for Austria
      • dk for Denmark
      • fr for France
      • gr for Greece
      • no for Norway
      • se for Sweden
      • ch for Switzerland
      • ie for Ireland
      • ph for Philippines
      • id for Indonesia
  4. Strict Re-Run
    • During re-run validation, enable the checks for:
      • Missing Signs
      • Missing Decimal
      • Daily Balance Check
    • This makes re-run validation more robust against mistakes made while editing fields.
  5. Use Digital
    • Pick the digital data in the document for processing.
    • Only if digital data is not present, use the data from the internal OCR.
    • The digital data is given priority because OCR extraction might not be accurate enough on low-quality images or documents. OCR can miss signs, decimals, and individual characters, which may be important for the field being extracted.
  6. Strict STP
    • Take the date tally checks into consideration as well when calculating STP for the document.
    • By default, only documents that have passed tallying and have high confidence in all key-value (KV) fields and transaction table rows are set to the Processed status, which is Straight Through Processing (STP). When strict_stp is set, the document also undergoes date tallying checks.
  7. Enable Category
    • Enable populating the category, subcategory, and merchant for each transaction row in the table.
  8. World BS
    • Enable parsing and populating data/transaction tables in languages other than English.
    • If the document is in a different language, such as French or German, the headers for transaction tables and the key fields for key-value (KV) data need to be determined based on the language detected in the document. Therefore, if the world_bs flag is set, the tables and KV data are mapped according to the language extracted from the document.
  9. LLM KV
    • Enable the KV data to be populated with the help of an LLM.
    • The flag is enabled internally when world_bs is set to true.
  10. Detect Fraud
    • Enable checks on the document for potential fraud or anomaly.
  11. Model Path
    • The path of the model that will be used for getting the key-value (KV) fields.
  12. Enable Year Correction
    • This parameter, when set to true, enables specific year correction logic for transaction table dates based on start and end dates. This logic adjusts dates if their years are incorrect or missing.
  13. Enable Auto Date Parsing
    • When date_format is auto-ddmmyy or auto-mmddyy, you can use the auto_date_parsing_threshold parameter to control automatic correction of dates. The correction is applied when the number of dates in the opposite format to the one set exceeds the specified threshold. Example values are 10, 31, and 365.
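The threshold rule for auto date parsing can be sketched as follows. This is an illustrative approximation, not the extractor's actual logic: the function names, the representation of each date as a pair of its first two numeric components, and the way an "opposite format" date is recognised (a component greater than 12 cannot be a month) are all assumptions. Only the comparison against the threshold and the default of 5% of the extracted dates come from the descriptions above.

```python
# Hypothetical sketch of the auto date parsing decision.
def count_opposite_dates(dates, required="mmddyy"):
    """Count dates that can only be read in the opposite day/month order.

    dates is a list of (first, second) integer pairs: the first two
    numeric components of each date as printed in the document.
    """
    count = 0
    for first, second in dates:
        if required == "mmddyy":
            # Month expected first: a first component above 12 must be a day.
            if first > 12 and second <= 12:
                count += 1
        else:
            # Day expected first: a second component above 12 must be a day.
            if second > 12 and first <= 12:
                count += 1
    return count


def should_flip(dates, required="mmddyy", threshold=None):
    """Return True when the opposite-format count exceeds the threshold."""
    if threshold is None:
        # Default from the table: 5% of the dates extracted in the document.
        threshold = 0.05 * len(dates)
    return count_opposite_dates(dates, required) > threshold


# A document printed as dd/mm while mmddyy is required.
printed = [(25, 1), (26, 1), (3, 2), (4, 2)]
print(should_flip(printed))                # default threshold
print(should_flip(printed, threshold=10))  # explicit, higher threshold
```

A higher threshold makes the correction more conservative, since more unambiguous opposite-format dates are needed before all dates are flipped.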