Fraud Detection
Detecting anomalies and fraud on bank statement documents
What is fraud detection?
Fraud detection on bank statements involves identifying and flagging documents that may contain fake or manipulated data. This process helps ensure that only genuine documents are processed, thereby preventing potential fraud and maintaining the accuracy of financial records.
A document might have been tampered with for malicious purposes. Such tampering can be detected, which allows authorities to differentiate between genuine bank statements and fraudulent ones.
Fraud checks on bank statements
- To enable the fraud check to be triggered when the document is uploaded, head to Document Type > Configure > Extraction > Additional Configuration > Fraud Analysis.
- Toggle the Fraud Analysis option in the Additional Configuration section of the document type.

NOTE: For older and existing document types, you need to reset the document type to make the fraud table field visible in the Fields.
The Fraud Check field is a table grid made up of three tables:
- Summary of Fraud and Authenticity Score
- All the checks that have been applied to the document
- All the checks that failed (that is, checks that might indicate probable fraud), in depth, with description and position

Fraud Score Table
The fraud score table gives an overview of the authenticity or risk level of the bank statement document.
The table looks as follows:
| Name of Check | Description | Result | Score |
|---|---|---|---|
| Fraud Score | The likelihood of the document being fraudulent | The numerical value representing the overall fraud score | |
| Base Sum | Sum of failed check base scores | The base sum of the individual fraud checks | |
| Combo Multiplier | The combination multiplier applied | The level of multiplier is applied based on the severity of the failed checks | |
| Authenticity Score | The likelihood of the document being genuine | The numerical range from 0 to 100 represents the authenticity of the document. | |
| Authenticity Level | The label associated with the authenticity level | HIGH/MEDIUM/LOW/VERY LOW | The numerical range from 0 to 100 represents the authenticity of the document. |
| Risk Level | Derived from the authenticity score, depicting the likelihood of fraud in the document | NONE/LOW/MEDIUM/HIGH | The numerical value representing the overall fraud score |
The breakdown of the fraud score, authenticity, and the risk level score is given below:
| Fraud Score | Authenticity Score | Authenticity Level | Risk Level |
|---|---|---|---|
| 0-8 | 81-100 | HIGH | NONE |
| 8-16 | 61-80 | MEDIUM | LOW |
| 16-26 | 31-60 | LOW | MEDIUM |
| 28+ | 0-30 | VERY LOW | HIGH |
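The mapping from authenticity score to authenticity level and risk level in the table above can be sketched in a few lines. This is a minimal illustration, not the product's implementation: the function name `classify_authenticity` is hypothetical, and the handling of fractional scores that fall between two bands (for example 80.5) is an assumption, since the table lists whole-number ranges only. The sketch keys off the authenticity score because its bands do not share boundary values.

```python
def classify_authenticity(score: float) -> tuple[str, str]:
    """Return (authenticity_level, risk_level) for an authenticity score in 0-100.

    Bands follow the breakdown table: 81-100, 61-80, 31-60, 0-30.
    Assumption: a fractional score is placed in the band just above the
    nearest lower whole-number boundary (80.5 counts as HIGH).
    """
    if not 0 <= score <= 100:
        raise ValueError("authenticity score must be between 0 and 100")
    if score > 80:
        return ("HIGH", "NONE")
    if score > 60:
        return ("MEDIUM", "LOW")
    if score > 30:
        return ("LOW", "MEDIUM")
    return ("VERY LOW", "HIGH")


level, risk = classify_authenticity(72)
print(level, risk)
```

A caller could use the returned risk level to decide whether a statement goes to manual review, for instance by routing anything other than NONE to a reviewer.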
Fraud Table
The second grid lists all the fraud checks applied to the bank statement document. It shows the broader category of the fraud check, the name of the check, and a result column, where Passed indicates no signal of fraud and Failed indicates a probable signal of a specific fraud.
Grid 2: All the checks applied to the document
| Fraud Check | Name of Check | Description | Result |
|---|---|---|---|
| exif_analysis | is_date_modified | | Passed |
| exif_analysis | is_author_blacklist | | Passed |
| exif_analysis | is_creator_blacklist | | Passed |
| exif_analysis | is_producer_blacklist | | Passed |
| exif_analysis | is_checksum_invalid | | Passed |
| transaction_analysis | high_amount_check | High Amount transactions detected | Failed |
| transaction_analysis | frequent_transactions | | Passed |
| transaction_analysis | repeated_transactions | | Passed |
| transaction_analysis | circular_transactions | | Passed |
| transaction_analysis | daily_transactions_above_threshold | | Passed |
Failed Fraud Checks Table
The third grid lists all the failed signals and every instance of each failed fraud check.
- One check can have one or more instances. For example, the statement could contain many high amounts that deviate sharply from the normal transactions, or spikes in transaction volume on certain days within the statement and the extracted transactions.
Grid 3: All the checks that failed
| Fraud Check | Name of Check | Description | Result |
|---|---|---|---|
| transaction_analysis | high_amount_check | High Amount transactions detected | Failed |
| transaction_analysis | high_amount_check | High Amount transactions detected | Failed |
| transaction_analysis | high_amount_check | High Amount transactions detected | Failed |
The third table will display individual checks and indicate the exact position where tampering or anomalies may have occurred.
Generic checks such as tally_checks, and document metadata checks such as EXIF analysis, do not have a position.
Transaction-level checks are conducted on the extracted transaction tables in the document. These checks include verifying the account information, such as opening and closing balances, and the start and end dates.
The amount layout alignment check also uses the OCR data from the extracted transaction tables.
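Because one check can appear several times in the third grid, it can be handy to count instances per check. The following is a minimal sketch that assumes the grid rows are available as dictionaries; the key names (`fraud_check`, `name`, `description`, `result`) and the function `count_failed_instances` are hypothetical and are not taken from the API response format.

```python
from collections import Counter

# Hypothetical in-memory form of the Grid 3 rows; key names are assumptions.
failed_rows = [
    {"fraud_check": "transaction_analysis", "name": "high_amount_check",
     "description": "High Amount transactions detected", "result": "Failed"},
    {"fraud_check": "transaction_analysis", "name": "high_amount_check",
     "description": "High Amount transactions detected", "result": "Failed"},
    {"fraud_check": "transaction_analysis", "name": "high_amount_check",
     "description": "High Amount transactions detected", "result": "Failed"},
]


def count_failed_instances(rows):
    """Count failed instances per (category, check name) pair."""
    return Counter(
        (row["fraud_check"], row["name"])
        for row in rows
        if row["result"] == "Failed"
    )


instance_counts = count_failed_instances(failed_rows)
for (category, name), count in instance_counts.items():
    print(f"{category}/{name}: {count} instance(s)")
```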
Checks
Two categories of checks are currently implemented in the Bank statement API, comprising 11 individual fraud checks in total:
- EXIF Analysis: This analysis examines the document's metadata for signs of tampering or suspicious values. It checks metadata fields such as author, producer, and creator against a known set of whitelisted and blacklisted terms and keywords. If any match is found, the check is marked as fraud. Additionally, it verifies PDF trailer tampering and validates the checksum from the trailer.
- Transaction Analysis: This analysis identifies anomalies or patterns in transaction table data extracted from the bank statement document. It includes basic checks on the tally, daily balance, repeated or overly frequent transactions, anomalies in transactions (such as inconsistent trailing digits or high amounts), circular or round trip transactions, and invalid dates.
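To make the metadata checks more concrete, here is a minimal sketch of how blacklist matching and a created/modified date comparison could work. It is an illustration under stated assumptions, not the product's implementation: the metadata dictionary layout, the helper names, and the blacklist entries are all hypothetical placeholders, and the real term lists are not given in this document. In this sketch, `True` means the check failed.

```python
from datetime import datetime

# Placeholder terms only; the real blacklists are not published here.
CREATOR_BLACKLIST = {"example-pdf-editor"}
PRODUCER_BLACKLIST = {"example-pdf-editor"}


def contains_blacklisted_term(value, blacklist):
    """Case-insensitive substring match of a metadata value against a blacklist."""
    text = (value or "").lower()
    return any(term in text for term in blacklist)


def run_metadata_checks(metadata):
    """Return a dict of check name -> True when the check fails (fraud signal)."""
    created = metadata.get("created")
    modified = metadata.get("modified")
    return {
        # Fails when both dates are present and fall on different days.
        "is_date_modified": (
            created is not None
            and modified is not None
            and created.date() != modified.date()
        ),
        "is_creator_blacklist": contains_blacklisted_term(
            metadata.get("creator"), CREATOR_BLACKLIST
        ),
        "is_producer_blacklist": contains_blacklisted_term(
            metadata.get("producer"), PRODUCER_BLACKLIST
        ),
    }


sample = {
    "creator": "Example-PDF-Editor 2.1",
    "producer": "Bank Statement Generator",
    "created": datetime(2024, 1, 5, 9, 0),
    "modified": datetime(2024, 1, 9, 17, 30),
}
flags = run_metadata_checks(sample)
print(flags)
```

Substring matching is a deliberately simple choice for the sketch; a production check would likely normalize the values more carefully before comparing them.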
Fraud checks might produce false positives
A document might not be fraudulent and still fail some fraud checks. This can happen for a few justifiable reasons:
- The image quality of the document is low, so the extraction and analysis might not have been accurate enough.
- The structure of the document might cause the fraud checks to produce a false positive when analyzing the layout or data.
| No. | Check | Name | Description |
|---|---|---|---|
| 1 | EXIF Analysis | is_date_modified | The created and modified dates in the document metadata are different |
| 2 | EXIF Analysis | is_author_blacklist | The author field in the document metadata is a suspicious name when compared with the whitelisted source of bank names |
| 3 | EXIF Analysis | is_creator_blacklist | The creator field in the document metadata is a suspicious tool from the blacklisted software used to create documents |
| 4 | EXIF Analysis | is_producer_blacklist | The producer field in the document metadata is a suspicious tool from the blacklisted software used to create documents |
| 5 | EXIF Analysis | is_trailer_checksum_invalid | The checksum in the document metadata has an invalid hash |
| 6 | Transaction Analysis | high_amount_check | There is a transaction whose amount is much higher than the rest of the transactions |
| 7 | Transaction Analysis | frequent_transactions | There are many frequent transactions with the same description |
| 8 | Transaction Analysis | repeated_transactions | There are many transactions with the same description and the same amount |
| 9 | Transaction Analysis | circular_transactions | There are transactions with a similar description and the same amount as both debit and credit |
| 10 | Transaction Analysis | daily_transactions_above_threshold | A day in the statement has too many transactions compared to the rest of the days |
| 11 | Transaction Analysis | date_check | The date is either invalid or incorrect in the context of the statement: it is out of range of the start and end dates, or it is an invalid date or format (31st Feb, etc.) |
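A few of the transaction checks in the table can be sketched as follows. This is a simplified illustration under stated assumptions, not the product's implementation: the `Transaction` type, the function signatures, the `%Y-%m-%d` date format, and the `min_repeats` threshold are hypothetical, and exact description matching stands in for the "similar description" comparison described above.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import Optional


@dataclass
class Transaction:
    raw_date: str                      # date text as extracted, e.g. "2024-02-31"
    description: str
    debit: Decimal = Decimal("0")
    credit: Decimal = Decimal("0")


def parse_date(raw: str) -> Optional[date]:
    """Return a date, or None when the text is not a valid calendar date."""
    try:
        return datetime.strptime(raw, "%Y-%m-%d").date()
    except ValueError:
        return None


def date_check(txns, start: date, end: date):
    """Indexes of transactions whose date is invalid or outside [start, end]."""
    flagged = []
    for i, txn in enumerate(txns):
        parsed = parse_date(txn.raw_date)
        if parsed is None or not (start <= parsed <= end):
            flagged.append(i)
    return flagged


def repeated_transactions(txns, min_repeats: int = 3):
    """(description, debit, credit) keys seen at least min_repeats times."""
    counts = Counter((t.description, t.debit, t.credit) for t in txns)
    return [key for key, n in counts.items() if n >= min_repeats]


def circular_transactions(txns):
    """Index pairs (i, j) where the debit in i returns as an equal credit in j."""
    pairs = []
    for i, out in enumerate(txns):
        if out.debit <= 0:
            continue
        for j, back in enumerate(txns):
            if (i != j and back.credit == out.debit
                    and back.description == out.description):
                pairs.append((i, j))
    return pairs


txns = [
    Transaction("2024-02-01", "TRANSFER ACME", debit=Decimal("500.00")),
    Transaction("2024-02-03", "TRANSFER ACME", credit=Decimal("500.00")),
    Transaction("2024-02-31", "ATM WITHDRAWAL", debit=Decimal("40.00")),
    Transaction("2024-03-15", "CARD PAYMENT", debit=Decimal("12.50")),
]
bad_dates = date_check(txns, date(2024, 2, 1), date(2024, 2, 29))
round_trips = circular_transactions(txns)
repeats = repeated_transactions(txns)
print(bad_dates, round_trips, repeats)
```

In the sample, the third row fails because 31 February is not a real date and the fourth because it falls after the statement end date. Amounts use `Decimal` so that equal values compare exactly.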
If the fraud module returns an error, the fraud check table will contain a single row with the description marked as "Error during fraud analysis", or as "Fraud check timed out." if the error is a timeout while processing the data.
