Fraud Detection

Detecting anomalies and fraud on bank statement documents

What is fraud detection?

Fraud detection on bank statements involves identifying and flagging documents that may contain fake or manipulated data. This process helps ensure that only genuine documents are processed, thereby preventing potential fraud and maintaining the accuracy of financial records.

A document might have been tampered with for malicious purposes. Detecting such tampering lets authorities differentiate between genuine bank statements and fraudulent ones.

Fraud checks on bank statements

  • To trigger the fraud check when a document is uploaded, go to Document Type > Configure > Extraction > Additional Configuration > Fraud Analysis.

  • Toggle the Fraud Analysis option in the Additional Configuration section of the document type.

    NOTE: For older, existing document types, you need to reset the document type for the fraud table field to become visible under Fields.

The Fraud Check field is a table grid made up of three tables:

  1. Summary of Fraud and Authenticity Score
  2. All the checks that have been applied to the document
  3. All the checks that failed (i.e., those that might indicate probable fraud), detailed with description and position

Fraud Score Table

The fraud score table gives an overview of the authenticity or risk level of the bank statement document.

The table looks as follows:

| Name of Check | Description | Result | Score |
| Fraud Score | The likelihood of the document being fraudulent | | The numerical value representing the overall fraud score |
| Base Sum | Sum of failed check base scores | | The base sum of the individual fraud checks |
| Combo Multiplier | The combination multiplier applied | | The level of multiplier applied, based on the severity of the failed checks |
| Authenticity Score | The likelihood of the document being genuine | | The numerical range from 0 to 100 represents the authenticity of the document |
| Authenticity Level | The label associated with the authenticity level | HIGH/MEDIUM/LOW/VERY LOW | The numerical range from 0 to 100 represents the authenticity of the document |
| Risk Level | Derived from the authenticity score, depicting the likelihood of fraud in the document | NONE/LOW/MEDIUM/HIGH | The numerical value representing the overall fraud score |

The fraud score, authenticity score, authenticity level, and risk level correspond to each other as follows:

| Fraud Score | Authenticity Score | Authenticity Level | Risk Level |
| 0-8 | 81-100 | HIGH | NONE |
| 8-16 | 61-80 | MEDIUM | LOW |
| 16-26 | 31-60 | LOW | MEDIUM |
| 28+ | 0-30 | VERY LOW | HIGH |
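
The authenticity bands in the table above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name classify_authenticity is hypothetical, and it assumes whole-number scores, so a fractional value such as 80.5 would fall into the lower band.

```python
def classify_authenticity(score: float) -> tuple[str, str]:
    """Return (authenticity_level, risk_level) for a 0-100 authenticity score."""
    if not 0 <= score <= 100:
        raise ValueError("authenticity score must be between 0 and 100")
    # Bands follow the breakdown table: 81-100, 61-80, 31-60, 0-30.
    if score >= 81:
        return "HIGH", "NONE"
    if score >= 61:
        return "MEDIUM", "LOW"
    if score >= 31:
        return "LOW", "MEDIUM"
    return "VERY LOW", "HIGH"


level, risk = classify_authenticity(72)
print(level, risk)
```

A lower authenticity score always maps to a higher risk level, so the two labels can be returned together from one lookup.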

Fraud Table

The second grid lists all the fraud checks applied to the bank statement document. It shows the broader category of the fraud check, the name of the check, and a result column, where Passed indicates no signal of fraud and Failed indicates a probable signal of a specific fraud.

Grid 2: All the checks applied to the document

| Fraud Check | Name of Check | Description | Result |
| exif_analysis | is_date_modified | | Passed |
| exif_analysis | is_author_blacklist | | Passed |
| exif_analysis | is_creator_blacklist | | Passed |
| exif_analysis | is_producer_blacklist | | Passed |
| exif_analysis | is_checksum_invalid | | Passed |
| transaction_analysis | high_amount_check | High Amount transactions detected | Failed |
| transaction_analysis | frequent_transactions | | Passed |
| transaction_analysis | repeated_transactions | | Passed |
| transaction_analysis | circular_transactions | | Passed |
| transaction_analysis | daily_transactions_above_threshold | | Passed |

Failed Fraud Checks Table

The third grid lists every failed signal and each instance of a failed fraud check.

  • One check can have one or more instances. For example, the statement could contain many high amounts that deviate sharply from the normal transactions, or spikes in transaction volume on certain days of the statement.
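
As an illustration of how one check can yield several instances, the sketch below flags every amount that exceeds a multiple of the median transaction amount. The median-based rule, the multiplier of 10, and the function name high_amount_instances are assumptions made for this example; this document does not state how the product defines a high amount.

```python
from statistics import median


def high_amount_instances(amounts, multiplier=10.0):
    """Return (row_index, amount) pairs that exceed multiplier x the median amount."""
    if not amounts:
        return []
    typical = median(amounts)  # hypothetical baseline for a "normal" transaction
    return [(i, a) for i, a in enumerate(amounts) if a > multiplier * typical]


amounts = [120.0, 80.0, 95.0, 5000.0, 110.0, 7500.0]
instances = high_amount_instances(amounts)
print(instances)
```

Each returned pair would correspond to one row in the failed checks table, with the row index standing in for the position of the transaction.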

Grid 3: All the checks that failed

| Fraud Check | Name of Check | Description | Result |
| transaction_analysis | high_amount_check | High Amount transactions detected | Failed |
| transaction_analysis | high_amount_check | High Amount transactions detected | Failed |
| transaction_analysis | high_amount_check | High Amount transactions detected | Failed |
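
Because the same check can appear in several rows, it can be useful to count failed instances per check when reading this grid. The sketch below assumes each row is available as a dictionary; the key names (fraud_check, name, description, result) are hypothetical and only mirror the column headers.

```python
from collections import Counter


def count_failed_instances(rows):
    """Count failed rows per (fraud_check, name) pair."""
    return Counter(
        (row["fraud_check"], row["name"])
        for row in rows
        if row["result"] == "Failed"
    )


# Sample rows mirroring Grid 3 above.
failed_grid = [
    {
        "fraud_check": "transaction_analysis",
        "name": "high_amount_check",
        "description": "High Amount transactions detected",
        "result": "Failed",
    }
    for _ in range(3)
]
counts = count_failed_instances(failed_grid)
print(counts)
```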

The third table will display individual checks and indicate the exact position where tampering or anomalies may have occurred.

Generic checks such as tally_checks and document metadata checks such as EXIF analysis have no position.

Transaction-level checks are conducted on the extracted transaction tables in the document. These checks include verifying the account information, such as opening and closing balances, and the start and end dates.
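
One plausible way to verify the opening and closing balances is a tally: start from the opening balance, apply every extracted credit and debit, and compare the result with the closing balance. The sketch below is an assumption about how such a check could work, not a description of the product's implementation; the function name and the credit/debit keys are hypothetical.

```python
from decimal import Decimal


def tally_check(opening, closing, transactions):
    """Return True when opening + credits - debits equals the closing balance."""
    balance = Decimal(opening)
    for txn in transactions:
        # Missing credit or debit values are treated as zero.
        balance += Decimal(txn.get("credit", "0")) - Decimal(txn.get("debit", "0"))
    return balance == Decimal(closing)


transactions = [{"credit": "500.00"}, {"debit": "120.50"}]
print(tally_check("1000.00", "1379.50", transactions))
```

Decimal is used instead of float so that currency amounts compare exactly.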

The amount layout alignment check also uses the OCR data from the extracted transaction tables.

Checks

Two categories of checks are currently implemented in the Bank statement API, comprising 11 individual fraud checks:

  1. EXIF Analysis: This analysis examines the document's metadata for signs of tampering or suspicious values. It checks metadata fields such as author, producer, and creator against a known set of whitelisted and blacklisted terms and keywords. If any match is found, the check is marked as fraud. Additionally, it verifies PDF trailer tampering and validates the checksum from the trailer.
  2. Transaction Analysis: This analysis identifies anomalies or patterns in transaction table data extracted from the bank statement document. It includes basic checks on the tally, daily balance, repeated or overly frequent transactions, anomalies in transactions (such as inconsistent trailing digits or high amounts), circular or round trip transactions, and invalid dates.
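
The metadata comparison described under EXIF Analysis can be sketched as follows. This is a minimal illustration that assumes the metadata has already been read into a dictionary; the blacklist terms are placeholders (the actual lists are not given in this document), the dictionary keys are hypothetical, and only three of the EXIF checks are shown.

```python
# Placeholder blacklist; the real terms are not published in this document.
BLACKLISTED_TOOLS = {"example pdf editor", "sample image tool"}


def exif_checks(metadata: dict) -> dict:
    """Return {check_name: "Passed" or "Failed"} for a metadata dictionary."""

    def is_blacklisted(field: str) -> bool:
        value = (metadata.get(field) or "").lower()
        return any(term in value for term in BLACKLISTED_TOOLS)

    created = metadata.get("created")
    modified = metadata.get("modified")
    flags = {
        # Flag only when both dates are present and differ.
        "is_date_modified": None not in (created, modified) and created != modified,
        "is_creator_blacklist": is_blacklisted("creator"),
        "is_producer_blacklist": is_blacklisted("producer"),
    }
    return {name: "Failed" if flagged else "Passed" for name, flagged in flags.items()}


sample = {
    "creator": "Example PDF Editor 2.1",
    "producer": "Bank Core Export",
    "created": "2024-01-05",
    "modified": "2024-01-05",
}
print(exif_checks(sample))
```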

🚧

There might be erroneous fraud detection

A document that is not actually fraudulent can still fail some fraud checks. This can happen for a few justifiable reasons:

  • The image quality of the document is low, so the extraction and analysis might not have been accurate enough.
  • The structure of the document might cause fraud checks to produce false positives when analyzing the layout or data.

The individual checks are listed below:

| No. | Check | Name | Description |
| 1 | EXIF Analysis | is_date_modified | The created and modified dates in the document metadata are different |
| 2 | EXIF Analysis | is_author_blacklist | The author field in the document metadata contains a suspicious name, judged against the whitelisted source of bank names |
| 3 | EXIF Analysis | is_creator_blacklist | The creator field in the document metadata names a suspicious tool from the blacklisted software used to create the document |
| 4 | EXIF Analysis | is_producer_blacklist | The producer field in the document metadata names a suspicious tool from the blacklisted software used to create the document |
| 5 | EXIF Analysis | is_trailer_checksum_invalid | The checksum present in the document metadata has an invalid hash |
| 6 | Transaction Analysis | high_amount_check | A transaction has an amount much higher than the rest of the transactions |
| 7 | Transaction Analysis | frequent_transactions | There are many frequent transactions with the same description |
| 8 | Transaction Analysis | repeated_transactions | There are many transactions with the same description and the same amount |
| 9 | Transaction Analysis | circular_transactions | There are transactions with a similar description and the same amount as both debit and credit |
| 10 | Transaction Analysis | daily_transactions_above_threshold | A day in the statement has too many transactions compared to the rest of the days |
| 11 | Transaction Analysis | date_check | The date is either invalid or incorrect in the context of the statement: out of range from the start and end dates, or an invalid date or format (31st Feb, etc.) |
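
The two failure conditions of date_check can be illustrated with the standard library. The sketch below is a hedged example, not the product's implementation: the function signature, the day/month/year input format, and the reason strings are assumptions.

```python
from datetime import date, datetime


def date_check(raw_date: str, start: date, end: date, fmt: str = "%d/%m/%Y"):
    """Return (result, reason) for one transaction date string."""
    try:
        parsed = datetime.strptime(raw_date, fmt).date()
    except ValueError:
        # Covers impossible dates such as 31st Feb and unparseable formats.
        return "Failed", "invalid date or format"
    if not start <= parsed <= end:
        return "Failed", "out of range from the start and end dates"
    return "Passed", ""


period_start, period_end = date(2024, 1, 1), date(2024, 1, 31)
for raw in ("15/01/2024", "31/02/2024", "15/03/2024"):
    print(raw, date_check(raw, period_start, period_end))
```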

If the fraud module returns an error, the fraud check table will be a single-row table with the description set to "Error during fraud analysis", or to "Fraud check timed out." when the data processing times out.
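
A consumer of the fraud check table may want to detect this error row before interpreting the results. The sketch below assumes the table is available as a list of dictionaries with a description key, which is a hypothetical representation; trailing periods are ignored when comparing because the exact punctuation of the messages is not certain from this document.

```python
ERROR_DESCRIPTIONS = {"Error during fraud analysis", "Fraud check timed out"}


def fraud_error(rows):
    """Return the error description if the table is a single error row, else None."""
    if len(rows) != 1:
        return None
    description = (rows[0].get("description") or "").strip()
    if description.rstrip(".") in ERROR_DESCRIPTIONS:
        return description
    return None


print(fraud_error([{"description": "Fraud check timed out."}]))
```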