Document Lifecycle Stages

Every document in Docsumo has one of the following statuses. The UI and the JSON/webhook payloads use different names for the same status:

UI Status         JSON/Webhook Status
Processing        new
Reviewing         reviewing
Review Skipped    review_skipped
Processed         processed
Erred             erred
                           Diagrammatic Representation of Document Status Change
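When consuming JSON responses or webhooks, it can help to translate the raw status string back to the label shown in the UI. The following is a minimal sketch of the status table above as a lookup; the names UI_TO_API_STATUS, API_TO_UI_STATUS and ui_label are illustrative and not part of any Docsumo SDK.

```python
# UI label -> status string used in JSON responses and webhooks,
# copied from the status table in this article.
UI_TO_API_STATUS = {
    "Processing": "new",
    "Reviewing": "reviewing",
    "Review Skipped": "review_skipped",
    "Processed": "processed",
    "Erred": "erred",
}

# Reverse lookup: JSON/webhook status -> UI label.
API_TO_UI_STATUS = {api: ui for ui, api in UI_TO_API_STATUS.items()}


def ui_label(api_status: str) -> str:
    """Return the UI label for a JSON/webhook status.

    Unknown values are returned unchanged so that a status not listed
    in this article does not raise an error.
    """
    return API_TO_UI_STATUS.get(api_status, api_status)


if __name__ == "__main__":
    print(ui_label("new"))
    print(ui_label("review_skipped"))
```

Note that the UI label Processing corresponds to the status string new, which is the one pair in the table where the two names differ in meaning as well as in formatting.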

Processing

When you first upload a document, it goes into processing status, where our system tries to extract the data from it.

Reviewing

Once the document is processed by our system, it goes into reviewing status.

You can view the files in the reviewing status by going to the Review tab. To review the documents, click the Start Reviewing button.

                                View Files in Reviewing Status

You can also choose to review documents of a particular document type.

                            Review Documents of a particular document type

The review screen, shown below, displays the data that has been extracted from the document. Users can also make any necessary corrections there.

                                   Docsumo PDF Review Screen

                                  Docsumo Spreadsheet Review Screen

Skipped or Review Skipped

Files that the user does not want to review can be moved to Review Skipped. Files in this stage can later be moved back to the reviewing stage if needed.

                                       Skipped Document

Note: Documents in the spreadsheet workflow cannot be skipped.

Processed

Once the user completes reviewing the data, they can click the Approve button to move the document to Processed status.

Docsumo validates the data against the validation rules, if any are defined. The document is processed only if there are no validation errors.

                     Document not getting approved because of validation errors

In such cases, if you are sure that the data is correct, you can approve the document with errors by clicking the double-check icon (also known as Approve With Error), which is highlighted by the red box in the figure above.

               Confirmation dialog shown to the user when approving a document with errors

In the spreadsheet review screen, Approve acts as both Approve and Approve With Error. The document is approved normally if there are no errors. If there are validation errors, a similar dialog is shown, which lets the user approve the document with errors.

                               Approving Document in Spreadsheet Workflow

You can further review a processed document by opening it and clicking Start Review.

Note: A document can go directly from processing to processed if Straight Through Processing (STP) is turned on and the extraction satisfies all the validation rules. STP cannot be turned on for the spreadsheet workflow as of now.

See More: How to download extracted data

Documents that are approved and documents that are approved with error have the same status. To distinguish them, we added a new key to the metadata: approve_with_error: true.
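Because the status alone does not reveal whether a document was approved with errors, downstream code has to inspect the metadata. The sketch below shows one way to do that. Only the status values and the approve_with_error key come from this article; the payload shape (a dictionary with top-level "status" and "metadata" keys) and the function name approved_with_error are assumptions made for illustration, so adjust them to the actual response you receive.

```python
def approved_with_error(document: dict) -> bool:
    """Return True if a processed document was approved despite validation errors.

    Assumed (hypothetical) payload shape:
        {"status": "processed", "metadata": {"approve_with_error": True}}
    """
    # Only processed documents can have been approved at all.
    if document.get("status") != "processed":
        return False
    # Treat a missing or null metadata block as "no flag set".
    metadata = document.get("metadata") or {}
    # Assumes the flag is a JSON boolean; a missing key counts as a normal approval.
    return metadata.get("approve_with_error") is True


if __name__ == "__main__":
    normal = {"status": "processed", "metadata": {}}
    forced = {"status": "processed", "metadata": {"approve_with_error": True}}
    print(approved_with_error(normal))
    print(approved_with_error(forced))
```

A consumer might use such a check to route documents approved with errors to a secondary audit step while letting normally approved documents flow straight through.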

Erred

When our system cannot process a document for known or unknown technical reasons, the document goes into erred status. Hovering over the Erred message icon shows an error message that indicates the reason for the failure.

                                      Erred Document with error message

Common reasons for errors are:

  1. No data to extract

If there is no data to extract from the document, the document goes to erred status. This generally happens if the file is empty or, in the spreadsheet workflow, if there is no tabular region that we can extract.

  2. Timeout

If a document takes more than 10 minutes to process, the document times out. In such cases, the user can manually retry processing the document.

  3. Password-protected or corrupt files

If a file is password protected or corrupt, we cannot process it further and it goes to erred status.

  4. Extraction Error

Sometimes an issue occurs on the server side while processing a document. This might be due to an error in extraction or an Internal Server Error.

These are the most common reasons why documents go to error.
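Taken together, the stages described in this article form a small state machine. The sketch below encodes the transitions mentioned in the text as a lookup table, using the JSON/webhook status names. It is an interpretation of this article, not an official specification: in particular, the processed to reviewing edge (further review of a processed document) and the erred to new edge (manual retry after a timeout) are assumptions about how those actions are reflected in the status, and the names ALLOWED_TRANSITIONS and can_transition are illustrative.

```python
# Status transitions described in this article, keyed by JSON/webhook status.
ALLOWED_TRANSITIONS = {
    # Upload: extraction succeeds, is approved via STP, or fails.
    "new": {"reviewing", "processed", "erred"},
    # Review: the user skips the file or approves it.
    "reviewing": {"review_skipped", "processed"},
    # Skipped files can be moved back to review.
    "review_skipped": {"reviewing"},
    # Assumption: reopening a processed document returns it to review.
    "processed": {"reviewing"},
    # Assumption: a manual retry sends an erred document back to processing.
    "erred": {"new"},
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the article describes a move from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


if __name__ == "__main__":
    print(can_transition("new", "processed"))
    print(can_transition("review_skipped", "processed"))
```

A table like this can be used to sanity-check a stream of webhook events, for example by logging any status change that the article does not describe.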