Improving Extraction Results

Accurate and efficient data extraction is vital. This support documentation describes strategies for improving extraction results, including uploading additional documents for training and training a model.

How you improve accuracy in document processing with Docsumo depends on your use case. There are two scenarios:

Scenario 1: Using Pre-Trained Document Types

If you're using one of our 50+ pre-trained document types, you're already starting with a strong foundation for accuracy. These pre-trained models are designed to provide high accuracy right from the first document. If you do encounter accuracy issues, the likely cause is document personalisation: your documents have characteristics that differ from what the pre-trained model expects. Don't worry, our intelligent system can adapt.

Steps to Enhance Accuracy:

  1. Identify Errors: Review the results for 2-3 documents and pinpoint any inaccuracies or errors.
  2. Correct Errors: Fix the errors you've identified. This helps train the system to understand your document's unique characteristics.
  3. Automated Results: After addressing these initial issues, you'll start to see improved automated results. The system will adapt to your specific document needs.


    If you still see scope for improvement, you can always train an advanced model.
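
To decide where to focus during the review in step 1, it can help to tally which fields needed a correction across the 2-3 documents you checked. The sketch below is illustrative only: the `extracted`/`corrected` dictionaries, the field names, and the sample values are hypothetical and do not represent Docsumo's export format or API.

```python
from collections import Counter


def field_error_counts(reviewed_docs):
    """Count, per field, how many reviewed documents needed a correction."""
    errors = Counter()
    for doc in reviewed_docs:
        extracted = doc["extracted"]   # values the system produced
        corrected = doc["corrected"]   # values after your manual review
        for field, right_value in corrected.items():
            if extracted.get(field) != right_value:
                errors[field] += 1
    return dict(errors)


# Hypothetical review sample of three documents (not real Docsumo output).
sample = [
    {"extracted": {"invoice_no": "INV-001", "total": "120.00"},
     "corrected": {"invoice_no": "INV-001", "total": "120.00"}},
    {"extracted": {"invoice_no": "INV-002", "total": "98.5"},
     "corrected": {"invoice_no": "INV-002", "total": "98.50"}},
    {"extracted": {"invoice_no": "1NV-003", "total": "45.00"},
     "corrected": {"invoice_no": "INV-003", "total": "45.00"}},
]

print(field_error_counts(sample))
```

Fields that show up repeatedly in the tally are the ones most worth correcting carefully, since those corrections are what the system learns from.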

Scenario 2: Custom Document Types

When your requirements don't align with any of our pre-trained document types and you're working with a custom document type, the path to high accuracy takes a bit more effort, but it is still straightforward.

Steps to Enhance Accuracy:

  1. Annotate 20 Documents: To start, you'll need to annotate a set of 20 documents. Annotation involves labelling fields and data on these documents so the system learns how to extract the information accurately.
    That might sound tedious, but our intelligent system makes it easier by suggesting the values that best fit each field label.
  2. Train an Advanced Model: Once you've annotated the initial set, you can train an advanced model. This model will use the labelled data to improve accuracy.

Iterate and refine: The key to achieving high accuracy with custom document types is iteration. Continue to review and correct results, annotate additional documents if needed, and refine the model.
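
The annotate, train, and review loop can be summarised as a simple decision rule. This is a minimal sketch, not part of Docsumo: the function name `next_step` and the `target_rate` threshold are assumptions made for illustration, and only the figure of 20 annotated documents comes from the steps above.

```python
MIN_ANNOTATED = 20  # initial annotation set size from step 1


def next_step(annotated, corrected, reviewed, target_rate=0.05):
    """Suggest the next action in the annotate/train/review loop.

    annotated:   documents annotated so far
    corrected:   documents in the latest review batch that needed a fix
    reviewed:    size of the latest review batch
    target_rate: acceptable share of corrected documents (assumed value)
    """
    if annotated < MIN_ANNOTATED:
        return f"annotate {MIN_ANNOTATED - annotated} more documents"
    if reviewed == 0:
        return "train an advanced model, then review a batch"
    if corrected / reviewed > target_rate:
        return "correct results, annotate more documents, retrain"
    return "accuracy target met, keep monitoring"


print(next_step(annotated=12, corrected=0, reviewed=0))
print(next_step(annotated=20, corrected=4, reviewed=10))
print(next_step(annotated=35, corrected=0, reviewed=10))
```

Choose a threshold that matches how much manual review you are willing to keep doing; the 5% default here is only a placeholder.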

Docsumo's flexibility and adaptability ensure that you can achieve your accuracy goals, whether you're using pre-trained document types or custom ones.