Train Your Own or Fine-Tune Document Type
Adding a custom document type in Docsumo is a straightforward process that allows you to tailor document data extraction to your specific needs. Follow these steps to create a custom document type:
If you only need to fine-tune an existing document type without changing its fields, skip to Step 9. If you also need to edit fields, start at Step 5.
Step 1. Go to the Document Type Page
- Begin by navigating to the Document Type Page in your account.
Step 2. Click "Add Document Type"
- On the Document Type Page, click the "Add Document Type" button to initiate the document type creation process.
Step 3. Select "Create Your Own"
- In the popup that appears, you'll see a list of available document types. Look for the category "Custom Document Type" and select "Create Your Own."
Step 4. Name Your Document Type and Upload Documents
- In the subsequent popup, provide a name for your custom document type.
- Upload sample documents that represent the type you want to create. These samples will be used to train the model for data extraction.
Step 5. Locate the Document Type Card
- After creating your custom document type, you will see a card representing it on the Document Type Page.
Step 6. Click "Edit Fields" on the Card
- Click the "Edit Fields" button on the card to define the specific fields you wish to extract from documents of this type.
Step 7. Define the Fields
- In the Edit Fields interface, add the fields you want to extract. You can set rules and patterns to identify and extract data accurately.
Step 8. Save the Changes
- Once you've defined the fields, save the changes.
Step 9. Reprocess the Documents or Upload New Documents
- Upload 20 or more new documents, or reprocess the existing documents, to apply the changes you made to the fields and extraction rules.
Step 10. Go to the Review Screen
- Once the document has been reprocessed, navigate to the Review Screen using the "Review" button on the document type card to review the extracted data.
Step 11. Annotate the Document
- In the Review Screen, annotate the document with the correct values. The AI Assist feature generates suggested output for the defined fields.
- If you notice incorrect or missing values, correct or add them by annotating the document accordingly.
Step 12. Confirm
- After annotating the document, press the "Confirm" button to finalise the extraction process.
Step 13. Train the Model
- To improve efficiency, annotate 20 or more documents of your custom document type.
- After annotating a sufficient number of documents, you can train a model around them. This will allow for ML-generated results without the need for manual effort in future document processing.
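As a rough illustration of the 20-document threshold in Step 13, the sketch below counts how many documents are usable for training and how many more are needed. It is not a Docsumo API call: the status labels (including "Review required") and the plain list of status strings are hypothetical, chosen only to mirror the "Confirmed/Approved" wording used in this guide.

```python
from collections import Counter

# Threshold taken from Step 13: annotate 20 or more documents before training.
MIN_DOCS_FOR_TRAINING = 20

# Hypothetical status labels, assumed for illustration only.
TRAINABLE_STATUSES = {"confirmed", "approved"}


def training_readiness(statuses):
    """Return (usable_count, remaining) for a list of document status strings."""
    counts = Counter(s.strip().lower() for s in statuses)
    usable = sum(counts[s] for s in TRAINABLE_STATUSES)
    remaining = max(0, MIN_DOCS_FOR_TRAINING - usable)
    return usable, remaining


# Example data: 14 confirmed, 3 approved, 5 still awaiting review (hypothetical label).
statuses = ["Confirmed"] * 14 + ["Approved"] * 3 + ["Review required"] * 5
usable, remaining = training_readiness(statuses)
print(f"{usable} usable samples, {remaining} more needed before training")
```

With the example data, 17 documents count toward training, so 3 more need to be annotated and confirmed.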
By following these steps, you can create a custom document type, tailored to your specific document structure and data extraction requirements. This customisation ensures accurate and efficient data extraction for your unique document types.
The table below shows the approximate accuracy of a model trained on different sample sizes, measured on a one-page document such as an invoice.
Number of samples (Confirmed/Approved) | Model | Accuracy |
---|---|---|
0 | AI Assist | ~ 50% |
20 | AI Assist + Custom model | ~ 70% |
70 | AI Assist + Custom model | ~ 80% |
120 | AI Assist + Custom model | ~ 85% |
200+ | AI Assist + Custom model | ~ 90-95% |
Accuracy can vary depending on the document type and the number of fields you want to extract. Please contact our sales or customer success team if you have any questions. You can email us at [email protected] or chat with us from within the product at app.docsumo.com (not the website).
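The accuracy table above can be read as a simple lookup when planning how many documents to annotate. The sketch below is only an illustration: the function names are hypothetical, and treating a sample count that falls between two rows as belonging to the lower row is a conservative assumption, since the table only gives figures at 0, 20, 70, 120, and 200+ samples.

```python
import bisect

# Rows copied from the accuracy table: (minimum confirmed samples, model, approximate accuracy).
ACCURACY_TIERS = [
    (0, "AI Assist", "~50%"),
    (20, "AI Assist + Custom model", "~70%"),
    (70, "AI Assist + Custom model", "~80%"),
    (120, "AI Assist + Custom model", "~85%"),
    (200, "AI Assist + Custom model", "~90-95%"),
]
THRESHOLDS = [tier[0] for tier in ACCURACY_TIERS]


def expected_tier(confirmed_samples):
    """Return the table row whose sample threshold has been reached (assumed: round down)."""
    if confirmed_samples < 0:
        raise ValueError("confirmed_samples must be non-negative")
    idx = bisect.bisect_right(THRESHOLDS, confirmed_samples) - 1
    return ACCURACY_TIERS[idx]


def samples_to_next_tier(confirmed_samples):
    """Return how many more confirmed samples reach the next row, or None at the top row."""
    idx = bisect.bisect_right(THRESHOLDS, confirmed_samples)
    if idx == len(THRESHOLDS):
        return None
    return THRESHOLDS[idx] - confirmed_samples


minimum, model, accuracy = expected_tier(45)
print(f"45 samples: {model}, {accuracy}, {samples_to_next_tier(45)} more to the next row")
```

For 45 confirmed samples this reports the 20-sample row (about 70%) and 25 more samples to reach the 70-sample row. Actual accuracy still depends on the document type and the number of fields.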
By following this guide, you can effectively create your own document type, and enhance your document processing capabilities to meet your specific needs.