Train Your Document Type

  1. What is the process for training a new document type in Docsumo?
    Training a new document type involves uploading sample documents, defining fields for extraction, and refining the model with feedback until it accurately processes the document. For detailed steps, visit this guide.

  2. How many samples are required to train a new document type?
    As a rule of thumb, provide at least 20 to 30 diverse samples per document type so the AI model can train effectively. For more guidelines, see this guide.

  3. Can I train the model with documents that have varying formats?
    Yes, training the model with documents in different formats helps improve its ability to generalize and accurately extract data from a variety of layouts. Learn more here.

  4. What are field mappings, and why are they important?
    Field mappings define which parts of the document the model should extract data from. Correct mappings are crucial for accurate data extraction. For more information, see this page.
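As a rough illustration of the idea, a set of field mappings can be thought of as a small schema that names each field to extract and says what kind of value to expect. The sketch below is not Docsumo's configuration format. The `FieldMapping` class, its attributes, and the invoice field names are all hypothetical, chosen only to show why a complete and correct mapping matters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    """Hypothetical description of one field to extract (not Docsumo's schema)."""
    name: str              # key under which the extracted value is stored
    label: str             # text as it typically appears on the document
    data_type: str         # e.g. "string", "date", or "number"
    required: bool = True  # whether a missing value should be flagged


# Illustrative mappings for an invoice-like document type.
INVOICE_MAPPINGS = [
    FieldMapping("invoice_number", "Invoice No.", "string"),
    FieldMapping("invoice_date", "Date", "date"),
    FieldMapping("total_amount", "Total", "number"),
    FieldMapping("po_number", "PO Number", "string", required=False),
]


def missing_required(extracted, mappings):
    """Return the names of required fields that are absent or empty in a result."""
    return [m.name for m in mappings if m.required and not extracted.get(m.name)]


result = {"invoice_number": "INV-1", "invoice_date": "2024-01-05"}
print(missing_required(result, INVOICE_MAPPINGS))
```

A check like this makes gaps visible early: if a required field is never mapped, or is mapped to the wrong label, it shows up as missing on every document.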

  5. How long does it take to train a new document type?
    Training time depends on the complexity of the document and the number of samples provided. Generally, it can take a few hours to a few days. More details are available here.

  6. What happens after the initial training phase?
    After initial training, the model enters a fine-tuning phase where additional samples and feedback help improve its accuracy. Learn more about the post-training process here.

  7. Can I retrain a model if extraction results are not satisfactory?
    Yes, you can retrain the model by providing additional samples or adjusting field mappings to improve accuracy. Steps for retraining are outlined here.

  8. How do I test the accuracy of a newly trained document type?
    You can test accuracy by running the model on a set of validation documents and reviewing the extracted data for correctness. Testing procedures are available here.
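One simple way to put a number on this review is field-level accuracy: the share of labeled fields for which the extracted value matches the expected value. The sketch below assumes you have exported the extracted data and your own hand-checked values as lists of dictionaries. The function names, field names, and sample values are hypothetical and do not come from Docsumo.

```python
def normalize(value):
    """Compare values loosely: treat None as empty, ignore case and outer whitespace."""
    return "" if value is None else str(value).strip().casefold()


def field_accuracy(predictions, ground_truth):
    """Fraction of labeled fields whose extracted value matches the expected value."""
    total = 0
    correct = 0
    for predicted, expected in zip(predictions, ground_truth):
        for field, expected_value in expected.items():
            total += 1
            if normalize(predicted.get(field)) == normalize(expected_value):
                correct += 1
    return correct / total if total else 0.0


# Hand-checked values for two validation documents (illustrative data).
ground_truth = [
    {"invoice_number": "INV-001", "total_amount": "120.50"},
    {"invoice_number": "INV-002", "total_amount": "80.00"},
]
# What the model extracted for the same two documents (illustrative data).
predictions = [
    {"invoice_number": "inv-001 ", "total_amount": "120.50"},
    {"invoice_number": "INV-002", "total_amount": "88.00"},
]

print(field_accuracy(predictions, ground_truth))  # 3 of 4 fields match: 0.75
```

Keep the validation documents separate from the training samples, otherwise the score will overstate how well the model handles documents it has not seen.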

  9. What should I do if the model struggles with certain document elements?
    If the model struggles with specific elements, provide additional targeted samples or adjust the field mappings to help the model learn those elements better. Troubleshooting tips are provided here.
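To decide which elements need targeted samples, it can help to count errors per field across a reviewed batch. The following is a minimal sketch under the assumption that extracted and corrected values are available as dictionaries. The `errors_by_field` helper and the data are hypothetical and are not part of Docsumo.

```python
from collections import Counter


def errors_by_field(predictions, ground_truth):
    """Count mismatches per field, most frequent first."""
    errors = Counter()
    for predicted, expected in zip(predictions, ground_truth):
        for field, expected_value in expected.items():
            if predicted.get(field) != expected_value:
                errors[field] += 1
    return errors.most_common()


# Corrected values from review (illustrative data).
reviewed = [
    {"due_date": "2024-02-01", "total_amount": "50.00"},
    {"due_date": "2024-03-15", "total_amount": "75.00"},
]
# Values the model extracted for the same documents (illustrative data).
extracted = [
    {"due_date": "2024-01-02", "total_amount": "50.00"},
    {"total_amount": "57.00"},
]

for field, count in errors_by_field(extracted, reviewed):
    print(f"{field}: {count} error(s)")
```

Fields at the top of this list are the ones where extra samples or a revised field mapping are most likely to pay off.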

  10. Can I use a previously trained model as a baseline for a new document type?
    Yes, you can use an existing model as a baseline and then further train it with new samples to adapt to a different document type. This approach is efficient for similar document types. Learn how to do this here.

  11. How does feedback improve the training of a document type?
    When you correct inaccuracies in the extracted data, the model learns from those corrections, which improves future extraction results. Learn more about providing feedback here.

  12. What are the benefits of training a custom document type in Docsumo?
    Training a custom document type ensures that the model is tailored to your specific needs, leading to higher accuracy and efficiency in data extraction. Benefits are detailed here.

  13. Can I update a trained model as document formats evolve?
    Yes, you can continuously update the model by retraining it with new samples as your document formats change, ensuring ongoing accuracy. Update procedures are described here.

  14. How do I export the trained model for use in other systems?
    Once the model is trained, you can export it for integration with other systems. For export instructions, see here.