Solving Legal Document Challenges: Casefleet's OCR Technology Evolves

Written by Meg Hall | November 15, 2023

In legal proceedings, every word can carry significant weight. As legal professionals increasingly rely on digital documents for case preparation, discovery, and evidence presentation, the ability of Optical Character Recognition (OCR) technology to flawlessly convert scanned documents and images into searchable, editable text becomes crucial.

Casefleet understands that quality OCR is critical to the success of your case. It helps ensure no detail is overlooked and preserves the integrity of the information you use to craft your case narrative. With accurate OCR, every letter is accounted for. This post explains why high-quality OCR is a necessity in the legal arena, where the stakes are as high as the demand for precision, and how Casefleet is achieving new heights in OCR excellence.

What is OCR?

OCR is a technology that allows computers to recognize printed or handwritten text in images or scanned documents. In simpler terms, it's like teaching a computer to read the text in a picture, just as a person would.

When you scan a document or take a photo of a page, OCR software examines the letters and words in the image and converts them into a form that a computer can edit, search, or process. That makes it much easier to locate specific words, edit the text, or even have it read aloud by text-to-speech programs.

The OCR dilemma

Poor-quality OCR can introduce inaccuracies in text detection in several ways, which in turn can cause search errors or missed information. Common issues with low-quality OCR include:

  1. Misinterpretation of Characters. For instance, the software might read the letter "O" as the number "0", or confuse "l" (lowercase L) with "I" (uppercase i). This type of error can change the meaning of words and phrases, leading to inaccuracies in full-text search results.
  2. Difficulty with Handwriting or Fonts. Handwritten notes and unusual or stylized fonts are harder to recognize than standard print. If the OCR software cannot accurately interpret these styles, it may either miss the text entirely or misread it, causing crucial information to be overlooked during searches.
  3. Poor Image Quality. If the original document or image scanned is of poor quality, perhaps due to low resolution, blurriness, or uneven lighting, the OCR software might not accurately detect all the text. This can result in missing or incorrect characters in the digitized version.
  4. Inaccurate Indexing and Searchability. When OCR inaccuracies occur, the resulting text may not be properly indexed. This means that even if a document contains relevant information, it might not show up in search results due to errors in the text conversion process.

In summary, poor OCR quality can have a domino effect on the reliability and effectiveness of text searches in digitized documents. This is particularly problematic in legal, academic, or government work, where the accuracy of information is paramount.
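To make the search problem concrete, here is a minimal Python sketch of how a single misread character can hide a document from an exact-match search, and how folding commonly confused characters together can recover it. Everything in it is an assumption made for illustration: the helper names, the confusable-character table, and the sample documents are hypothetical and do not describe how Casefleet indexes or searches text.

```python
# Illustrative only: hypothetical helpers, not part of any Casefleet API.

# Map characters that OCR commonly confuses onto one representative each.
CONFUSABLE_FOLD = str.maketrans({
    "0": "o", "O": "o",
    "1": "l", "I": "l", "|": "l",
    "5": "s", "S": "s",
})


def fold(text: str) -> str:
    """Collapse commonly confused characters, then lowercase the result."""
    return text.translate(CONFUSABLE_FOLD).lower()


def exact_search(query: str, documents: dict[str, str]) -> list[str]:
    """Case-insensitive substring search on the raw OCR text."""
    return [name for name, text in documents.items()
            if query.lower() in text.lower()]


def tolerant_search(query: str, documents: dict[str, str]) -> list[str]:
    """Substring search after folding confusable characters on both sides."""
    folded_query = fold(query)
    return [name for name, text in documents.items()
            if folded_query in fold(text)]


# Two made-up OCR results for the same sentence: one clean, one with
# "O" read as "0" and "I" read as "l".
documents = {
    "clean_scan.pdf": "Payment was made to Oliver Illingworth on 5 May.",
    "noisy_scan.pdf": "Payment was made to 0liver lllingworth on 5 May.",
}

print(exact_search("Oliver Illingworth", documents))
print(tolerant_search("Oliver Illingworth", documents))
```

The exact search finds only the clean scan, while the tolerant search finds both. The trade-off is that folding also produces more false positives (for example, "SO" and "50" become indistinguishable), so a workaround like this is no substitute for accurate OCR in the first place.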

Casefleet's OCR overhaul

Earlier this month, Casefleet released a major upgrade to our document processing, including significant advancements to our text recognition and document review tools.

First, Casefleet's OCR quality has dramatically improved. In our initial testing, we've seen OCR accuracy improve by as much as 96 percent!

Here's an example of scanned text that the old OCR had trouble reading correctly. Because the scanned image was askew, the old OCR missed portions of the paragraph and jumbled the order of characters, producing this as the content of the text:

LM jett thrust. The separation bu rade after jettison, providing 2 foot per second posig thrust.

As you might imagine, the inaccuracies in the detected and indexed text made the passage far less searchable. With our recent overhaul, the same scan now yields the following text, resulting in a better text index for searches:

LM jettison is done radially, CSM below, with final sep pyros providing approximately 0.4 foot per second radial thrust. The separation burn is performed five minutes after jettison, providing 2 foot per second posigrade thrust.
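One common way to put a number on a difference like this is character error rate (CER): the edit distance between the OCR output and the correct text, divided by the length of the correct text. The Python sketch below applies it to the two passages quoted above, treating the new output as the reference. It is an illustration under that assumption, not the method behind the accuracy figures in this post (which the post does not describe), and the function names are hypothetical.

```python
# Illustrative only: a plain-Python character error rate (CER) calculation.

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1,      # delete from a
                               current[j - 1] + 1,   # insert into a
                               substitution))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by reference length (0.0 means a perfect match)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# The two passages quoted in this post; the new output is assumed correct.
reference = (
    "LM jettison is done radially, CSM below, with final sep pyros "
    "providing approximately 0.4 foot per second radial thrust. "
    "The separation burn is performed five minutes after jettison, "
    "providing 2 foot per second posigrade thrust."
)
old_output = (
    "LM jett thrust. The separation bu rade after jettison, "
    "providing 2 foot per second posig thrust."
)

print(f"Old OCR character error rate: {character_error_rate(reference, old_output):.0%}")
```

Because the old output is missing more than half of the passage's characters, its error rate against the reference is well above 50 percent.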

In addition to accuracy improvements, Casefleet's upgraded OCR also:

  • detects rotated pages and corrects the rotation automatically on a page-by-page basis.
  • detects problems in malformed PDF files that could lead to text becoming invisible during final presentation of a document.
  • recognizes more instances of handwritten text.

Seeing the impact

Working with software offering superior OCR technology can significantly streamline the document review process, leading to considerable savings in time and resources:

  • Accuracy and Efficiency: High-quality OCR accurately digitizes text from documents, reducing errors and the need for manual checks, thus speeding up the review process.
  • Searchability: OCR-processed documents become easily searchable, allowing users to quickly locate specific information or keywords in large volumes of data, significantly saving time.
  • Time Savings in Legal Review: In legal contexts, where reviewing contracts, case files, and evidence documents is time-intensive, OCR technology can drastically cut down the time required to locate and analyze pertinent information, allowing legal professionals to focus on analysis and case strategy.
  • Cost Savings: Reviewing documents with superior OCR can lead to long-term savings by reducing manual labor, decreasing physical storage needs, and increasing overall document processing efficiency.

To see the new OCR in action, start your free trial or schedule a personalized demo today!