Healthcare providers are under constant pressure to improve workflow efficiency and overall patient care. Healthcare IT must apply the latest technologies to a multitude of scenarios in an ever-changing landscape. Even with healthcare organizations embracing electronic medical records, it’s important to address the continued use of paper.
Using Optical Character Recognition (OCR) to scan and automatically accession orders can drastically improve the ordering process if done correctly. It is important to understand where OCR can be successful and where it falls short. Together we can address the opportunities, issues, and challenges in automating laboratory order entry with OCR technologies. We will explore the importance of bringing together the people, process, and technology.
This whitepaper offers real-world tips to increase the likelihood of success when implementing your OCR process. Move the healthcare industry forward by applying a new technology to an old process.
Unfortunately, despite all the advances in technology, laboratories are unable to become completely paperless. Even if an entirely paperless workflow were possible, using paper is still an appropriate option due to HIPAA regulations or other security concerns. With paper as an unavoidable part of the process, many laboratories rely on a team of accessioners to read, interpret, and enter data from a test requisition form into a Laboratory Information System (LIS). As test requisitions become more difficult to understand and test volume increases, human errors are more likely to occur. By eliminating this time-consuming and error-prone workflow step, you can help increase the overall speed and accuracy of patient care. While you may not be able to eliminate paper, you can enhance the user's process by letting your computer help with the heavy lifting.
Am I saying you no longer need accessioners? No, but I am saying that accessioners can be trained to go beyond their normal day-to-day work and take on value-add activities in Quality Assurance, such as fixing incomplete orders and preventing fraud. Enhancements in optical character recognition (OCR) and related technologies allow healthcare providers to accurately capture data from paper, turning it into structured, searchable data. For the laboratory, we want to take this a step beyond searchable digital copies of test requisitions and automate the process of transferring data from paper to an electronic order.
OCR stands for Optical Character Recognition and is the technology used to translate a scanned image of characters into computer-encoded text. In most scenarios, it is used to translate a printed document of text into something a computer can read and search as if each letter was typed individually. In addition to printed characters, a major challenge with laboratory test requisitions is the presence of handwriting. OCR is not generally used to recognize handwriting. This is where Intelligent Character Recognition (ICR) comes in.
Before moving forward, here are a few relevant terms to familiarize yourself with.
Beyond healthcare organizations, a mixture of these technologies is actively used in financial institutions, credit card companies, law offices, and government agencies. Have you ever deposited a check into your bank account through a mobile app or ATM? OCR and ICR technologies are being used to convert your information from handwriting and printed text into digital characters for your bank’s software to consume. However, don’t get too excited just yet. These technologies behave very differently in your laboratory than in a bank because a check is nowhere near as complex as a test requisition form.
The most common question laboratories ask regarding OCR-related technologies is, “How accurate is it?” To answer this question, let’s start by defining accuracy. The English Oxford Dictionary defines accuracy as, “The quality or state of being correct or precise.” When applied to creating a letter-perfect searchable representation of a document, OCR accuracy comes into play and is an important term to understand. In this case, we are defining accuracy by setting the rules for what is considered accurate or inaccurate. This way the software knows how confident it must be before it assumes a correction. For example, when creating an electronic order from a paper requisition, the user can choose to measure accuracy based on their own needs, which may differ from those of traditional document management systems.
Now back to the question, “How accurate is it?” OCR vendors tend to avoid answering this question because the answer is ambiguous. In the rare case a vendor does clearly state their accuracy, it runs the risk of having too many caveats in the fine print. Think of defining OCR/ICR accuracy measurements like a weight loss program. Weight loss programs often require additional work on your own time with warnings such as, “Results may vary” or “You must use our product in conjunction with a healthy diet and exercise.” OCR, like weight loss, requires additional work and has many factors that need to come together in order to be successful.
Given the following scenario, how would you measure accuracy?
After scanning a document, the OCR engine interprets a name as “Ray Kurzvveil.” However, it has assumed this is incorrect and is confident enough to autocorrect the output to “Ray Kurzweil.” The correct spelling is Ray Kurzweil.
In this scenario, the OCR engine misread a “w” as two consecutive “v” characters. While the character recognition itself was incorrect, the Intelligent Word Recognition (IWR) was able to overcome the error and provide a confident autocorrection. To measure accuracy we must configure a confidence threshold in the OCR software to define what decisions to make. The above scenario is an example of a confidence threshold at work: thresholds are set for specific fields on a test requisition form and decide what level of accuracy is acceptable to process without human interaction.
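To make the idea of a per-field confidence threshold concrete, here is a minimal Python sketch. The field names, threshold values, and the `FieldResult` structure are assumptions made for illustration; real OCR suites expose confidence in their own formats.

```python
from dataclasses import dataclass


@dataclass
class FieldResult:
    name: str          # field on the requisition, e.g. "patient_name"
    text: str          # text after any autocorrection by the engine
    confidence: float  # engine-reported confidence, 0.0 to 1.0


# Hypothetical thresholds; real values come from tuning on your own forms.
THRESHOLDS = {"patient_name": 0.90, "date_of_birth": 0.98}
DEFAULT_THRESHOLD = 0.95


def needs_review(field: FieldResult) -> bool:
    """Return True when a human should verify the field before ordering."""
    return field.confidence < THRESHOLDS.get(field.name, DEFAULT_THRESHOLD)


fields = [
    FieldResult("patient_name", "Ray Kurzweil", 0.93),
    FieldResult("date_of_birth", "01151970", 0.91),
]
review_queue = [f.name for f in fields if needs_review(f)]
print(review_queue)  # ['date_of_birth']
```

In this sketch the autocorrected name passes without human interaction because its confidence clears the name threshold, while the date is routed to an accessioner because dates are held to a stricter standard.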
At this moment you might be asking yourself, “Now wait a minute. I use the latest note-taking applications and they can interpret handwriting into text with ease. What makes accuracy in this situation so complicated?” Keep in mind there is a big difference between software capturing your handwriting as you write and an OCR engine interpreting a large quantity of scanned documents with unknown handwriting. The key advantage your note-taking application has is character stroke information. The application can “watch” you write and make predictions based on a memory of your strokes and writing style. When scanning a test requisition form, the OCR engine does not have this additional information, which makes the handwriting much more difficult to interpret. Be careful when evaluating OCR engines and research whether their publicized accuracy is based on OCR, IMR, or some other methodology.
There are seven critical steps to successfully transfer a blank test requisition form to an automated lab accession. Let’s dive into each step.
Designing a document that’s easy for a computer to understand is much different than designing a document for the ordering physician to understand. These two priorities compete with one another: a choice that makes the form easier for computers often makes it harder for humans, and vice versa. In this case, the physician's needs outweigh the computer’s, as the form is crucial to their job. And with the physician's needs as a priority, there is only so much you can do to cater to the computer. Things like controlling who fills out the form or altering the ordering process so the computer is able to read the document better are out of the question. So what can you do?
The design of your requisition is an important factor in keeping both the physician and the software happy. Though I stated before that their needs compete with one another, careful design is where you can reconcile them. During the design phase, keep OCR at the top of your mind, because how you design your requisition will heavily influence every later step. To help you better understand, I will discuss special markers, barcodes, field constraints, font, and magnetic ink, all of which are common design elements of an OCR-friendly test requisition form. Let’s take a deeper look at how these elements improve performance.
The first design element is a special marker or a barcode, which is used to help identify the type of form you are inputting into the system. You may be able to recognize it as a test requisition by sight, but the computer will use the barcode to determine the form’s current version and its interpretation parameters. In short, using a custom barcode allows the software to instantly understand which type of document it is scanning.
The next design element requires you to add the appropriate field constraints. A common mistake made when creating requisitions is creating freeform text fields. For instance, the ordering physician could write in cursive, write sloppily, or write multiple responses in one area, causing the software to misread the text. For the best results, your form needs character separation so the software can differentiate between handwritten letters. To get the most out of your process, educate your marketing team on the importance of these constraints, as they are the liaison between the physicians and your laboratory.
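As a rough illustration of how a decoded barcode can select the interpretation parameters, the Python sketch below maps a barcode payload to a form template. The payload format (`REQ-4`), the template registry, and the field names are all hypothetical; your OCR suite will have its own way of registering form versions.

```python
# Hypothetical registry of form templates keyed by (form type, version).
FORM_TEMPLATES = {
    ("REQ", 3): {"fields": ["patient_name", "date_of_birth", "tests"]},
    ("REQ", 4): {"fields": ["patient_name", "date_of_birth", "tests", "icd10"]},
}


def template_for(barcode_payload: str) -> dict:
    """Parse a payload such as 'REQ-4' and return the matching template."""
    form_type, _, version = barcode_payload.partition("-")
    key = (form_type, int(version))
    if key not in FORM_TEMPLATES:
        # Unknown forms should go to a human rather than be guessed at.
        raise KeyError(f"Unknown form: {barcode_payload}")
    return FORM_TEMPLATES[key]


print(template_for("REQ-4")["fields"])
```

Keeping the version in the barcode means older requisitions still in circulation can be read with the layout they were printed with.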
To enhance their process, banks use a special “OCR” font, which makes it easier to turn the image into a letter because the software has a stored set of known letter shapes. This font originated as “OCR-A” and has since gone through several iterations and splits from different companies in order to refine the process. In addition to OCR-A, checks use MICR. MICR stands for Magnetic Ink Character Recognition and is verifiable through the magnetic ink used to print the checks. This font may look familiar, as it commonly appears at the bottom of the checks in your checkbook. Here is an example of what both the OCR-A and MICR fonts look like.
Beyond using OCR-specific fonts, one of the most impactful elements to incorporate in your design is comb lines. Comb lines are horizontal lines with small vertical tick marks used to separate each letter. Done properly, they encourage ordering physicians to write letters separately. Be careful: on many forms the tick marks are too small and too close together. I recommend making the tick marks at least half the height of the expected characters. Below are two examples of comb lines, one done properly and the other suboptimally.
This example has poorly spaced tick lines that are also too short. Remember, the tick lines should be at least half the height of the expected characters.
This second comb line example is done properly. It has enough space between tick lines, and the tick height is half the height of the expected handwritten characters.
While comb lines encourage character separation, they are not best suited for ICR processing.
Character boxes are a better option to encourage character separation. Boxes fully constrain each letter to its own space. Again, like comb lines, character boxes will only be effective if implemented properly. Do not make your boxes too small, or the ordering physician will be unable to write small enough and their letters will land on or outside the boxes. Best practices suggest creating a square box when possible. Rectangular boxes are better than comb lines; however, if they are too narrow they can cause unnaturally small-looking characters.
Too narrow, tall boxes look like this:
Here is an example of the proper spacing and size of square boxes:
If space permits, creating character boxes with spaces in between will provide the best results.
The goal of the design phase is to encourage the physician to write in the appropriate space so that you capture the most accurate characters possible.
In the design phase, the key takeaway was to add constraints to the physical areas of interest. In a similar way, you will need to limit the possible interpretations or conclusions the OCR engine can come to. For maximum efficiency gains, it’s important to do everything you can to focus and limit the character recognition engine to a group of specific characters. Let’s explore how you can do this.
There are a handful of fields with finite values that have obvious constraints. For instance, there are a finite number of states, cities, and zip codes. When configuring these fields for interpretation, it doesn’t make sense to allow for any word. The accuracy threshold for character recognition can be lowered as long as there is a high confidence match on the limited set of expected outcomes. Another field with obvious constraints is a date field. A date field should only allow digits. Creating a list of expected outcomes is one half of the equation. Another approach is eliminating special characters in appropriate fields. For example:
Creating a configuration to identify and define proper expected and unexpected characters will enhance the outcome and confidence of your OCR engine. While newer OCR technologies are not trained on a specific font or language, it is still important to configure the OCR engine with the type of text being used on the document. If you know your test requisitions should be filled out only in English, limit your possible word set to the English language.
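The constraints described above can be sketched in a few lines of Python. This is only an illustration: the look-alike character map, the shortened state list, and the similarity cutoff are assumptions, and a commercial OCR suite would normally express the same rules through its own field configuration.

```python
import difflib
import re
from typing import List, Optional

# Letters that are commonly confused with digits (an illustrative subset).
LOOKALIKES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)

# Shortened for the example; a real list would hold every valid value.
STATES = ["Ohio", "Iowa", "Idaho", "Utah"]


def clean_date(raw: str) -> str:
    """Map look-alike letters to digits, then keep digits only."""
    return re.sub(r"\D", "", raw.translate(LOOKALIKES))


def match_allowed(raw: str, allowed: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the closest allowed value, or None if nothing is close enough."""
    matches = difflib.get_close_matches(raw, allowed, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(clean_date("O1/l5/197O"))        # 01151970
print(match_allowed("0hio", STATES))   # Ohio
print(match_allowed("Zzzz", STATES))   # None
```

Because the state field can only resolve to a value on the list, a misread character still produces the right answer, which is why the character-level threshold can be relaxed for such fields.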
Now it’s time to take your configuration to the next level by using custom database lookups. Ask yourself, have you seen this patient before? Compare the scanned text to a list of your patients from your CRM. Remember the test write-in spot on your test requisition? Any write-in areas of your requisition should be limited to the individual tests you offer. To limit the number of possibilities and increase overall accuracy, test write-in fields are appropriate fields to have a custom database lookup against your Laboratory Information System (LIS).
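A custom database lookup for a test write-in field might look like the following Python sketch. The in-memory SQLite table stands in for your LIS test catalog; the table name, columns, test entries, and the 0.8 cutoff are hypothetical choices for the example.

```python
import difflib
import sqlite3
from typing import Optional

# Stand-in for the LIS test catalog (hypothetical schema and contents).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tests (code TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO tests VALUES (?, ?)",
    [
        ("CBC", "Complete Blood Count"),
        ("CMP", "Comprehensive Metabolic Panel"),
        ("TSH", "Thyroid Stimulating Hormone"),
    ],
)


def lookup_test(write_in: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the code of the offered test closest to the write-in text."""
    names = {
        name.lower(): code
        for code, name in conn.execute("SELECT code, name FROM tests")
    }
    match = difflib.get_close_matches(write_in.lower(), list(names), n=1, cutoff=cutoff)
    return names[match[0]] if match else None


print(lookup_test("Cornplete Blood Count"))  # CBC
print(lookup_test("Lipid Panel"))            # None
```

A write-in that does not resolve to an offered test returns `None`, which is a natural signal to send the requisition to an accessioner instead of guessing.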
When printing and distributing test requisitions, your first decision is whether you are going to print requisitions internally or outsource this service. Printing your own requisitions allows for complete control and flexibility but requires more effort. Second, you have the option of using the same test requisition for each client or designing and printing specialized forms on an individual client basis. Third, you must choose between printing methods. Will it be the time-tested method of printing with a dot matrix or will it be a newer laser printing technology? The choice is yours.
When printing your own requisitions, another decision to make is the paper thickness, which is critical to successfully scanning double-sided requisitions. If the paper is too thin and the characters bleed through, it can affect the accuracy of scans. Your scanner could pick up letters from the opposite side and create noise, impairing the OCR engine’s ability to interpret the information on your form. Your test requisitions should be printed on paper thick enough to prevent text on the back side from bleeding through during scanning.
Are your requisitions printed in duplicate or triplicate? If you need multiple copies printed at the same time, you can choose high-throughput dot matrix printers or use laser printers to print individual pages and perform a gluing process to combine each into one form, before shipping. If you decide to use laser printers and glue, keep in mind that additional staff may be required to print and glue the requisitions. This effort can increase if you are also creating custom test requisitions per client. You will need to perform a cost-benefit analysis to see which option is best for your laboratory.
Whether you choose to print with your own equipment or outsource to a third party, inventory management is a key part of the process. Ensure your vendor has the appropriate software to synchronize with your ordering system and automatically order additional forms. If the vendor’s software is unable to accomplish this synchronously, the sales organization should at least allow you to order additional forms in the system on demand.
When scanning your requisitions, think of this phrase: “Garbage in, garbage out.” If the quality of your scan is bad, then the OCR results will be bad. The goal is to get the highest quality scan possible. You should configure your scanner for a minimum of 300 DPI to capture enough detail for the OCR process. Even though you can achieve 300 DPI or higher with a consumer-grade scanner, make sure you purchase a commercial-grade scanner. If you have spent any time in a room full of accessioners, you will know the chaos I mean: stacks of folded, crumpled, or otherwise mangled requisition forms. Scanning stacks of flat paper is difficult enough, but fold lines and skewed scans inhibit the process further. Depending on your process you can scan each requisition as you go or you can scan in bulk several times a day.
>I cannot express enough the importance of quality scans with your accessioners.
>The accuracy during processing is highly dependent on two things:
My advice is to configure an iterative approach for processing. Use lower quality settings first for faster processing. If the engine is unable to interpret the document, start the processing over with higher quality settings. Repeat this until you achieve acceptable results or run out of settings to try. This approach allows the majority of your requisitions to be processed quickly while maintaining your accuracy. The additional processing overhead will only affect a few problem requisitions.
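The iterative approach can be sketched in Python as a loop over progressively more expensive settings. Everything named here is an assumption for illustration: `run_ocr` is a placeholder for whatever call your OCR suite provides, the `fast` and `thorough` profiles are hypothetical, and `fake_ocr` only simulates an engine so the example runs on its own.

```python
from typing import Callable, Optional, Sequence, Tuple

# Ordered from cheapest to most expensive (hypothetical profiles).
SETTINGS = [{"profile": "fast"}, {"profile": "thorough"}]


def process_with_escalation(
    image_id: str,
    run_ocr: Callable[[str, dict], Tuple[str, float]],
    settings: Sequence[dict] = SETTINGS,
    threshold: float = 0.95,
) -> Tuple[Optional[str], int]:
    """Try each setting in order; stop at the first confident result.

    Returns (text, attempts). text is None when every setting failed,
    which means the requisition should go to a human.
    """
    for attempt, setting in enumerate(settings, start=1):
        text, confidence = run_ocr(image_id, setting)
        if confidence >= threshold:
            return text, attempt
    return None, len(settings)


def fake_ocr(image_id: str, setting: dict) -> Tuple[str, float]:
    """Simulated engine: the thorough profile adds a fixed confidence boost."""
    base = {"clean.tif": 0.97, "crumpled.tif": 0.80, "illegible.tif": 0.50}[image_id]
    bonus = 0.16 if setting["profile"] == "thorough" else 0.0
    return "recognized text", min(base + bonus, 1.0)


for name in ("clean.tif", "crumpled.tif", "illegible.tif"):
    print(name, process_with_escalation(name, fake_ocr))
```

In the simulation the clean scan finishes on the first pass, the crumpled scan succeeds only after escalation, and the illegible scan exhausts both settings and is handed off for manual entry.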
As discussed, no OCR engine is 100% accurate. Even with 100% character recognition accuracy, it is recommended to have a certain level of human QC due to the complexities of test requisitions. When it comes to time and money, human interaction is the most resource-intensive part of your data capture process. Don’t think of this time as only a cost. Allow the machine to automatically create the order, but enable the accessioner to QC a percentage of orders, starting with every scan that falls below 100% confidence. Most OCR software suites have a built-in screen allowing the accessioner to interact and fix any errors caused by characters not meeting the configured confidence threshold. This step happens between the first image processing and the order creation. During this step, observe the processing behavior to understand what is working and correct what needs improvement. For example, if you notice the ordering physician constantly writing outside of the comb lines, you should adjust your design. The accessioners should document any trends witnessed while scanning and processing requisitions. This iterative approach is an important part of the continuous improvement process.
The final step is to configure your OCR software so that it is able to convert ingested information and create an HL7 order message to send to your Laboratory Information System (LIS). Generally, OCR software suites allow for custom programming in a multitude of programming languages. You can use one of the supported languages to create an HL7 message. Alternatively, you can leverage an existing HL7 interface engine to read from the OCR software database and create an HL7 order message.
Speed, accuracy, and cost pull against one another. Increasing accuracy can slow down the process, requiring longer processing times and larger file sizes, both of which lead to increased cost. Similarly, if speed is most important to you, accuracy can be lost unless you increase the cost to include a large compute cluster with parallel processing.
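For readers who have not seen one, the Python sketch below assembles a bare-bones HL7 v2 order message (ORM^O01) from captured fields. It is a simplified illustration, not a complete interface: the application and facility names, the HL7 version, and the keys of the `order` dictionary are assumptions, values are not escaped for HL7 delimiter characters, and your LIS vendor's interface specification determines which segments and fields are actually required.

```python
from datetime import datetime


def build_orm_message(order: dict, now: datetime) -> str:
    """Build a minimal pipe-delimited HL7 v2 ORM^O01 message."""
    ts = now.strftime("%Y%m%d%H%M%S")
    placer = order["placer_order_number"]
    segments = [
        # Message header: sender, receiver, timestamp, type, control ID.
        f"MSH|^~\\&|OCR_APP|LAB|LIS|LAB|{ts}||ORM^O01|{order['control_id']}|P|2.3",
        # Patient identification: ID, name (family^given), birth date, sex.
        f"PID|1||{order['patient_id']}||{order['family_name']}^{order['given_name']}"
        f"||{order['dob']}|{order['sex']}",
        # Common order segment: NW marks a new order.
        f"ORC|NW|{placer}",
    ]
    # One observation request segment per ordered test.
    for i, (code, name) in enumerate(order["tests"], start=1):
        segments.append(f"OBR|{i}|{placer}||{code}^{name}")
    # HL7 v2 segments are terminated by carriage returns.
    return "\r".join(segments) + "\r"


order = {
    "control_id": "MSG0001",
    "patient_id": "P001",
    "family_name": "Doe",
    "given_name": "Jane",
    "dob": "19700115",
    "sex": "F",
    "placer_order_number": "ORD123",
    "tests": [("CBC", "Complete Blood Count")],
}
message = build_orm_message(order, datetime(2024, 1, 2, 3, 4, 5))
print(message.replace("\r", "\n"))
```

Whether you generate the message in the OCR suite or in an interface engine, validating it against a test instance of the LIS before going live is a sensible precaution.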
Success can mean different things depending on the desired outcome. How do you measure success? Character recognition accuracy, or the ability to recognize individual characters, influences success, but I personally measure success as the ability to create an overall accurate order in the Laboratory Information System. If the confidence level is low on character recognition, but the OCR engine is properly augmented with external databases that result in a proper order, I consider that a successful implementation.
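The difference between character-level accuracy and order-level success can be shown with a short Python sketch. The two metrics below are one reasonable way to define them, not an industry standard: character accuracy is derived from edit distance against the true text, and order accuracy is the share of orders whose final fields all match the truth.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def character_accuracy(recognized: str, truth: str) -> float:
    """Share of the true text recovered at the character level."""
    return max(0.0, 1 - edit_distance(recognized, truth) / len(truth))


def order_accuracy(orders: List[Tuple[Dict[str, str], Dict[str, str]]]) -> float:
    """Share of orders whose final fields exactly match the true fields."""
    correct = sum(1 for final, truth in orders if final == truth)
    return correct / len(orders)


# Raw recognition was imperfect, but the autocorrected order is right.
raw, truth = "Ray Kurzvveil", "Ray Kurzweil"
print(edit_distance(raw, truth))  # 2
print(order_accuracy([({"patient_name": "Ray Kurzweil"}, {"patient_name": truth})]))  # 1.0
```

Here the raw recognition contains two character errors, yet the order that reaches the LIS is correct, which is the outcome that ultimately matters.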
An often overlooked next step is simply telling the ordering physicians about your OCR initiative and explaining the OCR process to them. Use this as an opportunity for your sales team to express the importance of writing clearly and describe all the benefits OCR will bring them, including faster turnaround times and increased data accuracy. Both will lead to better overall patient care.
On the technical side, you can eliminate physical paper from your side of the process altogether by pushing the scanning process closer to the ordering physician. We will explore methods to accomplish this in a future white paper. For now, use these tips to take your test requisitions to the next level. I hope you’ve walked away with some valuable knowledge to keep in mind as you continue to enhance your laboratory workflow.
Let’s make your test requisitions optimally machine-readable!