Optical Character Recognition, or OCR, is a technology that recognizes text within a digital image. Converting a scanned image into recognizable characters makes scanned documents searchable, so you can locate key terms or phrases within the file. You may have a scanned document that is 200 pages long, but you only need to find the pages that mention your department’s name. Or maybe you have an entire shared drive of scanned documents, and you need to find every document that contains a specific budget number. This resource explains how to convert PDF files so that their text is recognizable, and then how to best search those files.

As we move toward a digital future, it is time to move away from paper and enhance your electronic data. Scanning paper records is a fantastic way to make information more accessible and secure. Unfortunately, by default, a scanned document is little more than a high-definition photograph. As a result, users cannot easily edit the content of those scanned images or search the file contents.

A traditional UW crest which features visible text of 'University of Washington' and '1861'

Imagine you scanned a document that featured the University of Washington crest. While our eyes can read that the scanned image says ‘University of Washington’ and ‘1861’, your computer may not automatically recognize that the scanned document contains readable characters. As a result, once a record is scanned, the only ways for users to know what the file contains are strategic folder structures, file naming conventions, and actually reading the document.

However, there are amazing tools available at your fingertips that can turn scanned documents into valuable information assets by converting a scan into searchable characters. This process of converting scanned images into searchable characters is known as Optical Character Recognition, or OCR. By applying the OCR process to the example above, the software can understand that the scanned document features the words ‘University of Washington’ and ‘1861’. Once the scanned document is converted, users can easily search within an individual document or across folders’ worth of documents for specific words or phrases.

One of the easiest ways to perform this OCR conversion is with Adobe Acrobat Pro. It is recommended to scan paper records into PDF files, which makes Adobe Acrobat Pro an obvious choice. However, an Adobe Acrobat Pro subscription is required to use these capabilities. The OCR conversion process is not available in Adobe Reader.

There is no need to convert most Microsoft Word, Excel, or PowerPoint file formats. Text is already recognizable within those programs, and users can already edit content within those file types. Instead, this resource guides users through how to OCR PDF files and how to use the advanced search functions afterward.

OCR also helps make a document more accessible. Refer to UWIT’s website for more information regarding accessibility guidelines.

Learn how to:

OCR Convert PDF files using Adobe Acrobat Pro:

Search for Text


Recognize Text in One PDF Document

The OCR process within Adobe Acrobat is known as Recognize Text. The first step is to open the PDF document you want to make searchable. Then, in the toolbar across the top of the document:

  1. Press Tools
    a PDF, opened in Adobe Acrobat Pro, with the 'Tools' button circled
  2. Click Enhance Scans
    a PDF, opened in Adobe Acrobat Pro, with a list of tools. The two locations of the 'Enhance Scans' button are circled
  3. On the top toolbar there will be a number of new options, including Insert, Enhance, and Recognize Text
  4. Click on the AA Recognize Text and a drop down menu will appear
    a PDF, opened in Adobe Acrobat Pro, displaying with the 'Enhanced Scans' toolbar with the 'Recognize Text' button circled
  5. Click on the In This File button
    a drop down menu from the 'Recognize Text' button, the 'In This File' button is circled
    1. A new toolbar will appear
      the PDF, opened in Adobe Acrobat Pro, with the 'Recognize Text' toolbar displayed
    2. Ensure that:
      1. All Pages is selected
      2. Language: English (US)
      3. The settings should read:
        1. All pages
        2. Document Language: English (US)
        3. Output: Searchable Image
        4. Downsample To: 600 dpi
          the 'Recognize Text' settings pop up
    3. Click OK
  6. Click the Recognize Text button
    the PDF, opened in Adobe Acrobat Pro, with the 'Recognize Text' toolbar displayed and the 'Recognize Text' button circled
  7. A status bar will appear at the bottom of the page while Adobe recognizes the text in the PDF document
    a status bar pop up displaying status that reads Page 2: Converting scanned page to Searchable Image Exact
  8. Be sure to Save the document you have open once the conversion is complete.
    1. Saving over the existing file is prudent. By overwriting the existing file, you avoid creating duplicate copies. The OCR process does not change the content of the scanned record, so there is no need to maintain two copies of the same file.
      a PDF, opened in Adobe Acrobat Pro, displaying with the 'Enhanced Scans' toolbar , the 'Save' icon in the top left corner is circled

Recognize Text in Multiple PDF Documents

You may have an entire folder filled with previously scanned documents that you want to enhance. These instructions will allow you to OCR convert multiple PDF documents at once.

Warning: You will not be able to open other Adobe PDFs while the recognition process is running, so be sure to plan ahead. If you have more than ten files, or the files are hundreds of pages long, consider starting the process at the end of the day so it can run while you are away from the computer.

  1. Open a PDF document
    1. Click on Tools
      a PDF, opened in Adobe Acrobat Pro, with the Tools toolbar circled
  2. Click Enhance Scans
    a PDF, opened in Adobe Acrobat Pro, with a list of tools. The two locations of the 'Enhance Scans' button are circled
  3. On the toolbar at the top of the window, there will be a number of new options, including Insert, Enhance, and Recognize Text
  4. Click on the AA Recognize Text and a drop down menu will appear
    a PDF, opened in Adobe Acrobat Pro, displaying with the 'Enhanced Scans' toolbar with the 'Recognize Text' button circled
  5. Click on In Multiple Files...
    a drop down menu from the 'Recognize Text' button, the 'In Multiple Files' button is circled
  6. A pop up will appear with the document you were just working on, already listed
    a pop up window after having clicked on the 'In Multiple Files' button. The pop up features the current PDF file that is opened and with an opportunity to add more
  7. Add other files to recognize text:
    There are two ways you can do this. One way is to search for files within the program and select them to be added to the queue (Option A). Alternatively, you can manually drag and drop files into the queue (Option B).
    1. Option A:
      1. In the top left there is an Add Files... button
        1. Click on that button
      2. From the drop down menu, select Add Files...
        the popup window with a drop down that occurs after having pressed the 'Add Files…' button. The 'Add Files' button is circled in that drop down menu
      3. Select the additional files you would like to recognize text in.
        1. You can select multiple documents by holding the Ctrl button
          a Windows File Folder filled with PDFs with multiple PDF files selected at one time
        2. You can select all documents in a folder by pressing Ctrl + A
          a Windows File Folder filled with PDFs with all the PDF files selected at one time
      4. Click Open
      5. The PDF document or multiple PDFs will now be added to the list.
      6. Repeat this process for all the PDF files you want.
      7. If you wish to deselect files that have been placed in the popup for text recognition, click on the file name so that it is highlighted. Press the Remove button found in the bottom left hand corner
        the popup window in Adobe Acrobat Pro with a list of PDF files previously selected and the 'Remove' button circled
      8. Once you are ready, press the OK button
        the popup window in Adobe Acrobat Pro with the 'OK' button circled
    2. Option B:
      1. Alternatively, you can drag and drop files from Windows File Explorer into the white space of the Adobe file list. You will need to open the source folder and Adobe Acrobat side by side so you can view both on the same screen.
        1. You can select multiple documents by holding the Ctrl button. Once selected, just drag and drop into the white space on the Adobe Acrobat screen.
          a split-screen view with the left half of the screen featuring the Adobe Acrobat Pro popup window while the right side features the Windows File Folder filled with PDFs, multiple PDFs are selected
        2. You can select all documents in a folder by pressing Ctrl + A. Once selected, just drag and drop into the white space on the Adobe Acrobat screen.
          a split-screen view with the left half of the screen featuring the Adobe Acrobat Pro popup window while the right side features the Windows File Folder filled with PDFs, all PDFs are selected
      2. The PDF files will be added to the list once you release your cursor
      3. You can repeat this drag & drop process until all files are added.
      4. If you wish to deselect files that have been placed in the popup for text recognition, click on the file name so that it is highlighted. Press the Remove button found in the bottom left hand corner
        the popup window in Adobe Acrobat Pro with a list of PDF files previously selected and the 'Remove' button circled
      5. Once you're ready, press the OK button
        the popup window in Adobe Acrobat Pro with the 'OK' button circled
      6. After you press OK, a new pop up will appear
        1. Keep all the settings and choices the same
          1. Target Folder: The Same Folder Selected at Start
          2. File Naming: Keep Original File Names
          3. Overwrite existing Files is checked
          4. Press OK
          the popup window in Adobe Acrobat Pro after clicking the 'OK' button featuring the Output Option settings

          Overwriting the existing files is prudent. By overwriting the existing files, you eliminate the creation of duplicate copies of the same record. The actual content of the record will remain unchanged. All this process is doing is making the existing content more accessible and searchable.

        2. A pop up will appear
          1. Keep Pages as “All Pages” selected
          2. Keep Settings as they are
            • Document Language: English (US)
            • Output: Searchable Image
            • Downsample To: 600 dpi
          the popup window in Adobe Acrobat Pro after clicking the 'OK' button in the Output Options popup, this popup displaying the Recognize Text General Settings window
      7. A status bar will appear while Adobe recognizes the text in all the PDF documents
        a status bar pop up displaying status that reads 'Page 2: Converting scanned page to Searchable Image Exact'
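
Before starting a batch, it can help to know how many PDFs a folder actually holds. The following is a minimal Python sketch, not part of Adobe Acrobat, that lists the PDF files under a folder (including subfolders) and flags a batch of more than ten files, following the warning above. The function names and the threshold constant are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Assumption taken from the warning above: more than ten files is a large batch.
LARGE_BATCH_THRESHOLD = 10


def inventory_pdfs(folder):
    """Return a sorted list of PDF paths found under folder, subfolders included."""
    return sorted(
        path
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() == ".pdf"
    )


def is_large_batch(pdf_paths, threshold=LARGE_BATCH_THRESHOLD):
    """Return True when the batch holds more files than the threshold."""
    return len(pdf_paths) > threshold


# Example use (the folder name is a placeholder):
# pdfs = inventory_pdfs("Scanned Records")
# if is_large_batch(pdfs):
#     print(f"{len(pdfs)} PDFs found; consider running Recognize Text at the end of the day.")
```

This only counts files. It does not open the PDFs, so it cannot tell how many pages each one has or whether its text has already been recognized.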



Find Text Within One PDF Document

The instructions below allow you to locate a specific word or phrase within a PDF document that you have converted using the OCR process.

  1. Open a PDF Document using Adobe Acrobat Pro
  2. Go to the Edit tab, located in the top left corner
    a PDF, opened in Adobe Acrobat Pro, with the 'Edit' button circled
  3. Click on Find (or press Ctrl+F)
    the drop down menu after having clicked the 'Edit' button. The 'Find' button with a magnifying glass icon is circled
  4. A popup will appear titled Find
    1. Type the word or phrase you would like to search for
    2. Click Next
      a pop up window with the 'Find' search bar and two button below it featuring 'Previous' and 'Next' buttons
  5. Starting from the current page, Adobe will search the document for each instance that matches the search parameters and highlight it
  6. Clicking Next will bring you to the next result in the document
    the Adobe Acrobat Pro window with the words
  7. Once you have no more new search results, a pop up will appear stating that no more matches were found
    the pop up featuring that no more matches were found to the search
  8. If you want to do a more advanced search within a document, refer to the instructions below for using the Advanced Search features



Search Across PDF Documents Using Adobe Advanced Search

The instructions below allow you to locate a specific word or phrase within any and all PDF records that have had their text recognized. Perhaps you’re looking for a budget number across multiple files and folders. Perhaps you’re looking for a student or faculty name. This method will locate all instances where the key word or phrase appears.

WARNING: Before you begin, be sure to keep at least one PDF document open at all times while using the Advanced Search. If you close the last PDF document, the entire Adobe Acrobat application will close, along with your Advanced Search results.

  1. Open a PDF Document using Adobe Acrobat Pro
  2. Go to the Edit tab, located in the top left corner
    a PDF, opened in Adobe Acrobat Pro, with the 'Edit' button circled
  3. Click on Advanced Search (or press Shift+Ctrl+F)
    the drop down menu after having clicked the 'Edit' button. The 'Advanced Search' button is circled
  4. A popup will appear titled Search
    1. Click on the circle next to: All PDF Documents in
      the Advanced Search pop up
    2. Click on the folder drop down menu and choose which folder you would like to search in.
      1. If you do not find the folder you would like in the suggested list, select Browse for Location...
        the Advanced Search pop up with a drop down menu extended under the 'All PDFs in' section, displaying a list of computer folders
        1. In the new pop up, browse to the folder you would like to search within and press OK
          the Browse For Folder pop up with an example Windows Folder to select. The  screenshot features a folder called
    3. Type the word or phrase you would like to search for
    4. Choose any of the four checkbox selections if applicable
    5. Press Search
      the Advanced Search pop up with the 'Office Admin' windows folder selected as the destination, the phrase
  5. You can then expand the results by clicking the arrow icon next to the document name.
    1. Doing so will display how many times, and a brief location of where, in the document the word/phrase appears
      the pop up window displaying the search results with one PDF located. A dropdown arrow icon next to the search result is circled
  6. If you hover your mouse over the small Adobe icon or over the title of the PDF, you can find the location of the document.
    the pop up window displaying the search results with one PDF located. A small adobe PDF icon is circled and a temporary popup is displayed showing metadata details about the one PDF
  7. If you click on the search result, it will automatically open the PDF document and jump to the location of the word/phrase within the document
    the pop up window displaying the search results with one PDF located. The located search term 'Customer Copy' is circled
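
To make the shape of these search results concrete, here is a minimal Python sketch of the same idea: for each document, list where a phrase appears along with a short snippet of surrounding text. It is an illustration only and is not how Adobe Acrobat is implemented. It assumes the recognized text is already available as one string per page. The function names, the `case_sensitive` and `whole_words` options, and the sample documents are all hypothetical.

```python
import re


def find_phrase(pages, phrase, case_sensitive=False, whole_words=False, context=30):
    """Return (page_number, snippet) pairs for every match of phrase.

    pages is a list of strings, one per page, with page 1 first.
    """
    pattern = re.escape(phrase)
    if whole_words:
        # \b only behaves as expected when the phrase starts and ends
        # with a letter, digit, or underscore.
        pattern = r"\b" + pattern + r"\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    regex = re.compile(pattern, flags)

    hits = []
    for page_number, text in enumerate(pages, start=1):
        for match in regex.finditer(text):
            start = max(match.start() - context, 0)
            end = min(match.end() + context, len(text))
            # Collapse line breaks and repeated spaces so the snippet fits on one line.
            snippet = " ".join(text[start:end].split())
            hits.append((page_number, snippet))
    return hits


def search_documents(documents, phrase, **options):
    """Map each document name to its hits, leaving out documents with no match."""
    results = {}
    for name, pages in documents.items():
        hits = find_phrase(pages, phrase, **options)
        if hits:
            results[name] = hits
    return results


# Hypothetical sample data standing in for recognized text.
sample_documents = {
    "receipt.pdf": ["Invoice 12\nCustomer Copy", "customer copy retained"],
    "memo.pdf": ["Nothing relevant here"],
}
for name, hits in search_documents(sample_documents, "Customer Copy").items():
    print(name, len(hits), "match(es)")
    for page_number, snippet in hits:
        print("  page", page_number, ":", snippet)
```

Grouping the hits by document mirrors the expandable result list described above, where each document shows how many times the phrase appears and where.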



Search Across Folders Using Windows Search Features

The Windows search feature we are all familiar with often searches only the titles of files. The instructions below allow you to search for a specific word or phrase inside individual documents from Windows File Explorer. Once a PDF file has been converted through the OCR processes above, these instructions will help make that PDF discoverable through the Windows File Explorer search functions.

  1. Open the Windows File Explorer for the top folder you wish to search within
    1. In this example, doing a search from this folder level will search the loose documents as well as each of the subfolders
      a Windows File Folder filled with folders and individual documents
  2. Click in the search window in the top right corner and type your search parameters
    a Windows File Folder, zoomed in at the top part of the window featuring the search bar in the top right corner with the word 'classified' typed in the search bar. The search results feature text saying 'No items match your search'
  3. Press Enter on your keyboard to begin the search
    1. In this example, the initial search led to no search results
  4. In the toolbar at the top of the page, under Search Tools, click Search
    a Windows File Folder, zoomed in at the top part of the window featuring the search bar in the top right corner with the word 'classified' typed in the search bar with the 'Search Tools - Search' toolbar at the top of the page circled
    1. From this drop down menu of items, you can narrow your search parameters, including the file format, file size, tags, and date modified.
  5. Click on the Advanced options button
    the zoomed in dropdown menu that pops up after pressing the 'Search' toolbar button. The dropdown menu has the
  6. From the drop down menu, under In non-indexed locations, click on File contents.
    1. If the check mark is already visible, you can skip this step.
  7. Once you click this button, the search will be repeated. This time, Windows will search not only the titles of the files but also the contents of your documents.
    1. If you have recognized text within PDF documents, the PDF documents will appear in the results if they meet your search parameters
    2. In this example, 3 items were located by changing this setting. One of them was a PDF document with text recognized.
      a Windows File Folder, displaying the term 'classified' in the search bar and featuring one PDF and two excel documents as being located with the search term and setting
  8. You can go back to the Search Tools & Advanced options and notice that there is now a check mark next to the File contents.
    the zoomed in dropdown menu that pops up after pressing the 'Search' toolbar button. The dropdown menu, under the 'In non-indexed locations' section displays a checkmark next to the 'File Contents' button
  9. You are done! If you wish to change your settings back to only searching the file titles and not searching the contents of documents, be sure to uncheck the File contents option.
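
The difference between searching file titles and searching file contents can be illustrated with a short Python sketch. This is a simplified stand-in, not how Windows search works internally: it reads every file as plain text, so it only suits text-based files and will not read the recognized text inside a PDF. The function name and the `file_contents` parameter are hypothetical, named after the File contents option above.

```python
from pathlib import Path


def search_folder(top_folder, term, file_contents=False):
    """Return sorted paths under top_folder whose name contains term.

    When file_contents is True, files whose text contains term are included too.
    Matching ignores upper and lower case.
    """
    needle = term.lower()
    matches = []
    for path in Path(top_folder).rglob("*"):
        if not path.is_file():
            continue
        if needle in path.name.lower():
            matches.append(path)
            continue
        if file_contents:
            try:
                # Assumption: files are treated as UTF-8 text; undecodable bytes are ignored.
                text = path.read_text(encoding="utf-8", errors="ignore")
            except OSError:
                continue  # skip files that cannot be read
            if needle in text.lower():
                matches.append(path)
    return sorted(matches)


# Example use (the folder name is a placeholder):
# search_folder("Office Admin", "classified")                      # titles only
# search_folder("Office Admin", "classified", file_contents=True)  # titles and contents
```

As in the example above, a search that finds nothing by title alone may return results once file contents are included.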