Optical character recognition (OCR) is a blessing for publishers and authors: it converts an image of text into an editable document, such as a Word file. Magic! A lost manuscript restored from hard copy!
The OCR process is somewhat similar to the way a human harvests words from page or screen.
At this point, the similarity between human reading and OCR "reading" ends. For us, the harvest is only step one. Humans attribute meaning to these shapes, and assemble meanings into a narrative. We interpret, we evaluate, we follow, we think. From a page full of symbols, we construct something bigger.
OCR is kind of dyslexic, or perhaps still a pre-schooler. It thinks in pictures. It sees a shape, compares that shape with the symbols it knows, and regurgitates its best guess. (It finds some fonts much clearer than others, by the way.)
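For the curious, here is a toy sketch of that "compare the shape with known symbols" step, written in Python. It is only an illustration, not how any particular OCR product works: the tiny 3x5 letter grids and the names `TEMPLATES`, `similarity` and `best_guess` are all made up for this example, and real software uses far more sophisticated models.

```python
# Hypothetical 3x5 pixel grids for three letters ("1" = ink, "0" = blank).
# These templates are invented for illustration only.
TEMPLATES = {
    "I": ["111", "010", "010", "010", "111"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}


def similarity(a, b):
    """Return the fraction of pixels on which two same-sized grids agree."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def best_guess(shape):
    """Return (letter, score) for the known template closest to the shape."""
    scored = ((letter, similarity(shape, grid)) for letter, grid in TEMPLATES.items())
    return max(scored, key=lambda pair: pair[1])


# A "T" with one smudged pixel in the top bar is still closest to "T".
smudged_t = ["110", "010", "010", "010", "010"]
print(best_guess(smudged_t)[0])  # T

# An "I" whose bottom bar has worn away is pixel-for-pixel a "T".
worn_i = ["111", "010", "010", "010", "010"]
print(best_guess(worn_i)[0])  # T
```

The second case is the interesting one. The program never asks whether "T" makes sense in the word; it only reports the nearest shape it knows, so a worn "I" comes back as a confident "T".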
I've been reformatting a novel from such a document. Sometimes you can literally see why OCR software might reach a certain conclusion...
And sometimes it's hard to spot the logic...
So what's the moral of this story? Never skimp on proofreading.