Google has bought anti-spam firm reCaptcha for an undisclosed sum, as the search giant looks to further expand its ability to organise the world’s data. reCaptcha helps protect sites from spam and fraud by checking that a response is not generated by a computer. The technology presents users with images of words that must be typed in to verify legitimate online requests and registrations. reCaptcha is currently used by more than 100,000 websites to guard against spam and fraud.
Because reCaptcha takes these words from scans of archival newspapers and old books, Google said it hopes to use the technology to improve its own text-scanning projects, such as Google Books, and to create searchable archives.
reCAPTCHA’s unique technology improves the process that converts scanned images into plain text, known as Optical Character Recognition (OCR).
This technology also powers large-scale text scanning projects like Google Books and Google News Archive Search.
Having the text version of documents is important because plain text can be searched, easily rendered on mobile devices and displayed to visually impaired users.