Power of Humanity Harnessed in Computing
by Donna Lee
Ever wonder exactly what is going on with those words you have to type in to get access to some websites? Those strange, twisted, distorted words that you have to tilt your head to the side to see? Sure, it says that it’s to prevent automated programs (bots) from bombarding the website with unauthorized requests and e-mails (spam), but is that the whole story?
Turns out that, no, it is not. Those words are actually an ingenious plot to make you work for free. But it’s not as bad as it sounds. Especially not if you like e-books.
What’s up with the words?
The words that you see are part of a CAPTCHA, which stands for “Completely Automated Public Turing test to tell Computers and Humans Apart.” Invented in 2000 at the height of free online services and spamming, the CAPTCHA served as our best defense against bots for several years. Basically, the function of the CAPTCHA is to provide a test that humans are able to pass and computers cannot, but that computers can still score, which makes it quick and easy to spot spam-bots. By asking the computer user to identify distorted text, the program can ascertain whether the user is an actual human being and, if not, deny access to the website in question.
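For the curious, here is a rough sketch in Python of that "hard to pass, easy to score" idea. It is only an illustration: the function names `new_challenge` and `score` are made up for this example, and the step that actually matters for security, drawing the text as a distorted image, is left out.

```python
import secrets
import string

# Characters the hypothetical challenge text is drawn from.
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length=6):
    """Return random text that a website would then render as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def score(expected, typed):
    """Scoring is the easy part for a computer: compare the typed text to the stored answer."""
    return typed.strip().upper() == expected


challenge = new_challenge()
print(score(challenge, challenge.lower()))  # a visitor who reads the text correctly passes
print(score(challenge, ""))                 # a bot that cannot read the image fails
```

The website keeps the answer to itself and only ever shows the visitor the distorted picture, so checking a response is trivial while producing one requires reading the image.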
“Great,” you might say, “but how is this making me work for free? And what does it have to do with e-books and such?”
How is this useful?
The first iteration of the CAPTCHA used random letters to tell the difference between human and spam-bot. In 2006, however, the creator of the program, Dr. Luis von Ahn, calculated that the amount of time people were spending on solving CAPTCHA puzzles amounted to approximately half a million hours per day. Five hundred thousand hours of wasted effort every day, he realized. The next logical question was: what could be done with that time? How could that human effort, that human brainpower, be put to use?
His answer was to change the CAPTCHA to show scanned text from documents which were being converted from paper to digital format. Many times, old documents (especially handwritten ones) cannot be read accurately by digital scanners but can be identified by human beings. Two problems were neatly solved. First, the text presented to the user was already known to be unreadable by computers, as computers had tried to figure out those words and could not. Second, the time that people were spending on gaining access to protected websites was put to an actual, useful purpose: work that would cost $500 million a year if the participants were paid minimum wage to do it.
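How might a website trust an answer for a word that no computer could read in the first place? One plausible approach, sketched below in Python, is to show each visitor two words: a control word whose answer is already known, and an unknown scanned word. The two-word pairing, the `VOTES_NEEDED` threshold, and the function names are assumptions made for this illustration and are not described in the account above.

```python
from collections import Counter, defaultdict

VOTES_NEEDED = 3  # hypothetical: how many matching answers before a reading is accepted

# Maps an unknown word's id to a tally of what visitors typed for it.
votes = defaultdict(Counter)


def submit(control_answer, typed_control, unknown_id, typed_unknown):
    """Return True if the visitor passes; if so, record their reading of the unknown word."""
    if typed_control.strip().lower() != control_answer.lower():
        return False  # failed the known word, so the other answer is not trusted
    votes[unknown_id][typed_unknown.strip().lower()] += 1
    return True


def transcription(unknown_id):
    """Return the agreed reading once enough visitors gave the same answer, else None."""
    if not votes[unknown_id]:
        return None
    guess, count = votes[unknown_id].most_common(1)[0]
    return guess if count >= VOTES_NEEDED else None
```

In this sketch, a visitor who gets the control word right is presumed human, and their reading of the unknown word counts as one vote. Once enough independent visitors agree, the word is treated as transcribed.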
Is there more?
Further adaptation of this idea has led to other advances: Dr. von Ahn has also pioneered a language-learning and translation program, Duolingo, which teaches six languages, including Spanish, French, and Portuguese, using game-style learning. At the same time, the program collects data on which learning and presentation techniques help participants understand the material more easily and quickly. With an estimated 3 million users who “play” for about 30 minutes a day, the sheer amount of brainpower applied to this program is staggering.
Over time, this data will be able to help linguists and researchers determine the best ways to guide language learning based on culture, first language, gender, and an assortment of other criteria which have never been assessed before.
Whoever said that nothing good was ever free?
Author Donna Lee has worked at Edictive for the past three years, delivering production management services to the film industry. From time to time she writes about film and television production on the Edictive blog.