Tuesday, April 22, 2008

CAPTCHA: insights into making humans and algos work together

Early CAPTCHAs such as these, generated by the EZ-Gimpy program, were used on Yahoo. However, technology was later developed to read this type of CAPTCHA: http://www.cs.sfu.ca/~mori/research/gimpy/ (Image via Wikipedia)

I did some work on CAPTCHAs during my time at RPI. I recently revisited the topic and am excited to see some interesting developments. CAPTCHAs are very popular, but they do slow people down, and the person-hours spent worldwide entering them have grown to a sizable number. The increased use of CAPTCHAs to block bot access has also led to desperate ways of getting around them. Some porn sites did something quite creative: they require users to solve a CAPTCHA before they can see the next image, and the entered solution is then used to create free email accounts, profiles, etc.

Luis von Ahn, one of the inventors of CAPTCHA and now a professor at CMU, developed reCAPTCHA. reCAPTCHA improves the process of digitizing books by sending words that computers cannot read to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that OCR cannot read correctly is placed on an image and used as a CAPTCHA. This is possible because most OCR programs flag the words they cannot read reliably. The interesting insight here is that the human cycles spent solving CAPTCHAs can now help computers solve a difficult problem and also serve a philanthropic purpose, namely digitizing books.
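To make the flow concrete, here is a minimal sketch of the idea, not reCAPTCHA's actual implementation. It assumes the OCR step returns a confidence score per word, and it accepts a human transcription once enough people agree on it. The function names, the 0.8 confidence threshold, and the three-vote rule are all my own assumptions for illustration.

```python
from collections import Counter

def words_needing_humans(ocr_output, min_confidence=0.8):
    """Return positions of words the OCR step flagged as unreliable.

    ocr_output is a list of (word, confidence) pairs; the threshold
    is a hypothetical value, not one taken from reCAPTCHA.
    """
    return [i for i, (_, conf) in enumerate(ocr_output) if conf < min_confidence]

def resolve_word(human_answers, min_votes=3):
    """Accept a transcription once enough humans agree on it, else None."""
    if not human_answers:
        return None
    normalized = [answer.strip().lower() for answer in human_answers]
    best, votes = Counter(normalized).most_common(1)[0]
    return best if votes >= min_votes else None

# Toy OCR result for one scanned line: two words are low confidence.
ocr_output = [("the", 0.99), ("qu1ck", 0.41), ("brown", 0.97), ("f0x", 0.35)]
flagged = words_needing_humans(ocr_output)

# Answers typed by people who were shown the flagged word as a CAPTCHA.
resolved = resolve_word(["quick", "Quick ", "quick", "quack"])
```

Only the flagged words are shown to people, so the machine still does most of the reading and humans fill in the gaps.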

Luis also developed the ESP game, licensed by Google, which is helping to label the images on the web. He turned the labeling process into a game in which a user describes an image with labels. If a label matches an existing label from another user for the same image, the label is reserved and the user needs to enter a different one. The first few labels are entered easily, but after that some interesting labels emerge. For instance, some players labeled George Bush with 'dumb', a label that would not be obvious to a computer unless context words in surrounding documents suggested it. Luis claims that with 5,000 people playing this game simultaneously, all the images in Google could be labeled in 2 months! The interesting insight here is that humans were used to solve a problem that computers are yet to solve effectively.
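The matching rule described above can be sketched in a few lines. This follows the description in this post rather than the real ESP game code, and the class and method names are hypothetical.

```python
class ImageLabels:
    """Tracks labels for one image, following the rule described above."""

    def __init__(self):
        self.suggested = set()  # labels entered once, not yet confirmed
        self.reserved = set()   # labels confirmed by a second user

    def submit(self, label):
        label = label.strip().lower()
        if label in self.reserved:
            # Already agreed on: the player must think of something new.
            return "reserved"
        if label in self.suggested:
            # A second user entered the same label, so it is confirmed.
            self.suggested.remove(label)
            self.reserved.add(label)
            return "matched"
        self.suggested.add(label)
        return "recorded"

image = ImageLabels()
results = [image.submit(label) for label in ["dog", "Dog", "dog", "puppy"]]
```

Because reserved labels are rejected, later players are pushed past the obvious words, which is where the more interesting labels come from.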

So there are two ideas here:

1) Identify tasks that humans do on a mundane basis and then find a way to use those cycles spent to solve hard computer problems.

2) Identify hard computer problems and figure out creative collaborative ways in which humans can be used to solve these problems.

In both examples, Luis replaced computers with humans to solve the problem (of course, he used the computer GUI to facilitate the replacement). I believe that connecting humans and computers at a deeper level could bear even more fruit. The following two examples explain my point:

Computer Assisted Visual Pattern Recognition: the machine segments and classifies an object, then communicates with the human by drawing a visual model around the segmented object as a medium for feedback. When the human is not satisfied with the result, the human modifies the visual model, and the classifier takes that input and runs again. My MS thesis applied this idea to flower and skin recognition, and my advisor, George Nagy, also advocates it.

Content Creation: as humans write an article, the computer suggests spelling corrections. As I write this blog post, I have a Zemanta plug-in that suggests images and links I can use. I did not have to search for the images in this post; they were suggested to me by the machine.

The insight here is that the computers were not replaced by humans; instead, the two collaborated at a deeper level to solve a hard problem. The computers still attempted to solve the problem but were augmented with a communication medium (e.g. the visual model) through which they sought help from the human if their attempt failed.
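A rough sketch of that pattern: the machine always tries first and hands its attempt to a human only when it is not confident. Everything here (the function names, the 0.9 threshold, the toy classifier) is a made-up illustration, not the system from my thesis.

```python
def classify_with_help(item, classifier, ask_human, min_confidence=0.9):
    """Let the machine try first; fall back to a human when it is unsure.

    classifier(item) returns (label, confidence). ask_human(item, label)
    is shown the machine's attempt and returns the corrected label.
    """
    label, confidence = classifier(item)
    if confidence >= min_confidence:
        return label, "machine"
    return ask_human(item, label), "human"

# Toy stand-ins for a real classifier and a real user interface.
def toy_classifier(item):
    known = {"rose.jpg": ("rose", 0.97), "blurry.jpg": ("tulip", 0.55)}
    return known.get(item, ("unknown", 0.0))

def toy_human(item, machine_guess):
    # In a real tool the human would adjust the visual model instead.
    return "daisy"

confident = classify_with_help("rose.jpg", toy_classifier, toy_human)
assisted = classify_with_help("blurry.jpg", toy_classifier, toy_human)
```

The human sees the machine's guess rather than a blank slate, so the correction builds on the machine's work instead of replacing it.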

Are there more applications? I would bet there are many. All one needs is some out-of-the-box thinking and a brilliant idea, and one has the next big application.

Here is a very nice talk by Luis on Human Computation at Google:


