Friday, May 18, 2012

Who's afraid of the big bad algorithm?

Not me.  But some newsie on NPR is afraid one is gonna conquer the world.  She was talking about Netflix and its movie recommendations, all made by computer.  Not that I think Netflix's recommendations are extraordinary, but occasionally they do steer me onto a good flick.  The newsie feared that improvements in the "algorithm" would yield a killer app that could read minds, violate civil liberties, and put Skynet in charge of the world.
  Not to worry.  First of all, the algorithm Netflix uses is trivial.  Algorithm just means procedure.  As an example, consider a popular algorithm to find square roots.  It goes like this: guess what the root might be.  Square your guess and compare it with the original number.  If the squared guess is too big, try a smaller guess; conversely, if the squared guess is too small, try a bigger guess.  Repeat until the squared guess is close enough to the original number.  Code this algorithm in your favorite computer language, and you have a program to find square roots.
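  For the curious, here is roughly what that guessing procedure looks like in Python.  It is only a sketch: the function name, the tolerance, and the step limit are my own choices, and picking each new guess halfway between the last too-small and too-big guesses is one simple way to "try a smaller guess" or "try a bigger guess", not the only one.

```python
def guess_square_root(number, tolerance=1e-9, max_steps=200):
    """Find the square root of a non-negative number by repeated guessing."""
    if number < 0:
        raise ValueError("number must be non-negative")
    # The root lies somewhere between 0 and the larger of 1 and the number.
    low, high = 0.0, max(1.0, number)
    guess = (low + high) / 2
    for _ in range(max_steps):  # step limit is an assumed safety net
        error = guess * guess - number
        if abs(error) <= tolerance:
            break              # squared guess is close enough
        if error > 0:
            high = guess       # squared guess too big: try a smaller guess
        else:
            low = guess        # squared guess too small: try a bigger guess
        guess = (low + high) / 2
    return guess


print(guess_square_root(2))
```

  A dozen lines, most of them bookkeeping.  That is all "algorithm" means.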
  What Netflix does is ask us viewers which movies we like.  Then it looks for other movies that are like the ones we like.  To do this you need a list of all the movies in Netflix, and to go with each movie you need some properties, such as type (western, war movie, musical, costume drama, animated, etc.), cast (the actors and actresses who play in the movie), director, rating (G, PG, R ...), year released, color or black & white, and so on.  This is a database of movies.  All the computer does is look for movies that match the properties of the movies the customer likes.  The Netflix "algorithm" is merely this: find movies with properties as close as possible to the properties of the customer's liked movies.  For a computer guy, that's a straightforward bit of coding.
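  To show how little code the matching takes, here is a sketch in Python.  This is my illustration of the idea, not Netflix's actual code: the four-movie catalog, the titles, the director names, and the function names are all made up, and "as close as possible" is taken to mean "agrees on the most properties".

```python
# Made-up catalog: every title and property value here is invented.
CATALOG = {
    "Western One": {"type": "western", "director": "Director A",
                    "rating": "PG", "color": True},
    "Western Two": {"type": "western", "director": "Director A",
                    "rating": "PG", "color": False},
    "Musical One": {"type": "musical", "director": "Director B",
                    "rating": "G", "color": True},
    "War Movie One": {"type": "war movie", "director": "Director C",
                      "rating": "R", "color": True},
}


def similarity(a, b):
    """Count the properties on which two movies agree."""
    return sum(1 for key in a if key in b and a[key] == b[key])


def recommend(liked_titles, catalog, how_many=2):
    """Rank the movies the customer has not liked yet by property matches."""
    scores = {}
    for title, props in catalog.items():
        if title in liked_titles:
            continue  # skip movies the customer already told us about
        scores[title] = sum(similarity(props, catalog[liked])
                            for liked in liked_titles)
    # Highest total match count first.
    return sorted(scores, key=scores.get, reverse=True)[:how_many]


print(recommend(["Western One"], CATALOG))
```

  Like "Western One" and the other western comes out on top, because it agrees on type, director, and rating.  All the hard work went into filling in the catalog.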
  What makes it work well is the database, especially if we can define some more properties.  Amount of violence and sexiness come immediately to mind, but there must be more.  The more well-chosen and well-defined properties in the database, the better the match.
  So it's the database that makes Netflix work; the algorithm is trivial.
