Let’s talk about AI development using something everyone has experienced: Skip Intro. You’ve seen it on Netflix while watching a show. That simple button jumps you two minutes into the future and gets you past the snappy intro with ease. Then there is the autoplay feature that knows you have reached the end of an episode and automatically begins the next one in fifteen seconds. We have all been sucked into a Netflix binge by that fifteen-second countdown. So how do they create these features that streamline your TV viewing experience?
Now, many people would assume that Netflix has created an AI system that can watch TV shows and apply flags to the data file. These flags would be used by the program to enable certain features: skipping intros and autoplaying the next episode. That assumption comes from marketing departments. In every major corporation, there is a marketing department talking about its AI or machine learning programs and how they are moving forward. Except creating an AI or machine learning program is difficult and very expensive.
When I say very expensive, I mean like ungodly expensive. If you look at Google’s AI department, there are like thirty engineers working full-time in Palo Alto. Not only that, if you are on the AI program, you have to be overcompensated so you don’t jump ship and take IP with you to another company. That means the company has (more than likely) hundreds of engineers working on creating a single program that will do important tasks. While these departments exist, they aren’t being applied to the technology you see.
A great example was a lawsuit filed against Microsoft. Former employees of Bing Search sued because they had to stare at horrible pictures and flag them. Why didn’t Microsoft just create an AI system that flagged inappropriate pictures automatically and keep humans out of it? Simple: an engineer is going to cost hundreds of thousands of dollars, while a person who goes through pictures and flags them will cost thirty thousand dollars. To flag a picture, you just need eyes and a basic understanding of what is inappropriate. It was a business decision, and it was cheaper to hire someone to flag things than to build an AI system.
So, how do major technology companies create the little features we love? Simple: they probably pay employees some money to grind through a basic task instead of developing software to do it. Skip Intro is a great example. I can think of a few ways it could be done:
1. Someone at Netflix watches every show, flags the beginning of the intro in the file, then flags where the intro ends. That tells the program where to skip to.
2. When people watch the show, they manually fast-forward to the end of the intro. This tells you the beginning and the end of the intro, but you have to make assumptions about which fast-forwards in the user input were actually intro skips.
   - You are using crowdsourced data and assumptions to deduce the intro length for that series.
   - The first viewers have to provide that input, which means the feature doesn’t become available until later. Different people will also skip at different times.
3. Person/software solution. Here you take either 1 or 2 above and then apply a software solution. All video files are data, so at any given point in time there is a unique set of data for that specific video. A person goes to the first episode with an intro, flags the beginning of the intro (which gives you a reproducible piece of data for all intros) and then the end of the intro (another unique piece of data). Since the introduction is always the same (cut from the same file), you can always find the introduction’s beginning and end.
   - Once you have flagged these unique pieces of data, you just have software go through and add the flags to all the files.
   - If the intro changes, you bring the human back in to flag the new intro the same way. Then the software goes through and adds the flags at the appropriate points in the files.
4. Software solution. Instead of having a person flag the intro for the program, you have the software look through all the files for a season and isolate the repeated pieces of data. Your intro would be the same piece of data in each episode. Once you see a pattern in the data, you isolate it and add flags at the beginning and end of the sequence so a user can skip the intro. This would also work for the credits, as they would appear as a repeated sequence too.
   - The problem is that there could be hidden scenes at the end of the episode, and you would lose that added value if you automated skipping that part of the video.
   - It would cost a good amount to get to this point compared with crowdsourcing or hiring a cheap resource to look through the videos and apply the flags.
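To make the crowdsourced approach and the pure software approach a bit more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions, not how Netflix actually does it. The function names, the thresholds for what counts as a plausible intro skip, and the idea that each second of video has already been reduced to a comparable "fingerprint" are all hypothetical.

```python
from statistics import median


def intro_from_seeks(seeks, max_start=300.0, min_len=20.0, max_len=180.0):
    """Crowdsourced guess at intro bounds from viewers' manual skips.

    seeks is a list of (from_seconds, to_seconds) forward jumps.
    The thresholds are made-up assumptions used to throw away jumps
    that probably were not intro skips.
    """
    plausible = [
        (a, b) for a, b in seeks
        if a <= max_start and min_len <= b - a <= max_len
    ]
    if not plausible:
        return None  # nobody has skipped yet, so no feature yet
    # The median tolerates viewers who skip a little early or late.
    return median(a for a, _ in plausible), median(b for _, b in plausible)


def shared_segment(ep_a, ep_b, min_len=3):
    """Longest run of identical fingerprints shared by two episodes.

    Each episode is a list of per-second fingerprints (any comparable
    values). Returns (start_in_a, start_in_b, length), or None when the
    longest shared run is shorter than min_len.
    """
    best_len, best_a, best_b = 0, 0, 0
    prev = [0] * (len(ep_b) + 1)
    for i in range(1, len(ep_a) + 1):
        cur = [0] * (len(ep_b) + 1)
        for j in range(1, len(ep_b) + 1):
            if ep_a[i - 1] == ep_b[j - 1]:
                # Extend the matching run that ended one step earlier.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len = cur[j]
                    best_a, best_b = i - cur[j], j - cur[j]
        prev = cur
    if best_len < min_len:
        return None
    return best_a, best_b, best_len


# Toy data: three believable intro skips, one big jump, one tiny nudge.
seeks = [(30, 92), (28, 90), (33, 95), (600, 1200), (31, 35)]
print(intro_from_seeks(seeks))  # (30, 92)

# Toy fingerprints: "i1".."i4" stand in for the shared intro.
ep1 = ["a1", "a2", "i1", "i2", "i3", "i4", "a3"]
ep2 = ["b1", "i1", "i2", "i3", "i4", "b2", "b3"]
print(shared_segment(ep1, ep2))  # (2, 1, 4)
```

Notice that the second function has no idea whether the repeated run is an intro, a recap, or the credits. That is exactly the kind of judgment a person brings to the process.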
Option three allows you to save a ton of money on software development and avoid the pitfalls of automation (when something doesn’t follow the form that was assumed when the software was coded). As such, you wouldn’t have to do additional development as things change or shift. It does carry the ongoing cost of a human being, but depending on the amount of work needed to code the automated solution, it could end up being the cheaper option for the next ten years.
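Here is the back-of-the-napkin version of that cost argument in Python. The thirty thousand dollars a year for a human flagger comes from the figure above. The build cost and the yearly upkeep are purely hypothetical numbers I picked to show the arithmetic, so plug in your own.

```python
def cumulative_costs(years, human_per_year, build_cost, upkeep_per_year):
    """Return (human_total, automation_total) after the given number of years."""
    human_total = human_per_year * years
    automation_total = build_cost + upkeep_per_year * years
    return human_total, automation_total


def break_even_year(human_per_year, build_cost, upkeep_per_year, horizon=50):
    """First whole year in which automation is no more expensive in total.

    Returns None if that never happens within the horizon.
    """
    for year in range(1, horizon + 1):
        human_total, automation_total = cumulative_costs(
            year, human_per_year, build_cost, upkeep_per_year
        )
        if automation_total <= human_total:
            return year
    return None


HUMAN_PER_YEAR = 30_000   # figure used earlier in this post
BUILD_COST = 400_000      # hypothetical: two engineers for a year
print(cumulative_costs(10, HUMAN_PER_YEAR, BUILD_COST, 0))  # (300000, 400000)
print(break_even_year(HUMAN_PER_YEAR, BUILD_COST, 0))       # 14
print(break_even_year(HUMAN_PER_YEAR, BUILD_COST, 40_000))  # None
```

With those made-up numbers the human is still cheaper after ten years, and if keeping the software running costs more per year than the person does, automation never catches up.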
Then again, I could be completely wrong and Skynet is running all the major corporations in the world or we are in a simulation.