HITS Spring: How Metal Toad, AWS Use Machine Learning to Moderate Online Content

Every individual and company today is creating more and more content and distributing that content faster than ever. But that increase in content creation and distribution is making it harder than ever for human moderators to ensure that the content generated meets each company’s guidelines and is appropriate for its users, according to Joaquin Lippincott, CEO of Metal Toad, an Amazon Web Services (AWS) consulting partner and managed services provider.

During the Audience + Insights breakout session “Content Moderation & Machine Learning – How AWS Algorithms Can Automate Solution to a Growing Problem” at the Hollywood Innovation and Transformation Summit (HITS) on May 19, he discussed how AWS machine learning algorithms can help automate content moderation to ensure that obscenities are detected at a fraction of the cost of human moderation.

“We’re implementing a system called [Amazon] Rekognition,” he told viewers. With Rekognition, “you can feed it video or images, pay a small fee, and then it feeds you back data,” he explained, referring to it as basically “a robot that is going to do our dirty work for us.”

In the specific case of Metal Toad, that means it can check “for nudity, suggestive content, violence, visually disturbing images, drug/tobacco/alcohol use, gambling, rude gestures or hate symbols,” he pointed out. “Apart from that last one, it’s all the fun stuff, I suppose,” he said, adding: “I was talking to somebody and they said ‘all of the things you just mentioned probably indicate whether something is going to be a smash hit in the summer or not.’”

To use Rekognition, users first need to create an AWS account. They then download a software development kit, “write some code and then you’re ready to go,” he noted.

Although Metal Toad uses Python, he said you “can use a number of different languages in addition to that.” For any technical people using Rekognition, there is “going to be a language that they’re going to be interested in working in,” he noted, “so there’s no real barrier in terms of, if you have a particular language that you built your platform on,” you will be able to find it and use it.
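
For readers curious what “write some code” might involve, the following is a minimal sketch in Python, the language Lippincott said Metal Toad uses. It is not Metal Toad’s code. It assumes the boto3 SDK is installed and AWS credentials are already configured; the function name moderate_image, the file name in the comment and the 60 percent confidence threshold are illustrative choices, not details from the presentation.

```python
def moderate_image(image_bytes, client=None, min_confidence=60.0):
    """Return (label name, confidence) pairs that Rekognition flags for one image."""
    if client is None:
        # Imported lazily so the function can also be exercised with a stub client.
        import boto3
        client = boto3.client("rekognition")

    # detect_moderation_labels is Rekognition's image moderation call in boto3.
    response = client.detect_moderation_labels(
        Image={"Bytes": image_bytes},
        MinConfidence=min_confidence,  # assumed threshold, tune per use case
    )
    return [
        (label["Name"], label["Confidence"])
        for label in response.get("ModerationLabels", [])
    ]


# Hypothetical usage with a local file named "frame.jpg":
#     with open("frame.jpg", "rb") as f:
#         for name, confidence in moderate_image(f.read()):
#             print(f"{name}: {confidence:.0f}%")
```

An empty list from this function would correspond to content that raised no flags at the chosen threshold.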

“One of the nice things is if you integrate for content moderation, a lot of that work that you put in place can be used to do other metadata extraction and other analysis,” he pointed out.

He went on to provide a demo of Rekognition. The first content scanned found no flags for objectionable content, he noted. Scanning the next piece of content found a “suggestive 99 percent bare-chested male,” he said.
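
A result such as the one read out in the demo typically comes back as structured data: each moderation label carries a name, a parent category and a confidence score. The sketch below shows one way such a response could be turned into readable lines. The sample dictionary is hand-written to mirror the demo and is not real Rekognition output; the exact label names are assumptions, and describe_labels is a hypothetical helper.

```python
def describe_labels(response):
    """Format a moderation response as 'parent / label: NN%' lines."""
    lines = []
    for label in response.get("ModerationLabels", []):
        # Top-level categories come back with an empty parent name.
        parent = label.get("ParentName") or "top-level"
        lines.append(f"{parent} / {label['Name']}: {label['Confidence']:.0f}%")
    return lines or ["no flags"]


# Illustrative sample only, shaped like a moderation response.
sample = {
    "ModerationLabels": [
        {"Name": "Suggestive", "ParentName": "", "Confidence": 99.2},
        {"Name": "Bare-chested Male", "ParentName": "Suggestive", "Confidence": 99.2},
    ]
}

for line in describe_labels(sample):
    print(line)
```

Content that raises no flags, like the first item in the demo, would simply produce the single line “no flags.”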

For one recent proof of concept, “we spent about a third of our time building the front end, we spent about a third of the time building out the Amazon backend, making sure we had the right virtual servers and instances, then about a third of the time actually writing the Lambdas or doing that code,” he told viewers.
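
The presentation did not show the Lambdas themselves, but a function of that kind could look roughly like the sketch below. It assumes, purely for illustration, that the Lambda is triggered by an S3 upload notification and that the uploaded objects are images; the threshold and the names moderate_records and handler are hypothetical.

```python
from urllib.parse import unquote_plus


def moderate_records(event, client, min_confidence=60.0):
    """Run image moderation on every object named in an S3 event notification."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        response = client.detect_moderation_labels(
            Image={"S3Object": {"Bucket": bucket, "Name": key}},
            MinConfidence=min_confidence,  # assumed threshold
        )
        flags = [label["Name"] for label in response.get("ModerationLabels", [])]
        results.append({"bucket": bucket, "key": key, "flags": flags})
    return results


def handler(event, context):
    """Lambda entry point: build a Rekognition client and moderate the upload."""
    import boto3  # imported here so moderate_records can be tested with a stub
    return moderate_records(event, boto3.client("rekognition"))
```

Keeping the moderation logic in a function that accepts the client as an argument is a design choice made here so the logic can be checked locally without calling AWS.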

When it comes to cost, “Rekognition is relatively affordable versus humans,” he said, adding: “They charge for video processing for content moderation. It’s about 10 cents per minute and for an image for most people that is up to a million images [and] it’s going to be about a 10th of a cent to get that data back. And then the other stuff – that sort of underlying AWS infrastructure – all of those Lambdas really only cost about $3 a day. So it’s incredibly inexpensive to maintain this kind of infrastructure.”
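
To put those figures in perspective, here is a back-of-the-envelope estimate using only the rates quoted in the talk. The rates are the speaker’s approximations, not current AWS pricing; the monthly volumes in the example are invented for illustration, and the image rate is applied only as quoted for volumes up to a million images.

```python
# Rates as quoted in the presentation (approximate, in US dollars).
VIDEO_USD_PER_MINUTE = 0.10  # "about 10 cents per minute"
IMAGE_USD_EACH = 0.001       # "about a 10th of a cent" per image
INFRA_USD_PER_DAY = 3.00     # Lambdas and related infrastructure, "about $3 a day"


def monthly_estimate(video_minutes, images, days=30):
    """Estimate one month's moderation cost from the quoted rates."""
    total = (
        video_minutes * VIDEO_USD_PER_MINUTE
        + images * IMAGE_USD_EACH
        + days * INFRA_USD_PER_DAY
    )
    return round(total, 2)


# Hypothetical workload: 1,000 minutes of video and 50,000 images in a month.
print(monthly_estimate(1000, 50000))  # 100 + 50 + 90 = 240.0
```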

In comparison, he said, “a professional setup could run $5,000-$10,000 if you didn’t want to do it yourself.”

And how well does it work? “We taught our machine all about human bad behaviour by having it watch James Bond because you get gambling, you get alcohol, you get tobacco, suggestive content and then violence,” he said. “Remember all of the things that make a movie a smash hit kind of exist right there in James Bond” films, he noted.

Rekognition was “very accurate in terms of tagging everything,” he said. Although “we were scanning videos and images,” there are “eight other AWS services that can read the text and can transcribe audio into text,” as well as “textual-based sentiment analysis engines like [Amazon] Comprehend,” he added.

Pointing to the Metal Toad flyers available for attendees, he joked: “I can offer you nothing except partial nudity, James Bond references, and a picture of Bender the robot.”


The Hollywood Innovation and Transformation Summit event was produced by MESA in association with the Hollywood IT Society (HITS), Media & Entertainment Data Center Alliance (MEDCA), presented by ICVR and sponsored by Genpact, MicroStrategy, Whip Media, Convergent Risks, Perforce, Richey May Technology Solutions, Signiant, Softtek, Bluescape, Databricks, KeyCode Media, Metal Toad, Shift, Zendesk, EIDR, Fortinet, Arch Platform Technologies and Amazon Studios.