During an interview at Bloomberg’s Tech Summit last Thursday, Meta’s chief product officer Chris Cox said that Instagram and Facebook have an advantage in the generative AI space because of all the “public” photos available to them.
While Meta is not a major player in the AI image generator market, the company continues to build its text-to-image model, Emu. With other companies struggling to find training data to keep up with the demands of AI, Cox thinks Meta has an advantage.
“We don’t train on private stuff, we don’t train on stuff that people share with their friends, we do train on things that are public,” he told Bloomberg.
Cox said that Emu can make “really amazing quality images” thanks to “Instagram being the data set that was used to train it,” which he described as “one of the great repositories of incredible imagery.”
On using user data to train AI models, “We don’t train on private stuff, we don’t train on stuff people share with their friends. We do train on things that are public,” @Meta Chief Product Officer Chris Cox #BloombergTech
— Bloomberg Live (@BloombergLive) May 9, 2024
He goes on to say that Instagram’s varied content — including photos of “art, fashion, culture, and also just images of people” — is what makes it a useful tool for building AI image generators.
Photographers Have Little Choice
For many photographers, Instagram is the most important social media platform. Mistrust and outrage at the way AI image generators were built are widespread in the creative community, so this may be something of a conundrum for creators.
Cox says Meta is not training on photos from private accounts but, in general, photographers need public accounts to get their work and name out into the world. A public account is a very important marketing tool.
Back in February, Meta CEO Mark Zuckerberg made clear that the company is using images posted on Facebook and Instagram to train its generative AI tools.
“When people think about data, they typically think about the corpus that you might use to train a model up front,” Zuckerberg said in an earnings call.
“On Facebook and Instagram, there are hundreds of billions of publicly shared images and tens of billions of public videos, which we estimate is greater than the Common Crawl dataset and people share large numbers of public text posts in comments across our services as well.”