How Reka Uses Shutterstock Data to Create State-of-the-Art Multimodal AI Models

What happens when a company needing rich metadata for its multimodal AI models joins forces with a global leader in creative content? The answer lies in the strategic partnership between Reka and Shutterstock.

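In this post, we’ll explore:

  • Shutterstock Partners with Reka
  • Reka Multimodal AI
  • Shutterstock Data Upgraded
  • Impactful Data
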
Shutterstock Partners with Reka

The AI landscape is the hottest corner of tech right now. But as new companies race to outpace their competitors, a little outside help can pay off.

Reka faced a challenge: Its team needed to train its multimodal AI models but required access to vast amounts of data. High-quality metadata plays a crucial role in training by enhancing the model’s ability to understand, interpret, and generate content across different modalities, such as text, images, and video.

Licensing Shutterstock data would enable Reka to improve the accuracy of its data labeling, boost the efficiency of its model training, and facilitate advanced features like personalization and customization of its AI.

At the same time, Shutterstock was searching for an efficient and effective way to enhance the metadata tied to hundreds of millions of its library assets. Doing this would make it easier for its customers to find a specific image or video on the Shutterstock marketplace. Rich and diverse metadata would also lead to better categorization and visibility of a contributor’s products on the marketplace, potentially driving more sales, maximizing the asset’s revenue potential, and improving product discoverability for everyone equally.

So, in June 2024, Shutterstock formed a multi-year partnership with Reka to license data from its vast library of assets to develop Reka’s frontier-class multimodal language AI models. In return, the AI company would enhance the metadata of Shutterstock’s 550 million assets across images and video.


Reka Multimodal AI

If you’re unfamiliar with multimodal AI, these models can analyze multiple types of inputs simultaneously. In simple terms, you can provide them with images, videos, audio, PDFs, and text, and they will generate more complex and nuanced outputs.

Here’s an example of how Reka’s Core model works in practice:

However, to receive an accurate answer, the AI models require a monumental amount of data, and it’s not easy to acquire ethically sourced, clean, and enriched data for training multimodal AI models.

If you keep up with news about artificial intelligence, you’ve probably heard about tech companies “scraping the web” for data. Besides angering content creators, scraping public data can lead to lawsuits that can derail even the most powerful models.

That’s why more and more companies are opting for legally licensed data that they can count on, avoiding potential lawsuits for violating website terms of service, copyrights, and privacy policies.

Complying with all applicable laws can be a lot of work. Reka recognized Shutterstock as a global leader in visual data, offering the best dataset for training multimodal AI models, which streamlined the complex process for them.

Ethical considerations abound in this area, including data handling, user protection, bias prevention, and ensuring ethical practices at every stage of data management.

“By using ethically sourced data from our library of more than 670 million assets, we’re helping Reka reach its goal of advancing the AI development of its models and, in return, we can further improve our library. Improving the metadata to our asset library will do some great things for customers and contributors, such as increasing the chances of assets being indexed and ranked by search engines, both in our marketplace and externally,” says Shutterstock’s Senior Director of Product Jergan Callebaut.

Shutterstock expects this multi-year partnership to continue to blossom as we look to the future. More than 60 million new assets are added to the Shutterstock library annually, all of which will need well-tagged and described data. With Reka’s help, customers will spend even less time searching and more time creating.

Take, for example, the Shutterstock image below.

The photo description is “Woman pouring honey onto thin pancakes with berries at table.”

But once Reka’s AI enriches the image metadata, the record can also include fields such as:

  • Setting: An indoor setting, likely a kitchen or dining area, with a focus on a meal preparation or serving scene.
  • Visual style: The visual aesthetic is warm and inviting, with soft, natural lighting suggesting a homey atmosphere.
  • Color palette: Brown, beige, yellow, green, red.
  • Objects: Stack of crepes, honey, blueberries, cherry, mint leaf, wooden bowl, honey dipper, plate.
  • Spatial relationships: The stack of crepes is centered on the plate, with honey being poured over it. Blueberries and cherries are placed around the crepes, and a mint leaf is on top. In the background, a wooden bowl and honey dipper are visible.
  • Detailed description: This image captures a moment of meal preparation or serving, featuring a stack of crepes on a plate with honey being poured over them. Surrounding the crepes are fresh blueberries and cherries, with a mint leaf adding a pop of color and likely a fresh flavor. The wooden bowl and honey dipper in the background suggest that the honey is freshly poured. The setting appears to be a homey kitchen or dining area, with soft, natural lighting creating a warm and inviting atmosphere.
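
To make the difference concrete, here is a minimal Python sketch of how an enriched record like the one above might be stored and searched. The `EnrichedMetadata` class, its field names, and the `matches` helper are illustrative assumptions, not Shutterstock’s or Reka’s actual schema or search logic; the field values are copied from the example above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EnrichedMetadata:
    """Hypothetical record layout; field names are illustrative only."""
    description: str
    setting: str = ""
    visual_style: str = ""
    color_palette: list[str] = field(default_factory=list)
    objects: list[str] = field(default_factory=list)
    spatial_relationships: str = ""

    def search_text(self) -> str:
        """Join every field into one lowercase string for naive keyword matching."""
        parts = [
            self.description,
            self.setting,
            self.visual_style,
            " ".join(self.color_palette),
            " ".join(self.objects),
            self.spatial_relationships,
        ]
        return " ".join(parts).lower()


def matches(record: EnrichedMetadata, query: str) -> bool:
    """Return True when every word of the query appears somewhere in the record."""
    text = record.search_text()
    return all(word in text for word in query.lower().split())


# The asset as originally described: one short sentence.
original = EnrichedMetadata(
    description="Woman pouring honey onto thin pancakes with berries at table."
)

# The same asset with the additional fields from the example above.
enriched = EnrichedMetadata(
    description=original.description,
    setting=("An indoor setting, likely a kitchen or dining area, with a focus "
             "on a meal preparation or serving scene."),
    visual_style=("The visual aesthetic is warm and inviting, with soft, "
                  "natural lighting suggesting a homey atmosphere."),
    color_palette=["brown", "beige", "yellow", "green", "red"],
    objects=["stack of crepes", "honey", "blueberries", "cherry", "mint leaf",
             "wooden bowl", "honey dipper", "plate"],
    spatial_relationships=("The stack of crepes is centered on the plate, "
                           "with honey being poured over it."),
)

print(matches(original, "crepes mint"))  # False: neither word is in the short description
print(matches(enriched, "crepes mint"))  # True: both words appear in the objects list
print(json.dumps(asdict(enriched), indent=2))
```

A search for “crepes mint” misses the original one-line description but finds the enriched record, which is the kind of discoverability gain richer metadata is meant to deliver. A production search system would use proper indexing and ranking rather than substring checks.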


Shutterstock Data Upgraded

Shutterstock data licensing has gained momentum over the past couple of years and is now trusted by companies such as NVIDIA, OpenAI, Meta, and LG. Our data spans a wide variety of content types, including 620M+ images, 50M+ videos, 3M+ music and audio tracks, and 1.2M+ 3D models.

So, whether you need metadata to build your generative AI models, licensable data for machine learning, or want to train content moderation operating systems on Shutterstock’s diverse array of images, the quality of its best-in-class metadata will only get better. 

Shutterstock is also continuing to invest in its extensive library of human-created and reviewed metadata. With Reka’s help, it is making that metadata more detailed and nuanced for data services clients, which can lead to higher click-through rates and stronger user engagement on their own websites and products.

If you want to get started, you can contact Shutterstock directly or work through Amazon Web Services and Google Cloud. Its team is happy to help you ideate, curate, and customize datasets for your needs.

In fact, Reka utilized Amazon Web Services to acquire the data. “The S3 delivery through AWS has been so smooth. We appreciate how robust, secure, highly scalable, and reliable it is,” says Eric Chen, Head of Applied at Reka.

Sample dataset for generative AI, featuring images of a martini, a male model with flowing purple hair, a close-up shot of a dog with its tongue sticking out, and more. Above the images are metadata tags.

Impactful Data

“This partnership is a testament to the power of collaboration. By working with Shutterstock, not only are we streamlining the complex process of acquiring data for our AI models, we are also delivering better models for our customers,” says Dani Yogatama, CEO and Co-Founder of Reka.





This post was originally published on September 20, 2024.
