Last updated October 24, 2024
Shutterstock is a creative platform that powers every storyteller to effortlessly bring any idea to life via its world-class content, tools, and services. Long gone are the days of Shutterstock as just a stock photo marketplace. We are now an end-to-end creative platform offering Editorial, 3D, music, and video content alongside production and AI tools and capabilities.
Data Licensing
As our platform evolves with the needs of the content market, so does the breadth of our product offerings. In 2021, Shutterstock began exploring the potential of content beyond the traditional creative applications and licensing uses offered on Shutterstock.com. We discovered that creative content holds incredible value beyond its visual and design face value, and this is when the concept of data licensing began to emerge on our platform. Data licensing is an exciting opportunity to maximize the revenue-generating potential of creative assets, allowing customers seeking content for training computer vision systems to utilize our vast collection in a whole new way.
What is computer vision?
Computer vision is a scientific discipline that seeks to develop techniques to help computers “see” and understand the content of digital images such as photographs and videos. A model is the engine that governs the behavior of the computer vision system. Researchers train machine learning models to identify visual objects as well as the human eye can. Computer vision technology powers many of the features of our own Shutterstock visual search tools, including reverse image search and similar content suggestions.
What are Shutterstock datasets?
Shutterstock datasets are a product offering developed to support companies building computer vision machine learning models. Each dataset is a set of visual content organized by a specific theme or topic and can include images (photos, illustrations, and vectors), videos, and 3D models. All content within datasets incorporates metadata, including keywords, descriptions, geo-location, and categories. Datasets span a wide range of industry categories, such as food & beverage, transportation & autonomous vehicles, animals & wildlife, clothing & apparel, and travel, tourism & hospitality.
Datasets provide a potential new source of revenue by extending our reach to new customers that do not currently work with Shutterstock, including AI researchers, leading technology developers and manufacturers.
What are datasets used for?
In general, companies use content datasets like the ones we offer at Shutterstock to power computer vision applications such as:
Visual search: People can easily search images on their smartphone library by entering a keyword like “cat” or “sunset” to find all relevant photos.
Autonomous vehicles: Self-driving cars can operate safely by understanding their specific surroundings — including other cars, people, roads, stop signs, and more.
Content moderation: Social media companies can rapidly identify, review and remove content that is violent or extreme in nature.
Product categorization: eCommerce and retail companies can recommend relevant products to their customers.
AI content generation: AI platforms can train systems to automatically generate new images based on text prompts.
The goal is to help companies easily build, train, and automate their object recognition models to improve their technology and better serve the needs of users.
When were datasets introduced?
Shutterstock announced the launch of Shutterstock.AI and computer vision products, also known as “datasets,” in July 2021. At that time we posted contributor-facing information on our help center. This article is updated continuously and has incorporated new information alongside our October 2022 announcement of our AI-generated content partnership with OpenAI and our November announcement of our partnership with LG. The inclusion of content from our existing library in datasets is covered under Section 1a of our Contributor Terms of Service, which grants Shutterstock the right to develop new features and products.
However, in January 2023 we added an opt-out function in the contributor account settings, which allows artists to exclude their content from any future datasets if they prefer not to have their content used for training computer vision technology.
What type of content is included in Shutterstock datasets?
Currently, Shutterstock datasets consist of content including images (photos, illustrations, vectors), videos, 3D models, and music. The Offset collection and premium video (Select) content are not included in Shutterstock datasets. Some editorial content may be included in certain circumstances when the AI-model training aligns with editorial-use standards (premier editorial content is excluded).
What type of metadata is included with Shutterstock datasets?
Only the standard metadata requested from contributors is incorporated into Shutterstock datasets, so there is no change to your content submission process. The specific metadata provided within Shutterstock datasets varies based on the needs of the partner and can include a combination of information voluntarily provided by contributors, as well as technical information and additional labels added by Shutterstock’s own machine learning models.
Contributor-provided metadata may include the asset description, keywords, and categories describing objects depicted in the assets. In certain cases, metadata may include some geolocation information provided by contributors. Broad demographic information about the models featured in photographs, including age, gender, and race/ethnicity, may also be included in metadata labels.
How is my content used? What kind of license is provided for Shutterstock datasets?
Each dataset is composed of content selected for the customer's specific model training needs and comes with a limited license that covers usage only within the scope of training machine learning technology. Companies purchasing Shutterstock datasets (content and metadata) may only use them to train machine learning and computer vision models. The use of content for commercial or public applications such as marketing or advertising is strictly prohibited, and companies are required to have appropriate security measures in place to ensure there is no unauthorized distribution of content.
Is any personal data being shared with partners?
The metadata included in Shutterstock datasets is descriptive information about visual assets. While some limited demographic information from model releases (such as age, gender, or race/ethnicity of models) may be included within dataset metadata, full model releases are never shared, and the identity of models and contributors is never shared with computer vision partners.
Can I opt out of data licensing and having my content included in future datasets?
Yes, in January 2023 we added an option in the contributor account settings that allows you to opt out of having your content included in future datasets. In August 2024 we are modifying the control settings to allow contributors to have separate toggles for image and video data licensing.
Data Catalog
In June of 2023 we modified our approach to content submissions to expand the breadth of content that can be used specifically for datasets and introduced the Data Catalog in the contributor account. The Data Catalog provides an easy way to view what content has been accepted specifically for data licensing. You can learn more about these changes here.
The Data Catalog will reflect content that was accepted solely for data licensing from June 2023 onward. Any content included in data licensing sales prior to June 2023 will not be reflected there and will still be included only in the creative Marketplace Catalog. Also, for contributors who are opted in to data licensing, any content that is accepted for the creative marketplace will automatically be made available for data licensing as well.
For those opted out of data licensing, the Data Catalog will not be visible. However, any content that is not acceptable for the creative Marketplace but is acceptable for data licensing will be marked as such in the Review tab. Shutterstock will retain these review results, and this content can be published for data licensing in the event that the contributor elects to opt in to future data deals.
All Shutterstock contributors can manage their content licensing permissions in their account settings – learn more about the setting options here.
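The catalog and opt-in rules described above can be summarized in a short sketch. This is only an illustration of the rules as worded in this article, not Shutterstock's actual system: the names `Review` and `data_licensing_status` are hypothetical, and the June 2023 cutoff for the Data Catalog is ignored for simplicity.

```python
from enum import Enum


class Review(Enum):
    """Hypothetical review outcomes for a submitted asset."""
    MARKETPLACE = "accepted for the creative marketplace"
    DATA_ONLY = "accepted for data licensing only"
    REJECTED = "not accepted"


def data_licensing_status(review, opted_in):
    """Return (available_for_data_licensing, catalog_where_shown)."""
    if review is Review.MARKETPLACE:
        # Marketplace content is also offered for data licensing,
        # but only when the contributor is opted in.
        return (opted_in, "Marketplace Catalog")
    if review is Review.DATA_ONLY:
        # Opted-out contributors do not see a Data Catalog; the review
        # result is retained and the asset can be published if they opt in.
        return (opted_in, "Data Catalog" if opted_in else None)
    return (False, None)


print(data_licensing_status(Review.DATA_ONLY, opted_in=True))
print(data_licensing_status(Review.DATA_ONLY, opted_in=False))
```

In this sketch, opting in later changes only the `opted_in` argument. The retained review result stays the same, which mirrors the statement that data-only content can be published once a contributor opts in.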
Compensation: The Contributor Fund
This is a new form of earnings for contributors beyond downloads and licensing of individual assets for commercial or editorial use. We are firmly committed to including our contributors as partners on this journey, and ensuring they receive a share of the proceeds from computer vision datasets and generative AI models when their content is used in the creation of these technologies. Given the collective nature of this product, we developed a revenue share compensation model.
How are my individual earnings calculated?
Contributors will receive a share of the entire contract value paid by customers licensing datasets. The share individual contributors receive will be proportionate to the volume of their content and metadata that is included in the purchased datasets. Contributors earn an average corporate royalty rate of 20% of the revenue Shutterstock receives for data licenses.
Although inclusion in datasets is not reflected as individual downloads in the Earnings Summary, the way earnings from other eCommerce products are, Shutterstock maintains an internal database of all assets used in every dataset created since the launch of this product, so we can compensate our contributors accordingly.
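As a rough worked example of the proportional split described above, the sketch below assumes that exactly 20% of a contract's value forms the royalty pool and that "volume" is measured as a simple count of assets per contributor. Both are simplifying assumptions: the article states an average rate and does not specify how content and metadata volume is weighted. The function name, contributor names, and figures are hypothetical.

```python
from fractions import Fraction

# Assumption: the 20% average royalty rate is applied to a single contract.
ROYALTY_RATE = Fraction(20, 100)


def contributor_shares(contract_value, asset_counts):
    """Split the royalty pool in proportion to each contributor's asset count."""
    pool = Fraction(contract_value) * ROYALTY_RATE
    total_assets = sum(asset_counts.values())
    if total_assets == 0:
        return {name: Fraction(0) for name in asset_counts}
    return {
        name: pool * count / total_assets
        for name, count in asset_counts.items()
    }


# Hypothetical dataset deal: a $100,000 contract with 1,000 assets in total.
shares = contributor_shares(100_000, {"contributor_a": 300, "contributor_b": 700})
for name, amount in shares.items():
    print(f"{name}: ${float(amount):,.2f}")
# contributor_a: $6,000.00
# contributor_b: $14,000.00
```

Here the pool is $20,000, so a contributor supplying 30% of the dataset's assets would receive $6,000 under these assumptions.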
Why can’t I see the earnings and specific downloads from datasets in my Earnings Summary?
Due to their highly customized nature and scope of use, datasets are not a product that can be purchased directly on our website. Since datasets are manually curated, the individual assets that are included in this product are not reflected in your contributor account download history and Earnings Summary.
How will I get paid?
Earnings from datasets and downloads of AI-generated content produced with integrated technology on our platform are pooled in a collective fund and will be distributed periodically as the fund accumulates significant revenue for distribution. If you have generated earnings from the fund, you will see those posted in your Earnings Summary, in the "Contributor Fund" column.
