Media content consists of both essence (the content itself) and its associated metadata. Everyone acknowledges that metadata is important for classifying and locating content, so media companies tend to put a lot of thought into collecting and managing it: what type of information will be collected, where it will be entered, how often, and so on. The idea is to ensure consistent, thorough metadata collection so that users can find and remonetize specific pieces of content. Metadata gathering is a critical part of metadata management, to be sure, but it's only half the job. What people tend to ignore is the other half: ensuring that the metadata is secure and archived.
Why do they ignore it? Because media companies tend to focus so much on securing the actual content that they put little if any thought into securing the associated metadata, which is often stored in another database separate from the content itself. If you lost the metadata, it would be nearly impossible to find the right video in a timely manner — let alone a specific subclip or frame of video — rendering the content library practically useless. Even so, most people don’t often think beyond protecting the content.
Like the essence it describes, the metadata needs to be treated as critical information and backed up securely. A best practice for protecting your metadata is to ensure that, while you’re backing up your content, you’re also backing up and archiving your metadata database. Most production workflow asset tools, whether a media asset manager or production asset manager, will be able to tie the metadata back to the content, but only if the metadata is securely maintained. If the metadata is lost, the search capability is essentially lost with it.
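As a concrete illustration of pairing the metadata backup with the content backup, here is a minimal sketch using Python's standard sqlite3 module. Real MAM systems run on their own database engines, and the file paths and table layout below are assumptions for the example, not anything a particular MAM product defines; the point is only that the metadata store gets its own copy during the same backup window as the essence.

```python
import sqlite3
from pathlib import Path

# Hypothetical paths; substitute your MAM's actual metadata store and
# archive target. This sketch only illustrates the backup step.
live_db = Path("mam_metadata.db")
backup_db = Path("archive/mam_metadata_backup.db")
backup_db.parent.mkdir(parents=True, exist_ok=True)

# Create a tiny stand-in metadata table so the example is self-contained.
src = sqlite3.connect(str(live_db))
src.execute("CREATE TABLE IF NOT EXISTS assets (id TEXT PRIMARY KEY, title TEXT)")
src.execute("INSERT OR REPLACE INTO assets VALUES ('clip-001', 'LaGuardia footage')")
src.commit()

# sqlite3's online backup API copies the database safely while the live
# connection stays open -- the analogue of dumping the MAM's metadata
# store as part of the same job that archives the content.
dst = sqlite3.connect(str(backup_db))
src.backup(dst)
src.close()
dst.close()
```

Whatever engine your MAM uses, the shape of the policy is the same: a consistent dump of the metadata database, written to the same protected tier as the archived essence.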
Therefore, when designing a storage infrastructure for a file-based media operation, it’s a good idea to consult with your media asset management (MAM) provider to define a standard for backing up and archiving your metadata. Your MAM provider will work with you to build schemas that define how the metadata will be collected, stored, and managed, and how it will relate back to the actual content, so it makes sense to add metadata security to the list.
Securing the metadata should start when capturing the content and continue through every stage in the lifecycle of that content, with more metadata captured at each stage of production. For example, in the acquisition stage, if you acquire the content electronically, you can design the acquisition schema to test incoming content and verify that the associated metadata is embedded in the files. Then you can supplement the incoming metadata with your own additions to complete the schema. The media asset manager might strip that embedded metadata from the file, read it, and then re-embed it. At that point, the metadata should be backed up for safekeeping. Making redundant copies of the metadata should be a requirement, just as archiving content on ingest is. As metadata is added during editing, audio sweetening and color correction, finishing, and delivery, it should be secured and backed up at each step.
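The ingest-time check described above can be sketched in a few lines. This example assumes a sidecar convention (a `clip.xml` file next to each media file) purely for illustration; your acquisition schema might instead read metadata embedded in the essence files themselves. Any clip that arrives without parseable metadata is flagged, and valid metadata gets a redundant copy immediately.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical directory layout and sidecar convention; adjust to
# whatever your MAM's acquisition schema actually defines.
incoming = Path("incoming")
metadata_backup = Path("metadata_backup")
incoming.mkdir(exist_ok=True)
metadata_backup.mkdir(exist_ok=True)

# Stand-in for an acquired clip plus its sidecar metadata, so the
# example runs on its own.
(incoming / "clip-001.mxf").write_bytes(b"\x00" * 16)  # placeholder essence
(incoming / "clip-001.xml").write_text(
    "<metadata><title>LaGuardia footage</title></metadata>"
)

def ingest_check(clip: Path) -> bool:
    """Verify the sidecar metadata exists and parses, then back it up."""
    sidecar = clip.with_suffix(".xml")
    if not sidecar.exists():
        return False  # flag the clip for manual metadata entry
    try:
        ET.parse(sidecar)  # reject malformed metadata at the door
    except ET.ParseError:
        return False
    # Redundant copy on ingest, mirroring the content-archive policy.
    shutil.copy2(sidecar, metadata_backup / sidecar.name)
    return True

results = {clip.name: ingest_check(clip) for clip in incoming.glob("*.mxf")}
```

The same pattern repeats at each later stage: validate the metadata that stage produced, then copy it somewhere safe before moving on.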
In most systems, the metadata resides on one set of disks, essentially in a single workspace, while the content resides on another. The reason for the separate locations is that the file sizes, and therefore the access patterns, are different. The content itself can run to hundreds of gigabytes or even terabytes, so it tends to be stored on high-capacity, high-throughput storage media and streamed from its storage location rather than fully restored upon access, which maintains smooth playback and avoids glitches and dropped frames. The metadata, on the other hand, is much smaller, sometimes just a few kilobytes, because most of the information lives in a text file or database entry. For that reason, it can be stored on more traditional storage media. When the media asset manager accesses the metadata files, the workload is a much simpler one, usually measured in IOPS (input/output operations per second), so sustained throughput is rarely a concern.
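Some rough arithmetic makes the contrast concrete. The bitrate and record size below are illustrative assumptions, not measurements from any particular system, but they show why essence storage is sized for sustained throughput while metadata storage is sized for small random reads.

```python
# Illustrative figures only: a mezzanine-quality stream versus a
# text/database metadata record.
bitrate_mbps = 400                       # assumed bitrate, megabits/s
stream_mb_per_s = bitrate_mbps / 8       # sustained throughput needed: 50 MB/s
metadata_record_kb = 4                   # a typical small text record

# One hour of essence at that bitrate versus one metadata fetch.
one_hour_clip_gb = stream_mb_per_s * 3600 / 1000
print(f"one hour of essence: ~{one_hour_clip_gb:.0f} GB, streamed continuously")
print(f"metadata record: {metadata_record_kb} KB, a single small I/O")
```

One side is a sustained 50 MB/s stream for an hour; the other is a handful of kilobytes fetched in a single operation. Two very different storage tiers suit those two workloads.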
When planning a backup and archiving policy for metadata, you also have to consider that a single piece of metadata could describe more than one piece of content, such as when multiple events are happening within a video. For example, suppose you have video footage that was filmed at LaGuardia Airport. The video depicts one plane landing and another one taking off. The overarching metadata to describe both of those events is the same — it’s all footage from LaGuardia Airport — but it might contain subtags such as “American Airlines jet landing” and “US Airways jet taking off.” The video clips of each event might reside in two different places, but the metadata resides in a single place. So when devising a method for backing up the metadata, it’s important to keep in mind all of the content that is associated with that metadata and how it will all tie together.
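The LaGuardia example above can be modeled as a simple data structure: one metadata record that fans out to multiple clip locations through its subtags. The field names and storage paths here are hypothetical, but the shape shows what a metadata backup has to preserve in full, because dropping any one subtag silently severs the link to that clip.

```python
from dataclasses import dataclass, field

# Hypothetical record shape; field names and paths are assumptions
# for illustration, not any particular MAM's schema.
@dataclass
class MetadataRecord:
    asset_id: str
    location_tag: str
    subtags: dict = field(default_factory=dict)  # subtag -> clip storage path

record = MetadataRecord(
    asset_id="laguardia-footage",
    location_tag="LaGuardia Airport",
    subtags={
        "American Airlines jet landing": "/tier1/clips/aa_landing.mxf",
        "US Airways jet taking off": "/tier2/clips/us_takeoff.mxf",
    },
)

# A backup of this record must carry every clip reference it holds;
# the single record is the only thing tying the scattered clips together.
linked_clips = sorted(record.subtags.values())
```

Backing up such records as whole units, rather than per-clip fragments, keeps the one-to-many relationships intact when the metadata is restored.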
Once you and your MAM vendor have determined your schemas and archiving protocols for every stage in the workflow, then it’s a matter of filling in the metadata from there. The metadata should be continually backed up while the work is in progress. In the end, when it’s time to archive the content, you also want to archive the metadata along with it for remonetization purposes.
Losing metadata means losing the linkage between the media asset manager and the content. The best way to maintain that linkage and ensure content continues to be discoverable and monetizable is through an aggressive policy of metadata backup starting in the earliest stage of the workflow, and then archiving the metadata at the end just as you do the final content. Securing metadata is something that you might not think about, but it’s a part of the infrastructure that is every bit as important as storing and securing the content itself.