As I’ve said in prior posts, keeping data in native format for later use is increasingly a “must have” for many customers. This is the starting point. Stage two is, of course, turning raw data into useful information by adding knowledge or context.
Before you can transition data into business information, you also must find the pieces of data that are interesting or useful. In the media and entertainment world, this is done predominantly through a concept called “metadata tagging.” Metadata tagging is a process by which every unique data element (for video, this would be a frame) is enriched with business information likely to identify its value. For a sports shot of Kobe Bryant sinking a 3-pointer, the frame is likely to be tagged with terms such as “Kobe Bryant,” “Lakers,” “3-pointer,” etc. That way, when broadcasters later need to grab a quick shot of Kobe, Lakers wins or sample 3-point shots in the NBA, they can rapidly find and use this frame.
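To make the idea concrete, here is a minimal sketch in Python of how tags might be stored so a frame can be found again later. It is only an illustration, not how any particular broadcast system works: the `FrameTagIndex` class, the frame IDs and the second and third sample frames are all hypothetical names I made up for the example.

```python
from collections import defaultdict


class FrameTagIndex:
    """Hypothetical in-memory index mapping each tag to the frames that carry it."""

    def __init__(self):
        # tag (lowercased) -> set of frame IDs
        self._frames_by_tag = defaultdict(set)

    def tag(self, frame_id, tags):
        """Attach business tags to a frame at the moment it is stored."""
        for t in tags:
            self._frames_by_tag[t.lower()].add(frame_id)

    def find(self, *tags):
        """Return the frames that carry ALL of the requested tags."""
        if not tags:
            return set()
        matches = [self._frames_by_tag.get(t.lower(), set()) for t in tags]
        return set.intersection(*matches)


# Frame IDs below are invented for illustration only.
index = FrameTagIndex()
index.tag("frame_001842", ["Kobe Bryant", "Lakers", "3-pointer"])
index.tag("frame_002310", ["Lakers", "free throw"])
index.tag("frame_000977", ["Celtics", "3-pointer"])

kobe_threes = index.find("Kobe Bryant", "3-pointer")
lakers_frames = index.find("lakers")
all_threes = index.find("3-pointer")
```

The lookup never touches the video itself. It only consults the tags, which is why the search stays fast no matter how much footage is in the archive.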
Metadata tagging is a concept whose time may be coming soon in the commercial world too, driven by the desire to turn data into “Big Data.”
Why do we need metadata tagging? Well, the best way I can think of to describe this is to tell you about my past month. (Get ready to laugh at my expense.)
We’ve been doing spring cleaning this year at the Lee household. My husband bought a new car, and he wanted to put it in the garage.
As a result, we found ourselves shuffling “stuff” between the garage and the rental cage, with the derivative requirement to unpack and sell five dish pack-sized moving boxes full of my teenage son’s childhood Legos (about 50 pounds’ worth). This would allow my son to earn a little spare cash on eBay – while making space for my husband’s car.
By the way, I mention Lego “pounds” because, I have learned, you can sell Legos by the pound. However, you get a much higher dollar value for the Legos if they are organized into the original complete kits with manuals, even without the original boxes. To put this into perspective, some of the large legacy Star Wars™ kits, such as the “Death Star,” sell on eBay for over $600.
So some idiot (was that me?) decided it would be a good idea to reassemble the volume, variety, etc. of Lego blocks back into the original kits. The plan was that I would sort blocks into categories and then my son would take it from there. Notice, by the way, that somehow everybody in the family is getting something out of this endeavor EXCEPT me. (If I can channel Dierks Bentley for a minute: “What was I thinkin’?”)
The decision to take on this project happened over a month ago. OMG, you should see my house right now!
Over the years, the Lego kits came completely undone. At the moment of opening, the boxes were destroyed; and after a single vanilla implementation of each original kit, the multitude and variety of blocks were merged into other more original creations, and finally dumped, along with all the original instruction guides, into giant bins.
Three weeks ago, I started sorting with dish pack-sized masses of an infinite variety of Lego blocks – from Star Wars miniature figures to Harry Potter village blocks to Bionicles and other things I cannot even describe (although I have since learned that Lego has a unique name for each and every one of them). Trying to “process” all these bits back into kits (using the original instruction manuals, which had no separate parts inventories until after 2006) has been an amazingly challenging task. Literally every horizontal surface in the house is covered with plastic containers. I’ve tried sorting by color, by type, by theme. Nothing works very well, frankly. It’s just plain brute force labor. All of which may explain why Legos sell for much less by the pound.
This strikes me as being very similar to trying to manage massive data bits with human knowledge (after the fact) to try to create real information. Had the Lego bits been tagged (with metadata) and kept in a more organized fashion, my house would NOT be the chaos it is right now.
Now in the data world, I realize that a 4-way Intel processor is probably much better at sorting than I am. But the problem remains: just as I struggled with whether to sort by color, size or theme, how does a business know what category to sort by?
And I can’t imagine what it would be like if, in addition to my current chaos, a new random box of Legos kept arriving every day, at the same time I was trying to organize what we already have. Yet, if you think about it, that’s exactly what some Big Data customers are dealing with, each and every day.
This entire exercise speaks to the value of metadata tagging at data creation or ingest.
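Here is a rough sketch, again in Python, of what tagging at ingest could look like, using my Lego pile as the data. Everything in it is an assumption made for illustration: the `Record` class, the `ingest` and `group_by` helpers, the tag names and the sample parts are hypothetical, not any product's API.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stored item: the raw payload plus whatever tags arrived with it."""
    payload: str
    tags: dict = field(default_factory=dict)


def ingest(store, payload, tags=None):
    """Store the item and its tags together, at the moment it arrives."""
    record = Record(payload, dict(tags or {}))
    store.append(record)
    return record


def group_by(store, key):
    """Regroup everything by any tag key in one pass; untagged items are set aside."""
    groups = defaultdict(list)
    for record in store:
        groups[record.tags.get(key, "untagged")].append(record.payload)
    return dict(groups)


# Sample parts and tag values are invented for illustration only.
store = []
ingest(store, "grey 2x4 brick", {"kit": "Death Star", "theme": "Star Wars"})
ingest(store, "minifigure", {"kit": "Death Star", "theme": "Star Wars"})
ingest(store, "village roof tile", {"theme": "Harry Potter"})
ingest(store, "mystery piece")  # arrived with no tags at all

by_theme = group_by(store, "theme")
by_kit = group_by(store, "kit")
```

Because the tags travel with each piece from the start, the same pile can be regrouped by theme or by kit on demand, and a new box arriving tomorrow just means more calls to `ingest`. The one piece that arrived without tags is the only one left needing brute force labor.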
Metadata tagging tools are widely used in select data-intensive verticals, like media and publishing. But as I said, with the growth in Big Data, business metadata is a concept any business needs to consider – before the business finds itself with the data equivalent of my five dish pack boxes of Legos.