Along with the growth in the number of data-creating sources, there has been a significant increase in the speed at which data are created. Much of the data created today (most of which could provide utility to the business if harnessed properly) comes from improvements in instrumentation, e.g., website clickstream data, mobile GPS data, machine sensor data, RFID tag data, network device logs, etc.
IBM used a term in one of their eBooks that I happen to like: “data exhaust”. The authors describe it as data that is “generated in huge amounts (often terabytes per day) but typically isn’t tapped for business insight.”
This analogy resonated with me because of the car I happened to drive in high school (a 1985 Dodge Omni GLH-T; watch here, starting at 0:56). This odd-looking car’s rapid acceleration was accomplished via a turbocharger. For those unfamiliar, a turbocharger harnesses the engine’s exhaust gases to spin a turbine, which in turn forces more air into the combustion chamber; more air means more fuel can be burned, so each power stroke of the piston generates more power.
Bringing this back to the data world, if you can harness your “data exhaust” through Big Data techniques (consider these the turbocharger), you can accelerate your business. As with Volume, consider Velocity relative to your current capabilities.
If the data are faster than you can currently handle, consider it “Big”.
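To make the "relative to your current capabilities" idea concrete, here is a minimal sketch that compares how fast data arrive with how fast you can process them. The function names and the rates are hypothetical, and the rates are assumed to be constant, which real workloads rarely are.

```python
def is_big_by_velocity(arrival_rate, processing_rate):
    """True when data arrive faster than current systems can process them.

    Both rates are in events per second (hypothetical units).
    """
    return arrival_rate > processing_rate


def backlog_after(arrival_rate, processing_rate, seconds):
    """Unprocessed events after `seconds`, assuming constant rates."""
    return max(0, (arrival_rate - processing_rate) * seconds)


# Hypothetical figures: sensors emit 50,000 events/s, but the current
# pipeline can only keep up with 30,000 events/s.
arrival = 50_000
capacity = 30_000

print(is_big_by_velocity(arrival, capacity))   # data outpace the pipeline
print(backlog_after(arrival, capacity, 3600))  # events piled up after one hour
```

If the backlog keeps growing, the data are "Big" for you, whatever the absolute numbers happen to be.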
The last V speaks to the growing diversity in the types and structure (or lack of structure) of data available today.
We’re no longer talking only about data stored in a data warehouse; there is also opportunity trapped within our growing volumes of semi-structured data (like XML) and unstructured data (like email, other text content, video, sound, images, etc.). Big Data techniques can be used to find the needles (insights) in the haystack (data) by combining data of various structures to yield the insights that will drive the best next action for the business.
In relative terms, if the data are broader than you can handle, consider it “Big”.
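As a toy illustration of combining data of various structures, the sketch below joins a structured source (CSV order totals), a semi-structured source (XML click events), and an unstructured source (email text) on a customer ID. All of the data, field names, and the "at risk" rule are invented for illustration; a real pipeline would use far richer text analysis than a keyword match.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Structured: rows as they might come out of a data warehouse (hypothetical).
ORDERS_CSV = "customer_id,order_total\nC1,120.50\nC2,15.00\n"

# Semi-structured: XML click events (hypothetical schema).
CLICKS_XML = (
    "<clicks>"
    "<click customer='C1' page='/returns'/>"
    "<click customer='C1' page='/returns'/>"
    "<click customer='C2' page='/home'/>"
    "</clicks>"
)

# Unstructured: free-text email bodies keyed by customer (hypothetical).
EMAILS = {
    "C1": "I am unhappy with my order and want a refund.",
    "C2": "Thanks, great service!",
}


def build_profiles(orders_csv, clicks_xml, emails):
    """Merge the three sources into one profile per customer."""
    profiles = {}
    for row in csv.DictReader(io.StringIO(orders_csv)):
        profiles[row["customer_id"]] = {
            "order_total": float(row["order_total"]),
            "returns_page_views": 0,
            "mentions_refund": False,
        }
    for click in ET.fromstring(clicks_xml).iter("click"):
        profile = profiles.get(click.get("customer"))
        if profile is not None and click.get("page") == "/returns":
            profile["returns_page_views"] += 1
    for customer_id, text in emails.items():
        if customer_id in profiles:
            found = re.search(r"\brefund\b", text, re.IGNORECASE)
            profiles[customer_id]["mentions_refund"] = bool(found)
    return profiles


def at_risk(profiles):
    """Customers who revisit the returns page and mention a refund."""
    return [
        customer_id
        for customer_id, p in profiles.items()
        if p["returns_page_views"] >= 2 and p["mentions_refund"]
    ]


profiles = build_profiles(ORDERS_CSV, CLICKS_XML, EMAILS)
print(at_risk(profiles))
```

No single source flags the customer on its own; the "needle" only shows up once the three structures are combined.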