Bringing Dark Data Out into the Light

There’s been some chatter lately about the significance of “dark data” and the accompanying “dark analytics.” The term usually refers to information that either exists in normally unreachable places, such as someone’s spreadsheet, or is formed as a result of analytics and cross-evaluation of existing data sources. For example, video captures of customers’ facial reactions while viewing products in a store could be considered dark data. The question is: can this data be identified, captured and bottled?

With the rise of artificial intelligence and machine learning, more dark data may be seeing the light of day. “Valuable insights from this data are gained by solving statistical inference problems at massive scale,” according to Abhishek Budholiya, of Future Market Insights, which recently published an analysis of the dark analytics market. “Dark analytics assist in recognizing better unused opportunities mainly in sales and marketing processes by analyzing customer behavior insights.” Sales, production and distribution trends are also potential candidates for dark data analysis.

Some analysts also take dark data to mean analysis of what is known as the dark web, with its inaccessible and often edgy sites. However, most of the opportunities lie with more well-known and mundane sources, including raw text-based data, “which may include text messages, documents, email, video and audio files, and still images,” according to a report by Tracie Kambies, Nitin Mittal, and Sandeep Kumar Sharma, all with Deloitte. Much of this may be locked away in the “deep web,” which includes curated sites maintained within organizations or by third parties.

The good news is that many enterprises may already have most of this data close by. “In many organizations, large collections of structured and unstructured data sit idle,” Kambies and her co-authors point out. “On the structured side, it’s typically difficult to make meaningful connections between disparate data sets. For example, a large insurance company could map its employees’ home addresses and parking pass assignments with their workplace satisfaction ratings and retention data and discover whether total door-to-desk commute time factors into voluntary turnover.”
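To make the insurance example concrete, here is a minimal Python sketch of that kind of cross-evaluation. Everything in it is an assumption made for illustration: the employee IDs, the field names, the values, and the two helper functions are hypothetical and do not come from the Deloitte report. It simply joins three separate data sets on a shared employee ID and compares average commute time for people who left voluntarily against those who stayed.

```python
from statistics import mean

# Hypothetical records; every ID and value below is invented for illustration.
# Commute time is assumed to be derived elsewhere from home address and parking assignment.
commute_minutes = {"e1": 25, "e2": 70, "e3": 40, "e4": 85}
satisfaction = {"e1": 4.5, "e2": 2.0, "e3": 3.5, "e4": 2.5}  # survey rating, 1 to 5
left_voluntarily = {"e1": False, "e2": True, "e3": False, "e4": True}


def join_records(*tables):
    """Inner-join dicts keyed by employee ID; returns {id: tuple of values}."""
    shared = set(tables[0]).intersection(*tables[1:])
    return {key: tuple(table[key] for table in tables) for key in sorted(shared)}


def mean_commute_by_turnover(joined):
    """Average commute minutes, grouped by whether the employee left voluntarily."""
    groups = {True: [], False: []}
    for commute, _rating, left in joined.values():
        groups[left].append(commute)
    return {left: mean(values) for left, values in groups.items() if values}


joined = join_records(commute_minutes, satisfaction, left_voluntarily)
summary = mean_commute_by_turnover(joined)
print(summary)
```

A gap between the two averages would only suggest that commute time deserves a closer look. It would not show on its own that long commutes cause turnover.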

Dark data may also sit in less structured places. “Valuable information on pricing, customer behavior, and competitors may be buried within traditional unstructured data,” the Deloitte authors add. “Untapped data includes emails, notes, messages, documents, logs, and notifications, such as those from internet of things devices, and even untranslated data assets from markets that are not English-speaking. This information remains largely unused either because it is not housed in a relational database or because the tools and techniques needed to leverage it efficiently did not exist until recently.”

Technology is opening up many of these formerly dark sources. “Strategic, customer, and operational insights are buried within volumes of raw information generated by transactional systems, social media, search engines, and countless other technologies,” Budholiya observes. Such technology includes “distributed data architecture, in-memory processing, machine learning, visualization, natural language processing, and cognitive analytics to validate or clarify assumptions, identify valuable patterns and insights, inform decision-making, and help chart new strategies.”

In terms of cognitive analytics, artificial intelligence and machine learning are showing great promise in opening up dark data. “Using computer vision (the ability of computers to identify objects, scenes, and activities in images), advanced pattern recognition, and video and sound analytics, companies can now mine such data sources as audio and video files and still images contained in nontraditional formats to better understand their customers, employees, operations, and markets.”

Kambies and her co-authors make the following recommendations for making the most of the dark data that may eventually be available to decision makers:

Ask the right questions. This effort needs to be led by the business, but a great deal of guidance and support is required. “Work with business teams to pinpoint specific questions dark data could help answer as well as potential dark analytics sources and untapped opportunities.” There may be data sources — such as video feeds or social sentiment — that haven’t been recognized as having potential value.

Look outside. “Augment data with publicly available demographic, location, and statistical information to generate more expansive, detailed reports and useful insights.”

Augment data talent. Data scientists are an important resource in this effort, of course, but also look for visual and graphic design skills, as well as traditional skills in master data management and data architecture.

Explore advanced visualization tools. “Information is more easily digested when presented as an infographic, dashboard, or other visual representation.”

The post Bringing Dark Data Out into the Light appeared first on The Informatica Blog – Perspectives for the Data Ready Enterprise.


Source: Informatica Perspectives