Many organisations are now working at a scale they would never have considered a few years back. Even in their more lucid moments, they would not have contemplated collecting and analysing the volumes of unstructured data we are seeing today.
The challenge organisations face is how to extract value from the unstructured data they hold.
- Firstly, there is a lot of work in collecting the data: free text, sound files and video files all need to be identified within your organisation.
- Secondly, we need to parse the data to extract the interesting pieces of information: text analytics to derive value from free text, phonetic engines to derive value from sound files, and visual recognition engines to derive value from video files.
- And finally, we need to integrate this new insight with the existing information that we have in our data warehouses.
With the exception of our more clandestine customers whom we can't talk about (although we can assume their capabilities if we believe half of what we see in an episode of Spooks), most organisations are at the very early stages of this journey. The first big step, which has been underway for the last few years, has been extracting value from the free text fields that are captured within our organisations (typically at the call centre). The text analytics algorithms have been improving over time, but they tend to be focused on a single language, and in a global market we need more. As for voice and video files, it seems that these algorithms are still in the early stages of R&D.
The latest silver bullet has been a significant change in approach. We are no longer trying to parse everything and put it into a relational database in a structured way. Instead, we are leveraging technologies like MapReduce to parse unstructured information and return just the valuable pieces. The valuable information that comes back can then be integrated with the structured information for analysis. The end goal is a single query language that integrates both structured and unstructured data on the fly. Watch this space, as some exciting acquisitions are building towards this goal.
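To make the idea a little more concrete, here is a minimal sketch in plain Python of the map, reduce and integrate steps described above. It is not how any particular product does it: the call centre notes, the customer table, the list of interesting terms and the function names are all made up for illustration, and a real deployment would run the map and reduce steps across a cluster with a proper text analytics engine rather than a keyword match.

```python
from collections import defaultdict

# Hypothetical terms of interest; a stand-in for a real text analytics engine.
TERMS = {"refund", "cancel", "upgrade"}

def map_note(note):
    """Map step: emit ((customer_id, term), 1) for each interesting term in a note."""
    customer_id, text = note
    for word in text.lower().split():
        word = word.strip(".,!?;:")
        if word in TERMS:
            yield (customer_id, word), 1

def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each (customer_id, term) key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

def integrate(totals, customers):
    """Attach the extracted term counts to the structured customer rows."""
    enriched = {cid: dict(row, terms={}) for cid, row in customers.items()}
    for (cid, term), count in totals.items():
        if cid in enriched:
            enriched[cid]["terms"][term] = count
    return enriched

# Made-up free text notes, as might be captured at a call centre.
notes = [
    (101, "Customer wants a refund. Asked about refund policy twice."),
    (102, "Happy with service, considering an upgrade."),
    (101, "Threatened to cancel!"),
]

# Made-up structured rows, standing in for a data warehouse table.
customers = {
    101: {"name": "A. Smith", "segment": "retail"},
    102: {"name": "B. Jones", "segment": "business"},
}

pairs = [pair for note in notes for pair in map_note(note)]
totals = reduce_counts(pairs)
enriched = integrate(totals, customers)
```

The point of the sketch is that the raw notes never land in the structured table. Only the small set of counts that comes out of the reduce step is joined to the customer rows.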
Teradata’s partnership with Cloudera, a MapReduce provider, is allowing unstructured data to be searched in a new way, and the early adopters are starting to make this work today. The moment of clarity, or bingo moment if you will, is that organisations are going to be able to build out their understanding of customers with this new flood of data. The hope is that they will use it to make my experience as a consumer a better one.
Are you starting down this journey?
The post A Moment of Truth appeared first on International Blog.