There’s a big marketing and consultant push on UIMA, the Unstructured Information Management Architecture. But I think it largely misses the point for real-world corporate BI people, i.e. those who are not spooks or librarians. The critical concept, ignored by many, is that unstructured information is of two kinds, explicit and tacit. Even Wikipedia gets it wrong, ignoring the latter.
I believe BI gains most from tacit intelligence, but that’s not where the product marketing thrust lies. The importance of tacit unstructured information in BI is summed up well by Timo Elliott in a cartoon and by James Taylor’s recent blog post.
The heavy hitters in BI software, as evidenced by the recent takeover activity, are pressing home the apparent advantage that corporations can gain by analyzing masses of emails, news, documents, etc. See, for example, the description of UIMA.
Well and good. But you can’t make a silk purse out of a sow’s ear, as Jonathan Swift said. And you can’t create relevant, action-oriented information for executives out of data that has no useful information embedded in it. The ocean of documents, with some exceptions, is a BI desert for most companies. But I mix my metaphors.
Basically, I believe that a corporation’s vast compendium of historical documents has little BI relevance. It may be useful to track or assess a person’s background, or to isolate the cause of a problem. But history rarely contains the up-to-date information that’s relevant to managing a business, assessing current performance and finding problems. The real value lies in exploiting the tacit stuff.
I’ve quoted Henry Mintzberg often before, but it bears repeating, as the message hasn’t yet been fully understood in the mainstream of BI: “The strategic database of an organization is in the minds of its managers, not in the databases of its computers”. This is as true today as it was in 1974. Today, one can add: “Or in the morass of historical documents and emails”.
Of course recent emails and documents often contain important information that can, and should, be part of a BI context; but usually only as the seed for a collaborative knowledge building process. This is the nub of the issue; it’s hard to identify, collate, disseminate and collaborate on tacit unstructured information. Perhaps this is why most authors steer clear of the issue. But we need to address it if we are to be effective. More on this in my next post.
I wrote last year detailing some of my research in the 90s on the subject of “hard” and “soft” information: how valuable it is in many BI contexts, particularly CRM, but also how difficult it is to exploit. In this context, hard information refers to structured, numeric, formatted BI reports. Soft information is the unformatted, unarticulated information in managers’ and professionals’ minds.
An interesting article by Rick Taylor dealing with unstructured information is relevant here. It says, in part:
The key to defining knowledge management is to make sure you are separating “explicit” knowledge from “tacit” knowledge. Explicit knowledge is anything easy to quantify, write down, document or explain. Tacit knowledge is everything else. The knowledge based on one's experiences, and often times, at a subconscious level. It is information that you don’t necessarily know you know until you are reminded of it. If you were asked to write down everything you know, could you do it?
The explicit and tacit labels were used first in this context, I believe, by Nonaka and Takeuchi in The Knowledge-Creating Company.
The BI key questions that arise from this discussion are, I believe:
- What are the most useful sources of unstructured information in our business? Explicit or Tacit?
- If Explicit, how do we best marshal the information and report it?
- If Tacit, ditto?
- Is the information we get from our unstructured sources complete, and ready for promulgation, or do we need to amplify or build on it before it’s useful?
I believe that the above analysis outlines the problem of utilizing tacit unstructured information reasonably well. I’ll offer my answers to these issues subsequently.
About the Author
Recognizing that rapid development methodologies are here to stay as part of the BI system context, Cyril Brookes is currently studying how we can evolve new approaches to requirements definition based on bottom-up principles. This has led to a renewed focus on documenting corporate BI environments, and the work described as BI Documenter; see http://www.bidocumenter.com.