Eight to Late

Sensemaking and Analytics for Organizations

Six heresies for business intelligence

What is business intelligence?

I recently asked a few acquaintances to answer this question without referring to that great single point of truth in the cloud.  They duly came up with a variety of  responses ranging from data warehousing and the names of specific business intelligence tools to particular functions such as reporting or decision support.

After receiving their responses, I did what I asked my respondents not to: I googled the term.  Here are a few samples of what I found:

According to CIO magazine, Business intelligence is an umbrella term that refers to a variety of software applications used to analyze an organization’s raw data.

Wikipedia, on the other hand, tells us that BI is a set of theories, methodologies, architectures, and technologies that transform raw data into meaningful and useful information for business purposes.

Finally, Webopedia tells us that BI [refers to] the tools and systems that play a key role in the strategic planning process of the corporation.

What’s interesting about the above responses and definitions is that they focus largely on processes and methodologies or tools and techniques. Now, without downplaying the importance of either, I think that many of the problems of business intelligence practice come from taking a perspective that is overly focused on methodology and technique. In this post, I attempt to broaden this perspective by making some potentially controversial statements – or heresies – that challenge this view. My aim is not so much to criticize current practice as to encourage – or provoke – business intelligence professionals to take a closer look at some of the assumptions that underlie their practices.

The heresies

Without further ado, here are my six heresies for business intelligence practice (in no particular order).

A single point of truth is a mirage

Many organisations embark on ambitious programs to build enterprise data warehouses – unified data repositories that serve as a single source of truth for all business-relevant data.  Leaving aside the technical  and business issues associated with establishing definitive data sources and harmonizing data, there is the more fundamental question of what is meant by truth.

The most commonly accepted notion of truth is that information (or data in a particular context) is true if it describes something as it actually is. A major issue with this viewpoint is that data (or information) can never fully describe a real-world object or event. For example, when a sales rep records a customer call, he or she notes down only what is required by the customer management system. Other data that may well be more important is not captured or is relegated to a “Notes” or “Comments” field that is rarely if ever searched or accessed. Indeed, data represents only a fraction of the truth, however one chooses to define it – more on this below.

Some might say that it is naïve to expect our databases to capture all aspects of reality, and that what is needed is a broad consensus among all relevant stakeholders as to what constitutes the truth. The problem with this is that such a consensus is often achieved by means that are not democratic. For example, a KPI definition chosen by a manager may be hotly contested by an employee. Nevertheless, the employee has to accept it because that is the way (many) organisations work. Another significant issue is that the notion of relevant stakeholders is itself problematic because it is often difficult to come up with a clear criterion by which to define relevance.

There are other ways to approach the notion of truth: for example, one might say that a piece of data is true as long as it is practically useful to deem it so. Such a viewpoint, though common, is flawed because utility is in the eye of the beholder: a sales manager may think it useful to believe a particular KPI whereas a sales rep might disagree (particularly if the KPI portrays the rep in a bad light!).

These varied interpretations of what constitutes truth have implications for the notion of a single point of truth. For one, the various interpretations are incommensurable – they cannot be judged by the same standard. Further, different people may interpret the same piece of data differently. This is something that BI professionals have likely come across – say, when attempting to come up with a harmonized definition for a customer record.

In short: the notion of a single point of truth is problematic because there is a great deal of ambiguity about what constitutes a truth.

There is no such thing as raw data

In his book, Memory Practices in the Sciences, Geoffrey Bowker wrote, “Raw data is both an oxymoron and a bad idea; to the contrary, data should be cooked with care.”  I love this quote because it tells a great truth (!) about so-called “raw” data.

To elaborate: raw data is never unprocessed. Firstly, the data collector always makes a choice as to what data will be collected and what will not. So in this sense, data already has meaning imposed on it. Secondly, and perhaps more importantly, the method of collection affects the data. For example, responses to a survey depend on how the questions are framed and how the survey itself is carried out (anonymous, face-to-face etc.). This is also true for more “objective” data such as costs and expenses: the actual numbers depend on the specific accounting practices used in the organization. So, raw data is an oxymoron because data is never raw, and as Bowker tells us, we need to ensure that the filters we apply and the methods of collection we use are such that the resulting data is “cooked with care.”

In short: data is never raw, it is always “cooked.”

There are no best practices for business intelligence, only appropriate ones

Many software shops and consultancies devise frameworks and methodologies for business intelligence which they claim are based on best or proven practices. However, those who swallow that line and attempt to implement the practices often find that the results obtained are far from best.

I have discussed the shortcomings of best practices in a general context in an earlier article, and (at greater length) in my book. A problem with best practice approaches is that they assume a universal yardstick of what is best. As a corollary, this also suggests that practices can be transplanted from one organization to another in a wholesale manner, without extensive customisation. This overlooks the fact that organisations are unique, and what works in one may not work in another.

A deeper issue is that much of the knowledge pertaining to best practices is tacit – that is, it cannot be codified in written form. Indeed, what differentiates good business intelligence developers or architects from great ones is not what they learnt from a textbook (or in a training course), but how they actually practice their craft. Much of this craft consists of things they do instinctively and would find hard to put into words.

So, instead of looking to import best practices from your favourite vendor, it is better to focus on understanding what goes on in your environment. A critical examination of your environment and processes will reveal opportunities for improvement. These incremental improvements will cumulatively add up to your very own, customized “best practices.”

In short: develop your own business intelligence best practices rather than copying those peddled by “experts.”

Business intelligence does not support strategic decision-making

One of the stated aims of business intelligence systems is to support better business decision making in organisations (see the Wikipedia article, for example). It is true that business intelligence systems are perfectly adequate – even indispensable – for certain decision-making situations. Examples include financial reporting (when done right!) and other operational reporting (inventory, logistics etc.). These generally tend to be routine situations with clear-cut decision criteria and well-defined processes – i.e. decisions that can be programmed.

In contrast, decisions pertaining to strategic matters cannot be programmed. Examples of such decisions include dealing with an uncertain business environment, responding to a new competitor and so on. The reason such decisions cannot be programmed is that they depend on a host of factors other than data and are generally made in situations that are ambiguous. Typically people use deliberative methods – i.e. methods based on argumentation – to arrive at decisions on such matters. The sad fact is that all the major business intelligence tools on the market lack support for deliberative decision-making. Check out this post for more on what can be done about this.

In short: business intelligence does not support strategic decision-making.

Big data is not the panacea it is trumpeted to be

One of the more recent trends in business intelligence is the move towards analyzing increasingly large, diverse, rapidly changing datasets – what goes under the umbrella term big data.  Analysing these datasets entails the use of new technologies (e.g. Hadoop and NoSQL)  as well as statistical techniques that are not familiar to many mainstream business intelligence professionals.

Much has been claimed for big data; in fact, one might say too much.  In this article Tim Harford (aka the Undercover Economist) summarises the four main claims of “big data cheerleaders” as follows (the four phrases below are quoted directly from the article):

  1. Data analysis produces uncannily accurate results.
  2. Every single data point can be captured, making old statistical sampling techniques obsolete.
  3. It is passé to fret about what causes what, because statistical correlation tells us what we need to know.
  4. Scientific or statistical models aren’t needed.

The problem, as Harford points out, is that all of these claims are incorrect.

Firstly, the accuracy of the results that come out of a big data analysis depends critically on how the analysis is formulated. However, even analyses based on well-founded assumptions can get it wrong, as is illustrated in this article about Google Flu Trends.

Secondly, it is pretty obvious that it is impossible to capture every single data point (also relevant here is the discussion on raw data above – i.e. how data is selected for inclusion).

The third claim is simply absurd. The fact is that detecting a correlation is not the same as understanding what is going on, a point made rather nicely by Dilbert. Enough said, I think.

Fourthly, the claim that scientific or statistical models aren’t needed is simply ill-informed. As any big data practitioner will tell you, big data analysis relies on statistics. Moreover, as mentioned earlier, a correlation-based understanding is no understanding at all –  it cannot be reliably extrapolated to related situations without the help of hypotheses and (possibly tentative)  models of how the phenomenon under study works.
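
To make this concrete, here is a small illustrative sketch in plain Python and NumPy (not from the original post; the series lengths, trial count and the 0.5 threshold are arbitrary choices). It generates pairs of series that are independent by construction and counts how often they nevertheless show a sizeable sample correlation:

    import numpy as np

    # Two series generated independently of each other cannot be causally
    # related, yet their sample correlation is often far from zero. This is
    # the classic "spurious correlation" problem with trending data.
    rng = np.random.default_rng(0)
    n_steps, n_trials = 500, 1000

    large = 0
    for _ in range(n_trials):
        walk_a = np.cumsum(rng.normal(size=n_steps))  # independent random walk
        walk_b = np.cumsum(rng.normal(size=n_steps))  # another, unrelated walk
        if abs(np.corrcoef(walk_a, walk_b)[0, 1]) > 0.5:
            large += 1

    print(f"{large} of {n_trials} unrelated pairs show |correlation| > 0.5")
    # A sizeable fraction of these pairs look "strongly correlated" even though
    # there is, by construction, nothing to extrapolate from.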

Finally, as Danah Boyd and Kate Crawford point out in this paper, big data changes what it means to know something… and it is highly debatable whether these changes are for the better. See the paper for more on this point. (Acknowledgement: the title of this post is inspired by the title of the Boyd-Crawford paper).

In short:  business intelligence practitioners should not uncritically accept the pronouncements of big data evangelists and vendors.

Business intelligence has ethical implications

This heresy applies to much more than business intelligence: any human activity that affects other people has an ethical dimension. Many IT professionals tend to overlook this facet of their work because they are unaware of it – and sometimes prefer to remain so. The fact is, the decisions business intelligence professionals make with respect to usability, display, testing etc. have a potential impact on the people who use their applications. The impact may range from something as trivial as having to click through one too many buttons or filters to get to a report, to something more significant, like a data error that leads to a poor business decision.

In short: business intelligence professionals ought to consider how their artefacts and applications affect their users.

In closing

This brings me to the end of my heresies for business intelligence. I suspect there will be a few practitioners who agree with me and (possibly many) others who don’t…and some of the latter may even find specific statements provocative. If so, I consider my job done, for my intent was to get business intelligence practitioners to question a few unquestioned tenets of their profession.

Written by K

April 3, 2014 at 9:29 pm

On the disconnect between business intelligence and strategic decision-making

One of the stated aims of business intelligence (BI) systems is to support better business decision making in organisations (see the Wikipedia article on BI, for example). However, as I have discussed in an earlier post, the usefulness of BI systems in making decisions regarding complex or ambiguous matters is moot. Quoting from that post:

…many decisions [in organisations] have to be made based on incomplete and/or ambiguous information that can be interpreted in a variety of ways. Examples include issues such as what an organization should do in response to increased competition or formulating a sales action plan in a rapidly changing business environment. These issues are wicked; among other things, there is a diversity of viewpoints on how they should be resolved. A business manager and a sales representative are likely to have different views on how sales action plans should be adjusted in response to a changing business environment. The shortcomings of BI systems become particularly obvious when dealing with such problems.

This brings up the question of how BI is actually used in organisations.

Quoting again from my earlier article:

BI systems are perfectly adequate – even indispensable – for certain situations. Examples include financial reporting (when done right!) and other operational reporting (inventory, logistics etc.). These generally tend to be routine situations with clear-cut decision criteria and well-defined processes. Simply put, they are the kinds of decisions that can be programmed.

Typically programmed decisions are made when checking on or monitoring business activities.  I would hazard a guess that BI applications are generally used to carry out such routine monitoring of business processes (and take rule-based corrective action, if necessary) rather than in making complex decisions.  To use a phrase coined by James March, BI applications are used in surveillance mode rather than decision mode.

Unfortunately most BI vendors are yet to address this gap. Most new features that vendors come up with operate in surveillance mode rather than decision mode – that is, they help organisations track (and correct) performance rather than decide on complex/uncertain matters. Thus, despite vendor claims to the contrary, BI is still used as a means to measure and manage  operational matters rather than to make strategic decisions.

Incidentally, this is also true of over-hyped new areas such as Big Data and Predictive Analytics.

Big data refers to a set of technologies and techniques that are useful when analysing large volumes of fast-changing, unstructured data to make operational decisions. For example, commercially available big data products such as Splunk can monitor vast numbers of unstructured server logs in real time and tell you what corrective actions need to be taken (an operational decision), but they cannot tell you what IT investments you should make over the next five years (a strategic decision).

Predictive analytics refers to a wide range of techniques that are used to identify patterns in past data in order to make predictions about future events. However, the predictions made using such techniques can only be as good as the underlying mathematical models. Consequently, success in predictive analytics depends crucially on knowing the key variables that govern the phenomena of interest. Identifying these variables can be difficult, if not impossible, in the case of business decisions because of human factors (intentions, motivations etc.). As Gregory Piatetsky-Shapiro puts it in this article, “Predictive analytics can figure out how to land on Mars, but not who will buy a Mars bar.”
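
As a rough illustration of the point about key variables, here is a toy sketch using scikit-learn (all variable names and coefficients are invented for illustration). A model fitted on past data can only use the variables someone thought to record; if the factor that actually drives the behaviour goes unrecorded, no amount of fitting will recover it:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 1000

    # Variables that are easy to record in a CRM (names are made up)
    visits = rng.poisson(3, size=n)                  # past visits
    spend = rng.gamma(2.0, 50.0, size=n) / 100.0     # past spend, rescaled

    # The variable that actually drives the behaviour, the customer's
    # intention to buy, is never recorded, so the model below cannot use it.
    intention = rng.normal(size=n)
    buys = (0.2 * visits + 0.2 * spend + 2.0 * intention + rng.normal(size=n)) > 1.5

    X_recorded = np.column_stack([visits, spend])
    model = LogisticRegression(max_iter=1000).fit(X_recorded, buys)
    print(f"In-sample accuracy using only the recorded variables: "
          f"{model.score(X_recorded, buys):.2f}")
    # However well this fits, it cannot recover the effect of 'intention',
    # which is the point of the Mars / Mars bar quip.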

So, the question arises:  what do BI vendors need to do in order to facilitate decision-making on complex matters?

To answer this we need to take a brief look at the process of decision-making. The traditional view is that decisions are made by working through the following steps (a toy sketch of this rule-based procedure appears after the list):

  1. Identifying available options
  2. Understanding the consequences of each option
  3. Rating options based on preferences for those consequences
  4. Selecting an option (based on rules and ratings)
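
Here is the toy sketch referred to above, in plain Python (the options, criteria and weights are invented for illustration). When the options, their consequences and the preference weights are all known up front, the choice reduces to a rule that a machine can apply:

    # Step 1 and 2: options and their consequences, scored against two
    # hypothetical criteria
    options = {
        "reorder_now":   {"cost": 0.4, "service_level": 0.9},
        "reorder_later": {"cost": 0.8, "service_level": 0.5},
        "do_nothing":    {"cost": 1.0, "service_level": 0.2},
    }

    # Step 3: preferences expressed as fixed weights over the criteria
    weights = {"cost": 0.3, "service_level": 0.7}

    def rate(scores):
        """Weighted rating of one option's consequences."""
        return sum(weights[criterion] * score for criterion, score in scores.items())

    # Step 4: select the option with the highest rating
    best = max(options, key=lambda name: rate(options[name]))
    print(f"Programmed choice: {best}")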

However, as I have discussed in a post on the nature of decision making in organisations, in the case of complex decisions not only is it hard to identify all the options and their consequences; even preferences and/or selection rules may change as one’s knowledge of the options improves. As a consequence, such decisions necessarily involve informal reasoning – a deliberative process that takes into account participants’ values and beliefs, in addition to logic and “hard facts”. The important point, as Tim van Gelder notes in a brilliant post entitled The missing “I” in BI, is that none of the BI suites on the market support informal reasoning. The lack of support is especially strange because there are well-known techniques such as Issue Based Information System (IBIS) and Argument Mapping that can be used to facilitate and capture such reasoning.
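
To give a flavour of what such support might involve, here is a bare-bones sketch of an IBIS-style structure for capturing deliberation (the class and field names are my own illustrative choices, not drawn from any particular tool). An issue is addressed by one or more positions, and each position attracts supporting or objecting arguments:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Argument:
        text: str
        supports: bool                       # True = pro, False = con

    @dataclass
    class Position:
        text: str                            # a possible response to the issue
        arguments: List[Argument] = field(default_factory=list)

    @dataclass
    class Issue:
        question: str                        # the matter under deliberation
        positions: List[Position] = field(default_factory=list)

    # A fragment of deliberation that a dashboard or report cannot capture
    issue = Issue("How should we respond to the new competitor?")
    cut_prices = Position("Cut prices in the affected regions")
    cut_prices.arguments.append(Argument("Protects market share in the short term", True))
    cut_prices.arguments.append(Argument("Erodes margin and may trigger a price war", False))
    issue.positions.append(cut_prices)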

This gap does not matter in the case of operational decisions as the choice is made on the basis of straightforward (or programmable) rules, as in steps 1 through 4 above. However, the situation is different in the case of complex or non-programmable decisions such as those that are made in the face of uncertainty. In these cases the lack of support for facilitating, capturing and storing decision rationale becomes a huge handicap.

In summary:  Currently available BI tools are good for operational rather than strategic decision-making because they do not offer any support for the deliberative process that is needed to make complex decisions in the face of ambiguity or uncertainty. The adage, “data doesn’t make decisions, people do”  is particularly true for strategic decisions, but it appears  BI vendors are yet to recognise this.

Acknowledgement:

This post was inspired by a recent comment on one of my earlier posts on business intelligence.

Written by K

March 7, 2013 at 8:58 pm

Out damn’d SPOT: an essay on data, information and truth in organisations

Introduction

Jack: My report tells me that we are on track to make budget this year.

Jill: That’s strange, my report tells me otherwise

Jack: That can’t be. Have you used the right filters?

Jill: Yes – the ones you sent me yesterday.

Jack: There must be something else…my figures must be right, they come from the ERP system.

Jill: Oh, that must be it then…mine are from the reporting system.

Conversations such as the one above occur quite often in organisation-land.  It is one of the reasons why organisations chase the holy grail of a single point of truth (SPOT): an organisation-wide repository that holds the officially endorsed true version of data, regardless of where it originates from. Such a repository is often known as an Enterprise Data Warehouse (EDW).

Like all holy grails, however, the EDW is a mythical object that exists only in the pages of textbooks (and vendor brochures…). It is at best an ideal to strive towards. But, like chasing the end of a rainbow, it is an exercise that may prove exhausting and, ultimately, futile.

Regardless of whether or not organisations can get to that mythical end of the rainbow – and there are those who claim to have got there – there is a deeper issue with the standard views of data and information that hold sway in organisation-land. In this post I examine these standard conceptions of data, information and truth, drawing largely on this paper by Bernd Carsten Stahl and a number of secondary sources.

Some truths about data and information

As Stahl observes in his introduction:

Many assume that information is central to managerial decision making and that more and higher quality information will lead to better outcomes. This assumption persists even though Russell Ackoff argued over 40 years ago that it is misleading

The reason for the remarkable persistence of this incorrect assumption is that there is a lack of clarity as to what data and information actually are.

To begin with let’s take a look at what these terms mean in the sense in which they are commonly used in organisations. Data typically refers to raw, unprocessed facts or the results of measurements. Information is data that is imbued with meaning and relevance because it is referred to in a context of interest. For example, a piece of numerical data by itself has no meaning – it is just a number. However, its meaning becomes clear once we are provided a context – for example, that the number is the price of a particular product.

The above seems straightforward enough and embodies the standard view of data and information in organisations. However, a closer look reveals some serious problems. For example, what we call raw data is not unprocessed – the data collector always makes a choice as to what data will be collected and what will not. So in this sense, data already has meaning imposed on it. Further, there is no guarantee that what has been excluded is irrelevant. As another example, decision makers will often use data (relevant or not) just because it is available. This is a particularly common practice when defining business KPIs – people often use data that can be obtained easily rather than attempting to measure metrics that are relevant.

Four perspectives on truth

One of the tacit assumptions that managers make about the information available to them is that it is true.  But what exactly does this mean?  Let’s answer this question by taking a whirlwind tour of some theories of truth.

The most commonly accepted notion of truth is that of correspondence, that a statement is true if it describes something as it actually is.  This is pretty much how truth is perceived in business intelligence: data/information is true or valid if it describes something – a customer, an order or whatever – as it actually is.

More generally, the term correspondence theory of truth refers to a family of theories that trace their origins back to antiquity. According to Wikipedia:

Correspondence theories claim that true beliefs and true statements correspond to the actual state of affairs. This type of theory attempts to posit a relationship between thoughts or statements on one hand, and things or facts on the other. It is a traditional model which goes back at least to some of the classical Greek philosophers such as Socrates, Plato, and Aristotle. This class of theories holds that the truth or the falsity of a representation is determined solely by how it relates to a reality; that is, by whether it accurately describes that reality.

One of the problems with correspondence theories is that they require the existence of an objective reality that can be perceived in the same way by everyone. This assumption is clearly problematic, especially for issues that have a social dimension. Such issues are perceived differently by different stakeholders, and each of these will legitimately seek data that supports their point of view. The problem is that there is often no way to determine which data is “objectively right.” More to the point, in such situations the very notion of “objective rightness” can be legitimately questioned.

Another issue with correspondence theories is that a piece of data can at best be an abstraction of a real-world object or event.  This is a serious issue with correspondence theories in the context of data in organisations. For example, when a sales rep records a customer call, he or she notes down only what is required by the customer management system. Other data that may well be more important is not captured or is relegated to a “Notes” or “Comments” field that is rarely if ever searched or accessed.

Another perspective is offered by the so called consensus theories of truth, which assert that true statements are those that are agreed to by the relevant group of people. This is often the way truth is established in organisations. For example, managers may choose to calculate Key Performance Indicators (KPIs) using certain pieces of data that are deemed to be true. The problem with this is that consensus can be achieved by means that are not necessarily democratic. For example, a KPI definition chosen by a manager may be hotly contested by an employee. Nevertheless, the employee has to accept it because organisations are typically not democratic. A more significant issue is that the notion of “relevant group” is problematic because there is no clear criterion by which to define relevance.

Pragmatic theories of truth assert that truth is a function of utility – i.e. a statement is true if it is useful to believe it is so. In other words, the truth of a statement is to be judged by the payoff obtained by believing it to be true. One of the problems with these theories is that it may be useful for some people to believe in a particular statement while it is useful for others to disbelieve it. A good example of such a statement is: there is an objective reality. Scientists may find it useful to believe this whereas social constructionists may not. Closer to home, it may be useful for a manager to believe that a particular customer is a good prospect (based on market intelligence, say), but a sales rep who knows the customer is unlikely to switch brands may think it useful to believe otherwise.

Finally, coherence theories of truth tell us that statements that are true must be consistent with a wider set of beliefs. In organisational terms, a piece of information or data is true only if it does not contradict things that others in the organisation believe to be true. Coherence theories emphasise that the truth of statements cannot be established in isolation but must be evaluated as part of a larger system of statements (or beliefs). For example, managers may believe certain KPIs to be true because they fit in with other things they know about their business.

…And so to conclude

The truth is a slippery beast: what is true and what is not depends on what exactly one means by the truth and, as we have seen, there are several different conceptions of truth.

One may well ask if this matters from a practical point of view.  To put it plainly: should executives, middle managers and frontline employees (not to mention business intelligence analysts and data warehouse designers) worry about philosophical theories of truth?  My contention is that they should, if only to understand that the criteria they use for determining the validity of their data and information are little more than conventions that are easily overturned by taking other, equally legitimate, points of view.

Written by K

October 17, 2012 at 9:11 pm