The politics of data warehousing revisited
Introduction
An enterprise IT initiative generally affects a range of stakeholder groups, each with their own take on why the project is being undertaken and what the result should look like. This diversity of views is no surprise: an organisation-wide effort affects many divisions and departments, so there are bound to be differing – even conflicting – views regarding the initiative and its expected outcome.
The existence of many irreconcilable viewpoints is one of the main symptoms of a wicked problem – a problem that is hard to define, let alone solve. Paul Culmsee has written about the inherent wickedness of projects that involve collaborative platforms such as SharePoint. In this post I discuss how another class of enterprise scale initiatives – efforts to consolidate and harmonise organizational data for analytical and reporting purposes (so-called data warehouse projects) – display characteristics of wickedness. I also briefly discuss a couple of approaches that can be used to manage this issue.
As some of my readers may not be familiar with the terms data warehouse or wicked problem, I’ll start with a short introduction to the two terms in order to set the stage for the main topic.
Data warehouse
A data warehouse is a repository of data that a business deems important for reporting and analysis. Ideally, a data warehouse integrates data from multiple sources – for example, CRM and financial systems – thereby serving as an authoritative source for management reports (often referred to as a “single point of truth”). There are at least a couple of different design philosophies for data warehouses, but I won’t go into these as they are not relevant to the discussion. What’s interesting is that most of the literature on data warehousing deals with its technical aspects – things such as data modelling and extract-transform-load processes – yet, as anyone who has been involved in an enterprise-scale data warehousing effort will tell you, the biggest challenges are political, not technical. To be fair, this was recognized a while ago – Marc Demarest wrote an article on the politics of data warehousing in 1997. However, it is worth revisiting this issue because there are techniques to handle it that weren’t widely known at the time Demarest wrote his article. I discuss these briefly later, but first let’s look at what wickedness means and its relevance to data warehouse projects.
Wicked problems
The term wicked problem was coined by Horst Rittel and Melvin Webber in a now-classic paper entitled Dilemmas in a General Theory of Planning. The paper is essentially a critique of the traditional approach to social planning, wherein decisions are made by experts who, by virtue of their specialist knowledge and training, are assumed to know best. Such an approach often doesn’t work because it ends up alienating stakeholders who are adversely affected by the “solution.” This is a symptom of social complexity – messiness and conflict arising from diverse opinions as to what the problem is and how it should be solved. Those involved in enterprise-scale IT initiatives – whether as users, managers or technical specialists – would have had first-hand experience of this social complexity.
How do we know that a problem is socially complex (or wicked)? That’s easy: In the paper, Rittel and Webber describe ten criteria for wickedness – so a problem is wicked if it satisfies some or all of the Rittel-Webber criteria. We’ll take a look at the criteria and their relevance to data warehousing next.
The wickedness of data warehouse initiatives
To support my claim about the wickedness of data warehousing initiatives, I’ll simply list the ten Rittel-Webber criteria (in their original form) along with a brief commentary on how they can crop up in data warehouse projects. Here we go:
- There is no definitive formulation of a wicked problem: Those who have worked on organisation-wide efforts at integrating data will know that the first problem is to decide “what’s in and what’s out” – that is, what data sources are considered in scope for integration. The problem arises because different business stakeholders have different views on what is important. For example, data that is critical to HR may not be a priority for the marketing function.
- Wicked problems have no stopping rule: Data warehouse initiatives are never definitively completed: there are always new data sources that need to be integrated; old ones to be turned off; business rules to be changed and so on. Any stopping rule that one might define will need to be revised as new business requirements come up and new data sources are revealed.
- Solutions to wicked problems are not true or false, but better or worse: This is simply an expression of the truism that there is no right or wrong way to build a data warehouse. There is a range of different architectures and approaches that can be chosen, each with its pros and cons (see this paper for a comparison of the two most popular approaches). The problem is that one often cannot tell beforehand which approach is going to be best for a particular situation.
- There is no immediate or ultimate test of a solution to a wicked problem: This is a statement of the fact that one cannot tell whether or not a particular implementation can completely solve the problem of data integration. As Rittel and Webber put it, “…any solution, after being implemented, will generate waves of consequences over an extended – virtually unbounded – period of time. Moreover, the next day’s consequences of the solution may yield utterly undesirable consequences…” Although these words are somewhat over-the-top, the message isn’t: for example, I have seen situations where programming errors that have remained undetected for years (yes, years) have led to incorrect data being used in reports.
- Every solution to a wicked problem is a “one-shot” operation; because there is no opportunity to learn by trial and error, every attempt counts significantly: Because of the high costs of implementation, enterprise-scale IT initiatives tend to be one-shot affairs. Another limiting factor is that there is usually a very short window of time in which the project must be completed – as the cliché goes, “users need these reports yesterday.” Among other things, this precludes the option of learning by trial and error.
- Wicked problems do not have an enumerable (or exhaustively describable) set of potential solutions, nor is there a set of well-describable options that may be incorporated into the plan: This point may seem like it doesn’t apply to data warehousing initiatives – all data warehousing projects have a plan, right? Nevertheless, those who have worked on such projects will attest to the fact that the plan – such as it is – needs frequent revision because of surprises that crop up along the way. Iterative/incremental development approaches can address these issues to some extent, but cannot eliminate them completely. Because of time constraints, it is inevitable that solutions to unexpected roadblocks occur through improvisation rather than planning.
- Every wicked problem is essentially unique: This one is easy to see: every organisation is unique, and so are its data integration requirements. Methodologists and consultants may try to convince you otherwise, and tempt you into following generic approaches – but don’t be fooled: generic approaches will come unstuck. Your data is unique; treat it with the respect and seriousness it deserves.
- Every wicked problem can be considered to be a symptom of another problem: One of the key drivers of data warehouse projects is that organizations tend to have the same (or similar) data residing in multiple databases. As a consequence there are several different “sources of truth” for reports. These different sources of truth arise because systems used in different departments may have different definitions of the same business entity. For example, a customer might be defined in one way within the financial system but in another way in a CRM system. Seen in this light, the problem of multiple sources of truth is actually a symptom of lack of communication between different departments, what is sometimes called silo mentality.
- The existence of a discrepancy representing a wicked problem can be explained in numerous ways. The choice of explanation determines the nature of the problem’s resolution: As discussed in the previous point, the discrepancy in the case of a data integration problem is the lack of congruency between different data sources. There can be a range of explanations for the discrepancy. For example, one explanation may be that the data is actually different – a customer in the CRM system is not the same as the customer in the finance system; another explanation may be that the two entities are the same but their definitions differ because the systems were developed independently of each other. The data integration solution in the two cases will differ – in other words, the solution to the problem depends on which explanation is seen as the correct one.
- The planner has no right to be wrong: The data warehouse designer is in a difficult position: he or she may have to reconcile contradictory requirements. Following from the example of the previous point, whatever design decisions the designer makes regarding the definition of a customer, there will be some parties that will not be happy: if she goes with the finance definition, sales will be ticked off; if she chooses the sales definition, finance will not be happy; if she chooses to define a single common entity, neither will be pleased. Yet, her mandate is to satisfy all business requirements. This criterion is essentially an expression of the political aspect of data warehouse projects.
I find it quite amazing that criteria that were framed in the context of social planning problems can apply word-for-word to data consolidation initiatives.
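To make the customer example from the list above a little more concrete, here is a minimal Python sketch of how two departmental definitions of "customer" can yield different answers to the same question. Everything in it is an assumption made for illustration: the record layouts, the field names, and both definitions (CRM counts every contact on file, finance counts only parties that have been invoiced) are hypothetical and are not drawn from any particular system.

```python
# Hypothetical records. The field names and both definitions of "customer"
# are assumptions for illustration, not taken from any real system.
crm_contacts = [
    {"id": "C1", "name": "Acme", "status": "active"},
    {"id": "C2", "name": "Birch", "status": "prospect"},
    {"id": "C3", "name": "Cedar", "status": "active"},
]
finance_invoices = [
    {"customer_id": "C1", "amount": 1200.0},
    {"customer_id": "C1", "amount": 300.0},
    {"customer_id": "C4", "amount": 950.0},
]


def crm_customers(contacts):
    """CRM view (assumed): any contact on file is a customer, prospects included."""
    return {c["id"] for c in contacts}


def finance_customers(invoices):
    """Finance view (assumed): a customer is any party that has been invoiced."""
    return {i["customer_id"] for i in invoices}


def compare_definitions(contacts, invoices):
    """Summarise where the two definitions agree and where they diverge."""
    crm = crm_customers(contacts)
    fin = finance_customers(invoices)
    return {
        "crm_count": len(crm),
        "finance_count": len(fin),
        "agreed": sorted(crm & fin),
        "crm_only": sorted(crm - fin),
        "finance_only": sorted(fin - crm),
    }


report = compare_definitions(crm_contacts, finance_invoices)
print(report)
```

The sketch only surfaces the discrepancy. Deciding which of the two sets (or what combination of them) should count as "customers" in the warehouse is exactly the kind of question that has to be settled between stakeholders rather than in code.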
Managing wickedness in data warehousing
As should be evident from the above, wicked problems can’t be solved in the usual sense of the word, but they can be managed. Although there are many techniques to manage wickedness, they all focus on the same end: to help all stakeholder groups reach a shared understanding of the problem and make a shared commitment to action. Such a shared understanding is absolutely critical because business and IT folks often have differing views on what a data warehouse ought to be.
One approach that I have used to help stakeholders get to a shared understanding in data warehouse projects is dialogue mapping, a facilitation technique that maps out the conversation between stakeholders as it occurs. Dialogue mapping uses the Issue-Based Information System (IBIS) notation which was invented by Rittel as a means to document the different facets of a wicked problem. See this post for a data warehouse related example of dialogue mapping and this one for more on the IBIS notation.
Shared understanding and commitment to action is well and good, but in the end success is measured by deliverables: the data warehouse and accompanying reports must be built. One of the challenges with a data warehouse initiative is that customers have to wait a long (very long!) time before they see any tangible benefits. Agile approaches to data warehousing offer a way to address this issue. For those interested in the nuts and bolts of agile data warehousing, I recommend Ralph Hughes’ book, which discusses how Scrum can be adapted for data warehousing projects.
Although the juxtaposition of the terms “agile” and “data warehouse” may sound oxymoronic to some, there is evidence that it works (see this case study, for example). Of course, no approach is a silver bullet; those who want to read about potential problems may want to look at this thesis for a research-based view of the pros and cons of an agile approach to data warehousing.
In the end, though, one has to keep in mind that no development technique – agile or otherwise – will succeed unless all stakeholders have a shared understanding of what the data warehouse is intended to achieve. The biggest issues are organisational rather than technical.
Conclusion
As we have seen, corporate data integration problems satisfy many – if not all – of the criteria for wickedness. The main implication of this is that data consolidation at an enterprise level is not just a difficult technical problem; it is also a socially complex one. Although tackling this requires skills and techniques that are outside of the standard repertoire of technical staff and managers, these skills can be learnt. What’s more, they are critical for success: those who undertake data warehouse projects without an understanding of the conflicting agendas of stakeholder groups may fail for reasons that have nothing to do with technology.
A thin veneer of process
Some time back I published a post arguing that much of the knowledge relating to organizational practices is tacit – i.e. it is impossible to capture in writing or speech. Consequently, best practices and standards that purportedly codify “best of breed” organizational practices are necessarily incomplete: they do not (and cannot) detail how a practice should be internalised and implemented in specific situations.
For a best practice to be successful, it has to be understood and moulded in a way that makes sense in the working culture and environment of the implementing organisation. One might refer to this process as “adaptation” or “customization”, but it is much more than minor tweaking of a standard process or practice. Tacit knowledge relates to the process of learning, or getting to know. This necessarily differs from individual to individual, and can’t be picked up by reading best practice manuals. Building tacit knowledge takes time and, therefore, so does the establishment of new organizational processes. Consequently, there is a lot of individual on-the-job learning and tinkering before a newly instituted procedure becomes an organizational practice.
This highlights a gap between how practices are implemented and how they actually work. All too often, an organisation will institute a project to implement a best practice – say a quality management methodology – and declare success as soon as the project is completed. Such a declaration is premature because the new practice is yet to take root in the organisation. This common approach to best practice implementation does not allow enough time for the learning and dialogue that is so necessary for the establishment of an organizational practice. The practice remains “a thin veneer of process” that peels off all too easily.
Yet, despite the fact that it does not work, the project-oriented approach remains popular. Why is this so? I believe this happens because decision-makers view the implementation of best practices as a purely technical problem – practices are seen as procedures that can be grafted upon the organization without due regard to culture, context, environment or ethics. When culture, context and people are considered as incidental, practices are reduced to their mechanical (or bureaucratic) elements – those that can be captured in documents, workflow diagrams and forms. These elements are tangible, so implementers can point to them as “proof” that the processes have been implemented.
Hence the manager who says: “We have rolled out our new project management system and all users have undergone training. The implementation of the new methodology has been completed.”
Sorry, but it has just begun. Success – if it comes at all – will take a lot more time and effort.
So how should best practice implementations be approached?
It should be clear that a successful implementation cannot come from a cookbook approach that follows textbook or consultant “recipes.” Rather, it involves the following:
- Extensive adaptation of techniques to suit the context and environment of the organisation.
- Involvement of the people who will work with and be affected by the processes. This often goes under the banner of “buy-in”, but it is more than that: these people must have a say in what adaptations are made and how they are made. But even before they do that, they must be allowed to play with the process – to tinker – so that they can improve their understanding of its intent and working.
- An understanding that the process is not cast in stone – that it must be modified as employees gain insights into how the process can be improved.
All these elements tie into the idea that practices and procedures involve tacit knowledge that sits in people’s heads. The visible, or explicit, aspects – which are often mistaken for the practice – are but a thin veneer of process.
So, in conclusion, the technical implementation of a best practice is only the beginning – it is the start of the real work of internalizing the practice through the learning required to sustain and support it.
Pathways to folly: a brief foray into non-knowledge
One of the assumptions of managerial practice is that organisational knowledge is based on valid data. Of course, knowledge is more than just data. The steps from data to knowledge and beyond are described in the much used (and misused) data-information-knowledge-wisdom (DIKW) hierarchy. The model organises the aforementioned elements in a “knowledge pyramid” as shown in Figure 1. The basic idea is that data, when organised in a way that makes contextual sense, equates to information which, when understood and assimilated, leads to knowledge which then, finally, after much cogitation and reflection, may lead to wisdom.
In this post, I explore “evil twins” of the DIKW framework: hierarchical models of non-knowledge. My discussion is based on a paper by Jay Bernstein, with some extrapolations of my own. My aim is to illustrate (in a not-so-serious way) that there are many more managerial pathways to ignorance and folly than there are to knowledge and wisdom.
I’ll start with a quote from the paper. Bernstein states that:
Looking at the way DIKW decomposes a sequence of levels surrounding knowledge invites us to wonder if an analogous sequence of stages surrounds ignorance, and where associated phenomena like credulity and misinformation fit.
Accordingly he starts his argument by noting opposites for each term in the DIKW hierarchy. These are listed in the table below:
| DIKW term | Opposite |
| --- | --- |
| Data | Incorrect data, Falsehood, Missing data |
| Information | Misinformation, Disinformation, Guesswork |
| Knowledge | Delusion, Unawareness, Ignorance |
| Wisdom | Folly |
This is not an exhaustive list of antonyms – only a few terms that make sense in the context of an “evil twin” of DIKW are listed. It should also be noted that I have added some antonyms that Bernstein does not mention. In the remainder of this post, I will focus on discussing the possible relationships between these terms that are opposites of those that appear in the DIKW model.
The first thing to note is that there is generally more than one antonym for each element of the DIKW hierarchy. Further, every antonym has a different meaning from others. For example – the absence of data is different from incorrect data which in turn is different from a deliberate falsehood. This is no surprise – it is simply a manifestation of the principle that there are many more ways to get things wrong than there are to get them right.
An implication of the above is that there can be more than one road to folly depending on how one gets things wrong. Before we discuss these, it is best to nail down the meanings of some of the words listed above (in the sense in which they are used in this article):
Misinformation – information that is incorrect or inaccurate.
Disinformation – information that is deliberately manipulated to mislead.
Delusion – false belief.
Unawareness – the state of not being fully cognisant of the facts.
Ignorance – a lack of knowledge.
Folly – foolishness, lack of understanding or sense.
The meanings of the other words in the table are clear enough and need no elaboration.
Meanings clarified, we can now look at some of the “pyramids of folly” that can be constructed from the opposites listed in the table.
Let’s start with incorrect data. Data that is incorrect will mislead, hence resulting in misinformation. Misinformed people end up with false beliefs – i.e. they are deluded. These beliefs can cause them to make foolish decisions that betray a lack of understanding or sense. This gives us the pyramid of delusion shown in Figure 2.
Similarly, Figure 3 shows a pyramid of unawareness that arises from falsehoods and Figure 4, a pyramid of ignorance that results from missing data.
Figures 2 through 4 are distinct pathways to folly. I reckon many of my readers would have seen examples of these in real life situations. (Tragically, many managers who traverse these pathways are unaware that they are doing so. This may be a manifestation of the Dunning-Kruger effect.)
There’s more though – one can get things wrong at a higher level independent of whether or not the lower levels are done right. For example, one can draw the wrong conclusions from (correct) data. This would result in the pyramid shown in Figure 5.
Finally, I should mention that it’s even worse: since we are talking about non-knowledge, anything goes. Folly needs no effort whatsoever; it can be achieved without any data, information or knowledge (or their opposites). Indeed, one can play endlessly with antonyms and near-antonyms of the DIKW terms (including those not listed here) and come up with a plethora of pyramids, each denoting a possible pathway to folly.






