Eight to Late

Sensemaking and Analytics for Organizations

Complex decision making as an infinite game

with 10 comments

A decision is the act of choosing between two or more options.

There are two kinds of decisions: computable and non-computable [1]. In the former, options are well-defined and finite in number, and unambiguous facts (data) are available on the basis of which the options can be rated. In the latter, options are neither clear nor enumerable, and facts, if available at all, are ambiguous.

Computable decisions are simple, non-computable decisions are complex. We’ll refer to the two decision types by these names in the remainder of this article.

An example of a simple decision is buying a product (TV, car or whatever) based on well-defined criteria (price, features etc.). An example of a complex decision is formulating a business strategy.

It should be clear that simple decisions involve smaller temporal and monetary stakes – i.e. the cost of getting things wrong is limited and the effects of a bad decision wear off in (a relatively short) time. Neither is true for complex decisions: the cost of a poor choice can be significant, and its negative effects tend to persist over time.

A key feature of complex decisions is that they (usually) affect multiple parties. That is, they are socially complex. This has implications regarding how such decisions should be approached. More on this later.

Conventional decision theory is based on the notion of maximizing benefit or utility. For simple decisions it is assumed that utility of each option can be computed; for complex decisions it is assumed they can be estimated, or at least ranked. The latter assumption is questionable because each party affected by a complex decision will have its own notion of utility, at least at the outset. Moreover, since neither options nor facts are unambiguous at the start, it makes little sense to attempt to estimate utility upfront.

The above being the case, it is clear that complex decisions cannot be made on the basis of maximizing utility alone.  Something else is needed.

–x–

James Carse’s classic book, Finite and Infinite Games, begins with the following lines:

There are at least two kinds of games. One could be called finite, the other infinite. A finite game is played for the purpose of winning, an infinite for the purpose of continuing the play.

A finite game ends when a player or team wins. However, “just as it is essential for a finite game to have a definitive ending, it must also have a precise beginning. Therefore, we can speak of finite games as having temporal boundaries.”

The parallel between simple decisions and finite games should be evident. Although less obvious, it is useful to think of a complex decision as an infinite game.

When making a complex decision – such as formulating a business strategy – decision-makers will often focus on maximising potential benefits (aka utility). However, as often as not, the outcome of the decision will fall far short of the expected benefits and may, in some cases, even lead to ruin. This being so, it is perhaps more fruitful to focus on staying in the game (continuing to play) rather than winning (maximising utility).

The aim of a complex decision should be to stay in the game rather than win.

How does one ensure that one stays in the game? Heinz von Foerster’s ethical imperative offers an answer:

Always act to increase your choices.

That is, one should decide in a way that increases one’s future options, thereby improving one’s chances of staying in the game. One can also frame this in terms of adaptability: the greater the number of options, the greater the ability to adapt to unexpected changes in the environment.

How can one “act to increase one’s choices”?

One way to do this is to leverage social complexity: get different parties to articulate their preferred options. Some of these options are likely to contradict each other. Nevertheless, there are ways to handle such a diversity of potentially contradictory views in an inclusive manner (for an example, see this paper; for more, check out this book). Such an approach also ensures that the problem and solution spaces are explored more exhaustively than if only a limited number of viewpoints are considered.


The point is this: there are always more options available than are apparent. Indeed, the number of unexplored options at any stage is potentially infinite. The job of the infinite player (decision-maker) is to act so as to surface them gradually, and thus stay in the game.

–x–

Traditionally, decision-making is seen as a logical undertaking based on facts or data. In contrast, when viewed as an infinite game, complex decision-making becomes a matter of ethics rather than logic.

Why ethics?

The answer lies in von Foerster’s dictum to increase one’s choices. By doing so, one reduces the chances that stakeholders’ interests are overlooked in the decision-making process.

As Wittgenstein famously said, “It is clear that ethics cannot be articulated.” All those tedious classes and books on business ethics miss the point entirely. Ethical matters are necessarily oblique: the decision-maker who decides in a way that increases (future) choices will be acting ethically without drawing attention to it, or even being consciously aware of it.

–x–

Any honest discussion of complex decision-making in organisations must address the issue of power.

Carse asserts that players (i.e. decision-makers in the context of this article) become powerful by acquiring titles (e.g. CEO, Manager etc.). However, such titles can only be acquired by winning a finite game – i.e. by being successful in competitions for roles. Power therefore relates to finite rather than infinite games.

As he notes in his book:

Power is a concept that belongs only in finite play. To speak meaningfully of a person’s power is to speak of what that person has already achieved, the titles they have already won.

Be that as it may, one cannot overlook the reality that those in powerful positions can (and often do) subvert the decision-making process by obstructing open and honest discussion of contentious issues. Sometimes they do so by their mere presence in the room.

How does a complex decision-maker deal with the issue of power?

Carse offers the following answer:

How do infinite players contend with power? Since the outcome of infinite play is endlessly open, there is no way of looking back to make an assessment of the power or weakness of earlier play. Infinite players look forward, not to a victory but toward ongoing play. A finite player plays to be powerful; the infinite player plays with strength. Power is concerned with (and a consequence of) what has happened, strength with what has yet to happen. Power will always be restricted to a relatively small number of people. Anyone can be strong.

What strength means is context-dependent, but the following may help clarify its relationship to power:

Late last year I attended an end-of-year event at the university I teach at. There I bumped into a student I had mentored some time ago. We got talking about his workplace (a large government agency).

At one point he said, “We really need to radically change the way we think about and work with data, but I’m not a manager and have no authority to initiate changes that need to be made.”

“Why don’t you demonstrate what you are capable of? Since you are familiar with your data, it should be easy enough to frame and tackle a small yet meaningful data science problem,” I replied.

“What if my manager doesn’t like my taking the initiative?”

“It is easier to beg forgiveness than seek permission.”

“He might feel threatened and make life difficult for me.”

“If management doesn’t like what you’re doing, it’s their loss. What’s the worst that could happen? You could lose your job. With what you are learning at university you should have no trouble moving on to another role. Indeed, by doing so, you will diversify your experience and increase your future options.”

–x–

To summarise:  when deciding on complex matters, act in a way that maximises possibility rather than utility. Such an approach is inherently ethical and enhances one’s chances of staying in the game.

Complex decision making is an infinite game.

[1] There are many other terms for this classification:  tame and wicked (Horst Rittel), programmed and non-programmed (Herbert Simon), complicated and complex (David Snowden). Paul Culmsee and I have, perhaps confusingly, used the terms uncertain and ambiguous to refer to these in our books.  There are minor contextual differences between how these different authors interpret these terms, but for the most part they are synonymous with computable/non-computable.

 

Written by K

January 21, 2020 at 4:09 am

Tackling the John Smith Problem – deduplicating data via fuzzy matching in R

with 3 comments

Last week I attended a CRM & data user group meeting for not-for-profits (NFPs), organized by my friend Yael Wasserman from Mission Australia. Following a presentation from a vendor, we broke up into groups and discussed common data quality issues that NFPs (and dare I say most other organisations) face. Number one on the list was the vexing issue of duplicate constituent (donor) records – henceforth referred to as dupes. I like to call this the John Smith Problem because a typical customer database in a country with a large Anglo population is likely to have a fair number of records for customers with that name. The problem is tricky because one has to identify John Smiths who appear to be distinct in the database but are actually the same person, while also ensuring that one does not inadvertently merge two distinct John Smiths.

The John Smith problem is particularly acute for NFPs as much of their customer data comes in either via manual data entry or bulk loads with less than optimal validation. To be sure, all the NFPs represented at the meeting have some level of validation on both modes of entry, but all participants admitted that dupes tend to sneak in nonetheless…and at volumes that merit serious attention.  Yael and his team have had some success in cracking the dupe problem using SQL-based matching of a combination of fields such as first name, last name and address or first name, last name and phone number and so on. However, as he pointed out, this method is limited because:

  1. It does not allow for typos and misspellings.
  2. Matching on too few fields runs the risk of false positives – i.e. labelling non-dupes as dupes.

These problems arise because SQL-based matching requires one to pre-specify match patterns. The solution is straightforward: use fuzzy matching instead. The idea behind fuzzy matching is simple: allow for inexact matches, assigning each match a similarity score ranging from 0 to 1, with 0 being complete dissimilarity and 1 a perfect match. My primary objective in this article is to show how one can make headway with the John Smith problem using the fuzzy matching capabilities available in R.

A bit about fuzzy matching

Before getting down to the matching itself, it is worth briefly explaining how fuzzy matching works. The basic idea is simple: one has to generalise the notion of a match from a binary “match” / “no match” to allow for partial matching. To do this, we need to introduce the notion of an edit distance, which is essentially the minimum number of operations required to transform one string into another. For example, the edit distance between the strings boy and bay is 1: only one edit is required to transform one string into the other. The Levenshtein distance is the most commonly used edit distance. It is essentially “the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.”

A variant called the Damerau-Levenshtein distance, which additionally allows for the transposition of two adjacent characters (counted as one operation, not two), is found to be more useful in practice. We’ll use an implementation of this called the optimal string alignment (osa) distance. If you’re interested in finding out more about osa, check out the Damerau-Levenshtein article linked to earlier in this paragraph.
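
If you’d like to see the difference in action, here’s a minimal check using the stringdist package that is loaded later in this post (the method names are the package’s own):

#plain Levenshtein needs two substitutions to turn "john" into "jhon",
#whereas osa counts the swapped adjacent characters as a single transposition
library("stringdist")
stringdist("john", "jhon", method = "lv")   # 2
stringdist("john", "jhon", method = "osa")  # 1
#the boy/bay example from above is a single substitution under either measure
stringdist("boy", "bay", method = "osa")    # 1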

Since longer strings will potentially have larger numeric distances between them, it makes sense to normalise the distance to a value lying between 0 and 1. We’ll do this by dividing the calculated osa distance by the length of the longer of the two strings. Yes, this is crude but, as you will see, it works reasonably well. The resulting number is a normalised measure of the dissimilarity between the two strings. To get a similarity measure, we simply subtract the dissimilarity from 1. So, a normalised dissimilarity of 1 translates to a similarity score of 0 – i.e. the strings are perfectly dissimilar. I hope I’m not belabouring the point; I just want to make sure it is perfectly clear before going on.
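
To make the recipe concrete before we apply it to a whole dataset, here’s a small sketch of the calculation for a single pair of strings (the helper function and its name are mine, not part of the workflow that follows):

#normalised similarity for a pair of strings: osa distance divided by the
#length of the longer string, then subtracted from 1
library("stringdist")
library("stringr")
similarity_score <- function(a, b) {
  d <- stringdist(a, b, method = "osa")
  1 - d/max(str_length(a), str_length(b))
}
similarity_score("John Smith", "Jhon Smith") # 0.9 - one transposition over 10 characters
similarity_score("John Smith", "Mary Jones") # much closer to 0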

Preparation

In what follows, I assume you have R and RStudio installed. If not, you can access the software here and here for Windows and here for Macs; installation for both products is usually quite straightforward.

You may also want to download the Excel file many_john_smiths, which contains records for ten fictitious John Smiths. At this point I should affirm that, as far as the dataset is concerned, any resemblance to actual John Smiths, living or dead, is purely coincidental! Once you have downloaded the file, open it in Excel, examine the records, and save it as a csv file in your R working directory (or any other convenient place) for processing in R.

As an aside, if you have access to a database, you may also want to load the file into a table called many_john_smiths and run the following dupe-detecting SQL statement:


select * from many_john_smiths t1
where exists
(select 'x' from many_john_smiths t2
where
t1.FirstName=t2.FirstName
and
t1.LastName=t2.LastName
and
t1.AddressPostcode=t2.AddressPostcode
and
t1.CustomerID <> t2.CustomerID)

You may also want to try matching on other column combinations such as First/Last Name and AddressLine1 or First/Last Name and AddressSuburb for example. The limitations of column-based exact matching will be evident immediately. Indeed,  I have deliberately designed the records to highlight some of the issues associated with dirty data: misspellings, typos, misheard names over the phone etc. A quick perusal of the records will show that there are probably two distinct John Smiths in the list. The problem is to quantify this observation. We do that next.

Tackling the John Smith problem using R

We’ll use the following libraries: stringdist and stringr. The first library, stringdist, contains a collection of string distance functions; we’ll use stringdistmatrix(), which returns a matrix of pairwise string distances (osa by default) when passed a vector of strings. The second, stringr, provides a number of string utilities, of which we’ll use str_length(), which returns the length of a string.

OK, so on to the code. The first step is to load the required libraries:

#load libraries
library("stringdist")
library("stringr")

We then read in the data, ensuring that we override the annoying default behaviour of R, which is to convert strings to categorical variables – we want strings to remain strings!

#read data, taking care to ensure that strings remain strings
df <- read.csv("many_john_smiths.csv",stringsAsFactors = F)
#examine dataframe
str(df)

The output from str(df) (not shown)  indicates that all columns barring CustomerID are indeed strings (i.e. type=character).

The next step is to find the length of each row:

#find length of string formed by each row (excluding title)
rowlen <- str_length(paste0(df$FirstName,df$LastName,df$AddressLine1,
df$AddressPostcode,df$AddressSuburb,df$Phone))
#examine row lengths
rowlen
> [1] 41 43 39 42 28 41 42 42 42 43

Note that I have excluded the Title column as I did not think it was relevant to determining duplicates.

Next we find the distance between every pair of records in the dataset. We’ll use the stringdistmatrix() function mentioned earlier:

#stringdistmatrix - finds pairwise osa distance between every pair of elements in a
#character vector
d <- stringdistmatrix(paste0(df$FirstName,df$LastName,df$AddressLine1,
df$AddressPostcode,df$AddressSuburb,df$Phone))
d
    1  2  3  4  5  6  7  8  9
2   7
3  10 13
4  15 21 24
5  19 26 26 15
6  22 21 28 12 18
7  20 23 26  9 21 14
8  10 13 17 20 23 25 22
9  19 22 19 21 24 29 23 22
10 17 22 25 13 22 19 16 22 24

stringdistmatrix() returns an object of type dist (distance), which is essentially a vector of pairwise distances.

For reasons that will become clear later, it is convenient to normalise the distance – i.e. scale it to a number that lies between 0 and 1. We’ll do this by dividing the distance between two strings by the length of the longer string. We’ll use the nifty base R function combn() to compute the maximum length for every pair of strings:

#find the length of the longer of two strings in each pair
pwmax <- combn(rowlen,2,max,simplify = T)

The first argument is the vector from which combinations are to be generated, the second is the group size (2, since we want pairs) and the third argument indicates whether the result should be returned as an array (simplify=T) or a list (simplify=F). The returned object, pwmax, is a one-dimensional array containing the pairwise maximum lengths. It has the same length and is organised in the same way as the object d returned by stringdistmatrix() (check that!). Therefore, to normalise d we simply divide it by pwmax:

#normalised distance
dist_norm <- d/pwmax
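
As a quick aside (my own sanity check, not part of the original workflow), you can confirm that d and pwmax line up as claimed:

#both objects hold one entry per pair of records: choose(10, 2) = 45
length(d)           # 45
length(pwmax)       # 45
#combn() enumerates the pairs (1,2), (1,3), ..., (1,10), (2,3), ... which is
#the same order in which a dist object stores its pairwise distances
combn(10, 2)[, 1:5]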

The normalised distance lies between 0 and 1 (check this!) so we can define similarity as 1 minus distance:

#similarity = 1 - distance
similarity <- round(1-dist_norm,2)
sim_matrix <- as.matrix(similarity)
sim_matrix
      1    2    3    4    5    6    7    8    9   10
1  0.00 0.84 0.76 0.64 0.54 0.46 0.52 0.76 0.55 0.60
2  0.84 0.00 0.70 0.51 0.40 0.51 0.47 0.70 0.49 0.49
3  0.76 0.70 0.00 0.43 0.33 0.32 0.38 0.60 0.55 0.42
4  0.64 0.51 0.43 0.00 0.64 0.71 0.79 0.52 0.50 0.70
5  0.54 0.40 0.33 0.64 0.00 0.56 0.50 0.45 0.43 0.49
6  0.46 0.51 0.32 0.71 0.56 0.00 0.67 0.40 0.31 0.56
7  0.52 0.47 0.38 0.79 0.50 0.67 0.00 0.48 0.45 0.63
8  0.76 0.70 0.60 0.52 0.45 0.40 0.48 0.00 0.48 0.49
9  0.55 0.49 0.55 0.50 0.43 0.31 0.45 0.48 0.00 0.44
10 0.60 0.49 0.42 0.70 0.49 0.56 0.63 0.49 0.44 0.00

The diagonal entries are 0, but that doesn’t matter because we know that every string is perfectly similar to itself! Apart from that, the similarity matrix looks quite reasonable: you can, for example, see that records 1 and 2 (similarity score=0.84) are quite similar while records 1 and 6 are quite dissimilar (similarity score=0.46).  Now let’s extract some results more systematically. We’ll do this by printing out the top 5 non-diagonal similarity scores and the associated records for each of them. This needs a bit of work. To start with, we note that the similarity matrix (like the distance matrix) is symmetric so we’ll convert it into an upper triangular matrix to avoid double counting. We’ll also set the diagonal entries to 0 to avoid comparing a record with itself:

#convert to upper triangular to prevent double counting
sim_matrix[lower.tri(sim_matrix)] <- 0
#set diagonals to zero to avoid comparing row to itself
diag(sim_matrix) <- 0

Next we create a function that returns the n largest similarity scores and their associated row and column number – we’ll need the latter to identify the pair of records that are associated with each score:

#adapted from:
#https://stackoverflow.com/questions/32544566/find-the-largest-values-on-a-matrix-in-r
nlargest <- function(m, n) {
  res <- order(m, decreasing = T)[seq_len(n)]
  pos <- arrayInd(res, dim(m), useNames = TRUE)
  list(values = m[res],
       position = pos)
}

The function takes two arguments: a matrix m and a number n indicating the top n scores to be returned. Let’s set this number to 5 – i.e. we want the top 5 scores and the associated record indexes. We’ll store the output of nlargest in the variable sim_list:

top_n <- 5
sim_list <- nlargest(sim_matrix,top_n)

Finally, we loop through sim_list printing out the scores and associated records as we go along:

for (i in 1:top_n){
  rec <- as.character(df[sim_list$position[i],])
  sim_rec <- as.character(df[sim_list$position[i+top_n],])
  cat("score: ",sim_list$values[i],"\n")
  cat("record 1: ",rec,"\n")
  cat("record 2: ",sim_rec,"\n\n")
}
score: 0.84
record 1: 1 John Smith Mr 12 Acadia Rd Burnton 9671 1234 5678
record 2: 2 Jhon Smith Mr 12 Arcadia Road Bernton 967 1233 5678


score: 0.79
record 1: 4 John Smith Mr 13 Kynaston Rd Burnton 9671 34561234
record 2: 7 Jon Smith Mr. 13 Kinaston Rd Barnston 9761 36451223


score: 0.76
record 1: 1 John Smith Mr 12 Acadia Rd Burnton 9671 1234 5678
record 2: 3 J Smith Mr. 12 Acadia Ave Burnton 867`1 1233 567


score: 0.76
record 1: 1 John Smith Mr 12 Acadia Rd Burnton 9671 1234 5678
record 2: 8 John Smith Dr 12 Aracadia St Brenton 9761 12345666


score: 0.71
record 1: 4 John Smith Mr 13 Kynaston Rd Burnton 9671 34561234
record 2: 6 John S Dr. 12 Kinaston Road Bernton 9677 34561223

As you can see, the method correctly identifies close matches: the top-scoring pairs cluster around two distinct John Smiths (records 1 and 4) – and possibly more, depending on where one sets the similarity threshold. I’ll leave you to explore this further on your own.
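
As a pointer for that exploration (a sketch of my own, not part of the original analysis), one way to proceed is to flag every pair whose similarity exceeds a chosen threshold as a candidate dupe for manual review:

#flag pairs above an (arbitrarily chosen) similarity threshold
threshold <- 0.7
candidate_pairs <- which(sim_matrix > threshold, arr.ind = TRUE)
candidate_pairs
#with the toy data above, this should surface the pairs
#(1,2), (1,3), (4,6), (4,7) and (1,8)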

The John Smith problem in real life

As a proof of concept, I ran the following SQL on a real CRM database hosted on SQL Server:

select
FirstName+LastName,
count(*)
from
TableName
group by
FirstName+LastName
having
count(*)>100
order by
count(*) desc

I was gratified to note that John Smith did indeed come up tops – well over 200 records. I suspected there were a few duplicates lurking within, so I extracted the records and ran the above R code (with a few minor changes). I found there indeed were some duplicates! I also observed that the code ran with no noticeable degradation despite the dataset having well over 10 times the number of records used in the toy example above. I have not run it for larger datasets yet, but I suspect one will run into memory issues when the number of records gets into the thousands. Nevertheless, based on my experimentation thus far, this method appears viable for small datasets.
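
To put rough numbers on that concern (back-of-envelope arithmetic rather than a benchmark): the number of pairwise distances grows quadratically with the record count, so memory gets consumed very quickly:

#number of pairwise comparisons for n records is choose(n, 2)
choose(200, 2)    # 19,900 pairs - roughly the size of the real-life extract above
choose(10000, 2)  # ~50 million pairs - where memory is likely to become a problem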

The problem of deduplicating large datasets is left as an exercise for motivated readers 😛

Wrapping up

Often organisations will turn to specialist consultancies to fix data quality issues only to find that their work, besides being quite pricey, comes with a lot of caveats and cosmetic fixes that do not address the problem fully.  Given this, there is a case to be made for doing as much of the exploratory groundwork as one can so that one gets a good idea of what can be done and what cannot. At the very least, one will then be able to keep one’s consultants on their toes. In my experience, the John Smith problem ranks right up there in the list of data quality issues that NFPs and many other organisations face. This article is intended as a starting point to address this issue using an easily available and cost effective technology.

Finally, I should reiterate that the approach discussed here is just one of many possible approaches, and is neither optimal nor efficient. Nevertheless, it works quite well on small datasets and is therefore offered here as a starting point for your own attempts at tackling the problem. If you come up with something better – as I am sure you can – I’d greatly appreciate your letting me know via the contact page on this blog or, better yet, a comment.

Acknowledgements:

I’m indebted to Homan Zhao and Sree Acharath for helpful conversations on fuzzy matching.  I’m also grateful to  all those who attended the NFP CRM and Data User Group meetup that was held earlier this month – the discussions at that meeting inspired this piece.

Written by K

October 9, 2019 at 8:49 pm

Posted in Data Analytics, Data Science, R


3 or 7, truth or trust

with one comment

“It is clear that ethics cannot be articulated.” – Ludwig Wittgenstein

Over the last few years I’ve been teaching and refining a series of lecture-workshops on Decision Making Under Uncertainty. Audiences include data scientists and mid-level managers working in corporates and public service agencies. The course is based on the distinction between uncertainties in which the variables are known and can be quantified versus those in which the variables are not known upfront and/or are hard to quantify.

Before going any further, it is worth explaining the distinction via a couple of examples:

An example of the first type of uncertainty is project estimation. A project has an associated time and cost, and although we don’t know what their values are upfront, we can estimate them if we have the right data.  The point to note is this: because such problems can be quantified, the human brain tends to deal with them in a logical manner.

In contrast, business strategy is an example of the second kind of uncertainty. Here we do not know what the key variables are upfront. Indeed we cannot, because different stakeholders will perceive different aspects of a strategy to be paramount depending on their interests – consider, for example, the perspective of a CFO versus that of a CMO. Because of these differences, one cannot make progress on such problems until agreement has been reached on what is important to the group as a whole.  The point to note here is that since such problems involve contentious issues, our reactions to them  tend to be emotional rather than logical.

The difference between the two types of uncertainty is best conveyed experientially, so I have a few in-class activities aimed at doing just that. One of them is an exercise I call “3 or 7”, in which I give students a sheet with the following printed on it:

Circle either the number 3 or 7 below depending on whether you want 3 marks or 7 marks added to your Assignment 2 final mark. Yes, this offer is for real, but there is a catch: if more than 10% of the class select 7, no one gets anything.

Write your student ID on the paper so that Kailash can award you the marks. Needless to say, your choice will remain confidential: no one (but Kailash) will know what you have selected.

3              7

Prior to handing out the sheet, I tell them that they:

  • should sit far enough apart so that they can’t see what their neighbours choose,
  • are not allowed to communicate their choices to others until the entire class has turned in their sheets.

Before reading any further you may want to think about what typically happens.

–x–

Many readers would have recognized this exercise as a version of the Prisoner’s Dilemma and, indeed, many students in my classes recognize this too. Even so, there are always enough “win at the cost of others” types in the room to ensure that I don’t have to award any extra marks. I’ve run the exercise about 10 times, often with groups of highly collaborative individuals who work well together. Despite that, 15-20% of the class ends up opting for 7.

It never fails to surprise me that, even in relatively close-knit groups, there are invariably a number of individuals who, if given a chance to gain at the expense of their colleagues, will not hesitate to do so providing their anonymity is ensured.
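
To get a feel for why the 10% cap is breached so reliably, here’s a rough simulation in R (my own sketch, with an assumed class size of 30 and an assumed 15% chance that any given student circles 7):

#simulate many classes of 30 students, each independently circling 7 with
#probability 0.15, and count how often more than 10% of the class does so
set.seed(1)
class_size <- 30
p_seven <- 0.15
n_trials <- 10000
sevens <- rbinom(n_trials, size = class_size, prob = p_seven)
mean(sevens > 0.1*class_size) #fraction of classes in which no one gets anything
                              #(well over half under these assumptions)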

–x–

Conventional management thinking deems that any organisational activity involving several people has to be closely supervised. Underlying this view is the assumption that individuals involved in the activity will, if left unsupervised, make decisions based on self-interest rather than the common good (as happens in the prisoner’s dilemma game). This assumption finds justification in rational choice theory, which predicts that individuals will act in ways that maximise their personal benefit without any regard to the common good. This view is exemplified in 3 or 7 and, at a societal level, in the so-called Tragedy of the Commons, where individuals who have access to a common resource over-exploit it,  thus depleting the resource entirely.

Fortunately, such a scenario need not come to pass: the work of Elinor Ostrom, one of the 2009 Nobel prize winners for Economics, shows that, given the right conditions, groups can work towards the common good even if it means forgoing personal gains.

Classical economics assumes that individuals’ actions are driven by rational self-interest – i.e. the well-known “what’s in it for me” factor. Yet, in a commons-type situation, the group as a whole will clearly achieve much better results if it exploits the resource cooperatively. There are several real-world examples where such cooperative behaviour has been successful in achieving outcomes for the common good (this paper touches on some). According to classical economic theory, however, such cooperative behaviour should simply not be possible.

So the question is: what’s wrong with rational choice theory?  A couple of things, at least:

Firstly, implicit in rational choice theory is the assumption that individuals can figure out the best choice in any given situation.  This is obviously incorrect. As Ostrom has stated in one of her papers:

Because individuals are boundedly rational, they do not calculate a complete set of strategies for every situation they face. Few situations in life generate information about all potential actions that one can take, all outcomes that can be obtained, and all strategies that others can take.

Instead, they use heuristics (experience-based methods), norms (value-based techniques) and rules (mutually agreed regulations) to arrive at “good enough” decisions. Note that Ostrom makes a distinction between norms and rules: the former are implicit (unstated) rules determined by cultural attitudes and values.

Secondly, rational choice theory assumes that humans behave as self-centred, short-term maximisers. Such theories work in competitive situations such as the stock market, but not in situations that call for collective action, such as the prisoner’s dilemma.

Ostrom’s work essentially addresses the limitations of rational choice theory by outlining how individuals can work together to overcome self-interest.

–x–

In a paper entitled, A Behavioral Approach to the Rational Choice Theory of Collective Action, published in 1998, Ostrom states that:

…much of our current public policy analysis is based on an assumption that rational individuals are helplessly trapped in social dilemmas from which they cannot extract themselves without inducement or sanctions applied from the outside. Many policies based on this assumption have been subject to major failure and have exacerbated the very problems they were intended to ameliorate. Policies based on the assumptions that individuals can learn how to devise well-tailored rules and cooperate conditionally when they participate in the design of institutions affecting them are more successful in the field…[Note:  see this book by Baland and Platteau, for example]

Since rational choice theory assumes that individuals act to maximise personal gain, it does not work in situations that demand collective action – and Ostrom presents some very general evidence to back this claim. More interesting than the refutation of rational choice theory, though, is Ostrom’s discussion of the ways in which individuals “trapped” in social dilemmas end up making the right choices. In particular, she singles out two empirically grounded ways in which individuals work towards outcomes that are much better than those offered by rational choice theory. These are:

Communication: In the rational view, communication makes no difference to the outcome.  That is, even if individuals make promises and commitments to each other (through communication), they will invariably break these for the sake of personal gain …or so the theory goes. In real life, however, it has been found that opportunities for communication significantly raise the cooperation rate in collective efforts (see this paper abstract or this one, for example). Moreover, research shows that face-to-face is far superior to any other form of communication, and that the main benefit achieved through communication is exchanging mutual commitment (“I promise to do this if you’ll promise to do that”) and increasing trust between individuals. It is interesting that the main role of communication is to enhance or reinforce the relationship between individuals rather than to transfer information.  This is in line with the interactional theory of communication.

Innovative Governance:  Communication by itself may not be enough; there must be consequences for those who break promises and commitments. Accordingly, cooperation can be encouraged by implementing mutually accepted rules for individual conduct, and imposing sanctions on those who violate them. This effectively amounts to designing and implementing novel governance structures for the activity. Note that this must be done by the group; rules thrust upon the group by an external authority are unlikely to work.

Of course, these factors do not come into play in artificially constrained and time-bound scenarios like 3 or 7. In such situations, there is no opportunity or time to communicate or to set up governance structures. What is clear, even from the simple 3 or 7 exercise, is that such mechanisms are needed even in groups that appear to be close-knit.

Ostrom also identifies three core relationships that promote cooperation. These are:

Reciprocity: this refers to a family of strategies that are based on the expectation that people will respond to each other in kind – i.e. that they will do unto others as others do unto them.  In group situations, reciprocity can be a very effective means to promote and sustain cooperative behaviour.

Reputation: This refers to the general view of others towards a person. As such, reputation is a part of how others perceive a person, so it forms a part of the identity of the person in question. In situations demanding collective action, people might make judgements on a person’s reliability and trustworthiness based on his or her reputation.

Trust: Trust refers to expectations regarding others’ responses in situations where one has to act before others. And if you think about it, everything else in Ostrom’s framework is ultimately aimed at engendering or – if that doesn’t work – enforcing trust.

–x–

In an article on ethics and second-order cybernetics, Heinz von Foerster tells the following story:

I have a dear friend who grew up in Marrakech. The house of his family stood on the street that divided the Jewish and the Arabic quarter. As a boy he played with all the others, listened to what they thought and said, and learned of their fundamentally different views. When I asked him once, “Who was right?” he said, “They are both right.”

“But this cannot be,” I argued from an Aristotelian platform, “Only one of them can have the truth!”

“The problem is not truth,” he answered, “The problem is trust.”

For me, that last line summarises the lesson implicit in the admittedly artificial scenario of 3 or 7. In our search for facts and decision-making frameworks we forget the simple truth that in many real-life dilemmas these matter less than we think. Facts and frameworks cannot help us decide on ambiguous matters in which the outcome depends on what other people do. In such cases the problem is not truth; the problem is trust. From your own experience it should be evident that it is impossible to convince others of your trustworthiness by assertion; the only way to do so is by behaving in a trustworthy way. That is, by behaving ethically rather than talking about it – a point that is squarely missed by so-called business ethics classes.

Yes,  it is clear that ethics cannot be articulated.

Notes:

  1. Portions of this article are lightly edited sections from a 2009 article that I wrote on Ostrom’s work and its relevance to project management.
  2. Finally, an unrelated but important matter for which I seek your support in aid of a common good: I’m taking on the 7 Bridges Walk to help those affected by cancer. Please donate via my 7 Bridges fundraising page if you can. Every dollar counts; all funds raised will help Cancer Council work towards the vision of a cancer free future.

Written by K

September 18, 2019 at 8:28 pm