Chapter 4

The Procedure of Computer-Search for Keywords

Among my four cardinal results the last two are in line with the kinds of results I have obtained in many previous cases without the assistance of any particular tool. (Some two dozen such analyses can be found in Scharnberg, 1996.) I am, however, absolutely convinced that without the assistance of a computer it would never have occurred to me to look for or detect the first two.

In a sense the computer is omnipresent in my analyses, although I have used no advanced computer program. My search for keywords is a primitive way of using the computer. But in the case of Elvira it has turned out to be highly adequate and efficacious.

I shall try to explain how my approach developed. In the police interrogation of 1992-04-09 Fanny Mollbeck recounted one kind of sexual abuse that her foster daughter had allegedly told her about: more than once her father had subjected Elvira to sexual intercourse while her sister Ingrid, two years younger, was sitting at the bedside.

This is "the bedside event(s)". When our task is to analyse the evidence, the first question should be whether Elvira agreed that she had experienced such assaults.

The second question should be whether Elvira's descriptions of the bedside event on various occasions agree with one another. This issue is important in itself. But there is a further reason for focusing on it. I have worked in law courts for over 15 years, and in my 1996 book, as well as elsewhere, I have described more than 50 cases. About half of them have been subjected to penetrating analyses. I have documented in print that not a single girl whose account was untrue has been capable of recalling, from one police or court interrogation to the next, what she had previously told. This characteristic was equally prominent in those girls who were deliberately lying on their own initiative, and in those who were victims of indoctrination.

Elvira's father should be asked. But it might be no simple matter to interpret answers given by a possible suspect. I would therefore suggest that the third essential question should be whether Ingrid confirms or rejects this allegation.

Here we encounter the first impediment. To answer such questions we must search through the entire police investigation, which turned out to comprise 245,000 words. A large number of psychological studies have shown that human beings are not very skilled in handling large amounts of information. They will usually ignore the overwhelming majority of the facts and base their conclusion on a small sample. And they may well ignore those facts that most strongly support the conclusion which, objectively, is the true one.

Unfortunately, most textbooks of judicial evidence evaluation are based on the following three erroneous assumptions: (a) Judges have reasonably correct recollections of the facts that were presented during the proceedings; they will experience neither crucial forgetting nor illusory memories. (b) Judges attach reasonably correct evidential values to the facts. (c) The only kind of combination judges need to perform is weighing together all the evidential values.

These circumstances need to be mentioned already at this stage, even though they primarily belong in a later chapter. I shall also mention a vital logical rule: The evidential value of both (note: both) of two facts may be zero or close to zero. Nevertheless both facts in combination may have a very high evidential value.
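
The logical rule can be illustrated with a deliberately artificial toy model. This is only a sketch under stated assumptions: two yes/no facts that are independent and equally likely either way, and a hypothesis that is true exactly when the two facts differ. None of these assumptions comes from the case material; the names `worlds`, `hypothesis` and `prob_h` are invented for the illustration.

```python
from fractions import Fraction
from itertools import product

# Four equally likely "worlds": every combination of (fact_a, fact_b).
# Toy assumption: the hypothesis holds exactly when the two facts differ.
worlds = list(product([False, True], repeat=2))

def hypothesis(a, b):
    return a != b

def prob_h(condition):
    """P(hypothesis | condition), by enumeration over the equally likely worlds."""
    kept = [w for w in worlds if condition(*w)]
    return Fraction(sum(hypothesis(*w) for w in kept), len(kept))

prior = prob_h(lambda a, b: True)               # nothing known
given_a = prob_h(lambda a, b: a)                # only fact A known
given_not_b = prob_h(lambda a, b: not b)        # only fact B known
given_both = prob_h(lambda a, b: a and not b)   # both facts known

print(prior, given_a, given_not_b, given_both)  # 1/2 1/2 1/2 1
```

In this toy model each fact taken alone leaves the probability of the hypothesis unchanged at 1/2, so its evidential value is zero, while the two facts together settle the matter completely.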

A further property of human perception should be noticed. While a person is reading a large mass of text, his attention will often oscillate between high and low levels. And low levels might befall facts that actually have strong evidential power.

I do not pretend not to be a victim of oscillating attention and other shortcomings. Rather, I have deliberately tried to circumvent them. How can I be sure that I have found all instances of the bedside event? And how can I be sure that I have noticed all the significant features of the separate accounts of this event?

My solution was to scan all interrogations, affidavits and other documents into one single document, which will henceforth be called the central document. It is this document that comprises 245,000 words.

I also made other large, but not immense, documents. One of them contained all judgements passed by all courts. It should be noted that Swedish judgements are significantly more sizable than judgements in most other countries.

It takes only a few seconds to computer-search the entire central document for any keyword. Almost instantaneously I shall find the frequency of the words "-lock-", "-close-", "-door-" (or rather of their Swedish counterparts: "-lås-", "-stäng-", "-dörr-"); they are 68, 276 and 73, respectively. I have scrutinised every instance of these words, and can therefore be sure of having made no oversight, albeit with one trivial and one important caveat.
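
A minimal sketch of such a frequency count is given here. It assumes the central document is available as one plain-text string; the two-sentence Swedish sample is invented for the illustration and is not taken from the investigation, and the function name `count_stem` is hypothetical. Any word processor's search function does the same job.

```python
def count_stem(text: str, stem: str) -> int:
    """Count case-insensitive occurrences of `stem` anywhere in `text`,
    including inside longer words (the "-lås-" style of search)."""
    return text.lower().count(stem.lower())

# Invented stand-in for the central document ("The door was locked.
# She closed the door and locked it.").
sample = "Dörren var låst. Hon stängde dörren och låste den."

counts = {stem: count_stem(sample, stem) for stem in ("lås", "stäng", "dörr")}
print(counts)  # {'lås': 2, 'stäng': 1, 'dörr': 2}
```

The count only tells how many places must be inspected; every hit still has to be read in its context.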

The trivial caveat is that this approach will yield a number of "false positives". To construct an English analogy: searching for "now" will also yield "snow" and "acknowledged".
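
The English analogy can be reproduced in a few lines. The sketch below is an assumption about how one might triage hits, not a description of the procedure actually used: it lists every word containing the search string and flags whether the word is exactly the search string or merely contains it. The sample sentence and the name `find_hits` are invented.

```python
import re

def find_hits(text, stem):
    """Return (word, is_exact) for every word that contains `stem`.
    A word containing the stem more than once is listed once."""
    hits = []
    for match in re.finditer(r"\w+", text):
        word = match.group()
        if stem.lower() in word.lower():
            hits.append((word, word.lower() == stem.lower()))
    return hits

sample = "Now the snow fell, and she acknowledged it now."
hits = find_hits(sample, "now")
print(hits)
# [('Now', True), ('snow', False), ('acknowledged', False), ('now', True)]
```

The flag is only an aid for sorting. In a stem search such as "-lock-" the embedded hits ("locked", "unlock") are wanted, so nothing should be discarded without being read.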

The important caveat is that I am not interested in words but in categories of meaning, that is, in events that satisfy certain conditions. And we can never be sure that a certain word will invariably occur whenever a certain event is mentioned.

I strongly recommend that all instances of the word in question should be marked in such a way that the researcher cannot fail to notice them even at a casual glance. And it would be wise to mark them in a way that is never used for other purposes or in other contexts. (My own technique is to give the letters double size and to add ££.)

Is word-search by and large superior to topic-search? I take no stand on this issue. There is no reason why both approaches should not be applied. But there are a number of reasons why I want to give computer-search for words a prominent place.
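
In a plain-text file, where letter size cannot be changed, a comparable marking could be sketched as follows. The use of capital letters as a stand-in for double-sized letters is an assumption made for this illustration, as are the sample sentence and the name `mark_hits`; only the ££ marker is taken from the text above.

```python
import re

MARK = "££"  # marker from the text; assumed not to occur anywhere else in the document

def mark_hits(text, stem):
    """Wrap every case-insensitive occurrence of `stem` in ££ and upper-case it."""
    pattern = re.compile(re.escape(stem), re.IGNORECASE)
    return pattern.sub(lambda m: f"{MARK}{m.group().upper()}{MARK}", text)

marked = mark_hits("She sat at the bedside. The Bedside lamp was on.", "bedside")
print(marked)
# She sat at the ££BEDSIDE££. The ££BEDSIDE££ lamp was on.
```

Because the marker occurs nowhere else, a later search for ££ leads straight to every marked instance.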

(1) I have written only two books of more than 600 pages. My book on Freud's non-authentic observations (1993) was written on a typewriter. My book on textual analysis of legal evidence of sexual abuse (1996) was written on a computer. As a consequence only the latter is available on my hard disc.

The abuse book has two normal indices of subjects and names. By contrast, the Freud book contains a very large analytic index with five hierarchic categories of meaning.

Now, when I want to find a particular page where I have written about a certain topic, it has invariably been significantly easier for me to search for a word in my abuse book than to search via the meaning categories in my Freud book.

The same ease is observed in my as yet unpublished manuscripts, although no meaning categories have been worked out for these writings.

(2) Selecting the words which it would be adequate to search for is often easy. It goes without saying that "bedside" is one of the first words to be tried out in order to find all accounts of the bedside event.

As an alternative, a number of meaningful categories could be invented on the basis of the account of the bedside event given above. But we cannot be sure to find the optimal definition of meaning categories, unless we also take a number of other assaults into consideration. Hence, such an approach could turn out to be a labour-consuming trial-and-error task.

(3) It may not always be unambiguous whether a concrete account belongs to a certain category.

Let us focus on the bedside assault together with two other events: the nail polish assault and the nail polish conversation. Elvira recounted that on one occasion she was present while Ingrid painted the father's anus with nail polish. But Elvira also recounted a conversation she had later with Ingrid. Ingrid allegedly said that Elvira's recollection was not true: Elvira was actually the one who painted the father's anus while Ingrid watched. As a consequence, when Elvira recounted these events to the police, she did not know whether her own or Ingrid's version was the true one.

The word that should in the first place be searched for in this case is "nail polish". It is less obvious what would be the optimal meaning categories for the following three events: painting the anus; talking with Ingrid about the painting event; and telling the police about both the painting and the conversation.

(4) So far I have never encountered a computer-assisted analysis of legal evidence, whether for word search or for topic search. In the Scandinavian countries it would be a completely alien idea to judges and defence counsels to undertake such a task. And whatever the situation may be in other countries, it is not very common.

(5) My strongest motive for stressing computer-search for words rather than for meaning categories is that this procedure is the least time-consuming. Starting with a word-search does not prevent other procedures from being applied later.

(6) Regardless of whether we search for words or categories of meaning, we cannot escape the task of reading through the entire central document in order to check whether we have overlooked some instances of a certain kind of event. But even here I venture a hypothesis about the performance of most people. If there are, objectively, 29 instances of a specific kind of assault, we may be more prone to find the few we have missed if a word-search has already identified and marked, say, 26 instances. This is also one reason why I recommend a very conspicuous marking of all instances that have already been found by word-search.

In the case of Elvira it turned out that search for the word "bedside" missed no instance in the central document, but missed one instance in the document of all judgements.

What should we do if the search for the word "bedside" had yielded no events except the one we had already found? Then the next step might be to try other words such as "Ingrid" and "sister". This search actually yielded the only instance that was missing, and that occurred in one of the judgements.
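
The step from a primary keyword to alternative keywords can be sketched as follows. The sketch assumes that the document has been split into passages (for instance paragraphs); the three passages are invented for the illustration, and the function names are hypothetical. It returns the passages that an alternative keyword finds but the primary keyword misses, which are the ones that still have to be read and judged by a human being.

```python
def passages_with(passages, keyword):
    """Indices of the passages that contain `keyword` (case-insensitive)."""
    key = keyword.lower()
    return {i for i, p in enumerate(passages) if key in p.lower()}

def fallback_candidates(passages, primary, fallbacks):
    """Indices of passages hit by any fallback keyword but missed by the primary one."""
    found = passages_with(passages, primary)
    extra = set()
    for word in fallbacks:
        extra |= passages_with(passages, word)
    return sorted(extra - found)

# Invented passages, not quotations from the case material.
passages = [
    "Ingrid was sitting at the bedside.",
    "Her sister sat next to the bed while it happened.",
    "The door was closed.",
]

candidates = fallback_candidates(passages, "bedside", ["Ingrid", "sister"])
print(candidates)  # [1]
```

Common words such as a person's name will produce many candidates, most of them irrelevant, so this step narrows the reading task but does not replace it.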


Updated: 2009-11-19

Yakida