Current Reading – Engaging with Content Analysis and a different notion of “coding”

I’m currently reading these two books:

Krippendorff, K. (2013). Content analysis: An introduction to its methodology (3rd ed.). Sage.
https://uk.sagepub.com/en-gb/eur/content-analysis/book234903

(I note someone’s put a full copy of the second edition up on academia.edu if you google for it… but you didn’t read that here 😉)

This is a VERY readable and really interesting introduction to content analysis, and it has a great section on computer support, including extended consideration of the use of CAQDAS packages such as ATLAS.ti and NVivo and the more content-analysis-oriented QDA Miner / WordStat combo.

I’m now starting:

Leetaru, K. (2012). Data mining methods for the content analyst: An introduction to the computational analysis of content. Routledge.
https://www.routledge.com/Data-Mining-Methods-for-the-Content-Analyst-An-Introduction-to-the-Computational/Leetaru/p/book/9780415895149

Content analysis and coding vs inductive impressions

I will need to turn this into a “full post” in due course, but my first notes from Krippendorff around coding are that:

p. 127: “Recording takes place when observers, readers or analysts interpret what they see, read, or find and then state their experiences in the formal terms of an analysis. Coding is the term content analysts use when this process is carried out according to observer-independent rules”

I find this interesting because… the “formal terms of an analysis” are emphasised as key in originating Grounded Theory (GT) and hermeneutic approaches, but often seem to be much diminished in contemporary practices of those “using GT” or other approaches to analysis influenced by GT. The formality of defining codes and consistently applying them is, however, still very much inductive and open to continuous, data-driven revision.

It is the notion of observer independence, however, where the approach of content analysis arguably differs most from the inductive and interpretivist ideas framing much of qualitative analysis, and from the assumptions that proceed from those ideas into suggestions of what software can do to assist such analysis. In CAQDAS packages, though, “coding” can support or encompass both approaches – and I wonder to what extent this is a key source of the tensions, mistrust or the (frequent) misrepresentation of what CAQDAS packages “do” to analysis.

To be continued…

KWIC interfaces and concordances

This image from the excellent QD in Practice event organised at Leeds University really drove home to me just how powerful and useful KWIC (Key Words In Context) concordance displays can be.

[Image: a KWIC concordance of Arabic text]

In the image above I cannot even read the script – I don’t read Arabic. Not only can I not read the script, it is written right-to-left, yet KWIC still works.

I can see, without being able to understand, that there is a difference between lines 1, 2 and 3; lines 4 through 11 are the same; line 12 is different; and lines 13 through 20 are the same – in terms of the words in red that appear before (it’s right-to-left text, remember!) the highlighted keyword.

Since I first encountered KWIC in a module on corpus approaches to language teaching, I have recognised its incredible simplicity and power compared to many other ways of showing highlighted text.

From text to context – displaying search results in NVivo at present

Compare it to this:

[Image: NVivo text search results view]

This is the results output from a text search in NVivo.

This is not a bad output – I see context in a similar way to a KWIC concordance and can access the underlying data immediately. However, the presentation precludes some rather more important options that KWIC enables.

Another way to reach this sort of word search is by running a word frequency query in NVivo – which creates a list of words along with information on their length, their count, a weighted percentage (I need to learn more about that) and a list of “similar words”.

The similar words are derived by including stemmed words – a process which has some issues associated with it that I’ll go into a little later. Here I’m going to focus on the representation of that information:

[Image: NVivo word frequency query results]

So double-clicking on a word takes me to the same display as previously for a stemmed text search:

[Image: NVivo words-in-context view for a stemmed text search]

Again not bad – I get some context and information on the source. And from it I can go and find the word in context in the original text by clicking the link – and the word is helpfully highlighted:

[Image: search term highlighted in context in the source text]
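As an aside, the “similar words” grouping in the word frequency query above can be imagined as counting surface forms under their shared stem. Here is a minimal sketch of that idea in Python, using NLTK’s Porter stemmer purely as an assumption for illustration – certainly not NVivo’s actual algorithm:

```python
# Rough sketch of a word frequency query with "similar words" grouped by stem.
# Uses NLTK's PorterStemmer for illustration; this is NOT NVivo's algorithm.
import re
from collections import Counter, defaultdict
from nltk.stem import PorterStemmer

text = "I used to use the software, and using it was useful for my research."
stemmer = PorterStemmer()

# Count each surface form under the stem it reduces to.
groups = defaultdict(Counter)
for word in re.findall(r"[a-z']+", text.lower()):
    groups[stemmer.stem(word)][word] += 1

# Report stems with their total count and the "similar words" they gather.
for stem, words in sorted(groups.items(), key=lambda kv: -sum(kv[1].values())):
    total = sum(words.values())
    print(f"{stem:<10} count={total:<3} similar words: {', '.join(sorted(words))}")
```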

A closer view – word trees

EDIT/UPDATE – from chatting with Silvana (and revisiting Kathleen’s comments in the NVivo Users Group), the word tree is indeed *very* similar to KWIC:

[Image: word tree in NVivo]

It shows the key word in the middle and the branching before and after. The differences, however, are still important – while you can select the text to see connections:

[Image: word tree with a selected branch highlighted]

What you cannot see as easily are the full sentences reading across, or the variation between them. It’s a powerful tool that does much of the work of KWIC – but I’m not sure if the simplification comes at a cost. This is one for me to look at further – thanks to Kathleen for flagging it to me to cogitate on and explore further!

Of course, MAXQDA does have KWIC.

What you can’t do or see easily with this… but could with KWIC

However, there are a bunch of things I can’t do or easily see which KWIC would enable:

  • Which words come before or after? (visible in word tree)
    • Consider, for example, the potentially very important differences between the pronouns that precede or follow a key term that is emerging as a theme – for example work/working or team/s – and if or how these might vary between groups or align with attributes you’re interested in (e.g. managers vs subordinates)
    • Consider, for example, the important differences in how use and used can appear as a verb, a quasi-modal or part of an adjective phrase:
      • I used the software four years ago (verb, past tense)
      • I used to hate the software (quasi-modal)
      • I got used to the software (adjective phrase)
    • Which stems are associated? (Not sure if this is visible with a word tree?)
      • Consider the spurious stemming that can occur e.g.
        • Office
        • Officer
        • Official
      • Which words are associated with particular stems or synonyms?
        • Consider the difference between the stems of
          • be, been, being
        • compared to lemmatisation, which would also capture irregular forms such as
          • am, was, are, were
          (see the sketch contrasting the two just after this list)
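To make this contrast concrete, here is a minimal sketch using NLTK – purely illustrative, assuming NLTK is installed and the WordNet data downloaded; exact stems vary between stemmer implementations:

```python
# Contrast stemming with lemmatisation for the examples in the list above.
# Requires: pip install nltk, then nltk.download('wordnet') for the lemmatiser.
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

# Spurious stemming: distinct words can collapse onto one stem
# ('office' and 'officer' both reduce to 'offic' under Porter's rules).
for word in ["office", "officer", "official"]:
    print(f"{word:<10} -> {stemmer.stem(word)}")

# A stemmer only manipulates the surface form, so it can catch 'being' but
# not suppletive forms; WordNet's dictionary-based lemmatiser also maps
# irregular forms such as 'was' and 'were' back to the lemma 'be'.
for word in ["be", "been", "being", "am", "was", "are", "were"]:
    print(f"{word:<10} stem: {stemmer.stem(word):<8} "
          f"lemma: {lemmatizer.lemmatize(word, pos='v')}")
```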

And here’s where the simplicity and power of KWIC really holds potential for working with this sort of query and any coding from it. Consider what you can see when the data is presented in a KWIC concordance:

Loc % | Text 1 | Stem | Text 2
Ref 1:  0.01% | a little while since I’ve | use | d Adobe Connect. Okay [pause] oh
Ref 2:  0.02% | STS and how you’ve been | using | caqdas software, but it’s just
Ref 3:  0.02% | that particularly made it seem | use | ful or relevant or drew you
Ref 4:  0.02% | ANT, but nevertheless he is | using | some of the principles of
Ref 5:  0.01% | by Actor-network theory have | use | d software in their research. Erm
Ref 6:  0.02% | poll is people who are | using | CAQDAS packages, some is people
Ref 7:  0.02% | is people who are not | using | those. Erm, and some is
Ref 8:  0.02% | some is people who are | using | a mixture of-, a sort
Ref 9:  0.02% | wondered, what software are you | using | ? Erm, and one info [skip
Ref 10:  0.02% | you know, beca | use | -, I start using what I knew at that
Ref 11:  0.02% | start my PhD, we start | using | a specific software that I
Ref 12:  0.02% | software that I had been | using | before, which is a qualitative
Ref 13:  0.01% | study, then I have to | use | something that I knew and
Ref 14:  0.01% | with Atlas T, and I | use | it-, I will explain it
Ref 15:  0.01% | but later …[15.34] Then I | use | d Atlas T from the very
Ref 16:  0.01% | the very beginning, and I | use | d it only to qualify all
Ref 17:  0.01% | of my research. Erm, the | use | of Atlas T was useful
Ref 18:  0.02% | my | use | of Atlas T was useful at some extent,
Ref 19:  0.01% | best tool that I can | use | , but I will explain it
Ref 20:  0.01% | apply principles of ANT and | use | a specific software?’ [18.54] So
Ref 21:  0.01% | of mine, err, quite frequently | use | s the phrase ‘auto-magical’, and
Ref 22:  0.02% | understand how ANTA can be | use | ful in that sense. Of course
Ref 23:  0.02% | learning, analytics, big data and | using | those special softwares, but I
Ref 24:  0.01% | didn’t get how I can | use | it for my research, really
Ref 25:  0.01% | and show me how you | use | Atlas.ti that would be really
Ref 26:  0.01% | tools and options you do | use | , that have supported you the
Ref 27:  0.02% | broken.’ So which-, so you’re | using | Atlas T on a Mac
Ref 28:  0.01% | Yes I [skip]-, I’m just | use | [skip] [25.47] Steve W Okay
Ref 29:  0.02% | finished my thesis, I am | using | [skip] as a module from
Ref 30:  0.02% | you. This paper is about | using | ANT principles through my research
Ref 31:  0.01% | yesterday found that I can | use | AtlasT not in my Windows
Ref 33:  0.02% | with statements from other documents | using | categories of analysis. I mean
Ref 34:  0.01% | you generate and did you | use | ? Alberto There is no [unclear

The power and importance of sorting

What I would like to be able to see is the kind of output shown above as an option alongside the normal contextual view, with the ability to sort it by the middle column and/or by the words immediately preceding or following it. This really helps in spotting patterns:

Loc % | Text 1 | Stem | Text 2
Ref 13:  0.01% | study, then I have to | use | something that I knew and
Ref 14:  0.01% | with Atlas T, and I | use | it-, I will explain it
Ref 17:  0.01% | of my research. Erm, the | use | of Atlas T was useful
Ref 18:  0.02% | my | use | of Atlas T was useful at some extent, to some
Ref 19:  0.01% | best tool that I can | use | , but I will explain it
Ref 20:  0.01% | apply principles of ANT and | use | a specific software?’ [18.54] So
Ref 24:  0.01% | didn’t get how I can | use | it for my research, really
Ref 25:  0.01% | and show me how you | use | Atlas.ti that would be really
Ref 26:  0.01% | tools and options you do | use | , that have supported you the
Ref 28:  0.01% | Yes I [skip]-, I’m just | use | [skip] [25.47] Steve W Okay
Ref 31:  0.01% | yesterday found that I can | use | AtlasT not in my Windows
Ref 34:  0.01% | you generate and did you | use | ? Alberto There is no [unclear
Ref 10:  0.02% | you know, beca | use | -, I start using what I knew at that
Ref 1:  0.01% | a little while since I’ve | use | d Adobe Connect. Okay [pause] oh
Ref 5:  0.01% | by Actor-network theory have | use | d software in their research. Erm
Ref 15:  0.01% | but later …[15.34] Then I | use | d Atlas T from the very
Ref 16:  0.01% | the very beginning, and I | use | d it only to qualify all
Ref 3:  0.02% | that particularly made it seem | use | ful or relevant or drew you
Ref 22:  0.02% | understand how ANTA can be | use | ful in that sense. Of course
Ref 21:  0.01% | of mine, err, quite frequently | use | s the phrase ‘auto-magical’, and
Ref 2:  0.02% | STS and how you’ve been | using | caqdas software, but it’s just
Ref 4:  0.02% | ANT, but nevertheless he is | using | some of the principles of
Ref 6:  0.02% | poll is people who are | using | CAQDAS packages, some is people
Ref 7:  0.02% | is people who are not | using | those. Erm, and some is
Ref 8:  0.02% | some is people who are | using | a mixture of-, a sort
Ref 9:  0.02% | wondered, what software are you | using | ? Erm, and one info [skip
Ref 11:  0.02% | start my PhD, we start | using | a specific software that I
Ref 12:  0.02% | software that I had been | using | before, which is a qualitative
Ref 23:  0.02% | learning, analytics, big data and | using | those special softwares, but I
Ref 27:  0.02% | broken.’ So which-, so you’re | using | Atlas T on a Mac
Ref 29:  0.02% | finished my thesis, I am | using | [skip] as a module from
Ref 30:  0.02% | you. This paper is about | using | ANT principles through my research
Ref 33:  0.02% | with statements from other documents | using | categories of analysis. I mean

This would help with viewing the associations created from a query. A rough sketch of how such a sortable concordance could be produced follows:
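The sketch below is a minimal illustration in plain Python – the pattern, context width and sort key are assumptions of mine for the example, not any package’s actual implementation:

```python
import re

def kwic(text, pattern, width=5):
    """Build KWIC rows: (left context, keyword, right context) triples."""
    tokens = text.split()
    rows = []
    for i, token in enumerate(tokens):
        # Strip trailing punctuation before matching so 'use,' still matches.
        if re.fullmatch(pattern, token.strip(".,?!'’"), re.IGNORECASE):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows

text = ("I used Atlas.ti before, and I am using it now because "
        "I use the tools that have been useful to me")
rows = kwic(text, r"us(e|ed|es|ing)")

# Sort by keyword form, then by the word immediately before it, so that
# patterns such as 'I use' vs 'am using' cluster together for inspection.
rows.sort(key=lambda r: (r[1].lower(), r[0].rsplit(" ", 1)[-1].lower()))

# Print with the keyword aligned in a centre column, KWIC-style.
for left, keyword, right in rows:
    print(f"{left:>35}  {keyword:^7}  {right}")
```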

The next level – making this KWIC view a way of shaping the associations of stems and synonyms

However, to really have power you would need to be able to use it to interact with and change those associations. The functions I would really like (via right-click or similar) are:

1 – remove a stem link (e.g. de-link office and officer so they are no longer treated as the same word)

2 – remove a synonym association (e.g. …)

3 – (ideally – probably harder!) create a link for lemmatisation, and ideally save it to a dictionary or thesaurus, AND/OR differentiate one set of occurrences of used to from another (e.g. the quasi-modal “I used to hate the software” from “I got used to the software”).

All of these are hugely facilitated by a KWIC concordance view. Hopefully some of this is fairly simple, whilst other aspects may need to go on a longer list, but I believe they are really worthy of consideration, especially for approaches oriented more towards content analysis and data mining than towards inductive analysis. A sketch of what functions 1 and 3 might look like under the hood follows:
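This is a hypothetical sketch of a user-editable override table sitting in front of an automatic stemmer; the class and method names are my own illustrations, not any package’s API:

```python
# Hypothetical sketch: user-editable overrides in front of an automatic
# stemmer, so corrections like de-linking 'officer' from 'office' persist.
from nltk.stem import PorterStemmer

class EditableStemmer:
    def __init__(self):
        self._stemmer = PorterStemmer()
        self._overrides = {}  # surface form -> group key chosen by the user

    def stem(self, word):
        word = word.lower()
        return self._overrides.get(word, self._stemmer.stem(word))

    def delink(self, word):
        """Function 1: give a word its own group, undoing a spurious stem link."""
        self._overrides[word.lower()] = word.lower()

    def link(self, word, lemma):
        """Function 3: manually map a form to a lemma (a crude user dictionary)."""
        self._overrides[word.lower()] = lemma.lower()

s = EditableStemmer()
print(s.stem("officer"))  # shares a stem with 'office' under Porter's rules
s.delink("officer")
print(s.stem("officer"))  # now its own group: 'officer'
s.link("was", "be")
print(s.stem("was"))      # 'be' -- saved as a lemmatisation link
```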