KWIC interfaces and concordances

This image from the excellent QD in Practice event organised at Leeds University really drove home to me just how powerful and useful KWIC (Key Words In Context) concordance displays can be.

kwicinarabic

In the image above I cannot even read the script, as I don’t read Arabic. Not only can I not read the script, it is also written right-to-left, and yet KWIC still works.

I can see, without being able to understand the text, that there is a difference between lines 1, 2 and 3, that lines 4 through 11 are the same, that line 12 is different, and that lines 13 through 20 are the same in terms of the words in red that appear before (it’s R>L text, remember!) the highlighted keyword.

Since I first encountered KWIC in a module on corpus approaches to language teaching I have recognised that it has an incredible simplicity and power compared to many other ways of showing highlighted text.
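To make the idea concrete, here is a minimal sketch of how a KWIC display can be built. It is my own illustration rather than how any CAQDAS package does it: the function names `kwic` and `format_kwic`, the whitespace tokenising and the sample sentences are all assumptions made for the example.

```python
def kwic(text, keyword, width=5):
    """Return (left, match, right) triples for each hit of keyword in text."""
    tokens = text.split()
    rows = []
    for i, token in enumerate(tokens):
        # Compare on a lower-cased copy with surrounding punctuation stripped.
        if token.strip(".,;:!?'\"()[]").lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


def format_kwic(rows):
    """Right-align the left context so every keyword starts in the same column."""
    pad = max((len(left) for left, _, _ in rows), default=0)
    return "\n".join(
        f"{left.rjust(pad)}  {match}  {right}" for left, match, right in rows
    )


# Illustrative sample text, not real interview data.
sample = (
    "We work together well as a team. I work well in teams. "
    "I work on my own whenever I can."
)
print(format_kwic(kwic(sample, "work", width=3)))
```

The only real trick is the alignment: because the keyword sits in a fixed column, the eye can scan down the words on either side of it without reading the lines.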

From text to context – displaying search results in NVivo at Present

Compare it to this:

nvivowordsincontextsearchview1

Which is the results output from a text search in NVivo.

This is not a bad output: I see context in a similar way to a KWIC concordance and can access the underlying data immediately. However, the appearance precludes some rather more important options that KWIC enables.

Another way to reach this sort of word search is by running a word frequency query in NVivo – which will then create a list of words along with information on their length, their count, a weighted percentage (need to learn more on that) and a list of “similar words”.

The similar words are derived by including stemmed words – a process which has some issues associated with it which I’ll go into a little later. Here I’m going to focus on the representation of that information:

nvivowordfrequencyresults

So double-clicking on a word takes me to the same display as previously for a stemmed text search:

nvivowordsincontextsearchview2

Again not bad – I get some context and information on the source. And from it I can go and find the word in context in the original text by clicking the link – and the word is helpfully highlighted:

highlightedwordsincontext

A closer view – word trees

EDIT/UPDATE – from chatting with Silvana (and revisiting Kathleen’s comments in the NVivo Users Group). Word tree is indeed *very* similar to KWIC:

wordTree-NVivo

They show the key word in the middle and the branching before and after. The differences, however, are still important. You can select the text to see connections:

wordTree-highlighted

What you cannot see as easily are the sentences across, or any variation. It’s a powerful tool that does much of the work of KWIC – but I’m not sure if the simplification comes at a cost. This is one for me to look at further – thanks to Kathleen for flagging it to me to cogitate on and explore further!
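A rough sketch of why the simplification might come at a cost: a word tree merges shared branches, so the individual lines are no longer visible. The code below is my own guess at the general idea, not NVivo's implementation; `right_tree` and the sample sentences are hypothetical.

```python
def right_tree(sentences, keyword, depth=2):
    """Nested dict of the words that follow keyword, with a count per branch."""
    tree = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != keyword:
                continue
            node = tree
            # Walk down the tree one following word at a time, merging
            # branches that start with the same words.
            for following in tokens[i + 1:i + 1 + depth]:
                entry = node.setdefault(following, {"count": 0, "next": {}})
                entry["count"] += 1
                node = entry["next"]
    return tree


# Illustrative sentences, not real data.
sentences = [
    "I work well in teams",
    "We work well together",
    "I work on my own",
]
tree = right_tree(sentences, "work")
print(tree["well"]["count"])         # how many hits continue with "well"
print(sorted(tree["well"]["next"]))  # what follows "work well"
```

The two "work well" sentences collapse into a single branch with a count. That is handy for spotting common continuations, but the tree no longer shows that one speaker said "I" and the other "We", which a KWIC line would keep in view.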

Of course MaxQDA does have KWIC

What you can’t do or see easily with this… but could with KWIC

However, there are a bunch of things I can’t do or easily see which KWIC would enable:

- Which words come before or after? (visible in word tree)
  - Consider, for example, the potentially very important differences between the pronouns that precede or follow a key term that is emerging as a theme, such as work/working or team/s, and whether or how these might vary between groups or align with attributes you’re interested in (e.g. managers vs subordinates)
    - We work together well as a team
    - We worked really well on that project
    - I work well in teams
    - I work on my own whenever I can
      - Thanks to Kristi Jackson for her reply about this topic on the NVivo users group for prompting this update!
  - Consider, for example, the important differences between how use and used can appear as a verb, a quasi-modal or part of an adjective phrase:
    - I used the software four years ago (verb, p/t)
    - I used to hate the software (quasi-modal)
    - I got used to the software (adjective phrase)
- Which stems are associated? (Not sure if this is visible with word tree???)
  - Consider the spurious stemming that can occur e.g.
    - Office
    - Officer
    - Official
  - Which words are associated with particular stems or synonyms
    - Consider the difference between stems of
      - be, been, being
    - Compared to lemmatisation as
      - am, was, are, were
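The stemming versus lemmatisation contrast in the list above can be sketched in a few lines. This is a deliberately crude illustration: `naive_stem`, its suffix list and the `LEMMAS` table are hypothetical and are not what NVivo (or any real stemmer) uses.

```python
def naive_stem(word, suffixes=("ial", "er", "e")):
    """Strip the first matching suffix. A crude, hypothetical stemmer."""
    word = word.lower()
    for suffix in suffixes:
        # Only strip when at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


# A lemma lookup has to be told about irregular forms explicitly.
# Hypothetical, hand-made table for illustration only.
LEMMAS = {
    "am": "be", "is": "be", "are": "be", "was": "be",
    "were": "be", "been": "be", "being": "be",
}


def lemma(word):
    """Return the dictionary form if known, otherwise the word itself."""
    word = word.lower()
    return LEMMAS.get(word, word)


for w in ("office", "officer", "official"):
    print(w, "->", naive_stem(w))   # all three collapse to the same stem

for w in ("am", "was", "are", "were"):
    print(w, "->", naive_stem(w), "/", lemma(w))
```

Suffix stripping happily merges office, officer and official, yet it has no way of connecting am, was, are and were to be. The lookup table gets those right, but only because somebody wrote the entries down.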

And here’s where the power yet simplicity of KWIC really holds potential for working with this sort of query and any coding from that. Consider what you can see when the data is presented in a KWIC concordance:

Ref 1:	0.01%	a little while since I’ve	use	d	Adobe Connect. Okay [pause] oh
Ref 2:	0.02%	STS and how you’ve been	using		caqdas software, but it’s just
Ref 3:	0.02%	that particularly made it seem	use	*ful*	or relevant or drew you
Ref 4:	0.02%	ANT, but nevertheless he is	using		some of the principles of
Ref 5:	0.01%	by Actor-network theory have	use	d	software in their research. Erm
Ref 6:	0.02%	poll is people who are	using		CAQDAS packages, some is people
Ref 7:	0.02%	is people who are not	using		those. Erm, and some is
Ref 8:	0.02%	some is people who are	using		a mixture of-, a sort
Ref 9:	0.02%	wondered, what software are you	using		? Erm, and one info [skip
Ref 10:	0.02%	you know, beca	use	-,	I start using what I knew at that
Ref 11:	0.02%	start my PhD, we start	using		a specific software that I
Ref 12:	0.02%	software that I had been	using		before, which is a qualitative
Ref 13:	0.01%	study, then I have to	use		something that I knew and
Ref 14:	0.01%	with Atlas T, and I	use		it-, I will explain it
Ref 15:	0.01%	but later …[15.34] Then I	use	d	Atlas T from the very
Ref 16:	0.01%	the very beginning, and I	use	d	it only to qualify all
Ref 17:	0.01%	of my research. Erm, the	use		of Atlas T was useful
Ref 18:	0.02%	my	use		of Atlas T was useful at some extent,
Ref 19:	0.01%	best tool that I can	use		, but I will explain it
Ref 20:	0.01%	apply principles of ANT and	use		a specific software?’ [18.54] So
Ref 21:	0.01%	of mine, err, quite frequently	use	s	the phrase ‘auto-magical’, and
Ref 22:	0.02%	understand how ANTA can be	use	*ful*	in that sense. Of course
Ref 23:	0.02%	learning, analytics, big data and	using		those special softwares, but I
Ref 24:	0.01%	didn’t get how I can	use		it for my research, really
Ref 25:	0.01%	and show me how you	use		Atlas.ti that would be really
Ref 26:	0.01%	tools and options you do	use		, that have supported you the
Ref 27:	0.02%	broken.’ So which-, so you’re	using		Atlas T on a Mac
Ref 28:	0.01%	Yes I [skip]-, I’m just	use		[skip] [25.47] Steve W Okay
Ref 29:	0.02%	finished my thesis, I am	using		[skip] as a module from
Ref 30:	0.02%	you. This paper is about	using		ANT principles through my research
Ref 31:	0.01%	yesterday found that I can	use		AtlasT not in my Windows
Ref 33:	0.02%	with statements from other documents	using		categories of analysis. I mean
Ref 34:	0.01%	you generate and did you	use		? Alberto There is no [unclear

The power and importance of sorting

What I would like to be able to see is the kind of output shown above as an option along with the normal contextual view. I would want to be able to sort it by the middle column and/or the words immediately preceding or following that. This then really helps spot patterns:

Loc	%	Text 1	Stem		Text 2
Ref 13:	0.01%	study, then I have to	use		something that I knew and
Ref 14:	0.01%	with Atlas T, and I	use		it-, I will explain it
Ref 17:	0.01%	of my research. Erm, the	use		of Atlas T was useful
Ref 18:	0.02%	my	use		of Atlas T was use ful at some extent, to some
Ref 19:	0.01%	best tool that I can	use		, but I will explain it
Ref 20:	0.01%	apply principles of ANT and	use		a specific software?’ [18.54] So
Ref 24:	0.01%	didn’t get how I can	use		it for my research, really
Ref 25:	0.01%	and show me how you	use		Atlas.ti that would be really
Ref 26:	0.01%	tools and options you do	use		, that have supported you the
Ref 28:	0.01%	Yes I [skip]-, I’m just	use		[skip] [25.47] Steve W Okay
Ref 31:	0.01%	yesterday found that I can	use		AtlasT not in my Windows
Ref 34:	0.01%	you generate and did you	use		? Alberto There is no [unclear
Ref 10:	0.02%	you know, beca	use	-,	I start using what I knew at that
Ref 1:	0.01%	a little while since I’ve	use	d	Adobe Connect. Okay [pause] oh
Ref 5:	0.01%	by Actor-network theory have	use	d	software in their research. Erm
Ref 15:	0.01%	but later …[15.34] Then I	use	d	Atlas T from the very
Ref 16:	0.01%	the very beginning, and I	use	d	it only to qualify all
Ref 3:	0.02%	that particularly made it seem	use	ful	or relevant or drew you
Ref 22:	0.02%	understand how ANTA can be	use	ful	in that sense. Of course
Ref 21:	0.01%	of mine, err, quite frequently	use	s	the phrase ‘auto-magical’, and
Ref 2:	0.02%	STS and how you’ve been	using		caqdas software, but it’s just
Ref 4:	0.02%	ANT, but nevertheless he is	using		some of the principles of
Ref 6:	0.02%	poll is people who are	using		CAQDAS packages, some is people
Ref 7:	0.02%	is people who are not	using		those. Erm, and some is
Ref 8:	0.02%	some is people who are	using		a mixture of-, a sort
Ref 9:	0.02%	wondered, what software are you	using		? Erm, and one info [skip
Ref 11:	0.02%	start my PhD, we start	using		a specific software that I
Ref 12:	0.02%	software that I had been	using		before, which is a qualitative
Ref 23:	0.02%	learning, analytics, big data and	using		those special softwares, but I
Ref 27:	0.02%	broken.’ So which-, so you’re	using		Atlas T on a Mac
Ref 29:	0.02%	finished my thesis, I am	using		[skip] as a module from
Ref 30:	0.02%	you. This paper is about	using		ANT principles through my research
Ref 33:	0.02%	with statements from other documents	using		categories of analysis. I mean

This would help with viewing the associations created from a query.
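As a sketch of the sorting I have in mind, here are a handful of the rows from the output above, sorted first by keyword form and then by the word touching the keyword. The `Row` structure, `edge_word` and `sort_kwic` are my own hypothetical names, and I have joined the stem and suffix columns into a single keyword for simplicity.

```python
from typing import NamedTuple


class Row(NamedTuple):
    ref: int
    left: str
    keyword: str
    right: str


# A few rows copied from the concordance above (stem and suffix joined).
rows = [
    Row(1, "a little while since I’ve", "used", "Adobe Connect. Okay [pause] oh"),
    Row(2, "STS and how you’ve been", "using", "caqdas software, but it’s just"),
    Row(13, "study, then I have to", "use", "something that I knew and"),
    Row(14, "with Atlas T, and I", "use", "it-, I will explain it"),
    Row(15, "but later …[15.34] Then I", "used", "Atlas T from the very"),
    Row(21, "of mine, err, quite frequently", "uses", "the phrase ‘auto-magical’, and"),
]


def edge_word(row, side):
    """Word touching the keyword: last word on the left or first on the right."""
    words = (row.left if side == "left" else row.right).lower().split()
    if not words:
        return ""
    return words[-1] if side == "left" else words[0]


def sort_kwic(rows, side="left"):
    """Sort by keyword form, then by the neighbouring word on the chosen side."""
    return sorted(rows, key=lambda r: (r.keyword.lower(), edge_word(r, side)))


for row in sort_kwic(rows, side="left"):
    print(f"Ref {row.ref}:\t{row.left}\t{row.keyword}\t{row.right}")
```

Switching `side` to `"right"` groups the lines by what follows the keyword instead, which is the view that would separate "used to" from "used the".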

The next level – making this KWIC view a way of shaping the associations of stems and synonyms

However, to really have power you would need to be able to use it to interact with and change those associations. The functions I would really like (via right click or similar) are:

1 – remove link of stem (e.g. De-link office and officer as being the same word)

2 – remove synonym association (e.g.

3 – (Ideally – probably harder!) create a link for lemmatisation and ideally save it to a dictionary or thesaurus. AND / OR differentiate one set of used to from another set of used to.

All of these are hugely facilitated by a KWIC concordance view. Hopefully some of this is fairly simple to implement, whilst other aspects may need to go on a longer list, but I believe they are all worthy of consideration, especially for approaches oriented more towards content analysis and data mining than inductive analysis.
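To show the sort of thing I mean, here is a minimal sketch of an editable layer of associations sitting on top of an automatic stemmer. Everything in it is hypothetical: `crude_stem` stands in for whatever stemmer the software uses, and the `Associations` class with its `delink` and `link` methods is my own invention, not a feature of NVivo or any other package.

```python
def crude_stem(word, suffixes=("ial", "er", "e")):
    """Stand-in stemmer for the example only: strip the first matching suffix."""
    word = word.lower()
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


class Associations:
    """Automatic stem groups plus manual corrections made by the analyst."""

    def __init__(self, stemmer):
        self.stemmer = stemmer
        self.overrides = {}  # word -> group chosen by hand

    def group_of(self, word):
        word = word.lower()
        if word in self.overrides:
            return self.overrides[word]
        return self.stemmer(word)

    def delink(self, word):
        """Give word its own group so it no longer matches its stem-mates."""
        self.overrides[word.lower()] = word.lower()

    def link(self, word, group):
        """Attach word to a named group, e.g. a lemma such as 'be'."""
        self.overrides[word.lower()] = group

    def groups(self, words):
        """Map each group to the words currently assigned to it."""
        result = {}
        for word in words:
            result.setdefault(self.group_of(word), []).append(word)
        return result


assoc = Associations(crude_stem)
print(assoc.groups(["office", "officer", "official"]))  # one spurious group

assoc.delink("officer")       # function 1: remove the stem link
assoc.link("was", "be")       # function 3: add lemma links by hand
assoc.link("were", "be")
print(assoc.groups(["office", "officer", "official", "be", "was", "were"]))
```

Because the corrections live in a plain word-to-group mapping, they could be saved and reloaded as a custom dictionary or thesaurus, which is the "save it" part of the third function.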

2 thoughts on “KWIC interfaces and concordances”

Christina Silver says:

March 5, 2017 at 7:31 am

Hi Steve, thanks for this. You’re right about how useful KWIC can be and the limitations in some CAQDAS packages right now. But did you know that MAXQDA and QDA Miner do enable what you’re asking for – I’ll write a post on it on the Five-Level QDA method blog this week. http://www.fivelevelqda.com/blog


1. stevecaqdas says:
  
  March 5, 2017 at 9:27 am
  
  Hi Christina, Thanks for taking the time to read and comment.
  
  I’ve just started using QDA Miner/wordstat (and REALLY liking it) – one of the first things I really liked seeing was KWIC concordancing – which in part prompted this post.
  
  Using and learning MaxQDA is still (sadly) stuck on my “to do” list though their trainers certification programme is hugely attractive and great to see they’ve done KWIC http://www.maxqda.com/maxqda-update-12-3-maxdictio
  
  Will be updating/extending this post soon to explore how it could come in to ATLAS.ti 8 and Mac.
  
  Next post will be about relationships as I continue trying to work out what relationship nodes do in NVivo and how they contrast with hyperlinking in ATLAS.ti in terms of visual representation alone vs functional querying… something of a work in progress but trying to become more disciplined about writing every day and getting things up here as a place to begin a conversation and get comments so your input and responses are a big boost 😀
  


Published by stevecaqdas
