“But can I bring my notes?” Ideas and investigations on improving Literature Import into CAQDAS Software

Update: ATLAS.ti 22 finally does it!!!!

5 years on from this post and the latest update from ATLAS.ti has finally brought this feature in… Game changer? I think so. ATLAS.ti 22 is certainly leading the way with amazing features right now! More details in this article from ATLAS.ti.

Introduction

It’s evident that a key area where CAQDAS software is having an impact on research practices is in literature reviews. This may be rather “old news” for those working with NVivo (available since version 9, released 2010) or MaxQDA (available since v11, released in 2012) but it is relatively recent for ATLAS.ti (version 8, released 2016) and for some seems a strange departure – this software is for empirical data isn’t it? We have Reference management software for managing literature and notes on that don’t we? Well – yes, and no – and this blog post explores some of the crossovers, continuities and contested spaces between these two major types of research support software for unstructured data.

Now, my background is as an ATLAS.ti user so this trend still seems relatively recent to me – it wasn’t even on my radar when I was setting out on my PhD thesis in 2012. The focus in books and articles and tutorials was on working with empirical data using varying shades of Grounded Theory-derived and/or thematic-orientated approaches to analysis. I didn’t even think of importing literature. But by the time I came to writing up I was desperately searching through my PDF notes made in Endnote X7 and finding the search function to be very (very) poor. I was frustrated by my inability to seamlessly link my annotations of empirical work, and their groupings via codes, with the theoretical ideas I’d already written a lot of notes on and highlighted extensively in Endnote.

Come the end of my thesis and in subsequent work – especially with the launch of the ATLAS.ti iPad app as a great PDF reader – I started to engage with literature reviews. A few blog posts started to appear (e.g. Dr. Ken Riopelle’s experiments with the mobile app http://atlasti.com/2014/03/26/how-to-use-atlas-ti-mobile-app-with-the-browzine-app-for-literature-reviews/ ). As prep for a job interview I used the ATLAS.ti app to look at connections between my PhD work and the work related to the research and the research team. I didn’t get the job (though I came a close 2nd and got useful feedback) but I did get to write it up and begin building connections with ATLAS.ti’s training programme (http://atlasti.com/2014/06/12/1722/ ).

Part 2) The most frequent questions about importing literature

When I’m teaching PhD students and research staff about making an informed choice and then using CAQDAS effectively, I draw on these experiences to strongly advocate for the sense, power and potential of undertaking the lit review in a CAQDAS package. This is often seen as rather novel; however, the potential is typically recognised pretty quickly, especially when contrasted with the limits on classification, grouping, search and retrieval of notes made in current ref management software. But it is essential to remember that this recognition of the potential always happens in the context of, and in relation to, existing practices of managing, highlighting, annotating and summarising literature.

Unsurprisingly therefore… the following question always comes up:

“OK so I can import the reference info and the documents – can I import the notes I’ve made?”

The answer is… no.

The result is disappointment, and frequently a decision to stick with current practices due to these barriers. And it’s those barriers and steps to remove them that are the focus of this extended blog post.

And to show I’m not just making this up here’s an example – from a presentation on NVivo for lit reviews by Silvana di Gregorio at the NVivo @ Lancaster event:
https://vimeo.com/223259096/84d441ca75#t=1195s

This student has made extensive notes in Mendeley and understandably wants to import those as well as the PDFs.

Now the highlights will display but they will not be integrated into the programme architecture and all the work and ideas in those notes are left behind – to be re-created slowly and repetitively one-by-one via copy and paste. Or abandoned. Or (more likely) the lit and this practice will stay in Mendeley as a result.

HOWEVER, that phrase “slowly and repetitively one-by-one via copy and paste” seems all wrong – as it is EXACTLY that sort of thing that computers excel at doing reliably, quickly and automatically. If you have to do exactly the same thing over and over and over to move data from one place to another SURELY a computer should be doing that for you?

With that as the basis the rest of this article considers in turn:

Part 3) What Reference management software is, does and the practices it supports and has extended in to and the relationships with CAQDAS

Part 4) Turns to look more broadly at good ways and recommended practices for working with research literature and how these are supported in RM software compared with CAQDAS.

Part 5) Takes a deeper focus on RM software and changing priorities and associated practices from a focus on bibliographic accuracy to supporting reading and review.

Part 6) Turns towards practical ideas and proposals for improving import of PDFs from RM software

Part 7) Turns to applying this in practice, in the hope of giving some help to the developers by bringing together my explorations through linking to standards, code, APIs etc.

Part 8) Lays out annotated segments of the code exported from Acrobat Reader of PDF annotations and notes and the relationship to the XML exported from ATLAS.ti to put these ideas into a coded context.

Part 9) Concludes this essay and also anticipates possible objections and potential approaches to mitigate those.

Then there are appendices of links to resources and some extended detail on the development and feature history of leading RM and CAQDAS packages

I draw on my experience of using and teaching CAQDAS software (ATLAS.ti and NVivo) and also using and teaching effective use and workflows for literature management and review software.

Part 3) What does ref management software do?

Reference Management software has evolved to extend beyond its original place in the research process: the end stage of writing, including in-text citations and constructing a bibliography.

They were extended to support the start of a literature review (searching for and importing references and attaching the full text).

Increasingly they are now seeking to support the middle – the actual work not just the admin – which is the reading, and working with the literature.

There’s a good table of comparisons and history at https://en.wikipedia.org/wiki/Comparison_of_reference_management_software

Gilmour and Cobus-Kuo (2011) give a succinct list for reference managers (RM):
RMs serve a variety of functions. Generally, we would expect an RM to be able to:

  1. Import citations from bibliographic databases and websites
  2. Gather metadata from PDF files
  3. Allow organization of citations within the RM database
  4. Allow annotation of citations
  5. Allow sharing of the RM database or portions thereof with colleagues
  6. Allow data interchange with other RM products through standard metadata formats (e.g., RIS, BibTeX)
  7. Produce formatted citations in a variety of styles
  8. Work with word processing software to facilitate in-text citation

http://www.istl.org/11-summer/refereed2.html

Some of these are specific to RM functionality, others have continuities and impact on working with CAQDAS in literature reviews.

Point 4 is the key point of continuity in practice and the focus for this blog post/series, as annotation of citations is where the interaction with CAQDAS software becomes important.

CAQDAS software is in a different area from points 1 and 2 which concern finding and organising literature (though with potential to learn from 2 for auto coding perhaps?).
Point 3 – organising references in the database – is important for CAQDAS to help organise imported data.
It has its own way(s) of addressing point 5 with regard to sharing projects in a research team.
There is a need to have connections to point 6 to support exporting a literature review with a meaningful connection to the references.
When it comes to writing and creating a bibliography it is not, currently, in the same game for points 7 and 8. However, in “next-generation CAQDAS” there could well be similar requirements for this sort of export to enable referencing to project items stored in data archives and referenced via open-data formats to support referencing the underlying data in a project.

Part 4) Approaches and Recommendations for working with research literature

With both RM software and CAQDAS contesting and seeking to become key actors in the middle stage of working with literature – what is this work? Well here are some useful quotes I often draw on:

Recording your Reading

By the time you begin a research degree, it is likely that you will have learned the habit of keeping your reading notes in a word processed file, organized in terms of (emerging) topics. I stress ‘reading notes’ because it is important from the start that you do not simply collate books or photocopies of articles for ‘later’ reading but read as you go. Equally, your notes should not just consist of chunks of written or scanned extracts from the original sources but should represent your ideas on the relevance of what you are reading for your (emerging) research problem.
(Silverman, 2013, p. 340)

Silverman then goes on to cite Phelps et al.’s succinct suggestions:

Phelps, Fisher, and Ellis (2007) TABLE 19.1 Reading and Note Taking

▪ Never pick up and put down an article without doing something with it
▪ Highlight key points, write notes in the margins, and write summaries elsewhere
▪ Transfer notes and summaries to where you will use them in your dissertation
▪ Ensure that each note will stand alone without you needing to go back to the original
Adapted from Phelps et al. (2007)
(cited in Silverman, 2013, p. 341)

Drawing on these we can see that working with literature is another qualitative practice – literature after all is text that you are reading, analysing and interacting with in ways that are analogous to many qualitative analysis practices.

Phelps et al.’s four points can be translated into CAQDAS and RM software support features and practices – which equate to “doing something”.

  • Notes in the margins (quotation comments – ATLAS.ti, Annotations – NVivo, Sticky Notes – RM software)
  • Summaries elsewhere (linked memos and/or document comments – ATLAS.ti and NVivo, Notes – RM software)
  • Transfer notes and summaries to where you will use them: on a computer that’s the promise of these packages: they ARE where you will use them, and for RM software they hook in to where you will cite them (through memo links and project exports in CAQDAS, through cite-while-you-write plugins with access to the notes in RM software)

The “lightbulb” moment for students comes when contrasting how these approaches are supported by RM software compared with how CAQDAS can/could. I pose the following questions:

  1. What do you do currently?
  2. Where and when do you read?
  3. How do you highlight/annotate/summarise?
  4. How do you group those highlights/annotations/summaries together?
  5. How do you relate these pertinent segments of literature together?
  6. How do you find and retrieve highlights, quotes and their associated notes?

It is points 4, 5 and 6 that really articulate the power and potential of CAQDAS – the issues of grouping, relating and locating the notes and ideas and insights they have had.

This can be contrasted with the limited grouping and search functions in RM software:

Illustration – search in PDF notes in Endnote X7, which identifies 8 documents with multiple comments where the word I’m searching for appears – but doesn’t show the content of notes or even which note the word appears in!

BLOG-image-Endnote-searchingPDfNotes

CAQDAS software opens up the potential of doing this by using coding to group together quotes and notes on them. Bazeley (2013) suggests that these will cluster around three areas: methods, topic and theory. This would suggest highlighting, annotating and grouping those highlights and annotations (via codes) based on:

  • different methods used
  • previous explorations of the topic
  • collecting together results and their significance
  • the different framing of the topic and methods in different theories by different authors and theorists

(The terms in italics can be used to structure coding for a literature review. CAQDAS software then enables you to explore co-occurrences between those codes, which can be further explored using the reference information to track patterns within and across different types or eras of literature.)

Part 5) RM packages in focus: priorities, changes and practices

Gilmour and Cobus-Kuo’s (2011) paper  “compares four prominent RMs: CiteULike, RefWorks, Mendeley, and Zotero, in terms of features offered and the accuracy of the bibliographies that they generate.” The focus, which is the historical place of RM software, is on generating a bibliography and the accuracy of that. This is the core work of RM software and clearly differentiated from and not commensurate with CAQDAS. However developments of the packages lead them to engage with Mead and Berryman’s argument that: “it is not the users themselves who have changed, but their workflow” (Mead and Berryman 2010).

“The all-too-familiar scenario as discussed in the literature depicts the researcher with many PDFs stored in various places who needs a tool to simply upload the documents and pull the citation information into their RM product of choice (Mead and Berryman 2010; Barsky 2010).” (Gilmour and Cobus-Kuo, 2011)

However, this scenario differs markedly from the location that CAQDAS software seeks to engage with in the lit review workflow, the one that has only recently become integrated into RM software – that of actually working with PDFs in the terms considered above – annotating, highlighting and grouping segments and notes together based on shared features.

In terms of CAQDAS’ role in the lit review process extracting reference data plays a supporting role of organising the documents in a project for the purposes of helping to order or filter queries of the metadata added through coding and annotating.

In terms of a literature review then Gilmour and Cobus-Kuo’s question of “What are the primary and secondary needs of the user based on workflow?” is particularly pertinent.

What I find interesting in the question I receive from users – exemplified earlier – is just how much the workflow and use of RM software seems to have changed. For those who are engaging in reading and annotating electronically this aligns with Gilmour and Cobus-Kuo’s (2011) observation that there will be a shift towards “new researchers who are more flexible in their work habits and may be more willing to learn new RMs that provide Web 2.0 functionality and PDF features.”

What emerges from these overlapping views (albeit unsystematic and partial ones, derived from my practices and those of students I have worked with) is a picture of some RM software having taken up residence in the space that CAQDAS is seeking to (re)define and “own” – that of working WITH literature in terms of active reading and engagement with the texts. However, CAQDAS software has a set of compelling features and options that are substantially more developed than those in RM software, as well as the prospect of being the core management environment for analysing and connecting both literature and empirical data – which RM software will (probably) never do – with even the ambitious ideas of Colwiz (https://www.colwiz.com/about ) sticking to group management of literature not project management or empirical data.

This space is therefore outside the historic and traditional realm of RM software and is potentially an area where RM software could learn from both CAQDAS and Note making software and CAQDAS needs to substantially enhance its integrations if it wishes to really tempt and engage existing, increasingly sophisticated RM software users.

Part 6) Ideas and approaches to improving import of literature into CAQDAS software

If CAQDAS software is to make a bigger play for recognition as a particularly useful type of tool for conducting lit reviews – which manufacturers certainly seem keen to do (cf blog for ATLAS.ti at http://atlasti.com/2017/02/09/lit-reviews/ and blog for NVivo at http://www.qsrinternational.com/blog/hone-your-nvivo-skills-with-literature-reviews and guide from MaxQDA http://www.maxqda.com/maxqda-literature-reviews-reference-management-software ) – then there is surely a strong case for substantially removing barriers and improving the migration from some of the tools and practices considered here to both facilitate and encourage transition. This would also attract users to do reviews in these more powerful packages with the features outlined previously – namely multiple categorisation of notes and quotes (through coding), advanced retrieval (through queries), and connected writing (through memos).

As noted previously and illustrated with an example of the question being asked: If you’ve started doing a lot of your lit review in Mendeley, or Zotpad, or Endnote and you’ve made a lot of highlights and notes on PDFs you will want to preserve and use this work. It seems reasonable that software claiming to do all of those things better should be able to import the work you have already done and support you to build on it.

It doesn’t.

Could it?

From what I’ve been finding out it seems the answer is potentially yes – and what I now proceed to do is to sketch out ideas of how this could be done and some of the initial things I’ve been finding out.

There’s quite a caveat though: I’m a user of software with reasonable technical understanding but I’m not, never have been, and never will be “a programmer” so there are parts of this where I’m speculating, making educated guesses or don’t understand it fully at present – but I would really welcome input from those more adept at programming and knowledgeable about the complex and convoluted PDF standard(s).

High level view of improved import of PDFs from RM software

Import references and linked PDFs with additional option to include PDF comments (and ideally highlights) to be translated into the CAQDAS programme structure (e.g. as quotations with comments in ATLAS.ti, or as annotations in NVivo)

Other desirable features:

1) Importing highlights as well?

Whilst it is the case that highlights will display on the imported PDF they will not become translated into actual project elements. If they were imported rather than merely displayed then “highlight” annotations would appear in the list of all annotations in NVivo allowing quick retrieval of highlighted passages. The merit may be rather marginal but shouldn’t be dismissed.
So, if these could be imported then that could be either as quotations without a comment (in ATLAS.ti) with a code of “highlight” and either an element colour or a code of “highlight – yellow”.
In NVivo they could be imported as coded segments with a node named “highlight” and the appropriate element colour in NVivo.

2) Import any keywords from notes

(if applicable – still exploring this in Mendeley and Zotero) as code names for these items.

3) Import metadata

This could include colours, authors, dates etc.
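To make the proposed mapping a little more concrete, here is a minimal Python sketch of how an import step might translate PDF annotations into project elements. Everything named here is hypothetical: `Quotation` is just a stand-in for a CAQDAS element (it is not a real ATLAS.ti or NVivo type), and the annotation dictionaries are assumed to have already been read from an export. The example values are taken from the XFDF export shown later in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Quotation:
    """Hypothetical stand-in for a CAQDAS element; not a real ATLAS.ti/NVivo type."""
    page: int
    area: tuple  # (left, bottom, right, top) in PDF points
    author: str = ""
    comment: str = ""
    codes: list = field(default_factory=list)
    colour: str = ""


def annotation_to_quotation(annot):
    """Translate one PDF annotation (a plain dict) into a Quotation.

    Comments carry their text across as the quotation comment; highlights
    and underlines get a code named after the kind of mark-up, plus colour.
    """
    quotation = Quotation(
        page=annot["page"],
        area=annot["rect"],
        author=annot.get("author", ""),
        comment=annot.get("comment", ""),
        colour=annot.get("colour", ""),
    )
    if annot["kind"] in ("highlight", "underline"):
        quotation.codes.append(annot["kind"])
    return quotation


highlight = {"kind": "highlight", "page": 0, "author": "Steve",
             "rect": (471.476, 223.823, 781.711, 326.139), "colour": "#FFFF00"}
note = {"kind": "text", "page": 1, "author": "Steve",
        "rect": (361.297, 333.329, 379.297, 351.329), "colour": "#FFFF00",
        "comment": "Contrasts with views from Bourdieu"}

for annot in (highlight, note):
    print(annotation_to_quotation(annot))
```

The design choice worth noting is that the highlight becomes a quotation with a code and no comment, while the note becomes a quotation with a comment and no code, which is the split suggested above.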

Part 7) Exploring this in practical terms for developers – standards, codebases, APIs etc

So… HOW COULD YOU DO THE EXPORT?
It looks like I’m not the only one puzzling about this based on this on github: https://github.com/nichtich/marginalia/wiki/Support-of-PDF-annotations

So this is where it gets a little more sketchy and I hit the limits of my knowledge – I’m hoping there are some good ways to build this in as a second loop or option in the import procedure so that it is seamless across ref management programmes.

I anticipate this would involve some sort of loop for the programme on import – import ref management data, check if PDF attached (so far the same), then check if the imported (or to-be-imported) PDFs have annotations, if so export annotations as XFDF and then import the details from the XFDF into the programme structure.

I explore this in more detail below.
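As a rough illustration of the last step in that loop (reading the details out of an XFDF file), here is a minimal sketch using only Python's standard library. It assumes the XFDF has already been exported, and the sample is a heavily trimmed, made-up version of the kind of file Acrobat Reader produces; `read_xfdf` and `example.pdf` are names I have invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by XFDF files (as seen in the Acrobat Reader export).
NS = {"x": "http://ns.adobe.com/xfdf/"}

# Trimmed, illustrative sample; a real export carries many more attributes.
SAMPLE_XFDF = """<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
  <annots>
    <highlight color="#FFFF00" page="0" title="Steve"/>
    <text color="#FFFF00" page="1" title="Steve" icon="Comment">
      <contents>Contrasts with views from Bourdieu</contents>
    </text>
  </annots>
  <f href="example.pdf"/>
</xfdf>"""


def read_xfdf(xfdf_text):
    """Return (pdf_href, annotations) from the text of an XFDF file."""
    root = ET.fromstring(xfdf_text)
    file_el = root.find("x:f", NS)
    pdf_href = file_el.get("href") if file_el is not None else None

    annotations = []
    for el in root.findall("x:annots/*", NS):
        annotations.append({
            # Strip the "{namespace}" prefix to get highlight, text, etc.
            "kind": el.tag.rsplit("}", 1)[-1],
            # XFDF pages are zero-based: page="0" is the first page.
            "page": int(el.get("page", "0")),
            "author": el.get("title", ""),
            "colour": el.get("color", ""),
            "comment": el.findtext("x:contents", default="", namespaces=NS).strip(),
        })
    return pdf_href, annotations


href, annots = read_xfdf(SAMPLE_XFDF)
print(href)
for annot in annots:
    print(annot)
```

The file reference matters as much as the annotations: it is what would let an importer match each set of comments back to the PDF it has just imported.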

Alternative / interim approaches – getting the RM software to do the annotation export
However, as this is something of a “nice to have”, alternatives could be clear sets of instructions for using features in software or third party apps to export data into a format that can then be imported and annotated onto the PDFs. This sort of interim/experimental release stage could require the user to export the XFDF files themselves.

Mendeley

This seems more advanced in some packages than others e.g. Mendeley enables this on a document by document basis to export an annotated PDF (see https://blog.mendeley.com/2012/04/19/how-to-series-how-to-export-your-annotations-alone-or-with-your-pdf-part-8-of-12/ )

Illustration: Exporting annotated PDFs from Mendeley

BLOG-image-Mendeley Export PDF menu Screen Shot 2017-07-03 at 19.41.52

There is a python library on GitHub: https://github.com/Xunius/Menotexport to do this in bulk. However this wouldn’t create XFDF files.

Zotero

ZotFile as a plugin for Zotero appears to offer bulk export of PDFs and extraction of annotations (see http://zotfile.com/#extract-pdf-annotations )
Again, no XFDF export.

Endnote

Unsurprisingly Endnote doesn’t seem to do much here – despite user requests dating back to 2014 http://community.thomsonreuters.com/t5/EndNote-Product-Suggestions/Export-PDF-annotations-highlight-notes-etc/td-p/59388
However there are ways to export multiple PDFs to a folder (see http://community.thomsonreuters.com/t5/EndNote-How-To/Exporting-PDFs-to-a-separate-folder/td-p/53127 ) in order to then work with them via Acrobat Reader or Pro. Bulk export of comments therefore isn’t great, but is possible.

Papers

Papers is Mac only but does support exporting notes, annotations and comments.
http://support.mekentosj.com/kb/share-share-and-export-collections-and-content/how-to-export-notes-and-annotations-from-papers-3-for-mac

Adobe Acrobat Reader DC

Acrobat Reader enables exporting via FDF (proprietary) and XFDF (XML based) formats (see https://helpx.adobe.com/acrobat/using/importing-exporting-comments.html ) which can be done from the free Acrobat Reader DC (see https://forums.adobe.com/thread/1942791 )

BLOG-image-exportingCommentsFromAcrobatPro

Acrobat Pro

This can be automated to be done in bulk via Acrobat Pro using a script (see https://forums.adobe.com/thread/1385576 ) else Aspose offer a commercial .net library to do this (see https://docs.aspose.com/display/pdfnet/Importing+and+Exporting+Annotations+to+XFDF )

LINK: Example FDF file for comparison with XFDF (proprietary file) https://lancaster.box.com/s/utjh0s72unmfxdvxhuh1xddnn0myxuh5

Mobile Apps

If you’re not using ATLAS.ti for iPad for annotating PDFs (which is great! Unfortunately though there’s no app for iPhone and the Android app doesn’t support PDFs), or MaxQDA app for iOS (iPhone/iPad) or Android, then it is likely that other apps have inserted themselves into the space for reading and annotating.

Popular apps include:

GoodReader

Has excellent export of annotations as a flat text file via email, but doesn’t look set to create XFDF files.

iAnnotate

Similar export options to GoodReader – as a text file identifying properties (page, highlight or underline colour, text highlighted) but no clear pathway to XFDF export.

Code Snippets, APIs and Scripts I’ve identified

Commercial libraries and APIs for .net along with clear articles setting out principles and processes and formats are available from ASPOSE https://docs.aspose.com/display/pdfnet/Importing+and+Exporting+Annotations+to+XFDF

There’s some java code for getting the annotations: https://gist.github.com/i000313/6372210

And a python script to extract PDF comments too https://gist.github.com/ckolumbus/10103544

The XFDF standard

So XFDF is the standard for this area – here’s some more on it:
XFDF ISO Documentation https://www.iso.org/obp/ui/#iso:std:iso:19444:-1:ed-1:v1:en
And these are the latest Q’s on stack overflow
https://stackoverflow.com/questions/tagged/xfdf
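One small practical detail of the format: dates in the Acrobat Reader export are written in the PDF date style (e.g. D:20130615195221+01'00'), whereas the ATLAS.ti XML export uses the more familiar 2017-07-04T09:48:58 style. Here is a minimal sketch of converting one to the other in Python. It only handles the full form seen in my export (plus a trailing Z for UTC); shorter forms of the PDF date are deliberately not covered, and `parse_pdf_date` is my own name for the helper.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the full form seen in the export: D:YYYYMMDDHHMMSS+HH'MM'
# (a trailing Z for UTC, or no offset at all, is also accepted).
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:(Z)|([+-])(\d{2})'(\d{2})'?)?$"
)


def parse_pdf_date(value):
    """Convert a PDF-style date string into a Python datetime."""
    match = _PDF_DATE.match(value)
    if match is None:
        raise ValueError("Unsupported PDF date: " + repr(value))
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    zulu, sign, off_hours, off_minutes = match.groups()[6:]
    if sign:
        delta = timedelta(hours=int(off_hours), minutes=int(off_minutes))
        tz = timezone(delta if sign == "+" else -delta)
    elif zulu:
        tz = timezone.utc
    else:
        tz = None  # no offset given, so leave the datetime naive
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


print(parse_pdf_date("D:20130615195221+01'00'").isoformat())
```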

Part 8) Mapping elements from Acrobat Reader XFDF Export to ATLAS.ti XML Export.

Whilst the inner workings of NVivo are rather obfuscated and it offers no coded export, ATLAS.ti by contrast is somewhat clearer in the ways it works with programme elements which can be exported as XML. (MaxQDA does as well – see http://www.maxqda.com/maxqda-export-options-the-new-xml-export – however I’m only just starting to learn that software and hope to look at this again later.) Whilst there is (as yet) no XML standard for interoperability between CAQDAS packages – something the KWALON project has been working on (see conference report at http://www.dlib.org/dlib/march17/karcher/03karcher.html for an account of the conference session) – nor an option to import the ATLAS.ti XML, it at least gives an opportunity for looking at continuities between XFDF and ATLAS.ti elements for potential import.

My Process for exploring and annotating XFDF and ATLAS.ti XML code:

1 – I marked up a PDF document in Endnote, using highlights, underlines and comments.

BLOG-image-exampleOfUnderlyingAnnotatedPDF

2 – Opened the annotated PDF attachment from Endnote in Acrobat Reader DC. Exported comments from Acrobat Reader as an XFDF file

BLOG-image-PDFannotationPaneInAcrobatPro  > BLOG-image-exportingCommentsFromAcrobatPro

FILE LINK – XFDF export – https://lancaster.box.com/s/edon8znhjh4py9f606t1qtf349vjaq1m

3 – Imported the document into ATLAS.ti Mac and marked it up in an equivalent way to how I envisage import could/would work as outlined above.

BLOG-image-ATLAS.ti marking up PDF

LINK – ATLAS.ti Project bundle https://lancaster.box.com/s/62c6xzeor9t74xoojn78lev6eqxpi7ti

4 – Opened the XFDF file in Dreamweaver to look at the structure, elements and attributes

5 – Exported the ATLAS.ti project as XML and opened that in Dreamweaver to explore the structure, elements and attributes.

BLOG-image-ExportATLAStiXML Screen Shot 2017-07-06 at 13.09.13

ATLAS.ti PROJECT FILE LINK https://lancaster.box.com/s/vx48sl3vixtktukgl5rjzyja0z56pyhr

6 – Commented the two XML files to note continuities and potential equivalencies between them – see below.

Links
Annotated XFDF FILE https://lancaster.box.com/s/tw3qiud5bdxziz08bgzaso26wq8mkn1f
Annotated ATLAS.TI XML FILE https://lancaster.box.com/s/vx48sl3vixtktukgl5rjzyja0z56pyhr

7 – Made all the above available via Box

8 – Added the example code with my annotations below within textarea tags

NEXT STEPS:
9 – Hustle and flatter the awesome ATLAS.ti Mac developer Friedrich Markgraf, aka Fritz, aka @fzwob to read this and think about implementing it 😉

10 – Do the same for NVivo and MaxQDA and see if either the competitiveness of this market or the co-operation of developers around things like XML standards helps get this implemented in one or more packages.

11 – Get on with something less geeky… 😉

Annotated XML Examples

The key annotations here are all between the brackets.

Annotated XFDF File Exported from Acrobat Reader

The following code is displayed based on information on using the sourcecode element – detailed at https://en.support.wordpress.com/code/posting-source-code/.

<!-- XML DTD omitted -->
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<!-- annots collects together all the annotations -->
	<annots>
		<!-- *** highlight *** is one of the main ways of marking up text in a PDF - potentially useful to import as a quotation based on the coords and then add a code of "highlight" along with allocating the same color to the code -->
		<highlight  			color="#FFFF00"  			flags="print"  			date="D:20130615195221+01'00'"  			name="c0096ebd-aa1b-7d48-894a-95b72c9f2399"  			page="0"  			coords="514.652000,326.134000,622.026000,326.134000,514.652000,314.528000,622.026000,314.528000,624.150000,326.139000,781.075000,326.139000,624.150000,314.533000,781.075000,314.533000,471.566000,313.191000,602.565000,313.191000,471.566000,301.585000,602.565000,301.585000,604.330000,313.189000,780.594000,313.189000,604.330000,301.583000,780.594000,301.583000,471.590000,300.231000,540.806000,300.231000,471.590000,288.624000,540.806000,288.624000,542.050000,300.229000,781.168000,300.229000,542.050000,288.623000,781.168000,288.623000,471.490000,287.269000,781.711000,287.269000,471.490000,275.663000,781.711000,275.663000,471.500000,274.299000,780.689000,274.299000,471.500000,262.693000,780.689000,262.693000,471.476000,261.341000,551.463000,261.341000,471.476000,249.734000,551.463000,249.734000,550.690000,261.339000,781.594000,261.339000,550.690000,249.733000,781.594000,249.733000,471.490000,248.379000,774.987000,248.379000,471.490000,236.773000,774.987000,236.773000,471.510000,235.429000,611.514000,235.429000,471.510000,223.823000,611.514000,223.823000" rect="471.476000,223.823000,781.711000,326.139000"  			title="Steve" 			>
			<popup  				flags="print,nozoom,norotate"  				open="no"  				page="0"  				rect="827.640015,206.134003,1007.640015,326.134003" 			/>
		</highlight>
<!-- other lines cut here -->
	<!-- *** underline *** is one of the ways of marking up text in a PDF - potentially useful to import as a quotation based on the coords and then add a code of "underline" along with allocating the same color to the code -->
		<underline  			color="#0000FF"  			flags="print"  			date="D:20130616180638+01'00'"  			name="847814b0-ca2c-434a-bdc1-8fb56b678584"  			page="1"  			coords="71.422000,383.299000,167.391000,383.299000,71.422000,371.805000,167.391000,371.805000,190.030000,383.639000,356.882000,383.639000,190.030000,371.732000,356.882000,371.732000,47.047000,370.332000,51.620000,370.332000,47.047000,358.837000,51.620000,358.837000,52.550000,370.329000,130.752000,370.329000,52.550000,358.835000,130.752000,358.835000,132.380000,370.576000,149.254000,370.576000,132.380000,358.785000,149.254000,358.785000,156.620000,370.689000,331.049000,370.689000,156.620000,358.747000,331.049000,358.747000"  			rect="47.047000,358.747000,356.882000,383.639000"  			title="Steve">
			<popup  				flags="print,nozoom,norotate"  				open="no"  				page="1"  				rect="825.119995,263.298996,1005.119995,383.298996"/>
		</underline>

	<!-- *** text *** is the most important element for importing - these are the comments -->
	<!-- *** color *** attribute could be used to give a color to the element in the CAQDAS package -->
	<!-- *** icon *** attribute could be used to give a code for this element in the CAQDAS package -->
	<!-- *** rect *** is co-ordinates for this comment on the PDF, nearest equivalent would be a selection by area and then coding that -->
	<!-- *** title *** attribute seems to map to author -->
	<text  		color="#FFFF00"  		flags="print,nozoom,norotate"  		date="D:20130616180638+01'00'"  		name="f7a56df4-b0b6-3342-b856-2a54b4bd250b"  		icon="Comment"  		page="1"  		rect="361.296997,333.329010,379.296997,351.329010"  		title="Steve" 	>
		<!-- *** contents *** is the KEY element - this is the actual content of a textual comment -->
		<contents>
			Contrasts with views from Bourdieu where taste is a way of at ratifying and dominating rather than something constructed
		</contents>
		<!-- * popup * appears redundant as this controls the display on screen of the comment which has no equivalent or relevance in CAQDAS packages -->
		<popup  			flags="print,nozoom,norotate"  			open="no"  			page="1"  			rect="396.297000,239.329000,646.297000,351.329000" 		/>
	</text>
</annots>
<!-- **<f>** is the file reference for the file itself - will be essential for co-ordinating the XFDF with the imported file -->
<f href="../Documents/My EndNote Library.Data/PDF/0914600930/Akrich-1992-DeScriptionOfTechnicalObjects_inSh.pdf" />
<ids original="EEE4ED80D36A11E280FEA0F5ADA9D1EA" modified="9C468E0F3E2DC5E695A4B9500B40565A" />
</xfdf>
<!-- remaining code omitted in this illustration -->
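
To make the mapping in the comments above a little more concrete, here is a minimal sketch in Python of pulling the comment text, author, page and rect out of XFDF text elements. It is an illustration under stated assumptions rather than a tested importer: the function names (`local_name`, `extract_text_comments`) are hypothetical, the sample is a trimmed copy of the listing above, and the `xmlns` declaration on the root element is assumed from the XFDF format because the opening tags are not shown in this excerpt.

```python
import xml.etree.ElementTree as ET

# Trimmed sample based on the listing above (popup and most attributes omitted).
# ASSUMPTION: the xmlns on the root element; the opening tags are not shown in the post.
SAMPLE_XFDF = """<xfdf xmlns="http://ns.adobe.com/xfdf/">
<annots>
    <text color="#FFFF00" page="1" rect="361.296997,333.329010,379.296997,351.329010" title="Steve">
        <contents>
            Contrasts with views from Bourdieu where taste is a way of at ratifying and dominating rather than something constructed
        </contents>
    </text>
</annots>
<f href="Akrich-1992-DeScriptionOfTechnicalObjects_inSh.pdf" />
</xfdf>"""


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree puts in front of a tag name."""
    return tag.rsplit("}", 1)[-1]


def extract_text_comments(xfdf_string):
    """Return (source_pdf, comments) where comments is a list of dicts,
    one per <text> annotation. Highlights and underlines are ignored here."""
    root = ET.fromstring(xfdf_string)
    source_pdf = None
    comments = []
    for element in root.iter():
        name = local_name(element.tag)
        if name == "f":
            # <f> is the file reference needed to match the XFDF to the imported PDF
            source_pdf = element.get("href")
        elif name == "text":
            contents = ""
            for child in element:
                if local_name(child.tag) == "contents":
                    # Collapse the whitespace introduced by pretty-printing
                    contents = " ".join((child.text or "").split())
            comments.append({
                "author": element.get("title"),  # <title> seems to map to author
                "page": element.get("page"),
                "color": element.get("color"),
                "rect": [float(v) for v in element.get("rect").split(",")],
                "contents": contents,
            })
    return source_pdf, comments


source_pdf, comments = extract_text_comments(SAMPLE_XFDF)
print(source_pdf)
for comment in comments:
    print(comment["author"], comment["page"], comment["rect"], comment["contents"])
```

The sketch deliberately ignores the popup element, in line with the note above that it has no equivalent in a CAQDAS package.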
 

Annotated ATLAS.ti XML File Exported from ATLAS.ti Mac

The following code is displayed based on information on using the sourcecode element – detailed at https://en.support.wordpress.com/code/posting-source-code/.

<!-- DTD and initial tags omitted -->
<!-- Identifying the primary documents -->
    <primDocs size="2">
        <primDoc name="Akrich-1992-DeScriptionOfTechnicalObjects_inSh.pdf" id="pd_1_1" loc="doc_1" au="Steve Admin" cDate="2017-07-04T09:48:58" mDate="2017-07-04T09:48:58" qIndex="">
			<!-- Identifying start of quotations -->
            <quotations size="12">
				<!-- q is the tag for an individual quotation -->
                <q name="Iamarguing,therefore,thattechnicalobjectsparticipatein   ing heterogeneous networks that bring toget…" id="q1_1_1" au="Steve Admin" cDate="2017-07-04T10:04:34" mDate="2017-07-04T10:04:34" loc="start=368 end=531 startpage=1 endpage=1">
					<!-- ***  content  *** denotes the actual content of the quotation, ie the actual copy on the page, equivalent in XFDF for a highlight would be the mass of co-ords -->
                    <content size="163">

Iamarguing,therefore,thattechnicalobjectsparticipatein   ing heterogeneous networks that bring together actants of all types and sizes, whether human or nonhuman.3

                    </content>
                </q>
                <q name="But how can we describe the specific role they play within these networks? Because the answer has to…" id="q1_2_2" au="Steve Admin" cDate="2017-07-04T10:04:40" mDate="2017-07-04T10:04:40" loc="start=532 end=820 startpage=1 endpage=1">
                    <content size="288">

But how can we describe the specific role they play within these networks? Because the answer has to do with the way in which they build, maintain, and stabilize a structure of links between diverse actants, we can adopt neither simple technological determinism nor social constructivism.

                    </content>
                </q>
				<!-- q is the tag for a quotation for an area of the PDF that is empty - equivalent to the display of the comment icon on screen. The loc values map to rect values for text element in XFDF -->
                <q name="Quotation 1:3" id="q1_3_3" au="Steve Admin" cDate="2017-07-04T10:06:18" mDate="2017-07-04T10:12:16" loc="x=359 y=338 width=23 height=23 page=1">
					<!-- A *** comment *** with a type of text is equivalent to the contents element within the text element in XFDF -->
                    <comment type="text/html" size="121">

Contrasts with views from Bourdieu where taste is a way of at ratifying and dominating rather than something constructed

                    </comment>
                </q>
                <q name="To do this we have to move constantly between the technical and the social" id="q1_4_4" au="Steve Admin" cDate="2017-07-04T10:06:31" mDate="2017-07-04T10:06:31" loc="start=3748 end=3822 startpage=1 endpage=1">
                    <content size="74">

To do this we have to move constantly between the technical and

the social

                    </content>
                </q>
                <q name="To do this we have to move constantly between the technical and the social." id="q1_5_5" au="Steve Admin" cDate="2017-07-04T10:07:16" mDate="2017-07-04T10:07:16" loc="start=3748 end=3823 startpage=1 endpage=1">
                    <content size="75">

To do this we have to move constantly between the technical and

the social.

                    </content>
                </q>
                <q name="echnological determinism pays no attention to what is brought together, and ultimately replaced, by…" id="q1_7_6" au="Steve Admin" cDate="2017-07-04T10:08:13" mDate="2017-07-04T10:08:13" loc="start=827 end=1070 startpage=1 endpage=1">
                    <content size="243">

echnological determinism pays no attention to what is brought together, and ultimately replaced, by the structural effects of a net- work. By contrast social GO tivi denies the Q.bchu:a"C_J ofobjects and assumes that oul peupi ean ave at1Js s.

                    </content>
                </q>
                <q name="The boundary is turned into a line of demarcation traced, .. within a geography ofdelegation,4 betwe…" id="q1_8_7" au="Steve Admin" cDate="2017-07-04T10:08:33" mDate="2017-07-04T10:08:33" loc="start=4051 end=4232 startpage=1 endpage=1">
                    <content size="181">

The boundary is turned into a line of demarcation traced, ..

within a geography ofdelegation,4 between what is assumed by the technical object and the competences of other actants.

                    </content>
                </q>
                <q name="the description of these elementary mechanisms ofad- justment poses two problems, one ofmethod and t…" id="q1_9_8" au="Steve Admin" cDate="2017-07-04T10:09:09" mDate="2017-07-04T10:09:09" loc="start=4241 end=4365 startpage=1 endpage=1">
                    <content size="124">

the description of these elementary mechanisms ofad- justment poses two problems, one ofmethod and the other ofvocab- ulary.

                    </content>
                </q>
                <q name="Quotation 1:10" id="q1_10_9" au="Steve Admin" cDate="2017-07-04T10:09:54" mDate="2017-07-04T10:09:54" loc="x=361 y=245 width=22 height=21 page=1"/>
                <q name="Quotation 1:11" id="q1_11_10" au="Steve Admin" cDate="2017-07-04T10:10:01" mDate="2017-07-04T10:10:01" loc="x=362 y=183 width=20 height=27 page=1">
                    <comment type="text/html" size="265">

Hugely significant para and one to empirically investigate in my data: firstly to what extent do style guides constrain how bodies relate to tasted objects, and second how can these links be characterised, how far can style guides be re-shaped, manipulated or used?

                    </comment>
                </q>
                <q name="Quotation 1:12" id="q1_12_11" au="Steve Admin" cDate="2017-07-04T10:10:09" mDate="2017-07-04T10:10:09" loc="x=362 y=108 width=27 height=28 page=1">
                    <comment type="text/html" size="193">

Competences being significant here as it is that competency that is being assessed, but the assessment is contingent on knowing, remembering and applying (implicitly accepting) the style guides

                    </comment>
                </q>
                <q name="Quotation 1:13" id="q1_13_12" au="Steve Admin" cDate="2017-07-04T10:10:51" mDate="2017-07-04T10:10:51" loc="x=361 y=156 width=32 height=28 page=1">
                    <comment type="text/html" size="61">

Boundary here, does or can this relate to "boundary objects"?

                    </comment>
                </q>
            </quotations>
        </primDoc>
        <primDoc name="Back - 2012 - Tape recorder-annotated.pdf" id="pd_2_2" loc="doc_2" au="Steve Admin" cDate="2017-07-04T10:01:28" mDate="2017-07-04T10:12:34" qIndex="">
            <quotations size="0"/>
        </primDoc>
    </primDocs>
    <codes size="2">
		<!-- codes is the list of codes - potentially used to transfer highlight types in with the name equalling their colour?-->
        <code name="highlight color=yellow" id="co_1" au="Steve Admin" cDate="2017-07-04T10:06:49" mDate="2017-07-04T10:06:49" color="" cCount="0" qCount="5"/>
        <code name="underline" id="co_2" au="Steve Admin" cDate="2017-07-04T10:08:21" mDate="2017-07-04T10:08:21" color="" cCount="0" qCount="1"/>
    </codes>
<!-- remaining code omitted in this illustration -->
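
As a rough illustration of the mapping suggested in the comments above (the rect of an XFDF text element onto the loc of an area quotation, and its contents onto a comment), here is a minimal Python sketch. Everything about the conversion is an assumption drawn only from comparing the two listings: that rect is ordered left, bottom, right, top; that both formats share the same co-ordinate origin; and that page numbers can be passed through unchanged. The function names are hypothetical, and attributes such as id, size and cDate are left out because their rules are not clear from the listing.

```python
import xml.etree.ElementTree as ET


def rect_to_loc(rect, page):
    """Turn an XFDF-style rect [left, bottom, right, top] into a loc string
    shaped like the ATLAS.ti area quotations in the listing above.
    ASSUMPTION: same origin and units in both formats; page passed through as-is."""
    left, bottom, right, top = rect
    x = round(min(left, right))
    y = round(min(bottom, top))
    width = round(abs(right - left))
    height = round(abs(top - bottom))
    return f"x={x} y={y} width={width} height={height} page={page}"


def comment_to_quotation(comment, name):
    """Build a <q> element with a nested <comment>, mirroring the listing.
    Only name, au and loc are set; id, size and dates are omitted (rules unknown)."""
    q = ET.Element("q", {
        "name": name,
        "au": comment["author"],
        "loc": rect_to_loc(comment["rect"], comment["page"]),
    })
    body = ET.SubElement(q, "comment", {"type": "text/html"})
    body.text = comment["contents"]
    return q


# A dictionary of the kind one might extract from an XFDF <text> element
example_comment = {
    "author": "Steve",
    "page": "1",
    "rect": [361.296997, 333.329010, 379.296997, 351.329010],
    "contents": "Contrasts with views from Bourdieu where taste is a way of at "
                "ratifying and dominating rather than something constructed",
}

quotation = comment_to_quotation(example_comment, "Quotation 1:3")
print(ET.tostring(quotation, encoding="unicode"))
```

Note that the values this produces will not exactly match the hand-made quotation in the listing (loc="x=359 y=338 width=23 height=23 page=1"), since that area was drawn by eye in ATLAS.ti rather than converted from the XFDF rect.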
 

Part 9) Concluding thoughts (and anticipating objections)

So that’s been rather long, but hopefully with some point and use value! However, it’s clear that development priorities are always set to allocate limited resources across a never-ending list of fixes and improvements. And although this question comes up so often when I’m teaching, whether it has registered in terms of “user requests” is an unknown.

There are also two probable lines of objection I anticipate:

Developers – this is too difficult/varied/complex for a marginal benefit.

Companies/Sales/Marketing – this is too complex to do slickly and simply for our users.

Potential approaches to mitigate these objections:

Lots of tech companies are enabling “experimental features” – for example Tumblr (https://www.theverge.com/2016/5/11/11655050/tumblrs-new-labs-program-lets-users-test-experimental-features), Google Chrome (http://ccm.net/faq/32470-google-chrome-how-to-access-and-enable-experimental-features) and Firefox (https://developer.mozilla.org/en-US/Firefox/Experimental_features).
This approach enables development, prototyping and beta testing, followed by an experimental/opt-in release for a self-selecting group of typically more advanced users. It’s like an extra beta test and can do several key things:

  1. Enable engagement with a skilled user base for a practical pre-release test period.
  2. Build a relationship with users who suggest features and develop what amount to support materials and workarounds – helping those working on programme documentation.
  3. Create a space for features where the expectation is that the user may need to do some work, or define some procedures and processes, to get data to the stage needed for import – thus reducing the developer load.

(In this model an interim stage may be that advanced users who opt in can import comments from Mendeley, but they either have to export one-by-one or use a third-party tool. Once they’ve done what’s needed, the experimental feature does the import they requested. It then becomes an imperative on the RM user base to request a feature for bulk export of annotated PDFs from their respective RM manufacturer or consortium, or via third-party development – which sets up Mendeley and Zotero to do this quickly, whilst Endnote developers Thomson Reuters are pretty poor at responding to feedback and requests, certainly in my experience!)

These then become potentially powerful ways of improving a product pre-launch, but also of showing a more engaged and open way of working with a user base. Furthermore, this sort of approach might enable some more collaborative and innovative ways of trialling new features, collecting feedback and even crowd-sourcing support and documentation.

Conclusion:

So there we have it – ideas and approaches to improving lit import for PDF notes along with a bunch of ideas about working with lit in CAQDAS and relationships between practices. I personally think the prize for “converting” new users to a product might be quite significant as whoever nails it first and/or best can expect to have a real jump in usage if other factors are equal.

Next steps include looking at MaxQDA more to explore ideas for import there – however the programmers there are VERY adept and I hope there’s enough here to support translation into their architecture and terminology.

Anyway, thanks for reading, PLEASE comment. Oh, and if anyone thinks some of this might be worth presenting or publishing (in a newsletter for a company? A book chapter? A practitioner journal, or in a different form in an academic journal?) then suggestions are VERY welcome too.

References

Barsky, E. (2010). Mendeley. Issues in Science and Technology Librarianship, Summer. doi:10.5062/F4S46PVC http://www.istl.org/10-summer/electronic.html

Bazeley, P. (2013). Qualitative data analysis : practical strategies. London: SAGE. https://uk.sagepub.com/en-gb/eur/qualitative-data-analysis/book234222

Gilmour, R., & Cobus-Kuo, L. (2011). Reference management software: a comparative analysis of four products. Issues in Science and Technology Librarianship, 66(66), 63-75. http://www.istl.org/11-summer/refereed2.html?a%5C_aid=3598aabf

Mead, T. L., & Berryman, D. R. (2010). Reference and PDF-manager software: complexities, support and workflow. Medical Reference Services Quarterly, 29(4), 388-393. doi:10.1080/02763869.2010.518928 http://dx.doi.org/10.1080/02763869.2010.518928

Phelps, R., Fisher, K., & Ellis, A. (2007). Organizing and managing your research: a practical guide for postgraduates. London: SAGE. https://uk.sagepub.com/en-gb/eur/organizing-and-managing-your-research/book228894

Silverman, D. (2013). Doing qualitative research. London: SAGE. https://uk.sagepub.com/en-gb/eur/doing-qualitative-research/book239644

Appendix 1 – Lit Import Development and History into the leading CAQDAS packages

Lit import into NVivo arrived in version 9 (http://help-nv9-en.qsrinternational.com/procedures/exchange_data_between_nvivo_and_reference_management_tools.htm ) and has remained relatively stable since – importing RIS information into the source classification sheet as well as the document description and a linked memo. The full text is imported with any highlighting visible and can then be annotated and coded.

Lit import into ATLAS.ti only came in much more recently, with an update to version 8 (see http://atlasti.com/2017/02/09/lit-reviews/ and http://downloads.atlasti.com/docs/whatsnew8.pdf)

MaxQDA introduced literature import in v11 in 2012. They have brought increasing focus to this through providing a guide to lit reviews for users http://www.maxqda.com/maxqda-literature-reviews-reference-management-software

Appendix 2 – Details of Lit management Apps

Mendeley:

Mendeley is popular, based on a freemium model and – from my perspective at least – made a BIG impact on changing the view of the potential for reference management software to become a core part of the research process, far beyond the basic origins of compiling reference lists on a single workstation. It has extensively supported working across computers via cloud sync, as well as having a very slick way of annotating PDFs on screen and being able to search those notes (see https://blog.mendeley.com/2012/08/28/how-to-series-how-to-search-your-notes-and-other-fields-part-10-of-12/)

Some Mendeley history:

Inception in 2008 (https://blog.mendeley.com/2008/03/11/hello-world/)
Launch of iPhone app in 2010 ( https://blog.mendeley.com/2010/07/21/our-first-iphone-app-has-arrived/ )
Improvements to app in 2011 (https://blog.mendeley.com/2011/05/23/mendeley-ios-app-gets-an-update/ )

Endnote:

Endnote has been around for a long time to manage reference lists in Word. Mendeley came along and kind of re-wrote what reference management software could achieve, in terms of not just being about citing work but actually integrating into the whole process of locating, grouping, reading and annotating, then citing. Endnote has been playing catch-up for years, with a few bumps and BAD mis-steps on the road (like trying to sue the open-source competition: https://en.wikipedia.org/wiki/EndNote#Legal_dispute_with_Zotero )
In terms of functionality it finally got to where Mendeley was in 2008 about five years late with the launch of X7 in 2013 (see ref: Endnote version history: https://en.wikipedia.org/wiki/EndNote#Version_history_and_compatibility ) – though in a FAR less well-designed or easy-to-use way that still feels clunky and retrofitted not designed-in.

However, the mobile implementation was also a challenge (a high price for the app initially at £12.99, with start-of-year sales, then dropped to £2.99, now free). Initially, with version 1 (launched Jan 25th 2013), it was VERY limited: you could (literally) scribble on your iPad screen and nothing more. It was only with the release of 1.1 on Jan 31st 2014 that the Mendeley-type functionality became available:

Version 1.1 (Jan 31, 2014)

– Expanded set of PDF annotation tools include inserting notes, highlighting, underlining, shapes, strikethrough and free hand drawing
– PDF annotations made on EndNote desktop or online can be viewed, edited, and searched in the app
– PDF annotations made in older versions of the app will be saved and made editable with the new tools
– New Reference Types include Podcast, Press Release, and Interview
– Updated Reference Types include Conference Paper, Blog, Data set, Thesis, and Manuscript
Details from – https://www.appannie.com/apps/ios/app/endnote-for-ipad/details/

7 thoughts on ““But can I bring my notes?” Ideas and investigations on improving Literature Import into CAQDAS Software”

  1. This is a fascinating exploration. I have been thinking the same about these QDA applications. If they can support the standard Adobe annotations, the flow between the annotation of a pdf and the information retrieval using the QDA application would be plausible. Right now, it is a huge barrier.

    1) The QDA application developers need not worry about the quirks of every reference manager. They need to support just the standard Adobe annotations. It is the responsibility of the annotation software to make its system compatible with the standard system.
    2) The workflow is much more plausible with MaxQDA specifically because the PDF doesn’t even have to be imported. You can annotate the pdf both inside and outside MaxQDA at the same time, and keep the flow in sync. This is because MaxQDA can store the pdf outside of its bundle.

    We have problems with the licensing of ATLAS.ti. Here in my circle, we are advising students not to fall for ATLAS.ti because of the licensing issue. Once they are out of school, they cannot keep the data that they stored in ATLAS.ti because the license is a dangerous lock-down (assume a student forgets to export her data: she is basically screwed). MaxQDA is a tool to invest one’s time in. NVivo is also getting better. So far, the Mac version is not as good as the Windows version. For that reason, it would be interesting if these features could be implemented in MaxQDA and NVivo.

    Thanks for the write up. This is an interesting area to explore.

    1. Hi Dellu,

      Summer holidays are now over so I can finally formulate a proper reply!

      On your point 1, I would entirely agree – assuming that the Adobe standards are usable of course!

      I’d like to know more about point 2. Specifically – is MaxQDA able to search and code the changing content of a document if it is amended outside the package? Or would changes and annotations not be indexed or codable within MaxQDA?

      With regard to ATLAS.ti one of the things I really like is that the license is permanent but for a limited project. However MaxQDA have really led on having a “viewer” package so that the project can still be viewed (and shared with a supervisor or external examiner or collaborators) without necessitating them buying the software. I anticipate that both NVivo and ATLAS.ti will be working on exactly this.

      I sometimes refer to MaxQDA as “betaMaxQDA” as it reminds me so much of the betamax video format (see https://en.wikipedia.org/wiki/Betamax) – much better in its technological implementation and quality than the competitor VHS format and used in industry, but unable to make the sort of market penetration into consumer and semi-pro fields that it arguably should have done. However, as Woods, Paulus, Atkins & Macklin’s (2016) lit review of academic studies giving detail of the CAQDAS packages used (see https://scholar.google.co.uk/citations?view_op=view_citation&hl=en&user=BfiDInEAAAAJ&citation_for_view=BfiDInEAAAAJ:7xtfDMUJqJkC ) shows, MaxQDA is a minority player, with NVivo leading by a large margin and ATLAS.ti following.

      Steve

      References:
      Woods, M., Paulus, T., Atkins, D. P., & Macklin, R. (2016). Advancing qualitative research using qualitative data analysis software (QDAS)? Reviewing potential versus practice in published studies using ATLAS.ti and NVivo, 1994–2013. Social Science Computer Review, 34(5), 597-617.

      1. Right now, MaxQDA doesn’t understand the standard annotations. That is my experience in the Mac environment, at least. I just thought MaxQDA is in a good position to implement the feature.

        As for the popularity of the applications, it usually tells you more about the advertising that the companies did than about the quality of the packages, really. Probably NVivo has done more aggressive advertising, or they are more corporate friendly. For most people, the reason why they use one package over another usually boils down to which one they have for free. Many universities, including our own, offer NVivo licenses for free.

        “With regard to ATLAS.ti one of the things I really like is that the license is permanent but for a limited project.”: how come the license is permanent? The license actually expires, and you have no way of opening your project once the license has expired.

  2. Sounds like the workflow might then be 1) import RIS file into Mendeley, 2) add the PDF to the RIS (again in Mendeley), 3) then export to NVivo — and then 4) read and do your markup in NVivo. A bit complicated — unless I’m missing something?

  3. That’s pretty much it at the moment. I’d generally recommend reading initially in Mendeley/Ref Mgmt software and then considering importing selected papers into CAQDAS. There’s an excellent blog post from Christina Silver about good approaches to critical evaluation and really making use of the memos (and coding the memo):
    https://www.qdaservices.co.uk/post/the-crux-of-literature-reviewing-using-caqdas-packages
