Transcribing with Microsoft Word Online and CAQDAS packages

This blog post outlines the steps of working with MS Word Online (part of Office 365) to generate automatic transcripts and import them into your CAQDAS package.

By bringing audio and synchronised transcripts into a CAQDAS package you gain the opportunity to engage with the data as you correct the transcript. This brings the immersion that is often touted as the key benefit of manual transcription, links it with the tools you’ll use for analysis (annotation, memoing, coding), and adds the efficiencies of automated transcription: typically the process speeds up from 5-7 hours of manual transcription per hour of audio to 1-2 hours of correction and engagement per hour of auto-transcribed audio.


Why use Word?

Working with Word – video

Preparing and importing a transcript into ATLAS.ti

Preparing and importing a transcript into MAXQDA

Preparing and importing a transcript into NVivo

Why use Word?

Microsoft Word transcription has a LOAD of advantages:

  1. First and foremost – it’s free to many students, researchers and academics as part of an institutional Office 365 license or their own personal Office 365 subscription.
  2. It’s VERY good and amazingly accurate.
  3. As a result of the first, it is likely to be approved under your institutional data management policies for REC purposes and research data, as it will be covered by the Office 365 agreement (e.g. see this from Lancaster University).
    • While a lot of students may use Descript, Trint, or others, these probably aren’t compliant with ethics requirements on research data!
  4. It’s multi-lingual (though the documentation claims otherwise – so it’s unclear how multilingual!)
  5. It’s familiar.
  6. It’s simple.
  7. There’s good documentation, with detailed step-by-steps.

So that’s a pretty powerful and solid list.

Limitations, Options and considerations

There’s a key limitation with Word: time. You have up to 5 hours (300 minutes) per month. After that you can’t transcribe. This may change (it could become a charged service for more, who knows). There’s information and detailed step-by-steps for the Word part here; however, it’s not entirely accurate, as transcription can and does work in languages other than EN-US.
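To see how that monthly allowance plays out across a project, here is a small Python sketch that groups recordings into months. It rests on two assumptions made purely for illustration: that a recording has to fit entirely within the minutes still available in a month, and that you transcribe recordings in the order listed. `plan_months` is a hypothetical helper name, not part of any Microsoft tool.

```python
def plan_months(recording_minutes, limit=300):
    """Group recordings (lengths in minutes) into months under a monthly limit.

    Assumption: a recording cannot be split across two months, so it moves
    to the next month if it does not fit in the remaining allowance.
    """
    months = [[]]
    remaining = limit
    for minutes in recording_minutes:
        if minutes > limit:
            raise ValueError(f"A {minutes}-minute recording exceeds the monthly limit")
        if minutes > remaining:
            # Not enough allowance left: start a new month.
            months.append([])
            remaining = limit
        months[-1].append(minutes)
        remaining -= minutes
    return months


if __name__ == "__main__":
    interviews = [90, 120, 75, 60, 200]
    for number, batch in enumerate(plan_months(interviews), start=1):
        print(f"Month {number}: {batch} ({sum(batch)} minutes)")
```

With five interviews like these you would need two months of allowance, which is worth knowing before you plan your fieldwork timeline.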

There are two key options: synchronised or not.

Why synchronise?

Synchronisation allows you to listen to the audio/view the video that accompanies a segment of transcription. This has a lot of potential for analysis, as listening to the audio gives you the opportunity to engage with (and add to your analysis) the three Ts of spoken language: Tone, Tempo and Tenor (or Timbre), which all carry a LOT of meaning that is lost when spoken interaction is reduced to the written word. Don’t believe me? Try relaxing and unwinding to this.

With video there is even more to be gained by synchronising a transcript to the audio, and the opportunity is then there to add additional information to the transcript to make it a visual transcript too.

So, where synchronisation is easy and you can work with the text easily with or without the synchronised audio/video (as is the case with ATLAS.ti and MAXQDA) this, to me, is a bit of a no-brainer: import synchronised.

Why not synchronise?

This is perhaps a more pragmatic decision where technology is going to get in the way. Unfortunately, with NVivo, synchronising is substantially more fiddly and error-prone, and usually involves quite a lot of to-and-fro correction, so it’s worth thinking a little more about costs vs benefits for correction and engagement.

Working with Word

The video below shows some basic steps for working with MS Word.

Preparing and importing a transcript into ATLAS.ti

Preparing transcripts in Word

Importing audio and transcript into ATLAS.ti 9

The (awesome) ATLAS.ti focus group coding tool requires the speaker name to be on a new line either preceded by @ or followed by :

So, when renaming the speakers in Word, include a colon after the speaker name.

OR, if you forget, add it when running the search and replace described below.

The main search and replace would be to change: timestamp, space, speaker id, paragraph mark (^p), text

00:00:01 SW
content here

To: timestamp, paragraph mark (^p), speaker id, colon, tab (^t), text

00:00:01
SW:   content here

So if the speaker ID was SW it would be

Search for: SW^p

Replace with: ^pSW:^t
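If you have lots of transcripts or lots of speakers, the same rearrangement can be scripted rather than run once per speaker. What follows is a minimal Python sketch, not part of the Word or ATLAS.ti workflow itself. It assumes the transcript has been saved as plain text with each turn laid out as HH:MM:SS, a space and the speaker name on one line, and the spoken text on the next line. The function name `to_atlasti` and the pattern are illustrative assumptions. Unlike the search and replace above, it also drops the space left behind after the timestamp.

```python
import re

# Assumed layout of each turn: "HH:MM:SS Speaker" on one line,
# followed by the spoken text on the next line.
TURN = re.compile(r"^(\d{2}:\d{2}:\d{2}) (.+)\n(.+)$", re.MULTILINE)


def to_atlasti(transcript: str) -> str:
    """Put each speaker on a new line as 'Speaker:<tab>text'."""
    return TURN.sub(r"\1\n\2:\t\3", transcript)


if __name__ == "__main__":
    sample = "00:00:01 SW\ncontent here\n\n00:00:09 JB\nmore content"
    print(to_atlasti(sample))
```

Because the pattern captures whatever follows the timestamp as the speaker, it handles every speaker in one pass, but check a few turns by eye before importing.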

Preparing and importing a transcript into MAXQDA

The video below shows this process. (Please bear with me: I need to work more on grasping the process of speaker coding via search and code before I document this further here.)

The paraphrase function in MAXQDA makes it an amazing tool for this process as it’s so good at supporting the move from correction and making notes to coding.

Preparing and importing a transcript into NVivo

With NVivo the question of whether or not to import synchronised is worth a little more consideration.

Importing is a bit of a pain: the steps are more complex, and the extra steps needed to debug the import make it harder still. The interface is a bit clunky and it’s harder to see, code and work with the transcript.

Preparing an unsynchronised transcript

So… if you’re happy with just correcting the transcript in Word and don’t really need to engage with the three Ts, then consider sticking to that: just export with speaker names and import into NVivo as a document.

I’d definitely recommend having a second document open (e.g. in a text editor or notes app like OneNote) as you make your corrections in Word, so you can make notes and reflections as you work through correcting the transcript. You’d then be able to add that as a linked source memo in NVivo to connect those initial notes from correcting.

Preparing a synchronised transcript to import into NVivo: step-by-step

To make this work effectively you’ll need to convert the text into a table and number the rows – this makes auto-coding for speaker and debugging errors WAY easier. It’s not too hard – honest!

Preparing Transcripts for NVivo


  1. Search and replace in Word to get the timestamp, speaker name and transcript text onto a single row
  2. Check it’s correct
  3. Convert the text to a table and number the table rows
  4. Close the document
  5. Import
  6. Use error messages and table row info to debug
  7. Import again
  8. Repeat 6 & 7 till it works
  9. Listen, correct and engage with the data through annotating
  10. Write up reflections and insights in a memo
  11. Auto-code by speaker name

The video below takes you through this step-by-step.


00:00:01 SW
Content here

The main search and replace would be to change: timestamp, space, speaker id, paragraph mark (^p), text

To: timestamp, tab (^t), speaker id, tab (^t), text

00:00:01   SW   Content here

If the speaker ID was SW it would be

Search for: SW^p

Replace with: ^tSW^t

You’d then insert the column to the left and auto-number.
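The same preparation (one row per turn, with a row number in front) can also be sketched in Python if you’d rather not do it by hand. This is an illustration under assumptions, not NVivo’s or Word’s own tooling: it assumes a plain-text transcript where each turn is HH:MM:SS, a space and the speaker name on one line, with the text on the next line. `to_nvivo_rows` is a hypothetical helper name.

```python
import re

# Assumed layout of each turn: "HH:MM:SS Speaker" on one line,
# followed by the spoken text on the next line.
TURN = re.compile(r"^(\d{2}:\d{2}:\d{2}) (.+)\n(.+)$", re.MULTILINE)


def to_nvivo_rows(transcript: str) -> list[str]:
    """Return numbered, tab-separated rows: number, timestamp, speaker, text."""
    rows = []
    for number, match in enumerate(TURN.finditer(transcript), start=1):
        timestamp, speaker, text = match.groups()
        rows.append("\t".join([str(number), timestamp, speaker, text]))
    return rows


if __name__ == "__main__":
    sample = "00:00:01 SW\nContent here\n\n00:00:09 JB\nMore content"
    print("\n".join(to_nvivo_rows(sample)))
```

The tab-separated output can then be pasted into Word and converted to a table using tabs as the separator, which gives you the numbered rows that make debugging import errors so much easier.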

Importing transcripts into NVivo

Having prepared the transcript, you then import it. As you’ll see, you usually cycle back a few times, and you’ll find that working in a table with auto-numbered rows is invaluable for this!