Using ATLAS.ti to Correct and Code Automatically Generated Transcripts from Teams or Zoom

This post takes you through the process of automatically creating a full written transcript for an audio or video file and importing it into ATLAS.ti to correct and code.

There is now excellent documentation of this in the online manual for ATLAS.ti Mac and the online manual for Windows, so this post is more focused on getting those transcripts out of other platforms and on the opportunities for analysis once they are imported into ATLAS.ti.

ATLAS.ti has led the way in making this really easy, and it cuts out a step because it cleans the subtitle file on import.

UPDATES:

ATLAS.ti has great documentation of update changes here.
March 2021 – VTT and SRT import supported in Windows.

December 2020 – VTT and SRT import added in Mac

See this post for a cross-platform text step-by-step with lots of links to how to’s and documentation etc.

Prerequisites

The following are important prerequisites. You will need:

  1. A media file that is either:
    1. A recording within Microsoft Teams saved to Stream
      OR
    2. A media file you can convert and upload to Microsoft Stream*
      OR
    3. An audio or video recording through an institutionally licensed Zoom account (with subtitling enabled)
      OR
    4. A recording from another system that outputs a subtitle file (that you will then convert to VTT)
  2. Installed version of ATLAS.ti v9
  3. Installation of the free VLC media player

Process

  1. Create a media file with subtitle file in VTT format
  2. Download the media file and the subtitle file
  3. Clean the subtitle file ready for import.
  4. Import the media file into ATLAS.ti
  5. Import the cleaned subtitles as a synchronised transcript in your CAQDAS package
  6. Listen to the media file and read the synchronised transcript in order to begin analysis through
    • Correcting the transcript
    • Labeling speakers
    • Making notes (annotation)
    • Initial coding of the transcript

Step One – Create a media file with subtitle file in VTT format

Depending where you start there are a few ways this will work – all have the same end point: a media file and a VTT transcript. It’s all detailed over in this post.

The introductory video was created with Teams, another was created in Zoom. You can also (currently) upload videos to Stream or use a wide range of other applications and systems to create an automatic transcript of a media file.
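Whichever route you take, it can be worth a quick sanity check that what you ended up with really is a VTT file with timed cues before you try to import it. The following is only a minimal sketch, assuming a standard WebVTT layout (a WEBVTT header line followed by cues with "start --> end" timing lines). The function names and the sample text are made up for illustration and are not part of Teams, Zoom, Stream or ATLAS.ti.

```python
import re

# Matches a WebVTT cue timing line, e.g. "00:00:01.000 --> 00:00:04.500".
# The hours part is optional in WebVTT, so "00:01.000" is accepted too.
CUE_TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def to_seconds(stamp: str) -> float:
    """Convert 'hh:mm:ss.ttt' or 'mm:ss.ttt' to seconds."""
    parts = stamp.split(":")
    seconds = float(parts[-1])
    minutes = int(parts[-2])
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return hours * 3600 + minutes * 60 + seconds


def summarise_vtt(text: str) -> dict:
    """Return a few basic facts about the contents of a VTT file."""
    lines = text.lstrip("\ufeff").splitlines()  # drop a byte order mark if present
    has_header = bool(lines) and lines[0].startswith("WEBVTT")
    cue_ends = []
    for line in lines:
        match = CUE_TIMING.match(line.strip())
        if match:
            cue_ends.append(to_seconds(match.group(2)))
    return {
        "has_header": has_header,
        "cue_count": len(cue_ends),
        "last_cue_end_seconds": max(cue_ends) if cue_ends else 0.0,
    }


# Hypothetical sample; in practice you would read your downloaded file instead,
# e.g. open("interview.vtt", encoding="utf-8").read()
sample = """WEBVTT

00:00:01.000 --> 00:00:04.500
Hello and welcome.

00:00:04.500 --> 00:00:09.250
Thanks for joining me today.
"""

print(summarise_vtt(sample))
```

If the header is missing or the cue count is zero, the file is probably in another subtitle format (or is empty) and needs converting first. If the last cue ends well before the end of your recording, the transcript may be incomplete.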

Step Two – Download the media file and the subtitle file

Here’s a copy of the interview video you can download and a VTT file if you want to try it:

Interview with Friedrich Markgraf (mp4 42Mb)

VTT file from Stream.

Step Three – Clean the subtitle file ready for import using an online tool

This step is no longer needed in ATLAS.ti as native support for VTT import is now enabled.

Option 1 – Clean the VTT file into CAQDAS ready format online

Go to https://www.lancaster.ac.uk/staff/ellist/vtttocaqdas.html

Upload your VTT file, Click convert, download the text file.

Option 2 – create your own copy of the converter

Go to the GitHub page at https://github.com/TimEllis/vttprocessor
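If you are curious what a cleaning step like this involves, here is a minimal sketch in Python. To be clear, this is not Tim Ellis's converter and I have not copied its logic. It simply assumes a standard WebVTT file and writes one line per cue in a "[hh:mm:ss] text" layout. That output layout is an assumption for illustration, so check what timestamp format your own CAQDAS package expects before relying on it.

```python
import re

# Captures the cue start time without the milliseconds, e.g. "00:00:01"
# from "00:00:01.000 --> 00:00:04.500".
TIMING = re.compile(r"^((?:\d{2,}:)?\d{2}:\d{2})\.\d{3}\s+-->")
# Matches inline WebVTT tags such as <v Speaker Name>, </v> or <c.colour>.
TAG = re.compile(r"<[^>]+>")


def vtt_to_transcript(text: str) -> str:
    """Turn WebVTT text into one '[start] spoken text' line per cue."""
    text = text.lstrip("\ufeff").replace("\r\n", "\n")
    out = []
    # Cues (and the header and NOTE blocks) are separated by blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        for i, line in enumerate(lines):
            match = TIMING.match(line)
            if match:
                # Everything after the timing line is the spoken text;
                # anything before it (a cue number or ID) is dropped.
                spoken = " ".join(TAG.sub("", ln) for ln in lines[i + 1:]).strip()
                if spoken:
                    out.append(f"[{match.group(1)}] {spoken}")
                break
    return "\n".join(out)


# Hypothetical sample, not taken from the interview file above.
sample = """WEBVTT

1
00:00:01.000 --> 00:00:04.500
<v Interviewer>Hello and
welcome.</v>

NOTE confidence 0.9

2
00:00:04.500 --> 00:00:09.250
Thanks for joining me today.
"""

print(vtt_to_transcript(sample))
# [00:00:01] Hello and welcome.
# [00:00:04] Thanks for joining me today.
```

Blocks without a timing line (the WEBVTT header and NOTE comments) are skipped, and multi-line cues are joined into a single line so each timestamp sits next to the text it belongs to.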

Step Four – Import the media file into your CAQDAS package

This varies a little between packages. The previous difference between them no longer applies: you can now edit timestamps in both Mac and Windows. However, as these are auto-generated, you shouldn’t need to.

ATLAS.ti 9 Windows

It’s now well documented in the online manual for Windows.

There is information on page 11 of the manual and details here about the Windows-supported media formats used by ATLAS.ti.

Details of adding documents to a project are in the online quick tour documentation here and in the manual on page 24. Details about working with transcripts are on page 10.

ATLAS.ti 9 Mac

Working with transcripts is covered in the online manual for ATLAS.ti Mac, and adding documents to ATLAS.ti for Mac is covered in the online quick tour here.

There is further information in the online manual for ATLAS.ti Mac about transcript formats on page 48 and about adding media files on page 51. There is also extensive information about working with transcripts on pages 52-54.

Step Five – Import the cleaned subtitles as a synchronised transcript

ATLAS.ti 9 Windows

There is now excellent documentation on this process in the online manual, and much improved information on editing transcripts in the 90 page “quick (?! :-O ) tour” manual (pages 18-19).

ATLAS.ti 9 Mac

There is excellent information in the online manual for ATLAS.ti Mac about transcript formats on page 48 and about adding media files on page 51. There is also extensive information about working with transcripts on pages 52-54 – again there is at present no information on editing the transcript and correcting it – so here’s a video:

Step Six – Listen to the media and correct the transcript (and begin initial analysis steps)

So this is where it all pays off!

This process allows you to now use the powerful tools within the CAQDAS package to play back the audio / video (including slowing playback speed, adjusting volume and setting rewind intervals when you press play/pause + keyboard shortcuts for the play/pause functions) whilst you read the transcript and make corrections. But not only corrections! You can also annotate the transcript and even start coding at this stage.

ATLAS.ti 9 Windows

ATLAS.ti Mac

Resources

Here’s the ATLAS.ti file (89Mb) with one corrected plus focus group coded transcript and several uncorrected transcripts from the videos above if you want to have a look / play.

The blog bit – background, next steps, context

So this has been a real focus for me recently. I’ve had a lot of help and encouragement – see acknowledgements below – but also NEED from students and groups who are wondering how to do transcription better.

I’ve REALLY liked working with this in ATLAS.ti 9 – the way that you can integrate annotation and auto-coding via the focus group coding tool into the transcription process is key.

I also think it really gives the lie to the idea that manual transcription is “the best way” to get in touch with audio. I’m kind of hoping that the sudden shifts the pandemic has caused in practice and process might lead to some developments and rethinking of analysis. This quote has been too true for too long:

Over the past 50 years the habitual nature of our research practice has obscured serious attention to the precise nature of the devices used by social scientists (Platt 2002, Lee 2004). For qualitative researchers the tape-recorder became the prime professional instrument intrinsically connected to capturing human voices on tape in the context of interviews. David Silverman argues that the reliance on these techniques has limited the sociological imagination: “Qualitative researchers’ almost Pavlovian tendency to identify research design with interviews has blinkered them to the possible gains of other kinds of data” (Silverman 2007: 42). The strength of this impulse is widely evident from the methodological design of undergraduate dissertations to multimillion pound research grant applications. The result is a kind of inertia, as Roger Slack argues: “It would appear that after the invention of the tape-recorder, much of sociology took a deep sigh, sank back into the chair and decided to think very little about the potential of technology for the practical work of doing sociology” (Slack 1998: 1.10).

Back L. (2010) Broken Devices and New Opportunities: Re-imagining the tools of Qualitative Research. ESRC National Centre for Research Methods

Citing:
Lee, R. M. (2004) ‘Recording Technologies and the Interview in Sociology, 1920-2000’, Sociology, 38(5): 869-899
Platt, J. (2002) ‘The History of the Interview,’ in J. F. Gubrium and J. A. Holstein (eds) Handbook of Interview Research: Context and Method, Thousand Oaks, CA: Sage, pp. 35-54.
Silverman D. (2007) A very short, fairly interesting and reasonably cheap book about qualitative research, Los Angeles, Calif.: SAGE.
Slack R. (1998) On the Potentialities and Problems of a www based naturalistic Sociology. Sociological Research Online 3. http://socresonline.org.uk/3/2/3.html

Various additional links and notes:

How and when Stream will be changing https://docs.microsoft.com/en-gb/stream/streamnew/new-stream

Bits about Zoom needing transcripts switched on and how to do this (i.e. send this link to your institutional Zoom administrator: https://support.zoom.us/hc/en-us/articles/115004794983-Using-audio-transcription-for-cloud-recordings- )

A cool free online tool for converting other transcript formats (e.g. from EStream, Panopto or other systems) https://subtitletools.com/
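If the other system gives you an SRT file, the conversion to VTT is simple enough to sketch in a few lines of Python. This is only an illustration under a couple of assumptions: the input is a plain SRT file with "hh:mm:ss,mmm" timings, and the function name and sample are made up here (this is not how subtitletools.com works internally). The two format differences it handles are the WEBVTT header line and the decimal separator in the timings (a comma in SRT, a full stop in VTT).

```python
import re

# An SRT timing line looks like "00:00:01,000 --> 00:00:04,500".
SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3})\s+-->\s+(\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert plain SRT subtitle text to WebVTT text."""
    body = srt_text.lstrip("\ufeff").replace("\r\n", "\n").strip()
    # Swap the comma before the milliseconds for a full stop.
    body = SRT_TIMING.sub(r"\1.\2 --> \3.\4", body)
    # The SRT cue numbers can stay: WebVTT allows an identifier line before
    # each timing line.
    return "WEBVTT\n\n" + body + "\n"


# Hypothetical sample SRT content.
sample_srt = """1
00:00:01,000 --> 00:00:04,500
Hello and welcome.

2
00:00:04,500 --> 00:00:09,250
Thanks for joining me today.
"""

print(srt_to_vtt(sample_srt))
```

SRT files that use styling or positioning extras may need more care than this, which is where a dedicated converter earns its keep.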

And finally for more information on the VTT format see this excellent page.

Thanks and acknowledgements

This hasn’t happened alone. Throughout this Friedrich Markgraf has been incredibly accommodating – and giving his time for the demo interview was just a part of that. Thanks definitely due for all his excellent, encouraging and very helpful input via Twitter – and for working on the great new features for direct import into ATLAS.ti.

Many thanks to Tim Ellis especially for his work on the VTT cleaner and sharing it via GitHub.

And to Amir Michalovich for his enthusiasm and sharing some Excel tricks, and of course Christina Silver for her draft reading, promoting and general enthusiasm, encouragement and suggestions. And also to Sandra Flynn and her great blog post about trials and tribulations of a PhD student working with NVivo, which really helped me realise that time spent on this stuff can have an impact and a value.

If you’ve got suggestions, ideas, updates, developments or found this useful please post a comment, link to this or build on it.
