This document summarizes information gathered by a research team at Penn State. The team investigated various services and techniques related to live captioning or CART services for deaf or hard of hearing individuals.

Research Team

  • Patrick Besong, Multimedia Specialist, Teaching and Learning with Technology
  • Anita Colyer Graham, Manager of Access, World Campus
  • Elizabeth J. Pyatt, Instructional Designer, Teaching and Learning with Technology
  • With input from Deborah J. Austin, Coordinator, Student Disability Resources

What is Live Captioning/CART?

The term CART (Communication Access Realtime Translation) refers to technologies that allow a person who is deaf or hard of hearing to quickly access spoken content during a live event such as a course lecture, student team meeting, or any public event. Put more simply, CART or live captioning is captioning that occurs during a live event.

Because CART needs to happen simultaneously with the event, it requires a CART captioner who is able to produce captions quickly enough for them to be useful to the individuals who need them while the event is taking place. In contrast, someone providing captions of a recorded video or audio can pause and rewind content as needed until a caption is accurately created. Many CART providers are also trained as court reporters and are familiar with court reporting stenography equipment and techniques.

Traditionally, the person providing a CART service sat in the same location as the person needing the accommodation and could use either a stenography machine or computer linked to a display that could be viewed by the individual (Brewer, 2015).

Note: In some cases the display was shown to everyone in order to preserve the anonymity of the person needing the content (Brewer, 2015).

Stenography machine in use

Court reporter using a stenography machine. Image from U.S. District Court of Wyoming.

When live events utilize telephone or virtual meeting software such as Zoom, Adobe Connect, Blackboard Collaborate, Blue Jeans and others, it is possible for the CART transcriber or live captionist to be in one location and the deaf/hard of hearing person to be in another location.

Screencapture during Zoom call, including example of live captions

Image of Zoom with participant acting as captionist.

Sign Language

In addition to live captioning, a sign language interpreter is another possible accommodation. Normally the interpreter is in the same room, but could also be present in a video stream. Some translation or sign language interpreter services such as Interpretek or LanguageLine offer this service. For many people, though, live captioning is an equitable substitute.

Interpreter stands by computer to translate speech to sign language

By SignVideo, London, U.K. Licensed by Creative Commons via Wikimedia Commons

Why is it important?

There are several benefits to providing live captioning:

  1. Live captioning allows a person who is hard of hearing or deaf to read the words that are spoken during the event, as it is taking place. The person can fully participate in the event, including asking the presenter questions about the content.
  2. A transcript can usually be saved for later review by any viewer as needed.
  3. If live captions are recorded as part of the event, captioned videos or transcribed audio can be posted more quickly because a text transcript has been made. In formats where captions are not automatically embedded in the event recording, a transcript can be placed with the recording, or time stamped and merged with the video file to create a captioned video.
  4. Live captioning can benefit hearing participants in a virtual meeting room when there are audio issues, when participants are non-native speakers of English, or when someone misses a phrase. The live captions can also help any viewer with the spelling of new vocabulary or a name.

Hiring a Vendor

Because of the speed necessary for live captioning, it is generally recommended that a vendor be hired. An alternative is to hire a person certified in CART transcription to cover different events within the institution.

Virtual Meeting Platforms

There are a number of virtual meeting platforms in use today. Below is a partial list of systems that have been used at Penn State. No matter what system your institution may use, it’s important to factor in how the system can support live captioning.

In Person vs. Remote Captioning

For a virtual meeting, planners can hire a captionist who is in the same room as the speaker or one who is located at a remote location. Examples of vendors who provide live captioning remotely include VITAC (formerly Caption Colorado) or WGBH Media Access Group. Providers of captioning in the same room are generally local in the region and may include court reporters working to supplement their incomes.

Remote Speech-to-text Services

The following is a partial list of speech-to-text services.

  • IBSU (Internet Broadcast Services Unlimited) – Used by World Campus in conjunction with Zoom.
  • VITAC (formerly Caption Colorado) – A service which has been used by several Penn State units.
  • C-Print – A captioning technology/service developed at the National Technical Institute for the Deaf at the Rochester Institute of Technology. It is used to display real-time transcriptions during course lectures and in other settings. Captionists can use this system to set abbreviations for faster entry.
  • TypeWell – Similar to C-Print. Not a word for word transcription, but a summary based on meaning.
  • WGBH Media Access Group – Another recognized organization providing accessibility services.


The National Court Reporters Association provides training and certification for live captioners and court reporters in the United States. Any captioner who provides support for Penn State educational programming must be certified through this organization.

Caption Window

It is best to use a platform that has a caption window open to everyone in the virtual session so that anyone who opens that window may view the captioning. This also ensures that the live captioning is included in any recording. At the time of this writing, Zoom, Adobe Connect, and Blackboard Collaborate included caption window options, but Blue Jeans did not.

It is also important that vendors can access the captioning window within a tool. For instance, Adobe Connect supports live captioning, but only if the event planner installs an extra plugin for the captionist. Another factor is whether a remote vendor is capable of accessing the caption area in a particular platform. For instance, although Zoom does provide a caption window, not all vendors use it.

A less usable configuration is one where the person who needs captions is forced to view two systems – the virtual meeting and a separate caption feed – simultaneously. This splits the attention of the viewer and means that the planners must do additional work to capture the transcription and include it as a caption in the recording.

Automatic Speech Recognition (ASR)

Some systems may use automatic speech recognition (ASR), that is, electronic devices or software that use speech recognition technology to capture speech and convert it to text. Although this has a promising future, factors such as background noise, low volume, pitch, pronunciation, and speaker accents place limits on accuracy during a live broadcast. Ideally, there should be a system to manually correct errors in a live broadcast, and results based on a recording should always be checked for accuracy before being posted as a caption file.

Event Set Up Tips

When organizing an event, consider the following:

  1. Scheduling Turnaround
     Be prepared for last-minute requests. For instance, in a course, some students with partial hearing may decide they need captions at the last minute. You may want to set a policy on how far in advance of an event a request must be made. Some vendors have stated windows of response time; it’s best to request a live captioner at least several working days before the event.

  2. Keywords for captionists
     Captionists can program shortcuts for frequently used keywords in a lecture (e.g. “aly” for “accessibility”). If a list of keywords or vocabulary, including proper names, is sent to the captionist before the event, the captionist can set up these keyboard shortcuts in advance.

  3. Microphones
     Microphones are important for large events and events with remote participants. It’s important to ensure that microphones are working at the time of the event. This may include checking that the batteries in a microphone are fully charged or that backup batteries are available.

     One remote captioning process for live, on-site events requires the use of a USB port microphone, which should be fully charged. Charging takes several hours and should be done in advance of the event. The microphone may need to be charged each evening if the event lasts more than one day. The USB portion of the microphone is plugged into the student’s computer (which should be plugged into a power source), and the microphone itself is given to the faculty member or main presenter. Such a microphone has a range of about 65 feet, and one charge lasts approximately eight hours. An extended audio range helps if the presenter moves around during the presentation. It also helps the live captioner hear questions that are asked from the audience.

  4. Participants and microphones
     Whenever possible, make sure all speakers, including students or audience members asking questions, have access to a microphone. Depending on the event, it might be wise to have one microphone for the main speaker and additional microphones set up for audience members to use. If only one microphone is available, either it can be passed around or the speaker should repeat each question.

     Note: This will benefit all remote participants, even those with normal levels of hearing.

  5. Equipment compatibility for Captionist
     Live captioning is done by captionists who use equipment similar to that of court stenographers. The captioner uses a combination of keyboard shortcuts to type very quickly, producing captions in real time. If a remote captioner is providing captioning support online via a company, equipment compatibility should not be an issue. If the captioner is attending face-to-face for an event that is broadcast online, it will be necessary to verify before the event starts that the keyboard’s output can be successfully entered into the online system being used.

  6. One screen is better
     Some setups have the video image in one window and live captions in another. Unfortunately this requires the viewer to split his or her attention between two windows instead of viewing both the image and caption text together. A two-screen setup also makes it more difficult to add captions to the video recording in post-production.

  7. Virtual Setup Options
     When planning an event with virtual attendees, an important consideration is whether all participants will be attending virtually or whether some will be in the same space as the speaker.

     1. All online or by telephone – One option is that the speaker and all attendees are in different spaces. The advantage here is that remote viewers all have equal footing.
     2. Part online – This happens when there is a live audience with the speaker, but some participants are online. In this case remote viewers are usually at more of a disadvantage because they don’t have the same visual cues as the live audience and their audio is usually poorer than in the actual location. Event planners will need to pass the mic around or repeat questions. It’s also important to remember to ask for feedback from the remote audience.
     3. Face-to-face only – The speaker and the audience are located in the same room; however, the captioner may be located remotely. In this case, it is important to be sure the audio quality is sufficient for the captionist.

Post-Production: Converting CART Transcription Files to Captions

If a live event has been recorded and the CART transcription file archived, it is possible to convert the transcription to a closed caption file. However, it is important that any errors that occurred during the live session be fixed first. In addition, time codes will likely need to be added to the transcription file.

One method to add time codes to a transcript is to upload the video to YouTube and then upload the transcript as a caption file.

You can then use YouTube's automatic speech recognition technology to synchronize the text with the audio. The video can then be deleted from YouTube if it is no longer needed there.

See the YouTube Support article for details.

Note: Although there are concerns about automatic speech recognition systems, having a pre-existing transcript improves the accuracy in terms of adding time codes.
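
As an illustration of what a time-coded caption file looks like, the following is a minimal Python sketch that turns a corrected transcript with known start and end times into WebVTT text. WebVTT is only one possible caption format and is an assumption here, not a format named by the research team. The function names and the sample cue timings are hypothetical and exist only for this example.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def transcript_to_webvtt(cues):
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples.

    The cues are assumed to be already corrected and in time order.
    """
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for start, end, text in cues:
        if end <= start:
            raise ValueError(f"cue must end after it starts: {text!r}")
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


# Hypothetical cues; real timings would come from the time-coded transcript.
sample_cues = [
    (0.0, 2.5, "Welcome to today's lecture."),
    (2.5, 6.0, "We will begin with a review of last week."),
]

if __name__ == "__main__":
    print(transcript_to_webvtt(sample_cues))
```

The resulting text can be saved with a .vtt extension and checked against the recording before it is posted.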


Brewer, Laura (2015). Caption Corner: What is CART? The Journal of Court Reporting. Accessed 31 May 2017.
