I need to transcribe video and end up with both transcription and SRT file
Thread poster: adifrank

United States
Jun 5, 2014

Hi. I have the following task and am not sure which software would be best.
I am to receive several video files containing audio in English.
The video files are final and will not be edited or changed.
I need to transcribe the English audio and ultimately end up with two files:
(1) an MS Word doc of the transcription. Just text. No timecodes, no character limitation per line.
(2) an SRT subtitle file timed to the video.

The bare text for both (1) and (2) should be identical of course, the only difference being the formatting and the added timecodes for (2).

Of course, one could do this manually, but I was wondering if I could just do (2) for example and be able to also export the SRT to contain only the spoken text and without line breaks representing the break between subtitle lines.

If each line in (1) consisted of a whole subtitle from (2) that would be perfectly fine, provided that it completes a full sentence. Simply put, for (1) I can't have hard line breaks in the middle of sentences.
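A minimal Python sketch of the export described above: it takes the contents of an SRT file, drops the cue numbers and timecodes, joins the subtitle lines into flowing text, and emits one sentence per line. The function name srt_to_transcript and the sentence rule (break after ".", "!" or "?") are assumptions made for illustration, not a feature of any tool named in this thread.

```python
import re


def srt_to_transcript(srt_text: str) -> str:
    """Strip cue numbers and timecodes, then return one sentence per line."""
    cue_texts = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues in an SRT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            # Everything after the timecode line ("-->") is spoken text.
            if "-->" in line:
                text = " ".join(part.strip() for part in lines[i + 1:] if part.strip())
                cue_texts.append(text)
                break
    flowing = " ".join(t for t in cue_texts if t)
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", flowing)
    return "\n".join(s for s in sentences if s)


sample = """1
00:00:01,000 --> 00:00:03,000
Welcome to the course.
Today we will cover

2
00:00:03,500 --> 00:00:06,000
three topics. Let's begin!
"""

print(srt_to_transcript(sample))
# Welcome to the course.
# Today we will cover three topics.
# Let's begin!
```

The sentence rule is deliberately simple: abbreviations such as "Prof." would be split in the wrong place, so the result still needs a read-through before it goes into the Word document.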

I would be most grateful to get some suggestions from experienced transcribers and/or subtitlers. I would be willing to purchase software that could do this.

Thank you!


José Henrique Lamensdorf
English to Portuguese
Not sure what you are up to Jun 6, 2014

Feel free to read this article for some clarification. Please keep in mind that it was intended for translation clients who have no idea how subtitling is done. Still, some points there may be useful to you.

Answering your question:
For transcribing/translating directly, use Express Scribe.
For time-spotting and creating an SRT file, use Subtitle Workshop or Media Subtitler.


United States
Thanks and some further clarification Jun 6, 2014

Thanks for replying Jose!

I took a quick look at the software you recommended. I will test it out further over the next week.
So from what I understand, I would first need to transcribe the English audio and then spot the transcription and break it down into subtitles. Is that correct? Is it necessarily a two-step process?

To clarify a bit more what I'm getting at -
It would save time to just transcribe directly into subtitle format, rather than go through the 2-step process of first transcribing and then creating the subtitles.
However, I'm using a translation memory. The videos correspond to print material, and a significant amount of text from the print material is repeated in the videos. When the text is in subtitle format (e.g. an SRT file), it is segmented, which can cause inconsistencies between the translations of the print material and the video. It can also lead to errors in the translations, since the translators will only receive the English text, without the video for reference.

[Edited at 2014-06-06 15:25 GMT]


José Henrique Lamensdorf
English to Portuguese
That's how *I* do it, one of the many methods available Jun 8, 2014

One thing you'll have to understand is that I came from the translation for dubbing world. I began doing it for dubbing in 1987, using an open-reel audio tape recorder.

For some reason, including a personal innate talent I discovered, my "metrics" (matching translated text to the actor's mouth movements on the screen) were always considered "superb". I only got into translation for subtitling in 2004, due to some client requests to the tune of "I can't find anyone else to do it" on account of the subject and quality requirements.

So I adapted my m.o. from dubbing to subtitling: some tools in common, yet a completely different frame of mind. There is some input on this page; however, it is intended to guide a client towards an educated decision.

Anyway, I always avoid transcribing a video, unless the client really wants a transcript, and is willing to pay for it. IMHO it's often a waste of time and/or money. It is useless for translation intended for either dubbing or subtitling, since it completely lacks the video "rhythm". Video translation should be done directly from the video, so the translator can mimic the video rhythm in the target language.

Now and then a client wants to "save" money, so they get the video transcribed, and then translated. Of course, some of them require Trados for the second step. They are the kind of people who say that everyone must always use Trados for anything, even to brush their teeth; it's some kind of a creed.

So they give me the full script, translated. Of course, if it's well done, it's invaluable assistance for proper names' spelling. (If a Japanese guy speaking English on the video mentioned Prof. Zbygniew Wojechszlecki, I'd have no clue to search for the proper spelling on Google. ;-) )

However the work it takes to adapt and chop that script for subtitling is more than re-translating the entire video directly. On top of the expense they had transcribing and translating, I surcharge 50% on the "spotting" rate, as it includes adapting too. Actually I should charge more, considering the additional time and labor involved, but this would lead them to give up.

For the record, I use WordFast, not Trados, just to mention that I am not at all against CAT tools. However I never use a CAT tool to translate video, as it cannot consider the video rhythm.

Rhythm is an absolute must for lip-sync dubbing.

Subtitling is all about getting the gist of whatever is said, and translating it into something as concise as possible, so the spectator, after having read the subtitle, will have some time left to watch the action. (Otherwise you could just e-mail them the translated script text, and save bandwidth by not sending the video.)

A CAT tool can do neither; it can't take rhythm into account.

I know that Trados worshippers will throw stones at me, but that's the truth.

Of course, low-budget subtitling will take a transcript, run it through Trados (or machine translation), and then have Subtitle Workshop (v6 only) automatically break it into subtitles and then time them "approximately"... but automatically! That's what you often see on the fansubs available on the web. However it's on the other end of the scale from the Disney-like quality level my clientele demands, since I specialize in corporate video.


United States
yet further clarification Jun 10, 2014

Thanks Jose.

For now, I'm dealing only with subtitles. No dubbing.

At my company we're using WordFast, not Trados.

We'll be receiving possibly more than 100 video files for a huge e-learning job. The e-learning job consists of dozens of slide shows, animations and documents, as well as videos which are incorporated in the e-learning courses.

The TM is a concern here because we want to maintain consistency throughout all the courses.

However, when the spoken text from the videos is broken into two-line subtitles, it creates segmentation that can cause the TM to give wrong results.

That is why we believe that it would be better to first translate a transcript and then break down the translation into subtitles.

The courses are going into several different languages.

We are just starting out with subtitling, so we're still learning.


José Henrique Lamensdorf
English to Portuguese
My specialty Jun 10, 2014

Coincidentally, training programs are my major specialty in translation. I've translated several hundred training programs (a partial list for ONE client is here), involving video (either dubbing or subtitling; a few watchable samples are here), instructor/facilitator guides, workbooks, handouts, software, the works!

I'm not peddling my services, since all I can offer is EN-US PT-BR, and you need several language pairs.

The major issue in subtitling is conciseness, which leaves time for the spectator to watch some of the action after reading. If it's only a talking head (pretty common in some training videos), it's all right to be somewhat more verbose. If there is any interaction, however, conciseness is a must.

I'm not sure if I took a bad sample, however most videos I saw subtitled on TED by volunteers use the full script translation. Though all are talking heads (or bodies), sometimes there is just too much text to read in the time allotted. We can hear faster than we can read (and I am a very fast reader!).

So the issue is how you are going to subtitle those videos. Line breaks cause enough trouble with spellcheckers already, so they should wreak havoc with CAT tools.

One possible solution is to translate into (concise) subtitles without line breaks, then have Subtitle Workshop break the lines automatically (v6; as v2.51 can't do it), and adjust them manually wherever needed. After having mastered the commands, it's quick and easy to join subtitles, and re-break them BEFORE time-spotting.

This is not a piece of cake, however it should work.
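To illustrate the automatic line-breaking step, here is a small Python sketch that splits one unbroken subtitle into at most two lines of roughly equal length. It is not Subtitle Workshop's algorithm; the function name break_subtitle and the 42-characters-per-line limit are assumptions chosen for the example, so adjust the limit to whatever your style guide requires.

```python
def break_subtitle(text: str, max_chars: int = 42) -> list[str]:
    """Split one subtitle into at most two lines of roughly equal length."""
    text = " ".join(text.split())  # collapse any existing breaks and extra spaces
    if len(text) <= max_chars:
        return [text]
    # Candidate break points are the spaces between words.
    spaces = [i for i, ch in enumerate(text) if ch == " "]
    # Keep only the breaks that leave both lines within the limit.
    fitting = [i for i in spaces if i <= max_chars and len(text) - i - 1 <= max_chars]
    if not fitting:
        raise ValueError("text does not fit in two lines; shorten or split the subtitle")
    # Prefer the break nearest the middle so the two lines are balanced.
    cut = min(fitting, key=lambda i: abs(i - len(text) / 2))
    return [text[:cut], text[cut + 1:]]


print(break_subtitle("The quick brown fox jumps over the lazy dog near the river bank"))
# ['The quick brown fox jumps over', 'the lazy dog near the river bank']
```

A subtitle that cannot fit in two lines raises an error instead of being squeezed in, which flags the places where the translation needs to be made more concise or split into two subtitles by hand.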


United States
Thanks Jun 10, 2014

Okay, thanks for your input.
I found this link:
But using a whole new translation tool is not an option. I'll need to find some sort of similar workaround for WordFast. Thanks.

