Simon Says Assemble Interview – How Does Automatic Transcription & Editing Work?

January 14th, 2021

Simon Says is an AI-driven transcription platform that makes it easy to transcribe your interviews and other sound bites. We had the opportunity to speak with Shamir Allibhai, CEO of Simon Says, to learn more about the underlying technology and the company’s latest offering: Simon Says Assemble.

Simon Says Assemble, which we have reported on before, even lets you create a video rough cut of your story based on these automatically generated transcripts. Just drag and drop a few pieces of text and start creating a timeline for the NLE of your choice.

Simon Says – AI-based transcription

Automatically transcribing a video interview is a long-standing dream of many filmmakers. But the result has to be near perfect to be useful. If it’s not, you’ll spend more time fixing it than you save in your workflow. As Simon Says’ CEO Shamir Allibhai puts it:

Only when the transcription is accurate can it be usable.

It seems that Simon Says can now deliver these near-perfect transcripts in multiple languages, so they figured it might be a good time to take the next step: What exactly do filmmakers do once the transcript is ready? Well, most likely they build what’s called a paper edit. You look for key sentences in the written document, glue them together, and build a story that way. The next step is to translate those sentences back into timecode ins and outs to track down the corresponding video snippets and put them into a timeline. This way, a rough cut can be built.
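To make the paper-edit idea more concrete, here is a minimal Python sketch of how picked sentences could be mapped back to timecode ins and outs. It is purely illustrative and not how Simon Says works internally: the `Segment` class, the `to_timecode` and `paper_edit` helpers, the 25 fps frame rate, and the sample data are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed sentence with its position in the source clip (hypothetical structure)."""
    text: str
    start: float  # seconds from the start of the clip
    end: float    # seconds from the start of the clip


def to_timecode(seconds: float, fps: int = 25) -> str:
    """Convert seconds to a non-drop-frame HH:MM:SS:FF timecode string."""
    total_frames = round(seconds * fps)
    frames = total_frames % fps
    total_seconds = total_frames // fps
    hours, remainder = divmod(total_seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"


def paper_edit(segments, picks, fps=25):
    """Return (timecode in, timecode out, text) for each picked sentence, in story order."""
    cuts = []
    for index in picks:
        seg = segments[index]
        cuts.append((to_timecode(seg.start, fps), to_timecode(seg.end, fps), seg.text))
    return cuts


# Assumed sample transcript: three sentences from one interview clip.
segments = [
    Segment("Intro", 0.0, 4.0),
    Segment("Key quote", 12.4, 18.0),
    Segment("Closing", 61.0, 65.2),
]

# The story order chosen on paper: closing line first, then the key quote.
for tc_in, tc_out, text in paper_edit(segments, [2, 1]):
    print(tc_in, tc_out, text)
```

The order of `picks` is the order of the story, not the order of the recording, which is exactly what makes a paper edit useful: rearranging the list rearranges the rough cut.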

Simon Says Assemble

But then what? Export the rough cut, send it to clients and other key crew members to gather feedback. Once that’s received via email, start over. Again and again. To simplify this tedious process, Simon Says Assemble has a solution for you. AI-generated transcripts can be used to create the rough cut by simply dragging and dropping the relevant snippets of text. The system then automatically assembles the relevant parts of the video and off you go.

Moreover, working on such a document-based editing process is similar to working on a shared Google document on the web. Everyone involved can edit, review, and rearrange the cut. Once everyone is satisfied, you can export an XML file and edit the created rough cut directly in an NLE of your choice. Very neat!


Simon Says Assemble itself is free to use, but the underlying transcription requires the paid Simon Says service. There are different tiers, but it is not a subscription: it is a pay-as-you-go service with various options for prepaid packages and discounts on the price per minute. Check their website for more details.

There’s also a free trial that includes 15 minutes of transcription if you sign up as a new user.


What do you think? Is transcribing video something you need to do on a regular basis? Share your experiences in the comments below!
