Human Transcription – Still Relevant?

It is so easy to ask AI to do everything for you at the moment, from writing articles (this is definitely not AI – my very sore left wrist can attest to this!) through to producing summaries of meetings, providing information and creating new documents and emails. All good and really helpful, and I am sure most organisations and individuals are now using AI on a very regular basis to improve their productivity and work levels. However, what about transcription?

We find time and again that when business clients send over AI-produced transcriptions (academic clients rarely use AI because of data security concerns and institutional prohibitions), the transcription doesn’t just contain errors, which are understandable; it also contains hallucinations: text invented by the AI model that was never spoken in the recording.

AI Hallucinations

A study completed some time ago looked at Whisper, the audio-to-text model from OpenAI (the company behind ChatGPT), and showed its propensity to make up text when transcribing audio. The invented content was not limited to plausible errors; it also included quite disturbing material bearing no resemblance to the actual recording. Full details of this study can be found here in our earlier article:

It is almost as if the AI model is programmed to please human users, and it assumes that human users want it to help by interpreting the audio and then creating something more interesting than the content transcribed. Fine if you are writing a story or a screenplay I guess, but definitely not for accurate transcription of an audio recording!

So at present there is very much a need for human transcription. Although AI transcription has its place for dictation and for creating rough transcripts with one or two speakers, it really struggles with multiple speakers, accents, local language use, hard-to-hear voices, background noise and formatting, as well as with not making up sections of text. In fact it struggles to cope with over 95% of our work, which tends to involve one or more of the above. Humans, of course, don’t have to remember not to make up sections of text – although we want to please our clients, we don’t tend to hallucinate in the way AI does!

We very much doubt that AI in its current form is going to be able to compete with humans for academic, research and sensitive transcription of audio and video recordings. Whether or not anyone can train an AI model to accurately transcribe accents, hard-to-hear recordings and multiple speakers remains to be seen.

Our Accreditations

We are Cyber Essentials Plus audited annually and we hold the Cyber Essentials and Cyber Essentials Plus certificates. We are a UKAS ISO 27001:2022 audited and accredited company, and we are also accredited for ISO 9001 and ISO 14001 systems. We are members of the American Translators Association and we are assessed for GDPR compliance annually by IASME (Cyber Assurance Level 1).

10% Profits to Charity

10% of our profits are donated to the Ten Percent Foundation, a charitable trust registered in the UK. Since 2000 over £150,000 has been donated to projects in Africa and the UK. Click here for details.