When your data sources are audio files (e.g. in-depth interviews, customer calls, focus groups), a huge amount of time is needed to listen to each file, take notes, and identify key items. The manual process is not only time-consuming, it also doesn't scale across projects. Read more about the audio use case here.
In contrast, Relevance AI's Audio Premium workflow provides you with the ability to:
- Transcribe audio
- Split audio
- Semantically analyze the split sentences
- Split audio file into chapters
- Highlight keywords
- Extract timestamps (previous, current, and next)
- and more!
As a result, you will receive several datasets where the transcription and analysis results are saved, including sentence-splitting and chapter-splitting results alongside their specific outputs.
Upload your audio files to the Relevance AI platform using Upload media. Alternatively, if you have already stored your audio files on the web (i.e. you can access them via an HTTP URL), include their URLs in a CSV file (a sample is available here) and upload the CSV to Relevance AI.
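As a minimal sketch, such a CSV of audio URLs could be generated with Python's standard `csv` module. The column name `audio_url` and the URLs below are assumptions for illustration; match the column naming used in the sample CSV.

```python
import csv

# Hypothetical list of publicly accessible audio file URLs
audio_urls = [
    "https://example.com/interviews/session_01.mp3",
    "https://example.com/interviews/session_02.mp3",
]

with open("audio_files.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["audio_url"])  # header row; align with the sample CSV
    for url in audio_urls:
        writer.writerow([url])
```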
Note: When uploading your media files to Relevance AI, if your audio files are large, allow time for the upload process to complete; successfully uploaded files are highlighted in green in the upload wizard.
As a result, you will have a dataset in which each entry represents an audio file and includes a URL pointing to where the file is hosted.
Once your dataset is ready, locate "Audio Premium" under Workflows.
Follow the steps in the setup wizard:
- Select the field that contains the URLs to your audio file(s)
- Select your desired analysis: Chapters or Sentences
Chapters: Input audio file(s) are broken into segments, each spoken by one person and centered on a single topic. Summary, gist, and speaker are some of the important output fields under this analysis.
Sentences: Input audio file(s) are broken into spoken sentences. Speaker, sentiment and keywords are some of the important output fields under this analysis.
- Execute the workflow
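Conceptually, the Sentences analysis resembles splitting a transcript into individual spoken sentences. The snippet below is a naive regex sketch of that idea, not the workflow's actual implementation:

```python
import re

transcript = "Thanks for joining. How did you find the product? It was easy to set up!"

# Naive sentence splitter: break on ., ? or ! followed by whitespace.
sentences = [s for s in re.split(r"(?<=[.?!])\s+", transcript) if s]

for s in sentences:
    print(s)
```

The real workflow also attaches speaker, sentiment, and keyword information to each sentence, which a simple splitter like this cannot do.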
You can track the progress under workflow history, or wait for the email notification that is sent when the workflow is finalized.
New datasets containing the output will appear under your account. The selected analysis title (i.e. "sentences" or "chapters") is used as a suffix when naming the resulting datasets; please see the image below:
- Under the sentences dataset (audio split into sentences), you will see fields such as transcription, sentiment, speaker, keywords, events, next and previous transcript, and audio for each row.
- Under the chapters dataset (audio split by context), you will see fields such as transcription, gist, summary, headlines, next and previous transcript, and audio for each row.
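To make the dataset shape concrete, here is a hypothetical example of one row from the sentences dataset. The field list comes from this page, but the exact field names and all values are invented for illustration and will differ from the real output:

```python
# Hypothetical row from the sentences dataset (names/values illustrative only)
sentence_row = {
    "transcription": "It was easy to set up.",
    "sentiment": "positive",
    "speaker": "B",
    "keywords": ["easy", "set up"],
    "previous_transcription": "How did you find the product?",
    "next_transcription": "The documentation helped a lot.",
}

print(sentence_row["speaker"], sentence_row["sentiment"])
```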
Note 1: Speaker diarization labels the first speaker/voice as A, the second as B, and so on.
Note 2: To easily differentiate the interviewer's lines from the responses, especially when working with multiple files, make sure that the first voice in each audio file belongs to the interviewer.
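The labeling rule from the notes above, assigning letters by order of first appearance, can be sketched as follows (an illustrative helper, not part of the Relevance AI API):

```python
from string import ascii_uppercase

def label_speakers(turns):
    """Assign A, B, C, ... to speakers in order of first appearance."""
    labels = {}
    labeled = []
    for speaker_id, text in turns:
        if speaker_id not in labels:
            labels[speaker_id] = ascii_uppercase[len(labels)]
        labeled.append((labels[speaker_id], text))
    return labeled

# If the interviewer speaks first, they are labeled "A" in every file.
turns = [
    ("interviewer", "How was the onboarding?"),
    ("customer", "Really smooth."),
    ("interviewer", "What made it smooth?"),
]
print(label_speakers(turns))
```

This is why starting each recording with the interviewer's voice keeps the "A" label consistent across files.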
After successful execution of the workflow, the audio is transcribed and chunked in three different ways. Results are saved as datasets under your account. You can treat the data as text and apply workflows such as AI Clustering and AI Tagging.
The Gist and Summary fields in the chapters dataset provide a high-level view of the themes. For a deeper understanding of the data, we recommend applying AI Clustering and AI Tagging.
Relevance AI's Explorer is a great tool that provides a variety of configurable data views, as well as search and filtering, on your data.