The speaker separation setting partitions the input audio stream into homogeneous segments according to speaker identity. It works on both multi-channel and mono-channel audio.
Input types supported: Audio, Video
Setting the "speaker_separation.enable" flag to "True" and specifying the number of speakers with the "num_speakers" flag produces speaker-separated speech to text output, in which each transcript section carries a speaker number. A speaker number (integer) is automatically assigned to each speaker cluster.
If the number of speakers present in the conversation is unknown, "num_speakers" can be set to "0" and Marsview's speaker separation model will automatically detect the number of speakers.
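The two flags described above can be sketched as a small settings builder. This is only an illustration: the function name `build_speaker_separation_settings`, the nested dictionary layout, and the use of a boolean rather than the string "True" are assumptions, not the documented request schema.

```python
def build_speaker_separation_settings(num_speakers=0):
    """Return a settings dict that enables speaker separation.

    num_speakers=0 asks the service to detect the speaker count itself.
    The nested layout below is an assumption made for illustration.
    """
    is_plain_int = isinstance(num_speakers, int) and not isinstance(num_speakers, bool)
    if not is_plain_int or num_speakers < 0:
        raise ValueError("num_speakers must be a non-negative integer")
    return {
        # Speech to text is enabled as well, since speaker separation depends on it.
        "speech_to_text": {"enable": True},
        "speaker_separation": {"enable": True, "num_speakers": num_speakers},
    }


# Known speaker count (two speakers) versus automatic detection (0).
known = build_speaker_separation_settings(num_speakers=2)
auto = build_speaker_separation_settings()
```

Passing the speaker count explicitly is the safer default when it is known; 0 is kept as the fallback for automatic detection.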
A Transaction ID is returned in the JSON body once the processing job is launched successfully. This Transaction ID can be used to check the status of the job or to fetch the results of the job once the metadata is computed.
{"status":true,"transaction_id":"32dcef1a-5724-4df8-a4a5-fb43c047716b"}
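Reading the Transaction ID out of the launch response might look like the sketch below. It assumes the ID arrives as a quoted JSON string; `extract_transaction_id` is a hypothetical helper name, not part of any Marsview SDK.

```python
import json


def extract_transaction_id(response_body):
    """Parse the launch response and return its transaction ID.

    Raises RuntimeError when the body does not report a successful launch.
    """
    payload = json.loads(response_body)
    if payload.get("status") is not True:
        raise RuntimeError("job was not launched: %r" % (payload,))
    return payload["transaction_id"]


body = '{"status": true, "transaction_id": "32dcef1a-5724-4df8-a4a5-fb43c047716b"}'
transaction_id = extract_transaction_id(body)
print(transaction_id)
```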
Speech to Text must be enabled for speaker_separation to be enabled. If it is not, the request fails with a dependency error (error code: CVAPI01):
{"status":false,"error":{"code":"CVAPI01","message":"DependencyError: Speech to text must be enabled for speaker_separation to be enabled"}}
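A client could surface this dependency error as an exception, roughly as follows. The exception class `MarsviewApiError` and the helper `raise_for_error` are hypothetical names introduced for this sketch; only the shape of the error body comes from the example in this section.

```python
import json


class MarsviewApiError(Exception):
    """Hypothetical exception carrying the error code and message."""

    def __init__(self, code, message):
        super().__init__("%s: %s" % (code, message))
        self.code = code
        self.message = message


def raise_for_error(response_body):
    """Return the parsed body, or raise if it reports status false."""
    payload = json.loads(response_body)
    # A boolean false status marks a failed request; other shapes pass through.
    if payload.get("status") is False:
        error = payload.get("error", {})
        raise MarsviewApiError(error.get("code", "UNKNOWN"), error.get("message", ""))
    return payload


error_body = (
    '{"status": false, "error": {"code": "CVAPI01", "message": '
    '"DependencyError: Speech to text must be enabled for '
    'speaker_separation to be enabled"}}'
)
try:
    raise_for_error(error_body)
except MarsviewApiError as exc:
    caught_code = exc.code
    print("Request rejected with", caught_code)
```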
The accuracy of the diarized output will be much higher when num_speakers is specified.
{"status":{"speech_to_text":"Queued","speaker_separation":"Queued"},"data":{"speech_to_text":{}}}
{
  "status": {"speech_to_text": "Completed", "speaker_separation": "Completed"},
  "data": {
    "speech_to_text": {
      "sentences": [
        ...
        {"sentence": "Be sure to check out the support document at marsview.ai", "start_time": "172200.0", "end_time": "175100.0", "speakers": ["2"]},
        {"sentence": "Sure, Thats what i was looking for, Thank You!", "start_time": "175100.0", "end_time": "177300.0", "speakers": ["1"]},
        ...
      ]
    }
  }
}
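Before reading the transcript, a client would typically check that both models report "Completed". The helper below is a minimal sketch: `all_completed` is a hypothetical name, it assumes the status keys match the model names `speech_to_text` and `speaker_separation`, and it only knows the two states shown in this section ("Queued" and "Completed").

```python
def all_completed(response, models=("speech_to_text", "speaker_separation")):
    """Return True only if every listed model reports "Completed"."""
    status = response.get("status", {})
    return all(status.get(name) == "Completed" for name in models)


queued = {
    "status": {"speech_to_text": "Queued", "speaker_separation": "Queued"},
    "data": {"speech_to_text": {}},
}
done = {
    "status": {"speech_to_text": "Completed", "speaker_separation": "Completed"},
    "data": {"speech_to_text": {"sentences": []}},
}
print(all_completed(queued), all_completed(done))
```

A missing status key is treated as not completed, so a partially populated response does not get mistaken for a finished job.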
| Fields | Description |
| --- | --- |
| start_time | Starting time of the chunk in milliseconds |
| end_time | Ending time of the chunk in milliseconds |
| speakers | Speaker number(s) for the chunk. A speaker number (integer) is automatically assigned to each speaker cluster |
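As an illustration of how these fields combine, the sketch below totals talk time per speaker from a list of sentence entries. It assumes `start_time` and `end_time` are millisecond values encoded as strings and that `speakers` is a list of speaker labels, as in the sample response; `talk_time_by_speaker` is a hypothetical helper.

```python
from collections import defaultdict


def talk_time_by_speaker(sentences):
    """Sum chunk durations (in seconds) for each speaker label."""
    totals = defaultdict(float)
    for item in sentences:
        # Times are millisecond values stored as strings in the sample output.
        duration_ms = float(item["end_time"]) - float(item["start_time"])
        for speaker in item["speakers"]:
            totals[speaker] += duration_ms / 1000.0
    return dict(totals)


sentences = [
    {
        "sentence": "Be sure to check out the support document at marsview.ai",
        "start_time": "172200.0",
        "end_time": "175100.0",
        "speakers": ["2"],
    },
    {
        "sentence": "Sure, Thats what i was looking for, Thank You!",
        "start_time": "175100.0",
        "end_time": "177300.0",
        "speakers": ["1"],
    },
]
totals = talk_time_by_speaker(sentences)
for speaker, seconds in sorted(totals.items()):
    print("Speaker %s spoke for %.1f s" % (speaker, seconds))
```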