Marsview Automatic Speech Recognition (ASR) technology accurately converts speech into text in live or batch mode. The API can be deployed in the cloud or on-premises. Get superior accuracy, speaker separation, punctuation, casing, word-level time markers, and more.
| Feature | Description |
| --- | --- |
| Speech-to-Text | Accurately converts speech into text in live or batch mode. |
| Automatic Punctuation | Accurately adds punctuation to the transcribed text. |
| Custom Vocabulary | Boosts domain-specific terminology, proper nouns, and abbreviations when you add a simple list/taxonomy of words or phrases. |
| Speaker Separation | Automatically detects the number of speakers in your audio file; each word in the transcription text can be associated with its speaker. |
| PII Detection | Personally Identifiable Information (PII) in the transcription, such as phone numbers and social security numbers, can be redacted. All redacted text is replaced with special characters (#, *). |
| Sentence Topics | The most relevant topics, concepts, and discussion points from the conversation are generated based on the overall scope of the discussion. (For a detailed description of the different types of topics available, check out the Topics section.) |
| Model | Use Cases | Parameter |
| --- | --- | --- |
| Default | Best for all types of data and accents (English only) | Default |
| Custom language or accent | Tailored for your data. You can expect a large improvement compared to the default model on your data. | Contact us for more info: [email protected] |
Input Types Supported: Video, Audio
Speech-to-Text metadata is computed when the `speech_to_text` key is set to `true` under the `settings` object. A Transaction ID is returned in the JSON body once the processing job is launched successfully. This Transaction ID can be used to check the status of the job or fetch the results of the job once the metadata is computed.

```json
{"status": true, "transaction_id": "32dcef1a-5724-4df8-a4a5-fb43c047716b"}
```
An error is returned if a requested feature's dependency is not enabled (Speech-to-Text has to be enabled for Action Items to be enabled):

```json
{"status": false, "error": {"code": "MCST07", "message": "DependencyError: action_items depends on speech_to_text"}}
```
```bash
curl --request POST 'https://api.marsview.ai/v1/conversation/compute' \
--header 'appSecret: 32dcef1a-5724-4df8-a4a5-fb43c047716b' \
--header 'appId: 1ZrKT0tTv7rVWX-qNAKLc' \
--header 'Content-Type: application/json' \
--data-raw '{
    "settings": {
        "speech_to_text": {
            "enable": true,
            "pii_detection": false,
            "custom_vocabulary": ["Marsview", "OmeoDataPlatform"],
            "sentence_topics": true
        },
        "speaker_separation": {
            "enable": true
        }
    }
}'
```
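For convenience, the same compute request can be sketched in Python using the `requests` library. This is only an illustrative equivalent of the curl call above; the `appId`/`appSecret` values are the placeholder credentials from that example and must be replaced with your own.

```python
import requests

COMPUTE_URL = "https://api.marsview.ai/v1/conversation/compute"

headers = {
    "appSecret": "32dcef1a-5724-4df8-a4a5-fb43c047716b",  # replace with your appSecret
    "appId": "1ZrKT0tTv7rVWX-qNAKLc",                      # replace with your appId
    "Content-Type": "application/json",
}

# Same settings payload as the curl example above.
payload = {
    "settings": {
        "speech_to_text": {
            "enable": True,
            "pii_detection": False,
            "custom_vocabulary": ["Marsview", "OmeoDataPlatform"],
            "sentence_topics": True,
        },
        "speaker_separation": {"enable": True},
    }
}

response = requests.post(COMPUTE_URL, headers=headers, json=payload)
result = response.json()

# On success the response carries the Transaction ID for the launched job.
print("Launched compute job:", result.get("transaction_id"))
```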
Given below is a sample response JSON when the status code is 200.

```json
{
    "status": true,
    "transaction_id": "32dcef1a-5724-4df8-a4a5-fb43c047716b",
    "message": "Compute job for file-id: 32dcef1a-5724-4df8-a4a5-fb43c047716b launched successfully"
}
```
The `data` object returns the requested metadata if it is computed. The `status` object shows the current state of the requested metadata. The status for each metadata field can take the values "Queued", "Processing", or "Completed".

Shown below is a case where the STT job is in the "Queued" state and in the "Completed" state.

```json
{
    "status": {"speech_to_text": "Queued"},
    "data": {"speech_to_text": {}}
}
```

```json
{
    "status": {"speech_to_text": "Completed"},
    "data": {
        "speech_to_text": {
            "sentences": [
                ...
                {
                    "sentence": "Be sure to check out the support document at marsview.ai",
                    "start_time": "172200.0",
                    "end_time": "175100.0",
                    "speakers": ["2"],
                    "topics": [{"topic": "support document", "type": "AI Generated"}]
                },
                {
                    "sentence": "Sure, Thats what i was looking for, Thank You!",
                    "start_time": "175100.0",
                    "end_time": "177300.0",
                    "speakers": ["1"],
                    "topics": []
                },
                ...
            ]
        }
    }
}
```
| Field | Description |
| --- | --- |
| sentences | Contains a list of sentence objects with the fields below. |
| sentence | Contains the Marsview STT-generated transcript for that particular chunk. |
| start_time | Starting time of the chunk in milliseconds. |
| end_time | Ending time of the chunk in milliseconds. |
| speakers | Speaker number(s) for that particular chunk (refer to Speaker Separation). |
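For illustration only, here is a short sketch that walks the `sentences` list from the response above and prints each chunk with its speakers, topics, and timestamps converted from milliseconds to seconds:

```python
def print_transcript(sentences: list) -> None:
    """Pretty-print sentence chunks with speaker labels and timestamps in seconds."""
    for chunk in sentences:
        start_s = float(chunk["start_time"]) / 1000.0  # start_time is in milliseconds
        end_s = float(chunk["end_time"]) / 1000.0      # end_time is in milliseconds
        speakers = ", ".join(chunk.get("speakers", []))
        topics = ", ".join(t["topic"] for t in chunk.get("topics", []))
        line = f"[{start_s:7.1f}s - {end_s:7.1f}s] Speaker {speakers}: {chunk['sentence']}"
        if topics:
            line += f"  (topics: {topics})"
        print(line)
```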