Speech-to-text has two REST APIs. Each serves a specific purpose and uses a different set of endpoints.
The Speech-to-text REST APIs are:
- Speech-to-text REST API v3.0 is used for Batch transcription and Custom Speech. v3.0 is the successor to v2.0.
- Speech-to-text REST API for short audio is used for online transcription as an alternative to the Speech SDK. Each request to this API can carry at most 60 seconds of audio.
The following PowerShell script sends a WAV file to the recognition endpoint and prints the response as JSON:

    # Recognition service URI
    $URI = 'https://speech.platform.bing.com/speech/recognition/interactive/cognitiveservices/v1?language=de-de&format=detailed'

    # Headers (include the API key)
    $Cabeceras = @{
        'Ocp-Apim-Subscription-Key' = 'API key'
        'Transfer-Encoding'         = 'chunked'
        'Content-type'              = 'audio/wav; codec=audio/pcm; samplerate=16000'
    }

    # Read the WAV file as bytes
    $AudioBytes = [System.IO.File]::ReadAllBytes("C:\Users\juan\Desktop\recono\Intro.wav")

    # Send the request and get the response
    $Respuesta = Invoke-RestMethod -Method POST -Uri $URI -Headers $Cabeceras -Body $AudioBytes

    # Convert the response to JSON
    ConvertTo-Json $Respuesta
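Because the short-audio API accepts at most 60 seconds of audio per request, it can help to estimate the duration of a file before sending it. The sketch below is only an illustration: `Get-WavDurationSeconds` and `$WavPath` are hypothetical names, not part of any Speech API. It assumes an uncompressed PCM WAV file with the canonical 44-byte header, so files with extra header chunks yield only an approximation.

```powershell
# Hypothetical helper (not part of any Speech API): estimates the duration
# of an uncompressed PCM WAV file from its header.
# Assumption: canonical 44-byte RIFF/WAVE header with no extra chunks.
function Get-WavDurationSeconds {
    param([Parameter(Mandatory)][byte[]]$Bytes)

    if ($Bytes.Length -lt 44) {
        throw 'Too short to contain a canonical WAV header.'
    }

    # Bytes 28-31 hold the byte rate (audio bytes per second).
    # BitConverter uses the machine byte order, assumed little-endian here.
    $byteRate = [System.BitConverter]::ToUInt32($Bytes, 28)
    if ($byteRate -eq 0) {
        throw 'Byte rate in the WAV header is zero.'
    }

    # Everything after the 44-byte header is treated as audio data.
    return ($Bytes.Length - 44) / $byteRate
}

# Same file path as in the request example; the check is skipped if the file is missing.
$WavPath = "C:\Users\juan\Desktop\recono\Intro.wav"
if (Test-Path $WavPath) {
    $Seconds = Get-WavDurationSeconds -Bytes ([System.IO.File]::ReadAllBytes($WavPath))
    if ($Seconds -gt 60) {
        Write-Warning "Audio is about $([math]::Round($Seconds, 1)) s long, above the 60-second limit."
    }
}
```

If the estimate exceeds 60 seconds, the audio would need to be split into shorter segments or handled through Batch transcription instead.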