How to Convert Speech to Text using Google Speech-to-Text REST API

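There are two ways to send audio to the Speech-to-Text REST API: reference a file already uploaded to Google Cloud Storage, or include the base64-encoded audio directly in the request.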

(1) Upload the Audio File to Google Cloud Storage

First write code to upload the audio file to Google Cloud Storage.
For example, see:

AutoIt Upload File to Google Cloud Storage
C Upload File to Google Cloud Storage
Chilkat2-Python Upload File to Google Cloud Storage
C++ Upload File to Google Cloud Storage
C# Upload File to Google Cloud Storage
DataFlex Upload File to Google Cloud Storage
Delphi DLL Upload File to Google Cloud Storage
Visual FoxPro Upload File to Google Cloud Storage
Go Upload File to Google Cloud Storage
Java Upload File to Google Cloud Storage
Node.js Upload File to Google Cloud Storage
Objective-C Upload File to Google Cloud Storage
Perl Upload File to Google Cloud Storage
PHP Extension Upload File to Google Cloud Storage
PowerBuilder Upload File to Google Cloud Storage
PowerShell Upload File to Google Cloud Storage
PureBasic Upload File to Google Cloud Storage
Ruby Upload File to Google Cloud Storage
Swift Upload File to Google Cloud Storage
Tcl Upload File to Google Cloud Storage
Visual Basic 6.0 Upload File to Google Cloud Storage
VB.NET Upload File to Google Cloud Storage
VBScript Upload File to Google Cloud Storage
Xojo Plugin Upload File to Google Cloud Storage

Next, write the code to make the Speech-to-Text REST API call. You can use Chilkat’s online tool at https://tools.chilkat.io/curlHttp to convert the following curl command to Chilkat code in your chosen programming language.

Note that the "uri" field references the audio file previously uploaded to Google Cloud Storage.

curl -H "Content-Type: application/json" \
     -H "Authorization: Bearer [YOUR_ACCESS_TOKEN]" \
     --data '{
       "config": {
         "encoding":"FLAC",
         "sampleRateHertz": 16000,
         "languageCode": "en-US"
       },
       "audio": {
         "uri":"gs://your-bucket/audio.flac"
       }
     }' \
     "https://speech.googleapis.com/v1/speech:recognize?key=[YOUR_API_KEY]"
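For readers who want to see the shape of the request outside of curl, here is a minimal Python sketch of the same call using only the standard library. It is an illustration, not Chilkat code: the function name build_uri_request and its parameters are hypothetical, and the bucket path and token are placeholders. The api_key argument is optional here on the assumption that a valid bearer token is enough by itself; pass it if your setup requires the key query parameter shown in the curl command.

```python
import json
import urllib.parse
import urllib.request

SPEECH_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_uri_request(gcs_uri, access_token, api_key=None,
                      language_code="en-US", sample_rate_hertz=16000):
    """Build (but do not send) a recognize request for audio stored in GCS.

    Hypothetical helper: mirrors the JSON body and headers of the curl command.
    """
    body = {
        "config": {
            "encoding": "FLAC",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # "uri" points at the object previously uploaded to Cloud Storage.
        "audio": {"uri": gcs_uri},
    }
    url = SPEECH_URL
    if api_key:
        # Optional: append ?key=... as in the curl example.
        url += "?" + urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + access_token,
        },
        method="POST",
    )


def send(request):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To try it, call send(build_uri_request("gs://your-bucket/audio.flac", access_token)) with a real access token; building the request is kept separate from sending it so the JSON body can be inspected first.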

(2) Base64 Encode the Contents of the Audio File and Send Base64 in the REST API POST

Alternatively, you can base64 encode the contents of the audio file and include the base64 string directly in the body of the REST API POST.
To base64 encode, load the file into a Chilkat BinData object, then call BinData.GetEncoded("base64") to get the data as base64.

curl -H "Content-Type: application/json" \
     -H "Authorization: Bearer [YOUR_ACCESS_TOKEN]" \
     --data '{
       "config": {
         "encoding":"FLAC",
         "sampleRateHertz": 16000,
         "languageCode": "en-US"
       },
       "audio": {
         "content": "[BASE64_ENCODED_AUDIO_DATA]"
       }
     }' \
     "https://speech.googleapis.com/v1/speech:recognize?key=[YOUR_API_KEY]"

Again, use Chilkat’s online tool at https://tools.chilkat.io/curlHttp to convert the above curl command to source code.
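The base64 step and the resulting JSON body can also be sketched in plain Python. This is only an illustration using the standard library in place of Chilkat's BinData; the function name build_inline_body is hypothetical, and the file name in the usage note is a placeholder.

```python
import base64
import json


def build_inline_body(audio_bytes, language_code="en-US", sample_rate_hertz=16000):
    """Return the JSON request body with the audio embedded as base64.

    Hypothetical helper: plays the role of BinData.GetEncoded("base64")
    followed by building the JSON shown in the curl command.
    """
    # b64encode returns bytes; decode to str so it can go into JSON.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    body = {
        "config": {
            "encoding": "FLAC",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # "content" carries the audio itself instead of a gs:// URI.
        "audio": {"content": encoded},
    }
    return json.dumps(body)
```

A typical use would be to read the file in binary mode, for example open("audio.flac", "rb").read(), pass the bytes to build_inline_body, and POST the returned string with the same headers as the curl command.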
