dtinth · August 13, 2019 12:59
diff --git a/README.md b/README.md
@@ -0,0 +1,61 @@
+How to transcribe Thai speech in videos into text.
+
+## Requirements
+
+- Google Cloud or Firebase project **with** billing enabled.
+
+- [`gcloud` command line tool installed](https://cloud.google.com/sdk/gcloud/).
+
+- `ffmpeg` or Docker.
+
+- `youtube-dl` to download YouTube videos.
+
+- 30 Baht per hour of input audio (the transcription cost).
+
+## Step 1: Grab the audio track
+
+For example, to download the audio from YouTube using `youtube-dl`:
+
+```
+youtube-dl -f bestaudio 'https://www.youtube.com/watch?v=..........'
+```
+
+## Step 2: Convert
+
+We need to convert the audio into a format that is supported by the Google Cloud APIs.
+We will use OGG Opus at a 16 kHz sample rate, mono.
+
+```
+docker run -v "$PWD:/data" jrottenberg/ffmpeg -i "/data/<FILENAME>.m4a" -c:a libopus -ar 16000 -ac 1 "/data/<FILENAME>.ogg"
+```
+
+To cut a portion of audio, put `-ss <START TIME> -t <DURATION>` before `-i`. For example, `-ss 01:38:23 -t 00:30:00`.
+
+## Step 3: Recognize
+
+1. Upload the `.ogg` file to Google Cloud Storage (or Firebase Cloud Storage). After uploading, you will get a `<STORAGE LOCATION>` such as `gs://<PROJECT>.appspot.com/transcribe/<FILENAME>.ogg`.
+
+2. Start the transcription:
+   ```sh
+   gcloud ml speech recognize-long-running "<STORAGE LOCATION>" --language-code=th --encoding=ogg-opus --include-word-time-offsets --sample-rate=16000 --async
+   ```
+
+   It will print out:
+
+   ```json
+   {
+     "name": "5766027198115285298"
+   }
+   ```
+
+   This is your `<OPERATION ID>`.
+
+3. Wait for the operation to finish and write the results to a file:
+
+   ```sh
+   gcloud ml speech operations wait "<OPERATION ID>" > "<FILENAME>.json"
+   ```
+
+Then open the JSON file to view the transcription results:
+
+![](https://i.imgur.com/wMu4DUp.png)
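
Reading the raw JSON by eye gets tedious for long recordings, so here is a minimal sketch of pulling the text out of it. It is not part of the README above. It assumes the saved file follows the usual Speech-to-Text response shape (a top-level `results` list, each entry holding `alternatives` with a `transcript` and, because of `--include-word-time-offsets`, a `words` list whose entries carry a `startTime` string such as `"1.300s"`). The function name `extract_transcripts` is hypothetical.

```python
import json
import sys


def extract_transcripts(response):
    """Return the top transcript of each result, one string per result.

    Assumes the response shape described above; missing keys are
    tolerated and simply skipped.
    """
    lines = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        # The first alternative is treated as the preferred one.
        best = alternatives[0]
        text = best.get("transcript", "").strip()
        if not text:
            continue
        # Prefix the line with the start time of its first word, if present.
        words = best.get("words", [])
        start = words[0].get("startTime", "") if words else ""
        lines.append(f"[{start}] {text}" if start else text)
    return lines


# Only runs when exactly one argument (the JSON file path) is given.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], encoding="utf-8") as f:
        for line in extract_transcripts(json.load(f)):
            print(line)
```

Saved under a hypothetical name such as `transcripts.py`, it could be run as `python3 transcripts.py "<FILENAME>.json"` to print one line per recognized segment. If the actual output is structured differently, adjust the key names to match.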