$ ./main -m models/ggml-base-q5_1.bin -f ~/zh.wav
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-base-q5_1.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 9
whisper_model_load: qntvr = 1
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: CPU total size = 59.22 MB (1 buffers)
whisper_model_load: model size = 59.12 MB
whisper_init_state: kv self size = 16.52 MB
whisper_init_state: kv cross size = 18.43 MB
whisper_init_state: compute buffer (conv) = 16.17 MB
whisper_init_state: compute buffer (encode) = 94.42 MB
whisper_init_state: compute buffer (cross) = 5.08 MB
whisper_init_state: compute buffer (decode) = 105.96 MB

system_info: n_threads = 4 / 32 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 0 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 |

main: processing '/home/ubuntu/zh.wav' (79949 samples, 5.0 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...

[00:00:00.000 --> 00:00:04.480] I think running is the most important thing for me to see health.

whisper_print_timings: load time = 53.78 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 6.41 ms
whisper_print_timings: sample time = 27.94 ms / 82 runs ( 0.34 ms per run)
whisper_print_timings: encode time = 1310.36 ms / 1 runs ( 1310.36 ms per run)
whisper_print_timings: decode time = 4.54 ms / 2 runs ( 2.27 ms per run)
whisper_print_timings: batchd time = 132.78 ms / 78 runs ( 1.70 ms per run)
whisper_print_timings: prompt time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: total time = 1539.46 ms