Whisper.php - Automatic speech recognition and transcription


Speech recognition can be complex, but Whisper.php helps simplify the process. Whisper.php is a PHP wrapper for whisper.cpp, a C/C++ port of OpenAI's Whisper model. The package was created by Kyrian Obikwelu, who recently released v1.0.0, enabling fully local, API-free transcription directly in your projects. It provides:

  • High and low-level APIs
  • Model auto-downloading
  • Support for various audio formats (e.g. MP3, WAV, OGG, M4A)
  • Multiple output format exports (e.g. TXT, SRT, VTT, or CSV)
  • Callback support for streaming and progress tracking

Whisper.php requires the FFI (Foreign Function Interface) extension to be installed and enabled in PHP. This extension allows you to interact with C libraries directly from PHP.
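Before installing, it can help to confirm that FFI is actually usable in the PHP runtime you plan to run transcription from. The following is a minimal sketch using only built-in PHP functions; `ffiStatus()` is a hypothetical helper written for this example and is not part of Whisper.php.

```php
<?php
// Minimal sketch: report whether the FFI extension is usable in this PHP runtime.

/**
 * Hypothetical helper (not part of Whisper.php): summarize FFI availability.
 *
 * @return array{loaded: bool, mode: string}
 */
function ffiStatus(): array
{
    $loaded = extension_loaded('ffi');

    // ffi.enable is usually "preload" (the PHP default, which permits FFI in the
    // CLI and in preloaded files), an enabled value, or a disabled value.
    $mode = $loaded ? (string) ini_get('ffi.enable') : 'unavailable';

    return ['loaded' => $loaded, 'mode' => $mode];
}

$status = ffiStatus();

if (!$status['loaded']) {
    echo "FFI is not loaded. Enable the ffi extension in php.ini.\n";
} else {
    echo "FFI is loaded (ffi.enable = {$status['mode']}).\n";
}
```

Run it with the same SAPI you intend to use (CLI or web), since the `ffi.enable` setting can behave differently between them.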

Assuming you have FFI enabled, you can install Whisper.php with Composer:

composer require codewithkyrian/whisper.php

Whisper.php offers both low-level and high-level APIs. The low-level API provides fine-grained control over the transcription process, closely mirroring the original C implementation. The high-level API offers a simpler, more abstracted interface for a streamlined workflow.

For this article, we will use the high-level API.

use Codewithkyrian\Whisper\Whisper;
use function Codewithkyrian\Whisper\readAudio;
use function Codewithkyrian\Whisper\toTimestamp;
 
// Transcribe Audio
$whisper = Whisper::fromPretrained('tiny.en', baseDir: __DIR__.'/models');
$audio = readAudio(__DIR__.'/audio/laravel-news-227-sample.mp3');
$segments = $whisper->transcribe($audio, 4);
 
// Output transcribed segment data
foreach ($segments as $segment) {
    echo toTimestamp($segment->startTimestamp) . ': ' . $segment->text . "\n";
}

Whisper.php relies on some platform-specific shared libraries. These are downloaded automatically the first time you initialize a model with Whisper::fromPretrained() and are stored in our models directory. The initial download causes a slight delay on the first run, but once the libraries are cached, subsequent runs are much faster. Supported Whisper base models include tiny.en, base, and base.en, among others.

Next, the readAudio() function simplifies audio processing by resampling the audio to 16 kHz, a sample rate that balances audio quality and efficiency. This captures the core frequencies of human speech while reducing the amount of data to process.
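To get a feel for the data reduction, here is a back-of-the-envelope sketch in plain PHP. It assumes a mono clip and a 44.1 kHz source rate (a common rate for MP3 files, but an assumption here); `sampleCount()` is a hypothetical helper, not part of Whisper.php. A 16 kHz sample rate can represent frequencies up to 8 kHz, which is where most speech energy sits.

```php
<?php
// Back-of-the-envelope sketch: how many samples a mono clip holds at a given rate.

/** Hypothetical helper: number of samples in a mono clip of the given length. */
function sampleCount(float $seconds, int $sampleRateHz): int
{
    return (int) round($seconds * $sampleRateHz);
}

$seconds = 60.0;                            // one minute of audio
$original = sampleCount($seconds, 44100);   // assumed source rate: 44.1 kHz
$resampled = sampleCount($seconds, 16000);  // rate after resampling: 16 kHz

printf("44.1 kHz: %d samples\n", $original);
printf("16 kHz:   %d samples\n", $resampled);
printf("Reduction: %.1f%%\n", (1 - $resampled / $original) * 100);
```

Under these assumptions, one minute drops from 2,646,000 samples to 960,000, roughly a 63.7% reduction in the data the model has to process.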

The transcribe() method then takes the resampled audio and breaks it up into segments with start and end timestamps along with the text, which we can output in our desired format.

As a test, we used a recent episode of the Laravel News Podcast. As you can see, the transcription is not perfect, but it does a good job. The output looks like the following:

00:00:00,000: Hey everybody how's it going welcome to the level this podcast episode 227 today is November
00:00:05,040: 26th
00:00:06,400: 2024
00:00:07,680: Glad to have you hanging out with us and glad that Michael finally figured out his microphone...
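Timestamped segments like these map naturally onto subtitle formats such as SRT. The sketch below illustrates the format only and does not use the package's own export features. It works on plain arrays with millisecond timestamps; the array shape, the millisecond unit, the `formatSrtTime()` and `toSrt()` helpers, and the end times (taken as the start of the next segment) are all assumptions made for this example.

```php
<?php
// Sketch: render timestamped segments as SRT using plain arrays.
// The segment shape and the millisecond unit are assumptions for illustration.

/** Hypothetical helper: format milliseconds as HH:MM:SS,mmm. */
function formatSrtTime(int $ms): string
{
    $hours = intdiv($ms, 3600000);
    $minutes = intdiv($ms % 3600000, 60000);
    $seconds = intdiv($ms % 60000, 1000);
    $millis = $ms % 1000;

    return sprintf('%02d:%02d:%02d,%03d', $hours, $minutes, $seconds, $millis);
}

/**
 * Hypothetical helper: build an SRT document from segments.
 *
 * @param array<int, array{start: int, end: int, text: string}> $segments
 */
function toSrt(array $segments): string
{
    $blocks = [];
    foreach (array_values($segments) as $i => $segment) {
        // Each SRT cue: 1-based index, time range, then the text.
        $blocks[] = sprintf(
            "%d\n%s --> %s\n%s\n",
            $i + 1,
            formatSrtTime($segment['start']),
            formatSrtTime($segment['end']),
            trim($segment['text'])
        );
    }

    // Cues are separated by a blank line.
    return implode("\n", $blocks);
}

// Sample data based on the output above; end times are assumed.
$segments = [
    ['start' => 0, 'end' => 5040, 'text' => "Hey everybody how's it going"],
    ['start' => 5040, 'end' => 6400, 'text' => '26th'],
];

echo toSrt($segments);
```

With real transcription results, you would first convert each segment's timestamps into whatever unit your formatter expects.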

Note: At the time of writing, only Linux and macOS are supported, while Windows support is still under development.

You can learn more about this package and view the source code on GitHub.

Yannick Lyn Fatt

Staff Writer at Laravel News and Full stack web developer.
