Whisper.php - Automatic speech recognition and transcription

Speech recognition can be complex, but Whisper.php can simplify the process. Whisper.php is a PHP wrapper for whisper.cpp, a C/C++ port of OpenAI's Whisper model. The package was created by Kyrian Obikwelu, who recently released v1.0.0, enabling fully local, API-free transcription directly in your projects. It provides:

  • High and low-level APIs
  • Model auto-downloading
  • Support for various audio formats (e.g. MP3, WAV, OGG, M4A)
  • Multiple output format exports (e.g. TXT, SRT, VTT, or CSV)
  • Callback support for streaming and progress tracking

Whisper.php requires the FFI (Foreign Function Interface) extension to be installed and enabled in PHP. This extension lets you call C libraries directly from PHP.
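Before installing, it can help to confirm that FFI is actually usable in the PHP build you are running. The following is a minimal sketch, not part of Whisper.php: `ffiStatus()` is a hypothetical helper name, and the interpretation of the `ffi.enable` values is an assumption based on PHP's documented settings (`preload` being the default, which allows FFI in CLI scripts and preloaded files).

```php
<?php

// Hypothetical helper (not part of Whisper.php): reports whether FFI looks usable.
function ffiStatus(): string
{
    // The extension must be compiled in or loaded at all.
    if (!extension_loaded('ffi')) {
        return 'missing';
    }

    // ffi.enable is typically "preload" (default), "true"/"1", or "false"/"0".
    $setting = strtolower((string) ini_get('ffi.enable'));

    if (in_array($setting, ['1', 'true', 'on', 'yes'], true)) {
        return 'enabled';
    }

    if ($setting === 'preload') {
        // Assumption: with "preload", FFI works in CLI scripts but not in web requests.
        return PHP_SAPI === 'cli' ? 'enabled' : 'preload-only';
    }

    return 'disabled';
}

echo 'FFI: ' . ffiStatus() . "\n";
```

If this reports `missing` or `disabled`, the extension or the `ffi.enable` setting in your `php.ini` is the place to look.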

Assuming you have FFI enabled, install Whisper.php with Composer:

composer require codewithkyrian/whisper.php

Whisper.php offers both low-level and high-level APIs. The low-level API provides fine-grained control over the transcription process, closely mirroring the original C implementation. The high-level API offers a simpler, more abstracted interface for a streamlined workflow.

For the purpose of this article, we will use the high-level API.

use Codewithkyrian\Whisper\Whisper;
use function Codewithkyrian\Whisper\readAudio;
use function Codewithkyrian\Whisper\toTimestamp;

// Load the model (downloaded into ./models on first use)
$whisper = Whisper::fromPretrained('tiny.en', baseDir: __DIR__.'/models');

// Read the audio file and resample it to 16 kHz
$audio = readAudio(__DIR__.'/audio/laravel-news-227-sample.mp3');

// Transcribe the audio into timestamped segments
$segments = $whisper->transcribe($audio, 4);

// Output each segment's start time and text
foreach ($segments as $segment) {
    echo toTimestamp($segment->startTimestamp) . ': ' . $segment->text . "\n";
}

Whisper.php relies on platform-specific shared libraries. They are downloaded automatically the first time you initialize a model with Whisper::fromPretrained() and stored in our models directory. This initial download causes a slight delay on the first run, but once the libraries are cached, subsequent runs are much faster. Supported Whisper models include tiny.en, base, and base.en, among others.

Next, the readAudio() function simplifies audio processing by resampling the audio to 16 kHz, a balance between audio quality and efficiency. This rate captures the core frequencies of human speech while reducing the amount of data to process.
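To make the data reduction concrete, here is a rough sketch of what resampling means, using plain linear interpolation. This is an illustration only and is not how readAudio() is implemented: `resampleLinear()` is a hypothetical helper, and production resamplers typically also apply a low-pass filter to avoid aliasing.

```php
<?php

/**
 * Illustrative linear-interpolation resampler (hypothetical, not part of Whisper.php).
 *
 * @param float[] $samples Mono PCM samples
 * @return float[] Samples at the target rate
 */
function resampleLinear(array $samples, int $fromRate, int $toRate): array
{
    $inCount = count($samples);
    if ($inCount === 0 || $fromRate === $toRate) {
        return $samples;
    }

    // Number of output samples covering the same duration.
    $outCount = (int) floor($inCount * $toRate / $fromRate);
    $ratio = $fromRate / $toRate;

    $out = [];
    for ($i = 0; $i < $outCount; $i++) {
        // Position of this output sample on the input timeline.
        $pos = $i * $ratio;
        $left = (int) floor($pos);
        $right = min($left + 1, $inCount - 1);
        $frac = $pos - $left;

        // Blend the two neighbouring input samples.
        $out[] = $samples[$left] * (1 - $frac) + $samples[$right] * $frac;
    }

    return $out;
}

// One second of a 440 Hz tone sampled at 44.1 kHz.
$input = [];
for ($n = 0; $n < 44100; $n++) {
    $input[] = sin(2 * M_PI * 440 * $n / 44100);
}

$output = resampleLinear($input, 44100, 16000);
echo count($input) . ' -> ' . count($output) . " samples\n"; // 44100 -> 16000 samples
```

One second of audio shrinks from 44,100 samples to 16,000, which is roughly a 64% reduction in the data the model has to process.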

The transcribe() method then takes the resampled audio and breaks it up into segments with start and end timestamps along with the text, which we can output in our desired format.
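As an illustration of turning segments into one of those formats, the sketch below builds SRT subtitle text by hand. Whisper.php ships its own exporters, so this is only meant to show the shape of the data. The array keys (`start`, `end`, `text`), the millisecond unit, and the helper names `formatSrtTime()` and `toSrt()` are all assumptions made for this example, and the sample segments are made up.

```php
<?php

// Hypothetical helper: formats milliseconds as an SRT timestamp (HH:MM:SS,mmm).
function formatSrtTime(int $ms): string
{
    $hours = intdiv($ms, 3600000);
    $minutes = intdiv($ms % 3600000, 60000);
    $seconds = intdiv($ms % 60000, 1000);
    $millis = $ms % 1000;

    return sprintf('%02d:%02d:%02d,%03d', $hours, $minutes, $seconds, $millis);
}

/**
 * Hypothetical helper: renders segments as SRT cues.
 *
 * @param array<int, array{start: int, end: int, text: string}> $segments Times in milliseconds (assumed)
 */
function toSrt(array $segments): string
{
    $cues = [];
    foreach (array_values($segments) as $i => $segment) {
        // Each cue: sequence number, time range, text, then a blank line between cues.
        $cues[] = ($i + 1) . "\n"
            . formatSrtTime($segment['start']) . ' --> ' . formatSrtTime($segment['end']) . "\n"
            . trim($segment['text']) . "\n";
    }

    return implode("\n", $cues);
}

// Made-up sample data, not real transcription output.
$sample = [
    ['start' => 0, 'end' => 5040, 'text' => 'First segment'],
    ['start' => 5040, 'end' => 6400, 'text' => 'Second segment'],
];

echo toSrt($sample);
```

With real output from transcribe(), you would map each segment object's timestamps and text into this shape first, converting units as needed.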

As a test, we used a recent episode of the Laravel News Podcast. As you can see, the result is not perfect, but it does a good job. The output looks like this:

00:00:00,000: Hey everybody how's it going welcome to the level this podcast episode 227 today is November
00:00:05,040: 26th
00:00:06,400: 2024
00:00:07,680: Glad to have you hanging out with us and glad that Michael finally figured out his microphone...

Note: At the time of writing, only Linux and macOS are supported, while Windows support is still under development.

You can learn more about this package and view the source code on GitHub.

Yannick Lyn Fatt

Staff Writer at Laravel News and Full stack web developer.
