In terminal-based applications, accurately determining the display width of Unicode characters is essential for proper text alignment and formatting. However, handling modern Unicode text presents challenges:
- Emoji and East Asian (CJK) characters often occupy two cells in terminals.
- Zero-width characters (e.g., joiners, marks) don't affect layout but can cause width calculation errors.
- Complex text like emojis with skin tone modifiers or flags requires special handling.
- PHP's built-in functions don't fully address these edge cases.
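The gap in the built-in functions can be seen with a small sketch. It is not taken from the package; it only uses standard PHP string functions and assumes the mbstring extension is enabled. `strlen()` counts bytes and `mb_strlen()` counts code points, so neither reports terminal cells. `mb_strwidth()` comes closest, but it works per code point and may count a combining accent as a cell of its own.

```php
<?php
// Compare PHP's built-in measurements on short UTF-8 strings.
// Assumes the mbstring extension is available.
$samples = [
    'Я',          // Cyrillic letter, normally one terminal cell
    '文',          // CJK ideograph, normally two terminal cells
    "e\u{0301}",  // "e" followed by a combining acute accent, rendered as one cell
];

foreach ($samples as $text) {
    printf(
        "%s  bytes=%d  code_points=%d  mb_strwidth=%d\n",
        $text,
        strlen($text),               // number of UTF-8 bytes
        mb_strlen($text, 'UTF-8'),   // number of Unicode code points
        mb_strwidth($text, 'UTF-8')  // per-code-point width estimate
    );
}
```

None of these calls groups code points into grapheme clusters first, which is why sequences such as an emoji followed by a skin tone modifier are hard to measure with them.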
To address these issues, Aaron Francis created the Grapheme library, which provides an accurate, performant, and thoroughly tested way to calculate the display width of any character or grapheme cluster in PHP applications.
Key Features:
- Comprehensive Unicode Support: Handles CJK characters, emoji (including modifiers), zero-width characters, combining marks, regional indicators, and variation selectors.
- High Performance: Optimized with early-return paths and smart caching mechanisms.
- Minimal Dependencies: Requires only PHP 8.2+; the intl extension is optional.
- Terminal Compatibility: Aims to match the behavior of wcwidth() in modern terminal emulators.
To install the package, use Composer:
composer require soloterm/grapheme
Here are some examples of using the package:
use SoloTerm\Grapheme\Grapheme;

Grapheme::wcwidth('Я'); // Returns: 1
Grapheme::wcwidth('文'); // Returns: 2
Grapheme::wcwidth('😀'); // Returns: 2
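A per-cluster width function like this is typically used to align columns. The following is a minimal sketch of that idea, not code from the package: `padToCells()` and `$naive` are hypothetical names introduced for illustration, and the width function is passed in as a callable so the sketch runs without the package installed. It splits the text into grapheme clusters with the PCRE `\X` escape and sums the width of each cluster.

```php
<?php
/**
 * Hypothetical helper: right-pad $text with spaces until it fills $cells
 * terminal cells. $widthOf is any callable that returns the cell width of
 * a single grapheme cluster.
 */
function padToCells(string $text, int $cells, callable $widthOf): string
{
    // \X matches one extended grapheme cluster (PCRE with Unicode support).
    preg_match_all('/\X/u', $text, $matches);

    $used = 0;
    foreach ($matches[0] as $cluster) {
        $used += $widthOf($cluster);
    }

    // Never pad by a negative amount when the text is already too wide.
    return $text . str_repeat(' ', max(0, $cells - $used));
}

// Stand-in width function for illustration only: every cluster counts as one cell.
$naive = fn (string $cluster): int => 1;

echo '[' . padToCells('abc', 6, $naive) . "]\n";
```

With the package installed and autoloaded, `Grapheme::wcwidth(...)` could presumably be passed in place of `$naive`, assuming it is callable statically as in the examples above, so that wide characters and emoji are counted as two cells.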
For more information and to view the source code, visit the Grapheme GitHub repository.