How to Convert HTML to Plain Text in PHP
Last updated by Paul Redmond
The Hypertext PHP package converts HTML to plain text and gracefully handles varied and malformed HTML. It takes an HTML string as input and strips all the markup, leaving only the text. Suppose $input holds the following HTML:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>My Blog</title>
</head>
<body>
    <h1>Welcome to My Blog</h1>
    <p>This is a paragraph of text on my webpage.</p>
    <a href="https://blog.com/posts">Click here</a> to view my posts.
</body>
</html>
Using Hypertext's Transformer class, you can convert the HTML to plain text like so:
use Stevebauman\Hypertext\Transformer;

// Pure text output
echo (new Transformer)->toText($input);

// Welcome to My Blog This is a paragraph of text on my webpage. Click here to view my posts.
Let's say that you want to convert everything to text but also preserve newlines and links:
echo (new Transformer)
    ->keepLinks()
    ->keepNewLines()
    ->toText($input);

/*
Welcome to My Blog
This is a paragraph of text on my webpage.
<a href="https://blog.com/posts">Click here</a> to view my posts.
*/
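To get a feel for what keeping links and new lines means, here is a rough sketch built only on PHP's own strip_tags() and preg_replace(). The roughTextWithLinks() helper is hypothetical and written just for this illustration; it is not part of Hypertext and should not be read as how the package works internally.

```php
<?php

// Hypothetical helper for illustration only; not part of Hypertext.
function roughTextWithLinks(string $html): string
{
    // Treat the end of a few common block-level elements as a line break.
    $html = preg_replace('#</(h[1-6]|p|div|li)>#i', "\n", $html);

    // The second argument to strip_tags() lists tags to keep, so only <a> survives.
    $text = strip_tags($html, '<a>');

    // Trim every line and drop the ones that end up empty.
    $lines = array_filter(array_map('trim', explode("\n", $text)), 'strlen');

    return implode("\n", $lines);
}

$html = '<h1>Welcome to My Blog</h1><p>This is a paragraph.</p>'
    . '<a href="https://blog.com/posts">Click here</a> to view my posts.';

echo roughTextWithLinks($html), PHP_EOL;
```

A sketch like this only copes with tidy input. Malformed markup, nested blocks, and embedded styles or scripts are where a dedicated package is likely to do better.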
Here are the main features of the Hypertext package:
- Removes CSS
- Removes scripts
- Removes headers
- Removes non-HTML based content
- Preserves spacing
- Preserves links (optional)
- Preserves new lines (optional)
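The CSS and script removal in the list above matters more than it might seem. As a minimal sketch, assuming nothing beyond PHP's built-in functions, the snippet below shows that strip_tags() alone removes the tags but leaves the contents of style and script elements in the output. Both helper functions are hypothetical names made up for this example, and the regex-based cleanup is a simplification, not Hypertext's approach.

```php
<?php

// Hypothetical helper: strip tags, decode entities, collapse whitespace.
function naiveHtmlToText(string $html): string
{
    // strip_tags() drops the tags themselves but keeps the text inside <style> and <script>.
    $text = strip_tags($html);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    return trim(preg_replace('/\s+/', ' ', $text));
}

// Hypothetical helper: remove whole <script> and <style> elements before stripping tags.
function lessNaiveHtmlToText(string $html): string
{
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html);

    return naiveHtmlToText($html);
}

$html = '<style>p { color: red; }</style><p>Hello &amp; welcome</p><script>alert(1);</script>';

// The CSS rule and the script body leak into this result.
echo naiveHtmlToText($html), PHP_EOL;

// Only the paragraph text remains here.
echo lessNaiveHtmlToText($html), PHP_EOL;
```

Regular expressions are fragile against broken or unusual markup, which is the kind of input the package is described as handling.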
You can find the complete details for this package on GitHub: stevebauman/hypertext.