PHP JSON Parser - Read large JSON from any source in a memory-efficient way


July 13th, 2023

JSON Parser is a zero-dependency pull parser for reading large JSON from any source in a memory-efficient way. A source can be a string, a URL, and more, and you can iterate through it like so:

use Cerbero\JsonParser\JsonParser;

// a source is anything that can provide JSON, in this case an endpoint
$source = '';

foreach (new JsonParser($source) as $key => $value) {
    // instead of loading the whole JSON, we keep only one key and value in memory at a time
}

If you don't want to use foreach, this parser also comes with a traverse() method that looks like the following:

JsonParser::parse($source)->traverse(function (mixed $value, string|int $key, JsonParser $parser) {
    // lazily load one key and value at a time; we can also access the parser if needed
});

The above examples demonstrate using a URL to process JSON, but the package supports several data sources. At the time of writing, the readme lists the following sources:

  • strings, e.g. {"foo":"bar"}
  • iterables, i.e. arrays or instances of Traversable
  • file paths, e.g. /path/to/large.json
  • resources, e.g. streams
  • API endpoint URLs, e.g. https://endpoint.json or any instance of Psr\Http\Message\UriInterface
  • PSR-7 requests, i.e. any instance of Psr\Http\Message\RequestInterface
  • PSR-7 messages, i.e. any instance of Psr\Http\Message\MessageInterface
  • PSR-7 streams, i.e. any instance of Psr\Http\Message\StreamInterface
  • Laravel HTTP client requests, i.e. any instance of Illuminate\Http\Client\Request
  • Laravel HTTP client responses, i.e. any instance of Illuminate\Http\Client\Response
  • user-defined sources, i.e. any instance of Cerbero\JsonParser\Sources\Source
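
To make the idea of a "source" concrete, here is a minimal sketch of how a stream resource could be consumed lazily, one chunk at a time. It is plain PHP and does not reflect the library's internals; the readChunks() name and the chunk size are assumptions made for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Lazily yield chunks of at most $chunkSize bytes from a stream resource.
 * Hypothetical helper, not part of the JSON Parser package.
 *
 * @param resource $stream
 * @return Generator<int, string>
 */
function readChunks($stream, int $chunkSize = 8192): Generator
{
    while (!feof($stream)) {
        $chunk = fread($stream, $chunkSize);

        // stop on a read error or when nothing is left to read
        if ($chunk === false || $chunk === '') {
            break;
        }

        yield $chunk;
    }
}

// An in-memory stream stands in for a large file or HTTP body.
$stream = fopen('php://memory', 'r+');
fwrite($stream, '{"foo":"bar","baz":[1,2,3]}');
rewind($stream);

foreach (readChunks($stream, 8) as $chunk) {
    // only the current chunk is held in memory here
    echo $chunk, PHP_EOL;
}

fclose($stream);
```

Because the function is a generator, nothing is read until the loop asks for the next chunk, which is the same property that lets a pull parser keep memory usage flat.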

Another amazing feature of this library is pointers, which are useful for extracting only specific subtrees from a large JSON dataset:

// Select the first gender result
$json = JsonParser::parse($source)->pointer('/results/0/gender');

foreach ($json as $key => $value) {
    // 1st and only iteration: $key === 'gender', $value === 'female'
}

// Get all gender results
$json = JsonParser::parse($source)->pointer('/results/-/gender');
// ...
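
To show what the two pointers above select, here is a small sketch that resolves a pointer against an already decoded array. It only illustrates the selection semantics: unlike the package, it is not lazy, because it needs the whole document in memory. The selectByPointer() and walkPointer() helpers and the sample data are hypothetical, and treating "-" as "any item" is an assumption based on the "Get all gender results" example.

```php
<?php

declare(strict_types=1);

/**
 * Yield every key/value pair that a pointer selects in decoded JSON data.
 * Hypothetical helper, not part of the JSON Parser package.
 */
function selectByPointer(mixed $data, string $pointer): Generator
{
    // "/results/0/gender" becomes ['results', '0', 'gender']
    $tokens = $pointer === '' ? [] : explode('/', substr($pointer, 1));

    yield from walkPointer($data, $tokens, null);
}

function walkPointer(mixed $node, array $tokens, string|int|null $key): Generator
{
    if ($tokens === []) {
        yield $key => $node; // pointer fully consumed: this node is a match
        return;
    }

    if (!is_array($node)) {
        return; // scalar reached before the pointer ended: no match
    }

    $token = array_shift($tokens);

    if ($token === '-') {
        // assumed wildcard: descend into every item of the current array
        foreach ($node as $index => $item) {
            yield from walkPointer($item, $tokens, $index);
        }
        return;
    }

    // unescape "~1" and "~0" as defined by RFC 6901
    $token = str_replace(['~1', '~0'], ['/', '~'], $token);

    if (array_key_exists($token, $node)) {
        yield from walkPointer($node[$token], $tokens, $token);
    }
}

// Sample data invented for this example.
$data = json_decode('{"results":[{"gender":"female"},{"gender":"male"}]}', true);

foreach (selectByPointer($data, '/results/-/gender') as $key => $value) {
    echo "$key: $value", PHP_EOL;
}
```

Note that the wildcard pointer yields the same key more than once, so collect the results with iterator_to_array($generator, false) if you want every value.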

This package has many other features I haven't mentioned that are worth checking out. For example, it has a progress API to track parsing progress (e.g., completion percentage and bytes processed). You can learn more about this package, get full installation instructions, and view the source code on GitHub.

Paul Redmond

Full stack web developer. Author of Lumen Programming Guide and Docker for PHP Developers.