Martin Joo

Published by @mmartin_joo on December 26, 2021

Laravel LazyCollection & PHP generators

If you ever need to load thousands or millions of records from your DB or from a log file, you will probably run out of memory. But if you know how:

  • PHP generators
  • LazyCollections in Laravel

work, then you can avoid these problems.

What is a LazyCollection in Laravel?

Imagine you have 10 000 products in your application, or 100 000 users, or 1 000 000 audit log records, so your database table has a lot of rows. In one of your methods you need to query all the records and do something with them. That means querying 100 000 users, loading them into a collection, and then processing each one. At some point you will run out of memory.

This is where LazyCollections come into the picture. They allow us to work with 100 000 users without loading them all into memory at once. Here's what it looks like:

/** @var LazyCollection */
$users = User::orderByDesc('last_login')->cursor();

foreach ($users as $user) {
    echo $user->id;
}

The QueryBuilder has a cursor method that returns a LazyCollection. This will:

  • Only run one database query
  • Only keep one Eloquent model loaded in memory at a time

This is a win-win situation, and you should consider using it when you have large datasets. Now let's see how it works.

PHP generators

LazyCollection utilizes the power of PHP's generators. With a little bit of simplification, a generator function is a function that can return multiple times. But instead of return, it uses the yield keyword. Here's an example:

function getProducts(): Generator
{
    foreach (range(1, 10000) as $i) {
        yield [
            'id' => $i,
            'name' => "Product #{$i}",
            'price' => rand(9, 99),
        ];
    }
}

foreach (getProducts() as $product) {
    echo $product['id'];
}

Any function that uses the yield keyword returns a Generator object, which implements the Iterator interface, so we can use it in a foreach.
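To make the foreach part less mysterious, here is a minimal sketch that walks a Generator by hand with the Iterator methods valid(), current() and next(). The countTo() helper is a made-up name for illustration only.

```php
<?php

// Hypothetical helper for illustration: yields the numbers 1..$count.
function countTo(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield $i;
    }
}

$generator = countTo(3);

// A Generator is an Iterator, so foreach is roughly shorthand for this loop.
$seen = [];
while ($generator->valid()) {
    $seen[] = $generator->current(); // the value that was passed to yield
    $generator->next();              // resume the function until the next yield
}

echo implode(', ', $seen), PHP_EOL; // prints "1, 2, 3"
```

Nothing inside countTo() runs until the first valid() or current() call, and each next() runs just enough code to reach the following yield.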

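Calling getProducts() does not run the loop right away. It returns a Generator, and each iteration of the foreach resumes the function until the next yield, so you get exactly one product back at a time. It produces the same items as this array-based version: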

function getProducts(): array
{
    $products = [];
    foreach (range(1, 10000) as $i) {
        $products[] = [
            'id' => $i,
            'name' => "Product #{$i}",
            'price' => rand(9, 99),
        ];
    }

    return $products;
}

But this function loads all 10 000 products into memory each time you call it.

Now let's see what happens with the memory when we use the function that returns an array:

Working with 10 000 items
Peak memory usage: 5485.6796875KB (5.45MB)

Working with 100 000 items
Peak memory usage: 49240.2265625KB (49MB)

Working with 300 000 items
PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to allocate 20480 bytes)

It reached the 128MB memory limit with 300 000 items. And these items are light-weight arrays with only scalar attributes! Imagine if you had heavy-weight Eloquent models with 4-5 different relationships, attribute accessors and so on...
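If you want to reproduce numbers like these yourself, something along these lines should work. This is only a sketch of how such a report could be produced with PHP's built-in memory_get_peak_usage(); the peakMemoryReport() helper and its output format are my assumptions, not the exact script behind the measurements above.

```php
<?php

// Hypothetical helper: formats the peak memory of the current PHP process.
function peakMemoryReport(int $count): string
{
    $peakKb = memory_get_peak_usage() / 1024;

    return sprintf(
        "Working with %s items\nPeak memory usage: %sKB (%dMB)",
        number_format($count, 0, '.', ' '),
        $peakKb,
        (int) round($peakKb / 1024)
    );
}

// Build the whole array up front, like the array-based getProducts() does.
$count = 10_000;
$products = [];
foreach (range(1, $count) as $i) {
    $products[] = ['id' => $i, 'name' => "Product #{$i}", 'price' => rand(9, 99)];
}

echo peakMemoryReport($count), PHP_EOL;
```

The exact figures depend on your PHP version and platform, so expect them to differ from the ones in this post.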

Now let's see the memory usage of the generator function:

Working with 10 000 items
Peak memory usage: 908.703125KB (<1MB)

Working with 100 000 items
Peak memory usage: 4504.7265625KB (4.5MB)

Working with 1 000 000 items
Peak memory usage: 33176.734375KB (33MB) 

Working with 2 000 000 items
Peak memory usage: 65944.7421875KB (65MB)

Working with 3 000 000 items
PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to allocate 134217736 bytes)

It reached the 128MB memory limit with 3 000 000 items. We were able to loop through 2 million items. This is 20x as much as before! Memory usage still grows with the item count, but most likely not because of the generator itself: range() builds the whole array of integers up front, while the generator only holds one product at a time.
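The remaining memory growth in the numbers above most likely comes from range(), which creates the full array of integers before the first yield. Here is a sketch of a variant that counts with a plain for loop instead. getProductsLazily() is a hypothetical name, and I'm assuming you only need the products one by one.

```php
<?php

// Variant of getProducts() that counts with a for loop instead of range(),
// so no array of integers is ever built.
function getProductsLazily(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield [
            'id' => $i,
            'name' => "Product #{$i}",
            'price' => rand(9, 99),
        ];
    }
}

$before = memory_get_usage();
$total = 0;
foreach (getProductsLazily(1_000_000) as $product) {
    $total++; // only the current product is alive at this point
}
$growthKb = (memory_get_usage() - $before) / 1024;

echo "Iterated {$total} items, memory grew by {$growthKb}KB", PHP_EOL;
```

With this version the memory footprint should stay roughly flat no matter how many items you iterate over, because each product can be freed as soon as the loop moves on.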

So these are generators in a nutshell. They provide the power behind LazyCollection. Now let's implement a LazyCollection class!

Implementing a LazyCollection class that uses Generators

The cursor() example earlier is specific to database queries, but LazyCollection also comes in a more generic form:

LazyCollection::make(function () {
    $handle = fopen('log.txt', 'r');
    while (($line = fgets($handle)) !== false) {
        yield $line;
    }
})->chunk(4)->map(function ($lines) {
    return LogEntry::fromLines($lines);
})->each(function (LogEntry $logEntry) {
    // Process the log entry...
});

First let's create a constructor and a make() helper:

public function __construct(private Closure $source)
{
}

public static function make(Closure $source): static
{
    return new static($source);
}

That's easy: all we do is store the source, which is a Closure. How can it be used in an each(), for example?

public function each(Closure $callback): static
{
    foreach (($this->source)() as $key => $item) {
        $callback($item, $key);
    }
    
    return $this;
}

We call the source, and we know it's gonna return a Generator, so we can iterate through it. And that's all we need for a very basic LazyCollection! Now we can test it:

LazyCollection::make(function () {
    foreach (range(1, 1_000_000) as $i) {
        yield [
            'id' => $i,
            'name' => "Product #{$i}",
            'price' => rand(9, 99),
        ];
    }
})->each(function (array $product) {
    var_dump($product['id']);
});

And it works as expected. Here's the memory usage:

Working with 1 000 000 items
Peak memory usage: 33237.6171875KB (33MB)

33MB with 1 000 000 product arrays. Just as before with generators.

We can also create a range() helper that can be used like this:

$count = 100_000_000;
LazyCollection::range(1, $count)
    ->each(fn (int $i) => var_dump($i));

That's a real method on the Laravel LazyCollection. Here's a simplified implementation:

public static function range(int $from, int $to): static
{
    return new static(function () use ($from, $to) {
        for (; $from <= $to; $from++) {
            yield $from;
        }
    });
}

Once you know how generators work, I think it's easy to understand this one. Basically it's just a wrapper around a PHP Generator.

After implementing each(), the other collection methods are quite similar and easy. Of course, the Laravel LazyCollection is more complicated than this one; this is just a basic implementation for understanding the magic behind LazyCollection. Which, after all, doesn't seem like magic...
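To show what "quite similar" could mean, here is a minimal sketch of lazy map() and filter() methods built on the same idea. The SimpleLazyCollection name is hypothetical (picked so it doesn't clash with Laravel's class), and this is not how Laravel implements them internally, just the same pattern as each() applied again.

```php
<?php

// Hypothetical class name, chosen to avoid clashing with Laravel's LazyCollection.
class SimpleLazyCollection
{
    public function __construct(private Closure $source)
    {
    }

    public static function make(Closure $source): static
    {
        return new static($source);
    }

    // Returns a new collection whose generator transforms items one at a time.
    public function map(Closure $callback): static
    {
        return new static(function () use ($callback) {
            foreach (($this->source)() as $key => $item) {
                yield $key => $callback($item, $key);
            }
        });
    }

    // Returns a new collection that skips items for which the callback is falsy.
    public function filter(Closure $callback): static
    {
        return new static(function () use ($callback) {
            foreach (($this->source)() as $key => $item) {
                if ($callback($item, $key)) {
                    yield $key => $item;
                }
            }
        });
    }

    // Actually runs the pipeline, pulling items through one by one.
    public function each(Closure $callback): static
    {
        foreach (($this->source)() as $key => $item) {
            $callback($item, $key);
        }

        return $this;
    }
}

$result = [];
SimpleLazyCollection::make(function () {
    for ($i = 1; $i <= 10; $i++) {
        yield $i;
    }
})
    ->filter(fn (int $i) => $i % 2 === 0)
    ->map(fn (int $i) => $i * 10)
    ->each(function (int $value) use (&$result) {
        $result[] = $value;
    });

echo implode(', ', $result), PHP_EOL; // prints "20, 40, 60, 80, 100"
```

Notice that map() and filter() don't loop over anything when you call them. They only wrap the previous source in a new generator, and the work happens when each() finally iterates.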