The Iterator interface
(PHP 5 >= 5.0.0)
Introduction
Interface for external iterators or objects that can be iterated themselves internally.
Interface synopsis
Predefined iterators
PHP already provides a number of iterators for many day-to-day tasks. See SPL iterators for a list.
Examples
Example #1 Basic usage
This example demonstrates in which order methods are called when using foreach with an iterator.
<?php
class myIterator implements Iterator {
private $position = 0;
private $array = array(
"firstelement",
"secondelement",
"lastelement",
);
public function __construct() {
$this->position = 0;
}
function rewind() {
var_dump(__METHOD__);
$this->position = 0;
}
function current() {
var_dump(__METHOD__);
return $this->array[$this->position];
}
function key() {
var_dump(__METHOD__);
return $this->position;
}
function next() {
var_dump(__METHOD__);
++$this->position;
}
function valid() {
var_dump(__METHOD__);
return isset($this->array[$this->position]);
}
}
$it = new myIterator;
foreach($it as $key => $value) {
var_dump($key, $value);
echo "\n";
}
?>
The above example will output something similar to:
string(18) "myIterator::rewind"
string(17) "myIterator::valid"
string(19) "myIterator::current"
string(15) "myIterator::key"
int(0)
string(12) "firstelement"

string(16) "myIterator::next"
string(17) "myIterator::valid"
string(19) "myIterator::current"
string(15) "myIterator::key"
int(1)
string(13) "secondelement"

string(16) "myIterator::next"
string(17) "myIterator::valid"
string(19) "myIterator::current"
string(15) "myIterator::key"
int(2)
string(11) "lastelement"

string(16) "myIterator::next"
string(17) "myIterator::valid"
Table of Contents
- Iterator::current — Return the current element
- Iterator::key — Return the key of the current element
- Iterator::next — Move forward to next element
- Iterator::rewind — Rewind the Iterator to the first element
- Iterator::valid — Checks if current position is valid
Comments
It's important to note that the following won't work if you have null values.
<?php
function valid() {
var_dump(__METHOD__);
return isset($this->array[$this->position]);
}
?>
Other examples have shown the following which won't work if you have false values:
<?php
function valid() {
return $this->current() !== false;
}
?>
Instead use:
<?php
function valid() {
return array_key_exists($this->position, $this->array);
}
?>
Or the following if you do not store the position.
<?php
public function valid() {
return !is_null(key($this->array));
}
?>
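To make the difference concrete, here is a minimal sketch that runs the same data through both checks. The class name NullTolerantList and its $useIsset switch are made up for this illustration, and the return types assume PHP 8.0 or later.

```php
<?php
// NullTolerantList is a hypothetical name used only for this sketch.
final class NullTolerantList implements Iterator
{
    private array $items;
    private int $position = 0;
    private bool $useIsset;

    public function __construct(array $items, bool $useIsset)
    {
        $this->items = array_values($items); // re-index to 0..n-1
        $this->useIsset = $useIsset;
    }

    public function rewind(): void { $this->position = 0; }
    public function current(): mixed { return $this->items[$this->position]; }
    public function key(): mixed { return $this->position; }
    public function next(): void { ++$this->position; }

    public function valid(): bool
    {
        return $this->useIsset
            ? isset($this->items[$this->position])             // false for a stored null
            : array_key_exists($this->position, $this->items); // true for a stored null
    }
}

$data = ['a', null, 'c'];
$withIsset = iterator_to_array(new NullTolerantList($data, true));
$withKeyExists = iterator_to_array(new NullTolerantList($data, false));

var_dump(count($withIsset));     // the isset() version stops at the null
var_dump(count($withKeyExists)); // the array_key_exists() version sees every element
```

With isset() the loop ends as soon as it reaches the null, so everything after it is silently dropped.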
<?php
# - Here is an implementation of the Iterator interface for arrays
# which works with maps (key/value pairs)
# as well as traditional arrays
# (contiguous monotonically increasing indexes).
# Though it pretty much does what an array
# would normally do within foreach() loops,
# this class may be useful for using arrays
# with code that generically/only supports the
# Iterator interface.
# Another use of this class is simply to provide
# object methods for tightly controlling iteration over arrays.
class tIterator_array implements Iterator {
private $myArray;
public function __construct( $givenArray ) {
$this->myArray = $givenArray;
}
function rewind() {
return reset($this->myArray);
}
function current() {
return current($this->myArray);
}
function key() {
return key($this->myArray);
}
function next() {
return next($this->myArray);
}
function valid() {
return key($this->myArray) !== null;
}
}
?>
So, playing around with iterators in PHP (coming from languages where I'm spoiled with generators to do things like this), I wrote a quick piece of code to give the Fibonacci sequence (to infinity, though only the first terms up to F_{10} are output).
<?php
class Fibonacci implements Iterator {
private $previous = 1;
private $current = 0;
private $key = 0;
public function current() {
return $this->current;
}
public function key() {
return $this->key;
}
public function next() {
$newprevious = $this->current;
$this->current += $this->previous;
$this->previous = $newprevious;
$this->key++;
}
public function rewind() {
$this->previous = 1;
$this->current = 0;
$this->key = 0;
}
public function valid() {
return true;
}
}
$seq = new Fibonacci;
$i = 0;
foreach ($seq as $f) {
echo "$f\n";
if ($i++ === 10) break;
}
?>
Here's a Fibonacci example using the formula, rather than addition.
<?php
/**
* @author Anthony Sterling
*/
class FibonacciSequence implements Iterator
{
protected
$limit = 0;
protected
$key = 0;
public function __construct($limit = 0)
{
$this->limit = (int)$limit;
}
public function current()
{
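// Binet's formula: F(n) = (phi^n - psi^n) / sqrt(5),
// with phi = (1 + sqrt(5)) / 2 and psi = (1 - sqrt(5)) / 2.
return round(
(pow((1 + sqrt(5)) / 2, $this->key) - pow((1 - sqrt(5)) / 2, $this->key)) / sqrt(5)
);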
}
public function key()
{
return $this->key;
}
public function next()
{
$this->key++;
}
public function rewind()
{
$this->key = 0;
}
public function valid()
{
return $this->key < $this->limit;
}
}
// Pass a limit: with the default of 0 the loop body would never run.
foreach(new FibonacciSequence(30) as $number)
{
printf(
'%d<br />',
$number
);
}
/*
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
1597
2584
4181
6765
10946
17711
28657
46368
75025
121393
196418
317811
514229
*/
?>
Anthony.
Order of operations when using a foreach loop:
1. Before the first iteration of the loop, Iterator::rewind() is called.
2. Before each iteration of the loop, Iterator::valid() is called.
3a. If Iterator::valid() returns false, the loop is terminated.
3b. If Iterator::valid() returns true, Iterator::current() and
Iterator::key() are called.
4. The loop body is evaluated.
5. After each iteration of the loop, Iterator::next() is called and we repeat from step 2 above.
This is roughly equivalent to:
<?php
$it->rewind();
while ($it->valid())
{
$key = $it->key();
$value = $it->current();
// ...
$it->next();
}
?>
The loop isn't terminated until Iterator::valid() returns false or the body of the loop executes a break statement.
The only two methods that are always executed are Iterator::rewind() and Iterator::valid() (unless rewind throws an exception).
The Iterator::next() method need not return anything; it is defined as returning void. Older code sometimes returned a value for convenience, but as of PHP 8.1 the method has a tentative return type of void, so new implementations should declare it as void and return nothing.
If your iterator is doing something expensive, like making a database query and iterating over the result set, the best place to make the query is probably in the Iterator::rewind() implementation.
In this case, the construction of the iterator itself can be cheap, and after construction you can continue to set the properties of the query all the way up to the beginning of the foreach loop, since the Iterator::rewind() method isn't called until then.
Things to keep in mind when making a database result set iterator:
* Make sure you close your cursor or otherwise clean up any previous query at the top of the rewind method. Otherwise your code will break if the same iterator is used in two consecutive foreach loops when the first loop terminates with a break statement before all the results are iterated over.
* Make sure your rewind() implementation tries to grab the first result so that the subsequent call to valid() will know whether or not the result set is empty. I do this by explicitly calling next() from the end of my rewind() implementation.
* For things like result set iterators, there isn't always a "key" that you can return, unless you know you have a scalar primary key column in the query. Unfortunately, there will be cases where the iterator doesn't know the primary key column because it isn't providing the query, the nature of the query is such that a primary key isn't applicable, the iterator is iterating over a table that doesn't have one, or the iterator is iterating over a table that has a compound primary key. In these cases, key() can return either the row index (based on a simple counter that you provide) or simply null.
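The points above can be sketched without a real database. In this sketch, RowIterator and the $open callable are hypothetical: $open stands in for "run the query and return a cursor", and the cursor is a plain Generator. It assumes PHP 8.0 or later.

```php
<?php
// RowIterator and $open are hypothetical names for this sketch only.
final class RowIterator implements Iterator
{
    /** @var callable returns a fresh Generator, standing in for a query */
    private $open;
    private ?Generator $cursor = null;
    private mixed $row = null;
    private bool $hasRow = false;
    private int $index = -1;

    public function __construct(callable $open)
    {
        $this->open = $open; // cheap: nothing is executed yet
    }

    public function rewind(): void
    {
        $this->cursor = null;            // clean up any previous cursor first
        $this->cursor = ($this->open)(); // the expensive work starts here
        $this->index = -1;
        $this->next();                   // grab the first row so valid() can answer
    }

    public function next(): void
    {
        $this->hasRow = $this->cursor !== null && $this->cursor->valid();
        if ($this->hasRow) {
            $this->row = $this->cursor->current();
            $this->cursor->next();
            ++$this->index;              // simple row counter used as the key
        }
    }

    public function valid(): bool { return $this->hasRow; }
    public function current(): mixed { return $this->row; }
    public function key(): mixed { return $this->index; }
}

$queries = 0;
$rows = new RowIterator(function () use (&$queries) {
    ++$queries; // counts how often the "query" is run
    yield ['id' => 7, 'name' => 'first'];
    yield ['id' => 9, 'name' => 'second'];
});

foreach ($rows as $i => $row) { break; }                       // abandon the first loop early
foreach ($rows as $i => $row) { echo "$i: {$row['name']}\n"; } // starts again from row 0
```

Because rewind() discards the old cursor before opening a new one, the second loop starts from the first row even though the first loop stopped early.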
Iterators can also be used to:
* iterate over the lines of a file or rows of a CSV file
* iterate over the characters of a string
* iterate over the tokens in an input stream
* iterate over the matches returned by an xpath expression
* iterate over the matches returned by a regexp
* iterate over the files in a folder
* etc...
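As one illustration from the list above, here is a small sketch of an iterator over the characters of a string. CharIterator is a made-up name, the code walks single bytes rather than multibyte characters, and it assumes PHP 8.0 or later.

```php
<?php
// CharIterator is a hypothetical name; it walks the bytes of a string.
final class CharIterator implements Iterator
{
    private string $text;
    private int $offset = 0;

    public function __construct(string $text) { $this->text = $text; }

    public function rewind(): void { $this->offset = 0; }
    public function valid(): bool { return $this->offset < strlen($this->text); }
    public function current(): mixed { return $this->text[$this->offset]; }
    public function key(): mixed { return $this->offset; }
    public function next(): void { ++$this->offset; }
}

$chars = iterator_to_array(new CharIterator('abc'));
foreach (new CharIterator('abc') as $offset => $char) {
    echo "$offset => $char\n";
}
```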
For iterators that implement database queries, what I've found is that if you want to chain multiple iterators together using a "MultipleIterator" then you *do not* want ::rewind() to actually execute your query, especially if it's expensive.
Instead, what I've done is implement that portion in "valid."
e.g.
<?php
class Database_Result_Iterator implements Iterator {
...
private $_db_resource = null;
private $_loaded = false;
private $_valid = false;
function rewind() {
if ($this->_db_resource) {
mysql_free_result($this->_db_resource);
$this->_db_resource = null;
}
$this->_loaded = false;
$this->_valid = false;
}
function valid() {
if (!$this->_loaded) { // run the query only on the first valid() call after rewind()
$this->load();
}
return $this->_valid;
}
private function load() {
$this->_db_resource = mysql_query(...);
$this->_loaded = true;
$this->next(); // Sets _valid
}
}
?>
That way if you chain multiple queries in a "MultipleIterator" together, the "rewind" call (which rewinds all iterators at once) does not execute every query at once.
In addition, I found that the MultipleIterator may not work best for other reasons, but still, the above is a good way to postpone queries until the last possible moment they are needed.
Be careful with Iterator when using nested loops or deleting items from the collection while looping over it.
The resulting bugs can be tricky to detect.
The behavior is unexpected at first, but it makes sense once you think it through.
<?php
foreach($it as $key => $value)
echo $value;
#output: value1, value2, value3
foreach($it as $key => $value)
foreach($it as $key => $value)
echo $value;
#output: value1, value2, value3
foreach($it as $key => $value)
foreach(clone $it as $key => $value)
echo $value;
#output: value1, value2, value3, value1, value2, value3, value1, value2, value3
foreach($it as $key => $value)
{
echo $value;
array_shift($it->values);
}
#output: value1, value3
?>
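The snippet above leaves $it undefined, so here is a self-contained sketch of the nested-loop case. ValueList is a hypothetical stand-in for the class behind $it, and the return types assume PHP 8.0 or later.

```php
<?php
// ValueList is a hypothetical stand-in for the $it used in the snippet above.
final class ValueList implements Iterator
{
    public array $values;
    private int $position = 0;

    public function __construct(array $values) { $this->values = $values; }

    public function rewind(): void { $this->position = 0; }
    public function valid(): bool { return array_key_exists($this->position, $this->values); }
    public function current(): mixed { return $this->values[$this->position]; }
    public function key(): mixed { return $this->position; }
    public function next(): void { ++$this->position; }
}

$it = new ValueList(['value1', 'value2', 'value3']);

$shared = [];
foreach ($it as $outer) {
    foreach ($it as $inner) {       // same object: both loops share one position
        $shared[] = $inner;
    }
}

$cloned = [];
foreach ($it as $outer) {
    foreach (clone $it as $inner) { // clone: the inner loop has its own position
        $cloned[] = $inner;
    }
}

echo count($shared), " vs ", count($cloned), "\n";
```

The inner loop on the shared object runs the position past the end, so the outer loop finds valid() false after its first pass. Cloning gives each inner loop its own position.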
<?php
/*
* An implementation of the Iterator
* with simpleXML to remove a node and generate a new XML file.
*
* project.xml file:
* <?xml version="1.0" encoding="UTF-8"?>
* ...
* <data>
* <item>
* <value>one</value>
* </item>
* <item>
* <value>two</value>
* </item>
* ...
* </data>
*
*/
class parseXML implements Iterator {
private $position;
private $xml;
public $item;
public function __construct() {
$this->position = 0;
$this->xml = simplexml_load_file('project.xml');
}
public function unsetItem() {
foreach ($this as $key => $value) {
if ($value->value == $this->item ) {
unset($this->xml->data->item[$key]);
}
}
$this->mkXML();
}
public function mkXML() {
file_put_contents('project.xml', $this->xml->asXML() );
}
function rewind() {
$this->position = 0;
}
function current() {
return $this->xml->data->item[$this->position];
}
function key() {
return $this->position;
}
function next() {
++$this->position;
}
function valid() {
return isset($this->xml->data->item[$this->position]);
}
}
$itemRemove = new parseXML();
$itemRemove->item = "one";
$itemRemove->unsetItem();
?>
Be aware that when you call a function like current($this) within the Iterator class, the properties of the class are returned and the Iterator's current() method isn't called. This is because current() applies to arrays, and the Iterator object is then interpreted as an array.
If you have a custom iterator that may throw an exception in its current() method, there is no way to catch the exception without breaking a foreach loop.
The following for loop allows you to skip elements for which $iterator->current() throws an exception, rather than breaking the loop.
<?php
for ($iterator->rewind(); $iterator->valid(); $iterator->next()) {
try {
$value = $iterator->current();
} catch (Exception $exception) {
continue;
}
# ...
}
?>
If you're using PHP 5.5 or above and are creating a simple iterator, consider using a generator function instead. There is significantly less boilerplate code and the code is easier to read. http://au1.php.net/generators
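For comparison, here is a sketch of the Fibonacci iterator from an earlier comment written as a generator. The function name fibonacci() is chosen just for this example.

```php
<?php
// fibonacci() is a hypothetical helper name; generators need PHP 5.5 or later.
function fibonacci()
{
    $previous = 1;
    $current = 0;
    while (true) {
        yield $current; // keys are assigned automatically: 0, 1, 2, ...
        $next = $current + $previous;
        $previous = $current;
        $current = $next;
    }
}

$terms = [];
foreach (fibonacci() as $index => $f) {
    $terms[] = $f;
    if ($index === 10) break; // the sequence is infinite, so stop at F_10
}
echo implode(', ', $terms), "\n";
```

The whole class with its five methods collapses into one function, and the generator keeps its state between iterations on its own.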
Examples of use
<?php
class myIterator implements Iterator
{
private
$_array = array();
public function __construct(array $array)
{
$this->_array = $array;
}
public function rewind()
{
reset($this->_array);
}
public function current()
{
return current($this->_array);
}
public function key()
{
return key($this->_array);
}
public function next()
{
next($this->_array);
}
public function valid()
{
return $this->key() !== null;
}
}
$it = new myIterator(array('foo_1' => 'bar_1','foo_2' => 'bar_2'));
//example 1 : foreach
foreach($it as $key => $value)
{
var_dump($key, $value);
}
//example 2 : while
$it -> rewind();
while($it->valid())
{
var_dump($it->key(), $it->current());
$it->next();
}
//example 3 : for
for($it->rewind();$it->valid();$it->next())
{
var_dump($it->key(), $it->current());
}
?>
An interesting fact that I didn't read in the doc:
the key() method is called only if your foreach loop needs it.
For instance, the following loop calls the key() method:
<?php
foreach($it as $key => $value) {
var_dump($key, $value);
echo "\n";
}
?>
But the following loop doesn't:
<?php
foreach($it as $value) {
var_dump($value);
echo "\n";
}
?>
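This can be checked by counting the calls. The sketch below uses a hypothetical CountingIterator with a public $keyCalls counter and assumes PHP 8.0 or later.

```php
<?php
// CountingIterator is a hypothetical name; it records how often key() runs.
final class CountingIterator implements Iterator
{
    public int $keyCalls = 0;
    private array $items;
    private int $position = 0;

    public function __construct(array $items) { $this->items = array_values($items); }

    public function rewind(): void { $this->position = 0; }
    public function valid(): bool { return array_key_exists($this->position, $this->items); }
    public function current(): mixed { return $this->items[$this->position]; }
    public function next(): void { ++$this->position; }

    public function key(): mixed
    {
        ++$this->keyCalls;
        return $this->position;
    }
}

$it = new CountingIterator(['a', 'b', 'c']);

foreach ($it as $value) {}          // no $key variable requested
$withoutKey = $it->keyCalls;

foreach ($it as $key => $value) {}  // $key requested
$withKey = $it->keyCalls - $withoutKey;

var_dump($withoutKey, $withKey);
```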
With a large number of `current`, `next`, `key`, and `reset` array function implementations, care needs to be taken to ensure that deletions and nested loops are accounted for appropriately for each situation.
The following class has been copied from a previous comment and modified to allow it to be used within nested loops.
<?php
# Comment removed for brevity.
class tIterator_array implements Iterator {
private $myArray;
// Store each iteration in a separate array.
private $iterations = [];
private $i = -1;
public function __construct( $givenArray ) {
$this->myArray = $givenArray;
}
function rewind() {
// Rewind is called at the start of the loop. This is where we can append the current array to start our new iteration.
$this->iterations[] = $this->myArray;
$this->i++;
return reset( $this->iterations[ $this->i ] );
}
function current() {
return current( $this->iterations[ $this->i ] );
}
function key() {
return key( $this->iterations[ $this->i ] );
}
function next() {
return next( $this->iterations[ $this->i ] );
}
function valid() {
if ( null === $this->key() ) {
// Standard valid check. When null is returned the loop has finished, so we decrement the index and remove the latest iteration.
array_pop( $this->iterations );
$this->i--;
return false;
}
return true;
}
}
// Example:
$a = new tIterator_array( [1, 2] );
foreach ( $a as $k => $v ) {
echo " $k => $v:\n";
foreach ( $a as $k => $v ) {
echo " $k => $v,\n";
}
}
// Output:
# 0 => 1:
# 0 => 1,
# 1 => 2,
# 1 => 2:
# 0 => 1,
# 1 => 2,
?>
RocketInABog's seemingly trivial tIterator_array class has one huge problem (which just cost me a couple of hours).
Consider this example, using their class:
<?php
$values = ['one', 'two', 'three'];
foreach ($values as $v) {}
$current = current($values);
// $current === 'one', as you would expect
$iterator = new tIterator_array($values);
foreach ($iterator as $v) {}
$current = $iterator->current(); // do NOT use current($iterator) or key($iterator)!!!
// $current === false, but why?
?>
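The problem is that foreach leaves an array's internal pointer at the first element (in PHP 7 and later), whereas an Iterator object stays wherever the loop left it: foreach calls Iterator::rewind() before the loop, never after it!
I also think it's a design mistake that foreach works with Iterator, but current(), key() and end() don't - these iterate over the object's fields.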
I just refactored some code to use an Iterator instead of an array, and it broke in several very unexpected ways because of these differences.
The "scalar" restriction on key() is no longer true. A simple example is that Generators can yield non-scalar keys.
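A short sketch of that point, using a generator whose keys are arrays. The function name gridPoints() is invented for this example, and it needs PHP 5.5 or later.

```php
<?php
// gridPoints() is a hypothetical name; it yields arrays as keys.
function gridPoints()
{
    yield [0, 0] => 'origin';
    yield [1, 2] => 'point';
}

$keys = [];
foreach (gridPoints() as $coordinates => $label) {
    // $coordinates is an array here, not a scalar
    $keys[] = $coordinates;
    echo implode(',', $coordinates), " => $label\n";
}
```

Note that iterator_to_array() cannot be used with keys like these, because an array cannot be used as an array offset.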
<?php
/**
 * Iterate a directory tree by walking the tree. For each directory in
 * the tree rooted at directory $parent_dir (including $parent_dir
 * itself), it returns $dirpath => array($dirnames, $filenames).
 *
 * $dirpath is a string, the path to the directory. $dirnames is a list
 * of the names of the subdirectories in dirpath (excluding '.' and
 * '..'). $filenames is a list of the names of the non-directory files in
 * $dirpath.
 *
 * Inspired by Python os.walk; see
 * https://docs.python.org/3/library/os.html#os.walk
 * Implemented as an iterator rather than a generator.
 *
 * @return array
 */
class walker implements Iterator {
    private $parent_dir = '';
    private $d = 0; // iterator
    private $dirs = array(); // indexed by iterator
    private $dirstack;
    private $discovered = array(); // indexed by directory
    private $v; // current directory, null once the walk is finished
    private $cur_dirnames = array(); // current subdirs
    private $cur_filenames = array(); // current files in directory
    public function __construct($parent_dir) {
        $this->parent_dir = $parent_dir;
        $this->d = 0;
        $this->dirs[$this->d] = $parent_dir;
        $this->dirstack = new SplStack();
        $this->dirstack->push($parent_dir);
        $this->discovered = array($parent_dir => true);
        $this->v = $parent_dir;
        $this->cur_dirnames = array();
        $this->cur_filenames = array();
        $this->next();
    }
    public function rewind() {
        $this->__construct($this->parent_dir);
    }
    public function current() {
        return array($this->cur_dirnames, $this->cur_filenames);
    }
    public function key() {
        return $this->v;
    }
    public function next() {
        if ($this->dirstack->isEmpty()) {
            // nothing left to visit: mark the walk as finished so that
            // valid() returns false only after the last directory was seen
            $this->v = null;
            return;
        }
        ++$this->d;
        $this->v = $this->dirstack->pop();
        $this->dirs[$this->d] = $this->v;
        $this->cur_dirnames = array();
        $this->cur_filenames = array();
        if (!$dh = opendir($this->v)) {
            // opendir emits E_WARNING if unable to open directory, likely due
            // to a permissions issue or directory removed before we could get
            // there
            return;
        }
        // discover the directories, return directories and files
        while (false !== ($fn = readdir($dh))) {
            if ($fn !== '.' && $fn !== '..') {
                $fullfn = $this->v . '/' . $fn;
                if (is_dir($fullfn)) {
                    $this->cur_dirnames[] = $fn;
                    if (!array_key_exists($fullfn, $this->discovered)) {
                        $this->discovered[$fullfn] = true;
                        $this->dirstack->push($fullfn);
                    }
                } else {
                    $this->cur_filenames[] = $fn;
                }
            }
        }
        closedir($dh);
    }
    public function valid() {
        // checking the stack size here would skip the last directory popped
        return $this->v !== null;
    }
}
$tree = new walker("/tmp");
foreach ($tree as $parent_dir => $nodes) {
    $subdirs = $nodes[0];
    $files = $nodes[1];
    printf("%s\n", $parent_dir);
    if ($subdirs) printf(" %s\n", implode("/\n ", $subdirs));
    if ($files) printf(" %s\n", implode("\n ", $files));
    print("\n");
}
?>
If you implemented Iterator methods next() and rewind() by calling array functions next() and reset() and returning their results, be advised that this violates the tentative return types (void in both cases) introduced with PHP 8.1.
You can add the #[\ReturnTypeWillChange] attribute to both method implementations but that will only delay the issue until PHP 9.0 comes around.
Better adapt your implementations now (stop returning anything from these methods) and, if need be, only add the return type declarations later.
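As a sketch of what the adapted implementation might look like, here is an array-backed iterator that still uses reset() and next() internally but discards their results. The class name ArrayCursor is made up for this example, and the full return type declarations assume PHP 8.0 or later (mixed is not available before that).

```php
<?php
// ArrayCursor is a hypothetical name for this sketch.
final class ArrayCursor implements Iterator
{
    private array $items;

    public function __construct(array $items) { $this->items = $items; }

    // reset() and next() return values, but the methods must not pass them on
    public function rewind(): void { reset($this->items); }
    public function next(): void { next($this->items); }

    public function current(): mixed { return current($this->items); }
    public function key(): mixed { return key($this->items); }
    public function valid(): bool { return key($this->items) !== null; }
}

$result = iterator_to_array(new ArrayCursor(['foo_1' => 'bar_1', 'foo_2' => 'bar_2']));
var_dump($result);
```

With the declarations in place, PHP 8.1 and later raise no deprecation notice, and the #[\ReturnTypeWillChange] attribute is not needed.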