strnatcmp
(PHP 4, PHP 5)
strnatcmp — String comparisons using a "natural order" algorithm
Description
$str1
, string $str2
)This function implements a comparison algorithm that orders alphanumeric strings in the way a human being would, this is described as a "natural ordering". Note that this comparison is case sensitive.
Parameters
-
str1
-
The first string.
-
str2
-
The second string.
Return Values
Similar to other string comparison functions, this one returns < 0 if
str1
is less than str2
; >
0 if str1
is greater than
str2
, and 0 if they are equal.
Examples
An example of the difference between this algorithm and the regular computer string sorting algorithms (used in strcmp()) can be seen below:
<?php
$arr1 = $arr2 = array("img12.png", "img10.png", "img2.png", "img1.png");
echo "Standard string comparison\n";
usort($arr1, "strcmp");
print_r($arr1);
echo "\nNatural order string comparison\n";
usort($arr2, "strnatcmp");
print_r($arr2);
?>
The above example will output:
Standard string comparison Array ( [0] => img1.png [1] => img10.png [2] => img12.png [3] => img2.png ) Natural order string comparison Array ( [0] => img1.png [1] => img2.png [2] => img10.png [3] => img12.png )
See Also
- preg_match() - Perform a regular expression match
- strcasecmp() - Binary safe case-insensitive string comparison
- substr() - Return part of a string
- stristr() - Case-insensitive strstr
- strcmp() - Binary safe string comparison
- strncmp() - Binary safe string comparison of the first n characters
- strncasecmp() - Binary safe case-insensitive string comparison of the first n characters
- strnatcasecmp() - Case insensitive string comparisons using a "natural order" algorithm
- strstr() - Find the first occurrence of a string
- natsort() - Sort an array using a "natural order" algorithm
- natcasesort() - Sort an array using a case insensitive "natural order" algorithm
- addcslashes
- addslashes
- bin2hex
- chop
- chr
- chunk_split
- convert_cyr_string
- convert_uudecode
- convert_uuencode
- count_chars
- crc32
- crypt
- echo
- explode
- fprintf
- get_html_translation_table
- hebrev
- hebrevc
- hex2bin
- html_entity_decode
- htmlentities
- htmlspecialchars_decode
- htmlspecialchars
- implode
- join
- lcfirst
- levenshtein
- localeconv
- ltrim
- md5_file
- md5
- metaphone
- money_format
- nl_langinfo
- nl2br
- number_format
- ord
- parse_str
- printf
- quoted_printable_decode
- quoted_printable_encode
- quotemeta
- rtrim
- setlocale
- sha1_file
- sha1
- similar_text
- soundex
- sprintf
- sscanf
- str_getcsv
- str_ireplace
- str_pad
- str_repeat
- str_replace
- str_rot13
- str_shuffle
- str_split
- str_word_count
- strcasecmp
- strchr
- strcmp
- strcoll
- strcspn
- strip_tags
- stripcslashes
- stripos
- stripslashes
- stristr
- strlen
- strnatcasecmp
- strnatcmp
- strncasecmp
- strncmp
- strpbrk
- strpos
- strrchr
- strrev
- strripos
- strrpos
- strspn
- strstr
- strtok
- strtolower
- strtoupper
- strtr
- substr_compare
- substr_count
- substr_replace
- substr
- trim
- ucfirst
- ucwords
- vfprintf
- vprintf
- vsprintf
- wordwrap
Коментарии
There seems to be a bug in the localization for strnatcmp and strnatcasecmp. I searched the reported bugs and found a few entries which were up to four years old (but the problem still exists when using swedish characters).
These functions might work instead.
<?php
function _strnatcasecmp($left, $right) {
return _strnatcmp(strtolower($left), strtolower($right));
}
function _strnatcmp($left, $right) {
while((strlen($left) > 0) && (strlen($right) > 0)) {
if(preg_match('/^([^0-9]*)([0-9].*)$/Us', $left, $lMatch)) {
$lTest = $lMatch[1];
$left = $lMatch[2];
} else {
$lTest = $left;
$left = '';
}
if(preg_match('/^([^0-9]*)([0-9].*)$/Us', $right, $rMatch)) {
$rTest = $rMatch[1];
$right = $rMatch[2];
} else {
$rTest = $right;
$right = '';
}
$test = strcmp($lTest, $rTest);
if($test != 0) {
return $test;
}
if(preg_match('/^([0-9]+)([^0-9].*)?$/Us', $left, $lMatch)) {
$lTest = intval($lMatch[1]);
$left = $lMatch[2];
} else {
$lTest = 0;
}
if(preg_match('/^([0-9]+)([^0-9].*)?$/Us', $right, $rMatch)) {
$rTest = intval($rMatch[1]);
$right = $rMatch[2];
} else {
$rTest = 0;
}
$test = $lTest - $rTest;
if($test != 0) {
return $test;
}
}
return strcmp($left, $right);
}
?>
The code is not optimized. It was just made to solve my problem.
Can also be used with combination of a compare for an array nested value, like
<?php
$array = array(
"city" => "xyz",
"names" => array(
array(
"name" => "Ana2",
"id" => 1
) ,
array(
"name" => "Ana1",
"id" => 2
)
)
);
usort($array["names"], function ($a, $b) { return strnatcmp($a['name'], $b['name']);} );
This function has some interesting behaviour on strings consisting of mixed numbers and letters.
One may expect that such a mixed string would be treated as alpha-numeric, but that is not true.
var_dump(strnatcmp('23','123')); →
int(-1)
As expected, 23<123 (even though first digit is higher, overall number is smaller)
var_dump(strnatcmp('yz','xyz')); →
int(1)
As expected, yz>xyz (string comparison, irregardless of string length)
var_dump(strnatcmp('2x','12y')); →
int(-1)
Remarkable, 2x<12y (does a numeric comparison)
var_dump(strnatcmp('20x','12y'));
int(1)
Remarkable, 20x>12y (does a numeric comparison)
It seems to be splitting what is being compared into runs of numbers and letters, and then comparing each run in isolation, until it has an ordering difference.
Some more remarkable outcomes:
var_dump(strnatcmp("0.15m", "0.2m"));
int(1)
var_dump(strnatcmp("0.15m", "0.20m"));
int(-1)
It's not about localisation:
var_dump(strnatcmp("0,15m", "0,2m"));
int(1)
var_dump(strnatcmp("0,15m", "0,20m"));
int(-1)