strnatcmp
(PHP 4, PHP 5)
strnatcmp — Сравнение строк с использованием алгоритма "естественного упорядочения"
Описание
Эта функция реализует алгоритм сравнения, упорядочивающий алфавитно-цифровые строки подобно тому, как это сделал бы человек. Пример, показывающий отличие этого алгоритма от обыных функций сравнения, приведен ниже
<?php
$arr1 = $arr2 = array("img12.png", "img10.png", "img2.png", "img1.png");
echo "Стандартный алгоритм сравнения\n";
usort($arr1, "strcmp");
print_r($arr1);
echo "\nАлгоритм \"естественного упорядочения\"\n";
usort($arr2, "strnatcmp");
print_r($arr2);
?>
Стандартный алгоритм сравнения Array ( [0] => img1.png [1] => img10.png [2] => img12.png [3] => img2.png ) Алгоритм "естественного упорядочения" Array ( [0] => img1.png [1] => img2.png [2] => img10.png [3] => img12.png )
Подобно другим функциям сравнения строк, strnatcmp() возвращает отрицательное число, если str1 меньше, чем str2 ; положительное число если, str1 больше, чем str2 , и 0 если строки равны.
Эта функция учитывает регистр символов.
См. также описание функций ereg(), strcasecmp(), substr(), stristr(), strcmp(), strncmp(), strncasecmp(), strnatcasecmp(), strstr(), natsort() и natcasesort().
- addcslashes
- addslashes
- bin2hex
- chop
- chr
- chunk_split
- convert_cyr_string
- convert_uudecode
- convert_uuencode
- count_chars
- crc32
- crypt
- echo
- explode
- fprintf
- get_html_translation_table
- hebrev
- hebrevc
- hex2bin
- html_entity_decode
- htmlentities
- htmlspecialchars_decode
- htmlspecialchars
- implode
- join
- lcfirst
- levenshtein
- localeconv
- ltrim
- md5_file
- md5
- metaphone
- money_format
- nl_langinfo
- nl2br
- number_format
- ord
- parse_str
- printf
- quoted_printable_decode
- quoted_printable_encode
- quotemeta
- rtrim
- setlocale
- sha1_file
- sha1
- similar_text
- soundex
- sprintf
- sscanf
- str_getcsv
- str_ireplace
- str_pad
- str_repeat
- str_replace
- str_rot13
- str_shuffle
- str_split
- str_word_count
- strcasecmp
- strchr
- strcmp
- strcoll
- strcspn
- strip_tags
- stripcslashes
- stripos
- stripslashes
- stristr
- strlen
- strnatcasecmp
- strnatcmp
- strncasecmp
- strncmp
- strpbrk
- strpos
- strrchr
- strrev
- strripos
- strrpos
- strspn
- strstr
- strtok
- strtolower
- strtoupper
- strtr
- substr_compare
- substr_count
- substr_replace
- substr
- trim
- ucfirst
- ucwords
- vfprintf
- vprintf
- vsprintf
- wordwrap
Коментарии
There seems to be a bug in the localization for strnatcmp and strnatcasecmp. I searched the reported bugs and found a few entries which were up to four years old (but the problem still exists when using swedish characters).
These functions might work instead.
<?php
function _strnatcasecmp($left, $right) {
return _strnatcmp(strtolower($left), strtolower($right));
}
function _strnatcmp($left, $right) {
while((strlen($left) > 0) && (strlen($right) > 0)) {
if(preg_match('/^([^0-9]*)([0-9].*)$/Us', $left, $lMatch)) {
$lTest = $lMatch[1];
$left = $lMatch[2];
} else {
$lTest = $left;
$left = '';
}
if(preg_match('/^([^0-9]*)([0-9].*)$/Us', $right, $rMatch)) {
$rTest = $rMatch[1];
$right = $rMatch[2];
} else {
$rTest = $right;
$right = '';
}
$test = strcmp($lTest, $rTest);
if($test != 0) {
return $test;
}
if(preg_match('/^([0-9]+)([^0-9].*)?$/Us', $left, $lMatch)) {
$lTest = intval($lMatch[1]);
$left = $lMatch[2];
} else {
$lTest = 0;
}
if(preg_match('/^([0-9]+)([^0-9].*)?$/Us', $right, $rMatch)) {
$rTest = intval($rMatch[1]);
$right = $rMatch[2];
} else {
$rTest = 0;
}
$test = $lTest - $rTest;
if($test != 0) {
return $test;
}
}
return strcmp($left, $right);
}
?>
The code is not optimized. It was just made to solve my problem.
Can also be used with combination of a compare for an array nested value, like
<?php
$array = array(
"city" => "xyz",
"names" => array(
array(
"name" => "Ana2",
"id" => 1
) ,
array(
"name" => "Ana1",
"id" => 2
)
)
);
usort($array["names"], function ($a, $b) { return strnatcmp($a['name'], $b['name']);} );
This function has some interesting behaviour on strings consisting of mixed numbers and letters.
One may expect that such a mixed string would be treated as alpha-numeric, but that is not true.
var_dump(strnatcmp('23','123')); →
int(-1)
As expected, 23<123 (even though first digit is higher, overall number is smaller)
var_dump(strnatcmp('yz','xyz')); →
int(1)
As expected, yz>xyz (string comparison, irregardless of string length)
var_dump(strnatcmp('2x','12y')); →
int(-1)
Remarkable, 2x<12y (does a numeric comparison)
var_dump(strnatcmp('20x','12y'));
int(1)
Remarkable, 20x>12y (does a numeric comparison)
It seems to be splitting what is being compared into runs of numbers and letters, and then comparing each run in isolation, until it has an ordering difference.
Some more remarkable outcomes:
var_dump(strnatcmp("0.15m", "0.2m"));
int(1)
var_dump(strnatcmp("0.15m", "0.20m"));
int(-1)
It's not about localisation:
var_dump(strnatcmp("0,15m", "0,2m"));
int(1)
var_dump(strnatcmp("0,15m", "0,20m"));
int(-1)