strcmp
(PHP 4, PHP 5)
strcmp — Сравнение строк, безопасное для данных в двоичной форме
Описание
int strcmp
( string $str1
, string $str2
)
Возвращает отрицательное число, если str1 меньше, чем str2 ; положительное число, если str1 больше, чем str2 , и 0 если строки равны.
Эта функция учитывает регистр символов.
См. также описание функций ereg(), strcasecmp(), substr(), stristr(), strncasecmp(), strncmp() и strstr().
- addcslashes
- addslashes
- bin2hex
- chop
- chr
- chunk_split
- convert_cyr_string
- convert_uudecode
- convert_uuencode
- count_chars
- crc32
- crypt
- echo
- explode
- fprintf
- get_html_translation_table
- hebrev
- hebrevc
- hex2bin
- html_entity_decode
- htmlentities
- htmlspecialchars_decode
- htmlspecialchars
- implode
- join
- lcfirst
- levenshtein
- localeconv
- ltrim
- md5_file
- md5
- metaphone
- money_format
- nl_langinfo
- nl2br
- number_format
- ord
- parse_str
- printf
- quoted_printable_decode
- quoted_printable_encode
- quotemeta
- rtrim
- setlocale
- sha1_file
- sha1
- similar_text
- soundex
- sprintf
- sscanf
- str_getcsv
- str_ireplace
- str_pad
- str_repeat
- str_replace
- str_rot13
- str_shuffle
- str_split
- str_word_count
- strcasecmp
- strchr
- strcmp
- strcoll
- strcspn
- strip_tags
- stripcslashes
- stripos
- stripslashes
- stristr
- strlen
- strnatcasecmp
- strnatcmp
- strncasecmp
- strncmp
- strpbrk
- strpos
- strrchr
- strrev
- strripos
- strrpos
- strspn
- strstr
- strtok
- strtolower
- strtoupper
- strtr
- substr_compare
- substr_count
- substr_replace
- substr
- trim
- ucfirst
- ucwords
- vfprintf
- vprintf
- vsprintf
- wordwrap
Коментарии
Hey be sure the string you are comparing has not special characters like '\n' or something like that.
In summary, strcmp() does not necessarily use the ASCII code order of each character like in the 'C' locale, but instead parse each string to match language-specific character entities (such as 'ch' in Spanish, or 'dz' in Czech), whose collation order is then compared. When both character entities have the same collation order (such as 'ss' and '?' in German), they are compared relative to their code by strcmp(), or considered equal by strcasecmp().
The LC_COLLATE locale setting is then considered: only if LC_COLLATE=C or LC_ALL=C does strcmp() compare strings by character code.
Generally, most locales define the following order:
control, space, punctuation and underscore, digit, alpha (lower then upper with Latin scripts; or final, middle, then isolated, initial with Arabic script), symbols, others...
With strcasecmp(), the alpha subclass is ignored and consider all forms of letters as equal.
Note also that some locales behave differently with accented characters: some consider they are the same letter as the unaccented letter (with a minor collation order, e.g. French, Italian, Spanish), some consider they are distinct letters with an independant collation order (e.g. in the C locale, or in Nordic languages).
Finally, the collation string is not considering individual characters but instead groups of characters that form a single letter:
- for example "ch" or "CH" in Spanish which is always after all other strings beginning with 'c' or 'C', including "cz", but before 'd' or 'D';
- 'ss' and '?' in German;
- 'dz', 'DZ' and 'Dz' in some Central European languages written with the Latin script...
- UTF-8, UTF-16 (Unicode), S-JIS, Big5, ISO2022 character encoding of a locale (the suffix in the locale name) first decode the characters into the UCS4/ISO10646 code position before applying the rules of the language indicated by the main locale...
So be extremely careful to what you consider a "character", as it may just mean a encoding byte with no significance in the string collation algorithm: the first character of the string "cholera" in Spanish is "ch", not "c" !
Some notes about the spanish locale. I've read some notes that says "CH", "RR" or "LL" must be considered as a single letter in Spanish. That's not really tru. "CH", "RR" and "LL" where considered a single letter in the past (lot of years ago), for that you must use the "Tradictional Sort". Nowadays, the Academy uses the Modern Sort and recomends not to consider anymore "CH", "RR" and "LL" as a single letter. They must be considered two separated letters and sort and compare on that way.
Ju just have to take a look to the Offial Spanish Language Dictionary and you can see there that from many years ago there is not the separated section for "CH", "LL" or "RR" ... i.e. words starting with CH must be after the ones starting by CG, and before the ones starting by CI.
If you want to strings according to locale, use strcoll instead.
Just a short comment to the note of arnar at hm dot is: md5() is a hash function and therefore it may happen (although it is very unlikely) that the md5() checksums of two different strings will be equal (hash collision) ...
One big caveat - strings retrieved from the backtick operation may be zero terminated (C-style), and therefore will not be equal to the non-zero terminated strings (roughly Pascal-style) normal in PHP. The workaround is to surround every `` pair or shell_exec() function with the trim() function. This is likely to be an issue with other functions that invoke shells; I haven't bothered to check.
On Debian Lenny (and RHEL 5, with minor differences), I get this:
====PHP====
<?php
$sz = `pwd`;
$ps = "/var/www";
echo "Zero-terminated string:<br />sz = ".$sz."<br />str_split(sz) = "; print_r(str_split($sz));
echo "<br /><br />";
echo "Pascal-style string:<br />ps = ".$ps."<br />str_split(ps) = "; print_r(str_split($ps));
echo "<br /><br />";
echo "Normal results of comparison:<br />";
echo "sz == ps = ".($sz == $ps ? "true" : "false")."<br />";
echo "strcmp(sz,ps) = ".strcmp($sz,$ps);
echo "<br /><br />";
echo "Comparison with trim()'d zero-terminated string:<br />";
echo "trim(sz) = ".trim($sz)."<br />";
echo "str_split(trim(sz)) = "; print_r(str_split(trim($sz))); echo "<br />";
echo "trim(sz) == ps = ".(trim($sz) == $ps ? "true" : "false")."<br />";
echo "strcmp(trim(sz),ps) = ".strcmp(trim($sz),$ps);
?>
====Output====
Zero-terminated string:
sz = /var/www
str_split(sz) = Array ( [0] => / [1] => v [2] => a [3] => r [4] => / [5] => w [6] => w [7] => w [8] => )
Pascal-style string:
ps = /var/www
str_split(ps) = Array ( [0] => / [1] => v [2] => a [3] => r [4] => / [5] => w [6] => w [7] => w )
Normal results of comparison:
sz == ps = false
strcmp(sz,ps) = 1
Comparison with trim()'d zero-terminated string:
trim(sz) = /var/www
str_split(trim(sz)) = Array ( [0] => / [1] => v [2] => a [3] => r [4] => / [5] => w [6] => w [7] => w )
trim(sz) == ps = true
strcmp(trim(sz),ps) = 0
Don't forget the similar_text() function...
function.similar-text
Note a difference between 5.2 and 5.3 versions
echo (int)strcmp('pending',array());
will output -1 in PHP 5.2.16 (probably in all versions prior 5.3)
but will output 0 in PHP 5.3.3
Of course, you never need to use array as a parameter in string comparisions.
If you rely on strcmp for safe string comparisons, both parameters must be strings, the result is otherwise extremely unpredictable.
For instance you may get an unexpected 0, or return values of NULL, -2, 2, 3 and -3.
strcmp("5", 5) => 0
strcmp("15", 0xf) => 0
strcmp(61529519452809720693702583126814, 61529519452809720000000000000000) => 0
strcmp(NULL, false) => 0
strcmp(NULL, "") => 0
strcmp(NULL, 0) => -1
strcmp(false, -1) => -2
strcmp("15", NULL) => 2
strcmp(NULL, "foo") => -3
strcmp("foo", NULL) => 3
strcmp("foo", false) => 3
strcmp("foo", 0) => 1
strcmp("foo", 5) => 1
strcmp("foo", array()) => NULL + PHP Warning
strcmp("foo", new stdClass) => NULL + PHP Warning
strcmp(function(){}, "") => NULL + PHP Warning
i hope this will give you a clear idea how strcmp works internally.
<?php
$str1 = "b";
echo ord($str1); //98
echo "<br/>";
$str2 = "t";
echo ord($str2); //116
echo "<br/>";
echo ord($str1)-ord($str2);//-18
$str1 = "bear";
$str2 = "tear";
$str3 = "";
echo "<pre>";
echo strcmp($str1, $str2); // -18
echo "<br/>";
echo strcmp($str2, $str1); //18
echo "<br/>";
echo strcmp($str2, $str2); //0
echo "<br/>";
echo strcmp($str2, $str3); //4
echo "<br/>";
echo strcmp($str3, $str2); //-4
echo "<br/>";
echo strcmp($str3, $str3); // 0
echo "</pre>";
?>
Since it may not be obvious to some people, please note that there is another possible return value for this function.
strcmp() will return NULL on failure.
This has the side effect of equating to a match when using an equals comparison (==).
Instead, you may wish to test matches using the identical comparison (===), which should not catch a NULL return.
---------------------
Example
---------------------
$variable1 = array();
$ans === strcmp($variable1, $variable2);
This will stop $ans from returning a match;
Please use strcmp() carefully when comparing user input, as this may have potential security implications in your code.
strcmp returns -1 ou 1 if two strings are not identical,
and 0 when they are, except when comparing a string and an empty string (<?php $a = ""; ?>), it returns the length of the string.
For instance:
<?php
$a = "foo"; // length 3
$b = ""; // empty string
$c = "barbar"; // length 6
echo strcmp($a, $a); // outputs 0
echo strcmp($a, $c); // outputs 1
echo strcmp($c, $a); // outputs -1
echo strcmp($a, $b); // outputs 3
echo strcmp($b, $a); // outputs -3
echo strcmp($c, $b); // outputs 6
echo strcmp($b, $c); // outputs -6
?>
Vulnerability (in PHP >=5.3) :
<?php
if (strcmp($_POST['password'], 'sekret') == 0) {
echo "Welcome, authorized user!\n";
} else {
echo "Go away, imposter.\n";
}
?>
$ curl -d password=sekret http://andersk.scripts.mit.edu/strcmp.php
Welcome, authorized user!
$ curl -d password=wrong http://andersk.scripts.mit.edu/strcmp.php
Go away, imposter.
$ curl -d password[]=wrong http://andersk.scripts.mit.edu/strcmp.php
Welcome, authorized user!
SRC of this example: https://www.quora.com/Why-is-PHP-hated-by-so-many-developers
1) If the two strings have identical BEGINNING parts, they are trunkated from both strings.
2) The resulting strings are compared with two possible outcomes:
a) if one of the resulting strings is an empty string, then the length of the non-empty string is returned (the sign depending on the order in which you pass the arguments to the function)
b) in any other case just the numerical values of the FIRST characters are compared. The result is +1 or -1 no matter how big is the difference between the numerical values.
<?php
$str = array('','a','afox','foxa');
$size = count($str);
echo '<pre>';
for($i=0; $i<$size; $i++)
{
for($j=$i+1; $j<$size; $j++)
{
echo '<br>('.$str[$i].','.$str[$j].') = '.strcmp($str[$i], $str[$j]);
echo '<br>('.$str[$j].','.$str[$i] .') = '.strcmp($str[$j], $str[$i]);
}
}
echo '</pre>';
?>
In Apache/2.4.37 (Win32) OpenSSL/1.1.1 PHP/7.2.12 produces the following results:
(,a) = -1 //comparing with an empty string produces the length of the NON-empty string
(a,) = 1 // ditto
(,afox) = -4 // ditto
(afox,) = 4 // ditto
(,foxa) = -4 // ditto
(foxa,) = 4 // ditto
(a,afox) = -3 // The identical BEGINNING part ("a") is trunkated from both strings. Then the remaining "fox" is compared to the remaing empty string in the other argument. Produces the length of the NON-empty string. Same as in all the above examples.
(afox,a) = 3 // ditto
(a,foxa) = -1 // Nothing to trunkate. Just the numerical values of the first letters are compared
(foxa,a) = 1 // ditto
(afox,foxa) = -1 // ditto
(foxa,afox) = 1 // ditto
In case you want to get results -1, 0 or 1 always, like JS indexOf();
<?php
function cmp(string $str1, string $str2): int {
return ($str1 > $str2) - ($str1 < $str2);
}
$str1 = 'a';
$str2 = 'z';
var_dump(cmp($str1, $str2), strcmp($str1, $str2));
//=> int(-1) int(-25) int(-25)
$str1 = 'a';
$str2 = '1';
var_dump(cmp($str1, $str2), strcmp($str1, $str2));
//=> int(1) int(48) int(48)
?>
strcmp and strcasecmp does not work well with multibyte (UTF8) strings and there are no mb_strcmp or mb_strcasecmp - instead look at the wonderful Collator class with method compare (search for Collator above) - supports not only UTF8 but also different national collations (sort orders).
Natural sort is also supported, use setAttribute to set Collator::NUMERIC_COLLATION to Collator::ON.