mb_ereg_search_setpos
(PHP 4 >= 4.2.0, PHP 5)
mb_ereg_search_setpos — Set start point of next regular expression match
Description
bool mb_ereg_search_setpos ( int $position )
mb_ereg_search_setpos() sets the starting point of a match for mb_ereg_search().
Parameters
position
    The position to set.
Return Values
Returns TRUE on success or FALSE on failure.
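A minimal sketch of typical usage, assuming the mbstring extension is built with multibyte regex support. The subject string and pattern are illustrative only and do not come from the manual. The search is initialised, the start point is moved, and the next match is read from the new position.

```php
<?php
mb_regex_encoding('UTF-8');

// Illustrative subject: plain ASCII, so byte and character offsets coincide.
$subject = 'one two three';
mb_ereg_search_init($subject, '[a-z]+');

// Move the start point past "one " before searching.
$ok = mb_ereg_search_setpos(4);

// With no arguments, the pattern given to mb_ereg_search_init() is reused.
$regs = mb_ereg_search_regs();

// Start point that the next search would use.
$nextPos = mb_ereg_search_getpos();

if ($ok && $regs !== false) {
    echo $regs[0], PHP_EOL; // the first match at or after offset 4
}
```

Checking the return value before searching keeps a rejected position from going unnoticed.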
Notes
Note:
The internal encoding or the character encoding specified by mb_regex_encoding() will be used as the character encoding for this function.
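As a rough illustration of the note above, the sketch below sets the regex encoding explicitly before initialising a search, so that `.` matches whole characters rather than single bytes. The subject string is an assumption chosen for the example. Afterwards, mb_ereg_search_setpos(0) rewinds the search to the start of the string.

```php
<?php
// Select the encoding explicitly instead of relying on the internal encoding.
mb_regex_encoding('UTF-8');

// "a", "ñ" (two bytes in UTF-8), "b": 3 characters, 4 bytes.
$subject = "a\xC3\xB1b";
mb_ereg_search_init($subject, '.');

// Collect every match; each one should be a complete character.
$chars = [];
while (($regs = mb_ereg_search_regs()) !== false) {
    $chars[] = $regs[0];
}

// Rewind to the beginning and search again.
mb_ereg_search_setpos(0);
$first = mb_ereg_search_regs();

echo count($chars), PHP_EOL;
```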
See Also
- mb_regex_encoding() - Set/Get character encoding for multibyte regex
- mb_ereg_search_init() - Setup string and regular expression for a multibyte regular expression match
Comments
This function, like mb_ereg_search_pos(), appears to use byte offsets rather than character offsets. This seems counterintuitive for the mb_* functions, which otherwise take a character-based view of strings rather than a byte-based one. Even mb_strpos() returns a character offset.
The following code reveals this byte-oriented behaviour:
<?php
// "\u{1000}" (PHP 7+ syntax) is a single character that takes 3 bytes in UTF-8.
$x = 'abc456789' . "\u{1000}" . 'abc4567890';
$re = 'ab.';
echo 'x=' . $x . PHP_EOL;
echo 're=' . $re . PHP_EOL;

// Set the encoding before initialising the search.
mb_internal_encoding( mb_detect_encoding( $x ) );
mb_ereg_search_init( $x );

echo 'mb_strlen=' . mb_strlen( $x ) . PHP_EOL;
echo 'strlen=' . strlen( $x ) . PHP_EOL;

// For each offset, compare the character view, the byte view and the regex search.
foreach ( array( 0, 9, 10, 11, 12, 13 ) as $o ) {
    mb_ereg_search_setpos( $o );
    echo 'Offset=' . $o
        . ' mb_substr=' . mb_substr( $x, $o )
        . ' substr=' . substr( $x, $o )
        . ' mb_ereg_search_regs=' . print_r( mb_ereg_search_regs( $re ), true )
        . PHP_EOL;
}
?>
With character offsets, we would expect offsets 11 and above to return no search result, whereas what we see is:
x=abc456789ကabc4567890
re=ab.
mb_strlen=20
strlen=22
Offset=0 mb_substr=abc456789ကabc4567890 substr=abc456789ကabc4567890 mb_ereg_search_regs=Array
(
[0] => abc
)
Offset=9 mb_substr=ကabc4567890 substr=ကabc4567890 mb_ereg_search_regs=Array
(
[0] => abc
)
Offset=10 mb_substr=abc4567890 substr=��abc4567890 mb_ereg_search_regs=Array
(
[0] => abc
)
Offset=11 mb_substr=bc4567890 substr=�abc4567890 mb_ereg_search_regs=Array
(
[0] => abc
)
Offset=12 mb_substr=c4567890 substr=abc4567890 mb_ereg_search_regs=Array
(
[0] => abc
)
Offset=13 mb_substr=4567890 substr=bc4567890 mb_ereg_search_regs=
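If the byte-offset behaviour observed above holds, one possible workaround is to convert a character offset into a byte offset before calling mb_ereg_search_setpos(). The helper name char_to_byte_offset() is hypothetical and not part of mbstring. This is only a sketch, assuming UTF-8 input and the same test string as in the comment.

```php
<?php
// Hypothetical helper (not part of mbstring): byte length of the first
// $charOffset characters, which equals the byte offset of that character.
function char_to_byte_offset(string $s, int $charOffset, string $encoding = 'UTF-8'): int
{
    return strlen(mb_substr($s, 0, $charOffset, $encoding));
}

mb_regex_encoding('UTF-8');

// Same string as above; "\xE1\x80\x80" is U+1000 encoded as UTF-8.
$x = 'abc456789' . "\xE1\x80\x80" . 'abc4567890';
mb_ereg_search_init($x, 'ab.');

// Character offset 10 is the second "abc".
$byteOffset = char_to_byte_offset($x, 10);
mb_ereg_search_setpos($byteOffset);
$regs = mb_ereg_search_regs();

// Character offset 11 starts at "bc...", so no match is expected.
mb_ereg_search_setpos(char_to_byte_offset($x, 11));
$none = mb_ereg_search_regs();

echo 'byte offset=', $byteOffset, PHP_EOL;
```

With this conversion, character offsets 10 and 11 behave as a character-oriented reader would expect: the first finds the second "abc" and the second finds nothing.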