mb_ereg_replace

(PHP 4 >= 4.2.0, PHP 5)

mb_ereg_replace — Performs a regular expression replacement with multibyte support

Description

string mb_ereg_replace ( string $pattern , string $replacement , string $string [, string $option = "msr" ] )

Scans string for matches to pattern, then replaces the matched text with replacement.

Parameters

pattern

The regular expression pattern.

Multibyte characters may be used in pattern.

replacement

The replacement text.

string

The string being checked.

option

The matching conditions can be set with the option parameter. If i is specified, case is ignored. If x is specified, whitespace in the pattern is ignored. If m is specified, matching is performed in multiline mode and '.' also matches a newline. If p is specified, matching is performed in POSIX mode and a newline is treated as an ordinary character. If e is specified, the replacement string is evaluated as PHP code.
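
A minimal usage sketch for the parameters described above, assuming UTF-8 input. The sample strings are illustrative only and are not taken from the manual.

```php
<?php
mb_regex_encoding('UTF-8');

// Replace every run of digits; the multibyte text around the matches is left intact.
$masked = mb_ereg_replace('[0-9]+', '#', 'Заказ 123 и заказ 45');
echo $masked, "\n"; // Заказ # и заказ #

// The "i" option makes the match case-insensitive.
$greeting = mb_ereg_replace('hello', 'bye', 'Hello, HELLO, hello', 'i');
echo $greeting, "\n"; // bye, bye, bye
```

Note that the pattern is written without delimiters and that all matches are replaced, not just the first one.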

Return Values

The resulting string on success, or FALSE on error.

Notes

Note:
The internal encoding or the character encoding specified by mb_regex_encoding() will be used as the character encoding for this function.

Warning

Never use the e modifier when working with data from untrusted sources. No automatic escaping of this data takes place (unlike in preg_replace()). Ignoring this will most likely create a remote code execution vulnerability in your application.

See Also

mb_regex_encoding() - Returns the current encoding for multibyte regular expressions as a string
mb_eregi_replace() - Performs a case-insensitive regular expression replacement with multibyte support

Comments

Aug 09

Author: faxe at neostrada dot pl


A simple mb_str_ireplace() implementation - a faster (?) replacement for non-regexp multi-byte string replacement:



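<?php
function mb_str_ireplace($co, $naCo, $wCzym)
{
    // An empty needle can never be replaced; return early to avoid an endless loop.
    if ($co === '') {
        return $wCzym;
    }

    $wCzymM = mb_strtolower($wCzym);
    $coM    = mb_strtolower($co);
    $offset = 0;

    // mb_strpos() returns false (a bool) once there are no more matches.
    while (!is_bool($poz = mb_strpos($wCzymM, $coM, $offset)))
    {
        // Continue after the inserted text so it is not matched again.
        $offset = $poz + mb_strlen($naCo);
        $wCzym  = mb_substr($wCzym, 0, $poz) . $naCo . mb_substr($wCzym, $poz + mb_strlen($co));
        $wCzymM = mb_strtolower($wCzym);
    }

    return $wCzym;
}
?>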



[thiago - EDITOR NOTE: This function has improvements from d-okumura [aat] fi{dot}kyd[dot]co.jp]

2005-08-09 18:52:41

http://php5.kiev.ua/manual/ru/function.mb-ereg-replace.html

Feb 26

Author: vondrej(at)gmail(dot)com

Are you looking for htmlentities() for multibyte strings? This might help you. It only replaces &, <, >, " and ':

<?php
/**
 * Multibyte equivalent for htmlentities() [lite version :)]
 *
 * @param string $str
 * @param string $encoding
 * @return string
 **/
function mb_htmlentities($str, $encoding = 'utf-8') {
    mb_regex_encoding($encoding);
    // '&' must be replaced first so the entities added below are not escaped again.
    $pattern = array('&', '<', '>', '"', '\'');
    $replacement = array('&amp;', '&lt;', '&gt;', '&quot;', '&#39;');
    $count = count($pattern);
    for ($i = 0; $i < $count; $i++) {
        $str = mb_ereg_replace($pattern[$i], $replacement[$i], $str);
    }
    return $str;
}
?>

2006-02-26 17:47:52

Jul 09

Author: mpnicholas [@t] gmail (dot) com

Regarding the mb_str_ireplace() function: I benchmarked it against mb_eregi_replace() for single-character substitution, and it was significantly slower. Although it avoids the ereg call, I think the while loop ends up slowing it down too much for this to be practical.

2006-07-09 18:09:53

Nov 01

Author: squeegee

Well, if you calculated the lengths of the find and replace strings once instead of on every loop iteration, it would likely speed it up a lot.
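
One way to act on this suggestion, offered as a sketch rather than a benchmarked implementation: compute the lowercased strings and the needle length once, then build the result in a single pass instead of re-lowercasing the whole subject after every replacement. The function name mb_str_ireplace_once is hypothetical, and the code assumes that mb_strtolower() does not change the number of characters in either string.

```php
<?php
mb_internal_encoding('UTF-8');

// Hypothetical single-pass variant of mb_str_ireplace().
function mb_str_ireplace_once($search, $replace, $subject)
{
    if ($search === '') {
        return $subject; // nothing to search for
    }

    // Computed once, outside the loop.
    $searchLower  = mb_strtolower($search);
    $subjectLower = mb_strtolower($subject);
    $searchLen    = mb_strlen($search);

    $result = '';
    $offset = 0;

    while (($pos = mb_strpos($subjectLower, $searchLower, $offset)) !== false) {
        // Copy the untouched part of the original subject, then the replacement.
        $result .= mb_substr($subject, $offset, $pos - $offset) . $replace;
        $offset  = $pos + $searchLen;
    }

    // Append whatever follows the last match.
    return $result . mb_substr($subject, $offset);
}

echo mb_str_ireplace_once('ÄB', 'x', 'aäbÄBc'), "\n"; // axxc
```

Because positions are found in the lowercased copy and applied to the original, the assumption about unchanged character counts matters; whether this beats mb_eregi_replace() would still need measuring.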

2006-11-01 09:41:01

Dec 04

Author: Anonymous

The 'i' option does not work correctly with multibyte characters. The function does not locate or replace a multibyte string when its case differs from the case of the multibyte needle.

2006-12-04 10:36:33

Jul 01

Author: gmx dot net at ulrich dot mierendorff

If you want to replace characters like "ä" or "ø" you can use mb_ereg_replace, but it is very slow. str_replace is much faster and also works with characters like "ä" or "ø"!

I think this has something to do with the fact that str_replace works at the byte level and does not care about characters.

I hope that helps.
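
A small sketch of the point above, assuming both the script and the data are UTF-8 (the sample text is made up for illustration). In UTF-8 the byte sequence of one character never appears inside the sequence of another, so a byte-level str_replace() gives the same result as mb_ereg_replace() for literal search strings:

```php
<?php
mb_regex_encoding('UTF-8');

$text = 'Bär, Jøran, Käse';

// Byte-level replacement of literal multibyte characters.
$viaStrReplace = str_replace(array('ä', 'ø'), array('ae', 'oe'), $text);

// The same replacement through the multibyte regex engine.
$viaMbEreg = mb_ereg_replace('ä', 'ae', $text);
$viaMbEreg = mb_ereg_replace('ø', 'oe', $viaMbEreg);

echo $viaStrReplace, "\n";                   // Baer, Joeran, Kaese
var_dump($viaStrReplace === $viaMbEreg);     // bool(true)
```

This equivalence should not be assumed for encodings such as Shift_JIS, where a trailing byte of one character can equal an ASCII byte.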

2008-07-01 10:39:43

Jul 24

Author: keizo at gomo dot jp

<?php
$pattern = "([あ-ん]+)[0-9]+";
$string = mb_ereg_replace($pattern, '「\\1」:\\0', $string);
?>

You can use \\n to reference capture group n in the replacement.

2008-07-24 00:32:30

Feb 03

Author: daemoneye at gmail dot com

I got a pretty nasty error while trying to parse table rows (all contents were set to UTF-8) from the database for a dictionary project. The idea was to get all the rows from the first table (a table with a Bulgarian phrase in the first field and its translations into English, French and German in the following fields). I needed to index all the Bulgarian words found in the table to build an intelligent search. And that is where my headache started.

First of all, even with mb_strtolower() a lot of Cyrillic characters were corrupted (for example 'т', 'ъ', 'у', 'ф', 'б', 'г', 'з', 'ж' and so on). After an hour of different attempts I arrived at this solution:



<?php

mb_internal_encoding("UTF-8");
mb_regex_encoding("UTF-8");

$rows = $db->getRows();

$contents = array();
foreach ($rows as $eachRow)
{
    $cleared = str_replace($commonWords, ' ', mb_strtolower(stripslashes($eachRow['bulgarian']), 'UTF-8'));
    if (trim($cleared) != '') $contents[] = trim($cleared);
}

$list = array();
foreach ($contents as $eachRow)
{
    $exploded = explode(' ', $eachRow);
    foreach ($exploded as $eachExpl)
    {
        // Replace everything except lowercase Cyrillic letters, then trim once
        // so the duplicate check compares the same value that gets stored.
        $eachExpl = trim(mb_ereg_replace('[^а-я ]', ' ', $eachExpl));
        if ($eachExpl != '' && !in_array($eachExpl, $list, true)) $list[] = $eachExpl;
    }
}

?>



For this to work properly I had to set all the internal encoding settings to UTF-8. Otherwise the default Latin-1 left half my database with missing characters.

I am posting this solution in case someone has encountered a similar problem. I hope it helps.

2009-02-03 04:53:20

Dec 30

Author: Pluche

Unlike preg_replace, mb_ereg_replace doesn't use delimiters around the pattern.

Example with preg_replace:
<?php $data = preg_replace("/[^A-Za-z0-9\.\-]/","",$data); ?>

Example with mb_ereg_replace:
<?php $data = mb_ereg_replace("[^A-Za-z0-9\.\-]","",$data); ?>
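
To illustrate why this matters, here is a short sketch (the sample string is made up): if preg-style slashes are left in an mb_ereg_replace pattern, they are treated as literal characters that must appear in the subject.

```php
<?php
mb_regex_encoding('UTF-8');

$data = 'a1b /2/ c';

// Without delimiters: every digit is removed.
$withoutDelimiters = mb_ereg_replace('[0-9]', '', $data);
echo $withoutDelimiters, "\n"; // ab // c

// With preg-style slashes: only a digit enclosed in literal slashes matches.
$withSlashes = mb_ereg_replace('/[0-9]/', '', $data);
echo $withSlashes, "\n"; // a1b  c
```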

2010-12-30 06:17:08

Jun 15

Author: trng


You can use \\n for capture group in replacement.

And you can NOT use $n notation (unlike preg_replace function).
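
A short sketch of the difference, using a made-up sample string: the backslash form is substituted, while the dollar form is copied into the result as-is.

```php
<?php
mb_regex_encoding('UTF-8');

// '\\2-\\1' in a single-quoted PHP string reaches the function as \2-\1.
$swapped = mb_ereg_replace('([a-z]+)-([a-z]+)', '\\2-\\1', 'left-right');
echo $swapped, "\n"; // right-left

// $2 and $1 have no special meaning here and are inserted literally.
$literal = mb_ereg_replace('([a-z]+)-([a-z]+)', '$2-$1', 'left-right');
echo $literal, "\n"; // $2-$1
```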

2011-06-15 12:22:22

Feb 03

Author: marco at thenetworksolution dot it

To selectively uppercase parts of a string via mb_eregi_replace:

<?php
$str = mb_eregi_replace('\b([0-9]{1,4}[a-z]{1,2})\b', "strtoupper('\\1')", $str, 'e');
?>

Full example: how to fix a manually typed address, uppercasing the first letter of each word and keeping Roman numerals and the letters A, B, C after the house number in uppercase:

<?php
function ucAddress($str) {
    // first lowercase everything and use the default ucwords
    $str = ucwords(strtolower($str));
    // let's fix the default ucwords...
    // uppercase the letters after the house number (lowercased by strtolower above)
    $str = mb_eregi_replace('\b([0-9]{1,4}[a-z]{1,2})\b', "strtoupper('\\1')", $str, 'e');
    // the same for Roman numerals
    $str = mb_eregi_replace('\bM{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b', "strtoupper('\\0')", $str, 'e');
    return $str;
}
?>
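
The e modifier was deprecated in PHP 7.1 and removed in PHP 8.0, so the code above only runs on older versions. A possible alternative, sketched here under the assumption that mb_ereg_replace_callback() is available (PHP 5.4.1 or later), passes a callback instead of a code string. The function name ucAddressCallback is hypothetical, and only the house-number step is shown.

```php
<?php
mb_regex_encoding('UTF-8');

// Hypothetical callback-based variant of ucAddress(); house-number step only.
function ucAddressCallback($str)
{
    // Same first step as the original: lowercase everything, then ucwords.
    $str = ucwords(strtolower($str));

    // The callback receives the match array; index 0 holds the whole match.
    $toUpper = function (array $matches) {
        return strtoupper($matches[0]);
    };

    // The "i" option keeps the match case-insensitive, as mb_eregi_replace() did.
    return mb_ereg_replace_callback('\b[0-9]{1,4}[a-z]{1,2}\b', $toUpper, $str, 'i');
}

echo ucAddressCallback('via roma 12b'), "\n"; // Via Roma 12B
```

Since no string is ever evaluated as PHP code, the matched text cannot inject code, which is the risk the e modifier carries.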

2014-02-03 16:47:37

Jan 19

Author: Anonymous


Pluche's comment should REALLY be added to the documentation, preferably under the "$pattern" param description. It is crucial to using this function.

2016-01-19 13:23:24

Jul 17

Author: ms2705335 at gmail dot com

As trng mentioned before, you can use \\n in the replacement but NOT \\\\n as mentioned in the preg_replace docs. So the string definition will look like:

$str = '\\1';

2017-07-17 13:10:37

Sep 01

Author: Alexey Khrulev

If the encoding of the PHP script differs from the encoding of the string to be processed by mb_ereg_replace(), you can't just write the pattern in the script. Both $pattern and $replacement must be converted to the same encoding as the string to be processed. In this example the script is in UTF-8 and the file to be processed is in UTF-16LE:

<?php
$file_encoding = 'UTF-16LE';
mb_regex_encoding( $file_encoding );

$pattern     = "aaa";
$replacement = "AAA";
$pattern_encoded     = mb_convert_encoding( $pattern,     $file_encoding, 'UTF-8' );
$replacement_encoded = mb_convert_encoding( $replacement, $file_encoding, 'UTF-8' );

$result = mb_ereg_replace( $pattern_encoded, $replacement_encoded, file_get_contents('UTF-16LE.txt') );
file_put_contents('UTF-16LE-updated.txt', $result);
?>

2017-09-01 01:10:26

Feb 07

Author: j-fr dot fortier at wanadoo dot fr

Since PHP 5.4, to uppercase or lowercase characters, or to rewrite some URIs, without worrying about the initial encoding, transliteration is easier (and probably the best way): see http://php.net/manual/fr/transliterator.transliterate.php and http://userguide.icu-project.org/transforms/general

For example (with create), for French text: replace all accented characters (éèàîïùç...) with ASCII characters:
<?php
$transliterator = Transliterator::create("NFD; [:Nonspacing Mark:] Remove; NFC;");
echo $transliterator->transliterate("Héhé, ça marche !");
?>
// Result: « Hehe, ca marche ! »

To rewrite a phrase as a URI (with createFromRules):
<?php
$transliterator = Transliterator::createFromRules("::Latin-ASCII; ::Lower; [^[:L:][:N:]]+ > '-';");
echo trim($transliterator->transliterate("Héhé, ça marche !"), '-');
?>
// Result: « hehe-ca-marche »

2019-02-07 18:56:55

May 19

Author: Anonymous

Notations to reference captures in the replacement string:

<?php

// (1) \\number notation: (1 to 9, not greater than 9)
echo mb_ereg_replace('(\S*) (\S*) (\S*)', '\\1 jam, \\2 juice, \\3 squash', 'apple orange lemon').'<br>'; // apple jam, orange juice, lemon squash

// (2) \k<number> notation: (also greater than 9) (also as \k'number')
echo mb_ereg_replace('(\S*) (\S*) (\S*)', '\k<1> jam, \k<2> juice, \k<3> squash', 'apple orange lemon').'<br>'; // (same as above)

// (3) \k<word> notation: (also as \k'word')
echo mb_ereg_replace('(?<word1>\S*) (?<word2>\S*) (?<word3>\S*)', '\k<word1> jam, \k<word2> juice, \k<word3> squash', 'apple orange lemon').'<br>'; // (same as above)

// Note: unnamed subpatterns such as "(\S*)" should not be mixed with named subpatterns such as "(?<word>..)", because unnamed subpatterns are not captured when named subpatterns exist.

?>

2022-05-19 21:35:18
