Dot
Outside a character class, a dot in the pattern matches any one character in the subject, including a non-printing character, but not (by default) newline. If the PCRE_DOTALL option is set, then dots match newlines as well. The handling of dot is entirely independent of the handling of circumflex and dollar, the only relationship being that they both involve newline characters. Dot has no special meaning in a character class.
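The behaviour described above can be checked with a short sketch. It assumes the PCRE build uses LF as its newline convention (the usual default); the subject strings and variable names are made up for illustration. In PHP, PCRE_DOTALL is set with the `s` pattern modifier.

```php
<?php
// Hypothetical subject: "a" and "b" separated by a newline.
$subject = "a\nb";

// Without PCRE_DOTALL the dot does not match the newline.
$default = preg_match('/a.b/', $subject);

// With the "s" modifier (PCRE_DOTALL) the dot matches the newline as well.
$dotall = preg_match('/a.b/s', $subject);

// Inside a character class the dot is a literal full stop.
$literalHit  = preg_match('/a[.]b/', 'a.b');
$literalMiss = preg_match('/a[.]b/', 'axb');

var_dump($default, $dotall, $literalHit, $literalMiss);
```

On a build with a different default newline convention, the first result may differ.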
\C can be used to match a single byte. This is meaningful in UTF-8 mode, where the dot matches a whole character, which may consist of multiple bytes.
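The difference between bytes and characters can be illustrated without \C itself, by counting what the dot matches with and without UTF-8 mode (the `u` modifier in PHP). This is only a sketch; the subject string is a made-up example.

```php
<?php
// "é" encoded as UTF-8 occupies two bytes: 0xC3 0xA9.
$subject = "\xC3\xA9";

// Without UTF-8 mode every byte is treated as one character,
// so the dot matches twice.
$byteMatches = preg_match_all('/./', $subject);

// In UTF-8 mode the dot consumes the whole two-byte character,
// so it matches once.
$charMatches = preg_match_all('/./u', $subject);

var_dump($byteMatches, $charMatches);
```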
Comments
Consider:
preg_match_all("/<img.*>/", $htmlfile, $match);
Since PCRE_DOTALL is not used, this pattern is not expected to match across multiple lines. However, in some cases it can, depending on the PCRE default settings and your data ($htmlfile). The problem is that PCRE builds can be configured to recognize different sequences as newlines, so a line ending in your data may not count as a newline for the build you are running.
To fix this, use:
preg_match_all("/(*ANY)<img.*>/", $htmlfile, $match);
Now any character that could possibly be seen as a newline will be interpreted as a newline by PCRE.
NOTE: This syntax has been available since PCRE version 7.3.
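The effect of the newline convention on the dot can be sketched as follows. To make the result independent of the build default, the first pattern forces the LF convention with (*LF), which belongs to the same family of settings as (*ANY). The subject string is a made-up example containing a lone carriage return.

```php
<?php
// Hypothetical subject: "a" and "b" separated by a bare carriage return.
$subject = "a\rb";

// With LF as the only newline, "\r" is an ordinary character,
// so the dot matches it.
$lf = preg_match('/(*LF)a.b/', $subject);

// With (*ANY), "\r" counts as a newline, so the dot does not match it.
$any = preg_match('/(*ANY)a.b/', $subject);

var_dump($lf, $any);
```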