The Spoofchecker class
(PHP >= 5.4.0, PECL intl >= 2.0.0)
Introduction
Class synopsis
Spoofchecker
{
/* Constants */
/* Methods */
}
Predefined Constants
Spoofchecker::SINGLE_SCRIPT_CONFUSABLE
Spoofchecker::MIXED_SCRIPT_CONFUSABLE
Spoofchecker::WHOLE_SCRIPT_CONFUSABLE
Spoofchecker::ANY_CASE
Spoofchecker::SINGLE_SCRIPT
Spoofchecker::INVISIBLE
Spoofchecker::CHAR_LIMIT
Table of Contents
- Spoofchecker::areConfusable — Checks whether two given strings are visually confusable
- Spoofchecker::__construct — Constructor
- Spoofchecker::isSuspicious — Checks if a given text contains any suspicious characters
- Spoofchecker::setAllowedLocales — Locales to use when running checks
- Spoofchecker::setChecks — Set the checks to run
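The methods listed above can be combined roughly as follows. This is a minimal sketch, assuming the intl extension is loaded; the helper name looksLikeSpoof() and the sample strings are illustrative and not part of the intl API. The actual results depend on the ICU version PHP was built against, so no output is shown.

```php
<?php
// Hypothetical helper (not part of intl): reports whether $candidate
// is suspicious on its own or visually confusable with $trusted.
function looksLikeSpoof(string $candidate, string $trusted): bool
{
    $checker = new Spoofchecker();

    // True when the candidate alone trips one of the enabled checks.
    if ($checker->isSuspicious($candidate)) {
        return true;
    }

    // Otherwise compare it against the string it might be imitating.
    return $checker->areConfusable($candidate, $trusted);
}

// intl is an optional extension, so guard the demonstration.
if (extension_loaded('intl')) {
    // Cyrillic U+0440 and U+0430 followed by Latin "ypal".
    var_dump(looksLikeSpoof("\u{0440}\u{0430}ypal", 'paypal'));
    var_dump(looksLikeSpoof('example', 'paypal'));
}
```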
Comments
From http://icu-project.org/apiref/icu4j/com/ibm/icu/text/SpoofChecker.html:
SINGLE_SCRIPT_CONFUSABLE: indicates that the two strings are visually confusable and that they are from the same script
MIXED_SCRIPT_CONFUSABLE: indicates that the two strings are visually confusable and that they are NOT from the same script
WHOLE_SCRIPT_CONFUSABLE: indicates that the two strings are visually confusable and that they are NOT from the same script, although each of them is a single-script string
ANY_CASE: Deprecated.
SINGLE_SCRIPT: Deprecated.
INVISIBLE: Check an identifier for the presence of invisible characters, such as zero-width spaces, or character sequences that are likely not to display, such as multiple occurrences of the same non-spacing mark.
CHAR_LIMIT: Check that an identifier contains only characters from a specified set of acceptable characters.
UTS 39, section 4, explains whole-script, mixed-script and single-script confusables: http://unicode.org/reports/tr39/#Confusable_Detection
The descriptions above are taken from the Java SpoofChecker class documentation at http://icu-project.org/apiref/icu4j/com/ibm/icu/text/SpoofChecker.html
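As an illustration of restricting the checker to one of these checks, the sketch below enables only INVISIBLE and tests a string that repeats the same non-spacing mark. It assumes the intl extension is loaded; hasInvisibleSequence() is a hypothetical helper name, and the outcome may vary between ICU versions, so no output is claimed.

```php
<?php
// Hypothetical helper (not part of intl): runs only the INVISIBLE check.
function hasInvisibleSequence(string $identifier): bool
{
    $checker = new Spoofchecker();

    // setChecks() replaces the enabled checks with the given bitmask,
    // so every check other than INVISIBLE is switched off here.
    $checker->setChecks(Spoofchecker::INVISIBLE);

    return $checker->isSuspicious($identifier);
}

// intl is an optional extension, so guard the demonstration.
if (extension_loaded('intl')) {
    // "a" followed by the combining acute accent (U+0301) twice.
    var_dump(hasInvisibleSequence("a\u{0301}\u{0301}"));
    var_dump(hasInvisibleSequence('plain'));
}
```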
Spoofchecker yields false positives by default when the Whole-Script Confusables (WSC) and Mixed-Script Confusables (MSC) checks are used.
These checks have been deprecated since ICU 58:
http://bugs.icu-project.org/trac/ticket/12549#comment:10
Workarounds: upgrade ICU to 58+, or exclude the MSC and WSC checks with Spoofchecker::setChecks().
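The setChecks() workaround might look like the sketch below. The function name makeChecker() is hypothetical, and the choice of which checks to keep (SINGLE_SCRIPT_CONFUSABLE and INVISIBLE) is an assumption made for illustration; adjust it to your own needs. INTL_ICU_VERSION is the constant the intl extension exposes for the linked ICU version.

```php
<?php
// Hypothetical factory (not part of intl): returns a Spoofchecker that
// skips the MSC and WSC checks when PHP is linked against ICU older than 58.
function makeChecker(): Spoofchecker
{
    $checker = new Spoofchecker();

    if (version_compare(INTL_ICU_VERSION, '58', '<')) {
        // Assumption: keep only these two checks. The bitmask leaves out
        // MIXED_SCRIPT_CONFUSABLE and WHOLE_SCRIPT_CONFUSABLE.
        $checker->setChecks(
            Spoofchecker::SINGLE_SCRIPT_CONFUSABLE | Spoofchecker::INVISIBLE
        );
    }

    return $checker;
}

// intl is an optional extension, so guard the demonstration.
if (extension_loaded('intl')) {
    $checker = makeChecker();
    var_dump($checker->isSuspicious('example'));
}
```

On ICU 58 and later the function returns the checker with its default configuration, which matches the first workaround (upgrading ICU).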