Sanitize filters
ID | Name | Options | Flags | Description |
---|---|---|---|---|
FILTER_SANITIZE_EMAIL |
"email" | Remove all characters except letters, digits and !#$%&'*+-/=?^_`{|}~@.[]. | ||
FILTER_SANITIZE_ENCODED |
"encoded" |
FILTER_FLAG_STRIP_LOW ,
FILTER_FLAG_STRIP_HIGH ,
FILTER_FLAG_ENCODE_LOW ,
FILTER_FLAG_ENCODE_HIGH
|
URL-encode string, optionally strip or encode special characters. | |
FILTER_SANITIZE_MAGIC_QUOTES |
"magic_quotes" | Apply addslashes(). | ||
FILTER_SANITIZE_NUMBER_FLOAT |
"number_float" |
FILTER_FLAG_ALLOW_FRACTION ,
FILTER_FLAG_ALLOW_THOUSAND ,
FILTER_FLAG_ALLOW_SCIENTIFIC
|
Remove all characters except digits, +- and optionally .,eE. | |
FILTER_SANITIZE_NUMBER_INT |
"number_int" | Remove all characters except digits, plus and minus sign. | ||
FILTER_SANITIZE_SPECIAL_CHARS |
"special_chars" |
FILTER_FLAG_STRIP_LOW ,
FILTER_FLAG_STRIP_HIGH ,
FILTER_FLAG_ENCODE_HIGH
|
HTML-escape '"<>& and characters with ASCII value less than 32, optionally strip or encode other special characters. | |
FILTER_SANITIZE_FULL_SPECIAL_CHARS |
"full_special_chars" |
FILTER_FLAG_NO_ENCODE_QUOTES ,
|
Equivalent to calling htmlspecialchars() with ENT_QUOTES set. Encoding quotes can
be disabled by setting FILTER_FLAG_NO_ENCODE_QUOTES . Like htmlspecialchars(), this
filter is aware of the default_charset and if a sequence of bytes is detected that
makes up an invalid character in the current character set then the entire string is rejected resulting in a 0-length string.
When using this filter as a default filter, see the warning below about setting the default flags to 0.
|
|
FILTER_SANITIZE_STRING |
"string" |
FILTER_FLAG_NO_ENCODE_QUOTES ,
FILTER_FLAG_STRIP_LOW ,
FILTER_FLAG_STRIP_HIGH ,
FILTER_FLAG_ENCODE_LOW ,
FILTER_FLAG_ENCODE_HIGH ,
FILTER_FLAG_ENCODE_AMP
|
Strip tags, optionally strip or encode special characters. | |
FILTER_SANITIZE_STRIPPED |
"stripped" | Alias of "string" filter. | ||
FILTER_SANITIZE_URL |
"url" | Remove all characters except letters, digits and $-_.+!*'(),{}|\\^~[]`<>#%";/?:@&=. | ||
FILTER_UNSAFE_RAW |
"unsafe_raw" |
FILTER_FLAG_STRIP_LOW ,
FILTER_FLAG_STRIP_HIGH ,
FILTER_FLAG_ENCODE_LOW ,
FILTER_FLAG_ENCODE_HIGH ,
FILTER_FLAG_ENCODE_AMP
|
Do nothing, optionally strip or encode special characters. |
Warning
When using one of these filters as a default filter either through your ini file
or through your web server's configuration, the default flags is set to
FILTER_FLAG_NO_ENCODE_QUOTES
. You need to explicitly set
filter.default_flags to 0 to have quotes encoded by default. Like this:
Example #1 Configuring the default filter to act like htmlspecialchars
filter.default = full_special_chars
filter.default_flags = 0
Коментарии
It's not entirely clear what the LOW and HIGH ranges are. LOW is characters below 32, HIGH is those above 127, i.e. outside the ASCII range.
<?php
$a = "\tcafé\n";
//This will remove the tab and the line break
echo filter_var($a, FILTER_SANITIZE_STRING, FILTER_FLAG_STRIP_LOW);
//This will remove the é.
echo filter_var($a, FILTER_SANITIZE_STRING, FILTER_FLAG_STRIP_HIGH);
?>
Remember to trim() the $_POST before your filters are applied:
<?php
// We trim the $_POST data before any spaces get encoded to "%20"
// Trim array values using this function "trim_value"
function trim_value(&$value)
{
$value = trim($value); // this removes whitespace and related characters from the beginning and end of the string
}
array_filter($_POST, 'trim_value'); // the data in $_POST is trimmed
$postfilter = // set up the filters to be used with the trimmed post array
array(
'user_tasks' => array('filter' => FILTER_SANITIZE_STRING, 'flags' => !FILTER_FLAG_STRIP_LOW), // removes tags. formatting code is encoded -- add nl2br() when displaying
'username' => array('filter' => FILTER_SANITIZE_ENCODED, 'flags' => FILTER_FLAG_STRIP_LOW), // we are using this in the url
'mod_title' => array('filter' => FILTER_SANITIZE_ENCODED, 'flags' => FILTER_FLAG_STRIP_LOW), // we are using this in the url
);
$revised_post_array = filter_var_array($_POST, $postfilter); // must be referenced via a variable which is now an array that takes the place of $_POST[]
echo (nl2br($revised_post_array['user_tasks'])); //-- use nl2br() upon output like so, for the ['user_tasks'] array value so that the newlines are formatted, since this is our HTML <textarea> field and we want to maintain newlines
?>
Just to clarify, since this may be unknown for a lot of people:
ASCII characters above 127 are known as "Extended" and they represent characters such as greek letters and accented letters in latin alphabets, used in languages such as pt_BR.
A good ASCII quick reference (aside from the already mentioned Wikipedia article) can be found at: http://www.asciicodes.com/
Here is a simpler and a better presented ASCII list for the <32 or 127> filters
(if wikipedia confused the hell out of you):
http://www.danshort.com/ASCIImap/
FILTER_SANITIZE_STRING doesn't behavior the same as strip_tags function. strip_tags allows less than symbol inferred from context, FILTER_SANITIZE_STRING strips regardless.
<?php
$smaller = "not a tag < 5";
echo strip_tags($smaller); // -> not a tag < 5
echo filter_var ( $smaller, FILTER_SANITIZE_STRING); // -> not a tag
?>
To include multiple flags, simply separate the flags with vertical pipe symbols.
For example, if you want to use filter_var() to sanitize $string with FILTER_SANITIZE_STRING and pass in FILTER_FLAG_STRIP_HIGH and FILTER_FLAG_STRIP_LOW, just call it like this:
$string = filter_var($string, FILTER_SANITIZE_STRING, FILTER_FLAG_STRIP_HIGH | FILTER_FLAG_STRIP_LOW);
The same goes for passing a flags field in an options array in the case of using callbacks.
$var = filter_var($string, FILTER_SANITIZE_SPECIAL_CHARS,
array('flags' => FILTER_FLAG_STRIP_LOW | FILTER_FLAG_ENCODE_HIGH));
Thanks to the Brain Goo blog at popmartian.com/tipsntricks/for this info.
Please be aware that when using filter_var() with FILTER_SANITIZE_NUMBER_FLOAT and FILTER_SANITIZE_NUMBER_INT the result will be a string, even if the input value is actually a float or an int.
Use FILTER_VALIDATE_FLOAT and FILTER_VALIDATE_INT, which will convert the result to the expected type.
Although it's specifically mentioned in the above documentation, because many seem to find this unintuitive it's worth pointing out that FILTER_SANITIZE_NUMBER_FLOAT will remove the decimal character unless you specify FILTER_FLAG_ALLOW_FRACTION:
<?php
$number_string = '12.34';
echo filter_var( $number_string, FILTER_SANITIZE_NUMBER_FLOAT ); // 1234
echo filter_var( $number_string, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION ); // 12.34
?>
With the deprecation of FILTER_SANITIZE_STRING, the "use htmlspecialchars instead" is an incomplete comment. The functionality of FILTER_SANITIZE_STRING was a combination of htmlspcialchars and (approximately) strip_tags. For true compatibility a polyfil may be needed:
<?php
function filter_string_polyfill(string $string): string
{
$str = preg_replace('/\x00|<[^>]*>?/', '', $string);
return str_replace(["'", '"'], [''', '"'], $str);
}
$string = "Some \"' <bizzare> string & to Sanitize < !$@%";
echo filter_var($string,FILTER_SANITIZE_STRING).PHP_EOL;
//Some "' string & to Sanitize
echo htmlspecialchars($string).PHP_EOL;
//Some "' <bizzare> string & to Sanitize < !$@%
echo strip_tags($string).PHP_EOL;
//Some "' string & to Sanitize < !$@%
echo htmlspecialchars(strip_tags($string,ENT_QUOTES)).PHP_EOL;
//Some "' string & to Sanitize < !$@%
echo filter_string_polyfill($string).PHP_EOL;
//Some "' string & to Sanitize
None of these parameters do answer my need, so I've made my own function, hope it helps some other folks!
function removeSpecialChars($valueToClean)
{
return htmlspecialchars(str_replace([",", "#", "$", "%", "*", "~", "'", "=", "{", "[", "|", "`", "^", "]", "}", ":", ";", "<", ">", "/", "?", "&"], "", $valueToClean));
}