get_html_translation_table

(PHP 4, PHP 5)

get_html_translation_table — Returns the translation table used by htmlspecialchars() and htmlentities()

Description

array get_html_translation_table ([ int $table = HTML_SPECIALCHARS [, int $flags = ENT_COMPAT | ENT_HTML401 [, string $encoding = "UTF-8" ]]] )

get_html_translation_table() will return the translation table that is used internally for htmlspecialchars() and htmlentities().

Note:
Special characters can be encoded in several ways. E.g. " can be encoded as &quot;, &#34; or &#x22;. get_html_translation_table() returns only the form used by htmlspecialchars() and htmlentities().
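
As a rough illustration of the relationship described above, the following sketch applies the table with strtr() and compares the result with htmlspecialchars(). The sample string and the variable names are invented for this example, and the flags and encoding are passed explicitly because their defaults differ between PHP versions.

```php
<?php
// Hypothetical sample input; any valid UTF-8 string should behave the same way.
$input = '<a href="x">Tom & Jerry\'s</a>';

$flags = ENT_QUOTES | ENT_HTML401;
$table = get_html_translation_table(HTML_SPECIALCHARS, $flags, 'UTF-8');

// strtr() with an array replaces each key by its value in a single pass,
// so entities that were just produced are not escaped a second time.
$viaTable    = strtr($input, $table);
$viaFunction = htmlspecialchars($input, $flags, 'UTF-8');

var_dump($viaTable === $viaFunction);
```

In practice calling htmlspecialchars() directly is simpler; the table is mainly useful when it has to be inspected or extended.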

Parameters

table

Which table to return. Either HTML_ENTITIES or HTML_SPECIALCHARS.

flags

A bitmask of one or more of the following flags, which specify which quotes the table will contain as well as which document type the table is for. The default is ENT_COMPAT | ENT_HTML401.

**Available `flags` constants**
Constant Name	Description
`ENT_COMPAT`	Table will contain entities for double-quotes, but not for single-quotes.
`ENT_QUOTES`	Table will contain entities for both double and single quotes.
`ENT_NOQUOTES`	Table will neither contain entities for single quotes nor for double quotes.
`ENT_HTML401`	Table for HTML 4.01.
`ENT_XML1`	Table for XML 1.
`ENT_XHTML`	Table for XHTML.
`ENT_HTML5`	Table for HTML 5.
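
The effect of the quote flags can be checked with a short sketch like the one below. It only tests whether the double and single quote characters appear as keys of the HTML_SPECIALCHARS table; the array and variable names are illustrative, not part of the API.

```php
<?php
// Compare which quote characters appear as keys under each quote flag.
$quoteFlags = array(
    'ENT_COMPAT'   => ENT_COMPAT,
    'ENT_QUOTES'   => ENT_QUOTES,
    'ENT_NOQUOTES' => ENT_NOQUOTES,
);

$summary = array();
foreach ($quoteFlags as $name => $flag) {
    $table = get_html_translation_table(HTML_SPECIALCHARS, $flag | ENT_HTML401, 'UTF-8');
    $summary[$name] = array(
        'double' => isset($table['"']),
        'single' => isset($table["'"]),
    );
}

// Print one line per flag showing which quotes the table contains.
foreach ($summary as $name => $row) {
    printf(
        "%-12s double: %s  single: %s\n",
        $name,
        $row['double'] ? 'yes' : 'no',
        $row['single'] ? 'yes' : 'no'
    );
}
```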

encoding

Encoding to use. If omitted, the default value for this argument is ISO-8859-1 in versions of PHP prior to 5.4.0, and UTF-8 from PHP 5.4.0 onwards.

The following character sets are supported:

**Supported charsets**
Charset	Aliases	Description
ISO-8859-1	ISO8859-1	Western European, Latin-1.
ISO-8859-5	ISO8859-5	Little-used Cyrillic charset (Latin/Cyrillic).
ISO-8859-15	ISO8859-15	Western European, Latin-9. Adds the Euro sign, French and Finnish letters missing in Latin-1 (ISO-8859-1).
UTF-8		ASCII compatible multi-byte 8-bit Unicode.
cp866	ibm866, 866	DOS-specific Cyrillic charset.
cp1251	Windows-1251, win-1251, 1251	Windows-specific Cyrillic charset.
cp1252	Windows-1252, 1252	Windows-specific charset for Western European languages.
KOI8-R	koi8-ru, koi8r	Russian.
BIG5	950	Traditional Chinese, mainly used in Taiwan.
GB2312	936	Simplified Chinese, national standard character set.
BIG5-HKSCS		Big5 with Hong Kong extensions, Traditional Chinese.
Shift_JIS	SJIS, SJIS-win, cp932, 932	Japanese
EUC-JP	EUCJP, eucJP-win	Japanese
MacRoman		Charset that was used by Mac OS.
''		An empty string activates detection from script encoding (Zend multibyte), default_charset and current locale (see nl_langinfo() and setlocale()), in this order. Not recommended.

Note: Any other character sets are not recognized. The default encoding will be used instead and a warning will be emitted.
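
The encoding determines how the keys of the returned array are encoded. The sketch below, with illustrative variable names, looks up the same character ("é") in tables requested for two different encodings.

```php
<?php
$flags = ENT_COMPAT | ENT_HTML401;

$latin1 = get_html_translation_table(HTML_ENTITIES, $flags, 'ISO-8859-1');
$utf8   = get_html_translation_table(HTML_ENTITIES, $flags, 'UTF-8');

// "é" is the single byte 0xE9 in ISO-8859-1 and the two bytes 0xC3 0xA9 in UTF-8.
$latin1Entity = isset($latin1["\xE9"]) ? $latin1["\xE9"] : null;
$utf8Entity   = isset($utf8["\xC3\xA9"]) ? $utf8["\xC3\xA9"] : null;

var_dump($latin1Entity, $utf8Entity);
```

A table requested for one encoding should therefore only be applied to strings in that same encoding.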

Return Values

Returns the translation table as an array, with the original characters as keys and entities as values.
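
Because characters are the keys and entities are the values, flipping the array gives a lookup in the opposite direction. The following is a minimal sketch with an invented sample string; for real decoding the built-in htmlspecialchars_decode() and html_entity_decode() are the usual choice.

```php
<?php
$flags = ENT_QUOTES | ENT_HTML401;
$table = get_html_translation_table(HTML_SPECIALCHARS, $flags, 'UTF-8');

// Flip the table so entities become keys and characters become values.
$decodeTable = array_flip($table);

// Hypothetical sample input.
$encoded = '&lt;b&gt;Fish &amp; Chips&lt;/b&gt;';
$decoded = strtr($encoded, $decodeTable);

var_dump($decoded === htmlspecialchars_decode($encoded, $flags));
```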

Changelog

Version	Description
5.4.0	The default value for the `encoding` parameter was changed to UTF-8.
5.4.0	The constants `ENT_HTML401`, `ENT_XML1`, `ENT_XHTML` and `ENT_HTML5` were added.
5.3.4	The `encoding` parameter was added.

Examples

Example #1 Translation Table Example


<?php
var_dump(get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML5));
?>

The above example will output something similar to:

array(1510) {
  ["
"]=>
  string(9) "&NewLine;"
  ["!"]=>
  string(6) "&excl;"
  ["""]=>
  string(6) "&quot;"
  ["#"]=>
  string(5) "&num;"
  ["$"]=>
  string(8) "&dollar;"
  ["%"]=>
  string(8) "&percnt;"
  ["&"]=>
  string(5) "&amp;"
  ["'"]=>
  string(6) "&apos;"
  // ...
}

Comments

Jun 19

Author: dirk at hartmann dot net


get_html_translation_table

It works only with the first 256 code positions.

For higher positions, for example &#1092; (a Cyrillic letter), it shows the same.

2001-06-19 16:41:41

http://php5.kiev.ua/manual/ru/function.get-html-translation-table.html

Oct 28

Author: kumar at chicagomodular.com


Without heavy scientific analysis, this seems to work as a quick fix for making text that originates from a Microsoft Word document display as HTML:

<?php
function DoHTMLEntities($string)
{
    $trans_tbl = get_html_translation_table(HTML_ENTITIES);

    // MS Word strangeness:
    // smart single/double quotes
    $trans_tbl[chr(145)] = '\'';
    $trans_tbl[chr(146)] = '\'';
    $trans_tbl[chr(147)] = '&quot;';
    $trans_tbl[chr(148)] = '&quot;';

    // Acute 'e'
    $trans_tbl[chr(142)] = '&eacute;';

    return strtr($string, $trans_tbl);
}
?>

2002-10-28 19:51:50


Jan 03

Author: kevin_bro at hostedstuff dot com


Alan's version didn't seem to work right. If you're having the same problem, consider using this slightly modified version instead:

function unhtmlentities($string) {
   $trans_tbl = get_html_translation_table(HTML_ENTITIES);
   $trans_tbl = array_flip($trans_tbl);
   $ret = strtr($string, $trans_tbl);
   // The /e modifier used originally was removed in PHP 7;
   // preg_replace_callback() does the same job with a closure.
   return preg_replace_callback('/&#(\d+);/', function ($matches) {
      return chr((int) $matches[1]);
   }, $ret);
}

2003-01-03 08:06:58


May 18

Author: Alex Minkoff


If you want to display special HTML entities in a web browser, you can use the following code:

<?php
$entities = get_html_translation_table(HTML_ENTITIES);
$new_entities = array();
foreach ($entities as $entity) {
    // Escape the entity itself so the browser shows it literally.
    $new_entities[$entity] = htmlspecialchars($entity);
}
echo "<pre>";
print_r($new_entities);
echo "</pre>";
?>

If you don't, the key name of each element will appear to be the same as the element content itself, making it look mighty stupid. ;)

2005-05-18 19:30:09


May 29

Author: Patrick nospam at nospam mesopia dot com


Not sure what's going on here, but I've run into a problem that others might face as well.

<?php
$translations = array_flip(get_html_translation_table(HTML_ENTITIES, ENT_QUOTES));
?>

returns the single quote ' as being equal to &#39; while

<?php
$translatedString = htmlentities($string, ENT_QUOTES);
?>

returns it as being equal to &#039;

I've had to do a specific string replacement for the time being. Not sure if it's an issue with the function or the array manipulation.

-Pat

2005-05-29 22:00:57


Dec 31

Author: Jérôme Jaglale


htmlentities() includes htmlspecialchars(), so here's how to convert a UTF-8 string:

htmlentities($string, ENT_QUOTES, 'UTF-8');

2006-12-31 13:43:36


Jul 20

Author: Maurizio Siliani at trident dot it


If you have trouble (like me) getting data from ISO-8859-1 encoded forms where users copy and paste from Word, this routine could be useful.

It adds to the standard get_html_translation_table() output the codes of the characters that MS Word usually substitutes into typed text.

Otherwise those characters would never be displayed correctly in HTML output.

function get_html_translation_table_CP1252() {
    $trans = get_html_translation_table(HTML_ENTITIES);
    $trans[chr(130)] = '&sbquo;';    // Single Low-9 Quotation Mark
    $trans[chr(131)] = '&fnof;';     // Latin Small Letter F With Hook
    $trans[chr(132)] = '&bdquo;';    // Double Low-9 Quotation Mark
    $trans[chr(133)] = '&hellip;';   // Horizontal Ellipsis
    $trans[chr(134)] = '&dagger;';   // Dagger
    $trans[chr(135)] = '&Dagger;';   // Double Dagger
    $trans[chr(136)] = '&circ;';     // Modifier Letter Circumflex Accent
    $trans[chr(137)] = '&permil;';   // Per Mille Sign
    $trans[chr(138)] = '&Scaron;';   // Latin Capital Letter S With Caron
    $trans[chr(139)] = '&lsaquo;';   // Single Left-Pointing Angle Quotation Mark
    $trans[chr(140)] = '&OElig;';    // Latin Capital Ligature OE
    $trans[chr(145)] = '&lsquo;';    // Left Single Quotation Mark
    $trans[chr(146)] = '&rsquo;';    // Right Single Quotation Mark
    $trans[chr(147)] = '&ldquo;';    // Left Double Quotation Mark
    $trans[chr(148)] = '&rdquo;';    // Right Double Quotation Mark
    $trans[chr(149)] = '&bull;';     // Bullet
    $trans[chr(150)] = '&ndash;';    // En Dash
    $trans[chr(151)] = '&mdash;';    // Em Dash
    $trans[chr(152)] = '&tilde;';    // Small Tilde
    $trans[chr(153)] = '&trade;';    // Trade Mark Sign
    $trans[chr(154)] = '&scaron;';   // Latin Small Letter S With Caron
    $trans[chr(155)] = '&rsaquo;';   // Single Right-Pointing Angle Quotation Mark
    $trans[chr(156)] = '&oelig;';    // Latin Small Ligature OE
    $trans[chr(159)] = '&Yuml;';     // Latin Capital Letter Y With Diaeresis
    ksort($trans);
    return $trans;
}

2007-07-20 11:43:03


Sep 07

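Author: iain (duh) workingsoftware.com.au


I wrote a quick little snippet for converting something like '&middot;' into '&#183;':

$to_convert = '&middot;';
// ord() reads a single byte, so request the single-byte ISO-8859-1 table explicitly
// (the default encoding is UTF-8 from PHP 5.4.0 onwards).
$table = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT, 'ISO-8859-1');
$equiv = '&#'.ord(array_search($to_convert,$table)).';';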
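
For entities outside Latin-1, a UTF-8 table can be used instead. The following is only a sketch: utf8_code_point() and named_to_numeric_entity() are hypothetical helpers written for this example, and it assumes the HTML 4.01 table, where every key is a single character. If the mbstring extension is available, mb_ord() could replace the hand-written decoder.

```php
<?php
// Hypothetical helper: decode one UTF-8 encoded character to its code point.
function utf8_code_point($char)
{
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        return $bytes[0];
    }
    // Keep the payload bits of the lead byte, then append 6 bits per continuation byte.
    $codePoint = $bytes[0] & (0xFF >> ($count + 1));
    for ($i = 1; $i < $count; $i++) {
        $codePoint = ($codePoint << 6) | ($bytes[$i] & 0x3F);
    }
    return $codePoint;
}

// Hypothetical helper: returns the numeric form, or null for an unknown entity.
function named_to_numeric_entity($entity)
{
    $table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    $char  = array_search($entity, $table, true);
    if ($char === false) {
        return null;
    }
    return '&#' . utf8_code_point($char) . ';';
}

echo named_to_numeric_entity('&middot;'), "\n";
echo named_to_numeric_entity('&euro;'), "\n";
```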

2007-09-07 05:06:11


Sep 23

Author: Kenneth Kin Lum


To display the mapping on a web page regardless of the server encoding, this can be used:

  echo "<pre>\n";
  echo htmlentities(print_r(get_html_translation_table(HTML_SPECIALCHARS), true));
  echo htmlentities(print_r(get_html_translation_table(HTML_ENTITIES), true));
  echo "</pre>\n";

This is needed because get_html_translation_table() actually returns the special characters in ISO-8859-1 (Latin-1) encoding. To see the tables correctly using

  print_r(get_html_translation_table(HTML_ENTITIES));

the page has to be served as ISO-8859-1: either the server sends that charset in its HTTP header, or you send it yourself with header(), or you manually set the browser's encoding to ISO-8859-1. You also need to view the source of the page to see the mapping (the English version of IE 7 outputs the page source as ISO-8859-1 anyway).

2008-09-23 08:54:39


Sep 15

Author: kevin at cwsmailbox dot xom


Be careful using get_html_translation_table() in a loop, as it's very slow.
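
One way to follow this advice is to fetch the table once and reuse it. The sketch below uses a hypothetical helper, cached_translation_table(), with a static cache; the sample data is invented, and no timing claims are made beyond what the comment above reports.

```php
<?php
// Hypothetical helper: fetch the table once per table/flags/encoding combination and reuse it.
function cached_translation_table($table, $flags, $encoding)
{
    static $cache = array();
    $key = $table . '|' . $flags . '|' . $encoding;
    if (!isset($cache[$key])) {
        $cache[$key] = get_html_translation_table($table, $flags, $encoding);
    }
    return $cache[$key];
}

// Hypothetical sample data.
$lines   = array('a < b', 'b > c', 'c & d');
$escaped = array();
foreach ($lines as $line) {
    // Only the first iteration calls get_html_translation_table().
    $table     = cached_translation_table(HTML_SPECIALCHARS, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    $escaped[] = strtr($line, $table);
}

print_r($escaped);
```

Assigning the table to a variable before the loop achieves the same thing when the flags and encoding never change.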

2010-09-15 12:55:30


Dec 12

Author: michael dot genesis at gmail dot com


The fact that MS Word and some other sources use CP-1252, and that it is so close to Latin-1 (ISO-8859-1), causes a lot of confusion. What confused me the most was finding that MySQL uses CP-1252 by default.

You may run into trouble if you find yourself tempted to do something like this:

<?php
    $trans[chr(149)] = '&bull;';    // Bullet
    $trans[chr(150)] = '&ndash;';    // En Dash
    $trans[chr(151)] = '&mdash;';    // Em Dash
    $trans[chr(152)] = '&tilde;';    // Small Tilde
    $trans[chr(153)] = '&trade;';    // Trade Mark Sign
?>

Don't do it. DON'T DO IT!

You can use:

<?php
    $translationTable = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES, 'WINDOWS-1252');
?>

or just convert directly:

<?php
    $output = htmlentities($input, ENT_NOQUOTES, 'WINDOWS-1252');
?>

But your web page is probably encoded as UTF-8, and you probably don't really want CP-1252 text flying around, so fix the character encoding first:

<?php
    $output = mb_convert_encoding($input, 'UTF-8', 'WINDOWS-1252');
    $output = htmlentities($output);
?>
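
To see what the built-in CP-1252 table does with the characters from the hand-made list above, a quick check along these lines can help. The sample bytes and variable names are invented for the example, and the charset is given by its documented name cp1252.

```php
<?php
// Hypothetical sample: bullet (0x95) and em dash (0x97) as Windows-1252 bytes.
$input = "\x95 item \x97 done";

$flags = ENT_NOQUOTES | ENT_HTML401;
$table = get_html_translation_table(HTML_ENTITIES, $flags, 'cp1252');

// Apply the table by hand and compare with htmlentities() for the same charset.
$viaTable    = strtr($input, $table);
$viaFunction = htmlentities($input, $flags, 'cp1252');

var_dump($viaTable, $viaFunction);
```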

2011-12-12 17:01:57

