DOMDocument::loadXML
(PHP 5)
DOMDocument::loadXML — Load XML from a string
Description
public mixed DOMDocument::loadXML ( string $source [, int $options = 0 ] )
Loads an XML document from a string.
Return Values
Returns TRUE on success or FALSE on failure. If called statically, returns a DOMDocument or FALSE on failure.
Errors/Exceptions
If an empty string is passed as the source, a warning will be generated. This warning is not generated by libxml and cannot be handled using libxml's error handling functions.
This method may be called statically, but will issue an E_STRICT error.
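A minimal sketch of guarding against the empty-string case and checking the return value before using the document. The helper name loadXmlOrNull() is hypothetical and not part of the DOM extension.

```php
<?php
// Hypothetical helper: returns a DOMDocument, or null if nothing could be loaded.
function loadXmlOrNull($xml)
{
    // An empty source triggers a warning that libxml's error functions
    // cannot intercept, so reject it before calling loadXML().
    if (!is_string($xml) || $xml === '') {
        return null;
    }

    $doc = new DOMDocument();

    // loadXML() returns false when the string cannot be parsed.
    return $doc->loadXML($xml) ? $doc : null;
}

$doc = loadXmlOrNull('<root><node/></root>');
echo $doc !== null ? $doc->saveXML() : "could not load XML\n";
```

Returning null rather than a half-initialized document is a design choice of this sketch; callers that prefer exceptions could throw instead.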
Examples
Example #1 Creating a Document
<?php
$doc = new DOMDocument();
$doc->loadXML('<root><node/></root>');
echo $doc->saveXML();
?>
Example #2 Static invocation of loadXML
<?php
// Issues an E_STRICT error
$doc = DOMDocument::loadXML('<root><node/></root>');
echo $doc->saveXML();
?>
See Also
- DOMDocument::load() - Load XML from a file
- DOMDocument::save() - Dumps the internal XML tree back into a file
- DOMDocument::saveXML() - Dumps the internal XML tree back into a string
Comments
Note that loadXML strips leading and trailing whitespace and line breaks.
When using loadXML and appendChild to add a chunk of XML to an existing document, you may want to force a line break between the end of the XML chunk and the next line (usually a closing tag) in the output file:
$childDocument = new DOMDocument;
$childDocument->preserveWhiteSpace = true;
$childDocument->loadXML(..XML-Chunk..);
$mNewNode = $mainDocument->importNode($childDocument->documentElement, true);
$ParentNode->appendChild($mNewNode);
$ParentNode->appendChild($mainDocument->createTextNode("\n "));
Although it is said that DOM should not be used to make 'pretty' XML output, I struggled to get output that was readable for testing. Another solution is to use createDocumentFragment()->appendXML(..XML-Chunk..) instead, which does not seem to trim line breaks the way DOMDocument->loadXML() does.
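The fragment-based alternative mentioned above might look like the following sketch. The document content and element names are made up for illustration.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<parent/>');

// The fragment is owned by $doc, so no importNode() step is needed.
$fragment = $doc->createDocumentFragment();

// appendXML() parses the chunk into the fragment and returns false
// if the chunk is not well-formed.
if ($fragment->appendXML("<child>text</child>\n")) {
    // Appending a fragment moves its children into the target node.
    $doc->documentElement->appendChild($fragment);
}

echo $doc->saveXML();
```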
While loadXML() expects its input to have a leading XML declaration from which to deduce the encoding, there is no such concept in (non-XML) HTML documents. Thus, when parsing HTML, the libxml library underlying the DOM functions peeks at the <META> tags to figure out the encoding used.
See http://xmlsoft.org/encoding.html.
loadXml() reports an error instead of throwing an exception when the XML is not well-formed. This is annoying if you are trying to call loadXml() in a try...catch statement. Apparently it's a feature, not a bug, because it conforms to a specification.
If you want to catch an exception instead of generating a report, you could do something like
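<?php
function HandleXmlError($errno, $errstr, $errfile, $errline)
{
    // Convert only loadXML() warnings into exceptions.
    if ($errno == E_WARNING && (substr_count($errstr, "DOMDocument::loadXML()") > 0)) {
        throw new DOMException($errstr);
    }
    // Anything else falls through to the default error handler.
    return false;
}

function XmlLoader($strXml)
{
    set_error_handler('HandleXmlError');
    $dom = new DOMDocument();
    try {
        $dom->loadXml($strXml);
    } catch (DOMException $e) {
        // Restore the previous handler before propagating the exception.
        restore_error_handler();
        throw $e;
    }
    restore_error_handler();
    return $dom;
}
?>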
Returning false in function HandleXmlError() causes a fallback to the default error handler.
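An alternative sketch that avoids a custom error handler by using libxml's own error buffer (libxml_use_internal_errors(), libxml_get_errors(), libxml_clear_errors()). The helper name loadXmlStrict() and the choice of RuntimeException are assumptions made for illustration. Note that this does not cover the empty-string warning, which is not produced by libxml.

```php
<?php
// Hypothetical helper: returns a DOMDocument or throws if $xml is not well-formed.
function loadXmlStrict($xml)
{
    // Buffer libxml errors instead of emitting PHP warnings,
    // remembering the previous setting so it can be restored.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);

    // Copy the buffered errors, then clean up global libxml state.
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        $message = $errors ? trim($errors[0]->message) : 'unknown XML error';
        throw new RuntimeException('Invalid XML: ' . $message);
    }

    return $doc;
}

try {
    loadXmlStrict('<root><unclosed></root>');
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```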
earth at anonymous dot com,
preserveWhiteSpace property needs to be set to false for formatOutput to work properly, for some reason.
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadXML($xmlStr);
...
$element->appendChild(...);
...
$dom->formatOutput = true;
$xmlStr = $dom->saveXML();
echo $xmlStr;
This would format the output nicely.
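A self-contained version of the same idea, with made-up input. The likely reason preserveWhiteSpace matters is that whitespace-only text nodes kept from the input make the element look like mixed content, which libxml does not re-indent; treat that explanation as an assumption rather than documented behavior.

```php
<?php
$xmlStr = "<root>\n<item>a</item>\n</root>";

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // must be set before loadXML()
$dom->loadXML($xmlStr);

// Add a second element after loading.
$dom->documentElement->appendChild($dom->createElement('item', 'b'));

$dom->formatOutput = true; // must be set before saveXML()
echo $dom->saveXML();
```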
When using loadXML() to parse a string that contains entity references (e.g., &nbsp;), be sure that those entity references are properly declared through the use of a DOCTYPE declaration; otherwise, loadXML() will not be able to interpret the string.
Example:
<?php
$str = <<<XML
<?xml version="1.0" encoding="iso-8859-1"?>
<div>This is a non-breaking&nbsp;space.</div>
XML;
$dd1 = new DOMDocument();
$dd1->loadXML($str);
echo $dd1->saveXML();
?>
Given the above code, PHP will issue a Warning about the entity 'nbsp' not being properly declared. Also, the call to saveXML() will return nothing but a trimmed-down version of the original processing instruction...everything else is gone, and all because of the undeclared entity.
Instead, explicitly declare the entity first:
<?php
$str = <<<XML
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE root [
<!ENTITY nbsp "&#160;">
]>
<div>This is a non-breaking&nbsp;space.</div>
XML;
$dd2 = new DOMDocument();
$dd2->loadXML($str);
echo $dd2->saveXML();
?>
Since the 'nbsp' entity is defined in the DOCTYPE, PHP no longer issues that Warning; the string is now well-formed, and loadXML() understands it perfectly.
You can also use references to external DTDs in the same way (e.g., <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">), which is particularly important if you need to do this for many different documents with many different possible entities.
Also, as a sidenote...entity references created by createEntityReference() do not need this kind of explicit declaration.
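A short sketch of the createEntityReference() case from the side note: the reference is built through the DOM API rather than parsed, so no DOCTYPE declaration is involved. The element name and text are made up for illustration.

```php
<?php
$doc = new DOMDocument();
$div = $doc->createElement('div');
$doc->appendChild($div);

// Build "non-breaking&nbsp;space" from two text nodes and one entity reference node.
$div->appendChild($doc->createTextNode('non-breaking'));
$div->appendChild($doc->createEntityReference('nbsp'));
$div->appendChild($doc->createTextNode('space'));

echo $doc->saveXML();
```

Keep in mind that the serialized string still contains an undeclared entity, so feeding it back into loadXML() would run into the problem described above.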
The documentation states that loadXML can be called statically, but this is misleading. This feature seems to be a special case hack and its use seems to be discouraged according to http://bugs.php.net/bug.php?id=41398.
Calling the method statically will fail with an error if the code runs with E_STRICT error reporting enabled.
The documentation should be changed to make it clear that a static call is against recommended practice and won't work with E_STRICT.
Possible values for the options parameter can be found here:
http://us3.php.net/manual/en/ref.libxml.php#libxml.constants
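As an illustration, options are combined with the bitwise OR operator. The sketch below uses two of the libxml constants, LIBXML_NOBLANKS and LIBXML_NONET; the input string is made up.

```php
<?php
$xml = "<root>\n  <node/>\n</root>";

$doc = new DOMDocument();

// LIBXML_NOBLANKS drops whitespace-only text nodes;
// LIBXML_NONET forbids network access while parsing.
$doc->loadXML($xml, LIBXML_NOBLANKS | LIBXML_NONET);

// Count the children of <root> to see the effect of LIBXML_NOBLANKS.
echo $doc->documentElement->childNodes->length, "\n";
```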
For some reason, when you set DOMDocument's property 'recover' to true, using '@' to mask errors thrown by loadXml() won't work.
Here's my workaround:
function maskErrors() {}
set_error_handler('maskErrors');
$dom->loadXml($xml);
restore_error_handler();
You could also simply do this: error_reporting(0); and then set back error_reporting to its original state.
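The error_reporting() variant could be sketched as follows, saving the previous level so it can be restored afterwards. The malformed input is made up for illustration.

```php
<?php
$dom = new DOMDocument();
$dom->recover = true; // ask libxml to build a tree from malformed input

// error_reporting() returns the old level, which is kept for restoring.
$previousLevel = error_reporting(0);
$dom->loadXML('<root><node></root>'); // mismatched tag
error_reporting($previousLevel);

echo $dom->saveXML();
```

Unlike the empty handler above, this silences all errors raised between the two error_reporting() calls, so the window should be kept as small as possible.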
Instead of doing this:
<?php
$str = <<<XML
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE root [
<!ENTITY nbsp "&#160;">
]>
<div>This is a non-breaking&nbsp;space.</div>
XML;
$dd2 = new DOMDocument();
$dd2->loadXML($str);
echo $dd2->saveXML();
?>
simply use:
loadHTML() rather than loadXML().
Keep in mind that with the default options this function does not handle very large documents well: if a text node is longer than about 10 MB, loading can fail with the message:
DOMDocument::loadXML(): internal error Extra content at the end of the document in Entity
even though the XML is fine.
The cause is a definition in libxml's parserInternals.h:
#define XML_MAX_TEXT_LENGTH 10000000
To allow the function to process larger files, pass LIBXML_PARSEHUGE as an option:
$domDocument->loadXML($xml, LIBXML_PARSEHUGE);
A call to loadXML() will overwrite the XML declaration previously created in the constructor of DOMDocument. This can cause encoding problems if there is no XML declaration in the loaded XML and you don't have control over the source (e.g. if the XML is coming from a webservice). To fix this, set encoding AFTER loading the XML using the 'encoding' class property of DOMDocument. Example:
Bad situation:
test.xml:
<test>
<hello>hi</hello>
<field>ø</field>
</test>
test.php:
$xmlDoc = new DOMDocument("1.0", "utf-8"); // Parameters here are overwritten anyway when using loadXML(), and are not really relevant
$testXML = file_get_contents("test.xml");
$xmlDoc->loadXML($testXML);
// Print the contents to a file or in a log function to get the output, using $xmlDoc->saveXML()
Output:
<?xml version="1.0"?>
<test>
<hello>hi</hello>
<field>&#xF8;</field>
</test>
Good situation:
test.xml:
<test>
<hello>hi</hello>
<field>ø</field>
</test>
test.php:
$xmlDoc = new DOMDocument("1.0", "utf-8"); // Parameters here are overwritten anyway when using loadXML(), and are not really relevant
$testXML = file_get_contents("test.xml");
$xmlDoc->loadXML($testXML);
$xmlDoc->encoding = "utf-8";
// Print the contents to a file or in a log function to get the output, using $xmlDoc->saveXML()
Output:
<?xml version="1.0" encoding="utf-8"?>
<test>
<hello>hi</hello>
<field>ø</field>
</test>