To correctly extract a value from a CDATA section, cast the SimpleXMLElement to a string using the cast operator:
<?php
$xml = '<?xml version="1.0" encoding="UTF-8" ?>
<rss>
<channel>
<item>
<title><![CDATA[Tom & Jerry]]></title>
</item>
</channel>
</rss>';
$xml = simplexml_load_string($xml);
// echo does the casting for you
echo $xml->channel->item->title;
// but var_dump() (or print_r()) does not!
var_dump($xml->channel->item->title);
// so casting the SimpleXMLElement to a string solves this issue
var_dump((string) $xml->channel->item->title);
?>
The above will output:
Tom & Jerry
object(SimpleXMLElement)#4 (0) {}
string(11) "Tom & Jerry"
simplexml_load_file
(PHP 5)
simplexml_load_file — Interprets an XML file into an object
Description
The function converts the well-formed XML document in the given file into an object.
Parameters
- filename
-
Path to the XML file.
Note: Libxml 2 unescapes the URI, so if you want to pass, for example, b&c as the value of the URI parameter a, you have to call the function like this: simplexml_load_file(rawurlencode('http://example.com/?a=' . urlencode('b&c'))). Since PHP 5.1.0, PHP does this step for you.
- class_name
-
You may use the optional class_name parameter if simplexml_load_file() should return an object of the specified class. That class should extend the SimpleXMLElement class.
- options
-
Since PHP 5.1.0 and Libxml 2.6.0, you may also use the options parameter to specify additional Libxml parameters.
- ns
-
- is_prefix
-
Return Values
Returns an object of class SimpleXMLElement whose properties contain the data of the XML document. On failure, FALSE is returned.
Examples
Example #1 Interpret an XML document
<?php
// The file test.xml contains an XML document with a root element
// and at least one element /[root]/title.
if (file_exists('test.xml')) {
$xml = simplexml_load_file('test.xml');
print_r($xml);
} else {
exit('Failed to open test.xml.');
}
?>
On success, the script will output:
SimpleXMLElement Object ( [title] => Example Title ... )
From this point on, you can use $xml->title and any other elements.
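As a further sketch of the class_name parameter and the FALSE return value described above, the following example loads a small document into a subclass of SimpleXMLElement and checks the result with a strict comparison. The class TitledElement, its upperTitle() method, and the temporary file are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical subclass, used only for illustration.
class TitledElement extends SimpleXMLElement
{
    public function upperTitle()
    {
        return strtoupper((string) $this->title);
    }
}

// Write a small document to a temporary file so the example is self-contained.
$path = tempnam(sys_get_temp_dir(), 'sxe');
file_put_contents($path, '<?xml version="1.0"?><doc><title>Example Title</title></doc>');

// Ask for an object of our own class instead of plain SimpleXMLElement.
$xml = simplexml_load_file($path, 'TitledElement');
unlink($path);

// The function returns FALSE on failure, so compare strictly.
if ($xml === false) {
    exit('Failed to parse the XML file.');
}

$upper = $xml->upperTitle();
echo $upper, "\n"; // prints "EXAMPLE TITLE"
?>
```

The strict comparison matters because an empty but valid element also evaluates to false in a loose boolean check.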
simplexml_load_file
20-Feb-2007 02:08
10-Dec-2006 02:35
Regarding the note by Anonymous on 7th April 2006:
There is a way to get back HTML tags. For example:
<?xml version="1.0"?>
<intro>
Welcome to <b>Example.com</b>!
</intro>
<?php
// I use @ so that the content of my XML is not spat out in an error message
// if the load fails. The content could contain passwords, so this is just to be safe.
$xml = @simplexml_load_file('content_intro.xml');
if ($xml) {
// asXML() keeps the HTML tags, but it also keeps the parent tag <intro>,
// so I strip it out with str_replace(). You could also use preg_replace()
// if you have lots of tags.
$intro = str_replace(array('<intro>', '</intro>'), '', $xml->asXML());
} else {
$error = "Could not load intro XML file.";
}
?>
With this method, someone can change the intro in content_intro.xml, and because the file has to be well-formed XML to load at all, badly formed HTML cannot ruin the whole site design.
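One caveat with the str_replace() approach: when asXML() is called on the root element, the returned string also starts with the XML declaration, which the replacement leaves in place. A possible alternative, sketched here under the assumption that the DOM extension is available, is to serialize only the child nodes of the element. The helper name innerXml() is hypothetical, and the XML is loaded from a string instead of content_intro.xml to keep the example self-contained.

```php
<?php
// Hypothetical helper: returns the markup inside an element,
// without the element's own tags and without the XML declaration.
function innerXml(SimpleXMLElement $element)
{
    $node = dom_import_simplexml($element);
    $out = '';
    foreach ($node->childNodes as $child) {
        // saveXML() with a node argument serializes just that node.
        $out .= $node->ownerDocument->saveXML($child);
    }
    return $out;
}

$xml = simplexml_load_string('<?xml version="1.0"?><intro>Welcome to <b>Example.com</b>!</intro>');
$intro = innerXml($xml);
echo $intro, "\n"; // prints "Welcome to <b>Example.com</b>!"
?>
```

This avoids having to know the name of the wrapping tag in advance.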
06-Apr-2006 09:21
What I have found when using the script is that simplexml_load_file() removes any HTML formatting inside the XML file, and that it only loads a limited number of levels deep. If your XML file is too deep, it returns boolean false.
09-Mar-2006 05:21
Be careful if you feed SimpleXML data directly into your MySQL database using MySQLi and bound parameters.
The data coming from SimpleXML are objects, and the parameter-binding functions of MySQLi do NOT like that! (It causes a memory leak and can crash Apache/PHP.)
To do this properly, you MUST cast your values to the right type (string, integer...) before passing them to the binding methods of MySQLi.
I did not find this in the documentation, and it caused me a lot of headaches.
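A minimal sketch of the casting step described above. The XML document, the table and column names, and the $stmt variable are assumptions made for illustration. The MySQLi calls are shown only as comments so that the example runs without a database.

```php
<?php
$xml = simplexml_load_string('<user><name>Alice</name><age>30</age></user>');

// Cast each SimpleXMLElement to a plain PHP value before binding.
$name = (string) $xml->name;
$age  = (int) $xml->age;

// Assuming $mysqli is an open connection and a hypothetical table "users":
// $stmt = $mysqli->prepare('INSERT INTO users (name, age) VALUES (?, ?)');
// $stmt->bind_param('si', $name, $age);
// $stmt->execute();

var_dump($name, $age);
?>
```

After the casts, $name and $age are an ordinary string and integer, with no reference back to the XML tree.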
06-Feb-2006 08:26
Sorry, there was a mistake in my previous function (03-Feb-2006 03:37): the recursive call must go to getXMLnode() itself.
<?php
// Returns the value of the first node named $param found below $object,
// as a string, or false if there is no such node.
function getXMLnode($object, $param) {
    foreach ($object as $key => $value) {
        // Is the node a direct child of this element?
        if (isset($value->$param)) {
            return (string) $value->$param;
        }
        // Otherwise search this element's children (recursive call)
        $ret = getXMLnode($value, $param);
        if ($ret !== false) {
            return $ret;
        }
    }
    return false;
}
?>
03-Feb-2006 09:11
So it seems SimpleXML doesn't support CDATA... I bashed together this little regex function to sort out the CDATA before trying to parse XML with the likes of simplexml_load_file / simplexml_load_string. I hope it helps somebody, and I would be very interested to hear of better solutions. (Other than *not* using SimpleXML of course! ;)
It looks for any <![CDATA[Text and HTML etc in here]]> sections, runs htmlspecialchars() on the enclosed data and then strips out the "<![CDATA[" and "]]>" markers.
<?php
function simplexml_unCDATAise($xml) {
    // Match every <![CDATA[...]]> section; U = ungreedy, s = let . match newlines
    preg_match_all('/<!\[CDATA\[(.*)\]\]>/Us', $xml, $args);
    $new_xml = $xml;
    for ($i = 0; $i < count($args[0]); $i++) {
        // $args[0][$i] is the whole section, $args[1][$i] only its content
        $old_text = $args[0][$i];
        $new_text = htmlspecialchars($args[1][$i]);
        $new_xml = str_replace($old_text, $new_text, $new_xml);
    }
    return $new_xml;
}
//Usage:
$xml = 'Your XML with CDATA...';
$xml = simplexml_unCDATAise($xml);
$xml_object = simplexml_load_string($xml);
?>
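As one possible answer to the request for better solutions: libxml provides the LIBXML_NOCDATA option, which merges CDATA sections into ordinary text nodes while parsing. It can be passed through the options parameter, so no preprocessing of the XML string should be needed. The sample document below is an assumption chosen for illustration.

```php
<?php
$source = '<?xml version="1.0" encoding="UTF-8"?>
<rss><channel><item><title><![CDATA[Tom & Jerry]]></title></item></channel></rss>';

// LIBXML_NOCDATA tells libxml to merge CDATA sections as text nodes.
$xml = simplexml_load_string($source, 'SimpleXMLElement', LIBXML_NOCDATA);

$title = (string) $xml->channel->item->title;
echo $title, "\n"; // prints "Tom & Jerry"
?>
```

The same third argument is accepted by simplexml_load_file().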
03-Feb-2006 03:37
Suppose you have loaded an XML file into $simpleXML_obj.
Its structure looks like this:
SimpleXMLElement Object
(
[node1] => SimpleXMLElement Object
(
[subnode1] => value1
[subnode2] => value2
[subnode3] => value3
)
[node2] => SimpleXMLElement Object
(
[subnode4] => value4
[subnode5] => value5
[subnode6] => value6
)
)
To search for a specific node in the object, you can use this function:
<?php
// Returns the value of the first node named $param found below $object,
// as a string, or false if there is no such node.
function getXMLnode($object, $param) {
    foreach ($object as $key => $value) {
        // Is the node a direct child of this element?
        if (isset($value->$param)) {
            return (string) $value->$param;
        }
        // Otherwise search this element's children (recursive call)
        $ret = getXMLnode($value, $param);
        if ($ret !== false) {
            return $ret;
        }
    }
    return false;
}
?>
So to get the value of subnode4, call the function like this:
<?php
$result = getXMLnode($simpleXML_obj, 'subnode4');
echo $result;
?>
This displays "value4".
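The same lookup can also be sketched with SimpleXMLElement::xpath(), which searches the whole tree without a hand-written recursion. The root element name and the shortened sample document below are assumptions for illustration.

```php
<?php
$simpleXML_obj = simplexml_load_string(
    '<root>
        <node1><subnode1>value1</subnode1><subnode2>value2</subnode2></node1>
        <node2><subnode4>value4</subnode4><subnode5>value5</subnode5></node2>
    </root>'
);

// "//subnode4" selects every subnode4 element at any depth.
$matches = $simpleXML_obj->xpath('//subnode4');

// Use the first match, or false if nothing was found.
$result = $matches ? (string) $matches[0] : false;
echo $result, "\n"; // prints "value4"
?>
```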
12-Jan-2006 06:46
simplexml_load_file creates an XML tree whose values are UTF-8 strings. To convert them to the ISO-8859-1 (Latin-1) encoding, use utf8_decode().
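A small sketch of that conversion. Because utf8_decode() is deprecated as of PHP 8.2, this example uses mb_convert_encoding() instead and therefore assumes the mbstring extension is loaded. The sample document is made up for illustration.

```php
<?php
// "caf\xC3\xA9" is the UTF-8 byte sequence for "café".
$xml = simplexml_load_string("<?xml version=\"1.0\" encoding=\"UTF-8\"?><word>caf\xC3\xA9</word>");

// Values read from SimpleXML are UTF-8 strings.
$utf8 = (string) $xml->word;

// Convert from UTF-8 to ISO-8859-1 (Latin-1).
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

echo strlen($utf8), ' ', strlen($latin1), "\n"; // prints "5 4"
?>
```

The byte lengths differ because "é" takes two bytes in UTF-8 and one byte in Latin-1.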
30-Sep-2005 08:52
Micro$oft Word uses non-standard characters, and they cause problems when using simplexml_load_file.
Many systems include the non-standard Word characters in their implementation of ISO-8859-1, so an XML document containing those characters can appear well-formed to many browsers. But if you try to load this kind of document with simplexml_load_file, you will run into trouble.
I believe this is exactly the same issue discussed under htmlentities. The following notes on htmlentities are relevant here too (given in reverse order, to preserve the history):
http://it.php.net/manual/en/function.htmlentities.php#26379
http://it.php.net/manual/en/function.htmlentities.php#41152
http://it.php.net/manual/en/function.htmlentities.php#42126
http://it.php.net/manual/en/function.htmlentities.php#42511
12-Sep-2005 11:06
If a property of an object is empty, the corresponding array entry is not created. Here is a version of object2array() that converts such properties properly.
<?php
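function object2array($object)
{
    $return = NULL;
    if (is_array($object))
    {
        foreach ($object as $key => $value)
            $return[$key] = object2array($value);
    }
    elseif (is_object($object))
    {
        $var = get_object_vars($object);
        // An object without properties is returned unchanged
        if (!$var)
            return $object;
        // Empty properties become NULL entries instead of being dropped
        foreach ($var as $key => $value)
            $return[$key] = ($key && !$value) ? NULL : object2array($value);
    }
    else
    {
        // Scalars (e.g. text values) are returned as they are;
        // get_object_vars() must only be called on objects
        return $object;
    }
    return $return;
}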
?>
