PHP CDATA handling (a detailed walkthrough)
author: 一佰互联 2019-04-29
At the time I found a CDATA converter online and modified it so that it filters out the CDATA tags. The code is as follows:
// States:
//
// "out"
// "<"
// "<!"
// "<!["
// "<![C"
// "<![CD"
// "<![CDA"
// "<![CDAT"
// "<![CDATA"
// "in"
// "]"
// "]]"
//
// (Yes, the states are represented by strings.)
//
// The original post shows only the function body; the name strip_cdata
// is an illustrative placeholder.
function strip_cdata($xml) {
    $state = "out";
    $a = str_split($xml);
    $new_xml = "";
    $cdata = "";
    foreach ($a as $v) {
        // Deal with "state".
        switch ( $state ) {
            case "out":
                if ( "<" == $v ) {
                    $state = $v;
                } else {
                    $new_xml .= $v;
                }
                break;
            // The next eight states match the prefix "<![CDATA" one character
            // at a time. On a mismatch, the buffered prefix is flushed as-is.
            case "<":
                if ( "!" == $v ) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = "out";
                }
                break;
            case "<!":
                if ( "[" == $v ) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = "out";
                }
                break;
            case "<![":
                if ( "C" == $v ) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = "out";
                }
                break;
            case "<![C":
                if ( "D" == $v ) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = "out";
                }
                break;
            case "<![CD":
                if ( "A" == $v ) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = "out";
                }
                break;
            case "<![CDA":
                if ( "T" == $v ) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = "out";
                }
                break;
            case "<![CDAT":
                if ( "A" == $v ) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = "out";
                }
                break;
            case "<![CDATA":
                if ( "[" == $v ) {
                    // Opening "<![CDATA[" is complete: start collecting content.
                    $cdata = "";
                    $state = "in";
                } else {
                    $new_xml .= $state . $v;
                    $state = "out";
                }
                break;
            case "in":
                if ( "]" == $v ) {
                    $state = $v;
                } else {
                    $cdata .= $v;
                }
                break;
            case "]":
                if ( "]" == $v ) {
                    $state = $state . $v;
                } else {
                    $cdata .= $state . $v;
                    $state = "in";
                }
                break;
            case "]]":
                if ( ">" == $v ) {
                    // End of the CDATA section: emit its content as escaped text.
                    // htmlspecialchars() escapes only &, <, > and quotes, whereas
                    // htmlentities() also emits HTML-only named entities that an
                    // XML parser rejects.
                    $new_xml .= htmlspecialchars($cdata, ENT_QUOTES | ENT_XML1, 'UTF-8');
                    # $new_xml .= $cdata;
                    // Hand-rolled alternative (left disabled):
                    // $new_xml .= str_replace(">", "&gt;",
                    //     str_replace("<", "&lt;",
                    //     str_replace("\"", "&quot;",
                    //     str_replace("&", "&amp;",
                    //     $cdata))));
                    $state = "out";
                } else {
                    // Known limitation: a third "]" also lands here, so "]]]>"
                    // never closes the section.
                    $cdata .= $state . $v;
                    $state = "in";
                }
                break;
        } // switch
    }
    //
    // Return.
    //
    return $new_xml;
}
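Recently, alerts kept firing to report that simplexml had failed to parse the data. The cause turned out to be XML containing <![CDATA[domain[test]]]>: the parsing function above cannot handle the three consecutive ] characters. The state machine treats the third ] as ordinary content and falls back to the "in" state, so the closing ]]> is never recognized. Patching this case by case is fragile, because there is no telling whether four or five ] will show up next time.

So I decided to replace the hand-written parser with PHP's DOM extension. DOM itself is fairly simple to work with and consists of a few main classes: DOMDocument, DOMElement, DOMNodeList, and DOMNode. A DOMNode exposes the properties nodeValue, nodeType, and nodeName.

First call loadXML to turn the string into a DOMDocument object, then call getElementsByTagName to get a DOMNodeList, then use ->item(0) to get a DOMNode. After that, the three properties above are available. For a tag with attributes, such as <aa color="red">test</aa>, read them through the attribute accessors, for example getAttribute on the DOMElement.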
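The DOM steps described above can be sketched roughly as follows. This is a minimal illustration, not the author's production code: the sample document, the root element, and the variable names are assumptions made up for the example. It uses the problematic CDATA content from above to show that the DOM parser copes with the extra ].

```php
<?php
// Sample input (an assumption for illustration), including the "]]]>" case.
$xml = '<root><aa color="red"><![CDATA[domain[test]]]></aa></root>';

$doc = new DOMDocument();
// loadXML() parses a string into the DOMDocument and returns false on failure.
if (!$doc->loadXML($xml)) {
    throw new RuntimeException('XML could not be parsed');
}

// getElementsByTagName() returns a DOMNodeList; item(0) picks the first match.
$node = $doc->getElementsByTagName('aa')->item(0);

$name  = $node->nodeName;              // element name
$type  = $node->nodeType;              // XML_ELEMENT_NODE for an element
$value = $node->nodeValue;             // text content, CDATA markers removed
$color = $node->getAttribute('color'); // attribute value

echo "$name / $type / $value / $color\n"; // prints "aa / 1 / domain[test] / red"
```

If the only goal is to have simplexml see CDATA sections as plain text, passing the LIBXML_NOCDATA option to simplexml_load_string may be a lighter alternative worth trying.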