javascript - Regular expression for extracting XML tag

I have some XML from which I want to extract element names using a JavaScript regular expression. An example of the XML is shown below:

<rules><and><gt propName="Unit" value="5" type="System.Int32"/><or><startsWith propName="DeviceType"/></or></and></rules>

I’m having problems extracting just the XML element names “gt” and “startsWith”. For example, with the following expression:

<(.+?)\s

I get:

“<rules><and><gt”

rather than just “gt”.

Can anyone supply the correct expression?

Asked Sep 20, 2010 at 11:35 by Retrocoder; edited Jul 9, 2011 at 4:33 by Brad Mace

You shouldn't use a regex but <([^> ]+) will probably do :) – jensgram Commented Sep 20, 2010 at 11:44

4 Answers

Regex is a poor tool for parsing XML. You can easily parse the XML in JavaScript, and a library like jQuery makes this especially easy. For example:

var xml = '<rules><and><gt propName="Unit" value="5" type="System.Int32"/><or><startsWith propName="DeviceType"/></or></and></rules>';
var gt = $($.parseXML(xml)).find('gt'); // parse as XML so tag names keep their case
var t = gt.attr('type'); // "System.Int32"

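Well, \s matches whitespace, and . matches any character, including < and >. So you actually tell the regex engine to:

<(.+?)\s
^^    ^
||    \ until you find a whitespace
|\ slurp in anything (angle brackets included)
\ as long as it starts with an opening pointy bracket

Since the search starts at the first <, the match runs from <rules all the way to the first whitespace.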

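You could, for example, use:

<([^\s>]+)

The quantifier has to be greedy here: a lazy +? at the very end of the pattern would stop after a single character. Note that this also matches closing tags such as </or (capturing "/or"). In any case, you should always consider whether a regex is the right tool for this.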
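To make the difference concrete, here is a minimal sketch that runs the original pattern and a character-class variant against the sample XML from the question. The helper name extractTagNames is made up for illustration. The trailing \s and the exclusion of / are assumptions about what the asker wants: they limit the matches to opening tags that carry attributes, which in the sample are exactly "gt" and "startsWith".

```javascript
// Sample XML from the question.
var xml = '<rules><and><gt propName="Unit" value="5" type="System.Int32"/>' +
    '<or><startsWith propName="DeviceType"/></or></and></rules>';

// Original pattern: the match starts at the first "<" and lazily grows
// until the first whitespace, so it spans several tags.
var original = /<(.+?)\s/.exec(xml);
// original[1] is 'rules><and><gt'

// Hypothetical helper: collect the names of opening tags that are
// followed by whitespace, i.e. tags with at least one attribute.
function extractTagNames(str) {
    var re = /<([^\s<>\/]+)\s/g; // name may not contain whitespace, <, > or /
    var names = [];
    var m;
    while ((m = re.exec(str)) !== null) {
        names.push(m[1]);
    }
    return names;
}

var names = extractTagNames(xml); // ["gt", "startsWith"]
```

This still treats the XML as plain text, so comments, CDATA sections, or attribute values containing < would confuse it.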

Don't use a regex for this kind of thing. Instead, use DOM processing functions such as:

var gtElements = document.getElementsByTagName('gt');
var startsWithElements = document.getElementsByTagName('startsWith');

The most robust method would be to use the browser's built-in XML parser and standard DOM methods for extracting the elements you want:

var parseXml;

if (window.DOMParser) {
    parseXml = function(xmlStr) {
        return ( new window.DOMParser() ).parseFromString(xmlStr, "text/xml");
    };
} else if (typeof window.ActiveXObject != "undefined" &&
        new window.ActiveXObject("Microsoft.XMLDOM")) {
    parseXml = function(xmlStr) {
        var xmlDoc = new window.ActiveXObject("Microsoft.XMLDOM");
        xmlDoc.async = false; // boolean rather than the string "false"
        xmlDoc.loadXML(xmlStr);
        return xmlDoc;
    };
} else {
    parseXml = function() { return null; }
}

var xmlStr = '<rules><and>' +
    '<gt propName="Unit" value="5" type="System.Int32"/><or>' + 
    '<startsWith propName="DeviceType"/></or></and></rules>';

var xmlDoc = parseXml(xmlStr);
if (xmlDoc) {
    var gt = xmlDoc.getElementsByTagName("gt")[0];
    alert( gt.getAttribute("propName") );
}
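If the goal is to list element names rather than fetch one known element, the parsed document can be walked with standard DOM properties. The sketch below assumes that goal. The names collectElementNames and onlyWithAttributes are hypothetical, and the code relies only on nodeType, nodeName, childNodes and attributes, which XML DOM nodes expose.

```javascript
// Hypothetical helper: recursively collect element names from a DOM node.
// When onlyWithAttributes is true, elements without attributes are skipped.
function collectElementNames(node, onlyWithAttributes, out) {
    out = out || [];
    if (node.nodeType === 1) { // 1 is ELEMENT_NODE
        var hasAttrs = !!node.attributes && node.attributes.length > 0;
        if (!onlyWithAttributes || hasAttrs) {
            out.push(node.nodeName);
        }
    }
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        collectElementNames(children[i], onlyWithAttributes, out);
    }
    return out;
}

// Browser usage, guarded because DOMParser may not exist elsewhere.
if (typeof DOMParser !== "undefined") {
    var sampleDoc = new DOMParser().parseFromString(
        '<rules><and><gt propName="Unit"/><or>' +
        '<startsWith propName="DeviceType"/></or></and></rules>',
        "text/xml");
    var found = collectElementNames(sampleDoc, true);
}
```

Passing false as the second argument returns every element name in document order instead of only those with attributes.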
