javascript - Parse HTML String to Array - Stack Overflow

I have an HTML string that contains multiple <p> tags. Within each <p> tag there is a word and its definition.

let data = "<p><strong>Word 1:</strong> Definition of word 1</p><p><strong>Word 2:</strong> Definition of word 2</p>"

My goal is to convert this HTML string into an array of objects like this:

[
 {"word": "Word 1", "definition": "Definition of word 1"},
 {"word": "Word 2", "definition": "Definition of word 2"}
]

I am doing it as follows:

var parser = new DOMParser();
var parsedHtml = parser.parseFromString(data, "text/html");
let pTags = parsedHtml.getElementsByTagName("p");
let vocab = [];
pTags.forEach(function(item){
  // This is where I need help to split and convert item into object
  vocab.push(item.innerHTML);
});

As you can see from the comment in the code above, that is where I'm stuck. Any help is appreciated.

Asked Dec 21, 2018 at 10:27 by asanas; edited Dec 21, 2018 at 11:23 by Rom.
  • Please create and share a fiddle which can describe what you tried to do – Varit J Patel Commented Dec 21, 2018 at 10:32
  • How about this link: stackoverflow.com/questions/13272406/…? – soorapadman Commented Dec 21, 2018 at 10:33
  • @manfromnowhere That's not about parsing HTML, it's JSON. – Barmar Commented Dec 21, 2018 at 10:37
  • Can you change the HTML to put a tag around the definition, like <span class="definition">Definition of word 1</span>? – Barmar Commented Dec 21, 2018 at 10:38

3 Answers

Use textContent to get the text out of an element. The word is in the strong child element, the definition is the rest of the text.

var parser = new DOMParser();
var parsedHtml = parser.parseFromString(data, "text/html");
let pTags = parsedHtml.getElementsByTagName("p");
let vocab = [];
// getElementsByTagName returns an HTMLCollection, which has no forEach; copy it to an array first
Array.from(pTags).forEach(function(item){
  let label = item.getElementsByTagName("strong")[0].textContent; // e.g. "Word 1:"
  let definition = item.textContent.replace(label, "").trim();
  let word = label.trim().replace(/:$/, ""); // drop the trailing colon
  vocab.push({word: word, definition: definition});
});

A bit ad hoc, but it works for this exact two-entry string.

const data = "<p><strong>Word 1:</strong> Definition of word 1</p><p><strong>Word 2:</strong> Definition of word 2</p>";
const parsedData = [
  {
    // Key is "word" (not "word1") to match the requested output; the trailing colon is dropped
    "word": data.split('<strong>')[1].split('</strong>')[0].replace(':', '').trim(),
    "definition": data.split('</strong>')[1].split('</p>')[0].trim()
  },
  {
    "word": data.split('</p>')[1].split('<strong>')[1].split('</strong>')[0].replace(':', '').trim(),
    "definition": data.split('</p>')[1].split('</strong>')[1].split('</p>')[0].trim()
  }
]
console.log(parsedData);
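
The same string-based idea can be extended to any number of entries with a regular expression and `matchAll`. This is only a sketch: it assumes every entry has exactly the shape `<p><strong>Word:</strong> Definition</p>` with no attributes or nested tags, and the names `pattern` and `vocab` are chosen here for illustration. Regex matching on HTML is fragile, so a DOM-based approach is usually safer for real markup.

```javascript
const data = "<p><strong>Word 1:</strong> Definition of word 1</p><p><strong>Word 2:</strong> Definition of word 2</p>";

// Group 1: text inside <strong>, without an optional trailing colon.
// Group 2: the remaining text up to the closing </p>.
// Assumption: the markup has exactly this shape (no attributes, no nested tags).
const pattern = /<p><strong>(.*?):?<\/strong>(.*?)<\/p>/g;

// matchAll yields one match per <p> entry; Array.from maps each match to an object.
const vocab = Array.from(data.matchAll(pattern), ([, word, definition]) => ({
  word: word.trim(),
  definition: definition.trim(),
}));

console.log(vocab);
```

Unlike the hard-coded `split` chains, this does not need to know in advance how many entries the string contains.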

Two things to fix:

  • The constructor is DOMParser, not DomParser
  • pTags is an HTMLCollection, which has no .forEach() method; use a for...of loop instead

My solution for your problem:

let data = "<p><strong>Word 1:</strong> Definition of word 1</p><p><strong>Word 2:</strong> Definition of word 2</p>"

var parser = new DOMParser();
var parsedHtml = parser.parseFromString(data, "text/html");
let pTags = parsedHtml.getElementsByTagName("p");
let vocab = [];
for (let p of pTags) {
  const word = p.getElementsByTagName('strong')[0].innerHTML.replace(':', '').trim();
  const definition = p.innerHTML.replace(/<strong>.*<\/strong>/, '').trim();
  vocab.push( { word, definition } )
}

console.log(vocab);
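
If the conversion is needed in more than one place, the loop can be wrapped in a small helper. The sketch below is one possible shape, not the only one: `toVocab` is a hypothetical name, it uses `textContent` rather than `innerHTML`, and it assumes each paragraph holds the word in its first `<strong>` element. `Array.from` accepts an `HTMLCollection` directly, which sidesteps the missing `.forEach()`.

```javascript
// Hypothetical helper: converts <p><strong>Word:</strong> Definition</p> elements
// into { word, definition } objects.
function toVocab(paragraphs) {
  // Array.from accepts any array-like or iterable, including HTMLCollection and NodeList.
  return Array.from(paragraphs, (p) => {
    const strong = p.querySelector("strong");
    // Fall back to an empty label if a paragraph has no <strong> child.
    const label = strong ? strong.textContent : "";
    return {
      word: label.trim().replace(/:$/, ""), // drop the trailing colon
      definition: p.textContent.replace(label, "").trim(),
    };
  });
}

// Browser-only usage: DOMParser is not defined in plain Node.js.
if (typeof DOMParser !== "undefined") {
  const html = "<p><strong>Word 1:</strong> Definition of word 1</p><p><strong>Word 2:</strong> Definition of word 2</p>";
  const doc = new DOMParser().parseFromString(html, "text/html");
  console.log(toVocab(doc.getElementsByTagName("p")));
}
```

The helper only reads `textContent`, so any markup nested inside the definition is flattened to plain text.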
