javascript - How to fix truncated string json to be able to parse it again - Stack Overflow

I have a JSON string that gets truncated in Google logs (for reasons that are out of my hands to fix).

The JSON string is about 255000 characters long. For simplicity, assume the original JSON looked like this:

 { "a": "1", 
   "b" : [{
      "b1": "<span>text b1</span>"
   },
   {
      "b2": "<span>text b2</span>"
   }],
   "c": "3"
 }

and the truncated string looks like this:

 { "a": "1", 
   "b" : [{
      "b1": "<spa... 34 characters truncated.

Of course, if I try to JSON.parse the string above, I get an error.

The solution I have in mind is to turn it back into valid JSON at the cost of losing some data, so the output I expect is:

 { "a": "1", 
   "b" : [{
      "b1": "<spa"
    }]
 }

What was done above:

  • removing "... 34 characters truncated." from the end of the string
  • closing the open string by adding " (needed in this case, but not always)
  • adding enough ] and } to make it a valid object again
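
The three steps above can be sketched roughly as follows. This is only a minimal illustration: `closeTruncatedJson` is a hypothetical name, and the marker format is assumed to be exactly "... N characters truncated." as in the example. It covers the case where the cut falls inside a string value, not the dangling-key cases.

```javascript
// Sketch only: closeTruncatedJson is a hypothetical helper, not a library function.
function closeTruncatedJson(raw) {
  // Step 1: drop the "... N characters truncated." marker (format assumed from the example).
  let text = raw.replace(/\.\.\. \d+ characters truncated\.\s*$/, "");

  // Walk the text once, remembering which closers are still owed
  // and whether we end up inside a string literal.
  const closers = [];
  let inString = false;
  let escaped = false;
  for (const ch of text) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }

  // Step 2: close an open string (dropping a lone trailing backslash first).
  if (inString) {
    if (escaped) text = text.slice(0, -1);
    text += '"';
  }

  // Step 3: append the missing ] and } in reverse order of opening.
  return text + closers.reverse().join("");
}

const truncated = `{ "a": "1",
   "b" : [{
      "b1": "<spa... 34 characters truncated.`;

console.log(JSON.parse(closeTruncatedJson(truncated)));
```

Tracking the string state matters because the real values contain markup, and any { or [ inside a string must not be counted as an open container.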

There are other cases to cover as well, for example:

 { "a": "1", 
   "b" : [{
      "b1": ... 34 characters truncated.

or

 { "a": "1", 
   "b" : [{
      "b1 ... 34 characters truncated.

What is the best approach to do this?

I tried the dirty-json library, but it doesn't help in this case.

asked Nov 29, 2020 at 18:22 by Reza (edited Nov 29, 2020 at 18:39)
  • One thing I can think of is to parse the JSON.parse() error and fix the string according to the error text. – Ivan Satsiuk, commented Nov 29, 2020 at 20:58
2 Answers

The npm package untruncate-json works as expected for this case:

import untruncateJson from "untruncate-json";

const string = ` { "a": "1", 
   "b" : [{
      "b1": "<spa... 34 characters truncated.`;

// remove non-json ending
const truncatedJson = string.replace(/\.\.\. \d+ characters truncated\.$/,'');

// run library
const untruncatedJson = untruncateJson(truncatedJson);

console.log(untruncatedJson);

output:

{ 
    "a": "1", 
    "b" : [
        { "b1": "<spa" }
    ]
}

I'm afraid you'll have to do it by yourself. But your approach sounds like a good start.
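
A hand-rolled version that also copes with the dangling-key cases from the question might look like the sketch below. It is only one possible approach, not an established algorithm: `repairTruncatedJson` and `scan` are hypothetical names, and the marker format is assumed to match the question. The idea is to close whatever is open, try JSON.parse, and if that fails cut one character off the end and try again.

```javascript
// Sketch only: both function names are hypothetical.

// Returns the closers still owed for `text` and whether it ends inside a string.
function scan(text) {
  const closers = [];
  let inString = false;
  let escaped = false;
  for (const ch of text) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }
  return { closers, inString };
}

function repairTruncatedJson(raw) {
  // Remove the "... N characters truncated." marker (format assumed from the question).
  let text = raw.replace(/\s*\.\.\. \d+ characters truncated\.\s*$/, "");

  while (text.length > 0) {
    const { closers, inString } = scan(text);
    const candidate = text + (inString ? '"' : "") + closers.reverse().join("");
    try {
      return JSON.parse(candidate);
    } catch {
      // Still invalid (dangling key, colon, comma, ...): drop one character and retry.
      text = text.slice(0, -1);
    }
  }
  return undefined; // nothing parseable was left
}

console.log(repairTruncatedJson('{ "a": "1", "b" : [{ "b1": ... 34 characters truncated.'));
console.log(repairTruncatedJson('{ "a": "1", "b" : [{ "b1 ... 34 characters truncated.'));
```

Each retry rescans and reparses the whole text, so this is fine when only a short tail has to be discarded but becomes slow if a very long key is cut in half. Also note that a number or a string cut in the middle still parses, just with a shortened value, so the last entry of the result should be treated as unreliable.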
