I'm using an API to receive data, and several parts cause problems that I need to resolve:
"Apple® iPad® 2 with Wi-Fi - 16GB"
"Rocketfish™ - Premium Vehicle Charger for Apple® iPad™, iPhone® and iPod®"
I need to remove special UTF-8 characters such as ® and ™. How can I achieve this?
- 1 Are you sure that you can't work with those characters? Anyway, what you're looking for isn't a regex but a simple char by char filter. Way faster and way more efficient. – Colin Hebert Commented Apr 13, 2012 at 15:40
- Duplicated: stackoverflow.com/questions/3465874/… – lamelas Commented Apr 13, 2012 at 15:41
2 Answers
If you want to remove all symbols except Basic Latin, just apply a regular expression like
str = str.replace(/[\u0080-\uFFFF]+/g, "");
See this list of Unicode characters to choose which characters you need to accept.
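As a minimal sketch, here is how that expression behaves on the two titles from the question. The helper name `stripNonAscii` is hypothetical and only wraps the expression above for illustration.

```javascript
// Hypothetical helper wrapping the expression from the answer.
// It removes every UTF-16 code unit in the range U+0080 to U+FFFF,
// which leaves only ASCII (U+0000 to U+007F) in the string.
function stripNonAscii(str) {
  return str.replace(/[\u0080-\uFFFF]+/g, "");
}

// Sample titles taken from the question.
const titles = [
  "Apple® iPad® 2 with Wi-Fi - 16GB",
  "Rocketfish™ - Premium Vehicle Charger for Apple® iPad™, iPhone® and iPod®",
];

const cleaned = titles.map(stripNonAscii);
console.log(cleaned[0]); // Apple iPad 2 with Wi-Fi - 16GB
console.log(cleaned[1]); // Rocketfish - Premium Vehicle Charger for Apple iPad, iPhone and iPod
```

Only the symbols themselves are removed, so the spacing around them is left as it was.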
First, please make sure you absolutely cannot work with those "problematic" symbols. A clean, modern program should correctly handle input in any language.
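As for your request to remove anything unreadable, it is better to specify what you want to keep rather than what you want to remove. F. Calderan's example stops at \uFFFF; it still catches characters beyond that position only because JavaScript strings are UTF-16 and store such characters as surrogate pairs, which fall inside the range. So, considering you only want ASCII: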
str = str.replace(/[^\u0000-\u007F]+/g, "");
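One consequence of the ASCII allow-list is that it also drops accented letters, not just trademark signs. The sketch below contrasts it with a narrower pattern that removes only ® (U+00AE) and ™ (U+2122). Both function names and the sample string are assumptions made for illustration, and the narrower pattern is an alternative, not something proposed in the answers.

```javascript
// Allow-list from the answer: keep only U+0000 to U+007F (ASCII).
function keepAscii(str) {
  return str.replace(/[^\u0000-\u007F]+/g, "");
}

// Assumed narrower alternative: remove only the registered sign (U+00AE)
// and the trademark sign (U+2122), leaving all other characters alone.
function dropTrademarkSigns(str) {
  return str.replace(/[\u00AE\u2122]/g, "");
}

// Hypothetical sample: "Café™ for iPad®", written with escapes to avoid encoding issues.
const sample = "Caf\u00E9\u2122 for iPad\u00AE";

console.log(keepAscii(sample));          // Caf for iPad
console.log(dropTrademarkSigns(sample)); // Café for iPad
```

Which one fits depends on whether the API can return product names in languages other than English.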