I need to find and replace (or convert) pilcrow and partial differential characters in a string, as they currently display as �.
What I thought would work but doesn't:
const value = 'Javascript Regex pattern for Pilcrow (¶) or Partial Differential (∂) character';
const matches = value.match(/\u2029/gmi);
console.log(matches);
But this finds nothing (matches is null).
To be honest, I'm not even sure how to achieve what I need to do.
Asked Aug 19, 2020 at 13:25 by Sean Delaney; edited Aug 19, 2020 at 13:32 by hgb123.
use u00B6 instead – mplungjan Commented Aug 19, 2020 at 13:31
/¶/gmi ? (Although it might be easier to fix the character encoding.) – Ivar Commented Aug 19, 2020 at 13:32
matches is null – Sean Delaney Commented Aug 19, 2020 at 13:32
What's wrong with /[¶∂]/gmi ? – Wyck Commented Aug 19, 2020 at 13:33
If it displays as � then it's very likely that it isn't properly encoded to begin with. In other words, you won't find ¶ or ∂ because they aren't there. – Álvaro González Commented Aug 19, 2020 at 13:35
3 Answers
The correct Unicode code points are U+00B6 (pilcrow) and U+2202 (partial differential), not U+2029, which is the paragraph separator. You'll also want to use a [] character class in your expression so that either character matches:
const value = 'Javascript Regex pattern for Pilcrow (¶) or Partial Differential (∂) character';
const matches = value.match(/[\u00B6\u2202]/gmi);
console.log(matches);
Of course, you don't really need \u escapes in the first place:
const value = 'Javascript Regex pattern for Pilcrow (¶) or Partial Differential (∂) character';
const matches = value.match(/[¶∂]/gmi);
console.log(matches);
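Since the question also asks about replacing the characters, the same character class can be passed to String.prototype.replace. The following is only a minimal sketch: the `replacements` map, its placeholder values, and the `replaceSpecials` helper are hypothetical names chosen for illustration.

```javascript
// Hypothetical mapping from each special character to its replacement text.
const replacements = {
  '\u00B6': '[pilcrow]', // ¶
  '\u2202': '[partial]', // ∂
};

// Replace every pilcrow or partial differential using a callback lookup.
const replaceSpecials = (str) =>
  str.replace(/[\u00B6\u2202]/g, (ch) => replacements[ch]);

console.log(replaceSpecials('Pilcrow (\u00B6) or Partial Differential (\u2202)'));
```

Only the g flag is needed here; m and i have no effect on a pattern without anchors or letters.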
Last but not least, you say "they currently show as �". If that's the case, it's very likely that it isn't properly encoded to begin with. In other words, you won't find ¶ or ∂ because they aren't there. I suggest you address this first.
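To illustrate the encoding point, here is a hedged sketch of how a � can arise. It assumes (this is not confirmed by the question) that the data was originally stored in a single-byte encoding such as Windows-1252, where ¶ is the byte 0xB6, and was then decoded as UTF-8. The byte value and the `hasReplacementChar` helper are illustrative assumptions, and the `windows-1252` label relies on a runtime with full encoding support (browsers, or Node.js built with full ICU).

```javascript
// Assumed input: the single byte 0xB6, which is "¶" in Windows-1252.
const bytes = Uint8Array.of(0xb6);

// Decoded as UTF-8, a lone 0xB6 is invalid and becomes U+FFFD (�).
const asUtf8 = new TextDecoder('utf-8').decode(bytes);

// Decoded with the assumed original encoding, the pilcrow is recovered.
const asCp1252 = new TextDecoder('windows-1252').decode(bytes);

// Hypothetical helper: detect text that has already lost information.
const hasReplacementChar = (str) => str.includes('\uFFFD');

console.log(hasReplacementChar(asUtf8), hasReplacementChar(asCp1252));
```

Once a string contains U+FFFD, the original character cannot be recovered from the string itself; the fix has to happen where the bytes are decoded.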
Use String.prototype.codePointAt to get the character's Unicode code point and convert it into a sequence of hex digits.
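// Build a \uXXXX escape for one character (Basic Multilingual Plane only).
const toUnicodeCodePointHex = (str) => {
  const codePoint = str.codePointAt(0).toString(16);
  // Left-pad to four hex digits, e.g. 'b6' becomes '\u00b6'.
  return '\\u' + codePoint.padStart(4, '0');
};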
const value = 'Javascript Regex pattern for Pilcrow (¶) or Partial Differential (∂) character';
const re = new RegExp(['¶', '∂'].map((item) => toUnicodeCodePointHex(item)).join('|'), 'ig');
const matches = value.match(re);
console.log(matches);
See this very nice article by Mathias Bynens.
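The \uXXXX form only covers code points up to U+FFFF. As a possible variation (not part of the answer above), the sketch below emits \u{...} escapes and sets the u flag, which also handles characters outside the Basic Multilingual Plane. The helper names `toCodePointEscape` and `buildCharClass`, and the emoji used as a test character, are hypothetical.

```javascript
// Turn one character into a \u{hex} escape (valid only with the u flag).
const toCodePointEscape = (ch) => `\\u{${ch.codePointAt(0).toString(16)}}`;

// Build a global, unicode-mode character class from a list of characters.
const buildCharClass = (chars) =>
  new RegExp(`[${chars.map(toCodePointEscape).join('')}]`, 'gu');

const astralAwareRe = buildCharClass(['\u00B6', '\u2202', '\u{1F600}']);
const found = 'a\u00B6b\u2202c\u{1F600}'.match(astralAwareRe);
console.log(found);
```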
You can find them by hex or octal value:
const matches = value.match(/\u00B6|\u2202/g);
Regex for each:
Pilcrow: \u00B6 or \xB6 or \266
Partial Differential: \u2202
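As a quick check, the three pilcrow escapes can be compared directly. One caveat worth knowing: the octal form is a legacy escape and is rejected when the u (unicode) flag is set. The variable names in this sketch are illustrative.

```javascript
// Each pattern source should match the pilcrow (U+00B6) without the u flag.
const pilcrowEscapes = ['\\u00B6', '\\xB6', '\\266'];
const escapeResults = pilcrowEscapes.map((src) => new RegExp(src).test('\u00B6'));

// In unicode mode the legacy octal escape is a syntax error.
let octalAllowedInUnicodeMode = true;
try {
  new RegExp('\\266', 'u');
} catch (err) {
  octalAllowedInUnicodeMode = false;
}

console.log(escapeResults, octalAllowedInUnicodeMode);
```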