javascript - Looking for regex: soft hyphen or not a word character - Stack Overflow

I am looking for a regex that matches characters which are neither word characters nor a soft hyphen (U+00AD).

This will give me characters which are not word-characters:

((?=\W).)

But what about the soft hyphen character? What is the correct regex?

asked Aug 24, 2011 at 14:13 by Thariama, edited Oct 13, 2016 at 21:13 by William Perron
  • no, this does not work, i am not looking for hyphen but "soft hyphen" - that is a different character (U+00AD see fileformat.info/info/unicode/char/ad/index.htm) – Thariama Commented Aug 24, 2011 at 14:20
  • to be honest - i do not have to – Thariama Commented Aug 24, 2011 at 14:25
  • You would have to use a hex escape for U+00AD. Note that Javascript is broken with respect to word characters or anything that isn’t stoneage ASCII, which includes U+00AD. If someone writes "élève" then you are going to be in big trouble whether they have a soft hyphen in there or not. Javascript is absolutely the worst possible language for writing regexes for non-ASCII strings. Basically, you are severely screwed. Try writing a pattern for "é-lève" or "ja-la-pe-ño" in Javascript, where those real hyphens are actually soft ones. NFW. – tchrist Commented Aug 24, 2011 at 14:25
  • @tchrist: luckily I am only interested in the soft hyphen character itself, not any special character around that one – Thariama Commented Aug 24, 2011 at 14:27

3 Answers

You can do this:

[^\w\u00AD]

(NOT a word or soft hyphen)
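
As a quick check of that character class, here is a minimal sketch. The sample string and the variable names are made up for illustration; only the pattern itself comes from the answer.

```javascript
// Character class from the answer: anything that is neither \w nor U+00AD.
const notWordOrShy = /[^\w\u00AD]/g;

// Hypothetical sample: "co", a soft hyphen, "op", then ", now!".
const sample = 'co\u00ADop, now!';

// Only the comma, the space and the exclamation mark are matched.
const found = sample.match(notWordOrShy);
console.log(found); // [ ',', ' ', '!' ]

// Plain \W additionally matches the soft hyphen, because \w is ASCII-only.
const plainNonWord = sample.match(/\W/g);
console.log(plainNonWord.length); // 4
```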

I created a quick and dirty last_symbol() function:

function last_symbol(str) {
    // Capture the last character that is neither a word character nor a soft hyphen.
    var result = str.match(/([^\w\u00AD])[\w\u00AD]*$/);
    return (result == null) ? null : result[1];
}

last_symbol('hello')   // null
last_symbol('hell!')   // '!'
last_symbol('hell!o$') // '$'
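
The calls above contain no soft hyphen, so here is a small sketch of the case the question is about. The input string and the comparison helper `last_symbol_plain` are hypothetical and exist only to show the difference from a plain `\W` pattern.

```javascript
// Same function as above, repeated so the snippet runs on its own.
function last_symbol(str) {
    var result = str.match(/([^\w\u00AD])[\w\u00AD]*$/);
    return (result == null) ? null : result[1];
}

// Hypothetical variant using plain \W and \w, for comparison only.
function last_symbol_plain(str) {
    var result = str.match(/(\W)\w*$/);
    return (result == null) ? null : result[1];
}

// "a!b", a soft hyphen, then "c".
var input = 'a!b\u00ADc';

console.log(last_symbol(input));                    // '!'
console.log(last_symbol_plain(input) === '\u00AD'); // true
```

With the plain pattern the soft hyphen itself is reported as the last symbol, which is what the extended character class avoids.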

You can use \u00AD to match the Unicode soft hyphen character, so you should be able to negate this expression and combine it with \W to match characters which are not a word character and not a soft hyphen.

[^\u00AD\w]+
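
One way the `+` form might be used is as a separator pattern for `split`, so that soft hyphens stay inside the words they belong to. This is only a sketch; the sample text and variable names are assumptions, not part of the answer.

```javascript
// Pattern from the answer: one or more characters that are
// neither a soft hyphen nor a word character.
const separators = /[^\u00AD\w]+/;

// Hypothetical sample text; the first word contains a soft hyphen.
const text = 'soft\u00ADhyphen -- stays;  inside words';

// Runs of spaces and punctuation act as a single separator.
const tokens = text.split(separators);
console.log(tokens.length);               // 4
console.log(tokens[0].includes('\u00AD')); // true
```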

Use regex /\x{AD}/u to match soft hyphens in PHP!
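
For comparison, a rough JavaScript counterpart, assuming an engine that supports the `u` flag (ES2015) and Unicode property escapes (ES2018), both of which postdate this thread. The last pattern is one possible way around the ASCII-only `\w` raised in the comments, not something proposed in the answers, and the variable names are illustrative.

```javascript
// ES2015 code point escape; it requires the `u` flag.
const softHyphen = /\u{AD}/u;
console.log(softHyphen.test('co\u00ADop')); // true

// \w is ASCII-only, so 'é' (U+00E9) counts as a non-word character here.
const accentedIsNonWord = /[^\w\u00AD]/.test('\u00E9');
console.log(accentedIsNonWord); // true

// Assumed alternative: treat any Unicode letter, digit or underscore as a
// word character, and still exclude the soft hyphen.
const notUnicodeWordOrShy = /[^\p{L}\p{N}_\u00AD]/gu;

// "élève", a soft hyphen, then "!".
const hits = '\u00E9l\u00E8ve\u00AD!'.match(notUnicodeWordOrShy);
console.log(hits); // [ '!' ]
```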
