I am looking for a regex that matches characters which are neither word characters nor a soft hyphen (U+00AD).
This will give me characters which are not word-characters:
((?=\W).)
But what about the soft hyphen character? What is the correct regex?
asked Aug 24, 2011 at 14:13 by Thariama; edited Oct 13, 2016 at 21:13 by William Perron
- no, this does not work, I am not looking for hyphen but "soft hyphen" - that is a different character (U+00AD, see fileformat.info/info/unicode/char/ad/index.htm) – Thariama Commented Aug 24, 2011 at 14:20
- to be honest - i do not have to – Thariama Commented Aug 24, 2011 at 14:25
- You would have to use a hex escape for U+00AD. Note that JavaScript is broken with respect to word characters or anything that isn't stone-age ASCII, which includes U+00AD. If someone writes "élève" then you are going to be in big trouble whether they have a soft hyphen in there or not. JavaScript is absolutely the worst possible language for writing regexes for non-ASCII strings. Basically, you are severely screwed. Try writing a pattern for "é-lève" or "ja-la-pe-ño" in JavaScript, where those real hyphens are actually soft ones. NFW. – tchrist Commented Aug 24, 2011 at 14:25
- @tchrist: luckily I am only interested in the soft hyphen character itself, not any special characters around it – Thariama Commented Aug 24, 2011 at 14:27
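To make the ASCII-only point above concrete, here is a minimal sketch. It assumes an engine with ES2018 Unicode property escapes (the `u` flag with `\p{...}`), which did not exist when this thread was written, and it assumes that "letters, digits and underscore" is an acceptable stand-in for "word character"; the variable names are just for illustration.

```javascript
// "é", soft hyphen, "lève!" written with escapes so the source stays ASCII.
const word = "\u00E9\u00ADl\u00E8ve!";

// \w is ASCII-only in JavaScript, so the accented letters count as non-word.
const asciiOnly = word.match(/[^\w\u00AD]/g);

// Assumption: \p{L}\p{N}_ approximates a Unicode-aware "word character".
const unicodeAware = word.match(/[^\p{L}\p{N}_\u00AD]/gu);

console.log(asciiOnly);    // ['é', 'è', '!']
console.log(unicodeAware); // ['!']
```

In both cases the soft hyphen itself is excluded from the matches; only the treatment of the accented letters differs.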
3 Answers
You can do this:
[^\w\u00AD]
(NOT a word or soft hyphen)
I created a quick and dirty last_symbol() function:
function last_symbol(str) {
    // Capture the last character that is neither a word character nor a soft hyphen
    var result = str.match(/([^\w\u00AD])[\w\u00AD]*$/);
    return (result == null) ? null : result[1];
}
last_symbol('hello') // null
last_symbol('hell!') // '!'
last_symbol('hell!o$') // '$'
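If you want every such character rather than just the last one, the same character class can be used with the global flag. The sketch below also shows a lookahead variant that stays closer to the pattern in the question. Both helper names are hypothetical, made up for this example.

```javascript
// Hypothetical helper: collect every character that is neither a word
// character (\w, ASCII-only in JavaScript) nor a soft hyphen (U+00AD).
function nonWordChars(str) {
  return str.match(/[^\w\u00AD]/g) || [];
}

// Same result, written as \W guarded by a negative lookahead.
function nonWordCharsLookahead(str) {
  return str.match(/(?!\u00AD)\W/g) || [];
}

// "co", soft hyphen, "operate, now!"
const sample = "co\u00ADoperate, now!";
console.log(nonWordChars(sample));          // [',', ' ', '!']
console.log(nonWordCharsLookahead(sample)); // [',', ' ', '!']
```

The soft hyphen inside "co\u00ADoperate" is skipped by both versions, while the comma, the space and the exclamation mark are returned.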
You can use \u00AD to match the Unicode soft hyphen character, so you should be able to put it in a negated character class together with \w to match characters which are neither a word character nor a soft hyphen.
[^\u00AD\w]+
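The trailing + makes the class match whole runs instead of single characters. A small sketch of the difference, with made-up sample text and variable names:

```javascript
// "wait", soft hyphen, "... what?!"
const text = "wait\u00AD... what?!";

// Without +: one match per character.
const singles = text.match(/[^\w\u00AD]/g);

// With +: adjacent characters are grouped into one match.
const runs = text.match(/[^\w\u00AD]+/g);

// Splitting on runs keeps the soft hyphen inside its word.
const words = text.split(/[^\w\u00AD]+/).filter(Boolean);

console.log(singles); // ['.', '.', '.', ' ', '?', '!']
console.log(runs);    // ['... ', '?!']
console.log(words);   // ['wait\u00AD', 'what']
```

Which form you want depends on whether you need individual characters or separators between words.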
Use the regex /\x{AD}/u to match soft hyphens in PHP!