I need help to create a regex (for JavaScript .match and PHP preg_match) that validates a unix type absolute path to a file (with international characters such as åäöøæð and so on) so that:
- /path/to/someWhere is valid
- /path/tø/sömewhere is valid
- /path/to//somewhere is invalid
- path/to/somewhere is invalid
- /path/to/somewhere/ is invalid
The regex needs to handle paths regardless of their depth (/path/to or /path/to/somewhere or /path/to/somewhere/else)
I have a regexp that marks 1 to 3 as valid: /^\/.+[^\/]$/. The problem is to make it reject 3, since it contains // without any other character in between.
asked Sep 28, 2010 at 13:45 by tirithen
- I love regex posts like this, to an outsider we all look like spambots or something. – Incognito Commented Sep 28, 2010 at 13:52
- Unix/Linux pathnames actually work just fine if they end in a slash or have double-slashes. – Pointy Commented Sep 28, 2010 at 14:30
6 Answers
Regex isn't really needed here. As far as I can see, there are three things you want to ensure:
- The string starts with /
- The string doesn't end with /, unless the whole string is /
- The string doesn't contain any instances of //
All three of the above can be done with string functions.
In PHP:
if ($string != '/' && ($string == '' || $string[0] != '/' || $string[strlen($string)-1] == '/' || strpos($string, '//') !== false))
{
// string is invalid
}
In Javascript:
if (string != '/' && (string.charAt(0) != '/' || string.charAt(string.length - 1) == '/' || string.indexOf('//') > -1))
{
// string is invalid
}
Resources:
- PHP's strpos function
- Javascript's String.charAt and String.indexOf functions
A Solution for PHP:
$lines = array(
"/path/to/someWhere",
"/path/tø/sömewhere",
"/path/to//somewhere",
"path/to/somewhere",
"/path/to/somewhere/",
);
foreach($lines as $line){
var_dump(preg_match('#^(/[^/]+)+$#',$line)); // dumps int(1) int(1) int(0) int(0) int(0)
}
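Since the question also asks about JavaScript, here is a minimal sketch of the same pattern used with RegExp.prototype.test. The helper name isAbsoluteFilePath is made up for illustration; the pattern is the one from the PHP snippet, with the slashes escaped because / delimits a regex literal in JavaScript.

```javascript
// Same idea as the PHP pattern: one or more "/segment" groups,
// where each segment is at least one non-slash character.
const segmentPattern = /^(\/[^\/]+)+$/;

// Hypothetical helper name, not part of the original answer.
function isAbsoluteFilePath(path) {
  return segmentPattern.test(path);
}

const samples = [
  "/path/to/someWhere",
  "/path/tø/sömewhere",
  "/path/to//somewhere",
  "path/to/somewhere",
  "/path/to/somewhere/",
];

for (const sample of samples) {
  console.log(sample, isAbsoluteFilePath(sample));
}
```

Note that this pattern rejects the root path / on its own, because it requires at least one non-slash character after each slash.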
I think this will do it:
^(?:\/$|(?:\/[^/]+)+$)
That says to accept any string that's either just a /, or any string formed from a sequence of one or more repetitions of a / followed by one or more non-/ characters.
This uses all greedy quantifiers so it should be fast; also, for performance, the ^ anchor is factored out.
That's a Javascript regex. I'm not a PHP programmer so the main thing I don't know is whether the non-capturing group syntax works in PHP. Also, I'm not sure how you'd handle "quoting" the slash characters.
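As a rough illustration of how this could be used with String.prototype.match, which the question mentions, the sketch below writes the pattern with the (?:...) non-capturing syntax. The helper name matchesPath is hypothetical. PCRE, which preg_match uses, supports the same (?:...) syntax, and picking a delimiter other than / in PHP avoids having to escape the slashes.

```javascript
// Either the root "/" alone, or one or more "/segment" groups.
// (?:...) groups without capturing.
const pathPattern = /^(?:\/$|(?:\/[^\/]+)+$)/;

// Hypothetical helper: match() returns null when nothing matches.
function matchesPath(candidate) {
  return candidate.match(pathPattern) !== null;
}

console.log(matchesPath("/"));                    // root path
console.log(matchesPath("/path/tø/sömewhere"));   // international characters
console.log(matchesPath("/path/to//somewhere"));  // double slash
console.log(matchesPath("/path/to/somewhere/"));  // trailing slash
```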
This should work:
^/[^/]?$|^/[^/]([^/]|/[^/])*$
After the leading / and one non-/ character, it allows any character except /, or a / followed by any character except /. Since every / must be followed by another character, the path can't end in a /, and the second character can't be one either.
Finally, this uses / without escaping. To use it in PHP, don't use / as the regex delimiter, since that just makes the regular expression hard to read. Use any other character, e.g. ; to delimit the expression instead:
;^/[^/]?$|^/[^/]([^/]|/[^/])*$;
EDIT: Added special handling for the root path, "/", and paths that consist of a single letter directory.
If the path matches ^[^\/]|\/\/|.\/$, it is invalid. Otherwise it is valid.
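A minimal JavaScript sketch of this inverted check follows. The helper name isValidPath and the explicit empty-string guard are assumptions added here: the empty string matches none of the three alternatives, so without the guard it would count as valid.

```javascript
// Matches when the path is INVALID:
//   ^[^\/]  the first character is not a slash
//   \/\/    two consecutive slashes anywhere
//   .\/$    a trailing slash preceded by at least one character
const invalidPattern = /^[^\/]|\/\/|.\/$/;

// Hypothetical helper; the length check is an added assumption.
function isValidPath(path) {
  return path.length > 0 && !invalidPattern.test(path);
}

console.log(isValidPath("/path/to/someWhere"));
console.log(isValidPath("/path/to//somewhere"));
console.log(isValidPath("/"));
```

Because the third alternative needs a character before the trailing slash, the root path / on its own is still accepted.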
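It's not regex, but if cleaning up the path is acceptable, you can collapse double slashes instead of rejecting them. Note that a single pass turns /// into //, so longer runs need repeated calls.
str_replace('//', '/', $file)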
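For the JavaScript side, a comparable normalization could look like the sketch below. The helper name collapseSlashes is hypothetical, and this repairs a path instead of validating it, so it only fits if silently fixing the input is acceptable.

```javascript
// Hypothetical helper: collapse every run of two or more slashes into one.
function collapseSlashes(path) {
  return path.replace(/\/{2,}/g, "/");
}

console.log(collapseSlashes("/path/to//somewhere"));     // "/path/to/somewhere"
console.log(collapseSlashes("/path///to////somewhere")); // "/path/to/somewhere"
```

Using the {2,} quantifier handles runs of any length in one pass.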