Javascript regex find variables in a math equation - Stack Overflow

I want to find in a math expression elements that are not wrapped between { and }

Examples:

  • Input: abc+1*def
    Matches: ["abc", "1", "def"]

  • Input: {abc}+1+def
    Matches: ["1", "def"]

  • Input: abc+(1+def)
    Matches: ["abc", "1", "def"]

  • Input: abc+(1+{def})
    Matches: ["abc", "1"]

  • Input: abc def+(1.1+{ghi})
    Matches: ["abc def", "1.1"]

  • Input: 1.1-{abc def}
    Matches: ["1.1"]

Rules

  • The expression is well-formed. (So there won't be start parenthesis without closing parenthesis or starting { without })
  • The math symbols allowed in the expression are + - / * and ( )
  • Numbers could be decimals.
  • Variables can contain spaces.
  • Only one level of { } (no nested brackets)

So far, I ended up with: http://regex101.com/r/gU0dO4

(^[^/*+({})-]+|(?:[/*+({})-])[^/*+({})-]+(?:[/*+({})-])|[^/*+({})-]+$)

I split the task into 3:

  • match elements at the beginning of the string
  • match elements that are between two { and }
  • match elements at the end of the string

But it doesn't work as expected.

Any ideas?

asked May 14, 2014 at 8:28 by fluminis (edited May 14, 2014 at 9:41)
  • Consider going the other way as the first (and independent) step - replace all "{..}" with "". – user2864740 Commented May 14, 2014 at 8:31
  • Using regex to validate parenthesis pairs is already complicated enough; for the sanity of the maintainer, consider alternatives to doing it in regex. – Theraot Commented May 14, 2014 at 8:35
  • I do not want to validate the expression, I know for sure that the expression is valid. I just want to get the tokens that are not wrapped between { and }. – fluminis Commented May 14, 2014 at 8:38
  • Do you have only one level of { } or can you have deep nesting (which changes everything and would require parsing)? – Denys Séguret Commented May 14, 2014 at 8:55
  • This really changes everything... – Denys Séguret Commented May 14, 2014 at 9:06

4 Answers

Matching {}s, especially nested ones, is hard (read: impossible) for a standard regular expression, since it requires counting the number of {s you have encountered so you know which } terminates which.

Instead, a simple string manipulation method could work. This is a very basic parser that reads the string left to right and only collects characters while it is outside of braces.

var input = "abc def+(1.1+{ghi})"; // I assume well formed, as well as no precedence
var inParens = false; // true while we are inside { }
var output = [], buffer = "", parenCount = 0;
for(var i = 0; i < input.length; i++){
    if(!inParens){
          if(input[i] === "{"){
              inParens = true;
              parenCount++;
          } else if (["+","-","(",")","/","*"].some(function(x){ 
               return x === input[i]; 
          })){ // got symbol
              if(buffer!==""){ // buffer has stuff to add to output
                  output.push(buffer); // add the token collected so far
                  buffer = "";
              }
          } else { // letter, digit, dot or space
              buffer += input[i]; // push to buffer
          }
    } else { // inParens is true
         if(input[i] === "{") parenCount++;
         if(input[i] === "}") parenCount--;
         if(parenCount === 0) inParens = false; // consume again
    }
}
if(buffer!=="") output.push(buffer); // flush a token that ends the string, e.g. "def" in "abc+1*def"
// output is now ["abc def", "1.1"]

This might be an interesting regexp challenge, but in the real world you'd be much better off simply finding all [^+/*()-]+ groups and removing those enclosed in {}.

"abc def+(1.1+{ghi})".match(/[^+/*()-]+/g).filter(
    function(x) { return !/^{.+?}$/.test(x) })
// ["abc def", "1.1"]
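As a quick sanity check, the one-liner above can be wrapped in a function and run against the inputs from the question. This is only a sketch: the name `unbracedTokens` is made up for illustration, and it assumes (as the question's rules suggest) that no operator ever appears inside a `{ }` pair, since `{a+b}` would be split into `{a` and `b}` and neither piece would be filtered out.

```javascript
// Hypothetical helper name; wraps the match + filter approach shown above.
function unbracedTokens(expr) {
    // Split on the allowed operators; braces stay attached to their token.
    // match() returns null when nothing matches, so fall back to [].
    var tokens = expr.match(/[^+\/*()-]+/g) || [];
    // Drop the tokens that are entirely wrapped in { }.
    return tokens.filter(function (t) { return !/^\{.+?\}$/.test(t); });
}

// The six inputs listed in the question.
var examples = [
    "abc+1*def",
    "{abc}+1+def",
    "abc+(1+def)",
    "abc+(1+{def})",
    "abc def+(1.1+{ghi})",
    "1.1-{abc def}"
];

examples.forEach(function (input) {
    console.log(input, "->", JSON.stringify(unbracedTokens(input)));
});
```

For these six inputs the results agree with the expected matches listed in the question.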

That being said, regexes are not the right tool for parsing math expressions. For serious parsing, consider using formal grammars and parsers. There are plenty of parser generators for JavaScript; for example, in PEG.js you can write a grammar like

expr
  = left:multiplicative "+" expr
  / multiplicative

multiplicative
  = left:primary "*" right:multiplicative
  / primary

primary
  = atom
  / "{" expr "}"
  / "(" expr ")"

atom = number / word

number = n:[0-9.]+ { return parseFloat(n.join("")) }
word = w:[a-zA-Z ]+ { return w.join("") }

and generate a parser which will be able to turn

 abc def+(1.1+{ghi})

into

[
   "abc def",
   "+",
   [
      "(",
      [
         1.1,
         "+",
         [
            "{",
            "ghi",
            "}"
         ]
      ],
      ")"
   ]
]

Then you can iterate over this array as usual and fetch the parts you're interested in.
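One possible way to do that iteration is a small recursive walk that skips any sub-array starting with "{". This is a sketch that assumes the parser output has exactly the nested-array shape shown above; `collectUnbraced` is a hypothetical name and not part of PEG.js.

```javascript
// Hypothetical helper: collects atoms from the nested array, skipping { } groups.
function collectUnbraced(node, out) {
    out = out || [];
    if (!Array.isArray(node)) {
        // Leaf: keep numbers and words, ignore operator and bracket strings.
        if (typeof node === "number" || !/^[-+*\/(){}]$/.test(node)) {
            out.push(node);
        }
        return out;
    }
    // A group produced by the "{" expr "}" rule starts with "{": skip it whole.
    if (node[0] === "{") return out;
    node.forEach(function (child) { collectUnbraced(child, out); });
    return out;
}

// The parse result shown above for: abc def+(1.1+{ghi})
var tree = ["abc def", "+", ["(", [1.1, "+", ["{", "ghi", "}"]], ")"]];

console.log(collectUnbraced(tree)); // [ 'abc def', 1.1 ]
```

Note that 1.1 comes back as a number rather than a string, because the grammar's number rule calls parseFloat.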

The variable names you mentioned can be matched by \b[\w.]+\b since they are strictly bounded by word separators.

Since you have well formed formulas, the names you don't want to capture are strictly followed by }, therefore you can use a lookahead expression to exclude these :

(\b[\w.]+\b)(?!})

This will match the required elements (http://regexr.com/38rch).

Edit:

For more complex uses, like correctly matching:

  • abc {def{}}
  • abc def+(1.1+{g{h}i})

We need to change the lookahead term to (?!({|}))

To also handle 1.2-{abc def} we need to change the \b [1]. That term behaves like a lookaround expression, and lookbehind was not available in JavaScript when this was written, so we have to work around it.

(?:^|[^a-zA-Z0-9. ])([a-zA-Z0-9. ]+(?=[^0-9A-Za-z. ]))(?!({|}))

This seems to be a good one for our examples (http://regex101.com/r/oH7dO1).

[1] \b is the boundary between a \w and a \W, \z or \a. Since \w does not include the space character and \W does, it is incompatible with the definition of our variable names.

Going forward with user2864740's comment, you can replace everything between { and } with an empty string and then match what remains.

var matches = "string here".replace(/{.+?}/g,"").match(/\b[\w. ]+\b/g);
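One thing to watch for with the line above: `match` returns null rather than an empty array when nothing is left to match (for example when the whole input is `{abc}`). A minimal sketch that guards against this, using a hypothetical function name and assuming a single level of braces as stated in the question:

```javascript
// Hypothetical wrapper around the replace + match approach shown above.
function tokensOutsideBraces(expr) {
    // Remove every { ... } group first (one level of braces assumed).
    var stripped = expr.replace(/\{.+?\}/g, "");
    // match() yields null when there are no matches, so fall back to [].
    return stripped.match(/\b[\w. ]+\b/g) || [];
}

console.log(tokensOutsideBraces("abc def+(1.1+{ghi})")); // [ 'abc def', '1.1' ]
console.log(tokensOutsideBraces("{abc}"));               // []
```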

Since you know that expressions are valid, just select \w+
