c++ - How to match std::stringstream through AST? - Stack Overflow

Target code block:

int age = 5;
std::stringstream q;
q << "my name "
<< "is Tom, "
<< "my age is " << age;

I'm trying to create a matcher that matches the entire block starting from the second line.

I tried a very case-specific matcher, but it does not work. My goal is to match similar code blocks, where the number of operands on the left-hand and right-hand sides is not fixed.

cxxOperatorCallExpr(hasOverloadedOperatorName("<<"),
  hasLHS(ignoringParenImpCasts(expr(anyOf(
    stringLiteral().bind("firstString"),
    stringLiteral().bind("secondString"),
    stringLiteral().bind("thirdString"))))),
  hasRHS(ignoringParenImpCasts(expr(anyOf(
    declRefExpr(to(varDecl(hasType(cxxRecordDecl(hasName("std::stringstream")))))),
    declRefExpr(hasType(isInteger())))))))

As a test, I tried to match just the line std::stringstream q; with varDecl(hasType(cxxRecordDecl(hasName("std::stringstream")))), but it returns no matches either.

How should I start, and how do I build a matcher based on the AST dump?

Appreciate any help.

Asked Nov 19, 2024 at 9:40 by LunaticJape; edited Nov 19, 2024 at 23:21 by Scott McPeak.

1 Answer

Match a variable declaration of type std::stringstream

Starting with the second question, to match a variable declaration of type std::stringstream, use a Clang AST matcher like this:

varDecl(
  hasType(
    asString("std::stringstream")
  )
)

The asString matcher turns the type into a string, making it reasonably easy to use.

The attempted matcher in the question:

varDecl(hasType(cxxRecordDecl(hasName("std::stringstream"))))

fails for a few reasons:

  • The sequence hasType(cxxRecordDecl(...)) will never match because cxxRecordDecl matches a declaration (a piece of syntax) while hasType matches a type (an abstract semantic notion). You would need to at least insert hasDeclaration between the two, but that cannot be directly applied until the kind of type has been further refined, since not all types have declarations.

  • The actual type in this case is an ElaboratedType, which Clang seemingly sprinkles throughout its Type structures in a way I find somewhat unpredictable. That makes it hard in general to match Types.

  • The type to which the name std::stringstream refers is a TypedefType for a template specialization, rather than a stand-alone class, so cxxRecordDecl won't match. (Note: TypedefType is used for type aliases created with either typedef or using.)
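
The third point can be checked without Clang at all. The following is a minimal sketch that relies only on the standard library headers to confirm that std::stringstream is an alias for the specialization std::basic_stringstream<char>. The names isStringStreamClass and checkAlias are hypothetical, introduced only for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <type_traits>

// std::stringstream is a type alias, not a class of its own: the class
// behind it is the template specialization std::basic_stringstream<char>.
static_assert(
    std::is_same<std::stringstream, std::basic_stringstream<char>>::value,
    "std::stringstream names std::basic_stringstream<char>");

// Hypothetical helper (not part of any library): true when T is the
// class that std::stringstream ultimately refers to.
template <typename T>
constexpr bool isStringStreamClass() {
  return std::is_same<T, std::basic_stringstream<char>>::value;
}

// Hypothetical runtime check of the same fact.
void checkAlias() {
  assert(isStringStreamClass<std::stringstream>());
}
```

Because the alias and the specialization are the same type, a matcher that looks for a class literally named std::stringstream has nothing to find; the name only exists on the alias declaration.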

If you want to match a declaration of a std::stringstream variable without using asString, use this more complicated matcher:

varDecl(
  hasType(
    elaboratedType(
      namesType(
        typedefType(
          hasDeclaration(
            namedDecl(
              hasName("std::stringstream")
            )
          )
        )
      )
    )
  )
)

Match operator<< applied to a stringstream

Armed with the above, we can try to match a use of operator<< where the left-hand side is a stringstream. The question has this example expression:

q << "my name " << "is Tom, " << "my age is " << age;

This is parsed as a nested tree of binary operators:

(((q << "my name ") << "is Tom, ") << "my age is ") << age;
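
To make the grouping concrete, here is a small sketch in plain C++ (no Clang involved) that writes the statement both ways. The function names chained, nested, and checkSameText are hypothetical; the point is only that the two spellings are the same expression, since operator<< is left-associative and each call returns the stream.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Chained form, as written in the question.
std::string chained(int age) {
  std::stringstream q;
  q << "my name " << "is Tom, " << "my age is " << age;
  return q.str();
}

// Fully parenthesized form, mirroring how the expression is parsed:
// each inner operator<< call is the left operand of the next one.
std::string nested(int age) {
  std::stringstream q;
  (((q << "my name ") << "is Tom, ") << "my age is ") << age;
  return q.str();
}

// Hypothetical check that both spellings produce the same text.
void checkSameText() {
  assert(chained(5) == nested(5));
}
```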

So we need to look for a use of operator<< where somewhere in the left-hand side is a stringstream, and the right-hand side is either a string literal or an integer-valued expression (I'm partly guessing the intent based on the attempted matcher in the question). That can be done like so:

cxxOperatorCallExpr(
  hasOverloadedOperatorName("<<"),
  hasLHS(
    hasDescendant(
      expr(
        hasType(
          asString("std::stringstream")
        )
      ).bind("stringStreamExpr")
    )
  ),
  hasRHS(
    ignoringParenImpCasts(
      expr(
        anyOf(
          stringLiteral(
          ).bind("stringLiteral"),
          expr(
            hasType(
              isInteger()
            )
          ).bind("intExpr")
        )
      )
    )
  )
)

This will separately report matches for each occurrence of operator<<. It does not try to report a single match for the entire compound expression with all of the various arguments at once; the AST matcher language is not really powerful enough to do that robustly.
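
If all of the operands are wanted together, one option is to post-process the outermost operator<< outside the matcher language by walking down its chain of left-hand sides. The sketch below shows that walk on a toy tree. Node, leaf, shift, operandsInSourceOrder, and exampleTree are hypothetical names, and Node is a stand-in, not a Clang class; a real tool would do the analogous walk over Clang's own expression nodes, which is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an expression node; NOT a Clang class.
struct Node {
  std::string text;           // spelling of a leaf operand, e.g. "q" or "age"
  std::unique_ptr<Node> lhs;  // non-null only for a "<<" node
  std::unique_ptr<Node> rhs;  // non-null only for a "<<" node
  bool isShift() const { return lhs && rhs; }
};

std::unique_ptr<Node> leaf(std::string text) {
  std::unique_ptr<Node> n(new Node());
  n->text = std::move(text);
  return n;
}

std::unique_ptr<Node> shift(std::unique_ptr<Node> lhs,
                            std::unique_ptr<Node> rhs) {
  assert(lhs && rhs);  // a "<<" node always has two operands
  std::unique_ptr<Node> n(new Node());
  n->lhs = std::move(lhs);
  n->rhs = std::move(rhs);
  return n;
}

// Walk down the left spine from the outermost "<<", collecting each
// right-hand operand and finally the stream; then reverse the list so
// the operands come out in source order.
std::vector<std::string> operandsInSourceOrder(const Node &outermost) {
  std::vector<std::string> out;
  const Node *cur = &outermost;
  while (cur->isShift()) {
    out.push_back(cur->rhs->text);
    cur = cur->lhs.get();
  }
  out.push_back(cur->text);
  std::reverse(out.begin(), out.end());
  return out;
}

// Tree for: q << "my name " << "is Tom, " << "my age is " << age
std::unique_ptr<Node> exampleTree() {
  return shift(shift(shift(shift(leaf("q"), leaf("\"my name \"")),
                           leaf("\"is Tom, \"")),
                     leaf("\"my age is \"")),
               leaf("age"));
}
```

The design choice here is to keep the matcher simple (one match per operator<<) and do the aggregation in ordinary code, where loops and containers are available.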

Complete example

Here is a shell script that runs clang-query with the above matcher:

#!/bin/sh

PATH=/d/opt/clang+llvm-18.1.8-msvc/bin:$PATH

matcher='
  cxxOperatorCallExpr(
    hasOverloadedOperatorName("<<"),
    hasLHS(
      hasDescendant(
        expr(
          hasType(
            asString("std::stringstream")
          )
        ).bind("stringStreamExpr")
      )
    ),
    hasRHS(
      ignoringParenImpCasts(
        expr(
          anyOf(
            stringLiteral(
            ).bind("stringLiteral"),
            expr(
              hasType(
                isInteger()
              )
            ).bind("intExpr")
          )
        )
      )
    )
  )
'

clang-query \
  -c "set bind-root false" \
  -c "m $matcher" \
  test --

# EOF

Test input file test:

// test
// Match an `operator<<` expression involving `std::stringstream`.

#include <sstream>                     // std::stringstream

void f()
{
  int age = 5;
  std::stringstream q;
  q << "my name "
  << "is Tom, "
  << "my age is " << age;
}

// EOF

Output of the script:

Match #1:

$PWD\test:12:22: note: 
      "intExpr" binds here
   12 |   << "my age is " << age;
      |                      ^~~
$PWD\test:10:3: note: 
      "stringStreamExpr" binds here
   10 |   q << "my name "
      |   ^

Match #2:

$PWD\test:12:6: note: 
      "stringLiteral" binds here
   12 |   << "my age is " << age;
      |      ^~~~~~~~~~~~
$PWD\test:10:3: note: 
      "stringStreamExpr" binds here
   10 |   q << "my name "
      |   ^

Match #3:

$PWD\test:11:6: note: 
      "stringLiteral" binds here
   11 |   << "is Tom, "
      |      ^~~~~~~~~~
$PWD\test:10:3: note: 
      "stringStreamExpr" binds here
   10 |   q << "my name "
      |   ^

Match #4:

$PWD\test:10:8: note: 
      "stringLiteral" binds here
   10 |   q << "my name "
      |        ^~~~~~~~~~
$PWD\test:10:3: note: 
      "stringStreamExpr" binds here
   10 |   q << "my name "
      |   ^
4 matches.

Issues with the matcher in the question

The question has this attempted matcher:

cxxOperatorCallExpr(hasOverloadedOperatorName("<<"),
  hasLHS(ignoringParenImpCasts(expr(anyOf(
    stringLiteral().bind("firstString"),
    stringLiteral().bind("secondString"),
    stringLiteral().bind("thirdString"))))),
  hasRHS(ignoringParenImpCasts(expr(anyOf(
    declRefExpr(to(varDecl(hasType(cxxRecordDecl(hasName("std::stringstream")))))),
    declRefExpr(hasType(isInteger())))))))

This has several problems worth identifying as part of clarifying how the AST is structured and how matchers work:

  • It seems to be looking for string literals on the left-hand side and stringstream variables on the right-hand side; those should be swapped.

  • It is trying to match multiple string literals inside anyOf, but anyOf will stop after the first branch succeeds. If you change that to allOf, then all of the bindings will just go to the same expression. In some cases, forEachDescendant can be used to do something like this, but I find it difficult to use robustly, and it would not apply here (or at least I don't see how to make it work in this case).

  • It seems to expect the entire statement to be a single cxxOperatorCallExpr, but it's actually a tree of them.

  • It has the issues with recognizing stringstream variables explained in the first part of this answer.
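
The second point, about anyOf and allOf, can be illustrated with a toy model. This is only a sketch of the behavior described above, not Clang's implementation: Bindings, Matcher, quotedLiteral, anyOfToy, and allOfToy are hypothetical names, and a "node" is just a string.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Toy model: a node is a string, bindings map an id to the bound node.
using Bindings = std::map<std::string, std::string>;
using Matcher = std::function<bool(const std::string &, Bindings &)>;

// Toy "string literal" matcher: succeeds on text that starts with a
// double quote and records the node under `id`.
Matcher quotedLiteral(std::string id) {
  assert(!id.empty());
  return [id](const std::string &node, Bindings &b) {
    if (node.empty() || node[0] != '"') return false;
    b[id] = node;
    return true;
  };
}

// Toy anyOf: tries the branches in order and stops at the first success,
// so later branches never get a chance to bind anything.
Matcher anyOfToy(std::vector<Matcher> branches) {
  return [branches](const std::string &node, Bindings &b) {
    for (const Matcher &m : branches)
      if (m(node, b)) return true;
    return false;
  };
}

// Toy allOf: every branch runs, but each one sees the same node, so all
// of the ids end up bound to that one node.
Matcher allOfToy(std::vector<Matcher> branches) {
  return [branches](const std::string &node, Bindings &b) {
    for (const Matcher &m : branches)
      if (!m(node, b)) return false;
    return true;
  };
}
```

In this model, applying anyOfToy with two quotedLiteral branches to one literal leaves a single binding, while allOfToy leaves two bindings that both point at the same literal. Neither gives one binding per distinct literal in the chain.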
