Defining a UIMA Ruta rule - Stack Overflow

I need to define a rule, but I'm struggling with the UIMA Ruta syntax. I have a dictionary. When a dictionary entry matches in the text, the rule should look at the two words before and the two words after the match. If those words start with an uppercase letter, the condition is fulfilled.

Example: "Prime Minister" Margaret Thatcher

The phrase "Prime Minister" is in the dictionary, and the two words after it both start with an uppercase letter, so the condition is satisfied. The result is "Margaret Thatcher".

This is how my very rough rule is written, but the syntax is incorrect, and I cannot figure out how to improve it:

WORDLIST MyWordList = 'test_words.txt';

// Declaration of annotations for marked words
DECLARE MyAnnotation;
DECLARE CapitalizedAnnotation;

Document {-> MyWordList 
            -2, -1 {
            (REGEXP("[A-ZÁ-Ž]") && NOT(REGEXP("[a-zá-ž]"))){ 
                -> MARK(CapitalizedAnnotation); 
            }
        }
        
        +1, +2 {

            (REGEXP("[A-ZÁ-Ž]") && NOT(REGEXP("[a-zá-ž]"))){ 
                -> MARK(CapitalizedAnnotation); 
            }
        }
    }
};

asked Mar 25 at 11:21 by Pavlína Z.

1 Answer

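Maybe something like this? MARKFAST annotates the dictionary hits first, and two further rules then mark the two capitalized words (CW) directly after or directly before each hit: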

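WORDLIST MyWordList = 'test_words.txt';
// Declaration of annotations for marked words
DECLARE MyAnnotation;
DECLARE CapitalizedAnnotation;

// Annotate every dictionary hit; MARKFAST is an action, so it is attached to a rule element
Document{-> MARKFAST(MyAnnotation, MyWordList)};

// Exactly two capitalized words directly after a dictionary hit
MyAnnotation CW[2,2]{-> CapitalizedAnnotation};
// Exactly two capitalized words directly before a dictionary hit; @ makes MyAnnotation the starting point of matching
CW[2,2]{-> CapitalizedAnnotation} @MyAnnotation;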
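
To check what rules like these should produce on sample text, the intended behaviour can be sketched outside Ruta. The following is plain Python, not Ruta, and only an approximation under stated assumptions: `DICTIONARY` is a hypothetical stand-in for the contents of test_words.txt, `find_capitalized_neighbours` is a made-up helper name, and "capitalized" is taken to mean that the first character is uppercase, which may not coincide exactly with Ruta's CW type.

```python
import re

# Hypothetical stand-in for the entries of test_words.txt.
DICTIONARY = ["Prime Minister"]


def is_capitalized(token: str) -> bool:
    # Assumption: "capitalized" means the first character is uppercase.
    return token[:1].isupper()


def find_capitalized_neighbours(text, dictionary=DICTIONARY, n=2):
    """Return the n-word runs directly before or after a dictionary hit
    in which every word starts with an uppercase letter."""
    tokens = re.findall(r"\w+", text)
    results = []
    for entry in dictionary:
        entry_tokens = entry.split()
        size = len(entry_tokens)
        for i in range(len(tokens) - size + 1):
            if tokens[i:i + size] != entry_tokens:
                continue
            before = tokens[max(0, i - n):i]
            after = tokens[i + size:i + size + n]
            for window in (before, after):
                # Require exactly n words, mirroring the [2,2] quantifier.
                if len(window) == n and all(is_capitalized(t) for t in window):
                    results.append(" ".join(window))
    return results


print(find_capitalized_neighbours("Prime Minister Margaret Thatcher resigned in 1990."))
```

For the example from the question this yields the single span "Margaret Thatcher". A sentence such as "the Prime Minister said no" yields nothing, because neither the words before nor the words after the hit form a run of two capitalized words.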
