jquery - How can I use a Regular Expression to replace everything except specific words in a string with Javascript - Stack Over

admin•2025-04-10 17:11:46•questions•Views: 0

Imagine you have a string like this: "This is a sentence with words."

I have an array of words like $wordList = ["sentence", "words"];

I want to highlight words that aren't on the list. Which means I need to find and replace everything else and I can't seem to crack how to do that (if it's possible) with RegEx.

If I want to match the words I can do something like:

text = text.replace(/(sentence|words)\b/g, '<mark>$&</mark>');

(which will wrap the matching words in "mark" tags and, assuming I have some CSS for <mark>, highlight them), and that works perfectly. But I need the opposite! I need it to basically select the entire string and then exclude the words listed. I've tried /^((?!sentence|words)*)*$/gm, but this gives me a strange infinity issue, I think because it's too open-ended.

Taking that original sentence, what I would hope to end up with is "<mark>This is a </mark>sentence<mark> with some </mark>words<mark>.</mark>"

Basically wrapping (via replace) everything except the words listed.

The closest I can seem to get is something like /^(?!sentence|words).*\b/igm which will successfully do it if a line starts with one of the words (ignoring that entire line).

So to summarize: 1) Take a string 2) take a list of words 3) replace everything in the string except the list of words.

Possible? (jQuery is loaded for something else already, so raw JS or jQuery are both acceptable).

asked Aug 18, 2017 at 23:00 by Michael MacDonald

Why is regex necessary? The suggested regex doesn't make any sense, (?!sentence|words)* shouldn't even be a valid construct (it's a lookahead, should not be quantified) and most of the answers barely use viable constructs for the problem. – Unihedron Commented Dec 14, 2017 at 22:00

4 Answers

Create the regex from the word list.
Then do a string replace with the regex.
(It's a tricky regex)

var wordList = ["sentence", "words"];

// join the array into a string using '|'.  
var str = wordList.join('|');
// finalize the string with a negative assertion
str = '\\W*(?:\\b(?!(?:' + str + ')\\b)\\w+\\W*|\\W+)+';

//create a regex from the string
var Rx = new RegExp( str, 'g' );
console.log( Rx ); 

var text = "%%%555This is a sentence with words, but not sentences ?!??!!...";
text = text.replace( Rx, '<mark>$&</mark>');

console.log( text );

Output

/\W*(?:\b(?!(?:sentence|words)\b)\w+\W*|\W+)+/g
<mark>%%%555This is a </mark>sentence<mark> with </mark>words<mark>, but not sentences ?!??!!...</mark>

Addendum

The regex above assumes the word list contains only word characters.
If that's not the case, you must match the words to advance the match position
past them. This is easily accomplished with a simplified regex and a callback function.

var wordList = ["sentence", "words", "won't"];

// join the array into a string using '|'.  
var str = wordList.join('|');
str = '([\\S\\s]*?)(\\b(?:' + str + ')\\b|$)';

//create a regex from the string
var Rx = new RegExp( str, 'g' );
console.log( Rx ); 

var text = "%%%555This is a sentence with words, but won't be sentences ?!??!!...";

// Use a callback to insert the 'mark'
text = text.replace(
        Rx,
        function(match, p1,p2)
        {
           var retStr = '';
           if ( p1.length > 0 )
              retStr = '<mark>' + p1 + '</mark>';
           return retStr + p2;
        }
      );

console.log( text );

Output

/([\S\s]*?)(\b(?:sentence|words|won't)\b|$)/g
<mark>%%%555This is a </mark>sentence<mark> with </mark>words<mark>, but </mark>won't<mark> be sentences ?!??!!...</mark>
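
If the word list can contain characters that are special in a regex (such as `.` or `+`), they should probably be escaped before being joined with `|`. The following is only a sketch of how the callback approach above could be wrapped up; `escapeRegExp` and `markExcept` are hypothetical names that do not appear in the answer, and the sketch assumes every listed word starts and ends with a word character so that `\b` still applies.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical wrapper around the callback approach shown above.
// Assumes each listed word begins and ends with a word character.
function markExcept(text, wordList) {
  const alternation = wordList.map(escapeRegExp).join('|');
  const rx = new RegExp('([\\S\\s]*?)(\\b(?:' + alternation + ')\\b|$)', 'g');
  // p1 is the text before a listed word (or before the end of the string),
  // p2 is the listed word itself (or empty at the end of the string).
  return text.replace(rx, (match, p1, p2) =>
    (p1.length > 0 ? '<mark>' + p1 + '</mark>' : '') + p2);
}

console.log(markExcept("This is a sentence with words.", ["sentence", "words"]));
// Escaping keeps "a.b" from also matching "axb":
console.log(markExcept("a.b axb", ["a.b"]));
```

Without the escaping step, the second call would treat the `.` in `a.b` as "any character" and leave `axb` unmarked as well.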

You could still perform the replacement on the positive matches, but reverse the closing/opening tag, and add an opening tag at the start and a closing one at the end of the string. I use here your regular expression which could be anything you want, so I'll assume it matches correctly what needs to be matched:

var text = "This is a sentence with words.";

text = "<mark>" + text.replace(/\b(sentence|words)\b/g, '</mark>$&<mark>') + "</mark>";

// If empty tags bother you, you can add:
text = text.replace(/<mark><\/mark>/g, "");

console.log(text);
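
To see why the optional cleanup step can matter, here is a small sketch of the same idea wrapped in a function and run on a string that starts and ends with a listed word. The name `markInverse` is hypothetical and is used only for illustration.

```javascript
// Hypothetical wrapper around the "reverse the tags" approach above.
// rx is assumed to be a global regex that matches the words to leave unmarked.
function markInverse(text, rx) {
  // Open a mark at the start, close/reopen it around every match, close it at the end.
  const wrapped = '<mark>' + text.replace(rx, '</mark>$&<mark>') + '</mark>';
  // Drop the empty pairs that appear when a match touches either end of the string.
  return wrapped.replace(/<mark><\/mark>/g, '');
}

const rx = /\b(sentence|words)\b/g;
console.log(markInverse("words first, then a sentence", rx));
console.log(markInverse("This is a sentence with words.", rx));
```

In the first call the intermediate string begins with `<mark></mark>words` and ends with `sentence<mark></mark>`, which is exactly what the second replacement removes.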

Time Complexity

In the comments below someone makes the point that the second replacement (which is optional) is a waste of time. But it has linear time complexity, as is illustrated in the following snippet, which charts the duration for increasing string sizes.

The X-axis represents the number of characters in the input string, and the Y-axis represents the number of milliseconds it takes to execute the replacement with /<mark><\/mark>/g on such an input string:

// Reserve memory for the longest string
const s = '<mark></mark>' + '<mark>x</mark>'.repeat(2000),
    regex = /<mark><\/mark>/g,
    millisecs = {};
// Collect timings for several string sizes:
for (let size = 100; size < 25000; size+=100) {
	millisecs[size] = test(15, 8, _ => s.substr(0, size).replace(regex, ''));
}
// Show results in a chart:
chartFunction(canvas, millisecs, "len", "ms");

// Utilities
function test(countPerRun, runs, f) {
    let fastest = Infinity;
    for (let run = 0; run < runs; run++) {
        const started = performance.now();
        for (let i = 0; i < countPerRun; i++) f();
        // Keep the duration of the fastest run:
        fastest = Math.min(fastest, (performance.now() - started) / countPerRun);
    }
    return fastest;
}

function chartFunction(canvas, y, labelX, labelY) {
    const ctx = canvas.getContext('2d'),
        axisPix = [40, 20],
        largeY = Object.values(y).sort( (a, b) => b - a )[
                    Math.floor(Object.keys(y).length / 10)
                ] * 1.3, // add 30% to value at the 90th percentile
        max = [+Object.keys(y).pop(), largeY],
        coeff = [(canvas.width-axisPix[0]) / max[0], (canvas.height-axisPix[1]) / max[1]],
        textAlignPix = [-8, -13];
    ctx.translate(axisPix[0], canvas.height-axisPix[1]);
    text(labelY + "/" + labelX, [-5, -13], [1, 1], false, 2);
    // Draw axis lines
    for (let dim = 0; dim < 2; dim++) {
        const c = coeff[dim], world = [c, 1];
        let interval = 10**Math.floor(Math.log10(60 / c));
        while (interval * c < 30) interval *= 2;
        if (interval * c > 60) interval /= 2;
        let decimals = ((interval+'').split('.')[1] || '').length;
        line([[0, 0], [max[dim], 0]], world, dim);
        for (let x = 0; x <= max[dim]; x += interval) {
            line([[x, 0], [x, -5]], world, dim);
            text(x.toFixed(decimals), [x, textAlignPix[1-dim]], world, dim, dim+1);
        }
    }
    // Draw function
    line(Object.entries(y), coeff);

    function translate(coordinates, world, swap) {
        return coordinates.map( p => {
            p = [p[0] * world[0], p[1] * world[1]];
            return swap ? p.reverse() : p;
        });
    }
    
    function line(coordinates, world, swap) {
        coordinates = translate(coordinates, world, swap);
        ctx.beginPath();
        ctx.moveTo(coordinates[0][0], -coordinates[0][1]);
        for (const [x, y] of coordinates.slice(1)) ctx.lineTo(x, -y);
        ctx.stroke();
    }

    function text(s, p, world, swap, align) { // align: 0=left,1=center,2=right
        const [[x, y]] = translate([p], world, swap);
        ctx.font = '9px courier';
        ctx.fillText(s, x - 2.5*align*s.length, 2.5-y);
    }
}

<canvas id="canvas" width="600" height="200"></canvas>

For each string size (which is incremented in steps of 100 characters), the time to run the regex 15 times is measured. This measurement is repeated 8 times and the duration of the fastest run is reported in the graph. On my PC the regex runs in 25µs on a string with 25 000 characters (consisting of <mark> tags). So not something to worry about ;-)

You may see some spikes in the chart (due to browser and OS interference), but the overall tendency is linear. Given that the main regex has linear time complexity, the overall time complexity is not negatively affected by it.

However that optional part can be performed without regular expression as follows:

if (text.substr(6, 7) === '</mark>') text = text.substr(13);
if (text.substr(-13, 6) === '<mark>') text = text.substr(0, text.length-13);

Due to how JavaScript engines deal with strings (immutable), this longer code runs in constant time.

Of course, it does not change the overall time complexity, which remains linear.
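
The same end-trimming can also be sketched with `startsWith`, `endsWith` and `slice`, which avoids counting character offsets by hand. This is just an equivalent formulation of the two `substr` lines above, not something the answer itself proposes; `trimEmptyMarks` is a hypothetical name.

```javascript
// Hypothetical helper: remove an empty <mark></mark> pair at either end of the string.
// Like the substr version above, it only looks at the two ends.
function trimEmptyMarks(text) {
  const empty = '<mark></mark>'; // 13 characters
  if (text.startsWith(empty)) text = text.slice(empty.length);
  if (text.endsWith(empty)) text = text.slice(0, -empty.length);
  return text;
}

console.log(trimEmptyMarks('<mark></mark>words<mark> first, then a </mark>sentence<mark></mark>'));
```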

I'm not sure if this will work for every case, but for the given string it does.

let s1 = "This is a sentence with words.";
let wordList = ["sentence", "words"];

let reg = new RegExp("([\\s\\S]*?)(" + wordList.join("|") + ")", "g");

console.log(s1.replace(reg, "<mark>$1</mark>$2"))

Do it the opposite way: Mark everything and unmark the matched words you have.

text = `<mark>${text.replace(/\b(sentence|words)\b/g, '</mark>$&<mark>')}</mark>`;

Negated regex is possible but inefficient for this. In fact regex is not the right tool. The viable method is to go through the strings and manually construct the end string:

var text = "This is a sentence with words.";
var wordlist = ["sentence", "words"];
var result = "";
var marked = false;
var nextIndex = 0;

while (nextIndex != -1) {
    var endIndex = text.indexOf(" ", nextIndex + 1);
    var substring = text.slice(nextIndex, endIndex == -1 ? text.length : endIndex);
    var contains = wordlist.some(word => substring.includes(word));
    if (!contains && !marked) {
        result += "<mark>";
        marked = true;
    }
    if (contains && marked) {
        result += "</mark>";
        marked = false;
    }
    result += substring;
    nextIndex = endIndex;
}

if (marked) {
    result += "</mark>";
}
text = result;
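
A related way to build the result piece by piece, sketched here as an alternative rather than as part of the answer above, is `String.prototype.split` with a capturing group: the captured words are kept in the resulting array at the odd indexes, so everything at an even index is text that should be marked. The name `markBySplit` is hypothetical, and the sketch assumes the word list contains only word characters.

```javascript
// Hypothetical helper: split on the listed words and wrap the pieces in between.
// Assumes wordList contains only word characters (no regex escaping is done).
function markBySplit(text, wordList) {
  // The capturing group makes split() keep the matched words in the result.
  const rx = new RegExp('\\b(' + wordList.join('|') + ')\\b');
  return text
    .split(rx)
    // Odd indexes are listed words; even indexes are the text between them.
    .map((part, i) => (i % 2 === 1 || part === '') ? part : '<mark>' + part + '</mark>')
    .join('');
}

console.log(markBySplit("This is a sentence with words.", ["sentence", "words"]));
```

Because empty pieces are skipped, no empty mark tags are produced when the string starts or ends with a listed word.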

Published by admin. Please credit the source when reposting: http://www.yc00.com/questions/1743753900a4501405.html
