I have a MongoDB collection where I store values as concatenated object references in a single string field. The values are structured like this:
{
  "resource": {
    "fields": {
      "value": {
        "to": "parent_671f3db04b00e7efd82a6c5b;image_67d00437953155602e87b6fa;file_671f3db04b00e7efd82a6123"
      }
    }
  }
}
Currently, I am searching for specific ObjectId substrings within this field using a regex query:
{
  "resource.fields.value.to": {
    "$regex": "671f3db04b00e7efd82a6c5b",
    "$options": "i"
  }
}
This search works but is slow, especially on large datasets. I want to optimize the query performance by creating an index on this field.
Questions:
1. How can I create an index to speed up regex-based searches on this field?
2. Would a full-text index help in this case, or is there a better approach?
3. Are there alternative ways to structure my data to make queries more efficient? (not optimal)
asked Mar 11 at 13:44 by Jovan Dimov
1 Answer
From the $regex docs: "Index use and performance for $regex queries varies depending on whether the query is case-sensitive or case-insensitive. ... Case-insensitive indexes typically do not improve performance for $regex queries. The $regex implementation is not collation-aware and cannot utilize case-insensitive indexes efficiently."
Regarding
How can I create an index to speed up regex-based searches on this field?
1. If you can store the resource.fields.value.to field in lowercase and lower-case the id before searching, you can drop the case-insensitive option ("i"). A case-sensitive regex can use an index on the field: MongoDB matches the pattern against the index keys instead of scanning the documents, although an unanchored pattern still has to examine every key.
2. If you can use prefix expressions with case-sensitive queries: "Further optimization can occur if the regular expression is a "prefix expression", which means that all potential matches start with the same string. This allows MongoDB to construct a "range" from that prefix and only match against those values from the index that fall within that range." In the concatenated string, only the first reference starts the value, so this helps only when you search for that one.
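The two points above can be sketched as a small filter builder. This is only a sketch under assumptions: `buildIdFilter`, its `prefix` argument, and the `FIELD` constant are hypothetical names, the stored field is assumed to be lower-cased already, and ids are assumed to be 24-character hex strings.

```javascript
// Hypothetical helper, not part of MongoDB: builds a case-sensitive $regex
// filter so the pattern can be matched against an index on the field.
// Assumes the stored field has already been lower-cased.
const FIELD = "resource.fields.value.to";

function buildIdFilter(id, prefix) {
  const needle = String(id).toLowerCase();
  // 24 hex characters contain no regex metacharacters, so no escaping is needed.
  if (!/^[0-9a-f]{24}$/.test(needle)) {
    throw new Error("expected a 24-character hex id, got: " + id);
  }
  // With a literal prefix such as "parent_", anchor the pattern so it becomes
  // a prefix expression; without one, fall back to an unanchored match.
  const pattern = prefix ? "^" + prefix + needle : needle;
  return { [FIELD]: { $regex: pattern } };
}

// Usage in mongosh, assuming an index created with
// db.collection.createIndex({ "resource.fields.value.to": 1 }):
// db.collection.find(buildIdFilter("671F3DB04B00E7EFD82A6C5B", "parent_"));
```

The anchored form only matches values that start with the given prefix, so on the concatenated string it covers the first reference only.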
3. Regarding
Are there alternative ways to structure my data to make queries more efficient? (not optimal)
This may be obvious, but you can split the concatenated string into one field per reference, like:
{
  "resource": {
    "fields": {
      "value": {
        "to": {
          "parent": "671f3db04b00e7efd82a6c5b",
          "image": "67d00437953155602e87b6fa",
          "file": "671f3db04b00e7efd82a6123"
        }
      }
    }
  }
}
Note that I have dropped the prefixes parent_, image_, file_ since the field name specifies which id it's for. You can then use a prefix regex or just standard equality checks with an $or clause, which will use the index. This one will require 3 indexes: one for each of the to fields. (You may also want to consider un-nesting the fields a bit, but that depends on your usage.)
The query then becomes:
db.collection.find({
  $or: [
    { "resource.fields.value.to.parent": "671f3db04b00e7efd82a6c5b" },
    { "resource.fields.value.to.image": "671f3db04b00e7efd82a6c5b" },
    { "resource.fields.value.to.file": "671f3db04b00e7efd82a6c5b" }
  ]
})
Mongo Playground
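For the one-field-per-reference layout, the three indexes and the $or filter could be generated from a single list of field names. This is a rough sketch: `REF_FIELDS`, `createRefIndexes` and `buildOrFilter` are hypothetical names, and `collection` is assumed to be anything that exposes `createIndex(keys)`, such as `db.collection` in mongosh.

```javascript
// Field paths follow the structure shown above; helper names are hypothetical.
const REF_FIELDS = ["parent", "image", "file"].map(
  (name) => "resource.fields.value.to." + name
);

// Creates one ascending single-field index per reference field and returns
// whatever createIndex returns for each (index names in mongosh).
function createRefIndexes(collection) {
  return REF_FIELDS.map((field) => collection.createIndex({ [field]: 1 }));
}

// Builds an $or filter with one equality check per reference field.
function buildOrFilter(id) {
  return { $or: REF_FIELDS.map((field) => ({ [field]: id })) };
}

// Usage in mongosh:
// createRefIndexes(db.collection);
// db.collection.find(buildOrFilter("671f3db04b00e7efd82a6c5b"));
```

Keeping the field list in one place means the indexes and the query cannot drift apart if another reference type is added later.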
An alternative structure is to split the to field into an array of separate strings:
{
  "resource": {
    "fields": {
      "value": {
        "to": [
          "parent_671f3db04b00e7efd82a6c5b",
          "image_67d00437953155602e87b6fa",
          "file_671f3db04b00e7efd82a6123"
        ]
      }
    }
  }
}
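One way to get from the concatenated string to this array is a small split helper. This is a sketch only: `splitRefs` is a hypothetical name, and the commented migration loop is an assumption about how you might apply it, not something from the question.

```javascript
// Hypothetical helper: turns "a;b;c" into ["a", "b", "c"], trimming
// whitespace and dropping empty segments. Arrays are returned unchanged so
// the helper is safe to run again on already-converted documents.
function splitRefs(value) {
  if (Array.isArray(value)) return value;
  return String(value)
    .split(";")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Possible one-off migration in mongosh (try it on a copy of the data first).
// Note that { $type: "string" } also matches arrays that contain strings,
// which is why splitRefs passes arrays through untouched.
// db.collection.find({ "resource.fields.value.to": { $type: "string" } }).forEach((doc) => {
//   db.collection.updateOne(
//     { _id: doc._id },
//     { $set: { "resource.fields.value.to": splitRefs(doc.resource.fields.value.to) } }
//   );
// });
```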
And to query it, you can use an $or or $in clause. Note that you will need to add each prefix for the equality check to work and take advantage of the index:
// with OR clause
db.collection.find({
  $or: [
    { "resource.fields.value.to": "parent_671f3db04b00e7efd82a6c5b" },
    { "resource.fields.value.to": "image_671f3db04b00e7efd82a6c5b" },
    { "resource.fields.value.to": "file_671f3db04b00e7efd82a6c5b" }
  ]
})
// with IN clause
db.collection.find({
  "resource.fields.value.to": {
    $in: [
      "parent_671f3db04b00e7efd82a6c5b",
      "image_671f3db04b00e7efd82a6c5b",
      "file_671f3db04b00e7efd82a6c5b"
    ]
  }
})
Mongo Playground 2A, with $or
Mongo Playground 2B, with $in
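For the array layout, the prefixed values for the $in clause can be generated rather than typed out. A minimal sketch, assuming the three prefixes from the question are the only ones in use; `TO_FIELD`, `PREFIXES` and `buildInFilter` are hypothetical names.

```javascript
// Hypothetical names; the prefixes are the ones that appear in the question.
const TO_FIELD = "resource.fields.value.to";
const PREFIXES = ["parent_", "image_", "file_"];

// Builds an $in filter containing the id once per known prefix, so each
// candidate is an exact string that an equality lookup can match.
function buildInFilter(id, prefixes = PREFIXES) {
  return { [TO_FIELD]: { $in: prefixes.map((prefix) => prefix + id) } };
}

// Usage in mongosh. A single index is enough here; MongoDB makes it a
// multikey index because the field holds an array.
// db.collection.createIndex({ "resource.fields.value.to": 1 });
// db.collection.find(buildInFilter("671f3db04b00e7efd82a6c5b"));
```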
Or as a single regex, with prefixed-regexes OR'ed internally - but check if this uses the index:
db.collection.find({
  "resource.fields.value.to": {
    "$regex": "^parent_671f3db04b00e7efd82a6c5b|^image_671f3db04b00e7efd82a6c5b|^file_671f3db04b00e7efd82a6c5b"
  }
})
Mongo Playground 3
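To check whether a query like this uses the index, you can inspect the output of explain(). The helper below is a rough sketch: `collectStages` and `usesIndexScan` are hypothetical names, and it assumes the explain output has a `queryPlanner.winningPlan` object whose nested nodes carry a `stage` string such as "IXSCAN" or "COLLSCAN".

```javascript
// Hypothetical helper: walks any nested explain() structure and collects
// every "stage" string it finds, whatever the nesting looks like.
function collectStages(node, found = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectStages(child, found));
  } else if (node !== null && typeof node === "object") {
    if (typeof node.stage === "string") found.push(node.stage);
    Object.values(node).forEach((child) => collectStages(child, found));
  }
  return found;
}

// True when the winning plan contains an index scan and no collection scan.
function usesIndexScan(explainOutput) {
  const stages = collectStages(explainOutput.queryPlanner.winningPlan);
  return stages.includes("IXSCAN") && !stages.includes("COLLSCAN");
}

// Usage in mongosh:
// usesIndexScan(db.collection.find({ "resource.fields.value.to": { $regex: "^parent_671f3db04b00e7efd82a6c5b" } }).explain());
```

An index scan alone does not guarantee a fast query. Running explain("executionStats") and comparing totalKeysExamined with the number of returned documents shows whether the regex still walks most of the index.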
$regex queries varies depending on whether the query is case-sensitive or case-insensitive. ... Case-insensitive indexes typically do not improve performance for $regex queries. The $regex implementation is not collation-aware and cannot utilize case-insensitive indexes efficiently. – aneroid, commented Mar 11 at 13:52