sql - Compound index on database Log table: (Status, CreatedTime) or (CreatedTime, Status)? - Stack Overflow

I usually use a compound key (CreatedTime, Status) for my Log table, but I’m reconsidering this design. Since CreatedTime is typically very unique and Status only has 3-5 possible values, it seems that Status might not add much to further filtering after CreatedTime.

Most of my queries involve retrieving logs for a specific time range, optionally filtering or counting by Status. Conceptually, if I were working with a physical log book sorted by time, identifying entries with a specific Status (e.g., "Successful") would be cumbersome. On the other hand, having separate log books for each Status, all sorted by time, could make searching more efficient—though combining and re-sorting results for all Status values might complicate things. Does the database optimize for such scenarios?

I've already asked three different AIs about this, but their answers were vague and contradictory (even the same AI gives different answers when asked slightly differently), and I can't find much on Google or SO. Could someone confirm whether my intuition here is correct?

asked Mar 22 at 2:44 by Luke Vo, edited Mar 22 at 2:57 by Dale K
  • I'm not totally sure what you are asking, to be honest, but the answer is: it depends, specifically on the exact queries you end up using. The general rule of thumb is to index on the keys that give the most definition first, i.e. a date before a status. But depending on your queries, you might not even benefit from adding status to the index at all. Your best bet is to actually test it. – Dale K Commented Mar 22 at 3:01
  • How many records are we talking about here? If you use an index that has CreatedTime first, it's going to be efficient at retrieving date ranges, but if you want to count by status in addition to selecting the date range, it's going to scan all the index entries between the two dates. It definitely helps to include the status in the index. On the other hand, if you index by Status first and then CreatedTime, it's going to help with queries that filter by a single status and a date range. – boggy Commented Mar 22 at 3:08
2 Answers

Your indexes should be based on how you typically query your data. Since you typically query logs for a given date range, an index on CreatedTime would be most efficient. I doubt that the secondary key on Status makes a significant difference unless you query a very large number of logs within your date range and only a few match the status you want. Also, since you most likely do not have multiple logs of different statuses at the exact same time, the sub-index is not helping, because the engine has to scan all index records for the given timeframe anyway.

Indexing by (Status, CreatedTime) is not going to be significantly faster than (CreatedTime, Status) if you want logs of one status, and it will be less performant if you force the engine to scan through several statuses and consolidate the results.

Of secondary importance is how the data is added. Since you almost certainly add logs sequentially, indexing by (Status, CreatedTime) will be less efficient, because you'll quite often be inserting records in the middle rather than at the end, which makes inserts more expensive. Indexing on CreatedTime means you'll almost always be adding records to the end of your table, barring unusual activity like bulk imports of older logs.
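To see how an engine actually treats the two column orders, one option is to ask the planner directly. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in (the question does not name an engine), and the `log` table, its snake_case columns, the index name `ix_log`, and the helper `plan_for` are all hypothetical. Other engines report plans differently, so treat it as a way to frame your own test rather than as a result.

```python
import sqlite3


def plan_for(index_columns):
    """Return SQLite's plan for a time-range + status query when the
    (hypothetical) log table has one compound index on index_columns."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE log ("
        "id INTEGER PRIMARY KEY, created_time TEXT, status TEXT, message TEXT)"
    )
    con.execute(f"CREATE INDEX ix_log ON log ({index_columns})")
    rows = con.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT message FROM log "
        "WHERE created_time BETWEEN ? AND ? AND status = ?",
        ("2024-03-01", "2024-03-02", "Error"),
    ).fetchall()
    con.close()
    # The last column of each plan row is the human-readable description.
    return " | ".join(row[-1] for row in rows)


time_first = plan_for("created_time, status")
status_first = plan_for("status, created_time")
print("(created_time, status):", time_first)
print("(status, created_time):", status_first)
```

In SQLite, the time-first index is searched on the created_time range only, with status checked entry by entry inside that range, while the status-first index is searched on status plus the range. Whether that difference is noticeable depends on how many rows fall in the range and how selective the status is, so it is worth measuring on your own data and engine.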

I'd recommend 2 indexes:

  • CreatedTime

  • Status, CreatedTime where Status != 'OK' - a partial index

You'd want to query all logs for some time period and also, for example, all entries with Status=Error for some time period. Those two indexes will help you with both. The vast majority of log entries will likely have an OK status (or equivalent), so the second index will be much smaller.
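A minimal sketch of these two indexes, again using Python's built-in sqlite3 module purely as a stand-in engine. The table layout, the index names `ix_log_time` and `ix_log_problem`, the sample rows, and the `plan` helper are assumptions made for illustration. One SQLite-specific detail: its check for whether a query can use a partial index is simpler than in some other engines, so the query below repeats the literal `status != 'OK'` predicate to be safe.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE log (
    id INTEGER PRIMARY KEY,
    created_time TEXT NOT NULL,
    status TEXT NOT NULL,
    message TEXT
);
-- Index 1: serves "everything in a time range" queries.
CREATE INDEX ix_log_time ON log (created_time);
-- Index 2: partial index that leaves out the (presumably dominant) OK rows.
CREATE INDEX ix_log_problem ON log (status, created_time)
    WHERE status != 'OK';
""")

con.executemany(
    "INSERT INTO log (created_time, status, message) VALUES (?, ?, ?)",
    [
        ("2024-03-01 10:00:00", "OK", "started"),
        ("2024-03-01 10:00:05", "Error", "disk full"),
        ("2024-03-01 10:00:09", "OK", "retried"),
        ("2024-03-02 08:30:00", "Warning", "slow response"),
        ("2024-03-02 08:31:00", "Error", "timeout"),
    ],
)


def plan(sql, params):
    """Return the description column of EXPLAIN QUERY PLAN as one string."""
    out = con.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in out)


time_range = ("2024-03-01 00:00:00", "2024-03-01 23:59:59")

all_sql = "SELECT message FROM log WHERE created_time BETWEEN ? AND ?"
# The literal status != 'OK' repeats the partial index's predicate so that
# SQLite can match the index; status = ? then picks the wanted status.
err_sql = (
    "SELECT message FROM log "
    "WHERE status != 'OK' AND status = ? AND created_time BETWEEN ? AND ?"
)

plan_all = plan(all_sql, time_range)
plan_err = plan(err_sql, ("Error",) + time_range)
errors = [row[0] for row in con.execute(err_sql, ("Error",) + time_range)]

print(plan_all)
print(plan_err)
print(errors)
```

The first plan should mention `ix_log_time` and the second `ix_log_problem`. Engines with a stronger predicate prover may not need the repeated `status != 'OK'` term, but check the plan rather than assume it.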


Storing logs in a database is generally not very cost-efficient. It can cause long backup and restore times, sudden increases in storage usage that lead to out-of-space errors, unplanned I/O spikes while the table is vacuumed, and other potential problems.
