I have been taking a dump of telephone number assignment on a weekly basis and then imported the data into a MySQL table. A record would have at least the following relevant information: ph_index, ph_num, userid and last_update. I would like to create a MySQL statement that would delete duplicated records but would leave the earliest and latest records for each user alone.
For example: If a telephone number has been assigned only to user1, I only want the earliest assigned and the most recent records.
Initial data:
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
2, 999-999-9999, user1, 2000-01-08
3, 999-999-9999, user1, 2000-01-15
4, 999-999-9999, user1, 2000-01-21
After deleting the duplicates
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
4, 999-999-9999, user1, 2000-01-21
If a telephone number has been assigned to user1 for a period, then re-assigned to user2 for another period, and then back to user1; I want the first and last dates each period.
Initial data:
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
2, 999-999-9999, user1, 2000-01-08
3, 999-999-9999, user1, 2000-01-15
4, 999-999-9999, user2, 2000-01-21
5, 999-999-9999, user2, 2000-01-28
6, 999-999-9999, user2, 2000-02-04
7, 999-999-9999, user1, 2000-02-11
8, 999-999-9999, user1, 2000-02-18
9, 999-999-9999, user1, 2000-02-25
After deleting the duplicates
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
3, 999-999-9999, user1, 2000-01-15
4, 999-999-9999, user2, 2000-01-21
6, 999-999-9999, user2, 2000-02-04
7, 999-999-9999, user1, 2000-02-11
9, 999-999-9999, user1, 2000-02-25
I would like to be able to do this for all 2000+ telephone numbers in the table.
TIA
I have been taking a dump of telephone number assignment on a weekly basis and then imported the data into a MySQL table. A record would have at least the following relevant information: ph_index, ph_num, userid and last_update. I would like to create a MySQL statement that would delete duplicated records but would leave the earliest and latest records for each user alone.
For example: If a telephone number has been assigned only to user1, I only want the earliest assigned and the most recent records.
Initial data:
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
2, 999-999-9999, user1, 2000-01-08
3, 999-999-9999, user1, 2000-01-15
4, 999-999-9999, user1, 2000-01-21
After deleting the duplicates
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
4, 999-999-9999, user1, 2000-01-21
If a telephone number has been assigned to user1 for a period, then re-assigned to user2 for another period, and then back to user1; I want the first and last dates each period.
Initial data:
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
2, 999-999-9999, user1, 2000-01-08
3, 999-999-9999, user1, 2000-01-15
4, 999-999-9999, user2, 2000-01-21
5, 999-999-9999, user2, 2000-01-28
6, 999-999-9999, user2, 2000-02-04
7, 999-999-9999, user1, 2000-02-11
8, 999-999-9999, user1, 2000-02-18
9, 999-999-9999, user1, 2000-02-25
After deleting the duplicates
ph_index, ph_num, userid and last_update
1, 999-999-9999, user1, 2000-01-01
3, 999-999-9999, user1, 2000-01-15
4, 999-999-9999, user2, 2000-01-21
6, 999-999-9999, user2, 2000-02-04
7, 999-999-9999, user1, 2000-02-11
9, 999-999-9999, user1, 2000-02-25
I would like to be able to do this for all 2000+ telephone numbers in the table.
TIA
Share Improve this question asked Mar 2 at 23:27 SingsiblawSingsiblaw 356 bronze badges 1- You'd delete a row when both more ancient and more recent row for this user exists. I.e. two WHERE EXISTS may solve. On 8+ you may enumerate the rows in CTE with ROW_NUMBER() with direct and backward ordering and delete the rows where both numbers are above 1. You may select MIN and MAX timestamps per user in the subquery and delete rows with all another timestamps for this user. You may imagine more methods. – Akina Commented Mar 3 at 5:31
1 Answer
Reset to default 3WITH cte AS (
SELECT
*,
(userid = LAG(userid, 1, '') OVER w_ph_num)
AND
(userid = LEAD(userid, 1, '') OVER w_ph_num)
AS deletion_flag
FROM dump
WINDOW w_ph_num AS (PARTITION BY ph_num ORDER BY last_update)
)
DELETE dump
FROM dump
JOIN cte ON cte.ph_index = dump.ph_index
WHERE cte.deletion_flag = 1
In MySQL since version 8.0.2, the LAG
and LEAD
window functions can be used to find the previous and next user for each phone number. If the user matches both, the record can be deleted.
ph_index | ph_num | userid | last_update | prev_user (LAG) | next_user (LEAD) | deletion_flag |
---|---|---|---|---|---|---|
1 | 999-999-9999 | user1 | 2000-01-01 | user1 | 0 | |
2 | 999-999-9999 | user1 | 2000-01-08 | user1 | user1 | 1 |
3 | 999-999-9999 | user1 | 2000-01-15 | user1 | user2 | 0 |
4 | 999-999-9999 | user2 | 2000-01-21 | user1 | user2 | 0 |
5 | 999-999-9999 | user2 | 2000-01-28 | user2 | user2 | 1 |
6 | 999-999-9999 | user2 | 2000-02-04 | user2 | user1 | 0 |
7 | 999-999-9999 | user1 | 2000-02-11 | user2 | user1 | 0 |
8 | 999-999-9999 | user1 | 2000-02-18 | user1 | user1 | 1 |
9 | 999-999-9999 | user1 | 2000-02-25 | user1 | 0 |
db<>fiddle
发布者:admin,转转请注明出处:http://www.yc00.com/questions/1745114676a4612059.html
评论列表(0条)