一、项目场景:

有张调用记录表有多个客户端的调用记录,每条调用记录有当次调用记录的详细信息
分页列表需要查出 每个客户端的调用次数、以及最新一次调用的信息。

这里举个栗子来模拟一下业务,创建一张评论表,里面存了每个用户的多个评论信息。SQL如下

CREATE TABLE `user_comment` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(10) unsigned NOT NULL,
  `contents` varchar(200) NOT NULL,
  `create_time` datetime NOT NULL,
  `update_time` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`),
  KEY `addtime` (`create_time`),
  KEY `uid_addtime` (`user_id`,`create_time`)
) ENGINE=InnoDB AUTO_INCREMENT=12 DEFAULT CHARSET=utf8;

插入数据

INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (1, 1, '评论1', '2017-05-17 00:00:00', '2017-05-17 00:00:00');
INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (2, 1, '评论2', '2017-05-17 00:00:01', '2017-05-17 00:00:01');
INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (3, 2, '评论1', '2017-05-17 00:00:02', '2017-05-17 00:00:02');
INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (4, 2, '评论2', '2017-05-17 00:00:03', '2017-05-17 00:00:03');
INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (5, 3, '评论1', '2017-05-17 00:00:04', '2017-05-17 00:00:04');
INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (6, 1, '评论3', '2017-05-17 00:00:05', '2017-05-17 00:00:05');
INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (7, 4, '评论1', '2017-05-17 00:00:06', '2017-05-17 00:00:06');
INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (8, 4, '评论2', '2017-05-17 00:00:07', '2017-05-17 00:00:07');
INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (9, 4, '评论3', '2017-05-17 00:00:08', '2017-05-17 00:00:08');
INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (10, 4, '评论4', '2017-05-17 00:00:09', '2017-05-17 00:00:09');
INSERT INTO `test`.`user_comment`(`id`, `user_id`, `contents`, `create_time`, `update_time`) VALUES (11, 3, '评论2', '2017-05-17 00:00:10', '2017-05-17 00:00:10');

查看数据结果

mysql> select * from user_comment;
+----+---------+----------+---------------------+---------------------+
| id | user_id | contents | create_time         | update_time         |
+----+---------+----------+---------------------+---------------------+
|  1 |       1 | 评论1    | 2017-05-17 00:00:00 | 2017-05-17 00:00:00 |
|  2 |       1 | 评论2    | 2017-05-17 00:00:01 | 2017-05-17 00:00:01 |
|  3 |       2 | 评论1    | 2017-05-17 00:00:02 | 2017-05-17 00:00:02 |
|  4 |       2 | 评论2    | 2017-05-17 00:00:03 | 2017-05-17 00:00:03 |
|  5 |       3 | 评论1    | 2017-05-17 00:00:04 | 2017-05-17 00:00:04 |
|  6 |       1 | 评论3    | 2017-05-17 00:00:05 | 2017-05-17 00:00:05 |
|  7 |       4 | 评论1    | 2017-05-17 00:00:06 | 2017-05-17 00:00:06 |
|  8 |       4 | 评论2    | 2017-05-17 00:00:07 | 2017-05-17 00:00:07 |
|  9 |       4 | 评论3    | 2017-05-17 00:00:08 | 2017-05-17 00:00:08 |
| 10 |       4 | 评论4    | 2017-05-17 00:00:09 | 2017-05-17 00:00:09 |
| 11 |       3 | 评论2    | 2017-05-17 00:00:10 | 2017-05-17 00:00:10 |
+----+---------+----------+---------------------+---------------------+
11 rows in set (0.05 sec)

二、问题描述:

现在要查询的是,每个用户的最新一条评论信息、以及统计用户总共评论了多少次。

这里很容易想到用 group by 去对 user_id 进行分组,得出总共有多少个用户评论,但是此时,非 group by字段的选择是怎么样的呢?

我的理解是,group by 对非分组的列,选择是最先遇到的一条记录。即每个分组里 id 最小的。所以,如果我们要拿到每个分组里最新的一条评论信息,则在分组前对评论进行排序即可,

SQL如下

SELECT
	c.*,
	count( c.user_id ) AS comment_count 
FROM
	( SELECT * FROM user_comment ORDER BY create_time DESC ) c 
GROUP BY
	c.user_id;

MYSQL 5.6 版本下执行的结果


mysql> select c.*, count(c.user_id) as comment_count from (select * from user_comment order by create_time desc )c group by c.user_id;
+----+---------+----------+---------------------+---------------------+---------------+
| id | user_id | contents | create_time         | update_time         | comment_count |
+----+---------+----------+---------------------+---------------------+---------------+
|  6 |       1 | 评论3    | 2017-05-17 00:00:05 | 2017-05-17 00:00:05 |             3 |
|  4 |       2 | 评论2    | 2017-05-17 00:00:03 | 2017-05-17 00:00:03 |             2 |
| 11 |       3 | 评论2    | 2017-05-17 00:00:10 | 2017-05-17 00:00:10 |             2 |
| 10 |       4 | 评论4    | 2017-05-17 00:00:09 | 2017-05-17 00:00:09 |             4 |
+----+---------+----------+---------------------+---------------------+---------------+
4 rows in set (0.08 sec)

MYSQL 5.7 版本下执行的结果

mysql> select c.*, count(c.user_id) as comment_count from (select * from user_comment order by create_time desc )c group by c.user_id;
+----+---------+----------+---------------------+---------------------+---------------+
| id | user_id | contents | create_time         | update_time         | comment_count |
+----+---------+----------+---------------------+---------------------+---------------+
|  1 |       1 | 评论1    | 2017-05-17 00:00:00 | 2017-05-17 00:00:00 |             3 |
|  3 |       2 | 评论1    | 2017-05-17 00:00:02 | 2017-05-17 00:00:02 |             2 |
|  5 |       3 | 评论1    | 2017-05-17 00:00:04 | 2017-05-17 00:00:04 |             2 |
|  7 |       4 | 评论1    | 2017-05-17 00:00:06 | 2017-05-17 00:00:06 |             4 |
+----+---------+----------+---------------------+---------------------+---------------+
4 rows in set (0.03 sec)

这里就踩了个坑,在 GROUP BY 实现组内排序的时候,发现同一条SQL 在不同的 MYSQL版本下会查询出不同的数据。本次测试版本为 MYSQL 5.6MYSQL 5 .7,结果刚好相反。在5.6下是可以拿到正确结果的,而更高版本不行,所以要换一种实现方式。


三、原因分析:

MYSQL认为 group by是不稳定的,所以要消除这种不稳定性,要明确聚合分组的列。在MYSQL 5.7 版本对 group by 做了调整。详情这里不阐述,移步百度自行搜索。

四、解决方案:

注:网上有网友提出,在上面sql的基础上,在内循环加上 limit 即可解决,
我个人认为这样有点鸡肋,因为不知道表的数据有多少。
网友这种方式实现如下:

SELECT
	c.*,
	count( c.user_id ) AS comment_count 
FROM
	( SELECT * FROM user_comment ORDER BY create_time DESC LIMIT 1000 ) c 
GROUP BY
	c.user_id;
mysql> select c.*, count(c.user_id) as comment_count from (select * from user_comment order by create_time desc limit 1000 )c group by c.user_id;
+----+---------+----------+---------------------+---------------------+---------------+
| id | user_id | contents | create_time         | update_time         | comment_count |
+----+---------+----------+---------------------+---------------------+---------------+
|  6 |       1 | 评论3    | 2017-05-17 00:00:05 | 2017-05-17 00:00:05 |             3 |
|  4 |       2 | 评论2    | 2017-05-17 00:00:03 | 2017-05-17 00:00:03 |             2 |
| 11 |       3 | 评论2    | 2017-05-17 00:00:10 | 2017-05-17 00:00:10 |             2 |
| 10 |       4 | 评论4    | 2017-05-17 00:00:09 | 2017-05-17 00:00:09 |             4 |
+----+---------+----------+---------------------+---------------------+---------------+
4 rows in set (0.04 sec)

如上可以看到得到了我们想要的结果,也和MYSQL 5.7执行的一致。但是这种方式不太行,我们另寻途径。

1)组内排序的另一种实现方式

如果是自增id的话,id最大,则记录最新。如果不是自增,则可以用创建时间来区分。巧用 Max(xxx)函数来实现组内排序。

SQL如下

SELECT
	* 
FROM
	user_comment 
WHERE
	id IN ( SELECT MAX( id ) FROM user_comment GROUP BY user_id ) 
ORDER BY
	user_id

效果和上述方式一致,也可以达到组内排序的效果。

+----+---------+----------+---------------------+---------------------+
| id | user_id | contents | create_time         | update_time         |
+----+---------+----------+---------------------+---------------------+
|  6 |       1 | 评论3    | 2017-05-17 00:00:05 | 2017-05-17 00:00:05 |
|  4 |       2 | 评论2    | 2017-05-17 00:00:03 | 2017-05-17 00:00:03 |
| 11 |       3 | 评论2    | 2017-05-17 00:00:10 | 2017-05-17 00:00:10 |
| 10 |       4 | 评论4    | 2017-05-17 00:00:09 | 2017-05-17 00:00:09 |
+----+---------+----------+---------------------+---------------------+
4 rows in set (0.04 sec)

注意: 写到这里就达到组内排序的目标了。但是还没达到我的要求。因为就是这样处理后,没法再做统计汇总了,所以要达到目的还是要改写。

最终SQL如下:

SELECT
	c2.*,
	COUNT( c1.user_id ) AS comment_count 
FROM
	user_comment c1
	RIGHT JOIN ( SELECT * FROM user_comment WHERE id IN ( SELECT MAX( id ) FROM user_comment GROUP BY user_id ) ) c2 ON c1.user_id = c2.user_id 
GROUP BY
	c2.user_id

查询结果

mysql> SELECT
	c2.*,
	COUNT( c1.user_id ) AS comment_count 
FROM
	user_comment c1
	RIGHT JOIN ( SELECT * FROM user_comment WHERE id IN ( SELECT MAX( id ) FROM user_comment GROUP BY user_id ) ) c2 ON c1.user_id = c2.user_id 
GROUP BY
	c2.user_id;
+----+---------+----------+---------------------+---------------------+---------------+
| id | user_id | contents | create_time         | update_time         | comment_count |
+----+---------+----------+---------------------+---------------------+---------------+
|  6 |       1 | 评论3    | 2017-05-17 00:00:05 | 2017-05-17 00:00:05 |             3 |
|  4 |       2 | 评论2    | 2017-05-17 00:00:03 | 2017-05-17 00:00:03 |             2 |
| 11 |       3 | 评论2    | 2017-05-17 00:00:10 | 2017-05-17 00:00:10 |             2 |
| 10 |       4 | 评论4    | 2017-05-17 00:00:09 | 2017-05-17 00:00:09 |             4 |
+----+---------+----------+---------------------+---------------------+---------------+
4 rows in set (0.07 sec)

码字不易,如有帮助点个赞趴~~

Logo

为开发者提供学习成长、分享交流、生态实践、资源工具等服务,帮助开发者快速成长。

更多推荐