Keeping your data work on the server using UNION (MySQL)

Author: 蕊儿巧乐-滋 | Source: Internet | 2018-04-19 03:46


I have found myself using UNION in MySQL more and more lately. In this example, I am using it to speed up queries that use IN clauses. MySQL handles the IN clause like a big OR operation. Recently, I wrote what looks like a very crazy query using UNION that in fact helped our MySQL servers perform much better.

With any technology you use, you have to ask yourself, "What is this tech good at doing?" For me, MySQL has always been excellent at running lots of small queries that use primary, unique, or well-defined covering indexes. I guess most databases are good at that. Perhaps that is the bare minimum for any database. MySQL seems to excel at it, however. We had a query that looked like this:

select category_id, count(*) from some_table
where
article_id in (1,2,3,4,5,6,7,8,9) and
category_id in (11,22,33,44,55,66,77,88,99) and
some_date_time > now() - interval 30 day
group by
category_id

There were more conditions in the WHERE clause; I am not including them all in these examples. MySQL does not have a lot it can do with that query. Maybe there is a key on the date field it can use, and if the date field limits the possible rows, a scan of those rows will be quick. That was not the case here. We were asking for a lot of data to be scanned. Depending on how many items were in the IN clauses, this query could take as much as 800 milliseconds to return. Our goal at DealNews is to have all pages generate in under 300 milliseconds. So this one query alone could take more than 2.5 times our budget for the whole page.

In case you were wondering what this query is used for, it calculates the counts of items in sub categories on our category navigation pages. On those pages, it feeds the box on the left-hand side labeled "Category". The numbers next to each category are what we are asking this query to return.

Because I know how my data is stored and structured, I can fix this slow query. I happen to know that there are far fewer rows for each article_id value than there are for each category_id value. There is also a key on this table on article_id and some_date_time. That means, for a single article_id, MySQL can find the rows it wants very quickly. Without using a UNION, the only solution would be to run one query per article_id in a loop in application code, fetch all the results, and reassemble them there. That is a lot of wasted round trip work for the application, however. You see this pattern a fair amount in PHP code. It is one of my pet peeves. I have written before about keeping the data on the server. The same idea applies here. I turned the above query into this:

select category_id, sum(count) as count from 
(
	(
		select category_id, count(*) as count from some_table
		where
			article_id=1 and
			category_id in (11,22,33,44,55,66,77,88,99) and
			some_date_time > now() - interval 30 day
		group by
			category_id
	)
	union all
	(
		select category_id, count(*) as count from some_table
		where
			article_id=2 and
			category_id in (11,22,33,44,55,66,77,88,99) and
			some_date_time > now() - interval 30 day
		group by
			category_id
	)
	union all
	(
		select category_id, count(*) as count from some_table
		where
			article_id=3 and
			category_id in (11,22,33,44,55,66,77,88,99) and
			some_date_time > now() - interval 30 day
		group by
			category_id
	)
	union all
	(
		select category_id, count(*) as count from some_table
		where
			article_id=4 and
			category_id in (11,22,33,44,55,66,77,88,99) and
			some_date_time > now() - interval 30 day
		group by
			category_id
	)
	union all
	(
		select category_id, count(*) as count from some_table
		where
			article_id=5 and
			category_id in (11,22,33,44,55,66,77,88,99) and
			some_date_time > now() - interval 30 day
		group by
			category_id
	)
	union all
	(
		select category_id, count(*) as count from some_table
		where
			article_id=6 and
			category_id in (11,22,33,44,55,66,77,88,99) and
			some_date_time > now() - interval 30 day
		group by
			category_id
	)
	union all
	(
		select category_id, count(*) as count from some_table
		where
			article_id=7 and
			category_id in (11,22,33,44,55,66,77,88,99) and
			some_date_time > now() - interval 30 day
		group by
			category_id
	)
	union all
	(
		select category_id, count(*) as count from some_table
		where
			article_id=8 and
			category_id in (11,22,33,44,55,66,77,88,99) and
			some_date_time > now() - interval 30 day
		group by
			category_id
	)
	union all
	(
		select category_id, count(*) as count from some_table
		where
			article_id=9 and
			category_id in (11,22,33,44,55,66,77,88,99) and
			some_date_time > now() - interval 30 day
		group by
			category_id
	)
) derived_table
group by
	category_id

Pretty gnarly looking, huh? Yes, MySQL has to perform 9 subqueries and then the outer query. But because it can use good keys for the subqueries, the total execution time for this query is only 8ms. The data comes back from the database ready to use in one trip to the server. The page generation time for those pages went from a mean of 213ms with a standard deviation of 136ms to a mean of 196ms with a standard deviation of 81ms. That may not sound like a lot. Take a look at how much less work the MySQL servers are doing now.

[Figure: MySQL graph showing the decrease in rows read]

The arrow in the image is when I rolled the change out. Several other graphs show the change in server performance as well.
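Nobody would want to type nine nearly identical subqueries by hand, so a statement like this is normally assembled by the application from the list of article IDs. What follows is only a minimal sketch of that idea in Python, not the code we run. The function name build_union_count_query is hypothetical, and the "%s" placeholders assume a MySQL driver that uses that parameter style; the table and column names are the ones from the example query.

```python
from typing import List, Sequence, Tuple


def build_union_count_query(
    article_ids: Sequence[int], category_ids: Sequence[int]
) -> Tuple[str, List[int]]:
    """Return (sql, params) for the per-article UNION ALL count query.

    Hypothetical helper: it only builds the statement and never talks
    to a database.
    """
    if not article_ids or not category_ids:
        raise ValueError("article_ids and category_ids must not be empty")

    # One "%s" placeholder per category id, reused in every branch.
    # Assumption: the driver uses "%s" style parameters.
    category_marks = ", ".join(["%s"] * len(category_ids))

    # A single branch filters on exactly one article_id so that MySQL
    # can use the key on (article_id, some_date_time).
    branch = (
        "(select category_id, count(*) as count from some_table"
        " where article_id = %s"
        f" and category_id in ({category_marks})"
        " and some_date_time > now() - interval 30 day"
        " group by category_id)"
    )

    branches: List[str] = []
    params: List[int] = []
    for article_id in article_ids:
        branches.append(branch)
        # Parameter order must match placeholder order in the branch.
        params.append(article_id)
        params.extend(category_ids)

    # The outer query adds up the per-article counts for each category.
    sql = (
        "select category_id, sum(count) as count from (\n"
        + "\nunion all\n".join(branches)
        + "\n) derived_table group by category_id"
    )
    return sql, params
```

Calling build_union_count_query([1, 2, 3], [11, 22]) gives a statement with three branches and nine parameters, which can then be handed to the driver's execute call in one round trip. Passing the values as parameters instead of pasting them into the string keeps the builder safe to use with IDs that come from a request.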

UNION is a great way to keep your data on the server until it's ready to come back to your application. Do you think it can be of use to you in your application?
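If you want to try the rewrite on your own data, it is worth checking first that both query shapes return the same counts. The sketch below does that on a tiny made-up data set. It makes several assumptions: it uses an in-memory SQLite database from Python's standard library as a stand-in for MySQL, the rows are invented, and the date filter is left out to keep it short. SQLite does not accept parentheses around the individual branches of a UNION, so the branches are written without them here.

```python
import sqlite3

# Hypothetical sample rows: (article_id, category_id).
ROWS = [(1, 11), (1, 11), (1, 22), (2, 11), (2, 33), (3, 22), (4, 99), (9, 11)]
ARTICLE_IDS = [1, 2, 3]
CATEGORY_IDS = [11, 22, 33]


def marks(values):
    """Return one '?' placeholder per value, comma separated."""
    return ", ".join(["?"] * len(values))


def count_with_in(conn):
    """Original shape: two IN lists in a single grouped query."""
    sql = (
        "select category_id, count(*) from some_table"
        f" where article_id in ({marks(ARTICLE_IDS)})"
        f" and category_id in ({marks(CATEGORY_IDS)})"
        " group by category_id"
    )
    return dict(conn.execute(sql, ARTICLE_IDS + CATEGORY_IDS).fetchall())


def count_with_union(conn):
    """Rewritten shape: one branch per article_id, summed by the outer query."""
    branch = (
        "select category_id, count(*) as cnt from some_table"
        f" where article_id = ? and category_id in ({marks(CATEGORY_IDS)})"
        " group by category_id"
    )
    sql = (
        "select category_id, sum(cnt) from ("
        + " union all ".join([branch] * len(ARTICLE_IDS))
        + ") as derived_table group by category_id"
    )
    params = []
    for article_id in ARTICLE_IDS:
        params.append(article_id)
        params.extend(CATEGORY_IDS)
    return dict(conn.execute(sql, params).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("create table some_table (article_id integer, category_id integer)")
conn.executemany("insert into some_table values (?, ?)", ROWS)

in_counts = count_with_in(conn)
union_counts = count_with_union(conn)
print(in_counts == union_counts, union_counts)
```

The two results match because every row has exactly one article_id, so no row can be counted by more than one branch. That also means the list of article IDs must not contain duplicates: a repeated ID would add its rows to the sum twice. This check says nothing about speed, which depends on having a suitable key on the real MySQL table.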
