I have a strange (?) problem with ordering by a foreign key in Postgres. This is the second table and query I have that takes much longer with ORDER BY than without.
EXPLAIN ANALYZE SELECT "spoleczniak_zdjecia"."id", "spoleczniak_zdjecia"."postac_id", "spoleczniak_zdjecia"."zdjecie", "spoleczniak_zdjecia"."opis", "spoleczniak_zdjecia"."data", "spoleczniak_zdjecia"."avatar", "spoleczniak_zdjecia"."tagi", "postac_postacie"."id", "postac_postacie"."user_id", "postac_postacie"."avatar", "postac_postacie"."ikonka", "postac_postacie"."imie", "postac_postacie"."nazwisko", "postac_postacie"."pseudonim", "postac_postacie"."plec", "postac_postacie"."wzrost", "postac_postacie"."waga", "postac_postacie"."ur_tydz", "postac_postacie"."ur_rok", "postac_postacie"."ur_miasto_id", "postac_postacie"."akt_miasto_id", "postac_postacie"."kasa", "postac_postacie"."punkty", "postac_postacie"."zmeczenie", "postac_postacie"."zdrowie", "postac_postacie"."kariera" FROM "spoleczniak_zdjecia" INNER JOIN "taggit_taggeditem" ON ("spoleczniak_zdjecia"."id" = "taggit_taggeditem"."object_id") INNER JOIN "taggit_tag" ON ("taggit_taggeditem"."tag_id" = "taggit_tag"."id") INNER JOIN "postac_postacie" ON ("spoleczniak_zdjecia"."postac_id" = "postac_postacie"."id") WHERE ("taggit_tag"."slug" = 'ja' AND "taggit_taggeditem"."content_type_id" = 922 ) ORDER BY "spoleczniak_zdjecia"."id" DESC LIMIT 28;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=27.88..27.89 rows=7 width=198) (actual time=2984.689..2984.697 rows=28 loops=1)
   ->  Sort  (cost=27.88..27.89 rows=7 width=198) (actual time=2984.688..2984.692 rows=28 loops=1)
         Sort Key: spoleczniak_zdjecia.id
         Sort Method: top-N heapsort  Memory: 32kB
         ->  Nested Loop  (cost=2.31..27.78 rows=7 width=198) (actual time=1.063..2974.901 rows=9091 loops=1)
               ->  Nested Loop  (cost=2.31..22.02 rows=7 width=109) (actual time=1.057..2899.010 rows=9091 loops=1)
                     ->  Nested Loop  (cost=2.31..19.92 rows=7 width=4) (actual time=1.046..2848.853 rows=9103 loops=1)
                           ->  Index Scan using taggit_tag_slug on taggit_tag  (cost=0.00..4.27 rows=1 width=4) (actual time=0.025..0.027 rows=1 loops=1)
                                 Index Cond: ((slug)::text = 'ja'::text)
                           ->  Bitmap Heap Scan on taggit_taggeditem  (cost=2.31..15.56 rows=7 width=8) (actual time=1.019..2847.244 rows=9103 loops=1)
                                 Recheck Cond: (tag_id = taggit_tag.id)
                                 Filter: (content_type_id = 922)
                                 ->  Bitmap Index Scan on taggit_taggeditem_tag_id  (cost=0.00..2.31 rows=7 width=0) (actual time=0.954..0.954 rows=9103 loops=1)
                                       Index Cond: (tag_id = taggit_tag.id)
                     ->  Index Scan using spoleczniak_zdjecia_pkey on spoleczniak_zdjecia  (cost=0.00..0.29 rows=1 width=109) (actual time=0.005..0.005 rows=1 loops=9103)
                           Index Cond: (id = taggit_taggeditem.object_id)
               ->  Index Scan using postac_postacie_pkey on postac_postacie  (cost=0.00..0.81 rows=1 width=89) (actual time=0.007..0.007 rows=1 loops=9091)
                     Index Cond: (id = spoleczniak_zdjecia.postac_id)
 Total runtime: 2984.760 ms
And here it is without ORDER BY:
EXPLAIN ANALYZE SELECT "spoleczniak_zdjecia"."id", "spoleczniak_zdjecia"."postac_id", "spoleczniak_zdjecia"."zdjecie", "spoleczniak_zdjecia"."opis", "spoleczniak_zdjecia"."data", "spoleczniak_zdjecia"."avatar", "spoleczniak_zdjecia"."tagi", "postac_postacie"."id", "postac_postacie"."user_id", "postac_postacie"."avatar", "postac_postacie"."ikonka", "postac_postacie"."imie", "postac_postacie"."nazwisko", "postac_postacie"."pseudonim", "postac_postacie"."plec", "postac_postacie"."wzrost", "postac_postacie"."waga", "postac_postacie"."ur_tydz", "postac_postacie"."ur_rok", "postac_postacie"."ur_miasto_id", "postac_postacie"."akt_miasto_id", "postac_postacie"."kasa", "postac_postacie"."punkty", "postac_postacie"."zmeczenie", "postac_postacie"."zdrowie", "postac_postacie"."kariera" FROM "spoleczniak_zdjecia" INNER JOIN "taggit_taggeditem" ON ("spoleczniak_zdjecia"."id" = "taggit_taggeditem"."object_id") INNER JOIN "taggit_tag" ON ("taggit_taggeditem"."tag_id" = "taggit_tag"."id") INNER JOIN "postac_postacie" ON ("spoleczniak_zdjecia"."postac_id" = "postac_postacie"."id") WHERE ("taggit_tag"."slug" = 'ja' AND "taggit_taggeditem"."content_type_id" = 922 ) LIMIT 28;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2.31..27.78 rows=7 width=198) (actual time=1.113..1.482 rows=28 loops=1)
   ->  Nested Loop  (cost=2.31..27.78 rows=7 width=198) (actual time=1.112..1.477 rows=28 loops=1)
         ->  Nested Loop  (cost=2.31..22.02 rows=7 width=109) (actual time=1.102..1.292 rows=28 loops=1)
               ->  Nested Loop  (cost=2.31..19.92 rows=7 width=4) (actual time=1.092..1.145 rows=28 loops=1)
                     ->  Index Scan using taggit_tag_slug on taggit_tag  (cost=0.00..4.27 rows=1 width=4) (actual time=0.017..0.017 rows=1 loops=1)
                           Index Cond: ((slug)::text = 'ja'::text)
                     ->  Bitmap Heap Scan on taggit_taggeditem  (cost=2.31..15.56 rows=7 width=8) (actual time=1.072..1.118 rows=28 loops=1)
                           Recheck Cond: (tag_id = taggit_tag.id)
                           Filter: (content_type_id = 922)
                           ->  Bitmap Index Scan on taggit_taggeditem_tag_id  (cost=0.00..2.31 rows=7 width=0) (actual time=0.989..0.989 rows=9103 loops=1)
                                 Index Cond: (tag_id = taggit_tag.id)
               ->  Index Scan using spoleczniak_zdjecia_pkey on spoleczniak_zdjecia  (cost=0.00..0.29 rows=1 width=109) (actual time=0.004..0.005 rows=1 loops=28)
                     Index Cond: (id = taggit_taggeditem.object_id)
         ->  Index Scan using postac_postacie_pkey on postac_postacie  (cost=0.00..0.81 rows=1 width=89) (actual time=0.005..0.005 rows=1 loops=28)
               Index Cond: (id = spoleczniak_zdjecia.postac_id)
 Total runtime: 1.562 ms
What could cause this problem? Is it the query? The config? Is there any particular setting I should check? My last question involved a more complex query, but this one is not complex at all. Any suggestions?
By the way, the query is generated by Django (django-taggit, to be precise). And by the way, part II: the hardware is not poor at all (i7, 16 GB of RAM, RAID 10 3x2 for OS and data + 2 RAID1 disks for WAL, 512 MB of RAID cache + BBU).
Plain text query:
SELECT "spoleczniak_zdjecia"."id", "spoleczniak_zdjecia"."postac_id", "spoleczniak_zdjecia"."zdjecie", "spoleczniak_zdjecia"."opis", "spoleczniak_zdjecia"."data", "spoleczniak_zdjecia"."avatar", "spoleczniak_zdjecia"."tagi", "postac_postacie"."id", "postac_postacie"."user_id", "postac_postacie"."avatar", "postac_postacie"."ikonka", "postac_postacie"."imie", "postac_postacie"."nazwisko", "postac_postacie"."pseudonim", "postac_postacie"."plec", "postac_postacie"."wzrost", "postac_postacie"."waga", "postac_postacie"."ur_tydz", "postac_postacie"."ur_rok", "postac_postacie"."ur_miasto_id", "postac_postacie"."akt_miasto_id", "postac_postacie"."kasa", "postac_postacie"."punkty", "postac_postacie"."zmeczenie", "postac_postacie"."zdrowie", "postac_postacie"."kariera" FROM "spoleczniak_zdjecia" INNER JOIN "taggit_taggeditem" ON ("spoleczniak_zdjecia"."id" = "taggit_taggeditem"."object_id") INNER JOIN "taggit_tag" ON ("taggit_taggeditem"."tag_id" = "taggit_tag"."id") INNER JOIN "postac_postacie" ON ("spoleczniak_zdjecia"."postac_id" = "postac_postacie"."id") WHERE ("taggit_tag"."slug" = 'ja' AND "taggit_taggeditem"."content_type_id" = 922 ) ORDER BY "spoleczniak_zdjecia"."id" DESC LIMIT 28;
The difference is right here in the second line of the EXPLAIN output:
-> Sort (cost=27.88..27.89 rows=7 width=198) (actual time=2984.688..2984.692 rows=28 loops=1)
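Notice that the "actual time" on the Sort node is pretty much the entire time of the query. Sorting requires not only a bunch of comparisons (i.e. the cost of sorting anything) but also extra data management: the server needs to copy some data (rows or pointers to rows) to a temporary location so that it can be sorted without disturbing anything else. Keep in mind that a node's actual time includes the time spent in its children, and a sort cannot return its first row until it has read all of its input. Here that means all 9091 joined rows have to be fetched before the top 28 can be picked, whereas the plan without ORDER BY stops after 28. The top-N heapsort itself (32kB of memory) accounts for only about 10 ms of the total (2984.688 ms at the Sort versus 2974.901 ms at the Nested Loop beneath it).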
Any query will take longer with sorting unless you get lucky: your sort order has to match an order the rows can already be read in (on disk or through an index), and the optimizer has to notice that they match up.
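To make the effect of ORDER BY on LIMIT concrete, here is a small Python sketch. It is only an analogy, not how PostgreSQL is implemented: `scan` is a hypothetical stand-in for the join output, the row counts (9091 and 28) are taken from the plans in the question, and `heapq.nlargest` plays the role of the top-N heapsort.

```python
import heapq
from itertools import islice

TOTAL_ROWS = 9091  # rows produced by the joins in the slow plan
LIMIT = 28


def scan(counter):
    """Lazily yield row ids, counting how many rows the consumer pulls."""
    for row_id in range(1, TOTAL_ROWS + 1):
        counter["fetched"] += 1
        yield row_id


# LIMIT 28 without ORDER BY: stop pulling rows as soon as 28 have arrived.
unordered = {"fetched": 0}
first_rows = list(islice(scan(unordered), LIMIT))

# ORDER BY id DESC LIMIT 28: a bounded heap keeps only 28 rows in memory,
# but it still has to look at every input row to know which 28 are largest.
ordered = {"fetched": 0}
top_rows = heapq.nlargest(LIMIT, scan(ordered))

print("rows fetched without ORDER BY:", unordered["fetched"])
print("rows fetched with ORDER BY:   ", ordered["fetched"])
```

The heap stays small in both memory and comparisons; what differs is how many rows the lower nodes have to produce, which is where the time goes in the slow plan.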
The second one returns the first 28 records it finds, regardless of order.
The first one has to order the results and THEN return the first 28 records.
If the data is not modified, the query with the ORDER BY will return the same 28 records every time.
But the second query can return 28 different records each time you execute it. The result is not guaranteed.
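A rough Python illustration of that last point, under the assumption that the same set of rows may be handed back in a different physical order (the two "layouts" below are hypothetical, as are the helper names `first_n` and `top_n`): the ordered top 28 does not depend on the layout, while the unordered first 28 does.

```python
import heapq
from itertools import islice

LIMIT = 28
rows = list(range(1, 9092))  # 9091 hypothetical ids


def first_n(physical_order):
    """LIMIT without ORDER BY: whatever happens to come first."""
    return list(islice(physical_order, LIMIT))


def top_n(physical_order):
    """ORDER BY id DESC LIMIT: the 28 largest ids, whatever the layout."""
    return heapq.nlargest(LIMIT, physical_order)


layout_a = rows        # one possible physical order
layout_b = rows[::-1]  # same rows, read in a different order

same_ordered = top_n(layout_a) == top_n(layout_b)
same_unordered = first_n(layout_a) == first_n(layout_b)

print("ORDER BY result identical across layouts:", same_ordered)
print("plain LIMIT result identical across layouts:", same_unordered)
```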