Lets create 2 tables with test data:
1)
create table test_users(
id serial primary key
);
insert into test_users
select * from generate_series(1,750000);
2)
create table test_posts(
id bigserial primary key,
user_id INT
);
CREATE index idx ON test_posts(user_id, id);
INSERT INTO test_posts (user_id) SELECT 2 FROM generate_series (1,30038);
INSERT INTO test_posts (user_id) SELECT 3 FROM generate_series (1,31036);
INSERT INTO test_posts (user_id) SELECT 4 FROM generate_series (1,24032);
INSERT INTO test_posts (user_id) SELECT 5 FROM generate_series (1,93094);
INSERT INTO test_posts (user_id) SELECT 6 FROM generate_series (1,31033);
INSERT INTO test_posts (user_id) SELECT 7 FROM generate_series (1,62063);
INSERT INTO test_posts (user_id) SELECT 724309 FROM generate_series (1,2);
INSERT INTO test_posts (user_id) SELECT 724306 FROM generate_series (1,1);
Now, lets check count of rows and for example post max id
per user, from test_posts
table:
SELECT user_id, count(*) , max(id) as maxid
FROM test_posts
GROUP BY user_id
order by count(*)
so, user_id = 724306
have only one post and (in my case) post maxid is: 271299
.
user_id = 5
have most posts in table and maxid in my case is: 178200
Now I run this query:
SELECT test_posts.id
FROM test_posts
JOIN test_users
ON test_posts.user_id = test_users.id
WHERE test_posts.id = 271299
explain analyze
shows:
"Merge Join (cost=8.88..9.06 rows=1 width=8) (actual time=146.561..146.563 rows=1 loops=1)"
" Merge Cond: (test_users.id = test_posts.user_id)"
" -> Index Only Scan using test_users_pkey on test_users (cost=0.42..22808.42 rows=750000 width=4) (actual time=0.012..110.091 rows=724307 loops=1)"
" Heap Fetches: 724307"
" -> Sort (cost=8.45..8.46 rows=1 width=12) (actual time=0.012..0.013 rows=1 loops=1)"
" Sort Key: test_posts.user_id"
" Sort Method: quicksort Memory: 25kB"
" -> Index Scan using test_posts_pkey on test_posts (cost=0.42..8.44 rows=1 width=12) (actual time=0.007..0.007 rows=1 loops=1)"
" Index Cond: (id = 271299)"
"Planning time: 0.214 ms"
"Execution time: 146.588 ms"
Now run same query, just use 178200:
SELECT test_posts.id
FROM test_posts
JOIN test_users
ON test_posts.user_id = test_users.id
WHERE test_posts.id = 178200
explain analyze
from this:
"Merge Join (cost=8.88..9.06 rows=1 width=8) (actual time=0.040..0.042 rows=1 loops=1)"
" Merge Cond: (test_users.id = test_posts.user_id)"
" -> Index Only Scan using test_users_pkey on test_users (cost=0.42..22808.42 rows=750000 width=4) (actual time=0.016..0.018 rows=6 loops=1)"
" Heap Fetches: 6"
" -> Sort (cost=8.45..8.46 rows=1 width=12) (actual time=0.017..0.018 rows=1 loops=1)"
" Sort Key: test_posts.user_id"
" Sort Method: quicksort Memory: 25kB"
" -> Index Scan using test_posts_pkey on test_posts (cost=0.42..8.44 rows=1 width=12) (actual time=0.011..0.012 rows=1 loops=1)"
" Index Cond: (id = 178200)"
"Planning time: 0.249 ms"
"Execution time: 0.072 ms"
What causes that difference in execution time ?
Aucun commentaire:
Enregistrer un commentaire