2014-03-05

PostgreSQL performance improvement for WHERE, ORDER BY and LIMIT


PostgreSQL has a "strange" behaviour for the following database table and select.

Table:
CREATE TABLE example.person
(
  id bigint NOT NULL,
  lastname character varying(255) NOT NULL,
  CONSTRAINT person_pkey PRIMARY KEY (id)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE example.person
  OWNER TO postgres;

-- Index: example.person_lastname

-- DROP INDEX example.person_lastname;

CREATE INDEX person_lastname
  ON example.person
  USING btree
  (lastname);


Select:
select * from example.person where lastname ilike 'V%' order by lastname asc limit 1

select * from example.person where lastname ilike 'V%' order by lastname desc limit 1

The table has about 600 thousand entries and the query planner is absolutely equal as you can see:

explain select * from example.person where lastname ilike 'V%' order by lastname asc limit 1
"Limit  (cost=0.00..0.08 rows=1 width=61)"
"  ->  Index Scan using person_lastname on person  (cost=0.00..54683.17 rows=653276 width=61)"
"        Filter: ((lastname)::text ~~* 'V%'::text)"

explain select * from example.person where lastname ilike 'V%' order by lastname desc limit 1 
"Limit  (cost=0.00..0.08 rows=1 width=61)"
"  ->  Index Scan Backward using person_lastname on person  (cost=0.00..54683.17 rows=653276 width=61)"
"        Filter: ((lastname)::text ~~* 'V%'::text)"

The problem is that the first statement takes about 3.5 sec compared to the second one which only needs 0.6 sec.

The EXPLAIN ANALYSE shows different results which indicates that the database doesn't execute the queries similar.

explain analyse select * from example.person where lastname ilike 'V%' order by lastname asc limit 1
"Limit  (cost=0.00..0.08 rows=1 width=61) (actual time=3469.829..3469.829 rows=1 loops=1)"
"  ->  Index Scan using person_lastname on person  (cost=0.00..54683.17 rows=653276 width=61) (actual time=3469.827..3469.827 rows=1 loops=1)"
"        Filter: ((lastname)::text ~~* 'V%'::text)"
"Total runtime: 3469.853 ms"

explain analyse select * from example.person where lastname ilike 'V%' order by lastname desc limit 1
"Limit  (cost=0.00..0.08 rows=1 width=61) (actual time=670.210..670.210 rows=1 loops=1)"
"  ->  Index Scan Backward using person_lastname on person  (cost=0.00..54683.17 rows=653276 width=61) (actual time=670.208..670.208 rows=1 loops=1)"
"        Filter: ((lastname)::text ~~* 'V%'::text)"
"Total runtime: 670.233 ms"

The solution for the problem is a different index because PostgreSQL allows the definition of several operator classes. In that case the index would look like:

CREATE INDEX person_lastname
  ON example.person
  USING btree
  (lastname varchar_pattern_ops);

Now both select statements need 0.6 sec.


Tested with following PostgreSQL Databases:
  • 8.3
  • 9.2

No comments:

Post a Comment