Discussion:
Make UPDATE query quicker?
(too old to reply)
James David Smith
2012-10-12 10:16:00 UTC
Permalink
Hi all,

Wondered if someone had any tips about how to do this UPDATE query
quicker. I've got two tables:

CREATE TABLE import_table
( date_time TIMESTAMP
person_id, INTEGER
data REAL)

CREATE TABLE master_table
(date_time TIMESTAMP
person_id INTEGER
data REAL)

Each table has 172800 rows in it.

I want to move the 'data' from the import_table to the master_table by
matching on both the date_time and the person_id. I do this with this
query:

UPDATE master_table a
SET data =
(SELECT b.date
FROM import_table b
WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
b.data IS NOT NULL)

However I need to do this about 20 times, and the first time is still
running at the moment (about 15 minutes).

Am I doing something wrong? Should I put some indexes on the columns
somehow to improve performance?

Thanks

James
--
Sent via pgsql-novice mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice
Michael Wood
2012-10-12 11:53:55 UTC
Permalink
Post by James David Smith
Hi all,
Wondered if someone had any tips about how to do this UPDATE query
CREATE TABLE import_table
( date_time TIMESTAMP
person_id, INTEGER
data REAL)
CREATE TABLE master_table
(date_time TIMESTAMP
person_id INTEGER
data REAL)
Each table has 172800 rows in it.
I want to move the 'data' from the import_table to the master_table by
matching on both the date_time and the person_id. I do this with this
UPDATE master_table a
SET data =
(SELECT b.date
FROM import_table b
WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
b.data IS NOT NULL)
However I need to do this about 20 times, and the first time is still
running at the moment (about 15 minutes).
Am I doing something wrong? Should I put some indexes on the columns
somehow to improve performance?
I don't know if this will help performance-wise, but maybe it's worth a try:

See if doing it with an UPDATE FROM helps at all. e.g.:

http://stackoverflow.com/a/2766766/495319

--
Michael Wood <***@gmail.com>
--
Sent via pgsql-novice mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice
James David Smith
2012-10-12 12:14:17 UTC
Permalink
Michael,

Thanks, I will give that a try later today.

The first of the 20 updates I need to do took about 25 minutes in the
end by the way.

All other useful suggestions gratefully recevied... ;-)

Thanks

James
Post by Michael Wood
Post by James David Smith
Hi all,
Wondered if someone had any tips about how to do this UPDATE query
CREATE TABLE import_table
( date_time TIMESTAMP
person_id, INTEGER
data REAL)
CREATE TABLE master_table
(date_time TIMESTAMP
person_id INTEGER
data REAL)
Each table has 172800 rows in it.
I want to move the 'data' from the import_table to the master_table by
matching on both the date_time and the person_id. I do this with this
UPDATE master_table a
SET data =
(SELECT b.date
FROM import_table b
WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
b.data IS NOT NULL)
However I need to do this about 20 times, and the first time is still
running at the moment (about 15 minutes).
Am I doing something wrong? Should I put some indexes on the columns
somehow to improve performance?
http://stackoverflow.com/a/2766766/495319
--
--
Sent via pgsql-novice mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice
VB N
2012-10-12 14:00:11 UTC
Permalink
On Fri, Oct 12, 2012 at 3:46 PM, James David Smith <
Post by James David Smith
Hi all,
Wondered if someone had any tips about how to do this UPDATE query
CREATE TABLE import_table
( date_time TIMESTAMP
person_id, INTEGER
data REAL)
CREATE TABLE master_table
(date_time TIMESTAMP
person_id INTEGER
data REAL)
Each table has 172800 rows in it.
I want to move the 'data' from the import_table to the master_table by
matching on both the date_time and the person_id. I do this with this
UPDATE master_table a
SET data =
(SELECT b.date
FROM import_table b
WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
b.data IS NOT NULL)
However I need to do this about 20 times, and the first time is still
running at the moment (about 15 minutes).
Am I doing something wrong? Should I put some indexes on the columns
somehow to improve performance?
Any idea how long the SELECT is taking ? Indexes on import_table (an Index
on person_id) should help. ofcourse it all depends on the uniqueness of the
column and size of the table etc. Whats the uniqueness (the n_distinct from
pg_stats table) on all the 3 columns in the WHERE clause ?

Regards,
VB
James David Smith
2012-10-12 14:05:42 UTC
Permalink
Hi VB,

1) No idea how long the select is taking unfortunately. How can I
figure this out?
2) I'll stick an index on person_id in the import_table.
3) Not sure what you mean by this unfortunately: "Whats the uniqueness
(the n_distinct from pg_stats table) on all the 3 columns in the WHERE
clause ?"

Thanks

James
Post by VB N
On Fri, Oct 12, 2012 at 3:46 PM, James David Smith
Post by James David Smith
Hi all,
Wondered if someone had any tips about how to do this UPDATE query
CREATE TABLE import_table
( date_time TIMESTAMP
person_id, INTEGER
data REAL)
CREATE TABLE master_table
(date_time TIMESTAMP
person_id INTEGER
data REAL)
Each table has 172800 rows in it.
I want to move the 'data' from the import_table to the master_table by
matching on both the date_time and the person_id. I do this with this
UPDATE master_table a
SET data =
(SELECT b.date
FROM import_table b
WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
b.data IS NOT NULL)
However I need to do this about 20 times, and the first time is still
running at the moment (about 15 minutes).
Am I doing something wrong? Should I put some indexes on the columns
somehow to improve performance?
Any idea how long the SELECT is taking ? Indexes on import_table (an Index
on person_id) should help. ofcourse it all depends on the uniqueness of the
column and size of the table etc. Whats the uniqueness (the n_distinct from
pg_stats table) on all the 3 columns in the WHERE clause ?
Regards,
VB
--
Sent via pgsql-novice mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice
Tim Bowden
2012-10-12 18:31:40 UTC
Permalink
Post by James David Smith
Hi all,
Wondered if someone had any tips about how to do this UPDATE query
CREATE TABLE import_table
( date_time TIMESTAMP
person_id, INTEGER
data REAL)
CREATE TABLE master_table
(date_time TIMESTAMP
person_id INTEGER
data REAL)
Each table has 172800 rows in it.
I want to move the 'data' from the import_table to the master_table by
matching on both the date_time and the person_id. I do this with this
UPDATE master_table a
SET data =
(SELECT b.date
FROM import_table b
WHERE a.date_time = b.date_time AND a.person_id = b.person_id AND
b.data IS NOT NULL)
However I need to do this about 20 times, and the first time is still
running at the moment (about 15 minutes).
Am I doing something wrong? Should I put some indexes on the columns
somehow to improve performance?
Thanks
James
For this you definitely want indexes. Also you have not defined any
primary keys. I'm going to assume the combination of date_time and
person_id defines a unique record. A suitable index will automatically
be created if you define these two columns as the primary key.

ALTER TABLE master_table ADD PRIMARY KEY(date_time, person_id);


Do the same for import_table.

Also your query isn't doing what you think it is. Where b.data is null,
then no record is returned by the select statement but the update is not
conditional so you end up with a.data is null. You might want to check
any records in master_table that previously had a data value but the
same record in update_table had a null value for data. SELECT * from
import_table WHERE data IS NULL; would be a good place to start.

The query you want is something like this:

UPDATE master_table a SET data = b.data from import_table b where
a.date_time = b.date_time and a.person_id = a.person_id
and b.data is not null;

HTH
Tim Bowden
--
Sent via pgsql-novice mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice
Loading...