In this post, we’ll look at a few crucial queries that are helpful for digging into PostgreSQL problems and evaluating the health of an existing instance.
What is the hit ratio of the database?
SELECT datname,
       (blks_hit * 100 / NULLIF(blks_hit + blks_read, 0))::numeric AS hit_ratio
FROM pg_stat_database
WHERE datname NOT IN ('postgres', 'template0', 'template1');
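As a rule of thumb, a hit ratio close to 99% indicates a healthy buffer cache. If it stays consistently low, one common remedy is to grow shared_buffers; a minimal sketch, where the value is only an example to be tuned for the machine's RAM:

-- example value only; a shared_buffers change takes effect after a restart
ALTER SYSTEM SET shared_buffers = '4GB';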
What is the commit ratio of the database?
SELECT datname,
       round((xact_commit::float * 100 / NULLIF(xact_commit + xact_rollback, 0))::numeric, 2) AS successful_xact_ratio
FROM pg_stat_database
WHERE datname NOT IN ('postgres', 'template0', 'template1');
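Note that pg_stat_database counters are cumulative since the last statistics reset (see its stats_reset column), so old rollbacks can mask current behavior. To measure the ratio over a recent interval, the counters for the current database can be reset and the query re-run later:

-- resets cumulative statistics for the current database only
SELECT pg_stat_reset();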
Get the temp file usage of the database?
SELECT datname,
       temp_files,
       round(temp_bytes::numeric / 1024 / 1024, 2) AS temp_filesize_mb
FROM pg_stat_database
WHERE datname NOT IN ('postgres', 'template0', 'template1')
  AND temp_files > 0;
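Heavy temp file usage usually means sorts and hashes are spilling to disk. Assuming that is the cause, logging temp files and raising work_mem are the usual first steps; the value below is only an example, since work_mem is allocated per operation:

-- log every temp file so the offending queries appear in the server log
ALTER SYSTEM SET log_temp_files = 0;
-- example value only; work_mem is allocated per sort/hash operation
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();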
What is the frequency of checkpoints?
SELECT * FROM pg_stat_bgwriter;
If checkpoints happen too frequently, they cause extra I/O load on the machine; in that case, increase the max_wal_size parameter.
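A minimal sketch of that change; the value is only an example, and max_wal_size can be changed with a configuration reload, no restart needed:

-- example value only
ALTER SYSTEM SET max_wal_size = '4GB';
SELECT pg_reload_conf();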
Use the query below to find the frequency of checkpoints:
WITH sub AS (
    SELECT EXTRACT(EPOCH FROM (now() - stats_reset)) AS seconds_since_start,
           (checkpoints_timed + checkpoints_req) AS total_checkpoints
    FROM pg_stat_bgwriter
)
SELECT total_checkpoints,
       seconds_since_start / NULLIF(total_checkpoints, 0) / 60 AS minutes_between_checkpoints
FROM sub;
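Because this calculation covers the whole period since stats_reset, it can hide recent changes in behavior. To measure the current checkpoint frequency, the shared bgwriter counters can be reset first:

-- resets the cluster-wide bgwriter/checkpoint counters and stats_reset
SELECT pg_stat_reset_shared('bgwriter');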
To determine how much data is written by each checkpoint, enable log_checkpoints. Otherwise, the following query estimates how many buffers were written per checkpoint.
SELECT buffers_checkpoint / NULLIF(checkpoints_timed + checkpoints_req, 0) AS buffers_per_checkpoint
FROM pg_stat_bgwriter;
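To enable the checkpoint logging mentioned above, which records buffers written and sync times for every checkpoint:

ALTER SYSTEM SET log_checkpoints = on;
SELECT pg_reload_conf();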
Get the top five tables with the highest sequential scans?
SELECT schemaname,
       relname,
       seq_scan,
       seq_tup_read,
       seq_tup_read / seq_scan AS avg_seq_tup_read
FROM pg_stat_all_tables
WHERE seq_scan > 0
  AND pg_total_relation_size(relid) > 104857600
  AND schemaname NOT LIKE 'pg_%'
ORDER BY 5 DESC
LIMIT 5;
This query displays the five tables that are larger than 100 MB and have the highest number of sequential scans.
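If those sequential scans come from queries that filter on a selective column, adding an index usually removes them. A sketch with hypothetical table and column names:

-- hypothetical names; CONCURRENTLY avoids blocking writes during the build
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);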
Find tables with a low HOT update ratio?
To learn about HOT updates in PostgreSQL, refer to https://www.postgresql.org/docs/current/storage-hot.html.
To identify the HOT update ratio for tables, use the query below.
SELECT pg_stat_all_tables.schemaname || '.' || pg_stat_all_tables.relname AS table_name,
       pg_size_pretty(pg_relation_size(relid)) AS table_size,
       coalesce(t.spcname,
                (SELECT spcname
                 FROM pg_tablespace
                 WHERE oid = (SELECT dattablespace
                              FROM pg_database
                              WHERE datname = current_database()))) AS tblsp,
       seq_scan,
       idx_scan,
       n_tup_ins,
       n_tup_upd,
       n_tup_hot_upd,
       n_tup_del,
       coalesce(n_tup_ins, 0) + 2 * coalesce(n_tup_upd, 0)
         - coalesce(n_tup_hot_upd, 0) + coalesce(n_tup_del, 0) AS total,
       (coalesce(n_tup_hot_upd, 0)::float * 100
         / (CASE WHEN n_tup_upd > 0 THEN n_tup_upd ELSE 1 END)::float
       )::numeric(10, 2) AS hot_rate,
       (SELECT v[1]
        FROM regexp_matches(reloptions::text, E'fillfactor=(\\d+)') AS r(v)
        LIMIT 1) AS fillfactor
FROM pg_stat_all_tables
JOIN pg_class c ON c.oid = relid
LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace
WHERE (coalesce(n_tup_ins, 0) + coalesce(n_tup_upd, 0) + coalesce(n_tup_del, 0)) > 0
  AND pg_stat_all_tables.schemaname NOT IN ('pg_catalog', 'pg_global')
ORDER BY total DESC
LIMIT 50;
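For tables with a low hot_rate, lowering the fillfactor leaves free space in each page so updated row versions can stay on the same page. A sketch with a hypothetical table name:

-- hypothetical table; keep 20% of each page free for HOT updates
ALTER TABLE orders SET (fillfactor = 80);
-- the new fillfactor applies to existing rows only after a rewrite,
-- e.g. VACUUM FULL (exclusive lock) or pg_repack
VACUUM FULL orders;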
Get table bloat information?
Bloat in tables or indexes slows down performance through unnecessary I/O. It is recommended to check for bloat regularly and to defragment with pg_repack.
WITH constants AS (
    -- define some constants for sizes of things
    -- for reference down the query and easy maintenance
    SELECT current_setting('block_size')::numeric AS bs, 23 AS hdr, 8 AS ma
),
no_stats AS (
    -- screen out tables which have attributes
    -- that don't have stats, such as JSON
    SELECT table_schema, table_name,
           n_live_tup::numeric AS est_rows,
           pg_table_size(relid)::numeric AS table_size
    FROM information_schema.columns
        JOIN pg_stat_user_tables AS psut
            ON table_schema = psut.schemaname
            AND table_name = psut.relname
        LEFT OUTER JOIN pg_stats
            ON table_schema = pg_stats.schemaname
            AND table_name = pg_stats.tablename
            AND column_name = attname
    WHERE attname IS NULL
        AND table_schema NOT IN ('pg_catalog', 'information_schema')
    GROUP BY table_schema, table_name, relid, n_live_tup
),
null_headers AS (
    -- calculate null header sizes
    -- omitting tables which don't have complete stats
    -- and attributes which aren't visible
    SELECT
        hdr + 1 + (sum(CASE WHEN null_frac <> 0 THEN 1 ELSE 0 END) / 8) AS nullhdr,
        SUM((1 - null_frac) * avg_width) AS datawidth,
        MAX(null_frac) AS maxfracsum,
        schemaname,
        tablename,
        hdr, ma, bs
    FROM pg_stats CROSS JOIN constants
        LEFT OUTER JOIN no_stats
            ON schemaname = no_stats.table_schema
            AND tablename = no_stats.table_name
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
        AND no_stats.table_name IS NULL
        AND EXISTS (
            SELECT 1
            FROM information_schema.columns
            WHERE schemaname = columns.table_schema
                AND tablename = columns.table_name
        )
    GROUP BY schemaname, tablename, hdr, ma, bs
),
data_headers AS (
    -- estimate header and row size
    SELECT
        ma, bs, hdr, schemaname, tablename,
        (datawidth + (hdr + ma - (CASE WHEN hdr % ma = 0 THEN ma ELSE hdr % ma END)))::numeric AS datahdr,
        (maxfracsum * (nullhdr + ma - (CASE WHEN nullhdr % ma = 0 THEN ma ELSE nullhdr % ma END))) AS nullhdr2
    FROM null_headers
),
table_estimates AS (
    -- make estimates of how large the table should be
    -- based on row and page size
    SELECT schemaname, tablename, bs,
        reltuples::numeric AS est_rows, relpages * bs AS table_bytes,
        CEIL((reltuples *
            (datahdr + nullhdr2 + 4 + ma -
                (CASE WHEN datahdr % ma = 0 THEN ma ELSE datahdr % ma END)
            ) / (bs - 20))) * bs AS expected_bytes,
        reltoastrelid
    FROM data_headers
        JOIN pg_class ON tablename = relname
        JOIN pg_namespace ON relnamespace = pg_namespace.oid
            AND schemaname = nspname
    WHERE pg_class.relkind = 'r'
),
estimates_with_toast AS (
    -- add in estimated TOAST table sizes
    -- estimate based on 4 toast tuples per page because we don't have
    -- anything better.  also append the no_data tables
    SELECT schemaname, tablename,
        TRUE AS can_estimate,
        est_rows,
        table_bytes + (coalesce(toast.relpages, 0) * bs) AS table_bytes,
        expected_bytes + (ceil(coalesce(toast.reltuples, 0) / 4) * bs) AS expected_bytes
    FROM table_estimates
        LEFT OUTER JOIN pg_class AS toast
            ON table_estimates.reltoastrelid = toast.oid
            AND toast.relkind = 't'
),
table_estimates_plus AS (
    -- add some extra metadata to the table data
    -- and calculations to be reused
    -- including whether we can't estimate it
    -- or whether we think it might be compressed
    SELECT current_database() AS databasename,
        schemaname, tablename, can_estimate,
        est_rows,
        CASE WHEN table_bytes > 0
            THEN table_bytes::NUMERIC
            ELSE NULL::NUMERIC END AS table_bytes,
        CASE WHEN expected_bytes > 0
            THEN expected_bytes::NUMERIC
            ELSE NULL::NUMERIC END AS expected_bytes,
        CASE WHEN expected_bytes > 0 AND table_bytes > 0
            AND expected_bytes <= table_bytes
            THEN (table_bytes - expected_bytes)::NUMERIC
            ELSE 0::NUMERIC END AS bloat_bytes
    FROM estimates_with_toast
    UNION ALL
    SELECT current_database() AS databasename,
        table_schema, table_name, FALSE,
        est_rows, table_size,
        NULL::NUMERIC, NULL::NUMERIC
    FROM no_stats
),
bloat_data AS (
    -- do final math calculations and formatting
    SELECT current_database() AS databasename,
        schemaname, tablename, can_estimate,
        table_bytes, round(table_bytes / (1024 ^ 2)::NUMERIC, 3) AS table_mb,
        expected_bytes, round(expected_bytes / (1024 ^ 2)::NUMERIC, 3) AS expected_mb,
        round(bloat_bytes * 100 / table_bytes) AS pct_bloat,
        round(bloat_bytes / (1024::NUMERIC ^ 2), 2) AS mb_bloat,
        table_bytes, expected_bytes, est_rows
    FROM table_estimates_plus
)
-- filter output for bloated tables
SELECT databasename, schemaname, tablename,
    can_estimate,
    est_rows,
    pct_bloat, mb_bloat,
    table_mb
FROM bloat_data
-- this where clause defines which tables actually appear
-- in the bloat chart
-- example below filters for tables which are either 50%
-- bloated and more than 20mb in size, or more than 25%
-- bloated and more than 1GB in size
WHERE (pct_bloat >= 50 AND mb_bloat >= 20)
    OR (pct_bloat >= 25 AND mb_bloat >= 1000)
ORDER BY pct_bloat DESC;
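Once a bloated table has been identified, pg_repack can rebuild it online with only brief locks. A sketch, assuming the pg_repack package is installed on the server and using hypothetical names:

-- install the extension in the target database
CREATE EXTENSION pg_repack;
-- then, from the shell, rebuild a bloated table online:
--   pg_repack --dbname=mydb --table=public.orders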
Get the list of unused indexes?
The following query returns unused indexes that are not part of any constraint.
SELECT s.schemaname,
       s.relname AS tablename,
       s.indexrelname AS indexname,
       pg_relation_size(s.indexrelid) AS index_size
FROM pg_catalog.pg_stat_user_indexes s
JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid
WHERE s.idx_scan = 0                 -- has never been scanned
  AND 0 <> ALL (i.indkey)            -- no index column is an expression
  AND NOT i.indisunique              -- is not a UNIQUE index
  AND NOT EXISTS                     -- does not enforce a constraint
      (SELECT 1 FROM pg_catalog.pg_constraint c
       WHERE c.conindid = s.indexrelid)
  AND NOT EXISTS                     -- is not an index partition
      (SELECT 1 FROM pg_catalog.pg_inherits AS inh
       WHERE inh.inhrelid = s.indexrelid)
ORDER BY pg_relation_size(s.indexrelid) DESC;
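An index reported here can then be dropped; the name below is hypothetical. Since these statistics are tracked per server, check idx_scan on any standbys before dropping:

-- hypothetical index name; CONCURRENTLY avoids blocking writes
DROP INDEX CONCURRENTLY IF EXISTS some_unused_index;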
Thanks for taking the time to read this post; I hope it served the purpose that brought you to the blog.
Feedback on how to improve the blog is very much appreciated. If you have any questions, suggestions, or criticism, kindly email us.
To stay informed about all of our content, subscribe now.