In this post, we’ll look at a few crucial queries that are helpful for digging into PostgreSQL problems and evaluating the health of an existing instance.
What is the hit ratio of the database?
SELECT datname,
       (blks_hit * 100 / NULLIF(blks_hit + blks_read, 0))::numeric AS hit_ratio
FROM pg_stat_database
WHERE datname NOT IN ('postgres', 'template0', 'template1');
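As a rule of thumb, a hit ratio close to 99% indicates a healthy buffer cache. If it stays consistently low, one common remedy is to grow shared_buffers; a minimal sketch, where the value is only an example to be tuned for the machine's RAM:

-- example value only; a shared_buffers change takes effect after a restart
ALTER SYSTEM SET shared_buffers = '4GB';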
What is the commit ratio of the database?
SELECT datname,
       round((xact_commit::float * 100 / NULLIF(xact_commit + xact_rollback, 0))::numeric, 2) AS successful_xact_ratio
FROM pg_stat_database
WHERE datname NOT IN ('postgres', 'template0', 'template1');
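Note that pg_stat_database counters are cumulative since the last statistics reset (see its stats_reset column), so old rollbacks can mask current behavior. To measure the ratio over a recent interval, the counters for the current database can be reset and the query re-run later:

-- resets cumulative statistics for the current database only
SELECT pg_stat_reset();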
Get the temp file usage of the database?
SELECT datname,
       temp_files,
       round(temp_bytes::numeric / 1024 / 1024, 2) AS temp_filesize_mb
FROM pg_stat_database
WHERE datname NOT IN ('postgres', 'template0', 'template1')
  AND temp_files > 0;
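Heavy temp file usage usually means sorts and hashes are spilling to disk. Assuming that is the cause, logging temp files and raising work_mem are the usual first steps; the value below is only an example, since work_mem is allocated per operation:

-- log every temp file so the offending queries appear in the server log
ALTER SYSTEM SET log_temp_files = 0;
-- example value only; work_mem is allocated per sort/hash operation
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();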
What is the frequency of checkpoints?
SELECT * FROM pg_stat_bgwriter;
If checkpoints happen too frequently, they cause extra I/O load on the machine; in that case, increase the max_wal_size parameter.
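A minimal sketch of that change; the value is only an example, and max_wal_size can be changed with a configuration reload, no restart needed:

-- example value only
ALTER SYSTEM SET max_wal_size = '4GB';
SELECT pg_reload_conf();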
Use the query below to find the frequency of checkpoints:
WITH sub AS (
    SELECT EXTRACT(EPOCH FROM (now() - stats_reset)) AS seconds_since_start,
           (checkpoints_timed + checkpoints_req) AS total_checkpoints
    FROM pg_stat_bgwriter
)
SELECT total_checkpoints,
       seconds_since_start / NULLIF(total_checkpoints, 0) / 60 AS minutes_between_checkpoints
FROM sub;
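Because this calculation covers the whole period since stats_reset, it can hide recent changes in behavior. To measure the current checkpoint frequency, the shared bgwriter counters can be reset first:

-- resets the cluster-wide bgwriter/checkpoint counters and stats_reset
SELECT pg_stat_reset_shared('bgwriter');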
To determine how much data is written by each checkpoint, enable log_checkpoints. Otherwise, the following query estimates how many buffers were written per checkpoint.
SELECT buffers_checkpoint / NULLIF(checkpoints_timed + checkpoints_req, 0) AS buffers_per_checkpoint
FROM pg_stat_bgwriter;
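To enable the checkpoint logging mentioned above, which records buffers written and sync times for every checkpoint:

ALTER SYSTEM SET log_checkpoints = on;
SELECT pg_reload_conf();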
Get the top five tables with the highest sequential scans?
SELECT schemaname,
       relname,
       seq_scan,
       seq_tup_read,
       seq_tup_read / seq_scan AS avg_seq_tup_read
FROM pg_stat_all_tables
WHERE seq_scan > 0
  AND pg_total_relation_size(relid) > 104857600
  AND schemaname NOT LIKE 'pg_%'
ORDER BY 5 DESC
LIMIT 5;
This query displays the five tables that are larger than 100 MB and have the highest number of sequential scans.
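If those sequential scans come from queries that filter on a selective column, adding an index usually removes them. A sketch with hypothetical table and column names:

-- hypothetical names; CONCURRENTLY avoids blocking writes during the build
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);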
Find tables with a low HOT update ratio?
To learn about HOT updates in PostgreSQL, refer to https://www.postgresql.org/docs/current/storage-hot.html.
To identify the HOT update ratio for tables, use the query below.
SELECT pg_stat_all_tables.schemaname || '.' || pg_stat_all_tables.relname AS table_name,
       pg_size_pretty(pg_relation_size(relid)) AS table_size,
       coalesce(t.spcname,
                (SELECT spcname
                 FROM pg_tablespace
                 WHERE oid = (SELECT dattablespace
                              FROM pg_database
                              WHERE datname = current_database()))) AS tblsp,
       seq_scan,
       idx_scan,
       n_tup_ins,
       n_tup_upd,
       n_tup_hot_upd,
       n_tup_del,
       coalesce(n_tup_ins, 0) + 2 * coalesce(n_tup_upd, 0)
         - coalesce(n_tup_hot_upd, 0) + coalesce(n_tup_del, 0) AS total,
       (coalesce(n_tup_hot_upd, 0)::float * 100
         / (CASE WHEN n_tup_upd > 0 THEN n_tup_upd ELSE 1 END)::float
       )::numeric(10, 2) AS hot_rate,
       (SELECT v[1]
        FROM regexp_matches(reloptions::text, E'fillfactor=(\\d+)') AS r(v)
        LIMIT 1) AS fillfactor
FROM pg_stat_all_tables
JOIN pg_class c ON c.oid = relid
LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace
WHERE (coalesce(n_tup_ins, 0) + coalesce(n_tup_upd, 0) + coalesce(n_tup_del, 0)) > 0
  AND pg_stat_all_tables.schemaname NOT IN ('pg_catalog', 'pg_global')
ORDER BY total DESC
LIMIT 50;
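For tables with a low hot_rate, lowering the fillfactor leaves free space in each page so updated row versions can stay on the same page. A sketch with a hypothetical table name:

-- hypothetical table; keep 20% of each page free for HOT updates
ALTER TABLE orders SET (fillfactor = 80);
-- the new fillfactor applies to existing rows only after a rewrite,
-- e.g. VACUUM FULL (exclusive lock) or pg_repack
VACUUM FULL orders;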
Get table bloat information?
Bloat in tables or indexes slows down performance through unnecessary I/O. It is recommended to check for bloat regularly and to defragment with pg_repack.
WITH constants AS (
    -- define some constants for sizes of things
    -- for reference down the query and easy maintenance
    SELECT current_setting('block_size')::numeric AS bs, 23 AS hdr, 8 AS ma
),
no_stats AS (
    -- screen out tables which have attributes
    -- that don't have stats, such as JSON
    SELECT table_schema, table_name,
           n_live_tup::numeric AS est_rows,
           pg_table_size(relid)::numeric AS table_size
    FROM information_schema.columns
        JOIN pg_stat_user_tables AS psut
            ON table_schema = psut.schemaname
            AND table_name = psut.relname
        LEFT OUTER JOIN pg_stats
            ON table_schema = pg_stats.schemaname
            AND table_name = pg_stats.tablename
            AND column_name = attname
    WHERE attname IS NULL
        AND table_schema NOT IN ('pg_catalog', 'information_schema')
    GROUP BY table_schema, table_name, relid, n_live_tup
),
null_headers AS (
    -- calculate null header sizes
    -- omitting tables which don't have complete stats
    -- and attributes which aren't visible
    SELECT
        hdr + 1 + (sum(CASE WHEN null_frac <> 0 THEN 1 ELSE 0 END) / 8) AS nullhdr,
        SUM((1 - null_frac) * avg_width) AS datawidth,
        MAX(null_frac) AS maxfracsum,
        schemaname,
        tablename,
        hdr, ma, bs
    FROM pg_stats CROSS JOIN constants
        LEFT OUTER JOIN no_stats
            ON schemaname = no_stats.table_schema
            AND tablename = no_stats.table_name
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
        AND no_stats.table_name IS NULL
        AND EXISTS (
            SELECT 1
            FROM information_schema.columns
            WHERE schemaname = columns.table_schema
                AND tablename = columns.table_name
        )
    GROUP BY schemaname, tablename, hdr, ma, bs
),
data_headers AS (
    -- estimate header and row size
    SELECT
        ma, bs, hdr, schemaname, tablename,
        (datawidth + (hdr + ma - (CASE WHEN hdr % ma = 0 THEN ma ELSE hdr % ma END)))::numeric AS datahdr,
        (maxfracsum * (nullhdr + ma - (CASE WHEN nullhdr % ma = 0 THEN ma ELSE nullhdr % ma END))) AS nullhdr2
    FROM null_headers
),
table_estimates AS (
    -- make estimates of how large the table should be
    -- based on row and page size
    SELECT schemaname, tablename, bs,
        reltuples::numeric AS est_rows, relpages * bs AS table_bytes,
        CEIL((reltuples *
            (datahdr + nullhdr2 + 4 + ma -
                (CASE WHEN datahdr % ma = 0 THEN ma ELSE datahdr % ma END)
            ) / (bs - 20))) * bs AS expected_bytes,
        reltoastrelid
    FROM data_headers
        JOIN pg_class ON tablename = relname
        JOIN pg_namespace ON relnamespace = pg_namespace.oid
            AND schemaname = nspname
    WHERE pg_class.relkind = 'r'
),
estimates_with_toast AS (
    -- add in estimated TOAST table sizes
    -- estimate based on 4 toast tuples per page because we don't have
    -- anything better.  also append the no_data tables
    SELECT schemaname, tablename,
        TRUE AS can_estimate,
        est_rows,
        table_bytes + (coalesce(toast.relpages, 0) * bs) AS table_bytes,
        expected_bytes + (ceil(coalesce(toast.reltuples, 0) / 4) * bs) AS expected_bytes
    FROM table_estimates
        LEFT OUTER JOIN pg_class AS toast
            ON table_estimates.reltoastrelid = toast.oid
            AND toast.relkind = 't'
),
table_estimates_plus AS (
    -- add some extra metadata to the table data
    -- and calculations to be reused
    -- including whether we can't estimate it
    -- or whether we think it might be compressed
    SELECT current_database() AS databasename,
        schemaname, tablename, can_estimate,
        est_rows,
        CASE WHEN table_bytes > 0
            THEN table_bytes::NUMERIC
            ELSE NULL::NUMERIC END AS table_bytes,
        CASE WHEN expected_bytes > 0
            THEN expected_bytes::NUMERIC
            ELSE NULL::NUMERIC END AS expected_bytes,
        CASE WHEN expected_bytes > 0 AND table_bytes > 0
            AND expected_bytes <= table_bytes
            THEN (table_bytes - expected_bytes)::NUMERIC
            ELSE 0::NUMERIC END AS bloat_bytes
    FROM estimates_with_toast
    UNION ALL
    SELECT current_database() AS databasename,
        table_schema, table_name, FALSE,
        est_rows, table_size,
        NULL::NUMERIC, NULL::NUMERIC
    FROM no_stats
),
bloat_data AS (
    -- do final math calculations and formatting
    SELECT current_database() AS databasename,
        schemaname, tablename, can_estimate,
        table_bytes, round(table_bytes / (1024 ^ 2)::NUMERIC, 3) AS table_mb,
        expected_bytes, round(expected_bytes / (1024 ^ 2)::NUMERIC, 3) AS expected_mb,
        round(bloat_bytes * 100 / table_bytes) AS pct_bloat,
        round(bloat_bytes / (1024::NUMERIC ^ 2), 2) AS mb_bloat,
        table_bytes, expected_bytes, est_rows
    FROM table_estimates_plus
)
-- filter output for bloated tables
SELECT databasename, schemaname, tablename,
    can_estimate,
    est_rows,
    pct_bloat, mb_bloat,
    table_mb
FROM bloat_data
-- this where clause defines which tables actually appear
-- in the bloat chart
-- example below filters for tables which are either 50%
-- bloated and more than 20mb in size, or more than 25%
-- bloated and more than 1GB in size
WHERE (pct_bloat >= 50 AND mb_bloat >= 20)
    OR (pct_bloat >= 25 AND mb_bloat >= 1000)
ORDER BY pct_bloat DESC;
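Once a bloated table has been identified, pg_repack can rebuild it online with only brief locks. A sketch, assuming the pg_repack package is installed on the server and using hypothetical names:

-- install the extension in the target database
CREATE EXTENSION pg_repack;
-- then, from the shell, rebuild a bloated table online:
--   pg_repack --dbname=mydb --table=public.orders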
Get the list of unused indexes?
The following query returns unused indexes that are not part of any constraint.
SELECT s.schemaname,
       s.relname AS tablename,
       s.indexrelname AS indexname,
       pg_relation_size(s.indexrelid) AS index_size
FROM pg_catalog.pg_stat_user_indexes s
JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid
WHERE s.idx_scan = 0                 -- has never been scanned
  AND 0 <> ALL (i.indkey)            -- no index column is an expression
  AND NOT i.indisunique              -- is not a UNIQUE index
  AND NOT EXISTS                     -- does not enforce a constraint
      (SELECT 1 FROM pg_catalog.pg_constraint c
       WHERE c.conindid = s.indexrelid)
  AND NOT EXISTS                     -- is not an index partition
      (SELECT 1 FROM pg_catalog.pg_inherits AS inh
       WHERE inh.inhrelid = s.indexrelid)
ORDER BY pg_relation_size(s.indexrelid) DESC;
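An index reported here can then be dropped; the name below is hypothetical. Since these statistics are tracked per server, check idx_scan on any standbys before dropping:

-- hypothetical index name; CONCURRENTLY avoids blocking writes
DROP INDEX CONCURRENTLY IF EXISTS some_unused_index;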
Thanks for taking the time to read this post; I hope it served the purpose that brought you to the blog.
Feedback on how to improve the blog is very much appreciated. If you have any questions, suggestions, or criticism, kindly email us.
To stay informed about all of our content, subscribe now.