Thursday, 7 July 2016

Useful Queries for troubleshooting amazon redshift

USEFUL QUERIES FOR TROUBLESHOOTING IN AMAZON REDSHIFT 

Here are some of my queries for troubleshooting in amazon redshift. I have collected this from different sources.

TO CHECK LIST OF RUNNING QUERIES AND USERNAMES:

select a.userid, cast(u.usename as varchar(100)), a.query, a.label, a.pid, a.starttime, b.duration,
b.duration/1000000 as duration_sec, b.query as querytext
from stv_inflight a, stv_recents b, pg_user u
where a.pid = b.pid and a.userid = u.usesysid




select pid, trim(user_name), starttime, substring(query,1,20) from stv_recents where status='Running'

TO CANCEL A RUNNING QUERY:

cancel <pid>


You can get pid from one of the queries above used to check running queries.


TO LOOK FOR ALERTS:

select * from STL_ALERT_EVENT_LOG
where query = 1011
order by event_time desc
limit 100;


TO CHECK TABLE SIZE:

select trim(pgdb.datname) as Database, trim(pgn.nspname) as Schema,
trim(a.name) as Table, b.mbytes, a.rows
from ( select db_id, id, name, sum(rows) as rows from stv_tbl_perm a group by db_id, id, name ) as a
join pg_class as pgc on pgc.oid = a.id
join pg_namespace as pgn on pgn.oid = pgc.relnamespace
join pg_database as pgdb on pgdb.oid = a.db_id
join (select tbl, count(*) as mbytes
from stv_blocklist group by tbl) b on a.id=b.tbl
order by b.mbytes desc, a.db_id, a.name;


TO CHECK FOR TABLE COMPRESSION:

analyze <tablename>;
analyze compression <tablename>;



TO ANALYZE ENCODING:

select "column", type, encoding
from pg_table_def where tablename = 'biglist';



TO CHECK LIST OF FILES COPIED:

select * from stl_load_errors

select * from stl_load_commits


select query, trim(filename) as file, curtime as updated, *
from stl_load_commits
where query = pg_last_copy_id();


TO CHECK LOAD ERRORS

select d.query, substring(d.filename,14,20),
d.line_number as line,
substring(d.value,1,16) as value,
substring(le.err_reason,1,48) as err_reason
from stl_loaderror_detail d, stl_load_errors le
where d.query = le.query
and d.query = pg_last_copy_id();


TO CHECK FOR DISKSPACE USED IN REDSHIFT:

select owner as node, diskno, used, capacity
from stv_partitions
order by 1, 2, 3, 4;
select query, trim(querytxt) as sqlquery
from stl_query
order by query desc limit 5;


SOME IMPORTANT AWS COMMANDS:

To resize the redshift cluster (node type and number of nodes always required):

aws  redshift modify-cluster --cluster-identifier <cluster name> --node-type dw2.8xlarge --number-of-nodes 3

To get filelist on S3:

aws s3 ls $BUCKET/  > ./filecount.out

To get status of cluster and other information of cluster in text format:

aws redshift describe-clusters --output text