Friday, February 26, 2016

Netezza: nz_online_vacuum is for online manual vacuum


See what's new in IBM Netezza Software Support Tools NPS 7.2.0.6.P1 release notes.

Netezza manual vacuum (nz_manual_vacuum) has always required to stop NPS that incurred database service downtime.  This is not the case with NPS version 7.2.0.3-P2 and higher where new manual vacuum script  is to run manual vacuum for system in online state.  The new nz_online_valuum utility can be downloaded separately as part of Netezza NPS tools installation bundle e.g. no NPS upgrade needed.  

See instructions how to download and upgrade existing Netezza NPS tools (aka /nz/support/bin directory): IBM Support: Fix Central - download Netezza NPS 7.2 tools

Unfortunately, nz_online_valuum utility is picky and won't run on incomparable NPS versions:



[nz@server]$ /nz/support/bin/nz_online_vacuum -h

Usage:    nz_online_vacuum  [<database>]

Purpose:  Reindex the host catalogs ... while the system is online

          The database catalog is made up of a number of system tables and indexes
          that are stored on the NPS host.  In extreme situations, those relations
          can grow rather large in size, thereby affecting performance (when accessing
          the information contained in the catalog).

          An access exclusive lock is issued for each database.  No users can be signed
          into that database, no cross database queries to the database can be running.
          If the lock cannot be obtained then that database will not be vacuum'ed at
          this time and the output will report

                  ERROR:  unable to acquire lock on database DBNAME

          Online reindexing includes both a vacuum and a reindex.
          The SYSTEM database is not included in this operation.

          See also:  nz_manual_vacuum

Inputs:   The database name is optional ... if you want to process just the
          specified database.  Otherwise, the default is to process all databases.

Outputs:  Each database is processed, in turn.  Sample output

             1   DEV_DB       done
             2   PROD_DB      ERROR:  unable to acquire lock on database PROD_DB
             3   QA_DB        done
             4   TEST_DB      done

Running script on incomparable NPS system will fail. This is what it renders running on NPS 7.2.0.3-P1:  

[nz@server]$ /nz/support/bin/nz_online_vacuum

ERROR:  This script can only be run on the following NPS versions
             7.0.4.7-P2  or greater
             7.1.0.4-P2  or greater
             7.2.0.3-P2  or greater


Related documentation:   REINDEX DATABASE

Thursday, February 25, 2016

Netezza SQL: UNIQUE constraint clause

See related post: 


Netezza SQL compiles with ANSI SQL-92 standard implementing Postgrees SQL. The ANSI SQL-92 standard supported by most relational databases on the market

Netezza supports referential integrity (e.g. definitions for foreign, primary, and unique keys) as part of SQL-92 standard requirement. However, the keys are not enforced. This means that you can define the keys in SQL but they won't make any affect on the data. Nothing would prevent to violate referential integrity like inserting non matching values in child table that suppose to match to parent table. 

Let's experiment with UNIQUE KEY constraint:


/* Start time 10/15/2013 6:15:05 PM, end time 10/15/2013 6:15:05 PM. */
/* Duration 0.0270027 sec. */
create table test2delete (
        col1 smallint
       ,col3 varchar(30 )
);

/*
      NOTICE:  unique key constraints not enforced  <-- Netezza complains, but proceeds.
 */
/* Start time 10/15/2013 6:15:05 PM, end time 10/15/2013 6:15:06 PM. */
/* Duration 0.0370037 sec. */

alter table test2delete add constraint uk_test2delete unique (col1);

/* Start time 10/15/2013 6:15:06 PM, end time 10/15/2013 6:15:06 PM. */
/* Duration 0.2890289 sec. */
/* Records Affected: 1. */

insert into test2delete values (1,'mystring');

/* Start time 10/15/2013 6:15:06 PM, end time 10/15/2013 6:15:06 PM. */
/* Duration 0.0760076 sec. */
/* Records Affected: 1. */

insert into test2delete values (1,'mystring');

/* Start time 10/15/2013 6:15:06 PM, end time 10/15/2013 6:15:06 PM. */
/* Duration 0.0770077 sec. */
/* Records Affected: 1. */


insert into test2delete values (1,'mystring');


<-- All DUPLICATES inserted !!!!



To conclude, Netezza NPS 7.2 documentation emphasizes: 

Note: The system permits and maintains primary key, default, foreign key, unique, and references. Because IBM® Netezza® does not support constraint checking and referential integrity, you must ensure your own constraint checking and referential integrity.


Related documentation:   Column and table constraints

Netezza SQL: FOREIGN KEY clause

See related post: 

Netezza SQL compiles with ANSI SQL-92 standard implementing Postgrees SQL. The ANSI SQL-92 standard supported by most relational databases on the market

Netezza supports referential integrity (e.g. definitions for foreign, primary, and unique keys) as part of SQL-92 standard requirement. However, the keys are not enforced. This means that you can define the keys in SQL but they won't make any affect on the data. Nothing would prevent to violate referential integrity like inserting non matching values in child table that suppose to match to parent table. 

Let's experiment with FOREIGN KEY constraint:

------------------------------------------------------------------
/*
NOTICE:  foreign key constraints not enforced       <-- Netezza complains, but proceeds.
 */
/* Start time 4/25/2013 6:21:32 PM, end time 4/25/2013 6:21:32 PM. */
/* Duration 0.078125 sec. */

CREATE TABLE table2ref
(
        col1    smallint NOT NULL
       ,colref2 smallint
       ,col3      varchar(30 )
  ,CONSTRAINT fk_column_colref2          <-- THIS IS NAME OF REF CONSTRAINT
   FOREIGN KEY (colref2) 
   REFERENCES test2delete (col1)   <-- the ref table "test2delete" must exist. 
);                            if table "test2delete" does not exist SQL will fail.

/* Start time 4/25/2013 6:21:32 PM, end time 4/25/2013 6:21:37 PM. */
/* Duration 4.53125 sec. */                        
/* Records Affected: 1. */

insert into table2ref values (3,12345,'reftabstring');   <-- THIS IS NON-EXISTING KEY "12345"
                                  in table "test2delete.
                                  Nothing stops inserting non-matching value "12345"



To conclude, Netezza NPS 7.2 documentation emphasizes: 

Note: The system permits and maintains primary key, default, foreign key, unique, and references. Because IBM® Netezza® does not support constraint checking and referential integrity, you must ensure your own constraint checking and referential integrity.


Related documentation:   Column and table constraints

Wednesday, February 24, 2016

Netezza: nzbackup & nzrestore on NPS 7.2

See related post: Netezza: nzbackup & nzrestore on NPS 7.0.

QUESTION

What parallelism factor to pick running nzbackup with -streams option for database backup?
What parallelism factor to pick running  nzrestore with -streams option for database restore?

ANSWER:

Pick -streams AUTO  for both nzbackup and nzrestore .   


Advise: I would recommended to test running multiple nzbackup -streams AUTO concurrently in shell script. This may shrink overall backup time by saturating all available network bandwidth. This option yet to be tested with external backup storage.  Stay tuned ... 



EXPLANATION
This empirical test was inspired by poorly performing nzrestore from milti-stream backupset at NPS 7.0 and later releases.  At October 2013 it was proven that  nzrestore recovery time degrades by factor of 2 with each uptick of streams in source nzbackup dataset. For more details about this experiments see post Netezza: nzbackup & nzrestore on NPS 7.0.

Starting from Netezza NPS 7.2 database restore can be done in multi-stream mode. IBM did a great job improving nzbackup and nzrestore utilities.  This test proves that new milti-stream operations show highly improved performance. 

TEST CASES:
The matrix below explains various combination running  nzbackup followed by nzrestore with -streams option. The test was conducted using host server disk for target work space. The database used for testing has size of 500GB with three dozen of tables consisting of mix datatypes columns. 


RESULT

The greener, the better.  The nzbackup -streams AUTO exhibit the best backup performance: "23 min".  The recovery performance running nzrestore -streams AUTO is very close to backup speed: "33 min".  To validate the data test AUTO vs. AUTO was repeated 8 times showing very close results. The average number was derived for both backup and restore: 23 min / 33 min.  



Related documentation:  Database backup and restore


Netezza: nzbackup & nzrestore on NPS 7.0

See related post: Netezza: nzbackup & nzrestore on NPS 7.2.

QUESTION

What parallelism factor to pick running nzbackup with -streams option at NPS 7.0  (late 2013)?

ANSWER:

Do NOT use -streams option. Always run single stream nzbackup.   Build for parallelism manually using shell script executing single-stream nzbackup jobs concurrently.  This will create a single-stream backupset, which performs fast when recover data with nzrestore

Advise: upgrade to NPS 7.2 or higher,where both nzbackup and nzrestore have  option that perform well. See Netezza: nzbackup & nzrestore on NPS 7.2 post


EXPLANATION
As of late 2013, Netezza NPS version 7.0 has parallel nzbackup option -stream, however nzrestore can only be made in a single stream.  Empirical test shows that recovery of data using nzrestore linearly degrades with increase of parallelism in source backupset.

TEST CASES
To obtain the data for this analysis we picked the same size of data. The nzbackup /nzrestore were tested on local backup at netezza host disk and NetBackup Data Domain disk storage.  Both disk and network storage showed relatively similar speed performance. 


RESULT

For nzbackup -streams 4  is the "sweet spot" to make create backupset fastest. Seems like a catch, but look further down for nzrestore timing. 


For nzrestore, on the other hand, is the fastest when data recovery made from a single-stremed backup set.  With increase of parallel -streams in original backupset nzrestore degrades linearly by factor of 2 with each step up in prallelism. Note, nzrestore with 4 streams failed in this test.



NZBACKUP vs. NZRESTORE:

On the same time scale (Axis Y) compare: nzbackup and nzrestor.


Related documentation:  The nzbackup command  and  The nzrestore command

Sunday, February 14, 2016

Netezza: view to monitor user run and logon statistics



Create user activity monitoring views. Views are useful to monitor user logons and Netezza  appliance usage:

Create view DDL.

SELECT 'DEV'                                                   as ENV
    ,username                                              AS USERNAME
    ,u.CREATEDATE                                           AS CREATEDATE
    ,timezone(MAX(LAST_LOGON), 'UTC','EST' )                as LAST_LOGON
    , CURRENT_DATE-timezone(MAX(LAST_LOGON), 'UTC','EST' as LAST_LOGON_DAYS
    ,SUM(ACCESS_COUNT)                                   as TOTAL_LOGIN_COUNT
--- List concatenation here: put all database on one line comma delimited.
--- Read: http://nz2nz.blogspot.com/2016/02/netezza-concatenate-rows-into-string.html
    ,trim (trailing ', ' from
            SQL_EXTENSIONS.. replace
               (SQL_EXTENSIONS.. replace
                   (SQL_EXTENSIONS..XMLserialize
                       (SQL_EXTENSIONS..XMLagg
                           (SQL_EXTENSIONS..XMLElement( 'X',dbname))),
               '<X>','' )
          , '</X>' ,',' )) AS DB_ACCESSED
FROM
(
     select username
              ,dbname
              , max( time) AS LAST_LOGON
              , count(* AS ACCESS_COUNT
     from QUERY_HISTORY.."$v_hist_log_events"
     where username not like 'ADMIN%'   --- not interested in ADMIN** users
       and username not like 'ZZ_%'     --- not interested in ZZ_** functional users
     group by username
          , dbname
) as LAST_LOGON_DAYS
left outer join _v_user u using (username)
GROUP BY 1 ,2 ,3
order by TOTAL_LOGIN_COUNT desc

;







SELECT  'DEV'                                   AS ENV
        , QP.USERNAME                            AS USERNAME
        , DATE_TRUNC('day' , QP.SUBMITTIME)     AS THEDAY
        , MIN(QP.SUBMITTIME)                     AS RUN_FROM
        , MAX(QE.FINISHTIME)                     AS RUN_TO
        , COUNT(*)                              AS QUERY_COUNT
        , SUM((QE.FINISHTIME - QP.SUBMITTIME))   AS TOTAL_ELAPSED
        , SUM(QE.RESULTROWS)                     AS TOTAL_ROWS
        , SUM(P.RESULTBYTES)                     AS TOTAL_BYTES
FROM  (       QUERY_HISTORY.QHIST_USER."$hist_query_prolog_3" QP
         JOIN QUERY_HISTORY.QHIST_USER."$hist_query_epilog_3" QE   
               USING (NPSID, NPSINSTANCEID, OPID) )
   JOIN (     QUERY_HISTORY.QHIST_USER."$hist_session_prolog_3" SP
         JOIN QUERY_HISTORY.QHIST_USER."$hist_session_epilog_3" SE 
               USING (NPSID, NPSINSTANCEID, SESSIONID))
  USING (NPSID, NPSINSTANCEID)
LEFT JOIN (
        SELECT PP.NPSID
       , PP.NPSINSTANCEID
       , PP.OPID
       , PP.SESSIONID
       , SUM(PE.RESULTBYTES) AS RESULTBYTES
       , SUM((PP.PREPTIME - PP.SUBMITTIME)) AS QUEUETIME
       , SUM((PP.GRATIME - PP.PREPTIME)) AS PREPTIME
       , SUM((PP.STARTTIME - PP.GRATIME)) AS GRATIME
       , COUNT(PP.PLANID) AS NUMPLANS
       , SUM(PP.QCRESTART) AS NUMRESTARTS
        FROM (QUERY_HISTORY.QHIST_USER."$hist_plan_prolog_3" PP
        JOIN QUERY_HISTORY.QHIST_USER."$hist_plan_epilog_3" PE 
        USING (NPSID, NPSINSTANCEID, OPID, SESSIONID, PLANID))
        GROUP BY PP.NPSID, PP.NPSINSTANCEID, PP.OPID, PP.SESSIONID
) P
 USING (NPSID, NPSINSTANCEID, OPID)
WHERE QP.SESSIONID = SP.SESSIONID
  AND QE.STATUS = 0
  AND QP.SUBMITTIME > current_timestamp - interval '60 days'
  AND UPPER(QP.QUERYTEXT) NOT IN ( 'COMMIT','BEGIN' , 'SELECT * FROM DUMMY WHERE 1=2')
  AND UPPER(QP.QUERYTEXT) not like 'SHOW %'
  AND UPPER(QP.QUERYTEXT) not like 'SET %'
  AND UPPER(QP.QUERYTEXT) not like 'SELECT VERSION%'
  AND UPPER(QP.QUERYTEXT) not like 'SELECT DB_ENCODING%'
  AND UPPER(QP.QUERYTEXT) not like 'SELECT NCHAR_ENCODING%'
  AND UPPER(QP.QUERYTEXT) not like 'SELECT CURRENT_SID%'
  AND UPPER(QP.QUERYTEXT) not like 'SELECT ''JDBC CLIENT VERSION''%'
  /* to be continued based on your system. This xclusion list is to be revised ...*/
GROUP BY ENV, QP.USERNAME, DATE_TRUNC( 'day', QP.SUBMITTIME)
order by THEDAY desc, TOTAL_ELAPSED desc
;



Aginity: visualize processing skew for running Netezza query

Aginity provides visualization tool to identify query processing skew.

Execute query as: 

Friday, February 12, 2016

Netezza: enable CROSS JOIN control set ENABLE_CROSSJOIN_CONTROL


Enable CROSS JOIN control feature by adding enable_crossjoin_control parameter to /nz/data/postgresql.conf file:

###
### In the file: /nz/data/postgresql.conf
###

enable_crossjoin_control = true





BEFORE NPS REBOOT

/* NOTICE:  Cross join control not enabled, at runtime option will be ignored. */
/* Start time 5/11/2015 4:55:02 PM, end time 5/11/2015 4:55:02 PM. */
/* Duration 0.0312002 sec. */
alter user USERNAME  allow cross join FALSE;



AFTER NPS REBOOT

Testing with different user setting for ALLOW CROSS JOIN . See: ALTER USER.

Default:

_V_USER.CROSS_JOIN_ALLOWED IS NULL


/* ERROR:  Execution of implicit cross joins is not permitted */
SELECT count (* ) FROM TABLE_1, TABLE _2;

/* ERROR:  Execution of explicit cross joins is not permitted */
SELECT count (* ) FROM TABLE_1 cross join  TABLE_2;


alter user MYUSER
allow cross join FALSE;


/* ERROR:  Execution of implicit cross joins is not permitted */
SELECT count (* ) FROM TABLE_1, TABLE_2;

/* ERROR:  Execution of explicit cross joins is not permitted */
SELECT count (* ) FROM TABLE_1 cross join  TABLE_2;


alter user MYUSER
allow cross join TRUE;


/* ERROR:  Execution of implicit cross joins is not permitted */
SELECT count (* ) FROM TABLE_1, TABLE_2;

/* Success. Ok !*/
SELECT count (* ) FROM TABLE_1 cross join  TABLE_2;
  



SET CONTROL AT DATABASE GROUP AND USER LEVEL:
Replace NULL with TRUE|FALSE:   ALTER GROUP xyz ALLOW CROSS JOIN TRUE;  If still left NULL, the user defaults to FALSE.
  
--- fine grained control at GROUP level ---
select groupname, cross_joins_allowed  from _v_group;
--- fine username control at USER level ---
select username, cross_joins_allowed  from SYSTEM ..v_user;