General Purpose Scripts for use with NPS Versions 4.x, 5.x, 6.x and 7.x
Copyright IBM Corp. 2011, 2013, 2014, 2015, 2016
LICENSE: See licenses provided at download or with distribution for language- and/or
locale-specific licenses
Toolkit Version: 7.2 (2016-07-04)
For NPS Versions: 4.6, 5.0, 5.2, 6.0, 6.1, 7.0, 7.1, 7.2
All of the scripts support online help ( -? or -h or -help ), which provides the same detailed information
that you would find in this document. If you don't know or don't remember what it is you're looking for,
try the "nz_help" script.
Of particular interest to many should be the "Major Utilities". Please take the time to familiarize
yourself with those.
User Visible Changes
====================
Script Name New Option(s) Comments
--------------------- ------------- ---------------
nz_csv                                  New script to dump out tables in CSV format
nz_ddl_all_grants New script to dump out all GRANT statements across all databases
nz_ddl_diff -ignoreQuotes Ignore quotes and case in object names/DDL
nz_genstats -default Have the system decide the default type of statistics to be generated
nz_groom -scan+records Perform a table scan + record level groom of the table (if needed)
-brief Limit the output to only tables that are considered "groom worthy"
nz_health Includes output from the nznpssysrevs command
Includes output about the frequency/usage of the nz_*** scripts
nz_migrate -fillRecord Treat missing trailing input fields as null
-truncString Truncate any string value that exceeds its declared storage
nz_plan After each table ScanNode, show what is done with that data set
nz_rerandomize New script to re-randomize the rows in a randomly distributed table
nz_scan_table_extents New script to scan 1 table ... 1 dataslice ... 1 extent at a time
nz_zonemap Include additional information with the existing "-info" option
Change Log
==========
nz_check_views Fixed a logic bug in the parsing of the command line options
nz_ddl_ext_table Corrected the DDL output that is generated for certain external table options
nz_ddl_grant_user Corrected the DDL output that is generated for certain GRANTs
nz_format Corrected the output produced when processing a "Modified plan"
nz_genstats                 Ensure that any global optimizer settings don't interfere with the running of this script
nz_groom Fixed the use of the "-percent" option when it was the only condition specified
Output format has changed slightly ... see the online help regarding the "-brief" option
nz_migrate Fixed a logic bug that wouldn't recognize quoted tablenames that began with a number
nz_plan                     Corrected the display of the "Elapsed" time when a plan crossed a date boundary
nz_record_skew              Account for higher skew ratios (as in an 8 rack Mako with 1,920 data slices)
nz_set Includes additional optimizer settings in the output
nz_spu_swap_space           Account for higher skew ratios (as in an 8 rack Mako with 1,920 data slices)
nz_view_references Fixed the "-create" option so that the table that is created can hold > 32K entries,
and it will also treat empty values as NULL's rather than as '' (a 0-length string)
PREVIOUS User Visible Changes
=============================
Script Name New Option(s) Comments
--------------------- ------------- ---------------
nz_backup -delim Allows you to specify the column delimiter to be used
nz_check_views -rebuild Automatically rebuilds all of the user VIEWs
nz_db_size -temp New option used to display information on temporary tables
nz_ddl_scheduler_rule New script to generate the DDL for any scheduler rules
nz_ddl_table_redesign -sample Samples just a subset of the table to speed things up
nz_load_files -filter Allows you to preprocess each data file prior to it being loaded
nz_migrate -noData Perform all of the steps EXCEPT for migrating any actual data
-loopback Exercises the system by migrating data from/into itself
nz_online_vacuum New script to reindex the host catalogs while online
nz_plan -width Allows you to specify the display width of the SQL column
-tar Search all of the *.tgz archives to try and find the *.pln file
nz_query_history New *.ddl file
nz_responders -width Allows you to specify the display width of the SQL column
nz_sysutil_history.ddl New *.ddl file
nz_sysutil_stats New script to display statistics about system utilization
nz_unload -header New option to include column names as the first line of output
nz_zonemap -info Display summary information about the zonemap table itself
PREVIOUS Change Log
===================
add_commas Don't throw an error if the string is not numeric
nz_altered_tables Join on like datatypes for performance
nz_backup Fix a problem when using nz_restore when NZ_HOST is set
nz_best_practices Performance improvements
nz_build_html_help_output Include additional information in the output
nz_catalog_size Fix embedded function (add_commas)
nz_change_owner Add support for SCHEDULER RULEs
nz_cksum Fix the cksum that is performed on columns of type NUMERIC and *CHAR*
nz_ddl Add support for SCHEDULER RULEs and "External Table Locations"
nz_db_size Join on like datatypes for performance
"-temp" option was added to display information on temporary tables
Fix an obscure issue with displaying versioned/row secure tables
nz_ddl_comment Remove an unnecessary join
nz_ddl_ext_table Switched from using "egrep -E" to "grep -E"
nz_ddl_history_config Include a "SET HISTORY CONFIGURATION" in the generated ddl
nz_ddl_library Remove an unnecessary join
nz_ddl_owner Add support for SCHEDULER RULEs
nz_ddl_scheduler_rule Add support for SCHEDULER RULEs
nz_ddl_table Adjust output of some \echo statements to eliminate any problems with them
nz_ddl_table_redesign Use "quoted column names" to eliminate any problems caused by reserved words
nz_ddl_view Switched from using "egrep -E" to "grep -E"
nz_find_table_orphans Change how the MAX(objid) that has been used is determined
Provide a grand total as to how much space is orphaned
Eliminate unnecessary work that was being done when "-delete" was chosen
nz_format Support SQL statements in *.pln files that span multiple lines
nz_genstats                 Ignore NZ_SCHEMA when processing the SYSTEM database
                            "-force" will now run against all non-empty tables (even if their rowcount statistic is 0)
nz_get_object_name Add support for SCHEDULER RULEs
nz_groom Fixed the "-groom records" step which was being skipped before
Don't waste time processing/grooming empty tables
Don't process tables that are already below the thresholds for "-rows" or "-size"
                            When doing a groom '-records', don't bother also doing a groom pages on small tables
nz_health Switched from using "egrep -E" to "grep -E"
nz_manual_vacuum Up the threshold for reporting on catalog files (from 25MB to 75MB)
nz_migrate When the source table is an external table then only one thread/stream is used
When running on a linux client, use the local nzodbcsql program
Fixes the use of "-timeout" (when "-status" was not included)
Fixes the "tearing down of all threads" when an error is encountered
Switched from using "egrep -E" to "grep -E"
nz_plan                     Be more accommodating of missing preceding spaces on the " -- Snippet " lines
"-tar" will search all of the *.tgz archives to try and find the *.pln file
nz_query_history Fix an issue where some columns might not be populated with data
nz_query_stats Updated to work against Version 1, 2 or 3 of the Query History Database
nz_responders               Never display a negative number for the elapsed/queued time
nz_set Includes additional optimizer settings in the output
nz_skew Join on like datatypes for performance
nz_spu_swap_space Reworked the sql that calculates + displays the per-plan swap-space-usage
nz_stats Join on like datatypes for performance
                            When displaying #'s of Objects, show the database MIN/AVG/MAX #'s
Switched from using "egrep -E" to "grep -E"
nz_sysutil_stats Includes additional information in the output to identify the system
nz_update_statistic_unique_values Allows you to specify that there is NO dispersion value for a column
nz_view_references Use the proper indentation when displaying SYNONYM information
nz_zonemap New "-info" option will display summary information about the zonemap table
General Information
===================
The scripts treat all command line options as case insensitive -- this applies to database
names and object names and any switches/arguments. One exception to this rule is if you
have used delimited object names, in which case they are case sensitive. And when specifying
them on the command line you would need to enter them as '"My Table"' ... e.g.,
<single quote><double quote>The Name<double quote><single quote>
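For example, assuming a hypothetical delimited table named "My Table" in a database
named PROD, an invocation might look like this (the single quotes protect the double
quotes from the linux shell):

     nz_ddl_table PROD '"My Table"'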
While these scripts were developed and tested on an NPS server, many (but not all) of
them should also be able to be run from a remote client running linux/unix. Some scripts
will need to be run as the linux user "nz" (on the NPS host itself) as the script requires
access to certain privileged executables and files. Some scripts will need to be run as
the database user "ADMIN" as the script accesses certain privileged tables and views (that
only the ADMIN user has access to, by default). If you are having problems running a
particular script, first review the online help for the script to see if it mentions any
special requirements. When in doubt, try running the script from the nz/ADMIN account.
Generic Command Line Options
----------------------------
The scripts support the following generic NPS options (which are similar to nzsql, nzload, etc...)
-d <dbname> Specify database name to connect to [NZ_DATABASE]
-db <dbname>
-schema <schemaname> Specify schema name to connect to [NZ_SCHEMA]
-u <username> Specify database username [NZ_USER]
-w <password> Specify the database user password [NZ_PASSWORD]
-pw <password>
-host <host> Specify database server host [NZ_HOST]
-port <port> Specify database server port [NZ_PORT]
-rev Show version information and exit
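For example (the database, host, and password shown here are made up), the generic
options can be passed on the command line, or supplied via the corresponding
environment variables:

     nz_db_size -db PROD -u admin -pw secret -host 192.168.10.5

     export NZ_DATABASE=PROD
     export NZ_HOST=192.168.10.5
     nz_db_size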
Schemas
-------
Additional command line options that these scripts support when using schemas
-schemas ALL Process ALL schemas in the database, rather than just one
schema. This applies to all scripts (to include nz_db_size,
nz_ddl*, nz_groom, nz_genstats, nz_migrate, ...)
Alternatively, you can set up this environment variable
export NZ_SCHEMAS=ALL
so that you don't have to keep specifying "-schemas ALL" on
the command line each time you invoke a script (if you like
working with all of the schemas all of the time). Since
                    this is a different environment variable (than NZ_SCHEMA) it
will not interfere with the other nz*** cli tools.
-schemas <...> Specify a list (a subset) of the schemas to be processed
-schema default Allows you to connect to a database's default schema
(without having to know/specify its actual name)
When processing multiple schemas, the objects in a database will be displayed using
their SchemaName.ObjectName
When choosing "-schemas <...>", if you specify just a single schema name then only
that one schema will be processed. But the objects will still be displayed using
their SchemaName.ObjectName
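For example (the schema names are hypothetical), to report table sizes across every
schema in a database, or across just two of them:

     nz_db_size PROD -schemas ALL
     nz_db_size PROD -schemas SCHEMA_A SCHEMA_B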
Picking A Set Of Objects
------------------------
Previously, many of the scripts allowed you to process either a single object or all objects.
For example, "nz_ddl_table PROD" would produce the DDL for all tables in the database PROD,
whereas "nz_ddl_table PROD CUSTOMERS" would produce the DDL for just the one table, CUSTOMERS.
But what if you want something in between: some subset of the objects, tables, or views?
More than one, but fewer than all.
Now you can do just that! Rather than specifying just one object name when invoking a script,
you can INSTEAD use the following command line options to specify whatever subset of object names
you are interested in processing.
-in <string ...>
-NOTin <string ...>
-like <string ...>
-NOTlike <string ...>
They are patterned after the SQL constructs of IN, NOT IN, LIKE, NOT LIKE. You can use any
combination of the above, and specify any number of strings.
For -in/-NOTin, the strings are case insensitive exact matches. If you need a particular string
to be treated as case sensitive, specify it thusly: '"My Objectname"'
For -like/-NOTlike, the strings are case insensitive wild card matches. But you get to decide
where the wild cards go by adding the symbol % to each string (at the beginning, middle, and/or end).
Example: nz_ddl_table PROD -like %fact% %dim% -notlike %_bu test% -in SALES INVENTORY -notin SALES_FACT
To experiment with these switches, and find out which objects will (or will not) be selected, try the
nz_get_***_names scripts (nz_get_table_names, nz_get_view_names, etc ...)
There is an additional option that allows you to limit the selection to only those objects owned
by the specified username.
-owner <username>
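For example (the names used here are made up), to preview which tables would be
selected before running a heavier script against them:

     nz_get_table_names PROD -like %fact% -notlike %test% -owner ETL_USER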
* * * * I N D E X * * * *
Major Utilities
DDL
nz_clone | To clone the DDL for an existing object, and optionally assign it a new name. |
nz_ddl | To dump out all of the SQL/DDL that defines this NPS system. |
nz_ddl_aggregate | To dump out the SQL/DDL that was used to create a user defined aggregate. |
nz_ddl_all_grants | To dump out ALL of the GRANT statements for this system. |
nz_ddl_comment | To dump out the SQL/DDL used to add a comment to an object. |
nz_ddl_database | To dump out the SQL/DDL that was used to create a database. |
nz_ddl_diff | Report any DDL "diff"erences between two databases. |
nz_ddl_ext_table | To dump out the SQL/DDL that was used to create an external table. |
nz_ddl_function | To dump out the SQL/DDL that was used to create a user defined function. |
nz_ddl_grant_group | To dump out the SQL/DDL that represents any access GRANT'ed to a group. |
nz_ddl_grant_user | To dump out the SQL/DDL that represents any access GRANT'ed to a user. |
nz_ddl_group | To dump out the SQL/DDL that was used to create a group. |
nz_ddl_history_config | To dump out the SQL/DDL that was used to create a history configuration. |
nz_ddl_library | To dump out the SQL/DDL that was used to create a user defined shared library. |
nz_ddl_mview | To dump out the SQL/DDL that was used to create a materialized view. |
nz_ddl_object | To dump out the SQL/DDL that was used to create any object (of any type). |
nz_ddl_owner | To dump out the SQL/DDL used to change the owner of an object. |
nz_ddl_procedure | To dump out the SQL/DDL that was used to create a user defined procedure. |
nz_ddl_scheduler_rule | To dump out the SQL/DDL that was used to create a scheduler rule |
nz_ddl_schema | To dump out the SQL/DDL that was used to create a schema. |
nz_ddl_security | To dump out the SQL/DDL for creating various security objects. |
nz_ddl_sequence | To dump out the SQL/DDL that was used to create a sequence. |
nz_ddl_synonym | To dump out the SQL/DDL that was used to create a synonym. |
nz_ddl_sysdef | To dump out the SQL/DDL for setting the system's default values. |
nz_ddl_table | To dump out the SQL/DDL that was used to create a table. |
nz_ddl_table_redesign | Provide alternative DDL for a table that optimizes each column's datatype. |
nz_ddl_user | To dump out the SQL/DDL that was used to create a user. |
nz_ddl_view | To dump out the SQL/DDL that was used to create a view. |
Statistics / Table Information
Hardware
nz_check_disks | To show S.M.A.R.T. information concerning the status of each SPU's disk drive. |
nz_check_disk_scan_speeds | Check the read/write speed of each disk or 'dataslice'. |
nz_mm | Display information about each MM (Management Module) and/or its blades. |
nz_ping | Diagnostic Script: 'ping' all of the Mustang SPUs to verify their location. |
nz_sense | Provide environmental "sense" data for Mustang series SPA's. |
nz_show_topology | Provide a report that describes the overall disk topology. |
nz_temperatures | Report temperature and voltage information for each of the Mustang SPAs. |
ACL (Access Control Lists) / Permissions
Diagnostic / Debugging / Support Tools
nz_catalog_size | To report information about the size of the catalog that resides on the host. |
nz_compiler_check | Verify that the C++ compiler (and its license) are operating correctly. |
nz_compiler_stats | Report various statistics about the utilization of the object code cache. |
nz_core | Dump the program execution/backtrace from a core file. |
nz_find_object | To help find+identify an 'object' -- based on its name or its object id value. |
nz_find_object_orphans | Diagnostic Script: Used to identify certain discrepancies within the catalogs. |
nz_find_table_orphans | Check that both the host and the SPU/S-Blades have entries for every table. |
nz_frag | Dump out extent/page allocations in order to visualize table fragmentation. |
nz_genc | Recompile code snippets (under /nz/kit/log/gencErrors) to identify any problems. |
nz_host_memory | Display the host's memory allocation table. |
nz_manual_vacuum | Vacuum and reindex the host catalogs. |
nz_online_vacuum | Reindex the host catalogs ... while the system is online |
nz_responders | Show interactions/responses for running queries (across each dataslice). |
nz_scan_table_extents | Scan 1 table ... 1 dataslice ... 1 extent at a time. |
nz_show_locks | Show information about locks being held on the system. |
nz_spu_memory | Provide a summary of how much memory the individual SPUs are using. |
nz_spu_swap_space | Provide a summary of how much swap space the individual SPUs are using. |
nz_spu_top | Show the current CPU Utilization and Disk Utilization on the S-Blades. |
nz_test | Run a test to verify that these scripts can connect to the database. |
nz_transactions | Display information about the current database transactions. |
Miscellaneous / Other
nz_abort | Abort the specified user sessions. |
nz_altered_tables | Display information about all versioned (ALTER'ed) tables. |
nz_backup_size_estimate | To estimate the storage requirements of a '-differential' database backup. |
nz_build_html_help_output | To generate HTML output that documents all of the individual nz_*** scripts. |
nz_catalog_diff | Display any "diff"erences between two different versions of the NPS catalogs. |
nz_catalog_dump | To dump out (describe) the catalog definition of all system tables and views. |
nz_check_views | Check each VIEW to make sure it is not obsolete (and in need of rebuilding). |
nz_columns | Create a table that will provide all column definitions across all databases. |
nz_compress_old_files | Compress (via gzip) old + unused files under the /nz directory. |
nz_compressedTableRatio | Estimate the compression ratio of a table or materialized view. |
nz_csv | To dump out the tables in a database in CSV format (comma separated values) |
nz_db_tables_rowcount | To display the tablename + rowcount for every table in a given database. |
nz_db_tables_rowcount_statistic | Display the tablename, rowcount, and storage size for all tables in a database. |
nz_db_views_rowcount | To display the rowcount + viewname for every view in a given database. |
nz_dimension_or_fact | Identify whether a given table is a dimension table or a fact table. |
nz_dump_ext_table | Dump out the header info found at the start of a compressed external table/file. |
nz_find_control_chars_in_data | Find any binary/non-printable control characters in a table's text columns. |
nz_find_non_integer_strings | Find any non-integer characters in a table's text column. |
nz_format | To format a block of SQL to make it more readable. |
nz_grep_views | Search all views, looking for any matches against the specified <search_string>. |
nz_groom | A wrapper around the 'groom' command to provide additional functionality. |
nz_inconsistent_data_types | List column names whose datatype is NOT consistently used from table to table. |
nz_invisible | Provide information about the number of visible and INVISIBLE rows in a table. |
nz_load4 | To speed up the loading of a single file by using multiple nzload jobs. |
nz_load_files | Utility script to assist with loading many data files into a table (or tables). |
nz_lock | Check to see if an exclusive lock can be obtained on a table. |
nz_maintenance_mode | Disable user access to the server, or to just the specific database(s). |
nz_pause | An alternative to "nzsystem pause". |
nz_physical_table_layout | To list out a table's columns -- sorted by their PHYSICAL field ID. |
nz_reclaim | A wrapper around the 'nzreclaim' utility to provide additional functionality. |
nz_record_skew | To check the record skew for a table. |
nz_replay | Extract queries (from the query history) so they can be replayed/retested. |
nz_replicate | A script to assist in replicating databases across two different NPS hosts. |
nz_rerandomize | To redistribute (or, "re-randomize") the rows in a table that are DISTRIBUTE'd ON RANDOM |
nz_rev | Report the major.minor revision level of the software running on the host. |
nz_select_fixed_data | Extract data from a table -- formatting each column to a fixed width. |
nz_select_quoted_data | Extract data from a table -- wrapping each column value in "double quotes". |
nz_set | Dump out the optimizer 'SET'tings that are currently in effect. |
nz_state | Mimics the "nzstate" and "nzstate -terse" commands ... but via SQL. |
nz_storage_stats | Report various storage statistics about the system. |
nz_sysmgmt_view_references | Identify the relationships between various system/mgmt tables and views. |
nz_tables | List ALL tables in a database, along with other useful pieces of information. |
nz_table_constraints | To dump out the SQL/DDL pertaining to any table constraints. |
nz_table_references | Identify any tables with PRIMARY KEY / FOREIGN KEY relationships. |
nz_unload | To unload data from a table swiftly ... via the use of remote external tables. |
nz_view_references | Identify the relationships between various user tables and views. |
nz_view_plan_file | View a *.pln plan file -- on a remote/client workstation. |
nz_watch | Watch (i.e., monitor) the system ... to verify that it is operational. |
Building Blocks
* * * * Major Utilities * * * *
nz_backup / nz_restore
Usage: nz_backup -dir <dirname> -format <ascii|binary|gzip> [ optional args ]
-or-
nz_restore -dir <dirname> -format <ascii|binary|gzip> [ optional args ]
Purpose: To backup (or restore) one or more tables.
An nz_backup must be run locally (on the NPS host being backed up).
An nz_restore can be used to restore data into a remote NPS host. Just
include the "-host" switch, or set the NZ_HOST environment variable.
Note: When doing an "nz_restore -format binary -host <xxx>", the script
issues an "nzload -compress true". This nzload feature only exists
in nzload as of NPS 4.6. If you want to do this on an older version
of NPS (4.0 or 4.5) then:
o Install a copy of the 4.6 client toolkit somewhere on your box
(it can be used against the older server releases)
o Add its bin directory to the start of your search PATH
o Then invoke nz_restore
These scripts can process a single table, multiple tables, or an entire database.
The data format that is used can be either
ascii -- which is very portable.
binary -- which is the database's compressed external table format. This
is much faster, and results in significantly smaller backup sets.
gzip -- ascii, which is gzip'ed on the NPS host.
The data can be written to (or read from) disk files or named pipes. If you
use named pipes, another application is used to consume (or produce) the
data. You provide the hooks to that other application.
These scripts just concern themselves with the DATA itself. When backing up
a table, the DDL is not included. When restoring a table, the script expects
the table to already exist. It will not create it. It will not truncate it
(so if the table currently has any data in it, that data will be left untouched
by this script).
To backup tables requires the following permissions:
GRANT SELECT ON _VT_HOSTTXMGR TO <user/group>;
GRANT SELECT ON _VT_HOSTTX_INVISIBLE TO <user/group>;
GRANT SELECT ON _VT_DBOS_CONNECTION TO <user/group>;
--To obtain information about transactions
GRANT LIST ON <DATABASE|dbname> TO <user/group>;
--The user must have access to the database that contains the tables
GRANT SELECT ON <TABLE|tablename> TO <user/group>;
--The user must have access to the tables themselves, and their data
GRANT CREATE EXTERNAL TABLE TO <user/group>;
--The user must be able to create external tables, into which the
--data will be unloaded
To restore/reload a table requires an additional permission:
GRANT INSERT ON <TABLE|tablename> TO <user/group>;
--The user must be able to insert (i.e., reload) data back into the tables
Options: REQUIRED ARGUMENTS
==================
-dir <dirname> [...]
The full path to the directory to which the data files will be written
(or from which they will be read). This directory must already exist, and
it must permit write access. The directory name can be as meaningful as
you wish to make it.
If you are running this script as some linux user other than 'nz', please
note that it will actually be one of the 'nz' processes that writes the
data into this directory. So linux user 'nz' must also have write access
to it. If you are using named pipes (rather than files) then this is the
directory where the named pipes will be created.
Examples:
-dir /backups/backup_set_17
-dir /snap_storage/bu/customer_db/2006_11_18
-dir /tmp
If desired, you may split the backup files up across multiple directories/
file systems. Each thread can be associated with a separate "-dir <dirname>"
by specifying them on the command line. If you use this feature, then the
number of directories specified must match the number of threads.
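For example (the paths are illustrative), a three-thread backup spread across
three separate file systems:

     nz_backup -db PROD -format binary -threads 3 -dir /backups/fs1 /backups/fs2 /backups/fs3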
When processing multiple schemas via this script, subdirectories will be used.
Each schema being processed will have its own subdirectory ... with the same
name. For an nz_backup, the subdirectory will automatically be created by this
script. For an nz_restore, the subdirectory must exist and contain the
relevant table files.
-file <filename>
By default, the files that this script reads+writes will have the same name
as the corresponding table that is being backed up or restored. This option
allows you to specify a different filename.
For example, you could backup a table using the specified filename
nz_backup -t SALES -file SALES_2013 ...
And restore the file into a different table
nz_restore -t SALES_OLD -file SALES_2013 ...
This option is only allowed when processing a single table.
-format <ascii|binary|gzip>
Identifies the format to be used for the output files.
ascii Universal in nature, but typically results in larger files and
slower performance.
binary The database's compressed external table format.
gzip ascii, which is then compressed (using gzip). By definition,
compressing and decompressing data uses up a lot of CPU cycles (i.e.,
it takes a long time). When using the binary format (compressed/
external), the work is done in parallel across all of the SPUs ... so
it is very quick. But this option uses the NPS host to gzip/gunzip the
data. You will (almost always) want to use multiple threads in order
to get more of the host's SMP processors involved in order to speed
things along. The sweet spot seems to be about 8 threads, though you
can certainly use a larger/smaller number if you want to break the
backup files up into more/fewer pieces.
best --to-- least
====== ====== ======
Speed: binary ascii gzip
Size: gzip binary ascii
Universality: ascii gzip binary
OPTIONAL ARGUMENTS
==================
-t <tablename> [...]
# Table(s) within the database to be processed. If none are specified,
# then all tables in the database will be processed.
#
# If you have a file that contains a list of tablenames to be backed up,
# (separated by spaces and/or newlines) it can be used via the following
# syntax:
#
# -t `cat /tmp/the_list_of_tables`
-threads <nnn>
# Breaking the backup into multiple threads (per table) can increase the
# overall throughput, especially for large tables. This will also have the
# effect of creating smaller backup files, since each one will now be
# broken up into multiple pieces.
#
# By default, only a single thread will be used. You can specify a number
# from 1..31. Whatever value you specify for the backup must also be used
# for the restore. In general, the sweet spot seems to be about 6 threads.
-script <scriptname>
# Instead of backing up (or restoring) the data from disk files, you can use
# named pipes -- allowing another application to consume (or produce) the data
# on the fly. To use named pipes, specify that application/script here. The
# script will be automatically invoked and passed (as arg 1) the fully rooted
# pathname of the named pipe that it is supposed to use.
#
# For example scripts, see the file(s)
# nz_backup.script_example
# nz_restore.script_example
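#
# As a minimal sketch (the target path here is hypothetical; see the
# *.script_example files above for the versions shipped with the toolkit),
# a backup-side consumer might simply read the named pipe it is handed
# and gzip the stream off to other storage:
#
#     #!/bin/bash
#     # $1 = fully rooted pathname of the named pipe created by nz_backup
#     gzip -c < "$1" > /other_storage/$(basename "$1").gz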
-whereclause <clause>
# Normally, nz_backup is used to backup the entire contents of a table --
# all visible rows. This option allows "you" to tack on a WHERE clause
# to the data that gets selected ... allowing "you" to backup a subset
# of the table. All the power (and responsibility) is put into your
# hands. Do wrap the clause in double quotes so it will be passed into the
# script correctly. Examples:
#
# -whereclause "customer_key = 2"
# -whereclause "customer_key in (1,3,5)"
# -whereclause "region_name = 'AMERICA' or region_key = 0"
# -whereClause "order_date between '1998-01-01' and '1998-12-31'"
#
# Because this clause gets applied to all tables being backed up, you would
# probably only want to backup a single table at a time (when using this
# clause) ... since the clause will typically contain column names that are
# specific to that table.
#
# This clause only applies to backups (not restores). Its use will be logged
# in the output of this script (as well as in the pg.log file).
-ignoreTxID
# Tables are individually backed up, one at a time. Since a backup may span
# many hours, this script ensures that the backup represents a consistent
# point-in-time by using the transaction IDs attached to each row.
#
# This switch will override that feature ... and backs up each table with
# whatever data it contains when the backup (of that particular table) is
# kicked off. This ensures that your backup will include ALL of the data
# in a table that has been committed (so you're not susceptible to long
# running or stale transactions).
#
# This switch is primarily of use with NPS 4.0. In later releases, this
# script is able to do things differently.
-dropdata
# This is for testing purposes only. As the name implies, the backup will
# be written to /dev/null, resulting in no backup at all. This is useful
# for testing the performance of the NPS components that are involved
# (SPUs/S-Blades ==> Host), while excluding the speed/overhead of your host
# storage. The "-sizedata" option is basically the same ... but provides
# more information.
-sizedata
# This is for testing purposes only. Like "-dropdata", but rather than
# sending the backup data directly to /dev/null it will first be piped
# thru "wc -c" in order to count the number of bytes in the backup
# stream (e.g., to provide you with actual sizing information). So it
# has the performance characteristics of "-dropdata" ... but provides
# you additional information.
#
# You can use "-format <ascii|binary>" and 1 or multiple "-threads <nn>"
# when using this switch.
#
# Each table will include the following pieces of information
#
# Info: source table size 80,740,352
# Info: backup file size 76,574,691
#
# And summary lines for the entire backup set will display the
#
# TOTAL source table size : 48,267,526,144
# TOTAL backup file size : 46,088,648,210
#
# The table size is whatever the table size is -- the amount of storage
# space it is using on disk (as reported by nz_db_size or nz_tables). This
# script doesn't know/care if the data is compressed on disk (e.g., CTA0/1/2).
# Nor does the script know if there are any logically deleted rows in the
# table (taking up space in the table, but which would not be part of a
# backup data set).
#
# The backup size is the amount of storage that would be required if the
# backup data set was actually written to disk. This would represent either
# the ascii or the binary (compressed external table format) version of the
# data ... whatever you chose.
#
# To get sizing information for a full nzbackup, e.g.
# nzbackup -db DBNAME -dir /tmp
# you would use a command line such as this
# nz_backup -db DBNAME -dir /tmp -format binary -sizedata
-delim <char>
# Separator between fields [ default = | ] when using "-format ascii" or
# "-format gzip". If you override the default for nz_backup, you must
# do the same when invoking nz_restore against those data files.
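#
# For example, to use a comma as the field separator for both halves of the
# operation:
#
#     nz_backup  -db PROD -dir /backups/bu1 -format ascii -delim ','
#     nz_restore -db PROD -dir /backups/bu1 -format ascii -delim ','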
Outputs: Status/log/timing information will be sent to standard out ... and will
include information about any ERROR's that might be encountered.
Exit status: 0 = success, non-0 = ERROR's were encountered
Examples: $ nz_backup -format binary -dir /backupdir
$ nz_restore -format ascii -dir /tmp -db my_db -t table1 -script /tmp/my_script
Comparison: nzbackup/nzrestore nz_backup/nz_restore
================== ====================
NPS CLI Utility Add-on Script
Backup Type
Full X X
-differential X
-cumulative X
Granularity
Entire Database X X
Individual Table(s) nzrestore X
Formats Supported
Compressed Binary X X
Ascii X
Ascii gzip'ed X
Output Destination
Veritas (NPS 4.0) X
Tivoli (NPS 4.6) X
Disk X X
Named Pipes X
Multi-Stream Support nzbackup (in 6.0) X
DDL included as part X Use the nz_ddl* scripts
of the backup set
nz_best_practices
Usage: nz_best_practices [-quiet|-verbose ] [-rowcount|-tablename] [ database ]
Purpose: Report on how well a database conforms to a subset of "best practices".
The items that are flagged are simply being brought to your attention.
You may already be well aware of them, and have valid reasons for all
of the design and operational choices you have made.
But if you aren't aware of some of these items, perhaps they warrant a
further look on your part.
Note: This script could take a long time to run.
See Also: nz_columns, nz_inconsistent_data_types
Inputs: -quiet Controls the level of additional commentary that is
-verbose included in the output. (It is the same commentary each
time you run this script.) The default is -verbose
-rowcount Controls the sort order for the output. You can sort
-tablename it numerically by the table's rowcount statistic. Or
you can sort it alphabetically by the table's name.
The default is by the -tablename
The database name is optional. If a database is not specified,
then all databases / schemas will be processed.
Outputs: The analysis of your system, to include
o Statistics Status
o Distribution Issues
o Data Type Issues
o Inconsistent Data Type Usage
This script compares columns+datatypes within a single database. To
compare them across ALL databases use "nz_inconsistent_data_types"
o Reclaimable disk space
nz_cksum
Usage: nz_cksum [ -verbose ] [-slow|-columns|-fast|-count] [database [table [column ...]]] [-groupBy <str>]
Purpose: Checksum and count the rows in a table (or a view).
This script will calculate a checksum value for a table (by generating and
running some fancy SQL against the data in that table). If multiple instances
of the table exist (in other databases or on other servers) this checksum
can be used to determine whether or not the two tables "appear" to contain
identical data.
As this is simply a mathematical calculation, matching checksums are not a
100% guarantee that two tables are indeed identical. In linux terms, it is
the difference between the 'cksum' and 'cmp' programs.
To confirm that two tables are indeed identical, or to find the differences
between the tables, you could use the following SQL. But note that it involves
four scans of the table (all rows/all columns) ... so it can put a significant
load on your system.
( SELECT * FROM table_1 EXCEPT SELECT * FROM table_2 )
UNION ALL
( SELECT * FROM table_2 EXCEPT SELECT * FROM table_1 );
The DDL for the table is integrated into the checksum calculation. If
two tables contain exactly the same values ... but are defined somewhat
differently ... they will produce two different checksum values.
No special permissions are required to run this script. If you can access
the data in the tables, you can generate a cksum for the data.
Inputs: All arguments are optional.
If the database name is not specified, then this script will process
all databases / schemas / tables.
If a database name is specified, the script will process all tables in
the specified database (within the specified schema).
If a table name is specified, the script will process just that table.
Alternatively, you can specify a view name rather than a table name.
If the column(s) are specified, then only those column(s) will participate
when doing a '-slow', '-columns' or '-fast' checksum. (The '-count'
checksum doesn't involve any columns to begin with, since it only counts
the number of rows.)
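For example (the database, table, and column names are hypothetical), to checksum
just two columns of one table, one column at a time:

     nz_cksum -columns PROD CUSTOMERS CUST_ID CUST_NAME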
-verbose If this argument is included, then the SQL that the script generates
(to compute the checksum) will also be echoed to standard out.
You may specify one of the following 'options' to be used for calculating the
checksum value. Each option results in a DIFFERENT set of calculations -- and
thus a different checksum value -- for any given table. The default is "-slow".
-slow The query generated by this script could take a long time to
run because its checksum will use ALL of the table's columns
in its calculations. The more columns there are in your
table, the longer the script will take to run.
-columns Like "-slow", but a checksum is computed + displayed for each
of the individual columns in the table. This will run slower
than -slow. If the checksums for two tables do not match, using
this option might help narrow the problem down to a specific
column and/or datatype.
-fast This query will run much faster (almost as fast as the simple
'-count' option shown next) because the script chooses a
single, representative column from the table -- and uses
only it in the checksum calculations.
-count The query will simply run a "select count(*) from <table>;"
Thus, the checksum will be based solely upon the table's
rowcount, and not any of its data/contents.
Normally the script generates a single checksum for the table as a whole.
The following option allows you to specify a column to group the data by,
producing multiple instances of a checksum and value pair.
-groupBy <str> # Where <str> can be a column name
# Or it can be a derived column
For example: nz_cksum ... -groupBy "to_char(TRANSACTION_DATE, 'YYYY')"
Which will result in multiple lines of output (multiple checksums being
generated), one for each group. Example:
58536183.1882613261548 601221 2013
58536188.9418109700016 79482 2014
58536183.2479770377668 5276 2015
Where the first column/value is the checksum, the second column/value
is the rowcount, and the third column/value is the groupBy value.
Other interesting uses might be
-groupBy datasliceid
-groupBy createxid
Outputs: The output of this script will look something like this (one line per table)
9012.112321 100000 customer_table
417895.0 0 test_table
Where the first numeric/decimal value is the checksum value
Where the second integer value is the rowcount
Where the third value is the tablename processed
If you use the "-columns" option, the output will look something like this,
for any given table
$ nz_cksum -columns tpch supplier
SUPPLIER | Checksum Value
--------------- | ----------------------
S_SUPPKEY | 31333.3568127940196
S_NAME | 45840.7346410200000
S_ADDRESS | 29880.1814736320796
S_NATIONKEY | 28828.7611842798212
S_PHONE | 48672.5877128160000
S_ACCTBAL | 26001.1575312253976
S_COMMENT | 31472.7052920206732
--------------- | ----------------------
DDL Cksum | 4031294125
Table Rowcount | 20000
This script generates a lot of SQL math. Sometimes a runtime error,
such as one of the following, might be encountered
ERROR: Precision range error
ERROR: overflow in 128 bit arithmetic
because this script
o is very generic in nature
o favors speed over thoroughness
o has to handle all different data types
o has to handle all possible ranges of values
When this occurs, the script will simply display the following warning
message. This warning can be ignored -- as it simply reflects a limitation
of this script.
Unable to compute a cksum for <table>
nz_db_size
Usage: nz_db_size [ database [ objectname ]] [ optional args ]
Purpose: Displays the amount of storage space that is being used by each table.
Whether the table is compressed or uncompressed makes no difference to this
script. The amount of storage that the table is using is the amount of
storage that the table is using.
This script can report storage information for
Tables / Temporary Tables / Secure Tables
Materialized Views
and for new objects that do not yet exist (the transaction in which they are
being created has not yet been committed).
It reflects just the primary copy of the data, which includes
o The page header (32 bytes per 128KB page)
o The record(s), and their associated record headers/padding (if any)
o The space taken up by any logically deleted (e.g., reclaimable) records
Alternatively, if you use the "-pages" or "-extents" switch, it will
display the amount of storage space that has been allocated.
To use this script, you must have SELECT access to the following objects
_V_OBJ_RELATION_XDB
_V_SYS_OBJECT_DSLICE_INFO
and, if you are using the "nz_db_size -ctas" option, you must also have access to
_VT_TXUNFINISHED
_VT_TBL_IDS (rev 4,5)
_VT_TBLDICT (rev 6+)
As of NPS 6.0+
The system always reports the amount of space used in terms of 128KB
pages. So the numbers will always be the same regardless of whether
(or not) you've included the optional "-pages" switch.
If you have a versioned Table/Secure Table, as the result of doing an
ALTER TABLE <table> [ADD|DROP] COLUMN ...
the string "(Versioned)" will be displayed after the tablename in
order to indicate that.
If you are trying to monitor the size of a table as it is changing/growing
include the "-extents" option.
See also: nz_tables
Inputs: The database name and objectname are optional. If an objectname is not
specified, then all objects within the database will be shown. If the
database name is not specified, then all databases will be processed.
-summary If this option is selected, then summary information for
ALL databases / ALL schemas will be displayed (i.e., there
will be no table level info).
-detail      Whereas if this option is selected, then just the information
             about the individual tables/objects will be shown (i.e., the
             summary information for the Appliance/Database/Schema will be
             skipped). This also makes the report somewhat simpler and
             faster.
-bytes By default, sizes will be reported in all of these units of
-kb measurement. You can specify that only specific units of
-mb measurement be used by selecting any combination of these
-gb switches.
-tb
-s|-size By default, the output will be sorted by "Object" type and
"Name" (ascending).
This switch will cause the output to be sorted by the table
size (descending).
-owners Include the database/table owner in the output listing
-owner <name> ONLY show objects belonging to this specific owner. The totals
(for the database and for the appliance) will reflect only the
objects that belong to this owner.
-ctas|-temp These interchangeable options allow you to see tables that you
wouldn't normally be able to see. Thus you can look at what
tables are being created, how much space they are taking up,
and you can monitor their growth/progress. This includes
o Temporary tables
o Tables currently being created (Create Table AS ....)
o Materialized views currently being created
o Tables created within an explicit transaction block which
has not yet done a COMMIT
o nzrestore operations
The output that is generated is for ALL databases/schemas.
-used This script reports on how much disk space is actually taken up by the
-pages data. But that is not the same thing as reporting how much storage
-extents space has been allocated to the table. Space is allocated in 3MB
extents. Each extent is subdivided into 24 128KB pages. The SPU
reads/writes a page of data at a time. If an extent only has two
pages with data, then NPS will only read those two pages.
By default, this script displays the amount of storage "-used". If
you want to know how much storage space has been allocated (in terms
of 3MB extents) or used (in terms of 128KB pages) then include one of
these optional switches. You can use any combination of these three
switches.
For example: nz_db_size -bytes -used -pages -extents
As of NPS 6.0, the system reports the amount of space used in terms of
128KB pages. Thus, the "-used" option shows the exact same numbers as
the "-pages" option.
Putting this in linux terms, take the following example:
$ du -bh /etc/motd
109 /etc/motd
$ du -sh /etc/motd
4.0K /etc/motd
The first version of the "du" command shows that the file contains
109 bytes of data. The second version shows that the file has been
allocated 4K of disk space. Two different ways to measure the same
thing. Two completely different numbers. Use whatever measurement
is appropriate for your needs.
The "nzstats -type table" command reports space in terms of the
number of 128KB pages that have been used (partially or completely).
For reporting purposes, these two commands are comparable:
nz_db_size -kb -pages
and
nzstats -type table -colMatch "DB Name","Table Name","Disk Space"
-units As mentioned, you can see how much storage is being used up in terms
of "-pages" or "-extents". The page information is always a multiple
of 131,072. The extent information is always a multiple of 3,145,728.
This switch will display the actual NUMBER of pages/extents involved.
It can be used in any combination with the switches
-bytes -kb -mb -gb -tb
When using this switch, you should also specify whether you want to
see the # of units displayed for "-pages" or for "-extents" (or both).
Simply specifying
nz_db_size -units
Is shorthand for
nz_db_size -bytes -units -used -pages -extents
-growth If you are trying to monitor the size of a table as it is changing/
growing include this switch. (This is only needed as of the 6.0
release. And this is exactly the same as using the "-extents"
switch. But "-growth" might be more meaningful and easier to
remember.)
Notes: A minimalist approach ... if you wanted to display just one metric about one table
nz_db_size DBNAME TBLNAME -bytes -detail
The corollary to that would be to display everything about everything
nz_db_size -used -pages -extents -bytes -kb -mb -gb -tb -units -owners
Outputs: By default, a report such as this will be produced. You are only allowed to see
info for tables that you normally have access to ... and the rolled up numbers (for
the Appliance or Database) will be representative of that as well.
Object | Name | Bytes | KB | MB
-----------+----------------------------------+----------------------+------------------+--------------
Appliance | server7 | 18,279,660,781,568 | 17,851,231,232 | 17,432,843
Database | TEST_DB | 66,977,792 | 65,408 | 64
Table | EMPTY_TABLE | 0 | 0 | 0
Sec Table | ROW_SECURITY_TABLE_A | 12,058,624 | 11,776 | 12
Sec Table | ROW_SECURITY_TABLE_B (Versioned) | 24,117,248 | 23,552 | 23
MView | SAMPLE_MVIEW | 2,359,296 | 2,304 | 2
Table | SAMPLE_TABLE_A | 8,781,824 | 8,576 | 8
Table | SAMPLE_TABLE_B (Versioned) | 19,398,656 | 18,944 | 19
Table | SAMPLE_TABLE_C | 262,144 | 256 | 0
nz_gra_history (nz_gra_history.ddl)
Usage: nz_gra_history [ database [ table ]]
Purpose: Copy the Scheduler's GRA information out to a permanent table on disk.
The Scheduler's GRA (Guaranteed Resource Allocation) information is stored
in memory in a virtual table (_V_SCHED_GRA). This script can be used to copy
it out to a permanent table. By doing this
o Any BI tool can query the information easily
o The information is persistent, and will be maintained after an nzstop
If you wish to collect this information, this script should be run periodically
(e.g., once an hour) via a batch/cron job to update the permanent table with
the latest data from the Scheduler's GRA table.
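As a sketch, a crontab entry for doing that might look like this (the install
directory shown is typical, but may differ on your system; output is simply
appended to a log file under /tmp):

     0 * * * *  /nz/support/bin/nz_gra_history >> /tmp/nz_gra_history.log 2>&1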
Note: As of NPS version 6.0, the system view _V_SCHED_GRA_EXT provides
additional ('EXT'ended) information. That view will be used to populate
the permanent table (assuming the DDL used to create the NZ_GRA_HISTORY
table included those additional columns).
Note: As of NPS version 7.0, the system view _V_SCHED_GRA_EXT has been
changed yet again and includes additional columns.
Inputs: By default, the information will be loaded into the NZ_GRA_HISTORY
table in the SYSTEM database -- which you must create ahead of time
(use the SQL/DDL in the file nz_gra_history.ddl for this purpose).
[ database ] Alternatively, you can specify a different database
[ table ] and table that you want the GRA history data loaded into.
Outputs: A status message, such as the following, will be sent to standard out
Load session of table 'nz_gra_history' completed successfully
nz_health
Usage: nz_health
Purpose: Reports on the overall health of the system.
Any specific problems that might warrant your attention will be
identified in the resultant output.
Inputs: None
Outputs: The report will check upon, and include the following information
o database uptime
o software version
o system.cfg settings
o active sessions
o S.M.A.R.T. disk errors on SPUs
o hardware issues (failed hardware, regens taking place, ...)
o topology problems
o interrupted nzreclaim jobs
o disk utilization (min/avg/max)
o extreme table skew
o shared memory utilization
o host processes
o host storage utilization
o core files
o interesting errors found in the pg.log file
o Etc...
nz_help
Usage: nz_help [search_string]
Purpose: Provide help (a listing) of the scripts that are included with this toolkit.
Notes:
All scripts provide online help, which can be displayed by invoking the
individual script with a "-?" or "-h" or "-help" option.
While script names themselves are case sensitive (because linux is),
options/switches/arguments to these scripts are case-insensitive.
If you used delimited/quoted object names within the database, then you must
take care when specifying them on the linux command line.
o You must enter them as case sensitive
o You must wrap them in double quotes
o And to protect the double quotes (from the linux shell) you must
  wrap them again in single quotes! For example: '"My Quoted Object Name"'
Inputs: If the optional search_string is specified, then only those scripts that
match the search string will be listed. A case insensitive wild-carded
search is performed.
-v|-verbose Display the entire online help text for the script. Thus,
this command would display all online help/documentation
for all scripts: nz_help -v
-version Display the "NPS Versions" supported by these scripts.
Outputs: A listing such as this will be produced.
$ nz_help synonym
Toolkit Version: 7.2 (2016-07-04)
For NPS Versions: 4.6, 5.0, 5.2, 6.0, 6.1, 7.0, 7.1, 7.2
Directory Location: /nz/support/bin
nz_ddl_synonym To dump out the SQL/DDL that was used to create a synonym.
nz_get_synonym_definition For the specified synonym, return the REFERENCED object.
nz_get_synonym_name Verifies that the specified synonym exists.
nz_get_synonym_names List the synonym names found in this database.
nz_get_synonym_objid List the object id for the specified synonym
nz_get_synonym_owner List the owner (creator of) the specified synonym
nz_migrate
Usage: nz_migrate -sdb <dbname> -tdb <dbname> -thost <name/IP> [optional args]
Purpose: Migrate (i.e., copy) database table(s) from one NPS server to another.
It can also be used to make a copy of a database/table on the same server.
Source The server/database containing the data which is to be copied.
Target The server/database to receive the data. The database and
table(s) must already exist on the target.
Optionally, this script can automatically create the entire
target database, or just the target table(s), via the options
-CreateTargetDatabase YES
-CreateTargetTable YES
This script can be invoked from the source host or from the target host.
Or, it can be invoked from a linux client -- though that would not provide
the best performance as the data has to flow from the source to/thru the
client and then back out to the target.
SOURCE ==> client ==> TARGET
Regarding Materialized Views: One would never migrate the data in a materialized
view directly (or load it, or back it up, for that matter). You only deal with
the base table itself ... and the materialized view is automatically maintained
by the NPS software. However, there is additional overhead if the system has to
maintain the MView along with the base table. When doing such a migration
o The data first gets migrated/written to the temp/swap space partition
o The temp/swap space is then read and written to the target table
o The target table is then read and written to the associated MView
You should consider suspending any relevant MViews on the target machine prior
to doing the migration ... and then refreshing them upon completion.
ALTER VIEW view MATERIALIZE { REFRESH | SUSPEND }
If the target table is distributed randomly and you choose to use "-format binary",
then you must first suspend any mviews associated with that table.
Regarding Sequences: The only 'data' associated with a sequence is the definition
(the DDL) of the sequence itself, which you can obtain via the script 'nz_ddl_sequence'.
This script can be run as any user. No special privileges need to be granted
beyond the basics (such as the ability to read from the source tables and
create external tables, and the ability to write to the target tables).
To migrate tables requires permissions such as these:
On the SOURCE system
GRANT LIST ON <DATABASE|dbname> TO <user/group>;
--The user must have access to the database that contains the table(s)
GRANT SELECT ON <TABLE|tablename> TO <user/group>;
--The user must have access to the tables themselves, and their data
GRANT CREATE EXTERNAL TABLE TO <user/group>;
--The user must be able to create external tables, into which the
--data will be unloaded
On the TARGET system
GRANT LIST ON <DATABASE|dbname> TO <user/group>;
--The user must have access to the database that contains the table(s)
GRANT SELECT, INSERT, TRUNCATE ON <TABLE|tablename> TO <user/group>;
--The user must be able to insert (i.e., nzload) data into the target
--tables. The TRUNCATE privilege is only needed if you want to use
--the command line option: -TruncateTargetTable YES
Inputs: REQUIRED Arguments
==================
-sdb <dbname> # Source database name
-tdb <dbname> # Target database name
-shost <name/IP> # Source host
-thost <name/IP> # Target host
# The "-shost <name/IP>", if not specified, will default to "localhost"
# (i.e., the current system). If you are not invoking this script directly
# on the source host, then this value must be specified.
#
# If -shost and -thost are both set to the string "localhost", then
# this script will recognize that you simply want to copy the data
# from one database into another (on the same machine). In which case
# it will perform a simpler/faster cross-database INSERT (rather than
# doing a 'migration' which has more overhead)
A minimalist command line might be as simple as something like this:
nz_migrate -sdb proddb -tdb testdb -thost devbox
OPTIONAL Arguments
==================
-suser <user> # Source user [SUSER]
-tuser <user> # Target user [TUSER]
-spassword <password> # Source password [SPASSWORD]
-tpassword <password> # Target password [TPASSWORD]
-sport <#> # Source port [SPORT]
-tport <#> # Target port [TPORT]
# If any of these arguments are NOT specified, the default NZ_USER,
# NZ_PASSWORD, and NZ_DBMS_PORT environment variables will be used
# in their place when connecting to the source and target machines.
# Rather than passing these arguments on the command line, you could
# instead specify them via the environment variables listed above.
#
# For the passwords this would be more secure -- as the passwords
# wouldn't appear if someone issued a 'ps' command. Alternatively,
# if you have set up the password cache (via the 'nzpassword' command)
# the passwords can be obtained directly from that.
#
# Exceptions: If you do NOT invoke nz_migrate from the SOURCE host,
# then an ODBC connection is used to unload the data ...
# and the password cache cannot be used (for the SOURCE).
#
# If you include the "-TargetOrderByClause <clause>",
# then an ODBC connection is used to load the data ...
# and the password cache cannot be used (for the TARGET).
#
# Generally, the NZ_DBMS_PORT is never specified (so the default value
# of 5480 is used). This simply allows you to override the default if
# you need to.
-sschema <schema> # Source schema [SSCHEMA]
-tschema <schema> # Target schema [TSCHEMA]
# This only applies if the database supports and is using schemas. If so, then ...
# If the above options are specified they will be used.
# Otherwise, the default NZ_SCHEMA (or "-schema <schema>") option will be used.
# Otherwise, the default schema (for that particular database) will be connected to.
#
# The schema, in both the source and target databases, must already exist.
#
# Other options:
#
# Multiple schemas can be processed at one time by specifying
# -schemas ALL # To process "ALL" schemas in the source database
# -schemas test # To process the specified schema
# -schemas s1 s2 s3 # To process the specified list of schemas
#
# If you include the command line option "-CreateTargetTable yes" then
# o If the target schema does not already exist it will first be created (with the
# same name as the source schema). No other objects (views, synonyms, etc ...)
# will be created within the schema ... just the schema itself.
# o If the table (in the target schema) does not already exist, it will be created.
-t|-table|-tables <tablename> [...]
# The table(s) within the database to be migrated. If none are specified,
# then all tables in the source database will be migrated. This switch
# can be specified multiple times.
#
# Note: If a table contains a column of type INTERVAL (which is rather
# atypical) you may have to use the "-format binary" option.
# If the target system is running NPS 7.2+, the nzload'er now
# has support for interval data types ... allowing you to choose
# either "-format ascii" or "-format binary" when doing a migration.
-exclude <tablename> [...]
# If you don't specify a list of tables to be migrated, then all of the
# tables in the source database will be migrated. This option allows
# you to identify specific tables that are to be EXCLUDE'd from that
# migration.
-tableFile <filename>
# You can specify a file which lists the tablenames to be migrated (the
# names can be separated by newlines or tabs). This switch can be
# used in conjunction with the -table option. This switch can be
# specified multiple times.
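#
# For example, assuming a (hypothetical) file /tmp/table.list that contains
# one tablename per line:
#
#     nz_migrate -sdb proddb -tdb testdb -thost devbox -tableFile /tmp/table.list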
-tableSQL <SQLQueryString>
# You can use this option to dynamically generate the list of tablenames
# to be migrated, via a user-specified SQLQueryString that runs against
# the source database. This allows you to include/exclude many tables
# all at once.
#
# Sample Usage:
# -tableSQL "select tablename from _v_table where objtype = 'TABLE' and tablename not like 'TEST%'"
#
# This option can be specified only once. This option must be used by
# itself (and not in conjunction with the -table or -tableFile options).
-sTable <tablename> # source Table
-tTable <tablename> # target Table
# This option allows you to migrate a source table into a target table
# with a different name. When using this option you may only process
# one table at a time via this script. You must specify both the source
# tablename and the target tablename. The -CreateTargetTable and
# -TruncateTargetTable options may be used. You may not use the
# "-t <tablename>" option at the same time.
-format <ascii|binary>
# The data transfer format to be used. The default is ascii.
#
# ascii Universal in nature, but results in a larger volume of data
# needing to be transferred -- which means the migration will
# typically take longer. It is especially well suited for
# doing migrations
# o between different sized machines (i.e., TF12 --> TF48 )
# o when the DDL differs slightly between the two tables
# (i.e., different data types, distribution keys, etc.)
#
# binary The database's compressed external table format. It can result
# in higher throughput, and thus better performance, especially
# when doing migrations between the same sized machines (machines
# which have the same number of dataslices).
#
# Normally, data is compressed on the source SPUs and then later
# decompressed on the target SPUs. But if the # of dataslices
# does not match, then the decompression will actually occur on
# the target host rather than on the target SPUs (thus, losing
# out on the performance advantages of parallelism). This can be
# offset by using multiple "-threads <n>" ... in which case the
# decompression will now run in multiple processes on the target
# host (thus making use of multiple SMP processors/cores).
#
# Note: In 4.5 (and earlier revs) nzodbcsql was used to load the
# data into the target machine. Which means that the
# password cache could not be used for the target host.
#
# In 4.6 (and later revs) nzload is used to load the data
# into the target machine. Which means that the password
# cache can now be used.
-threads <n>
# Each table will be processed by using '<n>' threads (parallel unload/load
# streams) in order to make optimal use of the SMP host and the network
# bandwidth. By default
# 1 thread will be used for small tables ( < 1M rows )
# 4 threads will be used for larger tables
# Or, you can override the number of threads to be used with this switch.
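#
# For example, to use 8 threads when migrating one (hypothetical) large table:
#
#     nz_migrate -sdb proddb -tdb proddb -thost devbox -t BIG_FACT_TBL -threads 8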
-cksum <yes|no|fast|slow> [column ...]
# Should a cksum be run against both tables to confirm that the source and
# target tables (appear to) contain the same data? The default is "Yes".
# Choices are
# Yes | Count -- Perform a simple "select COUNT(*) from <table>"
# No -- Perform NO cksum
# Fast -- Perform a FAST cksum (only 1 column is used)
# Slow -- Perform a SLOW cksum (all columns are used)
#
# The source table's cksum is performed at the start of the migration ...
# so that it reflects the data that is being migrated (in case someone
# inserts/updates/deletes rows from the table sometime after the migration
# process began). It is run as a separate/background job so as to not
# delay the start of the data flow.
#
# The target table's cksum is performed at the end of the migration ...
# after the data is in it.
#
# If column(s) are specified, then only those column(s) will participate
# when doing a Slow or Fast checksum. The Count checksum doesn't involve
# any columns to begin with, since it only counts the number of rows.
#
# See also: nz_cksum
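#
# For example (the column name used here is only illustrative):
#
#     -cksum fast customer_id    # fast cksum, using just the customer_id column
#     -cksum no                  # skip the cksum step entirely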
-TruncateTargetTable <no|yes>
# Before loading any data into the target table, TRUNCATE it to ensure it
# is empty. If you use this switch be careful not to accidentally mix up
# your source and target hosts. Or your original/source data will be LOST
# FOREVER! The default value is NO.
-CreateTargetTable <no|yes>
# If the table does not already exist in the target database, do you want
# this script to attempt to automatically create it for you? The default
# value is NO. If YES, the table will be created, and its definition will
# include any primary key and unique column constraints that may have been
# defined. The script will also attempt to add any relevant foreign key
# constraints to the table. However, as those are dependent upon other
# tables, that operation may or may not be successful. If that operation
# fails it will not be treated as an error by this script. Rather, the
# following notice will be included in the output.
#
# ....NOTICE: One (or more) of the foreign key constraints was not added to the table
#
# The script 'nz_table_constraints' will be helpful if you wish to add
# those table constraints back in later on.
-CreateTargetDatabase <no|yes>
# If the target DATABASE itself does not exist, do you want this script
# to attempt to automatically create it (and ALL of its tables and other
# objects) for you? The default value is NO. If YES, then this script
# will (basically) run
#
# nz_ddl SOURCE_DATABASE_NAME -rename TARGET_DATABASE_NAME
#
# on the source system (to obtain the original DDL), and then feed that
# into nzsql running against the target system (to create the objects).
#
# Many non-fatal errors might occur (for example, a CREATE VIEW statement
# might reference objects in another database ... that doesn't exist on the
# target system ... and thus, that DDL statement would fail). For purposes
# of SUCCESS/FAILURE, the only thing that this script cares about is whether
# the target database itself was successfully created.
#
# As the output from these operations can be quite voluminous, the details
# will be logged to a disk file in case you wish to reference them.
#
# An example ... migrating a database, and all of its DDL and DATA:
# nz_migrate -shost my_host -sdb OLD_DB -createtargetdatabase True \
# -thost your_host -tdb NEW_DB
#
# But to migrate just the DDL (no data) ... this would be a good approach:
# nz_ddl OLD_DB -rename NEW_DB | nzsql -host your_host
#
# See 'nz_ddl -help' for additional information.
# Regarding SCHEMA's
# ==================
# If you do not specify that you want to process multiple schemas (via the
# the "-schemas" option) then only the objects+tables in the default schema
# (in the source database) will be created and migrated to the target
# database that gets created.
#
# Multiple schemas can be created + processed by specifying
# -schemas ALL # To process "ALL" schemas in the source database
# -schemas test # To process the specified schema
# -schemas s1 s2 s3 # To process the specified list of schemas
#
# The specified schemas will be created. All objects (tables, views, synonyms, etc ...)
# in those schemas will be created. All tables in those schemas will be migrated.
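#
# For example, to clone a database (DDL + data), including all of its schemas
# (the host/database names are only illustrative):
#
#     nz_migrate -shost prod_host -sdb PROD_DB -thost dev_host -tdb DEV_DB \
#         -CreateTargetDatabase yes -schemas ALL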
-CreateTargetUDX <no|yes>
# This option only applies when the following two conditions are met
# You specified '-CreateTargetDatabase yes'
# The target database does not exist (and thus will be created)
#
# If both conditions are true, then this option can be used to specify
# that the script should also recreate any user defined FUNCTION, AGGREGATE
# or LIBRARY that existed in the source database.
#
# The default for this option is NO.
#
# If you specify YES then please note the following:
# o On the source machine, you must be running as the linux user "nz"
# so that the object files (under /nz/data ) can be accessed
# o The object files are scp'ed to the target machine, so this script
# (e.g., scp) will most likely prompt you for the necessary password
# o The source and target machines must be of a compatible architecture
# (e.g., Mustang-->Mustang or IBM PDA N1001-->IBM PDA N2001)
#
# As an alternative for copying/creating UDX's, you might want to consider
# the following scripts:
# nz_ddl_function / nz_ddl_aggregate / nz_ddl_library
-to <email@recipient>
# Email address to be sent a copy of the output from this script
-errors_to <email@recipient>
# Email address to be sent a copy of the output from this script ...
# but only if an error is encountered.
-mail_program <script_or_executable>
# When sending email, this script invokes
# /nz/kit/sbin/sendMail if you are the linux user "nz"
# else /bin/mail for all other linux users
#
# This option allows you to tie in a different mailer program (or shell
# script) without needing to make any changes to the nz_migrate script
# itself.
-genStats <none|full|express|basic>
# After each individual table migration completes, this script can
# automatically issue a GENERATE STATISTICS command against the target
# table. It is invoked as a background job so as to not block the rest
# of the migration process.
#
# Options are
# None -- Do nothing (the default)
# Full -- Generate /* full */ Statistics
# Express -- Generate EXPRESS Statistics
# Basic -- Generate /* basic */ Statistics (as of 4.6+)
#
# For more information about statistics, see 'nz_genstats -help'
-SourceObjectType <table|any>
# nz_migrate is used to copy data from a "source table" into a "target table".
# But why can't the source be something other than a table? Such as a
# VIEW or a MATERIALIZED VIEW or a SYNONYM or an EXTERNAL TABLE. This is
# rather atypical, but allowed (with certain caveats).
#
# To do this you must
# -- specify the source objectname(s), via the "-table" or "-sTable" option
# -- include the "-SourceObjectType any" option
#
# Otherwise, the default checks that are performed by this script will ensure
# that the SourceObjectType is truly a "table" and you won't be allowed to
# migrate data from objects that are not of type table.
#
# Caveats
#
# When the source object is a VIEW, MATERIALIZED VIEW, or EXTERNAL TABLE
# then only one thread/stream is used to migrate the data (because the
# DATASLICEID column, which is used to parallelize the streams, is not
# supported for such objects).
#
# The target table must already exist ... and be ready to accept the data
# being migrated. The -CreateTargetTable option is not supported when the
# source table is NOT a table. This restriction might be removed in the
# future if there is an actual need for it.
-noData
# Don't actually move any data ... but do everything else (as requested on
# the command line). Thus, you could use this option to
# o Create the target database / tables
# o Truncate the target tables to clean them out
# o Cksum the source + target tables (to see if they contain identical data)
# o Test things in general (without actually moving data)
-SourceWhereClause <clause>
# Normally, nz_migrate is used to migrate over the entire contents of a
# table -- all visible rows. This option allows "you" to tack on a WHERE
# clause to the data that gets selected ... allowing "you" to migrate a
# subset of the table. All the power (and responsibility) is put into your
# hands. Do wrap the clause in double quotes so it will be passed into the
# script correctly. Examples:
#
# -SourceWhereClause "customer_key = 2"
# -SourceWhereClause "customer_key in (1,3,5)"
# -SourceWhereClause "region_name = 'AMERICA' or region_key = 0"
# -SourceWhereClause "order_date between '1998-01-01' and '1998-12-31'"
# -SourceWhereClause "customer_num in (select cust_id from sales..delinquent_accounts)"
#
# Because this clause gets applied to all tables being migrated, you would
# probably only want to migrate a single table at a time (when using this
# clause) ... since the clause will typically contain column names that are
# specific to that table.
#
# This clause gets applied to the source table. Its use will be logged in
# the output of this script (as well as in the pg.log file on the source
# machine). After the migration, when the tables are compared (via the
# cksum step) this clause will be applied to both the source and target
# tables ... so that only that data that was moved will be cksum'ed.
-TargetOrderByClause <clause>
# This option allows the data to be sorted -- after it is moved to the
# target machine, and before it is actually inserted into the target
# table -- in order to obtain optimal performance on future queries that
# would benefit from zonemap restrictions. It is up to you to decide what
# column(s) the data should be sorted on, and to pass those column name(s)
# into this script. Do wrap the clause in double quotes so it will be
# passed into the script correctly. Examples:
#
# -TargetOrderByClause "purchase_date"
# -TargetOrderByClause "purchase_date asc"
# -TargetOrderByClause "purchase_date asc, store_id asc"
#
# Because this clause gets applied to all tables being migrated, you would
# probably only want to migrate a single table at a time (when using this
# clause) ... since the clause will typically contain column names that are
# specific to that table.
#
# This clause gets applied to the target table. Its use will be logged in
# the output of this script (as well as in the pg.log file on the target
# machine).
#
# How it works (plan wise): The data will be sent to the target machine --
# down to the appropriate SPU/dataslice -- and written out to the swap
# partition. After all of the data has been moved over, then each SPU/
# dataslice will sort its dataset and insert that ordered data into the
# target table. A typical *.pln file would look something like this:
#
# 500[00]: dbs ScanNode table 1000000470
# 500[02]: dbs ParserNode table "PROD_DB.ADMIN.#aet_3228_1290_903" delim=|
# 500[05]: dbs DownloadTableNode distribute into link 1000000471, numDistKeys=1 keys=[ 0 ]
#
# 1[00]: spu ScanNode table 1000000471
# 1[03]: spu SortNode (type=0), 1 cols
# 1[06]: spu InsertNode into table "PROD_DB.ADMIN.CUSTOMER_TBL" 421665
#
# The migration WILL take longer ... because we have tacked on an extra step
# (this sort node) that occurs after the data has been copied over. However,
# this does save you the trouble of having to (a) first migrate the data over
# and (b) then do the sort operation yourself. Thus, in the long run, it
# might save you some time, effort, and disk space.
#
# It is also recommended that you include these command line options:
# -threads 1 -format ascii
#
# If you want the data "well sorted", it needs to be sorted as a single dataset.
# Using multiple threads would cause each thread to sort its data separately.
# The end result would be that the data was somewhat sorted, but not "well
# sorted" (so why bother sorting it at all?)
#
# If the number of dataslices match (between the source and target machines)
# then the sortedness of the target data should be comparable to that of the
# source data (whatever it may be). If the number of dataslices does not
# match, you can use "-format binary" ... but (for the reasons just mentioned)
# you should not use multiple threads. A binary migration (using just one
# thread) between dissimilar machines will take a long time -- longer than
# an ascii migration. So the "-format ascii" is recommended.
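#
# For example (the table/column names are only illustrative):
#
#     nz_migrate -sdb proddb -tdb proddb -thost devbox -t ORDERS \
#         -TargetOrderByClause "order_date asc" -threads 1 -format ascii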
-TargetDistributeOnClause <clause>
# IF ... the target table does not exist
# IF ... you include the "-CreateTargetTable YES" option, thereby requesting this
# script to automatically issue a CREATE TABLE statement on your behalf
# THEN ... this option will allow you to control/override the distribution key.
# By default, it will match that of the source table. But you can
# specify whatever you want it to be. Do wrap the clause in double
# quotes so it will be passed into the script correctly. Examples:
#
# -TargetDistributeOnClause "Random"
# -TargetDistributeOnClause "(customer_id)"
# -TargetDistributeOnClause "(customer_id, account_id)"
#
# Because this clause gets applied to all tables being created, you would
# probably only want to migrate a single table at a time (when using this
# clause) ... since the clause will typically contain column names that are
# specific to a table. The exception might be if you wanted all of the
# newly created tables to be distributed on random.
#
# This clause gets applied to the target table. Its use will be logged in
# the output of this script (as well as in the pg.log file on the target
# machine).
#
# When using this option, you must use "-format ascii" (because a binary
# migration requires that the source and target table structures match
# exactly ... to include the distribution key).
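#
# For example, to recreate a (hypothetical) table on the target with random
# distribution:
#
#     nz_migrate -sdb proddb -tdb testdb -thost devbox -t CUSTOMER \
#         -CreateTargetTable yes -TargetDistributeOnClause "Random" -format ascii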
-viaSynonym
# If you pass the script a synonym name (rather than a table name) ... and
# If you include this switch ... then
#
# The script will find the table that the synonym references (on the source)
# The table will be migrated to the target
# Upon successful completion,
# a synonym by the same name will be (re)created (on the target)
# referencing the table that was just migrated to it.
#
# For example, the synonym TODAYS_DATA might reference the table MARCH_12.
# Once the table MARCH_12 has been successfully migrated to the target host,
# the synonym (TODAYS_DATA) will be updated to point to that table.
-cloneDDL
# This script may issue CREATE TABLE statements (-CreateTargetTable YES)
# This script may issue CREATE SYNONYM statements (-viaSynonym)
#
# If the -cloneDDL switch is included, then this script will generate all
# of the DDL associated with the object -- to attempt to faithfully clone
# it. This would include the following statements:
#
# COMMENT ON ...
# GRANT ...
# ALTER ... OWNER TO ...
#
# Note: This is always the case for the '-CreateTargetDatabase' option.
-status [<n>]
# Provides periodic status updates while the data is flowing:
#
# .....data flowing.....
# .....status Total: 5,077,991,424 Average: 507,799,142 elapsed seconds: 10
# .....status Total: 10,688,921,600 Average: 510,084,561 elapsed seconds: 21
# .....status Total: 16,720,592,896 Average: 603,167,129 elapsed seconds: 31
# .....status Total: 23,155,965,952 Average: 643,537,305 elapsed seconds: 41
# .....status Total: 29,871,308,800 Average: 671,534,284 elapsed seconds: 51
# .....status Total: 36,969,644,032 Average: 709,833,523 elapsed seconds: 61
#
# The first column is the total number of bytes transferred
# The second column is the average number of bytes transferred per second
# since the last status update
#
# And it provides additional summary statistics for each table, and for the
# migration as a whole:
#
# .....# of bytes xfer'ed 2,062,729,601,280
# .....xfer rate (bytes per second) 917,584,342
#
# By default, the status will be updated every 60 seconds. You can specify
# an optional value from 1..3600 seconds.
#
# Implementation details: The dataset, as it is passed between the two
# hosts, is piped thru "dd" in order to count the number of bytes. This
# is not the size of the table on disk -- rather, it is the size of the
# ascii or binary dataset that gets migrated between the two hosts. Because
# it involves an additional process ("dd") getting injected into the data
# stream there is some -- albeit minimal -- overhead when using this switch.
# With older versions of linux, the reported "# of bytes xfer'ed" might be
# slightly less than the actual value because it is based on the number of
# whole blocks processed by dd.
-restart <n>
# Sometimes a table migration may fail because of a momentary network
# blip between the two hosts (such as a "Communication link failure")
# or a momentary hardware or software issue. This script will
# automatically restart the migration of a table under the following
# conditions:
#
# You specified the option "-TruncateTargetTable yes"
# --or--
# You specified the option "-SourceWhereClause <clause>"
#
# During the previous attempt to migrate the table, one (or more) of
# the threads may have successfully completed ... which would result
# in some (but not all) data existing in the target table. Which would
# leave the target table in an inconsistent state. Before a restart of
# the migration is attempted, this script will attempt to either
# TRUNCATE the target table (which should be instantaneous) or
# issue a DELETE statement against the target table based on the
# "-SourceWhereClause <clause>" (which could take a long time to
# run).
#
# By default, the # of restart attempts is set to 1 -- though it only
# kicks in based on the above specified conditions. You can set
# the number of restart attempts to a value from 0 (no restart) to 10.
#
# A restart does not guarantee success. It just automates the attempt.
# A restart will only take place if the problem occurred during the
# migration of the data (and not during the early setup nor during the
# final cksum phases). If the subsequent migration of the table is
# successful then the script will treat it as a success -- with
# appropriate log messages being included in the output to identify
# what the script is doing.
-timeout [<n>]
# Sometimes a table migration hangs up ... where, for example, all but one
# of the threads finishes. Perhaps the source sent a packet of data but
# the target never received it. Neither one received an error. But neither
# one will continue ... as they are each waiting for the other to do something.
#
# This option tries to monitor for such situations, and will automatically
# terminate the hung thread(s) when it notices that 0 bytes of data have
# been transferred between the two systems over the specified length of time
# (the -timeout value). Specifying this option automatically enables the
# '-status' option as well.
#
# By default, this option is disabled. Include this switch on the command
# line to enable it. The default timeout value is set to 600 seconds (10
# minutes). You can specify an optional value from 1..7200 seconds (2 hours).
#
# If the timeout is reached, the migration of the current table will
# automatically be terminated. If desired, the migration of that table
# can automatically be restarted as well. See the '-restart' option
# described above for more details.
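#
# For example, to terminate (and retry) the migration of any table whose data
# flow stalls for more than 5 minutes (the names used are only illustrative):
#
#     nz_migrate -sdb proddb -tdb proddb -thost devbox \
#         -TruncateTargetTable yes -timeout 300 -restart 1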
-sbin <source bin/executables directory>
-tbin <target bin/executables directory>
# You should be able to freely migrate between NPS machines running
# software versions 3.1, 4.0, 4.5, 4.6, 5.0 and 6.0.
#
# If you are migrating from a 3.0 system, there may be software
# incompatibilities between the source and target machines.
#
# To accommodate this difference, do the following:
# 1) Get a copy of the client toolkit ("cli.package.tar.z") that
# matches the version of software running on your TARGET machine.
# 2) Install it somewhere on your SOURCE machine. But not under the
# directory "/nz" as you don't want to mix up the server + client
# software.
# 3) When you invoke nz_migrate, include the "-tbin" argument so
# that the script can find and use the proper toolkit when
# communicating with the TARGET machine.
#
# For example,
# -tbin /export/home/nz/client_toolkits/3.1/bin
-loopback
# When this option is specified, all other options are ignored.
#
# It will perform a loopback test on this NPS host by migrating a dummy
# table (using NZ_CHECK_DISK_SCAN_SPEEDS), "migrating" it from the SYSTEM
# database into the MIGRATION_TEST database (which will be created).
#
# This test will basically unload the data ... send it up to the host ...
# where it will then be reparsed and sent back down to the SPUs ... where
# it will then be written into the target table. So it is basically a test
# of the entire migration process.
#
# This does not exercise the house network (nor a 2nd/target host).
# Since this NPS server acts as both the source host and the target host, in
# many aspects it has 2X the amount of work to do.
#
# It uses the following migrate options: -format ascii -threads 8 -status 10
-fillRecord treat missing trailing input fields as null (columns must be "nullable")
-truncString truncate any string value that exceeds its declared char/varchar storage
# These nzload options can also be applied here. They are only relevant if
# you are doing a "-format ascii" migration. They allow you to migrate data
# into a target table that has a slightly different shape than the source table.
# The target table can have more columns
# The target table can have text columns with a shorter defined length
#
# If using either of these options, you should stick with the default -cksum option,
# which simply compares the rowcount between the source + target tables. Any
# other cksum that gets generated would (almost definitely) be different ...
# which this script would treat as a failed migration attempt.
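#
# For example (the table name is only illustrative):
#
#     nz_migrate -sdb proddb -tdb testdb -thost devbox -t CUSTOMER \
#         -format ascii -fillRecord -truncString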
Cancel: When the nz_migrate script runs it launches many subordinate processes. If you
need to kill the migration the easiest method would be to press ^C.
Otherwise, you can issue the following kill statement
kill -- -NNN
where NNN is the process ID of the nz_migrate script. Since many processes
are associated with the migration, the relevant PID # is listed in the
nz_migrate output (and is also a part of the directory/log file names).
For example:
Data Format : ascii (using nzsql and nzload)
Log Directory : /tmp/nz_migrate.20100101_120000.37554
Log File : /tmp/nz_migrate.20100101_120000.37554/nz_migrate.output
Top Level PID : 37554
The migration is NOT run as one large transaction (where everything gets
migrated or nothing gets migrated). Rather, each table is migrated separately.
Beyond that, if multiple threads are used ... then each thread is a separate/
independent transaction. So what does this mean for you ... if you kill a
migration?
o Any tables that have already been migrated are done.
o For the current table
If any threads have already finished ... then that data has been COMMIT'ed
and does exist in the target table.
For any threads that are still running ... that data will be rolled back/
discarded (and the discarded rows may be using up space in the target table).
So ... your target table will be in an indeterminate state. Before restarting
the migration for that table, you may want to TRUNCATE it or use the
"-TruncateTargetTable TRUE" option when invoking nz_migrate.
Errors: Error: Load session of table '<tablename>' has 1 error(s)
You are probably doing a "-format ascii" migration between the
two systems. The target table does NOT have to be identical
to the source table. For example, the column names can be
different, or an INTEGER column on the source machine can be
loaded into a VARCHAR column on the target machine. But if
you get this error, then some data incompatibility seems to
have cropped up (e.g., trying to load a 'date' into a column
that is defined as 'time'). Check the DDL for the two tables.
Check the *.nzlog file for additional clues.
[ISQL]ERROR: Could not SQLExecute
You are doing a "-format binary" migration between the two systems,
and the definition of the target table (# of columns, data types,
null/not null constraints, distribution keys) does not match that
of the source table. They must match.
Error: packet header version '9' does not match expected '11'
- client and server versions may be different
It appears that the migration is between two different NPS hosts that
are running two different versions of the NPS software -- and some of
the NZ*** tools (most likely, nzsql or nzload) are not compatible
between the two machines. See notes above regarding the "-tbin" option.
ERROR: loader does not currently support INTERVAL data
The INTERVAL data type is somewhat atypical, e.g.
'1 year 2 months 3 weeks 4 days 5 hours 6 minutes 7 seconds'
It is a derived column -- e.g., the result of subtracting two
TIME and/or TIMESTAMP columns -- which you then stored in a new table.
Try migrating this specific table using the "-format binary" option
instead.
Support for interval data types has been added to the nzload'er in
NPS version 7.2.
ERROR: nz_cksum's differ!
After each table is migrated, an (optional) cksum is calculated against
both the source + target tables, and then compared. For some reason
the checksums don't match. The tables should be analyzed + compared
further to try to determine why this occurred.
ERROR: Reload column count mismatch.
# i.e., the total number of columns does not match
ERROR: Reload column type mismatch.
# i.e., the column data types do not match
# or the null/not null constraints do not match
ERROR: Reload distribution algorithm mismatch.
# i.e., the DISTRIBUTE ON clause does not match
These errors occur when you are doing a "-format binary" migration
between the two systems, and the definition of the target table does
not match that of the source table. They must match.
Note: The "nz_ddl_diff" script can be used to compare two table's DDL.
Error: Unexpected protocol character/message
The issue is that nzload uses a hard-coded 10 second timeout during
connection establishment, and the NPS host is not responding in time.
The NPS host does a REVERSE lookup of the ip address of the client to
find its hostname. If the client machine is not in the same sub-net
as the NPS host, then the name-lookup may take too long.
Either configure the NPS host to do the name-lookup properly (the domain
name server might need to be corrected: /etc/resolv.conf), or change the
/etc/hosts file to include an entry for the client name and ip address.
To confirm that this is the problem, on the TARGET NPS host try doing this
time host <ip-of-client-not-working/Source Host>
or
time dig -x <ip-of-client-not-working/Source Host>
If this returns the client name and other information immediately, then
the problem is something else. If it takes longer than 10 seconds to
return the information, then that is causing the timeout of nzload to
expire.
Outputs: Status/log/timing information for each table that is migrated will be sent
to standard out, as well as details about any problems that might need to
be addressed.
Exit status: 0 = success, non-0 = ERROR's were encountered
Sample output (for a migration that included many of the optional options)
nz_migrate table 'MARCH_12'
.....processing table 1 of 1
.....referenced via synonym 'TODAYS_DATA'
.....using target table 'MY_TEST_RUN'
.....creating the target table
.....Target DISTRIBUTE ON Clause RANDOM
.....truncating the target table
.....migration process started at 2011-03-12 11:37:48
.....estimated # of records 16,384
.....Source WHERE Clause AND (cust_id < 90000)
.....Target ORDER BY Clause part_num, ship_date
.....nzload starting ( thread 1 of 1 )
.....unloading data ( thread 1 of 1 )
.....data flowing.....
.....unload results ( thread 1 of 1 ) INSERT 0 16384
.....unload finished ( thread 1 of 1 ) elapsed seconds: 1
.....nzload finished ( thread 1 of 1 ) elapsed seconds: 2
.....nzload successful ( thread 1 of 1 )
.....data flow finished
.....migration process ended at 2011-03-12 11:37:50
.....actual # of records unloaded 16,384
.....
.....migration completed TOTAL seconds: 2
.....
.....cksum process started at 2011-03-12 11:37:50
.....cksum process ended at 2011-03-12 11:37:51
.....confirmed cksum: 450386793.9812719738880 16384 MARCH_12
.....
.....cksum completed TOTAL seconds: 1
.....
.....reference synonym was successfully created in the target database
nz_plan
Usage: nz_plan [<planfile>] [-tar] [-fast] [-scan|-noscan] [-db <database>]
Purpose: Analyze a query's *.pln file and highlight things of note.
The script also obtains and displays many additional pieces of information
that are not found in the planfile itself.
Inputs: All fields are optional
[<planfile>]
If you specify a planfile/number ( e.g., 14335 or 14335.pln or test.pln )
the script will search for the planfile you are talking about. The *.pln
extension is optional. It will search for the planfile in the following
directory locations in the following order:
o your current working directory
o /nz/data/plans /* e.g., an active query */
o /nz/kit/log/planshist/current /* e.g., a completed query */
NPS 7.0: By default, all plan files are now stored in a gzip compressed tarball.
This script uses the new SQL command "SHOW PLANFILE <nnn>;" to obtain the planfile
so that it can then be analyzed.
-tar The SHOW PLANFILE command only searches the two most recent tarballs
when looking for your plan file. If you include this switch, the
script will search every compressed tarball (newest to oldest) looking
for the planfile. This is not the default option -- as it takes
additional time to search the additional tarballs. But it provides
an easy-to-use option to try and find an older *.pln file in any of
the tarballs under: /nz/kit/log/planshist/*/*.tgz
Note: If you invoke nzsql as follows: nzsql -plndir /tmp/my_dir
a copy of all query plans and *.cpp files (for this nzsql session) are placed
under the specified directory location.
If you do not specify a "<planfile>", then the script will instead list info
about the last 25 queries that YOU have run on the box ... in order to help
you identify the one you are interested in. The following options give you
some additional control:
-<nnn> List info about "<nnn>" queries, rather than just the
last 25.
-all List info about queries run by any/all users, rather
than just you.
Example: -100 -all
Will list the last 100 queries that were run by anyone.
-width <nnn> The display width for the SQL column. Default is 40.
Range is 3-120.
[-last]
Instead of specifying a planfile/number, you can just enter "-last". In
which case the script will get the MAX(planid) from the system and analyze
the planfile associated with it (which will be for either an active or a
completed query). But this may or may not be the plan you are interested
in. It is simply a shortcut that the author of this script implemented so
that he didn't have to bother with looking up a plan number. This 'feature'
tends to work best if you are the only one using the system at the time.
[-fast]
Make this script run faster ... by skipping many of its queries against the
system catalogs. If your planfile involved a lot of different tables, this
can be a significant time savings. However ... it does mean that less
information (mainly, the statistics for each of the tables) will be included
in the output produced by this script.
[-scan|-noscan]
If specified, this script will scan each of the tables involved, attempting
to gather more accurate information as to
o the exact number of rows selected from the table
o the Min/Avg/Max # of rows selected, per SPU (to show intermediate/query skew)
o the scan time
The default setting has been changed back to "-noscan" (as the additional
scans can add significant overhead/time to the running of the script ...
and may interfere with other users on the system).
Notes: The information this provides is based on a simple scan of the table ...
so it won't reflect any benefits from join-aware zone maps (fewer
rows being selected / faster scan times).
Also, the scan time that is reported may be higher due to other users
currently on the system.
[-db <database>]
The name of the database. This script can usually determine that for
itself. But if you are working with a *.pln file and an nzdumpschema
from another system, you may want to specify it -- since the name of
the database you are working with (XXX_SHADOW) will not be the same
name as the original database (XXX) that is reflected in the *.pln file.
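Some example invocations (the plan number shown is only illustrative):
nz_plan 14335 # Analyze plan 14335
nz_plan -last -fast # Analyze the most recent plan, skipping catalog lookups
nz_plan -100 -all # List the last 100 queries run by anyone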
Outputs: An analysis of the *.pln file will be produced -- to start you on your journey.
It will include the following information.
SNIPPETS
========
Snippet Timings
Shows the elapsed time for each snippet (inclusive of any code compilations).
A query plan may contain tens (or hundreds) of snippets. When analyzing a
plan file, you would usually want to concentrate on the longer running
snippets ... specifically starting your analysis at the first such snippet.
# Of Snippets
How many snippets are in the *.pln file. Most snippets involve both the host
and the SPU. But some snippets are only host based. And some snippets are
only SPU based.
If your query has a large number of snippets (~100), perhaps
it is too complex for the optimizer to plan optimally.
Consider breaking the query up into a number of separate/smaller
steps.
# Of Compilations
NPS dynamically generates C++ snippet code that will be run on the host or
on the SPUs. That C++ code must be compiled before it can be used. This
tells us how many compilations were involved in the query. Once a query is
run, the compiled code is saved in the 'code cache' ... so further invocations
of a query (or a similar query) should be able to skip the compilations, and
the overhead associated with them.
Compilation Timings
This shows how long the compilation took on each of the individual *.cpp files.
It also provides the "Total Compile Time".
Max Code Complexity
Basically, this is an indication as to how big the (biggest) *.cpp file is.
The bigger the file, the longer it takes to compile. Roughly speaking.
Linker Invocations
If your SQL invokes user defined functions/aggregates, then after the C++
code is compiled another step is involved to link it with the UDF/UDA.
THINGS OF NOTE
==============
Prior Plans
As of 4.6, the optimizer may decide to scan and "Pre-Broadcast" some
of the dimension tables. This occurs before your query is run ... before
its execution plan is even determined. This provides the optimizer with
additional information that it can use when generating the execution
plan for your query. Any such "Prior Plans" will be listed. If those
plans took any significant time to run (> ~1s), the timing information
will also be included.
Prior JIT Plans
The optimizer may decide to sample the data in some of the fact tables,
in order to get a better estimate as to the number of rows that will
be selected from it. This occurs before your query is run ... before its
execution plan is even determined. This provides the optimizer with
additional information that it can use when generating the execution
plan for your query. Any such "system generated sample scans" will be
listed. If those plans took any significant time to run (> ~1s), the
timing information will also be included.
Just-In-Time Statistics
The number of "Prior JIT Plans" (see above) that were performed.
Disk Based Join
If the tables being joined are too large, the operation may go
disk-based. It is always faster when things can be done in
memory (and not involve the disk).
Cross Product Join
When joining tables, did you forget to specify a join condition
in your WHERE clause? If so, you'll be bringing back all rows
in table1 TIMES all rows in table2. Possibly a lot of data.
Expression Based Join
The vast majority of all table joins involve a "hash join". An
expression based join is typically a nested loop operation, and
(when it involves large tables) can take awhile.
Merge Join
The vast majority of all table joins involve a "hash join". A
merge join is rather atypical, sometimes being used with sorted
materialized views. A merge join often involves extra (sort)
nodes and can take awhile.
MaterializeNode
If a grouping/aggregation involves a large number of groups
it may bump up against its allotted memory limit and need to
'materialize' the data. That is, it goes disk based -- which
causes the information to get written out to the SPU's temp/swap
area -- which means the snippet will take longer to run.
The optimizer does not deal with a MaterializeNode. Rather, it
is added at runtime (during DBOS execution) if the system thinks
a snippet might exhaust its memory allocation.
SUBPLAN_EXPR
Very atypical.
Host Distributions
Typically, when data is moved around the internal network, you
will have spu <--> spu distributions
host --> spu broadcasts
A host distribution is atypical.
TABLES BEING SCANNED
====================
For each table, the following information (from the *.pln file) is shown
ScanNode (with estimates as to the # of rows that will be
selected, and the cost/scan time for the table)
RestrictNode (i.e., the WHERE clause, if any)
Time for snippet (the actual elapsed time for the snippet ...
inclusive of scan time and all other processing)
This script will provide these additional details
BASE TABLE If a materialized view is being scanned, this is
the base table it is built from
DISTRIBUTED ON The distribution key/column(s) for the table
TABLE STATISTICS A summary of the optimizer statistics that are
known for this table, to include the table's total
rowcount.
And, if you include the "-scan" option ...
ROWS RETURNED The exact number of rows that will be selected
from this table, based on the restrictions/where
clause. This will further be broken up into the
Min/Avg/Max number of rows per SPU.
SCAN TIME The actual/elapsed time to scan just the table.
This excludes any time+work that the snippet
would spend on other processing tasks.
If join-aware zonemaps kick in, these numbers may be less meaningful.
TABLE SUMMARY
=============
Summarizes the above "TABLES BEING SCANNED" information.
Optimizer Estimate The number of rows that the optimizer estimates
it is going to select/project from the table.
Movement As we scan a table we may do some processing of that data
(join it, aggregate it, sort it, etc...). But what do we
do with the remaining data at the end of that snippet?
broadcast - The data is broadcast out to all of the dataslices
dbs DownloadTableNode broadcast into link
distribute - The data is redistributed between the dataslices
spu DownloadTableNode distribute into link
SaveTemp - The data is saved to the temp/swap partition
spu SaveTempNode
Return - The data is returned to the host
spu ReturnNode
Jit? Indicates that the optimizer estimate for this
table was based on a JIT-Stats sample scan.
-scan Estimate If you included the optional "-scan" flag, this
is this script's estimate as to the number of rows
that will be selected/projected from the table.
Table Rowcount The total number of rows in the table (i.e., the
statistic value that is stored in the catalogs).
Statistics Indicates the overall status of the statistics
that have been generated on this table.
If "<unknown>", it simply means that the catalog
statistics are not accessible to this script for
some reason.
If "Basic", "Express" or "Full", then those are
the type of stats that have been generated on this
table, and they are 100% up-to-date.
Otherwise, this column will display "outdated"
Fact? Indicates that the optimizer has tagged this table
as being a "(FACT)" table.
Distributed On The distribution column(s) for the table.
JOIN CONDITIONS
===============
The table joins that are taking place, and in what order.
Take note of any datatype conversions that take place, as you might want to
try to eliminate them. In the example below, the customer.customer_number
column is being recast to an INT4 (integer) datatype. This leads to (some)
extra processing overhead. More importantly, it may have resulted in some
extra data distributions that could have been avoided.
3[05]: spu HashJoinNode
-- (INT4(customer.customer_number) = orders.customer_number)
INTERESTING SNIPPETS
====================
Highlights items of note from each snippet.
Scans
Joins
Sorts
Data Movement (redistributions or broadcasts across the internal fabric)
Timings
nz_query_history (nz_query_history.ddl)
Usage: nz_query_history [ database [ table ]]
Purpose: Copy the query history information out to a permanent table on disk.
Query history information is stored in memory in a virtual table (_V_QRYHIST).
This script can be used to copy it out to a permanent table. By doing this
o Any BI tool can query the information easily
o The history is no longer limited to the last 2,000 queries
o The information is persistent, and will be maintained after an nzstop
Note: As of NPS version 6.0, the system view _V_PLAN_RESOURCE provides some
additional useful metrics for each query plan. This script will join
the two tables together when populating the permanent table.
Each new NPS release has continually extended both of these system
tables (as to the columns/information they provide). To take advantage
of this -- to persist that additional information -- you may need to
(re)create the permanent table with the proper DDL.
This only includes SQL/DML operations that generate query execution plans
that run down on the SPUs, such as the following types of statements
o SELECT
o INSERT ( which includes external table loads+unloads, and thus nzload )
o UPDATE
o DELETE
o GENERATE [EXPRESS] STATISTICS
o CTAS ( CREATE TABLE <table> AS SELECT ... )
This does not include
o TRUNCATE
o nzreclaim
o SQL/DDL statements ( such as CREATE/ALTER/DROP <object> )
o Catalog queries/lookups
o PREPARE ("limit 0") statements, which just parse the SQL
By default ( host.queryHistShowInternal = no ) this does not include
o JIT-Stats Sample Scans
Note: NPS 4.6 offers a new "Query History Collection and Reporting"
capability. It captures details about the user activity on the NPS
system, such as the queries that are run, query plans, table access,
column access, session creation, and failed authentication requests.
The history information is saved in a history database. This replaces
the _V_QRYHIST (which is used by this script), though it is still
supported for backward compatibility.
This script should be run periodically (e.g., once every 15 minutes)
via a batch/cron job to update the permanent table with the latest
data from the query history. New data (one row per query) is nzload'ed
(e.g., INSERT'ed) into the permanent table. This script adds data
to the table ... it does not remove data. If you wish, you can
clean up "old" data via a simple SQL statement, such as:
delete from nz_query_history where qh_tend < '2008-01-01' ;
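For example, a (hypothetical) crontab entry that runs the script every
15 minutes, loading the results into the HISTDB database, might look like:
*/15 * * * * /nz/support/bin/nz_query_history HISTDB NZ_QUERY_HISTORY
(adjust the path to wherever the toolkit scripts are installed on your host)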
To run this script as a non-admin user you must be granted certain
privileges -- to be able to access ALL of the pieces of information.
These are
GRANT LIST, SELECT ON USER TO <the_user> ;
GRANT LIST, SELECT ON DATABASE TO <the_user> ;
Inputs: By default, the information will be loaded into the NZ_QUERY_HISTORY
table in the SYSTEM database -- which you must create ahead of time.
In that this table is really just a regular user table, the SYSTEM
database might not be the best location for it. You should consider
creating it in a different database.
Use the SQL/DDL in the file 'nz_query_history.ddl' to create the table.
This file also creates a view against the nz_query_history table, in
which it performs various time calculations for you.
[ database ] Optionally, you can specify a different database and
[ table ] tablename that you want the data loaded into
Outputs: A status message, such as the following, will be sent to standard out
Load session of table 'NZ_QUERY_HISTORY' completed successfully
This script also returns an exit status (that corresponds to the exit
status that nzload returns):
0 = success (all rows were loaded)
1 = failure (no rows were loaded)
2 = partial success (some of the rows were loaded)
The *.nzlog file will be written out to /tmp.
If any bad rows were encountered, a *.nzbad file will be written out to /tmp.
It will include a date+time stamp as part of its filename.
nz_query_stats
Usage: nz_query_stats [<database> [<object>]] [ optional args ]
Purpose: Report various statistics about the queries that have been run on this box.
This script can be used against both
o the "Old" style query history table ( _V_QRYHIST / nz_query_history )
o the "New" style "Query History Collection And Reporting" database
The two share some similarities. But there are significant differences, so
one should not expect identical reports/statistics to be produced from both.
"Old" Basically collected information on SQL/DML operations that generated
query execution plans that ran on the SPUs. This mainly consisted of
the following types of statements:
o SELECT
o INSERT ( which includes external table loads+unloads, and thus nzload )
o UPDATE
o DELETE
o GENERATE [EXPRESS] STATISTICS
o CTAS ( CREATE TABLE <table> AS SELECT ... )
Some queries would have resulted in multiple plan files being generated + run.
For example
o Jit-Stats sample scans and Pre-Broadcast plans can be used just about
anywhere (before the actual query itself is planned + executed).
o CTAS operations might have initiated an automatic GENSTATS operation on
the resultant table after it was created (if it met the size threshold).
o GENSTATs commands are sometimes broken up into multiple steps/plan files,
each one processing a subset of the columns in the table (especially on
tables with many columns).
So, while "1 query = 1 plan" might be the general case, sometimes "1 query = many
plans" and each plan would have been treated/counted separately in the stats.
Note: Jit-Stats scans are only recorded if you set the registry setting
host.queryHistShowInternal=yes
Specifically, this list does not include
o nzreclaim
o TRUNCATE statements
o SQL/DDL statements ( such as CREATE/ALTER/DROP <object> )
o Catalog queries/lookups
o PREPARE ("limit 0") statements, which just parse the SQL
"New" The new history database records a lot more information ... such as
o TRUNCATE statements
o SQL/DDL statements ( such as CREATE/ALTER/DROP <object> )
o Some catalog queries
o PREPARE ("limit 0") statements, which just parse the SQL
o AUTH, BEGIN, COMMIT, LOCK, ROLLBACK, SET, SHOW, VACUUM statements
(to name a few)
At this time, this script is only going to report stats on the queries/
statements that actually involve a query plan. The reasons being:
o To maintain some consistency between the "Old" and "New" reports
o Most statements (not involving a query plan) would typically have
a sub-second runtime, and including them in the stats (and there would
be many of them) would have a tendency to skew the statistics reported.
Regardless of whether the query involved one (or many) plans, it will be
treated as a single entity ... more accurately representing the single
query that the user actually invoked. Thus, (for example) this script
won't report information about any separate Jit-Stats/Pre-Broadcast plans
as they will get rolled up into the parent query.
The tables in the new history database are just like any other end-user
tables. You should generate statistics on them (from time to time) so
that the optimizer can generate good execution plans for any queries you
(or this script) might run against those tables. The new history database
involves more tables (and rows/data) than the old query
history (which was based on just a single table, and a single row per
query), making it even more important to periodically generate statistics
against the new history database.
The new history database can also provide additional information, such
as table access, column access, session creation, and failed authentication
requests, etc ...
Regarding timestamps: They are recorded in microseconds (whereas the old
history table only provided granularity in seconds). They are recorded
based on GMT. This script automatically adjusts them based on the local
timezone offset ... so that a day is a day (and does not overlap other days).
Inputs: The database and object names are optional. If not specified, then this script
will be run against SYSTEM..NZ_QUERY_HISTORY ... the "Old" style query history
table. This table is typically created and populated via the "nz_query_history"
script. If you have renamed the NZ_QUERY_HISTORY table to something else (and/or
located it in a different database) simply specify the database + table name
on the command line.
If you specify just the database name by itself, the script will figure out the
right thing to do (basing it on the existence of either the old NZ_QUERY_HISTORY
table or the new "$v_hist_queries" view. If both exist within the same database,
then the NZ_QUERY_HISTORY table is what will be used).
For example: nz_query_stats histdb
If you want to see statistics based on the virtual (in-memory) query history table,
specify: nz_query_stats system _v_qryhist
All of the following arguments are also optional.
-classic When displaying the 'Query Run Times', the output now
includes 3 additional columns of information to provide
some additional statistics (example below). If you DON'T
want those columns displayed, include the "-classic" switch
to get the old behavior (1 interval, that uses just 1 column)
Grouped By
Query Run Times Sum Completed % Completed % Remaining
... ... ... ... ...
4 3 469 99.15 % 0.85 %
3 2 466 98.52 % 1.48 %
2 11 464 98.09 % 1.91 %
1 74 453 95.77 % 4.23 %
0 379 379 80.12 % 19.88 %
-where "<clause>" You can use an optional WHERE clause to restrict the
report in whatever manner you wish. You should wrap
it in quotes. You should pass valid SQL, as this script
does no parsing/checking of its own. The column names
and values should be appropriate
o the "Old" style uses the NZ_QUERY_HISTORY table
o the "New" style uses the "$v_hist_queries" view
Examples:
=========
-where " qh_user != 'ADMIN' "
-where " qh_user != 'ADMIN' and QH_DATABASE in ('PRODUCTION', 'SALES') "
If any one of the following three options are specified, then they must all
be specified.
-interval "<value>" The reporting interval/increment, such as "7 days" or
"4 weeks" or "1 month". You should wrap it in quotes.
-periods <nn> The number of time periods/intervals to report upon.
Information for each time period will be reported in
its own column (which will be 15 characters wide).
-startdate <date> The beginning date for the report, e.g. 2008-06-25
The query's start date (QH_START) will be used to
evaluate what date range a query falls under.
Examples:
=========
-interval "1 day" -periods 7 -startdate 2008-06-25
# Report on seven individual days, starting on 2008-06-25
-interval "1 week" -periods 4 -startdate 2008-06-25
# Report on four one-week periods
-interval "1 month" -periods 6 -startdate 2008-07-01
# Report on six months, starting at the beginning of July.
Outputs: Sample output ...
$ nz_query_stats -interval "1 month" -periods 3 -startdate 2008-07-01
Query Statistics 2008-07-01 2008-08-01 2008-09-01
# Of Days 0 0 0
# Of Queries 0 0 0
# Of Incompletes 0 0 0
Queries Per Day
Minimum 0 0 0
Average 0 0 0
Maximum 0 0 0
Snippets Per Query
Minimum 0 0 0
Average 0 0 0
Maximum 0 0 0
# Of Rows Returned
Minimum 0 0 0
Average 0 0 0
Maximum 0 0 0
# Of Bytes Returned
Minimum 0 0 0
Average 0 0 0
Maximum 0 0 0
Query Queue Time
Minimum 0 0 0
Average 0 0 0
Maximum 0 0 0
Query Run Time
Minimum 0 0 0
Average 0 0 0
Maximum 0 0 0
Grouped By Query Run Times
90 - 99 0 0 0
80 - 89 0 0 0
70 - 79 0 0 0
60 - 69 0 0 0
50 - 59 0 0 0
40 - 49 0 0 0
30 - 39 0 0 0
20 - 29 0 0 0
10 - 19 0 0 0
9 0 0 0
8 0 0 0
7 0 0 0
6 0 0 0
5 0 0 0
4 0 0 0
3 0 0 0
2 0 0 0
1 0 0 0
0 0 0 0
Grouped By Query Types
COPY 0 0 0
CREATE 0 0 0
DELETE 0 0 0
GENERATE 0 0 0
INSERT 0 0 0
SELECT 0 0 0
UPDATE 0 0 0
WITH 0 0 0
miscellaneous 0 0 0
nz_skew
Usage: nz_skew [amount] [-verbose] [-sort <name|skew|size>]
Purpose: Identify any issues with data skew on the system.
This includes both tables and materialized views, and spans all databases.
This script can be helpful in identifying storage problems encountered while
the data is at rest on disk.
Inputs: All arguments are optional.
amount Allows you to specify the amount of table skew you are
interested in, e.g. "Min-Max SKEW (MB)". Any table with
this amount of skew, or more, will be listed.
The default value is 100 (as in 100 MB). If the space
usage for a given table varies by more than 100 MB --
from the dataslice with the least amount of data to
the dataslice with the greatest amount of data -- then
that table will be listed in the output.
You can adjust this reporting value up/down as you see fit.
Specifying a value of 0 will result in all (non-empty) tables
in all databases being listed.
-verbose For each of the skewed tables, include information about
the distribution key(s) in the output.
-sort name Sort the output by the "Database/Table" name (the default)
-sort size Sort the output by the table size, e.g. "Total MB"
-sort skew Sort the output by the amount of skew, e.g. "Min-Max SKEW (MB)"
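For example, the following invocation (the 500 MB threshold is just an
illustration -- pick whatever value matters to you) would list every table
that varies by 500 MB or more across its dataslices, include the distribution
key details, and sort the worst offenders to the top:
$ nz_skew 500 -verbose -sort skew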
Outputs: A report such as this will be sent to standard out.
The "SKEW Ratio" is expressed as a range. At the low end it could be ".00", which
would indicate that one (or more) of the dataslices are storing no data for this
table. At the high end, the value could match the number of dataslices on the
box (in this example, an IBM Netezza 1000-12 -- which has 92 data slices) ... so
the value of "92.00" indicates that one dataslice contains ALL of the data for
this table (e.g., the table is totally skewed).
DSlice is the dataslice with the Maximum amount of storage used (for that table).
A table that is using up exactly the same amount of storage across all dataslices
(e.g., no discernable skew at all) would have a ratio of "1.00 - 1.00".
All of the storage sizes are rounded to the nearest MB for display purposes.
Anything less than .5 is rounded down. Anything >= .5 is rounded up. So a
value of "0" does not necessarily mean the table is empty. It does mean
that the storage value is < 0.5 MB.
This script can display skew for tables that "do not yet exist". This would
be tables that are still being created ... where the transaction has not yet
done a COMMIT (such as in the case of an ongoing CTAS or nzrestore operation).
For such tables, the Database name will show as "<Unknown>", and
the Table name will be prefixed with "(*)".
This script can display skew for temporary tables. For such tables, the
Table name will be shown as:
<system assigned unique tablename> / <user defined temporary tablename>
$ nz_skew 0
SPU Disk Utilization
===========================
# Of DataSlices 22
Least Full DSlice # 3
Most Full DSlice # 2
Extents Per Dataslice 121,586
Storage Per DataSlice (GB) 356.209
Storage Used (GB)
Minimum 310.632
Average 330.701
Maximum 346.339
Storage Used (%)
Minimum 87.205
Average 92.839
Maximum 97.229
Total Storage
Available (TB) 7.653
Used (TB) 7.105
Used (%) 92.839
Remaining (TB) 0.548
Remaining (%) 7.161
Table Skew That Is > 0 MB
===========================
Database | Table | Total MB | Minimum | Average | Maximum | Min-Max SKEW (MB)| SKEW Ratio | DSlice
----------+-----------------------------+----------+---------+---------+---------+------------------+--------------+--------
<Unknown> | (*)CTAS_EXAMPLE | 536 | 0 | 6 | 536 | 536 | .00 - 92.00 | 7
TEST_DB | BALANCED_DISTRIBUTION | 6,176 | 67 | 67 | 67 | 0 | 1.00 - 1.00 | 18
TEST_DB | EMPTY_TABLE | 0 | 0 | 0 | 0 | 0 | .00 - .00 | 12
TEST_DB | SAMPLE_TABLE | 13,717 | 134 | 149 | 201 | 67 | .90 - 1.35 | 3
TEST_DB | TOTALLY_SKEWED_DISTRIBUTION | 536 | 0 | 6 | 536 | 536 | .00 - 92.00 | 2
TEST_DB | temp.32570.1/MY_TEMP1 | 536 | 0 | 6 | 536 | 536 | .00 - 92.00 | 8
nz_stats
Usage: nz_stats
Purpose: Show various statistics about the system (like 'nzstats' does, but more).
Inputs: The more databases you have, the longer the script will take to run.
If the number of databases is > 50, then (in the interest of time)
the datapoints *'ed below will NOT be collected.
-verbose Include those datapoints (regardless of the number of databases)
-brief Exclude those datapoints (regardless of the number of databases)
Outputs: This script will produce a report that includes over 200 datapoints.
Amongst other things, it will show information about:
#'s Of Objects
Table Size
MView Size
Tables Per Database
MViews Per Database
* MViews Per Table (Max)
Columns Per Table
Columns Per MView
Table Row Size
MView Row Size
* Column Types (for TABLEs)
* Column Types (for MVIEWs)
* Column Types (for VIEWs)
Permissions
High Watermark Counters
Code Cache
SPU Disk Usage
Query Statistics
nz_sysutil_history (nz_sysutil_history.ddl)
Usage: nz_sysutil_history
Purpose: Copy the system utilization information out to a permanent table on disk.
The system utilization information is stored in memory in a virtual table
(_V_SYSTEM_UTIL). This script can be used to copy it out to a permanent
table. By doing this
o Any BI tool can query the information easily
o The information is persistent, and will be maintained after an nzstop
If you wish to collect this information, this script should be run periodically
(e.g., once an hour) via a batch/cron job to update the permanent table with
the latest data from the _V_SYSTEM_UTIL table.
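As a minimal sketch of such a job (the script path and log file location are
assumptions -- adjust them to match your installation, and run it under an
account whose environment can connect to the database), an hourly crontab
entry might look like this:
0 * * * * /nz/support/bin/nz_sysutil_history >> /tmp/nz_sysutil_history.log 2>&1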
Inputs: By default, the information will be loaded into the NZ_SYSUTIL_HISTORY
table in the SYSTEM database -- which you must create ahead of time
(use the SQL/DDL in the file nz_sysutil_history.ddl for this purpose).
[ database ] Alternatively, you can specify a different database
[ table ] and table that you want the sysutil history data loaded into.
Outputs: A status message, such as the following, will be sent to standard out
Load session of table 'nz_sysutil_history' completed successfully
nz_sysutil_stats
Usage: nz_sysutil_stats [<database> [<table>]]
Purpose: Display some statistics about system utilization
By default, this script will report using the information in _V_SYSTEM_UTIL
It is a virtual management view that is always there, but transient in nature.
It contains a limited number of entries (generally, 1 row per minute, for the last
2.5 days). It is re-initialized anytime the database is nzstart'ed.
The statistics that are displayed by this script are specifically regarding
the SPUs. This management view also maintains statistics about the host/
server utilization, but they are generally of less interest (since most of
the work happens down on the SPUs).
Alternatively, the script can report using the information in NZ_SYSUTIL_HISTORY
A permanent version of the above table (_V_SYSTEM_UTIL) can be populated via
the nz_sysutil_history script. This table (if it exists) can provide far more
detailed data for analysis.
Inputs: [<database> [<table>]]
If not specified, then the data in _V_SYSTEM_UTIL will be used.
If you specify a database name, the script will look for a copy of the
NZ_SYSUTIL_HISTORY table in that database. Alternatively, you can specify
both the database + table names.
All of the following command line arguments are optional
-threshold <nnn>
A value from 0 to 100, the default is 0. When displaying the busiest time
periods (by hour, or by ten-minute intervals) only display those rows where
one of the MAX () usage stats meets or exceeds this percentage of utilization.
-limit <nnn>
A value from 1 to 1000000, the default is 100. When displaying the busiest
time periods (by hour, or by ten-minute intervals) limit the output to this
number of rows. The rows are first sorted by their MAX () usage, before the
limit is applied -- in order to concentrate on the time periods with the
heaviest usage.
-date [<day1> [<day2>]]
By default, the script reports on the entire time span -- based on whatever
range of data the table contains.
You can limit the report to a specific day
-date 2014-12-05
You can limit the report to a range of days
-date 2014-12-04 2014-12-10
You can limit the report to a specific time span (use quotes around the timestamps)
-date "2014-12-04 09:00:00" "2014-12-04 17:00:00"
-server
As mentioned, normally this script reports on the SPU side stats.
Use this switch if you instead want it to display stats for the host/server
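For example, to concentrate on the 20 busiest time periods during a given week
in which at least one of the components reached 90% utilization (the date range
shown is purely illustrative):
$ nz_sysutil_stats -threshold 90 -limit 20 -date 2014-12-04 2014-12-10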
Outputs: Generally speaking, "DISK Usage" is going to show one of the higher usage metrics
(because they are slow, mechanical devices). If you have cpu-intensive queries, then
you may have high "CPU Usage". The other items that are being measured and reported
on (FABRIC, MEMORY, TempDISK) do not usually hit the same level of utilization.
The report includes both "Avg" and "Max" measurements. "Max" is an important
measurement because the system is as busy as the busiest device. One can
also compare "Avg" to "Max" to look for signs of skew -- although some skew is to
be expected because different S-Blades support different numbers of dataslices/disks.
Example output:
$ nz_sysutil_stats -date "2014-12-20 19:00:00" "2014-12-20 20:00:00"
############################################################################################################################################
Name mybox
Description Mark's Test Box
Date 20-Jan-15, 21:06:37 UTC
Model IBM PureData System for Analytics N3001-005
############################################################################################################################################
Table Being Analyzed | _V_SYSTEM_UTIL
|
Time Frame: Start | 2014-12-20 19:00:44
End | 2014-12-20 19:59:44
|
# Of Elapsed Minutes | 59
# Of Measurements | 60
|
Reporting On | SPU
############################################################################################################################################
Utilization | Max CPU Usage | Max DISK Usage | Max FABRIC Usage | Max MEMORY Usage | Max TempDISK Usage
------------------- | ---------------------- | ---------------------- | ---------------------- | ---------------------- | ----------------------
00 - 10 % | 4 6.67 % | 0 .00 % | 42 70.00 % | 0 .00 % | 43 71.67 %
10 - 20 % | 3 5.00 % | 0 .00 % | 16 26.67 % | 43 71.67 % | 1 1.67 %
20 - 30 % | 4 6.67 % | 0 .00 % | 0 .00 % | 1 1.67 % | 1 1.67 %
30 - 40 % | 4 6.67 % | 0 .00 % | 0 .00 % | 0 .00 % | 1 1.67 %
40 - 50 % | 5 8.33 % | 0 .00 % | 0 .00 % | 16 26.67 % | 0 .00 %
50 - 60 % | 6 10.00 % | 1 1.67 % | 0 .00 % | 0 .00 % | 0 .00 %
60 - 70 % | 3 5.00 % | 1 1.67 % | 1 1.67 % | 0 .00 % | 0 .00 %
70 - 80 % | 4 6.67 % | 1 1.67 % | 1 1.67 % | 0 .00 % | 3 5.00 %
80 - 90 % | 2 3.33 % | 3 5.00 % | 0 .00 % | 0 .00 % | 8 13.33 %
90 - 100 % | 25 41.67 % | 54 90.00 % | 0 .00 % | 0 .00 % | 3 5.00 %
^ ^
| |
// for 25 minutes -- which is 41.67% of the overall time
// the CPU Usage was maxed out at 90-100% utilization
############################################################################################################################################
Busiest Hours
=============
Hour | CPU Usage (Avg - Max) | DISK Usage (Avg - Max) | FABRIC Usage (Avg - Max) | MEMORY Usage (Avg - Max) | TempDISK Usage (Avg - Max)
---------------------+-----------------------+------------------------+--------------------------+--------------------------+----------------------------
2014-12-20 19:00:00 | 0.51 - 0.67 | 0.69 - 0.97 | 0.05 - 0.07 | 0.15 - 0.20 | 0.19 - 0.21
(1 row)
Busiest 10 Minute Intervals
===========================
10 Min. | CPU Usage (Avg - Max) | DISK Usage (Avg - Max) | FABRIC Usage (Avg - Max) | MEMORY Usage (Avg - Max) | TempDISK Usage (Avg - Max)
---------------------+-----------------------+------------------------+--------------------------+--------------------------+----------------------------
2014-12-20 19:00:00 | 0.77 - 0.94 | 0.68 - 0.96 | 0.13 - 0.16 | 0.33 - 0.42 | 0.74 - 0.76
2014-12-20 19:10:00 | 0.61 - 0.86 | 0.65 - 0.86 | 0.13 - 0.14 | 0.22 - 0.31 | 0.39 - 0.49
2014-12-20 19:20:00 | 0.42 - 0.54 | 0.74 - 1.00 | 0.04 - 0.09 | 0.09 - 0.11 | -
2014-12-20 19:30:00 | 0.40 - 0.52 | 0.68 - 1.00 | 0.01 - 0.01 | 0.09 - 0.11 | -
2014-12-20 19:40:00 | 0.43 - 0.57 | 0.70 - 1.00 | 0.01 - 0.01 | 0.09 - 0.11 | -
2014-12-20 19:50:00 | 0.44 - 0.57 | 0.69 - 1.00 | 0.01 - 0.01 | 0.09 - 0.11 | -
(6 rows)
############################################################################################################################################
nz_zonemap
Usage: nz_zonemap <database> <table/mview> [column ...]
Purpose: To dump out the zonemap information for a table or materialized view.
Zonemaps are automatically created for the first 201 zone-mappable columns
in each table. A 'zone-mappable' column is of type
BIGINT, INTEGER, SMALLINT, BYTEINT (8/4/2/1 byte integers)
DATE
TIMESTAMP (date+time)
As of release 4.6, zonemaps are also automatically created for the 'special'
columns ROWID and CREATEXID.
For materialized views, if the view definition included an ORDER BY clause, then
zone maps will also be maintained for all of the columns included in the ORDER BY
clause. The only exclusion is columns that are of a NUMERIC datatype and
have a precision between 19..38.
For clustered base tables, zonemaps will also be maintained for all of
the columns included in the ORGANIZE ON clause. The only exclusion is
columns that are of a NUMERIC datatype and have a precision between 19..38.
Zone-map values are stored as an 8-byte bigint. This script will attempt to
display the zone-map values in the appropriate format -- as an integer or a date
or a timestamp or a text string or a numeric (no formatting is attempted on
floating point columns). Generally, the valid range for dates and timestamps
is "0000-01-01" to "9999-12-31". If a zonemap value falls outside of this range
the script will display it as either "Invalid Date" or "Invalid Timestamp".
If you have a versioned Table/Secure Table, as the result of doing an
ALTER TABLE <table> [ADD|DROP] COLUMN ...
this script will not run against the table until it has been groomed.
As of release 7.2.1 ...
Previously, zonemaps were always stored in "Column-Oriented Layout".
Now, you have the option of switching to "Table-Oriented Layout".
This will use a small amount of additional disk space (when storing
the zonemaps), but it can also provide significant performance
improvements for certain workloads.
See: /nz/kit/bin/adm/nzsqa -hc zMapLayout
When the new "Table-Oriented Layout" is being used, the system does
not bother to store an EXTENT zonemap record for any partially filled
extent in a table (such as the last extent in the table). The
individual PAGE zonemap records are still maintained + written
(for each page in that extent). And once the extent is full (and
a new extent is allocated), then the extent zonemap record (for the
prior extent) is now written.
For a larger table, this primarily affects only the last extent --
there won't be an extent zonemap record for it. For a smaller table,
there might only be one partially filled extent in the table ...
thus, there would be no extent records at all. By default, nz_zonemap
only shows extent zonemap records ... and if there aren't any to
display then you might assume that no zonemap records = no data. But
that isn't necessarily the case ... as there could still be page
oriented zonemap records for that table. By using the option
nz_zonemap -page
you will be able to see ALL of the zonemap records associated with
the table (both EXTENT and PAGE).
Inputs: The database and table/mview names are required.
If no column name is specified, this script will simply display the
list of columns for which zone maps are maintained for this table/mview.
Or you can specify the names of as many columns as you want --
for which you want the zonemap information dumped out. The columns
will be displayed side-by-side.
If you specify a column name of ALL then the script will automatically
pick ALL zonemappable columns within the table (along with the ROWID and
CREATEXID columns). The output could be quite wide ... and hard to
view on a text terminal. So it will be formatted somewhat differently,
to make it easier to import into something else
(a table, a spreadsheet, ...) for further analysis.
On NPS 4.x you are limited (in this script) to specifying just 2 column names.
All of the following switches are optional.
-page Zonemaps are maintained at the extent level (3MB).
NPS 7.0 adds support for zonemaps at the page level (128KB).
This option will dump out all available zonemap records
for the extents AND for the 1..24 pages within each
extent. For use with NPS versions 7.0+
-pagemap A storage extent is comprised of 24 pages. If any of the
pages are not being used, then NPS will not bother to waste
its time scanning the pages. This option will display a
"pagemap" indicating which pages are/are not in use for
each of the extents. For use with NPS versions 6.0+
-dsid <nn> By default, data slice 1 (the 1st amongst all data slices)
will be reported upon. You can choose to look at the
zonemap records on a different data slice, if you wish.
Every SPU contains zonemap information. Rather than
showing all info for all SPUs (which could be voluminous),
this script will concentrate on just one SPU, which
should provide a good representative sample.
-percent_sorted If specified, then the only thing that this script will
output is a number (a percentage) between 0 and 100 ...
to indicate how sorted the data is on disk (based on an
analysis of the zonemap information for the column(s)
specified). This is useful if you wish to use this
information in your own scripts (see the example after
this option list). In that case, make sure to test/check
the exit status ( $? ) from this script:
0 = success 1 = failure
-fast (6.0+) Accessing the system management table (_VT_SPU_ZMAP_INFO)
takes time. This option will do a CTAS to create a
permanent table/cache of that information ... so that
you can run this script over+over to analyze different
tables and different columns more readily. Including
this switch will (a) create the cache the first time
and (b) reuse the cache during subsequent invocations.
However, the information is static -- as of the moment
the script creates the permanent table (which could be
way in the past). DROP'ing the table SYSTEM..VT_SPU_ZMAP_INFO
will cause the script to recreate/refresh the information
it contains.
(5.0-) In NPS 4.x/5.x, this switch will cause the script to run
about 2X faster ... but at the cost of having to exclude
the "Sort" by column(s) from the output.
-info Dump out some summary information about the zonemap table
o Relevant system registry settings
o The overall size of the table (per dataslice)
o The number of extent/page zonemap records (per dataslice)
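As an example of using the "-percent_sorted" option from within your own shell
script (the database, table and column names here are purely illustrative):
PCT=$(nz_zonemap -percent_sorted PROD_DB SALES_FACT ORDER_DATE)
if [ $? -eq 0 ]; then
    # The script succeeded; PCT holds a number between 0 and 100
    echo "SALES_FACT is ${PCT}% sorted on ORDER_DATE"
else
    # A non-zero exit status indicates a failure
    echo "nz_zonemap failed for SALES_FACT" >&2
fi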
Outputs: A dump of the zonemap information for this table/mview ... for the
specified column(s) ... taken from a single data slice. Examples, with
commentary, follow.
...................................................................................
If the "Sort" column shows "true", then the (min) value in this extent is greater
than or equal to the (max) value of the previous extent. Which indicates optimal
zonemap usage and results in optimal performance. Basically, this script is trying
to see if the data is sorted on disk in ASCending order.
Extent # | gap | customer_num(min) | customer_num(max) | Sort
----------+-----+-------------------+-------------------+------
1 | | 300127 | 9100808 |
2 | | 51775807 | 97100423 | true
3 | 7 | 100000053 | 381221123 | true
And since a DESCending sort order should perform just as well as an ASCending sort
order (when it comes to zonemaps), the column will show "true" if the (max) value in
this extent is less than or equal to the (min) value of the previous extent.
The "Extent #" that is displayed is basically just a one-up number. Whereas the
"gap" column is used to indicate whether the extents are contiguous on disk. If
the extents are contiguous ... if the gap is 0 ... then a blank will be displayed.
Otherwise, this number will represent the number of other extents (not belonging
to this table) between this extent and the prior extent.
...................................................................................
If no details are displayed (if both the "(Min)" and "(Max)" values are null) then
there is no zonemap record for this column for this extent.
Extent # | gap | customer_name(min) | customer_name(max) | Sort
----------+-----+--------------------+--------------------+------
1 | | | |
2 | | | |
3 | 7 | | |
...................................................................................
If some of the details are "missing" for some of the extents, then a transaction
was probably interrupted ( e.g., ^C ) -- causing the zone maps for those extents to
not be set. A "GENERATE [EXPRESS] STATISTICS" should fix that right up.
Extent # | gap | customer_num(min) | customer_num(max) | Sort
----------+-----+-------------------+-------------------+------
1 | | 300127 | 9100808 |
2 | | 51775807 | 97100423 | true
3 | 7 | | |
This can also occur in other situations. Let's say you add an ORGANIZE ON clause
to an existing table that already contains data, and that the ORGANIZE ON
clause includes a text (character) based column. Any new data that gets
nzload'ed/insert'ed into the table will automatically have zonemap entries
created for the text column. But any existing data (extents) will not have
zonemap records ... not until the table is GROOM'ed.
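For instance (assuming a hypothetical table named SALES_FACT in the SALES_DB
database), the corresponding SQL could be run via nzsql as follows:
# Rebuild zonemaps left unset by an interrupted transaction
$ nzsql -d SALES_DB -c "GENERATE EXPRESS STATISTICS ON sales_fact;"
# Build zonemaps for pre-existing extents after adding an ORGANIZE ON clause
$ nzsql -d SALES_DB -c "GROOM TABLE sales_fact;"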
...................................................................................
If all of the rows in a particular extent are
(a) deleted, or
(b) contain the NULL value in this column
then
the Min/Max values will be displayed as a line of dashes, and
the "Sort" column will display one of the following strings
DEL
NULL
DEL/NULL
No query that contains a WHERE clause against this column would match these
conditions, thus this particular extent should not need to be scanned.
Extent # | gap | customer_num(min) | customer_num(max) | Sort
----------+-----+-------------------+-------------------+------------------
1 | | ---------- | ---------- | DEL/NULL
2 | | 51775807 | 97100423 |
3 | 7 | 100000053 | 381221123 | true
...................................................................................
Zonemaps have always been maintained at the extent level (where 1 extent = 3MB).
As of NPS 7.0, the system may now decide to also store zonemaps for individual
pages (where 1 page = 128KB). This is known as "page granular zonemaps".
The appliance automatically makes the determination as to what level of detail
should be stored in the zonemap table (for any given page / extent / column).
In this example, the "nz_zonemap -page" option has been specified ... so that
the script will dump out the additional zonemap detail records for each page
(if they exist).
For the column called "THE_EXTENT", only extent level zonemaps have been stored.
Thus, the "(Min)" and "(Max)" column values ... for pages 1..24 ... are all null.
Whereas for the column called "THE_PAGE", it has both an extent zonemap record
and 24 individual page zonemap records (per extent).
The "Sort" column is now a comparison of this extent versus the previous extent
or of this page versus the previous page
$ nz_zonemap -page SYSTEM NZ_CHECK_DISK_SCAN_SPEEDS THE_EXTENT THE_PAGE
Database: SYSTEM
Object Name: NZ_CHECK_DISK_SCAN_SPEEDS
Object Type: TABLE
Object ID : 200000
Data Slice: 1
Column 1: THE_EXTENT (SMALLINT)
Column 2: THE_PAGE (SMALLINT)
Extent # | gap | Page # | THE_EXTENT(min) | THE_EXTENT(max) | Sort | THE_PAGE(min) | THE_PAGE(max) | Sort
----------+-----+--------+-----------------+-----------------+------+---------------+---------------+------
1 | | extent | 1 | 1 | | 1 | 24 |
1 | | 1 | | | | 1 | 1 |
1 | | 2 | | | | 2 | 2 | true
1 | | 3 | | | | 3 | 3 | true
1 | | 4 | | | | 4 | 4 | true
1 | | 5 | | | | 5 | 5 | true
1 | | 6 | | | | 6 | 6 | true
1 | | 7 | | | | 7 | 7 | true
1 | | 8 | | | | 8 | 8 | true
1 | | 9 | | | | 9 | 9 | true
1 | | 10 | | | | 10 | 10 | true
1 | | 11 | | | | 11 | 11 | true
1 | | 12 | | | | 12 | 12 | true
1 | | 13 | | | | 13 | 13 | true
1 | | 14 | | | | 14 | 14 | true
1 | | 15 | | | | 15 | 15 | true
1 | | 16 | | | | 16 | 16 | true
1 | | 17 | | | | 17 | 17 | true
1 | | 18 | | | | 18 | 18 | true
1 | | 19 | | | | 19 | 19 | true
1 | | 20 | | | | 20 | 20 | true
1 | | 21 | | | | 21 | 21 | true
1 | | 22 | | | | 22 | 22 | true
1 | | 23 | | | | 23 | 23 | true
1 | | 24 | | | | 24 | 24 | true
2 | | extent | 2 | 2 | true | 1 | 24 |
2 | | 1 | | | | 1 | 1 | true
2 | | 2 | | | | 2 | 2 | true
2 | | 3 | | | | 3 | 3 | true
2 | | 4 | | | | 4 | 4 | true
2 | | 5 | | | | 5 | 5 | true
2 | | 6 | | | | 6 | 6 | true
2 | | 7 | | | | 7 | 7 | true
2 | | 8 | | | | 8 | 8 | true
<...>
...................................................................................
Storage is allocated an extent at a time (3MB). Within the extent, it is then
filled up with records a page at a time (128KB). The pages are filled front-to-back.
Once all of the available space in the extent is used up, a new extent is allocated.
Usually, all of the 24 pages within an extent are in-use. But there are
exceptions.
o The very last extent for a table will probably only be partially filled. So
any remaining pages will be unused (empty).
o If you have done a "GROOM TABLE <tablename> PAGES ALL;" then any pages that
contain 100% deleted/groomable rows will be marked as being empty. They
will no longer be used/scanned, though they still exist within the extent.
If all 24 pages in the extent are empty, the extent will be removed from
the table and added back to the storage pool.
o Clustered Base Tables (those created with an ORGANIZE ON clause) may only
partially fill each cluster (or any given extent).
In this example, the "nz_zonemap -pagemap" option has been specified ... so
that the script will display the additional column "Used/Unused Pages (./0)"
to represent which pages are (or are not) in use within any given extent.
A "." indicates the page is in use. A "0" indicates the page is not being used.
$ nz_zonemap -pagemap SYSTEM SAMPLE_TABLE CREATEXID
Database: SYSTEM
Object Name: SAMPLE_TABLE
Object Type: TABLE
Object ID : 10569259
Data Slice: 1
Column 1: CREATEXID (BIGINT)
Extent # | gap | Used/Unused Pages (./0) | CREATEXID(min) | CREATEXID(max) | Sort
----------+-----+--------------------------+----------------+----------------+------
1 | | 0....................... | 98840880 | 98840880 |
2 | | .0...................... | 98840880 | 98840880 | true
3 | | ..0..................... | 98840880 | 98840880 | true
4 | | ...0.................... | 98840880 | 98840880 | true
5 | | ....0................... | 98840880 | 98840880 | true
6 | | .0.0.0.0.0.0.0.0.0.0.0.0 | 98840880 | 98840880 | true
7 | | 0.0.0.0.0.0.0.0.0.0.0.0. | 98840880 | 98840880 | true
8 | | .0000000000000000000000. | 98840880 | 98840880 | true
9 | | ........................ | 98840880 | 98840880 | true
10 | 1 | ........................ | 98840880 | 98840880 | true
11 | 3 | ........................ | 98840880 | 98840880 | true
12 | | 0.0................000.. | 98840880 | 98840880 | true
13 | | ......0000.............. | 98840880 | 98840880 | true
14 | 1 | ........................ | 98840880 | 98840880 | true
15 | | ............000000000000 | 98840880 | 98840880 | true
(15 rows)
* * * * DDL * * * *
nz_clone
Usage: nz_clone <database> <object> [<new_name>]
Purpose: To clone the DDL for an existing object, and optionally assign it a new name.
This script will produce any DDL statements related to the object
CREATE ...
COMMENT ON ...
GRANT ...
ALTER ... OWNER TO ...
While (optionally) substituting the new_name for its current name.
Inputs: The database and object name (of an existing object) is required.
The object can be of type
Table, External Table, View, Materialized View
Schema, Sequence, Synonym, Function, Aggregate, Procedure, Library
If the object is a function/aggregate/procedure, then you must pass
this script the exact signature, wrapped in single quotes. For example:
$ nz_clone SYSTEM 'TEST_FUNCTION_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_function_signatures
nz_get_aggregate_signatures
nz_get_procedure_signatures
The <new_name> is optional ... only used if you want to assign a new
name to the object.
Outputs: SQL DDL will be sent to standard out.
You could, if you wanted to, pipe that output directly back into
nzsql to actually create said object all in one step. For example:
nz_clone prod_db sales_table puppy_dog | nzsql test_db
nz_ddl
Usage: nz_ddl [ <database> [ -rename <new_database_name> ]] [ -udxDir <dirname> ]
Purpose: To dump out all of the SQL/DDL that defines this NPS system.
This includes
CREATE GROUP ...
CREATE USER ...
CREATE DATABASE / CREATE SCHEMA ...
CREATE TABLE ...
CREATE EXTERNAL TABLE ...
CREATE VIEW ...
CREATE MATERIALIZED VIEW ...
CREATE SEQUENCE ...
CREATE SYNONYM ...
CREATE FUNCTION ...
CREATE AGGREGATE ...
CREATE LIBRARY ...
CREATE PROCEDURE ...
And also
COMMENT ... (to add COMMENTs to an object)
GRANT ... (to GRANT access privileges to users and groups)
SET ... (to SET the system default values)
ALTER ... (to ALTER the owner of an object)
UPDATE ... (to UPDATE the encrypted passwd column when creating users)
Access: For the most part, the nz_ddl* scripts access generic system views in
order to do their work. If you have access to a given object, the script
will be able to reproduce the DDL for it. With the following caveats:
4.6 No caveats
4.5 nz_ddl_function -- requires SELECT access to the system table _T_PROC
nz_ddl_aggregate -- requires SELECT access to the system table _T_AGGREGATE
4.0, 4.5 Do you use quoted database names ... e.g., "My Database" (which is
rather atypical to begin with)? If so, then various scripts will
want SELECT access to the system table _T_OBJECT in order to
identify whether or not a particular database name needs quoting.
Without such access the scripts will still function, but they won't
add quotes around any database name that would require quoting.
nz_ddl_user
When dumping out the CREATE USER statements, each user's default
password is initially set to 'password'. For this script to be
able to UPDATE the password (with the actual encrypted password)
that will require SELECT access to the system table _T_USER.
Otherwise, this script will not generate the additional SQL
statements to update the password.
nz_ddl_function, nz_ddl_aggregate, nz_ddl_library
These scripts place a copy of the host+SPU object files into
the directory '/tmp/nz_udx_object_files'. In order for the
copy operation to work successfully, the script must be run
as the linux user 'nz' so that it can access the original
files under /nz/data
nz_ddl_sequence
The starting value for the sequence ("START WITH ...") will be
based upon the _VT_SEQUENCE.NEXT_CACHE_VAL value (which is not
necessarily the next sequence number -- but rather the next
cache number/value that would be doled out. For more on this topic
see "Caching Sequences" in the "Database User's Guide".)
If you do not have access to that virtual table (and by default,
users do not) then the "START WITH ..." value will be based upon
whatever value was used when the sequence was originally created.
Inputs: By default, everything about the NPS server will be dumped out.
The <database> name is optional.
If a database name is included, then only the specified database/schema
will be processed. The output will include a "CREATE DATABASE" statement
(for the database), but there will be no "CREATE SCHEMA" statement ...
which would have the effect of creating all of the objects in the default
schema of the new database.
The SQL/DDL will include all of the CREATE, COMMENT, GRANT and ALTER
statements associated with the database/schema.
Specify "-owner <name>" to limit the output to those database objects
(tables, views, sequences, ...) owned by the specified user. It will
include CREATE and GRANT statements. It will not include ALTER ... OWNER
statements as all objects are owned by the same specified "-owner <name>".
It will not include any COMMENT statements.
-rename <new_database_name>
in which case the <new_database_name> will be substituted into
the DDL that is generated by this script.
If you want to quickly clone a database structure on the same NPS host
(the DDL, not the data itself) you can do something like this:
nz_ddl INVENTORY -rename INVENTORY_COPY | nzsql
Likewise, you could clone the structure to another NPS host by doing:
nz_ddl INVENTORY -rename INVENTORY_COPY | nzsql -host another_host
Because groups and users are global in nature -- and not tied to
a particular database -- no CREATE GROUP/CREATE USER statements
will be included in this output. However, any GRANT's -- to give
groups and users the relevant access to the objects within this
database -- will be included.
-udxDir <dirname>
Part of the definition of any FUNCTION/AGGREGATE/LIBRARY is a reference to
two compiled object files -- one for the host and one for the SPU/SBlade.
For your convenience, a copy of these object files will be put under the
directory
/tmp/nz_udx_object_files
If you wish, you can use this switch to specify an alternative directory
location for the files to be copied to.
Should you want to use this DDL to create these same objects on another NPS
box, these files must be made available there.
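For example (the target hostname below is only an illustration), the saved
object files could be copied over with a standard scp:
$ scp -r /tmp/nz_udx_object_files nz@other_host:/tmp/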
Note: The scripts 'nz_ddl_group' and 'nz_ddl_user' can be used
separately to generate any desired CREATE GROUP or CREATE USER
statements.
Note: Privileges can also be set globally -- within the 'SYSTEM'
database -- and therefore be applicable to all databases. When
moving just a single database from one machine to another you
should issue the following commands
nz_ddl_grant_group -sysobj system
nz_ddl_grant_user -sysobj system
to retrieve and review those global privileges -- and see if there
are any you wish to set on the new machine.
Outputs: The SQL/DDL (with comments) will be sent to standard out. This
should be redirected to a disk file for future use.
When you are ready to replay the SQL (on this, or another, system)
you should do it as a user who has the appropriate privileges to
issue all of the various SQL/DDL statements.
When replaying the SQL, you might want to invoke it in a manner
such as this
nzsql < your_ddl.sql &> your_ddl.out
The output file can then be quickly scanned + checked for
problems in a manner such as this
cat your_ddl.out | grep -F -v "*****" | sort | LC_COLLATE=C uniq -c
The summarization that is produced would look something like this.
The 'NOTICE's are ok -- they simply indicate that some of the
CREATE TABLE statements had constraints associated with them.
If there are any 'ERROR's listed, they warrant your attention.
10 ALTER DATABASE
2 ALTER GROUP
1 ALTER SEQUENCE
2 ALTER SYNONYM
21 ALTER TABLE
19 ALTER USER
9 ALTER VIEW
19 COMMENT
31 CREATE DATABASE
18 CREATE EXTERNAL TABLE
14 CREATE GROUP
13 CREATE MATERIALIZED VIEW
16 CREATE SEQUENCE
15 CREATE SYNONYM
89 CREATE TABLE
18 CREATE USER
17 CREATE VIEW
151 GRANT
55 NOTICE: foreign key constraints not enforced
30 NOTICE: primary key constraints not enforced
12 NOTICE: unique key constraints not enforced
6 SET VARIABLE
31 UPDATE 1
nz_ddl_aggregate
Usage: nz_ddl_aggregate [database [aggregate_name/signature]]
Purpose: To dump out the SQL/DDL that was used to create a user defined aggregate.
i.e. CREATE or replace AGGREGATE ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The aggregate_name/signature is optional.
If not specified, then a CREATE AGGREGATE statement will be generated
for each UDA in the database.
Specify "-owner <name>" to limit the output to objects owned
by the specified user.
If you specify an aggregate_signature, the DDL to create just that
single UDA will be produced. You must pass this script the
exact signature, wrapped in single quotes. For example:
$ nz_ddl_aggregate SYSTEM 'TEST_AGGREGATE_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_aggregate_signatures
If you specify an aggregate_name, the DDL to create all aggregates
matching that aggregate name will be produced. For example:
$ nz_ddl_aggregate SYSTEM TEST_AGGREGATE_1
-udxDir <dirname>
Part of the aggregate definition includes a reference to two compiled object
files -- one for the host and one for the SPU/SBlade. For your convenience,
a copy of these object files will be put under the directory
/tmp/nz_udx_object_files
If you wish, you can use this switch to specify an alternative directory
location for the files to be copied to.
Should you want to use this DDL to create these same aggregates on another
NPS box, these object files must be made available there.
Outputs: SQL DDL (the CREATE AGGREGATE statements) will be sent to standard out.
In order for the copy operation to work successfully, the script must
be run as the linux user 'nz' so that it can access the original files
under /nz/data
nz_ddl_all_grants
Usage: nz_ddl_all_grants
Purpose: To dump out ALL of the GRANT statements for this system.
This will be for all users and groups.
This will be for all objects.
This will be for all databases and schemas.
Inputs: None
Outputs: The SQL/DDL will be sent to standard out. It will include a "\connect"
statement to connect to each database (in turn), and all of the GRANTs
associated with that database.
This is a stripped down / modified version of the nz_ddl script: instead
of dumping ALL of the DDL for ALL objects on this system, it will only
dump out the GRANT statements.
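One possible way to use this output (the target hostname is illustrative) is to
capture the GRANTs to a file, review it, and then replay it on another system:
$ nz_ddl_all_grants > all_grants.sql
$ nzsql -host another_host < all_grants.sql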
nz_ddl_comment
Usage: nz_ddl_comment [database [object]]
Purpose: To dump out the SQL/DDL used to add a comment to an object.
i.e. COMMENT ON object ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
By default, a COMMENT ON statement will be produced for all objects (and
columns) that have had user-defined comments added to them.
If you specify the optional object name, then only the COMMENT ON statement
pertaining to that object (and any of that object's columns) will be produced.
If there are no comments for that object your output will be blank.
If the object is a function/aggregate/procedure, then you must pass
this script the exact signature, wrapped in single quotes. For example:
$ nz_ddl_comment SYSTEM 'TEST_FUNCTION_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_function_signatures
nz_get_aggregate_signatures
nz_get_procedure_signatures
Outputs: SQL DDL to generate the COMMENT ON statements will be sent to standard out.
nz_ddl_database
Usage: nz_ddl_database [database]
Purpose: To dump out the SQL/DDL that was used to create a database.
i.e. CREATE DATABASE ...
Inputs: The database name is optional.
If not specified, then a CREATE DATABASE statement will be generated for
each database on your NPS server.
If you specify the database name, the DDL to create just that single
database will be produced.
Outputs: SQL DDL (the CREATE DATABASE statements) will be sent to standard out.
nz_ddl_diff
Usage: nz_ddl_diff -sdb <database> -tdb <database> [ optional args ]
Purpose: Report any DDL "diff"erences between two databases.
The databases can reside on the same, or different, NPS servers.
Inputs: -sdb <database> # Source database. A required argument.
-tdb <database> # Target database. A required argument.
OPTIONAL Arguments
==================
-sequence # By default, all objects/types will be processed.
-library # To restrict the comparison to specific object
-function # types use any combination of these switches.
-aggregate
-table
-mview
-exttable
-synonym
-view
-procedure
# By default, the report will show objects that
-only_in_source # 1) Exist only in the source database
-only_in_target # 2) Exist only in the target database
-only_if_different # 3) Exist in both databases, but are different
#
# To restrict the report to one (or more) of these
# comparisons, use any combination of these switches.
-verbose # When two objects differ, the default report will only
# list the name of the object. To see a list of the
# actual differences, include this switch.
#
-brief # The default report includes various header information
# for readability. If you want the report to list
# only the object names (to make it easier to process
# via a subsequent script) include this switch.
#
# The -verbose and -brief options are mutually exclusive.
-ignore # Ignore (don't flag) the following:
# The use/absence of double quotes (located anywhere in the DDL)
# Any differences in UPPER/lower case
# Table constraints (UNIQUE/PRIMARY/FOREIGN keys)
#
# Thus, the following two table definitions would now be a match:
#
# CREATE TABLE "Example1"
# ( "Customer_ID" bigint not null
# ,UNIQUE ("Customer_ID")
# ,PRIMARY KEY ("Customer_ID")
# )
# DISTRIBUTE ON ("Customer_ID");
#
# CREATE TABLE EXAMPLE1
# ( CUSTOMER_ID bigint not null
# )
# DISTRIBUTE ON (CUSTOMER_ID);
-ignoreQuotes # Ignore quotes and case in object names/DDL
# But do not ignore additional objects like foreign keys on tables
-shost <name/IP> # Source host
-thost <name/IP> # Target host
-sschema <schema> # Source Schema
-tschema <schema> # Target Schema
-suser <user> # Source user [SUSER]
-spassword <password> # Source password [SPASSWORD]
-tuser <user> # Target user [TUSER]
-tpassword <password> # Target password [TPASSWORD]
# If the user/password arguments are NOT specified, the default
# NZ_USER and NZ_PASSWORD environment variables will be used in
# their place when connecting to the source and target machines.
#
# Rather than passing these arguments on the command line, you could
# instead specify them via the environment variables listed above.
# This would be more secure, as the passwords wouldn't appear if
# someone issued a 'ps' command.
#
# Alternatively, if you have set up the password cache (via the
# 'nzpassword' command) the passwords can be obtained directly from
# that.
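For instance, to compare the tables and views of a database that exists on two
different NPS hosts, and list the actual differences that are found (the
hostnames here are illustrative):
$ nz_ddl_diff -table -view -only_if_different -verbose -sdb PROD -tdb PROD -shost prod_host -thost dr_host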
Outputs: Example output
$ nz_ddl_diff -table -sdb production -tdb development
Object Type: TABLE
Only in source (Host: localhost Database: PRODUCTION)
==============
COPY_OF_PAYROLL
COPY_OF_STATES_DIMENSION
Only in target (Host: localhost Database: DEVELOPMENT)
==============
TEST1
TEST2
TEST3
Differences
===========
CUSTOMERS
nz_ddl_ext_table
Usage: nz_ddl_ext_table [database [ext_table]]
Purpose: To dump out the SQL/DDL that was used to create an external table.
i.e. CREATE EXTERNAL TABLE ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The external table name is optional.
If not specified, then a CREATE EXTERNAL TABLE statement will be
generated for each external table in the database.
Specify "-owner <name>" to limit the output to objects owned
by the specified user.
If you specify the table name, the DDL to create just that single
external table will be produced.
Outputs: SQL DDL (the CREATE EXTERNAL TABLE statements) will be sent to standard out.
nz_ddl_function
Usage: nz_ddl_function [database [function_name/signature]]
Purpose: To dump out the SQL/DDL that was used to create a user defined function.
i.e. CREATE or replace FUNCTION ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The function_name/signature is optional.
If not specified, then a CREATE FUNCTION statement will be generated
for each UDF in the database.
Specify "-owner <name>" to limit the output to objects owned
by the specified user.
If you specify a function_signature, the DDL to create just that
single UDF will be produced. You must pass this script the
exact signature, wrapped in single quotes. For example:
$ nz_ddl_function SYSTEM 'TEST_FUNCTION_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_function_signatures
If you specify a function_name, the DDL to create all UDFs matching
that function name, will be produced. For example:
$ nz_ddl_function SYSTEM TEST_FUNCTION_1
-udxDir <dirname>
Part of the function definition includes a reference to two compiled object
files -- one for the host and one for the SPU/SBlade. For your convenience,
a copy of these object files will be put under the directory
/tmp/nz_udx_object_files
If you wish, you can use this switch to specify an alternative directory
location for the files to be copied to.
Should you want to use this DDL to create these same functions on another NPS
box, these object files must be made available there.
Outputs: SQL DDL (the CREATE FUNCTION statements) will be sent to standard out.
In order for the copy operation to work successfully, the script must
be run as the linux user 'nz' so that it can access the original files
under /nz/data
nz_ddl_grant_group
Usage: nz_ddl_grant_group [-all|-sysobj|-usrobj] [database [object_name]]
Purpose: To dump out the SQL/DDL that represents any access GRANT'ed to a group.
i.e. GRANT ...
This will show the various access privileges that have been GRANT'ed
(within this database) to any/all groups.
Inputs: If you specify "-all" (the default), then the privileges for any+all
objects will be dumped out.
-sysobj Only dump out the privileges for system objects (objects
with an objid <= 200000)
-usrobj Only dump out the privileges for user objects (objects
with an objid > 200000)
The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
If you specify the optional object_name, only the GRANT statements pertaining
to that specific object_name will be shown. If the object_name does not exist,
an error message will be displayed.
If the object_name is a function/aggregate/procedure, then you must pass
this script the exact signature, wrapped in single quotes. For example:
$ nz_ddl_grant_group SYSTEM 'TEST_FUNCTION_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_function_signatures
nz_get_aggregate_signatures
nz_get_procedure_signatures
Outputs: SQL DDL (the GRANT statements) will be sent to standard out.
This may include some/all of the following
o GRANTs on named objects
o GRANTs on object classes
o Administrative privileges that have been GRANTed
o And any "... WITH GRANT OPTION" privileges
When processing a single schema, the GRANTs that are dumped out
apply to just that schema. For example
nz_ddl_grant_group PROD -schema DEV_SCHEMA
GRANT select ON TABLE TO DBA ;
If you want to include the GRANTs that apply to ALL schemas
in that database, use this form of the command.
nz_ddl_grant_group PROD -schemas DEV_SCHEMA
GRANT list ON ALL.TABLE TO DBA ;
GRANT select ON DEV_SCHEMA.TABLE TO DBA ;
By specifying "-schemas" (rather than the singular "-schema")
the output will include (a) the corresponding schema name or
(b) ALL to indicate that it is a global grant associated with
ALL schemas in this database.
To see privileges that apply to ALL databases, run this
script against the SYSTEM database.
nz_ddl_grant_user
Usage: nz_ddl_grant_user [-all|-sysobj|-usrobj] [database [object_name]]
Purpose: To dump out the SQL/DDL that represents any access GRANT'ed to a user.
i.e. GRANT ...
This will show the various access privileges that have been GRANT'ed
(within this database) to any/all users.
Inputs: If you specify "-all" (the default), then the privileges for any + all
objects will be dumped out.
-sysobj Only dump out the privileges for system objects (objects
with an objid <= 200000)
-usrobj Only dump out the privileges for user objects (objects
with an objid > 200000)
The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
If you specify the optional object_name, only the GRANT statements pertaining
to that specific object_name will be shown. If the object_name does not exist,
an error message will be displayed.
If the object_name is a function/aggregate/procedure, then you must pass
this script the exact signature, wrapped in single quotes. For example:
$ nz_ddl_grant_user SYSTEM 'TEST_FUNCTION_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_function_signatures
nz_get_aggregate_signatures
nz_get_procedure_signatures
Outputs: SQL DDL (the GRANT statements) will be sent to standard out.
This may include some/all of the following
o GRANTs on named objects
o GRANTs on object classes
o Administrative privileges that have been GRANTed
o And any "... WITH GRANT OPTION" privileges
When processing a single schema, the GRANTs that are dumped out
apply to just that schema. For example
nz_ddl_grant_user PROD -schema DEV_SCHEMA
GRANT select ON TABLE TO MARK ;
If you want to include the GRANTs that apply to ALL schemas
in that database, use this form of the command.
nz_ddl_grant_user PROD -schemas DEV_SCHEMA
GRANT list ON ALL.TABLE TO MARK ;
GRANT select ON DEV_SCHEMA.TABLE TO MARK ;
By specifying "-schemas" (rather than the singular "-schema")
the output will include (a) the corresponding schema name or
(b) ALL to indicate that it is a global grant associated with
ALL schemas in this database.
To see privileges that apply to ALL databases, run this
script against the SYSTEM database.
nz_ddl_group
Usage: nz_ddl_group [groupname]
Purpose: To dump out the SQL/DDL that was used to create a group.
i.e. CREATE GROUP ...
Inputs: The group name is optional.
If not specified, then a CREATE GROUP statement will be generated for
each group defined on your NPS server.
If you specify the group name, the DDL to create just that single
group will be produced.
Note: If "ACCESS TIME" has been defined for groups, then the person running
this script must have been granted select access to the system table
_T_ACCESS_TIME. Otherwise, that information is not accessible and will not
be displayed by this script.
Outputs: SQL DDL (the CREATE GROUP statements) will be sent to standard out.
nz_ddl_history_config
Usage: nz_ddl_history_config [config_name]
Purpose: To dump out the SQL/DDL that was used to create a history configuration.
i.e. CREATE HISTORY CONFIGURATION ...
Inputs: The config_name is optional.
If not specified, then a CREATE HISTORY CONFIGURATION statement will be
generated for each history configuration defined on your NPS server.
If you specify the config_name, the DDL to create just that single history
configuration will be produced.
Outputs: SQL DDL (the CREATE HISTORY CONFIGURATION statements) will be sent to standard out.
nz_ddl_library
Usage: nz_ddl_library [database [library_name]]
Purpose: To dump out the SQL/DDL that was used to create a user defined shared library.
i.e. CREATE or replace LIBRARY ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The library name is optional.
If not specified, then a CREATE LIBRARY statement will be generated
for each library in the database.
Specify "-owner <name>" to limit the output to objects owned
by the specified user.
If you specify the library name, the DDL to create just that single
shared library will be produced.
-udxDir <dirname>
Part of the library definition includes a reference to two compiled object
files -- one for the host and one for the SPU/SBlade. For your convenience,
a copy of these object files will be put under the directory
/tmp/nz_udx_object_files
If you wish, you can use this switch to specify an alternative directory
location for the files to be copied to.
Should you want to use this DDL to create these same libraries on another NPS
box, these object files must be made available there.
Outputs: SQL DDL (the CREATE LIBRARY statements) will be sent to standard out.
In order for the copy operation to work successfully, the script must
be run as the linux user 'nz' so that it can access the original files
under /nz/data
nz_ddl_mview
Usage: nz_ddl_mview [database [materialized_view]]
Purpose: To dump out the SQL/DDL that was used to create a materialized view.
i.e. CREATE or replace MATERIALIZED VIEW ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The materialized view name is optional.
If not specified, then a CREATE MATERIALIZED VIEW statement will be
generated for each mview in the database.
Specify "-owner <name>" to limit the output to objects owned by
the specified user.
If you specify the materialized view name, the DDL to create just that
single mview will be produced.
Outputs: SQL DDL (the CREATE MATERIALIZED VIEW statements) will be sent to standard out.
nz_ddl_object
Usage: nz_ddl_object <database> <object> [-rename <name>]
Purpose: To dump out the SQL/DDL that was used to create any object (of any type).
i.e. CREATE ...
So this script is similar to many of the other nz_ddl_* scripts (such as
nz_ddl_table and nz_ddl_view and nz_ddl_sequence and etc ...)
But this script is general purpose in nature -- in that it will work
for any/all database objects (it is simply a general purpose wrapper
for the other scripts).
Beyond that, this script also produces any other DDL statements related
to that object, such as
COMMENT ON ...
GRANT ...
ALTER ... OWNER TO ...
And this script also supports a "-rename" option.
Inputs: The database and object names are required. The object can be of type
Table, External Table, View, Materialized View
Schema, Sequence, Synonym, Function, Aggregate, Procedure, Library
If the object is a function/aggregate/procedure, then you must pass
this script the exact signature, wrapped in single quotes. For example:
$ nz_ddl_object SYSTEM 'TEST_FUNCTION_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_function_signatures
nz_get_aggregate_signatures
nz_get_procedure_signatures
If you include the optional "-rename <name>" switch, then the new
<name> will be substituted into the DDL that is produced, in lieu of
the original objectname. (e.g., useful if you want to clone an object)
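For example, to clone a table definition under a new name within the same
database (the object names are illustrative), the output can be piped straight
back into nzsql:
$ nz_ddl_object PROD_DB SALES_TABLE -rename SALES_TABLE_COPY | nzsql PROD_DB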
Outputs: SQL DDL will be sent to standard out.
nz_ddl_owner
Usage: nz_ddl_owner [ object1 [ object2 ]]
Purpose: To dump out the SQL/DDL used to change the owner of an object.
i.e. ALTER object_type object_name OWNER TO user_name;
Inputs: If no arguments are specified, then 'ALTER' DDL for all USER, GROUP
and SCHEDULER RULE objects is dumped out.
If one argument is specified, then
if that argument represents a USER, GROUP or SCHEDULER RULE then
the 'ALTER' DDL for just that single object will be dumped out
if that argument represents a DATABASE, then the 'ALTER' DDL
for that database will be dumped out ... along with the 'ALTER'
DDL for all objects (tables, views, synonyms, etc ...) within
that database
If two arguments are specified, the first argument represents the
database name, and the second argument represents the object within
that database. The 'ALTER' DDL for just that specific object will be
dumped out.
If the object is a function/aggregate/procedure, then you must pass
this script the exact signature, wrapped in single quotes. For example:
$ nz_ddl_owner SYSTEM 'TEST_FUNCTION_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_function_signatures
nz_get_aggregate_signatures
nz_get_procedure_signatures
Outputs: SQL DDL (the ALTER ... OWNER statements) will be sent to standard out.
If an object is owned by the 'ADMIN' user (as many objects tend to be)
then no ALTER ... OWNER statement will be produced for that object.
If the specified 'object2' does not exist within the database then an
error message will be displayed.
nz_ddl_procedure
Usage: nz_ddl_procedure [database [procedure_name/signature]]
Purpose: To dump out the SQL/DDL that was used to create a user defined procedure.
i.e. CREATE or replace PROCEDURE ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The procedure_name/signature is optional.
If not specified, then a CREATE PROCEDURE statement will be generated
for each procedure in the database.
Specify "-owner <name>" to limit the output to objects owned
by the specified user.
If you specify the procedure_signature, the DDL to create just that
single procedure will be produced. You must pass this script
the exact signature, wrapped in single quotes. For example:
$ nz_ddl_procedure SYSTEM 'TEST_PROCEDURE_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_procedure_signatures
If you specify a procedure_name, the DDL to create all procedures
matching that procedure name will be produced. For example:
$ nz_ddl_procedure SYSTEM TEST_PROCEDURE_1
Outputs: SQL DDL (the CREATE PROCEDURE statements) will be sent to standard out.
nz_ddl_scheduler_rule
Usage: nz_ddl_scheduler_rule [rule_name]
Purpose: To dump out the SQL/DDL that was used to create a scheduler rule
i.e. CREATE SCHEDULER RULE ...
Inputs: The rule_name is optional.
If not specified, then a CREATE SCHEDULER RULE statement will be
generated for each scheduler rule defined on your NPS server.
If you specify the rule_name, the DDL to create just that single scheduler
rule will be produced.
Outputs: SQL DDL (the CREATE SCHEDULER RULE statements) will be sent to standard out.
nz_ddl_schema
Usage: nz_ddl_schema [database [schema]]
Purpose: To dump out the SQL/DDL that was used to create a schema.
i.e. CREATE SCHEMA ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The schema name is optional.
If not specified, then a CREATE SCHEMA statement will be generated
for each schema in the database.
If you specify the schema name, the DDL to create just that single
schema will be produced.
Outputs: SQL DDL (the CREATE SCHEMA statements) will be sent to standard out.
nz_ddl_security
Usage: nz_ddl_security
Purpose: To dump out the SQL/DDL for creating various security objects.
i.e. CREATE SECURITY LEVEL ...
CREATE CATEGORY ...
CREATE COHORT ...
CREATE KEYSTORE ...
CREATE CRYPTO KEY ...
Inputs: None.
Outputs: SQL DDL (the relevant CREATE statements) will be sent to standard out.
nz_ddl_sequence
Usage: nz_ddl_sequence [database [sequence]]
Purpose: To dump out the SQL/DDL that was used to create a sequence.
i.e. CREATE SEQUENCE ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The sequence name is optional.
If not specified, then a CREATE SEQUENCE statement will be generated
for each sequence in the database.
Specify "-owner <name>" to limit the output to objects owned by
the specified user.
If you specify the sequence name, the DDL to create just that single
sequence will be produced.
Note:
The starting value for the sequence ("START WITH ...") will be
based upon the _VT_SEQUENCE.NEXT_CACHE_VAL value (which is not
necessarily the next sequence number, but rather the next
cache number/value that would be doled out). For more on this
topic, see "Caching Sequences" in the "Database User's Guide".
If you do not have access to that virtual table (and by default,
users do not) then the "START WITH ..." value will be based upon
whatever value was used when the sequence was originally created.
Outputs: SQL DDL (the CREATE SEQUENCE statements) will be sent to standard out.
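For illustration only (the sequence name and values are hypothetical, and the exact
options echoed back depend on how the sequence was defined), the generated DDL takes
this general form:
-- (illustrative names/values only)
CREATE SEQUENCE ORDER_ID_SEQ AS BIGINT START WITH 100001 INCREMENT BY 1 NO CYCLE;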
nz_ddl_synonym
Usage: nz_ddl_synonym [database [synonym]]
Purpose: To dump out the SQL/DDL that was used to create a synonym.
i.e. CREATE SYNONYM ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The synonym name is optional.
If not specified, then a CREATE SYNONYM statement will be generated
for each synonym in the database.
Specify "-owner <name>" to limit the output to objects owned
by the specified user.
If you specify the synonym name, the DDL to create just that single
synonym will be produced.
Outputs: SQL DDL (the CREATE SYNONYM statements) will be sent to standard out.
nz_ddl_sysdef
Usage: nz_ddl_sysdef
Purpose: To dump out the SQL/DDL for setting the system's default values.
i.e. SET SYSTEM DEFAULT ...
SET AUTHENTICATION ...
Inputs: None.
Outputs: SQL DDL (the SET statements) will be sent to standard out -- but only for
parameters where the default value was actually adjusted.
System default parameters can be set for the following
QUERYTIMEOUT
SESSIONTIMEOUT
ROWSETLIMIT
DEFPRIORITY
MAXPRIORITY
MATERIALIZE REFRESH THRESHOLD
and
AUTHENTICATION
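For illustration only (the values shown are hypothetical), the generated output
takes this general form -- one SET statement per parameter that was adjusted:
-- (illustrative values only)
SET SYSTEM DEFAULT QUERYTIMEOUT TO 120;
SET SYSTEM DEFAULT SESSIONTIMEOUT TO 1440;
SET SYSTEM DEFAULT ROWSETLIMIT TO 10000;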
nz_ddl_table
Usage: nz_ddl_table [-nocalc] [-strict] [-constraints] [database [<tablename> [-rename <new_tablename>]]]
Purpose: To dump out the SQL/DDL that was used to create a table.
i.e. CREATE TABLE ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is optional.
If not specified, then a CREATE TABLE statement will be generated
for each table in the database.
Specify "-owner <name>" to limit the output to objects owned
by the specified user.
If you specify the table name, the DDL to create just that single
table will be produced. You also have the option to include
the switch
-rename <new_tablename>
in which case the <new_tablename> will be substituted into
the DDL that is generated (see the example at the end of this entry).
-nocalc
Optional switch. If NOT specified, then this script will (by default) take
the extra time to calculate each table's uncompressed rowsize (based on its
DDL), which will be included in the output. i.e.,
Number of columns 100
(Constant) Data Size 620
Row Overhead 24
====================== =============
Total Row Size (bytes) 644
-strict
Optional switch. By definition, every column that is part of a primary key
or unique constraint should also be declared NOT NULL. If you include this
switch, the DDL that is produced by this script will -strict'ly adhere to
that standard -- even if you did not originally create the table with those
columns being defined as NOT NULL.
-constraints
Optional switch. This script automatically dumps out all table constraints
(unique, primary key, and foreign key definitions). All constraints have a
"CONSTRAINT name" associated with them, which you can either specify or allow
the system to assign a default name. By default, this script does not bother
to output the default constraint names as that would be superfluous. If you
want the default constraint names included in the output then include this
switch. For example:
,UNIQUE (CUSTOMER_ID)
versus
,Constraint CUSTOMER_TABLE_CUSTOMER_ID_UK UNIQUE (CUSTOMER_ID)
-comments
Optional switch. If any of the table columns have comments associated with
them, then they will be included in the DDL for the table. For example:
NAME varchar(30) /* Customer Name -- Last, First MI. */,
PHONE numeric(7,0),
SDATE date /* Ship date */,
-num
Optional switch. Number each column for easier reference (especially
helpful if your table has 1,600 columns). For example:
PROCESS_CODE_7 integer /* 137 */,
TASKING_ID_7 integer /* 138 */,
PROCESS_CODE_8 integer /* 139 */,
Outputs: SQL DDL (the CREATE TABLE statements) will be sent to standard out.
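As an example of the "-rename" switch (the database and table names here are
hypothetical), the following dumps the DDL for the CUSTOMER table, but with the
new table name substituted into the CREATE TABLE statement:
$ nz_ddl_table SALES CUSTOMER -rename CUSTOMER_BACKUP    # hypothetical names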
nz_ddl_table_redesign
Usage: nz_ddl_table_redesign <database> [table] [optional args]
Purpose: Provide alternative DDL for a table that optimizes each column's datatype.
For example: Let's say you have a column in your original table that is
defined as
AGE bigint,
That is going to require 8 bytes of storage. But do you really need 8 bytes?
Wouldn't a BYTEINT have worked just as well (allowing an "age" of 127 years)?
1 byte versus 8 bytes! Now, with automatic data compression, this isn't going
to be as much of a factor -- while the data is at rest on disk. But assuming
you ever want to do something with that data (that particular column) it has
to be read + decompressed -- in which case it would once again be 8 bytes,
requiring more memory, more CPU cycles (to process 8 bytes instead of 1), and more
network overhead (to move 8 bytes across the fabric instead of 1). Now, we're
only talking 7 bytes here. But if your table has 100B rows ... that would then
be 700 GB of "waste" for this one column alone.
The 'nz_ddl_table' script provides the actual/original DDL for a table. This
script provides DDL that is fine-tuned (based on each individual column's datatype,
based on the actual data contained within each column).
These are just redesign recommendations provided by a script. The final choice is
up to you. You need to decide what changes are appropriate for your environment.
Regarding INTEGER/NUMERIC data: When you initially create + load your table, if
you don't have a good idea as to what datatype to start out with (for a particular
column) then try VARCHAR(40). This script will then determine the best datatype/size
for the column (be it INTEGER or NUMERIC, or that it should be kept as a VARCHAR).
Do not use a floating point column for this purpose -- as that has limited precision
(at most 14 digits) and may be lossy (depending on the data you load into the table).
How does the script operate? By doing a full table scan and extensively analyzing
every value in every column. This script is very CPU intensive and can take
a while to run.
What sort of tests, and recommendations, will be performed?
o If the table is empty, it will not be processed. The script cannot make
any recommendations if it has no data to analyze.
o If a column contains NO null values, the DDL will include the "NOT NULL"
specification.
o If a column contains ONLY null values, that will be flagged (why are you
using an apparently unused column?).
o If a column contains ONLY one value ... and that value is a
0 in the case of an INTEGER/NUMERIC/FLOAT datatype
0 length string in the case of a text datatype
00:00:00 time in the case of an INTERVAL/TIME/TIMETZ datatype
that will be flagged (why are you using an apparently unused column?).
o INTEGER columns will be reduced to the smallest possible INTEGER datatype.
BIGINT ==> INTEGER ==> SMALLINT ==> BYTEINT
int8 int4 int2 int1
-integer <any|skip>
The default is "-integer any" ... wherein this script decides what to do.
If you don't want to have INTEGER columns processed at all ... to preserve
their original datatype definition ... then specify "-integer skip"
o NUMERIC columns will be reduced to the smallest possible (PRECISION,SCALE).
For example, a NUMERIC(38,12) might be turned into a NUMERIC(7,2).
-numeric <any|numeric|skip>
The default is "-numeric any" ... wherein this script will decide whether
to use NUMERIC or INTEGER.
If the resultant SCALE=0, then an INTEGER datatype will be chosen instead
(assuming the value can be stored within an INT datatype). If you wish to
keep the NUMERIC as a NUMERIC, then specify "-numeric numeric"
If you don't want to have NUMERIC columns processed at all ... to preserve
their original (PRECISION,SCALE) ... then specify "-numeric skip"
o FLOATING POINT columns (REAL/float4 and DOUBLE PRECISION/float8) will be
reduced to the smallest possible NUMERIC(PRECISION,SCALE).
-float <any|numeric|skip>
The default is "-float any" ... wherein this script will decide whether to
use NUMERIC or INTEGER.
If the resultant SCALE=0, then an INTEGER datatype will be chosen instead.
If you wish to always have a NUMERIC datatype used, then specify
"-float numeric"
Not all floating point values can be adequately represented as a NUMERIC
datatype. For example, your floating point number might be
-3.1415926535898e+173
If scientific notation is used to store/represent the floating point number,
then this script will leave the column defined as a floating point number.
It is possible that the suggested NUMERIC datatype will actually be larger
(byte wise) than the original floating point datatype. For example, let's
say you have a FLOAT4 column that contains the following values:
123456
0.654321
In order for this data to be stored in a NUMERIC datatype it must be defined
as a NUMERIC(12,6) ... which uses 8 bytes of storage, rather than the 4 bytes
of storage associated with a FLOAT4.
If you don't want to have the floating point columns processed at all ... then
specify "-float skip"
o TIMESTAMP columns include both a DATE + TIME. If the time value is always
'00:00:00' (for all values in this column) then it will be suggested that
this column can be redesigned as a DATE datatype.
o TEXT columns are
CHARACTER, CHARACTER VARYING, NATIONAL CHARACTER, NATIONAL CHARACTER VARYING
char , varchar , nchar , nvarchar
-trim <trim|rtrim|none|skip>
The MAX(LENGTH()) of each column will be computed (and adjusted, as appropriate).
This is done by trimming all leading/trailing spaces from each string, before
computing its length. Sometimes, leading and/or trailing spaces might be
considered significant. You can control how the spaces are treated (and thus
how the maximum length of the string is determined).
-trim trim # The default. TRIM() both leading and trailing spaces.
-trim rtrim # Perform an RTRIM() to trim only trailing spaces on the right.
-trim none # Leave the string alone ... do NOT trim any spaces from it.
-trim skip # Skip this check entirely and maintain the original column width.
Note that CHAR/NCHAR columns (by definition) are always space padded to
the defined width of the column. If "-trim none" is chosen, then the
defined widths of these column types will never be adjusted downwards.
VARCHAR columns have 2 bytes of overhead, which are used to specify the
length of the text string. Because of this, fewer bytes will actually
be used if certain columns are redefined as a CHAR datatype instead
(with a fixed length). So
VARCHAR(2) will be redefined as CHAR(2)
VARCHAR(1) will be redefined as CHAR(1)
Of course, a VARCHAR datatype is not quite the same thing as a CHAR
datatype. Similar, but different. It is up to you to make the
final determination as to whether, or not, this change is appropriate
for your needs.
-text <any|numeric|utf8|skip>
The default is "-text any" ... wherein this script will perform both of the
following tests.
Does the column contain only numeric strings ... which would allow it to be
redefined as a NUMERIC (or INTEGER) datatype? If so, it will be. If
you want only this test performed (and not the next one) specify
"-text numeric"
NCHAR/NVARCHAR columns are typically used to store UTF8 data (which uses from
1 to 4 bytes of storage, per character). If the data you are storing in
the column is simply LATIN9 data (which uses only 1 byte of storage per
character), then the column will be redesigned as a CHAR/VARCHAR column
instead. If you want only this test performed (and not the above one)
specify "-text utf8"
If you want to skip both of these tests, specify "-text skip"
o BINARY columns (BINARY VARYING and ST_GEOMETRY) will have their MAX(LENGTH())
computed (and adjusted, as appropriate).
-binary <any|skip>
The default is "-binary any" ... wherein this script decides what to do.
If you don't want to have BINARY columns processed at all ... to preserve
their original (defined) column width ... then specify "-binary skip".
Inputs: The database name is required.
The table name is optional. If specified, just that one table will be processed.
Otherwise, every table in this database will be processed.
-v|-verbose
Include in the output the SQL query that gets generated to perform all of these
analyses (in case you like looking at SQL).
-sample <nn>
By default, the script samples all of the data in the table. This can take a long
time (but could result in a better analysis). Instead, you can have the script
sample a portion of the table (from 1 to 100 %) which could save a considerable
amount of runtime.
-insert
Along with the CREATE TABLE statement, include a corresponding INSERT statement.
i.e. INSERT INTO <table> SELECT col_1, ..., col_n FROM <database>..<table>;
Generally, the software will implicitly convert the data from the source column/
datatype into the corresponding target column/datatype. Explicit transformations
will only need to be added to the INSERT statement to process text strings (to
trim them, or to aid in their conversion to an integer/numeric datatype).
When this script is used to process multiple tables at once, the CREATE TABLE
statements will be listed first, and then the INSERT statements will follow.
-orderby
For the above INSERT statement, do you want it to include an optional ORDER BY
clause? If so, this script will choose the first column in the table that is of
type DATE or TIMESTAMP, and add it to the INSERT statement.
Sorted data increases the benefit of zonemap lookups and extent elimination.
However, the data must be sorted on the right column(s) for this to be of the
greatest benefit.
- you might want to choose a different date/timestamp column
- you might want to use a multi-column sort (ORDER BY) clause
- you might want to choose a column, or columns, of a different data type
Edit the INSERT statement to make any changes appropriate to your environment.
-round
When processing NUMERIC/FLOATING POINT/TEXT columns, this script may suggest an
alternative NUMERIC datatype. By default, the (PRECISION,SCALE) of that new
numeric will always use the smallest possible values that are appropriate for it.
Note that a numeric with a
Precision of 1..9 is always 4 bytes in size
Precision of 10..18 is always 8 bytes in size
Precision of 19..38 is always 16 bytes in size
Include the "-round" switch if you want numerics to always have their PRECISION
rounded up to the highest possible value for that data size -- either 9 / 18 / 38.
For example, a NUMERIC(9,2) would be suggested rather than a NUMERIC(4,2).
Storage wise, they are comparable. In this example, they are both 4 bytes in
size. However, there are other aspects to this.
If you multiply two NUMERIC(4,2) columns together, the default result would be a
NUMERIC(9,4) column -- which is still 4 bytes in size.
But if you instead multiplied two NUMERIC(9,2) columns, the default result
would be a NUMERIC(19,4) column -- which is 16 bytes in size.
So it makes a difference. But what difference will the difference make to you?
-integer <any|skip>
-numeric <any|numeric|skip>
-float <any|numeric|skip>
-text <any|numeric|utf8|skip>
-trim <trim|rtrim|none|skip>
-binary <any|skip>
These options were defined in detail under the Purpose section above.
-columns <n>
If your table contains more than 100 columns, it will be processed in groups
of 100 columns at a time (one 'full table scan' is invoked, per group). If
you wish, you can control how many columns get processed (per scan) by
specifying a number from 1..250.
Outputs: SQL DDL (the modified CREATE TABLE statements) will be sent to standard out.
It will include comments as to any + all DDL modifications that this script
decides to suggest. An example:
$ nz_ddl_table_redesign test_database test_table
\echo
\echo ***** Creating table: "TEST_TABLE"
CREATE TABLE TEST_TABLE
(
--REDESIGN
-- CUSTOMER_ID bigint ,
CUSTOMER_ID bigint not null ,
--REDESIGN
--This column contains only NULL values, and could possibly be eliminated entirely.
-- NICKNAME character varying(30) ,
NICKNAME character varying(1) ,
--REDESIGN
--This column does not appear to contain any meaningful data, and could possibly be eliminated entirely.
--(All values are the same value ... and are either a 0, a string of 0..more spaces, a time of 00:00:00, etc ...)
-- NUMBER_OF_ELEPHANTS_OWNED integer ,
NUMBER_OF_ELEPHANTS_OWNED byteint not null ,
--REDESIGN
-- AGE bigint ,
AGE byteint not null ,
--REDESIGN
-- SALARY numeric(38,16) ,
SALARY numeric(8,2) not null ,
--REDESIGN
-- PHONE_NUMBER double precision ,
PHONE_NUMBER integer not null ,
--REDESIGN
-- DOB timestamp ,
DOB date not null ,
--REDESIGN
-- STREET_ADDRESS national character varying(100) ,
STREET_ADDRESS character varying(68) not null ,
--REDESIGN
-- PIN character(10)
PIN smallint not null
)
DISTRIBUTE ON (CUSTOMER_ID)
;
nz_ddl_user
Usage: nz_ddl_user [username]
Purpose: To dump out the SQL/DDL that was used to create a user.
i.e. CREATE USER ...
Inputs: The user name is optional.
If not specified, then a CREATE USER statement will be generated for
each user defined on your NPS server.
If you specify the user name, the DDL to create just that single user
will be produced.
Notes:
When dumping out the CREATE USER statements, each user's default password
is initially set to 'password'. For this script to be able to UPDATE the
password (with the actual encrypted password) that will require SELECT
access to the system table _T_USER. Otherwise, this script will not
generate the additional SQL statements to update the password. The SQL
that is generated will have UPDATE statements for two system tables, both
_T_USER and _T_USER_OPTIONS, although the latter table went away after
version 4.6. That UPDATE statement is included simply for backwards
compatibility of the DDL. If the DDL is replayed on a newer version of
the NPS software, that UPDATE statement will throw an error, which is
expected and can be ignored.
If "SECURITY LABEL" or "AUDIT CATEGORY" are defined for users, then the
person running this script must have been granted the admin privilege
"MANAGE SECURITY". Otherwise, that information is not accessible and will
not be displayed by this script.
If "COLLECT HISTORY" or "CONCURRENT SESSIONS" have been defined for users,
then the person running this script must have been granted select access to
the system table _T_USER. Otherwise, that information is not accessible and
will not be displayed by this script.
If "ACCESS TIME" has been defined for users, then the person running this
script must have been granted select access to the system table _T_ACCESS_TIME.
Otherwise, that information is not accessible and will not be displayed by
this script.
Outputs: SQL DDL (the CREATE USER statements) will be sent to standard out, as
well as "ALTER GROUP" statements for adding the user(s) to whatever
access and/or resource groups they belong to.
nz_ddl_view
Usage: nz_ddl_view [-format|-noformat] [database [viewname]]
Purpose: To dump out the SQL/DDL that was used to create a view.
i.e. CREATE or replace VIEW viewname AS SELECT ...
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The view name is optional.
If not specified, then a CREATE VIEW statement will be generated
for each view in the database.
Specify "-owner <name>" to limit the output to objects owned
by the specified user.
If you specify the view name, the DDL to create just that single
view will be produced.
Views consist of a "SELECT" query that is stored as one long line of
SQL text -- which can be hard to read. This script will reformat
the SQL to try and make it more readable (via the nz_format script).
-format Format the SQL text. If you are dumping the DDL for
a single/specified view then this is the default.
-noformat Do NOT format the SQL text. If you are dumping the DDL
for all views in a database, then this is the default.
Outputs: SQL DDL (the CREATE VIEW statements) will be sent to standard out.
Note: This represents the view definition as it has been rewritten by
NPS. It is not the "original+unmodified" SQL that was initially passed
to the CREATE VIEW statement.
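As an example (the database and view names are hypothetical), dumping a single view
produces formatted output of this general form:
$ nz_ddl_view SALES V_ACTIVE_CUSTOMERS
-- (illustrative names only)
CREATE OR REPLACE VIEW V_ACTIVE_CUSTOMERS AS
SELECT CUSTOMER_ID,
       CUSTOMER_NAME
  FROM CUSTOMER
 WHERE STATUS = 'ACTIVE';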
* * * * Statistics / Table Information * * * *
nz_check_statistics
Usage: nz_check_statistics [database]
Purpose: Checks that a table's statistics are appropriate (based on the table's DDL)
This script performs no tests against the data itself ... it just concerns
itself with the statistics and the DDL.
For INTEGER, NUMERIC, and TEMPORAL data types the script verifies that the
statistic loval and hival are of the proper data type, within bounds, and
that the loval <= the hival.
For TEXT data types the script verifies that the LENGTH() of the statistic
is <= the defined width of the column.
Inputs: The database name is optional.
If specified, just that database will be checked.
If not specified, all databases will be checked.
Outputs: Any issues should be displayed in the output, for your further analysis and
attention.
An example of an error:
Checking statistics for datatype: NUMERIC(p,s)
Database Name | Table Name | Column Name | Data Type | Statistic LOW value | Statistic HIGH value
---------------+------------+-------------+--------------+---------------------+----------------------
DEV | T1 | COL5 | NUMERIC(8,2) | 1.23 | 123.456789
(1 row)
nz_genstats
Usage: nz_genstats [-info] [-full|-express|-basic|-default] <database> [table ...]
Purpose: Generate statistics on those tables that don't have up-to-date statistics.
The optimizer uses statistics to guide its decisions on how best to
execute a query. The more reliable and up-to-date the statistics are,
the more accurate the optimizer's decisions are likely to be.
You can easily "GENERATE STATISTICS;" for an entire database;
Or you could issue a "GENERATE STATISTICS ON <table>;"
or a "GENERATE EXPRESS STATISTICS ON <table>;".
In this case, you must issue the sql statement on a table-by-table basis.
But what if a given table already has up-to-date statistics? Then
regenerating statistics for it would (typically) be a waste of your
time and the system's resources.
This script first checks the state of the statistics for each table,
and only then regenerates statistics on those tables that have outdated
statistics. (Statistics are either 100% up-to-date+accurate+reliable,
or they aren't. There is no in-between measure of how current the
statistics are.)
Note: If you wish to get/display the current statistic values for any
given table you can use the script "nz_get"
Inputs: The database name is required.
The table name is optional. If not specified then all tables in the
database / schema will be checked, and the statistics refreshed as
appropriate.
Or, you can specify one (or many) table names to be checked.
If you have a file that contains a list of tablenames to be checked,
it can be used via the following syntax:
nz_genstats DBNAME `cat /tmp/the_list_of_tables`
-info An optional switch
If included, then statistics won't actually be generated on any table.
Instead, this script will simply report information about what tables
have up-to-date statistics and what tables need to have statistics
generated against them.
[-full|-express|-basic|-default] An optional switch
You can control whether FULL, EXPRESS, or BASIC stats are generated for
the tables. (For an explanation of the difference between the three
choices, see "nz_get -help" ). If no options are specified, then this
script will adhere to whatever type of stats each individual table currently
has associated with it.
If you specify "-default" then this script will simply issue the
GENERATE STATISTICS command on the given table, and let the system decide
what type of statistics are to be generated for it (full statistics will be
generated on smaller dimension tables, and express statistics will be
generated on everything else).
Full statistics take the longest to generate -- but provide more accurate
dispersion information (the # of unique values) for each column.
Express statistics use a faster (but less accurate) mathematical formula
to determine the dispersion.
Basic statistics can be generated the fastest -- because they skip the
dispersion calculation entirely. But that leaves the optimizer with
one less piece of information about the table's columns.
Small Dimension Tables -- Full statistics are recommended, as the
accuracy of the dispersion value can have more of an impact on query
planning. And for small tables, the time difference between generating
full or express statistics is negligible.
Large Fact Tables -- Express statistics are typically used because of
the time savings in generating the statistics. And if statistics shows
that a particular column has 500M rather than 600M distinct values, that
won't make much of a difference in the optimizer's calculations (as
both numbers are already large).
[-min <nnn>] Optional switches (default is -min 0)
[-max <nnn>] (default is -max 99999999999999999)
These switches allow you to control which tables are to have a GENSTATS
run against them ... based on their table rowcount being between the -min
and -max values specified. The default is for all tables to be included.
Example: Specifying '-max 1000000' will result in only tables with
<= 1M rows having a GENSTATS performed against them (and then only if
necessary ... if the statistics aren't already up-to-date).
-force
If statistics are already up-to-date for a given table, this script won't
bother to re-generate the statistics ... as that would generally be a waste
of time (because the statistics won't change). But there may be times when
you want to force a genstats to run ... e.g., to rebuild the zonemap entries
for a table. In which case, include this switch to always have genstats be
run.
Outputs: Sample output. The first run included the "-info" switch. No changes
to the database were made. The script simply lists what it WOULD have
done if you let it.
The second run actually invoked GENSTATS on three of the tables. The
elapsed runtime for the individual operations is included.
$ nz_genstats -info TPCH
Database: TPCH
# Tables: 9
# Table Name Table Rowcount Informational Only
===== ======================================== ============== =================================
1 AN_EMPTY_TABLE 0 empty table
2 CUSTOMER 150,000 express statistics ok
3 LINEITEM 6,001,215 GENERATE /*basic*/ STATISTICS ...
4 NATION 25 GENERATE /*full*/ STATISTICS ...
5 ORDERS 1,500,000 express statistics ok
6 PART 200,000 express statistics ok
7 PARTSUPP 800,000 express statistics ok
8 REGION 5 full statistics ok
9 SUPPLIER 20,000 GENERATE EXPRESS STATISTICS ...
$ nz_genstats TPCH
Database: TPCH
# Tables: 9
# Table Name Table Rowcount Current Operation Seconds
===== ======================================== ============== ================================= =======
1 AN_EMPTY_TABLE 0 empty table 0
2 CUSTOMER 150,000 express statistics ok 0
3 LINEITEM 6,001,215 GENERATE /*basic*/ STATISTICS ... 21
4 NATION 25 GENERATE /*full*/ STATISTICS ... 1
5 ORDERS 1,500,000 express statistics ok 0
6 PART 200,000 express statistics ok 0
7 PARTSUPP 800,000 express statistics ok 0
8 REGION 5 full statistics ok 0
9 SUPPLIER 20,000 GENERATE EXPRESS STATISTICS ... 4
nz_get
Usage: nz_get [ database [ table ]]
Purpose: Get (and display) the statistics for a user table or a system table.
Statistics are used by the NPS optimizer in order to plan the best
way to run each SQL query (for queries that are run against user data,
and also for queries that are run against the system catalogs).
Some of these statistics are automatically maintained by the system
(i.e., 'always-there' statistics). Other values are dynamically
calculated+saved when a "GENERATE [EXPRESS] STATISTICS" command is
issued.
This script will dump out all of the known statistics for a table.
Inputs: The database and table names are optional.
If the database name is not specified, then this script will process
all databases / schemas / tables.
If a database name is specified, the script will process all tables
in the specified database (within the specified schema).
If a table name is specified, the script will process just that table.
In lieu of a table name, you may instead specify a synonym name or
materialized view name -- in which case the statistics for the
underlying table will be displayed.
The table name may also be any one of the "SYSTEM TABLE" or
"MANAGEMENT TABLE" names.
o For a list of these names, see 'nz_get_sysmgmt_table_names'
o By default, normal (non-ADMIN) users do not have access
to any of these SYSTEM/MANAGEMENT tables. They are only
allowed to access them indirectly (thru SYSTEM/MANAGEMENT
views).
o Some of these tables are global in nature, so it does not
matter which database name you specify on the command line.
The results will always be the same. e.g. _VT_SPU_ZMAP_INFO
o Some of these tables are local in nature (database specific)
so the database you specify is relevant. e.g. _T_ATTRIBUTE
o If you wanted to look at the statistics for all of the
SYSTEM/MANAGEMENT tables, you could issue a command such as this:
for TABLE in `nz_get_sysmgmt_table_names system`; do nz_get system $TABLE; done
Outputs: The output of this script will include the following information
Database: name, objid
Table: name, objid, distribution clause, row count(statistic)
Per-Column: attnum the logical column number (from 1-1600)
Column Name
Statistics Status the validity/freshness of the statistics
Minimum Value
Maximum Value
# of Unique Values also known as the dispersion value
# of NULLs
MaxLen The MaxLen+AvgLen are only collected for
AvgLen columns of type VARCHAR, NCHAR, NVARCHAR
Regarding the "Statistic Status" ... the following indicates that statistics
have been explicitly generated against this table/column (via a GENSTATs
command) and that they are 100% up-to-date. The ONLY difference between the
three is in how the "# of Unique Values" (the dispersion) is generated.
The dispersion value is the most complex statistic to gather, in terms of time +
memory + cpu usage.
Full Similar to doing a "COUNT(DISTINCT column_name)". It generates the
most accurate dispersion value, but has the most overhead associated
with it.
Express It uses math to estimate the dispersion value (a hash is generated
for each column value, and then the number of unique hash keys is
added up). Much faster, but less precise.
Basic The dispersion calculation is skipped entirely. All of the other
"basic" statistics are still collected.
So, statistics are either 100% correct and up-to-date, or they're not. Any change
to a table ... even if it involves just 1 row being inserted/updated/deleted ...
results in the statistics no longer being 100% up-to-date. In other words, they
would be outdated. This doesn't mean that they would be bad or wrong. Just outdated.
When rows are nzload'ed/INSERT'ed into a table, NPS automatically compares
each column value against the MIN/MAX statistic for the column, and updates the
table's statistics accordingly. This is also referred to as "always there
statistics". This applies to all columns/all datatypes -- EXCEPT for text
columns (CHAR, NCHAR, VARCHAR, NVARCHAR). So that means that the MIN/MAX values
for the column are still OK -- they are still up-to-date (at the same time, the
table's rowcount statistic is also updated). But the other statistics values
(the dispersion and # of nulls) are not up-to-date as they can only be recomputed
via an explicit GENSTATS. In this case, the "Statistics Status" will show
Full Min/Max OK
Express Min/Max OK
Basic Min/Max OK
As mentioned, this doesn't apply to text columns. So all of the statistics that
are maintained for text columns would be outdated ... and can only be refreshed
via an explicit GENSTATS. In this case, the "Statistics Status" will show
Full Outdated
Express Outdated
Basic Outdated
If you never ever bothered to do a GENSTATS on a particular table, then the only
per-column statistics that would be available are the "always there statistics"
that NPS automatically collects. So this means you will have MIN/MAX statistics
(and only MIN/MAX statistics) for those columns, which will be displayed as
Min/Max Only
As mentioned, "always there statistics" aren't collected for text columns. So
if you've never done an explicit GENSTATS ... and if NPS has never automatically
collected any statistics ... then there will be no statistics at all available
to the optimizer. In this case the "Statistics Status" will show
Unavailable
Other situations in which "Unavailable" applies
o if the table is empty
o if you've loaded a compressed external file into an empty table
We don't attempt to collect statistics on very wide text columns, which are
CHAR/VARCHAR columns with a defined length >= 24565
NCHAR/NVARCHAR columns with a defined length >= 6142
The "Statistics Status" for those columns will be displayed as
not maintained
Regarding "always there statistics" ...
when rows are nzload'ed/INSERT'ed into a table we can easily compare the column
values against the MIN+MAX statistics, and update them accordingly (if needed)
when rows are DELETE'd from a table, even though you might be deleting a row
that matches a particular MIN value or a particular MAX value, there is no
way to answer the following
a) is this the only row in the table with that particular value ?
b) and if so, what is the next MIN value or the next MAX value ?
The only way to determine those answers is via a GENSTATS (which processes
every row in the table). Thus, a DELETE operation never shrinks the
MIN/MAX values. But we still know that we can use+trust those statistics ...
that there is no value in the table column that is outside the current
MIN/MAX range.
when rows are UPDATE'd, we actually do an INSERT+DELETE. See above.
Regarding GENERATE [EXPRESS] STATISTICS ...
as of 4.6, there is really only one GENERATE STATISTICS command now. Both
GENERATE STATISTICS and GENERATE EXPRESS STATISTICS do the same thing.
the only difference between the two was in how they calculate the
"# of Unique Values" ... the dispersion value ... for a column. That
still occurs, but now NPS decides for itself which method to use.
If the table's rowcount is <= 10 * SPU_COUNT
Then FULL stats will be collected
where we actually try to do a COUNT(DISTINCT on_each_column)
# Of Columns Processed Per Scan: set STATS_COL_GRP_LIMIT = 10;
Else EXPRESS stats will be collected
where we hash the column values to approximate the dispersion value
# Of Columns Processed Per Scan: set SAMPLED_STATS_COL_GRP_LIMIT = 50;
In much older NPS releases, a GENERATE STATISTICS command would have generated
BASIC stats for any large table > 500M rows (JIT_DISP_MIN_ROWS). This is no
longer the case.
See Also: nz_genstats
nz_table_analyze
Usage: nz_table_analyze [ database [ table [ column ... ]]]
Purpose: Analyze a table column to determine the dispersion/uniqueness of its data.
When you GENERATE STATISTICS for a table, the system calculates and stores
the dispersion (the # of unique values) for each column.
But what about that dispersion?
What are the values?
Are they unique?
Are they evenly distributed amongst all of the records?
Is one value more predominant than the others?
Knowing this information may assist you in choosing distribution keys,
resolving skew, etc...
This script will analyze each column in a table and report upon it.
A separate/full table scan is performed, once per column analyzed.
Inputs: The database, table, and column name(s) are all optional.
If the database name is not specified, then this script will process
all databases / schemas / tables / columns.
If a database name is specified, the script will process all tables /
columns in the specified database (within the specified schema).
If a table name is specified, the script will process just that table.
When analyzing a specific database table, you may want to analyze
some -- but not all -- of the columns. You may specify the list of
column names to be processed.
Outputs: The output of this script will be of the following nature for each
column reported upon. Output is limited to the 25 highest occurring
values for that column.
Table Name: LINEITEM
Column Name: L_LINENUMBER
Column Type: INTEGER
STATISTICS
-------------------------------------
# of Records : 59,986,052
# of Unique Values : 7
# of NULLs :
-------------------:
Minimum Value : 1
Maximum Value : 7
% of Total | # of Instances | Value
------------------+----------------+-------
25.005813018 | 15000000 | 1
21.431778841 | 12856078 | 2
17.855739198 | 10710953 | 3
14.286107711 | 8569672 | 4
10.711780132 | 6425574 | 5
7.140996710 | 4283602 | 6
3.567784391 | 2140173 | 7
nz_update_statistic_date_high_value
Usage: nz_update_statistic_date_high_value <database> <table> <column> <increment>
Purpose: Update a table's statistics -- the MAX date value (of a DATE column).
This script can be used to update the MAX statistical value for the
specified table column (which must be of type DATE) that is stored
in the system catalog.
Inputs: The <database>, <table>, and <column> names are required.
The <increment> is required. It is a value representing the number
of days (beyond today) to set the MAX statistical value to.
Outputs: ** CAUTION **
This script directly updates one of the system tables.
It should be used with care, and only if you know what you are doing.
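As a usage illustration (the object names are hypothetical), the following would set
the MAX statistic for the SHIP_DATE column to a date 7 days beyond today:
$ nz_update_statistic_date_high_value SALES ORDERS SHIP_DATE 7    # hypothetical names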
nz_update_statistic_min_or_max
Usage: nz_update_statistic_min_or_max <database> <table> <column> <value> <flag>
Purpose: Update a table's statistics -- the MIN/MAX column value.
This script can be used to update the MIN or MAX statistical value (for the
specified table column) that is stored in the system catalog.
Inputs: <database> required
<table> required
<column> required
<value> required (the value to be stored in the system catalog)
<flag> required (specify stahival to update the MAX value)
(specify staloval to update the MIN value)
Outputs: ** CAUTION **
This script directly updates one of the system tables.
It should be used with care, and only if you know what you are doing.
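As a usage illustration (the object names and value are hypothetical), the following
would set the MAX (stahival) statistic for the ORDER_TOTAL column:
$ nz_update_statistic_min_or_max SALES ORDERS ORDER_TOTAL 99999.99 stahival    # hypothetical names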
nz_update_statistic_null_values
Usage: nz_update_statistic_null_values <database> <table> <column> <value>
Purpose: Update a table's statistics -- # of NULL values in a column.
This script can be used to update the 'stanullfrac' statistical value (for
the specified table column) that is stored in the system catalog.
The 'stanullfrac' (null fraction) is the number of null values found
within this column in the table.
Inputs: <database> required
<table> required
<column> required
<value> required (this value represents the number of null values
to set the statistic to)
Outputs: ** CAUTION **
This script directly updates one of the system tables.
It should be used with care, and only if you know what you are doing.
nz_update_statistic_table_rowcount
Usage: nz_update_statistic_table_rowcount <database> <table> <value>
Purpose: Update a table's statistics -- # of rows in the table.
This script can be used to update the rowcount statistical value (for
the table) that is stored in the system catalog.
Inputs: <database> required
<table> required
<value> required (represents the 'number of rows' you want
to set the statistic value to)
Outputs: ** CAUTION **
This script directly updates one of the system tables.
It should be used with care, and only if you know what you are doing.
nz_update_statistic_unique_values
Usage: nz_update_statistic_unique_values <database> <table> <column> <value>
Purpose: Update a table's statistics -- # of unique values (dispersion) in a column.
This script can be used to update the 'attdispersion' statistical value (for
the specified table column) that is stored in the system catalog.
The 'attdispersion' (attribute dispersion, or frequency) is the number of
unique values found within this column in the table.
Inputs: <database> required
<table> required
<column> required
<value> required (this value represents the number of
unique values to set the statistic to)
Use a <value> of 0 to indicate that there is NO dispersion information
for this column ... as if only BASIC statistics have been generated on it.
Outputs: ** CAUTION **
This script directly updates one of the system tables.
It should be used with care, and only if you know what you are doing.
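As a usage illustration (the object names are hypothetical), the following would record
that the STATE column contains roughly 50 unique values:
$ nz_update_statistic_unique_values SALES CUSTOMER STATE 50    # hypothetical names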
* * * * Hardware * * * *
nz_check_disks
Usage: nz_check_disks
Purpose: To show S.M.A.R.T. information concerning the status of each SPU's disk drive.
S.M.A.R.T. stands for Self-Monitoring, Analysis and Reporting Technology
Note: This script is only for use on NPS 4.x systems.
Inputs: None.
Outputs: Some summary information on the disk drives is reported.
For detailed disk drive analysis, one should instead run
/nz/support/bin/SmartCollect
nz_check_disk_scan_speeds
Usage: nz_check_disk_scan_speeds [ optional args ]
Purpose: Check the read/write speed of each disk or 'dataslice'.
This script will run a simple SQL query against a known table of known size
in order to check on the read (scan) performance of the disk drives. This
script will automatically create the table (if it has not been previously
created by this script). The table will be SYSTEM..NZ_CHECK_DISK_SCAN_SPEEDS
This script creates and uses a rather atypical table for this purpose. With
o very "wide" rows (only 2 rows fit on each 128KB page)
o very few columns (and very few zonemappable columns)
o a very small compression ratio (so the script should work on basically
all systems ... regardless of whether or not compression is enabled)
The table it creates is defined as follows:
THE_DATASLICE smallint -- The values will range from 0..m
-- (actual DATASLICEID values range from 1..n)
THE_EXTENT smallint -- A 1-up number for each extent
THE_PAGE smallint -- A number from 1..24
THE_ROW smallint -- A number from 1..2
FILLER1 character varying(32715)
FILLER2 character varying(32715)
/* Total Row Size (bytes) 40 - 65472 */
Inputs: All arguments are optional.
-iterations <nn> The number of times to scan the test table (i.e., to run a
COUNT(*) against it). The default is to run 5 passes.
-write While this script is typically used to check read speeds, it
can also be used to check write speeds. Adding this switch will
o automatically drop the table (if it exists)
o create it anew so that the write speeds can be reported upon
-cleanup This script creates a test table in the SYSTEM database (named
NZ_CHECK_DISK_SCAN_SPEEDS). By default, the script will leave
the test table there for future reuse -- so that the table
doesn't have to be (re)created anew each time. Add the
"-cleanup" option to have this script automatically do a
'drop table' when it is done running.
-size [0|1|2|4|8] This script creates + works with a table of a known size.
On the Mustang hardware, the default is 0.5 GB of storage (per
dataslice).
On the IBM Netezza 1000 hardware, the default is 4.0 GB of storage
(per dataslice). Smaller sizes should not be used ... as the
results could be affected by the disk cache.
You can use a smaller/larger sized table by specifying
-size 0 # 0.5 GB
-size 1 # 1.0 GB
-size 2 # 2.0 GB
-size 4 # 4.0 GB
-size 8 # 8.0 GB
This switch applies to
a) when the table is created ... how big should it be
b) when the timing test is run ... how much data should
it test against
For example: You can create a table with 8 GB of data
(per dataslice) ... and then run timing tests against 0.5 GB,
1 GB, 2 GB, 4 GB or 8 GB of data by including the appropriate
switch.
But if you only create the (default) 0.5 GB table, you can
only run timing tests against 0.5 GB of data. (Trying
otherwise will throw an error message).
-spa Test each SPA individually, one at a time
(testing all of the dataslices that belong to that SPA as a group)
-spa <nn> Test just the specified SPA
-spa <nn> -spu For the specified SPA, test each SPU individually,
one at a time (testing all of the dataslices that
belong to that SPU as a group)
-spa <nn> -spu <nn> Test just the specified SPU
-spa <nn> -enc For the specified SPA, test each disk enclosure
individually, one at a time (testing all of the
dataslices that belong to that disk enclosure as
a group)
-spa <nn> -enc <nn> Test just the specified disk enclosure
-dsid [ ... ] Specifying "-dsid" by itself will cause the script to test each
dataslice individually ... one at a time. A single iteration
(e.g., 1 scan per each dataslice) will be performed.
If you have a large multi-cabinet system with a large number
of dataslices, this WILL take a long time. (Though you can
^C this script at any time).
Optionally, you can include a list of dataslice IDs/numbers to
be checked. (Dataslices are numbered from 1..n). If you specify
a dataslice ID multiple times then it will be tested multiple
times. Example:
-dsid 1 10 300 301 302 500 500 500 500 500
As a convenience, all of the dataslices specified will also be
tested all together as a 'SET'.
-dsrange <start> <end>
You can also specify a numeric RANGE of dataslices to be
tested. Example
-dsrange 100 120
The "-dsid" and "-dsrange" options can be used together and
can be used multiple times.
-cpu The default test is I/O intensive. This option will make the
test be CPU intensive ... in that the CPU's (on the S-Blades)
will be pegged at near 100% utilization.
-test Combines multiple tests into one to make things simpler.
First, it tests the SPAs one-by-one (-spa). Then, for the
slowest SPA, it tests the individual SPUs (-spu) and
ENClosures (-enc) in that SPA.
Outputs: The time it took to perform each write/read operation.
Note that the results can be affected by many different factors, to include:
o The physical location of the test table (on each disk drive)
o The background disk rewriter
o The disk cache
o Disk regeneration
o Concurrent user activity
When running this script, it is usually helpful to run "nz_responders -sleep 1" in
another terminal window ... to look for any dataslices/disks that are consistently
and significantly slower than the rest of the pack. After that, you should run a
follow-up test of the questionable dataslices individually by running
nz_check_disk_scan_speeds -dsid [ ... ]
nz_mm
Usage: nz_mm [-blade <opt>]
Purpose: Display information about each MM (Management Module) and/or its blades.
Inputs: By default, the "health" of the Management Module will be reported upon.
-verbose Allows you to control whether you get verbose or brief
-brief output when displaying the system health status. The
default is verbose.
Or you can specify "-blade" with one of the following options, in which
case that information will be reported upon for each BLADE connected
to each MM.
bootseq -- View the blade boot sequence settings
config -- View general settings
health -- View system health status
info -- Display identity and config of target
iocomp -- View I/O compatibility for blades
led -- Display Leds
list -- Display installed targets
sol -- View SOL status and view SOL config
temps -- View temperatures
volts -- View voltages
A password is required to connect to the MM. If you set the environment
variable MM_PASSWORD it will be picked up. Otherwise, you will be prompted
to enter a password (it will not be echoed back).
Outputs: For every MM device that can be found and connected to, the requested
information will be reported. This example shows the MM's "health"
DEVICE: mm001
Hostname: mm001
Static IP address: 10.0.129.0
Burned-in MAC address: 00:14:5E:E2:1E:1A
DHCP: Disabled - Use static IP configuration.
Last login: Friday May 20 2011 13:55 from 10.0.128.1 (SSH)
system: Critical
mm[1] : OK
mm[2] : OK
blade[1] : OK
blade[3] : OK
blade[5] : OK
power[1] : OK
power[2] : OK
power[3] : OK
power[4] : OK
blower[1] : OK
blower[2] : OK
switch[1] : OK
switch[2] : OK
switch[3] : Critical
switch[4] : OK
fanpack[1] : OK
fanpack[2] : OK
fanpack[3] : OK
fanpack[4] : OK
Outputs: Following are some sample outputs for each of the various "-blade <opt>"
bootseq -T blade[1]
nw
nodev
nodev
nodev
config -T blade[1]
-name SN#YK11509CW25N
health -T blade[1]
system:blade[1] : Non-Critical
info -T blade[1]
Name: SN#YK11509CW25N
UUID: 8027 A9F5 E89F B601 BEC7 0021 5EED FC1A
Manufacturer: IBM (WIST)
Manufacturer ID: 20301
Product ID: 13
Mach type/model: 4 X86 CPU Blade Server/8853AC1
Mach serial number: 06P8566
Manuf date: 53/09
Hardware rev: 7
Part no.: 59Y5638
FRU no.: 59Y5665
FRU serial no.: YK11509CW25N
CLEI: Not Available
Unique ID 1: Not Available
Unique ID 2: Not Available
Unique ID 3: Not Available
Unique ID 4: Not Available
Unique ID 5: Not Available
Unique ID 6: Not Available
Unique ID 7: Not Available
Unique ID 8: Not Available
MAC Address 1: 00:06:72:00:01:01
MAC Address 2: 00:06:72:01:01:01
MAC Address 3: Not Available
MAC Address 4: Not Available
MAC Address 5: Not Available
MAC Address 6: Not Available
MAC Address 7: Not Available
MAC Address 8: Not Available
BIOS
Build ID: BCE143AUS
Rel date: 12/09/2009
Rev: 1.19
Diagnostics
Build ID: BCYT30AUS
Rel date: 05/18/2009
Rev: 1.08
Blade sys. mgmt. proc.
Build ID: BCBT60A
Rev: 1.20
Local Control
KVM: Yes
Media Tray: Yes
Power On Time: 28 days 15 hours 33 min 16 secs
Number of Boots: 26
Product Name: HS21 Blade Server, 2 sockets for Intel dual- or quad-core
iocomp -T blade[1]
Bay Power Fabric Type Fabric on Blade Compt
------- ------- ------------------------------- ---------------- -------
IOM 1 Off Unknown Device Ethernet Mismatch
IOM 2 On Ethernet Switch Module Ethernet OK
IOM 3 On Storage Switch Module SAS OK
IOM 4 On Storage Switch Module SAS OK
led -T blade[1]
SN#YK11509CW25N
Error: off
Information: off
KVM: off
MT: off
Location: off
list -T blade[1]
system:blade[1] SN#YK11509CW25N
sol -T blade[1]
-status enabled
SOL Session: Ready
SOL retry interval: 250 ms
SOL retry count: 3
SOL bytes sent: 1454949
SOL bytes received: 0
SOL destination IP address: 10.10.10.80
SOL destination MAC: 00:21:5E:ED:FC:18
SOL I/O module slot number: 1
SOL console user ID:
SOL console login from:
SOL console session started:
SOL console session stopped:
Blade power state: On
temps -T blade[1]
Hard Warning
Component Value Warning Shutdown Reset
------------------------------- ------- ------- -------- -------
BIE Temp 25.00 56.00 70.00 49.00
volts -T blade[1]
Source Value Critical
--------------- ------- ----------------
BIE 1.5V +1.45 (+1.27,+1.72)
BIE 12V +11.96 (+10.23,+13.82)
BIE 3.3V +3.24 (+2.80,+3.79)
BIE 5V +4.99 (+4.23,+5.74)
Planar 0.9V +0.88 (+0.40,+1.50)
Planar 12V +12.12 (+10.20,+13.80)
Planar 3.3V +3.30 (+2.78,+3.79)
Planar 5V +4.93 (+4.23,+5.74)
Planar VBAT +3.11 (+2.54,+3.44)
nz_ping
Usage: nz_ping
Purpose: Diagnostic Script: 'ping' all of the Mustang SPUs to verify their location.
Note: This script is only for use on NPS 4.x systems.
Inputs: None.
Outputs: A listing that shows each SPU that responded to the ping. Any SPU failing to
respond will be marked with a '.'
nz_ping
'ping'ing 10.0.*.15
10.0.1. 1 . 3 4 5 6 7 8 9 10 11 12 13 14
10.0.2. 1 2 3 4 5 6 7 8 9 10 11 12 13 14
SFIs: 2 SPUs: 27 **ERROR** Expecting 28 SPUs
nz_sense
Usage: nz_sense [ -f ]
Purpose: Provide environmental "sense" data for Mustang series SPA's.
This script uses the "clide" utility (its "sense" command) to report
environmental sense data for the SPA's. This includes voltage
information, temperatures, and fan RPMs.
This script reports the information laid out in a simple grid format.
Note: This script is only for use on NPS 4.x systems.
Inputs: Optional Switch
-f Use degF, rather than degC, when reporting temperatures
Outputs: A report such as the following is produced. On pre-Mustang systems,
only the SFI information is available.
SFI (degC) Fan (rpms) Power Supply LEFT (degC) Power Supply RIGHT (degC)
----------------------------------------- ------------------- --------------------------- ---------------------------
SPA 1.25V 2.5V 3.3V 5V 12V Temp FAN-L FAN-M FAN-R 3.3V 5V 12V Temp 3.3V 5V 12V Temp
=== ===== ===== ===== ===== ===== ====== ===== ===== ===== ===== ===== ===== ====== ===== ===== ===== ======
1 1.24 2.62 3.25 5.03 12.06 36.00 4724 4758 4792 3.35 5.05 12.00 32.00 3.35 5.05 12.00 33.00
2 1.24 2.62 3.25 5.05 12.06 35.00 4692 4861 4878 3.28 4.97 11.69 32.00 3.33 5.05 12.00 32.00
3 1.24 2.60 3.25 5.03 12.06 34.00 4809 4675 4861 3.33 5.03 12.00 34.00 3.35 5.05 12.06 33.00
4 1.24 2.60 3.25 5.03 12.06 35.00 4896 4950 4950 3.33 5.05 12.00 31.00 3.33 5.65 11.94 33.00
nz_show_topology
Usage: nz_show_topology [ -l ]
Purpose: Provide a report that describes the overall disk topology.
For each dataslice
o Identify its primary hardware id and location
o Identify its mirror hardware id and location
o Display the amount of storage being used
If any issues are found with the topology (such as an
unmirrored dataslice, or a non-optimal location of the
SPUs) they will be flagged.
Inputs: -l Optional switch to produce a longer listing (on NPS 4.x systems).
If this switch is specified, then
o suggestions will be made as to which SPUs need to be
physically moved to correct any topology issues.
o the list of available spare spus will be displayed
Outputs: A report, such as the following, will be produced
This is an example from an NPS 4.x system (which is SPU based)
Data Slice | Primary Slice | Mirror Slice | Regenning To | Topology Check | GB Used | % Full
-----------+---------------+--------------+--------------+----------------+---------+-------
1 | 1.1 (1004) | 2.1 (1003) | | ok | 36.35 | 83.05
2 | 1.2 (1013) | 2.2 (1001) | | ok | 36.42 | 83.19
3 | 1.3 (1020) | 2.3 (1017) | | ok | 36.24 | 82.80
4 | 1.4 (1026) | 2.4 (1015) | | ok | 36.55 | 83.50
5 | 1.5 (1032) | 2.5 (1034) | | ok | 36.15 | 82.59
6 | 1.6 (1002) | 2.6 (1025) | | ok | 36.28 | 82.87
This is an example from an NPS 5.x/6.x system (which is S-Blade based)
DSlice | Size(GB) | Used(GB) | % Used | Spa # | Spu # | Enc # | Disk ID
--------+----------+----------+--------+-------+-------+-------+---------
1 | 356.21 | 212.14 | 59.55 | 1 | 5 | 1 | 1012
2 | 356.21 | 228.45 | 64.13 | 1 | 5 | 2 | 1030
3 | 356.21 | 228.07 | 64.03 | 1 | 3 | 1 | 1022
4 | 356.21 | 227.98 | 64.00 | 1 | 3 | 2 | 1036
5 | 356.21 | 201.64 | 56.61 | 1 | 1 | 1 | 1013
6 | 356.21 | 227.86 | 63.97 | 1 | 1 | 2 | 1032
This is an example from an NPS 7.x system.
DSlice | Size(GB) | Used(GB) | % Used | Rack # | Spa # | Spa ID | Spu # | Spu ID | Enc # | Enc ID | Disk # | Disk ID | Supporting Disks
--------+----------+----------+--------+--------+-------+--------+-------+--------+-------+--------+--------+---------+------------------
1 | 356.000 | 121.869 | 34.23 | 1 | 1 | 1002 | 1 | 1188 | 3 | 1054 | 5 | 1170 | 1044,1170
2 | 356.000 | 114.835 | 32.26 | 1 | 1 | 1002 | 1 | 1188 | 2 | 1033 | 3 | 1044 | 1044,1170
3 | 356.000 | 114.709 | 32.22 | 1 | 1 | 1002 | 13 | 1190 | 3 | 1054 | 7 | 1069 | 1069,1091
4 | 356.000 | 121.916 | 34.25 | 1 | 1 | 1002 | 13 | 1190 | 4 | 1075 | 8 | 1091 | 1069,1091
5 | 356.000 | 122.862 | 34.51 | 1 | 1 | 1002 | 13 | 1190 | 2 | 1033 | 6 | 1047 | 1047,1088
6 | 356.000 | 114.653 | 32.21 | 1 | 1 | 1002 | 13 | 1190 | 4 | 1075 | 5 | 1088 | 1047,1088
nz_temperatures
Usage: nz_temperatures [ -hot | -cold ]
Purpose: Report temperature and voltage information for each of the Mustang SPAs.
Note: This script is only for use on NPS 4.x systems.
Inputs: Optional Switch
-hot Only report temperatures hotter than the average
-cold Only report temperatures colder than the average
Outputs: A report such as the following is produced.
Code Meaning
---- --------------------------
. Average Temperature
+n # of degrees above average
-n # of degrees below average
<sp> Not an 'Active' SPU
SPA Cabinet # 1 Temperature Min/Avg/Max: 72/ 81/ 99 F SFI Temp 1.25V 2.50V 3.30V 5.00V 12.0V
=== ======================================================= ======== ===== ===== ===== ===== =====
1 +7 . . +5 . . . . . . . . +7 +9 118.40 F .00 .00 +.05 .00 +.25
2 . . -6 -6 -8 -4 -4 -4 -4 -4 . . . +9 111.20 F .00 -.01 +.03 +.05 +.25
3 . . -6 -8 -4 -8 -4 -4 -4 . . . . 111.20 F -.01 +.03 .00 +.03 +.19
4 . -6 -6 -6 -4 -6 -4 -4 -4 . . . . 111.20 F -.01 .00 +.02 +.03 +.19
5 . . -4 -9 -4 -6 -4 -4 -4 . . . . +9 111.20 F -.01 .00 .00 +.10 +.38
6 . -4 -6 -6 -4 -8 -4 . -4 . . . . +7 113.00 F -.01 +.03 +.03 +.08 +.31
7 . . . . -4 . . . . . . . +5 114.80 F -.01 .00 .00 +.10 +.38
8 +7 +7 +7 +7 +9 +7 +9 +9 +12 +14 +14 +16 +16 118.40 F .00 .00 .00 +.16 +.31
* * * * ACL (Access Control Lists) / Permissions * * * *
nz_change_owner
Usage: nz_change_owner -from <username> -to <username> [-db <database>] [-dropuser]
Purpose: To globally change the owner of a set of objects.
Inputs: -from <username> All objects owned by this user will be affected. This
is a required field. You must specify a valid username.
This script will not allow you to globally change the
ownership of all objects owned by 'ADMIN'.
-to <username> The user whom you want to transfer ownership to. This is
a required field. You must specify a valid username.
[-db <database>] An optional database name. If specified, then only objects
within this database (to include the database itself) will
be looked at.
Otherwise all objects in all databases, as well as any
global objects (users, groups and scheduler rules) will be
looked at to see if they were created by the specified
"-from" user.
[-dropuser] An optional argument. If included, the output produced by
this script will have a
DROP USER <username>;
statement at the very end.
Outputs: A set of SQL/DDL statements of the form
ALTER ... OWNER TO ... ;
Note: This script doesn't actually make the changes. It simply outputs a
stream of SQL statements that can be used to do the work. All you
need to do is pipe that back into nzsql. E.g.,
nz_change_owner ... | nzsql
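For instance, a hypothetical sequence (the user and database names are illustrative)
that reviews the generated DDL before applying it:
$ nz_change_owner -from old_dba -to new_dba -db prod_db > change_owner.sql
$ cat change_owner.sql          # review the ALTER ... OWNER TO ...; statements first
$ nz_change_owner -from old_dba -to new_dba -db prod_db | nzsql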
nz_db_group_access_listing
Usage: nz_db_group_access_listing [ database ]
Purpose: To show what groups have been granted access to what databases.
A group might have access to a given database because
o It was explicitly granted access to that database
o It was explicitly granted access to the DATABASE class
Inputs: The database name is optional.
If NOT specified, then a more comprehensive report will be produced -- showing
all databases that all groups have been granted access to -- along with some
additional columns of information.
If a specific database name is entered, then a simple listing -- of only the
groups that have access to that database -- will be produced. This output
could then be easily reused in other scripts.
Outputs: When reporting on ALL databases, a report such as the following will be produced.
nz_db_group_access_listing
Database Name | Database Created On | Group Name | Group Created By | Group Created On
---------------+---------------------+-------------+------------------+---------------------
DEV_DB | 2007-07-02 15:35:06 | DBA_GROUP | ADMIN | 2007-07-02 15:36:23
PROD_DB | 2007-07-02 15:35:10 | DBA_GROUP | ADMIN | 2007-07-02 15:36:23
PROD_DB | 2007-07-02 15:35:10 | STAFF_GROUP | ADMIN | 2007-07-02 15:35:52
SYSTEM | 2007-07-02 15:22:01 | PUBLIC | ADMIN | 2007-07-02 15:22:01
TEST_DB | 2007-07-02 15:35:13 | DBA_GROUP | ADMIN | 2007-07-02 15:36:23
TEST_DB | 2007-07-02 15:35:13 | QA_GROUP | ADMIN | 2007-07-02 15:35:43
When reporting on a single database, just the list of groups -- that have
access to that database -- will be produced.
nz_db_group_access_listing prod_db
DBA_GROUP
STAFF_GROUP
nz_db_user_access_listing
Usage: nz_db_user_access_listing [ database ]
Purpose: To show what users have been granted access to what databases.
A user might have access to a given database because
o They were explicitly granted access to that database
o They were explicitly granted access to the DATABASE class
o One of the groups they belong to was granted access to that database
o One of the groups they belong to was granted access to the DATABASE class
o They are the owner of that database
Inputs: The database name is optional.
If NOT specified, then a more comprehensive report will be produced -- showing
all databases that all users have been granted access to -- along with some
additional columns of information.
If a specific database name is entered, then a simple listing -- of only the
users that have access to that database -- will be produced. This output
could then be easily reused in other scripts.
Outputs: When reporting on ALL databases, a report such as the following will be produced.
Note: This listing does not include the SYSTEM database as all users (by default)
automatically have access to it. It will also not include the ADMIN user
as that user (by design) automatically has access to all databases.
nz_db_user_access_listing
Database Name | Database Created On | User Name | User Created By | User Created On | User Valid Until
---------------+---------------------+------------+-----------------+---------------------+------------------
DEV_DB | 2007-07-02 13:43:00 | JOHN_DOE | ADMIN | 2007-07-02 13:45:34 |
PROD_DB | 2007-07-02 13:43:04 | JANE_SMITH | ADMIN | 2007-07-02 13:45:38 |
TEST_DB | 2007-07-02 13:43:07 | JANE_SMITH | ADMIN | 2007-07-02 13:45:38 |
TEST_DB | 2007-07-02 13:43:07 | JOHN_DOE | ADMIN | 2007-07-02 13:45:34 |
When reporting on a single database, just the list of usernames -- that
have access to that database -- will be produced.
nz_db_user_access_listing test_db
JANE_SMITH
JOHN_DOE
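Because this single-database form emits nothing but usernames, it is easy to feed into
other scripts. A hypothetical sketch (the database name is illustrative):
$ for U in $(nz_db_user_access_listing test_db); do nz_my_grants $U > grants_${U}.sql; done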
nz_find_acl_issues
Usage: nz_find_acl_issues
Purpose: Diagnostic Script: Used to identify any oddities in the ACL tables.
Permissions (GRANTs) are stored as integer bitmasks in three
different system tables
_t_usrobj_priv privileges granted to a user
_t_grpobj_priv privileges granted to a group
_t_acl merged access control list
This script simply confirms that there are no stray
(unexpected) bits set anywhere.
Inputs: None.
Outputs: A report such as the following will be produced
Checking table: _t_usrobj_priv
1. # of rows with odd OBJECT privileges = 0
2. # of rows with odd ADMIN privileges = 0
nz_find_object_owners
Usage: nz_find_object_owners [user]
Purpose: Find objects that are owned by users -- other than the 'admin' user.
Oftentimes, 'admin' is the owner of most objects in the
system. If another user is given the proper administrative
privileges, they can create objects on their own.
An owner of an object, by definition, has unrestricted access
to that object. And a DROP USER operation against a user will
fail as long as that user continues to own any objects.
Inputs: Optional user name -- if you want to see only those objects owned by
a specific user. If not provided, then information about all objects
owned by all users (other than 'admin') will be shown.
Outputs: For each object, list out
o the object name
o the object datatype
o the object owner
e.g.,
dba owns database development
dba owns user load_user
dba owns table load_test in database development
nz_fix_acl
Usage: nz_fix_acl [ -force ] [ -obsolete ]
Purpose: To fix discrepancies in the system ACL tables.
When you GRANT access to an object, an entry is made in either
the _T_USROBJ_PRIV table (user permissions), or
the _T_GRPOBJ_PRIV table (group permissions)
These tables are merged to produce the _T_ACL table (access control list).
This script will check each of those three tables, and correct any
discrepancies that it detects.
Inputs: -obsolete Checking for obsolete references (to objects or users or
databases that no longer exist) might take a while. By
default, this feature is turned off. If you wish to
enable this check, include the "-obsolete" switch.
-force By default, this script will simply produce a report
identifying the problems found. Include the "-force"
switch to allow this script to make the necessary corrections.
Outputs: A listing of the problems found -- or corrections made.
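For example, a typical (hypothetical) sequence is to run the report first and only
then allow the corrections to be made:
$ nz_fix_acl -obsolete           # report only ... list any discrepancies that are found
$ nz_fix_acl -obsolete -force    # re-run it with "-force" to have the corrections applied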
nz_get_acl
Usage: nz_get_acl [-user <username>|-group <groupname>] <object1> [object2]
Purpose: To list the access privileges associated with a particular object.
Note: The script "nz_my_access" can be used to list out ALL of the objects
that a given user has access to.
Inputs: -user <username> The user whose access privileges you are interested in knowing.
-group <groupname> Or, the group whose access privileges you are interested in knowing.
This is an optional field. Specify either a -user or a -group (or
neither). If not specified, then the access permissions that have
been granted to this object -- to any/all users/groups -- will be
listed.
<object1> If you are interested in a GLOBAL object (i.e., a database/user/group)
then specify it here.
Otherwise, specify the parent DATABASE that the object/relation resides in.
[object2] If the object you are interested in is anything other than a global object,
i.e., a
table, external table
view, materialized view
sequence, synonym
function, aggregate, procedure
system table, system view
management table, management view
then
<object1> refers to the database
[object2] refers to the object itself
For functions/aggregates/procedures, you must pass this script the
exact signature, wrapped in single quotes. For example:
$ nz_get_acl SYSTEM 'TEST_FUNCTION_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_function_signatures
nz_get_aggregate_signatures
nz_get_procedure_signatures
-quiet | -verbose Controls the level of commentary that is included in the output.
The default is -verbose. If you specify -quiet, then only the
relevant SQL statements themselves (e.g., the GRANTs) will be shown.
Outputs: The access permissions -- if any -- and how they were derived will be sent to
standard out.
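For example (the database, table, and user/group names here are purely illustrative):
$ nz_get_acl prod_db customer_dim                  # every GRANT issued against this table
$ nz_get_acl -user john_doe prod_db customer_dim   # just this user's privileges on the table
$ nz_get_acl -group dba_group prod_db              # the group's privileges on the database itself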
nz_get_admin
Usage: nz_get_admin [-user <user>] <database>
Purpose: List the administrative privileges that a user has been granted to a database.
Inputs: [-user <user>] The user whose admin privileges you are interested in
This is an optional field. If not specified, then the
admin privileges for any/all users/groups will be listed.
<database> The name of the database you are interested in.
Outputs: The admin privileges -- if any -- and how they were derived will
be sent to standard out.
nz_my_access
Usage: nz_my_access <username> [-nopublic]
Purpose: To list out ALL of the objects that a given user has access to.
This is across all databases/schemas. This includes all groups that the
user is a member of, as well as the PUBLIC group.
-nopublic Optional switch -- if you want to shorten the output and
EXCLUDE any privileges associated with the PUBLIC group.
Note: The script "nz_get_acl" can be used to list the access privileges
associated with a specific object (in the form of the original GRANT
statements).
In general, this script must be run as the ADMIN user because it accesses a
number of system tables that (by default) only the ADMIN user has access to.
Inputs: The username is required.
Outputs: A lengthy report will be produced. A small example is shown below:
Database | Schema | Object | Type | Privilege(s) | w/Grant Opt | Why you can access this object | User/Group
----------+--------+-------------+---------+--------------------+--------------------+----------------------------------+------------
PROD | ADMIN | MY_EXAMPLE | TABLE | ****************** | ****************** | (0) OWNER (user) | MARK
PROD | ADMIN | MY_EXAMPLE | TABLE | L | | (1) Explicit GRANT (user) | MARK
PROD | ADMIN | MY_EXAMPLE | TABLE | I | | (1) Explicit GRANT (group) | PUBLIC
PROD | ADMIN | MY_EXAMPLE | TABLE | S | | (1) Explicit GRANT (group) | DBA
PROD | ADMIN | MY_EXAMPLE | TABLE | IUD | | (2) Class Access (user) | MARK
PROD | ADMIN | MY_EXAMPLE | TABLE | T | | (2) Class Access (group) | PUBLIC
PROD | ADMIN | MY_EXAMPLE | TABLE | D LADB | | (2) Class Access (group) | DBA
PROD | ADMIN | MY_EXAMPLE | TABLE | LS | | (3) ALL.Class Access (user) | MARK
PROD | ADMIN | MY_EXAMPLE | TABLE | LS | | (3) ALL.Class Access (group) | PUBLIC
PROD | ADMIN | MY_EXAMPLE | TABLE | LS | | (3) ALL.Class Access (group) | DBA
PROD | ADMIN | MY_EXAMPLE | TABLE | LSIUDTLADBLGOECRXA | | (4) ALL.ALL.Class Access (user) | MARK
PROD | ADMIN | MY_EXAMPLE | TABLE | E | | (4) ALL.ALL.Class Access (group) | DBA
PROD | ADMIN | MY_EXAMPLE | TABLE | L | | (4) ALL.ALL.Class Access (group) | PUBLIC
"Why you can access this object"
================================
This column is broken down into levels, where 0 has the highest precedence
and 4 has the lowest precedence.
(0) OWNER
---------
You are the owner of this object and can do anything to it. You
probably CREATE'd the object, or else someone later ALTER'ed the
object and made you the owner of it. This takes precedence over
anything else.
(1) Explicit GRANT
------------------
You (or one of the groups that you belong to) have been GRANT'ed access
to this object.
For example ... (1) Explicit GRANT (user)
equates to ... GRANT list on MY_EXAMPLE to mark;
(2) Class Access
----------------
You (or one of the groups that you belong to) have been GRANT'ed access
to all objects of this type/class -- within this specific database.schema
For example ... (2) CLASS Access (user)
equates to ... GRANT insert, update, delete on TABLE to mark;
(3) ALL.Class Access
--------------------
You (or one of the groups that you belong to) have been GRANT'ed access
to all objects of this type/class -- in ALL schemas within this database.
For example ... (3) ALL.CLASS Access (user)
equates to ... GRANT list,select on ALL.TABLE to mark;
(4) ALL.ALL.Class Access
------------------------
You (or one of the groups that you belong to) have been GRANT'ed global
access to all objects of this type/class (across all databases/all schemas)
For example ... (4) GLOBAL Class Access (user)
which can be done this way \c system
GRANT all on TABLE to mark;
or it can be done this way GRANT all on ALL.ALL.TABLE to mark;
So 0 trumps 1, 1 trumps 2, 2 trumps 3, and 3 trumps 4.
Only the highest level comes into play (any lower levels are ignored).
At any level (1, 2, 3 or 4) the permissions are ADDITIVE. You have all of the
permissions that have been granted to you, as well as all of the permissions
that have been granted to any groups that you are a member of.
"Privilege(s)" -- The object privilege(s) that you are authorized
============
"w/Grant Opt" -- Whether you also have the right to grant a particular privilege
============= to others (e.g., "WITH GRANT OPTION")
The Object Privileges
L S I U D T L A D B L G O E C R X A
Are abbreviated as follows:
(L)ist (S)elect (I)nsert (U)pdate (D)elete (T)runcate (L)ock
(A)lter (D)rop a(B)ort (L)oad (G)enstats Gr(O)om (E)xecute
Label-A(C)cess Label-(R)estrict Label-E(X)pand Execute-(A)s
nz_my_grants
Usage: nz_my_grants <username/groupname>
Purpose: Dump out all of the GRANTs associated with a particular user (or group).
This is across all databases.
If you specify a username, then this will also include all groups that the
user is a member of, as well as the PUBLIC group.
Inputs: A username or groupname is required.
Outputs: SQL DDL (the GRANT statements) will be sent to standard out.
This may include some/all of the following
o GRANTs on named objects
o GRANTs on object classes
o Administrative privileges that have been GRANTed
o And any "... WITH GRANT OPTION" privileges
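For example (hypothetical username), the output can be captured to a file and replayed later:
$ nz_my_grants john_doe > john_doe_grants.sql      # capture all of this user's GRANTs
$ nzsql < john_doe_grants.sql                      # ... and, if ever needed, replay them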
* * * * Diagnostic / Debugging / Support Tools * * * *
nz_catalog_size
Usage: nz_catalog_size [minimum_filesize_threshold]
Purpose: To report information about the size of the catalog that resides on the host.
Only user tables (the data itself) are stored on the SPUs. That, plus the zonemap
table.
Everything else is stored on the NPS host under the /nz/data directory.
This includes system tables, transaction journals, log files, configuration files, etc ...
Storage-wise, a database is basically just a subdirectory and a bunch of files. e.g.
/nz/data/base/1 Subdirectory 1, which is always the SYSTEM database
/nz/data/base/1/1249 corresponds to the _T_ATTRIBUTE system table/file
/nz/data/base/1/1259 corresponds to the _T_CLASS system table/file
/nz/data/base/1/5002 corresponds to the _T_DIST_MAP system table/file
/nz/data/base/1/5115 corresponds to the _T_STATISTIC system table/file
The above system tables/files are specific to each database (there is one instance, of
each of these files, for every database that is created).
But there are others that are global in nature ... there exists only one copy of
these files on the server. e.g.
/nz/data/global The subdirectory in which these global tables/files exist
/nz/data/global/1260 corresponds to the _T_USER system table/file
/nz/data/global/1261 corresponds to the _T_GROUP system table/file
/nz/data/global/1262 corresponds to the _T_DATABASE system table/file
/nz/data/global/5006 corresponds to the _T_OBJECT system table/file
/nz/data/global/5014 corresponds to the _T_ACL system table/file
The output from this script lists the space occupied by each individual database.
The total size of the "Catalogs" will be greater than the SUM() of the individual
database sizes, as the catalogs include many additional items (such as this 'global'
subdirectory) that are not associated with a specific database.
This nz_catalog_size script reports various pieces of information, including:
o The TOTAL directory size for /nz/data
o The total size of the files making up the Catalog
o The total size attributable to each individual database
o Any postmaster core files
o Any individual files that are greater than the minimum_filesize_threshold
The output includes a "reindex ?" column, wherein the size of one
of the larger system tables is compared to the size of its index.
If the index size is > 250% of the table size then that is used as
an indication that "(yes)", your system might benefit from running
the script "nz_manual_vacuum". The bigger the database ... the
higher the percentage ... the more databases that are over the 250%
threshold ... the more likely you will benefit from an nz_manual_vacuum.
(The percentage was recently increased from 120 to 250, as the lower
limit was too low).
Inputs: By default, only individual files > 75000000 (75MB) will be listed.
You can override this by specifying a different value (that is higher
or lower) for the minimum_filesize_threshold. (The default value for the
threshold was recently increased from 25MB to 75MB. As always, you can
override this and use whatever value you want.)
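For example (the value is illustrative), to list any individual file larger than 25 MB:
$ nz_catalog_size 25000000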
Outputs: A report such as the following will be produced.
/nz/data TOTAL directory size: 3.7G
==========================
Catalogs: 771M
+ CORE files: 2.6G
+ CODE cache: 9.0M
+ HISTORY data: 185M
+ UDX files: 100M
Database (sorted by name) Size reindex ? Directory Path
-------------------------------- ----- --------- --------------------------
HISTORY_DATABASE 69M 163 /nz/data.1.0/base/202895
PROD_DB 474M 258 (yes) /nz/data.1.0/base/200253
SYSTEM 18M 95 /nz/data.1.0/base/1
TEST_DB 69M 98 /nz/data.1.0/base/209304
Database (sorted by size) Size reindex ? Directory Path
-------------------------------- ----- --------- --------------------------
PROD_DB 474M 258 (yes) /nz/data.1.0/base/200253
HISTORY_DATABASE 69M 163 /nz/data.1.0/base/202895
TEST_DB 69M 98 /nz/data.1.0/base/209304
SYSTEM 18M 95 /nz/data.1.0/base/1
CORE files Size
--------------------------------------------------- -------------
/nz/data.1.0/base/1/core.1277077968.17950 2,600,000,000
Files that are greater than the specified threshold Size
--------------------------------------------------- -------------
/nz/data.1.0/base/200253/5030 86,589,440
200253=PROD_DB (DATABASE)
5030=_T_ACTIONFRAG (SYSTEM TABLE)
/nz/data.1.0/base/200253/1249 132,399,104
200253=PROD_DB (DATABASE)
1249=_T_ATTRIBUTE (SYSTEM TABLE)
/nz/data.1.0/base/200253/5305 209,633,280
200253=PROD_DB (DATABASE)
5305=_I_ATTRIBUTE_RELID_ATTNAM (SYSTEM INDEX)
nz_compiler_check
Usage: nz_compiler_check [NZ_KIT_PATHNAME]
Purpose: Verify that the C++ compiler (and its license) are operating correctly.
NPS dynamically generates C++ snippet code that will be run on the host
or on the SPUs. That C++ code must be compiled before it can be used.
On NPS 4.x systems, the compiler is license controlled. If the license
file is not set up properly, the compiler will not compile. This script
performs a basic sanity check against both compilers (the one used for
the host and the one used for the SPUs) to make sure that they (and the
license) are working.
This script is not relevant on later NPS versions (5.0+) as a different
compiler -- one that is not license controlled -- is now used.
Inputs: [NZ_KIT_PATHNAME] By default, I will assume that the NPS software is
to be found under "/nz/kit" ... the typical+default
location. If need be, you can specify a different
location in which it is to be found.
Outputs: A report such as this ... assuming things are working correctly.
If you see different output, that may indicate that there are problems.
LM_LICENSE_FILE=/usr/local/flexlm/licenses/license.dat
NPS Software Kit=/nz/kit.4.5.4
NPS Version=4.0 or 4.5
Host snippet compiled successfully.
SPU snippet compiled successfully.
And here is evidence of the *.o files that were created.
-rw-rw-r-- 1 nz nz 3875 Apr 25 19:00 host_snippet.cpp
-rwxrwxr-x 1 nz nz 6513 Apr 25 19:00 host_snippet.o
-rw-rw-r-- 1 nz nz 5287 Apr 25 19:00 spu_snippet.cpp
-rw-rw-r-- 1 nz nz 1324 Apr 25 19:00 spu_snippet.o
*************
** SUCCESS **
*************
nz_compiler_stats
Usage: nz_compiler_stats
Purpose: Report various statistics about the utilization of the object code cache.
When you run a query, NPS generates and compiles C++ code on your behalf ...
for running up on the host or down on the SPUs/S-Blades. There is overhead
in doing compilations, so NPS creates and reuses an object code cache
( /nz/data/cache ) in order to eliminate any unnecessary (re)compilations.
Inputs: None
Outputs: A report, such as the following. Where
_VT_OBJCACHESTAT is a virtual table. As such, it maintains information only
about the current NPS instance.
Number of cache hits (the object file was found in the cache)
Number of cache misses due to a crc miss (the object file was not found
in the cache ... there was no CRC match)
Number of cache misses in spite of a crc hit (the object file was not found
in the cache. There was a crc hit, but none of the files matched.)
Code Fragments (REUSED) -- are objects that have been reused at least once
(e.g., a cache hit) by another query. (This is based on an analysis of
the current contents of the /nz/data/cache directory. As such, the
information it provides typically does span multiple NPS instances.)
Code Fragments (Used Once) -- are objects that have never been reused.
They may be relatively new, and just haven't been reused as of yet.
Or they may be rather specific in nature, and may never get reused.
_VT_OBJCACHESTAT
CACHE_HITS 381
CACHE_MISS_CRCMISSES 57
CACHE_CRCHITS 0
CRCHIT_CONTENTMISSES 0
Code Fragments (REUSED)
Number of Object Files 306
Minimum Compile Time 0
Average Compile Time 0.22
Maximum Compile Time 13
Grouped By Compile Time
10 + 1
9 0
8 0
7 0
6 0
5 0
4 0
3 0
2 0
1 54
0 251
Code Fragments (Used Once)
Number of Object Files 14
Minimum Compile Time 0
Average Compile Time 0.29
Maximum Compile Time 1
Grouped By Compile Time
10 + 0
9 0
8 0
7 0
6 0
5 0
4 0
3 0
2 0
1 4
0 10
nz_core
Usage: nz_core <core_file_name>
Purpose: Dump the program execution/backtrace from a core file.
The core file can be from either a host/server process
e.g. /nz/kit/log/<server>/core.<date_time>.<pid>
or from a SPU
e.g. /nz/kit/log/spucores/corespu<nn>
or from an SBlade
e.g. /nz/export/nz/kit/log/spucores/core.1.1.spu10.t1252428228.p2322.s11.gz
Inputs: The name of the core file to be processed.
Outputs: The backtrace from the core file. You might want to submit this to the
customer support group for further follow-up.
nz_find_object
Usage: nz_find_object [string]
Purpose: To help find+identify an 'object' -- based on its name or its object id value.
Inputs: A string, representing either
o The object name to be searched for. The lookup will be
a) case insensitive
b) automatically wild-carded (i.e., partial strings are permitted)
c) performed against all objects of all types
o The object's 'object id' value (if a number is entered, it will be
considered an object id -- rather than part of an object name).
If no string/value is specified, then information about EVERY object will
be returned.
Outputs: A report such as the following is produced
nz_find_object emp
The Object Name Is | It Is Of Type | Its 'objid' Is | In The Database
--------------------+------------------+----------------+-----------------
EMPLOYEE | TABLE | 202262 | PRODUCTION_DB
TEMP TABLE | RELATION | 4940 |
_T_TEMP_TABLE_INFO | SYSTEM TABLE | 5023 |
_VT_PLAN_SAVETEMP | MANAGEMENT TABLE | 4011 |
nz_find_object_orphans
Usage: nz_find_object_orphans
Purpose: Diagnostic Script: Used to identify certain discrepancies within the catalogs.
The catalogs are a bunch of inter-related system tables that are
stored on the NPS host. This script joins those various tables
to one another -- looking for rows in one table that seem to have
no corresponding row(s) in their associated table(s).
Inputs: None.
Outputs: If any discrepancies are found, they will be identified in
the output as
Probable Orphans
nz_find_table_orphans
Usage: nz_find_table_orphans [ -delete ]
Purpose: Check that both the host and the SPU/S-Blades have entries for every table.
There are two components to a table
1) an entry in the catalogs up on the host
2) an entry ( + storage ) down on the SPU/S-Blade
If either of those components is missing, the table is basically useless ...
and the remaining component is orphaned and needs removing.
This script will report on any such orphans.
Note: For the purposes of this script a 'materialized view' is similar to a
table. This script will also report on and process any materialized views.
Note: In NPS 6.0 the storage manager (down on the SPU/S-Blades) was redesigned.
It no longer keeps state information for tables holding no storage (e.g., tables
that are empty). Thus, the existence of an entry in the host catalog no longer
implies an entry on the SPU. So the first part of this report
is no longer relevant/provided:
This is a list of tables and materialized views that are in the catalogs ...
BUT are NOT found on the SPUs.
But there is an implication going the other way: existence of an entry on the spu
implies existence of an entry in the host catalog.
Inputs: -delete Optional switch.
By default, this script simply reports on the orphans that it finds.
If you add this switch, the script will also delete the orphaned component.
Outputs: A report (such as the following) about any such orphaned objects ...
This is a list of tables and materialized views that are in the catalogs ...
BUT are NOT found on the SPUs.
Database Name | Object Name | Object Class | Object Identifier
---------------+--------------------------+--------------+-------------------
TEST_DB | SALES_FIGURES | Table | 200378
TEST_DB | _MSALES | MView Store | 200401
DEV_DB | TEST | Table | 200412
DEV_DB | _MTEST_MVIEW_1 | MView Store | 200435
(4 rows)
This is a list of tables and materialized views that are being stored on the SPUs ...
BUT for which NO information was found in the CATALOGS.
SPU Table ID
--------------
201399
201477
(2 rows)
nz_frag
Usage: nz_frag [database] <table/mview>
Purpose: Dump out extent/page allocations in order to visualize table fragmentation.
Storage is allocated an extent at a time (3MB). Within the extent, it is
then filled up with records a page at a time (128KB). The pages are filled
front-to-back. Once all of the available space in the extent is used up, a
new extent is allocated.
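(In other words, each 3MB extent holds 24 pages x 128KB = 3,072KB of storage.)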
Usually, all of the 24 pages within an extent are in-use. But there are
exceptions.
o The very last extent for a table (on any given dataslice) will probably
only be partially filled. So any remaining pages will be unused (empty).
Unused pages are not scanned.
o If you have done a "GROOM TABLE <tablename> PAGES ALL;" then any pages
that contain 100% deleted/groomable rows will be marked as being empty.
They will no longer be used/scanned, though they still exist within the
extent. If all 24 pages in the extent are empty, the extent will be
removed from the table and added back to the global storage pool.
o Clustered Base Tables (those created with an ORGANIZE ON clause) may only
partially fill any given cluster/extent.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table/mview name is required. If only one argument is specified, it
will be taken as the table/mview name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table/mview name.
-dsid <nn> By default, data slice 1 (the 1st amongst all data slices)
will be reported upon. You can choose to look at the
information for a different data slice if you wish.
Storage allocation/usage is specific to each dataslice.
Rather than showing all info for all dataslices (which
could be voluminous), this script will concentrate on
just one dataslice, which should provide a good
representative sample.
-dsid all If you want to see ALL information for ALL dataslices,
specify this option. If you have a big table and a lot
of dataslices, the output could be voluminous.
-dsid 0 Summary information (shown here) is always included in
the output. If you want to see ONLY the summary information
(and no detail information for the dataslices) then specify
"-dsid 0" (actual dataslices are numbered starting at 1).
Summary information across all dataslices
=========================================
Total # Of | | # that are Non- | | % Contiguous |
Extents (min) | (max) | Contiguous (min) | (max) | (min) | (max)
--------------+-------+------------------+-------+--------------+-------
1368 | 1368 | 7 | 7 | 99.49 | 99.49
Outputs: A dump of the extent+page storage information for this table/mview (for
a single dataslice).
The "extent" that is displayed is the system assigned extent number.
Extents are allocated from 1 (the outer tracks) to <n> (the inner
tracks). On a TwinFin system the number of extents is 121K. On a
Striper system the number of extents is 66K.
The "#" column is a simple, one-up number ... to make things easier to
read.
The "gap" column is used to indicate whether the extents are contiguous
on disk. If the extents are contiguous (if the gap is 0) then a blank
will be displayed. Otherwise, this number will represent the number of
other extents (not belonging to this table) between this extent and the
prior extent.
"Used/Unused Pages (./0)" is used to represent which of the 24 pages
within each extent are (or are not) in use. A "." indicates the page
is in use. A "0" indicates the page is not being used.
Example follows:
$ nz_frag SYSTEM TEST_TABLE
Database: SYSTEM
Object Name: TEST_TABLE
Object Type: TABLE
Object ID : 10578974
Data Slice: 1
extent | DataSlice | # | gap | Used/Unused Pages (./0)
--------+-----------+----+-----+--------------------------
810 | 1 | 1 | | 0.......................
812 | 1 | 2 | 1 | .0......................
816 | 1 | 3 | 3 | ..0.....................
817 | 1 | 4 | | ...0....................
818 | 1 | 5 | | ....0...................
819 | 1 | 6 | | .0.0.0.0.0.0.0.0.0.0.0.0
820 | 1 | 7 | | ........................
821 | 1 | 8 | | 0.0.0.0.0.0.0.0.0.0.0.0.
822 | 1 | 9 | | ........................
823 | 1 | 10 | | 0......................0
824 | 1 | 11 | | .0000000000000000000000.
905 | 1 | 12 | 80 | ........................
906 | 1 | 13 | | ........................
907 | 1 | 14 | | 000000000000............
908 | 1 | 15 | | ............000000000000
909 | 1 | 16 | | ........................
910 | 1 | 17 | | ........................
911 | 1 | 18 | | ........................
912 | 1 | 19 | | .........000000.........
913 | 1 | 20 | | ........................
958 | 1 | 21 | 44 | ........................
959 | 1 | 22 | | ........................
962 | 1 | 23 | 2 | ........................
965 | 1 | 24 | 2 | ........................
967 | 1 | 25 | 1 | ........................
969 | 1 | 26 | 1 | ........................
971 | 1 | 27 | 1 | ........................
973 | 1 | 28 | 1 | ........................
976 | 1 | 29 | 2 | ........................
1009 | 1 | 30 | 32 | ........................
1011 | 1 | 31 | 1 | ........................
1013 | 1 | 32 | 1 | ...........0000000000000
(32 rows)
nz_genc
Usage: nz_genc [filename]
Purpose: Recompile code snippets (under /nz/kit/log/gencErrors) to identify any problems.
NPS dynamically generates C++ snippet code that will be run on the host or
on the SPUs/S-Blades. That C++ code must be compiled before it can be used.
On rare occasions, the compilation may fail. When that happens, all of the
relevant information is copied to this directory for diagnostic + reporting
purposes.
False positives sometimes occur, such as when a user kills a query (^C) while
a compilation is going on. Such occurrences do not need to be reported.
This script will compile the code snippets that it finds under the directory
/nz/kit/log/gencErrors in an attempt to identify which ones have problems.
Inputs: If no filename is provided, then the script will process all of the *.pln
files under '/nz/kit/log/gencErrors', recompiling the last code snippet
from each one (which would be the snippet that threw the compilation error).
If the filename refers to a specific *.pln file, then the script will
recompile the last code snippet from it (which would be the snippet that
threw the compilation error).
If the filename refers to a specific *.cpp file, then the script will
recompile it (regardless of whether or not it ever had any compilation
problems to begin with). In this way, you can compile any code snippet
of your choosing.
Notes: Code compilations can take a while (that is why NPS makes use of
an object code cache).
If the compiler gets stuck in an optimization loop (check to see
if there is a long running 'dtoa' process), you can disable the
compiler optimization via the following system registry setting:
host.gencDiabKillOptMask=-1
Outputs: If the compilation succeeds
-- the *.o object file will be written out to the same directory as the
*.cpp file
-- you can remove this subdirectory from the system, as its contents
would appear to be of no particular interest
For each *.cpp snippet that is compiled, you will get a listing such as
this. "warning" messages generated by the compiler are pretty common.
If the "Compiler exit code:" is not 0, then it would appear that there is
a problem that may need to be reported to the customer support group.
Plan File: /nz/kit/log/gencErrors/2008-07-25_18:15:06.plan2/2.pln
Source File: s2_2.cpp
The compilation command to be invoked by this script (w/modified pathnames)
--------------------------------------------------------------------------------
/nz/kit.4.5/sbin/diab/5.0.3/LINUX386/bin/dplus -Xc++-old -c -o ./s2_2.o -Xexceptions -Xchar-signed -Xmember-max-align=4 -Xenum-is-int -D__PPC__ -tPPC440ES -DNZ_CPU_TYPE=44024 -Xkeywords=0x1F -Xlint=0xfffffffd -Xlicense-wait -Xsmall-data=0 -Xno-bss -Xsmall-const=0 -DNZDEBUG=0 -DGENCODE -DFOR_SPU -I/nz/kit.4.5/sys/include ./s2_2.cpp
The compilation is now running (2008-08-01 14:52:25) ...
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvv compiler messages vvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
"/nz/kit/sbin/diab/5.0.3/include/stdio.h", line 63: warning (dplus:1078): typedef previously declared without linkage specification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ compiler messages ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Compiler exit code: 0 (Success)
nz_host_memory
Usage: nz_host_memory
Purpose: Display the host's memory allocation table.
On the database host, a large shared-memory pool (approximately
2.6 GB in size) is allocated when the server processes start up.
Report on this memory allocation/usage.
Inputs: None
Outputs: The memory allocation table is printed out.
nz_manual_vacuum
Usage: nz_manual_vacuum [ -backup <flag> ]
Purpose: Vacuum and reindex the host catalogs.
The database catalogs are a bunch of inter-related system tables that are
stored on the NPS host. In extreme situations, those catalogs can grow
rather large in size, thereby affecting performance (access to the
information contained in the catalogs).
This script can only be run when the database system is 'Stopped', and
will check to make sure that that is the case.
Inputs: -backup <flag> Do you want this script to automatically create a backup
of /nz/data before making any changes?
This is optional, and automatically defaults to YES.
The backup tar file will be placed in /nz
Outputs: All output from this script will be copied to a log file in /nz.
Any catalog files that were > 75MB in size (at the beginning of the script)
will have their before + after sizes displayed for comparison purposes.
For example:
Size Before Size After Pathname
=============== =============== =========================================================
770,304,436 670,476,724 /nz/data (excluding CORE files, CODE cache, HISTORY data)
--------------- --------------- ---------------------------------------------------------
86,589,440 86,589,440 /nz/data.1.0/base/200253/5030
132,399,104 132,399,104 /nz/data.1.0/base/200253/1249
209,633,280 120,602,624 /nz/data.1.0/base/200253/5305
42,094,745 42,094,745 /nz/data.1.0/kit/sbin/postgres
nz_online_vacuum
Usage: nz_online_vacuum [<database>]
Purpose: Reindex the host catalogs ... while the system is online
The database catalog is made up of a number of system tables and indexes
that are stored on the NPS host. In extreme situations, those relations
can grow rather large in size, thereby affecting performance (when accessing
the information contained in the catalog).
An access exclusive lock is issued for each database. No users can be signed
into that database, and no cross-database queries against that database can be running.
If the lock cannot be obtained then that database will not be vacuum'ed at
this time and the output will report
ERROR: unable to acquire lock on database DBNAME
Online reindexing includes both a vacuum (of the system tables) and a reindex
(of the system views).
Note: The SYSTEM database cannot be processed as it is always in use. Some
relations ( _T_OBJECT:The table of objects, _T_ACL:Access Control Lists, etc)
reside in the SYSTEM database.
See also: nz_manual_vacuum
Inputs: The database name is optional ... if you want to process just the
specified database. Otherwise, the default is to process all databases.
Outputs: Each database is processed, in turn. Sample output
1 DEV_DB done
2 PROD_DB ERROR: unable to acquire lock on database PROD_DB
3 QA_DB done
4 TEST_DB done
nz_responders
Usage: nz_responders [ optional args ]
Purpose: Show interactions/responses for running queries (across each dataslice).
This script provides information from "nzsqa responders -sys" in a
more user-friendly format, along with additional pieces of information.
This script must be run on the NPS host as the linux user "nz" (but it
does not require you to be the database user "ADMIN")
By default, this script loops + runs forever. Press ^C at any time to
interrupt it.
Inputs: All arguments are optional
-sleep <sleep_time>
How long to sleep before refreshing the output. The default is
10 seconds. The range is 0..3600 seconds.
-plan <plan_id>
If you want the output to reflect just a particular query plan
specify it here. Otherwise, the output will include information
for all active queries.
-stragglers <cnt>
When the number of "Busy" dataslices is <= this value, the script
will start displaying the individual dataslice #'s that we are
waiting upon (in the "Dataslices ..." column). The default value
is 5 (which takes up 20 characters of screen space).
You can specify a value from 1..30 (4 characters of screen space
are needed per dataslice).
-stragglers TRUE
Only display information about a query/plan that CURRENTLY has
stragglers associated with it. Otherwise, it will be skipped.
The "-straggers" options can be used individually or together.
-spu
When displaying stragglers, the dataslice is what gets displayed. If
you also want to know the SPU ID's associated with those dataslices
include this switch. The SPU ID's will be displayed on the following
line. The number in (parentheses) is the SPU ID. The number preceding
that is how many straggler dataslices are from this particular SPU ID.
For example:
... Busy Dataslices ... SQL
... ==== ============================================ =====================
... 92 select * from table1;
... 8 5 6 11 12 15 16 19 20 select * from table2;
... SPU: 8 (1112)
... 80 select * from table3;
... 11 1 3 5 7 9 11 13 15 17 20 21 select * from table4;
... SPU: 4 (1112) 4 (1134) 3 (1138)
-v|-verbose
For each plan, for the snippet that it is currently running,
display the more interesting "Nodes" (such as ScanNode, JoinNode,
SortNode, DownloadTableNode, etc ...) so that you can get a quick
summary as to what each query plan is doing at this point in time.
-q|-queue
On a busy system there might be many jobs that are queued. By
default, this script will just print a single summary line, such
as: queued 14 (4/83.1/314 min/avg/max queuetime)
If you wish to have each of the queued jobs listed individually
include this switch.
-header <nnn>
A header (the column titles) will be printed out after every 30
iterations. You can adjust how frequently you want the headers
displayed.
-width <nnn>
The display width for the SQL column. Default is 40. Range is 3-120.
-loop forever # The default ... this script will loop forever.
# Press ^C at any time to interrupt it.
-loop once # This script will run once and then exit
-loop busy # The script will loop ... but then automatically exit
# once one of the following conditions is true.
# 1) There are NO active queries running on the box, or
# 2) The query you specified ("-plan <plan_id>") is done
-loop <nnn> # This script will loop a specified number of times, e.g.
# -loop 1 -loop 10 -loop 32767
-fast # Do you want the script to run faster or slower? The
-slow # default is now going to be "-fast".
#
# As of release 7.0, in order to be able to display how
# long the current snippet has been running, the script
# must make additional calls to nzsql in order to
# issue a 'SHOW PLANFILE <nnn>;' command. The "-fast"
# switch will eliminate that overhead ... but at the expense
# of not being able to include the snippet timing. In
# which case only the overall plan time will be displayed.
#
# The "-slow" switch causes the script to do that extra
# work, and adds a bit more overhead to the system.
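For example, a hypothetical invocation that refreshes every 5 seconds, shows the
individual straggler dataslices (and their SPU IDs) once 10 or fewer remain, and
exits automatically when the system goes idle:
$ nz_responders -sleep 5 -stragglers 10 -spu -loop busy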
Outputs: A report such as this will be produced.
The first column is the date/timestamp when this sample was taken.
The second column is the query plan number.
"Snippet" shows what snippet the plan is currently on / and the total
number of snippets in the plan.
"Time S/P" shows how long the current "S"nippet has been running / and
how long the "P"lan as a whole has been running.
State RUNNING - The query plan / snippet is currently running.
waiting - The query plan is running, but the current snippet
is waiting its turn to run. This may be because
o There are too many other snippets running
o Those other snippets are using too much memory
o Your WLM group is 'Overserved' (and has used more
than its prescribed share of system resources)
If loading data via external tables, it will always
be displayed as "waiting". See "loading" below.
delayed - The query plan is running, but the start of the
next snippet has been delayed. This may be because
the group that the user belongs to has exceeded
its resource maximum limit.
queued - The query plan is queued and has not yet started.
loading - This is an nzload job (or something of that nature).
If a query used external tables (rather than nzload)
then its state will instead be displayed as "waiting".
In either case, this script can not distinguish the
exact state of such jobs (since they typically involve
just a single, host-side only (DBOS) snippet). These
snippets might be running or they might be waiting. If
you need to dig deeper, try: nzsqa schedqueues -sys
done - The query completed just this instant.
"Busy" shows how many dataslices are still busy running this snippet
(22 happens to be the number of dataslices on an IBM Netezza 1000-3).
"Dataslices ..." shows the dataslice #'s that are still busy running
this particular snippet.
20141231 Plan # Snippet Time S/P State Busy Dataslices ... SQL Username/Database
======== ====== ========= ========= ======= ==== ==================== ======================================== =================
15:33:05 83 (2/2) 14/15 RUNNING 1 13 update product set part_number = null wh SCOTT/STORES
86 (2/37) 1/2 RUNNING 22 create table test1 as select * from syst ADMIN/SYSTEM
94 (8/10) 7/9 waiting select case when customer_id between 123 GUEST/MARKETING
96 (5/32) 18/22 delayed select * from nz_check_disk_scan_speeds GUEST/SYSTEM
97 (0/22) 21 queued select current_time, * from payroll wher BOSS/FINANCE
98 (0/1) 13 loading insert into LINEITEM select * from EXTER ADMIN/TPCH
105 (1/1) 0/0 done select count(*) from tiny_one; ADMIN/SYSTEM
nz_scan_table_extents
Usage: nz_scan_table_extents <database> <table> -dsid <nn>
Purpose: Scan 1 table ... 1 dataslice ... 1 extent at a time.
This is a diagnostic/test script for scanning a table. Instead of performing
a simple (and much faster) full table scan of the table, e.g.
select count(*) from <table>;
this script will instead try to scan the table extent by extent, one extent at a
time. If a failure is encountered on any single scan (or extent), the script will
continue on and attempt to scan the remainder of the table.
This is accomplished by using zonemap information. To each scan (or SELECT
statement) a WHERE clause is added to specify the range of CREATEXID and ROWID
values that we are interested in -- in an effort to limit the scan to a single
extent.
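Conceptually, each per-extent scan is just a restricted SELECT of roughly this form
(a sketch only ... the actual bounds are derived from the zonemap information):
select count(*)
from <table>
where createxid between <min_createxid> and <max_createxid>
and rowid between <min_rowid> and <max_rowid>;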
Inputs: The following inputs are required
<database> The database name
<table> The table name
-dsid <nn> The specific dataslice to be scanned
Outputs: Sample output. In this example, there were 3 extents allocated to this table
(on this particular dataslice). Each scan scanned exactly one (and only one)
extent.
Scan # | _EXTENTID | ROWCOUNT
--------+-----------+----------
1 | 43359040 | 64863
Scan # | _EXTENTID | ROWCOUNT
--------+-----------+----------
2 | 43359072 | 64813
Scan # | _EXTENTID | ROWCOUNT
--------+-----------+----------
3 | 43359136 | 20068
But that won't always be the case. Depending on how the table was populated
with data, some of the scans (because of the WHERE restrictions used) might
end up referencing data that falls in multiple extents. So in this example,
some scans scanned more than one extent (causing some extents to be scanned
more than once).
Scan # | _EXTENTID | ROWCOUNT
--------+-----------+----------
1 | 43359392 | 64220
1 | 43359424 | 2031
Scan # | _EXTENTID | ROWCOUNT
--------+-----------+----------
2 | 43359392 | 19654
2 | 43359424 | 64207
2 | 43359456 | 2119
Scan # | _EXTENTID | ROWCOUNT
--------+-----------+----------
3 | 43359424 | 20047
3 | 43359456 | 64059
3 | 43359488 | 1826
<snip>
nz_show_locks
Usage: nz_show_locks [<database> <object>]
Purpose: Show information about locks being held on the system.
o You can show summary information -- about all objects that have a lock
o You can show detailed information -- about locks on a specific object
Locks can apply to tables, views, sequences, ...
See Also: /nz/support/bin/SAMPLES/locks.sql
Inputs: If no arguments are specified, then a list of all user objects, that
currently have lock(s) associated with them, will be displayed.
-all Include this switch to get a list of ALL objects (both user and
system) that currently have lock(s) associated with them.
If you specify the optional <database> <object> names, then the script
will list what session(s) currently hold a lock on the object, and what
session(s) are waiting to acquire a lock on the object.
Outputs: A report such as the following.
$ nz_show_locks
User objects that currently have lock(s) associated with them
Database Name | Object Name | Object Type
---------------+-------------+-------------
PROD | TEST_TABLE | (table)
(1 rows)
$ nz_show_locks prod test_table
Database: PROD
Object: TEST_TABLE
Timestamp: 2010-02-16 15:46:37
=======================================================================================================
The following session(s) are HOLD'ing a lock on the object
Requested | Granted @ | Wait Time | SESSIONID | PROCESSID | USERNAME | LOCKMODE | Current SQL Command
-----------+-----------+-----------+-----------+-----------+----------+------------------+------------------------------------------
15:17:14 | 15:17:14 | 00:00:00 | 140466 | 24899 | MARKF | AccessShareLock | update test_table set address = '1600 P'
15:17:14 | 15:17:14 | 00:00:00 | 140466 | 24899 | MARKF | RowExclusiveLock | update test_table set address = '1600 P'
15:20:53 | 15:20:53 | 00:00:00 | 140483 | 25923 | GUEST | AccessShareLock | select count(*) from test_table;
(3 rows)
=======================================================================================================
The following sessions are WAIT'ing to access the object
Requested | Granted @ | Wait Time | SESSIONID | PROCESSID | USERNAME | LOCKMODE | Current SQL Command
-----------+-----------+-----------+-----------+-----------+----------+---------------------+----------------------------------
15:46:01 | | 00:00:36 | 140524 | 10515 | ADMIN | AccessExclusiveLock | lock table "TEST_TABLE"
15:46:29 | | 00:00:08 | 140445 | 24157 | DBA | AccessShareLock | select count(*) from test_table;
(2 rows)
=======================================================================================================
nz_spu_memory
Usage: nz_spu_memory [-force]
Purpose: Provide a summary of how much memory the individual SPUs are using.
This script might be helpful when trying to identify skewed queries.
WARNING: This script has the potential to cause the system to restart.
Please take this into consideration if you plan to run this script
on a production system.
Inputs: Because of the aforementioned warning, you will be prompted to verify
that you really, really want to run this script. Passing the "-force"
argument on the command line will eliminate the prompt.
Outputs: A report such as this will be produced. The SPUs are grouped
together for reporting purposes.
# Of SPUs Amount Of Memory Used (in bytes)
========= =================================
8 Total quantity in use: 20160448
163 Total quantity in use: 23429648
52 Total quantity in use: 23429680
1 Total quantity in use: 23437920
nz_spu_swap_space
Usage: nz_spu_swap_space [-verbose] [-loop [delay_time]] [ -bytes | -kb | -mb ]
Purpose: Provide a summary of how much swap space the individual SPUs are using.
This script might be helpful when trying to identify skewed queries.
The amount of swap space varies, based on the machine model and size.
Roughly speaking, the base numbers are
TwinFin - 108 GB per dataslice
Striper/Mako - 88 GB per dataslice
Inputs: -verbose Optional argument. If specified, the output will also
highlight the swap space being used up by dataslice 1
(the 1st amongst all data slices). A redistribution on
a column value of NULL/ZERO/BLANK results in those
particular records always winding up on this dataslice.
-loop [delay_time]
Have the script loop forever ... reporting on the swap
space used every 60 seconds. Terminate it via a ^C.
If you want to increase/decrease the default delay_time
(which is 60 seconds) specify a number from 1 to 300.
-bytes By default, this script will report swap space usage in MB's.
-kb You can choose a different reporting measure if you wish.
-mb
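For example, a hypothetical invocation that reports the usage (in MB, the default)
every 30 seconds, highlighting dataslice 1:
$ nz_spu_swap_space -verbose -loop 30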
Outputs: A report such as this will be produced.
This is an example from an NPS 4.x system (which is SPU based). The SPUs are
grouped together by the amount of swap space 'Used' (otherwise, the output for
a 10800 would be 864 lines long).
# Of SPU's | Swap Space (MB) | ==> USED <== | Free | Dataslice 1
------------+-----------------+--------------+------------+-------------
102 | 24,779 | 183 | 24,596 |
1 | 24,779 | 182 | 24,597 | *
5 | 24,779 | 182 | 24,597 |
This is an example from an NPS 5.x TwinFin system (which is S-Blade based).
A single S-Blade manages multiple dataslices (typically 6 to 8). The amount
of swap space 'Used' is for the S-Blade as a whole. The output also shows
the average space used, per dataslice (but note that this is the average ...
not the min nor the max)
# Of Dataslices
S-Blade's | Per S-Blade | Swap Space (MB) | ==> USED <== | USED (avg per dslice) | Free
-----------+-------------+-----------------+--------------+-----------------------+---------
1 | 8 | 892,138 | 446 | 56 | 891,692
1 | 6 | 669,087 | 7 | 1 | 669,081
1 | 8 | 892,138 | 7 | 1 | 892,131
This is an example from an NPS 7.2 Mako system. The script also displays the amount of
swap space used by any active query (that is using more than 1 page of swap, per data
slice). In this example, note that the first plan (#12) is terribly skewed -- in that
only one (of the 120 dataslices on this box) is using all of the swap space. Whereas
the second plan (#1390) has absolutely no skew.
# Of Dataslices
S-Blade's | Per S-Blade | Swap Space (MB) | ==> USED <== | USED (avg per dslice) | Free
-----------+-------------+--------------------+----------------+-----------------------+-----------
1 | 40 | 3,575,792 | 502,900 | 12,572 | 3,072,892
2 | 28 | 2,542,378 | 114,920 | 4,104 | 2,427,458
1 | 24 | 2,197,906 | 98,503 | 4,104 | 2,099,403
Plan # | Swap Space Used | DSlice MIN | DSlice AVG | DSlice MAX | SKEW RATIO
--------+----------------------+---------------+---------------+-----------------+-----------------
12 | 355,182,706,688 | 0 | 2,959,855,889 | 355,182,706,688 | 0.000 - 120.000
1390 | 516,402,708,480 | 4,303,355,904 | 4,303,355,904 | 4,303,355,904 | 1.000 - 1.000
nz_spu_top
Usage: nz_spu_top [ optional args ]
Purpose: Show the current CPU Utilization and Disk Utilization on the S-Blades.
This script provides a summary of the information returned by the command
nzsqa proctbl -allrole ACTIVE
This script must be run on the NPS host as the linux user "nz" (but it
does not require you to be the database user "ADMIN")
By default, this script loops + runs forever. Press ^C at any time to
interrupt it.
Inputs: All arguments are optional
-sleep <sleep_time>
How long to sleep before refreshing the output. The default is
10 seconds. The range is 1..3600 seconds.
-v|-verbose
This script displays the MIN/AVG/MAX values for the CPU Utilization
and Disk Utilization. Include this optional switch if you want to
see all of the individual values.
-loop forever # The default ... this script will loop forever.
# Press ^C at any time to interrupt it.
-loop once # This script will run once and then exit
-loop <nnn> # This script will loop a specified number of times, e.g.
# -loop 1 -loop 10 -loop 32767
Outputs: Sample outputs (with some commentary) follow.
# The script "nz_check_disk_scan_speeds" was running at the time. It is I/O intensive
# and will peg the disks at/near 100% utilization (as shown below). This output is from
# an IBM Netezza 1000-3 (which has 3 S-Blades and 22 disks/dataslices).
$ nz_spu_top -sleep 1 -loop 1 -verbose
MIN / AVG / MAX MIN / AVG / MAX
================================= ==================================
19:01:40 CPU Util %: 4.4 / 5.1 / 5.9 Disk Util %: 99.9 / 99.9 / 99.9
CPU Util: 4.4% Disk Util: 99.9%,99.9%,99.9%,99.9%,99.9%,99.9%,99.9%,99.9%
CPU Util: 5.2% Disk Util: 99.9%,99.9%,99.9%,99.9%,99.9%,99.9%
CPU Util: 5.9% Disk Util: 99.9%,99.9%,99.9%,99.9%,99.9%,99.9%,99.9%,99.9%
# Whereas, "nz_check_disk_scan_speeds -cpu" is CPU intensive, and will peg the CPUs
# at/near 100% utilization (as shown below). Note that one of the S-Blades has 8 CPU
# cores but is responsible for controlling just 6 disk drives. Thus, it is using 3/4ths
# (75%) of the total CPU.
$ nz_spu_top -sleep 1 -loop 1 -verbose
MIN / AVG / MAX MIN / AVG / MAX
================================= ==================================
19:01:55 CPU Util %: 75.6 / 91.8 / 100.0 Disk Util %: 63.1 / 81.9 / 99.9
CPU Util: 100.0% Disk Util: 65.5%,85.3%,63.1%,68.6%,72.9%,83.9%,77.4%,83.2%
CPU Util: 75.6% Disk Util: 99.9%,93.4%,97.9%,94.4%,97.1%,97.6%
CPU Util: 100.0% Disk Util: 86.8%,63.9%,84.9%,80.6%,78.4%,72.2%,70.9%,86.0%
# The appliance is currently idle ... no queries are running on it.
$ nz_spu_top -sleep 2
MIN / AVG / MAX MIN / AVG / MAX
================================= ==================================
19:35:54 CPU Util %: 0.0 / 0.1 / 0.2 Disk Util %: 0.0 / 0.0 / 0.0
19:35:56 CPU Util %: 0.0 / 0.0 / 0.0 Disk Util %: 0.0 / 0.0 / 0.0
19:35:59 CPU Util %: 0.0 / 0.0 / 0.0 Disk Util %: 0.0 / 0.0 / 0.0
19:36:01 CPU Util %: 0.0 / 0.1 / 0.2 Disk Util %: 0.0 / 0.0 / 0.0
19:36:04 CPU Util %: 0.0 / 0.1 / 0.2 Disk Util %: 0.0 / 0.0 / 0.0
nz_test
Usage: nz_test
Purpose: Run a test to verify that these scripts can connect to the database.
The nz_*** scripts assume that you are able to establish a valid connection
to the database. This script can be used to verify that, and help diagnose
any problems that you might be having.
NZ_USER This environment variable must be set up
NZ_DATABASE This environment variable must be set up
NZ_PASSWORD You can set this environment variable ... or you can set
up the password cache (via the 'nzpassword' command) in
which case the passwords are automatically obtained from
that.
NZ_HOST If you are running on a remote/client machine then this
environment variable must be set up. If you are running
on the NPS host itself, then this environment variable
is optional (if you do set it, you can set it to the
NPS's hostname or to 'localhost' or to 127.0.0.1).
PATH Your search $PATH must include the database software's
"bin" directory (where the nzsql, nzload, nzEtc ...
executables reside).
You do not have to include the directory location where
these nz_*** scripts reside ... but that would certainly
make things easier for you (when you go to invoke one).
Inputs: None
Outputs: If a successful connection is established, the script will display
Success
Otherwise, the relevant error message will be displayed.
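For example, a minimal environment (with purely illustrative values) might be set up
as follows before invoking the script:
    $ export NZ_USER=ADMIN
    $ export NZ_DATABASE=SYSTEM
    $ export NZ_PASSWORD=password          # or use the 'nzpassword' password cache instead
    $ export NZ_HOST=localhost             # only needed when running from a remote/client machine
    $ export PATH=$PATH:/nz/kit/bin        # the database software's "bin" directory (location may vary)
    $ nz_test
    Success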
nz_transactions
Usage: nz_transactions
Purpose: Display information about the current database transactions.
Notes: NPS only assigns/uses even transaction ID's. The majority of the
information shown below is only available if you are running NPS 4.6+.
Inputs: None
Outputs: A report such as the following (with some annotations added to this help
text to help explain things).
General transaction information
==========================================================
Stable TXid : 2893 (0xbd4) All transactions lower than this are visible (i.e., committed/rolled back)
Last TXid : 2970 (0xb9a) The last transaction ID that was assigned (probably to this transaction)
Difference : 38 Number of transactions that separated these two values
TX Array Size : 6 Maximum is system.maxTransactions=65490
Invisibility List : (2894,2898)
Session information about the oldest active transaction
==========================================================
It's TXid Value : 2894
HEX TXid Value : 0xb4e
ID : 17024
PID : 19118
USERNAME : ADMIN
DBNAME : SYSTEM
TYPE : sql
CONNTIME : 2010-03-18 16:16:11
STATUS : active
COMMAND : insert into table_1 select * from my_test_table;
PRIORITY : 3
CID : 19117
IPADDR : 127.0.0.1
Ongoing transactions that are making changes to the tables
(which will be rolled back if they are killed)
==========================================================
Session ID | TXid | HEX TXid | TX Start Time | User | Database | Operation | Object | Obj Type | Status | Latest SQL Statement
------------+------+----------+---------------------+-------+----------+-----------+---------+----------+--------+----------------------
17024 | 2894 | 0xb4e | 2010-03-18 16:39:20 | ADMIN | SYSTEM | Delete | TABLE_1 | TABLE | active | insert into table_1
17024 | 2894 | 0xb4e | 2010-03-18 16:39:20 | ADMIN | SYSTEM | Insert | TABLE_1 | TABLE | active | insert into table_1
17024 | 2894 | 0xb4e | 2010-03-18 16:39:20 | ADMIN | SYSTEM | Insert | TABLE_2 | TABLE | active | insert into table_1
17024 | 2894 | 0xb4e | 2010-03-18 16:39:20 | ADMIN | SYSTEM | Insert | TABLE_3 | TABLE | active | insert into table_1
17024 | 2894 | 0xb4e | 2010-03-18 16:39:20 | ADMIN | SYSTEM | Insert | TABLE_4 | TABLE | active | insert into table_1
(5 rows)
Notes: A single transaction might alter one, or many, tables. (Though any given SQL
Statement in the transaction would only alter a single table at a time).
The "Operation" column will show
Insert (because of an INSERT and/or an UPDATE statement)
Delete (because of a DELETE and/or an UPDATE statement)
The "TX Start Time" is when the transaction began ... not the session connect time.
This list shows TRANSACTIONS (that are altering the tables), not SESSIONS (that
are simply accessing the tables).
Transactions that are currently rolling back
==========================================================
Session ID | TXid | HEX TXid | TX Start Time | User | Database | Operation | Object | Obj Type | Status | Latest SQL Statement
------------+------+----------+--------------------+-------+----------+-----------+---------+----------+---------+----------------------
17157 | 2928 | 0xb70 |2010-03-18 16:41:50 | ADMIN | SYSTEM | Delete | TABLE_A | TABLE | tx-idle | insert into table_a
17157 | 2928 | 0xb70 |2010-03-18 16:41:50 | ADMIN | SYSTEM | Insert | TABLE_A | TABLE | tx-idle | insert into table_a
17157 | 2928 | 0xb70 |2010-03-18 16:41:50 | ADMIN | SYSTEM | Insert | TABLE_B | TABLE | tx-idle | insert into table_a
(3 rows)
XID | HEX xid | # Of Tables | FULLSCAN | Rollback Started | Blocks Modified | Blocks Remaining | Time Remaining (Est)
------+---------+-------------+----------+---------------------+------------------+------------------+----------------------
2928 | 0xb70 | 2 | f | 2010-03-18 16:44:07 | 2,742 | 2,598 | 00:01:26.750532
(1 row)
* * * * Miscellaneous / Other * * * *
nz_abort
Usage: nz_abort [-all|<dbname>|<username>]
Purpose: Abort the specified user sessions.
Inputs: If no options are specified, the script will abort all sessions with the
same username (CURRENT_USER).
-all # Abort all sessions.
<dbname> # Abort all sessions that are connected to the specified database.
<username> # Abort all sessions that are associated with the specified user.
Outputs: The sessions that are being aborted will be listed.
Aborting session: 16129
Aborting session: 16344
Aborting session: 20715
nz_altered_tables
Usage: nz_altered_tables [-groom]
Purpose: Display information about all versioned (ALTER'ed) tables.
Versioned tables come about as a result of doing an
ALTER TABLE <tablename> [ADD|DROP] COLUMN ...
This results in multiple data stores for the table. When you go to query
the table, NPS must recombine the separate data stores back into a single
entity. This action will be performed automatically and on-the-fly. But
it does result in additional query overhead. Therefore, it is a best
practice to reconstitute the table as soon as practical by doing a
GROOM TABLE <tablename> VERSIONS;
Notes: The maximum number of table versions allowed is 4 (which means you
can perform at most three ALTER TABLE commands before doing a GROOM,
since this number includes the original table version itself).
Versioned tables are new as of NPS release 6.0.
When using the nz_db_size script, any such tables will be flagged
as being '(Versioned)' in the output.
Inputs: -groom
Include this switch if you want the script to automatically issue a
"GROOM TABLE <tablename> VERSIONS;" against each of the altered tables
that it finds.
The records from the older table version(s) are read + rewritten --
appending them to the most recent table version (which is based on the
current DDL/layout for the table).
Any logically deleted records that can be GROOM'ed (removed) from the older
table versions will be ... as the data is being rewritten.
The most recent table version is not touched (its data records do not need to
be groomed/reformatted). Thus, if it has any logically deleted records none
of them will be touched (groom'ed out of the table) at this time.
After the GROOM completes, NPS automatically performs a GENERATE STATISTICS
across the entire table.
Outputs: A report such as the following will be produced.
# Of Versioned Tables 4
Total # Of Versions 11
Database | Table Name | Size (Bytes) | # Of Versions
----------+-----------------+----------------------+---------------
CUST_DB | AN_EMPTY__TABLE | 0 | 2
CUST_DB | CUSTOMER_LIST | 2,452,881,408 | 4
TEST_DB | TABLE1 | 131,072 | 2
TEST_DB | TABLE2 | 1,310,720 | 3
(4 rows)
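As a (hypothetical) illustration, the following sequence creates a versioned table and
then reconstitutes it, after which this script would no longer report on it:
    create table t1 (c1 int);            -- the original table (version 1)
    alter table t1 add column c2 int;    -- the table now has 2 versions
    groom table t1 versions;             -- recombine the versions into a single data store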
nz_backup_size_estimate
Usage: nz_backup_size_estimate -db <database> [-start <txid>] [-end <txid>]
Purpose: To estimate the storage requirements of a '-differential' database backup.
When you run "nzbackup -differential" against a database, only the rows that
have been INSERTed or UPDATEd (since the last backup) need to be included in
the backup set. This script attempts to analyze the database to provide you
with an estimate as to how much data that would involve.
Options: -db <database> The name of the database to be analyzed. Required.
-start <txid> Optional argument. This represents the transaction
id when the database was last backed up. If not
specified, then its value will be looked up in the
catalogs.
-end <txid> Optional argument. This represents the transaction
id as of this point in time -- if you were to do a
backup now. If not specified, then the stable
transaction id will be used.
Outputs: A report such as the following will be produced.
Visible+Invisible | Current Rowcount | DELETE'd Rows | INSERT'ed Rows | Table Size | Table Objid | Table Name
------------------+------------------+------------------+------------------+------------------+------------------+------------
4,096 | 4,096 | 0 | 2,048 | 147,520 | 206725 | TABLE_1
21,504 | 20,480 | 512 | 20,480 | 774,336 | 206735 | TABLE_2
0 | 0 | 0 | 0 | 0 | 206745 | TABLE_3
Visible+Invisible: The total number of rows being stored in the table at this point in time. It
includes both active (visible) rows and obsolete (invisible) rows.
Current Rowcount: The total number of visible rows in the table, at this point in time.
DELETE'd Rows: How many rows have been DELETE'd (and/or updated) since the last backup.
INSERT'ed Rows: How many rows have been INSERT'ed (and/or update'd or nzload'ed) since the last backup.
Table Size: In bytes. Divide this number by 'Visible+Invisible' rows to get the average row size.
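For example, using the TABLE_2 row from the report above, the average row size works out to
    774,336 bytes / 21,504 rows  =  approximately 36 bytes per row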
nz_build_html_help_output
Usage: nz_build_html_help_output
Purpose: To generate HTML output that documents all of the individual nz_*** scripts.
Running this script will generate HTML output that is (basically) the online
help text for each of the individual scripts, nicely formatted using HTML
tags. Which may be the document you are reading right now!
Inputs: None
Outputs: The HTML help text is sent to standard out. You will probably want to
redirect it to a file.
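For example (the output filename shown here is arbitrary):
    $ nz_build_html_help_output > nz_help.html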
nz_catalog_diff
Usage: nz_catalog_diff [<old_rev> <new_rev>]
Purpose: Display any "diff"erences between two different versions of the NPS catalogs.
The catalogs are the collection of predefined
system tables
system views
management tables
management views
This script will highlight what tables/views have been added, changed, or
deleted between the two versions.
Note: The information used by this script was originally generated by the
nz_catalog_dump script.
Inputs: This script will compare the current version of the catalog to its prior
version. Or, you can pass it the two versions to be compared. The
version numbers currently supported are:
4.6 5.0 6.0 7.0 7.0.3 7.1 7.2
Outputs: A report, such as the following, will be produced. In this example, the
column "ARGUMENTS" has had its datatype definition changed between 4.0 and
4.5. A number of new columns were added in 4.5. And the view definition
itself (the SQL SELECT statement) has changed.
Objects that have changed
==================================================
_V_FUNCTION
------------------------------------------
4.0: ARGUMENTS CHARACTER VARYING(200)
4.5: ARGUMENTS CHARACTER VARYING(117)
4.5: BUILTIN CHARACTER VARYING(1)
4.5: DETERMINISTIC BOOLEAN
4.5: DISPLAY BOOLEAN
4.5: FUNCTIONSIGNATURE CHARACTER VARYING(117)
4.5: LOGMASK CHARACTER VARYING(11)
4.5: MEMORY TEXT
4.5: RETURNSNULLONNULLINPUT BOOLEAN
Note: The "View Definition:" has changed.
nz_catalog_dump
Usage: nz_catalog_dump
Purpose: To dump out (describe) the catalog definition of all system tables and views.
This includes the table/view definition of every
system table
system view
management table
management view
The output generated by this script is later used by the "nz_catalog_diff"
script (which allows you to display the differences between two different
versions of the NPS catalogs).
This script has already been run for you. The output file(s) are included
in the scripts directory and named "nps_catalog_?.?.txt". Thus, there is
generally no need for you to run this script yourself.
Inputs: None
Outputs: The "\d" description of each system table/view will be sent to standard out.
You will probably want to redirect that output to an appropriately named
file ... so that it can then be used by the nz_catalog_diff script.
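For example, on a 7.2 system the output could be captured using the same naming
convention as the bundled files:
    $ nz_catalog_dump > nps_catalog_7.2.txt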
nz_check_views
Usage: nz_check_views [ database ] [-replace <flag>] [-rebuild]
Purpose: Check each VIEW to make sure it is not obsolete (and in need of rebuilding).
This script operates by issuing the following SQL against each view
select count(*) from <viewname> limit 0;
The purpose of the "limit 0" is to
o parse the VIEW/query to make sure it is valid
o while keeping it from actually running (down on the SPUs)
What types of actions can cause a view to become obsolete and need rebuilding?
o ALTER TABLE ... RENAME
o ALTER COLUMN ... RENAME
o ALTER COLUMN ... MODIFY
o DROP TABLE ...
Inputs: The database name is optional. If a database is not specified, then the
views in all databases / schemas will be processed.
-replace <flag>
If this script finds a problem with a view, should it attempt to recreate
it (via a "create or replace view ..." statement)? Default: no/false
-rebuild
Don't bother checking the individual views ahead of time.
Just go ahead and rebuild all of them.
Outputs: If a problem with any of the views is found, the relevant information will
be sent to standard out and would look something like this
Database: TEST1
View: V2
ERROR: Base table/view 'V1' attr 'COL1' has changed (datatype); rebuild view 'V2'
An exit status of 0 means success. Anything else means ERROR's were encountered
nz_columns
Usage: nz_columns
Purpose: Create a table that will provide all column definitions across all databases.
A table called NZ_COLUMNS will be created in the SYSTEM database. Its
definition is as follows:
Attribute | Type
-----------------+--------------------------------
DATABASE_NAME | NATIONAL CHARACTER VARYING(128)
SCHEMA_NAME | NATIONAL CHARACTER VARYING(128) -- new column as of 7.0.3+
OBJECT_NAME | NATIONAL CHARACTER VARYING(128)
OBJECT_TYPE | CHARACTER VARYING(128)
COLUMN_NAME | NATIONAL CHARACTER VARYING(128)
COLUMN_TYPE | CHARACTER VARYING(128)
COLUMN_NUMBER | SMALLINT
OBJECT_ROWCOUNT | BIGINT
It will be populated with the column information for all
o TABLEs
o EXTERNAL TABLEs
o VIEWs
o MATERIALIZED VIEWs
for all databases.
Thus providing one easy-to-query object in case you wish to do any
analysis against it. This is a static table that is up-to-date
only as of the moment that it was created.
See Also: nz_inconsistent_data_types
Inputs: None
Outputs: Status information (such as the following) as it is building the
table for you.
17:18:42 Processing database 1 of 5: DEV
17:18:48 Processing database 2 of 5: DR
17:19:03 Processing database 3 of 5: PROD
17:19:41 Processing database 4 of 5: QA
17:20:02 Processing database 5 of 5: SYSTEM
CREATE TABLE
INSERT 0 31678
The table SYSTEM..NZ_COLUMNS has been created and is now ready to be queried.
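Once built, the table can be queried like any other. For example, a hypothetical query
to locate every object (in any database) that contains a column named CUSTOMER_ID --
assuming the stored names are in upper case:
    select database_name, object_name, object_type, column_type
    from   system..nz_columns
    where  column_name = 'CUSTOMER_ID'
    order  by 1, 2;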
nz_compress_old_files
Usage: nz_compress_old_files
Purpose: Compress (via gzip) old + unused files under the /nz directory.
It is useful as a general housekeeping script, and can be used at any time
to free up some space under /nz. It will not affect the running NPS system.
The following files are compressed:
core*
patch*.tar
nzinitsystem*.tar
nzupgradedb*.tar
as well as all log files under
/nz/kit.bak/log
Inputs: None
Outputs: Status messages will be sent to standard out.
nz_compressedTableRatio
Usage: nz_compressedTableRatio [database [table/mview ...]] [-size <nn>] [-verbose]
Purpose: Estimate the compression ratio of a table or materialized view.
As of NPS 6.0
If you have a versioned Table/Secure Table, as the result of doing an
ALTER TABLE <table> [ADD|DROP] COLUMN ...
this script will not run against the table until it has been groomed.
The system reports the amount of space used in terms of 128KB pages. For
small tables, this will tend to make the compression ratio meaningless.
(If a table is using just 5KB of storage on a particular dataslice, it
will be reported by this script as using 128KB). In general, the bigger
the table ... the more accurate the compression ratio will be.
Inputs: The database name is optional. If not specified then all databases / schemas
will be processed.
You may specify 1..many names (of tables or materialized views) to be
processed. If none are specified, then all tables and materialized views
in the specified database / schema will be processed.
-size <nn>
An optional argument. The bigger the table, the more accurate the
compression ratio. Conversely, the smaller the table the less accurate
the compression ratio. By including this switch you can skip smaller
tables entirely. If so specified, the table will only be processed if
it takes up at least <nn> MiB of storage (on average) per dataslice.
For example, on an IBM Netezza 1000-12, specifying "-size 1" would mean
that the table must be at least 92 MiB in size for it to be processed by
this script (92 dataslices * 1 MiB per dataslice). "-size 1" is a good
minimum to use.
On NPS versions 4.x and 5.x, the default for this script is "-size 0" ...
it will process all tables regardless of their size.
On NPS version 6.x +, the default for this script is "-size 1".
It will only process/report on tables that are of a minimal size.
-verbose
An optional argument. If included, additional stats will be displayed
about each table. For example:
# Of Rows: 6,451,338
Bytes/Row: 48 compressed
144 uncompressed
Outputs: A report such as the following.
....................................................................................
. The values below show the estimated size ratio of a compressed table to its .
. uncompressed form. An uncompressed table is approximately <ratio> times larger .
. than its compressed version. .
. .
. The 'Compressed Size' is the actual amount of storage being used by the table. .
. The 'Uncompressed Size' is an estimate based on mathematical calculations. .
....................................................................................
Database: TPCH
Table/MView Name Ratio Compressed Size Uncompressed Size Size Difference
================================ ===== =================== =================== ===================
CUSTOMER 1.21 24,035,438 29,083,956 5,048,518
LINEITEM 1.75 497,207,012 868,202,222 370,995,210
NATION 1.27 2,409 3,060 651
ORDERS 1.35 151,111,965 204,162,018 53,050,053
PART 1.29 24,077,989 31,170,510 7,092,521
PARTSUPP 1.23 113,506,073 139,996,884 26,490,811
REGION 1.10 595 656 61
SUPPLIER 1.24 1,400,636 1,736,129 335,493
================================ ===== =================== =================== ===================
Total For This Database 1.57 811,342,117 1,274,355,435 463,013,318
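As a worked example from the report above, the LINEITEM ratio is simply its estimated
uncompressed size divided by its actual (compressed) size:
    868,202,222 / 497,207,012  =  approximately 1.75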
nz_csv
Usage: nz_csv <database> -dir <dirname> [-limit <nn>] [-threads <nn>]
Purpose: To dump out the tables in a database in CSV format (comma separated values)
Inputs: <database>
The database name is required. By default, all of the tables in that database
will be processed. You can specify a subset of tables (and/or schemas) by
including additional command line options (see below).
-dir <dirname>
This field is required. Specify the name of the directory where you want the
output files written to.
-limit <nn>
An optional field. By default, all of the rows in the table will be dumped out.
You can use this option to limit the output (from each table) to no more than
<nn> rows. Especially helpful if you just want to produce a smaller sample of
data for testing purposes. Specify a number from 1 to 2 billion.
-threads <nn>
An optional field. By default, this script will process 8 tables at a time
(by launching 8 concurrent nzsql sessions). This script uses a simple SELECT
statement (rather than external tables) to extract the data. So, by
parallelizing this work, it helps to speed things along. Specify a number
from 1..31.
Outputs: For each table, one file (matching that tablename) is written out to the
specified directory. The filenames will have a .CSV extension appended
to them.
The CSV (comma separated value) format will be as follows
o A comma , will be the delimiter between columns
o If a column value is NULL, then nothing will be put out
e.g., it would appear thusly ,,
o A double quote " will be used to delimit all text strings
e.g., "This is a text string"
This applies to CHAR, VARCHAR, NCHAR and NVARCHAR datatypes
o A double quote " used within the body of a text string will
be doubled up
e.g., "I said ""Hello!"""
o Any instances of a binary <000> or <001> will be stripped from
the text strings.
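For example, a hypothetical invocation (the database and directory names are illustrative),
along with one possible row from the resulting CUSTOMER.CSV file:
    $ nz_csv PROD -dir /tmp/csv_out -limit 1000 -threads 4
    42,"Smith, John",,"He said ""Hello!""",2013-01-11
Here the third column is NULL (nothing between the commas), and the embedded double
quote has been doubled up.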
nz_db_tables_rowcount
Usage: nz_db_tables_rowcount [ database ]
Purpose: To display the tablename + rowcount for every table in a given database.
This number is NOT a table statistic -- it is the actual/current rowcount.
This script will perform a simple full table scan of each table to count
the number of rows. Likewise, it will also process each MView in the
database.
Inputs: The database name is optional.
If a database name is specified, the script will process just the
specified database / schema.
If the database name is not specified, then this script will process
all databases / all schemas. This will also include a test of the
"ZONE MAP TABLE ON SPU" (i.e., the _VT_SPU_ZMAP_INFO table).
Outputs: A report such as this will be produced.
Database SYSTEM
Rowcount Table Name
--------------- --------------------------------
0 AN_EMPTY_TABLE
1,048,576 CHAR_TEST
nz_db_tables_rowcount_statistic
Usage: nz_db_tables_rowcount_statistic [ database ]
Purpose: Display the tablename, rowcount, and storage size for all tables in a database.
This script is similar to "nz_db_tables_rowcount". But that script actually
performs a full table scan for every table to obtain its actual rowcount.
This script simply dumps out the rowcount STATISTIC that is stored in the
system catalogs. Thus, this script is infinitely faster. However the number
is, after all, "just a statistic" -- and is not guaranteed to be 100% accurate.
Inputs: The database name is optional.
If a database name is specified, the script will process just the
specified database / schema.
If the database name is not specified, then this script will process
all databases / all schemas.
Outputs: A report such as this will be produced.
Database Name : TPCH
# Of Tables : 8
# Of Rows : 8,661,245
# objid rowcount table size (bytes) TABLE NAME
------ --------- ----------------- ------------------ ----------
1 221976 150,000 29,071,424 CUSTOMER
2 221998 6,001,215 867,799,052 LINEITEM
3 221890 25 3,060 NATION
4 222036 1,500,000 204,137,484 ORDERS
5 221916 200,000 31,167,724 PART
6 221940 800,000 140,039,028 PARTSUPP
7 221904 5 656 REGION
8 221956 10,000 1,736,548 SUPPLIER
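A roughly equivalent ad-hoc query -- a sketch only, which assumes that the rowcount
statistic is exposed via the RELTUPLES column of the _V_TABLE catalog view -- would be:
    select tablename, reltuples from _v_table order by tablename;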
nz_db_views_rowcount
Usage: nz_db_views_rowcount [ database ]
Purpose: To display the rowcount + viewname for every view in a given database.
This number is NOT a statistic -- it is the actual/current rowcount.
This script will perform a "select count(*)" from each view to count
the number of rows.
Inputs: The database name is optional.
If a database name is specified, the script will process just the
specified database / schema.
If the database name is not specified, then this script will process
all databases / all schemas.
Outputs: A report such as this will be produced.
Database system
Rowcount View Name
--------------- --------------------------------
0 A_TEST_VIEW
1,048,576 EMPLOYEE_VIEW
nz_dimension_or_fact
Usage: nz_dimension_or_fact [database] <table>
Purpose: Identify whether a given table is a dimension table or a fact table.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: This script will return one of the following strings
dimension
fact
The logic? If a table is referenced by other tables -- i.e.,
other tables link to it via a FOREIGN KEY reference --
then it is most likely a dimension table.
Otherwise, the script will consider it a fact table.
An example
create table state_dimension (state_code int primary key);
NOTICE: primary key constraints not enforced
CREATE TABLE
create table customer_fact (customer_num int, state_code int,
foreign key (state_code) references state_dimension(state_code));
NOTICE: foreign key constraints not enforced
CREATE TABLE
$ nz_dimension_or_fact system state_dimension
dimension
$ nz_dimension_or_fact system customer_fact
fact
nz_dump_ext_table
Usage: nz_dump_ext_table <filename> [-blocks]
Purpose: Dump out the header info found at the start of a compressed external table/file.
The header provides information about the structure of the base table (upon
which the compressed external table is based).
o For each column, it will identify the data type and other attributes.
o For the table, it will identify the distribution type and column(s).
o For the system, it will identify the machine size (# of SPUs).
When reloading a compressed external table/file, the table you're loading it
into must be fundamentally similar in nature to the table that was dumped.
Otherwise, an error will be thrown and the reload will terminate.
Possible errors include
Error: Reload column count mismatch.
If the number of columns isn't the same
Error: Reload column type mismatch.
If the type of each column isn't the same
Error: Reload distribution algorithm mismatch.
If the distribution type has changed (from random to hash, or vice versa)
Error: Reload distribution key mismatch.
If the distribution key (column) has changed
Error: Reload distribution key count mismatch.
If the number of columns in the distribution key has changed
Note: If the machine size (# of dataslices) has changed this will not throw
an error. However, the decompression of the data being reloaded takes
place on the host, rather than happening in parallel on the SPUs/SBlades.
Which means it will take longer.
Inputs: <filename> The name of the file that is to be read. Its header information
will be parsed and dumped back out to the screen.
-blocks Optionally, dump out the dataslice # and compressed size of each
of the blocks in the external table/file. If the file is large,
this will take forever. ^C the script once you've seen enough.
Outputs: A report such as the following.
Note: A compressed external table/file does not care about the actual table
name or columns names -- so they are not included in the header file.
Thus, this script will simply label all columns as 'column_??'.
/*
External Table Header Info (for file '/tmp/my_data_file.ext')
==========================
version = 2
numSPUs = 216
rowIDFence = 5179837100002
numDistKeys = 1
distKey1 = 2
distKey2 = 0
distKey3 = 0
distKey4 = 0
distAlgorithm = 0
numFields = 4
*/
CREATE TABLE ?????
(
column_1 integer,
column_2 integer,
column_3 integer,
column_4 integer
)
DISTRIBUTE ON (column_3);
/* ------------------------------------------------------------
1 DSlice: 1 Compressed Block Size: 61585
2 DSlice: 2 Compressed Block Size: 61759
3 DSlice: 3 Compressed Block Size: 61954
4 DSlice: 1 Compressed Block Size: 61819
5 DSlice: 2 Compressed Block Size: 61568
6 DSlice: 3 Compressed Block Size: 61547
7 DSlice: 1 Compressed Block Size: 61844
8 DSlice: 2 Compressed Block Size: 61819
9 DSlice: 3 Compressed Block Size: 61600
10 DSlice: 1 Compressed Block Size: 61375
11 DSlice: 1 Compressed Block Size: 4505
12 DSlice: 2 Compressed Block Size: 61647
13 DSlice: 2 Compressed Block Size: 4848
14 DSlice: 3 Compressed Block Size: 61507
15 DSlice: 3 Compressed Block Size: 5111
------------------------------------------------------------ */
nz_find_control_chars_in_data
Usage: nz_find_control_chars_in_data [database] <table>
Purpose: Find any binary/non-printable control characters in a table's text columns.
This script will perform a full table scan, searching for any occurrences of
binary/non-printable control characters within the table's CHAR and VARCHAR
columns.
By definition, one can store any ASCII character <000>..<255> in a
CHAR/VARCHAR datatype. But what that might really indicate is garbage
in your original data source (e.g., one wouldn't expect to find a
<BEL>/<007> in an address field). This script can be used to hunt down
those odd records.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: Those records that contain binary/control characters in any of
their CHAR/VARCHAR columns will be selected and returned to
standard out. If no such records are found, the result set will
be empty. If the table being processed has no columns of type
CHAR/VARCHAR, the result set will be empty.
nz_find_non_integer_strings
Usage: nz_find_non_integer_strings [database] <table> <column>
Purpose: Find any non-integer characters in a table's text column.
This script will scan the specified table column (that is of type CHAR/VARCHAR)
and look for any non-numeric characters which would cause a conversion (a cast)
of that column, into an integer datatype, to fail.
Basically ... make sure that the column contains only the digits 0-9.
See also: nz_ddl_table_redesign
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table and column names are required.
Outputs: For any rows with issues, the rowid and column value will be displayed.
nz_format
Usage: nz_format [<filename>]
Purpose: To format a block of SQL to make it more readable.
Inputs: You can pass the SQL to be formatted as 'standard in'
-or-
You can pass the name of the file containing the SQL.
The stream/file can contain multiple SQL statements.
This script can also be used to extract + format the SQL in a *.pln/*.nz_plan
file. Just pass it to this script (as stdin or as a file). No need to extract it.
Outputs: For example, SQL that looks like this
select name, address from customer_list order by name group by address;
Will be nicely formatted/indented to look like this
SELECT
name,
address
FROM
customer_list
ORDER BY
name
GROUP BY
address ;
nz_grep_views
Usage: nz_grep_views [-save] [-brief|-verbose] [-all|-sysobj|-usrobj] <search_string>
Purpose: Search all views, looking for any matches against the specified <search_string>.
This search will be performed against all views in all databases.
Why? Maybe you want to find any views that were based on a particular synonym.
Maybe you want to find any views that used a particular function.
Because you can.
Inputs: <search_string>
Required argument. The string that will be grep'ed for. It will be treated
as a case insensitive search.
The search is NOT against the "View definition". Rather, it is against the
rewritten code that has already been parsed+analyzed+rewritten.
-save Optional. This script will create a temporary file
( /tmp/nz_grep_views.datafile ) that contains the views after
they have been parsed+analyzed+rewritten by the database engine.
Creating this file may take awhile. If you are going to be
running this script repeatedly, using the optional "-save" switch
will save you some time.
When the script starts, if the file already exists, it will be reused.
When the script completes, the file will be left out there (rather
than being deleted to clean things up)
-brief Optional. The "-brief" switch will list only the matching view
-verbose names. The "-verbose" switch will include the parsed+analyzed+
rewritten view logic itself -- which can be quite lengthy, and
isn't meaningful to the average user. Default: -brief
-all Optional. When the temporary file is created, you can limit it
-sysobj to just system views ("-sysobj") or just user views ("-usrobj")
-usrobj or all views ("-all"). The default is for it to contain
information on just user views ("-usrobj").
Examples: Let's say I have a synonym called THE_TABLE. I can search for any view that
contains that identifier by doing
$ nz_grep_views the_table
But this might turn up many false positives ... as this is just a simple string
match via grep. In this case, I can tailor the search ... because I know that
the parsed+analyzed+rewritten view logic is going to precede the object/relation
name with the tag "relname ". Thus, I could use a more exact search such as this
$ nz_grep_views "relname the_table "
Now ... let's say I want to find any view that makes use of the builtin function
TIMETZ_LARGER. First, I need to get the object ID associated with that function.
$ nzsql -c "select objid from _v_function where function = 'TIMETZ_LARGER';"
This happens to return the value 1379 (in NPS, all objects -- including all
functions -- have a unique ID associated with them). Now that I know this value,
I can invoke the script with an appropriate search string
$ nz_grep_views "funcid 1379 "
Outputs: Sample report.
$ nz_grep_views yadayada
Scanning database: PRODUCTION
Scanning database: PRODUCTION_BACKUP
Scanning database: QA
Scanning database: QA_BACKUP
Scanning database: SYSTEM
Matches
=======
PRODUCTION..ORDERS_BY_ACCOUNT
PRODUCTION..ORDERS_BY_CUSTOMER
PRODUCTION..ORDERS_BY_DATE
QA..TEST_1
QA_BACKUP..TEST_1
nz_groom
Usage: nz_groom [dbname [tablename <...>]] [ optional args ]
Purpose: A wrapper around the 'groom' command to provide additional functionality.
nz_reclaim is automatically invoked when running NPS 4.x and 5.x
nz_groom is automatically invoked when running NPS 6.x +
For this script to function, you must have SELECT access to the following objects:
_T_BACKUP_HISTORY
_VT_HOSTTXMGR
Inputs: The database name is optional. If not specified then this script will
process all databases / schemas / tables. Otherwise, just the specified
database / schema will be acted upon.
The tablename is optional. If not specified then all tables in the
database / schema will be processed. Or, you can specify one (or many)
tablenames to be processed.
Specify one of the following options (if none are specified, the default will
be "-scan" ... in which case the table is simply scanned + reported on. No
changes will be made to the table.)
< -scan | -pages | -records | -records all | -records ready | -scan+records >
-scan This will report on the amount of space that would be reclaimed IF
you were to do a
"groom table <name> records all;"
Gathering this information simply involves a full table scan.
No actual GROOMing of the table takes place during this step.
-pages Run a page based groom against the entire table.
Any empty pages (128KB) will be removed from the scan list (although
they will still exist in the table). Any empty extents (3MB) will
be removed from the table.
Where "empty" means that the page/extent only contains deleted rows,
and those rows are currently reclaimable -- i.e. not of interest.
"groom table <name> pages all;"
The goal here is to remove as many rows as possible ... using the
fastest method possible.
-records Run the actual "groom table <name> records all;" against the
table. This option will, by far, take the longest to run as
it basically involves a read+rewrite of the entire table.
Therefore, when you choose this option, the script will
automatically do the following:
a) First perform the "-pages" groom operation, since it is faster
to throw away pages/extents of data, than individual records.
But for small tables it is not economical to perform both a
page level groom and a record level groom (in step c). So
the page level groom will be bypassed for any small table
(where the table size is < 30MB * NumberOfDataslices)
b) Then perform the "-scan" operation to see if there are
any rows that actually need reclaiming.
Based on those results, the script will then decide
whether it needs to proceed with the next step ... the
actual record level grooming. It will be skipped if it
would not be of any benefit, or if it does not meet at
least one of the specified thresholds (see below).
By default, if a table is processed (scanned) it will be
listed in the output. The tables that are groom worthy
are highlighted with a "*" in the first column. If you
want the output listing to show just these tables (and
exclude the ones that do not make the cutoff) include the
"-brief" option.
c) IF, and only IF, a record level groom is warranted it will
then be run against the table. The table must have at least 1
reclaimable row. Beyond that, the table must meet/exceed any
of the thresholds that you may have specified via this script.
-scan+records Just perform steps "b" (the scan) and "c" (the record level groom)
from above. The page level groom is bypassed.
-records <all|ready>
There may be times when you want to FORCE a record level groom
to be performed (even if there are no logically deleted rows in
the table to be reclaimed). For example:
o) To rewrite a table, after you have enabled compression,
so that it will now be compressed
o) To "organize" the rows within a clustered base table.
This option allows you to FORCE the record level groom (of your
choosing) to be performed. Note that this script will skip the
initial page level groom, and the subsequent scan of each table ...
as that just adds additional overhead, and it won't make any
difference (since the script is always going to end up doing the
record level groom that you requested anyway).
The following switches are optional and can be used in any combination. If multiple
conditions are specified, they will be OR'ed together. If a table does not meet at
least one of these thresholds it will be skipped.
-rows <nnn> # Example: -rows 1000000
# Only process tables with >= 1M rows (based on the statistic rowcount
# value that is maintained for the table)
# And then, only do a groom "-records" if it has >= 1M reclaimable rows
-size <nnn> # Example: -size 1000000000
# Only process tables taking up >= 1GB of storage
# And then, only do a groom "-records" if it has >= 1 GB of reclaimable space
-percent <nnn> # Example: -percent 50
# Only do a groom "-records" if >= 50% of the rows in the table are reclaimable
If no thresholds are specified, then:
o All non-empty tables are processed
o Empty tables (those with 0 bytes of allocated storage) are always bypassed by this script
o When specifying "-records"
the groom PAGES operation is skipped if the table size is < 30MB * NumberOfDataslices
the groom RECORDS operation is skipped if there are 0 reclaimable records
Additional options that may be specified include:
-mview If a table has one (or more) active materialized views associated
with it, the table cannot be groom'ed until any such MView has been
suspended. So, by default, this script will skip over those tables.
Add this switch to the command line if you want the script to
o Automatically suspend any such materialized views
o Perform the GROOM operation
o Refresh the materialized views upon completion
This option is not relevant if only a "-scan" is being performed.
-version If you have a versioned table (which is the result of doing an
'ALTER TABLE <table> [ADD|DROP] COLUMN ...' operation against
it), then the ONLY groom operation that is allowed against the
table is a GROOM VERSIONS. By default, this script will skip
over any such table.
Add this switch to the command line if you want the script to
automatically do a GROOM VERSIONS against such tables.
This option is not relevant if only a "-scan" is being performed.
-backupset <[backupsetid | NONE]>
This switch will allow you to override the default backupset. If
used, this information is simply passed along to the "groom"
command for it to handle.
The system synchronizes your groom request with the most recent backup
set to avoid reclaiming rows not yet captured by incremental backups.
In other words, groom will not remove any unbackedup data that has been
deleted or updated.
In addition, groom will not remove any data that an old/outstanding (yet
still active) transaction might still have an interest in.
So ... there may very well be logically deleted rows in a table ... that
groom is not willing to physically delete at this point in time (for the
above mentioned reasons). The scripts 'nz_invisible' and 'nz_transactions'
may be useful in such a situation.
For additional information, refer to the "System Administrator's Guide"
-brief By default, every table that is processed during the "-scan" phase will
be included in the output listing, because
o) This allows you to more closely monitor the progress of the script
o) More information is better than less information
The tables that are considered "groom worthy" (based on the thresholds
in effect) will be highlighted with a "*" in the first column. It is
just these tables that will have a "groom records" operation performed
against them.
If you want to restrict the output to just these tables (excluding any
table that does not make the cutoff) then include the "-brief" option.
Notes: If an unexpected error is encountered at any point in time, this script
will echo the error message and then exit. It will not continue processing
other tables if any problem is encountered.
The progress/status of ongoing GROOM operations can be monitored via the
system view _V_GROOM_STATUS
Outputs: For the default "-scan" option, a report such as this will be produced.
The "Size"s are estimates (based on each table's average row size).
The "Remaining Rows/Remaining Size" represents what would remain in the
table once the actual "groom table <name> records all;" is processed.
This includes visible rows -- as well as deleted rows that cannot yet
be groom'ed (for whatever reason).
The "NON-Groomable Rows" column represents the number of deleted rows that
are NOT eligible for grooming at this point in time, which may be due to
o an outstanding transaction that might have an interest in them
o the rows have not yet been captured in the most recent backup set
* Name Remaining Rows Remaining Size Reclaimable Rows Reclaimable Size NON-Groomable Rows
- -------- -------------- -------------- ---------------- ---------------- ------------------
* CUSTOMER 150,000 23,270,464 8,000 1,240,000 0
* LINEITEM 6,001,215 293,753,328 450,123 21,605,904 0
NATION 25 3,276,800 0 0 0
ORDERS 1,500,000 84,410,368 0 0 1,500,000
* PART 200,000 12,117,248 200,000 12,000,000 0
PARTSUPP 800,000 72,351,744 0 0 999
* REGION 5 320 10,235 655,040 0
* SUPPLIER 10,000 12,057,419 1 1,205 0
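For example, a hypothetical invocation that performs a record level groom of the tables
in the PROD database -- but only for those tables in which at least half of the rows are
reclaimable, and listing only those tables in the output -- might be:
    $ nz_groom PROD -records -percent 50 -brief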
nz_inconsistent_data_types
Usage: nz_inconsistent_data_types
Purpose: List column names whose datatype is NOT consistently used from table to table.
This script will list any column names where the data type associated with
that column differs from table to table, or from database to database.
This is a subset of the nz_best_practices script, which includes a report
of the "Inconsistent Data Type Usage" (amongst other things).
But the nz_best_practices script only compares column names + data types
within a single database. This script compares them across ALL databases.
See Also: nz_columns
Inputs: None
Outputs: Said report.
nz_invisible
Usage: nz_invisible [database] <table> [-backupset <backupsetid|NONE>]
Purpose: Provide information about the number of visible and INVISIBLE rows in a table.
The transaction isolation level that the database uses is SERIALIZABLE. Your
transaction is isolated from any action by any other concurrently executing
transaction. All of this is efficiently accomplished on the NPS system thru
the use of per-row transaction ID's. You are protected from
o Lost Updates
o Uncommitted Data
o Inconsistent Data
o Phantom Inserts
If an nzload is going on ... how many new rows have been inserted?
If an update is taking place ... how many rows have been updated thus far?
This script will allow you to see what effect other transactions are having
on a table ... while those transactions are still ongoing.
For this script to function, you must have SELECT access to the following
objects:
_T_BACKUP_HISTORY
_VT_DBOS_CONNECTION
_VT_HOSTTX_INVISIBLE
_VT_HOSTTXMGR
Note: In NPS 6.0 the storage manager was redesigned. As data is added to
a table (via nzload/INSERTs/UPDATEs) the table will grow and additional
extents will be added to it as necessary. But until the transaction has
been committed, those extents will not be permanently attached to the
table. Which has two effects: (1) If the transaction is rolled back,
the rollback operation may be near instantaneous as it may be a simple
matter of releasing those extents back to the general storage pool.
(2) This script (the query it invokes) may not be able to see/count
those additional rows (until the other transaction is committed and the
extents are permanently attached to the table).
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. A full table scan will be performed against
this table.
-backupset <[backupsetid | NONE]>
This script reports on both the # of DELETE'd and RECLAIMABLE rows.
Those two numbers can be different, because rows cannot be reclaimed if
o an outstanding transaction might have an interest in them
o the rows have not yet been captured in the most recent backup set
This option allows you to override the default backupset that is used to
indicate which rows are reclaimable.
Outputs: A report such as this will be produced.
This information will be based upon your 'transaction invisibility list'
which is comprised of the following outstanding transaction ID's:
(12050,12090,13410)
--------------------------------------------------------------------------------------------------------
Row Count Percentage Comment DB1..MY_TABLE
========= ========== ================================================================================
5,387,699 100.00 PHYSICAL The total # of rows in the table
--------- ---------- --------------------------------------------------------------------------------
3,356,576 62.30 VISIBLE The # of visible rows. Basically a COUNT(*)
1,824,304 33.86 DELETE'd The # of DELETE'd rows
131,819 2.45 INVISIBLE Additions Rows added, but not yet committed
75,000 1.39 INVISIBLE Added + Deleted Rows added + deleted within the same transaction
========= ========== ================================================================================
1,077,440 20.00 Reclaimable The # of DELETE'd rows that can be reclaimed
(which is based on the backupset's stableTXid)
6,880 .13 Invisible Deletions The # of VISIBLE rows that have been deleted,
but not yet committed
--------------------------------------------------------------------------------------------------------
Where
PHYSICAL is the total number of rows that exist in the table. The SUM of
VISIBLE + DELETE'd + INVISIBLE Additions + INVISIBLE Added + Deleted.
VISIBLE represents the number of rows that a normal query would be able
to see at this moment in time (e.g., a "select count(*) from <table>;")
DELETE'd is the number of deleted (i.e., unused) rows in the table.
INVISIBLE Additions are rows newly added to the table ... that have
not yet been committed. This includes
o rows being nzload'ed
o rows being INSERT'ed
o rows being UPDATE'd ( because an update = delete + insert )
INVISIBLE Added + Deleted are rows which a single transaction was
responsible for both adding them to the table and then removing them ...
all in the same transaction. They have not been committed yet. But
they'll probably never be visible because they have already been deleted.
Reclaimable shows how many of the logically "DELETE'd" rows can be
physically reclaimed/groomed in order to free up that storage. This
number is not included in the "PHYSICAL" total (since the DELETE'd
number was instead). This number will be <= the DELETE'd number.
If there are a large number of reclaimable rows, you might want to
consider reclaiming/grooming the table at some point in time.
"INVISIBLE Deletions" are rows that will be deleted from this table ...
once the corresponding transaction has been committed. This includes
o rows being DELETE'd
o rows being UPDATE'd ( because an update = delete + insert )
This number is not included in the "PHYSICAL" total, since it simply
represents how many of the VISIBLE rows are marked for deletion.
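For example, a hypothetical two-session scenario: leave a transaction open in one nzsql
session, then run this script from a second session to observe the effect the open
transaction is having on the table (subject to the NPS 6.0 storage manager caveat noted
above). The staging table name is illustrative.
    -- Session 1 (left uncommitted)
    begin;
    insert into my_table select * from my_staging_table;
    -- Session 2 (a separate terminal)
    $ nz_invisible db1 my_table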
nz_load4
Usage: nz_load4
Purpose: To speed up the loading of a single file by using multiple nzload jobs.
The parsing of data, by nzload, can be very cpu intensive. This work
happens on the NPS host on a single cpu core. The idea here is to
split up the input file, on the fly, into multiple data streams so that
they can be parsed by multiple nzload processes running in parallel ...
thereby increasing the overall load speed of any single file.
Inputs: Standard nzload command line options. What you pass to this script is
(for the most part) what it will pass along to the actual nzload program
when it gets invoked.
Additional/Optional Options
-threads <n>
As the name of this script implies, the input file will be split up
and processed by 4 separate data streams (by default) ... in order
to make optimal use of the SMP host ... in order to speed things up.
You can specify the number of streams/threads that you want to use,
from 1..31.
Outputs: Whatever nzload reports (times the number of nzload jobs that get launched).
The exit status from this script will (basically) be the same as nzload.
0 = success
1 = failure
2 = success (but some rows were rejected because of errors)
However, this script is running 'n' instances of nzload. Which means:
Some of the nzload jobs can succeed, while others might fail.
But only a single exit status can be returned. (If there is a
'mixed bag' of results, then the ranking will be 1 then 2 then 0.)
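For example, a hypothetical invocation (the database, table, data file, and delimiter are
illustrative; everything other than -threads is simply passed along to nzload):
    $ nz_load4 -threads 8 -db PROD -t CUSTOMER -df /tmp/customer.dat -delim '|'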
nz_load_files
Usage: nz_load_files -dir <dirname> [ optional args ]
Purpose: Utility script to assist with loading many data files into a table (or tables).
The parsing of data, by nzload, can be very cpu intensive. This work
happens on the NPS host on a single cpu core.
If you have many different data files that need to be loaded, this
script will launch + control multiple nzload jobs running in parallel ...
with each nzload job processing a single data file ... thereby
decreasing the overall time it takes to complete the work.
By default, the script processes (loads) 8 files at a time. The
files are processed in order based on their file size (largest first).
The script will process ascii files. It will also process compressed
files with any of the following extensions:
*.gz *.GZ *.zip *.ZIP
Any such file will automatically be decompressed instream (via zcat)
with the data being piped directly into nzload.
Inputs: -dir <dirname>
This field is required. Specify the name of the directory that
contains the data files to be loaded. Each + every file therein
will be processed. The script processes only the files in that
directory, it does not descend into any subdirectories.
Additional/Optional Options
-threads <n>
By default, this script will launch + control 8 concurrent nzload
jobs. You can specify the number of streams/threads you want to
use, from 1..31.
Specifying 1 is perfectly valid ... in which case each file is
processed in turn, one at a time. You won't benefit from
concurrency, but the script can still be used to automate +
control + time the work that it does (on your behalf).
Standard nzload command line options. What you pass to this script is
(for the most part) what it will pass along to the actual nzload program
when it gets invoked. With the following caveats:
-t <table>
Specifying this option will cause all of the files (in the
"-dir <dirname>") to be loaded into the named table. This
is useful when the filenames have no relation to the table
name ... and you just need to load a bunch of files into a
single table.
Otherwise ...
If you do NOT specify this option, the script will attempt to
deduce the name of the table based on the name of the file.
The filename is stripped of anything after the first period,
and that is used as the basis for the table name.
For example, all of the following files would be loaded into
the 'CUSTOMER' table:
customer.1
Customer.2
CUSTOMER.3.gz
CUSTOMER.data.2013_01_11
customer.TBL.gz
-filter <cmd>
Do you want to pre-process the data files before passing them into
nzload? You can tie in your own custom filter here. For example:
-filter "dd conv=ucase 2>/dev/null"
# Convert all of the text to upper case
-filter /tmp/my_custom_script
# Do whatever (the data will be passed in to/out of your script
# via stdin/stdout)
-filter "sed -r -e 's/([0-9]{2}):([0-9]{2}):([0-9]{2}):([0-9]{1})/\1:\2:\3.\4/g'"
# Using sed to fix a timestamp so that it can be loaded
# This pattern 10:11:12:123456
# becomes this 10:11:12.123456
Do wrap your filter cmd in double quotes so it will be evaluated correctly.
-df <filename>
This option is NOT allowed. The script is going to process ALL
of the datafiles that it finds (in the "-dir <dirname>" that you
specified). Thus, you don't identify the individual files.
-lf <filename>
-bf <filename>
-outputDir <dir>
These options are NOT allowed. Rather, the script will tell
nzload to use <FILENAME>.nzlog and <FILENAME>.nzbad for any files
that it creates. All of these files will be placed in a log
directory under /tmp.
Outputs: The exit status from this script will (basically) be the same as nzload.
0 = success
1 = failure
2 = success (but some rows were rejected because of errors)
However, this script will have run 'n' invocations of nzload.
Which means some of the nzload jobs can succeed, while others
might fail. But only a single exit status can be returned by
this script. If there is a 'mixed bag' of results, then the
ranking will be 1 then 2 then 0.
For each data file that is loaded, two lines of output are written:
STATE
=====
LOAD indicating the start of the nzload session
done indicating the completion of the nzload session
If the data file is not going to be loaded, because
there is 'no such table', or
it is an 'empty file'
then only a single line of output will be written, and the STATE
will show "skip".
Output will be written to the screen as the operations are performed
(so it is ordered by the "TIME" ... which results in an intermingling
of the "LOAD" and "done" messages for the various data files).
Output will also be written to a log file, where it is sorted based
on the "COUNTER" (a one-up number based on the "FILE SIZE"). Thus,
the "LOAD" and "done" lines will be paired together.
The "STATUS" message can be one of
(Success) The nzload was successful
(Partial Success) Some errors were encountered in the data file,
but nzload did not hit the "-maxErrors <n>"
threshold, so it loaded all the data that it could.
(Failure) Too many data errors, so the load was rolled back.
For a successful (or partially successful) nzload, the STATUS will also
show the number of rows that were nzload'ed into the table.
Sample output follows:
TIME COUNTER STATE TABLE NAME FILE NAME FILE SIZE STATUS
======== ========= ===== ==================== ==================== ========== =========
18:20:54 1 of 15 LOAD LINEITEM lineitem.tbl.gz 215248959
18:23:07 1 of 15 done LINEITEM lineitem.tbl.gz 215248959 (Success) 6001215
18:20:54 2 of 15 LOAD LINEITEM lineitem.test 111612333
18:20:55 2 of 15 done LINEITEM lineitem.test 111612333 (Failure)
18:20:55 3 of 15 LOAD ORDERS orders.tbl.gz 46113124
18:22:08 3 of 15 done ORDERS orders.tbl.gz 46113124 (Success) 1500000
18:20:55 4 of 15 LOAD PARTSUPP partsupp.tbl.gz 28024343
18:21:56 4 of 15 done PARTSUPP partsupp.tbl.gz 28024343 (Success) 800000
18:20:56 5 of 15 LOAD CUSTOMER Customer.2 19083406
18:21:16 5 of 15 done CUSTOMER Customer.2 19083406 (Success) 118412
18:20:56 6 of 15 LOAD CUSTOMER customer.1 16095258
18:21:14 6 of 15 done CUSTOMER customer.1 16095258 (Success) 100000
18:20:56 7 of 15 LOAD CUSTOMER customer.TBL.gz 8939370
18:21:22 7 of 15 done CUSTOMER customer.TBL.gz 8939370 (Partial Success) 150000
18:20:57 8 of 15 LOAD CUSTOMER CUSTOMER.data.2013_01_11 6760023
18:21:06 8 of 15 done CUSTOMER CUSTOMER.data.2013_01_11 6760023 (Success) 42001
18:20:57 9 of 15 LOAD CUSTOMER CUSTOMER.3.gz 5957590
18:21:16 9 of 15 done CUSTOMER CUSTOMER.3.gz 5957590 (Success) 99999
18:21:06 10 of 15 LOAD PART part.tbl.gz 5696082
18:21:28 10 of 15 done PART part.tbl.gz 5696082 (Success) 200000
18:21:14 11 of 15 skip /* no such table */ foo.bar 1234567 (no such table)
18:21:14 12 of 15 LOAD SUPPLIER supplier.tbl.gz 560229
18:21:18 12 of 15 done SUPPLIER supplier.tbl.gz 560229 (Success) 10000
18:21:17 13 of 15 LOAD NATION nation.tbl.gz 935
18:21:19 13 of 15 done NATION nation.tbl.gz 935 (Partial Success) 25
18:21:17 14 of 15 LOAD REGION region.tbl 391
18:21:20 14 of 15 done REGION region.tbl 391 (Success) 5
18:21:18 15 of 15 skip REGION region.stage 0 (empty file)
All nzload jobs have completed.
Data directory: /SAN1/landing_area/test_data
Log directory: /tmp/nz_load_files.test_data.2013_01_11_182054
Script logfile: /tmp/nz_load_files.test_data.2013_01_11_182054/script.output
Total Rows Loaded: 9021657
Elapsed Seconds: 133
Count Load Status
======= ===============
1 empty file
1 Failure
1 no such table
2 Partial Success
10 Success
nz_lock
Usage: nz_lock <database> <table> [timeout]
Purpose: Check to see if an exclusive lock can be obtained on a table.
If so -- then that would mean that no one else is currently accessing it.
An exclusive lock is automatically issued any time you invoke one of the
following commands:
alter table
drop table
truncate table
nzreclaim
If you issue one of those commands while any other job still has an
"outstanding interest" in the table, then your command will pend until it
can be satisfied. And while your exclusive lock is still outstanding, any
new attempts to access the table will be queued up behind it.
This script tests to see if an exclusive lock can be obtained. If not,
then it will automatically timeout. Your batch/ETL job could then send an
alert or try again later ... without leaving the exclusive lock pending.
Inputs: The database/table that you want to test.
The timeout value is optional. The default value is 5 seconds.
Allowable range is 1..300 seconds.
Outputs: This script simply exits with an appropriate status code.
0 indicates success, anything else is an error.
Your script might invoke this script in a manner such as this
while [ `nz_lock my_database my_table ; echo $?` != "0" ]; do
echo "unable to obtain the lock"
sleep 600 # Wait 10 minutes and try again
done
echo "the table is now ready for use"
nz_maintenance_mode
Usage: nz_maintenance_mode [ -on | -off [<database> ...]]
Purpose: Disable user access to the server, or to just the specific database(s).
Note: This does not apply to the 'ADMIN' account.
This script does not kick current/active users off of the system. It
simply blocks users from establishing any new connections.
Inputs: Optional. If no arguments are passed, the script will simply display the
report below ... indicating which database(s) do/don't allow connections.
-on Turn maintenance mode on (disable access)
-off Turn maintenance mode off (re-enable access)
The database name is optional. If not specified, then access will be
enabled/disabled to ALL databases. Otherwise, you can specify the
specific database name(s) that you want this operation to apply to.
Outputs: A report, such as the following, will be displayed.
Database | Connections Allowed ?
-------------------+-----------------------
MY_DB | True
PRODUCTION | True
PRODUCTION_BACKUP | False
QA | True
QA_BACKUP | False
SYSTEM | True
(6 rows)
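Sample usage (using the database names from the report above; substitute your own):
    $ nz_maintenance_mode -on PRODUCTION_BACKUP QA_BACKUP    # disable new connections to just these two databases
    $ nz_maintenance_mode -off                               # re-enable access to ALL databases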
nz_pause
Usage: nz_pause [ -v ]
Purpose: An alternative to "nzsystem pause".
This script waits for an opportune time before it attempts to pause the
system (i.e., no activity) so that the system won't even begin to enter
the paused state while users are still using it.
Inputs: -v Provide verbose status messages (printed every 10
seconds) while the script is waiting for activity
to stop.
Outputs: Script will display the current system state upon completion.
nz_physical_table_layout
Usage: nz_physical_table_layout [database] <tablename>
Purpose: To list out a table's columns -- sorted by their PHYSICAL field ID.
The logical field ID -- or column attribute number -- is simply the order
in which you defined the columns via the CREATE TABLE statement.
The physical field ID governs the order in which the fields are actually
stored in the record out on the disk.
Now, you probably didn't even know that there was a difference. And in
general, it usually isn't relevant. But internally it is important ...
because it could have an effect on which columns get compressed or
zonemapped. (If you want to know more about zonemaps, the nz_zonemap
script is better suited for that purpose).
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: A report such as the following. If multiple columns fall within a
"Physical Col Group", then the "Logical Col #" breaks the tie.
$ nz_physical_table_layout tpch lineitem
Attribute | Type | Logical Col # | Physical Col Group
-----------------+-----------------------+---------------+--------------------
L_QUANTITY | NUMERIC(15,2) | 5 | 3
L_EXTENDEDPRICE | NUMERIC(15,2) | 6 | 3
L_DISCOUNT | NUMERIC(15,2) | 7 | 3
L_TAX | NUMERIC(15,2) | 8 | 3
L_ORDERKEY | INTEGER | 1 | 7
L_PARTKEY | INTEGER | 2 | 7
L_SUPPKEY | INTEGER | 3 | 7
L_LINENUMBER | INTEGER | 4 | 7
L_SHIPDATE | DATE | 11 | 7
L_COMMITDATE | DATE | 12 | 7
L_RECEIPTDATE | DATE | 13 | 7
L_RETURNFLAG | CHARACTER(1) | 9 | 11
L_LINESTATUS | CHARACTER(1) | 10 | 11
L_SHIPMODE | CHARACTER(10) | 15 | 20
L_SHIPINSTRUCT | CHARACTER(25) | 14 | 27
L_COMMENT | CHARACTER VARYING(44) | 16 | 27
(16 rows)
nz_reclaim
Usage: nz_reclaim [dbname [tablename <...>]] [ optional args ]
Purpose: A wrapper around the 'nzreclaim' utility to provide additional functionality.
nz_reclaim is automatically invoked when running NPS 4.x and 5.x
nz_groom is automatically invoked when running NPS 6.x +
Inputs: The database name is optional. If not specified then all databases will be
processed.
The tablename is optional. If not specified then all tables in the database
will be processed. Or, you can specify one (or many) tablenames to be
processed.
-scan The default option. This reports on the space reclaimable
by a record level nzreclaim (e.g., 'nzreclaim -scanRecords').
Basically, this involves a full table scan.
-blocks Run a block-based reclaim, removing empty blocks from the
start of the table (e.g., 'nzreclaim -blocks'). Once any
NON-empty blocks are found the reclaim will immediately
finish up and exit. Thus, the overhead for this option is
typically even less than that of a full table scan -- since
only "the start of the table" typically needs to be scanned.
-records Run a record-based reclaim (e.g., 'nzreclaim -records').
This option will, by far, take the longest to run as it
basically involves a read+rewrite of the entire table.
Therefore, when you choose this option, the script will
automatically do the following:
a) First perform the "-blocks" operation, since it is faster
to throw away blocks of data than individual records.
b) Then perform the "-scan" operation to see if there are
any rows that actually need reclaiming. If not, we'll
be able to skip the actual record level reclaim as that
would not be of any benefit.
c) IF, and only IF, a record level reclaim is warranted,
it will then be run against the table. (It must
have reclaimable rows, and meet/exceed any of the
following thresholds that you might specify).
The following switches are optional, and apply to the -scan/-records options.
They can be used in any combination to limit the output report to just those
tables that meet/exceed the specified thresholds. If multiple conditions are
specified, they will be OR'ed together.
If a table does not meet at least one of the thresholds it will not be
reported upon (the "-scan" option) nor will the script bother to do a record
level reclaim against it (the "-records" option).
-rows <nnn> # Example: -rows 1000000
# Only show tables with >= 1M reclaimable rows
-percent <nnn> # Example: -percent 50
# Only show tables where >= 50% of the rows are reclaimable
-size <nnn> # Example: -size 1000000000
# Only show tables with >= 1 GB of reclaimable space
If no thresholds are specified, then:
o All tables will be displayed (for purposes of the "-scan" option).
o Only tables with >= 1 reclaimable row will be reclaimed ("-records" option)
-backupset <[backupsetid | NONE]>
This switch will allow you to override the default backupset. If used, the
switch is simply passed thru to the nzreclaim utility for it to process.
The system synchronizes your nzreclaim request with the most recent backup set
to avoid reclaiming rows not yet captured by incremental backups. In other words,
nzreclaim will not remove any deleted or updated data that has not yet been backed up.
In addition, nzreclaim will not remove any data that an old/outstanding (yet
still active) transaction might still have an interest in.
So ... there may very well be logically deleted rows in a table ... that
nzreclaim is not willing to physically delete at this point in time (for the
above mentioned reasons). The scripts 'nz_invisible' and 'nz_transactions'
may be useful in such a situation.
For additional information, refer to the "System Administrator's Guide"
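Sample usage (the database name is illustrative):
    $ nz_reclaim prod_db -scan -rows 1000000        # report only on tables with >= 1M reclaimable rows
    $ nz_reclaim prod_db -records -percent 50       # record level reclaim of tables that are >= 50% reclaimable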
Notes: When nzreclaim runs, it requests an exclusive (unshared) lock on the table
being processed. This script will attempt to make sure that that lock can
be satisfied. If not, the table will be skipped over so as to not hold up
further processing. By default, this script will wait for up to 60 seconds
before giving up on the table. You can adjust the timeout value by setting
-timeout <nnn> # Where <nnn> is between 1 and 300 seconds.
If the 'nzreclaim' utility reports any errors at all, this script will echo
the error message and then exit. It will not continue if problems were
encountered.
Prior versions of this script attempted to estimate the amount of reclaimable
space by (basically) doing a simple SQL SELECT COUNT(*) statement against the
table. But that estimate would not necessarily be in agreement with nzreclaim.
So this script now uses "nzreclaim -scanRecords" instead.
Outputs: For the default "-scan" option, a report such as this will be produced
(which should look very similar to the output from nzreclaim itself)
Name Visible Rows Visible Size Reclaimable Rows Reclaimable Size
---------- ------------ ------------ ---------------- ----------------
CUSTOMER 150,000 23,270,464 8,000 1,240,000
LINEITEM 6,001,215 293,753,328 450,123 21,605,904
NATION 25 3,276,800 0 0
ORDERS 1,500,000 84,410,368 0 0
PART 200,000 12,117,248 200,000 12,000,000
PARTSUPP 800,000 72,351,744 0 0
REGION 5 320 10,235 655,040
SUPPLIER 10,000 12,057,419 1 1,205
nz_record_skew
Usage: nz_record_skew [ database [ table ]]
Purpose: To check the record skew for a table.
The following calculations will be performed
o The total number of rows in the table
o The MIN/AVG/MAX number of rows/SPU
o The skew ratio -- expressed as a range. A value of 1.000 - 1.000
would indicate no skew at all. In general, the larger the table,
the smaller you want the skew ratio to be.
This script performs a full table scan in order to make these calculations.
See also: "nz_skew", for an alternative (and faster) way to detect skew
Inputs: The database and table names are optional.
If the database name is not specified, then this script will process
all databases / all schemas.
If a database name is specified, the script will process just the
specified database / schema. If a table name is specified, the script
will process just the specified table.
Outputs: A report such as this will be produced.
Database: test (Sorted by Table Name)
Table | # SPUs | Row Count | Min # Rows | Avg # Rows | Max # Rows | SKEW RATIO
-------------+--------+------------+------------+------------+------------+--------------
empty_table | 0 | 0 | 0 | 0 | 0 | 0.000 - 0.000
large_table | 216 | 59,986,052 | 276,179 | 277,713 | 279,730 | 0.994 - 1.007
medium_table | 216 | 276,480 | 1,280 | 1,280 | 1,280 | 1.000 - 1.000
small_table | 2 | 2 | 1 | 1 | 1 | 1.000 - 1.000
nz_replay
Usage: nz_replay [-by_planid|-by_sessionid] [-dir <dirname>] [<database> <table>]
Purpose: Extract queries (from the query history) so they can be replayed/retested.
This script will extract all of the SELECT statements from a query history
table, and reformat them in such a way that they can later be replayed in
a controlled fashion.
Inputs: The name of the <database> and <table> which contain the query history
information that is to be processed. If not specified, then it defaults to
SYSTEM NZ_QUERY_HISTORY
which can be populated via the nz_query_history script.
Since your NZ_QUERY_HISTORY table might contain many thousands of queries
spanning many months, you could create a new table ... with just a subset
of those records ... and then have the nz_replay script process that subset.
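For example, you might carve out that subset with a CTAS statement and then point
this script at it (a sketch only; the table name and database name are illustrative,
and the filter will depend on how your history table was populated):
    CREATE TABLE NZ_QUERY_HISTORY_SUBSET AS
        SELECT * FROM NZ_QUERY_HISTORY WHERE <your filter criteria>;
    $ nz_replay MY_DATABASE NZ_QUERY_HISTORY_SUBSET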
-by_planid By default, the SQL for each query plan will be written
out to its own separate *.sql file.
-by_sessionid Any given query session might consist of multiple SQL
statements, some of which are dependent on the ones that
came before (such as when CREATE TEMPORARY TABLE AS SELECT ...
statements are issued). Use this switch if you want the
*.sql files to be combined ... so that any given *.sql file
contains ALL of the sql statements pertaining to that session.
In which case, the SQL statements will be fired off one
after another ... with no time delay in between.
-dir <dirname> By default, the SQL for each query plan is extracted from
the <database> <table> that contains the query history
information. However, there is a maximum length to that
SQL ... that being either 32K or 64K (depending on which
version of NPS you are using). And, as this script needs
some work space itself, the SQL statement is actually
truncated at 60K characters.
With this switch, if the original *.pln files are still
available, the script will attempt to extract the original
SQL for each query from the corresponding *.pln file ...
which should thus keep it from getting truncated. E.g.,
-dir /nz/kit/log/planshist/current
Outputs: This script will create a "sql" subdirectory under your current working
directory, containing all of the individual *.sql files.
This directory will also include a "replay.sh" script that you can invoke.
Its purpose is to
o launch all of the queries
o in the order that they were originally launched
o at the same time/offset that they were originally launched
nz_replicate
Usage: nz_replicate -backup -dir <dirname> [ -database <database> ]
-or-
nz_replicate -restore -dir <dirname> [ -database <database> ] -npshost <hostname>
Purpose: A script to assist in replicating databases across two different NPS hosts.
In a disaster recovery scenario, one database may be replicated across two
NPS hosts. One serves as the primary host (on which nzbackup's will be
performed). The other serves as the secondary host (on which nzrestore's
will be performed). The database on the secondary host is available for query
access, but is locked to prevent any alterations to it (except for those made
by GROOM and GENERATE STATISTICS).
This script is used to automate the INCREMENTAL backups and restores. You
must manually perform the first full backup and restore of each database
yourself, using commands such as these:
nzbackup -dir <dirname> -db <database>
nzrestore -dir <dirname> -db <database> -npshost <hostname> -lockdb true -increment 1
Basically, this script is just a wrapper ... which controls/invokes other
scripts, which themselves do the bulk of the work. Those scripts are listed
here, and are invoked in this order. If a particular script/file does not exist,
it will simply be ignored.
These scripts will be run on your primary (master) host -- to make the backup
nz_replicate.backup_starting # Invoked once
nz_replicate.pre_backup_cmd # Invoked for each database being replicated
nz_replicate.backup_cmd # Invoked for each database being replicated
nz_replicate.post_backup_cmd # Invoked for each database being replicated
nz_replicate.backup_ending # Invoked once
These scripts will be run on your secondary (DR) host -- to restore the backup
nz_replicate.restore_starting # Invoked once
nz_replicate.pre_restore_cmd # Invoked for each database being replicated
nz_replicate.restore_cmd # Invoked for each database being replicated
nz_replicate.post_restore_cmd # Invoked for each database being replicated
nz_replicate.restore_ending # Invoked once
Any fatal error will cause the script to exit at that point. If a non-fatal
error is hit, the script will display what is known about the problem ... and continue
on. An example of a non-fatal error is nzrestore not being able to create a view in
the replicated database.
Options: -backup This is the primary (master) host ... perform an nzbackup
-restore This is the secondary (DR) host ... perform an nzrestore
-dir <dirname> The full path to the directory in which the data
files will be written to (or read from).
-npshost <hostname> If doing a "-restore", you must specify the name of the
primary host on which the original database resides.
That name should match whatever is in the file
/nz/data/config/backupHostname.txt
on the primary host.
-database <database> Optional argument. This specifies the name of the
database to be backed up/restored. If not specified,
then a list of database name(s) will be read from
the control file 'nz_replicate.databases'
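A minimal sketch of such a control file, assuming it simply lists one database
name per line (the database names are illustrative):
    INVENTORY
    SALES
    FINANCE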
Outputs: Status/log/timing information will be sent to standard out ... and will
include information about any ERROR's that might be encountered. Sample
output is shown below.
Exit status: 0 = success, non-0 = fatal errors were encountered
--------------------------------------------------------------------------------
nz_replicate -dir /SNAP/BACKUPS -database INVENTORY -backup
SCRIPT: nz_replicate.pre_backup_cmd
Database: INVENTORY
Started: 2007-12-20 14:30:53
Checking each table by running a 'count(*)' against it ...
Finished: 2007-12-20 14:30:54
Seconds: 1
SCRIPT: nz_replicate.backup_cmd
Database: INVENTORY
Started: 2007-12-20 14:30:54
Backup running ... Backup of database INVENTORY to backupset 20071220192200 completed successfully.
Finished: 2007-12-20 14:30:59
Seconds: 5
SCRIPT: nz_replicate.backup_ending
Started: 2007-12-20 14:30:59
For the backup directory: '/SNAP/BACKUPS'
here is a count of the number of files, and the total file size (in bytes)
that needs to get moved over to the replication server ... so that it can
then be restored.
File Count: 13
Byte Count: 3942
Finished: 2007-12-20 14:30:59
Seconds: 0
--------------------------------------------------------------------------------
nz_replicate -dir /SNAP/BACKUPS -database INVENTORY -restore -npshost nps10800.thecustomer.com
SCRIPT: nz_replicate.restore_cmd
Database: INVENTORY
Started: 2007-12-20 14:39:23
Restore running ... Restore of increment 2 from backupset 20071220192200 to database 'INVENTORY' committed.
Restore of increment 3 from backupset 20071220192200 to database 'INVENTORY' committed.
Restore of increment 4 from backupset 20071220192200 to database 'INVENTORY' committed.
Finished: 2007-12-20 14:39:33
Seconds: 10
SCRIPT: nz_replicate.post_restore_cmd
Database: INVENTORY
Started: 2007-12-20 14:39:33
Generating EXPRESS statistics on each table ...
Finished: 2007-12-20 14:39:35
Seconds: 2
--------------------------------------------------------------------------------
nz_rerandomize
Usage: nz_rerandomize <database> [ optional args ]
Purpose: To redistribute (or, "re-randomize") the rows in a table that are DISTRIBUTE'd ON RANDOM
When you create a table using "DISTRIBUTE ON RANDOM" and then populate it with data,
the system will randomly (and fairly evenly) distribute the data across all of the
dataslices on the system. If you have a small dimension table it could end up
occupying space on every one of the dataslices, yet be storing just a few rows on
any given dataslice.
As of version 7.2.1.3, you can specify a "randomDistributionChunkSize" for the system.
This will effectively put <nnn> rows on a given dataslice, before going on to the next
dataslice. So this same, small dimension table might now (for example) be stored on
just 5 dataslices, rather than occupying space on every dataslice.
Note: The optimizer setting 'use_random_dist_chunk_size' allows you to disable this
feature on a per-session basis ... in which case you would get the old/original
functionality and performance.
When storing a table on a dataslice, the system allocates a 3MB extent ... which is
further subdivided into 24 128KB pages. The system reads/writes a "page" at a time.
A single 128KB page of data might contain (for example) 1,500 - 3,000 rows (+/-) ...
depending on the DDL/shape of your table, the compression ratio of the data, etc ...
So, why not store 100+ rows on a given page/extent/dataslice ... rather than just a few?
Benefits may include the following, amongst others:
o Less space being allocated (and going unused) when you have small dimension tables
o Dataslices containing no storage (for a given table) won't have any data to scan.
The less work they have to do (for a given query) the more time they have available
for servicing other queries.
o You may get more hits from the 'disk cache' since there will be fewer tables (on any
given dataslice) taking up storage space -- and thus fewer tables occupying slots
in the disk cache.
This script will (basically) do an INSERT+DELETE into each table in order to re-distribute
(to re-randomize) the data ... so that the new randomDistributionChunkSize can be put
into effect.
This script only applies to
o NPS Version 7.2.1.3+
o Tables that are defined as "DISTRIBUTE ON RANDOM"
o Tables that are smallish in nature (as defined by this script). Basically, this
will result in skewed tables. But these are small tables ... so there will only
be a small amount of skew ... and it will be insignificant.
Inputs: <database>
The database to be acted upon must be specified. The script will make sure that only
randomly distributed tables, of an appropriate size, are acted upon.
If you wish to choose specific tablenames refer to the -in / -NOTin / -like / -NOTlike
options described below.
-rows <nnn>
By default, this script will process any randomly distributed table with <= 250,000 rows.
You can adjust this limit up/down ... to a maximum of 2 million.
-info
Don't actually re-distribute the data. Rather, scan each table and report how many dataslices
it is currently occupying space on.
-groom
This script performs an INSERT + DELETE operation on each of the tables. The DELETE is a
logical delete, so the data is still there ... occupying storage on the various dataslices ...
until such time as you issue a GROOM operation to effect a physical deletion. Add this
switch if you want the script to GROOM all of the affected tables at the very end.
Note: If there are any other, outstanding transactions still running, the GROOM operation
may not be able to physically remove the rows when it is run.
-genstats
Since the script has inserted (and deleted) rows in the table ... even though they are the
same rows ... it has had an effect on the statistics for the table. Specifically, the table
rowcount statistic has been automatically incremented (NPS never automatically decrements it).
Add this switch if you want the script to invoke a GENERATE STATISTICS against all of the
affected tables.
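Sample usage, combining the options above (the database name and row limit are illustrative):
    $ nz_rerandomize TPCH1 -rows 500000 -groom -genstats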
Outputs: Sample output follows:
$ nz_rerandomize TPCH1 -info
# Of Rows DataSlices Table Name
========== ========== ========================================
25 22 NATION
5 5 REGION
$ nz_rerandomize TPCH1
--BEFORE-- --AFTER --
# Of Rows DataSlices DataSlices Table Name
========== ========== ========== ========================================
25 22 1 NATION
5 5 1 REGION
nz_rev
Usage: nz_rev [-num]
Purpose: Report the major.minor revision level of the software running on the host.
This information is useful if you want to design a script that runs on
different versions of the NPS software, and the script needs to do things
differently (depending on which version of the software it is running
against).
Note: nzrev reports the version of your client software.
This script reports the version of the host software.
Inputs: -num Optional switch. If used, the answer will be displayed as
a 2-digit number with no decimal point.
Outputs: The revision of the software running on the SERVER, i.e.
$ nz_rev
6.0
$ nz_rev -num
60
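A calling script might branch on this value, for example (a minimal sketch; the
version threshold shown is illustrative):
    if [ `nz_rev -num` -ge 70 ]; then
        echo "running against NPS 7.0 or later"
    else
        echo "running against a pre-7.0 host"
    fi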
nz_select_fixed_data
Usage: nz_select_fixed_data [-nullValue <str>] -db <database> -t <table> -width <WIDTHS ...>
Purpose: Extract data from a table -- formatting each column to a fixed width.
This script extracts all of the data from a table (i.e., "SELECT *" ), while
formatting the output data such that each column adheres to a fixed width
that is specified by you.
This script does not use external tables, so (performance wise) it is not
recommended for use against really large tables.
Inputs: -nullValue <str> Optional. If a column contains a NULL, this is the
string that will be used to represent that. By default,
a "." will be used. You should not use a nullValue string
that is longer than your MINIMUM column width ... or else
you run the risk of overflowing the column's borders.
-db <database> Required. The name of the database.
-t <table> Required. The name of the table.
-width <WIDTHS ...> The widths of each of the columns.
If a negative number, the column will be left justified.
If a positive number, the column will be right justified.
If you specify a width of 0 (or don't specify enough
widths) then the column(s) will be skipped.
Sample usage:
nz_select_fixed_data -db prod_db -t customer_list -width 30 -10 3 5 0 0 -20
If you specify a column width that is too narrow to contain a particular data
element, then it will automatically be expanded. Which means it will NOT be
fixed width ... and the output will probably be useless.
DATE columns require 10 characters. TIME and TIMESTAMP columns typically require
8 and 19 characters, respectively, unless fractional seconds are involved.
Outputs: The formatted table will be dumped to standard out. You will probably want to
redirect it to a file.
You should check the exit status ( $? ) from this script. A '0' indicates
success. A '-1' indicates problems. A positive number indicates that there
was overflow ... one (or more) of the data values didn't fit within its
defined column width. This error number will represent the max line length
of the data that was outputted.
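For example, a caller might check the exit status like this (a sketch; the database,
table, widths, and output file are illustrative):
    nz_select_fixed_data -db prod_db -t customer_list -width 30 -10 3 5 -20 > /tmp/customer_list.dat
    status=$?
    if [ $status -eq 0 ]; then
        echo "success"
    else
        echo "problems or column overflow were encountered (status=$status)"
    fi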
nz_select_quoted_data
Usage: nz_select_quoted_data <database> <table>
Purpose: Extract data from a table -- wrapping each column value in "double quotes".
This script extracts all of the data from a table (i.e., "SELECT *" ), while
formatting the output data such that each column's value is wrapped in a
set of "double quotes".
This script does not use external tables, so (performance wise) it is not
recommended for use against really large tables.
Inputs: The database and table names are required.
Outputs: Each column will be wrapped with ""
The columns will be separated with ,
For example
"1","2","3","4","John"
"100","-100","50","-50","Richard"
nz_set
Usage: nz_set [search_string]
Purpose: Dump out the optimizer 'SET'tings that are currently in effect.
The value/cost of any of these run-time parameters can be dynamically
changed within your SQL session by 'SET'ting it to a new value.
To change the value globally (for all users)
o Edit the file /nz/data/postgresql.conf
o Make the desired change(s) using this format
name = value
o Restart the database ( nzsystem restart )
--or--
Just send a signal to the relevant process ( pkill -HUP -f postmaster )
In the case of a restart, the postmaster adopts the default value (for
anything that is not specified in the file).
In the case of a pkill, the postmaster keeps the current value (if it
is not specified). In other words, removing/commenting out a value
does not cause it to revert back to its default -- you would need to
explicitly set it to the value that you want it to be.
Inputs: If the optional search_string is specified, then only those 'SET'tings that
match the search string will be listed. A case-insensitive wild-carded
search is performed. For example, to list only those settings related to
the FACTREL planner you might enter
$ nz_set fact
Outputs: A listing of 'SET' statements (what you would get from nzdumpschema) will
be produced. (Note: There may be additional optimizer 'SET'tings that are
not included in an nzdumpschema.)
sample
------
set ENABLE_JIT_STATS = on ;
set FORCE_JIT_STATS = off ;
set ENABLE_JITSTATS_ON_MVIEWS = on ;
set JIT_STATS_MIN_ROWS = 5000000 ;
set STATS_COL_GRP_LIMIT = 10 ;
set EXPRESS_STATS_COL_GRP_LIMIT = 30 ;
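Any of these can be overridden for just your current SQL session by issuing the
corresponding SET statement, for example (the value shown is illustrative):
    set JIT_STATS_MIN_ROWS = 1000000 ;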
nz_state
Usage: nz_state [-terse]
Purpose: Mimics the "nzstate" and "nzstate -terse" commands ... but via SQL.
Inputs: -terse Optional. Print only the current state literal.
Outputs: The system state will be echoed out, e.g.
$ nz_state
System state is 'Online'.
$ nz_state -terse
online
nz_storage_stats
Usage: nz_storage_stats
Purpose: Report various storage statistics about the system.
This script provides information about the primary data partition
(not the mirror nor the swap partition).
The amount of storage (per dataslice) varies across the different
hardware architectures. For the record, the sizes are:
(GB) 56.716 # Older Mustangs
(GB) 118.573 # Newer Mustangs
(GB) 3,725.000 # Cruiser
(GB) 356.000 # TwinFin
(GB) 195.000 # Striper / Mako
Inputs: None
Outputs: A report such as the following.
# Of DataSlices 8
Least Full DSlice # 3
Most Full DSlice # 2
Extents Per Dataslice 121,586
Storage Per DataSlice (GB) 356.209
Storage Used (GB)
Minimum 41.288
Average 42.888
Maximum 44.915
Storage Used (%)
Minimum 11.591
Average 12.040
Maximum 12.609
Total Storage
Available (TB) 2.783
Used (TB) 0.335
Used (%) 12.040
Remaining (TB) 2.448
Remaining (%) 87.960
nz_sysmgmt_view_references
Usage: nz_sysmgmt_view_references [[<database>] <view/table> ]
Purpose: Identify the relationships between various system/mgmt tables and views.
If you specify a system *OR* management VIEW, the script will
identify the views that reference it, and
identify the tables/views that it references
If you specify a system *OR* management TABLE, the script will
identify the views that reference it
Note: This script has already been run against all system/management views/tables.
If you wish, you can just peruse the relevant text file.
nps_sysmgmt_view_references.4.0.txt
nps_sysmgmt_view_references.4.5.txt
nps_sysmgmt_view_references.4.6.txt
nps_sysmgmt_view_references.5.0.txt
nps_sysmgmt_view_references.6.0.txt
nps_sysmgmt_view_references.7.0.txt
nps_sysmgmt_view_references.7.0.3.txt /* Full Schema Support release */
nps_sysmgmt_view_references.7.1.txt
nps_sysmgmt_view_references.7.2.txt
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead. The definition of system/management views is consistent
across all databases, so it doesn't really matter which one you specify.
The view/table name is optional. If not specified, then a report will be
produced for ALL system AND management tables
and also for ALL system AND management views
If only one argument is specified, it will be taken as the view/table name,
and the report will only cover just that one view/table.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the view/table name.
Outputs: A report such as the following
System View: _V_GROUP
Is directly referenced by the following views:
_V_GROUP_PRIV
_V_SCHED_GRA
_V_SCHED_SN
And references the following objects:
(system table) _T_GROUP
(system view) _V_OBJ_GROUP
(system view) _V_SYS_GROUP
(system table) _T_OBJECT
(system table) _T_OBJECT_CLASSES
(system table) _T_USER_OPTIONS
(system table) _T_ACL
(system table) _T_PRIORITY
nz_tables
Usage: nz_tables [-objid|-rowcount|-tablename|-size] [<database> [<tablename>]]
Purpose: List ALL tables in a database, along with other useful pieces of information.
See also: nz_db_size
Inputs: By default, the tables will be listed in the order in which they were
created (i.e., they will be sorted based on their "objid" object identifier
number). This is the order in which the tables get processed when you issue
any of the following commands against a database:
generate [express] statistics;
nzreclaim
Or, you can specify the sort order of your choosing
-objid the table's objid value (the default sort key)
-rowcount the table's rowcount statistic
-tablename the table's name
-size the table's size
The database name is optional.
If the database name is not specified, then this script will process
all databases / all schemas.
If a database name is specified, the script will process just the
specified database / schema.
If the optional table name is specified, then its POSITION in the
output list will be highlighted for easier identification (since
the sort order may not be by tablename).
Outputs: A report such as the following. The column that the sort is based
upon will have its title displayed in UPPERCASE.
$ nz_tables tpch partsupp
Database Name : TPCH
# Of Tables : 8
# Of Rows : 8,661,245
# OBJID rowcount table size (bytes) table name
------ --------- ----------------- ------------------ ----------
1 221890 25 3,060 NATION
2 221904 5 656 REGION
3 221916 200,000 31,167,724 PART
4 221940 800,000 140,039,028 * PARTSUPP
5 221956 10,000 1,736,548 SUPPLIER
6 221976 150,000 29,071,424 CUSTOMER
7 221998 6,001,215 867,799,052 LINEITEM
8 222036 1,500,000 204,137,484 ORDERS
nz_table_constraints
Usage: nz_table_constraints [database [<tablename>]]
Purpose: To dump out the SQL/DDL pertaining to any table constraints.
A table can have
0..1 PRIMARY KEY constraints
0..many FOREIGN KEY constraints
0..many UNIQUE constraints
A constraint can involve 1..many columns.
The system permits and maintains the definition of UNIQUE, PRIMARY KEY,
and FOREIGN KEY constraints. The NPS does not support automatic constraint
checks and referential integrity.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is optional. If not specified, then DDL pertaining to all
tables in this database -- that have constraints -- will be produced.
If you specify the table name, the constraint DDL specific to just that
table will be produced.
Outputs: SQL DDL will be sent to standard out. It will consist of ALTER TABLE
statements to both DROP and ADD each constraint.
ALTER TABLE fact_table DROP CONSTRAINT fact_table_pk RESTRICT;
ALTER TABLE fact_table ADD CONSTRAINT fact_table_pk PRIMARY KEY (c1, c3, c5);
nz_table_references
Usage: nz_table_references [database [table_name]]
Purpose: Identify any tables with PRIMARY KEY / FOREIGN KEY relationships.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is optional. If not specified, then all tables will be
considered.
See also: nz_table_constraints
Outputs: A report such as the following will be produced.
Tables With A Primary Key Defined | Tables With A Foreign Key Reference To It
-----------------------------------+-------------------------------------------
STATES | -----------------------------------------
STATES | CUSTOMERS
STATES | STORES
nz_unload
Usage: nz_unload -sql '"<statement>"' -file <filename>
Purpose: To unload data from a table swiftly ... via the use of remote external tables.
To move big volumes of data into/out of the system quickly, one should use
external tables. They are simple enough to use -- instead of doing this
select * from foo;
one could just do something like this
create external table '/tmp/foo' as select * from foo;
While nzsql supports external tables, it cannot be used with REMOTE external
tables. So if you are running on a client workstation you must use something
other than nzsql to initiate the above command. Which is what this script
does (it uses the nzsqlodbc program instead).
Inputs: These environment variables must be defined
NZ_HOST
NZ_DATABASE
NZ_USER
NZ_PASSWORD
-sql <statement> The SQL that you want to invoke. For example
-sql '"select * from my_table"'
-sql '"select * from prod_db.sales_schema.cust_table"'
-sql '"select * from your_table limit 100"'
-sql '"select name, address, state from emp_table where doh < '2011-01-01'"'
It is best to wrap your sql statement in single quote double quote pairs,
as shown above, to '"protect it"' from the linux shell.
If you have a file that contains your sql, it can be used via this syntax
-sql "`cat /tmp/test.sql`"
-file <filename> The output file to write the data to
-header Include this option if you want column headers (the column
names) included as the first line of output. This option
is available starting with NPS 7.2+.
Outputs: Data will be written to the file specified.
When done, a status message will be displayed to standard out.
Exit status: 0 = success, non-0 = an error occurred.
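Sample usage (assuming NZ_HOST, NZ_DATABASE, NZ_USER and NZ_PASSWORD are already set;
the table and file names are illustrative):
    $ nz_unload -sql '"select * from my_table"' -file /tmp/my_table.dat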
nz_view_references
Usage: nz_view_references [database [ view/table ]]
Purpose: Identify the relationships between various user tables and views.
For a specified view
a) identify what views reference it
b) identify the tables/views/synonyms/sequences that it references
For a specified table
a) identify what views reference it
If neither is specified, then
a) list all views in the database, and
b) identify the tables/views/synonyms/sequences that each references
Optionally, this script can also be used to (1) identify and (2) rebuild any
views that have become obsolete and appear to be in need of rebuilding. The
types of actions that can cause a view to become obsolete are:
o ALTER TABLE ... RENAME
o ALTER COLUMN ... RENAME
o ALTER COLUMN ... MODIFY
o DROP TABLE ...
When processing a single view, in order to "identify what views reference it",
this script must check every database. If you have a lot of databases this
can take awhile. If you want to skip this step ... and just deconstruct a
single view (identify just the objects that it references) ... use the
following method of specifying the view name:
-in ViewName
or -in V1 V2 V3 ...
To run this script you must be the ADMIN user, or have been granted the
following minimum set of privileges:
grant select on _T_OBJECT, _T_OBJECT_CLASSES, _T_DATABASE to <user/group>;
For releases prior to 7.0.4, you also need access to _T_ACTIONFRAG
Inputs: database
Optional. If not specified then all databases will be processed.
view/table
Optional. If not specified then all views in the database will be
processed.
-select <flag>
Should this script issue SQL against each view (running a simple
"select count(*) from <viewname> limit 0; ")? Doing so will allow
the script to perform a more thorough test. Default: no/false
-replace <flag>
If this script believes it may have found a problem with a view,
should it attempt to recreate it (via a "create or replace view ..."
statement)? Default: no/false
-create
If this option is specified it will (re)create + populate the table
SYSTEM..NZ_VIEW_REFERENCES with all view references/dependencies
for the specified objects, thus providing one easy-to-query table
in case you wish to do any analysis against it. This is a static
table, that is up-to-date only as of the moment that it was created,
containing only the requested information.
Outputs: A report such as the following will be produced. Any issues
found will be appropriately highlighted.
View: MASTER_VIEW
Is directly referenced by the following views:
PROD..CEO_VIEW
PROD..FINANCE_VIEW
PROD..MARKETING_VIEW
And references the following objects:
(VIEW) PROD..INVENTORY
(TABLE) PROD..PRODUCTS
(TABLE) PROD..PRICES
(VIEW) PROD..SALES_INFO
(TABLE) PROD..CUSTOMERS
(TABLE) PROD..PURCHASES
(TABLE) PROD..DATES
nz_view_plan_file
Usage: nz_view_plan_file [plan_id] [-v|-verbose]
Purpose: View a *.pln plan file -- on a remote/client workstation.
For a given query, the NPS Admin GUI allows one to "View Plan File".
This is a similar implementation -- but for your CLI.
This can be done even if you are not the "nz" user (and don't have access
to the /nz/kit/log/planshist/current directory).
This can be done even if you are running on a remote/client workstation --
by simply setting up your NZ_HOST environment variable (which you have
probably already done anyway).
Note: The script 'nz_plan' now incorporates this functionality ... so you
can start relying solely on that script. Shorter name, easier to type in.
Inputs: [plan_id] The plan id/number
If you do not provide a plan_id, this script will simply dump out information
about the last 25 queries that you have submitted ... in order to help you to
identify and choose one.
If you specify the plan_id, then the plan file itself will be dumped out.
[-v|-verbose] If you wish to have the plan file analyzed + parsed by the
'nz_plan' script, include the -verbose (or -v) switch.
Outputs: Sample outputs follow
# Show the last 25 queries run by this user (as no plan_id has been specified).
# Any queries that are still active+running will be listed first.
$ nz_view_plan_file
PLAN_ID | User | Database | Start | End | Elapsed | SQL Query ... | Snippets | Rows
---------+-------+----------+----------+----------+----------+------------------------------------------+----------+------
152 | ADMIN | TPCH | 15:06:45 | | 00:00:04 | select n_name, sum(l_extendedprice * (1 | 10 of 11 |
151 | ADMIN | TPCH | 15:06:39 | 15:06:45 | 6 | select o_orderpriority, count(*) as orde | 3 | 5
149 | ADMIN | TPCH | 15:06:33 | 15:06:39 | 6 | select l_orderkey, sum(l_extendedprice * | 4 | 10
147 | ADMIN | TPCH | 15:06:28 | 15:06:32 | 4 | select s_acctbal, s_name, n_name, p_part | 15 | 100
146 | ADMIN | TPCH | 15:06:23 | 15:06:28 | 5 | select l_returnflag, l_linestatus, sum(l | 2 | 4
(5 rows)
# Show the plan file for plan_id 156
$ nz_view_plan_file 156
Execution Plan [plan id 156, job id 156, sig 0x06f33b07]:
SQL: select * from test_table;
1[00]: spu ScanNode table "PROD.ADMIN.TEST_TABLE" 204890 memoryMode=no flags=0x0 index=0 cost=1 (o)
-- Cost=0.0..0.0 Rows=1 Width=4 Size=4 Conf=100 {(COL1)}
1[01]: spu RestrictNode (NULL)
1[02]: spu ProjectNode, 1 cols, projectFlags=0x0
0:COL1
-- 0:COL1
1[03]: spu ReturnNode
501[00]: dbs ReturnNode
End Execution Plan
nz_watch
Usage: nz_watch [-sleep <nnn>] [-timeout <nnn>] [-to <email@recipient>]
Purpose: Watch (i.e., monitor) the system ... to verify that it is operational.
This script will run a simple query against a small table residing on the
SPUs/S-Blades -- to verify that the system is still up + running.
Inputs: -sleep <nnn> Number of seconds to sleep after each invocation of
the test query. The default is 60 seconds.
-timeout <nnn> Number of seconds to wait before timing out and
sending an alert. The default is 30 seconds.
-to <email@recipient> Email address of the person/group to be alerted.
If not specified, then the script will simply echo out
the alert information to standard out ... and no
automatic email notification will be sent.
If you wish to test this script (for example, to make sure the email
messages are correctly sent + received), you can do an "nzsystem pause"
of the database while the script is running.
Outputs: As each invocation of the query completes, the script will simply increment
an output counter. If the -timeout is ever encountered, additional details
will be included in the output.
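Sample usage (the email address is illustrative):
    $ nz_watch -sleep 300 -timeout 60 -to dba_team@example.com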
* * * * Building Blocks * * * *
nz_get_aggregate_name
Usage: nz_get_aggregate_name [database] <aggregate_name>
Purpose: Verifies that the user defined aggregate exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The aggregate name is required. If only one argument is specified, it will
be taken as the aggregate name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the aggregate name.
Outputs: If the user defined aggregate exists, its name will be echoed back out.
Note: You can have multiple aggregates with the same name -- but with a
different signature (a different set of arguments). The output from this
script will be a single aggregate name, regardless of how many different
signatures it might have.
nz_get_aggregate_names
Usage: nz_get_aggregate_names [database]
Purpose: List the user defined aggregates found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all user defined aggregates will be listed out, one per line.
Note: You can have multiple aggregates with the same name -- but with a
different signature (a different set of arguments). The output from this
script will be a list of DISTINCT aggregate names.
nz_get_aggregate_signatures
Usage: nz_get_aggregate_signatures [ database [ aggregate_signature ]]
Purpose: List the signatures of the user defined aggregates in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The aggregate_signature is optional. If specified, and the signature does
exist, its name will be echoed back out.
If not specified, then all signatures for all aggregates in this database
will be listed.
Outputs: The aggregate signatures will be listed out, one per line.
nz_get_column_attnum
Usage: nz_get_column_attnum [database] <object> <column>
Purpose: Get a column's logical attribute number (its order/position) within the object.
When you create an object, each column is assigned a one-up attribute number
based on its order (as in a CREATE TABLE statement). This script returns that
column number.
The logical order is not the same as the physical order (for that, see
nz_physical_table_layout).
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead. If specified, it must be the first (of the three)
arguments listed on the command line.
The object name and column name are required.
The object can be of type
TABLE, SECURE TABLE, EXTERNAL TABLE
VIEW, MATERIALIZED VIEW
Outputs: The column number (which should generally be between 1 and 1600).
nz_get_column_name
Usage: nz_get_column_name [database] <object> <column>
Purpose: Verifies that the specified column exists in the object.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead. If specified, it must be the first (of the three)
arguments listed on the command line.
The object name and column name are required.
The object can be of type
TABLE, SECURE TABLE, EXTERNAL TABLE
VIEW, MATERIALIZED VIEW
Outputs: If the column exists in the object, its name will be echoed back out.
nz_get_column_names
Usage: nz_get_column_names [database] <object>
Purpose: List the column names comprising an object.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The object name is required. If only one argument is specified, it
will be taken as the object name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the object name.
The object can be of type
TABLE, SECURE TABLE, EXTERNAL TABLE
VIEW, MATERIALIZED VIEW
Outputs: The column names for the object will be listed out, one per line.
nz_get_column_oid
Usage: nz_get_column_oid [database] <object> <column>
Purpose: List the colid (column id) for the specified column.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead. If specified, it must be the first (of the three)
arguments listed on the command line.
The object name and column name are required.
The object can be of type
TABLE, SECURE TABLE, EXTERNAL TABLE
VIEW, MATERIALIZED VIEW
Outputs: The colid (column id) for the column is returned.
nz_get_column_type
Usage: nz_get_column_type [database] <table> <column>
Purpose: Get a column's defined data type.
For the specified column, this script will return its data type
(how it was defined in the CREATE TABLE statement)
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead. If specified, it must be the first (of the three)
arguments listed on the command line.
The object name and column name are required.
The object can be of type
TABLE, SECURE TABLE, EXTERNAL TABLE
VIEW, MATERIALIZED VIEW
Outputs: The column's datatype is returned. e.g.,
CHARACTER(10)
INTEGER
nz_get_database_name
Usage: nz_get_database_name [database]
Purpose: Verifies that the specified database exists on this server.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: If the database exists, its name will be echoed back out.
The exit status of this script will be 0.
If an error occurs (or if the database does not exist) an appropriate
error message will be displayed. And the exit status of this script
will be non-0.
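Since this script echoes the name on success and returns a non-0 exit status
otherwise, it is convenient for validation in your own scripts. For example
(the database name is illustrative):
    if nz_get_database_name PROD_DB > /dev/null 2>&1 ; then
        echo "PROD_DB exists ... continuing"
    else
        echo "PROD_DB was not found" ; exit 1
    fi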
nz_get_database_names
Usage: nz_get_database_names
Purpose: Return the list of database names on this server.
Inputs: None
Outputs: The database names for this server will be listed out, one per line.
The exit status of this script will be 0.
If an error occurs an appropriate error message will be displayed.
And the exit status of this script will be non-0.
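For example, a maintenance script might loop over every database on the server
(a minimal sketch):
    for DB in `nz_get_database_names` ; do
        echo "Processing database: $DB"
    done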
nz_get_database_objid
Usage: nz_get_database_objid [database]
Purpose: List the object id for the specified database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The unique objid (object identifier) for the database is returned.
nz_get_database_owner
Usage: nz_get_database_owner [database]
Purpose: List the owner (creator) of the specified database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The owner of the database is returned
nz_get_database_table_column_names
Usage: nz_get_database_table_column_names [ database [ schema [ table [ column ]]]]
Purpose: A sample script -- that will list all database/schema/table/column names.
It will simply (and heavily) exercise the catalogs. It is not an example
of how you should code your own scripts.
Inputs: The database, schema, table, and column names are optional.
If a database is not specified, then all databases will be processed.
--> If a schema is not specified, then all schemas will be processed
--> If a table is not specified, then all tables will be processed.
--> If a column is not specified, then all columns will be processed.
If your system does not support schemas (prior to 7.0.3), then whatever
you specify for the schema name will be ignored.
Outputs: This script simply echoes out the database, schema, table, and column names
as it steps through them.
nz_get_ext_table_name
Usage: nz_get_ext_table_name [database] <ext_table>
Purpose: Verifies that the specified external table exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The external table name is required. If only one argument is specified,
it will be taken as the external table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the external table name.
Outputs: If the external table exists, its name will be echoed back out.
nz_get_ext_table_names
Usage: nz_get_ext_table_names [database]
Purpose: List the external table names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all external tables will be listed out, one per line.
nz_get_ext_table_objid
Usage: nz_get_ext_table_objid [database] <ext_table>
Purpose: List the object id for the specified external table.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The external table name is required. If only one argument is specified,
it will be taken as the external table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the external table name.
Outputs: The unique objid (object identifier) for the external table is returned.
nz_get_ext_table_owner
Usage: nz_get_ext_table_owner [database] <ext_table>
Purpose: List the owner (creator) of the specified external table.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The external table name is required. If only one argument is specified,
it will be taken as the external table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the external table name.
Outputs: The owner of the external table is returned
nz_get_function_name
Usage: nz_get_function_name [database] <function_name>
Purpose: Verifies that the user defined function exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The function name is required. If only one argument is specified, it will
be taken as the function name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the function name.
Outputs: If the user defined function exists, its name will be echoed back out.
Note: You can have multiple functions with the same name -- but with a
different signature (a different set of arguments). The output from this
script will be a single function name, regardless of how many different
signatures it might have.
nz_get_function_names
Usage: nz_get_function_names [database]
Purpose: List the user defined functions found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all user defined functions will be listed out, one per line.
Note: You can have multiple functions with the same name -- but with a
different signature (a different set of arguments). The output from this
script will be a list of DISTINCT function names.
nz_get_function_signatures
Usage: nz_get_function_signatures [ database [ function_signature ]]
Purpose: List the signatures of the user defined functions in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The function_signature is optional. If specified, and the signature does
exist, its name will be echoed back out.
If not specified, then all signatures for all functions in this database
will be listed.
Outputs: The function signatures will be listed out, one per line.
nz_get_group_name
Usage: nz_get_group_name <groupname>
Purpose: Verifies that the specified group exists.
Inputs: The groupname is required.
Outputs: If the group exists, its name will be echoed back out.
nz_get_group_names
Usage: nz_get_group_names
Purpose: Return the list of group names defined on this server.
Inputs: None
Outputs: The group names will be listed out, one per line.
nz_get_group_objid
Usage: nz_get_group_objid <group>
Purpose: List the object id for the specified group.
Inputs: The group name is required.
Outputs: The unique objid (object identifier) for the group is returned.
nz_get_group_owner
Usage: nz_get_group_owner <group>
Purpose: List the owner (creator) of the specified group.
Inputs: The group name is required.
Outputs: The owner of the group is returned
nz_get_group_users
Usage: nz_get_group_users <group>
Purpose: List the members of the specified group.
Inputs: The group name is required.
Outputs: The individual members that belong to this group are returned, one per line.
nz_get_lastTXid
Usage: nz_get_lastTXid
Purpose: To get the value of the last transaction ID that was assigned.
The lastTXid is always an even-numbered value.
The stableTXid is always an odd-numbered value.
On a "quiet" box (with no outstanding transactions) the stableTXid
would be = lastTXid + 1.
However, querying the transaction ID information itself results in
a transaction. So, on a "quiet" box, what you would actually see
is that the stableTXid = lastTXid - 1 (since a new transaction was
probably used + assigned to provide you with this information).
Inputs: None
Outputs: The lastTXid will be returned (in base 10 format)
nz_get_library_name
Usage: nz_get_library_name [database] <library_name>
Purpose: Verifies that the user defined library exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The library name is required. If only one argument is specified, it will
be taken as the library name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the library name.
Outputs: If the user defined library exists, its name will be echoed back out.
nz_get_library_names
Usage: nz_get_library_names [database]
Purpose: List the user defined libraries found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all user defined libraries will be listed out, one per line.
nz_get_mgmt_table_name
Usage: nz_get_mgmt_table_name [database] <management_table>
Purpose: Verifies that the specified management table exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: If the management table exists, its name will be echoed back out.
nz_get_mgmt_table_names
Usage: nz_get_mgmt_table_names [database]
Purpose: List the management table names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all management tables will be listed out, one per line.
nz_get_mgmt_view_name
Usage: nz_get_mgmt_view_name [database] <management_view>
Purpose: Verifies that the specified management view exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The view name is required. If only one argument is specified, it will
be taken as the view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the view name.
Outputs: If the management view exists, its name will be echoed back out.
nz_get_mgmt_view_names
Usage: nz_get_mgmt_view_names [database]
Purpose: List the management view names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all management views will be listed out, one per line.
nz_get_model
Usage: nz_get_model
Purpose: To identify the model number of this NPS server.
Inputs: None
Outputs: The NPS model will be returned.
nz_get_mview_basename
Usage: nz_get_mview_basename [database] <materialized_view>
Purpose: Returns the name of the base table that this materialized view is built upon.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The materialized view name is required. If only one argument is specified,
it will be taken as the materialized view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the materialized view name.
Outputs: If the materialized view exists, its base tablename will be echoed back out.
nz_get_mview_definition
Usage: nz_get_mview_definition [database] <materialized_view>
Purpose: Display the definition (the SQL) that defined the materialized view.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The materialized view name is required. If only one argument is specified,
it will be taken as the materialized view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the materialized view name.
Outputs: The definition of the materialized view is returned.
nz_get_mview_matrelid
Usage: nz_get_mview_matrelid [database] <materialized_view>
Purpose: Get the OBJID of the storage that is associated with a materialized view.
A materialized view is like a view. But it is also like a table --
in that it has its own storage. That storage has its own 'name' and
'objid'. This script will return the OBJID number.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The materialized view name is required. If only one argument is specified,
it will be taken as the materialized view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the materialized view name.
Outputs: If the materialized view exists, its base MATRELID will be echoed
back out.
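For example (the database, materialized view, and objid shown are hypothetical):
    $ nz_get_mview_matrelid SALESDB ORDERS_MVIEW
    211468
The number returned is the objid of the materialized view's underlying storage,
not the objid of the materialized view object itself (see nz_get_mview_objid).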
nz_get_mview_name
Usage: nz_get_mview_name [database] <materialized_view>
Purpose: Verifies that the specified materialized view exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The materialized view name is required. If only one argument is specified,
it will be taken as the materialized view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the materialized view name.
Outputs: If the materialized view exists, its name will be echoed back out.
nz_get_mview_names
Usage: nz_get_mview_names [database]
Purpose: List the materialized view names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all materialized views will be listed out, one per line.
nz_get_mview_objid
Usage: nz_get_mview_objid [database] <materialized_view>
Purpose: List the object id for the specified materialized view.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The materialized view name is required. If only one argument is specified,
it will be taken as the materialized view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the materialized view name.
Outputs: The unique objid (object identifier) for the
materialized view is returned.
nz_get_mview_owner
Usage: nz_get_mview_owner [database] <materialized_view>
Purpose: List the owner (creator) of the specified materialized view.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The materialized view name is required. If only one argument is specified,
it will be taken as the materialized view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the materialized view name.
Outputs: The owner of the materialized view is returned.
nz_get_object_name
Usage: nz_get_object_name <object1> [object2]
Purpose: Verifies that the specified object exists.
For any given object name, this script will verify that an object by that name
does exist.
Inputs: <object1> If you are interested in a GLOBAL object (i.e., a database, user,
group, or scheduler rule) then specify it here.
Otherwise, specify the parent DATABASE that the object/relation
resides in.
[object2] If the object you are interested in is anything other than a global
object, i.e. a
table, external table
view, materialized view
sequence, synonym
system table, system view
management table, management view
then
<object1> refers to the database
[object2] refers to the object itself
If the object is a function/aggregate/procedure, then you must pass
this script the exact signature, wrapped in single quotes. For example:
$ nz_get_object_name SYSTEM 'TEST_FUNCTION_1(BYTEINT)'
To obtain a list of the exact signatures, see:
nz_get_function_signatures
nz_get_aggregate_signatures
nz_get_procedure_signatures
Outputs: If the object does exist, its name will be echoed back out.
nz_get_object_objid
Usage: nz_get_object_objid <object1> [object2]
Purpose: Get the unique objid (object identifier number) associated with an object.
For any given object name, this script will return its unique objid number.
Inputs: <object1> If you are interested in a GLOBAL object (i.e., a database/user/group)
then specify it here.
Otherwise, specify the parent DATABASE that the object/relation
resides in.
[object2] If the object you are interested in is anything other than a global
object, i.e. a
table, external table
view, materialized view
sequence, synonym
system table, system view
management table, management view
then
<object1> refers to the database
[object2] refers to the object itself
Outputs: If the object does exist, its objid value will be echoed back out.
nz_get_object_owner
Usage: nz_get_object_owner <object1> [object2]
Purpose: Get the owner of an object.
For a given object name, identify who owns it.
Inputs: <object1> If you are interested in a GLOBAL object (i.e., a database/user/group)
then specify it here.
Otherwise, specify the parent DATABASE that the object/relation
resides in.
[object2] If the object you are interested in is anything other than a global
object, i.e. a
table, external table
view, materialized view
sequence, synonym
system table, system view
management table, management view
then
<object1> refers to the database
[object2] refers to the object itself
Outputs: If the object does exist, the object owner will be echoed back out.
nz_get_object_type
Usage: nz_get_object_type <object1> [object2]
Purpose: For a given object, identify what type of object it is (TABLE, VIEW, etc ...).
Inputs: <object1> If you are interested in a GLOBAL object (i.e., a database/user/group)
then specify it here.
Otherwise, specify the parent DATABASE that the object/relation
resides in.
[object2] If the object you are interested in is anything other than a global
object, i.e. a
table, external table
view, materialized view
sequence, synonym
system table, system view
management table, management view
then
<object1> refers to the database
[object2] refers to the object itself
Outputs: If the object does exist, its object type will be echoed back out.
nz_get_procedure_name
Usage: nz_get_procedure_name [database] <procedure_name>
Purpose: Verifies that the user defined procedure exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The procedure name is required. If only one argument is specified, it will
be taken as the procedure name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the procedure name.
Outputs: If the user defined procedure exists, its name will be echoed back out.
Note: You can have multiple procedures with the same name -- but with a
different signature (a different set of arguments). The output from this
script will be a single procedure name, regardless of how many different
signatures it might have.
nz_get_procedure_names
Usage: nz_get_procedure_names [database]
Purpose: List the user defined procedures found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all user defined procedures will be listed out, one per line.
Note: You can have multiple procedures with the same name -- but with a
different signature (a different set of arguments). The output from this
script will be a list of DISTINCT procedure names.
nz_get_procedure_signatures
Usage: nz_get_procedure_signatures [ database [ procedure_signature ]]
Purpose: List the signatures of the user defined procedures in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The procedure_signature is optional. If it is specified and the signature
exists, its name will be echoed back out.
If not specified, then all signatures for all procedures in this database
will be listed.
Outputs: The procedure signatures will be listed out, one per line.
nz_get_schema_name
Usage: nz_get_schema_name [[database] schema]
Purpose: Verifies that the specified schema exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The schema name is optional.
If only one argument is specified, it will be taken as the schema name.
If no arguments are specified, the script will return the name of
the current_schema.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the schema name.
Outputs: If the schema exists, its name will be echoed back out.
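For example (the database and schema names are hypothetical):
    $ nz_get_schema_name                        # no arguments ... the current_schema
    ADMIN
    $ nz_get_schema_name REPORTING              # one argument  ... the schema name
    REPORTING
    $ nz_get_schema_name SALESDB REPORTING      # two arguments ... database + schema
    REPORTING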
nz_get_schema_names
Usage: nz_get_schema_names [database]
Purpose: List the schema names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all schemas will be listed out, one per line.
nz_get_schema_objid
Usage: nz_get_schema_objid [[database] schema]
Purpose: List the object id for the specified schema.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The schema name is optional.
If only one argument is specified, it will be taken as the schema name.
If no arguments are specified, the script will return the objid of
the current_schema.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the schema name.
Outputs: The unique objid (object identifier) for the schema is returned.
nz_get_sequence_name
Usage: nz_get_sequence_name [database] <sequence>
Purpose: Verifies that the specified sequence exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The sequence name is required. If only one argument is specified, it
will be taken as the sequence name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the sequence name.
Outputs: If the sequence exists, its name will be echoed back out.
nz_get_sequence_names
Usage: nz_get_sequence_names [database]
Purpose: List the sequence names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all sequences will be listed out, one per line.
nz_get_sequence_objid
Usage: nz_get_sequence_objid [database] <sequence>
Purpose: List the object id for the specified sequence.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The sequence name is required. If only one argument is specified, it
will be taken as the sequence name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the sequence name.
Outputs: The unique objid (object identifier) for the sequence is returned.
nz_get_sequence_owner
Usage: nz_get_sequence_owner [database] <sequence>
Purpose: List the owner (creator) of the specified sequence.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The sequence name is required. If only one argument is specified, it
will be taken as the sequence name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the sequence name.
Outputs: The owner of the sequence is returned.
nz_get_stableTXid
Usage: nz_get_stableTXid
Purpose: To get the value of the "Stable Transaction ID".
This is the transaction ID for which all transactions numerically lower
than it are visible (i.e., already committed or rolled back). Thus, it
negates the need for a transaction invisibility list.
It is actually based on two conditions:
o It is the latest committed transaction
o It must not be included in any other transaction's invisibility list
Inputs: None
Outputs: The stableTXid will be returned (in base 10 format)
nz_get_synonym_definition
Usage: nz_get_synonym_definition [database] <synonym>
Purpose: For the specified synonym, return the REFERENCED object.
The complete specification of the object is returned, as follows
"DATABASE"."SCHEMA"."OBJECTNAME"
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The synonym name is required. If only one argument is specified, it
will be taken as the synonym name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the synonym name.
Outputs: If the synonym exists, the object it references will be echoed back out.
Note: Though a synonym may exist, the object it references may or may not
exist.
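For example (the synonym and the object it references are hypothetical):
    $ nz_get_synonym_definition SALESDB CURRENT_ORDERS
    "SALESDB"."ADMIN"."ORDERS_2016"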
nz_get_synonym_name
Usage: nz_get_synonym_name [database] <synonym>
Purpose: Verifies that the specified synonym exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The synonym name is required. If only one argument is specified, it
will be taken as the synonym name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the synonym name.
Outputs: If the synonym exists, its name will be echoed back out.
nz_get_synonym_names
Usage: nz_get_synonym_names [database]
Purpose: List the synonym names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all synonyms will be listed out, one per line.
nz_get_synonym_objid
Usage: nz_get_synonym_objid [database] <synonym>
Purpose: List the object id for the specified synonym.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The synonym name is required. If only one argument is specified, it
will be taken as the synonym name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the synonym name.
Outputs: The unique objid (object identifier) for the synonym is returned.
nz_get_synonym_owner
Usage: nz_get_synonym_owner [database] <synonym>
Purpose: List the owner (creator) of the specified synonym.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The synonym name is required. If only one argument is specified, it
will be taken as the synonym name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the synonym name.
Outputs: The owner of the synonym is returned.
nz_get_sysmgmt_table_name
Usage: nz_get_sysmgmt_table_name [database] <table>
Purpose: Verifies that the specified system table *OR* management table exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: If the table exists, its name will be echoed back out.
nz_get_sysmgmt_table_names
Usage: nz_get_sysmgmt_table_names [database]
Purpose: List the system table *AND* management table names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all tables will be listed out, one per line.
nz_get_sysmgmt_table_objid
Usage: nz_get_sysmgmt_table_objid [database] <table>
Purpose: List the object id for the specified system table *OR* management table.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: The unique objid (object identifier) for the table is returned.
nz_get_sysmgmt_view_name
Usage: nz_get_sysmgmt_view_name [database] <view>
Purpose: Verifies that the specified system view *OR* management view exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The view name is required. If only one argument is specified, it will
be taken as the view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the view name.
Outputs: If the view exists, its name will be echoed back out.
nz_get_sysmgmt_view_names
Usage: nz_get_sysmgmt_view_names [database]
Purpose: List the system view *AND* management view names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all views will be listed out, one per line.
nz_get_sysmgmt_view_objid
Usage: nz_get_sysmgmt_view_objid [database] <view>
Purpose: List the object id for the specified system view *OR* management view.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The view name is required. If only one argument is specified, it will
be taken as the view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the view name.
Outputs: The unique objid (object identifier) for the view is returned.
nz_get_sys_table_name
Usage: nz_get_sys_table_name [database] <system_table>
Purpose: Verifies that the specified system table exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: If the system table exists, its name will be echoed back out.
nz_get_sys_table_names
Usage: nz_get_sys_table_names [database]
Purpose: List the system table names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all system tables will be listed out, one per line.
nz_get_sys_view_name
Usage: nz_get_sys_view_name [database] <system_view>
Purpose: Verifies that the specified system view exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The view name is required. If only one argument is specified, it will
be taken as the view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the view name.
Outputs: If the system view exists, its name will be echoed back out.
nz_get_sys_view_names
Usage: nz_get_sys_view_names [database]
Purpose: List the system view names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all system views will be listed out, one per line.
nz_get_table_distribution_key
Usage: nz_get_table_distribution_key [database] <table>
Purpose: Identify the column name(s) on which this table is distributed.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: The columns comprising the distribution key are echoed out, one
column name per line, in the order in which they were specified
in the DISTRIBUTE ON ( ) clause.
If the table is distributed on random (round-robin), then this
script will simply return the string
RANDOM
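For example (the tables and columns shown are hypothetical):
    $ nz_get_table_distribution_key SALESDB ORDERS
    CUSTOMER_ID
    ORDER_DATE
    $ nz_get_table_distribution_key SALESDB ORDER_STAGING
    RANDOM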
nz_get_table_fks
Usage: nz_get_table_fks [database] <table>
Purpose: Identify the 'foreign tables' (if any) that are referenced by this table.
When you create a table, you may specify 0..many FOREIGN KEY constraints.
This script will identify all of the 'foreign tables' (if any) that are
associated with this specific table.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: If foreign key references were made, the names of the 'foreign tables' being
referenced will be listed, one table per line. If a 'foreign table' is
referenced more than once (i.e., in more than one constraint) it will only
be listed once.
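For example, if the hypothetical table ORDERS declares FOREIGN KEY constraints
that reference the CUSTOMERS and PRODUCTS tables:
    $ nz_get_table_fks SALESDB ORDERS
    CUSTOMERS
    PRODUCTS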
nz_get_table_name
Usage: nz_get_table_name [database] <table>
Purpose: Verifies that the specified table exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: If the table exists, its name will be echoed back out.
nz_get_table_names
Usage: nz_get_table_names [database]
Purpose: List the table names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all tables will be listed out, one per line.
nz_get_table_objid
Usage: nz_get_table_objid [database] <table>
Purpose: List the object id for the specified table.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: The unique objid (object identifier) for the table is returned.
nz_get_table_organization_key
Usage: nz_get_table_organization_key [database] <table>
Purpose: Identify the column name(s) on which this table is organized.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: The columns comprising the organization clause are echoed out, one
column name per line, in the order in which they were specified
in the ORGANIZE ON ( ) clause.
If no ORGANIZE ON clause was specified (or if you are on a pre-6.0 system,
which is when this feature was first introduced) then this script will
simply return the string
NONE
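For example (the table and columns shown are hypothetical):
    $ nz_get_table_organization_key SALESDB ORDERS
    ORDER_DATE
    CUSTOMER_ID
If the table was created without an ORGANIZE ON clause, only the string NONE
would be returned.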
nz_get_table_owner
Usage: nz_get_table_owner [database] <table>
Purpose: List the owner (creator) of the specified table.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: The owner of the table is returned.
nz_get_table_pk
Usage: nz_get_table_pk [database] <table>
Purpose: If a PRIMARY KEY was defined for a table, list the column name(s) comprising it.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: The columns comprising the primary key are echoed out, one
column name per line, in the order in which they were specified.
nz_get_table_rowcount
Usage: nz_get_table_rowcount [database] <table>
Purpose: Perform a "SELECT COUNT(*) FROM <table>;" to get its true rowcount.
Thus, this script results in a full table scan being performed.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: The table rowcount is returned.
nz_get_table_rowcount_statistic
Usage: nz_get_table_rowcount_statistic [database] <table>
Purpose: Returns the STATISTICAL VALUE representing this table's rowcount.
This is simply the statistical value from the host catalog. It may, or
may not, match the table's actual rowcount at any given moment in time.
Because it is accessing a statistic value -- rather than performing a
full table scan -- it is much faster, typically returning almost instantly.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The table name is required. If only one argument is specified, it will
be taken as the table name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the table name.
Outputs: The table's (STATISTICAL VALUE) rowcount is returned.
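To illustrate the difference between the two rowcount scripts (the table and
the values shown are hypothetical):
    $ nz_get_table_rowcount           SALESDB ORDERS    # full table scan
    600037902
    $ nz_get_table_rowcount_statistic SALESDB ORDERS    # catalog statistic only
    600000000
The two values converge once statistics are regenerated (e.g., via a
GENERATE STATISTICS command) and no further DML has taken place.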
nz_get_user_groups
Usage: nz_get_user_groups <user>
Purpose: List the groups that this user is a member of.
It does not include the 'PUBLIC' group, which all users typically belong to.
Inputs: The user name is required.
Outputs: The groups that this user belongs to are returned, one group per line.
nz_get_user_name
Usage: nz_get_user_name <username>
Purpose: Verifies that the specified user exists.
Inputs: The username is required.
Outputs: If the user exists, its name will be echoed back out.
nz_get_user_names
Usage: nz_get_user_names
Purpose: Return the list of user names defined on this server.
Inputs: None
Outputs: The user names will be listed out, one per line.
nz_get_user_objid
Usage: nz_get_user_objid <user>
Purpose: List the object id for the specified user.
Inputs: The user name is required.
Outputs: The unique objid (object identifier) for the user is returned.
nz_get_user_owner
Usage: nz_get_user_owner <user>
Purpose: List the owner (creator) of the specified user.
Inputs: The user name is required.
Outputs: The owner of the user is returned.
nz_get_view_definition
Usage: nz_get_view_definition [database] <view>
Purpose: Display the definition (the SQL) that the view will execute.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The view name is required. If only one argument is specified, it will be
taken as the view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the view name.
Outputs: The definition of the view -- the SQL to be executed -- is returned.
nz_get_view_name
Usage: nz_get_view_name [database] <view>
Purpose: Verifies that the specified view exists.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The view name is required. If only one argument is specified, it will
be taken as the view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the view name.
Outputs: If the view exists, its name will be echoed back out.
nz_get_view_names
Usage: nz_get_view_names [database]
Purpose: List the view names found in this database.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
Outputs: The names of all views will be listed out, one per line.
nz_get_view_objid
Usage: nz_get_view_objid [database] <view>
Purpose: List the object id for the specified view.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The view name is required. If only one argument is specified, it will be
taken as the view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the view name.
Outputs: The unique objid (object identifier) for the view is returned.
nz_get_view_owner
Usage: nz_get_view_owner [database] <view>
Purpose: List the owner (creator) of the specified view.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The view name is required. If only one argument is specified, it will be
taken as the view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the view name.
Outputs: The owner of the view is returned.
nz_get_view_rowcount
Usage: nz_get_view_rowcount [database] <view>
Purpose: Performs a "SELECT COUNT(*) FROM <view>;" and returns the rowcount.
Inputs: The database name is optional. If not specified, then $NZ_DATABASE will
be used instead.
The view name is required. If only one argument is specified, it will
be taken as the view name.
If two arguments are specified, the first will be taken as the database
name and the second will be taken as the view name.
Outputs: The rowcount is returned.
nz_wrapper
Usage: nz_wrapper <objname>
Purpose: To "wrap" an object name with quotes (or the appropriate delimiter).
In general, NPS treats all object names as case insensitive, which makes
things easier on the user. But in reality NPS is case sensitive.
How is that handled?
When you specify an object name, if you enclose it in quotes then
NPS will assume it is case sensitive. And an exact match must be
made. Otherwise, NPS will convert the object name to all uppercase
characters (or lowercase characters, depending on your system's default).
When invoking a program/script on the command line ...
If you simply enter
your_object_name
then it will be treated as case insensitive. This script will actually
wrap it in ^^ symbols for later use/queries against the system catalogs.
select * from _v_obj_relation where objname = ^your_object_name^ ;
If you include quotes around it
'"Your Object Name"'
then it will be treated as case sensitive. This script will preserve
one set of quotes for later use/queries against the system catalogs.
select * from _v_obj_relation where objname = 'Your Object Name' ;
The shell likes to strip off the quotes. That is why, when you specify
the object name, you must wrap it in two sets of quotes -- to preserve
one set for the called application.
Inputs: The object name. e.g.
your_object_name -- case insensitive
YOUR_OBJECT_NAME -- case insensitive
'"Your Object Name"' -- case sensitive
Outputs: The object name, wrapped accordingly. e.g.
^your_object_name^ -- case insensitive
^YOUR_OBJECT_NAME^ -- case insensitive
'Your Object Name' -- case sensitive
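For example, when invoked from the command line (the object names are
illustrative):
    $ nz_wrapper my_table
    ^my_table^
    $ nz_wrapper '"My Table"'
    'My Table'
Note the two sets of quotes in the second invocation -- one set is consumed by
the shell, and one set is preserved for nz_wrapper itself.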