SUMMARY:
There is NO significant storage saving (just 3.7%) when choosing between NATIONAL CHARACTER (Unicode) and CHARACTER (ASCII) for database table design at Netezza.
I recommend choosing a datatype that pertains to design and scalability of the application.
DETAILS:
Compare storage saving using CHARACTER VARYING instead of NATIONAL CHARACTER VARYING datatype at Netezza NPS 7.2.1.
NATIONAL CHARACTER
|
CHARACTER
|
CREATE TABLE ARCHIVE_TABLE_201405
(
CMTS_NM NATIONAL CHARACTER VARYING (30 ),
INTF_NM NATIONAL CHARACTER VARYING (30 ),
MAC_ADDR NATIONAL CHARACTER VARYING (20 ),
CM_STA NATIONAL CHARACTER VARYING (20 ),
TM_OFST CHARACTER VARYING (20 ),
IP_ADDR NATIONAL CHARACTER VARYING (20 ),
NODE_KEY NATIONAL CHARACTER VARYING (50 ),
BATCH_TS TIMESTAMP,
AS_OF_DT TIMESTAMP,
DATA_ACQ_ID INTEGER
)
DISTRIBUTE ON (AS_OF_DT, MAC_ADDR);
|
CREATE TABLE ARCHIVE_TABLE_201405_VARCHAR
(
CMTS_NM CHARACTER VARYING (30 ),
INTF_NM CHARACTER VARYING (30 ),
MAC_ADDR CHARACTER VARYING (20 ),
CM_STA CHARACTER VARYING (20 ),
TM_OFST CHARACTER VARYING (20 ),
IP_ADDR CHARACTER VARYING (20 ),
NODE_KEY CHARACTER VARYING (50 ),
BATCH_TS TIMESTAMP,
AS_OF_DT TIMESTAMP,
DATA_ACQ_ID INTEGER
)
DISTRIBUTE ON (AS_OF_DT, MAC_ADDR);
|
ALLOCATED_GB: 38 Gb
USED_GB : 38 Gb
RECORDS : 1,369,391,278
|
ALLOCATED_GB: 36.7 Gb
USED_GB : 36.7 Gb
RECORDS : 1,369,391,278
|
From description below, once could assume that each NVARCHAR (Unicode) character takes 4x larger storage than VARCHAR (ASCII). The result of the experiment indicates that this is not the case.
VARCHAR | CHARACTER VARYING, VARCHAR(n), CHARACTER VARYING(n), CHAR VARYING(n) | Variable length character string to a maximum length of n. If you do not specify n, the default is an unsized VARCHAR value. There is no blank padding, and the value is stored as entered. The maximum character string size is 64,000. |
NVARCHAR | NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING, NATIONAL CHAR VARYING(n), NATIONAL CHARACTER VARYING(n), and NVARCHAR(n) | Variable length character string to a maximum length of n. If you do not specify n, the default is an unsized NVARCHAR value. The maximum length is 16,000 characters. |
REF: Netezza Data types and aliases