Wednesday, June 11, 2014

CHAR semantics and BYTE semantics

Earlier we are define the datatypes with byte size only (i.e. varchar2(10)), if we set datatype size varchar2(10) then it will store 10 characters for single byte character set. If we want to store the multibyte character of any other language then, in same column we are not able to store the 10 characters are multibyte in size.

From 9i oracle introduced NLS_LENGTH_SEMANTICS parameter. While installation oracle ask for character storage type "CHAR/BYTE".
We can check default character set with initialization parameter "NLS_LENGTH_SEMANTICS". The default length semantic is byte.

SQL> show parameter nls_length_

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
nls_length_semantics                 string      BYTE

Two type CHAR semantics and BYTE semantics

There are three different ways, we can declare CHAR/VARCHAR2 data type.
Let us take VARCHAR2(10) as an example here.

1. VARCHAR2(10)

BYTE SEMANTICS treat strings as a sequence of bytes. This is the default for character datatypes.
VARCHAR2(10 BYTE) in this oracle store only 10 byte of data, it depends upon the size of char. But if database store multibyte char it will store only 5 char.

CHAR SEMANTICS treat strings as a sequence of characters.
VARCHAR2(10 CHAR) will store 10 char at all whether they are single byte or mutlibyte character.

If you want to fill five chinese char and five english words together, then In byte semantics, this would require (5*3 bytes) + (1*5 bytes) = 20 bytes; in character semantics, the column would require 10 characters.

VARCHAR2(20 BYTE) use byte semantics.
VARCHAR2(10 CHAR) use character semantics.

Character semantics are useful for defining varying-width multibyte strings; it reduces the complexity when defining the actual length requirements for data storage.

No comments:

Post a Comment

web stats