Char Primitive Data Type Tutorial

This tutorial explains the Java char data type. In Java, a char is a 16-bit Unicode character. That sounds scary! What is a 16-bit Unicode character? Every individual letter, number, and symbol on your screen, such as A, B, C, X, Y, Z, 0, 1, 2, 3, #, $, %, is represented by a Unicode character. To understand what a Unicode character is, you need to understand a few key concepts of how data is stored in memory. This tutorial provides a crash course on bits, unsigned bytes, binary, decimal, and ASCII, and on how they relate to Unicode characters.



Bits and the Unsigned byte

Your hard drive, thumb drive, and SD card are all examples of storage devices. Typically, each one has a capacity listed in megabytes, gigabytes, or terabytes. A bit is a single memory location that contains either a 1 or a 0. There are 8 bits in a byte, and 16 bits in two bytes, which is what a 16-bit Unicode character consists of. Let's imagine that we have an empty thumb drive. If we could "peel" it open and look at the memory, we would see millions of 0's all lined up.
What would a single byte look like? 00000000
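
If you would like to check these sizes from Java itself, here is a minimal sketch. The class name BitSizes and the helper emptyBits are names I made up for illustration; Byte.SIZE and Character.SIZE are constants from the standard library.

```java
class BitSizes {
    // Builds a string of `count` zeros, the way an empty run of bits would look.
    static String emptyBits(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append('0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Bits in a byte: " + Byte.SIZE);      // prints 8
        System.out.println("Bits in a char: " + Character.SIZE); // prints 16
        System.out.println("Empty byte: " + emptyBits(Byte.SIZE));
        System.out.println("Empty char: " + emptyBits(Character.SIZE));
    }
}
```

The last two lines print 8 zeros and 16 zeros, which is what one empty byte and one empty char would look like.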

Binary Numeric System

Everybody knows the decimal system: you can count from 1 to 100 and then keep going forever if you want to. With binary numbering, only the digits 1 and 0 are used, and together they can represent any whole number. To figure out the decimal value of a binary representation, you "read" the 1's and 0's from right to left. In a byte, the right-most bit represents a decimal value of either 1 or 0.

Decimal 0 =   Binary 00000000
Decimal 1 =   Binary 00000001
The second right-most bit represents a decimal value of either 2 or 0.
Decimal 2 =   Binary 00000010
The third right-most bit represents a decimal value of either 4 or 0.
Decimal 4 =   Binary 00000100
The fourth right-most bit represents a decimal value of either 8 or 0.
Decimal 8 =   Binary 00001000
The fifth right-most bit represents a decimal value of either 16 or 0.
Decimal 16 =  Binary 00010000
The sixth right-most bit represents a decimal value of either 32 or 0.
Decimal 32 =  Binary 00100000
The seventh right-most bit represents a decimal value of either 64 or 0.
Decimal 64 =  Binary 01000000
The eighth right-most bit represents a decimal value of either 128 or 0.
Decimal 128 = Binary 10000000
Hopefully you noticed that the value of each bit column doubles as you move from right to left. Now we can build on this concept to create other numbers. We do this by "adding up" the bit columns that contain a 1. Let's create the number 3 by placing a 1 in both the first and second right-most columns. Adding up the values of those two columns (1 + 2) produces the decimal equivalent, 3.
Decimal 3 =   Binary 00000011
Decimal 5 can be represented by adding up the first and third right-most columns.
Decimal 5 =   Binary 00000101
Decimal 6 through 10 represented as binary:
Decimal 6 =   Binary 00000110
Decimal 7 =   Binary 00000111
Decimal 8 =   Binary 00001000
Decimal 9 =   Binary 00001001
Decimal 10 =  Binary 00001010
The largest decimal value a single unsigned byte can hold is 255.
Decimal 255 = Binary 11111111
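
The "adding up the columns" idea above can be sketched in Java. This is only an illustration: the class name BinaryColumns and the method toDecimal are my own, not part of Java. The last line uses Integer.parseInt with a radix of 2, which is the standard library's way of doing the same conversion.

```java
class BinaryColumns {
    // Reads the bits from right to left and adds up the column values
    // (1, 2, 4, 8, ...) wherever the bit is a 1.
    static int toDecimal(String bits) {
        int total = 0;
        int columnValue = 1; // value of the right-most column
        for (int i = bits.length() - 1; i >= 0; i--) {
            if (bits.charAt(i) == '1') {
                total += columnValue;
            }
            columnValue *= 2; // each column to the left is worth double
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("00000011")); // prints 3
        System.out.println(toDecimal("00000101")); // prints 5
        System.out.println(toDecimal("11111111")); // prints 255
        System.out.println(Integer.parseInt("00001010", 2)); // prints 10
    }
}
```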

The ASCII table

ASCII stands for American Standard Code for Information Interchange. Simply put, ASCII assigns every character such as A, B, C, X, Y, Z, 0, 1, 2, 3, #, $, % a number that can be written as a decimal or binary value. Think of ASCII as the predecessor to Unicode, because ASCII only covers the English alphabet, the digits, and common punctuation. I'll use the uppercase letter A to explain how ASCII works.
The letter A is represented by decimal 65.

Decimal 65 =  Binary 01000001 =  Uppercase A
If the only thing on your thumb drive was the letter A, you would "see" 010000010000000000000000000...followed by billions of zeros.
Decimal 66 =  Binary 01000010 =  Uppercase B
Decimal 67 =  Binary 01000011 =  Uppercase C
...
Decimal 84 =  Binary 01010100 =  Uppercase T
If the only thing on your thumb drive was the word CAT, you would "see" 0100001101000001010101000000000000000000000...followed by billions of zeros. Keep in mind that I am really oversimplifying things; there is much more to memory storage. Things like file allocation tables, heads, and sectors are well beyond the scope of this tutorial.
I have provided an ASCII printable characters table at the bottom of this page.
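
To see these ASCII values from Java, here is a small sketch that prints the decimal and 8-bit binary value of each letter in CAT. The class name AsciiCodes and the helper toBits are hypothetical names chosen for this example; Integer.toBinaryString is a standard library method, but it drops leading zeros, so the helper pads them back in.

```java
class AsciiCodes {
    // Left-pads the binary form of `value` with zeros up to `width` digits.
    static String toBits(int value, int width) {
        StringBuilder sb = new StringBuilder(Integer.toBinaryString(value));
        while (sb.length() < width) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String word = "CAT";
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            int code = c; // a char converts to its decimal value automatically
            System.out.println(c + " = Decimal " + code + " = Binary " + toBits(code, 8));
        }
    }
}
```

The three lines printed should match the rows for C, A, and T in the ASCII table at the bottom of this page.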

16-bit Unicode

Early computing relied entirely on the ASCII standard - English only. As computer technology advanced, it became apparent that the rest of the world's languages would need to be usable on computers too. How could someone in China use a computer if they didn't know how to read and write English? Work on Unicode began in 1987 to address this very issue. Instead of using just a single byte to represent a character, its designers decided to use two bytes to represent each character. If you recall from the binary numbering system above, each time you extend a bit column to the left, the value it represents doubles.
The bit column values would look like this: 32768|16384|8192|4096|2048|1024|512|256|128|64|32|16|8|4|2|1
Added together, those 16 columns can represent 65,536 different values (0 through 65535). That is a lot of characters that Java can support!
The original ASCII values were kept the same, so you can just tack on an extra 8 zeros to the beginning of each binary representation.
The letter A is represented by decimal 65.

Decimal 65 =  Binary 0000000001000001 =  Uppercase A
If the only thing on your thumb drive was the letter A, you would "see" 00000000010000010000000000000000000...followed by billions of zeros.
Decimal 66 =  Binary 0000000001000010 =  Uppercase B
Decimal 67 =  Binary 0000000001000011 =  Uppercase C
...
Decimal 84 =  Binary 0000000001010100 =  Uppercase T
With 16-bit Unicode, if the only thing on your thumb drive was the word CAT, you would "see" 00000000010000110000000001000001000000000101010000000000...followed by billions of zeros.
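
Here is a short sketch of the same idea with 16 bits. The class name UnicodeBits and the helper toBits are hypothetical; Character.MAX_VALUE is a standard constant holding the largest value a char can store. The sample character written as '\u4E2D' is my own choice for illustration: it is a Chinese character, written with a Unicode escape so the source file can stay plain ASCII.

```java
class UnicodeBits {
    // Left-pads the binary form of `value` with zeros up to `width` digits.
    static String toBits(int value, int width) {
        StringBuilder sb = new StringBuilder(Integer.toBinaryString(value));
        while (sb.length() < width) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The largest value a 16-bit char can hold.
        System.out.println((int) Character.MAX_VALUE); // prints 65535

        // ASCII letters keep their old values, with 8 extra zeros in front.
        System.out.println(toBits('A', 16)); // prints 0000000001000001

        // A character outside ASCII needs the upper 8 bits as well.
        char sample = '\u4E2D';
        System.out.println((int) sample);       // prints 20013
        System.out.println(toBits(sample, 16)); // prints 0100111000101101
    }
}
```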

Character Literals

A Java character literal is simply a single character enclosed in single quotes. 'C', 'A', 'T', '8', '{' are all examples of character literals.


Open the command prompt (CMD; see the Getting Started tutorial) and type in the following commands.

C:\Windows\System32>cd \
C:\>md Java
C:\>cd Java
C:\Java>
C:\Java>md CharPrimitive
C:\Java>cd CharPrimitive
C:\Java\CharPrimitive>Notepad CharPrimitive.java

Copy and Paste, or type the following code into Notepad and be sure to save the file when you are done.


class CharPrimitive {
    public static void main(String[] args) {
        // data type (char), variable/identifier (varA), assignment operator (=),
        // character literal ('A'), semicolon (terminates the statement)
        char varA = 'A';
        char varC = 'C', varT = 'T';

        // Prints CAT one character at a time, then ends the line.
        System.out.print(varC);
        System.out.print(varA);
        System.out.print(varT);
        System.out.println();

        // A char can also be assigned its decimal value: 68 = D, 79 = O, 71 = G.
        char varD = 68, varO = 79, varG = 71;
        System.out.print(varD);
        System.out.print(varO);
        System.out.print(varG);
        System.out.println();
    }
}

Now switch back to the command prompt (CMD) and type in javac CharPrimitive.java and press Enter.
Now type in java CharPrimitive and press Enter.


C:\Java\CharPrimitive>javac CharPrimitive.java
C:\Java\CharPrimitive>java CharPrimitive
CAT
DOG
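
The program above assigns the constants 68, 79, and 71 directly to char variables. When the number comes from an int variable instead, Java needs an explicit cast, written (char). The sketch below shows both directions. CharFromInt and spell are names I made up for this example; if you want to try it, it would go in its own file named CharFromInt.java.

```java
class CharFromInt {
    // Turns a list of decimal codes into the text they spell.
    static String spell(int... codes) {
        StringBuilder sb = new StringBuilder();
        for (int code : codes) {
            sb.append((char) code); // int to char needs an explicit cast
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(spell(68, 79, 71)); // prints DOG

        char letter = 'C';
        int code = letter; // char to int needs no cast
        System.out.println(code); // prints 67
    }
}
```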


Copy and Paste, or type the following code into the same Notepad file and be sure to save the file when you are done.


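class CharPrimitive {
    public static void main(String[] args) {
        char varA = 'A';
        char backSpace = 8; // decimal 8 is the backspace control character

        // Print the 26 uppercase letters; varA++ moves to the next character.
        for (int i = 1; i <= 26; i++) {
            System.out.print(varA);
            System.out.print(",");
            varA++;
        }
        // Back up over the trailing comma and overwrite it with a space.
        // This relies on the console honoring the backspace character.
        System.out.print(backSpace);
        System.out.println(" ");
        System.out.println("Complete"); // produces the last line of the output shown below
    }
}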

Now switch back to the command prompt (CMD) and type in javac CharPrimitive.java and press Enter.
Now type in java CharPrimitive and press Enter.


C:\Java\CharPrimitive>javac CharPrimitive.java
C:\Java\CharPrimitive>java CharPrimitive
A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z
Complete
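
The backspace trick works in the Windows command prompt, but it depends on the console honoring character 8. As an alternative sketch, the comma can be added only between letters so nothing needs to be erased. AlphabetList and letters are hypothetical names used only for this illustration.

```java
class AlphabetList {
    // Joins the characters from `first` through `last` with commas,
    // adding a comma only between characters.
    static String letters(char first, char last) {
        StringBuilder sb = new StringBuilder();
        for (char c = first; c <= last; c++) {
            if (c != first) {
                sb.append(',');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(letters('A', 'Z'));
    }
}
```

Because a char is really a number underneath, the loop can count from 'A' to 'Z' the same way it would count from 65 to 90.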


Final thoughts


ASCII Printable Characters

Char Decimal Binary Description
  32 00100000 space
! 33 00100001 exclamation mark
" 34 00100010 quotation mark
# 35 00100011 number sign
$ 36 00100100 dollar sign
% 37 00100101 percent sign
& 38 00100110 ampersand
' 39 00100111 apostrophe
( 40 00101000 left parenthesis
) 41 00101001 right parenthesis
* 42 00101010 asterisk
+ 43 00101011 plus sign
, 44 00101100 comma
- 45 00101101 hyphen
. 46 00101110 period
/ 47 00101111 slash
0 48 00110000 Digit 0
1 49 00110001 Digit 1
2 50 00110010 Digit 2
3 51 00110011 Digit 3
4 52 00110100 Digit 4
5 53 00110101 Digit 5
6 54 00110110 Digit 6
7 55 00110111 Digit 7
8 56 00111000 Digit 8
9 57 00111001 Digit 9
: 58 00111010 colon
; 59 00111011 semicolon
< 60 00111100 less than
= 61 00111101 equals
> 62 00111110 greater than
? 63 00111111 question mark
@ 64 01000000 at symbol
A 65 01000001 Uppercase A
B 66 01000010 Uppercase B
C 67 01000011 Uppercase C
D 68 01000100 Uppercase D
E 69 01000101 Uppercase E
F 70 01000110 Uppercase F
G 71 01000111 Uppercase G
H 72 01001000 Uppercase H
I 73 01001001 Uppercase I
J 74 01001010 Uppercase J
K 75 01001011 Uppercase K
L 76 01001100 Uppercase L
M 77 01001101 Uppercase M
N 78 01001110 Uppercase N
O 79 01001111 Uppercase O
P 80 01010000 Uppercase P
Q 81 01010001 Uppercase Q
R 82 01010010 Uppercase R
S 83 01010011 Uppercase S
T 84 01010100 Uppercase T
U 85 01010101 Uppercase U
V 86 01010110 Uppercase V
W 87 01010111 Uppercase W
X 88 01011000 Uppercase X
Y 89 01011001 Uppercase Y
Z 90 01011010 Uppercase Z
[ 91 01011011 left square bracket
\ 92 01011100 backslash
] 93 01011101 right square bracket
^ 94 01011110 caret
_ 95 01011111 underscore
` 96 01100000 grave accent
a 97 01100001 lowercase a
b 98 01100010 lowercase b
c 99 01100011 lowercase c
d 100 01100100 lowercase d
e 101 01100101 lowercase e
f 102 01100110 lowercase f
g 103 01100111 lowercase g
h 104 01101000 lowercase h
i 105 01101001 lowercase i
j 106 01101010 lowercase j
k 107 01101011 lowercase k
l 108 01101100 lowercase l
m 109 01101101 lowercase m
n 110 01101110 lowercase n
o 111 01101111 lowercase o
p 112 01110000 lowercase p
q 113 01110001 lowercase q
r 114 01110010 lowercase r
s 115 01110011 lowercase s
t 116 01110100 lowercase t
u 117 01110101 lowercase u
v 118 01110110 lowercase v
w 119 01110111 lowercase w
x 120 01111000 lowercase x
y 121 01111001 lowercase y
z 122 01111010 lowercase z
{ 123 01111011 left curly brace
| 124 01111100 vertical bar
} 125 01111101 right curly brace
~ 126 01111110 tilde
DEL 127 01111111 delete (control character, not printable)
