December 31 2013 by Kevin Bowersox

   When you begin learning Java, you will most likely encounter a discussion of the primitive data types early on.  An investigation of the primitive numeric data types will expose you to the integral primitives byte, short, int and long.  From their definitions, most readers, even those with no programming background, can discern the following about these types:

These primitive data types represent only whole numbers (including negative numbers), that is, integers.

 

Each of these primitives can store a specific range of values with byte having the smallest range and long having the largest range.
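
   As a quick illustration of those ranges, the following sketch prints the width and the bounds of each integral primitive using constants from the standard wrapper classes.  The class name IntegralRanges and the helper describe are made up for this example.

```java
class IntegralRanges {

	// Hypothetical helper: builds one summary line for a type.
	static String describe(String name, int bits, long min, long max) {
		return name + " (" + bits + " bits): " + min + " to " + max;
	}

	public static void main(String[] args) {
		System.out.println(describe("byte", Byte.SIZE, Byte.MIN_VALUE, Byte.MAX_VALUE));
		System.out.println(describe("short", Short.SIZE, Short.MIN_VALUE, Short.MAX_VALUE));
		System.out.println(describe("int", Integer.SIZE, Integer.MIN_VALUE, Integer.MAX_VALUE));
		System.out.println(describe("long", Long.SIZE, Long.MIN_VALUE, Long.MAX_VALUE));

		// Outputs:
		// byte (8 bits): -128 to 127
		// short (16 bits): -32768 to 32767
		// int (32 bits): -2147483648 to 2147483647
		// long (64 bits): -9223372036854775808 to 9223372036854775807
	}
}
```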


   With these simple concepts established, most beginners can declare and use these primitives without any further knowledge.  This article explores the fundamentals of how these primitives are stored in memory.  At this point, you may be asking yourself whether it is really necessary to understand how Java stores a primitive data type.

   Grasping this low-level concept will provide a better understanding of concepts such as type overflow, memory management and reading/writing files.  This knowledge serves as the foundation for understanding bitwise operators and brings the programmer one step closer to the hardware.  A true appreciation and understanding of the language is gained from understanding how the language works at its lowest level.

 

   From a personal perspective, the more I work with third-party frameworks such as Spring and Hibernate, the more I yearn to master the 1s and 0s.


   The first clue about storing primitives can be found in the Java Virtual Machine Specification, Section 2.3, Primitive Types and Values.  Within this section we find the definition of the integral primitive types.  Let's specifically focus on the definition of byte, which is as follows:

 

byte, whose values are 8-bit signed two's-complement integers, and whose default value is zero

 

   For the seasoned programmer this definition is clear and concise, but a beginner may find it intimidating.  What exactly is an 8-bit signed two's-complement integer?  Let's discover its meaning by exploring the language and a bit of binary.


   First, let's evaluate the MIN, MAX and RANGE of a byte:

public class ByteExamples {

	public static void main(String[] args) {
		System.out.println("MAX:" + Byte.MAX_VALUE);
		System.out.println("MIN:" + Byte.MIN_VALUE);
		
		int byteCounter = 0;
		for(int x = Byte.MIN_VALUE; x <= Byte.MAX_VALUE; x++){
			byteCounter++;
		}
		
		System.out.println("RANGE:" + byteCounter);
		
		// Outputs:
		// MAX:127
		// MIN:-128
		// RANGE:256
	}
}

    Our simple example allows us to discern two facts about bytes.  First, there are 256 values assignable to a byte and second, those values range from -128 to 127.  Keep these two facts in the back of your mind as we move on to explore some related binary numbers.


   The definition of byte states that each byte is 8-bit.  This means that 8 binary digits, each either one or zero, make up a byte.  For example, 1111 1111 is the largest value an unsigned 8-bit binary number can hold.  Let's explore this value through Java:

public class ByteExamples {

	public static void main(String[] args) {
		System.out.println("VALUE: " + Integer.parseInt("11111111", 2));
		//  Outputs:
		//  VALUE: 255
	}

}


    In our first example, we discovered that the maximum value for a primitive byte is 127.  However, when calculating the maximum value for an 8-bit number, 1111 1111, we discover that 8 bits can represent a maximum value of 255.  This may cause you to question why the primitive byte's maximum value is lower than the maximum value that can be represented by an 8-bit number.  To answer this question, let's explore the binary representation of the minimum byte value, -128:

public class ByteExamples {

	public static void main(String[] args) {
		// toBinaryString works on a 32-bit int, so keep only the last 8 digits
		String b = Integer.toBinaryString(-128);
		System.out.println(b.substring(b.length() - 8));
	}
	// Outputs:
	// 10000000
}

   Interestingly, we see that byte's minimum decimal value, -128, has the binary representation 1000 0000.  At this point, we must ask why a seemingly positive binary number, 1000 0000, is equivalent to a negative decimal number.  The answer lies in how Java stores negative numbers in binary: two's complement.

    To create a negative number, Java uses the two's complement, which is formed by first representing the absolute value of the number in binary.  For example, the absolute value of -128 (128) represented in 8-bit binary would be:  1000 0000.  Next, we perform a bitwise NOT (~) by flipping every bit: 0111 1111.  Then we add 1, leaving us with the two's complement: 1000 0000.
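
    The two steps described above (flip every bit, then add one) can be sketched in Java as follows.  The class TwosComplementSteps and its helpers toBits8 and negate8 are names invented for this illustration; the masking with 0xFF is an assumption made here to keep the int arithmetic confined to 8 bits.

```java
class TwosComplementSteps {

	// Formats the low 8 bits of value as a zero-padded binary string.
	static String toBits8(int value) {
		String bits = Integer.toBinaryString(value & 0xFF);
		return String.format("%8s", bits).replace(' ', '0');
	}

	// Two's complement of an 8-bit magnitude: flip every bit, then add one.
	static int negate8(int magnitude) {
		int flipped = ~magnitude & 0xFF; // step 1: bitwise NOT, kept to 8 bits
		return (flipped + 1) & 0xFF;     // step 2: add one, discard any carry
	}

	public static void main(String[] args) {
		int[] magnitudes = {5, 127, 128};
		for (int m : magnitudes) {
			System.out.println(m + " = " + toBits8(m)
					+ ", flipped = " + toBits8(~m)
					+ ", -" + m + " = " + toBits8(negate8(m)));
		}

		// Outputs:
		// 5 = 00000101, flipped = 11111010, -5 = 11111011
		// 127 = 01111111, flipped = 10000000, -127 = 10000001
		// 128 = 10000000, flipped = 01111111, -128 = 10000000
	}
}
```

    Casting the result back to byte, for example (byte) negate8(5), yields -5, which agrees with what Java stores for that value.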

    At the lowest level, Java stores negative decimal numbers in binary using the two's complement, while zero and positive numbers are stored in their regular binary representation.  When storing binary numbers, the most significant bit of the binary number is considered the sign bit.  In a binary number, the most significant bit is always the leftmost bit, since it is the bit with the largest place value.  In a two's complement number system the sign bit will be 0 for zero and positive numbers and 1 for negative numbers, hence the most significant bit of the minimum byte value (-128) is a 1: 1000 0000.
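
    A small sketch can confirm the role of the sign bit by testing the leftmost of the 8 bits with a mask.  The class SignBitCheck and the method isSignBitSet are hypothetical names used only for this example.

```java
class SignBitCheck {

	// True when the most significant of the 8 bits is 1.
	// 0x80 is 1000 0000 in binary, so the mask isolates that bit.
	static boolean isSignBitSet(byte value) {
		return (value & 0x80) != 0;
	}

	public static void main(String[] args) {
		byte[] samples = {Byte.MIN_VALUE, -1, 0, 1, Byte.MAX_VALUE};
		for (byte sample : samples) {
			System.out.println(sample + " sign bit set: " + isSignBitSet(sample));
		}

		// Outputs:
		// -128 sign bit set: true
		// -1 sign bit set: true
		// 0 sign bit set: false
		// 1 sign bit set: false
		// 127 sign bit set: false
	}
}
```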

    To convert a negative two's complement number to decimal, we apply the same two operations again.  Using 1000 0000 (-128 decimal) as our example, we first perform a bitwise NOT (~) by flipping every bit: 0111 1111.  Next, we add one to the inverted number:  1000 0000 or 128 decimal.  Finally, we negate the number since its most significant bit was a 1, leaving us with -128.
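
    The conversion just described can be sketched as a small method that takes an 8-digit binary string and returns its decimal value.  DecodeTwosComplement and decode8 are hypothetical names, and the sketch assumes the input is exactly 8 characters of 0s and 1s.

```java
class DecodeTwosComplement {

	// Decodes an 8-digit two's complement bit string into its decimal value.
	static int decode8(String bits) {
		int raw = Integer.parseInt(bits, 2);  // unsigned reading, 0 to 255
		if ((raw & 0x80) == 0) {
			return raw;                       // sign bit is 0: value is as written
		}
		int magnitude = (~raw & 0xFF) + 1;    // flip every bit, then add one
		return -magnitude;                    // sign bit was 1: negate
	}

	public static void main(String[] args) {
		System.out.println(decode8("10000000"));
		System.out.println(decode8("11111111"));
		System.out.println(decode8("01111111"));

		// Outputs:
		// -128
		// -1
		// 127
	}
}
```

    As a cross-check, the cast (byte) Integer.parseInt(bits, 2) produces the same values for these inputs.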

    We can now answer our lingering question, "Why is the primitive byte's maximum value (127) lower than the maximum value that can be represented by an 8-bit number (255)?"  The answer is that the most significant bit serves as the sign bit so that negative numbers can be represented in binary.  Only the remaining seven bits are left to hold the magnitude of a positive value, and the largest number seven bits can represent is 127.  We can illustrate this via a Java example:

public class ByteExamples {

	public static void main(String[] args) {
		String b = String.format("%32s",Integer.toBinaryString(Byte.MAX_VALUE)).replace(' ', '0');
		System.out.println(b.substring(b.length() - 8));
	}
	// Outputs:
	// 01111111
}

    In this example we see that the sign bit is 0 and precedes seven bits, 1111111, equivalent to 127 decimal.  It should also be noted that the sign bit sits in the most significant digit of the 8-bit binary number, the 128 position.  In two's complement, a 1 in that position contributes -128 rather than +128, which is why 1000 0000 is byte's minimum value.
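
    One way to check this reading of the sign bit is to add up the place values of an 8-bit pattern directly, counting the leftmost position as -128 and the other seven positions as 64, 32, 16, 8, 4, 2 and 1.  The class PositionalWeights and the method weightedSum are names invented for this sketch, which assumes an input of exactly 8 binary digits.

```java
class PositionalWeights {

	// Sums the place values of an 8-digit bit string, treating the
	// leftmost (sign) position as -128 instead of +128.
	static int weightedSum(String bits) {
		int sum = 0;
		for (int i = 0; i < 8; i++) {
			if (bits.charAt(i) == '1') {
				int position = 7 - i;  // 7 is the leftmost bit, 0 the rightmost
				int weight = (position == 7) ? -128 : (1 << position);
				sum += weight;
			}
		}
		return sum;
	}

	public static void main(String[] args) {
		System.out.println(weightedSum("01111111"));
		System.out.println(weightedSum("10000000"));
		System.out.println(weightedSum("11111111"));

		// Outputs:
		// 127
		// -128
		// -1
	}
}
```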
   

   This article demonstrated how Java stores the primitive integral type byte.  While our exploration only covered the byte primitive, the same concepts apply to the larger primitive integral types short, int and long, which simply use more bits to represent larger numbers.  A main takeaway from this article is that the primitive integral types use two's complement to store negative numbers in binary, which includes the important sign bit.
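
   To see the same pattern in the wider types, the sketch below prints the bit patterns of their minimum values.  WiderIntegralTypes and its helpers pad, shortBits, intBits and longBits are hypothetical names for this example; the masking with 0xFFFF is used here because Integer.toBinaryString widens a short to 32 bits.

```java
class WiderIntegralTypes {

	// Left-pads a binary string with zeros to the given width.
	static String pad(String bits, int width) {
		StringBuilder sb = new StringBuilder();
		for (int i = bits.length(); i < width; i++) {
			sb.append('0');
		}
		return sb.append(bits).toString();
	}

	static String shortBits(short value) {
		return pad(Integer.toBinaryString(value & 0xFFFF), Short.SIZE);
	}

	static String intBits(int value) {
		return pad(Integer.toBinaryString(value), Integer.SIZE);
	}

	static String longBits(long value) {
		return pad(Long.toBinaryString(value), Long.SIZE);
	}

	public static void main(String[] args) {
		System.out.println(shortBits(Short.MIN_VALUE));
		System.out.println(shortBits(Short.MAX_VALUE));
		// The next two lines print a 1 followed by 31 and 63 zeros respectively.
		System.out.println(intBits(Integer.MIN_VALUE));
		System.out.println(longBits(Long.MIN_VALUE));

		// The first two lines output:
		// 1000000000000000
		// 0111111111111111
	}
}
```

   In each case the minimum value has only the sign bit set and the maximum value has every bit set except the sign bit, just as with byte.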

    Admittedly, the primitive byte and binary numbers are most likely not a fixture in your daily work.  However, the concepts illustrated in this article are applicable to all integral types, which may appear more often during your daily endeavors.  There is also merit in understanding the basic concepts upon which the language is built, since they often arise in other languages and technologies we encounter.  The Java environment is lush with third-party libraries, and we often develop using libraries that depend on other libraries.  While these libraries are extremely useful, they often leave us out of touch with some of the most basic concepts of the language.  To maximize our potential as developers, we must master the basics, which will serve as the building blocks for leveraging more complicated technology stacks.
