the good, the bad and the ugly: C language: Bits, SIGNED, UNSIGNED and their storage.

We have now covered most things related to C, but an interviewer can still surprise us with a question that leaves us confused.

For example, one of my friends was asked:

char C = 1;   // is 1 stored, or 0x31 (the ASCII code of the character '1')?

Another question was:

char c=30;

for ( ; c<300; c++)
{
    printf("%d",c);
}

// What will be the output?

To answer both of these questions, we have to understand how a number is stored and how SIGNED and UNSIGNED number storage differ.

We will look only at CHAR and INT, since they are easier to understand than FLOAT.

Now, when we write

char c=1;
char c='1';
int c=1;
int c='1';

We have to understand that the compiler stores the numeric value of the RHS, which is easiest to read in hexadecimal. So irrespective of int or char, 1 is stored as 0x01 and '1' is stored as 0x31 (its ASCII code, decimal 49).

When printing, the stored bits are read back, and the format specifier decides how they are shown:
%d prints the stored value as a decimal integer, and %c prints the character which has that code.

So for

char a=1;
char b='1';
int c=1;
int d='1';

printf ("%d %d %d %d",a,b,c,d); // will give 1 49 1 49

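Coming to the second part: SIGNED vs UNSIGNED.

SIGNED means the type can hold both + and - values. By DEFAULT int is SIGNED. For plain char the C standard leaves the choice to the implementation; it is SIGNED on many common compilers (for example gcc on x86), and the rest of this post assumes that.

Now take a machine which stores CHAR in 8 bits and INT in 32 bits.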

For UNSIGNED types all the bits hold the value, so INT can range from 0 to 2^32 - 1 and CHAR from 0 to 2^8 - 1 (255).

So what if we store a number greater than 255 in an unsigned CHAR?

In that case, only the last 8 bits of the binary representation of the RHS are kept.

So if we have   unsigned char c=256;   printing its value shows that 0 is stored in it,

since 256 in binary is

0001 0000 0000 , and CHAR keeps only the last 8 bits, which are all 0.

You will usually get a WARNING at compile time though, and you can compile with the -Werror option, which turns the warning into an error, so that you cannot skip it and get unwanted behavior in your program.

So

unsigned char c=30;

for ( ; c<300; c+=30)
{
    printf("%d",c);
}

// This will give an infinite loop: c can hold at most 255, so c<300 is always true. After 240 the next value would be 270, but the number actually stored in c will be

270 = 1 0000 1110    // taking the last 8 bits = 0000 1110 = 14 , so the loop runs again.

Now what if we assign a negative number to an unsigned type?

Ex :

unsigned char c = -1;

The compiler will first convert -1 to binary, using 2's complement.
After that it keeps the last 8 bits of the RHS and stores them in c.

Note that while printing the value, the 8th bit will not be treated as a SIGN bit.
Since c is UNSIGNED, all 8 bits are read as the value.

So printf("%d",c)    will give us 255.

------------------------------------------------------------------------------------------------------------------

2's complement of -1 :

1) Remove the negative sign: 1
2) Convert the number to its binary representation
   So we have 0000 0001
3) Apply the NOT operation
   1111 1110
4) Now add 1
   1111 1111
------------------------------------------------------------------------------------------------------------------
This is the number that is stored: 0xFF goes into the UNSIGNED CHAR, and since we have defined c as UNSIGNED, it is read back as 255.

So we are done with UNSIGNED INT and CHAR.

Now coming to the SIGNED ones. In SIGNED types the leftmost (most significant) bit is treated as the sign flag.

So

1) signed char c=0x80;
2) unsigned char c=0x80;

%d of both will give

1) -128
2) 128     // this is quite evident

So how did 1) come out as -128?

0x80 was stored as    1000 0000
The type is signed and the first bit is 1, so the value is negative, and its magnitude is the two's complement of the stored bits:

number        : 1000 0000     // first bit is 1 , so a - sign goes in front when printing
NOT operation : 0111 1111
add 1         : 1000 0000   = 128 , hence -128

Run this program and see if you can explain the output.

#include <stdio.h>

int main(void)
{
    signed char c;

    c = 0x80;
    printf("%d\n", c);
    c = 8;
    printf("%d\n", c);

    c = -8;
    printf("%d\n", c);

    c = 511;
    printf("%d\n", c);
    c = 255;
    printf("%d\n", c);
    c = -255;
    printf("%d\n", c);
    return 0;
}

Note: when dealing with SIGNED numbers, remember

1) Irrespective of signed or unsigned, every negative number is stored as its 2's complement.
2) While printing, if the MSB of the number is 1 and the type is SIGNED, the 2's complement is printed with a negative sign.

So with SIGNED types, the stored bit pattern and the printed value can look different, which the program below highlights.

One more interesting question:

#include <stdio.h>

int main(void)
{
    unsigned char c = -8;
    signed char d = 0x80;

    printf("%d\n", c);
    printf("%d\n", d);

    signed char e = (signed char)(c & d);   // cast the whole result, not just c
    unsigned char f = c & d;

    printf("%d\n", c & d);
    printf("%d\n", e);
    printf("%d\n", f);
    return 0;
}

Now the output will be

248
-128
128
-128
128

Explanation :

c holds the 2's complement of 8, since -8 is negative:
8             : 0000 1000
NOT operation : 1111 0111
add 1         : 1111 1000 = 248. Since c is unsigned, 248 is printed, which is also exactly what is in STORAGE in memory for it.

In d we have 1000 0000 (in STORAGE).
When it is printed, since the MSB is 1 and the type is SIGNED, its 2's complement is taken:

NOT operation : 0111 1111
add 1         : 1000 0000 = 128 , and adding the negative sign gives -128.

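Now when we perform the & operation, we are operating on

1111 1000   and 1000 0000 , the values in storage, so we get

1000 0000 which is 128.

Strictly speaking, both operands are first promoted to int: c becomes 248 and d becomes -128, with the sign bit copied into all the upper bits. The upper bits of 248 are all 0, so they vanish in the &, and the result is the int 128. It is not an UNSIGNED type; it is a plain signed int which happens to be positive.

Hence we get 128 for c&d and for f, but when we store the result in the signed char e, the bits 0x80 are read as SIGNED again, hence -128.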

What if we had used the + operation instead of the & operation? Try to work it out.

And what about the - operation? HINT : the - sign just tells the compiler that the number is negative, so a-b is computed like   a + (2's complement of b).
----------------------------------------------------------------------------------------------------------

While writing the - program, I found one thing which was a bit puzzling.

Ex :

unsigned char c = -8;
signed char  d = 0x80;

unsigned char e= c-d;
printf("%d\n",e);
printf("%d\n",c-d);

The outputs were   120 and 376.

120 is quite evident.

c is stored in memory as 0xF8
and d as 0x80.

Now when we do c-d, it is like c + (2's complement of d), and the 2's complement of 0x80 is again 0x80.

So 0xF8 + 0x80 gives   1 0111 1000    , but since e is just 8 bits we keep 0111 1000 = 120.

But in printf("%d",c-d) nothing is cut off. Both c and d are promoted to int before the subtraction, so c-d is an int: 248 - (-128) = 376, which is the full 1 0111 1000.

-------------------------------------------------------------------

Similarly,

signed char d = 0x80;
signed char e = -0x80;

printf("%d", -d );
printf("%d", e );

The two values will be different; try to figure out why.

So to summarize:

Every negative number is stored as its 2's complement.
While printing, whether the type is signed or unsigned decides how the stored bits are read.
During operations, the stored value is what takes part, after char operands have been promoted to int.


Monday, October 5, 2015
