Understanding CTS in .Net

PCQ Bureau

Programming in any language requires a clear understanding of the fundamental data types available to the programmer. A data type defines the kind of data a programmer can work with: its nature (integers, floating-point values, or characters) and its range, that is, the minimum and maximum values the type can hold.

Here we’ll take a look at the .Net type system, called CTS (Common Type System), which forms the basis of the data types in all .Net languages, including C#. We’ll then see data types, their divisions and, finally, how to convert one kind to another.

Hello CTS

In the .Net article last month (See Understanding .Net, page 162, PCQuest July 2001), we saw how the CLS (Common Language Specification) provides a foundation which, when adhered to by compiler vendors, allows them to write compilers compliant with the CLR (Common Language Runtime), and hence interoperable with other compliant compilers. This interoperability stems from the fact that the CLS defines a set of data types and rules that all compliant compilers must support. These data types are drawn from the .Net CTS (Common Type System); the CLS is a subset of the CTS.

The CTS makes available a common set of data types so that compiled code of one language can easily interoperate with compiled code of another language by understanding each other’s data types. But why do we need a common type system in the first place?

In traditional object-oriented languages, developers had two kinds of data types at their disposal: primitive types, defined by the language and built into it, like int, float, double, or char; and user-defined types, which included classes. The problem was that these two kinds were, and still are, incompatible with each other. For example, say you have a float variable and an object of some class, say CFoo. While CFoo may have methods associated with it, the same isn’t true of the built-in float type. Thus, if you were asked to write a piece of code that could work identically on the two data types without any change, you couldn’t. You were stuck until you wrote wrapper classes for the primitive types, and then made it a habit to declare variables of the wrapper class wherever you would otherwise have used a primitive type.

This problem has been taken care of by the .Net CTS. Under the CTS, all data types are objects in nature and, more importantly, they all derive from a common, most generic class, System.Object (called object in C#). Since they all derive from one class, they share some common functionality, and any value can be treated as an object when required.
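As a minimal sketch of what this common root makes possible, the following program passes an int, a float, and a string to one method that accepts object. The class and method names (ObjectRootDemo, TypeNameOf) are made up for this illustration; GetType() and ToString() are members that every type inherits from System.Object.

```csharp
using System;

public static class ObjectRootDemo
{
    // Works on any value, because every CTS type derives from System.Object.
    public static string TypeNameOf(object value)
    {
        return value.GetType().FullName;
    }

    public static void Main()
    {
        int count = 2;
        float ratio = 1.5f;
        string name = "CTS";

        // One method handles an int, a float and a string alike.
        Console.WriteLine(TypeNameOf(count));   // System.Int32
        Console.WriteLine(TypeNameOf(ratio));   // System.Single
        Console.WriteLine(TypeNameOf(name));    // System.String

        // Members inherited from System.Object are available on an int too.
        Console.WriteLine(count.ToString());    // 2
    }
}
```

No wrapper classes are needed here: the same method body serves the built-in types and any class.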

However, creating everything as an object has a disadvantage in the form of performance degradation. Suppose we were to add two integers. From our knowledge of C#, we know that even an integer is an object. If every object lived on the heap (the area of memory from which storage is allocated dynamically at run time), the simple operation of adding two integers would require heap allocations for the integer objects. To tackle this, and make things more efficient, the CTS divides the available data types into two categories.

Value and reference types

The CTS data types are categorized as either value types or reference types, depending on how they are created in the memory.

The value types comprise the following:

  • Simple types, such as integers, floats, doubles, char, byte, short, long, bool
  • Structures
  • Enums

Likewise, the reference types include the following:

  • Classes
  • Interfaces (new to C#)
  • Delegates (new to C#)
  • Arrays
  • Objects
  • Strings

To understand how the two categories differ in their creation, suppose we perform the following assignment:



int a = 2;



Since a is a variable of the type int, which happens to be a value type, we end up allocating space on the stack for the variable, and the assigned value, that is, 2, is stored there. Likewise, if we perform the following assignment,



int b = a;



we allocate another space on the stack for the variable b, and store the value 2 there. Thus, both memory locations, corresponding to the value type variables a and b, contain the value 2. This boils down to the fact that a value type variable directly contains its data, and assigning it to another variable copies that data.
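A short sketch of this copy behaviour follows. The class name ValueCopyDemo is only illustrative; the point is that changing b afterwards leaves a alone, since each variable has its own storage.

```csharp
using System;

public static class ValueCopyDemo
{
    public static void Main()
    {
        int a = 2;
        int b = a;   // b receives its own copy of the value 2

        b = 5;       // changes only b's storage

        Console.WriteLine("a={0}, b={1}", a, b);   // a=2, b=5
    }
}
```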

Kinds of data types

Type Name         Known in C# as   Kind
System.Object     object           Mother of all data types
System.String     string           String
System.SByte      sbyte            Signed 8-bit byte
System.Byte       byte             Unsigned 8-bit byte
System.Int16      short            Signed 16-bit integer
System.UInt16     ushort           Unsigned 16-bit integer
System.Int32      int              Signed 32-bit integer
System.UInt32     uint             Unsigned 32-bit integer
System.Int64      long             Signed 64-bit integer
System.UInt64     ulong            Unsigned 64-bit integer
System.Char       char             16-bit Unicode character
System.Boolean    bool             Boolean value (true/false)
System.Single     float            32-bit floating-point number
System.Double     double           64-bit floating-point number
System.Decimal    decimal          128-bit number for financial applications

Reference type allocations work differently. For instance, in the following assignment,

string s = "Nannu misses me….. but I don't!";

instead of the stack, memory is allocated from the heap. The assigned string is stored there and the memory address is stored in the reference type variable s. Thus, s doesn’t contain the string, but it points, or refers, to the memory location which contains the assigned string.
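To make the difference visible, here is a small sketch that uses an array, another reference type from the list above, instead of a string (strings cannot be modified in place, so they would not show the effect). The class name ReferenceCopyDemo is an assumption for the example.

```csharp
using System;

public static class ReferenceCopyDemo
{
    public static void Main()
    {
        int[] first = { 1, 2, 3 };   // the array object lives on the heap
        int[] second = first;        // copies the reference, not the array

        second[0] = 99;              // modifies the one shared array

        Console.WriteLine(first[0]);                                // 99
        Console.WriteLine(object.ReferenceEquals(first, second));   // True
    }
}
```

Both variables refer to the same object, so a change made through one is seen through the other.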

Thus, even though all data types in the CTS are objects in nature and derive from the common System.Object class, the way they are worked upon by the CLR is different. And it’s these different ways of creation that make the CTS efficient, even though everything’s an object. The table “Kinds of data types” gives you a brief introduction to the various kinds of data types.

The CTS also includes data types wider than 32 bits. The best part is that the ranges of these data types remain fixed, irrespective of the system on which the application using them is run. Unlike traditional development environments, where the range of a data type depended on the underlying microprocessor, in the CTS the ranges are defined by the CLR. So, it is the job of the CLR to make sure that the data types keep the sizes shown in the table, irrespective of the underlying microprocessor.
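The fixed sizes can be checked with a few lines of code. This is only a sketch (the class name RangeDemo is made up); it relies on sizeof and on the MinValue and MaxValue constants that the built-in numeric types expose.

```csharp
using System;

public static class RangeDemo
{
    public static void Main()
    {
        // For the built-in types, sizeof yields a constant number of bytes.
        Console.WriteLine("int:  {0} bytes, {1} to {2}", sizeof(int), int.MinValue, int.MaxValue);
        Console.WriteLine("long: {0} bytes, {1} to {2}", sizeof(long), long.MinValue, long.MaxValue);
        Console.WriteLine("char: {0} bytes", sizeof(char));
    }
}
```

On any platform this should report 4 bytes for int, 8 for long, and 2 for char.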

So, when we write a statement as below in C#,

int a = 2;

what we are actually doing is telling the compiler that a is a variable of the System.Int32 type. int is just the C# alias for System.Int32.
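A quick way to convince yourself that the keyword and the CTS name denote the same type is sketched below; AliasDemo is a hypothetical class name used only for this example.

```csharp
using System;

public static class AliasDemo
{
    public static void Main()
    {
        int a = 2;
        System.Int32 b = 2;   // same type, spelled with its CTS name

        Console.WriteLine(typeof(int) == typeof(System.Int32));   // True
        Console.WriteLine(a.GetType().FullName);                  // System.Int32
        Console.WriteLine(a == b);                                // True
    }
}
```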

Now that we’ve gone through the basics of data types, let’s move on to boxing (of data types).

CTS conversions

In .Net, and consequently in C#, the process of converting a value type to a reference type is termed boxing. Conversely, the process of converting a reference type back to a value type is termed unboxing. Let’s take an example.

int a = 2;

object oa = a;

Here, the first line creates a value type variable a on the stack and assigns the value 2 to the memory location. The second line performs a boxing operation automatically: it creates an object on the heap, copies the value of the value type variable a into it, and stores a reference to that object in oa. The important point here is that the two values are independent of each other. The following lines of code illustrate the concept.

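int joke = 2;
object ojoke = joke;   // boxing: copies the value 2 into a new object on the heap
ojoke = 3;             // ojoke now refers to a newly boxed 3; joke is untouched
Console.WriteLine("Joke={0}, oJoke={1}", joke, ojoke);   // prints Joke=2, oJoke=3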

When the value of ojoke is changed, the change isn’t reflected in joke, because boxing copied the value into a separate object on the heap. This is the consequence of the way value and reference types are created by the CLR.
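The independence holds in the other direction as well. In this sketch (BoxingCopyDemo is an illustrative name), the original variable is changed after boxing, and the boxed copy keeps the old value.

```csharp
using System;

public static class BoxingCopyDemo
{
    public static void Main()
    {
        int joke = 2;
        object boxed = joke;   // boxing: the value 2 is copied into a new heap object

        joke = 10;             // changes only the original variable

        Console.WriteLine("joke={0}, boxed={1}", joke, boxed);   // joke=10, boxed=2
    }
}
```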

Moving on to unboxing, one notable difference from boxing is that during unboxing, we have to specify the type being unboxed to, and hence, it’s an explicit operation. Consequently, the CLR first verifies that the type being requested is actually stored in the reference type, as in the following example.

int joke1 = 2;
object ojoke = joke1;
int joke2 = (int)ojoke;

In this case, after the boxing operation in line 2 is performed, when unboxing is attempted in line 3, the CLR first ensures that the requested type (in this case int) is actually present in the reference type. Since it is, because an integer was boxed into ojoke, the unboxing operation succeeds. Had some other type been requested, like decimal, the unboxing operation would have failed, and an exception (System.InvalidCastException) would have been raised.
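The failure case can be sketched as follows. The helper CanUnboxToDecimal and the class name UnboxDemo are invented for this example; the exception caught is the standard System.InvalidCastException.

```csharp
using System;

public static class UnboxDemo
{
    // Returns true if the boxed value can be unboxed directly to decimal.
    public static bool CanUnboxToDecimal(object boxed)
    {
        try
        {
            decimal d = (decimal)boxed;   // succeeds only if a decimal was boxed
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    public static void Main()
    {
        object ojoke = 2;   // boxes an int

        Console.WriteLine(CanUnboxToDecimal(ojoke));   // False
        Console.WriteLine(CanUnboxToDecimal(2.5m));    // True

        // Unbox to the exact type first, then convert.
        decimal converted = (decimal)(int)ojoke;
        Console.WriteLine(converted);                  // 2
    }
}
```

If a different type is really wanted, one option is to unbox to the type that was boxed and convert afterwards, as the last lines do.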

Finally

By now, you should have a fair idea about the data types in use by C#, and in fact, by all .Net languages because they are part of the implementation of the CTS. CTS ensures easy interoperability between all .Net languages. So the learning curve is minimized, and behavior arising out of the operations on the data types is common across all environments. A single root-class hierarchy helps in data-type interoperability, and CTS makes sure that every reference to an object is typed, that is, its data type is known, and that the type referenced is valid in the context of the operation, as shown during the explanation of unboxing.

Kumar Gaurav Khanna runs www.wintools.f2s.com
