Understanding .Net

PCQ Bureau

14 Jul 2001 20:38 IST

Okay, that shows my affinity for COM (Component Object Model), but more than that, it also tells you the focus of this article. Here, we plan to QueryInterface (go into) the philosophy behind the .Net platform and give everyone a break from the hype surrounding it by testing its mettle. We’ll start off by tracing its lineage and then look at what it does. Finally, we’ll see how .Net works internally and draw two very important implications this framework holds for developers.

.Net’s lineage

To start with, we had the traditional language compilers by vendors like Borland and Microsoft. Let’s take the example of a C/C++ compiler. If I wrote a C++ module (say Module A), I would compile it to an executable. But what if I later found out that portions of Module A could be used in another C++ project of mine? I’d reuse the required code in the second project and compile it into another executable. If there were many such projects that required the code in Module A and I continued to use this approach, I’d soon have many executables on my hard disk, all containing the same piece of code. As you can see, this approach of source code reusability eats hard disk space. A solution to this problem was the introduction of dynamic link libraries (DLLs).

Using this approach, the code to be shared amongst various applications was built as a DLL and exposed through exported functions. Client applications requiring the shared code would simply load the DLL and locate the entry point of the exported function. That solves a big problem, right?

CLR vs JVM

You may wonder about how the CLR and JVM (Java Virtual Machine) compare with each other. Here’s the answer to that.

Let’s start with what the CLR and the JVM have in common. Firstly, both are responsible for executing the code produced by their respective compilers: the JVM executes bytecode, while the CLR does the same with IL code.

Both provide facilities for memory management for their respective program executions, and both look after the code verification before letting it execute.

Coming to the differences, the first is the concept of platform independence. Theoretically, languages running on top of the CLR and the JVM are platform independent, because the code their compilers produce targets the underlying runtime, and not the microprocessor. It’s the underlying runtime that is different for different platforms, and thus, code compiled on one platform can run on another.

Practically, this concept is available for JVM since we have JVMs for various platforms, but as of now, the CLR is available only for Windows. Hence, though the CLR is fundamentally designed for platform independence, as of date, due to lack of CLRs for various other platforms, you’re still restricted to Windows.

The second difference is in the binary reusability across languages under a runtime. As of date, there’s only one language running over the JVM: Java. So, developers wanting to reap the benefits of platform independence have a learning curve before them, depending on their previous programming knowledge. For instance, a VB programmer’s learning curve is going to be far steeper than that of a C++ programmer.

On the other hand, many languages run on the CLR, prominent ones being C#, VB.NET, and VC.NET. So, a VB programmer will have almost no learning curve, since VB.NET runs over the CLR. The same holds true for VC.NET developers. Switching to C# will have its learning curve, but that will be greatly reduced if you are proficient in similar languages like C++ or Java.

The most important implication of having many programming languages over a runtime like the CLR is that Microsoft has gone to great lengths to make them interoperable. You can write one module in C#, another that handles the exceptions in VB.NET, and the user interface in VC.NET, and the three will execute as a single entity. This is similar to what COM proposed to do, but it’s done in a better manner in .Net.

So, while the JVM currently scores in being practically platform independent, the CLR is limited in this respect for the reasons discussed above. But as far as language interoperability is concerned, the CLR has a definite edge over the JVM.

Wrong. Though this approach of sharing code using DLLs saved disk space and prevented executables from being bloated by the same code, it introduced another, far less visible problem. If a DLL exporting a C++ class was built using a particular compiler, the same compiler had to be used to build any client that loaded that DLL. That is, if you wrote a DLL exporting a C++ class containing the code to be reused using the Borland C++ compiler, then only a client written using the Borland C++ compiler could load and utilize the DLL’s exported functions. A Microsoft VC++ client wouldn’t be able to invoke the exported class’s functions. The reverse also holds true. Why is this so?

This happens because each compiler has its own scheme for exporting C++ classes (its own way of mangling member function names, for example), leading to incompatibility between compilers. However, if a non-class function is exported using the C calling convention, the problem of interoperability between a client written using compiler A and a DLL written using compiler B won’t arise.

Thus, a solution to the problem was that instead of exporting the classes directly from the DLL, a non-class function be exported, which would internally instantiate classes and return pointers to clients so that the class member functions could be invoked. A standard was devised which dictated how the classes would be implemented within the DLL, so that a DLL designed using language A could be used by clients written using language B.

This was the philosophy behind Microsoft’s highly successful binary object model, popularly known as COM. This was the standard that defined internal class implementation within DLLs.
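The exported-factory idea described above can be sketched in a few lines of C++. This is only an illustration of the pattern, not COM itself: the names IGreeter, Greeter, CreateGreeter and UseGreeter are hypothetical, both halves sit in one source file so the sketch compiles on its own, and a real Windows DLL would additionally mark the factory for export (for instance with __declspec(dllexport)).

```cpp
#include <cassert>
#include <string>

// Hypothetical interface shared by DLL and client: pure virtual functions
// only, no data members, so the client never sees the class layout.
struct IGreeter {
    virtual const char* Greet() = 0;  // the reusable functionality
    virtual void Release() = 0;       // the object destroys itself
protected:
    ~IGreeter() = default;            // clients call Release(), never delete
};

// ---- This part would live inside the DLL ----
namespace {
class Greeter final : public IGreeter {
public:
    const char* Greet() override { return "hello from the DLL"; }
    void Release() override { delete this; }  // freed by the code that allocated it
};
}  // namespace

// Non-class entry point with C linkage: its name is not mangled, so a client
// built with a different compiler can still locate it.
extern "C" IGreeter* CreateGreeter() { return new Greeter(); }

// ---- This part would live in the client ----
std::string UseGreeter() {
    IGreeter* greeter = CreateGreeter();  // only a pointer crosses the boundary
    assert(greeter != nullptr);
    std::string text = greeter->Greet();  // call through the interface
    greeter->Release();                   // hand destruction back to the DLL side
    return text;
}
```

The client depends only on the interface and the plain factory function, which is why the class itself never has to be exported. Calling UseGreeter() from a one-line main is enough to try it.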

This is a very brief explanation of the problems faced with code reusability and how they were tackled. For more insights into the philosophy of COM, read Don Box’s classic text on COM, Essential COM.

.Net is related to COM in that it takes the concept of binary reusability a step further. Before .Net, support for binary reusability within Microsoft products was natively available in the form of COM. So, COM is largely limited to one platform, Windows. Of course, there are projects under way (some of which have succeeded to an extent) to port COM to other platforms like Unix and Macintosh, but to reap the full benefits of COM, one had to code for Windows.

With .Net, Microsoft proposes to change the way reusable code is written. When an application is coded for the .Net environment using any of its available languages like C#, VB.NET or VC++.NET, the code is not only binary reusable (as in the case of COM), but can also execute on any OS for which the .Net environment, called the Common Language Runtime (CLR), is available. As an integral part of the .Net strategy, Microsoft has published what is called the Common Language Specification (CLS). The CLS defines the way vendors should write their compilers so that they become compliant with the CLR and can create programs for .Net. Also, to ease versioning and registration problems associated with component development, .Net stores all such information within the component itself, as opposed to storing it in the registry as is done under COM/COM+. Moreover, .Net provides a rich set of built-in classes (analogous in functionality to the Win32 API) to help you with most common programming tasks.

Since these classes are made available in .Net via the CLR, and developers write code using any CLS-compliant compiler, all developers get to use the same set of classes, decreasing the learning curve when migrating from one .Net compiler to another. All compilers produce a common binary code that is executed by the CLR, and thus developers of one .Net language can share binary code produced by developers using another .Net language.

Now that we’ve traced the lineage of .Net and seen what it does, let’s see how it does what it does.

Inside .Net

As we’ve said above, .Net tries to bring platform independence to binary reusability with the introduction of the CLR. Traditionally, programs have required a piece of code, usually a DLL and referred to as the runtime, for their execution. The best example of this is Microsoft Visual Basic, which bundles the VB runtime (MSVBVMxx.DLL, where xx is the version number) because programs produced using VB need it to execute. This traditional runtime has two limitations. First, it is tied to one particular language, in this case VB. Second, a VB program won’t execute on a system where the VB runtime is missing.

.Net solves the second limitation by proposing that the CLR be present on all systems (possibly by integrating it with the OS) so that the issue of a missing runtime never arises. And the first limitation is taken care of since the CLR, in its present shape, handles the runtime requirements of all the languages bundled with Visual Studio.Net.

Now, let’s see how the different pieces fit together in the .NET picture.

The developer first writes code in a CLS-compliant language like C#, which is then compiled using the language’s CLS-compliant compiler. The compiler, along with the metadata engine, compiles the code into Intermediate Language (IL), a form that is neither the native code that finally executes under the .Net runtime nor pure language source code; as the name suggests, it’s in between the two.

The metadata engine runs on the source code along with the compiler and produces metadata. This information is what tells the .Net runtime about your code, like the data types used, the signature of each type’s members, etc. To give an analogy, metadata provides in .Net the kind of details that type libraries, registry entries, etc, hold for COM components. The only difference is that in COM, these details were spread across various locations, while under .Net, they’re stored right within the file, along with the IL code. This done, the IL code, along with the metadata, is processed by the Linker, which then produces an executable or a dynamic link library (as required) containing the IL. This completes phase one.

Phase two begins when the executable is invoked for execution. The class loader comes into play and loads the .Net base classes, which provide for most of the functionality utilized by applications produced using CLS-compliant compilers. The code, in totality, is then subjected to type safety checks via the Type Verifier, after which the Just-in-Time (JIT) compiler comes into play. Its job is to process the IL code and produce the managed native code, which is then executed by the .Net runtime. In .Net terminology, a piece of code is said to be managed if it runs under the context of the .Net runtime, and unmanaged if it doesn’t.
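The verify-then-compile sequence of phase two can be illustrated with a toy model. The sketch below is not the CLR and its three-instruction "IL" is not Microsoft’s IL: the Op, Instr, Verify, Translate and LoadAndRun names are all hypothetical, and std::function objects stand in for the native code a real JIT compiler would emit. It only shows the order of events: check the intermediate code without running it, translate it once, then execute the translated form.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical intermediate language for a tiny stack machine.
enum class Op { Push, Add, Mul };
struct Instr { Op op; int operand; };  // operand is used by Push only

// Verifier stand-in: walk the code once WITHOUT executing it and prove that
// no instruction pops an empty stack and that exactly one result remains.
bool Verify(const std::vector<Instr>& code) {
    int depth = 0;
    for (const Instr& in : code) {
        if (in.op == Op::Push) { ++depth; continue; }
        if (depth < 2) return false;  // Add and Mul need two operands
        --depth;                      // two values popped, one pushed
    }
    return depth == 1;
}

using Step = std::function<void(std::vector<int>&)>;

// JIT stand-in: translate each instruction once into a directly callable step.
std::vector<Step> Translate(const std::vector<Instr>& code) {
    std::vector<Step> steps;
    for (const Instr& in : code) {
        switch (in.op) {
        case Op::Push: {
            int value = in.operand;
            steps.push_back([value](std::vector<int>& s) { s.push_back(value); });
            break;
        }
        case Op::Add:
            steps.push_back([](std::vector<int>& s) {
                int rhs = s.back(); s.pop_back();
                s.back() += rhs;
            });
            break;
        case Op::Mul:
            steps.push_back([](std::vector<int>& s) {
                int rhs = s.back(); s.pop_back();
                s.back() *= rhs;
            });
            break;
        }
    }
    return steps;
}

// Runtime stand-in: verify first, then translate, then execute.
int LoadAndRun(const std::vector<Instr>& code) {
    if (!Verify(code)) throw std::invalid_argument("verification failed");
    std::vector<Step> steps = Translate(code);
    std::vector<int> stack;
    for (const Step& step : steps) step(stack);
    assert(stack.size() == 1);  // guaranteed by Verify
    return stack.back();
}
```

Because Verify rejects malformed code up front, the translated steps can pop from the stack without any checks of their own, which is the point of verifying before compiling.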

The implications

The important point here is how this results in language interoperability, and how the CLR can run programs compiled from various languages. Since any language that targets .Net has to comply with the CLS, its compiler will always end up producing IL code. Thus, you may use C# to write your DLL, and VB.NET to code its client. Since both will create IL code upon compilation, the code is interoperable. If you recall, this is what COM proposed to do.

This interoperability is taken a step further by the CLR, which is designed to make it platform independent. The IL code produced by CLS-compliant compilers is processed by the CLR, and the CLR is the one .Net component that depends on the operating system. As of date, the CLR is only available for Windows. In future, the CLR may be written for other OSs as well. This will bring about platform independence.

Kumar Gaurav Khanna runs www.wintools.2fs.com
