December 6, 2001

Let’s go back a bit in history. Back in 1986-87, we did not have a research division. Bill G and Nathan Myhrvold (who later became the CTO of Microsoft) used to talk about it. Nathan was running a random group called Advanced Projects. Nathan was of the view that we needed to get talent from outside Microsoft. It so happened at that time that Dave Cutler of Digital wanted to move out. Somehow Nathan, Bill, and Dave connected, and Dave with a group of nine people joined Microsoft in 1988. For the next two years all they did was sit in a conference room, work out what the next-generation OS should look like, and come out with a fantastic specification for it. They spoke only about the kernel. If you look at the goals for NT in 1988, they were very simple: we wanted an OS that was portable, reliable, and scalable, and that would be an enterprise OS. Around these goals Dave and his people created a kernel. In parallel, Rick Rashid, who was then a professor at Carnegie Mellon University, had developed the microkernel concept; the project he worked on was called the Mach microkernel. The idea was to build a microkernel, and on top of it build the other blocks needed to create an enterprise-ready OS.

In 1989-90 we started working on productizing this kernel and building an OS, and the first 32-bit OS from Microsoft, NT 3.1, was what we ended up shipping in 1993. By that time Windows 3.0 had come out, and in 1992 Windows 3.1 came out and was a huge success. By then we started realizing that we had two different OSs. If you want application and device compatibility between the two systems, you need the APIs and DDIs (the interfaces with which people write applications or drivers) to be consistent across both. But with time we found it increasingly difficult to meet this requirement.

We had two engineering efforts going on: one on the Windows NT code base, and the other on the DOS and Windows code base. So we had this dream that one day we would take the NT code base and make it the single code base, a ‘one OS’ story. Meanwhile we continued to develop NT.

Windows 95 came out, and with NT 4 we decided that we needed to bring the Windows 95 user interface to NT, so that at least from a look-and-feel perspective you don’t have to go through the learning process twice. We then tried to combine the best of Windows 95 and Windows NT in Windows 2000. A whole bunch of new things came into Windows 2000, and the project took much longer than we had thought it would. Halfway through we realized that we would have to back away from the goal of having Windows 2000 be the single product that satisfies both consumer and business users.

With Windows 2000 out, we said that the number one priority was to get to a single code base.

We started work on XP just before we finished work on Windows 2000, right about the end of 1999.

The way it works inside Microsoft is like this: at the higher level we have the Windows division, and inside it we have different groups. At the bottom, for each functional area there is a Product Unit Manager who owns a feature end to end. He has development, test, and program management responsibilities. His job is to understand what customers want from the product, what the next-generation product should look like, what features should be built in, and so on. These teams interact a lot with customers. Once they have an idea of what they want to build, program managers and developers together come up with the functional specification, which is more like a high-level design for a particular feature. The day the functional spec gets written we give it to the developers and ask them to implement it in code. We give it to the testers at the same time and say: this is the functionality we are going to implement; you go write tests so that when the development is done, you can test it. So there is a program management team, a development team, and a test team, and they work together on a particular piece of functionality.

Thus, you have a team that is responsible for it end to end. And there are many teams like that.

Program managers also do customer studies to find out what the next product should look like. Meanwhile, from the top of the company, we talk about the goals we have for the next-generation product. We take both of these and create a high-level product vision. Out of this vision come the priorities for the product. There should be four or five priorities; there cannot be 25, and there cannot be one. If you wake up someone from the team in the middle of the night and ask, they should be able to tell you the vision and the priorities. It also means that a particular engineer can identify how what they are doing fits into the overall priorities for the product. Once we have the top-level vision and priorities, we tell each feature team to think about their feature area: you already have a certain set of functionality in this area; what can you do for the next version of the product? Meanwhile we have a general framework that says we want to ship XP maybe a year and a half after Windows 2000. So the teams know how much time they have to innovate, and they come back with an overall project schedule.

Can the teams innovate as they want? Yes, absolutely! The bottom-level teams obviously have a lot of say, but they also have to conform to the high-level priorities and the vision that has been laid down. A team can come up and say: these are the things we want to do. Then there will be people who look at it and give them feedback. Bill is very involved, Brian Arbogast is involved, Steve Ballmer is involved in Windows, Jim Allchin is involved. All of us will look at the plans and give feedback to the team. The team weighs this feedback, along with customer feedback, before deciding.

That brings us to cross-integration. Let me pick a feature in XP: Fast User Switching. Basically, it lets you switch between users without logging on and off. Someone in the user group thought of that. He wrote down what fast user switching meant in his mind and whatever dependencies he thought it had on other components in the product. He sent it out to his peers across the division for review and feedback. That is how interdependencies start getting ironed out.

We follow the concept of dog-fooding: we have to eat our own dog food, meaning if you’re building something and you don’t use it yourself, how can you expect others to use it? Microsoft is a very strong believer in dog-fooding. And it starts very early. We shipped Windows 2000 on December 15, 1999, and for the rest of the year we took off and didn’t do much. By January 4 we put out the first build of Windows XP. Whatever little functionality had been developed by then was checked in, and we put a build out. Early on it would be pretty rough; it would not be good quality, and a lot of things would break. But at least we would see where the integration points were.

Do we pull out old code and plug new stuff in? No. We add more stuff, and in the process we may clean up a lot more. So there are two kinds of effort involved: one is new functionality coming into the product, the other is cleaning up the existing code. For example, we run source-analysis tools to catch buffer overruns and fix them. There may also be portions of code we need to re-architect, security for example; we spent about six months doing that. All of this happens in parallel.

Sometimes we decide that the feature is more important, so we implement it and then fix the bugs. If the bugs pile up too much, we take a break to first fix the bugs and then put in more features.

We put out a build every day. But not every build is of acceptable quality. We define milestones called IDW (Internal Development Workstation) builds. Basically, we want everybody, all 5000-7000 people, to be running XP for whatever they are doing. So by March of 2000, most of the Windows division was already running XP. Nobody was running Windows 2000 anymore; we were done with that one.

As part of using it, people find usability and performance issues that need to be rectified. That’s an important way in which we make a product better. I don’t write code any longer, but I have two machines and both have been running XP for about two years now. The first IDW comes out and I pick it up, so right from day one all of us are self-hosted. For the next version of Windows, which we are calling Longhorn internally, the first build will come out soon, and we will have an IDW sometime before the end of the year. When that happens I am going to run it on both my machines.

We believe in crawl, walk, and then run: one step at a time. But it is not as rosy as I am making it sound. As you are writing code, it takes at least two months to get an IDW. And if two months go by without an acceptable-quality build that we can self-host on, we stop developing new code and spend the next two weeks fixing bugs, getting to the right quality so that everyone can self-host, before we continue to add more stuff.

For the IDW, we define certain goals: say, we need to be able to run a compiler, or I need to be able to use Excel, and so on. There are automated test kits to run against a build and tell you whether these minimal criteria have been met. So every day that a build comes out from the build team, there is a BVTT (Build Verification Test Team) that runs through these tests. We don’t want to go more than two months without an IDW. A series of IDWs later, when we are 70-80 percent code complete, it’s time to do a beta.
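The build-verification loop described here can be sketched roughly as follows. This is a minimal illustration, not Microsoft’s actual tooling: the check names, the commands, and the promotion label are all hypothetical stand-ins for the real goals (“run a compiler”, “use Excel”).

```python
import subprocess

# Hypothetical smoke checks a daily build must pass before it can be
# promoted toward an IDW (Internal Development Workstation) milestone.
# Each entry is (name, command); the real BVTs exercise the OS itself.
BVT_CHECKS = [
    ("compiler-runs", ["cc", "--version"]),   # can we invoke a compiler?
    ("app-launches",  ["excel", "/safe"]),    # can a key application start?
]

def run_bvt(checks):
    """Run each (name, command) check; the build passes only if all succeed."""
    failures = []
    for name, cmd in checks:
        try:
            result = subprocess.run(cmd, capture_output=True, timeout=60)
            if result.returncode != 0:
                failures.append(name)
        except (OSError, subprocess.TimeoutExpired):
            failures.append(name)
    return len(failures) == 0, failures

def classify_build(checks):
    """Label the daily build based on the verification run."""
    ok, failures = run_bvt(checks)
    return "IDW-candidate" if ok else "rejected: " + ", ".join(failures)
```

The key property is the one the article describes: the checks are cheap, automated, and run on every daily build, so a build that cannot meet the minimal bar is caught the same day rather than two months later.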

Between beta 1 and beta 2 is where we usually become 100 percent code complete. Then we improve the quality some more and put out beta 2.

There are different kinds of testing. One is build verification testing. There is functionality testing by the teams. Then, when multiple components come together, how well do they play together? There are system-level tests to check this. What we do is define scenarios: take a networking scenario, where a customer wants to move from one network to another and things should just work. We develop a set of tests to check that. The fourth type of testing is stress testing. The 5000-7000 people in the division all have a couple of machines that sit idle at night. We use this idle time for stress testing.

Our goal is to get at least 1000-2500 machines running stress tests. We have made it as simple as an employee pressing a button when leaving. We automatically pick up the build of the day, download the stress tests, and kick them off. If you say you are going to come in at 8:00 am, by 7:30 am we will stop running the tests, clean up your machine, and have it ready for you. You can decide to put up only your test machine today; or, with an IDW, you can mark both your developer machine and your test machine. And we have a program that picks up a random collection of tests so that we don’t run the same test on all 2000 machines. That’s on the desktop side.
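The nightly scheduling just described might look something like this sketch. The function names and the half-hour cleanup margin are illustrative assumptions, not the actual Microsoft system; the two real ideas it captures are randomizing test assignment across machines and stopping before the owner’s stated return time.

```python
import random

def assign_stress_tests(machines, test_pool, seed=None):
    """Assign each enrolled idle machine a random stress test for the night.

    machines:  list of (machine_name, owner_return_hour) tuples,
               e.g. ("dev1", 8.0) for an 8:00 am return.
    test_pool: list of stress-test names to draw from.
    Returns {machine_name: (test_name, stop_hour)}, where stop_hour leaves
    a half-hour margin to clean the machine before the owner is back.
    """
    rng = random.Random(seed)  # seedable for reproducible schedules
    schedule = {}
    for name, return_hour in machines:
        test = rng.choice(test_pool)          # vary coverage across machines
        schedule[name] = (test, return_hour - 0.5)
    return schedule
```

So a machine whose owner returns at 8:00 would have its run cut off at 7:30, matching the cleanup window described above, and drawing tests at random keeps 2000 machines from all running the same program.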

On the server side, we have labs where we test all kinds of server services: file, print, Web, and database servers. And we do long-haul tests on them. A server should be up and running for 30 days continuously before we ship a product. Running a stress test for 30 days on a server really simulates what a customer would be doing in their environment for a year on an average-loaded machine.

Apart from these formal tests, we dog-food everything with the Microsoft IT department before we ship the product. Before we launched Windows 2000, for example, 80 percent of Microsoft desktops were running Windows 2000, and all of the IT infrastructure servers were running it before we shipped.

The other thing we do is run extensive beta programs. For Windows XP we had half a million beta customers. We sometimes also give them IDW builds. The tech enthusiasts at the beta sites are excited about playing with new technologies, and they feel good about having an involvement in the development of the product.

Another program we have is the Joint Development Partner program. We identify about 30-40 customers, usually large enterprise customers, and treat them as if they were an extension of our development team. They are excited about working with us because they know that when we release the product it will work well in their environment and they will be ready to deploy it. We work with them hand in hand from the early stages; we invite them to come to our labs and bring their equipment and environment, and they invite our engineers to their architectural reviews.

We work with third parties right from early in a product cycle. Every IDW build we have is made available to all our OEM partners and to ISV partners. We say: here is some code; here is an SDK if you are a corporate development team; here is a DDK if you are a driver writer. At our software design reviews, we invite all hardware and software partners and have them interact with the engineering team, so that by the time we are ready to ship the product we have a whole slew of third-party support, both hardware and software.

There are different versions of XP: Professional, Home, Advanced Server, and so on. Are these all developed by the same team or separate teams? There is one team, one division, the Windows division, that is responsible for all these flavors. Inside the Windows division there are groups of people whose primary focus is a particular flavor of Windows, but they also develop technologies that everybody else uses.

Let’s look at the kernel. The code base is pretty much common for all flavors. But elements like clustering, which are a part of the kernel, are not needed by all flavors. So they are not used where not required.

What is the cost of developing XP? The Windows division budget is about $1 billion a year, but there are people outside the Windows division who contribute to Windows. So we are talking about a lot more money, actually.

S Somasegar has worked on Windows since its inception, and is currently Vice President, Windows Engineering Services Group, at Microsoft.
