Archive for the ‘Engineering’ Category

Computer Clubs

I’m old. Well I’m not really that old in the grand scheme of things, I just feel that way when I hang around game developers.

I got my first real computer time in the fall of 1982 by hanging around after school and hacking some stuff in BASIC on the VIC-20 in the library. I was in 5th grade at the time, and was by far the most computer-obsessed person I knew. That Christmas my parents bought me a TI-99/4A and a little black and white TV to hook it up to. Technically the computer was a present for “the family”, but in practice it didn’t really work out that way. I was obsessed with the TI, and wrote all sorts of little games and other programs on it.

A few years later I spent all my accumulated allowance and paper route money on a Commodore 64. The C64 was a big upgrade, and included such advanced features as a floppy drive and a 300 baud modem. It also had the advantage of having a manufacturer that was still in the PC business. (TI abandoned its home computer line shortly after we got ours.) I spent quite a bit of time on the local BBSes, much to the delight of the other 4 people I shared a phone line with. Once I had a car I began participating in one of the staples of the personal computer revolution: the computer club.

The local Commodore users’ group met once a month in one of the classrooms at the University of Northern Colorado in Greeley. It was a group of 20-30 people, many of whom came from the university or worked at the local Hewlett-Packard site. Computer enthusiasts were pretty few and far between in those days, and this was one place where we all fit in. Just about everybody in that room was a geeky, sci-fi reading, D&D playing male. Everybody could program to one degree or another, and more than a few knew their way around a soldering iron. Despite all the other things they had in common it was those last two that brought this group together: everyone wanted to do cool stuff with computers.

I don’t know if that kind of community disappeared or if I just fell out of touch with it. There are millions of programmers these days, and they are usually specialized enough that they barely speak the same language let alone program in it. Being a “hardware guy” now means that you are comfortable plugging together prebuilt components and hunting down device drivers online. The inexorable march of progress has pretty much made the computer itself disappear as something people get excited about. Nobody cares enough about specific platforms these days to even have the sort of trash-talking arguments Commodore and Apple fans used to have with each other.

Does this sort of passionate niche club still exist? The Seattle Robotics Society might fall into that category. They spend their meetings talking about various components to build robots from and what sort of code to put on microcontrollers to make their robots do interesting things. The meetings feature lots of teenagers learning things about robots that they would never have any exposure to at school. There seems to be the same mix of Boeing engineers and college students that the computer clubs had.

What about others? Are there clubs for wearable computer enthusiasts? People who design programming languages? Quantum computing fans? Or are we nearing the end of the innovative period for computing and somewhere there are developing pockets of interest around nanotech or some other technology that doesn’t really exist yet?

It’s funny that I’m so nostalgic for something that was already going extinct by the time I got involved. My experience with the computer clubs was 10-15 years after the Homebrew Computer Club spawned Apple Computer and others. The people I met in the clubs were not entrepreneurs to be, they were more like fans and maybe the occasional shareware developer. It’s been twenty years, and I’ve never seen any of those names show up as leaders of industry.

What about you? Are any of you old enough to have belonged to a computer club?  :)

StackOverflow is amazing

A couple of weeks ago, Jeff Atwood and crew launched the public beta of Stack Overflow. Stack Overflow lets programmers ask questions and other programmers answer them. That’s it.  They just did it with a lot less suck than all the other programming community help sites: The ads are unobtrusive, there is no login requirement just to see an answer, answers are listed from best to worst instead of first to last, and anyone can edit a question or answer to make it better.

For instance, look at this question I asked about boost shared pointers. I have work-arounds for the problem in my code, but figured that there had to be a better way. Turns out that the boost experts on Stack Overflow knew exactly what I needed, and answered within a few hours.  Then some other people read the question, picked the best answer, and by voting it up made that answer appear prominently.  By the time I got back to check to see if my question had been answered, there was a clear winner. To make it even more prominent, I marked that answer as “accepted” and now it’s highlighted.

If you’re a programmer, I suggest you check it out. Next time you’re looking for the answer to a programming question, see if it’s been asked on Stack Overflow. If not, ask your question. I think you’ll be pleased with the results.

(Back in July I joined a company called Divide by Zero.  Now I’m singing the praises of a site called Stack Overflow.  Next thing you know I’ll be renaming my blog “Access violation”. :) )

ServerDir 2.0

As I am putting together the architecture for the new game we’re building at Divide by Zero, I am spending a fairly significant amount of time thinking about where the weak spots in the Pirates architecture were. The servers in Pirates worked out pretty well, but I think I can do better the second time around.  This is the first of N posts describing how I intend to evolve Server Architecture v1 into Server Architecture v2.

By far the biggest scaling problem Pirates ran into right at the start of open beta was the Server Directory (ServerDir) database. This was the direct result of incredible naiveté on my part about how much load a single database could handle. The original design of ServerDir called for every process in every cluster to connect to one shared database and to update its own status in that database every five seconds. When you multiply that update by all the instanced zones in the game (plus other miscellaneous servers) you find that the database needs to handle thousands of updates per second from tens of thousands of connections. It turns out that Microsoft SQL Server is not up to the task. (There’s also the little problem that the single shared ServerDir database was a single point of failure for the entire service.)
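To put rough numbers on that load, here is a back-of-the-envelope sketch. The five-second interval comes from the design described above; the process count is a made-up figure picked only to land in the "tens of thousands of connections" range, not a measurement from Pirates.

```python
def updates_per_second(process_count: int, interval_seconds: float) -> float:
    """Steady-state write rate if every process updates once per interval."""
    return process_count / interval_seconds


# Hypothetical process count (an assumption, not a number from the game).
ASSUMED_PROCESSES = 20_000
UPDATE_INTERVAL_SECONDS = 5.0  # "every five seconds" in the original design

if __name__ == "__main__":
    rate = updates_per_second(ASSUMED_PROCESSES, UPDATE_INTERVAL_SECONDS)
    print(f"{rate:.0f} updates per second")
```

Under those assumptions the single database sees 4,000 writes per second before anyone has run a single query against it.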

Figure: Original ServerDir design (Pirates ServerDir on a single DB)

When a single ServerDir was obviously not going to work, we expanded the system slightly to split that single database into up to one database per cluster. This still put quite a bit of load onto the ServerDir DB, but there were now enough of them to allow SQL Server to keep up.  This is the setup that Pirates was using when I left Flying Lab in July of 2008.

Figure: Final ServerDir design (Pirates ServerDir with one DB per cluster)

Within a cluster the ServerDir database was used by a process called Big Brother to monitor the health of the cluster. Each physical server machine in the cluster has an instance of Big Brother running on it, and they automatically pick one of their number to be the primary Big Brother for the cluster. This process is responsible for deciding which other processes need to be launched, as well as clearing out the ServerDir entries for processes that have crashed. If you want to read more about the specifics of the ServerDir system, you can read all about it in Massively Multiplayer Game Development 2. I wrote an article on the Pirates architecture years before the game launched, and it really didn’t change too much.
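The "pick one of their number" step can be sketched in a few lines. The rule below (lowest host name among the live Big Brothers wins) is an assumption made purely for illustration; this post does not describe how the real Big Brothers chose their primary, and the `BigBrother` type and `pick_primary` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BigBrother:
    host: str    # machine this Big Brother runs on
    alive: bool  # whether its peers can currently reach it


def pick_primary(peers: Iterable[BigBrother]) -> Optional[BigBrother]:
    """Return the live Big Brother with the lowest host name, or None.

    Every peer that sees the same membership list reaches the same answer,
    so no extra coordination round is needed for this toy rule.
    """
    live = [p for p in peers if p.alive]
    return min(live, key=lambda p: p.host, default=None)
```

A deterministic rule like this is easy to reason about, but it only works if every peer agrees on who is alive, which is exactly the hard part on a lossy network.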

Figure: ServerDir inside a cluster (Pirates)

ServerDir 2.0

There are several fundamental problems with the original ServerDir that I intend to fix with version 2.0. First is the reliance on a database as the point of synchronization. Databases are not built for this kind of transient data, so they handle it poorly. The second problem is the way the Big Brothers communicate with each other via UDP. (The dashed lines above indicate non-persistent or UDP connections.) This pointlessly complicated the protocol between Big Brothers by requiring them to compensate for dropped network packets. The third goal for the new ServerDir is actually driven by broader architectural changes I want to make, specifically that I want to promote “shard” from being an operations-level concept to one that is entirely in game design and UI. That will require far more machines with far more processes per cluster, and ServerDir will need to cope. The fourth and final fix in the new ServerDir is that the old version of Big Brother actually does a pretty poor job of dealing with hung processes. We had some periods during beta where we were getting some of those, and the operations staff had to deal with them by restarting clusters regularly and running scripts to kill all the zombies. What follows is a sketch of my initial design for how to accomplish all this.

Figure: ServerDir v2.0

The biggest change here is that individual cluster processes no longer connect to ServerDir directly. Instead they open a persistent connection to their local Big Brother, and Big Brother updates ServerDir on their behalf. Part of this change is that the “every five seconds” updates never go into ServerDir at all. ServerDir is notified of two events for processes: process started and process stopped. All of the “is this process hung” detection is now the job of each individual Big Brother. While a cluster process is up, it will send periodic updates to Big Brother, and if none arrive for too long a period of time, Big Brother will kill the process and clean up ServerDir.
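A minimal sketch of that hung-process detection, assuming a timeout value and a callback for talking to ServerDir. The class name, method names, and the `notify` callback are all hypothetical; the only things taken from the design above are the two ServerDir events (started and stopped) and the rule that silence for too long means the process gets cleaned up.

```python
class HeartbeatMonitor:
    """Tracks the last update from each local process (illustrative only)."""

    def __init__(self, timeout_seconds, notify):
        self.timeout = timeout_seconds
        self.notify = notify   # hypothetical callback: notify(event, process_id)
        self.last_seen = {}    # process_id -> time of the most recent update

    def process_started(self, process_id, now):
        self.last_seen[process_id] = now
        self.notify("started", process_id)

    def heartbeat(self, process_id, now):
        # Periodic updates stay local; they are never forwarded to ServerDir.
        if process_id in self.last_seen:
            self.last_seen[process_id] = now

    def process_stopped(self, process_id):
        if self.last_seen.pop(process_id, None) is not None:
            self.notify("stopped", process_id)

    def sweep(self, now):
        """Clean up every process that has been silent longer than the timeout."""
        hung = [pid for pid, seen in self.last_seen.items()
                if now - seen > self.timeout]
        for pid in hung:
            # A real Big Brother would also kill the OS process at this point.
            self.process_stopped(pid)
        return hung
```

Passing `now` in explicitly, rather than reading the clock inside the class, keeps the sketch easy to test and makes the timeout logic obvious.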

Another significant change is that instead of the point of synchronization being a database, the point of synchronization is a web service. Whether there is a database (or multiple databases) backing up that web service is entirely invisible to the tools and to the cluster processes. Using a stateless API with no persistent connections also makes the task of scaling the ServerDir resource much easier. With load balancers and some reasonable architecture on the back end, single points of failure and scaling problems with ServerDir itself can be all but eliminated.
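To illustrate what "stateless with an invisible backing store" could look like, here is a toy request handler. Nothing here is the actual ServerDir API: the `MemoryStore` class, the `handle` function, the HTTP-style verbs, and the status codes are assumptions chosen to show the shape of the idea.

```python
class MemoryStore:
    """Stand-in backing store; a database could replace it without callers noticing."""

    def __init__(self):
        self._data = {}  # cluster name -> set of process names

    def add(self, cluster, process):
        self._data.setdefault(cluster, set()).add(process)

    def remove(self, cluster, process):
        self._data.get(cluster, set()).discard(process)

    def processes(self, cluster):
        return sorted(self._data.get(cluster, set()))


def handle(store, method, cluster, process=None):
    """Serve one request. Everything needed arrives in the arguments,
    so any server sitting behind a load balancer could answer it."""
    if method == "PUT":        # process started
        store.add(cluster, process)
        return 204, None
    if method == "DELETE":     # process stopped
        store.remove(cluster, process)
        return 204, None
    if method == "GET":        # tools asking what is running
        return 200, store.processes(cluster)
    return 405, None
```

Because the handler keeps no per-connection state, swapping `MemoryStore` for something durable or sharded is a change the Big Brothers and operations tools would never see.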

My next post will go into much greater detail on the new web service and how Big Brothers and operations tools interact with it. Once I’ve covered the new ServerDir plan I can get into my wacky new ideas for the game servers themselves.

What do you think? See any red flags in my high level sketch?

This is why I’m a programmer

Gustavo Duarte sums it up.

Five Kinds of Programmers

I recently had a conversation with one of the long-time programmers on Pirates that got me thinking about how I think about programmers. Over the course of my career I’ve run into several archetypes of professional programmers. I thought it might be interesting to formalize my thinking on the subject, and this is the result.

The Researcher

These programmers are more scientist than engineer. If your organization has a research lab, it is probably stocked with Researchers. Since academia is just one giant lab, it is almost entirely filled with Researchers.

The Researcher loves to find solutions to problems that are poorly understood. They are on the bleeding edge of their technological specialty. If there are no papers out there that explain how to do something they will write one.

One downside of the Researcher is that there are so many interesting problems out there that need solving that they have trouble actually finishing any solution before they move on to the next thing. When you can get these guys to check in some code it’s usually great, but it takes them far longer than it would take other kinds of programmers to actually implement anything. They are also the most likely archetype to suffer from Not Invented Here Syndrome.

The Explorer

Like the Researcher, the Explorer is unafraid of the poorly defined dark corners of technology. The key difference is that when the Explorer delves those depths it is to get things done, not for the joy of the exploration itself.

When you have a really thorny problem that you don’t know how to solve, this is the programmer you give it to. Explorers will dig into unfamiliar code-bases and problem domains with a shocking level of energy. These programmers are by far the quickest learners, and are a great resource for other programmers who are trying to come behind them into new territory.

The downside of Explorers is that their single-minded practicality can make their code a little sloppy. These programmers dedicate a lot more time to putting their current task behind them than they do to writing code they would want to maintain years down the road. This doesn’t mean that the code won’t work, but that if an extra #include or circular dependency will save an hour the Explorer is always tempted to cut that corner.

The Craftsman

The highest quality code in your code-base was probably written by a Craftsman. Your QA department loves Craftsmen. They value the quality of their work above all else.

When a new system just has to work, you give it to a Craftsman. They will do a great job coding it, and then test it until it is perfect. Craftsmen are absolutely the best programmers when it comes to handling exceptional conditions and corner cases. In my experience Craftsmen also excel at writing maintainable code because they know that they’re going to have to come back to it someday.

Unfortunately all that quality comes at a price. The Craftsmen on your team are the slowest programmers you have. When they estimate tasks they generate the most accurate estimates, but also the biggest. (Partly because they always include the debugging time that everyone else hopes won’t be necessary.) Their emphasis on quality and reliability also means that Craftsmen are terrified of unfamiliar parts of the code-base or poorly defined problems.

The Activist

You know that guy on your team who is pushing Test-Driven Development, is constantly refactoring code, and actually uses the names of design patterns? That guy is your Activist. They are the driving force for architectural and process improvements on your team.

Activists want the code quality in your project to be as high as it can be. They give tough code reviews, and even tougher design reviews, but that’s a good thing. Every time someone on the team listens to the Activist, they are improving as a programmer.

On the other hand, their ceaseless pursuit of perfect code hurts the productivity of the Activist. Quick hacks are physically painful to them, even when that is exactly what the situation calls for. Paradoxically, they also often introduce bugs with their refactoring that never would have come up otherwise. (On the plus side, the refactoring makes fixing that bug far easier.)

The Workhorse

In their various ways, all of the programmers above are sacrificing some of their capacity to their particular quirks. Workhorse programmers don’t do that. They are in a single-minded pursuit of adding as much to the system as possible, and as a result end up owning vast chunks of the code-base.

If you were to count lines of code per programmer, the Workhorses would come out ahead. (That’s assuming you don’t count generated code from the Activists.) Sheer output is the domain of this kind of programmer. If you have a few great Workhorses on your team you will be able to do things that other teams only dream of.

The dark flip side of what a great Workhorse can accomplish is that a bad one will do absurd amounts of damage to your code-base. Workhorses don’t have the kind of dedication to quality that would keep them from doing bad things. Sometimes they make up for this by having enough time to build the system two or three times in the time that a Craftsman would build it once, but that’s always painful. A single bad Workhorse can do enough damage to negate the positive effect of one or two other programmers.

What kind of programmer are you?

You will notice that none of these archetypes are particularly bad or particularly good. There can be good or bad programmers of any archetype. All the teams I’ve ever been on have had a mix of archetypes. For that matter, very few programmers could be assigned to one archetype.

Personally, I think I’m mostly a Workhorse with a little bit of Activist and Explorer mixed in. I am put to shame by the ability of some of the programmers around me to suss out how to do some radical new thing. I’m not hard-core enough about process or code quality to keep up with the Activists on the team. The one way I compete is on quantity, and most of that code is fortunately good enough to not doom any projects I’ve been on up to this point.

What about you? Where would you fit in this taxonomy? Do you recognize any programmers you know?