<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: The hard part of continuous deployment</title>
	<atom:link href="http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/feed/" rel="self" type="application/rss+xml" />
	<link>http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/</link>
	<description>Joe Ludwig's blog</description>
	<lastBuildDate>Fri, 30 Mar 2012 18:45:28 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
	<item>
		<title>By: Bryant</title>
		<link>http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/comment-page-1/#comment-330784</link>
		<dc:creator>Bryant</dc:creator>
		<pubDate>Tue, 31 Mar 2009 11:47:04 +0000</pubDate>
		<guid isPermaLink="false">http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/#comment-330784</guid>
		<description>And now that I&#039;ve read part two: yes, that, exactly. Particularly enabling restarts without dumping players.

A fair amount of Turbine&#039;s non-game traffic is HTTP. It turned out to be a big win for both the obvious reasons and some unexpected ones. As Kevin notes, getting the benefit of load balancing technology is great, not just for balancing load. F5&#039;s BigIP iRule technology turned out to be very handy for quickly developing all sorts of useful tools for handling traffic loads that would have been harder to build into the servers.</description>
		<content:encoded><![CDATA[<p>And now that I&#8217;ve read part two: yes, that, exactly. Particularly enabling restarts without dumping players.</p>
<p>A fair amount of Turbine&#8217;s non-game traffic is HTTP. It turned out to be a big win for both the obvious reasons and some unexpected ones. As Kevin notes, getting the benefit of load balancing technology is great, not just for balancing load. F5&#8217;s BigIP iRule technology turned out to be very handy for quickly developing all sorts of useful tools for handling traffic loads that would have been harder to build into the servers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kevin Gadd</title>
		<link>http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/comment-page-1/#comment-318681</link>
		<dc:creator>Kevin Gadd</dc:creator>
		<pubDate>Tue, 24 Feb 2009 15:08:00 +0000</pubDate>
		<guid isPermaLink="false">http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/#comment-318681</guid>
		<description>Re: HTTP, one of the benefits I haven&#039;t seen mentioned is that there&#039;s a lot of extremely solid software out there for proxying/load-balancing HTTP requests. We&#039;re using a bunch of off-the-shelf software with minor modifications where I work currently (I think most of it was developed by Danga), and for the most part it&#039;s scaled tremendously well. Developing that sort of technology from scratch for custom protocols would be a fairly large task, I suspect.</description>
		<content:encoded><![CDATA[<p>Re: HTTP, one of the benefits I haven&#8217;t seen mentioned is that there&#8217;s a lot of extremely solid software out there for proxying/load-balancing HTTP requests. We&#8217;re using a bunch of off-the-shelf software with minor modifications where I work currently (I think most of it was developed by Danga), and for the most part it&#8217;s scaled tremendously well. Developing that sort of technology from scratch for custom protocols would be a fairly large task, I suspect.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rob Hale</title>
		<link>http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/comment-page-1/#comment-318453</link>
		<dc:creator>Rob Hale</dc:creator>
		<pubDate>Tue, 24 Feb 2009 01:48:58 +0000</pubDate>
		<guid isPermaLink="false">http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/#comment-318453</guid>
		<description>I&#039;ve often thought about the possibilities of moving all non-time-critical game elements out of the normal client/server communication and into secure HTTP connections. This is largely because of the inherent benefits: your economy can function without your game servers being live, and players in the game can communicate seamlessly with those not in the game.

From a community point of view it makes a lot of sense, and I imagine it would take a lot of strain off the game servers, since the servers dealing with combat calculations, AI and all the &quot;Hard&quot; stuff won&#039;t have to be concerned with a guild leader promoting members. It would even allow you to have multiple game servers that all feed into the same market servers or read out of the same Character Database.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve often thought about the possibilities of moving all non-time-critical game elements out of the normal client/server communication and into secure HTTP connections. This is largely because of the inherent benefits: your economy can function without your game servers being live, and players in the game can communicate seamlessly with those not in the game.</p>
<p>From a community point of view it makes a lot of sense, and I imagine it would take a lot of strain off the game servers, since the servers dealing with combat calculations, AI and all the &#8220;Hard&#8221; stuff won&#8217;t have to be concerned with a guild leader promoting members. It would even allow you to have multiple game servers that all feed into the same market servers or read out of the same Character Database.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim Burris</title>
		<link>http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/comment-page-1/#comment-318303</link>
		<dc:creator>Tim Burris</dc:creator>
		<pubDate>Mon, 23 Feb 2009 21:50:32 +0000</pubDate>
		<guid isPermaLink="false">http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/#comment-318303</guid>
		<description>Actually at this point Pirates is no longer building pack files from scratch.  We switched to a file format that supports incremental packing a milestone or two after Joe left.

We could shave off about 5 minutes with better data organization: the first step of building the pack files is to copy all the unpacked client files to a temporary location to ensure that no server-only files get to clients.  We sacrifice another 5 minutes by copying the current live pack files down, in order to start the incremental packing from a known point and give a minimal patch size.  Analysis showed that a purely incremental approach bloated the patch size in certain circumstances, such as when multiple modifications to the same asset altered its size significantly.  The actual packing operation takes 10 or 20 minutes depending on the extent of changes.  Another 5 minutes is spent copying the full distribution to the central share.

During the remaining hour, the (SOE-supplied) delta program generates patches, reports and sums their sizes, then throws them away.  This is one of the organizational handoff inefficiencies Joe spoke of in the previous post.  Obviously we could eliminate this hour or one of the patch deployment hours if we had control of the entire deployment pipeline.

We keep the hour because a) by and large it happens during the nightly build and nobody cares whether the build finishes at 11pm or midnight, and b) it is extremely useful to know when the patch size jumps up significantly.  We&#039;ve caught quite a few art bugs this way.</description>
		<content:encoded><![CDATA[<p>Actually at this point Pirates is no longer building pack files from scratch.  We switched to a file format that supports incremental packing a milestone or two after Joe left.</p>
<p>We could shave off about 5 minutes with better data organization: the first step of building the pack files is to copy all the unpacked client files to a temporary location to ensure that no server-only files get to clients.  We sacrifice another 5 minutes by copying the current live pack files down, in order to start the incremental packing from a known point and give a minimal patch size.  Analysis showed that a purely incremental approach bloated the patch size in certain circumstances, such as when multiple modifications to the same asset altered its size significantly.  The actual packing operation takes 10 or 20 minutes depending on the extent of changes.  Another 5 minutes is spent copying the full distribution to the central share.</p>
<p>During the remaining hour, the (SOE-supplied) delta program generates patches, reports and sums their sizes, then throws them away.  This is one of the organizational handoff inefficiencies Joe spoke of in the previous post.  Obviously we could eliminate this hour or one of the patch deployment hours if we had control of the entire deployment pipeline.</p>
<p>We keep the hour because a) by and large it happens during the nightly build and nobody cares whether the build finishes at 11pm or midnight, and b) it is extremely useful to know when the patch size jumps up significantly.  We&#8217;ve caught quite a few art bugs this way.</p>
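<p>As an illustration of the patch-size check described above, here is a minimal Python sketch. It is an assumption-laden example, not the Pirates build script: the function name, the seven-build window, and the 2x threshold are all hypothetical.</p>

```python
from statistics import median


def patch_size_jumped(history_mb, latest_mb, factor=2.0, window=7):
    """Return True when the latest patch is much larger than recent ones.

    history_mb: sizes of earlier nightly patches, oldest first (hypothetical input).
    factor and window are illustrative defaults, not values from the post.
    """
    recent = history_mb[-window:]
    if not recent:
        return False  # nothing to compare against yet
    return latest_mb > factor * median(recent)
```

<p>Comparing against the median of recent builds rather than only the previous night keeps a single unusual build from skewing the baseline.</p>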
]]></content:encoded>
	</item>
	<item>
		<title>By: Joshua M. Kriegshauser</title>
		<link>http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/comment-page-1/#comment-317458</link>
		<dc:creator>Joshua M. Kriegshauser</dc:creator>
		<pubDate>Sat, 21 Feb 2009 22:42:03 +0000</pubDate>
		<guid isPermaLink="false">http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/#comment-317458</guid>
		<description>Very interesting and informative post.  We have some similarities and some differences with Ultima Online and EverQuest II.

With UO, there was no database backend; everything persistent was stored in flat files.  We also had a hodgepodge of text and binary server-side data files.  When I first started working on UO we&#039;d actually sync CVS to the servers and &lt;i&gt;build it on each and every server machine&lt;/i&gt;.  Eventually the deployment process became more sane and we would push pre-built binaries.  Since the flat-files were generally opaque to everything but the game processes, the update would never touch them.  The slowest part about a publish was the servers coming up and reading all of the persistent world data (since everything &lt;i&gt;in the world&lt;/i&gt; was persistent, not just characters).

Interestingly, with UO we would push a new client live a week or two prior to a new server, so the client had to support two network protocol versions.  Unfortunately the &#039;old version&#039; support was almost never removed.

When the client version changed, clients would of course have to patch, but client patches on UO were often fairly tiny by today&#039;s standards.

----~~~~----

EverQuest II is quite different.  We use an Oracle DB to store our persistent data (which centralizes around characters) and we keep our server-side data in packed binary files.  Our patcher system allows new client and server data to be sync&#039;d to their respective servers while the live game stays up and running.  We almost never do DB updates that require large amounts of downtime.  While our character data is a mixture of columns and blobs, the game is authoritative on versioning.  We could potentially have very old character versions stored in the DB as they&#039;re not upgraded until loaded.  In practice this has rarely affected us negatively but has allowed us to perform very fast updates.

When syncing is done, servers are taken down and a switch is flipped which really causes a directory rename on the servers, followed by a startup.  The longest downtime portion of our patches is QA checking out the servers before they&#039;re unlocked.

In the case of most hotfixes, a client publish is more or less &quot;optional&quot;.  By this I mean that the game will connect and run if you don&#039;t patch the client because the network protocol version hasn&#039;t changed.  However, you&#039;ll still get booted off when the servers go down.

While the best user experience might be trying to maintain old-version uptime even after a patch has synced, it sounds like a logistical nightmare considering that players are going to have to disconnect and patch at some point anyways.

I&#039;m excited to see how the client streaming system for Free Realms will play out.  Unfortunately, EQII&#039;s client-side data was not designed for streaming and there are many, many interdependencies that any sort of streaming system would have to be aware of and anticipate.

Always fun to chat about current and upcoming MMO tech.  Looking forward to LOGIN 2009!</description>
		<content:encoded><![CDATA[<p>Very interesting and informative post.  We have some similarities and some differences with Ultima Online and EverQuest II.</p>
<p>With UO, there was no database backend; everything persistent was stored in flat files.  We also had a hodgepodge of text and binary server-side data files.  When I first started working on UO we&#8217;d actually sync CVS to the servers and <i>build it on each and every server machine</i>.  Eventually the deployment process became more sane and we would push pre-built binaries.  Since the flat-files were generally opaque to everything but the game processes, the update would never touch them.  The slowest part about a publish was the servers coming up and reading all of the persistent world data (since everything <i>in the world</i> was persistent, not just characters).</p>
<p>Interestingly, with UO we would push a new client live a week or two prior to a new server, so the client had to support two network protocol versions.  Unfortunately the &#8216;old version&#8217; support was almost never removed.</p>
<p>When the client version changed, clients would of course have to patch, but client patches on UO were often fairly tiny by today&#8217;s standards.</p>
<p>----~~~~----</p>
<p>EverQuest II is quite different.  We use an Oracle DB to store our persistent data (which centralizes around characters) and we keep our server-side data in packed binary files.  Our patcher system allows new client and server data to be sync&#8217;d to their respective servers while the live game stays up and running.  We almost never do DB updates that require large amounts of downtime.  While our character data is a mixture of columns and blobs, the game is authoritative on versioning.  We could potentially have very old character versions stored in the DB as they&#8217;re not upgraded until loaded.  In practice this has rarely affected us negatively but has allowed us to perform very fast updates.</p>
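<p>The upgrade-on-load idea above can be sketched roughly as follows. This is a hypothetical Python illustration, not EverQuest II code: the record layout, the version numbers, and the two upgrade steps are invented for the example.</p>

```python
CURRENT_VERSION = 3


def _v1_to_v2(record):
    # Hypothetical change: a single "gold" field becomes a "currency" dict.
    upgraded = dict(record)
    upgraded["currency"] = {"gold": upgraded.pop("gold", 0)}
    return upgraded


def _v2_to_v3(record):
    # Hypothetical change: every character gains an empty mailbox.
    upgraded = dict(record)
    upgraded.setdefault("mail", [])
    return upgraded


# One upgrade step per stored version; the game code owns this table.
UPGRADERS = {1: _v1_to_v2, 2: _v2_to_v3}


def load_character(record):
    """Upgrade a stored record one version at a time until it is current."""
    record = dict(record)  # leave the caller's copy untouched
    while record["version"] != CURRENT_VERSION:
        step = UPGRADERS[record["version"]]
        next_version = record["version"] + 1
        record = step(record)
        record["version"] = next_version
    return record
```

<p>Because the steps chain, a character stored several versions ago is brought forward only when it is loaded, so a publish does not need to rewrite every row.</p>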
<p>When syncing is done, servers are taken down and a switch is flipped which really causes a directory rename on the servers, followed by a startup.  The longest downtime portion of our patches is QA checking out the servers before they&#8217;re unlocked.</p>
<p>In the case of most hotfixes, a client publish is more or less &#8220;optional&#8221;.  By this I mean that the game will connect and run if you don&#8217;t patch the client because the network protocol version hasn&#8217;t changed.  However, you&#8217;ll still get booted off when the servers go down.</p>
<p>While the best user experience might be trying to maintain old-version uptime even after a patch has synced, it sounds like a logistical nightmare considering that players are going to have to disconnect and patch at some point anyways.</p>
<p>I&#8217;m excited to see how the client streaming system for Free Realms will play out.  Unfortunately, EQII&#8217;s client-side data was not designed for streaming and there are many, many interdependencies that any sort of streaming system would have to be aware of and anticipate.</p>
<p>Always fun to chat about current and upcoming MMO tech.  Looking forward to LOGIN 2009!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matthew Weigel</title>
		<link>http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/comment-page-1/#comment-316851</link>
		<dc:creator>Matthew Weigel</dc:creator>
		<pubDate>Fri, 20 Feb 2009 07:41:13 +0000</pubDate>
		<guid isPermaLink="false">http://programmerjoe.com/2009/02/19/the-hard-part-of-continuous-deployment/#comment-316851</guid>
		<description>&quot;A better way to go would be to build your persistence layer so that it could handle any number of historical versions for player data.&quot;

Yep, Dungeon Runners did this, including the incremental crawler.  The crawler process was implemented so early in the project, I actually had to revisit it around the beginning of last year: it was still trying to load EVERY CHARACTER into memory before doing any migration (which worked early on, and continued to work in dev, but started choking in production).  I implemented a moving window of loaded characters, so a few thousand (or whatever) characters were in memory at any time, with new ones being loaded as completed migrations were saved to the DB.

That kind of system is a LOT harder when you go for a regular relational representation in the database, something I&#039;m still kind of contemplating for my current project.  New (nullable or default value) columns aren&#039;t too bad, but you had better be sure to change the meaning of an existing column only under extreme duress.

For server binary upgrade, NCsoft Operations had a classic (but workable) system: the root of the server install had &#039;old,&#039; &#039;live,&#039; and &#039;new&#039; subdirectories.  Install everything in &#039;new&#039; while the server is up, take down the server and do database upgrades while deleting &#039;old,&#039; renaming &#039;live&#039; to &#039;old&#039; and &#039;new&#039; to &#039;live&#039;.  It wasn&#039;t as snazzy as versioned DLLs, but it avoided some of the craziness of DLLs too.  This essentially removes server software upgrades from the downtime equation.

Something like PotBS&#039; ServerDirectory is probably better, and also more webbish: connect to a server that&#039;s still at the version your client is at, next time the client changes instance/zone.  You could probably go a step further, and mark servers as needing to be upgraded one at a time, so that the Server Directory won&#039;t send new clients to that server and when everyone is gone from it, you can upgrade it and have it reconnect with its new version.

Dungeon Runners also had package files with incremental updates: that was actually implemented after the game was live, when we decided the slow start time of the client was unacceptable.  I think the PlayNC Launcher also had some concept of patching existing files, so what we uploaded to the patch servers (and what clients downloaded) was essentially a binary diff too.

I think for Dungeon Runners the real causes for downtime were bugs, database upgrades (in some cases, particularly because of the character data and other webbish initiatives), and Windows/hardware updates.  Aside from that, of course, there&#039;s the span of time between &quot;build finished,&quot; &quot;build signed off,&quot; and &quot;build published.&quot;  We didn&#039;t use the incremental patches for server packages, and sign-off generally took days (but there was no automated testing)... we solved a lot of the technical problems without actually making progress anywhere but server downtime.

Also, somehow I missed your previous blog post on the subject... pretty cool that someone formerly at ArenaNet commented on it, their system engendered a combination of &quot;WTF?&quot; and &quot;cool!&quot; around NC Austin. :-)</description>
		<content:encoded><![CDATA[<p>&#8220;A better way to go would be to build your persistence layer so that it could handle any number of historical versions for player data.&#8221;</p>
<p>Yep, Dungeon Runners did this, including the incremental crawler.  The crawler process was implemented so early in the project, I actually had to revisit it around the beginning of last year: it was still trying to load EVERY CHARACTER into memory before doing any migration (which worked early on, and continued to work in dev, but started choking in production).  I implemented a moving window of loaded characters, so a few thousand (or whatever) characters were in memory at any time, with new ones being loaded as completed migrations were saved to the DB.</p>
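<p>A rough Python sketch of bounding memory during such a migration follows. It is not the Dungeon Runners crawler: it uses a simpler batch-at-a-time loop rather than refilling the window as each save completes, and the load, migrate, and save callables are hypothetical stand-ins for the persistence layer.</p>

```python
from itertools import islice


def migrate_in_windows(character_ids, load, migrate, save, window=1000):
    """Migrate characters while holding at most `window` of them in memory.

    load, migrate and save are caller-supplied callables (assumptions for
    this sketch); the function returns how many characters were migrated.
    """
    ids = iter(character_ids)
    migrated = 0
    while True:
        # Pull only the next `window` ids, so memory use stays bounded.
        batch = [load(cid) for cid in islice(ids, window)]
        if not batch:
            return migrated
        for character in batch:
            save(migrate(character))
            migrated += 1
```

<p>Passing the ids as a lazy iterator (a database cursor, for example) keeps even the id list from having to fit in memory at once.</p>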
<p>That kind of system is a LOT harder when you go for a regular relational representation in the database, something I&#8217;m still kind of contemplating for my current project.  New (nullable or default value) columns aren&#8217;t too bad, but you had better be sure to change the meaning of an existing column only under extreme duress.</p>
<p>For server binary upgrade, NCsoft Operations had a classic (but workable) system: the root of the server install had &#8216;old,&#8217; &#8216;live,&#8217; and &#8216;new&#8217; subdirectories.  Install everything in &#8216;new&#8217; while the server is up, take down the server and do database upgrades while deleting &#8216;old,&#8217; renaming &#8216;live&#8217; to &#8216;old&#8217; and &#8216;new&#8217; to &#8216;live&#8217;.  It wasn&#8217;t as snazzy as versioned DLLs, but it avoided some of the craziness of DLLs too.  This essentially removes server software upgrades from the downtime equation.</p>
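<p>The old/live/new rotation described above could look something like this in Python. It is only a sketch under assumptions: the function name is hypothetical, the server is assumed to be stopped already, and all three directories are assumed to sit on the same filesystem so the renames are cheap.</p>

```python
import os
import shutil


def rotate_install(root):
    """Delete 'old', rename 'live' to 'old', then rename 'new' to 'live'."""
    old, live, new = (os.path.join(root, name) for name in ("old", "live", "new"))
    if not os.path.isdir(new):
        raise FileNotFoundError("nothing staged in 'new'")
    shutil.rmtree(old, ignore_errors=True)  # drop the previous rollback copy
    if os.path.isdir(live):
        os.rename(live, old)  # keep the outgoing build around for rollback
    os.rename(new, live)  # promote the staged build
```

<p>Since the slow copy into 'new' happens while the server is still up, the downtime cost is just the two renames.</p>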
<p>Something like PotBS&#8217; ServerDirectory is probably better, and also more webbish: connect to a server that&#8217;s still at the version your client is at, next time the client changes instance/zone.  You could probably go a step further, and mark servers as needing to be upgraded one at a time, so that the Server Directory won&#8217;t send new clients to that server and when everyone is gone from it, you can upgrade it and have it reconnect with its new version.</p>
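<p>To make the drain-then-upgrade idea concrete, here is a small Python sketch. It does not reflect the real PotBS ServerDirectory: the Server fields, the draining flag, and the least-loaded choice are all assumptions made for illustration.</p>

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    version: int
    players: int = 0
    draining: bool = False  # set when the server is queued for upgrade


def pick_server(servers, client_version):
    """Return the least-loaded non-draining server at the client's version."""
    candidates = [
        s for s in servers
        if s.version == client_version and not s.draining
    ]
    return min(candidates, key=lambda s: s.players, default=None)


def ready_to_upgrade(servers):
    """Servers marked as draining that no longer have anyone connected."""
    return [s for s in servers if s.draining and s.players == 0]
```

<p>A directory built this way never routes new clients to a draining server, so each one empties naturally as players change instance or zone and can then be upgraded and re-registered at its new version.</p>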
<p>Dungeon Runners also had package files with incremental updates: that was actually implemented after the game was live, when we decided the slow start time of the client was unacceptable.  I think the PlayNC Launcher also had some concept of patching existing files, so what we uploaded to the patch servers (and what clients downloaded) was essentially a binary diff too.</p>
<p>I think for Dungeon Runners the real causes for downtime were bugs, database upgrades (in some cases, particularly because of the character data and other webbish initiatives), and Windows/hardware updates.  Aside from that, of course, there&#8217;s the span of time between &#8220;build finished,&#8221; &#8220;build signed off,&#8221; and &#8220;build published.&#8221;  We didn&#8217;t use the incremental patches for server packages, and sign-off generally took days (but there was no automated testing)&#8230; we solved a lot of the technical problems without actually making progress anywhere but server downtime.</p>
<p>Also, somehow I missed your previous blog post on the subject&#8230; pretty cool that someone formerly at ArenaNet commented on it, their system engendered a combination of &#8220;WTF?&#8221; and &#8220;cool!&#8221; around NC Austin. <img src='http://programmerjoe.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
</channel>
</rss>

