Archive for the ‘Augmented Reality’ Category


This rounds out my trilogy of year end posts. Here is what I think will happen in the coming year. I would love to hear your thoughts on these predictions:

  1. Star Trek Online will be the only significant MMO launch in 2010. It will do well enough to make Atari and Cryptic plenty of money, but will not do nearly as well as World of Warcraft, so many people will consider it a failure. (Those people are dumb.)
  2. At least three of the major unreleased MMO projects will be cancelled. I have a guess about which two are most likely, but I’ll keep that to myself.  Qualifying projects include:
    • Guild Wars 2
    • Whatever Carbine is working on
    • That console MMO Turbine hasn’t said much about
    • Whatever Trion is working on
    • The Sci-Fi channel tie-in MMO that Trion has said they’re working on
    • Whatever Zenimax is working on that may or may not be Fallout
    • Whatever 38 Studios is working on
    • The Agency
    • Whatever Gazillion’s Gargantuan studio is working on
    • Star Wars: The Old Republic
    • The second MMO that CCP is working on down in Atlanta where all those White Wolf people are.  Hmm. What could it be?
    • Whatever Red 5 is working on
    • DC Online
    • That other MMO I know NCSoft is working on that is completely under the radar
    • Whatever Slipgate Ironworks is working on
    • The second MMO Blizzard has in the works
    • APB
    • Jumpgate: Evolution
  3. Project Natal and the PlayStation Motion Controller will both come out. Natal will do fairly well. Both controllers will allow some new kinds of games, but we won’t see any compelling examples of those games until 2011.
  4. Unemployment will peak and then start to fall.
  5. The compass+GPS augmented reality products will begin to shift to general location-awareness and away from their Augmented Reality roots. They will de-emphasize magic lens and start to emphasize aggregation of nearby content.
  6. No consumer-level see-through displays will come out in 2010. Significant progress toward them will be made, but nothing will be released.
  7. Neither Google nor Apple will release any kind of AR-focused hardware.
  8. The use of “wave it in front of your webcam” type AR in advertising will peak with an AR-enhanced ad in the Super Bowl. The backlash will begin. By the end of the year the advertising world will have moved on.
  9. Apple will release its tablet and a new iPhone (faster and more storage) but won’t release anything that is specifically an AR product.
  10. Apple will address the pain caused by its app-store approval process, at least in part. I have no idea what their specific solution will be, but they aren’t going to let their developer community grow to hate them.
  11. Android will continue to pick up steam. By the end of the year Android will boast 50,000 applications.
  12. People will spend the entire year trying to find something really useful to do with Google Wave. They won’t succeed in 2010.
  13. Google will make Wave interoperate with email. This will make it useful as an email client if nothing else.

Ok, that’s the last of this kind of post for at least a year.  If only I could get back to posting regular stuff again. :)

The twenty-teens

Around this time of year for the past few years I have written a blog post listing what I expected to occur during the coming year. Since this new year marks the start of a new decade, I thought I would start a new tradition and write a post on my expectations for the coming decade. 2020 is a long way away, so I’m sure most of this will miss the mark. Hopefully at least 48-year-old me will be amused by what 38-year-old me had to say.

Please note that just because something is on this list does not mean that it’s something I want to happen, only that it’s something I think will happen. Anything that’s missing from this list is probably just something I didn’t think of.

I would love to hear your thoughts on any or all of these.  Please comment below.

General Technology Trends:

  1. Moore’s Law will continue to operate for the entire decade. That means a given form-factor of computing device will be approximately 100x the power of the same form-factor today.
  2. Mobile computing will dominate. Everyone who owns a laptop or desktop today will have a mobile device that is about 10x the power of their current computer.  We may still call these “phones”, but placing voice calls will only be one tiny part of what they do. This device will replace most users’ desktop and laptop computers.
  3. Digital Distribution will be king. Only a tiny fraction of the media that’s currently consumed digitally (TV, movies, music, and software) will be purchased on a hunk of plastic. Both the subscription model (e.g. Rhapsody or cable television) and the purchase model (e.g. iTunes or DVDs) will have at least 20% market share, but one of those two models will be gradually taking over. Advertising-supported media will be just as big a deal as it is now, but the user will have much more control over how they consume that media (think Hulu rather than broadcast television). Books are on the same trajectory, but in 2020 the majority of books will still be sold on dead trees.
  4. Speech recognition will gain a lot of ground as the primary way we enter text into a computer. Offices are one place where this trend won’t have advanced very far, mostly because of the noise involved.
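
The 100x figure in item 1 is just compound doubling. Here is a quick sketch of the arithmetic, assuming the popular "doubles every 18 months" reading of Moore’s Law (the doubling period is my assumption; a two-year period gives a much smaller number):

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Multiplier after `years` if capability doubles once per period."""
    return 2 ** (years / doubling_period_years)


# Compare the two commonly quoted doubling periods over one decade.
for period in (1.5, 2.0):
    factor = growth_factor(10, period)
    print(f"doubling every {period} years -> about {factor:.0f}x in a decade")
```

With an 18-month period the decade works out to roughly 100x; with a 24-month period it is 32x, so the prediction is sensitive to which version of the law you believe.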

Game Industry Trends:

  1. Total revenues from video games of all kinds (including mobile and social games) will exceed revenue from movies and television (independently, not added together). Games will finally learn to exploit merchandising and secondary markets as vigorously as movies do.
  2. In 2020 no one will be selling a dedicated gaming console. All computing devices in production in ten years will be about consuming other kinds of media just as much as they are about playing games.
  3. Desktop PC gaming will be all but dead, with the majority of triple-A games coming out for multi-media consoles or mobile devices.
  4. Gaming that involves exercise will be the primary way that the majority of people get their exercise.
  5. Location-aware games will be common.

Augmented reality:

  1. A growing minority of people in the developed world will wear heads up displays almost all the time. These displays will be capable of information overlays, but will mostly be about contextual information that is not overlaid on the world. These products will be on the verge of hitting the mainstream, but won’t quite be mainstream yet.
  2. Development of these displays will be by small companies (perhaps companies that are around now) but those companies will be acquired by massive consumer electronics multinationals before wearable displays hit the mainstream.
  3. Recognition of people and text in images (and video) will be nearly perfect, at least in reasonable lighting conditions.
  4. Gestural interfaces will be commonplace. Many hard-core computer users will be sad at how clumsy they are compared to keyboard and mouse.

The fate of specific companies:

  1. Google will be huge and influential. Their influence will likely peak in the 2010s, but it will be difficult to see that from the ground. Google will have had some sort of anti-monopoly action taken against them.
  2. Microsoft will fail to transition to the new mobile-centric world and will be in decline. They will still be a very powerful multi-billion-dollar company, but will not own the end-user to nearly the extent they do now.
  3. A company that exists today will be the dominant social network. That could be Facebook, Twitter, or YouTube, but it probably won’t be MySpace.
  4. Apple will be huge and influential. They won’t ever be as dominant as Microsoft was in the 90s, but they will be very successful. Steve Jobs will still be running the company.

US Politics:

  1. Gay marriage will be legal in most states.
  2. Marijuana use will be legal in California and a few other states.
  3. We won’t have elected a woman president. (My wife came up with this one, but I agree with her.)
  4. The problems of illegal immigration will not be solved.
  5. The problems of providing health-care to everyone that needs it will not be solved.
  6. Privacy in an age of always-on location-aware devices will be a huge topic of debate.
  7. Silicon Valley will remain the world’s premier startup region.
  8. The US will still have troops in both Iraq and Afghanistan. These will be like the troops we still have in Germany and South Korea, and will not be in combat often, if ever.

International Politics:

  1. Carbon emissions will be at approximately their peak in 2020.
  2. Oil production will also be peaking around 2020.
  3. Most other countries will be ahead of the US in terms of switching to renewable energy.
  4. Most of the rest of the world will have consumer-friendly privacy regulations in place. Those countries will scratch their heads at the debate raging in the US.

Things that will not happen:

  1. We will not have flying cars, jet-packs, or most of the other things promised by Sci-Fi in the 50s.
  2. There will not be peace in the middle east.
  3. Africa will still be the poorest continent.
  4. Brain-computer interfaces will still not work very well. No one will be uploading themselves into a computer.
  5. We won’t have a human equivalent AI.
  6. We won’t know how to reliably unfreeze people.
  7. World War Three won’t have happened.

50 Things I Learned at ISMAR 2009

The good thing about going to your first conference on a new subject matter is that you’re not jaded and certainly not level capped. So without further ado, here are fifty things I learned at ISMAR:

  1. Metaio is pronounced mehtayo, not (as I’ve been saying) mehtah-ayo.
  2. The high-end HMDs that academics buy for tens of thousands of dollars are terrible.
  3. Nokia has a very cool see-through display with eye tracking up and running in their research lab. This display may never see the light of day.
  4. There are still tons of people doing research with markers.
  5. Robert Rice and I are both 38.
  6. When using a tag-based gesture to activate a menu, users are more accurate and able to select their option more quickly if the options are presented relative to the user’s view than if they are presented relative to the marker’s original location or an object in the world.
  7. Vuzix is working on cool stuff and Paul Travers is a good guy with a passion for AR.
  8. Telepresence is creepy when it is accomplished by projecting a remote video feed onto a static mannequin head. (This was the Animatronic Shader Lamps Avatars paper and demo.)
  9. Robert Rice really got into AR in early 2008, just like me.
  10. The academic AR community is ready to welcome industry to their conference with open arms. Apparently there were many more companies present this year than last year.
  11. Metaio’s mobile platform (Junaio) is not a clone of Layar/Wikitude in any way. They are building a much more social system based on user-provided content. Junaio is also going to work on phones with no compass (i.e. the iPhone 3G).
  12. X from Y is a smart dude. (Sub in any X and Y you like among the many people I met this week. I met so many smart people.)
  13. There are some professors who love the sound of their own voices. OMG, (that one guy) from (that one university) can’t seem to ask a question in less than five minutes.
  14. I believe that augmented reality is the next big technology revolution and will have an impact at least as big as the web’s impact. This will provide opportunities for tons of companies and as a result there’s no reason to start competing bitterly at this early stage.  It turns out Robert Rice agrees with me.
  15. Tish Shute is obsessed with XMPP (and a smart non-dude.)
  16. There are more AR startups out there that are flying under the radar. For instance, there are these two guys from Rochester…
  17. Silicon Valley remains completely oblivious to AR. If Robert and I are right it will be interesting to see what this means for their dominance of the startup community.
  18. The vast majority of the AR research being done in academia is being done outside the US. I knew this going in, but it was shocking to be confronted with it in person.
  19. Georg Klein (of PTAM fame) works at Microsoft now.  Hmm.
  20. The food in Orlando is terrible.  Maybe they could move this conference to Austin…
  21. Microvision’s display technology works really well.  At least on the monocular test unit that I got a chance to look through after their talk.
  22. There is (or was) at least one PC gamer out there that has never heard of Steam. I was shocked.
  23. Qualcomm is backing AR in a big way and intends to be the hardware provider of choice for mobile AR.
  24. Venture Capital isn’t flowing into augmented reality quite yet. Most AR startups are self-funded or funded by friends and family.
  25. I am much better at networking than I was when I first started going to game conferences.
  26. It is far too early for meaningful standards in AR. It would be awfully nice if the Wikitude content provider API used the same format that people are already providing to Layar, however.
  27. The projector part of Sixth Sense is still a non-starter. The UI parts are still very cool, however.
  28. Robert Rice and I have a creepy number of common traits.
  29. Disney Imagineering makes extensive use of AR.
  30. Peter from Metaio suggests that if you want to get anything done in the AR space you shouldn’t spend any time worrying about whether or not what you’re doing is AR or not. I agree with him. There’s not a clear line between AR and not AR and there probably never will be.
  31. See-through glasses at a reasonable price point (and field of view) are probably more than a year out. This is frustrating to a great many people, including me.
  32. Layar isn’t going to ruin AR. I went into the week with a fear that the GPS+compass category (which Layar is currently leading) would forever taint the term Augmented Reality by providing a fairly useless AR view (when compared to a map or list view.)  Instead I think that people will simply not use the AR view and that Layar pushes location based services forward in a huge way by providing access to multiple content providers from a single app. One day no one will remember that they started out as primarily an AR app.
  33. I prefer talks about what people did over talks about what people think will happen.
  34. For many researchers, augmented reality is a solution looking for a problem. There are a lot of gee-whiz demos and many people seem to accept cool factor as a compelling reason to use AR instead of more traditional solutions.
  35. I saw a presentation on an AR-based interface that included a user study that concluded the mouse-and-keyboard interface they devised for comparison was both more accurate and faster for users. Clearly we should not rush out and replace all UI in places where a mouse and keyboard are working now.
  36. Roundtable sessions with fifty or more people in the room don’t work.
  37. There was a company using optical flow to fake accelerometer-type UI elements back before phones had accelerometers. On a related note, promo videos from old dead-end technologies are funny.
  38. By and large academics feel that augmented reality is poised to take off in a big way.
  39. Academics don’t drink nearly as much as game developers.
  40. Nobody has solved the problem of optical tracking in arbitrary outdoor environments as a means of correcting GPS and magnetometer error. The sensor fusion presentation from Graz was promising, however.
  41. ISMAR doesn’t treat their speakers very well. Apparently there was some question as to whether or not speakers would even get a free badge. That’s just silly. Speakers also shouldn’t have to buy their tickets to the award banquet that all attendees get for free.
  42. Some people think that “the Layar and Wikitude type apps” don’t count as real AR because they only use the camera for video pass through. Most people (including some of the people in the first group) agree that it doesn’t really matter whether these apps are AR or not.
  43. Video pass-through introduces massive latency, which can cause significant issues with perception of haptic feedback.
  44. Natasha Tsakos is happy to use the same shtick to open her talks at both TED and ISMAR.
  45. AR researchers are poor at name badge design. Badges should include company/university name. The name of the attendee is the most important thing on the badge and should be larger than everything else. The ISMAR badges had three lines of text, all the same size:
    • ISMAR 2009
    • Your Name
    • Science and Technology or Arts and Humanities.
  46. Nobody in the ISMAR community takes the various advertising uses of AR too seriously.
  47. You shouldn’t register for a conference on the day registration opens. Apparently the regonline account was still in test mode for the first day or so and all the people who registered that day didn’t really register (or have a charge appear on their credit cards.)
  48. There is a strong bias toward computer vision and away from other sensors among many researchers.
  49. Orlando was not made for walking.
  50. ISMAR 2009 was totally worth attending.

I am so happy I went. ISMAR reinvigorated my interest in AR and allowed me to meet many great people. I wonder if I’ll be able to swing a trip to Seoul for ISMAR 2010.

My Layar development experience

When SPRX Mobile announced that they were opening up the Layar API back in July, I applied immediately. I wanted to learn more about publishing geo-coded data, keep abreast of what Layar was up to, and try to deliver some useful data all at the same time. Fortunately my application was accepted and I received one of the first batch of API keys to go out.

My specific project has been to take the real-time bus arrival information provided by One Bus Away and publish it on the Layar platform. I use the mobile-formatted One Bus Away website at least twice per workday as part of my commute. This data is currently only available in Seattle, but will soon be expanding to everywhere that offers a GTFS feed. My feelings about this experience have been almost entirely positive, but I still come away from it discouraged.

On one hand, the people building Layar (Dirk in particular) have been very helpful. The platform is easy to develop for and they provide good documentation and tools to make it even easier. All of the time spent on this project (which took less than 24 working hours, total) was spent figuring out Google App Engine, the Python web framework I used, the One Bus Away API, and how to filter nearby stops to a reasonable set to show to a user. With minor exceptions Layar performed very well. I have provided all 436 lines of code here so you can see for yourself how easy it was.

Marjolein and Claire from SPRX were helpful in less technical ways too.  All developers were invited to the launch event to show off their layers. They ran several conference calls for people all over the world to answer any questions on the API or about the launch. SPRX has done a great job with the launch of Layar 2.0, and I think all the positive press they have received is a direct result of that.

My discouragement has less to do with Layar specifically than it does with the entire category of tricorder augmented reality. The view through the mobile phone and its camera is less useful than a top-down map would be for every piece of data I have seen so far. For my layer in particular, the rider is very likely to know where the stop is. In situations like that where location is unimportant, both the Reality View and Map View actually get in the way.

This experience has led me to two conclusions. First, augmented vision is pointless until head-mounted displays are available. I already felt that way, so now I am just more firm in my belief. Second, filtering data to a useful subset for display is actually the hard problem. Job listing sites, travel sites, e-commerce sites, and review sites already knew this, which is why they spend so much effort on search. It turns out the problem is the same for mobile location-aware services.
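
To make the filtering problem concrete, here is a minimal sketch of the simplest possible filter: keep only the stops within some radius of the user, nearest first, capped at a fixed count. This is not the code from my layer. The `Stop` type, the `nearby_stops` function, the 500 m radius, and the limit of 10 are all hypothetical choices for illustration.

```python
import math
from typing import NamedTuple

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


class Stop(NamedTuple):
    """Hypothetical bus stop record: a name plus a WGS84 position."""
    name: str
    lat: float
    lon: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_stops(stops, lat, lon, radius_m=500.0, limit=10):
    """Return up to `limit` (distance_m, stop) pairs within radius, nearest first."""
    pairs = [(haversine_m(lat, lon, s.lat, s.lon), s) for s in stops]
    pairs = [p for p in pairs if p[0] <= radius_m]
    pairs.sort(key=lambda p: p[0])
    return pairs[:limit]


# Made-up stops north of a made-up user position.
stops = [
    Stop("far", 47.61, -122.33),
    Stop("near", 47.601, -122.33),
    Stop("mid", 47.603, -122.33),
]
for distance, stop in nearby_stops(stops, 47.60, -122.33):
    print(f"{stop.name}: {distance:.0f} m")
```

Distance alone is a crude relevance signal. A rider usually cares about one or two specific stops regardless of where they are standing, which is exactly why this turns into a search problem rather than a geometry problem.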

If you live in Seattle and would like to try out the One Bus Away layer for Layar, just search for One Bus Away inside of Layar.  I welcome your feedback on how I could make this layer more useful. And, of course, I would also love to hear your thoughts on the utility of augmented vision on a mobile phone.

Cameras vs. Sensors

If you search for “augmented reality” in Google, most of the hits will involve systems that analyze the output of a video stream in order to figure out what to draw in the overlay and where to draw it.  Sometimes the what and where are answered by the same marker (as in the endless YouTube AR clips.) In the more interesting examples the what comes from using the camera to figure out where the camera is pointed in more general terms and then to draw something positioned in some sort of known coordinate space (like PTAM or the recently announced MetaIO World.) This latter approach is broadly termed visual odometry. This seems to be what most people think of when they refer to AR, and that is no surprise given how much academic AR research focusses on computer vision.

As Wikitude (and more recently Layar, Nearest Tube, and Wimbledon Seer) has shown us, there is another way. Making sense of a video stream is hard, particularly on a mobile device. Why not just use the non-camera sensors on that device (GPS receiver, tilt sensor, and compass) to provide the absolute position and orientation of the device and then look up nearby waypoints from some sort of database? This approach makes these applications more similar to map-based location-aware apps (like Whrrl and Urban Spoon) than to those YouTube videos, but it’s not clear that users care.
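
The core of the sensor-based approach fits in a few lines. The sketch below is my own illustration, not code from any of the apps named above: it computes the compass bearing from the device to a waypoint, compares it to the device heading, and maps the difference to a horizontal screen position. The function names, the 60 degree field of view, and the 480 pixel screen width are assumptions, and the linear angle-to-pixel mapping is a simplification of a real camera projection. Pitch from the tilt sensor would be handled the same way on the vertical axis.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    return math.degrees(math.atan2(y, x)) % 360


def screen_x(heading_deg, target_bearing_deg, fov_deg=60.0, screen_width_px=480):
    """Horizontal pixel position for a waypoint, or None if it is outside the view."""
    # Signed angle from the view direction to the target, wrapped to [-180, 180).
    offset = (target_bearing_deg - heading_deg + 180) % 360 - 180
    if abs(offset) > fov_deg / 2:
        return None
    # Linear mapping: left edge at -fov/2, center at 0, right edge at +fov/2.
    return round((offset / fov_deg + 0.5) * screen_width_px)


# Device at a made-up position facing due east; waypoint slightly east of it.
target = bearing_deg(47.60, -122.33, 47.60, -122.32)
print(f"bearing to waypoint: {target:.1f} degrees")
print(f"screen x: {screen_x(90.0, target)}")
```

Everything here depends only on GPS position and compass heading. No pixels from the camera are ever examined, which is why the camera in these apps is effectively just a backdrop.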

Using sensors to determine position and orientation has key advantages. The first is that it works in more environments. While GPS often fails indoors, it works fine at night, at sea, and on most parts of the earth. Visual odometry has been shown to work relative to a start point (basically where you start up the tracking system) but not relative to an absolute coordinate system. GPS is also immune to nearby objects moving around. Real world scenes are very dynamic, and moving cars, furniture, and people around can throw off vision-based systems. Tilt sensors composed of accelerometers and gyros are quite good at returning stable, accurate pitch and roll values. Compasses are somewhat less reliable due to their susceptibility to nearby magnetic fields and large chunks of metal, but they are still able to give you a reasonable approximation of heading. Tilt sensors and compasses also work fine indoors and out of doors.

On the other hand, vision-based tracking systems have advantages of their own. The biggest is accuracy. PTAM demonstration videos show an accuracy down to a centimeter or less. Marker-based approaches show even better accuracy. Compare that to the two meters that represents the best possible accuracy of a GPS receiver. Those two orders of magnitude mean that GPS-based AR systems simply don’t work for objects that are less than ten meters away. The second advantage for vision-based systems is that there are many cases where it is impractical to know about all the objects in the user’s field of view. They aren’t there yet, but advanced computer vision techniques offer hope that one day a computer will be able to recognize any arbitrary object simply by looking at it. And until that day arrives there are already readily indexed markers on most items in the form of UPC codes. GPS will never provide such a service, and even if every item in the world had an RFID tag there is no way that every person would have access to the database into which those tags are indices.
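
The ten meter claim is easy to check with a little trigonometry. This sketch assumes the worst case, where the whole position error is perpendicular to the line of sight, and ignores compass error entirely; the function name is mine.

```python
import math


def angular_error_deg(position_error_m, distance_m):
    """Worst-case angle by which an overlay misses its target.

    Assumes the full position error is perpendicular to the line of sight.
    """
    return math.degrees(math.atan2(position_error_m, distance_m))


# Two meters of GPS error at several viewing distances.
for distance in (10, 100, 1000):
    error = angular_error_deg(2.0, distance)
    print(f"{distance:>5} m away: up to {error:.1f} degrees off")
```

At ten meters a two meter error can put the overlay more than eleven degrees off target, which is a large fraction of a phone camera’s field of view. At a hundred meters the same error is only about one degree, which is why these apps work tolerably for distant landmarks and badly for anything nearby.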

Despite those shortcomings, my belief is that pragmatism is going to result in GPS-based systems winning this fight. The fact is that today’s GPS-based solutions actually work in the general case, while vision-based ones have only worked well in very controlled demonstrations. If pioneering companies like SPRX Mobile and Mobilizy start to make money then capital is going to start flowing into this industry. Most of those new companies are going to follow the lead of the existing players and prefer GPS to computer vision. Eventually that will drive sensor-based approaches to get better, faster than vision-based approaches, which will encourage more investment until eventually vision-based AR tracking systems are left in the dust. One of these improvements could be Galileo, which is expected to offer GPS accuracy down to 2 cm. When vision researchers eventually solve the object recognition problem those solutions will be integrated into already existing AR platforms with sensor-based trackers.

What do you think? Do you see vision-based systems coming out on top? Will non-camera sensors be king? Or is a hybrid system the only way to go long-term?