Augmented Vision and the Decade of Ubiquity

[Originally posted in March 2009]

“There is one thing stronger than all the armies in the world, and that is an idea whose time has come.” - Victor Hugo

“The best way to predict the future is to invent it. Really smart people with reasonable funding can do just about anything…” - Alan Kay

The Past

The concept of Augmented Reality has been around for a very long time, and not just in fiction. I’m not going to spend much time talking about what augmented reality (“AR”) is or should be; you can do that on your own. There are plenty of resources like Ori Inbar’s Games Alfresco out there that will get you up to speed quickly. Start there if you want to know who is who, and who is doing what. I’m not aware of any other resource on the net that is as definitive as this site is.

The Present

Augmented Reality is quickly becoming one of the buzzwords of 2009, mostly due to social networking, blogs, Twitter, and early exposure in mass media. Unless you have been living under a rock recently, you should have seen some marketing by GE, Toyota, Lego, and many others. While I think that it is too early for AR to have so much attention in the mass market, and it is already beginning to suffer overexposure in some circles, it is undeniably building momentum, and the early experimenters and adopters are diving right in with accessible tools.

For now, AR is mostly about superimposing graphics on a video stream (from a webcam). This requires some type of marker such as a glyph or fiducial with a symbol, or some other type of image such as a picture (like the front of a baseball card). In either case, the software uses the marker for two things…first, to determine registration and tracking (where the content and media should be displayed) and second, to determine what content to display. Some companies advertise the second method as markerless, but what they really mean is that they aren’t using the first method of a symbol or pattern. Let’s call all of this Level 1 AR.
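To make the two roles of a marker concrete, here is a minimal sketch in Python. It assumes some detector has already found a marker in the video frame; the `Detection` type, the `CONTENT` table, and the marker IDs are all hypothetical stand-ins, not the API of any real tracking library.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical table: which content each known marker should trigger.
CONTENT = {
    "card_042": "3d_model:baseball_player",
    "glyph_A": "video:product_demo",
}

@dataclass
class Detection:
    """Assumed output of a marker detector for one video frame."""
    marker_id: str
    x: float      # marker centre in image coordinates (pixels)
    y: float
    scale: float  # apparent size, a rough stand-in for distance

@dataclass
class Overlay:
    """What to draw, and where to draw it."""
    content: str
    x: float
    y: float
    scale: float

def overlay_for(detection: Detection) -> Optional[Overlay]:
    """Use the marker twice: its ID selects content, its position places it."""
    content = CONTENT.get(detection.marker_id)
    if content is None:
        return None  # unknown marker: nothing to draw
    return Overlay(content, detection.x, detection.y, detection.scale)
```

A real system would recover a full 3D pose rather than a 2D position and scale, but the split is the same: one lookup for the "what", one measurement for the "where".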

In most of these cases, this type of AR is mostly a novelty and fairly useless. Aside from some games like Sony’s Eye of Judgment, Int13’s Kweekies, and even Frank Lasorne’s AR Toys concept, which are all pretty damn cool, you won’t see very many applications worth more than a glance unless you break away from the desktop, take it mobile, and get rid of all types of printed markers. Now, we are talking about Level 2 AR.

Probably the best-known example of Level 2 is Mobilizy’s Wikitude-AR for the Android platform. As we move away from the desktop AR toys and start paying attention to where you are and what is around you, things get much more interesting. The mobile device becomes a lens that gives us the sensation of looking through and seeing the world around us layered with information, data, and visualizations. As an industry, we are only beginning to explore the possibilities here. The transformation of mobile phones into mobile internet devices (MIDs) with powerful processors, 3D graphics, and GPS functionality has already changed the way we think, communicate, and interact with media. Some, like MIT’s improperly named “Sixth Sense,” have this backwards by trying to project images onto objects instead of augmenting what you see. Others, like Tonchidot’s Sekai Camera, have the right idea, but their approach feels incomplete. It is one thing to associate or link media to a general location, but it is much better to link to specific objects and things. SprxMobile’s ATM finder for ING is another example of how early location-based augmented reality can be very useful.
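For the curious, the core of a location-based "lens" can be sketched in a few lines of Python. This is only an illustration under simple assumptions (a GPS fix, a compass heading, and a list of points of interest), not how Wikitude or any other product named above actually works. The function names, the 60-degree field of view, and the 1 km range are all made up for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (metres) and initial bearing (degrees from north)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    # Haversine formula for the distance.
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    dist = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    # Initial bearing from point 1 towards point 2.
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    bearing = math.degrees(math.atan2(y, x)) % 360
    return dist, bearing

def visible_pois(user_lat, user_lon, heading_deg, pois, fov_deg=60, max_dist_m=1000):
    """Return (name, distance_m, offset_deg) for POIs inside the camera's view."""
    hits = []
    for name, lat, lon in pois:
        dist, bearing = distance_and_bearing(user_lat, user_lon, lat, lon)
        # Signed angle between where the phone points and where the POI lies.
        offset = (bearing - heading_deg + 180) % 360 - 180
        if dist <= max_dist_m and abs(offset) <= fov_deg / 2:
            hits.append((name, round(dist), round(offset, 1)))
    return sorted(hits, key=lambda h: h[1])  # nearest first
```

The offset tells the renderer how far left or right of centre to draw each label. Note that this only ties media to a general location; linking to specific objects, as argued above, needs image recognition on top.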

The Future

Level 3 becomes Augmented Vision. This is an important distinction. We must break away from the monitor and display to lightweight transparent wearable displays (in an eyeglasses form factor). Once AR becomes AV, it is immersive. The whole experience immediately changes into something more relevant, contextual, and personal. This is radical and changes everything. As I have said before, this will be the next evolution in media. Print, Radio, Television, Internet, Augmented Reality (well, Vision).

L3 must also be mobile, massively multi-user, persistent, shared, dynamic, and ubiquitous. This requires a full-on convergence of a variety of technologies and disciplines, particularly powerful multi-core MIDs, pervasive wireless broadband, semantic search, intelligent pattern and image recognition, intelligent agents, hybrid service-oriented and client-server architectures, gesture interfaces, standardized communications protocols and data formats, easy-to-use and intuitive tools for application development and content creation, and many others. Depending on a number of factors and variables, we are two to three years from this being realized commercially, and maybe five to seven from dominating the mass market. Maybe longer.

2010 to 2020 will become The Decade of Ubiquity. Not only will Level 3 become a reality, but the advent of this will spawn entirely new industries, professions, and hundreds of thousands of jobs. The impact of L3 will be equal to or greater than the effect of the Internet and the Web combined. Nearly every industry will change in some way, and L3 technologies will have a dramatic effect on our day-to-day lives, jobs, education, entertainment, culture, politics, society, and so on. Even newspapers will evolve and reinvent themselves. Today’s web designers and artists will become holoscape designers…developers will create intelligent agents and bots that are capable of seamlessly interacting with the real and digital worlds (think about Star Trek: Voyager’s Holographic Doctor). Marketing and advertising will be completely reinvented and will be more interactive and dynamic than the targeted holographic advertising in Minority Report. The world around you becomes your display and your interface. Anything and everything will be tagged, labeled, interpreted, remembered, and filtered, in real time. Cyberspace, combined with L3 devices, will become something like a hive-mind collective consciousness and memory that we can all tap into at will. We don’t quite know how this is going to happen yet, but a lot of thought and effort is going into it right now. Ideas are beginning to become reality.

Early on, entertainment, advertising, and social communication will feel the effects the strongest. Massive amounts of revenue will be generated and the technology will begin to explode, disrupting the way we do everything. Next, education will get a huge shock, as will training, medicine, and business. Industry domination will first be focused on the hardware and software that users need. Then it will be controlled by whoever masters what goes on behind the scenes in the cloud of cyberspace.


You only have to see the Yellowbook Ads, HP’s Roku’s Reward, Soryn’s The Future of Education, Bruce Branit’s World Builder, Nokia’s Morph concept phone, and Microsoft’s Future Vision Series to get a glimpse of what is COMING and in some cases is almost already here.

The best examples of L3 AR, at least where we are headed to and what everyone is talking about for the near-future, include Vernor Vinge’s Rainbows End and Mitsuo Iso’s Denno Coil. If you don’t bother with anything else, at least pay attention to those two.

The Decade of Ubiquity is defined as the next ten years, in which every aspect of our lives will be permeated by digital and mobile media, data, information, augmented and virtual content, and so forth. It will be everywhere and accessible almost instantly. Everything will be connected, labeled, monitored, tracked, tagged, and interactive to some degree or another. We will break away from the desk, we will throw away our monitors, and our children will laugh at how large our iPhones are. They will struggle to understand how we ever managed to get work done with “windows,” “webpages,” and keyboards. They will be unable to fathom the concept of vinyl disks, typewriters, and landlines. But it all starts, and accelerates, during this next decade. Imagine everything that happened in the last decade, and multiply it. You haven’t seen anything yet. The next decade will make the last one pale in comparison.

The Distant Future

Level 4 is a long way off and is where we upgrade to contact lens displays and/or direct interfaces to the optic nerve and the brain. At this point, multiple realities collide and merge, and we end up with the Matrix. Without some amazing breakthroughs in a dozen fields, don’t expect this for another two or three decades. That is, assuming there is aggressive funding and R&D in the right areas. It won’t just happen on its own. There needs to be dedicated effort here. This is where Virtual Reality will finally come into its own and our dreams of pure and total immersion where we forget our bodies will finally be realized. OK, maybe just PlayStation 9.

Back to the Future

VAST Media is Virtual, Augmented and Simulations Technology Media. Virtual Worlds, Virtual Reality, Augmented Reality, MMORPGs, Simulations, and so on. In other words, any media that is usually based on technology and is generally three-dimensional. Print, Radio, and Television don’t count (this includes video). VAST Media today is still heavily segregated into individual industries with very little cross-pollination and sharing of theory, methodology, application, and leaders. This is slowly changing, but the fact remains that the technologies used for each are very similar. Until industry-wide convergence begins to occur, there will be little growth or advancement in any of the individual sectors. Virtual Reality went into a coma in the early-to-mid ’90s. Innovation in Virtual Worlds is barely measurable, with much of today’s state-of-the-art barely different from where it was a decade ago. MMORPGs have actually devolved in nearly every aspect. Some of the leading titles focus more on single-player gameplay, repetitive and static content, or aren’t even real 3D anymore. Augmented Reality, even as it is gaining momentum and excitement, is at risk of over-exposure and hype.

New leaders and thinkers are emerging and the hunger for creative innovation is beginning to gnaw at the bellies of Gen X’ers who miss the good old days of the Internet boom. Rapid advancements in mobile internet devices and tools for open development are fanning the fires. L2 will burst into the mainstream very soon, and the main thing holding L3 back is the wearable display companies that keep making promises but don’t seem to be actively and aggressively pushing the limits of technology. Too much emphasis is on miniature projectors or wearable displays so people can watch iPod/iPhone videos on the plane in privacy.

The world is nearing another dramatic paradigm shift and explosive growth in technology and economics, but we need to wake up. Demand more, better, stronger, faster, smaller. The future is ours to invent. Don’t be satisfied with mediocrity or lazy development.

We still have a long way to go, and there are plenty of obstacles and problems to be sorted out. Hardware has got to keep up this time (remember what happened to VR). This means that mobile devices have to crank it up real soon and compete with the desktop. Wearable display companies have got to quit screwing around, or they will single-handedly set back most efforts to push the envelope by years.

The architects of our augmented future need to think outside of the box as well. Forget everything you know about the internet, the web, web 2.0, virtual worlds, interface design, client/server, internet domains, etc. They MUST look at massively multiuser ubiquitous augmented reality with fresh eyes and vision. The paradigm is completely different. You can’t think about website design and development and ubiquitous AR at the same time. It isn’t about pages, servers, websites, or everything we have created over the last two decades. AR is about WHO you are, WHERE you are, WHAT is around you, WHAT you are doing, and WHO is nearby. Even things we take for granted, like anonymity on the internet, need to be thrown out and rethought. The user’s identity is absolutely key to building the future. So are other things like privacy, interoperability, context, semantics, interface, and so on. We have to be thinking about these things NOW if we are going to build the future in the next decade.

Even the way we think about media and content is going to be important. Types of media can be categorized as Passive, Active, Interactive, Dynamic, and Meta. Passive media is text, an image, a 3D object, or something else that just is, and is static. Active media does something. It might be animated, it could turn on and off, and it can have multiple states. Interactive media requires input and interaction with a user. Games are a good example of interactive media. Dynamic media has the ability to change or evolve. It can be influenced. Meta media is beyond all other media types and is usually created and driven by other media or data sources. An example of this would be dynamic media, such as a constantly shifting and transforming 3D shape with attributes such as size, color, texture, volume, and morphability determined by live input from some other source such as the stock market or an orchestra.
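As a rough sketch of how this taxonomy might look in code, the Python below names the five categories and models the meta example: a shape whose attributes are driven by an outside feed. The class names and the mapping from a stock’s percentage change to size and color are purely illustrative assumptions, not a proposed standard.

```python
from enum import Enum

class MediaType(Enum):
    PASSIVE = "static; it just is"
    ACTIVE = "does something; can have multiple states"
    INTERACTIVE = "requires input from a user"
    DYNAMIC = "can change or evolve; can be influenced"
    META = "created and driven by other media or data sources"

class MetaShape:
    """Meta media: a shape whose attributes are set by an outside data feed."""
    media_type = MediaType.META

    def __init__(self, base_size=1.0):
        self.base_size = base_size
        self.size = base_size
        self.color = "grey"  # neutral until the first reading arrives

    def update(self, price_change_pct):
        """Map one reading from the feed (here, a stock's % change) to size and color."""
        self.size = self.base_size * (1 + price_change_pct / 100)
        self.color = "green" if price_change_pct >= 0 else "red"

# Usage: feed the shape a few readings from the (hypothetical) market source.
shape = MetaShape(base_size=2.0)
for reading in (50, -25):
    shape.update(reading)
```

The point of the design is that the shape never decides anything on its own; swap the stock feed for an orchestra’s audio levels and the same object becomes a different piece of meta media.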

Think about all of that, but with other attributes and influences that are based on the who, what, where, when, how, and why that become important with mobile multiuser ubiquitous augmented reality and vision. Now, make it intelligent. The rabbit hole is getting very deep, isn’t it? You absolutely cannot create, architect, and develop this stuff while in the mindset of 1.0 or 2.0. You have to think ahead to 9.0, or better yet, throw out the whole “point oh” system to start with. Never mind Schrödinger’s Cat, think about his Dog.

You must change your perspective if you want to change how we see the world.

One good place to find out what some out-of-the-box thinkers are thinking is over at Tish Shute’s blog. Check it out; it is definitely worth your time. Her recent interviews with Mike Kuniavsky, Adam Greenfield, Usman Haque, and Andy Stanford-Clark are very interesting and in-depth.

What is your vision of the future?


Note to Venture Capitalists: Don’t even THINK about investing in anything remotely associated with Augmented Reality unless you are absolutely familiar with (that means having seen or read) these and others like Neal Stephenson’s Snow Crash and The Diamond Age, Roger Zelazny’s Donnerjack, Charles Stross’ Halting State, Larry Niven’s Dream Park, and just about anything by William Gibson and Bruce Sterling. My apologies to anyone I left out; this isn’t a definitive list by any means. Make sure you watch Masamune Shirow’s Ghost in the Shell as well. Beware of the slew of startups that will come out of nowhere in the next few years with no discernible business model or any real understanding of the tech. Everyone and their brother is going to try to jump on this bandwagon once the realization sets in that the next billion-dollar world-changing corporations are going to have something to do with L3. Get rich quick is going to be redefined.