Showing posts with label 3D.

Sunday, August 17, 2008

Using Photographs to Enhance Videos of a Static Scene


(Embedded video: "Using Photographs to Enhance Videos of a Static Scene," on Vimeo.)
Scientists at the University of Washington are pushing the boundaries of what computers can do with video. Their system takes a series of still images of the same scene as a video and transfers some of the superior qualities of the stills to the video: lighting, resolution, texture and camera movement can all be enhanced.

The software can also be used to edit a few frames of a video to remove unwanted objects and have the computer propagate those edits through the whole video. Examples shown include removing a "no parking" sign from the foreground of a flower shop, and removing an unsightly scar from a tree.
These researchers are making progress toward what I'm looking for: the ability of a computer to watch my videos for me, and create an artificial reality populated with the people and places in those videos.

Watch the video to get an idea of all the capabilities that these researchers are working on.

Friday, May 02, 2008

More Than "Just" The World's Oldest Cyborg


I just noticed recently that Steve Mann, formerly of MIT and now a professor at the University of Toronto, has been actively involved in some of the things I am interested in, for quite a while. Not just wearable computers, which I am waiting impatiently for, but image processing and interpretation. He has developed a program - Video Orbits - for stitching video frames together into stills. One property of note: if the video zooms, then the image formed in that region has higher resolution. If the exposure changes, then the dynamic range of the image increases. It's like layering data upon data.
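To make the stitching idea concrete, here is a rough sketch (my own illustration in Python with OpenCV, not Mann's actual Video Orbits code) of the core step: registering one frame of a panning video onto another with a homography and pasting it onto a larger canvas. The file names are placeholders, and a real compositor would blend many frames and handle changing exposure, not just two frames.

```python
# Rough sketch: register frame_b onto frame_a with a homography and
# composite onto a larger canvas. File names are placeholders.
import cv2
import numpy as np

frame_a = cv2.imread("frame_a.png")   # reference frame
frame_b = cv2.imread("frame_b.png")   # frame to warp onto it
gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)

# Detect and match features between the two frames
orb = cv2.ORB_create(2000)
kp_a, des_a = orb.detectAndCompute(gray_a, None)
kp_b, des_b = orb.detectAndCompute(gray_b, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_b, des_a), key=lambda m: m.distance)[:500]

# Estimate the homography mapping frame_b into frame_a's coordinates
src = np.float32([kp_b[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_a[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Warp frame_b onto a canvas twice the size of frame_a; the warped frame
# fills in whatever extends beyond the reference frame's borders.
h, w = frame_a.shape[:2]
canvas = cv2.warpPerspective(frame_b, H, (2 * w, 2 * h))
canvas[:h, :w] = frame_a
cv2.imwrite("composite.png", canvas)
```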
Professor Mann has been wearing a computer for almost 30 years, as shown in the above illustration. Of most interest to me is the idea of mediated reality, wherein the computer looks at and interprets what you are looking at, and modifies the scene before presenting it to you. These modifications could include directions to a destination ("follow the yellow line"), a name tag for someone you run into whose name you should remember but don't, or any other sort of context-relevant information. It could even present a wildly distorted picture of the world, if that's what you want. Or it could save you from being inundated by external media: it can replace billboards with countryside, or make crowds transparent, so that you don't feel crowded (say, at Disneyland). At the same time it could highlight obstacles you are in danger of colliding with, so you don't keep running into these invisible people!
The sensing/display device that is being studied and (hopefully) developed is called the EyeTap. Within its tiny-enough-to-wear eyeglass frame are both a camera and a display. A mirror sends incoming light to the camera, and also sends the image from a micro-display back into your field of view. Between the camera and display, a wearable computer does all the image analysis, recognition and resynthesis, "mediating" your view of the world. One cool result of this is that head tracking is done directly from analysis of the image; the system doesn't need a gyro!
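The gyro-free head tracking makes sense when you look at the geometry: if the camera only rotates (as a head-mounted camera roughly does), the homography H between two frames satisfies H = K R K⁻¹, where K is the camera's intrinsic matrix and R is the head rotation. Here's a small sketch of that recovery step - again my own illustration, not EyeTap code, with a made-up K; in practice K would come from calibration and H from registering the frames as in the sketch above.

```python
# Rough sketch: recover head rotation from a frame-to-frame homography,
# assuming a purely rotating camera. K below is a made-up intrinsic matrix.
import numpy as np

K = np.array([[800.0,   0.0, 320.0],    # focal length and principal point
              [  0.0, 800.0, 240.0],    # (placeholder calibration values)
              [  0.0,   0.0,   1.0]])

def rotation_from_homography(H, K):
    """For a purely rotating camera H = K R K^-1, so R = K^-1 H K
    (up to scale); orthonormalize with an SVD to get a proper rotation."""
    R = np.linalg.inv(K) @ H @ K
    U, _, Vt = np.linalg.svd(R)
    R = U @ Vt
    if np.linalg.det(R) < 0:   # guard against picking up a reflection
        R = -R
    return R

def rotation_angle_degrees(R):
    """Total rotation angle encoded by a rotation matrix."""
    return np.degrees(np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)))

# Example: a homography built from a known small pan of 0.05 radians
theta = 0.05
R_true = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
                   [           0.0, 1.0,           0.0],
                   [-np.sin(theta), 0.0, np.cos(theta)]])
H = K @ R_true @ np.linalg.inv(K)
print(rotation_angle_degrees(rotation_from_homography(H, K)))  # ~2.86 degrees
```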
I'm not sure how far anyone has gotten with the tough problem of image understanding, but a quick Google search lists several links to universities involved with it. It involves face and object recognition, 3D perception and probably a lot more. This is all very encouraging!

Friday, November 16, 2007

2D Video to VR Environment and Characters




For the last few months, I've been thinking a lot about what I would ultimately do with the hours of video I have taken over the course of my children's lives so far. I can't imagine actually watching all of that video, or even trying to edit it down. I don't really want to experience it in a linear fashion. After all, I was there. And while I may have experienced it through the viewfinder of a camera, still - I was there.

I've concluded that what would be useful would be to have a computer watch the video for me!
The computer would interpret the video, identifying the time and place, capturing the environment, and digitizing the people, creating photo-realistic 3D avatars of them.
In the long run, these videos would be incorporated into my own (augmented) memory. The people's behaviors would be catalogued and pattern-recognized to the point that realistic simulations of the people - at various ages - could be made. I could have conversations and interactions with those who are no longer with me.

I can see that, ultimately, the computer's AI will be sufficient to really interpret the videos as well as I could if I were watching them. It would generate new memories, very similarly to what would happen to me - my memory "refreshed" - if I were to watch them.

Of course the computers of today aren't quite there yet. They are only now able to recognize the environment well enough to drive at about 14 miles an hour.

I would really like to see this technology developed, and I would like to know as much about it as I can. I have started (in my mid-life now) reading up on projective geometry, C++ programming and so on so that I can do some hobbyist-level playing around with this technology. I look forward to the day when the tools start to exist that would enable me to begin tackling these piles of videos I have here at home.

Thanks to a fellow named Augusto Roman (thanks for the link!), I now have a tool that will get me started. It's called Voodoo. It's camera tracking software that will "watch" a video clip and create both a point cloud of what is in the video, and a camera path. Both of these things can be exported to Blender, a free (and quite powerful) 3D animation package. I am now planning to try this with a few video clips. Of course, the package assumes a static scene with just the camera moving (if I understand it correctly), so I know it (by itself) isn't going to get me to the goal of separating out the moving objects from the environment. But it is a start.
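To give a sense of what I have in mind, here is a rough sketch of the sort of script I expect to end up writing. It assumes the point cloud has been saved as plain "x y z" text (a format I'm assuming here; Voodoo can also export a Blender script directly, which is probably the easier route) and uses Blender's Python API (the newer 2.8+ style, where objects are linked into a collection) to turn the points into a vertex-only mesh.

```python
# Run from Blender's scripting workspace. Rough sketch assuming a plain
# text point cloud with one "x y z" triple per line; the path is a placeholder.
import bpy

points = []
with open("/path/to/voodoo_points.txt") as f:
    for line in f:
        parts = line.split()
        if len(parts) >= 3:
            points.append(tuple(float(v) for v in parts[:3]))

# Build a vertex-only mesh (no edges, no faces) from the tracked points
mesh = bpy.data.meshes.new("VoodooPointCloud")
mesh.from_pydata(points, [], [])
mesh.update()

# Link it into the current scene so the cloud shows up in the viewport
obj = bpy.data.objects.new("VoodooPointCloud", mesh)
bpy.context.collection.objects.link(obj)
```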

Voodoo Camera Tracker is at: http://www.digilab.uni-hannover.de/docs/manual.html

Tuesday, July 24, 2007

Photo Tourism demo


Here's a Java demo of Photo Tourism, the predecessor to Photosynth. The drag with Photosynth is that the demo only runs on Windows XP and Vista. Bleccch. Here, all you need is a Java-enabled browser (I'm using Firefox).

Photosynth

A great video showing a demonstration of Photosynth - a program (or database) that does some of what I have been talking about.