Augmentation Without Boundaries

The 14th IEEE International Symposium on Mixed and Augmented Reality

S&T Accepted Contributions

Full Papers: 12
Short Papers: 10
Extended Posters: 27
Posters: 23
Demos: 18

Oral, poster, and demo presenters should check the Guidelines.

Full Papers

Important Notice

Starting with ISMAR 2015, all full papers of ISMAR are published as a special issue of IEEE Transactions on Visualization and Computer Graphics. Please cite these papers as follows:
AUTHORS, 2015: TITLE. IEEE Transactions on Visualization and Computer Graphics (Proc. ISMAR 2015), 21(11): ppp-ppp

Live Texturing of Augmented Reality Characters from Colored Drawings
Stephane Magnenat, Dat Tien Ngo, Fabio Zund, Mattia Ryffel, Gioacchino Noris, Gerhard Rothlin, Alessia Marra, Maurizio Nitti, Pascal Fua, Markus Gross, Robert W. Sumner
Oral on 1 Oct Demo Teaser on 30 Sep

Coloring books capture the imagination of children and provide them with one of their earliest opportunities for creative expression. However, given the proliferation and popularity of digital devices, real-world activities like coloring can seem unexciting, and children become less engaged in them. Augmented reality holds unique potential to impact this situation by providing a bridge between real-world activities and digital enhancements. In this paper, we present an augmented reality coloring book App in which children color characters in a printed coloring book and inspect their work using a mobile device. The drawing is detected and tracked, and the video stream is augmented with an animated 3-D version of the character that is textured according to the child’s coloring. This is possible thanks to several novel technical contributions. We present a texturing process that applies the captured texture from a 2-D colored drawing to both the visible and occluded regions of a 3-D character in real time. We develop a deformable surface tracking method designed for colored drawings that uses a new outlier rejection algorithm for real-time tracking and surface deformation recovery. We present a content creation pipeline to efficiently create the 2-D and 3-D content. And, finally, we validate our work with two user studies that examine the quality of our texturing algorithm and the overall App experience.

On-site Semi-Automatic Calibration and Registration of a Projector-Camera System Using Arbitrary Objects With Known Geometry
Christoph Resch, Hemal Naik, Peter Keitler, Steven Benkhardt, Gudrun Klinker
Oral on 1 Oct

In the Shader Lamps concept, a projector-camera system augments physical objects with projected virtual textures, provided that a precise intrinsic and extrinsic calibration of the system is available. Calibrating such systems has been an elaborate and lengthy task in the past and required a special calibration apparatus. Self-calibration methods in turn are able to estimate calibration parameters automatically with no effort. However they inherently lack global scale and are fairly sensitive to input data.
We propose a new semi-automatic calibration approach for projector-camera systems that - unlike existing auto-calibration approaches - additionally recovers the necessary global scale by projecting on an arbitrary object of known geometry. To this end our method combines surface registration with bundle adjustment optimization on points reconstructed from structured light projections to refine a solution that is computed from the decomposition of the fundamental matrix. In simulations on virtual data and experiments with real data we demonstrate that our approach estimates the global scale robustly and is furthermore able to improve incorrectly guessed intrinsic and extrinsic calibration parameters thus outperforming comparable metric rectification algorithms.

Radiometric Compensation for Cooperative Distributed Multi-Projection System through 2-DOF Distributed Control
Jun Tsukamoto, Daisuke Iwai, Kenji Kashima
Oral on 1 Oct

This paper proposes a novel radiometric compensation technique for cooperative projection systems based on distributed optimization. To achieve high scalability and robustness, we assume cooperative projection environments such that 1. each projector does not have any information about other projectors or about the target images, 2. the camera does not have any information about the projectors either, while having the target images, and 3. only a broadcast communication from the camera to the projectors is allowed, to limit the data transfer bandwidth. To this end, we first investigate a distributed optimization based feedback mechanism that is suitable for the required decentralized information processing environment. Next, we show that this mechanism works well for still image projection, but not necessarily for moving images, due to the lack of dynamic responsiveness. To overcome this issue, we focus on a specific structure of the distributed projector-camera system under consideration, and propose to implement an additional feedforward mechanism. Such a 2 Degree Of Freedom (2-DOF) control structure is well known in the control engineering community as a typical method to enhance disturbance rejection and reference tracking capability simultaneously. Indeed, we can theoretically guarantee that this 2-DOF structure yields a moving image projection accuracy that exceeds the best performance achievable by the distributed optimization mechanism alone. The effectiveness of the proposed method is demonstrated through physical projection experiments.

Structural Modeling from Depth Images
Thanh Nguyen, Gerhard Reitmayr, Dieter Schmalstieg
Oral on 30 Sep Demo

In this work, we present a new automatic system for scene reconstruction, which delivers high-level structural models. We start with identifying planar regions in depth images obtained with a SLAM system. Our main contribution is an approach which identifies constraints such as incidence and orthogonality of planar surfaces and uses them in an incremental optimization framework to extract high-level structural models. The result is a manifold mesh with a low number of polygons, immediately useful in many Augmented Reality applications.

Very High Frame Rate Volumetric Integration of Depth Images on Mobile Devices
Olaf Kahler, Victor Adrian Prisacariu, Carl Yuheng Ren, Xin Sun, Philip Torr, David Murray
Oral on 30 Sep Demo

Volumetric methods provide efficient, flexible and simple ways of integrating multiple depth images into a full 3D model. They provide dense and photorealistic 3D reconstructions, and parallelised implementations on GPUs achieve real-time performance on modern graphics hardware. Running such methods on mobile devices, providing users with freedom of movement and instantaneous reconstruction feedback, remains challenging, however. In this paper we present a range of modifications to existing volumetric integration methods based on voxel block hashing, considerably improving their performance and making them applicable to tablet computer applications. We present (i) optimisations for the basic data structure, and its allocation and integration; (ii) a highly optimised raycasting pipeline; and (iii) extensions to the camera tracker to incorporate IMU data. In total, our system thus achieves frame rates up to 43 Hz on an Nvidia Shield Tablet and 820 Hz on an Nvidia GTX Titan X GPU, or even beyond 1 kHz without visualisation.
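
As a rough illustration of what volumetric integration of depth images involves, the sketch below fuses successive depth observations into one voxel's truncated signed distance using a weighted running average, in the general style of KinectFusion-like systems. It is not the authors' implementation: the function name `integrate_voxel`, the truncation band `mu` and the weight cap `max_weight` are assumptions chosen for illustration, and voxel block hashing, raycasting and IMU-aided tracking are left out entirely.

```python
def integrate_voxel(tsdf, weight, voxel_depth, measured_depth,
                    mu=0.02, max_weight=100.0):
    """Fuse one depth measurement into a single voxel.

    tsdf, weight   : current truncated signed distance and its weight
    voxel_depth    : depth of the voxel centre along the camera ray (metres)
    measured_depth : depth reported by the sensor for that pixel (metres)
    mu             : truncation band in metres (hypothetical default)
    max_weight     : cap on the accumulated weight (hypothetical default)
    """
    sdf = measured_depth - voxel_depth
    if sdf < -mu:
        # The voxel lies well behind the observed surface, so this
        # measurement says nothing about it; leave it unchanged.
        return tsdf, weight
    # Scale to the truncation band and clamp values in front of the surface.
    new_sample = min(1.0, sdf / mu)
    # Weighted running average of all samples seen so far.
    fused = (tsdf * weight + new_sample) / (weight + 1.0)
    return fused, min(weight + 1.0, max_weight)
```

Calling the function once per voxel and per frame gives the basic integration loop; how voxels are allocated and visited is exactly where the data-structure optimisations described in the abstract would apply.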

MobileFusion: Real-time Volumetric Surface Reconstruction and Dense Tracking On Mobile Phones
Peter Ondruska, Shahram Izadi, Pushmeet Kohli
Oral on 30 Sep Demo

We present the first pipeline for real-time volumetric surface reconstruction and dense 6DoF camera tracking running purely on standard, off-the-shelf mobile phones. Using only the embedded RGB camera, our system allows users to scan objects of varying shape, size, and appearance in seconds, with real-time feedback during the capture process. Unlike existing state of the art methods, which produce only point-based 3D models on the phone, or require cloud-based processing, our hybrid GPU/CPU pipeline is unique in that it creates a connected 3D surface model directly on the device at 25Hz. In each frame, we perform dense 6DoF tracking, which continuously registers the RGB input to the incrementally built 3D model, minimizing a noise aware photoconsistency error metric. This is followed by efficient key-frame selection, and dense per-frame stereo matching. These depth maps are fused volumetrically using a method akin to KinectFusion, producing compelling surface models. For each frame, the implicit surface is extracted for live user feedback and pose estimation. We demonstrate scans of a variety of objects, and compare to a Kinect-based baseline, showing on average ~1.5cm error. We qualitatively compare to a state of the art point-based mobile phone method, demonstrating an order of magnitude faster scanning times, and fully connected surface models.

ModulAR: Eye-controlled Vision Augmentations for Head Mounted Displays
Jason Orlosky, Takumi Toyama, Kiyoshi Kiyokawa, Daniel Sonntag
Oral on 30 Sep Demo

In the last few years, the advancement of head mounted display technology and optics has opened up many new possibilities for the field of Augmented Reality. However, many commercial and prototype systems often have a single display modality, fixed field of view, or inflexible form factor. In this paper, we introduce Modular Augmented Reality (ModulAR), a hardware and software framework designed to improve flexibility and hands-free control of video see-through augmented reality displays and augmentative functionality. To accomplish this goal, we introduce the use of integrated eye tracking for on-demand control of vision augmentations such as optical zoom or field of view expansion. Physical modification of the device’s configuration can be accomplished on the fly using interchangeable camera-lens modules that provide different types of vision enhancements. We implement and test functionality for several primary configurations using telescopic and fisheye camera-lens systems, though many other customizations are possible. We also implement a number of eye-based interactions in order to engage and control the vision augmentations in real time, and explore different methods for merging streams of augmented vision into the user’s normal field of view. In a series of experiments, we conduct an in-depth analysis of visual acuity and head and eye movement during search and recognition tasks. Results show that methods with a larger field of view that utilize binary on/off and gradual zoom mechanisms outperform snapshot and sub-windowed methods, and that the type of eye engagement has little effect on performance.

Semi-Parametric Color Reproduction Method for Optical See-Through Head-Mounted Displays
Yuta Itoh, Maksym Dzitsiuk, Toshiyuki Amano, Gudrun Klinker
Oral on 30 Sep Poster on 30 Sep

The fundamental issues in Augmented Reality (AR) concern how to naturally mediate reality with virtual content as seen by users. In AR applications with Optical See-Through Head-Mounted Displays (OST-HMD), these issues often raise the problem of rendering color on the OST-HMD consistently with the input colors. However, due to various display constraints and eye properties, it is still a challenging task to reproduce colors on OST-HMDs indistinguishably. One approach to this problem is to pre-process the input color so that a user perceives the output color on the display to be the same as the input.
We propose a color calibration method for OST-HMDs. We start from modeling the physical optics in the rendering and perception process between the HMD and the eye. We treat the color distortion as a semi-parametric model which separates the non-linear color distortion and the linear color shift. We demonstrate that calibrated images regain their original appearance on two OST-HMD setups with both synthetic and real datasets. Furthermore, we analyze the limitations of the proposed method and remaining problems of color reproduction in OST-HMDs. We then discuss how to realize more practical color reproduction methods for future HMD-eye systems.

SoftAR: Visually Manipulating Haptic Softness Perception in Spatial Augmented Reality
Parinya Punpongsanon, Daisuke Iwai, Kosuke Sato
Oral on 2 Oct Poster on 2 Oct

We present SoftAR, a novel spatial augmented reality (AR) technique based on a pseudo-haptics mechanism in the human brain that visually manipulates the sense of softness perceived by a user pushing a soft physical object. Considering the limitations of projection-based approaches that change only the surface appearance of a physical object, we propose two projection visual effects, i.e., surface deformation effect (SDE) and body appearance effect (BAE), on the basis of the observations of humans pushing physical objects. The SDE visualizes a two-dimensional deformation of the object surface with a controlled softness parameter, and BAE changes the color of the pushing hand. Through psychophysical experiments, we confirm that the SDE can manipulate softness perception such that the participant perceives significantly greater softness than the actual softness. Furthermore, fBAE, in which BAE is applied only for the finger area, significantly enhances manipulation of the perception of softness. On the basis of the experimental results, we create a computational model that estimates perceived softness when SDE+fBAE is applied. We construct a prototype SoftAR system in which two application frameworks are implemented, i.e., softness adjustment and softness transfer. The former framework allows a user to adjust the softness parameter of a physical object, and the latter allows the user to replace the softness with that of another object. Through a user study of the prototype, we confirm that perceived softness can be manipulated with accuracy that is less than the just noticeable difference of softness perception. SoftAR does not require user-worn/hand-held equipment and allows users to feel significantly different softness perception without changing materials; therefore, we believe that it will be useful for various applications, particularly the design process of soft products, such as furniture, plush toys, and imitation materials.

Matching and Reaching Depth Judgments with Real and Augmented Reality Targets
J. Edward Swan II, Gurjot Singh, Stephen Ellis
Oral on 2 Oct

Many compelling augmented reality (AR) applications, such as image-guided surgery, manufacturing, and maintenance, that involve the dexterous manipulation of real and virtual objects at reaching distances, require users to correctly perceive the location of virtual objects. Some of these applications require aligning real and virtual objects with accuracies as tight as 1 mm or less. However, measuring the perceived depth of AR objects at these accuracies has not yet been demonstrated. In this paper, we address this challenge by employing two different depth judgment methods from the literature, blind reaching and perceptual matching, in a series of three experiments, where observers judged the depth of real and AR target objects presented at reaching distances. Both depth judgment methods are promising solutions to the measurement challenge, but both also have limitations that warrant additional study. Our experiments found that observers can very accurately match the distance of a real target, but systematically overestimate the distance to an AR target viewed through collimating optics, resulting in 0.5 to 4.0 cm of error. However, a model in which the collimating optics cause the eyes' vergence angle to rotate outward by a constant angular amount explains these results, which were replicated three times. These findings give error bounds on using collimating AR displays at reaching distances, and suggest that AR displays need to provide variable focus for these reaching-distance applications. Our experiments further found that observers initially reach ~4 cm too short, but reaching accuracy improves with both consistent proprioception and corrective visual feedback to become nearly as accurate as matching. An additional contribution is the design of an apparatus that affords measuring depth judgments to within a few millimeters of precision.
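
The constant-vergence-shift model mentioned in the abstract can be illustrated with simple triangulation geometry. The sketch below assumes symmetric fixation straight ahead, so that the vergence angle for a target at distance d is 2·atan(IPD / 2d), and then reduces that angle by a fixed amount before converting back to a distance. The interpupillary distance and the size of the shift are placeholder values chosen for illustration, not figures from the paper, and the function name `perceived_distance` is hypothetical.

```python
import math


def perceived_distance(true_distance_cm, ipd_cm=6.3, outward_shift_deg=0.5):
    """Distance implied by a vergence angle reduced by a constant amount.

    ipd_cm and outward_shift_deg are illustrative placeholders.
    """
    # Vergence angle when both eyes fixate a point straight ahead.
    vergence = 2.0 * math.atan(ipd_cm / (2.0 * true_distance_cm))
    # Rotating the eyes outward makes the vergence angle smaller.
    shifted = vergence - math.radians(outward_shift_deg)
    if shifted <= 0.0:
        raise ValueError("shift exceeds the vergence angle; "
                         "the lines of sight no longer converge")
    # Invert the triangulation to get the distance the eyes now indicate.
    return ipd_cm / (2.0 * math.tan(shifted / 2.0))
```

Under this geometry a fixed angular shift yields an overestimate that grows with target distance, which is qualitatively the pattern the abstract describes.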

Local Geometric Consensus: a general purpose point pattern-based tracking algorithm
Liming YANG, Jean-marie Normand, Guillaume Moreau
Oral on 1 Oct Demo Teaser on 30 Sep

We present a method which can quickly and robustly match 2D and 3D point patterns based on their sole spatial distribution, but it can also handle other cues if available. This method can be easily adapted to many transformations such as similarity transformations in 2D/3D, and affine and perspective transformations in 2D. It is based on local geometric consensus among several local matchings and a refinement scheme. We provide two implementations of this general scheme, one for the 2D homography case (which can be used for marker or image tracking) and one for the 3D similarity case. We demonstrate the robustness and speed performance of our proposal on both synthetic and real images and show that our method can be used to augment any (textured/textureless) planar objects but also 3D objects.

Instant Outdoor Localization and SLAM Initialization from 2.5D Maps
Clemens Arth, Christian Pirchheim, Jonathan Ventura, Dieter Schmalstieg, Vincent Lepetit
Oral on 1 Oct

We present a method for large-scale geo-localization and global tracking of mobile devices in urban outdoor environments. In contrast to existing methods, we instantaneously initialize and globally register a SLAM map by localizing the first keyframe with respect to widely available untextured 2.5D maps. Given a single image frame and a coarse sensor pose prior, our localization method estimates the absolute camera orientation from straight line segments and the translation by aligning the city map model with a semantic segmentation of the image. We use the resulting 6DOF pose, together with information inferred from the city map model, to reliably initialize and extend a 3D SLAM map in a global coordinate system, applying a model-supported SLAM mapping approach. We show the robustness and accuracy of our localization approach on a challenging dataset, and demonstrate unconstrained global SLAM mapping and tracking of arbitrary camera motion on several sequences.

Short Papers

Augmented Reality Scout: Joint Unaided-eye and Telescopic-zoom System for Immersive Team Training
Taragay Oskiper, Mikhail Sizintsev, Vlad Branzoi, Supun Samarasekera, Rakesh Kumar
Oral on 1 Oct

In this paper we present a dual, wide area, collaborative augmented reality (AR) system that consists of standard live view augmentation, e.g., from a helmet, and zoomed-in view augmentation, e.g., from binoculars. The proposed advanced scouting capability allows long range high precision augmentation of live unaided and zoomed-in imagery with aerial and terrain based synthetic objects, vehicles, people and effects. The inserted objects must appear stable in the display and not jitter or drift as the user moves around and examines the scene. The AR insertions for the binoculars must work instantly when they are picked up anywhere as the user moves around. The design of both AR modules is based on using two different cameras with wide and narrow field of view (FoV) lenses. The wide FoV gives context and enables the recovery of location and orientation of the prop in 6 degrees of freedom (DoF) much more robustly, whereas the narrow FoV is used for the actual augmentation and increased precision in tracking. Furthermore, the narrow camera in the unaided-eye module and the wide camera on the binoculars are jointly used for global yaw (heading) correction. We present our navigation algorithms using monocular cameras in combination with IMU and GPS in an Extended Kalman Filter (EKF) framework to obtain robust and real-time pose estimation for precise augmentation and cooperative tracking.

A Framework to Evaluate Omnidirectional Video Coding Schemes
Matt Yu, Haricharan Lakshman, Bernd Girod
Oral on 1 Oct

Omnidirectional videos of real world environments viewed on head-mounted displays with real-time head motion tracking can offer immersive visual experiences. For live streaming applications, compression is critical to reduce the bitrate. Omnidirectional videos, which are spherical in nature, are mapped onto one or more planes before encoding to interface with modern video coding standards. In this paper, we consider the problem of evaluating the coding efficiency in the context of viewing with a head-mounted display. We extract viewport based head motion trajectories, and compare the original and coded videos on the viewport. With this approach, we compare different sphere-to-plane mappings. We show that the average viewport quality can be approximated by a weighted spherical PSNR.
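
To make the idea of a weighted spherical PSNR concrete, here is a minimal sketch for frames stored in an equirectangular layout. The abstract does not say which weights the authors use; this example assumes a cosine-of-latitude weight, which compensates for the oversampling of rows near the poles, and that choice is an assumption rather than the paper's method. The function name `weighted_spherical_psnr` and the plain nested-list frame format are likewise hypothetical.

```python
import math


def weighted_spherical_psnr(reference, decoded, peak=255.0):
    """PSNR with per-row weights for equirectangular frames.

    reference, decoded : equal-sized 2-D lists of sample values
    peak               : maximum possible sample value
    """
    height = len(reference)
    weighted_error = 0.0
    weight_sum = 0.0
    for row in range(height):
        # Latitude of the row centre, from +pi/2 (top) to -pi/2 (bottom).
        latitude = (0.5 - (row + 0.5) / height) * math.pi
        w = math.cos(latitude)  # assumed weighting, see lead-in
        for ref_px, dec_px in zip(reference[row], decoded[row]):
            weighted_error += w * (ref_px - dec_px) ** 2
            weight_sum += w
    wmse = weighted_error / weight_sum
    if wmse == 0.0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / wmse)
```

With these weights, an error of a given size near the equator lowers the score more than the same error near a pole. A weight derived from measured viewing statistics could be substituted for `math.cos(latitude)` without changing the rest of the function.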

Tiled Frustum Culling for Differential Rendering on Mobile Devices
Kai Rohmer, Thorsten Grosch
Oral on 1 Oct

Mobile devices are part of our everyday life and allow augmented reality (AR) with their integrated camera image. Recent research has shown that even photorealistic augmentations with consistent illumination are possible. The first method to achieve this distributed the lighting computations and the extraction of the important light sources. To reach real-time frame rates on a mobile device, the number of these extracted light sources must be low, limiting the scope of possible illumination scenarios and the quality of shadows. In this paper, we show how to reduce the computational cost per light using a combination of tile-based rendering and frustum culling techniques tailored for AR applications. Our approach runs entirely on the GPU and does not require any precomputation. Without reducing the displayed image quality, we achieve up to 2.2x speedup for typical AR scenarios.
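
The general idea behind tile-based light culling can be sketched on the CPU as follows: the screen is divided into square tiles, and each tile keeps only the lights whose screen-space bounds overlap it, so that shading a pixel touches a short list instead of every light. This is a simplified 2-D illustration and not the GPU method from the paper. The function name `lights_per_tile`, the representation of a light as a bounding circle in pixels, and the tile size are all assumptions, and depth bounds and differential rendering are omitted.

```python
def lights_per_tile(lights, width, height, tile=16):
    """Bin lights into screen tiles.

    lights : list of (cx, cy, radius) bounding circles in pixels
    Returns a 2-D list bins[ty][tx] of light indices touching each tile.
    """
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    bins = [[[] for _ in range(tiles_x)] for _ in range(tiles_y)]
    for index, (cx, cy, radius) in enumerate(lights):
        for ty in range(tiles_y):
            for tx in range(tiles_x):
                # Closest point of the tile rectangle to the circle centre.
                nx = min(max(cx, tx * tile), (tx + 1) * tile)
                ny = min(max(cy, ty * tile), (ty + 1) * tile)
                # Keep the light if that point lies inside the circle.
                if (nx - cx) ** 2 + (ny - cy) ** 2 <= radius ** 2:
                    bins[ty][tx].append(index)
    return bins
```

A real implementation would run one thread group per tile and also cull against each tile's depth range; the brute-force loop here is kept only for readability.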

Simultaneous Direct and Augmented View Distortion Calibration of Optical See-Through Head-Mounted Displays
Yuta Itoh, Gudrun Klinker
Oral on 30 Sep Poster on 30 Sep

In Augmented Reality (AR) with an Optical See-Through Head-Mounted Display (OST-HMD), the spatial calibration between a user's eye and the display screen is a crucial issue in realizing seamless AR experiences. A successful calibration hinges upon proper modeling of the display system which is conceptually broken down into an eye part and an HMD part. This paper breaks the HMD part down even further to investigate optical aberration issues. The display optics causes two different optical aberrations that degrade the calibration quality: the distortion of incoming light from the physical world, and that of light from the image source of the HMD. While methods exist for correcting either of the two distortions independently, there is, to our knowledge, no method which corrects for both simultaneously.
This paper proposes a calibration method that corrects both distortions simultaneously for an arbitrary eye position given an OST-HMD system. We expand a light-field (LF) correction approach [8] originally designed for the former distortion. Our method is camera-based and has an offline learning and an online correction step. We verify our method in exemplary calibrations of two different OST-HMDs: a professional and a consumer OST-HMD. The results show that our method significantly improves the calibration quality compared to a conventional method, with an accuracy comparable to 20/50 visual acuity. The results also indicate that only correcting both distortions simultaneously can improve the quality.

Introducing Augmented Reality to Optical Coherence Tomography in Ophthalmic Microsurgery
Hessam Roodaki, Konstantinos Filippatos, Abouzar Eslami, Nassir Navab
Oral on 2 Oct Poster on 2 Oct

Augmented Reality (AR) in microscopic surgery has been the subject of several studies in the past two decades. Nevertheless, AR has not found its way into everyday microsurgical workflows. The introduction of new surgical microscopes equipped with Optical Coherence Tomography (OCT) enables surgeons to perform multimodal (optical and OCT) imaging in the operating room. Taking full advantage of such an elaborate source of information requires sophisticated intraoperative image fusion, information extraction, guidance and visualization methods. Medical AR is a unique approach to facilitate utilization of multimodal medical imaging devices. Here we propose a novel medical AR solution to the long-known problem of determining the distance between the surgical instrument tip and the underlying tissue in ophthalmic surgery, to further pave the way of AR into the surgical theater. Our method brings augmented reality to OCT for the first time by augmenting the surgeon's view of the OCT images with an estimated instrument cross-section shape and distance to the retinal surface, using only information from the shadow of the instrument in intraoperative OCT images. We demonstrate the applicability of our method in retinal surgery using a phantom eye and evaluate the accuracy of the augmented information using a micromanipulator.

Auditory and Visio-Temporal Distance Coding for 3-Dimensional Perception in Medical Augmented Reality
Felix Bork, Bernhard Fuerst, Anja-Katharina Schneider, Nassir Navab
Oral on 2 Oct

Image-guided medical interventions more frequently rely on Augmented Reality (AR) visualization to enable surgical navigation. Current systems use 2-D monitors to present the view from external cameras, which does not provide an ideal perception of the 3-D position of the region of interest. Despite this problem, most research targets the direct overlay of diagnostic imaging data, and only few studies attempt to improve the perception of occluded structures in external camera views. The focus of this paper lies on improving the 3-D perception of an augmented external camera view by combining both auditory and visual stimuli in a dynamic multi-sensory AR environment for medical applications. Our approach is based on Temporal Distance Coding (TDC) and an active surgical tool to interact with occluded virtual objects of interest in the scene in order to gain an improved perception of their 3-D location. Users performed a simulated needle biopsy by targeting virtual lesions rendered inside a patient phantom. Experimental results demonstrate that our TDC-based visualization technique significantly improves the localization accuracy, while the addition of auditory feedback results in increased intuitiveness and faster completion of the task.

RGBDX: First Design and Experimental Validation of a Mirror-based RGBD Xray Imaging System
Severine Habert, Jose Gardiazabal, Pascal Fallavollita, Nassir Navab
Oral on 2 Oct Poster on 2 Oct

This paper presents the first design of a mirror based RGBD X-ray imaging system and includes an evaluation study of the depth errors induced by the mirror when used in combination with an infrared pattern-emission RGBD camera. Our evaluation consisted of three experiments. The first demonstrated almost no difference in depth measurements of the camera with and without the use of the mirror. The final two experiments demonstrated that there were no relative and location-specific errors induced by the mirror showing the feasibility of the RGBDX-ray imaging system. Lastly, we showcase the potential of the RGBDX-ray system towards a visualization application in which an X-ray image is fused to the 3D reconstruction of the surgical scene via the RGBD camera, using automatic C-arm pose estimation.

The Ventriloquist Effect in Augmented Reality
Mikko Kytö, Kenta Kusumoto, Pirkko Oittinen
Oral on 2 Oct

An effective interaction in augmented reality (AR) requires utilization of different modalities. In this study, we investigated orienting the user in bimodal AR. Using auditory perception to support visual perception provides a useful approach for orienting the user to directions that are outside of the visual field-of-view (FOV). In particular, this is important in path-finding, where points-of-interest (POIs) can be all around the user. However, the ability to perceive the audio POIs is affected by the ventriloquism effect (VE), which means that audio POIs are captured by visual POIs. We measured the spatial limits for the VE in AR using a video see-through head-worn display. The results showed that the amount of the VE in AR was approx. 5 deg - 15 deg higher than in a real environment. In AR, spatial disparity between an audio and visual POI should be at least 30 deg of azimuth angle, in order to perceive the audio and visual POIs as separate. The limit was affected by azimuth angle of visual POI and magnitude of head rotations. These results provide guidelines for designing bimodal AR systems.

Augmented Reality during Cutting and Tearing of Deformable Objects
Christoph Paulus, Nazim HAOUCHINE, David Cazier, Stephane Cotin
Oral on 1 Oct Poster on 1 Oct

Current methods dealing with non-rigid augmented reality only provide an augmented view when the topology of the tracked object is not modified, which is an important limitation. In this paper we solve this shortcoming by introducing a method for physics-based non-rigid augmented reality. Singularities caused by topological changes are detected by analyzing the displacement field of the underlying deformable model. These topological changes are then applied to the physics-based model to approximate the real cut. All these steps, from deformation to cutting simulation, are performed in real-time. This significantly improves the coherence between the actual view and the model, and provides added value.

Efficient Computation of Absolute Pose for Gravity-Aware Augmented Reality
Chris Sweeney, John Flynn, Benjamin Nuernberger, Matthew Turk, Tobias Höllerer
Oral on 1 Oct Poster on 1 Oct

We propose a novel formulation for determining the absolute pose of a single or multi-camera system given a known vertical direction. The vertical direction may be easily obtained by detecting the vertical vanishing points with computer vision techniques, or with the aid of IMU sensor measurements from a smartphone. Our solver is general and able to compute absolute camera pose from two 2D-3D correspondences for single or multi-camera systems. We run several synthetic experiments that demonstrate our algorithm's improved robustness to image and IMU noise compared to the current state of the art. Additionally, we run an image localization experiment that demonstrates the accuracy of our algorithm in real-world scenarios. Finally, we show that our algorithm provides increased performance for real-time model-based tracking compared to solvers that do not utilize the vertical direction and show our algorithm in use with an augmented reality application running on a Google Tango tablet.

Extended Posters

Augmented Reality for Radiation Awareness
Nicola Leucht, Severine Habert, Patrick Wucherer, Simon Weidert, Nassir Navab, Pascal Fallavollita
Poster on 1 Oct Teaser on 1 Oct

C-arm fluoroscopes are frequently used during surgeries for intra-operative guidance. Unfortunately, due to X-ray emission and scattering, increased radiation exposure occurs in the operating theatre. The objective of this work is to sensitize the surgeon to their radiation exposure, enable them to check on their exposure over time, and to help them choose their best position related to the C-arm gantry during surgery. First, we aim at simulating the amount of radiation that reaches the surgeon using the Geant4 software, a toolkit developed by CERN. Using a flexible setup in which two RGB-D cameras are mounted to the mobile C-arm, the scene is captured and modeled respectively. After the simulation of particles with specific energies, the dose at the surgeon’s position, determined by the depth cameras, can be measured. The validation was performed by comparing the simulation results to both theoretical values from the C-arm's user manual and real measurements made with a QUART didoSVM dosimeter. The average error was 16.46% and 16.39%, respectively. The proposed flexible setup and high simulation precision, achieved without calibration against measured dosimeter values, have great potential to be directly used and integrated intraoperatively for dose measurement.

Remote Mixed Reality System Supporting Interactions with Virtualized Objects
Peng Yang, Itaru Kitahara, Yuichi Ohta
Poster on 1 Oct Teaser on 1 Oct

Mixed Reality (MR) can merge real and virtual worlds seamlessly. This paper proposes a method to realize smooth collaboration using a remote MR, which makes it possible for geographically distributed users to share the same objects and communicate in real time as if they were in the same place. In this paper, we consider a situation in which users at local and remote sites perform collaborative work, and the real objects to be operated exist only at the local site. It is necessary to share the real objects between the two sites. In prior studies, sharing real objects by duplication is either too costly or unrealistic. Therefore, we propose a method to share the objects by virtualizing the real objects using Computer Vision (CV) and then rendering the virtualized objects using MR. We also realize the interaction with virtualized objects by the remote site user and construct a remote collaborative work system. Through experiments, we confirmed the effectiveness of our approach.

Fusion of Vision and Inertial Sensing for Accurate and Efficient Pose Tracking on Smartphones
Xin Yang, Tim Cheng
Poster on 1 Oct Teaser on 1 Oct

This paper aims at accurate and efficient pose tracking of planar targets on modern smartphones. Existing methods, relying on either visual features or motion sensing based on built-in inertial sensors, are either too computationally expensive to achieve real-time performance on a smartphone, or too noisy to achieve sufficient tracking accuracy. In this paper we present a hybrid tracking method which can achieve real-time performance with high accuracy. Based on the same framework as a state-of-the-art visual feature tracking algorithm [5] which ensures accurate and reliable pose tracking, the proposed hybrid method significantly reduces its computational cost with the assistance of a phone’s built-in inertial sensors. However, noise in inertial sensors and abrupt errors in feature tracking due to severe motion blur could result in instability of the hybrid tracking system. To address this problem, we propose to employ an adaptive Kalman filter with abrupt error detection to robustly fuse the inertial and feature tracking results. We evaluated the proposed method on a dataset consisting of 16 video clips with synchronized inertial sensing data. Experimental results demonstrated our method’s superior performance and accuracy on smartphones, compared to a state-of-the-art vision tracking method [5]. The dataset will be made publicly available with the publication of this paper.
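
The idea of fusing inertial predictions with visual measurements while rejecting abrupt errors can be illustrated with a deliberately simplified sketch. The filter below is not the authors' adaptive Kalman filter: it tracks a single pose parameter, uses fixed noise values, and stands in for abrupt error detection with a plain innovation gate. The class name `GatedKalman1D` and all parameter values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GatedKalman1D:
    """One-dimensional Kalman filter with a simple innovation gate (illustrative only)."""
    x: float = 0.0     # state estimate, e.g. one pose angle
    p: float = 1.0     # variance of the estimate
    q: float = 0.01    # process noise added at each prediction (assumed value)
    r: float = 0.1     # measurement noise variance (assumed value)
    gate: float = 3.0  # reject measurements beyond this many standard deviations

    def predict(self, delta: float) -> None:
        """Propagate the state with an inertial increment, e.g. an integrated gyro rate."""
        self.x += delta
        self.p += self.q

    def update(self, z: float) -> bool:
        """Fuse a visual measurement z; return False if it is rejected as an abrupt error."""
        innovation = z - self.x
        s = self.p + self.r  # innovation variance
        if innovation * innovation > self.gate ** 2 * s:
            return False     # outlier: keep the inertial prediction
        k = self.p / s       # Kalman gain
        self.x += k * innovation
        self.p *= (1.0 - k)
        return True


kf = GatedKalman1D()
kf.predict(1.0)               # inertial sensors report a change of 1.0
accepted = kf.update(1.2)     # plausible visual measurement is fused
rejected = not kf.update(50)  # wildly inconsistent measurement is ignored
```

A truly adaptive filter would also re-estimate `q` and `r` online; the gate here only shows where abrupt errors can be caught before they corrupt the state.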

Augmenting mobile C-arm fluoroscopes via Stereo-RGBD sensors for multimodal visualization
Severine Habert, Meng Ma, Wadim Kehl, Xiang Wang, Federico Tombari, Pascal Fallavollita, Nassir Navab
Poster on 1 Oct Teaser on 1 Oct

Fusing intraoperative X-ray data with real-time video in a common reference frame is not trivial since both modalities have to be acquired from the same viewpoint. The goal of this work is to design a flexible system comprising two RGBD sensors that can be attached to any mobile C-arm, with the objective of synthesizing projective images from the X-ray source viewpoint. To achieve this, we calibrate the RGBD sensors followed by the X-ray source with a 3D calibration object. Then, we synthesize the projective image from the X-ray viewpoint by applying a volumetric-based rendering method. Finally, the X-ray image is overlaid on the projective image without any further registration, offering a multimodal visualization of X-ray and color images. In this paper we present the different steps of development (i.e. hardware setup, calibration and rendering algorithm) and discuss clinical applications for the new video augmented C-arm. By placing X-ray markers on a hand patient and a spine model, we show that the overlay accuracy between the X-ray image and the synthesized image is on average 1.7 mm.

Natural user interface for ambient objects
Meng Ma, Kevin Merckx, Pascal Fallavollita, Nassir Navab
Demo Teaser on 30 Sep

We present a natural gesture interface for ambient objects using a wearable RGB-D sensor. The aim of this work is to propose a methodology that accurately determines where a user is pointing when gesturing with their finger. First, the wearable RGB-D sensor is affixed around the user’s forehead. A calibration between the user’s eyes and the RGB-D camera is performed by having the user move their fingers along their line of sight. We detect the fingertip in the depth camera and then find the direction of the line of sight. Finally, we estimate where the user is pointing in the RGB image in different scenarios with a depth map, a detected object and a controlled virtual element. To validate our methods, we perform a point-to-screen experiment. Results demonstrate that when a user is interacting with a display up to 1.5 meters away, our natural gesture interface has an average error of 2.1 cm. In conclusion, the presented technique is a viable option for reliable user interaction.
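
The geometric core of such a pointing interface, a line of sight from the eye through the fingertip intersected with a target surface, can be sketched as a ray-plane intersection. This is a minimal illustration under the assumption that the target is a flat display with a known position and normal; the function name `pointing_target` and the coordinate values are hypothetical and not taken from the paper.

```python
def pointing_target(eye, fingertip, plane_point, plane_normal):
    """Intersect the eye-to-fingertip ray with a plane.

    All arguments are (x, y, z) tuples in the same coordinate frame.
    Returns the intersection point, or None if the ray is parallel to
    the plane or points away from it.
    """
    direction = tuple(f - e for f, e in zip(fingertip, eye))
    denom = sum(d * n for d, n in zip(direction, plane_normal))
    if abs(denom) < 1e-9:
        return None  # line of sight is parallel to the display
    t = sum((p - e) * n for p, e, n in zip(plane_point, eye, plane_normal)) / denom
    if t < 0:
        return None  # display lies behind the user
    return tuple(e + t * d for e, d in zip(eye, direction))


# Eye at the origin, fingertip 0.5 m ahead and 0.1 m to the right,
# display plane 1.5 m in front of the user (assumed example values).
hit = pointing_target((0.0, 0.0, 0.0), (0.1, 0.0, 0.5), (0.0, 0.0, 1.5), (0.0, 0.0, 1.0))
```

In this example the ray reaches the display at roughly x = 0.3 m, which shows how a small fingertip offset is magnified with distance and why calibration accuracy matters.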

INCAST: Interactive Camera Streams for Surveillance Cams AR
István Szentandrási, Michal Zachariáš, Rudolf Kajan, Jan Tinka, Markéta Dubská, Jakub Sochor, Adam Herout
Poster on 1 Oct Teaser on 1 Oct

Augmented reality does not make any sense for fixed cameras. Or does it? In this work, we are dealing with static cameras and their usability for augmented reality applications. Knowing that the camera does not move makes camera pose estimation both less and more difficult: one does not have to deal with pose change over time, but on the other hand, obtaining some level of understanding of the scene from a single viewpoint is challenging. We propose several ways to take advantage of the camera being static, and a pipeline for a system that broadcasts a video stream enriched with the information needed for its visual augmentation: Interactive Camera Streams, INCAST. We present a proof-of-concept system showing the usability of INCAST in several use cases: non-interactive demos and simple AR games.

Natural 3D Interaction using a See-through Mobile AR System
Yuko Unuma, Takashi Komuro
Poster on 1 Oct Teaser on 1 Oct

In this paper, we propose an interaction system in which the appearance of the image displayed on a mobile display is consistent with that of the real space and that enables a user to interact with virtual objects overlaid on the image using the user’s hand. The three-dimensional scene obtained by a depth camera is projected according to the user’s viewpoint position obtained by face tracking, and the see-through image whose appearance is consistent with that outside the mobile display is generated. Interaction with virtual objects is realized by using the depth information obtained by the depth camera. To move virtual objects as if they were in real space, virtual objects are rendered in the world coordinate system that is fixed to a real scene even if the mobile display moves, and the direction of gravitational force added to virtual objects is made consistent with that of the world coordinate system. The former is realized by using the ICP (Iterative Closest Point) algorithm and the latter is realized by using the information obtained by an accelerometer. Thus, natural interaction with virtual objects using the user’s hand is realized.

Augmented Wire Routing Navigation for Wire Assembly
Mark Rice, Hong Huei Tay, Jamie Ng, Calvin Lim, Senthil Selvaraj, Ellick Wu
Poster on 1 Oct Teaser on 1 Oct

Within manufacturing, high value digital solutions are needed to optimize and aid shop floor processes. This includes agile technologies that can be easily integrated into factory environments to facilitate manufacturing tasks. In this paper, we present a dynamic system to support the electrical wiring assembly of commercial aircraft. Specifically, we describe the system design, which aims to improve the productivity of factory operators through the integration of wearable and mobile solutions. An evaluation of the augmented component of our system using a pair of smart glasses is reported with 12 participants, as we describe important interaction issues in the ongoing development of this work.

Marker Identification Using IR LEDs and RGB Color Descriptors
Gou Koutaki, Shodai Hirata, Hiromu Sato, Keiichi Uchimura
Poster on 1 Oct Teaser on 1 Oct

In optical motion capture systems, it is difficult to correctly recognize markers based on their unique identifiers (IDs) in a single frame. In this paper, we propose using two types of light-emitting diodes (LEDs) and cameras, infrared (IR) and RGB, in order to correctly detect and identify all markers tracking objects in a given system. To detect and estimate the three-dimensional (3D) position of the marker, we measure IR LEDs using IR stereo cameras. Furthermore, in order to identify each marker, we calculate and compare the RGB color descriptor in the vicinity of its center. Our system consists of general IR and RGB cameras, and is easy to extend by increasing the number of cameras. We implemented an IR/RGB LED marker circuit and constructed a simple motion capture system to test the effectiveness of our system. The results show that our system can detect the 3D positions and unique IDs of markers in one frame.

RGB-D/C-arm Calibration and Application in Medical Augmented Reality
Xiang Wang, Severine Habert, Meng Ma, Chun-Hao Huang, Pascal Fallavollita, Nassir Navab
Poster on 1 Oct Teaser on 1 Oct

Calibration and registration are the first steps for augmented reality and mixed reality applications. In the medical field, the calibration between an RGB-D camera and a mobile C-arm fluoroscope is a new topic which introduces challenges. In this paper, we propose a precise 3D/2D calibration method to achieve a video augmented fluoroscope. With the design of a suitable calibration phantom for RGB-D/C-arm calibration, we calculate the projection matrix from the depth camera coordinates to the X-ray image. Through a comparison experiment by combining different steps leading to the calibration, we evaluate the effect of every step of our calibration process. Results demonstrated that we obtain a calibration RMS error of 0.54±1.40 mm which is promising for surgical applications. We conclude this paper by showcasing two clinical applications. One is a markerless registration application, the other is an RGB-D camera augmented mobile C-arm visualization.
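
To make the role of the projection matrix concrete, the sketch below applies a 3x4 matrix to 3D points and computes an RMS reprojection error. It is only an illustration: the matrix values, the function names `project` and `rms_error`, and the use of pixel units are assumptions, whereas the abstract reports its calibration error in millimetres.

```python
import math


def project(P, point):
    """Project a 3D point (x, y, z) with a 3x4 projection matrix P to 2D image coordinates."""
    h = (point[0], point[1], point[2], 1.0)  # homogeneous coordinates
    u, v, w = (sum(row[i] * h[i] for i in range(4)) for row in P)
    return (u / w, v / w)


def rms_error(P, points_3d, points_2d):
    """Root-mean-square distance between projected and reference 2D points."""
    total = 0.0
    for p3, (u_ref, v_ref) in zip(points_3d, points_2d):
        u, v = project(P, p3)
        total += (u - u_ref) ** 2 + (v - v_ref) ** 2
    return math.sqrt(total / len(points_3d))


# Hypothetical pinhole-style matrix: focal length 1000 px, principal point (320, 240).
P = [
    [1000.0, 0.0, 320.0, 0.0],
    [0.0, 1000.0, 240.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
uv = project(P, (0.1, -0.2, 2.0))
error = rms_error(P, [(0.1, -0.2, 2.0)], [(370.0, 141.0)])
```

With these assumed values the point projects to about (370, 140), so a reference detection at (370, 141) yields an error of about one pixel.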

A Comprehensive Interaction Model for Augmented Reality Systems
Mikel Salazar, Carlos Laorden, Pablo Bringas
Demo Teaser on 30 Sep

In this extended abstract, we present a model that aims to provide developers with an extensive and extensible set of context-aware interaction techniques, greatly facilitating the creation of meaningful AR-based user experiences. To provide a complete view of the model, we detail the different aspects that form its theoretical foundations, while also discussing several considerations for its correct implementation.

Transforming your website to an augmented reality view
Dimitrios Ververidis, Spiros Nikolopoulos, Ioannis Kompatsiaris
Poster on 1 Oct Teaser on 1 Oct

In this paper we present FastAR, a software component capable of transforming Joomla based websites into AR-channels compatible with the most popular augmented reality browsers (i.e. Junaio, Layar, Wikitude). FastAR exploits the consistency of the data structure across multiple sites that have been developed using the same content management system, so as to automate the transformation process of an internet website to an augmented reality channel. The proposed component abstracts all related programming tasks and significantly reduces the time required to generate and publish AR-content, making the entire process manageable by non-experts. In verifying the usefulness and effectiveness of FastAR, we conducted a survey to solicit the opinion of users who carried out the installation and transformation process.

A Step Closer To Reality: Closed Loop Dynamic Registration Correction in SAR
Hemal Naik, Federico Tombari, Christoph Resch, Peter Keitler, Nassir Navab
Poster on 2 Oct Teaser on 2 Oct

In Spatial Augmented Reality (SAR) applications, real world objects are augmented with virtual content by means of a calibrated camera-projector system. The virtual content to be projected is prepared by means of a computer generated model (CAD) of the real object. It is often the case that the real object deviates from its CAD model, resulting in misregistered augmentations. We propose a new method to dynamically correct the planned augmentation by accommodating the unknown deviations in the object geometry. We use a closed loop approach where the projected features are detected in the camera image and deployed as feedback. As a result, the registration misalignment is identified and the augmentations are corrected in areas affected by the deviation. Our work is especially focused on SAR applications related to the industrial domain, where this problem is omnipresent. We show that our method is effective and beneficial for multiple industrial applications.

Realtime Shape-from-Template: System and Applications
Toby Collins, Adrien Bartoli
Demo Teaser on 30 Sep

An important yet unsolved problem in computer vision and Augmented Reality (AR) is to compute the 3D shape of nonrigid objects from live RGB videos. When the object's shape is provided in a rest pose, this is the Shape-from-Template (SfT) problem. We present a general framework for realtime SfT. This handles generic objects, complex deformations and most of the difficulties present in real imaging conditions. Achieving this has required new solutions to two core sub-problems in SfT: robust registration and fast 3D shape inference. For registration we propose Deformable Render-based Block Matching (DRBM), which is a tracking-based solution that combines the advantages of feature-based and direct approaches without their main disadvantages. Shape inference is achieved by solving a single sparse linear least squares system for each frame, which is done quickly with a Geometric Multi-Grid method. On a standard desktop PC we achieve up to 21 fps depending on the object. Code will be released to the community.

Design Guidelines for Generating Augmented Reality Instructions
Cledja Karina Rolim da Silva, Dieter Schmalstieg, Denis Kalkofen, Veronica Teichrieb
Poster on 2 Oct Teaser on 2 Oct

Most work on instructions in Augmented Reality (AR) does not follow established patterns or design rules: each approach defines its own method of conveying instructions. This work describes our initial results and experiences towards defining design guidelines for AR instructions. The guidelines were derived from a survey of the most common visualization techniques and instruction types applied in AR. We studied how 2D and 3D instructions can be applied in the AR context.

Haptic Ring Interface Enabling Air-Writing in Virtual Reality Environment
Kiwon Yeom, Jounghuem Kwon, Sang-Hun Nam, Bum-Jae You
Poster on 2 Oct Teaser on 2 Oct

We introduce a novel finger-worn ring interface that enables complex spatial interactions through 3D hand movement in a virtual reality environment. Users receive physical feedback in the form of vibrations from the wearable ring interface as their finger reaches a certain 3D position. The positions of the fingertip are extracted, linked, and then reconstructed as a trajectory. This system allows the wearer to write characters in midair as if they were using an imaginary whiteboard. Users can freely write in the air using Korean characters, English letters (both upper and lower case), and digits in real time with an accuracy rate of over 92%. Thus, it is now conceivable that anything people can do on contemporary touch-based devices, they could do in midair with a pseudocontact interface.

Remote Welding Robot Manipulation using Multi-view Images
Yuichi Hiroi, Kei Obata, Katsuhiro Suzuki, Naoto Ienaga, Maki Sugimoto, Hideo Saito, Tadashi Takamaru
Poster on 2 Oct Teaser on 2 Oct

This paper proposes a remote welding robot manipulation system that uses multi-view images. After an operator specifies a two-dimensional path on images, the system transforms it into a three-dimensional path and displays the movement of the robot by overlaying graphics on the images. The accuracy of our system is sufficient to weld objects when combined with a sensor in the robot. The system allows a non-expert operator to weld objects remotely and intuitively, without the need to create a 3D model of the processed object beforehand.

A Particle Filter Approach to Outdoor Localization using Image-based Rendering
Christian Poglitsch, Clemens Arth, Dieter Schmalstieg, Jonathan Ventura
Poster on 2 Oct Teaser on 2 Oct

We propose an outdoor localization system using a particle filter. In our approach, a textured, geo-registered model of the outdoor environment is used as a reference to estimate the pose of a smartphone. The device position and the orientation obtained from a Global Positioning System (GPS) receiver and an inertial measurement unit (IMU) are used as a first estimate of the true pose. Then, multiple pose hypotheses are randomly distributed about the GPS/IMU measurement and used to produce renderings of the virtual model. With vision-based methods, the rendered images are compared with the image received from the smartphone, and the matching scores are used to update the particle filter. The outcome of our system improves the camera pose estimate in real time without user assistance.
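
The hypothesize, score and update loop described in this abstract can be sketched in a few lines. The example below is a simplification under stated assumptions: particles are 2D positions rather than full camera poses, and the `score` callback is a hypothetical stand-in for comparing a rendering of the model with the smartphone image. The function name and default parameters are illustrative only.

```python
import math
import random


def particle_filter_step(gps_xy, score, n=1000, sigma=3.0, rng=None):
    """Sample hypotheses around a GPS fix, weight them by a matching score, and resample.

    gps_xy : (x, y) initial position estimate
    score  : callable mapping a particle to a non-negative matching score
    Returns (weighted mean estimate, resampled particles).
    """
    rng = rng or random.Random()
    # Distribute hypotheses randomly about the GPS measurement.
    particles = [(rng.gauss(gps_xy[0], sigma), rng.gauss(gps_xy[1], sigma))
                 for _ in range(n)]
    weights = [score(p) for p in particles]
    total = sum(weights)
    if total <= 0.0:
        return gps_xy, particles  # no hypothesis matched; keep the prior
    estimate = (sum(w * p[0] for w, p in zip(weights, particles)) / total,
                sum(w * p[1] for w, p in zip(weights, particles)) / total)
    # Resample so that well-matching hypotheses are duplicated.
    resampled = rng.choices(particles, weights=weights, k=n)
    return estimate, resampled


# Synthetic example: the true position is (2, 1), the GPS fix is (0, 0),
# and the score peaks at the true position (a placeholder for image matching).
true_xy = (2.0, 1.0)
match = lambda p: math.exp(-((p[0] - true_xy[0]) ** 2 + (p[1] - true_xy[1]) ** 2) / 2.0)
estimate, particles = particle_filter_step((0.0, 0.0), match, rng=random.Random(0))
```

In a real system the expensive part is the scoring function, since each hypothesis requires a rendering and an image comparison, which is why the number of particles has to be balanced against real-time constraints.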

Tracking and Mapping with a Swarm of Heterogeneous Clients
Philipp Fleck, Clemens Arth, Christian Pirchheim, Dieter Schmalstieg
Demo Teaser on 30 Sep

In this work, we propose a multi-user system for tracking and mapping, which accommodates mobile clients with different capabilities, mediated by a server capable of providing real-time structure from motion. Clients share their observations of the scene according to their individual capabilities. This can involve only keyframe tracking, but also mapping and map densification, if more computational resources are available. Our contribution is a system architecture that lets heterogeneous clients contribute to a collaborative mapping effort, without prescribing fixed capabilities for the client devices. We investigate the implications that the clients' capabilities have on the collaborative reconstruction effort and its use for AR applications.

AR4AR: Using Augmented Reality for guidance in Augmented Reality Systems setup
Frieder Pankratz, Gudrun Klinker
Poster on 2 Oct Teaser on 2 Oct

AR systems have been developed for many years now, ranging from systems consisting of a single sensor and output device to systems with a multitude of sensors and/or output devices. With the increasing complexity of the setup, the complexity of handling the different sensors as well as the necessary calibrations and registrations increases accordingly. A much needed (yet missing) area of augmented reality applications is to support AR system engineers when they set up and maintain an AR system by providing visual guides and giving immediate feedback on the current quality of their calibration measurements.

Exploiting Photogrammetric Targets for Industrial AR
Hemal Naik, Yuji Oyamada, Peter Keitler, Nassir Navab
Poster on 2 Oct Teaser on 2 Oct

In this work, we encourage the idea of using Photogrammetric targets for object tracking in Industrial Augmented Reality (IAR). Photogrammetric targets, especially uncoded circular targets, are widely used in the industry to perform 3D surface measurements. Therefore, an AR solution based on the uncoded circular targets can improve the workflow integration by reusing existing targets and saving time. These circular targets do not have coded patterns to establish unique 2D-3D correspondences between the targets on the model and their image projections. We solve this particular problem of 2D-3D correspondence of non-coplanar circular targets from a single image. We introduce a Conic Pair Descriptor, which computes the Euclidean invariants from circular targets both in the model space and in the image space. A three stage method is used to compare the descriptors and compute the correspondences with up to 100% precision and 89% recall rates. We are able to achieve tracking performance of 3 FPS (2560x1920 pix) to 8 FPS (640x480 pix) depending on the camera resolution.

Rubix: Dynamic Spatial Augmented Reality by Extraction of Plane Regions with a RGB-D Camera
Masayuki Sano, Kazuki Matsumoto, Bruce Thomas, Hideo Saito
Poster on 2 Oct Teaser on 2 Oct

Dynamic spatial augmented reality requires accurate real-time 3D pose information of the physical objects that are to be projected onto. Previous depth-based methods for tracking objects required strong features to enable recognition, making it difficult to estimate an accurate 6DOF pose for physical objects with a small set of recognizable features (such as a non-textured cube). We propose a more accurate method with fewer limitations for the pose estimation of a tangible object that has known planar faces, using only depth data from an RGB-D camera. In this paper, the physical object's shape is limited to cubes of different sizes. We apply this new tracking method to achieve dynamic projections onto these cubes. In our method, 3D points from an RGB-D camera are divided into a cluster of planar regions, and the point cloud inside each face of the object is fitted to an already-known geometric model of a cube. With the 6DOF pose of the physical object, SAR generated imagery is then projected correctly onto the physical object. The 6DOF tracking is designed to support tangible interactions with the physical object. We implemented example interactive applications with one or multiple cubes to show the capability of our method.
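
A basic building block for dividing a point cloud into planar regions is to hypothesize a plane from three points and collect the points that lie close to it. The sketch below shows only that step, as an assumption about how such a segmentation could begin; it is not the authors' algorithm, and the function names and the 1 cm tolerance are hypothetical.

```python
import math


def plane_from_points(a, b, c):
    """Return (unit normal, offset d) of the plane through three 3D points, with n.x + d = 0."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    # Cross product u x v gives the plane normal.
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(x * x for x in n))
    if length < 1e-12:
        raise ValueError("points are collinear; no unique plane")
    n = [x / length for x in n]
    d = -sum(n[i] * a[i] for i in range(3))
    return n, d


def plane_inliers(points, normal, d, tol=0.01):
    """Keep the points whose distance to the plane is at most tol (same units as the points)."""
    return [p for p in points
            if abs(sum(normal[i] * p[i] for i in range(3)) + d) <= tol]


normal, d = plane_from_points((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
cloud = [(0.2, 0.3, 0.005), (0.5, 0.5, 0.5), (0.9, 0.1, -0.002)]
face_points = plane_inliers(cloud, normal, d)
```

Repeating this with randomly chosen point triples and keeping the plane with the most inliers is the usual RANSAC pattern; for a cube, up to three mutually perpendicular faces are visible at once, which constrains the remaining pose.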

An Adaptive Augmented Reality Interface for Hand based on Probabilistic Approach
Jinki Jung, Hyeopwoo Lee, Hyun Seung Yang
Demo Teaser on 30 Sep

In this paper we propose an adaptive Augmented Reality interface for general hand gestures based on a probabilistic model. The proposed interface provides multiple interfaces and the corresponding gesture inputs by recognizing the context of the hand shape, which requires the accurate recognition of static and dynamic hand states. For accuracy, we present a hand representation that is robust to hand shape variation, and the extraction of hand features based on the fingertip posteriors from a GMM model. Experimental results show that both context-sensitivity and accurate hand gesture recognition are achieved, as demonstrated through the quantitative evaluation and the implementation as a three-in-one virtual interface.

Content Completion in Lower Dimensional Feature Space through Feature Reduction and Compensation
Mariko Isogawa, Dan Mikami, Kosuke Takahashi, Akira Kojima
Poster on 2 Oct Teaser on 2 Oct

A novel framework for image/video content completion comprising three stages is proposed. First, input images/videos are converted to a lower dimensional feature space, which is done to achieve effective restoration even in cases where a damaged region includes complex structures and changes in color. Second, a damaged region is restored in the converted feature space. Finally, an inverse conversion from the lower dimensional feature space to the original feature space is performed to generate the completed image in the original feature space. This three-step solution generates two advantages. First, it enhances the possibility of applying patches dissimilar to those in the original color space. Second, it enables the use of many existing restoration methods, each having various advantages, because the feature space for retrieving the similar patches is the only extension. Experiments verify the effectiveness of the proposed framework.

ARPML: The Augmented Reality Process Modeling Language
Tobias Müller, Tim Rieger
Poster on 2 Oct Teaser on 2 Oct

The successful application of augmented reality as a guidance tool for procedural tasks like maintenance or repair requires an easily usable way of modeling support processes. Even though some suggestions have already been made to address this problem, they still have shortcomings and don't provide all the needed features. Thus, in a first step, the requirements that a solution has to meet are collected and presented. Based on these, the augmented reality process modeling language (ARPML) is developed, which consists of the four building blocks (i) templates, (ii) sensors, (iii) work steps and (iv) tasks. Unlike existing approaches, it facilitates the creation of multiple views on one process. This allows the specific selection of instructions and information needed in targeted work contexts. It also allows multiple variants of one process to be combined into one model with only a minimum of redundancy. The application of ARPML is shown with a practical example.

Authoring Tools in Augmented Reality: An Analysis and Classification of Content Design Tools
Roberta Cabral Mota, Rafael Roberto, Veronica Teichrieb
Poster on 2 Oct Teaser on 2 Oct

Augmented Reality Authoring Tools are important instruments that can help promote the widespread use of AR. They can be classified as programming or content design tools, of which the latter completely remove the need for programming skills to develop an AR solution. Several solutions have been developed in the past years; however, there are few works aiming to identify patterns and general models for such tools. This work aims to perform a trend analysis on content design tools in order to identify their functionalities regarding AR, authoring paradigms, deployment strategies and general dataflow models. This work is aimed at assisting developers willing to create authoring tools; therefore, it focuses on the last three aspects. Thus, 19 tools were analyzed, and through this evaluation two authoring paradigms and two deployment strategies were identified. Moreover, from their combination it was possible to elaborate four generic dataflow models into which every tool could fit.

Affording Visual Feedback for Natural Hand Interaction in AR to Assess Upper Extremity Motor Dysfunction
Marina A. Cidota, Rory M.S. Clifford, Paul Dezentje, Stephan G. Lukosch, Paulina J.M. Bank
Poster on 2 Oct Teaser on 2 Oct

For the clinical community, there is great need for objective, quantitative and valid measures of the factors contributing to motor dysfunction. Currently, there are no standard protocols to assess motor dysfunction in various patient groups; each medical discipline uses subjectively scored clinical tests, qualitative video analysis, or cumbersome marker-based motion capturing. We therefore investigate the potential of Augmented Reality (AR) combined with serious gaming and marker-less tracking of the hand to facilitate efficient, cost-effective and patient-friendly methods for evaluation of upper extremity motor dysfunction in various patient groups. First, the design process of the game and the system architecture of the AR framework are described. To provide unhindered assessment of motor dysfunction, patients should be able to operate the system freely in a natural way and to understand their actions in the virtual AR world. To test this in our system, we conducted a pilot usability study with five healthy people (aged 57 to 63) on three different modalities of visual feedback for natural hand interaction. These modalities are: no virtual hand, partial virtual hand (tip of index finger and tip of thumb) and full virtual hand models. The results of the study show that a virtual representation of the fingertips or hand improves the usability of natural hand interaction.


Overlaying Navigation Signs on a Road Surface using a Head-Up Display
Kaho Ueno, Takashi Komuro
Poster on 30 Sep Teaser on 30 Sep

In this paper, we propose a method for overlaying navigation signs on a road surface and displaying them on a head-up display (HUD). Accurate overlaying is realized by measuring 3D data of the surface in real time using a depth camera. In addition, the effect of head movement is reduced by performing face tracking with a camera that is placed in front of the HUD, and by performing distortion correction of projection images according to the driver’s viewpoint position. Using an experimental system, we conducted an experiment to display a navigation sign and confirmed that the sign is overlaid on a surface. We also confirmed that the sign appears to be fixed on the surface in real space.

Deformation Estimation of Elastic Bodies Using Multiple Silhouette Images for Endoscopic Image Augmentation
Akira Saito, Megumi Nakao, Yuki Uranishi, Tetsuya Matsuda
Poster on 30 Sep Teaser on 30 Sep

This study proposes a method to estimate elastic deformation using silhouettes obtained from multiple endoscopic images. Our method allows the intraoperative deformation of organs to be estimated using a volumetric mesh model reconstructed from preoperative CT data. We use silhouette information of elastic bodies not to model the shape but to estimate the local displacements. The model shape is updated to satisfy the constraint of silhouettes while preserving the shape as much as possible. The results of the experiments showed that the proposed method could estimate the deformation with RMS errors of 5 mm to 1 cm.

Hands-free AR Work Support System Monitoring Work Progress with Point-cloud Data Processing
Hirohiko Sagawa, Hiroto Nagayoshi, Harumi Kiyomizu, Tsuneya Kurihara
Poster on 30 Sep Teaser on 30 Sep

We present a hands-free AR work support system that provides work instructions to workers without interrupting normal work procedures. This system estimates the work progress by monitoring the status of work objects only on the basis of 3D data captured from a depth sensor mounted on a helmet, and it selects appropriate information to be displayed on a head-mounted display (HMD) on the basis of the estimated work progress. We describe a prototype of the proposed system and the results of primary experiments carried out to evaluate the accuracy and performance of the system.

Endoscopic Image Augmentation Reflecting Shape Changes During Cutting Procedures
Megumi Nakao, Shota Endo, Keiho Imanishi, Tetsuya Matsuda
Poster on 30 Sep Teaser on 30 Sep

This paper introduces a concept of endoscopic image augmentation that overlays shape changes to support cutting procedures. This framework handles the history of the measured drill tip’s location as a volume label, and visualizes the remaining region to be cut, overlaid on the endoscopic image in real time. We performed a cutting experiment, and the efficacy of the cutting aid was evaluated in terms of shape similarity, total distance moved by the cutting tool, and the required cutting time. The results of the experiments showed that cutting performance was significantly improved by the proposed framework.
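
The idea of keeping the drill tip history as a volume label can be illustrated with a small voxel-set sketch. This is an assumption about one possible data structure, not the framework described by the authors: the planned cut region is a set of voxel indices, each measured tip position marks one voxel as removed, and the difference is what remains to be cut. The function names and the 1 mm voxel size are hypothetical.

```python
import math


def voxel_of(point, voxel_size):
    """Map a 3D position to the integer index of the voxel that contains it."""
    return tuple(int(math.floor(c / voxel_size)) for c in point)


def remaining_to_cut(planned_voxels, tip_history, voxel_size):
    """Return the planned voxels that no recorded drill tip position has reached yet."""
    removed = {voxel_of(p, voxel_size) for p in tip_history}
    return planned_voxels - removed


# Planned region: three voxels in a row; the tip has visited the first two.
planned = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
history = [(0.5, 0.2, 0.1), (1.5, 0.9, 0.3)]
left = remaining_to_cut(planned, history, voxel_size=1.0)
```

A practical version would mark every voxel within the drill radius of each tip position rather than a single voxel, and would render the remaining set as an overlay on the endoscopic image.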

Toward Enhancing Robustness of DR System: Ranking Model for Background Inpainting
Mariko Isogawa, Dan Mikami, Kosuke Takahashi, Akira Kojima
Poster on 30 Sep Teaser on 30 Sep

A method for blindly predicting inpainted image quality is proposed for enhancing the robustness of diminished reality (DR), which uses inpainting to remove unwanted objects by replacing them with background textures in real time. The method maps from inpainted image features to subjective image quality scores without the need for reference images. It enables more complex background textures to be applied to DR.

Interactive Visualizations for Monoscopic Eyewear to Assist in Manually Orienting Objects in 3D
Carmine Elvezio, Mengu Sukan, Steve Feiner, Barbara Tversky
Poster on 30 Sep Teaser on 30 Sep

Assembly or repair tasks often require objects to be held in specific orientations to view or fit together. Research has addressed the use of AR to assist in these tasks, delivered as registered overlaid graphics on stereoscopic head-worn displays. In contrast, we are interested in using monoscopic head-worn displays, such as Google Glass. To accommodate their small monoscopic field of view, off center from the user's line of sight, we are exploring alternatives to registered overlays. We describe four interactive rotation guidance visualizations for tracked objects intended for these displays.

Movable Spatial AR On-The-Go
Ahyun Lee, Joo-Haeng Lee, Jaehong Kim
Poster on 30 Sep Teaser on 30 Sep

We present a movable spatial augmented reality (SAR) system that can be easily installed in a user workspace. The proposed system aims to dynamically cover a wider projection area using a portable projector attached to a simple robotic device. It has a clear advantage over a conventional SAR scenario where, for example, a projector must be installed with a fixed projection area in the workspace. In previous research [1], we proposed a data-driven kinematic control method for a movable SAR system. This method targets a SAR system integrated with a user-created robotic (UCR) device for which an explicit kinematic configuration, such as a CAD model, is unavailable. Our contribution in this paper is to show the feasibility of the data-driven control method by developing a practical application in which dynamic changes of the projection area matter. We outline the control method and demonstrate an assembly guide example using a casually installed movable SAR system.

2D-3D Co-segmentation for AR-based Remote Collaboration
Kuo-Chin Lien, Benjamin Nuernberger, Matthew Turk, Tobias Höllerer
Poster on 30 Sep Teaser on 30 Sep

In Augmented Reality (AR) based remote collaboration, a remote user can draw a 2D annotation that emphasizes an object of interest to guide a local user in accomplishing a task. This annotation is typically performed only once and then sticks to the selected object in the local user's view, independent of his or her camera movement. In this paper, we present an algorithm to segment the selected object, including its occluded surfaces, such that the 2D selection can be appropriately interpreted in 3D and rendered as a useful AR annotation even when the local user moves and significantly changes the viewpoint.

Maintaining appropriate interpersonal distance using virtual body size
Masaki Maeda, Nobuchika Sakata
Demo Teaser on 30 Sep

Securing one’s personal space is quite important in leading a comfortable social life. However, it is difficult to maintain an appropriate interpersonal distance all the time. Therefore, we propose an interpersonal distance control system with a video see-through system, consisting of a head-mounted display (HMD), depth sensor, and RGB camera. The proposed system controls the interpersonal distance by changing the size of the person in the HMD view. In this paper, we describe the proposed system and conduct an experiment to confirm the capability of the proposed system. Finally, we show and discuss the results of the experiment.
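
As a rough illustration of controlling perceived distance through rendered size, the sketch below assumes a simple pinhole model in which apparent size is inversely proportional to distance. The function names and the 1.2 m comfort distance are hypothetical and are not taken from the paper.

```python
def scale_for_apparent_distance(actual_m: float, desired_m: float) -> float:
    """Scale factor that makes a person at actual_m appear to be at desired_m.

    Assumes a pinhole model: apparent size is proportional to 1 / distance.
    """
    if actual_m <= 0 or desired_m <= 0:
        raise ValueError("distances must be positive")
    return actual_m / desired_m


def comfort_scale(actual_m: float, comfort_m: float = 1.2) -> float:
    """Shrink the person only when they are closer than the comfort distance.

    comfort_m is a hypothetical default, not a value from the paper.
    """
    if actual_m >= comfort_m:
        return 1.0  # far enough away: render at true size
    return scale_for_apparent_distance(actual_m, comfort_m)
```

Under these assumptions, a person standing at 0.6 m would be rendered at half size so that they appear to stand at 1.2 m.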

Vergence-based AR X-ray Vision
Yuki Kitajima, Sei Ikeda, Kosuke Sato
Demo Teaser on 30 Sep

The ideal AR x-ray vision should enable users to clearly observe and grasp not only occludees, but also occluders. We propose a novel selective visualization method of both occludee and occluder layers with dynamic opacity depending on the user's gaze depth. Using the gaze depth as a trigger to select the layers has an essential advantage over using other gestures or spoken commands, in that it avoids conflicts between the user's intentional commands and unintentional actions. Our experiment with a visual paired-comparison task shows that our method achieved a 20% higher success rate and significantly reduced the average task completion time, by 30%, compared with a non-selective method using constant half transparency.
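
The idea of gaze-depth-dependent opacity can be sketched as follows. This is a minimal illustration, assuming symmetric vergence (fixation point straight ahead) and a linear opacity ramp between the two layers; the abstract does not state the actual mapping, and all function and parameter names here are hypothetical.

```python
import math


def gaze_depth_from_vergence(ipd_m: float, vergence_rad: float) -> float:
    """Fixation distance for symmetric vergence: d = (ipd / 2) / tan(theta / 2)."""
    if vergence_rad <= 0:
        return math.inf  # parallel gaze: looking at infinity
    return (ipd_m / 2.0) / math.tan(vergence_rad / 2.0)


def occluder_opacity(gaze_depth_m: float,
                     occluder_depth_m: float,
                     occludee_depth_m: float) -> float:
    """Opacity of the occluder layer (assumed linear ramp).

    1.0 when the gaze rests on the occluder, 0.0 when it reaches the occludee.
    """
    if occludee_depth_m <= occluder_depth_m:
        raise ValueError("occludee must lie behind the occluder")
    t = (gaze_depth_m - occluder_depth_m) / (occludee_depth_m - occluder_depth_m)
    t = min(1.0, max(0.0, t))  # clamp to the range between the two layers
    return 1.0 - t
```

With this ramp, looking at the occluder keeps it fully opaque, while converging on the occludee behind it fades the occluder out without any explicit command.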

Manipulating Haptic Shape Perception by Visual Surface Deformation and Finger Displacement in Spatial Augmented Reality
Toshio Kanamori, Daisuke Iwai, Kosuke Sato
Poster on 30 Sep Teaser on 30 Sep

Many researchers are trying to realize a pseudo-haptic system that can visually manipulate a user's haptic shape perception when touching a physical object. In this paper, we focus on altering the perceived surface shape of a curved object when touching it with an index finger, by visually deforming its surface shape and displacing the visual representation of the user's index finger as if s/he were touching the deformed surface, using spatial augmented reality. An experiment was conducted with a projection system to confirm the effect of the visual feedback on altering the perceived shape of a curved surface. The results showed that the shape the participants perceived was deformed from the actual shape they touched. The results demonstrate the possibility of manipulating a user's perceived shape of a curved surface by using pseudo-haptics in spatial augmented reality.

Mixed-Reality Store on the Other Side of a Tablet
Masaya Ohta, Shunsuke Nagano, Hotaka Niwa, Katsumi Yamashita
Poster on 30 Sep Teaser on 30 Sep

This paper proposes a mixed-reality shopping system for users who do not own a PC but do own a tablet. In this system, while viewing panoramic images photographed along the aisles of a real store, the user can move freely around the store. Products can be selected and freely viewed from any angle. Furthermore, by utilizing photo-based augmented reality (Photo AR) technology, the product can be displayed as if it were in the hands of the user. The results of a user evaluation showed that, even though the proposed system uses a tablet with a smaller screen, it was preferred over a conventional e-commerce site using a larger monitor.

Avatar-Mediated Contact Interaction between Remote Users for Social Telepresence
Jihye Oh, Yeonjoon Kim, Taeil Jin, Sukwon Lee, Youjin Lee, Sung-Hee Lee
Poster on 30 Sep Teaser on 30 Sep

Social touch such as a handshake increases the sense of coexistence and closeness between remote users in a social telepresence environment, but creating such coordinated contact movements with a distant person is extremely difficult if given only visual feedback, without haptic feedback. This paper presents a method to enable hand-contact interaction between remote users in an avatar-mediated telepresence environment. The key idea is that, while the avatar directly follows its owner’s motion under normal conditions, it adjusts its pose to maintain contact with the other user when the two users attempt a contact interaction. To this end, we develop classifiers to recognize the users’ intention for the contact interaction. The contact classifier identifies whether the users try to initiate contact when they are not in contact, and the separation classifier identifies whether two users in contact attempt to break contact. The classifiers are trained based on a set of geometric distance features. During the contact phase, inverse kinematics is solved to determine the pose of the avatar’s arm so as to initiate and maintain natural contact with the other user’s hand. Our system is unique in that two remote users can perform real-time hand contact interaction in a social telepresence environment.

Towards Estimating Usability Ratings of Handheld Augmented Reality Using Accelerometer Data
Marc Ericson Santos, Takafumi Taketomi, Goshiro Yamamoto, Gudrun Klinker, Christian Sandor, Hirokazu Kato
Poster on 30 Sep Teaser on 30 Sep

Usability evaluations are important to the development of augmented reality systems. However, conducting large-scale longitudinal studies remains challenging because of the lack of inexpensive but appropriate methods. In response, we propose a method for implicitly estimating usability ratings based on readily available sensor logs. To demonstrate our idea, we explored the use of features of accelerometer data in estimating usability ratings in an annotation task. Results show that our implicit method corresponds with explicit usability ratings at 79-84%. These results should be investigated further in other use cases, with other sensor logs.

Abecedary tracking and mapping: a toolkit for tracking competition
Hideaki Uchiyama, Takafumi Taketomi, Sei Ikeda, João Lima
Poster on 30 Sep Teaser on 30 Sep

This paper introduces a toolkit with camera calibration, monocular Simultaneous Localization and Mapping (SLAM) and registration with a calibration marker. With the toolkit, users can perform the whole procedure of the ISMAR on-site tracking competition in 2015. Since the source code is designed to be well structured and highly readable, users can easily install and modify the toolkit. By providing the toolkit, we encourage beginners to learn tracking techniques and participate in the competition.

Retrieving Lights Positions Using Plane Segmentation with Diffuse Illumination Reinforced with Specular Component
Paul-Émile Buteau, Hideo Saito
Poster on 30 Sep Teaser on 30 Sep

We present a novel method to retrieve the positions of multiple point lights in real indoor scenes based on a 3D reconstruction. This method takes advantage of the illumination over planes detected using a segmentation of the reconstructed mesh of the scene. The estimation does not suffer from the presence of specular highlights; instead, the specular component is used to refine the final estimate. This allows consistent relighting throughout the entire scene for augmented reality purposes.

Improved SPAAM Robustness Through Stereo Calibration
Kenneth Moser, J. Edward Swan II
Poster on 1 Oct Teaser on 1 Oct

We are investigating methods for improving the robustness and consistency of the Single Point Active Alignment Method (SPAAM) optical see-through (OST) head-mounted display (HMD) calibration procedure. Our investigation focuses on two variants of SPAAM. The first utilizes a standard monocular alignment strategy to calibrate the left and right eye separately, while the second leverages stereoscopic cues available from binocular HMDs to calibrate both eyes simultaneously. We compare results from repeated calibrations between methods using eye location estimates and interpupillary distance (IPD) measures. Our findings indicate that the stereo SPAAM method produces more accurate and consistent results during calibration compared to the monocular variant.

Road Maintenance MR System Using LRF and PDR
ChingTzun Chang, Ryosuke Ichikari, Takashi Okuma, Takeshi Kurata, Koji Makita
Poster on 1 Oct Teaser on 1 Oct

We have been developing an MR system for supporting road maintenance using overlaid visual aids. In this situation, we need a positioning method that provides sub-meter accuracy and works even when the appearance of the road surface changes completely due to factors such as construction phase, time of day (i.e., day and night), and weather. Therefore, we are developing a real-time worker positioning method that can be applied to these situations by fusing data from a laser range finder (LRF) and pedestrian dead-reckoning (PDR). In the field, multiple workers work and move around the workspace, so we need to establish correspondences between PDR-based trajectories and LRF-based trajectories by defining a similarity between trajectories. Each corresponding pair of trajectories is then fused to acquire the position and direction of the worker. In this paper, we propose a method to calculate the similarity between trajectories and a procedure to fuse them.
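
The abstract does not specify how trajectory similarity is defined. As a hedged sketch of the general idea only, the code below assumes both sources are sampled at the same timestamps and compares per-step displacement lengths, which do not depend on the unknown origin and heading of a PDR track. The similarity measure, the greedy one-to-one matching, and all names are illustrative assumptions rather than the authors' method.

```python
import math


def step_lengths(traj):
    """Distance travelled between consecutive (x, y) samples."""
    return [math.dist(a, b) for a, b in zip(traj, traj[1:])]


def similarity(traj_a, traj_b):
    """Similarity in (0, 1]; 1.0 means identical step-length profiles."""
    sa, sb = step_lengths(traj_a), step_lengths(traj_b)
    n = min(len(sa), len(sb))
    if n == 0:
        return 0.0  # not enough samples to compare
    mean_diff = sum(abs(x - y) for x, y in zip(sa, sb)) / n
    return 1.0 / (1.0 + mean_diff)


def match_tracks(pdr_tracks, lrf_tracks):
    """Greedy one-to-one matching of PDR track ids to LRF track ids."""
    scored = sorted(
        ((similarity(p, l), pid, lid)
         for pid, p in pdr_tracks.items()
         for lid, l in lrf_tracks.items()),
        reverse=True,
    )
    used_pdr, used_lrf, result = set(), set(), {}
    for _, pid, lid in scored:
        if pid in used_pdr or lid in used_lrf:
            continue  # each track may be matched only once
        result[pid] = lid
        used_pdr.add(pid)
        used_lrf.add(lid)
    return result
```

Once a PDR track is paired with an LRF track, the LRF positions could supply absolute location while the PDR data supplies identity and heading; that fusion step is not shown here.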

Geometric Mapping for Color Compensation using Scene Adaptive Patches
Jong Hun Lee, Yong Hwi Kim, Yong Yi Lee, Kwan Heng Lee
Poster on 1 Oct Teaser on 1 Oct

The SAR technique using a projector-camera system allows us to create various effects on a real scene without physically altering it. In order to project content on a textured scene without color imperfections, geometric and radiometric compensation of the projection image should be conducted as preprocessing. In this paper, we present a new geometric mapping method for color compensation in the projector-camera system. We capture the scene and segment it into adaptive patches according to the scene structure using SLIC segmentation. A piecewise polynomial function is evaluated for each patch to find pixel-to-pixel correspondences between the measured and projection images. Finally, color compensation is performed by using a color mixing matrix. Experimental results show that our geometric mapping method establishes accurate correspondences and that color compensation alleviates the color imperfections caused by the texture of a general scene.

Pseudo Printed Fabrics through Projection Mapping
Yuichiro Fujimoto, Goshiro Yamamoto, Takafumi Taketomi, Christian Sandor, Hirokazu Kato
Poster on 1 Oct Teaser on 1 Oct

Projection-based Augmented Reality commonly projects on rigid objects, while only a few systems project on deformable objects. In this paper, we present Pseudo Printed Fabrics (PPF), which enables projection on a deforming piece of cloth. This can be applied to previewing a cloth design while manipulating its shape. We support challenging manipulations, including heavy occlusions and stretching the cloth. In previous work, we developed a similar system, based on a novel marker pattern; PPF extends it in two important aspects. First, we improved performance by two orders of magnitude to achieve interactive performance. Second, we developed a new interpolation algorithm to keep registration during challenging manipulations. We believe that PPF can be applied to domains including virtual try-on and fashion design.

On-site AR Interface with Web-based 3D Archiving System for Archaeological Project
Ryosuke Matsushita, Tokihisa Higo, Hiroshi Suita, Yoshihiro Yasumuro
Poster on 2 Oct Teaser on 2 Oct

This paper proposes an AR interface for on-site use in an archaeological project. We have already been developing a web-based 3D archiving system for supporting the diverse specialties and nations needed to drive the surveys and restoration work at the archaeological project. Our 3D archiving system is designed for the spontaneous updating, accumulating and sharing of information on findings to better enable frequent discussions, through a 3D virtual copy of the field site that a user can visit, explore, and embed information in over the Internet. Here we present an AR-based human interface to enhance access from mobile devices at the actual site to the archiving system. Using SFM (structure from motion) and solving the PnP problem, a photo taken at the site can be stably matched to the pre-registered photo collection in the archive system, and access to the archiving system starts smoothly by associating the 3D coordinates between the system and the actual user viewpoint. Our implementation works effectively on an ongoing project at Mastaba Idout in Saqqara, Egypt.

Photo Billboarding: A Simple Method to Provide Clues that Relate Camera Views and a 2D Map for Mobile Pedestrian Navigation
Junta Watanabe, Shingo Kagami, Koichi Hashimoto
Poster on 2 Oct Teaser on 2 Oct

This paper describes a mobile pedestrian navigation system that provides users with clues that help them understand the spatial relationship between mobile camera views and a 2D map. The proposed method draws upright billboards on the map that correspond to the basal planes of the viewing frustums of the camera. The user can take photographs of arbitrary landmarks along the way to build billboards on the map with the corresponding photographs.

Automatic Visual Feedback from Multiple Views for Motor Learning
Dan Mikami, Mariko Isogawa, Kosuke Takahashi, Akira Kojima
Poster on 2 Oct Teaser on 2 Oct

A visual feedback system of a trainee's movements for effective motor learning is proposed. It automatically provides visual feedback of a trainee's movement, synchronized with the reference movement, from multiple view angles with a delay of a few seconds. Because the feedback is automatic, a trainee can obtain it without any operation while the memory of the motion is still clear. By employing features with low computational cost, the proposed system achieves synchronized video feedback with four cameras on a consumer tablet PC.


SlidAR: A 3D Positioning Technique for Handheld Augmented Reality
Jarkko Polvi, Takafumi Taketomi, Goshiro Yamamoto, Christian Sandor, Hirokazu Kato
Demo Teaser on 30 Sep

Tablet system for visual overlay of rectangular virtual object onto real environment
Hiroyuki Yoshida, Takuya Okamoto, Hideo Saito
Demo Teaser on 30 Sep

Accurate Passive Eye-Pose Estimation through Corneal Imaging
Alexander Plopski, Christian Nitschke, Kiyoshi Kiyokawa, Dieter Schmalstieg, Haruo Takemura
Demo Teaser on 30 Sep

EyeAR: Physically-Based Depth of Field through Eye Measurements
Damien Rompapas, Kohei Oshima, Sei Ikeda, Goshiro Yamamoto, Takafumi Taketomi, Christian Sandor, Hirokazu Kato
Demo Teaser on 30 Sep

R-V Dynamics Illusion Experience System in Mixed Reality Space
Yuta Kataoka, Satoshi Hashiguchi, Taiki Yamada, Fumihisa Shibata, Asako Kimura
Demo Teaser on 30 Sep

Diminished Reality for Hiding a Pedestrian using Hand-held Camera
Kunihiro Hasegawa, Hideo Saito
Demo Teaser on 30 Sep

SharpView: Improved Legibility of Defocussed Content on Optical See-Through Head-Mounted Displays
Kohei Oshima, Damien Rompapas, Kenneth Moser, Edward Swan, Sei Ikeda, Goshiro Yamamoto, Takafumi Taketomi, Christian Sandor, Hirokazu Kato
Demo Teaser on 30 Sep

DroneAR: Augmented Reality Supported Unmanned Aerial Vehicle (UAV) in Agriculture for Farmer Perspective
Yuan Wang, Henry Duh Been-Lim, Hirokazu Kato, Takafumi Taketomi
Demo Teaser on 30 Sep

DOMINO (Do Mixed-reality Non-stop) Toppling
Ryotaro Hirata, Tomoka Ishibashi, Jianing Qie, Shohei Mori, Fumihisa Shibata, Asako Kimura, Hideyuki Tamura
Demo Teaser on 30 Sep

Imperceptible On-Screen Markers for Arbitrary Background Images
Goshiro Yamamoto, Luiz Sampaio, Takafumi Taketomi, Christian Sandor, Hirokazu Kato
Demo Teaser on 30 Sep

Magical Mystery Room, 2nd Stage
Daiki Sakauchi, Yuichi Matsumi, Shohei Mori, Fumihisa Shibata, Asako Kimura, Hideyuki Tamura
Demo Teaser on 30 Sep

Mobile Binocular Augmented Reality System for Museum
Jae-In Hwang, Elisabeth Adelia Widjojo, Seungmin Roh, Youna Lee, Jinwoo Lee, Junho Kim
Demo Teaser on 30 Sep

Multiple Kinect for 3D Human Skeleton Posture Using Axis Replacement Method
Nuth Otanasap, Poonpong Boonbrahm
Demo Teaser on 30 Sep

InstantReach: Virtual Hand Interaction using Smartphone
Yuta Ueda, Daisuke Iwai, Kosuke Sato
Demo Teaser on 30 Sep

Improving Stability of Vision-based Camera Tracking by Smartphone Sensors
Jaejun Lee, Kei Obata, Maki Sugimoto, Hideo Saito
Demo Teaser on 30 Sep

Study of the AR marker available on foldable surfaces
Hajime Sasanuma, Yoshitsugu Manabe, Noriko Yata
Demo Teaser on 30 Sep

Immersive Virtual Tourism with Omnidirectional View Interpolation
Abdoulaye Maiga, Naoki Chiba, Tony Tung, Hideo Saito
Demo Teaser on 30 Sep
