ABSTRACT

Title of dissertation: TOWARDS EFFECTIVE DISPLAYS FOR VIRTUAL AND AUGMENTED REALITY

Xuetong Sun, Doctor of Philosophy, 2020

Dissertation directed by: Professor Amitabh Varshney, Department of Computer Science

Virtual and augmented reality (VR and AR) are becoming increasingly accessible and useful. This dissertation focuses on several aspects of designing effective displays for VR and AR.

Compared to conventional desktop displays, VR and AR displays can better engage the human peripheral vision. This provides an opportunity for more information to be perceived. To fully leverage the human visual system, we need to take into account how it perceives content differently in the periphery than in the fovea. By investigating the relationship between perception time and eccentricity, we deduce a scaling function that allows content in the far periphery to be perceived as efficiently as in the central vision.

AR overlays additional information on the real environment. This is useful in a number of fields, including surgery, where time-critical information is key. We present our medical AR system that visualizes the occluded catheter in the external ventricular drainage (EVD) procedure. We develop an accurate and efficient catheter tracking method that requires minimal changes to the existing medical equipment. The AR display projects a virtual image of the catheter overlaid on the occluded real catheter to depict its real-time position. Our system can make the risky EVD procedure much safer.

Existing VR and AR displays support a limited number of focal distances, leading to vergence-accommodation conflict. Holographic displays can address this issue. In this dissertation, we explore the design and development of the nanophotonic phased array (NPA) as a special class of holographic displays. NPAs have the advantage of being compact and supporting very high refresh rates. However, the use of the thermo-optic effect for phase modulation renders them susceptible to the thermal proximity effect. We study how the proximity effect impacts the images formed on NPAs. We then propose several novel algorithms to compensate for the thermal proximity effect on NPAs and compare their effectiveness and computational efficiency.

Computer-generated holography (CGH) has traditionally focused on 2D images and 3D images in the form of meshes and point clouds. However, volumetric data can also benefit from CGH. One of the challenges in the use of volumetric data sources in CGH is the computational complexity needed to calculate the holograms of volumetric data. We propose a new method that achieves a significant speedup compared to existing holographic volume rendering methods.

TOWARDS EFFECTIVE DISPLAYS FOR VIRTUAL AND AUGMENTED REALITY

by Xuetong Sun

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2020

Advisory Committee:
Professor Amitabh Varshney, Chair/Advisor
Professor Mario Dagenais
Professor David Mount
Professor Martin Peckerar
Professor Matthias Zwicker

© Copyright by Xuetong Sun 2020

Table of Contents

List of Tables v List of Figures vi 1 Introduction 1 1.1 Perception in the Far Periphery . . . . . . . . . . . . . . . . . . . . . 2 1.2 AR Displays for Surgery . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3 Correcting Proximity Effect in Nanophotonic Phased Arrays . . . . . 9 1.4 Fast Holographic Volume Rendering . . . . .
. . . . . . . . . . . . . . 12 2 Perception in the Far Periphery 14 2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.2.1 Peripheral Vision . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.2.2 Wide FOV Displays . . . . . . . . . . . . . . . . . . . . . . . . 20 2.3 User Study Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.3.1 Task and Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . 24 2.3.2 User Study Procedure . . . . . . . . . . . . . . . . . . . . . . 25 2.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 2.4.1 Perception Time and Scale . . . . . . . . . . . . . . . . . . . . 29 2.4.2 Perception Time and Eccentricity . . . . . . . . . . . . . . . . 30 2.4.3 Equal-Efficiency Scale at each Eccentricity . . . . . . . . . . . 31 2.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 3 AR Displays for Surgery 36 3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 3.1.1 Technical Details from Previous AR Ultrasound System . . . . 40 3.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 3.2.1 Catheter Placement . . . . . . . . . . . . . . . . . . . . . . . . 41 3.2.2 Augmented Reality Displays for Neurosurgery . . . . . . . . . 44 3.3 System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.3.1 System Components . . . . . . . . . . . . . . . . . . . . . . . 47 ii 3.3.2 System Operation . . . . . . . . . . . . . . . . . . . . . . . . . 48 3.3.3 Reference System mapping . . . . . . . . . . . . . . . . . . . . 49 3.4 Catheter Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 3.4.1 Angle ? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 3.4.2 Position of Catheter . . . . . . . . . . . . . . . . . . . . . . . 53 3.4.3 Color Band Detection . . . . . . . . . . . . . . . . . . . . . . 56 3.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 3.5.1 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 3.5.2 Accuracy over a 2D Grid . . . . . . . . . . . . . . . . . . . . . 58 3.5.3 Accuracy over a 3D Grid . . . . . . . . . . . . . . . . . . . . . 60 3.5.4 Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4 Correcting the Proximity Effect in Nanophotonic Phased Arrays 68 4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 4.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 4.2.1 Nanophotonic Phased Arrays . . . . . . . . . . . . . . . . . . 75 4.2.2 Near-eye Displays . . . . . . . . . . . . . . . . . . . . . . . . . 76 4.2.3 Computer-Generated Holography and Phase-Only Holograms . 78 4.2.4 Proximity Effect Correction . . . . . . . . . . . . . . . . . . . 81 4.3 Nanophotonic Phased Array . . . . . . . . . . . . . . . . . . . . . . . 82 4.3.1 Phase Modulation . . . . . . . . . . . . . . . . . . . . . . . . . 82 4.3.2 Thermal Proximity Effect . . . . . . . . . . . . . . . . . . . . 84 4.4 Proximity Effect Correction for Fourier Holograms . . . . . . . . . . . 86 4.4.1 Iterative Proximity Effect Correction . . . . . . . . . . . . . . 
87 4.4.1.1 Gerchberg-Saxton Iterative Phase Retrieval Algorithm 90 4.4.2 Proximal Proximity Effect Correction . . . . . . . . . . . . . . 92 4.4.2.1 Derivation of PPEC Gradient . . . . . . . . . . . . . 94 4.4.3 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 4.4.3.1 Implementation and Test . . . . . . . . . . . . . . . 98 4.4.3.2 Image Quality Metrics . . . . . . . . . . . . . . . . . 99 4.4.3.3 Correction Effectiveness . . . . . . . . . . . . . . . . 104 4.4.3.4 Processing Time . . . . . . . . . . . . . . . . . . . . 107 4.4.3.5 Spatial Frequency of Images . . . . . . . . . . . . . . 108 4.4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 4.4.4.1 Improving IPEC . . . . . . . . . . . . . . . . . . . . 111 4.4.4.2 Pixel Pitch and Thermal Proximity Effect . . . . . . 114 4.4.4.3 Choice of Proximal Algorithm . . . . . . . . . . . . . 114 4.4.4.4 Iterations of PPEC . . . . . . . . . . . . . . . . . . . 116 4.4.5 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 4.4.5.1 Amplitude Modulation and Diminished Reality . . . 117 4.4.5.2 Choice of Initial Phase . . . . . . . . . . . . . . . . . 117 4.4.5.3 Larger Proximity Effect for PPEC . . . . . . . . . . 118 4.4.5.4 3D Holograms . . . . . . . . . . . . . . . . . . . . . . 118 iii 4.5 Proximity Effect Correction for Fresnel Holograms . . . . . . . . . . . 119 4.5.0.1 Derivation of FPEC Gradient . . . . . . . . . . . . . 121 4.5.1 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 4.5.1.1 2D Fresnel Holograms . . . . . . . . . . . . . . . . . 124 4.5.1.2 3D Holograms . . . . . . . . . . . . . . . . . . . . . . 127 4.5.2 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 4.5.2.1 Disagreement in Image Metrics . . . . . . . . . . . . 128 4.5.2.2 Larger Proximity Effect . . . . . . . . . . . . . . . . 131 4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 5 Fast Holographic Volume Rendering 133 5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 5.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 5.2.1 Volume Visualization . . . . . . . . . . . . . . . . . . . . . . . 137 5.2.2 Computer-Generated Holography . . . . . . . . . . . . . . . . 139 5.3 Algorithm and Display . . . . . . . . . . . . . . . . . . . . . . . . . . 141 5.3.1 Direct Volume Rendering Sampling . . . . . . . . . . . . . . . 142 5.3.2 Acceleration using Convolution Theorem . . . . . . . . . . . . 144 5.4 Increase the FOV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 5.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 5.5.1 Digital Reconstruction . . . . . . . . . . . . . . . . . . . . . . 154 5.5.2 Realistic Simulation . . . . . . . . . . . . . . . . . . . . . . . 154 5.5.2.1 Configuration of the Display . . . . . . . . . . . . . . 156 5.5.2.2 Light Source . . . . . . . . . . . . . . . . . . . . . . 157 5.5.2.3 Simulations . . . . . . . . . . . . . . . . . . . . . . . 159 5.5.2.4 Focus at different depths . . . . . . . . . . . . . . . . 162 5.5.3 Speedup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162 5.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 5.6.1 Factors of Hologram Computation Time . . . . . . . . . . . . 164 5.6.2 Scene FOV and Volume Placement . . . . . . . . . . . . . . . 167 5.6.3 Elemental Hologram . . . . . . . . . . . . . . . . . . . . . . . 170 5.7 Conclusion . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 6 Conclusions and Future Work 172 6.1 Perception in the Far Periphery . . . . . . . . . . . . . . . . . . . . . 172 6.2 AR Displays for Surgery . . . . . . . . . . . . . . . . . . . . . . . . . 173 6.3 Correcting Proximity Effect in Nanophotnoic Phased Arrays . . . . . 174 6.3.1 Acceleration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 6.3.2 3D Fresnel Holograms . . . . . . . . . . . . . . . . . . . . . . 175 6.4 Fast Holographic Volume Rendering . . . . . . . . . . . . . . . . . . . 176 6.4.1 Phase Shift on the SLM . . . . . . . . . . . . . . . . . . . . . 176 6.4.2 Quantifying the Speckles . . . . . . . . . . . . . . . . . . . . . 177 6.4.3 Improving Volume Rendering Sampling . . . . . . . . . . . . . 178 Bibliography 180 iv List of Tables 2.1 List of wide-FOV virtual and augmented reality display techniques and devices. The spatial resolution of a holographic display is the same as the SLM pixel pitch when inside the depth of field [1]. The items with * are calculated from measurement and other reported device specifications. . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.2 Parameters of the equal-efficiency scales fitted to a linear model. The valid range of eccentricity is [40, 70]?. Analysis of variance is con- ducted with the all participant?s equal-efficiency scale at a certain eccentricity as a group. We include the last 5 rows because the equal- efficiency scales at 70? eccentricity are no more than 4? above the tested range. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 3.1 Time breakdown of functions in our tracking algorithm. . . . . . . . . 63 5.1 Spectra of light sources used in the realistic simulation. . . . . . . . . 159 5.2 Hologram computation time and speedup for several volumetric datasets. These holograms are calculated withHx = Hy = 2048 and Rx = Ry = 1024. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 v List of Figures 1.1 An example of the equal-efficiency scales. At higher eccentricities (40? and up, patterns can be scaled according to our finding to achieve the same perception time as perceiving a 5? pattern at 10? eccentricity. . 5 1.2 A surgeon using our EVD assistance system and the surgeon?s see- through view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.3 The first row of the middle two columns show the simulated obser- vations of the Fourier hologram displayed on the NPA without any proximity effect correction and with our proposed PEC method. The second row show the simulated observations of the Fresnel hologram. Our PEC method can effectively correct the proximity effect correc- tion at the tested proximity effect level (? = 0.66 px). The simulated observation of the 3D volume by applying our proposed PEC method to a stack of Fresnel holograms can be seen on the right. . . . . . . . 11 1.4 Sample results of our holographic volume rendering algorithm. . . . . 13 2.1 Illustration of the human visual field. The central visual vision, near, middle, and far peripheral vision are from 0? to 8?, 8? to 30?, 30? to 60? and beyond 60? respectively. This image is adapted from [2]. . . . 15 2.2 Top view and stimuli (left) of our user-study setup. In the experiment, patterns appear one at a time. A photograph of our system used in the study (right). Shown on screen is one of the possible use cases of the system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
22 2.3 Perception time of all eccentricity-scale combinations is presented as a 3D colored mesh (top left). The three planes from top to bottom are the 75% quartile, median (50% quartile), and 25% quartile of all participants. The perception time at three scales (5?, 10? and 20?) is also shown. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2.4 A color-coded map of the median perception time for all eccentricity- scale combinations. Using our findings in Section 2.4.3, we plot the scales at higher eccentricities to be perceived in the same time as a 5? pattern at 10? (red), 20? (orange) and 40? (yellow). . . . . . . . . . 28 vi 2.5 Figures (a), (b), (c) and (d) show the relation between perception time and scale measured for a single user (participant 32) at 50?, 60?, 70? and 80? eccentricities, respectively. . . . . . . . . . . . . . . . . . 30 2.6 The perception time across eccentricities with an s-sized pattern is the intersection of the plane scale = s and the perception time surface in Figure 2.3. The intersection with 5?, 10? and 20? scale planes are shown in (left). The intersection curves are shown in (right). . . . . . 31 2.7 Equal-efficiency scales that produce the same perception time as (a) a 5? pattern at 10? eccentricity and (b) perceived in 450 ms. These are the median over all participants. . . . . . . . . . . . . . . . . . . . 32 2.8 Visualization of equal-efficiency scales for a notional theme park nav- igation. The Star Wars attraction logo is scaled at various eccentrici- ties using equal-efficiency scales. The target perception times are the perception time of a target pattern of size 5?, at eccentricities (top) 10?, (middle) 20?, and (bottom) 40? (these are shown with red dashed boxes). Theme park image reproduced with the permission of Bernie Kelm, https://www.rocket9.net. . . . . . . . . . . . . . . . . . . . . . 35 3.1 The view observed by the surgeon (captured with HoloLens mixed reality capture function). This is a realistic surgical scenario where EVD is performed on a cadaveric head. Additional information could be shown such as the CT slice at the tip of the catheter if a CT volume is registered to the patient. The CT image shown in this figure serves as an illustration of this potential. Other medical imaging visualization techniques such as in Bista et al. [3,4] may also be used. 37 3.2 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.3 The workflow of the system . . . . . . . . . . . . . . . . . . . . . . . 49 3.4 Reference system mapping . . . . . . . . . . . . . . . . . . . . . . . . 50 3.5 A drawing of the catheter and the color bands. The color bands need to be distinct and continuous. The lengths of the three color bands, as well as the uncolored forward portion (from the catheter tip to the beginning of the first color band) of the catheter, are known. With the lengths and the positions of the color band endpoints detected in camera space, we can calculate the 5-DOF (no roll) information of the catheter, and infer the position of the tip. . . . . . . . . . . . . . 51 3.6 We need to find the angle ? between the catheter and its image. Let A, B, and C be three endpoints of two consecutive color bands. The color bands are of equal length a. P is the center of projection in the pinhole camera model. The distance between B and P is b . . . . . . 52 3.7 Finding the position of the catheter in camera space. P is the center of projection. TA is the uncolored forward portion of the catheter. 
AB is the first color band on the catheter. . . . . . . . . . . . . . . 54
3.8 Detecting endpoints. (g), (h) and (i) show detecting endpoints in an image where the catheter is blurry because of movement. Our algorithm is able to handle this case. . . . . . . . . . . . . . . . . . . 57
3.9 Setup for testing (a) the tracking accuracy over a 2D grid, and (b) tracking accuracy over a 3D grid with the 3D CAD model in the lower right. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.10 The accuracy of our catheter tracking approach over the grid. The red circles indicate the grid intersections. The blue crosses indicate the computed catheter tip position. . . . . . . . . . . . . . . . . . . . 60
3.11 The latency of our system compared to an HTC Vive tracker (a) and the various components of our system's latency in image capture and processing (b). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.12 Our system being used in a cadaver study. . . . . . . . . . . . . . . . 66
4.1 Nanophotonic phased arrays suffer from the thermal proximity effect, where one pixel being heated affects the temperature of nearby pixels. This causes inaccurate phase modulation and noise in the formed image of Fourier holograms, as shown in (a). The current state-of-the-art methods, as shown in (b), are not able to sufficiently correct the proximity effect for our use case. We propose two proximity effect correction (PEC) methods for Fourier holograms on the NPA which are able to reduce the noise, as shown in (c) and (d). The proximal PEC method has the best correction effectiveness. Images are chosen from the Common Objects in Context dataset [5]. The proximity effect correction for Fresnel holograms is discussed in Section 4.5. . . . 71
4.2 The Rayleigh-Sommerfeld model provides a highly accurate, but also computationally expensive, diffraction model. It is applicable one wavelength away from the source. When the propagation distance is farther, the Fresnel diffraction approximation, which is less computationally expensive than Rayleigh-Sommerfeld diffraction, may be used. Holograms computed using the Fresnel diffraction are called Fresnel holograms. They can be reconstructed at user-specified depths, such as z_1 and z_2. The Fresnel diffraction approximation is considered acceptable when N_F = w^2/(λz) < 30 [6], where w is the radius of the source field, λ is the wavelength and z is the propagation distance. Farther away, when N_F ≪ 1, the Fraunhofer diffraction approximation can be used, where the propagation can be approximated with a Fourier transform. Holograms based on the Fraunhofer diffraction are called Fourier holograms and are the fastest to compute. In our NPA design, w = 1 mm. Given a typical green light wavelength λ = 530 nm, the Fresnel diffraction is acceptable for z > 62.89 mm and Fraunhofer diffraction is suitable for z ≫ 1886.79 mm. . . . . . . 74
4.3 The NPA display with electronic and optical layers is shown in (a). The optical power is evenly distributed into each pixel. Therefore, the near field (b) has uniform amplitude. The antennas (c) propagate the near-field wavefront to form the image to be observed at the desired depth. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.4 Thermal crosstalk arising from power being supplied to the center pixel of a 5 × 5 array. . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.5 Iterative phase retrieval algorithm . . . . . . . . . . . . . . . . . . . .
91 4.6 How the three image quality metrics change with increasing levels of proximity effect on a single image. The results on two example images are shown (left and right column). These results are representative of the entire image dataset. . . . . . . . . . . . . . . . . . . . . . . . 102 4.7 The image quality metrics over the entire dataset when the proximity effect level is fixed. A small proximity effect (? = 0.1 px, left) and a large one (? = 1.0 px, right) are shown. The standard deviations of the metrics over the dataset are also shown under each image. . . . . 103 4.8 This figure compares the various Proximity Error Correction (PEC) methods. The proximity effect level used in the first two rows is ? = 0.66 px. At this level, the IPEC results are noisy but still reasonable. The proximity effect level for the bottom row is ? = 0.8 px. At this level, the results from the IPEC methods are incomprehensible but the Proximal PEC method provides readable results. The images in the first two rows are from the LIVE dataset (Sheikh et al. [7, 8]) and the newspaper image in the bottom row is from the CHRONIC dataset (Smits and Faber [9]). . . . . . . . . . . . . . . . . . . . . . 105 4.9 Image quality degradation for various methods. This figure is plotted with interpolated data from simulations where ? ranges from 0.10 to 1.00 px with a 0.10 px interval. . . . . . . . . . . . . . . . . . . . . . 106 4.10 The processing time of the proposed proximity effect correction meth- ods with increasing levels of proximity effect at respective resolution. 108 4.11 The processing time of the proposed proximity effect correction meth- ods on images of increasing resolution. The image aspect ratio is kept at 3 : 2. The resolution on the x-axis indicates the vertical resolution of the image. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 4.12 The results of the basic iterative PEC method (top row) and the iterative PEC method (middle row) and the proximal PEC method (bottom row) on low-pass filtered images. The images (left column) are the results of the respective PEC methods applied to a low-pass filtered image with frequency cutoff threshold fc = 0.2 at proximity effect level ? = 0.7 px. The image qualities of each PEC method at different proximity effect levels on images with different frequency cutoff thresholds are shown in the right column. . . . . . . . . . . . 110 4.13 (a) shows the result of TDM application to the IPEC method with 10 different initial phase. (b) shows the of result of the padding technique applied to IPEC method. The center signal region is shown. The image is padded from 171? 256 to 512? 512. The initial and result padded images are shown in (c) and (d), respectively. The thermal proximity effect level is ? = 0.66 px for both techniques. . . . . . . . 112 4.14 IPEC with TDM and padding techniques compared to the original IPEC method and the PPEC method. . . . . . . . . . . . . . . . . . 113 ix 4.15 Image quality improves over the iterations. The orange dashed line indicates the change in the residual compare to the previous iteration. ? = 0.66 px in this case. . . . . . . . . . . . . . . . . . . . . . . . . . 116 4.16 Qualitative comparison of the Fresnel PEC method with the case without proximity effect correction and the iterative Fresnel PEC method. The original image is presented in the left column for refer- ence. The thermal proximity effect levels for the images from top to bottom are ? = 0.70 px, 0.80 px, 0.90 px and 1.00 px, respectively. At ? 
= 0.70 px, the proximity effect causes noise but the images are still recognizable. However, for more severe proximity effect, the noise overwhelms the images. Our Fresnel PEC method is able to correct the proximity effect and restore the images close to the origi- nal even at such high proximity effect levels. The images in the first and second rows are from the Common Objects in Context dataset [5] and the CHRONIC dataset (Smits and Faber [9]), respectively. The remaining images are from the LIVE dataset (Sheikh et al. [7, 8]). . . 125 4.17 The SSIM and perceptual similarity of the Fresnel PEC results com- pared to the case without proximity effect correction and the iter- ative Fresnel PEC (IFPEC) method. When there is no PEC, the image quality degrades quickly with the proximity effect level beyond ? = 0.30 px. The image quality of the IFPEC method starts to degrade starting from ? = 0.40 px. The Fresnel PEC method can effectively correct the proximity effect at almost all proximity effect levels tested, only to see image quality degrade slightly at extreme proximity effect level ? = 1.00 px. . . . . . . . . . . . . . . . . . . . . 126 4.18 The 3D volume is divided into a stack of images at different depths. The Fresnel holograms of all the images are calculated with the Fres- nel PEC method at their respective depths and displayed on the NPA in consecutive frames so the 3D scene can be observed. We simulate the observation by reconstructing the holograms at a certain distance that corresponds to the user?s focal plane. By adjusting the focal plane (for example, a closer focal plane in the left images and a far- ther focal plane in the right images), different parts of the scene (red or blue squares) are in focus. The top and bottom volumes are di- vided into 232 and 253 equidistant slices, respectively. The proximity effect level used is ? = 0.66 px. . . . . . . . . . . . . . . . . . . . . . 129 4.19 Two examples where perceptual similarity disagrees with SSIM. We observe that the results of the Iterative Fresnel PEC method at prox- imity effect ? = 0.9 px are of better quality than the results of the same method at proximity effect ? = 1.0 px. The perceptual similar- ity metric agrees with this observation while the SSIM disagrees. . . . 130 4.20 The FPEC method at higher proximity effect levels where much noise is observed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 x 5.1 We propose an efficient way to compute the hologram of a 3D volume. We use the raycasting direct volume rendering method to sample a collection of point light sources and use the convolution theorem to accelerate hologram computation. The digital reconstructions of several such holograms are shown. Holographic volume rendering is ideal for visualizing volumetric data such as 3D medical imaging data or physics simulations on a holographic display. . . . . . . . . . . . . 134 5.2 The general placement of the volume, the hologram, the illuminating light (planar wave), and the observer. The hologram is calculated as the summation of the wavefronts from the collection of spherical point light sources in the volume. . . . . . . . . . . . . . . . . . . . . 141 5.3 (a) illustrates the direct volume rendering method used for sampling the point light sources. Front-to-back compositing is used in our pro- posed method (b). The position and color at each compositing step in the direct volume rendering are used for the hologram calculation. 143 5.4 The FOV under planar wave is 2asin( ? ). 
Spherical wave illumination can be used to increase the FOV by placing a lens closely behind the hologram, in which case the FOV can be increased to 2 atan(w/(2f_el)), where w is the side length of the hologram and f_el is the focal length of the external lens. However, this configuration is not compatible with our proposed method of hologram computation because the diffractions are oriented in different directions. . . . . . . . . . . . . . . 150
5.5 We increase the FOV while also making sure the configuration is compatible with our accelerated hologram computation method by placing the external lens farther away from the hologram than its focal length. The volume is placed within the focal length of the external lens. The magnified virtual image of the volume (bound in dashed lines) is observed. . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.6 Direct volume rendering results of two volumetric datasets and the digital reconstruction of the holograms calculated with the convolution theorem methods. We use orthogonal raycasting in the direct volume rendering because it most resembles the digital reconstruction by wavefront propagation. . . . . . . . . . . . . . . . . . . . . . . . . 155
5.7 The spectra of the laser and SLED light sources. . . . . . . . . . . . . 158
5.8 The realistic simulations of three holograms with different light sources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
5.9 By controlling the focal length of the eye lens, the viewer is able to focus at different depths of the volume. . . . . . . . . . . . . . . . . . 162
5.10 The left shows that the computation time increases linearly with the number of compositing steps D for fixed hologram dimensions (H_x = H_y = 2048). The middle shows that the computation time increases with increasing hologram dimensions with D = 232. d_V = 30 mm for both tests. The right shows how the computation time changes with the convolution size. The convolution size is increased by placing the volume farther away from the hologram, thus increasing the sizes of K_z and C_z. . . . . . . . . . . . . . . . . . . . . . . . . 164
5.11 The sizes of C_z and K_z of compositing planes z_1 and z_2 away from the hologram are shown. Given the hologram size and the diffraction angle θ, the reconstruction of point light sources in region I has the finest resolution (smallest spot size). The resolution of the reconstruction in region II drops off (the spot size increases) from the middle towards the top and bottom. The resolution of the reconstruction in region III drops off horizontally as it gets farther away from the hologram. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
5.12 Given the volume size, the size of K_z does not need to increase endlessly with distance from the hologram. . . . . . . . . . . . . . . . 168

Chapter 1: Introduction

Advances in virtual and augmented reality (VR and AR) allow us to explore new ways to present information more efficiently and effectively. Our research in this area has four foci. First, we explore ways to present information efficiently in wide field-of-view (FOV) VR and AR devices that leverage the characteristics of the far peripheral vision. Second, we explore how surgery can benefit from the use of AR. Specifically, we develop an AR system that tracks a catheter in an external ventricular drain (EVD) procedure accurately in real time and displays its position directly in a surgeon's field of view.
Third, we explore novel holographic technologies for VR and AR displays. Conventional VR and AR near-eye displays based on stereoscopic vision provide only one or a limited number of focal distances, leading to vergence-accommodation conflict. Holographic displays allow a natural-to-the-senses perception of objects and scenes by recreating the optical waves from the scene or object. We have designed a nanophotonic phased array (NPA) which modulates the phase using the thermo-optic effect. Our design has the advantage of a fast refresh rate and a compact form factor compared to other phase modulators such as liquid crystal on silicon spatial light modulators (LCoS SLMs). However, the NPA suffers from the thermal proximity effect, which diminishes the phase modulation accuracy. In this dissertation, we investigate how the proximity effect affects the images formed by an NPA. We propose several algorithms to address this challenge. We have carried out numerical simulations to compare and contrast the various algorithms in terms of image quality and computational efficiency. Last, we explore ways to generate holograms from volumetric data. Such techniques will be critical for visualization in holographic displays. Computer-generated holography (CGH) has largely focused on 2D images and 3D images in the form of meshes and point clouds. Volumetric data can also benefit from holographic visualization. In order to tackle the computational complexity of calculating the hologram of a volume, we develop a new calculation method that brings the computation time from hours down to mere seconds. We also validate our method on an actual holographic display.

We introduce the background and contributions of our work in the rest of this chapter. Then in Chapter 2, we introduce our work on perception in the far periphery. In Chapter 3, we present our AR system that depicts the occluded catheter position for EVDs. In Chapter 4, we describe our nanophotonic phased array and propose several proximity effect correction methods. In Chapter 5, we cover our accelerated holographic volume rendering algorithm. We summarize our contributions and propose directions for future work in Chapter 6.

1.1 Perception in the Far Periphery

The far peripheral vision (beyond 60° eccentricity) is beginning to be supported in the latest VR and AR headsets. While lacking in visual acuity, the far peripheral vision is capable of, and plays a crucial role in, a variety of visual and cognitive functions ([10–12]). Making use of the far peripheral vision benefits VR and AR experiences by offering more display real estate. However, the visual properties of the far periphery are different from those of the central vision because of the physiological differences between the areas of the visual cortex responsible for the respective types of vision.

Our goal is to investigate the visual properties of the far peripheral vision to take advantage of the wider field of view in VR and AR headsets. We conduct a study to explore the relationship between the perception time, which is the time it takes to perceive a certain object, and the eccentricity. Specifically, we conduct a user study to measure participants' perception time for a series of size-scaled simple patterns over several eccentricities. In the study, two types of patterns of different scales (sizes) show up at different locations on a wide-FOV display. The task of the participants is to perceive the pattern and respond based on what they perceive.
We use an eye tracker to make sure that the participants' gaze is fixated at the center of the display and record their response time as the perception time. In our study, we had a total of 40 participants, all with normal or corrected-to-normal vision. The collected data shows that it takes longer to perceive a given pattern at a higher eccentricity than in the central vision; however, a larger pattern at a higher eccentricity can be perceived in a similar amount of time as the original in central vision. Our main contribution is deriving scaling functions with which objects displayed at higher eccentricities can be perceived as efficiently as at lower eccentricities. Our scaling functions are valid from 40° to 70° eccentricity.

Our study is related to M-scaling in psychophysics. Most past work on M-scaling focuses on low-level visual functions such as contrast sensitivity. Rovamo and Virsu [13] show that the contrast sensitivity functions with retinally scaled stimuli are similar at different eccentricities. Grogoric et al. [14] conduct a study to find the proper scales for objects to be recognized up to 40° eccentricity.

Making use of the far peripheral vision has several advantages. First, the peripheral vision provides an opportunity for additional information to be displayed. Second, displaying non-critical information in the far periphery can reduce visual clutter in the central vision. Last but not least, displaying content in the far periphery is less intrusive to the central vision, allowing techniques such as subtle attention management.

We show an example of this scaling function in Figure 1.1. To achieve the same perception time as a 5° pattern at 10° eccentricity, the patterns at higher eccentricities can be scaled according to our equal-efficiency scales. Details are available in Chapter 2 and the peer-reviewed publication Sun et al. [15].

Figure 1.1: An example of the equal-efficiency scales. At higher eccentricities (40° and up), patterns can be scaled according to our finding to achieve the same perception time as perceiving a 5° pattern at 10° eccentricity.

1.2 AR Displays for Surgery

AR has transformative potential for a wide variety of applications, such as entertainment, driving assistance, city planning, and manufacturing assembly guidance. Our focus here is on leveraging AR capabilities for health care and medicine. We have built several prototype medical AR systems in a collaborative effort with the University of Maryland Shock Trauma Center.

The first prototype is a system that displays images from medical imaging machines directly in the healthcare provider's field of view. In many surgical scenarios, when the surgeon needs information from an imaging device, he or she has to constantly shift their sight away from the patient to look at a separate screen. Our prototype can alleviate this psychomotor burden and help health-care professionals focus on giving the best possible care to the patient.

Our first prototype serves as a proof of concept that AR technology can benefit medical procedures. It helped us resolve technical difficulties in developing AR applications with multimedia. We have selected the Microsoft HoloLens AR headset as our platform for this project. It provides good 6-degree-of-freedom (DOF) inside-out spatial tracking and an optical see-through field of view of 30° × 17°. This FOV is small compared to most VR headsets.
However, the HoloLens offers natural and intuitive user interfaces, including hand gesture and voice recognition, and is suitable for our see-through AR prototypes. We use a video capture card to convert the output signal from medical imaging machines to be ingested by our system, which sends the images to the AR headset over WiFi. Based on this experience, we proceeded to build another prototype system to assist surgeons in external ventricular drain (EVD) procedures.

EVD is a high-risk medical procedure that involves inserting a catheter inside a patient's skull, through the brain, and into a ventricle, to drain cerebrospinal fluid, relieving elevated intracranial pressure. Once the catheter has entered the skull, its tip cannot be seen. The neurosurgeon has to rely on experience to guess its location inside the cranium and direct it towards the ventricle using only anatomic landmarks.

Our prototype system tracks the catheter using a novel approach that does not change how it feels to operate, and projects the catheter's image into the surgeon's field of view, aligned with the real catheter. Our tracking technique requires minimal modification to the catheter. We color portions of the catheter with different colors. By tracking the color bands with known lengths, we can infer the position of the catheter. A simple one-step calibration establishes the relationship between the camera space and the display space. The position of the catheter is sent to the display via WiFi, and a virtual catheter is rendered to overlay the real (and occluded) catheter to depict its position. We have conducted preliminary tests in a controlled environment on a cadaveric head. Our collaborating surgeons are pleased with the high accuracy and the low latency that we have been able to achieve. Figure 1.2 shows how our system is used in the test by a neurosurgeon and a see-through view from the surgeon's perspective. Details are available in Chapter 3 and the peer-reviewed publication Sun et al. [16].

Figure 1.2: A surgeon using our EVD assistance system and the surgeon's see-through view.

1.3 Correcting Proximity Effect in Nanophotonic Phased Arrays

Holographic displays recreate the scene by modulating the phase, and sometimes also the amplitude, of the optical wave. A phase modulator that can accurately and independently control the phase of the wavefront is crucial to the successful display of a hologram. While LCoS SLMs have been demonstrated to be useful as phase modulators, they suffer from a relatively low refresh rate (below 200 Hz) and complicated light source configurations.

Nanophotonic phased arrays are phased arrays operating at optical wavelengths. An NPA is composed of an optical power distribution system, a phase modulation mechanism that can control the phase of each pixel individually, and antennas for propagating the phase-modulated light into free space (Raval [17]). Sun et al. [18] showed that nanophotonic phased arrays can also be used for generating an optical wavefront which, through careful phase modulation in the near field, can display complicated images (or 3D scenes) in the far field. One major advantage of the NPAs over SLMs is their fast refresh rate. The thermal simulation of our NPA design shows that it can complete a heat-cool cycle within 10 μs, suggesting a refresh rate of up to 100 kHz, which is uniquely suitable for dynamic content as well as time-division multiplexing operations such as color switching or image smoothing.
In addition, the NPAs support integrated light sources. Lasers can be coupled into the NPA via an optical fiber, distributed to each pixel, and emitted by the on-board antennas. This way, NPA displays can be made more compact, as optical operations such as beam expansion and polarization, which are necessary for typical LCoS SLM display systems, are not needed.

The successful display of the desired 2D or 3D imagery lies in the precise and independent control of the phases of the pixels in the NPAs. Most NPA displays, such as those in Doylend et al. [19], Sun et al. [18], and our own design, depend on the thermo-optic effect to modulate the phase of a pixel. The phase of a pixel can be changed from 0 to 2π by adjusting the temperature of the pixel. However, the thermal proximity effect, the phenomenon where the temperature of one pixel affects the temperature of neighboring pixels, impacts the accuracy of phase modulation and degrades the observed image. Therefore, to achieve the desired phase on the near-field wavefront, this thermal proximity effect must be taken into consideration. This is further complicated by the fact that most NPAs only have heating mechanisms with no cooling capability, making some phase patterns impossible to realize.

In Chapter 4, we explore the impact of the thermal proximity effect on the observed images of holograms displayed on the NPA. The proximity effect can be modeled as a convolution, and we explore correcting the proximity effect with deconvolution methods. Most NPA display systems, including our design, only have phase modulation capabilities. We develop a novel proximity effect correction (PEC) method based on the proximal optimization method. This method is applicable both to Fourier holograms, usually used for displaying images in the far field, and to Fresnel holograms, used for displaying 2D and 3D images at user-specified depths. Our new method is also more effective and efficient than previous proximity effect correction methods based on the iterative phase retrieval algorithm.
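To make the convolution model concrete, the following is a minimal sketch of how the thermal proximity effect might be simulated; the Gaussian spread kernel, the 0.66 px level, and the FFT-based far-field observation are illustrative assumptions rather than the dissertation's actual simulation code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def far_field_intensity(phase):
    # Simulated observation of a phase-only Fourier hologram (unit amplitude).
    field = np.exp(1j * phase)
    return np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2

# Commanded phase map on the array, values in [0, 2*pi).
rng = np.random.default_rng(0)
commanded_phase = rng.uniform(0.0, 2.0 * np.pi, size=(256, 256))

# Thermal proximity effect: each pixel's heat spreads to its neighbors, modeled
# here as a convolution of the commanded phase map with a Gaussian kernel
# (sigma in pixels; 0.66 px is just an example level).
sigma_px = 0.66
realized_phase = gaussian_filter(commanded_phase, sigma=sigma_px, mode="wrap")

ideal_image = far_field_intensity(commanded_phase)
degraded_image = far_field_intensity(realized_phase)
# A PEC method seeks a commanded phase whose blurred version matches the
# desired phase, i.e., a deconvolution problem, subject to the constraint that
# the array can only add heat (phase), not remove it.
```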
Much of the computer graphics rendering techniques have been developed using the particle model, a popular example of which is raytracing [21, 22]. However, as is shown in Dennis Gabor?s experiments in the 1940s [23], a scene can be reconstructed by recording and illuminating the interference pattern of an object wave and a reference wave. With the advancement in electro-optics, there are devices that can programmatically control the phase and amplitude of the light wave, such as spatial light modulators (SLMs). Computer-generated holography (CGH) refers to digitally computing the in- terference pattern, either from a digital model or from specially captured images. Al- though in its early stages, this technology promises to solve several significant exist- ing problems in conventional VR and AR displays, such as vergence-accommodation conflict due to the limited number of focal distances. Varifocal and light-field dis- plays also offer a wider range of focal distances. However, the former usually involves moving components while the spatial and angular resolutions of the latter are lim- ited by optical diffraction as the pixel density increases. On the contrary, the quality of CGH improves as the pixel density increases. CGH has largely focused on 2D images or 3D scenes in the form of meshes or point clouds. The Fourier transform is a common and efficient method to calculate the hologram of a 2D image. Calculating the hologram of a 3D scene is compu- 12 Figure 1.4: Sample results of our holographic volume rendering algorithm. tationally intensive and usually involves converting the scene into a collection of points directly visible from the hologram and calculating the hologram following the Huygens-Fresnel principle. Modern computational hardware and acceleration methods have enabled real-time calculation of holograms from 3D meshes. Volumetric data from medical imaging can be visualized to reveal critical in- formation about the organs and tissues. Volumes from physics simulations are also used in the realistic rendering of fog, smoke and so on. Volume visualization can benefit from the advantages of holographic displays. However, calculating the holo- gram of a volume is an order-of-magnitude more complicated than calculating that of a 3D mesh, as every voxel contributes to the perception of the volume instead of only those closest to the hologram. We develop a new method to calculate the hologram of volumetric data that achieves significant speedup compared to existing holographic volume rendering methods. We also show through simulations how to choose the light source for the holographic display to strike a balance between the speckle noise and image sharp- ness for the observation. Some simulated observations of our proposed method can be seen in Figure 1.4. Details are available in Chapter 5. 13 Chapter 2: Perception in the Far Periphery 2.1 Overview Computer graphics has thus far primarily focused on high-fidelity rendering that is best suited for the central vision. Rendering in the far peripheral vision has not received much attention because most of the displays till now have only had a narrow field of view (FOV). However, rapid advances in display technologies are enabling head-mounted displays (HMDs) with an ever-increasing FOV. This has necessitated a better understanding of how peripheral vision could assist in the next-generation of VR and AR applications. 
In this study, we are interested in the far peripheral vision, which suffers from significantly diminished visual acuity compared to the central vision. Definitions of the peripheral vision differ throughout the literature. In this chapter, the foveal vision (within 2? from the fixation point), together with the perifoveal, is called the central vision [2, 24]. The central vision covers the region from 0? to 8? eccentricity and corresponds to the eye?s macula which is responsible for the high-resolution color vision. Beyond the central vision is the peripheral vision. Near, middle, and far periphery span 8? to 30?, 30? to 60?, and above 60? eccentricity, respectively [25]. All eccentricities in this study are horizontal. A visual representation can be seen 14 Figure 2.1: Illustration of the human visual field. The central visual vision, near, middle, and far peripheral vision are from 0? to 8?, 8? to 30?, 30? to 60? and beyond 60? respectively. This image is adapted from [2]. in Figure 2.1. The visual property difference between the central and peripheral vision has been used in foveated rendering such as Meng et al. [26?28] to save the computational capabilities. In this chapter, we are more interested in the perception in the far periphery and conduct a user study to investigate the perception time in the far peripheral vision. We wish to find a quantifiable scaling function so the information presented in the far periphery can be perceived as efficiently as in the central vision. 15 Our study is related to concepts in psychophysics called cortical magnification and M-scaling. Cortical magnification says that M, the diameter in the primary visual cortex onto which one degree of visual field projects [29], is inversely linear to the eccentricity E. The relationship can be formulated as follows. a is a coefficient [24] M?1(E) = M?10 (1 + aE) (2.1) Variations of the formula exist in [13, 30?38]. This leads to M-scaling, which says performance variations with eccentricity can be minimized by using appropri- ately scaled stimuli ( [24]). M-scaling can be represented in the following equation. S(E) is the scale at eccentricity E. S(E) = S0(1 + E/E2) (2.2) Our work makes the following contributions. First, many studies on M-scaling focus on low-level visual properties such as contrast sensitivity [13], In our study, participants are asked to perform a higher-level visual task by responding to different patterns. We record the time for participants to respond to patterns of different scales at different locations. This facilitates the determination of an appropriate magnification for patterns at higher eccentricities so that they can be perceived in a similar time as those in the central vision. Second, to our knowledge, we are the first to test and report a scaling function in the far peripheral vision, applicable to a field of view of up to 140? horizontally. 16 2.2 Related Work In this section, we introduce past studies related to our work on the peripheral vision. These include the peripheral vision and wide FOV displays. 2.2.1 Peripheral Vision Low-level visual properties like contrast sensitivity have been studied exten- sively in the central vision (0? to 8? away from the fixation point, or eccentricity). But there has been little work on the far peripheral vision (beyond 60? eccentric- ity) or on high-level perceptual functions. Below we introduce related perceptual psychology works on the peripheral vision. 
Cortical magnification says that M , the diameter in the primary visual cortex onto which one degree of visual field projects (Daniel and Witteridge [29]), is in- versely linear to the eccentricity E. Various models have been proposed. Cowey and Rolls [30] first fit the visual cortex electrical stimulation data provided by Brindly and Lewin [31] to the inverse linear model (M?1(E) = M?10 (E/E2))with M0 = 8.55 mm/? and E2=1.746. Through medical imaging of the visual cortex, Horton and Hoyt [32], Schira et al. [39] and Larsson and Heeger [33] derive different values for M0 and E2. Rovamo and Virsu [13] propose adding an extra term for improved accuracy M?1(E) = (1 + aE + bE3)M?10 and compile four equations for the four half meridians separately. There is also the power function used by Tolhurst and Ling [34], which is M?1(E) = M?1(1 + aE)1.10 with M0 = 24.88 mm/ ?. This leads to M-scaling, which says performance variations with eccentricity 17 can be minimized by using appropriately scaled stimuli ( Strasburger et al. [24]). M-scaling can be represented as S(E) = S0(1 + E/E2) where S(E) is the scale at eccentricity E. Rovamo and Virsu [13] adjust the stimuli presented in the periph- ery using the function M?1(E), and the contrast sensitivity functions at different eccentricities are made similar as a result. However, it is worth noting that the cortical magnification and M-scaling concepts are not golden rules to ?make every- thing equally visible independent of eccentricity?. There are cases in which scaling does not compensate for the decreasing M factor and even cases where the visual performance is diminished [37, 38]. For more details on cortical magnification and M-scaling, we refer readers to Strasburger et al. [24]. High-level visual functions like letter recognition are also peripheral vision study topics. Strasburger et al. [24] propose a scaling function S(E) = 0.022E + 0.25 ? based on data from Strasburger [40] where C is the Michelson thresh-logC+2.07 0.0345E old contrast. Grogoric et al. [14] find a linear scaling function from 0? to 40? eccen- tricity to make objects presented in a virtual scene perceivable. Studies have also shown that the peripheral vision is capable of, and some- times plays an important part in, higher-level perceptual tasks like scene and object recognition. Thorpe et al. [10] show that even at 70? eccentricity, an accuracy of 60.6% can be achieved in recognizing animal faces. Boucart et al. [11] compare people?s ability to identify images of human faces, vehicles, and animals at different eccentricities. The results show that human faces have an advantage over vehicles and animals in terms of response time, and that advantage persists even at extreme eccentricities up to 80?. In scene gist recognition, Loschky et al. [12, 41] compare 18 participants? abilities to categorize a scene from a ?window? (central portion of the image) and a ?scotoma? (blocking central portion, only keeping the peripheral part of the image). They show that the peripheral vision is more important than central vision in scene gist recognition, as recognizing a 30? radius scotoma achieves the same accuracy as recognizing the whole image, and a 5? radius window is signifi- cantly harder to recognize. Other studies have shown that peripheral vision allows, though at a reduced level, recognition of facial expressions and emotions ( [42,43]). Ren et al. [44] investigate whether a larger field of view improves searching task performance in augmented reality. 
Their results show that a larger field of view (108? ? 82?) leads to a shorter task completion time, but lowers the correctness com- pared to a limited field of view (45? ? 30?). Kline and Witmer [45] conduct a user study to investigate the effect of field of view on distance judgment in the virtual environment and find that with a smaller field of view (60? ? 38.5?), participants tend to overestimate distance while with a larger field of view (140? ? 90?) partic- ipants tend to underestimate. Joneset al. show in [2,46] that adding stimulation in the far periphery in a VR headset causes more accurate distance and size judgment in a virtual environment and changes people?s walking behavior. A good literary review on the effects of the field of view in virtual environments can be found in Jones et al. [46]. 19 2.2.2 Wide FOV Displays Researchers in the graphics community have long used the fact that the human peripheral vision perceives less detail than the foveal vision. Oshima et al. [47] ana- lyze user gaze direction and blur objects not in the user?s foveal vision to reduce ren- dering complexity. In an augmented reality application, Ishiguro and Rekimoto [48] display an icon above objects in the periphery of the user?s field of view and show the detailed information when the user?s gaze is moved to that object. Jones et al. [49] project images that are aesthetically matched with the content on the TV. Positive user feedback shows that enriching the peripheral vision can improve the interactive user experience. The peripheral vision cannot be utilized in VR and AR without wide-FOV displays. Various ways to build wide-FOV VR and AR displays have been proposed. Most of the systems we review here support 100? horizontal FOV or greater. The most straightforward system that extends the viewer?s FOV is the cave automatic virtual environment (CAVE). Cruz-Neira et al. [50] create a 7?? 7?? 7? cube-shaped CAVE where images were projected to three side walls and the floor. It covers the entire field of view when the viewer is facing the middle of the three projection walls). There have been many iterations and variations of the CAVE system [44,51]. In an effort to study the immersion on desktop VR, Robertson et al. [52] use two additional monitors (referred by them as peripheral lenses) on either side of the main monitor. The total horizontal FOV using two peripheral lenses is approxi- mately 38?. Attaching additional monitors can possibly cover the entire FOV. 20 In traditional HMDs, the two most important components are the display panel and optics. Modifying either or both is a common method of enlarging the FOV of HMDs. Nagahara et al. [53] build a prototype video see-through HMD that uses a complicated configuration of an ellipsoidal and a hyperboloidal curved mirror to reflect light from the display into the viewer?s eyes, covering 180? horizontal FOV. Kiyoshi [54] creates an optical see-through HMD using a hyperbolic mirror instead of a flat one to increase the horizontal FOV to 190?. Dunn et al. [55] take the optics design one step further to accommodate depth cues while achieving a large field of view by using a flexible mirror membrane to reflect light from the display. This allows a 90? by 45? binocular FOV. The membrane is also varifocal, controlled by airtight cavities. The focal power can be adjusted with a gaze tracker. Rakkolainen et al. [56, 57] use flexible OLED that curved around the viewer?s eye and curved optics to reach super-wide FOV. 
They create proof-of-concept prototypes using Fresnel lenses and achieve 232° × 130° and 318° × 130° FOV. Tiling display panels or optics can also increase the field of view. Massof et al. [58] achieve a 150° × 100° FOV by using a specially molded array of Fresnel lenses and surrounding each lens with 16 OLED panels. Tiling optics of tiny pinholes has been used by Aksit et al. [59]. Given the pupil and aperture size and the aperture-pupil and aperture-display distances, the FOV of a single aperture can be calculated. To widen the FOV of the HMD, an array of pinholes is used. Specially synthesized images are displayed on a cell phone, from which light travels across the apertures to reconstruct the original images. The FOV is easily expandable with larger displays and more apertures. Xiao and Benko [60] fill the periphery of the HMD with sparse arrays of high-contrast LED lights, dubbed the sparse peripheral display. Their prototypes based on existing commercial VR and AR HMDs achieve FOVs of 170° and 190°, respectively.

Light-field and holographic near-eye HMDs, while in their early development stages, have the potential to support a large FOV and address problems like the vergence-accommodation conflict. A prototype by Huang et al. [61] uses a rank-1 factored light field and has a FOV of 87° by 91°. Maimone et al. [62] and Shi et al. [1] use spherical light waves to address the limitations of a small diffraction angle. The former demonstrates a prototype with an 80° horizontal FOV. We compile the different approaches and examples of wide-FOV displays, as well as their resolutions, in Table 2.1.

2.3 User Study Design

Figure 2.2: Top view and stimuli (left) of our user-study setup. In the experiment, patterns appear one at a time. A photograph of our system used in the study (right). Shown on screen is one of the possible use cases of the system.

We build a user study system that uses a large curved screen and eye-tracking to simulate a wide-FOV HMD. Stimuli of different sizes are displayed at various eccentricities. Participants respond to the stimuli by pressing buttons on a controller device. The responses are recorded. Our setup is illustrated in Figure 2.2.

Table 2.1: List of wide-FOV virtual and augmented reality display techniques and devices. The spatial resolution of a holographic display is the same as the SLM pixel pitch when inside the depth of field [1]. The items marked with * are calculated from measurement and other reported device specifications.

Display | Max reported FOV (°) | Color | Angular resolution
Spherical wave holographic display (2017) [1, 62] | 80 (H) | RGB | 3.74 µm
Stereoscopic factored light field synthesis (2015) [61] | 87 × 91 | RGB | 4.1 arcmin/pix*
Flexible mirror membrane (2017) [55] | 90 × 45 | RGB | 3.4 arcmin/pix*
LCD and optics (Meta 2) | 90 × 50 | RGB | 3 arcmin/pix
Detachable display and optics (Samsung Gear VR, Google Daydream) | 100 (D) | RGB | 2.5 arcmin/pix*
Pinlight (2014) [63] | 110 (D) | Mono | 4.5 arcmin/pix*
Tiling display and Fresnel lenses (2003) [58] | 150 × 100 | RGB | 3 arcmin/pix
LCD and catadioptrical optics (2003) [53] | 180 (H) | RGB | 5.7-10.5 arcmin/pix*
Projector and hyperbolic mirror (2007) [54] | 190 (H) | RGB | 60 arcmin/pix*
Sparse peripheral display (2016) [60] | 190 (H) | RGB | 5.5 arcmin/pix*
OLED and Fresnel optics (HTC Vive, Oculus Rift, StarVR) | 210 × 130 | RGB | 5.5 arcmin/pix*
Curved display and optics (2016) [56, 57] | 318 × 130 | RGB | 5.4-7.5 arcmin/pix*
CAVE [44, 50, 51, 64] | full | RGB | 1.5-3 arcmin/pix*
Tiled desktop displays [52] | full | RGB | 3-4 arcmin/pix*
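The starred angular-resolution entries in Table 2.1 are described as derived from reported device specifications. One simple way to make such an estimate, which we use here only as an illustrative assumption and not necessarily the exact procedure behind the table, is to divide the reported field of view along one axis by the panel's pixel count along the same axis:

#include <cstdio>

// Rough angular resolution in arcminutes per pixel, assuming pixels are spread
// evenly across the reported field of view along one axis. This is an
// illustrative approximation, not the exact procedure used for Table 2.1.
double arcminPerPixel(double fovDegrees, int pixelsAlongAxis) {
    return fovDegrees * 60.0 / pixelsAlongAxis;
}

int main() {
    // Example: a hypothetical 110-degree horizontal FOV with 1080 horizontal pixels per eye.
    std::printf("%.1f arcmin/pix\n", arcminPerPixel(110.0, 1080));
    return 0;
}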
The display we use in this study is a custom-made curved screen driven by a 5 × 3 projector array, with a 6400 × 2100 resolution. This display has a 180° horizontal FOV when viewed 236 cm away from the center of the screen. The system setup can be seen in Figure 2.2 (right). The eye-tracking device we use is a Tobii EyeX commercial eye tracker. The eye tracker is placed 163 cm away from the center of the screen, slightly below eye level, facing the user. The system displays various patterns of different sizes at different eccentricities according to a pre-determined randomized order. The response from the user's game controller is recorded at 60 Hz.

2.3.1 Task and Stimuli

We display two types of patterns, crosses and circles, as shown in Figure 2.2 (top left). In the user study task, one type of pattern is specified as the target, and the other is the distraction. The task is to perceive the pattern and respond to the target pattern (by pressing a button on a controller) as soon as it appears and to ignore the distraction, all while keeping fixated at the center of the screen. For the purposes of this study, we define the perception time as the time from the appearance of the pattern to the button being pressed. We recognize that this includes the time for the eye to perceive the pattern, the brain to understand the pattern, and the user to respond by pressing the button on the controller. If a participant responds incorrectly, the stimulus turns red. For example, when the cross is specified as the target and the circle is the distraction, the participant should press a button on the controller as soon as he or she sees a cross on the screen, and do nothing for a circle. Participants are further directed to keep looking at the center of the screen and not to move their heads or gaze directions. The cross-hair at the center turns green if the eye tracker confirms the participant is looking at the center and turns black otherwise.

Since our display supports a 180° FOV, and considering the space needed to display the pattern at the outermost eccentricity, the stimuli are tested at eccentricities from 0° to 80°, with a 10° interval. Because we are more interested in higher eccentricities, we test stimuli at higher eccentricities over a wider scale range. Specifically, eccentricities 0°, 10°, 20° and 30° are tested at scales 5°, 10° and 15°, while eccentricities 40°, 50°, 60°, 70° and 80° are tested at scales 5°, 8°, 11°, 14°, 17°, 20°, 23° and 26°. The scale ranges are determined after our pilot study. Each eccentricity-scale combination is repeated 5 times for each pattern as the target and 2 times for each pattern as the distraction (14 times in total).
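For concreteness, the following C++ sketch enumerates the trial list implied by the counts above (5 target and 2 distraction repetitions per pattern per eccentricity-scale combination) and shuffles it into a randomized order. The data structures, seeding, and layout are illustrative only and do not reproduce the study's actual software.

#include <algorithm>
#include <cstdio>
#include <initializer_list>
#include <random>
#include <vector>

enum class Pattern { Cross, Circle };
struct Trial { double eccDeg, scaleDeg; Pattern pattern; bool isTarget; };

// Build the shuffled trial list implied by the study design: 5 repetitions of
// each pattern as the target and 2 as the distraction per eccentricity-scale
// combination. The eccentricity and scale sets follow the text.
std::vector<Trial> buildTrials() {
    const std::vector<double> lowEcc = {0, 10, 20, 30}, lowScales = {5, 10, 15};
    const std::vector<double> highEcc = {40, 50, 60, 70, 80},
                              highScales = {5, 8, 11, 14, 17, 20, 23, 26};
    std::vector<Trial> trials;
    auto add = [&](const std::vector<double>& eccs, const std::vector<double>& scales) {
        for (double e : eccs)
            for (double s : scales)
                for (Pattern p : {Pattern::Cross, Pattern::Circle}) {
                    for (int r = 0; r < 5; ++r) trials.push_back({e, s, p, true});
                    for (int r = 0; r < 2; ++r) trials.push_back({e, s, p, false});
                }
    };
    add(lowEcc, lowScales);
    add(highEcc, highScales);
    std::mt19937 rng(42);  // fixed seed for a reproducible randomized order
    std::shuffle(trials.begin(), trials.end(), rng);
    return trials;
}

int main() {
    std::printf("total trials: %zu\n", buildTrials().size());
    return 0;
}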
2.3.2 User Study Procedure

The study is conducted with one participant at a time. We first explain to the participants the purpose of this study. We then ask them to log their basic personal information, including name, age, and whether or not they have vision correction. We then explain how the study is conducted, and perform practice rounds of the study. The experimenter answers any questions at this point. To reduce participant fatigue, the entire experiment is divided into four shorter rounds between which the participants can take a break. Two rounds have the cross as the target pattern; two have the circle as the target. In the circle-target rounds, for example, the circle is repeated 5 times for each eccentricity-scale combination and the cross is repeated 2 times for each eccentricity-scale combination. These patterns are shuffled and divided into the 2 circle-target rounds. The opposite is done for the 2 cross-target rounds. The 4 rounds are the same for each participant, but half of the participants go through them in a cross1, circle1, cross2, circle2 order, and the other half in a circle1, cross1, circle2, cross2 order. During each round, the experimenter stays out of the eye tracker's range. The experiment takes about sixty minutes for each participant. After all the rounds are done, each participant is compensated with $10.

During the experiment, we recorded the participants' eye movements and game controller responses. We cross-referenced this data with the stimuli to determine the precise perception time. For the pilot user study, we had six participants. For the formal user study, we recruited 40 participants (24 male and 16 female). Every participant had normal (31 participants) or corrected-to-normal vision (6 wore eyeglasses and 3 used contact lenses).

2.4 Results

We first visualize the perception time over all eccentricity-scale combinations. Then we characterize the relation among the perception time, the eccentricity and the scale. Finally, we quantify the optimal scaling at high eccentricities to achieve the same perception time as in the central vision.

We first show the perception time over all eccentricity-scale combinations in the form of a 3D surface mesh. The perception time is indicated both as the height and by color. The surface mesh of the median (50th percentile), as well as the 25% and 75% quartiles of all participants, can be seen in Figure 2.3.

Figure 2.3: Perception time of all eccentricity-scale combinations is presented as a 3D colored mesh (top left). The three planes from top to bottom are the 75% quartile, median (50% quartile), and 25% quartile of all participants. The perception time at three scales (5°, 10° and 20°) is also shown.

In Figure 2.4, the median perception time of participants over all eccentricity and scale combinations is shown in greater detail, and we plot the linear relationship between scale and eccentricity that results in the same perception time (Section 2.4.3). Figures 2.3 and 2.4 show that, generally, perception time increases as eccentricity gets higher and decreases as the scale becomes larger. A one-way ANOVA test on the perception time, with each eccentricity-scale combination as a group, has a p-value of 1.3 × 10^-6, giving us high confidence to reject the null hypothesis that the perception times follow the same distribution.

Figure 2.4: A color-coded map of the median perception time for all eccentricity-scale combinations. Using our findings in Section 2.4.3, we plot the scales at higher eccentricities to be perceived in the same time as a 5° pattern at 10° (red), 20° (orange) and 40° (yellow).

2.4.1 Perception Time and Scale

The aim of this study is to determine the optimal scaling of the content displayed at higher eccentricities so that it is perceived in a similar time as in the central vision. Specifically, for a pre-specified perception time T0, we want to find, at each higher eccentricity, the scale at which the pattern's perception time is equal to T0. First, we derive the relationship between perception time T and scale S at a given eccentricity. We plot the perception time at each tested scale for a given eccentricity directly from the user study data. We wish to represent the perception time as a function of the scale.
As can be seen in Figure 2.5, for a given eccentricity, perception time generally decreases as the scale gets larger. Judging from Figure 2.5, we fit the relation of perception time and scale as a function of the following form:

T(S) = a log(S) + b    (2.3)

T(S) is the perception time in milliseconds, S is the scale in degrees in the visual field, and a, b are parameters. The function T(S) is calculated for each eccentricity for each participant. We only calculate the perception time vs. scale relation at eccentricities above 40° because we are interested in calculating this relationship for the far peripheral vision.

Figure 2.5: Figures (a), (b), (c) and (d) show the relation between perception time and scale measured for a single user (participant 32) at 50°, 60°, 70° and 80° eccentricities, respectively.

2.4.2 Perception Time and Eccentricity

We present the relation between perception time and eccentricity in Figure 2.6. Specifically, we show the median perception time of all participants with 5°, 10° and 20° patterns at higher eccentricities. When the size of the pattern is kept constant, it takes longer for the patterns at higher eccentricities to be perceived. Perception time decreases as the pattern size increases at the same eccentricity (see Figure 2.5). As is shown, there is a clear difference between the time to perceive the same-sized pattern at a lower eccentricity versus at a higher eccentricity. The 150 ms difference between perceiving a 5° pattern at 40° and at 80° eccentricity is noticeable, and it is meaningful to characterize the scaling at higher eccentricities to bridge this gap.

Figure 2.6: The perception time across eccentricities with an s-sized pattern is the intersection of the plane scale = s and the perception time surface in Figure 2.3. The intersections with the 5°, 10° and 20° scale planes are shown in (left). The intersection curves are shown in (right).

2.4.3 Equal-Efficiency Scale at Each Eccentricity

With the function T(S), we can calculate, for any given eccentricity, the scale at which the pattern is perceived within a given amount of time. Therefore, we can plot the relation of eccentricity and scale for a desired perception time. We call this the equal-efficiency scale. Given a target perception time, we calculate the equal-efficiency scale for each participant and present the median of all participants in Figure 2.7. We note that we only test within the scale range of [5°, 26°]. The equal-efficiency scales at 80° eccentricity for a sizable portion of the participants are outside our tested range and therefore cannot be considered valid. The equal-efficiency scales within the tested range exhibit a linear relationship between scale and eccentricity:

S(E) = αE + β,  40° ≤ E ≤ 70°.

Figure 2.7: Equal-efficiency scales that produce the same perception time as (a) a 5° pattern at 10° eccentricity and (b) perceived in 450 ms. These are the medians over all participants.

We present the parameters for a few example target perception time choices in Table 2.2. We can either use a fixed perception time, e.g., 450 ms, or we can specify a target pattern, a pattern of a given size at a given eccentricity. The time it takes to perceive this pattern is used as the target perception time. We also conduct one-way ANOVA tests on the equal-efficiency scales of all participants for each high eccentricity, with the null hypothesis being that equal-efficiency scales over different eccentricities follow the same distribution.
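As a concrete illustration of how an equal-efficiency scale can be read off the fitted model, the following C++ sketch inverts T(S) = a log(S) + b for a target time T0 and then fits the resulting (eccentricity, scale) pairs to the linear form S(E) = αE + β. The per-eccentricity coefficients and helper names below are placeholders for illustration; they are not the values reported in Table 2.2.

#include <cmath>
#include <cstdio>
#include <vector>

// Perception-time model at one eccentricity: T(S) = a*log(S) + b  (Eq. 2.3).
struct TimeScaleFit { double a, b; };

// Invert Eq. 2.3: the scale whose predicted perception time equals t0.
double equalEfficiencyScale(const TimeScaleFit& f, double t0) {
    return std::exp((t0 - f.b) / f.a);
}

// Least-squares fit of S(E) = alpha*E + beta over the valid eccentricities.
void fitLinear(const std::vector<double>& E, const std::vector<double>& S,
               double& alpha, double& beta) {
    double n = E.size(), sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (size_t i = 0; i < E.size(); ++i) {
        sx += E[i]; sy += S[i]; sxx += E[i] * E[i]; sxy += E[i] * S[i];
    }
    alpha = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    beta  = (sy - alpha * sx) / n;
}

int main() {
    // Hypothetical per-eccentricity fits of Eq. 2.3 for one participant, 40-70 deg.
    std::vector<double> ecc = {40, 50, 60, 70};
    std::vector<TimeScaleFit> fits = {{-80, 620}, {-85, 650}, {-90, 690}, {-95, 730}};
    double t0 = 450.0;  // target perception time in ms
    std::vector<double> scales;
    for (const auto& f : fits) scales.push_back(equalEfficiencyScale(f, t0));
    double alpha, beta;
    fitLinear(ecc, scales, alpha, beta);
    std::printf("S(E) = %.2f * E + %.2f\n", alpha, beta);
    return 0;
}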
We illustrate the equal-efficiency scales in a theme park navigation example in Figure 2.8. We choose the target perception time to be the perception time of a 5°-sized pattern at 10°, 20° and 40°, respectively. The patterns at higher eccentricities are scaled so they can be perceived within the target perception time. The target perception time increases as the eccentricity goes from 10° to 40°. The target equal-efficiency scales decrease as a result. The scales in Figure 2.8 are calculated using the linear equations in Table 2.2.

Table 2.2: Parameters of the equal-efficiency scales fitted to a linear model. The valid range of eccentricity is [40°, 70°]. Analysis of variance is conducted with all participants' equal-efficiency scales at a certain eccentricity as a group. We include the last 5 rows because the equal-efficiency scales at 70° eccentricity are no more than 4° above the tested range.

Target perception time | α | β | dF | F | p-value
5° at ecc 10° | 0.47 | -11.04 | 3 | 7.02 | 0.0002
5° at ecc 20° | 0.55 | -16.3 | 3 | 11.96 | 4.46 × 10^-7
5° at ecc 30° | 0.54 | -15.08 | 3 | 7.96 | 5.75 × 10^-5
5° at ecc 40° | 0.54 | -17.64 | 3 | 12.48 | 2.45 × 10^-7
10° at ecc 10° | 0.52 | -13.11 | 3 | 4.78 | 0.0033
10° at ecc 20° | 0.58 | -14.8 | 3 | 9.3 | 1.13 × 10^-5
10° at ecc 30° | 0.65 | -16.52 | 3 | 7.04 | 0.0002
15° at ecc 10° | 0.65 | -18.49 | 3 | 4.22 | 0.0067
15° at ecc 20° | 0.51 | -6.67 | 3 | 5.32 | 0.0017
15° at ecc 30° | 0.67 | -18.74 | 3 | 5.31 | 0.0017

2.5 Conclusion

Wide-FOV VR and AR displays enable information perception in the far periphery. In this work, we conduct a user study to investigate the perception time in the far peripheral vision. We find that patterns can be scaled linearly with the eccentricity at which they are shown to reach a constant perception time. Based on these equal-efficiency scales, we can optimally scale the content displayed at higher eccentricities so that it can be perceived efficiently. Our finding is useful for designing VR and AR experiences that are able to leverage the entire field of view of the human visual system.

Figure 2.8: Visualization of equal-efficiency scales for a notional theme park navigation. The Star Wars attraction logo is scaled at various eccentricities using equal-efficiency scales. The target perception times are the perception time of a target pattern of size 5°, at eccentricities (top) 10°, (middle) 20°, and (bottom) 40° (these are shown with red dashed boxes). Theme park image reproduced with the permission of Bernie Kelm, https://www.rocket9.net.

Chapter 3: AR Displays for Surgery

3.1 Overview

External ventricular drain (EVD) placement involves inserting a very small clear plastic tube into a lateral ventricle of the brain. Once the catheter is inside the skull, its position can no longer be directly observed, and the neurosurgeon must rely on his or her experience and knowledge of human anatomy to guide placement of the catheter. Huyette et al. [65] and Kakarla et al. [66] retrospectively assess the accuracy of ventriculostomy catheter placement by comparing postprocedure CT scans with preoperative scans. The former study includes 98 ventriculostomy catheter placements and finds that, in the authors' institution, it requires on average 2 passes to successfully place the catheter, and 22.4% of those successful placements end up in nonventricular spaces. The latter includes studies of 346 patients and finds that 13% of the catheter placements are suboptimal, sometimes resulting in nonfunctional drainage. Furthermore, Woo et al.
[67] finds that catheter misplace- ment is a significant independent predictor of catheter-associated hemorrhage. In a radiographic simulation, Robertson et al. [68] finds that the traditional method of EVD trajectory selection and placement is predicted to have a 19% chance to impact a cortical vein. These results show that there is still much room for improvement. A 36 method of visualizing the location of the catheter inside the skull would be of great assistance to the surgeon. A few EVD visualization techniques have been proposed. Cutler et al. [69] have developed a system in which an optical marker is attached to the catheter and tracked by the camera on a head-mounted display (HMD). Other medical instrument tracking techniques by Low et al. [70], Najafi et al. [71, 72], Stolka et al. [73] and Fan et al. [74] make use of markers of different kinds. Figure 3.1: The view observed by the surgeon (captured with HoloLens mixed reality capture function). This is a realistic surgical scenario where EVD is performed on a cadaveric head. Additional information could be shown such as the CT slice at the tip of the catheter if a CT volume is registered to the patient. The CT image shown in this figure serves as an illustration of this potential. Other medical imaging visualization techniques such as in Bista et al. [3, 4] may also be used. Preliminary tests with Microsoft Kinect 2.0 showed that the thinness of the 37 catheter makes it hard to detect and track with commodity depth sensors. The shortcomings of infrared depth sensing, in this case, are out of the scope of this chapter. Instead, we devise a new optical marker and tracking technique, suitable for augmented reality. We label a portion of the catheter with three distinct colors. One design principle we have observed after consulting our medical collaborators is making minimal changes to the catheter. Our modification does not change the catheter in any way that would make it feel different to operate. We develop an algorithm to detect the color bands and then use them to calculate the position of the catheter. This way, even if the tip of the needle is occluded, we still know where it is and are able to visualize it, as long as enough of the colored portion remains visible. We display a virtual catheter overlaid on the real catheter directly in the user?s field of view using a see-through Microsoft HoloLens, which allows the surgeon to better focus on the patient while still having access to the relevant data and provides better depth perception and understanding of the operating context. We only need to do a one-time calibration to determine the transformation between the HoloLens and the camera. Testing shows our system achieves high accuracy and low latency. We also show the usefulness of our system in a realistic surgical environment on a cadaveric head. An example of the view directly observed by the surgeon is shown in Figure 3.1. Our system takes a step beyond Cutler et al.?s pioneering research [69], and improves on several aspects. First, our tracking technique does not require sig- nificant changes to the shape and weight of the catheter. Second, since we use a stationary camera for tracking, it does not require the user of the head-mounted 38 display to be looking at the catheter to track. This allows the surgeon to freely look anywhere without losing the tracking of the catheter. Finally, the capability of our system is not tied to the HMD. We empirically observed a lag of 1.43 s in Cutler et al. 
[69]?s video demonstration1, which is likely caused by the HMD?s inability to keep up with the processing. In our system, the images from the camera, as well as the medical volume are processed separately on another machine, and results are sent to the HMD wirelessly. This allows for faster and more accurate sensing and higher fidelity medical images. Our system can be naturally integrated with existing medical image registration techniques. A proposed example is illustrated in 3.1 where the CT slice at the tip of the catheter is displayed if the CT volume is registered with the patient. Due to the scope of this study, we do not perform the medical image registration but refer interested readers to surveys by Kersten-Oertel et al. [75] and Meola et al. [76]. In short, we built an augmented reality EVD tracking and visualization system that ? overlays a virtual catheter onto a user?s field-of-view, mimicking the position of the real catheter, ? leaves the catheter functionally unchanged, and ? is accurate and responsive. 1We observed this at 0:50 and 0:51 in video https://youtu.be/qqykWW9f41Q 39 3.1.1 Technical Details from Previous AR Ultrasound System Before developing this AR assistant system for EVD, we developed an AR ul- trasound system where we gained a lot of experience in developing AR applications. The AR ultrasound system streams the video from a medical imaging machine live to be viewed in an AR head-mounted display. With this, the surgeons can focus on the patient during a procedure instead of having to look away to look at various monitors. For the sake of completeness, we document these technical details. The AR platform we use is the Microsoft HoloLens2. We develop and de- ploy the Hololens application (client application) from a Windows machine. The application is designed with Unity3 and the C-sharp scripting language. The HoloLens is completely tetherless. The information, both the location of the catheter and the medical images are sent to the Hololens from another program (feeder program) running on a separate machine via WiFi. The feeder program for the AR ultrasound system is developed with Unity and C-sharp. And the feeder program in the AR assistant system for EVD is developed with C++ and OpenCV. We use the User Datagram Protocol (UDP) in our system (DatagramSocket in the .Net framework and Winsock4). To get the live medical images from the imaging machines, we use a portable video grabber (AV.io5) to send the output from the imaging machines to the feeder 2https://www.microsoft.com/en-us/hololens 3https://unity3d.com/ 4https://docs.microsoft.com/en-us/windows/desktop/winsock/windows-sockets-start-page-2 5https://www.epiphan.com/products/avio-hd/ 40 programs as a webcam signal. With this setup, it is possible to send the same information to multiple HoloLenses from one instance of the feeder program. We have found success with two HoloLenses. 3.2 Related Work External ventricular drainage has undergone numerous iterations in materials and techniques since it was first performed in the 1740s. The complete documen- tation of the evolution of EVD is out of scope for this chapter; readers can refer to Srinivasan et al. [77]. In this section, we will review recent developments in EVD catheter placement techniques. In addition, we discuss augmented reality use cases in neurosurgery. 3.2.1 Catheter Placement Proper catheter placement is key to the success of an EVD procedure. A 2016 Neurocritical Care Society study (Fried et al. 
[78]) found that with a stabilization tripod (?Ghajar guide?) or other technical adjuncts such as CT guidance, stereotac- tic navigation could improve accuracy and/or lower the number of EVD procedures performed on a patient. The ?Ghajar guide? is a catheterization assistance device in current clinical use, invented by Jamshid Ghajar in 1985. It is a tripod-shaped device that can be securely placed on the patient?s skull; the catheter travels along the axis of the 41 device and can be stabilized as it goes into the skull. A study by O?Leary et al. [79] found that successful catheterization can be achieved either using a Ghajar guide or freehand, but that a smaller distance to target was achieved with the Ghajar guide. For ventricle exploration and catheter placement training, Phillips and John [80] developed a simple web application that lets trainee surgeons practice simulated catheterization (catheter placement) via a personal computer. Cros et al. [81] in- corporated haptic feedback into the ventriculostomy simulator; users feel a ?pop? sensation when the catheter pierces the ventricular surface. Krombach et al. [82] conducted a free-hand catheterization simulation before performing the actual surgery. In simulations, MRI images were registered onto the patient?s head with fiducials. A pointer instrument was tracked by infrared diodes attached to it. The surgeons used the pointer to virtually move the catheter and the position of the catheter was evaluated. Luciano et al. [83] developed the ImmersiveTouch ventriculostomy simulation system. Stereoscopic goggles are tracked electromagnetically and represent the head position. A hand stylus is tracked mechanically and can offer different resistance sensations to simulate catheter moving in different types of tissue. This way, the displayed content can change according to the user?s point of view and the virtual catheter is visualized based on the location of the hand stylus. Haptic feedback is provided to perceive the interaction between the virtual catheter and the virtual brain model. ImmersiveTouch is used in studies by Banerjee et al. [84], Lemole et al. [85], Schirmer et al. [86] and Yudkowsky et al. [87]. Tai et al. [88] 3D-printed an actual physical phantom head (manikin head 42 for medical procedure simulations) from the reconstruction of CT images and used the brain model as an EVD training platform. This platform provides a realistic anatomical structure and visual/haptic feedback. The most recent experimental EVD development is by Cutler et al. [69]. In their system, a rigid catheter is attached to an optical marker, in this case, a piece of paper. The marker?s, and therefore the catheter?s position is tracked by the camera on an HMD, which displays the virtual catheter image as well as a 3D model of the damaged part of the brain. This way, as long as the marker is in the camera?s field of view, the catheter can be visually tracked, even when the tip is occluded. The system compellingly demonstrates how the advances in sensing and displays can create intuitive and reliable assistance to neurosurgery. Somewhat similar to Cutler et al. [69] but used in biopsy and ablation are Lin et al.?s HoloNeedle [89] and Kuzhagaliyev et al. [90], where the needle is tracked using the OptiTrack6 system and a virtual needle is displayed in an AR headset to align with the real one. In Lin et al.?s HoloNeedle, the biopsy needle?s shape is further obtained with embedded Fiber Bragg Grating sensors to take the needle bending into consideration. 
All the systems mentioned above make significant changes to the shapes of the instruments to accommodate their tracking techniques, which our method avoids (more detail in Section 3.4).

6 https://optitrack.com/

Najafi et al. [71, 72] use a closed-form formula to calculate the trajectory of a needle instrument based on the live footage from a single camera attached to an ultrasound probe and guide the needle angle to reach the target shown in the ultrasound image. Our system is different in that the exact location of the catheter tip is calculated and visualized directly in the user's field of view via an AR HMD. Stolka et al. [73] use stereo cameras to detect the 5DOF information of a needle marked with pseudo-random binary sequence (PRBS) patterns and display the needle in the ultrasound image or project a line on the patient's skin to mimic the needle's location. A generalized version incorporating multiple medical imaging and probing modalities has been proposed by Basafa et al. [91]. Our system is different in that it uses the perspective foreshortening of color bands on the catheter to estimate the tip location from a single camera with higher precision than the prior art.

3.2.2 Augmented Reality Displays for Neurosurgery

One type of augmented reality technique being used in neurosurgery is overlaying detailed preoperative medical images on intraoperative images. Masutani et al. [92] experimented with overlaying preoperative 3D vascular models onto intraoperative X-ray fluoroscopy images via medical fiducial markers. Two clinical test cases showed the registration reprojection displacement to be below 2.6 mm. Lovo et al. [93] reconstructed volumes from cerebral MR scans and overlaid the volumes onto intraoperative cerebral images.

One step further is to track the imaging probe (or microscope) during surgery and overlay meaningful information such as anatomy models onto the intraoperative images. Edwards et al. [94-96] developed and tested an AR microscope-assisted guided intervention system that tracked the microscope and overlaid 3D medical images on the two eyepieces. Shahidi et al. [97] and Scholz et al. [98] experimented with similar setups using surgical endoscopes. Kockro et al. [99] developed a hand-held AR system, Dex-Ray, that overlays medical volumes or other images onto a live video stream. Low et al. [70] used Dex-Ray integrated with the Dextroscope system (an immersive medical image viewing system) and performed meningioma (a type of usually noncancerous brain tumor) excision on five patients. Inoue et al. [100] tracked off-the-shelf webcams to overlay anatomical medical structures on the webcam images.

Sauer et al. [101] and Maurer et al. [102] developed an AR video see-through surgery navigation HMD and demonstrated its use on a phantom head with encouraging results. Two color cameras record the real-world scene, and one additional camera tracks the retroreflective optical markers attached to the phantom head. A medical MR (magnetic resonance) volume aligned with the head phantom is displayed stereoscopically on the HMD. Abe et al. [103] develop an AR guidance system for percutaneous vertebroplasty where the insertion location and intended insertion orientation are displayed in an AR HMD. Trials on a phantom show improved accuracy using this system, and successful clinical trials demonstrate the practical usability. Abhari et al.
[104] engineer a mixed reality system that facilitates the training and planning for a brain tumor resection procedure and demonstrate the greatly improved performance of junior residents and the reduction in time to perform clinically relevant tasks by clinicians. Li et al. [105] conduct a clinical EVD trial where manually aligned CT holograms are shown in an AR headset to display the catheter placement trajectory and target location. The catheter is not tracked or visualized, but it took the AR-assisted group fewer passes and less time on average to complete the procedure. Azimi et al. [106] develop a mixed reality surgical training system that integrates functionalities such as preoperative procedure visualization and real-time performance assessment and demonstrate a catheter placement training session where the system informs the trainee of the orientation error, though the system's tracking accuracy is not reported. Very recently, Meola et al. [76] and Guha et al. [107] present two thorough surveys on the past work in which AR is used for neurosurgery.

3.3 System

Figure 3.2: System Overview

Our system consists of three components: a sensing unit, a processing unit, and a display unit, in addition to the modified catheter itself. The functions of each component, as well as the actual hardware used for each component, are described in Section 3.3.1. The way the system is operated in our experiment is described in Section 3.3.2. Finally, the calibration between the sensing unit and the display unit is described in detail in Section 3.3.3.

3.3.1 System Components

The sensing unit captures the live footage of the operation area that contains the modified catheter. The processing unit analyzes the footage and calculates the catheter's position in the camera's reference system (camera space), which is then sent via a wireless network to the display unit. The display unit uses a transformation obtained at the calibration step (more detail in Section 3.3.3) to transform the received catheter location into the HMD's reference system (HMD space) and displays the virtual catheter at that transformed location to overlay the physical catheter. A general illustration of the system, as well as the intended use scenario, can be seen in Figure 3.2. The workflow of our system is depicted in Figure 3.3. An actual view captured using the HoloLens's mixed reality capture function is shown in Figure 3.1.

The sensing unit is an off-the-shelf Logitech C920 web camera mounted on a tripod. The reason we use a separate camera, rather than the one integrated into the display, is to avoid being limited by the quality of the integrated camera and the processing capabilities of the display, as well as to reduce network latency. The processing unit is an Alienware laptop computer with an Intel i7-7820HK processor running at 2.9 GHz. The display unit is a Microsoft HoloLens tetherless AR head-mounted display. The catheter used in our experiment is a Codman® EDS 3™ clear ventricular CSF catheter. This catheter measures 35 cm in length, which is typical of catheters used in EVD. Fried et al. [78] suggest that, in surgical scenarios, the catheter should not penetrate the skull by more than 6.5 cm; therefore we place the color bands 7 cm away from the tip. We use three color bands, each 3.8 cm in length (see Figure 3.5 for an illustration).

3.3.2 System Operation

When using this system, the user wears the HMD, which displays a virtual catheter.
The virtual catheter can be dragged around in the HMD space with hand gestures. After the user manually aligns the virtual catheter with the physical one observed through the HMD, the user issues a calibration command, which triggers the system to calculate the mapping from the camera space to the HMD space. The system then uses this mapping to transform the catheter's location in camera space into HMD space, as mentioned above, for the subsequent frames, until a new calibration command is issued whenever the user sees fit. In the experiment, the surgeons held the rear portion of the catheter, as illustrated in Figure 3.2, to avoid occluding the color bands.

Figure 3.3: The workflow of the system

3.3.3 Reference System Mapping

Our algorithm calculates the position of the catheter in the camera's reference system (camera space). To be able to visualize the catheter in the HMD, a way to transform from camera space to the HMD's reference system (HMD space) is required. We achieve this by doing a one-time calibration with the calculated catheter position in camera space and the virtual catheter position in HMD space. Suppose the coordinate of the catheter in its own reference system (model space) is A^M. At a certain frame, Tcam and Thmd transform A^M into camera space and HMD space, respectively. We want to find a transformation Tcam→hmd that will transform from camera space to HMD space:

Tcam Tcam→hmd A^M = Thmd A^M    (3.1)

In order for this to work with every A^M, we simply take

Tcam→hmd = Tcam^-1 Thmd    (3.2)

Then, for every frame where a new Tcam is calculated from the position of the catheter in the image, we can calculate the corresponding Thmd and display the virtual catheter. Reference system mapping is illustrated in Figure 3.4.

Figure 3.4: Reference system mapping
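The following C++ sketch mirrors the calibration relations (3.1) and (3.2) with 4×4 homogeneous transforms: a one-time calibration step computes Tcam→hmd from a simultaneously known Tcam and Thmd, and every subsequent frame recovers Thmd from the newly tracked Tcam. The matrix values and function names are illustrative only, not the system's actual code.

#include <opencv2/core.hpp>
#include <cstdio>

// One-time calibration (Eq. 3.2): T_cam_to_hmd = T_cam^-1 * T_hmd,
// computed at the moment the user aligns the virtual and real catheters.
cv::Matx44d calibrate(const cv::Matx44d& T_cam, const cv::Matx44d& T_hmd) {
    return T_cam.inv() * T_hmd;
}

// Per-frame update (from Eq. 3.1): given a newly tracked T_cam, the pose used
// to draw the virtual catheter in HMD space is T_hmd = T_cam * T_cam_to_hmd.
cv::Matx44d hmdPose(const cv::Matx44d& T_cam, const cv::Matx44d& T_cam_to_hmd) {
    return T_cam * T_cam_to_hmd;
}

int main() {
    // Illustrative poses: identity camera pose and a 10 cm translation in HMD space.
    cv::Matx44d T_cam = cv::Matx44d::eye();
    cv::Matx44d T_hmd = cv::Matx44d::eye();
    T_hmd(0, 3) = 0.10;  // metres

    cv::Matx44d T_cam_to_hmd = calibrate(T_cam, T_hmd);
    cv::Matx44d T_hmd_now = hmdPose(T_cam, T_cam_to_hmd);
    std::printf("x-offset of virtual catheter in HMD space: %.3f m\n", T_hmd_now(0, 3));
    return 0;
}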
3.4 Catheter Tracking

We develop a way of tracking the 3D position of the catheter in real time, enabled by three color bands painted on the catheter (see Figure 3.5). The three color bands are adjacent, and we know each of their lengths and locations relative to the catheter; the colors only need to be sufficiently distinct to be identified in the image.

Figure 3.5: A drawing of the catheter and the color bands. The color bands need to be distinct and continuous.

The lengths of the three color bands, as well as of the uncolored forward portion of the catheter (from the catheter tip to the beginning of the first color band), are known. With these lengths and the positions of the color band endpoints detected in the image, we can calculate the 5-DOF (no roll) information of the catheter and infer the position of the tip. We use a pinhole model to represent the camera. As seen in Figure 3.6, P is the center of projection of the pinhole model and is the origin point in camera space. We observe in the image the 2D coordinates of the catheter points A′, B′, and C′ (Figure 3.6 shows three endpoints of two color bands as an example). Given that we know the lengths of the color bands |AB| and |BC|, this becomes a simplified version of the perspective three-point problem (P3P), thoroughly studied and documented in [108-111]. In a general P3P problem, there could be as many as four solutions. In our case, where the three points are collinear, there could be two solutions, of which only one is desired. Determining the correct solution requires finding the angle between the catheter and its image in the image plane, shown in Figure 3.6.

Figure 3.6: We need to find the angle θ between the catheter and its image. Let A, B, and C be three endpoints of two consecutive color bands. The color bands are of equal length a. P is the center of projection in the pinhole camera model. The distance between B and P is b.

We describe our approach to the P3P problem below. First, we use a geometric method to find θ in Section 3.4.1. Then we calculate the position of the catheter in Section 3.4.2. Finally, we extract the segments' endpoints A′, B′ and C′ from the image in Section 3.4.3.

3.4.1 Angle θ

We denote the three endpoints of two consecutive color bands as A, B, and C in Figure 3.6, and the center of projection as P. The images of A, B and C are A′, B′ and C′ on the image plane. Given the camera intrinsics, which can be obtained by a one-time calibration, we can explicitly represent P, A′, B′ and C′ with exact coordinates in pixels. We can calculate the angles

∠APB = α1 = arccos( (PA′ · PB′) / (|PA′| |PB′|) )    (3.3)

∠BPC = α2 = arccos( (PB′ · PC′) / (|PB′| |PC′|) )    (3.4)

Now assume the two color segments are of equal length a (|AB| = |BC| = a) and |PB| = b. Trigonometric relations give the following, where φ is the angle ∠ACP:

a / sin α1 = b / sin(π − α1 − α2 − φ)    (3.5)

a / sin α2 = b / sin φ    (3.6)

φ can be solved with the following equation:

φ = arctan( sin α2 sin(α1 + α2) / (sin α1 − sin α2 cos(α1 + α2)) )    (3.7)

Then we can calculate θ = π − φ − γ, where γ is the angle at C′ between the image line C′A′ and the ray C′P, which is computed directly from the image coordinates. θ can be either positive or negative depending on whether A is farther away from the image plane than C or closer. When A is closer than C (Figure 3.6), θ > 0. When A is farther, θ < 0.

3.4.2 Position of Catheter

With the 3D orientation, we can calculate the position of the catheter in camera space. In Figure 3.7, AB represents a color band, and AT represents the portion of the catheter from the tip to the color band. AT may not be entirely visible from the camera. A′, B′, and T′ are the images of A, B and T on the image plane, respectively.

Figure 3.7: Finding the position of the catheter in camera space. P is the center of projection. TA is the uncolored forward portion of the catheter. AB is the first color band on the catheter.

We know the length of A′B′ because we have detected the endpoints in the image, and we also know the lengths of PA′ and PB′. The actual catheter lies in the plane formed by P and T′B′, and we know its orientation, but it could be on any line parallel to TB, for example T″B″. We can set PA″ = m PA′. The actual ratio m does not really matter because a different ratio only shifts the solution to a parallel line. We can assume, for example, m = 10, and let PB″ = k PB′. Then A″B″ = PB″ − PA″ = k PB′ − PA″. We know the angle θ formed by the catheter and its image, which is the angle formed by A″B″ and A′B′:

A″B″ · A′B′ = |A″B″| |A′B′| cos θ    (3.8)

Equation 3.8 can be further expanded into

k PB′ · A′B′ − PA″ · A′B′ = sqrt( k² |PB′|² − 2k PB′ · PA″ + |PA″|² ) |A′B′| cos θ    (3.9)

All the values in Equation 3.9 are known except for k. We can solve for k and thus find the position of B″. It is worth noting that Equation 3.9 is a quadratic equation, and there could be two solutions for k. Judging from Figure 3.7, if θ > 0, B should be farther away from the image plane than A; therefore k should be greater than the assumed ratio m of |PA″| over |PA′|, and vice versa. Now that we know the length of the catheter color band AB in the real world, we can calculate the position of the color band AB in camera space:

PA = (|AB| / |A″B″|) PA″    (3.10)

PB = (|PA| / |PA″|) PB″    (3.11)

Since we know where AB is on the catheter, we can calculate the position of the catheter in camera space.
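A compact numerical C++ sketch of Sections 3.4.1 and 3.4.2 follows: it computes α1, α2 and φ from the back-projected image rays (Eqs. 3.3, 3.4, 3.7) and then solves the quadratic obtained by squaring Eq. 3.9 for k. The variable names, the example ray values, and the assumed θ are ours; this illustrates the equations and is not the dissertation's implementation.

#include <opencv2/core.hpp>
#include <algorithm>
#include <cmath>
#include <cstdio>

// Angle between two rays from the center of projection P (Eqs. 3.3 and 3.4).
double rayAngle(const cv::Vec3d& r1, const cv::Vec3d& r2) {
    return std::acos(r1.dot(r2) / (cv::norm(r1) * cv::norm(r2)));
}

// Eq. 3.7: the angle phi = angle ACP from the two ray angles alpha1, alpha2.
double solvePhi(double a1, double a2) {
    return std::atan2(std::sin(a2) * std::sin(a1 + a2),
                      std::sin(a1) - std::sin(a2) * std::cos(a1 + a2));
}

// Solve Eq. 3.9 (squared into a quadratic) for k, where PA'' = m*PA',
// PB'' = k*PB', and A''B'' = k*PB' - PA''.
double solveK(const cv::Vec3d& PBp, const cv::Vec3d& PApp,
              const cv::Vec3d& ABp, double theta, double m) {
    double c2 = std::cos(theta) * std::cos(theta) * ABp.dot(ABp);
    double uw = PBp.dot(ABp), vw = PApp.dot(ABp);
    double uu = PBp.dot(PBp), uv = PBp.dot(PApp), vv = PApp.dot(PApp);
    double A = uw * uw - c2 * uu;
    double B = -2.0 * (uw * vw - c2 * uv);
    double C = vw * vw - c2 * vv;
    double d = std::sqrt(std::max(0.0, B * B - 4.0 * A * C));
    double k1 = (-B + d) / (2.0 * A), k2 = (-B - d) / (2.0 * A);
    // Disambiguation rule from the text: k > m when theta > 0, k < m otherwise.
    return ((k1 > m) == (theta > 0.0)) ? k1 : k2;
}

int main() {
    // Illustrative back-projected rays toward A', B', C' (z = 1 image plane).
    cv::Vec3d PA(0.10, 0.0, 1.0), PB(0.15, 0.0, 1.0), PC(0.21, 0.0, 1.0);
    double a1 = rayAngle(PA, PB), a2 = rayAngle(PB, PC);
    double phi = solvePhi(a1, a2);
    double m = 10.0, theta = 0.2;  // theta would come from Section 3.4.1
    double k = solveK(PB, m * PA, PB - PA, theta, m);
    std::printf("alpha1=%.4f alpha2=%.4f phi=%.4f k=%.3f\n", a1, a2, phi, k);
    return 0;
}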
For example, we can calculate the position of the tip in camera space, since we know the lengths of the uncolored forward portion of the catheter and of the color bands:

PT = PA − (|AT| / |AB|) AB    (3.12)

It is worth noting that we can calculate the position of the catheter with two adjacent color bands. We use three color bands to make our system robust against occlusion and to improve accuracy when all three are visible.

3.4.3 Color Band Detection

In the color band detection step, the input is an image I of the catheter (with P_catheter being all the pixels that belong to the catheter), and the output is the locations of the endpoints of the color bands. The accuracy of the color band detection is the most important factor in finding the catheter tip. To reliably detect the endpoints of the color bands, we develop a new algorithm (see Figure 3.3). Starting from the original image (Figure 3.8 (a)) and its undistorted result (Figure 3.8 (b)), we first use simple thresholds to get most of the pixels P′_catheter ⊆ P_catheter of the catheter in the image. We then fit these pixels to a line, which we call the axis L_axis of the catheter (Figure 3.8 (c)). The gradient G_Laxis along the direction of L_axis is calculated using a variation of the Sobel filter, K_Laxis, that is weighted anisotropically according to L_axis (Figure 3.8 (d)). G_Laxis has high values near the border of two different colors. After smoothing with a median filter, the gradient map is thresholded into a binary image G′_Laxis with positive values where the borders are (Figure 3.8 (e)). Finally, we take the weighted centers of the connected components of G′_Laxis as the endpoints of the color bands (Figure 3.8 (f)). Our processing technique is robust against blurry images caused by motion, as seen in Figure 3.8 (g, h, i).

Figure 3.8: Detecting endpoints: (a) original, (b) undistorted, (c) axis, (d) gradient, (e) binarized, (f) endpoints; (g) blurry image, (h) gradient, (i) endpoints. Panels (g), (h) and (i) show detecting endpoints in an image where the catheter is blurry because of movement. Our algorithm is able to handle this case.
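To make the pipeline above concrete, here is a simplified OpenCV sketch of the same sequence of steps (segment, fit axis, directional gradient, binarize, connected components). The color thresholds, the plain Sobel-projection gradient, the unweighted component centroids, and the function name detectBandEndpoints are simplifying assumptions on our part; they stand in for the anisotropically weighted filter and weighted centers described in the text.

#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <cstdio>
#include <vector>

// Simplified sketch of the Section 3.4.3 endpoint detection pipeline.
std::vector<cv::Point2d> detectBandEndpoints(const cv::Mat& undistortedBgr) {
    // 1. Rough color segmentation of the catheter pixels (illustrative range).
    cv::Mat hsv, mask;
    cv::cvtColor(undistortedBgr, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, cv::Scalar(0, 60, 60), cv::Scalar(180, 255, 255), mask);

    // 2. Fit the catheter axis L_axis to the segmented pixels.
    std::vector<cv::Point> pts;
    cv::findNonZero(mask, pts);
    cv::Vec4f axis;  // (vx, vy, x0, y0)
    cv::fitLine(pts, axis, cv::DIST_L2, 0, 0.01, 0.01);

    // 3. Gradient along the axis direction (stand-in for K_Laxis).
    cv::Mat gray, gx, gy, gAxis;
    cv::cvtColor(undistortedBgr, gray, cv::COLOR_BGR2GRAY);
    cv::Sobel(gray, gx, CV_32F, 1, 0);
    cv::Sobel(gray, gy, CV_32F, 0, 1);
    gAxis = cv::abs(gx * axis[0] + gy * axis[1]);

    // 4. Smooth, binarize, and keep only gradients on the catheter.
    cv::Mat g8u, smooth, bands;
    cv::normalize(gAxis, g8u, 0, 255, cv::NORM_MINMAX, CV_8U);
    cv::medianBlur(g8u, smooth, 5);
    cv::threshold(smooth, bands, 128, 255, cv::THRESH_BINARY);
    cv::bitwise_and(bands, mask, bands);

    // 5. Centroids of the connected components approximate the band endpoints.
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(bands, labels, stats, centroids);
    std::vector<cv::Point2d> endpoints;
    for (int i = 1; i < n; ++i)  // label 0 is the background
        endpoints.emplace_back(centroids.at<double>(i, 0), centroids.at<double>(i, 1));
    return endpoints;
}

int main() {
    // Synthetic test image: three colored bands on a dark background.
    cv::Mat img(240, 320, CV_8UC3, cv::Scalar(20, 20, 20));
    cv::rectangle(img, {60, 110},  {120, 130}, cv::Scalar(0, 0, 255), cv::FILLED);
    cv::rectangle(img, {120, 110}, {180, 130}, cv::Scalar(0, 255, 0), cv::FILLED);
    cv::rectangle(img, {180, 110}, {240, 130}, cv::Scalar(255, 0, 0), cv::FILLED);
    std::printf("detected %zu candidate endpoints\n", detectBandEndpoints(img).size());
    return 0;
}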
3.5 Results

In this section, we analyze the performance of our proposed algorithm. First, we measure the stability of our tracking algorithm. Then we test the tracking accuracy on a 2D planar grid as well as on a specially designed and printed 3D grid. Finally, we test the latency compared with a third-party tracker.

3.5.1 Stability

Given the color segment endpoints detected in the image, our tracking algorithm outputs the computed catheter position. The instability in the tracking algorithm comes from random noise in each frame. The noise could cause the same endpoint in two frames to be detected a few pixels apart, even when the catheter stays still. We measure the stability as the root mean square of the change of the calculated tip position in camera space over two consecutive frames, as in Niehorster et al. [112], while keeping the catheter still. Given n frames, with the tip of the catheter in frame i denoted x_i, the stability is measured as

S_RMS = sqrt( (1 / (n − 1)) Σ_{i=1}^{n−1} ||x_i − x_{i+1}||² )    (3.13)

Several factors could influence the stability of the tracking algorithm, including lighting conditions, thresholds for color segmentation, and the distance from the camera to the catheter. In a typical lab setting, our algorithm achieves a stability of 0.33 mm as measured over 870 frames.

3.5.2 Accuracy over a 2D Grid

To test the accuracy of the tracking algorithm, we move and point the tip of the catheter at the intersections of a grid. The grid, 8.16 cm × 8.16 cm in size, is printed on a white sheet. This provides enough space to test the accuracy of catheter movement, more than the 6.5 cm suggested by Fried et al. [78]. The setup for this test can be seen in Figure 3.9 (a).

Figure 3.9: Setup for testing (a) the tracking accuracy over a 2D grid, and (b) the tracking accuracy over a 3D grid, with the 3D CAD model in the lower right.

We assume the center intersection of the grid to be a reference point. Before the test, a 2D marker is placed on the workbench, and its up and right directions (parallel to the edges of the paper sheet the marker is printed on) in camera space are detected. The sheet with the grid is then placed on top of the marker with the edges aligned. Therefore, the grid has the same up and right directions as the marker, and, with its size known, we know the position of each grid intersection relative to the reference point. When moving along the grid, the catheter's orientation is kept relatively fixed and its tip position in camera space is recorded. The tip positions, relative to the position recorded when the tip points at the reference point, are then compared to the positions of the grid intersections relative to the reference point. Figure 3.10 shows the catheter tip locations on the grid with the reference point as the origin. Circles indicate the grid intersections, and crosses are the detected catheter tips. The average distance from the catheter tip to the corresponding grid intersections is 0.58 mm.

Figure 3.10: The accuracy of our catheter tracking approach over the grid. The red circles indicate the grid intersections. The blue crosses indicate the computed catheter tip positions.

3.5.3 Accuracy over a 3D Grid

We also demonstrate the accuracy of our algorithm when the catheter is moving in vertical space. To construct a reliable ground truth, we use a 3D-printed structure with square grids of blind holes on its three faces (see Figure 3.9). The centers of the blind holes are d = 10 mm apart in both directions, which we manually verified with a caliper. Each blind hole is designed as a cone with 1 mm radius and 3 mm height. On each face, there are 8 × 8 blind holes. When testing on each face, one of the center blind holes (for example, (m, n) on the 8 × 8 grid) is chosen as the reference point. The location of the tip when the needle tip is inserted into blind hole (i, j) is recorded as T(i, j). We construct the ground truth G(i, j) as the designed distance of each blind hole to the reference point, G(i, j) = sqrt( (m − i)² + (n − j)² ) d. The distance between the calculated tip and the reference at each blind hole is D(i, j) = ||T(i, j) − T(m, n)||₂. The average accuracy on face f is then Acc_f = (1/64) Σ_{i,j} |G(i, j) − D(i, j)|. In this test, Acc_A = 0.49 mm, Acc_B = 0.93 mm, and Acc_C = 1.12 mm. The overall average accuracy over all three faces is 0.85 mm.

3.5.4 Latency

We measure our system's latency compared to an HTC Vive tracker. This latency is measured as the time elapsed between the tracker's update and our system's catheter location update. In our test, we rapidly move the catheter, which is attached to the tracker, and record for each frame the distance from the catheter tip (tracked with our system and by the HTC Vive tracker) to its original location recorded at the beginning of the test. The distances are then plotted, and we manually measure the average time by which our system lags the tracker. Figure 3.11 (a) shows the plotted distances that we recorded. Manual measurements show an
Manual measurements show an 61 (a) (b) Figure 3.11: The latency of our system compared to an HTC Vive tracker (a) and the various components of our system?s latency in image capture and processing (b). average latency of 95.01 ms. A break down of the latency can be seen in Figure 3.11 (b). There are two factors in the latency with respect to the HTC Vive tracker. It takes about 72 ms for our camera (together with the driver, and OpenCV functions) to capture and store the image. The processing takes an additional 22.60 ms on our machine, which uses an Intel i7 7820HK processor running at 2.9GHz. As we show in Table 3.1, undistorting takes the longest time, 10.05 ms per frame (44.42% of the total processing time). Speed up may be achieved by performing color segmentation first and undistorting only the pixels of the color bands. The next most time- consuming task is calculating the gradient along the catheter axis, (6.81 ms per frame, 30.12% of total processing time). Our current implementation is on the CPU. A GPU is likely to reduce the processing time through parallelization. A professional-grade camera with a low image capture time can greatly reduce the latency. 62 ID Function Avg time per frame Percentage in (ms) processing (%) 1 Undistort 10.04 44.42 2 Color segmentation 1.05 4.64 3 Erosion 0.23 1.02 4 Get axis 0.93 4.13 5 Gradient 6.81 30.12 6 Connected components 1.15 5.08 7 Weighted centroids 0.92 4.09 8 Calculate catheter position 0.21 0.92 Other 1.26 5.58 Total 22.60 100.00 Table 3.1: Time breakdown of functions in our tracking algorithm. 3.6 Discussion We have demonstrated a novel method of tracking a catheter tip and projecting it onto an augmented reality (AR) display. In combination with CT scan projection into the skull, this could allow for a safer, better, and more accurate way to place an external ventricular drainage into the brain of a sick patient. The first challenge we address is significantly enhancing the accuracy and latency of our system over the state-of-the-art augmented reality systems for this procedure such as Cutler et al. [69]. 63 External ventricular drainage is a technically difficult procedure. It involves placing a catheter several centimeters into the brain through tissues of varying den- sity including bone, dura, brain, and cerebral spinal fluid. In addition, it is generally performed in bright light with blue sterile drapes. These realities can potentially affect the accuracy of the AR tip projection in difficult-to-predict ways and needed to be validated in our study. Furthermore, the AR system should not interfere with procedure performance by not hindering the surgeon placing the drain. A purely computational evaluation of our approach would not have been sufficient to assess the efficacy of our augmented-reality approach in such challenging environments. In order to validate the true feasibility of AR-assisted EVD placement, we have carried out our evaluation in a realistic experimental environment, which included a cadaveric head, using sterile drapes and the existing EVD placement kits, closely mimicking actual EVD placement in a patient. After inserting the catheter tip at the insertion point, the surgeons advanced the catheter in a straight line. During the process, the displayed virtual catheter projection aligned well with the visible portion of the real catheter and the pace at which the real and the virtual catheter moved matched. 
The surgeons reported that the handling of our slightly modified catheter feels identical to a traditional catheter. The surgeons, while aware of the HMD, did not find it cumbersome. Tests show that our system achieves sub-millimeter accuracy and low latency.

If further improvements in accuracy and latency were needed, they could be achieved with the following enhancements. First, a professional-grade camera with a higher resolution, higher dynamic range, and lower latency could improve tracking by enhancing our ability to discern the segment endpoints and reduce the overall latency. With a higher resolution, the camera could be placed farther away from the catheter without compromising tracking quality, thus allowing a larger tracking area. All this comes with a higher computational burden. At the moment, our algorithm is implemented on a CPU but is highly parallelizable. Implementing a parallel GPU version of the algorithm is a high priority in our future work. At present, we observe around 167 ms end-to-end latency between moving the real catheter and the corresponding update of the virtual catheter in the HMD. In our current system, we use an independent camera and a laptop for processing, but it would be desirable to explore an all-in-one setup. This is likely to become feasible with further advances in commodity HMDs. Current-generation AR HMDs often suffer from drift in spatial tracking. In our studies, this drift has ranged from 2.15 mm to 6.35 mm, depending on the amount of user head movement. Complementing the 3D spatial tracking with 2D optical tracking, as in Frantz et al. [113], should be able to improve spatial tracking accuracy. In our system, the calibration between the camera space and the HMD space is performed manually to make sure the observed virtual catheter aligns with the observed physical catheter. Therefore, very little optical misalignment has been observed in our experiments as long as the user does not remove or adjust how the HMD is worn. The integration of eye-tracking technology in most of the next-generation HMDs should further improve optical alignment.

Most catheters used in EVD (such as Integra®, Medtronic®, and Codman®) measure longer than 30 cm, so there is sufficient space to include color bands, as outlined in our approach, on clinical catheters. To further mitigate occlusion-related concerns around the color bands, one could explore adding redundant color bands. Last but not least, attention needs to be paid to the clinical safety of catheter modification. Technologies such as VisiMark® (http://www.surfacesolutionsgroup.com/coatings/visi-mark/) could be used to create medical-grade markings to serve as the color bands. Alternatively, the catheter could be constructed with medical-grade silicone of various colors and go through standard sterilization. Slight changes would need to be made to the color band detection process.

Figure 3.12: Our system being used in a cadaver study.

3.7 Conclusion

We present an augmented reality surgical visualization and tracking system that can assist surgeons in EVD catheter placement. Compared to existing similarly-purposed systems, our system does not require changing the shape or weight of the medical instruments. Our main technical contributions include 1) a low-latency, high-performance way to track catheters and other 5DOF thin cylindrical objects, and 2) an image processing algorithm to extract tracking color segment endpoints in an image.
Tests show that our system achieves a 0.58 mm accuracy on a 2D plane and an accuracy of 0.85 mm in 3D space. Processing for each frame takes 22.60 ms on a moderately powerful computer. Both the accuracy and speed can be further improved with higher-quality cameras and parallel implementation. We have tested our system in a realistic surgical environment with a cadaveric head. Our color marker and tracking technique can be applied to other medical procedures where the precise location of a very thin medical instrument is needed, such as needle-guided biopsies and minimally invasive surgeries. 67 Chapter 4: Correcting the Proximity Effect in Nanophotonic Phased Arrays 4.1 Overview Nanophotonic phased arrays (NPA) are phased arrays operating at optical wavelengths. An NPA is composed of an optical power distribution system, a phase modulation mechanism that can control the phase of each pixel individually, and antennas for propagating phase-modulated light into free space. NPAs can efficiently steer optical waves in the desired direction and do so efficiently without any moving parts. Thus, they have found widespread applications in optical communications (such as Khalighi and Uysal [114]), range-finding (such as Poulton et al. [115]), and other fields. Recently, Sun et al. [18] and Notaros et al. [116] demonstrated the use of the NPA as an augmented reality holographic display. In this application, the NPA generates an optical wavefront. Through careful phase modulation, text and images (or 3D scenes) can be displayed for the observer to see. This use of NPAs is of particular interest for graphics research into near-eye displays (NEDs). Near-eye displays have become more popular with virtual and augmented real- ity (VR and AR) going mainstream. In the current VR and AR headsets, users? eyes 68 can only focus at one or a small number of depths. This vergence-accommodation conflict (VAC) can lead to visual discomfort, nausea, and cybersickness. Focus- tunable displays have been shown to provide a preferable viewing experience [117]. Varifocal displays and light field displays have been proposed to address this issue but they are either limited in angular resolution or involve multiple moving com- ponents. Holographic displays recreate the optical wave field of 2D images or 3D scenes. Therefore, the light observed by the user looks as if it is coming from the real image or object, allowing viewing from different perspectives. Holographic displays also form images and 3D scenes at the desired depths for eyes to naturally focus on, thereby addressing VAC. A holographic display modulates the phase and/or amplitude of light at each pixel location to form the desired wavefront to be observed. Previously, Liquid Crystal on Silicon Spatial Light Modulators (LCoS SLMs) are the most common phase modulators for holographic displays. However, NPAs have gained popularity recently. One major advantage of the NPAs over SLMs is their fast refresh rate. One refresh cycle on the NPA includes heating each pixel to a certain temperature corresponding to the desired phase and cooling down to ambient temperature. The simulation with our NPA design shows that it can complete a refresh cycle within 10 ?s, suggesting a 100kHz refresh rate which is uniquely suitable for dynamic con- tent as well as time-division multiplexing operations such as color-switching and higher dynamic range. The only other microdisplay capable of kHz-refresh rates is the Digital Micromirror Device (DMD). 
It has been used for creating near-eye displays in Rathinavel et al. [118]. However, the DMD can only modulate binary 69 amplitude patterns which limits the fidelity of its image in a single frame. The NPA is capable of continuous phase modulation. In addition, the NPAs support integrated light sources where a laser is coupled to the NPA via an optical fiber, distributed to each pixel and emitted by the on-board antennas. In this way, the NPA displays can be made more compact as optical operations such as beam expan- sion and polarization necessary for typical LCoS SLM or DMD display systems are not required. LCoS SLMs also suffer from pixel voltage crosstalk and non-linearity mapping between phase and voltage which hinder accurate phase modulation. Thermally-modulated NPAs, such as those in Doylend et al. [19], Sun et al. [18] and our design, depend on the thermo-optical effect of the materials to modulate the phase of a pixel. The phase of a pixel can be changed from 0 to 2? by changing the temperature of the pixel. The precise and independent control of the phase is needed to ensure the successful display of desired 2D or 3D imagery. However, the thermal proximity effect (PE), the phenomenon where the temperature of one pixel affects the temperature of neighboring pixels, impacts the accuracy of phase modulation and degrades the observed image. A simulated example of the proximity effect negatively impacting the formed image of some Fourier holograms can be seen in Figure 4.1. Therefore, in order to form the desirable images, this thermal proximity effect must be taken into consideration. This problem is further complicated by the fact that most NPAs only have heat generation mechanisms, with no cooling capability, making some phase patterns impossible to realize. There has been limited study on the problem of thermal proximity effect on NPA holographic displays. But the proximity effect is present in many processes 70 (a) Without PEC (b) Basic Iterative PEC (c) Iterative PEC (d) Proximal PEC Gemayel et al. [119] Figure 4.1: Nanophotonic phased arrays suffer from thermal proximity effect where one pixel being heated affects the temperature of nearby pixels. This causes in- accurate phase modulation and noise in the formed image of Fourier holograms as shown in (a). The current state-of-the-art methods, as shown in (b), are not able to sufficiently correct the proximity effect for our use case. We propose two proximity effect correction (PEC) methods for Fourier holograms on the NPA which are able to reduce the noise, as shown in (c) and (d). The proximal PEC method has the best correction effectiveness. Images are chosen from the Common Objects in Context dataset [5]. The proximity effect correction for Fresnel holograms is discussed in Section 4.5. 71 and devices. In beam electron lithography and optical communication using SLMs, the proximity effect is modeled as a convolution, such as in Peckerar et al. [120] and Persson et al. [121]. In the former case, the correction is conducted as a de- convolution with non-negative constraints on the solution. In the latter case, the Gerchberg-Saxton phase retrieval algorithm is slightly modified to produce a result that is somewhat resistant to the proximity effect. Sun et al. [122] also consider the thermal proximity effect on NPAs as a convolution and perform deconvolution on the phase under the assumption of perfect amplitude modulation. In this chapter, we consider the practical but more challenging case of phase- only holograms on NPAs. 
Based on the thermo-electric simulations, we also model the proximity effect as a convolution. We first propose two new proximity effect correction methods for Fourier holograms. The first method works by integrating deconvolution into the phase retrieval algorithm. The second method uses the proximal algorithm to directly optimize the displayed image on the NPA while accounting for the proximity effect. In addition, we propose a proximity effect correction method for Fresnel holograms, also based on the proximal algorithm.

Fourier holograms are based on the Fraunhofer diffraction approximation, in which propagating a wavefront to a plane very far away (the far field) can be approximated with a Fourier transform. Fourier holograms are fast to compute and are well suited for reconstructing 2D images in the far field. However, since Fourier holograms do not specify the depths of the formed images, they are not suitable for 3D scenes. Fresnel holograms, on the other hand, are based on the more accurate Fresnel diffraction approximation, which is applicable closer to the source than the far field and can specify the depths of the formed imagery. An illustration of the diffraction approximations and hologram types can be seen in Figure 4.2; for more detail, please refer to Goodman [123] and Voelz [6]. The closer range is more suitable for near-eye displays, and the ability to specify depth allows users to focus at the correct focal distance. This is where holographic displays and computer-generated holography truly shine. Various methods have been proposed to calculate the Fresnel holograms of 3D objects and display them on SLM-based holographic displays, including those by Shi et al. [1] and Maimone et al. [62]. No proximity effect correction method has been proposed for Fresnel holograms on the NPA.

We start by investigating how the proximity effect negatively affects the observed image quality on a phase-only NPA holographic display. We implement the new proximity effect correction methods mentioned above and compare their correction effectiveness with existing solutions. Computational simulations are run with proximity effects of varying severity, and both qualitative and quantitative results are presented. The factors affecting the correction effectiveness and processing time of the methods are also investigated.

In summary, the goal of this chapter is to explore the proximity effect on NPA holographic displays and to propose new methods of correction. To the best of our knowledge, this is the first detailed analysis of phase-only NPA-based holographic displays with the thermal proximity effect and the first to come up with effective solutions.

Figure 4.2: The Rayleigh-Sommerfeld model provides a highly accurate, but also computationally expensive, diffraction model. It is applicable one wavelength away from the source. When the propagation distance is greater, the Fresnel diffraction approximation, which is less computationally expensive than Rayleigh-Sommerfeld diffraction, may be used. Holograms computed using the Fresnel diffraction are called Fresnel holograms; they can be reconstructed at user-specified depths, such as z_1 and z_2. The Fresnel diffraction approximation is considered acceptable when the Fresnel number N_F = w²/(λz) < 30 [6], where w is the radius of the source field, λ is the wavelength, and z is the propagation distance. Farther away, when N_F ≪ 1, the Fraunhofer diffraction approximation can be used, where the propagation can be approximated with a Fourier transform. Holograms based on the Fraunhofer diffraction are called Fourier holograms and are the fastest to compute. In our NPA design, w = 1 mm. Given a typical green-light wavelength λ = 530 nm, the Fresnel diffraction is acceptable for z > 62.89 mm, and the Fraunhofer diffraction is suitable for z ≫ 1886.79 mm.
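As a quick sanity check of the distances quoted in Figure 4.2, the two thresholds follow directly from the Fresnel number criterion; the short Matlab sketch below simply evaluates them for w = 1 mm and λ = 530 nm (the 30x and "much less than 1" margins follow Voelz [6]).

    % Distance thresholds implied by the Fresnel number N_F = w^2/(lambda*z)
    w      = 1e-3;        % radius of the source field, in meters
    lambda = 530e-9;      % green wavelength, in meters

    z_fresnel    = w^2 / (30 * lambda);   % N_F < 30   ->  z > ~62.89 mm
    z_fraunhofer = w^2 / lambda;          % N_F << 1   ->  z >> ~1886.79 mm

    fprintf('Fresnel valid for     z >  %.2f mm\n', 1e3 * z_fresnel);
    fprintf('Fraunhofer valid for  z >> %.2f mm\n', 1e3 * z_fraunhofer);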
The rest of the chapter is organized as follows. In Section 4.2, we briefly summarize the existing literature on several related topics, including near-eye displays, NPAs, computer-generated holography, and proximity effect correction. In Section 4.3, we describe how we model the proximity effect on an NPA display. Then, in Sections 4.4 and 4.5, we describe our proposed proximity effect correction methods for Fourier holograms and Fresnel holograms, respectively.

4.2 Related Work

We introduce some of the existing work on the various topics related to this chapter. This includes work on NPAs, NEDs, computer-generated holography, and proximity effect correction.

4.2.1 Nanophotonic Phased Arrays

The phased array has been an active area of research since its invention. The microwave version of the phased array was first realized in the mid-1940s [124], and since then it has seen widespread use in applications such as radio detection and ranging (RADAR), broadcasting, and astronomy. The invention of the optical version of the phased array, the nanophotonic phased array, was made possible by several advances, starting from the invention of the laser in the early 1960s and high-precision silicon photonics fabrication.

Optical beam steering using one-dimensional NPAs allows stable, rapid, and precise beam steering without mechanical motion while providing large scanning angles. Two-dimensional NPAs have several applications, including LiDAR scanning and free-space optical communications. What is of particular interest to us is their use as a holographic display. Sun et al. [18] demonstrate a novel NPA architecture capable of integrating an optical directional coupler, a 2π tunable phase modulator, and a nanoantenna within a compact 9 μm × 9 μm unit, and show an infrared image containing text information that can be observed in the far field. Building an NPA on a silicon nitride platform can break its operating wavelength limitation and expand its applications into the visible spectrum. Raval et al. [125] show a static image with horizontal parallax and a viewing angle of 5°, generated on an NPA operating at a 635 nm wavelength. Notaros et al. [116] demonstrate, for the first time, an NPA-based visible-light near-eye holographic projector in which a holographic image encoding methodology is used and a 3D cube can be observed by the eye. However, in most of the works mentioned above, the phase shift for each pixel has been hard-coded during manufacturing and cannot be dynamically changed; as a result, only static images were shown. A notable exception is the work by Sun et al. [18], where a small-scale (8 × 8) array with active phase tuning is used to demonstrate infrared beam steering. Our NPA design supports active phase tuning for visible light and addresses the challenge of correcting the thermal proximity effect that comes with thermally modulated NPAs. A phased array with acousto-optic phase modulation has also been shown by Grinenko et al. [126].

4.2.2 Near-eye Displays

Traditionally, near-eye displays use binocular vision to provide the parallax depth cue [127, 128].
While these displays and the modern commercial headsets 76 can provide a very good field-of-view, they can only afford a single focal distance which leads to the vergence-accommodation mismatch. A comprehensive study by Rolland et al. [129] provides an engineering specification on the number of focal distances that is required to suppress the vergence-accommodation mismatch. To provide multiple focal distances, Akeley et al. [130] use several beamsplitters placed at different distances from the viewer to reflect the content from a single LCD monitor. Dunn et al. [55] use a deformable membrane controlled by airtight cavities to modulate the focal distance. A gaze tracker is used to sense the desired focal distance. Commercialized focus-tunable lenses have also been used address VAC and vision correction in near-eye displays in Chakravarthula et al. [131] and Xia et al. [132]. Aks?it et al. [133] and Kim et al. [134] use a motorized linear stage to move the optical combiner and the MOLED display on the optical axis to adjust the focal distance. Light field displays encode the radiance as a function of position and angle [135] and have been shown to approximate the accommodation to the extent limited by the angular resolution [136]. Light field displays have been implemented based on a variety of underlying hardware platforms such as microlens arrays [137], attenuating LCDs [63, 138?140], and holographic optical elements combiner layer with projec- tors [141, 142]. Pamplona et al. [143] and Huang et al. [144] show that light field displays can correct impaired vision such as myopia and presbyopia and higher-order aberrations. A method to compute the optimal imagery on such displays with mul- tiple focal planes has been described [145]. However, the angular resolution of light field displays is limited by the number of different views which together with the 77 spatial resolution is bound by the device?s form factor and pixel density. Attempt- ing to further increase pixel density can lead to undesirable diffraction artifacts. Padmanaban et al. [146] solve this problem to some degree with Overlap-Add Stere- ogram (OLAS) at the cost of more intensive computation. Mi et al. [147, 148] have shown that the accommodation problem can be circumvented with a type of retinal scanning display that directly projects the image onto the user?s retina. Holographic displays have been built with pre-recorded holographic optical elements [133,134] or with dynamic phase spatial light modulators [1,62] or a com- bination of both [149]. This technology has the potential to fully address some hard problems faced by the current generation of VR and AR near-eye displays such as vergence-accommodation conflict. However, the refresh rate and form factor of the holographic displays falls short of what the NPA technology can provide. For more details on near-eye displays we refer the interested reader to the comprehensive survey by Koulieris et al. [150]. 4.2.3 Computer-Generated Holography and Phase-Only Holograms Various methods have been proposed for computer-generated holography (CGH). For Fourier holograms, the holographic pattern of an image in the far field can be calculated as its Fourier transform. Lohmann et al. [151] show that a binary repre- sentation of the Fourier hologram can achieve good reconstruction quality for letters in a 2D image. Fresnel holograms are calculated by considering the various scene objects, such 78 as points and polygons, as a collection of light sources. 
Waters [152] synthesized a Gabor-style hologram by calculating the interference pattern from the light emitted by each point on an object and the illumination beam. Directly computing the interference pattern from every point is computationally intensive. Lucente [153] uses a look-up table corresponding to the interference contribution from points at each location in the image space and achieves a 43? speedup. Yoshikawa et al. [154] use approximation methods to calculate the distance from a point source to a holo- graphic pixel. Stereograms are small regions of a hologram. They each record the diffraction pattern of a portion of the scene instead of the entire scene and are used in Lucente and Galyean [155], Xu et al. [156], and Shi et al. [1] to reduce the amount of computation. More recently, Eybposh et al. [157] train an unsupervised convolu- tional neural network that outputs the near-field phase for a given desired image in real-time. Peng et al. [158] employ a camera in the hologram calculation process to iteratively optimize the hologram based on captured observation from the camera. These holograms are used to train a neural network that can output high-quality holograms in real-time. Ziegler et al. [159] develop a conversion technique between the light field repre- sentation and the holographic representation of a scene. Padmanaban et al. [146] use this conversion to develop the OLAS algorithm to escape the trade-off between spa- tial and angular resolution in traditional light-field displays. Ways to add occlusion or change lighting in a hologram have also been explored in Ziegler et al. [160]. While an optical wavefront contains both phase and amplitude, many currently available holographic displays are only capable of modulating either only the ampli- 79 tude or only the phase. Aks?it [161] use a learning-based optimization approach to calculate the amplitude hologram that enhances the spatial resolution by up to four times when illuminated with a fast-refreshing locally-addressable backlight. The NPAs we are investigating here is only capable of phase modulation. Therefore, a method to convert fully complex holograms to phase-only holograms is required. Gerchberg and Saxton [162] proposed the first of a family of iterative phase re- trieval algorithms that generate a near-field phase pattern which coupled with a predefined near-field amplitude pattern can propagate to form the desired far-field amplitude. We discuss this algorithm and our modifications in Section 4.4.1. The original Gerchberg-Saxton (GS) algorithm uses Fourier transform to approximate the Fraunhofer diffraction over a long distance. Fresnel diffraction is also frequently used in a similar way when the propagation distances are shorter. Several researchers have made modifications to the iterative phase-retrieval algorithm. Masuda et al. [163] added a binarization step after the propagation from the object plane to the hologram plane to extract a binary phase. Similarly, Persson et al. [121] and Gemayel et al. [119] add a convolution step to counter the pixel voltage crosstalk on SLMs. However, these methods require heavy padding of the image to be effective and in our opinion do not make the full of the limited resolution on holographic displays. Nevertheless, we include these methods as the most recent existing proximity effect correction method even though the cause of proximity effect in our case is different. A recent work on phase retrieval is Chakravarthula et al. 
[164] using non- convex optimization based on Wirtinger derivative. Our proposed proximal prox- 80 imity effect correction method also uses non-convex optimization and shows that it can effectively and efficiently correct thermal proximity effect in NPAs. 4.2.4 Proximity Effect Correction Parikh [165] introduces a self-consistent technique that provides a mathemat- ically unique solution to the PEC problem given the proximity function by solving a set of linear equations. This technique is accompanied by a second technique that compensates for the unaddressed regions by solving an overdetermined system of equations. Carroll [166] expresses the PEC problem as a linear programming problem with constraints derived from the addressed and unaddressed pixels. Chow et al. [167] use an image deconvolution approach based on the convo- lution theorem to efficiently solve the PEC problem. Similarly, a matrix inversion approach to deconvolution is tested by Peckerar et al. [168]. These deconvolution- based methods are likely to produce negative entries in the result, which is not physically acceptable. Simple techniques have been proposed to address this prob- lem such as setting them to zero (chopping) and adding a background value equal to the absolute of the minimum entry (shifting) in Marrian et al. [169]. A more ad- vanced method to avoid negative entries is to add a regularizer to the cost function in the gradient-descent method. One such regularizer used in [168, 169] is propor- tional to the Shannon entropy of the pattern. Alternatively, Peckerar et al. [170] express the non-negative requirement as well as the incident dose as constraints in a linear programming problem that minimizes the total dose. Sun et al. [122] 81 try a number of PEC methods including quadratic programming on an NPA model that suffers from proximity effect in phase modulation but assumes independent and ideal amplitude modulation. Our work addresses the more practical, but also more challenging case of thermal proximity effect correction on a phase-only NPA. There are many more PEC methods in E-beam lithography that we are not able to cover in the entirety here. We refer interested readers to Li [171] for a thorough review. 4.3 Nanophotonic Phased Array This section describes the basic structure and operation of an NPA display. The description is based on an NPA of our design currently under development. The differences between this NPA design and previous NPAs such as those in Sun et al. [18] and Raval et al. [125] are out of the scope of this chapter. The focus of this chapter is the thermal proximity effect which is common in all thermal-modulated NPAs and its correction methods. Therefore, we first explain the phase modulation in Section 4.3.1, followed by how we model and quantify the thermal proximity effect in Section 4.3.2. 4.3.1 Phase Modulation Figure 4.3(a) shows the schematics of the NPA we have developed. Each array unit (a pixel) consists of a tunable thermo-optical phase shifter that is coupled to an optical antenna. The refractive index of the phase shifter changes with temperature. 82 Figure 4.3: The NPA display with electronic and optical layers is shown in (a). The optical power is evenly distributed into each pixel. Therefore, the near-field (b) has uniform amplitude. The antennas (c) propagate the near-field wavefront to form the image to be observed at the desired depth. Therefore, phase modulation can be achieved by heating each pixel. 
Special efforts have been put into reducing the physical size of the phase shifter while maintaining its sensitivity to temperature change. This way, the array can be small, and a complete 2π phase shift can be achieved within a relatively small temperature range, thus achieving high energy efficiency. The optical power is evenly distributed into each pixel by a precise design of the directional couplers; therefore, a uniform amplitude is achieved in the near field, as shown in Figure 4.3(b). The phase of each NPA display pixel is modulated with its temperature. An electronic chip (green layer in Figure 4.3(a)) is used to control the heating of each pixel.

There is a linear relationship between the phase φ_j of a pixel j and its temperature T_j (above room temperature), expressed in Equation 4.1:

φ_j = α T_j    (4.1)

Note that T_j must be non-negative for physical feasibility. From the simulations run on our design, α is inferred to be π/175 rad °C⁻¹ for a 630 nm red laser. With a precise phase shifter design, the temperature swing required for a full 2π phase shift is proportional to the wavelength, giving α = π/147 rad °C⁻¹ for green (530 nm) and α = π/124 rad °C⁻¹ for blue (445 nm) lasers. Given the desired 2D phase pattern of a hologram, we can find the desired 2D temperature profile T with Equation 4.1.

4.3.2 Thermal Proximity Effect

The thermal proximity effect in the NPAs is the phenomenon where heating one pixel affects the temperature of neighboring pixels. Similar proximity effects are present in electron-beam lithography and LCoS SLMs: in Peckerar et al. [120] and Persson et al. [121], the relation between the energy deposited and the energy absorbed, and the pixel voltage crosstalk, respectively, are modeled as convolutions. We also model the thermal proximity effect in the NPAs as a convolution. To quantify the proximity effect, we simulated a 5 × 5 array in which we supplied the center pixel with power and measured the temperature on all pixels. An illustration of this thermal proximity effect can be seen in Figure 4.4; the data for this figure are obtained by fitting a Gaussian to the 25 values from the simulation on the 5 × 5 array described below.

Figure 4.4: Thermal crosstalk arising from power being supplied to the center pixel on a 5 × 5 array.

Using the measurements, we derive the following relationship:

T_{k→j} = T_k e^{−d(j,k)² / σ²}    (4.2)

T_{k→j} refers to the temperature rise in pixel j caused by pixel k, T_j and T_k are the temperatures of pixels j and k before considering the proximity effect, and d(j, k) is the distance between pixels j and k measured in pixels (px). The temperature T̃_j of pixel j under the proximity effect is the sum of T_j and the contributions from all surrounding pixels, expressed in Equation 4.3. The temperature of the entire NPA can be expressed as the convolution in Equation 4.4, where T and T̃ are the 2D temperature profiles before and after considering the proximity effect and ∗ is the convolution operation:

T̃_j = T_j + Σ_{k ≠ j} T_{k→j}    (4.3)

T̃ = T ∗ K_σ    (4.4)

We derive σ = 0.66 px based on the data obtained from thermal-electrical simulations. The convolution kernel K_σ in Equation 4.4 can be constructed using σ; we have used kernels of size 7 × 7 in our simulations. The severity of the thermal proximity effect is quantified by the value of σ. Even though we derived σ = 0.66 px for our current design, factors such as the thermal conductivity of the NPA materials and the pixel pitch could affect this value. For example, if the pixel pitch is reduced, heating one pixel affects the temperature of its neighboring pixels more strongly, as they are closer to the pixel being heated.
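To make the model concrete, a few lines of Matlab suffice to build the 7 × 7 kernel K_σ and apply Equations 4.1 and 4.4. This is an illustrative sketch only: the phase pattern phi below is a random placeholder, and the center entry of the kernel is 1 so that the convolution reproduces Equation 4.3.

    % Proximity-effect model of Equations 4.2-4.4 (sketch)
    sigma    = 0.66;                          % proximity-effect level, in pixels
    [dx, dy] = meshgrid(-3:3, -3:3);
    K        = exp(-(dx.^2 + dy.^2) / sigma^2);   % 7x7 kernel K_sigma, center value 1

    alpha = pi / 175;                         % rad per degree C for the 630 nm red laser (Eq. 4.1)
    phi   = 2*pi * rand(64, 64);              % example (hypothetical) desired phase pattern
    T     = phi / alpha;                      % desired temperature profile above room temperature

    T_tilde = conv2(T, K, 'same');            % temperature actually realized under the proximity effect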
The impacts of the thermal proximity effect and the effectiveness of the correction methods are presented for varying levels of the thermal proximity effect in Section 4.4.3.

4.4 Proximity Effect Correction for Fourier Holograms

In this section, we describe our novel proximity effect correction methods for Fourier holograms in phase-only NPA holographic displays. The first method is based on the Gerchberg-Saxton iterative phase retrieval algorithm; in this method, a deconvolution step is introduced into the source amplitude constraint in each iteration. We refer to this method as the Iterative Proximity Effect Correction (IPEC) method. The second method, which we call the Proximal Proximity Effect Correction (PPEC) method, is based on the proximal algorithm. We use the proximal algorithm to directly find a near-field phase pattern that minimizes the difference between the far-field intensity it forms under the proximity effect and the desired image.

4.4.1 Iterative Proximity Effect Correction

The Gerchberg-Saxton iterative phase retrieval algorithm is a classic and well-known algorithm for calculating phase-only holograms. Here we focus on how we integrate proximity effect correction into it; for completeness, we present a detailed description of the original algorithm in Section 4.4.1.1.

The pseudocode for the IPEC method is presented in Algorithm 1. The inputs to this algorithm are S, T ∈ R^{M×N}, the source and target amplitude patterns, each of size M × N. The source amplitude S is the near-field amplitude on the NPA; according to Section 4.3, S is uniform in our setting. The target amplitude T is the amplitude we wish to form in the far field. The phase retrieval algorithm finds the near-field phase that, coupled with S, produces the far-field amplitude T. A random initial phase φ_0 ∈ R^{M×N} may be used to initialize the algorithm. The function phs returns the phase of a complex wavefront, ⊙ denotes the element-wise product, and F represents the discrete Fourier transform. We use the Fourier hologram model, in which the propagation of the near-field wavefront to the observation plane is approximated with a Fourier transform. Let P(φ) be the function that applies the proximity effect to the phase pattern φ; P⁻¹ is the reverse process.

    Algorithm 1: Iterative Proximity Effect Correction
      input : S, T, φ_0
      output: φ_r
      A = F⁻¹(T ⊙ e^{iφ_0})
      while within the maximum number of iterations do
          φ_r = P⁻¹(phs(A))
          B = S ⊙ e^{iP(φ_r)}          (source amplitude constraint)
          C = F(B)
          if termination condition met then break
          D = T ⊙ e^{i phs(C)}         (target amplitude constraint)
          A = F⁻¹(D)
      end

We integrate the proximity effect correction into the phase retrieval algorithm by modifying the step of the source amplitude constraint. Originally, the source amplitude constraint is applied by having B take the amplitude S and the phase of A. In our iterative proximity effect correction method, B takes the same amplitude S but the phase after the proximity effect and its correction have been taken into account; specifically, the phase is P(P⁻¹(phs(A))). P⁻¹ takes the non-negativity constraint into consideration, so in most cases P(P⁻¹(φ)) ≠ φ.

A good termination condition can be the difference, ‖T − |C|‖_F, between the amplitude of C and the target T becoming small enough. While this works well with the original phase retrieval algorithm, we find it hard to define a threshold once P⁻¹ is introduced. Instead, we terminate when the minimum and maximum values of |T − |C|| are not updated for several consecutive iterations.
In our setting, P is a convolution, i.e., P(φ) = K ∗ φ, where φ is a 2D phase pattern. P⁻¹ is therefore a deconvolution with non-negativity constraints. We can express P⁻¹ in the following form, where M_K x = φ is the matrix form of the convolution, M_K ∈ R^{MN×MN}, and φ, x ∈ R^{MN×1}:

P⁻¹(φ) = argmin_x ‖M_K x − φ‖_2²  subject to x ≥ 0    (4.5)

A basic deconvolution method is matrix inversion. If M_K is non-singular, which it is in our case, we can simply take x = M_K⁻¹ φ; in practice, it is faster and more robust to solve the linear system M_K x = φ. However, this method is likely to produce a solution that contains negative entries, corresponding to temperatures lower than room temperature, which is physically impossible to achieve on the current design of the NPA. A more robust deconvolution method is quadratic programming, formulated in Equation 4.6. Traditional quadratic programming algorithms such as the interior-point algorithm can be used. Alternatively, we can use gradient descent to minimize the cost function in Equation 4.6 and use a regularizer to ensure that the solution is non-negative.

min_x  (1/2) xᵀ(M_Kᵀ M_K) x + (−φᵀ M_K) x  subject to x ≥ 0    (4.6)

In the simulations, we use the quadratic programming deconvolution method described above.

4.4.1.1 Gerchberg-Saxton Iterative Phase Retrieval Algorithm

The iterative proximity effect correction method integrates proximity effect correction into the classic Gerchberg-Saxton iterative phase retrieval algorithm. For completeness, we describe the original GS algorithm here.

We have a target amplitude pattern T that we want to form in the far field and a fixed source amplitude pattern S in the near field. We wish to find the near-field phase that, when coupled with S, can be propagated to the far field to form T. In the traditional iterative phase retrieval algorithm, there are four complex wavefronts, A, B, C, and D (shown in Figure 4.5). A and B are located in the near field, while C and D are located in the far field.

Figure 4.5: Iterative phase retrieval algorithm.

In the initialization step, a common way to initialize A is to set it as the inverse Fourier transform of the target amplitude with the random initial phase φ_0. Then, in each iteration, B is formed by applying the source amplitude constraint on the hologram plane, expressed as B = S ⊙ e^{i phs(A)}. B is then propagated to the object plane to form C. If the termination condition is met here, A's phase (the same as B's phase) is the retrieved phase, and C is the expected wavefront in the far field. Otherwise, the iteration continues: the target amplitude constraint is applied on the object plane, expressed as D = T ⊙ e^{i phs(C)}, and A is then formed by propagating D back to the hologram plane.

The phase retrieval algorithm may also specify a signal region. In this case, we only require the far-field amplitude to match T in the signal region. We denote the signal region as I and the rest of the region as the background II; the signal and background regions of a 2D field A are written A_I and A_II, respectively, with A_I ∪ A_II = A. The target amplitude constraint is then applied separately to the signal and background regions:

D_I = T_I ⊙ e^{i phs(C_I)}    (4.7)

D_II = C_II    (4.8)
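The following Matlab sketch puts the pieces of Algorithm 1 together under the Fourier hologram model. It is illustrative only: the helper nneg_deconv is a placeholder for the non-negative deconvolution of Equation 4.5 (one possible realization is sketched in Section 4.4.3.1), the termination test is omitted, and the fixed iteration count is an assumption.

    % Sketch of Algorithm 1 (IPEC) with P(phi) = conv2(phi, K, 'same')
    function phi_r = ipec_sketch(S, T, K, phi0, maxIter)
        A = ifft2(T .* exp(1i * phi0));
        phi_r = phi0;
        for it = 1:maxIter
            phi_r = nneg_deconv(angle(A), K);            % P^{-1} with non-negativity (Eq. 4.5)
            B = S .* exp(1i * conv2(phi_r, K, 'same'));  % source constraint under the proximity effect
            C = fft2(B);
            % (termination test on | T - |C| | omitted for brevity)
            D = T .* exp(1i * angle(C));                 % target amplitude constraint
            A = ifft2(D);
        end
    end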
4.4.2 Proximal Proximity Effect Correction

In the proximal proximity effect correction method, we set out to find the near-field phase pattern that minimizes the difference between the far-field intensity it forms under the proximity effect and the desired image I. Namely, given the image I ∈ R^{M×N}, we want to find the non-negative near-field phase pattern X ∈ R₊^{M×N} that minimizes g(X) in Equation 4.9. U(X) is the intensity of the wavefront that the NPA with input phase X forms in the far field, calculated with Equation 4.10, and P(X) is again the result of the proximity effect on the input phase X. It should be noted that the proximity effect should, strictly speaking, be applied to the temperature instead of the phase; however, since there is a linear relationship between phase and temperature on the NPA, we use the phase X directly for simplicity in the problem formulation.

g(X) = ‖U(X) − I‖_F²    (4.9)

U(X) = |F(e^{iP(X)})|²    (4.10)

Proximal algorithms are generally used for solving convex optimization problems [172]. However, as shown in Heide et al. [173] and in Section 4.4.3, proximal algorithms can also work well on our non-convex problem. Proximal algorithms work by repeatedly evaluating the proximal operator. The proximal operator of a function f is defined as follows [172], where λ is the step-size parameter:

prox_{λf}(v) = argmin_x ( f(x) + (1/(2λ)) ‖x − v‖_2² )    (4.11)

To enforce the non-negativity constraint, we add an indicator function to the cost function f(X) in Equation 4.12:

f(X) = g(X) + c(X)    (4.12)

c(X) = I_{[0,+∞)}(X)    (4.13)

Various proximal algorithms have been proposed. In this chapter, we use the accelerated proximal gradient algorithm. This algorithm splits the cost function into two parts, one of which is differentiable; in our case, g(X) is the differentiable term. Let X^k and τ^k be the solution and the step size in iteration k, and let Y^k be the extrapolated solution in iteration k, whose purpose is to move the solution a step further towards convergence. The quantities ĝ_τ(Z, Y) = g(Y) + tr(∇g(Y)ᵀ(Z − Y)) + (1/(2τ)) ‖Z − Y‖_F² and β ∈ (0, 1) are used in the line search method of Beck and Teboulle [174] to determine the step size τ. The algorithm is carried out iteratively according to Equation 4.14:

Y^{k+1} = X^k + ω^k (X^k − X^{k−1})
X^{k+1} = prox_{τ^k c}( Y^{k+1} − τ^k ∇g(Y^{k+1}) )    (4.14)

A simple choice for ω^k is k/(k+3) [172]. The proximal operator of the indicator function, prox_{λc}(v), projects the input v onto the non-negative orthant. The gradient ∇g(X) of the function g(X) is derived in Section 4.4.2.1; in Equation 4.15, Re takes the real component of a complex value, conj(X) denotes the complex conjugate of X, and K is the convolution kernel.

∇g(X) = 4 K ∗ Re{ i e^{iP(X)} ⊙ F[ (U(X) − I) ⊙ conj(F(e^{iP(X)})) ] }    (4.15)

The pseudocode for this algorithm is shown in Algorithm 2.

    Algorithm 2: Proximal Proximity Effect Correction
      input : X_0, I, τ, β
      output: X
      X = X_0;  X_prev = X
      while within the maximum number of iterations do
          Y = X + ω(X − X_prev)
          while true do
              Z = prox_{τc}( Y − τ ∇g(Y) )
              if g(Z) < ĝ_τ(Z, Y) then break
              else τ = βτ
          end
          X_prev = X;  X = Z
          if termination condition met then break
      end
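The sketch below is a minimal Matlab rendering of Algorithm 2 under the assumption that P is the convolution with the kernel K; it is not our production implementation. The helper ppec_grad evaluates Equation 4.15, the constant scaling introduced by Matlab's unnormalized fft2 is absorbed by the backtracking line search, and the residual-change termination of Equation 4.36 is omitted for brevity.

    % Sketch of Algorithm 2 (PPEC): accelerated proximal gradient with backtracking
    function X = ppec_sketch(I, K, X0, tau, beta, maxIter)
        g  = @(X) sum(sum((abs(fft2(exp(1i * conv2(X, K, 'same')))).^2 - I).^2));
        X = X0;  Xprev = X;
        for k = 1:maxIter
            omega = k / (k + 3);                          % extrapolation weight
            Y  = X + omega * (X - Xprev);
            gY = ppec_grad(Y, I, K);  gvalY = g(Y);
            while true                                    % backtracking line search
                Z    = max(Y - tau * gY, 0);              % prox of the indicator = projection
                Ghat = gvalY + sum(sum(gY .* (Z - Y))) + norm(Z - Y, 'fro')^2 / (2 * tau);
                if g(Z) <= Ghat, break; else, tau = beta * tau; end
            end
            Xprev = X;  X = Z;
        end
    end

    function G = ppec_grad(X, I, K)                       % gradient of g, Equation 4.15
        E = exp(1i * conv2(X, K, 'same'));                % e^{iP(X)}
        F = fft2(E);                                      % far-field wavefront (Fraunhofer approx.)
        Q = abs(F).^2 - I;                                % U(X) - I
        G = 4 * conv2(real(1i * E .* fft2(Q .* conj(F))), K, 'same');
    end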
4.4.2.1 Derivation of PPEC Gradient

Here we document the derivation of ∇g(X). To reiterate, g(X) is the difference between the far-field image U(X), formed from the near-field phase X under the proximity effect, and the desired image I, as expressed in Equations 4.9 and 4.10. We use the notation M_{ij} to indicate the element at index [i, j] of a matrix M, and for simplicity we may write U or P for U(X) or P(X).

∂g(X)/∂X_{ij} = tr[ (∂g/∂U)ᵀ (∂U/∂X_{ij}) ]    (4.16)
             = tr[ 2 (U − I)ᵀ (∂U/∂X_{ij}) ]    (4.17)

Let Q = U − I; then ∂g(X)/∂X_{ij} becomes the following:

∂g(X)/∂X_{ij} = 2 Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} Q_{mn} [∂U_{mn}/∂X_{ij}]    (4.18)

Now we take a look at U_{mn}:

U_{mn} = |F(e^{iP(X)})|²_{mn} = [F(e^{iP(X)})]_{mn} · conj([F(e^{iP(X)})]_{mn})    (4.19)
      = ( Σ_{k=0}^{M−1} Σ_{l=0}^{N−1} e^{iP_{kl}} e^{−i2πmk/M} e^{−i2πnl/N} ) · ( Σ_{k=0}^{M−1} Σ_{l=0}^{N−1} e^{−iP_{kl}} e^{i2πmk/M} e^{i2πnl/N} )    (4.20)

∂U_{mn}/∂X_{ij} = Σ_{k=0}^{M−1} Σ_{l=0}^{N−1} (∂U_{mn}/∂P_{kl}) (∂P_{kl}/∂X_{ij})    (4.21)

∂U_{mn}/∂P_{kl} = i e^{iP_{kl}} e^{−i2πmk/M} e^{−i2πnl/N} conj([F(e^{iP})]_{mn}) + (−i e^{−iP_{kl}}) e^{i2πmk/M} e^{i2πnl/N} [F(e^{iP})]_{mn}    (4.22)

Now let H^{mn}_{kl} = i e^{iP_{kl}} e^{−i2πmk/M} e^{−i2πnl/N} conj([F(e^{iP})]_{mn}); Equation 4.22 becomes

∂U_{mn}/∂P_{kl} = H^{mn}_{kl} + conj(H^{mn}_{kl})    (4.23)

Substituting Equations 4.23 and 4.21 into Equation 4.18, we get

∂g(X)/∂X_{ij} = 2 Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} Q_{mn} Σ_{k=0}^{M−1} Σ_{l=0}^{N−1} ( H^{mn}_{kl} + conj(H^{mn}_{kl}) ) (∂P_{kl}/∂X_{ij})    (4.24)

Since Q and ∂P_{kl}/∂X_{ij} are real, we know that ∂g(X)/∂X_{ij} is real, which is what we expect:

∂g(X)/∂X_{ij} = 4 Re(S_{ij})    (4.25)

S_{ij} = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} Q_{mn} Σ_{k=0}^{M−1} Σ_{l=0}^{N−1} H^{mn}_{kl} (∂P_{kl}/∂X_{ij})    (4.26)
      = Σ_{m,n} Q_{mn} Σ_{k,l} i e^{iP_{kl}} e^{−i2πmk/M} e^{−i2πnl/N} conj([F(e^{iP})]_{mn}) (∂P_{kl}/∂X_{ij})    (4.27)
      = Σ_{k,l} { Σ_{m,n} Q_{mn} conj([F(e^{iP})]_{mn}) e^{−i2πkm/M} e^{−i2πln/N} } i e^{iP_{kl}} (∂P_{kl}/∂X_{ij})    (4.28)
      = Σ_{k,l} [ F( Q ⊙ conj(F(e^{iP})) ) ]_{kl} i e^{iP_{kl}} (∂P_{kl}/∂X_{ij})    (4.29)

It is easy to follow that, as long as ∂P_{kl}/∂X_{ij} is constant (P being a convolution is one such case), S and therefore ∇g can be calculated as follows:

S = P{ F[ Q ⊙ conj(F(e^{iP})) ] ⊙ i e^{iP} }    (4.30)

∇g = 4 Re(S)    (4.31)

4.4.3 Simulations

We have carried out our computational simulations on a dataset containing 29 real-life images (Sheikh et al. [7, 8]) and an additional newspaper image containing text from the CHRONIC dataset (Smits and Faber [9]). Our software implementation uses Matlab running on an Intel 9900K CPU. We present the simulation results in color, as our proximity effect correction methods are applicable to all wavelengths; multi-color displays can be easily implemented by coupling three lasers into the NPA.

4.4.3.1 Implementation and Test

As mentioned before in Section 4.4.1, we perform the deconvolution by minimizing the residual ‖M_K x − φ‖_2² while maintaining x ≥ 0. Specifically, the cost function f(x) to minimize in the proximal algorithm is the following:

f(x) = g(x) + c(x)    (4.32)

g(x) = (1/2) xᵀ(M_Kᵀ M_K) x + (−φᵀ M_K) x    (4.33)

c(x) = I_{[0,+∞)}(x)    (4.34)

We use the accelerated proximal gradient algorithm. The gradient of g(x) is evaluated as

∇g(x) = M_Kᵀ M_K x − M_Kᵀ φ    (4.35)

For the PPEC method, we use the accelerated proximal algorithm exactly as described in Section 4.4.2. We use the residual change to determine when to terminate the algorithm. Let X^k be the solution in iteration k; the residual is g(X^k) from Equation 4.9, and r^k is the relative change of the residual in iteration k compared to the previous iteration. We terminate the algorithm when r^k drops below 0.001 for 10 consecutive frames.

r^k = |g(X^k) − g(X^{k−1})| / g(X^{k−1})    (4.36)
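A minimal sketch of the non-negative deconvolution subproblem of Equations 4.32 to 4.35 is given below, written with convolutions rather than the explicit MN × MN matrix M_K (which is valid, up to boundary effects, for our symmetric Gaussian kernel). A plain projected-gradient loop with a fixed, sufficiently small step size tau is shown instead of the accelerated variant we actually use; it could serve as the nneg_deconv placeholder in the earlier Algorithm 1 sketch.

    % Sketch of the non-negative deconvolution P^{-1} (Equations 4.32-4.35)
    function x = nneg_deconv(phi, K, tau, maxIter)
        x = max(phi, 0);                            % feasible starting point
        for it = 1:maxIter
            r = conv2(x, K, 'same') - phi;          % M_K * x - phi
            g = conv2(r, K, 'same');                % gradient M_K' * (M_K*x - phi), K symmetric
            x = max(x - tau * g, 0);                % projected (proximal) gradient step
        end
    end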
In addition to showing the simulated reconstructions of the far-field images, we also quantify the quality of these images with the structural similarity index (SSIM) of Wang et al. [175]. The exponents of the luminance, contrast, and structure terms we use in the SSIM are all 1. SSIM(I, R) measures the similarity between the input image I and the reference image R; when I and R are identical, SSIM(I, R) = 1, and the value decreases as the similarity decreases.

We have also considered two other image quality metrics, the classic Peak Signal-to-Noise Ratio (PSNR) and the neural network-based perceptual similarity of Zhang et al. [176]. We did not find any contradictions between these image metrics and direct observation in our tests: the larger the proximity effect level, the more noise is present and the lower the quality metric score. More detail on these image quality metrics in our tests can be found in Section 4.4.3.2.

4.4.3.2 Image Quality Metrics

In order to quantitatively assess the negative impact of the proximity effect and the effectiveness of the correction, a metric for image quality is needed. We look at three image quality metrics and pick one to use in our simulations: the Peak Signal-to-Noise Ratio (PSNR), the Structural Similarity Index (SSIM) of Wang et al. [175], and the neural network-based perceptual image quality metric of Zhang et al. [176]. The first two are well-known image quality metrics. Given the reference image R and the image in question I, whose quality we want to quantify, the PSNR is defined as follows; the images R and I have the same size M × N.

PSNR_R(I) = 10 log₁₀( max(R)² / MSE(R, I) )    (4.37)

MSE(R, I) = (1/(MN)) Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} (R_{ij} − I_{ij})²    (4.38)

The PSNR treats each pixel separately. Even though it is a good metric for reporting the ℓ2 difference, it is not well suited to structured signals such as images. The SSIM is a perceptually inspired image metric and is defined as follows:

SSIM_R(I) = L(I, R)^α · C(I, R)^β · S(I, R)^γ    (4.39)

L(I, R) = (2 μ_I μ_R + C₁) / (μ_I² + μ_R² + C₁)    (4.40)

C(I, R) = (2 σ_I σ_R + C₂) / (σ_I² + σ_R² + C₂)    (4.41)

S(I, R) = (σ_{I,R} + C₃) / (σ_I σ_R + C₃)    (4.42)

L(I, R), C(I, R), and S(I, R) are the luminance, contrast, and structural terms, with μ and σ being the mean and standard deviation of the respective images and σ_{I,R} the cross-covariance of I and R. α, β, γ and C₁, C₂, C₃ are parameters. We use the default parameters, i.e., α = β = γ = 1, C₁ = (0.01D)², C₂ = (0.03D)², and C₃ = C₂/2, where D is the dynamic range of the images, which is 255 for the uint8 images we use.

Zhang et al. [176] find that some of the loss functions based on the deep features of image synthesis applications are uniquely suitable for expressing the quality of an image. In their study, there are cases where the perceptual similarity's choices align more closely with human judgment, while images that the PSNR and SSIM rank as better are actually considered worse by humans. It has also been shown that the perceptual similarity performs well on images outside the domain of the training image datasets. The metric value d produced by their network (their code is publicly available at https://github.com/richzhang/PerceptualSimilarity) represents the "perceptual difference". To be consistent with the PSNR and SSIM, where a larger value means better image quality, we use s = e^{−d} to express the "perceptual similarity".
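For reference, the PSNR of Equations 4.37 and 4.38 takes only a few lines of Matlab; the inputs below are random placeholders standing in for a reference image R and a test image I of the same size. The SSIM and perceptual-similarity values can be computed with standard implementations (for example, the ssim function in Matlab's Image Processing Toolbox with the default exponents, and the publicly released network of Zhang et al. [176]).

    % PSNR per Equations 4.37-4.38 (placeholder inputs)
    R = 255 * rand(256, 384);          % reference image (double)
    I = R + 5 * randn(256, 384);       % test image
    mse      = mean((R(:) - I(:)).^2);
    psnr_val = 10 * log10(max(R(:))^2 / mse);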
We want the image quality metric to reflect the severity of the proximity effect and to treat every image equally. Specifically, we want the increasing severity of the proximity effect to be reflected in the metric for every image; additionally, when the severity of the proximity effect is fixed, we want the variance of the metric scores across images to be as small as possible. Therefore, we assess the three image quality metrics in two ways. First, we apply increasing levels of proximity effect to one image and observe how the metrics change; the results can be seen in Figure 4.6. Next, we fix the proximity effect level and plot the metrics over the entire dataset of 29 images; the results can be seen in Figure 4.7.

Figure 4.6: How the three image quality metrics change with increasing levels of proximity effect on a single image. The results on two example images are shown (left and right columns). These results are representative of the entire image dataset.

Figure 4.7: The image quality metrics over the entire dataset when the proximity effect level is fixed. A small proximity effect (σ = 0.1 px, left) and a large one (σ = 1.0 px, right) are shown. The standard deviations of the metrics over the dataset are also shown under each image.

It can be seen that all three metrics satisfy our requirements for the image quality metric. We choose to use the SSIM as the image quality metric in our simulations because there is no significant disagreement between the SSIM and the perceptual similarity in this simulation, and the SSIM value drops to around zero when the proximity effect level is high and the resulting image is completely noise.

4.4.3.3 Correction Effectiveness

We first present the qualitative visual results in Figure 4.8. The original images are shown on the left as references. We also include the method of Persson et al. [121] and Gemayel et al. [119] in the second column; we refer to this method as the Basic Iterative Proximity Effect Correction (basic IPEC). The basic IPEC method skips the deconvolution step when applying the source amplitude constraint, namely using B = S ⊙ e^{iP(phs(A))} in Algorithm 1. The results of the IPEC method and the PPEC method are presented in the remaining two columns, respectively. The proximity effect level in the first two rows is σ = 0.66 px, corresponding to our sample device. To show the impact of a greater proximity effect level, especially on text-rich images, we use σ = 0.8 px in the third row with the image of a newspaper.

Figure 4.8: This figure compares the various proximity effect correction (PEC) methods (columns: original, basic IPEC of Gemayel et al. [119], iterative PEC, and proximal PEC). The proximity effect level used in the first two rows is σ = 0.66 px; at this level, the IPEC results are noisy but still reasonable. The proximity effect level for the bottom row is σ = 0.8 px; at this level, the results from the IPEC methods are incomprehensible, but the proximal PEC method provides readable results. The images in the first two rows are from the LIVE dataset (Sheikh et al. [7, 8]) and the newspaper image in the bottom row is from the CHRONIC dataset (Smits and Faber [9]).

Figure 4.9: Image quality degradation for the various methods. This figure is plotted with interpolated data from simulations where σ ranges from 0.10 px to 1.00 px in 0.10 px intervals.

As can be seen from Figure 4.8, the IPEC method performs much better than the basic IPEC method and is able to produce reasonable results; however, it is still not able to effectively correct high levels of proximity effect. The PPEC method has the best correction effectiveness of all the methods compared and is the only method that can recover text-rich images when the proximity effect level is high. At high proximity effect levels such as σ = 0.80 px, a vignette effect (where the center of the image is brighter than the border region) can be observed in the results of the basic IPEC as well as the IPEC. This is because the proximity effect is modeled as a Gaussian convolution.
When the proximity effect is high, the Gaussian kernel gets flatter and the near-field phase becomes more uniform. Since the Fourier transform is used to approximate the propagation, more energy is focused around the center of the far field.

Figure 4.9 shows how the average image quality degrades with an increasing amount of proximity effect for the various methods. We observe that the image quality degrades with increasing proximity effect σ for the IPEC method, while the PPEC method retains the best image quality at all proximity effect levels. At σ = 0.66 px, the IPEC and PPEC methods improve the SSIM image quality metric by 121.26% and 249.98%, respectively, compared to the basic IPEC method.

4.4.3.4 Processing Time

We also look at the processing time of the proposed proximity effect correction methods. Two factors may impact the processing time: the severity of the proximity effect and the resolution of the image.

The processing time of the proposed methods with increasing thermal proximity effect levels is presented in Figure 4.10. It is clear at first glance that the IPEC method is very time-consuming. This is to be expected, since IPEC is an iterative method and, at the same time, the deconvolution in each iteration is also implemented in an iterative manner. The PPEC method maintains a much faster processing speed than the IPEC method, although its processing time also increases with the proximity effect level. The basic iterative PEC method is the fastest of the three methods tested; however, as shown in Figures 4.8 and 4.9, it does not provide sufficient proximity effect correction.

The processing time of the methods with increasing image resolution is shown in Figure 4.11. The processing time increases with image resolution for both the IPEC and PPEC methods.

Figure 4.10: The processing time of the proposed proximity effect correction methods with increasing levels of proximity effect, at resolutions 384 × 256 and 768 × 512.

Figure 4.11: The processing time of the proposed proximity effect correction methods on images of increasing resolution, at proximity effect levels σ = 0.5 px and σ = 1.0 px. The image aspect ratio is kept at 3:2; the resolution on the x-axis indicates the vertical resolution of the image.

4.4.3.5 Spatial Frequency of Images

Since the Gaussian filter used in the convolution model of the thermal proximity effect is a low-pass filter, we want to investigate whether the spatial frequency content of an image affects the image quality achieved by the proximity effect correction methods. To test this, we prepare images filtered by low-pass filters with several frequency cutoff thresholds. We use the Butterworth low-pass filter to avoid ringing artifacts in the filtered images. The low-pass filter we use is expressed in Equations 4.43 and 4.44, where f(u, v) is the normalized distance from (u, v) to the center of the spectrum,

f(u, v) = √( [2(u/M − 0.5)]² + [2(v/N − 0.5)]² ),

and f_c is the frequency cutoff threshold. We use frequency cutoff thresholds f_c = 0.1, 0.2, 0.3, 0.4, 0.5. The results of the PEC methods on the low-pass-filtered images are shown in Figure 4.12.

BLPF(I) = F⁻¹[ F(I) ⊙ H ]    (4.43)

H(u, v) = 1 / ( 1 + [f(u, v)/f_c]⁴ )    (4.44)

As can be seen in Figure 4.12, there is no significant change in image quality with different image spatial frequencies. This may be because the thermal proximity effect mainly causes noise artifacts in the reconstructed images rather than blurring.
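A minimal Matlab sketch of the filtering in Equations 4.43 and 4.44 is given below. The image I and the cutoff fc are placeholder inputs, and the ifftshift is an implementation detail of this sketch: it aligns the centered filter H with Matlab's unshifted spectrum layout.

    % Fourth-order Butterworth low-pass filtering (Equations 4.43-4.44, sketch)
    I  = rand(256, 384);            % placeholder image (double)
    fc = 0.2;                       % frequency cutoff threshold
    [M, N]  = size(I);
    [v, u]  = meshgrid(0:N-1, 0:M-1);
    f       = sqrt((2*(u/M - 0.5)).^2 + (2*(v/N - 0.5)).^2);   % distance to the spectrum center
    H       = 1 ./ (1 + (f / fc).^4);
    I_lp    = real(ifft2(fft2(I) .* ifftshift(H)));            % filtered image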
4.4.4 Discussion Here we propose the iterative proximity effect correction which integrates de- convolution into the iterative phase retrieval algorithm and the proximal proximity effect correction which directly finds the phase that optimizes the formed image un- der thermal proximity effect. Here we discuss some details of the proposed methods. 109 Basic Iterative PEC Gemayel et al. [119] Iterative PEC Proximal PEC Figure 4.12: The results of the basic iterative PEC method (top row) and the iterative PEC method (middle row) and the proximal PEC method (bottom row) on low-pass filtered images. The images (left column) are the results of the respective PEC methods applied to a low-pass filtered image with frequency cutoff threshold fc = 0.2 at proximity effect level ? = 0.7 px. The image qualities of each PEC method at different proximity effect levels on images with different frequency cutoff thresholds are shown in the right column. 110 4.4.4.1 Improving IPEC As can be seen in Figure 4.9, the PPEC method achieves better image quality at all the thermal proximity effect levels tested. In fact, the PPEC method is able to maintain the image quality with increasing thermal proximity effect levels, albeit at the cost of slightly longer processing time (see Figure 4.10). IPEC methods perform reasonably well at lower proximity effect levels. But at high proximity effect levels, the IPEC method inevitably breaks down. An example with proximity effect level ? = 0.8 px can be seen in Figure 4.8 bottom row. However, some techniques can be applied to the IPEC method to further improve image quality. The first technique is time-division multiplexing (TDM). We observe that the artifacts in the IPEC results are noise artifacts. We can run the IPEC method multiple times with different initial phase ?0 and display the different resulting phase patterns in consecutive frames. This would increase the processing time of the IPEC method. The TDM technique uniquely takes advantage of the fast refreshing nature of the NPA. We can simulate the TDM by averaging the resulting far-field images, which are shown in Figure 4.13(a). Obviously, if IPEC is run for n times, the TDM technique would take n times the processing time of the IPEC method. The other technique is to pad the desired image with a background region. In the iterative phase retrieval algorithm, a signal region and a background region on the wavefront can be specified. The two regions have no intersection. We want the signal region to display the desired amplitude pattern therefore the target amplitude constraint is only applied in the signal region. The background region offers more 111 (a) Time-Division Multiplexing (b) Padding to 512? 512 (c) Initial padded image (d) Result padded image Figure 4.13: (a) shows the result of TDM application to the IPEC method with 10 different initial phase. (b) shows the of result of the padding technique applied to IPEC method. The center signal region is shown. The image is padded from 171 ? 256 to 512 ? 512. The initial and result padded images are shown in (c) and (d), respectively. The thermal proximity effect level is ? = 0.66 px for both techniques. 112 Figure 4.14: IPEC with TDM and padding techniques compared to the original IPEC method and the PPEC method. freedom to phase manipulation. We have included more detail on the iterative phase retrieval algorithm in the supplemental material. An example of this technique is shown in Figure 4.13(b). 
The padding technique uses more pixels for a given image than PPEC does and wastes a portion of the display; padding also increases the processing time. A comparison of the two techniques with the IPEC and PPEC methods is shown in Figure 4.14. As can be seen, both techniques perform much better than the original IPEC method without them. The two techniques achieve image quality comparable to the PPEC method at lower proximity effect levels (σ < 0.7 px). This indicates that the IPEC method has the potential to achieve correction effectiveness comparable to the PPEC method; however, fast deconvolution methods are necessary for the IPEC method to be efficient.

4.4.4.2 Pixel Pitch and Thermal Proximity Effect

Reducing the pixel pitch is one of the most effective ways of improving the quality of a holographic display. Given the form factor of the display, reducing the pixel pitch not only increases the number of pixels but also improves the field-of-view of the display. The importance of pixel pitch has been discussed in a number of studies on LCoS-based holographic displays, such as Shi et al. [1] and Jang et al. [149]. The same principle also applies to NPA holographic displays. It is easy to see that when the pixel pitch decreases, the pixels are more densely packed and each pixel is affected more by the heat from neighboring pixels, which is highly likely to lead to a more severe proximity effect. The PPEC method is currently the only effective correction method to address this problem.

4.4.4.3 Choice of Proximal Algorithm

There are multiple variations of the proximal algorithm; two examples are the proximal gradient algorithm and the ADMM (alternating direction method of multipliers). Both algorithms can be used to minimize the function f(X) = g(X) + c(X) in Equation 4.12. In the proximal gradient algorithm, let X^k be the solution in iteration k; the update in iteration k+1 is

X^{k+1} = prox_{λc}( X^k − λ ∇g(X^k) )    (4.45)

In the ADMM algorithm, the updates in each iteration are the following, where X^k and Z^k converge to each other and lie in the domains of f and g, respectively:

X^{k+1} = prox_{λf}( Z^k − U^k )    (4.46)

Z^{k+1} = prox_{λg}( X^{k+1} + U^k )    (4.47)

U^{k+1} = U^k + X^{k+1} − Z^{k+1}    (4.48)

The main difference between the two algorithms is the evaluation of the proximal operator of the function g(X). g(X) is a non-linear function, and minimizing it is the core of the proximal proximity effect correction problem. Since it is difficult to evaluate prox_{λg}, we use the proximal gradient algorithm, which only involves calculating the gradient of g(X). In practice, we use the accelerated version of the proximal gradient algorithm; its update steps are described in Section 4.4.2. The accelerated proximal gradient algorithm has an extrapolation step that can be interpreted as moving the solution a step faster towards the optimum.

In the implementation of the quadratic programming deconvolution in the IPEC method, we also use the proximal algorithm. Even though a closed-form evaluation of the proximal operator of the quadratic function g(x) in Equation 4.33 exists (see Parikh and Boyd [172]), it still involves solving a linear system with tens of thousands of variables. We have found the accelerated proximal gradient implementation to be faster than the ADMM implementation.

4.4.4.4 Iterations of PPEC

Here we show how the image quality improves over the iterations of the PPEC method. An example with proximity effect level σ = 0.66 px is shown in Figure 4.15.

Figure 4.15: Image quality improves over the iterations.
The orange dashed line indicates the change in the residual compare to the previous iteration. ? = 0.66 px in this case. The image quality improves over the iterations. We use the residual change rk in Equation 4.36 to determine when to terminate the algorithm. We terminate the algorithm when rk drops below 0.001 for 10 consecutive frames at which point the image quality changes very little. 116 4.4.5 Limitations Our simulations have shown that the proposed PPEC method achieves very good correction effectiveness across different proximity effect levels. However, there are still limitations to this approach. The NPA?s properties can help addresses some of the limitations. Some of the properties may pose problems as well. Here we list some of the limitations and propose potential solutions. 4.4.5.1 Amplitude Modulation and Diminished Reality NPAs have the advantage of fast refresh rate and integrated light source. How- ever, NPAs also have limitations. The integrated light source makes it difficult for NPAs to modify the light from the real scene in augmented reality. Reflective or transmissive SLMs can modify the wavefront from the real scene that reaches them, achieving diminished reality or occlusion of real objects in AR, such as in Rathinavel et al. [177]. Combining LCoS SLMs can expand the capabilities of NPA holographic displays. 4.4.5.2 Choice of Initial Phase The initial phase value of X in the PPEC method has an effect on the result. We find that a constant value often leads to poor image quality or an excessively long time to convergence. We use random values in our implementation and about 1% of the PPEC results suffer from slight speckle noise. We currently deal with this by detecting the noise and rerun PPEC with a different initial value. Heide et 117 al. [173] propose to use the result from the iterative phase retrieval algorithm as the initial value for the phase retrieval using the proximal algorithm. We do not think it is a viable choice for us because the iterative phase retrieval would add to the processing time of the method. Still, the intermediate result of the iterative method after only a few iterations may offer better robustness for our method. 4.4.5.3 Larger Proximity Effect for PPEC We have shown that the PPEC method effectively corrects the proximity effect within the tested range (from ? = 0.10 px to 1.00 px). While this is a realistic range in which our NPA design lies, the PPEC method may not suffice at higher proximity effect levels. For example, at ? = 1.50 px, the average SSIM of the PPEC method is 0.48, and at ? = 2.00 px, the average SSIM drops to 0.30. 4.4.5.4 3D Holograms In this section, we have explored proximity effect correction on NPAs to form 2D images. Holographic displays truly shine at displaying realistic 3D scenes at the correct depth. The most naive approach would be to generate the phase-only hologram of a 3D scene either with double-phase encoding (such as in Maimone et al. [62]) or multi-plane phase retrieval (such as Makowski et al. [178]) and apply deconvolution to the phase-only hologram. Or we can integrate deconvolution into the multi-plane phase retrieval just like the proposed IPEC method. Noise and discontinuity in the depth can be expected from these approaches. Another approach 118 that takes advantage of the fast refresh rate of the NPA is time-division multiplexing. Similar to what is suggested in Chakravarthula et al. [164], we can divide the 3D scene into a stack of images at different depth and apply PPEC to each image. 
The results are displayed in consecutive frames. The PEC methods need to adapt to a different wave propagation model that specifies the propagation distance, such as Fresnel or Fraunhofer propagation. We also plan to look at direct optimization methods that optimize the 3D scene as a whole. We address 3D holograms in Section 4.5.

4.5 Proximity Effect Correction for Fresnel Holograms

Here, we present a novel proximity effect correction method for Fresnel holograms on NPAs. We refer to this method as the Fresnel Proximity Effect Correction (FPEC). This method uses the proximal algorithm to directly find the phase of a phase-only hologram that minimizes the difference between the desired image and the image formed at a specific distance under the proximity effect.

Since our method focuses on Fresnel holograms and reconstructing 2D images or 3D scenes, we consider the case of the Fresnel diffraction approximation. Assuming a wavefront U_1 formed on the NPA, the wavefront U_2 formed by propagating U_1 by a distance z can be calculated as in Equation 4.49. The two wavefronts U_1 and U_2 are parallel to each other. F and F^{-1} are the Fourier transform and its inverse. H_{zλ}(f_X, f_Y) = e^{ikz} e^{−iπλz(f_X^2 + f_Y^2)} is the transfer function in the frequency domain, where k = 2π/λ is the wave number for wavelength λ. We refer interested readers to Goodman [123] for the detailed derivation of the Fresnel diffraction.

U_2(x, y) = F^{-1}{F[U_1(x, y)] H_{zλ}(f_X, f_Y)}    (4.49)

The rest of the FPEC method is very similar to the PPEC method. Given the desired image I ∈ R_+^{M×N}, we wish to find the non-negative phase X ∈ R_+^{M×N} that minimizes the difference between the desired image I and the formed image under the proximity effect, expressed as g(X) in Equation 4.9. Let U(X) be the intensity of the wavefront which the NPA with input phase X forms at a specific distance z from the NPA. Let P(X) be the function that applies the proximity effect to the phase pattern, i.e., P(X) = X ∗ K, where K is the convolution kernel in Equation 4.4.

g(X) = ‖U(X) − I‖_F^2    (4.50)
U(X) = |F^{-1}[F(e^{iP(X)}) H_{zλ}]|^2    (4.51)

To enforce the non-negativity constraint, we add an indicator function c(X). The FPEC method minimizes the function f(X) in Equation 4.12 using the proximal algorithm.

f(X) = g(X) + c(X)    (4.52)
c(X) = I_{[0,+∞)}(X)    (4.53)

The same pseudocode in Algorithm 2 can be used for the FPEC method. We document the derivation of the gradient for the FPEC method in Section 4.5.0.1.
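Before turning to the gradient derivation, note that the propagation in Equation 4.49 maps directly onto a pair of FFTs. The following is a minimal NumPy sketch of this propagation step under assumed parameters (pixel pitch, wavelength, distance); it illustrates the forward model used by FPEC, not our Matlab implementation.

```python
import numpy as np

def fresnel_transfer_propagate(U1, z, wavelength, pitch):
    """Propagate wavefront U1 by distance z using the Fresnel transfer function
    H_z(fX, fY) = exp(i k z) * exp(-i * pi * wavelength * z * (fX^2 + fY^2)), as in Equation 4.49."""
    M, N = U1.shape
    k = 2.0 * np.pi / wavelength
    fX = np.fft.fftfreq(N, d=pitch)              # spatial frequencies along x
    fY = np.fft.fftfreq(M, d=pitch)              # spatial frequencies along y
    FX, FY = np.meshgrid(fX, fY)
    H = np.exp(1j * k * z) * np.exp(-1j * np.pi * wavelength * z * (FX ** 2 + FY ** 2))
    return np.fft.ifft2(np.fft.fft2(U1) * H)

# Example with assumed values: a 2 mm x 2 mm phase-only aperture propagated 100 mm at 530 nm.
if __name__ == "__main__":
    M = N = 512
    pitch = 2e-3 / N
    X = np.exp(1j * 2 * np.pi * np.random.rand(M, N))
    U2 = fresnel_transfer_propagate(X, z=100e-3, wavelength=530e-9, pitch=pitch)
    print(np.abs(U2).max())
```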
4.5.0.1 Derivation of FPEC Gradient

Here we present the derivation of the gradient ∇g of the function g(X). As before, we use the notation M_ij to indicate the element at index [i, j] of a matrix M. For simplicity, we use U or P to represent U(X) or P(X), respectively.

∂g(X)/∂X_ij = Tr[(∂g/∂U)^T (∂U/∂X_ij)]    (4.54)
            = Tr[2(U − I)^T (∂U/∂X_ij)]    (4.55)

Let Q = U − I, and we can represent ∂g(X)/∂X_ij as follows:

∂g(X)/∂X_ij = 2 Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} Q_mn [∂U_mn/∂X_ij]    (4.56)

Next, we consider the derivation of U_mn. Let F be the complex wavefront in the observation plane. For simplicity, we omit the subscripts z and λ of the transfer function H_{zλ} in Equation 4.49. To closely relate to the implementation, F and F^{-1} are written as the discrete Fourier transform and its inverse. F̄ denotes the complex conjugate of F.

F(X) = F^{-1}[F(e^{iP(X)}) H]

F_mn = Σ_{p=0}^{M−1} Σ_{q=0}^{N−1} H_pq ( Σ_{k=0}^{M−1} Σ_{l=0}^{N−1} e^{iP_kl} e^{−i2πpk/M} e^{−i2πql/N} ) e^{i2πmp/M} e^{i2πnq/N}

F̄_mn = Σ_{p=0}^{M−1} Σ_{q=0}^{N−1} H̄_pq ( Σ_{k=0}^{M−1} Σ_{l=0}^{N−1} e^{−iP_kl} e^{i2πpk/M} e^{i2πql/N} ) e^{−i2πmp/M} e^{−i2πnq/N}

U_mn = F_mn · F̄_mn

∂U_mn/∂X_ij = Σ_{k=0}^{M−1} Σ_{l=0}^{N−1} (∂U_mn/∂P_kl)(∂P_kl/∂X_ij)    (4.57)

∂U_mn/∂P_kl = Σ_{p=0}^{M−1} Σ_{q=0}^{N−1} H_pq (i e^{iP_kl}) e^{−i2πpk/M} e^{−i2πql/N} e^{i2πmp/M} e^{i2πnq/N} · F̄_mn
            + Σ_{p=0}^{M−1} Σ_{q=0}^{N−1} H̄_pq (−i e^{−iP_kl}) e^{i2πpk/M} e^{i2πql/N} e^{−i2πmp/M} e^{−i2πnq/N} · F_mn    (4.58)

Since the two summation terms in Equation 4.58 are complex conjugates of each other, it is easy to see that ∂U_mn/∂P_kl is real and can be simplified as shown next. Re{·} takes the real component of a complex value.

∂U_mn/∂P_kl = 2 Re{ Σ_{p=0}^{M−1} Σ_{q=0}^{N−1} H_pq (i e^{iP_kl}) e^{−i2πpk/M} e^{−i2πql/N} e^{i2πmp/M} e^{i2πnq/N} F̄_mn }    (4.59)

Using Equations 4.59 and 4.57 with Equation 4.56, we get the following:

∂g(X)/∂X_ij = 4 Re(S_ij)

S_ij = Σ_{m,n} Q_mn Σ_{k,l} Σ_{p,q} H_pq (i e^{iP_kl}) e^{−i2πpk/M} e^{−i2πql/N} e^{i2πmp/M} e^{i2πnq/N} F̄_mn (∂P_kl/∂X_ij)
     = Σ_{k,l} Σ_{p,q} H_pq (i e^{iP_kl}) e^{−i2πpk/M} e^{−i2πql/N} (∂P_kl/∂X_ij) Σ_{m,n} Q_mn F̄_mn e^{i2πmp/M} e^{i2πnq/N}
     = Σ_{k,l} (i e^{iP_kl}) (∂P_kl/∂X_ij) Σ_{p,q} H_pq e^{−i2πpk/M} e^{−i2πql/N} [F^{-1}(Q ∘ F̄)]_pq
     = Σ_{k,l} (i e^{iP_kl}) (∂P_kl/∂X_ij) F{H ∘ [F^{-1}(Q ∘ F̄)]}_kl

Finally, we can calculate the gradient of g(X) as follows, where ∘ denotes the element-wise product:

∇g = 4 Re(S)
S = P(i e^{iP} ∘ F{H ∘ [F^{-1}(Q ∘ F̄)]})
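The closed-form gradient above is what makes the proximal gradient iterations of FPEC practical: it costs a handful of FFTs rather than explicit quadruple sums. The sketch below expresses it in NumPy under the same assumptions as the propagation sketch earlier; the transfer function H and the proximity-effect operator apply_P are passed in as placeholders, and the overall scale differs from the unnormalized derivation by constant DFT factors that a step size would absorb. It illustrates the structure of ∇g rather than reproducing our Matlab implementation.

```python
import numpy as np

def fpec_gradient(X, I, H, apply_P):
    """Gradient of g(X) = || |F^-1[F(e^{iP(X)}) H]|^2 - I ||_F^2  (Equations 4.50-4.51).

    X       : real-valued phase matrix (M x N)
    I       : target intensity image (M x N)
    H       : Fresnel transfer function sampled on the DFT grid (M x N, complex)
    apply_P : callable applying the thermal proximity effect, i.e. convolution with the
              symmetric kernel K of Equation 4.4 (a placeholder here).
    """
    P = apply_P(X)                                       # proximity-blurred phase
    F = np.fft.ifft2(np.fft.fft2(np.exp(1j * P)) * H)    # complex wavefront at distance z
    U = np.abs(F) ** 2                                   # formed intensity
    Q = U - I                                            # residual image
    inner = np.fft.fft2(H * np.fft.ifft2(Q * np.conj(F)))
    S = apply_P(1j * np.exp(1j * P) * inner)             # outer P; K symmetric, so conv = corr
    return 4.0 * np.real(S)
```

A quick way to sanity-check such a gradient is to compare a few of its entries against finite-difference estimates of g(X).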
4.5.1 Simulations

We present computational simulations of our novel Fresnel PEC method and compare it with the case without proximity effect correction, where the proximity effect is applied to the phase of the phase-only hologram. We conduct the simulations in Matlab on a PC with an Intel 9900K CPU. We present the simulation results in color, as our FPEC method is applicable to all wavelengths. Multi-color displays can be implemented by coupling three lasers with the NPA. In our simulations, we use wavelengths of 630 nm, 530 nm, and 445 nm for the red, green, and blue channels. The NPA is of size 2 mm × 2 mm, and the images are placed 100 mm away from the NPA. We use the same termination condition as in the PPEC method: the residual change r in Equation 4.36 must remain below 0.001 for 10 iterations.

We compare the proposed Fresnel PEC method with two baselines. The first is the case without proximity effect correction, where the proximity effect is applied to the phase of the phase-only hologram. The second is the IPEC method of Section 4.4.1 adapted to Fresnel holograms, in which we replace the wavefront propagation with the Fresnel propagation approximation. We refer to this method as the Iterative Fresnel Proximity Effect Correction (IFPEC) method.

4.5.1.1 2D Fresnel Holograms

The qualitative results of the Fresnel PEC on 2D images are shown in Fig. 4.16. As can be seen, the proximity effect results in appreciable noise in the simulated images. The proximity effect levels in Fig. 4.16 are σ = 0.70 px, 0.80 px, 0.90 px, and 1.00 px in the four rows from top to bottom, respectively. For the two baseline methods, the higher the proximity effect, the noisier the images become. Our proposed Fresnel PEC method effectively reduces the noise caused by the proximity effect even at very high proximity effect levels. Compared with the iterative Fresnel PEC method, our method better corrects the proximity effect and restores the images.

Figure 4.16: Qualitative comparison of the Fresnel PEC method with the case without proximity effect correction and the iterative Fresnel PEC method. The original image is presented in the left column for reference. The thermal proximity effect levels for the images from top to bottom are σ = 0.70 px, 0.80 px, 0.90 px, and 1.00 px, respectively. At σ = 0.70 px, the proximity effect causes noise but the images are still recognizable. However, for more severe proximity effect, the noise overwhelms the images. Our Fresnel PEC method is able to correct the proximity effect and restore the images close to the original even at such high proximity effect levels. The images in the first and second rows are from the Common Objects in Context dataset [5] and the CHRONIC dataset (Smits and Faber [9]), respectively. The remaining images are from the LIVE dataset (Sheikh et al. [7, 8]).

In addition to qualitative reconstructions of the images, we also quantify our results with image quality metrics. We present both the structural similarity index (SSIM) from Wang et al. [175] and the perceptual similarity from Zhang et al. [176]. More details on these metrics can be found in Section 4.4.3.2. Figure 4.17 shows the average image metrics over the images from the LIVE dataset (Sheikh et al. [7, 8]) at each proximity effect level from σ = 0.10 px to σ = 1.00 px.

Figure 4.17: The SSIM and perceptual similarity of the Fresnel PEC results compared to the case without proximity effect correction and the iterative Fresnel PEC (IFPEC) method. With no PEC, the image quality degrades quickly beyond σ = 0.30 px. The image quality of the IFPEC method starts to degrade from σ = 0.40 px. The Fresnel PEC method effectively corrects the proximity effect at almost all tested levels, with image quality degrading only slightly at the extreme proximity effect level σ = 1.00 px.

As can be seen, the proposed Fresnel PEC method can effectively correct the proximity effect at levels from σ = 0.10 px to 0.90 px, with only a slight dip in image quality at σ = 1.00 px. As a comparison, the case without PEC sees rapid image quality degradation starting from σ = 0.30 px, and the iterative Fresnel PEC method suffers from noise starting from σ = 0.40 px. The two image metrics are largely consistent with each other, with the exception of the iterative Fresnel PEC method at proximity effect levels σ = 0.90 px and 1.00 px. In this case, we think that the perceptual similarity represents the image quality better. Some examples of such mismatch between the two image metrics can be seen in the discussion in Section 4.5.2.1.
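For reference, both metrics are available in common Python packages; the following hedged sketch shows how a comparable evaluation could be scripted, assuming a recent scikit-image for SSIM and the lpips package (from the authors of [176]) for the perceptual metric. Note that lpips returns a distance (lower means more similar), whereas Figure 4.17 plots a similarity score, so some conversion is implied, and library defaults will not match our Matlab numbers exactly.

```python
import torch
import lpips                                    # pip install lpips; Zhang et al. [176]
from skimage.metrics import structural_similarity

_lpips_model = lpips.LPIPS(net='alex')          # loads the pretrained AlexNet-based metric

def image_metrics(reference, result):
    """reference, result: float RGB images in [0, 1] with shape (H, W, 3), as NumPy arrays."""
    ssim = structural_similarity(reference, result, channel_axis=-1, data_range=1.0)
    # lpips expects NCHW tensors scaled to [-1, 1] and returns a perceptual *distance*.
    to_tensor = lambda im: torch.from_numpy(im.transpose(2, 0, 1)[None]).float() * 2.0 - 1.0
    distance = _lpips_model(to_tensor(reference), to_tensor(result)).item()
    return ssim, distance
```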
4.5.1.2 3D Holograms

The Fresnel PEC method can effectively correct 2D Fresnel holograms formed at desired depths. This can be used in combination with the fast refresh rate of the NPA to display 3D holograms. A 3D scene can be divided into a stack of 2D images at different depths. The 2D Fresnel hologram of each image can be calculated and the proximity effect corrected with the Fresnel PEC method at the desired depth. The stack of holograms can be displayed on the NPA in consecutive frames so that the 3D scene can be observed.

We present simulations of such 3D holograms in Fig. 4.18. The volumes are from https://klacansky.com/open-scivis-datasets and http://www.sci.utah.edu/download/IV3DData.html. A 3D volume is divided into a stack of images whose 2D Fresnel holograms are calculated at their respective depths with the Fresnel PEC method. Those holograms are reconstructed by propagating the wavefronts formed at the holograms to a certain distance from the NPA that corresponds to the user's focal plane. The average of those reconstructions is taken as the observation. Holograms of 3D meshes and point clouds can be computed and displayed in a similar way. The user can focus on different parts of the 3D scene; we simulate this by adjusting the focal plane accordingly.

Figure 4.18: The 3D volume is divided into a stack of images at different depths. The Fresnel holograms of all the images are calculated with the Fresnel PEC method at their respective depths and displayed on the NPA in consecutive frames so that the 3D scene can be observed. We simulate the observation by reconstructing the holograms at a certain distance that corresponds to the user's focal plane. By adjusting the focal plane (for example, a closer focal plane in the left images and a farther focal plane in the right images), different parts of the scene (red or blue squares) are in focus. The top and bottom volumes are divided into 232 and 253 equidistant slices, respectively. The proximity effect level used is σ = 0.66 px.

4.5.2 Discussions

Here we discuss some limitations of our proposed FPEC method and propose several directions for future research.

4.5.2.1 Disagreement in Image Metrics

We observe in Figure 4.17 that there is some disagreement between the SSIM and the perceptual similarity (PS) at σ = 0.9 px and 1.0 px. In Figure 4.19 we show some examples where we think that the perceptual similarity is a more accurate representation of image quality than SSIM. These are results of the Iterative Fresnel PEC method. We believe the images formed with proximity effect σ = 0.9 px are better than those formed with the more severe proximity effect σ = 1.0 px. Even though a substantial amount of noise is present at both proximity effect levels, the noise in the σ = 0.9 px images is more granular, making those images relatively more desirable in our opinion. The PS metrics agree with our observation, while the SSIM metrics do not.

Figure 4.19: Two examples where perceptual similarity disagrees with SSIM. In the two examples, the (SSIM, PS) values are (0.64, 0.79) and (0.60, 0.52) at σ = 0.9 px, versus (0.69, 0.76) and (0.62, 0.47) at σ = 1.0 px. We observe that the results of the Iterative Fresnel PEC method at proximity effect σ = 0.9 px are of better quality than the results of the same method at proximity effect σ = 1.0 px. The perceptual similarity metric agrees with this observation while the SSIM disagrees.

4.5.2.2 Larger Proximity Effect

We have shown in our simulations that the proposed FPEC method is effective within the tested proximity effect level range (σ = 0.10 px to 1.00 px). While this is a large range within which our NPA design lies, it is not difficult to imagine cases where a more severe proximity effect may arise. For example, as the pixels on the NPA become more densely packed, each pixel can be expected to be affected more by nearby pixels. In Figure 4.17, we see that the image quality of the FPEC results starts to decrease at σ = 1.00 px. At higher proximity effect levels, the image quality drops further: at σ = 1.50 px and 2.00 px, the average SSIMs of the FPEC results are 0.57 and 0.34, while the average perceptual similarities are 0.69 and 0.48. Examples of the FPEC method at higher proximity effect levels can be seen in Figure 4.20.

Figure 4.20: The FPEC method at higher proximity effect levels (σ = 1.50 px and σ = 2.00 px), where heavy noise is observed.

4.6 Conclusion

Nanophotonic phased arrays have several advantages as holographic displays.
The integrated light source enables small form-factor integration, and the fast refresh rate is ideal for dynamic content. The use of an NPA in a near-eye display has been demonstrated in Notaros et al. [116] and other works. However, the use of a large-scale, dynamically modulated NPA as a holographic display has not been achieved. We propose two proximity effect correction methods to address the thermal proximity effect for Fourier holograms on NPAs. The PPEC method is the obvious choice, as it is faster and more effective in correcting the proximity effect at all proximity effect levels. The iterative proximity effect correction is effective at low proximity effect levels, and its correction effectiveness can be improved with the TDM or padding techniques. The IPEC method is also readily applicable to multi-plane scenes. However, more efficient deconvolution methods with non-negativity constraints are required for the IPEC method to be practical. We also propose the Fresnel proximity effect correction method, which can effectively correct the thermal proximity effect for 2D Fresnel holograms on NPAs when realistic levels of proximity effect are present. We show that the FPEC method can be used to display 3D holograms so that different parts of a 3D scene can be brought into focus. We believe that our approaches make a meaningful contribution to the emerging field of NPAs. We hope that they bring this highly interesting avenue for holographic displays closer to practical use.

Chapter 5: Fast Holographic Volume Rendering

5.1 Overview

Holographic displays and computer-generated holography (CGH) have been an active area of research in recent years. Holographic displays recreate the optical wave of the light coming from a scene, naturally addressing the vergence-accommodation conflict faced by conventional stereoscopic displays with a single or a limited number of focal planes. Varifocal and light-field displays also offer a wider range of focal distances. However, the former usually involve moving components, while the spatial and angular resolutions of the latter are limited by optical diffraction as the pixel density increases. On the contrary, the quality of CGH improves as the pixel density increases. Recent work has also proposed solutions to problems faced by holographic displays such as limited field of view (FOV) and miniaturization.

CGH has largely focused on 2D images or 3D scenes in the form of meshes or point clouds. The Fourier transform is a common and efficient method to calculate the hologram of a 2D image. Calculating the hologram of a 3D scene is computationally intensive and usually involves converting the scene into a collection of point light sources directly visible from the hologram and calculating the hologram following the Huygens-Fresnel principle. Modern computational hardware and acceleration methods have enabled real-time calculation of holograms from 3D meshes.

Figure 5.1: We propose an efficient way to compute the hologram of a 3D volume. We use the raycasting direct volume rendering method to sample a collection of point light sources and use the convolution theorem to accelerate hologram computation. The digital reconstructions of several such holograms are shown. Holographic volume rendering is ideal for visualizing volumetric data, such as 3D medical imaging data or physics simulations, on a holographic display.

Volumetric data is another type of data suitable for visualization.
Volumetric data from medical imaging, such as computed tomography (CT) and magnetic resonance imaging (MRI), can be rendered to reveal critical information about organs and tissues. Volumes from physics simulations are also used in the realistic rendering of fog, smoke, and so on. Volume visualization can benefit from the advantages of holographic displays. For simplicity, we shall refer to the holograms of volumetric data as volumetric holograms and to the practice of generating volumetric holograms and displaying them on holographic displays as holographic volume rendering.

A few challenges have prevented CGH from being applied to volumetric data. First, calculating a volumetric hologram is an order of magnitude more complicated than calculating that of a 3D mesh, as every voxel contributes to the perception of the volume instead of only those closest to the hologram. In the latest work on this topic, Lu and Sakamoto [179] find the color of each voxel and calculate the hologram using the positions and colors of those voxels. Every voxel contributes to each hologram pixel, and the computation takes more than three hours for a 1280 × 768 hologram. Second, a small FOV is a problem suffered by all digital holographic displays. This is caused by the limited diffraction angle of currently existing light modulators. Various methods to increase the FOV have been proposed for different holographic displays, such as in Maimone et al. [62], Shi et al. [1], and Jang et al. [149]. But the methods to increase the FOV need to be considered together with the hologram computation, as is the case with volumetric holograms. Finally, the choice of the illuminating light source is also critical to the observed image of a holographic display. A light source with a narrow spectrum, such as a laser, can produce images of high resolution but also leads to heavy speckles, while a light source with a wider spectrum, such as an LED, reduces speckles at the cost of lower resolution.

In this chapter, we address each of the three challenges faced by holographic volume rendering mentioned above. To address the first challenge, we propose an accelerated method to compute volumetric holograms. This is primarily achieved by treating the summation in the Huygens-Fresnel principle as a convolution and accelerating it with the convolution theorem. Maimone et al. [62] recently noted that convolution can be used to calculate the hologram of a 2D image. We take this idea one step further and combine it with an elegant way to sample the volume based on raycasting direct volume rendering (DVR) to achieve significant acceleration compared to the brute-force method. To address the second challenge, we propose a holographic display configuration that increases the FOV. Special attention is paid to making sure the display is compatible with the accelerated volumetric hologram computation method. And finally, we investigate what kind of light source strikes the balance between speckles and image resolution for holographic volume rendering.

To validate the holograms computed using our method, we present the digital reconstruction of the volumetric holograms. We also realistically simulate the observed image on our display configuration with different light sources to identify the light source that best balances speckles and resolution. The light source customization is implemented on the optics bench and validated experimentally. In summary, we improve holographic volume rendering in the following ways:
• We develop an accelerated hologram computation method for volumetric data. Our method samples the volume based on raycasting direct volume rendering and uses the convolution theorem to significantly improve the computation speed.

• We propose a configuration that increases the FOV of the holographic display while maintaining compatibility with the accelerated hologram computation method.

• We identify an appropriate light source for holographic volume rendering through realistic simulation.

5.2 Related Work

In this section, we give an overview of the existing literature on several topics related to this work. These include volume visualization and computer-generated holography based on 2D images, 3D meshes, and volumetric data.

5.2.1 Volume Visualization

Traditional volume rendering methods can be classified into two groups: surface fitting and direct volume rendering. We briefly cover some of the basics and the advantages and disadvantages of both groups. For a more thorough review, we refer readers to the survey by Elvins [180].

Surface fitting methods find a surface or some other geometric element that is representative of a meaningful object in the volume and visualize it. One such method is contour-connecting [181, 182], where a contour matching a specified threshold value is selected on each slice of the volume and the contours from neighboring slices are connected to form a mesh. This mesh is then rendered. The marching cubes algorithm [183] is another famous method in the surface fitting group. In this method, the voxels whose values straddle a specified threshold are converted into a mesh to be rendered. Improvements have been made to reduce the ambiguity in how the corners of the voxels are connected to form triangles [184, 185].

While the surface fitting methods are fast and able to provide good visualizations for medical data, one major disadvantage is that they depend on a pre-specified threshold value and only the voxels matching that value are visualized. Direct volume rendering, on the other hand, can make use of all the information in the volume. Raycasting [186-188] is the most widely used direct volume rendering technique. In raycasting, rays are shot from each image pixel into the volume. The color and opacity associated with the ray are composited at each sampling point along the ray. The point light source sampling process in our proposed hologram calculation method is also based on raycasting. A detailed description and our modification are presented in Section 5.3.1. Another direct volume rendering method is splatting, where all the voxels in the volume are traversed, with the voxels closer to the image plane being processed before those farther away. When processing a voxel, a region around the projection of the voxel onto the image plane is determined in which the contribution of this voxel to the image is defined. Different ways for voxels to contribute to the image (known as footprints) have been proposed [189, 190]. The colors and opacities of the voxels are blended into the image in their respective contribution regions. Recently, a generative model for volume rendering has been proposed by Berger et al. [191], where a generative adversarial network (GAN) interactively helps the user craft the transfer function. With the advancement of graphics processors and raytracing techniques, photorealism can be achieved with physics-based volume rendering [192].
5.2.2 Computer-Generated Holography

Various methods have been proposed to calculate the hologram in CGH. For 2D images, Lohmann et al. [151] show that a binary representation of the Fourier hologram can achieve good reconstruction quality. Iterative multi-plane methods (Makowski et al. [193]) have been proposed to allow images to be reconstructed at different depth locations.

To generate holograms based on 3D objects, the object is often converted to a collection of point light sources and the complex wavefronts from all the light sources are summed up. Directly computing the summation from every point is computationally intensive. Lucente [153] uses a look-up table of the interference contributions from points at each location in the image space and achieves a 43× speedup. Masuda et al. [194] build graphics hardware specifically for calculating holograms. Matsushima et al. [195] calculate an approximated hologram by reusing the distance between a light source and a hologram pixel for nearby hologram pixels. The 3D scene can be seen as a collection of planar light sources as well. Matsushima et al. [196] create high-definition holograms of complicated 3D objects using a polygon-based CGH method to simplify the calculation.

Stereograms are small regions of a hologram. They each record the diffraction pattern of a portion of the scene instead of the entire scene and are used in Lucente and Galyean [155], Xu et al. [156], and Shi et al. [1] to reduce the amount of computation. In addition, Ziegler et al. [159] develop a conversion technique between the light-field representation and the holographic representation of a scene. Padmanaban et al. [146] use this conversion to develop the overlap-add stereogram (OLAS) algorithm to overcome the trade-off between spatial and angular resolution in traditional light-field displays. Ways to add occlusion or change lighting in a hologram have also been explored in Ziegler et al. [160].

CGH based on volumetric data is a less-studied area. Sakamoto et al. [197] calculate the hologram of a volume by propagating the slices one by one to the hologram plane using Fresnel propagation. Lu and Sakamoto [198] use two methods to calculate holograms from a volume. In the first, maximum intensity projection is used to find a surface of point light sources from which a Fresnel hologram is calculated. In the second, the volume is converted into a polygonal model using the marching cubes algorithm and the hologram is calculated based on the polygonal model. Both of these methods have the same disadvantage as the surface-fitting methods: they only visualize a surface in the volume instead of its entirety. In their later work, Lu and Sakamoto [179] find the color of each voxel considering the intensity decay as light travels through other voxels. While this method does not require the hologram to be parallel to one of the volume's faces, as in [197], it is based on brute-force summation and may take hours to compute. Our proposed method significantly accelerates the calculation, offering over a 2000× speedup.

Figure 5.2: The general placement of the volume, the hologram, the illuminating light (planar wave), and the observer. The hologram is calculated as the summation of the wavefronts from the collection of spherical point light sources in the volume.
5.3 Algorithm and Display

We base our holographic volume rendering on the heuristics of the Huygens-Fresnel principle, which treats the object (the volume in our case) as a collection of spherical point light sources. Let x be a pixel on the hologram, v be a point light source, and r(x, v) be the distance between them. A(v) and φ(v) are the amplitude and initial phase of the light emitted from v, λ is the wavelength, and k = 2π/λ is the wave number. With the placement of the volume, the hologram, the illuminating light, and the observer arranged as in Figure 5.2, the electric field E(x) at pixel x can be expressed as in Equation 5.1. (The magnetic part of the electromagnetic wave is usually ignored in computer-generated holography; see [123] for details.)

E(x) = Σ_v [A(v) / r(x, v)] e^{i(k r(x,v) + φ(v))}    (5.1)

The sign of the phase term k r(x, v) in Equation 5.1 is irrelevant as long as a consistent coordinate system is used throughout the calculation.

Given the Huygens-Fresnel-principle-based hologram computation, steps must be taken to generate the point light sources and determine their amplitude and initial phase values before the hologram computation can proceed. We cover in Section 5.3.1 how we use direct volume rendering to sample those point light sources. In Section 5.3.2, we describe how to convert the summation in Equation 5.1 into a convolution to significantly accelerate the hologram calculation.

5.3.1 Direct Volume Rendering Sampling

We use raycasting direct volume rendering and front-to-back compositing to generate the positions as well as the colors of the point light sources. An illustration can be seen in Figure 5.3(a).

Figure 5.3: (a) illustrates the direct volume rendering method used for sampling the point light sources. Front-to-back compositing is used in our proposed method (b). The position and color at each compositing step in the direct volume rendering are used for the hologram calculation.

Rays are cast from the observer's eye position into the volume. Compositing is performed inside the volume along the rays' directions (front to back). At compositing step i (illustrated as p_0 to p_3 in Figure 5.3(b)), a pre-specified transfer function is used to derive a color c_i and opacity value α_i associated with the volume scalar value at location p_i. Assuming the composited color and opacity before compositing step i are c_i^in and α_i^in, and the composited color and opacity after compositing step i are c_i^out and α_i^out, their relations are as follows.

c_i^out = c_i^in + (1 − α_i^in) α_i c_i    (5.2)
α_i^out = α_i^in + (1 − α_i^in) α_i    (5.3)
c_{i+1}^in = c_i^out    (5.4)
α_{i+1}^in = α_i^out    (5.5)

The compositing usually stops when the ray exits the volume or when a termination condition is satisfied (for example, the composited opacity exceeding a certain value). In traditional volume rendering, we are interested in the final composited color c_{n−1}^out, assuming n compositing steps. As shown in Equation 5.6, the final composited color is a linear combination of the color at each compositing step.

c_{n−1}^out = Σ_{i=0}^{n−1} T_i α_i c_i    (5.6)
T_i = T_{i−1}(1 − α_{i−1}) = Π_{j=1}^{i} (1 − α_{j−1})    (5.7)
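The weights T_i α_i in Equation 5.6 are exactly what we attach to each sampled point light source. Before giving the full pseudocode in Algorithm 3 below, the following sketch walks one ray through front-to-back compositing and emits (position, weighted color, initial phase) tuples. The transfer-function lookup is a placeholder, and the loop mirrors Equations 5.2-5.7 rather than our GPU implementation.

```python
import numpy as np

def sample_ray(origin, direction, t0, step, num_steps, lookup, phase0):
    """Front-to-back compositing along one ray (Equations 5.2-5.7).

    lookup(p) is a placeholder transfer-function sampler returning (color, opacity) at point p.
    Returns a list of (position, weighted_color, initial_phase) point light sources.
    """
    sources = []
    c_acc = np.zeros(3)   # composited color (the c^in of the next step)
    a_acc = 0.0           # composited opacity (the alpha^in of the next step)
    p = origin + t0 * direction
    for _ in range(num_steps):
        c_i, a_i = lookup(p)
        weight = (1.0 - a_acc) * a_i                 # equals T_i * alpha_i in Equation 5.6
        sources.append((p.copy(), weight * c_i, phase0))
        c_acc = c_acc + weight * c_i                 # Equation 5.2
        a_acc = min(a_acc + (1.0 - a_acc) * a_i, 1.0)   # Equation 5.3, clamped to 1
        if a_acc >= 1.0:                             # early termination once fully opaque
            break
        p = p + step * direction
    return sources
```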
The pseudocode for the raycasting direct volume rendering is presented in Algorithm 3. It shows the compositing steps for a ray r with origin o_r, direction d_r, and step size s_r. t_0^r indicates the distance from the ray's origin at which the compositing starts. D is the maximum number of compositing steps. C and A are the color and opacity textures derived from the volume and its transfer function. We use Sampler3D to sample C and A and find the color and opacity at each compositing step. The outputs are the color contribution C_j^r and the position p_j^r at each compositing step.

Algorithm 3: Raycasting Direct Volume Rendering Sampling
    input : o_r, d_r, s_r, t_0^r, D, C, A
    output: C_j^r, p_j^r for j = 0, 1, ..., D − 1
    p = o_r + t_0^r · d_r
    α_r = 0
    for j ← 0 to D − 1 do
        p_j^r = p
        c_p = Sampler3D(C, p)
        α_p = Sampler3D(A, p)
        C_j^r = (1 − α_r) α_p c_p
        α_r = min(α_r + (1 − α_r) α_p, 1)
        p = p + s_r · d_r
    end

For our holographic volume rendering, the position p_i and the composited color T_i α_i c_i of each compositing step of each ray are used to calculate the hologram in Equation 5.1. Each ray is associated with a random initial phase, which is used for all the point light sources generated on that ray. We refer to the brute-force way of calculating the hologram according to Equation 5.1 as the direct method.

5.3.2 Acceleration using the Convolution Theorem

Assuming a total of R_x × R_y rays are cast and each ray has a maximum of D compositing steps, direct volume rendering generates O(D R_x R_y) point light sources. Given a hologram dimension of H_x × H_y, the complexity of the direct calculation method is O(D R_x R_y H_x H_y). The pseudocode of the direct calculation method is shown in Algorithm 4. φ_0^r is the random initial phase assigned to each ray r. The three color channels of the hologram are calculated separately with different wavelengths λ. The function HgPos(x, y) finds the 3D position of the hologram pixel (x, y), and the function dist(p, q) finds the distance between the two points p and q. The output is the complex hologram E(x, y).

Algorithm 4: Direct Holographic Volume Rendering
    input : C_j^r, p_j^r, φ_0^r, R_x, R_y, D, λ, H_x, H_y
    output: E(x, y)
    for x ← 0 to H_x − 1 do
        for y ← 0 to H_y − 1 do
            E(x, y) = 0
            p_{x,y} = HgPos(x, y)
            for r ← 0 to R_x R_y − 1 do
                for j ← 0 to D − 1 do
                    E(x, y) = E(x, y) + C_j^r · (1 / dist(p_{x,y}, p_j^r)) · e^{i((2π/λ) dist(p_{x,y}, p_j^r) + φ_0^r)}
                end
            end
        end
    end

As we will show in Section 5.5, this complexity leads to hologram calculation times that span hours for holograms that match the resolution of state-of-the-art SLMs. We next present a method to significantly accelerate this calculation.

Equation 5.1 can be rewritten as follows.

E(x) = Σ_v A(v) e^{iφ(v)} · e^{ik r(x,v)} / r(x, v)    (5.8)

If the point light sources are sampled on planes parallel to the hologram, E(x) can be seen as the summation of a series of integrals over the collection of sampling distances z.

E(x, y) = Σ_z E_z(x, y)    (5.9)
E_z(x, y) = ∫∫ A_z(ξ, η) e^{iφ_z(ξ,η)} · e^{ik r_z(x−ξ, y−η)} / r_z(x − ξ, y − η) dξ dη    (5.10)

Here x = (x, y) refers to the hologram pixel position, A_z(ξ, η) and φ_z(ξ, η) are the amplitude and initial phase of the sampled point light source at position (ξ, η) at sampling distance z, and r_z(m, n) = sqrt(z^2 + m^2 + n^2). The integral in Equation 5.10 can be seen as the convolution shown below, and this convolution can be calculated using the convolution theorem for a significant speedup. ∗ is the convolution operator.

E_z = C_z ∗ K_z    (5.11)
C_z(ξ, η) = A_z(ξ, η) e^{iφ_z(ξ,η)}    (5.12)
K_z(m, n) = e^{ik r_z(m,n)} / r_z(m, n)    (5.13)

To take advantage of this speedup, two changes need to be made to the direct volume rendering sampling process. First, the compositing step sizes along each ray must be adjusted so that the j-th compositing steps of all the rays lie on the same plane (referred to as a compositing plane), parallel to the hologram and at a distance z = d_j from the hologram.
As a result, the compositing step sizes are non-uniform across the rays, and opacity correction is needed. Second, a resampling step is needed to make sure that the amplitude A_z(ξ, η) and phase φ_z(ξ, η) at each compositing plane have the same sampling pitch as the hologram.

We can use the convolution theorem and the fast Fourier transform to calculate the convolution in Equation 5.11 efficiently. F in Equation 5.14 is the Fourier transform. We refer to this way of calculating the hologram as the convolution theorem (CT) method.

F(E_z) = F(C_z) · F(K_z)    (5.14)

The pseudocode for calculating the hologram using the CT method is presented in Algorithm 5. The function Resample(C_j^r, φ_0^r, Δ, H_x, H_y) resamples the C_j^r and φ_0^r of all rays at compositing step j into a complex C_{d_j} that has the same spacing (Δ) as the hologram. The function ConvolutionKernel calculates the convolution kernel K_{d_j} based on the distance d_j of the j-th compositing plane, the wavelength, and the hologram. Both of these functions have complexity O(H_x H_y). The total complexity is O(D H_x H_y log(H_x H_y)), assuming that the fast Fourier transform is used and that both C_{d_j} and K_{d_j} have dimensions on the order of H_x × H_y.

Algorithm 5: Convolution Theorem Holographic Volume Rendering
    input : C_j^r, φ_0^r, D, λ, Δ, H_x, H_y
    output: E
    E = 0
    for j ← 0 to D − 1 do
        C_{d_j} = Resample(C_j^r, φ_0^r, Δ, H_x, H_y)
        K_{d_j} = ConvolutionKernel(d_j, λ, Δ, H_x, H_y)
        E = E + F^{-1}(F(C_{d_j}) · F(K_{d_j}))
    end
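The CT method amounts to one FFT-based convolution per compositing plane. The following NumPy sketch illustrates that structure under simplifying assumptions (a single wavelength, source planes already resampled onto the hologram grid, and no padding or band-limiting of the kernel); it is a readable reference, not the CUDA implementation evaluated in Section 5.5.

```python
import numpy as np

def ct_hologram(C_planes, depths, wavelength, pitch):
    """Convolution-theorem hologram (Algorithm 5, Equations 5.11-5.14).

    C_planes : list of complex (H, W) arrays, one resampled source plane per depth
    depths   : distances d_j of the compositing planes from the hologram (same length)
    wavelength, pitch : in the same length unit as depths
    """
    H, W = C_planes[0].shape
    k = 2.0 * np.pi / wavelength
    # Physical coordinates of the kernel samples, centered on the hologram.
    y = (np.arange(H) - H // 2) * pitch
    x = (np.arange(W) - W // 2) * pitch
    X, Y = np.meshgrid(x, y)
    E = np.zeros((H, W), dtype=complex)
    for C_z, z in zip(C_planes, depths):
        r = np.sqrt(z * z + X * X + Y * Y)
        K_z = np.exp(1j * k * r) / r                   # spherical-wave kernel, Equation 5.13
        K_z = np.fft.ifftshift(K_z)                    # place the kernel origin at index (0, 0)
        E += np.fft.ifft2(np.fft.fft2(C_z) * np.fft.fft2(K_z))   # Equation 5.14
    return E
```

In practice the arrays would be zero-padded (and the kernel band-limited) before the FFTs to avoid wrap-around artifacts from the implicit circular convolution.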
5.4 Increase the FOV

If we illuminate the SLM with a planar wave as illustrated in Figure 5.2, the FOV the observer can expect depends on the maximum diffraction angle θ, assuming that the illumination's incident angle is 0°. In this case, θ = asin(λ/(2Δ)), where Δ is the SLM's pixel pitch and λ is the wavelength. The FOV is therefore 2θ, as illustrated in Figure 5.4. Assuming a state-of-the-art SLM pixel pitch of 3.74 μm and a red laser wavelength of 636 nm, the FOV is 9.76°.

A common method to increase the FOV is to use spherical wave illumination. Sakamoto et al. [198, 199] and Shi et al. [1] place a lens behind the hologram to form the spherical wave illumination. The hologram is within the focal length f_el of the external lens, as illustrated in Figure 5.4 (bottom). When the planar wave illuminates the hologram and converges around the focal point of the external lens, it can be seen as a spherical wave centered at the focal point of the external lens illuminating a magnified virtual hologram. In this configuration, the FOV is enlarged to 2 atan(w/(2f_el)), with the viewer's eye placed at the focal point of the external lens. We refer to this configuration as spherical wave illumination.

Figure 5.4: The FOV under planar wave illumination (top) is 2 asin(λ/(2Δ)). Spherical wave illumination (bottom) can be used to increase the FOV by placing a lens closely behind the hologram. While the FOV can be increased to 2 atan(w/(2f_el)), where w is the side length of the hologram and f_el is the focal length of the external lens, this configuration is not compatible with our proposed method of hologram computation because the diffraction cones are oriented in different directions.

However, the spherical wave illumination is not compatible with our proposed convolution theorem method of calculating the hologram. This is because, when calculating the hologram using the Huygens-Fresnel principle, the point light sources v that contribute to the hologram position x must lie within the diffraction cone of the hologram at x under the specific illumination, or aliasing noise results. This is true both for the real-image light sources to the right of the hologram and for the virtual-image light sources to the left of the hologram in Figure 5.4. The aliasing artifacts of sampling outside the diffraction cones have been discussed in [1]. Under planar illumination, the diffraction cones are the same everywhere on the hologram, which allows the integral in Equation 5.10 to be treated as the convolution in Equation 5.11. However, under spherical wave illumination, the diffraction cones differ at different positions on the hologram. An example can be seen in Figure 5.4 (bottom), where the diffraction cones of the top and bottom pixels on the virtual SLM are oriented in different directions. Diverging spherical wave illumination is also used in Maimone et al. [62] to increase the FOV and is incompatible with our proposed method for similar reasons.

Alternatively, we use a different display configuration to improve the FOV. As illustrated in Figure 5.5, the hologram is farther away from the external lens than its focal length f_el. The viewer's eye is placed at or to the right of the focal point on the other side of the external lens. Similar to the spherical wave illumination, the FOV is 2 atan(w/(2f_el)). The diffraction cones are the same across the hologram, allowing us to use the convolution theorem method. The volume is placed within the focal length of the external lens on the same side as the hologram. Therefore, a magnified virtual image of the volume is observed by the viewer.

Figure 5.5: We increase the FOV while keeping the configuration compatible with our accelerated hologram computation method by placing the external lens farther away from the hologram than its focal length. The volume is placed within the focal length of the external lens. The magnified virtual image of the volume (bounded by dashed lines) is observed.

In addition to the compatibility with our proposed calculation method, another reason the volume magnification approach works better in practice is that there is a limit on how close the external lens can be to the hologram. Due to the reflective nature of most commercially available high-resolution SLMs, a beam splitter (often cube-shaped) is usually inserted between the SLM and the external lens. The external lens itself also has a thickness that cannot be ignored. As a result, the focal length of the external lens in the spherical wave illumination method is larger than what can be used in our configuration.

We have used a spherical lens to represent the external lens in our illustrations. It is usually difficult to manufacture spherical lenses with a focal length smaller than 10 mm, and the smaller the focal length, the thicker the lens. A growing trend in optical displays is to use holographic optical elements (HOEs). A holographic plate can be created that mimics the function of a regular lens for the wavelengths of light used in the display. This way, smaller focal lengths can be achieved.
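As a quick numeric check of the two field-of-view expressions used in this section, the short script below evaluates them for the pixel pitch and wavelength quoted above; the hologram side length and external-lens focal length in the second calculation are assumed values, not a statement about our prototype.

```python
import math

wavelength = 636e-9        # red laser, m
pitch = 3.74e-6            # SLM pixel pitch, m

# Planar-wave illumination: FOV = 2 * asin(lambda / (2 * pitch))
fov_planar = 2.0 * math.degrees(math.asin(wavelength / (2.0 * pitch)))
print(f"planar-wave FOV: {fov_planar:.2f} deg")      # about 9.76 deg

# Lens-based configuration: FOV = 2 * atan(w / (2 * f_el)), with assumed w and f_el
w = 2048 * pitch           # hologram side length for a 2048-pixel SLM (assumed)
f_el = 20e-3               # external lens focal length, as used in Section 5.5.2.3
fov_lens = 2.0 * math.degrees(math.atan(w / (2.0 * f_el)))
print(f"lens-based FOV:  {fov_lens:.2f} deg")
```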
Finally, it is important to note that the FOV discussed in this section refers to the theoretical maximum FOV within which light diffracted from the hologram enters the eye. The FOV spanned by the scene also depends on the size of the scene and the distance between the viewer and the image of the scene. We discuss this in detail in Section 5.6.2.

5.5 Results

In this section, we present both the simulated results of our proposed holographic volume rendering and the realistic simulation of the observed image with different types of light sources. First, the digital reconstruction of the holograms is carried out in Section 5.5.1 to validate our proposed convolution theorem method. Realistic simulations based on our proposed display configuration with different light sources are shown in Section 5.5.2. Finally, we report and analyze the speedup of our proposed hologram calculation method in Section 5.5.3.

5.5.1 Digital Reconstruction

First, we show some of the volumes used in our tests (six volumetric datasets selected from https://klacansky.com/open-scivis-datasets and http://www.sci.utah.edu/download/IV3DData.html) and the digital reconstructions from their holograms in Figure 5.6. These datasets are visualized with raycasting direct volume rendering and front-to-back compositing. 1024 × 1024 rays are cast (R_x = R_y = 1024). The holograms' dimensions are 2048 × 2048 (H_x = H_y = 2048). The pixel pitch used is Δ = 3.74 μm.

The digital reconstruction is done by propagating the complex wavefront from the position of the hologram back to the position of the volume. The propagation distance determines what depth in the volume is in focus. The propagation of the hologram is performed with the wavelengths the hologram is computed with. Speckles are reduced by averaging the digital reconstructions of holograms calculated with different initial phases. Realistic simulations of the observed images are presented in Section 5.5.2. It can be seen that the holograms calculated using the convolution theorem method can successfully be digitally reconstructed to visualize the volume, indicating that the convolution theorem method correctly encodes the volume into a hologram.

Figure 5.6: Direct volume rendering results of two volumetric datasets and the digital reconstruction of the holograms calculated with the convolution theorem method. We use orthogonal raycasting in the direct volume rendering because it most resembles the digital reconstruction by wavefront propagation.

5.5.2 Realistic Simulation

While the digital reconstruction is useful in showing that the generated hologram correctly encodes the spatial and color information of the volume, it does not represent what the viewer actually sees. The configuration of the display and the light source used to illuminate the hologram both affect the viewer's observation. We take these factors into consideration in a realistic simulation of the viewer's observation. We also show simulations of what the viewer would see when he or she focuses at different depths of the volume or changes the perspective.

5.5.2.1 Configuration of the Display

The configuration of the display is shown in Figure 5.5. The volume is placed d_V to the right of the SLM. Unless otherwise specified, we refer to the position of the volume as the position of the center of the volume. The external lens is placed d_el to the right of the SLM, which puts it s_0 = d_el − d_V to the right of the volume. As mentioned before, s_0 < f_el so that a magnified virtual image of the volume can be observed. The observer's eye is placed to the right of the external lens.
The lens of the eye, which we will refer to as the eye lens, is d_eye away from the external lens. The retina is d_r away from the eye lens. Depending on the depth in the virtual volume the observer is focusing at, the focal length f_eye of the eye lens is chosen so that 1/f_eye = 1/(s_1 + d_eye) + 1/d_r, where s_eye = s_1 + d_eye is the focus distance of the eye lens. We can adjust d_eye so that s_eye is a comfortable distance for the user to focus at.

The realistic simulation is performed following the display configuration in Figure 5.5. The wavefront E at the hologram is propagated by a distance of d_el to the external lens. The wavefront then goes through the external lens and propagates by a distance of d_eye to the eye lens. Finally, the wavefront goes through the eye lens and propagates to the retina. The intensity of the wavefront at the retina is taken as the result. The pseudocode is shown in Algorithm 6. Prop(E, λ, Δ, d) propagates the wavefront E with wavelength λ and sampling pitch Δ by a distance d. Lens(E, λ, Δ, f, r) applies the transfer function of a lens with focal length f and radius r to the wavefront E with wavelength λ and sampling pitch Δ. In practice, λ is the wavelength of the illuminating light source and Δ is the pixel pitch of the hologram. r_el and r_eye are the radii of the external lens and the iris of the eye, respectively. The implementations of Prop and Lens follow the Rayleigh-Sommerfeld diffraction solution and the lens function described in Voelz [6].

Algorithm 6: Realistic Simulation
    input : E, λ, Δ, d_el, f_el, r_el, d_eye, f_eye, r_eye, d_r
    output: I
    E_1 = Prop(E, λ, Δ, d_el)
    E_2 = Lens(E_1, λ, Δ, f_el, r_el)
    E_3 = Prop(E_2, λ, Δ, d_eye)
    E_4 = Lens(E_3, λ, Δ, f_eye, r_eye)
    E_5 = Prop(E_4, λ, Δ, d_r)
    I = |E_5|^2

5.5.2.2 Light Source

The spectrum of the light source is commonly expressed as a Cauchy distribution, as in Equation 5.15. I(λ; λ_0, γ, I_0) is the intensity of the light source at wavelength λ given the parameters of the distribution. λ_0 is the center wavelength of the spectrum, at which the intensity is at its maximum I_0. The intensity drops to half of the maximum at a wavelength γ away from the center wavelength. γ characterizes the full-width half-max (FWHM), which represents the spectral width of the light source.

I(λ; λ_0, γ, I_0) = I_0 / [1 + ((λ − λ_0)/γ)^2]    (5.15)

Figure 5.7: The spectra of the laser and SLED light sources.

Different light sources have different spectra. A diode laser typically has an FWHM near 1.4 nm when it functions as the light source for a holographic display. Commercially available SLEDs have FWHMs ranging from 3 nm to 15 nm. LEDs have even larger FWHMs. An example of the RGB laser and SLED light sources can be seen in Figure 5.7.

To find out what kind of light source works well for holographic volume rendering, we perform realistic simulations using four different kinds of light sources that correspond to a laser, a narrow-spectrum SLED, an SLED, and an LED. The FWHMs of these light sources are listed in Table 5.1.

Table 5.1: Spectra of the light sources used in the realistic simulation.
    Light Source    FWHM (nm)
    Laser           1.4
    Narrow SLED     5
    SLED            10
    LED             20

We take the range of wavelengths with intensity greater than 10% of the peak intensity as the wavelengths used in the realistic simulation for each color. The ranges are marked in their respective colors in the examples in Figure 5.7. We sample the spectrum at a regular interval Δλ = 1.0 nm within each range. In the realistic simulation, the wave propagation from the hologram to the retina described in Algorithm 6 is performed with each sampled wavelength. The weighted average of the intensity is taken as the result of the realistic simulation, with the weights being the spectral intensities of the sampled wavelengths.
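The sketch below shows one way to implement this spectral averaging, assuming a simulate_retina(E, wavelength) helper that runs Algorithm 6 for a single wavelength (a hypothetical placeholder here). The 10%-of-peak cutoff and the 1 nm sampling interval follow the description above.

```python
import numpy as np

def cauchy_spectrum(center_nm, gamma_nm, peak=1.0, step_nm=1.0, cutoff=0.1):
    """Sample the Cauchy spectrum of Equation 5.15 where the intensity exceeds cutoff * peak."""
    half_range = gamma_nm * np.sqrt(1.0 / cutoff - 1.0)    # I(lambda) = cutoff * peak at this offset
    wavelengths = np.arange(center_nm - half_range, center_nm + half_range + step_nm, step_nm)
    weights = peak / (1.0 + ((wavelengths - center_nm) / gamma_nm) ** 2)
    return wavelengths, weights

def spectral_observation(E, center_nm, gamma_nm, simulate_retina):
    """Weighted average of per-wavelength retinal intensities for one color channel."""
    wavelengths, weights = cauchy_spectrum(center_nm, gamma_nm)
    result = None
    for lam, w in zip(wavelengths, weights):
        I = simulate_retina(E, lam * 1e-9)      # Algorithm 6 for a single wavelength (placeholder)
        result = w * I if result is None else result + w * I
    return result / weights.sum()
```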
5.5.2.3 Simulations

The realistic simulations of three holograms are shown in Figure 5.8. The holograms are calculated with R_x = R_y = 1024, H_x = H_y = 2048, and a pixel pitch of Δ = 3.74 μm. In the realistic simulation, d_el = 45.56 mm, f_el = 20 mm, and d_eye = 50 mm. This way, the focus distance s_eye = 120 mm. The iris radius r_eye = 1.5 mm is typical of an eye under normal illumination. The retina distance d_r = 24 mm is typical of an adult.

Figure 5.8: The realistic simulations of three holograms with different light sources (laser, narrow SLED, SLED, and LED).

We can see that with increasing spectral width, the speckle noise becomes less severe. However, the images also become less sharp. Judging from direct observation of Figure 5.8, we conclude that the spectral width of a narrow SLED (FWHM of 5 nm) offers the best balance between speckle noise and image sharpness.

The weighted intensity from the realistic simulation is a 2D real matrix with arbitrarily large maximum values. To show the realistic simulations as images in Figure 5.8, we normalize the weighted intensity to [0, 1] and then perform gamma correction. Let the weighted intensity be I_w; the image S is effectively the following.

S = (I_w / max(I_w))^{0.5}    (5.16)

Due to speckle noise, there are a small number of bright spots in the results of the realistic simulations. As the speckle noise becomes less severe with increasing spectral width, the maximum value in I_w decreases. As a result, the images obtained from Equation 5.16 become increasingly brighter with increasing spectral width. In reality, the power of the light source determines the brightness of the observed image, and the power of the light source is usually limited by the integrated intensity that can safely enter the observer's eyes. Therefore, assuming the same light source power, the brightness of the observations should be similar. We scale the intensities of the images in Figure 5.8 so that they have the same total intensity as the narrow SLED for each channel. As a result, the images for the SLED and LED are undersaturated, while the images for the laser are oversaturated by no more than 25%.

5.5.2.4 Focus at Different Depths

Next, we show the realistic simulation of focusing at different depths in the volume in Figure 5.9. As can be seen, by adjusting the focal length of the eye lens, parts of the volume at different depths appear to be in focus.

Figure 5.9: By controlling the focal length of the eye lens, the viewer is able to focus at different depths of the volume.

5.5.3 Speedup

As mentioned in Section 5.3.2, our proposed hologram calculation method using the convolution theorem reduces the complexity from O(R_x R_y D H_x H_y) of the direct method to a more manageable O(D H_x H_y log(H_x H_y)). All calculations are performed on a PC with an Intel 9900K CPU and an Nvidia RTX 2080Ti GPU. The implementation is in C++ and CUDA. We show the speedup of using the convolution theorem to accelerate the holograms of several volumes in Table 5.2.
The hologram computation time mentioned in this work includes both the point light source sampling using raycasting direct volume rendering and the calculation of Equation 5.1 from the sampled light sources, either with the direct method or with the convolution theorem method.

Table 5.2: Hologram computation time and speedup for several volumetric datasets. These holograms are calculated with H_x = H_y = 2048 and R_x = R_y = 1024.

    Volume    d_V      D     Direct    CT          Speedup
    Tooth     20 mm    90    2.89 h    5.09 s      2049×
    C60       30 mm    58    1.80 h    11.56 s     561×
    Teapot    40 mm    253   7.87 h    64.20 s     441×
    Engine    50 mm    116   3.64 h    27.73 s     472×
    Hand      60 mm    290   9.30 h    82.95 s     403×
    Bonsai    70 mm    232   7.14 h    129.93 s    198×

As can be seen, our proposed method achieves speedups ranging from hundreds to thousands of times over the brute-force direct calculation method on the same hardware. In a simulated comparison with previous work, our proposed method takes 5.04 s, compared to 3.6 h in [179], to calculate a hologram of the same dimensions with the same number of point light sources. This is roughly a 2500× speedup, including the hardware difference.

It can also be seen that the speedup is not uniform across the six test cases. Generally speaking, the closer the volume is to the hologram, the larger the speedup. We explain the factors affecting the processing time in Section 5.6.1.

Figure 5.10: The top row shows that the computation time increases linearly with the number of compositing steps D for fixed hologram dimensions (H_x = H_y = 2048). The middle row shows that the computation time increases with increasing hologram dimensions, with D = 232. d_V = 30 mm for both tests. The bottom row shows how the computation time changes with the convolution size. The convolution size is increased by placing the volume farther away from the hologram, thus increasing the sizes of K_z and C_z.

5.6 Discussion

Here we discuss several details of our holographic volume rendering method, including the hologram computation time, the observed scene FOV, the use of elemental holograms, and wavelength-specific interference artifacts.

5.6.1 Factors of Hologram Computation Time

Here, we closely examine the different factors that affect the computation time of our proposed method. The complexity of calculating the hologram using the CT method is O(D H_x H_y log(H_x H_y)), assuming the fast Fourier transform (FFT) implementation and that the arrays in the convolution are both of dimensions H_x × H_y. We use the cuFFT library (https://developer.nvidia.com/cufft) in CUDA, which is based on the Cooley-Tukey and Bluestein algorithms.

We show how the computation time changes with an increasing number of compositing steps D and with the hologram dimensions (H_x and H_y) in the top and middle rows of Figure 5.10. As can be seen, when all the other parameters are kept the same, the computation time increases linearly with the number of compositing steps. Even though the computation time does not scale exactly as the complexity suggests with increasing hologram dimensions, it does increase with the hologram dimensions.

Another factor that affects the computation time of our method, as can be seen in Table 5.2, is the distance between the hologram and the volume, d_V. In our implementation, the 2D signal undergoing the FFT has the combined dimensions of K_z and C_z for the compositing plane at distance z from the hologram. In Section 5.3.2, we assume that both C_z and K_z have dimensions of the same order of magnitude as the hologram. This is generally true for the area where we place the volume.

In practice, the sizes of C_z and K_z both increase with z, as illustrated in Figure 5.11. We sample the point light sources in the space that the wavefront from the SLM can diffract to. This space is the union of all the diffraction cones of the SLM pixels. The computation time with increasing convolution size (the number of elements undergoing the FFT) can be seen in the bottom row of Figure 5.10. A large convolution size also leads to heavy usage of VRAM in this implementation. There are several outliers where the computation time does not scale with the convolution size. This may be because we did not pad the arrays undergoing the FFT to a power-of-two size. The different convolution sizes and the underlying VRAM organization may have caused different variants of the FFT implementation to be used by the cuFFT library, leading to the outliers.

Figure 5.11: The sizes of C_z and K_z for compositing planes z_1 and z_2 away from the hologram are shown. Given the hologram size and the diffraction angle θ, the reconstruction of point light sources in region I has the finest resolution (smallest spot size). The resolution of the reconstruction in region II drops off (the spot size increases) from the middle towards the top and bottom. The resolution of the reconstruction in region III drops off horizontally as it gets farther away from the hologram.

However, we can find an upper bound for the hologram computation time. If the size of the volume is known, the dimension of K_z does not need to increase endlessly with z. Given that the size of the volume is fixed and the center of the volume is placed on the optical axis of the hologram, K_z does not need to be larger than what covers the entirety of the hologram, as illustrated in Figure 5.12 with red dashed lines. In this case, the upper bound on the size of K_z is the combined size of the hologram and the size of the volume at distance z. Let the maximum size of the volume across all compositing planes be V_max and the distance between the hologram and the farthest compositing plane be z_max. The dimension of C_z is bounded by (V_max/Δ) × (V_max/Δ). The dimension of K_z is (2z tan θ/Δ) × (2z tan θ/Δ) and is bounded by (H_x + V_max/Δ) × (H_y + V_max/Δ). Therefore, the signal undergoing the FFT has dimensions ((V_max + 2 z_max tan θ)/Δ) × ((V_max + 2 z_max tan θ)/Δ) and is bounded by (H_x + 2 V_max/Δ) × (H_y + 2 V_max/Δ). Another potential bound is that z_max cannot be larger than the form factor of the display, which is usually within 200 mm for a head-mounted near-eye display.

Figure 5.12: Given the volume size, the size of K_z does not need to increase endlessly with distance from the hologram.

5.6.2 Scene FOV and Volume Placement

The FOV described in Section 5.4 refers to the theoretical maximum FOV within which the light diffracted from the hologram enters the observer's eye. The FOV spanned by the scene, which we refer to as the scene FOV, also depends on the size of the scene and the distance between the scene image and the eye. Without loss of generality, we assume that the intersections of the volume with the compositing planes are at their largest, V_max, at the center of the volume. This plane is at a distance s_0 from the external lens in Fig. 5.5. The magnified virtual image of this plane is at a distance s_1 = f_el s_0 / (f_el − s_0) from the external lens. The magnified size of the volume is L_1 = V_max f_el / (f_el − s_0). The scene FOV is therefore 2 atan(V_max f_el / (2 (f_el − s_0) (s_1 + d_eye))).
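The scene-FOV expression above is straightforward to evaluate for candidate layouts. The sketch below computes the virtual-image distance, the magnified volume size, and the resulting scene FOV from assumed values of V_max, f_el, s_0, and d_eye; the numbers in the example are placeholders, not the exact configuration of our prototype.

```python
import math

def scene_fov(V_max, f_el, s_0, d_eye):
    """Scene FOV and magnified image geometry for the configuration in Figure 5.5.

    V_max : largest volume cross-section at the compositing planes
    f_el  : focal length of the external lens (requires s_0 < f_el)
    s_0   : distance from the volume center to the external lens
    d_eye : distance from the external lens to the eye
    All lengths in the same unit; the FOV is returned in degrees.
    """
    s_1 = f_el * s_0 / (f_el - s_0)          # distance of the virtual image from the lens
    L_1 = V_max * f_el / (f_el - s_0)        # magnified size of the volume
    fov = 2.0 * math.degrees(math.atan(L_1 / (2.0 * (s_1 + d_eye))))
    return s_1, L_1, fov

# Placeholder example: 10 mm volume, 10 mm lens, volume 8 mm from the lens, eye 50 mm away.
s_1, L_1, fov = scene_fov(V_max=10.0, f_el=10.0, s_0=8.0, d_eye=50.0)
print(f"image distance {s_1:.1f} mm, image size {L_1:.1f} mm, scene FOV {fov:.1f} deg")
```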
The focal distance s1 + deye needs to be large enough for the observer to comfortably focus on the image. Therefore, achieving a large scene FOV requires a large volume, which leads to a longer computation time according to the discussion in Section 5.6.1. For example, if we want the observer to focus on a volume at a distance of 120 mm that spans 40° through an external lens with a focal length of 10 mm, the volume needs to be of size Vmax = 10.91 mm.

The placement of the volume relative to the hologram also affects the quality of the hologram. According to the Rayleigh criterion [200], the size of the smallest spot that the hologram can create as part of its image is proportional to the distance to that spot divided by the size of the area of the hologram devoted to making that spot. Therefore, the larger the section of the hologram devoted to any particular spot in the image, the finer the resolution that spot can have. On the other hand, only those pixels on the hologram whose diffraction cone contains the image spot can be devoted to making that spot without increasing noise in the image [1]. The area on the hologram plane that provides the full bandwidth needed to ensure the finest spot size for a given image point source v can be thought of as the base of a cone whose vertex is at v, whose base lies on the hologram plane, and whose vertex angle equals the diffraction cone angle. An example is shown in Figure 5.11, where the areas that provide the full bandwidth for point sources v1 and v2 are labeled Kz1 and Kz2. If the hologram cannot cover the area that provides the full bandwidth for some point sources, the reconstructions of those point sources have larger spot sizes and poorer resolution.

We can divide the space to the right of the hologram into three types of regions in which to place the volume, as illustrated in Figure 5.11. The hologram can provide all the bandwidth needed to resolve point light sources in region I, so the reconstruction of these point light sources has the finest resolution. The bandwidths required to resolve point light sources in regions II and III are only partially provided by a portion of the hologram. Specifically, the resolution of the reconstruction in region II drops off from its border with region I vertically towards the top and bottom, while the resolution of the reconstruction in region III drops off horizontally as it gets farther away from the hologram. A small volume can be placed within region I. For larger volumes, it is preferable to place them in region III for better quality. Using the example above, the volume needs to be placed at a distance of 157.81 mm from the hologram.

5.6.3 Elemental Hologram

Another popular method to accelerate holographic calculation is to use elemental holograms, as in [1] and [179]. However, this does not help much with our convolution theorem-based hologram computation method. If we divide the hologram into M × N elemental holograms, each with (Hx/M) × (Hy/N) pixels, the complexity to calculate one elemental hologram is O((HxHy/(MN))·log(HxHy/(MN))). The complexity to calculate the entire hologram is then O(D·MN·(HxHy/(MN))·log(HxHy/(MN))) = O(D·HxHy·log(HxHy/(MN))). This is not much different from the original complexity of our proposed method, but it would potentially add more CUDA kernel calls and increase the computation time. Therefore, we do not use the approach of elemental holograms.

5.7 Conclusion

In this chapter, we improve holographic volume rendering by addressing three challenges it faces.
First, we develop a new hologram computation method that uses direct volume rendering to generate the collection of point light sources and uses the convolution theorem to significantly accelerate the computation. Compared with the brute-force methods, this accelerated hologram computation method achieves over a 2000× speedup. Second, we explore several holographic display configurations and find one that increases the FOV while maintaining compatibility with the accelerated hologram computation method. Third, we realistically simulate the observed images under different kinds of light sources and identify the appropriate light source that strikes a balance between speckles and image resolution for holographic volume rendering. These improvements pave the way for bringing volume visualization to holographic displays.

Chapter 6: Conclusions and Future Work

6.1 Perception in the Far Periphery

First, in Chapter 2, we investigate perception in the far peripheral vision in VR and AR experiences. We conduct a user study in which we record the perception time of 40 participants reacting to patterns of varying sizes appearing at different eccentricities in their peripheral vision. Based on the data collected, we find that patterns can be scaled linearly with the eccentricity at which they are shown to reach a constant perception time. Based on these equal-efficiency scales, we can optimally scale the content displayed at higher eccentricities so that it can be perceived efficiently. Our finding is useful for designing VR and AR experiences that leverage the entire field of view of the human visual system on emerging displays capable of supporting a wide FOV.

Future work on this topic could explore whether the scaling function applies to even higher eccentricities. With a larger field-of-view display (for example, StarVR, https://www.starvr.com/), we can examine the visual properties up to 105° eccentricity, based on the findings in Chapter 2. In addition to visual acuity, peripheral vision also differs from central vision in several other attributes, such as contrast sensitivity in color, intensity, orientation, motion, and texture detail. It will be important to investigate how these attributes affect the perception of information in the far periphery. Finally, future work could use the findings of Chapter 2 to design an actual experience and show how far peripheral vision can enhance the user experience in an immersive VR or AR scenario. Some examples include the VR exploration and surveillance applications by Du et al. [201-205] and the virtual memory palace created by Krokos et al. [206].

6.2 AR Displays for Surgery

Second, in Chapter 3, we implement an AR application and show through a study how AR could help improve the safety of a dangerous and invasive medical procedure. We present an AR surgical visualization and tracking system that can assist surgeons with catheter placement in the external ventricular drain procedure. Compared to existing similarly-purposed systems, our system does not require changing the shape or weight of the medical instruments. The catheter tracking in our system has high accuracy and low latency. Tests show that our system achieves an accuracy of 0.58 mm on a 2D plane and 0.85 mm in 3D space. Processing each frame takes 22.60 ms on a moderately powerful computer. Both the accuracy and the speed can be further improved with higher-quality cameras and a parallel implementation.
In the study, which took place in a realistic surgical environment with a cadaveric head, the displayed virtual catheter projection aligned well with the visible portion of the real catheter, and the real and virtual catheters moved at a matching pace. The surgeon participants also reported that the handling of our slightly modified catheter feels identical to that of a traditional catheter. In the future, we expect to adapt our system to comply with surgical safety standards and to improve its accuracy and speed with better hardware and software implementations. We could also make the system more robust to occlusion of the color markers by using additional redundant markers.

6.3 Correcting Proximity Effect in Nanophotonic Phased Arrays

Third, in Chapter 4, we look at the infrastructure of VR and AR displays. Holographic displays recreate the optical wave field of 2D images or 3D scenes, addressing the vergence-accommodation conflict present in current VR and AR headsets that rely on stereoscopic vision for depth perception. Nanophotonic phased arrays (NPAs) are a new type of holographic display with the advantages of a high refresh rate and a small form factor compared to LCoS SLM holographic displays. However, thermally modulated NPAs suffer from the thermal proximity effect because they depend on the thermo-optic effect for phase modulation. We propose novel proximity effect correction methods for both Fourier holograms and Fresnel holograms on NPAs. Computational simulations show that our proposed methods are more effective and efficient in correcting the proximity effect than existing methods. Future work on this topic includes acceleration of the methods and correction of the proximity effect for 3D Fresnel holograms.

6.3.1 Acceleration

Both the Proximal PEC and the Fresnel PEC methods are iterative algorithms. For example, it currently takes our implementation of the Fresnel PEC method 4.28 s to process a 256 × 256 image at a proximity effect level of 0.66 px. However, the Fresnel PEC method is on average 36.18 times faster than the Iterative Fresnel PEC method while also offering superior correction effectiveness. Since the Fresnel PEC method heavily uses the Fourier transform, we believe that hardware implementation and massive parallelization can significantly accelerate it. Special hardware for the Fourier transform, such as the on-chip integrated all-optical FFT with a 26 Tbps data throughput by Nejadriahi and Sorger [207], may also significantly increase the speed of this method. Alternatively, it is desirable to investigate how to properly adjust the convergence condition and the step size for maximum efficiency.

6.3.2 3D Fresnel Holograms

We have demonstrated displaying 3D holograms by using time-division multiplexing to display multiple 2D Fresnel holograms, processed with the FPEC method, at different depths in consecutive frames. Proximity effect correction on 3D Fresnel holograms, where one hologram encodes the entire 3D scene, will be an interesting direction for future work. A naive solution would be to calculate the fully complex or phase-only 3D Fresnel hologram, similar to Shi et al. [1] and Maimone et al. [62], and perform deconvolution on the phase signal. The non-negativity constraint needs to be enforced in the deconvolution. An alternative is to optimize the observed 3D scene, similar to the proposed FPEC method. It remains to be seen how to define a cost function similar to Equation 4.50 for a 3D scene.
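As a point of reference for the deconvolution route mentioned above, the snippet below shows one standard non-negativity-preserving deconvolution, Richardson-Lucy, applied to a signal blurred by an assumed Gaussian proximity kernel. This is only an illustration of the constraint under discussion; it is not the Proximal PEC or FPEC method of Chapter 4, and the 0.66 px blur width is reused here merely as an example value.

```python
# Illustration only: Richardson-Lucy deconvolution keeps the estimate non-negative by
# construction. This is NOT the Proximal/Fresnel PEC methods of Chapter 4; it shows one
# standard way to deconvolve a non-negative signal under an assumed Gaussian blur kernel.
import numpy as np
from scipy.ndimage import gaussian_filter

def richardson_lucy(blurred, sigma_px, iterations=50, eps=1e-12):
    estimate = np.full_like(blurred, blurred.mean())
    for _ in range(iterations):
        reblurred = gaussian_filter(estimate, sigma_px)
        ratio = blurred / (reblurred + eps)
        estimate *= gaussian_filter(ratio, sigma_px)   # symmetric kernel, so PSF equals its flip
        np.clip(estimate, 0.0, None, out=estimate)     # non-negativity (preserved by the update)
    return estimate

# Hypothetical usage: undo a proximity-effect-like blur of sigma = 0.66 px on a target map.
target = np.random.rand(256, 256)
blurred = gaussian_filter(target, 0.66)
recovered = richardson_lucy(blurred, 0.66, iterations=30)
```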
6.4 Fast Holographic Volume Rendering

Last, in Chapter 5, we propose a way to significantly accelerate the calculation of holograms from volumetric data. Our proposed method achieves a 2500× speedup compared to the previous state-of-the-art method and reduces the computation time from hours to mere seconds. We propose a display configuration that improves the field of view of the display while maintaining compatibility with the accelerated computation. We also show through simulations how to choose the light source for the holographic display to strike a balance between speckle noise and image sharpness in the observation. Going forward, there are several directions to be taken.

6.4.1 Phase Shift on the SLM

In our realistic simulation, we simulate the image as the weighted average of the wavefronts propagated from the hologram to the retina when the hologram is illuminated with light of different wavelengths sampled from a spectrum. In this simulation, the same electric field parameters (amplitude and phase) are assumed to be present at each wavelength. In reality, when the SLM is illuminated by light of a wavelength other than the one it is calibrated with, the dispersion of the liquid crystal in the SLM alters the phase delay of the light as it travels through the liquid crystal. For phase modulation, this means that the phase of the electric field will be slightly different for wavelengths other than the calibration wavelength. Similarly, one would expect slight differences in amplitude for amplitude modulation. It is worthwhile to investigate how to simulate this effect in order to perform more realistic simulations. Manufacturers may not be forthcoming with information about the dispersion of their liquid crystals, but it is also measurable as a function of wavelength through interferometric measurements similar to those commonly used to calibrate SLMs.

6.4.2 Quantifying the Speckles

We investigate the choice of light sources in the realistic simulations. With direct observation, we have found that a light source with an FWHM of 5 nm strikes the best balance between speckles and resolution among the four sample light sources tested. Altering the initial phase of the point light sources in the volume should allow one to generate statistically independent speckle patterns in the observation simulations. By averaging over successive runs, one could, in principle, obtain an image without speckles. This would allow one to characterize the speckle and find a metric that could be used to optimize the trade-off between speckle and resolution. To quantify the speckles in a simulated observation image, it is necessary to first identify them. This can be done by subtracting the speckle-free image from the speckled image. The severity, granularity, and anisotropy of the speckles can then be quantified with methods such as the power spectral density in Ong et al. [208]. The speckle-free image can also be used as the reference image in image quality metrics such as the structural similarity index of Wang et al. [175] or the neural network-based perceptual metric of Zhang et al. [176]. The image quality metric can be used in combination with the quantified speckle to find the optimum observed image.
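A compact sketch of this quantification recipe is given below: average several simulated observations with independent random initial phases to approximate a speckle-free reference, subtract it to isolate the speckle, and summarize the speckle with a contrast value and a radially averaged power spectral density. The simulate_observation callable is a placeholder for the realistic observation simulation of Chapter 5, and the specific statistics are illustrative choices rather than the exact measures of Ong et al. [208].

```python
# Sketch of the speckle-quantification recipe described above. `simulate_observation`
# is a stand-in for the realistic observation simulation; it is assumed to accept a
# random generator that controls the initial phases of the point light sources.
import numpy as np

def speckle_free_reference(simulate_observation, n_runs=32, seed=0):
    rng = np.random.default_rng(seed)
    runs = [simulate_observation(rng) for _ in range(n_runs)]   # independent speckle realizations
    return np.mean(runs, axis=0)

def radial_psd(speckle):
    # Radially averaged power spectral density of the isolated speckle pattern.
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(speckle)))**2
    h, w = spectrum.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h // 2, x - w // 2).astype(int)
    return np.bincount(r.ravel(), weights=spectrum.ravel()) / np.maximum(np.bincount(r.ravel()), 1)

def quantify(speckled_image, reference):
    speckle = speckled_image - reference                     # isolate the speckle pattern
    contrast = speckle.std() / (reference.mean() + 1e-12)    # a simple speckle-severity measure
    return contrast, radial_psd(speckle)
```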
6.4.3 Improving Volume Rendering Sampling

In Chapter 5, we use raycasting direct volume rendering to generate the point light sources for hologram computation and do not consider more complicated lighting. The complexity of generating the point light sources using raycasting is O(D·Rx·Ry). In practice, processing each compositing plane (with Rx = Ry = 1024) takes 0.63 ms on average. The raycasting process takes no more than 1.12% of the total computation time, depending on the hologram dimensions and the distance between the hologram and the volume. Therefore, lighting models such as diffuse and specular lighting can be easily integrated into our method with very little overhead in computation time. Other volume visualization and exploration techniques, such as those in Cheng et al. [209, 210], could also be incorporated. Also, in the current implementation, the rays are uniformly sampled in both dimensions. This could be improved by sampling the rays more densely in areas where the gradient of the volume is large. To cope with more complicated illumination in the scenes, methods such as the Monte Carlo techniques of Veach and Guibas [211], Shirley et al. [212], and Meng et al. [213] can be used. Instead of computing the light integrals on the image plane, one can compute them at various sampling positions in the volume.

Bibliography

[1] Liang Shi, Fu-Chung Huang, Ward Lopes, Wojciech Matusik, and David Luebke. Near-eye light field holographic rendering with spherical waves for wide field of view interactive 3d computer graphics. ACM Transactions on Graphics (TOG), 36(6):236, 2017.

[2] J Adam Jones, J Edward Swan, and Mark Bolas. Peripheral stimulation and its effect on perceived spatial scale in virtual environments. IEEE Transactions on Visualization and Computer Graphics, 19(4):701-710, 2013.

[3] Sujal Bista, Jiachen Zhuo, Rao P. Gullapalli, and Amitabh Varshney. Visualization of Brain Microstructure through Spherical Harmonics Illumination of High Fidelity Spatio-Angular Fields. IEEE Transactions on Visualization and Computer Graphics, 20(12):2516-2525, Dec 2014.

[4] Sujal Bista, Jiachen Zhuo, Rao P. Gullapalli, and Amitabh Varshney. Visual Knowledge Discovery for Diffusion Kurtosis Datasets of the Human Brain. In Ingrid Hotz and Thomas Schultz, editors, Visualization and Processing of Higher Order Descriptors for Multi-Valued Data, pages 213-234. Springer, 2015.

[5] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: Common objects in context. In David Fleet, Tomas Pajdla, Bernt Schiele, and Tinne Tuytelaars, editors, Computer Vision - ECCV 2014, pages 740-755, Cham, 2014. Springer International Publishing.

[6] David George Voelz. Computational Fourier Optics: A MATLAB Tutorial. SPIE Press, Bellingham, WA, 2011.

[7] H. R. Sheikh, Z. Wang, L. Cormack, and A. C. Bovik. Live image quality assessment database release 2. http://live.ece.utexas.edu/research/quality, 2004. Accessed: 2019-04-03.

[8] H. R. Sheikh, M. F. Sabir, and A. C. Bovik. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Transactions on Image Processing, 15(11):3440-3451, Nov 2006.

[9] T. Smits and W.J. Faber. Chronic (classified historical newspaper images). KB Lab: The Hague. http://lab.kb.nl/dataset/chronic-classified-historical-newspaper-images, 2018.

[10] Simon J Thorpe, Karl R Gegenfurtner, Michèle Fabre-Thorpe, and Heinrich H Bülthoff. Detection of animals in natural images using far peripheral vision. European Journal of Neuroscience, 14(5):869-876, 2001.

[11] Muriel Boucart, Quentin Lenoble, Justine Quettelart, Sebastien Szaffarczyk, Pascal Despretz, and Simon J Thorpe.
Finding faces, animals, and vehicles in far peripheral vision. Journal of Vision, 16(2):10?10, 2016. [12] Adam M Larson and Lester C Loschky. The contributions of central versus peripheral vision to scene gist recognition. Journal of Vision, 9(10):6?6, 2009. [13] J Rovamo and V Virsu. An estimation and application of the human cortical magnification factor. Experimental brain research, 37(3):495?510, 1979. [14] Steve Grogorick, Michael Stengel, Elmar Eisemann, and Marcus Magnor. Sub- tle gaze guidance for immersive environments. In Proceedings of the ACM Symposium on Applied Perception, page 4. ACM, 2017. [15] Xuetong Sun and Amitabh Varshney. Investigating perception time in the far peripheral vision for virtual and augmented reality. In Proceedings of the 15th ACM Symposium on Applied Perception, page 13. ACM, 2018. [16] Xuetong Sun, Sarah B. Murthi, Gary Schwartzbauer, and Amitabh Varshney. High-precision 5dof tracking and visualization of catheter placement in evd of the brain using ar. ACM Trans. Comput. Healthcare, 1(2), March 2020. [17] Manan Raval. Nanophotonic visible light phased arrays. Master?s thesis, Massachusetts Institute of Technology, 2016. [18] Jie Sun, Erman Timurdogan, Ami Yaacobi, Ehsan Shah Hosseini, and Michael R Watts. Large-scale nanophotonic phased array. Nature, 493(7431):195, 2013. [19] Jonathan K Doylend, MJR Heck, Jock T Bovington, Jonathan D Peters, LA Coldren, and JE Bowers. Two-dimensional free-space beam steering with an optical phased array on silicon-on-insulator. Optics express, 19(22):21595? 21604, 2011. [20] X. Sun, Y. Zhang, P. C. Huang, N. Acharjee, M. Dagenais, M. Peckerar, and A. Varshney. Correcting the proximity effect in nanophotonic phased arrays. IEEE Transactions on Visualization and Computer Graphics, pages 1?1, 2020. 181 [21] Andrew S Glassner. An introduction to ray tracing. Elsevier, 1989. [22] Marc Levoy. Efficient ray tracing of volume data. ACM Transactions on Graphics (TOG), 9(3):245?261, 1990. [23] Dennis Gabor. A new microscopic principle, 1948. [24] Hans Strasburger, Ingo Rentschler, and Martin Ju?ttner. Peripheral vision and pattern recognition: A review. Journal of vision, 11(5):13?13, 2011. [25] Michael J Simpson. mini-review: Far peripheral vision. Vision research, 140:96?105, 2017. [26] Xiaoxu Meng, Ruofei Du, Matthias Zwicker, and Amitabh Varshney. Kernel Foveated Rendering. Proceedings of the ACM on Computer Graphics and Interactive Techniques (I3D), 1(5):1?20, May 2018. [27] X. Meng, R. Du, and A. Varshney. Eye-dominance-guided foveated render- ing. IEEE Transactions on Visualization and Computer Graphics, 26(5):1972? 1980, 2020. [28] X. Meng, R. Du, J. F. JaJa, and A. Varshney. 3d-kernel foveated rendering for light fields. IEEE Transactions on Visualization and Computer Graphics, pages 1?1, 2020. [29] PM Daniel and D Whitteridge. The representation of the visual field on the cerebral cortex in monkeys. The Journal of physiology, 159(2):203?221, 1961. [30] A Cowey and ET Rolls. Human cortical magnification factor and its relation to visual acuity. Experimental Brain Research, 21(5):447?454, 1974. [31] Giles S Brindley and WS Lewin. The sensations produced by electrical stim- ulation of the visual cortex. The Journal of physiology, 196(2):479, 1968. [32] Jonathan C Horton and William F Hoyt. The representation of the visual field in human striate cortex: a revision of the classic holmes map. Archives of ophthalmology, 109(6):816?824, 1991. [33] Jonas Larsson and David J Heeger. 
Two retinotopic visual areas in human lateral occipital cortex. The Journal of Neuroscience, 26(51):13128?13142, 2006. [34] DJ Tolhurst and L Ling. Magnification factors and the organization of the human striate cortex. Human neurobiology, 6(4):247?254, 1987. [35] Eric L Schwartz. Computational anatomy and functional architecture of stri- ate cortex: a spatial mapping approach to perceptual coding. Vision research, 20(8):645?669, 1980. 182 [36] Hans Strasburger and Maka Malania. Source confusion is a major cause of crowding. Journal of Vision, 13(1):24?24, 2013. [37] David J Yates and Tom Stafford. Insights into the function and mechanism of saccadic decision making from targets scaled by an estimate of the cortical magnification factor. Cognitive Computation, 3(1):89?93, 2011. [38] Christopher W Tyler. Human symmetry detection exhibits reverse eccentricity scaling. Visual Neuroscience, 16(05):919?922, 1999. [39] Mark M Schira, Alex R Wade, and Christopher W Tyler. Two-dimensional mapping of the central and parafoveal visual field to human visual cortex. Journal of Neurophysiology, 97(6):4284?4295, 2007. [40] Hans Strasburger. Invariance of the psychometric function for character recognition across the visual field. Attention, Perception, & Psychophysics, 63(8):1356?1376, 2001. [41] Lester Loschky, Muriel Boucart, Sebastien Szaffarczyk, Clement Beugnet, Ali- cia Johnson, and Jia Li Tang. The contributions of central and peripheral vision to scene gist recognition with a 180? visual field. Journal of vision, 15(12):570?570, 2015. [42] Dimitri J Bayle, Benjamin Schoendorff, Marie-Anne He?naff, and Pierre Krolak-Salmon. Emotional facial expression detection in the peripheral vi- sual field. PloS one, 6(6):e21584, 2011. [43] Rogier Landman, Jitendra Sharma, Mriganka Sur, and Robert Desimone. Ef- fect of distracting faces on visual selective attention in the monkey. Proceedings of the National Academy of Sciences, 111(50):18037?18042, 2014. [44] Donghao Ren, Tibor Goldschwendt, YunSuk Chang, and Tobias Ho?llerer. Evaluating wide-field-of-view augmented reality with mixed reality simulation. In Virtual Reality (VR), 2016 IEEE, pages 93?102. IEEE, 2016. [45] Paul B Kline and Bob G Witmer. Distance perception in virtual environments: Effects of field of view and surface texture at near distances. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, volume 40, pages 1112?1116. SAGE Publications Sage CA: Los Angeles, CA, 1996. [46] J Adam Jones, David M Krum, and Mark T Bolas. Vertical field-of-view ex- tension and walking characteristics in head-worn virtual environments. ACM Transactions on Applied Perception (TAP), 14(2):9, 2017. [47] Toshikazu Ohshima, Hiroyuki Yamamoto, and Hideyuki Tamura. Gaze- directed adaptive rendering for interacting with virtual space. In Virtual reality annual international symposium, 1996., Proceedings of the IEEE 1996, pages 103?110. IEEE, 1996. 183 [48] Yoshio Ishiguro and Jun Rekimoto. Peripheral vision annotation: Noninter- ference information presentation method for mobile augmented reality. In Proceedings of the 2nd Augmented Human International Conference, page 8. ACM, 2011. [49] Brett R Jones, Hrvoje Benko, Eyal Ofek, and Andrew D Wilson. Illumiroom: peripheral projected illusions for interactive experiences. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 869?878. ACM, 2013. [50] Carolina Cruz-Neira, Daniel J Sandin, and Thomas A DeFanti. 
Surround- screen projection-based virtual reality: the design and implementation of the cave. In Proceedings of the 20th annual conference on Computer graphics and interactive techniques, pages 135?142. ACM, 1993. [51] Thomas A DeFanti, Gregory Dawe, Daniel J Sandin, Jurgen P Schulze, Peter Otto, Javier Girado, Falko Kuester, Larry Smarr, and Ramesh Rao. The star- cave, a third-generation cave and virtual reality optiportal. Future Generation Computer Systems, 25(2):169?178, 2009. [52] George Robertson, Mary Czerwinski, and Maarten Van Dantzich. Immersion in desktop virtual reality. In Proceedings of the 10th annual ACM symposium on User interface software and technology, pages 11?19. ACM, 1997. [53] Hajime Nagahara, Yasushi Yagi, and Masahiko Yachida. Super wide viewer using catadioptrical optics. In Proceedings of the ACM symposium on Virtual reality software and technology, pages 169?175. ACM, 2003. [54] Kiyoshi Kiyokawa. A wide field-of-view head mounted projective display using hyperbolic half-silvered mirrors. In Mixed and Augmented Reality, 2007. IS- MAR 2007. 6th IEEE and ACM International Symposium on, pages 207?210. IEEE, 2007. [55] David Dunn, Cary Tippets, Kent Torell, Petr Kellnhofer, Kaan Aks?it, Piotr Didyk, Karol Myszkowski, David Luebke, and Henry Fuchs. Wide field of view varifocal near-eye display using see-through deformable membrane mirrors. IEEE Transactions on Visualization and Computer Graphics, 23(4):1322? 1331, 2017. [56] Ismo Rakkolainen, Matthew Turk, and Tobias Ho?llerer. A compact, wide-fov optical design for head-mounted displays. In Proceedings of the 22nd ACM Conference on Virtual Reality Software and Technology, pages 293?294. ACM, 2016. [57] I Rakkolainen, M Turk, and T Ho?llerer. A superwide-fov optical design for head-mounted displays. 2016. 184 [58] Robert W Massof, Lawrence G Brown, Marc D Shapiro, G David Barnett, Frank H Baker, and Fuminobu Kurosawa. 37.1: Invited paper: Full-field high-resolution binocular hmd. In SID Symposium Digest of Technical Papers, volume 34, pages 1145?1147. Wiley Online Library, 2003. [59] Kaan Aks?it, Jan Kautz, and David Luebke. Slim near-eye display using pinhole aperture arrays. Applied optics, 54(11):3422?3427, 2015. [60] Robert Xiao and Hrvoje Benko. Augmenting the field-of-view of head-mounted displays with sparse peripheral displays. In Proceedings of the 2016 CHI Con- ference on Human Factors in Computing Systems, pages 1221?1232. ACM, 2016. [61] Fu-Chung Huang, Kevin Chen, and Gordon Wetzstein. The light field stereo- scope: immersive computer graphics via factored near-eye light field displays with focus cues. ACM Transactions on Graphics (TOG), 34(4):60, 2015. [62] Andrew Maimone, Andreas Georgiou, and Joel S Kollin. Holographic near-eye displays for virtual and augmented reality. ACM Transactions on Graphics (TOG), 36(4):85, 2017. [63] Andrew Maimone, Douglas Lanman, Kishore Rathinavel, Kurtis Keller, David Luebke, and Henry Fuchs. Pinlight displays: wide field of view augmented reality eyeglasses using defocused point light sources. In ACM SIGGRAPH 2014 Emerging Technologies, page 20. ACM, 2014. [64] Carolina Cruz-Neira, Daniel J Sandin, Thomas A DeFanti, Robert V Kenyon, and John C Hart. The cave: audio visual experience automatic virtual envi- ronment. Communications of the ACM, 35(6):64?73, 1992. [65] David R Huyette, Benjamin J Turnbow, Christian Kaufman, Dale F Vaslow, Benjamin B Whiting, and Michael Y Oh. 
Accuracy of the freehand pass technique for ventriculostomy catheter placement: retrospective assessment using computed tomography scans. Journal of Neurosurgery, 108(1):88?91, 2008. [66] Udaya K Kakarla, Louis J Kim, Steven W Chang, Nicholas Theodore, and Robert F Spetzler. Safety and accuracy of bedside external ventricular drain placement. Operative Neurosurgery, 63(suppl 1):ONS162?ONS167, 2008. [67] Peter YM Woo, Ben CF Ng, Jacob X Xiao, Daniel Wong, Andrew Seto, Sandy Lam, Carmen Yim, Hong-Yip Lo, Yin-Chung Po, Larry YW Wong, et al. The importance of aspirin, catheterization accuracy, and catheter design in external ventricular drainage-related hemorrhage: a multicenter study of 1002 procedures. Acta Neurochirurgica, pages 1?10, 2019. 185 [68] Faith C Robertson, Muhammad M Abd-El-Barr, Srinivasan Mukundan, and William B Gormley. Ventriculostomy-associated hemorrhage: a risk assess- ment by radiographic simulation. Journal of neurosurgery, 127(3):532?536, 2017. [69] Andrew Benjamin Cutler, Shervin Rahimpour, Yameng Liu, Nandan Lad, Regis Kopper, and Patrick Codd. Development of augmented reality-based neuro-navigation system for use in external ventricular drain placement. In Journal of Neurosurgery, volume 126, pages A1438?A1438, 2017. [70] David Low, Cheng Kiang Lee, Lee Lian Tay Dip, Wai Hoe Ng, Beng Ti Ang, and Ivan Ng. Augmented reality neurosurgical planning and navigation for surgical excision of parasagittal, falcine and convexity meningiomas. British Journal of Neurosurgery, 24(1):69?74, 2010. [71] Mohammad Najafi and Robert Rohling. Single camera closed-form real-time needle trajectory tracking for ultrasound. In Medical Imaging 2011: Visual- ization, Image-Guided Procedures, and Modeling, volume 7964, page 79641F. International Society for Optics and Photonics, 2011. [72] Mohammad Najafi, Purang Abolmaesumi, and Robert Rohling. Single-camera closed-form real-time needle tracking for ultrasound-guided needle insertion. Ultrasound in medicine & biology, 41(10):2663?2676, 2015. [73] Philipp J. Stolka, Pezhman Foroughi, Matthew Rendina, Clifford R. Weiss, Gregory D. Hager, and Emad M. Boctor. Needle guidance using handheld stereo vision and projection for ultrasound-based interventions. In Polina Golland, Nobuhiko Hata, Christian Barillot, Joachim Hornegger, and Robert Howe, editors, Medical Image Computing and Computer-Assisted Intervention ? MICCAI 2014, pages 684?691, Cham, 2014. Springer International Publish- ing. [74] Zhencheng Fan, Guowen Chen, Junchen Wang, and Hongen Liao. Spatial position measurement system for surgical navigation using 3-d image marker- based tracking tools with compact volume. IEEE Transactions on Biomedical Engineering, 65(2):378?389, 2017. [75] Marta Kersten-Oertel, Pierre Jannin, and D Louis Collins. The state of the art of visualization in mixed reality image guided surgery. Computerized Medical Imaging and Graphics, 37(2):98?112, 2013. [76] Antonio Meola, Fabrizio Cutolo, Marina Carbone, Federico Cagnazzo, Mauro Ferrari, and Vincenzo Ferrari. Augmented reality in neurosurgery: a system- atic review. Neurosurgical review, pages 1?12, 2016. [77] Visish M Srinivasan, Brent R O?Neill, Diana Jho, Donald M Whiting, and Michael Y Oh. The history of external ventricular drainage: Historical vi- gnette. Journal of neurosurgery, 120(1):228?236, 2014. 186 [78] Herbert I Fried, Barnett R Nathan, A Shaun Rowe, Joseph M Zabramski, Norberto Andaluz, Adarsh Bhimraj, Mary McKenna Guanci, David B Seder, and Jeffrey M Singh. 
The insertion and management of external ventricular drains: an evidence-based consensus statement. Neurocritical care, 24(1):61? 81, 2016. [79] Shaun T O?Leary, Max K Kole, Devon A Hoover, Steven E Hysell, Ajith Thomas, and Christopher I Shaffrey. Efficacy of the ghajar guide revisited: a prospective study. Journal of neurosurgery, 92(5):801?803, 2000. [80] Nicholas I Phillips and Nigel W John. Web-based surgical simulation for ventricular catheterization. Neurosurgery, 46(4):933?937, 2000. [81] Olivier Cros, Morten Volden, Jens J?rgen Flaaris, Lars Cheb?rl?v Brix, Chris- tian Fischer Pedersen, Kim Vang Hansen, Ole Vilhelm Larsen, Lasse Riis ?stergaard, and Jens Haase. Simulating the puncture of the human ventricle. In 10th Annual Medicine Meets Virtual Reality Conference, 2002. [82] G Krombach, A Ganser, Ch Fricke, V Rohde, M Reinges, J Gilsbach, and U Spetzger. Virtual placement of frontal ventricular catheters using frameless neuronavigation: an ?unbloody training? for young neurosurgeons. Minimally Invasive Neurosurgery, 43(04):171?175, 2000. [83] Cristian Luciano, Pat Banerjee, G Michael Lemole, and Fady Char- bel. Second generation haptic ventriculostomy simulator using the immersivetouchTMsystem. Studies in health technology and informatics, 119:343, 2005. [84] P Pat Banerjee, Cristian J Luciano, G Michael Lemole Jr, Fady T Charbel, and Michael Y Oh. Accuracy of ventriculostomy catheter placement using a head-and hand-tracked high-resolution virtual reality simulator with haptic feedback. Journal of Neurosurgery, 107(3):515?521, 2007. [85] G Michael Lemole Jr, P Pat Banerjee, Cristian Luciano, Sergey Neckrysh, and Fady T Charbel. Virtual reality in neurosurgical education: part-task ventriculostomy simulation with dynamic visual and haptic feedback. Neuro- surgery, 61(1):142?149, 2007. [86] Clemens M Schirmer, J Bradley Elder, Ben Roitberg, and Darlene A Lobel. Virtual reality?based simulation training for ventriculostomy: an evidence- based approach. Neurosurgery, 73(suppl 1):S66?S73, 2013. [87] Rachel Yudkowsky, Cristian Luciano, Pat Banerjee, Alan Schwartz, Ali Alaraj, G Michael Lemole Jr, Fady Charbel, Kelly Smith, Silvio Rizzi, Richard Byrne, et al. Practice on an augmented reality/haptic simulator and library of virtual brains improves residents? ability to perform a ventriculostomy. Simulation in Healthcare, 8(1):25?31, 2013. 187 [88] Bruce L Tai, Deborah Rooney, Francesca Stephenson, Peng-Siang Liao, Oren Sagher, Albert J Shih, and Luis E Savastano. Development of a 3d-printed external ventricular drain placement simulator. Journal of neurosurgery, 123(4):1070?1076, 2015. [89] M. A. Lin, A. F. Siu, J. H. Bae, M. R. Cutkosky, and B. L. Daniel. Holoneedle: Augmented reality guidance system for needle placement investigating the advantages of three-dimensional needle shape reconstruction. IEEE Robotics and Automation Letters, 3(4):4156?4162, Oct 2018. [90] Timur Kuzhagaliyev, Neil T Clancy, Mirek Janatka, Kevin Tchaka, Francisco Vasconcelos, Matthew J Clarkson, Kurinchi Gurusamy, David J Hawkes, Brian Davidson, and Danail Stoyanov. Augmented reality needle ablation guidance tool for irreversible electroporation in the pancreas. In Medical Imaging 2018: Image-Guided Procedures, Robotic Interventions, and Modeling, volume 10576, page 1057613. International Society for Optics and Photonics, 2018. [91] Ehsan Basafa, Pezhman Foroughi, Martin Hossbach, Jasmine Bhanushali, and Philipp Stolka. Visual tracking for multi-modality computer-assisted image guidance. 
In Medical Imaging 2017: Image-Guided Procedures, Robotic Inter- ventions, and Modeling, volume 10135, page 101352S. International Society for Optics and Photonics, 2017. [92] Yoshitaka Masutani, Takeyoshi Dohi, Fumitaka Yamane, Hiroshi Iseki, and Kintomo Takakura. Augmented reality visualization system for intravascular neurosurgery. Computer aided surgery, 3(5):239?247, 1998. [93] Eduardo E Lovo, Juan C Quintana, Manuel C Puebla, Gonzalo Torrealba, Jose? L Santos, Isidro H Lira, and Patricio Tagle. A novel, inexpensive method of image coregistration for applications in image-guided surgery using aug- mented reality. Operative Neurosurgery, 60(suppl 4):ONS?366, 2007. [94] Philip J Edwards, Derek LG Hill, David J Hawkes, R Spink, Alan CF Colch- ester, A Strong, and M Gleeson. Neurosurgical guidance using the stereo microscope. In Computer Vision, Virtual Reality and Robotics in Medicine, pages 555?564. Springer, 1995. [95] Philip J Edwards, Andrew P King, Calvin R Maurer, Darryl A De Cunha, David J Hawkes, Derek LG Hill, Ronald P Gaston, Michael R Fenlon, A Jusczyzck, Anthony J Strong, et al. Design and evaluation of a system for microscope-assisted guided interventions (magi). IEEE Transactions on Medical Imaging, 19(11):1082?1093, 2000. [96] Philip J Edwards, Laura G Johnson, David J Hawkes, Michael R Fenlon, Anthony J Strong, and Michael J Gleeson. Clinical experience and perception in stereo augmented reality surgical navigation. In International Workshop on Medical Imaging and Virtual Reality, pages 369?376. Springer, 2004. 188 [97] Ramin Shahidi, Bai Wang, Marc Epitaux, Robert Grzeszczuk, and John Adler. Volumetric image guidance via a stereotactic endoscope. Medical Image Com- puting and Computer-Assisted Intervention, pages 241?252, 1998. [98] M Scholz, W Konen, S Tombrock, B Fricke, L Adams, M Von Duering, A Hentsch, L Heuser, and AG Harders. Development of an endoscopic nav- igation system based on digital image processing. Computer Aided Surgery, 3(3):134?143, 1998. [99] Ralf A Kockro, Yeo Tseng Tsai, Ivan Ng, Peter Hwang, Chuangui Zhu, Kusuma Agusanto, Liang Xiao Hong, and Luis Serra. Dex-ray: augmented reality neurosurgical navigation with a handheld video probe. Neurosurgery, 65(4):795?808, 2009. [100] D Inoue, B Cho, M Mori, Y Kikkawa, T Amano, A Nakamizo, K Yoshimoto, M Mizoguchi, M Tomikawa, J Hong, et al. Preliminary study on the clinical application of augmented reality neuronavigation. Journal of Neurological Surgery Part A: Central European Neurosurgery, 74(02):071?076, 2013. [101] Frank Sauer, Ali Khamene, Benedicte Bascle, and Gregory J Rubino. A head- mounted display system for augmented reality image guidance: Towards clin- ical evaluation for imri-guided neurosurgery. In Proceedings of the 4th In- ternational Conference on Medical Image Computing and Computer-Assisted Intervention, pages 707?716. Springer-Verlag, 2001. [102] Calvin Maurer, Frank Sauer, Bo Hu, Benedicte Bascle, Bernhard Geiger, Fabian Wenzel, Filippo Recchi, Torsten Rohlfing, Christopher M Brown, Robert S Bakos, et al. Augmented reality visualization of brain structures with stereo and kinetic depth cues: System description and initial evaluation with head phantom. Medical Imaging, 2001:445?456, 2001. [103] Yuichiro Abe, Shigenobu Sato, Koji Kato, Takahiko Hyakumachi, Yasushi Yanagibashi, Manabu Ito, and Kuniyoshi Abumi. A novel 3d guidance system using augmented reality for percutaneous vertebroplasty. Journal of Neuro- surgery: Spine, 19(4):492?501, 2013. 
[104] Kamyar Abhari, John SH Baxter, Elvis CS Chen, Ali R Khan, Terry M Pe- ters, Sandrine de Ribaupierre, and Roy Eagleson. Training for planning tu- mour resection: augmented reality and human factors. IEEE Transactions on Biomedical Engineering, 62(6):1466?1477, 2014. [105] Ye Li, Xiaolei Chen, Ning Wang, Wenyao Zhang, Dawei Li, Lei Zhang, Xin Qu, Weitao Cheng, Yueqiao Xu, Wenjin Chen, et al. A wearable mixed-reality holographic computer for guiding external ventricular drain insertion at the bedside. Journal of Neurosurgery, 1(aop):1?8, 2018. [106] Ehsan Azimi, Camilo Molina, Alexander Chang, Judy Huang, Chien-Ming Huang, and Peter Kazanzides. Interactive training and operation ecosystem for 189 surgical tasks in mixed reality. In OR 2.0 Context-Aware Operating Theaters, Computer Assisted Robotic Endoscopy, Clinical Image-Based Procedures, and Skin Image Analysis, pages 20?29. Springer, 2018. [107] Daipayan Guha, Naif M Alotaibi, Nhu Nguyen, Shaurya Gupta, Christopher McFaul, and Victor XD Yang. Augmented reality in neurosurgery: a review of current concepts and emerging applications. Canadian Journal of Neurological Sciences, 44(3):235?245, 2017. [108] Long Quan and Zhongdan Lan. Linear n-point camera pose determination. IEEE Transactions on pattern analysis and machine intelligence, 21(8):774? 780, 1999. [109] Marc-Andre? Ameller, Bill Triggs, and Long Quan. Camera pose revisited-new linear algorithms. 2000. [110] Xiao-Shan Gao, Xiao-Rong Hou, Jianliang Tang, and Hang-Fei Cheng. Com- plete solution classification for the perspective-three-point problem. IEEE transactions on pattern analysis and machine intelligence, 25(8):930?943, 2003. [111] Eric Marchand, Hideaki Uchiyama, and Fabien Spindler. Pose estimation for augmented reality: a hands-on survey. IEEE transactions on visualization and computer graphics, 22(12):2633?2651, 2016. [112] Diederick C Niehorster, Li Li, and Markus Lappe. The accuracy and precision of position and orientation tracking in the HTC Vive virtual reality system for scientific research. i-Perception, 8(3):177?205, 2017. [113] Taylor Frantz, Bart Jansen, Johnny Duerinck, and Jef Vandemeulebroucke. Augmenting microsoft?s hololens with vuforia tracking for neuronavigation. Healthcare technology letters, 5(5):221?225, 2018. [114] M. A. Khalighi and M. Uysal. Survey on free space optical communication: A communication theory perspective. IEEE Communications Surveys Tutorials, 16(4):2231?2258, 2014. [115] Christopher V. Poulton, Ami Yaacobi, David B. Cole, Matthew J. Byrd, Manan Raval, Diedrik Vermeulen, and Michael R. Watts. Coherent solid-state lidar with silicon photonic optical phased arrays. Opt. Lett., 42(20):4091?4094, Oct 2017. [116] J. Notaros, M. Raval, M. Notaros, and M. R. Watts. Integrated-phased- array-based visible-light near-eye holographic projector. In 2019 Conference on Lasers and Electro-Optics (CLEO), pages 1?2, 2019. [117] Robert Konrad, Emily A. Cooper, and Gordon Wetzstein. Novel optical con- figurations for virtual reality: Evaluating user preference and performance 190 with focus-tunable and monovision near-eye displays. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, CHI ?16, pages 1211?1220, New York, NY, USA, 2016. ACM. [118] K. Rathinavel, H. Wang, A. Blate, and H. Fuchs. An extended depth-at- field volumetric near-eye augmented reality display. IEEE Transactions on Visualization and Computer Graphics, 24(11):2857?2866, 2018. [119] Pierre Gemayel, Bruno Colicchio, Alain Dieterlen, and Pierre Ambs. 
Cross- talk compensation of a spatial light modulator for iterative phase retrieval applications. Applied optics, 55(4):802?810, 2016. [120] Martin Peckerar, Robert Bass, and Kee Woo Rhee. Sub-0.1 ? electron-beam lithography for nanostructure development. Journal of Vacuum Science & Technology B: Microelectronics and Nanometer Structures Processing, Mea- surement, and Phenomena, 18(6):3143?3149, 2000. [121] Martin Persson, David Engstro?m, and Mattias Gokso?r. Reducing the ef- fect of pixel crosstalk in phase only spatial light modulators. Optics express, 20(20):22334?22343, 2012. [122] X Sun, Y Zhang, P-C Huang, Mario Dagenais, M Peckerar, and A Varshney. Proximity effect on nanophotonic phased arrays. In Frontiers in Optics, pages JW3A?101. Optical Society of America, 2019. To appear. [123] Joseph W Goodman. Introduction to Fourier optics. Roberts and Company Publishers, 2005. [124] Alan J Fenn, Donald H Temme, William P Delaney, and William E Court- ney. The development of phased-array radar technology. Lincoln Laboratory Journal, 12(2):321?340, 2000. [125] Manan Raval, Ami Yaacobi, and Michael R. Watts. Integrated visible light phased array system for autostereoscopic image projection. Opt. Lett., 43(15):3678?3681, Aug 2018. [126] A. Grinenko, M. P. MacDonald, C. R. P. Courtney, P. D. Wilcox, C. E. M. Demore, S. Cochran, and B. W. Drinkwater. Tunable beam shaping with a phased array acousto-optic modulator. Opt. Express, 23(1):26?32, Jan 2015. [127] Hajime Nagahara, Yasushi Yagi, and Masahiko Yachida. Super wide viewer using catadioptrical optics. In Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST ?03, pages 169?175, New York, NY, USA, 2003. ACM. [128] Kiyoshi Kiyokawa. A wide field-of-view head mounted projective display using hyperbolic half-silvered mirrors. In Proceedings of the 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, ISMAR ?07, pages 1?4, Washington, DC, USA, 2007. IEEE Computer Society. 191 [129] Jannick P. Rolland, Myron W. Krueger, and Alexei Goon. Multifocal planes head-mounted displays. Appl. Opt., 39(19):3209?3215, Jul 2000. [130] Kurt Akeley, Simon J. Watt, Ahna Reza Girshick, and Martin S. Banks. A stereo display prototype with multiple focal distances. ACM Trans. Graph., 23(3):804?813, August 2004. [131] P. Chakravarthula, D. Dunn, K. Akit, and H. Fuchs. FocusAR: Auto-focus augmented reality eyeglasses for both real world and virtual imagery. IEEE Transactions on Visualization and Computer Graphics, 24(11):2906?2916, 2018. [132] X. Xia, Y. Guan, A. State, P. Chakravarthula, K. Rathinavel, T. Cham, and H. Fuchs. Towards a switchable ar/vr near-eye display with accommodation- vergence and eyeglass prescription support. IEEE Transactions on Visualiza- tion and Computer Graphics, 25(11):3114?3124, 2019. [133] Kaan Aks?it, Ward Lopes, Jonghyun Kim, Peter Shirley, and David Luebke. Near-eye varifocal augmented reality display using see-through screens. ACM Transactions on Graphics (TOG), 36(6):189, 2017. [134] Jonghyun Kim, Youngmo Jeong, Michael Stengel, Kaan Aks?it, Rachel Albert, Ben Boudaoud, Trey Greer, Joohwan Kim, Ward Lopes, Zander Majercik, Peter Shirley, Josef Spjut, Morgan McGuire, and David Luebke. Foveated AR: Dynamically-foveated augmented reality display. ACM Trans. Graph., 38(4):99:1?99:15, July 2019. [135] Marc Levoy and Pat Hanrahan. Light field rendering. In Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, pages 31?42. ACM, 1996. 
[136] Andrew Maimone, Gordon Wetzstein, Matthew Hirsch, Douglas Lanman, Ramesh Raskar, and Henry Fuchs. Focus 3D: Compressive accommodation display. ACM Trans. Graph., 32(5):153?1, 2013. [137] Douglas Lanman and David Luebke. Near-eye light field displays. ACM Trans- actions on Graphics (TOG), 32(6):220, 2013. [138] Andrew Maimone and Henry Fuchs. Computational augmented reality eye- glasses. In 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pages 29?38. IEEE, 2013. [139] Gordon Wetzstein, Douglas Lanman, Matthew Hirsch, and Ramesh Raskar. Tensor displays: Compressive light field synthesis using multilayer displays with directional backlighting. ACM Trans. Graph., 31(4), July 2012. [140] Fu-Chung Huang, Kevin Chen, and Gordon Wetzstein. The light field stereo- scope: Immersive computer graphics via factored near-eye light field displays with focus cues. ACM Trans. Graph., 34(4):60:1?60:12, July 2015. 192 [141] Seungjae Lee, Changwon Jang, Seokil Moon, Jaebum Cho, and Byoungho Lee. Additive light field displays: Realization of augmented reality with holographic optical elements. ACM Trans. Graph., 35(4):60:1?60:13, July 2016. [142] Seungjae Lee, Changwon Jang, Seokil Moon, Byounghyo Lee, Jaebum Cho, and Byoungho Lee. See-through light field displays for augmented reality. In SIGGRAPH ASIA 2016 Virtual Reality Meets Physical Reality: Modelling and Simulating Virtual Humans and Environments, SA ?16, pages 3:1?3:2, New York, NY, USA, 2016. ACM. [143] Vitor F. Pamplona, Manuel M. Oliveira, Daniel G. Aliaga, and Ramesh Raskar. Tailored displays to compensate for visual aberrations. ACM Trans. Graph., 31(4):81:1?81:12, July 2012. [144] Fu-Chung Huang, Gordon Wetzstein, Brian A. Barsky, and Ramesh Raskar. Eyeglasses-free display: Towards correcting visual aberrations with computa- tional light field displays. ACM Trans. Graph., 33(4):59:1?59:12, July 2014. [145] Rahul Narain, Rachel A. Albert, Abdullah Bulbul, Gregory J. Ward, Martin S. Banks, and James F. O?Brien. Optimal presentation of imagery with focus cues on multi-plane displays. ACM Trans. Graph., 34(4):59:1?59:12, July 2015. [146] Nitish Padmanaban, Yifan Peng, and Gordon Wetzstein. Holographic near- eye displays based on overlap-add stereograms. ACM Trans. Graph., 38(6), November 2019. [147] Lantian Mi, Chao Ping Chen, Yifan Lu, Wenbo Zhang, Jie Chen, and Niza- muddin Maitlo. Design of lensless retinal scanning display with diffractive optical element. Opt. Express, 27(15):20493?20507, Jul 2019. [148] Jie Chen, Lantian Mi, Chao Ping Chen, Haowen Liu, Jinghui Jiang, and Wenbo Zhang. Design of foveated contact lens display for augmented reality. Opt. Express, 27(26):38204?38219, Dec 2019. [149] Changwon Jang, Kiseung Bang, Gang Li, and Byoungho Lee. Holographic near-eye display with expanded eye-box. ACM Trans. Graph., 37(6), Decem- ber 2018. [150] G. A. Koulieris, K. Akit, M. Stengel, R. K. Mantiuk, K. Mania, and C. Richardt. Near-eye display and tracking technologies for virtual and aug- mented reality. Computer Graphics Forum, 38(2):493?519, 2019. [151] A Lohmann and D Paris. Synthesis of binary holograms. IEEE Journal of Quantum Electronics, 2(4):153?153, 1966. [152] James P Waters. Holographic image synthesis utilizing theoretical methods. Applied physics letters, 9(11):405?407, 1966. 193 [153] Mark E Lucente. Interactive computation of holograms using a look-up table. Journal of Electronic Imaging, 2(1):28?35, 1993. [154] Hiroshi Yoshikawa, Susumu Iwase, and Tadashi Oneda. 
Fast computation of fresnel holograms employing difference. In Practical Holography XIV and Holographic Materials VI, volume 3956, pages 48?56. International Society for Optics and Photonics, 2000. [155] Mark Lucente and Tinsley A Galyean. Rendering interactive holographic im- ages. In Proceedings of the 22nd annual conference on Computer graphics and interactive techniques, pages 387?394. ACM, 1995. [156] Shuhong Xu, Farzam Farbiz, Sanjeev Solanki, Xinan Liang, and Xuewu Xu. Adaptive computation of computer-generated holograms. In Proceedings of The 7th ACM SIGGRAPH International Conference on Virtual-Reality Con- tinuum and Its Applications in Industry, VRCAI ?08, pages 35:1?35:2, New York, NY, USA, 2008. ACM. [157] M. Hossein Eybposh, Nicholas W. Caira, Mathew Atisa, Praneeth Chakravarthula, and Nicolas C. Pe?gard. DeepCGH: 3D computer-generated holography using deep learning. Opt. Express, 28(18):26636?26650, Aug ts , url = http://www.opticsexpress.org/abstract.cfm?URI=oe-28-18-26636, doi = 10.1364/OE.399624,. [158] Yifan Peng, Suyeon Choi, Nitish Padmanaban, Jonghyun Kim, and Gordon Wetzstein. Neural holography with camera-in-the-loop training. ACM Trans- actions on Graphics (TOG), 39(6), 2020. [159] Remo Ziegler, Simon Bucheli, Lukas Ahrenberg, Marcus Magnor, and Markus Gross. A bidirectional light field-hologram transform. In Computer Graphics Forum, volume 26, pages 435?446. Wiley Online Library, 2007. [160] Remo Ziegler, Simone Croci, and Markus Gross. Lighting and occlusion in a wave-based framework. In Computer Graphics Forum, volume 27, pages 211?220. Wiley Online Library, 2008. [161] Kaan Aks?it. Patch scanning displays: spatiotemporal enhancement for dis- plays. Opt. Express, 28(2):2107?2121, Jan 2020. [162] Ralph W Gerchberg and W. O. Saxton. A practical algorithm for the determi- nation of phase from image and diffraction plane pictures. Optik, 35:237?246, 1972. [163] Kazunobu Masuda, Yusuke Saita, Ryusuke Toritani, Peng Xia, Kouichi Nitta, and Osamu Matoba. Improvement of image quality of 3d display by using optimized binary phase modulation and intensity accumulation. Journal of Display Technology, 12(5):472?477, 2015. 194 [164] Praneeth Chakravarthula, Yifan Peng, Joel Kollin, Henry Fuchs, and Felix Heide. Wirtinger holography for near-eye displays. ACM Trans. Graph., 38(6), November 2019. [165] Mihir Parikh. Corrections to proximity effects in electron beam lithography. i. theory. Journal of Applied Physics, 50(6):4371?4377, 1979. [166] Allen M Carroll. Proximity-effect correction with linear programming. Journal of Applied Physics, 52(1):434?437, 1981. [167] DGL Chow, JF McDonald, DC King, W Smith, K Molnar, and AJ Steckl. An image processing approach to fast, efficient proximity correction for electron beam lithography. Journal of Vacuum Science & Technology B: Microelec- tronics Processing and Phenomena, 1(4):1383?1390, 1983. [168] MC Peckerar, S Chang, and CRK Marrian. Proximity correction algorithms and a co-processor based on regularized optimization. i. description of the algorithm. Journal of Vacuum Science & Technology B: Microelectronics and Nanometer Structures Processing, Measurement, and Phenomena, 13(6):2518? 2525, 1995. [169] Christie R Marrian, Steven Chang, and Martin C Peckerar. Proximity cor- rection for electron beam lithography. Optical Engineering, 35(9):2685?2693, 1996. [170] Martin Peckerar, David Sander, Ankur Srivastava, Adakou Foli, and Uzi Vishkin. Electron beam and optical proximity effect reduction for nanolithog- raphy: New results. 
Journal of Vacuum Science & Technology B: Microelec- tronics and Nanometer Structures Processing, Measurement, and Phenomena, 25(6):2288?2294, 2007. [171] Pengcheng Li. A review of proximity effect correction in electron-beam lithog- raphy. arXiv preprint arXiv:1509.05169, 2015. [172] Neal Parikh and Stephen Boyd. Proximal algorithms. Foundations and Trends in Optimization, 1(3):127?239, 2014. [173] Felix Heide, Steven Diamond, Matthias Nie?ner, Jonathan Ragan-Kelley, Wolfgang Heidrich, and Gordon Wetzstein. Proximal: Efficient image op- timization using proximal algorithms. ACM Trans. Graph., 35(4), July 2016. [174] Amir Beck and Marc Teboulle. Gradient-based algorithms with applications to signal recovery. Convex optimization in signal processing and communications, pages 42?88, 2009. [175] Zhou Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600?612, April 2004. 195 [176] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang. The unreasonable effectiveness of deep features as a perceptual metric. In 2018 IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 586?595, 2018. [177] K. Rathinavel, G. Wetzstein, and H. Fuchs. Varifocal occlusion-capable op- tical see-through augmented reality display based on focus-tunable optics. IEEE Transactions on Visualization and Computer Graphics, 25(11):3125? 3134, 2019. [178] Michal Makowski, Maciej Sypek, Izabela Ducin, Agnieszka Fajst, Andrzej Siemion, Jaroslaw Suszek, and Andrzej Kolodziejczyk. Experimental eval- uation of a full-color compact lensless holographic display. Optics express, 17(23):20840?20846, 2009. [179] Zixiang Lu and Yuji Sakamoto. Holographic display method for volume data by volume rendering. Optics express, 27(2):543?556, 2019. [180] T. Todd Elvins. A survey of algorithms for volume visualization. SIGGRAPH Comput. Graph., 26(3):194?201, August 1992. [181] E. Keppel. Approximating complex surfaces by triangulation of contour lines. IBM Journal of Research and Development, 19(1):2?11, Jan 1975. [182] H. Fuchs, Z. M. Kedem, and S. P. Uselton. Optimal surface reconstruction from planar contours. Commun. ACM, 20(10):693?702, October 1977. [183] William E. Lorensen and Harvey E. Cline. Marching cubes: A high resolution 3d surface construction algorithm. SIGGRAPH Comput. Graph., 21(4):163? 169, August 1987. [184] Peter Shirley and Allan Tuchman. A polygonal approximation to direct scalar volume rendering. In Proceedings of the 1990 Workshop on Volume Visual- ization, VVS ?90, pages 63?70, New York, NY, USA, 1990. Association for Computing Machinery. [185] Gregory M. Nielson and Bernd Hamann. The asymptotic decider: Resolving the ambiguity in marching cubes. In Proceedings of the 2nd Conference on Visualization ?91, VIS ?91, pages 83?91, Washington, DC, USA, 1991. IEEE Computer Society Press. [186] H. K. Tuy and L. T. Tuy. Direct 2-d display of 3-d objects. IEEE Computer Graphics and Applications, 4(10):29?34, Oct 1984. [187] M. Levoy. Display of surfaces from volume data. IEEE Computer Graphics and Applications, 8(3):29?37, May 1988. [188] Marc Levoy. Efficient ray tracing of volume data. ACM Trans. Graph., 9(3):245?261, July 1990. 196 [189] Lee Westover. Footprint evaluation for volume rendering. In Proceedings of the 17th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ?90, pages 367?376, New York, NY, USA, 1990. Association for Computing Machinery. [190] M. 
Zwicker, H. Pfister, J. van Baar, and M. Gross. Ewa volume splatting. In Proceedings Visualization, 2001. VIS ?01., pages 29?538, Oct 2001. [191] M. Berger, J. Li, and J. A. Levine. A generative model for volume render- ing. IEEE Transactions on Visualization and Computer Graphics, 25(4):1636? 1650, April 2019. [192] Jan Nova?k, Iliyan Georgiev, Johannes Hanika, Jaroslav Kr?iva?nek, and Wo- jciech Jarosz. Monte carlo methods for physically based volume rendering. In ACM SIGGRAPH 2018 Courses, SIGGRAPH ?18, New York, NY, USA, 2018. Association for Computing Machinery. [193] Michal Makowski, Maciej Sypek, Andrzej Kolodziejczyk, and Grzegorz Mikula. Three-plane phase-only computer hologram generated with iterative fresnel algorithm. Optical Engineering, 44(12):125805, 2005. [194] Nobuyuki Masuda, Tomoyoshi Ito, Takashi Tanaka, Atsushi Shiraki, and Takashige Sugie. Computer generated holography using a graphics processing unit. Optics Express, 14(2):603?608, 2006. [195] Kyoji Matsushima and Masahiro Takai. Recurrence formulas for fast creation of synthetic three-dimensional holograms. Applied Optics, 39(35):6587?6594, 2000. [196] Kyoji Matsushima, Hirohito Nishi, and Sumio Nakahara. Simple wave-field rendering for photorealistic reconstruction in polygon-based high-definition computer holography. Journal of Electronic Imaging, 21(2):023002, 2012. [197] Yuji Sakamoto and Yoshinao Aoki. Autostereoscopic visualization of volume data using computer-generated holograms. Electronics and Communications in Japan (Part III: Fundamental Electronic Science), 90(11):31?39, 2007. [198] Zixiang Lu and Yuji Sakamoto. Holographic display methods for volume data: polygon-based and mip-based methods. Applied optics, 57(1):A142? A149, 2018. [199] Takuo Yoneyama, Chanyoung Yang, Yuji Sakamoto, and Fumio Okuyama. Eyepiece-type full-color electro-holographic binocular display with see-through vision. In Digital Holography and Three-Dimensional Imaging, page DW2A.11. Optical Society of America, 2013. [200] Eugene Hecht and A. R. Ganesan. Optics. Pearson, Chennai, 4th edition, 2008. 197 [201] Ruofei Du and Amitabh Varshney. Social Street View: Blending Immersive Street Views with Geo-Tagged Social Media. In Proceedings of the 21st Inter- national Conference on Web3D Technology, Web3D, pages 77?85. ACM, July 2016. [202] Ruofei Du, Sujal Bista, and Amitabh Varshney. Video Fields: Fusing Multiple Surveillance Videos into a Dynamic Virtual Environment. In Proceedings of the 21st International Conference on Web3D Technology, Web3D, pages 165? 172. ACM, July 2016. [203] Ruofei Du, David Li, and Amitabh Varshney. Experiencing a Mirrored World With Geotagged Social Media in Geollery. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, CHI, pages 1?4. ACM, May 2019. [204] Ruofei Du, David Li, and Amitabh Varshney. Geollery: A Mixed Reality Social Media Platform. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI), CHI, pages 1?13. ACM, May 2019. [205] Ruofei Du, David Li, and Amitabh Varshney. Project Geollery.com: Recon- structing a Live Mirrored World With Geotagged Social Media. In Proceedings of the 24th International Conference on Web3D Technology, Web3D, pages 1? 9. ACM, July 2019. [206] Eric Krokos, Catherine Plaisant, and Amitabh Varshney. Virtual Memory Palaces: Immersion Aids Recall. Springer VR 2018, 1(15):1?20, May 2018. [207] Hani Nejadriahi and Volker J. Sorger. On-chip integrated all-optical fast fourier transform: Design and analysis. 
In Frontiers in Optics 2017, page JW4A.46. Optical Society of America, 2017. [208] Desmond C. Ong, Sanjeev Solanki, Xinan Liang, and Xuewu Xu. Analysis of laser speckle severity, granularity, and anisotropy using the power spectral density in polar-coordinate representation. Optical Engineering, 51(5):1 ? 8, 2012. [209] Hsueh-Chien Cheng, Antonio Cardone, and Amitabh Varshney. Volume Seg- mentation Using Convolutional Neural Networks With Limited Training Data. In Proceedings of 2017 IEEE International Conference on Image Processing, ICIP. IEEE, September 2017. [210] Hsueh-Chien Cheng, Antonio Cardone, Somay Jain, Eric Krokos, Kedar Narayan, Sriram Subramaniam, and Amitabh Varshney. Deep-learning- assisted Volume Visualization. IEEE Transactions on Visualization and Com- puter Graphics, PP(99):1?14, January 2018. [211] Eric Veach and Leonidas J. Guibas. Optimally combining sampling techniques for monte carlo rendering. In Proceedings of the 22nd Annual Conference on 198 Computer Graphics and Interactive Techniques, SIGGRAPH 95, page 419428, New York, NY, USA, 1995. Association for Computing Machinery. [212] Peter Shirley, Changyaw Wang, and Kurt Zimmerman. Monte carlo techniques for direct lighting calculations. ACM Trans. Graph., 15(1):136, January 1996. [213] Xiaoxu Meng, Quan Zheng, Amitabh Varshney, Gurprit Singh, and Matthias Zwicker. Real-time Monte Carlo Denoising with the Neural Bilateral Grid. In Carsten Dachsbacher and Matt Pharr, editors, Eurographics Symposium on Rendering - DL-only Track. The Eurographics Association, 2020. 199