Title: Toward Urban Model Acquisition From Geo-Located Images
1 Toward Urban Model Acquisition From Geo-Located Images
- Seth Teller
- MIT Computer Graphics Group
- graphics.lcs.mit.edu
- ARPA Site Visit, April 1999
2 Motivation: Modeling process
- Representing an object in a form suitable for manipulation by a computer program
- A fundamental bottleneck in graphics
- Our emphasis: extended urban scenes
- Possible modeling methods
- Direct entry (AutoCAD, etc.)
- Procedural generation
- Semi-automatic photogrammetry
- Computer vision techniques
- City Scanning project: a sensor, and associated algorithms, to automatically acquire textured models of urban scenes
3 Engineering Emphases
- Novel sensor: imagery and navigation data
- Gives approximate exterior orientation per image
- End-to-end architecture, real data
- Simple modules first; will elaborate later
- Data redundancy
- Acquire thousands of observations
- Can discard a constant fraction
- Structure has strong signature in remainder
- Handle dense occlusion automatically
- Use of ensemble features
- Avoid matching individual vertices, etc.
- Scalable spatial data structures
- Resources linear in input, output size
4 From 2D Images to 3D Site Models
1) Images acquired
2) Processed by human image analysts
3) Who establish camera control
4) Indicate/verify scene structure
5) Control and structure are then optimized; generalized triangulation ensues.
Note: human(s) in the loop!
Implications for scaling, throughput
5 Fundamental limitations of semi-automated approaches
- Existing algorithms/systems assume:
- Every image handled by human operator
- Human operator indicates structure, etc.
- All pairs of images are correlated: O(n^2)
- Implications
- Human in loop: throughput limitation
- Quadratic processing: scaling limitation
- Typically, small number (tens) of input images
- System does not improve w/ technology!
- Eventually, human operator is bottleneck
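To make the quadratic scaling concrete, here is a small arithmetic sketch. The figure of 50 images is an assumed stand-in for "tens"; 4,000 is the size of the first dataset reported later in this deck. The function name is hypothetical.

```python
def pairwise_correlations(n_images: int) -> int:
    """Count unordered image pairs when every image is correlated with every other."""
    return n_images * (n_images - 1) // 2


# 50 is an assumed value for "tens of images"; 4000 matches the first dataset.
small = pairwise_correlations(50)      # 1225 pairs
large = pairwise_correlations(4000)    # 7998000 pairs
print(small, large, large // small)
```

Going from tens to thousands of images multiplies the number of pairs by several thousand, which is why all-pairs correlation does not scale to the datasets targeted here.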
6 Fully Automated Site Modeling
- How it can be done in the future
1) Many geo-located images acquired
2) Images are spatially indexed
3) Geometry estimated combinatorially; reflectance estimated with robust statistics
7 Implications of Novel Approach
- Spatial index: acquisition time depends only on image density, region size (but not on total number of images)
- System throughput increases with advances in underlying technology
- Quantity/complexity of reconstructed scenes will grow steadily with time
- Fusion of spatially distributed models eased
- Quality will likely be less than that possible
with a human operator
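The spatial-index claim can be illustrated with a minimal uniform-grid sketch. This is not the project's actual data structure, which the slides do not describe; the class name `GridIndex`, the 2D camera positions, and the cell size are all assumptions made for illustration. The point is that a region query visits only the cells overlapping the region, so its cost tracks region size and local image density rather than the total number of images.

```python
from collections import defaultdict
from math import floor


class GridIndex:
    """Hypothetical uniform-grid index over 2D camera positions."""

    def __init__(self, cell_size: float):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (i, j) -> [(x, y, image_id), ...]

    def _cell(self, x: float, y: float):
        return (floor(x / self.cell_size), floor(y / self.cell_size))

    def insert(self, x: float, y: float, image_id: int) -> None:
        self.cells[self._cell(x, y)].append((x, y, image_id))

    def query(self, xmin: float, ymin: float, xmax: float, ymax: float):
        """Return ids of images whose camera position lies inside the box."""
        i0, j0 = self._cell(xmin, ymin)
        i1, j1 = self._cell(xmax, ymax)
        hits = []
        # Only cells overlapping the query box are visited.
        for i in range(i0, i1 + 1):
            for j in range(j0, j1 + 1):
                for x, y, image_id in self.cells.get((i, j), ()):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(image_id)
        return hits


# Assumed toy data: 1000 images taken one metre apart along a street.
index = GridIndex(cell_size=10.0)
for k in range(1000):
    index.insert(float(k), 0.0, k)

nearby = index.query(95.0, -1.0, 105.0, 1.0)
print(sorted(nearby))  # ids 95 through 105
```

Adding more images elsewhere in the city leaves the work done by this query unchanged, since the extra images land in cells the query never touches.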
8 Novel problem domain
- Urban exteriors (built structure)
- Tens of thousands of digital still images
- Acquisition near-ground, inside scene
- Absolute, a priori camera pose estimates
- No human in the loop (break scale barrier)
9 Research/Engineering Footprints
                                        Ascender [1]                 Façade [2]              MIT/City
Number of images                        Tens                         Tens                    Thousands
6-DOF camera pose                       From human                   From human              Instruments, optim.
Structure extraction                    Roof-matching, optimization  By human, optimization  Automatic detection, optimization
Number of structures                    Tens                         One to Tens             Arbitrary
Output coordinate system                Specified by operator        Specified by operator   Geodetic (Earth) coordinates
Texture                                 Procedural matching          Manual segmentation     Automatic w/ robust statistics
Scaling capability                      Unclear                      Unclear                 Spatial index
Parallel model acquisition and merging  None                         None                    Use of geodetic coordinates, index
[1] Hanson et al., UMASS
[2] Debevec et al., Berkeley
10 Challenges (Research/Engineering/Systems)
- Instrumentation for absolute geolocation
- Sparse/dense correspondence algorithms
- Incremental/multiresolution reconstruction
- Scalable in number of images, output features
- Estimation of surface reflectance (texture)
- System assessment (speed, error, cost)
11 Geo-located digital camera
Cheap digital cameras, GPS, and MEMS inertial chipsets soon available; also MAVs
12 Image acquisition: First dataset
Early prototype of pose camera deployed in and around Tech Square (4 structures)
Collected 81 nodes, 4,000 geo-located images
13 Mosaic generation
Each node is 50-70 images tiling a sphere about a mechanically fixed optical center
Each node correlated to form spherical mosaic
Camera internal parameters auto-calibrated
Computation is automated (no human in loop)
Per node, about 20 CPU-minutes at 200 MHz
14 Mosaics: A Closer Look
Each is about 75 megapixels; with improved cameras, each will be about 300 megapixels
15 Imagery Control: Exterior Registration
Each node is controlled, or co-situated, in a common, global (Earth) coordinate system
Instrumentation requires human assistance (1 hr total, or about 1 second per image)
Mosaicing: significant engineering advantage
Goal: full automation of geo-location instruments
16 Building detection without correspondence
Histogramming algorithm identifies orientations of significant vertical façades in a spatial region
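The slide does not spell out the histogramming algorithm, so the following is only a generic sketch of the idea: accumulate many noisy orientation observations into bins and keep the well-supported peaks. The function name, the 5-degree bin width, the 20% support threshold, and the toy data are all assumptions. Azimuths are taken modulo 180 degrees, since a façade and its reverse share an orientation.

```python
def dominant_orientations(azimuths_deg, bin_width=5.0, min_fraction=0.2):
    """Return bin-centre azimuths (degrees, in [0, 180)) with strong support.

    A bin is reported when it holds at least min_fraction of all observations
    and is a local maximum among its circular neighbours.
    """
    n_bins = int(round(180.0 / bin_width))
    counts = [0] * n_bins
    for a in azimuths_deg:
        counts[int((a % 180.0) / bin_width) % n_bins] += 1
    total = sum(counts)
    peaks = []
    for b, c in enumerate(counts):
        left = counts[(b - 1) % n_bins]
        right = counts[(b + 1) % n_bins]
        if total and c >= min_fraction * total and c >= left and c > right:
            peaks.append((b + 0.5) * bin_width)
    return peaks


# Assumed toy observations: two façade directions plus scattered clutter.
facade_a = [30.5 + 0.1 * k for k in range(40)]    # clustered near 32 degrees
facade_b = [120.5 + 0.1 * k for k in range(30)]   # clustered near 122 degrees
clutter = [7.0 + 17.0 * k for k in range(10)]     # isolated outliers

print(dominant_orientations(facade_a + facade_b + clutter))  # [32.5, 122.5]
```

No observation is matched to any other here; the structure shows up only as peaks in the ensemble, which is why a constant fraction of bad observations can be tolerated.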
17 Building detection II
Sweep-plane algorithm identifies locations and extents of these vertical façades
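As a rough illustration of a sweep-plane search (not the project's implementation, whose details are not given on the slide), the sketch below sweeps a vertical plane of known orientation along its normal and reports positions where enough feature points fall within a thin slab. The function name, the slab thickness `step`, the `min_support` threshold, and the convention that azimuth is measured counter-clockwise from the +x axis are assumptions.

```python
from collections import defaultdict
from math import cos, floor, radians, sin


def sweep_plane(points_xy, normal_azimuth_deg, step=0.5, min_support=20):
    """Find well-supported vertical planes with a given horizontal normal.

    Returns (offset, extent_min, extent_max) tuples: offset is the slab-centre
    distance from the origin along the normal, and the extent is measured
    along the plane's horizontal direction.
    """
    t = radians(normal_azimuth_deg)
    nx, ny = cos(t), sin(t)      # horizontal unit normal of the swept plane
    tx, ty = -ny, nx             # horizontal direction lying in the plane
    slabs = defaultdict(list)    # slab index -> along-plane coordinates
    for x, y in points_xy:
        d = x * nx + y * ny      # position along the sweep direction
        slabs[floor(d / step)].append(x * tx + y * ty)
    facades = []
    for k in sorted(slabs):
        along = slabs[k]
        if len(along) >= min_support:
            facades.append(((k + 0.5) * step, min(along), max(along)))
    return facades


# Assumed toy data: a wall at x = 10.2 spanning y = 0..29, plus stray points.
wall = [(10.2, float(y)) for y in range(30)]
stray = [(3.0 * k, 5.0) for k in range(8)]

print(sweep_plane(wall + stray, normal_azimuth_deg=0.0))  # [(10.25, 0.0, 29.0)]
```

The offset locates the façade and the min/max along-plane coordinates give its horizontal extent; stray points never accumulate enough support in any one slab to be reported.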
18 Texture estimation challenge
19 Reflectance (Texture) estimation
Robust, weighted median-statistics algorithm estimates texture/BRDF for each building façade
weighted xyY median
sharpening
Algorithm removes structural occlusion, foliage, blur (obliquity), and color and lighting variations!
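A weighted median is the core of the robust estimate named above. The sketch below shows it for a single façade texel and a single channel (for example the Y of xyY). How the project actually assigns weights is not stated on the slide; here the weights are assumed to encode viewing quality such as obliquity, and the sample values are invented for illustration.

```python
def weighted_median(values, weights):
    """Return the value at which the cumulative weight first reaches half the total."""
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    if not pairs:
        raise ValueError("need at least one observation with positive weight")
    half = 0.5 * sum(w for _, w in pairs)
    running = 0.0
    for v, w in pairs:
        running += w
        if running >= half:
            return v
    return pairs[-1][0]  # guard against floating-point shortfall


# Assumed observations of one texel's luminance from six images:
# 0.15 is a dark occluder (foliage), 0.95 a specular glint seen obliquely.
luminance = [0.62, 0.60, 0.15, 0.61, 0.95, 0.63]
quality = [1.0, 0.8, 0.9, 1.0, 0.3, 0.7]

print(weighted_median(luminance, quality))  # 0.61
```

Unlike a weighted mean, the result is one of the consistent observations and is unaffected by how extreme the occluded or glinting samples are, as long as they carry less than half of the total weight.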
20 Texture estimation results
Input: Raw imagery
Output: Synthetic texture
- Made possible by many observations
- A sensor and system that effectively see through complex foliage!
21 Preliminary results (with overlay of aerial image)
Model represents about 1 CPU-day at 200 MHz
Next: acquire full MIT campus; compare to reference model captured via traditional surveying
22 From the East
From the South
23 System overview
24 Contributions
Instrumentation, scalable end-to-end system design
Address scaling with geo-location, spatial data structures
Novel mosaicing, reconstruction, texturing algorithms
Significant step toward fully automated reconstruction
Next goal: capture MIT Campus (200 structures) from 1 Tb of ground, 1 Tb of aerial imagery
25 Further information
- graphics.lcs.mit.edu
- graphics.lcs.mit.edu/seth
- graphics.lcs.mit.edu/city/city.html
- graphics.lcs.mit.edu/publications.html