ABSTRACT Title of thesis: ANALYSIS OF LIDAR DATA FOR FLUVIAL GEOMORPHIC CHANGE DETECTION AT A SMALL MARYLAND STREAM Vincent J. Gardina, Master of Science, 2008 Thesis directed by: Associate Professor Kaye L. Brubaker Department of Civil and Environmental Engineering Numerous detailed topographic measurements, which must be periodically repeated, are required to characterize stream bank and channel geometry. Light Detection and Ranging (LiDAR) is becoming more widely used, but its accuracy for change detection in and around small streams is not well quantified. Two LiDAR and one ground-surveyed elevation data sets are compared for a thickly vegetated riparian area in the Maryland Piedmont. Interpolated surfaces (prediction maps) and estimates of their uncertainty (standard error maps) are created from the point data using kriging. The LiDAR 2006 elevations are compared to ground-survey to evaluate accuracy. LiDAR 2002 and 2006 elevations are compared to evaluate the potential for change detection. When the estimated LiDAR system error is included in hypothesis testing, no statistically significant elevation differences are found between 2002 and 2006. Conclusions about geomorphic change based on LiDAR scenes should account for error and uncertainty in the data collection and processing. ANALYSIS OF LIDAR DATA FOR FLUVIAL GEOMORPHIC CHANGE DETECTION AT A SMALL MARYLAND STREAM by Vincent J. Gardina Thesis Submitted to the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Master of Science of Civil Engineering 2008 Advisory Committee: Associate Professor Kaye L. Brubaker, Chair Professor Richard H. McCuen Professor Gregory B. Baecher ii Copyright by Vincent J. Gardina 2008 Acknowledgements I would like to thank the numerous individuals who assisted both directly and indirectly in helping this research come to completion. I am most grateful to Dr. Kaye L. Brubaker who contributed her insight, knowledge and passion for engineering. Her guidance and assistance have been invaluable. I am also grateful to Dr. Taylor Jarnigan, lead research scientist with the United States Environmental Protection Agency (USEPA) and David B. Jennings, research scientist, USEPA, who provided valuable information that, enabled this research. I am grateful to Mr. Keith Van Ness, Department of Environmental Protection, Montgomery County, Maryland and his staff for allowing this research to become part of their ongoing efforts to ensure protection of streams and watersheds in Montgomery County. I wish to acknowledge the engineering firm of Johnson, Mirmiran and Thompson (JMT) for their the generous contribution of resources and professional surveying staff and especially, Mr. Jerry Jurick P.E., Vice President, Environmental and Facilities (JMT), and David Stickles, Vice President, Surveys (JMT), for their commitment to community and academic research helping to improve engineering technology. ii TABLE OF CONTENTS LIST OF TABLES............................................................................................................V LIST OF FIGURES........................................................................................................VI CHAPTER 1. INTRODUCTION.................................................................................... 1 CHAPTER 2. LITERATURE REVIEW........................................................................ 6 2.1 TRANSECT METHODS ........................................................................................................................... 6 FIGURE 2-2. LONGITUDINAL PROFILE AND PLAN VIEW OF RIFFLE AND POOL SEQUENCE (HARRELSON, 1994) ................................................................... 8 FIGURE 2-3. POSITIONING OF A TRANSECT USING A FIXED LOCATION AND SETTING OF A REBAR MONUMENT (HARRELSON, 1994)....................... 8 2.2 DIGITAL ELEVATION MODELS............................................................................................................ 10 2.3 LIGHT DETECTION AND RANGING (LIDAR)....................................................................................... 18 2.3.1 LiDAR Sensors and Positioning................................................................................................. 18 2.3.2 Determining the Position of LiDAR Points................................................................................ 21 2.3.3 Post-processing of LiDAR Data................................................................................................. 23 2.4 LIDAR ERROR ................................................................................................................................... 23 2.4.1 Determination of Horizontal and Vertical Error ....................................................................... 23 2.4.2 System Error .............................................................................................................................. 24 2.4.3 Labeling Error ........................................................................................................................... 25 2.4.4 Slope Error................................................................................................................................. 26 2.4.5 Point Density.............................................................................................................................. 27 2.4.6 Interpolation Error .................................................................................................................... 28 2.4.7 Error Budget for LiDAR Data.................................................................................................... 30 2.4.8 Statistical Significance of Differences in Mean Absolute Errors by Land Cover Types ............ 37 2.5 USING LIDAR IN FLUVIAL TOPOGRAPHY .......................................................................................... 39 2.6 USING LIDAR TO IDENTIFY FIRST ORDER STREAM CHANNELS......................................................... 43 2.7 USE OF LIDAR TO DETERMINE EROSIONAL CHANGES OF BEDROCK CHANNELS............................... 44 CHAPTER 3. METHODOLOGY................................................................................. 47 3.1 STUDY AREA ...................................................................................................................................... 47 3.2 LIDAR DATA COLLECTION................................................................................................................ 49 3.2.1 LiDAR 2002 General Specifications .......................................................................................... 51 3.2.2 LiDAR 2002 Calibration............................................................................................................ 51 3.2.3 LiDAR 2006 General Specifications .......................................................................................... 53 3.2.4 LiDAR 2006 Calibration............................................................................................................ 56 3.2.5 LiDAR bare earth processing .................................................................................................... 56 3.3. GROUND SURVEY DATA COLLECTION............................................................................................... 56 3.4 SELECTION OF POINTS FROM LIDAR DATA ....................................................................................... 57 3.5 GEOGRAPHICAL INFORMATION SYSTEM (GIS)................................................................................... 58 3.5.1 Data types in a GIS .................................................................................................................... 58 3.5.2 Importing LiDAR and Ground Surveyed Points into ArcMap 9.2 ............................................. 60 3.6 INTERPOLATION.................................................................................................................................. 60 3.6.1 Triangulated Irregular Network (TIN)....................................................................................... 60 3.6.2 Optimal Interpolation (Kriging) ................................................................................................ 63 iii 3.6.3 Creating Rasters from the Kriged Prediction and Standard error Maps................................... 65 3.7 COMPARING ELEVATIONS BASED ON LIDAR 2006 AND GROUND SURVEYED DATA......................... 65 3.7.1 Correcting for Systematic Error ................................................................................................ 66 3.7.2 Statistical Hypothesis Test for Elevation Differences ................................................................ 68 3.7.3 Inferring LiDAR system error by comparing LiDAR and Ground............................................. 71 3.7.4 Error Budgets............................................................................................................................. 75 3.8 TESTING FOR GEOMORPHIC CHANGE FROM 2002 TO 2006 USING LIDAR.......................................... 76 CHAPTER 4. RESULTS AND ANALYSIS................................................................. 78 4.1 COMPARISON OF SURFACES DERIVED FROM LIDAR 2006 AND GROUND-SURVEYED POINTS............ 78 4.1.1. Comparing LiDAR Bare-ground Points and Ground-Surveyed Points .................................... 78 4.1.2 Creation of TINs for LiDAR 2006 and Ground Survey.............................................................. 80 4.1.4 Quantitative Comparison of TIN-derived rasters ...................................................................... 84 4.1.5 Interpolating Point Elevation Measurements with Kriging ....................................................... 91 4.1.6 Creating Ordinary Kriged Standard error Maps....................................................................... 91 4.1.6 Testing for Elevation Differences............................................................................................. 103 4.1.6 Estimating LiDAR System Error by Reversing the Hypothesis Test ........................................ 105 4.1.7 Evaluating the Error Budget Components for LiDAR 2006 ? Ground .................................... 111 4.2 COMPARISON OF SURFACES DERIVED FROM LIDAR 2006 AND LIDAR 2002 POINTS ..................... 115 4.2 COMPARISON OF SURFACES DERIVED FROM LIDAR 2006 AND LIDAR 2002 POINTS ..................... 116 4.2.2 Evaluating the Error Budget Components for LiDAR 2006 ? LiDAR 2002............................. 133 CHAPTER 5. CONCLUSIONS AND DISCUSSION ............................................... 139 iv List of Tables Table 2-1. Observed LiDAR Elevation Error (cm) for leaf-off data (Hodgson, et.al. 2004) 34 Table 2-2. Observed LiDAR Elevation Error at Points and Interpolation Error at Points (Elevation in cm) (Hodgson, 2004) 35 Table 2-3. Density of LiDAR ground returns by cover type (Hodgson, 2004) 37 Table 2-4. Observed Absolute LiDAR Elevation Error by cover type (Hodgson, 2004) 37 Table 2-5. Significance Levels for Difference between Mean Absolute Error by Land Cover (Hodgson, 2004) 38 Table 2-6. Elevation Errors between LiDAR data and ground data 42 Table 2-7. Vertical and Horizontal Accuracy of Three Data Sources (Smith, 2006) 44 Table 3-1. Final calibration values after adjustments were made from the calibration flight (Airborne 1, 2002). 53 Table 4-1. Geostatistical Analyst (GA), Ordinary Kriging Prediction Map Setting and Calculated Values Using LiDAR 2006 Points 93 Table 4-2. Geostatistical Analyst (GA), Ordinary Kriging Prediction Map Setting and Calculated Values Using Ground Survey Points 92 Table 4-3. Statistics of Cell-by-Cell Ratios of Squared Standard Error to the Total Squared Error for Each Source of Error, LiDAR 2006-Ground 111 Table 4-4. Geostatistical Analyst (GA), Ordinary Kriging Prediction Map Setting and Calculated Values Using LiDAR 2002 Points 124 Table 4-5. Statistics of Cell-by-Cell Ratios of Squared Standard Error to the Total Squared Error for Each Source of Error, LiDAR 2006-LiDAR 2002 133 v List of Figures Figure 1-1. Satellite classification of urban land cover, 1970s to 2000 showing the CSPA outlined in yellow (Jarnigan and Jennings, 2004). 3 Figure 2-1. Diagram of Stream Cross-section Survey (Harrelson, 1994) 7 Figure 2-2. Longitudinal Profile and plan view of riffle and pool sequence (Harrelson, 1994) 8 Figure 2-3. Positioning of a transect using a fixed location and setting of a rebar monument (Harrelson, 1994) 8 Figure 2-4. Cross-sectional profile determined from a transect survey (Harrelson, 1994). 9 Figure 2-5. Measurement of height from a single vertical aerial photograph of a building (Jensen, 2000) 12 Figure 2-6. Computing height by measuring the stereoscopic x-parallax by superpositioning neighboring photographs in the line of flight (Jensen, 2000) 14 Figure 2-7. Example of Instantaneous Field of View at different scan angles (Burtch 2002) 20 Figure 2-8. Pitch, yaw and roll of the aircraft. 21 Figure 2-9. Example of roll on the LiDAR signal (Burtch, 2002) 22 Figure 2-10. Effects of Terrain Slope on observed error (Hodgson, 2004) 27 Figure 2-11. Evaluation of interpolation error by removing points and cross-validating (Hodgson 2004) 32 Figure 2-12. Photo of Sopers Branch just south of the bridge and parking lot 39 Figure 3-1. CSPA showing the Area of Interest (inside red rectangle) with the parking lot, stream and surrounding riparian zone. (Image source: Canaan Valley Institute, 2006) 50 vi Figure 3-2. Clarksburg Special Protection Area showing tiles where LiDAR 2002 data were collected. Tile 48 is the location of the Area of Interest for this study. 52 Figure 3-3. Clarksburg Special Protection Area showing tiles where LiDAR 2006 data were collected. Tile 236NW14 is the location of the Area of Interest for this study. 54 Figure 3-4. CSPA showing LiDAR 2006 point density (1.17 per square meter) within the AOI 55 Figure 3-5. Stream channel cross-section showing the differences in slope interpolation between a ground-survey using a visually-identified breakline point at the top of bank and a surveyed point at the bottom of bank and randomly-located LiDAR points 62 Figure 3-6. Parking Lot within the AOI at Soper?s Run with the March 22, 2006 Survey team. (Photo: K. Brubaker) 66 Figure 4-1. Ground point polygon (green outlined polygon) containing the ground-surveyed points. This polygon was used to clip an area used for comparison of LiDAR and ground survey-derived surface features and grids. 79 Figure 4-2. Ground point polygon showing the clipped LiDAR 2006 points. 81 Figure 4-3. Triangulated Irregular Network (TIN) derived from ground-surveyed data and delineated to the ground point polygon. 82 Figure 4-4. Triangulated Irregular Network (TIN) derived from LiDAR 2006 data and delineated to the ground point polygon. 83 Figure 4-5. ArcMap-generated TIN difference map (LiDAR 2006 minus Ground-Surveyed). Only qualitative information is provided. 85 Figure 4-6. Raster created from the LiDAR 2006 point TIN. Grid size is 0.5 ft. 86 Figure 4-7. Raster created from the Ground-Surveyed TIN. Grid size is 0.5 ft. 87 vii Figure 4-8. Parking lot difference raster was calculated by subtracting the ground survey-derived raster from the LiDAR 2006-derived raster and multiplying by the parking lot mask (mean difference = -0.16556 feet) 88 Figure 4.9. Histogram showing the distribution of elevation differences in the parking lot (LiDAR 2006 minus Ground Surveyed), minimum value -0.74 ft, maximum value 0.16 ft., mean value -0.16556 ft. 89 Figure 4-10. Difference raster showing the elevation differences between the rasters created from the LiDAR 2006 TIN and the ground-surveyed TIN. 90 Figure 4-11. Ordinary kriged prediction map created from LiDAR 2006 points. 93 Figure 4-12. Ordinary kriged prediction map created from ground-surveyed points. 94 Figure 4-13. Ordinary kriged standard error map created from LiDAR 2006 points. 95 Figure 4-14. Ordinary kriged standard error map created from ground-surveyed points. 96 Figure 4-15. Ordinary kriged raster derived from the ordinary kriged prediction map created from LiDAR 2006 points 98 Figure 4-16. Ordinary kriged raster derived from the ordinary kriged prediction map created from ground-surveyed points 99 Figure 4-17. Ordinary Kriged Prediction standard error raster created from the Kriged prediction standard error map using LiDAR 2006 points. 100 Figure 4-18. Ordinary Kriged Prediction standard error raster created from the Kriged prediction standard error map using ground-surveyed points. 101 Figure 4-19. Ordinary Kriged parking lot difference raster (Kriged LiDAR 2006 raster minus kriged Ground-Surveyed raster over the parking lot surface). Mean elevation difference for (Lidar-Ground) is -0.15892 ft. 102 Figure 4-20. Raster showing the difference between adjusted Kriged LiDAR 2006 raster and kriged Ground-Surveyed raster. 104 viii Figure 4-21. Map showing the locations where the difference between the interpolated LiDAR 2006 and ground-surveyed elevations were within the 95% Confidence Interval and where they were exceeded. Brown and blue areas were outside the 95% Confidence Interval. The SE of the difference of LiDAR 2006 and ground-surveyed was used. 106 Figure 4-22. Critical standard error raster for the Ordinary Kriged difference raster of the LiDAR 2006 and ground-surveyed points. 107 Figure 4-23. Additional squared error necessary to obtain Z critical for acceptance of the hypothesis of equality. Negative values occur in pixels where the Z statistic was already in the range of acceptance. The highest positive value is 1.61. The inferred SE for the LiDAR system is 1.26 (square root of 1.61). 109 Figure 4-24. Map showing the locations where the difference between the interpolated LiDAR 2006 and ground-surveyed elevations were within the 95% Confidence Interval, after including the assumed Ground Survey system error and the inferred LiDAR system error in calculating the standard error of the elevation difference. The inferred SE of the LiDAR system (1.26) was used. 110 Figure 4-25. Ratio of SE 2 of the LiDAR 2006 system to the total SE 2 . 112 Figure 4-26. Ratio of SE 2 of the LiDAR interpolation to the total SE 2 . 113 Figure 4-27. Ratio of SE 2 of the Ground-survey interpolation to the total SE 2 . 114 Figure 4-28. Map showing the sum of the ratio rasters, verifying that the sum of the squared error ratios is equal to 1 in all grid cells. 115 Figure 4-29. LiDAR 2002 points within the ground point polygon 117 Figure 4-30. Triangulated Irregular Network (TIN) derived from LiDAR 2002 data and delineated to the ground point polygon. 118 Figure 4-31. Qualitative difference of the LiDAR2006 and LiDAR 2002 TINs. 119 Figure 4-32 Raster derived from LiDAR 2002 TIN. 120 Figure 4-33. Difference raster showing the LiDAR 2006 minus LiDAR 2002 raster over the parking lot. The mean difference is -0.14988 feet. 121 ix x Figure 4-34. Raster of the difference between the TIN-based LiDAR 2006 adjusted raster and LiDAR 2002 raster. 122 Figure 4-35. Ordinary Kriged Prediction Map created from the LiDAR 2002 dataset. 125 Figure 4-36. Ordinary Kriged standard error Map created from the LiDAR 2002 dataset. 126 Figure 4-37. Ordinary Kriged Prediction Map raster created from the LiDAR 2002 dataset. 127 Figure 4-38. Ordinary Kriged standard error Raster created from the LiDAR 2002 dataset. 128 Figure 4-39. Parking Lot Difference of the Ordinary Kriged Prediction raster (LiDAR 2006 ? LiDAR 2002). Mean Difference is -0.15298 feet which was used to adjust the LiDAR 2006 raster for systematic error. 129 Figure 4-40. Map showing the difference of the adjusted LiDAR 2006 surface and the LiDAR 2002 surface. 131 Figure 4-41 . Map showing the locations where the difference between the interpolated LiDAR 2006 and LiDAR 2002 elevations were within the 95% Confidence Interval, after including the inferred LiDAR system error for both LIDAR 2006 and 2002 datasets in calculating the standard error of the elevation difference. The inferred SE of the LiDAR system (1.26 ft) was used. 132 Figure 4-42. Ratio of SE 2 of the LiDAR 2006 system to the total SE 2 . 135 Figure 4-44. Ratio of SE 2 of the LiDAR 2002 system to the total SE 2 . 136 Figure 4-44. Ratio of SE 2 of the LiDAR 2006 interpolation to the total SE 2 . 137 Figure 4-45. Ratio of SE 2 of the LiDAR 2002 interpolation to the total SE 2 . 138 Chapter 1. Introduction Geomorphic changes along a riparian zone or within a stream channel that results from either natural or anthropogenic factors are difficult to measure in situ. Property access, dense vegetation, topography, weather, and cost can restrict measurements needed to assess these changes. Topographic changes along stream banks, changes in channel geometry, and detection of degradation and aggradations within the channel require numerous detailed measurements along transects and longitudinally within the stream bed. These measurements must be repeated periodically to assess change. These traditional methods of assessing stream channel geometry are very spatially and temporally limited. It is difficult to assess changes over long time periods, along lengthy stream reaches, and throughout large stream networks. The actual process of surveying land to determine topography requires numerous measurements in the field. When evaluating changes to stream channels and adjoining riparian zones, transects across the stream channel must be measured to make inferences about the channel geometry between transects. Longitudinal surveys along the stream channel must be completed to identify thalwegs, pools, riffles, beds, bars, and other stream geomorphic features. These measurements and the use of other techniques such as pebble counts can help to develop conclusions about the change in channel geometry and the impact of flow. These fluvial geomorphic changes may be a natural progression caused by stream flow or due to anthropogenic effects such as urbanization within the watershed. Changes to stream geometry both from a cross-sectional and plan view require that regular measurements be completed. In an effort to improve the spatial and temporal resolution 1 of fluvial geomorphic studies and ensure accuracy that is acceptable for change detection, there has been a rapid increase in the use of Light Detection and Ranging, which is known as LiDAR. LiDAR is an active remote sensing technology that is emitted and collected on an aerial platform such as an airplane or helicopter. The high frequency lasers are emitted at a rate of up to 70,000 pulses per second covering the Earth?s surface with a dense posting (point placement) of LiDAR points. Each point has x, y, and z values that indicate its position and elevation. These points are used to create an interpolated surface such as a Digital Elevation Model (DEM) or a Triangulated Irregular Network (TIN) which depict elevations on the Earth?s surface. The use of a fine resolution, LiDAR-derived DEM or a TIN to detect change in stream channel geometry and within the riparian zone could provide the accuracy and spatial and temporal resolution needed for this purpose. If LiDAR is to be used to determine changes in fluvial geomorphology and channel geometry, then its accuracy must be ensured. Accuracy is a relative term that depends on the use of the LiDAR data. For example, less accuracy is needed when delineating a watershed or identifying a channel than for the determination of geomorphic changes to a channel. The use of a 30-meter DEM is acceptable, in most cases, for delineating a watershed. However, changes to a stream bank or channel will probably not be visible at that resolution. In addition, objects on the Earth?s surface such as vegetation, logs, and structures must be considered when trying to determine bare-ground for identifying topography. These objects return LiDAR pulses and must be removed when identifying the bare-ground. 2 The area in which this study occurred is known as the Clarksburg Special Project Area (CSPA) which is located outside an area that is transitioning from rural to high density suburban residential and mixed land use (Figure 1-1). The CSPA is intensely studied and monitored by the Montgomery County Department of Environmental Protection for the impacts of urbanization on stream biology and chemistry and by the United States Environmental Protection Agency (USEPA) to map development and placement of anthropogenic surface structures such as roads, buildings, and parking lots to determine changes in topography associated with urbanization, the changes in stream Figure 1-1. Satellite classification of urban land cover, 1970s to 2000 showing the CSPA outlined in yellow, T. Jarnigan (personal communication, March, 2006). 3 flow, surface and groundwater relationships, and the effects of these surface alterations on the biological and chemical properties of these waters. In an effort to find better ways of improving the analysis of topographic changes, especially how urbanization affects stream geomorphology and its impacts on hydrologic analysis and surface mapping, the USEPA obtained three overflights collecting LiDAR datasets over three years: 2002, 2004, and 2006. High-resolution aerial photography, satellite imagery and geographic information processing were used to develop mapping of surface features including topography (Jarnigan and Jennings, 2004). The steps that are required to collect and process LiDAR data can cause errors. The identified errors for most LiDAR data are caused by the LiDAR system, interpolation between LiDAR points, ground survey error used to verify LiDAR points, errors caused by sloping terrain, and errors due to land cover (Hodgson, 2004). The total accumulation of these errors can result in significant differences between actual surface elevations and those as determined by LiDAR. The objectives of this study were: 1. To evaluate the accuracy of LiDAR-derived elevations relative to in situ ground-based measurements along the banks of a small stream and in its vegetated floodplain in the CSPA, and 2. To investigate the utility of LiDAR-derived elevations for change detection in this location. Two comparisons of LiDAR-derived surfaces were completed. In the first comparison, LiDAR-derived surfaces created from LiDAR data collected in 2006 were compared to surfaces derived from ground-surveyed points. The identification of the 4 error should indicate the acceptability of LiDAR as a technology that can provide topographic measurements used to determine geomorphic changes within the channel and along the banks and bordering riparian zone. In the second comparison, LiDAR-derived surfaces from overflights in 2002 and 2006 were compared to determine if changes in the topography were statistically significant. The error that results from the collection and processing of LiDAR data has been examined quantitatively by Hodgson (2004). Hodgson identified interpolation between data points as a source of error. In this study, interpolation error can occur in both Triangulated Irregular Networks (TINs) and rasters created from LiDAR and ground- surveyed points. The quantitative evaluation of the interpolation error between data points is completed using Ordinary Kriging to create prediction and standard error rasters. Using the difference rasters, that is, LiDAR 2006 minus LiDAR 2002 and LiDAR 2006 minus ground-surveyed points, it is possible to determine where the differences are statistically significant. 5 Chapter 2. Literature Review Geomorphic changes in stream channels are assessed using in-situ and remote methods. This chapter reviews the various methods and summarizes efforts to date in the study area (Clarksburg Special Protection Area, specifically Soper?s Branch). 2.1 Transect Methods The use of transects to determine stream channel changes at cross sections and longitudinally along the stream bed in the direction of flow has been the traditional approach for determining the extent of bank and bed erosion and deposition. These transects are established at specific locations along a stream reach where erosion and deposition are more likely to occur. For example, erosion during high flow is likely to occur at the outside of a meander bend and deposition is likely to be on the inside of the bend (Leopold, 1964). These series of transects can then be used to determine change, evaluate locations for stream bank and bed reinforcement and determine stream hydraulics when engineering stream restoration projects. The lateral migration of the stream channel over time requires numerous in situ measurements of the physical conditions of the stream and a comparison to some initial benchmark. These measurements must be continually recorded so that biological, chemical and physical conditions within the stream can be monitored. The frequent monitoring of these transects provides valuable data that can be used for: ? Monitoring trends in fluvial and geomorphic condition over time ? Quantifying environmental impact 6 ? Assessing stream and watershed response to management ? Providing channel and flow facts for water allocation ? Supporting resource inventories (habitat, water quality, vegetation) ? Tracking cumulative effects for entire drainage areas ? Contributing to regional, national, and international databases (Harrelson, 1994). The vertical and horizontal changes to the bank and bed along the cross-sectional transect are usually determined by standard surveying methods which include measuring the change in elevation at points along the transect (Figure 2-1). A longitudinal profile requires many of the same types of measurements but extending from one cross-sectional transect to another downstream paralleling the stream bed (Figure 2-2). The longitudinal profile includes measurements of key features such as the location and depth of thalwegs, riffles, pools and point bars (Harrelson, 2004). However, these measurements are time- consuming, spatially limited and require inferences about the reach between the cross- sectional reference points. Figure 2-1. Diagram of Stream Cross-section Survey (Harrelson, 1994) Reference points such as established monuments with known latitude, longitude and elevation above some datum are used for comparison to previously recorded measurements. These reference points are either position-fixed naturally-occurring and/or 7 anthropogenic structures (boulders, bridge piers, buildings) that can be used to site a monument (permanent reference point with established coordinates and elevation) (Figure 2-3). The monuments are established at both sides of the channel at some determined distance from the top of bank and are used to establish the ends of each cross- sectional transect. Figure 2-2. Longitudinal Profile and plan view of riffle and pool sequence (Harrelson, 1994) Figure 2-3. Positioning of a transect using a fixed location and setting of a rebar monument (Harrelson, 1994) 8 These physical measurements can be used to classify a stream type according to various classification systems. The most widely used is the Rosgen Classification scheme which ranks streams by numerous criteria including landscape level (headwater, intermediate, meander, braided, etc.), pebble count or sediment size collected from the bed , slope of the bank and other factors that must be determined in the field (Harrelson, 1994; Rosgen, 1994). This classification system is then used to predict the behavior of the stream and the expected changes to the banks and bed over time. Channel geometry based upon measurements at the transects can also be used to determine the hydraulics of the stream at that transect and to assess geomorphic change over time (Figure 2-4). This has been the traditional approach used to measure fluvial geomorphic changes within a stream reach associated with factors such as urbanization, impervious surfaces, land use, land cover, best management practices and specific storm events. These in situ measurements are time-consuming and spatially restrictive. They are applicable to a limited reach near the transects. Extensive interpolation is required to extend these measurements to other locations in and around the stream. Figure 2-4. Cross-sectional profile determined from a transect survey (Harrelson, 1994). 9 In addition to the low spatial and temporal resolution of these in situ measurements, acceptable accuracy is needed for measurements in the x, y plane and in the z-direction. The evaluation of fluvial geomorphic changes requires accurate measurements in three dimensions. That is, x, y and z values must be within an acceptable accuracy to ensure that the measurements reflect current surface elevations and that future measurement will detect changes in surface topography. Measuring distances and elevations using either a transit and stadia rod (rod with marked elevations used to determine the elevation and distance at a point) or laser survey equipment is often very accurate. The acceptable error depends on the use of the survey but for river surveying an acceptable error is 0.02 feet when closing the survey. The closure of the survey is the completion of the survey of all points turning the angles from the established location of the total station or stadia. An example of an equation used to estimate allowable error is Equation 2-1 (Harrelson, 1994). Acceptable error = 0.007 total _ distance 100 (Equation 2-1) where, total distance = the total distance measured in the closure of the survey 2.2 Digital Elevation Models A Digital Elevation Model (DEM) is a grid of pixels that are used to represent the elevation of the Earth?s surface with each pixel having a value indicating the elevation within that pixel. The resolution is determined by the size of each pixel. Many DEM grids are available at 10 m and 30 m resolution; thus, the reported elevation is an average or representative value over 100 or 900 sq m, respectively. This resolution is acceptable 10 for most hydrologic analyses including watershed delineation, surface runoff and stream flow calculations. However, geomorphic changes of a stream channel may not be visible at these resolutions thus requiring finer resolution. Digital elevation models are created using ground survey data, cartographic digitization of contour data and/or photogrammetry. The most common technique is photogrammetry which uses aerial photographs taken in the line of flight to determine surface elevations. Photogrammetry includes several methods that are accurate for measuring surface elevations and horizontal distances. A single aerial photograph can be used to determine the elevation of features in the photograph or multiple photographs can be used to determine surface elevations over a broader spatial extent (Jensen, 2000). A single aerial photograph can be used to measure surface elevations including the height of buildings, hills, and mountains and the depth of ravines and depressions through a technique called relief displacement. [The following explanation is taken from Jensen (2000).] In a vertical aerial photograph, an object is displaced from the principal point (PP) of the photograph. The principal point of the photograph is the exact point on the Earth where the optical axis of the camera was pointing when the photo was taken. An object that lies above the local datum (the plane passing through the principal point) is displaced outward from the principal point if the object?s elevation is above the local datum and toward the principal point if the object?s elevation is below the local datum. The amount of relief displacement is related to the height (h) of the object or surface elevation is given by r Hd h ? = (Equation 2-2) 11 The amount of relief displacement is d, the radial distance from the displaced image and the principal point is r and the altitude of the camera above the datum is H. The relief displacement, d, is the horizontal measurement between the top of the object and its bottom as measured on the aerial photograph. The radial distance, r, is the measurement between the top of the displaced object and the principal point (Figure 2-5). Exposure station, L d r a b H B h B PP A Figure 2-5. Measurement of height from a single vertical aerial photograph of a building (Jensen, 2000) The use of a single aerial photograph does not provide the spatial resolution needed to develop a digital elevation model but it does demonstrate a technique that clearly shows the relationship between relief displacement and elevations above some datum. Another more widely used technique for determining surface elevations from aerial photographs is stereoscopic measurements using two adjacent photographs within the line of flight. Software algorithms have been developed to use this technique to convert digitized aerial photographs into digital elevation models (Jensen, 2000). Aerial photographs in a flight line usually overlap by 60% along the flight path. This overlap allows objects that are in both of the neighboring photographs to be seen from different views, which can then be used to determine the elevations of these objects. 12 Each photograph has a principal point and a conjugate principal point. A conjugate principal point (CPP) is the principal point on a photograph that will overlap to a point on its neighboring photograph within the line of flight. The alignment of the CPP for one photograph and the PP from the neighbor establishes the line of flight. The overlap of the two photographs can be viewed stereoscopically to determine elevations. Stereoscopic viewing gives the perception of depth and can be viewed using different methods but all with the same purpose, determination of surface elevation. The neighboring aerial photographs within the line of flight indicate a change in position of an object (hill, building, mountain, etc.) from one photograph to another as the forward progress of the plane causes a shift of the object across the camera?s focal plane. This change in location of the object from one photograph to the other is called x-parallax. The relationships of the x-parallaxes allow for the determination of elevations as shown in Equation 2-3. )( )( 0 dpP dp hH h + ??= (Equation 2-3) The height of the object is h 0 , the altitude of the aircraft above-ground-level is (H-h) where ground-level is the top of object whose elevation is being determined, the absolute stereoscopic parallax at the base of the object being measured is P and dp is the differential parallax. These values are determined by first superpositioning each of the neighboring photographs so that the principal points of each are superimposed over one another. This will allow the movement of the objects across the film plane to be measured. The parallax of a point is the difference of the distance of the point from the principal point in each photograph. The top and bottom of the objects each have a 13 parallax. The absolute stereoscopic parallax is determined by measuring the distance from the PP to the CPP of each photograph and then using the mean of these two measurements as P. The differential parallax (dp) is the difference of the x-parallax of the top of the object and the x-parallax at the bottom of the object (Figure 2-6). a?b b? Xb=-3.606? Pb=-3.339? Pa=-3.55? Xa=-3.820? Xa?=-0.267? Xb?=-0.267? o a Figure 2-6. Computing height by measuring the stereoscopic x-parallax by superpositioning neighboring photographs in the line of flight (Jensen, 2000) The superpositioning technique for determining surface elevations from aerial photography along a flight path demonstrates the complexity of the methodology. However, the collection of data in the x, y and z direction must be accurate if the determination of a DEM is to simulate the surface topography. The first step in the creation of an accurate DEM is to collect x, y and z ground data by using either ground surveying or Global Positioning Systems (GPS). These points can be used to rectify aerial images that may have terrain-induced errors. Once these displacement errors are 14 corrected, the orthoimage (correctly georectified image) can accurately reflect all surface locations. These orthoimages can be used as a backdrop for the development of layers within a Geographic Information System (GIS) map and for measuring distances needed to create DEMs. The accuracy of a digital elevation model is dependent upon the x, y and z points that are extracted from a photogrammetric image. The photogrammetric image that has been rectified will allow for the creation of an accurate three-dimensional model of the terrain. However, accuracy is affected by the presence of trees, buildings, bridges, overpasses and other objects that extend above the ground level. These objects are viewed by algorithms as surface features and compute the differential parallax and surface elevations using these objects as surface features. Therefore, the removal of these objects from the DEM must occur if a more accurate DEM is to be developed (Jensen 2000). The DEM is determined based upon calculations used to determine surface elevations such as the x-parallax relationship and the corrections made to the surface elevations where objects such as buildings or trees have distorted the elevation. Once the corrections are made, the spatial resolution of the DEM can be determined by the accuracy and number of surface points used for elevation calculations within a grid. Many DEMs used for hydrologic analysis are based on remotely sensed data with a low point density. These DEMs usually have a resolution of approximately 30 m which indicates that a 30 m x 30 m pixel will have the same elevation throughout the pixel. While this is suitable for many hydrologic determinations, it is not suitable for evaluating fluvial geomorphic change along a stream reach or within the channel since many of 15 these vertical and horizontal changes occurs within smaller distances (< 3 m). The potential inaccuracy of DEMs and the lower spatial resolution limits their use for topographic change detection especially in fluvial areas where changes can be difficult to detect. Digital Elevation Models are frequently used by Earth scientists and hydrologists to develop models to predict hydrology, erosion, and landscape evolution with strong dependence on the integrity of the DEMs. These scientific studies are based on the assumption that the accuracy of the DEMs is suitable for predictive studies in these fields. The most often used DEM is a two-dimensional array of numbers that represent elevations along a spatially distributed grid. However, Walker and Willgoose (1999) analyzed 10 m resolution DEMs for comparison to ground-surveyed data. They found that these DEMs are not accurate enough in most cases to predict drainage area and stream networks. They found that published DEMs indicated significantly different stream networks and drainage basins than those determined by ground-surveyed data. Width and cumulative area measurements made by the DEMs consistently fell outside the 90% confidence interval that was determined by ground surveyed data in 60% of the compared positions. This suggested that hydrologic properties including geomorphic measurements are poorly estimated by DEMs. However, the accuracy of slope-area relationships was more acceptable with only 40% of the compared positions falling outside of the 90% confidence interval. Walker and Willgoose (1999) indicated that further study would be needed before a general conclusion on DEM accuracy could be made. However, they did indicate that the accuracy was dependent upon post-processing of the DEMs, differences in accuracy based on terrain and grid sizes. The use of DEMs 16 for the determination of geomorphic change was not supported in their study and identifies questions about accuracy even at 10 m resolution which is finer than used in many hydrologic models. The most accurate digital elevation models are not suitable for the fine vertical and horizontal measurements needed in change detection studies of fluvial areas. The United States Geological Survey (USGS) produces DEMs that have a spatial resolution of 10 m with a vertical accuracy of 7 m and horizontal accuracy of 10 m (Hans, 2003). One example of the benefits of greater spatial resolution DEMs is to determine surface elevations in the design of highways. The design of highways could be significantly improved if the DEM spatial resolution was increased. LiDAR has been investigated as a technology that could produce finer spatial resolution DEMs for highway design and for the delineation of smaller watersheds that are now excluded from the drainage area being considered in highway design. These smaller drainage areas have an impact at a more local level on the hydraulic design of highway projects. For example, smaller culverts are now either not seen due to the lower resolution DEMs or are filled as sinks and improperly considered in the overall flow when determining the drainage in an area (Hans, 2003). Even though these finer resolution DEMs are used for hydrologic and hydraulic analyses during highway design, they still do not provide the accuracy needed to determine change for fluvial geomorphic studies since small horizontal and vertical changes may not be detected. 17 2.3 Light Detection and Ranging (LiDAR) Total Station surveying techniques used to determine surface elevations along a stream transect or within a riparian zone are accurate within acceptable engineering standards but are very spatially limited. Digital elevation models provide better spatial coverage but often lack the resolution needed to measure changes within stream channels and along areas bordering the riparian zone. In an effort to improve spatial and temporal resolution, i.e., more frequent evaluation of changes over larger areas, the use of a new technology is showing promise. Light Detection and Ranging known as LiDAR, is fast becoming the remote sensing technology of choice for providing point data with high spatial and temporal resolution at a reasonable cost. Topography can be determined from these points by developing a Triangulated Irregular Network (TIN) or a digital elevation model (DEM) with fine spatial resolution (?0.5 feet). LiDAR may provide accurate point measurements that can be used to determine elevation at specific points on the Earth?s surface with enough accuracy for detecting fluvial features. These LiDAR measurements are used to assess storm water control design, flood control, geomorphic changes, and generally any assessment that requires data with x, y and z coordinates (Hodgson, 2004). 2.3.1 LiDAR Sensors and Positioning Airborne LiDAR uses an active sensor to capture point data with x, y and z values and in many cases with an intensity measurement. The LiDAR system is composed of three primary parts that together provide the data to determine elevation at a point: 1. a laser scanning system 2. global positioning system (GPS) 18 3. inertial measuring unit (IMU) The laser scanning system consists of an emitting diode that produces the laser, which is a beam of light at a very specific high frequency. The receiver in the system receives the return pulse from the emitted laser after it contacts and is reflected from the surface feature. The distance from the scanner to the surface feature is determined by specific calculations that include the speed of light and the time that the emitted laser takes to reach the surface feature and return to the receiver. The use of a pulse laser system is best for determining topography of the surface since a pulse laser emits a small diameter laser in the near infrared region of the electromagnetic spectrum. These narrow pulses allow for greater penetration of obstacles such as forest canopies and other vegetation allowing the laser to reach bare-ground and obtain ground surface data. The near infrared portion of the electromagnetic spectrum will allow the laser to be more easily reflected over land surfaces even though it is partially absorbed by water resulting in an undetectable signal (Burtch, 2002). It is important to understand the LiDAR sensing technology, not only to comprehend the data collection process, but to understand the likely error inherent in the actual scanning and data collection. The LiDAR system, which includes error in both data collection and processing, is a key factor in determining overall LiDAR data error (Hodgson, 2004). The laser scanner can emit up to 70,000 pulses per second, collecting the scan data using a scanning mirror that rotates transverse to the line of flight. The scan angle for the mirror ranges from 20? to 30? from nadir (point on the Earth from the emitting diode that is normal to the Earth?s surface). The laser scan signal creates an area or footprint on the ground that can vary in diameter from 24 to 60 cm. The smaller 19 footprint increases the ability of the laser to penetrate through vegetation and to reach the ground. It is this ground measurement that is used to determine surface elevation. The actual footprint is referred to as the instantaneous field of view (IFOV); it varies in size and shape based upon the angle from nadir and the altitude of the scanner above ground level (AGL). As the angle from nadir increases, the footprint increases in size and becomes elongated and ellipsoidal (Figure 2-7). At nadir, the IFOV is smaller and circular. While each IFOV is an area and not a point, each is usually referred to as a point. Figure 2-7. Example of Instantaneous Field of View at different scan angles (Burtch 2002) The scan rate must be fast enough to prevent gaps and to allow for a uniform distribution of data over the targeted site. The swaths or width of laser points extends from approximately 2,500 to 3,000 feet perpendicular to the line of flight. The number of laser points emitted per second depends on the type of scanner and varies depending on the signal emission frequency (measured in kHz, where 1 kHz equals 1,000 pulses per 20 second). Most modern scanners emit pulses at 10 to 70 kHz. This pulsing rate will determine the point density over one pass of the target area (Burtch, 2002). 2.3.2 Determining the Position of LiDAR Points The determination of the x, y coordinates is necessary to locate the horizontal position of each laser measurement. The determination of these coordinates is made by the use of a Global Positioning System (GPS) that is located near the scanner. The location is determined as the return pulse is received by the sensor and is offset by the location of the GPS from the sensor. The GPS provides x, y information in a specific projection that is relevant to the target area and data use (Hodgson, 2005). The movement of the aircraft platform must be considered when determining the x, y and z coordinates for each laser point. The aircraft movement can cause variations in these points requiring that adjustments be made before recording the data. These small angular variations are due to the pitch, yaw and roll of the aircraft (Figure 2-8). These Figure 2-8. Pitch, yaw and roll of the aircraft. 21 position variations are recorded by the Inertial Measurement Unit (IMU); the coordinates and elevations can then be corrected by an analyst using these IMU measurements to determine the orientation of the scanner at the time the point data was received. An example is shown in Figure 2-9A. If the scan angle was 10? to the right of the flight direction when the laser signal was emitted and the height above the surface at nadir was 1000 m, then the return distance for the signal would be 1015.43 m from the location of the center of the IFOV as determined by S = D?sin10? = 176.33 m (Equation 2-4) where S is the ground distance from the vertical to the center of the IFOV (Figure 2-9B). If there was an additional 1? tilt of the aircraft, the scan angle would change to an 11? angle resulting in a return distance of 194.38 m (Figure 2-9B). This difference is 17 m, requiring that the IMU accurately measure the change in the scan angles in all three directions if the georeferencing is to be correct. In some instances the corrections are pitch yaw roll A B Figure 2-9. Example of roll on the LiDAR signal (Burtch, 2002) 22 made directly by the IMU and others are made as the flight?collected data is post- processed (Burtch, 2002). 2.3.3 Post-processing of LiDAR Data The LiDAR returns are collected by the in-flight receiver and are adjusted for the variations caused by the aircraft movement. However, the data received in-flight must be post-processed to ensure that the actual returns reflect the desired surface features. As the LiDAR pulse strikes an object on or above the ground, it can be totally absorbed, partially absorbed or totally reflected. The returns can come from tree canopies, mid-level branches and surface features such as logs or rocks that are not indicative of bare-ground. If the bare-ground topography is needed, then data points that resulted from other returns must be eliminated from that data set. The returns that represent the top of a forest canopy could be saved as a separate data set. Since each LiDAR pulse can emit up to 70,000 points that are returned up to five times per point depending on the surface features, extensive post-processing is required to ensure that the data collected in flight is separated to represent the appropriate surface feature. The analysis and processing of the in-flight data is performed with a combination of software algorithms and manual expertise (Burtch, 2002). 2.4 LiDAR Error 2.4.1 Determination of Horizontal and Vertical Error Location and elevation error can result from data collection and processing, causing the accuracy of LiDAR data to be insufficient for many of the uses mentioned 23 previously. The detection of topographic changes requires that horizontal location (x, y) and elevation (z) be accurate within some standard determined for the particular application. For example, the Federal Emergency Management Agency (FEMA) has several efforts underway to determine floodplain mapping to establish flood mitigation efforts. FEMA has established acceptable root mean-squared error (RMSE) for vertical accuracy of 20 cm for coastal areas and 25 cm for inland areas (Hodgson, 2004). The RMSE is always calculated by determining the difference between ground-measured point elevation (surveyed elevation at or near the specified LiDAR point coordinates) and the LiDAR-derived point elevation. Some studies actually use Total Station surveying of specified bare-ground LiDAR points and designate these as ground-truth data (actual ground elevations at the specified point), while other studies survey random locations in the target area and compare these elevations to the bare-ground LiDAR points. The random selection of survey locations introduces interpolation error that must be considered and is discussed later. The RMSEz is calculated as follows, where Z is either at a LiDAR point or within the targeted area near LiDAR points, n is the total number of points and Z LiDAR is the elevation at a LiDAR point: RMSE ObservedLidarPts = Z lidar ? Z survey () 2 ? n (Equation 2-5) 2.4.2 System Error As discussed earlier, the LiDAR system is the first potential source of error. The error is inherent in the collection process and is associated with the Global Positioning System (GPS) onboard the aircraft, the inertial measurement unit (IMU) used to monitor 24 the pointing direction of the laser and also within the inertial navigation unit (INU) used to estimate positions between the GPS fixes (Hodgson, 2004). The forward movement of an aircraft in the line of flight, the pitch, roll and yaw associated with the flight, the pulse of laser light from a height of 1200 to 3000 ft. AGL and the determination of the location of as many as 70,000 points per pulse makes data collection susceptible to error. This positional error can occur horizontally (x, y) and vertically (z). Determining if horizontal error has occurred can be difficult since ground measurements do not reflect the error but may actually augment the error by using the horizontal location determined by the airborne GPS as the location for verifying vertical error. If specific ground points can be identified then horizontal error can be assessed. Typical assessments of horizontal error would include the use of building corners, flag poles and other fixed locations on flat surfaces such as roads or roofs to determine the horizontal offsets (Burtsch, 2002). 2.4.3 Labeling Error The measurement of surface elevations requires that ground points be identified and other laser returns be removed from the data set. Since a typical LiDAR system can emit up to 70,000 pulses per second with the possibility of as many as five times that number of returns, it important to be able to identify the returns and their source (ground, canopy, mid-level tree structure). Identifying the source of these returns is called labeling. Each laser pulse can have multiple returns as the laser reaches different surface features. For example, four or five returns per pulse from a forested location are not unusual. The first return could be from the top of canopy, second return from branches within the tree, third return from the herbaceous understory and finally the fourth return 25 could be the bare ground. The identification of each of these returns is a potential source of additional error. The identification of bare-ground returns is done by software algorithms and/or manual interpretation of the LiDAR points. Human error and algorithmic interpretation of points can add to the identification error. This error is known as labeling error. In his study of LiDAR error, Hodgson (2004) refers to the combination of LiDAR system and labeling error as LiDAR system error. This is the elevation error at LiDAR points caused by mislabeling and/or laser pulse height measurement error. 2.4.4 Slope Error Slope error is an elevation error caused by the incorrect horizontal location of a LiDAR point on a slope resulting in a vertical error, most often due to the system errors previously mentioned (Peng and Shih 2006). Slope changes in riparian zones and along stream channels are important for assessing topographic and channel changes that can be used to determine the impacts of land use, storm events and other factors on the hydraulics and geomorphology of a stream and riparian zone. If horizontal error occurs on a slope, the elevation error can be significant and is maximized as a function of slope by a simple trigonometric relationship: Elevation error = tan ? ? Horizontal Displacement (Equation 2-6) For example, the elevation error caused by a 100-cm horizontal displacement on a 10? slope (?) can be up to ? 18 cm. The maximum error occurs when the displacement is perpendicular to the contour line (Figure 2-10). Horizontal displacement parallel to a contour line will not result in elevation error since the elevation is constant along the contour line (Hodgson, 2004). However, a point that moves either up or down the slope away from the location where it should have been positioned will have an elevation that 26 is either greater or less than it would have had at the correct position. This error is caused by the incorrect horizontal positioning (x, y location) due to the system error in the scanner. Incorrect horizontal positioning on a flat surface will not result in an elevation error but on a slope, the error can be significant for that point. This elevation error can further complicate interpolation of elevations between points. Vertical error Horizontal error Laser energy reflected from here ? Apparent x, y position of LiDAR point is here X-Dimension z- Dimension Figure 2-10. Effects of Terrain Slope on observed error (Hodgson, 2004) 2.4.5 Point Density Point density is also important for deriving accurate DEMs from the LiDAR data. As the point density increases, the accuracy of the resulting DEM increases. This improved accuracy is the result of reduced interpolation between points. (Obviously, point density has no effect on the accuracy of individual points, which are still susceptible to LiDAR system error.) The density of the observation points depends on the pulse rate (kHz), the nominal posting (spacing between points at nadir), land-cover type and accuracy of identifying bare-ground points through the labeling process. Adequate 27 point density is essential for ensuring that the target area is covered and that voids in data are minimized. The current LiDAR systems produce pulses at 70 kHz creating up to 70,000 data points per pulse, thus reducing voids and increasing point density with respect to previous systems (Hodgson 2004). 2.4.6 Interpolation Error The surface elevation between points is determined by interpolating between measured elevations, such as LiDAR points or ground points. Interpolation is a prediction of the elevation between points through the use of a specific algorithm or methodology. In ArcMap, there are several methods used to determine interpolated surfaces. These include the use of known points with x, y and z values to create a Triangulated Irregular Network (TIN), conversion of a TIN to a grid or raster, creation of a grid or raster directly from points and ordinary kriging or cokriging to predict surface elevations based on geostatistics. The interpolation of the surface elevation between measured points introduces additional error that must be considered. Digital elevation models do not generally have the spatial resolution needed to measure small changes in stream channel geometry. However, by using LiDAR points and creating a triangulated irregular network, it may be possible to detect these changes. A Triangulated Irregular Network (TIN) is a method to create the appearance of a 3-dimensional surface from irregularly spaced (x, y, z) measurements. The TIN is created by using a dataset that includes points with x, y and z values partitioning the geographic space into triangular planes formed by nodes (vertices) connected by edges using Delaunay triangulation. Delaunay triangulation creates a mesh of contiguous, non- overlapping triangles. Each triangle?s circumscribing circle contains no data points from 28 the dataset in its interior. Topographic lines, where the surface changes abruptly such as ridge lines, top of stream banks and roads should be incorporated as break lines when creating the TIN (ESRI, 2006). The TIN can be converted to a raster with a specified resolution so that it can be used in calculations. A raster is a grid of cells each having an integer or floating point value to represent discrete or continuous data, respectively. For elevation, the raster cell values are floating point since elevation is a continuous variable. Rasters can also be created from point data without the intermediate step of creating a TIN. Each method is acceptable but consistency should be used when creating the rasters. Ordinary Kriging can be used to create interpolated continuous surfaces determined by autocorrelation (the statistical relationship among measured points). The Kriged continuous surface predicts the elevation values where no points exist. The points closest to the interpolated surfaces have a greater weight in the prediction algorithm. However, the weights are based not only on the distance between the measured points but also on the overall spatial arrangement among the measured points (ESRI, 2007). The location of LiDAR points is random while ground-surveyed points can be positioned to address specific topographic measurements. For example, the top of bank, bottom of bank, break lines, surface anomalies, geomorphic features and other identifiable features can be identified and measured during a ground survey. The randomness of LiDAR introduces the possibility of additional interpolation error between points. 29 2.4.7 Error Budget for LiDAR Data In summary, the sources of error in the collection of LiDAR points for the production of DEMs or TINs include: elevation error caused by the sensor system measurement, horizontal error caused by the sensor system, labeling errors that can be caused by software or manual misidentification of bare-ground points and interpolation error caused by incorrect determination of elevations between points. The impact of these errors can be significant and limit some LiDAR data sets from consideration for use in fluvial geomorphic change detection. Research to identify all of the possible sources of error in LiDAR data sets is continuing. However, specific errors that result during each of the steps taken to develop a LiDAR data set are being more clearly identified. In their research to identify and quantify errors occurring during each step of LiDAR data gathering, Hodgson and Bresnahan (2004) separated what they identified as error sources. By separating each error source, they were able to develop an error budget that quantified the error observed in a LiDAR-derived TIN. Root Mean Squared Error (RMSE) analysis was used to evaluate the observed TIN error in four specific sub-error categories: 1. LiDAR system 2. Ground survey 3. Vertical effect due to horizontal displacement on a slope 4. Interpolation The RMSE for the observed TIN that resulted from the LiDAR dataset was calculated as: RMSE observedTIN = RMSE lidarsystem 2 + RMSE survey 2 + RMSE horiz,slope 2 + RMSE interp 2 (Equation 2-7) 30 Equation 2-7 assumes that the sources of error are statistically independent. Ground Survey error (RMSE survey ) was attributed to errors that occurred during Total Station surveying of LiDAR elevations at the exact coordinates of each LiDAR point. These errors were typical of errors that occur during a topographic Total Station survey and did not contribute significantly to the observed error. For example, a survey error of 5 cm RMSEz resulted in only a 0.8 cm observed error in the field (Hodgson, 2004). Elevation errors of 3 to 4 cm during surveying are common. An evaluation of the horizontal error (RMSE Horz,slope ) as a function of slope is shown in equation 2-8. The equation shows that steeper slopes result in greater horizontal error. Most LiDAR data companies advertise that horizontal error is a function of altitude estimated as 1/1000 th of the altitude above ground level (AGL). Since many LiDAR flights are between 1,200 m and 3,000 m, the horizontal error can be 120 to 300 cm. Therefore, using a mean slope that ranged from 1.67? to 4.15? and a mean horizontal error of 120 cm, the RMSE horiz, slope resulted in a horizontal error of 2.5 to 6.2 cm (Hodgson, 2004). (Equation 2-8) This horizontal error is attributed to the LiDAR system and results from the changes in angle off nadir caused by the scan angle and the pitch, yaw and roll of the aircraft, as well as errors in adjustments to compensate for the angle differences. Interpolation error (RMSE interp ) is inherent when using a DEM or TIN to determine elevations between measured points, since the elevations between points are interpolated manually or by software algorithms. This is another potential error identified 31 by Hodgson (2004) that contributes to the combined total error of the LiDAR-derived elevations between measured points in the developed TIN. Hodgson and Bresnahan?s (2004) process for evaluating interpolation error requires that specific LiDAR points be surveyed and then removed so that adjoining points could then be connected by an edge in the TIN. The elevation of the edge is then interpolated and compared to the surveyed elevation of the removed point which was at a location between the endpoints of the edge (Figure 2-11). Figure 2-11. Evaluation of interpolation error by removing points and cross-validating (Hodgson 2004) This technique used by Hodgson and Bresnahan (2004) is a cross validation approach that created TINs using both the surveyed and non-surveyed LiDAR points. Each of the surveyed points was then removed one at a time and a new TIN was created. The surveyed elevation of the removed point was then compared to the interpolated edge that intersected that point. This was repeated for every surveyed point in the dataset and the RMSE was calculated using these points and the interpolated edge in the TIN. This cross validation technique is commonly used for linear interpolation across triangular faces of a TIN (Hodgson 2004). 32 Hodgson (2004) showed an increase in total RMSE caused by interpolation during the creation of a TIN from the LiDAR points. All of the land cover categories with the exception of deciduous forests showed an increase in total RMSE with a range of 0.4 cm to 3.3 cm. The RMSE of the deciduous forests decreased from 25.9 cm to 23.5 cm. Significance of these errors was tested using a paired t-test which indicated a significant difference between the mean absolute error observed at the LiDAR reference points and the mean absolute interpolated error. The accuracy of LiDAR data is also affected by land cover. Specifically, the type of vegetative cover affects the density of LiDAR points since fewer points are actually determined to be bare-ground when vegetative cover is dense or highly varied. Certain types of vegetation such as a deciduous or coniferous forest would be expected to reduce bare-ground points since many pulses may be reflected from the canopy and under story. Other land covers such as low grass, high grass, herbaceous vegetation, low trees and pavement would be expected to permit more pulses to reach the ground, increasing point density and improving the accuracy of TINs and DEMs developed from these points. A study was conducted by Peng (2006) with LiDAR points being collected during leaf-on conditions. The post-processing of this data set removed the vegetation to obtain bare-ground data used to determine the surface elevation. It was shown through regression analysis that there was a strong relationship between vegetative cover and LiDAR point accuracy. However, the accuracy was due to the reduced point postings with decreased point postings producing less error. The reduced point spacing distances were a function of vegetation cover (Peng, 2006). 33 A previous study with leaf-off conditions was conducted by Hodgson, Jensen, et. al. (2005). In this study, the last returns were assumed to be the best candidates for bare- ground points. All other returns were eliminated from consideration. The ground return data set was then evaluated using a moving neighborhood window where ground points were selected based on their elevation with respect to their neighbors. During this step, previously labeled ground points could be removed and new ground points could be added to the bare-ground data set. This is a highly labor intensive process that requires manual analysis using orthoimagery and human three-dimensional visualization. This process was used in both leaf-off and leaf-on post-processing for determination of bare- ground points (Hodgson, 2005). The RMSE for elevation of LiDAR data points by cover type for two different leaf-off studies are shown in Tables 2-1 and 2-2. The overall elevation Root Mean Square Error (RMSE) for leaf-off conditions as determined in the study by Hodgson and Bresnahan (2004) was found to be 21.1 cm. The RMSE ranged from 17.2 to 25.9 cm depending upon the land-cover categories. It was found that deciduous trees exhibited the highest RMSE at 25.9 cm and high grass and pavement the lowest at 18.9 cm. The RMSE for the other land-covers are shown in Table 2-1. Table 2-1. Observed LiDAR Elevation Error (cm) for leaf-off data (Hodgson, et.al. 2004) Cover Type Pavement Low Grass High Grass Brush/Low Grass Evergreen Deciduous RMSE Observed LiDAR pts 18.9 22.5 18.9 23.3 17.2 25.9 34 In this study, the differences in RMSE due to land cover showed both expected and unexpected results (Table 2-1). The pavement error was small as expected since the LiDAR pulses are not attenuated or blocked by cover. However, the evergreen forest had a lower error than pavement. This was partially explained by the nature of the evergreen forest in this study. The authors explained that the forest -- composed of mostly Southern pines -- reduced ground cover due to the heavy layer of needles on the forest floor. The Southern pine canopy was very open to sunlight, and thus LiDAR pulses, since the lower branches continually die back as the tree grows. This open canopy and clear forest floor allow for two good returns that identify the forest canopy and the bare-ground. Hodgson (2004) used a t-test to determine the statistical significance of the differences in errors between the pavement and pine forest. He determined that the error difference between the pavement and the pine forest was not statistically significant. He did not provide an explanation for the observed lower error in the pine forest than the pavement. The low and high grasses showed moderate observed RMSE as well. This RMSE is thought to have been caused by the difficulty in labeling bare-ground points due to returns from stubble remaining during the winter months when the LiDAR points were collected. The multistory structure of brush and deciduous forest even during winter Table 2-2. Observed LiDAR Elevation Error at Points and Interpolation Error at Points (Elevation in cm) (Hodgson, 2004) Cover Type Pavement Low Grass High Grass Scrub/Shrub Pine Deciduous RMSE Observed LiDAR pts 22.6 14.5 16.3 36.1 27.6 27.3 RMSE Observed in TIN 22.1 25.8 22.2 26.6 17.6 23.5 35 months demonstrated the difficulty in removing multiple returns from the trees and understory which may have blocked pulses from reaching the bare-ground (Hodgson 2004). Using the observed LiDAR measurements summarized in Table 2-2, Hodgson (2004) found that pine and deciduous forest cover showed greater RMSE than both the low and high grass cover. Pavement actually showed a greater error than the grass cover- types. Hodgson (2004) stated that the error differences may have resulted from the significantly lower point density of scrub-shrub, pine and deciduous cover types. The lower density of points resulted from the elimination of returns from the complex vegetative structure in these cover-types. In addition to point density, horizontal error was identified as another source of error which has been shown to increase with slope. Hodgson indicated that these errors could possibly be overcome by using higher pulse rates to increase point density and reducing the AGL of the flight to reduce horizontal error. The error associated with denser vegetation is consistent with other studies including those conducted with leaf-on conditions (Hodgson, 2004). Both of these studies showed that land cover has an impact on the ability of LiDAR pulses to reach bare-ground and return this value to the sensor. Hodgson (2004) indicated that land cover affects point density which is directly related to the removal of points that are not identified as bare-ground points during the labeling process. When specifying the parameters (flying height, forward speed, pulse rate, etc.) for LiDAR data collection, a primary goal is to ensure that point density is high after the point removal process so that bare-ground interpolation is minimized. There should be a high point density for the various land covers after vegetation removal if accuracy is to be 36 maintained. As shown in Table 2-3, the post spacing did vary by land cover with the lowest density in the evergreen and deciduous forests. This lower density was directly related to the interception of pulses by the vegetative structure (canopy, branches, under story) (Hodgson 2004). Table 2-3. Density of LiDAR ground returns by cover type (Hodgson, 2004) Cover Type Pavement Low grass High grass Brush/Low Trees Evergreen Deciduous Area per Observation (m 2 ) 7.67 6.28 5.43 6.28 10.80 8.46 Regularized Post Spacing (m) 2.77 2.51 2.33 2.51 3.28 2.91 2.4.8 Statistical Significance of Differences in Mean Absolute Errors by Land Cover Types In his accuracy assessment based on the observed error at LiDAR points that were surveyed, Hodgson (2004) conducted an independent samples t-test using the mean absolute error to determine if the differences in errors between pairs of land-cover categories were statistically significant (Table 2-5). The significance level for the t-test was 0.05. The mean absolute error by land cover category is shown in Table 2-4. Table 2-4. Observed Absolute LiDAR Elevation Error by cover type (Hodgson, 2004) Cover Type Payment Low Grass High Grass Brush/Low Trees Evergreen Deciduous N 120 137 98 98 119 82 Mean Absolute Error (cm) 14.9 16.8 15.9 18.9 12.9 20.3 37 Table 2-5. Significance Levels for Difference between Mean Absolute Error by Land Cover (Hodgson, 2004) Pavement Low Grass High Grass Brush/Low Trees Evergreen Low Grass 0.250 High Grass 0.495 0.609 Brush/Low Trees 0.020 0.083 0.019 Evergreen 0.190 0.022 0.042 0.035 Deciduous 0.006 0.106 0.028 0.531 0.000 The significance levels of the differences between the errors for each cover type are shown in Table 2-5. The mean absolute errors that are significantly different at the 5% level between the cover types are as follows: pavement and brush/low trees, pavement and deciduous forest, low grass and evergreen, high grass and brush/low trees, high grass and evergreen, high grass and deciduous, brush/low tress and evergreen, and evergreen and deciduous. The brush/low trees are a multi-story cover that causes problems when trying to eliminate points that are not considered bare-ground. The deciduous category also poses problems when trying to identify the bare-ground since the vegetative structure, under story and possible presence of some leaves during the data collection period of this study obscured the bare-ground. The high grasses also caused problems in bare-ground identification most likely due to the thick stubble that remained during senescence making bare-ground identification difficult (Hodgson, 2004). Complex vegetative multi-story cover creates a point cloud (points that are returned from different elevations) that must be carefully evaluated to identify bare-ground points and remove those points that result from returns within the vegetative structure. 38 2.5 Using LiDAR in Fluvial Topography The complexity of a riparian zone (multistory vegetation, thick under story, logs and brush, etc.) and the frequent and severe slope changes within the channels of streams complicate the determination of LiDAR elevation error at specific points (Figure 2-12). Figure 2-12. Photo of Sopers Branch just south of the bridge and parking lot The presence of a very heterogeneous vegetative cover in the riparian zone would tend to make bare-ground points difficult to obtain and the steep banks often present in stream channels would make elevation error caused by horizontal offsets more severe based on 39 studies by Bowen (2002), Hodgson (2005) and Peng and Shih (2006). Bowen (2002) found that the RMSE z over all terrain types was 43 cm and was largely attributable to horizontal error that had an RMSE x, y of 1 to 2 m in areas with variable terrain and large topographic relief. Algorithms used to remove non-ground returns from the riparian vegetation proved less effective possibly due to the thin linear configuration of the riparian zone. If vegetation removal algorithms are less effective, then bare-ground points are less dense and less accurate for determining topography (Bowen, 2002). The assessment of river corridor topography is essential for predicting and evaluating the effects of discharge and stage on the habitat of aquatic and riparian organisms. This topographic assessment is also essential for determining parameters used in the hydraulic analysis of streams for predicting geomorphic changes, designing highways, and flood and erosion control and water resource management. The data that has been used for most studies has been in the 1:24,000 scale providing data at the watershed scale. However, for determination of these hydraulic parameters and more accurate measurement of topographic changes at the stream reach scale, data is required at the 1:5,000 scale. LiDAR is capable of providing topographic details at this scale, assuming that bare-ground points are correct (Bowen, 2002). In his study to evaluate the potential use of LiDAR for measuring river corridor topography, Bowen (2002) used LiDAR data collected in a study area in northeast Utah in October, 1999. This area was located along the Green River and was characterized by the presence of different soils, slopes and vegetation.. The primary vegetation was sagebrush (Artemisia tridentata), salt cedar (Tamarix ramosissima) and Russian Olive (Elaegnus angustifolia). It was not possible to measure the elevations at specific LiDAR 40 point locations. Adjustments made to the x, y locations during data collection and post- processing made it impossible to know the exact location of the LiDAR points. However, the ground survey locations within the study area were selected in an effort to represent each soil and terrain type (brush, sand, cobble, slope, etc.). More than one location for each terrain type was selected and in situ measurements were collected using a Trimble 4800 GPS to identify x, y and z coordinates. The error of this GPS system is approximated to be ?3 cm (x, y, z). The location for an in situ evaluation was both point comparisons (circular areas with the specified terrain type) and transects perpendicular to the main river channel. The number of ground points collected at each location varied due to topography, channel geometry, topographic complexity and the ability to collect the points. There were eleven point locations and seven transects. Bowen (2002) used four statistical error measurements to compare the LiDAR elevations with the ground GPS elevations: Root Mean Square Error (RMSE), Absolute Mean Error (ABSE), Mean Error (ME) and Maximum Error (MAXE). Each was calculated as follows: () n gpslidar ZZ RMSEz ? ? = 2 (Equation 2-9) n ZZ ABSE gpslidar z ? ? = (Equation 2-10) () n gpslidar ZZ MEz ? ? = (Equation 2-11) ZZMAXE gpslidarz MAX ?= (Equation 2-10) 41 The deviations of the LiDAR elevations from the GPS elevations taken in the field weremeasured by the root mean error (RMSE z ) and the absolute mean error (ABSE z ). Any overestimation or underestimation was counted as errors in these measures. The overall bias of the LiDAR data to overestimate or underestimate elevations from the ground GPS data is measured in the mean error. The largest deviation of the LiDAR elevations to the ground GPS elevations are measured in the maximum error. In this study, Bowen (2002) block corrected any systematic bias that may have been caused by setup, calibration, or measurement errors in either the LiDAR or ground GPS data by subtracting the mean error from the original LiDAR dataset. The mean error was -44 cm. The Komolgorov-Smirnov test was used to determine if the error distributions were normal by terrain type. A Kruskal-Wallis rank test was used to test the hypothesis that error magnitude was the same for different terrain types (Bowen, 2002). The results using the original data prior to block correcting showed that the elevation error (RMSEz) was larger than the advertised specifications of 15 to 20 cm. RMSEz was significantly above these values. The block corrected data was still much higher than the advertised specifications but the error was reduced (Table 2-6). Table 2-6. Elevation Errors between LiDAR data and ground data Error Statistic Original Data (cm) Block Corrected Data (cm) RMSE 62 43 ABSE 56 22 ME -44 0 MAXE 191 233 42 The large maximum error compared to the mean error indicated the presence of large outliers in the LiDAR data. Error values for the the cobble terrain type were normally distributed (p>0.05). Error values for brush, sand, and slope terrain were not normally distributed (p<0.05). Errors differed across terrain type with the largest RMSEz for the slope terrain type (RMSEz = 111 cm). This error was followed by sand ( RMSEz = 53 cm), cobble (RMSEz = 19 cm) and brush (RMSEz = 9 cm). The largest range of errors was in the slope terrain type and the largest number of outliers was found in the sand terrain. Transect measurements near the active channel indicated the greatest LiDAR overestimation where there were steep bank slopes and dense riparian vegetation. In more open terrain, cross sections showed lower LiDAR elevations than GPS measurements while some showed only slight variations in the elevation and profile shape (Bowen, 2002). Block correction was found to be an acceptable method of reducing the effects of systematic bias caused by setup and calibration errors in LiDAR and ground GPS systems, errors found in the ground GPS network, and errors introduced during data processing. The mean error (bias) was used to block correct the LiDAR data resulting in a RMSE z that was 30 percent less than if not corrected. Bowen (2002) indicated that this correction stressed the importance of collecting at least a minimal number of ground survey points to make the block corrections (Bowen 2002). 2.6 Using LiDAR to Identify First Order Stream Channels In other studies, LiDAR is being accepted as a more accurate alternative to high resolution DEMs and to National Elevation Dataset (NED) 7.5 minute quadrangle maps 43 developed by the United States Geological Survey (USGS). Comparisons of several elevation data sets and their respective errors were prepared by the Maryland Department of Natural Resources in part to find a better method for identifying first order streams in the Piedmont region. The elevation errors for each of the three methods are shown in Table 2-7 (Smith, 2006). Table 2-7. Vertical and Horizontal Accuracy of Three Data Sources (Smith, 2006) Source data Map Scale Vertical Accuracy (RMSE z ) Horizontal Accuracy (RMSE x,y ) DEM Size LiDAR 1:2,400 0.185 cm 2 m 2 m Photogrammetric 1.5 m contours 1:2,400 0.75 m 2 m 5 m NED 7.5 minute quads 6 m elevation contours 1:24,000 5 m 12 m 30 m 2.7 Use of LiDAR to Determine Erosional Changes of Bedrock Channels In their study to determine the erosion of a bedrock channel of the Holtwood Gorge along the Susquehanna River over time, Ruesser and Bierman (2007) used LiDAR data to create 1 m Digital Elevation Models (DEMs) over a specific geologic period. These fine resolution DEMs were used to develop a three-dimensional image of the channel geometry. The image could then be used to infer the volume of material that was removed from the bedrock during incision events. The determination of the volume was difficult without LiDAR because of the complexities of using transect data and interpolating along a 500 meter reach. The DEMs were used to make determinations of erosional changes during the late Pleistocene period. 44 The LiDAR data required significant calibration to adjust for differences between the actual GPS points and the underlying grid cells that resulted from LiDAR data. The differences between the GPS elevations and the underlying DEMs at the two control points were 33.39 m and 33.42 m. In order to ensure an accurate comparison of the GPS ground point elevations and the LiDAR-derived DEMs, the elevation of the DEMs was raised by an average of the two differences (33.40 m). In contrast to previous studies, the filtered data (data with only bare-ground points) showed greater error than unfiltered data. The RMSEz and the ?z (?z = LiDAR ? GPS) for the filtered data were 3.26 m and 2.20 m, respectively, while the unfiltered RMSEz and ?z were 1.4 m and 0.85 m. The greater error of the filtered data is thought to be the result of the algorithms used to filter the data and the very steep slopes along the terraces of the channels in this study area. The algorithms used to remove points not thought to be bare-ground are designed to look at vegetative cover as the source of the point cloud and not consider the elevation due to rock outcroppings. The ground conditions in the bedrock channels of the Holtwood Gorge are markedly different than the usual riparian zones reviewed by Hodgson (2005) and others. In their studies, Hodgson (2004, 2005), Peng (2006) and others collected LiDAR and ground points in vegetated riparian zones with dense brush and trees. Algorithms designed to remove the vegetation in order to reveal bare-ground do not distinguish between rock outcroppings, terrace remnants (boulders) and other major variations in surface elevation. The result is removal of surface features that should have been identified as bare-ground but were identified as vegetation. This is a very important point when evaluating LiDAR datasets. The bare-ground must be carefully compared to 45 orthoimages and include field data collection to verify the actual surface, especially on surfaces with low vegetation and rocky surface features (Reusser 2007). 46 Chapter 3. Methodology Two comparisons are performed in this study: 1. LiDAR data collected in 2006 to ground-surveyed data collected in 2006 (evaluation of accuracy). 2. LiDAR data collected in 2006 to LiDAR data collected in 2002 (change detection). This chapter describes the location and characteristics of the study site, the data used, and the Geographnical Information Systems (GIS) data analysis tools applied. Interpolation and statistical analysis methods are summarized. Because many of the analytical steps involved explanatory maps of intermediate results, details of these steps are interleaved with intermediate and final results in Chapter 4. 3.1 Study Area The study area is located in the Sopers Branch watershed north of Washington, D.C. , in the Clarksburg Special Protection Area (CSPA) in Montgomery County, Maryland. Special Protection Areas (SPAs) are locations within Montgomery County, Maryland, that have been identified as having existing high quality or unusually sensitive water resources or other environmental features and where proposed land uses would threaten these resources or features in the absence of special water quality protection measures that would require controls as the land use changes. For example, specific Best Management Practices (BMPs) would be required as land use changes from rural agricultural, forest or grasslands to suburban and urban land uses. In Montgomery 47 County, there are four SPAs: Upper Rock Creek, Upper Paint Branch, Piney Branch, and Clarksburg. SPAs are designated by the County Council, but can be proposed by concerned individuals from the public and government agencies (Special Protection Areas, 2008). The CSPA is the focus of a collaborative research effort by the United States Environmental Protection Agency (USEPA) Environmental Photographic Interpretation Center (EPIC) and the Montgomery County Department of Environmental Protection (DEP). The USEPA EPIC research has been led by Dr. Taylor Jarnagin. EPIC has studied the impacts of urbanization within the CSPA watershed to determine the correlation between the impacts of the development and the mitigation of local BMPs on hydrological, biological, and chemical parameters of water resources within the CSPA. EPIC has two primary research objectives: 1) Develop high resolution watershed mapping, aerial photography, satellite imagery, GIS mapping and processing, BMP placement and development mapping over time 2) Monitor biological and physical stream parameters such as chemical and biological stream monitoring, streamflow and precipitation gauging and weather parameters over time (Jarnagin, 2006) The Montgomery County Department of Environmental Protection (DEP) has been monitoring area streams since 1996 to determine changes in water quality and biology and to evaluate best management practices (BMPs) to determine if they have successfully 48 limited the impact of development on water resources. The lead researcher for Montgomery County DEP is Keith VanNess. Within the CSPA, the Area of Interest (AOI) for the present study is 0.93 acres, with its centroid at 1226352 Easting and 585888.4 Northing in the NAD 1983 State Plane Maryland FIPS 1900 Feet projection. Sopers Run is a small stream flowing through the AOI bordered by a riparian zone consisting of a deciduous forest with a dense brush and shrub understory. Two small drainage ditches enter the stream, one from the east bank and one from the west bank. A bridge crosses the stream at the north side of the AOI and a road runs along the west bank of the stream to an asphalt parking lot (Figure 3-1). 3.2 LiDAR Data Collection The LiDAR data sets used in this study were provided by the Canaan Valley Institute, which was commissioned by the USGS/USEPA/Montgomery County DEP collaborators. The two LiDAR datasets were collected during overflights of the CSPA in 2002 and 2006. 49 Figure 3-1. CSPA showing the Area of Interest (inside red rectangle) with the parking lot, stream and surrounding riparian zone. (Image source: Canaan Valley Institute, 2006) 50 3.2.1 LiDAR 2002 General Specifications The 2002 LiDAR data used in this study were collected in December, 2002 during leaf-off conditions with clear skies in the watershed that includes the Clarksburg Special Protection Area. The flight covered an area of 8,584 acres including the CSPA. The Clarksburg overflight area was divided into fifty tiles with various posting densities dependent upon flight parameters and conditions. The specified posting interval was 0.8 meters (2.62 feet). The tile used for this study has an area of 11,015,658 square feet and is identified as tile 48 (Figure 3-2). The LiDAR system was an Optech-ALTM-2025 that emitted pulses at a rate of 28 KHz. The flight altitude above ground (AGL) was 2500 feet. The laser wavelength was 1064 nm (near infrared). The survey utilized Global Positioning System (GPS), Laser Rangefinder (LiDAR) and Inertial Measurement Unit (IMU) technologies to develop a digital terrain model. The LiDAR data was post-processed to identify first returns and last returns. The last return was processed to identify bare-ground (Airborne 1, 2002). 3.2.2 LiDAR 2002 Calibration A pre-flight calibration was conducted on December 12, 2002 to ensure both horizontal and vertical accuracy. The calibration was conducted by flying sixteen lines with known positions and elevations to determine any bias in first and last returns. The first returns were found to have a -11 cm bias and the last returns a -7 cm bias against known ground controls. Pitch and roll were also adjusted and calibration values used in the flight are shown in Table 3-1. 51 Figure 3-2. Clarksburg Special Protection Area showing tiles where LiDAR 2002 data were collected. Tile 48 is the location of the Area of Interest for this study. 52 Table 3-1. Final calibration values after adjustments were made from the calibration flight (Airborne 1, 2002). Original Value from Optech New Value 12/11/02 Pitch -0.004 -0.004 Roll -0.001 -0.0124 Offset 0.00 0.00 Scale 0.9923 0.9916 TIM1 first pulse -0.334 -0.404 TIM2 last pulse -0.243 -0.353 3.2.3 LiDAR 2006 General Specifications The 2006 LiDAR data used in this study was collected on March 18, 2006 during leaf-off conditions with no snow cover and clear skies in the watershed that includes the Clarksburg Special Protection Area. The flight covered the entire area of the CSPA which was approximately 50 square kilometers. The CSPA was divided into thirty tiles with various posting densities dependent upon flight parameters and conditions. The LiDAR system was an Optech-ALTM-3100 that emitted pulses at a rate of 64.4 KHz. The flight altitude above ground (AGL) was 2500 feet. The laser wavelength was 1064 nm (near infrared). The survey utilized Global Positioning System (GPS), Laser Rangefinder (LiDAR) and Inertial Measurement Unit (IMU) technologies to develop a digital terrain model. The LiDAR data was post-processed to identify first returns and last returns. The last return was processed to identify bare-ground. The tile used for this study has an area of 339,351 square meters and is identified as tile 236NW14 (Figure 3-3). There was a total of 4,158,042 LiDAR points collected for 53 Figure 3-3. Clarksburg Special Protection Area showing tiles where LiDAR 2006 data were collected. Tile 236NW14 is the location of the Area of Interest for this study. 54 Figure 3-4. CSPA showing LiDAR 2006 point density (1.17 per square meter) within the AOI 55 this tile with a posting density of 12.25 points per square meter. The total number of points within this tile identified as bare-earth was 397,317 with a posting density of 1.17 points per square meter (Figure 3-4) (Canaan Valley Institute, 2006). 3.2.4 LiDAR 2006 Calibration A pre-flight calibration was conducted to ensure vertical and horizontal accuracy. The calibration was conducted by flying multiple strips along an airport runway and over a building with 5,000 pre-surveyed GPS ground points. This calibration was found to meet the vendor specifications for both horizontal and vertical accuracy as follows: 1. Horizontal accuracy: 1/2000 ? altitude with 1 ? 2. Vertical accuracy: a. <15cm at 1200 m with 1 ? b. <25cm at 2000 m with 1 ? c. <30 cm at 3000 m with 1 ? 3.2.5 LiDAR bare earth processing Each LiDAR pulse is returned to the aerial sensor with as many as five returns, dependent upon the specific surface features (trees, shrubs, grass, concrete, etc.) that the pulse contacts. The LiDAR last returns were identified as bare-ground during post- processing by the data provider, Canaan Valley Institute. These LiDAR points covered the entire CSPA in thirty tiles as specified by Montgomery County DEP and USEPA. 3.3. Ground Survey Data Collection The AOI was surveyed on March 22, 2006, using a total station creating a dataset of 604 points with x, y, and z coordinates. The field surveying and surveying data processing were provided by personnel from Johnson, Mirmiran and Thompson Engineers (JMT). These surveyed points were intended as the ?ground truth? to evaluate 56 the accuracy of surface features derived from the LiDAR points. These bare-ground points are important for identifying surface topography and fluvial geomorphic change, as well as, determining error within the datasets. Errors in the ground survey ? and its ultimate utility as ?ground truth? ? are discussed in Chapter 5. 3.4 Selection of Points from LiDAR Data The LiDAR datasets for 2002 and 2006 include hundreds of thousands of points for each tile within the CSPA. This data are in an (x,y,z) format that include the longitude, latitude and elevation at the point. Since the Area of Interest is much smaller than the tile, a subset of the LiDAR datasets was selected using Microsoft ACCESS database management software to reduce the points to those that are located within the AOI for analysis. The method used to select only those LiDAR points within the AOI is as follows: 1. Create a Microsoft ACCESS database. 2. Create a new table. 3. Import a table (LiDAR text file) with the delimited fields into the table. 4. Create a SQL query to select those records that are within the AOI. Use the North and South extents of the AOI for the Y limits and the East and West boundaries for the X limits in the ?where? clause of the SQL statement, as follows: To select LiDAR 2006 points within the AOI: SELECT * FROM Tile48 WHERE (Y > 585778.375989 and Y < 585978.375989 and X >1226239.703921 and X < 1226464.703921); To select LiDAR 2002 points within the AOI: SELECT * FROM Tile50 WHERE (Y > 585778.375989 and Y < 585978.375989 and X >1226239.703921 and X < 1226464.703921); 57 5. Run the query and export the results to a delimited text file with the X, Y, Z and I column headings. 6. The text file of selected points is now ready for import to the Geographical Information System (GIS) software (see below). The number of LiDAR points actually used to assess the bare-ground surface within the AOI for this study was 2,137 (for 2006) and 1,149 (for 2002). 3.5 Geographical Information System (GIS) The Geographical Information System (GIS) software environment used in this study is ESRI?s ArcGIS 9.2 (ESRI 2008). A GIS uses different data types to represent different kinds of information. The steps in processing and analyzing the LiDAR and Ground Survey data required several of these data types, as well as conversions between different data types. The basic types are described here, and further details are given together with specific steps and intermediate results in this Chapter and in Chapter 4. 3.5.1 Data types in a GIS The ArcGIS data types used in this study include the following: 1. Tabular data: Geographic objects that are used in a map to represent features on the surface are stored and managed in tables. These tables contain information that is the basis of the geographic features and allow the information to be visualized, queried, and analyzed using tools within the ArcGIS software. All tables contain rows with the same columns in a specific table. Each colun stores a specific data type (number, date or character) that is used to represent a characteristic of the feature. For example, LiDAR points contain four fields that are used to help identify the location, elevation and intensity of each point. The columns are given names such as X, Y, Z and I. 58 2. Vector/point data: Vector is a data structure, used to store spatial data. Vector data is comprised of lines or arcs, defined by beginning and end points, which meet at nodes. Vector features are frequently used geographic data types that are well-suited for such features as boundaries, road center lines, breaklines, stream channel locations and generally any features that can be represented by lines, polygons or points. A feature is simply an object that stores its geographic representation, which is typically a point, line, or polygon, as one of its properties (or fields) in the row (ESRI 2008). Examples in this study include the Area of Interest and boundary polygon used to specify targeted areas. 3. Shapefile data: Shapefiles are a simple format for storing the geometric location and attribute information of geographic features. Geographic features in a shapefile can be represented by points, lines, or polygons (areas). Examples in this study are the LiDAR points that are saved as shapefiles and the polygons that are used to mask areas for use in calculations. 4. Raster data: Raster data is represented as a series of pixels represented in a grid. The actual digital imagery can be represented as a series of pixels arranged in a grid. A combination of these pixels will create an image that can represent various measurements at a given location on the surface. For example, in this study, rasters are used to represent the elevation across the surface with each pixel representing the elevation within the area covered by that pixel. These grids can represent temperature, slope and other criteria. Raster data type consists of rows and columns of cells, with each cell storing a single value. Raster data can be images (raster images) with each pixel (or cell) 59 containing a color value. Additional values recorded for each cell may be a discrete value, such as land use, a continuous value, such as temperature, or a null value if no data is available. While a raster cell stores a single value, it can be extended by using raster. 3.5.2 Importing LiDAR and Ground Surveyed Points into ArcMap 9.2 The text files containing (x,y,z) data for LiDAR 2006, LiDAR 2002, and Ground Survey were imported into ArcMap as tabular data by using Tools>Add X, Y data. The data were then added as ?Event Layers? and exported as Point Shapefiles to be added as Shapefile Layers for further visualization and analysis. Within ArcMap 9.2, the LiDAR datasets were further subsetted by the polygon defined by the Ground Surveyed points. 3.6 Interpolation To obtain a map of elevation, it is necessary to interpolate between the irregularly spaced (x,y) points of the measured data, whether LiDAR or Ground Surveyed. . Interpolation is defined as the estimation of values (elevation in this study) at locations between actual points with known values. Extending the estimations some distance beyond existing points creates an extrapolated surface. Two interpolation methods are used in this study: Triangulated Irregular Network (TIN) and Optimal Interpolation (Kriging). Tools in ArcGIS 9.2 to implement these methods are introduced below. 3.6.1 Triangulated Irregular Network (TIN) It is fairly straightforward to produce a TIN (see Section 2.4.6) from point data in a GIS. The TIN creates edges and planes connecting adjacent points. The TIN surfaces 60 were created with the following ArcMap 9.2 toolset: ArcToolBox,>3D Analyst>TIN creation and TIN Edit using each point dataset (LiDAR 2006 or ground-surveyed points) to specify the nodes of each triangle created by Delaunay triangulation. The graphical representation of a TIN provides an interpolated view of the surface that is visually appealing, showing the elevation, slope and aspect of the triangular planes formed by the Delaunay Method. How accurately these planes describe the actual terrain depends on the location of the measured points. In a ground survey, the survey team can visually identify key landscape features that will improve accuracy and interpolation. For example, identifying breaklines along the top and bottom of a stream bank can improve the determination of the slope, improving accuracy, if ground-surveyed points are collected there. The use of breaklines and the identification of critical points such as those at the bottom of the bank will also improve the accuracy of the elevations within the plane formed by the connection of these points. The slope, elevation and aspect of the planes formed by the Delaunay triangulation algorithm will be more representative of the actual surface. While ground-surveyed points can be targeted, LiDAR points will fall at random locations that will not clearly identify important breaklines. The location of the LiDAR points occurs randomly since the points are not specifically positioned within the landscape but fall as the laser swath moves across the terrain. These LiDAR points could actually fall some distance from the top of bank. If these points are included with other points that fell within the stream bed away from the bottom of the bank then interpolation between these points will create an edge and plane with a slope that does not accurately reflect the actual slope. The error in the slope affects the elevation of points in the plane 61 which is determined by joining the edges that were formed by connecting these points (Fig. 3-5). A GIS comparison of two TINs gives a qualitative measure of elevation differences; it can show regions where one data set lies above or below the other. However, this information is not sufficient for a quantitative analysis of these differences. It was necessary to convert the TIN vector data, consisting of nodes and edges, to a data format that provides elevation values at every point on a regular grid. Differences in elevation were quantified by converting the LiDAR and ground survey TINs to rasters. The value associated with each pixel is the interpolated elevation of the surface at that pixel, as determined from the respective TIN. The TINs were converted to rasters using ArcMap 9.2 toolsets as follows: ArcToolBox >3D Analyst>Conversion>From TIN, and TIN to Raster. The input TINs were the LiDAR or Ground-Surveyed TIN, the output data type was ?float,? appropriate Ground-surveyed points and interpolated edge from top of bank to bottom of bank LiDAR points and interpolated edge from top of bank to bed of stream. Figure 3-5. Stream channel cross-section showing the differences in slope interpolation between a ground-survey using a visually-identified breakline point at the top of bank and a surveyed point at the bottom of bank and randomly-located LiDAR points 62 for continuously-valued data, the method was ?linear,? which calculates cell values by linear interpolation of the TIN nodes and edges, and the sampling distance was a cell size of 0.5 feet. The exact alignment of the cells in each grid is necessary for accurate processing. The lower left corner of each cell must align for all grids that need to be compared. This is accomplished by setting the Environment Settings as follows: Environment>General Settings>Extent>Same as any other grid used for comparison. In this analysis, the raster converted from the Ground-Surveyed TIN was assigned the same extent as the raster previously converted from the LiDAR TIN. 3.6.2 Optimal Interpolation (Kriging) Because interpolated data are used to make judgments about locations between surveyed points, the error due to interpolation must be evaluated. The difference between the rasterized TINS does not provide estimates of uncertainty. This is because the automated TIN interpolation technique in ArcMap 9.2 does not generate error bars. An alternative is to apply Optimal Interpolation, or Kriging. Kriging produces both a continuous interpolated surface (prediction map) and estimates of uncertainty in that surface (standard error map). With this information, the analyst can account for uncertainty in interpolated quantities and in calculations (such as differences) based on them. The use of Kriging allows for the creation of interpolated continuous surfaces together with estimated uncertainty in those surfaces, based on geospatial statistics. Kriging procedures are available in the Geostatistical Analyst tools in ArcGIS 9.2. Kriging uses autocorrelation (the statistical relationship among measured points) to create 63 a continuous surface that predicts the elevation values where no points exist. The points closest to the interpolation location have a greater weight in the prediction algorithm, as would be expected. However, the weights are based not only on the distance between the measured points and the prediction location but also on the overall spatial arrangement among the measured points. To use the spatial arrangement in the weights, the spatial autocorrelation function must be quantified (ESRI, 2007). Kriged prediction and standard error maps from the LiDAR and Ground-Surveyed points were generated using the ArcMap tool, Geostatistical Wizard, Ordinary Kriging, Prediction Map. (An alternative would have been to apply methods proposed by Hodgson [2004] and Peng [2006] to estimate uncertainty in interpolated elevations; however, the automated Kriging tool was judged to be more efficient for this study?s purposes.) The Kriged prediction maps (shapefile features) were created using the LiDAR and ground-surveyed points and Geostatistical Analyst in ArcMap 9.2. To use this toolset, the analyst must choose one of several possible variogram functions (exponential, spherical, circular ?). The tool also provides the option of fitting the surface with a polynomial of a specified order before analyzing the autocorrelation; this step helps to address the problem of non-stationarity in the spatial data. The analyst selects a specific semivariogram model (including a regression step, a particular semivariogram shape, and the choice of nugget or no nugget, whether or not to include anisotropy). The Geostatistical Analyst automatically fits the selected variogram to the data, and provides goodness of fit tools so the analyst can select the best of the candidate models. The Kriged prediction standard error maps (shapefile features) are created using the LiDAR and ground-surveyed points. The ArcMap tool used to produce Ordinary 64 Kriged Prediction Standard error Maps from the LiDAR points and ground-surveyed points is Geostatistical Wizard> Ordinary Kriging> Prediction Standard error Map. 3.6.3 Creating Rasters from the Kriged Prediction and Standard error Maps The ArcMap kriging routines produce results in a ?Geostatistical Layer? format. The quantitative comparison of the LiDAR-derived and ground survey-derived Kriged Prediction Maps required that each prediction map be converted to a raster so that analysis could be performed using ArcMap?s Raster math. To convert a Geostatistical Analyst Output Layer (Kriged prediction map or Kriged prediction standard error map) to a raster, the ArcMap 9.2 toolset used is ArcToolBox>Geostatistical Analyst>GA Layer to Grid. The parameters were set as follows: 1. Input Geostatistical Layer: The Kriged prediction map or Kriged prediction standard error map for the LiDAR and Ground points 2. Output Surface Grid: The output raster for each prediction map or prediction standard error map 3. Output Cell Size: Set to 0.5 feet or use an existing grid 4. # of points in the cell (horizontal): set to 1 for these grids 5. # of points in the cell (vertical): set to 1 for these grids 6. Environments>General Settings>Extent>Same as other grids that are to be compared. This aligns cells in the various grids so that raster math is accurate on a cell by cell basis. 3.7 Comparing Elevations Based on LiDAR 2006 and Ground Surveyed Data The 2006 LiDAR and Total Station Ground Survey data were collected within 4 days of each other. The Total Station data are used as ground truth to evaluate the accuracy of the LiDAR data in the vicinity of Sopers Branch. This section describes the analysis steps in this comparison. 65 3.7.1 Correcting for Systematic Error The LiDAR data points have a difference in elevation and position from the Ground-Surveyed data in most cases. The differences may include both a random and a systematic component. Before proceeding to analyze random errors, the presence of systematic error was detected and corrected by analyzing the elevation data for the asphalt parking lot, where both the ground-surveyed and LiDAR data would be expected to be most accurate. The parking lot is a relatively flat, asphalt-covered surface with no overlying vegetation or other objects to interfere with the pulses and returns (Fig. 3-6). Therefore, in the absence of any systematic error, the LiDAR and ground elevations for the parking lot would be expected to agree. A consistent difference in elevations over this surface Figure 3-6. Parking Lot within the AOI at Soper?s Run with the March 22, 2006 Survey team. (Photo: K. Brubaker) 66 would be interpreted as systematic error that affects the entire LiDAR scene. The difference between the LiDAR 2006 raster and the Ground-Surveyed raster over the area identified as the parking lot was determined using ArcMap 9.2 Raster Math with the following toolset: ArcToolbox>3D Analyst>Raster Math>Minus. The input rasters were the LiDAR 2006 raster and the ground point raster. All grid cells outside the parking lot were masked out using a binary raster mask. These steps resulted in a raster map of differences (LiDAR 2006 ? Ground Surveyed) over the parking lot, with all other grid cells set to the ?No Data? value. The mean of the parking lot difference raster was found by examining the ArcMap statistical properties of the parking lot difference raster. (In the ArcMap 9.2 Table of Contents, right click on the parking lot difference raster and the General tab to see the values of the raster properties.) The mean difference between the LiDAR and ground parking lot rasters was used to correct the elevations in the LiDAR raster covering the entire ground point polygon. This adjustment was made by using ArcMap 9.2 Raster Math to subtract the mean of the parking lot difference raster from the LiDAR-derived raster covering the ground point polygon. The adjusted LiDAR-derived raster was created by using ArcMap 9.2 with the following toolset: ArcToolbox,>3D Analyst>Raster Math>and Minus. The input values were the LiDAR 2006 raster that had been masked to fit the ground point polygon, and a constant value equal to the mean parking lot difference The difference between the adjusted LiDAR-derived raster and the ground- surveyed raster covering the ground point polygon is the first step in determining the accuracy of LiDAR in measuring surface topography. The difference raster indicates the 67 pixel-by-pixel difference between the adjusted interpolated LiDAR and the interpolated ground-surveyed elevation. The difference raster was created by using ArcMap 9.2 with the following ArcMap 9.2 toolset: ArcToolbox,>3D Analyst>Raster Math>and Minus. The first input raster is the adjusted LiDAR 2006 raster, the second input raster is the Ground Surveyed raster. The second input (Ground Surveyed) is subtracted from the first (LiDAR 2006). The systematic error correction using the mean parking lot difference was applied to both TIN-based and Kriging-based difference analysis. 3.7.2 Statistical Hypothesis Test for Elevation Differences The LiDAR 2006 and Ground Surveyed point data were collected nearly simultaneously (Mar. 18 and Mar. 22, 2006, respectively). The difference between the interpolated LiDAR surface elevations, adjusted for systematic error, and the interpolated ground-surveyed surface elevations should be zero. This zero difference would indicate that the LiDAR elevations, after being adjusted for systematic error, are the same as the elevations determined by the Total Station ground survey. However, if the difference between the LiDAR and ground-survey rasters is different from zero, then it must be determined if these differences are due to random error introduced by the measurement system or by the interpolation; that is, whether the differences are statistically significant. Using the Kriged prediction standard error rasters as determined from the LiDAR and Ground-Surveyed points, it is possible to perform a quantitative hypothesis test on the differences on a grid cell basis. 68 The null hypothesis can be stated as: the difference between LiDAR and ground- surveyed elevations at a given location (grid cell) is zero (H 0: d = 0). A Z-test can be used to test this null hypothesis. The Z-test is used to test the null hypothesis because ArcMap 9.2 Geostatistical Analyst is used to determine the standard error of the difference of the elevations at each cell within the interpolated surfaces, using the kriging equations and the specified semivariogram. The z-statistic is calculated cell by cell using this standard error map. The t-test was not used since the standard error is not calculated from a sample and the degrees of freedom are not known for each cell. The hypothesis can be stated as follows: H 0: d = 0 H A: d ? 0 where, d = the difference between the LiDAR and ground-surveyed elevations at a particular cell or pixel of the raster. The calculation of the z-test statistic is shown in Eq. 3-1. SE d z 0? = (Equation 3-1) where, 0 = the hypothesized difference (0 for all raster cells) SE = the standard error of the difference between the adjusted LiDAR and ground-surveyed surfaces for that cell or pixel (Equation 3-2, below) 69 The null hypothesis is rejected if Z lies in the region of rejection, defined for this two-tailed test by Z > Z critical where Z critical is defined by the selected level of significance, ?. This study applies ? equal to 0.05 and Z critical = 1.96, consistent with the common understanding of error bars as plus or minus approximately two standard errors. The Kriged standard error rasters created using the Geostatistical Analyst and conversion process described above can be used to calculate the standard error of the difference between the LiDAR and ground-surveyed surfaces: SE 2 diff,interp = SE 2 LiDAR,interp + SE 2 Ground,interp (Equation 3-2) Each standard error raster (LiDAR and ground-surveyed) is squared using ArcMap 9.2, ArcToolBox>Spatial Analyst Tools>Math>Square. The resulting rasters are added using ArcMap 9.2, ArcToolBox>Spatial Analyst Tools>Math>Plus. This raster math results in a raster of the standard error of the difference squared (SE 2 diff ). Taking the square root of this raster will create a raster of standard error values for the difference of the adjusted LiDAR and ground-surveyed rasters (SE diff ); this can be calculated with the following toolset: ArcMap 9.2, ArcToolBox>Spatial Analyst Tools>Math>Square Root. The standard error of the difference (SE diff ) raster is used to calculate a raster of Z-test statistics using equation 3-1. The z-statistic raster is calculated by using the following ArcMap toolset: ArcToolBox>Spatial Analyst Tools>Math>Divide. Input raster 1 is the raster of the difference between the adjusted Kriged Prediction LiDAR and the Kriged Prediction Ground point raster. The adjusted Kriged Prediction LiDAR raster is the Kriged Lidar 70 raster minus the adjustment from the mean of the parking lot difference of the Kriged Prediction LiDAR raster and ground-surveyed Kriged Prediction raster. Input raster 2 is the raster of the standard error values as determined by taking the square root of equation 3-2 but using the raster math procedures described above. The resulting raster has a z- statistic value for each cell. The cells can then be classified based on the upper and lower bounds of the region of acceptance. The Z-test statistic raster provides a geospatial analysis of the locations within the difference raster (LiDAR ? ground) where the differences are significantly different from zero at a level of significance of 0.05. Following equation 3-1, the adjusted difference raster is divided by the standard error raster to create a raster of values where each cell within the raster has a value for the Z-statistic at that cell. Since the level of significance is 0.05, the acceptance region must fall within the lower and upper bounds of -1.96 and 1.96, respectively. That is -1.96 < z < 1.96. Where values of the z-statistic fall within the region of acceptance, the differences are considered not to be significantly different from zero. Any cells where the z-statistic falls outside this range are considered to be significantly different from zero ? that is, we are reasonably certain that the difference is real, and not an artifact of errors introduced by interpolation. 3.7.3 Inferring LiDAR system error by comparing LiDAR and Ground It is important to note that the standard errors calculated for the LiDAR and Ground Survey cell elevations in the previous section represent only the uncertainty in measurement introduced by the interpolation procedure ? in other words, only the fourth term under the radical in Equation 2-6. Imposing the assumption that the differences 71 should be zero in all cells, we can then attribute the remaining differences to LiDAR system error, ground survey error, and slope error. LiDAR system error can be considered independently of interpolation and ground-survey error. LiDAR system error results from measurement error in the instrumentation, which can include error in the horizontal and vertical positioning of the LiDAR points. This error due to the data collection system is described in Section 2.4. The pitch, yaw and roll of the aircraft, slope of the surface and GPS and IMU error are included in this measurement error. The other source of LiDAR system error is incorrect labeling of points. Since LiDAR pulses often have four or five returns for each pulse, it is necessary to identify if the last return is bare ground. If these bare-ground points are mislabeled, then additional error is included in the LiDAR system error. If the LiDAR system error and the ground-surveyed error were accounted for in the difference between the Kriged prediction standard error rasters as determined from the LiDAR and ground-surveyed points, then the z-test statistic would indicate that the null hypothesis would be accepted for all cells within the ground point polygon. There would be no cells outside of the region of acceptance since, the difference in the elevations of the LiDAR and ground-surveyed cells would be zero. The critical standard error value for a given cell ? that is, the value of SE required to cause acceptance of the null hypothesis (the difference between the LiDAR 2006 and ground-surveyed surfaces are equal to zero for that cell) ? can be determined as follows: Z critical = |elevation LiDAR ? elevation ground | (Equation 3-3) SE critical SE critical = |elevation LiDAR ? elevation ground | (Equation 3-4) Z critical 72 The value of the Z-statistic which separates the region of acceptance from the region of rejection for a specified level of significance is known as the Z critical value. In this study, the level of significance is 0.05, which has a Z critical value of 1.96. Therefore, using a Z critical value of 1.96, it is possible to determine the critical standard error (SE critical ) from Equation 3-4 that would be necessary to ensure that the null hypothesis is accepted. ArcMap 9.2 Raster Math can be used to find the the critical standard error (SE critical ) at each cell in the map by dividing absolute value of the difference between the interpolated adjusted LiDAR surface elevations and the interpolated ground-surveyed surface elevations by the Z critical (1.96). The total error for LiDAR and ground-surveyed elevations is determined by equations 3-5 and 3-6, respectively. The total error for both datasets includes system and interpolation error as shown in the following equations: Total LiDAR error: SE 2 L,total = SE 2 L,system + SE 2 L,interp (Equation 3-5) Total Ground-survey error: SE 2 G,total = SE 2 G,system + SE 2 G,interp (Equation 3-6) In Eq. 3-5, ?L,system? error is used to refer to all sources of error discussed in Section 2.4 (sensor, position, slope, labeling, etc.) Since the standard error of their difference equals the sum of the squares of the standard errors for the LiDAR and ground systems, it is possible to back-calculate to obtain the square of the LiDAR system error that would be required to make the standard error of the difference equal the critical SE: SE diff 2 = SE L,total 2 + SE G,total 2 = SE L,system 2 + SE G,system 2 + SE L,interp 2 + SE G,interp 2 = SE L,system 2 + SE G,system 2 + SE diff,interp 2 (Equation 3-7) 73 SE L,required 2 = SE crit 2 ? SE diff,interp 2 ? SE G,system 2 (Equation 3-8) The standard error of the ground survey system used in this study is 5 cm which is larger than the 3.1 cm error cited for total station surveying by Hodgson, 2004. This converts to approximately 0.16 feet. This value squared is 0.026 ft 2 , which is small relative to the errors attributable to the critical standard error and the interpolated error. The required LiDAR system error, SE 2 L, required can be calculated by using equation 3-8 with Raster Math in ArcGIS 9.2. The squared standard error of the difference due to interpolation (SE 2 diff ) has already been calculated (Eq. 3-2) and the square of the standard error of the ground system (SE 2 G, system ) is assumed to be 0.026 ft 2 based on the assumed vertical total station survey error. This operation gives a raster of cell-by-cell values of what additional squared error (above and beyond interpolation and ground survey error) would be required to make the Z statistic in that pixel lie within the region of acceptance in the hypothesis test ? in other words, to conclude that the measured difference is not statistically significant (that it is due to random errors in measurement and interpolation rather than to a real difference). In order to characterize the random errors due to slope, labeling, etc., in this terrain, a single value for the required LiDAR system error is desirable. The maximum value of SE 2 L, required is chosen to account for all errors that may occur. This value, if used in calculating the SE 2 diff (Equation 3-7), will result in acceptance of the null hypothesis for every cell in the difference grid. The square root of the square of the standard error of the difference (SE 2 diff ) can then be used as an estimate of the LiDAR system accuracy in dense understory riparian areas such as Sopers Branch. 74 3.7.4 Error Budgets The total squared standard error of the difference between LiDAR and Ground- Surveyed elevation can be expressed as the sum of four parts: SE 2 for the LiDAR system, SE G, system2 for the Ground system and square of the standard error of each of the interpolated maps (SE 2 L, int and SE 2 L, int ). It is of interest to determine the fraction that each component contributes to the total squared standard error: SE L,system 2 SE Diff ,tot 2 + SE G,system 2 SE Diff ,tot 2 + SE L,interp 2 SE Diff ,tot 2 + SE G,interp 2 SE Diff ,tot 2 =1.0 Equation 3-9 where, SE 2 L,system = square of the standard error of the LiDAR system SE 2 G, system = square of the standard error of the ground-survey system SE 2 L, interp = square of the standard error of the LiDAR interpolation SE 2 G, interp = square of the standard error of the ground-survey interpolation SE 2 Diff, tot = square of the standard error of the total error of the difference Using Equation 3-9 and ArcMap 9.2 Raster Math, it is possible to calculate and map the above fractions. For each ratio, the square of the corresponding squared standard error raster is divided by the squared standard error of the total. The assumed value for the Ground system, and the inferred value for the LiDAR System (based on Section 3.8) are used in this analysis. The ratios are determined using ArcMap 9.2 Raster Math, ArcToolBox>Spatial Analyst Tools>Math>Divide. Input 1 is the raster corresponding to the appropriate numerator in Eq. 3-9, and Input 2 is the total standard error of the difference (LiDAR minus Ground) which is determined by Equation 3-10. The ratios will differ from cell to cell in the analysis grid, due to differences in the standard error of interpolation. 75 3.8 Testing for Geomorphic Change from 2002 to 2006 using LiDAR The comparison of LiDAR datasets collected in 2002 and 2006 allow for the evaluation of fluvial geomorphic change detection over this time period. As mentioned, the increased spatial and temporal resolution of LiDAR can greatly improve the ability to detect change over large geographic areas and with more frequent regularity. Comparing 2006 and 2002 LiDAR-derived features could indicate fluvial geomorphic changes at the study site. However, change that is detected must be tested to ensure that it is statistically significant, that is, observed differences are physical and not artifacts of random errors in measurement and interpolation. In this study, these two LiDAR datasets were used to determine what changes in elevation have occurred over this four year period. These changes in elevation can be used to assess geomorphic changes within and along the stream channel and within the riparian zone. The 2002 and 2006 LiDAR datasets were compared for the same area within the CSPA. These LiDAR datasets were obtained by overflights of the CSPA as described in Sections 3.2. The 2006 LiDAR dataset used in this comparison is the same as that compared to the ground-surveyed point elevations as described in Sections 3.6 and 3.7 The LiDAR 2002 dataset was reduced to the extent of the AOI using Microsoft ACCESS as described in Section 3.4. Since all comparisons of surface elevations must have the same surface coverage within the AOI, the 2002 LiDAR points were clipped to the ground point polygon as described in Section 3.5. The steps described in Sections 3.7.1 through 3.7.2 were repeated for the LiDAR 2006 and LiDAR 2002 data sets, to detect changes in elevation between 2006 and 2002. 76 Instead of comparing a kriged LiDAR map to a kriged ground point map, the procedure was now applied to compare the two LiDAR maps, as follows: 1. Create TINS, and compare the TINS qualitatively. 2. Convert the TINS to rasters, correct for systematic error, and calculate a raster difference. 3. Use Geostatistical Analyst to perform Ordinary Kriging, creating prediction maps and standard error maps. 4. Correct the kriged maps for systematic error, and create a difference map by subtracting LiDAR 2006 from LiDAR 2002 to detect change. 3. Create a map of pixel-by-pixel total standard error of the difference, using the kriging standard errors for LiDAR 2006 and LiDAR 2002, and the LiDAR system error, as inferred from the analysis in Section 3.7.3. 4. Determine which, if any, pixels exhibit statistically significant differences of either sign using the two-sided Z test. The standard error used to determine statistical significance of the difference of the LiDAR 2006 and 2002 rasters is found using Equation 3-10. The square root of this raster is used to determine the Z-statistic and includes system and interpolation error for both LiDAR datasets. (Negative differences indicate erosion and positive differences aggradation). Equation 3-10 5. Analyze the contribution of the various error sources in terms of a total error budget. SE L06?L02 2 = SE 06,interp 2 + SE 02,interp 2 + SE 06,system 2 + SE 02,system 2 77 Chapter 4. Results and Analysis This chapter presents the intermediate and final results of implementing the steps described in Chapter 3. Most steps are illustrated with GIS-produced maps. 4.1 Comparison of Surfaces Derived from LiDAR 2006 and Ground-surveyed Points The first comparison, LiDAR measurements to nearly simultaneous ground measurements in 2006, was intended to evaluate the accuracy of LiDAR topography of a small stream and its densely vegetated riparian area. Two point datasets were used to create the surface elevation features: LiDAR 2006 and Ground-Surveyed. 4.1.1. Comparing LiDAR Bare-ground Points and Ground-Surveyed Points Since the entire tile was not needed to assess the surface features within the AOI, the LiDAR points were clipped to the AOI boundaries using ArcMap 9.2. The toolset used to clip these points was ArcToolBox>Analysis Tools>Extract>Clip. Since the ground surveyed points covered only part of the total AOI, any surfaces created from these ground points would be much smaller and could not be compared to the larger surfaces created from the LiDAR points. To ensure that surface features created from both datasets (LiDAR and ground-surveyed) covered the same area, a polygon shapefile (ground point polygon) was created to outline the perimeter of the ground-surveyed points (Fig. 4-1). This polygon includes the stream channel, the road running parallel to the stream, the bridge, the parking lot and the surrounding riparian zone. The ground point polygon was then used as the clip feature to create clipped datasets, which included 78 Figure 4-1. Ground point polygon (green outlined polygon) containing the ground- surveyed points. This polygon was used to clip an area used for comparison of LiDAR and ground survey-derived surface features and grids. 79 points that were located only within the ground point polygon. The clipped subset of the LiDAR 2006 data is shown in Fig. 4-2. These clipped datasets could then be used to create features that represented the bare-ground surface within the ground point polygon. 4.1.2 Creation of TINs for LiDAR 2006 and Ground Survey The point datasets (shapefiles) were used to create Triangulated Irregular Networks (TINs, see Section 2.4.6) representing the 3-D ground surface within the ground point polygon for each dataset. These TINs were then delineated (trimmed) to remove long edges that lay outside the ground point polygon. These edges resulted from the concave shape of the bounding polygon, and do not represent actual terrain. The toolset to delineate these TINs was as follows: ArcToolBox>3D Analyst Tools>TIN Creation, and Delineate TIN Data Area. The edited TINs using the ground-surveyed and LiDAR points are shown in Figs. 4-3 and 4-4, respectively. The interpolated elevations of the stream bed, banks, riparian zone, parking lot and surrounding area are visible in these maps. The edges and faces of the triangular planes are assigned elevation values when the Delaunay triangulation algorithm is used to create the TINs. The ArcGIS Identify Tool gives an elevation at points within the TIN. Since the TIN surface is interpolated by the Delaunay Method, the density of points affects the detail of the TIN surface and the accuracy of the interpolated surface between points. The LiDAR 2006 dataset within the ground point polygon has a denser posting than the ground-surveyed dataset. This is important since greater distances between points increase the interpolation between those points. Denser point postings mean less interpolation. 80 Figure 4-2. Ground point polygon showing the clipped LiDAR 2006 points. 81 Figure 4-3. Triangulated Irregular Network (TIN) derived from ground-surveyed data and delineated to the ground point polygon. 82 Figure 4-4. Triangulated Irregular Network (TIN) derived from LiDAR 2006 data and delineated to the ground point polygon. 83 4.1.3 Elevation Differences Using TINs The GIS can show qualitatively the differences between two TINs (Fig. 4-5). This map was created using ArcMap 9.2 toolset: ArcToolBox>3D Analyst Tools>TIN Surface>TIN Difference. There appear to be large regions in the Area of Interest where the LiDAR and Ground Surveyed maps are not in agreement. However, these differences cannot be quantified using the TIN data structure. In addition, mathematical operations necessary to quantify systematic and random errors cannot be performed. To allow mathematical analysis of the data as interpolated by the TIN, the TIN maps were converted to rasters, as described in Section 3.6.1. A grid cell dimension of 0.5 ft was assigned (grid cell area 0.25 ft2). The raster versions of the TINs for the LiDAR 2006 and Ground Survey, respectively, appear as Figs. 4-6 and 4-7. 4.1.4 Quantitative Comparison of TIN-derived rasters Using the toolsets in ArcMap 9.2, it was possible to determine the elevation differences across the surfaces. However, before subtracting rasters, it was necessary to adjust the LiDAR 2006 raster for systematic error as described in Section 3.6.2. The mean difference of the LiDAR 2006 and Ground-Surveyed TIN-derived rasters over the parking lot surface was -0.16556 feet (Fig. 4-8). The elevation differences over the parking lot are approximately normally distributed, with a standard deviation of about 0.11 ft. (Fig. 4-9). The constant value -0.16556 ft was subtracted from each pixel of the LiDAR 2006 raster. The adjusted LiDAR 2006 raster was used in the remaining analysis. 84 Figure 4-5. ArcMap-generated TIN difference map (LiDAR 2006 minus Ground- Surveyed). Only qualitative information is provided. 85 Figure 4-6. Raster created from the LiDAR 2006 point TIN. Grid size is 0.5 ft. 86 Figure 4-7. Raster created from the Ground-Surveyed TIN. Grid size is 0.5 ft. 87 Figure 4-8. Parking lot difference raster was calculated by subtracting the ground survey- derived raster from the LiDAR 2006-derived raster and multiplying by the parking lot mask (mean difference = -0.16556 feet) 88 Raster math was used to subtract the ground-surveyed raster from the adjusted LiDAR 2006 raster. The difference at each pixel is represented in a difference raster (Fig. 4-10). The smallest differences are on the parking lot surface as would be expected, with the greatest differences in the stream channel and at the banks. The elevation differences range from -2.60 to 2.62 feet with a mean difference across the ground point polygon of 0.10493 feet. The greater differences in and along the channel are most likely attributable to slope error. The bed along the stream bank was incised and the bank was vertical and undercut in many locations. This would lead to error in the interpolated LiDAR measurements since the LiDAR points are randomly positioned while the surveyed points are taken at specifically identified fluvial geomorphic features such as top and bottom of bank (Section 3.6.1, Fig. 3-7). Histogram of Parking Lot Difference Raster(L2006-ground) CO UN T 180 160 140 120 100 80 60 40 20 0 Figure 4-9. Histogram showing the distribution of elevation differences in the parking lot (LiDAR 2006 minus Ground Surveyed), minimum value -0.74 ft, maximum value 0.16 ft., mean value -0.16556 ft. 89 Figure 4-10. Difference raster showing the elevation differences between the rasters created from the LiDAR 2006 TIN and the ground-surveyed TIN. 90 4.1.5 Interpolating Point Elevation Measurements with Kriging As discussed in Section 3.4.8, subtracting the TIN-interpolated LiDAR 2006 and Ground- Surveyed maps gives an estimate of elevation differences over the study area, but does not provide uncertainty, or error bars, on those estimates. To address this lack, the interpolation procedure was repeated using Optimal Interpolation (Kriging). The specified and calculated model settings are shown in Table 4-1 for the LiDAR 2006 data, and in Table 4-2 for the Ground Survey data (The procedures used in ArcMap 9.2 are summarized in Section 3.4.8). The different autocorrelation functions reflect underlying differences in the two sets of point data. The kriged elevation (prediction) maps for the LiDAR 2006 and Ground Survey data appear, respectively, as Figures 4-11 and 4-12. 4.1.6 Creating Ordinary Kriged Standard error Maps A standard error map is produced for both the LiDAR and the ground survey points. In each case, care was taken to supply the same parameters for the autocorrelation model (semi-variogram) as used in the prediction map (Tables 4-1 and 4-2). The ArcMap Geostatistical Analyst makes this process fairly automatic by storing the model information with the shapefile data. (An alternative method for creating the Ordinary Kriged Standard error Maps is to right-click the name of the Kriged Prediction Map in the ArcMap Table of Contents and click on Create Prediction Standard error Map. This creates the standard error map with the same parameters and methods used to create the Ordinary Kriged Prediction Map.) The standard error maps produced by the Geostatistical Analyst appear as Figs. 4-13 and 4-14. 91 Table 4-1. Geostatistical Analyst (GA), Ordinary Kriging Prediction Map Setting and Calculated Values Using LiDAR 2006 Points Setting or Model Parameter Specified by Analyst Calculated by GA Transformation None Order of trend removal None Model Spherical Search Direction None Lag size 2 Number of lags 12 Partial sill 1.1508 Neighbors to include 50 Include at least Off Major range 23.7065 Sector type One sector Regression function Y = 0.948x + 6.33 Root Mean Squared Error 0.1898 Average standard error 0.3512 Table 4-2. Geostatistical Analyst (GA), Ordinary Kriging Prediction Map Setting and Calculated Values Using Ground Survey Points Setting or Model Parameter Specified by Analyst Calculated by GA Transformation None Order of trend removal None Model Spherical Search Direction None Lag size 2 Number of lags 12 Partial sill 1.8485 Neighbors to include 50 Include at least Off Major range 15.3646 Sector type One sector Regression function Y = 0.896x + 39.963 Root Mean Squared Error 0.6841 Average standard error 0.7833 92 Figure 4-11. Ordinary kriged prediction map created from LiDAR 2006 points. 93 Figure 4-12. Ordinary kriged prediction map created from ground-surveyed points. 94 Figure 4-13. Ordinary kriged standard error map created from LiDAR 2006 points. 95 Figure 4-14. Ordinary kriged standard error map created from ground-surveyed points. 96 The standard error maps clearly show that the uncertainty introduced by interpolation is smallest closest to the measurement points, as would be expected. These maps also show that the denser posting of the LiDAR points (Fig. 4-13) caused less standard error than the less dense ground-surveyed points (Fig. 4-14). The highest standard errors occur at locations on the surface where the point density is lowest or where the distance between the points is greatest. In Fig. 4-13, the standard error is greater within the stream channel since LiDAR pulses were not returned from the water due to absorption of the laser. This creates a gap in the points at those locations leading to increased interpolation and greater error in the determination of elevation differences between LiDAR and ground pixels at those locations. The features produced by the Geostatistical Analyst were converted to rasters using the techniques described in Section 3.6.1. The raster versions of the prediction maps are shown in Figs. 4-15 and 4-16, and the raster versions of the standard error maps in Figs. 4-17 and 4-18. Visually, the raster versions (Figs. 4-15 through 4-18) appear very similar, if not identical, to the corresponding Geostatistical Analyst features (Figs. 4-12 through 4-15). However, the underlying data structure is very different, and the raster format is necessary for any further mathematical operations on the interpolated maps. The raster converted from the kriged LiDAR 2006 prediction was adjusted for systematic error by subtracting the mean difference (-0.15892 feet) between the LiDAR 2006 kriged prediction raster and the ground point kriged prediction raster over the parking lot surface (Fig. 4-19). 97 Figure 4-15. Ordinary kriged raster derived from the ordinary kriged prediction map created from LiDAR 2006 points 98 Figure 4-16. Ordinary kriged raster derived from the ordinary kriged prediction map created from ground-surveyed points 99 Figure 4-17. Ordinary Kriged Prediction Standard error raster created from the Kriged prediction standard error map using LiDAR 2006 points. 100 Figure 4-18. Ordinary Kriged Prediction Standard error raster created from the Kriged prediction standard error map using ground-surveyed points. 101 Figure 4-19. Ordinary Kriged parking lot difference raster (Kriged LiDAR 2006 raster minus kriged Ground-Surveyed raster over the parking lot surface). Mean elevation difference for (Lidar-Ground) is -0.15892 ft. 102 The raster showing the difference between the adjusted LiDAR 2006 kriged prediction raster and the ground-surveyed prediction raster is shown in Fig. 4-20. The mean elevation difference is 0.23773 feet. The largest differences in elevation between the adjusted LiDAR 2006 Kriged Prediction raster and ground-surveyed Kriged Prediction raster are along and within the stream channel. The smallest differences between the two rasters are at the parking lot. The greater differences near the stream are most likely due to steep slopes along the bank that were not accounted for in the interpolation between the LiDAR points. The incision of the stream bed and the erosion along the bank created nearly vertical bank slopes at locations along the studied reach. These steep and often vertical slopes were accounted for in the ground-survey by creating a breakline. However, the LiDAR points fell at random locations often several feet away from the top and bottom of bank. Also, since the frequency of the LiDAR laser was in the near infrared spectrum, the water within the stream channel absorbed the laser and did not reflect points. The absorption of near infrared LiDAR pulses by water would likely lead to interpolation to points that were reflected from such stream channel locations as point bars, rocks or dry beds. The ground survey included points at the top of bank and at top of water within the channel, as well as, on point bars, rocks and dry bed locations. The location of the returned LiDAR points would create interpolated stream bank slopes that were not indicative of the actual bank slopes. 4.1.6 Testing for Elevation Differences The z-statistic was calculated on a pixel-by-pixel basis across the prediction surface using the methods described in Section 3.7.3. Using a level of significance of 103 Figure 4-20. Raster showing the difference between adjusted Kriged LiDAR 2006 raster and kriged Ground-Surveyed raster. 104 0.05, the surface locations were significantly different from zero along the stream bank and near the channel. The other locations within the ground point polygon did not have elevation differences between the Kriged LiDAR 2006 raster and the Kriged ground- surveyed raster that are significantly different from each other (Fig. 4-21). The areas where the z-statistic was within the region of acceptance (-1.96 < z <1.96) included most of the riparian area, the parking lot and most of the locations within and adjacent to the stream channel. These areas show no statistically significant differences in elevation between the Kriged raster created from the ground-surveyed points and the Kriged raster created from the LiDAR 2006 points. The areas where the z-statistic was outside the region of acceptance (z < -1.96 or z > 1.96) are either on or in close proximity to the bank or in the stream channel. This may have been due to the steep bank caused by erosion and/or the random placement of the LiDAR points and the interpolation error caused by these points. The gap of LiDAR points within the channel due to absorption by the water may have increased the standard error and increased the elevation difference. Since the z- statistic is the ratio of elevation difference to standard error, an increase in elevation difference could lead to greater Z-values. The elevation differences are greater in many locations within the stream channel (Fig. 4-20). 4.1.6 Estimating LiDAR System Error by Reversing the Hypothesis Test Following the procedures described in Section 3.7.3, the raster of critical standard error values was calculated (Fig. 4-22). The largest critical standard error is 1.38 ft as calculated using Equation 3-4 and shown in Figure 4-22. The largest interpolated error is 105 Figure 4-21. Map showing the locations where the difference between the interpolated LiDAR 2006 and ground-surveyed elevations were within the 95% Confidence Interval and where they were exceeded. Brown and blue areas were outside the 95% Confidence Interval. The SE of the difference of LiDAR 2006 and ground-surveyed was used. 106 Figure 4-22. Critical standard error raster for the Ordinary Kriged difference raster of the LiDAR 2006 and ground-surveyed points. 107 2.56 ft and is shown in Figure 4-20 showing a difference raster calculated by subtracting the ground-surveyed raster from the adjusted LiDAR 2006 raster. The required LiDAR system squared error (SE 2 required ) to achieve the critical standard error is determined from Equation 3-8 following the steps defined in Section 3.7.3. The resulting values are displayed in Fig. 4-23. Negative values occur where the calculated SE 2 already equaled or exceeded the critical value (that is, where the null hypothesis was already accepted based only on interpolation error). The maximum value obtained in this calculation is 1.61 ft 2 . Taking the square root, this implies that the standard error of the LiDAR measurements is 1.26 ft, giving a 2-SE error bar of plus or minus 2.52 ft. The z-statistic was re-calculated on a pixel-by-pixel basis across the prediction surface using Equation 3-8, including the assumed ground survey standard error, and the inferred Lidar System error. Using a level of significance of 0.05, there were no areas that showed statistically significant differences in elevation between the Kriged raster created from the ground-surveyed points and the Kriged raster created from the LiDAR 2006 points when incorporating these estimates in addition to the interpolation error calculated by kriging (Fig. 4-24). 108 Figure 4-23. Additional squared error necessary to obtain Z critical for acceptance of the hypothesis of equality. Negative values occur in pixels where the Z statistic was already in the range of acceptance. The highest positive value is 1.61. The inferred SE for the LiDAR system is 1.26 (square root of 1.61). 109 Figure 4-24. Map showing the locations where the difference between the interpolated LiDAR 2006 and ground-surveyed elevations were within the 95% Confidence Interval, after including the assumed Ground Survey system error and the inferred LiDAR system error in calculating the standard error of the elevation difference. The inferred SE of the LiDAR system (1.26) was used. 110 4.1.7 Evaluating the Error Budget Components for LiDAR 2006 ? Ground The ratio of the squared error for each source of error (LiDAR system, ground system, and interpolation) was calculated using ArcMap 9.2 Raster Math using the steps described in Section 3.7.4. Maps of these ratios are shown in Figures 4-25, 4-26 and 4-27, respectively. Because the LiDAR system and Ground system errors are assumed constant, the spatial variability is due to interpolation error. The sum of these rasters equals a raster of ones as is shown is Figure 4-28. Statistics of the squared error ratio maps are presented in Table 4-3. The mean values of the ratios indicate that the greatest contribution to the total squared standard error of the difference in elevation between the surfaces derived from LiDAR 2006 and ground-surveyed points is the LiDAR system. The mean of the ratio of the square of the standard error to the total error as shown in Table 4-3 indicates that the greatest fraction of the total error would be attributed to the LiDAR system, and the least to interpolation error Table 4-3. Statistics of Cell-by-Cell Ratios of Squared Standard Error to the Total Squared Error for Each Source of Error, LiDAR 2006-Ground Source Mean (SE 2 /SE 2 Diff,tot ) Maximum (SE 2 /SE 2 Diff, tot ) LiDAR Interpolation 0.05 0.26 Ground Interpolation 0.25 0.60 LiDAR System 0.70 0.96 111 Figure 4-25. Ratio of SE 2 of the LiDAR 2006 system to the total SE 2 . 112 Figure 4-26. Ratio of SE 2 of the LiDAR interpolation to the total SE 2 . 113 Figure 4-27. Ratio of SE 2 of the Ground-survey interpolation to the total SE 2 . 114 Figure 4-28. Map showing the sum of the ratio rasters, verifying that the sum of the squared error ratios is equal to 1 in all grid cells. 115 4.2 Comparison of Surfaces Derived from LiDAR 2006 and LiDAR 2002 Points The second comparison, LiDAR measurements collected in 2006 to LiDAR measurements collected in 2002, was intended to evaluate the potential of using LiDAR to detect fluvial geomorphic change of a small stream and its densely vegetated riparian area over time. The same methods that were used to compare the differences in elevation between the LiDAR 2006 and ground-surveyed datasets were used to compare the two LiDAR datasets. First, the 2002 LiDAR points were clipped to the ground points polygon (Fig. 4-29). Next, a LiDAR 2002 TIN was created (Fig. 4-30) to provide a qualitative comparison of the surfaces derived from the two LiDAR datasets, 2002 and 2006 (Fig. 4- 31). The LiDAR 2002 TIN was converted to a raster using the methods described in Section 3.6.1 so that elevation differences could be determined. The raster derived from the LiDAR TIN is shown in Fig. 4-32. The difference map is determined by using raster math as described in Section 3.7.1. The raster derived from the LiDAR 2006 TIN is adjusted for systematic error by first determining the mean difference of the LiDAR 2006 and LiDAR 2002 rasters over the parking lot area (Figure 4-33) and then subtracting this difference from the LiDAR 2006 raster. The mean difference over the parking lot was found to be -0.14988 feet. The difference raster was created by subtracting the LiDAR 2002 raster from the adjusted LiDAR 2006 raster (Figure 4-34). This raster shows the differences in elevation for each 0.25 ft 2 pixel across the surface. The stream channel shows a negative difference 116 Figure 4-29. LiDAR 2002 points within the ground point polygon 117 Figure 4-30. Triangulated Irregular Network (TIN) derived from LiDAR 2002 data and delineated to the ground point polygon. 118 Figure 4-31. Qualitative difference of the LiDAR2006 and LiDAR 2002 TINs. 119 Figure 4-32. Raster derived from LiDAR 2002 TIN. 120 Figure 4-33. Difference raster showing the LiDAR 2006 minus LiDAR 2002 raster over the parking lot. The mean difference is -0.14988 feet. 121 Figure 4-34. Raster of the difference between the TIN-based LiDAR 2006 adjusted raster and LiDAR 2002 raster. 122 of approximately 2.3 feet along the studied reach. Since the difference raster was determined by subtracting the 2002 elevations from the 2006 elevations, the negative elevation difference within the channel may indicate that erosion has occurred within the channel during this four year period. A field inspection of the stream channel in March 2006 indicated incision of the channel bed and erosion along the banks. Many locations along the stream bank were undercut or vertical. The erosion may be the result of increased discharge caused by the Clarksburg urbanization occurring upstream of the Area of Interest. There appears to be an increase in elevation along the top of bank and within the riparian zone closest to the stream channel. This could be due to sediment deposition during flooding. However, as with the LiDAR 2006 and the ground-surveyed datasets, the elevation differences assessed by using TIN-derived rasters do not provide confidence intervals on the differences. Therefore, new difference maps were created using ordinary kriging (optimal interpolation). Kriging provides interpolated continuous surfaces, and estimated uncertainty in those surfaces. The steps parallel those taken for the first comparison (LiDAR 2006 to Ground Surveyed), as described in Section 4.1. The semivariogram model parameters assigned and calculated for LiDAR 2002 are given in Table 4-4. The same semivariogram is used for LiDAR 2006 as in Table 4.1. The Ordinary kriged prediction map and standard error map for the LiDAR 2002 data are shown in Figures 4-35 and 4-36, respectively. These Geostatistical Analyst features were used to create prediction rasters and standard error raster based on the LiDAR 2002 data set, shown in Figures 4-37 and 4-38. The LiDAR 2002 raster was adjusted by the mean parking lot difference (-0.15298 feet, Fig. 4-39). The difference 123 Table 4-4. Geostatistical Analyst (GA), Ordinary Kriging Prediction Map Setting and Calculated Values Using LiDAR 2002 Points Setting or Model Parameter Specified by Analyst Calculated by GA Transformation None Order of trend removal None Model Spherical Search Direction None Lag size 2 Number of lags 12 Partial sill 0.73784 Neighbors to include 50 Include at least Off Major range 23.7065 Sector type One sector Regression function Y = 0.986x + 5.498 Root Mean Squared Error 0.1835 Average standard error 0.3401 124 Figure 4-35. Ordinary Kriged Prediction Map created from the LiDAR 2002 dataset. 125 Figure 4-36. Ordinary Kriged Standard error Map created from the LiDAR 2002 dataset. 126 Figure 4-37. Ordinary Kriged Prediction Map raster created from the LiDAR 2002 dataset. 127 Figure 4-38. Ordinary Kriged Standard error Raster created from the LiDAR 2002 dataset. 128 Figure 4-39. Parking Lot Difference of the Ordinary Kriged Prediction raster (LiDAR 2006 ? LiDAR 2002). Mean Difference is -0.15298 feet which was used to adjust the LiDAR 2006 raster for systematic error. 129 between the two Ordinary Kriged rasters (LiDAR 2006 adjusted minus LiDAR 2002) was determined and is shown in Figure 4-40. These rasters were then used to calculate the z-statistic raster as described in Section 3.5. The z-statistic was calculated on a pixel- by-pixel basis across the prediction surface using Equation 3-1 where the standard error (SE) includes the inferred LiDAR System error for both the LiDAR 2002 and 2006 data sets (1.26 ft, as inferred in Section 4.16) and the standard error due to interpolation of both LiDAR data sets (Equation 3-10). Using a level of significance of 0.05, there were no areas that showed statistically significant differences in elevation between the Kriged raster created from the LiDAR 2006 points and the Kriged raster created from the LiDAR 2002 points when incorporating the standard error estimates of the LiDAR 2002 and 2006 system in addition to the interpolation error calculated by kriging (Fig. 4-44). 130 Figure 4-40. Map showing the difference of the adjusted LiDAR 2006 surface and the LiDAR 2002 surface. 131 Figure 4-41 . Map showing the locations where the difference between the interpolated LiDAR 2006 and LiDAR 2002 elevations were within the 95% Confidence Interval, after including the inferred LiDAR system error for both LIDAR 2006 and 2002 datasets in calculating the standard error of the elevation difference. The inferred SE of the LiDAR system (1.26 ft) was used. 132 4.2.2 Evaluating the Error Budget Components for LiDAR 2006 ? LiDAR 2002 The ratio of the squared error for each source of error (LiDAR 2006 system, LiDAR 2002 system, LiDAR 2006 interpolation, and LiDAR 2002 interpolation) was calculated using ArcMap 9.2 Raster Math using the steps described in Section 3.7.4. Maps of these ratios are shown in Figures 4-42 through 4-45, respectively. Because the LiDAR system errors are assumed constant, the spatial variability is due to interpolation error. Statistics of the squared error ratio maps are presented in Table 4-5. The mean values of the ratios indicate that the greatest contribution to the total squared standard error of the difference in elevation between the surfaces derived from LiDAR 2006 and LiDAR 2002 points is the LiDAR system for each LiDAR dataset. The mean of the ratio of the square of the standard error to the total error as shown in Table 4-5 indicates that the greatest fraction of the total error would be attributed to the LiDAR 2006 and 2002 system, and the least to interpolation error. Table 4-5. Statistics of Cell-by-Cell Ratios of Squared Standard Error to the Total Squared Error for Each Source of Error, LiDAR 2006-LiDAR 2002 Source Mean (SE 2 /SE 2 Diff,tot ) Maximum (SE 2 /SE 2 Diff, tot ) LiDAR Interpolation 0.06 0.21 LiDAR 2002 System 0.47 0.50 LiDAR 2006 System 0.47 0.50 133 The large percentage of error that results from LiDAR system error requires that the proprietary software and the procedures used to label LiDAR points be improved. The bare-ground points are labeled by both software algorithms and human analysis but have led to such a high percentage of the standard error that it is impossible to determine differences in the surfaces derived from the LiDAR 2006 and LiDAR 2002 points. This makes change detection over time impossible. If change detection is to be determined by comparing the differences in LiDAR-derived surfaces then LiDAR system error must be reduced. 134 Figure 4-42. Ratio of SE 2 of the LiDAR 2006 system to the total SE 2 . 135 Figure 4-43. Ratio of SE 2 of the LiDAR 2002 system to the total SE 2 . 136 Figure 4-44. Ratio of SE 2 of the LiDAR 2006 interpolation to the total SE 2 . 137 Figure 4-45. Ratio of SE 2 of the LiDAR 2002 interpolation to the total SE 2 . 138 Chapter 5. Conclusions and Discussion The primary purpose of this study was to determine the acceptability of LiDAR to determine change in topography in and around a small stream. If LiDAR can be used to evaluate the differences in elevation over time then spatial and temporal resolution would be improved, enabling large geographic areas to be evaluated for change detection over time. The results of this study show that differences in interpolated topography within and around Sopers Branch are largely attributable to errors in the LiDAR data collection and processing, and should not be interpreted as real changes. Accounting for all sources of error in the difference map, differences were not statistically significant using a 5% significance level. In Tables 4-3 and 4-5, the greatest fraction of the total error is due to the LiDAR system. The LiDAR system error was found to be 70% of the total error when determining the difference between the LiDAR 2006 and ground-surveyed interpolated surfaces while interpolation error was only 30%. When comparing LiDAR datasets from 2002 and 2006 in the same area, the LiDAR system error was found to contribute 94% of the total error. Interpolation error was only 6%, due to the point density of the LiDAR data. The LiDAR system error may have included any or all of the following: instrumentation measurement error and labeling error caused by software algorithms and/or human error. The identification of LiDAR points as bare-ground is an essential step in identifying the surface; this is the most likely source of the system error identified 139 in this study. Because of the proprietary nature of the LiDAR data processing, it was not possible to examine the raw point data and evaluate the bare earth detection algorithm used by the data provider for this study. While LiDAR system error is the dominant error, other errors contribute to the total error and have an impact on determining the statistical significance of the elevation differences. For example, LiDAR is absorbed by water in the stream and does not reflect points from locations where surface water is present. This creates areas that are void of LiDAR points, causing increased interpolation error. It is possible that the standard error applied to the ground survey was underestimated. The positioning of the survey rod, the type of rod base (flat or pointed), and human error in the reading can affect the error. While this error is estimated in the literature to be about 3-5 cm, conditions specific to the site can increase this error. Assigning a larger value to the ground survey standard error would have resulted in a lower estimate of LiDAR system error. The purpose of this study was to determine if LiDAR could be used to determine fluvial geomorphic change. The benefits are obvious; increased spatial and temporal resolution allowing for measurement of surface elevations over large areas. However, it was found that LiDAR system error must be reduced so that accurate topographic measurements can be obtained. The diverse vegetation and topography within a riparian area seems to increase the LiDAR system error. Change detection cannot be determined over time using LiDAR if the LiDAR system error makes the elevation differences statistically insignificant. It is the large LiDAR system error that must be reduced if the use of LiDAR is to be acceptable for 140 determining surface elevations and change in those elevations over time. Improvements to the various aspects of the LiDAR system are important and include instrumentation measurement (location and elevation), labeling and algorithms and manual identification of ground points. Users of LiDAR elevation measurements need to account for uncertainty in these data before making conclusions about geomorphic change based on visual inspection of maps. 141 References Airborne 1, Clarksburg LiDAR Mapping Report, December, 2002. Bowen, Zachary and Waltermire, Robert G., 2002. Evaluation of Light Detection and Ranging(LiDAR) for Measuring River Topography, Journal of American Water Resources Association, 38(1):33-41. Burtch, Robert, LiDAR Principles and Applications, 2002 IMAGIN Conference, Traverse City, MI, pp. 1-13, 2002. Canaan Valley Institute, Mission Mapping Report and LiDAR Accuracy/ Postings Density Assessment Report for LiDAR Data and CIR 3-band Digital Orthophotography of the Clarksburg Special Protection Area, May,2006. ESRI, ArcGISArcMap 9, Using ArcMap Desktop, 2006. Redlands, CA, ESRI, Using Geostatistical Analyst, Redlands, 2007. CA, Hans, Z., Tenges, R., Hallmark, S., Souleyrette, R.,and Pattnaik, S., 2003. Use of LiDAR-based Elevation Data for Highway Drainage Analysis: A Qualitative Analysis, Iowa State University, Center for Transportation Research and Education, Mid-Continent Transportation Symposium, pp. 1-19. Harrelson, Cheryl C., Rawlins, C.L., Potyondy, John P., Stream Channel Reference Sites: An Illustrated Guide to Field Technique, USDA Forest Service, Rocky Mountain Research Station, General Technical Report RM-245, April, 1994. Hodgson, Michael E., and Bresnahan, Patrick, 2004. Accuracy of Airborne LiDAR- Derived Elevation: Empirical Assessment and Error Budget, Photogrammetric Engineering and Remote Sensing, 70(3):331-339. 142 Hodgson, Michael E., Raber, G., Jason, T., Davis, B., Thompson, G., and Schuckman, K, 2005. An Evaluation of LiDAR-derived Elevation and Terrain Slope in leaf-off Conditions, Photogrammetric Engineering and Remote Sensing, 71(7):817-823. Jensen, John R.,2000. Remote Sensing of the Environment, an Earth Resource Perspective, Upper Saddle River: Prentice-Hall, Inc. Leopold, Luna B., Wolman, M. Gordon, Miller, John P. Fluvial Processes in Geomorphology, 1964. New York: Dover Publications, Inc., 1995. Peng, Misao-Hsiang and Shih, Tian-Yuan, 2006. Error Assessment in Two LiDAR- derived TIN Datasets, Photogrammetric Engineering and Remote Sensing, 72(8):933-947. Reusser, Lucas and Bierman, Paul, 2007. Accuracy Assessment of LiDAR-derived DEMs of Bedrock River channels: Holtwood Gorge, Susquehanna River, Geophysical Research Letters, 34, L23S06, doi:10.1029/2007GL031329. Rosgen, David L., 1994. A Classification of Natural Rivers, Catena, 22:169 ? 199. Shank, Michael, 2001. An Assessment of LiDAR Data Quality, West Virginia Department of Environmental Protection, TAGIS Unit, pp. 1-6. Smith, Sean and Hermann, Michael, 2006. Stream Mapping in Maryland, Finding First Order Piedmont Channels, Maryland Department of Natural Resources, pp. 1- 43. ?Special Protection Areas?, Montgomery County, Maryland Department of Environmental Protection, 20 June 2008, . 143 144 Walker, Jeffrey P. and Willgoose, Garry R., July, 1999, On the Effect of Digital Elevation Model Accuracy on Hydrology and Geomorphology, Water Resources Research, 35(7):2259 -2268.