## Abstract

We report new methods for retrieving atmospheric constituents from symmetrically-measured lidar-sounding absorption spectra. The forward model accounts for laser line-center frequency noise and broadened line-shape, and is essentially linearized by linking estimated optical-depths to the mixing ratios. Errors from the spectral distortion and laser frequency drift are substantially reduced by averaging optical-depths at each pair of symmetric wavelength channels. Retrieval errors from measurement noise and model bias are analyzed parametrically and numerically for multiple atmospheric layers, to provide deeper insight. Errors from surface height and reflectance variations are reduced to tolerable levels by “averaging before log” with pulse-by-pulse ranging knowledge incorporated.

© 2014 Optical Society of America

## 1. Introduction

Future airborne and space missions call for unprecedentedly high precision in global measurements of atmospheric constituents and parameters [1, 2]. For example, the Active Sensing of CO_{2} Emissions over Nights, Days, and Seasons (ASCENDS) mission [1] has been planned by NASA to measure the global distribution of carbon dioxide (CO_{2}) mixing ratios (~400 ppm on average) to ~1 ppm precision. To meet such stringent requirements, nadir-viewing, direct-detection, pulsed integrated-path differential-absorption (IPDA) lidar techniques are being developed to measure the two-way optical absorption of the target species from the spacecraft to the surface and back at multiple wavelength channels [3, 4]. From the absorption and altimetry measurements and other ancillary data of the atmosphere, the dry mixing ratios of the target species can be retrieved [5–7]. The errors of the retrieved mixing ratios essentially arise from the random measurement noise, the forward model bias, and errors in the forward model parameters.

Throughout this paper, a candidate ASCENDS lidar approach [4] being developed at NASA Goddard will be used as a concrete example. This IPDA lidar approach allows simultaneous measurement of CO_{2} and surface height in the same path [4, 6]. As shown in Fig. 1, a pulsed laser is wavelength-stepped across a single CO_{2} line at 1572.335 nm to measure the optical depths (ODs) at 4 pairs of symmetric laser frequency channels [8]. The laser frequency fluctuation causes a variation in measured CO_{2} transmittance, resulting in an uncertainty in the target mixing ratio retrieval. To reduce this uncertainty, laser pulses at each fixed wavelength are carved from a frequency stabilized continuous-wave (CW) laser [8, 9]. The ~1-μs pulses need to be at least ~100 μs apart to eliminate crosstalk from cloud scattering. This approach has also been adopted to scan an O_{2} absorption line doublet for an atmospheric pressure measurement [10], and to measure atmospheric methane concentrations [11].

Comprehensive analyses of IPDA measurement errors and errors of column-averaged retrieval from differential absorption optical depth (DAOD) measured at two (on/offline) wavelength channels have been published [3–7, 12–15]. Nevertheless, the unprecedented precision targets call for more rigorous modeling and error reduction methods, and closer scrutiny of previously neglected error sources. We have recently reported new methods to quantify and reduce errors for DAOD measurement arising from the laser frequency noise, broadened laser line-shape, statistical bias in “log after averaging”, surface height and reflectance variations [16]. The present paper generalizes the formulation from our previous publication [16] and uses established inversion methods [17] to address multiple-layer retrievals from the absorption spectra measured at multiple pairs of symmetric wavelength channels. Our forward model is essentially linearized by linking the estimated ODs rather than the transmittance values to the mixing ratios. This allows us to link both the relative random error (RRE) and the relative systematic error (RSE) of the retrieved mixing ratios, arising from the measurement noise and model bias, respectively, to characteristic parameters to provide deeper insight into system optimization and limitation. We show that errors from the laser frequency drift and spectral distortion (due to, e.g., etalon fringes and surface reflectance variations) can be substantially reduced by placing the laser frequency channels symmetrically about the center of the target absorption line profile and averaging the two ODs measured at each pair of symmetric channels to cancel out errors. Our model includes the laser line-shape factor and thus remains accurate even when the laser line-shape is broadened for the suppression of stimulated Brillouin scattering (SBS) in the laser amplifiers.
Retrieval errors from imperfect forward model parameters (including spectroscopic parameters, surface pressure, atmospheric temperature profile, and the water vapor profile) have been thoroughly studied [5, 7, 12, 13, 18–21] and are outside the scope of this paper. Owing to the continuing advances in the absorption line-shape modeling studies [20, 21], the retrieval errors from the inaccuracy of the absorption line-shape modeling are diminishing.

Our forward model is presented in section 2. The retrieval and error analysis methods are derived in section 3. A numerical example of the parametric error analysis is presented in section 4. More considerations for the observing systems are discussed in section 5. Some supporting details are provided in the Appendices. Throughout this paper, matrices are denoted by bold face upper case, e.g., **K**, column vectors by bold face lower case, e.g., **b**, and the transpose by superscript *T*, e.g., **K**^{T}. $K\text{'}$ represents an estimate of $K,$ and $\mathrm{det}(K)$ the determinant of $K$. $\mathrm{Cov}(x,y)$ represents the covariance of $x$ and $y$. ${\sigma}^{2}(x)$ (or $\mathrm{Var}(x)$) represents the variance, $\sigma (x)$ the standard deviation, and $\overline{x}$ the ensemble average of $x$.

## 2. Forward model

#### 2.1 Dependence of optical depth on mixing ratios

Using ancillary data of atmospheric temperature profile, water vapor mixing ratio ${q}_{{H}_{2}O}$ and surface pressure, the atmospheric pressure $p(r)$ at range *r* from the spacecraft can be determined for each vertical column of the atmosphere. *p* decreases monotonically with height and is conveniently used as the vertical coordinate in this paper. The laser frequency noise contributes to measurement errors through two complementary factors, the energy spectral density (ESD) $L({\nu}_{F})$ of each single laser pulse (as a function of the Fourier frequency ${\nu}_{F}$), and the fluctuation of the line-center frequency ${\nu}_{c}$ of $L({\nu}_{F}).$ The present model uses the effective two-way OD $\tau ({\nu}_{c},2r)$ of the target species [16] to include the line-shape factor, and thus remains accurate even when the laser line-shape is broadened. $\tau ({\nu}_{c},2r)$ can be related to the dry mixing ratio ${q}_{gas}(p)$ and the two-way effective weighting function $w({\nu}_{c},p)$ of the target species as $\tau ({\nu}_{c},2r)\text{\hspace{0.17em}}={\displaystyle {\int}_{0}^{p(r)}{q}_{gas}(p)w({\nu}_{c},p)dp}$ (see Appendix A for details). This relationship becomes linear and much simpler when $L({\nu}_{F})$ is much narrower than the spectral width of the target absorption line. In this narrow line-width case, $\tau ({\nu}_{c},2r)$ becomes twice the one-way monochromatic OD ${\tau}_{0}({\nu}_{F},p(r))$ and $w({\nu}_{c},p)$ twice the one-way monochromatic weighting function ${w}_{0}({\nu}_{F},p)$. ${\tau}_{0}({\nu}_{F},p)$ and ${w}_{0}({\nu}_{F},p)$ are given by

… H_{2}O and one dry air molecule, *g* is the gravitational acceleration of the Earth, and ${N}_{gas}(p)$ is the number density of the target molecules. This narrow line-width case is assumed for numerical examples throughout this paper.
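For reference, the narrow line-width relations can be recovered from hydrostatic balance. The following is a sketch under stated assumptions, not necessarily the authors' exact expressions: ${\sigma}_{0}$ denotes the monochromatic absorption cross-section, ${N}_{dry}$ the dry-air number density, ${m}_{dry}$ and ${m}_{{H}_{2}O}$ are assumed names for the masses of one dry air molecule and one H_{2}O molecule, and ${q}_{{H}_{2}O}$ is taken relative to dry air. With $dp=g{N}_{dry}({m}_{dry}+{q}_{{H}_{2}O}{m}_{{H}_{2}O})dr$ and ${N}_{gas}={q}_{gas}{N}_{dry}$, one obtains

$${\tau}_{0}({\nu}_{F},p(r))={\int}_{0}^{r}{\sigma}_{0}{N}_{gas}\,dr\text{'}={\int}_{0}^{p(r)}{q}_{gas}(p){w}_{0}({\nu}_{F},p)\,dp,\qquad {w}_{0}({\nu}_{F},p)=\frac{{\sigma}_{0}({\nu}_{F},p)}{g\left[{m}_{dry}+{q}_{{H}_{2}O}(p){m}_{{H}_{2}O}\right]}.$$

In this form the weighting function depends only on ancillary data and spectroscopy, which is why the OD is linear in the dry mixing ratio.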

We now divide the atmospheric column into ${n}_{q}$ layers and approximate ${q}_{gas}(p)$ by a state vector $q={[{q}_{1},{q}_{2},\mathrm{...},{q}_{{n}_{q}}]}^{T}$. ${q}_{j}$ is a channel-independent mixing ratio used to approximate the layer-averaged mixing ratio ${q}_{j}({\nu}_{c})\equiv {\displaystyle {\int}_{{p}_{j}}^{{p}_{j-1}}{q}_{gas}(p)w({\nu}_{c},p)dp}/{\displaystyle {\int}_{{p}_{j}}^{{p}_{j-1}}w({\nu}_{c},p)dp}$ for layer *j* (*j* = 1, 2, …, ${n}_{q}$). Here layer 1 is at the bottom of the atmosphere, ${p}_{j}$ is the pressure at the top boundary of layer *j*, and ${p}_{0}=p(r)$. This approximation is valid when ${q}_{gas}(p)$ or $w({\nu}_{c},p)$ (for all channels) is sufficiently uniform within layer *j*. $\tau ({\nu}_{c},2r)$ can then be simplified to

$$\tau ({\nu}_{c},2r)\cong {\displaystyle \sum _{j=1}^{{n}_{q}}{q}_{j}{\int}_{{p}_{j}}^{{p}_{j-1}}w({\nu}_{c},p)dp}.\qquad (2)$$

#### 2.2 Forward model for uncombined wavelength channels

Throughout this paper, we assume *m* symmetric pairs of frequency channels and use superscript (or subscript) *i* to index such channels at mean laser line-center frequencies ${\nu}_{i}\equiv \overline{{\nu}_{c}^{i}}$ (*i* = 1, 2, …, 2*m*). $\Delta {\nu}_{i}\equiv {\nu}_{i}-{\nu}_{0}$ is the frequency offset about the absorption line center ${\nu}_{0}$. As illustrated in Fig. 1, each pair $(i,\text{\hspace{0.17em}}2m+1-i)$ is placed symmetrically about ${\nu}_{0}$ (i.e., $\Delta {\nu}_{i}=-\Delta {\nu}_{2m+1-i}$). We will establish the forward model for these uncombined channels in this subsection, and then combine each pair of channels $(i,\text{\hspace{0.17em}}2m+1-i)$ into a symmetrically-combined channel *i* in the next subsection. The forward model for symmetrically-combined channels can be easily derived from the fundamental forward model for uncombined channels. The forward model for uncombined channels is adapted from our previous DAOD modeling work [16], and thus will be discussed only briefly herein. As will be seen in the next section, the retrieval requires knowledge of both the variance and the cross-channel covariance of the OD measurements. The former has been quantified in [16] but the latter has not been well addressed previously. In this subsection, we further quantify the cross-channel covariance of the OD measurements.
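The channel indexing convention can be illustrated with a short sketch. The helper names `symmetric_offsets` and `pair_partner` are hypothetical; the four negative-side offsets are the ones quoted later in section 2.3 for the CO_{2} example, and the positive-side offsets are assumed to be their exact mirror images.

```python
def symmetric_offsets(negative_side_offsets_ghz):
    """Build offsets for channels i = 1..2m from the m negative-side offsets.

    The channels are ordered so that channel i and channel 2m+1-i are mirror
    images about the line center.
    """
    neg = sorted(negative_side_offsets_ghz)      # most negative first
    return neg + [-d for d in reversed(neg)]


def pair_partner(i, m):
    """Return the 1-based index of the channel paired with channel i."""
    return 2 * m + 1 - i


offsets = symmetric_offsets([-15.6, -1.7, -1.08, -0.5])  # GHz
m = len(offsets) // 2
for i in range(1, m + 1):
    j = pair_partner(i, m)
    print(f"pair ({i}, {j}): {offsets[i - 1]:+.2f} GHz, {offsets[j - 1]:+.2f} GHz")
```

Each printed pair has offsets of equal magnitude and opposite sign, which is the property the pair combining in section 2.3 relies on.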

Including other attenuation factors [22], the detected laser pulse energy ${W}_{s}^{i}$ backscattered from the surface is given by

… *ρ* is the surface reflectance (in sr^{−1}), and ${T}_{atm}$ is the one-way transmittance of the atmosphere excluding the target species. To cover both linear and photon counting receivers, the analog detector signal and noise are gain normalized and expressed in detected photon count *K*. For any type of radiation (including thermal light), the ensemble average and the variance of *K* and the optical energy *W* entering the detector area are related by [23, 24]

Here $\alpha \equiv \eta /(h\overline{{\nu}_{c}})$ is proportional to the detector quantum efficiency *η*, and *h* is the Planck constant. Note that only the shot noise contribution is multiplied by the detector’s excess noise factor ${F}_{e}.$
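As a point of reference, the standard semiclassical photodetection relations (Mandel's formula), with the excess noise factor applied to the shot noise term only as described above, take the following form. This is a sketch of the assumed relations and may differ in detail from the authors' equations:

$$\overline{K}=\alpha \overline{W},\qquad {\sigma}^{2}(K)={F}_{e}\alpha \overline{W}+{\alpha}^{2}{\sigma}^{2}(W).$$

The first variance term is the shot noise and the second is the excess noise carried by fluctuations of the optical energy itself.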

Let ${r}_{G}^{i}$ denote a certain mean of ${r}_{s}^{i}(k)$ averaged across multiple laser pulses (*k* = 1, 2, …, ${n}_{p}$) in channel *i*. $\tau \left({\nu}_{i},2{r}_{G}^{i}\right)$ can be estimated from the following sum of normalized photon counts (SNK) ${S}_{NK}^{i}$ accumulated across the ${n}_{p}$ pulses

… *k*, $E{\text{'}}_{s}^{i}(k)$ is the transmitted laser pulse energy, and ${A}_{z}^{i}(k)$ accounts for the OD difference $\tau ({\nu}_{i},2{r}_{s}^{i}(k))-\tau ({\nu}_{i},2{r}_{G}^{i})$. Errors due to surface height variation can be reduced to negligible levels by incorporating pulse-by-pulse ranging knowledge ${r}_{s}^{i}(k)$ through ${A}_{z}^{i}(k)$. The exact relationship between ${S}_{NK}^{i}$ and $\tau \left({\nu}_{i},2{r}_{G}^{i}\right)$ is found to be

Our forward model is derived from Eq. (7) as follows. $-\mathrm{ln}\left(\overline{{S}_{NK}^{i}}\right)$ is estimated by $-\mathrm{ln}\left({S}_{NK}^{i}\right)+{C}_{i}^{OD}$ where a bias correction factor ${C}_{i}^{OD}$ (given by Eq. (30) in Appendix B) is added to improve the estimation accuracy. ${C}_{i}^{OD}$ becomes negligible when the integrated photon count is sufficiently high. $\tau \left({\nu}_{i},2{r}_{G}^{i}\right)$ is related to the mixing ratios through Eq. (2). The offset $-\mathrm{ln}({S}_{A}^{i})$ can be modeled by a polynomial ${c}_{0}+{c}_{1}\Delta {\nu}_{i}+{c}_{2}{\left(\Delta {\nu}_{i}\right)}^{2}+\mathrm{...}$, where ${c}_{0},\text{\hspace{0.17em}}{c}_{1},\text{\hspace{0.17em}}{c}_{2},$ and so on, are unknown constants. This offset has little dependence on the frequency channels and thus can be closely approximated by ${c}_{0}.$ The odd terms (${c}_{1}\Delta {\nu}_{i},\text{\hspace{0.17em}}{c}_{3}{\left(\Delta {\nu}_{i}\right)}^{3}$, and so on) are canceled out by averaging each pair of symmetric channels, and are thus excluded from our model. For ease of analysis, we only keep the quadratic term ${c}_{2}{\left(\Delta {\nu}_{i}\right)}^{2}$ to model any uncorrected spectral response of the lidar (due to, e.g., etalon fringes). We then arrive at the following forward model for channel *i*

We now turn to derive the covariance and variance of ${y}_{i}^{OD}$ as needed for the retrieval. The covariance $\mathrm{Cov}({y}_{i}^{OD},{y}_{j}^{OD})$ $(i\ne j)$ is nonzero due to a cross-channel correlation of the effective laser line-center frequency noise defined by

*t*. Considering a master-slave laser frequency locking scheme [8, 9], ${\delta}_{\nu 1}^{i}$ of the slave laser can be approximately divided into two components ${\delta}_{\nu 1}^{i}(t)={\delta}_{\nu slow}(t)+{\delta}_{\nu fast}^{i}(t)$ [16]. The fast frequency noise component ${\delta}_{\nu fast}^{i}(t)$ can be treated as uncorrelated among different pulses and uncorrelated to the slow frequency noise component ${\delta}_{\nu slow}(t).$ ${\delta}_{\nu slow}(t)$ arises solely from the master laser frequency drift, and thus is the same for all channels. It has been demonstrated that the slow drift of the frequency difference between the slave and master lasers can be eliminated [8, 9] and thus is neglected here. The correlation of ${\delta}_{\nu n}^{i}$ arises solely from ${\delta}_{\nu slow}(t)$ that essentially remains unchanged within each wavelength sweep, but varies slowly over longer time scales. From Eq. (9), ${\delta}_{\nu n}^{i}$ can be split into ${\delta}_{\nu n}^{i}={\delta}_{\nu nslow}^{i}+{\delta}_{\nu nfast}^{i}$, where ${\delta}_{\nu nslow}^{i}$ and ${\delta}_{\nu nfast}^{i}$ arise from ${\delta}_{\nu slow}$ and ${\delta}_{\nu fast}^{i}$, respectively. Using $\overline{K{\text{'}}_{s}^{i}(k)K{\text{'}}_{s}^{j}(k\text{'})}={\alpha}^{2}\overline{W{\text{'}}_{s}^{i}(k)W{\text{'}}_{s}^{j}(k\text{'})}\text{\hspace{0.17em}}(i\ne j)$ [23] and $\mathrm{ln}(x)-\overline{\mathrm{ln}(x)}\cong (x-\overline{x})/\overline{x}$, $\mathrm{Cov}({y}_{i}^{OD},{y}_{j}^{OD})$ is found to be

Assuming that the transmitted laser pulses in channel *i* have nearly the same pulse energy, the variance ${\sigma}^{2}({y}_{i}^{OD})$ of the measurement ${y}_{i}^{OD}$ is found to be

… ${n}_{p}$ pulses. The effective absorption cross-section ${\sigma}_{eff}({\nu}_{i},{r}_{G}^{i})$ of the target species, defined by Eq. (26) in Appendix A, assumes its monochromatic value ${\sigma}_{0}({\nu}_{i},p({r}_{G}^{i}))$ for the narrow line-width case. $<{W}_{ii}^{cov}(f)>$ is the averaged height of a window function ${W}_{ii}^{cov}(f)$ defined by Eq. (31) in Appendix C. $<{W}_{ii}^{cov}(f)>$ is only slightly higher than 1/${n}_{p}$ and is taken to be 1/${n}_{p}$ for our numerical evaluations. Error contributions from ${\delta}_{\nu slow}$, ${\delta}_{\nu nfast}^{i}$ and these other noise sources will be further discussed in the next subsection.

$\overline{{\epsilon}_{i}^{OD}}$ (evaluated for deterministic ${S}_{A}^{i}$) can be regarded as the model bias, and is found to be

… *m* channels, the second term ${b}_{\tau \nu}^{i}$ arises from the laser line-center frequency noise, the third term ${b}_{\tau C}^{i}$ is the residual bias for estimation of $-\mathrm{ln}\left(\overline{{S}_{NK}^{i}}\right)$ with $-\mathrm{ln}\left({S}_{NK}^{i}\right)+{C}_{i}^{OD}$, and the last term ${b}_{\tau r}^{i}$ arises from the altimetry bias and variance. All these bias terms can be reduced to negligible levels in practice.

#### 2.3 Forward model for symmetrically combined channels

Since the pressure-shift of the line-center of ${\sigma}_{0}({\nu}_{F},p)$ varies with altitude, the target atmospheric absorption line profile becomes slightly asymmetric. In our numerical examples, ${\nu}_{0}$ is simply placed at the maximum absorption point. At each pair of symmetric channels about ${\nu}_{0}$, the ODs are roughly the same, but the OD slopes are nearly opposite. This allows us to drastically reduce errors due to the laser frequency drift. Figure 2 illustrates this feature (*left*) and the weighting functions at the 4 pairs of channels (*right*) for the atmospheric CO_{2} line at 1572.335 nm. The CO_{2} absorption spectrum and weighting functions are computed for US standard atmospheric conditions with a constant dry CO_{2} mixing ratio of 400 ppm. The averaged OD ${\tau}_{av}({\nu}_{i})\equiv [\tau ({\nu}_{i},2{r}_{G}^{i})+\tau ({\nu}_{2m+1-i},2{r}_{G}^{2m+1-i})]/2$ is treated as the OD at the symmetrically-combined channel *i*. For the narrow line-width case, ${\tau}_{av}({\nu}_{i})$ becomes ${\tau}_{0av}({\nu}_{i})\equiv [{\tau}_{0}({\nu}_{i},p({r}_{G}^{i}))+{\tau}_{0}({\nu}_{2m+1-i},p({r}_{G}^{2m+1-i}))]$. Due to the cancellation of the two nearly opposite OD slopes for each symmetric channel pair, the slope of the averaged OD ${\tau}_{0av}({\nu}_{i})$ shown in Fig. 2 is reduced from that of the original OD by a factor of several tens for any practical online channels. Channels 1 through 4 are placed at $\Delta {\nu}_{1}=-15.6\text{\hspace{0.17em}}\text{GHz,}$ $\Delta {\nu}_{2}=-1.7\text{\hspace{0.17em}}\text{GHz,}$ $\Delta {\nu}_{3}=-1.08\text{\hspace{0.17em}}\text{GHz,}$ and $\Delta {\nu}_{4}=-0.5\text{\hspace{0.17em}}\text{GHz,}$ with their symmetric partners at the opposite offsets. Since the two weighting functions for each pair of channels are roughly the same, such channel combining essentially does not reduce the vertical resolution of the retrieval.

The new measurement vector $y={[{y}_{1},{y}_{2},\mathrm{...},{y}_{m}]}^{T}$ is defined as ${y}_{i}\equiv ({y}_{i}^{OD}+{y}_{2m+1-i}^{OD})/2$, and the extended state vector is $x={[{q}_{1},{q}_{2},\mathrm{...},{q}_{{n}_{q}},{c}_{0},{c}_{2}]}^{T}$ that combines $q$ and $c={[{c}_{0},{c}_{2}]}^{T}.$ In practice, the photon counts for each of the 2*m* channels are measured separately before combining the pairs. From Eq. (8), we arrive at the following forward model

… **b** is a vector of model parameters that are not to be retrieved (including temperature and water vapor profiles, surface pressure, and spectroscopic data), ${\epsilon}_{i}\equiv ({\epsilon}_{i}^{OD}+{\epsilon}_{2m+1-i}^{OD})/2$ is the measurement error and the model bias is $\overline{{\epsilon}_{i}}=\left(\overline{{\epsilon}_{i}^{OD}}+\overline{{\epsilon}_{2m+1-i}^{OD}}\right)/2$. The variations of $\overline{{\epsilon}_{i}^{OD}}$ across the 2*m* channels constitute a spectral distortion that mimics the target absorption and often causes major retrieval errors. Such spectral distortion could arise from surface reflectance variations and etalon fringes in the lidar’s receiving path. The distortion can be decomposed into two types: anti-symmetric and symmetric (about ${\nu}_{0}$). By combining symmetric channels, the anti-symmetric spectral distortion is canceled out. It is beneficial to use more than two pairs of symmetric channels so that correction terms (such as ${c}_{2}{\left(\Delta {\nu}_{i}\right)}^{2}$) can be included to reduce symmetric spectral distortion. In contrast, using only two (on/offline) wavelength channels cannot reject either type of spectral distortion.
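The cancellation of anti-symmetric distortion can be checked numerically. The sketch below applies a polynomial distortion with hypothetical coefficients `c0` to `c3` to the eight channel offsets of the CO_{2} example and averages each symmetric pair; the function name `pair_average` is an illustrative choice.

```python
def pair_average(values):
    """Average each channel with its symmetric partner.

    For 2m values ordered i = 1..2m, channel i (0-based index k) is paired
    with channel 2m+1-i (0-based index 2m-1-k).
    """
    m = len(values) // 2
    return [(values[k] + values[-1 - k]) / 2 for k in range(m)]


offsets = [-15.6, -1.7, -1.08, -0.5, 0.5, 1.08, 1.7, 15.6]   # GHz
c0, c1, c2, c3 = 0.02, 1e-3, 1e-5, 1e-6                       # hypothetical coefficients

# Distortion with both even (c0, c2) and odd (c1, c3) terms in the offset.
distortion = [c0 + c1 * d + c2 * d**2 + c3 * d**3 for d in offsets]

combined = pair_average(distortion)
even_part = [c0 + c2 * d**2 for d in offsets[:4]]

for got, want in zip(combined, even_part):
    print(f"combined = {got:.6e}, even part only = {want:.6e}")
```

After combining, only the even terms survive, which is why the model retains ${c}_{0}$ and ${c}_{2}{\left(\Delta {\nu}_{i}\right)}^{2}$ but no odd terms.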

The covariance matrix ${S}_{y}$ of $y$ is found to be essentially diagonalized due to the cancellation of the two nearly opposite OD slopes for each symmetric channel pair

… CO_{2} measurement illustrated in Fig. 2, the upper bound for $\sigma ({\delta}_{\nu slow})$ can be relaxed from 0.23 MHz [16] to 6 MHz, which becomes much easier to satisfy [8, 9].

The error reduction can be simply explained as follows. Within each wavelength sweep, ${\nu}_{c}^{i}$ of each of the 2*m* pulses is shifted from its mean ${\nu}_{i}$ by essentially the same amount ${\delta}_{\nu slow}$ (and further shifted by an uncorrelated amount ${\delta}_{\nu fast}^{i}$). The common shift ${\delta}_{\nu slow}$ causes nearly opposite OD changes ${\delta}_{\nu slow}{\left(d{\tau}_{i}/d{\nu}_{c}\right)}_{{\nu}_{i}}$ and ${\delta}_{\nu slow}{\left(d{\tau}_{2m+1-i}/d{\nu}_{c}\right)}_{{\nu}_{2m+1-i}}$ in the two symmetric channels. By the pair combining, the two opposite OD changes nearly cancel out, resulting in a much reduced OD error ${\delta}_{\nu slow}{\left(d{\tau}_{av}/d{\nu}_{c}\right)}_{{\nu}_{i}}$ from ${\delta}_{\nu slow}$ for the symmetrically combined channel. Consequently, it can be easily shown that ${\delta}_{\nu slow}$ contributes to the covariance and variance of ${y}_{i}$ as expressed by Eqs. (14)-(15). Similarly, a common laser frequency bias ${\delta}_{{\nu}_{c}}$ (including the Doppler shift arising from the high-speed cross-wind and the radial component of the spacecraft velocity [12]) will result in an OD bias ~${\delta}_{{\nu}_{c}}{\left(d{\tau}_{i}/d{\nu}_{c}\right)}_{{\nu}_{i}}$for each uncombined channel *i*. By the pair combining, the bias towards ${y}_{i}$ is reduced substantially to ${\delta}_{{\nu}_{c}}{\left(d{\tau}_{av}/d{\nu}_{c}\right)}_{{\nu}_{i}}$. Compared to the single-channel measurement of ${y}_{i}^{OD}$ with $2{n}_{p}$ pulses, the pair combining essentially retains the same noise contributions to ${\sigma}^{2}({y}_{i})$ from sources other than ${\delta}_{\nu slow}$ (such as the signal shot noise and ${\delta}_{\nu nfast}^{i}$) when each of the two symmetric channels is measured with ${n}_{p}$pulses. These other noise contributions to ${\sigma}^{2}({y}_{i})$ are essentially inversely proportional to $2{n}_{p}$ and thus can be reduced by pulse averaging.
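The mechanism can be sketched with an idealized line shape. The example below assumes a perfectly symmetric Lorentzian OD profile with hypothetical peak OD and half-width, so the first-order errors cancel exactly and only a second-order residual remains; for the real, slightly asymmetric line the text quotes a more modest reduction (a factor of several tens in the OD slope).

```python
def od(offset_ghz, peak_od=1.0, hwhm_ghz=1.5):
    """Idealized Lorentzian optical depth versus offset from line center.

    peak_od and hwhm_ghz are hypothetical values chosen for illustration.
    """
    return peak_od / (1.0 + (offset_ghz / hwhm_ghz) ** 2)


def od_errors(offset_ghz, shift_ghz):
    """OD errors caused by a common frequency shift at a symmetric pair."""
    low = od(-offset_ghz + shift_ghz) - od(-offset_ghz)    # channel i
    high = od(offset_ghz + shift_ghz) - od(offset_ghz)     # channel 2m+1-i
    combined = (low + high) / 2                            # pair-averaged channel
    return low, high, combined


# 3 MHz common drift applied to the pair at +/- 1.08 GHz.
low, high, combined = od_errors(1.08, 0.003)
print(f"channel i error:        {low:+.3e}")
print(f"partner channel error:  {high:+.3e}")
print(f"combined channel error: {combined:+.3e}")
```

The two single-channel errors have opposite signs and nearly equal magnitude, so the combined error is far smaller than either one.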

## 3. Retrieval and error analysis methods

#### 3.1 Retrieval method

In general, the forward model $F(x,b)$ is nonlinear with respect to **x**. Although $y$ is evaluated from $K{\text{'}}_{s}^{i}(k)$ and $E{\text{'}}_{s}^{i}(k)$ measured at **x**, an approximate value **x**_{i} of **x** is needed to evaluate the factors ${A}_{z}^{i}(k)$ in $y$. The evaluated $y$, denoted as $y({x}_{i})$, is a nonlinear function of **x**_{i}. Nevertheless, the forward model $F(x)$ and $y$ can be linearized about the *a priori* state ${x}_{a}$ within the natural variability of **x** about ${x}_{a}$

… **R**: $x\text{'}=R(y,b\text{'},{x}_{a})$, where $x\text{'}$ is an estimate of $x$ and $b\text{'}$ an estimate of $b$. We consider the maximum *a posteriori* (MAP) retrieval approach [17] where the measurement error $\epsilon $ and the error of ${x}_{a}$ are assumed to be Gaussian with known error covariance matrices **S**_{y} and **S**_{a}, respectively. Using the Newton-Gauss method, the following iteration ${x}_{i+1}$ is expected to converge to $x\text{'}$ [17]

For the narrow line-width case, ${K}_{F}$ becomes independent of **x** and the remaining dependence of $y$ on *q*_{1} can be minimized by choosing ${r}_{G}^{i}$ so that ${(d{y}_{i}/d{q}_{1})}_{x={x}_{0}}\cong 0$. $K(x)$ is then simplified to

… **x**_{i} and the forward model $y={K}_{F}x+\epsilon $ becomes linear, allowing the following linear retrieval $x{\text{'}}_{L}$ of $x$ without iterations [17]

${r}_{s}^{i}(k)$ can be approximated by a constant ${r}_{G}^{i}$ if the surface is sufficiently flat within the averaging time ${n}_{p}{t}_{p}$, or if the laser beam and the receiver are pointed to a fixed surface spot during the ${n}_{p}$ pulses to be averaged. When there is no *a priori* information available, ${S}_{a}^{-1}=0$. By setting ${S}_{a}^{-1}=0$, the above MAP solutions in Eqs. (17), (18), and (20) become the same as the corresponding maximum likelihood (ML) solutions. The linear ML solution in Eq. (20) with ${S}_{a}^{-1}=0$ is equivalent to the weighted linear least-squares solution with weighting matrix ${S}_{y}^{-1}$. This linear ML solution can serve as the starting point ${x}_{0}$ for the iterations of Eq. (17).
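The equivalence with weighted least squares can be made concrete for the simplest case. The sketch below assumes a single-layer retrieval with a constant offset, $y_i = k_i q_1 + c_0 + \epsilon_i$, and a diagonal ${S}_{y}$; the function names and all numerical values are hypothetical and serve only to illustrate the closed-form solution.

```python
def weighted_mean(values, variances):
    """Mean of values weighted by 1/variance, as used for a diagonal S_y."""
    weights = [1.0 / v for v in variances]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def ml_single_layer(k, y, variances):
    """Weighted least-squares fit of y_i = k_i * q + c0.

    With a diagonal measurement covariance and no a priori constraint, the
    slope is the weighted covariance of (k, y) over the weighted variance
    of k, and the intercept follows from the weighted means.
    """
    k_mean = weighted_mean(k, variances)
    y_mean = weighted_mean(y, variances)
    cov_ky = weighted_mean(
        [(a - k_mean) * (b - y_mean) for a, b in zip(k, y)], variances
    )
    var_k = weighted_mean([(a - k_mean) ** 2 for a in k], variances)
    q = cov_ky / var_k
    c0 = y_mean - q * k_mean
    return q, c0


# Hypothetical Jacobian elements (OD per unit mixing ratio) for 4 combined channels.
k = [50.0, 1200.0, 2000.0, 2900.0]
q_true, c0_true = 400e-6, 0.05
y = [ki * q_true + c0_true for ki in k]          # noise-free synthetic measurements
variances = [1e-7, 2e-7, 3e-7, 5e-7]             # hypothetical sigma^2(y_i)

q_hat, c0_hat = ml_single_layer(k, y, variances)
print(f"retrieved q = {q_hat * 1e6:.3f} ppm, c0 = {c0_hat:.4f}")
```

For more layers or a non-diagonal ${S}_{y}$ the same solution would be obtained from the matrix normal equations rather than this scalar shortcut.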

#### 3.2 Retrieval error analysis

Discretizing ${q}_{gas}(p)$into the state vector $q$ is based on the approximation ${q}_{j}({\nu}_{i})\cong {q}_{j}$. When ${q}_{gas}(p)$ and $w({\nu}_{c}^{i},p)$ are not uniform within layer *j,* the bias ${q}_{j}({\nu}_{i})-{q}_{j}$ for estimating ${q}_{j}({\nu}_{i})$ with ${q}_{j}$ may become significant. Neglecting ${q}_{j}({\nu}_{i})-{q}_{j}$, the retrieval error in the extended state vector $x\text{'}$ can be expressed by [17]

It is adequate to represent ${q}_{gas}(p)$ with a column-averaged retrieval $q{\text{'}}_{1}$ (${n}_{q}$ = 1) when the mixing ratio ${q}_{gas}(p)$ of the target species is sufficiently uniform in the column. When this is not the case, it is desirable to retrieve ${q}_{gas}(p)$ in two or even more layers. For example, when ${q}_{gas}(p)$ in the planetary boundary layer (PBL) is significantly different from that in the column above the PBL (see, for example, [25]), double-layer retrievals are desirable.

We now turn to link the retrieval RRE and RSE, arising from the measurement noise and model bias, respectively, to characteristic parameters to gain deeper insight into system optimization and limitation. For ease of analysis, the linear ML retrieval solution is used without including the quadratic correction term ${c}_{2}{\left(\Delta {\nu}_{i}\right)}^{2}$ and the small covariance elements ${[{S}_{y}]}_{ij}\text{\hspace{0.17em}}(i\ne j)$ are neglected. In the following derivations, $<{x}_{i}>\text{\hspace{0.17em}}\equiv {\displaystyle {\sum}_{i=1}^{m}[{x}_{i}/{\sigma}^{2}({y}_{i})]}/{\displaystyle {\sum}_{i=1}^{m}[1/{\sigma}^{2}({y}_{i})}]$ represents an average of elements ${x}_{i}$ (*i* = 1, 2, …, *m*) across *m* channels, weighted by $1/{\sigma}^{2}({y}_{i})$. ${\mathrm{Var}}_{i}({x}_{i})\equiv \text{\hspace{0.17em}}<{({x}_{i}-<{x}_{i}>)}^{2}>$ represents a variance of ${x}_{i}$, and ${\mathrm{Cov}}_{i}({x}_{i},{y}_{i})\equiv \text{\hspace{0.17em}}<({x}_{i}-<{x}_{i}>)({y}_{i}-<{y}_{i}>)>$ a covariance between elements ${x}_{i}$ and ${y}_{i}$ across *m* channels. $\tau {\text{'}}_{ij}\equiv {[K{\text{'}}_{q}]}_{i,j}q{\text{'}}_{j}$ is the estimated two-way OD of the target species within layer *j* at channel *i*.

From Eq. (18), the RRE of $q{\text{'}}_{j}$ arising from the experimental variance ${\sigma}^{2}({y}_{i})$ is found to be

… *j*^{th} row and *j*^{th} column). The element ${[R]}_{j,j\text{'}}$ of $R$ is the correlation coefficient between ${[K{\text{'}}_{q}]}_{i,j}$ and ${[K{\text{'}}_{q}]}_{i,j\text{'}}$, and varies between −1 and 1. Here we introduce an effective two-way DAOD within layer *j*, $\Delta {\tau}_{j}\equiv 2\sqrt{{\mathrm{Var}}_{i}(\tau {\text{'}}_{ij})}$, to quantify the spread of the ODs $\tau {\text{'}}_{ij}$ across *m* channels. ${\sigma}_{\Delta \tau}{F}_{j}$ can be regarded as the effective standard deviation of $\Delta {\tau}_{j}$, where ${\sigma}_{\Delta \tau}^{2}\equiv 4{\left[{\displaystyle {\sum}_{i=1}^{m}1/{\sigma}^{2}({y}_{i})}\right]}^{-1}$ and ${F}_{j}\equiv \sqrt{{M}_{jj}/\mathrm{det}(R)}$. Equation (22) indicates that the RRE of $q{\text{'}}_{j}$ is equal to the RRE of $\Delta {\tau}_{j}$. Since ${F}_{1}=1$ for single layer retrieval, ${\sigma}_{\Delta \tau}^{2}$ can be regarded as an effective measurement variance of the effective two-way DAOD for the column. When retrieving from a combined online and a combined offline channel that have the same ${\sigma}^{2}({y}_{i})$, the single-layer $\Delta {\tau}_{1}$ becomes the conventional DAOD ${\tau}_{21}-{\tau}_{11}$, and ${\sigma}_{\Delta \tau}^{2}=2{\sigma}^{2}({y}_{i})$ becomes the variance of the DAOD ${\tau}_{21}-{\tau}_{11}$. Without changing ${\sigma}^{2}({y}_{i})$ of the existing channels, adding more channels in the retrieval (and the computation of ${\sigma}_{\Delta \tau}^{2}$) reduces ${\sigma}_{\Delta \tau}^{2}$. Compared with the single-layer retrieval, the RRE of each $q{\text{'}}_{j}$ retrieved for multiple layers is increased by two factors: a factor ${F}_{j}>1$, and a $\Delta {\tau}_{j}$ smaller than the single layer $\Delta {\tau}_{1}$. For double-layer retrievals, for example, ${F}_{j}=1/\sqrt{1-{[R]}_{1,2}^{2}}$.
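The characteristic parameters defined above are straightforward to evaluate. The following sketch computes $\Delta {\tau}_{j}$, ${\sigma}_{\Delta \tau}$ and the double-layer factor ${F}_{j}$ from their definitions, and forms the RRE as ${\sigma}_{\Delta \tau}{F}_{j}/\Delta {\tau}_{j}$; the function names and the two-channel input values are hypothetical.

```python
import math


def effective_daod(tau, variances):
    """Delta_tau = 2 * sqrt(Var_i(tau_i)), with 1/sigma^2(y_i) weighting."""
    weights = [1.0 / v for v in variances]
    wsum = sum(weights)
    mean = sum(w * t for w, t in zip(weights, tau)) / wsum
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, tau)) / wsum
    return 2.0 * math.sqrt(var)


def sigma_daod(variances):
    """sigma_Delta_tau = sqrt(4 / sum_i(1 / sigma^2(y_i)))."""
    return math.sqrt(4.0 / sum(1.0 / v for v in variances))


def layer_factor_two_layers(r12):
    """F_j = 1 / sqrt(1 - R_12^2) for a double-layer retrieval."""
    return 1.0 / math.sqrt(1.0 - r12 ** 2)


# Two combined channels (offline, online) with equal measurement variance.
tau = [0.05, 1.05]              # hypothetical two-way ODs
variances = [1e-6, 1e-6]        # hypothetical sigma^2(y_i)

delta_tau = effective_daod(tau, variances)     # reduces to tau_online - tau_offline
sigma = sigma_daod(variances)                  # reduces to sqrt(2) * sigma(y_i)
rre_single = sigma * 1.0 / delta_tau           # F_1 = 1 for a single layer
print(f"Delta_tau = {delta_tau:.3f}, sigma = {sigma:.3e}, RRE = {rre_single:.3e}")
print(f"F_j for R_12 = 0.9: {layer_factor_two_layers(0.9):.3f}")
```

With two equal-variance channels the general definitions collapse to the conventional DAOD and its variance, matching the special case described in the text.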

Following the argument leading to Cramer's rule, the linear retrieval $q{\text{'}}_{j}$ can be found from Eq. (20) as

… *j*^{th} column of $R$ by a column vector ${r}_{y}$ defined as ${[{r}_{y}]}_{j}\equiv {\mathrm{Cov}}_{i}({[K{\text{'}}_{q}]}_{i,j},{y}_{i})/\sqrt{{\mathrm{Var}}_{i}({[K{\text{'}}_{q}]}_{i,j}){\mathrm{Var}}_{i}({y}_{i})}$. The retrieval error arising from forward model parameter errors $(b-b\text{'})$ or the model bias $\overline{\epsilon}$ can be obtained by replacing ${y}_{i}$ in Eq. (23) with ${[{K}_{b}(b-b\text{'})]}_{i}$ or $\overline{{\epsilon}_{i}}$, respectively. The RSE of $q{\text{'}}_{j}$ from $\overline{\epsilon}$ is found to be

… *j*^{th} column of $R$ by a column vector ${r}_{\epsilon}$ defined as ${[{r}_{\epsilon}]}_{j}\equiv {\mathrm{Cov}}_{i}({[K{\text{'}}_{q}]}_{i,j},\overline{{\epsilon}_{i}})/\sqrt{{\mathrm{Var}}_{i}({[K{\text{'}}_{q}]}_{i,j}){\mathrm{Var}}_{i}(\overline{{\epsilon}_{i}})}$. Both ${[{r}_{y}]}_{j}$ and ${[{r}_{\epsilon}]}_{j}$ are correlation coefficients varying between −1 and 1. The retrieval RSE can also be expressed in terms of ${\delta}_{\Delta \tau j}\equiv 2{\mathrm{Cov}}_{i}(\tau {\text{'}}_{ij},\overline{{\epsilon}_{i}})/\sqrt{{\mathrm{Var}}_{i}(\tau {\text{'}}_{ij})}$. The single-layer retrieval RSE is ${\delta}_{q\text{'}1}/q{\text{'}}_{1}={\delta}_{\Delta \tau 1}/\Delta {\tau}_{1}$. The double-layer retrieval RSEs are ${\delta}_{q\text{'}1}/q{\text{'}}_{1}=({\delta}_{\Delta \tau 1}-{\delta}_{\Delta \tau 2}{[R]}_{1,2})/[\Delta {\tau}_{1}(1-{[R]}_{1,2}^{2})]$, and ${\delta}_{q\text{'}2}/q{\text{'}}_{2}=({\delta}_{\Delta \tau 2}-{\delta}_{\Delta \tau 1}{[R]}_{1,2})/[\Delta {\tau}_{2}(1-{[R]}_{1,2}^{2})].$
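The double-layer RSE expressions can be evaluated directly. The sketch below implements them as written, with the factor $(1-{[R]}_{1,2}^{2})$ in the denominator; the function name and the input values are hypothetical and chosen only to show how a strong correlation between the two layers amplifies the bias.

```python
def double_layer_rse(d1, d2, dtau1, dtau2, r12):
    """Relative systematic errors of q'_1 and q'_2 for a double-layer retrieval.

    d1, d2       : delta_{Delta tau 1}, delta_{Delta tau 2} (bias projections)
    dtau1, dtau2 : effective two-way DAODs of layers 1 and 2
    r12          : correlation coefficient [R]_{1,2} between the two layers
    """
    denom = 1.0 - r12 ** 2
    rse1 = (d1 - d2 * r12) / (dtau1 * denom)
    rse2 = (d2 - d1 * r12) / (dtau2 * denom)
    return rse1, rse2


# Hypothetical inputs: bias projected mostly onto layer 1.
d1, d2 = 1e-4, 2e-5
dtau1, dtau2 = 0.4, 0.8

uncorrelated = double_layer_rse(d1, d2, dtau1, dtau2, 0.0)
correlated = double_layer_rse(d1, d2, dtau1, dtau2, 0.9)
print(f"R_12 = 0.0: RSE_1 = {uncorrelated[0]:+.3e}, RSE_2 = {uncorrelated[1]:+.3e}")
print(f"R_12 = 0.9: RSE_1 = {correlated[0]:+.3e}, RSE_2 = {correlated[1]:+.3e}")
```

For ${[R]}_{1,2}=0$ each layer reduces to the single-layer form ${\delta}_{\Delta \tau j}/\Delta {\tau}_{j}$.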

To reduce both the RRE and RSE, it is desirable to have a large $\Delta {\tau}_{j}$. The online and offline channels are equivalent in their contributions to $\Delta {\tau}_{j}$ and ${\sigma}_{\Delta \tau}$. $\Delta {\tau}_{j}$can be increased by shifting online channels towards the absorption peak while keeping the offline channels in the adjacent low-absorption window regions. However, this also increases ${\sigma}_{\Delta \tau}$. For the simple case of column-averaged retrieval from two channels, the RRE is typically minimized when the online OD (and hence $\Delta {\tau}_{1}$) is not maximized. The online points often need to be placed on the sides of the absorption line to uniformly sense concentrations in the lower troposphere [15], resulting in an increased RRE. Further error reduction considerations will be discussed along with the numerical example in the next section.

When the laser signal shot noise term in Eq. (15) is well above other noise terms (as is the case for the numerical example in section 4), ${\sigma}^{2}({y}_{i})\cong {F}_{e}/(\overline{{S}_{K\text{'}}^{i}}+\overline{{S}_{K\text{'}}^{2m+1-i}})$so that ${\sigma}_{\Delta \tau}^{2}\cong 4{F}_{e}/{\displaystyle {\sum}_{i=1}^{m}\left(\overline{{S}_{K\text{'}}^{i}}+\overline{{S}_{K\text{'}}^{2m+1-i}}\right)}$. In other words, the shot-noise limited retrieval RRE is inversely proportional to the square root of the accumulated photon count of the $2m\times {n}_{p}$ pulses from all *m* combined channels. In general, the retrieval RRE and RSE are reduced by using information collectively from all measurement channels. Averaging multiple retrievals will further reduce the retrieval RRE arising from the measurement noise by the square root of the number of retrievals being averaged. However, this averaging may not effectively reduce the retrieval RSE arising from some persistent bias $\overline{\epsilon}$.
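The shot-noise-limited scaling can be sketched as follows. The function names, the excess noise factor and the accumulated photon counts are hypothetical placeholders rather than the values of Table 1.

```python
import math


def shot_noise_sigma_daod(pair_counts, excess_noise_factor):
    """sigma_Delta_tau ~ sqrt(4 * F_e / total accumulated photon count).

    pair_counts holds, for each combined channel, the mean photon count
    accumulated over both channels of the symmetric pair.
    """
    return math.sqrt(4.0 * excess_noise_factor / sum(pair_counts))


def averaged_rre(rre_single, n_retrievals):
    """Random error after averaging n independent retrievals."""
    return rre_single / math.sqrt(n_retrievals)


f_e = 1.3                                   # hypothetical excess noise factor
pair_counts = [6.0e7, 4.0e7, 3.0e7, 2.0e7]  # hypothetical accumulated counts

sigma_1x = shot_noise_sigma_daod(pair_counts, f_e)
sigma_4x = shot_noise_sigma_daod([4 * c for c in pair_counts], f_e)
print(f"sigma_Delta_tau = {sigma_1x:.3e}; with 4x the photons = {sigma_4x:.3e}")
print(f"RRE after averaging 9 retrievals: {averaged_rre(3.0e-4, 9):.3e}")
```

Quadrupling the accumulated photon count halves ${\sigma}_{\Delta \tau}$, and averaging nine retrievals reduces the random error by a factor of three; neither operation reduces a persistent bias.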

## 4. Numerical estimation of retrieval errors

In this section, we numerically estimate the errors of atmospheric CO_{2} retrievals from the lidar-sounding spectra measured across the 1572.335 nm CO_{2} absorption line. A fast pulse rate (8 kHz) is chosen to lower the required pulse energy and reduce the spectral distortion arising from the surface reflectance variations. The calculations are based on parametric formulas derived in section 3 and realistic parameters listed in Table 1. The pulses in all channels are assumed to have the same transmitted pulse energy ${E}_{s}$. The overall optical efficiency of the lidar’s receiving path (including filters) is assumed to be 51%. The additional OD bias ${b}_{\tau r}^{i}$ and variance ${({\sigma}_{\tau r}^{i})}^{2}$ due to imperfect ranging are negligible for a ranging bias ${\delta}_{r}^{i}$ ≤ 0.66 m and a ranging precision ${\sigma}_{r}^{i}$ ≤ 20 m [16], and are neglected hereafter. The speckle noise can also be neglected due to the large telescope diameter of 1.5 m and large laser beam spot size of 50 m on the surface [3, 6]. The measurement noise $\sigma ({y}_{i})$ is computed from Eq. (15) and plotted as a function of ${\tau}_{0av}({\nu}_{i})$ in Fig. 3 (*left*). Partial contributions to $\sigma ({y}_{i})$ from the signal shot noise $~{F}_{e}/(\overline{{S}_{K\text{'}}^{i}}+\overline{{S}_{K\text{'}}^{2m+1-i}})$, frequency noise, solar background, receiver circuitry noise, and detector dark count are also shown in Fig. 3 (*left*).

Referring to Eq. (22), the effective single-layer DAOD $\Delta {\tau}_{1}$ is found to be 1.17, and its effective measurement error ${\sigma}_{\Delta \tau}$ is found to be 0.00039 when there are 3200 detected photons (on average) for each offline pulse, resulting in a retrieval RRE of 0.034% from the measurement noise. This photon count can be achieved with ${E}_{s}=4\,\text{mJ}$ for an average surface reflectance *ρ* = 0.17. The signal shot noise contribution is well above the noise contributions from the receiver circuitry and the solar background count, as desired. The detector specifications listed in Table 1 can be met with a state-of-the-art HgCdTe avalanche photodiode (APD) detector [26]. The background solar photon count rate is estimated for a zenith angle of 75°. Its noise contribution is larger than that from the receiver transimpedance amplifier (TIA). The noise from the detector dark count is negligible. The frequency noise contribution arises essentially from the 3 MHz slow frequency drift $\sigma \left({\delta}_{\nu slow}\right)$, and the contribution from the 2 MHz fast frequency noise $\sigma \left({\delta}_{\nu fast}\right)$ is averaged down to a negligible level within the 10-s averaging time. For $\sigma \left({\delta}_{\nu slow}\right)\le 3\,\text{MHz}$, the frequency noise contribution is less than that from the solar background radiation and it is safe to assume ${[{S}_{y}]}_{i,j}=0\ (i\ne j)$. The bias ${b}_{\tau \nu av}^{i}\equiv ({b}_{\tau \nu}^{i}+{b}_{\tau \nu}^{2m+1-i})/2$ towards ${\tau}_{0av}({\nu}_{i})$ due to the laser line-center frequency noise is shown in Fig. 3 (*right*). The partial retrieval RSE due to this bias is found to be as small as 10^{−5} and thus negligible.
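As a quick check of the single-layer figure, the RRE of the mixing ratio equals the RRE of the effective DAOD, so it follows from dividing the effective measurement error by the effective DAOD. Using the rounded values quoted in the text:

$$\frac{{\sigma}_{\Delta \tau}}{\Delta {\tau}_{1}}\approx \frac{0.00039}{1.17}\approx 3.3\times {10}^{-4}$$

This is about 0.033%, which agrees with the quoted 0.034% to within the rounding of the two inputs.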

Next, we estimate the single-layer retrieval RSE arising from the model bias $\overline{{\epsilon}_{i}}$. Referring to Eq. (12), $\overline{{\epsilon}_{i}}$ is taken to be $-[\mathrm{ln}({S}_{A}^{i})+\mathrm{ln}({S}_{A}^{2m+1-i})]/2-{c}_{0}$ and other terms of $\overline{{\epsilon}_{i}}$ are neglected. Since the retrieval bias from $-[\mathrm{ln}({S}_{A}^{i})+\mathrm{ln}({S}_{A}^{2m+1-i})]/2$ varies from one path to another, the corresponding RSE needs to be kept to a small fraction of the 0.1% RRE. For realistic error estimation, we use measured surface reflectance data in [27] to quantify the RSE.

As shown in Fig. 1, the laser pulses are assumed to repeatedly cycle through wavelength channels 1 to 8 and the laser beam spot at the surface is assumed to travel at the same speed as the spacecraft. The same set of surface reflectance data used in [16] for its strong variations is reused for this evaluation. The surface reflectance measurement was taken in southern Spain using a ~10-m laser spot size and a step size of ~6 m [27]. To convert these data to the reflectance for our beam size, a 1-D running average is taken within our beam size (50 m) and the averaged reflectance is used as ${A}_{s}^{i}(k)$ in our calculation. The raw and averaged reflectance data are plotted in Fig. 4 (*left*). For an averaging time of 1 s, there are *n*_{p} = 1000 pulses for each wavelength and the path length is 7 km. In contrast, the separation between spots of adjacent channels is only 0.875 m. As a result, the spectral distortion due to variations of $\overline{{\epsilon}_{i}}$ across the 8 channels is quite small. Referring to Eq. (24), ${\delta}_{\Delta \tau j}$ (and hence the RSE) is further reduced by the factor ${\mathrm{Cov}}_{i}(\tau {\text{'}}_{ij},\overline{{\epsilon}_{i}})$, which rejects variations of $\overline{{\epsilon}_{i}}$ uncorrelated with $\tau {\text{'}}_{ij}$. This point is confirmed by our calculation results shown in Fig. 4 (*right*). The RSE calculated from the surface reflectance data is $\le 2\times {10}^{-5}$ for 1-s averaging starting at any position along the path. The RSE increases when the averaging time (and hence *n*_{p}) is reduced. Figure 4 (*right*) also shows an average of 10 values of RSE, each computed over a 0.1-s averaging time consecutively along the path. This averaged RSE is essentially reduced to the same level as the 1-s RSE. This is also the case for the following double-layer retrievals. The retrieval from the multiple symmetric channels rejects anti-symmetric variations in $\mathrm{ln}({S}_{A}^{i})$ and appears to randomize the RSEs observed along consecutive sections of the path. This allows a shorter time of averaging before log, without increasing the overall RSE averaged across the consecutive sections of the path. It is desirable to shorten the time of averaging before log, to minimize the impact of time-varying spectral distortion (due to, e.g., etalon fringes in the lidar’s spectral response).
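The sampling geometry and the beam-size averaging described above can be sketched in a few lines of Python. This is an illustration only: the 7 km/s ground speed is inferred from the 7-km path covered in 1 s, the window of about 8 samples comes from dividing the 50-m beam size by the ~6-m step, and the edge handling of the running average (only full windows are kept) is an assumption, since the exact averaging used for Fig. 4 is not specified here. All function names are hypothetical.

```python
def spot_spacing_m(ground_speed_m_s, pulse_rate_hz):
    """Distance travelled by the beam spot between consecutive pulses."""
    return ground_speed_m_s / pulse_rate_hz


def pulses_per_channel(pulse_rate_hz, n_channels, averaging_time_s):
    """Number of pulses per wavelength channel within the averaging time."""
    return round(pulse_rate_hz * averaging_time_s / n_channels)


def running_average(values, window):
    """1-D running average that keeps only positions with a full window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    acc = sum(values[:window])
    out = [acc / window]
    for k in range(window, len(values)):
        acc += values[k] - values[k - window]
        out.append(acc / window)
    return out


spacing = spot_spacing_m(7000.0, 8000.0)   # 0.875 m between adjacent channels
n_p = pulses_per_channel(8000.0, 8, 1.0)   # 1000 pulses per wavelength in 1 s
window = round(50.0 / 6.0)                 # about 8 samples of ~6 m in a 50-m beam
```

Applying `running_average(raw_reflectance, window)` to a reflectance series sampled every ~6 m would give a beam-averaged series of the kind used as ${A}_{s}^{i}(k)$.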

Similarly, the RREs and RSEs of the double-layer retrievals are computed and shown in Fig. 5 (*left*) and (*right*), respectively. The RREs are plotted as functions of the boundary pressure *p*_{1} between the two layers. Also plotted in Fig. 5 (*left*) are $\Delta {\tau}_{1}$, $\Delta {\tau}_{2}$, and the correlation coefficient ${[R]}_{1,2}$ as functions of *p*_{1}. As expected, the RREs and RSEs become larger than the single-layer values. When retrieving the PBL with a top boundary ~2 km above the surface (corresponding to *p*_{1} ~795 hPa), the RRE and RSE of $q{\text{'}}_{1}$ are found to be as large as 0.52% and 0.011%, respectively, due to a small $\Delta {\tau}_{1}$ of ~0.210 and a large correlation coefficient ${[R]}_{1,2}$ of 0.933. The RRE and RSE of either layer become smaller as the layer becomes thicker. To reduce ${[R]}_{1,2}$ (the degree of linear dependence between ${\tau}_{i1}$ and ${\tau}_{i2}$ of the two layers), it is desirable that each layer has significant overlap (i.e., ${[K{\text{'}}_{q}]}_{i,j}$) with at least one online weighting function (while the offline absorption is minimized) and each online weighting function has much more overlap within one layer than other layers.
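The quoted PBL value can be cross-checked against the single-layer numbers. The Python sketch below assumes that the double-layer RRE is the single-layer form inflated by $1/\sqrt{1-{[R]}_{1,2}^{2}}$, the usual variance inflation for two correlated regressors with equal, uncorrelated noise. That form is not restated in this section, so it is an assumption here rather than the paper's formula, and the function name is hypothetical.

```python
import math


def double_layer_rre(sigma_dtau, dtau_j, r12):
    """Assumed form: sigma_dtau / (dtau_j * sqrt(1 - r12**2))."""
    return sigma_dtau / (dtau_j * math.sqrt(1.0 - r12 ** 2))


sigma_dtau = 0.00039   # effective measurement error quoted for the example
dtau_1 = 0.210         # effective DAOD of the PBL layer
r12 = 0.933            # correlation coefficient [R]_{1,2}

rre_pbl = double_layer_rre(sigma_dtau, dtau_1, r12)
print(f"{100.0 * rre_pbl:.2f}%")   # about 0.52%
```

Under this assumption the result is consistent with the 0.52% stated above, and it shows that the error increase comes both from the smaller $\Delta {\tau}_{1}$ and from the strong correlation between the two layers.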

The triple-layer retrieval RREs are computed and plotted in Fig. 6 (*left*), and the factor ${F}_{j}$ and effective DAOD $\Delta {\tau}_{j}$ are plotted in Fig. 6 (*right*), as functions of the boundary pressure *p*_{2} between layers 2 and 3 while *p*_{1} is fixed at 795 hPa. The RREs are significantly higher than the corresponding double-layer RREs due to the much larger ${F}_{j}$.

## 5. Discussions

Without averaging each pair of symmetric channels, the same results can still be achieved if the 2*m* by 2*m* covariance matrix of ${y}_{i}^{OD}$ is used. However, this requires knowledge of the non-zero covariances among different channels. By averaging each pair of symmetric laser frequency channels, the covariance matrix of **y** is diagonalized and reduced to *m* elements. This further simplifies the parametric analysis and numerical computation. The formulation can be further extended to allow each pair of symmetric channels to transmit pulses simultaneously. This could double the SBS-limited laser peak power. There is no need to measure $K{\text{'}}_{s}^{i}(k)$ and ${K}_{s}^{2m+1-i}(k)$ separately; only the sum $K{\text{'}}_{s}^{i}(k)+{K}_{s}^{2m+1-i}(k)$ is needed. However, this requires accurate measurement of the pulse energy ratio ${E}_{s}^{i}(k)/{E}_{s}^{2m+1-i}(k)$ (in addition to the pulse energy sum ${E}_{s}^{i}(k)+{E}_{s}^{2m+1-i}(k)$), which is difficult to achieve when the simultaneous pulses come from the same laser.

A wavelength channel becomes redundant for the retrieval of $q$ if its weighting function can be approximated by a linear combination of weighting functions of other channels. Nevertheless, it can still provide an independent piece of information to allow for inclusion of a term in the model (such as ${c}_{2}{\left(\Delta {\nu}_{i}\right)}^{2}$) to further correct spectral distortion. To retrieve $q$, only the relative ratios of the transmitted pulse energies $E{\text{'}}_{s}^{i}(k)$ are needed. There is no need to measure the absolute values of $E{\text{'}}_{s}^{i}(k)$ because scaling $K{\text{'}}_{s}^{i}(k)/E{\text{'}}_{s}^{i}(k)$ with a common factor only shifts ${c}_{0}$, not $q$. The spectral distortion from the lidar’s receiving path can be substantially removed from the offset $-\mathrm{ln}({S}_{A}^{i})$. One way to do this is to normalize ${S}_{NK}^{i}$ by the spectral response measured for the receiving path. Another way is to pass a small fraction of the transmitted laser through the receiving path and measure $E{\text{'}}_{s}^{i}(k)$ from it at the end of the path [12].

It should be noted that the weighting function can also be defined as $[d{\tau}_{i}/d\mathrm{ln}(p)]/q(p)$, or $(d{\tau}_{i}/dz)/q(z)$ (as a function of the altitude *z*). Although different definitions lead to different weighting function curves, they produce the same integrals ${[K{\text{'}}_{q}]}_{i,j}$ and thus are equivalent.

Despite the variations of ${A}_{s}^{i}(k)$ from one wavelength sweep cycle to the next, the retrievals remain accurate as long as $[\mathrm{ln}({S}_{A}^{i})+\mathrm{ln}({S}_{A}^{2m+1-i})]/2$ is the same for all *m* combined channels. Even when $\mathrm{ln}({S}_{A}^{i})$ varies across the 2*m* uncombined channels, the retrieval RSEs arising from this spectral distortion are substantially reduced by the cancelation of the anti-symmetric spectral distortion and the rejection of the symmetric spectral distortion component that is uncorrelated with $\tau {\text{'}}_{ij}$. This applies to all contributing factors to ${A}_{s}^{i}(k)$, including the surface reflectance and the optical detector responsivity for the receiver and the transmitted laser pulse energy monitor. For example, the slow responsivity drift of either detector does not affect the retrievals as long as the detector responsivity remains constant during each wavelength sweep cycle (~1 ms).
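The cancelation of the anti-symmetric distortion can be seen with a toy example. The Python sketch below averages each pair of symmetric channels (channel *i* with channel 2*m*+1−*i*) for a purely hypothetical offset made of a constant plus a linear tilt across 2*m* = 8 channels. The function name and the numbers are illustrative assumptions, not values from the paper.

```python
def pair_average(values):
    """Average channel i with channel 2m+1-i (1-based) for 2m channels."""
    n = len(values)
    if n % 2 != 0:
        raise ValueError("expected an even number of channels (2m)")
    return [(values[i] + values[n - 1 - i]) / 2.0 for i in range(n // 2)]


m = 4
centre = (2 * m - 1) / 2.0                           # 3.5 for 0-based channel indices
tilt = [0.01 * (i - centre) for i in range(2 * m)]   # anti-symmetric distortion
offset = [0.2 + t for t in tilt]                     # constant offset plus tilt

averaged = pair_average(offset)                      # every entry is ~0.2
```

The tilt drops out of every pair, leaving the same constant for all *m* combined channels, which only shifts ${c}_{0}$. A symmetric distortion component would survive this averaging and is rejected only to the extent that it is uncorrelated with $\tau {\text{'}}_{ij}$, as stated above.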

The retrieval errors can be further reduced if *a priori* constraints for the gas mixing ratios are included. Fixed ${x}_{a}$ and **S**_{a} have been used for CO_{2} retrieval [18]. When the measurements and retrievals are made at consecutive time steps, it would be more accurate if ${x}_{a}$ and **S**_{a} could be estimated from neighboring measurements taken before (and after) the current time step. Similarly, the range ${r}_{s}^{i}(k)$ at a beam spot *k* can be estimated from multiple altimetry measurements at beam spot *k* and neighboring beam spots. Since there are 57 such neighboring beam spots within a beam spot size of 50 m for our CO_{2} sounder example, the estimation accuracy of ${r}_{s}^{i}(k)$ can be significantly improved.
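The count of 57 neighboring spots follows from the 50-m beam spot size and the 0.875-m separation between the spots of consecutive pulses given in section 4:

$$\frac{50\ \text{m}}{0.875\ \text{m}}\approx 57$$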

To simplify retrievals, narrow-linewidth lasers have been used to scan the target absorption lines. The narrow laser linewidth leads to SBS in the laser amplifiers, which limits the laser peak power and laser pulse energy. It would be highly desirable if the IPDA measurements could be made with spectrally broadened laser pulses in order to suppress the SBS. The laser line-shape needs to be broadened deterministically so that the resulting effective OD ${\tau}_{av}$ is deterministic and can be accurately calculated. The present model allows us to examine the feasibility and limitation of retrievals from such measurements. Our research on this topic will be reported in future publications.

## 6. Summary

New modeling and error reduction methods are presented for retrieving atmospheric constituents from symmetrically measured lidar-sounding absorption spectra. The forward model accounts for laser line-center frequency noise and broadened line-shape, and is essentially linearized by linking estimated ODs to the mixing ratios of the target species. Errors from the spectral distortion and laser frequency drift are substantially reduced by averaging ODs at each pair of symmetric wavelength channels. This allows the tolerance for the laser frequency drift to be relaxed from 0.23 MHz to 6 MHz for the ASCENDS’ CO_{2} transmitter. Retrieval errors from measurement noise and model bias are analyzed parametrically and numerically for multiple atmospheric layers, to provide deeper insight. For each atmospheric layer, an effective DAOD and its effective measurement variance are introduced. The RRE of the mixing ratio is equal to the RRE of the effective DAOD. In general, the retrieval RRE and RSE are reduced by using information collectively from all measurement channels. When the signal shot noise is predominant, the retrieval RRE is approximately inversely proportional to the square root of the accumulated photon count of the participating pulses from all measurement channels. Errors from surface height and reflectance variations are reduced to tolerable levels by “averaging before log” with pulse-by-pulse ranging knowledge incorporated. Error contributions from other sources are also taken into account.

## Appendices

#### A. Effective optical depth and weighting function

The effective two-way OD $\tau ({\nu}_{c},2r)$ [16] can be linked to the dry mixing ratio ${q}_{gas}(p)$ by

*p*, *l* is the accumulated laser path length running from the spacecraft to the surface and back, ${N}_{gas}(l)$ the number density of the target molecules at path length *l*, ${m}_{{H}_{2}O}$, ${m}_{dryair}$ and *g* are defined for Eq. (1), and ${\sigma}_{eff}({\nu}_{c},l)$, ${\sigma}_{eff}^{f}({\nu}_{c},p)$ and ${\sigma}_{eff}^{b}({\nu}_{c},p)$ are defined below. ${\sigma}_{eff}({\nu}_{c},l)$ is the effective absorption cross-section of the target molecules defined as

*l*. In general, a broad ${L}_{N}({\nu}_{F},l)$ is distorted progressively along the outgoing and return paths. This leads to two different values of ${\sigma}_{eff}({\nu}_{c},p)$ for each vertical position, ${\sigma}_{eff}^{f}({\nu}_{c},p)$ for the outgoing laser pulse and ${\sigma}_{eff}^{b}({\nu}_{c},p)$ for the return laser pulse as given by

*p* falls into a layer *j* = *j*_{p}, and ${p}_{{j}_{p}-1}$ is shifted to ${p}_{{j}_{p}-1}=p$.

#### B. Additional details for forward model

We now summarize noise contributions from the background solar radiation, detector dark count, and receiver circuitry noise for the lidar receiver. $K{\text{'}}_{s}^{i}(k)$ is estimated from the total count ${K}_{tot}^{i}(k)$ within a pulse duration $\Delta t$ minus a derived background count $K{\text{'}}_{bgd}^{i}$ arising from background solar radiation, detector dark count and receiver circuitry noise. To reduce the background variance, the background count is measured in a longer duration *β*Δ*t* between the pulse measurements, and scaled to $K{\text{'}}_{bgd}^{i}$ within Δ*t*. Referencing Eq. (5), the variance of $K{\text{'}}_{s}^{i}(k)$ is found to be [16]

*B*_{o} the optical filter bandwidth, *λ*_{d} an equivalent dark count rate, *F*_{d} an effective dark-count excess noise factor, ${S}_{\delta i}(f)$ the PSD of the equivalent input noise current of the circuit, *M*_{e} the mean internal gain of the detector, and *e* the electron charge.

The bias correction factor ${C}_{i}^{OD}$ introduced in Eq. (8) is given by [16]

#### C. Numerical verification of $Cov({\delta}_{\nu nslow}^{i},{\delta}_{\nu nslow}^{j})\cong {\sigma}^{2}({\delta}_{\nu slow})$

The covariance of ${\delta}_{\nu slow}(t)$ is the Fourier transform of its PSD ${S}_{\delta \nu slow}(f)$ that in turn can be derived from the measured master laser frequency noise PSD ${S}_{\delta \nu}(f)$ [9] shown in Fig. 7 (*left*). ${S}_{\delta \nu slow}(f)$ is taken to be the slow (“flicker” noise) portion of ${S}_{\delta \nu}(f)$, and it is truncated to zero for *f* > 20 Hz. $\mathrm{Cov}\left({\delta}_{\nu nslow}^{i},{\delta}_{\nu nslow}^{j}\right)$ is computed by integrating the product of ${S}_{\delta \nu slow}(f)$ and the following window function

*right*), ${W}_{ij}^{cov}(f)$ are computed for *n*_{p} = 100 and 1000 using the relevant parameters listed in Table 1 and taking the measured surface reflectance as *A*_{s}(*i*) as described in section 4. ${W}_{1,1}^{cov}(f)$, ${W}_{8,8}^{cov}(f)$ and ${W}_{1,7}^{cov}(f)$ are plotted for the path segment giving rise to the most relative change in ${\sum}_{k=1}^{{n}_{p}}{A}_{s}^{i}(k)$ between channels 1 and 7. ${W}_{ij}^{cov}(f)$ (including ${W}_{ii}^{cov}(f)$) among the 8 channels are found to be essentially identical within *f* ≤ 160 Hz. This is due to the small $f{t}_{p}\ll 1$, and the fact that ${A}_{s}^{j}(k)$ closely tracks ${A}_{s}^{i}(k)$ when the separation between the corresponding beam spots of channel *i* and channel *j* is much smaller than the beam spot diameter and the path length of the *n*_{p} pulses. $\mathrm{Cov}\left({\delta}_{\nu nslow}^{i},{\delta}_{\nu nslow}^{j}\right)$ (including ${\sigma}^{2}\left({\delta}_{\nu nslow}^{i}\right)$) among the 8 channels are found to be the same within 0.001%, even when *n*_{p} is as small as 100. They are only slightly smaller than ${\sigma}^{2}\left({\delta}_{\nu slow}\right)$ (smaller by < 0.3% for *n*_{p} = 100, and by < 2% for *n*_{p} = 1000). This verifies that ${\sigma}^{2}({\delta}_{\nu nslow})$ from the slow frequency drift essentially does not decrease from the pulse averaging, and $\mathrm{Cov}\left({\delta}_{\nu nslow}^{i},{\delta}_{\nu nslow}^{j}\right)\cong {\sigma}^{2}\left({\delta}_{\nu slow}\right)$.

## Acknowledgments

The authors gratefully acknowledge Dr. J. Mao and Dr. X. Sun of NASA Goddard for fruitful discussions. They are also indebted to Dr. A. Amediek of Deutsches Zentrum für Luft- und Raumfahrt (DLR) for sharing surface reflectance measurement data, Dr. J. Abshire and other members of the Goddard CO_{2} sounder team for their support. This work was supported by the NASA Goddard Internal Research and Development program and the NASA Earth Science Technology Office Instrument Incubator Program.

## References and links

**1. **Space Studies Board, National Research Council, *Earth Science and Applications from Space: National Imperatives for the Next Decade and Beyond* (National Academies, 2007).

**2. **“A-SCOPE—advanced space carbon and climate observation of planet earth, report for assessment,” ESA-SP1313/1 (European Space Agency, 2008), http://esamultimedia.esa.int/docs/SP1313-1_ASCOPE.pdf.

**3. **G. Ehret, C. Kiemle, M. Wirth, A. Amediek, A. Fix, and S. Houweling, “Space-borne remote sensing of CO_{2}, CH_{4}, and N_{2}O by integrated path differential absorption lidar: a sensitivity analysis,” Appl. Phys. B **90**(3-4), 593–608 (2008).

**4. **J. B. Abshire, H. Riris, G. Allan, X. Sun, S. R. Kawa, J. Mao, M. Stephen, E. Wilson, and M. A. Krainak, “Laser sounder for global measurement of CO_{2} concentrations in the troposphere from space,” in Laser Applications to Chemical, Security and Environmental Analysis, OSA Technical Digest (CD) (Optical Society of America, 2008), paper LMA4.

**5. **R. T. Menzies and M. T. Chahine, “Remote atmospheric sensing with an airborne laser absorption spectrometer,” Appl. Opt. **13**(12), 2840–2849 (1974).

**6. **J. B. Abshire, H. Riris, G. R. Allan, C. J. Weaver, J. Mao, X. Sun, W. E. Hasselbrack, S. R. Kawa, and S. Biraud, “Pulsed airborne lidar measurements of atmospheric CO_{2} column absorption,” Tellus Ser. B, Chem. Phys. Meteorol. **62**(5), 770–783 (2010).

**7. **J. Caron and Y. Durand, “Operating wavelengths optimization for a spaceborne lidar measuring atmospheric CO_{2},” Appl. Opt. **48**(28), 5413–5422 (2009).

**8. **K. Numata, J. R. Chen, and S. T. Wu, “Precision and fast wavelength tuning of a dynamically phase-locked widely-tunable laser,” Opt. Express **20**(13), 14234–14243 (2012).

**9. **K. Numata, J. R. Chen, S. T. Wu, J. B. Abshire, and M. A. Krainak, “Frequency stabilization of distributed-feedback laser diodes at 1572 nm for lidar measurements of atmospheric carbon dioxide,” Appl. Opt. **50**(7), 1047–1056 (2011).

**10. **H. Riris, M. Rodriguez, G. R. Allan, W. Hasselbrack, J. Mao, M. Stephen, and J. Abshire, “Pulsed airborne lidar measurements of atmospheric optical depth using the Oxygen A-band at 765 nm,” Appl. Opt. **52**(25), 6369–6382 (2013).

**11. **H. Riris, K. Numata, S. Li, S. Wu, A. Ramanathan, M. Dawsey, J. Mao, R. Kawa, and J. B. Abshire, “Airborne measurements of atmospheric methane column abundance using a pulsed integrated-path differential absorption lidar,” Appl. Opt. **51**(34), 8296–8305 (2012).

**12. **J. Caron, Y. Durand, J. L. Bezy, and R. Meynart, “Performance modeling for A-SCOPE, a spaceborne lidar measuring atmospheric CO_{2},” Proc. SPIE **7479**, 74790E1 (2009).

**13. **R. T. Menzies and D. M. Tratt, “Differential laser absorption spectrometry for global profiling of tropospheric carbon dioxide: selection of optimum sounding frequencies for high-precision measurements,” Appl. Opt. **42**(33), 6569–6577 (2003).

**14. **J. B. Abshire, H. Riris, C. J. Weaver, J. Mao, G. R. Allan, W. E. Hasselbrack, and E. V. Browell, “Airborne measurements of CO_{2} column absorption and range using a pulsed direct-detection integrated path differential absorption lidar,” Appl. Opt. **52**(19), 4446–4461 (2013).

**15. **S. R. Kawa, J. Mao, J. B. Abshire, G. J. Collatz, X. Sun, and C. J. Weaver, “Simulation studies for a space-based CO_{2} lidar mission,” Tellus Ser. B, Chem. Phys. Meteorol. **62**(5), 759–769 (2010).

**16. **J. R. Chen, K. Numata, and S. T. Wu, “Error reduction methods for integrated-path differential-absorption lidar measurements,” Opt. Express **20**(14), 15589–15609 (2012).

**17. **C. D. Rodgers, *Inverse Methods for Atmospheric Sounding: Theory and Practice* (World Scientific, 2000), Vol. 2.

**18. **E. Dufour and F. M. Bréon, “Spaceborne estimate of atmospheric CO_{2} column by use of the differential absorption method: error analysis,” Appl. Opt. **42**(18), 3595–3609 (2003).

**19. **J. Mao and S. R. Kawa, “Sensitivity studies for space-based measurement of atmospheric total column carbon dioxide by reflected sunlight,” Appl. Opt. **43**(4), 914–927 (2004).

**20. **D. A. Long, K. Bielska, D. Lisak, D. K. Havey, M. Okumura, C. E. Miller, and J. T. Hodges, “The air-broadened, near-infrared CO_{2} line shape in the spectrally isolated regime: evidence of simultaneous Dicke narrowing and speed dependence,” J. Chem. Phys. **135**(6), 064308 (2011).

**21. **D. R. Thompson, D. C. Benner, L. R. Brown, D. Crisp, V. M. Devi, Y. Jiang, V. Natraj, F. Oyafuso, K. Sung, D. Wunch, R. Castaño, and C. E. Miller, “Atmospheric validation of high accuracy CO_{2} absorption coefficients for the OCO-2 mission,” J. Quant. Spectrosc. Radiat. Transf. **113**(17), 2265–2276 (2012).

**22. **W. B. Grant, “Effect of differential spectral reflectance on DIAL measurements using topographic targets,” Appl. Opt. **21**(13), 2390–2394 (1982).

**23. **J. W. Goodman, *Statistical Optics* (John Wiley & Sons, 1985).

**24. **N. Z. Hakim, M. C. Teich, and B. E. A. Saleh, “Generalized excess noise factor for avalanche photodiodes of arbitrary structure,” IEEE Trans. Electron. Dev. **37**(3), 599–610 (1990).

**25. **G. D. Spiers, R. T. Menzies, J. Jacob, L. E. Christensen, M. W. Phillips, Y. Choi, and E. V. Browell, “Atmospheric CO_{2} measurements with a 2 μm airborne laser absorption spectrometer employing coherent detection,” Appl. Opt. **50**(14), 2098–2111 (2011).

**26. **J. D. Beck, R. Scritchfield, P. Mitra, W. Sullivan III, A. D. Gleckler, R. Strittmatter, and R. J. Martin, “Linear-mode photon counting with the noiseless gain HgCdTe e-APD,” Proc. SPIE **8033**, 80330N (2011).

**27. **A. Amediek, A. Fix, G. Ehret, J. Caron, and Y. Durand, “Airborne lidar reflectance measurements at 1.57 μm in support of the A-SCOPE mission for atmospheric CO_{2},” Atmos. Meas. Tech. **2**(2), 755–772 (2009).