Lecture - Fundamental Concepts

Website:	WueCampus
Course:	vhb : Radio-Astronomical Interferometry (DEMO)
Book:	Lecture - Fundamental Concepts

Printed by:	Guest
Date:	Wednesday, 19 February 2025, 00:08

Table of contents

1. Fourier optics
2. Interferometry
3. Aperture Synthesis by Radio Interferometric Arrays
4. Receiver Response
5. Image reconstruction
6. Digital Beamforming

1. Fourier optics

Learning Objectives:

Basic concepts applied to telescopes
Fourier and convolution mathematics
Application of telescopes as spatial filters

2. Interferometry

Learning Objectives:

Interferometry basics and concepts
Application in radio astronomy
Introduction to important quantities such as the visibility function and bandwidth
Influence of the instrument setup on the measured quantities
Orientation of pointing in the scope of coordinates

3. Aperture Synthesis by Radio Interferometric Arrays

Learning Objectives:

Establishing a familiarity with the $(u,v)$ space
Interferometer arrangements with their strengths and drawbacks
Interferometer properties based on their field of application

4. Receiver Response

Learning Objectives:

Process and advantages of heterodyne frequency conversion
Sensitivity of a radio-interferometrical array
Effects of sampling, weighting and gridding
Effect of finite bandwidth, bandwidth smearing
Calibration of radio-interferometrical observations

4.1. Heterodyne frequency conversion

The detectors of radio telescopes are diodes with quadratic current-voltage characteristics that require an input power of $\sim 10^{-5}\,\text{W}$ to work in the quadratic regime. The power $P$ of the weak radio-astronomical signals entering the receiver is given by

$\displaystyle P = k \cdot T_\text{sys} \cdot \Delta \nu \text{,}$

in which $k$ is the Boltzmann constant and $T_\text{sys}$ is the system temperature, which comprises the receiver noise temperature and the antenna temperature. The antenna temperature is composed of the desired astronomical signal, a contribution from the Earth's atmosphere, and possibly radiation from the ground. Because of this ever-present background, the best case for the system temperature is $T_\text{sys} = 20\,\text{K}$ . Therefore, using a typical bandwidth of $\Delta\nu = 50\,\text{MHz}$ , the input power is $P = 1.4 \cdot 10^{-14}\,\text{W}$ , meaning that the signal must be amplified by a factor of $\sim 7\cdot 10^8$ . However, amplification by such a high factor is problematic, since the amplifying system becomes unstable due to feedback: a small amount of the power passing through the individual electronic components leaks out and reaches previous receiver elements, where it is amplified again. To circumvent this problem, the signal frequency $\nu_\text{S}$ is down-converted to the lower intermediate frequency (IF) by mixing the signal with that of a local oscillator (LO), which decouples the signal path after the first amplification. This process is called heterodyne amplification or heterodyne frequency conversion.
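The numbers quoted above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `input_power` and `required_gain` are illustrative, and the detector threshold of $10^{-5}\,\text{W}$, $T_\text{sys} = 20\,\text{K}$ and $\Delta\nu = 50\,\text{MHz}$ are the values stated in the text.

```python
# Input power of a radio receiver, P = k * T_sys * delta_nu, and the
# amplification needed to reach the quadratic regime of the detector.
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def input_power(t_sys_kelvin, bandwidth_hz):
    """Return the input power in watts for a given T_sys and bandwidth."""
    return K_BOLTZMANN * t_sys_kelvin * bandwidth_hz


def required_gain(detector_power_watt, t_sys_kelvin, bandwidth_hz):
    """Return the power amplification factor needed to reach the detector power."""
    return detector_power_watt / input_power(t_sys_kelvin, bandwidth_hz)


p_in = input_power(20.0, 50e6)           # T_sys = 20 K, bandwidth = 50 MHz
gain = required_gain(1e-5, 20.0, 50e6)   # detector needs about 1e-5 W
print(f"P = {p_in:.2e} W, gain = {gain:.1e}")
```

The result is an input power of about $1.4 \cdot 10^{-14}\,\text{W}$ and a required gain of roughly $7 \cdot 10^8$, in agreement with the values given above.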

Frequency down-conversion


Fig. 2.29 Illustration of the down-conversion process. The measured radio signal is mixed with a local oscillator of frequency $\nu_\text{LO}$ leading to two frequency sidebands $\nu_\text{i}$ and $\nu_\text{S}$ . These two sidebands can then be down-converted to an intermediate frequency band at $\nu_\text{IF}$ , representing the difference between the two sideband frequencies and $\nu_\text{LO}$ , respectively.

The LO produces a signal at a well-defined and tunable frequency $\nu_\text{LO}$ , close to the observing frequency $\nu_\text{S}$ . To mix these two input signals, a semiconductor diode with a non-linear current-voltage characteristic is used. The current-voltage relation of this diode can be written in terms of a Taylor expansion as

$\displaystyle I(U_0 + \delta U) = I(U_0) + \frac{\text{d}I}{\text{d}U} \cdot \delta U + \frac{1}{2} \cdot \frac{\text{d}^2 I}{\text{d} U^2} \cdot (\delta U)^2 + \text{...} = K_0 + K_1 \cdot \delta U + K_2 \cdot (\delta U)^2 + \text{...}$

for small variations $\delta U$ around the constant input bias $U_0$ of the mixer $(\delta U \ll U_0)$ , where

$\displaystyle \delta U = A \cdot \sin (\omega_\text{LO} \cdot t) + B \cdot \sin (\omega_\text{S} \cdot t) \text{,}$

with

$\displaystyle \begin{array}{rcl} \omega_\text{LO} = 2 \pi \cdot \nu_\text{LO} & \text{and} & \omega_\text{S} = 2 \pi \cdot \nu_\text{S} \text{.} \end{array}$

Inserting this expression for $\delta U$ into the Taylor expansion yields

$\displaystyle \begin{array}{rl} I = & K_0 + K_1 \cdot \left[A \cdot \sin (\omega_\text{LO} \cdot t) + B \cdot \sin (\omega_\text{S} \cdot t)\right] \\ & + K_2 \cdot \left[\frac{A^2}{2} + \frac{B^2}{2}\right] \\ & - K_2 \cdot \left[\frac{A^2}{2} \cdot \cos (2 \cdot \omega_\text{LO} \cdot t) + \frac{B^2}{2} \cdot \cos (2 \cdot \omega_\text{S} \cdot t)\right] \\ & + K_2 \cdot A \cdot B \cdot \cos \left[(\omega_\text{LO} - \omega_\text{S}) \cdot t\right] \\ & - K_2 \cdot A \cdot B \cdot \cos \left[(\omega_\text{LO} + \omega_\text{S}) \cdot t\right] \text{,} \end{array}$

leading to a frequency spectrum containing frequencies of

$\displaystyle \begin{array}{lcr} \nu_\text{l,m} = |l \cdot \nu_\text{LO} \pm m \cdot \nu_\text{S}| \text{,} & \text{with} & l,m = 0, 1, 2, 3, ...\,\text{.} \end{array}$

Since the LO power $P_\text{LO}$ of higher harmonics decreases as $1/l^2$ and usually the signal power $P_\text{S} \ll P_\text{LO}$ , only three terms of this frequency spectrum are important, namely:

the frequency sum

$\displaystyle \nu_\text{LO} + \nu_\text{S} \text{,}$

the intermediate frequency (IF)

$\displaystyle \nu_\text{IF} = |\nu_\text{LO} - \nu_\text{S}|$

and the image frequency

$\displaystyle \nu_\text{i} = 2 \cdot \nu_\text{LO} - \nu_\text{S} \text{,}$

which is the "image" of $\nu_\text{S}$ mirrored about $\nu_\text{LO}$ . Combining these terms, one obtains two high-frequency (HF) bands that lie symmetrically about $\nu_\text{LO}$ , each separated from it by the IF. For $\nu_\text{LO} < \nu_\text{S}$ , these two HF bands correspond to the observing frequency, or signal frequency, $\nu_\text{S}$ , and the image frequency $\nu_\text{i}$ , and are denoted by

$\displaystyle \begin{array}{lcr} \nu_\text{S} = \nu_\text{LO} + \nu_\text{IF} & \text{and} & \nu_\text{i} = \nu_\text{LO} - \nu_\text{IF} \text{.} \end{array}$

Figure 2.29 illustrates the down-conversion of the two HF bands to the single IF band. The two HF bands $\nu_\text{i}$ and $\nu_\text{S}$ are called lower sideband (LSB) and upper sideband (USB), respectively. If both sidebands are used, the receiver operates in the so-called double-sideband mode. Otherwise, the receiver is said to operate in the single-sideband mode. To ensure that only the down-converted IF band is passed on, an IF filter is used after the mixer to suppress all unwanted products of the mixing process (e.g. $\nu_\text{LO} + \nu_\text{S}$ ).
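The mixing products derived above can be checked numerically. The following sketch applies only the quadratic term $K_2 \cdot (\delta U)^2$ of the Taylor expansion to the sum of an LO tone and a signal tone and lists the frequencies that appear in the output spectrum. All numerical values (frequencies, amplitudes, $K_2 = 1$, sampling rate) are hypothetical illustration values and are not taken from the lecture.

```python
import numpy as np

# Hypothetical illustration values (not from the text): frequencies in Hz.
nu_lo, nu_s = 100.0, 110.0   # LO and signal frequency, with nu_LO < nu_S
a_lo, b_s = 1.0, 0.1         # amplitudes A and B, with B << A
k2 = 1.0                     # quadratic coefficient K_2 of the diode

fs, n = 1024.0, 1024         # sampling rate and number of samples
t = np.arange(n) / fs        # 1 s of data gives a 1 Hz frequency resolution
delta_u = (a_lo * np.sin(2 * np.pi * nu_lo * t)
           + b_s * np.sin(2 * np.pi * nu_s * t))

# Keep only the quadratic term of the Taylor expansion of the diode current.
current = k2 * delta_u**2

# Amplitude spectrum of the mixer output.
spectrum = np.abs(np.fft.rfft(current)) / n
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Frequencies that carry significant power in the mixer output.
lines = freqs[spectrum > 1e-6]
print(lines)
```

With these values the spectrum contains a constant (0 Hz) term, the difference frequency $|\nu_\text{LO} - \nu_\text{S}|$ at 10 Hz, the harmonics $2\nu_\text{LO}$ and $2\nu_\text{S}$ at 200 Hz and 220 Hz, and the sum frequency at 210 Hz. An IF filter centred on 10 Hz would pass only the difference term.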

Advantages


Fig. 2.30 Illustration of the down-conversion process for a two-element interferometer. The measured radio signal at frequency $\nu_\text{RF}$ is mixed with a common local oscillator (LO) signal at frequency $\nu_\text{LO}$ . One of the two LO signals is phase-shifted by $\varphi_\text{LO}$ .

Once the HF bands are down-converted to the lower IF band, further amplification can be performed. Because of the lower frequency, all electronic components after the mixer become more stable and easier to handle. Furthermore, after the heterodyne frequency conversion it is also possible to vary the observing frequency $\nu_\text{S}$ without changing any components following the mixer, which is essential for spectral-line measurements.

The heterodyne frequency conversion also results in some important advantages in radio interferometry. As already seen before, the correlated power of a two-element interferometer is given by

$\displaystyle P = A_0 \cdot \Delta \nu \cdot |V| \cdot \cos(2\pi \cdot \vec{D}_\lambda \cdot \vec{s}_0 - \varphi_\text{V}) \text{,}$

in which $|V|$ and $\varphi_\text{V}$ are the amplitude and phase of the complex visibility function, $\vec{D}_\lambda$ is the baseline between the two antennas in units of the observing wavelength and $\vec{s}_0$ denotes the position of the observed radio source.

Figure 2.30 illustrates such an interferometer with some modifications concerning the heterodyne frequency conversion. First, an artificial time lag $\tau_\text{i}$ is introduced in one of the two branches of the receiver systems before the correlator. Second, a common LO signal is fed into both mixers, with a phase shifter inserted into one of the branches. Therefore, the resulting signal frequency $\nu_\text{RF} = \nu_\text{S}$ is given by

$\displaystyle \nu_\text{RF} = \nu_\text{LO} \pm \nu_\text{IF} \text{.}$

Observing in the single-sideband mode, the USB $(\nu_\text{LO} + \nu_\text{IF})$ and LSB $(\nu_\text{LO} - \nu_\text{IF})$ phases $\varphi_1$ and $\varphi_2$ of the two signals are given by

$\displaystyle \varphi_1 = 2 \pi \cdot \nu_\text{RF} \cdot \tau_\text{g} = 2 \pi \cdot (\nu_\text{LO} \pm \nu_\text{IF}) \cdot \tau_\text{g} \text{,}$

$\displaystyle \varphi_2 = 2 \pi \cdot \nu_\text{IF} \cdot \tau_\text{i} + \varphi_\text{LO} \text{,}$

in which $\tau_\text{g}$ is the geometric time delay and $\varphi_\text{LO}$ is the phase difference between the LO signals at the input of the mixers. The correlated power of such an interferometer can then be determined by replacing the phase $2 \pi \cdot \vec{D}_\lambda \cdot \vec{s}_0$ by $\varphi_1 - \varphi_2$ , leading to

$\displaystyle P = A_0 \cdot \Delta \nu \cdot |V| \cdot \cos [2 \pi (\nu_\text{LO} \cdot \tau_\text{g} \pm \nu_\text{IF} \cdot \Delta \tau) - \varphi_\text{V} - \varphi_\text{LO}] \text{,}$

in which $\Delta \tau = \tau_\text{g} - \tau_\text{i}$ .

Therefore, heterodyne frequency conversion in radio interferometry makes it possible to compensate the geometric delay $\tau_\text{g}$ by introducing an instrumental time lag $\tau_\text{i}$ , and to compensate any phase variations due to the varying $\tau_\text{g}$ by controlling $\varphi_\text{LO}$ .

Fringe rotation and complex correlators

By varying $\varphi_\text{LO}$ , the fringe frequency at which the interference pattern changes for each telescope pair due to the varying hour angle can be reduced. This is called fringe rotation. Looking at the time derivative of $\varphi_1 - \varphi_2$ , given by


Fig. 2.31 Sketch of a complex correlator. The correlation of the two original signals with a cosine correlator yields the real part of the visibility. The correlation of the two signals, with one of them shifted by $90^\circ$ , leads to the imaginary part of the visibility using a sine correlator.

$\displaystyle \frac{\text{d}}{\text{d}t}(\varphi_1 - \varphi_2) = 2\pi\cdot\nu_\text{LO}\cdot \frac{\text{d}\tau_\text{g}}{\text{d}t}- \frac{\text{d}\varphi_\text{LO}}{\text{d}t}\,\text{,}$

one can see that it is even possible to stop the fringes by varying $\varphi_\text{LO}$ at a rate that matches the so-called natural fringe frequency, given by the term

$\displaystyle \nu_\text{LO}\cdot \frac{\text{d}\tau_\text{g}}{\text{d}t}\,\text{.}$

This is then called fringe stopping. After rotating or stopping the fringes, it is also much easier to measure the amplitude and phase of the complex visibility by using a so-called complex correlator (see Fig. 2.31). Here, two correlations are performed: one with the two original signals, leading to a cosinusoidal output signal that represents the real part of the visibility $\Re V$ , and one that has a $90^\circ$ phase shift in one of its branches prior to the correlation, leading to a sinusoidal output signal that corresponds to the imaginary part of the visibility $\Im V$ . With these output signals, the amplitude $|V|$ and phase $\varphi_\text{V}$ of the visibility can be calculated by

$\displaystyle |V| = \sqrt{(\Re V)^2 + (\Im V)^2}$

and

$\displaystyle \varphi_\text{V} = \arctan\left(-\frac{\Im V}{\Re V}\right)\,\text{.}$
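As a minimal sketch, the two relations above translate directly into code. The function name `visibility_amplitude_phase` is illustrative. The sign convention $\arctan(-\Im V / \Re V)$ is the one used in this section; using `arctan2` instead of a plain arctangent is an assumption that goes beyond the text and serves only to resolve the quadrant when $\Re V < 0$.

```python
import numpy as np


def visibility_amplitude_phase(re_v, im_v):
    """Amplitude and phase (in radians) from cosine- and sine-correlator outputs.

    Follows the sign convention of the text, phi_V = arctan(-Im V / Re V).
    arctan2 is used (an assumption beyond the text) to resolve the quadrant.
    """
    amplitude = np.hypot(re_v, im_v)      # sqrt(Re^2 + Im^2)
    phase = np.arctan2(-im_v, re_v)
    return amplitude, phase


# Hypothetical correlator outputs in arbitrary units.
amp, phi = visibility_amplitude_phase(3.0, -4.0)
print(amp, phi)
```

For $\Re V > 0$ the result is identical to the arctangent formula above.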

4.2. Interferometer sensitivity

In order to estimate whether a source can be detected by a radio telescope or array, the so-called signal-to-noise ratio (SNR) has to be calculated. For a single telescope the signal is given by the measured antenna temperature $T_\text{A}$ and the noise is given by the radiometer equation

$\displaystyle \Delta T = \frac{C \cdot T_\text{sys}}{\sqrt{\Delta \nu \cdot \tau}} \text{,}$

in which $C$ is a dimensionless constant depending on the receiver system used, $T_\text{sys}$ is the system temperature, $\Delta \nu$ is the bandwidth of the receiver equipment and $\tau$ is the integration time.
In radio interferometry, the signal is given by the brightness distribution $B(\xi, \eta)$ of a source that must be calculated by the Fourier integral of the measured visibility $V(u, v)$ :

$\displaystyle B(\xi, \eta) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} V(u,v) \cdot \text{e}^{\text{i}\cdot 2\pi \cdot (u \cdot \xi + v \cdot \eta)} \text{d}u ~\text{d}v \text{.}$

Therefore, to calculate the SNR in radio interferometry, one has to calculate the uncertainty of the brightness distribution $B$ from the uncertainty of the complex visibility $V$ . As in every measurement, the measured data need to be sampled, meaning that the visibility is only measured at discrete locations $(u_\text{l},v_\text{l})$ along the tracks in the $(u,v)$ -plane. Let: $\tau_\text{a}$ be the integration time of the individual samples in the $(u,v)$ -plane, $\tau_0$ be the total integration time and $n_\text{a}$ be the total number of antennas. Then the total number of measured visibilities $n_\text{d}$ is given by

$\displaystyle n_\text{d} = n_\text{p} \cdot \frac{\tau_0}{\tau_\text{a}} \text{,}$

in which

$\displaystyle n_\text{p} = \frac{n_\text{a}\cdot (n_\text{a} - 1)}{2}$

is the number of antenna pairs or rather the number of individual two-element interferometers. Since this brightness distribution suffers from incomplete $(u,v)$ -coverage, the resulting image is called a "dirty image", which is also indicated by the superscript "D" of the brightness distribution $B^\text{D}(\xi_\text{m}, \eta_\text{m})$ . The visibilities are measured at discrete locations $(u_\text{l},v_\text{l})$ , so the brightness distribution is given by the discrete Fourier transform

$\displaystyle B^\text{D}(\xi_\text{m}, \eta_\text{m}) = K \cdot \sum_{\text{l}=0}^{2\cdot n_\text{d}} V(u,v) \cdot S(u,v) \cdot \text{e}^{\text{i}\cdot 2\pi \cdot (u \cdot \xi_\text{m} + v \cdot \eta_\text{m})} \text{,}$

in which $K = \frac{1}{2\cdot n_\text{d} + 1}$ is a normalization constant and $S(u,v)$ is a sampling function which is given by

$\displaystyle S(u,v) = \sum_{\text{l}=0}^{2\cdot n_\text{d}} \delta^2(u - u_\text{l}, v - v_\text{l}) \text{.}$

This sampling function is only non-zero where the visibility is measured in the $(u,v)$ -plane.
To suppress sidelobes, the beam shape of the observing array can be controlled by applying appropriate weights to the measured visibilities. Furthermore, if the observing array consists of telescopes that have different collecting areas $A_\text{eff}$ , and different receivers with different system temperatures $T_\text{sys}$ , frequency bandwidths $\Delta \nu_\text{IF}$ and integration times $\tau_\text{a}$ , one can also apply weights to control these differences. The weights can be written as

$\displaystyle W(u,v) = \sum_{\text{l}=0}^{2\cdot n_\text{d}} R_\text{l} \cdot T_\text{l} \cdot D_\text{l} \cdot \delta^2(u - u_\text{l}, v - v_\text{l}) \text{,}$

in which $R_\text{l}$ accounts for different telescope properties, $T_\text{l}$ is a taper which controls the beam shape and $D_\text{l}$ weights the density of the measured visibilities (more detailed information on the weights, especially on the $D_\text{l}$ -factor, will be given in the next section 2.4.3). With these weights, the brightness distribution can then be written as

$\displaystyle B^\text{D}(\xi_\text{m}, \eta_\text{m}) = K \cdot \sum_{\text{l}=0}^{2\cdot n_\text{d}} V(u,v) \cdot S(u,v) \cdot W(u,v) \cdot \text{e}^{\text{i}\cdot 2\pi \cdot (u \cdot \xi_\text{m} + v \cdot \eta_\text{m})} \text{.}$

Because there is no measurement in the center of the $(u,v)$ -plane, the $l=0$ terms of the dirty image $B^\text{D}(\xi_\text{m}, \eta_\text{m})$ vanish, leading to

$\displaystyle \begin{array}{rl} B^\text{D}(\xi_\text{m}, \eta_\text{m}) & = K \cdot \sum_{\text{l}=0}^{2\cdot n_\text{d}} V_\text{l} \cdot W_\text{l} \cdot \text{e}^{\text{i}\cdot 2\pi \cdot (u_\text{l} \cdot \xi_\text{m} + v_\text{l} \cdot \eta_\text{m})} \\ & = 2 \cdot K \cdot \sum_{\text{l}=1}^{n_\text{d}} W_\text{l} \cdot [ \Re V_\text{l} \cdot \cos(2\pi \cdot (u_\text{l} \cdot \xi_\text{m} + v_\text{l} \cdot \eta_\text{m})) - \Im V_\text{l} \cdot \sin(2\pi \cdot (u_\text{l} \cdot \xi_\text{m} + v_\text{l} \cdot \eta_\text{m}))] \text{,} \end{array}$

in which

$\displaystyle V_\text{l} = V(u,v) \cdot S(u,v) = \Re V_\text{l} + \text{i} \cdot \Im V_\text{l}$

and

$\displaystyle W_\text{l} = W(u,v) \text{.}$

Note that the two additional terms $\Re V_\text{l} \cdot \sin[2\pi \cdot (u_\text{l} \cdot \xi_\text{m} + v_\text{l} \cdot \eta_\text{m})]$ and $\Im V_\text{l} \cdot \cos[2\pi \cdot (u_\text{l} \cdot \xi_\text{m} + v_\text{l} \cdot \eta_\text{m})]$ vanish under the integral of $B(\xi, \eta)$ and can therefore also be neglected for the sum of $B^\text{D}(\xi_\text{m}, \eta_\text{m})$ .

To calculate the sensitivity in the image plane, a point source at the phase and image center is considered. For such a point source, the visibility $V$ is real and constant across the entire $(u,v)$ -plane and only varies due to random noise. Furthermore, because $\xi_\text{m} = \eta_\text{m} = 0$ , the $\sin[2\pi \cdot (u_\text{l} \cdot \xi_\text{m} + v_\text{l} \cdot \eta_\text{m})]$ -term vanishes and $\cos[2\pi \cdot (u_\text{l} \cdot \xi_\text{m} + v_\text{l} \cdot \eta_\text{m})] = 1$ . Therefore, the dirty image of such a point source is given by

$\displaystyle \begin{array}{rl} B^\text{D}(0,0) & = 2 \cdot K \cdot \sum_{\text{l}=1}^{n_\text{d}} W_\text{l} \cdot \Re V_\text{l} \\& = 2 \cdot K \cdot \sum_{\text{l}=1}^{n_\text{d}} W_\text{l} \cdot S_\text{l} \\& = 2 \cdot K \cdot S \cdot \sum_{\text{l}=1}^{n_\text{d}} W_\text{l} \text{,} \end{array}$

in which $S = S_\text{l} = \Re V_\text{l} = \Re V$ is the total flux density of the source. Note that here

$\displaystyle K = \frac{1}{2 \cdot \sum_{\text{l}=1}^{n_\text{d}} W_\text{l}} \text{,}$

so that

$\displaystyle B^\text{D}(0,0) = S \text{.}$

Assuming that the uncertainty of the flux density $\Delta S_\text{l} = \Delta S = \text{const}$ , the noise in the dirty image is given by

$\displaystyle \Delta B^\text{D} = 2 \cdot K \cdot \Delta S \cdot \sqrt{\sum_{\text{l}=1}^{n_\text{d}} W_\text{l}^2} \text{.}$

Furthermore, assuming an observing array that consists of $n_\text{a}$ identical telescopes $(R_\text{l} = 1)$ and a naturally weighted $(D_\text{l} = 1)$ , untapered $(T_\text{l} = 1)$ image, $\Delta B^\text{D}$ simplifies to

$\displaystyle \Delta B^\text{D} = \frac{\Delta S}{\sqrt{n_\text{d}}} \text{.}$
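The simplification for unit weights can be verified numerically. The sketch below evaluates $\Delta B^\text{D} = 2 \cdot K \cdot \Delta S \cdot \sqrt{\sum W_\text{l}^2}$ with $K = 1/(2\sum W_\text{l})$ for arbitrary weights. The function name `dirty_image_noise`, the number of visibilities and the random weights are hypothetical illustration values.

```python
import numpy as np


def dirty_image_noise(weights, delta_s):
    """Noise in the dirty image: 2 K dS sqrt(sum W^2), with K = 1 / (2 sum W)."""
    weights = np.asarray(weights, dtype=float)
    k_norm = 1.0 / (2.0 * weights.sum())
    return 2.0 * k_norm * delta_s * np.sqrt(np.sum(weights**2))


n_d = 400  # hypothetical number of measured visibilities

# Identical weights (R_l = T_l = D_l = 1): reduces to delta_s / sqrt(n_d).
natural = dirty_image_noise(np.ones(n_d), delta_s=1.0)

# Unequal (here random) weights always give a noise at least as large.
rng = np.random.default_rng(0)
unequal = dirty_image_noise(rng.uniform(0.1, 1.0, n_d), delta_s=1.0)

print(natural, unequal)
```

For identical weights the function returns $\Delta S/\sqrt{n_\text{d}}$. Any other set of weights yields a larger value, which is why natural weighting gives the highest sensitivity.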

Since $S = \Re V = \Re V_\text{l}$ , the uncertainty of the flux density $\Delta S$ is given by the rms noise $\sigma_\text{V}$ of the measured visibilities which is, based on the radiometer equation, given by

$\displaystyle \sigma_\text{V} = \frac{\sqrt{2} \cdot k \cdot T_\text{sys}}{A_\text{eff} \cdot \eta_\text{Q} \cdot \sqrt{\Delta \nu_\text{IF} \cdot \tau_\text{a}}} \text{,}$

in which $k$ is the Boltzmann constant and

$\displaystyle \eta_\text{Q} = \frac{\text{digital correlator sensitivity}}{\text{analogue correlator sensitivity}}$

is the quantization efficiency accounting for the quantization noise due to the conversion of analogue signals to digital signals.
Inserting $\Delta S = \sigma_\text{V}$ into $\Delta B^\text{D}$ , the sensitivity for the synthesized image is finally given by

$\displaystyle \Delta B^\text{D} = \frac{\sqrt{2} \cdot k \cdot T_\text{sys}}{A_\text{eff} \cdot \eta_\text{Q} \cdot \sqrt{\frac{n_\text{a} \cdot (n_\text{a} - 1)}{2} \cdot \Delta \nu_\text{IF} \cdot \tau_0}} \text{.}$
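The final sensitivity formula can be evaluated as follows. This is a sketch under stated assumptions: the function names and all array parameters (27 antennas, $T_\text{sys} = 30\,\text{K}$, $A_\text{eff} = 300\,\text{m}^2$, $\eta_\text{Q} = 0.9$, and the integration times) are hypothetical and chosen only for illustration. The conversion $1\,\text{Jy} = 10^{-26}\,\text{W}\,\text{m}^{-2}\,\text{Hz}^{-1}$ is used to express the result in janskys.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K
JANSKY = 1e-26              # 1 Jy in W m^-2 Hz^-1


def visibility_noise(t_sys, a_eff, eta_q, bandwidth, tau_a):
    """rms noise sigma_V of a single visibility in W m^-2 Hz^-1."""
    return (math.sqrt(2) * K_BOLTZMANN * t_sys
            / (a_eff * eta_q * math.sqrt(bandwidth * tau_a)))


def image_noise(n_a, t_sys, a_eff, eta_q, bandwidth, tau_0):
    """Noise of the naturally weighted, untapered image in W m^-2 Hz^-1."""
    n_p = n_a * (n_a - 1) / 2  # number of antenna pairs
    return (math.sqrt(2) * K_BOLTZMANN * t_sys
            / (a_eff * eta_q * math.sqrt(n_p * bandwidth * tau_0)))


# Hypothetical array parameters (illustration only).
n_a, t_sys, a_eff, eta_q = 27, 30.0, 300.0, 0.9
bandwidth, tau_a, tau_0 = 50e6, 10.0, 3600.0

n_d = n_a * (n_a - 1) / 2 * tau_0 / tau_a   # total number of visibilities
sigma_v = visibility_noise(t_sys, a_eff, eta_q, bandwidth, tau_a)
delta_b = image_noise(n_a, t_sys, a_eff, eta_q, bandwidth, tau_0)

print(f"sigma_V = {sigma_v / JANSKY:.3e} Jy, image noise = {delta_b / JANSKY:.3e} Jy")
```

The image noise equals $\sigma_\text{V}/\sqrt{n_\text{d}}$ , so the integration time $\tau_\text{a}$ of the individual samples cancels and only the total integration time $\tau_0$ remains.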

4.3. Sampling, weighting, gridding

Sampling

As previously mentioned, the visibilities are only measured at discrete locations $(u_\text{l}, v_\text{l})$ . Therefore, the measured visibilities are given by the multiplication of the true visibility function $V$ and a sampling function $S$ . Hence, the dirty image $B^\text{D}$ is then given by the convolution of their Fourier transforms:

$\displaystyle B^\text{D} = \mathcal{FT}[V^\text{S}] = \mathcal{FT}[V \cdot S] = \mathcal{FT}[V] \star \star \mathcal{FT}[S] \text{,}$

in which $\mathcal{FT}[x]$ denotes the Fourier transform of $x$ and the double-star $\star \star$ indicates a two-dimensional convolution.

Weighting

Furthermore, as also previously mentioned, the visibilities can be weighted by multiplying the sampled visibilities $V^\text{S}$ by a weighting function $W$ :

$\displaystyle V^\text{W} = W \cdot V^\text{S} = W \cdot V \cdot S \text{.}$

This weighting function $W$ consists of three individual factors:

$R_\text{l}$ accounts for different telescope properties within an array (e.g. different $A_\text{eff}$ , $T_\text{sys}$ , $\Delta \nu$ and $\tau_\text{a}$ ),
$T_\text{l}$ is a taper that controls the beam shape,
$D_\text{l}$ weights the density of the measured visibilities.

The $D_\text{l}$ -factor affects the angular resolution and sensitivity of the array. On the one hand, the highest angular resolution can be achieved by weighting the visibilities as if they had been measured uniformly over the entire $(u,v)$ -plane. Therefore, this weighting scheme is called uniform weighting. Since the density of the measured visibilities is higher at the center of the $(u,v)$ -plane, the visibilities in the outer part are then weighted more strongly relative to natural weighting, which leads to the highest angular resolution. On the other hand, the highest sensitivity is achieved if all measured visibilities are weighted by identical weights. This weighting scheme is called natural weighting. The main properties of both schemes are summarized in the following:

natural weighting:
- $D_\text{l} = 1$
- broader synthesized beam
- highest sensitivity

uniform weighting:
- $D_\text{l} = \frac{1}{n(\text{l})}$ , in which $n(\text{l})$ is the number of visibilities occurring within an area of constant size around the weighted visibility
- highest resolution
- lowest sensitivity
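The two weighting schemes summarized above can be sketched in a few lines. The function name `density_weights` is illustrative. As a simplifying assumption, the "area of constant size around the weighted visibility" is approximated here by fixed square cells of side `cell_size` in the $(u,v)$ -plane, so $n(\text{l})$ is the number of visibilities falling into the same cell.

```python
import numpy as np


def density_weights(u, v, cell_size, scheme="natural"):
    """Return the density weights D_l for natural or uniform weighting.

    Assumption: n(l) is counted in fixed square cells of side cell_size.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    if scheme == "natural":
        return np.ones_like(u)              # D_l = 1
    if scheme != "uniform":
        raise ValueError("scheme must be 'natural' or 'uniform'")

    # Integer cell indices, shifted so that they start at zero.
    iu = np.floor(u / cell_size).astype(int)
    iv = np.floor(v / cell_size).astype(int)
    iu -= iu.min()
    iv -= iv.min()

    # Encode each (iu, iv) pair as one integer key and count the occupants.
    key = iu * (iv.max() + 1) + iv
    _, inverse, counts = np.unique(key, return_inverse=True, return_counts=True)
    return 1.0 / counts[inverse]            # D_l = 1 / n(l)


# Hypothetical sample: three visibilities near the center, one far out.
u = np.array([0.1, 0.2, 0.3, 5.5])
v = np.array([0.1, 0.4, 0.2, 5.5])
d_natural = density_weights(u, v, cell_size=1.0, scheme="natural")
d_uniform = density_weights(u, v, cell_size=1.0, scheme="uniform")
print(d_natural, d_uniform)
```

In this example the three clustered visibilities each receive the weight $1/3$ under uniform weighting, while the isolated outer visibility keeps the weight 1, so the outer part of the $(u,v)$ -plane gains relative influence.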

With this weighting function $W$ , the dirty image $B^\text{D}$ is given by

$\displaystyle B^\text{D} = \mathcal{FT}[V^\text{W}] = \mathcal{FT}[W \cdot V^\text{S}] = \mathcal{FT}[W] \star \star \mathcal{FT}[V^\text{S}] = \mathcal{FT}[W] \star \star (\mathcal{FT}[V] \star \star \mathcal{FT}[S]) \text{.}$

Gridding

To exploit the speed advantage of the fast Fourier transform (FFT) algorithm, the visibilities must be interpolated onto a regular grid of size $2^{\text{M}_\text{V}}\times 2^{\text{N}_\text{V}}\,(\text{e.g.}\,256\times 256,\,512\times 512,\, 1024 \times 1024,\,...)$ , resulting in an image of size $2^{\text{M}_\text{B}}\times 2^{\text{N}_\text{B}}$ . This interpolation, also called gridding, is done by first convolving the weighted discrete visibility $V^\text{W}$ with an appropriate function $C$ to obtain a continuous visibility distribution. This continuous visibility distribution is then resampled at the points of the regular grid with spacings $\Delta u$ and $\Delta v$ by multiplying it by a two-dimensional Shah function $G$ , given by


Fig. 2.32 Illustration of the gridding process. The real $(u,v)$ -tracks measured by the IRAM interferometer at Plateau de Bure in France (black dots) have to be interpolated onto a regular grid of $2^N\times2^N$ pixels (red dots). Taken from: U. Klein

$\displaystyle G(u,v) = \sum_{\text{j} = -\infty}^{\infty} \sum_{\text{k} = -\infty}^{\infty} \delta^2\left(j-\frac{u}{\Delta u}, k-\frac{v}{\Delta v}\right) \text{.}$

After these modifications, the visibility $V^\text{G}$ is given by

$\displaystyle V^\text{G} = G \cdot (C \star \star V^\text{W}) = G \cdot [C \star \star (W \cdot V^\text{S})] = G \cdot [C \star \star (W \cdot V \cdot S)]$

and the dirty image $\tilde{B^\text{D}}$ reads

$\displaystyle \tilde{B^\text{D}} = \mathcal{FT}[G] \star \star (\mathcal{FT}[C] \cdot \mathcal{FT}[V^\text{W}]) = \mathcal{FT}[G] \star \star \{ \mathcal{FT}[C] \cdot [\mathcal{FT}[W] \star \star (\mathcal{FT}[V] \star \star \mathcal{FT}[S])]\} \text{.}$

This gridding process is illustrated in Fig. 2.32, in which real $(u,v)$ -tracks measured by the IRAM interferometer at Plateau de Bure in France are shown as black dotted ellipses. Furthermore, a regular grid, onto which the visibilities have been interpolated, is shown as red dots.

Since the sampling intervals $\Delta u$ and $\Delta v$ in the $(u,v)$ -plane are inversely proportional to the sampling intervals $\Delta \xi$ and $\Delta \eta$ in the image plane ( $\Delta u^{-1} = M\Delta \xi$ and $\Delta v^{-1} = N\Delta \eta$ for a grid of size $M\times N$ ), the maximum map size in one domain is given by the minimum sampling interval in the other. Therefore, it is important to choose appropriate sampling intervals $\Delta u$ and $\Delta v$ . If $\Delta u$ and $\Delta v$ are chosen to be too large for example, this will result in artefacts in the image plane produced by reflections of structures from the map edges, which is called aliasing.
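The reciprocal relation between the two sampling grids can be made concrete with a short calculation. This is a sketch with hypothetical numbers: the grid spacing of 500 wavelengths and the grid size of 512 are illustration values, and the image coordinates are treated as small angles in radians, which is an assumption not stated explicitly in the text.

```python
import math


def image_cell(delta_u, m):
    """Image-plane sampling interval from delta_u^-1 = M * delta_xi."""
    return 1.0 / (m * delta_u)


def map_size(delta_u):
    """Total map size M * delta_xi, fixed by the (u,v) sampling interval."""
    return 1.0 / delta_u


delta_u = 500.0   # hypothetical grid spacing in units of the wavelength
m = 512           # hypothetical grid size along one axis

delta_xi = image_cell(delta_u, m)   # radians (small-angle assumption)
fov = map_size(delta_u)             # radians (small-angle assumption)

to_arcsec = 180.0 / math.pi * 3600.0
print(f"pixel = {delta_xi * to_arcsec:.3f} arcsec, map = {fov * to_arcsec:.1f} arcsec")
```

Doubling $\Delta u$ halves the map size, so emission outside the smaller map is folded back into the image, which is the aliasing described above.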

The most effective way to deal with this aliasing is to use a convolution function $C$ for which the Fourier transform in the image plane $\mathcal{FT}[C]$ decreases rapidly at the image edges and is nearly constant over the image. The simplest choice for the convolution function $C$ is a rectangular function, but with this choice the aliasing would be strongest. A better choice would be a sinc function. However, a Gaussian-sinc function (the product of a Gaussian and a sinc function) leads to the best suppression of aliasing.

Finally, the gridding modifications must be corrected after the Fourier transform by dividing the dirty image $\tilde{B^\text{D}}(\xi_\text{m},\eta_\text{m})$ by the Fourier transform $\mathcal{FT}[C]$ of the gridding convolution function $C(u,v)$ , evaluated in the image plane. Therefore, the so-called grid-corrected image $\tilde{B^\text{D}_\text{C}}(\xi_\text{m},\eta_\text{m})$ is given by

$\displaystyle \tilde{B^\text{D}_\text{C}}(\xi_\text{m},\eta_\text{m}) = \frac{\tilde{B^\text{D}}(\xi_\text{m},\eta_\text{m})}{\mathcal{FT}[C](\xi_\text{m},\eta_\text{m})} \text{.}$

4.4. Bandwidth smearing

As already seen in Chapt. 2 Sect. 4.2, the sensitivity of radio-interferometrical measurements depends on the effective area of the telescopes $A_\text{eff}$ and the bandwidth $\Delta \nu_\text{IF}$ . The larger $\Delta \nu_\text{IF}$ , the higher the sensitivity. However, using a large bandwidth is problematic, since the Fourier relation between brightness distribution and visibility is only valid for monochromatic signals. Therefore, the effects of the finite bandwidth on the images obtained from radio-interferometrical observations, especially the inevitable effect called bandwidth smearing or chromatic aberration, will be investigated in this section, following the textbook of Taylor et al. (1999).

For a single observing frequency $\nu_0$ the brightness distribution is given by

$\displaystyle \tilde{B}(\xi,\eta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\tilde{V}(u_0,v_0)\cdot \text{e}^{2\pi\text{i}\cdot(u_0\xi+v_0\eta)}\text{d}u_0\text{d}v_0\text{,}$

in which the "~" indicates the influence of a bandpass. The frequency-independent coordinates $u_0$ and $v_0$ are given by

$\displaystyle \begin{array}{rcl} u_0 = \frac{\nu_0}{\nu}u & \text{and} & v_0 = \frac{\nu_0}{\nu}v \text{,} \end{array}$

where $u$ and $v$ are the actual spatial coordinates of the visibility at the frequency $\nu$ .

Therefore, using the generalized similarity theorem for Fourier transformations in $n$ dimensions, which reads

$\displaystyle \mathcal{FT}^n[f(\alpha\vec{x})] = \frac{1}{|\alpha|^n}\cdot F\left(\frac{\vec{s}}{\alpha}\right)\text{,}$

the two-dimensional Fourier relation between visibility and brightness distribution is given by

$\displaystyle B(\xi,\eta)=B\left(\xi_0\frac{\nu_0}{\nu},\eta_0\frac{\nu_0}{\nu}\right)\Leftrightarrow\left(\frac{\nu}{\nu_0}\right)^2\cdot V\left(u_0\frac{\nu}{\nu_0},v_0\frac{\nu}{\nu_0}\right)=\left(\frac{\nu}{\nu_0}\right)^2\cdot V(u,v)\text{.}$

Furthermore, since the smeared visibility $\tilde{V}(u_0,v_0)$ is obtained by rescaling and weighting the true visibility $V(u,v)$ by a normalized bandpass function $G_\text{n}(\nu')$ , where $\nu'=\nu-\nu_0$ , and then integrating over the frequency band, another important effect must be taken into account. There will be a delay error of

$\displaystyle \Delta\tau=\frac{u_0\xi+v_0\eta}{\nu_0}$

for signals arriving from a direction $(\xi,\eta)$ at frequency $\nu$ . Therefore, the phase is shifted by

$\displaystyle 2\pi(\nu-\nu_0)\Delta\tau=2\pi\nu'\frac{u_0\xi+v_0\eta}{\nu_0}$

and the smeared visibility is given by

$\displaystyle \tilde{V}(u_0,v_0)=\int_{-\infty}^{\infty}V\left(u_0\frac{\nu}{\nu_0},v_0\frac{\nu}{\nu_0}\right)\left(\frac{\nu}{\nu_0}\right)^2G_\text{n}(\nu')\text{e}^{2\pi\text{i}\frac{\nu'}{\nu_0}(u_0\xi+v_0\eta)}\text{d}\nu'\text{.}$

For simplicity, a point source with unit amplitude located at $(\xi_0,0)$ is assumed without restricting generality. Then, the true visibility is given by

$\displaystyle V(u,v)=\text{e}^{-2\pi\text{i}u\xi_0}$

and the smeared visibility reads

$\displaystyle \tilde{V}(u_0,v_0)=\int_{-\infty}^{\infty}\text{e}^{-2\pi\text{i}u_0\frac{\nu}{\nu_0}\xi_0}\left(\frac{\nu}{\nu_0}\right)^2G_\text{n}(\nu')\text{e}^{2\pi\text{i}\frac{\nu'}{\nu_0}(u_0\xi+v_0\eta)}\text{d}\nu'\text{.}$

Furthermore, assuming that the bandwidth is sufficiently small, so that $(\frac{\nu}{\nu_0})^2\approx1$ (in practice, $0.95\lesssim(\frac{\nu}{\nu_0})^2\lesssim0.998$ ), the bandwidth-smeared brightness distribution is given by

$\displaystyle \begin{array}{rl} \tilde{B}(\xi,\eta) & =\int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty}\text{e}^{-2\pi\text{i}u_0\frac{\nu}{\nu_0}\xi_0}G_\text{n}(\nu')\text{e}^{2\pi\text{i}u_0\frac{\nu'}{\nu_0}\xi}\text{d}\nu'\right]\text{e}^{2\pi\text{i}u_0\xi}\text{d}u_0\delta(\eta) \\ & =\int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty}\text{e}^{-2\pi\text{i}u_0\left(1+\frac{\nu'}{\nu_0}\right)\xi_0}G_\text{n}(\nu')\text{e}^{2\pi\text{i}u_0\frac{\nu'}{\nu_0}\xi}\text{d}\nu'\right]\text{e}^{2\pi\text{i}u_0\xi}\text{d}u_0\delta(\eta) \\ & =\int_{-\infty}^{\infty}\text{e}^{2\pi\text{i}u_0(\xi-\xi_0)}\left[\int_{-\infty}^{\infty}G_\text{n}(\nu')\text{e}^{2\pi\text{i}u_0\frac{\nu'}{\nu_0}(\xi-\xi_0)}\text{d}\nu'\right]\text{d}u_0\delta(\eta)\text{,} \end{array}$

using $\frac{\nu}{\nu_0}=\frac{\nu_0+\nu-\nu_0}{\nu_0}=1+\frac{\nu'}{\nu_0}$ . Here, one can see that the term in square brackets is the Fourier transform of the normalized bandpass function over $\nu'$ , to an argument $\tau=\frac{u_0}{\nu_0}(\xi-\xi_0)$ that represents a delay corresponding to the positional offset $\xi-\xi_0$ . Therefore, it is helpful to define a delay function $d(\tau)$ , given by

$\displaystyle d(\tau) = \int_{-\infty}^{\infty} G_\text{n}(\nu')\text{e}^{2\pi\text{i}\tau\nu'}\text{d}\nu'\text{.}$
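To get a feeling for the delay function, it can be evaluated for a simple bandpass. The sketch below assumes a rectangular bandpass of width $\Delta\nu$ normalized to unit area. Both the rectangular shape and the unit-area normalization are assumptions made for illustration, as are the function name `delay_function` and the numerical values. For this case the integral has the closed form $d(\tau) = \sin(\pi\Delta\nu\tau)/(\pi\Delta\nu\tau)$ , which NumPy provides as `np.sinc(bandwidth * tau)`.

```python
import numpy as np


def delay_function(tau, bandwidth, n_cells=20000):
    """Numerically evaluate d(tau) for a rectangular, unit-area bandpass.

    Uses the midpoint rule on n_cells intervals across the band.
    """
    edges = np.linspace(-bandwidth / 2, bandwidth / 2, n_cells + 1)
    nu_prime = 0.5 * (edges[:-1] + edges[1:])     # midpoints of the cells
    d_nu = bandwidth / n_cells
    g_n = 1.0 / bandwidth                         # normalized bandpass height
    return np.sum(g_n * np.exp(2j * np.pi * tau * nu_prime)) * d_nu


bandwidth = 50e6   # hypothetical bandwidth in Hz
tau = 1e-8         # hypothetical delay in s, so bandwidth * tau = 0.5

d_numeric = delay_function(tau, bandwidth)
d_analytic = np.sinc(bandwidth * tau)   # sin(pi x) / (pi x) with x = 0.5
print(d_numeric.real, d_analytic)
```

The delay function falls from 1 at $\tau = 0$ to its first zero at $\tau = 1/\Delta\nu$ , so a larger bandwidth restricts the undistorted response to smaller delays and hence to smaller offsets from the phase center.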

Using this delay function, the bandwidth-smeared brightness distribution then reads

$\displaystyle \tilde{B}(\xi,\eta)=\int_{-\infty}^{\infty}\text{e}^{2\pi\text{i}u_0(\xi-\xi_0)}d(\tau)\text{d}u_0\delta(\eta)\text{.}$

Fig. 2.33 Illustration of the effect of bandwidth smearing. Two signals that arrive from slightly different directions, $\xi_0$ and $\xi_0-\Delta\xi$, with slightly different frequencies can interfere constructively if their delay difference matches.

Therefore, the bandwidth-smeared brightness distribution is the Fourier transform over $u_0$ of the product of the true visibility with the delay function. Using the convolution theorem, the brightness distribution is given by

$\displaystyle \tilde{B}(\xi,\eta)=\int_{-\infty}^{\infty}\text{e}^{2\pi\text{i}u_0(\xi-\xi_0)}\text{d} u_0\delta(\eta) \star \int_{-\infty}^{\infty}d(\tau)\text{e}^{2\pi\text{i}u_0 \xi}\text{d} u_0\text{,}$

which is the convolution of the true image with a position-dependent bandwidth distortion function $D(\xi)$, the Fourier transform of the delay function $d(\tau)$ over $u_0$. Since this distortion function varies with the radial distance $\xi_0$ from the phase center and is always oriented along the radius to the phase center, the final image of an extended source can be interpreted as a radially dependent convolution.

The effect of finite bandwidth is illustrated in Fig. 2.33. Here it is assumed that the observed emission of a source at position $\xi_0$ interferes constructively exactly at frequency $\nu_0$. A signal from a neighboring position shifted by $\Delta\xi$ with respect to $\xi_0$, measured by one telescope at a frequency deviating from $\nu_0$ by, for example, $\Delta\nu/2$, can then also produce maximum interference with a signal from $\xi_0$ measured by the other telescope at frequency $\nu_0$, if the delay difference matches. Therefore, a source observed with finite bandwidth can also produce structures away from its nominal position.

This finite bandwidth effect cannot be removed, because it is described by a mathematical functional. However, there are two methods to minimize the effects of bandwidth smearing. The first method is to limit the field size by dividing the observed area into subfields and producing a separate image for each subfield. To cover the complete area, the images of the subfields can then be fitted together. This technique is called mosaicing. The second method is to split the full frequency band into narrower sections, each of which is imaged as an individual data set before the resulting images are summed. This method is called bandwidth synthesis.
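As a rough numerical illustration of the delay function and of bandwidth synthesis, the following Python sketch evaluates $d(\tau)$ for an idealized rectangular bandpass of unit area, once for the full band and once for a single narrower sub-band. The rectangular bandpass is an assumption, and the observing frequency, bandwidth, baseline length, positional offset, and number of sub-bands are hypothetical values chosen only for illustration. For this bandpass the integral has the closed form $\sin(\pi\tau\Delta\nu)/(\pi\tau\Delta\nu)$, which is what `np.sinc` evaluates.

```python
import numpy as np


def delay_function(tau, bandwidth, n_samples=4096):
    """Midpoint-rule estimate of d(tau) for a rectangular bandpass.

    The bandpass G_n(nu') equals 1/bandwidth for |nu'| < bandwidth/2
    (unit area) and zero elsewhere. This shape is an assumption.
    """
    dnu = bandwidth / n_samples
    nu = -bandwidth / 2 + dnu * (np.arange(n_samples) + 0.5)
    g_n = np.full(n_samples, 1.0 / bandwidth)
    return np.sum(g_n * np.exp(2j * np.pi * tau * nu)) * dnu


# Hypothetical observing setup (not taken from the lecture).
nu_0 = 1.4e9                      # center frequency in Hz
bandwidth = 50e6                  # full bandwidth in Hz
u_0 = 1.0e5                       # baseline component in wavelengths
offset = np.deg2rad(1.0 / 60.0)   # xi - xi_0 = 1 arcmin, in radians
n_sub = 8                         # number of sub-bands

# Delay corresponding to the positional offset: tau = (u_0 / nu_0) * (xi - xi_0).
tau = u_0 / nu_0 * offset

d_full = delay_function(tau, bandwidth)
d_sub = delay_function(tau, bandwidth / n_sub)

print("tau * bandwidth       :", tau * bandwidth)
print("|d(tau)|, full band   :", abs(d_full))
print("|d(tau)|, one sub-band:", abs(d_sub))
print("closed form, full band:", np.sinc(tau * bandwidth))
```

With these assumed numbers the product $\tau\Delta\nu$ is close to one for the full band, so the response lies near the first null of the sinc function, whereas each sub-band is only mildly attenuated. The sketch ignores the slightly different center frequency of each sub-band.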

4.5. Calibration

The true visibility of a target source $V_\text{ij}^\text{target}(u,v)$ observed by two telescopes $i$ and $j$ of an array is related to the observed visibility $V_\text{ij}^\text{obs}(u,v)$ by

$\displaystyle V_\text{ij}^\text{obs}(u,v)=G_\text{ij}(t)\cdot V_\text{ij}^\text{target}(u,v)\text{,}$

in which $G_\text{ij}(t)$ is the time-variable and complex gain factor of the correlation product of the two telescopes $i$ and $j$ . This gain factor includes variations caused by different effects, e.g., weather effects, effects of the atmospheric paths to the telescopes, and instrumental effects. Therefore, to determine the true visibility, the gain factor has to be calculated. This can be done via frequent observations of calibrator sources with known flux densities $S_\text{cal}$ and accurately known positions. Furthermore, these sources have to be point sources, meaning that their angular size $\Theta_\text{cal}$ is

$\displaystyle \Theta_\text{cal} \ll \frac{\lambda}{D_\text{max}}\text{.}$

Then, the complex gain $G_\text{ij}(t)$ of each interferometer can be determined by

$\displaystyle V_\text{ij}^\text{cal}(u,v)=G_\text{ij}(t)\cdot S_\text{cal}\text{,}$

in which $V_\text{ij}^\text{cal}(u,v)$ is the observed visibility of a calibrator source. This complex gain factor can then be used to calibrate the observed visibility of the target source. The true visibility of a target source $V_\text{ij}^\text{target}(u,v)$ is then given by

$\displaystyle V_\text{ij}^\text{target}(u,v)=\frac{V_\text{ij}^\text{obs}(u,v)}{G_\text{ij}(t)}=V_\text{ij}^\text{obs}(u,v)\cdot\frac{S_\text{cal}}{V_\text{ij}^\text{cal}(u,v)}\text{.}$
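The calibration relation above can be sketched numerically for a single baseline. In the following Python example all numbers (calibrator flux density, gain, and target visibility) are hypothetical, and it is assumed that the gain does not change between the calibrator scan and the target scan.

```python
import numpy as np

# Hypothetical values for one baseline (i, j); none are from the lecture.
s_cal = 5.0                                        # calibrator flux density in Jy
gain_ij = 0.8 * np.exp(1j * np.deg2rad(35.0))      # complex baseline gain G_ij
v_target_true = 1.2 * np.exp(1j * 0.4)             # true target visibility

# Observed visibilities: the same gain multiplies calibrator and target.
v_cal_obs = gain_ij * s_cal
v_target_obs = gain_ij * v_target_true

# Gain estimate from the point-like calibrator, then calibration of the target.
gain_est = v_cal_obs / s_cal
v_target_cal = v_target_obs / gain_est

print("estimated gain     :", gain_est)
print("calibrated target  :", v_target_cal)
print("true target        :", v_target_true)
```

In practice the gain drifts with time, which is why the calibrator has to be observed frequently and the gain interpolated to the times of the target scans.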

Note that the gains $G_\text{ij}$ can also be expressed by the voltage gain factors $g_\text{i}$ and $g_\text{j}$ of the individual i-th and j-th antennas by

$\displaystyle G_\text{ij} = g_\text{i}\cdot g_\text{j}^\star \,\text{.}$

This reduces the amount of calibration data, since there are many more correlated antenna pairs than antennas in large arrays. Furthermore, this makes the calibration procedure more flexible, since, for example, a source that is resolved at the longest baselines of an array can be used to compute the gain factors of the individual antennas from measurements made only at shorter baselines, at which the source appears unresolved.
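The following Python sketch illustrates both points for a hypothetical four-antenna array with noise-free baseline gains. It counts baselines versus antennas, derives the antenna gains from the shorter baselines only, and then predicts the gain on the baseline that was left out. The closed-form triangle relation $|g_\text{j}|^2 = G_\text{ij}G_\text{jk}/G_\text{ik}$ used here follows directly from $G_\text{ij}=g_\text{i}g_\text{j}^\star$, but it is only one simple way to do this and is not necessarily the method used by real calibration software. The antenna gains, the choice of antenna 0 as phase reference, and the choice of baseline (0, 3) as the "long" baseline are assumptions.

```python
import numpy as np
from itertools import combinations


def n_baselines(n_ant):
    """Number of antenna pairs in an array of n_ant antennas."""
    return n_ant * (n_ant - 1) // 2


# Hypothetical antenna voltage gains; antenna 0 is the phase reference.
g_true = np.array([1.00 * np.exp(0.0j), 0.90 * np.exp(0.3j),
                   1.10 * np.exp(-0.5j), 0.95 * np.exp(1.0j)])
n_ant = len(g_true)

# Noise-free baseline gains G_ij = g_i * conj(g_j) for all pairs i < j.
all_gains = {(i, j): g_true[i] * np.conj(g_true[j])
             for i, j in combinations(range(n_ant), 2)}

# Pretend the calibrator is resolved on the longest baseline (0, 3).
long_baseline = (0, 3)
measured = {k: v for k, v in all_gains.items() if k != long_baseline}


def G(i, j):
    """Measured baseline gain, using G_ji = conj(G_ij)."""
    return measured[(i, j)] if i < j else np.conj(measured[(j, i)])


# Amplitudes from triangles that avoid the long baseline:
# |g_j|^2 = G_ij * G_jk / G_ik.
triangles = {0: (1, 2), 1: (0, 2), 2: (0, 1), 3: (1, 2)}
amp = np.array([np.sqrt((G(i, j) * G(j, k) / G(i, k)).real)
                for j, (i, k) in sorted(triangles.items())])

# Phases relative to antenna 0: arg(G_ij) = phi_i - phi_j.
phase = np.zeros(n_ant)
phase[1] = -np.angle(G(0, 1))
phase[2] = -np.angle(G(0, 2))
phase[3] = phase[1] - np.angle(G(1, 3))

g_est = amp * np.exp(1j * phase)
predicted_long = g_est[0] * np.conj(g_est[3])

print("antennas:", n_ant, " baselines:", n_baselines(n_ant))
print("predicted G_03:", predicted_long)
print("true G_03     :", all_gains[long_baseline])
```

For a larger array the difference in the amount of calibration data grows quickly: 27 antennas, for example, form 351 baselines.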

Gibbs' phenomenon


Fig. 2.34 Real and imaginary part of a frequency power spectrum for an idealized rectangular bandpass.

In continuum observations the correlation of the signals is usually measured for zero time lag, meaning that the correlation of the measured signal is averaged over the bandwidth, which is defined via a bandpass. In spectral line observations, measurements at different frequencies across the bandpass are required, which can be implemented by correlating the signals as a function of time lag. Fourier transforming this correlation output yields the so-called cross power spectrum, which is obtained for $K$ frequency channels (for more detailed information see Chapt. 3 Sect. 4).

To calibrate an interferometer over the bandpass, which represents the filter characteristics of the last IF stage, a continuum point source is used to determine the complex gains of the interferometer, which yields the true cross-correlation spectrum of the observed target source. For calibrator sources with a constant continuum spectrum over the bandpass, the behavior of the complex gains $g_\text{i}$ and $g_\text{j}$ across the bandpass $\Delta\nu$ is given by the correlated power as a function of frequency for each interferometer of the observing array. The real and imaginary parts of this correlated power, measured for a continuum source with a flat spectrum, are shown in Fig. 2.34 for an idealized rectangular bandpass.

The observed visibility of an unknown target source is given by

$\displaystyle V_\text{ij}^\text{obs}(\nu) = f(\nu)\star\left[g_\text{i}(\nu)\cdot g_\text{j}^\star(\nu)\cdot V(\nu)\right]\,\text{,}$

in which $f(\nu)$ is the Fourier transform of the truncating function in the frequency domain, which is a sinc function given by

$\displaystyle f(\nu) = \frac{\sin(\pi\cdot\nu\cdot K/\Delta\nu)}{\pi\cdot\nu}\,\text{.}$

Here, $V_\text{ij}^\text{obs}$ is given by a convolution with $f(\nu)$ , since the cross-correlation has only a finite maximum lag of $2K\cdot\Delta\nu$ . The observed visibility $V_\text{ij}^\text{cal}$ of a calibrator source with $|V| = 1$ and $\varphi_\text{V} = 0$ is then given by

$\displaystyle V_\text{ij}^\text{cal}(\nu)= f(\nu)\star\left[g_\text{i}(\nu)\cdot g_\text{j}^\star(\nu)\right]\,\text{.}$

Finally, the true visibility $V_\text{ij}^\text{true}$ of the observed target source is given by

$\displaystyle V_\text{ij}^\text{true}(\nu) = \frac{V_\text{ij}^\text{obs}(\nu)}{V_\text{ij}^\text{cal}(\nu)} = \frac{f(\nu)\star\left[g_\text{i}(\nu)\cdot g_\text{j}^\star(\nu)\cdot V(\nu)\right]}{f(\nu)\star\left[g_\text{i}(\nu)\cdot g_\text{j}^\star(\nu)\right]}\,\text{,}$

in which the gains do not cancel out, due to the convolution. Because of the finite maximum lag, the original frequency power spectrum shown in Fig. 2.34 is convolved with the sinc function $f(\nu)$, which leads to large errors if the target source has a spectral line near $\nu=0$. This is known as Gibbs' phenomenon.

Since the cross-correlation function has a complex power spectrum, Gibbs' phenomenon has strong influence on the imaginary part of the true visibility, because of the ratio in the equation for $V_\text{ij}^\text{true}$ , which leads to non-negligible errors, especially for strong line emission near the step at $\nu=0$ . Since the step in the imaginary part is large, these errors have a strong effect on the phases, producing strong ripples.
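The ripples caused by the finite lag range can be reproduced with a simplified numerical experiment. The Python sketch below is not the full complex cross power spectrum discussed above: it assumes a purely real, idealized rectangular bandpass, transforms it to the lag domain, discards all lags beyond a hypothetical maximum, and transforms back. The number of frequency samples and the maximum lag are arbitrary illustrative choices.

```python
import numpy as np

n_chan = 4096   # finely sampled frequency axis (assumed)
max_lag = 64    # largest lag kept by the hypothetical correlator

# Idealized rectangular bandpass: unity inside the band, zero outside.
spectrum = np.zeros(n_chan)
spectrum[n_chan // 4: 3 * n_chan // 4] = 1.0

# Go to the lag domain. The transform direction is a convention and does
# not change the effect of the truncation.
lag_function = np.fft.fft(spectrum)

# Integer lag index of every FFT bin: 0, 1, ..., -2, -1.
lag_index = np.fft.fftfreq(n_chan, d=1.0 / n_chan)

# Keep only |lag| <= max_lag. This multiplication in the lag domain is
# equivalent to convolving the spectrum with a sinc-like function.
keep = np.abs(lag_index) <= max_lag
observed = np.fft.ifft(lag_function * keep).real

overshoot = observed.max() - 1.0   # ripple above the flat bandpass level
undershoot = -observed.min()       # ripple below zero outside the band

print("overshoot above the bandpass level:", overshoot)
print("undershoot below zero             :", undershoot)
```

The ripples are strongest right next to the two bandpass edges, and increasing `max_lag` moves them closer to the edges without reducing the height of the largest ripple. This is why a spectral line close to such a step is affected most.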

5. Image reconstruction

Learning Objectives:

Image reconstruction using the CLEAN and MEM algorithms
Image defects
Image reconstruction using self-calibration

6. Digital Beamforming

Learning Objectives:

Geometric understanding of an array performing beamforming
Introduction and mathematical concept of weighting elements
Mathematical description of one- and two-dimensional array factors
Visualization of array factors, beam forming and beam steering