mapspec

mapspec stands for Mcmc Algorithm for Parameters of SPECtra. It was originally designed to rescale astronomical spectra to a standard flux scale, using a method similar to that of van Groningen & Wanders 1992. Full details of the method are in Fausnaugh 2017, published in PASP (please cite this paper if you use the code). Along the way, several spectra analysis tools useful for dealing with bright/broad emission lines have been implemented.

Installation

mapspec is a fairly simple python package. Download it with git clone, cd to the mapspec directory, and run

python setup.py sdist
pip install -e ./

This installs to your full system---consider using a virtual environment (virtualenv or conda) or

python setup.py sdist
pip install ./ --user <MAPSPECDIR>

for more fine-grained control.

Alternatively, put the mapspec directory in your PYTHONPATH or your current working directory.

mapspec is compatible with python 2.7 and python 3.6 or later.

To test the installation, go to examples and try running the regression test script:

cd ../mapspec_test
bash run_test.sh

run_test.sh runs the regression test---it runs mapspec on some sample data and checks that the outputs match standardized files in mapspec_test/output_checks. The tests take about 10 minutes to run, and the results can be plotted with mapspec_test/plot_test.py. More details are in mapspec_test/README.

Quick Start

The workhorse script is script/do_map, which is available from the command line after installing mapspec and has a help function.

Here are the data files that you need:

A reference spectrum in 3 column ascii format (wavelength, flux, and flux error).
A file with wavelength regions for the emission line. This should have 3 rows: 1) the wavelength window of the emission line, 2) the wavelength window of the blueward continuum, and 3) the wavelength window of the redward continuum. Each line has 2 numbers: the blue edge of the wavelength region window and the red edge of the wavelength region window.
A list of files with spectra to rescale, 1 file per line. Like the reference, the spectra should be in 3 column ascii format.

Examples of all of these files are in examples/mapspec_test.

do_map runs a list of spectra through the rescaling model. This is the normal modus operandi for reverberation mapping studies, so do_map may have everything that most users need. The output (using the Gauss-Hermite smoothing kernel) will be saved in files called 'scale.h._<input_spectrum_name>'.

To see how do_map works, you can look at the source in mapspec/scripts/do_map.

do_map also does some simple model comparisons (Gaussian smoothing vs. Gauss-Hermite smoothing). The file mapspec.params compares the chi^2 from two different different rescaling models, and two output spectra are saved by default ('scale_<input_spectrum_name>' for Gaussian smoothing and 'scale.h._<input_spectrum_name>' for Gauss-Hermite smoothing).

Outline

mapspec is object-oriented and is designed to be modular and highly extensible. It makes heavy use of numpy and scipy libraries, and was designed to easily integrate with these packages.

There are two main parts of the package: First, general-usage spectrum utilities, which are bundled in mapspec/spectrum.py. Second, the rescaling procedure, which is kept in mapspec/mapspec.py.

Some helpful scripts for constructing a reference spectrum are also provided----these are make_ref and smooth_ref. Check the help menus and internal comments for details. By default, they are installed to your system and available on the command line.

We begin by describing (with illustrative examples) the spectrum utilities.

Spectrum

The primary object is the Spectrum class. Spectrum has 3 fundamental attributes: wavelength (Spectrum.wv), flux (Spectrum.f), and flux error (or measurement error, Spectrum.ef). Each attribute is a numpy array, so all the functionality of numpy/scipy is preserved. For example, to get the mean flux-level:

from mapspec.spectrum import Spectrum
my_spec = Spectrum()
print my_spec.f.mean()

or to get the number of pixels:

print my_spec.wv.size

(to actually put data in the Spectrum object, see "Reading Data" below). Spectrum also has many useful methods to interpolate, rebin, extinction-correct, and smooth (or log-smooth) the data. For example, this will rebin the spectrum to twice the original wavelength spacing:

import numpy as np
dx = my_spec.wv[1] - my_spec.wv[0] #assumes even wavelength spacing throughout
xnew = np.arange(my_spec.wv.min(), my_spec.wv.max() + 2.*dx, 2.*dx)
my_spec.rebin(xnew)

Emission Lines

The next object is the EmissionLine class. EmissionLine inherits from Spectrum, so it can be used in approximately the same way. To instantiate an EmissionLine, you need to specify the line window and surrounding continuum:

from mapspec.spectrum import EmissionLine
oiii = [ [5065, 5115],     #OIII 5007, at redshift 1.018
         [5059, 5065],     #blue continuum
         [5115, 5129]  ]   # red continuum
my_line = EmissionLine(my_spec, oiii[0], [oiii[1],oiii[2]] )

At instantization, EmissionLine fits a linear model to the local continuum (oiii[1] and oiii[2]) and subtracts the model underneath the emission line. That is, my_line.wv is the wavelengths of my_spec between oiii[0,0] and oiii[0,1] and my_line.f is the corresponding section of my_spec.f minus the continuum fit.

EmissionLine also has other useful methods, such as my_line.integrate_line(), which will use scipy.integrate.simps to return the total line flux, or my_line.fwhm(), which will return the full-width-at-half-maximum of the line.

Line Model

The last class is LineModel. At instantization, LineModel fits a function to an input EmissionLine:

from mapspec.spectrum import LineModel
my_lmodel = LineModel(my_line, func='gauss-hermite')

The raison d'être of LineModel is to generate flux predictions at arbitrary wavelengths. This is accomplished by overloading the class's call method, so you can think of LineModel as a kind of function:

xeval = np.linspace( my_line.wv.min(), my_line.wv.max(), 500 )
fpredict = my_lmodel(xeval)

You could use these predictions to, for example, subtract the line from the original spectrum:

mask = ( my_spec.wv > oiii[0,0])*(my_spec.wv < oiii[0,1] )
my_spec.f[mask] -= my_lmodel(my_spec.wv[mask])

If the parameters of the fit are of interest, you can access then with my_lmodel.p (and the covariance matrix, for errors, is stored in my_lmodel.covar), but you will have to check the individual functions/methods to see which parameter is which.

Reading Data

Input from a file is handled with inheritance. If you know what format your data are in, the following classes know how to read these formats and return fully functioning Spectrum objects:

from mapspec.spectrum import TextSpec, TextSpec_2c, FitsSpec

s1 = TextSpec('mockdata.txt')  #for 3 column format
s2 = TestSpec_2c('mockdata_2col.txt') #for 2 columns, assumes no errors
s3 = FitsSpec('mockdata.fits')

FitsSpec contains many keywords in order to deal with the vast diversity of .fits formats:

s3 = FitsSpec('mockdata.fits', extension = 1, data_axis = 1, error_axis = 2,
              x1key = 'CRVAL1',
              dxkey = 'CD1_1',
              p1key = 'CRPIX1'
              )

Two files with test data from NGC 5548 (3 column ascii) are in examples (test.LBT.dat and test.MDM.dat---test.dat is a copy of test.LBT.dat), and 6 spectra for rescaling are in examples/mapspec_test

Rescaling Spectra

The tools for rescaling spectra are in mapspec/mapspec.py. Most of the work is done by the RescaleModel class:

from mapspec.spectrum import *
from mapspec.mapspec import RescaleModel

sref = TextSpec('ref.txt')
lref = EmissionLine(sref, oiii[0], [ oiii[1], oiii[2] ] )

my_model = RescaleModel(lref, kernel = 'Gauss')

my_model will take an input emission line and try to align it with lref. The model consists of a wavelength shift, rescaling factor, and smoothing---the form of the smoothing kernal is specified by the kernel keyword (in the above example, a simple Gaussian).

The parameters of the model are stored in a dictionary, so they are easy to access. If you know what parameters you want, you can immediate apply the model to the data:

#shift by 0.25 x-units, multiply by 2, and smooth with kernel of width 1 x-units
my_model.p = {'shift': 0.25, 'scale': 2.0, 'width': 1.0} 
sout, lost_pix = my_model.output(my_spec)

lost_pix is a boolean array that tells you how much of the original spectrum is lost---you loose at least one pixel on either side for any non-zero shift. If you don't care about the original wavelength grid, it is suggested to change the wavelengths in place (i.e., my_spec.wv -= shift).

Usually, you will want to optimize these parameters so that they match the reference. To do this, a function metro_hast is included:

from mapspec.mapspec import metro_hast
chi2, p_best, frac_accept = metro_hast(5000, my_line, my_model)

This will run an MCMC for 5000 steps, and try to optimize my_model.p so that my_line matches lref (as a reminder, lref was specified when instantiating my_model---any RescaleModel object should probably have its own reference line). chi2 is the best (minimum) chi^2, p_best are the corresponding parameters, and frac_accept is the acceptance fraction of the chain.

You then apply the model as above:

my_model.p = p_best
sout,lost_pix =  my_model.output(my_spec)

metro_hast has some other useful options:

chi2, p_best, frac_accept, p_chain = metro_hast(5000, my_line, my_model, keep = True)

This will return the MCMC chain of parameters in p_chain. p_chain is actually its own object (mapspec.mapspec.Chain), but it is little more than a glorified NxM array that knows how to plot itself (it also stores the log-likelihood for each set of parameters in a separate attribute). Also

plt.ion()
chi2, p_best, frac_accept, p_chain = metro_hast(5000, my_line, my_model, plot = True)

will plot the MCMC chain as it runs.

Details

Warning on Interpolation

At the moment, only linear interpolation supports a rigorous error propagation. This includes calculating the covariance matrix due to both interpolation and smoothing when rescaling the data.

Other interpolation methods are supported---any 'kind' keyword for scipy.interpolate.interp1d can be used. There is also an implementation of bsplines using scipy.interpolate, and a custom built sinc interpolator (with a Lanczos window by default).

These alternative methods do not support full error propagation. For now, they will perform the same operation as called for on the flux spectrum, but on the square of the error spectrum (i.e., the variance). This usually results in errors similar to the input and roughly preserves the fractional uncertainty. However, we note that this is not strictly correct, and may be a feature that we improve in the future.

Calculation of the Likelihood and Fitting Procedure

mapspec aligns the data and the model by minimzing: (data - model)^2/error^2, where data is the input line, model is the reference value chosen at instantization, and error^2 = (reference error)^2 + (data error)^2. In other words, mapspec takes a maximum likelihood approach, assuming normally distributed residuals and independent uncertainties on the data and reference.

Because of wavelength shifts and edge-effects from convolution, there can be problems aligning the edges of the fitting region---in order to censor these points, it would be necessary to change the amount of data, and therefore the number of degrees-of-freedom, during the fit. To prevent this while mitigating the edge-effects, mapspec ignores 10% of the data (5% on each end) when calculating the likelihood. Although this is not an optimal solution, it does produces reasonable results and appears to be a good compromise between correctness, simplicity, and practicability.

It is therefore suggested to choose the fitting region (line wavelengths) slightly larger than would naively be expected for isolating the emission line flux. It is also suggested to remove large shifts (greater than 1 pixel) before performing the fit. A convenience function mapspec.mapspec.get_cc is provided to cross correlate two spectra---see do_map.py for an example.

In the Bayesian framework, priors are always added into the likelihood calculation. mapspec assumes uniform priors on all parameters, so the priors do not explicitly enter into the calculation of the log-likelihood. Note that a flat prior is informative for the case of the scaling parameter---however, because the rescaling is a factor of a few (not many orders of magnitude), this situation makes little difference.

When using a Gauss-Hermite Kernel, h3 and h4 are constrained to be > -0.3 and < 0.3; experimentation shows that this range is sufficient to produce a wide range of emission line shapes. The finite interval is enforced by setting the prior probability to 0 outside of this interval. Similarly, strange behavior will occur if the kernel width drops below the spectral resolution (a kind of undersampling/aliasing effect). A minimum kernel width is imposed at 1/2 of the wavelength spacing of the reference data.

RescaleModel objects have the option to add priors, either through a function provided by the user or from posterior distributions stored in Chain objects. See do_map.py for an example.

Methods

Spectrum

Object	Method	args	Description
`Spectrum`	`__init__`	`style = 'linear'`	Change the interpolation style at instantization (implicitly calls `set_interp`).
`Spectrum`	`restore`		Restore attributes (`Spectrum.wv`,`Spectrum.f`, and `Spectrum.ef`) to values when spectrum was instantiated.
`Spectrum`	`interp`	`xnew`	Returns interpolated flux and flux errors at location of array `xnew`.
`Spectrum`	`set_interp`	kwargs	Change interpolation style (see `style` below). Keywords for custom b-splines and sinc interpolation are also provided here.
`Spectrum`	`rebin`	`xnew`	Rebins the spectrum to wavelengths `xnew`. Modifies `Spectrum.wv`,`Spectrum.f`, and `Spectrum.ef`.
`Spectrum`	`extinction_correct`	`E_BV`, `RV`	De-extinguish the spectrum for a CCM89 extinction law, for a given E(B-V) (`E_BV`) and R_V (`RV`). Modifies `Spectrum.f`, and `Spectrum.ef`.
`Spectrum`	`smooth`	`width`,`name`	Smooths the spectrum with a kernel of width `width` specified with `name` (calculated with `scipy.signal.get_window`). Modifies `Spectrum.f`, and `Spectrum.ef`. Edge-effects are treated by replacing with the original spectrum (a number of pixels equal to `width` on both edges).
`Spectrum`	`velocity_smooth`	`v_width`	Smooths the spectrum with a velocity dispersion of width `v_wdith` (in km/s) by rebinning evenly in log wavelength. Modifies `Spectrum.f`, and `Spectrum.ef`. Converts back to original wavelengths, but cannot support sinc interpolation (unevenly spaced wavelengths).
`Spectrum`	`wv`	Attribute	numpy array of wavelengths.
`Spectrum`	`f`	Attribute	numpy array of fluxes.
`Spectrum`	`ef`	Attribute	numpy array of flux errors.
`Spectrum`	`style`	Attribute	String with instructions for `set_interp`. Options are anything supported by `scipy.interpolate.interp1d`. Also available is a custom sinc interpolator (`style = 'sinc'`), which uses convolution in real space and a Lanczos window to prevent ringing (very slow). Bsplines (`style= 'bspline'`) are also available (using a custom class to interface with `splrep` and `splev` from `scipy.interpolate` ).

EmissionLine

Object	Method	args	Description
`EmissionLine`	`__init__`	`Spec`, `window`, `cwindow`	`Spec` is a `Spectrum` object from which to extract the `EmissionLine`. `window` gives the wavelength region of the line (inclusive, blue edge first). `cwindow` is a 2x2 list/array that specifies the continuum. Each row is a continuum region (exclusive), which must have the blue edge first, but the blue and red regions may be in any order.
`EmissionLine`	`integrate_line`		Returns total line flux using `scipy.integrate.simps`.
`EmissionLine`	`equivalent_width`		Returns integrated flux divided by mean continuum.
`EmissionLine`	`wv_mean`		Returns first moment of the line.
`EmissionLine`	`wv_median`		Returns median wavelength (wavelength that splits the integrated flux in 2).
`EmissionLine`	`dispersion`		Returns second moment of the line.
`EmissionLine`	`mad`		Returns Mean Absolute Deviation.
`EmissionLine`	`fwhm`	`center`	Returns a full-width-at-half-maximum, following the procedure in Peterson et al. 2004; `center` is the central wavelength, to check if the line is double-peaked. Actually returns a tuple with fwhm, width on blue side, width on red-side, and a boolean for which `True` indicates double-peaked.
`EmissionLine`	`ip_wv`	`frac`	Returns inter-percentile region in wavelength units. The inter-percentile region contains `frac` of the total flux, where `frac` is between 0 and 1. Actually returns a tuple with the inter-percentile region, the corresponding blue wavelength, and the corresponding red wavelength.
`EmissionLine`	`ipv`	`frac`,`center`	Returns the inter-percentile region in velocity units. `frac` is defined as above, `center` is used to convert to velocities. Actually returns a tuple with the inter-pecentile region, blue width, and red width. Assumes km/s.
`EmissionLine`	`wv`	Attribute	numpy array of wavelengths (subset of `Spec.wv` within `window`).
`EmissionLine`	`f`	Attribute	numpy array of fluxes (matching subset of `Spec.f` minus continuum flux).
`EmissionLine`	`ef`	Attribute	numpy array of flux errors (as `EmissionLine.f`, but error propagation from the continuum fit is included).
`EmissionLine`	`cf`	Attribute	Continuum flux underneath the line.
`EmissionLine`	`ecf`	Attribute	Error on continuum flux. This is added in quadrature to the original measurement errors in the line wavelength region.

LineModel

Object	Method	args	Description
`LineModel`	`__init__`	`EmLine`,`func='gaussian'`,`window=None`,`nline = 1`,`floating=False`,`linedata=None`	`EmLine` is an `EmissionLine` to which the model is fit. `func` is a string that specifies the functional form to fit to the line; can be 'gaussian', 'gauss-hermite', 'lorentzian', 'approx-voig' (sum of lorentzian and gaussian), or 'data' (empirical template). `window` is a boolean array that masks the input `EmissionLine` (`True` indicates that the pixel is included in the fit). `nline` indicates the number of lines (components) used in the fit. `floating=True` adds a parameter to the model that allows a constant offset or 'pedestal' flux. `linedata` is an `EmissionLine` that serves as a template if `func = 'data'`. See examples in `examples/test_linefit.py`.
`LineModel`	`__call__`	`x`	Returns fitted function evaluated at input `x` wavelengths (i.e., flux-predictions, `x` is an array).
`LineModel`	`chi2`		Returns the value of chi^2 from the initial fit.
`LineModel`	`plot_components`	`x`,`ax`,`cuse`	Plots individual components (if `nline` is > 1) on axis `ax` in color `cuse` at abcissas `x`.
`LineModel`	`p`	Attribute	Parameters of the fitted model.
`LineModel`	`covar`	Attribute	Covariance matrix of fitted parameters.
`LineModel`	`wv`	Attribute	Saves `EmLine.wv`.
`LineModel`	`lf`	Attribute	Saves `EmLine.f`.
`LineModel`	`elf`	Attribute	Saves `EmLine.ef`.

Name		Name	Last commit message	Last commit date
Latest commit History 56 Commits
examples		examples
mapspec		mapspec
scripts		scripts
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

examples

examples

mapspec

mapspec

scripts

scripts

.gitignore

.gitignore

CHANGELOG.md

CHANGELOG.md

LICENSE

LICENSE

README.md

README.md

setup.py

setup.py

Repository files navigation

mapspec

Installation

Quick Start

Outline

Spectrum

Emission Lines

Line Model

Reading Data

Rescaling Spectra

Details

Warning on Interpolation

Calculation of the Likelihood and Fitting Procedure

Methods

Spectrum

EmissionLine

LineModel

About

Releases

Packages

Languages

License

mmfausnaugh/mapspec

Folders and files

Latest commit

History

Repository files navigation

mapspec

Installation

Quick Start

Outline

Spectrum

Emission Lines

Line Model

Reading Data

Rescaling Spectra

Details

Warning on Interpolation

Calculation of the Likelihood and Fitting Procedure

Methods

Spectrum

EmissionLine

LineModel

About

Resources

License

Stars

Watchers

Forks

Languages