Repo to compute CIV "distances" as introduced in Rivera et al. (2020)/Richards et al. (2021) and further discussed in Rivera+2021 (in prep).
Note that results presented here are based on CIV Equivalent Width (EW) and Blueshift definitions given in Rankine et al. 2020; measurements of EW and blueshift are derived from the noiseless ICA spectral reconstructions presented therein. Usage of noisier EW and blueshift measurements (e.g., directly from an SDSS catalog) will likely yield slightly different results.
Dependencies
pip install numpy
pip install joblib
pip install -U scikit-learn
The code you'll need is in CIVfunctions.py. To compute CIV distances for your dataset:
from CIVfunctions import project, CIV_distance
- Load your CIV blueshift and log10(EW) data (see note 1) into an N-by-2 numpy array, data.
- Load in the line of best fit in data/bestfit.npy as fit.
- Compute the CIV distances and save them:
civ_distances = CIV_distance(data, fit)
See example.ipynb for a walkthrough.
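As a minimal sketch of the first step, assuming you start from separate 1-D arrays of blueshifts (km/s) and linear EWs (Angstroms), the N-by-2 input could be assembled with NumPy as shown here. The helper name build_civ_input and the column order (blueshift first, log10(EW) second) are assumptions made for illustration; check example.ipynb for the layout the repo actually expects.

```python
import numpy as np

def build_civ_input(blueshift, ew):
    """Stack blueshift and log10(EW) into an N-by-2 array (hypothetical helper)."""
    blueshift = np.asarray(blueshift, dtype=float)
    ew = np.asarray(ew, dtype=float)
    if blueshift.ndim != 1 or blueshift.shape != ew.shape:
        raise ValueError("blueshift and ew must be 1-D arrays of equal length")
    if np.any(ew <= 0):
        raise ValueError("EW must be positive to take log10")
    # Assumed column order: [blueshift, log10(EW)]
    return np.column_stack([blueshift, np.log10(ew)])

# Illustrative values only, not real measurements
data = build_civ_input([500.0, 2500.0], [100.0, 10.0])
```

The resulting array is what you would then pass, together with fit, to CIV_distance.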
Note 1: If you wish to work with linearly scaled EWs instead, simply set the parameter logEW=False. This will compute CIV distance in a scaled EW-blueshift (rather than logEW-blueshift) space. However, we advise against this, as the CIV distance metric is much more meaningful at low EWs when using the default logarithmic scale.
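One way to see why the logarithmic scale gives low EWs more room is a small arithmetic illustration. It is not part of the repo, and the EW values are made up: doubling the EW produces the same separation in log10(EW) whether it happens at 5 or at 55 Angstroms, whereas on a linear scale the low-EW pair sits eleven times closer together.

```python
import numpy as np

# Illustrative EW values in Angstroms (assumed, not from any catalog)
ew = np.array([5.0, 10.0, 55.0, 110.0])

# Separation of the low-EW pair and the high-EW pair on a linear scale
linear_gap_low = ew[1] - ew[0]
linear_gap_high = ew[3] - ew[2]

# The same separations on a log10 scale; both equal log10(2)
log_gap_low = np.log10(ew[1]) - np.log10(ew[0])
log_gap_high = np.log10(ew[3]) - np.log10(ew[2])
```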
After calling CIV_distance(data, fit), the basic method for computing CIV distances for a given dataset is:
- Convert the data from "raw" to "scaled" space. Because the range of CIV blueshift (~ -1000 to 5000 km/s) is much greater than that of CIV EW (~ 5-110 Angstroms), scaling ensures that each parameter contributes equally to the CIV distance. Scaling models are saved in scalers/ and ensure that CIV distance calculations will be the same no matter the input dataset. Note that CIV_distance() assumes scalers/ is in your current working directory.
- Project the scaled blueshift+EW data onto the scaled best-fit curve. (Note that points beyond either endpoint of the curve will get projected onto the respective endpoint.)
data = project(data, fit)
- To compute a distance for a given data point, start at the upper left of the curve and travel from point to point (see note 2) along the best-fit curve, summing your distance traveled as you go, until you pass the data point you're looking for. Since the curve is monotonically decreasing, once your y-location on the curve falls below the data point's projected y-location, save the total distance traveled.
import numpy as np

# Start at the tip of the line and sum the distance traveled until passing each data point
darr = []  # distances along the best-fit line, one per data point
for scat in data:  # scat is the [x, y] location of a data point (projected onto the curve)
    d = 0  # start at the beginning of the line
    for i in range(fit.shape[0] - 1):
        xp, x = fit[i, 0], fit[i + 1, 0]
        yp, y = fit[i, 1], fit[i + 1, 1]
        dp = d  # distance traveled up to the start of this segment
        d += np.sqrt((x - xp)**2 + (y - yp)**2)
        if yp >= scat[1] >= y:  # passed the projected y-coord: save the distance to this segment's midpoint
            darr.append((d + dp) / 2)
            break
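The projection and summation steps above can also be sketched in vectorized form. This is not the repo's project or CIV_distance: the helper arc_length_along_curve is hypothetical, it assumes the inputs are already in scaled space and that consecutive curve vertices are distinct, and it interpolates within the nearest segment rather than taking the segment midpoint as the loop above does.

```python
import numpy as np

def arc_length_along_curve(points, curve):
    """Arc length along a polyline to the closest point on it, per input point.

    points: (N, 2) array; curve: (M, 2) array of ordered, distinct vertices.
    Hypothetical helper for illustration only.
    """
    points = np.asarray(points, dtype=float)
    curve = np.asarray(curve, dtype=float)
    starts, ends = curve[:-1], curve[1:]
    seg = ends - starts                                     # (M-1, 2) segment vectors
    seg_len = np.hypot(seg[:, 0], seg[:, 1])                # (M-1,) segment lengths
    cum_len = np.concatenate([[0.0], np.cumsum(seg_len)])   # length up to each vertex

    # Fractional position t of each point's projection on each segment,
    # clipped to [0, 1] so projections stay on the segment
    rel = points[:, None, :] - starts[None, :, :]           # (N, M-1, 2)
    t = np.einsum("nmk,mk->nm", rel, seg) / seg_len**2
    t = np.clip(t, 0.0, 1.0)

    # Closest point on every segment and its distance from the input point
    closest = starts[None, :, :] + t[..., None] * seg[None, :, :]
    dist = np.linalg.norm(points[:, None, :] - closest, axis=2)

    best = np.argmin(dist, axis=1)                          # nearest segment per point
    rows = np.arange(points.shape[0])
    return cum_len[best] + t[rows, best] * seg_len[best]

# Toy decreasing curve and points (illustrative, not the repo's best-fit line)
curve = np.array([[0.0, 2.0], [1.0, 1.0], [2.0, 0.0]])
pts = np.array([[0.0, 2.0], [1.0, 1.0], [5.0, -5.0]])
d = arc_length_along_curve(pts, curve)
```

Because of the clipping, a point lying beyond the last vertex is assigned the full curve length, which mirrors the endpoint behavior described for the projection step.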
Note 2: You don't actually have to travel from point to point. The step parameter sets the increment between points and can save a lot of time; see example.ipynb.
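To illustrate why a larger increment can be cheap in accuracy, here is a sketch of stride-based subsampling on a synthetic curve. How the repo's step parameter is implemented is not shown in this README, so subsample_curve and polyline_length are hypothetical helpers and the curve is invented for the example.

```python
import numpy as np

def subsample_curve(curve, step):
    """Keep every step-th vertex, always retaining the last one (hypothetical helper)."""
    curve = np.asarray(curve)
    if step < 1:
        raise ValueError("step must be >= 1")
    idx = np.arange(0, curve.shape[0], step)
    if idx[-1] != curve.shape[0] - 1:
        idx = np.append(idx, curve.shape[0] - 1)  # keep the curve's endpoint
    return curve[idx]

def polyline_length(curve):
    """Total length of a polyline given as an (M, 2) array of vertices."""
    seg = np.diff(curve, axis=0)
    return np.hypot(seg[:, 0], seg[:, 1]).sum()

# Synthetic, monotonically decreasing curve (not the repo's best-fit line)
x = np.linspace(0.0, 1.0, 1001)
fine = np.column_stack([x, 1.0 - x**2])
coarse = subsample_curve(fine, 10)  # ten times fewer segments to walk

length_fine = polyline_length(fine)
length_coarse = polyline_length(coarse)
```

For a smooth curve like this one, the two lengths differ only slightly while the inner loop visits a tenth as many segments.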
For another visualization of how CIV "distance" changes throughout the EW-blueshift plane, hover over the points in this plot: http://physics.drexel.edu/~tmccafferey/CIV_distance_example.html
This work was supported in part by NASA through a grant from the Space Telescope Science Institute (HST-AR-15048.001-A), which is operated by the Association of Universities for Research in Astronomy, Incorporated, under NASA contract NAS5-26555.