Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use effective focal length to compute src-dep parameters #1131

Merged
merged 13 commits into from
Jul 23, 2023
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
173 changes: 119 additions & 54 deletions lstchain/reco/dl1_to_dl2.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
"""

import os
import logging

import astropy.units as u
import joblib
Expand All @@ -31,6 +32,9 @@

from ctapipe.image.hillas import camera_to_shower_coordinates
from ctapipe.instrument import SubarrayDescription
from ctapipe_io_lst import OPTICS

logger = logging.getLogger(__name__)

__all__ = [
'apply_models',
Expand All @@ -43,6 +47,7 @@
'train_energy',
'train_reco',
'train_sep',
'update_disp_with_effective_focal_length'
]


Expand All @@ -67,15 +72,15 @@
features = config['energy_regression_features']
model = RandomForestRegressor

print("Given features: ", features)
print("Number of events for training: ", train.shape[0])
print("Training Random Forest Regressor for Energy Reconstruction...")
logger.info("Given features: ", features)
logger.info("Number of events for training: ", train.shape[0])
logger.info("Training Random Forest Regressor for Energy Reconstruction...")

reg = model(**energy_regression_args)
reg.fit(train[features],
train['log_mc_energy'])

print("Model {} trained!".format(model))
logger.info("Model {} trained!".format(model))
return reg


Expand Down Expand Up @@ -105,16 +110,16 @@
features = config['disp_regression_features']
model = RandomForestRegressor

print("Given features: ", features)
print("Number of events for training: ", train.shape[0])
print("Training model {} for disp vector regression".format(model))
logger.info("Given features: ", features)
logger.info("Number of events for training: ", train.shape[0])
logger.info("Training model {} for disp vector regression".format(model))

Check warning on line 115 in lstchain/reco/dl1_to_dl2.py

View check run for this annotation

Codecov / codecov/patch

lstchain/reco/dl1_to_dl2.py#L113-L115

Added lines #L113 - L115 were not covered by tests

reg = model(**disp_regression_args)
x = train[features]
y = np.transpose([train[f] for f in predict_features])
reg.fit(x, y)

print("Model {} trained!".format(model))
logger.info("Model {} trained!".format(model))

Check warning on line 122 in lstchain/reco/dl1_to_dl2.py

View check run for this annotation

Codecov / codecov/patch

lstchain/reco/dl1_to_dl2.py#L122

Added line #L122 was not covered by tests

return reg

Expand All @@ -139,16 +144,16 @@
features = config['disp_regression_features']
model = RandomForestRegressor

print("Given features: ", features)
print("Number of events for training: ", train.shape[0])
print("Training model {} for disp norm regression".format(model))
logger.info("Given features: ", features)
logger.info("Number of events for training: ", train.shape[0])
logger.info("Training model {} for disp norm regression".format(model))

reg = model(**disp_regression_args)
x = train[features]
y = np.transpose(train[predict_feature])
reg.fit(x, y)

print("Model {} trained!".format(model))
logger.info("Model {} trained!".format(model))

return reg

Expand All @@ -173,16 +178,16 @@
features = config["disp_classification_features"]
model = RandomForestClassifier

print("Given features: ", features)
print("Number of events for training: ", train.shape[0])
print("Training model {} for disp sign classification".format(model))
logger.info("Given features: ", features)
logger.info("Number of events for training: ", train.shape[0])
logger.info("Training model {} for disp sign classification".format(model))

clf = model(**classification_args)
x = train[features]
y = np.transpose(train[predict_feature])
clf.fit(x, y)

print("Model {} trained!".format(model))
logger.info("Model {} trained!".format(model))

return clf

Expand Down Expand Up @@ -211,24 +216,24 @@
disp_features = config['disp_regression_features']
model = RandomForestRegressor

print("Given energy_features: ", energy_features)
print("Number of events for training: ", train.shape[0])
print("Training Random Forest Regressor for Energy Reconstruction...")
logger.info("Given energy_features: ", energy_features)
logger.info("Number of events for training: ", train.shape[0])
logger.info("Training Random Forest Regressor for Energy Reconstruction...")

Check warning on line 221 in lstchain/reco/dl1_to_dl2.py

View check run for this annotation

Codecov / codecov/patch

lstchain/reco/dl1_to_dl2.py#L219-L221

Added lines #L219 - L221 were not covered by tests

reg_energy = model(**energy_regression_args)
reg_energy.fit(train[energy_features],
train['log_mc_energy'])

print("Random Forest trained!")
print("Given disp_features: ", disp_features)
print("Training Random Forest Regressor for disp_norm Reconstruction...")
logger.info("Random Forest trained!")
logger.info("Given disp_features: ", disp_features)
logger.info("Training Random Forest Regressor for disp_norm Reconstruction...")

Check warning on line 229 in lstchain/reco/dl1_to_dl2.py

View check run for this annotation

Codecov / codecov/patch

lstchain/reco/dl1_to_dl2.py#L227-L229

Added lines #L227 - L229 were not covered by tests

reg_disp = RandomForestRegressor(**disp_regression_args)
reg_disp.fit(train[disp_features],
train['disp_norm'])

print("Random Forest trained!")
print("Done!")
logger.info("Random Forest trained!")
logger.info("Done!")

Check warning on line 236 in lstchain/reco/dl1_to_dl2.py

View check run for this annotation

Codecov / codecov/patch

lstchain/reco/dl1_to_dl2.py#L235-L236

Added lines #L235 - L236 were not covered by tests
return reg_energy, reg_disp


Expand All @@ -253,16 +258,16 @@
features = config["particle_classification_features"]
model = RandomForestClassifier

print("Given features: ", features)
print("Number of events for training: ", train.shape[0])
print("Training Random Forest Classifier for",
logger.info("Given features: ", features)
logger.info("Number of events for training: ", train.shape[0])
logger.info("Training Random Forest Classifier for",
"Gamma/Hadron separation...")

clf = model(**classification_args)

clf.fit(train[features],
train['mc_type'])
print("Random Forest trained!")
logger.info("Random Forest trained!")
return clf


Expand Down Expand Up @@ -342,17 +347,29 @@
df_gamma = pd.read_hdf(filegammas, key=dl1_params_lstcam_key)
df_proton = pd.read_hdf(fileprotons, key=dl1_params_lstcam_key)

# Update parameters related to target direction on camera frame for gamma MC
# taking into account of the abrration effect using effective focal length
try:
subarray_info = SubarrayDescription.from_hdf(filegammas)
tel_id = config["allowed_tels"][0] if "allowed_tels" in config else 1
effective_focal_length = subarray_info.tel[tel_id].optics.effective_focal_length
except OSError:
logger.warning("subarray table is not readable because of the version inompatibility.")
logger.warning("Use the effective focal lentgh for the standard LST optics")
effective_focal_length = OPTICS.effective_focal_length

Check warning on line 359 in lstchain/reco/dl1_to_dl2.py

View check run for this annotation

Codecov / codecov/patch

lstchain/reco/dl1_to_dl2.py#L356-L359

Added lines #L356 - L359 were not covered by tests

df_gamma = update_disp_with_effective_focal_length(df_gamma, effective_focal_length = effective_focal_length)

if config['source_dependent']:
# if source-dependent parameters are already in dl1 data, just read those data
# if not, source-dependent parameters are added here
if dl1_params_src_dep_lstcam_key in get_dataset_keys(filegammas):
src_dep_df_gamma = get_srcdep_params(filegammas)

else:
subarray_info = SubarrayDescription.from_hdf(filegammas)
tel_id = config["allowed_tels"][0] if "allowed_tels" in config else 1
focal_length = subarray_info.tel[tel_id].optics.equivalent_focal_length
src_dep_df_gamma = get_source_dependent_parameters(df_gamma, config, focal_length=focal_length)
src_dep_df_gamma = get_source_dependent_parameters(
df_gamma, config, effective_focal_length=effective_focal_length
)

df_gamma = pd.concat([df_gamma, src_dep_df_gamma['on']], axis=1)

Expand All @@ -362,10 +379,18 @@
src_dep_df_proton = get_srcdep_params(fileprotons)

else:
subarray_info = SubarrayDescription.from_hdf(fileprotons)
tel_id = config["allowed_tels"][0] if "allowed_tels" in config else 1
focal_length = subarray_info.tel[tel_id].optics.equivalent_focal_length
src_dep_df_proton = get_source_dependent_parameters(df_proton, config, focal_length=focal_length)
try:
subarray_info = SubarrayDescription.from_hdf(fileprotons)
tel_id = config["allowed_tels"][0] if "allowed_tels" in config else 1
effective_focal_length = subarray_info.tel[tel_id].optics.effective_focal_length
except OSError:
logger.warning("subarray table is not readable because of the version inompatibility.")
logger.warning("Use the effective focal lentgh for the standard LST optics")
effective_focal_length = OPTICS.effective_focal_length

Check warning on line 389 in lstchain/reco/dl1_to_dl2.py

View check run for this annotation

Codecov / codecov/patch

lstchain/reco/dl1_to_dl2.py#L386-L389

Added lines #L386 - L389 were not covered by tests

src_dep_df_proton = get_source_dependent_parameters(
df_proton, config, effective_focal_length=effective_focal_length
)

df_proton = pd.concat([df_proton, src_dep_df_proton['on']], axis=1)

Expand Down Expand Up @@ -546,7 +571,7 @@
reg_disp_vector=None,
reg_disp_norm=None,
cls_disp_sign=None,
focal_length=28 * u.m,
effective_focal_length=29.30565 * u.m,
custom_config=None,
):
"""
Expand All @@ -572,7 +597,7 @@
cls_disp_sign: string | Path | bytes | sklearn.ensemble.RandomForestClassifier
Path to the random forest filename or file or pre-loaded RandomForestClassifier object
for disp sign reconstruction
focal_length: `astropy.unit`
effective_focal_length: `astropy.unit`
custom_config: dictionary
Modified configuration to update the standard one

Expand All @@ -597,6 +622,13 @@
+ config['disp_classification_features'],
)

# Update parameters related to target direction on camera frame for MC data
# taking into account of the abrration effect using effective focal length
is_simu = 'disp_norm' in dl2.columns
if is_simu:
dl2 = update_disp_with_effective_focal_length(dl2, effective_focal_length = effective_focal_length)


# Reconstruction of Energy and disp_norm distance
if isinstance(reg_energy, (str, bytes, Path)):
reg_energy = joblib.load(reg_energy)
Expand Down Expand Up @@ -664,7 +696,7 @@
dl2.y.values * u.m,
dl2.reco_disp_dx.values * u.m,
dl2.reco_disp_dy.values * u.m,
focal_length,
effective_focal_length,
alt_tel * u.rad,
az_tel * u.rad)

Expand Down Expand Up @@ -698,7 +730,7 @@
return dl2


def get_source_dependent_parameters(data, config, focal_length=28 * u.m):
def get_source_dependent_parameters(data, config, effective_focal_length=29.30565 * u.m):
"""Get parameters dict for source-dependent analysis.

Parameters
Expand All @@ -714,8 +746,9 @@
else:
data_type = 'real_data'

expected_src_pos_x_m, expected_src_pos_y_m = get_expected_source_pos(data, data_type, config,
focal_length=focal_length)
expected_src_pos_x_m, expected_src_pos_y_m = get_expected_source_pos(
data, data_type, config, effective_focal_length=effective_focal_length
)

src_dep_params = calc_source_dependent_parameters(data, expected_src_pos_x_m, expected_src_pos_y_m)
src_dep_params_dict = {'on': src_dep_params}
Expand Down Expand Up @@ -765,7 +798,7 @@
return src_dep_params


def get_expected_source_pos(data, data_type, config, focal_length=28 * u.m):
def get_expected_source_pos(data, data_type, config, effective_focal_length=29.30565 * u.m):
"""Get expected source position for source-dependent analysis .

Parameters
Expand All @@ -782,16 +815,8 @@

# For proton MC, nominal source position is one written in config file
if data_type == 'mc_proton':
expected_src_pos = utils.sky_to_camera(
u.Quantity(data['mc_alt_tel'].values + config['mc_nominal_source_x_deg'], u.deg, copy=False),
u.Quantity(data['mc_az_tel'].values + config['mc_nominal_source_y_deg'], u.deg, copy=False),
focal_length,
u.Quantity(data['mc_alt_tel'].values, u.deg, copy=False),
u.Quantity(data['mc_az_tel'].values, u.deg, copy=False)
)

expected_src_pos_x_m = expected_src_pos.x.to_value(u.m)
expected_src_pos_y_m = expected_src_pos.y.to_value(u.m)
expected_src_pos_x_m = np.tan(np.deg2rad(config['mc_nominal_source_x_deg'])) * effective_focal_length
expected_src_pos_y_m = np.tan(np.deg2rad(config['mc_nominal_source_y_deg'])) * effective_focal_length

# For real data
if data_type == 'real_data':
Expand All @@ -816,7 +841,7 @@
obstime = Time(time, scale='utc', format='unix')
pointing_alt = u.Quantity(data['alt_tel'], u.rad, copy=False)
pointing_az = u.Quantity(data['az_tel'], u.rad, copy=False)
source_pos = utils.radec_to_camera(source_coord, obstime, pointing_alt, pointing_az, focal_length)
source_pos = utils.radec_to_camera(source_coord, obstime, pointing_alt, pointing_az, effective_focal_length)

expected_src_pos_x_m = source_pos.x.to_value(u.m)
expected_src_pos_y_m = source_pos.y.to_value(u.m)
Expand All @@ -827,3 +852,43 @@
)

return expected_src_pos_x_m, expected_src_pos_y_m


def update_disp_with_effective_focal_length(data, effective_focal_length=29.30565 * u.m):
"""Update disp parameters using effective focal length

Parameters
----------
data: Pandas DataFrame
config: dictionnary containing configuration
"""

source_pos_in_camera = utils.sky_to_camera(
u.Quantity(data['mc_alt'].values, u.rad, copy=False),
u.Quantity(data['mc_az'].values, u.rad, copy=False),
effective_focal_length,
u.Quantity(data['mc_alt_tel'].values, u.rad, copy=False),
u.Quantity(data['mc_az_tel'].values, u.rad, copy=False)
)

expected_src_pos_x_m = source_pos_in_camera.x.to_value(u.m)
expected_src_pos_y_m = source_pos_in_camera.y.to_value(u.m)

data['src_x'] = expected_src_pos_x_m
data['src_y'] = expected_src_pos_y_m

disp_dx, disp_dy, disp_norm, disp_angle, disp_sign = disp.disp(
data['x'].values,
data['y'].values,
expected_src_pos_x_m,
expected_src_pos_y_m,
data['psi'].values
)

data['disp_dx'] = disp_dx
data['disp_dy'] = disp_dy
data['disp_norm'] = disp_norm
data['disp_angle'] = disp_angle
data['disp_sign'] = disp_sign

return data
17 changes: 12 additions & 5 deletions lstchain/scripts/lstchain_add_source_dependent_parameters.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@

import pandas as pd
from ctapipe.instrument import SubarrayDescription
from ctapipe_io_lst import OPTICS

from lstchain.io import (
get_standard_config,
Expand Down Expand Up @@ -61,11 +62,17 @@
pass

dl1_params = pd.read_hdf(dl1_filename, key=dl1_params_lstcam_key)
subarray_info = SubarrayDescription.from_hdf(dl1_filename)
tel_id = config["allowed_tels"][0] if "allowed_tels" in config else 1
focal_length = subarray_info.tel[tel_id].optics.equivalent_focal_length

src_dep_df = pd.concat(get_source_dependent_parameters(dl1_params, config, focal_length=focal_length), axis=1)
try:
subarray_info = SubarrayDescription.from_hdf(dl1_filename)
tel_id = config["allowed_tels"][0] if "allowed_tels" in config else 1
effective_focal_length = subarray_info.tel[tel_id].optics.effective_focal_length
except OSError:
print("subarray table is not readable because of the version inompatibility.")
print("Use the effective focal lentgh for the standard LST optics")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typos: inompatibility, lentgh

Also, better "I will use the effective..." (otherwise it seems it is asking the user for some change )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The effective focal length for the standard LST optics will be used"

effective_focal_length = OPTICS.effective_focal_length

Check warning on line 72 in lstchain/scripts/lstchain_add_source_dependent_parameters.py

View check run for this annotation

Codecov / codecov/patch

lstchain/scripts/lstchain_add_source_dependent_parameters.py#L69-L72

Added lines #L69 - L72 were not covered by tests

src_dep_df = pd.concat(get_source_dependent_parameters(
dl1_params, config, effective_focal_length=effective_focal_length), axis=1)
Copy link
Collaborator

@moralejo moralejo Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I already commented about this. In

def get_expected_source_pos(data, data_type, config, effective_focal_length=29.30565 * u.m):

for gamma mc the returned nominal position on the camera is (I think) obtained with nominal focal length (pre-calculated):

if data_type == 'mc_gamma':
expected_src_pos_x_m = data['src_x'].values
expected_src_pos_y_m = data['src_y'].values

Wait, it seems the values src_x, src_y finally used in building the models and in applying them are not directly those from dl1 files, but are recalculated with effective_focal_length, so I think it is probably working.

On the other hand, lstchain_add_source_dependent_parameters.py will keep the src_x, src_y that are found in the dl1 file (calculated with "equivalent f" for existing v0.9 files).

I think that what is misleading is the behaviour of get_expected_source_pos:

get_expected_source_pos(data, data_type, config, effective_focal_length=29.30565 * u.m)

I would expect the line above to always recalculate the source position using the focal length that is passed. But for MC gammas it just returns whatever is already stored in "data"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src_x and src_y is updated using effective focal length here (I thought this was done before get_expected_source_pos function, but it is not the case... I will update the code)

dl2 = update_disp_with_effective_focal_length(dl2, effective_focal_length = effective_focal_length)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said, I think you have to update get_expected_source_pos so that it recalculates the position, not just propagates...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, it is true. Then I will modify codes to always recalculate the source position inside this function.
This function is only used for source-dep analysis. But, it would be better to recalculate the source position somewhere else also for the source-indep analysis addressing this comment, correct?
#1131 (comment)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, it is true. Then I will modify codes to always recalculate the source position inside this function. This function is only used for source-dep analysis. But, it would be better to recalculate the source position somewhere else also for the source-indep analysis addressing this comment, correct? #1131 (comment)

Yes, for source-indep we need that the target for direction RF (true source position) is also in "true camera coordinates", meaning that it contains the effect of aberration (just like the shower images).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see.

For lstchain_add_source_dependent_parameters.py, I'd even remove this script since source-dep parameters are also computed during dl1_to_dl2 step if src-dep parameters don't exist.
This script overwrites the existing DL1 data to add source-dependent data frame, so it doesn't make sense for the current data analysis scheme since we analyze the common DL1 data produced by OSA.


metadata = global_metadata()
write_dataframe(src_dep_df, dl1_filename, dl1_params_src_dep_lstcam_key, config=config, meta=metadata)
Expand Down
Loading
Loading