API Reference: Torch Choice
data
choice_dataset
The dataset object for managing large-scale consumer choice datasets.
Please refer to the documentation and tutorials for more details on using ChoiceDataset.
Author: Tianyu Du
Update: Apr. 27, 2022
ChoiceDataset (Dataset)
Source code in torch_choice/data/choice_dataset.py
class ChoiceDataset(torch.utils.data.Dataset):
    def __init__(self,
                 item_index: torch.LongTensor,
                 label: Optional[torch.LongTensor] = None,
                 user_index: Optional[torch.LongTensor] = None,
                 session_index: Optional[torch.LongTensor] = None,
                 item_availability: Optional[torch.BoolTensor] = None,
                 **kwargs) -> None:
        """
        Initialization method for the dataset object. Researchers should supply all information about the dataset
        using this initialization method.

        The number of choice instances is called `batch_size` in the documentation. The `batch_size` corresponds to the
        file length of a wide-format dataset and is often denoted `N`. We call it `batch_size` to follow the convention
        in the machine learning literature.
        A `choice instance` is a row of the dataset, so there are `batch_size` choice instances in each `ChoiceDataset`.

        The dataset consists of:
        (1) a collection of `batch_size` tuples (item_id, user_id, session_id, label), where each tuple is a choice instance.
        (2) a collection of `observables` associated with item, user, session, etc.

        Args:
            item_index (torch.LongTensor): a tensor of shape (batch_size) indicating the relevant item in each row
                of the dataset. The relevant item can be:
                (1) the item bought in this choice instance,
                (2) or the item reviewed by the user. In the latter case, we need the `label` tensor to specify the rating score.
                    NOTE: support for the second case is under development; currently only binary labels are supported.

            label (Optional[torch.LongTensor], optional): a tensor of shape (batch_size) indicating the label for prediction in
                each choice instance. If you want to predict the item bought, you can leave the `label` argument
                as `None` in the initialization method, and the model will use `item_index` as the object to be predicted.
                But if you are, for example, predicting the rating a user gave an item, label must be provided.
                Defaults to None.

            user_index (Optional[torch.LongTensor], optional): a tensor of shape num_purchases (batch_size) indicating
                the ID of the user who was involved in each choice instance. If no user index is provided (`None`), it is
                assumed that all choice instances are from the same user.
                `user_index` is required if and only if there are multiple users in the dataset, for example:
                    (1) user observables are involved in the utility form,
                    (2) and/or the coefficient is user-specific.
                This tensor is used to select the corresponding user observables and coefficients assigned to the
                user (like theta_user) for making predictions for that purchase.
                Defaults to None.

            session_index (Optional[torch.LongTensor], optional): a tensor of shape num_purchases (batch_size) indicating
                the ID of the session when that choice instance occurred. This tensor is used to select the correct
                session observables or price observables for making predictions for that choice instance. Therefore, if
                there are no session/price observables, you can leave this argument as `None`. In this case, the `ChoiceDataset`
                object will assume each choice instance to be in its own session.
                Defaults to None.

            item_availability (Optional[torch.BoolTensor], optional): a boolean tensor of shape (num_sessions, num_items)
                indicating the availability of each item in each session. Utilities of unavailable items are set to -infinity,
                so their predicted probabilities are 0.
                We assume all items are available if set to None.
                Defaults to None.

        Other Kwargs (Observables):
            One can specify the following types of observables, where * in shape denotes any positive
                integer. Typically * represents the number of observables.
            Please refer to the documentation for a detailed guide to using observables.
            1. user observables must start with 'user_' and have shape (num_users, *)
            2. item observables must start with 'item_' and have shape (num_items, *)
            3. session observables must start with 'session_' and have shape (num_sessions, *)
            4. taste observables (those that vary by user and item) must start with `taste_` and have shape
                (num_users, num_items, *).
            NOTE: we don't recommend using taste observables, because num_users * num_items is potentially large.
            5. price observables (those that vary by session and item) must start with `price_` and have
                shape (num_sessions, num_items, *)
        """
        # ENHANCEMENT(Tianyu): add item_names for summary.
        super(ChoiceDataset, self).__init__()
        self.label = label
        self.item_index = item_index
        self.user_index = user_index
        self.session_index = session_index

        if self.session_index is None:
            # if any([x.startswith('session_') or x.startswith('price_') for x in kwargs.keys()]):
            # if any session sensitive observable is provided, but session index is not,
            # infer each row in the dataset to be a session.
            # TODO: (design choice) should we assign unique session index to each choice instance or the same session index.
            print('No `session_index` is provided, assume each choice instance is in its own session.')
            self.session_index = torch.arange(len(self.item_index)).long()

        self.item_availability = item_availability

        for key, item in kwargs.items():
            setattr(self, key, item)

        # TODO: add a validation procedure to check the consistency of the dataset.

    def __getitem__(self, indices: Union[int, torch.LongTensor]) -> "ChoiceDataset":
        """Retrieves samples corresponding to the provided index or list of indices.

        Args:
            indices (Union[int, torch.LongTensor]): a single integer index or a tensor of indices.

        Returns:
            ChoiceDataset: a subset of the dataset.
        """
        if isinstance(indices, int):
            # convert single integer index to an array of indices.
            indices = torch.LongTensor([indices])
        new_dict = dict()
        new_dict['item_index'] = self.item_index[indices].clone()

        # copy optional attributes.
        new_dict['label'] = self.label[indices].clone() if self.label is not None else None
        new_dict['user_index'] = self.user_index[indices].clone() if self.user_index is not None else None
        new_dict['session_index'] = self.session_index[indices].clone() if self.session_index is not None else None
        # item_availability has shape (num_sessions, num_items), no need to re-index it.
        new_dict['item_availability'] = self.item_availability

        # copy other attributes.
        for key, val in self.__dict__.items():
            if key not in new_dict.keys():
                if torch.is_tensor(val):
                    new_dict[key] = val.clone()
                else:
                    new_dict[key] = copy.deepcopy(val)
        return self._from_dict(new_dict)

    def __len__(self) -> int:
        """Returns number of samples in this dataset.

        Returns:
            int: length of the dataset.
        """
        return len(self.item_index)

    def __contains__(self, key: str) -> bool:
        return key in self.keys

    def __eq__(self, other: "ChoiceDataset") -> bool:
        """Returns whether all tensor attributes of both ChoiceDatasets are equal."""
        if not isinstance(other, ChoiceDataset):
            raise TypeError('You can only compare with ChoiceDataset objects.')
        else:
            flag = True
            for key, val in self.__dict__.items():
                if torch.is_tensor(val):
                    # ignore NaNs while comparing.
                    if not torch.equal(torch.nan_to_num(val), torch.nan_to_num(other.__dict__[key])):
                        print('Attribute {} is not equal.'.format(key))
                        flag = False
            return flag

    @property
    def device(self) -> str:
        """Returns the device of the dataset.

        Returns:
            str: the device of the dataset.
        """
        for attr in self.__dict__.values():
            if torch.is_tensor(attr):
                return attr.device

    @property
    def num_users(self) -> int:
        """Returns number of users involved in this dataset, returns 1 if there is no user identity.

        Returns:
            int: the number of users involved in this dataset.
        """
        # query from user_index
        if self.user_index is not None:
            return len(torch.unique(self.user_index))
        else:
            return 1

        # for key, val in self.__dict__.items():
        #     if torch.is_tensor(val):
        #         if self._is_user_attribute(key) or self._is_taste_attribute(key):
        #             return val.shape[0]
        # return 1

    @property
    def num_items(self) -> int:
        """Returns the number of items involved in this dataset.

        Returns:
            int: the number of items involved in this dataset.
        """
        return len(torch.unique(self.item_index))

        # for key, val in self.__dict__.items():
        #     if torch.is_tensor(val):
        #         if self._is_item_attribute(key):
        #             return val.shape[0]
        #         elif self._is_taste_attribute(key) or self._is_price_attribute(key):
        #             return val.shape[1]
        # return 1

    @property
    def num_sessions(self) -> int:
        """Returns the number of sessions involved in this dataset.

        Returns:
            int: the number of sessions involved in this dataset.
        """
        return len(torch.unique(self.session_index))

        # if self.session_index is None:
        #     return 1

        # for key, val in self.__dict__.items():
        #     if torch.is_tensor(val):
        #         if self._is_session_attribute(key) or self._is_price_attribute(key):
        #             return val.shape[0]
        # return 1

    @property
    def x_dict(self) -> Dict[object, torch.Tensor]:
        """Formats attributes in this dataset into shape (num_sessions, num_items, num_params) and returns them in a dictionary format.
        Models in this package expect this dictionary-based data format.

        Returns:
            Dict[object, torch.Tensor]: a dictionary with attribute names in the dataset as keys, and reshaped attribute
                tensors as values.
        """
        out = dict()
        for key, val in self.__dict__.items():
            if self._is_attribute(key):  # only include attributes.
                out[key] = self._expand_tensor(key, val)  # reshape to (num_sessions, num_items, num_params).
        return out

    @classmethod
    def _from_dict(cls, dictionary: Dict[str, torch.tensor]) -> "ChoiceDataset":
        """Creates an instance of ChoiceDataset from a dictionary of arguments.

        Args:
            dictionary (Dict[str, torch.tensor]): a dictionary with keys as argument names and values as arguments.

        Returns:
            ChoiceDataset: the created copy of dataset.
        """
        dataset = cls(**dictionary)
        for key, item in dictionary.items():
            setattr(dataset, key, item)
        return dataset

    def apply_tensor(self, func: callable) -> "ChoiceDataset":
        """This is a helper method to apply the provided function to all tensors and tensor values of all dictionaries.

        Args:
            func (callable): a callable function to be applied on tensors and tensor-values of dictionaries.

        Returns:
            ChoiceDataset: the modified dataset.
        """
        for key, item in self.__dict__.items():
            if torch.is_tensor(item):
                setattr(self, key, func(item))
            # broadcast func to dictionary of tensors as well.
            elif isinstance(getattr(self, key), dict):
                for obj_key, obj_item in getattr(self, key).items():
                    if torch.is_tensor(obj_item):
                        setattr(getattr(self, key), obj_key, func(obj_item))
        return self

    def to(self, device: Union[str, torch.device]) -> "ChoiceDataset":
        """Moves all tensors in this dataset to the specified PyTorch device.

        Args:
            device (Union[str, torch.device]): the destination device.

        Returns:
            ChoiceDataset: the modified dataset on the new device.
        """
        return self.apply_tensor(lambda x: x.to(device))

    def clone(self) -> "ChoiceDataset":
        """Creates a copy of self.

        Returns:
            ChoiceDataset: a copy of self.
        """
        dictionary = {}
        for k, v in self.__dict__.items():
            if torch.is_tensor(v):
                dictionary[k] = v.clone()
            else:
                dictionary[k] = copy.deepcopy(v)
        return self.__class__._from_dict(dictionary)

    def _check_device_consistency(self) -> None:
        """Checks if all tensors in this dataset are on the same device.

        Raises:
            Exception: an exception is raised if not all tensors are on the same device.
        """
        # assert all tensors are on the same device.
        devices = list()
        for val in self.__dict__.values():
            if torch.is_tensor(val):
                devices.append(val.device)
        if len(set(devices)) > 1:
            raise Exception(f'Found tensors on different devices: {set(devices)}.',
                            'Use dataset.to() method to align devices.')

    def _size_repr(self, value: object) -> List[int]:
        """A helper method to get the string-representation of object sizes, this is helpful while constructing the
        string representation of the dataset.

        Args:
            value (object): an object to examine its size.

        Returns:
            List[int]: list of integers representing the size of the object, length of the list is equal to dimension of `value`.
        """
        if torch.is_tensor(value):
            return list(value.size())
        elif isinstance(value, int) or isinstance(value, float):
            return [1]
        elif isinstance(value, list) or isinstance(value, tuple):
            return [len(value)]
        else:
            return []

    def __repr__(self) -> str:
        """A method to get a string representation of the dataset.

        Returns:
            str: the string representation of the dataset.
        """
        info = [
            f'{key}={self._size_repr(item)}' for key, item in self.__dict__.items()]
        return f"{self.__class__.__name__}({', '.join(info)}, device={self.device})"

    # ==================================================================================================================
    # methods for checking attribute categories.
    # ==================================================================================================================
    @staticmethod
    def _is_item_attribute(key: str) -> bool:
        return key.startswith('item_') and (key != 'item_availability') and (key != 'item_index')

    @staticmethod
    def _is_user_attribute(key: str) -> bool:
        return key.startswith('user_') and (key != 'user_index')

    @staticmethod
    def _is_session_attribute(key: str) -> bool:
        return key.startswith('session_') and (key != 'session_index')

    @staticmethod
    def _is_taste_attribute(key: str) -> bool:
        return key.startswith('taste_')

    @staticmethod
    def _is_price_attribute(key: str) -> bool:
        return key.startswith('price_')

    def _is_attribute(self, key: str) -> bool:
        return self._is_item_attribute(key) \
            or self._is_user_attribute(key) \
            or self._is_session_attribute(key) \
            or self._is_taste_attribute(key) \
            or self._is_price_attribute(key)

    def _expand_tensor(self, key: str, val: torch.Tensor) -> torch.Tensor:
        """Expands attribute tensor to (num_sessions, num_items, num_params) shape for prediction tasks. This method
        won't reshape the tensor at all if the `key` (i.e., name of the tensor) suggests it's not an attribute of any kind.

        Args:
            key (str): name of the attribute used to determine the raw shape of the tensor. For example, 'item_obs' means
                the raw tensor is in shape (num_items, num_params).
            val (torch.Tensor): the attribute tensor to be reshaped.

        Returns:
            torch.Tensor: the reshaped tensor with shape (num_sessions, num_items, num_params).
        """
        if not self._is_attribute(key):
            print(f'Warning: the input key {key} is not an attribute of the dataset, will NOT modify the provided tensor.')
            # don't expand non-attribute tensors, if any.
            return val

        num_params = val.shape[-1]
        if self._is_user_attribute(key):
            # user_attribute (num_users, *)
            out = val[self.user_index, :].view(
                len(self), 1, num_params).expand(-1, self.num_items, -1)
        elif self._is_item_attribute(key):
            # item_attribute (num_items, *)
            out = val.view(1, self.num_items, num_params).expand(
                len(self), -1, -1)
        elif self._is_session_attribute(key):
            # session_attribute (num_sessions, *)
            out = val[self.session_index, :].view(
                len(self), 1, num_params).expand(-1, self.num_items, -1)
        elif self._is_taste_attribute(key):
            # taste_attribute (num_users, num_items, *)
            out = val[self.user_index, :, :]
        elif self._is_price_attribute(key):
            # price_attribute (num_sessions, num_items, *)
            out = val[self.session_index, :, :]

        assert out.shape == (len(self), self.num_items, num_params)
        return out

device: str (property, readonly)
Returns the device of the dataset.
Returns:
| Type | Description |
| --- | --- |
| str | the device of the dataset. |

num_items: int (property, readonly)
Returns the number of items involved in this dataset.
Returns:
| Type | Description |
| --- | --- |
| int | the number of items involved in this dataset. |

num_sessions: int (property, readonly)
Returns the number of sessions involved in this dataset.
Returns:
| Type | Description |
| --- | --- |
| int | the number of sessions involved in this dataset. |

num_users: int (property, readonly)
Returns number of users involved in this dataset, returns 1 if there is no user identity.
Returns:
| Type | Description |
| --- | --- |
| int | the number of users involved in this dataset. |

x_dict: Dict[object, torch.Tensor] (property, readonly)
Formats attributes in this dataset into shape (num_sessions, num_items, num_params) and returns them in a dictionary format. Models in this package expect this dictionary-based data format.
Returns:
| Type | Description |
| --- | --- |
| Dict[object, torch.Tensor] | a dictionary with attribute names in the dataset as keys, and reshaped attribute tensors as values. |
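
As a rough illustration of the reshaping behind x_dict, the sketch below mimics it with nested Python lists instead of tensors. The helper names `expand_item_obs` and `expand_user_obs` are hypothetical and not part of torch_choice. Note that `_expand_tensor` in the source listing asserts that the leading dimension equals the number of choice instances, `len(dataset)`, which is what this sketch follows.

```python
from typing import List

Matrix = List[List[float]]


def expand_item_obs(item_obs: Matrix, batch_size: int) -> List[Matrix]:
    """Repeat a (num_items, num_params) table once per choice instance."""
    return [[row[:] for row in item_obs] for _ in range(batch_size)]


def expand_user_obs(user_obs: Matrix, user_index: List[int], num_items: int) -> List[Matrix]:
    """Look up each instance's user row, then repeat that row once per item."""
    return [[user_obs[u][:] for _ in range(num_items)] for u in user_index]


item_obs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 items, 2 params each
user_obs = [[10.0], [20.0]]                       # 2 users, 1 param each
user_index = [0, 1, 1, 0]                         # 4 choice instances

# Both results have shape (4 instances, 3 items, num_params).
x_item = expand_item_obs(item_obs, batch_size=len(user_index))
x_user = expand_user_obs(user_obs, user_index, num_items=len(item_obs))
```

The real implementation uses `view` and `expand` on tensors, which avoids copying; the list version copies rows only to keep the example dependency-free.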

__eq__(self, other) (special)
Returns whether all tensor attributes of both ChoiceDatasets are equal.
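
The source listing compares tensors after passing them through `torch.nan_to_num`, whose default replaces NaN with 0. A minimal, dependency-free sketch of that idea on plain lists follows; `nan_to_zero` and `equal_ignoring_nan` are hypothetical names used only for illustration.

```python
import math
from typing import List


def nan_to_zero(values: List[float]) -> List[float]:
    """Replace every NaN with 0.0, leaving other values untouched."""
    return [0.0 if math.isnan(v) else v for v in values]


def equal_ignoring_nan(a: List[float], b: List[float]) -> bool:
    """Compare two sequences after NaNs have been replaced by 0.0."""
    return nan_to_zero(a) == nan_to_zero(b)


same = equal_ignoring_nan([1.0, float('nan')], [1.0, float('nan')])
nan_vs_zero = equal_ignoring_nan([float('nan')], [0.0])
different = equal_ignoring_nan([1.0], [2.0])
```

One consequence of this approach is that a NaN in one dataset and a literal 0 at the same position in the other are treated as equal.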

__getitem__(self, indices) (special)
Retrieves samples corresponding to the provided index or list of indices.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| indices | Union[int, torch.LongTensor] | a single integer index or a tensor of indices. | required |
Returns:
| Type | Description |
| --- | --- |
| ChoiceDataset | a subset of the dataset. |
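
To make the subsetting behaviour concrete, here is a simplified sketch using a plain dictionary of lists in place of a ChoiceDataset. It assumes, as in the source listing, that only the per-instance index fields are re-indexed while observables are carried over whole. The function `subset` is hypothetical; the real method also clones tensors and passes item_availability through unchanged.

```python
import copy
from typing import Dict, List, Union

PER_INSTANCE_KEYS = ('item_index', 'label', 'user_index', 'session_index')


def subset(dataset: Dict[str, object], indices: Union[int, List[int]]) -> Dict[str, object]:
    """Select rows of per-instance fields; copy everything else unchanged."""
    if isinstance(indices, int):
        indices = [indices]  # a single index becomes a one-element selection
    out: Dict[str, object] = {}
    for key, val in dataset.items():
        if key in PER_INSTANCE_KEYS and val is not None:
            out[key] = [val[i] for i in indices]
        else:
            out[key] = copy.deepcopy(val)
    return out


data = {
    'item_index': [2, 0, 1, 2],
    'label': None,
    'user_index': [0, 0, 1, 1],
    'session_index': [0, 1, 2, 3],
    'item_obs': [[0.1], [0.2], [0.3]],  # (num_items, 1), not re-indexed
}
sub = subset(data, [1, 3])
single = subset(data, 0)
```

Because session_index keeps its original values, a subset can still look up the right rows in session-level or price-level observables.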

__init__(self, item_index, label=None, user_index=None, session_index=None, item_availability=None, **kwargs) (special)
Initialization method for the dataset object. Researchers should supply all information about the dataset using this initialization method.
The number of choice instances is called batch_size in the documentation. The batch_size corresponds to the file length of a wide-format dataset and is often denoted N. We call it batch_size to follow the convention in the machine learning literature.
A choice instance is a row of the dataset, so there are batch_size choice instances in each ChoiceDataset.
The dataset consists of:
(1) a collection of batch_size tuples (item_id, user_id, session_id, label), where each tuple is a choice instance.
(2) a collection of observables associated with item, user, session, etc.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| item_index | torch.LongTensor | a tensor of shape (batch_size) indicating the relevant item in each row of the dataset. The relevant item can be: (1) the item bought in this choice instance, or (2) the item reviewed by the user. In the latter case, the label tensor is needed to specify the rating score. NOTE: support for the second case is under development; currently only binary labels are supported. | required |
| label | Optional[torch.LongTensor] | a tensor of shape (batch_size) indicating the label for prediction in each choice instance. If you want to predict the item bought, you can leave the label argument as None, and the model will use item_index as the object to be predicted. If you are, for example, predicting the rating a user gave an item, label must be provided. | None |
| user_index | Optional[torch.LongTensor] | a tensor of shape num_purchases (batch_size) indicating the ID of the user who was involved in each choice instance. If no user index is provided (None), it is assumed that all choice instances are from the same user. user_index is required if and only if there are multiple users in the dataset, for example: (1) user observables are involved in the utility form, and/or (2) the coefficient is user-specific. This tensor is used to select the corresponding user observables and coefficients assigned to the user (like theta_user) for making predictions for that purchase. | None |
| session_index | Optional[torch.LongTensor] | a tensor of shape num_purchases (batch_size) indicating the ID of the session when that choice instance occurred. This tensor is used to select the correct session observables or price observables for making predictions for that choice instance. Therefore, if there are no session/price observables, you can leave this argument as None. In this case, the ChoiceDataset object will assume each choice instance to be in its own session. | None |
| item_availability | Optional[torch.BoolTensor] | a boolean tensor of shape (num_sessions, num_items) indicating the availability of each item in each session. Utilities of unavailable items are set to -infinity, so their predicted probabilities are 0. We assume all items are available if set to None. | None |
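
The effect of item_availability can be sketched as follows. This example assumes that choice probabilities come from a softmax over utilities, which is an assumption made here for illustration and is not stated on this page; `choice_probabilities` is a hypothetical helper, and at least one item is assumed to be available.

```python
import math
from typing import List


def choice_probabilities(utilities: List[float], available: List[bool]) -> List[float]:
    """Softmax over utilities, with unavailable items masked to -infinity."""
    masked = [u if a else -math.inf for u, a in zip(utilities, available)]
    peak = max(masked)  # subtract the maximum for numerical stability
    weights = [math.exp(u - peak) for u in masked]  # exp(-inf) evaluates to 0.0
    total = sum(weights)
    return [w / total for w in weights]


# Item 1 has the highest utility but is unavailable in this session.
probs = choice_probabilities([1.0, 2.0, 0.5], [True, False, True])
```

Masking the utility rather than zeroing the probability afterwards keeps the remaining probabilities normalized without a separate rescaling step.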
Other Kwargs (Observables):
One can specify the following types of observables, where * in shape denotes any positive integer. Typically * represents the number of observables.
Please refer to the documentation for a detailed guide to using observables.
1. user observables must start with 'user_' and have shape (num_users, *)
2. item observables must start with 'item_' and have shape (num_items, *)
3. session observables must start with 'session_' and have shape (num_sessions, *)
4. taste observables (those that vary by user and item) must start with 'taste_' and have shape (num_users, num_items, *).
   NOTE: we don't recommend using taste observables, because num_users * num_items is potentially large.
5. price observables (those that vary by session and item) must start with 'price_' and have shape (num_sessions, num_items, *)
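
The naming rules above can be summarized by a small prefix check. The sketch below mirrors the `_is_*_attribute` helpers in the source listing, but `observable_kind` and `EXPECTED_LEADING_DIMS` are hypothetical names introduced only for this example.

```python
from typing import Dict, Optional, Tuple

# Names that share an observable prefix but are not observables themselves.
RESERVED = {'item_index', 'item_availability', 'user_index', 'session_index'}
PREFIXES = ('user_', 'item_', 'session_', 'taste_', 'price_')

# Leading dimensions expected for each kind; the trailing * is the number of observables.
EXPECTED_LEADING_DIMS: Dict[str, Tuple[str, ...]] = {
    'user': ('num_users',),
    'item': ('num_items',),
    'session': ('num_sessions',),
    'taste': ('num_users', 'num_items'),
    'price': ('num_sessions', 'num_items'),
}


def observable_kind(key: str) -> Optional[str]:
    """Return the observable kind implied by a keyword name, or None."""
    if key in RESERVED:
        return None
    for prefix in PREFIXES:
        if key.startswith(prefix):
            return prefix.rstrip('_')
    return None


kinds = {name: observable_kind(name)
         for name in ('user_income', 'price_obs', 'item_index', 'label')}
```

A keyword argument whose name matches none of the prefixes is still stored on the dataset, but it is not treated as an observable when x_dict is built.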

__len__(self) (special)
Returns number of samples in this dataset.
Returns:
| Type | Description |
| --- | --- |
| int | length of the dataset. |

__repr__(self) (special)
A method to get a string representation of the dataset.
Returns:
| Type | Description |
| --- | --- |
| str | the string representation of the dataset. |

apply_tensor(self, func)
This is a helper method to apply the provided function to all tensors and tensor values of all dictionaries.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| func | callable | a callable function to be applied on tensors and tensor-values of dictionaries. | required |
Returns:
| Type | Description |
| --- | --- |
| ChoiceDataset | the modified dataset. |
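
The pattern of applying one function to every tensor-like attribute, including values stored in dictionaries, can be sketched on a toy container. `TinyDataset` is a hypothetical stand-in that treats lists as "tensors"; in this sketch dictionary values are updated by key assignment, and the object is modified in place and returned.

```python
from typing import Callable, List


class TinyDataset:
    """Toy container: list attributes play the role of tensors."""

    def __init__(self, **attrs) -> None:
        for key, val in attrs.items():
            setattr(self, key, val)

    def apply(self, func: Callable[[List[float]], List[float]]) -> "TinyDataset":
        for key, val in list(self.__dict__.items()):
            if isinstance(val, list):
                setattr(self, key, func(val))
            elif isinstance(val, dict):
                # Apply func to list values stored inside dictionaries as well.
                for sub_key, sub_val in val.items():
                    if isinstance(sub_val, list):
                        val[sub_key] = func(sub_val)
        return self


ds = TinyDataset(item_index=[0, 1], extras={'w': [1.0, 2.0]}, name='demo')
out = ds.apply(lambda xs: [x * 2 for x in xs])
```

Returning `self` is what lets a method like `to(device)` be written as a one-line call to the helper.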

clone(self)
Creates a copy of self.
Returns:
| Type | Description |
| --- | --- |
| ChoiceDataset | a copy of self. |

to(self, device)
Moves all tensors in this dataset to the specified PyTorch device.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| device | Union[str, torch.device] | the destination device. | required |
Returns:
| Type | Description |
| --- | --- |
| ChoiceDataset | the modified dataset on the new device. |

joint_dataset
The JointDataset class is a wrapper around several ChoiceDataset objects. It is particularly useful when we need to make predictions from multiple datasets. For example, suppose you have data on consumer purchase records in a fast food store, and every customer purchases exactly one main food and one drink. In this case, you have two separate datasets: FoodDataset and DrinkDataset. You may want to use a PyTorch sampler to sample them in a dependent manner: you want to take the i-th sample from both datasets, so that you know what (food, drink) combo the i-th customer purchased. You can do this by using the JointDataset class.
Author: Tianyu Du
Update: Apr. 28, 2022
JointDataset (Dataset)
A helper class for joining several PyTorch datasets. Using JointDataset with a PyTorch data loader allows for sampling the same batch index from several datasets.
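
The i-th-sample pairing described above can be sketched without PyTorch. `JointLists` below is a hypothetical, simplified analogue that joins plain sequences instead of ChoiceDataset objects; it is meant only to show the indexing idea, not the library's implementation.

```python
from typing import Dict, Sequence


class JointLists:
    """Join equally long sequences so one index selects from all of them."""

    def __init__(self, **datasets: Sequence) -> None:
        lengths = {len(d) for d in datasets.values()}
        if len(lengths) != 1:
            raise ValueError('All datasets must have the same length.')
        self.datasets = datasets

    def __len__(self) -> int:
        # Every member has the same length, so the first one is representative.
        return len(next(iter(self.datasets.values())))

    def __getitem__(self, index: int) -> Dict[str, object]:
        return {name: d[index] for name, d in self.datasets.items()}


joint = JointLists(food=['burger', 'wrap', 'salad'], drink=['cola', 'tea', 'water'])
combo = joint[1]  # what the customer at position 1 bought from each dataset
```

Because the result is keyed by the names given at construction, downstream code can pick out each component by name rather than by position.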
- -Source code in torch_choice/data/joint_dataset.py
class JointDataset(torch.utils.data.Dataset):
    """A helper class for joining several PyTorch datasets. Using JointDataset
    with a PyTorch data loader allows sampling the same batch indices from several
    datasets.

    The JointDataset class wraps several ChoiceDataset objects and is particularly useful when
    predictions draw on multiple datasets. For example, suppose you have consumer purchase records from a fast food
    store, and every customer purchases exactly one main dish and one drink. In this case, you have
    two separate datasets: FoodDataset and DrinkDataset. You may want a PyTorch sampler to sample them in a dependent
    manner, taking the i-th sample from both datasets so that you know which (food, drink) combo the i-th customer
    purchased. The JointDataset class supports this.
    """

    def __init__(self, **datasets) -> None:
        """Initializes the joint dataset.

        Args:
            Arbitrarily many datasets with arbitrary names as keys. In the example above, you can construct
            ```
            dataset = JointDataset(food=FoodDataset, drink=DrinkDataset)
            ```
            All datasets should have the same length.
        """
        super(JointDataset, self).__init__()
        self.datasets = datasets
        # check the length of sub-datasets are the same.
        assert len(set([len(d) for d in self.datasets.values()])) == 1

    def __len__(self) -> int:
        """Get the number of samples in the joint dataset.

        Returns:
            int: the number of samples in the joint dataset, which is the same as the number of samples in each dataset contained.
        """
        for d in self.datasets.values():
            return len(d)

    def __getitem__(self, indices: Union[int, torch.LongTensor]) -> Dict[str, ChoiceDataset]:
        """Queries samples from the dataset by index.

        Args:
            indices (Union[int, torch.LongTensor]): an integer or a 1D tensor of multiple indices.

        Returns:
            Dict[str, ChoiceDataset]: the subset of the dataset. Keys of the dictionary will be names of each dataset
                contained (the same as the keys of the ``datasets`` argument in the constructor). Values will be subsets
                of contained datasets, sliced using the provided indices.
        """
        return dict((name, d[indices]) for (name, d) in self.datasets.items())

    def __repr__(self) -> str:
        """A method to get a string representation of the dataset.

        Returns:
            str: the string representation of the dataset.
        """
        out = [f'JointDataset with {len(self.datasets)} sub-datasets: (']
        for name, dataset in self.datasets.items():
            out.append(f'\t{name}: {str(dataset)}')
        out.append(')')
        return '\n'.join(out)

    @property
    def device(self) -> str:
        """Returns the device of datasets contained in the joint dataset.

        Returns:
            str: the device of the dataset.
        """
        for d in self.datasets.values():
            return d.device

    def to(self, device: Union[str, torch.device]) -> "JointDataset":
        """Moves all datasets in this dataset to the specified PyTorch device.

        Args:
            device (Union[str, torch.device]): the destination device.

        Returns:
            JointDataset: the modified dataset on the new device.
        """
        for d in self.datasets.values():
            d = d.to(device)
        return self
-device: str
- property
- readonly
- Returns the device of datasets contained in the joint dataset.
Returns:
Type | Description
str | the device of the dataset.
-__getitem__(self, indices)
- special
- Queries samples from the dataset by index.
Parameters:
Name | Type | Description | Default
indices | Union[int, torch.LongTensor] | an integer or a 1D tensor of multiple indices. | required
Returns:
Type | Description
Dict[str, ChoiceDataset] | the subset of the dataset. Keys of the dictionary will be names of each dataset contained (the same as the keys of the datasets argument in the constructor). Values will be subsets of contained datasets, sliced using the provided indices.
Source code in torch_choice/data/joint_dataset.py
def __getitem__(self, indices: Union[int, torch.LongTensor]) -> Dict[str, ChoiceDataset]:
    """Queries samples from the dataset by index.

    Args:
        indices (Union[int, torch.LongTensor]): an integer or a 1D tensor of multiple indices.

    Returns:
        Dict[str, ChoiceDataset]: the subset of the dataset. Keys of the dictionary will be names of each dataset
            contained (the same as the keys of the ``datasets`` argument in the constructor). Values will be subsets
            of contained datasets, sliced using the provided indices.
    """
    return dict((name, d[indices]) for (name, d) in self.datasets.items())
-__init__(self, **datasets)
- special
- Initializes the joint dataset.
- -Source code in torch_choice/data/joint_dataset.py
def __init__(self, **datasets) -> None:
    """Initializes the joint dataset.

    Args:
        Arbitrarily many datasets with arbitrary names as keys. In the example above, you can construct
        ```
        dataset = JointDataset(food=FoodDataset, drink=DrinkDataset)
        ```
        All datasets should have the same length.
    """
    super(JointDataset, self).__init__()
    self.datasets = datasets
    # check the length of sub-datasets are the same.
    assert len(set([len(d) for d in self.datasets.values()])) == 1
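-__len__(self)
- special
- Get the number of samples in the joint dataset.
Returns:
Type | Description
int | the number of samples in the joint dataset, which is the same as the number of samples in each dataset contained.
Source code in torch_choice/data/joint_dataset.py
def __len__(self) -> int:
    """Get the number of samples in the joint dataset.

    Returns:
        int: the number of samples in the joint dataset, which is the same as the number of samples in each dataset contained.
    """
    for d in self.datasets.values():
        return len(d)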
-__repr__(self)
- special
- A method to get a string representation of the dataset.
Returns:
Type | Description
str | the string representation of the dataset.
Source code in torch_choice/data/joint_dataset.py
def __repr__(self) -> str:
    """A method to get a string representation of the dataset.

    Returns:
        str: the string representation of the dataset.
    """
    out = [f'JointDataset with {len(self.datasets)} sub-datasets: (']
    for name, dataset in self.datasets.items():
        out.append(f'\t{name}: {str(dataset)}')
    out.append(')')
    return '\n'.join(out)
-to(self, device)
- Moves all datasets in this dataset to the specified PyTorch device.
Parameters:
Name | Type | Description | Default
device | Union[str, torch.device] | the destination device. | required
Returns:
Type | Description
JointDataset | the modified dataset on the new device.
Source code in torch_choice/data/joint_dataset.py
def to(self, device: Union[str, torch.device]) -> "JointDataset":
    """Moves all datasets in this dataset to the specified PyTorch device.

    Args:
        device (Union[str, torch.device]): the destination device.

    Returns:
        JointDataset: the modified dataset on the new device.
    """
    for d in self.datasets.values():
        d = d.to(device)
    return self
- utils
-pivot3d(df, dim0, dim1, values)
- Creates a tensor of shape (df[dim0].nunique(), df[dim1].nunique(), len(values)) from the provided data frame.
- For example, if dim0 is the column of session IDs and dim1 is the column of alternative names, then out[t, i, k] is the feature values[k] of item i in session t. The returned tensor has shape (num_sessions, num_items, num_params), which suits conditional logit models.
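To make the output layout concrete, the following dependency-free sketch builds the same `out[t][i][k]` structure from long-format records using nested lists. The helper `pivot3d_lists` and the sample records are hypothetical; the library function itself operates on a pandas DataFrame and returns a `torch.Tensor`. The sketch assumes every (dim0, dim1) pair appears exactly once.

```python
from typing import Dict, List, Union


def pivot3d_lists(records: List[Dict], dim0: str, dim1: str,
                  values: Union[str, List[str]]) -> List[List[List[float]]]:
    """Return out[t][i][k] = values[k] of dim1-level i within dim0-level t."""
    if not isinstance(values, list):
        values = [values]
    rows = sorted({r[dim0] for r in records})   # e.g. session IDs
    cols = sorted({r[dim1] for r in records})   # e.g. alternative names
    lookup = {(r[dim0], r[dim1]): r for r in records}
    return [[[float(lookup[(t, i)][v]) for v in values] for i in cols] for t in rows]


# Long-format data: one record per (session, item) pair.
records = [
    {"session": 0, "item": "bus", "price": 2.0, "time": 30.0},
    {"session": 0, "item": "car", "price": 5.0, "time": 15.0},
    {"session": 1, "item": "bus", "price": 2.5, "time": 35.0},
    {"session": 1, "item": "car", "price": 4.0, "time": 20.0},
]
out = pivot3d_lists(records, dim0="session", dim1="item", values=["price", "time"])
shape = (len(out), len(out[0]), len(out[0][0]))
print(shape)       # (2, 2, 2) = (num_sessions, num_items, len(values))
print(out[1][0])   # [2.5, 35.0]: price and time of 'bus' in session 1
```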
- -Source code in torch_choice/data/utils.py
def pivot3d(df: pd.DataFrame, dim0: str, dim1: str, values: Union[str, List[str]]) -> torch.Tensor:
    """
    Creates a tensor of shape (df[dim0].nunique(), df[dim1].nunique(), len(values)) from the
    provided data frame.

    For example, if dim0 is the column of session IDs and dim1 is the column of alternative names, then
    out[t, i, k] is the feature values[k] of item i in session t. The returned tensor
    has shape (num_sessions, num_items, num_params), which fits the purpose of conditional
    logit models.
    """
    if not isinstance(values, list):
        values = [values]

    dim1_list = sorted(df[dim1].unique())

    tensor_slice = list()
    for value in values:
        layer = df.pivot(index=dim0, columns=dim1, values=value)
        tensor_slice.append(torch.Tensor(layer[dim1_list].values))

    tensor = torch.stack(tensor_slice, dim=-1)
    assert tensor.shape == (df[dim0].nunique(), df[dim1].nunique(), len(values))
    return tensor
- model
- special
- coefficient
- The general class of learnable coefficients used by the various models; this class serves as the building block for models in this package. The weights (i.e., learnable parameters) in the Coefficient class are implemented using PyTorch and can be trained directly using PyTorch optimizers.
- NOTE: users of the torch-choice package do not interact with the classes in this file directly; please use conditional_logit_model.py and nested_logit_model.py instead.
- Author: Tianyu Du. Update: Apr. 28, 2022
-Coefficient (Module)
- Source code in torch_choice/model/coefficient.py
- class Coefficient(nn.Module):
- def __init__(self,
- variation: str,
- num_params: int,
- num_items: Optional[int]=None,
- num_users: Optional[int]=None
- ) -> None:
- """A generic coefficient object storing trainable parameters. This class corresponds to those variables typically
- in Greek letters in the model's utility representation.
- Args:
- variation (str): the degree of variation of this coefficient. For example, the coefficient can vary by users or items.
- Currently, we support variations 'constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full'.
- For detailed explanation of these variations, please refer to the documentation of ConditionalLogitModel.
- num_params (int): number of parameters in this coefficient. Note that this number is the number of parameters
- per class, not the total number of parameters. For example, suppose we have U users and you want to initiate
- an user-specific coefficient called `theta_user`. The coefficient enters the utility form while being multiplied
- with some K-dimension observables. Then, for each user, there are K parameters to be multiplied with the K-dimensional
- observable. However, the total number of parameters is K * U (K for each of U users). In this case, `num_params` should
- be set to `K`, NOT `K*U`.
- num_items (int): the number of items in the prediction problem, this is required to reshape the parameter correctly.
- num_users (Optional[int], optional): number of users, this is only necessary if the coefficient varies by users.
- Defaults to None.
- """
- super(Coefficient, self).__init__()
- self.variation = variation
- self.num_items = num_items
- self.num_users = num_users
- self.num_params = num_params
- # construct the trainable.
- if self.variation == 'constant':
- # constant for all users and items.
- self.coef = nn.Parameter(torch.randn(num_params), requires_grad=True)
- elif self.variation == 'item':
- # coef depends on item j but not on user i.
- # force coefficients for the first item class to be zero.
- self.coef = nn.Parameter(torch.zeros(num_items - 1, num_params), requires_grad=True)
- elif self.variation == 'item-full':
- # coef depends on item j but not on user i.
- # model coefficient for every item.
- self.coef = nn.Parameter(torch.zeros(num_items, num_params), requires_grad=True)
- elif self.variation == 'user':
- # coef depends on the user.
- # we always model coefficient for all users.
- self.coef = nn.Parameter(torch.zeros(num_users, num_params), requires_grad=True)
- elif self.variation == 'user-item':
- # coefficients of the first item is forced to be zero, model coefficients for N - 1 items only.
- self.coef = nn.Parameter(torch.zeros(num_users, num_items - 1, num_params), requires_grad=True)
- elif self.variation == 'user-item-full':
- # construct coefficients for every items.
- self.coef = nn.Parameter(torch.zeros(num_users, num_items, num_params), requires_grad=True)
- else:
- raise ValueError(f'Unsupported type of variation: {self.variation}.')
- def __repr__(self) -> str:
- """Returns a string representation of the coefficient.
- Returns:
- str: the string representation of the coefficient.
- """
- return f'Coefficient(variation={self.variation}, num_items={self.num_items},' \
- + f' num_users={self.num_users}, num_params={self.num_params},' \
- + f' {self.coef.numel()} trainable parameters in total).'
- def forward(self,
- x: torch.Tensor,
- user_index: Optional[torch.Tensor]=None,
- manual_coef_value: Optional[torch.Tensor]=None
- ) -> torch.Tensor:
- """
- The forward function of the coefficient, which computes the utility from purchasing each item in each session.
- The output shape will be (num_sessions, num_items).
- Args:
- x (torch.Tensor): a tensor of shape (num_sessions, num_items, num_params). Please note that the Coefficient
- class will NOT reshape input tensors itself, this reshaping needs to be done in the model class.
- user_index (Optional[torch.Tensor], optional): a tensor of shape (num_sessions,)
- contain IDs of the user involved in that session. If set to None, assume the same
- user is making all decisions.
- Defaults to None.
- manual_coef_value (Optional[torch.Tensor], optional): a tensor with the same number of
- entries as self.coef. If provided, the forward function uses provided values
- as coefficient and return the predicted utility, this feature is useful when
- the researcher wishes to manually specify values for coefficients and examine prediction
- with specified coefficient values. If not provided, forward function is executed
- using values from self.coef.
- Defaults to None.
- Returns:
- torch.Tensor: a tensor of shape (num_sessions, num_items) whose (t, i) entry represents
- the utility of purchasing item i in session t.
- """
- if manual_coef_value is not None:
- assert manual_coef_value.numel() == self.coef.numel()
- # plugin the provided coefficient values, coef is a tensor.
- coef = manual_coef_value.reshape(*self.coef.shape)
- else:
- # use the learned coefficient values, coef is a nn.Parameter.
- coef = self.coef
- num_trips, num_items, num_feats = x.shape
- assert self.num_params == num_feats
- # cast coefficient tensor to (num_trips, num_items, self.num_params).
- if self.variation == 'constant':
- coef = coef.view(1, 1, self.num_params).expand(num_trips, num_items, -1)
- elif self.variation == 'item':
- # coef has shape (num_items-1, num_params)
- # force coefficient for the first item to be zero.
- zeros = torch.zeros(1, self.num_params).to(coef.device)
- coef = torch.cat((zeros, coef), dim=0) # (num_items, num_params)
- coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)
- elif self.variation == 'item-full':
- # coef has shape (num_items, num_params)
- coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)
- elif self.variation == 'user':
- # coef has shape (num_users, num_params)
- coef = coef[user_index, :] # (num_trips, num_params) user-specific coefficients.
- coef = coef.view(num_trips, 1, self.num_params).expand(-1, num_items, -1)
- elif self.variation == 'user-item':
- # (num_trips,) long tensor of user ID.
- # originally, coef has shape (num_users, num_items-1, num_params)
- # transform to (num_trips, num_items - 1, num_params), user-specific.
- coef = coef[user_index, :, :]
- # coefs for the first item for all users are enforced to 0.
- zeros = torch.zeros(num_trips, 1, self.num_params).to(coef.device)
- coef = torch.cat((zeros, coef), dim=1) # (num_trips, num_items, num_params)
- elif self.variation == 'user-item-full':
- # originally, coef has shape (num_users, num_items, num_params)
- coef = coef[user_index, :, :] # (num_trips, num_items, num_params)
- else:
- raise ValueError(f'Unsupported type of variation: {self.variation}.')
- assert coef.shape == (num_trips, num_items, num_feats) == x.shape
- # compute the utility of each item in each trip, take summation along the feature dimension, the same as taking
- # the inner product.
- return (x * coef).sum(dim=-1)
-__init__(self, variation, num_params, num_items=None, num_users=None)
- special
- A generic coefficient object storing trainable parameters. This class corresponds to the variables typically written in Greek letters in the model's utility representation.
Parameters:
Name | Type | Description | Default
variation | str | the degree of variation of this coefficient. For example, the coefficient can vary by users or items. Currently, we support variations 'constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full'. For a detailed explanation of these variations, please refer to the documentation of ConditionalLogitModel. | required
num_params | int | number of parameters in this coefficient. Note that this is the number of parameters per class, not the total number of parameters. For example, suppose we have U users and you want to initialize a user-specific coefficient called theta_user. The coefficient enters the utility form by being multiplied with some K-dimensional observables. Then, for each user, there are K parameters to be multiplied with the K-dimensional observable, while the total number of parameters is K * U (K for each of U users). In this case, num_params should be set to K, NOT K*U. | required
num_items | Optional[int] | the number of items in the prediction problem; this is required to reshape the parameter correctly. | None
num_users | Optional[int] | number of users; this is only necessary if the coefficient varies by users. | None
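The relation between `num_params` and the total number of trainable entries can be checked with a small sketch. The helper `coef_shape` below is hypothetical (it is not part of torch-choice); it only reproduces, without PyTorch, the tensor shapes that the constructor allocates for each variation.

```python
from math import prod
from typing import Optional, Tuple


def coef_shape(variation: str, num_params: int,
               num_items: Optional[int] = None,
               num_users: Optional[int] = None) -> Tuple[int, ...]:
    """Shape of the trainable tensor for each variation (hypothetical helper)."""
    shapes = {
        "constant": lambda: (num_params,),
        "item": lambda: (num_items - 1, num_params),   # first item fixed at zero
        "item-full": lambda: (num_items, num_params),
        "user": lambda: (num_users, num_params),
        "user-item": lambda: (num_users, num_items - 1, num_params),
        "user-item-full": lambda: (num_users, num_items, num_params),
    }
    if variation not in shapes:
        raise ValueError(f"Unsupported type of variation: {variation}.")
    return shapes[variation]()


# K = 3 observables, 4 items, 10 users.
for variation in ["constant", "item", "item-full", "user", "user-item", "user-item-full"]:
    shape = coef_shape(variation, num_params=3, num_items=4, num_users=10)
    print(f"{variation:>15}: shape={shape}, total={prod(shape)}")
```

In every case `num_params` stays at K = 3; only the leading dimensions change, which is why the total count grows from 3 (`constant`) to 120 (`user-item-full`) in this configuration.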
Source code in torch_choice/model/coefficient.py
- def __init__(self,
- variation: str,
- num_params: int,
- num_items: Optional[int]=None,
- num_users: Optional[int]=None
- ) -> None:
- """A generic coefficient object storing trainable parameters. This class corresponds to those variables typically
- in Greek letters in the model's utility representation.
- Args:
- variation (str): the degree of variation of this coefficient. For example, the coefficient can vary by users or items.
- Currently, we support variations 'constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full'.
- For detailed explanation of these variations, please refer to the documentation of ConditionalLogitModel.
- num_params (int): number of parameters in this coefficient. Note that this number is the number of parameters
- per class, not the total number of parameters. For example, suppose we have U users and you want to initiate
- an user-specific coefficient called `theta_user`. The coefficient enters the utility form while being multiplied
- with some K-dimension observables. Then, for each user, there are K parameters to be multiplied with the K-dimensional
- observable. However, the total number of parameters is K * U (K for each of U users). In this case, `num_params` should
- be set to `K`, NOT `K*U`.
- num_items (int): the number of items in the prediction problem, this is required to reshape the parameter correctly.
- num_users (Optional[int], optional): number of users, this is only necessary if the coefficient varies by users.
- Defaults to None.
- """
- super(Coefficient, self).__init__()
- self.variation = variation
- self.num_items = num_items
- self.num_users = num_users
- self.num_params = num_params
- # construct the trainable.
- if self.variation == 'constant':
- # constant for all users and items.
- self.coef = nn.Parameter(torch.randn(num_params), requires_grad=True)
- elif self.variation == 'item':
- # coef depends on item j but not on user i.
- # force coefficients for the first item class to be zero.
- self.coef = nn.Parameter(torch.zeros(num_items - 1, num_params), requires_grad=True)
- elif self.variation == 'item-full':
- # coef depends on item j but not on user i.
- # model coefficient for every item.
- self.coef = nn.Parameter(torch.zeros(num_items, num_params), requires_grad=True)
- elif self.variation == 'user':
- # coef depends on the user.
- # we always model coefficient for all users.
- self.coef = nn.Parameter(torch.zeros(num_users, num_params), requires_grad=True)
- elif self.variation == 'user-item':
- # coefficients of the first item is forced to be zero, model coefficients for N - 1 items only.
- self.coef = nn.Parameter(torch.zeros(num_users, num_items - 1, num_params), requires_grad=True)
- elif self.variation == 'user-item-full':
- # construct coefficients for every items.
- self.coef = nn.Parameter(torch.zeros(num_users, num_items, num_params), requires_grad=True)
- else:
- raise ValueError(f'Unsupported type of variation: {self.variation}.')
-__repr__(self)
- special
- Returns a string representation of the coefficient.
Returns:
Type | Description
str | the string representation of the coefficient.
Source code in torch_choice/model/coefficient.py
def __repr__(self) -> str:
    """Returns a string representation of the coefficient.

    Returns:
        str: the string representation of the coefficient.
    """
    return f'Coefficient(variation={self.variation}, num_items={self.num_items},' \
        + f' num_users={self.num_users}, num_params={self.num_params},' \
        + f' {self.coef.numel()} trainable parameters in total).'
-forward(self, x, user_index=None, manual_coef_value=None)
- The forward function of the coefficient, which computes the utility from purchasing each item in each session. The output shape will be (num_sessions, num_items).
Parameters:
Name | Type | Description | Default
x | torch.Tensor | a tensor of shape (num_sessions, num_items, num_params). Note that the Coefficient class will NOT reshape input tensors itself; this reshaping needs to be done in the model class. | required
user_index | Optional[torch.Tensor] | a tensor of shape (num_sessions,) containing the ID of the user involved in each session. If set to None, the same user is assumed to make all decisions. | None
manual_coef_value | Optional[torch.Tensor] | a tensor with the same number of entries as self.coef. If provided, the forward function uses the provided values as coefficients and returns the predicted utility. This is useful when the researcher wishes to manually specify coefficient values and examine the resulting predictions. If not provided, the forward function uses the values from self.coef. | None
Returns:
Type | Description
torch.Tensor | a tensor of shape (num_sessions, num_items) whose (t, i) entry represents the utility of purchasing item i in session t.
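To illustrate what `forward` computes for the `item` variation, here is a dependency-free sketch using nested lists in place of tensors. The function `item_utility` and the sample numbers are hypothetical; the sketch assumes `coef` holds rows for the second item onward, prepends a zero row for the first item, and takes the inner product along the feature dimension.

```python
from typing import List

Matrix = List[List[float]]


def item_utility(x: List[Matrix], coef: Matrix) -> Matrix:
    """Utility for the 'item' variation.

    x has shape (num_sessions, num_items, num_params);
    coef has shape (num_items - 1, num_params), as stored by the module.
    """
    num_params = len(coef[0])
    # Prepend a zero row: the first item's coefficients are fixed at zero.
    full_coef = [[0.0] * num_params] + coef
    return [
        [sum(x_k * b_k for x_k, b_k in zip(x_ti, full_coef[i]))
         for i, x_ti in enumerate(x_t)]
        for x_t in x
    ]


x = [  # 2 sessions, 3 items, 2 features
    [[1.0, 2.0], [1.0, 0.0], [0.0, 1.0]],
    [[3.0, 1.0], [2.0, 2.0], [1.0, 1.0]],
]
coef = [[0.5, -1.0], [2.0, 3.0]]  # coefficients for the second and third items only
utility = item_utility(x, coef)
print(utility)  # [[0.0, 0.5, 3.0], [0.0, -1.0, 5.0]]
```

The first column is zero in every session because the first item serves as the reference alternative under this variation.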
Source code in torch_choice/model/coefficient.py
- def forward(self,
- x: torch.Tensor,
- user_index: Optional[torch.Tensor]=None,
- manual_coef_value: Optional[torch.Tensor]=None
- ) -> torch.Tensor:
- """
- The forward function of the coefficient, which computes the utility from purchasing each item in each session.
- The output shape will be (num_sessions, num_items).
- Args:
- x (torch.Tensor): a tensor of shape (num_sessions, num_items, num_params). Please note that the Coefficient
- class will NOT reshape input tensors itself, this reshaping needs to be done in the model class.
- user_index (Optional[torch.Tensor], optional): a tensor of shape (num_sessions,)
- contain IDs of the user involved in that session. If set to None, assume the same
- user is making all decisions.
- Defaults to None.
- manual_coef_value (Optional[torch.Tensor], optional): a tensor with the same number of
- entries as self.coef. If provided, the forward function uses provided values
- as coefficient and return the predicted utility, this feature is useful when
- the researcher wishes to manually specify values for coefficients and examine prediction
- with specified coefficient values. If not provided, forward function is executed
- using values from self.coef.
- Defaults to None.
- Returns:
- torch.Tensor: a tensor of shape (num_sessions, num_items) whose (t, i) entry represents
- the utility of purchasing item i in session t.
- """
- if manual_coef_value is not None:
- assert manual_coef_value.numel() == self.coef.numel()
- # plugin the provided coefficient values, coef is a tensor.
- coef = manual_coef_value.reshape(*self.coef.shape)
- else:
- # use the learned coefficient values, coef is a nn.Parameter.
- coef = self.coef
- num_trips, num_items, num_feats = x.shape
- assert self.num_params == num_feats
- # cast coefficient tensor to (num_trips, num_items, self.num_params).
- if self.variation == 'constant':
- coef = coef.view(1, 1, self.num_params).expand(num_trips, num_items, -1)
- elif self.variation == 'item':
- # coef has shape (num_items-1, num_params)
- # force coefficient for the first item to be zero.
- zeros = torch.zeros(1, self.num_params).to(coef.device)
- coef = torch.cat((zeros, coef), dim=0) # (num_items, num_params)
- coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)
- elif self.variation == 'item-full':
- # coef has shape (num_items, num_params)
- coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)
- elif self.variation == 'user':
- # coef has shape (num_users, num_params)
- coef = coef[user_index, :] # (num_trips, num_params) user-specific coefficients.
- coef = coef.view(num_trips, 1, self.num_params).expand(-1, num_items, -1)
- elif self.variation == 'user-item':
- # (num_trips,) long tensor of user ID.
- # originally, coef has shape (num_users, num_items-1, num_params)
- # transform to (num_trips, num_items - 1, num_params), user-specific.
- coef = coef[user_index, :, :]
- # coefs for the first item for all users are enforced to 0.
- zeros = torch.zeros(num_trips, 1, self.num_params).to(coef.device)
- coef = torch.cat((zeros, coef), dim=1) # (num_trips, num_items, num_params)
- elif self.variation == 'user-item-full':
- # originally, coef has shape (num_users, num_items, num_params)
- coef = coef[user_index, :, :] # (num_trips, num_items, num_params)
- else:
- raise ValueError(f'Unsupported type of variation: {self.variation}.')
- assert coef.shape == (num_trips, num_items, num_feats) == x.shape
- # compute the utility of each item in each trip, take summation along the feature dimension, the same as taking
- # the inner product.
- return (x * coef).sum(dim=-1)
- conditional_logit_model
- Conditional Logit Model.
- Author: Tianyu Du. Date: Aug. 8, 2021. Update: Apr. 28, 2022
-ConditionalLogitModel (Module)
- A generalized version of the conditional logit model. The model allows for research-specific variable types (groups) and different levels of variation for each coefficient.
- NOTE: unless the -full flag is specified (which means coefficients are modeled explicitly for all items), for all variation levels related to items (item-specific and user-item-specific) the model forces the coefficients of the first item to be zero. This design follows standard econometric practice.
- The model allows for the following levels of variation:
- constant: constant over all users and items.
- user: user-specific parameters, constant across all items.
- item: item-specific parameters, constant across all users; parameters for the first item are forced to be zero.
- item-full: item-specific parameters, constant across all users; coefficients are modeled explicitly for all items.
- user-item: parameters specific to both user and item; parameters for the first item are forced to be zero for all users.
- user-item-full: parameters specific to both user and item; coefficients are modeled explicitly for all items.
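As a compact summary of the list above, write $x_{uit} \in \mathbb{R}^K$ for the $K$-dimensional observable of one variable type for user $u$, item $i$ and session $t$. This index notation is an assumption made for illustration and is not the package's own. Each variation level then contributes the following term to utility, where item $1$ denotes the first item:

```latex
\begin{aligned}
\texttt{constant}:       &\quad x_{uit}^\top \beta \\
\texttt{user}:           &\quad x_{uit}^\top \beta_u \\
\texttt{item}:           &\quad x_{uit}^\top \beta_i, \qquad \beta_1 = 0 \\
\texttt{item-full}:      &\quad x_{uit}^\top \beta_i \\
\texttt{user-item}:      &\quad x_{uit}^\top \beta_{ui}, \qquad \beta_{u1} = 0 \ \text{for all } u \\
\texttt{user-item-full}: &\quad x_{uit}^\top \beta_{ui}
\end{aligned}
```

The `-full` variants differ from their counterparts only in dropping the zero restriction on the first item.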
Source code in torch_choice/model/conditional_logit_model.py
- class ConditionalLogitModel(nn.Module):
- """The more generalized version of conditional logit model, the model allows for research specific
- variable types(groups) and different levels of variations for coefficient.
- The model allows for the following levels for variable variations:
- NOTE: unless the `-full` flag is specified (which means we want to explicitly model coefficients
- for all items), for all variation levels related to item (item specific and user-item specific),
- the model force coefficients for the first item to be zero. This design follows standard
- econometric practice.
- - constant: constant over all users and items,
- - user: user-specific parameters but constant across all items,
- - item: item-specific parameters but constant across all users, parameters for the first item are
- forced to be zero.
- - item-full: item-specific parameters but constant across all users, explicitly model for all items.
- - user-item: parameters that are specific to both user and item, parameter for the first item
- for all users are forced to be zero.
- - user-item-full: parameters that are specific to both user and item, explicitly model for all items.
- """
- def __init__(self,
- coef_variation_dict: Dict[str, str],
- num_param_dict: Optional[Dict[str, int]]=None,
- num_items: Optional[int]=None,
- num_users: Optional[int]=None
- ) -> None:
- """
- Args:
- num_items (int): number of items in the dataset.
- num_users (int): number of users in the dataset.
- coef_variation_dict (Dict[str, str]): variable type to variation level dictionary. Keys of this dictionary
- should be variable names in the dataset (i.e., those starting with `price_`, `user_`, etc), or `intercept`
- if the researcher requires an intercept term.
- For each variable name X_var (e.g., `user_income`) or `intercept`, the corresponding dictionary value should
- be one of the following, which specifies the "level of variation" of the coefficient.
- - `constant`: the coefficient constant over all users and items: $X \beta$.
- - `user`: user-specific parameters but constant across all items: $X \beta_{u}$.
- - `item`: item-specific parameters but constant across all users, $X \beta_{i}$.
- Note that the coefficients for the first item are forced to be zero following the standard practice
- in econometrics.
- - `item-full`: the same configuration as `item`, but does not force the coefficients of the first item to
- be zeros.
- The following configurations are supported by the package, but we don't recommend using them due to the
- large number of parameters.
- - `user-item`: parameters that are specific to both user and item, parameter for the first item
- for all users are forced to be zero.
- - `user-item-full`: parameters that are specific to both user and item, explicitly model for all items.
- num_param_dict (Optional[Dict[str, int]]): variable type to number of parameters dictionary with keys exactly the same
- as the `coef_variation_dict`. Values of `num_param_dict` records numbers of features in each kind of variable.
- If None is supplied, num_param_dict will be a dictionary with the same keys as the `coef_variation_dict` dictionary
- and values of all ones. Default to be None.
- """
- super(ConditionalLogitModel, self).__init__()
- if num_param_dict is None:
- num_param_dict = {key:1 for key in coef_variation_dict.keys()}
- assert coef_variation_dict.keys() == num_param_dict.keys()
- self.variable_types = list(deepcopy(num_param_dict).keys())
- self.coef_variation_dict = deepcopy(coef_variation_dict)
- self.num_param_dict = deepcopy(num_param_dict)
- self.num_items = num_items
- self.num_users = num_users
- # check number of parameters specified are all positive.
- for var_type, num_params in self.num_param_dict.items():
- assert num_params > 0, f'num_params needs to be positive, got: {num_params}.'
- # infer the number of parameters for intercept if the researcher forgets.
- if 'intercept' in self.coef_variation_dict.keys() and 'intercept' not in self.num_param_dict.keys():
- warnings.warn("'intercept' key found in coef_variation_dict but not in num_param_dict, num_param_dict['intercept'] has been set to 1.")
- self.num_param_dict['intercept'] = 1
- # construct trainable parameters.
- coef_dict = dict()
- for var_type, variation in self.coef_variation_dict.items():
- coef_dict[var_type] = Coefficient(variation=variation,
- num_items=self.num_items,
- num_users=self.num_users,
- num_params=self.num_param_dict[var_type])
- # A ModuleDict is required to properly register all trainable parameters.
- # self.parameter() will fail if a python dictionary is used instead.
- self.coef_dict = nn.ModuleDict(coef_dict)
- def __repr__(self) -> str:
- """Return a string representation of the model.
- Returns:
- str: the string representation of the model.
- """
- out_str_lst = ['Conditional logistic discrete choice model, expects input features:\n']
- for var_type, num_params in self.num_param_dict.items():
- out_str_lst.append(f'X[{var_type}] with {num_params} parameters, with {self.coef_variation_dict[var_type]} level variation.')
- return super().__repr__() + '\n' + '\n'.join(out_str_lst)
- @property
- def num_params(self) -> int:
- """Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied
- with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no
- intercept is involved.
- Returns:
- int: the total number of learnable parameters.
- """
- return sum(w.numel() for w in self.parameters())
- def summary(self):
- """Print out the current model parameters."""
- for var_type, coefficient in self.coef_dict.items():
- if coefficient is not None:
- print('Variable Type: ', var_type)
- print(coefficient.coef)
- def forward(self,
- batch: ChoiceDataset,
- manual_coef_value_dict: Optional[Dict[str, torch.Tensor]] = None
- ) -> torch.Tensor:
- """
- Forward pass of the model.
- Args:
- batch: a `ChoiceDataset` object.
- manual_coef_value_dict (Optional[Dict[str, torch.Tensor]], optional): a dictionary with
- keys in {'u', 'i'} etc and tensors as values. If provided, the model will force
- coefficient to be the provided values and compute utility conditioned on the provided
- coefficient values. This feature is useful when the researcher wishes to plug in particular
- values of coefficients and examine the utility values. If not provided, the model will
- use the learned coefficient values in self.coef_dict.
- Defaults to None.
- Returns:
- torch.Tensor: a tensor of shape (num_trips, num_items) whose (t, i) entry represents
- the utility from item i in trip t for the user involved in that trip.
- """
- x_dict = batch.x_dict
- if 'intercept' in self.coef_variation_dict.keys():
- # the intercept term has no input tensor, so use a tensor of ones with a single feature.
- x_dict['intercept'] = torch.ones((len(batch), self.num_items, 1), device=batch.device)
- # compute the utility from each item in each choice session.
- total_utility = torch.zeros((len(batch), self.num_items), device=batch.device)
- # for each type of variables, apply the corresponding coefficient to input x.
- for var_type, coef in self.coef_dict.items():
- total_utility += coef(
- x_dict[var_type], batch.user_index,
- manual_coef_value=None if manual_coef_value_dict is None else manual_coef_value_dict[var_type])
- assert total_utility.shape == (len(batch), self.num_items)
- if batch.item_availability is not None:
- # mask out unavailable items.
- total_utility[~batch.item_availability[batch.session_index, :]] = torch.finfo(total_utility.dtype).min / 2
- return total_utility
- def negative_log_likelihood(self, batch: ChoiceDataset, y: torch.Tensor, is_train: bool=True) -> torch.Tensor:
- """Computes the negative log-likelihood for the batch and label.
- TODO: consider removing y and changing to label.
- TODO: consider moving this method outside the model; the role of the model is to compute the utility.
- Args:
- batch (ChoiceDataset): a ChoiceDataset object containing the data.
- y (torch.Tensor): the label.
- is_train (bool, optional): whether to put the model in training mode (`self.train()`) rather than
- evaluation mode (`self.eval()`) before the forward pass. Defaults to True.
- Returns:
- torch.Tensor: the negative log-likelihood.
- """
- if is_train:
- self.train()
- else:
- self.eval()
- # (num_trips, num_items)
- total_utility = self.forward(batch)
- logP = torch.log_softmax(total_utility, dim=1)
- nll = - logP[torch.arange(len(y)), y].sum()
- return nll
- # NOTE: the method for computing Hessian and standard deviation has been moved to std.py.
- # @staticmethod
- # def flatten_coef_dict(coef_dict: Dict[str, Union[torch.Tensor, torch.nn.Parameter]]) -> Tuple[torch.Tensor, dict]:
- # """Flattens the coef_dict into a 1-dimension tensor, used for hessian computation.
- # Args:
- # coef_dict (Dict[str, Union[torch.Tensor, torch.nn.Parameter]]): a dictionary holding learnable parameters.
- # Returns:
- # Tuple[torch.Tensor, dict]: 1. the flattened tensors with shape (num_params,), 2. an indexing dictionary
- # used for reconstructing the original coef_dict from the flatten tensor.
- # """
- # type2idx = dict()
- # param_list = list()
- # start = 0
- # for var_type in coef_dict.keys():
- # num_params = coef_dict[var_type].coef.numel()
- # # track which portion of all_param tensor belongs to this variable type.
- # type2idx[var_type] = (start, start + num_params)
- # start += num_params
- # # use reshape instead of view to make a copy.
- # param_list.append(coef_dict[var_type].coef.clone().reshape(-1,))
- # all_param = torch.cat(param_list) # (self.num_params(), )
- # return all_param, type2idx
- # @staticmethod
- # def unwrap_coef_dict(param: torch.Tensor, type2idx: Dict[str, Tuple[int, int]]) -> Dict[str, torch.Tensor]:
- # """Rebuilds coef_dict from output of self.flatten_coef_dict method.
- # Args:
- # param (torch.Tensor): the flattened coef_dict from self.flatten_coef_dict.
- # type2idx (Dict[str, Tuple[int, int]]): the indexing dictionary from self.flatten_coef_dict.
- # Returns:
- # Dict[str, torch.Tensor]: the re-constructed coefficient dictionary.
- # """
- # coef_dict = dict()
- # for var_type in type2idx.keys():
- # start, end = type2idx[var_type]
- # # no need to reshape here, Coefficient handles it.
- # coef_dict[var_type] = param[start:end]
- # return coef_dict
- # def compute_hessian(self, x_dict, availability, user_index, y) -> torch.Tensor:
- # """Computes the Hessian of negative log-likelihood (total cross-entropy loss) with respect
- # to all parameters in this model. The Hessian can be later used for constructing the standard deviation of
- # parameters.
- # Args:
- # x_dict ,availability, user_index: see definitions in self.forward method.
- # y (torch.LongTensor): a tensor with shape (num_trips,) of IDs of items actually purchased.
- # Returns:
- # torch.Tensor: a (self.num_params, self.num_params) tensor of the Hessian matrix.
- # """
- # all_coefs, type2idx = self.flatten_coef_dict(self.coef_dict)
- # def compute_nll(P: torch.Tensor) -> float:
- # coef_dict = self.unwrap_coef_dict(P, type2idx)
- # y_pred = self._forward(x_dict=x_dict,
- # availability=availability,
- # user_index=user_index,
- # manual_coef_value_dict=coef_dict)
- # # the reduction needs to be 'sum' to obtain NLL.
- # loss = F.cross_entropy(y_pred, y, reduction='sum')
- # return loss
- # H = torch.autograd.functional.hessian(compute_nll, all_coefs)
- # assert H.shape == (self.num_params, self.num_params)
- # return H
- # def compute_std(self, x_dict, availability, user_index, y) -> Dict[str, torch.Tensor]:
- # """Computes the standard errors of the coefficients from the inverse of the Hessian.
- # Args:
- # See definitions in self.compute_hessian.
- # Returns:
- # Dict[str, torch.Tensor]: a dictionary whose keys are the same as self.coef_dict.keys()
- # the values are standard errors of coefficients in each coefficient group.
- # """
- # _, type2idx = self.flatten_coef_dict(self.coef_dict)
- # H = self.compute_hessian(x_dict, availability, user_index, y)
- # std_all = torch.sqrt(torch.diag(torch.inverse(H)))
- # std_dict = dict()
- # for var_type in type2idx.keys():
- # # get std of variables belonging to each type.
- # start, end = type2idx[var_type]
- # std_dict[var_type] = std_all[start:end]
- # return std_dict
-num_params: int
- property
- readonly
- Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no intercept is involved.
- Returns:
- Type | Description
- int | the total number of learnable parameters.
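The "K x number of users" rule above extends to the other variation levels. The following is a minimal sketch in plain Python (no torch) of how the count could be worked out by hand. The helper `count_params` is hypothetical and not part of torch-choice. It assumes the `item` and `user-item` levels store no parameters for the first item (the docstring only says those coefficients are forced to zero), so check the result against `model.num_params` before relying on it.

```python
from typing import Dict


def count_params(coef_variation_dict: Dict[str, str],
                 num_param_dict: Dict[str, int],
                 num_items: int,
                 num_users: int) -> int:
    """Hypothetical helper: count learnable parameters implied by each variation level."""
    total = 0
    for var_type, variation in coef_variation_dict.items():
        k = num_param_dict[var_type]  # number of features of this variable type
        if variation == 'constant':
            total += k
        elif variation == 'user':
            total += k * num_users
        elif variation == 'item':
            # assumption: the first item's coefficients are fixed at zero and not stored.
            total += k * (num_items - 1)
        elif variation == 'item-full':
            total += k * num_items
        elif variation == 'user-item':
            # assumption: same normalization as 'item', applied for every user.
            total += k * num_users * (num_items - 1)
        elif variation == 'user-item-full':
            total += k * num_users * num_items
        else:
            raise ValueError(f'Unknown variation level: {variation}')
    return total


# A single user-specific coefficient on a K=3 dimensional observable, 10 users.
example_total = count_params({'user_obs': 'user'}, {'user_obs': 3},
                             num_items=4, num_users=10)
print(example_total)  # 30 = K x number of users
```

The count grows with `num_users * num_items` for the `user-item` levels, which is the reason the documentation advises against them.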
-__init__(self, coef_variation_dict, num_param_dict=None, num_items=None, num_users=None)
- special
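- Parameters:
- Name | Type | Description | Default
- num_items | int | number of items in the dataset. | None
- num_users | int | number of users in the dataset. | None
- coef_variation_dict | Dict[str, str] | variable type to variation level dictionary. Keys of this dictionary should be variable names in the dataset (i.e., those starting with `price_`, `user_`, etc.), or `intercept` if the researcher requires an intercept term. Each value is one of `constant`, `user`, `item`, `item-full`, `user-item`, or `user-item-full` and specifies the level of variation of the coefficient. The `user-item` and `user-item-full` configurations are supported by the package, but we don't recommend using them due to the large number of parameters. | required
- num_param_dict | Optional[Dict[str, int]] | variable type to number of parameters dictionary with keys exactly the same as the `coef_variation_dict`. Values of `num_param_dict` record the number of features in each kind of variable. If None is supplied, `num_param_dict` will be a dictionary with the same keys as the `coef_variation_dict` dictionary and values of all ones. | None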
Source code in torch_choice/model/conditional_logit_model.py
- def __init__(self,
- coef_variation_dict: Dict[str, str],
- num_param_dict: Optional[Dict[str, int]]=None,
- num_items: Optional[int]=None,
- num_users: Optional[int]=None
- ) -> None:
- """
- Args:
- num_items (int): number of items in the dataset.
- num_users (int): number of users in the dataset.
- coef_variation_dict (Dict[str, str]): variable type to variation level dictionary. Keys of this dictionary
- should be variable names in the dataset (i.e., those starting with `price_`, `user_`, etc.), or `intercept`
- if the researcher requires an intercept term.
- For each variable name X_var (e.g., `user_income`) or `intercept`, the corresponding dictionary value should
- be one of the following; this value specifies the "level of variation" of the coefficient.
- - `constant`: the coefficient is constant over all users and items: $X \beta$.
- - `user`: user-specific parameters but constant across all items: $X \beta_{u}$.
- - `item`: item-specific parameters but constant across all users, $X \beta_{i}$.
- Note that the coefficients for the first item are forced to be zero following the standard practice
- in econometrics.
- - `item-full`: the same configuration as `item`, but does not force the coefficients of the first item to
- be zeros.
- The following configurations are supported by the package, but we don't recommend using them due to the
- large number of parameters.
- - `user-item`: parameters that are specific to both user and item, parameter for the first item
- for all users are forced to be zero.
- - `user-item-full`: parameters that are specific to both user and item, explicitly model for all items.
- num_param_dict (Optional[Dict[str, int]]): variable type to number of parameters dictionary with keys exactly the same
- as the `coef_variation_dict`. Values of `num_param_dict` record the number of features in each kind of variable.
- If None is supplied, num_param_dict will be a dictionary with the same keys as the `coef_variation_dict` dictionary
- and values of all ones. Defaults to None.
- """
- super(ConditionalLogitModel, self).__init__()
- if num_param_dict is None:
- num_param_dict = {key:1 for key in coef_variation_dict.keys()}
- assert coef_variation_dict.keys() == num_param_dict.keys()
- self.variable_types = list(deepcopy(num_param_dict).keys())
- self.coef_variation_dict = deepcopy(coef_variation_dict)
- self.num_param_dict = deepcopy(num_param_dict)
- self.num_items = num_items
- self.num_users = num_users
- # check number of parameters specified are all positive.
- for var_type, num_params in self.num_param_dict.items():
- assert num_params > 0, f'num_params needs to be positive, got: {num_params}.'
- # infer the number of parameters for intercept if the researcher forgets.
- if 'intercept' in self.coef_variation_dict.keys() and 'intercept' not in self.num_param_dict.keys():
- warnings.warn("'intercept' key found in coef_variation_dict but not in num_param_dict, num_param_dict['intercept'] has been set to 1.")
- self.num_param_dict['intercept'] = 1
- # construct trainable parameters.
- coef_dict = dict()
- for var_type, variation in self.coef_variation_dict.items():
- coef_dict[var_type] = Coefficient(variation=variation,
- num_items=self.num_items,
- num_users=self.num_users,
- num_params=self.num_param_dict[var_type])
- # A ModuleDict is required to properly register all trainable parameters.
- # self.parameters() will not find the coefficients if a plain Python dictionary is used instead.
- self.coef_dict = nn.ModuleDict(coef_dict)
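The defaulting and validation of `num_param_dict` in the constructor can be tried out without building a model. The following is a minimal sketch of that logic in plain Python. The helper `resolve_num_param_dict` and the variable names `price_obs` and `user_obs` are illustrative assumptions, and the sketch raises `ValueError` where the source uses `assert`.

```python
from typing import Dict, Optional


def resolve_num_param_dict(coef_variation_dict: Dict[str, str],
                           num_param_dict: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Hypothetical helper mirroring the default and the checks in __init__."""
    if num_param_dict is None:
        # default: one parameter per variable type.
        num_param_dict = {key: 1 for key in coef_variation_dict}
    if coef_variation_dict.keys() != num_param_dict.keys():
        raise ValueError('coef_variation_dict and num_param_dict must have the same keys.')
    for var_type, num_params in num_param_dict.items():
        if num_params <= 0:
            raise ValueError(f'num_params for {var_type} needs to be positive, got: {num_params}.')
    return dict(num_param_dict)


# Illustrative variable names; real keys must match the observables in the dataset.
variation = {'intercept': 'item', 'price_obs': 'constant', 'user_obs': 'item'}
defaults = resolve_num_param_dict(variation)
explicit = resolve_num_param_dict(variation, {'intercept': 1, 'price_obs': 2, 'user_obs': 3})
print(defaults)
print(explicit)
```

Because the key sets are compared before anything else, a dictionary that omits `intercept` is rejected at that check rather than being patched up later.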
-__repr__(self)
- special
- Return a string representation of the model.
- Returns:
- Type | Description
- str | the string representation of the model.
Source code in torch_choice/model/conditional_logit_model.py
- def __repr__(self) -> str:
- """Return a string representation of the model.
- Returns:
- str: the string representation of the model.
- """
- out_str_lst = ['Conditional logistic discrete choice model, expects input features:\n']
- for var_type, num_params in self.num_param_dict.items():
- out_str_lst.append(f'X[{var_type}] with {num_params} parameters, with {self.coef_variation_dict[var_type]} level variation.')
- return super().__repr__() + '\n' + '\n'.join(out_str_lst)
-forward(self, batch, manual_coef_value_dict=None)
- Forward pass of the model.
- Parameters:
- Name | Type | Description | Default
- batch | ChoiceDataset | a `ChoiceDataset` object. | required
- manual_coef_value_dict | Optional[Dict[str, torch.Tensor]] | a dictionary with keys in {'u', 'i'} etc and tensors as values. If provided, the model will force coefficient to be the provided values and compute utility conditioned on the provided coefficient values. This feature is useful when the researcher wishes to plug in particular values of coefficients and examine the utility values. If not provided, the model will use the learned coefficient values in self.coef_dict. | None
- Returns:
- Type | Description
- torch.Tensor | a tensor of shape (num_trips, num_items) whose (t, i) entry represents the utility from item i in trip t for the user involved in that trip.
Source code in torch_choice/model/conditional_logit_model.py
- def forward(self,
- batch: ChoiceDataset,
- manual_coef_value_dict: Optional[Dict[str, torch.Tensor]] = None
- ) -> torch.Tensor:
- """
- Forward pass of the model.
- Args:
- batch: a `ChoiceDataset` object.
- manual_coef_value_dict (Optional[Dict[str, torch.Tensor]], optional): a dictionary with
- keys in {'u', 'i'} etc and tensors as values. If provided, the model will force
- coefficient to be the provided values and compute utility conditioned on the provided
- coefficient values. This feature is useful when the researcher wishes to plug in particular
- values of coefficients and examine the utility values. If not provided, the model will
- use the learned coefficient values in self.coef_dict.
- Defaults to None.
- Returns:
- torch.Tensor: a tensor of shape (num_trips, num_items) whose (t, i) entry represents
- the utility from item i in trip t for the user involved in that trip.
- """
- x_dict = batch.x_dict
- if 'intercept' in self.coef_variation_dict.keys():
- # the intercept term has no input tensor, so use a tensor of ones with a single feature.
- x_dict['intercept'] = torch.ones((len(batch), self.num_items, 1), device=batch.device)
- # compute the utility from each item in each choice session.
- total_utility = torch.zeros((len(batch), self.num_items), device=batch.device)
- # for each type of variables, apply the corresponding coefficient to input x.
- for var_type, coef in self.coef_dict.items():
- total_utility += coef(
- x_dict[var_type], batch.user_index,
- manual_coef_value=None if manual_coef_value_dict is None else manual_coef_value_dict[var_type])
- assert total_utility.shape == (len(batch), self.num_items)
- if batch.item_availability is not None:
- # mask out unavailable items.
- total_utility[~batch.item_availability[batch.session_index, :]] = torch.finfo(total_utility.dtype).min / 2
- return total_utility
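To make the last step of `forward` concrete, here is a minimal sketch in plain Python of a utility computation followed by availability masking. It assumes a single variable type with a `constant` coefficient and uses nested lists in place of tensors. The helper `masked_utility` is hypothetical, and `-sys.float_info.max / 2` stands in for `torch.finfo(total_utility.dtype).min / 2`.

```python
import sys
from typing import List, Optional


def masked_utility(x: List[List[List[float]]],
                   beta: List[float],
                   availability: Optional[List[List[bool]]] = None) -> List[List[float]]:
    """Hypothetical helper.

    x[t][i] holds the K features of item i in choice instance t,
    beta holds one constant coefficient per feature,
    availability[t][i] is False when item i cannot be chosen in instance t.
    """
    # a very large negative number that behaves like -inf inside a softmax.
    mask_value = -sys.float_info.max / 2
    utility = []
    for t, row in enumerate(x):
        utility_row = []
        for i, features in enumerate(row):
            u = sum(f * b for f, b in zip(features, beta))
            if availability is not None and not availability[t][i]:
                u = mask_value
            utility_row.append(u)
        utility.append(utility_row)
    return utility


x = [[[1.0, 2.0], [0.5, 0.0], [3.0, 1.0]]]   # one choice instance, three items, K = 2
beta = [2.0, -1.0]
availability = [[True, False, True]]          # the second item is unavailable
result = masked_utility(x, beta, availability)
print(result)
```

A finite mask value rather than `-inf` keeps later arithmetic free of NaNs while still driving the probability of a masked item to zero.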
-negative_log_likelihood(self, batch, y, is_train=True)
- Computes the negative log-likelihood for the batch and label.
- TODO: consider removing y and changing to label.
- TODO: consider moving this method outside the model; the role of the model is to compute the utility.
- Parameters:
- Name | Type | Description | Default
- batch | ChoiceDataset | a ChoiceDataset object containing the data. | required
- y | torch.Tensor | the label. | required
- is_train | bool | whether to put the model in training mode (`self.train()`) rather than evaluation mode (`self.eval()`) before the forward pass. | True
- Returns:
- Type | Description
- torch.Tensor | the negative log-likelihood.
Source code in torch_choice/model/conditional_logit_model.py
- def negative_log_likelihood(self, batch: ChoiceDataset, y: torch.Tensor, is_train: bool=True) -> torch.Tensor:
- """Computes the negative log-likelihood for the batch and label.
- TODO: consider removing y and changing to label.
- TODO: consider moving this method outside the model; the role of the model is to compute the utility.
- Args:
- batch (ChoiceDataset): a ChoiceDataset object containing the data.
- y (torch.Tensor): the label.
- is_train (bool, optional): whether to put the model in training mode (`self.train()`) rather than
- evaluation mode (`self.eval()`) before the forward pass. Defaults to True.
- Returns:
- torch.Tensor: the negative log-likelihood.
- """
- if is_train:
- self.train()
- else:
- self.eval()
- # (num_trips, num_items)
- total_utility = self.forward(batch)
- logP = torch.log_softmax(total_utility, dim=1)
- nll = - logP[torch.arange(len(y)), y].sum()
- return nll
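The negative log-likelihood above is a row-wise log-softmax of the utilities, indexed at the chosen item and summed over choice instances. The following is a minimal sketch of the same arithmetic in plain Python; the function names are illustrative and nested lists replace tensors.

```python
import math
from typing import List


def log_softmax(row: List[float]) -> List[float]:
    """Numerically stable log-softmax of one row of utilities."""
    m = max(row)
    log_sum_exp = m + math.log(sum(math.exp(u - m) for u in row))
    return [u - log_sum_exp for u in row]


def negative_log_likelihood(utility: List[List[float]], y: List[int]) -> float:
    """Sum (not average) of -log P[t, y[t]] over all choice instances t."""
    return -sum(log_softmax(row)[label] for row, label in zip(utility, y))


utility = [[0.0, 1.0, 2.0],
           [1.0, 1.0, 1.0]]
y = [2, 0]  # item chosen in each choice instance
nll = negative_log_likelihood(utility, y)
print(nll)
```

Because the result is a sum, its magnitude scales with the batch size; divide by `len(y)` when comparing batches of different sizes.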
-summary(self)
- Print out the current model parameters.
- - -
- nested_logit_model
- Implementation of the nested logit model; see page 86 of the book "Discrete Choice Methods with Simulation" by Train for more details.
- Author: Tianyu Du
- Update: Apr. 28, 2022
- - - -
-NestedLogitModel (Module)
- Source code in torch_choice/model/nested_logit_model.py
- class NestedLogitModel(nn.Module):
- def __init__(self,
- category_to_item: Dict[object, List[int]],
- category_coef_variation_dict: Dict[str, str],
- category_num_param_dict: Dict[str, int],
- item_coef_variation_dict: Dict[str, str],
- item_num_param_dict: Dict[str, int],
- num_users: Optional[int]=None,
- shared_lambda: bool=False
- ) -> None:
- """Initialization method of the nested logit model.
- Args:
- category_to_item (Dict[object, List[int]]): a dictionary mapping a category ID to a list
- of IDs of the items in that category.
- category_coef_variation_dict (Dict[str, str]): a dictionary mapping a variable type
- (i.e., variable group) to the level of variation for the coefficient of this type
- of variables.
- category_num_param_dict (Dict[str, int]): a dictionary mapping a variable type name to
- the number of parameters in this variable group.
- item_coef_variation_dict (Dict[str, str]): the same as category_coef_variation_dict but
- for item features.
- item_num_param_dict (Dict[str, int]): the same as category_num_param_dict but for item
- features.
- num_users (Optional[int], optional): number of users to be modelled; this is only
- required if any variable type requires user-specific variations.
- Defaults to None.
- shared_lambda (bool): a boolean indicating whether to enforce the elasticity lambda, which
- is the coefficient for inclusive values, to be constant for all categories.
- The lambda enters the category-level selection as the following
- Utility of choosing category k = lambda * inclusive value of category k
- + linear combination of some other category level features
- If set to True, a single lambda will be learned for all categories, otherwise, the
- model learns an individual lambda for each category.
- Defaults to False.
- """
- super(NestedLogitModel, self).__init__()
- self.category_to_item = category_to_item
- self.category_coef_variation_dict = category_coef_variation_dict
- self.category_num_param_dict = category_num_param_dict
- self.item_coef_variation_dict = item_coef_variation_dict
- self.item_num_param_dict = item_num_param_dict
- self.num_users = num_users
- self.categories = list(category_to_item.keys())
- self.num_categories = len(self.categories)
- self.num_items = sum(len(items) for items in category_to_item.values())
- # category coefficients.
- self.category_coef_dict = self._build_coef_dict(self.category_coef_variation_dict,
- self.category_num_param_dict,
- self.num_categories)
- # item coefficients.
- self.item_coef_dict = self._build_coef_dict(self.item_coef_variation_dict,
- self.item_num_param_dict,
- self.num_items)
- self.shared_lambda = shared_lambda
- if self.shared_lambda:
- self.lambda_weight = nn.Parameter(torch.ones(1), requires_grad=True)
- else:
- self.lambda_weight = nn.Parameter(torch.ones(self.num_categories) / 2, requires_grad=True)
- # breakpoint()
- # self.iv_weights = nn.Parameter(torch.ones(1), requires_grad=True)
- # used to warn users if forgot to call clamp.
- self._clamp_called_flag = True
- @property
- def num_params(self) -> int:
- """Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied
- with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no
- intercept is involved.
- Returns:
- int: the total number of learnable parameters.
- """
- return sum(w.numel() for w in self.parameters())
- def _build_coef_dict(self,
- coef_variation_dict: Dict[str, str],
- num_param_dict: Dict[str, int],
- num_items: int) -> nn.ModuleDict:
- """Builds a coefficient dictionary containing all trainable components of the model, mapping coefficient names
- to the corresponding Coefficient Module.
- num_items could be the actual number of items or the number of categories, depending on the use case.
- NOTE: torch-choice users don't directly interact with this method.
- Args:
- coef_variation_dict (Dict[str, str]): a dictionary mapping coefficient names (e.g., theta_user) to the level
- of variation (e.g., 'user').
- num_param_dict (Dict[str, int]): a dictionary mapping coefficient names to the number of parameters in this
- coefficient. Be aware that, for example, if there is one K-dimensional coefficient for every user, then
- the `num_param` should be K instead of K x number of users.
- num_items (int): the total number of items in the prediction problem. `num_items` should be the number of
- categories if _build_coef_dict() is used for category-level prediction.
- Returns:
- nn.ModuleDict: a PyTorch ModuleDict object mapping from coefficient names to training Coefficient.
- """
- coef_dict = dict()
- for var_type, variation in coef_variation_dict.items():
- num_params = num_param_dict[var_type]
- coef_dict[var_type] = Coefficient(variation=variation,
- num_items=num_items,
- num_users=self.num_users,
- num_params=num_params)
- return nn.ModuleDict(coef_dict)
- # def _check_input_shapes(self, category_x_dict, item_x_dict, user_index, item_availability) -> None:
- # T = list(category_x_dict.values())[0].shape[0] # batch size.
- # for var_type, x_category in category_x_dict.items():
- # x_item = item_x_dict[var_type]
- # assert len(x_category.shape) == len(x_item.shape) == 3
- # assert x_category.shape[0] == x_item.shape[0]
- # assert x_category.shape == (T, self.num_categories, self.category_num_param_dict[var_type])
- # assert x_item.shape == (T, self.num_items, self.item_num_param_dict[var_type])
- # if (user_index is not None) and (self.num_users is not None):
- # assert user_index.shape == (T,)
- # if item_availability is not None:
- # assert item_availability.shape == (T, self.num_items)
- def forward(self, batch: ChoiceDataset) -> torch.Tensor:
- """A standard forward method for the model: the user feeds a ChoiceDataset batch and the model returns the
- predicted log-probability tensor. The main forward pass happens in the _forward() method, but we provide
- this wrapper forward() method for a cleaner API, as forward() only requires a single batch argument.
- For more details about the forward pass, please refer to the _forward() method.
- # TODO: the ConditionalLogitModel returns predicted utility; should the NestedLogitModel behave the same?
- Args:
- batch (ChoiceDataset): a ChoiceDataset object containing the data batch.
- Returns:
- torch.Tensor: a tensor of shape (num_trips, num_items) including the log probability
- of choosing item i in trip t.
- """
- return self._forward(batch['category'].x_dict,
- batch['item'].x_dict,
- batch['item'].user_index,
- batch['item'].item_availability)
- def _forward(self,
- category_x_dict: Dict[str, torch.Tensor],
- item_x_dict: Dict[str, torch.Tensor],
- user_index: Optional[torch.LongTensor] = None,
- item_availability: Optional[torch.BoolTensor] = None
- ) -> torch.Tensor:
- """Computes log P[t, i] = the log probability for the user involved in trip t to choose item i.
- Let n denote the ID of the user involved in trip t, then P[t, i] = P_{ni} on page 86 of the
- book "Discrete Choice Methods with Simulation" by Train.
- Args:
- category_x_dict (Dict[str, torch.Tensor]): a dictionary mapping each variable type to a tensor with
- shape (num_trips, num_categories, *) including features of all categories in each trip.
- item_x_dict (Dict[str, torch.Tensor]): a dictionary mapping each variable type to a tensor with
- shape (num_trips, num_items, *) including features of all items in each trip.
- user_index (torch.LongTensor): a tensor of shape (num_trips,) indicating which user is
- making decision in each trip. Setting user_index = None assumes the same user is
- making decisions in all trips.
- item_availability (torch.BoolTensor): a boolean tensor with shape (num_trips, num_items)
- indicating the availability of items in each trip. If item_availability[t, i] = False,
- the utility of choosing item i in trip t, V[t, i], will be set to a very large negative
- value (`torch.finfo(Y.dtype).min / 2`), which acts as -inf.
- Given the decomposition V[t, i] = W[t, k(i)] + Y[t, i] + eps, this is done by masking
- Y[t, i] for unavailable items.
- Returns:
- torch.Tensor: a tensor of shape (num_trips, num_items) including the log probability
- of choosing item i in trip t.
- """
- if self.shared_lambda:
- self.lambdas = self.lambda_weight.expand(self.num_categories)
- else:
- self.lambdas = self.lambda_weight
- # if not self._clamp_called_flag:
- # warnings.warn('Did you forget to call clamp_lambdas() after optimizer.step()?')
- # The overall utility of item can be decomposed into V[item] = W[category] + Y[item] + eps.
- T = list(item_x_dict.values())[0].shape[0]
- device = list(item_x_dict.values())[0].device
- # compute category-specific utility with shape (T, num_categories).
- W = torch.zeros(T, self.num_categories).to(device)
- if 'intercept' in self.category_coef_variation_dict.keys():
- category_x_dict['intercept'] = torch.ones((T, self.num_categories, 1)).to(device)
- for var_type, coef in self.category_coef_dict.items():
- W += coef(category_x_dict[var_type], user_index)
- # compute item-specific utility (T, num_items).
- Y = torch.zeros(T, self.num_items).to(device)
- for var_type, coef in self.item_coef_dict.items():
- Y += coef(item_x_dict[var_type], user_index)
- if item_availability is not None:
- Y[~item_availability] =torch.finfo(Y.dtype).min / 2
- # =============================================================================
- # compute the inclusive value of each category.
- inclusive_value = dict()
- for k, Bk in self.category_to_item.items():
- # for nest k, divide the Y of all items in Bk by lambda_k.
- Y[:, Bk] /= self.lambdas[k]
- # compute inclusive value for category k.
- # unavailable items were masked out above, so they contribute almost nothing to the sum.
- inclusive_value[k] = torch.logsumexp(Y[:, Bk], dim=1, keepdim=False) # (T,)
- # broadcast inclusive value from (T, num_categories) to (T, num_items).
- # for trip t, I[t, i] is the inclusive value of the category item i belongs to.
- I = torch.zeros(T, self.num_items).to(device)
- for k, Bk in self.category_to_item.items():
- I[:, Bk] = inclusive_value[k].view(-1, 1) # (T, |Bk|)
- # logP_item[t, i] = log P(ni|Bk), where Bk is the category item i is in, n is the user in trip t.
- logP_item = Y - I # (T, num_items)
- # =============================================================================
- # logP_category[t, i] = log P(Bk), for item i in trip t, the probability of choosing the nest/bucket
- # item i belongs to. logP_category has shape (T, num_items)
- # logit[t, i] = W[n, k] + lambda[k] I[n, k], where n is the user involved in trip t, k is
- # the category item i belongs to.
- logit = torch.zeros(T, self.num_items).to(device)
- for k, Bk in self.category_to_item.items():
- logit[:, Bk] = (W[:, k] + self.lambdas[k] * inclusive_value[k]).view(-1, 1) # (T, |Bk|)
- # only count each category once in the logsumexp within the category level model.
- cols = [x[0] for x in self.category_to_item.values()]
- logP_category = logit - torch.logsumexp(logit[:, cols], dim=1, keepdim=True)
- # =============================================================================
- # compute the joint log P_{ni} as in the textbook.
- logP = logP_item + logP_category
- self._clamp_called_flag = False
- return logP
- def log_likelihood(self, *args):
- """Computes the log likelihood of the model, please refer to the negative_log_likelihood() method.
- Returns:
- torch.Tensor: the log likelihood of the model.
- """
- return - self.negative_log_likelihood(*args)
- def negative_log_likelihood(self,
- batch: ChoiceDataset,
- y: torch.LongTensor,
- is_train: bool=True) -> torch.scalar_tensor:
- """Computes the negative log likelihood of the model. Please note the log-likelihood is summed over all samples
- in batch instead of the average.
- Args:
- batch (ChoiceDataset): the ChoiceDataset object containing the data.
- y (torch.LongTensor): the label.
- is_train (bool, optional): which mode of the model to be used for the forward passing, if we need Hessian
- of the NLL through auto-grad, `is_train` should be set to True. If we merely need a performance metric,
- then `is_train` can be set to False for better performance.
- Defaults to True.
- Returns:
- torch.scalar_tensor: the negative log likelihood of the model.
- """
- # compute the negative log-likelihood loss directly.
- if is_train:
- self.train()
- else:
- self.eval()
- # (num_trips, num_items)
- logP = self.forward(batch)
- nll = - logP[torch.arange(len(y)), y].sum()
- return nll
- # def clamp_lambdas(self):
- # """
- # Restrict values of lambdas to 0 < lambda <= 1 to guarantee the utility maximization property
- # of the model.
- # This method should be called every time after optimizer.step().
- # We add a self._clamp_called_flag to remind researchers if this method is not called.
- # """
- # for k in range(len(self.lambdas)):
- # self.lambdas[k] = torch.clamp(self.lambdas[k], 1e-5, 1)
- # self._clamp_called_flag = True
- # @staticmethod
- # def add_constant(x: torch.Tensor, where: str='prepend') -> torch.Tensor:
- # """A helper function used to add constant to feature tensor,
- # x has shape (batch_size, num_classes, num_parameters),
- # returns a tensor of shape (*, num_parameters+1).
- # """
- # batch_size, num_classes, num_parameters = x.shape
- # ones = torch.ones((batch_size, num_classes, 1))
- # if where == 'prepend':
- # new = torch.cat((ones, x), dim=-1)
- # elif where == 'append':
- # new = torch.cat((x, ones), dim=-1)
- # else:
- # raise Exception
- # return new
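The two-level computation in `_forward` can be traced by hand for a single trip. The following is a minimal sketch in plain Python that follows the same steps: scale item utilities by their category's lambda, compute inclusive values, then combine the within-category and category-level log probabilities. The function names are illustrative, lists replace tensors, and integer category IDs `0..K-1` are assumed so that they can index `W` and `lambdas`.

```python
import math
from typing import Dict, List


def logsumexp(values: List[float]) -> float:
    """Numerically stable log(sum(exp(values)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def nested_logit_log_prob(W: List[float],
                          Y: List[float],
                          lambdas: List[float],
                          category_to_item: Dict[int, List[int]]) -> List[float]:
    """Log probability of each item for one trip.

    W[k] is the category-level utility, Y[i] the item-level utility,
    lambdas[k] the coefficient on the inclusive value of category k.
    """
    scaled = list(Y)
    inclusive = {}
    for k, items in category_to_item.items():
        for i in items:
            scaled[i] = Y[i] / lambdas[k]
        # inclusive value of category k.
        inclusive[k] = logsumexp([scaled[i] for i in items])
    # category-level logits, each category counted once.
    category_logit = {k: W[k] + lambdas[k] * inclusive[k] for k in category_to_item}
    denominator = logsumexp(list(category_logit.values()))
    log_p = [0.0] * len(Y)
    for k, items in category_to_item.items():
        for i in items:
            log_p_item = scaled[i] - inclusive[k]             # log P(i | category k)
            log_p_category = category_logit[k] - denominator  # log P(category k)
            log_p[i] = log_p_item + log_p_category
    return log_p


category_to_item = {0: [0, 1], 1: [2]}
log_p = nested_logit_log_prob(W=[0.0, 0.0], Y=[1.0, 2.0, 0.5],
                              lambdas=[0.5, 0.8], category_to_item=category_to_item)
print(sum(math.exp(v) for v in log_p))  # probabilities over all items sum to one
```

With every lambda equal to 1 and every `W[k]` equal to 0, the result reduces to an ordinary softmax over `Y`, which is a convenient sanity check.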
-num_params: int
- property
- readonly
- Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no intercept is involved.
- Returns:
- Type | Description
- int | the total number of learnable parameters.
-__init__(self, category_to_item, category_coef_variation_dict, category_num_param_dict, item_coef_variation_dict, item_num_param_dict, num_users=None, shared_lambda=False)
- special
- Initialization method of the nested logit model.
Parameters:
Name | Type | Description | Default
category_to_item | Dict[object, List[int]] | a dictionary mapping each category ID to the list of item IDs belonging to that category. | required
category_coef_variation_dict | Dict[str, str] | a dictionary mapping each variable type (i.e., variable group) to the level of variation of the coefficient for that type of variable. | required
category_num_param_dict | Dict[str, int] | a dictionary mapping each variable type name to the number of parameters in that variable group. | required
item_coef_variation_dict | Dict[str, str] | the same as category_coef_variation_dict, but for item features. | required
item_num_param_dict | Dict[str, int] | the same as category_num_param_dict, but for item features. | required
num_users | Optional[int] | number of users to be modelled; only required if any variable type requires user-specific variation. | None
shared_lambda | bool | whether to force the elasticity lambda, which is the coefficient for inclusive values, to be constant across all categories. Lambda enters the category-level selection as: utility of choosing category k = lambda * inclusive value of category k + linear combination of other category-level features. If True, a single lambda is learned for all categories; otherwise the model learns an individual lambda for each category. | False
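As a small numeric illustration of the category-level utility described for `shared_lambda`, the sketch below evaluates lambda * inclusive value + other category-level terms under both settings. The function name `category_utilities` and all numbers are made up for this example and are not part of the library.

```python
def category_utilities(inclusive_values, other_terms, lambdas):
    """U_k = lambda_k * I_k + W_k for every category k."""
    return [lam * iv + w for lam, iv, w in zip(lambdas, inclusive_values, other_terms)]


inclusive_values = [1.0, 2.0, 0.5]  # I_k, made-up inclusive values
other_terms = [0.1, -0.2, 0.0]      # W_k, linear combination of category-level features

# shared_lambda=True: one lambda reused for every category.
shared = category_utilities(inclusive_values, other_terms, [0.5] * 3)

# shared_lambda=False: a separate lambda per category.
separate = category_utilities(inclusive_values, other_terms, [0.5, 0.8, 0.3])
```

With a shared lambda the model has one elasticity parameter in total; with separate lambdas it has one per category.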
Source code in torch_choice/model/nested_logit_model.py
def __init__(self,
             category_to_item: Dict[object, List[int]],
             category_coef_variation_dict: Dict[str, str],
             category_num_param_dict: Dict[str, int],
             item_coef_variation_dict: Dict[str, str],
             item_num_param_dict: Dict[str, int],
             num_users: Optional[int]=None,
             shared_lambda: bool=False
             ) -> None:
    """Initialization method of the nested logit model.

    Args:
        category_to_item (Dict[object, List[int]]): a dictionary mapping each category ID to the list
            of item IDs belonging to that category.
        category_coef_variation_dict (Dict[str, str]): a dictionary mapping each variable type
            (i.e., variable group) to the level of variation of the coefficient for that type
            of variable.
        category_num_param_dict (Dict[str, int]): a dictionary mapping each variable type name to
            the number of parameters in that variable group.
        item_coef_variation_dict (Dict[str, str]): the same as category_coef_variation_dict but
            for item features.
        item_num_param_dict (Dict[str, int]): the same as category_num_param_dict but for item
            features.
        num_users (Optional[int], optional): number of users to be modelled; this is only
            required if any variable type requires user-specific variation.
            Defaults to None.
        shared_lambda (bool): a boolean indicating whether to enforce the elasticity lambda, which
            is the coefficient for inclusive values, to be constant for all categories.
            The lambda enters the category-level selection as follows:
            Utility of choosing category k = lambda * inclusive value of category k
                                             + linear combination of some other category level features
            If set to True, a single lambda will be learned for all categories; otherwise, the
            model learns an individual lambda for each category.
            Defaults to False.
    """
    super(NestedLogitModel, self).__init__()
    self.category_to_item = category_to_item
    self.category_coef_variation_dict = category_coef_variation_dict
    self.category_num_param_dict = category_num_param_dict
    self.item_coef_variation_dict = item_coef_variation_dict
    self.item_num_param_dict = item_num_param_dict
    self.num_users = num_users

    self.categories = list(category_to_item.keys())
    self.num_categories = len(self.categories)
    self.num_items = sum(len(items) for items in category_to_item.values())

    # category coefficients.
    self.category_coef_dict = self._build_coef_dict(self.category_coef_variation_dict,
                                                    self.category_num_param_dict,
                                                    self.num_categories)
    # item coefficients.
    self.item_coef_dict = self._build_coef_dict(self.item_coef_variation_dict,
                                                self.item_num_param_dict,
                                                self.num_items)

    self.shared_lambda = shared_lambda
    if self.shared_lambda:
        self.lambda_weight = nn.Parameter(torch.ones(1), requires_grad=True)
    else:
        self.lambda_weight = nn.Parameter(torch.ones(self.num_categories) / 2, requires_grad=True)
    # breakpoint()
    # self.iv_weights = nn.Parameter(torch.ones(1), requires_grad=True)
    # used to warn users if forgot to call clamp.
    self._clamp_called_flag = True
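To show how the listing above derives its bookkeeping attributes, here is a minimal sketch using plain Python lists instead of tensors. The category names, item IDs and the helper `initial_lambda` are hypothetical; only the counting expressions and the initial values (1 when shared, 1/2 per category otherwise) are taken from the listing.

```python
# Made-up mapping: two categories containing three and two items respectively.
category_to_item = {"soda": [0, 1, 2], "juice": [3, 4]}

# Same expressions as in the constructor listing.
categories = list(category_to_item.keys())
num_categories = len(categories)
num_items = sum(len(items) for items in category_to_item.values())


def initial_lambda(num_categories, shared_lambda):
    """Hypothetical stand-in for the lambda_weight initial values, as a plain list."""
    # Shared: a single value of 1. Not shared: one value of 1/2 per category.
    return [1.0] if shared_lambda else [0.5] * num_categories


print(num_categories, num_items)         # 2 5
print(initial_lambda(num_categories, True))   # [1.0]
print(initial_lambda(num_categories, False))  # [0.5, 0.5]
```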
-forward(self, batch)
A standard forward method for the model: the user feeds a ChoiceDataset batch and the model returns the predicted log-likelihood tensor. The main forward pass happens in the _forward() method; this wrapper forward() method provides a cleaner API, as forward() only requires a single batch argument. For more details about the forward pass, please refer to the _forward() method.
TODO: the ConditionalLogitModel returns predicted utility; does the NestedLogitModel behave the same?
Parameters:
Name | Type | Description | Default
batch | ChoiceDataset | a ChoiceDataset object containing the data batch. | required
Returns:
Type | Description
torch.Tensor | a tensor of shape (num_trips, num_items) including the log probability of choosing item i in trip t.
Source code in torch_choice/model/nested_logit_model.py
def forward(self, batch: ChoiceDataset) -> torch.Tensor:
    """A standard forward method for the model: the user feeds a ChoiceDataset batch and the model returns the
    predicted log-likelihood tensor. The main forward pass happens in the _forward() method, but we provide
    this wrapper forward() method for a cleaner API, as forward() only requires a single batch argument.
    For more details about the forward pass, please refer to the _forward() method.

    # TODO: the ConditionalLogitModel returns predicted utility, the NestedLogitModel behaves the same?

    Args:
        batch (ChoiceDataset): a ChoiceDataset object containing the data batch.

    Returns:
        torch.Tensor: a tensor of shape (num_trips, num_items) including the log probability
        of choosing item i in trip t.
    """
    return self._forward(batch['category'].x_dict,
                         batch['item'].x_dict,
                         batch['item'].user_index,
                         batch['item'].item_availability)
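For intuition about what a (num_trips, num_items) tensor of log probabilities contains, the sketch below computes one row of it using the textbook nested logit decomposition log P(i) = log P(k) + log P(i | k). This is an assumption about the general model class, not a transcription of `_forward()`, which is not shown here and may differ in details such as availability masking. The function names and inputs are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def nested_logit_log_prob(item_utility, category_utility, category_to_item, lambdas):
    """Return {item: log P(item)} for one choice instance (textbook nested logit)."""
    # Inclusive value of each category: log-sum-exp of scaled item utilities.
    inclusive = {}
    for k, items in category_to_item.items():
        inclusive[k] = logsumexp([item_utility[i] / lambdas[k] for i in items])

    # Category-level utility: lambda * inclusive value + other category-level terms.
    cat_score = {k: category_utility[k] + lambdas[k] * inclusive[k] for k in category_to_item}
    log_denom = logsumexp(list(cat_score.values()))

    log_prob = {}
    for k, items in category_to_item.items():
        for i in items:
            log_p_item_given_cat = item_utility[i] / lambdas[k] - inclusive[k]
            log_prob[i] = (cat_score[k] - log_denom) + log_p_item_given_cat
    return log_prob


category_to_item = {"a": [0, 1], "b": [2]}
item_utility = {0: 0.2, 1: -0.1, 2: 0.4}
log_prob = nested_logit_log_prob(item_utility, {"a": 0.0, "b": 0.3},
                                 category_to_item, {"a": 0.5, "b": 0.5})
```

Exponentiating and summing the returned values gives 1, and with every lambda equal to 1 and no category-level terms the result reduces to an ordinary softmax over item utilities.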
-log_likelihood(self, *args)
Computes the log likelihood of the model; please refer to the negative_log_likelihood() method.
Returns:
Type | Description
_type_ | the log likelihood of the model.
Source code in torch_choice/model/nested_logit_model.py
-negative_log_likelihood(self, batch, y, is_train=True)
Computes the negative log likelihood of the model. Note that the log-likelihood is summed over all samples in the batch rather than averaged.
Parameters:
Name | Type | Description | Default
batch | ChoiceDataset | the ChoiceDataset object containing the data. | required
y | torch.LongTensor | the label. | required
is_train | bool | which mode of the model to use for the forward pass. If the Hessian of the NLL is needed through auto-grad, `is_train` should be set to True. If only a performance metric is needed, `is_train` can be set to False for better performance. | True
Returns:
Type | Description
torch.scalar_tensor | the negative log likelihood of the model.
Source code in torch_choice/model/nested_logit_model.py
def negative_log_likelihood(self,
                            batch: ChoiceDataset,
                            y: torch.LongTensor,
                            is_train: bool=True) -> torch.scalar_tensor:
    """Computes the negative log likelihood of the model. Please note the log-likelihood is summed over all samples
    in batch instead of the average.

    Args:
        batch (ChoiceDataset): the ChoiceDataset object containing the data.
        y (torch.LongTensor): the label.
        is_train (bool, optional): which mode of the model to be used for the forward passing, if we need Hessian
            of the NLL through auto-grad, `is_train` should be set to True. If we merely need a performance metric,
            then `is_train` can be set to False for better performance.
            Defaults to True.

    Returns:
        torch.scalar_tensor: the negative log likelihood of the model.
    """
    # compute the negative log-likelihood loss directly.
    if is_train:
        self.train()
    else:
        self.eval()
    # (num_trips, num_items)
    logP = self.forward(batch)
    nll = - logP[torch.arange(len(y)), y].sum()
    return nll
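The indexing step `logP[torch.arange(len(y)), y].sum()` picks, for each row, the log probability of the item that was actually chosen and sums the results. A minimal sketch with nested Python lists in place of tensors is below; the probabilities and labels are made up for illustration.

```python
import math

# Made-up log probabilities for 2 trips over 3 items, shape (num_trips, num_items).
log_p = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.6), math.log(0.3)],
]
y = [0, 1]  # index of the chosen item in each trip

# Equivalent of -logP[arange(len(y)), y].sum(): select one entry per row, then sum.
nll_sum = -sum(log_p[t][y[t]] for t in range(len(y)))

# Because the result is a sum, divide by the number of rows to compare across batch sizes.
nll_mean = nll_sum / len(y)
```

Since the method returns a sum, the value grows with batch size; dividing by `len(y)` as above is one way to obtain a per-observation figure.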