Updated the docs
KevinMusgrave committed Mar 30, 2022
1 parent d54e782 commit 3c429eb
Showing 4 changed files with 36 additions and 8 deletions.
2 changes: 1 addition & 1 deletion docs/accuracy_calculation.md
@@ -24,7 +24,7 @@ AccuracyCalculator(include=(),
* ```None```. This means k will be set to the total number of reference embeddings.
* An integer greater than 0. This means k will be set to the input integer.
* ```"max_bin_count"```. This means k will be set to ```max(bincount(reference_labels)) - self_count``` where ```self_count == 1``` if the query and reference embeddings come from the same source.
* **label_comparison_fn**: A function that compares two torch arrays of labels and returns a boolean array. The default is ```torch.eq```. If a custom function is used, then you must exclude clustering based metrics ("NMI" and "AMI"). The following is an example of a custom function for two-dimensional labels. It returns ```True``` if the 0th column matches, and the 1st column does **not** match:
* **label_comparison_fn**: A function that compares two torch arrays of labels and returns a boolean array. The default is ```torch.eq```. If a custom function is used, then you must exclude clustering based metrics ("NMI" and "AMI"). The example below shows a custom function for two-dimensional labels. It returns ```True``` if the 0th column matches, and the 1st column does **not** match.
* **device**: The device to move input tensors to. If ```None```, will default to GPUs if available.
* **knn_func**: A callable that takes in 4 arguments (```query, k, reference, embeddings_come_from_same_source```) and returns ```distances, indices```. Default is ```pytorch_metric_learning.utils.inference.FaissKNN```.
* **kmeans_func**: A callable that takes in 2 arguments (```x, nmb_clusters```) and returns a 1-d tensor of cluster assignments. Default is ```pytorch_metric_learning.utils.inference.FaissKMeans```.
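The two-dimensional ```label_comparison_fn``` described above might look like the following sketch. The function name and test tensors here are illustrative, not part of the library:

```python
import torch

# Hypothetical custom label_comparison_fn for 2-D labels:
# returns True where column 0 matches and column 1 does NOT match.
def example_label_comparison_fn(x, y):
    return (x[:, 0] == y[:, 0]) & (x[:, 1] != y[:, 1])

x = torch.tensor([[0, 1], [0, 1], [2, 3]])
y = torch.tensor([[0, 2], [1, 1], [2, 3]])
print(example_label_comparison_fn(x, y))  # tensor([ True, False, False])
```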
31 changes: 31 additions & 0 deletions docs/distances.md
@@ -67,6 +67,37 @@ def pairwise_distance(self, query_emb, ref_emb):
```


## BatchedDistance

Computes distance matrices iteratively, passing each matrix into ```iter_fn```.

```python
distances.BatchedDistance(distance, iter_fn=None, batch_size=32)
```

**Parameters**:

* **distance**: The wrapped distance function.
* **iter_fn**: This function will be called at every iteration. It receives ```(mat, s, e)``` as input, where ```mat``` is the current distance matrix, and ```s``` and ```e``` are the start and end indices of the query embeddings used to construct ```mat```.
* **batch_size**: Each distance matrix will be size ```(batch_size, len(ref_emb))```.

**Example usage**:
```python
import torch
from pytorch_metric_learning.distances import BatchedDistance, CosineSimilarity

def fn(mat, s, e):
    print(f"At query indices {s}:{e}")

distance = BatchedDistance(CosineSimilarity(), fn)
embeddings = torch.randn(128, 32)
ref_emb = torch.randn(256, 32)

# Works like a regular distance function, except nothing is returned.
# So any persistent changes need to be done in the supplied iter_fn.
# query vs query
distance(embeddings)
# query vs ref
distance(embeddings, ref_emb)
```

## CosineSimilarity
```python
distances.CosineSimilarity(**kwargs)
```
3 changes: 2 additions & 1 deletion docs/inference_models.md
@@ -119,12 +119,13 @@ Uses a [distance function](distances.md) to determine similarity between datapoints.

```python
from pytorch_metric_learning.utils.inference import CustomKNN
CustomKNN(distance)
CustomKNN(distance, batch_size=None)
```

**Parameters**:

* **distance**: A [distance function](distances.md)
* **batch_size**: If specified, k-nn will be computed incrementally. For example, if there are 50000 reference embeddings and the batch size is 32, then CustomKNN will iterate through all embeddings, using distance matrices of size (32, 50000). The final result is equal to the ```batch_size=None``` setting, but saves memory because the full (50000, 50000) matrix does not need to be computed all at once.
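The memory saving can be sketched in plain torch. This helper is hypothetical and only illustrates the incremental pattern, not the library's actual implementation:

```python
import torch

def batched_knn(query, ref, k, batch_size):
    # Compute top-k smallest distances in row batches, so the full
    # (len(query), len(ref)) matrix never exists all at once.
    dists, idxs = [], []
    for s in range(0, len(query), batch_size):
        mat = torch.cdist(query[s:s + batch_size], ref)   # (batch, len(ref))
        d, i = torch.topk(mat, k, dim=1, largest=False)   # smallest = nearest
        dists.append(d)
        idxs.append(i)
    return torch.cat(dists), torch.cat(idxs)

q = torch.randn(10, 8)
r = torch.randn(50, 8)
d, i = batched_knn(q, r, k=5, batch_size=4)
```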

8 changes: 2 additions & 6 deletions docs/losses.md
@@ -169,15 +169,11 @@ Unlike many other losses, the instance of this class can only be called as the following:

```python
from pytorch_metric_learning import losses
loss_func = losses.SomeLoss()

embeddings = torch.randn(8, 32)
labels = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])
loss_func = losses.CentroidTripletLoss()
loss = loss_func(embeddings, labels)
```

and does not allow for use of `ref_embs`, `ref_labels`. Furthermore, the labels can't imply classes with just one
embedding in it (e.g. if there was only one label with value `1` in the above example). Refer to a [previous issue](https://github.com/KevinMusgrave/pytorch-metric-learning/issues/451) about this topic.
and does not allow for use of `ref_embs`, `ref_labels`. Furthermore, there must be at least 2 embeddings associated with each label. Refer to [this issue](https://github.com/KevinMusgrave/pytorch-metric-learning/issues/451) for details.
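A quick way to verify that every label has at least 2 embeddings before calling the loss (plain torch, illustrative only):

```python
import torch

labels = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])
counts = torch.bincount(labels)
ok = bool((counts[counts > 0] >= 2).all())
print(ok)  # True: every label appears at least twice
```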

**Parameters**:

