Updated the docs
KevinMusgrave committed Mar 30, 2022
1 parent d54e782 commit 3c429eb
Showing 4 changed files with 36 additions and 8 deletions.
2 changes: 1 addition & 1 deletion docs/accuracy_calculation.md
@@ -24,7 +24,7 @@ AccuracyCalculator(include=(),
* ```None```. This means k will be set to the total number of reference embeddings.
* An integer greater than 0. This means k will be set to the input integer.
* ```"max_bin_count"```. This means k will be set to ```max(bincount(reference_labels)) - self_count``` where ```self_count == 1``` if the query and reference embeddings come from the same source.
* **label_comparison_fn**: A function that compares two torch arrays of labels and returns a boolean array. The default is ```torch.eq```. If a custom function is used, then you must exclude clustering based metrics ("NMI" and "AMI"). The following is an example of a custom function for two-dimensional labels. It returns ```True``` if the 0th column matches, and the 1st column does **not** match:
* **label_comparison_fn**: A function that compares two torch arrays of labels and returns a boolean array. The default is ```torch.eq```. If a custom function is used, then you must exclude clustering based metrics ("NMI" and "AMI"). The example below shows a custom function for two-dimensional labels. It returns ```True``` if the 0th column matches, and the 1st column does **not** match.
* **device**: The device to move input tensors to. If ```None```, will default to GPUs if available.
* **knn_func**: A callable that takes in 4 arguments (```query, k, reference, embeddings_come_from_same_source```) and returns ```distances, indices```. Default is ```pytorch_metric_learning.utils.inference.FaissKNN```.
* **kmeans_func**: A callable that takes in 2 arguments (```x, nmb_clusters```) and returns a 1-d tensor of cluster assignments. Default is ```pytorch_metric_learning.utils.inference.FaissKMeans```.
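The two-dimensional ```label_comparison_fn``` described above might look like the following sketch. The function name and test tensors here are illustrative, not part of the library:

```python
import torch

# Hypothetical custom label_comparison_fn for 2-D labels:
# returns True where column 0 matches and column 1 does NOT match.
def example_label_comparison_fn(x, y):
    return (x[:, 0] == y[:, 0]) & (x[:, 1] != y[:, 1])

x = torch.tensor([[0, 1], [0, 1], [2, 3]])
y = torch.tensor([[0, 2], [1, 1], [2, 3]])
print(example_label_comparison_fn(x, y))  # tensor([ True, False, False])
```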
31 changes: 31 additions & 0 deletions docs/distances.md
@@ -67,6 +67,37 @@ def pairwise_distance(self, query_emb, ref_emb):
```


## BatchedDistance

Computes distance matrices iteratively, passing each matrix into ```iter_fn```.

```python
distances.BatchedDistance(distance, iter_fn=None, batch_size=32)
```

**Parameters**:

* **distance**: The wrapped distance function.
* **iter_fn**: This function will be called at every iteration. It receives ```(mat, s, e)``` as input, where ```mat``` is the current distance matrix, and ```s``` and ```e``` are the start and end indices of the query embeddings used to construct ```mat```.
* **batch_size**: Each distance matrix will be size ```(batch_size, len(ref_emb))```.

**Example usage**:
```python
import torch
from pytorch_metric_learning.distances import BatchedDistance, CosineSimilarity

def fn(mat, s, e):
    print(f"At query indices {s}:{e}")

distance = BatchedDistance(CosineSimilarity(), fn)
embeddings = torch.randn(128, 32)
ref_emb = torch.randn(256, 32)

# Works like a regular distance function, except nothing is returned.
# So any persistent changes need to be done in the supplied iter_fn.
# query vs query
distance(embeddings)
# query vs ref
distance(embeddings, ref_emb)
```

## CosineSimilarity
```python
distances.CosineSimilarity(**kwargs)
```
3 changes: 2 additions & 1 deletion docs/inference_models.md
@@ -119,12 +119,13 @@ Uses a [distance function](distances.md) to determine similarity between datapoints.

```python
from pytorch_metric_learning.utils.inference import CustomKNN
CustomKNN(distance)
CustomKNN(distance, batch_size=None)
```

**Parameters**:

* **distance**: A [distance function](distances.md)
* **batch_size**: If specified, k-nn will be computed incrementally. For example, if there are 50000 reference embeddings and the batch size is 32, then CustomKNN will iterate through all embeddings, using distance matrices of size (32, 50000). The final result is equal to the ```batch_size=None``` setting, but saves memory because the full (50000, 50000) matrix does not need to be computed all at once.
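The memory saving can be sketched in plain torch. This helper is hypothetical and only illustrates the incremental pattern, not the library's actual implementation:

```python
import torch

def batched_knn(query, ref, k, batch_size):
    # Compute top-k smallest distances in row batches, so the full
    # (len(query), len(ref)) matrix never exists all at once.
    dists, idxs = [], []
    for s in range(0, len(query), batch_size):
        mat = torch.cdist(query[s:s + batch_size], ref)   # (batch, len(ref))
        d, i = torch.topk(mat, k, dim=1, largest=False)   # smallest = nearest
        dists.append(d)
        idxs.append(i)
    return torch.cat(dists), torch.cat(idxs)

q = torch.randn(10, 8)
r = torch.randn(50, 8)
d, i = batched_knn(q, r, k=5, batch_size=4)
```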

8 changes: 2 additions & 6 deletions docs/losses.md
@@ -169,15 +169,11 @@ Unlike many other losses, the instance of this class can only be called as the following:

```python
from pytorch_metric_learning import losses
loss_func = losses.SomeLoss()

embeddings = torch.randn(8, 32)
labels = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])
loss_func = losses.CentroidTripletLoss()
loss = loss_func(embeddings, labels)
```

and does not allow for use of `ref_embs`, `ref_labels`. Furthermore, the labels can't imply classes with just one
embedding in it (e.g. if there was only one label with value `1` in the above example). Refer to a [previous issue](https://github.com/KevinMusgrave/pytorch-metric-learning/issues/451) about this topic.
and does not allow for use of `ref_embs`, `ref_labels`. Furthermore, there must be at least 2 embeddings associated with each label. Refer to [this issue](https://github.com/KevinMusgrave/pytorch-metric-learning/issues/451) for details.
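A quick way to verify that every label has at least 2 embeddings before calling the loss (plain torch, illustrative only):

```python
import torch

labels = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])
counts = torch.bincount(labels)
ok = bool((counts[counts > 0] >= 2).all())
print(ok)  # True: every label appears at least twice
```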

**Parameters**:

