Commit

Documentation and small fixes
bpiwowar committed Dec 7, 2023
1 parent f5f4cb1 commit 3322f5f
Showing 14 changed files with 5 additions and 78 deletions.
2 changes: 0 additions & 2 deletions docs/source/data/index.rst
@@ -7,7 +7,5 @@ as well as its own datasets (e.g. hard negatives datasets).
.. toctree::
:maxdepth: 2

types
irds
xpmir
adapters
11 changes: 0 additions & 11 deletions docs/source/data/irds.rst

This file was deleted.

8 changes: 0 additions & 8 deletions docs/source/data/types.rst

This file was deleted.

1 change: 0 additions & 1 deletion docs/source/index.rst
@@ -55,7 +55,6 @@ Table of Contents
text/index
papers/index
pretrained
misc


Indices and tables
2 changes: 1 addition & 1 deletion docs/source/learning/index.rst
@@ -31,7 +31,7 @@ Trainers
Trainers are responsible for defining the the way to train
a learnable scorer.

.. autoxpmconfig:: xpmir.letor.trainers.Trainer
.. autoxpmconfig:: xpmir.learning.trainers.Trainer
.. autoxpmconfig:: xpmir.learning.trainers.multiple.MultipleTrainer

.. autoxpmconfig:: xpmir.letor.trainers.LossTrainer
3 changes: 2 additions & 1 deletion docs/source/letor/batchwise.rst
@@ -20,4 +20,5 @@ Samplers


.. autoxpmconfig:: BatchwiseSampler
.. autoxpmconfig:: xpmir.documents.samplers.BatchwiseRandomSpanSampler

.. autoxpmconfig:: xpmir.documents.samplers.RandomSpanSampler
1 change: 0 additions & 1 deletion docs/source/letor/index.rst
@@ -45,7 +45,6 @@ Samplers

Samplers provide samples in the form of *records*. They all inherit from:

.. autoxpmconfig:: Sampler
.. autoclass:: SerializableIterator


2 changes: 0 additions & 2 deletions docs/source/letor/pairwise.rst
@@ -4,7 +4,6 @@ Pairwise
Trainers are responsible for defining the the way to train
a learnable scorer.

.. autoxpmconfig:: xpmir.letor.trainers.Trainer
.. autoxpmconfig:: xpmir.learning.trainers.multiple.MultipleTrainer

.. autoxpmconfig:: xpmir.letor.trainers.LossTrainer
@@ -45,6 +44,5 @@ Pairwise
.. autoxpmconfig:: PairwiseSampleDatasetFromTSV
.. autoxpmconfig:: PairwiseSamplerFromTSV
.. autoxpmconfig:: ModelBasedHardNegativeSampler
.. autoxpmconfig:: TripletBasedInBatchNegativeSampler

.. autoxpmconfig:: xpmir.letor.samplers.hydrators.PairwiseTransformAdapter
1 change: 0 additions & 1 deletion docs/source/letor/pointwise.rst
@@ -4,7 +4,6 @@ Pointwise
Trainers are responsible for defining the the way to train
a learnable scorer.

.. autoxpmconfig:: xpmir.letor.trainers.Trainer
.. autoxpmconfig:: xpmir.learning.trainers.multiple.MultipleTrainer

.. autoxpmconfig:: xpmir.letor.trainers.LossTrainer
4 changes: 0 additions & 4 deletions docs/source/misc.rst

This file was deleted.

1 change: 0 additions & 1 deletion docs/source/retrieval.rst
@@ -69,7 +69,6 @@ Anserini
--------

.. autoxpmconfig:: xpmir.index.anserini.Index
.. autoxpmconfig:: xpmir.interfaces.anserini.Index
.. autoxpmconfig:: xpmir.interfaces.anserini.AnseriniRetriever
.. autoxpmconfig:: xpmir.interfaces.anserini.IndexCollection
.. autoxpmconfig:: xpmir.interfaces.anserini.SearchCollection
2 changes: 1 addition & 1 deletion requirements.txt
@@ -1,6 +1,6 @@
# Experimaestro

experimaestro>=1.2.1
experimaestro>=1.3.4
datamaestro>=0.8.13
datamaestro_text>=2023.11.22
ir_datasets
43 changes: 0 additions & 43 deletions src/xpmir/letor/samplers/__init__.py
@@ -1,4 +1,3 @@
import random
from pathlib import Path
from typing import Iterator, List, Tuple, Dict, Any
import numpy as np
@@ -344,48 +343,6 @@ def iter(random):
        return RandomSerializableIterator(self.random, iter)


class TripletBasedInBatchNegativeSampler(PairwiseSampler):
    """An in-batch negative sampler which generate the triplets,
    which use the postives of the other in batch as the negatives"""

    sampler: Param[PairwiseSampler]
    """The base pairwise sampler"""

    batch_size: Param[int]
    """How many triplets to be used for building the ibn"""

    def initialize(self, random):
        super().initialize(random)
        self.sampler.initialize(random)

    def pairwise_iter(self) -> SerializableIterator[PairwiseRecord, Any]:
        def iter(pair_iter):
            while True:
                topics = []
                positives = []
                for _, record in zip(range(self.batch_size), pair_iter):
                    topics.append(record.query)
                    positives.append(record.positive)
                all_qry = [
                    topic for topic in topics for _ in range(self.batch_size - 1)
                ]
                all_pos = [pos for pos in positives for _ in range(self.batch_size - 1)]
                pos_as_neg = positives * self.batch_size
                pos_index = [(self.batch_size + 1) * i for i in range(self.batch_size)]
                all_neg = [
                    doc for i, doc in enumerate(pos_as_neg) if i not in pos_index
                ]

                # randomize, to make the same document not gather too close
                mapping = list(zip(all_qry, all_pos, all_neg))
                for _ in range(30000):
                    random.shuffle(mapping)
                for (topic, positive, negative) in mapping:
                    yield PairwiseRecord(topic, positive, negative)

        return SerializableIteratorAdapter(self.sampler.pairwise_iter(), iter)


class PairwiseInBatchNegativesSampler(BatchwiseSampler):
    """An in-batch negative sampler constructured from a pairwise one"""

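For readers of this hunk, the triplet construction done by the removed `TripletBasedInBatchNegativeSampler` can be sketched in isolation as below. This is a simplified illustration under stated assumptions, not the xpmir code: the helper name `in_batch_triplets` and the use of plain string identifiers are hypothetical, and the shuffling step and the xpmir record and iterator types are left out.

```python
from typing import List, Tuple


def in_batch_triplets(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Build (query, positive, negative) triplets from a batch of
    (query, positive) pairs, using the positives of the *other* queries
    in the batch as negatives (hypothetical helper, for illustration only)."""
    triplets = []
    for i, (query, positive) in enumerate(pairs):
        for j, (_, other_positive) in enumerate(pairs):
            if i != j:  # skip the query's own positive
                triplets.append((query, positive, other_positive))
    return triplets


# A batch of B pairs yields B * (B - 1) triplets
pairs = [("q1", "d1"), ("q2", "d2"), ("q3", "d3")]
triplets = in_batch_triplets(pairs)
```

The nested loop plays the role of the `pos_as_neg` / `pos_index` index arithmetic in the removed code: both skip the diagonal entries where a query would be paired with its own positive.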
2 changes: 1 addition & 1 deletion src/xpmir/test/test_documented.py
@@ -9,4 +9,4 @@ def test_documented():

    analyzer.analyze()
    analyzer.report()
    analyzer.assert_no_undocumented()
    analyzer.assert_valid_documentation()
