Pinned
-
Universal_Decoder (Public)
A Transformer decoder version of the model from the "Universal Transformers" (encoder-decoder) paper, trained on natural data rather than algorithmic tasks.
Python 2
-
FP_vs_1Bit_InductionHeadCircuit (Public)
Compares the full-precision query, key, and value matrices with their 1-bit counterparts in a two-layer, attention-only transformer trained on a synthetic copying task.
Python 1
-
baby_mamba (Public)
An implementation of Mamba, written to develop an understanding of how it works.
Python 2
-
BERT_with_Residual_vs_Highway (Public)
Compares the residual stream with a highway stream in transformers (BERT).
Python 3
-
test_liger_kernels (Public)
Benchmarks the performance of the Liger Kernels library on instruction-following and reasoning tasks.
Python