Vowpal Wabbit

This is the Vowpal Wabbit fast online learning code.

Why Vowpal Wabbit?

Vowpal Wabbit is a machine learning system that pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning. There is a particular focus on reinforcement learning, with several contextual bandit algorithms implemented; the online nature of the system lends itself well to these problems. Vowpal Wabbit is a destination for implementing and maturing state-of-the-art algorithms with performance in mind.

  • Input Format. The input format for the learning algorithm is substantially more flexible than might be expected. Examples can have features consisting of free-form text, which is interpreted in a bag-of-words way. There can even be multiple sets of free-form text in different namespaces (see the example after this list).
  • Speed. The learning algorithm is fast, comparable to the few other online algorithm implementations available. Several optimization algorithms are available, with the baseline being sparse gradient descent (GD) on a loss function.
  • Scalability. This is not the same as fast. The important characteristic here is that the memory footprint of the program is bounded independent of the data: the training set is not loaded into main memory before learning starts, and the size of the feature set is bounded independent of the amount of training data via the hashing trick.
  • Feature Interaction. Subsets of features can be internally paired so that the algorithm is linear in the cross-product of the subsets, which is useful for ranking problems (the `-q` flag sketched below requests such pairings). The alternative of explicitly expanding the features before feeding them into the learning algorithm can be both computation- and space-intensive, depending on how it is handled.

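As a brief illustration of the input format, here is a minimal sketch. Each example is one line of text: an optional label, followed by one or more namespaces introduced by `|`, each holding free-form features with optional explicit values (the file name `data.txt` and the namespace and feature names below are made up for this example):

```
1 |title free form headline text |stats words:112 links:4
-1 |title another headline goes here |stats words:40 links:1
```

Feature interactions and the size of the hashed feature table are controlled from the command line. For instance, `-q ts` pairs every feature in namespaces beginning with `t` with every feature in namespaces beginning with `s`, and `-b 24` bounds the feature table at 2^24 hashed slots regardless of how much data is seen:

```
vw -d data.txt -q ts -b 24
```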
Visit the wiki to learn more.

Getting Started

For the most up-to-date instructions for getting started on Windows, macOS, or Linux, please see the wiki. This includes: