Commit

Merge branch 'master' of https://github.com/marsupialtail/quokka
marsupialtail committed Jun 29, 2023
2 parents 766c700 + 5b9cc64 commit a8165fd
Showing 2 changed files with 5 additions and 4 deletions.
7 changes: 4 additions & 3 deletions README.md
@@ -22,19 +22,20 @@ If you would like to support Quokka, please give us a star! 🙏

## Showcases

- * **[Tick-level backtesting](https://github.com/marsupialtail/quokka/blob/master/blog/backtest.md):** backtest a mid-high trading strategy against SIP trade stream for the last five years in 10 minutes.
+ * **[Tick-level backtesting](https://github.com/marsupialtail/quokka/blob/master/blog/backtest.md):** backtest a mid-high frequency trading strategy against SIP trade stream for the last four years in 10 minutes.

* **[Vector embedding analytics](https://blog.lancedb.com/why-dataframe-libraries-need-to-understand-vector-embeddings-291343efd5c8):** easily add new input readers in Python, like for the [Lance](https://github.com/lancedb/lance) format.

- * **Approximate quantiles for 10000 columns:** easily integrate with Arrow-compatible C++ Plugins.
+ * **[Extreme feature engineering on 10k columns](https://github.com/marsupialtail/quokka/blob/master/blog/approxquant.md):** easily integrate with Arrow-compatible C++ Plugins.

- * **TPC-H:** Several times faster than SparkSQL in many TPC-H queries. (EMR, not DBR!)
+ * **[TPC-H](https://github.com/marsupialtail/quokka/blob/master/blog/release.md):** Several times faster than SparkSQL in many TPC-H queries. (EMR, not DBR!)

<p align="center">
<img src="https://github.com/marsupialtail/quokka/blob/master/docs/docs/tpch-parquet.svg?raw=true" alt="Title"/>
</p>

* **Detect iceberg orders in quote stream (upcoming):** use complex event processing to easily detect iceberg order in the MBO stream.
* **Backtest an online learning algorithm on clickstream data (upcoming):** test incremental training algorithms on historical data.

## What is Quokka?

2 changes: 1 addition & 1 deletion blog/backtest.md
@@ -5,7 +5,7 @@ This blog post showcases a prime use case of Quokka's time series functionality

There are obvious downsides to the bar-level backtesting strategy, the most important of which is the bucketization of time. For example if you use 5-minute bars, you cannot enter the market until the end of the interval. This means if you happened to want to trade at 9:33, you have to enter the market at 9:35 if you are using 5-min bars. Your alpha might have decayed by then.
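
To make the 9:33 versus 9:35 example concrete, here is a minimal sketch of how bucketization delays an entry. It assumes bars are aligned to midnight and that you can only trade at a bar close; `next_bar_close` is a hypothetical helper written for illustration, not part of Quokka.

```python
from datetime import datetime, timedelta

def next_bar_close(signal_time: datetime, bar_minutes: int = 5) -> datetime:
    """Return the first bar close at or after signal_time.

    Assumption: bars are aligned to midnight, so 5-minute bars close
    at 9:30, 9:35, 9:40, and so on.
    """
    midnight = signal_time.replace(hour=0, minute=0, second=0, microsecond=0)
    bar = timedelta(minutes=bar_minutes)
    elapsed = signal_time - midnight
    # Ceiling division on timedeltas: a signal that lands exactly on a
    # bar boundary can trade at that boundary.
    bars_needed = -(-elapsed // bar)
    return midnight + bars_needed * bar

signal = datetime(2023, 6, 29, 9, 33)
entry = next_bar_close(signal)   # the 9:35 bar close
delay = entry - signal           # time during which the alpha can decay
print(entry.time(), delay)
```

With wider bars the delay grows: the same 9:33 signal on 15-minute bars would wait until 9:45.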

- - **Trade backtesting**: Another strategy used by more professional investors involve looking at the list of trades that happened on the exchange, and figure out what the entry price would have been for their proposed historical trades. The simplest way to do this is to find the next trade that happened right after your proposed trade, and use that trade's price as the simulated execution price. You can also use more complicated techniques like VWAP.
+ - **Trade backtesting**: Another strategy used by more professional investors involves looking at the list of trades that happened on the exchange to figure out what the entry price would have been for their proposed historical trades. The simplest way to do this is to find the next trade that happened right after your proposed trade, and use that trade's price as the simulated execution price. You can also use more complicated techniques like VWAP.

- **Book backtesting**: Even trade-level simulation can be improved upon by using order book information. What if you wanted to trade one million shares, and the flanking trades of your proposed trade at that time only had size 100? Using the execution prices of those trades will not accurately reflect the market impact of your trade, and you will most likely get a much worse execution price. The most accurate way to backtest would be to reconstruct the order book at the time of your proposed trade and compute exactly your execution across multiple levels. However, the book information is very expensive and can easily amount to hundreds of GB per day. You will also have to make assumptions like the position of your trade on the exchange queue, which might be suitable only for HFT players.
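
The two fill models described in the bullets above can be sketched in plain Python. This is an illustration under stated assumptions, not how Quokka implements it: `next_trade_price` and `walk_book` are hypothetical helpers, trade timestamps are assumed sorted ascending, and the book is assumed to be a list of `(price, size)` ask levels sorted from best to worst. Queue position and other market-impact effects are ignored.

```python
from bisect import bisect_right

def next_trade_price(trade_times, trade_prices, order_time):
    """Trade backtesting: price of the first trade strictly after order_time.

    trade_times must be sorted ascending. Returns None when no later
    trade exists.
    """
    i = bisect_right(trade_times, order_time)
    return trade_prices[i] if i < len(trade_times) else None

def walk_book(ask_levels, quantity):
    """Book backtesting: fill a buy order across ask levels, best first.

    ask_levels is a list of (price, size) sorted by ascending price.
    Returns (average_fill_price, filled_quantity); the average is None
    when nothing could be filled.
    """
    remaining = quantity
    cost = 0.0
    for price, size in ask_levels:
        take = min(remaining, size)
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    filled = quantity - remaining
    return (cost / filled if filled else None), filled

# Toy data, made up for illustration only.
times = [1, 3, 5, 8]
prices = [10.0, 10.1, 10.2, 10.3]
fill = next_trade_price(times, prices, order_time=3)   # trade at t=5

asks = [(10.0, 100), (10.5, 200), (11.0, 500)]
avg_price, filled = walk_book(asks, quantity=250)      # consumes two levels
print(fill, avg_price, filled)
```

The gap between the top-of-book price and the average price from `walk_book` is the kind of slippage that the next-trade approach cannot see when the order is larger than the flanking trades.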
