
Calibration/structural estimation of retail order participation #199

Closed · sbenthall opened this issue Mar 3, 2023 · 16 comments

@sbenthall (Owner)

What are the stylized facts of retail order volume (30%?) that we are trying to match?

We can then do a calibration exercise to figure out what we can tweak to get that as an output.

Calibrate to stylized facts.

We can do the calibration with the MockMarket.

Get in touch with @wjt5121 about this ...

@sbenthall sbenthall added this to the v0.4.1 G.G. SHARK milestone Mar 3, 2023
@sbenthall sbenthall self-assigned this Mar 3, 2023
@mesalas (Collaborator) commented Mar 4, 2023

I'll try to find some references to back up the 30% value.

@mesalas (Collaborator) commented Mar 7, 2023

I've done some research, and my conclusion is that retail volume seems to be around or slightly above 20%.
I think we should try to target that number.

Here's a slide from Virtu citing the Rule 605 reports that wholesalers have to produce:
https://www.sec.gov/comments/265-28/26528-8901054-242178.pdf

Reuters has an article citing a 25%+ retail share in July and August 2021.

The head of research at Nasdaq estimated 20% in 2020:
https://www.nasdaq.com/articles/who-counts-as-a-retail-investor-2020-12-17

Forbes cites JP Morgan reporting that retail volume hit 23% at the beginning of 2023:
https://www.forbes.com/sites/dereksaul/2023/02/03/retail-trading-just-hit-an-all-time-high-heres-what-stocks-are-the-most-popular/?sh=714a70b46664

Cboe has an interesting report finding that retail volume initially targeted "meme" stocks but now targets the broader market.
The graph shows absolute volume rather than percentage of total, but they note that volume has been relatively consistent since Q2 2021 while the fraction in meme stocks has dropped.

https://www.cboe.com/insights/posts/the-evolution-of-retail-investment-activity/

This might be consistent with this report of lower retail volume, if a large fraction of the volume is traded in meme stocks outside of the index:
https://www.reuters.com/business/retail-traders-account-10-us-stock-trading-volume-morgan-stanley-2021-06-30/

@sbenthall (Owner, Author)

Awesome.

Our current levers for setting the retail investor involvement are the attention parameter and dollars_per_hark_money_unit.

I expect we might also get an interaction with broker fees once we include them: #188

I suppose we can run some calibration tests to figure out the right ballpark for parameters to get this 'stylized fact' and then sweep within that ballpark to get a sense of the effects of variations?
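A rough sketch of what such a calibration sweep could look like. Everything here is hypothetical scaffolding, not SHARKFin API: `run_simulation` stands in for one simulation run and is assumed to report retail and total dollar volume.

```python
import itertools

def retail_share(attention, dphm, run_simulation):
    # `run_simulation` is a hypothetical stand-in for one SHARKFin run;
    # it is assumed to report retail and total dollar volume.
    result = run_simulation(attention=attention, dollars_per_hark_money_unit=dphm)
    return result["retail_volume"] / result["total_volume"]

def sweep(run_simulation, attentions, dphms, target=0.20, tol=0.05):
    """Return (attention, dphm, share) triples whose retail share lands near the target."""
    hits = []
    for a, d in itertools.product(attentions, dphms):
        share = retail_share(a, d, run_simulation)
        if abs(share - target) <= tol:
            hits.append((a, d, share))
    return hits
```

Once a coarse grid locates the ballpark, a finer sweep inside it would show sensitivity to variations.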

@mesalas (Collaborator) commented Mar 13, 2023

@sbenthall (Owner, Author)

We can do the calibration step with the MockMarket.

@sbenthall (Owner, Author)

@mesalas So, what are our targets exactly?

I believe you would like:

  • a mean daily trade volume of $\mu_v$
  • with some target standard deviation $\sigma_v$

These should be chosen relative to the trading volume of other agents on the market.

Essentially I need to write a program to optimize the parameter choices with respect to a loss function reflecting these two moments. That would be ideal.
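A minimal sketch of that two-moment loss, assuming we can extract a daily volume series from a run (the weights are arbitrary placeholders, not calibrated choices):

```python
import numpy as np

def moment_loss(volumes, mu_v, sigma_v, w_mean=1.0, w_std=1.0):
    # Squared relative deviation of the simulated series' mean and standard
    # deviation from the target moments; zero means both moments are matched.
    volumes = np.asarray(volumes, dtype=float)
    mean_term = ((volumes.mean() - mu_v) / mu_v) ** 2
    std_term = ((volumes.std() - sigma_v) / sigma_v) ** 2
    return w_mean * mean_term + w_std * std_term
```

An optimizer, or plain grid search over attention and DPHM, would then minimize this.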

But ... what are the ballpark numbers I should be aiming at? In case I don't have time to solve this in a general way.

@sbenthall (Owner, Author)

Hitting this target will depend on #195 , since having more people in the population will increase the retail trade volume and make it easier to spread that activity evenly.

@mesalas (Collaborator) commented Mar 20, 2023

> But ... what are the ballpark numbers I should be aiming at? In case I don't have time to solve this in a general way.

I would go for a total volume of 40000 (on average 20000 buy and 20000 sell) and a std of about 5%

@sbenthall (Owner, Author)

Hmmm. Ok.

Doing some testing with the original WHITESHARK population, getting up to this volume means ramping up the DPHM parameter to something like 500,000.

Recall that in our Whiteshark simulation, raising this value to 15,000 (much lower) got us consistent market crashes as the retail investors overpowered the institutional investors. But that was with memory-based expectations.

I assume that for G.G. Shark we will stick with UsualExpectations. That way, trade volume will be less correlated, as it will be responding to idiosyncratic labor shocks rather than market prices.

@sbenthall (Owner, Author) commented Mar 21, 2023

@mesalas I've been working on this, and it looks like there are several reasons why calibrating the macro agents to get this realistic distribution of retail investor activity is not going to work.

I can get the volume up. Thanks to @alanlujan91's work, I can now easily make a population of 1000 macro-agents with a low attention rate (0.05), getting on average 50 agents to pay attention per day. With DPHM at 830,000, that gets a mean buy volume per day of 18,600 over one quarter. All these parameters can be scaled up and down.

The problem, simply, is that with USUAL expectations (where the agents compute their expected return from the true dividend process statistics), the agents never want to sell. They just hold onto the asset, collect the dividend, and reinvest. So the mean sell value is 0, and the standard deviation on the buy side is 0.48.

Here's what the simulation of one quarter looks like:

(figure omitted)

I think that leaves the following options for G.G. Shark:

  • Switch to the memory-based FinanceModel for agent expectations. We know this will crash the market with so much retail investor clout, so it's not really an option.

...

  • Just run noise through the broker. I.e., create a Simulation type where the broker activity is drawn from a random distribution like the one you want. We've run simulations like this with Calibration before.

Shall we do that? If so, do you want the buy/sell side to be normally distributed?
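If the answer is yes, the noise broker could be sketched like this, using the targets discussed above (20,000 per side, ~5% std). Clipping at zero is my assumption, to keep order volumes non-negative:

```python
import numpy as np

rng = np.random.default_rng(0)

def noise_broker_day(mean=20_000.0, rel_std=0.05, rng=rng):
    # One day of broker activity: independent normal draws for each side,
    # clipped at zero so volumes stay non-negative.
    buy = max(0.0, rng.normal(mean, rel_std * mean))
    sell = max(0.0, rng.normal(mean, rel_std * mean))
    return buy, sell
```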

We'll need to discuss the implications of the interaction between expectations and stylized facts of retail investor activity later.

@sbenthall (Owner, Author)

@mesalas and I have decided that for G.G. SHARK, he can just sample the broker activity on the AMMPS side, simplifying the whole thing a lot.

In the future, we should figure out what it takes to get the macro agent's trading activity to match these stylized facts.

@sbenthall (Owner, Author)

General question for @llorracc : Which 'stylized facts' (about price process, retail investor activity, etc.) and calibration information (population model, etc.) if any, are targets for FIRE SHARK?

We should enumerate these empirics and/or assumptions in a separate file, then refer to it when doing SE and/or configuration for the FIRE SHARK experiments.

(These could be different for Good? Shark and so doing this rigorously will keep the framework supple.)

@llorracc commented Mar 21, 2023 via email

@sbenthall (Owner, Author)

Sorry, the context shift may be confusing. The issue about mNrmStE is #205.

This is #199, which I expect is a totally separate topic.

SHARKFin already does a lot of what you describe here. I'm not asking you about how to get data out of the solution or simulation.

I'm asking what (if any) empirical facts (such as: no autocorrelation of stock returns; what percent of order volume is filled by retail orders; what ex ante heterogeneity is there in the population of the U.S.A.) are important for FIRE SHARK (specifically: the Economics publication we have been building towards).

My understanding is that we would use these facts in one of two ways:

  • Entering them as calibration to the model (as we do with ex ante population parameters), or
  • Estimate (through search, etc.) the parameters needed for the simulation to match the moments of these facts

But both of these approaches, important for validating the model, require a stock of facts that we care about matching.
I imagine which facts matter depends on disciplinary taste.

So, I'm asking: which facts should we be using to calibrate and validate FIRE SHARK?
It's not an urgent question, but answering it systematically would help us avoid what just happened in G.G. SHARK: a scramble to achieve a calibration that turned out not to be possible under our model assumptions.

@sbenthall (Owner, Author)

CDC says:

For the first step, calibrate:

  • The variance and mean return of dividends should match the variance and mean return of prices, taken from the S&P 500.
  • Lognormal dividend process.
  • To start with, just pure Lucas agents -- homogeneous, where there's no labor income.

What we are not matching in the first round:

  • prices are more volatile than fundamentals (dividends) (Campbell-Shiller literature)
  • heteroskedasticity
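For the lognormal dividend process, the log-space parameters can be backed out from the target mean and variance of gross returns by inverting the lognormal moment formulas: if $E$ and $V$ are the target mean and variance, then $s^2 = \ln(1 + V/E^2)$ and $m = \ln E - s^2/2$. A sketch (the S&P 500 target numbers themselves would be inputs; nothing here fixes them):

```python
import math

def lognormal_params(mean_gross, var_gross):
    # Invert the lognormal moment formulas: if X ~ LogNormal(m, s), then
    # E[X] = exp(m + s^2/2) and Var[X] = (exp(s^2) - 1) * exp(2m + s^2).
    s2 = math.log(1.0 + var_gross / mean_gross**2)
    m = math.log(mean_gross) - s2 / 2.0
    return m, math.sqrt(s2)
```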

@sbenthall
Copy link
Owner Author

@alanlujan91 makes the point that we should be attentive to the starting wealth levels: wealthier agents might sell more.

3 participants