Calibration/structural estimation of retail order participation #199
Comments
I'll try to find some references to back up the 30% value. |
Have done some research, and my conclusion is that retail volume seems to be around or slightly above 20%:

- A slide from Virtu, citing the 605 reports that wholesalers have to produce.
- A Reuters article from 2021 citing 25+% in July and August 2021.
- The head of research at NASDAQ estimates 20% in 2020.
- Forbes, citing JP Morgan, reports that retail volume hit 23% at the beginning of 2023.
- Cboe has an interesting report finding that retail volume initially targeted "meme" stocks but now targets the broader market: https://www.cboe.com/insights/posts/the-evolution-of-retail-investment-activity/

This might be consistent with a report of lower retail volume if a large fraction of the volume is traded in meme stocks outside of the index. |
Awesome. Our current levers for setting the retail investor involvement are the … I expect we might also get an interaction with broker fees once we include them: #188. I suppose we can run some calibration tests to figure out the right ballpark for parameters to get this 'stylized fact', and then sweep within that ballpark to get a sense of the effects of variations? |
We can do the calibration step with the MockMarket. |
@mesalas So, what are our targets exactly? I believe you would like:
These should be chosen relative to the trading volume of other agents on the market. Essentially, I need to write a program to optimize the parameter choices with respect to a loss function reflecting these two moments. That would be ideal. But what are the ballpark numbers I should be aiming at, in case I don't have time to solve this in a general way? |
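The program described above could take the shape of a standard moment-matching loss. A minimal sketch, where `moment_loss`, `simulate`, and the target constants are all hypothetical placeholders (the real targets and simulator interface are still under discussion in this thread):

```python
import numpy as np

# Placeholder targets from the discussion: ~20,000 per side per day,
# with about 5% relative standard deviation. Not calibrated values.
TARGET_MEAN_VOLUME = 20_000
TARGET_REL_STD = 0.05

def moment_loss(params, simulate):
    """Squared relative error over the two target moments.

    `simulate` stands in for a function that runs the market (e.g. the
    MockMarket) with the given parameters and returns a 1-D array of
    daily retail volumes for one side of the book.
    """
    volumes = simulate(params)
    mean_err = (volumes.mean() - TARGET_MEAN_VOLUME) / TARGET_MEAN_VOLUME
    std_err = volumes.std() / volumes.mean() - TARGET_REL_STD
    return mean_err**2 + std_err**2
```

A routine like `scipy.optimize.minimize` (or a plain grid sweep over the ballpark) could then drive `moment_loss` toward zero.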
Hitting this target will depend on #195 , since having more people in the population will increase the retail trade volume and make it easier to spread that activity evenly. |
I would go for a total volume of 40000 (on average 20000 buy and 20000 sell) and a std of about 5% |
Hmmm. Ok. Doing some testing with the original WHITESHARK population, getting up to this volume means ramping up the DPHM parameter to something like 500,000. Recall that in our Whiteshark simulation, raising this value to 15,000 (much lower) got us consistent market crashes as the retail investors overpowered the institutional investors. But that was with memory-based expectations. I assume that for G.G. Shark we will stick with UsualExpectations. That way, trade volume will be less correlated, as it will be responding to idiosyncratic labor shocks rather than market prices. |
@mesalas I've been working on this, and it looks like there are several reasons why calibrating the macro agents to get this realistic distribution of retail investor activity is not going to work. I can get the volume up. Thanks to @alanlujan91 's work, I can now easily make a population of 1000 macro-agents with a low attention rate (0.05), getting on average 50 agents to pay attention per day. With DPHM at 830,000, that gets a mean buy volume per day of 18,600 over one quarter. All these parameters can be scaled up and down. The problem, simply, is that with USUAL expectations (where the agents compute their expected return from the true dividend process statistics), the agents never want to sell. They just hold onto the asset, collect the dividend, and reinvest. So the mean sell value is 0, and the standard deviation on the buy side is 0.48. Here's what the simulation of one quarter looks like: I think that leaves the following options for G.G. Shark:
...
Shall we do that? If so, do you want the buy/sell side to be normally distributed? We'll need to discuss the implications of the interaction between expectations and stylized facts of retail investor activity later. |
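The population/attention arithmetic in the comment above (1000 agents at attention rate 0.05, giving ~50 attending agents per day) can be double-checked under the assumption that attention is an independent Bernoulli draw per agent per day; that distributional assumption is for this sketch only, not necessarily the simulator's actual mechanism:

```python
import numpy as np

N_AGENTS = 1_000        # population size from the comment above
ATTENTION_RATE = 0.05   # probability an agent pays attention on a given day

# Expected number of attending agents per day under a Bernoulli model.
expected_attending = N_AGENTS * ATTENTION_RATE  # 50.0

# Simulated check over one quarter (~63 trading days).
rng = np.random.default_rng(42)
attending = rng.binomial(N_AGENTS, ATTENTION_RATE, size=63)
```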
@mesalas and I have decided that for G.G. SHARK, he can just sample the broker activity on the AMMPS side, simplifying the whole thing a lot. In the future, we should figure out what it takes to get the macro agent's trading activity to match these stylized facts. |
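Sampling the broker activity, as decided above, could amount to drawing daily buy/sell volumes from a distribution fitted to the target moments. A minimal sketch with a hypothetical helper (`sample_retail_volume`) and a normal-truncated-at-zero distributional choice that the thread has not actually settled on:

```python
import numpy as np

def sample_retail_volume(rng, mean=20_000.0, rel_std=0.05, days=1):
    """Draw daily one-side retail volumes: normal, truncated at zero.

    `mean` and `rel_std` are placeholder targets from the discussion
    (20,000 per side, ~5% relative std), not calibrated values.
    """
    draws = rng.normal(mean, rel_std * mean, size=days)
    return np.clip(draws, 0.0, None)
```

Whether the buy and sell sides should be drawn independently, or correlated with each other and with market conditions, is exactly the open question deferred to the future work mentioned above.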
General question for @llorracc : Which 'stylized facts' (about price process, retail investor activity, etc.) and calibration information (population model, etc.) if any, are targets for FIRE SHARK? We should enumerate these empirics and/or assumptions in a separate file, then refer to it when doing SE and/or configuration for the FIRE SHARK experiments. (These could be different for Good? Shark and so doing this rigorously will keep the framework supple.) |
I'm a bit confused what you are asking. `mNrmStE` is from the solution stage. Some of the other things you mention are from the simulation stage.

For the results from the converged `.solution[0]`, I can't see any reason to pare them down. The storage space occupied by these parameters will be trivial, and it would add extra work and complication if I removed one that we later realized we might need.

For the simulation stage, we will want to start by keeping the entire history of dates (I guess counting up from the first period after the `burn-in` period; we may as well call that period [0]). So, I guess the thing to do would be to store a snapshot of all the agent's state variables at every date at which the agent observes prices. If we need to pare that down, then I'd suggest a subtractive procedure: make an automated list of the agent's state and parameter values, and then allow us to configure a method like `excise([item],[simulation])`.
- Chris Carroll
|
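The snapshot-and-excise idea from the email above could be sketched as follows. `SimulationHistory`, `record`, and the dict-per-date layout are hypothetical, and `excise` follows the suggested `excise([item],[simulation])` signature minus the `[simulation]` argument:

```python
class SimulationHistory:
    """Stores a snapshot of agent state variables at each observation date.

    Dates count up from 0, the first period after burn-in, as suggested
    in the comment above. `excise` is the proposed subtractive method:
    drop one variable from every stored snapshot.
    """

    def __init__(self):
        self.snapshots = []  # one dict of {variable: value} per date

    def record(self, state):
        self.snapshots.append(dict(state))  # copy to freeze the snapshot

    def excise(self, item):
        for snap in self.snapshots:
            snap.pop(item, None)
```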
Sorry, the context shift may be confusing. The issue about mNrmStE is #205. This is #199, which I expect is a totally separate topic. SHARKFin already does a lot of what you describe here. I'm not asking you about how to get data out of the solution or simulation. I'm asking what (if any) empirical facts (such as: no autocorrelation of stock returns; what percent of order volume is filled by retail orders; what ex ante heterogeneity is there in the population of the U.S.A.) are important for FIRE SHARK (specifically: the Economics publication we have been building towards). My understanding is that we would use these facts in one of two ways:
But both of these approaches, important for validating the model, require a stock of facts that we care about matching. So, I'm asking: which facts should we be using to calibrate and validate FIRE SHARK? |
CDC says: For the first step, calibrate:
What we are not matching in the first round:
|
@alanlujan91 makes the point that we should be attentive to the starting wealth levels. Wealthier agents might sell more. |
What are the stylized facts of retail order volume (30%?) that we are trying to match?
We can then do a calibration exercise to figure out what we can tweak to get that as an output.
Calibrate to stylized facts.
We can do the calibration with the MockMarket.
Get in touch with @wjt5121 about this ...