Dynamical Factor Models (DFM) Implementation (GSOC 2025) #446

andreacate · 2025-03-31T18:47:53Z

Dynamical Factor Models (DFM) Implementation

This PR provides a first draft implementation of Dynamical Factor Models as part of my application proposal for the PyMC GSoC 2025 project. A draft of my application report can be found at this link.

Overview

Added DFM.py with initial functionality

Current Status

This implementation is a work in progress, and I welcome any feedback.

Next Steps

Vectorize the construction of the transition and selection matrices (possibly by reordering state variables).
Add support for measurement error.

zaxtax · 2025-04-01T23:05:56Z

Looks interesting! Just say when you think it's ready for review

fonnesbeck · 2025-04-05T15:37:57Z

cc @jessegrabowski

review-notebook-app · 2025-04-07T15:01:00Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

andreacate · 2025-04-07T15:06:44Z

Thanks for the feedback!

I'm still exploring the best approach for implementing Dynamic Factor Models.
I've added a simple custom DFM model in a Jupyter notebook, which I plan to use as a prototype and testing tool while developing the main BayesianDynamicFactor class.

jessegrabowski · 2025-07-13T05:44:45Z

pymc_extras/statespace/models/DFM.py

+        # Factor states
+        for i in range(self.k_factors):
+            for lag in range(self.factor_order):
+                names.append(f"factor_{i+1}_lag{lag}")


nit: I've been using Stata notation for lagged states, e.g. L{lag}.factor_{i+1}

Not married to it, but consider it for consistency's sake.
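For reference, a minimal sketch of how the naming loop from the diff might look with the Stata-style prefix. The helper name `make_factor_state_names` is hypothetical and exists only for illustration; the loop order and the `L{lag}.factor_{i+1}` pattern are taken from the snippet and the suggestion above.

```python
def make_factor_state_names(k_factors: int, factor_order: int) -> list[str]:
    """Hypothetical helper: name lagged factor states as L{lag}.factor_{i}."""
    names = []
    # Same loop order as the diff: factors in the outer loop, lags in the inner loop
    for i in range(k_factors):
        for lag in range(factor_order):
            names.append(f"L{lag}.factor_{i + 1}")
    return names


names = make_factor_state_names(k_factors=2, factor_order=2)
print(names)  # ['L0.factor_1', 'L1.factor_1', 'L0.factor_2', 'L1.factor_2']
```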

jessegrabowski · 2025-07-13T05:45:21Z

pymc_extras/statespace/models/DFM.py

+        if self.error_order > 0:
+            for i in range(self.k_endog):
+                for lag in range(self.error_order):
+                    names.append(f"error_{i+1}_lag{lag}")


jessegrabowski · 2025-07-13T05:46:01Z

pymc_extras/statespace/models/DFM.py

+
+        # If error_order > 0
+        if self.error_order > 0:
+            coords["error_ar_param"] = list(range(1, self.error_order + 1))


Suggested change

-            coords["error_ar_param"] = list(range(1, self.error_order + 1))
+            coords[ERROR_AR_PARAM_DIM] = list(range(1, self.error_order + 1))

It's weird to have a global everywhere except here

jessegrabowski · 2025-07-13T06:35:14Z

pymc_extras/statespace/models/DFM.py

+
+        self.ssm["initial_state_cov", :, :] = P0
+
+        # TODO vectorize the design matrix


You're going to have to double-check all of these matrix constructions if you re-ordered the states.

jessegrabowski · 2025-07-25T11:37:07Z

Some tests are failing due to missing constants. You might have lost some changes in the reset/rebasing process

jessegrabowski

Left some comments. I didn't look over the tests because they still seem like WIP, but seem to be on the right track!

jessegrabowski · 2025-07-17T12:09:37Z

pymc_extras/statespace/models/DFM.py

+    Internally, this model is represented in state-space form by stacking all current and lagged latent factors and,
+    if present, autoregressive observation errors into a single state vector. The full state vector has dimension
+    :math:`k_factors \cdot factor_order + k_endog \cdot error_order`, where :math:`k_endog` is the number of observed time series.


Show the actual transition equation that is used in block form, using the vectors/matrices that you defined above.
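One way to make the block structure concrete is a small NumPy sketch. It assumes a state ordering with all factor lags first and all error lags second, and it assumes the errors follow their own AR process independent of the factors. The helper names `companion` and `block_transition`, the ordering, and the coefficient values are illustrative assumptions, not what DFM.py actually does.

```python
import numpy as np


def companion(coefs: np.ndarray) -> np.ndarray:
    """Companion matrix of a VAR(p); coefs has shape (p, k, k)."""
    p, k, _ = coefs.shape
    top = np.concatenate(list(coefs), axis=1)  # (k, k*p): [A_1 ... A_p]
    shift = np.eye(k * (p - 1), k * p)  # [I 0]: copies lag j into lag j+1
    return np.concatenate([top, shift], axis=0)


def block_transition(factor_ar: np.ndarray, error_ar: np.ndarray) -> np.ndarray:
    """Block-diagonal transition matrix: factor block, then error block."""
    T_f = companion(factor_ar)
    T_e = companion(error_ar)
    n_f, n_e = T_f.shape[0], T_e.shape[0]
    return np.block(
        [
            [T_f, np.zeros((n_f, n_e))],
            [np.zeros((n_e, n_f)), T_e],
        ]
    )


k_factors, factor_order, k_endog, error_order = 2, 2, 3, 1
rng = np.random.default_rng(0)
factor_ar = 0.1 * rng.normal(size=(factor_order, k_factors, k_factors))
# Independent AR(1) errors: one diagonal coefficient matrix per lag
error_ar = np.stack([np.diag([0.5, 0.3, 0.2])])

T = block_transition(factor_ar, error_ar)
print(T.shape)  # (7, 7) = k_factors * factor_order + k_endog * error_order
```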

jessegrabowski

I did a deeper pass on everything except the build_symbolic_graph method. I need to spend more time on that because it's gotten quite complex.

I'll finish ASAP.

jessegrabowski · 2025-08-16T06:27:56Z

tests/statespace/models/test_DFM.py

+# TODO: check test for error_var=True, since there are problems with statsmodels, the matrices looks the same by some experiments done in notebooks
+# (FAILED tests/statespace/models/test_DFM.py::test_DFM_update_matches_statsmodels[True-2-2-2] - numpy.linalg.LinAlgError: 1-th leading minor of the array is not positive definite)


Could you replace this TODO with a test that fails, and mark it as xfail with this comment about statsmodels maybe doing something wrong?
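A rough sketch of what such a marker could look like with pytest. The test name and body here are hypothetical placeholders: the body only reproduces the same exception type (`numpy.linalg.LinAlgError`) with a matrix that is not positive definite, whereas the real test would run the statsmodels comparison with `error_var=True`.

```python
import numpy as np
import pytest


@pytest.mark.xfail(
    reason=(
        "error_var=True: the comparison with statsmodels raises LinAlgError, "
        "although the matrices look the same in notebook experiments; "
        "statsmodels may be doing something wrong"
    ),
    raises=np.linalg.LinAlgError,
)
def test_DFM_error_var_matches_statsmodels_placeholder():
    # Placeholder body (assumption): Cholesky of a non-positive-definite matrix
    # raises the same exception type as the failing parametrization.
    not_positive_definite = np.array([[-1.0, 0.0], [0.0, 1.0]])
    np.linalg.cholesky(not_positive_definite)
```

With `raises=` set, pytest reports the test as xfailed only when that specific exception occurs; any other exception is reported as a regular failure.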

jessegrabowski · 2025-08-16T06:31:57Z

pymc_extras/statespace/models/DFM.py

+
+    factor_order : int
+        Order of the VAR process for the latent factors. If 0, the factors are treated as static (no dynamics).
+        Therefore, the state vector will include one state per factor and "factor_ar" will not exist.


When you say "no dynamics" do you mean the estimated factors will literally be static, or just that they won't be autoregressive?

I guess I'm asking if they still get stochastic innovations when factor_order = 0

Yes, sorry, maybe that was a bit misleading. The factors won't be autoregressive, but they will still have stochastic innovations.
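To illustrate the distinction, here is a small simulation sketch of the `factor_order = 0` case, in which the factor at each time step is just its innovation. The loadings, noise scale, and dimensions are made-up values for illustration and are not taken from DFM.py.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, k_factors, k_endog = 200, 1, 3

# Hypothetical factor loadings, for illustration only
loadings = rng.normal(size=(k_endog, k_factors))

# factor_order = 0: no autoregressive term, so each factor equals its innovation
factors = rng.normal(size=(n_obs, k_factors))

# Observations: loadings times factors, plus small measurement noise
y = factors @ loadings.T + 0.1 * rng.normal(size=(n_obs, k_endog))

# The factor still varies over time, but shows no systematic lag-1 dependence
lag1_corr = np.corrcoef(factors[:-1, 0], factors[1:, 0])[0, 1]
print(y.shape, factors.std() > 0)
```

The factor path is therefore not constant; it simply has no memory of its previous values.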

jessegrabowski · 2025-08-16T06:33:09Z

pymc_extras/statespace/models/DFM.py

+        Names of the exogenous variables. If not provided, but `k_exog` is specified, default names will be generated as `exog_1`, `exog_2`, ..., `exog_k`.
+
+    shared_exog_states: bool, optional
+        Whether exogenous latent states are shared across the observed states. If True, there will be only one set of exogenous latent


What do you mean by "exogenous latent state"? The learned regression coefficient states?

Yes, they are the beta coefficients.

jessegrabowski · 2025-08-16T06:35:43Z

pymc_extras/statespace/models/DFM.py

+
+    Notes
+    -----
+    TODO: adding to notes, how exog variables are handled and add them in the example?


I think you already have them as $x_t$ in the equations?

Yes, but I was also thinking about adding an explanation of how we handle the exogenous variables in the section about the state-space matrices. For example, explaining that we extend the state… Do you think that's unnecessary?

Yes that could be nice, but consider it a stretch goal, not a must-have

jessegrabowski requested changes Jul 13, 2025

jessegrabowski reviewed Jul 17, 2025

jessegrabowski reviewed Jul 17, 2025

andreacate force-pushed the DFM_draft_implementation branch 2 times, most recently from 21560db to a459a1a Compare July 25, 2025 10:44

andreacate force-pushed the DFM_draft_implementation branch from 1c04f65 to bc3fcf2 Compare July 25, 2025 13:51

jessegrabowski requested changes Jul 27, 2025

andreacate force-pushed the DFM_draft_implementation branch 3 times, most recently from 7846f15 to e15cdd3 Compare July 29, 2025 07:59

andreacate force-pushed the DFM_draft_implementation branch from e15cdd3 to 3b8bfe4 Compare August 8, 2025 12:36

andreacate and others added 11 commits August 15, 2025 23:06

Added new file DFM.py for GSOC 2025 Dynamical Factor Models

4ffbee7

Add initial notebook on custom DFM implementation

9a4ef64

Update of DFM draft implementation

bda9e7f

In the notebook, a comparison between the custom DFM and the implemented DFM (which has a hardcoded version of make_symbolic_graph that works only in this case)

Aligning the order of the state vector with statsmodels and updating the test

2bed71b

Added test_DFM_update_matches_statsmodels and small corrections to DF…

b19fd28

…M.py

Updating test following test_ETS.py and small adjustment for exog var…

123c893

…iables in DFM.py

Added support for joint VAR modelling (error_var=True)

05c11d4

Adding a first implementation of exogenous variable support based on …

fbd5b1f

…pymc_extras/statespace/models/structural/components/regression.py

Completing the implementation of exogenous variables support

dcc1293

Small adjustments and improvements in DFM.py

bff3e80

Small adjustments and improvements in DFM.py

615960b

andreacate force-pushed the DFM_draft_implementation branch from 6496f38 to 615960b Compare August 15, 2025 21:20

jessegrabowski requested changes Aug 16, 2025

Adjustments after Jesse review

f4bb74c

Adjustments following Jesse suggestions and added tests for exog support

2288606
