Commit

avoid multiple data types conversion when extracting fixed effects
pachadotdev committed Apr 9, 2024
1 parent e5a5b18 commit 74555fd
Showing 33 changed files with 1,136 additions and 44 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -2,7 +2,7 @@ Package: capybara
Type: Package
Title: Fast and Memory Efficient Fitting of Linear Models With High-Dimensional
Fixed Effects
-Version: 0.4
+Version: 0.4.5
Authors@R: c(
person(
given = "Mauricio",
12 changes: 10 additions & 2 deletions R/fixed_effects.R
@@ -14,6 +14,14 @@
#' @references Gaure, S. (n. d.). "Multicollinearity, identification, and
#' estimable functions". Unpublished.
#' @seealso \code{\link{felm}}, \code{\link{feglm}}
+#' @examples
+#' # same as the example in feglm but extracting the fixed effects
+#' mod <- fepoisson(
+#'   trade ~ log_dist + lang + cntg + clny | exp_year + imp_year,
+#'   trade_panel
+#' )
+#'
+#' fixed_effects(mod)
#' @export
fixed_effects <- function(object = NULL, alpha.tol = 1.0e-08) {
# Check validity of 'object'
@@ -51,8 +59,8 @@ fixed_effects <- function(object = NULL, alpha.tol = 1.0e-08) {

# Assign names to the different fixed effects categories
for (i in seq.int(k)) {
-fe.list[[i]] <- as.vector(fe.list[[i]])
-names(fe.list[[i]]) <- nms.fe[[i]]
+colnames(fe.list[[i]]) <- k.vars[i]
+rownames(fe.list[[i]]) <- nms.fe[[i]]
}
names(fe.list) <- k.vars
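
The change in this hunk replaces a vector conversion followed by `names()` with direct `colnames()` and `rownames()` assignments on each matrix. A minimal sketch of the difference, assuming each element of `fe.list` is a one-column numeric matrix; the level names and values below are invented for illustration and do not come from the package:

```r
# Hypothetical stand-ins for the objects used inside fixed_effects();
# the level names and values are made up for illustration.
k.vars <- c("exp_year", "imp_year")
nms.fe <- list(c("ARG1990", "BRA1990", "CHL1990"), c("ARG1990", "BRA1990"))
fe.list <- list(
  matrix(c(0.10, -0.25, 0.40), ncol = 1),
  matrix(c(1.50, -0.75), ncol = 1)
)

# Previous approach: convert each matrix to a vector, then name it
fe.old <- fe.list
for (i in seq_along(fe.old)) {
  fe.old[[i]] <- as.vector(fe.old[[i]])
  names(fe.old[[i]]) <- nms.fe[[i]]
}
names(fe.old) <- k.vars

# Approach in this commit: keep each matrix and label its dimensions
fe.new <- fe.list
for (i in seq_along(fe.new)) {
  colnames(fe.new[[i]]) <- k.vars[i]
  rownames(fe.new[[i]]) <- nms.fe[[i]]
}
names(fe.new) <- k.vars

is.matrix(fe.old$exp_year) # FALSE: a named numeric vector
is.matrix(fe.new$exp_year) # TRUE: a 3 x 1 matrix with dimnames
fe.new$exp_year["BRA1990", "exp_year"]
```

Code that previously indexed the result by name, such as `fe$exp_year["BRA1990"]`, would need a row and column index (or `fe$exp_year[, 1]`) under the matrix form.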

58 changes: 58 additions & 0 deletions dev/cass_bug.r
@@ -0,0 +1,58 @@
library(tidyverse)
# library(alpaca)
# library(capybara)
library(devtools)

load_all()

# trade <- capybara::trade_panel %>%
trade <- trade_panel %>%
  mutate(
    exporter = str_sub(exp_year, 1, 3),
    importer = str_sub(imp_year, 1, 3),
    pair_id_2 = ifelse(exporter == importer, "0-intra", pair),

    # Set reference country
    exporter = ifelse(exporter == "DEU", "0-DEU", exporter),
    importer = ifelse(importer == "DEU", "0-DEU", importer)
  ) %>%
  # Sort by importer
  arrange(importer) %>%
  # Compute sum of trade by pair
  group_by(pair) %>%
  mutate(sum_trade = sum(trade)) %>%
  ungroup()

# Poisson regression with Capybara works fine
# fit_capybara <- capybara::fepoisson(
object <- fepoisson(
  trade ~ rta | exp_year + imp_year + pair_id_2,
  data = trade %>% filter(sum_trade > 0)
)

foo <- fixed_effects(object)

class(foo)
class(foo$exp_year)

head(foo$exp_year)

summary(object)

# Error when using fixed_effects()
options(error = function() traceback(3))
foo <- fixed_effects(object)
bar <- alpaca::getFEs(object)

names(foo)
head(foo$exp_year)
head(bar$exp_year)

all.equal(foo$exp_year, bar$exp_year)
all.equal(foo$imp_year, bar$imp_year)
all.equal(foo$pair_id_2, bar$pair_id_2)

saveRDS(
  list(model = object, fes = foo),
  "dev/cass_bug.rds"
)
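
The script above checks the extracted fixed effects with `all.equal()`. A small base-R sketch of one thing to watch for in such comparisons, assuming one side is a one-column matrix and the other a named vector (an assumption made here for illustration; the return shape of `alpaca::getFEs()` is not stated in this commit). The objects and values are hypothetical:

```r
# Hypothetical values: a one-column matrix with dimnames and a named
# vector holding the same numbers under the same level names
fe_matrix <- matrix(
  c(0.10, -0.25, 0.40),
  ncol = 1,
  dimnames = list(c("ARG1990", "BRA1990", "CHL1990"), "exp_year")
)
fe_vector <- c(ARG1990 = 0.10, BRA1990 = -0.25, CHL1990 = 0.40)

# Direct comparison reports attribute differences rather than TRUE
direct_match <- isTRUE(all.equal(fe_matrix, fe_vector))

# Extracting the single column gives a named vector that compares cleanly
column_match <- isTRUE(all.equal(fe_matrix[, 1], fe_vector))

direct_match
column_match
```

Passing `check.attributes = FALSE` to `all.equal()` is another option when only the numeric values matter.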
173 changes: 173 additions & 0 deletions dev/cass_bug.txt
@@ -0,0 +1,173 @@
==37800== Memcheck, a memory error detector
==37800== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==37800== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==37800== Command: /usr/lib/R/bin/exec/R --vanilla -f dev/cass_bug.r
==37800==

R version 4.3.3 (2024-02-29) -- "Angel Food Cake"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
> # library(alpaca)
> library(capybara)
> library(devtools)
Loading required package: usethis
>
> load_all()
ℹ Loading capybara
>
> # trade <- capybara::trade_panel %>%
> trade <- trade_panel %>%
+ mutate(
+ exporter = str_sub(exp_year, 1, 3),
+ importer = str_sub(imp_year, 1, 3),
+ pair_id_2 = ifelse(exporter == importer, "0-intra", pair),
+
+ # Set reference country
+ exporter = ifelse(exporter == "DEU", "0-DEU", exporter),
+ importer = ifelse(importer == "DEU", "0-DEU", importer)
+ ) %>%
+ # Sort by importer
+ arrange(importer) %>%
+ # Compute sum of trade by pair
+ group_by(pair) %>%
+ mutate(sum_trade = sum(trade)) %>%
+ ungroup()
>
> # Poisson regression with Capybara works fine
> # fit_capybara <- capybara::fepoisson(
> object <- fepoisson(
+ trade ~ rta | exp_year + imp_year + pair_id_2,
+ data = trade %>% filter(sum_trade > 0)
+ )
>
> summary(object)
Formula: trade ~ rta | exp_year + imp_year + pair_id_2

Family: Poisson

Estimates:

| | Estimate | Std. Error | z value | Pr(>|z|) |
|-----|----------|------------|----------|------------|
| rta | -0.0480 | 0.0020 | -24.0238 | 0.0000 *** |

Significance codes: *** 99.9%; ** 99%; * 95%; . 90%

Pseudo R-squared: 0.7455

Number of observations: Full 27822; Missing 0; Perfect classification 0

Number of Fisher Scoring iterations: 19
>
> # Error when using fixed_effects()
> options(error = function() traceback(3))
> fixed_effects(object)
==37800== Invalid read of size 8
==37800== at 0x49E8986: VECTOR_ELT (in /usr/lib/R/lib/libR.so)
==37800== by 0x16BC5713: cpp11::r_vector<SEXPREC*>::operator[](long) const (list.hpp:30)
==37800== by 0x16BC72D3: cpp11::r_vector<SEXPREC*>::operator[](int) const (r_vector.hpp:588)
==37800== by 0x16BCDF33: get_alpha_(cpp11::matrix<cpp11::r_vector<double>, double, cpp11::by_column> const&, cpp11::r_vector<SEXPREC*> const&, double) (02_get_alpha.cpp:37)
==37800== by 0x16BFBED6: _capybara_get_alpha_ (cpp11.cpp:19)
==37800== by 0x49562AD: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x495685C: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49AE367: Rf_eval (in /usr/lib/R/lib/libR.so)
==37800== by 0x49B1377: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49AE0FF: Rf_eval (in /usr/lib/R/lib/libR.so)
==37800== by 0x49AFD85: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49B0BB4: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==37800== Address 0x98e96c0 is 0 bytes after a block of size 3,360 alloc'd
==37800== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==37800== by 0x49F0584: Rf_allocVector3 (in /usr/lib/R/lib/libR.so)
==37800== by 0x4A612D8: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x4993264: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49ADCFF: Rf_eval (in /usr/lib/R/lib/libR.so)
==37800== by 0x49AFD85: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49B0BB4: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==37800== by 0x49F62C0: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49F66B6: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49F6AA1: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x4993067: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49ADCFF: Rf_eval (in /usr/lib/R/lib/libR.so)
==37800==
Error: Invalid input type, expected 'integer' actual 'NULL'
4: .Call(`_capybara_get_alpha_`, p_r, klist, tol) at cpp11.R#8
3: get_alpha_(pie, k.list, alpha.tol)
2: as.list(get_alpha_(pie, k.list, alpha.tol)) at fixed_effects.R#50
1: fixed_effects(object)
>
> # [1,] numeric,414
> # [2,] numeric,414
> # [3,] numeric,4637
>
==37800==
==37800== HEAP SUMMARY:
==37800== in use at exit: 202,220,590 bytes in 31,310 blocks
==37800== total heap usage: 214,439 allocs, 183,129 frees, 699,680,359 bytes allocated
==37800==
==37800== LEAK SUMMARY:
==37800== definitely lost: 0 bytes in 0 blocks
==37800== indirectly lost: 0 bytes in 0 blocks
==37800== possibly lost: 2,688 bytes in 8 blocks
==37800== still reachable: 202,217,902 bytes in 31,302 blocks
==37800== of which reachable via heuristic:
==37800== newarray : 4,264 bytes in 1 blocks
==37800== suppressed: 0 bytes in 0 blocks
==37800== Rerun with --leak-check=full to see details of leaked memory
==37800==
==37800== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==37800==
==37800== 1 errors in context 1 of 1:
==37800== Invalid read of size 8
==37800== at 0x49E8986: VECTOR_ELT (in /usr/lib/R/lib/libR.so)
==37800== by 0x16BC5713: cpp11::r_vector<SEXPREC*>::operator[](long) const (list.hpp:30)
==37800== by 0x16BC72D3: cpp11::r_vector<SEXPREC*>::operator[](int) const (r_vector.hpp:588)
==37800== by 0x16BCDF33: get_alpha_(cpp11::matrix<cpp11::r_vector<double>, double, cpp11::by_column> const&, cpp11::r_vector<SEXPREC*> const&, double) (02_get_alpha.cpp:37)
==37800== by 0x16BFBED6: _capybara_get_alpha_ (cpp11.cpp:19)
==37800== by 0x49562AD: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x495685C: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49AE367: Rf_eval (in /usr/lib/R/lib/libR.so)
==37800== by 0x49B1377: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49AE0FF: Rf_eval (in /usr/lib/R/lib/libR.so)
==37800== by 0x49AFD85: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49B0BB4: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==37800== Address 0x98e96c0 is 0 bytes after a block of size 3,360 alloc'd
==37800== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==37800== by 0x49F0584: Rf_allocVector3 (in /usr/lib/R/lib/libR.so)
==37800== by 0x4A612D8: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x4993264: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49ADCFF: Rf_eval (in /usr/lib/R/lib/libR.so)
==37800== by 0x49AFD85: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49B0BB4: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==37800== by 0x49F62C0: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49F66B6: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49F6AA1: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x4993067: ??? (in /usr/lib/R/lib/libR.so)
==37800== by 0x49ADCFF: Rf_eval (in /usr/lib/R/lib/libR.so)
==37800==
==37800== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
2 changes: 1 addition & 1 deletion docs/404.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/LICENSE.html

2 changes: 1 addition & 1 deletion docs/articles/index.html

2 changes: 1 addition & 1 deletion docs/articles/intro.html

6 changes: 3 additions & 3 deletions docs/authors.html

2 changes: 1 addition & 1 deletion docs/index.html

2 changes: 1 addition & 1 deletion docs/news/index.html

2 changes: 1 addition & 1 deletion docs/pkgdown.yml
@@ -3,5 +3,5 @@ pkgdown: 2.0.7
pkgdown_sha: ~
articles:
intro: intro.html
-last_built: 2024-03-17T19:47Z
+last_built: 2024-04-09T14:22Z

4 changes: 2 additions & 2 deletions docs/reference/apes.html

4 changes: 2 additions & 2 deletions docs/reference/bias_corr.html

2 changes: 1 addition & 1 deletion docs/reference/capybara-package.html
