Change fo3 cleaning #21

Open
korenmiklos opened this issue Oct 3, 2024 · 4 comments

@korenmiklos
Member

Take 5 consecutive years of data. The following changes should be made. Other patterns remain the same.

Data    Corrected
11011   11111
01011   01111
11010   11110
00100   00000
10100   10000
00101   00001

Basically, set the middle value to that of its 2 immediate neighbors (t-1 and t+1) if they agree and that value is a majority (at least 3) among the 4 neighbors (t-2, t-1, t+1, t+2). Pseudocode, check before running.

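local X fo3
forvalues i = 0/1 {
    * both immediate neighbors (t-1 and t+1) equal `i'
    bysort frame_id_numeric (year): generate neighbor`i' = `X'[_n-1] == `i' & `X'[_n+1] == `i'
    * number of the four neighbors (t-2, t-1, t+1, t+2) equal to `i'
    bysort frame_id_numeric (year): generate sum`i' = (`X'[_n-1] == `i') + (`X'[_n+1] == `i') + (`X'[_n-2] == `i') + (`X'[_n+2] == `i')
    * flip the middle value when the immediate neighbors agree and hold a majority
    replace `X' = `i' if `X' == 1-`i' & neighbor`i' & sum`i' >= 3
}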

The current code seems to do this only for sum`i' == 4, which is too strict.
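Since the pseudocode comes with a "check before running" caveat, one way to check it is to run it on a toy panel built from the six patterns in the table. The sketch below is only an illustration: the firm ids, the years 2001-2005 and the variable names pattern, expected and result are made up for the test and are not part of the project data.

```stata
clear
* six hypothetical test firms, one per pattern in the table above
input frame_id_numeric str5 pattern str5 expected
1 "11011" "11111"
2 "01011" "01111"
3 "11010" "11110"
4 "00100" "00000"
5 "10100" "10000"
6 "00101" "00001"
end

* one row per firm-year: five consecutive (made-up) years per firm
expand 5
bysort frame_id_numeric: generate year = 2000 + _n
generate byte fo3 = real(substr(pattern, year - 2000, 1))

* the rule from the pseudocode, with the >= 3 threshold
local X fo3
forvalues i = 0/1 {
    bysort frame_id_numeric (year): generate neighbor`i' = `X'[_n-1] == `i' & `X'[_n+1] == `i'
    bysort frame_id_numeric (year): generate sum`i' = (`X'[_n-1] == `i') + (`X'[_n+1] == `i') + (`X'[_n-2] == `i') + (`X'[_n+2] == `i')
    replace `X' = `i' if `X' == 1-`i' & neighbor`i' & sum`i' >= 3
}

* rebuild the cleaned five-year string and compare it with the table
generate str5 result = ""
bysort frame_id_numeric (year): replace result = result[_n-1] + string(`X')
bysort frame_id_numeric (year): keep if _n == _N
assert result == expected
list pattern result expected, noobs
```

Out-of-range references such as `X'[_n-2] in the first year evaluate to missing, and missing == `i' is false, so the edge years are never counted as agreeing neighbors.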

@korenmiklos
Member Author

The code seems to work as intended

@andrasvereckei
Collaborator

andrasvereckei commented Oct 7, 2024

This is the code for the panel:


use "balance_sheet_80_22.dta", clear

* fill single-year holes in each ownership dummy, keeping the original values in an _old copy
foreach X in so3_with_mo3 fo3 do3 {
    clonevar `X'_old = `X'
    forvalues i = 0/1 {
        bysort frame_id2 (year): generate `X'_neighbor`i' = `X'[_n-1] == `i' & `X'[_n+1] == `i'
        bysort frame_id2 (year): generate `X'_sum`i' = (`X'[_n-1] == `i') + (`X'[_n+1] == `i') + (`X'[_n-2] == `i') + (`X'[_n+2] == `i')
        replace `X' = `i' if `X' == 1-`i' & `X'_neighbor`i' & `X'_sum`i' >= 3
    }
}
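One thing worth checking in the panel code: `X'[_n-1] refers to the previous observation of the firm, which is the previous calendar year only if the firm has no gaps in the panel. The issue asks for 5 consecutive years, so a possible variant, sketched here under the assumptions that frame_id2 is numeric and that each firm-year appears at most once, uses Stata's time-series operators, which return missing when the lagged or led year is absent. The variable suffixes _nb and _sm are made up for this sketch.

```stata
* assumption: frame_id2 is numeric and (frame_id2, year) is unique
xtset frame_id2 year

foreach X in fo3 {
    forvalues i = 0/1 {
        * L. and F. look at calendar years t-1 and t+1, not at adjacent rows
        generate byte `X'_nb`i' = L.`X' == `i' & F.`X' == `i'
        generate byte `X'_sm`i' = (L.`X' == `i') + (F.`X' == `i') + (L2.`X' == `i') + (F2.`X' == `i')
        replace `X' = `i' if `X' == 1-`i' & `X'_nb`i' & `X'_sm`i' >= 3
    }
}
```

On a panel without gaps this should give the same result as the bysort version; with gaps it is stricter, because a missing year never counts as an agreeing neighbor.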

@andrasvereckei
Collaborator

andrasvereckei commented Oct 7, 2024

1. issue:
We have fewer fo3 == 1 observations after the cleaning:

                        Difference
               Count    Minimum   Average   Maximum
fo3<fo3_old     5734         -1        -1        -1
fo3=fo3_old 10537060
fo3>fo3_old     3049          1         1         1

2. issue:
The row total of so3_with_mo3, fo3 and do3 is 0 in 3,523 cases.

rowtotal |      Freq.   Percent     Cum.
       0 |      3,523      0.03     0.03
       1 | 10,542,320     99.97   100.00
   Total | 10,545,843    100.00

I will have to check on test firms (the biggest ones) whether we really want these corrections.
The cleaner has to be run consistently across all owner variables.
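For reference, the two checks above could be reproduced along these lines. This is a sketch that assumes the cleaning loop has already been run, so that fo3_old exists; owner_total is a hypothetical variable name.

```stata
* issue 1: how often did the cleaning lower or raise fo3?
compare fo3 fo3_old

* issue 2: firm-years where no owner type is flagged after cleaning
* note: rowtotal() treats missing values as 0
egen byte owner_total = rowtotal(so3_with_mo3 fo3 do3)
tabulate owner_total

* how many of the zero-total cases come from a changed fo3?
count if owner_total == 0 & fo3 != fo3_old
```

Repeating the last count for so3_with_mo3 and do3 would show which variable's cleaning produces the zero totals.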

@korenmiklos
Member Author

Let's do this first for fo3, then for so3. Clearly the order matters here, as the patterns change when the holes are filled in. Something like

do "fillin.do" fo3
replace so3 = 0 if fo3 == 1
replace do3 = 0 if fo3 == 1
do "fillin.do" so3
replace do3 = 0 if so3 == 1
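The file fillin.do is not shown in this thread. A minimal sketch of what it might contain, assuming it takes the variable name as its only argument and that the panel is identified by frame_id2 and year as in the code posted above:

```stata
* fillin.do (hypothetical contents): fill single-year holes in one 0/1 variable
* usage: do "fillin.do" fo3
args X
confirm numeric variable `X'

* keep the original values for later comparison
clonevar `X'_old = `X'

forvalues i = 0/1 {
    bysort frame_id2 (year): generate byte `X'_neighbor`i' = `X'[_n-1] == `i' & `X'[_n+1] == `i'
    bysort frame_id2 (year): generate byte `X'_sum`i' = (`X'[_n-1] == `i') + (`X'[_n+1] == `i') + (`X'[_n-2] == `i') + (`X'[_n+2] == `i')
    replace `X' = `i' if `X' == 1-`i' & `X'_neighbor`i' & `X'_sum`i' >= 3
}

* drop the helper variables so the file can be called again for another variable
drop `X'_neighbor0 `X'_neighbor1 `X'_sum0 `X'_sum1
```

Because each call works on a single variable, the replace lines between the calls decide which owner type wins when two of them would be filled in for the same firm-year.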
