The IATTC's regression tree R package for length frequency data

This is the GitHub repository for IATTC's regression tree algorithm on length frequency data.
The R code for the regression tree analysis was originally developed by Cleridy Lennert-Cody (https://doi.org/10.1016/j.fishres.2009.11.014) and later modified by Haikun Xu, who automated it as an R package.
Please contact Haikun ([email protected]) with any questions about the package.
How to install the package: devtools::install_github('HaikunXu/RegressionTree',ref='main')
User manual of the package: https://github.com/HaikunXu/FishFreqTree/blob/main/manual/Manual.pdf

Note:

Data Format

The input data frame must include at least four columns named exactly "lat", "lon", "year", and "quarter". The "lat" and "lon" columns give the latitude and longitude of the grid-cell centers. The data frame must also include one column per length bin, named after the bin, that holds the length frequency for that bin. The package works with length frequencies expressed as proportions, so make sure the values in each row sum to 1 across length bins. An example of the input data can be found here.
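The format described above can be sketched with a small made-up data set. Everything in this example (grid cells, bin labels "40", "60", "80", and the random counts) is hypothetical and only illustrates the column layout and the sum-to-1 requirement; it is not the package's example data.

```r
# Hypothetical toy data: four grid cells, one year, two quarters
set.seed(1)
lf <- expand.grid(lat = c(-2.5, 2.5), lon = c(-102.5, -97.5),
                  year = 2020, quarter = 1:2)

# Three made-up length bins; the column names are the bin labels
bins <- c("40", "60", "80")
counts <- matrix(rpois(nrow(lf) * length(bins), lambda = 20) + 1,
                 nrow = nrow(lf), dimnames = list(NULL, bins))

# Convert raw counts to proportions so that every row sums to 1
props <- counts / rowSums(counts)
lf[bins] <- as.data.frame(props)

# Basic checks before passing the data to the package
required <- c("lat", "lon", "year", "quarter")
stopifnot(all(required %in% names(lf)))
bin_cols <- setdiff(names(lf), required)
stopifnot(all(abs(rowSums(lf[, bin_cols]) - 1) < 1e-8))
```

If the raw data are counts rather than proportions, dividing each row by its row sum, as done here, is one straightforward way to meet the requirement.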

Model description

This package finds the best multi-cell combination for a length frequency dataset based on the proportion of variance explained. The variables currently considered as splitting dimensions are latitude, longitude, quarter/cyclic quarter, and year (turned on with year=TRUE). If you do not use quarter as a splitting dimension (e.g., your model has a time step of one year), still add a column named "quarter" to the input data with all values equal to 1. In the main functions the package provides (run_regression_tree and loop_regression_tree), you can turn off the quarter dimension manually by passing "quarter = FALSE" as a function argument.
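For annual data, adding the placeholder quarter column might look like the sketch below. The data frame `annual` and its values are hypothetical; only the column names and the `quarter = FALSE` argument come from the description above.

```r
# Hypothetical annual data: one row per grid cell and year, no quarter column
annual <- data.frame(lat = c(-2.5, 2.5),
                     lon = c(-102.5, -102.5),
                     year = c(2020, 2020),
                     `40` = c(0.3, 0.1),
                     `60` = c(0.7, 0.9),
                     check.names = FALSE)  # keep "40" and "60" as column names

# Add the required "quarter" column with a constant value of 1
annual$quarter <- 1

stopifnot(all(c("lat", "lon", "year", "quarter") %in% names(annual)))

# With the package installed, the quarter dimension is then switched off by
# passing quarter = FALSE to run_regression_tree or loop_regression_tree.
# The remaining arguments are not shown here; see ?run_regression_tree.
```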

Main functions

run_regression_tree (type ?run_regression_tree on the console for more info): run the regression tree
loop_regression_tree (type ?loop_regression_tree on the console for more info): loop the regression tree
evaluate_regression_tree (type ?evaluate_regression_tree on the console for more info): evaluate a pre-specified regression tree

Code description

For the nth best split, the code first loops over all existing n cells that are defined by the previous n-1 splits, to find the best split (the one that leads to the maximum variance explained) for every cell. Then those best cell-specific splits are compared to find the split that results in the maximum variance explained. This split is the nth best split. This process is iterated until reaching the maximum number of splits specified by the user.
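The search described above can be illustrated with a simplified base R sketch. It is not the package's code: the function names (`cell_ss`, `best_split_in_cell`, `greedy_tree`) are hypothetical, "variance" is assumed to be the sum of squared deviations from the cell mean across all length bins, splits are plain thresholds (no cyclic quarter), and the toy data are made up.

```r
# Within-cell sum of squares across all length bins (assumed variance measure)
cell_ss <- function(p) sum(sweep(p, 2, colMeans(p))^2)

# Best threshold split of a single cell: try every variable and threshold
best_split_in_cell <- function(x, p, vars) {
  best <- list(gain = 0, var = NA, cut = NA)
  for (v in vars) {
    for (thr in head(sort(unique(x[[v]])), -1)) {
      left <- x[[v]] <= thr
      gain <- cell_ss(p) - cell_ss(p[left, , drop = FALSE]) -
        cell_ss(p[!left, , drop = FALSE])
      if (gain > best$gain) best <- list(gain = gain, var = v, cut = thr)
    }
  }
  best
}

greedy_tree <- function(x, p, vars, max_splits) {
  cell <- rep(1L, nrow(x))   # all rows start in cell 1
  total <- cell_ss(p)
  splits <- NULL
  for (n in seq_len(max_splits)) {
    # Step 1: best split for every existing cell
    cand <- lapply(sort(unique(cell)), function(k) {
      idx <- cell == k
      c(best_split_in_cell(x[idx, , drop = FALSE],
                           p[idx, , drop = FALSE], vars), cell = k)
    })
    # Step 2: keep the cell-specific split with the largest gain
    gains <- vapply(cand, function(b) b$gain, numeric(1))
    if (max(gains) <= 0) break
    b <- cand[[which.max(gains)]]
    cell[cell == b$cell & x[[b$var]] > b$cut] <- n + 1L  # new cell n + 1
    splits <- rbind(splits, data.frame(split = n, cell = b$cell, var = b$var,
                                       cut = b$cut,
                                       var_explained = b$gain / total))
  }
  list(cell = cell, splits = splits)
}

# Hypothetical toy data: larger fish north of the equator
set.seed(42)
x <- expand.grid(lat = c(-7.5, -2.5, 2.5, 7.5), lon = c(-102.5, -97.5),
                 quarter = 1:4)
big <- ifelse(x$lat > 0, 0.6, 0.2) + runif(nrow(x), -0.05, 0.05)
p <- cbind(`40` = (1 - big) / 2, `60` = (1 - big) / 2, `80` = big)

fit <- greedy_tree(x, p, vars = c("lat", "lon", "quarter"), max_splits = 2)
fit$splits
```

In this toy data the latitudinal contrast dominates, so the first split falls on "lat" between -2.5 and 2.5. The `var_explained` column gives the share of the total sum of squares removed by each split, and its cumulative sum gives the share explained by the tree so far.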

Users should combine the output figures with output tables to understand the best splits in order. Also, the advanced feature (see the example code for more details) in the package allows users to manually specify some or all splits.
