Home
Caffery Yang edited this page Feb 18, 2023 · 1 revision
Welcome to the ggpicrust2 wiki!
The ggpicrust2 function integrates pathway name/description annotations, ten of the most advanced differential abundance (DA) methods, and visualization of DA results.
- file: A character, the path to the picrust2 export file
- metadata: A tibble of sample information
- group: A character, the name of the group
- pathway: A character, one of "EC", "KO", "MetaCyc"
- daa_method: A character, the chosen differential abundance analysis (DAA) method
- ko_to_kegg: A logical (TRUE or FALSE) controlling whether KO abundance is converted to KEGG pathway abundance
- p.adjust: A character, the method used to adjust p-values
- order: A character controlling the order of the main plot rows
- p_values_bar: A logical (TRUE or FALSE) controlling whether the main plot has the p-values bar
- x_lab: A character controlling the x-axis label name
- select: A vector of pathway names to be selected
- reference: A character, the reference group level used by several DA methods
- colors: A vector of colors, for example c("red", "blue")
The function returns daa.results.df, a dataframe of DA results.
Here is an example of using the ggpicrust2 function:
library(ggpicrust2)
ggpicrust2(file = "picrust2_export.txt",
           metadata = sample_metadata,
           group = "treatment",
           pathway = "KO",
           daa_method = "ALDEx2",
           ko_to_kegg = FALSE,
           p.adjust = "BH",
           order = "group",
           p_values_bar = TRUE,
           x_lab = "pathway_name",
           select = c("pathway1", "pathway2"),
           reference = "control",
           colors = c("red", "blue"))
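The example above assumes that a sample_metadata object already exists. A minimal sketch of preparing one is shown here. The column name sample_name, the sample IDs, and the group labels are made up for illustration, and the sanity checks are a suggestion rather than something the package is known to perform.

```r
# Hypothetical sample sheet: one row per sample, one column holding the group.
# All names and values here are illustrative assumptions.
sample_metadata <- data.frame(
  sample_name = c("S1", "S2", "S3", "S4"),
  treatment   = c("control", "control", "treated", "treated"),
  stringsAsFactors = FALSE
)

# The page describes metadata as a tibble; convert when tibble is installed.
if (requireNamespace("tibble", quietly = TRUE)) {
  sample_metadata <- tibble::as_tibble(sample_metadata)
}

group     <- "treatment"  # value passed as the group argument
reference <- "control"    # value passed as the reference argument

# Sanity checks worth running before the call:
# the group column exists, it has at least two levels,
# and the reference level is one of them.
stopifnot(group %in% colnames(sample_metadata))
group_levels <- unique(sample_metadata[[group]])
stopifnot(length(group_levels) >= 2)
stopifnot(reference %in% group_levels)
```

Checking the group column and reference level up front tends to give a clearer error than one raised deep inside a DA method.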
The pathway_annotation function annotates pathway information for three types of pathways: EC, KO, and MetaCyc.
- file: A character, the path to the picrust2 export file.
- pathway: A character that specifies the pathway type.
- daa_results_df: A data frame that contains the results of the pathway_daa function.
- ko_to_kegg: A logical (TRUE or FALSE) that determines whether to convert KO abundance to KEGG pathway abundance.
- The function checks whether the file and daa_results_df arguments are null. If both are null, it throws an error.
- It checks the file format and reads the file.
- It adds a column named 'description' and annotates the pathways:
  - For the 'KO' pathway, it loads the KO_reference.RData file, checks for matching KO_reference values, and fills in the 'description' column.
  - For the 'EC' and 'MetaCyc' pathways, it loads the respective reference files and proceeds in a similar way.
- If the daa_results_df argument is not null, it checks whether to convert the KO abundance to KEGG pathway abundance:
  - If not, it proceeds in the same way as described above.
  - If yes, it filters daa_results_df based on a p-value threshold of 0.05, adds four new columns, and connects to the KEGG database to get the latest results. If too many pathways are statistically significant, it prints a message saying so. Otherwise, it uses the keggGet function to retrieve the pathway information and fills in the new columns.
- Finally, it returns the annotated abundance or the annotated daa_results_df, depending on the input.
The function returns the annotated abundance or the annotated daa_results_df.
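The annotation steps described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation. The reference table, its column names (id, description), the feature and p_adjust columns, and the max_lookups limit are all hypothetical, and the KEGG lookup itself is left out because it needs a network connection.

```r
# Hypothetical stand-in for a packaged reference table such as KO_reference.
# The descriptions are placeholders, not real KEGG annotations.
ko_reference <- data.frame(
  id          = c("K00001", "K00002", "K00003"),
  description = c("placeholder description for K00001",
                  "placeholder description for K00002",
                  "placeholder description for K00003"),
  stringsAsFactors = FALSE
)

# Add a 'description' column by matching feature IDs against the reference.
# Features without a match get NA.
annotate_features <- function(df, reference, id_col = "feature") {
  idx <- match(df[[id_col]], reference$id)
  df$description <- reference$description[idx]
  df
}

# Hypothetical DA results; the column names are assumptions.
daa_results <- data.frame(
  feature  = c("K00001", "K00003", "K99999"),
  p_adjust = c(0.01, 0.20, 0.03),
  stringsAsFactors = FALSE
)

annotated <- annotate_features(daa_results, ko_reference)

# Keep only features below the 0.05 threshold mentioned in the text.
significant <- annotated[annotated$p_adjust < 0.05, , drop = FALSE]

# Warn when a remote lookup would involve too many features.
# The limit of 100 is an arbitrary illustrative value.
max_lookups <- 100
if (nrow(significant) > max_lookups) {
  message("Too many significant features; consider narrowing the selection.")
}
```

Using match() keeps the row order of the input and leaves unmatched features as NA instead of dropping them, which makes missing annotations easy to spot.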
The ko2kegg_abundance function converts the KO abundance in picrust2 export files to KEGG pathway abundance.
- file: A character string, the path to the file containing the KO abundance data obtained from picrust2 export files.
The function returns a data frame, kegg_abundance.
- The function first determines the file format based on the file extension.
- It then reads the file using the appropriate delimiter for that format. If the file format is not one of the accepted formats (.tsv, .txt, or .csv), the function stops with an error message.
- The KEGG reference data is loaded from the ggpicrust2 package.
- Sample names and KEGG pathway names are extracted from the abundance data and the reference data.
- A matrix is initialized to store the KEGG pathway abundance.
- The function then loops through each KEGG pathway and sample to calculate the abundance.
- The KEGG pathways with zero abundance in all samples are removed.
- The KEGG abundance matrix is converted to a data frame.
- The updated KEGG abundance data frame is returned as the final output.
Note: The calculation may take a long time, so the function provides a message to inform the user to be patient.
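The conversion steps above can be sketched in base R as follows. This is a rough illustration, not the package's code. The helper names read_abundance and ko_to_pathway_abundance, the layout of the KO-to-pathway map, and the use of a tab delimiter for .txt files are assumptions, and summing KO abundances per pathway is one plausible reading of "calculate the abundance".

```r
# Read an abundance table, picking the delimiter from the file extension.
# Assumption: .tsv and .txt are tab-delimited, .csv is comma-delimited.
read_abundance <- function(file) {
  ext <- tolower(tools::file_ext(file))
  if (ext %in% c("tsv", "txt")) {
    sep <- "\t"
  } else if (ext == "csv") {
    sep <- ","
  } else {
    stop("Unsupported file format: expected .tsv, .txt, or .csv")
  }
  utils::read.table(file, sep = sep, header = TRUE, check.names = FALSE,
                    stringsAsFactors = FALSE, quote = "", comment.char = "")
}

# abundance: first column holds KO IDs, remaining columns are samples.
# ko_map: hypothetical reference with columns 'pathway' and 'ko'.
ko_to_pathway_abundance <- function(abundance, ko_map) {
  ko_ids   <- abundance[[1]]
  counts   <- as.matrix(abundance[, -1, drop = FALSE])
  samples  <- colnames(counts)
  pathways <- unique(ko_map$pathway)

  # Initialize the pathway-by-sample matrix with zeros.
  out <- matrix(0, nrow = length(pathways), ncol = length(samples),
                dimnames = list(pathways, samples))

  # For each pathway and sample, sum the abundance of its member KOs.
  for (p in pathways) {
    rows <- ko_ids %in% ko_map$ko[ko_map$pathway == p]
    for (s in samples) {
      out[p, s] <- sum(counts[rows, s])
    }
  }

  # Drop pathways that are zero in every sample, then return a data frame.
  out <- out[rowSums(out) != 0, , drop = FALSE]
  as.data.frame(out)
}

# Small made-up example written to a temporary file.
tmp <- tempfile(fileext = ".tsv")
writeLines(c("function\tS1\tS2",
             "K00001\t1\t2",
             "K00002\t3\t4",
             "K00003\t0\t0"), tmp)
ko_map <- data.frame(pathway = c("pathA", "pathA", "pathB"),
                     ko      = c("K00001", "K00002", "K00003"),
                     stringsAsFactors = FALSE)

kegg_abundance <- ko_to_pathway_abundance(read_abundance(tmp), ko_map)
```

The nested loop mirrors the description and is easy to follow, which also shows why the real calculation can be slow on large tables. A grouped sum such as rowsum() would be a faster alternative.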