Higashi Usage
Run the following commands to process the input data.

    cd higashi
    python Process.py [-c CONFIG]

    required arguments:
      -c CONFIG   The path to the configuration JSON file that you created in the step

This script performs the following tasks:
- generate a dictionary that maps genomic bin loci to node ids
- extract data from data.txt and convert it into hyperedges (triplets)
- create contact maps from the sparse scHi-C data for visualization, the baseline model, and node attribute generation
- run linear convolution + random-walk-with-restart (scHiCluster) to impute the contact maps, used as a baseline and for visualization
- generate node attributes
- (Optional) process co-assayed signals
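
The first two tasks in the list above can be pictured with a minimal sketch. This is not Higashi's own code: the function names, chromosome sizes, resolution, and the `(cell_id, chrom, pos1, pos2)` record layout are all assumptions chosen for illustration, and only intra-chromosomal contacts are handled.

```python
# Illustrative sketch only: names, sizes, and record layout are hypothetical,
# not taken from Process.py.

def build_bin_index(chrom_sizes, resolution):
    """Map each (chromosome, bin start) locus to a consecutive integer node id."""
    bin_to_node = {}
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, resolution):
            bin_to_node[(chrom, start)] = len(bin_to_node)
    return bin_to_node


def contacts_to_triplets(contacts, bin_to_node, resolution):
    """Turn (cell_id, chrom, pos1, pos2) records into (cell, node1, node2) triplets."""
    triplets = []
    for cell_id, chrom, pos1, pos2 in contacts:
        # Snap each position to the start of the bin that contains it.
        node1 = bin_to_node[(chrom, pos1 // resolution * resolution)]
        node2 = bin_to_node[(chrom, pos2 // resolution * resolution)]
        # Store the pair in sorted order so (a, b) and (b, a) coincide.
        triplets.append((cell_id, min(node1, node2), max(node1, node2)))
    return triplets


chrom_sizes = {"chr1": 3_000_000, "chr2": 2_000_000}  # hypothetical sizes
resolution = 1_000_000                                # hypothetical bin size
bin_to_node = build_bin_index(chrom_sizes, resolution)

contacts = [(0, "chr1", 150_000, 2_400_000), (1, "chr2", 1_900_000, 100_000)]
triplets = contacts_to_triplets(contacts, bin_to_node, resolution)
print(triplets)  # [(0, 0, 2), (1, 3, 4)]
```

Each triplet ties one cell to two genomic bins, which is the sense in which a contact becomes a hyperedge.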
Before each step runs, a message is printed to indicate progress, which helps with debugging.
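
To give a rough idea of the imputation task (linear convolution followed by random walk with restart), here is a simplified NumPy sketch. It is written in the spirit of scHiCluster but is not the code that Process.py runs: the window size, restart probability, tolerance, and function names are assumptions.

```python
import numpy as np

# Simplified sketch; parameter values and names are hypothetical.

def neighborhood_average(mat, pad=1):
    """Linear convolution: each entry becomes the mean of its (2*pad+1)^2 window."""
    n = mat.shape[0]  # assumes a square contact map
    padded = np.pad(mat, pad, mode="constant")
    out = np.zeros_like(mat, dtype=float)
    for di in range(2 * pad + 1):
        for dj in range(2 * pad + 1):
            out += padded[di:di + n, dj:dj + n]
    return out / (2 * pad + 1) ** 2


def random_walk_with_restart(mat, restart_prob=0.5, tol=1e-6, max_iter=100):
    """Iterate Q <- (1 - p) * Q @ E + p * I, with E the row-normalized input."""
    row_sums = mat.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # avoid division by zero for empty rows
    transition = mat / row_sums
    identity = np.eye(mat.shape[0])
    q = identity.copy()
    for _ in range(max_iter):
        q_next = (1 - restart_prob) * q @ transition + restart_prob * identity
        if np.linalg.norm(q_next - q) < tol:
            return q_next
        q = q_next
    return q


# A tiny, sparse, symmetric contact map for one cell (made-up values).
sparse_map = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])
smoothed = neighborhood_average(sparse_map, pad=1)
imputed = random_walk_with_restart(smoothed, restart_prob=0.5)
```

The convolution spreads each observed contact to neighboring bins, and the random walk then propagates signal along the resulting graph, so the imputed map is dense even where the input was empty.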

    python main_cell.py [-c CONFIG] [-s START]

    optional arguments:
      -s {1,2,3}  The start step of Higashi program. Can be used to continue Higashi
                  training if interrupted before. 1,2,3 stands for the following steps:
                  1. Train Higashi without cell-dependent GNN to force self-attention layers
                     to capture the heterogeneity of chromatin structures
                  2. Train Higashi with cell-dependent GNN, but with k=0
                  3. Train Higashi with cell-dependent GNN, but with k=`neighbor_num` in the
                     config JSON.
                  When set as 1, the program would execute step 1,2,3 sequentially.
                  When set as 2, the program would execute step 2,3 sequentially. (default: 1)

    required arguments:
      -c CONFIG   The path to the configuration JSON file that you created in the step 2

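
The effect of `-s` can be summarized in a small sketch. The helper names and the config path below are hypothetical, and the behavior for `-s 3` (running only step 3) is an assumption extrapolated from the documented cases for 1 and 2.

```python
# Hypothetical helpers for illustration; they are not part of Higashi.

def steps_to_run(start=1):
    """Return the training steps executed for a given -s value."""
    if start not in (1, 2, 3):
        raise ValueError("start must be 1, 2, or 3")
    # Steps run sequentially from the start step through step 3.
    return list(range(start, 4))


def build_command(config_path, start=1):
    """Build the argument list for launching main_cell.py."""
    return ["python", "main_cell.py", "-c", config_path, "-s", str(start)]


print(steps_to_run(1))  # [1, 2, 3]
print(steps_to_run(2))  # [2, 3]
print(build_command("path/to/config.JSON", start=2))  # placeholder path
```

The list returned by `build_command` could be passed to `subprocess.run` from inside the `higashi` directory.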
**Extra Notes:**
Higashi saves the model parameters and embeddings every 5 epochs, so the user can check whether the embeddings look good while training is still running. For instance, suppose the user is unsure how many epochs Higashi needs to converge on a new dataset and sets `embedding_epoch` to 120 to be on the safe side. During training, the user finds that the embeddings converge at around epoch 58. Instead of waiting for all 120 epochs to finish, the user can wait until the model finishes epoch 60 (since the model saves parameters every 5 epochs) and then interrupt the Higashi program. The user can then restart Higashi with the option `-s 2` to load the pre-trained model and skip the first embedding generation training stage.
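
The arithmetic in this example is simple enough to sketch. The function name is hypothetical, and the code assumes checkpoints fall on epochs that are multiples of the save interval, as the epoch 60 example suggests.

```python
def first_checkpoint_at_or_after(epoch, save_interval=5):
    """Return the first saved epoch that is not earlier than `epoch`."""
    # Ceiling division without floats: -(-a // b) == ceil(a / b) for positive b.
    return -(-epoch // save_interval) * save_interval


# Embeddings look converged at epoch 58, so wait for the next checkpoint.
print(first_checkpoint_at_or_after(58))  # 60
```

Once that checkpoint is written, the run can be interrupted and restarted with `-s 2`.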