# Introduction to NovAST

NovAST is a deep-learning framework for automated label transfer and novel cell type discovery in spatial transcriptomics.

## Installation
The following commands install NovAST into a fresh Python virtual environment:
```
# clone the code from GitHub and enter the repository
git clone https://github.com/YMa-lab/NovAST.git
cd NovAST
# create a new virtual environment (Python 3.11)
python -m venv NovAST_env
# activate the Python environment
source NovAST_env/bin/activate
# install the package
pip install .
```

## Quickstart

NovAST requires both **reference** and **target** datasets in **AnnData (.h5ad)** format. 
First, set up the basic paths and settings:

```python
# Directory where outputs are written
savedir = "./"
# Reference (training) and target (test) datasets
train_path = "path_to_train_dataset.h5ad"
test_path = "path_to_test_dataset.h5ad"
training_mode = "exploration"

# Column name in the reference AnnData that stores cell-type annotations
celltype_name_train = "cell_type"
name = "demo_exploration"
dataset = "dataset_name"
```
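Before launching a run, it can help to confirm that both input files exist and that the annotation column is present in the reference data. The sketch below is not part of NovAST; `check_h5ad_path` and `has_label_column` are hypothetical helper names introduced here for illustration, and the commented usage assumes the `anndata` package is installed.

```python
from pathlib import Path


def check_h5ad_path(path):
    """Return a list of problems found with an input path (empty if none)."""
    p = Path(path)
    problems = []
    if p.suffix != ".h5ad":
        problems.append(f"{p} does not have the .h5ad extension")
    if not p.is_file():
        problems.append(f"{p} does not exist")
    return problems


def has_label_column(adata, column):
    """Return True if `column` is present in the AnnData object's .obs table."""
    return column in adata.obs.columns


# Example usage with the settings defined above (requires anndata):
# import anndata as ad
# for path in (train_path, test_path):
#     for problem in check_h5ad_path(path):
#         print("warning:", problem)
# adata_ref = ad.read_h5ad(train_path)
# assert has_label_column(adata_ref, celltype_name_train)
```

Returning a list of problems rather than raising on the first one lets both paths be checked in a single pass.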

All remaining hyperparameters are defined in the file **`default_config.yaml`**.  
Users may override any of them directly when calling `run_NovAST()` if customization is needed.
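
As a rough illustration of the defaults-plus-overrides pattern described above, the sketch below merges keyword arguments on top of a dictionary of defaults. It is a generic pattern, not NovAST's internal implementation: `resolve_config` is a hypothetical helper, the dictionary stands in for values that would come from `default_config.yaml`, and the default value shown for `rounds` is invented.

```python
def resolve_config(defaults, **overrides):
    """Merge keyword overrides on top of default settings."""
    # Rejecting unknown keys is a choice made in this sketch to catch typos early.
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    return {**defaults, **overrides}


# Stand-in for settings read from default_config.yaml (values are illustrative).
defaults = {"rounds": 5, "training_mode": "exploration"}

# Only the settings passed explicitly are replaced; the rest keep their defaults.
config = resolve_config(defaults, rounds=10)
print(config)
```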

You can now run NovAST using the specified settings as follows:

```python
args = run_NovAST(
    training_mode=training_mode,
    train_path=train_path,
    test_path=test_path,
    celltype_name_train=celltype_name_train,
    name=name,
    dataset=dataset,
    rounds=10
)
```
For each training round, the pipeline saves all outputs to the specified directory, with each random seed assigned its own subfolder. This includes the trained model, the Stage-1 loss values, and a final result file named **`adata_unlabeled_final.h5ad`**, which stores the latent embeddings in **`.obsm['X_latent']`** and the final predicted labels in **`.obs['voted_final_prediction']`**.
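
To inspect the saved results, one option is to search the output directory for every `adata_unlabeled_final.h5ad` file and summarize the predictions in each. The sketch below makes no assumption about how the per-seed subfolders are named: `find_result_files` and `count_labels` are hypothetical helpers, the folder names in the demonstration are invented, and the commented usage assumes the `anndata` package is installed.

```python
import tempfile
from collections import Counter
from pathlib import Path


def find_result_files(savedir, filename="adata_unlabeled_final.h5ad"):
    """Return every result file found anywhere under `savedir`, sorted by path."""
    return sorted(Path(savedir).rglob(filename))


def count_labels(labels):
    """Count how many cells received each predicted label."""
    return Counter(labels)


# Demonstration on a throwaway directory tree with invented subfolder names.
with tempfile.TemporaryDirectory() as tmp:
    for sub in ("seed_a", "seed_b"):
        run_dir = Path(tmp) / sub
        run_dir.mkdir()
        (run_dir / "adata_unlabeled_final.h5ad").touch()
    found_dirs = [p.parent.name for p in find_result_files(tmp)]

print(found_dirs)

# Example usage on real NovAST output (requires anndata):
# import anndata as ad
# for path in find_result_files("./"):
#     adata = ad.read_h5ad(path)
#     print(path.parent.name, adata.obsm["X_latent"].shape)
#     print(count_labels(adata.obs["voted_final_prediction"]))
```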

The following call generates UMAP visualizations and spatial plots of the predicted cell types and their associated confidence scores, and saves them to each seed’s output directory.

```python
NovAST_plot(args)
```

## Tutorials
We also provide tutorials on the datasets presented in the main figures of the manuscript.