NS-Forest v1.3 is a stable release on GitHub here. More background information is available in this post.

Cells are fundamental functional units of multicellular organisms, with different cell types playing distinct physiological roles in the body. The recent advent of single-cell transcriptional profiling using RNA sequencing is producing “big data,” enabling the identification of novel human cell types at an unprecedented rate.

NS-Forest is a method based on random forest machine learning for identifying sets of necessary and sufficient marker genes, which can be used for quantitative PCR and multiplex FISH, and to assemble consistent and reproducible cell type definitions for incorporation into the Cell Ontology (CL). The representation of defined cell type classes and their relationships in the CL using this strategy will make the cell type classes findable, accessible, interoperable, and reusable (FAIR), allowing the CL to serve as a reference knowledgebase of information about the role that distinct cellular phenotypes play in human health and disease.

The early version of NS-Forest (v1.3) is conceptualized in the following publications: