Welcome to shannonca’s documentation!

shannonca (Shannon Component Analysis, or SCA) is a linear dimensionality reduction technique that captures and preserves informative features of the input data. Given input data with many features and observations, it models the enrichment or depletion of each feature in a neighborhood of each observation to produce an information score matrix, which is then distilled into a smaller set of features for downstream analysis. For a detailed description, please see our preprint. SCA integrates with scanpy for use in single-cell pipelines.
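To make the idea concrete, the following is a toy sketch of neighborhood-based scoring, not shannonca's implementation. It assumes a signed z-score on binarized feature counts in place of SCA's information scores, brute-force Euclidean neighbors, and an SVD of the score matrix as the distillation step. The function names (toy_information_scores, toy_distill) and parameters are hypothetical.

```python
import numpy as np

def toy_information_scores(X, k=5):
    """Toy signed enrichment score per (observation, feature).

    Simplifying assumption: a z-score of neighborhood detection counts
    against the global detection rate. This is NOT shannonca's scoring.
    """
    X = np.asarray(X, dtype=float)
    present = (X > 0).astype(float)
    # Brute-force squared Euclidean distances between all observations.
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Indices of the k closest observations (normally including the point itself).
    nbrs = np.argsort(d2, axis=1)[:, :k]
    # For each neighborhood, count how many members have each feature present.
    counts = present[nbrs].sum(axis=1)
    p = present.mean(axis=0)              # global detection rate per feature
    expected = k * p
    sd = np.sqrt(k * p * (1.0 - p))
    sd[sd == 0] = 1.0                     # constant features get a score of 0
    return (counts - expected) / sd       # > 0 enriched, < 0 depleted

def toy_distill(X, scores, n_comps=2):
    """Project X onto the top right singular vectors of the score matrix."""
    _, _, vt = np.linalg.svd(scores, full_matrices=False)
    loadings = vt[:n_comps].T             # shape (features, n_comps)
    return np.asarray(X, dtype=float) @ loadings

rng = np.random.default_rng(0)
X_demo = rng.poisson(1.0, size=(30, 8))
scores_demo = toy_information_scores(X_demo, k=5)
embedding_demo = toy_distill(X_demo, scores_demo, n_comps=2)
print(embedding_demo.shape)
```

Because the final projection is a matrix product with fixed loadings, the resulting embedding is a linear function of the input, which is the property the description above refers to.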

In this implementation, we offer two ways to interact with SCA. The top-level API provides quick, easy access to SCA’s core functionality via functions that accept high-dimensional data and output lower-dimensional embeddings. Even this API is highly customizable, but we have made a few basic decisions, like the use of SVD to generate a base embedding, that simplify the user experience. See Quick start for a simple tutorial.
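As an illustration of what an SVD base embedding means, here is a minimal NumPy sketch. It is not the package's code: the function name svd_base_embedding and its n_comps parameter are hypothetical, and the sketch assumes no centering or scaling of the input.

```python
import numpy as np

def svd_base_embedding(X, n_comps=2):
    """Embed the rows of X using the top n_comps singular directions.

    Assumption: no centering or scaling is applied before the SVD.
    """
    X = np.asarray(X, dtype=float)
    u, s, _ = np.linalg.svd(X, full_matrices=False)
    # u * s equals X projected onto its top right singular vectors.
    return u[:, :n_comps] * s[:n_comps]

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 6))
base = svd_base_embedding(X_demo, n_comps=3)
print(base.shape)
```

An embedding like this gives each observation a low-dimensional position from which neighborhoods can be defined.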

For users looking to customize and extend SCA beyond standard use cases, we offer an object-oriented API that allows more nuanced construction of dimensionality reduction tools using SCA’s information-theoretic foundation. For example, the object-oriented API may be used to tweak the multiple testing correction method, the base embedding strategy, or the generation of k-nearest neighbor graphs from embeddings.
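To show what swapping a multiple testing correction can look like in general, here is a small standard-library sketch. It does not use shannonca's classes. The helper names (bonferroni, benjamini_hochberg, significant_features) are hypothetical, and the two corrections shown are the textbook Bonferroni and Benjamini-Hochberg adjustments.

```python
def bonferroni(pvals):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant_features(pvals, correction=bonferroni, alpha=0.05):
    """Indices whose adjusted p-value passes alpha under the chosen correction."""
    return [i for i, q in enumerate(correction(pvals)) if q <= alpha]

pvals_demo = [0.01, 0.02, 0.03, 0.5]
print(significant_features(pvals_demo, correction=bonferroni))
print(significant_features(pvals_demo, correction=benjamini_hochberg))
```

Passing the correction as a callable keeps the downstream logic unchanged while the statistical policy varies, which is the kind of flexibility a component-based design is meant to provide.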


shannonca is available on PyPI and can be installed with pip:

pip3 install shannonca

Alternatively, you can clone our GitHub repository, which contains the very latest changes but may be less stable than the version released on PyPI.
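Whichever route you choose, a quick way to confirm that the package is visible to your Python environment is to query its installed metadata. This sketch uses only the standard library (importlib.metadata, Python 3.8+). The helper name installed_version is hypothetical, and the sketch assumes the distribution is registered under the name shannonca, as in the pip command above.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

version = installed_version("shannonca")
if version is None:
    print("shannonca is not installed in this environment")
else:
    print(f"shannonca {version} is installed")
```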