This is a more in-depth interface to SCA, allowing the user to define and construct arbitrary embedding strategies. There are four components to SCA’s workflow, each represented by a Python class:
Connectors accept data in some representation and calculate the nearest neighbors of each observation. For example, a Connector configured with metric=euclidean computes neighbors using Euclidean distance.
Scorers accept a representation of the data and a list of neighbors for each observation, and compute scores indicating, for each observation and each feature in the input representation, the degree of local over- or under-expression. For example, the WilcoxonScorer performs a Wilcoxon rank-sum test for each feature’s over- or under-expression in the neighborhood of each observation, and takes the negative log p-value as the score.
Correctors transform probabilities to account for multiple testing. For example, the FWERCorrector performs family-wise error rate correction.
Embedders compute low-dimensional embeddings of the input data, possibly using the above tools. For example, the SVDEmbedder projects the data onto the top right singular vectors. The SCAEmbedder first uses an SVDEmbedder to make the initial embedding, computes neighborhoods using a Connector on this representation, computes scores using a Scorer on the neighborhoods, and performs SVD on the resulting score matrix to obtain axes for projection.
This object-oriented framework allows the user to easily swap out different strategies for base embedding generation, neighborhood construction, and score computation. The top-level API uses this framework under the hood.
Embedders produce low-dimensional representations of data via the method embed. They include:
Base class for Embedders.
SCAEmbedder: Embedder that performs SCA on the input.
SVDEmbedder: Embedder that projects data to its top right singular vectors.
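The two embedding strategies described above can be sketched with plain NumPy. This is an illustration under stated assumptions, not the library's implementation: the function names (svd_embed, knn, toy_scores, sca_like_embed) are hypothetical, the scorer is a simple stand-in (neighborhood mean relative to the global mean) rather than a Wilcoxon test, no centering is applied before SVD, and the final step is assumed to project the original data onto the score-derived axes.

```python
import numpy as np

def svd_embed(X, n_comps):
    """Project the rows of X onto its top right singular vectors."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:n_comps].T

def knn(E, k):
    """Indices of the k nearest other rows of E under Euclidean distance."""
    d = np.linalg.norm(E[:, None, :] - E[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude each point from its own neighborhood
    return np.argsort(d, axis=1)[:, :k]

def toy_scores(X, nbrs):
    """Stand-in scorer (an assumption): neighborhood mean minus global mean,
    in units of the global standard deviation, per observation and feature."""
    mu = X.mean(axis=0)
    sd = X.std(axis=0) + 1e-12
    return (X[nbrs].mean(axis=1) - mu) / sd

def sca_like_embed(X, n_comps=2, k=5):
    """Base SVD embedding -> neighborhoods -> scores -> SVD of the scores."""
    base = svd_embed(X, n_comps)            # initial embedding
    nbrs = knn(base, k)                     # neighborhoods on that embedding
    S = toy_scores(X, nbrs)                 # observations x features score matrix
    _, _, vt = np.linalg.svd(S, full_matrices=False)
    return X @ vt[:n_comps].T               # project data onto score-derived axes

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
embedding = sca_like_embed(X, n_comps=2, k=4)
```

Because each stage is a separate function, replacing knn or toy_scores with another strategy leaves the rest of the pipeline unchanged, which mirrors the swap-out design described for the class-based framework.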
Scorer takes a representation of data and a list of neighborhoods for each observation, and computes significance scores for each feature in the neighborhood of each observation. These scores can be used to compute embeddings, as the SCAEmbedder does, or simply as an alternate featurization.
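As a rough illustration of a rank-sum based score, the sketch below computes a negative log p-value for each observation and feature using only the standard library. Several things here are assumptions rather than documented behavior: the function names are hypothetical, each neighborhood is compared against all remaining observations, the p-value is two-sided and uses the normal approximation without tie correction, and the score is left unsigned.

```python
import math

def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def rank_sum_score(inside, outside):
    """Negative log of the two-sided rank-sum p-value (normal approximation)."""
    n1, n2 = len(inside), len(outside)
    if n1 == 0 or n2 == 0:
        return 0.0  # nothing to compare against
    ranks = average_ranks(list(inside) + list(outside))
    w = sum(ranks[:n1])                       # rank sum of the neighborhood
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail
    return -math.log(max(p, 1e-300))

def rank_sum_scores(X, neighborhoods):
    """Score matrix: one row per observation, one column per feature."""
    n_features = len(X[0])
    scores = []
    for nbrs in neighborhoods:
        members = set(nbrs)
        row = []
        for f in range(n_features):
            inside = [X[i][f] for i in members]
            outside = [X[i][f] for i in range(len(X)) if i not in members]
            row.append(rank_sum_score(inside, outside))
        scores.append(row)
    return scores

# Feature 0 separates the two groups; feature 1 is constant.
X = [[5, 1], [6, 1], [7, 1], [0, 1], [1, 1], [2, 1]]
neighborhoods = [[0, 1, 2]] * 3 + [[3, 4, 5]] * 3
scores = rank_sum_scores(X, neighborhoods)
```

In this toy input the first feature receives a clearly positive score in every neighborhood, while the constant feature scores zero.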
Connector takes a representation of data and computes neighborhoods of each observation.
Base class for Connectors.
Make neighborhoods of observations using a metric on the input data.
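A metric-based neighborhood computation can be sketched as a brute-force search. The function name euclidean_neighbors is hypothetical, and excluding each observation from its own neighborhood is an assumption made for the example; the library may treat self-neighbors differently.

```python
import math

def euclidean_neighbors(points, k):
    """For each point, return the indices of its k nearest other points."""
    neighborhoods = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort candidate indices by Euclidean distance to point i.
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighborhoods.append(others[:k])
    return neighborhoods

points = [(0, 0), (0, 1), (5, 5), (5, 6)]
print(euclidean_neighbors(points, k=1))  # [[1], [0], [3], [2]]
```

The brute-force search costs quadratic time in the number of observations, so it is only meant to show the shape of the output: one list of neighbor indices per observation.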
Correctors accept matrices or arrays of probabilities, and correct them according to some multiple testing correction scheme.
FWERCorrector: Performs family-wise error rate correction on input p-values.
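For intuition, one standard family-wise error rate scheme is the Bonferroni correction, sketched below. Which scheme the FWERCorrector actually applies is not stated here, so this is an example of the general idea only; the function name bonferroni and its n_tests parameter are hypothetical.

```python
def bonferroni(p_values, n_tests=None):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    n = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * n) for p in p_values]

corrected = bonferroni([0.125, 0.25, 0.5])
print(corrected)  # [0.375, 0.75, 1.0]
```

Passing n_tests explicitly covers the case where the p-values shown are only a subset of all the tests that were run.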