Frequently Asked Questions

Q: Which Enrichr databases are supported?

A: GSEApy supports modEnrichr (https://amp.pharm.mssm.edu/modEnrichr/). Human, Mouse, Fly, Yeast, Worm, and Fish are all supported.

Q: How to use custom-defined gene sets (GMT-style input) in Jupyter?

A: The gene_sets argument accepts dict input, which is useful when you define your own gene sets. An example dict looks like this:

gene_sets = {
          "term_1": ["gene_A", "gene_B", ...],
          "term_2": ["gene_B", "gene_C", ...],
           ...
          "term_100": ["gene_A", "gene_T", ...]
         }

APIs that accept dict input: gsea, prerank, ssgsea, enrichr.
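If your gene sets already live in a GMT file, a dict of the shape shown above can be built with a few lines of plain Python. This is a minimal sketch, assuming the standard tab-separated GMT layout (term name, description, then one gene per column). The helper name read_gmt and the example terms are hypothetical and are not part of GSEApy.

```python
import io


def read_gmt(handle):
    """Parse GMT-formatted lines into a {term: [genes]} dict.

    Hypothetical helper for illustration; it is not a GSEApy function.
    Assumes each line is: term <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            # Skip blank or malformed lines (no genes listed).
            continue
        term, _description, *genes = fields
        # Drop empty columns caused by trailing tabs.
        gene_sets[term] = [gene for gene in genes if gene]
    return gene_sets


# An in-memory stand-in for a real .gmt file, so the sketch runs anywhere.
example_gmt = io.StringIO(
    "term_1\tdescription of term_1\tgene_A\tgene_B\n"
    "term_2\tdescription of term_2\tgene_B\tgene_C\n"
)
gene_sets = read_gmt(example_gmt)
print(gene_sets)
```

The resulting gene_sets dict has the same structure as the example above, so it could be passed as the gene_sets argument of the APIs listed. With a real file, replace the StringIO object with open("your_file.gmt").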

Q: How to use Yeast database in gseapy.enrichr()?

A: Because some library names are the same across the different Enrichr databases, you have to set the additional argument organism when you are not using Human:

gss = gseapy.get_library_name(organism='Yeast')
enr = gseapy.enrichr(gene_list=...,
                    gene_sets=gss,
                    organism='Yeast', # don't forget to set organism="Yeast"
                    )

Q: How to use Yeast database in gseapy.prerank()?

A: prerank, gsea, and ssgsea have no organism argument, but you can pass these Enrichr libraries as follows:

# get libraries you'd like to use
gss = gseapy.get_library_name(organism='Yeast')
# get a custom gmt_dict
gmt_dict = gseapy.parser.gsea_gmt_parser('GO_Biological_Process_2018', organism='Yeast')
# run
prn_res = gseapy.prerank( ..., gene_sets=gmt_dict, ...)

Q: How to save plots using gseaplot, barplot, dotplot, heatmap in Jupyter?

A: Set the ofname argument, e.g. gseaplot(..., ofname='your.plot.pdf'). That's it.

Q: What does cutoff mean in functions like enrichr(), dotplot, barplot?

A: This argument controls which terms (e.g. FDR < 0.05) are shown in figures. It does not filter the result table output.

Q: Why is ssGSEA missing p-value and FDR?

A: The original ssGSEA algorithm does not give you a pval or FDR, so please ignore the gseaplot generated by ssgsea. It is useless and misleading; for that reason, fdr and pval are not shown on the plot. If you are looking for ssGSEA with p-value output, see https://github.com/broadinstitute/ssGSEA2.0. ssGSEA2.0 uses the same method as GSEApy to calculate the p-value, but not the FDR.

Q: What is the difference between ssGSEA and Prerank?

A: In short:

  • prerank is used for comparing two groups of samples (e.g. control and treatment), where the gene ranking is defined by your own ranking method (such as t-statistic or signal-to-noise).
  • ssGSEA is used for comparing individual samples to all the rest, to find gene signatures that samples share (use ssGSEA when you have a lot of samples).

The statistics used by prerank (GSEA) and ssGSEA are different. Assume that we have calculated the running enrichment score at each position of your ranked input genes; then:

  • es for GSEA: max(running enrichment scores) or min(running enrichment scores)
  • es for ssGSEA: sum(running enrichment scores)
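The two definitions above can be sketched in plain Python. This is a simplified illustration, not GSEApy's actual implementation. It assumes a weighted running-sum walk in which hits step up in proportion to |score| ** weight and misses step down uniformly. The function names, the weight parameter, and the toy genes are all hypothetical.

```python
def running_enrichment_scores(ranked_genes, scores, gene_set, weight=1.0):
    """Running enrichment score at each position of a ranked gene list.

    Simplified sketch: genes in gene_set ("hits") step up in proportion to
    |score| ** weight; all other genes ("misses") step down uniformly.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(scores, in_set)]
    total_hit_weight = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit_weight == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running, current = [], 0.0
    for w, hit in zip(hit_weights, in_set):
        if hit:
            current += w / total_hit_weight
        else:
            current -= 1.0 / n_miss
        running.append(current)
    return running


def es_gsea(running):
    """GSEA-style ES: the max or min of the walk, whichever deviates more from zero."""
    return max(running, key=abs)


def es_ssgsea(running):
    """ssGSEA-style ES: the sum of the walk over all positions."""
    return sum(running)


# Toy ranked list, sorted from highest to lowest score.
ranked_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
running = running_enrichment_scores(ranked_genes, scores, {"g1", "g2"})

print("running:", running)
print("GSEA ES:", es_gsea(running))
print("ssGSEA ES:", es_ssgsea(running))
```

Both scores come from the same walk. The GSEA-style score keeps only its most extreme point, while the ssGSEA-style score integrates over the whole ranked list, so the two values are not on the same scale.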