How to Format Input Data:

  • All data should be saved as a tab-separated value (.tsv) file.
  • The top line of the file should be a header line.
  • There should be three columns of data on each line with the following headers:
    • Name - The name of the gene set.
    • Molecules - Molecules in the set that were enriched, each separated by a space.
    • Value - The p-value, q-value, or whatever metric the sets are evaluated by.
  • The order of these columns does not matter, but each line of data must match the order of the header.
  • Click here to download an example data file.
  • This is what the first three lines of a data file might look like:
Name	Molecules	Value
GO_PODOSOME_ASSEMBLY	DBNL BIN2 SRC MSN ASAP1 ARHGEF2	0.074073679
GO_PROFILIN_BINDING	VASP CTTN WIPF1 HTT DBN1	0.2582822646196803
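
As a rough illustration of the format described above, a file like this could be read with Python's standard csv module. This is only a sketch: the function name read_gene_sets and the output dictionary keys are hypothetical, not part of the tool. Columns are looked up by header name, so their order does not matter.

```python
import csv
import io

# Header names taken from the format description above.
REQUIRED_COLUMNS = {"Name", "Molecules", "Value"}


def read_gene_sets(handle):
    """Parse a tab-separated gene set file into a list of dicts.

    `handle` is any open text file object. Columns are matched by
    header name, so they may appear in any order.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing header column(s): {sorted(missing)}")

    gene_sets = []
    for row in reader:
        gene_sets.append({
            "name": row["Name"],
            # Molecules are separated by spaces within the column.
            "molecules": row["Molecules"].split(),
            # The metric (p-value, q-value, ...) is stored as a float.
            "value": float(row["Value"]),
        })
    return gene_sets


# The three example lines shown above, with tabs between columns.
example = (
    "Name\tMolecules\tValue\n"
    "GO_PODOSOME_ASSEMBLY\tDBNL BIN2 SRC MSN ASAP1 ARHGEF2\t0.074073679\n"
    "GO_PROFILIN_BINDING\tVASP CTTN WIPF1 HTT DBN1\t0.2582822646196803\n"
)
gene_sets = read_gene_sets(io.StringIO(example))
```

For a file on disk, the same function could be called as read_gene_sets(open(path, newline="")), where path is the location of the .tsv file.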

Screenshot

  • On Selection: A screenshot of the current graph will be taken and saved as a PNG.

Zoom

  • Click and Drag on Graph: Zoom in on a selected area on the graph.
  • Click and Drag on Axis: Zoom in on the selected range on UMAP-1 or UMAP-2.
  • Double Click on Graph: Apply default zoom which fits all points, even hidden ones.

Pan

  • Click and Drag on Graph: Pan across the graph.
  • Click and Drag on Axis: Pan across the selected axis.
  • Double Click on Graph: Apply default zoom which fits all points, even hidden ones.

Box Select

  • Click and Drag on Graph: Select points in a box to compare below graph.
  • Double Click on Graph: Exit selection mode.

Lasso Select

  • Click and Drag on Graph: Select points in a custom area to compare below graph.
  • Double Click on Graph: Exit selection mode.

Zoom In

  • On Selection: Zoom in on the center of the graph.

Zoom Out

  • On Selection: Zoom out from the center of the graph.

Autoscale

  • On Selection: Scale graph to fit all points and labels on screen.
  • Double Click on Graph: Apply default zoom which fits all points, even hidden ones.

Reset Scale

  • On Selection: Apply default zoom which fits all points, even hidden ones.

Graph and Visual Settings

  • The color of all points is a gradient between two colors.
  • A gene set's position on this gradient is determined based on its p-value.
  • Most Significant Color (RGB): Determines color for most significant gene set.
  • Least Significant Color (RGB): Determines color for least significant gene set.
  • Selected Color (RGB): Controls the color of the outline placed around selected points on graph.
  • The p-value range determines which points from the data set are displayed on the graph.
  • Maximum (Float): The largest p-value a displayed point can have.
  • Minimum (Float): The smallest p-value a displayed point can have.
  • This also sets the range for the color gradient.
    • Maximum sets the p-value for the least significant color.
    • Minimum sets the p-value for most significant color.
  • Fixed (Float): Every point is the same size; the value is the length of its diameter in pixels.
  • Dynamic (Float): The size of each point is determined by its number of enriched molecules.
    • The number of molecules is treated as the area, and can be scaled by any positive float.
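
The color, range, and size settings above can be sketched in a few lines of Python. This is a minimal illustration and not the tool's actual implementation. It assumes that the gradient is a linear interpolation between the two RGB colors, that the range limits are inclusive, and that the dynamic scale factor multiplies the molecule count before it is treated as an area. All function names are hypothetical.

```python
import math


def is_displayed(p_value, minimum, maximum):
    """Return True if a point falls inside the displayed p-value range.

    Assumption: both limits are inclusive.
    """
    return minimum <= p_value <= maximum


def gradient_color(p_value, minimum, maximum, most_rgb, least_rgb):
    """Pick an RGB color between the most and least significant colors.

    Assumption: linear interpolation. `minimum` maps to the most
    significant color and `maximum` to the least significant color.
    """
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    t = (p_value - minimum) / (maximum - minimum)
    t = max(0.0, min(1.0, t))  # clamp values outside the range
    return tuple(round(a + t * (b - a)) for a, b in zip(most_rgb, least_rgb))


def dynamic_diameter(num_molecules, scale=1.0):
    """Diameter of a point whose area is the scaled molecule count.

    Assumption: area = scale * num_molecules, so the diameter follows
    from area = pi * (diameter / 2) ** 2.
    """
    if scale <= 0:
        raise ValueError("scale must be a positive float")
    area = scale * num_molecules
    return 2.0 * math.sqrt(area / math.pi)


# A gene set halfway through the range gets the midpoint color.
color = gradient_color(0.5, 0.0, 1.0, (0, 0, 0), (200, 100, 50))
```

Treating the molecule count as an area rather than a diameter keeps large gene sets from visually dominating the graph, since the diameter grows only with the square root of the count.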

UMAP Settings

  • Input type: Integer
  • Sets the number of neighbors parameter used in the UMAP calculation.
  • Smaller values result in smaller clusters in the embedding, while higher values produce larger clusters.
  • Click here for more information.
  • Input type: Float
  • Controls the minimum distance between points in the final embedding.
  • Lower values mean points and clusters will be closer together, while larger values push them apart.
  • Click here for more information.
  • Input type: Integer
  • Sets the random state of UMAP to a set number for reproducibility.
  • Click here for more information.
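
For readers who want to reproduce an embedding outside the tool, the three settings above correspond to the n_neighbors, min_dist, and random_state arguments of the UMAP class in the umap-learn Python package. The sketch below assumes that package is what performs the calculation, which this page does not state. The helper names build_umap_kwargs and embed are hypothetical, and the input feature matrix is left to the caller.

```python
def build_umap_kwargs(n_neighbors, min_dist, random_state):
    """Validate the three settings and return keyword arguments for UMAP."""
    if isinstance(n_neighbors, bool) or not isinstance(n_neighbors, int):
        raise TypeError("n_neighbors must be an integer")
    if n_neighbors < 2:
        raise ValueError("n_neighbors must be at least 2")
    if isinstance(random_state, bool) or not isinstance(random_state, int):
        raise TypeError("random_state must be an integer")
    min_dist = float(min_dist)
    if min_dist < 0:
        raise ValueError("min_dist cannot be negative")
    return {
        "n_neighbors": n_neighbors,
        "min_dist": min_dist,
        "random_state": random_state,
    }


def embed(features, **umap_kwargs):
    """Project a feature matrix to two dimensions (UMAP-1 and UMAP-2).

    Requires the third-party umap-learn package, imported here so the
    validation helper above works without it.
    """
    import umap

    reducer = umap.UMAP(n_components=2, **umap_kwargs)
    return reducer.fit_transform(features)


# Example settings; the values are illustrative, not recommendations.
settings = build_umap_kwargs(n_neighbors=15, min_dist=0.1, random_state=42)
```

With the same input data and a fixed random_state, repeated calls to embed(features, **settings) should give the same layout, which is the reproducibility the random state setting is meant to provide.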