Module contents

Top-level package for Cell Maps Generate Hierarchy.

cellmaps_generate_hierarchy.cellmaps_generate_hierarchycmd.main(args)[source]

Main entry point for program

Parameters:

args (list) – arguments passed on the command line, usually sys.argv[1:]

Returns:

return value of cellmaps_generate_hierarchy.runner.CellmapsGenerateHierarchy.run() or 2 if an exception is raised

Return type:

int
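
The return contract described above (the value of run() on success, 2 when an exception is raised) can be sketched as follows. This is a minimal illustration and not the package's actual code: run_pipeline is a hypothetical stand-in for CellmapsGenerateHierarchy.run(), and the '--fail' argument exists only to exercise the error path.

```python
import sys


def run_pipeline(args):
    """Hypothetical stand-in for CellmapsGenerateHierarchy.run()."""
    if '--fail' in args:
        raise RuntimeError('simulated failure')
    return 0


def main(args):
    """Return the runner's exit code, or 2 if any exception is raised."""
    try:
        return run_pipeline(args)
    except Exception as e:
        sys.stderr.write('Caught exception: ' + str(e) + '\n')
        return 2


ok_code = main([])
error_code = main(['--fail'])
```

A caller would typically pass the result to sys.exit(), for example sys.exit(main(sys.argv[1:])).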

cellmaps_generate_hierarchy.cellmaps_generate_hierarchycmd.validate_percentage(value)[source]

Runner module

class cellmaps_generate_hierarchy.runner.CellmapsGenerateHierarchy(outdir=None, inputdirs=[], ppigen=None, algorithm='leiden', maxres=80, k=10, gene_node_attributes=None, hiergen=None, name=None, organization_name=None, project_name=None, layoutalgo=None, skip_logging=True, provenance_utils=<cellmaps_utils.provenance.ProvenanceUtil object>, input_data_dict=None, ndexserver=None, ndexuser=None, ndexpassword=None, visibility=None, keep_intermediate_files=False, provenance=None)[source]

Runs steps necessary to create PPI from embedding and to generate a hierarchy

Constructor

Parameters:
  • outdir (str) – Directory to create and put results in

  • ppigen (PPINetworkGenerator) – PPI Network Generator object, should be a subclass

  • hiergen (HierarchyGenerator) – Hierarchy Generator object, should be a subclass

  • name

  • organization_name

  • project_name

  • skip_logging (bool) – If True skip logging, if None or False do NOT skip logging

  • provenance_utils

  • ndexserver (str)

  • ndexuser (str)

  • ndexpassword (str)

  • visibility (str or bool) – If set to public, PUBLIC or True, makes the hierarchy and interactome publicly visible on NDEx; otherwise they are left private

get_hierarchy_dest_file()[source]

Creates file path prefix for hierarchy

Example path: /tmp/foo/hierarchy

Returns:

Prefix path on filesystem to write Hierarchy Network

Return type:

str

get_hierarchy_parent_network_dest_file()[source]

Creates file path prefix for hierarchy parent network

Example path: /tmp/foo/hierarchy_parent

get_ppi_network_dest_file(ppi_network)[source]

Gets the path where the PPI network should be written to

Parameters:

ppi_network (ndex2.nice_cx_network.NiceCXNetwork) – PPI Network

Returns:

Path on filesystem to write the PPI network

Return type:

str

run()[source]

Runs CM4AI Generate Hierarchy

Returns:

PPI module

class cellmaps_generate_hierarchy.ppi.CosineSimilarityPPIGenerator(embeddingdirs=[], cutoffs=[0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01, 0.02, 0.03, 0.04, 0.05, 0.1])[source]

Bases: PPINetworkGenerator

Takes an embedding file in the format:

ID # # # #

where ID is the gene and the #’s are the values of its embedding vector

Constructor

PPI_CUTOFFS = [0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01, 0.02, 0.03, 0.04, 0.05, 0.1]
get_next_network()[source]

Gets all the edges

Parameters:

cutoff (float) – Fraction of top edges to keep; 0.01 means 1%, 0.5 means 50%

Returns:

Network

Return type:

ndex2.nice_cx_network.NiceCXNetwork
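
The idea of keeping only the top fraction of edges ranked by cosine similarity can be sketched in plain Python. This is an illustration under stated assumptions, not the class's implementation: the function names, the toy embeddings, and the choice to truncate the edge count with int() are all hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_edges(embeddings, cutoff):
    """Return the top `cutoff` fraction of gene pairs ranked by similarity."""
    scored = [
        (cosine_similarity(embeddings[g1], embeddings[g2]), g1, g2)
        for g1, g2 in combinations(sorted(embeddings), 2)
    ]
    scored.sort(reverse=True)
    # Assumption: the number of edges kept is truncated, not rounded
    keep = int(len(scored) * cutoff)
    return [(g1, g2) for _, g1, g2 in scored[:keep]]


# Toy embeddings: gene ID mapped to its embedding vector
embeddings = {
    'A': [1.0, 0.0],
    'B': [1.0, 0.1],
    'C': [0.0, 1.0],
    'D': [0.1, 1.0],
}
edges = top_edges(embeddings, 0.5)  # keep 50% of the 6 possible pairs
```

With a cutoff of 0.5, three of the six candidate pairs are kept; smaller cutoffs such as those in PPI_CUTOFFS yield progressively sparser networks on real data.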

class cellmaps_generate_hierarchy.ppi.PPINetworkGenerator[source]

Bases: object

Base class for objects that generate Protein to Protein interaction networks

Constructor

get_next_network()[source]

Gets next protein to protein interaction network

Returns:

Network

Return type:

ndex2.nice_cx_network.NiceCXNetwork

Hierarchy module

class cellmaps_generate_hierarchy.hierarchy.CDAPSHiDeFHierarchyGenerator(hidef_cmd='hidef_finder.py', provenance_utils=<cellmaps_utils.provenance.ProvenanceUtil object>, refiner=None, hcxconverter=None, hierarchy_parent_cutoff=0.1, author='cellmaps_generate_hierarchy', version='0.2.2', bootstrap_edges=0)[source]

Bases: HierarchyGenerator

Generates hierarchy using HiDeF

Parameters:
  • hidef_cmd (str) – HiDeF command line binary

  • provenance_utils

  • author (str)

  • version

ATTR_DEC_NAME = 'attributeDeclarations'
BOOTSTRAP_EDGES = 0
CDAPS_JSON_FILE = 'cdaps.json'
CDRES_KEY_NAME = 'communityDetectionResult'
EDGELIST_TSV = '.id.edgelist.tsv'
HIDEF_OUT_PREFIX = 'hidef_output'
HIERARCHY_PARENT_CUTOFF = 0.1
NODE_CX_KEY_NAME = 'nodeAttributesAsCX2'
PERSISTENCE_COL_NAME = 'HiDeF_persistence'
TRANSLATED_HIDEF_OUT_PREFIX = 'hidefnames_output'
convert_hidef_output_to_cdaps(out_stream, outdir)[source]

Looks for x.nodes and x.edges in outdir directory to generate output in COMMUNITYDETECTRESULT format: https://github.com/idekerlab/communitydetection-rest-server/wiki/COMMUNITYDETECTRESULT-format

This method leverages #write_members_for_row() and #write_communities() to write output

Parameters:
  • out_stream (file like object) – output stream to write results

  • outdir (str)

Returns:

None

get_hierarchy(networks, algorithm='leiden', maxres=80, k=10)[source]

Runs HiDeF to generate hierarchy and registers resulting output files with FAIRSCAPE. To do this the method generates edgelist files from the CX files corresponding to the networks, using the internal node ids for edge source and target names. These files are written to the same directory as the networks. HiDeF is then given all these networks via the --g flag.

Warning

Due to FAIRSCAPE registration this method is NOT threadsafe and cannot be called in parallel or with any other call that is updating FAIRSCAPE registration on the current RO-CRATE

Parameters:
  • networks (list) – Paths (without suffix, i.e. .cx) to PPI networks to be used as input to HiDeF

  • algorithm (str) – The algorithm to use for community detection (default is ‘leiden’).

  • maxres (int) – The maximum resolution parameter for HiDeF (default is 80).

  • k (int) – The k parameter for HiDeF (default is 10).

Raises:

CellmapsGenerateHierarchyError – If there was an error

Returns:

Resulting hierarchy or None if no hierarchy from HiDeF

Returns:

(hierarchy as list, parent ppi as list, hierarchyurl, parenturl) or None, None if not created

Return type:

tuple
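
Passing several edgelist files to HiDeF through a single --g flag might look like the sketch below. Only --g is confirmed by the description above; the other flag names (--o, --alg, --maxres, --k), the helper name build_hidef_command, and the example file names are assumptions that should be checked against the help output of hidef_finder.py.

```python
def build_hidef_command(hidef_cmd, edgelist_files, out_prefix,
                        algorithm='leiden', maxres=80, k=10):
    """Assemble an argument list suitable for subprocess.run()."""
    cmd = [hidef_cmd, '--g']
    # All edgelist files follow the single --g flag
    cmd.extend(edgelist_files)
    # Assumption: these flag names match the installed HiDeF version
    cmd.extend(['--o', out_prefix,
                '--alg', algorithm,
                '--maxres', str(maxres),
                '--k', str(k)])
    return cmd


# Hypothetical edgelist paths using the documented '.id.edgelist.tsv' suffix
edgelists = ['/tmp/foo/ppi_cutoff_0.001.id.edgelist.tsv',
             '/tmp/foo/ppi_cutoff_0.002.id.edgelist.tsv']
cmd = build_hidef_command('hidef_finder.py', edgelists,
                          '/tmp/foo/hidef_output')
```

Building the command as a list avoids shell quoting problems when paths contain spaces.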

get_hierarchy_from_edgelists(outdir, edgelist_files, parent_net, algorithm='leiden', maxres=80, k=10)[source]

Generates a hierarchy from edgelist files using HiDeF.

This method runs the HiDeF algorithm on the provided edgelist files to generate a hierarchical community structure. It optionally refines the hierarchy, converts the HiDeF output to CDAPS format, and then uses cdapsutil to run community detection on the parent network.

Parameters:
  • outdir (str) – The output directory where HiDeF results and intermediate files will be stored.

  • edgelist_files (list) – A list of paths to edgelist files to be used as input to HiDeF.

  • parent_net (NiceCXNetwork or CX2Network) – The parent network on which community detection is performed.

  • algorithm (str) – The algorithm to use for community detection (default is ‘leiden’).

  • maxres (int) – The maximum resolution parameter for HiDeF (default is 80).

  • k (int) – The k parameter for HiDeF (default is 10).

Returns:

A tuple containing the resulting hierarchy and the path to the CDAPS output JSON file, or (None, None) if an error occurs.

Return type:

tuple (hierarchy, str) or (None, None)

Raises:

FileNotFoundError – If no output is generated from HiDeF.

update_cluster_node_map(cluster_node_map, cluster, max_node_id)[source]

Updates ‘cluster_node_map’ which is in format of

<cluster name> => <node id>

by adding ‘cluster’ to ‘cluster_node_map’ if it does not exist

Parameters:
  • cluster_node_map (dict) – map of cluster names to node ids

  • cluster (str) – name of cluster

  • max_node_id (int) – current max node id

Returns:

(new ‘max_node_id’ if ‘cluster’ was added otherwise ‘max_node_id’, id corresponding to ‘cluster’ found in ‘cluster_node_map’)

Return type:

tuple
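
The bookkeeping described above can be sketched as a standalone function. This is an illustration, not the method's source; in particular, assigning max_node_id + 1 to a newly seen cluster is an assumption, and the cluster names are hypothetical.

```python
def update_cluster_node_map(cluster_node_map, cluster, max_node_id):
    """Add `cluster` to the map if missing and return (max_node_id, id)."""
    if cluster not in cluster_node_map:
        # Assumption: a new cluster receives the next unused node id
        max_node_id += 1
        cluster_node_map[cluster] = max_node_id
    return max_node_id, cluster_node_map[cluster]


cluster_node_map = {}
first = update_cluster_node_map(cluster_node_map, 'Cluster0-0', 9)
second = update_cluster_node_map(cluster_node_map, 'Cluster1-0', first[0])
repeat = update_cluster_node_map(cluster_node_map, 'Cluster0-0', second[0])
```

Calling the function again with a cluster already in the map leaves both the map and the maximum id unchanged.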

update_persistence_map(persistence_node_map, node_id, persistence_val)[source]
Parameters:
  • persistence_node_map

  • node_id

  • persistence_val

Returns:

write_communities(out_stream, edge_file, cluster_node_map)[source]

Writes out links between clusters in COMMUNITYDETECTRESULT format as noted in #convert_hidef_output_to_cdaps()

using hidef edge file set in ‘edge_file’ that is expected to be in this tab delimited format:

<SOURCE CLUSTER> <TARGET CLUSTER> <default>

This function converts the <SOURCE CLUSTER> <TARGET CLUSTER> to new node ids (leveraging ‘cluster_node_map’)

and writes the following output:

<SOURCE CLUSTER NODE ID>,<TARGET CLUSTER NODE ID>,c-c;

to the ‘out_stream’

Parameters:
  • out_stream (file like object) – output stream

  • edge_file (str) – path to hidef edges file

Returns:

None
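
The translation from a tab-delimited edges file to c-c entries can be sketched as follows. This is a standalone approximation of the behavior described above, not the method's source; the cluster names, node ids, and temporary file are hypothetical.

```python
import csv
import io
import os
import tempfile


def write_communities(out_stream, edge_file, cluster_node_map):
    """Write '<source id>,<target id>,c-c;' for each edge in edge_file."""
    with open(edge_file, newline='') as f:
        for row in csv.reader(f, delimiter='\t'):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            source_id = cluster_node_map[row[0]]
            target_id = cluster_node_map[row[1]]
            out_stream.write(f'{source_id},{target_id},c-c;')


with tempfile.TemporaryDirectory() as tmpdir:
    edge_file = os.path.join(tmpdir, 'hidef_output.edges')
    with open(edge_file, 'w') as f:
        f.write('Cluster0-0\tCluster1-0\tdefault\n')
        f.write('Cluster0-0\tCluster1-1\tdefault\n')
    out_stream = io.StringIO()
    write_communities(out_stream, edge_file,
                      {'Cluster0-0': 10, 'Cluster1-0': 11, 'Cluster1-1': 12})
    communities = out_stream.getvalue()
```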

write_members_for_row(out_stream, row, cur_node_id)[source]

Given a row from the nodes file from hidef, outputs the members of the cluster by parsing the <SPACE DELIMITED NODE IDS> as mentioned in the #get_max_node_id() description.

The output is written to out_stream for each node id in format:

<cur_node_id>,<node id>,c-m;

Parameters:
  • out_stream (file like object)

  • row (iterator) – Should be a line from hidef nodes file parsed by csv.reader()

  • cur_node_id (int) – id of cluster that contains the nodes

Returns:

None
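
A minimal sketch of the member output follows. It is not the method's source: the position of the space-delimited node ids within the row (column index 2 here) is an assumption, exposed as the hypothetical members_col parameter, and the sample line is invented for illustration.

```python
import csv
import io


def write_members_for_row(out_stream, row, cur_node_id, members_col=2):
    """Write '<cur_node_id>,<node id>,c-m;' for each member in the row."""
    # Assumption: the members column holds space-delimited node ids
    for node_id in row[members_col].split():
        out_stream.write(f'{cur_node_id},{node_id},c-m;')


# Hypothetical tab-delimited line from a HiDeF nodes file
line = 'Cluster1-0\t3\t4 7 9\t12\n'
row = next(csv.reader([line], delimiter='\t'))
out_stream = io.StringIO()
write_members_for_row(out_stream, row, 11)
members = out_stream.getvalue()
```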

write_persistence_node_attribute(out_stream, persistence_map)[source]
Parameters:
  • out_stream

  • persistence_map

Returns:

class cellmaps_generate_hierarchy.hierarchy.HierarchyGenerator(provenance_utils=<cellmaps_utils.provenance.ProvenanceUtil object>, author='cellmaps_generate_hierarchy', version='0.2.2')[source]

Bases: object

Base class for generating hierarchy that is output in CX format following CDAPS style

Constructor

get_generated_dataset_ids()[source]

Gets IDs of datasets created by this object that have been registered with FAIRSCAPE

get_hierarchy(networks, algorithm='leiden', maxres=80, k=10)[source]

Gets hierarchy

Returns:

(hierarchy as list, parent ppi as list)

Return type:

tuple

Mature hierarchy module

class cellmaps_generate_hierarchy.maturehierarchy.HiDeFHierarchyRefiner(ci_thre=0.75, ji_thre=0.9, min_term_size=4, min_diff=1, provenance_utils=<cellmaps_utils.provenance.ProvenanceUtil object>, author='cellmaps_generate_hierarchy', version='0.2.2')[source]

Bases: object

Refines HiDeF hierarchy output by removing highly similar terms. This code is derived from maturehierarchy.py.

Constructor

Parameters:
  • ci_thre – Containment index threshold

  • ji_thre – Jaccard index threshold for merging similar clusters

  • min_term_size – Minimum number of proteins each system must have

  • min_diff – Minimum difference in number of proteins for every parent-child pair

CHILD_COL = 'child'
CONTAINMENT_THRESHOLD = 0.75
DEFAULT_TYPE = 'default'
EDGES_SUFFIX = '.edges'
GENES_COL = 'genes'
GENE_TYPE = 'gene'
JACCARD_THRESHOLD = 0.9
MIN_DIFF = 1
MIN_SYSTEM_SIZE = 4
NODES_SUFFIX = '.nodes'
PARENT_COL = 'parent'
STABILITY_COL = 'stability'
TERMS_COL = 'terms'
TSIZE_COL = 'tsize'
TYPE_COL = 'type'
refine_hierarchy(outprefix=None)[source]

Removes highly similar systems and dumps out a new HiDeF formatted .nodes and .edges file with .pruned.nodes and .pruned.edges suffixes

Parameters:

outprefix (str) – output_dir/file_prefix for the output file

Returns:

dataset ids of .pruned.nodes and .pruned.edges file generated

Return type:

list
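
The two similarity measures behind the ji_thre and ci_thre thresholds can be illustrated on gene sets. This sketch uses common set-based definitions, which are an assumption about what the refiner computes; the function names and gene sets are hypothetical.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b)


def containment_index(a, b):
    """Fraction of the members of set a that are also in set b."""
    return len(a & b) / len(a)


child = {'G1', 'G2', 'G3', 'G4'}
parent = {'G1', 'G2', 'G3', 'G4', 'G5'}

ji = jaccard_index(child, parent)      # 4 shared / 5 total
ci = containment_index(child, parent)  # 4 shared / 4 in child

# Compare against the documented default thresholds
similar_enough_to_merge = ji >= 0.9
child_contained_in_parent = ci >= 0.75
```

Here the child is fully contained in the parent, but the two systems are not similar enough under the default Jaccard threshold to be treated as duplicates.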

HCX (Hierarchy in CX2) module

class cellmaps_generate_hierarchy.hcx.HCXFromCDAPSCXHierarchy[source]

Bases: object

Converts CDAPS Hierarchy (and parent network/interactome) into HCX hierarchy and CX2 respectively.

Constructor

VISUAL_EDITOR_PROPERTIES_ASPECT = 'visualEditorProperties'
static apply_style_to_network(network, style_filename)[source]

Applies to the given CX2Network the style of another network, which is loaded from the file at the specified path.

Parameters:
  • network (CX2Network) – The network to be converted and styled.

  • style_filename (str) – The filename of the style to be applied.

Returns:

The styled network.

Return type:

ndex2.cx2.CX2Network

get_converted_hierarchy(hierarchy=None, parent_network=None)[source]

Converts hierarchy in CX CDAPS format into HCX format and parent network from CX format into CX2 format

For the parent network, aka the interactome, it translates it from cx to cx2 using the ndex2.cx2.NoStyleCXToCX2NetworkFactory class.

The hierarchy is transformed by first annotating the hierarchy network with the required HCX annotations, using the filesystem-based HCX format in which the network attribute HCX::interactionNetworkName is set to the filename of the parent PPI.

For necessary annotations see: https://cytoscape.org/cx/cx2/hcx-specification/ and for code implementing these annotations see: https://github.com/idekerlab/hiviewutils/blob/main/hiviewutils/hackedhcx.py

Once the hierarchy is annotated, it translates it from cx to cx2 using the ndex2.cx2.NoStyleCXToCX2NetworkFactory class.

Parameters:
  • hierarchy (NiceCXNetwork) – Hierarchy network

  • parent_network (NiceCXNetwork) – Parent network

Returns:

(hierarchy as CX2Network, parent ppi as CX2Network)

Return type:

tuple

Layout module

class cellmaps_generate_hierarchy.layout.CytoscapeJSBreadthFirstLayout(layout_algorithm='breadthfirst', rest_endpoint='http://cytolayouts.ucsd.edu/cd/communitydetection/v1', retry_sleep_time=1, request_timeout=120)[source]

Bases: HierarchyLayout

Runs breadthfirst layout from http://cytolayouts.ucsd.edu/cd to get a layout

Constructor

Parameters:
  • layout_algorithm (str) – can be one of the following: circle|cose|grid|concentric|breadthfirst|dagre

  • rest_endpoint – URL for rest service

  • retry_sleep_time (int or float) – time in seconds to wait between checks of the task status with the REST service

  • request_timeout (int or float) – timeout in seconds to pass to requests library for web requests

HEADERS = {'Accept': 'application/json', 'Content-Type': 'application/json'}
add_layout(network=None, timeout=1800)[source]

Runs algorithm specified in constructor on network in place

Parameters:
  • network (NiceCXNetwork) – Hierarchy network

  • timeout (int or float) – time in seconds to wait for task to finish before failing
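
The interplay between retry_sleep_time and timeout can be sketched as a generic polling loop. This is not the class's implementation and makes no web requests: get_status is a hypothetical callable standing in for a status query to the REST service, and the status strings are invented.

```python
import time


def wait_for_task(get_status, retry_sleep_time=1, timeout=1800):
    """Poll get_status() until it reports completion or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == 'complete':
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError('task did not finish within timeout')
        # Wait before asking the service again
        time.sleep(retry_sleep_time)


# Simulated service that finishes on the third status check
statuses = iter(['submitted', 'processing', 'complete'])
finished = wait_for_task(lambda: next(statuses),
                         retry_sleep_time=0, timeout=5)
```

A short retry_sleep_time returns results sooner at the cost of more requests, while timeout bounds the total wait.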

class cellmaps_generate_hierarchy.layout.HierarchyLayout[source]

Bases: object

Base class for layout algorithms

Constructor

add_layout(network=None)[source]

Adds layout to network passed in. Subclasses should implement this method.

Parameters:

network (NiceCXNetwork)

Raises:

NotImplementedError – Always raised
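
The subclassing pattern implied by the base class can be sketched as follows. Both classes here are stand-ins defined for the example: HierarchyLayout mirrors the documented behavior, GridLayout is hypothetical, and a plain dict is used in place of a NiceCXNetwork.

```python
class HierarchyLayout(object):
    """Stand-in mirroring the documented base class."""

    def add_layout(self, network=None):
        raise NotImplementedError('Subclasses should implement')


class GridLayout(HierarchyLayout):
    """Hypothetical subclass that places nodes on a simple grid."""

    def __init__(self, columns=2, spacing=100.0):
        self._columns = columns
        self._spacing = spacing

    def add_layout(self, network=None):
        # Modify the network in place, one coordinate pair per node
        for index, node in enumerate(network['nodes']):
            node['x'] = (index % self._columns) * self._spacing
            node['y'] = (index // self._columns) * self._spacing


# A dict stands in for a real network object in this sketch
network = {'nodes': [{'id': 0}, {'id': 1}, {'id': 2}]}
GridLayout().add_layout(network)
```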

NDEx Upload module

class cellmaps_generate_hierarchy.ndexupload.NDExHierarchyUploader(ndexserver, ndexuser, ndexpassword, visibility=None)[source]

Bases: object

Base class for uploading hierarchical networks and their parent networks to NDEx.

Note:

This class is deprecated and will be removed in a future release. Please use NDExHierarchyUploader from cellmaps_utils.ndexupload instead.

Constructor

Parameters:
  • ndexserver (str)

  • ndexuser (str)

  • ndexpassword (str)

  • visibility (str or bool) – If set to public, PUBLIC or True, makes the hierarchy and interactome publicly visible on NDEx; otherwise they are left private
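
The visibility rule stated above can be expressed as a small predicate. The helper name is_public is hypothetical, and the sketch accepts only the three documented values; whether other spellings such as 'Public' are accepted by the real class is not stated here.

```python
def is_public(visibility):
    """Return True only for the documented public values."""
    if visibility is True:
        return True
    if isinstance(visibility, str) and visibility in ('public', 'PUBLIC'):
        return True
    # Anything else, including None, leaves the networks private
    return False


public_values = [is_public(v) for v in ('public', 'PUBLIC', True)]
private_values = [is_public(v) for v in (None, False, 'private')]
```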

get_cytoscape_url(ndexurl)[source]

Generates a Cytoscape URL for a given NDEx network URL.

Parameters:

ndexurl (str) – The URL of the NDEx network.

Returns:

The URL pointing to the network’s view on the Cytoscape platform.

Return type:

str

save_hierarchy_and_parent_network(hierarchy, parent_ppi)[source]

Saves both the hierarchy and its parent network to the NDEx server. This method first saves the parent network, then updates the hierarchy with HCX annotations based on the parent network’s UUID, and finally saves the updated hierarchy. It returns the UUIDs and URLs for both the hierarchy and the parent network.

Parameters:
  • hierarchy (CX2Network) – The hierarchy network to be saved.

  • parent_ppi (CX2Network) – The parent protein-protein interaction network associated with the hierarchy.

Returns:

UUIDs and URLs for both the parent network and the hierarchy.

Return type:

tuple

upload_hierary_and_parent_network_from_files(outdir)[source]

Uploads hierarchy and parent network to NDEx from CX2 files located in a specified directory. It first checks the existence of the hierarchy and parent network files, then loads them into network objects, and finally saves them to NDEx using save_hierarchy_and_parent_network method.

Parameters:

outdir (str) – The directory where the hierarchy and parent network files are located.

Returns:

UUIDs and URLs for both the hierarchy and parent network.

Return type:

tuple

Raises:

CellmapsGenerateHierarchyError – If the required hierarchy or parent network files do not exist in the directory.

Exceptions

exception cellmaps_generate_hierarchy.exceptions.CellmapsGenerateHierarchyError[source]

Bases: Exception

Base exception for cellmaps_generate_hierarchy