Module contents
Top-level package for Cell Maps Generate Hierarchy.
Runner module
- class cellmaps_generate_hierarchy.runner.CellmapsGenerateHierarchy(outdir=None, inputdirs=[], ppigen=None, algorithm='leiden', maxres=80, k=10, gene_node_attributes=None, hiergen=None, name=None, organization_name=None, project_name=None, layoutalgo=None, skip_logging=True, provenance_utils=<cellmaps_utils.provenance.ProvenanceUtil object>, input_data_dict=None, ndexserver=None, ndexuser=None, ndexpassword=None, visibility=None, keep_intermediate_files=False, provenance=None)[source]
Runs steps necessary to create PPI from embedding and to generate a hierarchy
Constructor
- Parameters:
outdir (str) – Directory to create and put results in
ppigen (PPINetworkGenerator) – PPI Network Generator object, should be a subclass
hiergen (HierarchyGenerator) – Hierarchy Generator object, should be a subclass
name
organization_name
project_name
skip_logging (bool) – If True skip logging, if None or False do NOT skip logging
provenance_utils
ndexserver (str)
ndexuser (str)
ndexpassword (str)
visibility (str or bool) – If set to public, PUBLIC or True, sets the hierarchy and interactome to publicly visible on NDEx; otherwise they are left private
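The visibility rule above can be sketched as a small predicate. This is only an illustration: is_public_visibility is a hypothetical helper name, not part of the package, and it encodes just the three documented values.

```python
def is_public_visibility(visibility):
    """Return True when the documented rule would make networks public.

    Hypothetical helper: 'public', 'PUBLIC' or True mean public,
    every other value (including None) means private.
    """
    if visibility is True:
        return True
    if isinstance(visibility, str):
        return visibility in ('public', 'PUBLIC')
    return False


for value in ('public', 'PUBLIC', True, None, 'private'):
    print(repr(value), '->', 'public' if is_public_visibility(value) else 'private')
```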
- get_hierarchy_dest_file()[source]
Creates file path prefix for hierarchy
Example path:
/tmp/foo/hierarchy
- Returns:
Prefix path on filesystem to write Hierarchy Network
- Return type:
- get_hierarchy_parent_network_dest_file()[source]
Creates file path prefix for hierarchy parent network
Example path:
/tmp/foo/hierarchy_parent
- get_ppi_network_dest_file(ppi_network)[source]
Gets the path where the PPI network should be written to
- Parameters:
ppi_network (ndex2.nice_cx_network.NiceCXNetwork) – PPI Network
- Returns:
Path on filesystem to write the PPI network
- Return type:
- run()[source]
Runs CM4AI Generate Hierarchy
- Returns:
PPI module
- class cellmaps_generate_hierarchy.ppi.CosineSimilarityPPIGenerator(embeddingdirs=[], cutoffs=[0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01, 0.02, 0.03, 0.04, 0.05, 0.1])[source]
Bases:
PPINetworkGenerator
Takes an embedding file of the format:
ID # # # #
where ID is the gene and the #’s are the embedding vector
Constructor
- PPI_CUTOFFS = [0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01, 0.02, 0.03, 0.04, 0.05, 0.1]
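As a rough illustration of the idea behind this class, the sketch below parses an embedding in the ID # # # # format, scores every gene pair by cosine similarity, and keeps the highest-scoring pairs. It is not the package's implementation: the function names are hypothetical, the input is assumed to be whitespace-delimited with no header row, and treating a cutoff as the fraction of top-ranked pairs to keep is an assumption about how PPI_CUTOFFS is used.

```python
import io
import math
from itertools import combinations


def read_embedding(stream):
    """Parse lines of the form 'ID # # # #' into {gene: [floats]}."""
    embedding = {}
    for line in stream:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines
        embedding[fields[0]] = [float(x) for x in fields[1:]]
    return embedding


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_edges(embedding, cutoff):
    """Keep the top `cutoff` fraction of gene pairs (ASSUMED semantics)."""
    scored = sorted(
        ((cosine_similarity(embedding[a], embedding[b]), a, b)
         for a, b in combinations(sorted(embedding), 2)),
        reverse=True)
    keep = max(1, int(len(scored) * cutoff))  # always keep at least one pair
    return [(a, b, score) for score, a, b in scored[:keep]]


example = io.StringIO('A 1 0\nB 1 0\nC 0 1\n')
print(top_edges(read_embedding(example), cutoff=0.5))
```

With one network generated per cutoff, a list of cutoffs like the default would yield a series of progressively denser networks under this assumption.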
Hierarchy module
- class cellmaps_generate_hierarchy.hierarchy.CDAPSHiDeFHierarchyGenerator(hidef_cmd='hidef_finder.py', provenance_utils=<cellmaps_utils.provenance.ProvenanceUtil object>, refiner=None, hcxconverter=None, hierarchy_parent_cutoff=0.1, author='cellmaps_generate_hierarchy', version='0.2.2', bootstrap_edges=0)[source]
Bases:
HierarchyGenerator
Generates hierarchy using HiDeF
- ATTR_DEC_NAME = 'attributeDeclarations'
- BOOTSTRAP_EDGES = 0
- CDAPS_JSON_FILE = 'cdaps.json'
- CDRES_KEY_NAME = 'communityDetectionResult'
- EDGELIST_TSV = '.id.edgelist.tsv'
- HIDEF_OUT_PREFIX = 'hidef_output'
- HIERARCHY_PARENT_CUTOFF = 0.1
- NODE_CX_KEY_NAME = 'nodeAttributesAsCX2'
- PERSISTENCE_COL_NAME = 'HiDeF_persistence'
- TRANSLATED_HIDEF_OUT_PREFIX = 'hidefnames_output'
- convert_hidef_output_to_cdaps(out_stream, outdir)[source]
Looks for x.nodes and x.edges in outdir directory to generate output in COMMUNITYDETECTRESULT format: https://github.com/idekerlab/communitydetection-rest-server/wiki/COMMUNITYDETECTRESULT-format
This method leverages write_members_for_row() and write_communities() to write output
- Parameters:
out_stream (file like object) – output stream to write results
outdir (str)
- Returns:
None
- get_hierarchy(networks, algorithm='leiden', maxres=80, k=10)[source]
Runs HiDeF to generate a hierarchy and registers the resulting output files with FAIRSCAPE. To do this, the method generates edgelist files from the CX files corresponding to the networks, using the internal node ids for edge source and target names. These files are written to the same directory as the networks. HiDeF is then given all of these networks via the --g flag.
Warning
Due to FAIRSCAPE registration this method is NOT threadsafe and cannot be called in parallel or with any other call that is updating FAIRSCAPE registration on the current RO-CRATE
- Parameters:
networks (list) – Paths (without suffix, i.e. .cx) to PPI networks to be used as input to HiDeF
algorithm (str) – The algorithm to use for community detection (default is ‘leiden’).
maxres (int) – The maximum resolution parameter for HiDeF (default is 80).
k (int) – The k parameter for HiDeF (default is 10).
- Raises:
CellmapsGenerateHierarchyError – If there was an error
- Returns:
Resulting hierarchy or None if no hierarchy from HiDeF
- Returns:
(hierarchy as list, parent ppi as list, hierarchyurl, parenturl) or None, None if not created
- Return type:
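The edgelist step described for get_hierarchy() can be sketched as follows. This is a minimal illustration under assumptions: write_id_edgelist is a hypothetical helper, edges are passed as plain (source id, target id) pairs rather than read from a CX file, and a two-column tab-delimited layout is assumed. Only the '.id.edgelist.tsv' suffix comes from the EDGELIST_TSV constant listed above.

```python
import os
import tempfile

EDGELIST_TSV = '.id.edgelist.tsv'  # suffix listed among the class constants


def write_id_edgelist(network_prefix, edges):
    """Write (source_id, target_id) pairs next to the network file.

    Hypothetical helper: `network_prefix` is the network path without
    its suffix, and the two-column layout is an assumption.
    """
    path = network_prefix + EDGELIST_TSV
    with open(path, 'w') as f:
        for source_id, target_id in edges:
            f.write(f'{source_id}\t{target_id}\n')
    return path


with tempfile.TemporaryDirectory() as tmpdir:
    prefix = os.path.join(tmpdir, 'ppi_cutoff_0.01')
    print(os.path.basename(write_id_edgelist(prefix, [(0, 1), (1, 2)])))
```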
- get_hierarchy_from_edgelists(outdir, edgelist_files, parent_net, algorithm='leiden', maxres=80, k=10)[source]
Generates a hierarchy from edgelist files using HiDeF.
This method runs the HiDeF algorithm on the provided edgelist files to generate a hierarchical community structure. It optionally refines the hierarchy, converts the HiDeF output to CDAPS format, and then uses cdapsutil to run community detection on the parent network.
- Parameters:
outdir (str) – The output directory where HiDeF results and intermediate files will be stored.
edgelist_files (list) – A list of paths to edgelist files to be used as input to HiDeF.
parent_net (NiceCXNetwork or CX2Network) – The parent network on which community detection is performed.
algorithm (str) – The algorithm to use for community detection (default is ‘leiden’).
maxres (int) – The maximum resolution parameter for HiDeF (default is 80).
k (int) – The k parameter for HiDeF (default is 10).
- Returns:
A tuple containing the resulting hierarchy and the path to the CDAPS output JSON file, or (None, None) if an error occurs.
- Return type:
- Raises:
FileNotFoundError – If no output is generated from HiDeF.
- update_cluster_node_map(cluster_node_map, cluster, max_node_id)[source]
Updates ‘cluster_node_map’ which is in format of
<cluster name> => <node id>
by adding ‘cluster’ to ‘cluster_node_map’ if it does not exist
- update_persistence_map(persistence_node_map, node_id, persistence_val)[source]
- Parameters:
persistence_node_map
node_id
persistence_val
- Returns:
- write_communities(out_stream, edge_file, cluster_node_map)[source]
Writes out links between clusters in COMMUNITYDETECTRESULT format as noted in
#convert_hidef_output_to_cdaps()
using hidef edge file set in ‘edge_file’ that is expected to be in this tab delimited format:
<SOURCE CLUSTER> <TARGET CLUSTER> <default>
This function converts the <SOURCE CLUSTER> <TARGET CLUSTER> to new node ids (leveraging ‘cluster_node_map’)
and writes the following output:
<SOURCE CLUSTER NODE ID>,<TARGET CLUSTER NODE ID>,c-c;
to the ‘out_stream’
- Parameters:
out_stream (file like object) – output stream
edge_file (str) – path to hidef edges file
- Returns:
None
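The conversion described for write_communities() can be sketched as below. This is an illustration, not the package's code: sketch_write_communities is a hypothetical name, it takes an iterable of lines instead of a file path, and the cluster names and node ids in the example are made up.

```python
import io


def sketch_write_communities(out_stream, edge_lines, cluster_node_map):
    """Emit '<source id>,<target id>,c-c;' for each HiDeF edge line.

    Each line is expected to be tab delimited:
    <SOURCE CLUSTER> <TARGET CLUSTER> <default>
    """
    for line in edge_lines:
        fields = line.rstrip('\n').split('\t')
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        source_id = cluster_node_map[fields[0]]
        target_id = cluster_node_map[fields[1]]
        out_stream.write(f'{source_id},{target_id},c-c;')


# Made-up cluster names mapped to made-up node ids
node_map = {'Cluster0-0': 5, 'Cluster1-0': 6, 'Cluster1-1': 7}
lines = ['Cluster0-0\tCluster1-0\tdefault\n',
         'Cluster0-0\tCluster1-1\tdefault\n']
out = io.StringIO()
sketch_write_communities(out, lines, node_map)
print(out.getvalue())
```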
- write_members_for_row(out_stream, row, cur_node_id)[source]
Given a row from the nodes file of the HiDeF output, writes the members of the cluster by parsing the <SPACE DELIMITED NODE IDS> as mentioned in the get_max_node_id() description.
The output is written to out_stream for each node id in format:
<cur_node_id>,<node id>,c-m;
- Parameters:
out_stream (file like object)
row (iterator) – Should be a line from hidef nodes file parsed by csv.reader()
cur_node_id (int) – id of cluster that contains the nodes
- Returns:
None
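The member output described for write_members_for_row() can be sketched as below. The column holding the space delimited node ids is not stated in this section, so the members_col default of 2 and the tab delimiter are assumptions, and sketch_write_members_for_row is a hypothetical name.

```python
import csv
import io


def sketch_write_members_for_row(out_stream, row, cur_node_id, members_col=2):
    """Emit '<cur_node_id>,<node id>,c-m;' for each member of a cluster.

    ASSUMPTION: `row[members_col]` holds the space delimited node ids.
    """
    for node_id in row[members_col].split():
        out_stream.write(f'{cur_node_id},{node_id},c-m;')


# Made-up nodes-file line: name, size, members, persistence
nodes_file = io.StringIO('Cluster1-0\t3\t10 11 12\t8\n')
row = next(csv.reader(nodes_file, delimiter='\t'))
out = io.StringIO()
sketch_write_members_for_row(out, row, cur_node_id=7)
print(out.getvalue())
```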
- class cellmaps_generate_hierarchy.hierarchy.HierarchyGenerator(provenance_utils=<cellmaps_utils.provenance.ProvenanceUtil object>, author='cellmaps_generate_hierarchy', version='0.2.2')[source]
Bases:
object
Base class for generating hierarchy that is output in CX format following CDAPS style
Constructor
Mature hierarchy module
- class cellmaps_generate_hierarchy.maturehierarchy.HiDeFHierarchyRefiner(ci_thre=0.75, ji_thre=0.9, min_term_size=4, min_diff=1, provenance_utils=<cellmaps_utils.provenance.ProvenanceUtil object>, author='cellmaps_generate_hierarchy', version='0.2.2')[source]
Bases:
object
Refines HiDeF hierarchy output by removing highly similar terms. This code is derived from maturehierarchy.py; the original author is not recorded here.
Constructor
- Parameters:
ci_thre – Containment index threshold
ji_thre – Jaccard index threshold for merging similar clusters
min_term_size – Minimum number of proteins each system must have
min_diff – Minimum difference in number of proteins for every parent-child pair
- CHILD_COL = 'child'
- CONTAINMENT_THRESHOLD = 0.75
- DEFAULT_TYPE = 'default'
- EDGES_SUFFIX = '.edges'
- GENES_COL = 'genes'
- GENE_TYPE = 'gene'
- JACCARD_THRESHOLD = 0.9
- MIN_DIFF = 1
- MIN_SYSTEM_SIZE = 4
- NODES_SUFFIX = '.nodes'
- PARENT_COL = 'parent'
- STABILITY_COL = 'stability'
- TERMS_COL = 'terms'
- TSIZE_COL = 'tsize'
- TYPE_COL = 'type'
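To make the two thresholds concrete, the sketch below computes a Jaccard index and a containment index for a pair of gene sets and compares them with the default values. The helper names are hypothetical, and defining containment as the fraction of the first set found in the second is an assumption; the refiner's exact definition is not given in this section.

```python
JACCARD_THRESHOLD = 0.9       # default ji_thre
CONTAINMENT_THRESHOLD = 0.75  # default ci_thre


def jaccard_index(a, b):
    """|A intersect B| / |A union B|, or 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment_index(a, b):
    """ASSUMED definition: fraction of the genes in `a` also found in `b`."""
    a, b = set(a), set(b)
    return len(a & b) / len(a) if a else 0.0


child = {'G1', 'G2', 'G3', 'G4'}
parent = {'G1', 'G2', 'G3', 'G4', 'G5'}
print(jaccard_index(child, parent) >= JACCARD_THRESHOLD)
print(containment_index(child, parent) >= CONTAINMENT_THRESHOLD)
```

In this made-up example the two terms differ by one gene out of five, so they fall below the Jaccard threshold even though the child is fully contained in the parent.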
HCX (Hierarchy in CX2) module
- class cellmaps_generate_hierarchy.hcx.HCXFromCDAPSCXHierarchy[source]
Bases:
object
Converts CDAPS Hierarchy (and parent network/interactome) into HCX hierarchy and CX2 respectively.
Constructor
- VISUAL_EDITOR_PROPERTIES_ASPECT = 'visualEditorProperties'
- static apply_style_to_network(network, style_filename)[source]
Applies the style to CX2Network from another network from file specified by the path.
- Parameters:
network (CX2Network) – The network to be converted and styled.
style_filename (str) – The filename of the style to be applied.
- Returns:
The styled network.
- Return type:
ndex2.cx2.CX2Network
- get_converted_hierarchy(hierarchy=None, parent_network=None)[source]
Converts hierarchy in CX CDAPS format into HCX format and parent network from CX format into CX2 format
For the parent network, aka interactome, it translates it from cx to cx2 using the ndex2.cx2.NoStyleCXToCX2NetworkFactory class.
This transformation is done by first annotating the hierarchy network with the needed HCX annotations, namely going with the filesystem based HCX format where the network attribute HCX::interactionNetworkName is set to the filename of the parent ppi.
For necessary annotations see: https://cytoscape.org/cx/cx2/hcx-specification/ and for code implementing these annotations see: https://github.com/idekerlab/hiviewutils/blob/main/hiviewutils/hackedhcx.py
Once the hierarchy is annotated, it translates it from cx to cx2 using the ndex2.cx2.NoStyleCXToCX2NetworkFactory class.
- Parameters:
hierarchy (NiceCXNetwork) – Hierarchy network
parent_network (NiceCXNetwork) – Parent network
- Returns:
(hierarchy as CX2Network, parent ppi as CX2Network)
- Return type:
Layout module
- class cellmaps_generate_hierarchy.layout.CytoscapeJSBreadthFirstLayout(layout_algorithm='breadthfirst', rest_endpoint='http://cytolayouts.ucsd.edu/cd/communitydetection/v1', retry_sleep_time=1, request_timeout=120)[source]
Bases:
HierarchyLayout
Runs breadthfirst layout from http://cytolayouts.ucsd.edu/cd to get a layout
Constructor
- Parameters:
layout_algorithm (str) – can be one of the following: circle|cose|grid|concentric|breadthfirst|dagre
rest_endpoint – URL for rest service
retry_sleep_time (int or float) – time in seconds to wait between checks of the task status on the REST service
request_timeout (int or float) – timeout in seconds to pass to the requests library for web requests
- HEADERS = {'Accept': 'application/json', 'Content-Type': 'application/json'}
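The role of retry_sleep_time can be illustrated with a generic polling loop. This sketch makes no web requests and does not reflect the service's actual protocol: wait_for_task, the 'done' status string, and the max_checks limit are all assumptions made for the example.

```python
import time


def wait_for_task(check_status, retry_sleep_time=1, max_checks=10,
                  sleep=time.sleep):
    """Poll `check_status` until it reports 'done' or checks run out.

    Hypothetical helper: `check_status` stands in for a call that asks
    the REST service about a submitted task.
    """
    for _ in range(max_checks):
        if check_status() == 'done':
            return True
        sleep(retry_sleep_time)  # wait before asking again
    return False


# Fake status source so the example runs without a network connection
statuses = iter(['processing', 'processing', 'done'])
print(wait_for_task(lambda: next(statuses), retry_sleep_time=0))
```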
- class cellmaps_generate_hierarchy.layout.HierarchyLayout[source]
Bases:
object
Base class for layout algorithms
Constructor
- add_layout(network=None)[source]
Adds layout to network passed in. Subclasses should implement
- Parameters:
network (NiceCXNetwork)
- Raises:
NotImplementedError – Always raised
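The base class contract above (add_layout() always raises until a subclass overrides it) can be sketched with stand-in classes. Both class names are hypothetical, and the network is a plain dict rather than a NiceCXNetwork so the example stays self-contained.

```python
class HierarchyLayoutSketch:
    """Stand-in for the base class: add_layout() must be overridden."""

    def add_layout(self, network=None):
        raise NotImplementedError('Subclasses should implement')


class GridLayoutSketch(HierarchyLayoutSketch):
    """Toy subclass that places nodes on a grid, row by row."""

    def __init__(self, columns=2):
        self._columns = columns

    def add_layout(self, network=None):
        # Store (column, row) coordinates for every node in the dict
        network['layout'] = {
            node: (i % self._columns, i // self._columns)
            for i, node in enumerate(network['nodes'])
        }


net = {'nodes': ['a', 'b', 'c']}
GridLayoutSketch(columns=2).add_layout(net)
print(net['layout'])
```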
NDEx Upload module
- class cellmaps_generate_hierarchy.ndexupload.NDExHierarchyUploader(ndexserver, ndexuser, ndexpassword, visibility=None)[source]
Bases:
object
Base class for uploading hierarchical networks and their parent networks to NDEx.
- Note:
This class is deprecated and will be removed in a future release. Please use NDExHierarchyUploader from cellmaps_utils.ndexupload instead.
Constructor
- Parameters:
- save_hierarchy_and_parent_network(hierarchy, parent_ppi)[source]
Saves both the hierarchy and its parent network to the NDEx server. This method first saves the parent network, then updates the hierarchy with HCX annotations based on the parent network’s UUID, and finally saves the updated hierarchy. It returns the UUIDs and URLs for both the hierarchy and the parent network.
- Parameters:
hierarchy (CX2Network) – The hierarchy network to be saved.
parent_ppi (CX2Network) – The parent protein-protein interaction network associated with the hierarchy.
- Returns:
UUIDs and URLs for both the parent network and the hierarchy.
- Return type:
- upload_hierary_and_parent_network_from_files(outdir)[source]
Uploads hierarchy and parent network to NDEx from CX2 files located in a specified directory. It first checks the existence of the hierarchy and parent network files, then loads them into network objects, and finally saves them to NDEx using save_hierarchy_and_parent_network method.
- Parameters:
outdir (str) – The directory where the hierarchy and parent network files are located.
- Returns:
UUIDs and URLs for both the hierarchy and parent network.
- Return type:
- Raises:
CellmapsGenerateHierarchyError – If the required hierarchy or parent network files do not exist in the directory.