Features

EvalNE has been designed as a pipeline of interconnected and interchangeable building blocks. This structure provides the flexibility to create different evaluation pipelines and, thus, to evaluate methods from node embeddings, node-pair embeddings or similarity scores. The main building blocks that constitute EvalNE as well as the types of tasks and methods it can evaluate are presented in the following diagram. Blocks represented with solid lines correspond to modules provided by the library and those with dashed lines are the user-specified methods to be evaluated.

EvalNE diagram

Note

For node classification (NC) tasks currently only nede embedding methods are supported.

Note

The hyper-parameter tuning and evaluation setup functionalities are omitted in this diagram.

A more detailed description of the library features for the practitioner and for the methodologist are presented below. Further information can be found in our paper.

For Methodologists

A command line interface in combination with a configuration file (describing datasets, methods and evaluation setup) allows the user to evaluate any embedding method and compare it to the state of the art or replicate the experimental setup of existing papers without the need to write additional code. EvalNE does not provide implementations of any NE methods but offers the necessary environment to evaluate any off-the-shelf algorithm. Implementations of NE methods can be obtained from libraries such as OpenNE or GEM as well as directly from the web pages of the authors e.g. Deepwalk, Node2vec, LINE, PRUNE, Metapath2vec, CNE.

EvalNE also includes the following LP heuristics for both directed and undirected networks (in and out node neighbourhoods), which can be used as baselines:

  • Random Prediction
  • Common Neighbours
  • Jaccard Coefficient
  • Adamic Adar Index
  • Preferential Attachment
  • Resource Allocation Index
  • Cosine Similarity
  • Leicht-Holme-Newman index
  • Topological Overlap
  • Katz similarity
  • All baselines (a combination of the first 5 heuristics in a 5-dim embedding)

For Practitioners

When used as an API, EvalNE provides functions to:

  • Load and preprocess graphs

  • Obtain general graph statistics

  • Conveniently read node/edge embeddings from files

  • Sample nodes/edges to form train/test/validation sets

  • Different approaches for edge sampling:

    • Timestamp based sampling: latest nodes are used for testing
    • Random sampling: random split of edges in train and test sets
    • Spanning tree sampling: train set will contain a spanning tree of the graph
    • Fast depth first search sampling: similar to spanning tree but based of DFS
  • Negative sampling or generation of non-edge pairs using:

    • Open world assumption: train non-edges do not overlap with train edges
    • Closed world assumption: train non-edges do not overlap with either train nor test edges
  • Evaluate LP, SP and NR for methods that output:

    • Node Embeddings
    • Node-pair Embeddings
    • Similarity scores (e.g. the ones given by LP heuristics)
  • Implements simple visualization routines for embeddings and graphs

  • Includes NC evaluation for node embedding methods

  • Provides binary operators to compute edge embeddings from node feature vectors:

    • Average
    • Hadamard
    • Weighted L1
    • Weighted L2
  • Can use any scikit-learn classifier for LP/SP/NR/NC tasks

  • Provides routines to run command line commands or functions with a given timeout

  • Includes hyperparameter tuning based on grid search

  • Implements over 10 different evaluation metrics such as AUC, F-score, etc.

  • AUC and PR curves can be provided as output

  • Includes routines to generate tabular outputs and directly parse them to Latex tables