{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Deduplicating PPIs" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "iDist provides a fast way to compare large sets of protein-protein interactions (PPIs) pairwise. The method can therefore be used to deduplicate PPI datasets, which removes redundancy in the data and helps avoid bias in downstream analyses or machine learning." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from ppiref.comparison import IDist\n", "from ppiref.definitions import PPIREF_TEST_DATA_DIR\n", "\n", "# Suppress Graphein log\n", "from loguru import logger\n", "logger.disable('graphein')" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "In this example, we reuse the near-duplicate PPIs from the previous tutorial \"Comparing PPIs\" (taken from Figure 1 in the [\"Learning to design protein-protein interactions with enhanced generalization\"](https://arxiv.org/pdf/2310.18515.pdf) paper).\n", "\n", "
\n",
"
\n",
"