{
    "info": {
        "author": "Peter Vegh",
        "author_email": "peter.vegh@newcastle.ac.uk",
        "bugtrack_url": null,
        "classifiers": [
            "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)",
            "Operating System :: OS Independent",
            "Programming Language :: Python :: 3"
        ],
        "description": "# pyVDJ\n\nV(D)J sequencing data analysis\n\nThis package adds 10x Genomics V(D)J sequencing data to an [AnnData](https://anndata.readthedocs.io) object's `.uns` part, and also makes annotation columns in `.obs`.\nThis enables plotting various V(D)J properties and handling mRNA (GEX) and V(D)J sequencing data together.\n\n\n## Install\n\n`pip3 install pyvdj`\n\n\n## Usage\n\n    import pyvdj\n    adata = pyvdj.load_vdj(paths, samples, adata)\n    adata = pyvdj.add_obs(adata, obs=['is_clone'])\n\nFor a detailed description, see the [tutorial](tutorials/pyVDJ_tutorial.html).\n\n\n## Details\n\nThe package has functions that\n* read `metrics_summary.csv` files into a pandas dataframe.\n* load `filtered_contig_annotations.csv` files into an AnnData object.\n* create various statistics and annotations in the AnnData object.\n\n\n### Read metrics\n\nThe `read10xsummary` function requires a list of paths to `metrics_summary.csv` files, and optionally a dictionary of path:samplename. It returns a dataframe of the metrics.\n\n\n### Load V(D)J data\n\nThe `load_vdj` function loads 10x V(D)J sequencing data (`filtered_contig_annotations.csv` files) into an AnnData object's `.uns['pyvdj']` slot, and returns the object. The `adata.uns['pyvdj']` slot is a dictionary which has the following elements:\n* `'df'`: a dataframe containing V(D)J data\n* `'obs_col'`: the `anndata.obs` columname of matching cellnames.\n* `'samples'`: a dictionary of filename:samplename\n\nIf an anndata object is not supplied, the function returns the dictionary.\n\nArguments:\n* `paths`: list of paths to filtered_contig_annotations.csv files.\n* `samples`: a dictionary of path:samplename.\n* `adata`: the AnnData object.\n* `add_obs`: whether to add some default .obs metadata columns.\n\n\n### Add annotations\n\nThe `adata.uns['pyvdj']['df']` is a pandas dataframe of the V(D)J data, with two additional columns that contain unique cell barcode and clonotype labels. These are generated using the user-supplied sample names: `cellbarcode + '_' + samplename` and `clonotype + '_' + samplename`.\n\nThese unique cell names are used to match the V(D)J cells to the AnnData `.X` cells, using `adata.obs['vdj_obs']`. The user has to prepare this column using the cell barcodes and the sample names.\n\nThe `add_obs` function can add the following annotations:\n* `'has_vdjdata'`: does the cell have V(D)J sequencing data?\n* `'clonotype'`: add clonotype name\n* `'is_clone'`: does it have a clone?\n* `'is_productive'`: are all chains productive?\n* `'chains'`: adds annotation (True, False, No_data) for each chain\n* `'genes'`: adds annotation (True, False, No_data) for each constant gene\n* `'v_genes'`: adds annotation (True, False, No_data) for each variable gene\n* `'j_genes'`: adds annotation (True, False, No_data) for each joining gene\n* `'clone_count'`: adds clone count annotation\n\n\n### Definitions\n\n* Clone: a cell whose TCR is identical to another cell, within the same individual (donor, organism).*\n* Clonotype: a set of all cells with the same TCR in the same individual (donor). A clonotype can have 1 or more cells.**\n* Clone count (of a clonotype): number of clones in the clonotype.\n* Public TCR (or CDR3) sequence: these are common and occur in multiple (or all) donors.\n* Private TCR (or CDR3) sequence: these are unique to one donor.\n* Condition-specific TCR (or CDR3) sequence: these occur in donors with a condition (disease, treated etc). These are private (unique) to the condition.\n\n_The above definitions are understood in the context of the sequenced cells._\n\n*As determined by Cell Ranger.\n**Note that Cell Ranger v2 does not assign a clonotype id to clonotypes with only 1 clone, but uses \u2018None\u2019. Cell Ranger v3 does assign a clonotype id to all cells.\n\n\n### CDR3 specificity\n\nWe can retrieve CDR3 amino acid sequences for given clonotypes using\n\n    pyvdj.get_spec(adata, clonotypes = [clonotype1_sampleA', 'clonotype3_sampleB'])\n\nwhich returns a dictionary. This can be used to find specificity in CDR3 databases, such as [VDJdb](http://vdjdb.cdr3.net) or [McPAS-TCR](http://friedmanlab.weizmann.ac.il/McPAS-TCR/).\n\n\n### Clonotype statistics\n\nWe can generate and [plot various statistics](tutorials/pyVDJ_tutorial.html) on clonotypes and diversity.\n\n    adata = pyvdj.stats(adata, meta)\n\nThis function adds a dictionary of statistics on the VDJ data (`adata.uns['pyvdj']['stats'][meta]`),\ngrouped by categories in the `adata.obs[meta]` column. Keys:\n\n* `'meta'` stores the adata.obs columname\n* `'cells'` count of cells, and cells with VDJ data per category\n* `'clonotype_counts'` number of different clonotypes per category\n* `'clonotype_dist'` clone count distribution\n* `'shared_cdr3'` dictionary of cdr3 - cell\n\n\n### Public and private CDR3 sequences\n\nWe can [find TCR-specificity shared between samples](tutorials/pyVDJ_tutorial.html), donors or any other annotation category.\n\n    adata = pyvdj.find_clones(adata, sample_dict)\n\nThis function returns AnnData with clonotype annotation, where clonotypes shared between 10x samples within donor (organism, individual) are combined to have the same clonotype ID.\n`'sample_dict'` is a dictionary of sample:donor, matching 10x samples (channels, as specified when the 10x VDJ data was loaded) to donors.\n\n\n### CDR3-similarity graph\n\nA set of prototype functions build CDR3-similarity graphs using [Levenshtein distances](https://en.wikipedia.org/wiki/Levenshtein_distance). The nodes are the CDR3 sequences, and edges connect nodes with Levenshtein distance of 1.\n\n    cdr3_dict = pyvdj.get_cdr3(adata)  # get CDR3s for each sample\n    dist = pyvdj.get_dist(cdr3_dict, sample)  # calculate distances (adjacency matrix)\n    g = pyvdj.graph_cdr3(dist)  # returns an igraph graph object.\n\nThis requires the python-Levenshtein and the igraph-python packages.\n\n\n### Dependencies\n\nThe package was originally developed for data made with Cell Ranger v2.1.1 (Chemistry: Single Cell V(D)J; V(D)J reference: GRCh38-alts-ensembl) and has been tested to work with Cell Ranger v3.1.0 data, with the following Python (v3.6.9) package versions:\n\n    pandas 0.25.1\n    anndata 0.6.21\n    scanpy 1.4.3\n\n\n\n",
        "description_content_type": "text/markdown",
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/veghp/pyVDJ",
        "keywords": "",
        "license": "",
        "maintainer": "",
        "maintainer_email": "",
        "name": "pyvdj",
        "package_url": "https://pypi.org/project/pyvdj/",
        "platform": "",
        "project_url": "https://pypi.org/project/pyvdj/",
        "project_urls": {
            "Homepage": "https://github.com/veghp/pyVDJ"
        },
        "release_url": "https://pypi.org/project/pyvdj/0.1.1/",
        "requires_dist": [
            "pandas",
            "anndata"
        ],
        "requires_python": "",
        "summary": "V(D)J sequencing data analysis",
        "version": "0.1.1"
    },
    "last_serial": 6004656,
    "releases": {
        "0.1.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "48ca025b163d63b213dc8a423b5577ce",
                    "sha256": "3c09214d862e25f8adc647d10e4a8f7a94864dd054ef760f0b7cc9bd52333d13"
                },
                "downloads": -1,
                "filename": "pyvdj-0.1.0-py3-none-any.whl",
                "has_sig": false,
                "md5_digest": "48ca025b163d63b213dc8a423b5577ce",
                "packagetype": "bdist_wheel",
                "python_version": "py3",
                "requires_python": null,
                "size": 20129,
                "upload_time": "2019-09-01T18:00:04",
                "url": "https://files.pythonhosted.org/packages/7d/48/d3c352aff023528278452bd130adc3e82b3201ac8bf2081a06e81560c97d/pyvdj-0.1.0-py3-none-any.whl"
            }
        ],
        "0.1.1": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "60683598ccac8a9e5ece052bfb4ab798",
                    "sha256": "080e5998f8ccd61d3f952bcb0ad151fbdf912becdec6f6f94d9a8eb667976900"
                },
                "downloads": -1,
                "filename": "pyvdj-0.1.1-py3-none-any.whl",
                "has_sig": false,
                "md5_digest": "60683598ccac8a9e5ece052bfb4ab798",
                "packagetype": "bdist_wheel",
                "python_version": "py3",
                "requires_python": null,
                "size": 27848,
                "upload_time": "2019-10-20T21:02:14",
                "url": "https://files.pythonhosted.org/packages/fd/87/5df94cefb66dc7fb27c1c53598f4c1bc87dee4e1afc16ce12c354b98c5ca/pyvdj-0.1.1-py3-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "776185270990ae1ff012313db128e300",
                    "sha256": "a6f6eb53a603dbf56edf18ef3b4cbcd671282ffbb3946609549fc3fdba35f8d9"
                },
                "downloads": -1,
                "filename": "pyvdj-0.1.1.tar.gz",
                "has_sig": false,
                "md5_digest": "776185270990ae1ff012313db128e300",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 10897,
                "upload_time": "2019-10-20T21:02:16",
                "url": "https://files.pythonhosted.org/packages/62/18/94657e57491935dd945b13814a07e9957958534c5007626c5146d937474b/pyvdj-0.1.1.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "60683598ccac8a9e5ece052bfb4ab798",
                "sha256": "080e5998f8ccd61d3f952bcb0ad151fbdf912becdec6f6f94d9a8eb667976900"
            },
            "downloads": -1,
            "filename": "pyvdj-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "60683598ccac8a9e5ece052bfb4ab798",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 27848,
            "upload_time": "2019-10-20T21:02:14",
            "url": "https://files.pythonhosted.org/packages/fd/87/5df94cefb66dc7fb27c1c53598f4c1bc87dee4e1afc16ce12c354b98c5ca/pyvdj-0.1.1-py3-none-any.whl"
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "776185270990ae1ff012313db128e300",
                "sha256": "a6f6eb53a603dbf56edf18ef3b4cbcd671282ffbb3946609549fc3fdba35f8d9"
            },
            "downloads": -1,
            "filename": "pyvdj-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "776185270990ae1ff012313db128e300",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 10897,
            "upload_time": "2019-10-20T21:02:16",
            "url": "https://files.pythonhosted.org/packages/62/18/94657e57491935dd945b13814a07e9957958534c5007626c5146d937474b/pyvdj-0.1.1.tar.gz"
        }
    ]
}