{ "info": { "author": "Tariq A. Hassan", "author_email": "laterallattice@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Natural Language :: English", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6" ], "description": "BioVida is a library designed to make it easy to gain access to existing\ndata sets of biomedical images as well as build brand new, custom-made\nones.\n\nIt is hoped that by automating the tedious data munging that is\ntypically involved in this process, more people will become interested\nin applying machine learning to biomedical images and, in turn,\nadvancing insights into human disease.\n\nIn a nod to recursion, BioVida tries to accomplish some of this\nautomation with machine learning itself, using tools like convolutional\nneural networks.\n\nInstallation\n------------\n\nPython Package Index:\n\n.. code:: bash\n\n $ pip install biovida\n\nLatest Build:\n\n.. code:: bash\n\n $ pip install git+git://github.com/TariqAHassan/BioVida@master\n\nRequires Python 3.4+\n\nImages: Stable\n--------------\n\nIn just a few lines of code, you can gain access to biomedical databases\nwhich store tens of millions of images.\n\n*Please note that you are bound to adhere to the copyright and other\nusage restrictions under which this data is provided to you by its\ncreators.*\n\nOpen-i BioMedical Image Search Engine\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. code:: python\n\n\n # 1. Import the Interface for the NIH's Open-i API.\n from biovida.images import OpeniInterface\n\n # 2. Create an Instance of the Tool\n opi = OpeniInterface()\n\n # 3. Perform a search for x-rays and cts of lung cancer\n opi.search(query='lung cancer', image_type=['x_ray', 'ct']) # Results Found: 9,220.\n\n # 4. Pull the data\n search_df = opi.pull()\n\nCancer Imaging Archive\n^^^^^^^^^^^^^^^^^^^^^^\n\n.. code:: python\n\n # 1. Import the interface for the Cancer Imaging Archive\n from biovida.images import CancerImageInterface\n\n # 2. Create an Instance of the Tool\n cii = CancerImageInterface(YOUR_API_KEY_HERE)\n\n # 3. Perform a search\n cii.search(cancer_type='esophageal')\n\n # 4. Pull the data\n cdf = cii.pull()\n\nBoth ``CancerImageInterface`` and ``OpeniInterface`` cache images for\nlater use. When data is 'pulled', a ``records_db`` is generated, which\nis a dataframe of all text data associated with the images. They are\nprovided as class attributes, e.g., ``cii.records_db``. While\n``records_db`` only stores data from the most recent data pull,\n``cache_records_db`` dataframes provides an account of all image data\ncurrently cached.\n\nSplitting Images\n^^^^^^^^^^^^^^^^\n\nBioVida can divide cached images into train/validation/test.\n\n.. code:: python\n\n from biovida.images import image_divvy\n\n # 1. Define a rule to 'divvy' up images in the cache.\n def my_divvy_rule(row):\n if row['image_modality_major'] == 'x_ray':\n return 'x_ray'\n elif row['image_modality_major'] == 'ct':\n return 'ct'\n\n # 2. Define Proportions and Divide Data\n tt = image_divvy(opi, my_divvy_rule, action='ndarray', train_val_test_dict={'train': 0.8, 'test': 0.2})\n\n # 3. The resultant ndarrays can be unpacked as follows:\n train_ct, train_xray = tt['train']['ct'], tt['train']['x_ray']\n test_ct, test_xray = tt['test']['ct'], tt['test']['x_ray']\n\nImages: Experimental\n--------------------\n\nAutomated Image Data Cleaning\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nUnfortunately, the data pulled from Open-i above is likely to contain a\nlarge number of images unrelated to the search query and/or are\nunsuitable for machine learning.\n\nThe *experimental* ``OpeniImageProcessing`` class can be used to\ncompletely automate this data cleaning process, which is partly powered\nby a Convolutional Neural Network.\n\n.. code:: python\n\n # 1. Import Image Processing Tools\n from biovida.images import OpeniImageProcessing\n\n # 2. Instantiate the Tool using the OpeniInterface Instance\n ip = OpeniImageProcessing(opi)\n \n # 3. Analyze the Images\n idf = ip.auto()\n\n # 4. Use the Analysis to Clean Images\n ip.clean_image_dataframe()\n\nIt is easy to split these images into training and test sets.\n\n.. code:: python\n\n from biovida.images import image_divvy\n\n def my_divvy_rule(row):\n if row['image_modality_major'] == 'x_ray':\n return 'x_ray'\n elif row['image_modality_major'] == 'ct':\n return 'ct'\n\n tt = image_divvy(ip, my_divvy_rule, action='ndarray', train_val_test_dict={'train': 0.8, 'test': 0.2})\n # These ndarrays can be unpack as shown above.\n\nGenomic Data\n------------\n\nWhile primarily focused on images, BioVida also provides a simple\ninterface for obtaining related information, such genomic data.\n\n.. code:: python\n\n # 1. Import the Interface for DisGeNET.org\n from biovida.genomics import DisgenetInterface\n\n # 2. Create an Instance of the Tool\n dna = DisgenetInterface()\n\n # 3. Pull a Database\n gdf = dna.pull('curated')\n\nDiagnostic Data\n---------------\n\nBioVida also makes it easy to obtain diagnostics data.\n\nInformation on disease definitions, families and synonyms:\n\n.. code:: python\n\n # 1. Import the Interface for DiseaseOntology.org\n from biovida.diagnostics import DiseaseOntInterface\n\n # 2. Create an Instance of the Tool\n doi = DiseaseOntInterface()\n\n # 3. Pull the Database\n ddf = doi.pull()\n\nInformation on symptoms associated with diseases:\n\n.. code:: python\n\n # 1. Import the Interface for Disease-Symptoms Information\n from biovida.diagnostics import DiseaseSymptomsInterface\n\n # 2. Create an Instance of the Tool\n dsi = DiseaseSymptomsInterface()\n\n # 3. Pull the Database\n dsdf = dsi.pull()\n\nUnifying Information\n--------------------\n\nThe ``unify_against_images`` function integrates image data information\nagainst ``DisgenetInterface``, ``DiseaseOntInterface`` and\n``DiseaseSymptomsInterface``.\n\n.. code:: python\n\n from biovida.unification import unify_against_images\n\n unify_against_images(interfaces=[cii, opi], db_to_extract='cache_records_db')\n\nLeft side of DataFrame: Image Data Alone\n\n+----+----------+--------+-----------+------------------------+-----+-------+------------+-----+\n| | article\\ | image\\ | image\\_ca | modality\\_best\\_guess | age | sex | disease | ... |\n| | _type | _id | ption | | | | | |\n+====+==========+========+===========+========================+=====+=======+============+=====+\n| 0 | case\\_re | 1 | ... | Magnetic Resonance | 73 | male | fibroma | ... |\n| | port | | | Imaging (MRI) | | | | |\n+----+----------+--------+-----------+------------------------+-----+-------+------------+-----+\n| 1 | case\\_re | 2 | ... | Magnetic Resonance | 73 | male | fibroma | ... |\n| | port | | | Imaging (MRI) | | | | |\n+----+----------+--------+-----------+------------------------+-----+-------+------------+-----+\n| 2 | case\\_re | 1 | ... | Computed Tomography | 45 | femal | bile duct | ... |\n| | port | | | (CT): angiography | | e | cancer | |\n+----+----------+--------+-----------+------------------------+-----+-------+------------+-----+\n\nRight side of DataFrame: Added Information\n\n+----------------+-------------+------------+---------------+------------+--------------+\n| disease\\_famil | disease\\_sy | disease\\_d | known\\_associ | mentioned\\ | known\\_assoc |\n| y | nonym | efinition | ated\\_symptom | _symptoms | iated\\_genes |\n| | | | s | | |\n+================+=============+============+===============+============+==============+\n| (cell type | nan | nan | (abdominal | (pain,) | ((ANTXR2, |\n| benign | | | pain,...) | | 0.12), ...) |\n| neoplasm,) | | | | | |\n+----------------+-------------+------------+---------------+------------+--------------+\n| (cell type | nan | nan | (abdominal | (pain,) | ((ANTXR2, |\n| benign | | | pain,...) | | 0.12), ...) |\n| neoplasm,) | | | | | |\n+----------------+-------------+------------+---------------+------------+--------------+\n| (biliary tract | (bile duct | A biliary | (abdominal | (colic,) | nan |\n| cancer,) | tumor,...) | tract... | obesity,..) | | |\n+----------------+-------------+------------+---------------+------------+--------------+\n\n--------------\n\nDocumentation\n-------------\n\n- `Getting\n Started `__\n- `Tutorials `__\n- `API\n Documentation `__\n\nContributing\n------------\n\nFor more information on how to contribute, see the\n`contributing `__\ndocument.\n\nBug reports and feature requests are always welcome and can be provided\nthrough the `Issues `__\npage.\n\nResources\n---------\n\nThe\n`resources `__\ndocument provides an account of all data sources and scholarly work used\nby BioVida.", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/TariqAHassan/BioVida/archive/v0.1.1.tar.gz", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/TariqAHassan/BioVida.git", "keywords": "machine-learning,biomedical-informatics,data-science,bioinformatics,imaging-informatics", "license": "BSD", "maintainer": null, "maintainer_email": null, "name": "biovida", "package_url": "https://pypi.org/project/biovida/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/biovida/", "project_urls": { "Download": "https://github.com/TariqAHassan/BioVida/archive/v0.1.1.tar.gz", "Homepage": "https://github.com/TariqAHassan/BioVida.git" }, "release_url": "https://pypi.org/project/biovida/0.1.1/", "requires_dist": null, "requires_python": null, "summary": "Automated BioMedical Information Curation for Machine Learning Applications.", "version": "0.1.1" }, "last_serial": 2770858, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "77e6a655aa5b5811d2d55fba41b71113", "sha256": "9e09fbda396953eb31b316223545d37d38b7a3eb0a128d17658b19c8ca7aa13d" }, "downloads": -1, "filename": "biovida-0.1.1.tar.gz", "has_sig": false, "md5_digest": "77e6a655aa5b5811d2d55fba41b71113", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 125851, "upload_time": "2017-04-10T21:00:57", "url": "https://files.pythonhosted.org/packages/12/26/f772446462de5b631e0890a99614824cb4d0f30a9d7910a8ab9b769715fb/biovida-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "77e6a655aa5b5811d2d55fba41b71113", "sha256": "9e09fbda396953eb31b316223545d37d38b7a3eb0a128d17658b19c8ca7aa13d" }, "downloads": -1, "filename": "biovida-0.1.1.tar.gz", "has_sig": false, "md5_digest": "77e6a655aa5b5811d2d55fba41b71113", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 125851, "upload_time": "2017-04-10T21:00:57", "url": "https://files.pythonhosted.org/packages/12/26/f772446462de5b631e0890a99614824cb4d0f30a9d7910a8ab9b769715fb/biovida-0.1.1.tar.gz" } ] }