{ "info": { "author": "Kale Kundert", "author_email": "kale.kundert@ucsf.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Topic :: Scientific/Engineering :: Bio-Informatics", "Topic :: Software Development :: Libraries" ], "description": "******\nFCMcmp\n******\n\nThe goal of FCMcmp is to make it easy to analyze flow cytometry data from \npython. The first challenge in analyzing flow cytometry data is working out \nwhich wells should be compared with each other. For example, which wells are \ncontrols, which wells are replicates of each other, which wells contain the \nconditions you're interested in, etc. This isn't usually too complicated for \nindividual experiments, but if you want to write analysis scripts that you can \nuse on all of your data, managing this metadata becomes a significant problem.\n\nFCMcmp addresses this problem by defining a simple YAML file format that can \nassociate wells and plates in pretty much any way you want. When you ask \nFCMcmp to parse these files, it returns a list- and dictionary-based data \nstructure that contains these associations, plus it automatically parses the \nraw FCS data into pandas data frames.\n\n.. image:: https://img.shields.io/pypi/v/fcmcmp.svg\n :target: https://pypi.python.org/pypi/fcmcmp\n\n.. image:: https://img.shields.io/pypi/pyversions/fcmcmp.svg\n :target: https://pypi.python.org/pypi/fcmcmp\n\n.. image:: https://img.shields.io/travis/kalekundert/fcmcmp.svg\n :target: https://travis-ci.org/kalekundert/fcmcmp\n\n.. image:: https://img.shields.io/coveralls/kalekundert/fcmcmp.svg\n :target: https://coveralls.io/github/kalekundert/fcmcmp?branch=master\n\nInstallation\n============\n``fcmcmp`` is available on PyPI::\n\n pip3 install fcmcmp\n\nOnly python>=3.4 is supported.\n\nQuick Start\n===========\nI'll demonstrate using data you might export from running a 96-well plate on a \nBD LSRII, but the library should be pretty capable of handling any directory \nhierarchy::\n\n my_plate/\n 96 Well - U bottom/\n Specimen_001_A1_A01_001.fcs\n Specimen_002_A2_A02_002.fcs\n Specimen_003_A3_A03_003.fcs\n ...\n\nLoading the data\n~~~~~~~~~~~~~~~~\nFirst, we need to make a YAML metadata file describing the relationships \nbetween the wells on this plate::\n\n # my_plate.yml\n label: vaxadrin\n wells:\n without: [A1,A2,A3]\n with: [B1,B2,B3]\n ---\n label: vaxamaxx\n wells:\n without: [A1,A2,A3]\n with: [C1,C2,C3]\n\nIn this example, the name of the plate directory is inferred from the name of \nthe YAML file. You can also explicitly specify the path to the plate directory \nby adding the following header before the ``label``/``wells`` sections::\n\n plate: path/to/my_plate\n ---\n\nYou can even reference wells from multiple plates in one file::\n\n plates:\n foo: path/to/foo_plate\n bar: path/to/bar_plate\n ---\n label: vaxascab\n wells:\n without: [foo/A1, foo/A2, foo/A3]\n with: [bar/A1, bar/A2, bar/A3]\n\nNote that the ``label`` and ``wells`` fields are required, but you can add, \nremove, or rename any other field::\n\n label: vaxa-smacks\n channel: FITC-A\n gating: 60%\n wells:\n 0mM: [A1,A2,A3]\n 1mM: [B1,B2,B3]\n 5mM: [C1,C2,C3]\n \nOnce you have a YAML metadata file, you can use ``fcmcmp`` to read it::\n\n >>> import fcmcmp, pprint\n >>> experiments = fcmcmp.load_experiments('my_plate.yml')\n >>> pprint.pprint(experiments)\n [{'label': 'vaxadrin',\n 'wells': {'with': [Well(B1), Well(B2), Well(B3)],\n 'without': [Well(A1), Well(A2), Well(A3)]}},\n {'label': 'vaxamaxx',\n 'wells': {'with': [Well(C1), Well(C2), Well(C3)],\n 'without': [Well(A1), Well(A2), Well(A3)]}}]\n\nThe data structure returned is little more than a list of dictionaries, which \nshould be easy to work with in pretty much any context. The wells are \nrepresented by ``Well`` objects, which have only three attributes:\n\n- ``Well.label``: The name used to reference the well in the YAML file. \n- ``Well.data``: A ``pandas.DataFrame`` containing all the data associated \n with the well, parsed using the excellent ``fcsparse`` library.\n- ``Well.meta``: A dictionary containing any metadata associated with the \n well, also parsed using ``fcsparse``.\n\nNote that if you reference the same well more than once (e.g. for controls that \napply to all of your experiments), each reference is parsed separately and gets \nits own copy of all the data.\n\nWorking with the data\n~~~~~~~~~~~~~~~~~~~~~\nOnce the experiments are loaded into python as described above, ``fcmcmp`` \nprovides a couple ways to interact with them. The first is to apply one or \nmore of a handful of pre-defined \"processing steps\"::\n\n >>> ch = 'FITC-A', 'PE-Texas Red-A'\n >>> p1 = fcmcmp.GateEarlyEvents(throwaways_secs=2)\n >>> p1(experiments)\n >>> p2 = fcmcmp.GateSmallCells(threshold=40, save_size_col=True)\n >>> p2(experiments)\n >>> p3 = fcmcmp.GateNonPositiveEvents(ch)\n >>> p3(experiments)\n >>> p4 = fcmcmp.LogTransformation(ch)\n >>> p4(experiments)\n >>> p5 = fcmcmp.KeepRelevantChannels(ch)\n >>> p5(experiments)\n\nIn this example:\n\n- ``GateEarlyEvents`` discards the first few seconds of data, which is useful \n when you're using a high-throughput sampler and you suspect that cells from \n the previous well are being recorded at the beginning of each well.\n- ``GateSmallCells`` combines the ``FSC-A`` and ``SSC-A`` channels to estimate \n how the size of each event, then discards any events below the given \n percentile (40% in this example).\n- ``GateNonPositiveEvents`` discards negative data on the specified channels. \n I have to admit that I don't understand how \"fluorescence peak area\" data can \n be negative, but in any case this can be important if you want to work with \n the logarithm of your data, because of course you can't take the logarithm of \n negative data.\n- ``LogTransform`` takes the logarithm of the data in the specified channels. \n This is a very standard processing step for fluorescent channels.\n- ``KeepRelevantChannels`` discards all the data for any channels that aren't \n explicitly listed. This is mostly useful for when you're printing out data \n to the terminal and don't want to be distracted by channels you collected but \n aren't interested in at the moment.\n\nInstead of calling each processing step individually, you can also use the \n``run_all_processing_steps()`` function to call them all at once. If you do \nthis, you don't even need to make a variable for each step::\n\n >>> fcmcmp.GateEarlyEvents(throwaways_secs=2)\n >>> fcmcmp.GateSmallCells(threshold=40, save_size_col=True)\n >>> fcmcmp.GateNonPositiveEvents(ch)\n >>> fcmcmp.LogTransformation(ch)\n >>> fcmcmp.KeepRelevantChannels(ch)\n >>> fcmcmp.run_all_processing_steps()\n\nYou can also write your own processing steps by inheriting from either \n``ProcessingStep`` or ``GatingStep`` and reimplementing the proper methods. \n``ProcessingStep`` is for general transformations and has two virtual methods: \n``process_experiment()`` and ``process_well()``. The former is called once for \neach experiment and should transform that experiment in place. The latter is \ncalled once for each well and can either modify the well in place (and return \nNone) or return the processed data, which will overwrite the original data.\n\n``GatingStep`` is specifically for transformations regarding which data points \nto keep and which to throw out. It is itself a ``ProcessingStep``, but it has \na different virtual method(): ``gate()``. This method is called on each well \nand should return a boolean numpy array. Those indices that are ``False`` will \nbe thrown out, those that are ``True`` will be kept.\n\nThe second way to interact with the experiments is to use the ``yield_wells()`` \nand ``yield_unique_wells()`` functions. These are both `generators`__ which \niterate through all of your experiments and yield each well one at a time. The \npurpose of these functions is to make the nested ``experiments`` data structure \nseem more like a flat list::\n\n >>> for experiment, condition, well in fcmcmp.yield_wells(experiments):\n >>> print(experiment, condition, well)\n\nBoth functions take an optional keyword argument. If given, only wells with a \nmatching experiment label, condition, or well label will be returned. The only \ndifference between ``yield_wells()`` and ``yield_unique_wells()`` is that the \nformer won't yield the same well twice. This is important because the same \nwell can certainly be included in many different experiments.\n\n__ https://jeffknupp.com/blog/2013/04/07/improve-your-python-yield-and-generators-explained/\n\nBugs and new features\n=====================\nUse the GitHub issue tracker if you find any bugs or would like to see any new \nfeatures. I'm also very open to pull requests.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/kalekundert/fcmcmp", "keywords": "fcmcmp", "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "fcmcmp", "package_url": "https://pypi.org/project/fcmcmp/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/fcmcmp/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/kalekundert/fcmcmp" }, "release_url": "https://pypi.org/project/fcmcmp/0.1.0/", "requires_dist": null, "requires_python": null, "summary": "A lightweight, flexible, and modern framework for annotating flow cytometry data.", "version": "0.1.0" }, "last_serial": 2231498, "releases": { "0.0.0": [], "0.1.0": [ { "comment_text": "", "digests": { "md5": "c09cc072a87f203f484ae6f34c871d5d", "sha256": "2566f4b23aa8b911d46ba445a901190450144c36f0066157a72bb373b760ea8b" }, "downloads": -1, "filename": "fcmcmp-0.1.0.tar.gz", "has_sig": false, "md5_digest": "c09cc072a87f203f484ae6f34c871d5d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 112702, "upload_time": "2016-07-19T23:08:15", "url": "https://files.pythonhosted.org/packages/a2/cf/5c41932440e833c305719909466e0d3ebb1d208b9edba539b735060ac312/fcmcmp-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c09cc072a87f203f484ae6f34c871d5d", "sha256": "2566f4b23aa8b911d46ba445a901190450144c36f0066157a72bb373b760ea8b" }, "downloads": -1, "filename": "fcmcmp-0.1.0.tar.gz", "has_sig": false, "md5_digest": "c09cc072a87f203f484ae6f34c871d5d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 112702, "upload_time": "2016-07-19T23:08:15", "url": "https://files.pythonhosted.org/packages/a2/cf/5c41932440e833c305719909466e0d3ebb1d208b9edba539b735060ac312/fcmcmp-0.1.0.tar.gz" } ] }