{ "info": { "author": "Sofya Lipnitskaya", "author_email": "lipnitskaya.sofya@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "Intended Audience :: Education", "Intended Audience :: Other Audience", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Education :: Testing", "Topic :: Office/Business", "Topic :: Other/Nonlisted Topic", "Topic :: Scientific/Engineering", "Topic :: Software Development :: Libraries", "Typing :: Typed" ], "description": "# Experiment Runner (exp-runner)\n\n**exp-runner** is a simple and extensible framework for data analysis and machine learning experiments in Python.\n\n#### Structure\nThe framework includes following step:\n1. _Data loading_\n2. _Data transformation_\n3. _Model training and testing_\n4. _Performance evaluation_\n5. _Results saving_\n\n#### Main features\n* _Generability_: Variaty of models and methods are supported and it can be used in a number of tasks \n(such as preprocessing, dimensionality reduction, classification, \nregression, clustering, statistical tests, etc.)\n* _Flexability_: Steps can be easily skipped and/or included\n* _Dynamic loading_: Automatically imports modules during runtime - no additional lines are needed\n\n### Installation\n```bash\npip install exp-runner\n```\n\n### Usage \n\nLet's say, your project has the following structure:\n\n MyAwesomeProject/\n main.py\n my_custom_module.py\n\n data/\n data_00.npy\n data_01.npy\n ...\n data_NN.npy\n\n protocols/\n experiment_config.json\n\n results/\n\n\n#### Just give me a code!\nYou just need to describe your **framework** in the [JSON](https://json.org) configuration file:\n\n##### experiment_config.json\n```JSON\n{\n \"Setup\": {\n \"description\": \"You can add detailed description of the experiment\",\n \"random_seed\": 42\n },\n \"Dataset\": {\n \"class\": \"my_custom_module.MyAwesomeDataLoader\",\n \"args\": {\"path_to_data\": \"data/*.npy\"}\n },\n \"Transforms\": [\n {\n \"class\": \"sklearn.decomposition.PCA\",\n \"args\": {\"n_components\": 3, \"whiten\": true}\n }\n ],\n \"Model\": {\n \"class\": \"sklearn.cluster.KMeans\",\n \"args\": {\"n_clusters\": 3, \"n_jobs\": -1, \"verbose\": 0}\n },\n \"Metric\": {\n \"class\": \"my_custom_module.SklearnMetricWrapper\",\n \"args\": {\"metric\": \"normalized_mutual_info_score\"}\n },\n \"Saver\": {\n \"class\": \"my_custom_module.CSVReport\",\n \"args\": {\"path_to_output\": \"results/evaluation_results.csv\", \"sep\": \";\"}\n }\n}\n```\n
Here are aforementioned classes (click):\n

\n\n##### my_custom_module.py\n\n```python\nimport os\nimport glob\nimport numpy as np\nimport sklearn.metrics\n\nfrom exp_runner import Dataset, Metric, Saver\n\nfrom collections import defaultdict\nfrom typing import Any, Dict, List, Union, NoReturn, Iterable, Callable\n\nfrom sklearn.model_selection import StratifiedShuffleSplit\n\n\nclass MyAwesomeDataLoader(Dataset):\n\n def __init__(self, path_to_data: str, test_size: float = 0.1, training: bool = True):\n\n super(MyAwesomeDataLoader, self).__init__()\n\n self._samples = dict()\n self._labels = dict()\n self._splits = defaultdict(dict)\n\n paths_to_data = glob.glob(path_to_data)\n\n for path in paths_to_data:\n fname = os.path.basename(path)\n\n data = np.load(path)\n X = data[:, :-1] \n y = data[:, -1]\n\n indices_train, indices_test = next(StratifiedShuffleSplit(\n test_size=test_size\n ).split(X, y))\n\n self._samples[fname] = X\n self._labels[fname] = y\n self._splits[fname]['train'] = indices_train\n self._splits[fname]['test'] = indices_test\n\n self._indices = list(self._samples.keys())\n\n self._training = training\n\n def __getitem__(self, index: int) -> Dict[str, Dict[str, Union[str, np.ndarray]]]:\n if not (0 <= index < len(self._indices)):\n raise IndexError\n\n fname = self._indices[index]\n\n item = {\n 'X': self._samples[fname][self._splits[fname]['train'] if self.training else self._splits[fname]['test']],\n 'y': self._labels[fname][self._splits[fname]['train'] if self.training else self._splits[fname]['test']]\n }\n\n item['desc'] = 'it is possible to add description for each data sample'\n\n return {'filename': fname, 'item': item}\n\n def __len__(self) -> int:\n return len(self._indices)\n\n @property\n def training(self):\n return self._training\n\n\nclass SklearnMetricWrapper(Metric):\n\n def __init__(self, metric: str):\n super(SklearnMetricWrapper, self).__init__()\n\n metric = getattr(sklearn.metrics, metric)\n self._metric: Callable[[Iterable[Union[float, int]], Iterable[Union[float, int]]], float] = metric\n\n def __call__(self, y_true: Iterable[Union[float, int]], y_pred: Iterable[Union[float, int]]) -> float:\n return self._metric(y_true, y_pred)\n\n\nclass CSVReport(Saver):\n\n def __init__(self, path_to_output: str, sep: str = ';', append: bool = True):\n super(CSVReport, self).__init__()\n\n self.path_to_output = path_to_output\n self.sep = sep\n self.mode = 'a+' if append else 'w+'\n\n def save(self, report: List[Dict[str, Any]]) -> NoReturn:\n with open(self.path_to_output, self.mode) as csv:\n for entry in report:\n line = self.sep.join([\n entry['filename'],\n entry['desc'],\n entry['perf']\n ]) + '\\n'\n csv.write(line)\n```\n\n

\n
\n\nFinally, to run your experiment type in your terminal:\n```bash\ncd /path/to/MyAwesomeProject\npython main.py --config protocols/experiment_config.json\n```\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/slipnitskaya/exp-runner.git", "keywords": "pipeline framework modelling model training testing classification regression clustering,pipeline framework data analysis", "license": "", "maintainer": "", "maintainer_email": "", "name": "exp-runner", "package_url": "https://pypi.org/project/exp-runner/", "platform": "", "project_url": "https://pypi.org/project/exp-runner/", "project_urls": { "Homepage": "https://github.com/slipnitskaya/exp-runner.git" }, "release_url": "https://pypi.org/project/exp-runner/0.1.0b2/", "requires_dist": [ "tqdm (>=4.28.1)", "numpy (>=1.15.4)", "scikit-learn (>=0.20.1)" ], "requires_python": ">=3.6", "summary": "Framework for data analysis and machine learning experiments", "version": "0.1.0b2" }, "last_serial": 5229623, "releases": { "0.1.0b1": [ { "comment_text": "", "digests": { "md5": "2ca3883f6edcb11650a9f850c0815bb6", "sha256": "1f5ed8db3b1f07d341ab0c776634473b4fd66755c868b2bda101f1e5d8025c52" }, "downloads": -1, "filename": "exp_runner-0.1.0b1-py3-none-any.whl", "has_sig": false, "md5_digest": "2ca3883f6edcb11650a9f850c0815bb6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 6603, "upload_time": "2019-03-25T19:51:35", "url": "https://files.pythonhosted.org/packages/1a/bf/f14372c4d8c2bd97c74d1ee9f6d0cb7134218e017a1130bb78bc69e1ddf0/exp_runner-0.1.0b1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6b4a80ee0d161cc5b1be778e22ed5cc5", "sha256": "6ab14d9a791879b9ab200aee5fbe1c850a8d10850a79dd5f6a0240255a74d269" }, "downloads": -1, "filename": "exp-runner-0.1.0b1.tar.gz", "has_sig": false, "md5_digest": "6b4a80ee0d161cc5b1be778e22ed5cc5", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 5498, "upload_time": "2019-03-25T19:51:42", "url": "https://files.pythonhosted.org/packages/1f/25/72585f764dc7da4451903160fd4dcf78158fda90492d023c2e784b485cf8/exp-runner-0.1.0b1.tar.gz" } ], "0.1.0b2": [ { "comment_text": "", "digests": { "md5": "5cbca2ebb141bb49036031448c596be6", "sha256": "ed7118cada6a5f2dcc74ac84485bdddc0fc988514a18f7fac905da42e822b202" }, "downloads": -1, "filename": "exp_runner-0.1.0b2-py3-none-any.whl", "has_sig": false, "md5_digest": "5cbca2ebb141bb49036031448c596be6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 6637, "upload_time": "2019-05-05T20:49:45", "url": "https://files.pythonhosted.org/packages/3a/5c/a9af5166e6c2dbf84e945b248409494a0f33e59bb4ac0d500203ccb38b1d/exp_runner-0.1.0b2-py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5cbca2ebb141bb49036031448c596be6", "sha256": "ed7118cada6a5f2dcc74ac84485bdddc0fc988514a18f7fac905da42e822b202" }, "downloads": -1, "filename": "exp_runner-0.1.0b2-py3-none-any.whl", "has_sig": false, "md5_digest": "5cbca2ebb141bb49036031448c596be6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 6637, "upload_time": "2019-05-05T20:49:45", "url": "https://files.pythonhosted.org/packages/3a/5c/a9af5166e6c2dbf84e945b248409494a0f33e59bb4ac0d500203ccb38b1d/exp_runner-0.1.0b2-py3-none-any.whl" } ] }