{ "info": { "author": "Sofian Medbouhi", "author_email": "sof.m.sk@free.fr", "bugtrack_url": null, "classifiers": [], "description": "Data visualization\n==================\n\nThis is a collection of tools to represent and navigate through high-dimensional data. The t-SNE algorithm is used to construct the 2D space, so some choices and features of the visualization may reflect that. The module should be agnostic of the data provided.\n\nUsage\n-----\n\n### How to run?\n\nSimply run\n```sh\nvizuka\n\n# Similar to:\npython3 data_viz/launch_viz.py\n```\n\nIt assumes you already have your 2D data; if not, you can ask for a t-SNE+PCA reduction:\n```sh\nvizuka --reduce\n```\n\nIt searches for the data in its \\_\\_package\\_\\_/data/ directory, but you can force your own location with the __--path__ argument.\n\n### How to use?\nNavigate inside the 2D space and look at the data, selecting it in the main window (the big one). Only this one is interactive. Data is grouped by cluster; you can select clusters individually (left click).\n\nThe main window represents all the data in 2D space. 
Blue points are correctly predicted transactions, red points are incorrectly predicted ones, and green points belong to the special class (by default, label 0).\n\nBelow are three subplots:\n* a summary of the data inside the selected buckets (see navigation)\n* a heatmap of the red/blue/green representation\n* a heatmap of the cross-entropy between each bucket's empirical distribution and the global empirical distribution\n\nData viz navigation:\n* left click selects a bucket of data\n* right click resets all in-memory buckets\n\nOther options:\n* filter by predictions or by real class\n* detect mouse event: if unchecked, clusters will not be selected on click (useful for zooming)\n* clusterize with an algorithm: Dummy is a simple grid, KMeans is the recommended choice, DBSCAN is experimental\n* export x: export the raw inputs you selected to an output.csv file\n* cluster borders: draw borders between clusters based on the Bhattacharyya similarity measure, or draw all borders\n* force the number of clusters (essentially for KMeans)\n* choose a different set of predictions to display\n\nWhat does it need to be executed?\n---------------------------------\n\ndata_viz needs the following files:\n* pre-processed transactions\n* predictions:\n * predictor (currently only Keras NNs are supported), the algorithm that consumes the pre-processed transactions;\n it should have been trained on data ordered the same way as the raw transactions array **or**\n * its predictions\n* 2D-projections: (optional)\n * a t-SNE (or another dimension-reduction nD-to-2D algorithm) output representing pre-processed data in a 2D-space **or**\n * parameters for t-SNE (optional, default ones are provided)\n* raw transactions (optional) which will be used to display additional human-understandable info.\n\n\nFile structures\n---------------\n\nYour files should be structured this way\n(samples are provided in the data folder, but you still need to find the raw/processed data yourself, as it is too heavy to be put here):\n\n* pre-processed transactions:\n * type: npz\n * internal 
structure:\n * entry x: pre-processed inputs\n * entry y_$(OUTPUT_NAME): pre-processed label to be predicted\n * (optional) entry $(OUTPUT_NAME)_encoder: humanToMachine label mapping\n * name: $(INPUT_FILE_BASE_NAME)_x_y$(VERSION).npz\n * path: $(DATA_PATH)\n\n* predictions:\n * type: npz\n * internal structure:\n * entry pred: the labels predicted\n * name: predictions$(VERSION).npz\n * path: $(MODEL_PATH)\n\n* 2D-projections:\n * type: npz\n * internal structure:\n * entry x_2D: array of (float, float) data points\n * name: embedded_x_1-$(REDUCTION_SIZE_FACTOR)_$(PARAMS[0])_$(PARAMS[1]).$(PARAMS[N]).npz\n * path: $(TSNE_DATA_PATH)\n\n* raw transactions:\n * type: npz\n * internal structure:\n * entry originals: raw transactions\n * name: originals$(VERSION).npz\n * path: $(DATA_PATH)\n \n* predictor:\n * type: npz\n * internal structure:\n * entry pred: predictions\n * name: $(PREDICTOR)$(VERSION)\n * path: $(MODEL_PATH)\n\nDefault parameters\n------------------\n\n* BASE_PATH: absolute path of where to find things\n * e.g: /home/user/data\n\nAll the following are paths relative to $(BASE_PATH):\n* DATA_PATH: where the original data (pre-processed and raw) is stored\n * e.g: data/\n* TSNE_DATA_PATH: where the 2D-projected data and models are stored\n * e.g: tsne/\n* MODEL_PATH: where the Keras NN models and/or their predictions are stored\n * e.g: models/\n* GRAPH_PATH: where to store fancy graphs and pickled visualizations\n * e.g: graph/\n\n* INPUT_FILE_BASE_NAME: base name of the pre-processed data files\n * e.g: processed_prediction, processed_for_meta, one_hot\n* DEFAULT_RN: name of the Keras NN model used for predictions\n * e.g: one_hot_1000-600-300-200-100_RN\n* VERSION: suffix to identify a unique version of all files\n * e.g: _20170614\n\n* e.g:\n * processed_prediction_x_y_20170614.npz\n * one_hot_1000-600-300-200-100_RN_20170614\n\n* PARAMS: parameters of the t-SNE projections, all combinations will be loaded (or learnt)\n * e.g: {\n 'perplexities' : [30,40,50],\n 'learning_rates': [500, 
800, 1000],\n 'inits' : ['random', 'pca'],\n 'n_iters' : [5000, 15000]\n }\n* REDUCTION_SIZE_FACTOR: which subset size of the data you want to load\n * e.g: 50 for local, 30 for ovh, 15 for epinal\n \n* OUTPUT_NAME: type of label you want to load, predict, and visualize\n * e.g: account", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "GPL V3", "maintainer": "", "maintainer_email": "", "name": "Data-viz", "package_url": "https://pypi.org/project/Data-viz/", "platform": "", "project_url": "https://pypi.org/project/Data-viz/", "project_urls": null, "release_url": "https://pypi.org/project/Data-viz/0.13/", "requires_dist": null, "requires_python": "", "summary": "Represent your high-dimensional data in a 2D space and play with it", "version": "0.13" }, "last_serial": 3123196, "releases": { "0.10": [ { "comment_text": "", "digests": { "md5": "600aa6b0f38e7d67f602629008002c0b", "sha256": "e899e0f1ff532555d5b259c9e59c2650fdf839c199936d2d99196625c8a34bf3" }, "downloads": -1, "filename": "Data-viz-0.10.tar.gz", "has_sig": false, "md5_digest": "600aa6b0f38e7d67f602629008002c0b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23670, "upload_time": "2017-08-21T13:42:21", "url": "https://files.pythonhosted.org/packages/d2/bd/45921b92792c339f5bd9fcaefb5a7c1e4668b15703ce6259d5477f185e76/Data-viz-0.10.tar.gz" } ], "0.12": [ { "comment_text": "", "digests": { "md5": "d766585f0339c5a69376363dabd570f9", "sha256": "bf5434e1019559f326c40b6867048ce3d2a407df519163d9996bb1b79544825f" }, "downloads": -1, "filename": "Data-viz-0.12.tar.gz", "has_sig": false, "md5_digest": "d766585f0339c5a69376363dabd570f9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26172, "upload_time": "2017-08-24T10:50:29", "url": 
"https://files.pythonhosted.org/packages/2d/7d/6eda37098c75185f2c6b0b6e22ba54d75c266a15676ef5414ec11ecb3a82/Data-viz-0.12.tar.gz" } ], "0.13": [ { "comment_text": "", "digests": { "md5": "d8cbff0137f75ab805d8172ba4d7597c", "sha256": "8fed064308862bf15304fe225bf782a6262c96c50bb9c2b721f5ff756e2a93dd" }, "downloads": -1, "filename": "Data-viz-0.13.tar.gz", "has_sig": false, "md5_digest": "d8cbff0137f75ab805d8172ba4d7597c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 28830, "upload_time": "2017-08-25T14:34:37", "url": "https://files.pythonhosted.org/packages/73/e5/fb90ba738107e1959ddb4e22f1d3b9fea2df6cbc53982de034dadadd313b/Data-viz-0.13.tar.gz" } ], "0.8": [ { "comment_text": "", "digests": { "md5": "8d8f3dec50715f1d757e6174ed0965cf", "sha256": "dd4c4848d32cafd7c41ff541451f2a9026462b48e2f07a011f9c92ea1c688053" }, "downloads": -1, "filename": "Data-viz-0.8.tar.gz", "has_sig": false, "md5_digest": "8d8f3dec50715f1d757e6174ed0965cf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20916, "upload_time": "2017-08-21T10:17:18", "url": "https://files.pythonhosted.org/packages/8e/65/54f5a0585da00337eeffdcc06e2679b2005ed7eda1dceba98a78b5cecf1a/Data-viz-0.8.tar.gz" } ], "0.8.1.dev0": [ { "comment_text": "", "digests": { "md5": "aa0c01797aa3058d593ab625bbd55064", "sha256": "e0088ea7c81bff682acb8c6b3cfc352bd4d9f0e0503792885c6adac3af36d8bf" }, "downloads": -1, "filename": "Data-viz-0.8.1.dev0.tar.gz", "has_sig": false, "md5_digest": "aa0c01797aa3058d593ab625bbd55064", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20921, "upload_time": "2017-08-21T10:24:34", "url": "https://files.pythonhosted.org/packages/a9/3e/95fced4e8f91d3e94b14ac1784393026ae8385ca3cf990c977f08b37c79e/Data-viz-0.8.1.dev0.tar.gz" } ], "0.8.2.dev0": [ { "comment_text": "", "digests": { "md5": "40dbed4c7621b360a514cd034745483e", "sha256": 
"aad46b7860a1379834901bd7eeaf422a0b7b0889ec04e440699dbef4a541ceb9" }, "downloads": -1, "filename": "Data-viz-0.8.2.dev0.tar.gz", "has_sig": false, "md5_digest": "40dbed4c7621b360a514cd034745483e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21036, "upload_time": "2017-08-21T10:31:43", "url": "https://files.pythonhosted.org/packages/6d/4d/4c59d6e9fef505f29105b165d88878befb1fd028ed790e5000ddfac8c883/Data-viz-0.8.2.dev0.tar.gz" } ], "0.8.3.dev0": [ { "comment_text": "", "digests": { "md5": "6530f775b9954701b8e0fd51848f9c9c", "sha256": "6e8d4b1173f818d800931dd2dff7f951f8682d3d5147b5b84bc8e5c6350a4c6f" }, "downloads": -1, "filename": "Data-viz-0.8.3.dev0.tar.gz", "has_sig": false, "md5_digest": "6530f775b9954701b8e0fd51848f9c9c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19114, "upload_time": "2017-08-21T11:42:01", "url": "https://files.pythonhosted.org/packages/cc/e9/ae6b9ba75a0a8feb9cb190c484ecac3206b1c81a68ec77cd074b0d2b8e17/Data-viz-0.8.3.dev0.tar.gz" } ], "0.8.4.dev0": [ { "comment_text": "", "digests": { "md5": "6a07b14338c7fd77d11bff324bcd1baf", "sha256": "370f924ed27858bfeebbf6712d05f5e4407d2d9b5971e8801b40911fa85a4701" }, "downloads": -1, "filename": "Data-viz-0.8.4.dev0.tar.gz", "has_sig": false, "md5_digest": "6a07b14338c7fd77d11bff324bcd1baf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20941, "upload_time": "2017-08-21T11:44:43", "url": "https://files.pythonhosted.org/packages/da/7b/0cf73b939baecccc4de93dae46cf119e2c298c943d080c1ff260d4d1cfca/Data-viz-0.8.4.dev0.tar.gz" } ], "0.8.5.dev0": [ { "comment_text": "", "digests": { "md5": "583db1f546a99415882ed2b022170950", "sha256": "496136fcbf55a2388a04c4a4735bb7ecd3ae331181e819873b282c626cf83513" }, "downloads": -1, "filename": "Data-viz-0.8.5.dev0.tar.gz", "has_sig": false, "md5_digest": "583db1f546a99415882ed2b022170950", "packagetype": "sdist", "python_version": "source", 
"requires_python": null, "size": 21033, "upload_time": "2017-08-21T11:49:52", "url": "https://files.pythonhosted.org/packages/8f/7b/deaa817c38ad8be79d4e55070877c81d7fa470f57d9421b36ab0f5b32d98/Data-viz-0.8.5.dev0.tar.gz" } ], "0.8.6.dev0": [ { "comment_text": "", "digests": { "md5": "9097a64b2522a1dbc6c74c9b56a2817d", "sha256": "4082cc94be61b67e4038e7fbc23ed2527b59e8218a24f4013f7175814e9d98e4" }, "downloads": -1, "filename": "Data-viz-0.8.6.dev0.tar.gz", "has_sig": false, "md5_digest": "9097a64b2522a1dbc6c74c9b56a2817d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23037, "upload_time": "2017-08-21T12:10:14", "url": "https://files.pythonhosted.org/packages/66/c7/48336d822c56d17732eeb5e7657f46b6103aed450fb7295eeeb865807905/Data-viz-0.8.6.dev0.tar.gz" } ], "0.8.7.dev0": [ { "comment_text": "", "digests": { "md5": "138627beece79b88b35f8748387a995e", "sha256": "3759e53cc74f8819989401cc63ebd16f273ec1234dffd399bd360acaf476ac6d" }, "downloads": -1, "filename": "Data-viz-0.8.7.dev0.tar.gz", "has_sig": false, "md5_digest": "138627beece79b88b35f8748387a995e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23009, "upload_time": "2017-08-21T12:15:42", "url": "https://files.pythonhosted.org/packages/63/21/11695eb1b4113254145d79c780ae50c6dc88213733830a955b9bffcae945/Data-viz-0.8.7.dev0.tar.gz" } ], "0.8.8.dev0": [ { "comment_text": "", "digests": { "md5": "cd7241bbf6c07a75f0b195f454e0bb9e", "sha256": "a80c5b272c4d21dfaa283fc855ae84830c8b6a39289bed96f955a3adc8ed5233" }, "downloads": -1, "filename": "Data-viz-0.8.8.dev0.tar.gz", "has_sig": false, "md5_digest": "cd7241bbf6c07a75f0b195f454e0bb9e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23563, "upload_time": "2017-08-21T12:17:42", "url": "https://files.pythonhosted.org/packages/45/73/dc1f66747b834ed6056b71fd5eff38a562d1d8ba4f94fe8f3e69f15e778b/Data-viz-0.8.8.dev0.tar.gz" } ], "0.8.9.dev0": [ { "comment_text": 
"", "digests": { "md5": "8e3a1ca50d9a5a230c946ee39e3ad09d", "sha256": "38a41bfd3e3fd18bbe55ea37969cca56d6eab48b9c7d14c1f745e1f008b983e2" }, "downloads": -1, "filename": "Data-viz-0.8.9.dev0.tar.gz", "has_sig": false, "md5_digest": "8e3a1ca50d9a5a230c946ee39e3ad09d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23561, "upload_time": "2017-08-21T12:19:19", "url": "https://files.pythonhosted.org/packages/16/01/1b53a0dfc1e8d91f149130cbf3a1c63e8f58b396f1d97025fb30e4c47666/Data-viz-0.8.9.dev0.tar.gz" } ], "0.9.1": [ { "comment_text": "", "digests": { "md5": "ca894397af2365e1d17426a5f603e478", "sha256": "ca3d37eea95cde9875f27e42e591e652a8586be9842b9992d0f6f09b5a8205bd" }, "downloads": -1, "filename": "Data-viz-0.9.1.linux-x86_64.tar.gz", "has_sig": false, "md5_digest": "ca894397af2365e1d17426a5f603e478", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 34350, "upload_time": "2017-08-21T13:29:55", "url": "https://files.pythonhosted.org/packages/47/98/3f2098f18a89402809c75ea2ca0b1c5df8e06faaf811adb9c5099ccc8a76/Data-viz-0.9.1.linux-x86_64.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d8cbff0137f75ab805d8172ba4d7597c", "sha256": "8fed064308862bf15304fe225bf782a6262c96c50bb9c2b721f5ff756e2a93dd" }, "downloads": -1, "filename": "Data-viz-0.13.tar.gz", "has_sig": false, "md5_digest": "d8cbff0137f75ab805d8172ba4d7597c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 28830, "upload_time": "2017-08-25T14:34:37", "url": "https://files.pythonhosted.org/packages/73/e5/fb90ba738107e1959ddb4e22f1d3b9fea2df6cbc53982de034dadadd313b/Data-viz-0.13.tar.gz" } ] }