{ "info": { "author": "abhinav", "author_email": "abhinav@comp.nus.edu.sg", "bugtrack_url": null, "classifiers": [], "description": "# ![sciwing logo](https://sciwing.s3.amazonaws.com/sciwing.png)\nA Modern Toolkit for Scientific Document Processing from [WING-NUS](https://wing.comp.nus.edu.sg/)\n\n[![Build Status](https://travis-ci.com/abhinavkashyap/sciwing.svg?token=AShdNBksk5K9Pxg45w3H&branch=master)](https://travis-ci.com/abhinavkashyap/sciwing) ![Open Issues](https://img.shields.io/github/issues/abhinavkashyap/sciwing) ![Last Commit](https://img.shields.io/github/last-commit/abhinavkashyap/sciwing) [![Updates](https://pyup.io/repos/github/abhinavkashyap/sciwing/shield.svg)](https://pyup.io/repos/github/abhinavkashyap/sciwing/) ![](https://img.shields.io/badge/contributions-welcome-success)\n\n\n\nSciWING is a modern framework from WING-NUS to facilitate Scientific Document Processing. It is built on PyTorch and emphasizes modularity from the ground up and an easy-to-use interface. SciWING includes many pre-trained models for fundamental tasks in Scientific Document Processing for practitioners. It has the following advantages:\n\n- **Modularity** - The framework embraces modularity from the ground up. **SciWING** helps in creating new models by combining multiple re-usable modules. You can combine different modules and experiment with new approaches easily.\n\n- ***Pre-trained Models*** - SciWING has many pre-trained models for fundamental tasks such as logical section classification for scientific documents and citation string parsing (take a look at some of the other projects related to citation parsing: [Parscit](https://github.com/WING-NUS/ParsCit), [Neural Parscit](https://github.com/WING-NUS/Neural-ParsCit)). Easy access to pre-trained models is available through web APIs.\n\n- ***Run from Config File*** - SciWING enables you to declare datasets, models and experiment hyperparameters in a [TOML](https://github.com/toml-lang/toml) file. 
The models declared in a TOML file have a one-to-one correspondence with their respective class declarations in a Python file. SciWING parses the model into a directed acyclic graph (DAG) and instantiates the model using the DAG's topological ordering.\n\n- **Extensible** - SciWING enables easy addition of new datasets and provides command-line tools for doing so. Custom modules, which are ordinary PyTorch modules, can also be added.\n\n\n\n\n\n## Installation \n\nYou can install SciWING with pip. We recommend using a virtual environment to install the package. \n\n```zsh\npip install sciwing\n```\n\n\n\n## Simple Example \n\nExample of a model that concatenates a vanilla word embedding and an ELMo embedding and then encodes the result using an `LSTM2Vec` encoder before finally passing it through a linear layer for classification.\n\n\n\n```python\nfrom sciwing.modules.embedders import BowElmoEmbedder\nfrom sciwing.modules.embedders import VanillaEmbedder \nfrom sciwing.modules.embedders import ConcatEmbedders\n\nfrom sciwing.modules.lstm2vecencoder import LSTM2VecEncoder \n# Assumption: SimpleClassifier lives in sciwing.models.simpleclassifier\nfrom sciwing.models.simpleclassifier import SimpleClassifier\n\n# `dataset`, EMBEDDING_DIM, HIDDEN_DIM and NUM_CLASSES are assumed to be defined beforehand\n\n# initialize an elmo_embedder\nelmo_embedder = BowElmoEmbedder()\nELMO_EMBEDDING_DIMENSION = 1024\n\n# Get word embeddings as PyTorch tensors for all the words in the vocab\nembedding = dataset.word_vocab.load_embedding()\n# initialize a normal embedder with the word embedding \n# EMBEDDING_DIM is the embedding dimension for the word vectors\nvanilla_embedder = VanillaEmbedder(embedding=embedding, embedding_dim=EMBEDDING_DIM)\n\n# concatenate the vanilla embedding and the elmo embedding to get a new embedding\nfinal_embedder = ConcatEmbedders([vanilla_embedder, elmo_embedder])\nFINAL_EMBEDDING_DIM = EMBEDDING_DIM + ELMO_EMBEDDING_DIMENSION\n\n# instantiate a LSTM2VecEncoder that encodes a sentence to a single vector\nencoder = LSTM2VecEncoder(\n    emb_dim=FINAL_EMBEDDING_DIM,\n    embedder=final_embedder,\n    hidden_dimension=HIDDEN_DIM\n)\n\n# Instantiate a linear classification layer that takes in an encoder and the dimension of 
the encoding and the number of classes\nmodel = SimpleClassifier(\n    encoder=encoder,\n    encoding_dim=HIDDEN_DIM,\n    num_classes=NUM_CLASSES\n)\n\n```\n\n\n\n## Contributing ![](http://img.shields.io/badge/contributions-welcome-success)\n\nThank you for your interest in contributing. You can directly email the author at (email omitted for submission purposes). We will be happy to help.\n\n\n\nIf you want to get involved in the development, we recommend that you install SciWING on a local machine using the instructions below. All our classes and methods are documented, and we hope you can find your way around them.\n\n\n\n## Instructions to install SciWING locally\n\nSciWING requires Python 3.7. We recommend that you install `pyenv`. \n\nInstructions to install pyenv are available [here](https://github.com/pyenv/pyenv). If you have problems installing Python 3.7 on your machine, make sure to check out their common build problems page [here](https://github.com/pyenv/pyenv/wiki/common-build-problems) and install all dependencies.\n\n1. **Clone from git** \n\n   `git clone https://github.com/abhinavkashyap/sciwing.git`\n\n2. `cd sciwing`\n\n3. **Install all the requirements** \n\n   `pip install -r requirements.txt`\n\n4. **Download spacy models** \n\n   `python -m spacy download en`\n\n5. **Install the package locally**\n\n   `pip install -e .`\n\n6. **Create directories where sciwing stores embeddings and experiment results**\n\n   `sciwing develop makedirs`\n\n   `sciwing develop download`\n\n   The second command downloads all the data and embeddings required for development, which will take some time.\n\n   Sip some :coffee:. Come back later \n\n7. **Run Tests**\n\n   SciWING uses `pytest` for testing. You can use the following command to run tests \n\n   `pytest tests -n auto --dist=loadfile`\n\n   The test suite is huge and again, it will take some time to run. 
We will work on reducing the test time in future iterations.\n\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/abhinavkashyap/sciwing", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "sciwing", "package_url": "https://pypi.org/project/sciwing/", "platform": "", "project_url": "https://pypi.org/project/sciwing/", "project_urls": { "Homepage": "https://github.com/abhinavkashyap/sciwing" }, "release_url": "https://pypi.org/project/sciwing/0.1.0.dev0/", "requires_dist": [ "networkx", "wandb", "logzero", "falcon-multipart", "typing", "torch", "wasabi", "boto3", "tqdm", "wrapt", "stopwords", "allennlp", "botocore", "gensim", "pytorch-pretrained-bert", "spacy", "questionary", "pandas", "pytorch-crf", "colorful", "falcon", "numpy", "click", "toml", "requests", "scikit-learn", "tensorboardX", "Deprecated" ], "requires_python": "", "summary": "Modern Scientific Document Processing Framework", "version": "0.1.0.dev0" }, "last_serial": 5821595, "releases": { "0.1.0.dev0": [ { "comment_text": "", "digests": { "md5": "b402ebe50ddd377483a18648fd4837c7", "sha256": "2e71d25f6ab9c709a699cca63c6badc9ef65d6193d405072d9f687629e547a66" }, "downloads": -1, "filename": "sciwing-0.1.0.dev0-py3-none-any.whl", "has_sig": false, "md5_digest": "b402ebe50ddd377483a18648fd4837c7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 135120, "upload_time": "2019-09-12T17:08:25", "url": "https://files.pythonhosted.org/packages/4e/29/9e2b859d88a4d0dac9fa8b68939181502269a15c88a7e745e5ca108826ae/sciwing-0.1.0.dev0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "443eabc449198e944ac0cc05fdbad33a", "sha256": "6c864dc19747705b8d8767479f15d1c8d65163cacbb4556ac1e1c5e1a85613e0" }, "downloads": -1, "filename": "sciwing-0.1.0.dev0.tar.gz", "has_sig": false, "md5_digest": 
"443eabc449198e944ac0cc05fdbad33a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 82941, "upload_time": "2019-09-12T17:08:28", "url": "https://files.pythonhosted.org/packages/67/f3/5dcab83fea1a67f89f9312c5a4b55233b17ea0b6601874a795a67ad61758/sciwing-0.1.0.dev0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b402ebe50ddd377483a18648fd4837c7", "sha256": "2e71d25f6ab9c709a699cca63c6badc9ef65d6193d405072d9f687629e547a66" }, "downloads": -1, "filename": "sciwing-0.1.0.dev0-py3-none-any.whl", "has_sig": false, "md5_digest": "b402ebe50ddd377483a18648fd4837c7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 135120, "upload_time": "2019-09-12T17:08:25", "url": "https://files.pythonhosted.org/packages/4e/29/9e2b859d88a4d0dac9fa8b68939181502269a15c88a7e745e5ca108826ae/sciwing-0.1.0.dev0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "443eabc449198e944ac0cc05fdbad33a", "sha256": "6c864dc19747705b8d8767479f15d1c8d65163cacbb4556ac1e1c5e1a85613e0" }, "downloads": -1, "filename": "sciwing-0.1.0.dev0.tar.gz", "has_sig": false, "md5_digest": "443eabc449198e944ac0cc05fdbad33a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 82941, "upload_time": "2019-09-12T17:08:28", "url": "https://files.pythonhosted.org/packages/67/f3/5dcab83fea1a67f89f9312c5a4b55233b17ea0b6601874a795a67ad61758/sciwing-0.1.0.dev0.tar.gz" } ] }