{ "info": { "author": "J\u00e9r\u00e9my Ferrero,Shashank Khare", "author_email": "shashank88@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# ln2sql\n\nln2sql is a NLP tool to query a database in natural language. The tool takes in input a database model and a sentence and translate the latter in a valid SQL statement able to query the input data model.\n\n##### Scientific paper\n\nThe initial tool is described in the following French paper (which can be found in the `docs/` directory):\n\nBeno\u00eet Couderc and J\u00e9r\u00e9my Ferrero. fr2sql : Database Query in French. (fr2sql : Interrogation de bases de donn\u00e9es en fran\u00e7ais [in French]). In Proceedings of the 17th RECITAL (affiliated with the 22th TALN Conference). June 2015. Caen, France. ATALA. pp.1-12 \n\nPlease cite the paper if you use ln2sql.\n\n#### Differences between the version of the paper (fr2sql) and this version (ln2sql)\n\nln2sql is not the state-of-the-art tool for copyright reasons. It's just a quick & dirty Python wrapper but it has some speed optimizations.\n\n* [A data model is only learned from a parsing of a SQL dump file. Thus, no database connection is needed.](https://github.com/FerreroJeremy/ln2sql#database-input)\n\n* In the paper, TreeTagger is used to filter the words of the input sentence according to its POS tagging.\nIn this way, a mapping between the keywords of the input sentence and the keywords of the data model can be performed.\nIn ln2sql, Treegagger is left in favour of an import of personal configuration files (for [languages](https://github.com/FerreroJeremy/ln2sql#supported-languages), [stop words and synonyms](https://github.com/FerreroJeremy/ln2sql#thesaurus-and-stop-word-list)) to be more generic.\n\n> Beware that ln2sql cannot therefore automatically solve the gender and number problem. So if the word \"students\" is in the input sentence, it does not match with the table \"student\" in the model of data.\nTo do that, the equivalence \"students \u2192 student\" must be appear in the [used thesaurus](https://github.com/FerreroJeremy/ln2sql#thesaurus-and-stop-word-list).\nIf you want a version using TreeTagger, a Python wrapper exists and a documentation can be found here.\n\n* [In theory, all languages can be supported.](https://github.com/FerreroJeremy/ln2sql#supported-languages)\n\n* A grammar still parse the input sentence to generate the corresponding query structure, but now this structure is stocked in a Python class able to print [a query structure JSON file](https://github.com/FerreroJeremy/ln2sql/blob/master/README.md#json-output). Thus, the hash map for the query generation was abandoned. In addition, a multi-threading implementation was adopted.\n\n## SQL statements supported\n\n- [X] SELECT\n\t- [X] one column\n\t- [X] multiple columns\n\t- [X] all columns\n\t- [X] distinct select\n\t- [X] aggregate functions\n\t\t- [X] count-select\n\t\t- [X] sum-select\n\t\t- [X] avg-select\n\t\t- [X] min-select\n\t\t- [X] max-select\n- [X] JOIN\n\t- [X] inner join\n\t- [X] natural join\n- [X] WHERE\n\t- [X] one condition\n\t- [X] multiple conditions\n\t- [X] junction\n\t- [X] disjunction\n\t- [X] cross-condition\n\t- [X] operators\n\t\t- [X] equal operator\n\t\t- [X] not equal operator\n\t\t- [X] greater-than operator\n\t\t- [X] less-than operator\n\t\t- [X] like operator\n\t\t- [ ] between operator (not 100% efficient)\n\t- [X] aggregate functions\n\t\t- [X] sum in condition\n\t\t- [X] avg in condition\n\t\t- [X] min in condition\n\t\t- [X] max in condition\n- [X] ORDER BY\n\t- [X] ASC\n\t- [X] DESC\n- [X] GROUP BY\n- [X] multiple queries\n- [X] exception and error handling\n- [ ] detection of values (not 100% efficient)\n\n## How to use it?\n\n#### Supported languages\n\nThe tool can deal with any language, so long as it has its configuration file (i.e. a file with the keywords of the language).\n\nLanguage configuration files can be found in `lang/` directory. The files are CSV files. Each line represent a type of keywords. Anything before the colon is ignored. Keywords must be separated by a comma.\n\nYou can build your own language configuration file following the English and French templates.\n\n#### Database input\n\nTo be effective ln2sql need to learn the data model of the database that the user want to query. It need to load the corresponding SQL dump file to do that.
\nA database dump is a file containing a record of the table structure and/or the data of a database.\n\n##### Usage of the Database class\n\n```python\ndatabase = Database()\ndatabase.load(\"database/tal.sql\")\ndatabase.print_me()\n```\nFor the following SQL statements loaded, the output in the terminal looks like:\n

\n\n#### Thesaurus and stop word list\n\nYou can improve the keyword filtering using a thesaurus. Thesaurus can be found in `thesaurus/` directory. You can build your own thesaurus following the OpenOffice template.\n\nYou can improve the stop word filtering using a stop word list. You can build your own stop word list following the template of the lists in `stopwords/` directory.\n\n#### Usage\n\nYou can directly use the python wrapper by the following way:\n```\nUsage:\n\tpython ln2sql.py -d -l -i [-j ] [-t ] [-s ]\nParameters:\n\t-h\t\t\t\t\tprint this help message\n\t-d \t\t\t\tpath to sql dump file\n\t-l \t\t\t\tpath to language configuration file\n\t-i \t\t\tinput sentence to parse\n\t-j \t\t\t\tpath to JSON output file\n\t-t \t\t\t\tpath to thesaurus file\n\t-s \t\t\t\tpath to stopwords file\n```\nexample of usage:\n```\npython ln2sql.py -d database/city.sql -l lang/english.csv -j output.json -i \"Count how many city there are with the name blob?\"\n```\n\nor by graphical interface by typing the following command:\n```\npython ln2sql_gui.py\n```\na window like the one below will appear:\n

\n\n## JSON output\n\nWith the following input:\n```\nWhat is the average age of students whose name is Doe or age over 25?\n```\nthe output is:\n```JSON\n{\n\t\"select\": {\n\t\t\"column\": \"age\",\n\t\t\"type\": \"AVG\"\n\t},\n\t\"from\": {\n\t\t\"table\": \"student\"\n\t},\n\t\"join\": {\n\n\t},\n\t\"where\": {\n\t\t\"conditions\": [\n\t\t\t{ \"column\": \"name\",\n\t\t\t \"operator\": \"=\",\n\t\t\t \"value\": \"Doe\"\n\t\t\t},\n\t\t\t{\n\t\t\t \"operator\": \"OR\"\n\t\t\t},\n\t\t\t{ \"column\": \"age\",\n\t\t\t \"operator\": \">\",\n\t\t\t \"value\": \"25\"\n\t\t\t}\n\t\t]\n\t},\n\t\"group_by\": {\n\n\t},\n\t\"order_by\": {\n\n\t}\n}\n```\n\n## Conception\nThe tool is implemented under the Model-View-Controller pattern. The classes imported from the Python Standard Library do not appear in the diagram except those required for inheritance (e.g. Thread or Exception).\n

\nThe above diagram was modeled with StarUML.\n
\n
\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/FerreroJeremy/ln2sql", "keywords": "", "license": "GNU", "maintainer": "", "maintainer_email": "", "name": "ln2sql", "package_url": "https://pypi.org/project/ln2sql/", "platform": "", "project_url": "https://pypi.org/project/ln2sql/", "project_urls": { "Homepage": "https://github.com/FerreroJeremy/ln2sql" }, "release_url": "https://pypi.org/project/ln2sql/0.1/", "requires_dist": null, "requires_python": "", "summary": "Convert Natural Language to SQL queries", "version": "0.1" }, "last_serial": 3363583, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "e288bf099efc4f775e5e863a0cdfd571", "sha256": "52d9f2ae1f5c3135cd4a88d457e564cdcb69d6ce254e58ae9a900cec216f4f58" }, "downloads": -1, "filename": "ln2sql-0.1.tar.gz", "has_sig": false, "md5_digest": "e288bf099efc4f775e5e863a0cdfd571", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4634, "upload_time": "2017-11-25T21:44:17", "url": "https://files.pythonhosted.org/packages/27/12/4360db8a6432caf2f29dfea43fb8591d5ea6540e3d5d67ea8a619ee9aef9/ln2sql-0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "e288bf099efc4f775e5e863a0cdfd571", "sha256": "52d9f2ae1f5c3135cd4a88d457e564cdcb69d6ce254e58ae9a900cec216f4f58" }, "downloads": -1, "filename": "ln2sql-0.1.tar.gz", "has_sig": false, "md5_digest": "e288bf099efc4f775e5e863a0cdfd571", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4634, "upload_time": "2017-11-25T21:44:17", "url": "https://files.pythonhosted.org/packages/27/12/4360db8a6432caf2f29dfea43fb8591d5ea6540e3d5d67ea8a619ee9aef9/ln2sql-0.1.tar.gz" } ] }