{ "info": { "author": "fyrestone", "author_email": "fyrestone@outlook.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 2", "Programming Language :: Python :: 3", "Topic :: Software Development :: Libraries", "Topic :: Utilities" ], "description": "pycode_similar\r\n==============\r\n\r\nThis is a simple plagiarism detection tool for python code, the basic idea is to normalize python AST representation and use difflib to get the modification from referenced code to candidate code. The plagiarism defined in pycode_similar is how many referenced code is plagiarized by candidate code, which means swap referenced code and candidate code will get different result.\r\n\r\nIt only cost me a couple of hours to implement this tool, so there is still a long way to improve the speed and accuracy, but it already performs great in detecting the plagiarism of new recruits' homeworks in our company.\r\n\r\nCompare to Moss\r\n---------------\r\n\r\n- pure python implementation\r\n- only contains one source file\r\n- no third-party dependency (except `zss `_ when use TreeDiff)\r\n- no need to register account for Moss\r\n- no need of network to access Moss\r\n\r\nThis tool was born before I know there is a `Moss (for a Measure Of Software Similarity) `_ to determine the similarity of programs. And I have tried many ways to register account for Stanford Moss, but still can't get a valid account. So, I have no accurate comparison between pycode_similar and Moss.\r\n\r\nInstallation\r\n--------------\r\n\r\nIf you don't have much time, just perform\r\n\r\n `$ pip install pycode_similar`\r\n\r\nwhich will install the module(without tests) on your system.\r\n\r\nAlso, you can just copy & paste the pycode_similar.py which require no third-party dependency.\r\n\r\n\r\nUsage\r\n--------------\r\n\r\nJust use it as a standard command line tool if pip install properly.\r\n\r\n.. code-block:: text\r\n\r\n\t$ pycode_similar\r\n\tusage: pycode_similar [-h] [-l L] [-p P] files files\r\n\r\n\tA simple plagiarism detection tool for python code\r\n\r\n\tpositional arguments:\r\n\t files the input files\r\n\r\n\toptional arguments:\r\n\t -h, --help show this help message and exit\r\n\t -l L if AST line of the function >= value then output detail\r\n\t (default: 4)\r\n\t -p P if plagiarism percentage of the function >= value then output\r\n\t detail (default: 0.5)\r\n\r\n\tpycode_similar: error: too few arguments\r\n\r\nOf course, you can use it as a python library, too.\r\n\r\n.. code-block:: python\r\n\r\n\timport pycode_similar\r\n\tpycode_similar.detect([referenced_code_str, candidate_code_str1, candidate_code_str2, ...], diff_method=UnifiedDiff)\r\n\r\n\r\nImplementation\r\n--------------\r\nThis tool has implemented two diff methods: line based diff(UnifiedDiff) and tree edit distance based diff(TreeDiff), both of them are run in function AST level.\r\n\r\n- UnifiedDiff, diff normalized function AST string lines, naive but efficiency.\r\n- TreeDiff, diff function AST, very slow and the result is not good for small functions. (depends on `zss `_)\r\n\r\nSo, when run this tool in cmd, the default diff method is UnifiedDiff. And you can switch to TreeDiff when use it as a library.\r\n\r\n\r\nTesting\r\n--------------\r\nIf you have the source code you can run the tests with\r\n\r\n `$ python pycode_similar/tests/test_cases.py`\r\n\r\nOr perform\r\n\r\n.. code-block:: text\r\n\r\n\t$ python pycode_similar.py pycode_similar/tests/original_version.py pycode_similar.py\r\n\r\n\tref: tests/original_version.py\r\n\tcandidate: pycode_similar.py\r\n\t80.14 % (803/1002) of ref code structure is plagiarized by candidate.\r\n\tcandidate function plagiarism details (AST lines >= 4 and plagiarism percentage >= 0.5):\r\n\t1.0 : ref FuncNodeCollector._mark_docstring_sub_nodes<24:4>, candidate FuncNodeCollector._mark_docstring_sub_nodes<27:4>\r\n\t1.0 : ref FuncNodeCollector._mark_docstring_nodes<54:8>, candidate FuncNodeCollector._mark_docstring_nodes<57:8>\r\n\t1.0 : ref FuncNodeCollector.generic_visit<69:4>, candidate FuncNodeCollector.generic_visit<72:4>\r\n\t1.0 : ref FuncNodeCollector.visit_Str<74:4>, candidate FuncNodeCollector.visit_Str<78:4>\r\n\t1.0 : ref FuncNodeCollector.visit_Name<83:4>, candidate FuncNodeCollector.visit_Name<88:4>\r\n\t1.0 : ref FuncNodeCollector.visit_Attribute<89:4>, candidate FuncNodeCollector.visit_Name<88:4>\r\n\t1.0 : ref FuncNodeCollector.visit_ClassDef<95:4>, candidate FuncNodeCollector.visit_ClassDef<100:4>\r\n\t1.0 : ref FuncNodeCollector.visit_FunctionDef<101:4>, candidate FuncNodeCollector.visit_FunctionDef<106:4>\r\n\t1.0 : ref FuncInfo.__init__<141:4>, candidate FuncInfo.__init__<161:4>\r\n\t1.0 : ref FuncInfo.__str__<151:4>, candidate FuncInfo.__str__<171:4>\r\n\t1.0 : ref FuncInfo.func_code<162:4>, candidate FuncInfo.func_code<182:4>\r\n\t1.0 : ref FuncInfo.func_code_lines<168:4>, candidate FuncInfo.func_code_lines<188:4>\r\n\t1.0 : ref FuncInfo.func_ast<174:4>, candidate FuncInfo.func_ast<194:4>\r\n\t1.0 : ref FuncInfo.func_ast_lines<180:4>, candidate FuncInfo.func_ast_lines<200:4>\r\n\t1.0 : ref FuncInfo._retrieve_func_code_lines<186:4>, candidate FuncInfo._retrieve_func_code_lines<206:4>\r\n\t1.0 : ref FuncInfo._iter_node<208:4>, candidate FuncInfo._iter_node<228:4>\r\n\t1.0 : ref FuncInfo._dump<232:4>, candidate FuncInfo._dump<252:4>\r\n\t1.0 : ref FuncInfo._inner_dump<242:8>, candidate FuncInfo._inner_dump<262:8>\r\n\t1.0 : ref ArgParser.error<267:4>, candidate ArgParser.error<291:4>\r\n\t0.95: ref unified_diff<281:0>, candidate UnifiedDiff._gen<339:8>\r\n\t0.92: ref FuncNodeCollector.__init__<18:4>, candidate FuncNodeCollector.__init__<20:4>\r\n\t0.92: ref FuncNodeCollector.visit_Compare<108:4>, candidate FuncNodeCollector._simple_nomalize<117:8>\r\n\t0.89: ref FuncNodeCollector.visit_Expr<79:4>, candidate FuncNodeCollector.visit_Expr<83:4>\r\n\r\nClick `here `_\r\nto view this diff -> `0.92: ref FuncNodeCollector.visit_Compare<108:4>, candidate FuncNodeCollector._simple_nomalize<117:8>`\r\n\r\nRepository\r\n--------------\r\n\r\nThe project is hosted on GitHub. You can look at the source here:\r\n\r\n https://github.com/fyrestone/pycode_similar\r\n\r\n\r\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/fyrestone/pycode_similar", "keywords": "code similarity plagiarism moss generic utility", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "pycode-similar", "package_url": "https://pypi.org/project/pycode-similar/", "platform": "All", "project_url": "https://pypi.org/project/pycode-similar/", "project_urls": { "Homepage": "https://github.com/fyrestone/pycode_similar" }, "release_url": "https://pypi.org/project/pycode-similar/1.2/", "requires_dist": null, "requires_python": "", "summary": "A simple plagiarism detection tool for python code", "version": "1.2" }, "last_serial": 3501308, "releases": { "1.0": [ { "comment_text": "", "digests": { "md5": "99c7c8cb3a2e9e46766e1fe7a6a8e860", "sha256": "9066f53d62bb3d29d943a6340af19d926502914f0454efcd9715d70afe190748" }, "downloads": -1, "filename": "pycode_similar-1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "99c7c8cb3a2e9e46766e1fe7a6a8e860", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 9980, "upload_time": "2017-10-01T15:43:12", "url": "https://files.pythonhosted.org/packages/a2/a5/80bf3052c54266874d7f11b5c2b0d1129bd49ad208e6729ddbc4926225a2/pycode_similar-1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "efa6c43a375915656f2748f1fe4ac02a", "sha256": "0e0d7f501dfcac2c6b48f3d5123fa4b1272f633f81f552c69be5125891e1bd06" }, "downloads": -1, "filename": "pycode_similar-1.0.tar.gz", "has_sig": false, "md5_digest": "efa6c43a375915656f2748f1fe4ac02a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7160, "upload_time": "2017-10-01T15:43:16", "url": "https://files.pythonhosted.org/packages/0b/6e/a16cc90507f0188e506e4c3ca92afa3ae3fc08e6774f2d36f069b080228d/pycode_similar-1.0.tar.gz" } ], "1.1": [ { "comment_text": "", "digests": { "md5": "cf3767e8104c7090bd6f0084fbb66f33", "sha256": "9df9b61b2c97a36b9d9740e1192a24b33adc52ca551b049cdca1645e4bbfb1ac" }, "downloads": -1, "filename": "pycode_similar-1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "cf3767e8104c7090bd6f0084fbb66f33", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 11380, "upload_time": "2017-10-04T05:27:10", "url": "https://files.pythonhosted.org/packages/36/15/051d2d651e79c26f4a7729512029f827c64394bc3f2db5eb7f28ceefa373/pycode_similar-1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2f2b46b597d399d1b53249c23b0c3ef2", "sha256": "53b139cd517eb2b16b1bf0da353dcf9e3f4bc8535c33eb1852f6e494c32290c1" }, "downloads": -1, "filename": "pycode_similar-1.1.tar.gz", "has_sig": false, "md5_digest": "2f2b46b597d399d1b53249c23b0c3ef2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8456, "upload_time": "2017-10-04T05:27:12", "url": "https://files.pythonhosted.org/packages/74/82/0ef6033c564268b3413f484c091fa5194a54a8ac9cf5cf8606c8bea92634/pycode_similar-1.1.tar.gz" } ], "1.2": [ { "comment_text": "", "digests": { "md5": "bf58f5f2aa37d8730c512a1bc2d91caf", "sha256": "5ccc08faaefe26c1ea7d6034e52f2143b3bca848b406d0eed104919363b94755" }, "downloads": -1, "filename": "pycode_similar-1.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "bf58f5f2aa37d8730c512a1bc2d91caf", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 11530, "upload_time": "2018-01-18T15:56:40", "url": "https://files.pythonhosted.org/packages/10/45/e14b3fddb7926bb0b5c13393ae72fe36ab26c85c5008f75513b59ec14ef6/pycode_similar-1.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3d9380fe8d5778e5d32c2ffb18459d83", "sha256": "6a83582856d7769001e5c36cdbc59c232f1d8a992634944a6b6f044b004f3883" }, "downloads": -1, "filename": "pycode_similar-1.2.tar.gz", "has_sig": false, "md5_digest": "3d9380fe8d5778e5d32c2ffb18459d83", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8677, "upload_time": "2018-01-18T15:56:42", "url": "https://files.pythonhosted.org/packages/f3/42/785136e4a501541dc04bec61424bf88c83a38e93f7e5802e5f32fa55764d/pycode_similar-1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "bf58f5f2aa37d8730c512a1bc2d91caf", "sha256": "5ccc08faaefe26c1ea7d6034e52f2143b3bca848b406d0eed104919363b94755" }, "downloads": -1, "filename": "pycode_similar-1.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "bf58f5f2aa37d8730c512a1bc2d91caf", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 11530, "upload_time": "2018-01-18T15:56:40", "url": "https://files.pythonhosted.org/packages/10/45/e14b3fddb7926bb0b5c13393ae72fe36ab26c85c5008f75513b59ec14ef6/pycode_similar-1.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3d9380fe8d5778e5d32c2ffb18459d83", "sha256": "6a83582856d7769001e5c36cdbc59c232f1d8a992634944a6b6f044b004f3883" }, "downloads": -1, "filename": "pycode_similar-1.2.tar.gz", "has_sig": false, "md5_digest": "3d9380fe8d5778e5d32c2ffb18459d83", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8677, "upload_time": "2018-01-18T15:56:42", "url": "https://files.pythonhosted.org/packages/f3/42/785136e4a501541dc04bec61424bf88c83a38e93f7e5802e5f32fa55764d/pycode_similar-1.2.tar.gz" } ] }