API Knowledge Guided Test Generation for Machine Learning Libraries

dc.contributor.advisor: Wang, Song
dc.contributor.author: Narayanan, Arunkaleeshwaran
dc.date.accessioned: 2022-12-14T16:29:09Z
dc.date.available: 2022-12-14T16:29:09Z
dc.date.copyright: 2022-07-18
dc.date.issued: 2022-12-14
dc.date.updated: 2022-12-14T16:29:09Z
dc.degree.discipline: Electrical and Computer Engineering
dc.degree.level: Master's
dc.degree.name: MASc - Master of Applied Science
dc.description.abstract: This thesis proposes MUTester, which generates test cases for the APIs of machine learning libraries by leveraging API constraints mined from the corresponding API documentation and API usage patterns mined from code fragments on Stack Overflow (SO). First, we propose a set of 18 linguistic rules for mining API constraints from API documents. Then, we use frequent itemset mining to extract API usage patterns from a large corpus of machine learning API related code fragments collected from SO. Finally, we use these two types of API knowledge to guide existing test generators when they generate tests for machine learning libraries. To evaluate MUTester, we first collected 2,889 APIs from five widely used machine learning libraries (i.e., Scikit-learn, Pandas, Numpy, Scipy, and PyTorch); then, for each API, we extracted its API knowledge, i.e., API constraints and API usage patterns. Given an API, MUTester combines its API knowledge with existing test generators (e.g., the search-based test generator PyEvosuite and the random test generator PyRandoop) to generate test cases for that API. The results of our experiment show that MUTester significantly improves the corresponding test generation methods: the average improvement in code coverage ranges from 18.0% to 41.9%. In addition, MUTester reduces the number of invalid tests generated by the existing test generators by 21%.
dc.identifier.uri: http://hdl.handle.net/10315/40679
dc.language: en
dc.rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject: Computer engineering
dc.subject: Computer science
dc.subject.keywords: Software engineering
dc.subject.keywords: Software testing
dc.subject.keywords: Automated testing
dc.subject.keywords: Natural language processing
dc.subject.keywords: Frequent item-set mining
dc.title: API Knowledge Guided Test Generation for Machine Learning Libraries
dc.type: Electronic Thesis or Dissertation
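
The abstract mentions mining API usage patterns from Stack Overflow code fragments with frequent itemset mining. As a rough illustration of that general technique only, the sketch below counts which sets of API calls co-occur in at least a minimum number of fragments. It is not MUTester's implementation: the record does not say which mining algorithm or tooling the thesis uses, so this uses plain brute-force enumeration, and the function name `frequent_itemsets`, its parameters, and the sample `snippets` data are all assumptions made up for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: count} for itemsets found in >= min_support transactions.

    Hypothetical helper: brute-force enumeration of item combinations up to
    max_size, adequate for small inputs; real miners prune the search space.
    """
    counts = Counter()
    for calls in transactions:
        unique_calls = sorted(set(calls))  # ignore repeated calls in a fragment
        for size in range(1, max_size + 1):
            for combo in combinations(unique_calls, size):
                counts[frozenset(combo)] += 1
    return {items: n for items, n in counts.items() if n >= min_support}


# Illustrative data only: each inner list stands for the API calls
# that appear together in one code fragment.
snippets = [
    ["numpy.array", "numpy.reshape", "numpy.mean"],
    ["numpy.array", "numpy.reshape"],
    ["numpy.array", "numpy.mean"],
    ["pandas.DataFrame", "pandas.DataFrame.merge"],
]

patterns = frequent_itemsets(snippets, min_support=2)
for items, count in sorted(patterns.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(items), count)
```

With a minimum support of 2, the pair `numpy.array` and `numpy.reshape` is kept because it occurs in two fragments, while the pandas calls are dropped because they occur in only one. A test generator could, under this assumption, prefer call sequences that match such frequent sets.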

Files

Original bundle
Name: Narayanan_Arunkaleeshwaran_2022_Masters.pdf
Size: 6.77 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.87 KB
Format: Plain Text

Name: YorkU_ETDlicense.txt
Size: 3.39 KB
Format: Plain Text