Subject Area

Computer Science


As software systems have grown tremendously in complexity, the data produced during development has become increasingly sophisticated. The systematic, large-scale accumulation of software engineering data has opened up new opportunities to infer information that can help software development in a given context. Intelligent software development tools of this type have come to be known as recommendation systems.

Recommendation Systems in Software Change (RSSCs) share commonalities with conventional recommendation systems, mainly in their usage model, their usual reliance on data mining, and the predictive nature of their functionality. A major challenge in designing RSSCs is therefore to automatically and accurately interpret the highly technical data stored in software repositories. In response to this challenge, many RSSCs have been proposed. However, existing work either relies on large amounts of historical data or suffers from low accuracy, which makes existing systems hard to apply in practice.

In this dissertation, we develop techniques that help developers overcome one major challenge in software development and evolution: automated software change. We introduce several recommendation systems. We start by proposing a new approach and tool, CHIP, that predicts the actual change impact set of a software change by leveraging two kinds of code dependency: call dependency and data sharing dependency. CHIP employs two novel extensions, dependency frequency filtering and shared data type idf filtering, to reduce false positives. Then, to help developers understand the intents behind software changes, we propose AutoCILink, which automatically identifies links from code to untangled change intents using a pattern-based link identification system (AutoCILink-P) and a supervised learning-based link classification system (AutoCILink-ML). In addition, to help developers find an appropriate API to adopt, we propose an approach called RecRank, which applies a novel ranking-based discriminative approach that leverages execution path features to improve Top-1 API recommendation. Furthermore, to resolve license restriction conflicts introduced by software changes, we present Automatic License Predictor (ALP), a novel learning-based method and tool for predicting licenses as software changes. Through a series of empirical evaluations that demonstrate the effectiveness of these approaches and the benefits they provide to developers, we show that the techniques presented in this dissertation represent significant advancements in software development and evolution.

Degree Date

Fall 12-21-2019

Document Type


Degree Name



Computer Science and Engineering


LiGuo Huang

Number of Pages




Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License