AI-Guided Exploration of Large-Scale Codebases

A project exploring how large language models can support software comprehension through multimodal, interactive visualization and guided navigation.

Project Description

Modern software systems are large, complex, and continuously evolving. Developers spend a significant portion of their time trying to understand unfamiliar codebases, whether for onboarding, debugging, refactoring, or feature development. Traditional tools such as static diagrams and IDE plugins often scale poorly as complexity grows and offer little support for interactive exploration.

This project investigates how large language models (LLMs) can be integrated with deterministic reverse engineering techniques to create adaptive, multimodal, and intent-aware tools for navigating large codebases.

Our system bridges static structure and intelligent interaction through four main components:

  • Code-to-UML Reverse Engineering: Converts source code into UML class diagrams using abstract syntax trees, enabling multi-level structural abstraction.
  • Interactive Visualization: A dynamic front-end with zooming, filtering, overlays (e.g., change heatmaps), and drill-down exploration.
  • LLM-Guided Interface Planner: Interprets user queries and navigation patterns to guide exploration, summarize modules, and recommend views via structured GUI updates.
  • Context and Collaboration Layer: Augments visualizations with Git history, documentation, and annotations for real-time team collaboration and shared understanding.
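To make the Code-to-UML step concrete, here is a minimal sketch of how class structure could be recovered from Python source with the standard `ast` module and emitted as PlantUML text. It is an illustration under stated assumptions, not the project's actual implementation: the names `ClassInfo`, `extract_classes`, and `to_plantuml` are hypothetical, PlantUML is an assumed output format, and the `detail` flag is only a stand-in for the idea of multi-level abstraction. It requires Python 3.9 or later for `ast.unparse`.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class ClassInfo:
    """Structural facts about one class (hypothetical helper type)."""
    name: str
    bases: list[str] = field(default_factory=list)
    methods: list[str] = field(default_factory=list)


def extract_classes(source: str) -> list[ClassInfo]:
    """Parse Python source and collect every class with its bases and methods."""
    tree = ast.parse(source)
    classes = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            # ast.unparse turns a base expression (e.g. pkg.Base) back into text.
            bases = [ast.unparse(base) for base in node.bases]
            methods = [
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
            classes.append(ClassInfo(node.name, bases, methods))
    return classes


def to_plantuml(classes: list[ClassInfo], detail: bool = True) -> str:
    """Render classes as PlantUML; detail=False hides methods for a coarser view."""
    lines = ["@startuml"]
    for cls in classes:
        if detail and cls.methods:
            lines.append(f"class {cls.name} {{")
            lines.extend(f"  +{method}()" for method in cls.methods)
            lines.append("}")
        else:
            lines.append(f"class {cls.name}")
        for base in cls.bases:
            # PlantUML inheritance arrow: Parent <|-- Child
            lines.append(f"{base} <|-- {cls.name}")
    lines.append("@enduml")
    return "\n".join(lines)
```

Calling `to_plantuml(extract_classes(source))` on a module's text yields a diagram description that a renderer can draw. Passing `detail=False` collapses each class to a bare box, which is one simple way to offer a higher-level view of the same structure.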

Together, these components enable a closed-loop exploration experience where the user’s intent drives visualization updates and contextual reasoning. Our prototype supports Java and Python codebases and demonstrates promising results in adaptive, multi-level comprehension workflows.
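One way to picture the "structured GUI updates" that close this loop is as small JSON messages that the planner emits and the front-end validates before acting on them. The sketch below assumes such a message has an `action` and a `target` field. The schema, the set of allowed actions, and the names `GuiUpdate` and `parse_gui_update` are all hypothetical and chosen only for illustration; the project's real message format is not specified here.

```python
import json
from dataclasses import dataclass

# Hypothetical action vocabulary, loosely mirroring the visualization features
# listed above (zooming, filtering, overlays, drill-down).
ALLOWED_ACTIONS = {"zoom", "filter", "overlay", "drill_down"}


@dataclass(frozen=True)
class GuiUpdate:
    """A validated instruction for the visualization front-end."""
    action: str
    target: str


def parse_gui_update(reply: str) -> GuiUpdate:
    """Validate a model reply and convert it into a GuiUpdate.

    Raises ValueError for anything that is not a well-formed update, so that
    malformed model output never reaches the UI.
    """
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    action = data.get("action")
    target = data.get("target")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if not isinstance(target, str) or not target:
        raise ValueError("target must be a non-empty string")
    return GuiUpdate(action=action, target=target)
```

Restricting the model to a closed set of actions keeps the deterministic parts of the system in control: the LLM proposes a view change, and the front-end applies it only if it passes validation.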


Technologies

  • Programming Language: Python
  • Visualization: JavaScript / Web-based UI
  • LLMs: Claude, GPT-4, DeepSeek, LLaMA
  • Frameworks: LangChain, AutoGen

Our long-term goal is to develop next-generation developer tools that treat software understanding as a collaborative, visual, and AI-assisted process.


Cover Image Credit: ChatGPT