Introduction: Why Code Readability Still Matters in the Age of LLM Agents

In 2025, the buzz around Large Language Models (LLMs) and Agentic AI for Software Engineering is everywhere. From SWE chatbots (e.g., RovoDev CLI) to autonomous workflow agents (e.g., RovoDev Coding Agent), these technologies are transforming how we build and maintain software. But as LLM-powered agents like Atlassian’s RovoDev start writing more of our code, a timely question emerges: Does code readability still matter when agents like RovoDev are doing the heavy lifting?

Atlassian’s latest research, “Code Readability in the Age of Large Language Models: An Industrial Case Study from Atlassian,” dives deep into this question. The paper was accepted at the 41st International Conference on Software Maintenance and Evolution (ICSME’25, Auckland, New Zealand), a prestigious conference in software engineering. The research was conducted in collaboration with Associate Professor Kla Tantithamthavorn from Monash University. The findings are both surprising and reassuring for anyone worried about a future where humans and AI agents collaborate on code.


Meet RovoDev Coding Agent

RovoDev Coding Agent is Atlassian’s answer to collaborative, AI-assisted software development. Instead of aiming for full automation, RovoDev is designed to work alongside engineers—embedded in Jira, guiding tasks from planning to coding, always keeping a human in the loop for critical decisions.

How RovoDev fits into your workflow:

  1. Set Context: Select a Jira issue and code repo.
  2. Planning: The agent drafts a coding plan, which you can refine.
  3. Coding: RovoDev generates code, validated by tools and human review.
  4. Pull Request: Code is submitted for standard team review.

Why Investigate Code Readability in the Age of LLMs?

Readable code is the backbone of maintainable software. It’s easier to review, debug, and extend—especially in collaborative environments. As Guido van Rossum, creator of Python, famously said: “Code is read more often than it is written.”

But with LLMs generating more code, we wanted to know:

  • RQ1: How and why is code readability important in the age of LLMs?
  • RQ2: How readable is LLM-generated code compared to human-written code in practice?

What Developers Think: Survey Insights

We surveyed 118 practitioners across Atlassian and the industry. Here’s what we learned:

  • 81% say code readability is still crucial, even with LLMs in the mix.
  • The top motivation? Reducing long-term maintenance costs.
  • Time constraints are the biggest barrier to improving readability.
  • 72% are open to using LLMs to help improve code readability.
  • 39% believe LLM-generated code is more readable than human code; 34% say it’s about the same.
Figure 1: Survey results on the importance, benefits, and challenges of code readability in the age of large language models (LLMs).

Human vs. LLM-Generated Code: What the Data Shows

To move beyond opinions, Atlassian ran an empirical study comparing the readability of human-written code and RovoDev-generated code (powered by GPT-4) across 144 real Jira issues and 250 files in six programming languages.

How was readability measured?

  • Lines of Code
  • Comment Ratio
  • Cyclomatic Complexity
  • Maintainability Index
  • Halstead Metrics (Difficulty, Vocabulary, Volume, Time Required)
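
To make these metrics concrete, here is a minimal sketch that computes rough versions of them for a single Python snippet using only the standard library. It is not the tooling used in the study, which the post does not describe. The function name `readability_metrics`, the `SAMPLE` snippet, and the token classification rules (keywords and punctuation count as operators; names, numbers, and strings count as operands) are assumptions made for illustration. The Halstead and Maintainability Index formulas are the classic textbook definitions; production tools differ in details.

```python
import ast
import io
import keyword
import math
import tokenize


def readability_metrics(source: str) -> dict:
    """Compute rough readability metrics for one Python source string."""
    # Lines of code: non-blank physical lines (comment lines included).
    loc = sum(1 for line in source.splitlines() if line.strip())

    operators, operands = [], []
    comment_lines = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comment_lines.add(tok.start[0])
        elif tok.type == tokenize.OP or (
            tok.type == tokenize.NAME and keyword.iskeyword(tok.string)
        ):
            # Assumption: punctuation and keywords are Halstead operators.
            operators.append(tok.string)
        elif tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.STRING):
            # Assumption: identifiers and literals are Halstead operands.
            operands.append(tok.string)

    # Cyclomatic complexity (approximate): 1 + number of decision points.
    decisions = 0
    branch_nodes = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                    ast.IfExp, ast.comprehension)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, branch_nodes):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a or b or c" adds two extra paths.
            decisions += len(node.values) - 1
    complexity = 1 + decisions

    # Halstead metrics from distinct (n1, n2) and total (N1, N2) counts.
    n1, n2 = len(set(operators)), len(set(operands))
    N1, N2 = len(operators), len(operands)
    vocabulary = n1 + n2
    volume = (N1 + N2) * math.log2(vocabulary) if vocabulary > 1 else 0.0
    difficulty = (n1 / 2) * (N2 / n2) if n2 else 0.0
    time_seconds = difficulty * volume / 18  # effort / Stroud number

    # Classic Maintainability Index; guard the logarithms for tiny inputs.
    mi = (171
          - 5.2 * math.log(max(volume, 1.0))
          - 0.23 * complexity
          - 16.2 * math.log(max(loc, 1)))

    return {
        "loc": loc,
        "comment_ratio": len(comment_lines) / loc if loc else 0.0,
        "cyclomatic_complexity": complexity,
        "halstead_vocabulary": vocabulary,
        "halstead_volume": volume,
        "halstead_difficulty": difficulty,
        "halstead_time_seconds": time_seconds,
        "maintainability_index": mi,
    }


# Hypothetical snippet used only to exercise the function.
SAMPLE = '''\
# Return the largest value, or None for an empty list.
def largest(values):
    best = None
    for v in values:
        if best is None or v > best:
            best = v
    return best
'''

if __name__ == "__main__":
    for name, value in readability_metrics(SAMPLE).items():
        print(f"{name}: {value}")
```

For the sample above, the sketch counts 7 lines of code, one comment line, and a cyclomatic complexity of 4 (one `for`, one `if`, one `or`, plus the base path). A cross-language study like the one described here would need language-specific parsers for Java, Kotlin, Go, Scala, and TypeScript.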

Key findings: LLM Agents Can Write Readable Code

  • RovoDev-generated code is as readable as human-written code across most metrics.
  • In Java, Kotlin, Go, and Scala, the differences are negligible.
  • In TypeScript and Python, RovoDev code is slightly longer and a bit less maintainable, but the effect is small.
  • No significant difference in code complexity or comment quality.
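
Labels such as "negligible" and "small" are typically attached to an effect size. The post does not say which statistic was used, so the following is only a sketch of one common non-parametric choice, Cliff's delta, with the conventional magnitude thresholds (0.147, 0.33, 0.474). The function names and the sample numbers are made up for illustration and are not data from the study.

```python
from itertools import product


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y)."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


def magnitude(delta):
    """Map |delta| to the conventional qualitative label."""
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"


# Made-up lines-of-code values per file, NOT results from the study.
human_loc = [10, 12, 14, 16]
agent_loc = [11, 13, 15, 17]

if __name__ == "__main__":
    delta = cliffs_delta(agent_loc, human_loc)
    print(delta, magnitude(delta))  # prints "0.25 small"
```

A positive delta means the first sample tends to have larger values. In this toy data, the agent files are slightly longer, and the difference falls in the "small" band.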

Bottom line: LLM Agents like RovoDev can produce code that teams can trust and maintain—key for scaling SWE Agents in the enterprise.

Figure 2: A comparison of various code readability measures between human-written code and RovoDev-generated code.

Key Takeaways for Developers

  • Readability Remains Critical: The majority of developers believe code readability is still essential, even as LLMs and AI agents generate more code. Readable code is seen as key for maintainability, collaboration, and long-term project health.
  • Trust and Maintainability: Developers emphasize that readable code builds trust—both in the codebase and in LLM-generated code. It makes it easier to review, debug, and extend code, regardless of whether it was written by a human or an LLM.
  • Barriers and Opportunities: Time constraints are the biggest barrier to improving readability. However, many developers are optimistic about using LLMs to help refactor or improve code clarity.
  • Developers’ Views on LLM-Generated Code: Developer opinions vary:
    • Some believe LLM-generated code is as readable as, or even more readable than, code written by humans, particularly for well-defined tasks.
    • Others remain cautious, pointing out that LLMs can occasionally produce verbose or inconsistent code, highlighting the importance of human oversight and review.
  • The Readability of RovoDev-Generated Code vs Human-Written Code: Across most metrics, RovoDev-generated code demonstrates readability on par with human-written code. These findings help build the appropriate trust needed to encourage widespread adoption of our RovoDev Coding Agent.

Final Thoughts

Atlassian’s RovoDev is a glimpse into the future of software development, one where LLM Agents work hand-in-hand with people. The research is clear: readable code is here to stay, and with the right approach, LLMs can help us write it.

Curious to learn more? Check out the original RovoDev blog and Atlassian’s RovoDev CLI product.
