FSE 2025 Preview 2: Workshops

Foundations of Software Engineering (FSE) is a great opportunity to connect with and learn from software engineering researchers. In addition to the education and tool demo tracks of the conference, we have several contributions to co-located workshops at FSE 2025 that will be presented later this week!

Towards Evidence-Based Tech Hiring Pipelines [SE 2030]

Chris Brown, Swanand Vaishampayan

  • 🔍 Problem: Software engineers are primarily hired through tech hiring pipelines, consisting of processes such as resume matching and technical interviews. However, we lack insights on how to effectively evaluate the capabilities of programmers.

  • 📊 Findings: We offer a roadmap to promote evidence-based hiring in the tech industry, grounded in insights from evidence-based decision making, and we discuss challenges and future research directions for incorporating contextual, experiential, and research-based evidence into tech hiring pipelines.

  • 💡 Implication: We provide implications for improving tech hiring processes, enhancing SE research on hiring, preventing toxicity and promoting diversity and inclusion in hiring processes, and promoting the adoption of evidence-based practices in software engineering practice.

The First International Workshop on Large Language Model-Oriented Empirical Research (LLanMER) [FSE WS]

Na Meng, Xiaoyin Wang, Chris Brown (Workshop Organizers)

  • 🔍 Problem: There is fast-growing interest in applying large language models (LLMs) in software practice and research. However, many questions remain regarding responsible, reliable, and reproducible empirical research with LLMs.

  • 💡 Workshop: Our workshop focuses on methodologies for conducting empirical research with LLMs. We aim to provide a venue to share ideas, discuss obstacles and challenges, brainstorm solutions, and establish collaborations in LLM-oriented empirical research.

Note

If you are attending FSE in Trondheim 🇳🇴, please join us for our workshop on Friday from 2:00-5:30pm in the Sirius room!

From Prompts to Properties: Rethinking LLM Code Generation with Property-Based Testing [LLanMER]

Dibyendu Brinto Bose

  • 🔍 Problem: Large language models are increasingly used for code generation; however, evaluating generated code is challenging, and unit testing-based techniques are insufficient.

  • 🧪 Study: We propose Property-Based Testing (PBT) as an alternative strategy and conduct a preliminary evaluation using two code generation models (StarCoder and CodeLlama) on two datasets (MBPP and HumanEval).

  • 📊 Findings: Our results show that unit test-based evaluations provide insights on correctness, while PBT can evaluate generated code against additional logical and structural constraints (see the sketch after this list).

  • 💡 Implication: We suggest that hybrid approaches combining unit and property-based testing can provide a more reliable evaluation of LLM-generated code.
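To make the contrast concrete, here is a minimal sketch (my illustration, not code from the paper) of how a property-based test differs from a unit test, using Python's Hypothesis library. The `llm_sort` function is a hypothetical stand-in for an LLM-generated solution:

```python
from collections import Counter
from hypothesis import given, strategies as st

# Hypothetical stand-in for an LLM-generated solution under evaluation.
def llm_sort(xs):
    return sorted(xs)

# A conventional unit test checks a single hand-picked input/output pair:
def test_unit_example():
    assert llm_sort([3, 1, 2]) == [1, 2, 3]

# A property-based test instead asserts logical and structural invariants
# over many randomly generated inputs (run with pytest):
@given(st.lists(st.integers()))
def test_sort_properties(xs):
    out = llm_sort(xs)
    assert len(out) == len(xs)                        # structural: length preserved
    assert Counter(out) == Counter(xs)                # structural: permutation of input
    assert all(a <= b for a, b in zip(out, out[1:]))  # logical: non-decreasing order
```

The unit test passes as long as one example works; the property-based test probes the same function with many generated inputs and checks constraints that any correct implementation must satisfy, which is the kind of additional signal PBT can add on top of unit testing.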

The Evolution of Information Seeking in Software Development: Understanding the Role and Impact of AI Assistants [HumanAISE]

Ebtesam Al Haque, Chris Brown, Thomas D. LaToza, Brittany Johnson

  • 🔍 Problem: Information seeking constitutes a considerable part of software development. However, we lack insights on the impact of AI-assisted tools on software engineers' information needs and information-seeking behaviors.

  • 🧪 Study: We conducted a mixed-methods study, surveying and interviewing software practitioners to understand how developers use AI tools for information seeking and the impact of AI tools on productivity and skill development.

  • 📊 Findings: Most participants reported using AI tools to support information seeking. We also observed challenges with AI-assisted information seeking, mixed perceptions of productivity, and positive impacts on expanding developer knowledge and skills.

  • 💡 Implication: We discuss the evolution of information-seeking behaviors in the age of AI, outlining productivity and learning trade-offs and motivating future directions for AI-assisted developer tools.

Benchmark Infrastructure for LLMs for Code (BI4LLMC)

I was also invited to participate in the International Workshop on Benchmark Infrastructure for LLMs for Code, which aims to devise solutions addressing the growing need for effective and rigorous evaluation of LLMs on coding-related tasks.
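One concrete example of the rigor such benchmark infrastructure has to support is the unbiased pass@k estimator introduced with HumanEval (Chen et al., 2021), which code-generation benchmarks commonly report. A minimal sketch of the standard estimator (my illustration, not a workshop artifact):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    # Unbiased pass@k (Chen et al., 2021): the probability that at least one
    # of k samples, drawn from n generations of which c are correct, passes.
    # Equivalent to 1 - C(n-c, k) / C(n, k), computed as a stable product.
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Sanity check: with k=1 the estimator reduces to the raw pass rate c/n.
assert abs(pass_at_k(n=20, c=5, k=1) - 0.25) < 1e-12
print(pass_at_k(n=20, c=5, k=10))  # chance at least 1 of 10 samples passes
```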

We hope the discussions at these workshops will spark new ideas, build community, and raise new questions for us to explore within the evolving landscape of software engineering practice and research.
