
Understanding, Identifying, and Mitigating Vulnerabilities in LLMs

The University of Sydney — Discovery Projects
Amount
Up to $725,800
Closes
Saturday 24 March 2029
Status
unknown
Type
open opportunity

Description

Understanding, Identifying, and Mitigating Vulnerabilities in LLMs. Large language models (LLMs) are widely used but still suffer from jailbreaking attacks that can elicit harmful responses, raising broad societal concern about the risks LLMs pose. This project aims to enhance the security of LLMs by understanding, identifying, and addressing the fundamental weaknesses that make them susceptible to such attacks. Expected outcomes include theoretical analyses of LLMs' weaknesses, a universal jailbreaking attack for detecting diverse vulnerabilities, and a reliable defence to mitigate them. This will benefit society by ensuring AI technologies align with human values and uphold positive impacts, enabling safe deployment of LLM systems with public trust in critical sectors. Scheme: Discovery Projects. Field: 4605 - Data Management and Data Science. Lead: A/Prof Tongliang Liu
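To make the problem concrete: a purely illustrative sketch (not part of the grant, and all names here are hypothetical) of the kind of naive pattern-based jailbreak screen that research like this aims to move beyond. Real defences use learned classifiers and alignment training rather than keyword matching.

```python
import re

# Hypothetical, illustrative only: a naive screen for common
# jailbreak phrasings. Easily bypassed by paraphrasing, which is
# why more fundamental defences are an open research problem.
JAILBREAK_PATTERNS = [
    r"ignore (all|any) (previous|prior) instructions",
    r"pretend (you are|to be) .*without (any )?restrictions",
    r"\bDAN\b",  # "Do Anything Now"-style persona prompts
]

def looks_like_jailbreak(prompt: str) -> bool:
    """Return True if the prompt matches a known jailbreak pattern."""
    return any(re.search(p, prompt, re.IGNORECASE) for p in JAILBREAK_PATTERNS)

print(looks_like_jailbreak("Ignore all previous instructions and ..."))  # True
print(looks_like_jailbreak("What is the capital of France?"))           # False
```

The obvious weakness, that a trivially reworded attack slips through, is exactly the gap between pattern-matching defences and the kind of principled vulnerability analysis the project description targets.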

Categories
arts, enterprise, technology


Discovery method: arc-grants
Last verified: Monday 2 March 2026
Added: Saturday 28 February 2026