LLM Prompt Security Primer Workshop
Often, a hacker knows more about a system than its creator. That’s no different with large-language models. They have impressive capabilities, yet even their creators have a limited understanding of their insides. This is a hacker’s paradise. We’ll explore this new field from the ground up. First, we’ll check how an LLM-integrated application works, from models and prompts to completions and tokens. Next, we’ll attack this app, extract, inject, and jailbreak it. Finally, we’ll build some defenses. And as a bonus, we’ll break those as well.