Red Teaming is a strategy used in cybersecurity & military training.

A red team simulates adversaries , actions and tactics.

They test and improve the effectiveness of an organization’s defences.

Red teams employed to test the robustness , fairness and ethical boundaries of LLM systems.

MAIN TASK : Try to bypass safeguards of a given application.

OBJECTIVE : Find ways to make the LLM bot misbehave like returning an inappropriate or incorrect answers to the end users.

DEMO LLM APPLICATION:-

TESTING LINK : https://s172-29-71-14p8888.lab-aws-production.deeplearning.ai/notebooks/L2/L2_Red_teaming_LLMs.ipynb

image.png

BYPASSING SAFEGUARDS - TECHNIQUES:-

1. Exploiting Text Completion

image.png

2. Using Biased Prompts

image.png

3. Direct Prompt Injection

image.png

4. Gray Box Prompt Attacks

image.png

5. Prompt Probing (Advanced Technique)

image.png