Overview

Background

Text-to-SQL is a common use-case for LLMs, especially useful for chatbots that work with structured data, such as CSV files or databases like Postgres, Snowflake, and Redshift. This method works by having the LLM convert a user’s question into an SQL query. For example:

A user queries: “How many customers are there in each US state?”
The LLM generates an SQL statement: SELECT state, COUNT(*) FROM customers GROUP BY state
The SQL command is executed on the database.
Results from the database are then displayed to the user.

An additional step is possible where the LLM can interpret the SQL results and provide a summary in plain English.

Text-to-SQL Risk

While Text-to-SQL is highly useful, its biggest risk is that attackers can misuse it to modify SQL queries, potentially leading to unauthorized access or data manipulation. The potential threats in Text-to-SQL systems include:

Database Manipulation: Attackers can craft prompts leading to SQL commands like INSERT, UPDATE, DELETE, DROP, or other forms of db manipulation. This might result in data corruption or loss.
Data Leakage: Attackers can form prompts that result in unauthorized access to sensitive, restricted data.
Sandbox Escaping: By crafting specific prompts, attackers might be able to run code on the host machine, sidestepping security protocols.
Denial of Service (DoS): Through specially designed prompts, attackers can generate SQL queries that overburden the system, causing severe slowdowns or crashes.

It’s important to note that long-running queries could also occur accidentally by legitimate users, which can significantly impact the user experience.

Mitigation

The policies in this category are designed to automatically inspect and review SQL code generated by LLMs, ensuring security and preventing risks. This includes:

Database Manipulation Prevention: Block any SQL command that could result in unauthorized data modification, including INSERT, UPDATE, DELETE, CREATE, DROP, and others.
Restrict Data Access: Access is limited to certain tables and columns using an allowlist or blocklist. This secures sensitive data within the database.
Prevent Script Execution: Block the execution of any non-SQL code, for example, scripts executed via the PL/Python extension. This step is crucial in preventing the running of harmful scripts.
DoS Prevention: Block SQL elements that could lead to long-running or resource-intensive queries, including excessive joins, recursive CTEs, making sure there’s a LIMIT clause, and so on.

Policies

Allowed Tables

Detects SQL operations on tables that are not within the limits set in the policy.

Restrcted Tables

Detects the generation of SQL statements with access to specific tables that are considered sensitive.

Load Limit

Detects SQL statements that are likely to cause significant system load and affect performance.

Read-Only Access

Detects any attempt to use SQL operations that require more than read-only access.

Get Started

Fundamentals

Policies

Observability

REST API

Enterprise

Release Notes

Background

Text-to-SQL Risk

Mitigation

Policies

Allowed Tables

Restrcted Tables

Load Limit

Read-Only Access

Get Started

Fundamentals

Policies

Observability

REST API

Enterprise

Release Notes

​Background

​Text-to-SQL Risk

​Mitigation

​Policies

Allowed Tables

Restrcted Tables

Load Limit

Read-Only Access

Background

Text-to-SQL Risk

Mitigation

Policies