Overview

The SafeSQLExecutor class provides secure, read-only database access from HyperFlow Python code using SQLAlchemy as the underlying database connection framework. It validates SQL queries for safety, prevents injection attacks, enforces table access restrictions, and returns JSON-serializable results optimized for data processing workflows.

Built on SQLAlchemy, SafeSQLExecutor accepts standard SQLAlchemy connection strings and supports any database that SQLAlchemy supports (PostgreSQL, MySQL, SQLite, Oracle, SQL Server, etc.). All queries are validated to ensure they are read-only SELECT or WITH statements, and table access is restricted to explicitly allowed tables for security.

SQLAlchemy Foundation

SafeSQLExecutor uses SQLAlchemy's create_engine() and connection management, which means:

Constructor Parameters

SafeSQLExecutor(
    connection_string: str,    # SQLAlchemy connection URL
    allowed_tables: List[str], # Tables that can be queried (required)
    read_only: bool = True,    # Enforce read-only queries (recommended)
    timeout: int = 30,         # Query timeout in seconds
    max_rows: int = 10000     # Maximum rows returned (auto-LIMIT added)
)

Connection String Examples

# PostgreSQL with psycopg2
conn = "postgresql+psycopg2://user:password@localhost:5432/database"

# MySQL with PyMySQL  
conn = "mysql+pymysql://user:password@localhost:3306/database"

# SQLite (local file)
conn = "sqlite:///path/to/database.db"

# SQL Server with pyodbc
conn = "mssql+pyodbc://user:password@server/database?driver=ODBC+Driver+17+for+SQL+Server"

Query Execution and Results

The execute_query() method returns a comprehensive JSON response structure that's fully JSON-serializable for use throughout HyperFlow:

Response Format

# Successful query response
{
    "status": "success",
    "summary": "Query executed successfully, returning 25 rows in 0.15 seconds.",
    "query_executed": "SELECT * FROM users WHERE active = :active LIMIT 10000",
    "results": {
        "columns": ["id", "name", "email", "created_at"],
        "rows": [
            {"id": 1, "name": "John Doe", "email": "[email protected]", "created_at": "2024-01-15"},
            {"id": 2, "name": "Jane Smith", "email": "[email protected]", "created_at": "2024-01-16"}
        ]
    },
    "metrics": {
        "row_count": 25,
        "execution_time_seconds": 0.15
    }
}

# Error response
{
    "status": "error", 
    "summary": "The query failed due to a validation error.",
    "original_query": "SELECT * FROM forbidden_table",
    "error_details": "Table 'forbidden_table' not in allowed tables"
}

Simplified Access Pattern

For convenient access to just the row data: