
How Do I Generate Redshift DDL for Existing Tables? A Quick Guide for Data Engineers

By Snowplow Team
October 1, 2024

When managing event data at scale using Snowplow pipelines in Amazon Redshift, it’s not uncommon to migrate or replicate data between clusters — perhaps to isolate production workloads, test new models, or establish dev/stage environments.

But when the target cluster doesn’t yet have a matching table, you’ll need to recreate it — which means regenerating the original CREATE TABLE DDL.

This post explains how to generate that DDL directly from Redshift using AWS-provided utilities — a valuable technique for Snowplow users and any Redshift practitioner working with complex schemas.

Q: Can Redshift generate the DDL for an existing table?

Yes. Redshift has no built-in equivalent of MySQL's SHOW CREATE TABLE, but AWS provides a well-known workaround: the v_generate_tbl_ddl view in its amazon-redshift-utils repository on GitHub.

Q: How do I set it up?

To use this utility, follow these steps:

1. Clone the AWS Redshift Utils repo

Visit: https://github.com/awslabs/amazon-redshift-utils

2. Create the admin.v_generate_tbl_ddl view

Navigate to:
src/AdminViews/v_generate_tbl_ddl.sql

Ensure you have a schema named admin in your Redshift cluster:

CREATE SCHEMA admin;

Then, execute the SQL script in that file to create the view. Because the view reads Redshift system catalogs (pg_class, pg_namespace, and related tables), you'll typically need a superuser or a similarly privileged account.
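If you'd rather script the installation than paste it into a SQL console, the two steps above can be sketched in Python. This is a sketch under assumptions: the script path matches the repo layout, and `cursor` stands in for a cursor from whichever Redshift connection library you use.

```python
# Sketch: assemble the statements needed to install the view, in order.
# Actually running them needs live cluster credentials, so execution is
# left as a comment; `cursor` below is a placeholder, not a real object.

def install_statements(script_path: str) -> list[str]:
    """Return the SQL to run: the admin schema first, then the view script."""
    with open(script_path) as f:
        view_sql = f.read()
    return ["CREATE SCHEMA IF NOT EXISTS admin;", view_sql]

# for stmt in install_statements("src/AdminViews/v_generate_tbl_ddl.sql"):
#     cursor.execute(stmt)  # cursor from your Redshift connection
```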

Q: How do I generate the DDL for a specific table?

Once the view is created, simply run:

SELECT ddl
FROM admin.v_generate_tbl_ddl
WHERE schemaname = 'your_schema'
  AND tablename = 'your_table';
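A small Python wrapper can parameterise this lookup and stitch the view's line-per-row output back into a single statement. A sketch (the query executor is omitted, and the helper names below are illustrative, not part of the AWS utility):

```python
# Sketch: build the DDL lookup and reassemble its multi-row result.
# The schema/table values are interpolated here purely for illustration;
# in real code, use your driver's parameter binding to avoid injection.

def build_ddl_query(schema: str, table: str) -> str:
    """Parameterised form of the admin.v_generate_tbl_ddl lookup."""
    return (
        "SELECT ddl FROM admin.v_generate_tbl_ddl "
        f"WHERE schemaname = '{schema}' AND tablename = '{table}' "
        "ORDER BY seq;"
    )

def stitch(rows: list[tuple]) -> str:
    """Join the one-column result rows into one CREATE TABLE statement."""
    return "\n".join(r[0] for r in rows)
```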

This returns the full CREATE TABLE statement. Note that the view emits the DDL as multiple rows (one per line, ordered by its seq column), so concatenate them in order. The output includes:

  • Columns and data types

  • Sort and distribution keys

  • Column compression encodings

Q: Can I use this with Snowplow-generated tables?

Absolutely. This is particularly useful for Snowplow users who:

  • Deploy to multiple Redshift clusters

  • Need to replicate Snowplow atomic and derived schemas

  • Are migrating between environments (e.g. staging to prod)

  • Want to audit or version control DDLs for compliance or reproducibility

Snowplow’s dbt models, for example, generate tables dynamically during model runs — so being able to extract their final structure can be helpful for documentation or backup purposes.
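For the version-control use case, dumping one .sql file per table can be sketched as below. The fetch_ddl callable is a placeholder for the admin.v_generate_tbl_ddl query shown earlier, and the file layout is an assumption, not a Snowplow convention:

```python
# Sketch: write a batch of (schema, table) DDLs to per-table .sql files
# so they can be committed to version control. `fetch_ddl` is a stand-in
# for whatever runs the admin.v_generate_tbl_ddl query on your cluster.
from pathlib import Path

def ddl_path(out_dir: str, schema: str, table: str) -> Path:
    """One file per table, e.g. ddl/atomic.events.sql."""
    return Path(out_dir) / f"{schema}.{table}.sql"

def dump_ddls(pairs, fetch_ddl, out_dir="ddl"):
    Path(out_dir).mkdir(exist_ok=True)
    for schema, table in pairs:
        ddl_path(out_dir, schema, table).write_text(fetch_ddl(schema, table))
```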

Q: Is this better than using pg_table_def?

Yes. pg_table_def only lists per-column metadata (names, types, encodings, and sort/dist key flags), only covers schemas on your search_path, and doesn't assemble a runnable statement. v_generate_tbl_ddl produces the complete DDL, including constraints, encodings, and sort/dist keys, making it suitable for full migrations.

Q: Any limitations or caveats?

  • You must have sufficient privileges to access system tables (pg_class, pg_namespace, etc.)

  • Some features (e.g. defaults, constraints) may be only partially reproduced in the generated DDL, depending on your Redshift version and the version of the script

  • The utility doesn’t output GRANT statements or COMMENT ON clauses — you’ll need to handle permissions separately
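Because GRANTs are missing from the output, you may want to re-apply them from a list maintained alongside the DDL. A minimal sketch, assuming you track the desired privileges yourself rather than reading them from the cluster (the user names and privileges here are illustrative):

```python
# Sketch: rebuild GRANT statements from a hand-maintained mapping of
# user -> privileges, since v_generate_tbl_ddl does not emit them.

def grant_statements(schema: str, table: str, grants: dict) -> list:
    """e.g. {'analyst': ['SELECT']} -> ['GRANT SELECT ON s.t TO analyst;']"""
    return [
        f"GRANT {', '.join(privs)} ON {schema}.{table} TO {user};"
        for user, privs in grants.items()
    ]
```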

Final Thoughts

For Snowplow users managing structured event data in Redshift, the ability to programmatically extract CREATE TABLE DDLs simplifies migration, recovery, and multi-environment consistency. AWS’s v_generate_tbl_ddl view makes this straightforward, and is a must-have in any Redshift-based analytics toolkit.

Snowplow pipelines generate powerful, schema-rich data — this method helps you keep that structure portable and auditable across Redshift environments.
