Redshift projects can use direct connect to reverse engineer entire schemas in seconds. Please refer to the instructions below to configure and perform a direct connection to your Redshift account.
Direct Connect
To establish a direct connection to Redshift, you will need the following details:
Server: e.g., account.region.redshift.amazonaws.com
Port: 5439 (default)
Database
Username
Password
Using these details, follow the highlighted steps to establish the connection (or get more details in the related articles at the end of this post):
Navigate to the Reverse Engineering screen
Click the "Connect to DW" button
Enter your connection details from the section above
Connect and select the schema to import
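Once connected, you can optionally verify that the credentials and database are correct before importing. The query below is a minimal sanity check you could run manually from the Redshift query editor or any SQL client using the same credentials; it is not something SqlDBM runs for you:

-- Optional manual check: confirms which database and user you are connected as,
-- and the Redshift version of the cluster.
SELECT current_database(), current_user, version();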
Obtaining Redshift parameters
To obtain the connection parameters for your Redshift instance, head to the "Provisioned clusters dashboard" page in the AWS console.
Click on the cluster you wish to connect to in order to see its related properties.
In the "Endpoint," you will find the connection string that contains the server, port, and database in the following format:
server:port/database
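For example, a hypothetical endpoint such as the one below maps onto the connection fields as follows (the cluster name, hash, region, and database are placeholders):

examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev
-- Server:   examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com
-- Port:     5439
-- Database: dev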
Required database permissions
To obtain DDL, SqlDBM direct connect runs several queries that reference metadata in the information_schema and pg_catalog. Access to the database schemas/catalogs and their contents is not required.
The connecting user will require read access to:
pg_catalog
information_schema
SELECT privilege on the table being imported
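To verify this metadata access ahead of time, you could run the queries below manually as the connecting user; both should complete without a permission error. This is an optional sanity check, not something SqlDBM executes:

-- Optional manual check: confirms read access to the catalogs SqlDBM queries.
SELECT count(*) FROM pg_catalog.pg_namespace;
SELECT count(*) FROM information_schema.tables;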
To reverse engineer from SqlDBM, the connecting user requires SELECT on the desired tables. This is due to the row-level security of information_schema.tables, as well as the need to execute the generated SQL that produces the table DDL (show table {tablename}), which is also granted via SELECT privileges. You can apply these privileges via the following commands:
GRANT SELECT ON TABLE {schema.table} TO {user};
or
GRANT SELECT ON ALL TABLES IN SCHEMA {schemaName} TO {user};
Without SELECT privileges on the desired table, the row will not appear in information_schema.tables, and you cannot run show table {tablename}.
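To confirm that a grant took effect, you could check the privilege directly with Redshift's HAS_TABLE_PRIVILEGE function; the user, schema, and table names below are placeholders:

-- Optional manual check: returns true if the user can SELECT the table
-- (and therefore, per the requirement above, run SHOW TABLE on it).
SELECT has_table_privilege('{user}', '{schema.table}', 'select');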
For reference, the following SQL is run to obtain the DDL:
--Get schema list
SELECT table_schema as schema_name
FROM information_schema.tables
WHERE table_schema not in ('pg_internal', 'pg_catalog', 'information_schema')
AND table_catalog = @database
GROUP BY 1
ORDER BY 1
;
--------------------
--Create schema ddl
SELECT
'CREATE SCHEMA ' + QUOTE_IDENT(nspname) +
CASE
WHEN nspowner > 100
THEN ' AUTHORIZATION ' + QUOTE_IDENT(pg_user.usename)
ELSE ''
END
+ ';' AS ddl
FROM pg_catalog.pg_namespace as pg_namespace
LEFT OUTER JOIN pg_catalog.pg_user pg_user
ON pg_namespace.nspowner=pg_user.usesysid
WHERE nspName = any(@schemas)
ORDER BY nspname
;
--------------------
--Create table ddl
SELECT 'SHOW TABLE ' + quote_ident(t.table_schema) + '.' + quote_ident(t.table_name) + ';' as sqlText
FROM information_schema.tables t
WHERE t.table_schema = any(@schemas)
AND t.table_type = 'BASE TABLE'
;
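The last query produces one SHOW TABLE statement per table in the selected schemas, which SqlDBM then runs to retrieve each table's CREATE TABLE definition. For a hypothetical schema and table, a generated statement would look like this:

-- Example output of the "Create table ddl" query (schema and table names are hypothetical):
SHOW TABLE analytics.orders;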
IP allowlisting and private networking
Corporate networking and security policies may interfere with connectivity between SqlDBM and Redshift. If you are unable to connect despite following the guidelines in this article, please make sure that IP blocking or private networking is not interfering.
An easy way to diagnose this type of issue is by attempting to connect to your Redshift account from a home computer, without using a VPN or any corporate software.
For instructions on allowing SqlDBM through the firewall or private network, please see the related article at the bottom of this page.