Replicate data with Airbyte
Learn how to replicate data from Neon with Airbyte
Neon's logical replication feature allows you to replicate data from your Neon Postgres database to external destinations.
Airbyte is an open-source data integration platform that moves data from a source to a destination system. Airbyte offers a large library of connectors for various data sources and destinations.
In this guide, you will learn how to define your Neon Postgres database as a data source in Airbyte so that you can stream data to one or more of Airbyte's supported destinations.
Prerequisites
Enable logical replication in Neon
important
Enabling logical replication modifies the Postgres wal_level
configuration parameter, changing it from replica
to logical
for all databases in your Neon project. Once the wal_level
setting is changed to logical
, it cannot be reverted. Enabling logical replication also restarts all computes in your Neon project, meaning active connections will be dropped and have to reconnect.
To enable logical replication in Neon:
- Select your project in the Neon Console.
- On the Neon Dashboard, select Project settings.
- Select Beta.
- Click Enable to enable logical replication.
You can verify that logical replication is enabled by running the following query from the Neon SQL Editor:
Create a Postgres role for replication
It is recommended that you create a dedicated Postgres role for replicating data. The role must have the REPLICATION
privilege. The default Postgres role created with your Neon project and roles created using the Neon Console, CLI, or API are granted membership in the neon_superuser role, which has the required REPLICATION
privilege.
Grant schema access to your Postgres role
If your replication role does not own the schemas and tables you are replicating from, make sure to grant access. Run these commands for each schema:
Granting SELECT ON ALL TABLES IN SCHEMA
instead of naming the specific tables avoids having to add privileges later if you add tables to your publication.
Create a replication slot
Airbyte requires a dedicated replication slot. Only one source should be configured to use this replication slot.
Airbyte uses the pgoutput
plugin in Postgres for decoding WAL changes into a logical replication stream. To create a replication slot called airbyte_slot
that uses the pgoutput
plugin, run the following command on your database using your replication role:
airbyte_slot
is the name assigned to the replication slot. You will need to provide this name when you set up your Airbyte source.
important
To prevent storage bloat, Neon automatically removes inactive replication slots after a period of time if there are other active replication slots. If you have or intend on having more than one replication slot, please see Unused replication slots to learn more.
Create a publication
Perform the following steps for each table you want to replicate data from:
-
Add the replication identity (the method of distinguishing between rows) for each table you want to replicate:
In rare cases, if your tables use data types that support TOAST or have very large field values, consider using
REPLICA IDENTITY FULL
instead: -
Create the Postgres publication. Include all tables you want to replicate as part of the publication:
The publication name is customizable. Refer to the Postgres docs if you need to add or remove tables from your publication.
note
The Airbyte UI currently allows selecting any tables for Change Data Capture (CDC). If a table is selected that is not part of the publication, it will not be replicated even though it is selected. If a table is part of the publication but does not have a replication identity, the replication identity will be created automatically on the first run if the Postgres role you use with Airbyte has the necessary permissions.
Create a Postgres source in Airbyte
-
From your Airbyte Cloud account, select Sources from the left navigation bar, search for Postgres, and then create a new Postgres source.
-
Enter the connection details for your Neon database. You can get these details from your Neon connection string, which you'll find in the Connection Details widget on the Dashboard of your Neon project. For example, given a connection string like this:
Enter the details in the Airbyte Create a source dialog as shown below. Your values will differ.
- Host: ep-cool-darkness-123456.us-east-2.aws.neon.tech
- Port: 5432
- Database Name: dbname
- Username: alex
- Password: AbC123dEf
-
Under Optional fields, list the schemas you want to sync. Schema names are case-sensitive, and multiple schemas may be specified. By default,
public
is the only selected schema. -
Select an SSL mode. You will most frequently choose
require
orverify-ca
. Both of these options always require encryption. Theverify-ca
mode requires a certificate. Refer to Connect securely for information about the location of certificate files you can use with Neon. -
Under Advanced:
- Select Logical Replication (CDC) from available replication methods.
- In the Replication Slot field, enter the name of the replication slot you created previously:
airbyte_slot
. - In the Publication field, enter the name of the publication you created previously:
airbyte_publication
.
Allow inbound traffic
If you are on Airbyte Cloud, and you are using Neon's IP Allow feature to limit IP address that can connect to Neon, you will need to allow inbound traffic from Airbyte's IP addresses. You can find a list of IPs that need to be allowlisted in the Airbyte Security docs. For information about configuring allowed IPs in Neon, see Configure IP Allow.
Complete the source setup
To complete your source setup, click Set up source in the Airbyte UI. Airbyte will test the connection to your database. Once this succeeds, you've successfully configured an Airbyte Postgres source for your Neon database.
Configure a destination
To complete your data integration setup, you can now add one of Airbyte's many supported destinations, such as Snowflake, BigQuery, or Kafka, to name a few. After configuring a destination, you'll need to set up a connection between your Neon source database and your chosen destination. Refer to the Airbyte documentation for instructions:
References
- What is an ELT data pipeline?
- Logical replication - PostgreSQL documentation
- Publications - PostgreSQL documentation
Need help?
Join our Discord Server to ask questions or see what others are doing with Neon. Users on paid plans can open a support ticket from the console. For more detail, see Getting Support.
Last updated on