Importing data with S3
Last updated Aug 28th, 2024
Overview
You can import contact, organization, and custom product entity data from your data warehouse directly into Common Room via our S3 integration. Once you've set up the S3 bucket using the instructions here, our team will work with you to finalize the data contract and set up a regular import.
Details
When importing data to Common Room, there are a few important things to keep in mind.
- User data is keyed by customer email, with one record per customer
- Company data is keyed by a unique identifier (e.g. SFDC account ID), with one record per company
- Each company record has two non-nullable fields: primary domain and name
- The primary domain is what we use to match records to the Organizations in your community. It could be the actual domain of the customer’s company, the email domain of the billing admin or account owner, etc.
- Name is used to differentiate multiple records for the same primary domain (e.g. google.com could have Android and Google Maps as two different client records)
- Different datasets are written into different top-level locations
- E.g. s3://data/customers/…, s3://data/companies/...
- Data snapshots are written into date-based partitions
- E.g. data/customers/date=20211025 will contain the snapshot generated on 2021-10-25
- Common Room detects new partitions and always uses data from the latest partition only
- Each partition contains the entire snapshot of the dataset
- Once written, a partition should not change
- Files are one of:
- CSV/TSV (optionally gzipped)
- JSONL (optionally gzipped)
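
The layout rules above can be sketched in a few lines of Python. This is a minimal illustration, not Common Room's implementation: the function names and the field names `id`, `primary_domain`, and `name` are hypothetical, and the `data/<dataset>/date=YYYYMMDD/` prefix simply follows the example in this article. Your actual paths and field names are whatever you agree on in the data contract.

```python
import re
from datetime import date

# Matches a date-based partition segment such as "date=20211025" in an object key.
PARTITION_RE = re.compile(r"(?:^|/)date=(\d{8})(?:/|$)")

# Hypothetical field names; the real ones come from your data contract.
REQUIRED_COMPANY_FIELDS = ("id", "primary_domain", "name")


def partition_prefix(dataset: str, snapshot_date: date) -> str:
    """Build the prefix for one full snapshot, e.g. 'data/customers/date=20211025/'."""
    return f"data/{dataset}/date={snapshot_date:%Y%m%d}/"


def latest_partition(keys):
    """Return the newest YYYYMMDD partition found among object keys, or None."""
    found = set()
    for key in keys:
        match = PARTITION_RE.search(key)
        if match:
            found.add(match.group(1))
    # YYYYMMDD strings sort chronologically, so max() picks the latest snapshot.
    return max(found) if found else None


def missing_company_fields(record: dict) -> list:
    """List required fields that are absent, None, or blank in a company record."""
    missing = []
    for field in REQUIRED_COMPANY_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing
```

Because each partition holds the entire snapshot and only the latest one is used, a writer along these lines would create a new `date=` prefix for every export rather than appending to or editing an existing one.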