Exporting Data with S3
Last updated Dec 2nd, 2024
Overview
You can export contact, organization and activity data from Common Room into your data warehouse via our S3 integration. Once you've set up the S3 bucket using the instructions here, our team will work with you to finalize the data contract and set up a regular export.
Sample data contract
Activity Data
Folder name format: data-export/Activity/YYYY/MM/DD/
Default frequency: new activities from the past day are exported daily. Upon request, we can also trigger a one-time export of all activities.
Field name | Data type | Description |
---|---|---|
activity_timestamp | datetime | Timestamp of the activity |
service_name | string | Signal associated with the activity |
activity_type | string | Type of activity (e.g. slack_post_added) |
emails | string \| null | Comma-separated list of emails for activity contact |
primary_email | string \| null | Primary email for activity contact |
profiles | string \| null | Comma-separated list of social profiles for activity contact |
full_name | string \| null | Full name for activity contact |
first_activity_date | datetime \| null | First activity date for activity contact |
contact_token | string | Contact identifier associated with activity |
Contact Data
Folder name format: data-export/CommunityMember/YYYY/MM/DD/
Default frequency: all contact data is exported weekly. Upon request, we can also export new contacts from the past day on a daily basis.
Field name | Data type | Description |
---|---|---|
full_name | string \| null | Full name of contact |
primary_email | string \| null | Primary email of contact |
profiles | string \| null | Comma-separated list of social profiles for contact |
emails | string \| null | Comma-separated list of emails for contact |
first_activity_date | datetime \| null | Date of contact's first activity in Common Room |
first_activity_source | string \| null | Signal where contact was first active |
last_activity_date | datetime \| null | Date of contact's last activity in Common Room |
last_activity_source | string \| null | Signal where contact was last active |
location | string \| null | Location of contact (City, State, Country) |
organization_name | string \| null | Name of contact's organization |
organization_domain | string \| null | Domain of contact's organization |
job_title | string \| null | Job title of contact |
tags | string \| null | Comma-separated list of tags for contact |
segment_names | string \| null | Comma-separated list of segments contact is a part of |
contact_tokens | string | Comma-separated list of tokens representing the Contact |
<custom_fields> | Dependent on custom field type | An entry for each of the specified custom fields on the Contact |
Organization Data
Folder name format: data-export/Group/YYYY/MM/DD/
Default frequency: all organization data is exported weekly. Upon request, we can also export new organizations from the past day on a daily basis.
Field name | Data type | Description |
---|---|---|
organization_name | string | Name of organization |
organization_domain | string | Domain of organization |
location | string \| null | Location of organization (City, State, Country) |
contact_count | int \| null | Total number of Common Room contacts associated with organization |
employee_count | int \| null | Estimated employee count of organization |
approx_capital_raised | float \| null | Estimated capital raised by organization |
approx_revenue_min | float \| null | Estimated minimum revenue for organization |
approx_revenue_max | float \| null | Estimated maximum revenue for organization |
first_activity_date | datetime \| null | First date the organization was active in Common Room |
last_activity_date | datetime \| null | Last date the organization was active in Common Room |
first_activity_source | string \| null | Signal where the organization was first active in Common Room |
last_activity_source | string \| null | Signal where the organization was last active in Common Room |
tags | string \| null | Comma-separated list of tags for an organization |
segment_names | string \| null | Comma-separated list of segments that organization is a part of |
<custom_fields> | Dependent on custom field type | An entry for each of the specified custom fields on the Organization |
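
The folder name formats above can be turned into S3 key prefixes with a small helper. The following is a minimal sketch, not part of the Common Room tooling: the function name `export_prefix` and the `EXPORT_FOLDERS` mapping are hypothetical, and it assumes the month and day are zero-padded, as the YYYY/MM/DD pattern suggests.

```python
from datetime import date

# Folder names as listed in the sample data contract above.
EXPORT_FOLDERS = {
    "activity": "Activity",
    "contact": "CommunityMember",
    "organization": "Group",
}


def export_prefix(entity: str, day: date) -> str:
    """Return the S3 key prefix for one entity type on one export day."""
    folder = EXPORT_FOLDERS[entity]
    # Assumption: month and day are zero-padded (YYYY/MM/DD).
    return f"data-export/{folder}/{day:%Y/%m/%d}/"


print(export_prefix("activity", date(2024, 12, 2)))
# prints "data-export/Activity/2024/12/02/"
```

Keeping the folder names in one mapping means a change to the data contract only needs to be reflected in one place.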
Details
- Files are exported as JSONL by default, unless you request CSV.
- If you choose CSV instead of JSONL, null values are transmitted as empty strings.
- When an export finishes, we write a marker file named _completion_marker_ to the same directory so that you know the data is ready to consume.
- Common Room custom fields map to the following data types, each of which can be nullable:
  - Yes/No - boolean
  - Date - datetime
  - Number - int
  - Single-select - string
  - Multi-select - string
  - Text - string
  - URL - string
- For customer-provided data (e.g. data coming in via your CRM, Census, etc.), we preserve the formatting of the original data signal, so please make sure you handle it appropriately on your end when mapping the data.
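
On the consuming side, the details above suggest three small steps: wait for the completion marker, parse the files, and restore nulls when reading CSV. Here is a rough sketch under stated assumptions. The function names are hypothetical, the list of object keys is assumed to come from whatever S3 client you already use, and the marker name is copied from the text above, so confirm the exact name in your bucket.

```python
import csv
import io
import json

# Marker file name as given above; verify it against your bucket.
COMPLETION_MARKER = "_completion_marker_"


def is_export_ready(keys, prefix):
    """Return True if the completion marker sits directly under `prefix`."""
    return prefix + COMPLETION_MARKER in keys


def parse_jsonl(text):
    """Parse JSONL text: one JSON object per non-empty line."""
    # Split on "\n" only, so unusual line separators inside strings are kept.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def parse_csv(text):
    """Parse CSV text, turning empty strings back into None.

    A value that was a genuinely empty string cannot be told apart from null.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [
        {name: (value if value != "" else None) for name, value in row.items()}
        for row in reader
    ]
```

Checking for the marker before reading avoids consuming a directory that is still being written.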
FAQ
Why does the data model not expose a unique ID for contacts or organizations?
- Contact and organization data is dynamic across our data sources and enrichment providers, because our algorithm constantly looks for ways to improve data quality, such as merging and unmerging contacts. Supporting merges and unmerges requires us to preserve each contact's historical state.
- To help avoid data consistency issues, we expose only the latest snapshot of the data and do not expose unique identifiers to external systems.
What are some options for joining across the data sets?
- Recommended: Use the contact_token field on activities to match to a token in the contact_tokens field of a contact.
- Generate a primary key on your end that combines a few of our export fields (e.g. full name, email, profiles) to create an ID, with the understanding that conflicts would need to be handled.
- Export a subset of data that has a stable set of identifiers (e.g. using our hasEmails filter) and use that as your primary key.
- Work with the Common Room team to pass in a known identifier from your system into Common Room that can then be included in exports to help with matching
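
The first two options above can be sketched in a few lines. This is an illustrative example only: the helper names, the sample records, and the token values are made up, and stripping whitespace around commas is an assumption about how the comma-separated fields are delimited.

```python
import hashlib


def split_list(value):
    """Split a comma-separated export field; None or empty gives []."""
    if not value:
        return []
    # Assumption: items may carry whitespace around the commas.
    return [item.strip() for item in value.split(",") if item.strip()]


def index_contacts_by_token(contacts):
    """Map every token in contact_tokens to its contact record."""
    index = {}
    for contact in contacts:
        for token in split_list(contact.get("contact_tokens")):
            index[token] = contact
    return index


def attach_contacts(activities, contacts):
    """Pair each activity with its matching contact, or None if unmatched."""
    index = index_contacts_by_token(contacts)
    return [(activity, index.get(activity["contact_token"])) for activity in activities]


def fallback_key(record):
    """Hypothetical composite key built from a few export fields."""
    fields = ("full_name", "primary_email", "profiles")
    parts = [(record.get(name) or "").strip().lower() for name in fields]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Made-up sample records for illustration only.
contacts = [{
    "full_name": "Ada Lovelace",
    "primary_email": "ada@example.com",
    "profiles": None,
    "contact_tokens": "tok_1, tok_2",
}]
activities = [
    {"activity_type": "slack_post_added", "contact_token": "tok_2"},
    {"activity_type": "slack_post_added", "contact_token": "tok_9"},
]
pairs = attach_contacts(activities, contacts)
```

Because contacts can be merged and unmerged between exports, it is safer to rebuild the token index from the latest contact snapshot than to cache it. The composite key will change whenever any of its input fields change, which is the kind of conflict the second option warns about.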