The Taxonomy (SCHEMA) feature gives instrumentation managers complete control over their instrumentation and gives end users more confidence and clarity in their data. You can find Taxonomy features on the project configuration page (Manage Data > choose a project). Each project has its own SCHEMA configuration, and you can switch between projects using the "<All Projects" breadcrumb in the upper left-hand corner of the SCHEMA section.
Important Note: With the release of Govern, Taxonomy and Schema features are now in new locations. Please click here to familiarize yourself with Govern.
What you will learn in this article
- How to set up the schema
- Details of specific functionality of this feature
- How to work with Transformations
- User permission levels
Prerequisites
- Some Schema features are available to all customers on Scholarship, Growth, and Enterprise plans, while others, such as edits to the schema, are available only to customers who have purchased the Taxonomy add-on.
- The Schema features available for Starter plans include adding event descriptions, editing display names, and categorizing events. Starter plans can also download their event schema.
- Scholarship, Growth, and Enterprise plans can also upload images for events and can both export and import the schema as CSV.
- Growth and Enterprise customers who have purchased the Taxonomy add-on have access to Schema Initialization, Instrumentation Planning, data validation, and Transformations.
- The Taxonomy API, which can help you programmatically maintain your event schema, is also available to Scholarship, Growth, and Enterprise plans. See the article here for more details.
Table of Contents
- The Schema
- Initializing the Schema
- Editing the Schema
- Instrumentation Planning
- Schema Settings
- Import/Export Schema
- Transformations
- Permissions
The Schema
The Schema contains the definition of the events, event properties, and event properties’ values collected by Amplitude. This feature gives you the ability to categorize and describe your event data, as well as automate the QA process for newly ingested data by proactively surfacing or blocking unintended or malformed data. You can even use Schema as a whitelist of the event data you want Amplitude to collect.
Event data that conforms to the Schema you have created is collected as normal, while violations of the Schema trigger notifications and appear in the Schema violations view. This helps automate much of the data validation process when implementing new events or properties and can proactively surface corrupted or malformed event data.
Initializing the Schema
Customers with the Taxonomy add-on can initialize the Schema for the project. The button to do so is at the top right-hand side of Schema.
Initializing the Schema is your way of telling Amplitude to take your current set of ingested data and begin looking out for changes or discrepancies from that point forward. Each time you make an edit to the Schema and publish changes, Amplitude will look out for changes and discrepancies from the newest version.
Editing the Schema
Whenever a set of instrumentation changes is being made to Amplitude, we encourage you to also update your Schema. This ensures that Amplitude can correctly validate the new data and that you only receive email alerts for true violations.
As an Admin or Manager, click the "Edit Schema" button in the top right to make updates to your Schema.
In “Edit Mode”, there are several actions you can take:
- Categorize and describe live Events and Properties
- Add/plan Event Types
- Add an Event Property to an Event
- Change the Type of an Event Property
- Add data validations to Event Property
Create a category using the category selector drop-down. Note: Categories are case-insensitive, meaning you cannot have two categories whose names differ only in capitalization.
To add a description within the SCHEMA section under Manage Data, click the input field below the event name in the Events view. To add descriptions for event properties in the Events subtab, click "Show Event Properties" to see a dropdown of event properties. Event property descriptions created from this view are not global and apply only to that particular event. To create a global event property description, use the Event Properties subtab.
To upload an image within this view, just click "Add Image" on the far right.
Clicking on each event will bring you to an Event Segmentation chart looking at event totals for the selected event over the last 30 days.
When in a chart, the dropdown is ordered by the categories you have created for each event. Your descriptions appear inline when you hover over an event or event property, so you can read more about it within the chart itself.
You can also add descriptions to events or event properties from directly within a chart. This will allow you to do so without having to navigate away from your chart into the Taxonomy tab. Images can also be uploaded to events from this view.
Events
The Events sub-tab will show each visible event that has been seen for a project. This tab allows you to edit descriptions, categorize, and configure various settings for each event type. These settings include modifying the visibility and activity and deleting/blocking event types.
- Note: We highly recommend reading our documentation here so that you fully understand each type of configuration and the consequences of making these modifications to your data before making any changes.
Aside from Name, Display Name, Activity, and Visibility, this detailed view also provides the following information about each event. You can also sort events by these metrics:
- 30 Day Queries: The number of times this event has been used on charts in the last 30 days. This will not take into account custom events.
- 30 Day Volume: The number of events Amplitude has received in the last 30 days for a particular event type.
- First Seen: The date this event type was first seen by Amplitude.
- Last Seen: The date this event type was last seen by Amplitude.
- Platform: All platforms this event has been seen on.
You can also toggle event properties in the view or show events that have not had a description added.
You can also block event types from your project under this view, which you can also do from the Project Settings page (Manage Data > Select the Project > Advanced).
- For more information about these features, see the documentation here. Since deleting data is permanent and irreversible, it is highly recommended that you fully understand this feature before using it.
Event Properties
The Event Properties are available under each Event in the Schema. There is an option to hide/show the event properties.
Event Property Types
The Schema introduces type checking for event property values. This means Amplitude can detect when event data comes in that does not match the specified type.
- Any: Any possible value
- Boolean: Values representing boolean states ("true", "false", "yes", "no", "0" and "1")
- Number: Numerical values (e.g. 12345)
- String: A string value
- One of..: One of a set of possible values. Also known as an enumerable. (e.g. property fruit is one of [apple, banana, strawberry])
- Regular Expression: Custom regex that can be used for pattern matching or more complex values. (e.g. property zip code must have pattern [0-9]{5} )
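The type list above can be read as a set of simple checks. The following is a minimal sketch of that idea in Python, not Amplitude's implementation: the function name, the type keys, and the choice to require the regex to match the whole value are all assumptions made for illustration.

```python
import re

# Hypothetical helper: mirrors the property types listed above,
# not Amplitude's internal implementation.
BOOLEAN_STRINGS = {"true", "false", "yes", "no", "0", "1"}


def matches_type(value, type_name, options=None):
    """Return True if value conforms to the named property type."""
    if type_name == "any":
        return True
    if type_name == "boolean":
        # Assumption: accept real booleans plus the string forms listed above.
        if isinstance(value, bool):
            return True
        return isinstance(value, str) and value in BOOLEAN_STRINGS
    if type_name == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "one_of":
        # options is the set of allowed values.
        return value in options
    if type_name == "regex":
        # Assumption: the whole value must match the pattern (options).
        return isinstance(value, str) and re.fullmatch(options, value) is not None
    raise ValueError(f"unknown property type: {type_name}")
```

For example, `matches_type("12345", "regex", r"[0-9]{5}")` passes under these assumptions, while a Number property that receives the string "12345" does not, which corresponds to the "Event property value does not match type" case.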
User Properties
The User Properties subtab displays all user properties Amplitude has seen for your project. You can edit the visibility of user properties here. Marking a user property as hidden will remove it from all dropdown menus. The table also shows the number of queries on a particular user property in the last 30 days, the first seen date, and the last seen date. You can also block and delete event types from your project under this view, which you can also do from the Project Settings page (Manage Data > Select the Project > Advanced).
- For more information about these features, see the documentation here. Since deleting data is permanent and irreversible, it is highly recommended that you fully understand this feature before using it.
Instrumentation Planning
Amplitude built the instrumentation planning feature in order to help you safeguard the quality of your production data. If event data not matching your validation rules is sent to Amplitude, we can block those events from entering as a preventive measure against poor data quality.
The Settings page for Schema allows you to configure what happens if unexpected event data is sent to Amplitude. After changing your Settings, click “Save” to publish them immediately.
Schema Settings
At the bottom of the Settings page is a text box where you can choose which members of your team will receive a daily email notification when there are violations of your Schema. Separate multiple email addresses with commas.
Currently, you can configure your settings for three different types of Schema violations:
An unplanned Event Type appears: In this scenario, Amplitude sees an event that is not part of your Schema, or that you did not previously plan. From the Settings page, whenever there is an unexpected Event you can:
- Accept the event and its properties and automatically add it to your Schema. No warnings or violations will be triggered. Note: All events, properties and values will be collected with this setting.
- Trigger a warning and hide the event. Amplitude will mark the event as Unexpected and mark it as hidden and inactive. You can then choose to approve the event and add it to Schema, or reject it and block its future collection.
- Reject the event and trigger a warning. Amplitude will not ingest the event at all. Note that any event data that is not ingested is not available in the future.
An unplanned Event Property appears: In this scenario, Amplitude sees an event property that is not part of your Schema, or that you did not previously plan. From the Settings page, whenever there is an unexpected event property you can:
- Accept the property and automatically add it to your Schema. No warnings or violations will be triggered.
- Note: All event property values will be collected and accepted with this setting.
- Trigger a warning. Amplitude will mark the property as Unexpected and flag it in the Schema. You can then choose to approve it and add it to Schema, or reject it and block its future collection.
- Reject the property and trigger a warning. Amplitude will ingest the event, but not the event property. Note that any event data that is not ingested is not available in the future.
An unplanned User Property value appears: In this scenario, Amplitude sees a user property value that is not part of your Schema, or that you did not previously plan. For example, a user property value is sent as a string but you expected a number. From the Settings page, whenever there is an unexpected user property value you can:
- Accept the property. No warnings or violations will be triggered.
- Trigger a warning. Amplitude will surface a warning that an unexpected user property value was collected.
- Reject the property and trigger a warning. Amplitude will ingest the event, but not the user property. Note that any event data that is not ingested is not available in the future.
If event data you instrument is considered unexpected, simply add it to your Schema via the Unexpected tab to stop triggering warnings in the future. If your event data is getting rejected and you want to begin collecting it, add the events or properties to your Schema by planning a new event or planning a new property.
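The three options for an unplanned Event Type can be summarized as a small decision function. This is only an illustrative sketch: the function name, the setting strings ("accept", "warn_and_hide", "reject"), and the returned dictionary are hypothetical and do not correspond to any Amplitude API.

```python
def handle_unplanned_event(event_type, schema, setting):
    """Decide what happens to one incoming event type.

    schema is a set of planned event type names; setting is one of the
    hypothetical strings "accept", "warn_and_hide", or "reject".
    """
    if event_type in schema:
        # Planned events are collected as normal.
        return {"ingested": True, "hidden": False, "warning": None}
    if setting == "accept":
        # Collected and added to the schema; no warning is triggered.
        schema.add(event_type)
        return {"ingested": True, "hidden": False, "warning": None}
    if setting == "warn_and_hide":
        # Collected, but marked Unexpected and hidden until approved or rejected.
        return {
            "ingested": True,
            "hidden": True,
            "warning": f"unexpected event type: {event_type}",
        }
    if setting == "reject":
        # Not ingested at all, so the data is not available later.
        return {
            "ingested": False,
            "hidden": False,
            "warning": f"rejected event type: {event_type}",
        }
    raise ValueError(f"unknown setting: {setting}")
```

The same three-way choice applies to unplanned event properties and user property values, except that a rejected property is dropped while the event itself is still ingested.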
At the bottom of the Settings page, you can also choose who will receive these notifications.
Validation errors are also listed in the Validation errors tab in the left-hand menu. For example, the error message "Event property value does not match type" appears if the expected type is Number and the value sent is a String.
Import/Export Schema
You can import and export your Schema by clicking on the icons at the top right of the page. This will allow you to change descriptions and edit categories.
You can also plan new event data via a CSV upload. To do so, download the schema, add the new events, and upload the schema again. Read more about this in the Instrumentation Planning section.
Bulk Approving or Blocking Events and Properties
To bulk approve or block Events or Properties, download the taxonomy CSV file, modify the Event Schema Status and Property Schema Status columns, and upload the modified CSV file.
The steps to complete a bulk approval or block are:
1. Download the taxonomy CSV.
2. Locate the Event Schema Status column. The available statuses are unexpected, added, and blocked.
- To approve the event or property, change this column to added.
- To block an event or property, change this column to blocked.
- To approve an unexpected event but only some of its event properties, change Event Schema Status to added, and then set each Property Schema Status to either added or blocked.
3. Upon uploading the new schema, you will see a preview of the changes, and clicking “Publish Changes” will commit these changes.
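For large schemas, the status edits in step 2 can be scripted before you upload the file. The sketch below uses Python's csv module and assumes a simplified layout with one row per event property. The "Event Schema Status" and "Property Schema Status" columns come from the steps above, but the "Event Type" and "Event Property Name" column names and the function itself are hypothetical, so check them against the header row of your downloaded file.

```python
import csv
import io


def approve_event(csv_text, event_name, blocked_properties=()):
    """Approve one event in the taxonomy CSV text and return the edited CSV.

    Assumed layout (hypothetical): one row per event property, with the
    event name repeated in an "Event Type" column and the property name
    in an "Event Property Name" column.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    for row in rows:
        if row["Event Type"] != event_name:
            continue
        if row["Event Schema Status"] == "unexpected":
            row["Event Schema Status"] = "added"
        prop = row["Event Property Name"]
        if prop:
            # Approve each property unless it was explicitly listed as blocked.
            row["Property Schema Status"] = (
                "blocked" if prop in blocked_properties else "added"
            )
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Calling `approve_event(text, "signup", blocked_properties={"debug_id"})` would mark the signup event as added, block its debug_id property, and approve its other properties. The upload preview in step 3 is still the place to confirm the result before publishing.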
Transformations
As an enterprise customer, you have the option to purchase our transformations package that allows you to transform event data in order to correct common implementation mistakes.
These transformations are applied at query-time when we generate the results to a query.
- Note: This does not affect the raw data. Raw data on Snowflake or Redshift will not be impacted by Transformations.
Transformations are retroactive, meaning they can be applied to all historical data and can be enabled or disabled at any point in time. This means you can make changes to your event data without having to touch your underlying code base. No matter when you recognize a mistake or want to make a change, you can use a transformation to correct all affected data, both historically and going forward.
Available Transformations
Currently, events, event properties, user properties, and event property values are supported in the following types of transformations:
- Merge Events: This transformation allows you to merge events together.
- Example: Transform the event 'comment_reply_like' and the event 'comment_share' into the event 'comment'.
- Merge Events into Event Property Values: This transformation allows you to merge events into event property values. You can choose to create a new event and add the old events as event property values or add the old events as event property values to an already existing event.
- Example: Transform the event 'comment_reply_like' and the event 'comment_share' into the event 'commentable_type' with the event property values 'reply like' and 'share'.
- Merge Event Properties: This transformation allows you to merge event properties for all events.
- Example: In some cases, an event property is called 'title' and in other cases, the event property is called 'TITLE' but they represent the same thing on all events. Transform 'title' and 'TITLE' into 'Title', combining the data.
- Merge User Properties: This transformation allows you to merge user properties for all events.
- Example: In some cases, a user property is called 'name' and in other cases, the user property is called 'NAME' but they represent the same thing on all users. Transform 'name' and 'NAME' into 'Name', combining the data.
- Rename Value: This transformation allows you to re-assign event property values on all events. This transformation is useful if an event property has a number of misspellings or nonsensical values in drop-downs and will allow you to hide them from the UI or turn them into another value.
- Example: You can reassign the values of 'true' and 'TRUE' to 'True'.
Creating a Transformation
To create a transformation, select the events or properties you want to act on from the list view. If you are only interested in transforming event properties, then you can utilize the Event Properties subtab. Then, click "Transform" and choose the transformation you want to create.
Clicking the "Transform" button will launch "Draft Mode", which allows you to preview the transformation before publishing it to the rest of your organization. You are able to create new charts and look at existing ones to ensure the change will have the desired effect.
- Note: While the transformation is not published to the rest of your organization, changes you make to existing charts, behavioral cohorts, settings, visibility of events, etc. are not staged and take effect immediately.
Once you have verified the effect of the transformation, you can click "Publish" to push the change live for everyone in your organization.
Manage Transformations
The Manage Transformations subtab allows you to view all the current transformations in the current project. It will show you the user that created the transformation, when it was created, and what the transformation entails. You can also choose to remove any existing transformations from this subtab.
Transformations in Queries
If you are running a query that uses multiple transformations, then they will be executed in the following order:
- Merge Events
- Merge Events into Event Property Values
- Merge Event Properties
- Rename Value
In general, transformations of transformations are not currently supported. For example, if you merge an event 'App Open' to 'Open App', then you cannot use 'Open App' as the source for another transformation.
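To make the query-time behavior concrete, here is a minimal Python sketch that applies three of the transformation types, in the order listed above, to copies of the raw events. It is an illustration under stated assumptions, not how Amplitude implements transformations: the mapping tables, the function name, and the `event_type` / `event_properties` field names are hypothetical, "Merge Events into Event Property Values" is omitted for brevity, and value renames are applied to every property rather than to one chosen property.

```python
import copy

# Hypothetical transformation tables, taken from the examples in this article.
MERGE_EVENTS = {"comment_reply_like": "comment", "comment_share": "comment"}
MERGE_EVENT_PROPERTIES = {"title": "Title", "TITLE": "Title"}
RENAME_VALUES = {"true": "True", "TRUE": "True"}


def transform_for_query(raw_events):
    """Return transformed copies of raw_events; the raw data is not modified."""
    results = []
    for raw in raw_events:
        event = copy.deepcopy(raw)

        # 1. Merge Events: a single lookup, so a merged name is never
        #    used as the source of another merge.
        name = event["event_type"]
        event["event_type"] = MERGE_EVENTS.get(name, name)

        # 2. Merge Events into Event Property Values is omitted in this sketch.

        # 3. Merge Event Properties: rename property keys.
        merged = {}
        for prop, value in event["event_properties"].items():
            merged[MERGE_EVENT_PROPERTIES.get(prop, prop)] = value

        # 4. Rename Value: re-assign string property values.
        for prop, value in merged.items():
            if isinstance(value, str):
                merged[prop] = RENAME_VALUES.get(value, value)

        event["event_properties"] = merged
        results.append(event)
    return results
```

Because the function works on deep copies, removing an entry from one of the tables restores the original results on the next query, which matches the idea that transformations can be enabled or disabled at any time without touching the raw data.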
Current Limitations
Transformations are not currently supported in the following chart types or features:
Permissions
Amplitude has four permission levels: Admin (A), Manager (M), Member (M), and Viewer (V). For more information on permission levels, see our Permissions documentation.
[AM--] Admins and Managers are able to:
- Describe and categorize events.
- Plan, approve, and reject events.
- Create and undo transformations.
[AMMV] All users can:
- See categories and descriptions.
- See planned and unexpected events.