Amplitude manages the entire Extract, Transform, and Load (ETL) process for data loaded into your Redshift cluster, so your analysts avoid recurring maintenance headaches and can focus on writing queries and exploring your data. Some of the optimizations Amplitude performs as part of this process are:
- Data Vacuuming: Amplitude cleans up deleted data and reorganizes the remaining data into the right order so that queries are much faster.
- Data Compression: Amplitude applies a compression algorithm optimized for each column, which also makes queries much faster.
- Non-UTF-8 Handling: Amplitude applies special formatting to non-UTF-8 characters so that uploads do not fail.
- Field Truncation: If a property value is too long, Amplitude truncates it and makes sure the result is still valid JSON so that uploads do not fail.
NOTE: The optimizations are performed on Amplitude data only. Amplitude will not perform maintenance on your Redshift cluster.
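To make the non-UTF-8 handling and field truncation items more concrete, here is a minimal Python sketch of how such steps could work. It is not Amplitude's actual implementation: the function names, the `MAX_BYTES` limit, and the choice to replace invalid bytes with U+FFFD are all assumptions made for illustration.

```python
import json

# Hypothetical limit for illustration; the real column width is not stated here.
MAX_BYTES = 1024


def sanitize_text(raw: bytes) -> str:
    """Decode bytes as UTF-8, replacing invalid sequences with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a partial multi-byte character left at the cut point.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def truncate_properties(props: dict, max_bytes: int = MAX_BYTES) -> str:
    """Truncate long string values, then serialize so the output is valid JSON."""
    trimmed = {
        key: truncate_utf8(value, max_bytes) if isinstance(value, str) else value
        for key, value in props.items()
    }
    return json.dumps(trimmed, ensure_ascii=False)
```

The design point in this sketch is that each value is truncated before serialization. Cutting the already-serialized JSON string at a byte offset could remove a closing quote or brace and leave a document that no longer parses.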