The Impala Cookbook from Cloudera, Inc.
- useful tips about performance and optimisation
Cloudera Impala technical deep dive from huguk
- partitioning: a few thousand partitions per table are OK, tens of thousands are too many
- data files within a partition should be hundreds of MB or larger
- Parquet: Snappy vs. gzip compression (Snappy is faster, gzip gives a better compression ratio)
- before Impala 1.2.1 it was necessary to place the biggest table first in a join query, but since that version Impala reorders the joined tables automatically
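
The partitioning rules of thumb above can be turned into a quick sanity check before choosing a partition key. This is only a sketch: the names `PartitionPlan` and `check_plan` are hypothetical, and the two thresholds (10,000 partitions, 100 MB per file) are assumed cut-offs for "tens of thousands" and "hundreds of MB", not limits enforced by Impala.

```python
from dataclasses import dataclass

# Assumed thresholds derived from the rules of thumb in the notes;
# they are illustrative, not values taken from Impala itself.
MAX_PARTITIONS = 10_000               # "tens of thousands are too many"
MIN_AVG_FILE_BYTES = 100 * 1024 ** 2  # "hundreds of MB or larger"


@dataclass
class PartitionPlan:
    """Hypothetical description of a planned partitioned table."""
    table_bytes: int              # total size of the table's data
    partition_count: int          # number of partitions the key would produce
    files_per_partition: int = 1  # data files written into each partition

    @property
    def avg_file_bytes(self) -> float:
        # Average data file size if data is spread evenly across partitions.
        return self.table_bytes / (self.partition_count * self.files_per_partition)


def check_plan(plan: PartitionPlan) -> list[str]:
    """Return a list of warnings; an empty list means the plan looks fine."""
    warnings = []
    if plan.partition_count >= MAX_PARTITIONS:
        warnings.append("too many partitions")
    if plan.avg_file_bytes < MIN_AVG_FILE_BYTES:
        warnings.append("files too small")
    return warnings


ONE_TIB = 1024 ** 4
# 1 TiB partitioned by day over 10 years vs. by hour over the same period.
daily = PartitionPlan(table_bytes=ONE_TIB, partition_count=3_650)
hourly = PartitionPlan(table_bytes=ONE_TIB, partition_count=87_600)

print(check_plan(daily))
print(check_plan(hourly))
```

With these assumed numbers, daily partitioning yields files of a few hundred MB each and passes both checks, while hourly partitioning trips both: a finer partition key multiplies the partition count and shrinks the files at the same time.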
Impala Architecture presentation from hadooparchbook
Impala 2.0 - The Best Analytic Database for Hadoop from Cloudera, Inc.
An introduction to Cloudera Impala from Semtech Solutions Ltd