โš™๏ธ Warehouses: Sizing and Threads

Choosing the right warehouse size is critical for performance and cost efficiency.

Snowflake offers warehouses in T-shirt sizes (X-Small to 6X-Large), but exact hardware specifications are not public.

Best practices for warehouse sizing:

  • Start with X-Small
  • Increase size until query durations stop halving. At that point, you've likely found the cost-effective size
  • There can be hidden costs and uptime, so avoid relying solely on runtime seconds and theory
    • Be cautious when increasing to XL sizes, for instance
  • Doubling warehouse size does not always halve query time

Made-up example for illustrating warehouse cost:

Warehouse SizeRuntimeCredits per minuteTotal Credits Consumed
Small (S)10 min110
Medium (M)5 min210

In this example, running a query on a Small warehouse takes 10 minutes and consumes 1 credit per minute, resulting in 10 total credits.
Running the same query on a Medium warehouse takes only 5 minutes but consumes 2 credits per minute, leading to the same total credit usage, with less time.
In this case, using the Medium warehouse offers the same cost and better runtime!

Key takeaway: Larger warehouses can reduce runtime without increasing total cost, as long as the performance improvement is proportional to the size increase.

โš  Applying this method to complex DAGs and pipelines can be more challenging than for single queries

Threads:
| Warehouse Size | Recommended threads | | --- | --- | | X-Small | 4โ€“8 | | Small | 8โ€“16 | | Medium or larger | 16+ (monitor for queuing) |

Increasing threads beyond a warehouse's concurrency limit causes query queuing, increasing total runtime and delaying DAG execution.

โš  Over-threading (too many dbt threads) can cause query queuing and may slow down execution rather than speeding it up

๐Ÿ“Š Use Snowflake's Query History view to check if your warehouse is fully utilized or if queries are queuing

Auto-suspend should be configured to 60 seconds or less to avoid paying for idle compute when the warehouse is not actively processing queries.