Last week we at Aampe faced a simple yet interesting scale problem when using BigQuery.
The culprit: a simple inner join. We wrote a query that joins the table of users eligible to be messaged with the corresponding messages CMS table. The required output: for each user, all messages they are eligible to receive. Sounds simple? It is.
Running this query, we managed to break BigQuery with a relatively small dataset. Our users table was 30 MB and the messages table to be joined was 16 MB. But the inner join turned out to be an exploding join: every row in the users table matched every row in the messages table, because every contact was eligible to receive every message in a large CMS.
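The query had roughly this shape (table and column names here are illustrative, not our actual schema):

```sql
-- Hypothetical shape of the original query.
-- When eligibility is effectively "everyone matches everything",
-- the join key has almost no selectivity, so the result approaches
-- a full cross join: |users| x |messages| output rows.
SELECT
  u.user_id,
  m.message_id,
  m.message_body
FROM eligible_users AS u
INNER JOIN cms_messages AS m
  ON u.eligibility_segment = m.eligibility_segment;
```

Nothing about this is syntactically wrong; the problem is entirely in the data distribution of the join key.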
Such a join would run (eventually) on most databases; BigQuery, however, has guards against this behavior. Under the on-demand pricing model, BigQuery charges only for the data scanned per query and allocates resources according to the expected scan size. But it also enforces a maximum threshold on the ratio of CPU seconds consumed by a query to the bytes it scanned.
For our use case the data scanned was just 46 MB, so very few slots were assigned to the query and it was taking more than 3 hours to complete. We eventually hit the permitted CPU-to-bytes ratio and the query was cut off, which is when we noticed the error. This is deliberate on BigQuery's part: it protects shared resources in the on-demand model and makes customers aware of exactly this kind of scenario.
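You can look for queries with a suspiciously high CPU-to-scan ratio yourself using BigQuery's `INFORMATION_SCHEMA.JOBS_BY_PROJECT` view (the region qualifier and the one-day window below are illustrative choices):

```sql
-- Rank recent jobs by slot-seconds consumed per byte scanned.
-- A query at the top of this list is a candidate for the kind of
-- exploding join described in this post.
SELECT
  job_id,
  total_slot_ms / 1000 AS slot_seconds,
  total_bytes_processed,
  SAFE_DIVIDE(total_slot_ms / 1000, total_bytes_processed) AS cpu_per_byte
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND total_bytes_processed > 0
ORDER BY cpu_per_byte DESC
LIMIT 10;
```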
A simple solution would have been to move to the flat-rate pricing model and let BigQuery charge us for more slots. Instead, we went on a journey to work out how the query could be optimized so that this behavior is mitigated now and in the future. The biggest help here was the BigQuery execution plan, available for every query in the bottom toolbar.
It was clear from the graph that the join was exploding: it required several repartition stages, and the number of rows grew at each step, up to the order of 30 billion. That was a clear sign the join needed refactoring.
We handled this by optimizing the query in such a way that it:
- Pre-aggregated several computations before the join.
- Limited the number of unique keys going into the join from the messages side, so the join turned into a broadcast join: one input was now small enough to be copied to every worker instead of being repartitioned.
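A sketch of the reworked query, under the same hypothetical schema as before: the messages side is pre-aggregated to one row per join key, so the join output has one row per user instead of one row per user-message pair.

```sql
-- Collapse the messages side to a single row per join key before
-- joining. The join now produces |users| rows, and the aggregated
-- side is small enough to be broadcast.
WITH messages_per_segment AS (
  SELECT
    eligibility_segment,
    ARRAY_AGG(STRUCT(message_id, message_body)) AS messages
  FROM cms_messages
  GROUP BY eligibility_segment
)
SELECT
  u.user_id,
  m.messages  -- all eligible messages for this user, packed in one array
FROM eligible_users AS u
INNER JOIN messages_per_segment AS m
  ON u.eligibility_segment = m.eligibility_segment;
```

Downstream consumers can then `UNNEST` the array if they truly need one row per user-message pair, but the expensive shuffle happens on the compact representation.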
And voilà! The same query now ran end-to-end in just 3 minutes, a 99 percent reduction. We have since paid close attention to possible exploding joins when writing and optimizing queries.