Some background When we started using Google BigQuery – almost five years ago now – it didn't have any partitioning functionality built into it. Heck, queries cost $20 p/TB back then too for goodness' sake! To compensate for this lack of functionality and to save costs, we had to manually shard our tables using the well known _YYYYMMDD suffix pattern just like everyone else. This works fine, but it's quite cumbersome, has some hard limits, and your SQL can quickly becomes unruly. Then about a year ago, the BigQuery team released ingestion time partitioning. This allowed users to partition tables based on the load/arrival time of the data, or by explicitly stating the partition to load the data into (using the $ syntax). By using the _PARTITIONTIME pseudo-column, users were more easily able to craft their SQL, and save costs by only addressing the necessary partition(s). It was a major milestone for the BigQuery engineering team, and we were quick to adopt it into our data pipelines. We rejoiced and gave each other a lot of high-fives.


I guess you came to this post by searching similar kind of issues in any of the search engine and hope that this resolved your problem. If you find this tips useful, just drop a line below and share the link to others and who knows they might find it useful too.

Stay tuned to my blogtwitter or facebook to read more articles, tutorials, news, tips & tricks on various technology fields. Also Subscribe to our Newsletter with your Email ID to keep you updated on latest posts. We will send newsletter to your registered email address. We will not share your email address to anybody as we respect privacy.


This article is related to

Big Data,Cloud,databases,Javascript,Node.js,BigQuery,Google,Google BigQuery,Google Cloud,google cloud dataflow,Google Cloud Platform,Software Engineering,SQL