Some Things to Know About BigQuery Editions

Brian Suk
Google Cloud - Community
Apr 3, 2023

All right, so this has been a week. If you haven’t heard, Google Cloud announced BigQuery Editions at its Data Cloud and AI Summit, and there is a Google Cloud Blog post announcing the release as well. I would highly recommend watching the announcement if you can, and reading through the blog post. There is a lot of information in both.

But that’s also part of the issue I’m seeing. There’s a lot of change happening all at once, and the discussions I am having indicate there’s some confusion out there. I wanted to take a few minutes to write down some of the common questions I’m seeing, along with some human-readable explanations. This is me, on my personal equipment and time, not a representative of any company, just a BigQuery user and enthusiast, and hopefully this helps another BQ user out there navigate the change. I don’t want to re-publish the launch blog post, so I’m going to assume that you’ve at least read it. Please do so if you haven’t already.

On-Demand

I wanted to start with On-Demand specifically, because most of the discussion is around Editions but a lot of people still use this model.

Price: To start with, the price is going up by 25% across all regions. For example, the US multi-region is going up from $5 to $6.25 per TB scanned. Different regions have different starting prices, so be sure to check the regions you use to know what your specific impact will be.
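To make the arithmetic concrete, here is a minimal sketch of the increase applied to a monthly bill. The $5 starting price and the 25% figure come from the announcement; the 120 TB monthly scan volume is a made-up number for illustration, and the sketch ignores any free tier or discounts you may have.

```python
from decimal import Decimal

# 25% increase, as stated in the announcement.
INCREASE = Decimal("0.25")


def new_on_demand_price(old_price_per_tb: Decimal) -> Decimal:
    """Apply the 25% increase to a region's current per-TB price."""
    return old_price_per_tb * (Decimal("1") + INCREASE)


def monthly_scan_cost(tb_scanned: Decimal, price_per_tb: Decimal) -> Decimal:
    """Cost of scanning a given number of TB at a given per-TB price."""
    return tb_scanned * price_per_tb


old_price = Decimal("5.00")    # US multi-region price from the announcement
new_price = new_on_demand_price(old_price)

tb_scanned = Decimal("120")    # hypothetical monthly volume, not a real figure
before = monthly_scan_cost(tb_scanned, old_price)
after = monthly_scan_cost(tb_scanned, new_price)

print(f"New price per TB: ${new_price:.2f}")
print(f"Monthly cost: ${before:.2f} -> ${after:.2f} (+${after - before:.2f})")
```

Swap in the current price for your own region and your own scan volume to see your specific impact.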

Features: On-demand will maintain feature parity with Enterprise Plus Edition (EPE).

Capacity: This will work as it always has. See the documentation on slots for more detail there, but the mechanism itself isn’t changing, just the dollar cost.

Migration: I’ve heard some people mention that there will be a forced migration to Editions if you are On-Demand today. This is false. You will have the option to spin up an Editions reservation, but that is your choice to make.

Physical Storage

Physical storage billing (also sometimes referred to as “compressed storage billing”) will be generally available in the near future as a key piece of the TCO equation going forward. There is documentation on this, and there’s information on how this works out there so I won’t rehash that here, but it’s important to understand that this will be available.

Eligibility: There is a misconception that On-Demand customers won’t be able to take advantage of this. That is not true. From a technical perspective, this storage is not tied or dependent upon any compute mechanism. Anyone using any Edition or On-Demand will be able to leverage this.

That being said, it’s important to understand that this feature won’t be available until a customer has no Flex or Flat-Rate slots reserved. This is not an engineering dependency, but it is a commercial constraint.

The net here is that it’s not dependent on what type of compute you use in the future, but it’s contingent upon you not having any “legacy” reservations anymore.

Forecasting: There aren’t many good ways to give general predictions on how much this will cost, because compression rates are highly dependent on the shape and cardinality of the data you’re storing. This isn’t specific to BigQuery, but is generally true for any columnar data store. If you already have data in a dataset in BQ today, though, you can see what its physical size actually is. Google documentation has example queries you can run against INFORMATION_SCHEMA that will give you the numbers. Just remember that Physical Storage will have a two-day time travel storage minimum (this is a technical dependency).
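Once you have the byte counts from those INFORMATION_SCHEMA queries, the comparison itself is simple arithmetic. The sketch below is a rough estimate under some loud assumptions: the per-GiB prices are placeholders (check the pricing page for your region), time travel bytes are assumed to be billed as part of physical storage, and active versus long-term storage is not separated. The function names and the sample byte counts are hypothetical.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def monthly_storage_cost(num_bytes: int, price_per_gib: float) -> float:
    """Monthly cost of storing num_bytes at a flat per-GiB price."""
    return num_bytes / GIB * price_per_gib


def compare_billing_models(
    logical_bytes: int,
    physical_bytes: int,
    time_travel_physical_bytes: int,
    logical_price_per_gib: float,
    physical_price_per_gib: float,
) -> dict:
    """Estimate monthly cost under logical vs. physical storage billing.

    Assumption: time travel bytes count toward the physical bill.
    Simplification: active and long-term storage share one price.
    """
    logical_cost = monthly_storage_cost(logical_bytes, logical_price_per_gib)
    physical_cost = monthly_storage_cost(
        physical_bytes + time_travel_physical_bytes, physical_price_per_gib
    )
    return {
        "logical_cost": logical_cost,
        "physical_cost": physical_cost,
        "compression_ratio": logical_bytes / physical_bytes,
    }


# Hypothetical dataset: 10 TiB logical, 2 TiB physical, 0.25 TiB of time travel.
estimate = compare_billing_models(
    logical_bytes=10 * TIB,
    physical_bytes=2 * TIB,
    time_travel_physical_bytes=TIB // 4,
    logical_price_per_gib=0.02,   # placeholder price, an assumption
    physical_price_per_gib=0.04,  # placeholder price, an assumption
)
for name, value in estimate.items():
    print(f"{name}: {value:.2f}")
```

With a per-GiB physical price that is higher than the logical one, physical billing only wins when the compression ratio is large enough to offset both the price difference and the time travel bytes, which is why the real numbers for your own data matter.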

Some Things to Look For

Mixing and Matching Editions: When thinking about slot reservation hierarchies in the Flex and Flat-Rate world, you had to figure out what use cases and service levels you needed to support in order to properly size a reservation. A reservation that runs overnight ETL workloads may look different from one that powers a C-level operational dashboard. I still think the questions you have to ask are similar, except now you have an added dimension of different feature bundles to think about. Maybe your ETL reservation can do with the features offered in Standard Edition. The data science team may need the features in Enterprise Edition (EE). A forensic analytics team may need the security features offered with EPE. It’s something you will have to detail, to make sure that the right Edition is assigned to the right workload.

On-Demand Users: Your compute costs, for the same data volumes scanned, are going to increase. One of the levers is to optimize your queries. Really do what you can to make sure you are scanning as little data as you can. It’s the most direct thing that is within your control. You may also want to look at Physical Storage. If you store high volumes of data at a high cost, using it could lower your overall BigQuery costs despite query costs going up. The other angle is to start using Editions. If you have queries that are expensive because a lot of data is being scanned, you will want to look at the slot time being consumed and do the math, because running them on slots in a particular Edition may actually yield better financial results for you.
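“Doing the math” here can be sketched as follows. For a set of queries you need two numbers, bytes billed and slot time consumed, both of which BigQuery reports in its job statistics (for example, the `total_bytes_billed` and `total_slot_ms` columns of the INFORMATION_SCHEMA jobs views). The $6.25 on-demand price is the US multi-region figure from above; the slot-hour price and the workload numbers are placeholders I made up, so substitute the real ones for your Edition and region. Note that a reservation bills for the capacity it provisions, not only the slot time your queries consume, so treat the Editions number as a lower bound.

```python
TIB = 1024 ** 4
MS_PER_HOUR = 3_600_000


def on_demand_cost(bytes_billed: int, price_per_tib: float) -> float:
    """Cost of a workload billed by data scanned."""
    return bytes_billed / TIB * price_per_tib


def slot_hours(total_slot_ms: int) -> float:
    """Convert consumed slot-milliseconds into slot-hours."""
    return total_slot_ms / MS_PER_HOUR


def editions_cost_lower_bound(total_slot_ms: int, price_per_slot_hour: float) -> float:
    """Cost if you paid only for the slot time actually consumed."""
    return slot_hours(total_slot_ms) * price_per_slot_hour


def break_even_slot_price(
    bytes_billed: int, total_slot_ms: int, price_per_tib: float
) -> float:
    """Slot-hour price at which both models cost the same for this workload."""
    return on_demand_cost(bytes_billed, price_per_tib) / slot_hours(total_slot_ms)


# Hypothetical workload: 50 TiB billed, 1,000 slot-hours consumed.
bytes_billed = 50 * TIB
total_slot_ms = 1_000 * MS_PER_HOUR

od = on_demand_cost(bytes_billed, price_per_tib=6.25)
ed = editions_cost_lower_bound(total_slot_ms, price_per_slot_hour=0.04)  # placeholder price
be = break_even_slot_price(bytes_billed, total_slot_ms, price_per_tib=6.25)

print(f"On-demand: ${od:.2f}")
print(f"Editions (lower bound): ${ed:.2f}")
print(f"Break-even slot-hour price: ${be:.4f}")
```

If the break-even price comes out well above what the Edition you need charges per slot-hour, that workload is a candidate for a reservation. If it comes out below, on-demand is probably still the cheaper home for it.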

Security and Compliance: Note that Standard Edition will have Foundational Compliance, which includes Google Cloud Platform-wide certifications. VPC Service Controls and data security (data masking, column-level security, and row-level security) will be offered at EE and above. CMEK and Assured Workloads will only be enabled at EPE. This is according to the launch blog, and it applies to new reservations in Editions. For customers with existing Flex or Flat-Rate reservations, please speak with your Google or partner account team for more details.
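One way to keep this straight when assigning Editions to workloads is a small lookup that returns the lowest Edition covering everything a workload requires. The feature-to-Edition mapping below only encodes what the launch blog lists as summarized above, so it is not a complete feature matrix, and the feature keys and function name are my own hypothetical labels.

```python
from enum import IntEnum


class Edition(IntEnum):
    STANDARD = 1
    ENTERPRISE = 2
    ENTERPRISE_PLUS = 3


# Lowest Edition offering each feature, per the launch blog summary above.
# The keys are illustrative labels, not official identifiers.
MIN_EDITION = {
    "foundational_compliance": Edition.STANDARD,
    "vpc_service_controls": Edition.ENTERPRISE,
    "data_masking": Edition.ENTERPRISE,
    "column_level_security": Edition.ENTERPRISE,
    "row_level_security": Edition.ENTERPRISE,
    "cmek": Edition.ENTERPRISE_PLUS,
    "assured_workloads": Edition.ENTERPRISE_PLUS,
}


def minimum_edition(required_features) -> Edition:
    """Return the lowest Edition that offers every required feature."""
    required = set(required_features)
    unknown = required - set(MIN_EDITION)
    if unknown:
        raise KeyError(f"Unknown features: {sorted(unknown)}")
    # With no special requirements, Standard is enough.
    return max((MIN_EDITION[f] for f in required), default=Edition.STANDARD)


# Hypothetical workloads and their requirements.
workloads = {
    "overnight_etl": [],
    "data_science": ["column_level_security", "data_masking"],
    "forensic_analytics": ["row_level_security", "cmek"],
}
for name, features in workloads.items():
    print(f"{name}: {minimum_edition(features).name}")
```

Security features are only one dimension. Other features and your performance needs may push a workload to a higher Edition than this lookup suggests.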

Timing: At the end of the BigQuery Editions launch blog, there’s a bit that speaks to the timing of the rollout. Starting on July 5, 2023, BigQuery customers will no longer be able to purchase flat-rate annual, flat-rate monthly, or flex slot commitments. The implication here is that you can continue to purchase an annual commitment until then. If the current functionality and price point are advantageous for you, it may very well be worth creating additional reservations. Just note that the blog does not say whether features added to any Edition after July 5, 2023 will also be enabled for legacy reservations you have active at that time, and I can find no public guarantees one way or the other. So, for example, if you buy an annual Flat-Rate reservation on July 4, 2023, and Google adds a new feature to EPE on August 1, 2023, it’s not clear whether you will be able to use it. You should definitely speak with your Google or partner account team if this is a concern, but know that this is something to consider and clarify for your situation.

Plan For This Change

At the end of the day, this is a fairly big change to the way Google is selling and pricing BigQuery. The mechanism by which it’s sold is changing, so the implications will vary depending on your current situation. I’ll try to answer any questions that come up in the comments to the best of my ability with public-facing information. I’ve also seen a Reddit thread pop up with people discussing this, which may be helpful if you want to ask questions. I hope this information was helpful and that we can all learn from each other as we navigate these changes together!

Avid 2020 bed-to-couch traveler, cloud tech, big data, random trivia, Xoogler. My employer isn’t responsible for what’s here. NYC. linkedin.com/in/briansuk