BigQuery is Google’s managed data warehouse in the cloud. BigQuery is incredibly fast. It can scan billions of rows in seconds. It’s also surprisingly inexpensive and easy to use. Querying terabytes of data costs only pennies and you only pay for what you use since there are no up-front costs.
This is a hands-on course where you can follow along with the demos using your own Google Cloud account or a trial account. You do not need any prior knowledge of Google Cloud Platform and the only prerequisite is having some experience with databases.
Learning Objectives
- Load data into BigQuery using files or by streaming one record at a time
- Run a query using standard SQL and save your results to a table
- Export data from BigQuery using Google Cloud Storage

Intended Audience
- Anyone who is interested in analyzing data on Google Cloud Platform

Prerequisites
- Experience with databases
- Familiarity with writing queries using SQL is recommended
- A Google Cloud Platform account is recommended (sign up for a free trial at https://cloud.google.com/free/ if you don’t have an account)
The GitHub repository for this course is at https://github.com/cloudacademy/bigquery-intro.
There are two components to BigQuery pricing: storage and queries.
BigQuery storage is incredibly cheap. It costs two cents per gigabyte per month, which is the same price as Cloud Storage Standard. What’s even better is that if you don’t edit a table for 90 days, the price for that table drops to one cent per gig per month until you modify the data in the table again. That’s as cheap as Nearline Storage! In fact, it’s even cheaper, because when you read data from Nearline Storage, there is a one-cent-per-gig charge. With BigQuery storage, you aren’t charged for reading data at all.
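To make the storage arithmetic concrete, here is a minimal sketch in Python. It assumes only the rates quoted in this lesson (two cents per gigabyte per month, dropping to one cent after 90 days without edits). The function and constant names are made up for illustration and are not part of any BigQuery API.

```python
# Rates as quoted in this lesson; real pricing may differ by region and over time.
ACTIVE_USD_PER_GB_MONTH = 0.02      # table edited within the last 90 days
LONG_TERM_USD_PER_GB_MONTH = 0.01   # table not edited for 90 days or more
LONG_TERM_THRESHOLD_DAYS = 90


def monthly_storage_cost_usd(size_gb: float, days_since_last_edit: int) -> float:
    """Estimate one month's storage charge for a single table (hypothetical helper)."""
    if days_since_last_edit >= LONG_TERM_THRESHOLD_DAYS:
        rate = LONG_TERM_USD_PER_GB_MONTH
    else:
        rate = ACTIVE_USD_PER_GB_MONTH
    return size_gb * rate


if __name__ == "__main__":
    # A 500 GB table, first while it is still being edited, then after sitting untouched.
    print(f"active:    ${monthly_storage_cost_usd(500, 10):.2f}")
    print(f"long-term: ${monthly_storage_cost_usd(500, 120):.2f}")
```

Under these assumptions, the same table costs half as much per month once it has gone 90 days without an edit.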
Since we’ve only been using public datasets so far, there won’t be any storage charges.
The only other charge is for queries. (There’s also a charge for streaming data to BigQuery in real time, but that doesn’t apply to these examples and I’ll cover it in another lesson.) For queries, the first terabyte per month is free. After that, it costs $5 per terabyte, which is half a cent per gigabyte. Wait a minute, didn’t I just say that you aren’t charged for reading data from BigQuery storage? Yes, and that’s still true, because BigQuery charges query fees regardless of where the data is read from. For instance, if you query a dataset that’s in Cloud Storage, you get charged at the same rate as you would for querying a dataset in BigQuery storage, so the charge isn’t for reading; it’s for processing.
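The query pricing rule can be sketched as a small Python function. This assumes the figures quoted in this lesson (first terabyte per month free, then $5 per terabyte). The names are hypothetical, and it ignores details such as per-query minimums that the lesson doesn’t cover.

```python
# Figures as quoted in this lesson; treat them as assumptions, not current pricing.
FREE_TB_PER_MONTH = 1.0
USD_PER_TB = 5.0


def monthly_query_cost_usd(tb_processed: float) -> float:
    """Estimate a month's on-demand query charge from total terabytes processed."""
    # Only the data processed beyond the free terabyte is billable.
    billable_tb = max(0.0, tb_processed - FREE_TB_PER_MONTH)
    return billable_tb * USD_PER_TB


if __name__ == "__main__":
    for tb in (0.5, 1.0, 3.0, 21.0):
        print(f"{tb:>5} TB processed -> ${monthly_query_cost_usd(tb):.2f}")
```

Note that the input is the amount of data processed, not the amount stored, which matches the point that the charge is for processing rather than reading.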
For high-volume customers, there’s also flat-rate pricing, but it’s only worthwhile if you spend at least $2,000 per month on queries. Flat-rate pricing covers only query costs; storage is still billed separately.
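As a rough way to see where that threshold sits, the sketch below works out how much data you would have to process on demand to reach $2,000 in a month. It assumes the $5-per-terabyte rate and the free first terabyte quoted earlier, and the function name is hypothetical.

```python
def on_demand_tb_for_spend(monthly_spend_usd: float = 2000.0,
                           usd_per_tb: float = 5.0,
                           free_tb: float = 1.0) -> float:
    """Terabytes processed per month that would cost `monthly_spend_usd` on demand."""
    # Billable terabytes needed to reach the spend, plus the free first terabyte.
    return monthly_spend_usd / usd_per_tb + free_tb


if __name__ == "__main__":
    print(f"{on_demand_tb_for_spend():.0f} TB per month")
```

In other words, under these assumptions you would need to be processing on the order of 400 terabytes a month before flat-rate pricing is worth a look.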
To see how much data a query will process, look in the Validator message area. Since there isn’t an error in the syntax, it’s now showing how much data the query above would process. In this case, it’s 163 MB. Considering that the first terabyte of processing in a month is free, this won’t cost us anything, but even if we were already over the one-terabyte mark this month, it wouldn’t cost much. How much? Less than a tenth of a cent. I’d say that’s pretty reasonable.
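If you’d rather check this from code than from the web console, a dry-run query reports the bytes that would be processed without actually running anything. The sketch below assumes the `google-cloud-bigquery` Python client library is installed and that default credentials are configured. The helper names are made up for illustration, and the cost arithmetic uses decimal units (1 TB = 10^12 bytes) to match the round numbers in this lesson, which is a simplifying assumption.

```python
USD_PER_TB = 5.0          # on-demand rate quoted in this lesson
BYTES_PER_TB = 10 ** 12   # decimal terabyte; an assumption to match the lesson's arithmetic


def on_demand_cost_usd(bytes_processed: int) -> float:
    """Cost of one query once the free monthly terabyte is already used up."""
    return bytes_processed / BYTES_PER_TB * USD_PER_TB


def dry_run_bytes(sql: str) -> int:
    """Ask BigQuery how many bytes a query would process, without running it."""
    # Imported here so the cost helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=job_config)
    return job.total_bytes_processed


if __name__ == "__main__":
    # The 163 MB figure from the Validator in this lesson.
    cost = on_demand_cost_usd(163 * 10 ** 6)
    print(f"163 MB -> ${cost:.6f}, or {cost * 100:.4f} cents")
```

With credentials in place, `on_demand_cost_usd(dry_run_bytes(sql))` would give the same kind of estimate for any query string you pass in.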
And that’s it for this lesson.
Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).