Snowflake and BigQuery are similar in many ways, and one could think it would be an easy and balanced fight… but is it though?

Ever since I wrote my Snowflake vs Redshift RA3 article, I have been getting requests from readers to make a comparison between BigQuery and Snowflake. As was pointed out by some, they both share a lot of the goodies that come with the separation of storage and compute, and are therefore worthy competitors.

As always, I would like to point out that choosing the “right” database for your data platform is not a clear-cut decision… there are many variables at play: data size, user requirements, performance requirements, price sensitivity, budgeting, maintenance costs, etc… only you will know what the best…

Technical overview of what makes Snowflake the best data platform out there

Executive Summary

If I had to summarize this article in just a few words…

Snowflake is just a great productivity enabler. It increases our productive work, reduces the time spent waiting around, and makes performance equally available to everyone instead of being a limited shared asset. At the end of the day, Data Engineers’ and Analysts’ time is much more expensive than licensing. If you add faster time-to-market and the ability to “fail fast”, it’s quite a proposition.

Snowflake was built with scalability in mind. I have worked on tables with hundreds of billions of rows, and you kind of just forget how big they are…

On a deeper look, the differences will emerge

Welcome to Part 2 (and final) of this comparison between Snowflake (SF) and Google BigQuery (BQ). In Part 1 (which can be found here) we looked at the easy “out-of-the-box” version of BigQuery and kept it as simple as possible for the cost and performance comparison. But now, we need to have a deeper look:

it’s time to properly dive into the details and, as we all know, the devil is in the details.

Looking Beyond “On-Demand” Pricing

To compare the two technologies, we will have to find a way to match their compute costs, which is not at all trivial. The two technologies…

Optimise your workflow with Snowflake’s time travel feature

Photo by Delorean Rental on Unsplash

One of the great features you have available in Snowflake is called “time travel” (documentation here), which allows you to set up your table(s) so that you can go back in time for up to 90 days. This may seem like just another “cool” feature, but it’s way more than that… It is a great way to improve your workflow, saving you a lot of time and hassle. Let’s jump right into it.
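To make "going back in time" concrete, here is a minimal sketch of how a time travel query could be assembled from Python. It relies on Snowflake's `AT(OFFSET => …)` and `AT(TIMESTAMP => …)` clauses; the function name `time_travel_query` and the table names are hypothetical, the table identifier is assumed to come from trusted code rather than user input, and how far back you can actually go depends on the retention configured for the table.

```python
from datetime import datetime, timezone
from typing import Optional


def time_travel_query(table: str,
                      minutes_ago: Optional[int] = None,
                      as_of: Optional[datetime] = None) -> str:
    """Build a SELECT that reads `table` as it looked at an earlier point in time.

    Hypothetical helper: it only builds the SQL text, running it is left to
    whichever Snowflake client you already use.
    """
    if (minutes_ago is None) == (as_of is None):
        raise ValueError("pass exactly one of minutes_ago or as_of")

    if minutes_ago is not None:
        if minutes_ago < 0:
            raise ValueError("minutes_ago must not be negative")
        # OFFSET is a number of seconds relative to now, so it is negative.
        clause = f"AT(OFFSET => -{int(minutes_ago) * 60})"
    else:
        if as_of.tzinfo is None:
            raise ValueError("as_of must be timezone-aware")
        # Normalise to UTC and emit an ISO 8601 literal with an explicit offset.
        ts = as_of.astimezone(timezone.utc).isoformat(timespec="seconds")
        clause = f"AT(TIMESTAMP => '{ts}'::timestamp_tz)"

    return f"SELECT * FROM {table} {clause}"
```

For example, `time_travel_query("finance.recon", minutes_ago=30)` produces a query that reads the (hypothetical) `finance.recon` table as it stood half an hour ago, which is the sort of thing you want when checking what a failed job actually saw.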

Debug Your ETL Failures (with data “as it looked then”)

It’s Friday night, you’re at home watching Love Island (you know you do…), and at that same exact time the finance month-end reconciliation script is failing in…

And you really should…

When moving to a new technology, there is a learning curve you have to go through before you start digging into the more “intricate” details. It’s not uncommon for users, months down the line, to discover something they wish they’d known from the beginning. These configurations are among those “easy to overlook” things, so I think it makes sense to point some of them out.

Some configurations are trivial and expected: we all assume there is a date/time format configuration in Snowflake, there must be a locale setting, all fairly trivial for anyone who has been around…

You’ve tried it out, and decided to make the move to Snowflake. But now what? Where should you start?

When Snowflake burst onto the stock market with a boom in 2020, I started noticing a growing interest in the job market for professionals with Snowflake experience to help with migrations. Snowflake was no longer the “dark horse”, and companies started to embrace it as a natural alternative to other long-established technologies, like Azure DW or Amazon Redshift.

So, maybe you’re one of those companies who finally decided to take the leap and move to Snowflake, and now you’re looking at this “almost the same, but not really the same” technology and you’re not really figuring out where…

Because your first step towards Analytics is an important one

Dipping your toes into data analytics can be quite overwhelming at first: it is probably harder, more expensive and more complex than you anticipated. But it doesn’t have to be!

Photo courtesy [as seen here]

Having had the pleasure of working with multiple startups at different growth and financing stages, I can appreciate how hard it is to balance investment with ambition, and how tricky it is to get data analytics investment right.

Data analytics is an expensive capability for any company to acquire, and even more expensive to maintain.

Hiring is extremely difficult, the resources are scarce, validating candidates’ skills is far from trivial and…

A really short article on why you should be using single sign-on (SSO) with Snowflake

Database authentication has always been a pain for data warehouses… you are holding extremely sensitive data, and for that reason you should enforce high security standards. But high security standards usually come with a great deal of hassle!

Let’s use a real-life scenario: your data warehouse has both personal data (aka PII) and non-personal data. As a bare minimum, your personal data should be highly secured, and that means enforcing MFA, as well as enforcing password rotations every 3 months. But you can’t rely on users to do that, so you have to manage it yourself… but now…

I can’t even count how many times I have had to write helper functions to encapsulate boto3… but the pain is finally over!

If you use Python and AWS a lot, I bet you’re just like me, creating dozens of libraries to help manage boto3. Here is a good example:

And all because I wanted the secrets as a dictionary… shouldn’t be that hard!
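A helper along those lines could be sketched as follows. This is only an illustration of the kind of wrapper described above, not a specific library's code: it assumes the secret is stored in AWS Secrets Manager as a JSON string, and the function name `get_secret_dict` and its optional `client` parameter are hypothetical.

```python
import json
from typing import Any, Dict, Optional


def get_secret_dict(secret_id: str, client: Optional[Any] = None) -> Dict[str, Any]:
    """Fetch a secret from AWS Secrets Manager and return it as a dictionary."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("secretsmanager")

    response = client.get_secret_value(SecretId=secret_id)
    # Assumes a text secret holding a JSON object; binary secrets arrive under
    # "SecretBinary" instead and are not handled in this sketch.
    return json.loads(response["SecretString"])
```

Accepting the client as a parameter keeps the sketch easy to test and lets the caller decide on region and credentials.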

Meet awswrangler! The Library boto3 Should Have Been

Like many others, I have blindly used boto3 for such a long time… but having recently been introduced to AWS’s new awswrangler library (an AWS Professional Services open source initiative), I just can’t believe that we were forced to put up with boto3 for so long…

Let’s look at some of the things that have been finally…

The paradigm of pay-per-use is amazing, but it should come with a massive red sticker: “Use with Care!”

Since the beginning of (database) times, databases have always been there for us, up 24/7, eagerly waiting to serve the next query. I am pretty sure that if we were to add up all the database idle time over the last decades, we would have centuries’ worth of availability where the database was just doing… nothing! What a waste… right?

Not anymore! Welcome to the Era of Pay-Per-Use!

What Do You Prefer: A Fiat Uno Always On, or a Ferrari You Just Turn On When Needed?

In the pay-per-use model, you do not need to pay for availability: having a server running idle is not something anyone should like to pay for anyway… Like the old performance tuning saying: whenever…

Joao Marques @ Data Beyond Ltd

Just a technical Data Architect having too much fun building new data platforms
