Mike Jelen

Terraforming Snowflake


Terraform is an open-source, declarative Infrastructure as Code (IaC) tool created by HashiCorp: you declare what you want in HashiCorp Configuration Language (HCL), a human-readable configuration syntax. Terraform is also stateful, meaning it keeps track of your current state and compares it with your desired state. Using Terraform is a great way to manage account-level Snowflake resources like Warehouses, Databases, Schemas, Tables, and Roles/Grants, among many other use cases.

Example Terraform use-cases:
– Set up storage in your cloud provider and add it to Snowflake as an external stage
– Add storage and connect it to Snowpipe
– Create users and assign roles to users

Many Snowflake customers use Terraform to comply with security controls, maintain consistency, and support similar engineering workflows for infrastructure at scale.

Watch the video here as we demonstrate how to install Terraform and use it to create and manage your Snowflake environment, including a database, schema, warehouse, multiple roles, and a service user.

Transcript from the Meetup event:

Hello, everyone. Welcome to another awesome Carolina Snowflake Meetup Group event. Tonight we will be going over Terraforming Snowflake. Very exciting. I’ll just go over our meetup rules real quick.

We just ask that everyone is respectful of one another, tries to stay on mute, and uses the chat to ask questions, keeping those questions on topic. Video on is optional, and we always hope that you will recommend us to someone, invite a friend, and ping us on Twitter at DataLakeHouse.io.

Our previous meetups are added to our YouTube channel, and you can find that YouTube channel in the discussion board on our Meetup group. Today, we’re going to start with some introductions, a quick Snowflake refresher, and then an introduction to Terraform.

We’ll go over a couple of use cases and then we will jump into Terraforming Snowflake and that will go over the Terraform plan and the demo itself. And then we’re going to talk about our Snowflake health checks and what we’ve got coming up next. 

So just a refresher for those that don’t know what Snowflake is, or maybe you’re new to Snowflake. Snowflake is commonly referred to as the Snowflake Data Cloud.

It’s more than just a cloud data warehouse where you ingest your data, govern your data, and access your data; you also share your data. There are a lot of means to take the data that you have, potentially anonymize it, and make it an asset that you share and charge for, perhaps to your customers as an additional service.

Or maybe it’s part of the overall relationship that you have with your customers, to make it easy to transfer and share data. They don’t have to be a Snowflake customer, but they get to leverage the power of the Snowflake Data Cloud, and of course they can if they are.

So there are lots of different types of data sources: on-prem, cloud, third party, flat files, device data, whether it’s IoT or perhaps network devices. All that data can go up into the Snowflake Data Cloud, where you transform it into how you need to consume it, whether that’s through your data scientists, executive users, end users throughout your organization, or perhaps other applications within your organization. You can aggregate that data, slice and dice it, and carve out data sets to be exposed to the appropriate folks within the organization.

So to frame today’s conversation on all things Terraform, let’s just get everyone on the same page. Think of Terraform as infrastructure as code.

We’re using code to create your infrastructure. The old way was having to go and procure physical servers, or perhaps you had VMware or virtual servers that you were provisioning, and you had to determine the RAM, the CPU, the shared disk, and the type of NIC or connection.

Gone are those days. From a software perspective, Terraform is the answer to that. And there are two different flavors of Terraform. There’s Terraform, the traditional command-line tool, with what we call a quote-unquote local install, whether that local install is on your laptop or on a VM in a cloud provider. It still needs an operating system and a location to run.

Now, Terraform Cloud is for when you think, all right, we’re going to have a team of people doing infrastructure as code.

You want version control, the ability to roll back and say, oops, I made a mistake, that check-in, check-out, that CI/CD. Terraform Cloud is going to be your answer at that point.

So we talk about some of the use cases: writing this code, especially repeatable code. It’s not code to the nth degree, where you’d need a degree in reading and writing code. This is very straightforward.

I think it’s more human readable, and we’ll dive into what that actually looks like in a little bit here. We talked about committing to version control, another means to a more modern way to handle infrastructure.

All right, so Anaphy, I’ll turn it over to you here. Okay, thank you, Mike and Heather, for the amazing, detailed introduction. So taking it forward from here, we’ll have a quick demo on how we set up Terraform, and how we set it up from a Snowflake standpoint.

So we’ll first quickly talk about the various ways of importing infrastructure. The first scenario is when you’re setting up completely new infrastructure and bringing that under Terraform. The second scenario, and the trickier one, is when you already have existing infrastructure. When I say infrastructure, I’m keeping it to Snowflake only.

So you already have an existing Snowflake infrastructure, and you want to bring that into Terraform. We’ll walk through each of the steps it takes to get your Snowflake objects into your Terraform infrastructure.

So I touched upon this: there are two ways of doing it. You could Terraform completely new infrastructure, or you could import an existing one. And as Mike already pointed out, there are two ways of Terraforming.

You could use Terraform Cloud, which gives you a cloud UI to set up Terraform. Or the other option is to use the command line, and the scope for today’s session will be more around the command-line piece.

And you have this registry, the Snowflake provider for Terraform, which gives you certain scripts and commands to import your existing Snowflake architecture into Terraform.

Okay, so this is one of the Snowflake providers that we have in the Terraform Registry. As you can see, they’ve given you a Terraform script that you can bring in as part of your code to import each of these objects.

And as you see on the list, they give you resources for databases, functions, tables, external tables, file formats, masking policies, and most of the Snowflake infrastructure. Each of them has its own script.

So if you click on function, it tells you what the format is if you have to import this particular object into your Terraform. We’ll talk more about this as part of the demo, but to get a feel for what objects are available in the Terraform Registry, you can have a look at this link.

This is given as part of the presentation. You can have a look here and see the various objects that can be imported. So you need to ensure that your code follows this format, and each of these parameters has to be filled in for Terraform to know that you want to import this particular object with these properties.

It tells you the format, which parameters are required, which are optional, and so on. So, moving on to the importing part, let’s assume you have a scenario where you want to import your architecture and it’s a completely new implementation.

You haven’t implemented anything in Snowflake yet. So as you implement Snowflake, you also write a .tf file, which is nothing but your Terraform file, containing those resource scripts that I just showed.

After that, you run terraform init to initialize the Terraform setup. Then you run terraform plan, which compares your configuration against the .tfstate file, the file that keeps track of all the Snowflake objects that have been brought under Terraform.

And then you run terraform apply to create, edit, or delete infrastructure, which updates both your Terraform state and your Snowflake setup. Let me do a quick demo here before I go ahead. Now, let’s take an example.

For example, let’s say I have to import databases. For the setup, as I said, I’ve used this particular provider, which is by Snowflake-Labs, and I’ve specified the version that we’ve used.

And then this basically gives you a setup to connect with Snowflake. These are just the connection details. In this particular scenario we’ve used a private key to connect with our Snowflake instance, and you need to set up a user.

So we set up a Terraform Snow Meetup user. This is the Snowflake instance where we set up the infrastructure using Terraform, and you can see that we have the Terraform Snow Meetup user.

To that user you assign a public key that is used to connect from Terraform, and then you can assign it SYSADMIN and SECURITYADMIN so it can go and create, alter, delete, and so on. Then you need your account and your region information, which you feed in here as part of the connection properties to connect Terraform with the Snowflake instance.

You can use a password; if you have OAuth set up, you can use the OAuth token; or, as in this scenario, you can use a public and private key setup. You can also specify the role that you want it to use for logging in to Snowflake.

In this case it will be using SYSADMIN when it logs in as the Terraform Snow Meetup user. Now, if you look at the databases in this Snowflake instance right now, you just have the standard Snowflake database.
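For readers following along, the connection setup described here might look roughly like the sketch below. It assumes a 0.x release of the Snowflake-Labs/snowflake provider, where the provider block accepted account, region, username, private_key_path, and role; later releases rename some of these arguments, so check the registry documentation for the version you pin. The account locator, region, user name, and key path are placeholders, not the values used in the demo.

```hcl
terraform {
  required_providers {
    snowflake = {
      source = "Snowflake-Labs/snowflake"
      # Add a version constraint here so that argument names stay stable;
      # the talk does not state which version was used.
    }
  }
}

# Key-pair authentication: the public key is assigned to the Snowflake user,
# and Terraform reads the matching private key from disk.
provider "snowflake" {
  account          = "xy12345"                       # placeholder account locator
  region           = "us-east-1"                     # placeholder region
  username         = "TERRAFORM_SVC"                 # placeholder service user
  private_key_path = "/path/to/snowflake_tf_key.p8"  # placeholder key path
  role             = "SYSADMIN"                      # role used for the session
}
```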

There is nothing else currently available. So let’s assume that I want to create a database called, let’s say, Ankit_Sandbox. I specify that you need to use this particular role, this user, when you’re creating this database, and that’s the name of the database.

If you want to set a comment or a retention period, and you want to understand what info needs to be passed in here, you can always look at the website; it tells you which parameters have to be passed. Once you have a .tf file, a Terraform file, mapped out here, you go to the command line and first initialize your Terraform.
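A minimal sketch of such a database definition follows. The resource type snowflake_database and its name, comment, and data_retention_time_in_days arguments come from the Snowflake provider; the local name sandbox, the database name, and the values are placeholders chosen for illustration, not the ones from the demo.

```hcl
# Declares one Snowflake database; Terraform creates it on apply
# and drops it on destroy.
resource "snowflake_database" "sandbox" {
  name                        = "DEMO_SANDBOX"                      # placeholder name
  comment                     = "Created and managed by Terraform"  # optional
  data_retention_time_in_days = 1                                   # optional
}

# Typical command sequence, run from the directory that holds this file:
#   terraform init     # download the provider and initialize the working directory
#   terraform plan     # show what would change, without changing anything
#   terraform apply    # make the changes after you confirm with "yes"
#   terraform destroy  # remove everything this configuration manages
```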

So I run terraform init, and as you see, it initializes my Snowflake-Labs provider from the registry. After that I run terraform plan, which looks at the values in the .tf file I’ve created and compares them to the state file that we have right now.

Right now my .tfstate file has nothing. So it’s telling me that it will go ahead and create this particular resource, this database with the name Ankit_Sandbox. It tells me that it plans to add one resource, change nothing, and destroy nothing.

But it’s important to note that plan just tells you a plan; it does not go and change, create, or delete anything. It just tells you what changes are planned. So let’s look at what is in the .tfstate file.

This is the state file, which keeps track of all the objects that have been brought under Terraform. Right now it’s a blank file. It just has these initialized headers. It has no real resources under it.

But when I run terraform apply, it will tell me again that it plans to go and create this database. You have to confirm with the value yes for it to actually go and create it. So I say yes.

And now you see it has gone and created this particular database in my Snowflake sandbox. Now, if you look at the .tfstate file, it has added a resource of type snowflake_database, with the name Ankit_Sandbox and the attributes that we’ve provided.

Now let’s go have a look at the Snowflake instance. If I do a refresh, you see the Ankit_Sandbox database has been created. And if you look at the history, you actually have this Terraform Snow Meetup user, which runs all these commands for you in the background.

It does a check for all databases and then goes on to create the database for you. So it’s done that creation for me in the back end, while I only had to go and add that particular entry in my Terraform file.

And that’s the concept of infrastructure as code, where you only add certain configuration files and it takes care of creating, deleting, and adding. So, for example, if I want to go and delete this particular database, I can just run terraform destroy.

Terraform destroy destroys everything that was created as part of this configuration, and it asks me to confirm. It says that this particular database will be destroyed. So I say yes, and it goes and destroys it; it has dropped that particular database again.

So if you look here, it should not be here; it’s gone. So this is the scenario where you have something which is not yet part of your infrastructure and you want to add it.

And that is something which might not be possible every time. There might be scenarios where you already have Snowflake objects in your existing architecture, and partway through you decide to import all those Snowflake objects under Terraform.

That means, for example, if you have a thousand databases, 10,000 schemas, and, let’s say, 100 users, you would have to manually create those files for each one of those Terraform objects, import each of these objects, and then run terraform plan and terraform apply.

And that is something which is not desirable. So there are two ways to go about it. Either you could write a script that automatically generates each of these files, and there is no existing solution as such.

There are some ad hoc solutions that companies have devised on their own to do it. Or, what we’ve done as part of DataLakeHouse: we’re right now working on a feature. It’s in private preview and it should be out very soon on DataLakeHouse.

You only have to specify the name of the account, your user ID, and the password of your Snowflake account, and you just mention for which particular objects you want the .tf file and the import commands to be created.

And then you generate the Terraform scripts. DataLakeHouse goes ahead and, based on your existing architecture, generates those .tf files and the import command files, and then you can just go and run those import commands in your Terraform.

So how do you do that? I’ll take the same scenario here, where I go and run terraform plan, and let’s assume that we have this particular database which is already created. So then I create this.

Now I will have to go and remove this entry. What I have done is simulate a scenario where I already have an object in my Snowflake system, but it’s not there in my .tfstate file.

So what I can do is use something called import, where you can import existing architecture into your Terraform state. You go and import the architecture, and your import has happened.

After that, if I do a terraform plan, there will be no changes, because importing that object through the import commands matches your Terraform state file with the existing database, or the existing architecture, that you have in Snowflake.
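As a rough illustration of that import step, the sketch below pairs a resource block with the command that attaches an existing database to it. It assumes the import ID for snowflake_database is simply the database name, as the provider documentation describes; EXISTING_DB and the local name existing are placeholders. The import block at the end needs Terraform 1.5 or later, which postdates this talk, so treat it as an alternative to the CLI command used in the demo and use one form or the other, not both.

```hcl
# The resource block must exist before the import, and its arguments should
# match the real object so the next plan reports no changes.
resource "snowflake_database" "existing" {
  name = "EXISTING_DB" # placeholder: a database that already exists in Snowflake
}

# CLI form:
#   terraform import snowflake_database.existing EXISTING_DB

# Declarative form (Terraform 1.5+), applied during terraform apply:
import {
  to = snowflake_database.existing
  id = "EXISTING_DB"
}
```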

So if there are any questions around that, I’m happy to take them now. And I think, Heather, back to you. All right, so we’ve had a number of questions come in through various means, and I see a couple of questions around where we can find the recordings.

Within the chat in Zoom there’s a link. There’s also a response in the Q&A area within the webinar. And generally, if you haven’t seen either of those places, our YouTube channel, which you can get to via our website at DataLakeHouse.io, is an alternative way to get to these recordings.

Some conversations were around a deeper dive into Snowflake: what is it? Happy to have some follow-up conversations on those. Feel free to either reference the email address within the Q&A or use the Contact Us on our website.

Happy to dive further into Snowflake, do a demo, talk about your specific use case, and go from there. Of course, all three of those conversations. Okay, great. So we’ve got these next steps here. First, we’d say start terraforming your Snowflake, if you have Snowflake and you need to use Terraform.

Of course, we also encourage you to explore Snowflake objects, and Annika can probably comment more on that, and then create and manage org accounts in Snowflake, and let us know if you need any assistance.

Do you want to touch on the Snowflake objects? Yeah, I think to fully unleash the power of Terraform, it’s also useful to understand the various Snowflake objects and how they interrelate with each other.

While you’re trying to import the Snowflake architecture into Terraform, there is a certain sequence that needs to be followed to ensure a smooth transition into Terraform. So understanding Snowflake objects and the connections between them is definitely a useful piece of effort.
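One way to picture how objects relate is a schema that lives inside a database. In the sketch below, which uses placeholder names throughout, the schema refers to the database through an expression rather than a hard-coded string, so Terraform infers that the database has to exist first and orders creation and destruction accordingly. The same parent-before-child order is a reasonable one to follow when importing existing objects.

```hcl
resource "snowflake_database" "analytics" {
  name = "ANALYTICS_DEMO" # placeholder database name
}

resource "snowflake_schema" "raw" {
  # Referencing the database resource creates an implicit dependency:
  # the database is created before the schema and destroyed after it.
  database = snowflake_database.analytics.name
  name     = "RAW" # placeholder schema name
}
```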

Great. We’ve got another question here: does Terraform support adding multiple Snowflake accounts? When they say multiple Snowflake accounts, I think they’re asking whether, if you have multiple Snowflake accounts, you can bring them under the same Terraform architecture. To answer that: yes, it is possible.

You’ll have to create different users for each of the Snowflake accounts that you have and then manage them accordingly. Awesome. Great question. So just to tell you guys a little bit about us, we are AICG, developers of the DataLakeHouse.io platform.
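As a sketch of the multiple-accounts answer, Terraform’s provider aliases let one configuration talk to several Snowflake accounts, each with its own user. The alias mechanism is standard Terraform; the provider arguments again assume a 0.x release of the Snowflake-Labs/snowflake provider, and every account, user, path, and database name is a placeholder.

```hcl
# One provider configuration per Snowflake account, told apart by alias.
provider "snowflake" {
  alias            = "dev"
  account          = "dev12345"             # placeholder
  region           = "us-east-1"            # placeholder
  username         = "TERRAFORM_SVC_DEV"    # placeholder user in the dev account
  private_key_path = "/path/to/dev_key.p8"  # placeholder
  role             = "SYSADMIN"
}

provider "snowflake" {
  alias            = "prod"
  account          = "prod67890"            # placeholder
  region           = "us-east-1"            # placeholder
  username         = "TERRAFORM_SVC_PROD"   # placeholder user in the prod account
  private_key_path = "/path/to/prod_key.p8" # placeholder
  role             = "SYSADMIN"
}

# Each resource names the provider configuration (and so the account) it uses.
resource "snowflake_database" "dev_sandbox" {
  provider = snowflake.dev
  name     = "SANDBOX"
}

resource "snowflake_database" "prod_sandbox" {
  provider = snowflake.prod
  name     = "SANDBOX"
}
```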

DataLakeHouse.io is our site, where you can find out more information about us. We are essentially an iPaaS, end-to-end analytics SaaS platform with ELT and machine learning. We offer analytics, dashboards, and a data catalog, all of it with no code. 

Check us out on our site if you have any interest in learning more about Data Lake House. Right now we are booking health checks for Snowflake, where we will check through everything and let you know best practices and that kind of thing. 

So if you are interested in that, I can put the link for that out in the discussion board and I will update that. And we have another question here: do you support IBM Cloud? Yeah. So Snowflake doesn’t run in IBM Cloud.

Snowflake runs in the three most popular clouds: AWS, GCP, and Azure. However, data that sits on a share, regardless of where that data sits, can be ingested into Snowflake. It could be a flat file, it could be a database.

So from that perspective, DataLakeHouse doesn’t care where that data is. As long as you can get to it in some format that can be ingested, you’re good to go. All right, another question here in the Q&A: when we create Snowflake users through Terraform, is it best practice to keep the password out of the source code?

What do you guys say to that? So there are two ways to go about it. Definitely, if possible, it’s a good approach to do what we did and use the public and private key instead of connecting with a password.

That’s one thing. The other way, if you’re going to use it as part of some CI/CD pipeline, is to have your password stored as a secret. So, for example, if you have your Terraform set up as part of GitHub, you would store your password in Secrets.

You would never hard-code your password. You would use secrets. It’s easy to maintain and also keeps some form of confidentiality around the password being used. No other questions in the Q&A right now, but feel free to reach out through that account we have on here if you’re interested in anything else.
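A minimal sketch of keeping the password out of source control, assuming password authentication and a 0.x provider release that accepts a password argument: the value is declared as a sensitive input variable and supplied at run time, for example through the TF_VAR_snowflake_password environment variable populated from your CI system’s secret store. The variable name and connection values are placeholders.

```hcl
# No default value: Terraform must be given the password at run time,
# e.g. via the TF_VAR_snowflake_password environment variable.
variable "snowflake_password" {
  type      = string
  sensitive = true # redacts the value in plan and apply output
}

provider "snowflake" {
  account  = "xy12345"       # placeholder
  region   = "us-east-1"     # placeholder
  username = "TERRAFORM_SVC" # placeholder
  password = var.snowflake_password
  role     = "SYSADMIN"
}
```

Note that sensitive only hides the value in CLI output; values can still end up in the state file, so the state file itself should be stored somewhere access-controlled.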

And thank you guys so much for joining us. We appreciate everyone coming!
