Recently I was running yet another deployment of SAS Viya to the AWS cloud. In particular, I was stepping through my established process to ensure it still worked with the stable-2021.1.5 release. But early on, I hit a snag. It occurred before I even got to download my SAS Viya order, when I tried to provision the cloud infrastructure that SAS Viya would run on. Of course, I was using the Infrastructure as Code project, viya4-iac-aws, to do that. I thought it would be helpful to share the experience with you, not just to solve this one problem but as an example of what might happen. Now that we're optimizing for continuous integration and continuous delivery (CI/CD) of software, this kind of problem can happen at any time. We all need to practice troubleshooting the kinds of problems that will occur as changes are continuously introduced to the software, the deployment toolchain, the infrastructure, and so on.
The viya4-iac-aws project (and its variations for -azure and -gcp) is very useful and makes quick work of standing up the cloud infrastructure that's needed for SAS Viya. As a guiding concept, remember that the sample files are provided to quickly provision a pre-determined set of machines, but they also act as a guide to follow when crafting infrastructure specifically suited to your customer's needs. It's expected that you'll need to make changes: the IaC sample provisioning files are likely not what your customer will actually use for a real working environment.
I already have an established process consisting of files and scripts that I've successfully used to deploy SAS Viya many times over. The last time that I ran it was a couple of weeks ago. I was ready to give it another go when the stable-2021.1.5 release came out.
So keep in mind from this point that the challenge is not with SAS Viya - because I haven't even gotten to download the order yet.
When I got to the step where I run the terraform plan command - which takes as input a .tfvars file that I built based on the IaC samples to describe the Kubernetes cluster I want - it failed with an error.
Basically, Terraform is complaining it cannot build the plan file because one variable in particular - var.create_nfs_public_ip - isn't set properly. The whole point of the .tfvars file is to define variables - and sure enough, when I look in mine, there's no reference to that var.create_nfs_public_ip variable.
I download and build my local copy of the viya4-iac-aws project each time I stand up a new environment. And my procedure relies on using the latest version of viya4-iac-aws. As discussed in my previous post about Contemplating Version Pinning for CI/CD, this could lead to unexpected errors over time as the viya4-iac-aws project is updated with new features and improvements.
When I see an error message like this one - where Terraform is stumbling on an undefined variable - it prompts me to ask what new functionality has been added to the viya4-iac-aws project. For this particular variable, I went and looked at the samples to see when they were last updated and what changes were made.
That's when I saw that viya4-iac-aws/examples/sample-input-minimal.tfvars had been changed recently, with some new lines added at the end. My current project's .tfvars file, by contrast, ends without them.
In short, the IaC team has modified their process slightly to make the creation of an NFS server parameter-driven using these new variables (along with a change to the Jump server's creation process).
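If you'd rather not eyeball the two files, the comparison described above can be roughed out in a few lines of shell. This is only a minimal sketch, not something that ships with viya4-iac-aws: the function names tfvars_names and missing_tfvars and the file name my-cluster.tfvars are my own placeholders, and it assumes every top-level assignment starts in the first column as `name = value`.

```shell
# Print the top-level variable names assigned in a .tfvars file, sorted.
# Assumption: each top-level assignment starts in column 1 as `name = value`;
# indented keys inside maps or blocks are deliberately ignored.
tfvars_names() {
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_-]*\)[[:space:]]*=.*/\1/p' "$1" | sort -u
}

# Print names set in the sample file ($1) that your own file ($2) never sets.
# The body runs in a subshell so the temp-file variables don't leak out.
missing_tfvars() (
  sample_names=$(mktemp) && own_names=$(mktemp) || exit 1
  tfvars_names "$1" > "$sample_names"
  tfvars_names "$2" > "$own_names"
  comm -23 "$sample_names" "$own_names"   # keep lines unique to the sample
  rm -f "$sample_names" "$own_names"
)

# Example call (my-cluster.tfvars is a hypothetical file name):
#   missing_tfvars viya4-iac-aws/examples/sample-input-minimal.tfvars my-cluster.tfvars
```

Treat the output as a list of candidates to review rather than a list of required settings, since a sample may set variables that you don't need for your environment.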
The fix here is pretty easy - I added the new lines to my local .tfvars file and re-ran the terraform plan step. After that completed successfully, I continued on with the rest of my deployment process.
Once I was happy with the outcome, I updated my saved .tfvars file so future iterations of the process will get those new variables as well.
This is the right approach for me because I want to keep up with the latest changes in the IaC and I'm comfortable troubleshooting challenges on short notice. But this might not be the right approach for you or your customer depending on your objectives.
The alternative would be to use an older version of the viya4-iac-aws project from before these new variables were introduced. The Tags page for viya4-iac-aws on GitHub lists the project's release versions.
Therefore, another approach I could've used to fix this problem would be to select a slightly older release of the viya4-iac-aws project and use that instead. That way, the IaC would use its older method for provisioning the NFS server (and Jump server) that doesn't need the new variables.
Switching to a different tag is pretty easy. From the host machine where you're running the viya4-iac-aws project, get a listing of available tags with the git tag command.
Then use the git checkout command to switch over to a different tag (choosing 2.1.0 here).
Since we just want to use the files in here - not make changes to push back to the remote git repo - we can ignore the suggestion about branching that git prints after the checkout.
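The tag listing and checkout steps above can be sketched as two small shell helpers. The function names list_release_tags and use_release_tag are just my own labels for illustration, and the sketch assumes you pass in the path to your local clone of viya4-iac-aws and that the tag is named exactly 2.1.0.

```shell
# List the tags in a local clone ($1), highest version number first.
list_release_tags() {
  git -C "$1" tag --list --sort=-version:refname
}

# Check out one tag ($2) in the local clone ($1).
# Git reports a "detached HEAD" state and suggests creating a branch;
# that's expected when you only want to use the files at that tag.
use_release_tag() {
  git -C "$1" checkout "$2"
}

# Example calls, assuming the clone lives in ./viya4-iac-aws:
#   list_release_tags viya4-iac-aws
#   use_release_tag viya4-iac-aws 2.1.0
```

Sorting by version:refname rather than alphabetically keeps a tag like 2.10.0 above 2.2.0, which makes it easier to pick the release just before the one that introduced a change.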
Pinning to a particular version in this way helps maintain a repeatable process that's likely to break less frequently. The tradeoff is that you shouldn't allow yourself to get too complacent and ignore the march of time. Eventually, even a pinned known-good version is likely to fail after enough time has passed because of changes to the cloud provider, security enhancements, etc. And if you've successfully ignored a large number of changes over a long period, then bringing this step of the process forward to the current day might entail a lot of work on your part.
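Since my own procedure downloads a fresh copy of the project each time, the pin could also be applied right at download time. Here's a minimal sketch of that idea, not my actual script: clone_pinned and pinned_version are hypothetical helper names, and the repository URL in the example comment is my assumption about where the project lives on GitHub.

```shell
# Clone a repository at one specific tag into a fresh directory.
# $1 = repository URL or path, $2 = tag to pin to, $3 = destination directory
clone_pinned() {
  git clone --quiet --branch "$2" "$1" "$3"
}

# Print the tag that exactly matches the checked-out commit in clone $1,
# so a deployment log can record which release was actually used.
# Fails (non-zero exit) if the checked-out commit has no tag.
pinned_version() {
  git -C "$1" describe --tags --exact-match
}

# Example calls (the URL is an assumption; 2.1.0 is the tag chosen earlier):
#   clone_pinned https://github.com/sassoftware/viya4-iac-aws.git 2.1.0 viya4-iac-aws
#   pinned_version viya4-iac-aws
```

Keeping the tag as an explicit argument means that moving the pin forward later is a one-word change, which helps avoid letting the pinned version drift too far behind.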
The interesting thing here is how much control we now have over what we're working with - including the ability to roll back to earlier releases as needed.
Even so, this is a pretty simple example of the kinds of challenges you might face working with SAS and other vendors' tools that rely on CI/CD concepts for delivery and operation. Troubleshooting these kinds of challenges with automated scripting tools like the viya4-iac-aws project (and the viya4-deployment project) is becoming a normal part of the job for SAS personnel working with CI/CD pipelines that range from informal to highly structured. We're expected to identify the primary problem, sleuth around to find the answer, and then craft a solution that's in line with the long-term objectives. These are baseline skills we all need to practice and master.
Going further, if the problem turns out to be a bug in the code, then it's good practice to report the issue to the project at a minimum. Even better, if you're able to fix the problem, you might consider branching the project, coding in the fix, and then sending a pull request so that other users of the project can benefit from your solution.