New Approach — CI/CD for Synapse Analytics Pipelines with Azure DevOps yaml pipelines (Part 1.1)

Stefan Graf
4 min read · Feb 3, 2023

Photo by Danil Shostak on Unsplash

In this short story I’ll show you the new and improved way to implement a Synapse Analytics CI/CD pipeline using Azure DevOps. It is similar to the new approach in ADF and no longer requires the publish branch. This means you can deploy directly from your main/collaboration/whatever branch, which lets you integrate Synapse pipeline deployment fully into your preferred branching strategy.

For reference, the old way is described here: CI/CD for Azure Synapse Analytics Pipelines with Azure DevOps yaml pipelines (Part 1) | by Stefan Graf | CodeX | Medium

Background

Before we jump in, we need to clarify some basic concepts of how this CI/CD pipeline will work. The foundation is the repo integration provided by Azure Synapse Analytics. This gives us the capability to connect our Synapse Workspace to either Azure DevOps or GitHub (Enterprise) repos.

All you need to do is create the Git connection by providing basic information about your repo, and you’re ready to start. This Git integration then replaces your standard “Synapse live” code management. Additionally, you need to define two branches: the collaboration branch (most likely your main branch, where the most current stable version of your code lives) and a publish branch (called workspace_publish by default, created automatically by the system, where your code lives in an ARM-template-style format).

Be aware, though, that according to the Microsoft documentation, not every Synapse artifact is a pure ARM resource anymore, unlike in ADF. That means we have to use another way of deploying, because simply deploying ARM templates won’t work anymore.

The new CI/CD approach

Because we can now deploy from any branch we want, the new approach lets you use your preferred branching strategy. For example, you can develop with a trunk-based branching strategy and deploy your Synapse artifacts directly from your main and release branches.

You don’t even need to click the publish button anymore. Be aware, though, that on your Dev Synapse resource the workspace does not run the state of the main/collaboration branch; it always runs what is published in live mode. New code can be brought into live mode either by using the publish button again, or by including your Dev environment in your deployment pipelines so that this step is automated as well.

Just FYI: A similar new way of deploying your solution is also available for ADF.

The old approach for reference

The old approach to handling CI/CD with Azure Synapse differs quite a lot from the new one. The only branch you can deploy from in that setup is the publish branch (workspace_publish). This branch is created or updated when you press publish in your Synapse UI after you have made changes.

The actual working branch, where all the Pull Requests are integrated to implement new features, is the collaboration branch (main branch). This is also the base for your publish branch.

CI/CD Concept Synapse

Azure DevOps yaml Pipeline

CI

Building is pretty much already done, because everything is already prepared as a deployment-ready resource that doesn’t need any build process. It is still recommended to run this CI step to package your code for traceability and reusability.
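A minimal sketch of what such a packaging step could look like, using the built-in CopyFiles@2 and PublishPipelineArtifact@1 tasks. The repo folder name "synapse" and the artifact name "synapse_artifacts" are assumptions made up for this example, so adjust them to your repo layout.

```yaml
# CI sketch: package the Synapse artifacts from the checked-out branch.
# Assumption: the workspace JSON files live in a repo folder called "synapse".
# Assumption: "synapse_artifacts" is a hypothetical artifact name.
steps:
  - task: CopyFiles@2
    displayName: Copy Synapse artifacts
    inputs:
      SourceFolder: '$(Build.SourcesDirectory)/synapse'
      Contents: '**'
      TargetFolder: '$(Build.ArtifactStagingDirectory)/synapse'

  - task: PublishPipelineArtifact@1
    displayName: Publish Synapse artifacts
    inputs:
      targetPath: '$(Build.ArtifactStagingDirectory)/synapse'
      artifact: 'synapse_artifacts'
      publishLocation: 'pipeline'
```

Publishing the folder as a pipeline artifact means every deployment can be traced back to the exact run and commit that produced it.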

CD

This task is also quite easy, because you can use a predefined task in Azure DevOps called “Synapse workspace deployment@2”. Here you only need to specify your target Synapse Workspace and authenticate via a Service Connection (Subscription).
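As a rough sketch, a deployment step with this task could look like the following. The task comes from a Marketplace extension that has to be installed in your Azure DevOps organization. The input names below follow my understanding of the extension's documentation, so please verify them against the version you have installed. All values in angle brackets are placeholders, and the artifact folder name "synapse_artifacts" is a hypothetical example.

```yaml
# CD sketch: validate and deploy the Synapse artifacts to the target workspace.
# Assumption: input names match the installed version of the extension.
# Assumption: the artifacts were downloaded to $(Pipeline.Workspace)/synapse_artifacts.
steps:
  - task: Synapse workspace deployment@2
    displayName: Deploy Synapse artifacts
    inputs:
      operation: 'validateDeploy'
      ArtifactsFolder: '$(Pipeline.Workspace)/synapse_artifacts'
      TargetWorkspaceName: '<target workspace name>'
      azureSubscription: '<service connection name>'
      ResourceGroupName: '<resource group name>'
      DeleteArtifactsNotInTemplate: true
```

The 'validateDeploy' operation is what lets the task work from the artifact files of any branch instead of the generated templates in the publish branch.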

Additionally, you need to turn your triggers off for a clean deployment without any unwanted behaviours. There is also a reusable task for this in Azure DevOps called “toggle-triggers-dev@2”.
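A minimal sketch of stopping all triggers before the deployment. Again, the input names are taken from my reading of the extension's documentation and should be checked against your installed version, and the values in angle brackets are placeholders.

```yaml
# Sketch: stop all triggers in the target workspace before deploying.
# Assumption: input names match the installed version of the extension.
steps:
  - task: toggle-triggers-dev@2
    displayName: Stop Synapse triggers
    inputs:
      azureSubscription: '<service connection name>'
      ResourceGroupName: '<resource group name>'
      WorkspaceName: '<target workspace name>'
      ToggleOn: false
      Triggers: '*'
```

Running the same task again after the deployment with ToggleOn set to true should start the triggers again.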

CI/CD put together

Now both are put together in a fully working yaml pipeline. Keep in mind that a Windows VM is needed, because the predefined tasks we are using are based on PowerShell scripts, which didn’t work on an Ubuntu machine for me. In this case, the trigger fires whenever code is pushed to the main branch.
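The overall shape of such a pipeline can be sketched as follows. This is only a skeleton under assumptions, not the original pipeline: the stage and job names are made up, and the echo steps are placeholders that mark where the packaging, trigger-toggling and deployment tasks would go.

```yaml
# Skeleton sketch: trigger on main, run on a Windows agent, CI stage before CD stage.
# Assumption: stage and job names are hypothetical; echo steps are placeholders.
trigger:
  branches:
    include:
      - main

pool:
  vmImage: 'windows-latest'

stages:
  - stage: CI
    jobs:
      - job: Package
        steps:
          - script: echo "copy and publish the Synapse artifacts here"

  - stage: CD
    dependsOn: CI
    jobs:
      - job: Deploy
        steps:
          - script: echo "stop triggers, deploy to the workspace, restart triggers here"
```

Keeping CI and CD in separate stages lets you add further CD stages later, for example one per environment, that all deploy the same packaged artifact.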

Conclusion

This story should enable you to use Synapse in a production setup while still using your preferred branching strategy.

Written by Stefan Graf

Data Engineer Consultant @Microsoft — Data and Cloud Enthusiast
