Modern Data Architecture in Azure: Where Should I Start?
It can be difficult to know where to begin with modern data architecture. Sometimes it feels like we get too lost in the cloud. In this post, we’ll let you know just where to start with Azure.
Many organizations want to move to a cloud data solution, but they suffer from a degree of "analysis paralysis." So what should come first?
Getting raw data into the cloud is a key step, so the data ingestion process is a natural starting point. Ingestion is the first step in building a data lake or any other persistent data store in the cloud, because the data has to get there somehow. Ingestion will always be a need, and it will likely evolve over time. There is no better place to begin the journey than at the beginning.
If your chosen platform is Microsoft Azure, you can start ingesting data with Azure Data Factory (ADF). While many ETL/ELT tools can handle ingestion, ADF offers a low barrier to entry in the Azure ecosystem: it is readily available, requires no additional licensing, and is fairly simple to pick up. You can build a simple pipeline in minutes using well-known ADF patterns and examples from Microsoft's documentation.
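To make that concrete, here is a minimal sketch of what a simple ADF copy pipeline definition looks like, expressed as a Python dict mirroring the pipeline's JSON. The pipeline, activity, and dataset names ("IngestCustomerData", "SourceDataset", "LakeDataset") are hypothetical placeholders, and the datasets would be defined separately in your Data Factory; treat this as an illustration of the shape, not a drop-in definition.

```python
import json

# Hypothetical copy pipeline: one Copy activity moving data from a source
# dataset to a data lake dataset. Dataset names are placeholders you would
# define separately in ADF.
pipeline = {
    "name": "IngestCustomerData",
    "properties": {
        "activities": [
            {
                "name": "CopySourceToLake",
                "type": "Copy",
                "inputs": [
                    {"referenceName": "SourceDataset", "type": "DatasetReference"}
                ],
                "outputs": [
                    {"referenceName": "LakeDataset", "type": "DatasetReference"}
                ],
                "typeProperties": {
                    "source": {"type": "DelimitedTextSource"},
                    "sink": {"type": "DelimitedTextSink"},
                },
            }
        ]
    },
}

# Emit the JSON you would see in the ADF authoring UI's code view.
print(json.dumps(pipeline, indent=2))
```

Even a single copy activity like this is enough to stand up an MVP ingestion flow; everything else can be layered on later.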
Aside from easy tooling, another reason to start with ingestion is that it is an excellent place to begin identifying challenges and easy wins. Organizations often have surprisingly little institutional understanding of the breadth and depth of their data. Undertaking the effort to list and ingest data sources into Azure is a good first move. Just cataloging all of your data sources can help reveal the level of data maturity in your organization. That can in turn assist you in understanding what you need out of the larger cloud data solution.
Keep in mind that iterative development is simply how things are done in the cloud era. You can start with a simple pipeline that moves the data from source to sink as a Minimum Viable Product. Then, you can add more capability and maturation as you gain understanding of your data, your ingestion needs, and the ADF tool.
Your initial pipelines need not be sophisticated; they can simply serve the purpose of getting data into the lake. Once you know what kinds of data you need to ingest, and how much of it, you can enhance your ingestion solution. For example, you might refactor pipelines to leverage ADFv2 features like parameters and variables, making them more flexible and decoupling them from specific data sources.
So where should you start on your Azure data solution? Start at the beginning!