Azure Deployment POCs for the Cloud Ready Content Analysis System
Solution Sneak Peek
Metalogix builds boxed products and SaaS that help customers spend less time on IT administration and more on their core business functions. To serve those customers better, the company needed to make its Sensitive Content Manager product cloud ready and scalable enough to handle workloads of 1-2 TB an hour.
Metalogix came to Softwarium for help creating proofs of concept (POCs) for the product. Doing so required understanding the advantages and disadvantages of both virtual machines (VMs) and serverless computing. One of the biggest advantages of VMs is complete control: you can add any component you wish. However, VMs come with scalability issues that must be overcome. Because each VM runs its own operating system, it adds substantial overhead in RAM and storage footprint.
Metalogix needed to test two proofs of concept for running the product in the cloud. The first was to structure the product to take advantage of the computing power of Azure Virtual Machines. The main drawback of VMs is that they can sit idle, consuming resources without doing any work.
The second possibility was to use Azure serverless computing, which takes seconds to start working. This would allow Metalogix to move away from physical on-premises servers and rely on external cloud-based servers that are run and maintained by the cloud provider, here Microsoft Azure. In addition to speed, serverless can lead to significant cost reductions because it works as FaaS (Function as a Service): it is available whenever you need it, and the infrastructure scales the work for you.
After studying the advantages and disadvantages of each hosting option, along with the technical requirements, Softwarium built two POCs and reduced wasteful operational costs. This was especially challenging because Metalogix had a data flow of about 2 TB an hour.
Creating the POCs
We tested the serverless concept first and immediately ran into a constrained execution environment, which left the system unable to read certain content such as PDFs. The second issue was that serverless was not infinitely scalable. Third, the cost of the workloads was very high given the number of executions, which made the approach unsustainable from a budget point of view.
Based on these three issues, we decided to use VMs with containers. This allowed us to overcome the scaling issues: containers virtualize the operating system, so multiple workloads can run on a single operating system instance.
Making the Product Cloud Ready
Now that we knew which POC worked, we needed to make the product cloud ready, which presented certain difficulties. The workloads were CPU intensive, which slowed the service and created a constant need to add virtual machines. The database was an additional problem, since the secure SQL server was not horizontally scalable.
Let’s take a look at how we used Kubernetes to overcome these issues and create a solution that could handle the necessary traffic and scale to accommodate additional users at a moment’s notice.
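The case study does not describe how scaling was configured. As an illustration only, the sketch below applies the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler, which is one common way to scale CPU-bound workloads. The function name, the clamping bounds, and the sample utilization figures are assumptions made for this example, and the autoscaler's tolerance band is left out for brevity.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent, target_cpu_percent,
                     min_replicas=1, max_replicas=100):
    """Replica count following the documented HPA rule:
    ceil(currentReplicas * currentMetric / targetMetric),
    clamped to the [min_replicas, max_replicas] range (assumed bounds).
    """
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical figures: 4 pods running at 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))
```

When utilization falls below the target, the same rule shrinks the deployment, which addresses the idle-resource problem mentioned earlier.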
Dismantling the Monolithic Architecture
We broke the monolithic architecture down into three pieces: file uploading, working with business information, and results publishing. Each of these pieces could be deployed as virtual machines, containers, or serverless functions.
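To make the three-way split concrete, here is a minimal sketch in which each piece is an independent stage connected to the next by a queue. It is not the actual Sensitive Content Manager code: the stage names, the keyword check standing in for the business logic, and the in-process queues are all assumptions. In a real deployment each stage would be its own service behind a message broker.

```python
import queue
import threading

STOP = None  # sentinel that tells the next stage there is no more work


def upload_stage(files, out_q):
    """Hypothetical upload step: hand each (name, text) pair downstream."""
    for item in files:
        out_q.put(item)
    out_q.put(STOP)


def analysis_stage(in_q, out_q):
    """Hypothetical business-logic step: flag text containing a keyword."""
    while (item := in_q.get()) is not STOP:
        name, text = item
        out_q.put((name, "confidential" in text.lower()))
    out_q.put(STOP)


def publish_stage(in_q, results):
    """Hypothetical publishing step: record the verdict for each file."""
    while (item := in_q.get()) is not STOP:
        name, flagged = item
        results[name] = flagged


def run_pipeline(files):
    uploaded, analyzed, results = queue.Queue(), queue.Queue(), {}
    stages = [
        threading.Thread(target=upload_stage, args=(files, uploaded)),
        threading.Thread(target=analysis_stage, args=(uploaded, analyzed)),
        threading.Thread(target=publish_stage, args=(analyzed, results)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return results


print(run_pipeline([("a.txt", "Confidential report"), ("b.txt", "Lunch menu")]))
```

Because the stages communicate only through queues, the CPU-heavy analysis step can be given more workers without touching the upload or publishing code.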
Containers share a single OS, which increases deployment speed and portability while lowering the total cost of ownership. They also provide an environment where microservices can be deployed, managed, and scaled independently.
The old database was replaced with a secure Azure Cosmos DB database, which scales much more readily with the workload. It also allowed Metalogix to save money, since they paid only for what they used, whereas the on-premises database was running all the time.
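Horizontal scalability in a database generally comes from spreading records across partitions by hashing a key. The sketch below shows that general idea only. It does not use the Azure SDK and is not how the project's data was actually partitioned; the function names and document IDs are hypothetical.

```python
import hashlib


def partition_for(key, partition_count):
    """Map a key to a partition index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def distribute(keys, partition_count):
    """Group keys by the partition they hash to."""
    buckets = {index: [] for index in range(partition_count)}
    for key in keys:
        buckets[partition_for(key, partition_count)].append(key)
    return buckets


# Hypothetical document IDs spread over four partitions.
buckets = distribute([f"doc-{n}" for n in range(1000)], 4)
print({index: len(keys) for index, keys in buckets.items()})
```

Each partition can live on a different node, so storage and throughput grow by adding partitions instead of enlarging a single server.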
The New Content Analysis System
Metalogix received the content analysis system that they wanted, and it met all of the criteria set out at the very beginning. We reworked the app from on-premises to cloud ready, arrived at the needed configuration, and processed 100 GB an hour with 40 virtual machines that could be scaled. All of this allowed Metalogix to save money on computing power and storage while providing their customers with outstanding service. However, it is important to note that even though FaaS was not the best approach for this specific solution with these specific workloads, every solution and workload is unique and all “cloud ready” options should be considered.
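As a back-of-the-envelope check on the reported figure of 100 GB an hour with 40 virtual machines, the sketch below works out the per-VM throughput and how many VMs the 1-2 TB an hour target would need. It assumes throughput scales linearly with VM count and that 1 TB equals 1000 GB. Neither assumption comes from the case study, and real scaling is rarely perfectly linear.

```python
import math


def per_vm_throughput_gb(total_gb_per_hour, vm_count):
    """Average throughput of one VM, in GB per hour."""
    return total_gb_per_hour / vm_count


def vms_needed(target_gb_per_hour, gb_per_vm_per_hour):
    """VMs required for a target rate, assuming linear scaling."""
    return math.ceil(target_gb_per_hour / gb_per_vm_per_hour)


rate = per_vm_throughput_gb(100, 40)  # figures reported in the case study
print(rate)                    # 2.5 GB per hour per VM
print(vms_needed(1000, rate))  # 400 VMs for 1 TB an hour
print(vms_needed(2000, rate))  # 800 VMs for 2 TB an hour
```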