Azure Cost Optimization: Ways to Reduce Your Azure Cloud Costs

We started working on this legacy project last year under time pressure, since the client wanted to go to production as soon as possible.

First of all, our main goal was to start testing immediately. We knew there were a few bugs in the system, so we fixed them and tried to release the changes. The previously implemented CI/CD pipeline handled the deployment, and the CI job ran for more than half an hour. Only after a successful run were the changes testable. As testing continued, the number of bugs kept growing, so we had to make many fixes and release them into the testing environment. Every release took at least 30 minutes, no matter how small the change was. We realized that at this speed of development we were never going to meet the target deadline.

We looked into the CI/CD pipeline and were shocked at how complex the process was. We had to spend a couple of hours figuring out how it worked. Once we understood all the custom scripts and what each part of the pipeline did, we decided we had to refactor the whole thing. The project was built multiple times during a single CI job. There was a custom solution to detect which part of the repository had been modified and which part of the code needed to be rebuilt. This solution probably had a few issues, because the whole project was rebuilt every single time. We decided to remove every custom script and solution and use only tools accepted by the industry and understood by a wide range of developers. This way, we minimized the risk of errors in our own tooling and did not have to spend any time developing the tools. Another benefit of using only existing tools is easier and quicker onboarding of new developers into the project.

We introduced Nx to handle the monorepo, so only the parts of the project affected by a change were rebuilt. For deployment we introduced Argo CD, which made the process more understandable and less error-prone. We also eliminated the unnecessary steps from the CI jobs.
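The idea behind rebuilding only the change-related parts can be sketched in a few lines. This is a minimal illustration and not how Nx is actually implemented: the project names, the dependency map, and the assumption that each project lives in its own top-level folder are all hypothetical.

```python
from collections import deque

# Hypothetical monorepo: each project maps to the projects it depends on.
DEPENDENCIES = {
    "web-app": ["ui-lib", "api-client"],
    "admin-app": ["ui-lib"],
    "api-client": ["shared-types"],
    "ui-lib": [],
    "shared-types": [],
}


def affected_projects(changed_files, dependencies):
    """Return the projects that need a rebuild for the given changed files."""
    # Assumption: the first path segment is the project name.
    changed = {path.split("/")[0] for path in changed_files}
    changed &= set(dependencies)  # ignore files outside any known project

    # Invert the graph: project -> projects that depend on it.
    dependents = {name: set() for name in dependencies}
    for project, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(project)

    # Walk from the changed projects to everything that uses them,
    # directly or indirectly.
    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return sorted(affected)


print(affected_projects(["shared-types/src/user.ts"], DEPENDENCIES))
```

In this made-up graph, a change in shared-types triggers a rebuild of api-client and web-app as well, while admin-app and ui-lib are left alone. A change that touches no project at all triggers nothing.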

With these changes we managed to cut the code release process from more than 30 minutes to around 5-6 minutes, depending on the changes. This gave us a more reliable and much quicker way to release the code into any environment. We had a quicker feedback loop, and the developers could spend less time on releasing and more time on fixing bugs and developing change requests.

It is hard to put numbers behind these changes, but the development became more effective. The numbers related to the infrastructure are easier to examine, since we can see them in the cost management screen of the Azure portal.

When we took over the project, it had 2 environments: 1 development and 1 staging. There was no production environment at that time. By the nature of the project we needed more environments, since the application had to run with different configurations in different countries. We had already started talking about the next country, so we created the same development and staging environments for it. The infrastructure costs were skyrocketing, and they did not even include production. We immediately told the client this was not going to work in the long run. We agreed with them that, after fixing the major issues in the application, we would have to change the infrastructure in order to decrease the maintenance costs.

The legacy approach was that every environment had its own database and its own Kubernetes cluster. Everything was separated, which can be an advantage if you want to make sure that one system can't bring down another. But in our case, we had 6 Kubernetes clusters for development and testing purposes alone. Our costs were around 2500 pounds per month, and we hadn't even thought about production at that point.

We rebuilt the whole infrastructure from the ground up. We decided to have only 2 environments: 1 for development (and testing) and 1 for production. Each environment contains a managed database and a managed Kubernetes cluster, and the applications with different configurations are separated only virtually, into different namespaces inside the cluster. This way, we no longer have 6 different clusters running at 4-5% usage. We used Terraform to create the resources on Azure, so once we were ready for production, we just had to run the Terraform project with different parameters. After we switched to the new infrastructure at the end of last year, our monthly costs dropped to around 300-400 pounds while keeping the same (virtual) environments. After we added production, our monthly costs were around 700-800 pounds, which is still only about a third of the initial costs. These changes saved our client thousands of pounds.
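The savings can be checked with some quick arithmetic. The sketch below uses only the approximate monthly figures quoted above; the constant and function names are ours, and since the inputs are rounded, the results are rough as well.

```python
# Approximate monthly costs in pounds, as (low, high) ranges,
# taken from the figures quoted in the text.
LEGACY_DEV_AND_STAGING = (2500, 2500)  # 6 clusters, no production
NEW_WITHOUT_PRODUCTION = (300, 400)    # shared cluster with namespaces
NEW_WITH_PRODUCTION = (700, 800)       # development plus production


def monthly_saving(before, after):
    """Return the (smallest, largest) monthly saving between two cost ranges."""
    return (before[0] - after[1], before[1] - after[0])


def share_of(before, after):
    """Return the new cost as a (lowest, highest) fraction of the old cost."""
    return (after[0] / before[1], after[1] / before[0])


for label, new_cost in [
    ("without production", NEW_WITHOUT_PRODUCTION),
    ("with production", NEW_WITH_PRODUCTION),
]:
    low, high = monthly_saving(LEGACY_DEV_AND_STAGING, new_cost)
    share_low, share_high = share_of(LEGACY_DEV_AND_STAGING, new_cost)
    print(
        f"New setup {label}: saves {low}-{high} pounds per month, "
        f"costs {share_low:.0%}-{share_high:.0%} of the legacy setup"
    )
```

Even with production included, the new setup comes out at roughly 28-32% of the legacy development and staging bill, which is where the "about a third" figure comes from.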