Cloud infrastructure is becoming increasingly popular among companies that want to improve the efficiency of their IT processes. At the same time, creating and managing the infrastructure itself is a challenging task. This is especially true when it comes to the various layers, templates and configurations needed to ensure optimal system performance and security. In this article, we’ll look at how layers and templates can simplify cloud infrastructure management and what benefits they can bring to your company.
Why Do We Need Infrastructure Templates?
Infrastructure templates simplify cloud development and maintenance. They set functional boundaries between components, which streamlines infrastructure development. They also define component compatibility and let you deliver infrastructure as part of the product. Templates can “flow” from one team to another and enable the division of responsibilities between the platform team and the SRE (Site Reliability Engineering) team. Templates also make it possible to apply the GitOps approach to each infrastructure component.
For a broader understanding of the benefits of infrastructure templates, let’s take a look at the main ways they can be used.
Infrastructure Growth and Promotion
A pipeline is typically used to simplify the development and promotion of application features. Infrastructure deployment and testing can follow the same principle, using shared infrastructure templates and their versions from Git. This lets you integrate with your existing CI/CD solution and streamline feature development and promotion across different environments.
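The promotion idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the environment names, the `Promotion` class, the template name, and the version tag are all hypothetical, and a real setup would delegate these steps to your CI/CD tool.

```python
from dataclasses import dataclass, field

# Hypothetical promotion order; real pipelines may use different stages.
ENVIRONMENTS = ["dev", "staging", "prod"]


@dataclass
class Promotion:
    """Tracks which Git tag of a shared template each environment runs."""
    template: str
    versions: dict = field(default_factory=dict)

    def deploy(self, env: str, version: str) -> None:
        """Record `version` as deployed to `env`, enforcing the promotion order."""
        index = ENVIRONMENTS.index(env)
        # Every environment after the first may only receive a version
        # that is already running in the previous environment.
        if index > 0 and self.versions.get(ENVIRONMENTS[index - 1]) != version:
            raise ValueError(
                f"{version} must reach {ENVIRONMENTS[index - 1]} before {env}"
            )
        self.versions[env] = version


pipeline = Promotion(template="network-template")
pipeline.deploy("dev", "v1.4.0")
pipeline.deploy("staging", "v1.4.0")
pipeline.deploy("prod", "v1.4.0")
print(pipeline.versions)
```

Because every environment consumes the same versioned template, skipping a stage (for example deploying an untested tag straight to `prod`) is rejected by the sketch.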
Responsibility Delegation Within the Team
Infrastructure templates are especially useful in growing companies where several teams emerge and work together. For example, the platform team develops templates in the form of a complete Infra Template and passes them to the SRE team, which then uses them to deploy and operate the production infrastructure.
The same Infra Templates can also be used by developers and quality assurance engineers to create temporary development and testing infrastructures. Sharing one standardized template across teams simplifies collaboration.
Large business applications require complex infrastructure before they can be deployed. To simplify the deployment of an enterprise solution, vendors can include an infrastructure template with their product. This lets businesses set up the infrastructure needed to run the application without time-consuming manual configuration, and makes it easier for enterprises to launch the vendor’s product quickly.
Another helpful use case for templates is to quickly deploy new infrastructure for testing or research with minimal effort. To do this, you can select a template from a GitHub project or marketplace, set the necessary values, and launch the infrastructure in minutes.
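As a rough illustration of “set the necessary values”, the sketch below fills a tiny template with user-supplied values on top of defaults. The template body, the variable names, and the default values are assumptions made up for this example; real templates are usually files in a Git repository and are rendered by the tool that ships them.

```python
from string import Template

# Hypothetical template body; real templates are usually files in a Git repo.
TEMPLATE = Template(
    "cluster_name = \"$name\"\n"
    "region       = \"$region\"\n"
    "node_count   = $node_count\n"
)

# Defaults shipped with the template (assumed values).
DEFAULTS = {"region": "eu-central-1", "node_count": 2}


def render(values: dict) -> str:
    """Merge user values over the defaults and fill in the template."""
    merged = {**DEFAULTS, **values}
    # substitute() raises KeyError if a required value such as `name` is missing.
    return TEMPLATE.substitute(merged)


print(render({"name": "research-sandbox", "node_count": 1}))
```

Only the values that differ from the defaults need to be supplied, which is what keeps a test or research environment quick to launch.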
Templates can greatly simplify the workload for DevOps and SRE teams, especially when deploying and testing in complex environments.
Next, we’ll cover the key components (layers) needed to create infrastructure templates.
Today’s cloud infrastructures have many layers to consider. Let’s list the main ones:
- Network layer
- Permissions layer
- Infrastructure and OS layer
- Data management layer
- Application layer
- Observability layer
- Configuration layer
Various technologies and tools can be used to effectively implement these layers. Let’s look at how you can work with each of them.
Network Layer
In today’s cloud environments, software-defined networks and cloud APIs have made it simple to represent networks as code. This has largely eliminated the need for manual network configuration. There are two basic approaches to defining cloud networks in code:
- Using Terraform modules designed to define a cloud network. For example, Terraform modules can create VPC resources on AWS, deploy a GCP virtual private cloud (VPC), or use the Terraform Azure RM module for networking.
- Implementing networking at the container orchestration level, such as Kubernetes. Numerous Container Network Interface (CNI) plugins are available, including Calico (https://www.projectcalico.org/) and Cilium (https://cilium.io/). These tools usually come as Helm charts along with Kubernetes manifests, making them easy to install and use.
Permissions Layer
Detailed permissions are declared at different layers of the infrastructure. Here are a few common examples:
To set up infrastructure permissions, you can use Terraform modules for IAM resources and roles. For example, Terraform modules are available for creating IAM resources in AWS and for managing multiple IAM roles in Google Cloud.
When setting up permissions in Kubernetes, a popular choice is to use its native RBAC (role-based access control). In some cases, external authentication with OpenID Connect tokens or other tools may be required.
To link Kubernetes permissions with cloud roles, you may need to use technologies such as AWS IRSA, Azure Managed Identities, or GKE Workload Identity. This requires combining Terraform modules with Kubernetes manifests, using the output of one technology as input for the other.
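A minimal sketch of this hand-off for the AWS IRSA case: the script reads Terraform outputs and builds a Kubernetes ServiceAccount manifest that references the IAM role. The output name `app_role_arn`, the ARN, and the account name are made up for illustration, and the JSON shape is assumed to match what `terraform output -json` produces.

```python
import json

# Assumed shape of `terraform output -json`; the output name
# `app_role_arn` and the ARN itself are made up for this example.
TERRAFORM_OUTPUT = """
{
  "app_role_arn": {
    "sensitive": false,
    "type": "string",
    "value": "arn:aws:iam::123456789012:role/example-app"
  }
}
"""


def service_account_manifest(outputs: dict, name: str, namespace: str) -> dict:
    """Build a ServiceAccount that IRSA can map to the IAM role from Terraform."""
    role_arn = outputs["app_role_arn"]["value"]
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Annotation read by EKS to associate the account with the role.
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


manifest = service_account_manifest(
    json.loads(TERRAFORM_OUTPUT), "example-app", "default"
)
print(json.dumps(manifest, indent=2))
```

The resulting JSON can be applied like any other Kubernetes manifest. Azure and GKE use different annotations and identity mechanisms, so this sketch does not carry over to them unchanged.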
You can use Open Policy Agent (OPA) to define permissions and enforce policies. OPA is an open source engine for writing policies declaratively as code. It can be used to authorize requests to REST API endpoints, and also to define infrastructure policies that permit or disallow Terraform changes.
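The idea of permitting or disallowing Terraform changes can be shown with a small check over a plan. OPA policies are normally written in Rego; the Python below is not OPA itself, only a sketch of the same kind of rule. The plan excerpt is made up but follows the `resource_changes` structure of `terraform show -json`, and the protected resource type is a hypothetical policy choice.

```python
# Trimmed, made-up excerpt in the shape of `terraform show -json <planfile>`.
PLAN = {
    "resource_changes": [
        {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
         "change": {"actions": ["update"]}},
        {"address": "aws_db_instance.main", "type": "aws_db_instance",
         "change": {"actions": ["delete"]}},
    ]
}

# Hypothetical policy: these resource types must never be deleted.
PROTECTED_TYPES = {"aws_db_instance"}


def violations(plan: dict) -> list:
    """Return the addresses of protected resources that the plan would delete."""
    denied = []
    for change in plan.get("resource_changes", []):
        actions = change["change"]["actions"]
        if change["type"] in PROTECTED_TYPES and "delete" in actions:
            denied.append(change["address"])
    return denied


denied = violations(PLAN)
print("deny" if denied else "allow", denied)
```

A pipeline could run a check like this between `terraform plan` and `terraform apply` and stop when the list of violations is not empty.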
Infrastructure and OS Layer
You can also define the infrastructure layer with Terraform modules. The following are suitable for this purpose:
- Terraform module for setting up GKE clusters
- Terraform module for Azure compute
- Terraform module for automatic resource scaling on AWS
In some cases, you may need to further configure images with cloud-init and scripts, which can be done with HashiCorp Packer or Ansible.
Data Management Layer
Most cloud service providers offer managed storage options, for example AWS S3/EBS, Google Cloud Storage, and Azure Files and Blob Storage. Managed databases such as AWS RDS, Azure Cosmos DB and Google Cloud databases are also available. Using Terraform modules is the most common approach to describing such resources.
Application Layer
Applications hosted in the cloud are usually divided into two groups: infrastructure applications and business applications.
Applications in the first group are deployed together with the infrastructure and include tools such as Kafka, a service mesh, an ingress controller and a GitOps controller. Helm charts or bash scripts are typically used to deploy them.
To deploy business applications, CI/CD tools such as Jenkins and GitLab are typically used. Note that the lifecycle of business applications is different from that of infrastructure applications.
Observability Layer
Cloud service providers and external vendors offer a variety of monitoring and logging solutions, such as AWS CloudWatch, Google Stackdriver, and Datadog. These can be set up with Terraform.
For cluster monitoring and logging tools, Terraform or Helm charts are commonly used. This is typical for EFK or Prometheus/Grafana.
Configuration Layer
A Git repository is the most popular place to keep infrastructure code and configuration files. This method provides accountability and lets you return to the configuration as it was at a specific point in time.
It is also important to save the infrastructure state that results from applying a configuration. Because this state can contain confidential data, it is recommended to store it in a separate location to prevent potential security issues, for example in object storage or a database.
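One simple way to keep state out of the code repository is a guard that flags state files before they are committed. The sketch below assumes Terraform's local state file names (`*.tfstate`, `*.tfstate.backup`); the function name and the list of staged paths are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Assumed file-name patterns for local Terraform state; extend as needed.
STATE_PATTERNS = ["*.tfstate", "*.tfstate.backup"]


def state_files(paths: list) -> list:
    """Return the paths whose file name looks like a Terraform state file."""
    return [
        path for path in paths
        if any(
            fnmatch(PurePosixPath(path).name, pattern)
            for pattern in STATE_PATTERNS
        )
    ]


staged = ["network/main.tf", "network/terraform.tfstate", "README.md"]
blocked = state_files(staged)
print(blocked)
```

A check like this could run as a pre-commit hook or a CI step, while the state itself lives in the separate storage location.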
AWS SSM Parameter Store or Azure Key Vault can be used to store secrets and certificates. HashiCorp Vault or SOPS are alternative options for such data.
Let’s take a look at a diagram to see how we can depict a modern infrastructure using a specific set of tools for each layer:
Infrastructure templates greatly simplify the work of DevOps and SRE teams. They optimize deployment and testing in complex environments and make it more consistent. The SHALB team helps to implement such tools in large projects and startups. With years of experience, the company has integrated over 100 cloud infrastructures of varying complexity.