Hemang Garapati
Sr. Cloud Engineer | DevSecOps | Cloud Infra & SRE (Site Reliability Engineer)
About
Highly accomplished Senior DevOps/DevSecOps engineer with 10+ years of IT experience overall, including 6+ years in DevOps and Cloud Engineering and about 4 years as a Build & Release Engineer dedicated to automation and optimization. Understands and manages the space between operations and development to deliver features to customers quickly, with experience in cloud platforms and in DevOps automation development for Linux and Windows systems. Brings maturity, enthusiasm, and a drive to learn new technologies along with real-world experience.
Skills & Expertise
Work Experience
Sr Azure DevOps Engineer
ADP, Atlanta, GA
3-2021 - Present
• Orchestrated and integrated various Azure resources, such as virtual machines (VMs), storage accounts, virtual hard disks (VHDs), and storage pools, as part of Azure Infrastructure as a Service (IaaS) initiatives.
• Successfully migrated on-premises servers to Azure and established availability sets for enhanced reliability and fault tolerance.
• Implemented robust security measures, including VM hardening and disk encryption using the Key Encryption Key (KEK) in MS Azure.
• Managed repositories and branches on GitHub and collaborated with developers to resolve merge conflicts.
• Leveraged Terraform and Docker to create highly customized machine images and automated software installations through Ansible.
• Used Terraform templates and modules to automate provisioning of Azure virtual machines and deployed virtual machine scale sets in the production environment.
• Designed and implemented Ansible roles in YAML format, defining tasks, variables, files, handlers, and templates for streamlined deployment and management.
• Established a Kubernetes pipeline for efficient deployment and management of applications written in Java and Python.
• Managed containerized microservices, maintaining private container registries on Microsoft Azure with enhanced security using Active Directory.
• Integrated Azure Data Factory (ADF) with AKS for seamless deployment within the CI/CD pipeline, optimizing workflows.
• Managed servers on the Microsoft Azure platform using Ansible configuration management, automating system operations with Ansible playbooks, tasks, and roles.
• Designed and implemented a fully working multi-node Azure Kubernetes Service (AKS) cluster from scratch using Terraform.
• Developed and maintained Continuous Integration (CI) pipelines in Azure DevOps (VSTS) spanning multiple environments, enabling teams to safely deploy code to Azure Kubernetes Service (AKS) using YAML scripts and Helm charts.
• Monitored production servers using Grafana and Prometheus, integrated with Kubernetes for proactive issue identification and reporting.
• Installed Prometheus and Grafana using Helm, leveraging their monitoring capabilities within the Kubernetes cluster.
• Developed CI/CD pipelines within Azure Data Factory (ADF) and Databricks, tailored for machine learning (ML) flows.
• Worked closely with the Kafka admin team to set up Kafka clusters in the QA and production environments.
• Implemented Kafka producer and consumer applications on the Kafka cluster with the help of ZooKeeper.
• Used crontab and shell scripts to automate loading data into database tables with SQL*Loader.
• Installed, configured, upgraded, and maintained OpenShift clusters across public cloud, private cloud, on-premises, and edge infrastructure. Collaborated with developers, SREs, and cloud architects on OpenShift operations and cluster management.
• Performed OpenShift deployments of several types (RPM, CLI, and Ansible-based) and worked with cluster managers such as OKD and Red Hat Advanced Cluster Management.
• Integrated OpenShift with infrastructure components such as networking, storage, security, logging, monitoring, container registry, and CI/CD pipelines.
• Configured Pod scheduling, service accounts, secrets management, and resource quotas and limits on OpenShift.
• Provisioned Azure SQL DB instances through Terraform scripts and maintained server configuration settings and resource allocation.
• Implemented disaster recovery strategies for Azure SQL DB by configuring automated backups and creating servers in multiple regions to maintain high availability of DB servers.
• Ensured collection of high-quality telemetry data, including determining data types, collection intervals, and efficiency so that collection does not negatively impact system performance.
• Performed Linux system administration, including OS upgrades, security patching, and troubleshooting, to ensure maximum performance and availability.
• Set up and configured instrumentation in applications and infrastructure to collect telemetry data, using monitoring tools and services such as Azure Application Insights, Prometheus and Grafana, Azure Monitor, and Log Analytics.
• Used telemetry data to monitor the performance of applications and infrastructure against key performance indicators (KPIs) and analyzed performance trends to identify areas for optimization.