Profile image: Sambit Kumar Nayak, Senior Site Reliability Engineer from Bangalore

Sambit Kumar Nayak

available

Last update: 06.09.2022

Senior Site Reliability Engineer

Graduation: Bachelor of Technology in Software Engineering
Languages: English (Full Professional) | Hindi (Native or Bilingual)

Attachments

Sambit Kumar Nayak Resume.pdf

Skills

Senior Site Reliability Engineer with 4 years of experience in designing cloud infrastructure, on-premises to cloud migration, automated configuration management tools, DevOps automation management, Google Cloud Platform products, Linux server administration, and networking. Currently seeking a new position involving cloud platform design, planning and deployment.

Project history

09/2019 - 03/2020
Senior Site Reliability Engineer
Charmboard

Managing multi-cloud resources and services, multi-cloud resource migration,
troubleshooting production issues, and planning & designing the infrastructure

Roles & Responsibilities:

* Migrated monolithic apps from a virtual machine environment to a microservice architecture on Google Kubernetes Engine with
horizontal autoscaling of pods.
* Wrote Terraform templates for Azure infrastructure as code to build staging and production environments. Integrated
Azure Log Analytics with Azure VMs to monitor and store log files and track metrics.
* Worked on Azure Site Recovery and Azure Backup: deployed instances in Azure environments and in data centers, migrated
them to Azure using Azure Site Recovery, and collected data from all Azure resources using Log Analytics and analyzed it to
resolve issues.
* Configured Elasticsearch with Fluent Bit for central logging of container logs & production API servers.
* Optimized Charmboard search engine using Elasticsearch indexing & indices.
* Configured a high-availability Elasticsearch cluster using a 4-node data cluster & 2 master nodes for fault tolerance.
* Configured Azure Application Gateway & load balancers for traffic load balancing between 2 or more nodes.
* Monitoring and diagnosis of servers, APIs, and applications for optimal performance using Prometheus, Node exporter & Grafana.
* Configuring Prometheus servers for MySQL & MongoDB Node exporters & configuring Grafana to visualize the live traffic & major
issues with servers.
* Implemented a CI/CD pipeline using Azure DevOps (VSTS, TFS) in Azure cloud with Git, MSBuild, Docker, Maven along with
Jenkins plugins.
* Configured Percona Monitoring and Management 2 for MySQL & MongoDB query time optimization & alerts to developers for
fixing issues with MongoDB.
* Designed secure Azure architecture for Charmboard for migration, deployment & creation of resources using Azure DevOps with
Terraform (infrastructure as code). Experience in writing infrastructure as code (IaC) in Terraform and Azure resource management
in Azure cloud environments.
* Designed autoscaling infrastructure for API & web apps to handle high-volume live traffic at peak time using Stackdriver, GCP load
balancer & health check services.
* Creation of queuing and data-pipeline solutions such as ELK, Pub/Sub, and Dataflow.
* Identifying, gathering, analyzing and automating responses to key performance metrics, logs, and alerts.
* Designed & implemented the architecture for centralized logging.
* Provide architectural and practical guidance to software development to improve resiliency, efficiency, performance, and costs
* Developed and integrated new monitoring and testing solutions, ensuring that all company policies and
procedures were followed.
* Designed & automated the process of Releasing Android app for Charmboard to Google play store.
* Currently migrating resources from AWS to Azure as part of migrating all infrastructure resources to Azure.
* Writing Terraform code for managing & designing the Azure cloud infrastructure.
* Managing requests from the developers' team & applying the changes as per the requests.
* Managing Linux server creation, troubleshooting, and upgrade tasks.
* Maintaining StatusCake for SRE dashboards of all websites hosted on the production servers.
* Managing Userify licenses & configuration for centralized SSH authentication.
* Designing Dev & Production CI pipeline for the production team using Jenkins & Ansible.
* Containerizing existing apps as Docker images & storing them in Azure Container Registry.
* Dynamic Versioning for Charmboard Node.js Web applications.
* Troubleshooting Nginx web server issues & implementing SSL certificates for websites.
* Implementing New DevOps tools to production environment to shorten application release time from months to hours.
* Experienced in designing and deploying AWS solutions using EC2, S3, EBS, Elastic Load Balancer (ELB), and Auto Scaling groups.
* Responsible for managing infrastructure provisioning (S3, ELB, EC2, RDS, Route 53, IAM, security groups - CIDRs, VPC, NAT)
and deployment.
* Experience working with IAM to create new accounts, roles and groups.
* Experience in creating alarms and notifications for EC2 instances using CloudWatch.
* Implemented AWS solutions using EC2, S3, RDS, Elastic Load Balancer, and Auto Scaling groups.
* Involved in maintaining the user accounts (IAM), RDS, Route 53, VPC, DynamoDB and SNS services in AWS cloud.
* Experience configuring S3 versioning and lifecycle policies to back up files and archive files in Glacier.
* Created Ansible playbooks to improve and automate manual processes.
* Worked on installation and implementation of the Ansible configuration management system and used it to manage web
applications, environment configuration files, users, mount points and packages.
* Created inventory in Ansible for automating continuous deployment and wrote playbooks in YAML.
* Used Ansible playbooks to set up a continuous delivery pipeline. Deployed microservices, including provisioning GCP
environments, using Ansible playbooks.
* Installed a Docker registry for local upload and download of Docker images, as well as from Docker Hub.
* Created Dockerfiles to automate the process of capturing and using images.
* Responsible for installing Jenkins master and slave nodes. Configured Git with Jenkins and scheduled jobs using the Poll SCM
option.
* Experienced in branching, tagging and maintaining versions across environments using SCM tools like Git and Subversion
(SVN).
* Experience with CI (continuous integration) and CD (continuous deployment) methodologies with Jenkins.
* Analyzed and resolved conflicts related to merging of source code in Git.
* Used Ansible playbooks as monitoring scripts to identify and resolve infrastructure problems before they affect critical
processes, and worked on automatic restart of failed applications and services using Supervisor.

02/2019 - 09/2019
DevOps L2 Engineer
Sigmoid Analytics

Deployment, CI/CD pipeline design, Google Cloud Platform services orchestration

Roles & Responsibilities:
* Managed Snowflake & Sigview software design and development project across a 16-member development team while remaining
focused on meeting client needs for functionality, timeline and performance.

* Interfaced with cross-functional team of business analysts, developers and technical support professionals to determine
comprehensive list of requirement specifications for new applications

* Provisioned cloud infrastructure resources such as GCP Dataproc clusters, GKE clusters, GCP VMs, GCS buckets, and service
accounts using Terraform

* Monitored automated build and continuous software integration process to drive build/release failure resolution using Ansible

* Worked closely with software development and testing team members to design and develop robust data warehouse solutions to
meet client requirements for functionality, scalability and performance

* Drove project lifespan from concept to final rollout in software development, system deployment, testing and monitoring for cloud
resources

* Designed and built GoCD automation script tools and applications to deploy next generation platform

* Versed in complete software life cycle from preliminary needs analysis to enterprise-wide deployment and support

* Collaborated with cross-functional development team members to analyze potential system solutions based on evolving client
requirements

* Collaborated closely with product development teams and other stakeholders, using effective communication and active listening
skills

* Prepared detailed reports on updates to project specifications, progress, identified conflicts and team activities

* Modified existing Snowflake software source code to correct coding errors, upgrade interfaces and improve overall performance

* Wrote shell scripts for daily maintenance activities, including SnowSQL indexes and table data duplication or missing data analyses

* Designing CI/CD pipeline for ETL jobs, designing cloud infrastructure

* Migration of existing on-premises Hadoop jobs, storage & clusters to GCP Dataproc clusters

* Using SaltStack & Ansible for creation of resources & managing resources for clients

* Maintaining Vertica DB instances, databases, indexers, working on Jira trouble tickets, providing resolutions and RCA to the customers
within SLA

* Successfully completed migration of Hortonworks Cloudera DW clusters to Snowflake cloud DW clusters

* Working on & fixing data deduplication issues in cloud data warehouses such as BigQuery, Snowflake, Vertica DW

* Managing source code in GitHub for all branches, branching strategy creation & codebase maintenance

* Monitoring the running Dataproc and Google Kubernetes Engine clusters and GCP VMs with Stackdriver & Profiler services.

07/2018 - 02/2019
Build & Release Engineer
LogiwareInc

Designing CI/CD pipeline, software building & release management, and
troubleshooting the GCP cloud architecture for LogiwareInc.

Roles & Responsibilities:
* Designing, managing, and troubleshooting the Cloud Architecture Design Strategy for LogiwareInc.
* Managing Software Release Process, Creating Process for patch release and version release.
* Managing a development team of 10 members for the release cycles & interacting with them regarding new requirements
from customers.
* Migration of on-premises data center hosted Linux servers; database migration from traditional Oracle Database or MySQL
to Cloud SQL (GCP), RDS MySQL instances or ApsaraDB for RDS instances.
* Managing Google Cloud infrastructure resources; resource creation such as provisioning and deployment of VMs on
Compute Engine; creating DR sites for infrastructure.
* Creation of service accounts and users for the GCP console & granting appropriate IAM permissions & roles.
* Creating Cloud SQL database instances & configuring them for production.
* Backup planning for VMs and Cloud SQL instances as per company requirements & DR site creation in separate regions for
high availability.
* Implementation of SSL certificate & HTTPS web server in the production environment.
* Configuration of Cloud SQL instances: database creation, creation of users, defining access list permissions, network policy
governance, migration of databases from on-premises to cloud, cloud-to-cloud live migration of databases &
configuring slow query on the database instances.
* GitHub branch creation and management, GitHub authentication setup on builds, sub-branch creation, git cherry-picking
of committed code, managing GitHub users, access and permission setup on GitHub, checking out branch code on
build servers & building .war files for deployments.
* Giving demos to customers on our product & its benefits for the customer's container shipping business.
* Tagging each build after successful tests & making the production code server ready for the next build.
* Creating new branches on GitHub & Bitbucket if the client needs new changes or new tools added to the
software.
* Creation of resources, users, and groups, and defining authentication for compute resources and databases on Alibaba Cloud.
* Fixing product deployment issues & creating new deployments every day to fix master branch issues.
* Executing MySQL queries on production, stage or QA databases & defining the proper database for each customer,
configuring slow query on the production database, configuring high-availability failover in regions.
* End-to-end development & deployment support to customers for all incidents.
* Monitoring of running applications using Stackdriver (GCP): creation of dashboards and alert action policies to see availability
of all services & resources and actions during incidents.
* Live migration of production Linux servers and data between multiple clouds, such as AWS to GCP and GCP to Alibaba Cloud.
* Live migration of servers from AWS to GCP, migration of files and databases, Tomcat 7 and Apache web server setup.
* Providing training on Linux to the team, about the product, fixing production issues, and debugging application-related issues.

01/2018 - 06/2018
System Engineer
Ilabs Enterprise

Responsible for the design, deployment, implementation and support of cloud-based infrastructure
applications and their solutions.

Ilabs; Tesco India

Roles & Responsibilities:
* Responsible for the design, deployment, implementation and support of cloud-based infrastructure applications and their solutions.
* Complete end-of-day transactions; write & execute EOD SQL queries & update the databases with the total number
of transactions, sales & inventory records (EJ to DB POS Sales).
* Monitor & manage the servers and applications running in the production environment.
* Checking live sales transactions, transaction failures, and escalations every 45 minutes during shifts.
* Manage Windows virtual machines via the vSphere console, such as powering off or rebooting a VM.
* Involved in creating and monitoring different process flows using the AutoSys scheduling tool and providing solutions for job
failures.
* Participating in bridge calls during change management and incident management.
* Configuring Citrix desktop virtualization & administering XenServers hosted at Tesco DC across different time zones.
* Running & monitoring AutoSys batch jobs, monitoring CE nightly activities, and preparing reports per shift.
* Keeping track of Safe lock across Thailand & Malaysia Hyper, Thalad & Express Tesco stores, preparing Safe lock reports on
a daily basis & sending the reports to managers for auditing purposes.
* Fixing RTS (Real Time Sales) issues occurring at Tesco stores across the Europe region.
* Monitoring Splunk & AppDynamics Journeys (Orders), investigating failed Journeys & reaching out to the user via email or call if
any issues are reported.

11/2016 - 11/2017
Cloud Infrastructure Operation Engineer
QuintilesIMS

My primary job responsibility was provisioning resources & troubleshooting OS-related
issues on AWS cloud infrastructure (Windows, Linux), provisioning & deploying new
EC2 servers in the production environment, and maintenance of AWS resources.

Roles & Responsibilities:
* My primary job responsibility was to monitor & troubleshoot OS-related issues on AWS cloud infrastructure (Windows,
Linux).
* Provisioning & deploying new EC2 servers in the production environment; maintenance of AWS resources.
* Providing 24x7 infrastructure support in the production environment: monitoring hosts/nodes and
network devices for downtime, utilization, throughput, and response times using SolarWinds.
* Worked in a 24x7x365 shift-based operational environment providing level 2 response, solution, and escalation for alarms
from network, VM servers, and cloud-hosted application errors.
* Enabling CloudWatch for monitoring EC2 instances and Elastic Load Balancers.
* Experienced in maintaining VPCs and private subnets and distributing them as groups across various availability zones.
* Experience in maintenance and configuration of user accounts for DEV/QA/PROD servers, managing IAM roles, and setting
S3 bucket policy and lifecycle.
* Involved in the migration and implementation of multiple virtual machines & hosted applications from on-premises to cloud
using AWS VM migration tools.
* Managing & troubleshooting basic issues for Windows & Red Hat Linux servers reported by users.
* Manage, maintain, upgrade and monitor EC2 instances hosted in AWS environments.
* Deploying EC2 instances, creating and managing IAM policies, managing VPC, NACL and security groups, and monitoring servers in
CloudWatch dashboards.

05/2016 - 11/2016
NOC Engineer
StridesIT Services LLP

Client Company: Harman Connected Services
Roles & Responsibilities:
* NOC monitoring using Nagios and SolarWinds for network problem identification & coordinating with the network
team & ISP.
* Monitoring Windows servers, Linux servers, ESXi hosts, network and backup based on priority.
* Using SCOM/Nagios/SolarWinds for monitoring and Commvault for backup.
* Configuring hosts and host groups, services, and Nagios client agents in the Nagios admin panel for monitoring.
* Installation, configuration, and basic troubleshooting of Windows servers.
* In the event of a network outage, following a set of procedures to facilitate quick resolution: incident alert, tracking,
identification, isolation, notification and escalation for internal and external parties (ISPs or customers).
* Configuring Nagios XI for monitoring hosts and services, setting thresholds for Critical and Warning level alerts & support level.
* Coordinating with the concerned GOC/Core IT team if server CPU/memory/hard disk utilization crosses the threshold.
* Maintaining LTO storage tape archival backups on a weekly and monthly basis.

Local Availability

Only available in these countries: India
Available to work remotely