DevOps Talks week of Feb 17-21

There were lots of webinars this week in the DevOps space. I am still pushing VnV DevOps for DevOps and IT Service Management consulting, but I paused to attend several webinars from TechStrong and Cloud Canaries. Part 3 in the “Friends” series from OpenText covered “the one where automation gets ship done,” and another webinar presented an Edge AI success story around sports analytics. I love the Friends series because I really like what OpenText and Don Jackson (Field CTO, OpenText) are doing in this space: linking the DevOps Infinity Loop together and minimizing the touchpoints or approvals required in the pipelines. For Edge AI, Latent AI has joined forces with FSP to unveil a unique use case for moving AI processing from the cloud to the edge.

Cloud Canaries (Mark Callahan and Helen Beals) spoke about observability and how OpenTelemetry and agentic AI can help. Their product, Cloud Canaries, allows you to deploy canaries to strengthen observability and monitoring. Two focal points from Mark were bringing a low-cost solution to the table and providing an agentic AI solution around monitoring and observability. The questions every team asks are: do I really have to learn something new, and can I remove some of the old stuff? Agents allow you to “take over” other solutions, so a roll-up of solutions could eventually be on tap.

(Courtesy: Cloud Canaries)

Cloud Canaries works with Databricks and Snowflake to connect to ML and AI tools. They also offer their own neural network for a price. For forecasting, the focus is generally on events, not deploys or releases. When an event fires, a ticketing system like ServiceNow can be notified. The workflow for that canary can be displayed in the dashboard to determine whether the “learning” is correct or needs to be modified.
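
To make the event-to-ticket flow concrete, here is a minimal sketch in Python. It is not the Cloud Canaries API. The names ProbeResult, CanaryEvent, evaluate, and run_canary, and the thresholds, are assumptions made up for illustration, and the notifier is just a callback standing in for a ticketing hook.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ProbeResult:
    """Outcome of one synthetic check against a service (hypothetical shape)."""
    ok: bool
    latency_ms: float


@dataclass
class CanaryEvent:
    """An event raised when a canary sees unhealthy behavior."""
    canary: str
    reason: str


def evaluate(canary: str, results: List[ProbeResult],
             max_latency_ms: float = 500.0,
             max_bad_probes: int = 2) -> Optional[CanaryEvent]:
    """Return an event if recent probes look unhealthy, otherwise None.

    The thresholds are illustrative assumptions, not vendor defaults.
    """
    failures = sum(1 for r in results if not r.ok)
    if failures > max_bad_probes:
        return CanaryEvent(canary, f"{failures} failed probes")
    slow = sum(1 for r in results if r.ok and r.latency_ms > max_latency_ms)
    if slow > max_bad_probes:
        return CanaryEvent(canary, f"{slow} slow probes")
    return None


def run_canary(canary: str, results: List[ProbeResult],
               notify: Callable[[CanaryEvent], None]) -> Optional[CanaryEvent]:
    """Evaluate the probes and hand any event to the notifier."""
    event = evaluate(canary, results)
    if event is not None:
        notify(event)  # in practice this might open a ticket
    return event
```

Passing the notifier in as a plain function keeps the check independent of any one ticketing system, so the same canary could report to a list in a test and to a real service desk in production.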

In 2025, autonomous remediation with low-cost tooling will be what cloud service users and application developers call for. Increasing observability and monitoring will be part of the equation.

State of Software Delivery and AI

I listened to the folks at Harness talk about how AI will impact software delivery in 2025 and beyond. 37 years ago, I had to take a course on AI as part of my Computer Engineering degree, but no one valued that course. Fast forward to today, and AI is another “label” that everyone wants in on. Executives at various companies are willing to spend money for one or two quarters just to be able to tell their peers, “We do AI.” The same thing happened with “DevOps” and “Testing.” No, you don’t do any of these “things” well.

There were really good conversations with Dean Clark (GFT) about Shadow AI, with Eric Baran (AWS) about CodeGen bottlenecks, and with the great Gene Kim about what’s next in AI and the SDLC. Has AI helped reduce the time that developers spend actually writing code? Yes and no. It now takes us less time to code mundane tasks, so we can spend more time writing more meaningful things.

I loved the session and can’t wait to see more changes in 2025.

VnV DevOps in 2025

I am reading “Confident DevOps” by Mark Peters and “Embracing DevOps Release Management” by Joel Kruger. With 30 years of technology experience, I am building a boutique consulting business focusing on DevOps and release management. I have met Mark at various conferences on DevOps and love his take on the SDLC and DevOps processes. I have seen what DevOps means to companies like GE Healthcare, JPMorgan Chase, Fidelity Investments, The Home Depot, JCPenney, and AIG. Most people think DevOps is CI/CD and can be implemented in several quarters. DevOps is not “fire and forget.” DevOps has to be built from within – that’s the only way it can be sustained.

VnV stands for a phrase in Marathi that is near and dear to my heart. “Vichitra navhe, Vegale” means “not strange, different.” If you view things as strange or weird, your outlook becomes closed and limited. But the minute you define something as “different,” your outlook on life changes, and you start adding more patterns (and people) to your repertoire.

DevOps Certifications

I keep stressing that DevOps is a mindset that needs to be embodied within the entire organization: not just the IT folks, but the C-level and every other traditional division. Unfortunately, to gauge the competency of the humans in DevOps, certification programs have popped up to give prospective employers a way to assess DevOps skills in individuals. As John Willis framed it, the key ideas in DevOps are Culture, Automation, Measurement, and Sharing. You will find certification courses in all of these areas.

Some key vendors include:

  • DevOps Institute (DOI): The DevOps Institute is dedicated to advancing the human elements of DevOps success. They use a role-based approach to certification that focuses on the most modern competencies and hireable skills required by today’s organizations adopting DevOps. They have an open testing program that removes the requirement of formal training, allowing those who already possess the skills, knowledge, and experience in the domain to gain direct access to DevOps Institute’s extensive portfolio of certifications. The types of certifications include:
    • DevOps Foundation – The DevOps Foundation certification validates a baseline understanding of key DevOps terminology, concepts, and practices to ensure everyone is talking the same language and highlights the benefits of DevOps to support organizational success.
    • Site Reliability Engineering (SRE) Foundation – The SRE Foundation certification validates knowledge of SRE basic vocabulary, principles, and practices.
    • DevOps Leader (DOL) – The DevOps Leader (DOL) certification is intended for anyone who wants to take a transformational leadership approach and make an impact within their organization by implementing DevOps including IT team leaders, managers, directors, business stakeholders, practitioners, and consultants.
    • DevSecOps Engineering (DSOE) – The DevSecOps Engineering (DSOE) certification is intended for IT Security professionals who are skilled at security as code with the intent of making security and compliance consumable as a service. A DevSecOps Engineer uses data and security science as their primary means of protecting the organization and customer.
    • Continuous Delivery Architecture (CDA) – The Continuous Delivery Architect (CDA) certification is designed for candidates who are engaged in the design, implementation, and management of DevOps deployment pipelines and toolchains that support Continuous Integration, Continuous Delivery, Continuous Testing, and potentially Continuous Deployment.
    • DevOps Test Engineering (DTE) – The DevOps Test Engineering (DTE) certification addresses testing in a DevOps environment and covers concepts such as the active use of test automation, testing earlier in the development cycle, and instilling testing skills in developers, quality assurance, security, and operational teams.
    • Certified Agile Service Manager (CASM) – The Certified Agile Service Manager (CASM) certification is designed to validate knowledge of Agile Service Management and Scrum basic vocabulary, principles, and practices. A Certified Agile Service Manager (CASM) is the operational equivalent of a ScrumMaster.
    • Certified Agile Process Owner (CAPO) – The Certified Agile Process Owner (CAPO) certification validates knowledge of process owner responsibilities and the practices and tools needed to oversee the design, reengineering, and improvement of IT Service Management (ITSM) processes; particularly in the context of Agile Service Management.
  • DevOps Agile Skills Association (DASA): The DASA DevOps Certification Program covers specific topics from the DASA DevOps Competence Model, helping DevOps and Agile teams to build the right mix of skills and capabilities. DASA identifies three broad levels of expertise and has developed a certification program designed for each profile.
    • Foundational Level (Know) – Builds an understanding of DevOps: scope, key concepts, terminology, and principles.
    • Professional Level (Apply) – Builds the capabilities relevant for professionals working together in a DevOps team. There are three Professional certifications: Enable and Scale, Specify and Verify, and Create and Deliver.
    • Leadership Level (Lead and Enable) – The Leadership Level focuses on the abilities to lead and enable. This Program is for Leaders, Coaches, and Product Owners.
  • DevOps Research & Assessment (DORA): DORA does not offer certification programs; rather DORA brings years of Research and Assessment knowledge to teams and organizations looking for ideas on how to bring the DevOps mindset to fruition. The Accelerate State of DevOps Report from DORA is the longest-running, academically rigorous research investigation into the capabilities and practices that make DevOps and transformation effective. Teams can learn how to achieve elite performance in software development and delivery with the 2019 report. The DevOps capabilities researched in the report are explained in the book, “Accelerate” by Dr. Nicole Forsgren, Jez Humble, and Gene Kim. Follow the book to implement steps to improve stability and throughput in your team or organization.
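
To show what measuring “stability and throughput” might look like, here is a minimal sketch in Python. It follows the grouping used in the Accelerate research: throughput as deployment frequency and lead time for changes, and stability as change failure rate and time to restore service. The Deployment record and both function names are assumptions for illustration and are not part of any DORA tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def _hours(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def throughput(deploys: List[Deployment], period_days: int) -> dict:
    """Deployment frequency and median lead time for changes."""
    lead = [_hours(d.committed_at, d.deployed_at) for d in deploys]
    return {
        "deploys_per_day": len(deploys) / period_days,
        "median_lead_time_hours": median(lead) if lead else None,
    }


def stability(deploys: List[Deployment]) -> dict:
    """Change failure rate and median time to restore service."""
    failures = [d for d in deploys if d.failed]
    restore = [_hours(d.deployed_at, d.restored_at)
               for d in failures if d.restored_at is not None]
    return {
        "change_failure_rate": len(failures) / len(deploys) if deploys else 0.0,
        "median_hours_to_restore": median(restore) if restore else None,
    }
```

Tracking both groups together matters because pushing throughput up while stability drops is not an improvement. The point of the research is that strong teams tend to do well on both.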

Technical certifications also exist that allow DevOps Engineers to showcase their skills and abilities in DevOps Operations. Some of the key certifications are:

  • Docker Certified Associate – This Docker certification program is for Docker practitioners with some relevant experience working with Docker. Like other DevOps certifications, the aim of this exam is to give practitioners a valid credential.
  • Kubernetes Certification – The Cloud Native Computing Foundation (CNCF) and the Linux Foundation collaborate on the Kubernetes certification program to validate professionals working with this software. Kubernetes is one of the top DevOps tools, so Kubernetes certifications are among the most in-demand DevOps certifications. There are two certification options: the Certified Kubernetes Administrator (CKA) and the Certified Kubernetes Application Developer (CKAD) programs.
  • AWS Certified DevOps Engineer Professional Exam – There are a number of AWS certifications for candidates performing different roles and responsibilities in the AWS cloud. The AWS DevOps Engineer Professional certification exam is a professional-level exam that recognizes the technical skills and expertise of candidates who provision, operate, and manage distributed applications and systems on the AWS platform.
  • AZ-400: Microsoft Azure DevOps Solutions Certification Exam – Among the role-based Azure certifications, the AZ-400: Microsoft Azure DevOps Solutions certification exam validates the skills and expertise of Azure DevOps professionals. Azure professionals working as DevOps engineers commonly aspire to this certification, which is why it appears on lists of the best DevOps certifications.
  • Puppet Professional Certification – One of the top DevOps certifications is Puppet 206 – System Administration Using Puppet Exam. This certificate will let you be recognized as a Puppet Certified Professional (PCP).

There are other sites like DevOps University that provide certifications as well. The sites above are the most popular for developing DevOps skills and abilities. Nothing beats regular practice, but these guides and certifications will help you familiarize yourself with common DevOps terminology and operational guidelines.

Email: sagar@vnvdevops.com

DevOps in a Remote world

With the prominence of Agile, DevOps, and SRE, how do things change when we move to working remotely?

Agile and Lean are about how teams iterate

Can you chunk features into bite-size stories that can be designed, developed, and tested in 2-3 days or less? With the business and development teams co-located, the task of “paring” stories down to size becomes easier because you can ask questions to get to a “minimum viable product” much faster. When people are remote, teams have to turn to tools to get answers to clarifying questions.

There are so many advanced tools and methods of engagement out there that will assist in helping clients adopt Agile techniques.

    • Video Conferencing provides coaches the ability to deliver content in an interactive manner
    • Surveys allow coaches to gather responses anonymously and still understand where the audience is in their own development
    • The coaches work with the tools the client has:
      • Lucidchart
      • Stormboard
      • Trello
    • Coaches follow up with exercise facilitation and coaching, walking Product Owners and Scrum Masters through the process until they can run it themselves.

DevOps is about how teams collaborate

Managing tools and processes needed to collaborate becomes the role of “DevOps”.

“How to ensure teams thrive in a remote DevOps Transition” covers the notion of collaboration perfectly, especially in a remote world.

SRE (Site Reliability Engineering) is about how teams automate

SRE is “DevOps reversed”. From an Operations perspective, how should changes be automated faster so that users can leverage functionality without experiencing any downtime? With teams co-located, questions can be passed back and forth in person. But how do you do that when you can’t see the person? The following articles summarize the makeup of distributed SRE teams.

The Makeup of Successful Geographically-Distributed SRE teams: Part 1

The Makeup of Successful Geographically-Distributed SRE teams: Part 2

My toolbox for DevOps sites:

XebiaLabs Periodic Table – https://xebialabs.com/periodic-table-of-devops-tools/

DevOps Terminology – https://www.plutora.com/devops-at-scale/terminology-glossary

DataOps / DevSecOps – https://dzone.com/articles/dataops-leveraging-devsecops-principles-for-secure

LinkedIn – linkedin.com/in/sagarkarma

Twitter – @sagarvnvdevops

DevOps titles popping up

In 2009, the term DevOps was coined by Patrick Debois at a conference in Belgium. The idea was simple – improve collaboration between development and IT operations in order to provide better quality and consistency of software releases. Prior to that period, the title that fit me best was Build / Release Engineer. In 2003, while consulting for General Electric, I was offered a full-time position at GE as a Lead Software Integrator. GE created this role for someone who took responsibility for the code from the moment it landed in version control until it reached its destination – production. I was responsible for build, test, deploy, and release. Planning and Coding were handled by the development teams and Monitoring was managed by IT operations. Plan, Code, Build, Test, Deploy, Release, and Monitor – that’s what we call DevOps today, and all of these steps are part of the infinity loop.

In 2011, after working as a Release Manager with JPMorgan Chase, it was time for me to move more towards my passion – Release Engineering. Release Engineering is the software engineering discipline around managing releases into production. With engineering, comes tools and processes. All of this ultimately led to a change in my title, from Release Engineering to DevOps. Now this debate has been raging for the better part of the last decade – is DevOps a title or is it more to do with the culture of an engineering organization?

DevOps is a mindset that should be “woven” into the Engineering Culture of an organization. The business, development teams, and IT Operations should communicate and collaborate with each other in unison. Now, if your organization doesn’t have an engineering culture, well then you need to build it. It would make sense, in those circumstances, to have a “team” that builds culture, automation, measurement, and collaboration into the organization’s soul. What I have seen is that these teams stick around for a long time. Why? Because tools change and when tools change, the processes around them change. People always change – and that trifecta of change will continue to keep DevOps teams in place for a long time.

If you would like to talk to me about DevOps practices, please send me a note at sagar@vnvdevops.com.

Release Management and COVID-19

I stumbled upon a Dave Farley video, as part of his Continuous Delivery series, where he talks about Release Management. While talking about how science impacts the decisions that one makes to release software, Farley quotes Richard Feynman:

“It doesn’t matter how intelligent you are, if you guess and that guess cannot be backed up by experimental evidence – then it is still a guess!”

Now, let’s think about where we are today. COVID-19 is spreading fast throughout the world, with several million confirmed cases and several thousand fatalities. If you “guess” and release a vaccine that doesn’t work, it’s still a guess. What do you need to prove that the vaccine will work? Experimental evidence. For vaccines, that comes in the form of clinical trials. But this effort will require “release management” on a global scale, and the cost of failure will be lives. Back when I worked at General Electric, in their Medical Systems division in Waukesha, WI, the motto of the division was “we are in business to save lives”. The flip side of that – “we don’t do anything to risk lives” – can be aligned to release management. “We don’t do anything to <fill in the blank>” can be your motto while releasing software to hundreds of thousands of customers.

So, let’s apply Farley’s process to release management of the COVID-19 vaccine –

  • Hypothesis – People will be able to live without fear of contracting the coronavirus. (I chose that over “our software will continue to work” because it’s congruent conceptually)
  • Measurements – We need to consider the following for the clinical trials:
    • Automated Testing – how are we going to test this vaccine?
    • Automated Deployment – how are we going to deploy the vaccine? Is it a shot or medicine taken orally?
    • Configuration Management – what control groups (environments) are we releasing this vaccine into?
    • Monitoring & Health-Checks – how are we monitoring the results of the vaccine during clinical trials? How are we collecting data for future analysis? Are there any “business” impacts of the vaccine?
  • Feedback – once this vaccine is released for clinical trials:
    • Continuous Integration & Continuous Delivery – how do we evaluate the release of the vaccine and improve delivery?
    • Dashboards – how do we get clear indications of the state of the vaccine?
    • Canary releases – can we release to a smaller, safer subset (control group) to determine the impact, if any, on people? (users)
  • Controlling Variables – some of the key variables we can control:
    • Make Changes in small steps – unfortunately, if your vaccine doesn’t work, you will have to go back to the drawing board, but you can restart the cycle if you make small changes
    • Good Configuration Management
    • Reliable Delivery (CD)
    • Quality
    • Monitoring

Netflix has a novel way (pun not intended) of doing canary releases – they release software in a geographical area when it’s 2 AM in that time zone. The release then stays in production through the daytime in that time zone, which allows Netflix to determine a “canary index” for it. Based on that index, the company decides whether to pull the release or distribute it to the other time zones at the same local time, 2 AM. This is Netflix’s version of testing in production.
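
The time-zone-based rollout described above can be sketched in a few lines of Python. This is not Netflix’s implementation. The region names, the fixed UTC offsets (which ignore daylight saving time), the error-rate comparison, and the tolerance are all assumptions chosen to keep the example small.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Hypothetical regions mapped to fixed UTC offsets in hours (no DST handling).
REGION_UTC_OFFSET_HOURS: Dict[str, int] = {
    "us-east": -5,
    "eu-west": 0,
    "ap-southeast": 8,
}


def regions_in_release_window(now_utc: datetime, release_hour: int = 2) -> List[str]:
    """Return the regions whose local clock currently reads the release hour.

    now_utc must be a timezone-aware datetime.
    """
    due = []
    for region, offset in REGION_UTC_OFFSET_HOURS.items():
        local = now_utc.astimezone(timezone(timedelta(hours=offset)))
        if local.hour == release_hour:
            due.append(region)
    return due


def canary_decision(error_rate: float, baseline_error_rate: float,
                    tolerance: float = 0.01) -> str:
    """Decide what to do with a canary after it has run through the day.

    Returns "rollback" if the canary's error rate exceeds the baseline by
    more than the tolerance, otherwise "promote".
    """
    if error_rate > baseline_error_rate + tolerance:
        return "rollback"
    return "promote"
```

A scheduler could call regions_in_release_window once an hour, deploy to whatever it returns, and then feed the day’s metrics into canary_decision before the next region’s 2 AM window arrives.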

Unfortunately, with the number of lives lost during this pandemic, public sentiment around the globe will push for “testing the vaccine in production” more than ever. If you follow the steps above, you should be able to control the variables enough in your environment to allow the vaccine to be distributed widely, safely, and effectively in a relatively short amount of time.

Managing releases remotely

It was the end of August 2017. The remnants of Hurricane Harvey were still hovering over the city of Houston. The water from the nearby bayous rushed into our buildings. The lobby had four feet of water and no one could get in or out of the building. My team was responsible for the Digital Transformation releases and back then, all releases were done on a Saturday. We’d all come into work at 8 AM. Breakfast and Lunch were brought in for the team. We’d all be lucky to go home on these Saturdays by 8 PM. The production deployment took about 3 hours, the rest was spent on production smoke tests and verification tests. This Saturday was different. The business decided that we would postpone the release by two weeks – this would allow people to get back to their homes and perhaps allow us to get back to the office to perform the release.

Unfortunately, it would be months before people could return to the office. The business made the decision to execute the software release mid-September, so we knew this release would have to be done remotely.

Sound familiar? With the COVID-19 pandemic impacting the world, all of your IT employees are staying safe at home. In times like these, software releases can become a challenge, unless you take steps to manage those releases remotely. Some of these steps take some planning and coordination but will get your teams releasing code into production much more efficiently and effectively.

  • Recruit a Release Manager – Notice I said “recruit” and not “hire”. Experienced Project Managers in the IT Operations space typically know how to manage large releases into production. The key is to have an individual who can coordinate and communicate effectively during the course of the release cycle. Release management is a relatively new but rapidly growing discipline within software engineering. As software systems, software development processes, and resources become more distributed, they invariably become more specialized and complex.

Organizations that have adopted agile software development are seeing higher quantities of releases. With the increasing popularity of agile development, a new approach to software releases known as continuous delivery is starting to influence how software transitions from development to a release. One goal of Continuous Delivery and DevOps is to release more reliable applications faster and more frequently. Release Managers are beginning to utilize tools such as application release automation and continuous integration tools to help advance the process of Continuous Delivery and incorporate a culture of DevOps by automating tasks so that they can be done quickly, reliably, and repeatably.

In organizations that manage IT Operations using the IT Service Management paradigm, specifically the ITIL framework, release management will be guided by ITIL concepts and principles. There are several formal ITIL processes that are related to release management, specifically the Release and Deployment Management process, which aims to plan, schedule, and control the movement of releases to test and production environments.

Having a Release Manager in place will help teams coordinate releases better when they are not face-to-face.

  • Create a plan and release template for current and all future releases – Having a release plan in place and allowing it to be visible will provide transparency to stakeholders and everyone in the organization. This can be as simple as an Excel spreadsheet with each row describing the task, planned start and end time of the task, the person responsible, and the actual start and end time of the task. During the release, the individual assigned to the task can call out on a conference bridge or via video to visually show the completion of the task. If there are any issues during the completion of the task, the Release Manager takes notes and ensures that an appropriate incident is filed and tracked to completion.

The Release Manager understands business needs and their priorities and under what circumstances those priorities can change. The Release Manager has a clear picture of development dependencies and how changes to one part of a product can affect the stability of the whole. These dependencies need to be clearly outlined in the release plan.

Large releases for programs or enterprise-wide initiatives typically will have breakout bridges or rooms for sidebar conversations during the course of the release. These should be planned for by the Release Manager in advance.

  • Utilize collaboration tools like Microsoft Teams – Audio conference bridges like GoToMeeting or WebEx are great for release-day activities. However, for coordinating meetings leading up to the release and for saving documents or creating wiki pages for the release, there is no better tool than Microsoft Teams. Teams gives your organization the ability to collaborate and communicate effectively for releases.

  • Know existing issues in production – This sounds simple but, in my experience, it has been the single biggest reason that releases become non-trivial exercises involving tens and perhaps hundreds of people. On the day of the release, a verification / validation test script needs to be in place in order to verify and validate only those functions that were modified for the release. Over-zealous business analysts or business users stumble upon a problem in production and fail to realize the problem existed in production before the current release was planned. If that’s not being fixed in this release, it should not be the focus of validation tests.

  • Use Release Management software – As a Release Manager, I have used several tools to manage deployments, releases, and communications. Two that stand out are XebiaLabs’ XL Release and IBM’s UrbanCode Release. These products provide swimlanes for various teams and the ability to combine swimlanes into a cohesive release plan.
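
The “know existing issues” step above can be reduced to a simple triage rule on release day. Here is a minimal sketch in Python. The Finding record, the triage function, and the example area names are hypothetical, and a real team would more likely key this off its defect tracker.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Finding:
    """A problem reported during release-day validation (hypothetical shape)."""
    function: str  # area of the application where the problem was seen
    summary: str   # short description of the problem


def triage(findings: List[Finding],
           changed_functions: Set[str],
           known_issues: Set[Finding]) -> Tuple[List[Finding], List[Finding]]:
    """Split findings into release blockers and items to defer.

    A finding blocks the release only if it touches a function changed in
    this release and was not already a known production issue.
    """
    blockers: List[Finding] = []
    deferred: List[Finding] = []
    for finding in findings:
        if finding.function in changed_functions and finding not in known_issues:
            blockers.append(finding)
        else:
            deferred.append(finding)
    return blockers, deferred
```

Deferred findings still deserve an incident or backlog item. They just should not hold up the release bridge.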

Remember the release that was supposed to happen at the end of August 2017, but was later pushed to mid-September? Well, we had a release plan in a spreadsheet and were able to complete the release in a little over 3 hours. Back then, we all used a conference bridge and my team helped with each step of the release. We knew the existing issues in production and therefore only validated the changes that were being made to the system during the release cycle.
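
A spreadsheet-style release plan like the one described above could be modeled as follows. This is a minimal sketch in Python, not the template my team used. The ReleaseTask fields mirror the columns listed earlier (task, person responsible, planned and actual start and end times), and the helper names are assumptions for illustration.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class ReleaseTask:
    """One row of the release plan."""
    task: str
    owner: str
    planned_start: datetime
    planned_end: datetime
    actual_start: Optional[datetime] = None
    actual_end: Optional[datetime] = None

    def overrun(self) -> Optional[timedelta]:
        """Time past the planned end, or None if the task has not finished."""
        if self.actual_end is None:
            return None
        return max(self.actual_end - self.planned_end, timedelta(0))


def late_tasks(tasks: List[ReleaseTask]) -> List[str]:
    """Names of finished tasks that ran past their planned end time."""
    late = []
    for t in tasks:
        over = t.overrun()
        if over is not None and over > timedelta(0):
            late.append(t.task)
    return late


def to_csv(tasks: List[ReleaseTask]) -> str:
    """Render the plan as CSV text that any spreadsheet tool can open."""
    def fmt(value: Optional[datetime]) -> str:
        return value.isoformat() if value else ""

    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["task", "owner", "planned_start", "planned_end",
                     "actual_start", "actual_end"])
    for t in tasks:
        writer.writerow([t.task, t.owner, fmt(t.planned_start), fmt(t.planned_end),
                         fmt(t.actual_start), fmt(t.actual_end)])
    return buf.getvalue()
```

During the release, the Release Manager would fill in the actual times as each owner calls out completion on the bridge, and late_tasks gives a quick list of where to take notes or file incidents.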

Software releases can be managed remotely with the right coordination and collaboration between all stakeholders. If you would like to talk to me about managing releases remotely, please send me a note at sagar@vnvdevops.com.
