The overwhelming majority of a software system's lifespan is spent in use, not in design or implementation. So, why does conventional wisdom insist that software engineers focus primarily on the design and development of large-scale computing systems? In this collection of essays and articles, key members of Google's Site Reliability Team explain how and why their commitment to the entire lifecycle has enabled the company to successfully build, deploy, monitor, and maintain some of the largest software systems in the world. You'll learn the principles and practices that enable Google engineers to make systems more scalable, reliable, and efficient--lessons directly applicable to your organization. This book is divided into four sections: Introduction--Learn what site reliability engineering is and why it differs from conventional IT industry practices; Principles--Examine the patterns, behaviors, and areas of concern that influence the work of a site reliability engineer (SRE); Practices--Understand the theory and practice of an SRE's day-to-day work: building and operating large distributed computing systems; Management--Explore Google's best practices for training, communication, and meetings that your organization can use.
In 2016, Google's Site Reliability Engineering book ignited an industry discussion on what it means to run production services today--and why reliability considerations are fundamental to service design. Now, Google engineers who worked on that bestseller introduce The Site Reliability Workbook, a hands-on companion that uses concrete examples to show you how to put SRE principles and practices to work in your environment. This new workbook not only combines practical examples from Google's experiences, but also provides case studies from Google's Cloud Platform customers who underwent this journey. Evernote, The Home Depot, The New York Times, and other companies outline hard-won experiences of what worked for them and what didn't. Dive into this workbook and learn how to flesh out your own SRE practice, no matter what size your company is. You'll learn: how to run reliable services in environments you don't completely control--like the cloud; practical applications of how to create, monitor, and run your services via Service Level Objectives; how to convert existing ops teams to SRE--including how to dig out of operational overload; and methods for starting SRE from either greenfield or brownfield.
Efficiently deploy and manage Kubernetes clusters in the cloud. Key Features: Deploy highly scalable applications with Kubernetes on Azure; leverage AKS to deploy, manage, and operate Kubernetes; learn best practices from this guide to increase the efficiency of container orchestration in the cloud. Book Description: Microsoft is now one of the most significant contributors to Kubernetes open source projects. Kubernetes helps to create, configure, and manage a cluster of virtual machines that are preconfigured to run containerized applications. This book will be your resource for achieving successful container orchestration and deployment of Kubernetes clusters on Azure. You will learn how to deploy and manage highly scalable applications, along with how to set up a production-ready Kubernetes cluster on Azure. With this book, you will be able to reduce the complexity and operational overhead of managing a Kubernetes cluster on Azure. By the end of this book, you will not only be capable of deploying and managing Kubernetes clusters on Azure with ease, but also have the knowledge of industry best practices to work with advanced Azure Kubernetes Service (AKS) concepts for complex systems. What you will learn: Get to grips with Microsoft AKS deployment, management, and operations; learn about the benefits of using Microsoft AKS, as well as its limitations, and avoid potential problems; integrate Microsoft toolchains such as Visual Studio Code and Git; implement simple and advanced AKS solutions; implement the automated scalability and high reliability of secure deployments with Microsoft AKS; use kubectl commands to monitor applications. Who this book is for: If you’re a cloud engineer, cloud solution provider, sysadmin, site reliability engineer, or a developer interested in DevOps and are looking for an extensive guide to running Kubernetes in the Azure environment, then this book is for you.
Although no previous knowledge of Kubernetes is expected, some experience with Linux and Docker containers would be beneficial.
Instrument Engineers' Handbook, Third Edition: Volume Three: Process Software and Digital Networks provides an in-depth, state-of-the-art review of existing and evolving digital communications and control systems. While the book highlights the transportation of digital information by buses and networks, the total coverage doesn't stop there. It describes a variety of process-control software packages suited for plant optimization, maintenance, and safety-related applications. In addition, topics include plant design and modernization, safety- and operations-related logic systems, and the design of integrated workstations and control centers. The book concludes with an appendix providing practical information such as bidders' lists and addresses, steam tables, materials selection for corrosive services, and much more. If you buy the three-volume set of the Instrument Engineers' Handbook, you will have everything a process control engineer or instrumentation technician needs. If you buy this volume, you will have at your fingertips all the information on software and digital networks that I&C engineers need. It will be the resource you reach for over and over again.
An innovative book that centers on developing and measuring true Overall Equipment Effectiveness (OEE), which, as the author demonstrates, correlates with factory output and has a strong link to profitability.
This book presents original studies describing the latest research and developments in the area of reliability and systems engineering. It helps the reader identify gaps in current knowledge and presents fruitful areas for further research in the field. Among other topics, this book covers reliability measures, reliability assessment of multi-state systems, optimization of multi-state systems, continuous multi-state systems, new computational techniques applied to multi-state systems, and probabilistic and non-probabilistic safety assessment.
This book starts with the basic premise that a service comprises the 3Ps--products, processes, and people. These entities and their sub-entities interlink to support the services that end users require to run and support a business. This widens the scope of any availability design far beyond hardware and software.

