https://sre.google/
__TOC__
= Participants =
* [[김경민]]
* [[김동우]]
* [[김제신]]
* 박민서
* [[음호준]]
* [[전영은]]
* [[조영호]]
= Format =
Participants read the portion of the book assigned for each topic in advance, then every Friday share with one another the parts they found most memorable.
= Topics =
* Chapter 1 - Introduction
* Chapter 2 - The Production Environment at Google, from the Viewpoint of an SRE
* Chapter 3 - Embracing Risk
* Chapter 4 - Service Level Objectives
* Chapter 5 - Eliminating Toil
* Chapter 6 - Monitoring Distributed Systems
* Chapter 7 - The Evolution of Automation at Google
* Chapter 8 - Release Engineering
* Chapter 9 - Simplicity
* Chapter 10 - Practical Alerting
* Chapter 11 - Being On-Call
* Chapter 12 - Effective Troubleshooting
* Chapter 13 - Emergency Response
* Chapter 14 - Managing Incidents
* Chapter 15 - Postmortem Culture: Learning from Failure
* Chapter 16 - Tracking Outages
* Chapter 17 - Testing for Reliability
* Chapter 18 - Software Engineering in SRE
* Chapter 19 - Load Balancing at the Frontend
* Chapter 20 - Load Balancing in the Datacenter
* Chapter 21 - Handling Overload
* Chapter 22 - Addressing Cascading Failures
* Chapter 23 - Managing Critical State: Distributed Consensus for Reliability
* Chapter 24 - Distributed Periodic Scheduling with Cron
* Chapter 25 - Data Processing Pipelines
* Chapter 26 - Data Integrity: What You Read Is What You Wrote
* Chapter 27 - Reliable Product Launches at Scale
* Chapter 28 - Accelerating SREs to On-Call and Beyond
* Chapter 29 - Dealing with Interrupts
* Chapter 30 - Embedding an SRE to Recover from Operational Overload
* Chapter 31 - Communication and Collaboration in SRE
* Chapter 32 - The Evolving SRE Engagement Model
* Chapter 33 - Lessons Learned from Other Industries
* Chapter 34 - Conclusion
= References =
* [https://sre.google/sre-book/table-of-contents/ SRE Book]
Latest revision as of 00:29, 27 March 2026