Lorin Hochstein
San Jose, California, United States
3K followers
500+ connections
About
Fascinated by complex systems, how they work, succeed, change, and…
Activity
-
tl;dr - if you want more people to speak up when they have issues focus on the conditions that inhibit or support speaking up, not on trying to…
Liked by Lorin Hochstein
-
Here's a comment you're not supposed to write: i++; //increment i by 1 Silly, right? But it's only "silly" because we've all memorized what `i++`…
Liked by Lorin Hochstein
-
Have you ever wondered why most modern imperative languages use `=` for assignment and `==` for equality? Well, there's some HISTORY behind it! In…
Liked by Lorin Hochstein
Experience
Education
Publications
-
A Platform for Automating Chaos Experiments
27th IEEE International Symposium on Software Reliability Engineering (ISSRE '16)
The Netflix video streaming system is composed of many interacting services. In such a large system, failures in individual services are not uncommon. This paper describes the Chaos Automation Platform, a system for running failure injection experiments on the production system to verify that failures in non-critical services do not result in system outages.
-
Chaos Engineering
IEEE Software
Modern software-based services are implemented as distributed systems with complex behavior and failure modes. Many large tech organizations are using experimentation to verify such systems' reliability. Netflix engineers call this approach chaos engineering. They've determined several principles underlying it and have used it to run experiments.
-
Ansible: Up and Running
O'Reilly Media
Manually configuring servers and using multi-step checklists to deploy your applications is no fun at all. Ansible is a great tool for automating your infrastructure tasks, with a gentle learning curve.
This book covers how to use Ansible to automate your configuration management, deployment, and orchestration tasks.
-
Peer impressions in open source organizations: A survey
Journal of Systems and Software
In virtual organizations, such as Open Source Software (OSS) communities, we expect that the impressions members have about each other play an important role in fostering effective collaboration. However, there is little empirical evidence about how peer impressions form and change in virtual organizations. This paper reports the results from a survey designed to understand the peer impression formation process among OSS participants in terms of perceived expertise, trustworthiness, productivity, experiences collaborating, and other factors that make collaboration easy or difficult. While the majority of survey respondents reported positive experiences, a non-trivial fraction had negative experiences. In particular, volunteer participants were more likely to report negative experiences than participants who were paid.
The results showed that factors related to a person's project contribution (e.g., quality and understandability of committed codes, important design related decisions, and critical fixes made) were more important than factors related to work style or personal traits. Although OSS participants are very task focused, the respondents believed that meeting their peers in person is beneficial for forming peer impressions. Having an appropriate impression of one's OSS peers is crucial, but the impression formation process is complicated and different from the process in traditional organizations.
-
OpenStack Operations Guide
OpenStack Foundation
This book offers hard-earned experience from OpenStack operators who have run OpenStack in production for six months or longer. They've gathered their notes, shared their stories, and learned from each other in the room. We invite you to join in the quest for best practices in OpenStack cloud operations.
-
The cost of the build tax in scientific software
All compiled software systems require a build system: a set of scripts to invoke compilers and linkers to generate the final executable binaries. For scientific software, these build scripts can become extremely complex. Anecdotes suggest that scientific programmers have long been dissatisfied with the current software build tool chains. In this paper, we describe preliminary results from a case study of two projects to estimate the fraction of effort devoted to maintaining these scripts, which we refer to as the "build tax". While estimates based on line counts are on the order of only 5%, estimates based on activity-related metrics suggest much higher values.
-
Heterogeneous Cloud Computing
Workshop on Parallel Programming on Accelerator Clusters
Current cloud computing infrastructure typically assumes a homogeneous collection of commodity hardware, with details about hardware variation intentionally hidden from users. In this paper, we present our approach for extending the traditional notions of cloud computing to provide a cloud-based access model to clusters that contain heterogeneous architectures and accelerators.
-
Automating Failure Testing Research at Internet Scale
ACM Symposium on Cloud Computing
Large-scale distributed systems must be built to anticipate and mitigate a variety of hardware and software failures. In order to build confidence that fault-tolerant systems are correctly implemented, Netflix (and similar enterprises) regularly run “failure drills” in which faults are deliberately injected in their production system. Existing failure testing approaches either explore the space of potential failures randomly or exploit the “hunches” of domain experts to guide the search. In this paper, we describe how we adapted and implemented a research prototype called lineage-driven fault injection (LDFI) to automate failure testing at Netflix.
Patents
Projects
-
OpenStack documentation
- Present
Collaboratively edited technical documentation site for the OpenStack open source cloud projects.
Organizations
-
Association for Computing Machinery (ACM)
More activity by Lorin
-
Excited that the article about the techniques we use at Netflix to keep our stateful services highly reliable has been published on InfoQ! A huge…
Liked by Lorin Hochstein
-
Are you interested in effectively transferring technology from academia to industry with impact? Then please check out this theme issue of IEEE…
Liked by Lorin Hochstein
-
Incident management starts in the design doc
Liked by Lorin Hochstein
-
Our paper is out! We recommend interruptive broadcasts for high urgency communications, but non-interruptive targeted mechanisms like secure chat for…
Liked by Lorin Hochstein
-
AI means you can’t “exit” the cloud now even if you want to. https://lnkd.in/gmfZSeGb #ai #cloud #cloudcomputing #softwareengineering…
Liked by Lorin Hochstein
-
It's been great to be back in Banff, Canada this week at the Energy Safety Canada conference. In my opinion, this industry conference sets the global…
Liked by Lorin Hochstein