New Directions in Cloud Programming | Joseph Hellerstein

Background

Throughout the history of computer science, each new computing platform has given rise to its own programming environment.

Minicomputers PDP-11 1970 -- C

Supercomputers Cray-1 1976 -- Chapel, MPI

Personal Computers Macintosh 1984 -- HyperCard and LabVIEW (graphical programming / low-code, no-code programming)

Smart Phones -- Swift, Android (mobile development)

Public Cloud AWS 2006 -- ?

What is missing is a programming model that fully unlocks the potential of the public cloud.

The Big Query

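Joe considers this the most important question and growth area of the next decade: how to program the public cloud so that third parties can build software that makes full use of its vast resources.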

How will people program the cloud?
It's hard! The main difficulties are:
- Parallelism & partial failure
- Heterogeneous hardware & networks
- Dynamic autoscaling
- Data: persistence, latency and scale

A grand challenge across core computer science

Serverless Computing

A line of work from Joe's lab

CIDR 19: Systems critical
- The paper that identified serverless as the point of entry: Serverless Computing: One Step Forward, Two Steps Back
2019-present: Systems work
- Mainly system-level work, which produced the following
- Cloudburst: Stateful FaaS

Yet serverless is just sequential UDFs in the cloud.

It is only a beginning.

A New PACT for Cloud Programming

A declaration of four independent facets to evolve cloud programming in the 2020s

Program Semantics: the functionality of the program, i.e. what the program does.
Availability: tolerate f independent failures, i.e. how many failures the program can withstand.
Consistency: the consistency models that you want on your APIs; these should be application-specific guarantees chosen per API.
Targets for Optimization: what kinds of cost and performance you're willing to deal with, i.e. which costs and performance goals the program is optimized for.

Faceted Programming: Declarations, independent

Programmers specify each facet in isolation. For example, a programmer can design the program semantics without considering the consistency model, and can later change the consistency model without touching the program-semantics code.
Clarity of intent, separation of concerns
- Compare to complexity of current distributed programs! (which interleave semantics with the choice of consistency model)
Each facet has its own syntax and code generator, which also benefits the underlying infrastructure
- Program Synthesis: Codegen via search, rather than rule-based
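To make the "each facet in isolation" idea concrete, here is a minimal sketch. All names (`Endpoint`, `Availability`, `add_to_cart`, the string values) are hypothetical and only for illustration; they are not an API from the talk. The point is that the consistency facet can be swapped while the semantics function stays untouched.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Availability:
    f: int        # number of independent failures to tolerate
    domain: str   # failure domain, e.g. "VM", "rack", "AZ"


@dataclass(frozen=True)
class Endpoint:
    semantics: Callable[..., object]  # P: what the program does
    availability: Availability        # A: how many failures it tolerates
    consistency: str                  # C: guarantee chosen for this API
    target: str                       # T: what to optimize for


def add_to_cart(cart: frozenset, item: str) -> frozenset:
    # Pure program semantics: no mention of replication or consistency.
    return cart | {item}


v1 = Endpoint(add_to_cart, Availability(f=1, domain="AZ"),
              consistency="eventual", target="minimize latency")

# Change only the consistency facet; the semantics code is reused as-is.
v2 = replace(v1, consistency="serializable")
```

In a real system each facet would presumably feed its own code generator; here the facets are just data, which is enough to show the separation.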

Taking each facet in turn:

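P: Program Semantics

"Just tell me what you want your program to do."

For now, this is close to what the community calls the declarative specification provided by an IR (one level of abstraction higher).

But an IR is something that compilers, and programmers familiar with compilers, write; it is unfamiliar to most modern programmers. Joe therefore advocates a technique called "Declarative IR + Verified Lifting": programmers keep working in frameworks they already know, with inputs such as sequential programs or ORM (Object-Relational Mapping) code, and Verified Lifting automatically generates the IR representation.

Verified Lifting generates the declarative IR incrementally and automatically.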
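A rough illustration of what "lifting" means, under loud assumptions: the two functions below are hand-written, not generated, and random testing stands in for the formal equivalence proof that real Verified Lifting would provide. The function names and the order-record shape are hypothetical.

```python
import random


def total_spend_sequential(orders, customer):
    # The kind of sequential code a programmer would normally write.
    total = 0
    for order in orders:
        if order["customer"] == customer:
            total += order["amount"]
    return total


def total_spend_declarative(orders, customer):
    # A declarative form (filter + aggregate) that a lifter might target.
    return sum(o["amount"] for o in orders if o["customer"] == customer)


def agree_on_random_inputs(trials=200, seed=0):
    # Assumption: testing replaces verification here. This only gives
    # evidence of equivalence; it is not a proof.
    rng = random.Random(seed)
    for _ in range(trials):
        orders = [
            {"customer": rng.choice("abc"), "amount": rng.randint(0, 100)}
            for _ in range(rng.randint(0, 10))
        ]
        for customer in "abc":
            if (total_spend_sequential(orders, customer)
                    != total_spend_declarative(orders, customer)):
                return False
    return True
```

The declarative form says nothing about iteration order, which is what leaves a compiler free to parallelize or distribute it.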

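A: Availability

"Tolerate f independent failures."

Each API must declare its fault tolerance, i.e. how much redundancy to provide. This is probably the simplest facet to characterize: it only requires specifying \(f\).

Note that fault tolerance is specified per API. Different APIs can be given different availability (fault tolerance), depending on our resource budget and the value of the API.

We also need to declare the independence (correlation) of failures, e.g. whether the units that fail are VMs, machines in the same rack, or different DCs or AZs.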
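A small sketch of why the failure domain matters as much as \(f\). It assumes crash failures and plain replication, where surviving \(f\) failures requires copies in at least \(f + 1\) distinct failure domains (consensus protocols would need more). The function names and placements are hypothetical.

```python
from typing import Dict


def min_replicas(f: int) -> int:
    # Assumption: crash failures with simple replication, so one
    # surviving copy is enough to keep serving.
    if f < 0:
        raise ValueError("f must be non-negative")
    return f + 1


def tolerates(placement: Dict[str, str], f: int) -> bool:
    # placement maps replica name -> failure domain it lives in.
    # Replicas sharing a domain fail together, so only distinct
    # domains count toward independent redundancy.
    return len(set(placement.values())) >= min_replicas(f)


same_rack = {"r1": "rack-a", "r2": "rack-a", "r3": "rack-a"}
across_azs = {"r1": "az-1", "r2": "az-2", "r3": "az-3"}
```

Here `tolerates(same_rack, 1)` is `False` even with three replicas, because one rack failure takes all of them down, while `tolerates(across_azs, 2)` is `True`.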

C: Consistency

"Trickiest parts of distributed programming."

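The main lesson the community has learned from the past decade of distributed systems is that the consistency level should be specified at the application layer, per API or operation, rather than in the data storage layer (per record?). This enables many useful things; Joe's examples include CALM analysis and removing coordination from monotonic code.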
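A minimal sketch of why monotonic code can skip coordination, assuming set union as the monotonic operation and last-writer-wins overwrite as the non-monotonic contrast. The helper names are hypothetical. Every delivery order of the union updates reaches the same state, while the overwrite outcome depends on order.

```python
from itertools import permutations


def apply_unions(updates):
    # Monotonic: state only grows, and union is commutative,
    # associative and idempotent.
    state = set()
    for update in updates:
        state |= update
    return frozenset(state)


def apply_overwrites(writes):
    # Non-monotonic: each write discards the previous value.
    value = None
    for write in writes:
        value = write
    return value


updates = [{"a"}, {"b"}, {"a", "c"}]
union_outcomes = {apply_unions(order) for order in permutations(updates)}

overwrite_outcomes = {apply_overwrites(order)
                      for order in permutations(["x", "y"])}
```

`union_outcomes` contains a single state, so replicas receiving the updates in any order agree without coordinating; `overwrite_outcomes` contains both `"x"` and `"y"`, so some ordering mechanism is needed.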

Types of guarantees: two types of consistency guarantee

History-based
- ACID isolation, replica consistency
Property-based
- E.g. determinism or invariants

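History-based guarantees are the more familiar kind, but property-based guarantees also deserve attention. For example, we may want the outcome of a distributed program to be deterministic, or we may not care about eventual consistency at all and instead want invariants to hold throughout execution (common cases include foreign keys and non-negative balances).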
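A small sketch of a property-based guarantee, using the non-negative balance invariant. The scenario is an assumption for illustration: two replicas each approve a withdrawal against their local view, and the invariant is only checked after their decisions are merged. The function names are hypothetical.

```python
def invariant_holds(balance: int) -> bool:
    # The property we want to hold at all times.
    return balance >= 0


def approve_locally(local_balance: int, amount: int) -> bool:
    # Each replica checks the invariant against its own view only.
    return invariant_holds(local_balance - amount)


def merged_balance(initial: int, withdrawals) -> int:
    # Global effect once all approved withdrawals are applied.
    return initial - sum(withdrawals)


initial = 100
requests = [80, 80]  # one request arrives at each of two replicas

# Without coordination, both replicas see 100 and both approve.
approved = [amt for amt in requests if approve_locally(initial, amt)]
final = merged_balance(initial, approved)
```

Both withdrawals pass the local check, yet `final` is `-60` and the invariant is violated. Withdrawal is not monotonic with respect to this invariant, so this particular API would need a stronger guarantee, while a deposit-only API would not.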