Learning Rust: TRPL, Part 5

Understanding Ownership

Ownership is the most distinctive property of Rust: it guarantees memory safety without needing a GC.

Ownership is Rust’s most unique feature, and it enables Rust to make memory safety guarantees without needing a garbage collector.

What Is Ownership

Three approaches to memory management:

  1. Use garbage collection (GC), which constantly looks for memory that is no longer used while the program runs.
  2. The programmer must explicitly allocate and free the memory.
  3. (Rust) Memory is managed through a system of ownership with a set of rules that the compiler checks at compile time. None of the ownership features slow down your program while it's running.

Ownership Rules

Keep these rules in mind!

  • Each value in Rust has a variable that's called its owner.
  • There can only be one owner at a time.
  • When the owner goes out of scope, the value will be dropped.

This feels a lot like functional programming, where a value always lives inside the scope of a let binding.
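The third rule can be sketched as follows. This is only a minimal illustration; the helper name length_inside_scope is made up for this note and does not come from TRPL.

```rust
// Hypothetical helper: creates a String in an inner scope and returns its length.
fn length_inside_scope() -> usize {
    let len;
    {
        let s = String::from("hello"); // s becomes the owner of the String
        len = s.len();                 // s is valid inside this scope
    } // s goes out of scope here, so the String it owns is dropped
    // s can no longer be named at this point; only the copied length survives.
    len
}

fn main() {
    println!("length = {}", length_inside_scope());
}
```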

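Memory and Allocation

From string literals to the String type: from an immutable string that is hardcoded and fixed at compile time, to a mutable type that can hold values known only at runtime.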

In the case of a string literal, we know the contents at compile time, so the text is hardcoded directly into the final executable. This is why string literals are fast and efficient. But these properties only come from the string literal’s immutability. Unfortunately, we can’t put a blob of memory into the binary for each piece of text whose size is unknown at compile time and whose size might change while running the program.

With the String type, in order to support a mutable, growable piece of text, we need to allocate an amount of memory on the heap, unknown at compile time, to hold the contents. This means:

  • The memory must be requested from the memory allocator at runtime (done by a constructor such as String::from).
  • We need a way of returning this memory to the allocator when we’re done with our String (done automatically when the owner goes out of scope, through the special function drop).
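A small sketch of both halves, allocation and release. The helper build_greeting is a hypothetical name used only for illustration.

```rust
// Hypothetical helper: builds a growable String at runtime.
fn build_greeting(name: &str) -> String {
    let mut s = String::from("hello"); // heap memory is requested from the allocator here
    s.push_str(", ");                  // the String can grow, unlike a string literal
    s.push_str(name);
    s // ownership of the heap data moves to the caller
}

fn main() {
    {
        let greeting = build_greeting("world");
        println!("{}", greeting);
    } // greeting goes out of scope: Rust calls drop and the memory is returned
}
```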

Ways Variables and Data Interact: Move

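Shallow copy? Deep copy? Rust prefers a move.

Why it is called a move: ownership is transferred from the earlier variable to the new one. The earlier variable s1 is no longer usable, so Rust has nothing to free when s1 goes out of scope (which avoids the double-free problem).

In short: move = shallow copy + invalidating the source variable.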
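A minimal sketch of a move, assuming the same s1/s2 names used in TRPL. The wrapper function move_and_use is hypothetical and exists only to make the example self-contained.

```rust
// Hypothetical wrapper showing a move between two variables.
fn move_and_use() -> String {
    let s1 = String::from("hello");
    let s2 = s1; // the pointer, length and capacity are copied; the heap data is not
    // println!("{}", s1); // would not compile: s1 was moved (error E0382)
    s2 // only s2 owns the String now, so it is freed exactly once
}

fn main() {
    println!("{}", move_and_use());
}
```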

Ways Variables and Data Interact: Clone

If we do want to deeply copy the heap data of the String, not just the stack data, we can use a common method called clone.

fn main() {
    let s1 = String::from("hello");
    let s2 = s1.clone(); // deep copy: the heap data is duplicated

    println!("s1 = {}, s2 = {}", s1, s2); // both variables are still valid
}

Stack-Only Data: Copy

fn main() {
    let x = 5;
    let y = x;

    println!("x = {}, y = {}", x, y);
}

But this code seems to contradict what we just learned: we don’t have a call to clone, but x is still valid and wasn’t moved into y.

The reason is that types such as integers, which have a known size at compile time, are stored entirely on the stack, so copies of the actual values are quick to make.

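Rust has a special annotation called the Copy trait that can be placed on types that are stored on the stack, such as integers (traits are covered in detail in Chapter 10). If a type implements the Copy trait, the old variable is still usable after it has been assigned to another variable. Rust does not allow a type to be annotated with Copy if the type, or any of its parts, implements the Drop trait. If a type needs something special to happen when its value goes out of scope and we add the Copy annotation to it, we get a compile-time error. To learn how to add the Copy annotation to your own type, see “Derivable Traits” in Appendix C.

So which types implement the Copy trait? You can check the documentation for a given type to be sure, but as a general rule, any group of simple scalar values can implement Copy, and nothing that requires allocation or is some form of resource can implement Copy. Here are some of the types that implement Copy:

  • All the integer types, such as u32.
  • The Boolean type, bool, with values true and false.
  • All the floating-point types, such as f64.
  • The character type, char.
  • Tuples, if and only if they contain only types that also implement Copy. For example, (i32, i32) implements Copy, but (i32, String) does not.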
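The rule can be tried out on a small type of your own. In this sketch the Point struct is a hypothetical example type, not something from TRPL; it can derive Copy because both of its fields are Copy.

```rust
// Hypothetical example type: every field is Copy, so the struct may derive Copy.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p1 = Point { x: 1, y: 2 };
    let p2 = p1; // copied, not moved: p1 stays valid
    println!("p1 = ({}, {}), p2 = {:?}", p1.x, p1.y, p2);

    let t1 = (1, 2.5); // (i32, f64) is Copy
    let t2 = t1;
    println!("t1 = {:?}, t2 = {:?}", t1, t2);

    let u1 = (1, String::from("hi")); // (i32, String) is not Copy
    let u2 = u1; // this is a move
    // println!("{:?}", u1); // would not compile: u1 was moved
    println!("u2 = {:?}", u2);
}
```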

Ownership and Functions

Taking ownership and then returning ownership with every function is a bit tedious. What if we want to let a function use a value but not take ownership?

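Returning a tuple (to carry the extra data back) is one option, but it still means passing ownership back and forth all the time. That is too much ceremony and a lot of work for a concept that should be common. Luckily, Rust also provides references.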
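The tuple workaround looks roughly like this. The function name calculate_length_owned is chosen here to keep it apart from the reference-based version later in these notes.

```rust
// Takes ownership of the String and hands it back together with the result.
fn calculate_length_owned(s: String) -> (String, usize) {
    let length = s.len();
    (s, length) // ownership of s is returned to the caller inside the tuple
}

fn main() {
    let s1 = String::from("hello");

    // s1 is moved into the function; we must rebind the returned String to keep using it.
    let (s2, len) = calculate_length_owned(s1);

    println!("The length of '{}' is {}.", s2, len);
}
```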

References and Borrowing

A simple example that uses a reference to avoid handing over ownership:

fn main() {
    let s1 = String::from("hello");

    let len = calculate_length(&s1); // borrows s1

    // let len2 = calculate_length2(s1); // would move s1; s1 would then be invalid

    println!("The length of '{}' is {}.", s1, len);
}

// Borrows the String: the caller keeps ownership.
fn calculate_length(s: &String) -> usize {
    s.len()
}

// Takes ownership of the String: the caller loses it.
fn calculate_length2(s: String) -> usize {
    s.len()
}

Refer but not own

The &s1 syntax lets us create a reference that refers to the value of s1 but does not own it. Because it does not own it, the value it points to will not be dropped when the reference goes out of scope.

fn main() {
    let s1 = String::from("hello");

    let len = calculate_length(&s1);

    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize { // s is a reference to a String
    s.len()
} // Here, s goes out of scope. But because it does not have ownership of what
  // it refers to, nothing happens.

When functions have references as parameters instead of the actual values, we won’t need to return the values in order to give back ownership, because we never had ownership. This avoids having to return a tuple every time just to get ownership back. Having references as function parameters is called borrowing.

References, like variables, are immutable by default.

Mutable References

fn main() {
    let mut s = String::from("hello");

    change(&mut s);
}

fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

First, we had to change s to be mut (otherwise compilation fails with E0596: cannot borrow `s` as mutable). Then we had to create a mutable reference with &mut s (references are also immutable by default) and accept a mutable reference with some_string: &mut String.

One big restriction: you can have only one mutable reference to a particular piece of data in a particular scope.

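This prevents data races at compile time, because two mutable references to the same data are not allowed to coexist. Compare this with Go's race detector (go run -race), which detects unsynchronized reads and writes from different goroutines: Go's check is a dynamic one performed while the program runs, not a static one.

Similarly, a mutable reference and an immutable reference to the same data cannot be in use at the same time.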
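A sketch of what the borrow rules allow in practice. The function sequential_borrows is a hypothetical name; the point is that mutable borrows are fine as long as they do not overlap.

```rust
// Hypothetical demo: several borrows of the same String, never overlapping.
fn sequential_borrows() -> String {
    let mut s = String::from("hello");

    {
        let r1 = &mut s;
        r1.push_str(", world");
    } // r1 goes out of scope here, so a new mutable reference is allowed

    let r2 = &mut s;
    r2.push('!');
    // Creating another `&mut s` (or a `&s`) and then using r2 again would not
    // compile: the borrows would overlap (errors E0499 / E0502).

    // r2 is no longer used below, so any number of immutable references is fine.
    let r3 = &s;
    let r4 = &s;
    println!("{} / {}", r3, r4);

    s
}

fn main() {
    println!("{}", sequential_borrows());
}
```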

Dangling References

This is the dangling pointer problem known from C/C++.

In Rust, by contrast, the compiler guarantees that references will never be dangling references: if you have a reference to some data, the compiler will ensure that the data will not go out of scope before the reference to the data does.

Returning a dangling reference would point to an invalid String (it has already gone out of scope), so Rust does not allow it. The fix: move ownership out!

Danger case (rejected by the compiler with E0106, missing lifetime specifier):

fn main() {
    let reference_to_nothing = dangle();
}

fn dangle() -> &String { // dangle returns a reference to a String

    let s = String::from("hello"); // s is a new String

    &s // we return a reference to the String, s
} // Here, s goes out of scope, and is dropped. Its memory goes away.
  // Danger!
Ownership move:

fn main() {
    let string = no_dangle();
}

fn no_dangle() -> String {
    let s = String::from("hello");

    s // ownership is moved out to the caller, so nothing is deallocated here
}

The Rules of References

Let’s recap what we’ve discussed about references:

  • At any given time, you can have either one mutable reference or any number of immutable references.
  • References must always be valid.

Next, we’ll look at a different kind of reference: slices.

The Slice Type

Returning a bare starting/ending index from a String is fragile: the index is not tied to the state of the String, so it can go stale when the String changes.

String slices

Common usage:

fn main() {
    let origin_s = String::from("Hello, world!");
    let origin_len = origin_s.len();
    let slice_array = [
        &origin_s[..],              // the whole string
        &origin_s[0..origin_len],   // also the whole string
        &origin_s[1..5],
        &origin_s[2..8],
    ];
    for (i, &item) in slice_array.iter().enumerate() {
        println!("Slice {}'s length {} -- {}", i, item.len(), item);
    }
}

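With all this information in mind, first_word (the TRPL function that finds the first word of a string) can be rewritten to return a slice. The type that signifies “string slice” is written as &str.

This also deals with the stale-index problem: if we try to modify the original String while a slice of it is still in use, we get a compile error. A slice is a special kind of reference; it is immutable and borrows from the owner. Modifying the original String requires a mutable reference, and as noted above, we cannot have a mutable and an immutable reference to the same value at the same time.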
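A version of first_word that returns a slice might look like this. It follows the shape of the TRPL listing as I remember it; treat it as a sketch rather than a verbatim copy.

```rust
// Returns the first space-separated word of s as a slice into s.
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i]; // slice up to (not including) the first space
        }
    }

    &s[..] // no space found: the whole string is one word
}

fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s); // immutable borrow of s starts here

    // s.clear(); // would not compile here: needs a mutable borrow while `word` is alive (E0502)

    println!("the first word is: {}", word); // last use of `word`

    s.clear(); // fine now: the immutable borrow has ended
}
```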

String Literals Are Slices

Recall that we talked about string literals being stored inside the binary. Now that we know about slices, we can properly understand string literals:

let s = "Hello, world!";

The type of s is &str. This also explains why string literals are immutable: &str is an immutable reference.

String Slices as parameters

More experienced Rustaceans prefer:

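fn first_word(s: &str) -> &str { // instead of s: &String

This way the function accepts both &String values and &str values.

Passing a &String works because of deref coercion, an implicit conversion from &String to &str.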
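A sketch of the &str signature accepting all three kinds of argument. The body here uses str::find for brevity, which is a choice made for this note and not necessarily how TRPL writes it.

```rust
// Accepts any string slice; a &String argument is converted by deref coercion.
fn first_word(s: &str) -> &str {
    match s.find(' ') {
        Some(i) => &s[..i], // slice up to the first space
        None => s,          // no space: the whole input is one word
    }
}

fn main() {
    let owned = String::from("hello world");
    let literal = "goodbye world";

    let a = first_word(&owned);      // &String, coerced to &str
    let b = first_word(&owned[..]);  // an explicit slice of the String
    let c = first_word(literal);     // a string literal is already a &str

    println!("{} {} {}", a, b, c);
}
```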