Rust Programming Language
Tools
Language support in Vim…
- Official Vim Plug-in for Rust
- COC (Conquer of Completion)
- …Install the COC extensions for rust-analyzer
:CocInstall coc-rust-analyzer
- Syntastic Syntax Check Plug-in for Vim
- Rust Analyzer User Manual
rustup
`rustup` manages multiple Rust installations in ~/.rustup
- References…
- The rustup Book
- Rust Language Server (RLS)
- toolchain, single installation of the Rust compiler
- Official release channels: stable, beta and nightly
- Stable channel by default
- Stable releases are made every 6 weeks (beta is next stable)
- components are used to install additional tools for a given toolchain
- targets are used to install compilers for other platforms
- By default the host-platform (architecture and operating system) is used
- Cross-compilation requires installation of additional targets
Install `rustc`, `cargo`, `rustup` and other standard tools:
# installs to ~/.cargo
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# source shell environment
source $HOME/.cargo/env
# clean up
rustup self uninstall
Basic usage of a toolchain:
rustup show # show active toolchain
rustup man $command # show man-page for a given command
rustup update # update to latest version
rustup toolchain help # toolchain help text
rustup toolchain list # list installed toolchains
rustup toolchain install $channel # install from another channel
rustup default $channel # switch default toolchain
rustup target list # list available targets
rustup target add $target # install an additional target
rustup target remove $target # remove an installed target
rustup component list # list available components
Install additional components used during development:
rustup component add rls rust-analysis rust-src
cargo
`cargo` is a package manager and build tool for the Rust language.
References…
- The Cargo Book
- https://crates.io/ …Rust community’s central package registry
# create a new project
>>> cargo new hello ; cd hello
Created binary (application) `hello` package
# basic skeleton
>>> tree
.
├── Cargo.toml
└── src
    └── main.rs
# modify the source code
>>> cat > src/main.rs <<EOF
fn main() {
println!("Hello World!");
}
EOF
# build an executable
>>> cargo build
Compiling hello v0.1.0 (/tmp/hello)
Finished dev [unoptimized + debuginfo] target(s) in 0.77s
# run the program
>>> cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/hello`
Hello World!
- Automatically fetches and builds your package’s dependencies
- `Cargo.toml` manifest, metadata & various bits of package information
- Specifying Dependencies
- `Cargo.lock` exact information on the revision of all dependencies
- Package Layout
$HOME/.cargo/config.toml # user configuration file [02]
cargo new $name # new package for a binary program
cargo build # compile to target/debug/ directory
cargo run # compile and execute
cargo build --release # compile with optimizations to target/release/
cargo update # update dependencies
cargo test # run all tests
rustc
`rustc` is the compiler for the Rust programming language
cat > hello_world.rs <<EOF
fn main() {
println!("Hello World!");
}
EOF
# compile an executable
rustc hello_world.rs
./hello_world
Reference: The rustc book
Literals
1 // integer
100_000_000 // _ visual separator, equals 100000000
1.2 // floating point
3.141_592 // equals 3.141592
0xff // hex
0o77 // octal
0b1111_0000 // binary
b'A' // byte (u8 only)
'a' // character
Number literals except the byte literal allow a data type suffix:
123i32 // type i32
57u8 // type u8
123.0f64 // type f64
12E+99_f64 // scientific notation for type f64
0xff_u8 // hex type u8
0o70_i16 // octal type i16
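The literal forms above can be checked against their plain decimal values. This is a minimal sketch; the function name `literal_values` is only illustrative and not part of any library.

```rust
// Returns a few of the literals from above as plain values.
// `literal_values` is a hypothetical helper used for illustration only.
fn literal_values() -> (i64, i64, i64, i64, u8) {
    (100_000_000, 0xff, 0o77, 0b1111_0000, b'A')
}

fn main() {
    let (separated, hex, octal, binary, byte) = literal_values();
    assert_eq!(separated, 100000000); // `_` is purely visual
    assert_eq!(hex, 255);
    assert_eq!(octal, 63);
    assert_eq!(binary, 240);
    assert_eq!(byte, 65); // ASCII code of 'A'

    // a type suffix fixes the type of the literal
    let small = 57u8;
    assert_eq!(std::mem::size_of_val(&small), 1);

    println!("{} {} {} {} {}", separated, hex, octal, binary, byte);
}
```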
Variables
Rust is a statically typed language.
Every value in Rust is of a certain data type.
- Use of `snake_case` for variable names by convention
- Compiler must know the types of all variables at compile time
- The compiler can usually infer the type on assignment based on the value and how the variables are used (cf. Hindley–Milner type system)
- The `let` statement declares a variable in the current scope
Declare and initialize a variable with type inference:
fn main() {
// declare, initialize variable
let x = 1; // data type determined by the compiler
println!("{}",x);
}
1
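Inference also works from later usage, not only from the initial value. The following sketch assumes a hypothetical helper `inferred_sum`; the vector is declared without any type annotation and its element type is derived from the first `push`.

```rust
// `inferred_sum` is an illustrative name, not a standard function.
fn inferred_sum() -> u8 {
    // no annotation: the element type is inferred from the calls below
    let mut values = Vec::new();
    values.push(1u8); // fixes the element type to u8
    values.push(2);   // therefore also a u8
    values.iter().sum()
}

fn main() {
    println!("{}", inferred_sum()); // prints 3
}
```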
Rust uses the stack by default for static values.
Static data (size known at compile time) includes:
- Function frames
- Scalar (integer, float, etc) & compound types (tuples, arrays)
- Structs and pointers to dynamic data in the heap
Collections cannot be stack-based since they are dynamic in size by nature; their contents are therefore stored on the heap.
Scalar Types
Primitive data types that represent a single value
- Integer, a number without a fractional component
- Signed integer types start with `i`, and unsigned with `u`
- 8, 16, 32, 64, 128 bit variants, e.g. `i32` (signed 32 bit integer)
- `isize` and `usize` types depend on the architecture
- Pointer sized signed and unsigned integer types
- 32 bits on 32-bit platforms and 64 bits on 64-bit platforms
- Floating-point types are `f32` and `f64`, which are 32 bits and 64 bits in size
- One byte Boolean type `bool` with two values: `true` and `false`
- `char` character type, specified with single quotes `'ℤ'`
let x = 10; // default integer type is i32
let y: i8 = -128;
let a = 5i8; // Equals to `let a: i8 = 5;`
let x = 1.5; // default float type is f64
let y: f32 = 2.0;
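The sizes and value ranges listed above can be inspected with constants and functions from the standard library. A small sketch (the helper name `widths` is an assumption made for this example):

```rust
use std::mem::size_of;

// Sizes in bytes of f32, f64, bool and char (hypothetical helper).
fn widths() -> (usize, usize, usize, usize) {
    (size_of::<f32>(), size_of::<f64>(), size_of::<bool>(), size_of::<char>())
}

fn main() {
    // value ranges of the 8 bit integer types
    assert_eq!(i8::MIN, -128);
    assert_eq!(i8::MAX, 127);
    assert_eq!(u8::MAX, 255);

    // isize/usize are pointer sized
    assert_eq!(size_of::<usize>(), size_of::<*const u8>());
    assert_eq!(size_of::<isize>(), size_of::<usize>());

    // overflow can be detected explicitly instead of wrapping or panicking
    assert_eq!(127i8.checked_add(1), None);

    println!("{:?}", widths()); // prints (4, 8, 1, 4)
}
```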
Most primitives implement the `Copy` trait
- Assignment copies the value instead of moving ownership, so the original stays usable
- Copied byte-for-byte in memory to produce a new, identical value
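Custom types can opt into the same behaviour by deriving `Copy`. The sketch below uses a made-up `Point` struct and a made-up function `copy_then_modify` to show that the source value remains usable after assignment.

```rust
// `Point` and `copy_then_modify` are illustrative names only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn copy_then_modify(p: Point) -> (Point, Point) {
    let mut q = p; // `p` is copied, not moved
    q.x += 10;     // changes only the copy
    (p, q)         // `p` is still usable here
}

fn main() {
    let (original, changed) = copy_then_modify(Point { x: 1, y: 2 });
    println!("{:?} {:?}", original, changed);
}
```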
Compound Types
Group multiple values into one type
Tuple groups a number of values with a variety of types
// comma-separated list of values inside parentheses
let t = (500, 6.4, 1);
// access elements directly using a period
let x = t.0; // index starts at zero
// with type annotation
let t: (i32, f64, u8) = (500, 6.4, 1);
// deconstruct tuple into three separate variables
let (x, y, z) = t;
Arrays in Rust have a fixed length (like tuples):
- Arrays are allocated on the stack.
- Even with `mut`, the element count cannot be changed.
- The `Vec<T>` standard library type provides a heap-allocated resizable array type.
// comma-separated list inside square brackets
let a = [1, 2, 3, 4, 5];
// define a type, and size
let a: [i32; 5] = [1, 2, 3, 4, 5];
// initial value, size
let a = [3; 5]; // expands to [3, 3, 3, 3, 3]
// access elements of an array using indexing
let mut c: [i32; 3] = [1, 2, 3];
c[0] = 2;
c[1] = 4;
c[2] = 6;
Declaration
The `let` statement declares a variable in the current scope.
Local variables do not need to be initialized when they are declared.
fn main() {
// declare variable, missing data type
let x;
println!("{}",x);
}
error[E0282]: type annotations needed
|
2 | let x; // declare a local variable
| ^ consider giving `x` a type
If data type inference is not possible, then it is required to define the data type with the declaration.
fn main() {
// declare variable with data type
let x: i32;
println!("{}",x);
}
error[E0381]: borrow of possibly-uninitialized variable: `x`
|
3 | println!("{}",x);
| ^ use of possibly-uninitialized `x`
Variables can only be accessed after a value has been assigned.
An assignment in a subsequent statement after the declaration initializes the variable.
fn main() {
    let x: i32; // declare variable
    x = 1; // initialize variable
    println!("{}",x);
}
1
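Deferred initialization is mostly useful when the value depends on a condition. In the sketch below (the function name `sign_label` is hypothetical) the compiler accepts the code because every path assigns `label` exactly once before it is read.

```rust
// `sign_label` is an illustrative helper, not part of the notes above.
fn sign_label(n: i32) -> &'static str {
    let label; // declared, not yet initialized; type inferred later
    if n < 0 {
        label = "negative";
    } else {
        label = "non-negative";
    }
    label // initialized on every path, so reading it is allowed
}

fn main() {
    println!("{} {}", sign_label(-1), sign_label(1));
}
```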
Immutability
Variables in Rust are immutable by default:
We get the primary benefit we want from immutable-by-default: mutable code is explicitly called out, so you know where to look for bugs.
Example of an assignment to an immutable variable:
fn main() {
    let x = 1; // declare variable
    x = 2; // assign to immutable variable
    println!("{}",x);
}
The attempt to change the value of an immutable variable results in a compile-time error:
error[E0384]: cannot assign twice to immutable variable `x`
|
2 | let x = 1; // a single variable
| -
| |
| first assignment to `x`
| help: make this binding mutable: `mut x`
3 | x = 2;
| ^^^^^ cannot assign twice to immutable variable
Reducing the number of mutable variables in code makes its understanding infinitely easier because you know that once a variable has been given a value, it remains that way. You don’t need to carefully look for places where the value might be mutated…in practice you end up passing lots of values by reference to avoid copy costs. In those cases, it’s very useful to know that calling a specific function won’t mutate its arguments
The compiler warns about mutable variables which are never reassigned:
fn main() {
let mut a = 1;
println!("{}", a);
}
warning: variable does not need to be mutable
--> vars.rs:2:9
|
2 | let mut a = 1;
| ----^
| |
| help: remove this `mut`
|
Constants
Constants aren’t just immutable by default - they’re always immutable.
Use the `const` keyword to declare compile-time constants
- `SCREAMING_SNAKE_CASE` names by convention
- Declared in any scope, including the global scope
- Constants must be explicitly typed
fn main() {
const X: u8 = 1; // declare, initialize a constant with type
println!("{}",X);
}
The compiler complains about the `mut` keyword with `const`:
error: const globals cannot be mutable
|
2 | const mut X: u8 = 1;
| ----- ^^^ cannot be mutable
| |
| help: you might want to declare a static instead: `static`
Constants are essentially inlined wherever they are used, meaning that they are copied directly into the relevant context when used.
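Constants can be declared at global scope and built from other constants, since their initializers are evaluated at compile time. A minimal sketch with made-up names:

```rust
// All names here are illustrative.
const SECONDS_PER_MINUTE: u32 = 60;
// constant expression, evaluated by the compiler
const SECONDS_PER_DAY: u32 = 24 * 60 * SECONDS_PER_MINUTE;

fn days_to_seconds(days: u32) -> u32 {
    days * SECONDS_PER_DAY
}

fn main() {
    println!("{}", days_to_seconds(2)); // prints 172800
}
```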
Shadowing
Multiple variables can be defined with the same name, which masks access to the previously declared variables beyond the point of shadowing.
Shadowing is different from marking a variable as `mut`, because we’ll get a compile-time error if we accidentally try to reassign to this variable without using the `let` keyword. By using `let`, we can perform a few transformations on a value but have the variable be immutable after those transformations have been completed.
fn main() {
let x = 1; // declare variable
let x = 2; // mask x
println!("{}",x);
}
2
Compiles with warning:
warning: unused variable: `x`
|
2 | let x = 1; // declare variable
| ^ help: consider prefixing with an underscore: `_x`
|
= note: `#[warn(unused_variables)]` on by default
That shadowing has no effect on the original variable `x` becomes more evident with scopes:
fn main() {
let x = 1; // declare variable
{ // start new scope
let x = 2; // masks outer-scope x
println!("{}",x);
}
println!("{}",x);
}
2
1
Rust cares about protecting against unwanted mutation effects as observed through references. This doesn’t conflict with allowing shadowing, because you’re not changing values when you shadow, you’re just changing what a particular name means in a way that cannot be observed anywhere else. Shadowing is a strictly local change.
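Because each `let` creates a new variable, shadowing can also change the type bound to a name, which `mut` cannot do. A small sketch, assuming a hypothetical helper `parse_and_double` that falls back to 0 for input that is not a number:

```rust
// `parse_and_double` is an illustrative name; the fallback to 0 is an
// assumption of this example.
fn parse_and_double(input: &str) -> i32 {
    let value = input.trim();                    // `value` is a &str
    let value: i32 = value.parse().unwrap_or(0); // shadowed, now an i32
    let value = value * 2;                       // shadowed again, still immutable
    value
}

fn main() {
    println!("{}", parse_and_double(" 21 ")); // prints 42
}
```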
Rebind
If a variable has been declared and used, it is possible to recycle the variable name by a new variable declaration statement:
fn main() {
let x = 1; // declare
println!("{} {:p}",x,&x);
let x = 2; // rebind
println!("{} {:p}",x,&x);
}
1 0x7ffd246f56ac
2 0x7ffd246f571c
Note that the variable uses a different memory address when recycled.
Mutability
Re-assignable variables are declared with `let mut` (mutable)
Mutability is a necessary component of software development. At the lowest level of software, machine code is inherently mutable (mutating memory and register values). We layer abstractions of immutability on top of that…
fn main() {
    // declare, initialize a mutable variable
    let mut x = 1;
    println!("{} {:p}", x, &x);
    // assign new value
    x = 2;
    println!("{} {:p}", x, &x);
}
Mutable variables are just that – mutable. The value changes but the underlying address in memory is the same:
1 0x7ffd1cd3dce4
2 0x7ffd1cd3dce4
There are multiple trade-offs to consider in addition to the prevention of bugs. For example, in cases where you’re using large data structures, mutating an instance in place may be faster than copying and returning newly allocated instances. With smaller data structures, creating new instances and writing in a more functional programming style may be easier to think through, so lower performance might be a worthwhile penalty for gaining that clarity.
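The two styles mentioned in the quote can be put side by side. This is only a sketch with made-up function names: one version mutates the caller's data in place, the other leaves the input untouched and allocates a new vector.

```rust
// Both function names are illustrative.

// mutate the existing data in place, no new allocation
fn increment_in_place(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v += 1;
    }
}

// functional style: input stays unchanged, a new vector is returned
fn incremented(values: &[i32]) -> Vec<i32> {
    values.iter().map(|v| v + 1).collect()
}

fn main() {
    let mut a = [1, 2, 3];
    increment_in_place(&mut a);
    let b = incremented(&a);
    println!("{:?} {:?}", a, b); // prints [2, 3, 4] [3, 4, 5]
}
```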
Static
A global `static` variable can be mutable, and represents a memory address:
…primary use cases are global locks, global atomic counters, interfacing with legacy C libraries, and embedded programming.
- Reading or writing a mutable static (`static mut`) requires an `unsafe` block
- Stored in a dedicated section (BSS) in binary
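A rough sketch of both flavours follows; all names are invented for this example. The atomic counter is an immutable `static` with interior mutability and needs no `unsafe`, while the `static mut` variant needs an `unsafe` block and is only sound here under the assumption that a single thread touches it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// global atomic counter, safe to use from any thread
static REQUESTS: AtomicUsize = AtomicUsize::new(0);

// mutable static, every access requires `unsafe`
static mut LEGACY_COUNTER: u32 = 0;

fn count_request() -> usize {
    // fetch_add returns the previous value
    REQUESTS.fetch_add(1, Ordering::SeqCst) + 1
}

fn bump_legacy() -> u32 {
    // SAFETY: assumes no other thread accesses LEGACY_COUNTER concurrently
    unsafe {
        LEGACY_COUNTER = LEGACY_COUNTER + 1;
        LEGACY_COUNTER
    }
}

fn main() {
    count_request();
    bump_legacy();
    println!("{} {}", count_request(), bump_legacy());
}
```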
Ownership
Tight coupling between assignment and ownership.
- Code analyses to check a standard set of safe code conventions
- Rust enforces Resource acquisition is initialization (RAII)
- Automatic and predictable release of resources (drop)
- Prevents bugs associated with resource leak
- No need to manually free memory
- No garbage collection
- Variable lifetime spans from declaration until compiler infers it can be dropped
- Drop of a variable includes all resources which it has ownership of
- Notion of a destructor in Rust is provided through the `Drop` trait
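The points above can be made visible with a type that records when it is dropped. The sketch below is an illustration under assumed names (`Guard`, `Log`, `drop_order`); it shows that a value is dropped when its owner goes out of scope, inner scope first.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// shared event log, used only to observe the order of drops
type Log = Rc<RefCell<Vec<&'static str>>>;

struct Guard {
    name: &'static str,
    log: Log,
}

impl Drop for Guard {
    // called automatically when the owner goes out of scope
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Guard { name: "outer", log: Rc::clone(&log) };
        {
            let _inner = Guard { name: "inner", log: Rc::clone(&log) };
        } // `_inner` is dropped here
        log.borrow_mut().push("between");
    } // `_outer` is dropped here
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("{:?}", drop_order()); // prints ["inner", "between", "outer"]
}
```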
Rust uses lexical scopes - name resolution depends on the location in the source code and the lexical context.
fn main() {
{ // anonymous scope created
let x = 1;
} // drop value of x
println!("{}", x);
}
error[E0425]: cannot find value `x` in this scope
|
5 | println!("{}", x);
|
Rust enforces three simple rules of ownership:
- Each value has a variable which is the owner.
- Each value has exactly one owner at a time.
- When the owner goes out of scope the value is dropped
fn main() {
let x = "a".to_string();
let y = x; // move ownership
let z = x; // previous owner can no longer be used
println!("{}", z);
}
The compiler complains about the use of `x` after its ownership has been moved:
error[E0382]: use of moved value: `x`
|
2 | let x = "a".to_string();
| - move occurs because `x` has type `std::string::String`, which does not implement the `Copy` trait
3 | let y = x;
| - value moved here
4 | let z = x;
| ^ value used here after move
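One way to resolve this error is to clone the value before the move, so that each variable owns its own `String`. A minimal sketch (the function name `duplicate` is made up); borrowing with `&x` is the cheaper alternative when no second owner is needed.

```rust
// `duplicate` is an illustrative helper.
fn duplicate(x: String) -> (String, String) {
    let y = x.clone(); // deep copy, `x` still owns its buffer
    let z = x;         // ownership moves from `x` to `z`
    (y, z)
}

fn main() {
    let (y, z) = duplicate("a".to_string());
    println!("{} {}", y, z); // prints a a
}
```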
Borrowing
Given that there are rules about only having one mutable pointer to a variable binding at a time, Rust employs the concept of borrowing.
One piece of data can be borrowed either as a shared borrow or as a mutable borrow at a given time (not both at the same time).
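The rule can be sketched as follows: any number of shared borrows may coexist, and a mutable borrow becomes possible once the shared borrows are no longer used. The function name `borrow_demo` is an assumption of this example.

```rust
// `borrow_demo` is an illustrative name.
fn borrow_demo() -> (i32, Vec<i32>) {
    let mut data = vec![1, 2, 3];

    // two shared borrows at the same time are fine
    let a = &data;
    let b = &data;
    let total = a.iter().sum::<i32>() + b.iter().sum::<i32>();

    // `a` and `b` are not used any more, so a mutable borrow is allowed now
    let m = &mut data;
    m.push(4);

    (total, data)
}

fn main() {
    println!("{:?}", borrow_demo()); // prints (12, [1, 2, 3, 4])
}
```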
Mutable Borrowing
Data can be borrowed and altered by a single user.
fn main() {
    let mut a = [1,2,3,4,5];
    let b = &mut a; // mutable borrowing of `a`
    b[0] = 6;       // ⁝
                    // ...ends here
    println!("{:?}", a);
}
[6, 2, 3, 4, 5]
The data is not accessible to any other user during that time.
fn main() {
    let mut a = [1,2,3,4,5];
    let b = &mut a;      // mutable borrowing of `a`
    a[1] = 7;            // ⁝
    println!("{:?}", b); // ...ends here
}
error[E0503]: cannot use `a` because it was mutably borrowed
|
3 | let b = &mut a;
| ------ borrow of `a` occurs here
4 | a[1] = 7;
| ^^^^ use of borrowed `a`
5 | println!("{:?}", b);
| - borrow later used here
Control Flow
Conditions
Evaluate a block `if` a condition holds.
fn main() {
let number = 5;
if number < 10 {
println!("first condition true");
} else if number < 22 {
println!("second condition true");
} else {
println!("condition was false");
}
}
`if` blocks can also act as expressions if all branches return the same type
let answer = if 1 == 2 {
    "whoops, mathematics broke"
} else {
    "everything's fine!"
};
Loops
The `loop` keyword runs endlessly…
- …until a `break` keyword stops the loop
- a value after `break` will be returned
fn main() {
    let mut counter = 0;
    let result = loop {
        counter += 1;
        if counter == 10 {
            break counter;
        }
    };
    println!("Looped {} times",result);
}
A `while` loop begins by evaluating the boolean loop conditional expression…
- …and exits once the condition becomes `false`:
fn main() {
    let mut i = 0;
    while i < 10 {
        print!("{} ", i);
        i = i + 1;
    }
}
0 1 2 3 4 5 6 7 8 9
`for in` over a range:
fn main() {
for i in 0..5 {
print!("{} ", i);
}
}
0 1 2 3 4
Collections
Rust’s standard collection library:
- Efficient implementations of common data structures.
- Cover most use cases for generic data storage and processing
Vector
Vectors are re-sizable arrays.
- `Vec` is a (pointer, capacity, length) triplet
- Pointer will never be null, so this type is null-pointer-optimized
- Memory it points to is on the heap, with the elements stored contiguously in order
- As low-overhead as possible in the general case
- Capacity is increased automatically on demand
- Elements will be reallocated (can be slow)
- Will never automatically shrink itself
`Vec<T>` is generic; there is no default element type, so it must be annotated or inferred.
fn main() {
// declare an empty vector with explicit data type
let empty_vector: Vec<i32> = Vec::new();
println!("{:?}", empty_vector);
}
Initialize a vector using the `Vec::new()` method or the `vec!` macro:
fn main() {
    // initialize a mutable empty vector
    let mut mutable_vector = Vec::new();
    // push an element to the vector
    mutable_vector.push(1);
    println!("{:?}", mutable_vector);
    // use `vec!` macro to initialize an immutable vector with three elements
    let immutable_vector = vec![2,3,4];
    println!("{:?}", immutable_vector);
}
Capacity
Recommended to specify capacity at declaration if possible
- `capacity()` number of elements the vector can hold without reallocating
- `len()` number of used elements
- `push()` and `insert()` never (re)allocate if capacity is sufficient
- `shrink_to_fit()` drops memory if able:
fn main() {
    let mut vec = Vec::new();
    vec.push(1);
    vec.push(2);
    vec.pop();
    println!("{} {}",vec.capacity(),vec.len());
    vec.shrink_to_fit();
    println!("{} {}",vec.capacity(),vec.len());
}
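The recommendation to specify the capacity up front can be sketched with `Vec::with_capacity`, which reserves room for at least the requested number of elements. The helper name `fill` is invented for this example; it reports whether the capacity stayed unchanged while pushing.

```rust
// `fill` is an illustrative helper: returns (length, initial capacity,
// whether the capacity was unchanged after pushing `n` elements).
fn fill(n: usize) -> (usize, usize, bool) {
    let mut v: Vec<usize> = Vec::with_capacity(n);
    let initial = v.capacity(); // at least `n`
    for i in 0..n {
        v.push(i); // fits into the reserved space, no reallocation
    }
    (v.len(), initial, v.capacity() == initial)
}

fn main() {
    let (len, initial, unchanged) = fill(10);
    println!("{} {} {}", len, initial, unchanged);
}
```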
Iterators
`iter` provides an iterator of immutable references:
fn main() {
let vector = vec![1,2,3];
for element in vector.iter() {
print!("{} ",element);
}
}
`iter_mut` provides an iterator of mutable references:
fn main() {
let mut vector = vec![1,2,3];
for element in vector.iter_mut() {
*element += 1;
print!("{} ",element);
}
}
Strings
All strings in Rust are UTF-8 encoded.
fn main() {
// string literal initializing a `&str` slice
let ss = "a"; // equivalent to `let ss: &str = "a"`
// create a `String` from a string literal
let st = "b".to_string();
// equivalent to
let sf = String::from("c");
println!("{} {} {}", ss, st, sf);
}
a b c
The `&str` slice is provided by the Rust core library:
- Created using string literals (stored in the program’s binary)
- May reference a range of UTF-8 text “owned” by someone else
- Preallocated text that is stored in read-only memory as part of the executable
The `String` type is provided by Rust’s standard library:
- `String` is a wrapper over a `Vec<u8>`
- Stores a pointer on the stack, and data in a heap-allocated buffer
- Automatically resizes its buffer on demand
- Interpretation as bytes, as slice, scalar values, and grapheme clusters
- No indexing (bytes do not correlate to a Unicode scalar value)
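The difference between bytes and Unicode scalar values shows up as soon as a string contains non-ASCII text. In this sketch (the helper name `measure` is made up) the character `ℤ` occupies three bytes in UTF-8, so byte ranges must fall on character boundaries.

```rust
// `measure` is an illustrative helper: (length in bytes, number of chars).
fn measure(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    let s = "aℤ";
    assert_eq!(measure(s), (4, 2)); // 1 byte for 'a', 3 bytes for 'ℤ'

    // slicing works on byte ranges that fall on char boundaries
    assert_eq!(s.get(1..4), Some("ℤ"));
    // a range ending inside 'ℤ' yields None instead of a broken string
    assert_eq!(s.get(0..2), None);

    // access by scalar value position goes through an iterator
    assert_eq!(s.chars().nth(1), Some('ℤ'));
    // raw bytes are available explicitly
    assert_eq!(s.as_bytes()[0], b'a');

    println!("{:?}", measure(s));
}
```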
It’s safe to say that, if the API we’re building doesn’t need to own or mutate the text it’s working with, it should take a `&str` instead of a `String`
Rust has a powerful feature called `Deref` coercion, which allows it to turn a reference to a `String`, created with the borrow operator `&`, into a `&str`:
fn puts(s: &str) {
    print!("{}",s)
}
fn main() {
    let s = "ab"; // `&str` slice
    let t = "cd".to_string(); // `String` type
    puts(s);
    puts(&t); // pass by reference
}
Append
Append strings by using the push methods of `String`:
fn main() {
    let mut s = "a".to_string();
    // append a single character
    s.push('b');
    // append a `str` slice
    s.push_str("cde");
    // ownership of slice `t` not moved to `s`
    let t = "f";
    s.push_str(t);
    println!("{} {}", s, t);
}
abcdef f
Split
`split()` returns an iterator over substrings of a string slice:
fn main() {
let s = "a,b,c";
for t in s.split(",") {
print!("{} ",t);
}
}
`split_whitespace()` returns an iterator over the substrings separated by whitespace:
fn main() {
let s = "a b c";
for t in s.split_whitespace() {
print!("{},",t);
}
}
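Since both methods return iterators, the substrings can be collected into a vector when they are needed more than once. A small sketch with a hypothetical helper `fields` that also trims each entry:

```rust
// `fields` is an illustrative helper; the returned slices borrow from `line`.
fn fields(line: &str) -> Vec<&str> {
    line.split(',').map(|f| f.trim()).collect()
}

fn main() {
    println!("{:?}", fields("a, b ,c")); // prints ["a", "b", "c"]

    // runs of whitespace count as a single separator
    let words: Vec<&str> = "a  b \t c".split_whitespace().collect();
    println!("{:?}", words); // prints ["a", "b", "c"]
}
```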
Slice
Data type that does not have ownership
- Reference a contiguous sequence of elements in a collection
let a: [i32; 4] = [1, 2, 3, 4]; // Parent Array
let b: &[i32] = &a; // Slicing whole array
let c = &a[0..4]; // From 0th position to 4th(excluding)
let d = &a[..]; // Slicing whole array
let e = &a[1..3]; // [2, 3]
let f = &a[1..]; // [2, 3, 4]
let g = &a[..3]; // [1, 2, 3]
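Because a slice only borrows its elements, a function that takes `&[i32]` works for arrays, parts of arrays and vectors alike. A minimal sketch (the function name `largest` is an assumption of this example):

```rust
// `largest` is an illustrative helper; returns None for an empty slice.
fn largest(values: &[i32]) -> Option<i32> {
    values.iter().copied().max()
}

fn main() {
    let array = [3, 9, 4];
    let vector = vec![7, 2];
    println!("{:?}", largest(&array));      // whole array, prints Some(9)
    println!("{:?}", largest(&array[..1])); // first element only, prints Some(3)
    println!("{:?}", largest(&vector));     // &Vec<i32> coerces to &[i32], prints Some(7)
}
```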
Functions
The `main` function is the program entry point
// program entry point
fn main() {
    // call the `add` function
    print!("{}",add(1,2));
}
// function with parameters, and a return value
fn add(x: i32, y: i32) -> i32 {
    // function body
    x + y // last expression used for return value
}
Declare new functions with the `fn` keyword followed by:
- `snake_case` function name
- Input parameter list in `()` passed by the caller
- Function body in `{}` containing statements and expressions
- Statements do not return values
- Expressions evaluate to something and return a value (no ending semicolon)
Parameters are variables that are part of a function’s signature:
- Must declare the type of each parameter
- Multiple parameters separated by `,`
Functions can return values to the code that calls them
- Declare type of a return value with `->`
- Return value synonymous with the value of the final expression (implicit return value)
// a function with multiple return values
fn add_sub(x: i32, y: i32) -> (i32, i32) {
    (x + y, x - y)
}
fn main() {
    print!("{:?}",add_sub(1,2));
}
Statements return no value, while expressions return a value.
Return expressions are denoted with the keyword `return`:
fn max(a: i32, b: i32) -> i32 {
if a > b {
return a;
}
return b;
}
fn main() {
println!("{}", max(27,64));
}
Using `return` on the last line of a function works, but is considered poor style.
Match
Rust provides pattern matching via the `match` keyword, which can be used like a C `switch`.
A `match` expression takes an input value, classifies it, and then jumps to code written to handle the identified class of data. [mmmmr]
for i in -2..5 {
    match i { // scrutinee is `i`
        -1 => println!("It's minus one"),
        1 => println!("It's a one"),
        2|4 => println!("It's either a two or a four"),
        _ => println!("Matched none of the arms"),
    }
}
The value compared to the patterns is called the scrutinee.
Scrutinee expression and patterns must have the same type.
let text = "foo bar nom".to_string();
for word in text.split(' ') {
    match word.as_ref() {
        "foo" => {
            println!("bar");
        }
        "bar" => {
            println!("foo");
        }
        _ => println!("no match")
    }
}
`match` can return a value:
let v = match 5 {
    1 => 2,
    2 => 3,
    _ => 0,
};
assert_eq!(v,0);
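Patterns are not limited to single values. As a sketch beyond what the notes above cover, the following made-up function `classify` uses a range pattern and a match guard to sort an integer into classes; arms are tried from top to bottom.

```rust
// `classify` is an illustrative helper.
fn classify(n: i32) -> &'static str {
    match n {
        i32::MIN..=-1 => "negative",     // inclusive range pattern
        0 => "zero",
        x if x % 2 == 0 => "positive even", // binding with a guard
        _ => "positive odd",             // everything else
    }
}

fn main() {
    for n in [-5, 0, 4, 7] {
        println!("{} is {}", n, classify(n));
    }
}
```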
Match against an `enum` type:
enum E {
    A,
    B,
    C
}
let e = match E::C {
    E::A => 'a',
    E::B => 'b',
    E::C => 'c'
};
assert_eq!(e,'c');
[mmmmr] Mixing matching, mutation, and moves in Rust
https://blog.rust-lang.org/2015/04/17/Enums-match-mutation-and-moves.html
Macros
fn add(x: i32, y: i32) -> i32 {
    x + y
}
macro_rules! add {
    ($x:expr,$y:expr) => {
        add($x,$y)
    };
    ($x:expr) => {
        add($x,2)
    };
    () => {
        add(1,2)
    };
}
fn main() {
    assert_eq!(add!(),3);
    assert_eq!(add!(1),3);
    assert_eq!(add!(1,2),3);
}
References
- The Rust Programming Language
- Rust Language Cheat Sheet
- The Embedded Rust Book
- Writing an OS in Rust
- Operating System development tutorials in Rust on the Raspberry Pi
- Memory Safety in Rust: A Case Study with C
- A closer look at Ownership in Rust
- The Rust Programming Book - What is Ownership?
- CS140e - An Experimental Course on Operating Systems
- Three Kinds of Polymorphism in Rust, 2022/01