Rust Programming Language

Code
Rust
Published

May 17, 2022

Modified

May 17, 2022

Tools

Language support in Vim…

rustup

rustup manages multiple Rust installations in ~/.rustup

  • References…
  • toolchain, single installation of the Rust compiler
  • Official release channels: stable, beta and nightly
    • Stable channel by default
    • Stable releases are made every 6 weeks (beta is next stable)
  • components are used to install additional tools for a given toolchain
  • targets are used to install compilers for other platforms
    • By default the host-platform (architecture and operating system) is used
    • Cross-compilation requires installation of additional targets

Installs rustc, cargo, rustup and other standard tools:

# installs to ~/.cargo
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# source shell environment
source $HOME/.cargo/env
# uninstall rustup and all installed toolchains
rustup self uninstall

Basic usage of a toolchain:

rustup show                         # show active toolchain
rustup man $command                 # show man-page for a given command
rustup update                       # update to latest version
rustup toolchain help               # toolchain help text
rustup toolchain list               # list installed toolchains
rustup toolchain install $channel   # install from another channel
rustup default $channel             # switch default toolchain
rustup target list                  # list available targets
rustup target add $target           # install an additional target
rustup target remove $target        # remove an installed target
rustup component list               # list available components

Install additional components used during development:

rustup component add rls rust-analysis rust-src

cargo

cargo is a package manager and build tool for the Rust language:

References…

# create a new project
>>> cargo new hello ; cd hello
     Created binary (application) `hello` package
# basic skeleton
>>> tree
.
├── Cargo.toml
└── src
    └── main.rs
# modify the source code
>>> cat > src/main.rs <<EOF
fn main() {
    println!("Hello World!");
}
EOF
# build an executable
>>> cargo build            
   Compiling hello v0.1.0 (/tmp/hello)
    Finished dev [unoptimized + debuginfo] target(s) in 0.77s
# run the program
>>> cargo run  
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/hello`
Hello World!
  • Automatically fetches and builds your package’s dependencies
  • Cargo.toml manifest, metadata & various bits of package information
  • Package Layout
$HOME/.cargo/config.toml   # user configuration file [02]
cargo new $name            # new package for a binary program
cargo build                # compile to target/debug/ directory
cargo run                  # compile and execute 
cargo build --release      # compile with optimizations to target/release/
cargo update               # update dependencies
cargo test                 # run all tests

rustc

rustc is the compiler for the Rust programming language

cat > hello_world.rs <<EOF
fn main() {
    println!("Hello World!");
}
EOF

# compile an executable
rustc hello_world.rs
./hello_world

Reference The rustc book

Literals

1                // integer
100_000_000      // _ visual separator, equals 100000000
1.2              // floating point
3.141_592        // equals 3.141592
0xff             // hex
0o77             // octal
0b1111_0000      // binary
b'A'             // byte (u8 only)
'a'              // character

Number literals except the byte literal allow a data type suffix:

123i32       // type i32
57u8         // type u8
123.0f64     // type f64
12E+99_f64   // scientific notation for type f64
0xff_u8      // hex type u8
0o70_i16     // octal type i16
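
The literal forms above can be checked against their decimal values. This is a minimal sketch; the function name literal_values is made up for illustration and the values are widened to u32 only so they fit in one array:

```rust
// Hypothetical helper: returns several integer literals as u32 values.
fn literal_values() -> [u32; 5] {
    [
        100_000_000,  // underscores are ignored: 100000000
        0xff,         // hexadecimal, 255
        0o77,         // octal, 63
        0b1111_0000,  // binary, 240
        b'A' as u32,  // byte literal, ASCII code 65
    ]
}

fn main() {
    let values = literal_values();
    assert_eq!(values, [100000000, 255, 63, 240, 65]);
    // a suffix fixes the type of the literal
    let small = 0xff_u8;
    assert_eq!(small, u8::MAX);
    println!("{:?}", values);
}
```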

Variables

Rust is a statically typed language.

Every value in Rust is of a certain data type.

  • Use of snake_case for variable names by convention
  • Compiler must know the types of all variables at compile time
  • The compiler can usually infer the type on assignment based on the value and how the variables are used (cf. Hindley–Milner type system)
  • The let statement declares a variable in the current scope

Declare and initialize a variable with type inference:

fn main() {
    // declare, initialize variable
    let x = 1; // data type determined by the compiler
    println!("{}",x);
}
1

Rust stores values whose size is known at compile time on the stack by default.

Static data (size known at compile time) includes:

  • Function frames
  • Scalar (integer, float, etc) & compound types (tuples, arrays)
  • Structs and pointers to dynamic data in the heap

Collections cannot be stack based since they are dynamic in size by nature, and are therefore stored in the heap.

Scalar Types

Primitive data types that represent a single value

  • Integer number without a fractional component
    • Signed integer types start with i, and unsigned with u
    • 8,16,32,64,128 bit variants, i.e. i32 (signed 32 bit integer)
  • isize and usize types depend on architecture
    • Pointer sized signed and unsigned integer types
    • 32 bits on 32-bit platforms and 64 bits on 64-bit platforms
  • Floating-point types are f32 and f64, which are 32 bits and 64 bits in size
  • One byte Boolean type bool with two values: true and false
  • char character type, specified with single quotes 'ℤ'
let x = 10;         // default integer type is i32
let y: i8 = -128;
let a = 5i8;        // Equals to `let a: i8 = 5;`
let x = 1.5;        // default float type is f64
let y: f32 = 2.0;

Most primitives implement the Copy trait

  • Assignment or passing by value copies the value instead of moving it, so the original stays usable
  • Copied byte-for-byte in memory to produce a new, identical value
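
A short sketch of the difference between a Copy type and a type that must be cloned explicitly. The helper double is hypothetical and only serves to pass a value by value:

```rust
// Hypothetical helper: takes its argument by value.
fn double(n: i32) -> i32 {
    n * 2
}

fn main() {
    let a = 21;
    let b = a;          // `i32` is `Copy`: `a` is copied, not moved
    let c = double(a);  // passing by value copies again
    println!("{} {} {}", a, b, c); // `a` is still usable

    let s = String::from("text");
    let t = s.clone();  // `String` is not `Copy`; an explicit clone is needed
    println!("{} {}", s, t);
}
```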

Compound Types

Group multiple values into one type

Tuple groups a number of values with a variety of types

// comma-separated list of values inside parentheses
let t = (500, 6.4, 1);
// access elements directly using a period
let x = t.0;                               // index starts at zero
// with type annotation
let t: (i32, f64, u8) = (500, 6.4, 1);     
// deconstruct tuple into three separate variables
let (x, y, z) = t;

Arrays in Rust have a fixed length (like tuples):

  • Arrays are allocated on the stack.
  • Even with mut, its element count cannot be changed.
  • The Vec<T> standard library type provides a heap-allocated resizable array type.
// comma-separated list inside square brackets
let a = [1, 2, 3, 4, 5];
// define a type, and size
let a: [i32; 5] = [1, 2, 3, 4, 5];
// initial value, size
let a = [3; 5]; // expands to [3, 3, 3, 3, 3]
// access elements of an array using indexing
let mut c: [i32; 3] = [1, 2, 3];
c[0] = 2;
c[1] = 4;
c[2] = 6;

Declaration

The let statement declares a variable in the current scope.

Local variables do not have to be initialized when they are declared.

fn main() {
    // declare variable, missing data type
    let x;
    println!("{}",x);
}
error[E0282]: type annotations needed
  |
2 |     let x;  // declare a local variable
  |         ^ consider giving `x` a type

If data type inference is not possible, then it is required to define the data type with the declaration.

fn main() {
    // declare variable with data type
    let x: i32;
    println!("{}",x);
}
error[E0381]: borrow of possibly-uninitialized variable: `x`
  |
3 |     println!("{}",x);
  |                   ^ use of possibly-uninitialized `x`

Variables can only be accessed after a value has been assigned.

An assignment in a subsequent statement after the declaration initializes the variable.

fn main() {
    let x: i32;  // declare variable
    x = 1;       // initialize variable
    println!("{}",x);
}
1

Immutability

Variables in Rust are immutable by default:

We get the primary benefit we want from immutable-by-default: mutable code is explicitly called out, so you know where to look for bugs.

Example of an assignment to an immutable variable:

fn main() {
    let x = 1;  // declare variable
    x = 2;      // assign to immutable variable
    println!("{}",x);
}

An attempt to change the value of an immutable variable causes a compile-time error:

error[E0384]: cannot assign twice to immutable variable `x`
  |
2 |     let x = 1;  // a single variable
  |         -
  |         |
  |         first assignment to `x`
  |         help: make this binding mutable: `mut x`
3 |     x = 2;
  |     ^^^^^ cannot assign twice to immutable variable

Reducing the number of mutable variables in code makes its understanding infinitely easier because you know that once a variable has been given a value, it remains that way. You don’t need to carefully look for places where the value might be mutated…in practice you end up passing lots of values by reference to avoid copy costs. In those cases, it’s very useful to know that calling a specific function won’t mutate its arguments

The compiler warns about mutable variables which never get a reassignment:

fn main() {
    let mut a = 1;
    println!("{}", a);
}
warning: variable does not need to be mutable
 --> vars.rs:2:9
  |
2 |     let mut a = 1;
  |         ----^
  |         |
  |         help: remove this `mut`
  |

Constants

Constants aren’t just immutable by default - they’re always immutable.

Use the const keyword to declare compile-time constants

  • SCREAMING_SNAKE_CASE names by convention
  • Declared in any scope, including the global scope
  • Constants must be explicitly typed
fn main() {
    const X: u8 = 1; // declare, initialize a constant with type
    println!("{}",X);
}

Compiler complains about the mutable keyword with const:

error: const globals cannot be mutable
  |
2 |     const mut X: u8 = 1;
  |     ----- ^^^ cannot be mutable
  |     |
  |     help: you might want to declare a static instead: `static`

Constants are essentially inlined wherever they are used, meaning that they are copied directly into the relevant context when used.

Shadowing

Multiple variables can be defined with the same name, which masks access to a previously declared variable beyond the point of shadowing

Shadowing is different from marking a variable as mut, because we’ll get a compile-time error if we accidentally try to reassign to this variable without using the let keyword. By using let, we can perform a few transformations on a value but have the variable be immutable after those transformations have been completed.

fn main() {
    let x = 1;  // declare variable
    let x = 2;  // mask x
    println!("{}",x);
}
2

Compiles with warning:

warning: unused variable: `x`
  |
2 |     let x = 1;  // declare variable
  |         ^ help: consider prefixing with an underscore: `_x`
  |
  = note: `#[warn(unused_variables)]` on by default

That shadowing has no effect on the original variable x becomes more evident with scopes:

fn main() {
    let x = 1;  // declare variable
    { // start new scope
        let x = 2; // masks outer-scope x
        println!("{}",x);
    }
    println!("{}",x);
}
2
1

Rust cares about protecting against unwanted mutation effects as observed through references. This doesn’t conflict with allowing shadowing, because you’re not changing values when you shadow, you’re just changing what a particular name means in a way that cannot be observed anywhere else. Shadowing is a strictly local change.
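
Shadowing also allows the type behind a name to change during such transformations. A minimal sketch, assuming a hypothetical helper parse_trimmed that falls back to 0 for input that is not a number:

```rust
// Hypothetical helper: parse a number from text, reusing the name `input`.
fn parse_trimmed(input: &str) -> i32 {
    let input = input.trim();                    // shadow: still `&str`, whitespace removed
    let input: i32 = input.parse().unwrap_or(0); // shadow: now an `i32`, 0 if not a number
    input                                        // immutable after the transformations
}

fn main() {
    println!("{}", parse_trimmed("  42 "));
}
```

None of the three bindings is declared mut, so an accidental reassignment without let would still be rejected by the compiler.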

Rebind

If a variable has been declared and used, it is possible to recycle the variable name by a new variable declaration statement:

fn main() {
    let x = 1;  // declare
    println!("{} {:p}",x,&x);
    let x = 2;  // rebind
    println!("{} {:p}",x,&x);
}
1 0x7ffd246f56ac
2 0x7ffd246f571c

Note that the variable uses a different memory address when recycled.

Mutability

Re-assignable variables are declared with let mut (mutable)

Mutability is a necessary component of software development. At the lowest level of software, machine code is inherently mutable (mutating memory and register values). We layer abstractions of immutability on top of that…

fn main() {
    // declare, initialize a mutable variable
    let mut x = 1;
    println!("{} {:p}", x, &x);
    // assign new value
    x = 2;
    println!("{} {:p}", x, &x);
}

Mutable variables are just that – mutable. The value changes but the underlying address in memory is the same:

1 0x7ffd1cd3dce4
2 0x7ffd1cd3dce4

There are multiple trade-offs to consider in addition to the prevention of bugs. For example, in cases where you’re using large data structures, mutating an instance in place may be faster than copying and returning newly allocated instances. With smaller data structures, creating new instances and writing in a more functional programming style may be easier to think through, so lower performance might be a worthwhile penalty for gaining that clarity.
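
The two styles can be put side by side. This is only a sketch; both function names are made up for illustration:

```rust
// Mutates the caller's data in place; `&mut` makes this visible at the call site.
fn increment_in_place(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v += 1;
    }
}

// Leaves the input untouched and returns a newly allocated vector.
fn incremented(values: &[i32]) -> Vec<i32> {
    values.iter().map(|v| v + 1).collect()
}

fn main() {
    let mut a = vec![1, 2, 3];
    increment_in_place(&mut a); // no allocation, `a` changes
    let b = incremented(&a);    // new allocation, `a` stays as it is
    println!("{:?} {:?}", a, b);
}
```

The signature alone tells the caller which variant may change its argument: a function taking &[i32] cannot mutate the elements.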

Static

Global static variables can be mutable, and represent a fixed memory address:

…primary use cases are global locks, global atomic counters, interfacing with legacy C libraries, and embedded programming.

  • Access to a mutable static (static mut) requires an unsafe block
  • Stored in a dedicated data section of the binary (e.g. .data or .bss)
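
For the global atomic counter use case, a static does not have to be declared static mut. A sketch under the assumption that an atomic type from the standard library is acceptable; the names COUNTER and next_id are hypothetical:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Immutable static with interior mutability: changed only through atomic
// operations, so no `unsafe` block is needed to use it.
static COUNTER: AtomicUsize = AtomicUsize::new(0);

// Hypothetical helper: increments the global counter and returns the new value.
fn next_id() -> usize {
    COUNTER.fetch_add(1, Ordering::SeqCst) + 1
}

fn main() {
    let first = next_id();
    let second = next_id();
    println!("{} {}", first, second);
}
```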

Ownership

Tight coupling between assignment and ownership.

  • The compiler analyzes the code to check a standard set of safe code conventions
  • Rust enforces Resource acquisition is initialization (RAII)
    • Automatic and predictable release of resources (drop)
    • Prevents bugs associated with resource leaks
    • No need to manually free memory
    • No garbage collection
  • Variable lifetime spans from declaration until compiler infers it can be dropped
  • Drop of a variable includes all resources which it has ownership of
  • Notion of a destructor in Rust is provided through the Drop trait
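
A minimal sketch of the Drop trait, assuming a hypothetical Guard type that records its name in a shared log when it is dropped, so the order of the drops can be inspected afterwards:

```rust
use std::cell::RefCell;

// Hypothetical type that writes its name to a shared log when dropped.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Guard<'a> {
    // called automatically when the owner goes out of scope
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which the guards were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Guard { name: "outer", log: &log };
        {
            let _inner = Guard { name: "inner", log: &log };
        } // `_inner` goes out of scope and is dropped here
    } // `_outer` is dropped here
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```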

Rust uses lexical scopes - name resolution depends on the location in the source code and the lexical context.

fn main() {
    { // anonymous scope created
        let x = 1;
    } // drop value of x
    println!("{}", x);
}
error[E0425]: cannot find value `x` in this scope
  |
5 |     println!("{}", x);
  |

Rust enforces three simple rules of ownership:

  1. Each value has a variable which is the owner.
  2. Each value has exactly one owner at a time.
  3. When the owner goes out of scope the value is dropped
fn main() {
    let x = "a".to_string();
    let y = x;    // move ownership
    let z = x;    // previous owner can no longer be used
    println!("{}", z);
}

The compiler complains about the ownership

error[E0382]: use of moved value: `x`
  |
2 |     let x = "a".to_string();
  |         - move occurs because `x` has type `std::string::String`, which does not implement the `Copy` trait
3 |     let y = x;
  |             - value moved here
4 |     let z = x;
  |             ^ value used here after move

Borrowing

Given that there are rules about only having one mutable pointer to a variable binding at a time, Rust employs a concept of borrowing.

One piece of data can be borrowed either as a shared borrow or as a mutable borrow at a given time (not both at the same time).

Shared Borrowing

Data is borrowed by one or more users.

Data cannot be altered, but is readable by all users.

fn main() {
    let a = [1,2,3,4,5];
    let b = &a;                   // shared borrow of `a`
    println!("{:?} {}", a, b[1]);
}
[1, 2, 3, 4, 5] 2

Mutable Borrowing

Data can be borrowed and altered by a single user.

fn main() {
    let mut a = [1,2,3,4,5];
    let b = &mut a;         // mutable borrowing of `a`
    b[0] = 6;               // ⁝
                            // ...ends here
    println!("{:?}", a);
}
[6, 2, 3, 4, 5]

Data is not accessible to any other user during that time.

fn main() {
    let mut a = [1,2,3,4,5];
    let b = &mut a;          // mutable borrowing of `a`
    a[1] = 7;                // ⁝
    println!("{:?}", b);     // ...ends here
}
error[E0503]: cannot use `a` because it was mutably borrowed
  |
3 |     let b = &mut a;
  |             ------ borrow of `a` occurs here
4 |     a[1] = 7;
  |     ^^^^ use of borrowed `a`
5 |     println!("{:?}", b);
  |                      - borrow later used here
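
One way to avoid this error is to let the mutable borrow end before the owner is used again. A sketch with a hypothetical helper update that performs both writes without overlapping borrows:

```rust
// Hypothetical helper: applies both writes one after the other.
fn update(mut a: [i32; 5]) -> [i32; 5] {
    {
        let b = &mut a; // mutable borrow of `a` starts
        b[0] = 6;
    }                   // mutable borrow ends with the scope
    a[1] = 7;           // `a` is accessible again
    a
}

fn main() {
    println!("{:?}", update([1, 2, 3, 4, 5]));
}
```

The explicit inner scope is only there to make the end of the borrow visible; the compiler also ends a borrow after its last use.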

Control Flow

Conditions

Evaluate a block if a condition holds.

fn main() {

    let number = 5;

    if number < 10 {
        println!("first condition true");
    } else if number < 22 {
        println!("second condition true");
    } else {
        println!("condition was false");
    }
}

if blocks can also act as expressions if all branches return the same type

let answer = if 1 == 2 {
    "whoops, mathematics broke"
} else {
    "everything's fine!"
};

Loops

The loop keyword runs a block endlessly…

  • …until a break keyword stops the loop
  • a value after break is returned as the result of the loop expression
fn main() {
    let mut counter = 0;
    let result = loop {
       counter += 1;
       if counter == 10 {
           break counter;
       }
    };
    println!("Looped {} times",result);
}

A while loop begins by evaluating the boolean loop condition…

  • …and exits as soon as the condition becomes false:
fn main() {
    let mut i = 0;
    while i < 10 {
        print!("{} ", i);
        i = i + 1;
    }
}
0 1 2 3 4 5 6 7 8 9

A for loop with in iterates over a range:

fn main() {
    for i in 0..5 {
        print!("{} ", i);
    }
}
0 1 2 3 4

Collections

Rust’s standard collection library:

  • Efficient implementations of common data structures.
  • Cover most use cases for generic data storage and processing

Vector

Vectors are re-sizable arrays.

  • Vec is a (pointer, capacity, length) triplet
    • Pointer will never be null, so this type is null-pointer-optimized
    • Memory it points to is on the heap, with the elements stored contiguously in order
  • As low-overhead as possible in the general case
  • Capacity is increased automatically on demand
    • Elements will be reallocated (can be slow)
    • Will never automatically shrink itself

Vec<T> is generic; the element type has no default and must be annotated or inferred.

fn main() {
    // declare an empty vector with explicit data type
    let empty_vector: Vec<i32> = Vec::new();
    println!("{:?}", empty_vector);
}

Initialize a vector using the Vec::new() method or the vec! macro:

fn main() {
    // initialize a mutable empty vector
    let mut mutable_vector = Vec::new();
    // push an element to the vector
    mutable_vector.push(1);
    println!("{:?}", mutable_vector);
    // use the `vec!` macro to initialize an immutable vector with three elements
    let immutable_vector = vec![2,3,4];
    println!("{:?}", immutable_vector);
}

Capacity

Recommended to specify capacity at declaration if possible

  • capacity() number of elements the vector can hold without reallocating
  • len() number of used elements
  • push() and insert() never (re)allocate if capacity is sufficient
  • shrink_to_fit() drops memory if able:
fn main() {
    let mut vec = Vec::new();
    vec.push(1);
    vec.push(2);
    vec.pop();
    println!("{} {}",vec.capacity(),vec.len());
    vec.shrink_to_fit();
    println!("{} {}",vec.capacity(),vec.len());
}
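
Specifying the capacity at declaration can be sketched with Vec::with_capacity. The helper squares is hypothetical; the standard library only guarantees that the capacity is at least the requested number of elements:

```rust
// Hypothetical helper: builds a vector of `n` squares with a single allocation.
fn squares(n: usize) -> Vec<usize> {
    let mut v = Vec::with_capacity(n); // reserve room for at least `n` elements
    for i in 0..n {
        v.push(i * i); // stays within the capacity, so no reallocation
    }
    v
}

fn main() {
    let v = squares(5);
    assert!(v.capacity() >= v.len());
    println!("{:?} {}", v, v.len());
}
```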

Iterators

iter provides an iterator of immutable references:

fn main() {
    let vector = vec![1,2,3];
    for element in vector.iter() {
        print!("{} ",element);
    }
}

iter_mut provides an iterator of mutable references:

fn main() {
    let mut vector = vec![1,2,3];
    for element in vector.iter_mut() {
        *element += 1;
        print!("{} ",element);
    }
}

Strings

The str and String types in Rust are always valid UTF-8.

fn main() {
    // string literal initializing a `&str` slice
    let ss = "a";    // equivalent to `let ss: &str = "a"`
    // create a `String` from a string literal
    let st = "b".to_string();
    // equivalent to
    let sf = String::from("c");
    println!("{} {} {}", ss, st, sf);
}
a b c

The &str slice is provided by the Rust Core:

  • Created using string literals (stored in the program’s binary)
  • May reference a range of UTF-8 text “owned” by someone else
  • Preallocated text that is stored in read-only memory as part of the executable

The String type is provided by the Rust’s standard library:

  • String is a wrapper over a Vec<u8>
  • Stores a pointer on the stack, and data in a heap-allocated buffer
  • Automatically resizes its buffer on demand
  • Interpretation as bytes, as slice, scalar values, and grapheme clusters
  • No indexing (bytes do not correlate to a Unicode scalar value)
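
The difference between bytes and Unicode scalar values can be made visible with a short sketch. The helper sizes is made up for illustration:

```rust
// Hypothetical helper: returns (length in bytes, number of Unicode scalar values).
fn sizes(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    let s = "aℤ";
    // `ℤ` (U+2124) takes three bytes in UTF-8, `a` takes one
    println!("{:?}", sizes(s));
    // `s[1]` does not compile; iterate over the scalar values instead
    println!("{:?}", s.chars().nth(1));
    // slicing by byte range is allowed on character boundaries
    println!("{}", &s[1..]);
}
```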

It’s safe to say that, if the API we’re building doesn’t need to own or mutate the text it’s working with, it should take a &str instead of a String

Rust has a powerful feature called deref coercion, which turns a String reference passed with the borrow operator (&String) into a &str slice:

fn puts(s: &str) {
    print!("{}",s)
}
fn main() {
    let s = "ab";             // `&str` slice
    let t = "cd".to_string(); // `String` type
    puts(s);
    puts(&t);                 // pass by reference
}

Append

Append strings by using push-methods of String:

fn main() {
    let mut s = "a".to_string();
    // append a single character
    s.push('b');
    // append a `str` slice
    s.push_str("cde");
    // ownership of slice `t` not moved to `s`
    let t = "f";
    s.push_str(t);
    println!("{} {}", s, t);
}
abcdef f

Split

split() returns an iterator over substrings of a string slice:

fn main() {
    let s = "a,b,c";
    for t in s.split(",") {
        print!("{} ",t);
    }
}

split_whitespace() returns an iterator over the substrings separated by any amount of whitespace:

fn main() {
    let s = "a b   c";
    for t in s.split_whitespace() {
        print!("{},",t);
    }
}
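
Since both methods return iterators, the substrings can be collected into a vector. A sketch with a hypothetical helper fields that additionally trims each item:

```rust
// Hypothetical helper: splits a comma-separated line into trimmed fields.
fn fields(line: &str) -> Vec<&str> {
    line.split(',').map(|f| f.trim()).collect()
}

fn main() {
    println!("{:?}", fields("a, b ,c"));
    // `split_whitespace` skips runs of whitespace, so no empty items appear
    let words: Vec<&str> = "a b   c".split_whitespace().collect();
    println!("{:?}", words);
}
```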

Slice

Data type that does not have ownership

  • Reference a contiguous sequence of elements in a collection
let a: [i32; 4] = [1, 2, 3, 4]; // Parent Array
let b: &[i32] = &a;             // Slicing whole array
let c = &a[0..4];               // From 0th position to 4th(excluding)
let d = &a[..];                 // Slicing whole array
let e = &a[1..3];               // [2, 3]
let f = &a[1..];                // [2, 3, 4]
let g = &a[..3];                // [1, 2, 3]
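
Because a slice does not own its data, a function that takes &[i32] works with arrays, parts of arrays and vectors alike. A minimal sketch; the function name sum is an assumption for illustration:

```rust
// Accepts any contiguous sequence of `i32` without taking ownership.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let array = [1, 2, 3, 4];
    let vector = vec![10, 20, 30];
    println!("{}", sum(&array));       // whole array
    println!("{}", sum(&array[1..3])); // elements 2 and 3
    println!("{}", sum(&vector));      // `&Vec<i32>` coerces to `&[i32]`
}
```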

Functions

The main function is the program entry point

// program entry point
fn main() {
   // call the `add` function
   print!("{}",add(1,2));
}
// function with parameters, and a return value
fn add(x: i32, y: i32) -> i32 {
   // function body
   x + y // last expression used for return value
}

Declare new functions with fn keyword followed by:

  • snake_case function name
  • Input parameter list in () passed by the caller
  • Function body in {} containing statements and expressions
    • Statements do not return values
    • Expressions evaluate to something and return a value (no ending semicolon)

Parameters are variables that are part of a function’s signature:

  • Must declare the type of each parameter
  • Multiple parameters separated by ,

Functions can return values to the code that calls them

  • Declare type of a return value with ->
  • Return value synonymous with the value of the final expression (implicit return value)
// a function with multiple return values
fn add_sub(x: i32, y: i32) -> (i32, i32) {
    (x + y, x - y)
}
fn main() {
    print!("{:?}",add_sub(1,2));
}

Statements return no value, while expressions return a value.

Return expressions are denoted with the keyword return:

fn max(a: i32, b: i32) -> i32 {
    if a > b {
        return a;
    }
    return b;
}

fn main() {
    println!("{}", max(27,64));
}

Return as the last line of a function works, but is considered poor style.

Match

Rust provides pattern matching via the match keyword, which can be used like a C switch.

A match expression takes an input value, classifies it, and then jumps to code written to handle the identified class of data. [mmmmr]

for i in -2..5 {
    match i { //scrutinee is `i`
        -1 => println!("It's minus one"),
        1 => println!("It's a one"),
        2|4 => println!("It's either a two or a four"),
        _ => println!("Matched none of the arms"),
    }
}

The value compared to the patterns is called the scrutinee.

Scrutinee expression and patterns must have the same type.

let text = "foo bar nom".to_string();
for word in text.split(' ') {
    match word.as_ref() {
        "foo" => {
            println!("bar");
        }
        "bar" => {
            println!("foo");
        }
        _ => println!("no match"),
    }
}

match can return a value:

let v = match 5 {
    1 => 2,
    2 => 3,
    _ => 0,
};
assert_eq!(v,0);
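
Arms are not limited to single values. The following sketch uses a range pattern and a match guard in a hypothetical function classify that returns the result of the match:

```rust
// Hypothetical helper: classifies an integer with a range pattern and a guard.
fn classify(n: i32) -> &'static str {
    match n {
        0 => "zero",
        1..=9 => "single digit",    // inclusive range pattern
        x if x < 0 => "negative",   // binding with a match guard
        _ => "large",               // everything else
    }
}

fn main() {
    for n in [-3, 0, 7, 42] {
        println!("{} is {}", n, classify(n));
    }
}
```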

Match against an enum type:

enum E {
    A,
    B,
    C
}
let e = match E::C {
    E::A => 'a',
    E::B => 'b',
    E::C => 'c'
};
assert_eq!(e,'c');

[mmmmr] Mixing matching, mutation, and moves in Rust
https://blog.rust-lang.org/2015/04/17/Enums-match-mutation-and-moves.html

Macros

fn add(x: i32, y: i32) -> i32 {
    x + y
}

macro_rules! add {

    ($x:expr,$y:expr) => {
        add($x,$y)
    };

    ($x:expr) => {
        add($x,2)
    };

    () => {
        add(1,2)
    };
}

fn main() {
    assert_eq!(add!(),3);
    assert_eq!(add!(1),3);
    assert_eq!(add!(1,2),3);
}

References