Chapter 1-Week 1 (Getting Started With Rust for Cloud, Data and MLOps)

(btw, yes, I made this image using Rust)

Demo Video of building a Command-Line Tool in Rust

Graduate Cloud Computing for Data w/ Rust first approach

Key Goals in Semester

  • ~1,500 Rust projects = 100 students * 15 weeks
  • Build resume-worthy projects
  • Projects should be runnable with minimal instructions as command-line tools or microservices deployed to the cloud

How to Accomplish Goals

Two different demo channels

  • Weekly Learning Demo: Projects should take 10-60 minutes on average to complete (text-only explanation; screencast optional). You must link to the code and explain it in README.md.
  • Weekly Project Progress Demo: Demo via screencast, required. The demo should be 3-7 minutes.

Two Different Portfolio Styles

Weekly Learning Repo Spec
Big Projects Repo Spec

Each "big" project should have a dedicated repo for it; a good example is the following repo: https://github.com/noahgift/rdedupe. Please also follow these additional guidelines:

  • Each repo needs a well-written README.md with an architectural diagram
  • Each repo needs a GitHub release (see example here: https://github.com/rust-lang/mdBook/releases) where a person can run your binary.
  • Each repo needs a containerized version of your project that a person can either build themselves or docker pull from a public container registry like Docker Hub.
  • I would encourage advanced students to build a library for one of your projects and submit it to crates.io: https://crates.io if it benefits the Rust community (Don't publish junk)
  • Each repo needs to publish a benchmark showing performance. Advanced students may want to consider benchmarking your Rust project against a Python project
  • You should default toward building command-line tools with Clap: https://crates.io/crates/clap and web applications with Actix: https://crates.io/crates/actix, unless you have a compelling reason to switch to a new framework.
  • Your repo should include continuous integration steps: test, format, lint, publish (deploy as a binary or deploy as a Microservice).
  • Microservices should include logging; see rust-mlops-template for example.
  • A good starting point is this Rust new project template: https://github.com/noahgift/rust-new-project-template
  • Each project should include a reproducible GitHub .devcontainer workflow; see rust-mlops-template for example.
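
As a rough illustration of the benchmark guideline above, here is a minimal timing sketch that uses only the standard library. The workload (sum_of_squares) and the helper (best_of) are hypothetical names invented for this example; a dedicated harness such as the criterion crate would give more rigorous numbers, and release builds (cargo run --release) are what you would want to measure.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical workload for illustration: sum of squares of 0..n.
pub fn sum_of_squares(n: u64) -> u64 {
    (0..n).map(|x| x * x).sum()
}

/// Runs `f` `runs` times and returns the fastest wall-clock time observed.
pub fn best_of<F: FnMut()>(runs: u32, mut f: F) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..runs {
        let start = Instant::now();
        f();
        best = best.min(start.elapsed());
    }
    best
}

fn main() {
    let n: u64 = 1_000_000;
    // black_box keeps the optimizer from removing the computation entirely.
    let elapsed = best_of(5, || {
        black_box(sum_of_squares(black_box(n)));
    });
    println!("sum_of_squares({n}): best of 5 runs = {elapsed:?}");
}
```

Reporting the best of several runs reduces noise from other processes; the same measurement taken on an equivalent Python script would give the comparison suggested for advanced students.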

Structure Each Week

  • 3:30-4:45 - Teach
  • 4:45-5:00 - Break
  • 5:00-6:00 - Teach

Projects

Team Final Project (Team Size: 3-4): Rust MLOps Microservice
  • Build an end-to-end MLOps solution that invokes a model in a cloud platform using only Rust technology (i.e., pure Rust code). Examples could include a PyTorch model, a Hugging Face model, or any model packaged with a microservice. (see the guide above about specs)
Individual Project #1: Rust CLI
  • Build a useful command-line tool in data engineering or machine learning engineering. (see the guide above about specs)
Individual Project #2: Kubernetes (or similar) Microservice in Rust
  • Build a functional web microservice in data engineering or machine learning engineering. (see the guide above about specs)
Individual Project #3: Interact with Big Data in Rust
  • Build a functional web microservice or CLI in data engineering or machine learning engineering that uses a large data platform. (see the guide above about specs)
Individual Project #4: Serverless Data Engineering Pipeline with Rust

Optional Advanced Individual Projects

For advanced students, feel free to replace one of the projects above with a project from one of these domains:

  • WebAssembly Rust: Follow the above guidelines, but port your deploy target to Rust WebAssembly. For example, you could run Hugging Face in the browser.

  • Build an MLOps platform in Rust that could be a commercial solution (just a prototype)

  • Build a Rust Game that uses MLOps and runs in the cloud

  • (Or something else that challenges you)

Onboarding Day 1

  • GitHub Codespaces with Copilot
  • AWS Learner Labs
  • Azure Free Credits
  • More TBD (AWS Credits, etc.)

Getting Started with GitHub Codespaces for Rust

rust-new-project-template

All Rust projects can follow this pattern:

  1. Create a new repo using Rust New Project Template: https://github.com/noahgift/rust-new-project-template
  2. Create a new Codespace and use it
  3. Use main.rs to handle the CLI and lib.rs to handle logic, and import clap in Cargo.toml as shown in this project.
  4. Use cargo init --name 'hello' (or whatever you want to call your project).
  5. Put your "ideas" in as comments in Rust to seed GitHub Copilot, e.g. // build an add function
  6. Run make format, i.e. cargo fmt
  7. Run make lint i.e. cargo clippy --quiet
  8. Run project: cargo run -- --help
  9. Push your changes to allow GitHub Actions to: format check, lint check, and other actions like binary deploy.

This is an emerging pattern, and it is well suited to systems programming in Rust.
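
A minimal sketch of steps 3 and 5, collapsed into a single file so it runs on its own. It assumes the example comment from step 5 and a hypothetical add function; in a real project the function would live in src/lib.rs and be called from src/main.rs through the package name (for example hello::add).

```rust
// Would live in src/lib.rs. The comment below is the kind of "idea"
// that seeds GitHub Copilot before the function body exists.

// build an add function
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

// Would live in src/main.rs and call the library as hello::add(...).
fn main() {
    println!("2 + 3 = {}", add(2, 3));
}
```

Keeping the logic in lib.rs means it can be unit tested and reused without going through the command-line layer.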


Repo example here: https://github.com/nogibjj/hello-rust

Reproduce

A good starting point for a new Rust project

To run: cargo run -- marco --name "Marco"

Be careful to use the NAME of the project in the Cargo.toml to call lib.rs, as in:

[package]
name = "hello"

For example, main.rs calls hello::marco_polo: hello is the package name from Cargo.toml, and marco_polo is the function defined in lib.rs.

lib.rs code:

/* A Marco Polo game. */

/* Accepts a string with a name.
If the name is "Marco", returns "Polo".
For any other value, returns "Marco".
*/
pub fn marco_polo(name: &str) -> String {
    if name == "Marco" {
        "Polo".to_string()
    } else {
        "Marco".to_string()
    }
}
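
Since the guidelines ask for a test step in continuous integration, here is one way the function above could be covered by unit tests. This is a sketch, not the repo's actual test suite: the test names are invented, and a small main is included only so the file runs standalone. In lib.rs the tests module alone would be enough for cargo test to pick up.

```rust
pub fn marco_polo(name: &str) -> String {
    if name == "Marco" {
        "Polo".to_string()
    } else {
        "Marco".to_string()
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn marco_returns_polo() {
        assert_eq!(marco_polo("Marco"), "Polo");
    }

    #[test]
    fn other_names_return_marco() {
        assert_eq!(marco_polo("Polo"), "Marco");
        // The comparison is case-sensitive, so "marco" is not a match.
        assert_eq!(marco_polo("marco"), "Marco");
    }
}

// Only here so the sketch runs as a single file.
fn main() {
    println!("{}", marco_polo("Marco"));
}
```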

main.rs code:

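// Brings the Parser trait into scope so that Cli::parse() is available.
use clap::Parser;

// Cli and Commands are clap-derived types defined in main.rs;
// their definitions are not shown in this excerpt.

fn main() {
    let args = Cli::parse();
    match args.command {
        // `hello` is the package name from Cargo.toml.
        Some(Commands::Marco { name }) => {
            println!("{}", hello::marco_polo(&name));
        }
        None => println!("No command was used"),
    }
}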
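
To make the dispatch in main.rs concrete without pulling in a dependency, the sketch below replaces clap with a hand-rolled parser built on std::env::args. It is a stand-in for illustration, not how the repo does it: the Commands enum shape is an assumption (the excerpt above does not show its definition), parse_args and run are hypothetical helpers, and clap itself does considerably more, such as generating help text and reporting invalid arguments.

```rust
use std::env;

/// Assumed shape of the subcommand enum; not shown in the original excerpt.
#[derive(Debug, PartialEq)]
pub enum Commands {
    Marco { name: String },
}

/// Hand-rolled stand-in for clap: recognizes only `marco --name <NAME>`.
pub fn parse_args(args: &[String]) -> Option<Commands> {
    match args {
        [cmd, flag, name] if cmd.as_str() == "marco" && flag.as_str() == "--name" => {
            Some(Commands::Marco { name: name.clone() })
        }
        _ => None,
    }
}

pub fn marco_polo(name: &str) -> String {
    if name == "Marco" {
        "Polo".to_string()
    } else {
        "Marco".to_string()
    }
}

/// Same dispatch as main.rs above, returning the text instead of printing it.
pub fn run(args: &[String]) -> String {
    match parse_args(args) {
        Some(Commands::Marco { name }) => marco_polo(&name),
        None => "No command was used".to_string(),
    }
}

fn main() {
    // Skip the first argument, which is the program name.
    let args: Vec<String> = env::args().skip(1).collect();
    println!("{}", run(&args));
}
```

Returning a String from run keeps the dispatch testable; main only collects the arguments and prints.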
References