Sorting 400+ tabs in 60 seconds with JS, Rust & GPT3: Part 2 - Macros & Recursion

Programmers panicking with their head on fire, picasso, cubism - The artist is a machine, 2023

So, considering the response on the last part, I have a feeling I should add a preface to explain a few things about this post/series:

- It's not about "sorting" as in the algorithmic function of sorting - fun idea tho.
- It's not about GPT writing/optimising sorting functions - also a fun idea.
- It's not really in 60 seconds - the title is a jab at the common "in 60 seconds" trope.
- It's mostly just a tagalong journal of an adventure where I try to solve my problem via over-engineering, with all the fun quirks, complexities, deep dives and scope creeps that come with building software.

So now that we got that out of the way, let's focus on what really matters:
Abusing GPT3 for fun and no profit.

In the last post, we built the user-facing part of our Chrome extension that will help us fix our tab hoarding habit. We wrote some HTML, scribbled some JS and researched the mysteries of the Chrome API. This time, we'll dive into the hottest language on the block right now - Rust.

I don't think I need to explain what Rust is. Even if you've lived under a rock you've probably heard of it - the development community is praising it to high heaven - it has the speed of C, the safety of Java and a borrowing system with the helicopter-parenting skills of an AH-64 Apache attack helicopter.

But - the syntax is neat, the performance is awesome, macros are cool and even though it's mostly strict about memory, it still gives you access to raw pointers and lets you go unsafe.

So, to get a feel of the language, let's try and have some fun with it.

We'll build a simple service that will take our tab collection, simplify it a bit, talk to OpenAI's API and - hopefully without hallucinations - parse the response into something our extension can use. On the way we'll encounter some obstacles, from having too many tabs and wasting money to Silicon Valley waking up and hammering the OpenAI API into oblivion.

Our service will be quite simple - we'll expose a single endpoint, /sort, to which we will POST our tabs and existing categories. To build it, we'll be resorting to the Axum framework, which makes it easy to stand up a server with that endpoint. And to deploy it, we'll use shuttle, so we can spin up a Rust server without swimming through a sea of AWS configs, writing Procfiles or building Docker images.

We'll even use it to scaffold our project, so let's start by installing it.
First, we'll need cargo, the Rust package manager - if you don't have it installed, follow the steps here. Second, we'll need a shuttle account - don't worry, you can sign up there with GitHub in one click - no need to fill out forms.

Now, open up the ol' terminal and hit cargo install cargo-shuttle && cargo shuttle login, followed by cargo shuttle init after you've authenticated.

Follow the instructions to set the project name and location, and in the menu choose axum as your framework. This will scaffold a new axum project as a library, with shuttle as a dependency.

Our folder should now look like this:

├── Cargo.lock
├── Cargo.toml
└── src
    └── lib.rs

It's quite a simple structure - we have Cargo.toml, which is the Rust equivalent of manifest.json or package.json. It contains metadata about your package, its dependencies, compilation features and more. Cargo.lock is a pinned list of the exact dependency versions that were resolved, ensuring consistent builds across environments.
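For reference, here's a rough sketch of what our Cargo.toml might contain at this point - the project name is whatever you picked during init, and the exact crate versions and feature flags are whatever your shuttle version scaffolded, so check your generated file rather than copying these:

[package]
name = "tab-sorter" # or whatever you named it during cargo shuttle init
version = "0.1.0"
edition = "2021"

[dependencies]
# illustrative versions - your scaffold pins the ones matching your shuttle CLI
axum = "0.6"
shuttle-service = { version = "0.8", features = ["web-axum"] }
sync_wrapper = "0.1"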

Our main server code will reside inside src/lib.rs. Let's look at it while it's still fresh and beautiful:

use axum::{routing::get, Router};
use sync_wrapper::SyncWrapper;

async fn hello_world() -> &'static str {
    "Hello, world!"
}

#[shuttle_service::main]
async fn axum() -> shuttle_service::ShuttleAxum {
    let router = Router::new().route("/hello", get(hello_world));
    let sync_wrapper = SyncWrapper::new(router);

    Ok(sync_wrapper)
}

A few things of note here:

  • No main method - since this project is marked as a [lib]rary, there is no predefined entry point needed.

  • The router - the "entry point" for your Axum service. Requests are routed through here and the code is pretty self-explanatory - you pair up the route with the function handling it, e.g. our supercoolservice.com/hello would return a simple "Hello, world!" text.

  • The SyncWrapper - wraps our router object, ensuring it's safe to access across different threads.

  • #[shuttle_service::main] - this is a Rust macro - think of it as a more powerful version of annotations, if you know what those are. It lets you write code that writes code - but that's the lazy explanation. Uuh.. I think we need a quick diversion here.

A quick diversion into the magical realm of macros

Magical realm of macros, hieronymus bosch - 2023, the artist is a machine. First one is a conference talk, the second is pair programming, the third is the rabbit hole you find yourself in after getting into macros.

Now, before we get into the macros, I gotta preface this with a warning: this is not a 100% explanation of macros and how they work in [insert your favorite language here]. For that, there exist hundreds of books, guides and articles.

But for those random readers who stumbled here and don't want to read "monad is a monoid in the category of endofunctors" style articles explaining macros, we'll have a quick dip into the beautiful rabbit-hole of macros.  

So let's imagine we're working in an imaginary language called Bust.

Bust is this cool new language the whole of Twitter is buzzing about, and they say it's going to be the language of the metaverse AI web4 apps. But since it's a new language, it's still early and there aren't many libraries - for example, there are no JSON serialisation libraries yet, so you've gotta write all of that code manually. So every time you create a struct, you also have to write a bunch of serialisation code for it. Like:

struct ReallyBigModel {
    id: String,
    name: String,
    isReal: Bool,
    ...
    stuff: AnotherBigModel
}


impl ToJson for ReallyBigModel {
    fn toJson() -> String {
        return mapOf { "id" to id,
              "name" to name,
              "isReal" to isReal,
              ...,
              "stuff" to stuff.toJson()
            }.toJson()
    }
}

Annoying, isn't it?
Nobody wants to write this much boilerplate every day.

But one day, you read in the latest changelog that it now supports this new thing called macros. There are many types of macros, but in Bust, macros are special methods you can define that consist of two things:

  • The macro attribute
  • The macro function

The attribute is like a mark you can put on other code.
Imagine it being a big red X over classes or methods. So when your compiler is doing its compiling, if it stumbles upon a function with a big red X over its head, it knows it should call your macro function.

The macro function receives the code that is marked with the attribute, decides what to do with it and then returns new code to the compiler which it then integrates back where the marked function was.

So if, in our example, we made a toJson macro, we could add the toJson attribute above any struct and it would write that code for us - the above code would turn into:

#[toJson]
struct ReallyBigModel {
   id: String,
   name: String,
   isReal: Bool,
   ...
   stuff: AnotherBigModel
}

And what would our macro look like?

It would be a function that takes in the code marked with it (represented as tokens) and returns new code that will replace it.

#[toJson]
fn addToJsonTrait(input: TokenStream) -> TokenStream {

    let tree = parseIntoAST(input)
    let nodes = tree.data.asStruct();
    let name = tree.identity
    
    // Get all the children that are properties
    // Map them into format: $name to name
    let properties = nodes
    			.filter((child)=>child.isProperty)
    			.map((property) =>
     			"\"${property.name}\" to ${property.name}")
        		.joinToString(",\n")
                       

    // Write the toJson trait body
    let body  =  quote! { //this is also a kind of macro!
                    impl ToJson for #name {
                        fn toJson() -> String {
                                return mapOf {
                                    #properties
                                     }.toJson();
                                   }
                               }
                 	}
					
    return body.intoTree().intoStream()

}
Note: This is Bust, an imaginary language. Every language has its own macro implementation and this is just a simplified representation of one, so the article doesn't get excessively long.

So now, when our compiler arrives at a struct marked with #[toJson], it will call the addToJsonTrait method, pass it the code for the struct and wait until it returns the new code before it continues compiling.

And just like that, we saved a ton of time by using a macro function and can now be the productive Bust developer we always wanted to be!

Now, don't get too excited - this is just an imaginary implementation.
There is a lot to know about macros and I'd suggest diving deep into the rabbit hole - Rust itself has a few different types of macros, they're one of the reasons people love Lisp so much, there are hygienic and non-hygienic macros, different types of expansions, and a lot more magic hiding away in the deep.
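Just to ground it in real Rust before we move on: the simplest flavour is the declarative macro_rules! macro, which pattern-matches on the input tokens and spits out new code at compile time (the derive macros we'll lean on later, like serde's, are the fancier procedural kind). A quick sketch:

// A declarative macro: pattern-match on the input tokens, emit new code.
macro_rules! map_of {
    ( $( $key:expr => $value:expr ),* $(,)? ) => {{
        let mut m = std::collections::HashMap::new();
        $( m.insert($key, $value); )*
        m
    }};
}

fn main() {
    // Expands at compile time into the insert calls above
    let headers = map_of! {
        "Content-Type" => "application/json",
        "Accept" => "application/json",
    };
    println!("{:?}", headers);
}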

So now that we got that out of the way, let's get back into building our API.

The POST office

We'll hide the simple magic of our service behind the /sort POST method, so delete that hello world and replace the router with one handling the /sort request - Router::new().route("/sort", post(sort_items)) and a sort_items method that will handle the request:

async fn sort_items(Json(payload): Json<SortRequestPayload>) -> impl IntoResponse {
    (StatusCode::OK, Json("ok")).into_response()
}

The method will receive a Json wrapper of our request structure and will return an implementation of the IntoResponse trait, which our server knows how to handle. Specifically, we'll be returning it as a (StatusCode, T) tuple, which the server knows how to transform into an appropriate response.
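To keep ourselves oriented, here's roughly what lib.rs looks like after this change - a sketch assuming the same scaffold as above, plus the models module we're about to create in the next step:

//in lib.rs - roughly what it looks like after swapping the route
use axum::{http::StatusCode, response::IntoResponse, routing::post, Json, Router};
use sync_wrapper::SyncWrapper;

mod models; // we'll create models.rs in a second
use crate::models::SortRequestPayload;

async fn sort_items(Json(payload): Json<SortRequestPayload>) -> impl IntoResponse {
    (StatusCode::OK, Json("ok")).into_response()
}

#[shuttle_service::main]
async fn axum() -> shuttle_service::ShuttleAxum {
    let router = Router::new().route("/sort", post(sort_items));
    let sync_wrapper = SyncWrapper::new(router);

    Ok(sync_wrapper)
}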

One more thing we need to implement is our request data structure. Instead of having it lay around in the same file, let's pop open a new file called models.rs in the src folder and create some basic definitions.

We'll need the SortRequestPayload, which is the wrapper we will receive. It should contain a list of categories and items, so we'll need structures for those too - Category and Item. For the response, we'll need a CategoryWithItems structure that pairs a category with the items belonging to it, plus a Categories wrapper around the list of them. Also, we'll add an ErrorResponse so we know where the problem is.

//in models.rs

pub(crate) struct SortRequestPayload {
    pub(crate) categories: Vec<Category>,
    pub(crate) items: Vec<Item>,
}

pub(crate) struct Category {
    pub(crate) id: usize,
    pub(crate) title: String,
}

pub(crate) struct Item {
    pub(crate) id: usize,
    pub(crate) title: String,
}

pub(crate) struct CategoryWithItems {
    pub category_id: usize,
    pub category_name: String,
    pub items: Vec<usize>
}

pub(crate) struct Categories {
    pub categories: Vec<CategoryWithItems>
}

pub(crate) struct ErrorResponse {
    pub message: String,
}

But we've got one problem - we need our structures to be easily (de)serialisable from/into JSON. For that, we will use a library called Serde and its derive macros (similar to the macro we constructed before), so open up your Cargo.toml file and add serde and serde_json as dependencies:

serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

Now we can mark our structs with serde's #[derive(Deserialize)] and #[derive(Serialize)] macros, so the framework knows how to turn the received JSON into our structs and our structs back into JSON.

//in models.rs

use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
pub(crate) struct SortRequestPayload {
    pub(crate) categories: Vec<Category>,
    pub(crate) items: Vec<Item>,
}

#[derive(Deserialize)]
pub(crate) struct Category {
    pub(crate) id: usize,
    pub(crate) title: String,
}

// Serialize and Clone are needed later, when we chunk the items
// and embed them into the prompt as JSON
#[derive(Deserialize, Serialize, Clone)]
pub(crate) struct Item {
    pub(crate) id: usize,
    pub(crate) title: String,
}

// Clone is needed later, when we carry the categories through the recursion
#[derive(Deserialize, Serialize, Clone)]
pub(crate) struct CategoryWithItems {
    pub category_id: usize,
    pub category_name: String,
    pub items: Vec<usize>
}

#[derive(Deserialize, Serialize)]
pub(crate) struct Categories {
    pub categories: Vec<CategoryWithItems>
}

#[derive(Serialize)]
pub(crate) struct ErrorResponse {
    pub message: String,
}

With this done, we can dive back into our code.
Let's examine our plan:

1. Get the items
2. Assign items to categories
3. Slice the prompt into chunks
4. A recursive sort:
    4.1. Take existing categories and a chunk, turn them into a prompt
    4.2. Ask OpenAI to sort it
    4.3. Deserialize the response
    4.4. Add to existing categories
    4.5. While chunks remain, back to 4.1
5. Return the result

And structure it into methods:

//in lib.rs

...

fn create_chunks_for_prompting(items: Vec<Item>) -> Vec<Vec<Item>>

async fn sort_recursively(
    sorted_categories: Vec<CategoryWithItems>,
    remaining: Vec<Vec<Item>>) -> Result<Categories, String>

fn build_prompt(items: Vec<Item>,
    categories: Vec<CategoryWithItems>) -> String

async fn prompt_open_ai(prompt: String) -> Result<String, String>

Also, we'll need our prompt, so let's try something like this - we tell GPT3 it will receive a list of items, give it the format and then embed the list. Then, we describe the valid JSON format to return and pass in the existing categories. In the end, we tell it to return them to us in that valid JSON format. Hopefully it will adhere to the JSON format and not hallucinate, but we'll fine-tune that in later posts. For now, it seems like specifying the valid JSON format close to the end of the prompt and mentioning "valid JSON format" at the very end keeps it grounded quite nicely.

You will receive list of items with titles and id's in form of [title,id].
Based on titles and urls, classify them into categories, by using existing categories or making new ones.

Tabs are:
[$tabName, $tabId].

Valid JSON format to return is:
{ "categories": [ { 
    "category_id":"id here",
    "category_name": "name here", 
    "items":[tab_id here] } 
]}.

Existing categories are: 
$categories

A new more detailed list of categories (existing and new) with items, in valid JSON format is:

Sounds good to me!
Let's divide it up into constants we can use inside our code.


const PROMPT_TEXT_START: &str = "You will receive list of items with titles and id's in form of [title,id].
Based on titles and urls, classify them into categories, by using existing categories or making new ones.";

const PROMPT_TEXT_MIDDLE: &str = "\nValid JSON format to return is:
{ \"categories\": [ { \"category_id\":\"id here\", \"category_name\": \"name here\", \"items\":[tab_id here] } ]}.
Existing categories are:";

const PROMPT_TEXT_ENDING: &str = "A new more detailed list of categories (existing and new) with items, in valid JSON format is:";

Finally we can get into our sort_items method and start filling it all out. First, we'll take ownership of our data and split it into chunks:

let items = payload.items;
let categories: Vec<CategoryWithItems> = payload.categories.iter().map(|it| {
    CategoryWithItems {
        category_id: it.id,
        category_name: it.title.to_owned(),
        items: Vec::new(),
    }
}).collect();

let prompt_slices = create_chunks_for_prompting(items);

Why chunks? Because if we just add all of the items to the prompt, our prompt size could be well over 4096 tokens - which is what the model we'll be using supports as the maximum length for prompt and completion. So we need to find a way to split it into a suitable size and have some buffer for the completion too - we'll leave a 50% buffer for it, leaving our prompt size at 2048.
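To make that budget concrete with some made-up numbers: if our items serialize to roughly 3,800 whitespace-separated tokens and the hardcoded prompt text eats about 150, we'd want 3800 / (2048 - 150) ≈ 2 chunks - two smaller OpenAI calls instead of one oversized one.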

To achieve that, our create_chunks_for_prompting function will need to do three things:

  • Count the number of tokens in our base prompt
  • Count the number of tokens in the data we send to the API
  • Calculate the number of chunks we need by dividing the total token count by (2048 minus our hardcoded prompt size)

According to the OpenAI documentation, we can guesstimate that a token is about 4 characters. Now, there are a lot of different ways to count tokens, and to do it properly we would have to do a bit more than divide the length by 4 - the best way would be to use the Rust tokenizers crate and its GPT2 tokenizer. But since that leads us down another rabbit hole, we're gonna skip it for now and do a simpler trick - the split_whitespace method, which gives us a rough approximation of the token count.

fn create_chunks_for_prompting(items: Vec<Item>) -> Vec<Vec<Item>> {

    //approximate the tokens in our data
    let json_size = serde_json::to_string(&items).unwrap()
        .split_whitespace()
        .count();

    //get the size of our hardcoded prompt
    let hardcoded_prompt = format!("{a}{b}{c}",
                                   a = PROMPT_TEXT_START,
                                   b = PROMPT_TEXT_MIDDLE,
                                   c = PROMPT_TEXT_ENDING);

    let hardcoded_prompt_size = hardcoded_prompt
        .split_whitespace()
        .count();

    //find the number of chunks we should split the items into
    let chunks_to_make = json_size / (2048 - hardcoded_prompt_size);

    //split the vector up into N roughly equal chunks
    let chunks = items.chunks(items.len() /
                                  (if chunks_to_make > 0 {
                                      chunks_to_make
                                  } else { 1 }));

    //return the list of chunks
    return chunks.map(|s| s.into()).collect();
}

Now, let's get down to our build_prompt function.
To build our prompt, we need the list of items to be sorted and the existing categories. We'll take the list of items and format! it to a string in the form of [title,id]. Then, we'll turn the categories into JSON and use the format! macro to combine it all into a single prompt.

fn build_prompt(items: Vec<Item>,
                categories: Vec<CategoryWithItems>) -> String {

    //map items into [title,id] then join them all into a string
    let items_joined = items.iter().map(|item| format!(
                                        "[{title},{id}]",
                                        title = item.title,
                                        id = item.id))
                                .collect::<Vec<String>>()
                                .join(",");

    let categories_json = serde_json::to_string(&categories).unwrap();

    format!("{prompt}\n{tabs}{middle}{categories}\n{ending}",
            prompt = PROMPT_TEXT_START,
            tabs = items_joined,
            middle = PROMPT_TEXT_MIDDLE,
            categories = categories_json,
            ending = PROMPT_TEXT_ENDING)
}

Now, to actually send that prompt to OpenAI, we'll need an HTTP client.
For that, we'll be using the reqwest crate - it provides a high-level HTTP client with simple async functions we can use to talk to the OpenAI API, and it has a json feature that gives us easy serialization/deserialization. So let's add it to our Cargo.toml file:

[dependencies]
...
reqwest = { version = "0.11", features = ["json"] }

Using this, we can build our HTTP client via the good ol' builder pattern.

let client = Client::builder()
    .http2_keep_alive_timeout(Duration::from_secs(120))
    .timeout(Duration::from_secs(120))
    .build()
    .unwrap();

But if we built the client inside our prompt_open_ai function, we would be creating a Client instance for each request we make. So let's instead create it once inside our sort_items function and pass it down as an argument to the sort_recursively and prompt_open_ai functions. This way, we only use one instance of the HTTP client per /sort call, and our prompt_open_ai function can focus solely on calling the API and giving us the result back.

So let's build a simple POST call and see how we can receive its Result.
To keep things clean, we'll create a separate module inside our project - modules are containers for your code (akin to packages), letting you create some separation between different areas of it. Create a new folder called openai and two new files in it (and don't forget to declare the modules - mod models; and mod openai; - in lib.rs so the compiler picks them up):

  • a mod.rs for our code
  • a models.rs for our models

Open up the models.rs and add the structs we need to communicate with the OpenAI Completions API:


use serde::{Deserialize, Serialize};

#[derive(Serialize)]
pub(crate) struct AskGPT {
    pub prompt: String,
    pub model: String,
    pub max_tokens: usize,
    pub stream: bool,
    pub temperature: usize,
    pub top_p: usize,
    pub n: usize,
}

#[derive(Deserialize)]
pub(crate) struct Completion {
    pub model: String,
    pub choices: Vec<Choices>,
}

#[derive(Deserialize)]
pub(crate) struct Choices {
    pub text: String,
    pub index: usize,
}

And in the mod.rs we can build our prompt_open_ai method, with the POST request which will send our newly created AskGPT model to their /completions endpoint.

Now, there are a few important fields here - the self-explanatory prompt field; the model, which lets us choose which model will do the completion (at the time of writing, text-davinci-003 is the best-performing one for this task); max_tokens, which we'll set to 4096 (the max, d'oh); n, which controls the number of responses; and temperature, which tells it how adventurous to be with the probabilities - the higher it is, the more random the completion might seem - we'll use 0 so our output is less random.

Note: For this part, you'll need your OpenAI API key, which you can find here.

async fn prompt_open_ai(prompt_txt: String,
                        client: &Client) -> Result<String, String> {
    let token = String::from("YOUR_API_KEY_HERE");
    let auth_header = format!("Bearer {}", token);

    let req = client.post("https://api.openai.com/v1/completions")
        .header("Authorization", auth_header)
        .json(&AskGPT {
            prompt: prompt_txt,
            model: String::from("text-davinci-003"),
            max_tokens: 4096,
            n: 1,
            stream: false,
            temperature: 0,
            top_p: 1,
        }).send().await;

    // we'll deal with `req` in a moment
}

Finally, a Result!
But what do we do with it?

Well, we can just add ? to the end of the await, which would immediately give us the Response, but that's no fun, so we'll use probably one of my favorite Rust features - the famous match.
match statements are at the core of the Rust developer experience, providing you with powerful pattern matching abilities that ensure all the paths your code takes are covered.

But Ian, what is so special about it?
Isn't it just if/else on steroids?
Oh no, it's way more than that. Unlike a set of if/else or switch statements, match forces you to handle every possibility, ensuring you cover both the happy and the sad paths your code can take. Why is this so superpowered? Because exhaustiveness checking removes a whole class of bugs caused by unhandled cases. It's one of those rare tools that improves readability, prevents bugs and increases maintainability in a single swoop.

So let's try to use it - the syntax is simple: on the left-hand side is the pattern you are matching against, and on the right-hand side is the code block to execute. First, we'll check whether the request actually went through by matching on the Result we got.

    match req {
        Ok(response) => {
          //request actually happened, we can access response safely
        }
        Err(error) => {
            //TODO handle error
        }
    }

Now in our Ok branch, we can access our response object safely, knowing the error case is covered too and isn't going to cause a runtime crash.
We can move on to check whether the request was actually successful by simply checking if the status code is 200 OK.

match response.status() {
    StatusCode::OK => {
      // smashing success 
    }
    other => {
      // TODO handle error
    }
}

And finally, for the main step - if the request was a success, we should try and deserialize the body into our Completion struct. But since that can fail too, we should do a quick match here too and extract the response from our completion object:

match response.json::<Completion>().await {
    Ok(parsed) => {
        //We know there is always at least 1 item in choices
        //due to our request param n == 1, so we'll live wild and unwrap
        let choices = parsed.choices.first().unwrap();
        let json: &str = choices.text.borrow();
        Ok(String::from(json))
    }
    Err(_) => Err(Parsing)
}

Now, to handle the errors - let's add an enum that denotes the different types of errors we can have (yes, I'll condense all possible errors into these three types. What could go wrong..) - the connection error, the server response error and the parsing error. Hop up to the openai module's models.rs, add it, and change prompt_open_ai's return type to Result<String, OpenAiError> so we can return it:

#[derive(Debug)]
pub(crate) enum OpenAiError {
    Connection,
    Parsing,
    Server,
}

And back in mod.rs, the full match now looks like this:

match req {
    Ok(response) => {
        match response.status() {
            StatusCode::OK => {
                match response.json::<Completion>().await {
                    Ok(parsed) => {
                        //there is always at least 1 due to our request
                        let choices = parsed.choices.first().unwrap();
                        let json: &str = choices.text.borrow();
                        Ok(String::from(json))
                    }
                    Err(_) => Err(Parsing)
                }
            }
            _ => Err(Server)
        }
    }
    Err(_) => Err(Connection)
}
Congratulations! We've successfully made our request in a safe manner and covered all the sad and happy paths on the way.

So with our requests poppin', we can finally start working on our sort_recursively function. Why recursion here? Because we're basically folding the list of chunks into a growing set of categories, with GPT3 acting as our reducer function. While we could do a loop here and call the method n times, that would mean mutating a variable outside the loop (the one holding our categories). As that feels dirty, we'll do it the clean, functional way, using our good ol' friend recursion.
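For contrast, here's a rough sketch of the loop we're avoiding (hypothetical names - existing_categories and prompt_slices are the values we built in sort_items, and the ? assumes we're inside a function returning a Result with a String error):

// the imperative alternative: accumulate categories by mutating them in a loop
let mut categories: Vec<CategoryWithItems> = existing_categories;
for chunk in prompt_slices {
    let prompt = build_prompt(chunk, categories.clone());
    let response = prompt_open_ai(prompt, &client)
        .await
        .map_err(|e| format!("Error communicating with OpenAI - {:?}", e))?;
    let parsed = serde_json::from_str::<Categories>(&response)
        .map_err(|_| "Parsing response error".to_string())?;
    categories.extend(parsed.categories);
}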

So let's open up our lib.rs and get into the sort_recursively function.

First, we'll build our prompt, then send it to prompt_open_ai and try to deserialize the response. If it succeeds, we join it with the existing categories and pass it again into sort_recursively with the remaining chunks, until we're out of chunks.

async fn sort_recursively(
                        sorted_categories: Vec<CategoryWithItems>,
                        remaining: Vec<Vec<Item>>,
                        client: Client) -> Result<Categories, String> {

    let mut next_categories = sorted_categories.to_owned();
    let prompt = build_prompt(remaining.first().unwrap().to_vec(),
                              sorted_categories);

    let ai_response_result = prompt_open_ai(prompt, &client).await;

    match ai_response_result {
        Ok(ai_response) => {
            //try to deserialize it
            let parsed = serde_json::
                            from_str::<Categories>(ai_response.as_str());
            match parsed {
                Ok(wrapper) => {
                    let mut new_categories = wrapper
                                .categories.to_owned();
                    //remove the processed chunk
                    let mut next_slice = remaining.to_owned();
                    next_slice.remove(0);
                    //join the categories
                    next_categories.append(&mut new_categories);
                    //if we're not done yet, recurse
                    if next_slice.len() != 0 {
                        let next = sort_recursively(next_categories,
                                                    next_slice,
                                                    client).await;
                        match next {
                            Ok(cats) => Ok(cats),
                            Err(_) => Err(String::from("Sort failed"))
                        }
                    } else {
                        Ok(Categories { categories: next_categories })
                    }
                }
                Err(_) => Err("Parsing response error".to_string())
            }
        }
        Err(err) => Err(format!("Error communicating with OpenAI - {:?}", err))
    }
}

With all these matches, our code is starting to look pretty ugly. One way to avoid nested match hell is to use the map, map_err and and_then combinators - map and and_then operate on the Ok value while map_err operates on the Err value, letting us chain the steps into a more readable, concise version of the same logic. Each value only flows through its corresponding combinators, so we can safely map our data and our errors to the proper format.
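A tiny, self-contained example of the style with made-up values - note how map_err only touches the error side, while and_then and map only touch the success side:

let n: Result<i32, String> = "42"
    .parse::<i32>()
    .map_err(|e| format!("not a number: {e}"))  // only runs on Err
    .and_then(|v| if v > 0 { Ok(v) } else { Err("must be positive".to_string()) })
    .map(|v| v * 2);                            // only runs on Ok

assert_eq!(n, Ok(84));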

We'll use them to flatten the first set of nested matches and leave the last one as a match. Why? Because the recursive call is async, and async closures still aren't stable in Rust. We'll map all the errors into an Err(String) so we can return them properly:

async fn sort_recursively(
                        sorted_categories: Vec<CategoryWithItems>,
                        remaining: Vec<Vec<Item>>,
                        client: Client) -> Result<Categories, String> {

    let mut next_categories = sorted_categories.to_owned();
    let prompt = build_prompt(remaining.first().unwrap().to_vec(),
                              sorted_categories);

    let ai_response_result = prompt_open_ai(prompt, &client).await;

    let res = ai_response_result
        .map_err(|e|
                format!("Error communicating with OpenAI - {:?}", e))
        .and_then(|ai_response|
            serde_json::from_str::<Categories>(ai_response.as_str())
                .map_err(|_| "Parsing response error".to_string()));

    match res {
        Ok(wrapper) => {
            let mut new_categories = wrapper.categories.to_owned();
            //remove the processed chunk
            let mut next_slice = remaining.to_owned();
            next_slice.remove(0);
            //join the categories
            next_categories.append(&mut new_categories);
            //if we're not done yet recurse
            if next_slice.len() != 0 {
                sort_recursively(next_categories, 
                                next_slice,
                                client).await
                    .map_err(|e| 
                        format!("Sorting failed, reason: {}", e))
            } else {
                Ok(Categories { categories: next_categories })
            }
        }
        Err(msg) => Err(msg)
    }
}

There it is - we called the API in a safe, error free-oh-wait.... it's not compiling.

Well, one thing we didn't think about is async recursion.
Why is this such a problem?

Well, due to how async/await is implemented in Rust (and a lot of other languages), under the hood it generates a state machine type containing all the futures in the method. But now that we are adding recursion, the generated type starts referencing itself - it blows up into a potentially infinitely recursive type, and the compiler cannot determine its size. To stop that, we need the recursion to return a Box'd Future, which gives us a fixed-size pointer to the heap instead of the whole nested object, preventing the infinite self-referencing under the hood.
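For the curious, a minimal sketch of what the manual fix looks like - boxing and pinning the returned future ourselves, with the body stubbed out so the signature is the focus (this is essentially what the macro we're about to use generates for us):

use std::future::Future;
use std::pin::Pin;

fn sort_recursively_boxed(
    sorted_categories: Vec<CategoryWithItems>,
    remaining: Vec<Vec<Item>>,
    client: Client,
) -> Pin<Box<dyn Future<Output = Result<Categories, String>> + Send>> {
    Box::pin(async move {
        // ...the same body as before goes in here; the recursive call now
        // returns a heap pointer of known size, so the compiler is happy.
        // Stubbed out so this sketch stands on its own:
        let _ = (remaining, client);
        Ok::<Categories, String>(Categories { categories: sorted_categories })
    })
}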

I'd recommend reading more about this problem here and following this rabbit hole deeper and deeper - it covers a lot of language design questions and concepts which appear through many languages. But, for now, all we are going to do is use the async_recursion crate, so head on to your Cargo.toml and add it there:

[dependencies]
..
async-recursion = "1.0.2"

And mark your function with the #[async_recursion] macro so it can do the boxing for you.
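Applied to our function, it's just an import and the attribute on top - the body stays exactly the same as above:

use async_recursion::async_recursion;

#[async_recursion]
async fn sort_recursively(
    sorted_categories: Vec<CategoryWithItems>,
    remaining: Vec<Vec<Item>>,
    client: Client) -> Result<Categories, String> {
    // ...body unchanged from the version above
}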

With that out of the way, we can come back to our original sort_items method and finally respond to that API request. We last left it after adding the Client instance, so head down below that, call the sort_recursively method, use map_err to wrap any error into our ErrorResponse structure (as JSON, with an error status code), and use map to turn our Ok result into a proper response:

    sort_recursively(categories, prompt_slices, client).await
        .map_err(|e| 
            (StatusCode::INTERNAL_SERVER_ERROR, 
            Json(ErrorResponse { message: e })).into_response())
        .map(|wrapper| {
            let new_categories = wrapper.categories.iter().map(|item| {
                CategoryWithItems {
                    category_id: item.category_id.to_owned(),
                    category_name: item.category_name.to_owned(),
                    items: item.items.to_owned(),
                }
            }).collect::<Vec<CategoryWithItems>>();
            (StatusCode::OK, Json(Categories {
                categories: new_categories
            })).into_response()
        })

And with this done, our service is now finished!

We take the request, turn it into prompts, ask GPT3 and give the sorted categories back to the user. Our plan is safe and sound. All that's left to do is deploy it - but we don't have to think about provisioning instances, setting up security groups or writing Dockerfiles. Since we scaffolded our service via shuttle, we can deploy it with a simple touch of the terminal. Open up your project folder in your shell of choice and type:

cargo shuttle deploy

Now stand up, take a few breaths, grab a sip of the coffee and before you even know it, your server is up and running at: https://projectname.shuttleapp.rs/

Now, uh... why were we even doing this?

Oh yeah, we were writing a JS extension. With our server up, it's nearly finished - just pop over to the extension and replace the localhost endpoint with the real endpoint you just got from shuttle.

Now, load the extension into a small window just to test it. Hit the sort button, wait for a bit and - BAM! Your tabs should be magically sorted into proper groups! Finally!

Let's try it in our real window - the one with... uhh, it's nearing 600 tabs now. So we'll just hit the sort button and - wait...

...wait..

.....wait a bit more....

...... waaaaait it's coming...

.... this is taking way longer than 60 seconds...

... oh wait...

.. error?

Ooops - we hit the token limit!
Why? How? Didn't we do the whole chunking thing just so it fits?

Weeeeell, seems like we'll need to do a better calculation on prompt sizes.

Also, our recursion is causing problems - adding all previous categories to each prompt is causing it to blow up in size and it takes a really long time to actually finish the whole chain - way longer than 60 seconds.

And finally, the categories are quite... meh.

Which is great, since it gives us more stuff to do for the next iteration - we'll see how to eliminate this recursion, how to use a GPT tokenizer and embed dictionary files into the binary, and how to use shuttle's static folder service for them instead of blowing up our build times. We'll also take a stab at fine-tuning the model, giving us better results for fewer tokens - and since we're lazy, we'll just be generating the training data using GPT itself.

If you've come this far, thanks for reading and don't worry, we have many more feature creeps and potential problems to uncover on our path, so see you in the next episode of "Human vs Machines".

Machines might still have some problem understanding the concept of cuteness. "Cute rusty crab illustration" - 2023, the artist is a machine.