<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Ian's blog | entropy.observer]]></title><description><![CDATA[brainmush - Ian Rumac]]></description><link>https://blog.entropy.observer/</link><image><url>https://blog.entropy.observer/favicon.png</url><title>Ian&apos;s blog | entropy.observer</title><link>https://blog.entropy.observer/</link></image><generator>Ghost 5.71</generator><lastBuildDate>Thu, 09 Apr 2026 22:14:56 GMT</lastBuildDate><atom:link href="https://blog.entropy.observer/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[AI SuperAgents: (ab)using WASM for LLM function calls - Part 1]]></title><description><![CDATA[<blockquote>&quot;why are you even doing this, it doesnt make sense, half of this stuff isn&apos;t even stable, you&apos;re just torturing yourself&quot; - Every sane person</blockquote><p>Bleeding edge technology is fun.<br>You know what&apos;s also fun? 
Smacking two bleeding edge technologies together.<br><br>One</p>]]></description><link>https://blog.entropy.observer/get-witty-wit-it-using-wasm-as-platform-for-llm-function-calls/</link><guid isPermaLink="false">65fec6f6da89d344959db561</guid><dc:creator><![CDATA[Ian Rumac]]></dc:creator><pubDate>Sat, 07 Sep 2024 10:23:40 GMT</pubDate><media:content url="https://blog.entropy.observer/content/images/2024/04/685387_surrealist-painting-of-tiny-machines-building-a-gi_xl-1024-v1-0.png" medium="image"/><content:encoded><![CDATA[<blockquote>&quot;why are you even doing this, it doesnt make sense, half of this stuff isn&apos;t even stable, you&apos;re just torturing yourself&quot; - Every sane person</blockquote><img src="https://blog.entropy.observer/content/images/2024/04/685387_surrealist-painting-of-tiny-machines-building-a-gi_xl-1024-v1-0.png" alt="AI SuperAgents: (ab)using WASM for LLM function calls - Part 1"><p>Bleeding edge technology is fun.<br>You know what&apos;s also fun? Smacking two bleeding edge technologies together.<br><br>One can never know what&apos;s gonna happen - in your imagination, they will work magically in unison, bringing your vision to life with sounds of an angelic choir as it finishes compiling - in reality, you&apos;ll probably spend your nights lying in bed, thinking about the 136 tabs of github issues, half-finished documentation and unexplained source code you have open in your browser.<br><br>But if you&apos;re a novelty chasing ADHD squirrel like me and now you&apos;re remembering all those fun nights spent in these rabbit holes - oooh, do I have a treat for you. So grab your coffee, follow me into this rabbit hole and let&apos;s smash some shiny rocks together.</p><h3 id="so-how-bout-those-llms">So, how bout those LLM&apos;s?<br></h3><p>In the last few years, LLM&apos;s have penetrated every aspect of our world. 
From your Google recommendations and the contents of your spam folder, to the eternal AI hypestorms on your Twitter feed, everything seems to be succumbing to the machines. <br><br>But to give machines the agency they need to interact with our world, we need to provide them with a way to do it. And how we&apos;re doing it has mostly been based on the concept of LLM <em>function calling</em> - providing LLMs with a set of possible functions to execute, together with argument descriptions. And it works <em>just fine.</em> We write some code, give the function description to an LLM, it returns us arguments, we call the function, say <em>abracadabra</em> and poof - we got agency.</p><p>But what if we could make it <em>even better</em>?<br><br>What if we could let LLMs write their own functions? Better yet, what if those functions could be reused across conversations, constantly expanding the capabilities of our machines, providing us with a giant library of possible interactions they can have with the real world?<br><br>Oh man, wouldn&apos;t that just be great (and completely, utterly, chaotic)?<br><br>Well, what if I told you there is a perfect way to do that by using another <em>bleeding-edge edgy</em> technology, one that will surely raise some eyebrows at your next local meetup?</p><h3 id="enter-wasm-the-asm-of-the-future">Enter WASM: the ASM of the future</h3><p><br>If you&apos;ve been paying attention to the web world the last few years, you&apos;ve probably heard of WebAssembly. If not, I&apos;ll keep it quick - it&apos;s a new binary format &amp; compile target, made so you can use your good ol&apos; compiled languages such as C, Rust or Java in the browser instead of being constrained to just JavaScript.<br>Basically, like your CPU runs assembly instructions, the browser has a virtual machine that runs WebAssembly instructions. 
Mind blowing, I know.<br><br>But unlike other tries at this (I&apos;m looking at you, Java applets), this one actually seems properly orchestrated and executed, supported by browsers and developed in the open, standardised under W3C&apos;s supervision.<br><br>And while it was announced nearly 9 years ago, it recently started picking up quite a lot more steam, with the development of <strong>WASI</strong> and the <strong>Component model.</strong><br>These allow developers to write code that communicates with the outside world (in an easier way), providing common interfaces (WASI) and allowing developers to define their own (component model). As the preview of this has been maturing lately and is quite usable now, it might just be the perfect time to start exploring all the potential the technology offers.</p><h2 id="wasma-language-for-the-machines">WASM - a language for the machines.</h2><p><br>With the rise of the Agent and function calling paradigms, a lot of useless frameworks have popped up to provide agents with different capabilities - enabling one to chain multiple agents, provide them with function calls and feed them knowledge. Be it searching the web, writing to files or looking up wikipedia, they expand the capabilities of the machines, enabling complex, emergent behaviour to evolve.<br><br>Still, they are quite limited: they are bound to a specific language or framework, they have to be written by the developers and they are rarely shared between projects - meaning we all have to repeat the same boring API calls thousands of others developers already wrote.<br><br>But what if it didn&apos;t have to be so?<br><br>What if instead of having to write these functions ourselves, we could have the machines write them for us, on a per-need basis? 
What if we enabled complete automation of agents and their behaviour, providing them with <em>whatever</em> they need, <em>whenever</em> they need it, by letting them write their own code?<br><br>What if we could build a <em>self-building</em> machine?<br><br>With WASM, we can build just that. Let&apos;s take a look how:</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.entropy.observer/content/images/2024/03/excalidraw-2020323144042--13-.png" class="kg-image" alt="AI SuperAgents: (ab)using WASM for LLM function calls - Part 1" loading="lazy" width="1538" height="583" srcset="https://blog.entropy.observer/content/images/size/w600/2024/03/excalidraw-2020323144042--13-.png 600w, https://blog.entropy.observer/content/images/size/w1000/2024/03/excalidraw-2020323144042--13-.png 1000w, https://blog.entropy.observer/content/images/2024/03/excalidraw-2020323144042--13-.png 1538w" sizes="(min-width: 1200px) 1200px"></figure><p><br>The basic idea is just like your average LLM function call - take the message, have LLM decide if it requires an external function, call the function with the provided arguments your LLM extracted from the message. But, it comes with a twist (or two): </p><ul><li>We can write functions in any language we want, as long as it compiles to WASM and obeys our function contract</li></ul><p>Which enables another, way more entertaining twist:</p><ul><li>If the function doesn&apos;t exist, we can have another LLM agent write it, compile it, execute it and return the calls to the original agent.</li></ul><p>Why is this so entertaining?<br><br>Because in theory, it provides us with the ultimate &quot;self-building&quot; machine, allowing our LLM to expand it&apos;s own capabilities as it goes - want it to scrape a website? Sure, it can write a function for that. Order an Uber? Give it an API key and watch it go. Create another LLM for you? Why not.<br><br><em>Note: &quot;Technically&quot;, I should add. 
With the current state of LLMs, it will probably order an Uber to the middle of the Amazon and spend the budget of an average Balkan country while trying to scrape the same website 26 billion times. Still, quite entertaining.</em> </p><p>But the best part - we get to play with <em>shiny, new technology. </em>And how can one say no to that?<br><br>So take out your editors, sharpen your Rust skills and let&apos;s build the self-building machine!<br></p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://blog.entropy.observer/content/images/2024/04/72141_recursion--4k--gallery--masterpiece--dali--de-chir_xl-1024-v1-0.png" class="kg-image" alt="AI SuperAgents: (ab)using WASM for LLM function calls - Part 1" loading="lazy" width="1536" height="640" srcset="https://blog.entropy.observer/content/images/size/w600/2024/04/72141_recursion--4k--gallery--masterpiece--dali--de-chir_xl-1024-v1-0.png 600w, https://blog.entropy.observer/content/images/size/w1000/2024/04/72141_recursion--4k--gallery--masterpiece--dali--de-chir_xl-1024-v1-0.png 1000w, https://blog.entropy.observer/content/images/2024/04/72141_recursion--4k--gallery--masterpiece--dali--de-chir_xl-1024-v1-0.png 1536w" sizes="(min-width: 1200px) 1200px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">&quot;Recursion, 4k, gallery, masterpiece, dali, de chirico, surrealism, detailed, light sky&quot; - 2024, the artist is a machine</em></i></figcaption></figure><h2 id="building-the-building-blocks-wasm-components">Building the building blocks: WASM Components</h2><p><br>Let&apos;s start with the most important thing - building out an interface for our WASM component. We&apos;ll keep it quite simple, providing us with a few fields we need to discover and use the function. 
And we&apos;ll be writing it in <strong>WIT.</strong></p><p><strong>WIT</strong> (WebAssembly Interface Types) is WebAssembly&apos;s interface format, used to define interfaces for <strong>Components</strong> and <strong>Worlds</strong>. A <strong>component</strong> is just a modular piece of code obeying a contract, while a <strong>world</strong> is a contract for the &quot;world&quot; the component executes in, defining the interfaces the component <em>exports (exposes to the outside world)</em>, and the interfaces that the component <em>imports (uses from the outside world).</em></p><p>As a format itself, <strong>WIT</strong> is quite simple and understandable - if you&apos;ve used any higher-level language, the syntax should be intuitive immediately. We got your good ol&apos; <code>bools</code>, <code>floats</code>, <code>ints</code>, <code>chars</code> and the fan favorite <code>strings</code>. For more advanced needs, it supports not only <code>list</code>, <code>tuple</code>, <code>option</code> and <code>result</code>, but also <code>record</code>, <code>variant</code>, <code>resource</code> and anything else you might need. For now, we&apos;ll keep it simple and just push JSON around, since it&apos;s the end-all-be-all internet format that LLMs can easily recognise<em> (if you want to leave a comment like &quot;oh no JSON, muh performance&quot;, here is your chance).</em><br><br>So let&apos;s define our interfaces - spin up a project folder and create a file at <code>wit/function.wit</code> in which we will create the basic <code>interface</code>.</p><pre><code class="language-WIT">package superagent:functions;

interface function {
  
}</code></pre><p>First, we&apos;ll need to know which function to invoke and what it&apos;s arguments are, so we&apos;ll create a <strong>record</strong> called <code>metadata</code> that exposes us some function information. Besides name, it will also include a <code>description</code> so that we can search among the functions for it, and we&apos;ll list the <code>arguments</code>  too so we know how to invoke it.<br>To be able to retrieve that data from a component, we&apos;ll also create a function called <code>meta</code> which will return our <strong>record</strong>.</p><pre><code class="language-WIT">package superagent:functions;

interface function {

  record metadata {
    name: string,
    description: string,
    arguments: string
  }

  // Retrieves metadata of the function
  meta: func() -&gt; metadata;
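
  // For illustration, a component&apos;s meta() might return values like
  // (these match the sample component we&apos;ll implement below):
  //   name:        &quot;web_pinger&quot;
  //   description: &quot;Checks if a website is up by pinging it&quot;
  //   arguments:   &quot;{ \&quot;endpoint\&quot;: String }&quot;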
  
}</code></pre><p>Then, to call the function itself, we&apos;ll need an <code>invoke</code> method.<br>This method will take in the arguments formatted as a JSON string and return a result which can either be a JSON string containing results or an <code>execution-error</code> which will contain an error message.</p><pre><code class="language-WIT">package superagent:functions;

interface function {

  record metadata {
    name: string,
    description: string,
    arguments: string
  }

  record execution-error {
      reason: string
  }

  // Used to invoke our function
  invoke: func(input: string) -&gt; result&lt;string, execution-error&gt;;

  meta: func() -&gt; metadata;

}</code></pre><p>And to wrap it all up together, let&apos;s define a <strong>world</strong> called <code>host</code> that exports our function interface:</p><pre><code class="language-WIT">package superagent:functions;

interface function {
   // .. all the code  
}

world host {
  export function;
}</code></pre><p>And voila, just like that our WebAssembly component is defined and ready to be implemented. To implement it, we&apos;ll be using the most <s>annoying</s> beloved language in the world, Rust.<br><br>First, we&apos;ll be creating a sample component that will implement the interface and that we will use to test our code. While we can do that with <code>bindgen</code> and <code>wasm-tools</code>, we&apos;ll be doing it &quot;the right way&quot; - by using <code>cargo component</code> to take out the busywork, so go ahead and do a: </p><pre><code class="language-bash">cargo install cargo-component
cargo component new --lib sample</code></pre><p>This will create a new component project in the <code>sample</code> directory.<br>Now go ahead and create a <code>wit</code> folder, then <a href="https://man7.org/linux/man-pages/man1/ln.1.html?ref=blog.entropy.observer"><em>symlink</em></a> the wit file inside (so we don&apos;t need two of them):</p><pre><code class="language-bash">cd sample &amp;&amp; mkdir wit &amp;&amp; cd wit
ln -s ../../wit/function.wit function.wit</code></pre><p>Now, a small diversion:</p><p>Since component model is written for <em>WASI preview 2</em>, and the Rust compiler only supports <em>WASI preview 1</em>, we need to adapt the compiled code by using an adapter - we can do this automatically with <code>cargo component</code>, but first we need to download <code>wasi_snapshot_preview1.reactor.wasm</code> adapter <a href="https://github.com/bytecodealliance/wasmtime/releases?ref=blog.entropy.observer">from the release page</a>. <br>Then, we can open up <code>Cargo.toml</code> and add this:</p><pre><code class="language-toml">[package.metadata.component]
adapter = &quot;wasi_snapshot_preview1.reactor.wasm&quot;</code></pre><p>This way, code can be written to target WASI preview 2 but it&apos;s adapted so it can be used in places that support WASI preview 1 only, giving us backwards compatibility with a lot of real-world WASI implementations. </p><p>Now that we got the basics set-up, we can run <code>cargo component build</code> inside our sample folder and we&apos;ll see it generate a <code>bindings.rs</code> file with all the WIT records and interfaces code created for us. Only thing left to do is implement it, and for this sample, we&apos;ll implement a simple one called <code>web_pinger</code> which will do a mock check if an endpoint is running:</p><pre><code class="language-rust">#[allow(warnings)]
mod bindings;

// Generated contracts
use bindings::{
    exports::superagent::functions::function,
    exports::superagent::functions::function::Guest,
};

//Generated records
use crate::bindings::exports::superagent::functions::function::{ExecutionError, Metadata};

//Create the component which will implement our WIT interfaces
struct Component;

//Implement the &quot;guest&quot; code, aka your webassembly component

impl Guest for Component {
    //Mock function invocation
    fn invoke(_input: String) -&gt; Result&lt;String, ExecutionError&gt; {
        Ok(&quot;{ up: true}&quot;.to_string())
    }

    //Get function metadata - the name must match the &quot;meta&quot; function from our WIT file
    fn meta() -&gt; Metadata {
        Metadata {
            name: &quot;web_pinger&quot;.to_string(),
            description: &quot;Checks if a website is up by pinging it&quot;
                          .to_string(),
            arguments: &quot;{ \&quot;endpoint\&quot;: String }&quot;.to_string(),
        }
    }
}
</code></pre><p>Now, if we run <code>cargo component build</code>, our component should be built successfully, and we&apos;ll end up with a WASM module file at <code>./target/wasm32-wasi/debug/sample.wasm</code>.</p><p>Congratulations, your first WASM component has been built &#x1F389;</p><h2 id="assembling-the-rube-goldberg-von-neumann-machine">Assembling the Rube Goldberg-von Neumann machine</h2><p><br>Now that we have set up the <em>guest </em>code, it&apos;s time for the <em>host </em>code. Go through your favorite ritual of setting up a Cargo project in your project folder and let&apos;s add some basics - first off, you&apos;ll need your WASM runtime of choice.</p><p>Now choosing one is a daunting task in itself - I&apos;d recommend <a href="https://github.com/bytecodealliance/wasmtime?ref=blog.entropy.observer">wasmtime</a> by the Bytecode Alliance, and it&apos;s what I&apos;ll be using here, so let&apos;s add that to our cargo file, together with WASI support.</p><pre><code class="language-toml">[dependencies]
wasmtime = { version = &quot;18.0.1&quot;, features = [&quot;component-model&quot;] }
wasmtime-wasi = { version = &quot;18.0.1&quot;, default-features = true }</code></pre><p>Next, we&apos;ll also add two more libraries - the <code>wit-component</code> library for handling Components and a <code>wit-bindgen-rust</code> that will take our WIT interface and generate the glue code in the background.</p><pre><code class="language-toml">wit-component = &quot;0.201.0&quot;
wit-bindgen-rust = &quot;0.20.0&quot;
lazy_static = &quot;1.4&quot;</code></pre><p>For now, we&apos;ll set the LLM part aside and focus on WASM, since the LLM part is quite simple - create an <code>Agent</code> contract, implement it for your model/provider of choice, prompt tune a bit. But don&apos;t worry - we&apos;ll get to that later.<br><br>Let&apos;s figure out the bare minimum to run a WASM component:</p><ul><li>Take a compiled <code>wasm</code> binary</li><li>Instantiate a <code>Component</code> out of it</li><li>Start a WASM engine</li><li>Run the component <em>inside the engine</em> with provided arguments</li><li>Return the result</li></ul><p>So open up your <code>main.rs</code> and start writing. First, we&apos;ll be lazy and just create a lazy static instance of a WASM engine (via the <code>lazy_static</code> crate we just added), then define our necessary functions:</p><pre><code class="language-rust">use wasmtime::{Config, Engine, WasmBacktraceDetails};

lazy_static::lazy_static! {
    static ref ENGINE: Engine = {
        let mut config = Config::new();
        
        // For easier debugging
        config.wasm_backtrace_details(WasmBacktraceDetails::Enable);
        
        // Enables component model
        config.wasm_component_model(true);

        let engine = Engine::new(&amp;config).unwrap();
        engine
    };
}

fn run_function(arguments: String,
                component_binary: &amp;[u8]) -&gt; String {
    let component = build_wasm_component(component_binary);
    let mut store = Store::new(&amp;ENGINE, WasmState::new());
    let mut instance = create_instance(&amp;mut store, component);
    execute_function(store, &amp;mut instance, &quot;invoke&quot;, &amp;arguments)
        .expect(&quot;Function execution failed&quot;)
}</code></pre><p>To build a WASM component from binary, we&apos;ll load the <code>.wasm</code> file into a byte array, <em>adapt it to preview2,</em> then use <code>Component::from_binary</code> to create it. Don&apos;t worry, it&apos;s quite simple and the <code>wit-component</code> crate is here to support it - you just need to download the adapter files and load them in, so let&apos;s do that:</p><pre><code class="language-rust">//Include the WASI Preview 1 Adapter
const ADAPTER: &amp;[u8] = include_bytes!(concat!(
    env!(&quot;CARGO_MANIFEST_DIR&quot;),
    &quot;/wasm-sample/wasi_snapshot_preview1.reactor.wasm&quot;
));


fn adapt_wasm_output(wasm_bytes: &amp;[u8],
                    adapter_bytes: &amp;[u8]) -&gt; Result&lt;Vec&lt;u8&gt;, Error&gt; {
    let component = ComponentEncoder::default()
        .module(&amp;wasm_bytes)
        .expect(&quot;Cannot encode module&quot;)
        .validate(true)
        .adapter(&quot;wasi_snapshot_preview1&quot;, &amp;adapter_bytes)
        .expect(&quot;Cannot encode adapter&quot;)
        .encode()
        .expect(&quot;Cannot encode components&quot;);

    Ok(component.to_vec())
}
</code></pre><p>To actually be able to use the classes defined in your <em>WIT </em>file from the <em>host</em> code,<br>we need to use a bindgen macro to generate them from WIT file. This will help us generate glue code that binds our WASM functions with our Rust ones using the defined contract. So add this to the beginning of your file and point it at the wit:</p><pre><code class="language-rust">bindgen!({
    path : &quot;wit/function.wit&quot;,
    world: &quot;host&quot;,
});</code></pre><p>Now, we can create our component using <code>wasmtime</code> , so let&apos;s implement that <code>build_wasm_component</code> method:</p><pre><code class="language-rust">fn build_wasm_component(bytes: &amp;[u8]) -&gt; Component {
 let component = adapt_wasm_output(bytes, ADAPTER).unwrap();
 
 Component::from_binary(&amp;ENGINE, &amp;component)
     .expect(&quot;Cannot create component&quot;)
}
</code></pre><p>After creating a component, we&apos;re ready to move on to the next step - creating an actual instance of it. To create an instance, we need a place for it to actually live, something where it can store it&apos;s variables, functions, memory et al. That <em>something </em>is called a <code>Store</code>. So let&apos;s create a <code>Store</code> and a <code>WasmState</code> which we will store inside. To create a <code>WasmState</code> , let&apos;s open <code>wasm_state.rs</code> .</p><p>Inside, we&apos;ll create a basic struct containing two things:</p><ul><li><code>WasiCtx</code> to provide a basic WASI implementation</li><li>a <code>ResourceTable</code> to access resources by reference</li></ul><pre><code class="language-rust">extern crate wasmtime;

use wasmtime::component::{ResourceTable};
use wasmtime_wasi::preview2::{WasiCtx, WasiCtxBuilder, WasiView};

pub(crate) struct WasmState {
    ctx: WasiCtx,
    table: ResourceTable,
}

impl WasmState {
    pub(crate) fn new() -&gt; Self {
        let ctx = WasiCtxBuilder::new().build();
        let table = ResourceTable::new();
        Self { ctx, table }
    }
}
</code></pre><p>To access the table and the context, we&apos;ll also need to implement the <code>WasiView</code> trait:</p><pre><code class="language-rust">impl WasiView for WasmState {
    fn table(&amp;mut self) -&gt; &amp;mut ResourceTable {
        &amp;mut self.table
    }

    fn ctx(&amp;mut self) -&gt; &amp;mut WasiCtx {
        &amp;mut self.ctx
    }
}
</code></pre><p>Phew, that was a lot of stuff - but finally we&apos;re ready to create the component, so let&apos;s go into the <code>create_instance</code> function. We&apos;ll first create a <code>Linker</code>, which links together <em>host</em> functions and instances. Then we&apos;ll link our <code>WasmState</code> into it and create the instance by passing in the <code>Store</code> it will be using and the <code>Component</code> we are instantiating to the linker, which should bind it all together and give us the living instance of our<code>Component</code>.</p><pre><code class="language-rust">fn create_instance(store: &amp;mut Store&lt;WasmState&gt;, 
                  component: Component) -&gt; Instance {
    let mut linker = Linker::new(&amp;ENGINE);
    preview2::command::sync::add_to_linker::&lt;WasmState&gt;(&amp;mut linker)
        .expect(&quot;Cannot add to linker&quot;);
    linker.instantiate(store, &amp;component)
        .expect(&quot;Cannot instantiate component&quot;)
}
</code></pre><p>Having the instance of our WASM program, only thing left to do is run the function itself, so let&apos;s create that <code>execute_function</code> method. To do that, we need to get the exported interface from our component instance and find the function we need. Then, we can invoke it with the provided arguments and receive a classic Rust <code>Result</code>.</p><pre><code class="language-rust">fn execute_function(mut store: Store&lt;WasmState&gt;, instance: &amp;mut Instance,
                    name: &amp;str, args: &amp;str) -&gt; Result&lt;String, ExecutionError&gt; {
                    
    let mut exports = instance.exports(&amp;mut store);
    let mut interface = exports
        .instance(&quot;superagent:functions/function&quot;)
        .expect(&quot;Cannot find interface&quot;);
        
    //Get the function by name
    let func = interface
        .typed_func::&lt;(String,),(Result&lt;String, ExecutionError&gt;,)&gt;(name)
        .expect(&quot;Cannot find action&quot;);
    drop(exports);
    
    //Call the function
    let res = func.call(&amp;mut store, (args.to_string(), ))
        .expect(&quot;Function execution failed&quot;).0;
        
    //Remove the return from WASM memory
    func.post_return(&amp;mut store)
        .expect(&quot;Cannot post return to store&quot;);
    res
}
</code></pre><p>And that&apos;s it - our WASM runner is ready. To test it, we can use the module we&apos;ve built before - add it/symlink it to your project root and load it in:</p><pre><code class="language-rust">fn main() -&gt; Result&lt;(), Error&gt; {
    // The sample component we built earlier - here assumed to be
    // symlinked into the project root as sample.wasm
    const GUEST_RS_WASM_MODULE: &amp;[u8] = include_bytes!(&quot;../sample.wasm&quot;);
    let component = build_wasm_component(GUEST_RS_WASM_MODULE);
    let mut store = Store::new(
        &amp;ENGINE,
        WasmState::new(),
    );
    let mut instance = create_instance(&amp;mut store, component);
    let res = execute_function(store, &amp;mut instance, &quot;invoke&quot;, 
    &quot;{\&quot;endpoint\&quot;:\&quot;google.com\&quot;}&quot;);
    match res {
        Ok(result) =&gt; {
            println!(&quot;Result: {}&quot;, result);
            Ok(())
        }
        Err(e) =&gt; {
            panic!(&quot;{}&quot;, e.reason)
        }
    }
}</code></pre><p>Now, if you hit <code>cargo run</code>, you should see the result being output:<br><code>Result: { up: true}</code></p><p>Congratulations - you have created your first WASM function and runner!<br><br>Now that the baseline is done, we can continue with the juicy bit - making the AI build its own functions - but let&apos;s leave that for the next part of this blog post, it&apos;s getting too long and my coffee is getting too cold.<br></p><p><em>(Note: This post has been sitting on my shelf for the last few months, so some stuff might be out of date - don&apos;t worry, the declared versions here still work and the standards are still the same)<br><br></em></p>]]></content:encoded></item><item><title><![CDATA[Sorting 400+ tabs in 60 seconds with JS, Rust & GPT3: Part 2 - Macros & Recursion]]></title><description><![CDATA[<p></p><p>So, considering the response on the last part, I have a feeling I should add a preface to explain a few things about this post/series:<br><br>- It&apos;s not about &quot;sorting&quot; as in the algorithmic function of sorting - fun idea tho.<br>- It&apos;s</p>]]></description><link>https://blog.entropy.observer/sorting-400-tabs-in-under-60-seconds-with-js-rust-gpt3-part-2/</link><guid isPermaLink="false">65422724da89d344959db4f7</guid><dc:creator><![CDATA[Ian Rumac]]></dc:creator><pubDate>Tue, 07 Mar 2023 16:59:15 GMT</pubDate><media:content url="https://blog.entropy.observer/content/images/2023/03/000068.9a4f1d3a.2222094085--1-.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.entropy.observer/content/images/2023/03/000068.9a4f1d3a.2222094085--1-.png" alt="Sorting 400+ tabs in 60 seconds with JS, Rust &amp; GPT3: Part 2 - Macros &amp; Recursion"><p></p><p>So, considering the response on the last part, I have a feeling I should add a preface to explain a few things about this post/series:<br><br>- It&apos;s not about &quot;sorting&quot; as in the algorithmic function of sorting - fun idea tho.<br>- It&apos;s
not about GPT writing/optimising sorting functions - also a fun idea.<br>- It&apos;s not really in 60 seconds, the title is a jab at the common <a href="https://highlowblog.com/was-gone-in-60-seconds-the-start-of-the-2000s/?ref=blog.entropy.observer">&quot;in 60 seconds&quot;</a> trope<br>- It&apos;s mostly just a tagalong journal of an adventure where I try to solve my problem via over-engineering, with all the fun quirks, complexities, deep dives and scope creeps that come with building software.<br><br>So now that we got that out of the way, let&apos;s focus on what really matters:<br>Abusing GPT3 for fun and no profit.<br><br>In the last post, we built the user-facing part of our chrome extension that will help us fix our tab hoarding habit. We wrote some HTML, scribbled some JS and researched mysteries of the Chrome API. This time, we&apos;ll dive into the hottest language on the block right now - Rust. <br><br>I don&apos;t think I need to explain what Rust is. Even if you&apos;ve lived under a rock you&apos;ve probably heard of Rust - the development community is praising it into high heavens - it has the speed of C, the safety of Java and the borrowing system with the helicopter parenting skills of an AH-64 Apache attack helicopter.<br><br>But - the syntax is neat, the performance is awesome, macros are cool and even tho it&apos;s mostly strict about memory, it still gives you access to raw pointers and let&apos;s you go !unsafe .<br><br>So, to get a feel of the language, let&apos;s try and have some fun with it.<br><br>We&apos;ll build a simple service that will take our tab collection, simplify it a bit, talk to the OpenAI&apos;s API and - hopefully without hallucinations - parse the response into something our extension can use. 
On the way we&apos;ll encounter some obstacles, from having too many tabs and wasting money to Silicon Valley waking up and hammering the OpenAI API into oblivion.</p><!--kg-card-begin: markdown--><p>Our service will be quite simple - we&apos;ll expose one endpoint, <code>/sort</code>, to which we will <code>POST</code> our tabs and existing categories. To build it, we&apos;ll be using the <a href="https://github.com/tokio-rs/axum?ref=blog.entropy.observer">Axum framework</a>, allowing us to easily start up a server with a <code>/sort</code> endpoint. And to deploy it we&apos;ll use <a href="https://shuttle.rs/?ref=blog.entropy.observer">shuttle</a> so we can easily spin up a Rust server without swimming through the sea of AWS configs, writing Procfiles or building docker images.</p>
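<p>Since every tab we send costs tokens (and therefore money), it pays to slim the payload down before it ever reaches the API. Here is a minimal sketch of that kind of simplification step in Rust - note that the <code>Tab</code> struct and the <code>simplify</code> function are hypothetical illustrations of the idea, not the extension&apos;s actual payload format:</p><pre><code class="language-rust">// Hypothetical shape of a tab, as collected by the extension
struct Tab {
    id: u32,
    title: String,
    url: String,
}

// Reduce each tab to an &quot;id | title | host&quot; line - fewer tokens per tab
// means cheaper and faster calls to the API
fn simplify(tabs: &amp;[Tab]) -&gt; Vec&lt;String&gt; {
    tabs.iter()
        .map(|tab| {
            // keep only the host part of the URL
            let host = tab.url
                .trim_start_matches(&quot;https://&quot;)
                .trim_start_matches(&quot;http://&quot;)
                .split(&apos;/&apos;)
                .next()
                .unwrap_or(&quot;&quot;);
            format!(&quot;{} | {} | {}&quot;, tab.id, tab.title, host)
        })
        .collect()
}</code></pre>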
<p>We&apos;ll even use it to scaffold our project, so let&apos;s start by installing it.<br>
First, we&apos;ll need cargo, the rust package manager - if you don&apos;t have it installed, follow the steps <a href="https://doc.rust-lang.org/cargo/getting-started/installation.html?ref=blog.entropy.observer">here</a>. Second, we&apos;ll need a <a href="https://shuttle.rs/?ref=blog.entropy.observer">shuttle</a> account - don&apos;t worry, you can just 1-click signup with Github there - no need to fill out forms.</p>
<p>Now, open up the ol&apos; terminal and hit <code>cargo install cargo-shuttle &amp;&amp; cargo shuttle login</code> followed up with <code>cargo shuttle init</code> after you&apos;ve authenticated.</p>
<p>Follow the instructions to set the project name and location, and in the menu choose <code>axum</code> as your framework. This will scaffold a new axum project as a library, with shuttle as a dependency.</p>
<p>Our folder should now look like this.</p>
<!--kg-card-end: markdown--><pre><code>&#x251C;&#x2500;&#x2500; Cargo.lock
&#x251C;&#x2500;&#x2500; Cargo.toml
&#x2514;&#x2500;&#x2500; src
    &#x2514;&#x2500;&#x2500; lib.rs</code></pre><!--kg-card-begin: markdown--><p>It&apos;s quite a simple structure - we have <code>Cargo.toml</code>, which is the Rust version of <code>manifest.json</code> or <code>package.json</code>. It contains metadata about your package, its dependencies, compilation features and more. The <code>Cargo.lock</code> is just a pinned list of the exact dependency versions resolved, ensuring consistent builds across environments.</p>
<p>Our main server code will reside inside <code>src/lib.rs</code>. Let&apos;s look at it while it&apos;s still fresh and beautiful:</p>
<!--kg-card-end: markdown--><pre><code class="language-rust">use axum::{routing::get, Router};
use sync_wrapper::SyncWrapper;

async fn hello_world() -&gt; &amp;&apos;static str {
    &quot;Hello, world!&quot;
}

#[shuttle_service::main]
async fn axum() -&gt; shuttle_service::ShuttleAxum {
    let router = Router::new().route(&quot;/hello&quot;, get(hello_world));
    let sync_wrapper = SyncWrapper::new(router);

    Ok(sync_wrapper)
}</code></pre><!--kg-card-begin: markdown--><p>A few things of note here:</p>
<ul>
<li>
<p>No <code>main</code> method - since this project is marked as a [lib]rary, there is no predefined entry point needed.</p>
</li>
<li>
<p>The <code>router</code> - the &quot;entry point&quot; for your Axum service. Requests are routed through here and the code is pretty self-explanatory - you pair up the route to the function handling it, i.e., our <code>supercoolservice.com/hello</code> would return a simple &quot;Hello, world!&quot; text.</p>
</li>
<li>
<p>The <code>SyncWrapper</code> - wraps our router object, ensuring it&apos;s safe to access across different threads.</p>
</li>
<li>
<p><code>#[shuttle_service::main]</code> - this is a Rust macro - think of it as a more powerful version of annotations if you know what those are. It lets you write code that writes code - but that&apos;s the lazy explanation. Uuh.. I think we need a quick diversion here.</p>
</li>
</ul>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><h3 id="a-quick-diversion-into-the-magical-realm-of-macros">A quick diversion into the magical realm of macros</h3>
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.entropy.observer/content/images/2023/03/000081.d7634f13.4272243829.png" class="kg-image" alt="Sorting 400+ tabs in 60 seconds with JS, Rust &amp; GPT3: Part 2 - Macros &amp; Recursion" loading="lazy" width="960" height="256" srcset="https://blog.entropy.observer/content/images/size/w600/2023/03/000081.d7634f13.4272243829.png 600w, https://blog.entropy.observer/content/images/2023/03/000081.d7634f13.4272243829.png 960w" sizes="(min-width: 720px) 720px"><figcaption>Magical realm of macros, Hieronymus Bosch - 2023, the artist is a machine. First one is a conference talk, the second is pair programming, the third is the rabbit hole you find yourself in after getting into macros.</figcaption></figure><p>Now, before we get into the macros, I gotta preface this with a warning: this is not a 100% explanation of macros and how they work in [insert your favorite language here]. For that, there exist hundreds of books, guides and articles.<br><br>But for those random readers who stumbled here and don&apos;t want to read <em>&quot;monad is a monoid in the category of endofunctors&quot;</em> style articles explaining macros, we&apos;ll have a quick dip into the beautiful rabbit-hole of macros.</p><!--kg-card-begin: markdown--><p>So let&apos;s imagine we&apos;re working in an imaginary language called Bust.</p>
<p>Bust is this new cool language the whole of Twitter is buzzing about and they say it&apos;s going to be the language of the metaverse AI web4 apps. But, since it&apos;s a new language, it&apos;s still early and there aren&apos;t many libraries - for example, there are no JSON serialisation libraries yet, so you gotta write all of the code for that manually. So every time you create a struct you have to also write a bunch of serialisation code for it too. Like:</p>
<!--kg-card-end: markdown--><pre><code>struct ReallyBigModel {
   id: String,
   name: String,
   isReal: Bool,
   ...
   stuff: AnotherBigModel
}


impl ToJson for ReallyBigModel {
    fn toJson() -&gt; String {
        return mapOf { &quot;id&quot; to id,
              &quot;name&quot; to name,
              &quot;isReal&quot; to isReal,
              ...,
              &quot;stuff&quot; to stuff.toJson()
            }.toJson()
    }
}</code></pre><!--kg-card-begin: markdown--><p>Annoying, isn&apos;t it?<br>
Nobody wants to write this much boilerplate every day.</p>
<p>But one day, you read in the latest changelog it now supports this new thing called macros. There are many types of macros, but in Bust macros are these special methods you can define that consist of two things:</p>
<ul>
<li>The macro attribute</li>
<li>The macro function</li>
</ul>
<p>The <code>attribute</code> is like a mark you can put on other code.<br>
Imagine it being a big red X over classes or methods. So when your compiler is doing its compiling, if it stumbles upon a function with a big red X over its head, it knows it should call your macro function.</p>
<p>The <code>macro function</code> receives the code that is marked with the <code>attribute</code>, decides what to do with it and then returns new code to the compiler which it then integrates back where the marked function was.</p>
<p>So if in our example, we made a <code>toJson</code> macro, we could add the <code>toJson</code> attribute above any struct and it would write that code for us, so the above code would turn into:</p>
<!--kg-card-end: markdown--><pre><code>#[toJson]
struct ReallyBigModel {
   id: String,
   name: String,
   isReal: Bool,
   ...
   stuff: AnotherBigModel
}</code></pre><p>And what would our macro look like?<br><br>It would be a function that takes in the code marked with it (represented as tokens) and returns a new code that will replace it.</p><figure class="kg-card kg-code-card"><pre><code class="language-Bust">#[toJson]
fn addToJsonTrait(input: TokenStream) -&gt; TokenStream {

    let tree = parseIntoAST(input)
    let nodes = tree.data.asStruct()
    let name = tree.identity
    
    // Get all the children that are properties
    // Map them into format: $name to name
    let properties = nodes
    			.filter((child)=&gt;child.isProperty)
    			.map((property) =&gt;
     			&quot;\&quot;${property.name}\&quot; to ${property.name}&quot;)
        		.joinToString(&quot;,\n&quot;)
                       

    // Write the toJson trait body
    let body  =  quote! { //this is also a kind of macro!
                    impl ToJson for #name {
                        fn toJson() -&gt; String {
                                return mapOf {
                                    #properties
                                     }.toJson();
                                   }
                               }
                 	}
					
    return body.intoTree().intoStream()

}</code></pre><figcaption>Note: This is Bust, an imaginary language. Every language has its own macro implementation and this is just a simplified representation of one so the article doesn&apos;t get excessively long.</figcaption></figure><!--kg-card-begin: markdown--><p>So now, when our compiler arrives at a struct marked with <code>#[toJson]</code>, it will call the <code>addToJsonTrait</code> method, pass it the code for the struct and wait until it returns the new code before it continues compiling.</p>
<p>And just like that, we saved a ton of time by using a macro function and can now be the productive Bust developer we always wanted to be!</p>
<p>Now, don&apos;t get too excited - this is just an imaginary implementation.<br>
There is a lot to know about macros and I&apos;d suggest you get deep into the rabbit hole - <a href="https://doc.rust-lang.org/book/ch19-06-macros.html?ref=blog.entropy.observer">Rust itself</a> has a few different types of <a href="https://veykril.github.io/tlborm/?ref=blog.entropy.observer">macros</a>, it&apos;s one of the <a href="http://www.paulgraham.com/avg.html?ref=blog.entropy.observer">reasons</a> people love <a href="https://lispcookbook.github.io/cl-cookbook/macros.html?ref=blog.entropy.observer">Lisp</a> so much, there are <a href="http://www.phyast.pitt.edu/~micheles/scheme/scheme29.html?ref=blog.entropy.observer">hygienic and non-hygienic macros</a>, different types of expansions, and a lot more magic hiding away in the deep.</p>
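<p>For a taste of the real thing, here&apos;s a minimal <em>declarative</em> Rust macro (the <code>macro_rules!</code> flavor) doing a toy version of the same &quot;write the JSON pairs for me&quot; idea - note this is my own illustrative sketch, the macro name isn&apos;t from any crate:</p>

```rust
// A declarative macro: pattern on the left, generated code on the right.
macro_rules! to_json_pairs {
    // match one or more `name: value` pairs, with an optional trailing comma
    ($($field:ident : $value:expr),+ $(,)?) => {
        {
            let pairs: Vec<String> = vec![
                $(format!("\"{}\": {:?}", stringify!($field), $value)),+
            ];
            format!("{{ {} }}", pairs.join(", "))
        }
    };
}

fn main() {
    // the macro expands at compile time into the vec![..] + format!(..) above
    let json = to_json_pairs!(id: 1, name: "tabs");
    println!("{}", json); // { "id": 1, "name": "tabs" }
}
```

<p>It&apos;s not a procedural macro - it can&apos;t inspect a whole struct like our Bust example did - but it shows the same core trick: code that generates code at compile time.</p>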
<p>So now that we got that out of the way, let&apos;s get back into building our API.</p>
<!--kg-card-end: markdown--><h2 id="the-post-office">The POST office</h2><!--kg-card-begin: markdown--><p>We&apos;ll hide the simple magic of our service behind the <code>/sort</code> POST method, so delete that hello world and replace the router with one handling the <code>/sort</code> request - <code>Router::new().route(&quot;/sort&quot;, post(sort_items))</code> - and add a <code>sort_items</code> method to handle it:</p>
<pre><code class="language-rust">async fn sort_items(Json(payload): Json&lt;SortRequestPayload&gt;)
                                       -&gt; impl IntoResponse {
 (StatusCode::OK, Json(&quot;ok&quot;)).into_response()

}</code></pre>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><p>The method will receive a Json wrapper of our request structure and will return an implementation of the <code>IntoResponse</code> trait. Specifically, we&apos;ll be returning the tuple <code>(StatusCode, T)</code>, which the server knows how to transform into an appropriate response.</p>
<p>One more thing we need to implement is our request data structure. Instead of having the structs lay around in the same file, let&apos;s pop open a new file called <code>models.rs</code> in the <code>src</code> folder and create some basic definitions.</p>
<p>We&apos;ll need the <code>SortRequestPayload</code> wrapper we will receive. It should contain a list of categories and items, so we&apos;ll need structures for those too - <code>Category</code> and <code>Item</code>. For the response, we&apos;ll need categories <em>with</em> their belonging items - <code>CategoryWithItems</code> - plus a <code>Categories</code> wrapper around a list of them. Also, we&apos;ll add an <code>ErrorResponse</code> so we know where the problem is.</p>
<!--kg-card-end: markdown--><pre><code class="language-rust">//in models.rs

pub(crate) struct SortRequestPayload {
    pub(crate) categories: Vec&lt;Category&gt;,
    pub(crate) items: Vec&lt;Item&gt;,
}

pub(crate) struct Category {
    pub(crate) id: usize,
    pub(crate) title: String,
}

pub(crate) struct Item {
    pub(crate) id: usize,
    pub(crate) title: String,
}

pub(crate) struct CategoryWithItems {
    pub category_id: usize,
    pub category_name: String,
    pub items: Vec&lt;usize&gt;
}

pub(crate) struct Categories {
    pub categories: Vec&lt;CategoryWithItems&gt;
}

pub(crate) struct ErrorResponse {
    pub message: String,
}

</code></pre><!--kg-card-begin: markdown--><p>But, we got one problem - we need our structures to be easily (de)serialisable from/into JSON - for that, we will use a library called <a href="https://serde.rs/?ref=blog.entropy.observer">Serde</a> and its macros (similar to the macro we constructed before), so open up your <code>Cargo.toml</code> file and add <code>serde</code> and <code>serde_json</code> as dependencies:</p>
<pre><code class="language-toml">serde = { version = &quot;1.0&quot;, features = [&quot;derive&quot;] }
serde_json = &quot;1.0&quot;
</code></pre>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><p>Now, we can mark our structs with serde&apos;s <code>#[derive(Deserialize)]</code> and <code>#[derive(Serialize)]</code> macros so the framework knows how to (de)serialize the JSON into and out of our structs - and we&apos;ll derive <code>Clone</code> on the ones we&apos;ll need to copy around later.</p>
<!--kg-card-end: markdown--><pre><code class="language-rust">//in models.rs

#[derive(Deserialize)]
pub(crate) struct SortRequestPayload {
    pub(crate) categories: Vec&lt;Category&gt;,
    pub(crate) items: Vec&lt;Item&gt;,
}

#[derive(Deserialize)]
pub(crate) struct Category {
    pub(crate) id: usize,
    pub(crate) title: String,
}

#[derive(Deserialize, Serialize, Clone)]
pub(crate) struct Item {
    pub(crate) id: usize,
    pub(crate) title: String,
}

#[derive(Deserialize, Serialize, Clone)]
pub(crate) struct CategoryWithItems {
    pub category_id: usize,
    pub category_name: String,
    pub items: Vec&lt;usize&gt;
}

#[derive(Deserialize, Serialize)]
pub(crate) struct Categories {
    pub categories: Vec&lt;CategoryWithItems&gt;
}

#[derive(Serialize)]
pub(crate) struct ErrorResponse {
    pub message: String,
}</code></pre><!--kg-card-begin: markdown--><p>With this done, we can dive back into our code.<br>
Let&apos;s examine our plan:</p>
<pre><code>1. Get the items
2. Assign items to categories
3. Slice the prompt into chunks
4. A recursive sort:
    4.1. Take existing categories and a chunk, turn them into a prompt
    4.2. Ask OpenAI to sort it
    4.3. Deserialize the response
    4.4. Add to existing categories
    4.5. While chunks remain, back to 4.1
5. Return the result
</code></pre>
<p>And structure it into methods:</p>
<!--kg-card-end: markdown--><pre><code class="language-rust">//in lib.rs

...

fn create_chunks_for_prompting(items: Vec&lt;Item&gt;) -&gt; Vec&lt;Vec&lt;Item&gt;&gt;

fn sort_recursively(
	sorted_categories: Vec&lt;CategoryWithItems&gt;,
        remaining: Vec&lt;Vec&lt;Item&gt;&gt;) -&gt; Result&lt;Categories, Error&gt;

fn build_prompt(items: Vec&lt;Item&gt;,
		categories: Vec&lt;CategoryWithItems&gt;) -&gt; String

fn prompt_open_ai(prompt: String) -&gt; Result&lt;String, String&gt;</code></pre><!--kg-card-begin: markdown--><p>Also, we&apos;ll need our prompt, so let&apos;s try out something like this - we tell GPT3 it will receive a list of items, give it the format and embed the list. Then, we describe the valid JSON format to return and pass in the existing categories. In the end, we tell it to return them to us in the valid JSON format. Hopefully, it will adhere to the JSON format and not hallucinate it, but we&apos;ll fine-tune that in later posts. For now, it seems like specifying the valid JSON format close to the end of the prompt and mentioning a &quot;valid JSON format&quot; in the end keeps it grounded quite nicely.</p>
<pre><code>You will receive list of items with titles and id&apos;s in form of [title,id].
Based on titles and urls, classify them into categories, by using existing categories or making new ones.

Tabs are:
[$tabName, $tabId].

Valid JSON format to return is:
{ &quot;categories&quot;: [ { 
    &quot;category_id&quot;:&quot;id here&quot;,
    &quot;category_name&quot;: &quot;name here&quot;, 
    &quot;items&quot;:[tab_id here] } 
]}.

Existing categories are: 
$categories

A new more detailed list of categories (existing and new) with items, in valid JSON format is:
</code></pre>
<p>Sounds good to me!<br>
Let&apos;s divide it up into constants we can use inside our code.</p>
<pre><code class="language-rust">
const PROMPT_TEXT_START: &amp;str = &quot;You will receive list of items with titles and id&apos;s in form of [title,id].
Based on titles and urls, classify them into categories, by using existing categories or making new ones.&quot;;

const PROMPT_TEXT_MIDDLE: &amp;str = &quot;\nValid JSON format to return is:
{ \&quot;categories\&quot;: [ { \&quot;category_id\&quot;:\&quot;id here\&quot;, \&quot;category_name\&quot;: \&quot;name here\&quot;, \&quot;items\&quot;:[tab_id here] } ]}.
Existing categories are:&quot;;

const PROMPT_TEXT_ENDING: &amp;str = &quot;A new more detailed list of categories (existing and new) with items, in valid JSON format is:&quot;;

</code></pre>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><p>Finally, we can get into our <code>sort_items</code> method and start filling it all out. First, we&apos;ll take ownership of our data and split it into chunks:</p>
<pre><code class="language-rust">let items = payload.items;
let categories: Vec&lt;CategoryWithItems&gt; = payload.categories.iter().map(|it| {
    CategoryWithItems {
        category_id: it.id,
        category_name: it.title.to_owned(),
        items: Vec::new(),
    }
}).collect();

let prompt_slices = create_chunks_for_prompting(items);

</code></pre>
<p>Why chunks? Because if we just add all of the items to the prompt, our prompt size could be well over 4096 tokens - which is what the model we&apos;ll be using supports as the maximum length for prompt and completion. So we need to find a way to split it into a suitable size and have some buffer for the completion too - we&apos;ll leave a 50% buffer for it, leaving our prompt size at 2048.</p>
<p>To achieve that, our <code>create_chunks_for_prompting</code> function will need to do three things:</p>
<ul>
<li>Count the number of tokens in our base prompt</li>
<li>Count the number of tokens in the data we send to the API</li>
<li>Calculate the number of chunks we need by dividing the total token count by 2048 minus our hardcoded prompt size.</li>
</ul>
<p>According to the OpenAI documentation, we can guesstimate that a token is the size of about 4 characters. Now, there are a lot of different ways to count tokens, and to do it properly, we would have to do a bit more than just split the length by 4 - the best way would be to use the <a href="https://docs.rs/rust_tokenizers/latest/rust_tokenizers/?ref=blog.entropy.observer">Rust tokenizers</a> crate and their GPT2 tokenizer. But, since that leads us down another rabbit hole, we&apos;re gonna skip it <em>for now</em> and just gonna do a simple trick - the <code>split_whitespace</code> method which will give us an approximation of token length.</p>
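<p>To make the guesstimate concrete, here&apos;s a tiny std-only sketch comparing the two rough approaches (the function names are mine, purely illustrative - neither is a real tokenizer):</p>

```rust
// OpenAI's rule of thumb: ~4 characters per token (rounded up)
fn estimate_tokens_by_chars(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

// the even lazier approximation: count whitespace-separated words
fn estimate_tokens_by_words(text: &str) -> usize {
    text.split_whitespace().count()
}

fn main() {
    let prompt = "Based on titles and urls, classify them into categories";
    println!("chars/4: {}", estimate_tokens_by_chars(prompt));
    println!("words: {}", estimate_tokens_by_words(prompt));
}
```

<p>Both will undercount or overcount on real prompts, but they&apos;re close enough for sizing chunks.</p>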
<pre><code class="language-rust">fn create_chunks_for_prompting(items: Vec&lt;Item&gt;) -&gt; Vec&lt;Vec&lt;Item&gt;&gt; {

    //approximate the tokens in our data
    let json_size = serde_json::to_string(&amp;items).unwrap()
        .split_whitespace()
        .count();

    //get the size of our hardcoded prompt
    let hardcoded_prompt = format!(&quot;{a}{b}{c}&quot;,
                                   a = String::from(PROMPT_TEXT_START),
                                   b = String::from(PROMPT_TEXT_MIDDLE),
                                   c = String::from(PROMPT_TEXT_ENDING));

    let hardcoded_prompt_size = hardcoded_prompt
        .split_whitespace()
        .count();

    //find the number of chunks we should split the items into
    let chunks_to_make = json_size / (2048 - hardcoded_prompt_size);

    //split the vector up into N roughly equal chunks
    let chunks = items.chunks(items.len() /
                                  (if chunks_to_make &gt; 0 {
                                      chunks_to_make
                                  } else { 1 }));

    //return the list of chunks
    return chunks.map(|s| s.into()).collect();
}
</code></pre>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><p>Now, let&apos;s get down to our <code>build_prompt</code> function.<br>
To build our prompt, we need the list of items to be sorted and the existing categories. We&apos;ll take the list of items and <code>format!</code> it to a string in the form of <code>[title,id]</code>. Then, we&apos;ll turn the categories into JSON and use the <code>format!</code> macro to combine it all into a single prompt.</p>
<pre><code class="language-rust">fn build_prompt(items: Vec&lt;Item&gt;,
                categories: Vec&lt;CategoryWithItems&gt;) -&gt; String {

    //map items into [title,id] then join them all into a string
    let items_joined = items.iter().map(|item| format!(
                                        &quot;[{title},{id}]&quot;,
                                        title = item.title,
                                        id = item.id))
                                .collect::&lt;Vec&lt;String&gt;&gt;()
                                .join(&quot;,&quot;);

    let categories_json = serde_json::to_string(&amp;categories).unwrap();
    
    format!(&quot;{prompt}\n{tabs}{middle}{categories}\n{ending}&quot;,
            prompt = String::from(PROMPT_TEXT_START),
            tabs = items_joined,
            middle = String::from(PROMPT_TEXT_MIDDLE),
            categories = categories_json,
            ending = String::from(PROMPT_TEXT_ENDING))
}
</code></pre>
<p>Now, to actually send that prompt to OpenAI, we&apos;ll need an HTTP client.<br>
For that, we&apos;ll be using the <a href="https://docs.rs/reqwest/?ref=blog.entropy.observer">reqwest</a> crate - it provides us with a high-level HTTP client with simple async functions we can use to talk to the OpenAI API, and has a <code>json</code> feature which enables easy serialization/deserialization. So let&apos;s add it to our <code>Cargo.toml</code> file:</p>
<pre><code class="language-toml">[dependencies]
...
reqwest = { version = &quot;0.11&quot;, features = [&quot;json&quot;] }
</code></pre>
<p>Using this, we can build our HTTP client via the good ol&apos; builder pattern.</p>
<pre><code class="language-rust">let client = Client::builder()
    .http2_keep_alive_timeout(Duration::from_secs(120))
    .timeout(Duration::from_secs(120))
    .build()
    .unwrap();
</code></pre>
<p>But, if we built the client inside our <code>prompt_open_ai</code> function, we would be creating a Client instance for each request we make, so let&apos;s instead add the client code into our <code>sort_items</code> function and pass it down as an argument into the <code>sort_recursively</code> and <code>prompt_open_ai</code> functions. This way, we&apos;ll only use one instance of the HTTP client per <code>/sort</code> call, and our <code>prompt_open_ai</code> function can focus only on actually calling the API and giving us the result back.</p>
<p>So let&apos;s build a simple POST call and see how we can receive its <code>Result</code>.<br>
To keep things clean, we&apos;ll create a separate module inside our structure - modules are containers for your code (akin to packages), enabling you to create some separation between different areas of your code. Create a new folder called <code>openai</code> and two new files in it:</p>
<ul>
<li>a <code>mod.rs</code> for our code</li>
<li>a <code>models.rs</code> for our models</li>
</ul>
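<p>One bit of wiring to remember: Rust only compiles modules that are declared, so we&apos;ll also need a <code>mod openai;</code> line in <code>lib.rs</code> (and a <code>mod models;</code> inside <code>openai/mod.rs</code>). The same nesting, sketched inline in a single file with a stand-in struct:</p>

```rust
// inline version of the src/openai/models.rs layout -
// `mod openai;` in lib.rs plays the role of this outer block
mod openai {
    pub mod models {
        // stand-in struct, just to show the path resolution
        pub struct AskGPT {
            pub prompt: String,
        }
    }
}

fn main() {
    // the full path mirrors the file layout
    let req = openai::models::AskGPT { prompt: String::from("hi") };
    println!("{}", req.prompt);
}
```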
<p>Open up the <code>models.rs</code> and add the structs we need to communicate with our OpenAI Completion API:</p>
<pre><code class="language-rust">
use serde::{Deserialize, Serialize};

#[derive(Serialize)]
pub(crate) struct AskGPT {
    pub prompt: String,
    pub model: String,
    pub max_tokens: usize,
    pub stream: bool,
    pub temperature: usize,
    pub top_p: usize,
    pub n: usize,
}

#[derive(Deserialize)]
pub(crate) struct Completion {
    pub model: String,
    pub choices: Vec&lt;Choices&gt;,
}

#[derive(Deserialize)]
pub(crate) struct Choices {
    pub text: String,
    pub index: usize,
}
</code></pre>
<p>And in the <code>mod.rs</code> we can build our <code>prompt_open_ai</code> method, with the POST request which will send our newly created <code>AskGPT</code> model to their <code>/completions</code> endpoint.</p>
<p>Now, there are a few important fields here - the self-explanatory <code>prompt</code> field, the <code>model</code> which lets us choose which model will do the completion (at the time of writing, <code>text-davinci-003</code> is the best performing one for this task), the <code>max_tokens</code> which caps the length of the completion - since prompt and completion together can&apos;t exceed the model&apos;s 4096 token limit, we&apos;ll set it to 2048, the half we budgeted for the completion earlier - the <code>n</code> which controls the number of responses, <code>temperature</code> which is a way to tell it which probabilities to consider - the higher it is, the more random the completion might seem - we&apos;ll use 0, so our output is less random - and <code>top_p</code>, which we&apos;ll leave at 1.</p>
<p>Note: For this part, you&apos;ll need your <a href="https://platform.openai.com/account/api-keys?ref=blog.entropy.observer">OpenAI API key, which you can find here</a>.</p>
<pre><code class="language-rust">async fn prompt_open_ai(prompt_txt: String,
                        client: &amp;Client) -&gt; Result&lt;String, String&gt; {
    let token = String::from(&quot;YOUR_API_KEY_HERE&quot;);
    let auth_header = format!(&quot;Bearer {}&quot;, token);


    let req = client.post(&quot;https://api.openai.com/v1/completions&quot;)
        .header(&quot;Authorization&quot;, auth_header)
        .json(&amp;AskGPT {
            prompt: prompt_txt,
            model: String::from(&quot;text-davinci-003&quot;),
            max_tokens: 2048,
            n: 1,
            stream: false,
            temperature: 0,
            top_p: 1,
        }).send().await;

}
</code></pre>
<p>Finally, a <code>Result</code>!<br>
But what do we do with it?</p>
<p>Well, we can just add <code>?</code> to the end of the await, which would immediately give us the <code>Response</code>, but that&apos;s no fun, so we&apos;ll use one of my favorite Rust features - the famous <code>match</code>.<br>
<code>match</code> statements are at the core of the Rust developer experience, providing you with powerful pattern matching abilities that ensure all the paths your code takes are covered.</p>
<p>But Ian, what is so special about it?<br>
Isn&apos;t it just if/else on steroids?<br>
Oh no, it&apos;s way more than that. Unlike a set of <code>if/else</code> or <code>switch</code> statements, <code>match</code> forces you to cover every possibility, both the happy and the sad paths your code can take. Why is this so superpowered? Because it removes a whole class of bugs caused by unhandled cases. It&apos;s one of those rare tools that improves readability, prevents bugs and increases maintainability in a single swoop.</p>
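<p>Here&apos;s the shape of it on a tiny, self-contained example first (the <code>parse_number</code> helper is hypothetical, purely illustrative):</p>

```rust
fn parse_number(input: &str) -> String {
    // match forces us to handle both sides of the Result -
    // forgetting the Err arm is a compile error, not a runtime surprise
    match input.parse::<i32>() {
        Ok(n) => format!("got {}", n),
        Err(_) => String::from("not a number"),
    }
}

fn main() {
    println!("{}", parse_number("42"));   // got 42
    println!("{}", parse_number("oops")); // not a number
}
```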
<p>So let&apos;s try and use it - the syntax is simple, on the left hand side is the pattern you are matching against and on the right hand side is the codeblock to execute. First we&apos;ll check if the request actually happened by checking the <code>Result</code> we got.</p>
<pre><code class="language-rust">    match req {
        Ok(response) =&gt; {
          //request actually happened, we can access response safely
        }
        Err(error) =&gt; {
            //TODO handle error
        }
    }

</code></pre>
<p>Now in our Ok branch, we can access our response object safely, knowing we got the error case covered too and it isn&apos;t gonna cause a runtime crash.<br>
We can move on to check if the request has actually been successful by simply checking if the status code is 200 OK.</p>
<pre><code class="language-rust">match response.status() {
    StatusCode::OK =&gt; {
      // smashing success 
    }
    other =&gt; {
      // TODO handle error
    }
}
</code></pre>
<p>And finally, for the main step - if the request was a success, we should try and deserialize the body into our <code>Completion</code> struct. But since that can fail too, we should do a quick <code>match</code> here too and extract the response from our completion object:</p>
<pre><code class="language-rust">match response.json::&lt;Completion&gt;().await {
    Ok(parsed) =&gt; {
        //We know there is always at least 1 item in choices 
        //due to our request param n==1 so we&apos;ll just live wild and unwrap
        let choices = parsed.choices.first().unwrap();
        let json: &amp;str = choices.text.borrow();
        Ok(String::from(json))
    }
    Err(_) =&gt; Err(Parsing)
}

</code></pre>
<p>Now, to handle the errors - let&apos;s add an enum that will denote the different types of errors we can have (yes, I&apos;ll condense all possible errors to these three types. What could go wrong..) - the connection error, the server response error and the parsing error. Hop up to the <code>models.rs</code> and add it:</p>
<pre><code class="language-rust">#[derive(Debug)]
pub(crate) enum OpenAiError {
    Connection,
    Parsing,
    Server,
}
</code></pre>
<pre><code class="language-rust">match req {
    Ok(response) =&gt; {
        match response.status() {
            StatusCode::OK =&gt; {
                match response.json::&lt;Completion&gt;().await {
                    Ok(parsed) =&gt; {
                        //there is always at least 1 due to our request
                        let choices = parsed.choices.first().unwrap();
                        let json: &amp;str = choices.text.borrow();
                        Ok(String::from(json))
                    }
                    Err(_) =&gt; Err(Parsing)
                }
            }
            other =&gt; Err(Server)          
        }
    }
    Err(_) =&gt; Err(Connection)
}
</code></pre>
<p>Congratulations! We&apos;ve successfully made our request in a safe manner and covered all the sad and happy paths on the way.</p>
<p>So with our requests poppin&apos;, we can <em>finally</em> start working on our <code>sort_recursively</code> function. Why recursion here? Because we&apos;re basically reducing a list onto itself with GPT3 acting as our reducer function. While we could do a loop here and call this method n times, it would mean we would have to also mutate a variable outside of the loop (containing our categories). As that feels dirty, we&apos;ll do it the clean, functional way by using our good ol&apos; friend, the recursion.</p>
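<p>That reduce-with-recursion shape, as a std-only sketch - a plain merge stands in for the GPT3 round-trip and the names are mine (the real version is <code>async</code>, which complicates things a bit, but the skeleton is the same):</p>

```rust
// reduce a list of chunks onto an accumulator, recursively -
// in the real service, "fold the chunk in" is the GPT3 call
fn reduce_recursively(mut acc: Vec<String>, mut remaining: Vec<Vec<String>>) -> Vec<String> {
    if remaining.is_empty() {
        return acc;
    }
    // take the next chunk, fold it into the accumulator, recurse on the rest
    let chunk = remaining.remove(0);
    acc.extend(chunk);
    reduce_recursively(acc, remaining)
}

fn main() {
    let sorted = reduce_recursively(
        vec![String::from("news")],
        vec![vec![String::from("rust")], vec![String::from("ai")]],
    );
    println!("{:?}", sorted); // ["news", "rust", "ai"]
}
```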
<p>So let&apos;s open up our <code>lib.rs</code> and get into the <code>sort_recursively</code> function.</p>
<p>First, we&apos;ll build our prompt, then send it to <code>prompt_open_ai</code> and try to deserialize the response. If it succeeds, we join it with the existing categories and pass it again into <code>sort_recursively</code> with the remaining chunks, until we&apos;re out of chunks.</p>
<pre><code class="language-rust">async fn sort_recursively(
                        sorted_categories: Vec&lt;CategoryWithItems&gt;,
                        remaining: Vec&lt;Vec&lt;Item&gt;&gt;,
                        client: Client) -&gt; Result&lt;Categories, String&gt; {

    let mut next_categories = sorted_categories.to_owned();
    let prompt = build_prompt(remaining.first().unwrap().to_vec(),
                             sorted_categories);

    //ask OpenAI to sort the chunk
    let ai_response_result = prompt_open_ai(prompt, &amp;client).await;

    match ai_response_result {
        Ok(ai_response) =&gt; {
           //try to deserialize the response
           let parsed = serde_json::
                           from_str::&lt;Categories&gt;(ai_response.as_str());
           match parsed {
               Ok(wrapper) =&gt; {
                   let mut new_categories = wrapper
                               .categories.to_owned();
                   //remove the processed chunk
                   let mut next_slice = remaining.to_owned();
                   next_slice.remove(0);
                   //join the categories
                   next_categories.append(&amp;mut new_categories);
                   //if we&apos;re not done yet, recurse
                   if next_slice.len() != 0 {
                    let next = sort_recursively(next_categories,
                                                next_slice,
                                                client).await;
                    match next {
                        Ok(cats) =&gt; Ok(cats),
                        Err(_) =&gt; Err(String::from(&quot;Sort failed&quot;))
                    }
                   } else {
                       Ok(Categories { categories: next_categories })
                   }
               }
               Err(_) =&gt; Err(&quot;Parsing response error&quot;.to_string())
           }
        }
        Err(err) =&gt; Err(err)
    }
}

</code></pre>
<p>With all these matches, our code is starting to look pretty ugly. One way to avoid nested match hell is to use the <code>map</code>, <code>map_err</code> and <code>and_then</code> combinators - <code>map</code> and <code>and_then</code> operate on the <code>Ok</code> value of a <code>Result</code>, while <code>map_err</code> operates on the <code>Err</code> value, enabling us to avoid nesting hell by simply chaining them into a more readable, concise version. The data will pass only through the corresponding combinators, so we can safely map our data and errors to the proper format.</p>
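<p>As a standalone illustration (the function and names here are made up, not part of our service), here&apos;s how these combinators chain on a <code>Result</code>:</p><pre><code class="language-rust">// Illustrative only: chaining instead of nesting matches.
fn parse_and_double(input: Result&lt;&amp;str, String&gt;) -&gt; Result&lt;i32, String&gt; {
    input
        //runs only if input is an Err
        .map_err(|e| format!(&quot;transport error: {}&quot;, e))
        //runs only if input is an Ok, may itself produce an Err
        .and_then(|s| s.parse::&lt;i32&gt;()
            .map_err(|_| &quot;parse error&quot;.to_string()))
        //runs only if everything above succeeded
        .map(|n| n * 2)
}
</code></pre><p>An <code>Ok(&quot;21&quot;)</code> flows through <code>and_then</code> and <code>map</code> to become <code>Ok(42)</code>, while any error short-circuits the rest of the chain.</p>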
<p>We&apos;ll use them to reduce the first set of nested matches, and we&apos;ll leave the last one as a match. Why? Because <a href="https://github.com/rust-lang/rust/issues/62290?ref=blog.entropy.observer">async closures still aren&apos;t stable</a> in Rust, it seems. We&apos;ll map all the errors into an <code>Err(String)</code> format so we can return it properly:</p>
<pre><code class="language-rust">async fn sort_recursively(
                        sorted_categories: Vec&lt;CategoryWithItems&gt;,
                        remaining: Vec&lt;Vec&lt;Item&gt;&gt;,
                        client: Client) -&gt; Result&lt;Categories, String&gt; {

    let mut next_categories = sorted_categories.to_vec();
    let prompt = build_prompt(remaining.first().unwrap().to_vec(),
                              sorted_categories);

    let ai_response_result = prompt_open_ai(prompt, &amp;client).await;

    let res = ai_response_result
        .map_err(|e|
                format!(&quot;Error communicating with OpenAI - {:?}&quot;, e))
        .and_then(|ai_response|
            serde_json::from_str::&lt;Categories&gt;(ai_response.as_str())
                .map_err(|_| &quot;Parsing response error&quot;.to_string()));

    match res {
        Ok(wrapper) =&gt; {
            let mut new_categories = wrapper.categories.to_owned();
            //remove the processed chunk
            let mut next_slice = remaining.to_owned();
            next_slice.remove(0);
            //join the categories
            next_categories.append(&amp;mut new_categories);
            //if we&apos;re not done yet recurse
            if next_slice.len() != 0 {
                sort_recursively(next_categories, 
                                next_slice,
                                client).await
                    .map_err(|e| 
                        format!(&quot;Sorting failed, reason: {}&quot;, e))
            } else {
                Ok(Categories { categories: next_categories })
            }
        }
        Err(msg) =&gt; Err(msg)
    }
}

</code></pre>
<p>There it is - we called the API in a safe, error free-oh-wait.... it&apos;s not compiling.</p>
<p>Well, one thing we didn&apos;t think about is async recursion.<br>
Why is this such a problem?</p>
<p>Well, due to how async/await is implemented in Rust (and a lot of other languages), under the hood it generates a state machine type containing all the futures in the method. But once we add recursion, the generated type starts referencing itself - it blows up into a potentially infinitely recursive type, and the compiler cannot determine the size of the type. To stop it from blowing up, we&apos;ll need to fix the recursion to return a Box&apos;d Future, which gives us a pointer to the heap instead of the whole object, preventing infinite self-referencing under the hood.</p>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><p>I&apos;d recommend reading more about this problem <a href="https://rust-lang.github.io/async-book/07_workarounds/04_recursion.html?ref=blog.entropy.observer">here</a> and following this rabbit hole deeper and deeper - it covers a lot of language design questions and concepts which appear through many languages. But, for now, all we are going to do is use the <code>async_recursion</code> crate, so head on to your <code>Cargo.toml</code> and add it there:</p>
<pre><code class="language-toml">[dependencies]
..
async-recursion = &quot;1.0.2&quot;
</code></pre>
<p>And mark your function with the <code>#[async_recursion]</code> macro so the crate can Box the future for you.</p>
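<p>Roughly, here&apos;s what the macro does for us under the hood (a simplified sketch - the real expansion also deals with lifetimes and trait bounds):</p><pre><code class="language-rust">use std::future::Future;
use std::pin::Pin;

// Simplified sketch of the expansion - not the exact generated code
fn sort_recursively(
    sorted_categories: Vec&lt;CategoryWithItems&gt;,
    remaining: Vec&lt;Vec&lt;Item&gt;&gt;,
    client: Client,
) -&gt; Pin&lt;Box&lt;dyn Future&lt;Output = Result&lt;Categories, String&gt;&gt; + Send&gt;&gt; {
    Box::pin(async move {
        // ...the original function body...
    })
}
</code></pre><p>The boxed future has a known size (it&apos;s just a pointer), so the compiler stops complaining about the self-referencing type.</p>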
<p>With that out of the way, we can come back to our original <code>sort_items</code> method and finally respond to that API request. Last time we were there, we added the <code>Client</code> instance, so just head down below it and call the <code>sort_recursively</code> method. We&apos;ll use <code>map_err</code> to map the error into our <code>ErrorResponse</code> structure, wrap it in JSON and return it as a response, and <code>map</code> to turn our <code>Ok</code> result into a proper response:</p>
<pre><code class="language-rust">    sort_recursively(categories, prompt_slices, client).await
        .map_err(|e| 
            (StatusCode::INTERNAL_SERVER_ERROR, 
            Json(ErrorResponse { message: e })).into_response())
        .map(|wrapper| {
            let new_categories = wrapper.categories.iter().map(|item| {
                CategoryWithItems {
                    category_id: item.category_id.to_owned(),
                    category_name: item.category_name.to_owned(),
                    items: item.items.to_owned(),
                }
            }).collect::&lt;Vec&lt;CategoryWithItems&gt;&gt;();
            (StatusCode::OK, Json(Categories {
                categories: new_categories
            })).into_response()
        })

</code></pre>
<!--kg-card-end: markdown--><!--kg-card-begin: markdown--><p>And with this done, our service is now finished!</p>
<p>We take the response, format it, prompt it and give it back to the user. Our plan is safe and sound. All that&apos;s left to do is deploy it - but we don&apos;t have to think about provisioning instances, setting up security groups or writing dockerfiles. Since we scaffolded our service via Shuttle, we can easily deploy it with a simple touch of the terminal. Open up your project&apos;s folder in your shell of choice and type:</p>
<p><code>cargo shuttle deploy</code></p>
<p>Now stand up, take a few breaths, grab a sip of the coffee and before you even know it, your server is up and running at: <code>https://projectname.shuttleapp.rs/</code></p>
<p>Now, uh... why were we even doing this?</p>
<p>Oh yeah, we were writing a JS extension. With our server up, it&apos;s nearly finished - just pop over to the extension and replace the localhost endpoint with the real endpoint you just got from shuttle.</p>
<p>Now, load the extension into a small window just to test it. Hit the sort button, wait for a bit and - BAM! Your tabs should be magically sorted into proper groups! Finally!</p>
<p>Let&apos;s try it in our real window - the one with ..uhh its nearing 600 tabs now. So we&apos;ll just hit the sort button and - wait...</p>
<p>...wait..</p>
<p>.....wait a bit more....</p>
<p>...... waaaaait it&apos;s coming...</p>
<p>.... this is taking way longer than 60 seconds...</p>
<p>... oh wait...</p>
<p>.. error?</p>
<p>Ooops - we hit the token limit!<br>
Why? How? Didn&apos;t we do the whole chunking thing just so it fits?</p>
<p>Weeeeell, seems like we&apos;ll need to do a better calculation on prompt sizes.</p>
<p>Also, our recursion is causing problems - adding all previous categories to each prompt is causing it to blow up in size and it takes a really long time to actually finish the whole chain - way longer than 60 seconds.</p>
<p>And finally, the categories are quite... meh.</p>
<p>Which is great, since it gives us more stuff to do for the next iteration - we&apos;ll see how to eliminate this recursion, how to use <a href="https://docs.rs/rust_tokenizers/latest/rust_tokenizers/?ref=blog.entropy.observer">GPT tokenizer</a> and embed dictionary files into the binary and use <a href="https://docs.shuttle.rs/resources/shuttle-static-folder?ref=blog.entropy.observer">shuttle&apos;s static folder</a> service for it instead of blowing up our build times. We&apos;ll also take a stab at <a href="https://platform.openai.com/docs/guides/fine-tuning?ref=blog.entropy.observer">finetuning</a> the model, giving us better results for less tokens - and since we&apos;re lazy, we&apos;ll just be generating the training data using GPT itself.</p>
<p>If you&apos;ve come this far, thanks for reading and don&apos;t worry, we have many more feature creeps and potential problems to uncover on our path, so see you in the next episode of &quot;Human vs Machines&quot;.</p>
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.entropy.observer/content/images/2023/03/000077.c81d608e.1263736738.png" class="kg-image" alt="Sorting 400+ tabs in 60 seconds with JS, Rust &amp; GPT3: Part 2 - Macros &amp; Recursion" loading="lazy" width="960" height="512" srcset="https://blog.entropy.observer/content/images/size/w600/2023/03/000077.c81d608e.1263736738.png 600w, https://blog.entropy.observer/content/images/2023/03/000077.c81d608e.1263736738.png 960w" sizes="(min-width: 720px) 720px"><figcaption>Machines might still have some problem understanding the concept of cuteness. &quot;Cute rusty crab illustration&quot; - 2023, the artist is a machine.</figcaption></figure><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Sorting 400+ tabs in 60 seconds with JS, Rust & GPT-3: Part 1]]></title><description><![CDATA[<p></p><p>I&apos;m a serial tabbist. I admit it.<br><br>Currently, I have about 460 tabs open across 5 brave windows. Let&apos;s not even get started on the bookmarks.</p><blockquote><em>&quot;B-b-but, they&apos;re all necessary! So much knowledge! So many good links!&quot; </em><br>- My inner hoarder</blockquote><p>Yeah,</p>]]></description><link>https://blog.entropy.observer/sorting-400-tabs-in-60-seconds/</link><guid isPermaLink="false">65422724da89d344959db4f6</guid><category><![CDATA[rust]]></category><category><![CDATA[gpt3]]></category><category><![CDATA[ai]]></category><category><![CDATA[API]]></category><dc:creator><![CDATA[Ian Rumac]]></dc:creator><pubDate>Thu, 23 Feb 2023 10:43:00 GMT</pubDate><media:content url="https://blog.entropy.observer/content/images/2023/02/000021.27965652.1921725045-1.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.entropy.observer/content/images/2023/02/000021.27965652.1921725045-1.png" alt="Sorting 400+ tabs in 60 seconds with JS, Rust &amp; GPT-3: Part 1"><p></p><p>I&apos;m a serial tabbist. 
I admit it.<br><br>Currently, I have about 460 tabs open across 5 brave windows. Let&apos;s not even get started on the bookmarks.</p><blockquote><em>&quot;B-b-but, they&apos;re all necessary! So much knowledge! So many good links!&quot; </em><br>- My inner hoarder</blockquote><p>Yeah, I&apos;m like an information hamster. I just keep hoarding all the tabs until I can find enough time to read <em>everything - </em>and open even more of them on the way. And as one can assume, having so many tabs can be quite overwhelming, either when I need to find something and it&apos;s lost beyond the borders of the tab bar or when I&apos;m just looking at the screen and getting the anxious feeling of &quot;having so much to do&quot; - even when there is nothing to be done.<br><br>So, being the lazy hacker I am, instead of actually sorting them, cleaning them up<br>or *<em><em>gulp</em>*</em> simply closing them all, I wondered - why not just let the machine do the job? Can I have a 1-click solution to all my woes? Can I Marie-Kondo my inner hoarder into submission by using code?<br><br>Luckily for us, there is a giant language model worth billions of dollars just waiting to eagerly do the job. The idea is simple: Give GPT3 a list of items and ask it to return a list of categories those items belong to. Wrap all that up into a chrome extension and let the magic happen.<br><br>So, let&apos;s crack our fingers and get coding.. or.. oh... wait..</p><h2 id="the-sweet-taste-of-complexity">The sweet taste of complexity<br></h2><p>Let&apos;s backpedal a bit. So, our plan sounds simple enough. 
But as it usually goes in software, we missed out on some key details that are going to blow up our scope and budget if we don&apos;t think about them properly.<br><br>Some of the key issues to think about before we dive into code head first and find ourselves in a world of regret are:</p><ul><li><strong>Prompt token limits</strong><br>OpenAI&apos;s language models have token limits - 2048 or 4096 tokens.<br>Since each token is about 4 characters, that limits our prompt and response size to 8192/16384 characters respectively.<br><br> There are a few ways we can get around this problem (we&apos;ll cover all of them):<br>- Cutting our prompt into consumable chunks<br>- Optimising the data sent to reduce token count <br>- Fine-tuning a model for our task</li><li><strong>API Key security</strong><br>Since OpenAI API charges API calls by tokens used, our API key needs to be hidden somewhere safe. Hardcoding it in our extension is a no-no - unless we really want to pay OpenAI millions of dollars in bills because some bored script kiddy decided to scrape our key.</li><li><strong>User privacy</strong><br>Tab titles and URL&apos;s can reveal sensitive things - private documents,<br>links, session ID&apos;s and a lot of data about a person. We want users to be able to trust the extension, so we want to open-source it, have it build and deploy from that source and make it easy to deploy for others. </li><li><strong>Ease of update</strong><br>Since LLM&apos;s can be fickle with their responses and OpenAI API could incur us insane usage costs due to simple mistakes, we want to have control over updates instead of letting the users do it at their whim. 
That means our most important code cannot reside in the extension.</li></ul><p>How do we solve those issues?<br><br>We&apos;ll take a simple route - instead of writing all of the logic in the extension itself, we&apos;ll hide it behind an API - we&apos;ll build a simple backend service that will receive the tab data from the extension, chunk our prompts, communicate with OpenAI&apos;s API and reduce the data back into a single response. This enables us to both secure our keys, control our updates and open-source the extension without giving our secret token away.<br><br>To do this, we&apos;ll be using Rust - with <a href="https://github.com/tokio-rs/axum?ref=blog.entropy.observer">Axum</a> as our backend framework, <a href="https://shuttle.rs/?ref=blog.entropy.observer">Shuttle</a> as our deployment platform and <a href="https://github.com/features/actions?ref=blog.entropy.observer">Github Actions</a> as our CI.<br><br>So, before we get into code, let&apos;s do some napkin sketches to get an overview of what we&apos;re building:</p><figure class="kg-card kg-image-card kg-width-full kg-card-hascaption"><img src="https://blog.entropy.observer/content/images/2023/02/sketch.png" class="kg-image" alt="Sorting 400+ tabs in 60 seconds with JS, Rust &amp; GPT-3: Part 1" loading="lazy" width="2000" height="770" srcset="https://blog.entropy.observer/content/images/size/w600/2023/02/sketch.png 600w, https://blog.entropy.observer/content/images/size/w1000/2023/02/sketch.png 1000w, https://blog.entropy.observer/content/images/size/w1600/2023/02/sketch.png 1600w, https://blog.entropy.observer/content/images/size/w2400/2023/02/sketch.png 2400w"><figcaption>(Not a real napkin - made with <a href="https://okso.app/?ref=blog.entropy.observer">okso.app</a>, an amazing whiteboarding app made by <a href="https://github.com/sponsors/trekhleb?ref=blog.entropy.observer">Oleksii Trekhleb</a>)</figcaption></figure><p><br></p><h2 id="step-1-building-the-extension">Step 1: Building the 
Extension<br></h2><p>Chromium extensions are quite simple to build - they&apos;re basically just tiny webpages that live inside your browser and (with proper permissions) are given access to your browser by using your browser&apos;s API. We&apos;ll be relying on the <a href="https://developer.chrome.com/docs/extensions/reference/?ref=blog.entropy.observer">Chrome API</a> - it&apos;s the API Google Chrome uses - and which many <a href="https://www.chromium.org/chromium-projects/?ref=blog.entropy.observer">Chromium</a> project based browsers expose (such as <a href="https://brave.com/?ref=blog.entropy.observer">Brave</a>, which I&apos;m using, and even Edge, tho with a different namespace). Other browsers, like Firefox or Safari, aren&apos;t built off of the Chromium project, but provide a quite similar extension API. If you want to know more about the differences between them, I&apos;d suggest this <a href="https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Differences_between_API_implementations?ref=blog.entropy.observer">MDN</a> article.<br><br>Specifically, we&apos;ll be focusing on these two API&apos;s:</p><!--kg-card-begin: markdown--><ul>
<li><code>chrome.tabs</code> - enables us to query tabs our user currently has opened</li>
<li><code>chrome.tabGroups</code> -  enables us to query existing groups, create new ones and move tabs inside them</li>
</ul>
<!--kg-card-end: markdown--><p>So let&apos;s get to building. To bootstrap our extension, we&apos;ll be using <a href="https://github.com/dutiyesh/chrome-extension-cli?ref=blog.entropy.observer">Chrome extension CLI</a> - it will generate the initial project structure we need.<br>So, hit the terminal with:</p><pre><code class="language-bash">npm install -g chrome-extension-cli
chrome-extension-cli bookie-js
cd bookie-js</code></pre><p>Follow the instructions at the end and load the build folder as an extension - it will allow you to load and test your extension via hot reload, so every change will be immediately visible.<br><br>Now, take a peek inside the structure it generated - most of it is self-explanatory, </p><pre><code class="language-HTML">&#x251C;&#x2500;&#x2500; README.md
&#x251C;&#x2500;&#x2500; config
&#x2502;&#xA0;&#xA0; &#x251C;&#x2500;&#x2500; paths.js
&#x2502;&#xA0;&#xA0; &#x251C;&#x2500;&#x2500; webpack.common.js
&#x2502;&#xA0;&#xA0; &#x2514;&#x2500;&#x2500; webpack.config.js
&#x251C;&#x2500;&#x2500; node_modules
&#x251C;&#x2500;&#x2500; package-lock.json
&#x251C;&#x2500;&#x2500; package.json
&#x251C;&#x2500;&#x2500; pbcopy
&#x251C;&#x2500;&#x2500; public
&#x2502;&#xA0;&#xA0; &#x251C;&#x2500;&#x2500; icons
&#x2502;&#xA0;&#xA0; &#x251C;&#x2500;&#x2500; manifest.json
&#x2502;&#xA0;&#xA0; &#x2514;&#x2500;&#x2500; popup.html
&#x2514;&#x2500;&#x2500; src
    &#x251C;&#x2500;&#x2500; background.js
    &#x251C;&#x2500;&#x2500; contentScript.js
    &#x251C;&#x2500;&#x2500; popup.css
    &#x2514;&#x2500;&#x2500; popup.js</code></pre><p>We&apos;re mostly interested in only three files for now:<br><br><em><strong>public/manifest.json</strong></em><br><br>The manifest is a JSON file which provides the browser with information about your extension, such as name, it&apos;s capabilities, how it&apos;s started, which file to display, scripts to run on pages and <a href="https://developer.chrome.com/docs/extensions/mv3/manifest/?ref=blog.entropy.observer">many more</a>. A few fields to note there for us:</p><!--kg-card-begin: markdown--><ul>
<li><code>default_popup</code> - the HTML file to show when the extension icon is clicked</li>
<li><code>permissions</code> -  we need them to access certain parts of Chrome API</li>
<li><code>host_permissions</code> -  a set of URL patterns your extension can access</li>
</ul>
<!--kg-card-end: markdown--><p>For now, we&apos;ll leave it all as it is and come back to it later.</p><p><em><strong>src/popup.html</strong></em><br><br>The starting point of our UI. This HTML pops up when we click the extension button in the browser, so we&apos;ll use it to build a simple interface here.<br>We&apos;ll have a &apos;Sort&apos; button that calls our API&apos;s /sort endpoint and returns the result, a loading bar and a simple error box in case anything goes wrong.<br>For debugging, we can also have a &quot;Show tabs&quot; button that will show as a list of all of our tabs. So let&apos;s write some simple HTML for it:</p><pre><code class="language-HTML">&lt;!DOCTYPE html&gt;
&lt;html lang=&quot;en&quot;&gt;
  &lt;head&gt;
    &lt;meta charset=&quot;UTF-8&quot; /&gt;
    &lt;title&gt;Bookie JS&lt;/title&gt;
    &lt;link rel=&quot;stylesheet&quot; href=&quot;popup.css&quot; /&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;div class=&quot;app&quot;&gt;
      &lt;div class=&quot;button-container&quot;&gt;
        &lt;!-- This will call our API --&gt;
      &lt;button id=&quot;sortBtn&quot; class=&quot;button&quot;&gt;Sort my mess&lt;/button&gt;
      &lt;div id=&quot;loading&quot; class=&quot;loading&quot;&gt;&lt;/div&gt;
      &lt;div id=&quot;error&quot; class=&quot;error&quot;&gt;&lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
  &lt;script src=&quot;popup.js&quot;&gt;&lt;/script&gt;
&lt;/body&gt;
&lt;/html&gt;
</code></pre><p><br><em><strong>src/popup.js</strong></em><br></p><!--kg-card-begin: markdown--><p>This is where our JS will reside. We ain&apos;t gonna use no fancy <em>bulletproof cybernetically CRISPR&apos;d SSSR JavaScript framework</em>, it&apos;s going to be our plain ol&apos; <a href="http://vanilla-js.com/?ref=blog.entropy.observer">vanilla JS</a>. To update the UI, we will rely on a simple <code>render(state)</code> function that manipulates DOM elements using some simple <code>show</code> and <code>hide</code> functions (by changing <code>element.style.display</code> to <code>block</code>/<code>none</code>).</p>
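<p>We won&apos;t dwell on the <code>show</code> and <code>hide</code> helpers themselves - a minimal version of them could look like this (an assumed implementation, adjust to taste):</p><pre><code class="language-javascript">//toggle an element&apos;s visibility by its id
function setDisplay(id, value){
  document.getElementById(id).style.display = value;
}

function show(id){ setDisplay(id, &apos;block&apos;); }
function hide(id){ setDisplay(id, &apos;none&apos;); }
</code></pre>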
<!--kg-card-end: markdown--><p>Now, let&apos;s write our thought process down by writing it into functions:</p><pre><code class="language-javascript">&apos;use strict&apos;;

import &apos;./popup.css&apos;;

(function () {

const SORT_BTN = &apos;sortBtn&apos;;
const LOADING = &apos;loading&apos;;
const ERROR = &apos;error&apos;;
    
// get tabs &amp; groups from the API
async function getTabsAndGroups(){};

// call backend with the data
async function callBackendToSort(tabsAndGroups){};

// apply result to browser
async function applySort(sortedCategories){};   

//runs our app   
async function run(){

 //get tabs
 let tabsAndGroups = await getTabsAndGroups();
 render({loading: false, error: null})

 let btn = document.getElementById(&apos;sortBtn&apos;)

 //on click, call the API, show loading and apply the results when done 
 btn.addEventListener(&apos;click&apos;,async ()=&gt; {
     render({loading: true, error: null})
      try {
        let result = await callBackendToSort(tabsAndGroups)
        await applySort(result)
        render({loading: false, error: undefined})
      }catch (e){
        render({loading: false, error: e})
      }
 })
}

//load our run function when the content loads
document.addEventListener(&apos;DOMContentLoaded&apos;, run);
    
})();
</code></pre><!--kg-card-begin: markdown--><p>Our first step will be querying the Chrome API for tabs and groups. As we can see in the docs, we can use <code>chrome.tabs.query</code> to achieve this.</p>
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://blog.entropy.observer/content/images/2023/02/Screenshot-2023-02-13-at-15.28.40.png" class="kg-image" alt="Sorting 400+ tabs in 60 seconds with JS, Rust &amp; GPT-3: Part 1" loading="lazy" width="1462" height="560" srcset="https://blog.entropy.observer/content/images/size/w600/2023/02/Screenshot-2023-02-13-at-15.28.40.png 600w, https://blog.entropy.observer/content/images/size/w1000/2023/02/Screenshot-2023-02-13-at-15.28.40.png 1000w, https://blog.entropy.observer/content/images/2023/02/Screenshot-2023-02-13-at-15.28.40.png 1462w" sizes="(min-width: 720px) 720px"></figure><p>So, let&apos;s try it:</p><pre><code class="language-javascript">async function getTabsAndGroups() {
    let chromeTabs = await chrome.tabs.query({})
    console.log(chromeTabs)
  }
</code></pre><!--kg-card-begin: markdown--><p>Not working? Now, remember that <code>public/manifest.json</code> file? And the <code>permissions</code> object?</p>
<p>Well, to access tabs, their titles and groups, we&apos;ll need to add matching permissions to it. So open up the <code>manifest.json</code> and under <code>permissions</code> add <code>&quot;tabs&quot;, &quot;tabGroups&quot;</code>. Now when installing, Chrome can check your extension&apos;s permissions and let the user know what you&apos;re accessing.<br>
But, to be able to access the tabs API, we&apos;ll need one other special permission called <code>host_permissions</code>. It tells the user which websites the extension is allowed to run on, so if we want to be able to use it on all tabs, we&apos;ll need to add the proper URL pattern. So add a new property to the <code>manifest.json</code> called <code>host_permissions</code> with a pattern allowing it to match all URL&apos;s, such as <code>&quot;host_permissions&quot;: [&quot;*://*/*&quot;]</code>. Finally, now we are able to access all of the user&apos;s tabs and groups.</p>
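<p>The relevant part of our <code>manifest.json</code> now looks like this (all other fields omitted):</p>
<pre><code class="language-json">{
  &quot;permissions&quot;: [&quot;tabs&quot;, &quot;tabGroups&quot;],
  &quot;host_permissions&quot;: [&quot;*://*/*&quot;]
}
</code></pre>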
<p>Now that it&apos;s working, the data the <code>chrome.tabs.query</code> method returns will contain a few things we&apos;ll need: <code>id</code>, <code>title</code> and <code>groupId</code>. We&apos;ll be using <code>id</code> and <code>title</code> for sorting, and <code>groupId</code> to query existing groups, so first, we&apos;ll map the returned object to a simplified version of it, using only the properties we need.</p>
<p>To get more data about groups, we&apos;ll create a <code>tabsToGroups</code> function which finds all the unique groups and queries the Chrome API using <code>chrome.tabGroups.get(id)</code> to get the title of each group.</p>
<!--kg-card-end: markdown--><pre><code class="language-javascript">//map chrome tab objects to just the properties we need
async function mapTabs(chromeTabs){
  return chromeTabs.map((it)=&gt;({
    id: it.id,
    title: it.title,
    groupId: it.groupId
  }));
}

async function tabsToGroups(tabs){
  //get all existing groupIds from tabs
  let groupIds = tabs
      .map( (it)=&gt;it.groupId)
      .filter((it)=&gt;it!==null &amp;&amp; it!==undefined &amp;&amp; it!==-1);
  
  //push them into a set to get unique ones
  let groups = new Set(groupIds)

  //query chrome API for data about each tab group
  return await Promise.all([...groups]
      .map(async (it) =&gt; {
      let item = await chrome.tabGroups.get(it)
        return {
          id: item.id,
          title: item.title
        }
    }));
  }

// now our function can return us all of our tabs and groups
async function getTabsAndGroups() {
    let chromeTabs = await chrome.tabs.query({})
    let tabs = await mapTabs(chromeTabs)
    let tabsWithGroups = await tabsToGroups(tabs)
    let groups =  tabsWithGroups.filter((it)=&gt;it.title.length !== 0);
    return {
      items: tabs,
      categories: groups
    }
  }
</code></pre><p>Boom, in a few simple steps we have the list of our existing groups and tabs. <br>The API calling function is also quite simple. Since our API doesn&apos;t exist yet,<br>we&apos;ll just write a generic POST request to localhost:</p><pre><code class="language-javascript">async function callBackendToSort(data){
 let response = await fetch(&apos;http://127.0.0.1:8000/sort&apos;,{
      method: &apos;POST&apos;,
      headers: {&apos;Content-Type&apos;: &apos;application/json&apos;},
      body: JSON.stringify({
        items: data.items,
        categories: data.categories
      })
    })
 //parse the JSON body so applySort receives a plain object
 return await response.json()
}
</code></pre><p></p><p></p><p>Our render function is quite simple too - we just check the state and change our UI accordingly.</p><pre><code class="language-javascript">function render(state){
    if(state.loading){
      show(LOADING)
      hide(SORT_BTN)
      hide(ERROR)
    }else{
      hide(LOADING)
      show(SORT_BTN,true)
    }
    if(state.loading!==true &amp;&amp;
      (state.error!==undefined &amp;&amp; state.error!=null)){
      show(ERROR)
      showError(state.error)
    }else
      hide(ERROR)
}
</code></pre><!--kg-card-begin: markdown--><p>All that&apos;s now left to do is implement the <code>applySort</code> function which will apply our new categories to the browser itself.</p>
<p>The idea is:</p>
<ul>
<li>Check if the group exists</li>
<li>If it doesn&apos;t, create it</li>
<li>Update its tab list and title</li>
</ul>
<p>For this, we have a bit of API research to do - the documentation covering this part is a bit confusing. You&apos;d expect to be able to have something like<br>
<code>chrome.tabGroups.create</code> or <code>chrome.tabGroups.update</code> which would change tabs in the group, but... that&apos;s naive thinking.</p>
<p>To create a group, we call <code>chrome.tabs.group</code> <em>without</em> passing it a <code>groupId</code>. Then, the group will be created and the new <code>groupId</code> returned to you. This is kind of a weird call by the Chrome team - if groups are just containers of tabs, why would tabs have knowledge of and control over them?</p>
<p>Shouldn&apos;t the groups be created and managed via groups API?</p>
<p>Oh also, if you want to add tabs to the group, you use the same call and pass it the array of tabs via <code>tabIds</code>. &quot;Hey can I pass in the title too since we&apos;re already creating and updating the object via this API call?&quot; No, for that you&apos;ll use <code>chrome.tabGroups.update</code> API call.</p>
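<p>Put together, creating a group <em>and</em> naming it spans both namespaces. A small hypothetical helper (not from the extension code) just to show the dance:</p>
<pre><code class="language-javascript">//create a group from tabs, then title it - two APIs, one job
async function createTitledGroup(tabIds, title){
  //no groupId passed = a new group is created and its id returned
  let groupId = await chrome.tabs.group({ tabIds: tabIds });
  //the title lives in the tabGroups API instead
  await chrome.tabGroups.update(groupId, { title: title });
  return groupId;
}
</code></pre>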
<p>I assumed this weird syntax is because groups were a later addition in Chrome, so support was retrofitted into the tabs API itself. So let&apos;s test that assumption. Looking at the <a href="https://chromium-review.googlesource.com/c/chromium/src/+/2414921?tab=comments&amp;ref=blog.entropy.observer">commit</a> that added groups to the Tabs API, we can find the same discussion in the comments, leading us to the <a href="https://docs.google.com/document/d/1WgNtyBSuSmmHIuENU8IKLZmSK3tAVPpnjwjfD3clxqI/edit?disco=AAAAGpyGs6I&amp;ref=blog.entropy.observer">Tab Group API proposal</a>. It seems the team decided to split the responsibilities between <em>tab management</em> and <em>group management</em>. Since moving a tab is <em>tab management</em>, its responsibility belongs in the Tabs API.</p>
<p>The alternative proposal was also discussed (putting that responsibility in the TabGroups API), along with its pros and cons:<br>
<img src="https://blog.entropy.observer/content/images/2023/02/Screenshot-2023-02-16-at-21.29.44.png" alt="Sorting 400+ tabs in 60 seconds with JS, Rust &amp; GPT-3: Part 1" loading="lazy"></p>
<p>From my perspective (as the user of the API), the cons list doesn&apos;t seem that bad. Tabs wouldn&apos;t need to know about groups, user security would be increased (extensions would only need the <code>tabGroups</code> permission, reducing the potential area for malicious abuse by extensions) and it would <em>hide the implementation details, replacing them with an intuitive API, which is what abstractions are all about</em>. Weird decision nonetheless.</p>
<p>But enough talking about the spaghetti, let&apos;s write some down.</p>
<!--kg-card-end: markdown--><pre><code class="language-javascript">async function applySort(sortedCategories){

/* The response object we want looks like:
{ categories: [
	{ category_id: int, category_name: string, items: [int] }
    ] }
*/

  for (let i = 0; i &lt; sortedCategories.categories.length; i++) {
     let category = sortedCategories.categories[i]
     let categoryId = category.category_id
     //check if a group with this ID exists
     let groupExists = await chrome.tabGroups.get(categoryId)
     					.catch((e)=&gt;undefined);
      let groupId;
      if(groupExists === undefined)
         //if it doesn&apos;t, chrome.tabs.group returns us a new ID
         groupId = await chrome.tabs.group({ tabIds: category.items });
      else {
        //if it does, we use the existing one
       	groupId = groupExists.id
        await chrome.tabs.group({groupId: groupId,
                                tabIds: category.items});
      }

      // Set the title of the group and collapse it
      await chrome.tabGroups.update(groupId, {
        collapsed: true,
        title: category.category_name
      });
  }
}
</code></pre><p>With this, our JS extension MVP is done.<br> - We collect the tabs and groups<br> - We send them to the API<br> - We apply the returned sort.</p><!--kg-card-begin: markdown--><p>Now, we don&apos;t have an API yet, so how do we test it?<br>
We should write down some unit tests, but let&apos;s leave that for another day (no really - a few posts down we&apos;ll look into testing a Chrome extension with Jest). For now, we can fake the return value of the <code>callBackendToSort</code> function to include a few categories and a few tab IDs - something like this (but with your tab IDs):</p>
<!--kg-card-end: markdown--><pre><code class="language-json">{
	&quot;categories&quot;: [{
		&quot;category_id&quot;: 837293848,
		&quot;category_name&quot;: &quot;Hacker News&quot;,
		&quot;items&quot;: [1322973609, 1322973620]
	}, {
		&quot;category_id&quot;: 837293850,
		&quot;category_name&quot;: &quot;Science&quot;,
		&quot;items&quot;: [1322973618, 1322973617, 1322973608]
	}, {
		&quot;category_id&quot;: 837293851,
		&quot;category_name&quot;: &quot;GitHub&quot;,
		&quot;items&quot;: [1322973619]
	}, {
		&quot;category_id&quot;: 837293852,
		&quot;category_name&quot;: &quot;Web Development&quot;,
		&quot;items&quot;: [1322973612, 1322973613, 1322973615, 1322973616]
	}, {
		&quot;category_id&quot;: 837293853,
		&quot;category_name&quot;: &quot;Web APIs&quot;,
		&quot;items&quot;: [1322973646]
	}]
}</code></pre><p>Now we can move on to the fun parts - building that API, prompt optimisations, GPT timeouts and fixing mistakes we&apos;ll make in the days of the future past.<br>Oh and we&apos;ll also be adding some more complexity and feature creep, but more on that later.<br><br>Stay tuned for Part 2 where we&apos;ll continue our adventure with everyone&apos;s favourite crab - Rust. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.entropy.observer/content/images/2023/02/000039.c7240e14.1720012991-2.png" class="kg-image" alt="Sorting 400+ tabs in 60 seconds with JS, Rust &amp; GPT-3: Part 1" loading="lazy" width="960" height="512" srcset="https://blog.entropy.observer/content/images/size/w600/2023/02/000039.c7240e14.1720012991-2.png 600w, https://blog.entropy.observer/content/images/2023/02/000039.c7240e14.1720012991-2.png 960w" sizes="(min-width: 720px) 720px"><figcaption>Rusty the crab cute illustration, simple, clean, 2022 (The Artist Is A Machine)</figcaption></figure><p><br><br> <br><br><br><br><br><br></p>]]></content:encoded></item><item><title><![CDATA[The actual value behind GPT isn’t in writing SEO spam - it’s the transformers.]]></title><description><![CDATA[<blockquote>(Note: I talk about GPT here mostly but just because it&apos;s easier to write than &quot;transformer language models&quot; and most people are familiar with them in the form of GPT, but the text is about them in general)</blockquote><p>GPT3, often confused with ChatGPT in the latest</p>]]></description><link>https://blog.entropy.observer/the-actual-value-behind-gpt-isnt-in-writing-mediocre-articles-its-the-transformers/</link><guid isPermaLink="false">65422724da89d344959db4f5</guid><dc:creator><![CDATA[Ian Rumac]]></dc:creator><pubDate>Mon, 06 Feb 2023 22:46:51 GMT</pubDate><media:content url="https://blog.entropy.observer/content/images/2023/02/8603987448ba445793e6d9fe1e018fe1.png" medium="image"/><content:encoded><![CDATA[<blockquote>(Note: I talk 
about GPT here mostly just because it&apos;s easier to write than &quot;transformer language models&quot; and most people are familiar with them in the form of GPT, but the text is about them in general)</blockquote><img src="https://blog.entropy.observer/content/images/2023/02/8603987448ba445793e6d9fe1e018fe1.png" alt="The actual value behind GPT isn&#x2019;t in writing SEO spam - it&#x2019;s the transformers."><p>GPT3, often confused with ChatGPT in the latest swarm of internet articles,<br>has been all the rage in the tech buzzword world these days. Its treatment in the media for the last year or so has been off the charts, with some treating it as the miracle AI we have been waiting for. Everybody and their mom has been jumping on the bandwagon, creating the next copywriting tool, making it pass the bar or just using it to write their math homework. </p><p>Unfortunately, the quality of the content generated is usually mediocre - even with better prompting, the text generated cannot be novel - the technology itself is based on &quot;common denominators&quot; in a way, parroting and remixing from the trained texts, so you can forget about becoming the next James Joyce in a few clicks; your writing will most likely end up looking like an average philosophy student&apos;s grandiose manifesto, with a bunch of words thrown in to impress the average reader, yet meaning nothing and bearing no satisfaction to the reader&apos;s gaze.<br><br>But, far off on the other side, there are some way more fun applications people are finding uses for - <a href="https://spindas.dreamwidth.org/4207.html?ref=blog.entropy.observer">GPT3 as a reducer</a>, as a backend, <a href="https://www.width.ai/post/gpt-3-language-translation-software?ref=blog.entropy.observer">as a translator</a> or <a href="https://medium.com/tenable-techblog/g-3po-a-protocol-droid-for-ghidra-4b46fa72f1ff?ref=blog.entropy.observer">decompiler/deobfuscator</a> - and these applications have a much bigger
practical value.<br><br>And for the last year or so, this has been tickling my mind - what are some actual use cases behind the technology - yes, generating articles or parroting back documentation is an obvious one. Fine-tuned models answering support questions is also a nice one, tho it comes with its own 13 reasons why not.<br><br>But the transformations themselves - taking data in one form and returning it in another, processing it along the way or just translating it - unlock a large pool of uncaptured value.</p><p>Imagine being able to process a bunch of scraped or human data into a predefined format that aligns with your API&apos;s data format - or to put it more vividly, imagine your grandma sending a text &quot;can you bring me 2 bottles of milk and a pack of eggs?&quot;, getting an answer &quot;that will be 3.97, is that ok?&quot; and someone showing up with 2 milks and eggs 15 minutes later (or sometimes 12 milks and 2 eggs because the model screwed up).<br><br>Behind the scenes, the text is actually fed into a model that transforms it into JSON in the format of:<br></p><pre><code class="language-json">{
  &quot;action&quot;: &quot;purchase&quot;,
  &quot;items&quot;: [
    {
      &quot;name&quot;: &quot;Milk&quot;,
      &quot;quantity&quot;: 2
    },
    {
      &quot;name&quot;: &quot;Egg pack&quot;,
      &quot;quantity&quot;: 1
    }
  ]
}
</code></pre><p>Which the latest 15-minute grocery delivery app can then consume and bring your grandma her milk (and rip her off for a 4$ service fee, 8$ delivery fee, 3$ VC fee on the way).<br><br>Even better things are possible with <a href="https://github.com/hwchase17/langchain?ref=blog.entropy.observer">chaining</a> different models: <br><br>Scrape a website, feed it into a model to remove unnecessary HTML, and feed the results into another model that transforms the contents into a format your APIs consume. Hell, why even bother with an API, just insert the results into a model that is fine-tuned in translating to SQL queries and pump that sweet data oil in directly. <br><br>Want to check how much do open bugs during full moons influence your user churn?<br>Well what if your favorite analytics tool had a question box connecting to a chain -<br>first giving your question to a model that suggests data to find, passing into another model returning a query on your data lake which is then evaluated for safety, executed and passed together with the original prompt into a code-generating model that will return the necessary HTML to display that data.<br><br>Instead of having to torture your developers and designers with supporting infinite possible permutations of filters, chart designs and customisations, you can just leave it up to the model to generate them on the fly. <br><br>With enough fine-tuning (and a lot of human work to provide good data),<br>transformer LLMs can help us achieve a lot of the stuff we thought &quot;unscalable&quot; as of now - stuff that wasn&apos;t cost efficient, needed a mechanical turk or a large swath of hardcoded assumptions to iron out the edge cases - all by using an oversized text mumbler-jumber. 
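A chain like that is, at its core, just function composition over model calls, with each step's output feeding the next. A minimal sketch with a canned `llm` stand-in (a hypothetical helper; a real chain would make async API calls and need error checking between steps):

```javascript
// Canned stand-in for a model call; a real one would hit an LLM API.
// Each "model" here is just a fixed transformation keyed by instruction.
function llm(instruction, input) {
  if (instruction === "strip-html") return input.replace(/<[^>]*>/g, " ").trim();
  if (instruction === "to-json") return JSON.stringify({ text: input });
  throw new Error("unknown instruction: " + instruction);
}

// A chain simply pipes each step's output into the next model.
function chain(steps, input) {
  return steps.reduce((data, step) => llm(step, data), input);
}

const scraped = "<div><p>2 bottles of milk</p></div>";
const result = chain(["strip-html", "to-json"], scraped);
// result: '{"text":"2 bottles of milk"}'
```

Swap the canned transformations for real model calls (and insert a validation step between them) and you have the scrape-then-transform pipeline described above.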
<br><br>And yes, there are a lot of hallucinations, quite a few mistakes, and a lot of accuracy issues in the way - one wrong word and the model could wind up in the crazy lane - but I&apos;m not saying it&apos;s a perfect &quot;do-all-be-all&quot; technology, far from it - I&apos;m saying it&apos;s a great &quot;glue&quot; layer we were missing in our toolbelt, a &quot;generic glue&quot; layer which could help us unlock more economic and data value than ever. With good training, error checking and proper chaining, we could conquer some problems that were insurmountable until now.<br><br>Even though the current generation of models is like giant mainframes upon which we can only gaze with wonder, there are newer and smaller models coming out at a rapid pace. And while we are still quite far away from having a small, easily tuneable model that will be good enough to cover a large swath of tasks with only a small amount of additional training, the next generation of programmers might grow up complaining that &apos;gpt install is-integer&apos; ruined programming. </p>]]></content:encoded></item></channel></rss>