We leveraged OpenAI’s June 13 update to make our web app navigational assistant (“RiskPilot”) 5x faster and 20x cheaper.
While OpenAI took a step in the right direction with the Function Calling feature, we found it still has room to improve.
Intro
At RiskThinking.AI, we started a project back in April to build a navigational assistant (dubbed “RiskPilot”) for our web app on top of OpenAI’s GPT models. At the time, gpt-4-0314 was the only viable model, since gpt-3.5-turbo-0301 didn’t have “system steerability” (the ability to guide the conversation towards a specific subject or goal).
The workflow is fairly simple:
Our “RiskPilot” takes a user query such as “Toronto” and a scope of web app view states:
    {
      viewState: {
        type: "object",
        description: "parsed from user query as a deck.gl viewState with optimal zoom level",
        default: {
          "latitude": 49.254,
          "longitude": -123.13,
          "zoom": 2
        }
      }
    }
It then applies prompt engineering to form something like:
    [
      {
        "role": "system",
        "content": "You are an app navigation copilot which converts user asks based on possible parameters to a JSON form. For errors, respond with {}"
      },
      {
        "role": "user",
        "content": """
    Toronto

    Output format:
    {"viewState": {"latitude": 49.254, "longitude": -123.13, "zoom": 2}}

    Possible values for each parameter:
    - viewState: {type: "object", description: "parsed from user query as a deck.gl viewState with optimal zoom level", default: {"latitude": 49.254, "longitude": -123.13, "zoom": 2}}
    """
      }
    ]
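The prompt assembly itself is mechanical, so it can be sketched in a few lines of Python. This is a minimal sketch rather than our production code: the helper name `build_messages` is hypothetical, and it serializes each parameter spec with `json.dumps`, so keys come out quoted, which differs slightly from the hand-written prompt above.

```python
import json

SYSTEM_PROMPT = (
    "You are an app navigation copilot which converts user asks based on "
    "possible parameters to a JSON form. For errors, respond with {}"
)


def build_messages(query: str, scope: dict) -> list[dict]:
    """Assemble the system/user messages for one user query and one scope."""
    # The output format example is simply each parameter's default value.
    output_format = {name: spec["default"] for name, spec in scope.items()}
    # One bullet per parameter, carrying its full spec.
    possible_values = "\n".join(
        f"- {name}: {json.dumps(spec)}" for name, spec in scope.items()
    )
    user_content = (
        f"{query}\n\n"
        f"Output format:\n{json.dumps(output_format)}\n\n"
        f"Possible values for each parameter:\n{possible_values}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


scope = {
    "viewState": {
        "type": "object",
        "description": "parsed from user query as a deck.gl viewState with optimal zoom level",
        "default": {"latitude": 49.254, "longitude": -123.13, "zoom": 2},
    }
}
messages = build_messages("Toronto", scope)
```

Because the output format is derived from the defaults, adding a new view state to the scope extends the prompt without touching the template.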
And this is how it works in the web app component:
See below for a more complex example from our internal sandbox tool (credit: jegrieve):
But it was excruciatingly slow, until now…
With its June 13 update, OpenAI enabled system steerability for the gpt-3.5-turbo model and, along with it, introduced the Function Calling feature, in recognition of the overwhelming number of use cases built on the same simple principle: natural language to JSON (our web app navigational assistant is no different).
As OpenAI’s documentation puts it: “Under the hood, functions are injected into the system message in a syntax the model has been trained on. This means functions count against the model’s context limit and are billed as input tokens. If running into context limits, we suggest limiting the number of functions or the length of documentation you provide for function parameters.”
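For illustration, here is a minimal sketch of how the viewState scope might be expressed as a function definition for the `functions` request parameter, whose `parameters` field is a JSON Schema object. The function name `navigate`, its description, and the helper `scope_to_function` are hypothetical and not taken from our implementation.

```python
import json


def scope_to_function(scope: dict) -> dict:
    """Wrap a scope as a single function definition with JSON Schema parameters."""
    properties = {
        name: {
            "type": spec["type"],
            "description": spec["description"],
            "default": spec["default"],
        }
        for name, spec in scope.items()
    }
    return {
        "name": "navigate",  # hypothetical function name
        "description": "Convert a user ask into app navigation parameters.",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": list(scope),
        },
    }


scope = {
    "viewState": {
        "type": "object",
        "description": "parsed from user query as a deck.gl viewState with optimal zoom level",
        "default": {"latitude": 49.254, "longitude": -123.13, "zoom": 2},
    }
}
functions = [scope_to_function(scope)]

# Functions are billed as input tokens, so the serialized size is worth watching.
# Character length is only a rough proxy here, not a token count.
approx_size = len(json.dumps(functions))
```

Shorter parameter descriptions shrink this payload directly, which is the lever the quoted advice points at.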
We benchmarked the models before and after the June 13 update and ended up with a new navigational assistant that’s 5 times faster and 20 times cheaper. On top of that, we found that OpenAI’s Function Calling, while promising, can still do better!
Benchmark Base (gpt-4-0314)
This is the base implementation of our original “RiskPilot”, where we use both <system> and <user> prompt engineering to achieve the steerability we need for stable, high-precision JSON output that the application can consume directly.
The arguments field of the Function Calling response is what we need to use. For our use case the result is dysfunctional (to say the least), even though both models are really fast and cheap. From the token counts, it looks like the proprietary Function Calling system message may have been compressed too much, which resulted in a less desirable outcome for our use case.
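In a Function Calling response, the assistant message carries a `function_call` object whose `arguments` field is a JSON-encoded string, and that string is not guaranteed to be valid JSON. A minimal sketch of reading it defensively, falling back to `{}` in the same spirit as our system prompt, could look like this. The helper name `parse_arguments` is hypothetical, and the sample message only illustrates the shape; it is not one of our benchmark responses.

```python
import json


def parse_arguments(message: dict) -> dict:
    """Return the function-call arguments as a dict, or {} if they are unusable."""
    call = message.get("function_call") or {}
    try:
        parsed = json.loads(call.get("arguments", ""))
    except (TypeError, json.JSONDecodeError):
        # Missing, non-string, or malformed arguments all count as an error.
        return {}
    return parsed if isinstance(parsed, dict) else {}


# Illustrative message shape only, with made-up values.
sample = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "navigate",
        "arguments": '{"viewState": {"latitude": 43.65, "longitude": -79.38, "zoom": 10}}',
    },
}
view_state = parse_arguments(sample).get("viewState", {})
```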
0613 Custom prompt-engineering
We take the same prompt engineering we used for the pre-0613 gpt-4 model and apply it to both gpt-4-0613 and gpt-3.5-turbo-0613 (which gains system steerability over the previous version).
The result is excellent. Even though it is not as fast or as cheap as OpenAI’s in-house Function Calling implementation, the same staple system and user prompt-engineering implementation on gpt-3.5-turbo-0613 is 5 times faster and 20 times cheaper than on gpt-4-0314, so all we had to do was change one line of code.
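A minimal sketch of what such a one-line change could look like, assuming the model name lives in a single constant. The helper `build_request` and the `temperature` setting are hypothetical and only illustrate that the prompts stay untouched when the model is swapped.

```python
MODEL = "gpt-3.5-turbo-0613"  # previously "gpt-4-0314"


def build_request(messages: list[dict], model: str = MODEL) -> dict:
    """Build a chat completion request body; the prompts are model-agnostic."""
    return {
        "model": model,
        "messages": messages,
        "temperature": 0,  # assumption: favor repeatable JSON output
    }


request = build_request(
    [
        {"role": "system", "content": "You are an app navigation copilot ..."},
        {"role": "user", "content": "Toronto"},
    ]
)
```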
Take a look at the difference before (gpt-4-0314) and after (gpt-3.5-turbo-0613) (credit: @shanwhiz):
Recap
At RiskThinking.AI, we embarked on a project to develop a navigational assistant using OpenAI’s GPT model. Initially, we had to use gpt-4-0314 because system steerability was not enabled in gpt-3.5-turbo-0301. With OpenAI’s June 13 update, system steerability became available for gpt-3.5-turbo, making our navigational assistant five times faster and twenty times cheaper.
While the June 13 update’s Function Calling feature shows promise and provides the fastest response time, we proved there’s still room for further improvement.