Paradigm Junction

In the last article in this series we discussed why some companies are seeking to build their own GenerativeAI chatbots. Doing so allows them to keep their proprietary data safe, whilst customising the responses to their own business needs and saving money. A recently leaked memo from a Google researcher makes the point precisely: open source models are now good enough that paying for access to a private model (from Google or OpenAI) may no longer be worth the expense. Whilst the focus has been on computation costs, the impact on data retention and customisation is just as large.

So, as a company interested in putting a Large Language Model like this to work, what options do you have?

Use an offline, on-prem, open source model

A range of options now exists for running miniaturised models, which are small enough to run on most modern laptops whilst offering quality close to that of much larger models. GPT4All has become one of the most attention-grabbing coding projects in history and allows you to run a chatbot interface fully offline. Data is only sent back to the developers if you opt in, meaning you can potentially include sensitive commercial information in the prompts. There are even claims that these models can be run on a phone at a reasonable speed!
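To see why a miniaturised model can fit on a laptop, it helps to run the arithmetic on memory. The sketch below is a rough illustration rather than a benchmark. It assumes a hypothetical model with 7 billion parameters and counts only the memory needed to hold the weights, ignoring the extra working memory used whilst the model is generating text. The function name and the figures are our own assumptions, chosen for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory needed to hold the model weights, in GB (10^9 bytes)."""
    total_bytes = n_params * bits_per_param / 8
    return total_bytes / 1e9


# Hypothetical model size, chosen for illustration only.
N_PARAMS = 7e9

# Storing each weight with fewer bits (quantisation) shrinks the model.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights need roughly 14 GB at 16 bits per parameter but only about 3.5 GB at 4 bits, which is within reach of an ordinary laptop.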

Pros: These models are free to download and, because of their relatively small size, cheap to run. Local versions shouldn’t send data back to the developers, meaning data you enter as prompts stays private.

Cons: Smaller models have slightly lower performance, although in tests of some miniaturised models human evaluators chose the small model’s answer almost 50% of the time, which suggests they could barely tell it apart from the larger model’s. That said, the models are still general and know nothing about your business or context. This reduces their utility and their ability to give you a competitive advantage.

Fine tune an open source model

Fine tuning is the process of taking an existing model and layering on a small amount of additional training to make it respond with answers that are more aligned with what you want. Because the model has already “learned” the structure of language and how to follow instructions, much less training is needed. Whilst it still requires some Machine Learning expertise, the time and cost of fine tuning a model have decreased dramatically thanks to a technique called Low-Rank Adaptation (LoRA), and to open source projects which claim to fine tune a model in just a couple of days. If you have a reasonable volume of business-specific text documents from which a language model could learn (think manuals for customer service agents, or several years’ history of strategy decisions), it is now possible to produce a model which has learnt how you work, and to update it regularly and relatively cheaply. Best of all, the underlying foundation models are open sourced and getting better all the time.
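The reason LoRA cuts the cost so sharply comes down to how few numbers it has to learn. Rather than updating a full weight matrix of size d × k, it freezes that matrix and trains two much smaller ones, of size d × r and r × k, whose product is added to the original. The sketch below counts the trainable parameters in each case. The layer size (4096 × 4096) and rank (8) are assumptions picked for illustration, not figures from any particular model or project.

```python
def full_params(d: int, k: int) -> int:
    """Trainable parameters when the whole d x k weight matrix is updated."""
    return d * k


def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters with LoRA: one d x r matrix plus one r x k matrix."""
    return r * (d + k)


# Illustrative layer size and rank (assumed values).
D, K, R = 4096, 4096, 8

full = full_params(D, K)
lora = lora_params(D, K, R)

print(f"Full fine tuning: {full:,} trainable parameters")
print(f"LoRA (rank {R}):  {lora:,} trainable parameters")
print(f"LoRA trains {lora / full:.2%} of the full count")
```

For this one layer, LoRA trains 65,536 parameters instead of 16,777,216, or about 0.39% of the total. The saving depends on the rank you choose: a higher rank gives the model more room to adapt but costs more to train.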

Pros: In addition to the security benefits of running your own model, you will have a tool that knows your business specifically. For specific use cases these models can often outperform large, general models like GPT-4, and at a fraction of the running costs.

Cons: Fine tuning a model will require developer time and a concerted effort to get the data you want to use in one place. It’s achievable in a matter of weeks, but it is not yet an off-the-shelf solution.

Train a model from scratch

This approach is not for the faint of heart. However, if you work with large amounts of data and need a highly bespoke tool (such as for interacting with large volumes of historical transaction data) then this approach could provide better results over time. Thanks to the rapid development of tools like Mosaic, a lot of the hardware challenges - which used to prevent anyone except well funded researchers from training a model - have been solved. Once your data is well organised and in one place, its preconfigured platform will handle the provision of computing power, batching up of training runs, error processing and progress analytics. Nonetheless, the cost of building something new is still high enough that this will only be undertaken where the benefits are large and other options have been explored. Additionally, when training a new model, desired characteristics like instruction following and fluency with multiple languages can’t be guaranteed.

Pros: Potentially the highest levels of performance for tasks which are unlikely to be shared by many users. Model size (and therefore cost) can be optimised to match the task it needs to fulfil.

Cons: Expensive and with uncertain results. The trend of the market is towards fine tuning existing foundation models.

—

If you would like to talk through how you might use GenerativeAI tools, like Large Language Models, to speed up your business processes or to help provide information to customers and internal decision makers, then get in touch at james@paradigmjunction.com

Building a ChatGPT replacement without OpenAI
