#12: Vector databases

Welcome to The Prompt Engineer, your go-to source for becoming superhuman with LLMs.

Today, we're diving deep into the world of vector databases - the backbone technology that lets Large Language Models work with the meaning of text, not just the words.

Interested?

Great, let’s get started.

Vector databases

Vector databases are the unsung heroes behind the scenes of NLP (Natural Language Processing).

And no, they're not some kind of fancy math equation that only a handful of rocket scientists can understand.

In fact, they're pretty simple.

Think of vector databases as a sort of library that stores information about words and phrases in a way that's easy for machines to understand.

It's like having a cheat sheet for language, and we love a good cheat sheet.

Simply put, they store words and other types of data as mathematical vectors - lists of numbers, also known as embeddings.

And before you stop reading, look at the picture below.

Doing math on text

Here we transformed the word "king" into a set of numbers.
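In code, that transformation gives you a plain list of numbers. Here's a toy sketch with made-up 3-dimensional vectors (the numbers are invented for illustration - real embedding models learn hundreds or thousands of dimensions from huge text corpora):

```python
# Toy, hand-made 3-dimensional "embeddings".
# Real models (word2vec, OpenAI embedding endpoints, etc.) learn
# far longer vectors automatically from data.
embeddings = {
    "king":  [0.95, 0.80, 0.10],
    "queen": [0.95, 0.10, 0.85],
    "man":   [0.20, 0.85, 0.05],
    "woman": [0.20, 0.15, 0.80],
}

# The word "king" as a set of numbers:
print(embeddings["king"])  # -> [0.95, 0.8, 0.1]
```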

Why is this useful?

Well, now we can do mathematical operations on it, for example:

"king" - "man" + "woman" = "queen"
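Once words are vectors, that famous analogy becomes plain arithmetic: subtract and add the vectors element-wise, then look for the stored word closest to the result. A minimal sketch, using the same kind of made-up 3-dimensional vectors (hand-picked so the analogy works; real embeddings learn such relationships from data):

```python
import math

# Made-up 3-d vectors, chosen for illustration.
vecs = {
    "king":  [0.95, 0.80, 0.10],
    "queen": [0.95, 0.10, 0.85],
    "man":   [0.20, 0.85, 0.05],
    "woman": [0.20, 0.15, 0.80],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# "king" - "man" + "woman", computed element-wise
target = [k - m + w for k, m, w in zip(vecs["king"], vecs["man"], vecs["woman"])]

# Find the closest stored word to the result (excluding the input words,
# as is standard when evaluating word analogies).
best = max((w for w in vecs if w not in ("king", "man", "woman")),
           key=lambda w: cosine(vecs[w], target))
print(best)  # -> queen
```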

And we can also discover hidden relationships in our dataset.

This makes vector databases perfect for

  • Long-term memory for Large Language Models,

  • Recommendation systems,

  • Image recognition,

  • Anomaly detection,

  • Or even drug discovery.
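All of these use cases boil down to one core operation: store lots of vectors and, given a query vector, return the most similar ones. Production vector databases do this at scale with approximate nearest-neighbor indexes; here's a brute-force sketch of the idea in plain Python (the stored vectors and payloads are invented for illustration - in practice they would come from an embedding model):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# A tiny in-memory "vector database": each entry is (vector, payload).
store = [
    ([0.9, 0.1, 0.0], "article about monarchy"),
    ([0.1, 0.9, 0.1], "article about cooking"),
    ([0.8, 0.2, 0.1], "article about royal history"),
]

def search(query_vec, k=2):
    """Return the payloads of the k vectors most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(item[0], query_vec),
                    reverse=True)
    return [payload for _, payload in ranked[:k]]

# A query vector pointing in the "royalty" direction:
print(search([0.85, 0.15, 0.05]))
```

Real systems replace the sorted scan with an approximate index so that searching millions of vectors stays fast - but the interface is the same: vector in, nearest neighbors out.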

So next time you're chatting with your favorite AI, just remember: behind every great language model is a trusty vector database.

And behind every great AI output is someone who reads The Prompt Engineer!

Best,
Gabor Soter, PhD

A little about me:

  • did my PhD in Europe’s largest AI and robotics research lab

  • worked as a software engineer and CTO at Y Combinator-backed and AI startups

  • at my previous startup, my team worked with OpenAI