Embedding Vector Search w/ Django & PostgreSQL, Part 1

django, docker, postgresql, programming, Python

Sigh. 🙄 Yes–yet another “AI” article in a blog.

The pgvector extension adds vector similarity search to PostgreSQL. This ability is useful for implementing retrieval-augmented generation (RAG) solutions. The pgvector project page has the background information on the technology, so this post will not duplicate it.

This post is a runbook on setting up:

  • a Docker environment with PostgreSQL and the pgvector extension.
  • a Django application with the client libraries ready to use the vector extension of the PostgreSQL DB.

Setup (Docker)

The first step is to create a “service” (docker-compose concept) for the DB:

services:
  ...
  postgres:
    # Check for latest version as necessary/desired
    image: pgvector/pgvector:pg17
    environment:
      # remove before pushing up to source control
      - POSTGRES_PASSWORD=mysecretpassword
    ports:
      - "5432:5432"
    expose:
      - "5432"
    volumes:
      # Optional; use a host volume
      - ./data/psql:/var/lib/postgresql/data

Then bring up the service once to initialize the DB:

> docker-compose up postgres
...
postgres-1  | 2025-02-13 03:05:22.288 UTC [1] LOG:  database system is ready to accept connections

The image pgvector/pgvector:pg17 is a Docker image for PostgreSQL 17 with the pgvector extension bundled.
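
Watching the log for "ready to accept connections" works when done by hand. For a script, a minimal sketch along the following lines could poll the published port instead. The function name wait_for_port and its defaults are made up for illustration, and an open port only shows that something is listening, not that the database has finished initializing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it checks reachability only, not database readiness.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage from the host (matches the "5432:5432" port mapping above):
#   wait_for_port("localhost", 5432)
```

From another container on the same compose network, the host would be the service name (postgres) rather than localhost.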

Create a DB (if necessary)

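If the application needs its own database, create one with psql. The postgres service from the previous step must still be running, because this command starts a second, temporary container that connects to it by service name: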

> docker-compose run --rm postgres psql -h postgres -U postgres 
Password for user postgres: .....
...
> create database mydb;
> \q

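The POSTGRES_PASSWORD environment entry can now be removed from docker-compose.yml. The image only reads it when initializing an empty data directory, so this is safe as long as the data directory persists (for example, via the host volume above).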

Django Connection to PostgreSQL

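Add the psycopg dependency so Django can talk to PostgreSQL. If the app container does not have the libpq system library, installing "psycopg[binary]" instead bundles it.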

Shell into the Django app container and run:

> pip install psycopg
...
> pip freeze > requirements.txt

Tell Django to use the PostgreSQL DB instead of the default SQLite:

DATABASES = {
    # 'default': {
    #     'ENGINE': 'django.db.backends.sqlite3',
    #     'NAME': BASE_DIR / 'db.sqlite3',
    # }
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "HOST": "postgres",  # Matches the docker "service" name
        "NAME": "mydb",      # the DB name
        "USER": "postgres",  # the default user
        "PASSWORD": "mysecretpassword",  # the password from POSTGRES_PASSWORD above
    }
}
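
Hardcoding the password in settings.py has the same source-control problem as leaving it in docker-compose.yml. A minimal sketch of one alternative, assuming the values are passed to the Django container as environment variables: the DB_* variable names and the database_settings helper are hypothetical, and the defaults mirror the values used in this post.

```python
import os


def database_settings(env=os.environ) -> dict:
    """Build Django's DATABASES dict from environment variables.

    The DB_* names are illustrative, not a Django convention.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "HOST": env.get("DB_HOST", "postgres"),  # the docker "service" name
            "PORT": env.get("DB_PORT", "5432"),
            "NAME": env.get("DB_NAME", "mydb"),
            "USER": env.get("DB_USER", "postgres"),
            # No default here: a missing password raises KeyError at startup.
            "PASSWORD": env["DB_PASSWORD"],
        }
    }


# In settings.py:
#   DATABASES = database_settings()
```

Failing at startup on a missing DB_PASSWORD is a deliberate choice in this sketch; it surfaces a misconfigured container before the first query does.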

Verify the connection by running the initial migrations for the Django application:

> python manage.py migrate
...
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  ...
  Applying sessions.0001_initial... OK

Install pgvector-python

Follow the directions here: https://github.com/pgvector/pgvector-python. Shell into the Django app container and run:

> pip install pgvector
...
> pip freeze > requirements.txt

The same repository also has examples for Django.
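
To double-check that both client libraries ended up in the container, something like the following could be run in a Python shell there. It is a small sketch using only the standard library; installed_versions is a made-up helper name, and the package names are the ones installed in this post.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Example usage inside the Django app container:
#   installed_versions(["Django", "psycopg", "pgvector"])
```

A None value for any entry means that package is not installed in the environment the shell is running in.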