chromem

package module
v0.10.0
Published: Feb 28, 2026 License: MPL-2.0 Imports: 30 Imported by: 0

README

chromem-go


Embeddable vector database for Go with Chroma-like interface and zero third-party dependencies. In-memory with optional persistence.

This repository is a performance-focused fork of the original project at https://github.com/philippgille/chromem-go.

Because chromem-go is embeddable, it enables you to add retrieval augmented generation (RAG) and similar embeddings-based features to your Go app without running a separate database, similar to using SQLite instead of PostgreSQL/MySQL/etc.

It's not a client library for Chroma, nor a reimplementation of it in Go. It's a database in its own right.

The focus is not scale (millions of documents) or number of features, but simplicity and performance for the most common use cases. This fork adds query-path optimizations on top of upstream commit f63964a64bf64b261f665dd45f92cafadcb0b972, SIMD controls, and multiple ANN / lexical retrieval modes; see Differences vs upstream (after f63964a) and Benchmarks.

⚠️ The project is in beta, under heavy construction, and may introduce breaking changes in releases before v1.0.0. All changes are documented in the CHANGELOG.

Contents

  1. Use cases
  2. Interface
  3. Differences vs upstream (after f63964a)
  4. Features + Roadmap
  5. Installation
  6. Usage
  7. Benchmarks
  8. Development
  9. Motivation
  10. Related projects

Use cases

With a vector database you can do various things:

  • Retrieval augmented generation (RAG), question answering (Q&A)
  • Text and code search
  • Recommendation systems
  • Classification
  • Clustering

Let's look at the RAG use case in more detail:

RAG

The knowledge of large language models (LLMs), even those with 30 or 70 billion parameters and more, is limited. They don't know anything about what happened after their training ended, they don't know anything about data they were not trained on (like your company's intranet, Jira / bug tracker, wiki or other knowledge bases), and even data they do know they often can't reproduce exactly, hallucinating instead.

Fine-tuning an LLM can help a bit, but it's meant more to improve the LLM's reasoning about specific topics, or to reproduce the style of written text or code. Fine-tuning does not add knowledge 1:1 to the model; details are lost or mixed up. And the knowledge cutoff (anything that happened after the fine-tuning) isn't solved either.

=> A vector database can act as the up-to-date, precise knowledge for LLMs:

  1. You store relevant documents that you want the LLM to know in the database.
  2. The database stores the embeddings alongside the documents. You can either provide the embeddings yourself or have them created by dedicated "embedding models" like OpenAI's text-embedding-3-small.
    • chromem-go can do this for you and supports multiple embedding providers and models out-of-the-box.
  3. Later, when you want to talk to the LLM, you first send the question to the vector DB to find similar/related content. This is called "nearest neighbor search".
  4. In the question to the LLM, you provide this content alongside your question.
  5. The LLM can take this up-to-date precise content into account when answering.

Check out the example code to see it in action!

Interface

Our original inspiration was the Chroma interface, whose core API is the following (taken from their README):

Chroma core interface
import chromadb
# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()

# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")

# Add docs to the collection. Can also update and delete. Row-based API coming soon!
collection.add(
    documents=["This is document1", "This is document2"], # we handle tokenization, embedding, and indexing automatically. You can skip that and add your own embeddings as well
    metadatas=[{"source": "notion"}, {"source": "google-docs"}], # filter on these!
    ids=["doc1", "doc2"], # unique for each doc
)

# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"}, # optional filter
    # where_document={"$contains":"search_string"}  # optional filter
)

Our Go library exposes the same interface:

chromem-go equivalent
package main

import (
    "context"
    "fmt"

    "github.com/TIANLI0/chromem-go"
)

func main() {
    ctx := context.Background()

    // Set up chromem-go in-memory, for easy prototyping. Can add persistence easily!
    // We call it DB instead of client because there's no client-server separation. The DB is embedded.
    db := chromem.NewDB()

    // Create collection. GetCollection, GetOrCreateCollection, DeleteCollection also available!
    collection, _ := db.CreateCollection("all-my-documents", nil, nil)

    // Add docs to the collection. Update and delete will be added in the future.
    // Can be multi-threaded with AddConcurrently()!
    // We're showing the Chroma-like method here, but more Go-idiomatic methods are also available!
    _ = collection.Add(ctx,
        []string{"doc1", "doc2"}, // unique ID for each doc
        nil, // We handle embedding automatically. You can skip that and add your own embeddings as well.
        []map[string]string{{"source": "notion"}, {"source": "google-docs"}}, // Filter on these!
        []string{"This is document1", "This is document2"},
    )

    // Query/search 2 most similar results. You can also get by ID.
    results, _ := collection.Query(ctx,
        "This is a query document",
        2,
        map[string]string{"metadata_field": "is_equal_to_this"}, // optional filter
        map[string]string{"$contains": "search_string"},         // optional filter
    )
    fmt.Println(results)
}

Initially chromem-go started with just the four core methods, but we added more over time. We intentionally don't aim to cover 100% of Chroma's API surface; instead we provide some alternative, more Go-idiomatic methods.

For the full interface see the Godoc: https://pkg.go.dev/github.com/TIANLI0/chromem-go

Differences vs upstream (after f63964a)

Compared to the upstream baseline at commit f63964a64bf64b261f665dd45f92cafadcb0b972, this fork currently includes:

  • Query execution internals reworked for lower latency:
    • chunk-based worker scheduling
    • per-worker top-k aggregation before final merge (less contention)
    • tuned concurrency heuristics based on document count and vector dimension
    • cached document snapshots to reduce lock contention under high query concurrency
    • pooled filtered-document slices to reduce query-time allocations
  • Runtime tuning knobs for query behavior:
    • CHROMEM_QUERY_SMALL_DOCS_THRESHOLD
    • CHROMEM_QUERY_SEQUENTIAL_DOCS_THRESHOLD
    • CHROMEM_QUERY_HIGH_DIM_THRESHOLD
    • CHROMEM_QUERY_HIGH_DIM_CONCURRENCY_DIVISOR
    • CHROMEM_QUERY_MAX_CONCURRENCY (hard cap; 0 disables cap)
    • plus matching setter APIs (SetQuery...)
  • Optional SIMD path for dot product (amd64 + GOEXPERIMENT=simd) with runtime threshold control:
    • env var: CHROMEM_SIMD_MIN_LENGTH
    • API: SetSIMDMinLength()
  • Additional retrieval/index modes configurable at runtime:
    • hnsw (default ANN)
    • ivf
    • pq
    • ivfpq
    • bm25 (lexical)
    • hybrid (vector + BM25 rerank)
  • HNSW internals reworked for better hot-path behavior:
    • SIMD-aware distance kernel integration
    • allocation reduction with visited/heap pools
    • graph build/search logic split by responsibility for maintainability
  • Collection-level memory observability API:
    • Collection.MemoryStats()
  • Reproducible benchmark workflow and matrix script:
    • benchmark_matrix.ps1
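The chunked scheduling and per-worker top-k aggregation described above follow a common pattern: each worker scores one chunk and keeps only its own k best hits, so the final merge touches at most numWorkers*k candidates instead of every document. A self-contained sketch of the pattern (not this fork's actual code; requires Go 1.21+ for the built-in min):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type hit struct {
	id    int
	score float32
}

// topK sorts hits by descending score and truncates to the k best.
func topK(hits []hit, k int) []hit {
	sort.Slice(hits, func(a, b int) bool { return hits[a].score > hits[b].score })
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits
}

func main() {
	scores := []float32{0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6}
	const k, chunk = 3, 4

	var wg sync.WaitGroup
	partial := make(chan []hit, (len(scores)+chunk-1)/chunk)

	// Chunk-based scheduling: one goroutine per chunk, each aggregating
	// its own top-k so workers never contend on a shared result heap.
	for start := 0; start < len(scores); start += chunk {
		end := min(start+chunk, len(scores))
		wg.Add(1)
		go func(start, end int) {
			defer wg.Done()
			hits := make([]hit, 0, end-start)
			for i := start; i < end; i++ {
				hits = append(hits, hit{id: i, score: scores[i]})
			}
			partial <- topK(hits, k)
		}(start, end)
	}
	wg.Wait()
	close(partial)

	// Final merge over the small per-worker results.
	var merged []hit
	for hits := range partial {
		merged = append(merged, hits...)
	}
	fmt.Println(topK(merged, k)) // [{1 0.9} {5 0.8} {3 0.7}]
}
```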
Performance snapshot vs f63964a

Measured on this machine (windows/amd64, Intel i7-14700F), with:

go test -run=^$ -bench "BenchmarkCollection_Query_NoContent_(1000|5000|25000|100000)$|BenchmarkDotProduct" -benchmem -benchtime=200ms -count=4 ./...

Average query latency (ns/op) improved versus f63964a:

  • 1000 docs: -56.33%
  • 5000 docs: -68.95%
  • 25000 docs: -42.94%
  • 100000 docs: -38.61%

At the same time, memory overhead increased for these scenarios (B/op, allocs/op), which is a deliberate speed/overhead trade-off in the current implementation.

An additional SIMD-only dot-product check on HEAD (-cpu=1, CHROMEM_SIMD_MIN_LENGTH=0) shows the optimized path's gains over the no-SIMD build:

  • size=1024: -35.78%
  • size=1536: -38.79%
  • size=3072: -54.95%

You can reproduce the same comparison by running the benchmark commands in Development, then comparing outputs with benchstat or equivalent summary tooling.

High-concurrency snapshot (1GiB @ 1536 dims)

Measured on this machine (windows/amd64, Intel i7-14700F) with SIMD enabled and ~1GiB embeddings-only corpus:

go test -run ^$ -bench "^BenchmarkCollection_Query_NoContent_1536_Approx1GiB_ParallelLatencyMatrix$" -benchmem -benchtime=1x -count=1

Observed range:

  • workers=1: ~21.9 QPS, p50 ~45.5ms, p95 ~49.5ms
  • workers=4: ~26.5 QPS, p50 ~145.9ms, p95 ~205.0ms
  • workers=8: ~27.5 QPS, p50 ~254.8ms, p95 ~362.0ms
  • workers=16: ~25.9 QPS, p95 ~1010ms
  • workers=32: ~28.2 QPS, p95 ~1304ms

Interpretation: throughput plateaus around 4-8 workers while tail latency rises rapidly beyond that.

Persistent mode comparison (1GiB @ 1536 dims)

Measured on this machine (windows/amd64, Intel i7-14700F):

go test -run '^$' -bench '^BenchmarkCollection_Query_NoContent_1536_Approx1GiB_PersistentModes$' -benchmem -benchtime 1x -count 1 .

Observed query-time results (nResults=10):

  • default_preload: 47.38ms/op, 1.49MB/op, 3,976 allocs/op
  • lazy_payload: 46.80ms/op, 1.71MB/op, 5,933 allocs/op
  • stream_embeddings: 1050.32ms/op, 4.87GB/op, 32,339,674 allocs/op

Interpretation:

  • default_preload and lazy_payload are close for embedding-only workloads (payload is tiny/empty here).
  • stream_embeddings massively reduces resident embedding memory but trades off a lot of query throughput and increases allocations due to per-query disk reads and decode overhead.
  • Recommended production default for balanced performance is still preload (or lazy_payload when content/metadata memory dominates).

Quick mode selection:

  • Lowest query latency / highest throughput: default_preload with PersistentDBOptions{Compress: false}; trade-off: highest resident memory usage
  • Lower startup memory, similar query speed: lazy_payload with PersistentDBOptions{Compress: false, LazyLoadPayload: true}; trade-off: first access to content/metadata may read from disk
  • Minimum resident memory: stream_embeddings with PersistentDBOptions{Compress: false, LazyLoadPayload: true, StreamEmbeddingsOnQuery: true}; trade-off: large query slowdown and much higher per-query allocations

For most production workloads: start with lazy_payload, benchmark with your real data, and only use stream_embeddings when memory pressure is the top priority.

3-step decision flow:

  1. If your dataset comfortably fits RAM, start with default_preload.
  2. If startup memory is high but query latency still matters, switch to lazy_payload.
  3. If RAM is still not enough, move to stream_embeddings and accept lower query throughput.

After choosing a mode, re-run the benchmark with your real filters/content mix and tune query concurrency (CHROMEM_QUERY_MAX_CONCURRENCY) for your latency target.
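The decision flow above takes only a few lines of setup. A sketch of step 2 (lazy_payload), assuming NewPersistentDBWithOptions returns (*chromem.DB, error) like upstream's NewPersistentDB:

```go
package main

import "github.com/TIANLI0/chromem-go"

func main() {
	// lazy_payload: embeddings stay resident, content/metadata load on demand.
	db, err := chromem.NewPersistentDBWithOptions("./data", chromem.PersistentDBOptions{
		Compress:        false,
		LazyLoadPayload: true,
	})
	if err != nil {
		panic(err)
	}
	_ = db
	// If RAM is still not enough, additionally set StreamEmbeddingsOnQuery: true
	// (stream_embeddings) and accept lower query throughput.
}
```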

Features

  • Zero dependencies on third party libraries
  • Embeddable (like SQLite, i.e. no client-server model, no separate DB to maintain)
  • Multithreaded processing (when adding and querying documents), making use of Go's native concurrency features
  • Experimental WebAssembly binding
  • Embedding creators: multiple embedding providers and models supported out-of-the-box
  • Similarity search:
    • Exhaustive nearest neighbor search using cosine similarity (sometimes also called exact search or brute-force search or FLAT index)
    • Approximate nearest neighbor search (ANN)
      • Hierarchical Navigable Small World (HNSW)
      • Inverted File (IVF)
      • Product Quantization (PQ)
      • Inverted File + Product Quantization (IVFPQ)
    • Lexical search with BM25 (CHROMEM_INDEX_TYPE=bm25)
    • Hybrid rerank (vector ANN + BM25, CHROMEM_INDEX_TYPE=hybrid)
  • Filters:
    • Document filters: $contains, $not_contains
    • Metadata filters: Exact matches
  • Storage:
    • In-memory
    • Optional immediate persistence (writes one file for each added collection and document, encoded as gob, optionally gzip-compressed)
    • Backups: Export and import of the entire DB to/from a single file (encoded as gob, optionally gzip-compressed and AES-GCM encrypted)
      • Includes methods for generic io.Writer/io.Reader so you can plug S3 buckets and other blob storage, see examples/s3-export-import for example code
  • Observability:
    • Collection memory stats (Collection.MemoryStats())
  • Data types:
    • Documents (text)
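For intuition on the exhaustive (FLAT) search path listed above: cosine similarity is the dot product of two vectors divided by the product of their norms, and for pre-normalized vectors it reduces to a plain dot product, which is what the SIMD kernel accelerates. A self-contained sketch:

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity computes the cosine of the angle between a and b.
func cosineSimilarity(a, b []float32) float32 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	return float32(dot / (math.Sqrt(na) * math.Sqrt(nb)))
}

func main() {
	q := []float32{1, 0}
	docs := [][]float32{{1, 0}, {0, 1}, {0.7, 0.7}}
	// Exhaustive search scores the query against every document embedding.
	for i, d := range docs {
		fmt.Printf("doc%d: %.3f\n", i, cosineSimilarity(q, d))
	}
	// doc0: 1.000, doc1: 0.000, doc2: 0.707
}
```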
Roadmap
  • Performance:
    • Further tune SIMD thresholds and defaults across CPU architectures
    • Add roaring bitmaps to speed up full text filtering
  • Embedding creators:
    • Add an EmbeddingFunc that downloads and shells out to llamafile
  • Similarity search:
    • Continue improving ANN recall/latency trade-offs and defaults per workload
    • Add more benchmark presets for hybrid (vector + lexical) workloads
  • Filters:
    • Operators ($and, $or etc.)
  • Storage:
    • JSON as second encoding format
    • Write-ahead log (WAL) as second file format
    • Optional remote storage (S3, PostgreSQL, ...)
  • Data types:
    • Images
    • Videos

Installation

go get github.com/TIANLI0/chromem-go@latest

If you want the original upstream instead of this fork, use:

go get github.com/philippgille/chromem-go@latest

Usage

See the Godoc for a reference: https://pkg.go.dev/github.com/TIANLI0/chromem-go

For full, working examples that use the vector database for retrieval augmented generation (RAG) and semantic search, with either OpenAI or locally running models (embedding model and LLM via Ollama), see the example code.

Quickstart

This is taken from the "minimal" example:

package main

import (
  "context"
  "fmt"
  "runtime"

  "github.com/TIANLI0/chromem-go"
)

func main() {
  ctx := context.Background()

  db := chromem.NewDB()

  // Passing nil as embedding function leads to OpenAI being used and requires
  // "OPENAI_API_KEY" env var to be set. Other providers are supported as well.
  // For example pass `chromem.NewEmbeddingFuncOllama(...)` to use Ollama.
  c, err := db.CreateCollection("knowledge-base", nil, nil)
  if err != nil {
    panic(err)
  }

  err = c.AddDocuments(ctx, []chromem.Document{
    {
      ID:      "1",
      Content: "The sky is blue because of Rayleigh scattering.",
    },
    {
      ID:      "2",
      Content: "Leaves are green because chlorophyll absorbs red and blue light.",
    },
  }, runtime.NumCPU())
  if err != nil {
    panic(err)
  }

  res, err := c.Query(ctx, "Why is the sky blue?", 1, nil, nil)
  if err != nil {
    panic(err)
  }

  fmt.Printf("ID: %v\nSimilarity: %v\nContent: %v\n", res[0].ID, res[0].Similarity, res[0].Content)
}

Output:

ID: 1
Similarity: 0.6833369
Content: The sky is blue because of Rayleigh scattering.

Benchmarks

Benchmarked on 2024-03-17 with:

  • Computer: Framework Laptop 13 (first generation, 2021)
  • CPU: 11th Gen Intel Core i5-1135G7 (2020)
  • Memory: 32 GB
  • OS: Fedora Linux 39
    • Kernel: 6.7
$ go test -benchmem -run=^$ -bench .
goos: linux
goarch: amd64
pkg: github.com/philippgille/chromem-go
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkCollection_Query_NoContent_100-8          13164      90276 ns/op     5176 B/op       95 allocs/op
BenchmarkCollection_Query_NoContent_1000-8          2142     520261 ns/op    13558 B/op      141 allocs/op
BenchmarkCollection_Query_NoContent_5000-8           561    2150354 ns/op    47096 B/op      173 allocs/op
BenchmarkCollection_Query_NoContent_25000-8          120    9890177 ns/op   211783 B/op      208 allocs/op
BenchmarkCollection_Query_NoContent_100000-8          30   39574238 ns/op   810370 B/op      232 allocs/op
BenchmarkCollection_Query_100-8                    13225      91058 ns/op     5177 B/op       95 allocs/op
BenchmarkCollection_Query_1000-8                    2226     519693 ns/op    13552 B/op      140 allocs/op
BenchmarkCollection_Query_5000-8                     550    2128121 ns/op    47108 B/op      173 allocs/op
BenchmarkCollection_Query_25000-8                    100   10063260 ns/op   211705 B/op      205 allocs/op
BenchmarkCollection_Query_100000-8                    30   39404005 ns/op   810295 B/op      229 allocs/op
PASS
ok   github.com/philippgille/chromem-go 28.402s

Development

  • Build: go build ./...
  • Test: go test -v -race -count 1 ./...
  • Benchmark:
    • go test -benchmem -run=^$ -bench . (add > bench.out or similar to write to a file)
    • With profiling: go test -benchmem -run ^$ -cpuprofile cpu.out -bench .
      • (profiles: -cpuprofile, -memprofile, -blockprofile, -mutexprofile)
  • Compare benchmarks:
    1. Install benchstat: go install golang.org/x/perf/cmd/benchstat@latest
    2. Compare two benchmark results: benchstat before.out after.out
Performance tuning (SIMD + concurrency)

The query path supports a SIMD-optimized dot product (Go GOEXPERIMENT=simd, AMD64) and adaptive multi-threaded scheduling.

  • SIMD can be enabled for benchmarking via GOEXPERIMENT=simd.
  • The runtime threshold for switching from scalar to SIMD dot product can be configured with env var CHROMEM_SIMD_MIN_LENGTH or programmatically with chromem.SetSIMDMinLength().
  • The default threshold is 1536.
  • Query concurrency can be tuned with:
    • CHROMEM_QUERY_SMALL_DOCS_THRESHOLD
    • CHROMEM_QUERY_SEQUENTIAL_DOCS_THRESHOLD
    • CHROMEM_QUERY_HIGH_DIM_THRESHOLD
    • CHROMEM_QUERY_HIGH_DIM_CONCURRENCY_DIVISOR
    • CHROMEM_QUERY_MAX_CONCURRENCY (0 means no hard cap)
    • equivalent APIs: SetQuerySmallDocsThreshold, SetQuerySequentialDocsThreshold, SetQueryHighDimThreshold, SetQueryHighDimConcurrencyDivisor, SetQueryMaxConcurrency
  • HNSW index behavior can be tuned with:
    • CHROMEM_HNSW_ENABLED (true/false, default true)
    • CHROMEM_HNSW_M (default 16)
    • CHROMEM_HNSW_EF_CONSTRUCTION (default 200)
    • CHROMEM_HNSW_EF_SEARCH (default 200)
    • CHROMEM_HNSW_EXACT_RERANK_TOPN (default 0, disabled)
    • CHROMEM_HNSW_TOMBSTONE_REBUILD_RATIO (default 0.2, 0 disables auto-compaction)
    • CHROMEM_HNSW_TOMBSTONE_REBUILD_MIN_DELETED (default 2048)
    • equivalent APIs: SetHNSWEnabled, SetHNSWM, SetHNSWEFConstruction, SetHNSWEFSearch, SetHNSWExactRerankTopN, SetHNSWTombstoneRebuildRatio, SetHNSWTombstoneRebuildMinDeleted
  • ANN / lexical mode selection can be tuned with env vars:
    • CHROMEM_INDEX_TYPE (hnsw | ivf | pq | ivfpq | bm25 | hybrid)
    • IVF: CHROMEM_IVF_NLIST, CHROMEM_IVF_NPROBE
    • PQ: CHROMEM_PQ_M, CHROMEM_PQ_NBITS
    • IVFPQ: CHROMEM_IVFPQ_NLIST, CHROMEM_IVFPQ_NPROBE, CHROMEM_IVFPQ_M, CHROMEM_IVFPQ_NBITS
    • bm25 / hybrid require query text input (e.g. via QueryWithOptions{QueryText: ...})
  • Persistent DB startup memory can be reduced with NewPersistentDBWithOptions(..., PersistentDBOptions{LazyLoadPayload: true}), which keeps embeddings in memory and loads content/metadata on demand.
  • For an ultra-low-memory mode, set StreamEmbeddingsOnQuery: true to stream embeddings from disk during query instead of keeping them resident.
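For intuition on what the bm25 and hybrid modes rank by, here is the standard Okapi BM25 formula in miniature (a generic sketch, not this fork's implementation; k1 and b are the usual free parameters):

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

const (
	k1 = 1.2  // term-frequency saturation
	b  = 0.75 // document-length normalization
)

// bm25 scores a query against whitespace-tokenized documents.
func bm25(query string, docs []string) []float64 {
	tokenized := make([][]string, len(docs))
	df := map[string]int{} // document frequency per term
	var totalLen float64
	for i, d := range docs {
		tokenized[i] = strings.Fields(strings.ToLower(d))
		totalLen += float64(len(tokenized[i]))
		seen := map[string]bool{}
		for _, t := range tokenized[i] {
			if !seen[t] {
				df[t]++
				seen[t] = true
			}
		}
	}
	avgLen := totalLen / float64(len(docs))

	scores := make([]float64, len(docs))
	for i, toks := range tokenized {
		tf := map[string]float64{}
		for _, t := range toks {
			tf[t]++
		}
		for _, q := range strings.Fields(strings.ToLower(query)) {
			n := float64(df[q])
			if n == 0 {
				continue
			}
			idf := math.Log(1 + (float64(len(docs))-n+0.5)/(n+0.5))
			f := tf[q]
			scores[i] += idf * f * (k1 + 1) / (f + k1*(1-b+b*float64(len(toks))/avgLen))
		}
	}
	return scores
}

func main() {
	docs := []string{"the sky is blue", "leaves are green", "blue blue sky"}
	// doc2 mentions "blue" twice and is shortest, so it ranks highest;
	// doc1 shares no query terms and scores zero.
	fmt.Println(bm25("blue sky", docs))
}
```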

Based on benchmark runs on Intel i7-14700F:

  • Dot product (optimized) is significantly faster for vectors >= 1536 dimensions with the current defaults.
  • End-to-end query performance also improves, but gains are smaller than raw dot-product gains because filtering, heap maintenance, scheduling, and memory bandwidth become dominant.
  • A practical default is CHROMEM_SIMD_MIN_LENGTH=1536 for balanced single-core and multi-core performance on this hardware.

For 1GiB / 1536-dim workloads, prefer running query concurrency around 4-8 workers for best throughput/latency tradeoff.

The following presets are good starting points for library users. Keep CHROMEM_SIMD_MIN_LENGTH=1536 unless your benchmarks show a better value.

  • Large persistent datasets with limited RAM:

    • use NewPersistentDBWithOptions(path, chromem.PersistentDBOptions{Compress: false, LazyLoadPayload: true})
    • pair with CHROMEM_QUERY_MAX_CONCURRENCY=4..8 based on your latency budget
  • Ultra-low-memory mode (accept lower query throughput):

    • use NewPersistentDBWithOptions(path, chromem.PersistentDBOptions{Compress: false, LazyLoadPayload: true, StreamEmbeddingsOnQuery: true})
    • useful when dataset size exceeds available RAM
  • Low-latency API (stable p95/p99):

    • CHROMEM_QUERY_MAX_CONCURRENCY=4
    • CHROMEM_QUERY_HIGH_DIM_CONCURRENCY_DIVISOR=2
  • Throughput-oriented batch/service:

    • CHROMEM_QUERY_MAX_CONCURRENCY=8
    • CHROMEM_QUERY_HIGH_DIM_CONCURRENCY_DIVISOR=2
  • Conservative / unknown hardware:

    • CHROMEM_QUERY_MAX_CONCURRENCY=0 (no hard cap)
    • CHROMEM_QUERY_HIGH_DIM_CONCURRENCY_DIVISOR=2

Programmatic equivalent:

// Example: throughput-oriented profile.
chromem.SetQueryMaxConcurrency(8)
chromem.SetQueryHighDimConcurrencyDivisor(2)

// Example: HNSW tuning profile.
chromem.SetHNSWEnabled(true)
chromem.SetHNSWM(24)
chromem.SetHNSWEFConstruction(256)
chromem.SetHNSWEFSearch(96)
chromem.SetHNSWExactRerankTopN(128)
chromem.SetHNSWTombstoneRebuildRatio(0.2)
chromem.SetHNSWTombstoneRebuildMinDeleted(2048)

Environment variables (no code changes in consuming app):

CHROMEM_SIMD_MIN_LENGTH=1536
CHROMEM_QUERY_MAX_CONCURRENCY=8
CHROMEM_QUERY_HIGH_DIM_CONCURRENCY_DIVISOR=2
CHROMEM_INDEX_TYPE=hnsw
CHROMEM_IVF_NLIST=64
CHROMEM_IVF_NPROBE=8
CHROMEM_PQ_M=8
CHROMEM_PQ_NBITS=8
CHROMEM_IVFPQ_NLIST=64
CHROMEM_IVFPQ_NPROBE=8
CHROMEM_IVFPQ_M=8
CHROMEM_IVFPQ_NBITS=8
CHROMEM_HNSW_ENABLED=true
CHROMEM_HNSW_M=16
CHROMEM_HNSW_EF_CONSTRUCTION=200
CHROMEM_HNSW_EF_SEARCH=200
CHROMEM_HNSW_EXACT_RERANK_TOPN=0
CHROMEM_HNSW_TOMBSTONE_REBUILD_RATIO=0.2
CHROMEM_HNSW_TOMBSTONE_REBUILD_MIN_DELETED=2048
HNSW mutation strategy notes
  • AddDocument append path uses incremental HNSW insert (copy-on-write index replacement).
  • AddDocument overwrite path uses incremental upsert (old node tombstoned, new node inserted).
  • Delete path uses incremental tombstone marking (no immediate global rebuild).
  • Query path lazily compacts (rebuilds) the graph only when tombstone thresholds are exceeded.
  • This keeps write-path latency low while recovering long-term graph quality under heavy mutations.
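The lazy compaction trigger combines the two thresholds above (deleted-node ratio and minimum deleted count); conceptually it behaves like this sketch (illustrative, not the actual internals; defaults 0.2 and 2048 per the env vars above):

```go
package main

import "fmt"

// shouldCompact mirrors the documented trigger: ratio-based compaction,
// gated by a minimum number of tombstoned nodes. ratio <= 0 disables it.
func shouldCompact(deleted, total int, ratio float64, minDeleted int) bool {
	if ratio <= 0 || deleted < minDeleted {
		return false
	}
	return float64(deleted)/float64(total) >= ratio
}

func main() {
	fmt.Println(shouldCompact(3000, 10000, 0.2, 2048))  // true: 30% deleted, min met
	fmt.Println(shouldCompact(1000, 10000, 0.2, 2048))  // false: below min deleted
	fmt.Println(shouldCompact(2048, 20000, 0.2, 2048))  // false: ratio not reached
}
```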

HNSW enabled vs disabled benchmark comparison:

go test -run '^$' -bench '^BenchmarkCollection_Query_NoContent_1536_100k_HNSWToggle$' -benchmem -benchtime=1x -count=5 > hnsw_toggle.out

Then compare with other runs using benchstat.

HNSW Recall@K (vs brute-force ground truth) + performance comparison:

go test -run '^$' -bench '^BenchmarkCollection_Query_NoContent_1536_100k_HNSWRecallAt10$' -benchmem -benchtime=1x -count=3 > hnsw_recall.out

The benchmark reports recall_at_10 for both sub-benchmarks (hnsw_enabled, bruteforce) so you can evaluate accuracy and speed together.
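Recall@K here is the fraction of the brute-force top-K that the ANN top-K also returns; the metric itself is tiny (a generic sketch):

```go
package main

import "fmt"

// recallAtK returns |ann ∩ truth| / K, where truth is the brute-force top-K.
func recallAtK(ann, truth []string) float64 {
	truthSet := make(map[string]bool, len(truth))
	for _, id := range truth {
		truthSet[id] = true
	}
	hits := 0
	for _, id := range ann {
		if truthSet[id] {
			hits++
		}
	}
	return float64(hits) / float64(len(truth))
}

func main() {
	truth := []string{"a", "b", "c", "d"} // brute-force ground truth
	ann := []string{"a", "c", "x", "d"}   // ANN result with one miss
	fmt.Println(recallAtK(ann, truth))    // 0.75
}
```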

Reproducible matrix benchmark

Use the included PowerShell script to benchmark baseline vs SIMD across CPU sets and thresholds:

powershell -ExecutionPolicy Bypass -File .\benchmark_matrix.ps1

This runs:

  • baseline (no SIMD)
  • SIMD with thresholds 0, 1024, 1536
  • CPU sets 1 and 8
  • benchstat comparisons for each pair

Results are written to bench-results/run-<timestamp>/compare-*.txt.

If your benchmark outputs were created with an incompatible encoding, regenerate compare files with:

powershell -ExecutionPolicy Bypass -File .\rebuild_compare.ps1 -RunDir .\bench-results\run-<timestamp>

Motivation

In December 2023, when I wanted to play around with retrieval augmented generation (RAG) in a Go program, I looked for a vector database that could be embedded in the Go program, just like you would embed SQLite in order to not require any separate DB setup and maintenance. I was surprised when I didn't find any, given the abundance of embedded key-value stores in the Go ecosystem.

At the time most of the popular vector databases like Pinecone, Qdrant, Milvus, Chroma, Weaviate and others were not embeddable at all or only in Python or JavaScript/TypeScript.

Then I found @eliben's blog post and example code which showed that with very little Go code you could create a very basic PoC of a vector database.

That's when I decided to build my own vector database, embeddable in Go, inspired by the ChromaDB interface. ChromaDB stood out for being embeddable (in Python), and by showing its core API in 4 commands on their README and on the landing page of their website.

  • Shoutout to @eliben whose blog post and example code inspired me to start this project!
  • Chroma: Looking at Pinecone, Qdrant, Milvus, Weaviate and others, Chroma stood out by showing its core API in 4 commands on their README and on the landing page of their website. It was also putting the most emphasis on its embeddability (in Python).
  • The big, full-fledged client-server-based vector databases for maximum scale and performance:
    • Pinecone: Closed source
    • Qdrant: Written in Rust, not embeddable in Go
    • Milvus: Written in Go and C++, but not embeddable as of December 2023
    • Weaviate: Written in Go, but not embeddable in Go as of March 2024 (only in Python and JavaScript/TypeScript and that's experimental)
  • Some non-specialized SQL, NoSQL and Key-Value databases added support for storing vectors and (some of them) querying based on similarity:
    • pgvector extension for PostgreSQL: Client-server model
    • Redis (1, 2): Client-server model
    • sqlite-vss extension for SQLite: Embedded, but the Go bindings require CGO. There's a CGO-free Go library for SQLite, but then it's without the vector search extension.
    • DuckDB has a function to calculate cosine similarity (1): Embedded, but the Go bindings use CGO
    • MongoDB's cloud platform offers a vector search product (1): Client-server model
  • Some libraries for vector similarity search:
    • Faiss: Written in C++; 3rd party Go bindings use CGO
    • Annoy: Written in C++; Go bindings use CGO (1)
    • USearch: Written in C++; Go bindings use CGO
  • Some orchestration libraries, inspired by the Python library LangChain, but with no or only rudimentary embedded vector DB

Documentation

Index

Constants

const (
	InputTypeCohereSearchDocumentPrefix string = "search_document: "
	InputTypeCohereSearchQueryPrefix    string = "search_query: "
	InputTypeCohereClassificationPrefix string = "classification: "
	InputTypeCohereClusteringPrefix     string = "clustering: "
)

Prefixes for external use.

const BaseURLOpenAI = "https://api.openai.com/v1"

Variables

This section is empty.

Functions

func SetHNSWEFConstruction added in v0.10.0

func SetHNSWEFConstruction(ef int)

func SetHNSWEFSearch added in v0.10.0

func SetHNSWEFSearch(ef int)

func SetHNSWEnabled added in v0.10.0

func SetHNSWEnabled(enabled bool)

func SetHNSWExactRerankTopN added in v0.10.0

func SetHNSWExactRerankTopN(topN int)

SetHNSWExactRerankTopN sets two-stage query reranking candidate count.

0  disables exact reranking.
>0 enables exact reranking on up to top-N ANN candidates.

func SetHNSWM added in v0.10.0

func SetHNSWM(m int)

func SetHNSWTombstoneRebuildMinDeleted added in v0.10.0

func SetHNSWTombstoneRebuildMinDeleted(minDeleted int)

SetHNSWTombstoneRebuildMinDeleted sets the minimum number of deleted (tombstoned) nodes required before ratio-based compaction can trigger. Values < 0 are ignored.

func SetHNSWTombstoneRebuildRatio added in v0.10.0

func SetHNSWTombstoneRebuildRatio(ratio float64)

SetHNSWTombstoneRebuildRatio sets the deleted-node ratio threshold that triggers a graph compaction rebuild. Values outside [0,1] are ignored.

0   disables ratio-based compaction trigger.
>0  enables trigger when deleted/total >= ratio and min deleted threshold is met.

func SetQueryHighDimConcurrencyDivisor

func SetQueryHighDimConcurrencyDivisor(divisor int)

SetQueryHighDimConcurrencyDivisor sets the divisor that reduces query concurrency for high-dimensional embeddings. Values < 1 are ignored.

func SetQueryHighDimThreshold

func SetQueryHighDimThreshold(threshold int)

SetQueryHighDimThreshold sets the embedding dimension threshold above which query concurrency is reduced. Values < 0 are ignored.

func SetQueryMaxConcurrency added in v0.10.0

func SetQueryMaxConcurrency(maxConcurrency int)

SetQueryMaxConcurrency sets a hard upper bound for query workers. Set to 0 to disable the cap. Values < 0 are ignored.

func SetQuerySequentialDocsThreshold

func SetQuerySequentialDocsThreshold(threshold int)

SetQuerySequentialDocsThreshold sets the docs threshold below which query and filter paths run sequentially. Values < 0 are ignored.

func SetQuerySmallDocsThreshold

func SetQuerySmallDocsThreshold(threshold int)

SetQuerySmallDocsThreshold sets the docs threshold below which query workers scale to runtime.NumCPU(). Values < 0 are ignored.

func SetSIMDMinLength

func SetSIMDMinLength(minLen int)

SetSIMDMinLength sets the minimum vector length at which SIMD is used. Values < 0 are ignored.

Types

type Collection

type Collection struct {
	Name string
	// contains filtered or unexported fields
}

Collection represents a collection of documents. It also has a configured embedding function, which is used when adding documents that don't have embeddings yet.

func (*Collection) Add

func (c *Collection) Add(ctx context.Context, ids []string, embeddings [][]float32, metadatas []map[string]string, contents []string) error

Add embeddings to the datastore.

  • ids: The ids of the embeddings you wish to add
  • embeddings: The embeddings to add. If nil, embeddings will be computed based on the contents using the embeddingFunc set for the Collection. Optional.
  • metadatas: The metadata to associate with the embeddings. When querying, you can filter on this metadata. Optional.
  • contents: The contents to associate with the embeddings.

This is a Chroma-like method. For a more Go-idiomatic one, see Collection.AddDocuments.

func (*Collection) AddConcurrently

func (c *Collection) AddConcurrently(ctx context.Context, ids []string, embeddings [][]float32, metadatas []map[string]string, contents []string, concurrency int) error

AddConcurrently is like Add, but adds embeddings concurrently. This is mostly useful when you don't pass any embeddings, so they have to be created. Upon error, concurrently running operations are canceled and the error is returned.

This is a Chroma-like method. For a more Go-idiomatic one, see Collection.AddDocuments.

func (*Collection) AddDocument

func (c *Collection) AddDocument(ctx context.Context, doc Document) error

AddDocument adds a document to the collection. If the document doesn't have an embedding, it will be created using the collection's embedding function.

func (*Collection) AddDocuments

func (c *Collection) AddDocuments(ctx context.Context, documents []Document, concurrency int) error

AddDocuments adds documents to the collection with the specified concurrency. If the documents don't have embeddings, they will be created using the collection's embedding function. Upon error, concurrently running operations are canceled and the error is returned.

func (*Collection) Count

func (c *Collection) Count() int

Count returns the number of documents in the collection.

func (*Collection) Delete

func (c *Collection) Delete(_ context.Context, where, whereDocument map[string]string, ids ...string) error

Delete removes document(s) from the collection.

  • where: Conditional filtering on metadata. Optional.
  • whereDocument: Conditional filtering on documents. Optional.
  • ids: The ids of the documents to delete. If empty, all documents are deleted.

func (*Collection) GetByID

func (c *Collection) GetByID(_ context.Context, id string) (Document, error)

GetByID returns a document by its ID. The returned document is a copy of the original document, so it can be safely modified without affecting the collection.

func (*Collection) GetByMetadata

func (c *Collection) GetByMetadata(_ context.Context, where map[string]string) ([]*Document, error)

GetByMetadata returns a set of documents, filtered by their metadata. The metadata tags must match the params specified in the where argument in both key and value. The returned documents are deep copies of the originals, so they can be safely modified without affecting the collection.

func (*Collection) ListDocuments

func (c *Collection) ListDocuments(_ context.Context) ([]*Document, error)

ListDocuments returns all documents in the collection. The returned documents are a deep copy of the original ones, so you can modify them without affecting the collection.

func (*Collection) ListDocumentsPartial

func (c *Collection) ListDocumentsPartial(_ context.Context) ([]*Document, error)

ListDocumentsPartial returns a partial version of all documents in the collection, containing only the ID and content, but not the embedding or metadata values.

func (*Collection) ListDocumentsShallow

func (c *Collection) ListDocumentsShallow(_ context.Context) ([]*Document, error)

ListDocumentsShallow returns all documents in the collection. The returned documents' metadata and embeddings point to the original data, so modifying them will be reflected in the collection.

func (*Collection) ListIDs

func (c *Collection) ListIDs(_ context.Context) []string

ListIDs returns the IDs of all documents in the collection.

func (*Collection) MemoryStats added in v0.9.0

func (c *Collection) MemoryStats() CollectionMemoryStats

MemoryStats returns an approximate memory footprint for collection data and selected runtime memory counters.

func (*Collection) Query

func (c *Collection) Query(ctx context.Context, queryText string, nResults int, where, whereDocument map[string]string) ([]Result, error)

Query performs an exhaustive nearest neighbor search on the collection.

  • queryText: The text to search for. Its embedding will be created using the collection's embedding function.
  • nResults: The maximum number of results to return. Must be > 0. There can be fewer results if a filter is applied.
  • where: Conditional filtering on metadata. Optional.
  • whereDocument: Conditional filtering on documents. Optional.

func (*Collection) QueryEmbedding

func (c *Collection) QueryEmbedding(ctx context.Context, queryEmbedding []float32, nResults int, where, whereDocument map[string]string) ([]Result, error)

QueryEmbedding performs an exhaustive nearest neighbor search on the collection.

  • queryEmbedding: The embedding of the query to search for. It must be created with the same embedding model as the document embeddings in the collection. The embedding will be normalized if it's not the case yet.
  • nResults: The maximum number of results to return. Must be > 0. There can be fewer results if a filter is applied.
  • where: Conditional filtering on metadata. Optional.
  • whereDocument: Conditional filtering on documents. Optional.

func (*Collection) QueryWithOptions

func (c *Collection) QueryWithOptions(ctx context.Context, options QueryOptions) ([]Result, error)

QueryWithOptions performs an exhaustive nearest neighbor search on the collection.

  • options: The options for the query. See QueryOptions for more information.

type CollectionMemoryStats added in v0.9.0

type CollectionMemoryStats struct {
	DocumentCount int

	ApproxEmbeddingBytes uint64
	ApproxMetadataBytes  uint64
	ApproxContentBytes   uint64
	EstimatedTotalBytes  uint64

	RuntimeAllocBytes  uint64
	RuntimeHeapInuse   uint64
	RuntimeHeapObjects uint64
	RuntimeNumGC       uint32
}

CollectionMemoryStats provides memory observations for a collection and the current process.

type DB

type DB struct {
	// contains filtered or unexported fields
}

DB is the chromem-go database. It holds collections, which hold documents.

+----+    1-n    +------------+    1-n    +----------+
| DB |-----------| Collection |-----------| Document |
+----+           +------------+           +----------+

func NewDB

func NewDB() *DB

NewDB creates a new in-memory chromem-go DB. While it doesn't write files when you add collections and documents, you can still use DB.Export and DB.Import to export and import the entire DB from a file.

func NewPersistentDB

func NewPersistentDB(path string, compress bool) (*DB, error)

NewPersistentDB creates a new persistent chromem-go DB. If the path is empty, it defaults to "./chromem-go". If compress is true, the files are compressed with gzip.

The persistence covers the collections (including their documents) and the metadata. However, it doesn't cover the EmbeddingFunc, as functions can't be serialized. When data has been persisted and you create a new persistent DB with the same path, you have to provide the same EmbeddingFunc as before when getting an existing collection and adding more documents to it.

Currently, the persistence is done synchronously on each write operation, and each document addition leads to a new file, encoded as gob. In the future we will make this configurable (encoding, async writes, WAL-based writes, etc.).

In addition to persistence for each added collection and document you can use DB.ExportToFile / DB.ExportToWriter and DB.ImportFromFile / DB.ImportFromReader to export and import the entire DB to/from a file or writer/reader, which also works for the pure in-memory DB.

func NewPersistentDBWithOptions added in v0.10.0

func NewPersistentDBWithOptions(path string, options PersistentDBOptions) (*DB, error)

NewPersistentDBWithOptions creates a new persistent chromem-go DB with optional memory-saving behavior.

If LazyLoadPayload is true, only document embeddings are loaded at startup; document metadata/content are loaded on demand when needed.

func (*DB) CreateCollection

func (db *DB) CreateCollection(name string, metadata map[string]string, embeddingFunc EmbeddingFunc) (*Collection, error)

CreateCollection creates a new collection with the given name and metadata.

  • name: The name of the collection to create.
  • metadata: Optional metadata to associate with the collection.
  • embeddingFunc: Optional function to use to embed documents. Uses the default embedding function if not provided.

func (*DB) DeleteCollection

func (db *DB) DeleteCollection(name string) error

DeleteCollection deletes the collection with the given name. If the collection doesn't exist, this is a no-op. If the DB is persistent, it also removes the collection's directory. You shouldn't hold any references to the collection after calling this method.

func (*DB) Export deprecated

func (db *DB) Export(filePath string, compress bool, encryptionKey string) error

Export exports the DB to a file at the given path. The file is encoded as gob, optionally compressed with flate (as gzip) and optionally encrypted with AES-GCM. This works for both the in-memory and persistent DBs. If the file exists, it's overwritten, otherwise created.

  • filePath: If empty, it defaults to "./chromem-go.gob" (+ ".gz" + ".enc")
  • compress: Optional. Compresses as gzip if true.
  • encryptionKey: Optional. Encrypts with AES-GCM if provided. Must be 32 bytes long if provided.

Deprecated: Use DB.ExportToFile instead.

func (*DB) ExportToFile

func (db *DB) ExportToFile(filePath string, compress bool, encryptionKey string, collections ...string) error

ExportToFile exports the DB to a file at the given path. The file is encoded as gob, optionally compressed with flate (as gzip) and optionally encrypted with AES-GCM. This works for both the in-memory and persistent DBs. If the file exists, it's overwritten, otherwise created.

  • filePath: If empty, it defaults to "./chromem-go.gob" (+ ".gz" + ".enc")
  • compress: Optional. Compresses as gzip if true.
  • encryptionKey: Optional. Encrypts with AES-GCM if provided. Must be 32 bytes long if provided.
  • collections: Optional. If provided, only the collections with the given names are exported. Non-existing collections are ignored. If not provided, all collections are exported.

func (*DB) ExportToWriter

func (db *DB) ExportToWriter(writer io.Writer, compress bool, encryptionKey string, collections ...string) error

ExportToWriter exports the DB to a writer. The stream is encoded as gob, optionally compressed with flate (as gzip) and optionally encrypted with AES-GCM. This works for both the in-memory and persistent DBs. If the writer has to be closed, it's the caller's responsibility. This can be used to export DBs to object storage like S3. See https://github.com/TIANLI0/chromem-go/tree/main/examples/s3-export-import for an example.

  • writer: An implementation of io.Writer
  • compress: Optional. Compresses as gzip if true.
  • encryptionKey: Optional. Encrypts with AES-GCM if provided. Must be 32 bytes long if provided.
  • collections: Optional. If provided, only the collections with the given names are exported. Non-existing collections are ignored. If not provided, all collections are exported.

func (*DB) GetCollection

func (db *DB) GetCollection(name string, embeddingFunc EmbeddingFunc) *Collection

GetCollection returns the collection with the given name. The embeddingFunc param is only used if the DB is persistent and was just loaded from storage, in which case no embedding func is set yet (funcs are not (de-)serializable). It can be nil, in which case the default one will be used. The returned collection is a reference to the original collection, so any methods on the collection like Add() will be reflected on the DB's collection. Those operations are concurrency-safe. If the collection doesn't exist, this returns nil.

func (*DB) GetOrCreateCollection

func (db *DB) GetOrCreateCollection(name string, metadata map[string]string, embeddingFunc EmbeddingFunc) (*Collection, error)

GetOrCreateCollection returns the collection with the given name if it exists in the DB, or otherwise creates it. When creating:

  • name: The name of the collection to create.
  • metadata: Optional metadata to associate with the collection.
  • embeddingFunc: Optional function to use to embed documents. Uses the default embedding function if not provided.

func (*DB) Import deprecated

func (db *DB) Import(filePath string, encryptionKey string) error

Import imports the DB from a file at the given path. The file must be encoded as gob and can optionally be compressed with flate (as gzip) and encrypted with AES-GCM. This works for both the in-memory and persistent DBs. Existing collections are overwritten.

  • filePath: Mandatory, must not be empty
  • encryptionKey: Optional, must be 32 bytes long if provided

Deprecated: Use DB.ImportFromFile instead.

func (*DB) ImportFromFile

func (db *DB) ImportFromFile(filePath string, encryptionKey string, collections ...string) error

ImportFromFile imports the DB from a file at the given path. The file must be encoded as gob and can optionally be compressed with flate (as gzip) and encrypted with AES-GCM. This works for both the in-memory and persistent DBs. Existing collections are overwritten.

  • filePath: Mandatory, must not be empty
  • encryptionKey: Optional, must be 32 bytes long if provided
  • collections: Optional. If provided, only the collections with the given names are imported. Non-existing collections are ignored. If not provided, all collections are imported.

func (*DB) ImportFromReader

func (db *DB) ImportFromReader(reader io.ReadSeeker, encryptionKey string, collections ...string) error

ImportFromReader imports the DB from a reader. The stream must be encoded as gob and can optionally be compressed with flate (as gzip) and encrypted with AES-GCM. This works for both the in-memory and persistent DBs. Existing collections are overwritten. If the reader has to be closed, it's the caller's responsibility. This can be used to import DBs from object storage like S3. See https://github.com/TIANLI0/chromem-go/tree/main/examples/s3-export-import for an example.

  • reader: An implementation of io.ReadSeeker
  • encryptionKey: Optional, must be 32 bytes long if provided
  • collections: Optional. If provided, only the collections with the given names are imported. Non-existing collections are ignored. If not provided, all collections are imported.

func (*DB) ListCollections

func (db *DB) ListCollections() map[string]*Collection

ListCollections returns all collections in the DB, mapping name->Collection. The returned map is a copy of the internal map, so it's safe to directly modify the map itself. Direct modifications of the map won't reflect on the DB's map. To do that use the DB's methods like DB.CreateCollection and DB.DeleteCollection. The map is not an entirely deep clone, so the collections themselves are still the original ones. Any methods on the collections like Add() for adding documents will be reflected on the DB's collections and are concurrency-safe.

func (*DB) Reset

func (db *DB) Reset() error

Reset removes all collections from the DB. If the DB is persistent, it also removes all contents of the DB directory. You shouldn't hold any references to old collections after calling this method.

type Document

type Document struct {
	ID        string
	Metadata  map[string]string
	Embedding []float32
	Content   string
	// contains filtered or unexported fields
}

Document represents a single document.

func NewDocument

func NewDocument(ctx context.Context, id string, metadata map[string]string, embedding []float32, content string, embeddingFunc EmbeddingFunc) (Document, error)

NewDocument creates a new document, including its embeddings. Metadata is optional. If the embeddings are not provided, they are created using the embedding function. You can leave the content empty if you only want to store embeddings. If embeddingFunc is nil, the default embedding function is used.

If you want to create a document without embeddings, for example to let Collection.AddDocuments create them concurrently, you can create a document with `chromem.Document{...}` instead of using this constructor.

type EmbeddingFunc

type EmbeddingFunc func(ctx context.Context, text string) ([]float32, error)

EmbeddingFunc is a function that creates embeddings for a given text. chromem-go will use OpenAI's "text-embedding-3-small" model by default, but you can provide your own function, using any model you like. The function must return a *normalized* vector, i.e. the length of the vector must be 1. OpenAI's and Mistral's embedding models do this by default. Some others, like Nomic's "nomic-embed-text-v1.5", don't.

func NewEmbeddingFuncAzureOpenAI

func NewEmbeddingFuncAzureOpenAI(apiKey string, deploymentURL string, apiVersion string, model string) EmbeddingFunc

NewEmbeddingFuncAzureOpenAI returns a function that creates embeddings for a text using the Azure OpenAI API. The `deploymentURL` is the URL of the deployed model, e.g. "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME". See https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/embeddings?tabs=console#how-to-get-embeddings

func NewEmbeddingFuncCohere

func NewEmbeddingFuncCohere(apiKey string, model EmbeddingModelCohere) EmbeddingFunc

NewEmbeddingFuncCohere returns a function that creates embeddings for a text using Cohere's API. One important difference to OpenAI's and others' APIs is that Cohere differentiates between document embeddings and search/query embeddings. For this embedding func to make that differentiation, you have to prepend the text with either "search_document" or "search_query". We cut off that prefix before sending the document/query body to the API; the prefix is only used to choose the right "input type", as Cohere calls it.

When you set up a chromem-go collection with this embedding function, you might want to create the document separately with NewDocument and then cut off the prefix before adding the document to the collection. Otherwise, when you query the collection, the returned documents will still have the prefix in their content.

cohereFunc := chromem.NewEmbeddingFuncCohere(cohereApiKey, chromem.EmbeddingModelCohereEnglishV3)
content := "The sky is blue because of Rayleigh scattering."
// Create the document with the prefix.
contentWithPrefix := chromem.InputTypeCohereSearchDocumentPrefix + content
doc, _ := NewDocument(ctx, id, metadata, nil, contentWithPrefix, cohereFunc)
// Remove the prefix so that later query results don't have it.
doc.Content = content
_ = collection.AddDocument(ctx, doc)

This is not necessary if you don't keep the content in the documents, as chromem-go also works when documents only have embeddings. You can also keep the prefix in the document, and only remove it after querying.

We plan to improve this in the future.

func NewEmbeddingFuncDefault

func NewEmbeddingFuncDefault() EmbeddingFunc

NewEmbeddingFuncDefault returns a function that creates embeddings for a text using OpenAI's "text-embedding-3-small" model via their API. The model supports a maximum text length of 8191 tokens. The API key is read from the environment variable "OPENAI_API_KEY".

func NewEmbeddingFuncJina

func NewEmbeddingFuncJina(apiKey string, model EmbeddingModelJina) EmbeddingFunc

NewEmbeddingFuncJina returns a function that creates embeddings for a text using the Jina API.

func NewEmbeddingFuncLocalAI

func NewEmbeddingFuncLocalAI(model string) EmbeddingFunc

NewEmbeddingFuncLocalAI returns a function that creates embeddings for a text using the LocalAI API. You can start a LocalAI instance like this:

docker run -it -p 127.0.0.1:8080:8080 localai/localai:v2.7.0-ffmpeg-core bert-cpp

And then call this constructor with model "bert-cpp-minilm-v6". But other embedding models are supported as well. See the LocalAI documentation for details.

func NewEmbeddingFuncMistral

func NewEmbeddingFuncMistral(apiKey string) EmbeddingFunc

NewEmbeddingFuncMistral returns a function that creates embeddings for a text using the Mistral API.

func NewEmbeddingFuncMixedbread

func NewEmbeddingFuncMixedbread(apiKey string, model EmbeddingModelMixedbread) EmbeddingFunc

NewEmbeddingFuncMixedbread returns a function that creates embeddings for a text using the mixedbread.ai API.

func NewEmbeddingFuncOllama

func NewEmbeddingFuncOllama(model string, baseURLOllama string) EmbeddingFunc

NewEmbeddingFuncOllama returns a function that creates embeddings for a text using Ollama's embedding API. You can pass any model that Ollama supports and that supports embeddings. A good one as of 2024-03-02 is "nomic-embed-text"; see https://ollama.com/library/nomic-embed-text. baseURLOllama is the base URL of the Ollama API. If it's empty, "http://localhost:11434/api" is used.

func NewEmbeddingFuncOpenAI

func NewEmbeddingFuncOpenAI(apiKey string, model EmbeddingModelOpenAI) EmbeddingFunc

NewEmbeddingFuncOpenAI returns a function that creates embeddings for a text using the OpenAI API.

func NewEmbeddingFuncOpenAICompat

func NewEmbeddingFuncOpenAICompat(baseURL, apiKey, model string, normalized *bool) EmbeddingFunc

NewEmbeddingFuncOpenAICompat returns a function that creates embeddings for a text using an OpenAI-compatible API.

The `normalized` parameter indicates whether the vectors returned by the embedding model are already normalized, as is the case for OpenAI's and Mistral's models. The flag is optional. If it's nil, it will be autodetected on the first request (which bears a small risk that the vector just happens to have a length of 1).

func NewEmbeddingFuncVertex

func NewEmbeddingFuncVertex(apiKey, project string, model EmbeddingModelVertex, opts ...VertexOption) EmbeddingFunc

NewEmbeddingFuncVertex returns a function that creates embeddings for a text using Google Cloud's Vertex AI API.

type EmbeddingModelCohere

type EmbeddingModelCohere string
const (
	EmbeddingModelCohereMultilingualV2 EmbeddingModelCohere = "embed-multilingual-v2.0"
	EmbeddingModelCohereEnglishLightV2 EmbeddingModelCohere = "embed-english-light-v2.0"
	EmbeddingModelCohereEnglishV2      EmbeddingModelCohere = "embed-english-v2.0"

	EmbeddingModelCohereMultilingualLightV3 EmbeddingModelCohere = "embed-multilingual-light-v3.0"
	EmbeddingModelCohereEnglishLightV3      EmbeddingModelCohere = "embed-english-light-v3.0"
	EmbeddingModelCohereMultilingualV3      EmbeddingModelCohere = "embed-multilingual-v3.0"
	EmbeddingModelCohereEnglishV3           EmbeddingModelCohere = "embed-english-v3.0"
)

type EmbeddingModelJina

type EmbeddingModelJina string
const (
	EmbeddingModelJina2BaseEN EmbeddingModelJina = "jina-embeddings-v2-base-en"
	EmbeddingModelJina2BaseES EmbeddingModelJina = "jina-embeddings-v2-base-es"
	EmbeddingModelJina2BaseDE EmbeddingModelJina = "jina-embeddings-v2-base-de"
	EmbeddingModelJina2BaseZH EmbeddingModelJina = "jina-embeddings-v2-base-zh"

	EmbeddingModelJina2BaseCode EmbeddingModelJina = "jina-embeddings-v2-base-code"

	EmbeddingModelJinaClipV1 EmbeddingModelJina = "jina-clip-v1"
)

type EmbeddingModelMixedbread

type EmbeddingModelMixedbread string
const (
	// Possibly outdated / not available anymore
	EmbeddingModelMixedbreadUAELargeV1 EmbeddingModelMixedbread = "UAE-Large-V1"
	// Possibly outdated / not available anymore
	EmbeddingModelMixedbreadBGELargeENV15 EmbeddingModelMixedbread = "bge-large-en-v1.5"
	// Possibly outdated / not available anymore
	EmbeddingModelMixedbreadGTELarge EmbeddingModelMixedbread = "gte-large"
	// Possibly outdated / not available anymore
	EmbeddingModelMixedbreadE5LargeV2 EmbeddingModelMixedbread = "e5-large-v2"
	// Possibly outdated / not available anymore
	EmbeddingModelMixedbreadMultilingualE5Large EmbeddingModelMixedbread = "multilingual-e5-large"
	// Possibly outdated / not available anymore
	EmbeddingModelMixedbreadMultilingualE5Base EmbeddingModelMixedbread = "multilingual-e5-base"
	// Possibly outdated / not available anymore
	EmbeddingModelMixedbreadAllMiniLML6V2 EmbeddingModelMixedbread = "all-MiniLM-L6-v2"
	// Possibly outdated / not available anymore
	EmbeddingModelMixedbreadGTELargeZh EmbeddingModelMixedbread = "gte-large-zh"

	EmbeddingModelMixedbreadLargeV1          EmbeddingModelMixedbread = "mxbai-embed-large-v1"
	EmbeddingModelMixedbreadDeepsetDELargeV1 EmbeddingModelMixedbread = "deepset-mxbai-embed-de-large-v1"
	EmbeddingModelMixedbread2DLargeV1        EmbeddingModelMixedbread = "mxbai-embed-2d-large-v1"
)

type EmbeddingModelOpenAI

type EmbeddingModelOpenAI string
const (
	EmbeddingModelOpenAI2Ada EmbeddingModelOpenAI = "text-embedding-ada-002"

	EmbeddingModelOpenAI3Small EmbeddingModelOpenAI = "text-embedding-3-small"
	EmbeddingModelOpenAI3Large EmbeddingModelOpenAI = "text-embedding-3-large"
)

type EmbeddingModelVertex

type EmbeddingModelVertex string
const (
	EmbeddingModelVertexEnglishV1 EmbeddingModelVertex = "textembedding-gecko@001"
	EmbeddingModelVertexEnglishV2 EmbeddingModelVertex = "textembedding-gecko@002"
	EmbeddingModelVertexEnglishV3 EmbeddingModelVertex = "textembedding-gecko@003"
	EmbeddingModelVertexEnglishV4 EmbeddingModelVertex = "text-embedding-004"

	EmbeddingModelVertexMultilingualV1 EmbeddingModelVertex = "textembedding-gecko-multilingual@001"
	EmbeddingModelVertexMultilingualV2 EmbeddingModelVertex = "text-multilingual-embedding-002"

	EmbeddingModelVertexGeminiV1 EmbeddingModelVertex = "gemini-embedding-001"
)

type NegativeMode

type NegativeMode string

NegativeMode represents the mode to use for the negative text. See QueryOptions for more information.

const (
	// NEGATIVE_MODE_FILTER filters out results based on the similarity between the
	// negative embedding and the document embeddings.
	// NegativeFilterThreshold controls the threshold for filtering. Documents with
	// similarity above the threshold will be removed from the results.
	NEGATIVE_MODE_FILTER NegativeMode = "filter"

	// NEGATIVE_MODE_SUBTRACT subtracts the negative embedding from the query embedding.
	// This is the default behavior.
	NEGATIVE_MODE_SUBTRACT NegativeMode = "subtract"

	// The default threshold for the negative filter.
	DEFAULT_NEGATIVE_FILTER_THRESHOLD = 0.5
)

type NegativeQueryOptions

type NegativeQueryOptions struct {
	// Mode is the mode to use for the negative text.
	Mode NegativeMode

	// Text is the text to exclude from the results.
	Text string

	// Embedding is the embedding of the negative text. It must be created
	// with the same embedding model as the document embeddings in the collection.
	// The embedding will be normalized if it's not the case yet.
	// If both Text and Embedding are set, Embedding will be used.
	Embedding []float32

	// FilterThreshold is the threshold for the negative filter. Used when Mode is NEGATIVE_MODE_FILTER.
	FilterThreshold float32
}

type PersistentDBOptions added in v0.10.0

type PersistentDBOptions struct {
	Compress                bool
	LazyLoadPayload         bool
	StreamEmbeddingsOnQuery bool
}

PersistentDBOptions configures persistent DB behavior.

type QueryOptions

type QueryOptions struct {
	// The text to search for.
	QueryText string

	// The embedding of the query to search for. It must be created
	// with the same embedding model as the document embeddings in the collection.
	// The embedding will be normalized if it's not the case yet.
	// If both QueryText and QueryEmbedding are set, QueryEmbedding will be used.
	QueryEmbedding []float32

	// The number of results to return.
	NResults int

	// Conditional filtering on metadata.
	Where map[string]string

	// Conditional filtering on documents.
	WhereDocument map[string]string

	// Negative is the negative query options.
	// They can be used to exclude certain results from the query.
	Negative NegativeQueryOptions
}

QueryOptions represents the options for a query.

type Result

type Result struct {
	ID        string
	Metadata  map[string]string
	Embedding []float32
	Content   string

	// The cosine similarity between the query and the document.
	// The higher the value, the more similar the document is to the query.
	// The value is in the range [-1, 1].
	Similarity float32
}

Result represents a single result from a query.

type VertexOption

type VertexOption func(*vertexOptions)

func WithVertexAPIEndpoint

func WithVertexAPIEndpoint(apiEndpoint string) VertexOption

func WithVertexAutoTruncate

func WithVertexAutoTruncate(autoTruncate bool) VertexOption
