Embeddings

How ragistry converts your content into searchable vectors

What are Embeddings?

Embeddings are numerical representations of text that capture semantic meaning. Rather than comparing text word by word, ragistry converts content into high-dimensional vectors (lists of numbers) positioned so that texts with similar meaning sit close together.

This allows ragistry to find semantically similar content even when different words are used. For example, 'buy a car' and 'purchase a vehicle' will be recognized as similar because their embeddings are close together in vector space.
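To make "close together in vector space" concrete, here is a minimal sketch using cosine similarity, one common closeness measure. Whether ragistry uses cosine similarity internally is an assumption, and the three-dimensional vectors are hand-made for illustration, not real model output.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hand-made toy vectors (assumption: real embeddings have thousands of dimensions).
buy_a_car = [0.9, 0.1, 0.3]
purchase_a_vehicle = [0.8, 0.2, 0.35]
bake_a_cake = [0.1, 0.9, 0.2]

sim_related = cosine_similarity(buy_a_car, purchase_a_vehicle)
sim_unrelated = cosine_similarity(buy_a_car, bake_a_cake)
print(sim_related > sim_unrelated)  # True
```

The related pair scores close to 1, while the unrelated pair scores much lower, even though 'buy a car' and 'purchase a vehicle' share almost no words.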

Embedding Models

ragistry uses the following embedding models:

OpenAI text-embedding-3-large

High-precision embeddings with 3072 dimensions, the largest output size among OpenAI's text-embedding-3 models, for fine-grained semantic matching

Custom Models

Support for specialized embedding models for specific industries or languages
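For a rough sense of what 3072 dimensions means in practice, the sketch below estimates raw vector storage. It assumes each value is stored as a 32-bit float, which is common but not confirmed for ragistry, and it ignores index overhead and metadata.

```python
DIMENSIONS = 3072        # output size of text-embedding-3-large, as stated above
BYTES_PER_FLOAT32 = 4    # assumption: values are stored as 32-bit floats

def vector_storage_bytes(num_chunks, dimensions=DIMENSIONS, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw bytes needed to hold num_chunks vectors, excluding index and metadata."""
    return num_chunks * dimensions * bytes_per_value

per_vector = vector_storage_bytes(1)
per_100k = vector_storage_bytes(100_000)

print(per_vector)                    # 12288
print(f"{per_100k / 1e9:.2f} GB")    # 1.23 GB
```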

Embedding Process

  1. Text is split into chunks (typically 500-1000 tokens)
  2. Each chunk is processed through the embedding model
  3. The resulting vector is stored in the vector database
  4. At query time, the search query is embedded with the same model and compared against the stored vectors; the closest chunks are returned
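The four steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not ragistry's implementation: `chunk_text`, `toy_embed`, and `InMemoryVectorStore` are hypothetical names, whitespace-separated words stand in for model tokens, and the toy embedder is a bag of words over a tiny fixed vocabulary, so it matches shared words rather than meaning.

```python
import math
import re

# Hypothetical fixed vocabulary; a real embedding model has no such list.
VOCAB = ("cars", "battery", "checks", "electric", "bread", "sourdough", "rise", "long")

def chunk_text(text, max_tokens=500):
    """Step 1: split text into chunks of at most max_tokens words.
    Words stand in for model tokens; a real tokenizer counts differently."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]

def toy_embed(text):
    """Step 2 (stand-in): count vocabulary words, then scale to unit length."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    vec = [float(tokens.count(word)) for word in VOCAB]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

class InMemoryVectorStore:
    """Step 3 (stand-in): a list playing the role of the vector database."""
    def __init__(self):
        self.items = []  # list of (chunk, vector) pairs

    def add(self, chunk, vector):
        self.items.append((chunk, vector))

    def search(self, query_vector, top_k=1):
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [
            (sum(q * v for q, v in zip(query_vector, vec)), chunk)
            for chunk, vec in self.items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]

document = "Electric cars need regular battery checks. Sourdough bread needs a long rise."

store = InMemoryVectorStore()
for chunk in chunk_text(document, max_tokens=6):  # tiny chunks for the demo
    store.add(chunk, toy_embed(chunk))

# Step 4: embed the query with the same function and rank stored chunks.
results = store.search(toy_embed("battery checks for cars"), top_k=1)
print(results[0][1])  # Electric cars need regular battery checks.
```

The important design point is that the query and the stored chunks go through the same embedding function; vectors from different models are not comparable.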