Embeddings

How ragistry converts your content into searchable vectors

What are Embeddings?

Embeddings are numerical representations of text that capture semantic meaning. Instead of comparing text word by word, an embedding model converts content into high-dimensional vectors that represent the underlying meaning.

This allows ragistry to find semantically similar content even when different words are used. For example, 'buy a car' and 'purchase a vehicle' will be recognized as similar because their embeddings are close together in vector space.
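To make "close together in vector space" concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The page does not say which similarity measure ragistry uses, so treat the choice as an assumption. The three-dimensional vectors are invented for illustration and are not real model output.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors; real embeddings have hundreds or thousands of dimensions.
buy_a_car = [0.90, 0.10, 0.30]
purchase_a_vehicle = [0.85, 0.15, 0.35]
bake_a_cake = [0.10, 0.90, 0.20]

similar = cosine_similarity(buy_a_car, purchase_a_vehicle)
different = cosine_similarity(buy_a_car, bake_a_cake)

print(f"buy a car vs purchase a vehicle: {similar:.3f}")
print(f"buy a car vs bake a cake:        {different:.3f}")
```

With these made-up values, the first pair scores close to 1 and the second pair scores much lower, which is the behavior a semantic search relies on.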

Embedding Models

ragistry supports the following embedding models:

OpenAI text-embedding-3-large

High-precision embeddings with 3072 dimensions for maximum semantic accuracy

Custom Models

Support for specialized embedding models for specific industries or languages

Embedding Process

1. Text is split into chunks (typically 500-1000 tokens).
2. Each chunk is processed through the embedding model.
3. The resulting vector is stored in the vector database.
4. At query time, the search query is embedded with the same model and compared against the stored vectors to find the closest chunks.
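The steps above can be sketched end to end in a few lines. This is an illustration under stated assumptions, not ragistry's implementation: `chunk_text`, `toy_embed`, `index_document`, and `search` are hypothetical names, whitespace-separated words stand in for tokens, a hashed bag-of-words stands in for the real embedding model, and a plain Python list stands in for the vector database.

```python
import math
import zlib

DIMENSIONS = 256  # toy size; text-embedding-3-large produces 3072 dimensions

def chunk_text(text, max_tokens=500):
    """Split text into chunks of at most max_tokens words (words approximate tokens)."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]

def toy_embed(text):
    """Stand-in for an embedding model: hashed bag-of-words, scaled to unit length."""
    vector = [0.0] * DIMENSIONS
    for word in text.lower().split():
        # crc32 is stable across runs, unlike Python's built-in hash() for strings.
        vector[zlib.crc32(word.encode("utf-8")) % DIMENSIONS] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector

def index_document(text, store, max_tokens=500):
    """Chunk the text, embed each chunk, and append (chunk, vector) to the store."""
    for chunk in chunk_text(text, max_tokens):
        store.append((chunk, toy_embed(chunk)))

def search(query, store, top_k=1):
    """Embed the query and return the top_k (score, chunk) pairs, best match first."""
    query_vector = toy_embed(query)
    scored = [
        (sum(q * v for q, v in zip(query_vector, vector)), chunk)
        for chunk, vector in store
    ]
    # Vectors are unit length, so the dot product equals cosine similarity.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

store = []  # in-memory stand-in for the vector database
index_document("how to buy a car from a dealer", store)
index_document("recipe for baking a chocolate cake", store)

results = search("buy a car", store)
for score, chunk in results:
    print(f"{score:.3f}  {chunk}")
```

The toy embedding only matches on shared words, so unlike a real embedding model it would not relate "buy a car" to "purchase a vehicle". The point of the sketch is the shape of the pipeline: chunk, embed, store, then embed the query the same way and rank stored vectors by similarity.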