Embeddings
How ragistry converts your content into searchable vectors
What are Embeddings?
Embeddings are numerical representations of text that capture semantic meaning. Rather than comparing text word by word, an embedding model converts content into high-dimensional vectors that represent its underlying meaning.
This allows ragistry to find semantically similar content even when different words are used. For example, 'buy a car' and 'purchase a vehicle' will be recognized as similar because their embeddings are close together in vector space.
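As a rough illustration of "close together in vector space", the sketch below computes cosine similarity, a common closeness measure for embeddings. The source does not state which metric ragistry uses, so treat the choice of cosine similarity as an assumption. The 4-dimensional vectors are made up for the example and are not real model output.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, hand-written for illustration only.
# Real embeddings come from a model and have thousands of dimensions.
toy_embeddings = {
    "buy a car": [0.9, 0.8, 0.1, 0.0],
    "purchase a vehicle": [0.8, 0.9, 0.2, 0.1],
    "bake a cake": [0.1, 0.0, 0.9, 0.8],
}

similar = cosine_similarity(
    toy_embeddings["buy a car"], toy_embeddings["purchase a vehicle"]
)
different = cosine_similarity(
    toy_embeddings["buy a car"], toy_embeddings["bake a cake"]
)

print(f"buy a car vs purchase a vehicle: {similar:.3f}")
print(f"buy a car vs bake a cake:        {different:.3f}")
```

The two phrases about cars share no words, yet their vectors point in nearly the same direction, so their score is much higher than the score against the unrelated phrase.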
Embedding Models
ragistry uses the following embedding models:
OpenAI text-embedding-3-large
High-precision embeddings with 3072 dimensions for maximum semantic accuracy
Custom Models
Support for specialized embedding models for specific industries or languages
Embedding Process
- Text is split into chunks (typically 500-1000 tokens)
- Each chunk is processed through the embedding model
- The resulting vector is stored in the vector database
- At query time, the search query is embedded with the same model and compared against the stored vectors
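The steps above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not ragistry's implementation: `chunk_text`, `toy_embed`, `build_index`, and `search` are hypothetical names, whitespace-separated words stand in for tokens, a hashed bag-of-words function stands in for the embedding model, and a plain Python list stands in for the vector database.

```python
import math
import zlib

DIM = 64  # toy dimensionality; a real embedding model uses far more

def chunk_text(text, max_tokens=500):
    """Split text into chunks of at most max_tokens words (words stand in for tokens)."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]

def toy_embed(text):
    """Stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def build_index(text, max_tokens=500):
    """Chunk the text, embed each chunk, and store (chunk, vector) pairs in memory."""
    return [(chunk, toy_embed(chunk)) for chunk in chunk_text(text, max_tokens)]

def search(index, query, top_k=1):
    """Embed the query the same way and rank chunks by dot product.

    Vectors are unit length, so the dot product equals cosine similarity.
    """
    q = toy_embed(query)
    scored = [(sum(a * b for a, b in zip(q, vec)), chunk) for chunk, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

document = (
    "Embeddings map text to numeric vectors. "
    "Chunks are stored in the database. "
    "Queries use the same embedding model."
)
# A tiny chunk size keeps the demo readable; the text above cites 500-1000 tokens.
index = build_index(document, max_tokens=6)
results = search(index, "where are chunks stored", top_k=1)

for score, chunk in results:
    print(f"{score:.3f}  {chunk}")
```

The toy embedder only matches shared words, so it cannot recognize synonyms the way a real embedding model does. The point of the sketch is the shape of the pipeline: the query and the chunks go through the same embedding function so that their vectors are comparable.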