BAAI/bge-large-en-v1.5
Default embedding model in PipesHub for high-performance text embeddings
BAAI/bge-large-en-v1.5 is a default embedding model option in PipesHub.
Model Overview
BAAI/bge-large-en-v1.5 is a state-of-the-art English embedding model developed by the Beijing Academy of Artificial Intelligence (BAAI). It’s one of the highest-performing models on the Massive Text Embedding Benchmark (MTEB) and is optimized for semantic search and document retrieval.
Key Features
- High Performance: Ranks among the top models on the MTEB benchmark
- Optimized Similarity Distribution: Strong retrieval quality even without instruction prefixes (see the sketch after this list)
- Versatile Applications: Excellent for semantic search, document retrieval, and clustering
- Strong Semantic Understanding: Effectively captures semantic relationships in text
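For illustration, the snippet below shows the kind of retrieval this model is built for, using the open-source sentence-transformers library. This is a minimal sketch, not PipesHub code; the query and documents are invented examples.

```python
from sentence_transformers import SentenceTransformer

# Weights download from Hugging Face on first load (~1.34 GB).
model = SentenceTransformer("BAAI/bge-large-en-v1.5")

query = "How do I reset my password?"
docs = [
    "To change your password, open Settings and choose Security.",
    "Quarterly revenue grew 12% year over year.",
]

# The v1.5 models retrieve well without a query instruction prefix.
# With normalize_embeddings=True the dot product equals cosine similarity.
q_emb = model.encode(query, normalize_embeddings=True)
d_embs = model.encode(docs, normalize_embeddings=True)

scores = d_embs @ q_emb
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {doc}")
```

The password-related document should score well above the unrelated one, which is exactly the behavior PipesHub relies on for semantic search.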
Technical Specifications
| Feature | Specification |
|---|---|
| Dimensions | 1024 |
| Model Size | ~1.34 GB |
| Max Tokens | 512 |
| Base Architecture | BERT (large) |
| Training | Contrastive learning on 1B+ sentence pairs |
| Languages | English |
Usage in PipesHub
Select “Sentence Transformer” as your Provider and enter “BAAI/bge-large-en-v1.5” in the Embedding Model field. The model is available as a default option for immediate use.
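Outside the PipesHub UI, a quick local check can confirm the model name resolves and matches the specifications above. This is a sketch assuming the sentence-transformers Python package is installed; it is not part of PipesHub configuration.

```python
from sentence_transformers import SentenceTransformer

# Load by the same name entered in the Embedding Model field.
model = SentenceTransformer("BAAI/bge-large-en-v1.5")

emb = model.encode("sanity check")
print(emb.shape)                   # (1024,) -- matches the Dimensions spec
print(model.get_max_seq_length())  # 512 -- longer inputs are truncated
```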
When to Choose This Model
- When you need high-quality embeddings for English text
- For applications requiring state-of-the-art semantic search capabilities
- When you have adequate computational resources to handle a large model
- For enterprise-grade applications where embedding quality is critical
Performance Considerations
While this model offers superior performance, it requires more computational resources than smaller alternatives like BAAI/bge-small-en-v1.5. For resource-constrained environments, consider using the smaller variant, which still provides excellent results with faster processing.
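Switching to the small variant is a drop-in change of the model name, but note that it produces 384-dimensional vectors rather than 1024, so any existing vector index built with the large model would need to be re-embedded. A brief sketch, under the same assumptions as above:

```python
from sentence_transformers import SentenceTransformer

# The small variant loads the same way but yields smaller vectors.
small = SentenceTransformer("BAAI/bge-small-en-v1.5")
print(small.encode("hello").shape)  # (384,) vs (1024,) for the large model
```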
Related Models
- BAAI/bge-small-en-v1.5: Smaller, faster alternative (~33M parameters, roughly one-tenth the size)
- BAAI/bge-base-en-v1.5: Medium-sized alternative (768 dimensions)
For more details on general Sentence Transformer configuration, see the Sentence Transformer Embeddings documentation.