BAAI/bge-large-en-v1.5

BAAI/bge-large-en-v1.5 is a default embedding model option in PipesHub.

Model Overview

BAAI/bge-large-en-v1.5 is a state-of-the-art English embedding model developed by the Beijing Academy of Artificial Intelligence (BAAI). It’s one of the highest-performing models on the Massive Text Embedding Benchmark (MTEB) and is optimized for semantic search and document retrieval.
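Outside of PipesHub, the same checkpoint can be loaded directly with the sentence-transformers library. The snippet below is a minimal sketch of semantic search with this model; the document and query strings are illustrative placeholders only.

```python
from sentence_transformers import SentenceTransformer, util

# Load the model from the Hugging Face Hub (~1.34 GB download on first use).
model = SentenceTransformer("BAAI/bge-large-en-v1.5")

documents = [
    "PipesHub indexes enterprise documents for retrieval.",
    "The weather in Paris is mild in spring.",
]
query = "How are company documents searched?"

# Normalized embeddings let cosine similarity be computed as a dot product.
doc_embeddings = model.encode(documents, normalize_embeddings=True)
query_embedding = model.encode(query, normalize_embeddings=True)

# Rank documents by cosine similarity to the query.
scores = util.cos_sim(query_embedding, doc_embeddings)[0]
best = scores.argmax().item()
print(f"Best match: {documents[best]!r} (score={scores[best].item():.3f})")
```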

Key Features

  • High Performance: Ranks among the top models on the MTEB benchmark
  • Optimized Similarity Distribution: Enhanced retrieval quality even without instruction prefixes (see the sketch after this list)
  • Versatile Applications: Excellent for semantic search, document retrieval, and clustering
  • Large Context Understanding: Effectively captures semantic relationships in text
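On the instruction-prefix point: the BGE authors recommend a query-side prefix for short query-to-passage retrieval, but the v1.5 release is tuned so that plain queries still retrieve well. Below is a minimal sketch of the comparison, assuming the sentence-transformers library and illustrative example texts; exact scores will vary.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("BAAI/bge-large-en-v1.5")

# Query instruction recommended by the BGE authors for short
# query-to-passage retrieval (passages are embedded without it).
INSTRUCTION = "Represent this sentence for searching relevant passages: "

passage = "BGE models are trained with contrastive learning for retrieval."
query = "how are bge embedding models trained"

passage_emb = model.encode(passage, normalize_embeddings=True)

# v1.5 keeps retrieval quality strong even without the prefix.
plain = util.cos_sim(model.encode(query, normalize_embeddings=True), passage_emb)
prefixed = util.cos_sim(
    model.encode(INSTRUCTION + query, normalize_embeddings=True), passage_emb
)
print(f"without prefix: {plain.item():.3f}, with prefix: {prefixed.item():.3f}")
```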

Technical Specifications

Feature           | Specification
------------------|--------------------------------------------
Dimensions        | 1024
Model Size        | ~1.34 GB
Max Tokens        | 512
Base Architecture | BERT (large)
Training          | Contrastive learning on 1B+ sentence pairs
Languages         | English
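Once the model is loaded (for example via sentence-transformers, as above), the embedding dimension and input limit can be checked programmatically; a quick sketch:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-large-en-v1.5")

# 1024-dimensional embeddings; inputs beyond max_seq_length are truncated.
print(model.get_sentence_embedding_dimension())  # 1024
print(model.max_seq_length)                      # 512
```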

Usage in PipesHub

Simply select “Sentence Transformer” as your Provider and enter “BAAI/bge-large-en-v1.5” in the Embedding Model field. This model is available as a default option for immediate use.

When to Choose This Model

  • When you need high-quality embeddings for English text
  • For applications requiring state-of-the-art semantic search capabilities
  • When you have adequate computational resources to handle a large model
  • For enterprise-grade applications where embedding quality is critical

Performance Considerations

While this model offers superior performance, it requires more computational resources than smaller alternatives like BAAI/bge-small-en-v1.5. For resource-constrained environments, consider using the smaller variant, which still provides excellent results with faster processing.

  • BAAI/bge-small-en-v1.5: Much smaller, faster alternative (~33M parameters vs. ~335M, 384-dimensional embeddings)
  • BAAI/bge-base-en-v1.5: Medium-sized alternative (~109M parameters, 768-dimensional embeddings)
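As a rough way to gauge the large-versus-small trade-off on your own hardware, the sketch below times a batch of encodes with both variants; absolute numbers depend entirely on your CPU/GPU and batch size.

```python
import time
from sentence_transformers import SentenceTransformer

texts = [f"Sample sentence number {i} for timing." for i in range(256)]

for name in ("BAAI/bge-large-en-v1.5", "BAAI/bge-small-en-v1.5"):
    model = SentenceTransformer(name)
    start = time.perf_counter()
    model.encode(texts, batch_size=32, normalize_embeddings=True)
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.2f}s for {len(texts)} texts")
```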

For more details on general Sentence Transformer configuration, see the Sentence Transformer Embeddings documentation.