langchain4j + Milvus in Practice

This post looks at how to use langchain4j with the Milvus vector database.

Steps

Run Milvus with Docker

docker run -d \
        --name milvus-standalone \
        --security-opt seccomp:unconfined \
        -e ETCD_USE_EMBED=true \
        -e ETCD_DATA_DIR=/var/lib/milvus/etcd \
        -e ETCD_CONFIG_PATH=/milvus/configs/embedEtcd.yaml \
        -e COMMON_STORAGETYPE=local \
        -v $(pwd)/volumes/milvus:/var/lib/milvus \
        -v $(pwd)/embedEtcd.yaml:/milvus/configs/embedEtcd.yaml \
        -v $(pwd)/user.yaml:/milvus/configs/user.yaml \
        -p 19530:19530 \
        -p 9091:9091 \
        -p 2379:2379 \
        --health-cmd="curl -f http://localhost:9091/healthz" \
        --health-interval=30s \
        --health-start-period=90s \
        --health-timeout=20s \
        --health-retries=3 \
        docker.1ms.run/milvusdb/milvus:v2.5.5 \
        milvus run standalone  1> /dev/null

Once the container is up, open http://127.0.0.1:9091/webui
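
Before connecting a client it can help to confirm that the container is actually healthy. The following is a minimal sketch, assuming the same /healthz endpoint and port 9091 that the --health-cmd option in the docker command uses; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class MilvusHealthCheck {

    // Builds the health endpoint URI; 9091 is the port published in the docker command.
    static URI healthUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/healthz");
    }

    // Returns true when the endpoint answers with HTTP 200,
    // false on any other status or on an I/O error.
    static boolean isHealthy(URI uri) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (IOException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        URI uri = healthUri("localhost", 9091);
        System.out.println(uri + " healthy: " + isHealthy(uri));
    }
}
```

Since the container is started with --health-start-period=90s, a false result shortly after startup does not necessarily mean something is wrong.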

embedEtcd.yaml must be created in the current directory before running the command:

listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://0.0.0.0:2379
quota-backend-bytes: 4294967296
auto-compaction-mode: revision
auto-compaction-retention: '1000'

user.yaml can be left empty.

pom.xml

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-milvus</artifactId>
    <version>1.0.0-beta1</version>
</dependency>

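example

The example embeds text with JlamaEmbeddingModel, which comes from the separate langchain4j-jlama module, so that dependency must be on the classpath as well.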

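import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.jlama.JlamaEmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore;
import io.milvus.client.MilvusServiceClient;
import io.milvus.common.clientenum.ConsistencyLevelEnum;
import io.milvus.param.ConnectParam;
import io.milvus.param.IndexType;
import io.milvus.param.MetricType;

import java.util.List;
import java.util.concurrent.TimeUnit;

public class JlamaMilvusExample {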

    public static void main(String[] args) throws InterruptedException {
        EmbeddingModel embeddingModel = JlamaEmbeddingModel.builder()
                .modelName("intfloat/e5-small-v2")
                .build();

        MilvusServiceClient customMilvusClient = new MilvusServiceClient(
                ConnectParam.newBuilder()
                        .withHost("localhost")
                        .withPort(19530)
                        .build()
        );
        MilvusEmbeddingStore embeddingStore = MilvusEmbeddingStore.builder()
                .milvusClient(customMilvusClient)
                .collectionName("example_collection")      // Name of the collection
                .dimension(384)                            // Dimension of vectors
                .indexType(IndexType.FLAT)                 // Index type
                .metricType(MetricType.COSINE)             // Metric type
                .consistencyLevel(ConsistencyLevelEnum.EVENTUALLY)  // Consistency level
                .autoFlushOnInsert(true)                   // Auto flush after insert
                .idFieldName("id")                         // ID field name
                .textFieldName("text")                     // Text field name
                .metadataFieldName("metadata")             // Metadata field name
                .vectorFieldName("vector")                 // Vector field name
                .build();                                  // Build the MilvusEmbeddingStore instance

        TextSegment segment1 = TextSegment.from("I like football.");
        Embedding embedding1 = embeddingModel.embed(segment1).content();
        embeddingStore.add(embedding1, segment1);

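        // Pause between inserts: with autoFlushOnInsert(true) every add() flushes,
        // and Milvus rate-limits flush requests per collection.
        TimeUnit.SECONDS.sleep(60);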

        TextSegment segment2 = TextSegment.from("The weather is good today.");
        Embedding embedding2 = embeddingModel.embed(segment2).content();
        embeddingStore.add(embedding2, segment2);

        TimeUnit.SECONDS.sleep(60);

        String userQuery = "What is your favourite sport?";
        Embedding queryEmbedding = embeddingModel.embed(userQuery).content();
        int maxResults = 1;
        List<EmbeddingMatch<TextSegment>> relevant = embeddingStore.findRelevant(queryEmbedding, maxResults);
        EmbeddingMatch<TextSegment> embeddingMatch = relevant.get(0);

        System.out.println("Question: " + userQuery); // What is your favourite sport?
        System.out.println("Response: " + embeddingMatch.embedded().text()); // I like football.
    }
}

Final output

WARNING: Using incubator modules: jdk.incubator.vector
INFO  c.g.tjake.jlama.model.AbstractModel - Model type = F32, Working memory type = F32, Quantized memory type = F32
WARN  c.g.t.j.t.o.TensorOperationsProvider - Native operations not available. Consider adding 'com.github.tjake:jlama-native' to the classpath
INFO  c.g.t.j.t.o.TensorOperationsProvider - Using Panama Vector Operations (OffHeap)
Question: What is your favourite sport?
Response: I like football.
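
The store above is built with MetricType.COSINE and IndexType.FLAT, so the returned match is the stored vector with the highest cosine similarity to the query embedding, found by exhaustive comparison. A minimal sketch of that idea follows. The 3-dimensional vectors are made-up stand-ins (real e5-small-v2 embeddings have 384 dimensions), and the class is illustrative, not part of langchain4j or Milvus.

```java
import java.util.List;

final class CosineTopOne {

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Index of the stored vector most similar to the query (brute-force scan).
    static int nearest(double[] query, List<double[]> stored) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < stored.size(); i++) {
            double score = cosine(query, stored.get(i));
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<double[]> stored = List.of(
                new double[]{0.9, 0.1, 0.0},   // stand-in for "I like football."
                new double[]{0.0, 0.2, 0.9});  // stand-in for "The weather is good today."
        double[] query = {0.8, 0.3, 0.1};      // stand-in for the sport question
        System.out.println("nearest index: " + nearest(query, stored));
    }
}
```

With these toy vectors the query points in nearly the same direction as the first stored vector, so index 0 wins, mirroring how the football sentence is returned for the sport question.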

quotaAndLimits

quotaAndLimits:
  enabled: true # `true` to enable quota and limits, `false` to disable.
  # quotaCenterCollectInterval is the time interval that quotaCenter
  # collects metrics from Proxies, Query cluster and Data cluster.
  # seconds, (0 ~ 65536)
  quotaCenterCollectInterval: 3
  limits:
    allocRetryTimes: 15 # retry times when delete alloc forward data from rate limit failed
    allocWaitInterval: 1000 # retry wait duration when delete alloc forward data rate failed, in millisecond
    complexDeleteLimitEnable: false # whether complex delete check forward data by limiter
    maxCollectionNum: 65536
    maxCollectionNumPerDB: 65536 # Maximum number of collections per database.
    maxInsertSize: -1 # maximum size of a single insert request, in bytes, -1 means no limit
    maxResourceGroupNumOfQueryNode: 1024 # maximum number of resource groups of query nodes
    maxGroupSize: 10 # maximum size for one single group when doing search group by
  ddl:
    enabled: false # Whether DDL request throttling is enabled.
    # Maximum number of collection-related DDL requests per second.
    # Setting this item to 10 indicates that Milvus processes no more than 10 collection-related DDL requests per second, including collection creation requests, collection drop requests, collection load requests, and collection release requests.
    # To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.
    collectionRate: -1
    # Maximum number of partition-related DDL requests per second.
    # Setting this item to 10 indicates that Milvus processes no more than 10 partition-related requests per second, including partition creation requests, partition drop requests, partition load requests, and partition release requests.
    # To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.
    partitionRate: -1
    db:
      collectionRate: -1 # qps of db level , default no limit, rate for CreateCollection, DropCollection, LoadCollection, ReleaseCollection
      partitionRate: -1 # qps of db level, default no limit, rate for CreatePartition, DropPartition, LoadPartition, ReleasePartition
  indexRate:
    enabled: false # Whether index-related request throttling is enabled.
    # Maximum number of index-related requests per second.
    # Setting this item to 10 indicates that Milvus processes no more than 10 index-related requests per second, including index creation requests and index drop requests.
    # To use this setting, set quotaAndLimits.indexRate.enabled to true at the same time.
    max: -1
    db:
      max: -1 # qps of db level, default no limit, rate for CreateIndex, DropIndex
  flushRate:
    enabled: true # Whether flush request throttling is enabled.
    # Maximum number of flush requests per second.
    # Setting this item to 10 indicates that Milvus processes no more than 10 flush requests per second.
    # To use this setting, set quotaAndLimits.flushRate.enabled to true at the same time.
    max: -1
    collection:
      max: 10 # qps, default no limit, rate for flush at collection level.
    db:
      max: -1 # qps of db level, default no limit, rate for flush
  compactionRate:
    enabled: false # Whether manual compaction request throttling is enabled.
    # Maximum number of manual-compaction requests per second.
    # Setting this item to 10 indicates that Milvus processes no more than 10 manual-compaction requests per second.
    # To use this setting, set quotaAndLimits.compactionRate.enabled to true at the same time.
    max: -1
    db:
      max: -1 # qps of db level, default no limit, rate for manualCompaction
  dml:
    enabled: false # Whether DML request throttling is enabled.
    insertRate:
      # Highest data insertion rate per second.
      # Setting this item to 5 indicates that Milvus only allows data insertion at the rate of 5 MB/s.
      # To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
      max: -1
      db:
        max: -1 # MB/s, default no limit
      collection:
        # Highest data insertion rate per collection per second.
        # Setting this item to 5 indicates that Milvus only allows data insertion to any collection at the rate of 5 MB/s.
        # To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
        max: -1
      partition:
        max: -1 # MB/s, default no limit
    upsertRate:
      max: -1 # MB/s, default no limit
      db:
        max: -1 # MB/s, default no limit
      collection:
        max: -1 # MB/s, default no limit
      partition:
        max: -1 # MB/s, default no limit
    deleteRate:
      # Highest data deletion rate per second.
      # Setting this item to 0.1 indicates that Milvus only allows data deletion at the rate of 0.1 MB/s.
      # To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
      max: -1
      db:
        max: -1 # MB/s, default no limit
      collection:
        # Highest data deletion rate per second.
        # Setting this item to 0.1 indicates that Milvus only allows data deletion from any collection at the rate of 0.1 MB/s.
        # To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
        max: -1
      partition:
        max: -1 # MB/s, default no limit
    bulkLoadRate:
      max: -1 # MB/s, default no limit, not support yet. TODO: limit bulkLoad rate
      db:
        max: -1 # MB/s, default no limit, not support yet. TODO: limit db bulkLoad rate
      collection:
        max: -1 # MB/s, default no limit, not support yet. TODO: limit collection bulkLoad rate
      partition:
        max: -1 # MB/s, default no limit, not support yet. TODO: limit partition bulkLoad rate
  dql:
    enabled: false # Whether DQL request throttling is enabled.
    searchRate:
      # Maximum number of vectors to search per second.
      # Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second no matter whether these 100 vectors are all in one search or scattered across multiple searches.
      # To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
      max: -1
      db:
        max: -1 # vps (vectors per second), default no limit
      collection:
        # Maximum number of vectors to search per collection per second.
        # Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second per collection no matter whether these 100 vectors are all in one search or scattered across multiple searches.
        # To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
        max: -1
      partition:
        max: -1 # vps (vectors per second), default no limit
    queryRate:
      # Maximum number of queries per second.
      # Setting this item to 100 indicates that Milvus only allows 100 queries per second.
      # To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
      max: -1
      db:
        max: -1 # qps, default no limit
      collection:
        # Maximum number of queries per collection per second.
        # Setting this item to 100 indicates that Milvus only allows 100 queries per collection per second.
        # To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
        max: -1
      partition:
        max: -1 # qps, default no limit
  limitWriting:
    # forceDeny false means dml requests are allowed (except for some
    # specific conditions, such as memory of nodes to water marker), true means always reject all dml requests.
    forceDeny: false
    ttProtection:
      enabled: false
      # maxTimeTickDelay indicates the backpressure for DML Operations.
      # DML rates would be reduced according to the ratio of time tick delay to maxTimeTickDelay,
      # if time tick delay is greater than maxTimeTickDelay, all DML requests would be rejected.
      # seconds
      maxTimeTickDelay: 300
    memProtection:
      # When memory usage > memoryHighWaterLevel, all dml requests would be rejected;
      # When memoryLowWaterLevel < memory usage < memoryHighWaterLevel, reduce the dml rate;
      # When memory usage < memoryLowWaterLevel, no action.
      enabled: true
      dataNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in DataNodes
      dataNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in DataNodes
      queryNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in QueryNodes
      queryNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in QueryNodes
    growingSegmentsSizeProtection:
      # No action will be taken if the growing segments size is less than the low watermark.
      # When the growing segments size exceeds the low watermark, the dml rate will be reduced,
      # but the rate will not be lower than minRateRatio * dmlRate.
      enabled: false
      minRateRatio: 0.5
      lowWaterLevel: 0.2
      highWaterLevel: 0.4
    diskProtection:
      enabled: true # When the total file size of object storage is greater than `diskQuota`, all dml requests would be rejected;
      diskQuota: -1 # MB, (0, +inf), default no limit
      diskQuotaPerDB: -1 # MB, (0, +inf), default no limit
      diskQuotaPerCollection: -1 # MB, (0, +inf), default no limit
      diskQuotaPerPartition: -1 # MB, (0, +inf), default no limit
    l0SegmentsRowCountProtection:
      enabled: false # switch to enable l0 segment row count quota
      lowWaterLevel: 30000000 # l0 segment row count quota, low water level
      highWaterLevel: 50000000 # l0 segment row count quota, high water level
    deleteBufferRowCountProtection:
      enabled: false # switch to enable delete buffer row count quota
      lowWaterLevel: 32768 # delete buffer row count quota, low water level
      highWaterLevel: 65536 # delete buffer row count quota, high water level
    deleteBufferSizeProtection:
      enabled: false # switch to enable delete buffer size quota
      lowWaterLevel: 134217728 # delete buffer size quota, low water level
      highWaterLevel: 268435456 # delete buffer size quota, high water level
  limitReading:
    # forceDeny false means dql requests are allowed (except for some
    # specific conditions, such as collection has been dropped), true means always reject all dql requests.
    forceDeny: false
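
The memProtection comments in the config describe three zones: below the low watermark nothing happens, between the two watermarks the DML rate is reduced, and above the high watermark all DML requests are rejected. The sketch below turns that into a rate factor. The linear scaling between the watermarks is an assumption made for illustration (the comments only say the rate is reduced), and the class name is hypothetical.

```java
final class MemoryWatermark {

    // Returns a factor in [0, 1] to multiply the DML rate by:
    // 1.0 means no action, 0.0 means reject all DML requests.
    // ASSUMPTION: linear scaling between the low and high watermarks.
    static double dmlRateFactor(double usage, double low, double high) {
        if (!(low > 0 && low < high && high <= 1)) {
            throw new IllegalArgumentException("expected 0 < low < high <= 1");
        }
        if (usage < low) {
            return 1.0;
        }
        if (usage > high) {
            return 0.0;
        }
        return (high - usage) / (high - low);
    }

    public static void main(String[] args) {
        // Watermarks taken from the dataNode defaults shown in the config.
        double low = 0.85, high = 0.95;
        for (double usage : new double[]{0.50, 0.90, 0.97}) {
            System.out.println("usage " + usage + " -> factor "
                    + dmlRateFactor(usage, low, high));
        }
    }
}
```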

Note that Milvus enforces rate limits; exceeding them produces errors like the following:

ERROR i.m.client.AbstractMilvusGrpcClient - FlushRequest failed, error code: 8, reason: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
ERROR i.m.client.AbstractMilvusGrpcClient - FlushRequest failed! Exception:{}
io.milvus.exception.ServerException: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
	at io.milvus.client.AbstractMilvusGrpcClient.handleResponse(AbstractMilvusGrpcClient.java:399)
	at io.milvus.client.AbstractMilvusGrpcClient.flush(AbstractMilvusGrpcClient.java:921)
	at io.milvus.client.MilvusServiceClient.lambda$flush$17(MilvusServiceClient.java:520)
	at io.milvus.client.MilvusServiceClient.retry(MilvusServiceClient.java:310)
	at io.milvus.client.MilvusServiceClient.flush(MilvusServiceClient.java:520)
	at dev.langchain4j.store.embedding.milvus.CollectionOperationsExecutor.flush(CollectionOperationsExecutor.java:32)
	at dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore.addAll(MilvusEmbeddingStore.java:246)
	at dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore.addInternal(MilvusEmbeddingStore.java:226)
	at dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore.add(MilvusEmbeddingStore.java:184)
	at JlamaMilvusExample.main(JlamaMilvusExample.java:63)
WARN  i.m.client.AbstractMilvusGrpcClient - Retry(4) with interval 270ms. Reason: io.milvus.exception.ServerException: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
WARN  i.m.client.AbstractMilvusGrpcClient - Retry(5) with interval 810ms. Reason: io.milvus.exception.ServerException: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
WARN  i.m.client.AbstractMilvusGrpcClient - Retry(6) with interval 2430ms. Reason: io.milvus.exception.ServerException: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
WARN  i.m.client.AbstractMilvusGrpcClient - Retry(7) with interval 3000ms. Reason: io.milvus.exception.ServerException: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
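
The retry intervals in the log (270 ms, 810 ms, 2430 ms, 3000 ms for retries 4 to 7) are consistent with an interval that starts at 10 ms, triples on each retry, and is capped at 3000 ms. Those three constants are inferred from the log lines above, not taken from the SDK documentation. A small sketch that reproduces the schedule under that assumption:

```java
final class RetryBackoff {

    // Interval before the given retry (1-based): initial * multiplier^(retry-1), capped at maxMs.
    static long intervalMs(int retry, long initialMs, int multiplier, long maxMs) {
        if (retry < 1) {
            throw new IllegalArgumentException("retry must be >= 1");
        }
        long interval = Math.min(initialMs, maxMs);
        for (int i = 1; i < retry; i++) {
            // Cap on every step so the value never overflows.
            interval = Math.min(interval * multiplier, maxMs);
        }
        return interval;
    }

    public static void main(String[] args) {
        // ASSUMPTION: 10 ms initial interval, x3 multiplier, 3000 ms cap (inferred from the log).
        for (int retry = 1; retry <= 7; retry++) {
            System.out.println("Retry(" + retry + ") interval "
                    + intervalMs(retry, 10, 3, 3000) + "ms");
        }
    }
}
```

Even the capped 3000 ms interval is shorter than the 10 seconds implied by rate=0.1, so several retries in a row are expected to be rejected before one gets through.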

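To avoid this, edit /milvus/configs/milvus.yaml and raise quotaAndLimits.flushRate.collection.max. The default is 0.1, which matches rate=0.1 in the error and allows one flush per collection every 10 seconds; the listing above sets it to 10. As the stack trace shows, each MilvusEmbeddingStore.add() call ends in a flush when autoFlushOnInsert(true) is set, which is why the example sleeps between inserts.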
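
To make the numbers concrete, a rate of 0.1 requests per second means a minimum gap of 10 seconds between accepted flushes, while a rate of 10 means a gap of 100 ms. The sketch below simulates that with a fixed minimum interval. This is a simplification for illustration only: the real Milvus limiter may behave differently (for example by allowing bursts), and the class is hypothetical.

```java
final class FlushRateSketch {

    private final long minIntervalMs;
    private boolean acceptedBefore = false;
    private long lastAcceptedMs;

    FlushRateSketch(double ratePerSecond) {
        this.minIntervalMs = minIntervalMs(ratePerSecond);
    }

    // Minimum gap between accepted requests for a given rate, in milliseconds.
    static long minIntervalMs(double ratePerSecond) {
        if (ratePerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        return Math.round(1000.0 / ratePerSecond);
    }

    // Accepts the request only if enough time has passed since the last accepted one.
    boolean tryAcquire(long nowMs) {
        if (acceptedBefore && nowMs - lastAcceptedMs < minIntervalMs) {
            return false;
        }
        acceptedBefore = true;
        lastAcceptedMs = nowMs;
        return true;
    }

    public static void main(String[] args) {
        FlushRateSketch limiter = new FlushRateSketch(0.1);
        // Simulated flush timestamps: immediately, 1 s later, and 60 s later.
        long[] flushTimesMs = {0, 1_000, 60_000};
        for (long t : flushTimesMs) {
            System.out.println("flush at " + t + " ms accepted: " + limiter.tryAcquire(t));
        }
    }
}
```

In this simulation the flush one second after the first is rejected, while the one a minute later is accepted, which matches the behaviour seen with and without the 60-second sleeps in the example.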

Summary

langchain4j provides the langchain4j-milvus module for integrating with Milvus.
