MilvusUtils.readMilvusCollection - read data from specified partition #20

@meako689

I think it's related to #5

Hi, I'm trying to perform a data migration between two Milvus collections using spark-milvus.

The data is split into several partitions.
When I'm reading the data, I'm able to pass the MILVUS_PARTITION_NAME param (as it was recently added):

    import scala.collection.JavaConverters._
    import org.apache.spark.sql.util.CaseInsensitiveStringMap

    // Connection, collection, partition, and storage settings for the read
    val properties = Map(
      MilvusOptions.MILVUS_HOST -> host,
      MilvusOptions.MILVUS_PORT -> port.toString,
      MilvusOptions.MILVUS_COLLECTION_NAME -> collectionName,
      MilvusOptions.MILVUS_PARTITION_NAME -> partitionName,
      MilvusOptions.MILVUS_BUCKET -> bucketName,
      MilvusOptions.MILVUS_ROOTPATH -> rootPath,
      MilvusOptions.MILVUS_FS -> fs,
      MilvusOptions.MILVUS_STORAGE_ENDPOINT -> minioEndpoint,
      MilvusOptions.MILVUS_STORAGE_USER -> minioAK,
      MilvusOptions.MILVUS_STORAGE_PASSWORD -> minioSK
    )

    val milvusOptions = new MilvusOptions(
      new CaseInsensitiveStringMap(properties.asJava)
    )
    val collectionDF = MilvusUtils.readMilvusCollection(spark, milvusOptions)

However, the whole collection is loaded into the DataFrame instead of only the specified partition.

Can you please provide guidance on what has to be done to fix this behavior? I don't have expertise in the Milvus binlog format.
