Intelligent Media Management: Query file metadata

Last Updated: Dec 09, 2024

After you create a metadata index, you can query the metadata of one or more files by calling API operations of Intelligent Media Management (IMM). You can also query metadata by using field searches, fuzzy keyword searches, or natural language keyword searches. This topic describes how to query file metadata.

Prerequisites

A metadata index is created for your files based on your use scenario. For more information, see Create a metadata index.

Query methods

The following table describes the methods that you can use to query file metadata.

Method

Description

Query the metadata of a single file

Call the GetFileMeta operation to query the metadata of a single file.

Query the metadata of multiple files

Call the BatchGetFileMeta operation to query the metadata of multiple files at a time.

Perform a simple query

Call the SimpleQuery operation to query files that meet the specified conditions and list metadata by field in a specific sorting order.

You can also use nesting to perform complex queries and perform aggregation operations to collect statistics on and analyze the values of different fields. For a list of the supported fields and operators, see Supported fields and operators.

Perform a fuzzy search

Call the FuzzyQuery operation to query files that match the specified string and list file metadata.

IMM searches for the specified string within extracted metadata fields, such as the file name, label, path, or custom label. If one metadata field value of a file matches the specified string, all metadata of the file is returned.

Perform a natural language keyword search

Call the SemanticQuery operation to query metadata in a dataset based on natural language keywords.

The operation supports semantic searches based on the Labels, ProduceTime, and AddressLine fields. For example, to query the metadata of files that semantically relate to 'sky over Hangzhou', specify Query=sky over Hangzhou as the query condition.

Query the metadata of a single file

The following example searches the test-dataset dataset of the test-project project for the metadata of the oss://test-bucket/test-object.jpg object.

  • Sample request

    {
        "ProjectName": "test-project",
        "URI": "oss://test-bucket/test-object.jpg",
        "DatasetName": "test-dataset"
    }
                            
  • Sample response

    {
        "RequestId": "645FB6D9-5EA0-02C9-B253-****",
        "Files": [
            {
                "ProduceTime": "2020-08-19T17:11:11+08:00",
                "ObjectACL": "default",
                "ContentType": "image/jpeg",
                "ProjectName": "test-project",
                "Size": 22868,
                "URI": "oss://test-bucket/test-object.jpg",
                "Addresses": [
                    {
                        "Language": "zh-Hans",
                        "Township": "Tanggou Town",
                        "AddressLine": "Chenlongzhuang, Tanggou Town, Shuyang County, Suqian City, Jiangsu Province",
                        "Country": "China",
                        "City": "Suqian",
                        "District": "Shuyang",
                        "Province": "Jiangsu"
                    }
                ],
                "ObjectType": "file",
                "CustomLabels": {
                    "category": "Persons"
                },
                "OwnerId": "****",
                "FileModifiedTime": "2021-05-13T10:22:44+08:00",
                "ImageWidth": 270,
                "OSSStorageClass": "Standard",
                "MediaType": "image",
                "ObjectId": "****",
                "CreateTime": "2022-07-06T07:10:18.497753661+08:00",
                "Filename": "1.jpg",
                "Labels": [
                    {
                        "CentricScore": 0.921999990940094,
                        "Language": "zh-Hans",
                        "LabelConfidence": 1,
                        "LabelName": "Hairstyle",
                        "LabelLevel": 2,
                        "ParentLabelName": "Daily behavior"
                    },
                    ...
                ],
                "Orientation": 1,
                "Figures": [
                    {
                        "Beard": "none",
                        "MaskConfidence": 0.6959999799728394,
                        "Gender": "female",
                        "Boundary": {
                            "Left": 70,
                            "Top": 75,
                            "Height": 134,
                            "Width": 101
                        },
                        ...
                    }
                ],
                "EXIF": "...",
                "ContentMd5": "HZwoCnxPZ/fvhz4oRJ****",
                "ImageHeight": 270,
                "ImageScore": {
                    "OverallQualityScore": 0.6140000224113464
                },
                "ETag": "\"1D9C280A7C4F67F7EF873E28449D****\"",
                "DatasetName": "test-dataset",
                "FileHash": "\"1D9C280A7C4F67F7EF873E2****\"",
                "UpdateTime": "2022-07-06T07:10:18.497753661+08:00",
                "OSSCRC64": "5634447745650079669",
                "OSSTaggingCount": 0,
                "LatLong": "34.000000,119.000000",
                "OSSObjectType": "Normal"
            }
        ]
    }
  • Complete sample code (IMM SDK for Python V1.27.3)

    # -*- coding: utf-8 -*-
    
    import os
    from alibabacloud_imm20200930.client import Client as imm20200930Client
    from alibabacloud_tea_openapi import models as open_api_models
    from alibabacloud_imm20200930 import models as imm_20200930_models
    from alibabacloud_tea_util import models as util_models
    from alibabacloud_tea_util.client import Client as UtilClient
    
    
    class Sample:
        def __init__(self):
            pass
    
        @staticmethod
        def create_client(
            access_key_id: str,
            access_key_secret: str,
        ) -> imm20200930Client:
            """
            Use your AccessKey ID and AccessKey secret to initialize the client. 
            @param access_key_id:
            @param access_key_secret:
            @return: Client
            @throws Exception
            """
            config = open_api_models.Config(
                access_key_id=access_key_id,
                access_key_secret=access_key_secret
            )
            # Specify the endpoint. 
            config.endpoint = f'imm.cn-beijing.aliyuncs.com'
            return imm20200930Client(config)
    
        @staticmethod
        def main() -> None:
            # The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. 
            # For security reasons, we recommend that you do not embed your AccessKey pair in your project code. 
            # In this example, the AccessKey pair is obtained from the environment variables to implement identity verification for API access. For information about how to configure environment variables, visit https://www.alibabacloud.com/help/en/imm/developer-reference/configure-environment-variables. 
            imm_access_key_id = os.getenv("AccessKeyId")
            imm_access_key_secret = os.getenv("AccessKeySecret")
            client = Sample.create_client(imm_access_key_id, imm_access_key_secret)
            get_file_meta_request = imm_20200930_models.GetFileMetaRequest(
                project_name='test-project',
                dataset_name='test-dataset',
                uri='oss://test-bucket/test-object.jpg'
            )
            runtime = util_models.RuntimeOptions()
            try:
                # Print the response of the API operation. 
                response = client.get_file_meta_with_options(get_file_meta_request, runtime)
                print(response.body.to_map())
            except Exception as error:
                # Print the error message. Not all exceptions expose a message attribute, so fall back to the exception itself.
                print(getattr(error, 'message', error))
    
    
    if __name__ == '__main__':
        Sample.main()
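
The dictionary printed by response.body.to_map() has the same shape as the sample response above. The following sketch is only an illustration and is not part of the IMM SDK. It shows one way to reduce such a dictionary to a short summary. The helper name summarize_file_meta and the trimmed sample data are assumptions made for this example.

```python
# Illustrative helper (not part of the IMM SDK): summarize a GetFileMeta-style response.

def summarize_file_meta(body: dict) -> list:
    """Return one summary dict per entry in body["Files"]."""
    summaries = []
    for item in body.get("Files", []):
        summaries.append({
            "uri": item.get("URI"),
            "size_bytes": item.get("Size"),
            "content_type": item.get("ContentType"),
            # Labels and CustomLabels may be absent, so fall back to empty values.
            "labels": [label.get("LabelName") for label in item.get("Labels", [])],
            "custom_labels": dict(item.get("CustomLabels", {})),
        })
    return summaries


# Trimmed copy of the sample response above, used only as demo input.
sample = {
    "RequestId": "645FB6D9-5EA0-02C9-B253-****",
    "Files": [
        {
            "URI": "oss://test-bucket/test-object.jpg",
            "Size": 22868,
            "ContentType": "image/jpeg",
            "CustomLabels": {"category": "Persons"},
            "Labels": [
                {
                    "LabelName": "Hairstyle",
                    "LabelLevel": 2,
                    "ParentLabelName": "Daily behavior",
                }
            ],
        }
    ],
}

for summary in summarize_file_meta(sample):
    print(summary)
```

Reading fields with get() keeps the helper tolerant of files for which IMM did not extract a given field.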

Query the metadata of multiple files

The following example searches the test-dataset dataset of the test-project project for the metadata of the oss://test-bucket/test-object1.jpg and oss://test-bucket/test-object2.jpg objects.

  • Sample request

    {
        "ProjectName": "test-project",
        "DatasetName": "test-dataset",
        "URIs": "[\"oss://test-bucket/test-object1.jpg\", \"oss://test-bucket/test-object2.jpg\"]"
    }
  • Sample response (See the sample response in the Query the metadata of a single file section)

  • Complete sample code (IMM SDK for Python V1.27.3)

    # -*- coding: utf-8 -*-
    
    import os
    from alibabacloud_imm20200930.client import Client as imm20200930Client
    from alibabacloud_tea_openapi import models as open_api_models
    from alibabacloud_imm20200930 import models as imm_20200930_models
    from alibabacloud_tea_util import models as util_models
    from alibabacloud_tea_util.client import Client as UtilClient
    
    
    class Sample:
        def __init__(self):
            pass
    
        @staticmethod
        def create_client(
            access_key_id: str,
            access_key_secret: str,
        ) -> imm20200930Client:
            """
            Use your AccessKey ID and AccessKey secret to initialize the client. 
            @param access_key_id:
            @param access_key_secret:
            @return: Client
            @throws Exception
            """
            config = open_api_models.Config(
                access_key_id=access_key_id,
                access_key_secret=access_key_secret
            )
            # Specify the endpoint. 
            config.endpoint = f'imm.cn-beijing.aliyuncs.com'
            return imm20200930Client(config)
    
        @staticmethod
        def main() -> None:
            # The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. 
            # For security reasons, we recommend that you do not embed your AccessKey pair in your project code. 
            # In this example, the AccessKey pair is obtained from the environment variables to implement identity verification for API access. For information about how to configure environment variables, visit https://www.alibabacloud.com/help/en/imm/developer-reference/configure-environment-variables. 
            imm_access_key_id = os.getenv("AccessKeyId")
            imm_access_key_secret = os.getenv("AccessKeySecret")
            client = Sample.create_client(imm_access_key_id, imm_access_key_secret)
            batch_get_file_meta_request = imm_20200930_models.BatchGetFileMetaRequest(
                project_name='test-project',
                dataset_name='test-dataset',
                uris=[
                    'oss://test-bucket/test-object1.jpg',
                    'oss://test-bucket/test-object2.jpg'
                ]
            )
            runtime = util_models.RuntimeOptions()
            try:
                # Print the response of the API operation. 
                response = client.batch_get_file_meta_with_options(batch_get_file_meta_request, runtime)
                print(response.body.to_map())
            except Exception as error:
                # Print the error message. Not all exceptions expose a message attribute, so fall back to the exception itself.
                print(getattr(error, 'message', error))
    
    
    if __name__ == '__main__':
        Sample.main()
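
In the sample request, the URIs parameter is a JSON-encoded string rather than a JSON array, whereas the SDK sample passes a Python list. If you assemble raw request parameters yourself, a sketch like the following can produce that string. The helper name build_batch_params is hypothetical and is not part of the IMM SDK.

```python
import json


def build_batch_params(project_name: str, dataset_name: str, uris: list) -> dict:
    """Build request parameters in the shape of the sample request (hypothetical helper)."""
    return {
        "ProjectName": project_name,
        "DatasetName": dataset_name,
        # Encode the list of URIs as a JSON string, as shown in the sample request.
        "URIs": json.dumps(uris),
    }


uris = [
    "oss://test-bucket/test-object1.jpg",
    "oss://test-bucket/test-object2.jpg",
]
params = build_batch_params("test-project", "test-dataset", uris)
print(json.dumps(params, indent=4))
```

Using json.dumps avoids hand-written escaping errors such as a missing quotation mark or backslash.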

Perform a simple query

Example 1

The query uses the following conditions:

  • Project name: test-project

  • Dataset name: test-dataset

  • File type: image

  • Query result sorting: in the ascending order of file size

  • Maximum number of returned query results: 100

The following example illustrates this query:

  • Sample request

    {
        "Query": "{\"Field\": \"ContentType\", \"Operation\": \"prefix\", \"Value\": \"image\"}",
        "ProjectName": "test-project",
        "DatasetName": "test-dataset",
        "Sort": "Size",
        "Order": "asc",
        "MaxResults": 100
    }
  • Sample response (See the sample response in the Query the metadata of a single file section)

  • Complete sample code (IMM SDK for Python V1.27.3)

    # -*- coding: utf-8 -*-
    
    import os
    from alibabacloud_imm20200930.client import Client as imm20200930Client
    from alibabacloud_tea_openapi import models as open_api_models
    from alibabacloud_imm20200930 import models as imm_20200930_models
    from alibabacloud_tea_util import models as util_models
    from alibabacloud_tea_util.client import Client as UtilClient
    
    
    class Sample:
        def __init__(self):
            pass
    
        @staticmethod
        def create_client(
            access_key_id: str,
            access_key_secret: str,
        ) -> imm20200930Client:
            """
            Use your AccessKey ID and AccessKey secret to initialize the client. 
            @param access_key_id:
            @param access_key_secret:
            @return: Client
            @throws Exception
            """
            config = open_api_models.Config(
                access_key_id=access_key_id,
                access_key_secret=access_key_secret
            )
            config.endpoint = f'imm.cn-beijing.aliyuncs.com'
            return imm20200930Client(config)
    
        @staticmethod
        def main() -> None:
            # The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. 
            # For security reasons, we recommend that you do not embed your AccessKey pair in your project code. 
            # In this example, the AccessKey pair is obtained from the environment variables to implement identity verification for API access. For information about how to configure environment variables, visit https://www.alibabacloud.com/help/en/imm/developer-reference/configure-environment-variables. 
            imm_access_key_id = os.getenv("AccessKeyId")
            imm_access_key_secret = os.getenv("AccessKeySecret")
            client = Sample.create_client(imm_access_key_id, imm_access_key_secret)
            request = imm_20200930_models.SimpleQueryRequest()
            params = {
                # Specify the query conditions 
                "Query": {"Field": "ContentType", "Operation": "prefix", "Value": "image"},
                # Specify the name of the IMM project.
                "ProjectName": "test-project",
                # Specify the name of the dataset.
                "DatasetName": "test-dataset",
                # Specify the sorting field. 
                "Sort": "Size",
                # Specify the sorting order. 
                "Order": "asc",
                # Set the maximum number of query results to 100. 
                "MaxResults": 100
            }
            request.from_map(params)
            runtime = util_models.RuntimeOptions()
            try:
                # Print the response of the API operation. 
                response = client.simple_query_with_options(request, runtime)
                print(response.body.to_map())
            except Exception as error:
                # Print the error message. Not all exceptions expose a message attribute, so fall back to the exception itself.
                print(getattr(error, 'message', error))
    
    
    if __name__ == '__main__':
        Sample.main()

Example 2

The query uses the following conditions:

  • Project name: test-project

  • Dataset name: test-dataset

  • File type: image

  • File size: greater than 10 MB

  • Custom labels (CustomLabels.category): Persons

  • Query result sorting: in the ascending order of file size

  • Maximum number of returned query results: 100

The following example illustrates this query:

  • Sample request

    {
        "Query": "{\"SubQueries\": [{\"Field\": \"ContentType\", \"Operation\": \"prefix\", \"Value\": \"image\"}, {\"Field\": \"Size\", \"Operation\": \"gt\", \"Value\": \"10485760\"}, {\"Field\": \"CustomLabels.category\", \"Operation\": \"eq\", \"Value\": \"Persons\"}], \"Operation\": \"and\"}",
        "ProjectName": "test-project",
        "DatasetName": "test-dataset",
        "Sort": "Size",
        "Order": "asc",
        "MaxResults": 100
    }
  • Sample response (See the sample response in the Query the metadata of a single file section)

Example 3

The query uses the following conditions:

  • Project name: test-project

  • Dataset name: test-dataset

  • File path: oss://test-bucket/

  • File size: greater than 10 MB

  • Labels (Labels.LabelName): "TV" or "Stereo"

  • Query result sorting: in the ascending order of file size

  • Maximum number of returned query results: 100

The following example illustrates this query:

  • Sample request

    {
        "Query": "{\"SubQueries\":[{\"Field\":\"URI\",\"Value\":\"oss://test-bucket/\",\"Operation\":\"prefix\"},{\"Field\":\"Size\",\"Value\":\"10485760\",\"Operation\":\"gt\"},{\"SubQueries\":[{\"Field\":\"Labels.LabelName\",\"Value\":\"TV\",\"Operation\":\"eq\"},{\"Field\":\"Labels.LabelName\",\"Value\":\"Stereo\",\"Operation\":\"eq\"}],\"Operation\":\"or\"}],\"Operation\":\"and\"}",
        "ProjectName": "test-project",
        "DatasetName": "test-dataset",
        "Sort": "Size",
        "Order": "asc",
        "MaxResults": 100
    }
  • Sample response (See the sample response in the Query the metadata of a single file section)
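
Writing the escaped Query string by hand becomes error-prone once conditions are nested. The following sketch builds the Example 3 query from small Python helpers and serializes it with json.dumps. The helper names condition and combine are hypothetical and are not part of the IMM SDK. The field names and operators are taken from the sample request above.

```python
import json


def condition(field: str, operation: str, value=None) -> dict:
    """Build a single-field condition. Values are sent as strings, as in the samples."""
    node = {"Field": field, "Operation": operation}
    if value is not None:
        node["Value"] = str(value)
    return node


def combine(operation: str, *sub_queries: dict) -> dict:
    """Combine conditions or nested groups with a logical operation such as and/or."""
    return {"Operation": operation, "SubQueries": list(sub_queries)}


# Files under oss://test-bucket/ that are larger than 10 MB
# and carry the label "TV" or "Stereo".
query = combine(
    "and",
    condition("URI", "prefix", "oss://test-bucket/"),
    condition("Size", "gt", 10 * 1024 * 1024),
    combine(
        "or",
        condition("Labels.LabelName", "eq", "TV"),
        condition("Labels.LabelName", "eq", "Stereo"),
    ),
)

# This string can be used as the value of the Query request parameter.
query_string = json.dumps(query)
print(query_string)
```

The key order of the generated JSON differs from the sample request, but the decoded structure is the same.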

Example 4

The query uses the following conditions:

  • Project name: test-project

  • Dataset name: test-dataset

  • File type: JPEG image (ContentType is image/jpeg)

  • File size: greater than 10 MB

  • Custom labels (CustomLabels.category): Persons

  • Returned result: the total size of matched files

The following example illustrates this query:

  • Sample request

    {
        "Query": "{\"SubQueries\": [{\"Field\": \"ContentType\", \"Operation\": \"eq\", \"Value\": \"image/jpeg\"}, {\"Field\": \"Size\", \"Operation\": \"gt\", \"Value\": \"10485760\"}, {\"Field\": \"CustomLabels.category\", \"Operation\": \"eq\", \"Value\": \"Persons\"}], \"Operation\": \"and\"}",
        "ProjectName": "test-project",
        "DatasetName": "test-dataset",
        "Aggregations": "[{\"Field\":\"Size\",\"Operation\":\"sum\"}]"
    }
  • Sample response

    {
        "RequestId": "0FB9BA35-E16B-0DFE-BD52-****",
        "Aggregations": [
            {
                "Field": "Size",
                "Value": 10485760,
                "Operation": "sum"
            }
        ]
    }
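
The Aggregations request parameter is a JSON-encoded string, and the response returns one entry per requested aggregation. The following sketch shows one possible way to build the parameter and to read a value from a response dictionary. The helper names build_aggregations and aggregation_value are hypothetical, and the demo input is a copy of the sample response above.

```python
import json


def build_aggregations(*pairs: tuple) -> str:
    """Encode (field, operation) pairs as the Aggregations request parameter."""
    items = [{"Field": field, "Operation": operation} for field, operation in pairs]
    return json.dumps(items, separators=(",", ":"))


def aggregation_value(body: dict, field: str, operation: str):
    """Return the value of the matching aggregation, or None if it is not present."""
    for item in body.get("Aggregations", []):
        if item.get("Field") == field and item.get("Operation") == operation:
            return item.get("Value")
    return None


aggregations = build_aggregations(("Size", "sum"))
print(aggregations)

# Copy of the sample response above, used only as demo input.
sample_response = {
    "RequestId": "0FB9BA35-E16B-0DFE-BD52-****",
    "Aggregations": [
        {"Field": "Size", "Value": 10485760, "Operation": "sum"}
    ],
}

total_bytes = aggregation_value(sample_response, "Size", "sum")
print(f"{total_bytes / (1024 * 1024):.1f} MiB")
```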

Example 5

The query uses the following conditions:

  • Project name: test-project

  • Dataset name: test-dataset

  • File size: greater than 10 MB

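  • Faces (Figures.Age and Figures.Gender): the sample request combines an age condition (greater than 36) and a gender condition (male) by using the nested and not operations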

  • Query result sorting: in the ascending order of file size

  • Maximum number of returned query results: 100

The following example illustrates this query:

  • Sample request

    {
        "Query": "{\"Operation\":\"not\",\"SubQueries\":[{\"Operation\":\"nested\",\"SubQueries\":[{\"Operation\":\"and\",\"SubQueries\":[{\"Field\":\"Figures.Age\",\"Operation\":\"gt\",\"Value\":\"36\"},{\"Field\":\"Figures.Gender\",\"Operation\":\"eq\",\"Value\":\"male\"},{\"Field\":\"Size\",\"Operation\":\"gt\",\"Value\":\"10485760\"}]}]}]}",
        "ProjectName": "test-project",
        "DatasetName": "test-dataset",
        "Sort": "Size",
        "Order": "asc",
        "MaxResults": 100
    }
  • Sample response (See the sample response in the Query the metadata of a single file section)
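
The escaped Query string in the sample request is hard to read at a glance. The following sketch decodes the string and prints it as an indented outline so that the nesting of the not, nested, and and operations is visible. The function name outline is hypothetical. The sketch only reformats the query and does not evaluate it.

```python
import json

# The Query value from the sample request above, without the outer JSON escaping.
query_string = (
    '{"Operation":"not","SubQueries":[{"Operation":"nested","SubQueries":['
    '{"Operation":"and","SubQueries":['
    '{"Field":"Figures.Age","Operation":"gt","Value":"36"},'
    '{"Field":"Figures.Gender","Operation":"eq","Value":"male"},'
    '{"Field":"Size","Operation":"gt","Value":"10485760"}]}]}]}'
)


def outline(node: dict, depth: int = 0) -> list:
    """Return the query tree as a list of indented text lines."""
    indent = "  " * depth
    if "SubQueries" in node:
        # Group node: print the operation, then each child one level deeper.
        lines = [f"{indent}{node['Operation']}"]
        for sub_query in node["SubQueries"]:
            lines.extend(outline(sub_query, depth + 1))
        return lines
    # Leaf node: field, operator, and the value if one is present.
    return [f"{indent}{node['Field']} {node['Operation']} {node.get('Value', '')}".rstrip()]


print("\n".join(outline(json.loads(query_string))))
```

The outline shows not at the top level, nested below it, and the three field conditions grouped under and.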

Example 6

The query uses the following conditions:

  • Project name: test-project

  • Dataset name: test-dataset

  • File type: JPEG image (ContentType is image/jpeg)

  • File size: greater than 10 MB

  • Custom labels (CustomLabels.category): existence

  • Returned result: the total size of matched files

The following example illustrates this query:

  • Sample request

    {
        "Query": "{\"SubQueries\": [{\"Field\": \"ContentType\", \"Operation\": \"eq\", \"Value\": \"image/jpeg\"}, {\"Field\": \"Size\", \"Operation\": \"gt\", \"Value\": \"10485760\"}, {\"Field\": \"CustomLabels.category\", \"Operation\": \"exist\"}], \"Operation\": \"and\"}",
        "ProjectName": "test-project",
        "DatasetName": "test-dataset",
        "Aggregations": "[{\"Field\":\"Size\",\"Operation\":\"sum\"}]"
    }
  • Sample response

    {
        "RequestId": "0FB9BA35-E16B-0DFE-BD52-****",
        "Aggregations": [
            {
                "Field": "Size",
                "Value": 10485760,
                "Operation": "sum"
            }
        ]
    }

Perform a fuzzy search

The following example searches the test-dataset dataset of the test-project project for the metadata of files that match the jpg string:

  • Sample request

    {
        "ProjectName": "test-project",
        "DatasetName": "test-dataset",
        "Query": "jpg"
    }
  • Sample response (See the sample response in the Query the metadata of a single file section)

  • Complete sample code (IMM SDK for Python V1.27.3)

    # -*- coding: utf-8 -*-
    
    import os
    from alibabacloud_imm20200930.client import Client as imm20200930Client
    from alibabacloud_tea_openapi import models as open_api_models
    from alibabacloud_imm20200930 import models as imm_20200930_models
    from alibabacloud_tea_util import models as util_models
    from alibabacloud_tea_util.client import Client as UtilClient
    
    
    class Sample:
        def __init__(self):
            pass
    
        @staticmethod
        def create_client(
            access_key_id: str,
            access_key_secret: str,
        ) -> imm20200930Client:
            """
            Use your AccessKey ID and AccessKey secret to initialize the client. 
            @param access_key_id:
            @param access_key_secret:
            @return: Client
            @throws Exception
            """
            config = open_api_models.Config(
                access_key_id=access_key_id,
                access_key_secret=access_key_secret
            )
            config.endpoint = f'imm.cn-beijing.aliyuncs.com'
            return imm20200930Client(config)
    
        @staticmethod
        def main() -> None:
            # The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. 
            # For security reasons, we recommend that you do not embed your AccessKey pair in your project code. 
            # In this example, the AccessKey pair is obtained from the environment variables to implement identity verification for API access. For information about how to configure environment variables, visit https://www.alibabacloud.com/help/en/imm/developer-reference/configure-environment-variables. 
            imm_access_key_id = os.getenv("AccessKeyId")
            imm_access_key_secret = os.getenv("AccessKeySecret")
            client = Sample.create_client(imm_access_key_id, imm_access_key_secret)
            fuzzy_query_request = imm_20200930_models.FuzzyQueryRequest(
                # Specify the name of the IMM project. 
                project_name='test-project',
                # Specify the name of the dataset. 
                dataset_name='test-dataset',
                # Specify the keyword. 
                query='jpg'
            )
            runtime = util_models.RuntimeOptions()
            try:
                # Print the response of the API operation. 
                response = client.fuzzy_query_with_options(fuzzy_query_request, runtime)
                print(response.body.to_map())
            except Exception as error:
                # Print the error message. Not all exceptions expose a message attribute, so fall back to the exception itself.
                print(getattr(error, 'message', error))
    
    
    if __name__ == '__main__':
        Sample.main()

Perform a natural language keyword search

The following example searches the test-dataset dataset of the test-project project for panda photos that were taken in Chengdu in July 2020:

  • Sample request

    {
        "ProjectName": "test-project",
        "DatasetName": "test-dataset",
        "Query": "Pandas in Chengdu in July 2020",
        "MaxResults": 100
    }
  • Sample response (See the sample response in the Query the metadata of a single file section)

  • Complete sample code (IMM SDK for Python V1.27.3)

    # -*- coding: utf-8 -*-
    
    import os
    from alibabacloud_imm20200930.client import Client as imm20200930Client
    from alibabacloud_tea_openapi import models as open_api_models
    from alibabacloud_imm20200930 import models as imm_20200930_models
    from alibabacloud_tea_util import models as util_models
    from alibabacloud_tea_util.client import Client as UtilClient
    
    
    class Sample:
        def __init__(self):
            pass
    
        @staticmethod
        def create_client(
            access_key_id: str,
            access_key_secret: str,
        ) -> imm20200930Client:
            """
            Use your AccessKey ID and AccessKey secret to initialize the client. 
            @param access_key_id:
            @param access_key_secret:
            @return: Client
            @throws Exception
            """
            config = open_api_models.Config(
                access_key_id=access_key_id,
                access_key_secret=access_key_secret
            )
            config.endpoint = f'imm.cn-beijing.aliyuncs.com'
            return imm20200930Client(config)
    
        @staticmethod
        def main() -> None:
            # The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. 
            # For security reasons, we recommend that you do not embed your AccessKey pair in your project code. 
            # In this example, the AccessKey pair is obtained from the environment variables to implement identity verification for API access. For information about how to configure environment variables, visit https://www.alibabacloud.com/help/en/imm/developer-reference/configure-environment-variables. 
            imm_access_key_id = os.getenv("AccessKeyId")
            imm_access_key_secret = os.getenv("AccessKeySecret")
            client = Sample.create_client(imm_access_key_id, imm_access_key_secret)
            semantic_query_request = imm_20200930_models.SemanticQueryRequest(
                query='Pandas in Chengdu in July 2020',
                project_name='test-project',
                dataset_name='test-dataset',
                max_results=100
            )
            runtime = util_models.RuntimeOptions()
            try:
                # Print the response of the API operation. 
                response = client.semantic_query_with_options(semantic_query_request, runtime)
                print(response.body.to_map())
            except Exception as error:
                # Print the error message. Not all exceptions expose a message attribute, so fall back to the exception itself.
                print(getattr(error, 'message', error))
    
    
    if __name__ == '__main__':
        Sample.main()
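
The samples in this topic read the AccessKey pair by calling os.getenv, which returns None when a variable is not set. As an optional addition, a check like the following sketch can report a missing variable before the client is created. The helper name read_access_key is hypothetical, and the values in demo_env are placeholders rather than real credentials.

```python
import os


def read_access_key(env=None) -> tuple:
    """Return (AccessKeyId, AccessKeySecret) or raise if either variable is missing."""
    # Default to the process environment. A dict can be passed in for testing.
    env = os.environ if env is None else env
    names = ("AccessKeyId", "AccessKeySecret")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["AccessKeyId"], env["AccessKeySecret"]


# Placeholder values for demonstration only.
demo_env = {"AccessKeyId": "example-id", "AccessKeySecret": "example-secret"}
access_key_id, access_key_secret = read_access_key(demo_env)
print(access_key_id)
```

In the samples, the returned pair could be passed to Sample.create_client in place of the two os.getenv calls.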