
Assets Registry

Introduction

The Assets Registry is a backend service framework designed to manage the lifecycle of digital assets through a structured upload, registration, and retrieval process. It supports bulk ingestion of asset metadata and associated binary files using ZIP-based transport, persistent storage in object stores (such as Amazon S3 or Ceph), and metadata indexing via a MongoDB-backed registry.

This system is built for environments that require scalable, auditable, and modular asset handling, including AI models, datasets, configurations, and other domain-specific artifacts.

Key functionalities include:

  • Streaming ZIP Uploads: Asset bundles containing metadata and files are uploaded as ZIP archives. These archives are parsed and processed in a non-blocking manner.
  • Object Storage Integration: Binary files extracted from the ZIP are uploaded to S3-compatible storage, and URLs are embedded into the asset metadata.
  • Metadata Registration: Extracted metadata is submitted to the asset registry via structured API endpoints and persisted in MongoDB.
  • Dynamic ZIP Downloads: Assets can be retrieved by asset ID and streamed back as ZIP archives constructed on demand from stored file references.

The system is designed with modularity in mind, enabling it to be adapted to various storage backends, validation workflows, and integration protocols. It exposes both REST and GraphQL APIs for asset registration, querying, and lifecycle management.


Architecture

The Assets Registry architecture is designed for modular, scalable ingestion and retrieval of asset metadata and associated files. It separates responsibilities across storage, processing, and indexing layers while maintaining stateless, asynchronous, and streaming-compatible interfaces.

The system is composed of several interconnected modules that coordinate to provide full asset lifecycle support through REST and GraphQL APIs, as well as ZIP-based upload and download pipelines.

[Figure: assets-db]

System Overview

The architecture consists of the following key layers:

  1. API Layer
       • Exposes REST and GraphQL endpoints for asset registration, querying, upload, and download.
       • Handles validation, background job management, and request routing.

  2. Processing Layer
       • Manages ZIP stream parsing, file upload to S3-compatible storage, and metadata extraction.
       • Supports asynchronous background threads for long-running uploads.

  3. Storage Layer
       • Persists metadata in MongoDB collections.
       • Stores large files in external object storage (Amazon S3 or Ceph).
       • Tracks upload status in Redis.

  4. Streaming Layer
       • Uses in-memory file-like objects to stream ZIP files in and out.
       • Enables download of reconstructed ZIPs directly from stored metadata and file URLs.

Upload Architecture

Upload Flow (POST /zip/upload):

  1. Client uploads a ZIP archive containing asset.json and files.
  2. The API spawns a background thread and immediately returns an upload_id.
  3. The StreamingZipParser:
       • Extracts and parses asset.json.
       • Streams each file to S3UploaderPlugin, which returns a file URL.
  4. The URLs are embedded into metadata.
  5. WriteAPIClient sends the full metadata to the Assets Create API (POST /assets).
  6. Upload status is tracked using Redis (UPLOAD_STATUS:<id>).
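
The upload steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the registry's actual code: process_bundle, run_upload, the upload_file and register callables, and the plain dict standing in for Redis are all hypothetical, and the shape of the embedded file entries is a guess. It also reads the archive from memory rather than streaming it.

```python
import io
import json
import zipfile
from typing import Callable, Dict, MutableMapping


def process_bundle(zip_bytes: bytes, upload_file: Callable[[str, bytes], str]) -> Dict:
    """Parse asset.json, upload every other entry, and embed the returned URLs."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        metadata = json.loads(archive.read("asset.json"))
        files = metadata.setdefault("files", [])
        for name in archive.namelist():
            if name == "asset.json" or name.endswith("/"):
                continue  # skip the metadata file and directory entries
            url = upload_file(name, archive.read(name))  # stands in for S3UploaderPlugin
            # Assumption: the entry name doubles as the file ID.
            files.append({"asset_file_id": name, "asset_file_url": url})
    return metadata


def run_upload(
    upload_id: str,
    zip_bytes: bytes,
    upload_file: Callable[[str, bytes], str],
    register: Callable[[Dict], None],
    status: MutableMapping[str, str],
) -> None:
    """Background-job body: record progress under UPLOAD_STATUS:<id>."""
    key = f"UPLOAD_STATUS:{upload_id}"
    status[key] = "processing"
    try:
        register(process_bundle(zip_bytes, upload_file))  # stands in for WriteAPIClient
        status[key] = "success"
    except Exception as exc:
        status[key] = f"failed: {exc}"
```

In a real deployment run_upload would be handed to a background thread, and status would be a Redis client rather than a dict.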

Download Architecture

Download Flow (GET /zip/download/<asset_id>):

  1. Client requests a ZIP archive for a given asset_id.
  2. The ReadAPIClient fetches metadata from the registry.
  3. The S3DownloaderPlugin streams each file from object storage.
  4. The StreamingZipArchiver dynamically constructs a ZIP stream including:
       • All file entries
       • A synthesized asset.json containing metadata
  5. The ZIP is streamed directly to the client using Flask’s Response.
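
As a rough sketch of the download path, the function below rebuilds an archive from a metadata document and a file fetcher. The names build_asset_zip and fetch_file are hypothetical, using asset_file_id as the archive entry name is an assumption, and the whole ZIP is assembled in memory here instead of being streamed chunk by chunk.

```python
import io
import json
import zipfile
from typing import Callable, Dict


def build_asset_zip(metadata: Dict, fetch_file: Callable[[str], bytes]) -> bytes:
    """Assemble a ZIP holding each referenced file plus a synthesized asset.json."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for entry in metadata.get("files", []):
            # fetch_file stands in for S3DownloaderPlugin.
            content = fetch_file(entry["asset_file_url"])
            archive.writestr(entry["asset_file_id"], content)
        archive.writestr("asset.json", json.dumps(metadata, indent=2))
    return buffer.getvalue()
```

The resulting bytes could then be returned from a Flask view wrapped in a Response with the application/zip MIME type.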

Integration Points

| Component | Role |
| --- | --- |
| Flask | REST + GraphQL web API layer |
| MongoDB | Persistent metadata store |
| Redis | Upload job tracking (non-blocking uploads) |
| Amazon S3 / Ceph | Large binary file storage |
| zipfile, io.BytesIO | ZIP stream parsing and generation |

Schema

This section defines the core data model used in the Assets Registry. The registry represents an asset as a structured document that includes metadata, policies, API specifications, files, and indexing instructions. Each entity is modeled using Python @dataclass structures and stored as part of a unified asset document in MongoDB.


Asset

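from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# The embedded types (AssetProfile, AssetPolicy, AssetFile, AssetAPI, IndexMapping)
# are documented below; in a single module they must be defined before Asset.
@dataclass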
class Asset:
    asset_id: str
    asset_uri: str
    asset_version: str
    asset_profile_id: str
    asset_file_ids: List[str]
    asset_container_uri: Optional[str] = None
    asset_policy_ids: List[str] = field(default_factory=list)
    asset_container_registry_creds_config: Optional[Dict[str, Any]] = None
    asset_workflow_id: Optional[str] = None
    asset_api_ids: List[str] = field(default_factory=list)
    asset_brief_description: Optional[str] = None
    profiles: List[AssetProfile] = field(default_factory=list)
    policies: List[AssetPolicy] = field(default_factory=list)
    files: List[AssetFile] = field(default_factory=list)
    apis: List[AssetAPI] = field(default_factory=list)
    index_mappings: List[IndexMapping] = field(default_factory=list)

| Field | Type | Description |
| --- | --- | --- |
| asset_id | str | Unique identifier for the asset |
| asset_uri | str | URI pointing to the asset namespace or logical location |
| asset_version | str | Version string of the asset |
| asset_profile_id | str | Associated profile ID |
| asset_file_ids | List[str] | List of file IDs referenced within the asset |
| asset_container_uri | Optional[str] | URI for asset container or Docker image |
| asset_policy_ids | List[str] | List of associated policy IDs |
| asset_container_registry_creds_config | Optional[Dict] | Credentials config for pulling the container from a private registry |
| asset_workflow_id | Optional[str] | Workflow ID linked to this asset |
| asset_api_ids | List[str] | List of asset API definition IDs |
| asset_brief_description | Optional[str] | Human-readable description of the asset |
| profiles | List[AssetProfile] | Embedded profile descriptions |
| policies | List[AssetPolicy] | Embedded policy definitions |
| files | List[AssetFile] | Files stored in S3 and referenced in metadata |
| apis | List[AssetAPI] | API interface specifications for the asset |
| index_mappings | List[IndexMapping] | Page-level index mapping for document rendering |

AssetProfile

@dataclass
class AssetProfile:
    asset_profile_id: str
    asset_type: str
    asset_sub_type: str
    asset_id: str
    asset_metadata: Optional[Dict[str, Any]] = None
    asset_creator_info: Optional[Dict[str, Any]] = None
    asset_tags: Optional[List[str]] = field(default_factory=list)
    asset_description: Optional[str] = None
    asset_complete_docs_url: Optional[str] = None
    asset_man_page_url: Optional[str] = None
    asset_sample_input_json: Optional[Dict[str, Any]] = None
    asset_sample_output_json: Optional[Dict[str, Any]] = None
    asset_sample_input_data_url: Optional[str] = None
    asset_sample_output_data_url: Optional[str] = None
    asset_author_metadata: Optional[Dict[str, Any]] = None

| Field | Type | Description |
| --- | --- | --- |
| asset_profile_id | str | Unique ID for the profile |
| asset_type | str | Asset's top-level category (e.g., model, data, tool) |
| asset_sub_type | str | Asset's sub-type (e.g., transformer, image) |
| asset_id | str | Back-reference to the parent asset |
| asset_metadata | Optional[Dict] | Arbitrary metadata about the asset |
| asset_creator_info | Optional[Dict] | Metadata about the asset creator |
| asset_tags | Optional[List[str]] | Searchable tag list |
| asset_description | Optional[str] | Detailed textual description |
| asset_complete_docs_url | Optional[str] | Link to complete documentation |
| asset_man_page_url | Optional[str] | Link to man-page or quickstart docs |
| asset_sample_input_json | Optional[Dict] | Example input for the asset |
| asset_sample_output_json | Optional[Dict] | Example output |
| asset_sample_input_data_url | Optional[str] | Link to sample input data file |
| asset_sample_output_data_url | Optional[str] | Link to sample output data file |
| asset_author_metadata | Optional[Dict] | Information about the author(s) |

AssetPolicy

@dataclass
class AssetPolicy:
    asset_policy_id: str
    asset_id: str
    asset_policy_type: str
    asset_policy_rule_uri: Optional[str] = None
    asset_policy_rule_config: Optional[Dict[str, Any]] = None
    asset_policy_rule_params: Optional[Dict[str, Any]] = None

| Field | Type | Description |
| --- | --- | --- |
| asset_policy_id | str | Unique policy identifier |
| asset_id | str | Parent asset ID |
| asset_policy_type | str | Type of policy (e.g., access, runtime, audit) |
| asset_policy_rule_uri | Optional[str] | External URI for policy code |
| asset_policy_rule_config | Optional[Dict] | Policy-specific configuration |
| asset_policy_rule_params | Optional[Dict] | Parameters to apply during policy evaluation |

AssetFile

@dataclass
class AssetFile:
    asset_file_id: str
    asset_file_type: str
    asset_file_mime_type: str
    asset_file_url: str
    asset_file_metadata: Optional[Dict[str, Any]] = None

| Field | Type | Description |
| --- | --- | --- |
| asset_file_id | str | Unique identifier for the file |
| asset_file_type | str | Logical type (e.g., weights, config) |
| asset_file_mime_type | str | File's MIME type (e.g., application/zip) |
| asset_file_url | str | Download URL (usually S3/HTTPS) |
| asset_file_metadata | Optional[Dict] | Optional file-specific metadata |

AssetAPI

@dataclass
class AssetAPI:
    asset_api_id: str
    asset_id: str
    asset_api_metadata: Optional[Dict[str, Any]] = None
    asset_api_svc: Optional[str] = None
    asset_api_route: Optional[str] = None
    asset_api_protocol: Optional[str] = None
    asset_protocol_specific_config: Optional[Dict[str, Any]] = None
    asset_api_man_page: Optional[str] = None
    asset_api_swagger_doc: Optional[str] = None
    asset_api_usage_samples: Optional[List[Dict[str, Any]]] = field(default_factory=list)

| Field | Type | Description |
| --- | --- | --- |
| asset_api_id | str | Unique API identifier |
| asset_id | str | Parent asset reference |
| asset_api_metadata | Optional[Dict] | General API metadata |
| asset_api_svc | Optional[str] | Backend service name |
| asset_api_route | Optional[str] | API route or endpoint path |
| asset_api_protocol | Optional[str] | Protocol used (e.g., HTTP, gRPC) |
| asset_protocol_specific_config | Optional[Dict] | Configuration for protocol handling |
| asset_api_man_page | Optional[str] | Documentation link |
| asset_api_swagger_doc | Optional[str] | Swagger/OpenAPI specification link |
| asset_api_usage_samples | Optional[List[Dict]] | Example requests/responses |

IndexMapping

@dataclass
class IndexMapping:
    json_doc_id: str
    mapping_field_index: int
    table_name: str
    field_name: str
    render_page_no: int
    render_order_no: int

| Field | Type | Description |
| --- | --- | --- |
| json_doc_id | str | ID of the document containing this field |
| mapping_field_index | int | Index of the field in the JSON document |
| table_name | str | Table or logical structure name for rendering |
| field_name | str | Name of the specific field |
| render_page_no | int | Page number to render the field on (in UI/PDF) |
| render_order_no | int | Order in which to render the field on the target page |
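
To illustrate how these dataclasses relate to the unified document stored in MongoDB, the sketch below round-trips a plain dict through a trimmed Asset and back. The Asset here carries only a few of the fields listed above, and asset_from_dict is a hypothetical helper introduced for this example; the registry's own conversion code may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class AssetFile:
    asset_file_id: str
    asset_file_type: str
    asset_file_mime_type: str
    asset_file_url: str
    asset_file_metadata: Optional[Dict[str, Any]] = None


@dataclass
class Asset:  # trimmed to a handful of fields for illustration
    asset_id: str
    asset_uri: str
    asset_version: str
    asset_profile_id: str
    asset_file_ids: List[str]
    files: List[AssetFile] = field(default_factory=list)


def asset_from_dict(doc: Dict[str, Any]) -> Asset:
    """Hypothetical helper: rebuild nested AssetFile objects from plain dicts."""
    data = dict(doc)  # shallow copy so the caller's dict is not mutated
    data["files"] = [AssetFile(**f) for f in data.get("files", [])]
    return Asset(**data)
```

Calling asdict on the result yields a nested dict again, which is the form a document store would accept.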

Create, Delete, and Update APIs

The Assets Registry provides RESTful endpoints for managing the lifecycle of assets. This section documents the endpoints used to create, update, and delete assets.

Each endpoint accepts and returns data in JSON format. All requests should use the Content-Type: application/json header unless otherwise specified.


POST /assets

Creates a new asset by submitting a complete asset document. The asset must include the required top-level fields and embedded metadata components (e.g., files, profiles).

Request

  • Method: POST
  • Path: /assets
  • Body: Full Asset JSON object

Response

  • 201 Created on success
  • 400 Bad Request if validation fails or asset already exists

cURL Example

curl -X POST http://localhost:8080/assets \
  -H "Content-Type: application/json" \
  -d '{
    "asset_id": "asset-001",
    "asset_uri": "models://example.com/asset-001",
    "asset_version": "v1",
    "asset_profile_id": "profile-001",
    "asset_file_ids": [],
    "profiles": [{
      "asset_profile_id": "profile-001",
      "asset_type": "model",
      "asset_sub_type": "llm",
      "asset_id": "asset-001"
    }],
    "policies": [],
    "files": [],
    "apis": [],
    "index_mappings": []
  }'
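
The same request can be prepared from Python using only the standard library. This is a minimal sketch: build_create_request is a hypothetical helper name, and the base URL simply mirrors the cURL example above.

```python
import json
import urllib.request


def build_create_request(asset: dict, base_url: str = "http://localhost:8080") -> urllib.request.Request:
    """Prepare (but do not send) a POST /assets request carrying the asset as JSON."""
    body = json.dumps(asset).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/assets",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends it; a 4xx response surfaces as urllib.error.HTTPError.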

PUT /assets/<asset_id>

Updates one or more fields of an existing asset. Only the fields provided in the request body will be updated.

Request

  • Method: PUT
  • Path: /assets/<asset_id>
  • Body: Partial or full Asset fields to update

Response

  • 200 OK if update was successful
  • 404 Not Found if asset ID does not exist

cURL Example

curl -X PUT http://localhost:8080/assets/asset-001 \
  -H "Content-Type: application/json" \
  -d '{
    "asset_brief_description": "Updated description"
  }'

DELETE /assets/<asset_id>

Deletes the specified asset and all associated embedded metadata from the database.

Request

  • Method: DELETE
  • Path: /assets/<asset_id>

Response

  • 200 OK if deletion was successful
  • 404 Not Found if the asset does not exist

cURL Example

curl -X DELETE http://localhost:8080/assets/asset-001

Query and GraphQL APIs

The Assets Registry supports both RESTful and GraphQL-based querying to retrieve and search asset metadata. These interfaces allow for flexible and fine-grained access to asset information by ID, type, tags, API route, or arbitrary filters.


REST Query APIs

GET /query/by-id/<asset_id>

Retrieves a single asset by its unique ID.

cURL Example:

curl http://localhost:8081/query/by-id/asset-001

GET /query/by-type/<asset_type>

Fetches all assets matching a given asset_type.

cURL Example:

curl http://localhost:8081/query/by-type/model

GET /query/by-sub-type/<sub_type>

Fetches assets with a specific asset_sub_type inside any embedded profile.

cURL Example:

curl http://localhost:8081/query/by-sub-type/llm

GET /query/by-tag/<tag>

Finds assets that are tagged with the given value.

cURL Example:

curl http://localhost:8081/query/by-tag/generative

GET /query/by-api-route/<route>

Finds assets that define an API using the given route.

cURL Example:

curl http://localhost:8081/query/by-api-route/api/v1/generate

POST /query/search

Allows custom MongoDB-style filter queries for advanced use cases.

cURL Example:

curl -X POST http://localhost:8081/query/search \
  -H "Content-Type: application/json" \
  -d '{"asset_version": "v1", "profiles.asset_type": "model"}'
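
The dotted key in the filter above relies on MongoDB's dot notation, where a path such as profiles.asset_type reaches into every element of the embedded profiles array. The pure-Python matcher below is only an illustration of that equality semantics for a quick local check; matches and _values_at are hypothetical names, and query operators such as $gt or $in are not handled.

```python
from typing import Any, Dict, Iterator, List


def _values_at(doc: Any, path: List[str]) -> Iterator[Any]:
    """Yield every value reachable by following path, fanning out over lists."""
    if not path:
        yield doc
    elif isinstance(doc, list):
        for item in doc:
            yield from _values_at(item, path)
    elif isinstance(doc, dict) and path[0] in doc:
        yield from _values_at(doc[path[0]], path[1:])


def matches(doc: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Return True when every filter key equals, or is contained in, a reachable value."""
    for key, expected in filters.items():
        found = _values_at(doc, key.split("."))
        if not any(v == expected or (isinstance(v, list) and expected in v) for v in found):
            return False
    return True
```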

GraphQL Query Endpoint

URL: /graphql

A flexible GraphQL interface for querying assets using defined filters and selectors. Only read operations are supported.


Example: Get asset by ID

query {
  getAssetById(assetId: "asset-001") {
    assetId
    assetUri
    assetVersion
    assetBriefDescription
  }
}

Example: Filter by type

query {
  getAssetsByType(assetType: "model") {
    assetId
    assetUri
    profiles
  }
}

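Example: Custom filter

GraphQL input literals do not allow quoted or dotted keys, so a filter such as profiles.asset_sub_type is passed as a variable (append ! to the variable type if the schema declares the argument as required):

query Search($filters: GenericScalar) {
  searchAssets(filters: $filters) {
    assetId
    assetUri
    assetVersion
  }
}

Variables:

{
  "filters": { "profiles.asset_sub_type": "llm" }
}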

GraphQL Features

| Query Field | Parameters | Returns |
| --- | --- | --- |
| getAssetById | assetId (String) | A single asset |
| getAssetsByType | assetType (String) | List of assets |
| getAssetsByTag | tag (String) | List of assets |
| getAssetsBySubType | subType (String) | List of assets |
| getAssetsByApiRoute | route (String) | List of assets |
| searchAssets | filters (GenericScalar) | Custom filtered list of assets |
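
When calling searchAssets from code, a filter with dotted keys is most safely sent as a GraphQL variable in the usual JSON request body. The sketch below builds such a body; build_graphql_payload is a hypothetical helper, and the variable is declared nullable on the assumption that the schema does not mark filters as required.

```python
import json

SEARCH_QUERY = """
query Search($filters: GenericScalar) {
  searchAssets(filters: $filters) {
    assetId
    assetUri
    assetVersion
  }
}
"""


def build_graphql_payload(filters: dict) -> str:
    """Serialize the query and its variables into a JSON request body."""
    return json.dumps({"query": SEARCH_QUERY, "variables": {"filters": filters}})
```

The string can be POSTed to /graphql with a Content-Type of application/json.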



ZIP Upload and Download APIs

The Assets Registry supports bulk transport of asset metadata and files using ZIP archives. These interfaces let clients upload or download a complete asset as a single compressed file, which makes them suitable for versioned model packaging, migration, or offline exchange.

The upload API is asynchronous and provides a polling mechanism to track the status of the operation. The download API reconstructs a ZIP on the fly from stored asset metadata and files in object storage.


POST /zip/upload

Accepts a ZIP file containing:

  • An asset.json metadata file
  • One or more binary files (e.g., models, configs)

The ZIP is parsed asynchronously. Files are uploaded to object storage, and the metadata is registered via the Assets Create API.

Request

  • Method: POST
  • Content-Type: multipart/form-data
  • Form Field: file (the ZIP archive)

Response

  • 202 Accepted with a unique upload_id to track progress

cURL Example

curl -X POST http://localhost:8081/zip/upload \
  -F "file=@asset_bundle.zip"

Response:

{
  "success": true,
  "upload_id": "0a5e3de1-1fcd-4a65-9c83-18c5c0b4d612"
}
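
A bundle in the expected shape can be produced with the standard zipfile module. This is a minimal sketch: build_bundle is a hypothetical helper, and placing asset.json at the archive root is an assumption about what the parser expects.

```python
import io
import json
import zipfile
from typing import Dict


def build_bundle(metadata: dict, files: Dict[str, bytes]) -> bytes:
    """Pack asset.json plus the given files into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("asset.json", json.dumps(metadata))
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()
```

Writing the returned bytes to asset_bundle.zip gives a file that can be sent with the cURL command above.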

GET /zip/status/<upload_id>

Returns the current status of a previously submitted ZIP upload job.

Status values

  • queued
  • processing
  • success
  • failed: <reason>

cURL Example

curl http://localhost:8081/zip/status/0a5e3de1-1fcd-4a65-9c83-18c5c0b4d612

Response:

{
  "upload_id": "0a5e3de1-1fcd-4a65-9c83-18c5c0b4d612",
  "status": "success"
}
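
Because the upload is asynchronous, a client typically polls this endpoint until a terminal status appears. The loop below is one way to do that; wait_for_upload is a hypothetical name, fetch_status is any callable that returns the status string from the response, and the interval and poll budget are arbitrary defaults.

```python
import time
from typing import Callable


def wait_for_upload(fetch_status: Callable[[], str], interval: float = 1.0, max_polls: int = 60) -> str:
    """Poll until the job reports success or failure, or the poll budget runs out."""
    for _ in range(max_polls):
        status = fetch_status()
        if status == "success" or status.startswith("failed"):
            return status  # terminal state reached
        time.sleep(interval)  # still queued or processing
    raise TimeoutError("upload did not finish within the polling budget")
```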

GET /zip/download/<asset_id>

Dynamically generates and streams a ZIP archive for the specified asset. The archive includes:

  • Files listed in the asset metadata (files[])
  • An asset.json file synthesized from the stored metadata

Response

  • Content-Type: application/zip
  • Content-Disposition: attachment; filename=<asset_id>.zip

cURL Example

curl -OJ http://localhost:8081/zip/download/asset-001

Result: A file named asset-001.zip is downloaded.
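
After downloading, the archive can be checked locally. The sketch below reads asset.json and lists the remaining entries; read_asset_zip is a hypothetical helper, and it assumes asset.json sits at the archive root.

```python
import io
import json
import zipfile
from typing import Dict, List, Tuple


def read_asset_zip(zip_bytes: bytes) -> Tuple[Dict, List[str]]:
    """Return the parsed asset.json and the names of all other archive entries."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        metadata = json.loads(archive.read("asset.json"))
        names = [n for n in archive.namelist() if n != "asset.json"]
    return metadata, names
```

For a file on disk, pass the result of open("asset-001.zip", "rb").read().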