
Chapter 4: Multipart Uploads

WIP

Upload large files in chunks with S3/MinIO multipart uploads.

Goal

By the end of this chapter you will configure the multipart strategy for large file uploads, implement the multipart backend API methods (signPart, completeMultipart), track per-part progress, and understand how concurrent chunk uploads work.


Step by Step

Update your types to include the multipart strategy

Add MultipartIntent and MultipartCursor to your intent and cursor maps:

src/upload.ts
import {
  PostIntent, PostCursor,
  MultipartIntent, MultipartCursor,
} from '@gentleduck/upload'
import type { UploadApi, UploadResultBase } from '@gentleduck/upload'
 
// Now supports both POST (small files) and multipart (large files)
type PhotoIntentMap = {
  post: PostIntent
  multipart: MultipartIntent
}
 
type PhotoCursorMap = {
  post: PostCursor
  multipart: MultipartCursor
}
 
type PhotoPurpose = 'photo'
 
type PhotoResult = UploadResultBase & {
  url: string
}

The MultipartIntent type defines what your backend returns for multipart uploads:

type MultipartIntent = {
  strategy: 'multipart'   // discriminant
  fileId: string          // backend file identifier
  uploadId: string        // S3 multipart upload ID
  partSize: number        // size of each part in bytes
  partCount: number       // total number of parts
}

The MultipartCursor tracks which parts have been uploaded (for resume):

type MultipartCursor = {
  done: Array<{
    partNumber: number
    etag: string
    size: number
  }>
  completed?: true  // marks the multipart session as assembled
}
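The cursor is what makes resume possible: on a fresh attempt, the strategy only needs to upload the parts not yet recorded in done. A minimal sketch of that calculation, using the shapes above (remainingParts is an illustrative helper, not part of the library API):

```typescript
// Shapes mirror the MultipartCursor type above.
type PartRecord = { partNumber: number; etag: string; size: number }
type Cursor = { done: PartRecord[]; completed?: true }

// Given the total part count and a (possibly absent) cursor, return the
// 1-based part numbers that still need uploading.
function remainingParts(partCount: number, cursor: Cursor | undefined): number[] {
  const uploaded = new Set(cursor?.done.map((p) => p.partNumber) ?? [])
  const pending: number[] = []
  for (let n = 1; n <= partCount; n++) {
    if (!uploaded.has(n)) pending.push(n)
  }
  return pending
}

// A 5-part upload where parts 1 and 3 already finished resumes with 2, 4, 5:
remainingParts(5, {
  done: [
    { partNumber: 1, etag: '"a"', size: 10 },
    { partNumber: 3, etag: '"b"', size: 10 },
  ],
}) // → [2, 4, 5]
```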

Register the multipart strategy

src/upload.ts
import {
  createUploadClient,
  createStrategyRegistry,
  PostStrategy,
  multipartStrategy,
  createXHRTransport,
} from '@gentleduck/upload'
 
const strategies = createStrategyRegistry<PhotoIntentMap, PhotoCursorMap, PhotoPurpose, PhotoResult>()
strategies.set(PostStrategy<PhotoIntentMap, PhotoCursorMap, PhotoPurpose, PhotoResult>())
strategies.set(multipartStrategy<PhotoIntentMap, PhotoCursorMap, PhotoPurpose, PhotoResult>({
  maxPartConcurrency: 4,
}))

multipartStrategy() accepts an optional config:

Option               Default   Description
maxPartConcurrency   4         Maximum number of parts uploaded simultaneously

Higher concurrency uses more bandwidth and memory but finishes faster. For most connections, 3-6 is a good range. Each concurrent part holds a file slice (Blob) in memory.
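A quick back-of-envelope check of that memory cost: peak slice memory is roughly maxPartConcurrency times the part size, since each in-flight part holds one Blob slice.

```typescript
// Rough peak memory held in file slices for in-flight parts.
const partSize = 10 * 1024 * 1024 // 10MB parts (from the intent)
const maxPartConcurrency = 4      // the default

const peakBytes = partSize * maxPartConcurrency
console.log(`~${peakBytes / 1024 / 1024}MB of file slices in flight`) // ~40MB
```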

Implement multipart UploadApi methods

The multipart strategy requires two additional methods on your UploadApi: signPart and completeMultipart. These live under the multipart namespace:

src/upload.ts
const api: UploadApi<PhotoIntentMap, PhotoPurpose, PhotoResult> = {
  async createIntent({ purpose, contentType, size, filename }) {
    const res = await fetch('/api/uploads/create-intent', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ purpose, contentType, size, filename }),
    })
 
    if (!res.ok) throw new Error(`Failed to create intent: ${res.status}`)
 
    // Backend decides strategy based on file size:
    // - Small files (under 100MB): returns PostIntent
    // - Large files (100MB and above): returns MultipartIntent
    return res.json()
  },
 
  async complete({ fileId }) {
    const res = await fetch('/api/uploads/complete', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ fileId }),
    })
 
    if (!res.ok) throw new Error(`Failed to complete upload: ${res.status}`)
    return res.json()
  },
 
  // Multipart-specific operations
  multipart: {
    async signPart({ fileId, uploadId, partNumber }) {
      const res = await fetch('/api/uploads/sign-part', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ fileId, uploadId, partNumber }),
      })
 
      if (!res.ok) throw new Error(`Failed to sign part ${partNumber}: ${res.status}`)
 
      // Returns: { url: 'https://...presigned-put-url...', headers?: { ... } }
      return res.json()
    },
 
    async completeMultipart({ fileId, uploadId, parts }) {
      const res = await fetch('/api/uploads/complete-multipart', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ fileId, uploadId, parts }),
      })
 
      if (!res.ok) throw new Error(`Failed to complete multipart: ${res.status}`)
      return res.json()
    },
  },
}

The flow per part is:

  1. Engine calls signPart({ fileId, uploadId, partNumber }) to get a presigned PUT URL
  2. Transport sends the part bytes via PUT to that URL
  3. S3 returns an ETag header for the part
  4. After all parts, engine calls completeMultipart with the list of { partNumber, etag }
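The four steps above can be sketched for a single part as follows. This is not the engine's actual code; it only assumes the { url, headers? } response shape from signPart shown earlier:

```typescript
type SignedPart = { url: string; headers?: Record<string, string> }

// Upload one part: sign on demand, PUT the bytes, collect the ETag.
async function uploadOnePart(
  signPart: (partNumber: number) => Promise<SignedPart>,
  chunk: Blob,
  partNumber: number,
): Promise<{ partNumber: number; etag: string }> {
  // Step 1: get a presigned PUT URL for this part
  const { url, headers } = await signPart(partNumber)
  // Step 2: send the part bytes
  const res = await fetch(url, { method: 'PUT', headers, body: chunk })
  if (!res.ok) throw new Error(`Part ${partNumber} failed: ${res.status}`)
  // Step 3: S3 returns the part's ETag as a response header
  const etag = res.headers.get('ETag')
  if (!etag) throw new Error('Missing ETag -- check Access-Control-Expose-Headers')
  // Step 4: collected entries are later passed to completeMultipart
  return { partNumber, etag }
}
```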

Configure chunk size and concurrency

The chunk size is controlled by your backend. When createIntent returns a MultipartIntent, it includes partSize and partCount:

// Example backend response for a 200MB file with 10MB parts
{
  strategy: 'multipart',
  fileId: 'abc-123',
  uploadId: 's3-upload-id-xyz',
  partSize: 10 * 1024 * 1024,   // 10MB per part
  partCount: 20,                  // 200MB / 10MB = 20 parts
}

Common part size choices:

File size     Part size   Parts    Notes
Under 100MB   N/A         N/A      Use POST strategy instead
100MB - 1GB   10MB        10-100   Good balance
1GB - 5GB     50MB        20-100   Fewer requests
5GB+          100MB       50+      S3 allows max 10,000 parts

S3 requires a minimum part size of 5MB (except the last part) and allows up to 10,000 parts per upload.
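A backend can derive a part size from those two constraints. The sketch below is illustrative (choosePartSize is a hypothetical helper, not library code); it grows the part size until the file fits within 10,000 parts:

```typescript
const MIN_PART = 5 * 1024 * 1024 // S3 minimum part size (except the last part)
const MAX_PARTS = 10_000         // S3 maximum parts per upload

// Pick the largest of: the 5MB floor, a preferred size, and the size
// needed to stay under 10,000 parts.
function choosePartSize(fileSize: number, preferred = 10 * 1024 * 1024): number {
  const neededForLimit = Math.ceil(fileSize / MAX_PARTS)
  return Math.max(MIN_PART, preferred, neededForLimit)
}

const fileSize = 200 * 1024 * 1024               // 200MB
const partSize = choosePartSize(fileSize)        // 10MB (preferred size wins)
const partCount = Math.ceil(fileSize / partSize) // 20 parts
```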

Upload a large file and track per-part progress

With the multipart strategy registered, large file uploads work the same way as POST uploads from the UI perspective. The engine handles everything internally:

src/main.ts
import { uploadClient } from './upload'
 
// Listen to progress -- same API as POST uploads
uploadClient.on('upload.progress', ({ localId, pct, uploadedBytes, totalBytes }) => {
  const mb = (bytes: number) => (bytes / 1024 / 1024).toFixed(1)
  console.log(`${localId}: ${pct.toFixed(1)}% (${mb(uploadedBytes)}MB / ${mb(totalBytes)}MB)`)
})
 
// Listen to cursor updates -- multipart-specific resume state
uploadClient.on('upload.cursor', ({ localId, cursor }) => {
  if (cursor.strategy === 'multipart' && cursor.value) {
    const mc = cursor.value as { done: Array<{ partNumber: number }> }
    console.log(`${localId}: ${mc.done.length} parts completed`)
  }
})
 
uploadClient.on('upload.completed', ({ localId, result }) => {
  console.log(`${localId}: upload complete!`, result)
})
 
// Add a large file
const input = document.querySelector<HTMLInputElement>('#file-input')!
input.addEventListener('change', () => {
  const files = Array.from(input.files ?? [])
  if (files.length > 0) {
    uploadClient.dispatch({ type: 'addFiles', files, purpose: 'photo' })
  }
})

The upload.progress event aggregates progress across all parts. The engine tracks bytes from finished parts plus bytes in-flight from currently uploading parts to give you a smooth total progress percentage.
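That aggregation can be sketched as plain arithmetic. Function and parameter names here are illustrative, not the engine's internals:

```typescript
// Total progress = bytes from finished parts + bytes in flight.
function aggregateProgress(
  finishedParts: Array<{ size: number }>,
  inFlightBytes: number[], // bytes sent so far for each currently uploading part
  totalBytes: number,
): { uploadedBytes: number; pct: number } {
  const finished = finishedParts.reduce((sum, p) => sum + p.size, 0)
  const inFlight = inFlightBytes.reduce((sum, b) => sum + b, 0)
  const uploadedBytes = finished + inFlight
  return { uploadedBytes, pct: (uploadedBytes / totalBytes) * 100 }
}

// Two finished 10MB parts plus 4MB in flight, out of 40MB total: 60%
aggregateProgress(
  [{ size: 10_485_760 }, { size: 10_485_760 }],
  [4_194_304],
  41_943_040,
)
```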

How Concurrent Part Uploads Work

The multipart strategy manages its own concurrency at the part level (separate from the engine's maxConcurrentUploads which controls file-level concurrency).

Here is what happens step by step:


  1. Build the queue -- The strategy calculates which parts need uploading. It reads the cursor (ctx.readCursor()) to skip parts that were already uploaded in a previous session.

  2. Concurrent upload loop -- The strategy maintains a pool of up to maxPartConcurrency concurrent uploads. As each part finishes, the next one from the queue starts.

  3. Per-part signing -- For each part, the strategy calls api.multipart.signPart() to get a presigned PUT URL. This is a "sign on demand" pattern -- you do not need to pre-sign all parts upfront.

  4. ETag collection -- After each successful PUT, S3 returns an ETag header. The strategy collects these. If S3/MinIO is behind a proxy, make sure CORS exposes the ETag header: Access-Control-Expose-Headers: ETag.

  5. Cursor persistence -- After each part, the strategy calls ctx.persistCursor() with the updated list of completed parts. If the upload is paused or the browser crashes, the cursor is available on resume.

  6. Completion -- Once all parts are uploaded, the strategy calls api.multipart.completeMultipart() with the full list of { partNumber, etag }. S3 assembles the parts into the final object.

  7. Per-part retry -- If a part fails due to a network error, the strategy retries it up to 3 times with exponential backoff (500ms, 1s, 2s). Only network-ish errors are retried (network failures, timeouts, 5xx responses).
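The retry schedule in step 7 can be sketched as a generic wrapper. This is an illustration of the described behavior, not the strategy's actual code; isRetryable stands in for its network-error check:

```typescript
// Retry an async operation up to maxRetries times with exponential
// backoff: 500ms, 1s, 2s between attempts (at the defaults).
async function withRetry<T>(
  attempt: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await attempt()
    } catch (err) {
      // Give up after maxRetries, or immediately for non-network errors
      if (i >= maxRetries || !isRetryable(err)) throw err
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i))
    }
  }
}
```

A wrapper like this would sit around the PUT of a single part, so one flaky part does not fail the whole file.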

The Legacy Parts Array

The MultipartIntent has an optional parts field for backends that provide all presigned URLs upfront:

type MultipartIntent = {
  strategy: 'multipart'
  fileId: string
  uploadId: string
  partSize: number
  partCount: number
 
  // Optional: all part URLs provided upfront
  parts?: Array<{
    partNumber: number
    url: string
    headers?: Record<string, string>
  }>
}

If parts is provided, the strategy uses those URLs directly instead of calling signPart. This is the "legacy" mode -- the on-demand signPart approach is preferred because:

  • URLs do not expire before they are needed
  • Fewer upfront API calls for large files
  • Better for resumable uploads (only sign parts you need)
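A strategy supporting both modes effectively resolves each part's URL like this. The sketch below follows the MultipartIntent shape above; resolvePartUrl is an illustrative name:

```typescript
type PresignedPart = { partNumber: number; url: string; headers?: Record<string, string> }
type Signed = { url: string; headers?: Record<string, string> }

// Prefer a pre-signed URL from the legacy `parts` array when present;
// otherwise fall back to signing on demand.
async function resolvePartUrl(
  intent: { parts?: PresignedPart[] },
  partNumber: number,
  signPart: (partNumber: number) => Promise<Signed>,
): Promise<Signed> {
  const presigned = intent.parts?.find((p) => p.partNumber === partNumber)
  if (presigned) return { url: presigned.url, headers: presigned.headers } // legacy mode
  return signPart(partNumber)                                             // on-demand mode
}
```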

Pausing and Resuming Multipart Uploads

The multipart strategy is resumable (resumable: true). When a user pauses:

  1. The engine sets the abort signal, which cancels in-flight PUT requests
  2. The strategy's cursor already has all completed parts persisted
  3. The item moves to the paused phase

When the user resumes:

  1. The item moves back to queued, then uploading
  2. The strategy calls ctx.readCursor() to get the list of already-completed parts
  3. It skips those parts and only uploads the remaining ones
  4. Progress resumes from where it left off

If the completed flag is set in the cursor, the strategy skips the completeMultipart call too -- this prevents duplicate assembly requests if the upload was interrupted after completion but before the engine finalized.
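That guard amounts to a check-then-mark pattern. A minimal sketch, with illustrative names (finalizeIfNeeded is not a library function):

```typescript
type MpCursor = { done: Array<{ partNumber: number; etag: string }>; completed?: true }

// Assemble the object exactly once: skip if the cursor already says
// `completed`, and persist the flag immediately after assembly succeeds.
async function finalizeIfNeeded(
  cursor: MpCursor,
  completeMultipart: (parts: MpCursor['done']) => Promise<unknown>,
  persistCursor: (c: MpCursor) => void,
): Promise<void> {
  if (cursor.completed) return // already assembled -- avoid a duplicate request
  await completeMultipart(cursor.done)
  persistCursor({ ...cursor, completed: true })
}
```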

Checkpoint

Your project should look like this:

photoduck/
  src/
    upload.ts         -- types with multipart + api with signPart/completeMultipart
    App.tsx           -- UploadProvider wrapper
    PhotoUploader.tsx -- dropzone + progress bars + controls
  package.json
  tsconfig.json

Next: Chapter 5: Validation & Rejection