The Boundary Is the Point
A thread pool is not automatically an improvement over a loop. Worker isolation has serialization cost, lifecycle cost, and a stricter programming model. Loom is useful only when the work is heavy enough, structured enough, or durable enough to make those costs worthwhile.
The benchmark harness is the most honest part of the project. On the local machine used for the run, pooled workers beat native serial execution for medium and heavy CPU fanout. Fresh workers per item were catastrophically slower. Cheap JSON transforms stayed much faster in native Bun. Keyed resume was materially useful for repeated work because completed items could be loaded from disk instead of recomputed.
That negative result is part of the system design. A runtime that cannot say when not to use it is just a slower abstraction.
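That harness pattern reduces to warmups plus timed repeats per case. A minimal sketch of the shape, with hypothetical names (`runCase`, `BenchResult`), not Loom's actual harness:

```typescript
// Minimal benchmark loop: warm up, then time repeated runs and report the
// median wall-clock time. Names here are illustrative, not Loom's API.
type BenchResult = { name: string; medianMs: number };

async function runCase(
  name: string,
  work: () => Promise<void>,
  repeats = 3,
  warmups = 1,
): Promise<BenchResult> {
  for (let i = 0; i < warmups; i++) await work(); // warm caches and the JIT
  const samples: number[] = [];
  for (let i = 0; i < repeats; i++) {
    const start = performance.now();
    await work();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return { name, medianMs: samples[Math.floor(samples.length / 2)] };
}
```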
Local benchmark summary
thread-cpu: pooled workers 17.0ms, native serial 46.3ms, fresh workers 14982.7ms
thread-cpu-heavy: pooled workers 21.6ms, native serial 125.6ms, fresh workers 542.6ms
json-transform: pooled workers 11.0ms, native serial 0.7ms
resume-keyed-cpu: resumed run 102.9ms, fresh durable run 1926.5ms
Run notes: darwin arm64, Bun 1.3.11, 14 CPUs, 3 repeats, 1 warmup.
RSS is main-process RSS only.

A Run Contract, Not a Queue

The public interface is one finite run specification. Items can be arrays, iterables, async iterables, or source factories. The run function is executed by the runtime. Without combine, outputs preserve input order. With combine, the same primitive becomes fanout plus aggregation.
The interesting part is that execution policy lives in the same object as the work: parallel, retry, timeoutMs, emit, key, group, groupLimit, and resume. This keeps a bounded job legible without turning it into a workflow DSL or public queue API.
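Read together, a bounded job stays one literal. The field names below come from the RunSpec interface; the concrete values are illustrative, not taken from the project:

```typescript
// Policy lives next to the work: concurrency, retries, timeout, and lane
// limits sit in the same object as the run function. Values are examples.
const spec = {
  name: "square-pass",                 // hypothetical job name
  items: [1, 2, 3, 4],
  parallel: 4,                         // global concurrency cap
  retry: 2,                            // per-item retries
  timeoutMs: 5_000,                    // per-item timeout
  group: (item: number) => (item % 2 === 0 ? "even" : "odd"),
  groupLimit: 2,                       // per-lane cap
  run: (item: number) => item * item,  // must be self-contained
};
```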
The worker function must be self-contained. That constraint is not incidental. Loom serializes function source into isolated workers, so local closures and imported helpers are outside the model unless the function carries the needed code with it.
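One way to see the constraint is to mimic the transfer directly with `toString` plus indirect eval. This is a standalone sketch of the failure mode, not Loom's actual worker protocol:

```typescript
// Loom serializes function source into workers. This sketch mimics that
// transfer: indirect eval runs in global scope, like a fresh worker, so
// only source that carries its own helpers survives.
const helper = (n: number) => n * 2;

// Captures `helper` from module scope; its source alone is not enough.
const closureFn = (n: number) => helper(n);

// Defines its helper inside the body, so the source is self-contained.
const selfContainedFn = (n: number) => {
  const double = (x: number) => x * 2;
  return double(n);
};

const revived = globalThis.eval(`(${selfContainedFn.toString()})`) as (n: number) => number;
revived(21); // works; reviving closureFn the same way throws when called
```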
export interface RunSpec<T, O> {
name?: string;
items: Items<T>;
run(item: T, ctx: RunContext<T>): O | Promise<O>;
emit?: (entry: Emitted<T, O>) => void | Promise<void>;
parallel?: number;
retry?: number | RetryPolicy;
timeoutMs?: number;
key?: KeySelector<T>;
group?: GroupSelector<T>;
groupLimit?: GroupLimit;
resume?: boolean;
}

The Scheduler Stays Small
Loom is not interesting because it can map items through workers. It is interesting because the scheduler stays small while still behaving operationally. Sources are consumed lazily. Pending plus active work is bounded. Active work is counted globally and by group. A saturated lane does not have to block the whole queue forever.
The scheduler fills pending work only until the source buffer limit is reached, then starts runnable items while respecting global and group caps. That is the difference between a batch helper and a runtime that can apply pressure to a finite source.
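The group-aware check the loop relies on can be sketched as a linear scan that skips items whose lane is at its cap. This is a plausible shape, assuming a numeric per-lane limit, not the verbatim implementation:

```typescript
// Scan pending items from the cursor and return the first whose group lane
// still has capacity; -1 means nothing is runnable yet. A sketch only:
// the real GroupLimit type may be richer than a single number.
type PendingItem = { group?: string };

function findRunnableIndex(
  pending: PendingItem[],
  cursor: number,
  activeByGroup: Map<string, number>,
  groupLimit: number | undefined,
  parallel: number,
): number {
  for (let i = cursor; i < pending.length; i++) {
    const group = pending[i].group;
    if (group === undefined || groupLimit === undefined) return i;
    const laneCap = Math.min(groupLimit, parallel);
    if ((activeByGroup.get(group) ?? 0) < laneCap) return i;
  }
  return -1;
}
```

A saturated lane is skipped rather than blocking the scan, which is what keeps one hot group from starving the rest of the queue.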
// Fill pending from the lazy source; stop early once a runnable item
// exists and there is free global capacity, so pulling never outruns the caps.
while (
!sourceDone &&
firstError === undefined &&
pending.length + active.size < sourceBufferLimit
) {
const didPull = await pullNext();
if (!didPull) break;
if (
active.size < parallel &&
findRunnableIndex(pending, nextCursor, activeByGroup, spec.groupLimit, parallel) !== -1
) {
break;
}
}

Worker Pooling Is the Performance Primitive
The benchmark numbers make the implementation choice clear. Worker reuse is the performance primitive. A stable pool avoids worker-per-item startup cost, bounds queued work, times out tasks, recycles failed workers, and gives the caller a real close path.
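The reuse idea itself is small. A minimal sketch of a stable pool with round-robin dispatch and an explicit close path, abstracted over lane functions rather than real worker threads (Loom's pool additionally bounds queued work, times out tasks, and recycles failed workers):

```typescript
// A minimal reusable pool: a fixed set of lanes, round-robin dispatch, and
// a real close path. Illustrative only, not Loom's pool implementation.
class MiniPool<I, O> {
  private next = 0;
  private closed = false;
  constructor(private lanes: Array<(item: I) => Promise<O>>) {}

  run(item: I): Promise<O> {
    if (this.closed) return Promise.reject(new Error("pool closed"));
    const lane = this.lanes[this.next];
    this.next = (this.next + 1) % this.lanes.length; // reuse, never respawn
    return lane(item);
  }

  close(): void {
    this.closed = true; // a real pool would also terminate worker threads
  }
}
```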
Inside each worker, evaluated function source is cached by string. That explains both the speedup and the caveat. The runtime can reuse worker state for the same function source, but it is executing trusted local code through worker-side evaluation. Loom is not a sandbox for untrusted functions.
// Cache evaluated functions by their source string so a reused worker
// skips re-evaluation on later items with the same function.
let fn = functions.get(payload.source);
if (!fn) {
fn = globalThis.eval(`(${payload.source})`) as (
item: unknown,
ctx: MessagePayload["ctx"],
) => unknown | Promise<unknown>;
functions.set(payload.source, fn);
}
const output = await fn(payload.item, payload.ctx);

Durability Requires Identity
Resume is deliberately not ambient magic. The caller has to provide a store, a run name, stable item keys, and resume: true. That is the minimum identity required for the runtime to load completed records and skip work correctly.
The filesystem store writes one completed record per key and updates the manifest through a temporary file plus rename. This is local durability, not distributed transactionality. It is still enough for the intended gap: bounded local work that should survive interruption without becoming a queue service.
// Persist one completed record per key, then bump the manifest under the lock.
async markCompleted(record: StoredResultRecord<O>, totalCountHint: number): Promise<void> {
await writeFile(
this.pathForCompleted(record.key),
JSON.stringify(record, null, 2) + "\n",
);
await this.withManifestLock(async () => {
const manifest = await this.readManifest();
await this.writeManifest({
...manifest,
status: "running",
completedCount: manifest.completedCount + 1,
totalCount: Math.max(manifest.totalCount, totalCountHint),
});
});
}

The Local Boundary
The v0 shape is intentionally narrow: Bun-first, local worker threads, finite sources, filesystem-backed resume, lane-aware concurrency, retries, timeouts, emit hooks, and aggregation. There is no remote runtime, public queue, workflow language, CLI surface, or distributed coordinator in this version.
That boundary makes the project more useful as an experiment. It does not claim to replace a job system. It names the exact middle ground between Promise.all and infrastructure: a bounded run contract that can fan out, aggregate, and resume when the work justifies leaving the main thread.