fix(ruvector): bundle ONNX runtime into dist/ on build (#354) #434
Open
The published tarball was missing every ONNX runtime file except a
1-line `package.json`, so `OptimizedOnnxEmbedder` (and any code path
that calls `initOnnxEmbedder()`) crashed on every clean install with:
Error: ONNX WASM files not bundled. The onnx/ directory is missing.
Root cause is the build script:
"build": "tsc && cp src/core/onnx/pkg/package.json dist/core/onnx/pkg/"
`tsc` emits output only for `.ts` sources (`allowJs` is not enabled). The
wasm-bindgen artifacts under `src/core/onnx/pkg/` (the `.wasm` payload,
`_bg.js`, type defs, LICENSE) and the sibling `src/core/onnx/loader.js`
are runtime assets that `tsc` does not copy, yet the script copied only
a single `package.json`. Everything else stayed in `src/` and never made
it into the tarball.
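One way to see which files `tsc` leaves behind is to list everything under the source tree that is not a compilable TypeScript source. The sketch below is illustrative only and is not part of this PR; `isCompiledByTsc` and `listUnemittedFiles` are hypothetical names, and it assumes `allowJs` is off, as the root cause describes.

```javascript
// Hypothetical diagnostic, not part of the PR: list files that `tsc`
// (without `allowJs`) will not emit into the output directory.
const fs = require('node:fs');
const path = require('node:path');

// Only .ts/.tsx sources produce output; .d.ts files are inputs only.
function isCompiledByTsc(fileName) {
  if (fileName.endsWith('.d.ts')) return false;
  return fileName.endsWith('.ts') || fileName.endsWith('.tsx');
}

// Walk `root` recursively and collect every file tsc would leave behind.
function listUnemittedFiles(root) {
  const out = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) out.push(...listUnemittedFiles(full));
    else if (!isCompiledByTsc(entry.name)) out.push(full);
  }
  return out.sort();
}
```

Calling `listUnemittedFiles('src/core/onnx')` in a checkout would be expected to surface the `.wasm`, `_bg.js`, `.d.ts`, LICENSE, and `loader.js` files named above, which is the set a copy step has to handle.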
Fix:
- Replace the single-file copy with `scripts/copy-onnx-assets.js`, a
Node-portable recursive copy (works on Windows; doesn't need cp).
- Skip dotfiles (e.g. transient `.claude-flow/` agent metadata) and
`node_modules/` so they don't leak into the published artifact.
- Sanity-check that the canonical runtime files (`*_bg.{js,wasm}`,
`*.js`, `loader.js`) landed where `onnx-embedder.js` looks for them;
fail the build loudly if not.
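The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of `scripts/copy-onnx-assets.js`: the function names (`shouldSkip`, `copyTree`, `findMissing`, `stageAssets`) are hypothetical, and it assumes the source directory holds only runtime assets.

```javascript
// Sketch only: NOT the PR's scripts/copy-onnx-assets.js. All names are hypothetical.
const fs = require('node:fs');
const path = require('node:path');

// Skip dotfiles and dot-directories (e.g. .claude-flow/) and node_modules/.
function shouldSkip(name) {
  return name.startsWith('.') || name === 'node_modules';
}

// Recursively copy srcDir into destDir using only Node APIs (no `cp`).
// Returns the number of files copied.
function copyTree(srcDir, destDir) {
  let copied = 0;
  fs.mkdirSync(destDir, { recursive: true });
  for (const entry of fs.readdirSync(srcDir, { withFileTypes: true })) {
    if (shouldSkip(entry.name)) continue;
    const from = path.join(srcDir, entry.name);
    const to = path.join(destDir, entry.name);
    if (entry.isDirectory()) {
      copied += copyTree(from, to);
    } else if (entry.isFile()) {
      fs.copyFileSync(from, to);
      copied += 1;
    }
  }
  return copied;
}

// Return the expected relative paths that are absent under destDir.
function findMissing(destDir, expected) {
  return expected.filter((rel) => !fs.existsSync(path.join(destDir, rel)));
}

// Copy, then fail loudly if any expected runtime file did not land.
function stageAssets(srcDir, destDir, expected) {
  const copied = copyTree(srcDir, destDir);
  const missing = findMissing(destDir, expected);
  if (missing.length > 0) {
    throw new Error(`missing runtime files under ${destDir}: ${missing.join(', ')}`);
  }
  return copied;
}
```

Throwing from the sanity check makes a non-zero exit the default when the function runs in a build script, so a broken copy stops `npm run build` instead of producing an incomplete tarball.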
Verified end-to-end against ruvector@0.2.25 on Node 22.22.2:
$ rm -rf dist/core/onnx && npm run build
> tsc && node scripts/copy-onnx-assets.js
copy-onnx-assets: 10 ONNX runtime file(s) staged under dist/.
$ ls dist/core/onnx/pkg/
LICENSE ruvector_onnx_embeddings_wasm_bg.js
loader.js ruvector_onnx_embeddings_wasm_bg.wasm
package.json ruvector_onnx_embeddings_wasm_bg.wasm.d.ts
ruvector_onnx_embeddings_wasm.d.ts ruvector_onnx_embeddings_wasm.js
$ npm pack && tar -tzf ruvector-0.2.25.tgz | grep -c onnx/pkg
8
# Clean install into /tmp:
$ node -e "const {isOnnxAvailable} = require('ruvector/dist/core/onnx-embedder'); console.log(isOnnxAvailable())"
true
Closes #354
Co-Authored-By: claude-flow <ruv@ruv.net>
Summary
- The published tarball contained only `dist/core/onnx/pkg/package.json` and none of the ONNX WASM runtime files. Every clean install failed at `initOnnxEmbedder()` with `Error: ONNX WASM files not bundled. The onnx/ directory is missing.`
- Also covers part 2 of #417, "CLI commands that take a `<database>` path are broken on freshly-created DBs (and `embed text` fails on missing ONNX bundle)". #417 still has CLI-handler bugs to fix on top of this; that work continues in a separate PR.
Root cause
`tsc` emits output only for `.ts` sources (`allowJs` is not enabled), so the runtime files under `src/core/onnx/` (the wasm-bindgen `.wasm` payload, `_bg.js`, type defs, LICENSE, sibling `loader.js`) never reach `dist/`. The script copied a single `package.json` and called it a day.
Fix
- Replace the single-file copy with `scripts/copy-onnx-assets.js`, a Node-portable recursive copy (no `cp -r`, works on Windows).
- Skip dotfiles (e.g. transient `.claude-flow/` agent metadata) and `node_modules/` so they don't leak into the published artifact.
- Sanity-check that the canonical runtime files (`*_bg.{js,wasm}`, `*.js`, `loader.js`) landed where `onnx-embedder.js` looks for them; fail the build loudly if not.
Proof
On a fresh build of the current 0.2.25 working tree (Node 22.22.2):
The tarball grew from 825 KB to 3.1 MB: the 7.4 MB (uncompressed) MiniLM WASM payload compresses to about 2.3 MB on top of the 825 KB baseline. This is expected, since it is the file consumers were previously missing.
Test plan
- `npm run build` lands all 10 ONNX runtime files under `dist/core/onnx/`.
- `npm pack` includes them in the tarball (`grep -c onnx/pkg` = 8).
- A clean install makes `isOnnxAvailable()` return `true` (was `false`/throw before).
- The copy does not leak `.claude-flow/` dotfiles into the artifact.
- The build does not depend on `cp` (portable to Windows).
- Extend `verify-dist.js` to assert the ONNX runtime files exist (best done after "fix(ruvector): verify-dist guards package.json entrypoints (#376)" #433 lands, to avoid merge conflicts).
Scope notes
- This PR does not touch the CLI-handler bugs in #417, "CLI commands that take a `<database>` path are broken on freshly-created DBs (and `embed text` fails on missing ONNX bundle)"; those are a separate fix.
🤖 Generated with claude-flow