
up #995

Closed
bardiamaleki wants to merge 122 commits into therealaleph:main from
bardiamaleki:main

Conversation

@bardiamaleki

No description provided.

therealaleph and others added 30 commits April 22, 2026 11:04
Thanks @v4g4b0nd-0x76 — proper listener teardown on Stop is exactly what was needed. The 2-second grace window + force-abort fallback is a clean pattern.
Two user-reported issues.

=== GLIBC too new (reported via Twitter) ===

Our linux-amd64 and linux-arm64 gnu builds were compiled on
ubuntu-latest (24.04, GLIBC 2.39), which means the resulting binaries
refuse to load on anything older:

  ./mhrv-rs: /lib/x86_64-linux-gnu/libc.so.6: version 'GLIBC_2.39'
    not found (required by ./mhrv-rs)

Users on Ubuntu 22.04 / Mint 21 (GLIBC 2.35) — the typical user in Iran
where this project's target audience lives, and where they can't
dist-upgrade because they're behind exactly the kind of network
restriction this tool exists to bypass — could not run the gnu builds
at all.

Fix: pin the linux-gnu matrix entries to ubuntu-22.04 runners. GLIBC
2.35 is now the minimum; binaries load on Ubuntu 22.04, Mint 21,
Debian 12, Fedora 36+, RHEL 9+ and everything newer.

Users on older distros (Ubuntu 20.04, CentOS 7) can still use the
static musl builds (mhrv-rs-linux-musl-amd64.tar.gz et al.) which
have no GLIBC dependency at all.

=== Short-screen laptops — main window content clipped (PR #6) ===

Co-authored fix from @v4g4b0nd-0x76 in PR #6 (manually applied to
avoid pulling in 400 lines of unrelated cargo-fmt churn):

- Wrap the CentralPanel body in ScrollArea::vertical()
  .auto_shrink([false; 2]) so everything stays reachable on short
  screens.
- Lower the min_inner_size from [420, 540] to [420, 400] so laptops
  with ~13" screens at default scaling can shrink the window without
  clipping UI elements.

Closes #6.

Co-authored-by: v4g4b0nd-0x76 <v4g4b0nd-0x76@users.noreply.github.com>
Verified: the linux-amd64 binary's highest GLIBC symbol is now 2.34
(was 2.39 in v0.7.0 and earlier), so it runs on Ubuntu 22.04 / Mint 21
/ Debian 12 and anything newer.
Two user complaints:
- English words mixed inline in the Persian section were breaking the
  RTL text flow, making paragraphs hard to read.
- Language was too technical for non-developer users.

Fixes:

1. Every English / technical term is now wrapped in backticks
   (`Apps Script`, `MITM`, `SOCKS5`, `Deployment ID`, …). GitHub
   renders these as monospace LTR islands, which the browser's
   bidirectional text algorithm treats as embedded strong-LTR runs
   and doesn't let them flip the surrounding RTL paragraph direction.
2. Rewrote most paragraphs as shorter, plainer Persian sentences.
   Replaced jargon (run-time, on-the-fly, rewrite, trust store…)
   with everyday wording.
3. Converted dense prose into tables where it helped (download
   table by OS, config fields table, per-OS CA install table).
4. Added a 5-step walkthrough (script deploy → download → first
   run → config in UI → browser setup) that a non-technical user
   can follow top-to-bottom.
5. New 'How do I know it's working?' quick verification section.
6. New big FAQ at the bottom — covers the questions that actually
   come up: certificate install safety, how to remove the cert,
   how many Deployment IDs to use, YouTube / ChatGPT caveats,
   the GLIBC 2.39 issue, and CLI usage for power users.
7. Telegram pairing section reworded — explains the WHY first
   (Apps Script can't speak MTProto), then the one-line fix.
8. SNI pool editor flow written as numbered steps mirroring the
   actual UI buttons the user clicks.

English section unchanged.
Thanks @v4g4b0nd-0x76 for the feature. Two small fixes folded in on
the merge so master still builds + doesn't hit sharp edges:

- src/scan_ips.rs: rand::thread_rng() held across an .await tripped
  the Send bound on the async fn (ThreadRng isn't Send). Scoped the
  rng in a block so it drops before subsequent awaits.
- src/scan_ips.rs: guard /0 and /32 CIDRs in cidr_to_ips and
  ip_in_cidr against the 1u32 << 32 shift panic (debug mode).
  goog.json is unlikely to contain either, but the guard is defensive.
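The shift guard can be sketched like this; a minimal sketch only, since the real cidr_to_ips / ip_in_cidr signatures in src/scan_ips.rs aren't shown here:

```rust
// For a /0, `u32::MAX << 32` (like `1u32 << 32`) is a shift overflow,
// which panics in debug builds, so the /0 mask is special-cased.
// A /32 shifts by 0 and is already fine.
fn netmask(prefix_len: u32) -> u32 {
    if prefix_len == 0 {
        0 // /0 matches everything; avoid the 32-bit shift entirely
    } else {
        u32::MAX << (32 - prefix_len)
    }
}

fn ip_in_cidr(ip: u32, net: u32, prefix_len: u32) -> bool {
    let mask = netmask(prefix_len);
    (ip & mask) == (net & mask)
}
```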

Behavior unchanged otherwise:
- fetch_ips_from_api=false (default): identical to previous static
  scan-ips behavior.
- fetch_ips_from_api=true: fetches goog.json from www.gstatic.com,
  resolves famous Google domain IPs, prioritises matching CIDRs,
  samples up to max_ips_to_scan candidates, validates with gws/
  x-google-/alt-svc headers. If the fetch fails, falls back to the
  static list cleanly — verified locally.

Closes #10.
…ws UI diagnostics (#7)

Three user-reported fixes / features in one release.

=== PR #9 — dynamic Google IP discovery (@v4g4b0nd-0x76) ===

Already merged in the previous commit. Opt-in via 'fetch_ips_from_api'
in config. Pulls goog.json from www.gstatic.com, maps it against
resolved IPs of well-known Google domains, samples from matching
CIDRs, and validates each candidate with gws / x-google / alt-svc
response-header checks. Graceful fallback to the static list if the
fetch fails or nothing passes validation. Default is off so existing
users are unaffected. Closes #10.

=== Issue #8 — OpenWRT: 'accept: No file descriptors available' ===

OpenWRT routers ship a very low RLIMIT_NOFILE (often 1024, sometimes
256 on constrained devices). A browser's burst of ~30 parallel sub-
resource requests can fill the limit within seconds, after which
accept(2) returns EMFILE and the proxy is effectively dead.

Two-fold fix:

1. New assets/openwrt/mhrv-rs.init now sets procd limits nofile=
   "16384 16384" on the service. procd raises the per-process fd
   limit before the binary even starts.
2. New src/rlimit.rs best-effort-raises RLIMIT_NOFILE in the binary
   itself (Unix only, no new runtime deps — libc is already
   transitively present via tokio). Targets 16384 soft, capped to
   whatever hard limit the kernel already allows the user (so it
   doesn't need root).
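The procd side (layer 1) boils down to an init stanza along these lines; the command and config paths below are placeholders, not the actual contents of assets/openwrt/mhrv-rs.init:

```shell
#!/bin/sh /etc/rc.common
# Hedged sketch of the procd limits approach; paths are assumptions.
USE_PROCD=1
START=95

start_service() {
    procd_open_instance
    procd_set_param command /usr/bin/mhrv-rs --config /etc/mhrv-rs/config.json
    # Raise the per-process fd limit (soft hard) before the binary starts.
    procd_set_param limits nofile="16384 16384"
    procd_close_instance
}
```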

Both layers mean the fix applies whether the user runs via
  /etc/init.d/mhrv-rs start    (procd limits kick in)
or
  ./mhrv-rs --config ...       (in-binary bump kicks in)
or any other invocation path.

Closes #8.

=== Issue #7 — Windows UI crashes silently ===

User report: on Win 11, run.bat prints 'Starting mhrv-rs UI...' and
exits cleanly, but no UI window ever appears. Root cause: the old
run.bat used 'start "" "mhrv-rs-ui.exe"' which returns
immediately — if the UI binary dies at launch time (missing GPU
driver, RDP without GL accel, AV blocking, …), the crash is invisible
because start already disowned the child.

Fix: run the UI in-place (not via 'start'), so its stderr and exit
code land in the run.bat cmd window. On non-zero exit print a helpful
checklist of common Windows launch failures and pause so the user can
screenshot the output for an issue report.

This doesn't fix the underlying crash for affected users, but it
turns a ghost-crash bug into a self-diagnosing one so the next report
includes actionable info. Closes-via-diag #7.

=== Fixes folded into the PR #9 merge ===

- src/scan_ips.rs: rand::thread_rng() held across an .await tripped
  the Send bound on the async fn. Scoped the rng in a block so it
  drops before the subsequent awaits.
- src/scan_ips.rs: defend /0 and /32 CIDRs in cidr_to_ips and
  ip_in_cidr against 1u32 << 32 shift panic.
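The rng-scoping fix described above can be sketched as follows. This is a stand-alone illustration, not the real scan_ips code: `Rc` stands in for `ThreadRng` (both are `!Send`) so the sketch runs without the rand or tokio crates, and a tiny single-future executor replaces the tokio runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Stand-in for a real await point (e.g. a socket read).
async fn yield_point() {}

// Holding a !Send value across an .await makes the whole future !Send,
// tripping the executor's Send bound. Scoping it in a block drops it
// before the await, so the future stays Send.
async fn scoped() -> u32 {
    let n = {
        let rng = Rc::new(41u32); // !Send value, like ThreadRng
        *rng + 1
    }; // rng dropped here, before the await
    yield_point().await;
    n
}

// Compile-time proof the future is Send (would fail to compile if the
// Rc were still alive across the await).
fn assert_send<T: Send>(t: T) -> T {
    t
}

// Minimal executor so the sketch runs without tokio.
fn block_on<F: Future>(mut fut: F) -> F::Output {
    fn raw_waker() -> RawWaker {
        fn no_op(_: *const ()) {}
        fn clone(_: *const ()) -> RawWaker {
            raw_waker()
        }
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    // SAFETY: fut is stack-pinned for the duration of this call.
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}
```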

All 36 unit tests pass.
Placing a new table header before eframe silently scoped it into the
unix-only target table, so Windows builds lost the dependency entirely:

  error[E0432]: unresolved import `eframe`
  use of unresolved module or unlinked crate `eframe`

(Builds green on Mac/Linux because those hit cfg(unix) == true. Windows
was the only casualty.)

Moved the [target.'cfg(unix)'.dependencies] block to the end of
Cargo.toml, after the optional eframe line, so the main [dependencies]
table stays intact for all targets. Added a comment so this foot-gun
can't return.
…orks (closes #11)

Multiple users reported the same thing (issue #11): they trusted the
CA, then re-installed it, then deleted and re-generated it, and still
every HTTPS site through the proxy failed in the browser. The Python
version of the same project doesn't have this issue.

Root cause: rcgen's CertificateParams::default() produces a
minimum-viable x509 cert that does NOT carry:

  - ExtendedKeyUsage extension with id-kp-serverAuth
  - KeyUsage extension with digitalSignature + keyEncipherment

Modern Chrome / Firefox / Edge / Safari all reject TLS server leaves
without those. The CA trust bit didn't matter — the browser's chain
validator rejected the leaf itself with NET::ERR_CERT_INVALID before
ever consulting the trust store. So 'reinstall the CA' was powerless
to help.

Fix in src/mitm.rs::issue_leaf:
  - Set params.extended_key_usages = [ServerAuth].
  - Set params.key_usages = [DigitalSignature, KeyEncipherment].
  - Backdate not_before by 5 min to absorb clock skew between the
    MITM process and a slightly-fast client clock. Same fix in the
    CA's own not_before.

Also added src/mitm.rs::tests::leaf_has_serverauth_eku_and_key_usage
as a permanent regression guard — it parses the DER with x509-parser
and asserts the three extensions are present. Added x509-parser to
dev-dependencies (already in the tree transitively via rcgen).

Upgrade path for users affected by #11: download v0.8.1, run it. No
CA reinstall required — the CA cert itself was fine, only the per-
site leaves were broken.

Verified end-to-end locally:
  curl --cacert <ca.crt> -x http://127.0.0.1:... https://httpbin.org/ip
  curl --cacert <ca.crt> -x socks5h://127.0.0.1:... https://httpbin.org/ip
Both return JSON without cert errors, through the Apps Script relay
path. 37 unit tests pass.
…llow-up to #11)

After v0.8.1 fixed the leaf cert extensions, users reported "still
broken" — specifically Firefox showing:
  "Software is Preventing Firefox From Safely Connecting to This Site.
   drive.google.com ... This issue is caused by MasterHttpRelayVPN"
for HSTS-preloaded sites. That error is Firefox's "MITM detected AND
issuing CA isn't in my trust store" path combined with HSTS blocking
the normal override button — so users were stuck with no workaround.

Real root cause of the "still broken" reports: the CA was making it
into the OS trust store (Windows cert store / update-ca-certificates
on Linux) but NOT into the browser-specific trust stores that
Firefox and Chrome use on every OS.

Three additions:

1. Firefox: .
   For every Firefox profile we find, we now write this pref to the
   profile's user.js. It tells Firefox to trust the OS CA store, so
   our already-successful system-level install automatically covers
   Firefox on next startup. Critical on Windows (NSS certutil isn't
   on PATH there, so the certutil-based Firefox install never
   worked). Idempotent — checks for existing pref before writing
   and leaves a non-matching user value alone.

2. Chrome/Chromium on Linux: install into ~/.pki/nssdb.
   Linux Chrome uses its own shared NSS DB, independent of both the
   OS store (populated by update-ca-certificates) AND Firefox's
   per-profile NSS. Without this, users installed the CA via
   run.sh, Chrome still refused every HTTPS site, and they spiraled
   trying to re-install the CA. We now also initialize that DB
   with  if it doesn't exist yet.

3. Refactored the NSS-install path so Firefox and Chrome share a
   single install_nss_in_dir() helper. Renamed the top-level entry
   from install_firefox_nss to install_nss_stores to match scope.

Locally verified the cert itself is fine — openssl x509 -text shows
Version 3, SAN, KeyUsage (critical), ExtendedKeyUsage, and
 passes. So the leaf is correct;
what was failing was the trust-chain validation inside the specific
browser because our CA wasn't in THAT browser's trust DB.

Upgrade path: download v0.8.2 and run the launcher or
`./mhrv-rs --install-cert`. Restart Firefox/Chrome after install —
Firefox needs the restart to re-read user.js.
…isibility (issue #12)

Two reported issues:

1. Log level in the form had no visible effect — trace produced the
   same panel output as warn.
2. upstream_socks5 was reported as never being attempted.

(1) was because the UI binary never installed a tracing subscriber.
Every tracing::info!/debug!/trace! from the proxy was discarded; only
the handful of manual push_log() calls for start/stop/test reached
the 'Recent log' panel. Swapping the log level in the combo-box just
rewrote the config field — nothing consumed it.

Fix: install_ui_tracing() at startup registers a tracing_subscriber
fmt layer with a custom MakeWriter that mirrors each formatted event
line into shared.state.log. Respects RUST_LOG, defaults to 'info'
with hyper pinned to warn so the panel isn't swamped by low-level
HTTP chatter. Now the log level switch actually filters panel
output, and routing decisions show up live.

(2) is a documentation / visibility issue more than a bug. Our
upstream_socks5 routing is intentionally scoped to raw-TCP traffic
(non-HTTP, non-TLS) — HTTPS goes through the Apps Script relay,
which is the whole reason mhrv-rs exists. But without working logs,
it looks like upstream_socks5 is dead code.

Fix: every branch of dispatch_tunnel now emits a tracing::info! that
says exactly which path the connection took and, where applicable,
whether upstream_socks5 was used:

    dispatch api.telegram.org:443 -> raw-tcp (127.0.0.1:50529)
    dispatch www.google.com:443   -> sni-rewrite tunnel (Google edge direct)
    dispatch httpbin.org:443      -> MITM + Apps Script relay (TLS detected)

Combined with (1), users can now see in real time whether their
traffic is hitting upstream_socks5. If it says 'raw-tcp (direct)'
after they set the field, that's evidence of a real bug; if it
never reaches the raw-tcp branch at all, that's the documented
design (HTTPS → Apps Script).

Also per user request, updated README:
- Shields.io badges up top: latest release, total downloads, CI
  status, license, stars.
- Short 'Heads up on authorship' note crediting Anthropic's Claude
  for the bulk of the Rust port (with the human-on-every-commit
  caveat). English and Persian mirrors both have it.

All 37 unit tests pass.
…t-on-reopen bug)

A user reported that after Save-config, closing the UI, and reopening,
every form field was blank — but the config.json on disk still had all
the right values.

The culprit in the UI was load_form()'s swallow-errors pattern:

  let existing = if path.exists() {
      Config::load(&path).ok()   // .ok() threw away the error
  } else { ... };
  if let Some(c) = existing { /* populate form */ } else { /* defaults */ }

When Config::load returned an Err, .ok() silently converted to None,
the form went back to defaults, and the user had no signal at all
that the load had failed or WHY. On every platform I could test
(macOS / Linux) the round-trip works fine with a real round-trip test
added in config.rs (config::rt_tests::round_trip_all_current_fields
and round_trip_minimal_fields_only — both green). So whatever's
failing for this specific reporter is environment-specific (weird
filesystem encoding, partial write, different field shape from an
older version, … TBD). Without visibility we can't diagnose it.

Changes:

1. load_form() now returns (FormState, Option<String>). The String
   is a user-facing error message (with the full path + the
   underlying parse/validate reason) when Config::load fails on an
   existing file.
2. main() plumbs that error into App's initial toast, which sticks
   for 30 seconds (vs the normal 5 for regular toasts) so users who
   only open the UI briefly still see it.
3. Added tracing::info! in load_form for the success path too —
   the Recent log panel now always shows either 'config: loaded OK
   from <path>' or 'Config at <path> failed to load: <reason>' on
   startup, regardless of toast timing.
4. Added two regression-guard tests in config.rs covering the
   full-fields and minimal-fields save shapes the UI emits.
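The shape of change (1) can be sketched as follows; `Config` and `FormState` here are minimal stand-ins for the real types, with a toy loader, so the pattern itself is what matters:

```rust
use std::path::Path;

// Minimal stand-ins for the real Config / FormState types.
#[derive(Debug, Default, PartialEq)]
struct FormState {
    listen_port: u16,
}

struct Config {
    listen_port: u16,
}

impl Config {
    // Toy loader: the file holds just a port number.
    fn load(path: &Path) -> Result<Config, String> {
        let text = std::fs::read_to_string(path).map_err(|e| e.to_string())?;
        let listen_port = text.trim().parse::<u16>().map_err(|e| e.to_string())?;
        Ok(Config { listen_port })
    }
}

// The fix: return the error as a user-facing message instead of
// `.ok()`-swallowing it into silent defaults.
fn load_form(path: &Path) -> (FormState, Option<String>) {
    if !path.exists() {
        return (FormState::default(), None); // first run: defaults, no error
    }
    match Config::load(path) {
        Ok(c) => (FormState { listen_port: c.listen_port }, None),
        Err(reason) => (
            FormState::default(),
            Some(format!("Config at {} failed to load: {}", path.display(), reason)),
        ),
    }
}
```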

Next time a user reports this: they'll have the exact error in the
toast + the Recent log panel, and we can fix the actual bug instead
of shooting blind.
User on issue #13 reported that even after installing the CA (and
seeing it in the Windows cert manager UI), our 'Check CA' button still
said 'NOT trusted'. Root cause: is_ca_trusted() on Windows was just
returning false unconditionally — Check-CA has never worked on Windows.

Fix: is_trusted_windows() now shells out to certutil:
  certutil -user -store Root 'MasterHttpRelayVPN'
  certutil -store Root 'MasterHttpRelayVPN'

Checks both the user store (where our install_windows puts it by
default) and the machine store (fallback path when user-store install
is blocked). Requires certutil to print the cert name in stdout AND
exit 0 — belt-and-suspenders against locales where certutil exits 0
even on an empty match.

Also made the Check-CA UI message point users at the CA file path
for cross-device install — the same user reported their Android
V2rayNG client getting cert errors on our MITM-signed TLS leaves,
which is the expected 'the phone doesn't trust our CA' scenario. The
message now calls out the ca.crt path explicitly, and notes the
Android 7+ user-CA restriction (Firefox Android works, Chrome and
most apps don't trust user-installed CAs regardless).

Not addressed (by design):
- Replacing our CA keypair with Python-generated PEM fails to parse
  via rcgen. User tried this as a workaround before reporting. rcgen
  expects PKCS#8 PEM; Python's cryptography commonly emits PKCS#1
  ('BEGIN RSA PRIVATE KEY'). Even if parsing worked, mixing an
  external CA with our leaf-issuing code would break the key match.
  Users should stick with our generated CA — that's the supported
  flow. The Python cross-contamination experiment is expected to
  fail; we don't document it as supported.
Two reasons to pin a copy in the repo:

1. Users on networks where raw.githubusercontent.com is intermittent
   can still get the deploy-ready file via a repo ZIP / clone.
2. The Apps Script relay protocol between mhrv-rs and Code.gs is
   informal — upstream changes can silently break us. Keeping a
   snapshot lets future-us diff against what we tested against
   when diagnosing protocol-drift bugs.

Fetched verbatim from:
  https://raw.githubusercontent.com/masterking32/MasterHttpRelayVPN/refs/heads/python_testing/apps_script/Code.gs

Credit stays with @masterking32. The assets/apps_script/README.md
next to it calls out that we don't modify this file — users deploy
it as-is into their own Google Apps Script project.

Updated the Setup Guide link in both the English and Persian
sections so offline / restricted-network users have a fallback path.
Thanks @hamed256 — armhf cross-compile verified locally, produces a valid ARM 32-bit ELF. Merging with a follow-up commit on main to pin the runner to ubuntu-22.04 (GLIBC 2.35 floor, same policy as our other linux-gnu targets) so it runs for Raspberry Pi users on Bookworm / Bullseye.
=== PR #14 follow-up: armhf build runs on Pi Bookworm/Bullseye ===

PR #14 (merged earlier) added arm-unknown-linux-gnueabihf to the
release matrix but pinned os=ubuntu-latest, which is 24.04 with GLIBC
2.39. Target armhf sysroot on 24.04 is Debian Trixie (GLIBC 2.39),
far too new for a Raspberry Pi 2/3 on Bookworm (2.36) or Bullseye
(2.31) — users would get 'GLIBC_2.39 not found' the same way the
Linux-amd64 issue #2 folks did before we pinned them to 22.04.

Fix: pin the armhf matrix entry to ubuntu-22.04, matching our other
linux-gnu targets. Binary will link against GLIBC 2.35 max, which
loads on Pi Bookworm and Bullseye. Also trimmed two trailing spaces.

Locally verified the cross-compile: rust:latest + gcc-arm-linux-
gnueabihf + proper CARGO_HOME config.toml produces a valid ARM 32-bit
ELF (2.9 MB, armhf EABI5).

=== Issue #15: 'Check for updates' button in the UI ===

New src/update_check.rs module. On the user's click (no polling):

  1. TCP-probes github.com:443 with a 5s budget. If unreachable, we
     return Offline(reason) instead of a confusing 'update check
     failed' — distinguishes 'you're offline' from 'GitHub API
     misbehaved'.

  2. HTTPS GET api.github.com/repos/.../releases/latest via the
     tokio + rustls stack (same hand-rolled HTTP pattern as
     domain_fronter — no new crate deps). Parses tag_name, strips
     the v-prefix, loose-semver-compares to env!(CARGO_PKG_VERSION).

  3. Returns one of four UpdateCheck variants: Offline / Error /
     UpToDate / UpdateAvailable { release_url }.
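The loose-semver comparison in step 2 could look roughly like this; a sketch only, since the real is_newer may parse differently:

```rust
// Loose semver compare: strip an optional 'v' prefix, split on '.',
// compare numeric fields left to right, treating missing fields as 0
// and ignoring any non-digit suffix (e.g. "-rc1").
fn is_newer(remote: &str, local: &str) -> bool {
    fn fields(v: &str) -> Vec<u64> {
        v.trim()
            .trim_start_matches('v')
            .split('.')
            .map(|part| {
                part.chars()
                    .take_while(|c| c.is_ascii_digit())
                    .collect::<String>()
                    .parse()
                    .unwrap_or(0)
            })
            .collect()
    }
    let (r, l) = (fields(remote), fields(local));
    for i in 0..r.len().max(l.len()) {
        let a = *r.get(i).unwrap_or(&0);
        let b = *l.get(i).unwrap_or(&0);
        if a != b {
            return a > b;
        }
    }
    false // equal versions are not "newer"
}
```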

New UI wiring (src/bin/ui.rs):
  - Cmd::CheckUpdate enqueue variant
  - UiState::last_update_check { InFlight, Done(result) }
  - 'Check for updates' button next to the CA buttons
  - Result displayed as a colored small-text line under the CA info:
    green 'up to date', amber 'update available v0.8.5 → v0.8.6' with
    a clickable release-page hyperlink, red for offline/error.

Verified end-to-end with a live github.com fetch (got a rate-limit
HTTP 403 from my IP because I've been hitting the API a lot, but
that's the expected Error() state — response classification was
correct). Three unit tests for is_newer() and a gated live test for
the full round-trip.

43 tests pass.
=== UI redesign (zero new deps, same binary size) ===

Entire App::update() rewritten around three ideas:

1. Section cards. Form rows are grouped inside rounded frames with
   faint fills and small-caps headings:
     - 'Apps Script relay'  — Deployment IDs (textarea) + Auth key
     - 'Network'            — Google IP (+inline scan button), Front
                              domain, Listen host, HTTP+SOCKS5 ports
                              on one row, SNI pool button
     - Collapsing 'Advanced' — upstream SOCKS5, parallel dispatch,
                              log level, verify SSL, show auth key.
                              Closed by default — most users never
                              touch these.

2. Clearer action hierarchy. Primary buttons are accent-filled and
   larger:
     - Start  (green filled,  ▶ glyph, 120x32)
     - Stop   (red filled,    ■ glyph, 120x32)
     - Save config (blue accent filled, path shown inline after →)
     - SNI pool (blue accent filled, inside Network section)
     - Test relay (neutral, tall)
   Secondary actions (Install CA / Check CA / Check for updates)
   moved to their own compact row below, no longer competing.

3. Status + log clarity.
   - Header version links to GitHub:  → repo,  →
     the release tag page.
   - Running/stopped status is now a pill-shaped colored chip at the
     right end of the header (green fill + green dot when running,
     red when stopped).
   - Traffic stats in a 2-column layout inside the Traffic card —
     7 metrics fit in 4 rows instead of a 7-row vertical strip.
   - One compact transient status line above the log that auto-hides
     after 10 seconds — replaces the previous stack of permanent
     ca_trusted / test_msg / update_check labels that were pushing
     the log panel off-screen.
   - Log panel now has its own bordered frame (darker fill), a
     '[x] show' checkbox that hides it entirely when off, a 'save…'
     button that writes the current log buffer to a timestamped
     log-YYYYMMDD-HHMMSS.txt in the user-data dir, and a 'clear'
     button. Empty state shows a muted placeholder instead of
     silent void.

All helper functions (section, primary_button, form_row) live at the
top of ui.rs as small local helpers — no new modules, no new
dependencies.

=== Stricter end-to-end test (test_cmd.rs) ===

Previous test passed on any HTTP 200 status regardless of body.
After a user pointed out that the test reported PASS even after
they deleted their Apps Script deployment, updated the pass criteria:

  1. Status must contain '200 OK'.
  2. Body must parse as JSON.
  3. JSON must have an 'ip' field with a valid IPv4 or IPv6.

Anything else → SUSPECT (returns false), with a specific log message
like 'HTML returned instead of JSON. The Apps Script deployment may
be deleted, not published to Anyone, or requires sign-in.'

Also now emits tracing::info!/warn!/error! alongside println!, so
the verdict + detail show up in the UI's Recent log panel instead
of disappearing to a stdout nobody sees.

One new unit test: looks_like_ip() accepts v4+v6, rejects empty,
rejects malformed, rejects overflowed octets. 44 tests total, all
green.
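A stdlib-based sketch of that check (the real looks_like_ip may hand-roll its parsing):

```rust
use std::net::IpAddr;

// Accept any string the standard library parses as an IPv4 or IPv6
// address; std rejects empty strings, malformed input, and overflowed
// octets like 999.1.1.1 for us.
fn looks_like_ip(s: &str) -> bool {
    s.trim().parse::<IpAddr>().is_ok()
}
```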

Verified locally end-to-end — UI launches clean, form loads config
cleanly, Start/Stop/Save all fire correctly, Test relay produces
the new PASS/SUSPECT verdict with the tracing detail visible in
the log panel, Check-for-updates hits GitHub and resolves with the
compact auto-hiding status line.
User @barzamini pointed out an optimization from the Python community
(originally from seramo_ir): X/Twitter GraphQL URLs look like

  https://x.com/i/api/graphql/{hash}/{op}?variables=...&features=...&fieldToggles=...

The features and fieldToggles params change across sessions and even
within a session, busting our 50 MB response cache on every request to
the same logical query. Stripping everything after 'variables=' lets
identical logical queries collapse into one cache entry, dramatically
reducing quota usage when browsing Twitter through the relay.

Implementation:
  - src/domain_fronter.rs: new normalize_x_graphql_url() helper. Matches
    exactly the Python patch's pattern (host == 'x.com', path starts
    with /i/api/graphql/, query starts with variables=). Truncates at
    the first '&' past the '?'. Applied at the top of relay() so the
    normalized URL feeds BOTH the cache key AND the request sent to
    Apps Script — so we save on Apps Script quota too, not just on
    return-trip bytes.
  - src/config.rs: new opt-in normalize_x_graphql bool (default false).
    Off by default because strict X endpoints may reject trimmed requests;
    user should flip it on and watch for regressions.
  - src/bin/ui.rs: checkbox in the Advanced section,
    'Normalize X/Twitter GraphQL URLs', with tooltip explaining the
    trade-off and crediting the source.
  - Four new unit tests in domain_fronter::tests covering: the happy
    path trim, non-x.com hosts pass through unchanged, non-graphql x.com
    paths pass through unchanged, and idempotency. 48 tests total, all
    green.
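The normalization itself can be sketched as a small pure function; this mirrors the pattern described above but is an illustration, not the code in src/domain_fronter.rs:

```rust
// For x.com /i/api/graphql/ URLs whose query starts with `variables=`,
// truncate at the first '&' after the '?', dropping the session-varying
// `features` / `fieldToggles` params from the cache key.
fn normalize_x_graphql_url(url: &str) -> String {
    let is_match = url.starts_with("https://x.com/i/api/graphql/");
    if let Some(q) = url.find('?') {
        if is_match && url[q + 1..].starts_with("variables=") {
            if let Some(amp) = url[q + 1..].find('&') {
                return url[..q + 1 + amp].to_string();
            }
        }
    }
    url.to_string()
}
```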

Credit: idea by seramo_ir, Python patch at
https://gist.github.com/seramo/0ae9e5d30ac23a73d5eb3bd2710fcd67,
implementation request by @barzamini in issue #16.
…#15 follow-up)

@zula-editor reported on issue #15 that the Check-for-updates button
was returning HTTP 403 on their ISP — classic GitHub
unauthenticated-API rate limit (60/hour per IP) on a shared NAT IP.
They also asked for the update to actually be downloadable from the
app, not just a page link.

Both addressed:

=== Route update check through our own proxy when running ===

New mhrv_rs::update_check::Route enum:
  - Direct: straight rustls to api.github.com (existing behavior)
  - Proxy { host, port }: HTTP CONNECT through our local HTTP proxy
    listener → MITM → Apps Script → api.github.com.

When the proxy is running, the UI automatically picks Proxy. From
GitHub's POV the request now comes from Apps Script's IP range (a
Google datacenter) — completely different rate-limit bucket from the
user's ISP IP, AND works even if GitHub is blocked on their network.

Routing over proxy means the MITM leaf for api.github.com has to be
trusted in the update_check's TLS config. build_root_store() now
conditionally adds our own CA cert from data_dir::ca_cert_path() to
the webpki roots when Route::Proxy is in use. Direct path is
unchanged.

=== Download button ===

The UpdateCheck::UpdateAvailable variant now carries an optional
ReleaseAsset { name, download_url, size_bytes } picked by
pick_asset_for_platform() from the GitHub API's assets[] array.
Preference list per (OS, arch):
  - macOS arm64 → mhrv-rs-macos-arm64-app.zip, else tar.gz
  - macOS amd64 → mhrv-rs-macos-amd64-app.zip, else tar.gz
  - Windows → mhrv-rs-windows-amd64.zip
  - Linux aarch64 → mhrv-rs-linux-arm64.tar.gz
  - Linux armv7 → mhrv-rs-raspbian-armhf.tar.gz
  - Linux x86_64 → mhrv-rs-linux-amd64.tar.gz
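The preference table above boils down to a match like this; the real pick_asset_for_platform works against the GitHub assets[] array (with the ".zip, else tar.gz" fallback on macOS), so this sketch only encodes the first-choice names:

```rust
// First-choice release asset per (OS, arch), per the table above.
fn preferred_asset(os: &str, arch: &str) -> Option<&'static str> {
    Some(match (os, arch) {
        ("macos", "aarch64") => "mhrv-rs-macos-arm64-app.zip",
        ("macos", "x86_64") => "mhrv-rs-macos-amd64-app.zip",
        ("windows", _) => "mhrv-rs-windows-amd64.zip",
        ("linux", "aarch64") => "mhrv-rs-linux-arm64.tar.gz",
        ("linux", "armv7") => "mhrv-rs-raspbian-armhf.tar.gz",
        ("linux", "x86_64") => "mhrv-rs-linux-amd64.tar.gz",
        _ => return None, // unknown platform: no asset, page link only
    })
}
```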

UI: when an update is available AND we have an asset, the transient
status line grows an accent-blue 'Download X.Y MB' button. Clicking
fires Cmd::DownloadUpdate, which pipes the asset through the same
Route (proxy if running, direct otherwise), writes it to
UserDirs::download_dir() (~/Downloads on most systems), and shows a
'show in folder' button that opens Finder / Explorer / xdg-open on
the containing directory.

Three new unit tests for asset-picking. The gated live test now
takes a Route argument (Direct) so it keeps working across the API
shape change. 49 tests pass.

Also refreshed in-repo releases/ archives to v0.9.1 alongside.
…sue #18)

@Behzad9 on #18: the OpenWRT 'No file descriptors available' errors
are back in v0.8.0+, this time logged as a wall of thousands of
identical ERRORs within seconds of activating the proxy. Two real
bugs, now fixed:

=== 1. accept() loop had no backoff ===

Previous code:
    loop {
        match listener.accept().await {
            Ok(x) => ...,
            Err(e) => { tracing::error!(...); continue; }  // tight loop
        }
    }

On EMFILE (RLIMIT_NOFILE exhausted), accept() returns synchronously,
the match re-runs instantly, accept() EMFILEs again, forever. The tight
loop ALSO starves the tokio runtime of CPU that existing connections
need to finish and close their fds — so the problem never clears on its
own. It's a self-sustaining meltdown.

New accept_backoff() helper (in proxy_server.rs) wraps both the HTTP
and SOCKS5 accept loops:
  - Detects EMFILE/ENFILE via raw_os_error (24 or 23).
  - Sleeps proportional to how long the pressure has lasted (50 ms
    first hit, ramping to a 2 s cap around hit #40). Gives existing
    connections a chance to finish and free fds.
  - Rate-limits the log line: one WARN on the first EMFILE with fix
    instructions, then one every 100 retries. No more walls of
    identical errors.
  - Resets the counter on the next successful accept.
  - Non-EMFILE errors (ECONNABORTED from clients that went away during
    handshake, etc.) get a plain single-line error + 5 ms sleep so we
    still don't tight-loop on any unexpected error.
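The ramp can be sketched as a pure function of consecutive EMFILE hits, assuming the linear 50 ms-per-hit shape implied by "50 ms first hit, 2 s cap around hit #40" (the real accept_backoff may ramp differently):

```rust
use std::time::Duration;

// Errno values for fd exhaustion on Linux: EMFILE = 24 (per-process
// limit), ENFILE = 23 (system-wide limit).
fn is_fd_exhaustion(err: &std::io::Error) -> bool {
    matches!(err.raw_os_error(), Some(24) | Some(23))
}

// 50 ms on the first consecutive hit, growing linearly, capped at 2 s
// (reached around hit #40, since 40 * 50 ms = 2 s).
fn emfile_backoff(consecutive_hits: u32) -> Duration {
    let ms = 50u64.saturating_mul(u64::from(consecutive_hits.max(1)));
    Duration::from_millis(ms.min(2_000))
}
```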

End-to-end verified: ran mhrv-rs under , flooded the
SOCKS5 port with 247 concurrent connections to trip EMFILE. Before:
log would have been 1000s of identical lines. After: exactly 1 warning,
listener stayed quiet, fds drained, accept resumed.

=== 2. RLIMIT_NOFILE bump was too conservative + silent ===

Previous behavior: target 16384 soft, cap to existing hard limit,
no log. On constrained systems where hard is already tiny, we'd
stay at the tiny limit silently.

rlimit.rs now:
  - Targets 65536 soft.
  - ALSO tries to raise the hard limit up to /proc/sys/fs/nr_open
    on Linux (Linux allows a non-privileged process to bump its own
    hard limit up to the kernel ceiling, usually 1048576 on modern
    kernels). On macOS/BSD we skip this — only bump soft.
  - Logs WARN on startup if soft ends up <4096 with the exact fix
    ('ulimit -n 65536' or use the procd init). No more silent
    failure.
  - Logs INFO with the before/after limits otherwise, so field bug
    reports tell us immediately whether the kernel cap is the real
    bottleneck.

Moved the rlimit call from main() pre-logging to post-init_logging so
its tracing output actually lands in the log panel + stderr. Small
reorganization only.

49 tests pass, musl x86_64 cross-compile verified locally.
therealaleph and others added 29 commits April 24, 2026 17:16
Bug fix release. My v1.2.12 merge of Mode::Full bypassed the
deployment-ID + auth-key check on Android, but Full mode talks to
CodeFull.gs on Apps Script and needs those same credentials.

Users selecting "Full tunnel (no cert)" with empty fields would see
the VPN service bail silently instead of surfacing a clear "config
incomplete" error. Vahidlazio's fix changes the gate from
`mode == APPS_SCRIPT` to `mode != GOOGLE_ONLY` and removes the
Mode.FULL bypass in the Start button's enabled-state.

Also includes a UX refactor of the Deployment IDs editor (per-row
entries with add/remove buttons instead of raw newline-separated text),
making multi-deployment setups easier to manage on Android — useful
now that Full Tunnel Mode users routinely scale to 5+ deployments
across their Google accounts.

Android-only diff; Rust side is byte-identical to v1.2.12.
Vahidlazio flagged that the README's full-mode quick-start read as
"you need 3-12 separate Google accounts," which is wrong — you need
3-12 deployment IDs, which can all live on one Google account (each
"New deployment" produces its own ID). Going multi-account only
buys daily-quota headroom; the pipeline depth itself scales fine on
one account up to Apps Script's simultaneous-execution ceiling.

Also rewrites the accompanying recipe into three concrete tiers
(solo / small group / large group) instead of waving vaguely at a
range.

Closes #126. Also obsoletes my earlier analysis on #61 where I told
@Feiabyte that same-account multi-deployment "does not give
throughput" — that was wrong; it does, because simultaneous Apps
Script executions aren't bottlenecked by a per-deployment limit
until you pile up more than ~30 concurrent executions per account,
which 12 deployments don't come close to. Posting a correction on
#61 separately.
…ollow-up)

My earlier commit 47727b2 fixed the English quick-start but missed
that vahidlazio's PR #126 also rewrote the Persian FAQ entry
"چند Deployment ID لازم دارم؟" ("How many Deployment IDs do I need?")
with the same correction. Mirroring that here so both language tracks
say the same thing.

Persian wording credit to @vahidlazio from #126.
Adds:
- Shield badge in the top badge row linking to https://sh1n.org/donate
- Dedicated "Support this project" section in English (above the
  ltr/rtl divider) explaining what donations cover
- Dedicated "حمایت از پروژه" ("Support this project") section in Persian mirroring the same
  content inside the RTL block

Donations cover hosting, self-hosted CI runner costs, and continued
maintenance. Starring the repo remains the free equivalent.
New config field `passthrough_hosts: Vec<String>` that lists hostnames
which should bypass Apps Script relay entirely and pass through as
plain TCP (via upstream_socks5 if set). Applies across all modes —
apps_script, google_only, and full — because it expresses user intent
("never relay this host") that should win over the default routing.

Matching rules:
- Exact: "example.com" matches only example.com
- Suffix: ".example.com" matches example.com AND any subdomain
- Case-insensitive; trailing dots normalized
- Empty / whitespace-only entries are ignored
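The matching rules above can be captured in a std-only sketch; the
function name here is illustrative (the crate's real helper is
matches_passthrough()):

```rust
/// Passthrough matching per the rules above:
/// - "example.com"  matches only example.com (exact)
/// - ".example.com" matches example.com and any subdomain (suffix)
/// - case-insensitive, trailing dots normalized, blank entries skipped
fn host_matches(entries: &[&str], host: &str) -> bool {
    let host = host.trim().trim_end_matches('.').to_ascii_lowercase();
    entries.iter().any(|e| {
        let e = e.trim().trim_end_matches('.').to_ascii_lowercase();
        if e.is_empty() {
            return false; // empty / whitespace-only entries ignored
        }
        if let Some(bare) = e.strip_prefix('.') {
            // Suffix rule: the bare domain itself, or any subdomain.
            host == bare || host.ends_with(&format!(".{}", bare))
        } else {
            host == e
        }
    })
}
```

Note the subdomain check includes the leading dot, so ".example.com"
cannot accidentally match "badexample.com".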

Dispatch order is now:
  0. passthrough_hosts  ← new, highest priority
  1. Mode::Full         → batch tunnel
  2. SNI-rewrite        → direct Google edge
  3. Mode::GoogleOnly   → plain-tcp
  4. Mode::AppsScript   → peek + MITM/relay/plain-tcp

Wired through:
- src/config.rs: new Config field with serde default
- src/proxy_server.rs: RewriteCtx.passthrough_hosts + matches_passthrough()
  helper + dispatch check as step 0 + 5 new unit tests
- src/bin/ui.rs: FormState + ConfigWire round-trip so the desktop UI
  preserves user entries across save/load

Android ConfigStore.kt wiring and a UI editor will land in a follow-up.

80 tests pass (75 → 80 with the new passthrough match tests).
The 504 "Relay timeout" message was too terse — users hit it after
their daily UrlFetchApp quota exhausts and had no idea why a setup
that worked yesterday suddenly started failing.

New message explicitly names the most common cause (daily quota
reset at 00:00 UTC) and points at the script's Executions tab for
the real server-side error. Same three-line explanation #99, #111,
#105 each independently got from me in issue comments — now it's in
the browser page body where the user actually sees it.
Pipeline depth was artificially capped at 12. Users with 20+
deployments across multiple accounts were wasting pipeline capacity.

Now: pipeline_depth = num_scripts (minimum 2, no upper cap).
The connection pool (80) is the natural ceiling.
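The new sizing rule is a one-liner; the function name is illustrative:

```rust
/// Pipeline depth now scales with the deployment count: minimum 2,
/// no upper cap (the 80-connection pool is the practical ceiling).
fn pipeline_depth(num_scripts: usize) -> usize {
    num_scripts.max(2)
}
```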

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a daily-budget visualization for users worried about hitting the
Apps Script free-tier quota (20,000 UrlFetchApp calls/day).

Usage today card (desktop + Android):
- today_calls / today_bytes / today_key / today_reset_secs atomics on
  DomainFronter, hooked into the bytes_relayed fetch_add path so we only
  count successful relays (matching what Google actually billed)
- Daily rollover at 00:00 UTC, std-only date math (Hinnant's
  civil_from_days) — no chrono/time dep pull
- StatsSnapshot extended with the four new fields + to_json() for the
  Android JNI bridge
- Desktop UI renders the card right under the existing Traffic stats
  with a hyperlink to https://script.google.com/home/usage for the
  authoritative Google-side number
- Android UI renders the same card via Compose, polling
  Native.statsJson(handle) once a second only while the proxy is up,
  with an Intent(ACTION_VIEW, …) opening the dashboard URL
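The std-only date math behind the rollover can be sketched as follows,
in the spirit of Howard Hinnant's civil_from_days algorithm; function
names mirror the test names mentioned below but are otherwise
illustrative:

```rust
/// Convert a Unix timestamp to a UTC (year, month, day) triple using
/// only integer arithmetic — no chrono/time dependency.
fn unix_to_ymd_utc(unix_secs: i64) -> (i64, u32, u32) {
    let days = unix_secs.div_euclid(86_400);
    let z = days + 719_468; // shift epoch from 1970-01-01 to 0000-03-01
    let era = z.div_euclid(146_097);
    let doe = z - era * 146_097;                          // [0, 146096]
    let yoe = (doe - doe / 1460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100);    // [0, 365]
    let mp = (5 * doy + 2) / 153;                         // [0, 11]
    let d = (doy - (153 * mp + 2) / 5 + 1) as u32;        // [1, 31]
    let m = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32; // [1, 12]
    let y = yoe + era * 400 + if m <= 2 { 1 } else { 0 };
    (y, m, d)
}

/// Seconds remaining until the next 00:00 UTC — always in (0, 86400].
fn seconds_until_utc_midnight(unix_secs: i64) -> i64 {
    86_400 - unix_secs.rem_euclid(86_400)
}
```

The (y, m, d) triple is what the rollover compares against today_key to
decide when to zero the counters.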

JNI / state plumbing:
- New Java_…_statsJson reads the Arc<DomainFronter> kept in slot_map
- VpnState.proxyHandle StateFlow so HomeScreen knows which handle to
  poll without poking into the service's internal state
- MhrvVpnService publishes the handle on start, zeroes on teardown

Persian localization:
- HowToUseBody (5-step guide + Cloudflare Turnstile note) was
  hardcoded English even when locale=FA. Ported to a string resource
  with a full FA translation in values-fa/strings.xml. Persian users
  no longer drop to English at the bottom of the screen.

Also lands the deferred Android ConfigStore.kt wiring for
passthrough_hosts (commit 889f94c shipped the Rust + desktop side).

82 tests pass (added: unix_to_ymd_utc_handles_known_epochs,
seconds_until_utc_midnight_is_bounded). Built and visually verified on
both desktop and Android emulator (mhrv_test AVD).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat: remove pipeline depth cap — scales with deployment count
Adds a Straightforward README that's short, plain-language, and skips
the technical "how it works" diagrams. Covers the same setup flow as
the main README but with friendlier wording, plus a "Common issues"
section that surfaces the most-asked-about gotchas (YouTube SafeSearch
loop #61, Cloudflare Turnstile loop, 504 Relay timeout, daily quota,
"connected but nothing loads"). Both English and Persian.

The main README's index line now offers four links instead of two:
  - Quick Start (EN)
  - Full English README
  - راهنمای خلاصه فارسی (Persian quick guide)
  - راهنمای کامل فارسی (full Persian guide)

So users who don't have a deep networking background can land on a
guide tailored to them, and users who want the full picture still
have a single click.

Closes #135.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ipeline depth

Apps Script enforces 30 concurrent executions per account. The old pipeline
used a single global semaphore sized to the number of deployments, meaning
1 in-flight batch per deployment. Now each deployment ID gets its own
semaphore with 30 permits — matching the actual per-account limit.

With N deployments the system can sustain 30×N concurrent batch requests
instead of N.
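The shape of the change can be sketched with a std-only permit counter
standing in for whatever async semaphore the real pipeline uses; all
names here are illustrative:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Std-only permit counter (non-blocking stand-in for an async semaphore).
struct Sem {
    permits: Mutex<u32>,
}

impl Sem {
    fn new(n: u32) -> Self {
        Self { permits: Mutex::new(n) }
    }
    fn try_acquire(&self) -> bool {
        let mut p = self.permits.lock().unwrap();
        if *p > 0 { *p -= 1; true } else { false }
    }
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
    }
}

const PER_ACCOUNT_LIMIT: u32 = 30; // Apps Script concurrent-execution cap

/// One 30-permit semaphore per deployment ID, replacing the old single
/// global semaphore that was sized to the number of deployments.
fn per_deployment_sems(ids: &[&str]) -> HashMap<String, Sem> {
    ids.iter()
        .map(|id| (id.to_string(), Sem::new(PER_ACCOUNT_LIMIT)))
        .collect()
}
```

Exhausting one deployment's permits no longer stalls the others, which
is exactly the 30×N behavior described above.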

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: per-deployment concurrency (30 req/account)
GitHub Releases is filtered from inside IR, and the 50 MB universal APK
is a bottleneck for users on slow/unstable censorship-tunnel paths that
can't reliably pull that much data. Per-ABI APKs are ~18–23 MB each —
small enough to succeed where the universal fails.

Build changes:
- android/app/build.gradle.kts: enabled `splits { abi { ... } }` with
  `isUniversalApk = true`, producing five release APKs:
    app-universal-release.apk     ~53 MB  (all 4 ABIs)
    app-arm64-v8a-release.apk     ~21 MB  (95%+ of modern devices)
    app-armeabi-v7a-release.apk   ~18 MB  (older 32-bit ARM)
    app-x86_64-release.apk        ~23 MB  (emulators, Chromebooks)
    app-x86-release.apk           ~22 MB  (legacy 32-bit Intel)
  abiFilters is retained for the universal build; splits.abi layers on
  per-ABI outputs without removing it.

- .github/workflows/release.yml: rename step now copies all 5 APKs to
  dist/ under versioned names (mhrv-rs-android-{abi}-v{VER}.apk),
  logs a warning if any per-ABI APK is missing, and hard-fails only if
  the universal is missing. Universal keeps its existing download path
  and filename so Telegram mirrors / previous-version update prompts
  keep working.

The release + telegram aggregation jobs downstream don't need changes —
they already use `files: dist/*` and `mhrv-rs-android-universal` artifact
name respectively.

Local build verified: clean assembleRelease produces all 5 APKs with the
expected size ratios (arm64-v8a is 41% the universal size).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…roid

fix(android): app-splitting ONLY mode: don't mix allowed/disallowed apps
@creep247 raised a fair concern: v1.2.9's forwarded-header stripping
handles the client-side leg (browser extensions / local proxies
inserting X-Forwarded-For before the request reaches Apps Script),
but it cannot cover whatever Google's infrastructure may add when
the Apps Script runtime's subsequent UrlFetchApp.fetch() hits the
target server — that leg is outside this client's control.

Added a paragraph to both the English and Persian "Security posture"
sections making the model honest:
  - what v1.2.9's stripping DOES cover (client-side added headers)
  - what it DOES NOT cover (Google's internal header chain on the
    fetch from Apps Script runtime → destination)
  - recommendation: users whose threat model requires the destination
    site cannot under any circumstances learn their IP should use
    Full Tunnel mode, which exits via the user's own VPS end-to-end

No code change — the privacy claim is narrower than a naive reading
of "v1.2.9 fixed the IP leak" might suggest, so the docs should say
so explicitly rather than let users over-trust the apps_script mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When ONLY mode contains only this app or stale packages, no allowed application is successfully added. Android then treats the VPN as applying to every app, which can route more traffic than requested and can also include our own proxy traffic in the TUN path. Count successful addAllowedApplication calls and fall back to the existing ALL-mode self-exclusion behavior when none are usable.
Rolls up the four post-v1.2.14 commits on main into a single tagged
release. Highlights:

- Per-deployment concurrency (#142): each deployment ID gets its own
  30-permit semaphore, so setups with deployments across multiple
  Google accounts get a genuine 30×N throughput ceiling. Single-account
  setups still cap at Google's per-account 30-simultaneous limit —
  docs (EN + FA) updated to call that out.
- Android app-splitting ONLY-mode bug fix (#143): the previous code
  called both addAllowedApplication and addDisallowedApplication,
  which Android documents as mutually exclusive. ONLY mode was
  silently failing establish(). Now fixed.
- Per-ABI Android APKs (#136): ships four split APKs (arm64-v8a ~21 MB,
  armeabi-v7a ~18 MB, x86_64 ~23 MB, x86 ~22 MB) alongside the ~53 MB
  universal. Huge distribution win for users on unreliable
  censorship-tunnel paths — the 21 MB arm64-v8a download succeeds
  where the universal doesn't.
- Honest IP-exposure note in Security Posture (#148): clarified that
  v1.2.9's forwarded-header stripping only covers the client-side leg;
  what Google's own infrastructure may add on the UrlFetchApp.fetch()
  second leg is outside this client's control. Full Tunnel mode is
  the recommendation for threat models where that matters.
- Telegram release-post format: added Persian preambles above both
  links (GitHub repo + full Persian guide; release page + desktop/
  router builds) so channel readers see the intent at a glance.

82 tests pass. Desktop + Android builds both verified clean locally
across the v1.2.15+ commit series.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Range-parallel relay trusted the origin's Content-Range total before planning remaining ranges and preallocating the stitched response buffer. A bad upstream could advertise an absurd total and force unbounded memory/CPU work. Bound synthetic stitching to 64 MiB and fall back to a normal single GET for larger totals, preserving correctness without returning a fake truncated 200 response.
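The gate described above boils down to parsing the advertised total and
comparing it against the cap; the function names are illustrative, not
the relay's actual API:

```rust
/// 64 MiB ceiling on synthetic range stitching.
const STITCH_CAP: u64 = 64 * 1024 * 1024;

/// Parse the total length out of a Content-Range value such as
/// "bytes 0-1023/5000". Returns None for unknown ("*") or malformed
/// totals, which the caller treats as "don't stitch".
fn content_range_total(value: &str) -> Option<u64> {
    value.trim().strip_prefix("bytes ")?.split('/').nth(1)?.parse().ok()
}

/// Whether the origin's advertised total is small enough to plan
/// parallel ranges and preallocate a stitched buffer; anything larger
/// (or unparseable) falls back to a normal single GET.
fn stitching_allowed(content_range: &str) -> bool {
    matches!(content_range_total(content_range), Some(t) if t <= STITCH_CAP)
}
```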
…mpty

fix(android): keep empty ONLY split lists self-excluded
…-cap

fix(relay): cap synthetic range stitching size
feat(tunnel): save one RTT per new HTTPS flow via connect_data op
Rolls up #150 + #151 + #153 (all merged on main) into a tagged release.

Highlights:

- Full Tunnel mode opens new HTTPS connections ~500-2000 ms faster
  (#153). The new connect_data tunnel op bundles the client's first
  bytes (typically TLS ClientHello) with the CONNECT call, eliminating
  one full Apps Script round trip per new flow. Backward compat is
  handled via UNSUPPORTED_OP detection + sticky AtomicBool fallback +
  pending_client_data replay so older deployments keep working without
  byte loss. New `connect_data preread: X win / Y loss / Z skip`
  metric in logs lets us measure the win ratio empirically.
- Android ONLY-mode split fix (#150): when the allow-list contained
  only mhrv-rs or stale uninstalled packages, every
  addAllowedApplication call silently failed and Android applied the
  TUN to every app — looping our own proxy traffic. Now we count
  successful adds; if zero, we fall back to ALL-mode self-exclusion.
  Complements PR #143 which fixed the empty-list case.
- Memory-safety cap on relay range stitching (#151): a hostile or
  buggy origin could advertise an absurd Content-Range total
  (e.g. 10 GiB) and force range-parallel to plan millions of chunks
  and preallocate a huge stitched buffer. Now capped at 64 MiB; larger
  totals fall back to a normal single GET.
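The sticky-fallback part of the connect_data change (#153) can be
sketched with std atomics; the type and method names are illustrative:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Sticky fallback switch for the connect_data op: once any reply
/// comes back UNSUPPORTED_OP, every later flow skips the
/// bundled-first-bytes path for the rest of the process lifetime.
struct ConnectDataState {
    unsupported: AtomicBool,
}

impl ConnectDataState {
    fn new() -> Self {
        Self { unsupported: AtomicBool::new(false) }
    }

    /// Should this flow try bundling the ClientHello with CONNECT?
    fn try_preread(&self) -> bool {
        !self.unsupported.load(Ordering::Relaxed)
    }

    /// Record an UNSUPPORTED_OP reply. The already-read client bytes
    /// are kept (pending_client_data in the real code) and replayed
    /// over the plain CONNECT path, so nothing is lost against older
    /// deployments.
    fn mark_unsupported(&self) {
        self.unsupported.store(true, Ordering::Relaxed);
    }
}
```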

Tests: 91 lib tests pass (was 82; +1 from #151, +8 from #153).
Tunnel-node: 6 tests pass (all new from #153).

Local Android build verified — universal + four per-ABI APKs all
produced at expected sizes (universal 53 MB, arm64-v8a 21 MB,
armeabi-v7a 18 MB, x86_64 23 MB, x86 22 MB). Installed on mhrv_test
emulator, app launches and renders correctly with v1.4.0 in title.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #153's connect_data instrumentation imported `AtomicU64` directly
from `std::sync::atomic`, which works on every release target except
the 32-bit MIPS OpenWRT target (`mipsel-unknown-linux-musl`) — that
platform has no hardware-backed 64-bit atomics, and Rust's std type
isn't even defined there. The v1.4.0 release workflow's mipsel build
failed with `error[E0432]: no AtomicU64 in sync::atomic`.

`domain_fronter.rs` already imports `portable_atomic::AtomicU64` for
the same reason — `portable-atomic` (already in Cargo.toml with the
"fallback" feature) provides a software-emulated 64-bit atomic on
targets that need it. Apply the same import here for the new
preread_* counter and connect_data_unsupported flag.

`AtomicBool` stays in std — it works on every target, no polyfill
needed.

Verified locally: cargo build + cargo test --lib (91/91 pass) clean
with the import change.

Same v1.4.0 — no version bump. Re-runs the release workflow to publish
the missing mhrv-rs-openwrt-mipsel-softfloat artifact alongside the
existing v1.4.0 release assets. Telegram notification suppressed for
this re-publish (same version, no point re-posting).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The original `on: push: tags: 'v*'` trigger is the right primary path
— a new tag = a new release = a fresh workflow run. But it has a
failure mode: if one matrix job fails (e.g. mipsel-softfloat is
tier-3 and occasionally regresses) and we push a fix to main, we
can't move the immutable tag (tag protection rule), so the original
artifact stays missing from that release forever.

`workflow_dispatch` with a `version` input lets us re-run the workflow
from main against an existing release tag:

    gh workflow run release.yml --ref main -f version=1.4.0

The build matrix runs against the current main commit (which has
the fix), and the release-upload step uploads to the existing
v1.4.0 release page with the same filename pattern, alongside the
artifacts that succeeded on the original push.

Three call sites had to learn the new path: the macOS .app bundle
build, the Android APK rename step, and the Telegram caption builder.
All three previously did `VER="${GITHUB_REF#refs/tags/v}"`, which on
a workflow_dispatch from main produces "heads/main" — the new
two-line `VER="${{ inputs.version || github.ref_name }}"; VER="${VER#v}"`
pattern handles both: tag pushes get the bare version from
`github.ref_name` ("v1.4.0" → "1.4.0"), workflow_dispatch gets it
from the explicit input.

The release job's `softprops/action-gh-release@v2` step also needed an
explicit `tag_name` — without it the action defaults to `github.ref`,
which on workflow_dispatch is the dispatch ref (`refs/heads/main`)
and would try to create a release named "main". Now it's:

    tag_name: ${{ inputs.version && format('v{0}', inputs.version) || github.ref_name }}

Pair with `gh variable set TELEGRAM_NOTIFY_ENABLED --body false` before
dispatch when the re-run is for the same version (no new content for
the channel to see) — the Telegram job is already gated on that
variable, no per-run flag needed.

Used right now to publish the mhrv-rs-openwrt-mipsel-softfloat artifact
that failed on v1.4.0's original push, after commit 6a0d45d fixed the
underlying compile error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related fixes for the self-hosted Linux jobs failing on
`actions/checkout@v4` with `EACCES: permission denied unlink
target/.rustc_info.json`:

1. Pre-checkout cleanup. The previous mipsel docker step left root-
   owned files in target/ when its `cargo +nightly build` failed —
   the post-step `sudo chown -R …` line was AFTER the docker run,
   and bash -e short-circuited on docker non-zero, so chown never
   executed. Once those root-owned files exist on the runner,
   every subsequent self-hosted job's `actions/checkout@v4` clean
   step fails because it can't unlink them. Add a guarded
   pre-checkout step that uses `sudo rm -rf` to clear root-owned
   leftovers from $GITHUB_WORKSPACE. Gated on
   `contains(matrix.os, 'self-hosted')` so GitHub-hosted runners
   (macOS, Windows) skip it — they get fresh VMs anyway.

2. Always-chown trap. The mipsel docker step now uses
   `trap '…chown…' EXIT` so the chown runs whether the inner
   `cargo build` succeeded or failed. A transient mipsel compile
   regression (tier-3 target, occasional std-build breakage) no
   longer poisons the runner's workspace for the next run.

Together: (1) recovers from the existing broken state on next run,
(2) prevents the same poisoned state from recurring.

Triggered for the v1.4.0 redeploy that's trying to publish the
missing mhrv-rs-openwrt-mipsel-softfloat artifact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>