
Fix ArtifactoryReferenceToken detector: tighten regex, emit unverified when no domain#4945

Open
asivaprasad09 wants to merge 6 commits into trufflesecurity:main from asivaprasad09:fix-artifactory-reference-token-detector


@asivaprasad09 asivaprasad09 commented May 4, 2026

Summary

Problem

The existing ArtifactoryReferenceToken detector had two issues:

  1. Incorrect regex — pattern cmVmdGtu[a-zA-Z0-9]{54,60} used a variable length range (54-60) which is wrong. Reference tokens always decode to reftkn:01:<expiry>:<random> making the full base64 prefix cmVmdGtuOj (encoding of reftkn:) and total token length always exactly 64 chars.

  2. Silent drop — if no *.jfrog.io domain was found in the same chunk as the token, the detector emitted no result at all. Tokens stored separately from their domain were completely invisible.

Changes

  • Tighten regex: cmVmdGtu[a-zA-Z0-9]{54,60} → cmVmdGtuOj[a-zA-Z0-9]{54} (exact 64 chars, correct prefix)
  • Tighten keyword: cmVmdGtu → cmVmdGtuOj for more specific pre-filtering and fewer false positives
  • Emit unverified when no domain — token is always flagged; if a *.jfrog.io domain is found in the same chunk, verification is attempted as before
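The emission rule in the last bullet can be sketched roughly as follows. This is a minimal illustration rather than the detector's actual code: the finding type, scanChunk, uniqueMatches, the simplified domainPat, and the verify hook are all hypothetical stand-ins, and only tokenPat is taken from this PR.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// tokenPat is the tightened pattern proposed in this PR.
	tokenPat = regexp.MustCompile(`\b(cmVmdGtuOj[a-zA-Z0-9]{54})\b`)
	// domainPat is a simplified, hypothetical stand-in for the real *.jfrog.io matcher.
	domainPat = regexp.MustCompile(`\b([a-zA-Z0-9-]+\.jfrog\.io)\b`)
)

// finding is a hypothetical stand-in for the detector's result type.
type finding struct {
	Token    string
	Domain   string // empty when no domain was found in the chunk
	Verified bool
}

// uniqueMatches returns the distinct first capture groups of re in s,
// in first-seen order.
func uniqueMatches(re *regexp.Regexp, s string) []string {
	seen := map[string]struct{}{}
	var out []string
	for _, m := range re.FindAllStringSubmatch(s, -1) {
		if _, ok := seen[m[1]]; ok {
			continue
		}
		seen[m[1]] = struct{}{}
		out = append(out, m[1])
	}
	return out
}

// scanChunk pairs each token with every domain found in the same chunk.
// With no domain present it still emits one unverified finding per token.
// verify is a caller-supplied hook standing in for the real HTTP check.
func scanChunk(chunk string, verify func(token, domain string) bool) []finding {
	tokens := uniqueMatches(tokenPat, chunk)
	domains := uniqueMatches(domainPat, chunk)

	var out []finding
	for _, tok := range tokens {
		if len(domains) == 0 {
			out = append(out, finding{Token: tok})
			continue
		}
		for _, d := range domains {
			out = append(out, finding{Token: tok, Domain: d, Verified: verify(tok, d)})
		}
	}
	return out
}

func main() {
	const tok = "cmVmdGtuOjAxOjE4MDg1NDE4OTQ6VWxMVVpVQ1d2dkFISEEzR0EyUldjUUp1MjRk"
	rejectAll := func(string, string) bool { return false }

	fmt.Printf("%+v\n", scanChunk("token: "+tok, rejectAll))
	fmt.Printf("%+v\n", scanChunk("url: demo.jfrog.io token: "+tok, rejectAll))
}
```

Both calls yield exactly one finding; only the second carries a domain, and it is the only one for which the verify hook is invoked.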

Before vs After

| | Before | After |
|---|---|---|
| Regex | cmVmdGtu[a-zA-Z0-9]{54,60} | cmVmdGtuOj[a-zA-Z0-9]{54} |
| Token length matched | 62–68 chars | Always exactly 64 chars |
| Keyword | cmVmdGtu | cmVmdGtuOj |
| Token alone, no domain | ❌ Silently dropped | ⚠️ Flagged unverified |
| Token + domain in same chunk | ✅ Verified | ✅ Verified |

Test plan

  • Regex matches real token cmVmdGtuOjAxOjE4MDg1NDE4OTQ6VWxMVVpVQ1d2dkFISEEzR0EyUldjUUp1MjRk (decodes to reftkn:01:1808541894:...)
  • Token alone → unverified result emitted (not silently dropped)
  • Token + demo.jfrog.io → verification attempted, correctly returns unverified on 401
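The first test-plan item can be checked locally with a short Go program. This is a sketch: decodeReferenceToken is a hypothetical helper name, while the pattern and sample token are the ones quoted in this PR.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"regexp"
	"strings"
)

// tokenPat is the tightened pattern proposed in this PR.
var tokenPat = regexp.MustCompile(`\b(cmVmdGtuOj[a-zA-Z0-9]{54})\b`)

// decodeReferenceToken base64-decodes a candidate token. A 64-character
// token decodes to 48 bytes, so no padding is involved.
func decodeReferenceToken(tok string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(tok)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	const sample = "cmVmdGtuOjAxOjE4MDg1NDE4OTQ6VWxMVVpVQ1d2dkFISEEzR0EyUldjUUp1MjRk"

	fmt.Println(len(sample), tokenPat.MatchString(sample)) // 64 true

	plain, err := decodeReferenceToken(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(strings.HasPrefix(plain, "reftkn:01:1808541894:")) // true
}
```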

Note

Medium Risk
Updates multiple secret detectors’ regexes and result emission/verification behavior, which can change detection coverage and verification network traffic patterns. Risk is moderate due to potential false positive/negative shifts and new external API verification calls.

Overview
Improves secret detection coverage and correctness across detectors. The ArtifactoryReferenceToken detector now matches the fixed 64-char base64 format (more specific prefix/keyword) and emits an unverified result when no *.jfrog.io domain is present, instead of dropping the token.

Hardens Grafana service account detection and verification. The GrafanaServiceAccount regex is tightened, deduping of keys/domains is added, tokens are emitted unverified when no *.grafana.net domain is found, and verification is refactored to treat 403 as verified (authenticated but unauthorized), with new unit tests.

Adds Together AI API key scanning. Introduces a new TogetherAI detector (pattern + optional verification against api.together.xyz) and wires it into defaults and DetectorType proto/Go enum, with tests and benchmarks.

Reviewed by Cursor Bugbot for commit 8f8b539. Bugbot is set up for automated code reviews on this repo. Configure here.

Akshara Sivaprasad and others added 3 commits May 4, 2026 12:12
Adds a detector for Together AI API keys (tgp_v1_ format).
Verifies keys via GET /v1/models endpoint.
…us codes, deduplicate

- Previously tokens with no co-located domain were silently dropped.
  Now an unverified result is always emitted so the token is never lost.
- Switch to common.SaneHttpClient() consistent with all other detectors.
- Properly drain response body with io.Copy(io.Discard) to avoid
  connection leaks.
- Treat 403 as determinately invalid (was falling into error bucket).
- Deduplicate keys and domains before processing to avoid duplicate results.
- Extract verification into verifyGrafanaServiceAccount() helper.
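Two of the points in this commit message, deduplication and draining the response body, can be sketched as below. The helper names dedupe and statusOf are hypothetical and the request is deliberately simplified (no auth header, no context); it is not the code from verifyGrafanaServiceAccount().

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// dedupe returns the distinct values of in, preserving first-seen order.
func dedupe(in []string) []string {
	seen := make(map[string]struct{}, len(in))
	out := make([]string, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

// statusOf issues a GET and drains the body before closing it, so the
// transport can reuse the underlying connection instead of discarding it.
func statusOf(client *http.Client, url string) (int, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer func() {
		_, _ = io.Copy(io.Discard, resp.Body)
		_ = resp.Body.Close()
	}()
	return resp.StatusCode, nil
}

func main() {
	// Local test server that always answers 403, used only for illustration.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusForbidden)
	}))
	defer srv.Close()

	code, err := statusOf(srv.Client(), srv.URL)
	fmt.Println(code, err)
	fmt.Println(dedupe([]string{"key1", "key2", "key1"}))
}
```

How a given status code maps to verified, unverified, or error is left out on purpose, since the commit message and the later overview describe the 403 case differently.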
…d when no domain

- Tighten regex from cmVmdGtu[a-zA-Z0-9]{54,60} to cmVmdGtuOj[a-zA-Z0-9]{54}
  Reference tokens always decode to reftkn:01:<expiry>:<random> making
  the full prefix cmVmdGtuOj (base64 of 'reftkn:') and total length
  always exactly 64 chars. Variable length range was incorrect.
- Update keyword from cmVmdGtu to cmVmdGtuOj for more specific
  pre-filtering and fewer false positives.
- Emit unverified result when no jfrog.io domain is found in the chunk.
  Previously tokens without a co-located domain were silently dropped.
@asivaprasad09 asivaprasad09 requested a review from a team May 4, 2026 08:26
@asivaprasad09 asivaprasad09 requested review from a team as code owners May 4, 2026 08:26
Comment thread pkg/detectors/grafanaserviceaccount/grafanaserviceaccount.go Outdated
@CLAassistant

CLAassistant commented May 4, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 3 committers have signed the CLA.

✅ asivaprasad09
❌ Akshara Sivaprasad
❌ cursoragent


Akshara Sivaprasad does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

tokenPat = regexp.MustCompile(`\b(cmVmdGtu[A-Za-z0-9]{56})\b`)
// Reference tokens are base64-encoded strings of "reftkn:01:<expiry>:<random>"
// Fixed format: prefix "cmVmdGtuOj" (base64 of "reftkn:") + exactly 54 base64 chars = 64 total
tokenPat = regexp.MustCompile(`\b(cmVmdGtuOj[a-zA-Z0-9]{54})\b`)
Contributor


The Base64 encoding of reftkn: should be cmVmdGtuOg==, which doesn’t seem to match what’s included in this PR. Am I missing something here?

From what I can tell, the original implementation already accounts for this, as noted in the comments:

	// Reference tokens are base64-encoded strings starting with "reftkn:01|<version>:<expiry>:<random>"
	// The base64 encoding of "reftkn" is "cmVmdGtu", total length is always 64 characters

Have you encountered a different format in real-world usage?
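The two prefixes being compared here can be reproduced with a few lines of Go. This is only an illustrative check; b64 is a hypothetical helper. Encoding "reftkn:" on its own pads the final colon, which gives the Og== ending, whereas in a longer string the colon shares a base64 group with the bytes that follow, so "reftkn:01" encodes with the Oj form.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// b64 returns the standard base64 encoding of s.
func b64(s string) string {
	return base64.StdEncoding.EncodeToString([]byte(s))
}

func main() {
	fmt.Println(b64("reftkn"))    // cmVmdGtu
	fmt.Println(b64("reftkn:"))   // cmVmdGtuOg==
	fmt.Println(b64("reftkn:01")) // cmVmdGtuOjAx
}
```

The second character of that group is j whenever the byte after the colon has 0x3 as its high nibble, which covers the ASCII digits, so cmVmdGtuOj corresponds to "reftkn:" followed by a digit rather than to "reftkn:" alone.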

Author


Tightened the regex using the observed token structure and exact-length matching to reduce overmatching and improve precision.

cursoragent and others added 2 commits May 6, 2026 07:04
Co-authored-by: asivaprasad09 <asivaprasad09@users.noreply.github.com>
Co-authored-by: asivaprasad09 <asivaprasad09@users.noreply.github.com>
Comment thread pkg/detectors/grafanaserviceaccount/grafanaserviceaccount.go
Co-authored-by: asivaprasad09 <asivaprasad09@users.noreply.github.com>

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


defaultClient = common.SaneHttpClient()

// Together AI API key pattern: tgp_v1_ followed by 43 characters (alphanumeric, underscore, hyphen)
keyPat = regexp.MustCompile(`\b(tgp_v1_[A-Za-z0-9_-]{43})\b`)

Trailing \b fails to match tokens ending with hyphen

Medium Severity

The regex \b(tgp_v1_[A-Za-z0-9_-]{43})\b includes - in the character class but uses \b (word boundary) at the end. Since - is a non-word character, any valid token whose 43-character suffix ends with - will fail to match when followed by whitespace, punctuation, or end-of-string — the most common contexts in secret scanning. About 1/64 of randomly generated keys would be silently missed.
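The behavior described above can be reproduced with synthetic keys. In the sketch below, keyPat is the pattern from this PR, while altPat and sampleKey are hypothetical: altPat is one possible replacement that swaps the trailing \b for an explicit delimiter-or-end check (Go's RE2 engine has no lookahead), not a fix proposed in the PR.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// keyPat is the pattern from this PR.
	keyPat = regexp.MustCompile(`\b(tgp_v1_[A-Za-z0-9_-]{43})\b`)
	// altPat is a hypothetical alternative: the key must be followed by a
	// character outside the key alphabet or by end of input. The key itself
	// is capture group 1; the full match may include the delimiter.
	altPat = regexp.MustCompile(`\b(tgp_v1_[A-Za-z0-9_-]{43})(?:[^A-Za-z0-9_-]|$)`)
)

// sampleKey builds a synthetic (not real) key whose 43-character suffix
// ends with last.
func sampleKey(last string) string {
	return "tgp_v1_" + strings.Repeat("a", 42) + last
}

func main() {
	for _, last := range []string{"a", "-"} {
		line := "KEY=" + sampleKey(last) + " # comment"
		fmt.Println(keyPat.MatchString(line), altPat.MatchString(line))
	}
	// Output:
	// true true
	// false true
}
```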


4 participants