This specification defines the configuration model, processing rules, and environment semantics for the Agentic Workflow Firewall (AWF). It is the normative reference for:
- the `awf` CLI runtime (`--config`)
- tooling that compiles workflows into AWF invocations (e.g., `gh-aw`)
- IDE and static-analysis validation via JSON Schema
The machine-readable schema is published alongside this specification at
docs/awf-config.schema.json (live, tracking main) and as a versioned
release asset (e.g.,
https://github.com/github/gh-aw-firewall/releases/download/v0.23.1/awf-config.schema.json).
This document is normative. Informative notes are marked with Note: or placed in blockquotes. All other text is normative unless stated otherwise.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
A conforming AWF configuration document is one that:
- is valid JSON or YAML;
- satisfies all constraints defined by `docs/awf-config.schema.json`; and
- contains no properties beyond those defined by the schema (closed-world assumption).
A conforming AWF implementation MUST accept every conforming configuration document and MUST reject every non-conforming one.
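As a rough illustration of the closed-world rule, the sketch below flags unknown top-level keys. It is not a substitute for validating against the published schema, which also constrains nested properties; the function name and the sample document are hypothetical, and the allowed set is copied from the root-object table later in this specification.

```python
# Top-level properties named in this specification's root-object table.
ALLOWED_TOP_LEVEL = {
    "$schema", "network", "apiProxy", "security", "container",
    "environment", "logging", "rateLimiting",
}


def unknown_top_level_keys(document: dict) -> list[str]:
    """Return the top-level keys that the closed-world rule would reject."""
    return sorted(key for key in document if key not in ALLOWED_TOP_LEVEL)


# Hypothetical document with a misspelled property name.
config = {"network": {"allowDomains": ["github.com"]}, "netwrok": {}}
print(unknown_top_level_keys(config))
```

A non-empty result means the document is non-conforming and MUST be rejected.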
When the user invokes awf --config <path|-> -- <command>, a conforming
implementation MUST execute the following steps in order:
1. If `<path>` is `-`, read configuration bytes from standard input.
2. Determine the serialisation format:
   - If `<path>` ends with `.json`, parse as JSON.
   - If `<path>` ends with `.yaml` or `.yml`, parse as YAML.
   - Otherwise, attempt JSON first; if that fails, attempt YAML.
3. Validate the parsed document against `docs/awf-config.schema.json`.
4. On validation failure, abort with non-zero exit status (see §7).
5. Map configuration fields to CLI-option semantics per §5.
6. Apply precedence rules per §3.
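The format-detection step can be sketched as a small helper that returns which parsers to try and in what order. This is a minimal sketch: the function name is hypothetical, and treating the extension match as case-sensitive is an assumption the specification does not settle.

```python
def parse_order(path: str) -> list[str]:
    """Return the parser names to try, in order, for a --config path."""
    if path.endswith(".json"):
        return ["json"]
    if path.endswith((".yaml", ".yml")):
        return ["yaml"]
    # Covers "-" (standard input) and any other extension:
    # attempt JSON first, then fall back to YAML.
    return ["json", "yaml"]


for candidate in ("awf.json", "awf.yml", "-"):
    print(candidate, parse_order(candidate))
```

Returning an ordered list keeps the fallback rule in one place, so the caller only needs to loop over the result and stop at the first parser that succeeds.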
The effective value for any configuration parameter SHALL be determined by the following precedence order (highest wins):
1. Explicit CLI flags
2. Config file (`--config`)
3. AWF internal defaults
Note: This model enables reusable, checked-in configuration files with environment-specific CLI overrides.
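The precedence order can be illustrated with a layered lookup. This is a sketch under assumptions: settings are modelled as flat dictionaries, `cli_flags` contains only flags the user passed explicitly, and the keys and values shown are hypothetical rather than AWF's real defaults.

```python
from collections import ChainMap


def effective_settings(cli_flags: dict, config_file: dict, defaults: dict) -> dict:
    """Resolve each setting from the highest-precedence source that defines it."""
    # ChainMap searches its mappings left to right, so the first one wins.
    return dict(ChainMap(cli_flags, config_file, defaults))


# Hypothetical values for illustration only.
defaults = {"logLevel": "info", "memoryLimit": "2g"}
config_file = {"logLevel": "debug"}
cli_flags = {"memoryLimit": "4g"}

print(effective_settings(cli_flags, config_file, defaults))
```

Here `logLevel` comes from the config file because no CLI flag sets it, while `memoryLimit` comes from the CLI flag, which outranks the internal default.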
The root object of a conforming configuration document MAY contain the following top-level properties. All are OPTIONAL:
| Property | Type | Description |
|---|---|---|
| `$schema` | string | JSON Schema URI for IDE validation |
| `network` | object | Network egress configuration |
| `apiProxy` | object | API proxy sidecar configuration |
| `security` | object | Security and isolation settings |
| `container` | object | Container and Docker settings |
| `environment` | object | Environment variable propagation (see §8) |
| `logging` | object | Logging and diagnostics |
| `rateLimiting` | object | Egress rate limiting |
Property-level constraints, types, and descriptions are defined
normatively by docs/awf-config.schema.json.
This section is normative.
Tools generating AWF invocations (such as gh-aw) SHOULD use the mapping
below. The left side is the configuration-document path; the right side is
the corresponding CLI flag.
- `network.allowDomains[]` → `--allow-domains <csv>`
- `network.blockDomains[]` → `--block-domains <csv>`
- `network.dnsServers[]` → `--dns-servers <csv>`
- `network.upstreamProxy` → `--upstream-proxy`
- `apiProxy.enabled` → `--enable-api-proxy`
- `apiProxy.enableOpenCode` → `--enable-opencode`
- `apiProxy.anthropicAutoCache` → `--anthropic-auto-cache`
- `apiProxy.anthropicCacheTailTtl` → `--anthropic-cache-tail-ttl <5m|1h>`
- `apiProxy.maxEffectiveTokens` → (config-only; no CLI equivalent)
- `apiProxy.modelMultipliers` → (config-only; no CLI equivalent)
- `apiProxy.models` → (config-only; model alias rewriting)
- `apiProxy.auth.type` → (config-only; maps to `AWF_AUTH_TYPE`)
- `apiProxy.auth.provider` → (config-only; maps to `AWF_AUTH_PROVIDER`)
- `apiProxy.auth.oidcAudience` → (config-only; maps to `AWF_AUTH_OIDC_AUDIENCE`)
- `apiProxy.auth.azureTenantId` → (config-only; maps to `AWF_AUTH_AZURE_TENANT_ID`)
- `apiProxy.auth.azureClientId` → (config-only; maps to `AWF_AUTH_AZURE_CLIENT_ID`)
- `apiProxy.auth.azureScope` → (config-only; maps to `AWF_AUTH_AZURE_SCOPE`)
- `apiProxy.auth.azureCloud` → (config-only; maps to `AWF_AUTH_AZURE_CLOUD`)
- `apiProxy.auth.awsRoleArn` → (config-only; maps to `AWF_AUTH_AWS_ROLE_ARN`)
- `apiProxy.auth.awsRegion` → (config-only; maps to `AWF_AUTH_AWS_REGION`)
- `apiProxy.auth.awsRoleSessionName` → (config-only; maps to `AWF_AUTH_AWS_ROLE_SESSION_NAME`)
- `apiProxy.auth.gcpWorkloadIdentityProvider` → (config-only; maps to `AWF_AUTH_GCP_WORKLOAD_IDENTITY_PROVIDER`)
- `apiProxy.auth.gcpServiceAccount` → (config-only; maps to `AWF_AUTH_GCP_SERVICE_ACCOUNT`)
- `apiProxy.auth.gcpScope` → (config-only; maps to `AWF_AUTH_GCP_SCOPE`)
- `apiProxy.targets.<provider>.host` → `--<provider>-api-target`
- `apiProxy.targets.openai.basePath` → `--openai-api-base-path`
- `apiProxy.targets.anthropic.basePath` → `--anthropic-api-base-path`
- `apiProxy.targets.gemini.basePath` → `--gemini-api-base-path`
- `security.sslBump` → `--ssl-bump`
- `security.enableDlp` → `--enable-dlp`
- `security.enableHostAccess` → `--enable-host-access`
- `security.allowHostPorts` → `--allow-host-ports`
- `security.allowHostServicePorts` → `--allow-host-service-ports`
- `security.difcProxy.host` → `--difc-proxy-host`
- `security.difcProxy.caCert` → `--difc-proxy-ca-cert`
- `container.memoryLimit` → `--memory-limit`
- `container.agentTimeout` → `--agent-timeout`
- `container.enableDind` → `--enable-dind`
- `container.workDir` → `--work-dir`
- `container.containerWorkDir` → `--container-workdir`
- `container.imageRegistry` → `--image-registry`
- `container.imageTag` → `--image-tag`
- `container.skipPull` → `--skip-pull`
- `container.buildLocal` → `--build-local`
- `container.agentImage` → `--agent-image`
- `container.tty` → `--tty`
- `container.dockerHost` → `--docker-host`
- `environment.envFile` → `--env-file`
- `environment.envAll` → `--env-all`
- `environment.excludeEnv[]` → `--exclude-env` (repeatable)
- `logging.logLevel` → `--log-level`
- `logging.diagnosticLogs` → `--diagnostic-logs`
- `logging.auditDir` → `--audit-dir`
- `logging.proxyLogsDir` → `--proxy-logs-dir`
- `logging.sessionStateDir` → `--session-state-dir`
- `rateLimiting.enabled: false` → `--no-rate-limit`
- `rateLimiting.requestsPerMinute` → `--rate-limit-rpm`
- `rateLimiting.requestsPerHour` → `--rate-limit-rph`
- `rateLimiting.bytesPerMinute` → `--rate-limit-bytes-pm`
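The mapping above mixes four shapes: array fields joined into a CSV value, boolean fields that become bare flags, array fields that repeat a flag, and one field that maps only when explicitly `false`. The sketch below covers one entry of each shape. The function name and sample values are hypothetical, and it shows how a generator such as `gh-aw` might apply the mapping, not how any existing tool does.

```python
def to_cli_args(config: dict) -> list[str]:
    """Translate a small subset of configuration fields into CLI arguments."""
    args: list[str] = []

    allow_domains = config.get("network", {}).get("allowDomains")
    if allow_domains:
        # Array field collapsed into a single comma-separated value.
        args += ["--allow-domains", ",".join(allow_domains)]

    if config.get("apiProxy", {}).get("enabled"):
        # Boolean field emitted as a bare flag.
        args.append("--enable-api-proxy")

    for name in config.get("environment", {}).get("excludeEnv", []):
        # Repeatable flag: one occurrence per array element.
        args += ["--exclude-env", name]

    if config.get("rateLimiting", {}).get("enabled") is False:
        # Only an explicit false maps to the negative flag.
        args.append("--no-rate-limit")

    return args


sample = {
    "network": {"allowDomains": ["github.com", "api.github.com"]},
    "apiProxy": {"enabled": True},
    "environment": {"excludeEnv": ["FOO", "BAR"]},
    "rateLimiting": {"enabled": False},
}
print(to_cli_args(sample))
```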
The following CLI flag has no config-file equivalent by design:
- `-e, --env <KEY=VALUE>`: inject a single environment variable into the agent container (repeatable; CLI-only)
A conforming implementation MUST accept --config - to read configuration
from standard input, enabling programmatic and pipeline scenarios.
On parse or validation failure, a conforming implementation MUST:
- exit with a non-zero status code;
- emit a diagnostic message identifying the location and nature of the error; and
- refrain from partial execution of the agent command.
This section is normative.
The agent container's environment is constructed by merging variables from multiple sources. This section defines the merge order and exclusion rules.
Note: For usage guidance, examples, and troubleshooting, see docs/environment.md.
Variables from the following sources are merged in order of increasing precedence. A value set at a higher level MUST override the same-named value from any lower level.
| Level | Source | Description |
|---|---|---|
| 1 (lowest) | AWF-reserved | Proxy routing, DNS, container paths |
| 2 | `--env-all` | Inherited host environment (when enabled) |
| 3 | `--env-file` | Variables read from a file |
| 4 (highest) | `-e` / `--env` | Explicit CLI key-value pairs |
A conforming implementation MUST set the following variables in the agent
container regardless of user configuration. Values from --env-all and
--env-file MUST NOT override these variables.
| Variable | Value | Purpose |
|---|---|---|
| `HTTP_PROXY` | `http://<squid-ip>:3128` | Squid forward proxy for HTTP |
| `HTTPS_PROXY` | `http://<squid-ip>:3128` | Squid forward proxy for HTTPS |
| `https_proxy` | `http://<squid-ip>:3128` | Lowercase alias (Yarn 4, undici, Corepack) |
| `NO_PROXY` | `localhost,127.0.0.1,::1,...` | Loopback and container IPs bypassing Squid |
| `SQUID_PROXY_HOST` | `squid-proxy` | Proxy hostname (for tools requiring host separately) |
| `SQUID_PROXY_PORT` | `3128` | Proxy port |
| `PATH` | (container default) | MUST use the container's PATH, not the host's |
| `HOME` | (host user's home) | Derived via sudo-aware detection |
Note: Lowercase `http_proxy` is intentionally NOT set. Certain curl builds on Ubuntu 22.04 ignore uppercase `HTTP_PROXY` for HTTP URLs (httpoxy mitigation), causing HTTP traffic to fall through to iptables DNAT interception, which is the intended defense-in-depth behavior.
The following variables MUST be excluded from --env-all and --env-file
passthrough. A conforming implementation MUST NOT inherit them from the host:
| Category | Variables |
|---|---|
| System | PATH, PWD, OLDPWD, SHLVL, _, SUDO_COMMAND, SUDO_USER, SUDO_UID, SUDO_GID |
| Proxy | HTTP_PROXY, HTTPS_PROXY, http_proxy, https_proxy, NO_PROXY, no_proxy, ALL_PROXY, all_proxy, FTP_PROXY, ftp_proxy |
| Actions artifact tokens | ACTIONS_RUNTIME_TOKEN, ACTIONS_RESULTS_URL |
| AWF internal controls | AWF_PREFLIGHT_BINARY, AWF_GEMINI_ENABLED |
Note: Host proxy variables are read for upstream proxy auto-detection (see
--upstream-proxy) but MUST NOT propagate into the agent container. AWF sets its own proxy variables pointing to Squid.
When --env-all is NOT active, a conforming implementation SHOULD forward
the following host variables into the agent container:
| Category | Variables |
|---|---|
| GitHub authentication | GITHUB_TOKEN, GH_TOKEN, GITHUB_PERSONAL_ACCESS_TOKEN |
| GitHub enterprise | GITHUB_SERVER_URL, GITHUB_API_URL |
| Actions OIDC | ACTIONS_ID_TOKEN_REQUEST_URL, ACTIONS_ID_TOKEN_REQUEST_TOKEN |
| Docker client | DOCKER_HOST, DOCKER_TLS, DOCKER_TLS_VERIFY, DOCKER_CERT_PATH, DOCKER_CONFIG, DOCKER_CONTEXT, DOCKER_API_VERSION, DOCKER_DEFAULT_PLATFORM |
| User environment | USER, XDG_CONFIG_HOME |
When --env-all IS active, all host variables not in the excluded set
(§8.3) SHALL be forwarded, subject to credential isolation rules (§9).
Variables passed via -e / --env MUST override all other sources,
including AWF-reserved variables. This is the only mechanism by which proxy
routing variables MAY be overridden.
Note: There is no config-file equivalent for
-e/--env. Individual environment variable injection is a runtime concern, not a static configuration concern.
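The merge and exclusion rules in this section can be sketched as a single function. This is one possible reading: passthrough sources are filtered against the excluded set, AWF-reserved values are then applied so that `--env-all` and `--env-file` cannot override them, and `-e` / `--env` pairs are applied last. The function and parameter names are hypothetical, and the sketch leaves out the default passthrough list and credential isolation (§9).

```python
# Excluded set from the table in this section.
EXCLUDED = {
    "PATH", "PWD", "OLDPWD", "SHLVL", "_",
    "SUDO_COMMAND", "SUDO_USER", "SUDO_UID", "SUDO_GID",
    "HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy",
    "NO_PROXY", "no_proxy", "ALL_PROXY", "all_proxy", "FTP_PROXY", "ftp_proxy",
    "ACTIONS_RUNTIME_TOKEN", "ACTIONS_RESULTS_URL",
    "AWF_PREFLIGHT_BINARY", "AWF_GEMINI_ENABLED",
}


def build_agent_env(reserved: dict, host_env: dict, env_file: dict,
                    cli_env: dict, env_all: bool = False) -> dict:
    """Merge environment sources for the agent container (illustrative only)."""
    passthrough: dict = {}
    if env_all:
        passthrough.update(host_env)   # --env-all
    passthrough.update(env_file)       # --env-file outranks --env-all
    env = {k: v for k, v in passthrough.items() if k not in EXCLUDED}
    env.update(reserved)               # reserved values survive passthrough
    env.update(cli_env)                # -e / --env overrides everything
    return env


reserved = {"HTTP_PROXY": "http://squid:3128", "SQUID_PROXY_PORT": "3128"}
host_env = {"HTTP_PROXY": "http://corp:8080", "EDITOR": "vim", "SQUID_PROXY_PORT": "9"}
env_file = {"EDITOR": "nano"}
cli_env = {"HTTP_PROXY": "http://override:1"}
print(build_agent_env(reserved, host_env, env_file, cli_env, env_all=True))
```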
This section is normative.
AWF implements defense-in-depth credential isolation for LLM API keys.
Behavior is governed by the value of apiProxy.enabled.
Note: For architectural diagrams and protocol-level details, see docs/authentication-architecture.md.
A conforming implementation MUST recognize the following environment variables as source credentials — real API keys read from the host:
| Variable | Provider |
|---|---|
| `OPENAI_API_KEY` | OpenAI |
| `ANTHROPIC_API_KEY` | Anthropic (Claude) |
| `COPILOT_GITHUB_TOKEN` | GitHub Copilot |
| `COPILOT_API_KEY` | GitHub Copilot (BYOK) |
| `GEMINI_API_KEY` | Google Gemini |
The following secondary aliases SHOULD also be recognized:
OPENAI_KEY, CODEX_API_KEY, CLAUDE_API_KEY,
COPILOT_PROVIDER_API_KEY.
When the API proxy sidecar is enabled, the following rules apply:
1. Source credentials (§9.1) MUST NOT be exposed in the agent container's environment. They SHALL be passed exclusively to the API proxy sidecar.

2. The `--env-all` flag MUST NOT reintroduce excluded credentials into the agent environment.

3. A conforming implementation MAY inject placeholder values into the agent container for tool compatibility (e.g., `OPENAI_API_KEY=sk-placeholder-for-api-proxy`). Placeholder values are not secrets and MUST NOT be treated as credentials.

4. A conforming implementation MUST inject proxy-routing variables so that agent tools reach the sidecar rather than upstream APIs:

   | Agent variable | Value | Purpose |
   |---|---|---|
   | `OPENAI_BASE_URL` | `http://172.30.0.30:10000` | Routes OpenAI calls to sidecar |
   | `ANTHROPIC_BASE_URL` | `http://172.30.0.30:10001` | Routes Anthropic calls to sidecar |
   | `COPILOT_API_URL` | `http://172.30.0.30:10002` | Routes Copilot calls to sidecar |
   | `GOOGLE_GEMINI_BASE_URL` | `http://172.30.0.30:10003` | Routes Gemini calls to sidecar |
   | `GEMINI_API_BASE_URL` | `http://172.30.0.30:10003` | Alias for compatibility |

5. The API proxy sidecar SHALL inject the real credentials into upstream requests. Sidecar port assignments: 10000 (OpenAI), 10001 (Anthropic), 10002 (Copilot), 10003 (Gemini), 10004 (OpenCode).
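The proxy-enabled rules can be sketched as a split of the candidate environment into what the agent sees and what only the sidecar receives. The function name is hypothetical, and injecting the placeholder only when a real `OPENAI_API_KEY` was present is an assumption; the specification merely permits placeholders.

```python
SOURCE_CREDENTIALS = {
    "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "COPILOT_GITHUB_TOKEN",
    "COPILOT_API_KEY", "GEMINI_API_KEY",
    # Secondary aliases listed in §9.1.
    "OPENAI_KEY", "CODEX_API_KEY", "CLAUDE_API_KEY", "COPILOT_PROVIDER_API_KEY",
}

SIDECAR_ROUTES = {
    "OPENAI_BASE_URL": "http://172.30.0.30:10000",
    "ANTHROPIC_BASE_URL": "http://172.30.0.30:10001",
    "COPILOT_API_URL": "http://172.30.0.30:10002",
    "GOOGLE_GEMINI_BASE_URL": "http://172.30.0.30:10003",
    "GEMINI_API_BASE_URL": "http://172.30.0.30:10003",
}


def isolate_credentials(env: dict) -> tuple[dict, dict]:
    """Return (agent_env, sidecar_secrets) for proxy-enabled mode."""
    sidecar = {k: v for k, v in env.items() if k in SOURCE_CREDENTIALS}
    agent = {k: v for k, v in env.items() if k not in SOURCE_CREDENTIALS}
    agent.update(SIDECAR_ROUTES)  # route agent tools to the sidecar
    if "OPENAI_API_KEY" in sidecar:
        # Optional compatibility placeholder; not a secret.
        agent["OPENAI_API_KEY"] = "sk-placeholder-for-api-proxy"
    return agent, sidecar


agent_env, secrets = isolate_credentials({"OPENAI_API_KEY": "sk-real", "LANG": "C"})
print(sorted(agent_env))
print(sorted(secrets))
```

Running the split after every passthrough source has been merged is one way to ensure `--env-all` cannot reintroduce a source credential.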
When the API proxy sidecar is disabled (the default):
- Source credentials present in the host environment SHOULD be forwarded directly to the agent container.
- No proxy-routing variables or placeholder values SHALL be injected.
Real credentials forwarded to the agent — whether source credentials in
non-proxy mode (§9.3) or GitHub tokens (GITHUB_TOKEN, GH_TOKEN) — MUST
be protected by the one-shot-token mechanism. Protected tokens are cached
on first access and removed from /proc/self/environ to prevent
environment variable inspection.
The default protected token list is:
COPILOT_GITHUB_TOKEN, GITHUB_TOKEN, GH_TOKEN, GITHUB_API_TOKEN,
GITHUB_PAT, GH_ACCESS_TOKEN, OPENAI_API_KEY, OPENAI_KEY,
ANTHROPIC_API_KEY, CLAUDE_API_KEY, CODEX_API_KEY, COPILOT_API_KEY,
COPILOT_PROVIDER_API_KEY
Placeholder compatibility values (§9.2 item 3) are not secrets and MUST NOT be subject to one-shot protection.
When apiProxy.auth.type is set to github-oidc, the API proxy sidecar
exchanges a GitHub Actions OIDC token for a provider-specific access token.
The apiProxy.auth.provider field (default: azure) selects the token
exchange protocol. A conforming implementation MUST:
1. Forward the common OIDC configuration to the sidecar via the following environment variables:

   | Config path | Environment variable | Required | Default |
   |---|---|---|---|
   | `apiProxy.auth.type` | `AWF_AUTH_TYPE` | ✅ | — |
   | `apiProxy.auth.provider` | `AWF_AUTH_PROVIDER` | No | `azure` |
   | `apiProxy.auth.oidcAudience` | `AWF_AUTH_OIDC_AUDIENCE` | No | (provider-specific) |

2. Forward the GitHub Actions OIDC runtime tokens (`ACTIONS_ID_TOKEN_REQUEST_URL`, `ACTIONS_ID_TOKEN_REQUEST_TOKEN`) to the sidecar when `AWF_AUTH_TYPE=github-oidc`. These are injected automatically by the Actions runner when the workflow declares `permissions: id-token: write`.

3. NOT expose the exchanged provider token in the agent container environment. The sidecar SHALL inject it into upstream request headers.
Exchanges the GitHub OIDC JWT for an Azure AD / Microsoft Entra access
token via workload identity federation. The sidecar injects the resulting
token as a Bearer Authorization header on upstream requests.
| Config path | Environment variable | Required | Default |
|---|---|---|---|
| `apiProxy.auth.azureTenantId` | `AWF_AUTH_AZURE_TENANT_ID` | ✅ | — |
| `apiProxy.auth.azureClientId` | `AWF_AUTH_AZURE_CLIENT_ID` | ✅ | — |
| `apiProxy.auth.azureScope` | `AWF_AUTH_AZURE_SCOPE` | No | `https://cognitiveservices.azure.com/.default` |
| `apiProxy.auth.azureCloud` | `AWF_AUTH_AZURE_CLOUD` | No | `public` |
Default OIDC audience: api://AzureADTokenExchange
Note: `azureTenantId` and `azureClientId` are required for Azure AD federated credential exchange but MAY be omitted when using managed identity. See docs/api-proxy-sidecar.md for protocol-level details.
Exchanges the GitHub OIDC JWT for temporary AWS credentials via
sts.amazonaws.com AssumeRoleWithWebIdentity. The sidecar uses these
credentials to sign upstream requests to AWS Bedrock using SigV4.
| Config path | Environment variable | Required | Default |
|---|---|---|---|
| `apiProxy.auth.awsRoleArn` | `AWF_AUTH_AWS_ROLE_ARN` | ✅ | — |
| `apiProxy.auth.awsRegion` | `AWF_AUTH_AWS_REGION` | ✅ | — |
| `apiProxy.auth.awsRoleSessionName` | `AWF_AUTH_AWS_ROLE_SESSION_NAME` | No | `awf-oidc-session` |
Default OIDC audience: sts.amazonaws.com
Note: AWS Bedrock uses IAM/SigV4 request signing rather than Bearer tokens. This means the sidecar MUST sign the complete request (method, path, headers, body hash) with the temporary credentials; it is not sufficient to inject a single `Authorization` header.
Exchanges the GitHub OIDC JWT for a GCP access token via the Security
Token Service (sts.googleapis.com), optionally followed by service
account impersonation via iamcredentials.googleapis.com. The sidecar
injects the resulting token as a Bearer Authorization header.
| Config path | Environment variable | Required | Default |
|---|---|---|---|
| `apiProxy.auth.gcpWorkloadIdentityProvider` | `AWF_AUTH_GCP_WORKLOAD_IDENTITY_PROVIDER` | ✅ | — |
| `apiProxy.auth.gcpServiceAccount` | `AWF_AUTH_GCP_SERVICE_ACCOUNT` | No | — |
| `apiProxy.auth.gcpScope` | `AWF_AUTH_GCP_SCOPE` | No | `https://www.googleapis.com/auth/cloud-platform` |
Default OIDC audience: the gcpWorkloadIdentityProvider value
When gcpServiceAccount is provided, the sidecar performs a two-step
exchange:
1. Exchange the GitHub OIDC JWT for a federated access token via GCP STS.
2. Impersonate the service account to obtain a short-lived OAuth2 token.
When gcpServiceAccount is omitted, only step 1 is performed and the
federated token is used directly. This requires that the federated
principal has direct access grants on the target resource.
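The config-path-to-environment-variable tables in this section can be collapsed into one lookup. The sketch below forwards only the fields present in `apiProxy.auth`; it does not apply defaults, on the assumption that the sidecar does so, and it does not check required fields. The function name and the sample identifiers are hypothetical.

```python
# Field name under apiProxy.auth -> sidecar environment variable.
AUTH_ENV_NAMES = {
    "type": "AWF_AUTH_TYPE",
    "provider": "AWF_AUTH_PROVIDER",
    "oidcAudience": "AWF_AUTH_OIDC_AUDIENCE",
    "azureTenantId": "AWF_AUTH_AZURE_TENANT_ID",
    "azureClientId": "AWF_AUTH_AZURE_CLIENT_ID",
    "azureScope": "AWF_AUTH_AZURE_SCOPE",
    "azureCloud": "AWF_AUTH_AZURE_CLOUD",
    "awsRoleArn": "AWF_AUTH_AWS_ROLE_ARN",
    "awsRegion": "AWF_AUTH_AWS_REGION",
    "awsRoleSessionName": "AWF_AUTH_AWS_ROLE_SESSION_NAME",
    "gcpWorkloadIdentityProvider": "AWF_AUTH_GCP_WORKLOAD_IDENTITY_PROVIDER",
    "gcpServiceAccount": "AWF_AUTH_GCP_SERVICE_ACCOUNT",
    "gcpScope": "AWF_AUTH_GCP_SCOPE",
}


def sidecar_auth_env(auth: dict) -> dict:
    """Map the fields present in apiProxy.auth to sidecar environment variables."""
    # Keys outside the table are skipped here; schema validation should
    # already have rejected them.
    return {AUTH_ENV_NAMES[k]: str(v) for k, v in auth.items() if k in AUTH_ENV_NAMES}


# Hypothetical Azure configuration with placeholder identifiers.
auth = {
    "type": "github-oidc",
    "azureTenantId": "00000000-0000-0000-0000-000000000000",
    "azureClientId": "11111111-1111-1111-1111-111111111111",
}
print(sidecar_auth_env(auth))
```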
When security.difcProxy.host is set, GITHUB_TOKEN and GH_TOKEN MUST
be excluded from the agent environment. These tokens SHALL be held
exclusively by the external DIFC proxy.
This section is normative.
When apiProxy.maxEffectiveTokens is configured, the API proxy MUST enforce
a cumulative effective-token budget across all LLM API requests in a single
run. The budget limits total weighted token consumption, not raw token
counts.
Each upstream response's usage object is decomposed into four categories,
each with a fixed weight:
| Category | Weight | Usage field |
|---|---|---|
| Input | 1.0 | input_tokens / prompt_tokens |
| Cache read | 0.1 | cache_read_input_tokens / prompt_tokens_details.cached_tokens |
| Output | 4.0 | output_tokens / completion_tokens |
| Reasoning | 4.0 | reasoning_tokens / completion_tokens_details.reasoning_tokens |
The base weighted tokens for a single response are:
base = (1.0 × input) + (0.1 × cache_read) + (4.0 × output) + (4.0 × reasoning)
When apiProxy.modelMultipliers is configured, each model name MAY have
an associated positive multiplier. The effective tokens for a response are:
effective_tokens = model_multiplier × base_weighted_tokens
If no multiplier is configured for a given model, the multiplier defaults
to 1.
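A worked example of the two formulas above, as a minimal sketch. It assumes the usage object has already been decomposed into the four categories; the function name and the sample counts are hypothetical.

```python
# Fixed category weights from the table above.
WEIGHTS = {"input": 1.0, "cache_read": 0.1, "output": 4.0, "reasoning": 4.0}


def effective_tokens(counts: dict, model_multiplier: float = 1.0) -> float:
    """Compute model_multiplier x base_weighted_tokens for one response."""
    base = sum(WEIGHTS[category] * counts.get(category, 0) for category in WEIGHTS)
    return model_multiplier * base


counts = {"input": 1000, "cache_read": 500, "output": 200, "reasoning": 100}
print(effective_tokens(counts))        # no multiplier configured
print(effective_tokens(counts, 2.0))   # hypothetical multiplier of 2
```

For these counts the base is 1000 + 50 + 800 + 400 = 2250 weighted tokens, so a multiplier of 2 yields 4500 effective tokens.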
The API proxy MUST enforce the budget as follows:
1. Accumulation: After each successful upstream response, the proxy extracts the `usage` object, computes effective tokens, and adds them to a running total for the session.

2. Pre-request check: Before forwarding each subsequent request to the upstream provider, the proxy checks whether the cumulative total has reached or exceeded `maxEffectiveTokens`.

3. Rejection: When the budget is reached or exceeded, the proxy MUST reject the request with:
   - HTTP status: `429 Too Many Requests`
   - Content-Type: `application/json`
   - Response body: `{ "error": { "type": "effective_tokens_limit_exceeded", "message": "Maximum effective tokens exceeded (1234.56 / 1000).", "total_effective_tokens": 1234.56, "max_effective_tokens": 1000 } }`

4. WebSocket rejection: For WebSocket upgrade requests, the proxy MUST reject with `HTTP/1.1 429 Too Many Requests` and include the same JSON error body before destroying the socket.

5. Finality: Once the budget is reached or exceeded, all subsequent requests in the same run MUST be rejected. The budget is not recoverable.
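The accumulation, pre-request check, and rejection rules can be sketched as a small tracker. The class and method names are hypothetical, and the message formatting only approximates the example body; HTTP handling and WebSocket teardown are out of scope.

```python
class EffectiveTokenBudget:
    """Illustrative tracker for a cumulative effective-token budget."""

    def __init__(self, max_effective_tokens: float) -> None:
        self.max = max_effective_tokens
        self.total = 0.0

    def record(self, effective_tokens: float) -> None:
        """Accumulate after a successful upstream response."""
        self.total += effective_tokens

    def check(self):
        """Pre-request check: None means forward, otherwise (status, body)."""
        if self.total < self.max:
            return None
        body = {
            "error": {
                "type": "effective_tokens_limit_exceeded",
                "message": f"Maximum effective tokens exceeded ({self.total} / {self.max}).",
                "total_effective_tokens": self.total,
                "max_effective_tokens": self.max,
            }
        }
        return 429, body


budget = EffectiveTokenBudget(1000)
budget.record(600)
print(budget.check())      # still under budget
budget.record(650)
print(budget.check()[0])   # budget reached: rejection status
```

Because the running total never decreases, every check after the first rejection also rejects, which gives the finality property. A request that starts under budget may still push the total past the limit, which is why the example body reports a total above the maximum.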
The proxy MUST track when cumulative effective tokens cross the following
percentage thresholds of maxEffectiveTokens:
| Threshold |
|---|
| 50% |
| 75% |
| 90% |
| 95% |
Each threshold MUST be recorded at most once per run.
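Recording each threshold at most once can be sketched with a set of thresholds already seen. The function name is hypothetical, and treating "cross" as "reach or exceed" is an assumption.

```python
THRESHOLDS = (50, 75, 90, 95)


def newly_crossed(total: float, maximum: float, already: set[int]) -> list[int]:
    """Return thresholds crossed for the first time and remember them."""
    percent = 100.0 * total / maximum
    fresh = [t for t in THRESHOLDS if percent >= t and t not in already]
    already.update(fresh)
    return fresh


seen: set[int] = set()
print(newly_crossed(400, 1000, seen))   # below every threshold
print(newly_crossed(800, 1000, seen))   # first time past 50% and 75%
print(newly_crossed(800, 1000, seen))   # nothing new on a repeat
```

A single large response can cross several thresholds at once, so the function returns a list, not a single value.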
When the API proxy /reflect endpoint is queried, the response MUST
include the current effective-token state:
{
"effective_tokens": {
"enabled": true,
"max_effective_tokens": 1000,
"total_effective_tokens": 456.78,
"remaining_effective_tokens": 543.22,
"percent_used": 45.68,
"thresholds_crossed": []
}
}

When `maxEffectiveTokens` is not configured, the `enabled` field MUST be
`false` and numeric fields MUST be `0` or `null`.
- RFC 2119: Key words for use in RFCs to Indicate Requirement Levels
- `docs/awf-config.schema.json`: Machine-readable JSON Schema for configuration documents (normative)
AWF emits structured JSONL artifact files at runtime. Each record type has
a corresponding JSON Schema in the schemas/ directory:
| Schema | JSONL file | Description |
|---|---|---|
| `schemas/audit.schema.json` | `audit.jsonl` | L7 HTTP/HTTPS traffic decisions (allowed/denied) from the Squid proxy |
| `schemas/token-usage.schema.json` | `token-usage.jsonl` | Per-API-call token usage records from the api-proxy sidecar |
Schema files do not carry an independent version. The repository release tag serves as the version:
- The `$id` field in each schema resolves to a stable release download URL.
- Each JSONL record includes a `_schema` wire-format field encoding the record type and AWF version (e.g., `"_schema": "audit/v0.26.0"`).
- Consumers SHOULD use a prefix match (`_schema.startsWith("audit/")`) rather than an exact match to handle future versions gracefully.
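The prefix-match recommendation might look like this in a consumer. The function name is hypothetical, and the non-audit `_schema` value in the example is made up for illustration.

```python
import json


def is_audit_record(line: str) -> bool:
    """Return True when a JSONL line carries an audit record of any version."""
    record = json.loads(line)
    # Prefix match tolerates future AWF versions after the slash.
    return str(record.get("_schema", "")).startswith("audit/")


print(is_audit_record('{"_schema": "audit/v0.26.0"}'))
print(is_audit_record('{"_schema": "example-other/v0.26.0"}'))
```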
Versioned (release assets):
https://github.com/github/gh-aw-firewall/releases/download/<tag>/awf-config.schema.json
https://github.com/github/gh-aw-firewall/releases/download/<tag>/audit.schema.json
https://github.com/github/gh-aw-firewall/releases/download/<tag>/token-usage.schema.json
Latest (main branch):
https://raw.githubusercontent.com/github/gh-aw-firewall/main/docs/awf-config.schema.json
https://raw.githubusercontent.com/github/gh-aw-firewall/main/schemas/audit.schema.json
https://raw.githubusercontent.com/github/gh-aw-firewall/main/schemas/token-usage.schema.json
- docs/environment.md — Usage guide for environment variables
- docs/authentication-architecture.md — Credential isolation architecture and diagrams
- docs/api-proxy-sidecar.md — API proxy sidecar configuration including OIDC authentication for Azure OpenAI
- schemas/README.md — JSONL schema directory with validation examples and versioning policy