AgileTrust Tokenization: Best Practices
  • Security
    • Key management
    • Network isolation
    • Access control
    • Logging & audit
  • Integration
    • Encoding selection
    • Tweak strategy
    • Error handling
    • Idempotency
  • Operations
    • Key rotation
    • High availability
    • Monitoring
  • Compliance
    • PCI DSS guidance
    • GDPR / privacy
  • Deployment checklist
Security

Key Management (Critical)

The service uses two independent secrets: the AES encryption key (TOKENIZATION_KEY) and the API authentication key (API_KEY). Both must be treated as equally sensitive.

✓ Do

  • Use AES-256 (64 hex chars) for TOKENIZATION_KEY.
  • Generate both keys with a CSPRNG: openssl rand -hex 32
  • Store both in a secrets manager (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault).
  • Rotate keys on a schedule and after any suspected exposure.
  • Keep separate keys per environment (dev / staging / prod).
  • Audit every key access event.

✗ Don't

  • Hardcode either key in source code or Dockerfiles.
  • Commit keys to version control (even encrypted repos).
  • Share keys between environments.
  • Use the same API key across multiple callers — assign one per service.
  • Log either key value anywhere, even in debug output.
  • Transmit keys over HTTP — both are injected server-side only.

Generating secure keys

bash — generate both keys
# AES-256 tokenization key (64 hex chars, 32 bytes)
export TOKENIZATION_KEY=$(python3 -c \
  "import secrets; print(secrets.token_hex(32))")

# API authentication key (hex, 32 bytes)
export API_KEY=$(openssl rand -hex 32)

# Store them in AWS Secrets Manager
aws secretsmanager create-secret \
  --name "tokenization/prod/tokenization-key" \
  --secret-string "$TOKENIZATION_KEY"

aws secretsmanager create-secret \
  --name "tokenization/prod/api-key" \
  --secret-string "$API_KEY"

Injecting keys at runtime

bash — Docker with secrets manager
TOK_KEY=$(aws secretsmanager get-secret-value \
  --secret-id tokenization/prod/tokenization-key \
  --query SecretString --output text)

API_KEY=$(aws secretsmanager get-secret-value \
  --secret-id tokenization/prod/api-key \
  --query SecretString --output text)

docker run -d \
  --name tokenizer \
  -p 8000:8000 \
  -e TOKENIZATION_KEY="$TOK_KEY" \
  -e API_KEY="$API_KEY" \
  --read-only \
  --security-opt no-new-privileges \
  agiletrust/tokenization:0.2

Network Isolation (Critical)

The tokenization service should never be reachable from the public internet. It is an internal cryptographic primitive — treat it like a database.

✓ Do

  • Deploy inside a private VPC subnet with no public IP.
  • Use security groups / NSGs to allow only your application servers to reach port 8000.
  • Put an internal load balancer (not internet-facing) in front of the service.
  • Enforce mTLS between callers and the tokenization service in zero-trust environments.
  • Use a service mesh (Istio, Linkerd) for automatic mTLS and policy enforcement.

✗ Don't

  • Bind the container to 0.0.0.0 on a host with a public IP.
  • Expose the health endpoint (/health) to external networks.
  • Route tokenization traffic through the public internet, even with TLS.

Kubernetes NetworkPolicy example

yaml — NetworkPolicy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tokenizer-ingress
  namespace: tokenization
spec:
  podSelector:
    matchLabels:
      app: tokenizer
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: tokenization-client
      ports:
        - protocol: TCP
          port: 8000

Access Control (High)

  • Principle of least privilege: only services that legitimately need to tokenize or detokenize data should have network access to the service.
  • Separate tokenize/detokenize: if some services only need to tokenize (write path) but should never detokenize (read path), implement a network-level or application-level split.
  • Audit callers: log the source IP and/or service identity for every tokenize and detokenize call. This is your primary data access audit trail.
  • Rate limiting: apply rate limits per client IP or service account to prevent brute-force enumeration attacks.
ℹ Because tokenization is deterministic, an attacker who can submit arbitrary plaintexts to /tokenize could build a rainbow table. Restrict /tokenize access to authorized writer services only.
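One lightweight way to enforce the write/read split at the application layer is an allow-list keyed by caller identity. A minimal sketch; the service names and the PERMISSIONS map are illustrative, not part of the product:

```python
# Hypothetical per-caller permission map: writer services may only
# tokenize, while a small set of readers may also detokenize.
PERMISSIONS = {
    "billing-service": {"/tokenize"},
    "support-portal": {"/tokenize", "/detokenize"},
}

def is_allowed(caller: str, endpoint: str) -> bool:
    """Return True only if the caller is known and granted this endpoint."""
    return endpoint in PERMISSIONS.get(caller, set())
```

Unknown callers fall through to an empty set and are denied by default, which matches the least-privilege principle above.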

Logging & Audit (High)

✓ Do log

  • Timestamp, endpoint (/tokenize vs /detokenize), encoding, caller identity.
  • HTTP status codes and response times.
  • Validation errors (which field, which rule).
  • Key rotation events and service restarts.

✗ Never log

  • The plaintext value — it's sensitive by definition.
  • The token value — it's a pseudonym for the plaintext.
  • The encryption key or any derivative of it.
  • The tweak value (it reveals field-level context).

Structured log example

json — recommended log record
{
  "timestamp": "2026-03-30T14:23:01.842Z",
  "service": "tokenization",
  "endpoint": "/tokenize",
  "encoding": "numeric",
  "input_length": 10,
  "status": 200,
  "duration_ms": 1.4,
  "caller_ip": "10.0.1.45",
  "request_id": "req_7f3a9b2c"
}
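A record in this shape can be emitted with the standard library alone. A sketch for Python callers; the field names mirror the example above, and note the deliberate absence of plaintext, token, and tweak fields:

```python
import json
import logging
import time
import uuid

logger = logging.getLogger("tokenization")
logging.basicConfig(level=logging.INFO, format="%(message)s")

def log_request(endpoint: str, encoding: str, input_length: int,
                status: int, duration_ms: float, caller_ip: str) -> str:
    """Emit one structured log record and return the serialized line."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "service": "tokenization",
        "endpoint": endpoint,
        "encoding": encoding,
        "input_length": input_length,   # length only, never the value
        "status": status,
        "duration_ms": duration_ms,
        "caller_ip": caller_ip,
        "request_id": f"req_{uuid.uuid4().hex[:8]}",
    }
    line = json.dumps(record)
    logger.info(line)
    return line
```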
Integration

Encoding Selection (Medium)

Choosing the correct encoding for each field is critical. A mismatch between tokenization and detokenization encoding silently returns wrong data.

Field type | Recommended encoding | Reason
RUT / Chilean tax ID | numeric | Digit-heavy, hyphen separator preserved
Credit card PAN | numeric | 16 digits, no separators
Phone number | numeric | Digits + separators preserved
Bank account number | numeric | Digit-only, format preserved
First / last name (any language) | utf8 | Handles all Unicode scripts
Email address | utf8 | @ and . preserved; Unicode local parts
Free text / address | utf8 | Mixed punctuation, Unicode safe
Western European names (legacy system) | latin1 | Token stays in Latin-1 range
⚠ Store the encoding used alongside the token in your database (or derive it from the field definition). You must know the original encoding to detokenize correctly.
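One way to keep the encoding (and tweak) with the token is a small record type. A sketch; the field-to-encoding map and the field names are assumptions for illustration, not a fixed schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TokenRecord:
    """Everything needed to detokenize later travels with the token."""
    token: str
    encoding: str   # "numeric", "utf8", or "latin1"
    tweak: str      # 14-hex-char tweak, "" if none

# Hypothetical field-level encoding policy, defined once per schema.
ENCODING_BY_FIELD = {"rut": "numeric", "email": "utf8", "name": "utf8"}

def record_for(field: str, token: str, tweak: str = "") -> TokenRecord:
    """Build a storable record using the field's declared encoding."""
    return TokenRecord(token=token, encoding=ENCODING_BY_FIELD[field], tweak=tweak)
```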

Tweak Strategy (Medium)

Tweaks add field-level context without requiring additional keys. A consistent tweak strategy prevents cross-field token correlation attacks.

When to use tweaks

  • Same value in multiple fields: a person's name appears in both first_name and emergency_contact_name. Use different tweaks so the tokens are distinct.
  • Multi-tenant systems: use a tenant ID-derived tweak so the same plaintext produces different tokens per tenant.
  • Temporal separation: use a date-derived tweak if tokens should be unlinkable across time periods.

Deriving consistent tweaks

python — field name → tweak
def field_tweak(field_name: str) -> str:
    """Convert a field name to a deterministic 14-hex-char tweak."""
    raw = field_name.encode("ascii")[:7]   # take first 7 bytes
    padded = raw.ljust(7, b"\x00")          # zero-pad to 7 bytes
    return padded.hex()                     # → 14 hex chars

print(field_tweak("name"))    # 6e616d65000000
print(field_tweak("rut"))     # 72757400000000
print(field_tweak("email"))   # 656d61696c0000
ℹ Store the tweak alongside the token and encoding. All three are required for correct detokenization.
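For the multi-tenant case above, a tweak can likewise be derived from the tenant ID. A sketch; the SHA-256-truncation scheme is an assumption (any stable derivation to 7 bytes works):

```python
import hashlib

def tenant_tweak(tenant_id: str) -> str:
    """Derive a stable 14-hex-char tweak from a tenant ID by truncating
    SHA-256 to 7 bytes (illustrative scheme)."""
    return hashlib.sha256(tenant_id.encode()).digest()[:7].hex()
```

The same plaintext tokenized under two tenants then yields unrelated tokens, while each tenant's tweak stays reproducible from its ID.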

Error Handling (Medium)

Handling 422 validation errors

Validation errors always include a descriptive message. Parse the error field to surface actionable feedback:

python
import requests

def safe_tokenize(plaintext, encoding="utf8"):
    try:
        r = requests.post("http://localhost:8000/tokenize",
                          json={"plaintext": plaintext, "encoding": encoding},
                          timeout=5)
        if r.status_code == 422:
            raise ValueError(f"Validation error: {r.json()['error']}")
        r.raise_for_status()
        return r.json()["token"]
    except requests.Timeout:
        raise RuntimeError("Tokenization service timeout — check health endpoint")
    except requests.ConnectionError:
        raise RuntimeError("Cannot reach tokenization service")

The silent detokenization failure

If you detokenize with the wrong encoding or key, you receive HTTP 200 with garbled plaintext. There is no exception to catch. If data integrity is critical, store a keyed HMAC of the original plaintext alongside the token and verify after detokenization.

python — integrity verification pattern
import hmac, hashlib

HMAC_KEY = b"separate-integrity-key"   # not the tokenization key

def store_with_integrity(plaintext, token, encoding):
    mac = hmac.new(HMAC_KEY, plaintext.encode(), hashlib.sha256).hexdigest()
    return {"token": token, "encoding": encoding, "mac": mac}

def verified_detokenize(record):
    # detokenize() is your client wrapper around POST /detokenize
    recovered = detokenize(record["token"], record["encoding"])
    expected_mac = hmac.new(HMAC_KEY, recovered.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected_mac, record["mac"]):
        raise ValueError("Detokenization integrity check failed — wrong key or encoding?")
    return recovered

Idempotency & Caching (Low)

Because tokenization is deterministic, the same plaintext + encoding + tweak always produces the same token. You can safely cache tokens or call /tokenize multiple times without side effects.

✓ Caching patterns

  • Cache tokens at the application level (in-process LRU or Redis) to reduce round-trips for frequently tokenized values.
  • Use (plaintext, encoding, tweak) as the cache key.
  • Invalidate cache on key rotation.

✗ Avoid

  • Caching tokens in the same data store as the plaintext (defeats the purpose).
  • Caching in shared caches without access controls.
  • Persisting the cache across key rotation events.
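The caching pattern above, including invalidation on rotation, can be sketched with functools.lru_cache; tokenize_remote is a hypothetical stand-in for your HTTP client call:

```python
from functools import lru_cache

CALLS = {"n": 0}  # call counter, only to demonstrate cache hits

def tokenize_remote(plaintext: str, encoding: str, tweak: str) -> str:
    """Stand-in for the POST /tokenize round-trip (illustrative only)."""
    CALLS["n"] += 1
    return f"tok-{plaintext}-{encoding}-{tweak}"

@lru_cache(maxsize=4096)
def cached_tokenize(plaintext: str, encoding: str = "utf8", tweak: str = "") -> str:
    # Safe to cache: deterministic for a fixed key, keyed on all three inputs.
    return tokenize_remote(plaintext, encoding, tweak)

def on_key_rotation() -> None:
    """Drop every cached token when the tokenization key changes."""
    cached_tokenize.cache_clear()
```

An out-of-process cache (e.g. Redis) follows the same shape; the essential parts are the three-part cache key and the explicit clear on rotation.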
Operations

Key Rotation (High)

Key rotation requires re-tokenizing all stored tokens under the new key. This is an operational decision — plan for it before going to production.

  1. Generate a new key and store it in your secrets manager alongside the old key. Keep the old key — you need it to detokenize existing data.
  2. Deploy a second container instance (or use a separate endpoint) with the new key. Do not yet point production traffic at it.
  3. Re-tokenize in batches: for each stored token, detokenize with the old container, then tokenize with the new container. Write the new token back atomically.
  4. Verify a sample of re-tokenized records by detokenizing with the new key and comparing to the expected plaintext.
  5. Cut over production traffic to the new container. Decommission the old key after a retention window.
⚡ Never delete the old key until all tokens have been re-tokenized and verified. A lost key means permanently inaccessible plaintext.
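Step 3 of the rotation procedure can be sketched as a batch loop; detok_old, tok_new, and write_back are hypothetical hooks for the old-key endpoint, the new-key endpoint, and your database's atomic per-record update:

```python
def retokenize_batch(records, detok_old, tok_new, write_back):
    """Re-tokenize one batch: recover each plaintext with the old key,
    tokenize it under the new key, and write the new token back."""
    migrated = 0
    for rec in records:
        plaintext = detok_old(rec["token"], rec["encoding"])
        rec["token"] = tok_new(plaintext, rec["encoding"])
        write_back(rec)   # must be atomic per record
        migrated += 1
    return migrated
```

Running in bounded batches keeps the migration resumable: a crash loses at most one batch of progress, never a plaintext.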

High Availability (Medium)

The tokenization service is stateless — scale horizontally as needed.

  • Run at least 2 replicas in production, spread across availability zones.
  • Use a health check on GET /health with a 2-second timeout and 3-failure threshold.
  • Set resource limits: the service is CPU-light but allocate at least 128 MB RAM per replica.
  • Set readinessProbe to /health so traffic is not routed until the container is ready.
yaml — Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tokenizer
  namespace: tokenization
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tokenizer
  template:
    metadata:
      labels:
        app: tokenizer
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
      containers:
        - name: tokenizer
          image: agiletrust/tokenization:0.2
          ports:
            - containerPort: 8000
          env:
            - name: TOKENIZATION_KEY
              valueFrom:
                secretKeyRef:
                  name: tokenization-key
                  key: key
          resources:
            requests: { cpu: "100m", memory: "128Mi" }
            limits:   { cpu: "500m", memory: "256Mi" }
          livenessProbe:
            httpGet: { path: /health, port: 8000 }
            periodSeconds: 10
          readinessProbe:
            httpGet: { path: /health, port: 8000 }
            initialDelaySeconds: 5
            periodSeconds: 5
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true

Monitoring (Medium)

Key metrics to track

Metric | Alert threshold | What it indicates
/health response time | > 500 ms | Service degradation
HTTP 422 rate on /tokenize | > 1% of requests | Bad input from callers — possible integration bug
HTTP 500 rate | Any non-zero | Internal error — check logs immediately
Request rate | Baseline ± 3σ | Anomalous usage pattern
Container restarts | > 0 | Crash-looping — check for missing key env var
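The baseline ± 3σ request-rate check can be sketched with the standard library; history is a list of recent per-interval request counts:

```python
import statistics

def is_anomalous(history, current, sigmas=3.0):
    """Flag a request rate outside baseline ± sigmas standard deviations.
    With a constant history (stdev 0), any deviation is flagged."""
    mean = statistics.fmean(history)
    sd = statistics.pstdev(history)
    return abs(current - mean) > sigmas * sd
```

In practice the same rule is usually expressed in your monitoring system's query language; the Python form just makes the threshold in the table concrete.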

Health check script

bash
#!/bin/bash
STATUS=$(curl -sf http://localhost:8000/health | jq -r '.status' 2>/dev/null)
if [ "$STATUS" = "ok" ]; then
  echo "Tokenization service: healthy"
  exit 0
else
  echo "Tokenization service: UNHEALTHY (got: $STATUS)"
  exit 1
fi
Compliance

PCI DSS Guidance

AgileTrust Tokenization implements Format-Preserving Encryption as defined by NIST SP 800-38G Rev 1. FPE is recognized by PCI SSC as a tokenization technology when deployed correctly.

⚠ This guidance is informational. Engage a Qualified Security Assessor (QSA) to confirm that your specific deployment satisfies your PCI DSS scope reduction objectives.

Relevant PCI DSS controls

  • Requirement 3.5: Primary Account Numbers (PANs) must be rendered unreadable wherever stored. Using numeric encoding tokenization satisfies this when the tokenization system is properly isolated.
  • Requirement 3.6 / 3.7: Key management procedures must cover generation, distribution, storage, access, retirement, and destruction. Map your secrets manager workflows to these requirements.
  • Requirement 7: Restrict access to tokenization services to only those with a business need.
  • Requirement 10: Log all access to tokenized data (detokenize calls) with timestamps and caller identity.

Scope reduction

When PANs are tokenized before leaving the point of interaction and the tokenization service itself is isolated, systems that only store or process tokens (not PANs) may be eligible for a reduced PCI DSS scope. The tokenization service itself remains in scope.

GDPR & Privacy

Under GDPR, tokenized data is still personal data if the original can be recovered — which is the case here. However, pseudonymization (Art. 4(5) GDPR) is explicitly recognized as a risk-reduction measure that can reduce obligations for the pseudonymized data.

Key considerations

  • Pseudonymization, not anonymization: tokens are reversible. Your DPA must document this distinction.
  • Right to erasure: destroying the encryption key effectively anonymizes all tokens derived from it — a practical path to "erasure by key deletion."
  • Data minimization: only tokenize fields that require reversibility. For fields that never need to be recovered, use one-way hashing instead.
  • Data processing agreements: if the tokenization service runs in a third-party cloud, ensure appropriate DPAs are in place.
  • Transfer restrictions: tokens reduce risk in cross-border transfers but do not eliminate the need for adequate transfer mechanisms (SCCs, adequacy decisions).

Production Deployment Checklist

Review each item before going live.

Key management

  • AES-256 TOKENIZATION_KEY generated with CSPRNG
  • API_KEY generated with CSPRNG (openssl rand -hex 32)
  • Both keys stored in secrets manager (not in code or config files)
  • Separate keys for dev / staging / production
  • Key rotation procedure documented and tested
  • Key access audit logging enabled in secrets manager

Network security

  • Service deployed in private subnet with no public IP
  • Security groups restrict access to authorized callers only
  • Health endpoint not reachable from external networks
  • TLS / mTLS enforced between callers and service

Container hardening

  • Container runs as non-root user (UID 1000)
  • readOnlyRootFilesystem: true set
  • allowPrivilegeEscalation: false set
  • Resource limits (CPU + memory) configured
  • Image pulled from trusted registry, digest pinned

Operations

  • Liveness and readiness probes configured on /health
  • At least 2 replicas across availability zones
  • Structured logging enabled (no PII in logs)
  • Alerts configured for HTTP 500 rate and service unavailability
  • Re-tokenization runbook documented for key rotation

Integration

  • Encoding, tweak, and algorithm stored alongside each token
  • Encoding selection reviewed for each field type
  • Error handling implemented for 422 and 500 responses
  • Detokenization integrity check in place for critical fields
  • Token cache invalidation on key rotation

AgileTrust Tokenization v0.2 — FF3-1/AES — NIST SP 800-38G Rev 1
