
Sanitize Before Sharing

"I want to share my knowledge graph. How do I make sure no sensitive data goes with it?"

This guide walks through scanning a knowledge graph for sensitive data and cleaning it before you share it with teammates or publish it. Prerequisites: an active knowledge graph (/kmgraph:status) and KMGraph v0.0.6 or later.

Run the scan

The /kmgraph:check-sensitive command scans all knowledge graph files for known sensitive patterns.

/kmgraph:check-sensitive

The command checks for:

| Category | Examples detected |
| --- | --- |
| API keys and tokens | API_KEY=, Bearer <token>, sk_live_…, AWS AKIA… keys |
| Passwords and secrets | password=, passwd=, DB_PASSWORD=, private key blocks |
| Personal paths | /Users/<name>/, /home/<name>/, C:\Users\ |
| Internal hostnames | *.internal, staging.*, dev.* domain patterns |
| Email addresses | Any user@domain.com pattern |
| Internal IPs | RFC 1918 ranges: 10.x, 172.16-31.x, 192.168.x |

The output lists each match by file, line number, and pattern category.
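To make the report format concrete, here is a minimal Python sketch of a line-by-line scan that yields one finding per file, line number, and category. It is an illustration only: the regular expressions, the category names, and the `scan_text` helper are assumptions written for this example, not KMGraph's actual rules or API, and they cover only some of the categories in the table.

```python
import re
from typing import Iterator, NamedTuple

# Illustrative patterns only; KMGraph's real rule set is not shown in this guide.
PATTERNS = {
    "api-key": re.compile(r"API_KEY=\S+|Bearer\s+\S+|sk_live_\w+|AKIA[0-9A-Z]{16}"),
    "password": re.compile(r"(?:password|passwd|DB_PASSWORD)=\S+"),
    "personal-path": re.compile(r"/Users/[^/\s]+/|/home/[^/\s]+/|C:\\Users\\"),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    # RFC 1918 ranges: 10.x.x.x, 172.16-31.x.x, 192.168.x.x
    "internal-ip": re.compile(
        r"\b(?:10\.\d{1,3}|172\.(?:1[6-9]|2\d|3[01])|192\.168)\.\d{1,3}\.\d{1,3}\b"
    ),
}


class Finding(NamedTuple):
    path: str
    line: int
    category: str


def scan_text(path: str, text: str) -> Iterator[Finding]:
    """Yield one finding per (line, category) pair that matches."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, pattern in PATTERNS.items():
            if pattern.search(line):
                yield Finding(path, lineno, category)


if __name__ == "__main__":
    sample = "token: sk_live_abc123\nnothing here\nserver at 10.1.2.3\n"
    for finding in scan_text("docs/a.md", sample):
        print(f"{finding.path}:{finding.line}: {finding.category}")
```

Simple regexes like these produce false positives (for example, a documentation email address), which is one reason the later manual spot-check still matters.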

Fix findings

Option A: Automatic fix (recommended for bulk replacements)

/kmgraph:check-sensitive --fix

The --fix flag replaces detected values with safe placeholders in place:

  • API keys → <your-api-key>
  • Passwords → your_secure_password
  • Absolute paths → relative equivalents or ~/<path>
  • Internal IPs → RFC 5737 documentation IPs (192.0.2.x)
  • Internal hostnames → <internal-host>

After running --fix, review the diff before committing. Auto-fix handles common patterns; manual review catches edge cases.
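As an illustration of the internal-IP replacement, the Python sketch below swaps RFC 1918 addresses for RFC 5737 documentation addresses in 192.0.2.0/24. How --fix actually picks replacement addresses is not described in this guide; the stable per-address mapping and the `replace_private_ips` helper are assumptions made for the example.

```python
import ipaddress
import re

# The three RFC 1918 private ranges.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def replace_private_ips(text: str) -> str:
    """Swap RFC 1918 addresses for RFC 5737 documentation addresses."""
    mapping: dict[str, str] = {}

    def substitute(match: re.Match) -> str:
        raw = match.group(0)
        try:
            addr = ipaddress.ip_address(raw)
        except ValueError:
            return raw  # looks like an IP but is not valid, e.g. 999.1.1.1
        if not any(addr in net for net in RFC1918):
            return raw  # public or otherwise non-RFC 1918: leave untouched
        if raw not in mapping:
            # Assumption: the same input address always maps to the same
            # placeholder. Hosts wrap around after 254 distinct addresses.
            mapping[raw] = f"192.0.2.{len(mapping) % 254 + 1}"
        return mapping[raw]

    return IPV4.sub(substitute, text)


if __name__ == "__main__":
    print(replace_private_ips("db at 10.0.0.5 and 192.168.1.20, public 8.8.8.8"))
```

Mapping each distinct address to its own placeholder keeps sanitized notes readable when they refer to several hosts, at the cost of collisions once more than 254 addresses appear.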

Option B: Manual edit

Open each flagged file and apply replacements by hand. Use standard placeholders for consistency:

| Sensitive value type | Placeholder to use |
| --- | --- |
| Domains | example.com |
| IPs | 192.0.2.1 (RFC 5737) |
| Emails | user@example.com |
| Credentials | <your-value-here> |
| Internal hostnames | <internal-host> |
| Absolute paths | ~/<relative-path> or ./path |

Configure custom rules

The kg-config.json file in the knowledge graph root controls which patterns are checked and what --fix substitutes. Add or extend the sanitization block:

{
  "sanitization": {
    "patterns": [
      {
        "name": "company-name",
        "regex": "Acme Corp|MegaCorp",
        "replacement": "Example Corp",
        "severity": "warn"
      },
      {
        "name": "project-codename",
        "regex": "Project Falcon|FLCN-[0-9]+",
        "replacement": "Project X",
        "severity": "block"
      }
    ],
    "excludePaths": [
      "docs/plans/",
      "docs/examples/"
    ]
  }
}

Severity levels:

  • block: /kmgraph:check-sensitive exits non-zero; pre-commit hook blocks the commit
  • warn: reported but does not block
  • info: logged only; no user-visible alert

Run /kmgraph:check-sensitive again after editing kg-config.json to confirm custom patterns are picked up.
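The Python sketch below shows one way a tool could read a sanitization block like the one above and apply each pattern's replacement. It is not KMGraph's implementation: `load_rules` and `apply_rules` are hypothetical helpers, and treating excludePaths entries as path prefixes is an assumption.

```python
import json
import re

# A trimmed copy of the example configuration from this guide.
CONFIG_TEXT = """
{
  "sanitization": {
    "patterns": [
      {"name": "company-name", "regex": "Acme Corp|MegaCorp",
       "replacement": "Example Corp", "severity": "warn"},
      {"name": "project-codename", "regex": "Project Falcon|FLCN-[0-9]+",
       "replacement": "Project X", "severity": "block"}
    ],
    "excludePaths": ["docs/plans/", "docs/examples/"]
  }
}
"""


def load_rules(config_text: str):
    """Return ([(name, compiled_regex, replacement)], exclude_paths)."""
    block = json.loads(config_text).get("sanitization", {})
    rules = [
        (p["name"], re.compile(p["regex"]), p["replacement"])
        for p in block.get("patterns", [])
    ]
    return rules, block.get("excludePaths", [])


def apply_rules(path: str, text: str, rules, exclude_paths):
    """Return (sanitized_text, names of the rules that matched)."""
    # Assumption: an excludePaths entry is a path prefix.
    if any(path.startswith(prefix) for prefix in exclude_paths):
        return text, []
    matched = []
    for name, regex, replacement in rules:
        # A callable keeps the replacement literal (no backslash escapes).
        text, count = regex.subn(lambda m, r=replacement: r, text)
        if count:
            matched.append(name)
    return text, matched


if __name__ == "__main__":
    rules, excluded = load_rules(CONFIG_TEXT)
    print(apply_rules("docs/lessons-learned/a.md",
                      "Ticket FLCN-42 at Acme Corp", rules, excluded))
```

Compiling each regex once at load time also surfaces a malformed pattern early, before any file is scanned.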

Install the pre-commit hook

The pre-commit hook auto-blocks commits that contain sensitive patterns, catching issues before they reach the remote.

cp core/examples-hooks/pre-commit-sanitization.sh .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit

The hook reads the same patterns from kg-config.json. Patterns with "severity": "block" will abort the commit; "severity": "warn" patterns print a warning but allow the commit through.
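The shipped hook is a shell script whose contents are not reproduced here. The Python sketch below only illustrates the block-versus-warn behavior described above; `staged_files` and `hook_status` are hypothetical names, and the `(path, pattern_name, severity)` shape of a finding is an assumption.

```python
import subprocess
import sys


def staged_files() -> list[str]:
    """List added, copied, or modified files staged for the next commit."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM", "-z"],
        check=True,
        capture_output=True,
    ).stdout
    # -z separates paths with NUL bytes, which is safe for unusual filenames.
    return [p.decode() for p in out.split(b"\0") if p]


def hook_status(findings) -> int:
    """Return the hook's exit code for (path, pattern_name, severity) findings."""
    blocked = False
    for path, name, severity in findings:
        if severity == "block":
            print(f"BLOCKED {path}: {name}", file=sys.stderr)
            blocked = True
        elif severity == "warn":
            print(f"warning {path}: {name}", file=sys.stderr)
        # "info" findings are not shown to the user.
    return 1 if blocked else 0


if __name__ == "__main__":
    # A real hook would scan each path from staged_files(); fixed findings
    # are used here to keep the sketch short.
    example = [("docs/sessions/s1.md", "project-codename", "block")]
    sys.exit(hook_status(example))
```

Git aborts the commit whenever the pre-commit hook exits with a non-zero status, so returning 1 only for block-level findings matches the behavior the guide describes.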

Confirm the scan passes

After applying fixes, re-run the scan to confirm a clean result:

/kmgraph:check-sensitive

Expected output when clean:

Scan complete. No sensitive patterns detected.

Then do a final manual spot-check on files that contain real project data:

  • Lesson files in docs/lessons-learned/
  • ADRs in docs/decisions/
  • Session summaries in docs/sessions/
  • Any example configs or code snippets

Add a brief privacy note to the repository README before publishing:

## Privacy note

This knowledge graph has been sanitized for sharing. Company names,
internal IPs, credentials, and personal paths have been replaced with
generic placeholders. Patterns and lessons remain intact.

  • Sanitization Checklist: exhaustive category-by-category reference with scan commands for each pattern type