Sanitize Before Sharing
"I want to share my knowledge graph. How do I make sure no sensitive data goes with it?"
This guide walks through scanning a knowledge graph for sensitive data and cleaning it before sharing it with teammates or publishing it publicly. You need an active knowledge graph (check with /kmgraph:status) and KMGraph v0.0.6 or later.
Run the scan
The /kmgraph:check-sensitive command scans all knowledge graph files for known sensitive patterns.
/kmgraph:check-sensitive
The command checks for:
| Category | Examples detected |
|---|---|
| API keys and tokens | API_KEY=, Bearer <token>, sk_live_…, AWS AKIA… keys |
| Passwords and secrets | password=, passwd=, DB_PASSWORD=, private key blocks |
| Personal paths | /Users/<name>/, /home/<name>/, C:\Users\ |
| Internal hostnames | *.internal, staging.*, dev.* domain patterns |
| Email addresses | Any user@domain.com pattern |
| Internal IPs | RFC 1918 ranges: 10.x, 172.16-31.x, 192.168.x |
The output lists each match by file, line number, and pattern category.
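To make the scan concrete, here is a minimal Python sketch of a line-by-line pattern scanner that reports file, line number, and category. It is not KMGraph's implementation: the regexes, the category names, and the `scan_text` and `is_internal_ip` helpers are assumptions chosen to mirror a few rows of the table above.

```python
import ipaddress
import re

# Illustrative patterns only (assumed); the real KMGraph pattern set is not shown in this guide.
PATTERNS = {
    "api-key": re.compile(r"\b(?:API_KEY=\S+|sk_live_\w+|AKIA[0-9A-Z]{16})"),
    "password": re.compile(r"\b(?:password|passwd|DB_PASSWORD)=\S+"),
    "personal-path": re.compile(r"(?:/Users/|/home/)[^/\s]+/|C:\\Users\\"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# RFC 1918 private ranges: 10/8, 172.16/12 (172.16-31.x), 192.168/16.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_internal_ip(text):
    """Return True if text parses as an IPv4 address inside an RFC 1918 range."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return False
    return any(addr in net for net in RFC1918)


def scan_text(filename, text):
    """Return (filename, line number, category) for each flagged line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((filename, lineno, category))
        if any(is_internal_ip(candidate) for candidate in IPV4.findall(line)):
            findings.append((filename, lineno, "internal-ip"))
    return findings
```

Checking IP candidates with the `ipaddress` module rather than a regex alone keeps public addresses such as 8.8.8.8 out of the findings.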
Fix findings
Option A: Automatic fix (recommended for bulk replacements)
/kmgraph:check-sensitive --fix
The --fix flag replaces detected values in place with safe placeholders:
- API keys → <your-api-key>
- Passwords → your_secure_password
- Absolute paths → relative equivalents or ~/<path>
- Internal IPs → RFC 5737 documentation IPs (192.0.2.x)
- Internal hostnames → <internal-host>
Review the diff after running --fix before committing. Auto-fix handles common patterns; manual review catches edge cases.
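The substitutions listed above can be sketched as ordered regex replacements. This is an illustration under stated assumptions, not the actual --fix logic: the `REPLACEMENTS` table and `apply_fixes` helper are hypothetical, the sketch keeps the key name and replaces only the value, and it handles only the 10.x private range.

```python
import re

# (pattern, replacement) pairs; assumed for illustration, not KMGraph's real rule set.
REPLACEMENTS = [
    # Keep the key name, replace the value.
    (re.compile(r"\b(API_KEY=)\S+"), r"\1<your-api-key>"),
    (re.compile(r"\b((?:password|passwd|DB_PASSWORD)=)\S+"), r"\1your_secure_password"),
    # Collapse a personal home directory prefix to ~/.
    (re.compile(r"(?:/Users|/home)/[^/\s]+/"), "~/"),
    # Map 10.x.y.z onto the RFC 5737 documentation range, keeping the last octet.
    (re.compile(r"\b10\.\d{1,3}\.\d{1,3}\.(\d{1,3})\b"), r"192.0.2.\1"),
]


def apply_fixes(text):
    """Return (fixed text, number of substitutions made)."""
    total = 0
    for pattern, replacement in REPLACEMENTS:
        text, count = pattern.subn(replacement, text)
        total += count
    return text, total


if __name__ == "__main__":
    fixed, count = apply_fixes("API_KEY=abc123 at /Users/alice/kg on 10.0.0.7")
    print(count, fixed)
```

Returning the substitution count alongside the text makes it easy to see which files changed, which is the same reason the diff review matters.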
Option B: Manual edit
Open each flagged file and apply replacements by hand. Use standard placeholders for consistency:
| Sensitive value type | Placeholder to use |
|---|---|
| Domains | example.com |
| IPs | 192.0.2.1 (RFC 5737) |
| Emails | user@example.com |
| Credentials | <your-value-here> |
| Internal hostnames | <internal-host> |
| Absolute paths | ~/<relative-path> or ./path |
Configure custom rules
The kg-config.json file in the knowledge graph root controls which patterns are checked and what --fix substitutes. Add or extend the sanitization block:
{
  "sanitization": {
    "patterns": [
      {
        "name": "company-name",
        "regex": "Acme Corp|MegaCorp",
        "replacement": "Example Corp",
        "severity": "warn"
      },
      {
        "name": "project-codename",
        "regex": "Project Falcon|FLCN-[0-9]+",
        "replacement": "Project X",
        "severity": "block"
      }
    ],
    "excludePaths": [
      "docs/plans/",
      "docs/examples/"
    ]
  }
}
Severity levels:
- block: /kmgraph:check-sensitive exits non-zero; pre-commit hook blocks the commit
- warn: reported but does not block
- info: logged only; no user-visible alert
Run /kmgraph:check-sensitive again after editing kg-config.json to confirm custom patterns are picked up.
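To see how a sanitization block like this might be consumed, the sketch below parses the same JSON and checks file content against the custom patterns. The `load_rules` and `check_file` helpers are hypothetical. Treating `excludePaths` entries as path prefixes and defaulting a missing severity to "warn" are assumptions, not documented KMGraph behavior.

```python
import json
import re

# Same shape as the kg-config.json example in this section.
CONFIG_TEXT = """
{
  "sanitization": {
    "patterns": [
      {"name": "company-name", "regex": "Acme Corp|MegaCorp",
       "replacement": "Example Corp", "severity": "warn"},
      {"name": "project-codename", "regex": "Project Falcon|FLCN-[0-9]+",
       "replacement": "Project X", "severity": "block"}
    ],
    "excludePaths": ["docs/plans/", "docs/examples/"]
  }
}
"""


def load_rules(config_text):
    """Parse the sanitization block into (name, compiled regex, severity) rules."""
    block = json.loads(config_text).get("sanitization", {})
    rules = [
        # Assumption: a pattern without a severity is treated as "warn".
        (p["name"], re.compile(p["regex"]), p.get("severity", "warn"))
        for p in block.get("patterns", [])
    ]
    return rules, block.get("excludePaths", [])


def check_file(path, text, rules, exclude_paths):
    """Return (path, line number, rule name, severity) for each match."""
    # Assumption: excludePaths entries are matched as path prefixes.
    if any(path.startswith(prefix) for prefix in exclude_paths):
        return []
    return [
        (path, lineno, name, severity)
        for lineno, line in enumerate(text.splitlines(), start=1)
        for name, pattern, severity in rules
        if pattern.search(line)
    ]
```

A quick check such as `check_file("docs/plans/x.md", "Acme Corp", *load_rules(CONFIG_TEXT))` returns an empty list under these assumptions, because the path falls under an excluded prefix.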
Install the pre-commit hook
The pre-commit hook blocks commits that contain sensitive patterns, catching them before they reach the remote.
cp core/examples-hooks/pre-commit-sanitization.sh .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit
The hook reads the same patterns from kg-config.json. Patterns with "severity": "block" will abort the commit; "severity": "warn" patterns print a warning but allow the commit through.
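The block-versus-warn behavior comes down to the hook's exit status, since Git aborts a commit when its pre-commit hook exits non-zero. The sketch below shows that decision in Python. The `hook_exit_code` function and its message format are hypothetical and are not taken from pre-commit-sanitization.sh.

```python
def hook_exit_code(findings):
    """Map (rule name, severity) findings to an exit code and messages.

    Any "block" finding yields exit code 1, which makes Git abort the commit.
    "warn" findings are reported but leave the exit code at 0.
    "info" findings produce no user-visible message in this sketch.
    """
    blocked = False
    messages = []
    for name, severity in findings:
        if severity == "block":
            blocked = True
            messages.append(f"BLOCK: {name}")
        elif severity == "warn":
            messages.append(f"WARN: {name}")
    return (1 if blocked else 0), messages


if __name__ == "__main__":
    code, messages = hook_exit_code(
        [("company-name", "warn"), ("project-codename", "block")]
    )
    print("\n".join(messages))
    raise SystemExit(code)
```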
Confirm the scan passes
After applying fixes, re-run the scan to confirm a clean result:
/kmgraph:check-sensitive
Expected output when clean:
Scan complete. No sensitive patterns detected.
Then do a final manual spot-check on files that contain real project data:
- Lesson files in docs/lessons-learned/
- ADRs in docs/decisions/
- Session summaries in docs/sessions/
- Any example configs or code snippets
Add a brief privacy note to the repository README before publishing:
## Privacy note
This knowledge graph has been sanitized for sharing. Company names,
internal IPs, credentials, and personal paths have been replaced with
generic placeholders. Patterns and lessons remain intact.
Related
- Sanitization Checklist: exhaustive category-by-category reference with scan commands for each pattern type