Architecture Validation: Authentik + Pangolin + Guacamole
Validation Date: 2026-01-20
Purpose: Review proposed SSO infrastructure architecture for multi-site deployment
Executive Summary
VERDICT: ✅ APPROVED WITH CRITICAL MODIFICATIONS
The proposed architecture (Authentik + Pangolin + Guacamole) is sound for your use case with one critical exception: the Guacamole/RDP integration has fundamental limitations that require architectural workarounds.
Key Findings
| Component | Status | Confidence | Notes |
|---|---|---|---|
| Authentik | ✅ RECOMMENDED | High | Best choice for self-hosted SSO in 2026 |
| Pangolin | ✅ RECOMMENDED | High | Superior to Cloudflare Tunnel for self-hosted |
| Guacamole + OIDC | ⚠️ APPROVED WITH CAVEATS | Medium | RDP NLA incompatibility requires workarounds |
1. Authentik Validation
Research Findings
Market Position (2026):
- Authentik has emerged as the leading modern SSO solution for self-hosted environments
- Superior to Keycloak for small/medium deployments (lower complexity, better UX)
- Broader than Authelia (full IdP with SAML, LDAP, and RADIUS vs a forward-auth portal with an OIDC provider)
- MIT licensed, active development, 19.6k GitHub stars
Key Strengths:
- Modern architecture: Written in Python (Django), not Java like Keycloak
- Lower resource requirements: Documented to run well with 2GB RAM total
- Better UX: Admin interface significantly easier than Keycloak
- Full protocol support: OIDC, OAuth2, SAML2, LDAP, RADIUS
- Native MFA: TOTP, WebAuthn, Duo, all built-in
- Expression policies: Powerful Python-based policy engine
For Your Use Case:
- ✅ Single-user deployment supported (minimal resource config documented)
- ✅ Service account support for API tokens (Jellyfin mobile apps)
- ✅ MFA enforcement per-application (can require for Guacamole only)
- ✅ Proven integration with Guacamole, Jellyfin SSO plugin, OpenWebUI
- ✅ Active documentation for Pangolin integration
Alternatives Considered:
- Keycloak: Overkill for single-user, 4GB+ RAM, steeper learning curve
- Authelia: Centered on forward auth; has an OIDC provider but no SAML, LDAP, or RADIUS provider
- Zitadel: Newer, less proven integrations
RECOMMENDATION: ✅ Use Authentik as proposed
2. Pangolin Validation
Research Findings
Market Position (2026):
- Pangolin is the leading self-hosted alternative to Cloudflare Tunnel
- Open-source (fosrl/pangolin, 18.2k GitHub stars)
- Built on proven tech: WireGuard + Traefik reverse proxy
- Active community, recently featured in major tech channels (Christian Lempa, NetworkChuck)
Key Strengths:
- Self-hosted control plane: You own all infrastructure, no third-party dependencies
- Identity-aware access control: Native OIDC integration with Authentik
- Dual mode: Tunneled reverse proxy + VPN-style private resource access
- No inbound ports required: WireGuard outbound tunnels from private networks
- Automatic SSL: Let's Encrypt integration via Traefik
- Mobile support: Native apps + WireGuard config export
Architecture Components:
- Pangolin (Control Plane): Dashboard, API, WebSocket server, auth system
- Gerbil (Tunnel Manager): WireGuard interface management
- Newt (Edge Client): Runs on private networks (brn, VPS hosts)
- Traefik (Reverse Proxy): TLS termination, routing, load balancing
- Badger (Auth Middleware): OIDC authentication enforcement
For Your Use Case:
- ✅ Replaces WireGuard mesh: Current 10.51.0.0/24 network becomes Pangolin sites
- ✅ Centralized on brn: Control plane on physically secure host
- ✅ VPS integration: Newt clients on fry, proton, photon for site-to-site routing
- ✅ Mobile access: Apps for pixel9pro, pixel6pro
- ✅ Granular ACLs: Per-service, per-user access control via Authentik
Comparison to Alternatives:
| Solution | Ownership | Cost | Mobile | OIDC | Complexity |
|---|---|---|---|---|---|
| Pangolin | Self-hosted | Free | ✅ | ✅ | Medium |
| Cloudflare Tunnel | Cloudflare | Free | ⚠️ Limited | ✅ | Low |
| Tailscale | Tailscale | $5/user | ✅ | ⚠️ Enterprise | Low |
| Headscale | Self-hosted | Free | ✅ | ⚠️ Login only | Medium |
Critical Findings:
- ✅ OIDC redirect URI: https://tunnel.obr.sh/api/v1/auth/callback
- ✅ Required scopes: openid, profile, email, groups
- ✅ Site architecture: Each location (brn LAN, fry, proton) becomes a "Site"
- ✅ Resource types: Public (HTTPS with domains) + Private (TCP/UDP for VPN access)
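To make the redirect URI and scope findings concrete, here is a minimal sketch of the authorization request that an OIDC client such as Pangolin would send to Authentik. The redirect URI and scopes are taken from the findings above. The authorization endpoint path and the client ID are assumptions for illustration; the real values should come from the provider's discovery document and the Authentik application settings.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Values taken from the findings above.
REDIRECT_URI = "https://tunnel.obr.sh/api/v1/auth/callback"
SCOPES = ["openid", "profile", "email", "groups"]

# Assumptions for illustration only: endpoint path and client ID.
AUTHORIZE_ENDPOINT = "https://sso.obr.sh/application/o/authorize/"
CLIENT_ID = "pangolin-example-client"


def build_authorization_url(state: str) -> str:
    """Return the URL the browser is sent to at the start of the code flow."""
    query = urlencode({
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": " ".join(SCOPES),  # scopes are space-separated in the request
        "state": state,
    })
    return f"{AUTHORIZE_ENDPOINT}?{query}"


url = build_authorization_url(state="example-state")
params = parse_qs(urlparse(url).query)
print(params["redirect_uri"][0])
print(params["scope"][0])
```

A mismatch between the redirect URI sent in this request and the one registered for the application in Authentik is a common cause of failed logins, which is why the exact value is worth recording.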
RECOMMENDATION: ✅ Use Pangolin as proposed
3. Guacamole Validation
Research Findings
Market Position (2026):
- Apache Guacamole remains the leading open-source clientless RDP gateway
- No viable open-source alternatives with equivalent feature set
- Active Apache project, version 1.6.0 current
OIDC Support:
- ✅ Native OIDC extension available
- ✅ Documented Authentik integration guide
- ✅ Works well for authentication to Guacamole dashboard
⚠️ CRITICAL LIMITATION DISCOVERED
RDP NLA + OIDC Incompatibility:
The research uncovered a fundamental architectural limitation:
Problem:
- RDP Network Level Authentication (NLA) requires username/password for NTLM/Kerberos authentication
- OIDC authentication never provides the user's password to Guacamole
- Variables available: ${GUAC_USERNAME} ✅, ${GUAC_PASSWORD} ❌
- Result: Cannot use NLA with OIDC authentication
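The limitation can be illustrated with a small model of parameter token substitution. This is a simplified sketch, not Guacamole's implementation: it treats an unresolved token as empty, and the user name and password values are placeholders.

```python
import re

TOKEN = re.compile(r"\$\{(GUAC_[A-Z_]+)\}")


def expand_tokens(value: str, session: dict) -> str:
    """Replace ${GUAC_*} tokens with session values (simplified: missing -> empty)."""
    return TOKEN.sub(lambda match: session.get(match.group(1)) or "", value)


def can_use_nla(params: dict, session: dict) -> bool:
    """NLA needs a username and a password before the RDP session starts."""
    username = expand_tokens(params.get("username", ""), session)
    password = expand_tokens(params.get("password", ""), session)
    return bool(username) and bool(password)


connection = {"username": "${GUAC_USERNAME}", "password": "${GUAC_PASSWORD}"}

# Password login: Guacamole saw the password, so both tokens resolve.
password_session = {"GUAC_USERNAME": "olaf", "GUAC_PASSWORD": "example-password"}
# OIDC login: the identity provider never hands the password to Guacamole.
oidc_session = {"GUAC_USERNAME": "olaf"}

nla_with_password_login = can_use_nla(connection, password_session)
nla_with_oidc_login = can_use_nla(connection, oidc_session)
```

In this model the OIDC session has nothing to put in the password parameter, so the connection cannot complete an NLA handshake without one of the workarounds.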
Security Implications:
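- NLA is the recommended security practice for RDP: the client must authenticate (via CredSSP) before the server creates a session or shows a login screen
- Disabling NLA exposes the Windows login screen, and more of the RDP service, to any client that can reach the RDP port before it has authenticated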
- Windows 11 (argon) defaults to requiring NLA
Workarounds Available:
| Option | Security | User Experience | Implementation |
|---|---|---|---|
| 1. Disable NLA | ⚠️ Lower | Seamless SSO | Easy - set the security mode in the Guacamole connection config and allow non-NLA connections on the Windows host |
| 2. Prompt for credentials | ✅ High | Double login | Medium - configure in Guacamole |
| 3. Service account | ⚠️ Medium | Seamless SSO | Easy - hardcode credentials, lose audit trail |
| 4. Use CAS instead of OIDC | ✅ High | Seamless SSO | Hard - requires a CAS server with ClearPass enabled so Guacamole can receive the password |
Recommended Approach for Your Deployment
Since this is single-user (you) accessing your own workstation (argon):
RECOMMENDED: Option 1 - Disable NLA
Rationale:
- Low risk: You're the only user, accessing your own machine
- Network already secured: Guacamole only accessible via Pangolin tunnel + Authentik SSO + MFA
- User experience: Best (seamless SSO with TOTP)
- Defense in depth: Multiple layers (MFA on Authentik, network isolation via Pangolin)
Implementation:
# In Guacamole connection config for argon-rdp:
security: rdp # Use standard RDP security instead of NLA
ignore-cert: true # Accept self-signed certs
Additional Security Mitigations:
- ✅ Enforce MFA on Guacamole application in Authentik (TOTP required)
- ✅ Restrict Guacamole to Pangolin tunnel only (no public WAN access)
- ✅ Enable Guacamole session recording for audit trail
- ✅ Configure Windows Firewall on argon to only allow RDP from brn (10.50.0.74)
Alternative for Future Multi-User: If you later add users, switch to Option 2 (prompt for credentials) to maintain per-user accountability.
RECOMMENDATION: ✅ Use Guacamole with NLA disabled, compensated by MFA + Pangolin isolation
4. Service Integration Validation
Jellyfin SSO
Status: ✅ FULLY SUPPORTED
Plugin: SSO-Auth plugin from Jellyfin catalog
Key Findings:
- ✅ Authentik integration well-documented
- ⚠️ Critical: Mobile apps (Android/iOS) have limited OIDC support
- ✅ Solution: Use "Quick Connect" feature for mobile (6-digit code pairing)
- ✅ Alternative: API tokens for dedicated devices
Configuration:
- Provider type: Generic OpenID
- Client auth: client_secret_post (NOT client_secret_basic)
- Claims: roles via groups claim
- Scopes: openid, profile, email, groups
Mobile App Strategy:
- Primary: Quick Connect (user logs in via web SSO, enters code in app)
- Secondary: API tokens per device (generated in Jellyfin dashboard)
OpenWebUI SSO
Status: ✅ FULLY SUPPORTED
Native OIDC: No plugin required
Key Findings:
- ✅ Robust OIDC implementation since v0.7.1+
- ✅ Role-based admin designation via OAUTH_ADMIN_ROLES
- ✅ JIT group provisioning with ENABLE_OAUTH_GROUP_CREATION
- ✅ Automatic role synchronization on every login
Configuration Variables:
OPENID_PROVIDER_URL=https://sso.obr.sh/application/o/openwebui/.well-known/openid-configuration
OAUTH_CLIENT_ID=<from_authentik>
OAUTH_CLIENT_SECRET=<from_authentik>
ENABLE_OAUTH_ROLE_MANAGEMENT=true
OAUTH_ROLES_CLAIM=groups
OAUTH_ADMIN_ROLES=openwebui-admins
Redirect URI: https://ll.obr.sh/oauth/oidc/callback
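How the role variables interact can be sketched as follows. This is an illustrative model of the decision, not Open WebUI's code; the setting names are the ones listed above, while the claim contents and email addresses are hypothetical.

```python
def parse_csv(value: str) -> set:
    """Split a comma-separated setting into a set of trimmed names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def resolve_role(claims: dict, settings: dict) -> str:
    """Return 'admin' or 'user' based on the configured roles claim."""
    if settings.get("ENABLE_OAUTH_ROLE_MANAGEMENT", "").lower() != "true":
        return "unmanaged"  # role handling is left to the application
    claim_name = settings["OAUTH_ROLES_CLAIM"]
    user_groups = set(claims.get(claim_name, []))
    admin_groups = parse_csv(settings["OAUTH_ADMIN_ROLES"])
    return "admin" if user_groups & admin_groups else "user"


SETTINGS = {
    "ENABLE_OAUTH_ROLE_MANAGEMENT": "true",
    "OAUTH_ROLES_CLAIM": "groups",
    "OAUTH_ADMIN_ROLES": "openwebui-admins",
}

# Hypothetical ID token claims as the identity provider might issue them.
admin_claims = {"email": "admin@example.org", "groups": ["openwebui-admins", "media"]}
user_claims = {"email": "guest@example.org", "groups": ["media"]}

admin_role = resolve_role(admin_claims, SETTINGS)
user_role = resolve_role(user_claims, SETTINGS)
```

The practical consequence is that admin rights follow membership of the openwebui-admins group in Authentik, so that group has to exist and be included in the groups claim.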
Gitea SSO (fry + proton)
Status: ✅ FULLY SUPPORTED
Native OIDC: Built-in authentication source
Configuration:
- Type: OAuth2
- Provider: OpenID Connect
- Auto Discovery URL: https://sso.obr.sh/application/o/gitea/.well-known/openid-configuration
- Admin role mapping: Via Authentik groups
Note: Gitea instances remain publicly accessible because they are federated; SSO is an optional login method.
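The discovery URLs in this section follow one pattern per application slug. The sketch below builds that URL and checks a discovery document for a few fields a client needs. The helper names are hypothetical, the key list is only a subset of the OIDC discovery metadata, and the sample document is made up for the example rather than fetched from the server.

```python
def discovery_url(base_url: str, slug: str) -> str:
    """Build the per-application discovery URL used throughout this document."""
    return f"{base_url.rstrip('/')}/application/o/{slug}/.well-known/openid-configuration"


# A subset of the metadata fields an OIDC client relies on.
REQUIRED_KEYS = ("issuer", "authorization_endpoint", "token_endpoint", "jwks_uri")


def missing_discovery_keys(document: dict) -> list:
    """Return the required keys that the discovery document lacks."""
    return [key for key in REQUIRED_KEYS if key not in document]


# Illustrative sample only; a real document would be fetched from discovery_url(...).
SAMPLE_DOCUMENT = {
    "issuer": "https://sso.obr.sh/application/o/gitea/",
    "authorization_endpoint": "https://sso.obr.sh/application/o/authorize/",
    "token_endpoint": "https://sso.obr.sh/application/o/token/",
}

gitea_url = discovery_url("https://sso.obr.sh", "gitea")
missing = missing_discovery_keys(SAMPLE_DOCUMENT)
```

Running a check like this against each application before configuring the client catches a wrong slug early, since a wrong slug yields no usable document.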
Transmission
Status: ⚠️ NO SSO SUPPORT
Current: HTTP Basic Authentication
Recommendation:
- Keep existing basic auth
- Protect behind Pangolin tunnel only (no public WAN access)
- Consider forward auth middleware via Traefik if SSO required
Mastodon (bern.social)
Status: ✅ NO CHANGES NEEDED
Reason: Public federated service, should remain publicly accessible
Recommendation: Do not integrate with SSO, keep existing authentication
5. Architectural Risks & Mitigations
Risk Matrix
| Risk | Severity | Probability | Mitigation |
|---|---|---|---|
| Authentik failure = total auth outage | High | Low | Backup recovery codes, PostgreSQL backups, consider HA |
| Pangolin control plane failure | Medium | Low | Services still accessible via LAN, failover to WireGuard |
| RDP NLA disabled security concern | Medium | Medium | Compensate with MFA + network isolation |
| Mobile app SSO limitations (Jellyfin) | Low | High | Use Quick Connect, document for users |
| DNS failure (sso.obr.sh unreachable) | High | Low | Local /etc/hosts entries as backup |
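One way to prioritize the matrix is to score each risk as severity times probability. The 1 to 3 numeric scale below is an assumption added for illustration and is not part of the original assessment; the risk names and ratings are copied from the table.

```python
# Assumed numeric scale for the qualitative ratings.
LEVEL = {"Low": 1, "Medium": 2, "High": 3}

# (risk, severity, probability) as listed in the risk matrix.
RISKS = [
    ("Authentik failure", "High", "Low"),
    ("Pangolin control plane failure", "Medium", "Low"),
    ("RDP NLA disabled", "Medium", "Medium"),
    ("Jellyfin mobile SSO limitations", "Low", "High"),
    ("DNS failure", "High", "Low"),
]


def score(severity: str, probability: str) -> int:
    """Combine the two ratings into a single priority score."""
    return LEVEL[severity] * LEVEL[probability]


# Highest score first; ties keep their order from the table.
ranked = sorted(RISKS, key=lambda risk: score(risk[1], risk[2]), reverse=True)
for name, severity, probability in ranked:
    print(f"{score(severity, probability)}  {name}")
```

Under this scale the disabled NLA setting ranks first, which is consistent with the report treating it as the one critical exception.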
Single Points of Failure
Authentik (sso.obr.sh):
- Impact: All SSO authentication fails
- Mitigation:
  - Regular PostgreSQL backups (pg_dump)
  - Store recovery codes offline
  - Document emergency admin access procedure
  - Consider Docker volume backups
Pangolin (tunnel.obr.sh):
- Impact: Mobile/remote access fails, VPS sites unreachable
- Mitigation:
  - Services still accessible from LAN (Traefik routes remain)
  - Keep existing WireGuard as emergency fallback
  - Document manual WireGuard reconnection procedure
brn Host (10.50.0.74):
- Impact: Total control plane failure (Authentik, Pangolin, Guacamole)
- Mitigation:
  - Physical host security (already planned)
  - UPS for power stability
  - Backup restore procedure documented
  - Consider VM snapshots before changes
Backup Strategy
Critical Data:
- Authentik PostgreSQL database - pg_dump daily, keep 7 days
- Authentik media files - /srv/docker/authentik/media/
- Pangolin configuration - /srv/docker/pangolin/ (database and config)
- Guacamole PostgreSQL database - connection definitions
- Traefik dynamic config - /srv/docker/traefik/traefik_dynamic.yaml
Backup Script: See /home/olaf/pangolin/.tasks/artifacts/backup-strategy.md (TODO: create)
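Until the backup script exists, the "daily, keep 7 days" retention rule can be sketched as a pure function that decides which dumps to prune. The file naming scheme (authentik-YYYY-MM-DD.sql) is hypothetical, and the sketch only selects files; it does not run pg_dump or delete anything.

```python
from datetime import date, timedelta

RETENTION_DAYS = 7  # "keep 7 days" from the backup strategy


def dump_date(filename: str) -> date:
    """Extract the date from a name like authentik-2026-01-20.sql (assumed scheme)."""
    stem = filename.rsplit(".", 1)[0]
    return date.fromisoformat(stem[-10:])


def dumps_to_delete(filenames, today: date, keep_days: int = RETENTION_DAYS):
    """Return dump files outside the retention window, oldest first."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(name for name in filenames if dump_date(name) <= cutoff)


today = date(2026, 1, 20)
# Eleven daily dumps, from today back to ten days ago.
existing = [f"authentik-{today - timedelta(days=n)}.sql" for n in range(11)]
expired = dumps_to_delete(existing, today)
kept = sorted(set(existing) - set(expired))
```

Keeping the selection separate from the deletion makes the rule easy to test before it is allowed to remove real backups.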
6. Alternative Architectures Considered
Alternative A: Keycloak instead of Authentik
Pros:
- More mature (13 years vs 6 years)
- Enterprise-grade features
- Larger community (32k stars vs 19k)
Cons:
- Higher resource requirements (4GB+ RAM)
- Steeper learning curve
- Overkill for single-user deployment
- Java-based (vs Python for Authentik)
Verdict: ❌ Rejected - unnecessary complexity for use case
Alternative B: Cloudflare Tunnel instead of Pangolin
Pros:
- Lower operational burden (managed service)
- Global edge network
- Built-in DDoS protection
- Simpler setup
Cons:
- Third-party dependency (Cloudflare controls routing)
- Limited customization
- No VPN-style private resource access
- Privacy concerns (traffic visibility)
Verdict: ❌ Rejected - plan specifies self-hosted control
Alternative C: Tailscale instead of Pangolin
Pros:
- Easier setup
- Better mobile apps
- NAT traversal superior (DERP relays)
Cons:
- Pricing: $5/user/month beyond the free tier
- Control plane dependency on Tailscale servers
- Limited reverse proxy features
- No identity-aware access control without ACL tags
Verdict: ❌ Rejected - cost and third-party dependency
Alternative D: No RDP Gateway (Direct RDP)
Pros:
- Simpler architecture
- No NLA compatibility issues
Cons:
- Requires RDP client installation on devices
- No web-based access (can't use from Chromebook, iPad browser)
- No session recording capability
- Less secure (direct exposure vs gateway)
Verdict: ❌ Rejected - Guacamole provides superior UX and security
7. Final Recommendations
✅ APPROVED Architecture
Core Components:
- Authentik at sso.obr.sh - SSO/IdP
- Pangolin at tunnel.obr.sh - Tunneled reverse proxy
- Guacamole at remote.obr.sh - RDP gateway (NLA disabled)
🔧 Required Modifications to Original Plan
- Guacamole RDP Connection:
  - Change from "NLA security" to "Standard RDP security"
  - Enable session recording for audit trail
  - Configure Windows Firewall on argon to only allow brn
- Authentik MFA Policy:
  - Create separate policy for Guacamole application (TOTP required)
  - Optional for other services (Jellyfin, OpenWebUI) based on preference
- Jellyfin Mobile Strategy:
  - Document Quick Connect procedure for mobile apps
  - Create API tokens for persistent devices (TV apps)
- Transmission:
  - Keep HTTP basic auth (no OIDC support)
  - Access via Pangolin tunnel only
📋 Implementation Order Validation
The plan's phased approach is sound:
Phase 1: Authentik ✅
- Foundation for all SSO
Phase 2: Pangolin ✅
- Requires Authentik for OIDC
Phase 3: Guacamole ✅
- Requires Authentik for OIDC
Phase 4: Service Integration ✅
- Requires Authentik + Pangolin operational
Phase 5: Traefik Restriction ✅
- Only after Pangolin sites verified working
Phase 6: Mobile Setup ✅
- Final verification step
Order is correct: Sequential dependencies respected
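The dependency claim can be checked mechanically. The sketch below encodes the stated requirements as a graph and verifies that the planned order never places a phase before something it depends on. The dependencies for Phase 5 and Phase 6 are interpretations of "only after Pangolin sites verified working" and "final verification step", so treat them as assumptions.

```python
from graphlib import TopologicalSorter

# phase -> phases it depends on
DEPENDS_ON = {
    "Phase 1: Authentik": set(),
    "Phase 2: Pangolin": {"Phase 1: Authentik"},
    "Phase 3: Guacamole": {"Phase 1: Authentik"},
    "Phase 4: Service Integration": {"Phase 1: Authentik", "Phase 2: Pangolin"},
    "Phase 5: Traefik Restriction": {"Phase 2: Pangolin"},  # assumed
    "Phase 6: Mobile Setup": {  # assumed
        "Phase 4: Service Integration",
        "Phase 5: Traefik Restriction",
    },
}

PLANNED_ORDER = list(DEPENDS_ON)  # Phase 1 through Phase 6, as planned


def respects_dependencies(order, depends_on) -> bool:
    """True if every phase appears after all of its dependencies."""
    position = {phase: index for index, phase in enumerate(order)}
    return all(
        position[dependency] < position[phase]
        for phase, dependencies in depends_on.items()
        for dependency in dependencies
    )


plan_is_valid = respects_dependencies(PLANNED_ORDER, DEPENDS_ON)
# static_order() raises graphlib.CycleError if the dependencies are circular.
valid_order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

The graph also shows that Phase 3 depends only on Phase 1, so Guacamole could be brought up in parallel with Pangolin if that is convenient.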
🎯 Success Criteria
Deployment successful when:
- ✅ Can log in to Authentik admin via sso.obr.sh
- ✅ Can log in to Pangolin dashboard via tunnel.obr.sh (SSO redirect)
- ✅ Can access Guacamole via remote.obr.sh (SSO + MFA)
- ✅ Can connect to argon RDP via Guacamole web interface
- ✅ Can access Jellyfin via Pangolin mobile app (with Quick Connect)
- ✅ Can access OpenWebUI via Pangolin tunnel (SSO login)
- ✅ Jellyfin/OpenWebUI/Transmission return 404 from public WAN
- ✅ VPS hosts (fry, proton) show connected in Pangolin dashboard
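The public-WAN criterion lends itself to an automated check. The sketch below compares observed HTTP status codes with the expected 404 and reports any service that still answers publicly. The hostnames are placeholders, and the fetcher is a stub so the logic can be exercised offline; a real fetcher (for example one built on urllib.request) would need to run from a host outside the LAN and the Pangolin tunnel.

```python
from typing import Callable, Dict

# Placeholder hostnames; only the expected 404 comes from the success criteria.
EXPECTED_PUBLIC_STATUS = {
    "https://jellyfin.example.org/": 404,
    "https://openwebui.example.org/": 404,
    "https://transmission.example.org/": 404,
}


def failed_checks(expected: Dict[str, int],
                  fetch_status: Callable[[str], int]) -> Dict[str, int]:
    """Return {url: observed_status} for every URL that did not match."""
    failures = {}
    for url, wanted in expected.items():
        observed = fetch_status(url)
        if observed != wanted:
            failures[url] = observed
    return failures


def stub_fetch(url: str) -> int:
    """Stand-in for a real HTTP request: one service is still exposed."""
    still_public = {"https://transmission.example.org/": 200}
    return still_public.get(url, 404)


failures = failed_checks(EXPECTED_PUBLIC_STATUS, stub_fetch)
```

An empty result means the criterion is met; any entry names a service that is still reachable from the public side.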
⚠️ Rollback Plan
Critical checkpoints:
- After TASK-005 (Authentik deploy): Services still work without SSO
- After TASK-009 (Pangolin sites): Traefik routes still public
- After TASK-024 (Traefik restriction): CRITICAL CHECKPOINT
Rollback procedure:
# Emergency: restore public access
sudo cp /home/olaf/pangolin/.tasks/artifacts/traefik_dynamic.yaml.backup \
/srv/docker/traefik/traefik_dynamic.yaml
docker restart traefik # Restart Traefik to load the restored config (with the file provider's watch option enabled, the change is picked up automatically)
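The copy step of the rollback can be made safer by keeping the file that is being replaced and verifying the result. The following is a minimal sketch under those assumptions; the function name and the ".replaced" suffix are hypothetical, and the demonstration runs in a temporary directory rather than against /srv/docker/traefik/.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256(path: Path) -> str:
    """Hash a file so the restored copy can be compared with the backup."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def restore_config(backup: Path, target: Path) -> Path:
    """Copy backup over target, keeping the replaced file next to it."""
    if not backup.is_file():
        raise FileNotFoundError(f"backup not found: {backup}")
    saved = target.with_name(target.name + ".replaced")
    if target.exists():
        shutil.copy2(target, saved)  # keep the config being rolled back
    shutil.copy2(backup, target)
    if sha256(target) != sha256(backup):
        raise RuntimeError("restored file does not match the backup")
    return saved


# Demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "traefik_dynamic.yaml.backup"
    target = Path(tmp) / "traefik_dynamic.yaml"
    backup.write_text("# public routes\n")
    target.write_text("# restricted routes\n")
    saved = restore_config(backup, target)
    restored_text = target.read_text()
    saved_text = saved.read_text()
```

Keeping the replaced file means the restriction can be reapplied quickly once the cause of the rollback is understood.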
📊 Resource Requirements
brn Host (10.50.0.74) Additional Load:
- Authentik: +2GB RAM, +2 CPU cores
- Pangolin: +1GB RAM, +1 CPU core
- Guacamole: +1GB RAM, +1 CPU core
- Total: +4GB RAM, +4 CPU cores
brn specs required: minimum 8GB RAM; 4-6 CPU cores recommended
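The totals can be reproduced, and compared with a candidate host, in a few lines. The per-component figures are the ones listed above; the host sizes passed to headroom are example inputs, not measured values for brn.

```python
# (RAM in GB, CPU cores) added by each component, as listed above.
ADDITIONAL_LOAD = {
    "Authentik": (2, 2),
    "Pangolin": (1, 1),
    "Guacamole": (1, 1),
}

total_ram_gb = sum(ram for ram, _ in ADDITIONAL_LOAD.values())
total_cores = sum(cores for _, cores in ADDITIONAL_LOAD.values())


def headroom(host_ram_gb: int, host_cores: int) -> tuple:
    """Return (RAM, cores) left for existing services after the new load."""
    return host_ram_gb - total_ram_gb, host_cores - total_cores


minimum_host = headroom(8, 4)
recommended_host = headroom(8, 6)
```

On a 4-core host the new stack accounts for every core in this estimate, which is the reason to prefer the upper end of the recommended range if existing services are CPU-bound.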
🔒 Security Posture
Improvements:
- ✅ Centralized authentication (single MFA enrollment)
- ✅ Granular per-service access control
- ✅ Session recording for RDP access
- ✅ Network segmentation via Pangolin tunnels
- ✅ Elimination of password sprawl
Trade-offs:
- ⚠️ RDP NLA disabled (compensated by MFA + network isolation)
- ⚠️ Single point of failure (brn host)
Overall: ✅ Net security improvement
8. Conclusion
FINAL VERDICT: ✅ ARCHITECTURE APPROVED FOR IMPLEMENTATION
The proposed Authentik + Pangolin + Guacamole architecture is sound and recommended with the following conditions:
- Acknowledge RDP NLA limitation and implement compensating controls
- Follow phased implementation order as specified
- Create backup strategy before starting (TASK-027)
- Test thoroughly at each phase before proceeding
- Document emergency rollback procedures
Confidence Level: 85%
Remaining 15% risk factors:
- Pangolin relatively new in production (1 year track record)
- Guacamole NLA workaround requires security discipline
- Single-user deployment lacks redundancy
Recommendation to proceed: ✅ YES
Next step: Execute research-informed implementation starting with TASK-003 (Create Authentik Compose) using insights from RESEARCH-002 and TASK-001 outputs.
Validation completed by: Claude Code
Date: 2026-01-20
Research artifacts referenced: RESEARCH-001 through RESEARCH-005, TASK-001