Baxter International · Feb 2024 – Present

EST Certificate Management Service

Production-ready PKI infrastructure enabling automated certificate issuance and rotation across internal microservices

Python · Docker · Kubernetes · PKI/EST · OAuth2/Keycloak · PostgreSQL · Linux · HIPAA

Overview

In enterprise healthcare environments, managing digital certificates for thousands of microservices is a critical security challenge. Manual certificate issuance and rotation don't scale, and they lead to outages and security gaps when certificates expire unexpectedly.

At Baxter International, I led the end-to-end development of a production-ready EST (Enrollment over Secure Transport) certificate management service that automates the entire certificate lifecycle—from issuance to rotation—across our internal microservices infrastructure.

What is EST?

EST (RFC 7030) is a protocol for automated certificate enrollment and renewal. Think of it as "ACME for internal enterprise PKI"—it provides a standardized way for services to:

Unlike public ACME (used by Let's Encrypt), EST is designed for enterprise internal PKI where you control both the certificate authority and the clients.

Technical Architecture

Core Components

The service consists of several key components:

  1. EST Server: HTTPS API implementing the RFC 7030 endpoints (EST runs over TLS)
  2. Certificate Authority Integration: Secure connection to internal CA for certificate signing
  3. OAuth2/Keycloak Integration: Service authentication and authorization
  4. Certificate Store: PostgreSQL database for certificate lifecycle tracking
  5. Renewal Daemon: Automated certificate renewal before expiration
  6. Audit Logging: Comprehensive logging for HIPAA compliance
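
As a point of reference for the EST Server component, RFC 7030 places its operations under the `/.well-known/est/` path prefix, optionally followed by a CA label. The sketch below only builds those URLs. The host name and the `est_url` helper are hypothetical, and nothing here reflects how the actual service is organized.

```python
from typing import Optional

# Operation names and HTTP methods as defined by RFC 7030.
EST_OPERATIONS = {
    "cacerts": "GET",          # distribute CA certificates
    "simpleenroll": "POST",    # initial enrollment with a PKCS#10 CSR
    "simplereenroll": "POST",  # renewal / rekey of an existing certificate
    "fullcmc": "POST",         # full CMC request
    "serverkeygen": "POST",    # server-side key generation
    "csrattrs": "GET",         # attributes the CA wants in the CSR
}


def est_url(host: str, operation: str, label: Optional[str] = None) -> str:
    """Build the HTTPS URL for an EST operation (hypothetical helper)."""
    if operation not in EST_OPERATIONS:
        raise ValueError(f"unknown EST operation: {operation}")
    parts = [f"https://{host}", ".well-known", "est"]
    if label:
        parts.append(label)  # optional CA label segment
    parts.append(operation)
    return "/".join(parts)


if __name__ == "__main__":
    print(est_url("est.example.internal", "simpleenroll"))
```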

Security Model

Security was paramount. The service implements defense-in-depth:

Kubernetes Deployment

The service runs on Kubernetes with:

Implementation Challenges

Challenge 1: Certificate Renewal Timing

When should certificates be renewed? Too early wastes resources. Too late risks expiration.

Solution: Implemented adaptive renewal scheduling. By default, a certificate is renewed once two-thirds of its validity period has elapsed, and failed renewals are retried with exponential backoff. Critical services get prioritized renewal slots.
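
A minimal sketch of the two timing rules described here: the two-thirds renewal point and an exponential backoff delay. The function names, the 60-second base delay, and the one-hour cap are assumptions made for illustration, and prioritization of critical services is not modeled.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before: datetime, not_after: datetime,
                 fraction: float = 2 / 3) -> datetime:
    """Return the moment at which `fraction` of the validity period has elapsed."""
    lifetime = not_after - not_before
    return not_before + lifetime * fraction


def retry_delay(attempt: int, base_seconds: float = 60.0,
                cap_seconds: float = 3600.0) -> float:
    """Exponential backoff: base * 2**attempt, capped (values are assumed)."""
    return min(cap_seconds, base_seconds * 2 ** attempt)


if __name__ == "__main__":
    issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
    expires = issued + timedelta(days=90)
    print("renew at:", renewal_time(issued, expires))
    print("retry delays:", [retry_delay(n) for n in range(4)])
```

In practice a scheduler like this would usually add random jitter to the delay so that many clients do not retry against the CA at the same instant.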

Challenge 2: HIPAA Compliance

Healthcare environments require comprehensive audit trails. Every certificate operation must be logged with:

Solution: Built a comprehensive audit logging system that emits structured logs and forwards them to a centralized SIEM. Every operation carries a correlation ID so that related events can be traced together during debugging.
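
To illustrate what a structured audit record with a correlation ID can look like, here is a small sketch that produces one JSON line per operation. The field names (`operation`, `subject`, `outcome`) and the `audit_record` helper are assumptions chosen for the example. They are not the service's actual schema, and forwarding to a SIEM is left out.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional


def audit_record(operation: str, subject: str, outcome: str,
                 correlation_id: Optional[str] = None) -> str:
    """Serialize one audit event as a JSON line (hypothetical field names)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Reuse the caller's ID so all events for one request share it.
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "operation": operation,
        "subject": subject,
        "outcome": outcome,
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    request_id = str(uuid.uuid4())
    print(audit_record("simpleenroll", "svc-billing", "requested", request_id))
    print(audit_record("simpleenroll", "svc-billing", "issued", request_id))
```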

Challenge 3: Zero-Downtime Certificate Rotation

How do you rotate a certificate for a running service without downtime?

Solution: Implemented a dual-certificate overlap period. Services request a new certificate before the old one expires, run with both certificates active for an overlap window, then retire the old certificate. This keeps the service available throughout rotation.
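
The overlap window can be sketched as a simple rule over validity dates: the old certificate stays active until the overlap period after the new certificate's start has passed, or until it expires, whichever comes first. The `Cert` type, the `active_certs` function, and the seven-day overlap are hypothetical. The real service's rotation logic is not shown in this write-up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass(frozen=True)
class Cert:
    serial: str
    not_before: datetime
    not_after: datetime


def active_certs(old: Cert, new: Cert, now: datetime,
                 overlap: timedelta) -> List[Cert]:
    """Return the certificates a service should keep loaded at `now`."""
    # The old certificate is retired when the overlap window ends,
    # and never later than its own expiry.
    retire_old_at = min(old.not_after, new.not_before + overlap)
    active = []
    if old.not_before <= now < retire_old_at:
        active.append(old)
    if new.not_before <= now < new.not_after:
        active.append(new)
    return active


if __name__ == "__main__":
    utc = timezone.utc
    old = Cert("01", datetime(2024, 1, 1, tzinfo=utc), datetime(2024, 4, 1, tzinfo=utc))
    new = Cert("02", datetime(2024, 3, 1, tzinfo=utc), datetime(2024, 6, 1, tzinfo=utc))
    for day in (datetime(2024, 2, 15, tzinfo=utc),
                datetime(2024, 3, 3, tzinfo=utc),
                datetime(2024, 3, 10, tzinfo=utc)):
        serials = [c.serial for c in active_certs(old, new, day, timedelta(days=7))]
        print(day.date(), serials)
```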

Impact & Adoption

Key Results

  • Production Ready: Currently in qualification phase with multiple internal teams
  • Automation: Eliminated manual certificate management for enrolled services
  • Security: Zero certificate expiration incidents since deployment
  • Compliance: Full audit trail meeting HIPAA requirements

The service is moving toward broader internal adoption following successful qualification with early adopter teams. Feedback has been overwhelmingly positive—teams appreciate not having to manage certificate lifecycles manually.

Collaboration & External Vendors

This project required extensive collaboration:

I acted as the technical lead, coordinating between stakeholders, making architectural decisions, and ensuring the solution met enterprise security requirements.

Technical Deep Dives

EST Protocol Implementation

EST defines several endpoints. Key implementations:

Each endpoint implements strict validation:

OAuth2 Integration

Rather than relying on the HTTP Basic authentication that RFC 7030 permits alongside TLS client certificates, we integrated OAuth2 for enterprise SSO:

  1. Service obtains OAuth2 token from Keycloak
  2. Token included in Authorization header with EST request
  3. EST server validates token with Keycloak
  4. Token claims used for authorization decisions (which certificate types can this service request?)
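
Steps 2 and 4 of the flow above can be sketched as follows: extracting the bearer token from the `Authorization` header, then mapping token claims to the certificate profiles a caller may request. Signature and expiry validation (step 3) is deliberately left out. The role names, the profile names, and the use of Keycloak's `realm_access.roles` claim as the source of roles are assumptions for this example, not the service's actual policy.

```python
from typing import Dict, Set

# Hypothetical mapping from Keycloak realm roles to certificate profiles.
ROLE_TO_PROFILES: Dict[str, Set[str]] = {
    "est-tls-server": {"tls-server"},
    "est-tls-client": {"tls-client"},
}


def bearer_token(authorization_header: str) -> str:
    """Extract the token from an 'Authorization: Bearer <token>' header value."""
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        raise ValueError("expected 'Bearer <token>'")
    return token.strip()


def allowed_profiles(claims: dict) -> Set[str]:
    """Collect the profiles granted by the roles in already-validated claims."""
    roles = claims.get("realm_access", {}).get("roles", [])
    allowed: Set[str] = set()
    for role in roles:
        allowed |= ROLE_TO_PROFILES.get(role, set())
    return allowed


def authorize(claims: dict, requested_profile: str) -> bool:
    """Return True if the caller may request this certificate profile."""
    return requested_profile in allowed_profiles(claims)


if __name__ == "__main__":
    claims = {"sub": "svc-billing", "realm_access": {"roles": ["est-tls-server"]}}
    print(authorize(claims, "tls-server"), authorize(claims, "tls-client"))
```

Keeping the policy in a plain mapping like this makes the authorization rule easy to review, which matters for a service that goes through security review.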

Lessons Learned

  1. Security reviews take time: PKI services undergo extensive security review. Build extra time into your timeline for security team feedback.
  2. Logging is critical: In production, you'll debug certificate issues by examining logs. Invest heavily in structured, searchable logging from day one.
  3. Test failure scenarios: Happy-path testing isn't enough. Test certificate expiration, CA unavailability, network partitions, etc.
  4. Documentation matters: Services integrating with EST need clear documentation. I created comprehensive onboarding docs with code examples.

Future Enhancements

Several enhancements are planned for future releases:

NOTE ON PROPRIETARY INFORMATION

This description focuses on publicly-known PKI/EST concepts and general architectural patterns. Specific implementation details, vendor names, and Baxter proprietary information are intentionally omitted.