Making Logs Actually Useful with Structured Logging

Published: at 01:34 AM

Team,

I’ve finished implementing our new structured logging utility and wanted to share it with everyone. It targets the time we all lose tracking down related log entries and should substantially improve our debugging workflow.

Current logging challenges

When digging through logs today, we often end up running complex text searches like:

*Processing request* AND *user 12345* AND *client*

The challenge is that inconsistent formatting across different developers and services means we frequently miss relevant log entries. Different teams use “user ID”, “userId”, “user_id”, etc., leading to incomplete searches and extended debugging time.

Structured logging approach

Instead of embedding all context in text messages like:

log.info("Processing request for user 12345 in client ABC");

structured logging passes the context as key-value pairs:

log.info("Processing request for user {} in client {}", 
    SLog.userId(12345), SLog.clientId("ABC"));

Each key is emitted as its own field on the log event, which Datadog can index and search. Instead of text pattern matching, you can query @userId:12345 to retrieve all logs for that user across every service that logs with the same key.
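
To make the idea of "fields" concrete, here is a minimal, dependency-free Java sketch of what a JSON log encoder does with those key-value pairs. StructuredEventSketch and toJsonLine are hypothetical names used only for illustration; in practice the logging encoder does this work, and its exact output will differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only: a hand-rolled stand-in for a JSON log encoder (hypothetical class).
final class StructuredEventSketch {

    private StructuredEventSketch() {}

    // Renders one log event as a single JSON line: the message plus each key as its own field.
    static String toJsonLine(String message, Map<String, Object> fields) {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("message", message);
        event.putAll(fields);
        return event.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + render(e.getValue()))
                .collect(Collectors.joining(",", "{", "}"));
    }

    // Numbers and booleans stay bare; everything else is emitted as a quoted string.
    private static String render(Object value) {
        if (value instanceof Number || value instanceof Boolean) {
            return String.valueOf(value);
        }
        return quote(String.valueOf(value));
    }

    // Minimal escaping (backslash and double quote only), enough for this sketch.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

With userId and clientId attached this way, the event carries them as separate fields alongside the message, which is what a query such as @userId:12345 matches on, regardless of how the message text is worded.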

Benefits for our team

Faster debugging: 20+ minute log searches become sub-minute Datadog queries.

No missed logs: Standardized keys eliminate inconsistencies across developers and services.

Automatic correlation: Related events (workflows, async jobs, user actions) carry the same keys, so a single query links them.

Simple dashboards: Group by clientId for error rates, filter by workflowExecutionId for tracking - no regex needed.

Better analysis: Instant filtering by customer, feature, or transaction ID for faster root cause analysis.

SLog utility implementation

The SLog utility simplifies structured logging adoption. Instead of the verbose existing approach:

import net.logstash.logback.argument.StructuredArguments;

log.info("Processing request for user {} in client {}", 
    StructuredArguments.keyValue("userId", userId),
    StructuredArguments.keyValue("clientId", clientId));

you can now write:

log.info("Processing request for user {} in client {}", 
    SLog.userId(userId),
    SLog.clientId(clientId));

This approach reduces verbosity while ensuring consistent key naming across all services.

Available methods

The current SLog implementation provides these convenience methods:

SLog.userId(userId)           // for user operations
SLog.clientId(clientId)       // for client context  
SLog.workflowExecutionId(id)  // for workflow tracking
SLog.asyncJobId(jobId)        // for async job correlation

Implementation examples

Service methods:

public void processUserWorkflow(Long userId, Long clientId, String workflowType) {
    log.info("Starting workflow processing {} {}", 
        SLog.userId(userId), SLog.clientId(clientId));
    
    try {
        WorkflowExecution execution = startWorkflow(workflowType);
        log.info("Workflow started {} {} {}",
            SLog.userId(userId), SLog.clientId(clientId),
            SLog.workflowExecutionId(execution.getId()));
        // ... rest of your logic
    } catch (WorkflowException e) {
        // The trailing exception has no placeholder; SLF4J logs it with its stack trace.
        log.error("Workflow processing failed {} {}",
            SLog.userId(userId), SLog.clientId(clientId), e);
    }
}

All new development should use structured logging. Whenever you have the chance, converting existing log statements to structured logs is also very helpful, and it isn’t much of a refactor.

Prioritize these areas for structured logging:

Extending functionality

Additional structured keys (such as projectId, findingId, etc.) can be added to the SLogKeys enum with corresponding convenience methods as needed.
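
As a rough sketch of what such an extension might look like, the following dependency-free Java shows an SLogKeys enum gaining a projectId key with a matching convenience method. The enum constants, the asyncJobId key string, the KeyValueArg stand-in, and the method bodies are assumptions for illustration; the real SLog presumably delegates to StructuredArguments.keyValue rather than a hand-written class.

```java
// Hypothetical sketch; the real SLogKeys and SLog may be shaped differently.
enum SLogKeys {
    USER_ID("userId"),
    CLIENT_ID("clientId"),
    WORKFLOW_EXECUTION_ID("workflowExecutionId"),
    ASYNC_JOB_ID("asyncJobId"),
    PROJECT_ID("projectId"); // step 1: add the new key here

    private final String key;

    SLogKeys(String key) {
        this.key = key;
    }

    String key() {
        return key;
    }
}

// Stand-in for a structured argument (assumption); renders as key=value in the message text.
final class KeyValueArg {
    final String key;
    final Object value;

    KeyValueArg(String key, Object value) {
        this.key = key;
        this.value = value;
    }

    @Override
    public String toString() {
        return key + "=" + value;
    }
}

final class SLog {
    private SLog() {}

    // Single place where an enum key is paired with a value.
    private static KeyValueArg kv(SLogKeys key, Object value) {
        return new KeyValueArg(key.key(), value);
    }

    static KeyValueArg userId(Object userId) {
        return kv(SLogKeys.USER_ID, userId);
    }

    static KeyValueArg clientId(Object clientId) {
        return kv(SLogKeys.CLIENT_ID, clientId);
    }

    // step 2: add the matching convenience method
    static KeyValueArg projectId(Object projectId) {
        return kv(SLogKeys.PROJECT_ID, projectId);
    }
}
```

Keeping the key strings in one enum means a new field is spelled once, and every caller of SLog.projectId(...) logs it under the same name.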

Next steps

I recommend we begin incorporating this utility in new development and consider adding structured logging when modifying existing error handling code.

This implementation should significantly reduce debugging time and improve our log analysis capabilities. Please reach out if you have questions about implementation or would like to see additional examples.

