Recently, I have been working on a project as part of my job, and I found myself wasting a lot of time trying to debug my code and find out why certain things were not working as expected. As an engineer, I know that being able to debug and troubleshoot your code correctly is an essential skill, and I wanted to find a better way to do it.
After doing some research and trying out a few things for myself, I am now ready to share what I've learned about Python logging.
In this blog post, I will explain the basics of creating informative logs in Python. I will explain what logs are, why they are essential, and how to create valuable logs. I will also provide some examples of how to use the built-in Python logging module and a JSON formatter.
What is a Log, and How is it Used in Computer Programs?
A log is a record of events or activities in a computer program. It can be used to track errors, performance information, and other important information.
For example, when you run a program, the log can record any errors that occur and the actions that were taken when the error happened. This information can help you understand why something is not working correctly.
Why are Logs Essential in Applications?
Logs are essential because they allow you to identify and troubleshoot problems quickly. The log records also provide a historical record of what happened when errors occurred. This makes it much easier to find the source of the problem and fix it.
With log records, it is much easier to debug your code. Logs are also essential for alerting you to potential problems before they become a significant issue.
The Logging Levels
The logging level is the severity of a log event. There are five standard logging levels in Python:
- DEBUG: This level is used for logging detailed information about the program execution. This is useful for debugging.
- INFO: This level is used for logging general information about the program execution. This is useful for reporting successful operations.
- WARNING: This level is used for logging a potential issue.
- ERROR: This level is used for logging errors. This is useful for tracking and troubleshooting errors.
- CRITICAL: This level is used for logging critical errors. This is useful for alerting on serious issues.
It's essential to use the correct logging level when creating logs. This will ensure that you can quickly identify and troubleshoot any issue.
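As a rough illustration of how the level acts as a filter, the sketch below sets a logger to WARNING and checks which of the five levels it would emit. The logger name "levels_demo" is made up for this example.

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("levels_demo")
logger.setLevel(logging.WARNING)

# Each level name maps to a numeric severity; higher means more severe.
for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"):
    level = getattr(logging, name)
    print(f"{name:<8} {level:>2} emitted: {logger.isEnabledFor(level)}")
```

With the level set to WARNING, the DEBUG and INFO lines report that they would be dropped, which is why choosing the level deliberately matters.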
Python Logging Module
The Python logging module is part of the standard library and allows you to create logs easily. It provides various features, such as formatting logs, setting logging levels, and creating handlers.
Once you have imported the logging module, you can obtain a logger by calling the getLogger function. It returns a Logger object, which can be used to create log records.
import logging
logger = logging.getLogger(__name__)
The argument to getLogger specifies the name of the logger. This is useful for creating logs for specific parts of the program.
In Python, the variable __name__ is a special variable that is automatically set by the Python interpreter when a module is imported or executed as a standalone script. The __name__ of a module is set to the string "__main__" when the module is executed as a standalone script and to the module's name otherwise.
Using the __name__ variable in a logger allows you to quickly identify the source of log messages, especially when your application is composed of multiple modules.
In an application, we can make a distinction between different types of loggers:
- Root logger: The top-level logger in the logger hierarchy. It is created automatically, is named "root", and defaults to the WARNING level. By default, records emitted by more specific loggers propagate up to the root logger's handlers.
- Application logger: A logger that is specific to a particular application or module. These loggers are typically used for logging events that are related to the application’s functionality.
- Package logger: A logger that a library uses to log events related to its functionality.
- Third-party logger: A logger that is used by a third-party library or module.
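To make the hierarchy a little more concrete, here is a minimal sketch using only the standard library. The names "shop" and "shop.payments" are hypothetical and stand in for an application package and one of its modules.

```python
import logging

# Hypothetical names: "shop" stands in for an application package.
app_logger = logging.getLogger("shop")
module_logger = logging.getLogger("shop.payments")
root_logger = logging.getLogger()  # no argument returns the root logger

# Dotted names form a hierarchy: "shop.payments" is a child of "shop",
# which in turn is a child of the root logger.
print(module_logger.parent is app_logger)
print(app_logger.parent is root_logger)

# A logger with no level of its own inherits the effective level of
# its nearest configured ancestor.
app_logger.setLevel(logging.INFO)
print(logging.getLevelName(module_logger.getEffectiveLevel()))
```

This is also why getLogger(__name__) works well in practice: module names are already dotted paths, so the logger hierarchy mirrors the package layout.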
The logger object provides several methods, such as debug(), info(), warning(), error(), and critical(), that correspond to the different severity levels. To emit a log message, you call the appropriate method and pass the message as an argument. For example:
import logging
logger = logging.getLogger(__name__)
def log_level_example():
    logger.debug("Debugging message")
    logger.info("Informational message")
    logger.warning("Warning message")
    logger.error("Error message")
    logger.critical("Critical message")
Creating An Informative Log
The log message is the main part of a log record, and it should include a description of the action that was taken and the result of the action. It is essential to ensure the log message is informative and easy to understand.
Why is Formatting the Message Bad?
By default, the logging module formats the log message as a plain string. This is not ideal, as it can be difficult to parse the string and extract data from it.
Additionally, interpolating details about the situation into the message makes log messages inconsistent. This gets in the way of building metrics and alerts, since each log record contains a different message.
Maximizing Logging Efficiency: Utilizing the Extra Field for Enhanced Event Information
The extra field is an optional keyword argument of the logger methods (debug(), info(), and so on). It is used to pass additional data to the log record as a dictionary. This data can be used to provide more information about the event.
The benefit of keeping the log message clean and moving the additional log information to the extra field is that you get more readable log messages and more detailed information about the events when needed.
When the records are rendered with a JSON formatter, each value in the extra dictionary must be serializable, since the log record is eventually converted to a JSON string.
For example, if you are logging an error, you can use the extra field to pass the error message and other relevant information.
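The difference can be sketched as follows, using only the standard library. The logger name, the CollectingHandler class, and the field names user_id and reason are all made up for illustration; the handler simply keeps records in a list so they can be inspected.

```python
import logging

logger = logging.getLogger("extra_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False


class CollectingHandler(logging.Handler):
    """Illustrative handler that keeps records in a list for inspection."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


handler = CollectingHandler()
logger.addHandler(handler)

# Harder to aggregate: every record carries a different message text.
user_id = 42  # made-up value
logger.info(f"Payment failed for user {user_id}")

# Easier to aggregate: constant message, variable data in `extra`.
logger.info("Payment failed", extra={"user_id": user_id, "reason": "card_declined"})

record = handler.records[-1]
print(record.getMessage(), record.user_id, record.reason)
```

Each key in extra becomes an attribute on the log record, so a formatter or handler can pick it up without parsing the message text.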
There are certain reserved keys in the log record that you cannot use as keys in the "extra" parameter.
These reserved keys include: args, asctime, created, exc_info, exc_text, filename, funcName, levelname, levelno, lineno, message, module, msecs, msg, name, pathname, process, processName, relativeCreated, stack_info, thread, threadName
If you try to use any of these keys as a key in the "extra" parameter, the logging module raises a KeyError with a message saying that you attempted to overwrite that key in the LogRecord.
So, to avoid this error, choose a different key that is not in the list of reserved keys.
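A quick way to see this behavior is the sketch below. In the standard library the exception raised is a KeyError; the logger name and the replacement key service_name are hypothetical.

```python
import logging

logger = logging.getLogger("reserved_demo")  # hypothetical logger name

try:
    # "name" is already an attribute of every LogRecord, so it is rejected.
    logger.error("Request failed", extra={"name": "checkout"})
except KeyError as exc:
    print(f"Rejected: {exc}")
    rejected = True
else:
    rejected = False

# A non-reserved key (here the made-up "service_name") is accepted.
logger.error("Request failed", extra={"service_name": "checkout"})
```

Note that the check only runs when the record is actually created, so a call below the logger's effective level would not raise.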
The Logger Formatter
The logger formatter is used to control the format of the log records. You can use it to specify the date format and the layout of each record, including fields such as the log level and the log message. You can also use the formatter to add additional data to the log record.
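As a minimal sketch of a plain-text formatter from the standard library, the example below attaches a Formatter to a console handler. The format string, date format, and logger name are chosen purely for illustration.

```python
import logging

# Format string and date format chosen for illustration only.
formatter = logging.Formatter(
    fmt="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S",
)

handler = logging.StreamHandler()  # writes to stderr by default
handler.setFormatter(formatter)

logger = logging.getLogger("formatter_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("Service started")
```

The formatter belongs to the handler, not the logger, so the same logger can write differently formatted output to different destinations.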
The JSON Formatter
The JSON formatter used here comes from the python-json-logger library (imported as pythonjsonlogger), which formats your logs as JSON. JSON is a popular data format used to exchange data between different systems. Structuring your logs this way makes them easier to parse and query.
In addition to emitting log messages, it's important to format them in an informative and easy-to-understand way. The logging library provides a way to customize the format of log messages, making it possible to include relevant information, such as timestamps, severity levels, source code locations, and so on. You can use the Formatter class to define the format of log messages.
By inheriting from JsonFormatter and overriding certain methods, I'm able to expand its functionality.
import json
from datetime import datetime, timezone

from pythonjsonlogger import jsonlogger

DATE_FORMAT_TIMEZONE = "%Y-%m-%dT%H:%M:%S.%fZ"


class JsonFormatter(jsonlogger.JsonFormatter):
    EXTRA_PREFIX = "extra_"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def add_fields(self, log_record, record, message_dict):
        super().add_fields(log_record, record, message_dict)
        # build a UTC timestamp from the record's creation time
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        log_record["timestamp"] = created.strftime(DATE_FORMAT_TIMEZONE)
        log_record["level"] = record.levelname
        log_record["type"] = "log"
        log_record["level_num"] = record.levelno
        log_record["logger_name"] = record.name
        trace = self._get_trace_id()
        if trace:
            log_record["trace_id"] = trace
        self.set_extra_keys(record, log_record, self._skip_fields)

    @staticmethod
    def is_private_key(key):
        return hasattr(key, "startswith") and key.startswith("_")

    @staticmethod
    def is_extra_key(key):
        return hasattr(key, "startswith") and key.startswith(JsonFormatter.EXTRA_PREFIX)

    @staticmethod
    def set_extra_keys(record, log_record, reserved):
        """
        Add the extra data to the log record.
        prefix will be added to all custom tags.
        """
        record_items = list(record.__dict__.items())
        records_filtered_reserved = [item for item in record_items if item[0] not in reserved]
        records_filtered_private_attr = [item for item in records_filtered_reserved if
                                         not JsonFormatter.is_private_key(item[0])]
        for key, value in records_filtered_private_attr:
            if not JsonFormatter.is_extra_key(key):
                # nested structures are stored as JSON strings
                if isinstance(value, (dict, list)):
                    value = json.dumps(value)
                new_key_name = f"{JsonFormatter.EXTRA_PREFIX}{key}"
                log_record[new_key_name] = value
                # drop the unprefixed copy of the key
                log_record.pop(key, None)

    @staticmethod
    def _get_trace_id():
        """
        The trace id can be used for tracing logs across multiple services.
        It's fetched from the headers of the request.
        Should be implemented according to the tracing mechanism of the service.
        e.g in flask or fastapi:
        trace_id = request.headers.get("X-Trace-Id")
        """
        return "example_trace_id"
The Logger Handler
By default, log messages of level WARNING and above are written to the console (stderr). To configure the logging system to emit log messages to other destinations, you need to create a handler and attach it to the logger. For example, to emit log messages to a file, you can use the FileHandler class.
import logging
logger = logging.getLogger(__name__)
# create a file handler
file_handler = logging.FileHandler("logs.txt")
# add the file handler to the logger
logger.addHandler(file_handler)
# set the log level
logger.setLevel(logging.DEBUG)
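Handlers can also carry their own level, so one logger can feed several destinations with different verbosity. The sketch below assumes two in-memory streams standing in for real destinations such as files; the logger name and messages are made up.

```python
import io
import logging

logger = logging.getLogger("handler_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger

# In-memory streams stand in for real destinations such as files.
all_events = io.StringIO()
errors_only = io.StringIO()

verbose_handler = logging.StreamHandler(all_events)
verbose_handler.setLevel(logging.DEBUG)

error_handler = logging.StreamHandler(errors_only)
error_handler.setLevel(logging.ERROR)

logger.addHandler(verbose_handler)
logger.addHandler(error_handler)

logger.info("Cache warmed up")
logger.error("Database unreachable")

print(all_events.getvalue().splitlines())
print(errors_only.getvalue().splitlines())
```

The logger's level decides whether a record is created at all; each handler's level then decides whether that handler writes it.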
Demo: Logging With The New JSON Formatter
You can take a look at the full demo app written with the FastAPI framework:
The following example shows how to create a log record with the JSON formatter:
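import logging
logger = logging.getLogger(__name__)
# attach the JsonFormatter class defined above to a console handler
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.debug("The API call ended successfully", extra={
    "url": "https://example.com/",
    "payload": "{}",
    "method": "GET",
})
By using the extra field and the JSON formatter, each log record will look similar to this (pretty-printed here for readability):
{
    "timestamp": "2023-01-16T10:48:52.822595Z",
    "message": "The API call ended successfully",
    "level": "DEBUG",
    "type": "log",
    "level_num": 10,
    "logger_name": "__main__",
    "trace_id": "example_trace_id",
    "extra_url": "https://example.com/",
    "extra_payload": "{}",
    "extra_method": "GET"
}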
Now it’s simpler to search the log records. Suppose, for example, that you want to know how many GET requests you received in a specific time range. By querying the log records, you can get an accurate number.
# PSEUDOCODE: assumes the log records are stored in a MongoDB-style
# collection, available here as `collection`
from datetime import datetime
start_time = datetime(2022, 1, 1, 0, 0, 0)
end_time = datetime(2022, 1, 31, 23, 59, 59)
# construct the query
query = {
    'extra_method': 'GET',
    'timestamp': {
        '$gte': start_time,
        '$lte': end_time
    }
}
# execute the query and count the results
count = collection.count_documents(query)
print(f"Number of GET requests between {start_time} and {end_time}: {count}")
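If the logs are not in a database, the same kind of question can be answered with plain Python. The following is only a sketch: the log lines are made-up samples in the shape produced by the JSON formatter above, and count_method is a hypothetical helper, not part of any library.

```python
import json
from datetime import datetime

# Made-up log lines in the shape produced by the JSON formatter above.
log_lines = [
    '{"timestamp": "2023-01-16T10:48:52.822595Z", "extra_method": "GET"}',
    '{"timestamp": "2023-01-16T11:02:10.000000Z", "extra_method": "POST"}',
    '{"timestamp": "2023-02-03T09:15:00.000000Z", "extra_method": "GET"}',
]

DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"
start_time = datetime(2023, 1, 1)
end_time = datetime(2023, 1, 31, 23, 59, 59)


def count_method(lines, method, start, end):
    """Count records with the given HTTP method inside [start, end]."""
    count = 0
    for line in lines:
        record = json.loads(line)
        timestamp = datetime.strptime(record["timestamp"], DATE_FORMAT)
        if record.get("extra_method") == method and start <= timestamp <= end:
            count += 1
    return count


print(count_method(log_lines, "GET", start_time, end_time))
```

Because the method lives in its own field rather than inside the message text, the filter is a simple equality check instead of a fragile string search.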
By following the steps outlined in this blog post, you should now be able to create informative logs in Python. Logging is an essential skill for any programmer, and when done correctly, it can save you a lot of time and headaches.
Eventually, you should have a robust and flexible logging system that is easy to use and provides detailed and helpful information. To summarize, here are the key things to remember when creating informative logs in Python:
- Use the correct logging level
- Make sure the log message is informative
- Use the extra field to provide additional information
- Make sure the extra fields are serializable
I hope this blog post has helped you understand how to create informative logs in Python.