Usage

Setting up monitoring

You can use uptime monitors such as Pingdom or StatusCake to monitor service status. The /health/ endpoint responds with HTTP 200 if all checks pass and with HTTP 500 if any check fails.
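
For a self-hosted probe, the status-code contract above is all a script needs. The following is a minimal sketch, not part of the library: the function name `is_healthy` and its injectable `opener` parameter are illustrative assumptions.

```python
import urllib.error
import urllib.request


def is_healthy(url, opener=urllib.request.urlopen, timeout=5.0):
    """Return True if the health endpoint answers HTTP 200, False otherwise."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # urllib raises HTTPError for 4xx/5xx responses, including the
        # HTTP 500 returned when a check fails.
        return False
    except OSError:
        # Covers URLError and timeouts: the service could not be reached.
        return False
```

Calling `is_healthy("http://www.example.com/health/")` performs a real request; the `opener` parameter exists only so the function can be exercised without a network.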

For concrete, step-by-step examples of multi-tier endpoint setups including uptime monitoring, container probes, reverse-proxy configuration, and RSS/Atom integration into Slack or Matrix, see the Cookbook.

Getting machine-readable reports

Plain text

For simple monitoring and scripting, you can request plain text output with the Accept HTTP header set to text/plain or pass format=text as a query parameter.

The endpoint will return a plain text response with HTTP 200 if all checks passed and HTTP 500 if any check failed:

$ curl -v -X GET -H "Accept: text/plain" http://www.example.com/health/

> GET /health/ HTTP/1.1
> Host: www.example.com
> Accept: text/plain
>
< HTTP/1.1 200 OK
< Content-Type: text/plain; charset=utf-8

CacheBackend: OK
DatabaseBackend: OK
S3BotoStorageHealthCheck: OK

$ curl -v -X GET "http://www.example.com/health/?format=text"

> GET /health/?format=text HTTP/1.1
> Host: www.example.com
>
< HTTP/1.1 200 OK
< Content-Type: text/plain; charset=utf-8

CacheBackend: OK
DatabaseBackend: OK
S3BotoStorageHealthCheck: OK

This format is particularly useful for command-line tools and simple monitoring scripts that don't need the overhead of JSON parsing.
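
A script can turn such a response into a dictionary with a few lines of string handling. This is a minimal sketch that assumes the `Name: status` line layout shown in the sample output above; `parse_plain_report` is an illustrative name, not a library function.

```python
def parse_plain_report(text):
    """Map each check name to its status string from a text/plain report."""
    report = {}
    for line in text.splitlines():
        # Split on the first colon only, so statuses may contain colons.
        name, separator, status = line.partition(":")
        if separator:  # skip blank or malformed lines
            report[name.strip()] = status.strip()
    return report


sample = """\
CacheBackend: OK
DatabaseBackend: OK
S3BotoStorageHealthCheck: OK
"""
print(parse_plain_report(sample))
```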

JSON

If you want machine-readable status reports you can request the /health/ endpoint with the Accept HTTP header set to application/json or pass format=json as a query parameter.

The backend will return a JSON response:

$ curl -v -X GET -H "Accept: application/json" http://www.example.com/health/

> GET /health/ HTTP/1.1
> Host: www.example.com
> Accept: application/json
>
< HTTP/1.1 200 OK
< Content-Type: application/json

{
    "CacheBackend": "working",
    "DatabaseBackend": "working",
    "S3BotoStorageHealthCheck": "working"
}

$ curl -v -X GET "http://www.example.com/health/?format=json"

> GET /health/?format=json HTTP/1.1
> Host: www.example.com
>
< HTTP/1.1 200 OK
< Content-Type: application/json

{
    "CacheBackend": "working",
    "DatabaseBackend": "working",
    "S3BotoStorageHealthCheck": "working"
}
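
A consumer of the JSON report might list the checks that need attention. The sketch below assumes that any value other than "working" marks a problem; the responses above only show the "working" value, so the failure string in the usage line is hypothetical, as is the function name `failing_checks`.

```python
import json


def failing_checks(body):
    """Return the sorted names of checks whose status is not "working"."""
    payload = json.loads(body)
    return sorted(name for name, status in payload.items() if status != "working")


# Hypothetical response body with one failing check.
body = '{"CacheBackend": "working", "DatabaseBackend": "unavailable: timeout"}'
print(failing_checks(body))
```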

OpenMetrics for Prometheus

For Prometheus monitoring, you can request OpenMetrics format:

$ curl "http://www.example.com/health/?format=openmetrics"

This will return metrics in the OpenMetrics exposition format, which can be scraped by Prometheus.

RSS and Atom feeds

For RSS feed readers and monitoring tools, you can request RSS or Atom format:

$ curl "http://www.example.com/health/?format=rss"
$ curl "http://www.example.com/health/?format=atom"

You can also use the Accept header:

$ curl -H "Accept: application/rss+xml" http://www.example.com/health/
$ curl -H "Accept: application/atom+xml" http://www.example.com/health/

These endpoints always return a 200 status code with health check results in the feed content. Failed checks are indicated by categories and item descriptions.
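
Because the status code is always 200, a feed consumer has to inspect the items themselves. The following sketch parses a generic RSS 2.0 document with the standard library. The sample feed, its category values, and the function name `summarize_rss` are hypothetical; the exact titles and categories the endpoint emits are not shown here, so check a real response before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, shaped like a plain RSS 2.0 document.
SAMPLE_FEED = """\
<rss version="2.0">
  <channel>
    <title>Health Check</title>
    <item>
      <title>DatabaseBackend</title>
      <description>OK</description>
      <category>healthy</category>
    </item>
    <item>
      <title>CacheBackend</title>
      <description>Unavailable: connection refused</description>
      <category>error</category>
    </item>
  </channel>
</rss>
"""


def summarize_rss(xml_text):
    """Return title, categories, and description for every feed item."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "categories": [c.text or "" for c in item.findall("category")],
            "description": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]


for entry in summarize_rss(SAMPLE_FEED):
    print(entry["title"], entry["categories"], entry["description"])
```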

Writing a custom health check

You can write your own health checks by inheriting from HealthCheck and implementing the run method.

`health_check.HealthCheck` `dataclass`

Bases: ABC

Base class for defining health checks.

Subclasses should implement the run method to perform the actual health check logic. The run method can be either synchronous or asynchronous.

Examples:

>>> import dataclasses
>>> from health_check.base import HealthCheck
>>>
>>> @dataclasses.dataclass
... class MyHealthCheck(HealthCheck):
...
...    async def run(self):
...        # Implement health check logic here
...        pass

Subclasses should be dataclasses or implement their own __repr__ method to provide meaningful representations in health check reports.

Warning

The __repr__ method is used in health check reports. Consider setting repr=False for sensitive dataclass fields to avoid leaking sensitive information or credentials.
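
The effect of `repr=False` can be seen with a plain dataclass. This is a minimal sketch: `ApiCheck` is a hypothetical stand-in that does not subclass HealthCheck, so it runs without the library installed, and the `labels` expression mirrors the logic of the `labels` property in the source listing for this class.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class ApiCheck:
    """Hypothetical stand-in for a HealthCheck subclass."""

    url: str
    region: Optional[str] = None
    # Excluded from repr(), and therefore from reports and labels.
    token: str = dataclasses.field(default="", repr=False)


check = ApiCheck(url="https://api.example.com/ping", token="s3cr3t")
print(repr(check))

# Same selection rule as the labels property: fields with repr enabled
# and a value other than None.
labels = {"check": type(check).__name__} | {
    field.name: str(value)
    for field in dataclasses.fields(check)
    if field.repr and (value := getattr(check, field.name)) is not None
}
print(labels)
```

The token appears in neither output, and `region` is dropped from the labels because its value is None.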

Source code in health_check/base.py

@dataclasses.dataclass
class HealthCheck(abc.ABC):
    """
    Base class for defining health checks.

    Subclasses should implement the `run` method to perform the actual health check logic.
    The `run` method can be either synchronous or asynchronous.

    Examples:
        >>> import dataclasses
        >>> from health_check.base import HealthCheck
        >>>
        >>> @dataclasses.dataclass
        >>> class MyHealthCheck(HealthCheck):
        ...
        ...    async def run(self):
        ...        # Implement health check logic here

    Subclasses should be [dataclasses][dataclasses.dataclass] or implement their own `__repr__` method
    to provide meaningful representations in health check reports.

    Warning:
        The `__repr__` method is used in health check reports.
        Consider setting `repr=False` for sensitive dataclass fields
        to avoid leaking sensitive information or credentials.

    """

    @abc.abstractmethod
    async def run(self) -> None:
        """
        Run the health check logic and raise human-readable exceptions as needed.

        Exception must be reraised to indicate the health status and provide context.
        Any unexpected exceptions will be caught and logged for security purposes
        while returning a generic error message.

        Warning:
            Exception messages must not contain sensitive information.

        Raises:
            ServiceWarning: If the service is at a critical state but still operational.
            ServiceUnavailable: If the service is not operational.
            ServiceReturnedUnexpectedResult: If the check performs a computation that returns an unexpected result.

        """
        ...

    def pretty_status(self) -> str:
        """Return a human-readable status string, always 'OK' for the check itself."""
        return "OK"

    @property
    def labels(self) -> dict[str, str]:
        """Return a human-readable label for the check, defaulting to the class name."""
        return {
            "check": self.__class__.__name__,
        } | {
            field.name: str(value)
            for field in dataclasses.fields(self)
            if field.repr and (value := getattr(self, field.name)) is not None
        }

    async def get_result(self, executor: Executor | None = None) -> HealthCheckResult:
        loop = asyncio.get_running_loop()
        start = timeit.default_timer()
        try:
            await self.run() if inspect.iscoroutinefunction(
                self.run
            ) else await loop.run_in_executor(executor, self.run)
        except HealthCheckException as e:
            error = e
        except BaseException:
            logger.exception("Unexpected exception during health check")
            error = HealthCheckException("unknown error")
        else:
            error = None
        return HealthCheckResult(
            check=self,
            error=error,
            time_taken=timeit.default_timer() - start,
        )

`labels` `property`

Return the labels for the check as a dictionary: the class name under the key check, plus every dataclass field that is included in the repr and whose value is not None.

`pretty_status()`

Return a human-readable status string, always 'OK' for the check itself.

`run()` `abstractmethod` `async`

Run the health check logic and raise human-readable exceptions as needed.

Exceptions must be re-raised to indicate the health status and provide context. Any unexpected exception is caught and logged, and a generic error message is returned instead so that internal details are not exposed.

Warning

Exception messages must not contain sensitive information.

Raises:

Type	Description
`ServiceWarning`	If the service is at a critical state but still operational.
`ServiceUnavailable`	If the service is not operational.
`ServiceReturnedUnexpectedResult`	If the check performs a computation that returns an unexpected result.
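
A `run` implementation that follows this contract might look like the sketch below. The exception classes are local stand-ins so the example runs without the library installed; in a real project you would subclass HealthCheck and raise the library's own exceptions of the same names. `QueueDepthCheck`, its fields, and its thresholds are hypothetical.

```python
import asyncio
import dataclasses
from typing import Callable


# Stand-ins for the library's exception classes (assumption: same names).
class HealthCheckException(Exception):
    pass


class ServiceWarning(HealthCheckException):
    pass


class ServiceUnavailable(HealthCheckException):
    pass


@dataclasses.dataclass
class QueueDepthCheck:  # would subclass HealthCheck in a real project
    """Hypothetical check that inspects the depth of a work queue."""

    warn_at: int = 100
    fail_at: int = 1000
    # Callable returning the current depth; hidden from repr and reports.
    get_depth: Callable[[], int] = dataclasses.field(default=lambda: 0, repr=False)

    async def run(self) -> None:
        try:
            depth = self.get_depth()
        except ConnectionError as error:
            # Re-raise with a human-readable message that leaks no secrets.
            raise ServiceUnavailable("Queue backend is unreachable") from error
        if depth >= self.fail_at:
            raise ServiceUnavailable(f"Queue depth {depth} reached {self.fail_at}")
        if depth >= self.warn_at:
            raise ServiceWarning(f"Queue depth {depth} reached {self.warn_at}")


# A healthy check returns None without raising.
asyncio.run(QueueDepthCheck(get_depth=lambda: 5).run())
```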

Django command

You can run the Django management command health_check to perform your health checks from the command line, or periodically from a cron job. To list the available options, run:

django-admin health_check --help

Running the command itself prints one line per check, for example:

Database                 ... OK
CustomHealthCheck        ... Unavailable: Something went wrong!

As with the HTTP endpoint, a critical error causes the command to exit with code 1.
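
A wrapper script, for instance one started by cron, can rely on that exit code. The sketch below is an assumption-laden illustration: `run_health_check` is a hypothetical helper, and the exact arguments the command expects in your project should be taken from its `--help` output.

```python
import subprocess
import sys


def run_health_check(command):
    """Run a command; return (succeeded, captured stdout)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    # Exit code 0 means every check passed; any other code signals a failure.
    return completed.returncode == 0, completed.stdout


if __name__ == "__main__":
    # Assumed invocation; adjust the arguments to match your setup.
    ok, output = run_health_check(["django-admin", "health_check"])
    print(output, end="")
    sys.exit(0 if ok else 1)
```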

Performance tweaks

All checks are executed asynchronously, either via asyncio or via a thread pool, depending on the implementation of the individual checks. This allows for concurrent execution of the IO-bound checks, which reduces the response time.
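
The dispatch can be illustrated with the standard library alone. This is a simplified sketch of the idea, not the library's implementation: coroutine functions are awaited directly, plain functions are handed to the event loop's executor, and `asyncio.gather` runs them all concurrently. The check functions and helper names are hypothetical.

```python
import asyncio
import inspect
import time


async def async_check():
    await asyncio.sleep(0.1)  # simulated non-blocking IO
    return "async ok"


def sync_check():
    time.sleep(0.1)  # simulated blocking IO
    return "sync ok"


async def run_one(check, executor=None):
    """Await coroutine functions directly; run plain functions in a thread."""
    if inspect.iscoroutinefunction(check):
        return await check()
    loop = asyncio.get_running_loop()
    # executor=None selects the event loop's default thread pool.
    return await loop.run_in_executor(executor, check)


async def run_all(checks):
    # gather() preserves the order of its arguments in the result list.
    return await asyncio.gather(*(run_one(check) for check in checks))


results = asyncio.run(run_all([async_check, sync_check, sync_check]))
print(results)
```

Since the three checks overlap, the total run time is close to that of the slowest check rather than the sum of all three.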

The event loop's default executor is used to run synchronous checks (e.g. Database, Mail, or Storage) in a thread pool. This pool usually persists across requests. Reusing its threads keeps checks fast, but the memory they allocate is held permanently. That may be undesirable for some applications, especially with S3Storage, which uses thread-local connections.

This can be mitigated by using a custom executor that creates a new thread pool for each request, which is then cleaned up after the checks are completed. This can be achieved by subclassing HealthCheckView and overriding the get_executor method to return a context manager providing a new ThreadPoolExecutor instance for each request.

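from concurrent.futures import ThreadPoolExecutor

from health_check.views import HealthCheckView


class CustomHealthCheckView(HealthCheckView):
    def get_executor(self):
        # ThreadPoolExecutor is itself a context manager: a fresh pool is
        # created per request and shut down once the checks have completed.
        return ThreadPoolExecutor(max_workers=len(self.checks))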

This approach ensures that each request gets a fresh thread pool, which can help manage memory usage more effectively while still providing the benefits of concurrent execution for synchronous checks.
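
The lifetime of such a per-request pool can be shown in isolation. The sketch below uses only the standard library and hypothetical names (`handle_request`, `sync_check`); it illustrates the context-manager behaviour, not the view's actual code.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def sync_check():
    return "ok"


async def handle_request(checks):
    """Run synchronous checks in a pool that lives for one request only."""
    loop = asyncio.get_running_loop()
    # max_workers must be at least 1, so checks must not be empty.
    with ThreadPoolExecutor(max_workers=len(checks)) as executor:
        results = await asyncio.gather(
            *(loop.run_in_executor(executor, check) for check in checks)
        )
    # Leaving the with block shuts the pool down and releases its threads.
    return results, executor


results, executor = asyncio.run(handle_request([sync_check, sync_check]))
print(results)
```

After the request, the executor no longer accepts work: calling `executor.submit(sync_check)` raises a `RuntimeError`.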
