There are many ways to introduce and teach Temporal based on your background. Temporal doesn't have a monopoly on explaining Temporal.
Set up Grafana with Temporal Cloud observability to view metrics
Temporal Cloud and SDKs emit metrics that can be used to monitor performance and troubleshoot errors.
Temporal Cloud emits metrics through a Prometheus HTTP API endpoint. You can use this endpoint directly as a Prometheus data source in Grafana, or query it to export Cloud metrics to any observability platform.
The open-source SDKs require you to set up a Prometheus scrape endpoint for Prometheus to collect and aggregate the Worker and Client metrics.
This article describes how to set up your Temporal Cloud and SDK metrics and use them as data sources in Grafana.
The process for setting up observability includes the following steps:
- Create or get your Prometheus endpoint for Temporal Cloud metrics and enable SDK metrics.
  - For Temporal Cloud, generate a Prometheus HTTP API endpoint on Temporal Cloud using valid certificates.
  - For SDKs, expose a metrics endpoint where Prometheus can scrape SDK metrics and run Prometheus on your host. The examples in this article describe running Prometheus on your local machine where you run your application code.
- Run Grafana and set up data sources for Temporal Cloud and SDK metrics in Grafana. The examples in this article describe running Grafana on your local host where you run your application code.
- Create dashboards in Grafana to view Temporal Cloud metrics and SDK metrics. Temporal provides sample community-driven Grafana dashboards for Cloud and SDK metrics that you can use and customize according to your requirements.
If you're following along with the examples provided here, ensure that you have the following:
- Root CA certificates and end-entity certificates. See Certificate requirements for details.
- Connections to Temporal Cloud set up using an SDK of your choice, with some Workflows running on Temporal Cloud. See Connect to a Cluster for details.
- Prometheus and Grafana installed.
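Before adding the Temporal Cloud endpoint as a data source in Grafana, it can help to confirm that the endpoint and certificates work together. The following is a minimal sketch that sends one instant query to a Prometheus HTTP API endpoint using only the Python standard library. The endpoint value is a placeholder, and the function names, file paths, and query shown are assumptions for illustration, not part of any Temporal SDK.

```python
import json
import ssl
import urllib.parse
import urllib.request

# Placeholder (assumption): substitute the endpoint generated for your account.
PROMETHEUS_ENDPOINT = "https://<account-id>.tmprl.cloud/prometheus"


def build_query_url(endpoint: str, promql: str) -> str:
    """Return the instant-query URL (/api/v1/query) for a Prometheus HTTP API endpoint."""
    base = endpoint.rstrip("/")
    return f"{base}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def query_cloud_metrics(endpoint: str, promql: str, cert_file: str, key_file: str) -> dict:
    """Run one instant query, presenting an end-entity certificate for mTLS."""
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    url = build_query_url(endpoint, promql)
    with urllib.request.urlopen(url, context=context, timeout=10) as response:
        return json.load(response)


# Hypothetical usage; the file paths and the PromQL expression are examples only:
# result = query_cloud_metrics(PROMETHEUS_ENDPOINT, "up", "client.pem", "client.key")
# print(result["status"])
```

A successful call returns the standard Prometheus JSON envelope, so a `status` of `success` suggests that the same endpoint and certificates should also work when configured as a Grafana data source.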
Temporal Failures
Background: What is a Failure?
A Failure is Temporal's representation of various types of errors that occur in the system.
There are different types of Failures, and each has a different type in the SDKs and different information in the protobuf messages (which are used to communicate with the Temporal Cluster and appear in Event History).
Troubleshoot deadline-exceeded error
All requests made to the Temporal Cluster by the Client or Worker are gRPC requests.
Sometimes, when these frontend requests can't be completed, you'll see this particular error message: Context: deadline exceeded.
Network interruptions, timeouts, server overload, and Query errors are some of the causes of this error.
The following sections discuss the nature of this error and how to troubleshoot it.
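To make the nature of the error concrete, here is a toy model of a request with a deadline. It does not use gRPC or a Temporal SDK: `slow_frontend_call` is a hypothetical stand-in for a frontend request, and `asyncio.wait_for` plays the role of the deadline. The point is only that the caller gives up once the deadline passes, whatever the reason for the delay.

```python
import asyncio


async def slow_frontend_call(latency_s: float) -> str:
    """Hypothetical stand-in for a request that takes latency_s seconds to answer."""
    await asyncio.sleep(latency_s)
    return "ok"


async def call_with_deadline(latency_s: float, deadline_s: float) -> str:
    """Return the reply, or report that the deadline passed before a reply arrived."""
    try:
        return await asyncio.wait_for(slow_frontend_call(latency_s), timeout=deadline_s)
    except asyncio.TimeoutError:
        return "deadline exceeded"


def demo() -> tuple:
    """Run one call that beats its deadline and one that does not."""
    fast = asyncio.run(call_with_deadline(latency_s=0.01, deadline_s=0.5))
    slow = asyncio.run(call_with_deadline(latency_s=0.5, deadline_s=0.01))
    return fast, slow
```

In this model the slow call fails even though nothing is wrong with the request itself, which is why the causes listed above (network interruptions, timeouts, server overload) all surface as the same message.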
Set up Prometheus and Grafana to view metrics
The Temporal Cluster and SDKs emit metrics that can be used to monitor performance and troubleshoot issues. To collect and aggregate these metrics, you can use one of the following tools:
- Prometheus
- StatsD
- M3
After you enable your monitoring tool, you can relay these metrics to any monitoring and observability platform.
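If you choose Prometheus, a quick way to confirm that a scrape endpoint is emitting metrics is to fetch it and read the samples yourself. The sketch below uses only the Python standard library and parses the Prometheus text exposition format in a simplified way: it assumes one sample per line with no trailing timestamp. The URL in the usage comment and the metric names in the sample text are hypothetical.

```python
import urllib.request


def parse_samples(exposition_text: str) -> dict:
    """Map each series (name plus labels) to its value, skipping comments and blank lines."""
    samples = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Simplification (assumption): the value is the last space-separated field.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_samples(url: str) -> dict:
    """Fetch a scrape endpoint and parse its body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_samples(response.read().decode("utf-8"))


# Made-up sample text for illustration; real metric names will differ.
SAMPLE = """\
# HELP demo_requests_total Hypothetical counter.
# TYPE demo_requests_total counter
demo_requests_total{operation="StartWorkflowExecution"} 12
demo_latency_seconds_sum 3.5
"""

# Hypothetical usage against a locally exposed endpoint:
# print(fetch_samples("http://localhost:9000/metrics"))
```

If the parsed dictionary is empty, the endpoint is reachable but nothing is being emitted yet, which usually points at the metrics configuration rather than at Prometheus.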
Non-determinism issues for Run Ids
The current Run Id is mutable and can change during a Workflow Retry. You should not rely on storing the current Run Id, or using it for any logical choices, because a Workflow Retry changes the Run Id and can lead to non-determinism issues.
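The effect can be illustrated with a toy model that does not use a Temporal SDK. `ToyExecution` is a hypothetical class, and its `retry` method only mimics the behavior described above: the Workflow Id stays the same while the Run Id is replaced. Any state keyed by the old Run Id is then no longer reachable from the retried execution.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ToyExecution:
    """Hypothetical model of a Workflow Execution (not an SDK type)."""

    workflow_id: str
    run_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def retry(self) -> "ToyExecution":
        """Model a Workflow Retry: same Workflow Id, fresh Run Id."""
        return ToyExecution(workflow_id=self.workflow_id)


first = ToyExecution(workflow_id="order-42")

# State stored under each kind of identifier before the retry.
state_by_run_id = {first.run_id: "step-1 done"}
state_by_workflow_id = {first.workflow_id: "step-1 done"}

second = first.retry()

# The Run Id lookup misses after the retry; the Workflow Id lookup still hits.
found_by_run_id = state_by_run_id.get(second.run_id)
found_by_workflow_id = state_by_workflow_id.get(second.workflow_id)
```

Under this model, logic that branches on a stored Run Id behaves differently before and after a retry, which is the kind of divergence that leads to non-determinism issues.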
All the ways to run a Temporal Cluster
There are many ways to run a Temporal Cluster on your own. However, the right way for you depends entirely on your use case and where you plan to run it. This article aims to maintain a comprehensive list of all the ways we know of.
Migrate visibility data from ES6
We added support for Elasticsearch v7+ (ES7) in the v1.7.0 update to the Temporal Server. Elasticsearch v7 introduces several breaking changes, including the removal of mapping types. These changes make Elasticsearch v6 incompatible with Elasticsearch v7.
Temporal Platform limits sheet
Running into limits can cause unexpected failures, so be mindful when you design your systems. Here is a list of many hard (error) or soft (warn) limits that you could encounter while using the Temporal Platform.
Cadence to Temporal migration highlights
This page highlights the key differences between Cadence and Temporal that you will need to account for when migrating.