When you deploy an application with tensorkube deploy or submit a job with tensorkube job queue, Tensorfuse automatically collects its logs and stores them in a centralised location that is configured when you set up Tensorfuse.

By default, Tensorfuse stores logs in Amazon CloudWatch, where they can be queried using the AWS CLI or the AWS Console. This guide walks you through querying your logs and provides a library of queries that you can run quickly to debug common issues.

Pre-requisites

All your logs are accessible from the CloudWatch console. Before running a query in CloudWatch Logs Insights, make sure that the following settings are configured:

  1. Under Select log groups by: Log group name
  2. Under Selection Criteria: /aws/containerinsights/tensorkube/application

Querying application logs

To focus on your application’s logs, use CloudWatch Logs Insights with queries that target your application’s log streams. Copy one of the queries below, replace the placeholders, and run it on the CloudWatch console.

Filter logs by application name

fields @timestamp as Time, coalesce(log_processed.message, log, @message.log) as LogMessage, kubernetes.pod_name as PodName
| filter @logStream like /YOUR_APPLICATION_NAME/ and kubernetes.namespace_name = "YOUR_ENVIRONMENT_NAME"
| sort Time desc
| limit 100
  • Replace YOUR_APPLICATION_NAME with the name of your deployed application. You can get the application name using tensorkube deployment list.
  • Replace YOUR_ENVIRONMENT_NAME with the name of the Tensorfuse environment where the application is deployed. Tensorfuse uses the default environment if no environment is specified.
  • The query retrieves the 100 most recent log entries for the application, newest first.
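If you would rather run this query from code than from the console, the following is a minimal sketch using boto3, the AWS SDK for Python. It is an illustration and not part of Tensorfuse: the function names build_app_query and run_query, the 15-minute default window, and the polling interval are assumptions made for this example. The application name is inserted into the regular expression as-is, so a name containing regex special characters would need escaping.

```python
import time
from datetime import datetime, timedelta, timezone

# Log group from the prerequisites above.
LOG_GROUP = "/aws/containerinsights/tensorkube/application"


def build_app_query(app_name: str, environment: str, limit: int = 100) -> str:
    """Return the 'filter by application name' query with placeholders filled in."""
    return (
        "fields @timestamp as Time, "
        "coalesce(log_processed.message, log, @message.log) as LogMessage, "
        "kubernetes.pod_name as PodName\n"
        f"| filter @logStream like /{app_name}/ "
        f'and kubernetes.namespace_name = "{environment}"\n'
        "| sort Time desc\n"
        f"| limit {limit}"
    )


def run_query(query: str, minutes: int = 15, poll_seconds: float = 1.0) -> list:
    """Run a query over the last `minutes` minutes and return the result rows."""
    import boto3  # imported lazily so build_app_query works without AWS access

    logs = boto3.client("logs")
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    query_id = logs.start_query(
        logGroupName=LOG_GROUP,
        startTime=int(start.timestamp()),  # epoch seconds
        endTime=int(end.timestamp()),
        queryString=query,
    )["queryId"]
    # Poll until the query leaves the Scheduled/Running states.
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            return response["results"]
        time.sleep(poll_seconds)
```

With AWS credentials configured, a call such as run_query(build_app_query("my-app", "my-env")) would return the matching rows; both names here are placeholders.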

Search for error logs

Use this query to find potential issues in your application logs by filtering for error-related keywords.

fields @timestamp as Time, coalesce(log_processed.message, log, @message.log) as LogMessage, kubernetes.pod_name as PodName
| filter @logStream like /YOUR_APPLICATION_NAME/ and kubernetes.namespace_name = "YOUR_ENVIRONMENT_NAME" and @message like /error|exception|fail/
| sort @timestamp desc
| limit 100
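The keyword filter in this query is a regular expression, so it can be adapted to other keywords. The sketch below generates the query for a custom keyword list. The function name build_error_query and its parameters are hypothetical, introduced only for this example. The case_insensitive option relies on the (?i) inline flag, which CloudWatch Logs Insights regular expressions are expected to accept; verify it against the query syntax reference before relying on it.

```python
def build_error_query(
    app_name: str,
    environment: str,
    keywords=("error", "exception", "fail"),
    case_insensitive: bool = False,
    limit: int = 100,
) -> str:
    """Return the 'search for error logs' query for the given keywords."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    for keyword in keywords:
        # Restrict to plain words so a keyword cannot break the /.../ regex literal.
        if not keyword.isalnum():
            raise ValueError(f"keyword must be a plain alphanumeric word: {keyword!r}")
    pattern = "|".join(keywords)
    if case_insensitive:
        # Group the alternation so the flag clearly covers every keyword.
        pattern = f"(?i)({pattern})"
    return (
        "fields @timestamp as Time, "
        "coalesce(log_processed.message, log, @message.log) as LogMessage, "
        "kubernetes.pod_name as PodName\n"
        f"| filter @logStream like /{app_name}/ "
        f'and kubernetes.namespace_name = "{environment}" '
        f"and @message like /{pattern}/\n"
        "| sort @timestamp desc\n"
        f"| limit {limit}"
    )
```

With the default arguments the function reproduces the query above. Passing case_insensitive=True would also match variants such as ERROR or Exception, assuming the inline flag is supported.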

Querying Job Logs

To focus on the logs of your queued jobs, run the following query on the CloudWatch console:

fields @timestamp as Time, coalesce(log_processed.message, log, @message.log) as LogMessage
| filter @logStream like /inference-job/
| sort @timestamp desc
| limit 100

This query retrieves the 100 most recent log entries across all queued jobs, newest first.
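When the same query is run programmatically, for example through boto3's get_query_results call, each result row comes back as a list of field/value pairs rather than a flat record. The sketch below shows one way to turn those rows into printable lines. The helper names rows_to_dicts and format_rows are hypothetical, and the code assumes the Time and LogMessage aliases used in the query above.

```python
def rows_to_dicts(results: list) -> list:
    """Convert get_query_results rows into plain dicts.

    Each row is a list of {"field": ..., "value": ...} cells. The internal
    @ptr cell, when present, is dropped.
    """
    return [
        {cell["field"]: cell["value"] for cell in row if cell["field"] != "@ptr"}
        for row in results
    ]


def format_rows(rows: list) -> list:
    """Render each row as 'Time  LogMessage', tolerating missing fields."""
    return [
        f'{row.get("Time", "?")}  {row.get("LogMessage", "").rstrip()}'
        for row in rows
    ]
```

The .get calls are deliberate: a row may omit a field entirely when none of the coalesce candidates has a value for that log entry.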

You can further modify the queries by using the CloudWatch Logs Insights Query Syntax to filter logs based on specific criteria.

Important: To avoid incurring excessive charges from running large queries, keep the following best practices in mind:

  • Select only the necessary log groups for each query.

  • Always specify the narrowest possible time range for your queries.

  • When you use the console to run queries, cancel all your queries before you close the CloudWatch Logs Insights console page. Otherwise, queries continue to run until completion.

  • When you add a CloudWatch Logs Insights widget to a dashboard, ensure that the dashboard is not refreshing at a high frequency, because each refresh starts a new query.
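For queries started from code, two of these practices (a narrow time range and cancelling queries you no longer need) can be sketched as follows. This is an illustration under assumptions: the helper names recent_window and run_with_deadline, the 15-minute window, and the 30-second deadline are made up for the example. The logs argument is expected to be a boto3 CloudWatch Logs client, for example boto3.client("logs").

```python
import time


def recent_window(minutes: int, now: float = None) -> tuple:
    """Return (start, end) in epoch seconds covering the last `minutes` minutes."""
    end = int(time.time()) if now is None else int(now)
    return end - minutes * 60, end


def run_with_deadline(
    logs,
    log_group: str,
    query: str,
    minutes: int = 15,
    deadline_seconds: float = 30.0,
    poll_seconds: float = 1.0,
) -> dict:
    """Run a query over a narrow window and stop it if it exceeds the deadline."""
    start, end = recent_window(minutes)
    query_id = logs.start_query(
        logGroupName=log_group,  # a single log group, not a broad selection
        startTime=start,
        endTime=end,
        queryString=query,
    )["queryId"]
    give_up_at = time.monotonic() + deadline_seconds
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            return response
        if time.monotonic() >= give_up_at:
            # Cancel explicitly; an abandoned query would otherwise keep running.
            logs.stop_query(queryId=query_id)
            raise TimeoutError(f"query {query_id} stopped after {deadline_seconds}s")
        time.sleep(poll_seconds)
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before pointing it at a real account.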