TMLL

Trace-Server Machine Learning Library (TMLL) is an automated pipeline that applies Machine Learning techniques to the analyses derived from Trace Server. Its goal is to simplify both primitive trace analyses and complementary ML-based investigations.

TMLL provides pre-built, automated solutions that integrate general Trace-Server analyses (e.g., CPU, memory, or disk usage) with machine learning techniques. This allows for more precise and efficient analysis without requiring deep knowledge of either Trace-Server operations or machine learning. By streamlining the workflow, TMLL helps users identify anomalies, trends, and other performance insights, making trace data easier to use in real-world applications.

Features and Modules

TMLL employs a range of machine learning techniques, from straightforward statistical tests to more sophisticated model-training procedures, to extract insights from analyses produced by Trace Server. These features reduce manual effort by automating the trace analysis process.

Anomaly Detection

Irregularities in system behavior can disrupt operations, often without immediate visibility. The Anomaly Detection module analyzes time-series data to detect deviations from expected patterns and highlights them, so you can address problems proactively.

from tmll.ml.modules.anomaly_detection.anomaly_detection_module import AnomalyDetection

# Initialize the module
ad = AnomalyDetection(client, experiment, outputs)  # See the Quickstart page for how these variables are created

# Find anomalies using a custom method and parameters
anomalies = ad.find_anomalies(method='zscore', zscore_threshold=3)

# Plot the anomalies
ad.plot_anomalies(anomalies, plot_size=(15,8))
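
To make the 'zscore' method and its threshold more concrete, here is a minimal standalone sketch of z-score anomaly detection on a plain list of samples. It is not TMLL's implementation: the function name zscore_anomalies and the sample data are hypothetical, and the module may preprocess or score the data differently.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, threshold=3.0):
    """Return indices of points whose absolute z-score exceeds `threshold`.

    Hypothetical helper for illustration only; not part of TMLL.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All samples are identical, so nothing deviates from the mean.
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up CPU usage samples (percent) with a single spike at index 10.
cpu_usage = [10.0] * 10 + [100.0] + [10.0] * 9
print(zscore_anomalies(cpu_usage, threshold=3.0))
```

A lower threshold flags more points as anomalous; a higher one flags only the most extreme deviations.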

Memory Leak Detection

Memory leaks can quietly deplete a system's resources, leading to inefficiencies and potential instability. The Memory Leak Detection module analyzes memory usage and allocation patterns, identifies potential leaks, and computes related metrics to help you locate areas of concern.

from tmll.ml.modules.anomaly_detection.memory_leak_detection_module import MemoryLeakDetection

# Initialize the module
mld = MemoryLeakDetection(client, experiment)  # See the Quickstart page for how these variables are created

# Find the memory leaks
leaks = mld.analyze_memory_leaks()

# Plot the memory leaks
mld.plot_memory_leaks_analysis(leaks)
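
One simple signal that a leak detector can look at is a sustained upward trend in memory usage. The sketch below fits a least-squares line to usage samples and compares its slope with a threshold. It is an illustrative heuristic only, not the analysis that analyze_memory_leaks performs; the function names, the min_slope parameter, and the data are all hypothetical.

```python
def memory_growth_slope(timestamps, usage):
    """Least-squares slope of memory usage over time (usage units per time unit)."""
    n = len(timestamps)
    if n < 2 or n != len(usage):
        raise ValueError("need at least two samples of equal length")
    t_mean = sum(timestamps) / n
    u_mean = sum(usage) / n
    cov = sum((t - t_mean) * (u - u_mean) for t, u in zip(timestamps, usage))
    var = sum((t - t_mean) ** 2 for t in timestamps)
    if var == 0:
        raise ValueError("timestamps must not all be identical")
    return cov / var


def looks_like_leak(timestamps, usage, min_slope=1.0):
    """Flag a series whose usage grows faster than `min_slope` (assumed threshold)."""
    return memory_growth_slope(timestamps, usage) > min_slope


# Made-up samples: time in seconds, memory in MB.
timestamps = [0, 1, 2, 3, 4, 5]
leaking = [100, 110, 121, 129, 140, 150]  # steadily growing
stable = [100, 102, 99, 101, 100, 101]    # fluctuating around a constant

print(looks_like_leak(timestamps, leaking))
print(looks_like_leak(timestamps, stable))
```

A slope check like this cannot distinguish a leak from legitimate growth such as a cache warming up, which is why allocation patterns matter as well.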