xet-core/utils
Di Xiao c4111eb6da Feature to monitor client process system usage (#617)
Introduces a client benchmark utility to track system resource usage
(CPU, memory, disk I/O, and network I/O) of a process, so we don't need
to write scripts to capture usage stats according to different OS
standards. This is especially helpful when benchmarking on Python
notebook instances, e.g. Google Colab, where a system monitor is not
easily accessible or running a separate monitor script is impractical.

# Usage #
Enable monitoring by setting `HF_XET_SYSTEM_MONITOR_ENABLED` to true,
and set the usage sample interval with
`HF_XET_SYSTEM_MONITOR_SAMPLE_INTERVAL`. By default, metrics are output
to the tracing stream at `INFO` level. To redirect them to a separate
file instead, set the sample log path with
`HF_XET_SYSTEM_MONITOR_LOG_PATH`.
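
For notebook use, the three variables can also be set from Python before the client starts. The following is a minimal sketch, not part of this change: `enable_system_monitor` is a hypothetical helper, the accepted format of the sample interval is not stated here (so the value is passed through unchanged), and it assumes the client reads these variables when it initializes.

```python
import os


def enable_system_monitor(log_path=None, sample_interval=None, env=os.environ):
    """Set the monitor variables in `env` and return the names that were set.

    Hypothetical helper for illustration. Only the variable names come from
    the usage notes; the value formats are assumptions.
    """
    env["HF_XET_SYSTEM_MONITOR_ENABLED"] = "true"
    changed = ["HF_XET_SYSTEM_MONITOR_ENABLED"]

    if sample_interval is not None:
        # The expected interval format is not documented here, so the
        # caller's value is passed through as a string without validation.
        env["HF_XET_SYSTEM_MONITOR_SAMPLE_INTERVAL"] = str(sample_interval)
        changed.append("HF_XET_SYSTEM_MONITOR_SAMPLE_INTERVAL")

    if log_path is not None:
        # Redirect samples to a separate file instead of the tracing stream.
        env["HF_XET_SYSTEM_MONITOR_LOG_PATH"] = str(log_path)
        changed.append("HF_XET_SYSTEM_MONITOR_LOG_PATH")

    return changed
```

Calling `enable_system_monitor(log_path="monitor.jsonl")` at the top of a notebook, before the client library is imported, is the intended usage under the assumption above.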

# Output #
The stats are output in JSON format, which can be queried using tools
like `jq`, e.g.
1. Trace of peak memory usage: `jq '.memory.peak_used_bytes' [HF_XET_SYSTEM_MONITOR_LOG_PATH]`
2. Trace of disk write speed: `jq '.disk.average_write_speed' [HF_XET_SYSTEM_MONITOR_LOG_PATH]`
3. Trace of network receive speed: `jq '.network.average_rx_speed' [HF_XET_SYSTEM_MONITOR_LOG_PATH]`
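
Where `jq` is not available (for example inside a notebook), the same traces can be extracted in Python. This is a rough sketch that assumes the log file holds one JSON object per line; the layout is not confirmed here, and `read_samples` and `trace` are hypothetical helper names. Only the field names are taken from the `jq` queries above.

```python
import json


def read_samples(path):
    """Parse a log file that is assumed to contain one JSON object per line."""
    samples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                samples.append(json.loads(line))
            except json.JSONDecodeError:
                # Skip lines that are not valid JSON (e.g. interleaved text).
                continue
    return samples


def trace(samples, section, field):
    """Return the values of samples[i][section][field], skipping samples without it."""
    values = []
    for sample in samples:
        group = sample.get(section) if isinstance(sample, dict) else None
        if isinstance(group, dict) and field in group:
            values.append(group[field])
    return values
```

For instance, `trace(read_samples(path), "memory", "peak_used_bytes")` mirrors the first query, and wrapping it in `max(...)` gives a single peak value for the run.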
2026-02-27 13:36:31 -08:00

Proto

Directory where gproto files will be created

Operational helpers

  • Logs, metrics and traces
  • Configuration
  • Access to AWS services (e.g. S3)

Examples

Identify which cas_server owns a particular key

cargo run --example infra -- --server-name cas-lb.xetbeta.com:5000 --key bar
Host: 35.89.208.89
Load Stats: SystemStatus { timestamp: "2022-07-06T19:15:00Z", cpu_utilization: 0.3416666833712037 }
Host: 54.245.178.249
Load Stats: SystemStatus { timestamp: "2022-07-06T19:15:00Z", cpu_utilization: 0.2943333333333333 }
Key bar gets hashed to server "54.245.178.249"
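
The hashing scheme used by the `infra` example is not described here. As a generic illustration of how a key can be mapped deterministically to one of several hosts, the sketch below uses rendezvous (highest-random-weight) hashing. It is a swapped-in technique for illustration only, not the crate's algorithm, and `pick_server` is a hypothetical name.

```python
import hashlib


def pick_server(key, hosts):
    """Return the host with the highest hash score for `key`.

    Rendezvous hashing: every host is scored against the key, and the
    highest score wins. This is an illustrative stand-in, not the scheme
    used by the `infra` example.
    """
    if not hosts:
        raise ValueError("hosts must not be empty")

    def score(host):
        # Hash the (host, key) pair and read the first 8 bytes as an integer.
        digest = hashlib.sha256(f"{host}|{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(hosts, key=score)
```

With this scheme the result does not depend on the order in which hosts are listed, and removing a host only remaps the keys that host owned.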