Prometheus Setup
Prometheus is the time-series database that stores and queries metrics collected by gNMIc.
Overview
Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability:
- Pull-based metric collection (scraping)
- Powerful query language (PromQL)
- Time-series data storage
- Built-in alerting capabilities
- Service discovery support
In this lab:
- Container: prometheus
- Management IP: 10.77.1.13
- Web UI Port: 9090
- Config File: configs/prometheus/prometheus.yml
Configuration File
The Prometheus configuration is minimal and focused.
Configuration Breakdown
- Global Settings
- Scrape Configs
- Default: 5 seconds
- Matches gNMIc's sample-interval for consistent data
- Lower values = more data points but higher storage/CPU usage
- Higher values = less granular but more efficient
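To put numbers on the scrape interval trade-off, the sketch below counts how many samples a single series accumulates per day. The 15s and 60s intervals are only comparison points chosen for illustration; 5s is the lab default.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(scrape_interval_s: int) -> int:
    """Samples one time series accumulates per day at the given scrape interval."""
    if scrape_interval_s <= 0:
        raise ValueError("scrape interval must be positive")
    return SECONDS_PER_DAY // scrape_interval_s


# 5s is the lab default; 15s and 60s are illustrative alternatives.
for interval in (5, 15, 60):
    print(f"{interval:>2}s interval: {samples_per_day(interval)} samples per series per day")
```

Going from 5s to 15s cuts the sample count per series to a third, at the cost of coarser graphs.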
Data Retention
Prometheus stores data with default retention settings:
| Setting | Default | Description |
|---|---|---|
| Retention time | 15 days | How long to keep data |
| Retention size | No limit | Maximum storage size |
| Storage path | /prometheus | Data directory |
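Retention is controlled by the Prometheus command-line flags --storage.tsdb.retention.time and --storage.tsdb.retention.size, which in this lab would be set in lab.yml.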
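For a rough idea of how retention translates into disk usage, the Prometheus storage documentation gives the estimate retention seconds × ingested samples per second × bytes per sample, with 1 to 2 bytes per sample on average. The sketch below applies that formula. The figure of 10,000 active series is a made-up placeholder, not a measurement from this lab.

```python
def estimated_disk_bytes(retention_days: float,
                         active_series: int,
                         scrape_interval_s: float,
                         bytes_per_sample: float = 2.0) -> float:
    """Rough disk estimate: retention seconds * samples per second * bytes per sample."""
    retention_seconds = retention_days * 86400
    ingested_samples_per_second = active_series / scrape_interval_s
    return retention_seconds * ingested_samples_per_second * bytes_per_sample


# Assumptions: 15 day default retention, 5s scrape interval from this page,
# and a hypothetical 10,000 active series.
size = estimated_disk_bytes(retention_days=15, active_series=10_000, scrape_interval_s=5)
print(f"approx. {size / 1e9:.2f} GB")
```

Treat the result as an order-of-magnitude guide; actual usage depends on how well the data compresses.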
Access Prometheus Web UI
Prometheus includes a built-in web interface, served on port 9090.
Key Web UI Features
- Graph: execute PromQL queries and visualize results
- Targets
- Service Discovery
- Configuration
To run a query from the Graph tab:
- Navigate to Graph tab
- Enter a query in the expression box
- Click Execute
- View table or graph visualization
PromQL Query Language
Prometheus Query Language (PromQL) is used to query and aggregate metrics.
Basic Queries
Common Functions
- rate() - Calculate per-second rate
- irate() - Instant rate. Reacts faster than rate() but can be volatile.
- sum() - Aggregate values
- avg() - Average values
- increase() - Total increase. Similar to rate() but returns total change, not per-second.
Example Queries for BNG Lab
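To make the difference between rate(), irate() and increase() concrete, the following sketch mimics them on a hand-made counter series. It is a deliberate simplification: the real PromQL functions also handle counter resets and extrapolate to the window boundaries, which this code ignores. The sample values are invented.

```python
# Each sample is (timestamp_seconds, counter_value); the values are made up.
samples = [(0, 1000), (5, 1500), (10, 2000), (15, 2500), (20, 6500)]


def _check(window):
    if len(window) < 2:
        raise ValueError("need at least two samples in the window")


def simple_rate(window):
    """Average per-second growth between the first and last sample."""
    _check(window)
    (t0, v0), (t1, v1) = window[0], window[-1]
    return (v1 - v0) / (t1 - t0)


def simple_irate(window):
    """Per-second growth between the last two samples only."""
    _check(window)
    (t0, v0), (t1, v1) = window[-2], window[-1]
    return (v1 - v0) / (t1 - t0)


def simple_increase(window):
    """Total growth across the window (no extrapolation)."""
    _check(window)
    return window[-1][1] - window[0][1]


print("rate    :", simple_rate(samples))
print("irate   :", simple_irate(samples))
print("increase:", simple_increase(samples))
```

The burst in the last interval pushes the irate-style value to 800 per second, while the rate-style average over the whole window stays at 275 per second. That is the volatility noted above.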
Querying from Command Line
You can query Prometheus using its HTTP API.
Monitoring Prometheus
Check Scrape Health
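Prometheus records scrape health in the up metric: 1 when the last scrape of a target succeeded, 0 when it failed. The sketch below builds the HTTP API URL for an instant query and picks the failing targets out of a response. It assumes the API is reachable at the management IP and port listed on this page. The response body is a hand-written sample in the API's documented shape, and the job and instance names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumption: management IP and Web UI port from this page.
PROMETHEUS_URL = "http://10.77.1.13:9090"


def instant_query_url(expr: str, base_url: str = PROMETHEUS_URL) -> str:
    """URL for an instant query against /api/v1/query."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


def down_targets(response_body: str) -> list:
    """Instances whose 'up' sample is 0 in an instant-query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    down = []
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # sample values arrive as strings
        if float(value) == 0:
            down.append(series["metric"].get("instance", "<unknown>"))
    return down


# Hand-written sample response; instance and job names are hypothetical.
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "gnmic", "instance": "gnmic:9804"},
             "value": [1700000000.0, "1"]},
            {"metric": {"__name__": "up", "job": "other", "instance": "other:9100"},
             "value": [1700000000.0, "0"]},
        ],
    },
})

print(instant_query_url("up"))
print(down_targets(sample_body))
```

To run it against a live instance, fetch the printed URL (for example with urllib.request.urlopen) and pass the response text to down_targets.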
Query Statistics
Verify Data Collection
Performance Tuning
Adjust Scrape Interval
Scrape Timeout
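Prometheus rejects a configuration whose scrape timeout is greater than the scrape interval. A minimal pre-check is sketched below. The duration parser is a simplification that only handles single-unit values such as 5s or 1m, not the full Prometheus duration syntax.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Parse a single-unit duration such as '5s' or '1m' into seconds."""
    match = re.fullmatch(r"(\d+)(s|m|h)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def timeout_fits_interval(scrape_interval: str, scrape_timeout: str) -> bool:
    """True when the timeout does not exceed the interval."""
    return parse_duration(scrape_timeout) <= parse_duration(scrape_interval)


print(timeout_fits_interval("5s", "3s"))   # timeout shorter than the interval
print(timeout_fits_interval("5s", "10s"))  # timeout longer than the interval
```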
Relabeling
Troubleshooting
Target is DOWN
Problem: gNMIc target shows as DOWN in Prometheus
Check list:
- Verify gNMIc is running
- Test connectivity from Prometheus
- Check Prometheus logs
- Verify configuration
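The Targets page has an API counterpart at /api/v1/targets, which reports each target's health and last scrape error. The sketch below filters such a response for unhealthy targets. The sample body is hand-written in the API's documented shape, and the scrape URLs and error text are hypothetical.

```python
import json


def unhealthy_targets(targets_body: str) -> list:
    """(scrapeUrl, health, lastError) for every active target that is not 'up'."""
    payload = json.loads(targets_body)
    report = []
    for target in payload["data"]["activeTargets"]:
        if target.get("health") != "up":
            report.append((
                target.get("scrapeUrl", ""),
                target.get("health", "unknown"),
                target.get("lastError", ""),
            ))
    return report


# Hand-written sample; scrape URLs and the error message are hypothetical.
sample_targets = json.dumps({
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://gnmic:9804/metrics", "health": "down",
             "lastError": "connection refused"},
            {"scrapeUrl": "http://localhost:9090/metrics", "health": "up",
             "lastError": ""},
        ],
        "droppedTargets": [],
    },
})

for url, health, error in unhealthy_targets(sample_targets):
    print(f"{url}: {health} ({error})")
```

The lastError text usually narrows the cause down to connectivity, a wrong port or path, or a timeout.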
No data for queries
Problem: Queries return empty results
Solutions:
- Check if metrics exist
- Verify scraping is working (Targets page should show UP)
- Check query syntax and label filters
- Ensure time range includes data points
High memory usage
Problem: Prometheus container using excessive memory
Solutions:
- Reduce retention
- Increase scrape interval in prometheus.yml
- Drop unused metrics in gNMIc or Prometheus config
- Monitor series cardinality
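One way to watch cardinality is the TSDB status endpoint, /api/v1/status/tsdb, which reports the number of series in the head block and the metric names with the most series. The sketch below ranks metrics from such a response. The sample body is hand-written, and the metric names and counts are hypothetical.

```python
import json


def top_metrics_by_series(tsdb_status_body: str, limit: int = 3) -> list:
    """(metric_name, series_count) pairs, largest first, from a TSDB status response."""
    data = json.loads(tsdb_status_body)["data"]
    counts = [(entry["name"], int(entry["value"]))
              for entry in data.get("seriesCountByMetricName", [])]
    counts.sort(key=lambda pair: pair[1], reverse=True)
    return counts[:limit]


# Hand-written sample; metric names and counts are hypothetical.
sample_status = json.dumps({
    "status": "success",
    "data": {
        "headStats": {"numSeries": 1700},
        "seriesCountByMetricName": [
            {"name": "example_interface_in_octets", "value": 1200},
            {"name": "example_subscriber_sessions", "value": 450},
            {"name": "up", "value": 2},
        ],
    },
})

for name, count in top_metrics_by_series(sample_status, limit=2):
    print(f"{name}: {count} series")
```

Metrics at the top of this list are the first candidates for dropping or for trimming labels.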
Slow queries
Problem: PromQL queries taking too long
Solutions:
- Reduce query time range
- Use recording rules for expensive queries (requires config reload)
- Add more specific label filters
- Use irate() instead of rate() for recent data
- Limit results with topk() or bottomk()
Integration with Grafana
Prometheus is pre-configured as a Grafana datasource:
- Navigate to Configuration → Data Sources
- Select Prometheus
- Click Test to verify connection
Grafana uses the same PromQL query language, so queries that work in the Prometheus Web UI can be copied directly into Grafana panels.
Advanced Configuration
Multiple Scrape Targets
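Additional exporters can be scraped by adding further entries under scrape_configs.
Service Discovery
For dynamic environments, Prometheus supports service discovery mechanisms in place of static targets.
Recording Rules
Recording rules pre-compute expensive queries.
Next Steps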
Grafana Dashboards
Visualize metrics with pre-built dashboards
Available Metrics
Complete catalog of Nokia SROS metrics