Step-by-Step: Configure Windows Azure SQL Database Management Pack for System Center 2012

Monitoring Azure SQL with System Center 2012: Management Pack Best Practices

1) Deployment & discovery

  • Use the official Azure SQL Database Management Pack MSI from Microsoft and the accompanying Operations Guide.
  • Run discovery with the wizard using Azure Resource Manager (REST API) where possible; fall back to T‑SQL discovery only for legacy cases.
  • Support multiple subscriptions and servers; create separate discoveries per subscription to limit blast radius.
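
As a rough illustration of per-subscription discovery over the Azure Resource Manager REST API, the sketch below builds one "list SQL servers" request URL per subscription. The subscription IDs are placeholders, the helper names are hypothetical, and the api-version value is an assumption to check against the current Microsoft.Sql REST reference. It does not reproduce how the management pack's wizard performs discovery.

```python
from urllib.parse import urlencode

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2021-11-01"  # assumption: verify against the Microsoft.Sql REST reference


def list_servers_url(subscription_id: str, api_version: str = API_VERSION) -> str:
    """Build the ARM URL that lists SQL logical servers in one subscription."""
    path = f"/subscriptions/{subscription_id}/providers/Microsoft.Sql/servers"
    return f"{ARM_ENDPOINT}{path}?{urlencode({'api-version': api_version})}"


def discovery_plan(subscription_ids):
    """Return one discovery entry per subscription so a failure stays isolated."""
    return {sub: list_servers_url(sub) for sub in subscription_ids}


if __name__ == "__main__":
    # Placeholder subscription IDs, for illustration only.
    for sub, url in discovery_plan(["sub-prod-0001", "sub-test-0002"]).items():
        print(sub, "->", url)
```

Keeping one entry per subscription mirrors the "separate discoveries" advice: a revoked credential or throttled subscription affects only its own entry.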

2) Authentication & Run As

  • Prefer Azure AD authentication with a service principal for REST API access. Grant the principal least‑privilege, read‑only access, for example the built‑in Reader and Monitoring Reader roles, scoped to the monitored subscriptions or resource groups.
  • Configure Run As accounts and Run As profiles securely, and map them only to the management pack objects that need them.
  • Store credentials in SCOM Run As accounts and test connectivity after import.
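
To make the service principal flow concrete, here is a minimal sketch of the OAuth 2.0 client-credentials token request that a service principal uses to call Azure Resource Manager. The function name is hypothetical and the tenant, client ID, and secret are supplied by the caller; in SCOM these values would live in a Run As account rather than in a script. The request is only prepared here, not sent.

```python
import urllib.parse
import urllib.request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Prepare (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # The .default scope asks for the roles already granted to the principal.
        "scope": "https://management.azure.com/.default",
    }
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")
```

Passing the returned request to `urllib.request.urlopen` would return a JSON document containing an `access_token`, which is then sent as a Bearer token on management API calls. A failed token request is a quick way to spot a wrong secret or tenant when testing connectivity after import.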

3) Metrics & polling strategy

  • Use REST API collection for lightweight, reliable metric pulls; T‑SQL queries add deeper telemetry but increase load.
  • Suggested poll intervals: 60–300 seconds for critical health and availability checks; 300–900 seconds for lower‑priority performance metrics, to reduce API quota consumption and SCOM load.
  • Stagger discovery and collection schedules across agents to avoid bursts.
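
The staggering idea can be sketched as a small calculation that spreads start offsets evenly across one polling interval. The function and agent names are hypothetical, and how an offset is actually applied (for example through overrides on the workflow's schedule, where the management pack exposes them) depends on the workflow.

```python
def staggered_offsets(agents, interval_seconds):
    """Spread agent start times evenly across one polling interval."""
    if not agents:
        return {}
    step = interval_seconds / len(agents)
    # Agent i starts i * step seconds into the interval.
    return {agent: round(i * step) for i, agent in enumerate(agents)}


if __name__ == "__main__":
    # Four hypothetical agents sharing a 300-second interval.
    print(staggered_offsets(["agent-a", "agent-b", "agent-c", "agent-d"], 300))
```

With four agents and a 300-second interval the offsets come out as 0, 75, 150, and 225 seconds, so no two agents poll at the same moment.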

4) Thresholds & alert tuning

  • Replace default thresholds with environment‑specific values. Configure separate warning/critical thresholds per database or pool when needed.
  • Use overrides to:
    • Exclude known noisy databases, and suppress alerts during maintenance windows.
    • Disable per‑database file growth alerts if many DBs share the same drive (or monitor disk at OS level instead).
  • Leverage alert suppression / dependency model for failover groups and elastic pools to avoid alert storms during planned maintenance.
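
The interaction between default thresholds, per-database overrides, and exclusions can be modeled with a short sketch. This only illustrates the logic; real overrides are configured in the SCOM console or authoring tools, and the threshold values and database names below are illustrative, not management pack defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    warning: float
    critical: float


# Illustrative fallback values, not the management pack's shipped defaults.
DEFAULT = Thresholds(warning=80.0, critical=90.0)


def classify(database, value, overrides=None, excluded=frozenset()):
    """Classify a percentage reading using per-database overrides and exclusions."""
    if database in excluded:
        return "excluded"
    limits = (overrides or {}).get(database, DEFAULT)
    if value >= limits.critical:
        return "critical"
    if value >= limits.warning:
        return "warning"
    return "healthy"


if __name__ == "__main__":
    overrides = {"reporting-db": Thresholds(warning=90.0, critical=97.0)}
    print(classify("orders-db", 85.0, overrides))                      # uses DEFAULT
    print(classify("reporting-db", 85.0, overrides))                   # uses the override
    print(classify("scratch-db", 99.0, overrides, excluded={"scratch-db"}))
```

The same 85% reading is a warning for a database on default thresholds and healthy for one with a looser override, which is the effect environment-specific tuning is meant to achieve.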

5) Key monitors to enable

  • Availability (server & database)
  • DTU, CPU, worker, and IO usage, measured as a percentage of the service tier limits
  • Long‑running queries and maximum transaction time
  • Failed connections, deadlocks, throttling counts
  • Elastic pool and geo‑replication health
  • Transaction log usage and growth events

6) Custom queries & app‑specific checks

  • Use custom query support for application‑specific availability checks and business‑critical transactions.
  • Add exclude lists (application/database/query text) to long‑running query rules to reduce noise.
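
An exclude list for long-running query rules amounts to a filter applied before alerting. The sketch below assumes a hypothetical record shape (application, database, text, duration_seconds) and hypothetical exclude-list keys; the management pack's own configuration format may differ.

```python
def should_alert(query, threshold_seconds, exclude):
    """Return True when a long-running query is not covered by an exclude list."""
    if query["duration_seconds"] < threshold_seconds:
        return False
    if query["application"] in exclude.get("applications", ()):
        return False
    if query["database"] in exclude.get("databases", ()):
        return False
    # Query text is matched as a case-insensitive substring.
    text = query["text"].lower()
    return not any(fragment.lower() in text for fragment in exclude.get("text_fragments", ()))


if __name__ == "__main__":
    exclude = {
        "applications": ["nightly-etl"],
        "databases": ["staging"],
        "text_fragments": ["rebuild index"],
    }
    query = {
        "application": "web-shop",
        "database": "orders",
        "text": "SELECT * FROM dbo.Orders WHERE Status = 'Open'",
        "duration_seconds": 240,
    }
    print(should_alert(query, threshold_seconds=120, exclude=exclude))
```

Checking the duration first keeps the common case cheap, and each exclusion dimension stays independent so one list can be tuned without touching the others.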

7) Dashboards & runbooks

  • Create SCOM dashboards focused on: availability, performance hotspots, elastic pool utilization, and replication status.
  • Integrate alerts with runbooks/automation (Azure Automation / Logic Apps) for automated remediation of common issues (scale up, restart, failover).

8) Capacity planning & cost control

  • Monitor CPU, memory, IO, and egress/ingress bandwidth trends for right‑sizing.
  • Track elastic pool utilization to optimize DTU/vCore allocation and avoid unnecessary scale costs.
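
One simple way to turn utilization trends into a right-sizing hint is to compare a high percentile of the collected samples against lower and upper bounds. The sketch below uses a nearest-rank percentile; the 40% and 80% bounds and the function names are illustrative assumptions, not recommendations from the management pack.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def sizing_hint(utilization_pct, low=40.0, high=80.0):
    """Suggest a direction based on the 95th percentile of utilization samples."""
    p95 = percentile(utilization_pct, 95)
    if p95 > high:
        return "scale-up candidate"
    if p95 < low:
        return "scale-down candidate"
    return "keep current size"


if __name__ == "__main__":
    # Hypothetical elastic pool utilization samples, in percent.
    print(sizing_hint([22, 25, 31, 28, 35, 30, 27, 33]))
```

Using a high percentile rather than the mean avoids sizing a pool for its quiet hours while still ignoring isolated spikes.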

9) Security & governance

  • Limit who can change management pack overrides and Run As accounts.
  • Audit Run As profile usage and rotate service principal credentials regularly.
  • Use least privilege access for monitoring service principals.
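
Regular rotation is easier to enforce with a check that flags credentials approaching expiry. The sketch below assumes the expiry date of each service principal secret is already known (for example from an inventory export); the function name and the 30-day warning window are illustrative.

```python
from datetime import date


def rotation_due(expires_on: date, today: date, warn_days: int = 30) -> bool:
    """Return True when a credential expires within the warning window."""
    return (expires_on - today).days <= warn_days


if __name__ == "__main__":
    # Hypothetical expiry date checked against the current date.
    print(rotation_due(date(2030, 1, 31), date.today()))
```

An already expired credential yields a negative day count and is therefore flagged as well.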

10) Maintenance & lifecycle

  • Keep the management pack and Operations Guide updated (import updates from Microsoft).
  • Test MP changes in a staging SCOM environment before production.
  • Review overrides, suppression rules, and alert noise quarterly.

Quick table: recommended defaults

Area                   | Recommended setting
Discovery method       | Azure Resource Manager (REST API)
Auth method            | Azure AD service principal (least privilege)
Polling (critical)     | 60–300 seconds
Polling (non‑critical) | 300–900 seconds
Alert tuning           | Per‑DB thresholds + overrides/exclusions
Long‑running queries   | Enable with app/db/query exclude lists
Update cadence         | Quarterly review + apply MP updates from Microsoft
