freundcloud

Best Practices

Overview

This guide provides battle-tested best practices for integrating ServiceNow with DevOps workflows. These recommendations come from real-world implementations across enterprises in finance, healthcare, and technology sectors.

Change Type Strategy

Use the Right Change Type

ServiceNow supports three change typesβ€”each with different approval requirements and use cases:

Standard Changes

When to Use:

  • Low-risk, repeatable deployments
  • Application deployments following established patterns
  • Infrastructure updates with pre-approved runbooks
  • Deployments that have been executed successfully 10+ times

Benefits:

  • No approval delay (pre-approved)
  • Fastest path to production
  • Reduced change advisory board (CAB) workload

Example:

# Standard change - deploy immediately
change_data:
  type: "standard"
  std_change_producer_version: "app-deployment-v2"
  # No approval wait needed

Best Practice: Aim to make 80% of your deployments standard changes. Work with your CAB to pre-approve common deployment patterns.

Normal Changes

When to Use:

  • First-time deployments of new services
  • Database schema migrations
  • Infrastructure changes (new clusters, network changes)
  • High-visibility releases
  • Changes affecting multiple systems

Characteristics:

  • Requires CAB or manager approval
  • More detailed documentation needed
  • Approval typically takes 30 minutes to 4 hours

Example:

change_data:
  type: "normal"
  priority: "3"
  risk_assessment: "medium"
  # Include detailed implementation and backout plans

Best Practice: Provide comprehensive implementation plans. The more detail you include, the faster approvals happen.

Emergency Changes

When to Use:

  • Production outages or security vulnerabilities
  • Critical bugs affecting customers
  • Data loss or corruption issues
  • Security patches for zero-day exploits

Characteristics:

  • Expedited approval (5-30 minutes)
  • Requires justification
  • Higher priority (typically P1 or P2)
  • Post-implementation review required

Example:

change_data:
  type: "emergency"
  priority: "1"
  justification: "Critical security CVE-2024-12345 affecting customer data"
  # Expedited approval process

Best Practice: Don’t abuse emergency changes. Overuse leads to approval fatigue and reduced oversight.

Automation Best Practices

1. Automate Everything

Never create change tickets manually for automated deployments.

❌ Bad:

Developer β†’ Manually creates ServiceNow ticket β†’ Deploys β†’ Manually updates ticket

βœ… Good:

Pipeline β†’ Auto-creates ServiceNow ticket β†’ Waits for approval β†’ Deploys β†’ Auto-closes ticket

Benefits:

  • Zero manual overhead
  • Accurate deployment documentation
  • Complete audit trail
  • No forgotten ticket updates

2. Include Deployment Context

Always include these details in automated change requests:

Essential Information:

  • What: Application/service name, version/commit SHA
  • When: Timestamp, scheduled deployment window
  • Who: Deployment initiator (developer, pipeline, automation)
  • Where: Environment, infrastructure (cluster, region)
  • Why: Business justification, ticket reference
  • How: Deployment method, rollback procedure

Example:

{
  "short_description": "Deploy payment-api v2.5.0 to Production",
  "description": "Automated deployment from GitLab CI/CD\n\nDetails:\n- Repository: acme/payment-api\n- Commit: a3b5c7d (feat: add refund endpoint)\n- Author: jane.doe@acme.com\n- Pipeline: https://gitlab.com/acme/payment-api/-/pipelines/12345\n- Target: production-k8s-us-east\n- Method: Helm chart deployment\n- Tests: βœ“ 127 passed, 0 failed\n- Security: βœ“ No critical vulnerabilities",
  "cmdb_ci": "prod-k8s-us-east",
  "implementation_plan": "1. Deploy Helm chart v2.5.0\n2. Run smoke tests\n3. Monitor error rates for 15 minutes",
  "backout_plan": "helm rollback payment-api 1",
  "test_plan": "Automated tests passed: unit (85), integration (32), e2e (10)"
}

3. Attach Evidence

Always attach deployment evidence to change requests:

  • βœ“ Test results (JUnit XML, coverage reports)
  • βœ“ Security scan results (Trivy, Snyk, etc.)
  • βœ“ Code review evidence (merge approval)
  • βœ“ Release notes
  • βœ“ Deployment logs

Example:

# Attach test results
curl -X POST \
  "${SNOW_URL}/api/now/attachment/file?table_name=change_request&table_sys_id=${SYS_ID}" \
  -H "Authorization: Bearer ${TOKEN}" \
  -F "file=@test-results.xml;filename=test-results.xml"

# Attach security scan
curl -X POST \
  "${SNOW_URL}/api/now/attachment/file?table_name=change_request&table_sys_id=${SYS_ID}" \
  -H "Authorization: Bearer ${TOKEN}" \
  -F "file=@trivy-scan.json;filename=security-scan.json"

Approval Workflow Optimization

1. Implement Approval Timeouts

Don’t wait indefinitely for approvals.

def wait_for_approval(sys_id, timeout=3600):
    """
    Wait for change approval with timeout
    Default: 1 hour for normal changes
    """
    start_time = time.time()

    while (time.time() - start_time) < timeout:
        change = get_change(sys_id)

        if is_approved(change):
            return True
        elif is_rejected(change):
            raise Exception("Change rejected")

        time.sleep(60)  # Check every minute

    # Timeout reached
    raise TimeoutError(f"Change approval timeout after {timeout}s")

Recommended Timeouts:

  • Standard changes: 0 seconds (pre-approved)
  • Normal changes: 4 hours
  • Emergency changes: 30 minutes

2. Use Smart Polling Intervals

Don’t overwhelm ServiceNow API with requests.

❌ Bad: Poll every 5 seconds βœ… Good: Poll every 60 seconds

Adaptive Polling:

def adaptive_polling_interval(elapsed_time):
    """Increase interval as time passes"""
    if elapsed_time < 300:  # First 5 minutes
        return 30
    elif elapsed_time < 1800:  # 5-30 minutes
        return 60
    else:  # After 30 minutes
        return 120

3. Send Approval Notifications

Don’t rely on ServiceNow notifications alone.

Multi-Channel Notifications:

  • ServiceNow native notifications
  • Slack/Teams message with approval link
  • Email to approval group
  • Mobile push notification (ServiceNow app)

Example Slack Notification:

- name: Notify Approvers
  run: |
    curl -X POST ${{ secrets.SLACK_WEBHOOK }} \
      -H 'Content-Type: application/json' \
      -d '{
        "text": "🎫 Change Request Awaiting Approval",
        "blocks": [
          {
            "type": "section",
            "text": {
              "type": "mrkdwn",
              "text": "*Deploy payment-api v2.5.0*\nChange: CHG0030001\n<https://instance.service-now.com/change_request.do?sys_id=xxx|View in ServiceNow>"
            }
          },
          {
            "type": "actions",
            "elements": [
              {
                "type": "button",
                "text": {"type": "plain_text", "text": "Approve"},
                "url": "https://instance.service-now.com/change_request.do?sys_id=xxx",
                "style": "primary"
              }
            ]
          }
        ]
      }'

Environment-Specific Strategies

Don’t Create Changes for Non-Production

Only create ServiceNow changes for production deployments.

❌ Bad:

# Creates change for every environment
- dev β†’ ServiceNow change
- staging β†’ ServiceNow change
- production β†’ ServiceNow change

βœ… Good:

# Only production requires change ticket
- dev β†’ Deploy directly
- staging β†’ Deploy directly
- production β†’ ServiceNow change β†’ Deploy

Implementation:

create_change:
  stage: change_management
  script:
    - python create_servicenow_change.py
  only:
    - main  # Only on production branch

Use Environment-Specific Change Types

variables:
  CHANGE_TYPE: >
    $[[
      eq(variables['Build.SourceBranch'], 'refs/heads/main')
      && 'normal'
      || 'standard'
    ]]

Integration Patterns

Pattern 1: Gate After Build, Before Deploy

Most common pattern - change created after successful build/test:

Build β†’ Test β†’ Create Change β†’ Wait Approval β†’ Deploy β†’ Close Change
   βœ“      βœ“          ↓              ↓           ↓          ↓
                 CHG0001        Approved      Success   Closed

Benefits:

  • Only create changes for viable deployments
  • Include test results in change request
  • Fast feedback on build issues

Pattern 2: Change Created Early

Create change at pipeline start for visibility:

Create Change β†’ Build β†’ Test β†’ Wait Approval β†’ Deploy β†’ Close Change
      ↓           ↓       ↓          ↓           ↓          ↓
  CHG0001        βœ“       βœ“      Approved      Success   Closed

Benefits:

  • Early visibility into upcoming changes
  • Approvers can review while build/test runs
  • Parallel approval and testing

Trade-off: May create changes for failed builds (update to canceled if build fails)

Pattern 3: Approval Before Build (Rare)

Create change before any work starts:

Create Change β†’ Wait Approval β†’ Build β†’ Test β†’ Deploy β†’ Close Change

Use Case:

  • High-risk changes requiring pre-approval
  • Scheduled maintenance windows
  • Coordination with external teams

Trade-off: Slower, approval may happen before code is ready

Error Handling

Always Update Failed Changes

Never leave changes in β€œimplementing” state after failure.

on:
  failure:
    steps:
      - name: Update Change - Failed
        run: |
          curl -X PATCH "${SNOW_URL}/api/now/table/change_request/${CHANGE_SYS_ID}" \
            -H "Authorization: Bearer ${TOKEN}" \
            -H "Content-Type: application/json" \
            -d '{
              "state": "4",
              "close_code": "unsuccessful",
              "close_notes": "Deployment failed. Pipeline: ${CI_PIPELINE_URL}\nError: ${ERROR_MESSAGE}"
            }'

Implement Retry Logic

ServiceNow API can have transient failures - retry with exponential backoff:

def create_change_with_retry(data, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.post(url, json=data, timeout=30)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt  # 1s, 2s, 4s
            time.sleep(wait_time)

Handle ServiceNow Downtime

Don’t fail deployments if ServiceNow is down (unless required by policy):

- name: Create Change (Best Effort)
  continue-on-error: true  # Don't fail pipeline
  run: create_servicenow_change.sh

- name: Check Change Created
  run: |
    if [ -z "$CHANGE_SYS_ID" ]; then
      echo "⚠️  ServiceNow change not created - proceeding without change ticket"
      echo "CHANGE_REQUIRED=false" >> $GITHUB_ENV
    fi

Note: Only use continue-on-error if your policy allows deployments without change tickets. Many regulated industries require strict enforcement.

Security Best Practices

1. Use Service Accounts

Never use personal accounts for automation.

βœ… Good: Create dedicated service account

  • Username: svc-gitlab-ci@company.com
  • Roles: itil, api_write
  • Purpose: GitLab CI/CD automation

❌ Bad: Use developer’s personal account

2. Implement Least Privilege

Grant minimum required permissions:

Required ServiceNow Roles:

  • itil - Create and update change requests
  • api_write - Write access to REST API (if separate from itil)

Not Needed:

  • ❌ admin - Never grant admin access
  • ❌ security_admin - Not required for changes
  • ❌ asset - Not needed unless managing CMDB

3. Rotate Credentials

Regular credential rotation:

  • Passwords: Every 90 days
  • OAuth client secrets: Every 180 days
  • API tokens: Every 90 days

Automation:

# Use CI/CD secrets with expiration alerts
# Azure Key Vault / AWS Secrets Manager / HashiCorp Vault
# Set expiration notifications

4. Secure API Calls

Always use HTTPS and validate certificates:

βœ… Good:

curl -X POST https://instance.service-now.com/api/... \
  --cacert /etc/ssl/certs/ca-certificates.crt

❌ Bad:

curl -X POST https://instance.service-now.com/api/... \
  --insecure  # Disables SSL verification

Performance Optimization

1. Cache ServiceNow Metadata

Don’t look up assignment groups and CMDB CIs on every pipeline run.

# Cache in pipeline or repository
ASSIGNMENT_GROUPS = {
    'DevOps Team': 'a9e0c5f21bf45110d4e27f86624bcb24',
    'Platform Team': 'b2f1d6g32cg56221e5f38g97735cdc35'
}

# Use cached sys_id directly
change_data['assignment_group'] = ASSIGNMENT_GROUPS['DevOps Team']

2. Minimize API Calls

Batch operations where possible:

❌ Bad: 4 separate API calls

create_change()
attach_file_1()
attach_file_2()
update_change()

βœ… Good: 2 API calls

create_change()  # Include all data upfront
attach_files([file_1, file_2])  # Batch attachments

3. Use Webhooks Instead of Polling

For advanced setups, use webhooks to notify pipeline when approval happens:

// ServiceNow Business Rule - on Change approval
(function executeRule(current, previous) {
    if (current.approval == 'approved') {
        var r = new sn_ws.RESTMessageV2();
        r.setEndpoint('https://gitlab.com/api/v4/projects/123/pipeline/456/resume');
        r.setHttpMethod('POST');
        r.setRequestHeader('PRIVATE-TOKEN', 'gitlab-token');
        r.execute();
    }
})(current, previous);

Monitoring and Observability

Track Key Metrics

Monitor ServiceNow integration health:

Metric Threshold Alert
Change creation success rate < 95% Warning
Average approval wait time > 2 hours Warning
API error rate > 5% Critical
Changes closed successfully < 90% Warning
Pipeline duration impact > 5 minutes added Info

Implement Logging

Log all ServiceNow interactions:

import logging

logger = logging.getLogger('servicenow')

def create_change(data):
    logger.info(f"Creating change: {data['short_description']}")
    try:
        response = requests.post(url, json=data)
        logger.info(f"Change created: {response.json()['result']['number']}")
        return response.json()
    except Exception as e:
        logger.error(f"Failed to create change: {str(e)}", exc_info=True)
        raise

Create Dashboards

ServiceNow Dashboard Widgets:

  • Changes created by automation (trend)
  • Average approval time (by change type)
  • Change success rate (successful vs unsuccessful)
  • Changes by application/service

Common Pitfalls to Avoid

❌ Don’t: Hardcode Configuration

# Bad - hardcoded values
ASSIGNMENT_GROUP="DevOps Team"
CMDB_CI="prod-cluster"

βœ… Do: Use Configuration

# Good - environment variables
ASSIGNMENT_GROUP: ${ASSIGNMENT_GROUP}
CMDB_CI: ${CMDB_CI}

❌ Don’t: Skip Rollback Changes

If deployment fails and you rollback, create a rollback change:

on_failure:
  - name: Rollback
    run: kubectl rollout undo deployment/myapp

  - name: Create Rollback Change
    run: |
      curl -X POST ... -d '{
        "type": "emergency",
        "short_description": "ROLLBACK: Failed deployment",
        "parent": "${ORIGINAL_CHANGE_SYS_ID}"
      }'

❌ Don’t: Ignore Change Calendar

Check for change conflicts before scheduling:

def check_change_calendar(planned_start, planned_end):
    """Check if change window has conflicts"""
    response = requests.get(
        f"{SNOW_URL}/api/now/table/change_request",
        params={
            "sysparm_query": f"planned_start_dateBETWEENjavascript:gs.dateGenerate('{planned_start}')@javascript:gs.dateGenerate('{planned_end}')"
        }
    )
    conflicts = response.json()['result']
    return len(conflicts) == 0

❌ Don’t: Create Duplicate Changes

Check if change already exists before creating:

def find_existing_change(commit_sha):
    """Check if change exists for this commit"""
    response = requests.get(
        f"{SNOW_URL}/api/now/table/change_request",
        params={
            "sysparm_query": f"correlation_id={commit_sha}^stateNOT IN3,4",
            "sysparm_limit": 1
        }
    )
    results = response.json()['result']
    return results[0] if results else None

Gradual Adoption Strategy

Phase 1: Automate Creation (Week 1-2)

  1. Automate change request creation
  2. Still manually update/close changes
  3. Monitor and validate accuracy

Phase 2: Add Approval Gates (Week 3-4)

  1. Implement approval polling
  2. Gate production deployments
  3. Measure approval times

Phase 3: Full Automation (Week 5-6)

  1. Auto-update changes during deployment
  2. Auto-close on success
  3. Handle failures automatically

Phase 4: Optimize (Ongoing)

  1. Convert to standard changes where possible
  2. Reduce approval times
  3. Improve evidence collection

Real-World Example: Financial Services

Organization: Top 10 US bank Scale: 500+ microservices, 50-100 deployments/day

Implementation:

  1. Standard Changes: 85% of deployments
    • Pre-approved application deployments
    • No approval delay
    • Auto-created and auto-closed
  2. Normal Changes: 10% of deployments
    • Database migrations
    • Infrastructure changes
    • New service launches
    • Average approval: 45 minutes
  3. Emergency Changes: 5% of deployments
    • Security patches
    • Production incidents
    • Average approval: 15 minutes

Results:

  • Reduced deployment time by 75%
  • 100% change ticket compliance
  • Zero SOX audit findings
  • Team satisfaction dramatically increased

Next Steps

Additional Resources