Audit log implementation

This guide provides step-by-step instructions for accessing, downloading, and processing Bloomreach audit logs using Google Cloud Storage. It covers audit log access for both Bloomreach Engagement and Data Hub.

Prerequisites

Contact your Bloomreach account representative to enable audit log access.

If you want to access the audit logs in Data Hub, you'll receive:

  • Service Account Credentials — JSON key file for GCS authentication
  • Regional Bucket Name — GCS bucket where logs are stored (e.g., us-auditlog-storage)
  • Cloud Organization ID — Your unique identifier for the folder structure

If you want to access the audit log from the Engagement application, you'll receive:

  • Access via GSuite Google Groups (separate from Engagement Access management).
  • Google Identities (user accounts or service accounts) for group membership.

Depending on your access method, you will need:

  • Google Cloud SDK (gsutil): see the Install guide
  • Python client library: pip install google-cloud-storage
  • Direct API access: curl and the gcloud CLI
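
Once the tooling is installed, a quick sanity check confirms everything is on your PATH before you proceed (a sketch; output will vary by version):

# Verify the Google Cloud SDK and gsutil
gcloud --version
gsutil version

# Verify the Python client library is importable
python -c "import google.cloud.storage; print(google.cloud.storage.__version__)"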

Key Concepts

Storage Structure

Engagement:

  • Bucket naming: {instance-id}-auditlog-storage or {instance-id}-auditlog-storage-{account-id}
  • File organization: Year/Month/Day (e.g., iid/2021/10/12)
  • Replace {instance-id} with your unique instance identifier (three lowercase alphanumeric characters)

Data Hub:

  • Bucket naming: {regional-bucket}/cloud-org-{id}/
  • File organization: {YYYY}/{MM}/{DD}/{HH}/ (includes hour granularity)
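
Putting the two conventions together, full paths look like this (the IDs are the illustrative ones used in the examples later in this guide; the Data Hub hour segment is an assumption):

# Engagement: instance abc, logs for 12 October 2021
gs://abc-auditlog-storage/iid/2021/10/12/

# Data Hub: cloud org abc123def456, hour 14 on 28 November 2024
gs://us-auditlog-storage/cloud-org-abc123def456/2024/11/28/14/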

File Format and Retention

  • Files use gzipped JSON Lines format (.jsonl.gz)
  • Each line is a complete JSON log record
  • Files are available for 60 days before archival
  • See Audit Logs Overview & Architecture for schema details
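
For orientation, a single log line looks roughly like the record below. This is an illustrative sketch assembled from the fixed schema fields referenced in the SIEM section of this guide; all values are invented, and the authoritative schema is in Audit Logs Overview & Architecture:

{"timestamp": "2024-11-28T14:03:21Z", "status": 200, "scopeType": "project", "scopeID": "abc", "requestID": "a1b2c3d4", "authenticationInfo": {"identity": "[email protected]"}, "authorizationInfo": {"allowed": true}, "request": {"method": "GET", "path": "/data/v2/customers"}}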

Authentication

Before accessing the logs, you must authenticate using the provided service account credentials.

📘 Info: The authentication process is identical for both Engagement and Data Hub.

Method 1: Activate service account

gcloud auth activate-service-account --key-file=/path/to/service-account-key.json

Method 2: Set environment variable

export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"

Replace /path/to/service-account-key.json with the actual path to your service account JSON file.
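
To confirm the credentials took effect before you start pulling logs, you can run a lightweight check (the bucket name is the placeholder used throughout this guide):

# Show which account is currently active
gcloud auth list

# A successful listing confirms both authentication and bucket permissions
gsutil ls gs://{instance-id}-auditlog-storage/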

Access the Audit Log

You can access the audit log in three ways:

  • Google Cloud SDK: Uses gsutil to list, download, and manage audit log files from Google Cloud Storage buckets
  • Direct API calls: Access the audit log directly through the Google Cloud Storage JSON API
  • Client libraries: Access the audit log using your preferred client library (e.g., Python)

Google Cloud SDK Tools (gsutil)

The Google Cloud SDK includes gsutil, which you can use to list and download audit log files. The tool performs all operations, including uploads and downloads, over HTTPS with transport-layer security (TLS).

List Available Log Files

Engagement:

To list files for a specific day:

gsutil ls gs://{instance-id}-auditlog-storage/iid/{YYYY}/{MM}/{DD}

To list account-level log files:

gsutil ls gs://{instance-id}-auditlog-storage-{account-id}/{YYYY}/{MM}/{DD}

Data Hub:

To see what audit log files are available for a specific date:

gsutil ls gs://{regional-bucket}/cloud-org-{your-cloud-org-id}/{YYYY}/{MM}/{DD}/

To see files for a specific hour of the day, add the hour to the command:

gsutil ls gs://{regional-bucket}/cloud-org-{your-cloud-org-id}/{YYYY}/{MM}/{DD}/{HH}/

Example:

# Engagement
gsutil ls gs://abc-auditlog-storage/iid/2021/10/12

# Data Hub
gsutil ls gs://us-auditlog-storage/cloud-org-abc123def456/2024/11/28/

Download Log Files

Engagement:

To download all logs for a specific day (instance-level):

gsutil -m rsync gs://{instance-id}-auditlog-storage/iid/{YYYY}/{MM}/{DD} /data/local/storage/{MM}/{DD}

To download account-level logs:

gsutil -m rsync gs://{instance-id}-auditlog-storage-{account-id}/{YYYY}/{MM}/{DD} /data/local/storage/{MM}/{DD}

Data Hub:

To download all logs for a specific day to your computer:

gsutil -m cp -r gs://{regional-bucket}/cloud-org-{your-cloud-org-id}/{YYYY}/{MM}/{DD}/ ./local-directory/

To download logs for a specific hour only:

gsutil -m cp gs://{regional-bucket}/cloud-org-{your-cloud-org-id}/{YYYY}/{MM}/{DD}/{HH}/* ./local-directory/

Read Log Contents

The process for reading log contents is identical for both Engagement and Data Hub.

After you download the log files, uncompress and view them with this command:

gunzip -c filename.jsonl.gz | head -10

You can also view logs directly from Google Cloud Storage without downloading them first:

# Engagement
gsutil cat gs://{instance-id}-auditlog-storage/iid/{YYYY}/{MM}/{DD}/filename.jsonl.gz | gunzip | head -10

# Data Hub
gsutil cat gs://{regional-bucket}/cloud-org-{your-cloud-org-id}/{YYYY}/{MM}/{DD}/{HH}/filename.jsonl.gz | gunzip | head -10
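
Because every line is a standalone JSON record, you can also filter the stream with jq (assuming jq is installed). For example, to show only denied requests, using the authorizationInfo.allowed field described in the SIEM section:

gunzip -c filename.jsonl.gz | jq 'select(.authorizationInfo.allowed == false)'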

For the full command reference, see the gsutil documentation.

Direct API Access

You can also access audit logs programmatically using the Google Cloud Storage JSON API.

List Objects

Engagement:

To see available log files:

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://storage.googleapis.com/storage/v1/b/{instance-id}-auditlog-storage/o?prefix=iid/{YYYY}/{MM}/{DD}/"

Data Hub:

To see available log files:

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://storage.googleapis.com/storage/v1/b/{regional-bucket}/o?prefix=cloud-org-{your-cloud-org-id}/{YYYY}/{MM}/{DD}/"

Download an Object

Engagement:

To download a specific log file:

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://storage.googleapis.com/storage/v1/b/{instance-id}-auditlog-storage/o/iid%2F{YYYY}%2F{MM}%2F{DD}%2F{filename}?alt=media" \
  --output filename.jsonl.gz

Data Hub:

To download a specific log file:

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://storage.googleapis.com/storage/v1/b/{regional-bucket}/o/cloud-org-{your-cloud-org-id}%2F{YYYY}%2F{MM}%2F{DD}%2F{HH}%2F{filename}?alt=media" \
  --output filename.jsonl.gz

📘 Note: When using the API, replace forward slashes (/) in the file path with %2F for URL encoding.
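
If you script these calls, you can produce the encoded object name programmatically instead of by hand, for example with Python's standard library (a generic sketch, not Bloomreach-specific):

python3 -c "import urllib.parse; print(urllib.parse.quote('iid/2021/10/12/filename.jsonl.gz', safe=''))"
# Output: iid%2F2021%2F10%2F12%2Ffilename.jsonl.gz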

Client Libraries

You can use Google Cloud client libraries in your preferred programming language to download the audit log. Here's an example using Python:

Engagement:

from google.cloud import storage
from google.oauth2 import service_account
import gzip
import json

# Initialize credentials
key_path = "service_account_key.json"
credentials = service_account.Credentials.from_service_account_file(
    key_path,
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# Initialize client
client = storage.Client(
    credentials=credentials,
    project=credentials.project_id,
)

# List files
bucket = client.bucket('{instance-id}-auditlog-storage')
blobs = bucket.list_blobs(prefix='iid/2021/02/23/')

for blob in blobs:
    print(blob.name)

# Download and print specific file
blob = bucket.get_blob('iid/2021/02/23/20210223T080000.000Z-0.jsonl.gz')
content = blob.download_as_bytes()
decompressed = gzip.decompress(content)

for line in decompressed.decode('utf-8').strip().split('\n'):
    log_entry = json.loads(line)
    print(log_entry)

Data Hub:

from google.cloud import storage
import gzip
import json

# Initialize the client
client = storage.Client()
bucket = client.bucket('us-auditlog-storage')

# List blobs for a specific day
prefix = 'cloud-org-abc123def456/2024/11/28/'
blobs = bucket.list_blobs(prefix=prefix)

for blob in blobs:
    print(f"Found: {blob.name}")
    
    # Download and read a specific file
    content = blob.download_as_bytes()
    decompressed = gzip.decompress(content)
    
    for line in decompressed.decode('utf-8').strip().split('\n'):
        log_entry = json.loads(line)
        print(log_entry)
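
For large files, loading the entire object into memory with download_as_bytes() can be wasteful. Recent versions of google-cloud-storage support streaming reads via blob.open(), which pairs naturally with gzip for line-by-line processing. A minimal sketch, reusing the illustrative bucket and path from above (the filename is a placeholder):

from google.cloud import storage
import gzip
import json

client = storage.Client()
bucket = client.bucket('us-auditlog-storage')
blob = bucket.blob('cloud-org-abc123def456/2024/11/28/14/filename.jsonl.gz')

# Stream and decompress without holding the whole file in memory
with blob.open('rb') as raw, gzip.open(raw, 'rt', encoding='utf-8') as lines:
    for line in lines:
        log_entry = json.loads(line)
        print(log_entry)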

SIEM Integration

Integration Pattern

  1. Download logs using gsutil, API, or Python
  2. Ingest into SIEM (most accept JSON directly)
  3. Index on fixed schema fields
  4. Configure alerts for security events

Splunk Example

Download to the monitored directory:

# Engagement
gsutil -m rsync gs://{instance-id}-auditlog-storage/iid/$(date +%Y/%m/%d) /var/log/bloomreach-audit/

# Data Hub
gsutil -m cp -r gs://{regional-bucket}/cloud-org-{id}/$(date +%Y/%m/%d)/ /var/log/bloomreach-audit/

Configure Splunk inputs.conf:

[monitor:///var/log/bloomreach-audit/]
sourcetype = bloomreach:audit
index = security
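
Depending on your setup, you may also want Splunk to parse each line as JSON at index time. A minimal props.conf sketch for the sourcetype above (verify the exact settings against your Splunk version's documentation):

[bloomreach:audit]
INDEXED_EXTRACTIONS = json
# Alternatively, extract fields at search time instead:
# KV_MODE = json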

Example queries:

# Failed authorization attempts
authorizationInfo.allowed=false

# User activity
authenticationInfo.identity="[email protected]" | stats count by request.path

# Anonymization requests
serviceData.@type="auditlog.AnonymizationServiceData"

Alert on Fixed Fields

Use fixed schema fields for reliable alerts:

  • timestamp, status, scopeType, scopeID, requestID
  • authenticationInfo.identity, authorizationInfo.allowed
  • request.method, request.path

Avoid alerting on serviceData.info (schema not guaranteed).
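
As a concrete starting point, an alert on repeated authorization failures can be built entirely from fixed fields, in the same query style as the Splunk examples above (the threshold is illustrative):

# Identities with more than 10 denied requests in the search window
authorizationInfo.allowed=false | stats count by authenticationInfo.identity | where count > 10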

Best Practices

Security

  • Store service account keys securely with restricted permissions
  • Never commit credentials to version control
  • Use environment variables for key paths
  • Rotate credentials every 90 days

Data Retention

  • Download logs regularly (files available for 60 days)
  • Implement automated download scripts (see the sketch after this list)
  • Store logs according to compliance requirements
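
A minimal daily download script might look like the following. This sketch uses the Engagement bucket layout and GNU date syntax (adjust the path for Data Hub, and the date invocation on macOS/BSD); the local storage path is an assumption:

#!/bin/bash
# Mirror yesterday's audit logs into local storage, e.g. from a daily cron job
DAY=$(date -d "yesterday" +%Y/%m/%d)
mkdir -p /data/local/storage/${DAY}
gsutil -m rsync gs://{instance-id}-auditlog-storage/iid/${DAY} /data/local/storage/${DAY}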

Performance

  • Use gsutil -m for parallel downloads
  • Stream process large files to avoid memory issues
  • Use multiprocessing for analyzing multiple files

Next Steps

To understand the logs and their architecture, see Audit Logs Overview & Architecture.