Log Review (ipynb)#
Google Log Explorer is a more than acceptable platform for reviewing logs. However, it lacks a table view of log events as well as field aggregation (think `stats count by` in Splunk), both of which can be useful during investigations.
The notebook below serves only as a demonstration of how a table view and field aggregation can be achieved outside of Google Log Explorer; it is NOT meant to be used in production (or on actual cases).
Install Dependencies#
Install the dependencies ipywidgets and pandas. Skip the next cell if they have already been installed.
!pip3 install ipywidgets pandas
Imports and Configuration#
import ipywidgets as widgets
import json
import os
import pandas as pd
from IPython.display import HTML, display
# extend width of widgets
display(HTML('''<style>
.widget-label { min-width: 18ex !important; font-weight:bold; }
</style>'''))
# extend width of cells
display(HTML("<style>.container { width:100% !important; }</style>"))
display(HTML("<style>.output_result { max-width:100% !important; }</style>"))
# extend width and max rows of pandas output
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)
# [OPTIONAL] authenticate using your service account
!gcloud auth activate-service-account --key-file <json_key_file>
Query Logs#
Specify the following information:

Fields | Description
---|---
Source Project | Project id of target project (that contains potentially compromised resource)
Resource Type | Resource type of logs to review
Start Date | Start date of time period of logs to review
Start Time (UTC) | Start time of time period of logs to review
End Date | End date of time period of logs to review
End Time (UTC) | End time of time period of logs to review
# create UI for user input
src_project = widgets.Text(description="Source Project: ", disabled=False)
resource_type = widgets.Dropdown(
options=['bigquery_dataset', 'bigquery_resource', 'cloudsql_database', 'cloud_function', 'gce_backend_service', 'gce_disk', 'gce_firewall_rule', 'gce_instance', 'gce_instance_group', 'gce_instance_group_manager', 'gce_router', 'gce_snapshot', 'gcs_bucket', 'gke_cluster', 'http_load_balancer', 'k8s_cluster', 'k8s_container', 'k8s_node', 'k8s_pod', 'logging_sink', 'network_security_policy', 'project', 'vpn_gateway'],
value='gce_instance',
description="Resource Type: ",
disabled=False)
start_date = widgets.DatePicker(description='Start Date: ', disabled=False)
start_time = widgets.Text(value='hh:mm', description="Start Time (UTC): ", disabled=False)
end_date = widgets.DatePicker(description='End Date: ', disabled=False)
end_time = widgets.Text(value='hh:mm', description="End Time (UTC): ", disabled=False)
display(src_project, resource_type, start_date, start_time, end_date, end_time)
# set environment variables and construct query
os.environ['SRC_PROJECT'] = src_project.value
os.environ['QUERY'] = (
    f'resource.type={resource_type.value}'
    f' AND timestamp>="{start_date.value}T{start_time.value}:00Z"'
    f' AND timestamp<="{end_date.value}T{end_time.value}:00Z"'
)
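Optionally, print the assembled filter to sanity-check it before querying. For hypothetical inputs, it should look something like resource.type=gce_instance AND timestamp>="2024-01-01T09:00:00Z" AND timestamp<="2024-01-01T17:00:00Z".
# print the assembled filter for a quick sanity check
print(os.environ['QUERY'])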
# request log events that satisfy the query, limiting to 100 events (change as you deem fit)
!gcloud logging read "$QUERY" --project $SRC_PROJECT --limit=100 --format=json > temp_logs.json
# store results into dataframe
with open('./temp_logs.json') as infile:
log_results = json.load(infile)
log_results_df = pd.json_normalize(log_results)
display(log_results_df)
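The normalized dataframe can be very wide, so narrowing it to a few columns gives a more readable table view. The column names below are assumptions based on typical Cloud Audit Log fields; adjust them to whatever log_results_df.columns actually contains for your resource type.
# a minimal table-view sketch, assuming audit log columns (adjust to your data)
columns_of_interest = ['timestamp', 'protoPayload.methodName', 'protoPayload.authenticationInfo.principalEmail']
display(log_results_df[[c for c in columns_of_interest if c in log_results_df.columns]])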
# aggregate values of a specified field (protoPayload.methodName in this case)
log_results_df['protoPayload.methodName'].value_counts()
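For a closer analogue to Splunk's stats count by over multiple fields, a groupby on the dataframe works. This is a sketch: the principalEmail column name is an assumption based on Cloud Audit Log fields and may not exist for every resource type.
# rough equivalent of Splunk's "stats count by" over two fields, assuming both columns exist
log_results_df.groupby(['protoPayload.methodName', 'protoPayload.authenticationInfo.principalEmail']).size().sort_values(ascending=False)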