SLAB(Semantic Logging Application Block) sink for Google BigQuery

tsu1980/SlabBigQuery

Mamemaki.Slab.BigQuery

Mamemaki.Slab.BigQuery is a SLAB (Semantic Logging Application Block) sink for Google BigQuery. It receives log messages from ETW and pushes them to Google BigQuery.

Installation

PM> Install-Package Mamemaki.Slab.BigQuery

Basic Usage

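    using System.Diagnostics.Tracing;
    using System.Threading;
    using Microsoft.Practices.EnterpriseLibrary.SemanticLogging;
    using Mamemaki.Slab.BigQuery; // assumed namespace of the sink package

    class Program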
    {
        static void Main(string[] args)
        {
            var projectId = "xxxxxxx-0000";
            var datasetId = "xxxxxxx";
            var tableId = "hello{yyyyMM01}";
            var serviceAccountEmail = "000000000000-xxxxxxxxxxxxx@developer.gserviceaccount.com";
            var privateKeyFile = @"/path/to/xxxx-000000000000.p12";
            var tableSchemaFile = @"/path/to/hello.json";
            using (var listener = new ObservableEventListener())
            using (var listenerDebug = new ObservableEventListener())
            {
                listenerDebug.EnableEvents(SemanticLoggingEventSource.Log, EventLevel.LogAlways);
                listenerDebug.EnableEvents(BigQuerySinkEventSource.Log, EventLevel.LogAlways);
                listenerDebug.LogToConsole();

                listener.EnableEvents(MyEventSource.Log, EventLevel.LogAlways);
                listener.LogToConsole();
                listener.LogToBigQuery(
                    projectId: projectId,
                    datasetId: datasetId,
                    tableId: tableId,
                    serviceAccountEmail: serviceAccountEmail,
                    privateKeyFile: privateKeyFile,
                    autoCreateTable: true,
                    tableSchemaFile: tableSchemaFile);

                MyEventSource.Log.Start();
                Thread.Sleep(1);
                MyEventSource.Log.Hello("World");
                Thread.Sleep(1);
                MyEventSource.Log.End();
            }
        }
    }

    [EventSource(Name = "My")]
    class MyEventSource : EventSource
    {
        private static readonly MyEventSource Instance = new MyEventSource();
        private MyEventSource() {}
        public static MyEventSource Log { get { return Instance; } }

        [Event(1, Level = EventLevel.Verbose, Message = "Start")]
        public void Start()
        {
            if (this.IsEnabled())
                this.WriteEvent(1);
        }

        [Event(2, Level = EventLevel.Verbose, Message = "End")]
        public void End()
        {
            if (this.IsEnabled())
                this.WriteEvent(2);
        }

        [Event(3, Level = EventLevel.Informational, Message = "Hello {0}!")]
        public void Hello(string to)
        {
            if (this.IsEnabled())
                this.WriteEvent(3, to);
        }
    }

hello.json

[
  {
    "name": "Timestamp",
    "type": "TIMESTAMP",
    "mode": "REQUIRED"
  },
  {
    "name": "FormattedMessage",
    "type": "STRING",
    "mode": "REQUIRED"
  },
  {
    "name": "to",
    "type": "STRING"
  }
]

[Image: Query result in Google BigQuery console]

Configuration

BigQuerySink parameters

| Parameter | Description | Required (default) |
|---|---|---|
| projectId | Project ID of Google BigQuery. | Yes |
| datasetId | Dataset ID of Google BigQuery. | Yes |
| tableId | Table ID of Google BigQuery. Expandable using a DateTime.ToString() format, e.g. "accesslog{yyyyMMdd}" => accesslog20150101 (the curly braces are required). | Yes |
| authMethod | Accepts "private_key" only. | No ("private_key") |
| serviceAccountEmail | Email address of the Google BigQuery service account. | Yes if authMethod == "private_key" |
| privateKeyFile | Private key file (*.p12) of the Google BigQuery service account. | Yes if authMethod == "private_key" |
| privateKeyPassphrase | Private key passphrase of the Google BigQuery service account. | No ("notasecret") |
| autoCreateTable | If set to true, checks whether the table exists and creates it dynamically. See Dynamic table creating. | No (false) |
| tableSchemaFile | JSON file that defines the Google BigQuery table schema. | Yes |
| insertIdFieldName | The field name to use as InsertId. If set to %uuid%, a UUID is generated each time. If not set, InsertId is not set. See Specifying insertId property. | No (null) |
| bufferingInterval | The buffering interval. | No (00:00:30) |
| bufferingCount | The buffering count. | No (200) |
| maxBufferSize | The maximum number of entries that can be buffered before the sink starts dropping entries. | No (30000) |
| onCompletedTimeout | Timeout for data flushing. | No (00:01:00) |

See the Quota policy section in the Google BigQuery documentation.
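
The interaction between bufferingCount and maxBufferSize can be pictured with a small model. This is only an illustrative sketch, not the sink's implementation: the `BoundedBuffer` class and its members are hypothetical, and it assumes that bufferingCount is the number of buffered entries that triggers a send, which the parameter table does not state explicitly.

```csharp
using System;
using System.Collections.Generic;

// Illustrative model only; the class and member names are hypothetical.
class BoundedBuffer
{
    private readonly Queue<string> entries = new Queue<string>();
    private readonly int bufferingCount;
    private readonly int maxBufferSize;

    public BoundedBuffer(int bufferingCount, int maxBufferSize)
    {
        this.bufferingCount = bufferingCount;
        this.maxBufferSize = maxBufferSize;
    }

    public int Count { get { return entries.Count; } }

    // Assumption: a batch is sent once this many entries are waiting.
    public bool ShouldFlush { get { return entries.Count >= bufferingCount; } }

    // Returns false when the entry is dropped because the buffer is full.
    public bool TryAdd(string entry)
    {
        if (entries.Count >= maxBufferSize) return false;
        entries.Enqueue(entry);
        return true;
    }

    // Removes and returns up to bufferingCount entries as one batch.
    public List<string> TakeBatch()
    {
        var batch = new List<string>();
        while (batch.Count < bufferingCount && entries.Count > 0)
            batch.Add(entries.Dequeue());
        return batch;
    }
}
```

In this model a larger maxBufferSize tolerates longer outages before entries are dropped, at the cost of memory.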

Authentication

One method is supported for obtaining an access token for the service account:

  1. "private_key" - Public-Private key pair

With this method, you first need to create a service account (client ID), download its private key, and deploy the key with your assembly.

Table id formatting

tableId accepts a DateTime.ToString() format, enclosed in curly braces, to construct the table ID. Table IDs are formatted at runtime using the local time.

For example, with tableId set to accesslog{yyyyMM01}, the table IDs become accesslog20140801, accesslog20140901, and so on.

Note that the timestamp of logs and the date in the table id do not always match, because there is a time lag between collection and transmission of logs.
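
The expansion can be sketched as follows. This is a minimal illustration, not the sink's own code: `TableIdFormatter` and `Expand` are hypothetical names, and the sketch assumes every `{...}` placeholder is passed as-is to DateTime.ToString().

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper; the sink's real implementation may differ.
static class TableIdFormatter
{
    // Replaces each {format} placeholder with now.ToString(format).
    public static string Expand(string tableIdTemplate, DateTime now)
    {
        return Regex.Replace(
            tableIdTemplate,
            @"\{(.+?)\}",
            m => now.ToString(m.Groups[1].Value, CultureInfo.InvariantCulture));
    }
}
```

Calling `TableIdFormatter.Expand("accesslog{yyyyMM01}", DateTime.Now)` on any day in August 2014 yields accesslog20140801, because the digits "01" in the format are copied literally.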

Table schema

One method is supported for describing the schema of the target table:

  1. Load a schema file in JSON.

With this method, set tableSchemaFile to the path of the JSON-encoded schema file that you used to create the table on BigQuery. See table schema for detailed information.

Example:

[
  {
    "name": "Timestamp",
    "type": "TIMESTAMP",
    "mode": "REQUIRED"
  },
  {
    "name": "FormattedMessage",
    "type": "STRING",
    "mode": "REQUIRED"
  }
]

Specifying insertId property

BigQuery uses the insertId property to detect duplicate insertion requests (see data consistency in the Google BigQuery documentation). You can set the insertIdFieldName option to specify the field to use as the insertId property.

If insertIdFieldName is set to %uuid%, a UUID is generated for each row.
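
The selection of the insertId value can be sketched as below. This is an assumption-laden illustration rather than the sink's code: `InsertIdResolver` is a hypothetical name, and the row is modeled as a plain dictionary of field values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the row is modeled as a dictionary for illustration.
static class InsertIdResolver
{
    public static string Resolve(string insertIdFieldName, IDictionary<string, object> row)
    {
        // Option not set: no insertId is sent.
        if (string.IsNullOrEmpty(insertIdFieldName)) return null;

        // Special value: generate a fresh UUID for every row.
        if (insertIdFieldName == "%uuid%") return Guid.NewGuid().ToString();

        // Otherwise use the value of the named field, if present.
        object value;
        return row.TryGetValue(insertIdFieldName, out value) && value != null
            ? value.ToString()
            : null;
    }
}
```

Using a field value that stays the same across retries lets BigQuery recognize a resent row, whereas a fresh UUID only distinguishes rows from each other.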

Dynamic table creating

When autoCreateTable is set to true, the sink checks whether the table exists before insertion and creates it if it does not. When the table name changes, the sink runs this sequence again.
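
A minimal sketch of this check-then-create sequence is shown below. It is not the sink's actual code: `TableEnsurer` is a hypothetical class, and the two delegates stand in for whatever BigQuery API calls test for and create a table.

```csharp
using System;

// Hypothetical sketch; the delegates stand in for real BigQuery API calls.
class TableEnsurer
{
    private readonly Func<string, bool> tableExists;
    private readonly Action<string> createTable;
    private string lastCheckedTableId;

    public TableEnsurer(Func<string, bool> tableExists, Action<string> createTable)
    {
        this.tableExists = tableExists;
        this.createTable = createTable;
    }

    // Runs the check only when the table ID differs from the last one checked.
    public void EnsureTable(string tableId)
    {
        if (tableId == lastCheckedTableId) return;
        if (!tableExists(tableId)) createTable(tableId);
        lastCheckedTableId = tableId;
    }
}
```

Remembering the last checked table ID avoids an existence check on every insertion while still handling a date-based tableId rolling over to a new table.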

Lookup field value

To find the value for a BigQuery table field in an EventEntry, the sink applies the following rules in order:

  1. Look for a matching name in the payload.
  2. Look for a matching name among the built-in EventEntry properties.
  3. If no match is found, raise an error when the field mode is "REQUIRED"; otherwise leave the field value unset.
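
The lookup order above can be sketched as follows. This is an illustration under stated assumptions, not the sink's code: `FieldLookup` is a hypothetical name, the payload and built-in properties are modeled as dictionaries, and the exception type is chosen arbitrarily.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; payload and built-in properties are modeled as dictionaries.
static class FieldLookup
{
    // Returns true and sets value when the field is found; returns false when
    // an optional field has no value; throws when a REQUIRED field is missing.
    public static bool TryResolve(
        string fieldName,
        string fieldMode,
        IDictionary<string, object> payload,
        IDictionary<string, object> builtIn,
        out object value)
    {
        // Rule 1: payload values take precedence.
        if (payload.TryGetValue(fieldName, out value)) return true;

        // Rule 2: fall back to built-in EventEntry properties.
        if (builtIn.TryGetValue(fieldName, out value)) return true;

        // Rule 3: a missing REQUIRED field is an error; others stay unset.
        if (fieldMode == "REQUIRED")
            throw new InvalidOperationException("No value for required field: " + fieldName);

        value = null;
        return false;
    }
}
```

Because the payload is consulted first, an event parameter named like a built-in property (for example Timestamp) would shadow the built-in value in this sketch.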

Supported built-in EventEntry properties:

| Name | Description | Data type |
|---|---|---|
| EventId | Identifier of the event | INTEGER |
| EventName | Name of the event | STRING |
| Level | Level of the event | INTEGER |
| FormattedMessage | Formatted message for the event | STRING |
| Keywords | Keywords for the event | INTEGER |
| KeywordsDescription | Human-readable string name for the Keywords property | STRING |
| Task | Task for the event | INTEGER |
| TaskName | Task name | STRING |
| Opcode | Operation code for the event | INTEGER |
| OpcodeName | Human-readable string name for the Opcode property | STRING |
| Version | Version of the event | INTEGER |
| Timestamp | Timestamp of the event | TIMESTAMP |
| ProcessId | Process ID of the event | INTEGER |
| ThreadId | Thread ID of the event | INTEGER |
| ProviderId | ID of the source originating the event | STRING |
| ProviderName | Provider name | STRING |
| ActivityId | Activity ID on the thread that the event was written to | STRING |
| RelatedActivityId | Identifier of an activity that is related to the activity represented by the current instance | STRING |

NOTE: The data type of the EventEntry value must also match the data type of the BigQuery table field.

Copyright (c) 2015, Tsuyoshi Sumiyoshi and collaborators. All rights reserved
