Mamemaki.Slab.BigQuery is a SLAB (Semantic Logging Application Block) sink for Google BigQuery. It receives log messages from ETW and pushes them to Google BigQuery.
PM> Install-Package Mamemaki.Slab.BigQuery
using System.Diagnostics.Tracing;
using System.Threading;
using Microsoft.Practices.EnterpriseLibrary.SemanticLogging;
using Mamemaki.Slab.BigQuery; // assumed namespace, matching the package name

class Program
{
    static void Main(string[] args)
    {
        // Replace these placeholder values with your own project settings.
        var projectId = "xxxxxxx-0000";
        var datasetId = "xxxxxxx";
        var tableId = "hello{yyyyMM01}";
        var serviceAccountEmail = "000000000000-xxxxxxxxxxxxx@developer.gserviceaccount.com";
        var privateKeyFile = @"/path/to/xxxx-000000000000.p12";
        var tableSchemaFile = @"/path/to/hello.json";

        using (var listener = new ObservableEventListener())
        using (var listenerDebug = new ObservableEventListener())
        {
            // Diagnostic listener: prints internal SLAB and sink events to the console.
            listenerDebug.EnableEvents(SemanticLoggingEventSource.Log, EventLevel.LogAlways);
            listenerDebug.EnableEvents(BigQuerySinkEventSource.Log, EventLevel.LogAlways);
            listenerDebug.LogToConsole();

            // Application listener: sends MyEventSource events to the console and to BigQuery.
            listener.EnableEvents(MyEventSource.Log, EventLevel.LogAlways);
            listener.LogToConsole();
            listener.LogToBigQuery(
                projectId: projectId,
                datasetId: datasetId,
                tableId: tableId,
                serviceAccountEmail: serviceAccountEmail,
                privateKeyFile: privateKeyFile,
                autoCreateTable: true,
                tableSchemaFile: tableSchemaFile);

            MyEventSource.Log.Start();
            Thread.Sleep(1);
            MyEventSource.Log.Hello("World");
            Thread.Sleep(1);
            MyEventSource.Log.End();
        }
    }
}
[EventSource(Name = "My")]
class MyEventSource : EventSource
{
    // Singleton instance, exposed through the Log property.
    private static readonly MyEventSource Instance = new MyEventSource();
    private MyEventSource() {}
    public static MyEventSource Log { get { return Instance; } }

    [Event(1, Level = EventLevel.Verbose, Message = "Start")]
    public void Start()
    {
        if (this.IsEnabled())
            this.WriteEvent(1);
    }

    [Event(2, Level = EventLevel.Verbose, Message = "End")]
    public void End()
    {
        if (this.IsEnabled())
            this.WriteEvent(2);
    }

    // The "to" argument becomes a payload item named "to" (see hello.json).
    [Event(3, Level = EventLevel.Informational, Message = "Hello {0}!")]
    public void Hello(string to)
    {
        if (this.IsEnabled())
            this.WriteEvent(3, to);
    }
}
hello.json
[
  {
    "name": "Timestamp",
    "type": "TIMESTAMP",
    "mode": "REQUIRED"
  },
  {
    "name": "FormattedMessage",
    "type": "STRING",
    "mode": "REQUIRED"
  },
  {
    "name": "to",
    "type": "STRING"
  }
]
Query result in the Google BigQuery console: (screenshot omitted)
Parameter | Description | Required (default) |
---|---|---|
projectId | Project id of Google BigQuery. | Yes |
datasetId | Dataset id of Google BigQuery. | Yes |
tableId | Table id of Google BigQuery. Expandable through DateTime.ToString() format strings, e.g. "accesslog{yyyyMMdd}" => accesslog20150101 (the curly braces are required). | Yes |
authMethod | Accepts "private_key" only. | No ("private_key") |
serviceAccountEmail | Email address of the Google BigQuery service account. | Yes if authMethod == "private_key" |
privateKeyFile | Private key file (*.p12) of the Google BigQuery service account. | Yes if authMethod == "private_key" |
privateKeyPassphrase | Private key passphrase of the Google BigQuery service account. | No ("notasecret") |
autoCreateTable | If set to true, checks whether the table exists and creates it dynamically. See Dynamic table creating. | No (false) |
tableSchemaFile | JSON file that defines the Google BigQuery table schema. | Yes |
insertIdFieldName | The field name of InsertId. If set to %uuid%, a UUID is generated each time. If not set, InsertId is not set. See Specifying insertId property. | No (null) |
bufferingInterval | The buffering interval. | No (00:00:30) |
bufferingCount | The buffering count. | No (200) |
maxBufferSize | The maximum number of entries that can be buffered before the sink starts dropping entries. | No (30000) |
onCompletedTimeout | Timeout for data flushing. | No (00:01:00) |
See the Quota policy section in the Google BigQuery documentation.
One method is supported for fetching an access token for the service account.
- "private_key" - Public-Private key pair
With this method, you first need to create a service account (client ID), download its private key, and deploy the key with your assembly.
tableId accepts a DateTime.ToString() format string to construct the table id.
Table ids are formatted at runtime using the local time.
For example, when tableId is set to accesslog{yyyyMM01}, the table ids will be accesslog20140801, accesslog20140901, and so on.
Note that the timestamp of logs and the date in the table id do not always match, because there is a time lag between collection and transmission of logs.
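The expansion described above can be sketched as follows. This is a minimal illustration, not the sink's actual implementation: TableIdSketch and Expand are hypothetical names, and the sketch assumes every {…} placeholder holds a custom DateTime.ToString() format string.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class TableIdSketch
{
    // Hypothetical helper (not part of Mamemaki.Slab.BigQuery):
    // replaces each {format} placeholder with now.ToString(format).
    public static string Expand(string tableIdTemplate, DateTime now)
    {
        return Regex.Replace(
            tableIdTemplate,
            @"\{([^}]+)\}",
            m => now.ToString(m.Groups[1].Value, CultureInfo.InvariantCulture));
    }
}

static class TableIdDemo
{
    static void Main()
    {
        // "01" is not a format specifier, so it is copied literally:
        // every event in August 2014 maps to the same monthly table.
        var august = new DateTime(2014, 8, 15, 10, 30, 0, DateTimeKind.Local);
        Console.WriteLine(TableIdSketch.Expand("accesslog{yyyyMM01}", august));
        // prints "accesslog20140801"

        var newYear = new DateTime(2015, 1, 1, 0, 0, 0, DateTimeKind.Local);
        Console.WriteLine(TableIdSketch.Expand("accesslog{yyyyMMdd}", newYear));
        // prints "accesslog20150101"
    }
}
```

A template without braces is returned unchanged, which is why the curly braces are needed for the date to be expanded.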
There is one method to describe the schema of the target table.
- Load a schema file in JSON.
With this method, set tableSchemaFile to the path of the JSON-encoded schema file that you used to create the table on BigQuery. See table schema for details.
Example:
[
  {
    "name": "Timestamp",
    "type": "TIMESTAMP",
    "mode": "REQUIRED"
  },
  {
    "name": "FormattedMessage",
    "type": "STRING",
    "mode": "REQUIRED"
  }
]
BigQuery uses the insertId property to detect duplicate insertion requests (see data consistency in the Google BigQuery documentation).
You can set the insertIdFieldName option to specify the field to use as the insertId property.
If insertIdFieldName is set to %uuid%, a new UUID is generated each time.
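The three cases for insertIdFieldName (not set, %uuid%, or a field name) can be sketched as below. InsertIdSketch and Resolve are hypothetical names used only for illustration, and the row is modelled as a plain dictionary. Returning null when the named field is missing is an assumption, not documented behaviour of the sink.

```csharp
using System;
using System.Collections.Generic;

static class InsertIdSketch
{
    // Hypothetical helper, not the sink's actual code.
    // - insertIdFieldName not set: no insertId (null)
    // - "%uuid%": a fresh UUID on every call
    // - otherwise: the value of the named field in the row
    public static string Resolve(string insertIdFieldName, IDictionary<string, object> row)
    {
        if (string.IsNullOrEmpty(insertIdFieldName))
            return null;

        if (insertIdFieldName == "%uuid%")
            return Guid.NewGuid().ToString();

        object value;
        // Assumption: a missing or null field yields no insertId.
        return row.TryGetValue(insertIdFieldName, out value) && value != null
            ? value.ToString()
            : null;
    }
}

static class InsertIdDemo
{
    static void Main()
    {
        // "RequestId" is an illustrative field name, not one the sink defines.
        var row = new Dictionary<string, object> { { "RequestId", "req-42" } };

        Console.WriteLine(InsertIdSketch.Resolve("RequestId", row)); // prints "req-42"
        Console.WriteLine(InsertIdSketch.Resolve(null, row) == null); // prints "True"
        Console.WriteLine(InsertIdSketch.Resolve("%uuid%", row));     // a different UUID on each call
    }
}
```

Using a field from the row only helps de-duplication if that field is unique per event; %uuid% is unique per call, so a retried request would only be de-duplicated if it reuses the UUID generated the first time.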
When autoCreateTable is set to true, the sink checks whether the table exists before insertion and creates it if it does not. When the table name changes, this sequence runs again.
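The check-then-create sequence can be sketched as follows. This is an illustration under stated assumptions, not the sink's implementation: ITableClient, InMemoryTableClient, and TableEnsurer are hypothetical types, and the in-memory client stands in for the real BigQuery API. The sketch also assumes the check is skipped while the table name stays the same.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the BigQuery table API; not part of the sink.
interface ITableClient
{
    bool Exists(string tableId);
    void Create(string tableId);
}

// In-memory stand-in so the sketch runs without network access.
class InMemoryTableClient : ITableClient
{
    private readonly HashSet<string> tables = new HashSet<string>();
    public int CreateCalls { get; private set; }

    public bool Exists(string tableId) { return tables.Contains(tableId); }
    public void Create(string tableId) { tables.Add(tableId); CreateCalls++; }
}

class TableEnsurer
{
    private readonly ITableClient client;
    private string lastCheckedTableId;

    public TableEnsurer(ITableClient client) { this.client = client; }

    // Runs the check-then-create sequence only when the table id
    // differs from the one checked last time.
    public void EnsureBeforeInsert(string tableId)
    {
        if (tableId == lastCheckedTableId)
            return;

        if (!client.Exists(tableId))
            client.Create(tableId);

        lastCheckedTableId = tableId;
    }
}

static class AutoCreateDemo
{
    static void Main()
    {
        var client = new InMemoryTableClient();
        var ensurer = new TableEnsurer(client);

        ensurer.EnsureBeforeInsert("accesslog20140801"); // created
        ensurer.EnsureBeforeInsert("accesslog20140801"); // same name, skipped
        ensurer.EnsureBeforeInsert("accesslog20140901"); // name changed, created

        Console.WriteLine(client.CreateCalls); // prints "2"
    }
}
```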
To find the value for a BigQuery table field from an EventEntry, the following rules are applied in order:
- Find a matching name in Payloads.
- Find a matching name in the built-in EventEntry properties.
- If neither matches, raise an error if the field mode is "REQUIRED"; otherwise leave the field value unset.
Supported built-in EventEntry properties:
Name | Description | Data type |
---|---|---|
EventId | Event identifier | INTEGER |
EventName | Event name | STRING |
Level | Level of the event | INTEGER |
FormattedMessage | Formatted message for the event | STRING |
Keywords | Keywords for the event | INTEGER |
KeywordsDescription | Human-readable string name for the Keywords property | STRING |
Task | Task for the event | INTEGER |
TaskName | Task name | STRING |
Opcode | Operation code for the event | INTEGER |
OpcodeName | Human-readable string name for the Opcode property | STRING |
Version | Version of the event | INTEGER |
Timestamp | Timestamp of the event | TIMESTAMP |
ProcessId | Process id of the event | INTEGER |
ThreadId | Thread id of the event | INTEGER |
ProviderId | Id of the source originating the event | STRING |
ProviderName | Provider name | STRING |
ActivityId | Activity ID on the thread that the event was written to | STRING |
RelatedActivityId | Identifier of an activity that is related to the activity represented by the current instance | STRING |
NOTE: The data type of the EventEntry value must also match the data type of the BigQuery table field.
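The field lookup rules (payload first, then built-in properties, then an error for REQUIRED fields) can be sketched as below. FieldLookupSketch and Resolve are hypothetical names, and the payload and built-in properties are modelled as plain dictionaries rather than a real EventEntry.

```csharp
using System;
using System.Collections.Generic;

static class FieldLookupSketch
{
    // Hypothetical helper, not the sink's actual code.
    // Resolves the value for one schema field using the documented order:
    // 1. payload item with a matching name
    // 2. built-in EventEntry property with a matching name
    // 3. error if the field is REQUIRED, otherwise no value (null)
    public static object Resolve(
        string fieldName,
        string fieldMode,
        IDictionary<string, object> payload,
        IDictionary<string, object> builtIn)
    {
        object value;
        if (payload.TryGetValue(fieldName, out value))
            return value;

        if (builtIn.TryGetValue(fieldName, out value))
            return value;

        if (fieldMode == "REQUIRED")
            throw new InvalidOperationException(
                "No value found for required field '" + fieldName + "'.");

        return null;
    }
}

static class FieldLookupDemo
{
    static void Main()
    {
        // Values modelled on the Hello("World") event and hello.json schema.
        var payload = new Dictionary<string, object> { { "to", "World" } };
        var builtIn = new Dictionary<string, object>
        {
            { "FormattedMessage", "Hello World!" },
            { "Timestamp", DateTime.UtcNow }
        };

        Console.WriteLine(FieldLookupSketch.Resolve("to", "NULLABLE", payload, builtIn));
        // prints "World"
        Console.WriteLine(FieldLookupSketch.Resolve("FormattedMessage", "REQUIRED", payload, builtIn));
        // prints "Hello World!"
        Console.WriteLine(FieldLookupSketch.Resolve("Missing", "NULLABLE", payload, builtIn) == null);
        // prints "True"
    }
}
```

Because payload items are checked first, a payload item named like a built-in property (for example Level) would take precedence over the built-in value in this sketch.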
- ETW
- EventSource
- SLAB
- Google BigQuery
- fluent-plugin-bigquery
- How to do a Streaming Insert into BigQuery from a .NET app (in Japanese)
Copyright (c) 2015, Tsuyoshi Sumiyoshi and collaborators. All rights reserved