randomravings/avro-dotnet

Avro for .NET

This project is an implementation of the Apache Avro data serialization system.

The implementation is inspired by Apache Avro for C# and the Microsoft Avro library.

Why a new implementation

The primary goal is to combine the best of both aforementioned implementations and achieve feature parity with the Java implementation. The secondary goal is to follow modern C# conventions and style, and to use the latest C# features to modernize the implementation where it makes sense.

Highlights

New API

This implementation is not intended to be backwards compatible with either of the aforementioned implementations.

Logical Types

All logical types defined in the Avro specification 1.9.0 are implemented.

Common interface for Specific and Generic

The Specific (using code bindings) and Generic (without code bindings) variants of the Apache Avro implementation should not be treated as separate implementations. Instead, they are folded under a common interface and offered as optional out-of-the-box implementations for records, protocols, errors, fixed, and enum types. The readers, writers, and resolvers are agnostic about which implementation is used; the resolution is based on the type provided as the generic argument, e.g.:

var reader1 = new DatumReader<MyRecordType>(readerSchema, writerSchema);
var reader2 = new DatumReader<GenericRecord>(readerSchema, writerSchema);

Both readers are passed to the same resolver behind the scenes; the difference is how the instances are instantiated. If the generic argument is a type generated by the AvroGen tool, the resolution recursively looks up named types from the assembly. Otherwise, the generic implementations are used.

LINQ Expressions used for Serde

To avoid excessive type casting and boxing/unboxing of values, the resolution returns lambda functions that are built recursively and compiled before being returned. This is also the approach used in the Microsoft implementation.
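The idea can be illustrated with a minimal sketch using System.Linq.Expressions. It is not this project's resolver: the `Point` class and the `BuildPointReader` helper are hypothetical names chosen for illustration, and a plain `Func<int>` stands in for a decoder. The sketch builds an expression tree that constructs a typed object field by field and compiles it once into a strongly typed delegate, so no value passes through `object`.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical record type, standing in for a class generated from a schema.
public sealed class Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

public static class CompiledReaderSketch
{
    // Builds and compiles the equivalent of:
    //   (Func<int> readInt) => new Point { X = readInt(), Y = readInt() }
    // The bindings are evaluated in declaration order, so X receives the
    // first value read and Y the second.
    public static Func<Func<int>, Point> BuildPointReader()
    {
        // Parameter of the lambda: a delegate that yields the next int.
        ParameterExpression readInt = Expression.Parameter(typeof(Func<int>), "readInt");

        // new Point { X = readInt(), Y = readInt() }
        MemberInitExpression body = Expression.MemberInit(
            Expression.New(typeof(Point)),
            Expression.Bind(typeof(Point).GetProperty(nameof(Point.X)), Expression.Invoke(readInt)),
            Expression.Bind(typeof(Point).GetProperty(nameof(Point.Y)), Expression.Invoke(readInt)));

        // Compile once; the resulting delegate can be cached and reused.
        return Expression.Lambda<Func<Func<int>, Point>>(body, readInt).Compile();
    }
}
```

Calling the compiled delegate with a source of integers, for example the `Dequeue` method of a `Queue<int>` holding 3 and 4, yields a `Point` with `X == 3` and `Y == 4`. A real resolver would presumably compose such expressions recursively for nested types, which this sketch leaves out.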

Roslyn replacing CodeDom

CodeDom is increasingly being replaced by Roslyn for code generation, and this implementation follows that trend.

Async/Await for IPC

Calls to the underlying transport in the Ipc project propagate the async/await pattern to the user. No background forever-loops are implemented; running such a loop is left to the user of the API, so that control over state stays with the caller.
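A caller-owned loop of this kind might look like the following sketch. All names here (`IFrameTransport`, `InMemoryTransport`, `ServerLoopSketch`) are hypothetical and are not the Ipc project's actual types; the point is only that the `await` calls surface to the caller, who decides when the loop starts and stops.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical transport abstraction; an assumption for illustration only.
public interface IFrameTransport
{
    Task<byte[]> ReceiveAsync(CancellationToken token);
    Task SendAsync(byte[] frame, CancellationToken token);
}

// In-memory stand-in so the sketch can run without a network.
public sealed class InMemoryTransport : IFrameTransport
{
    private readonly Queue<byte[]> _incoming;
    public List<byte[]> Sent { get; } = new List<byte[]>();

    public InMemoryTransport(IEnumerable<byte[]> incoming)
    {
        _incoming = new Queue<byte[]>(incoming);
    }

    public Task<byte[]> ReceiveAsync(CancellationToken token)
    {
        // Simulates the connection closing once all frames are consumed.
        if (_incoming.Count == 0) throw new OperationCanceledException();
        return Task.FromResult(_incoming.Dequeue());
    }

    public Task SendAsync(byte[] frame, CancellationToken token)
    {
        Sent.Add(frame);
        return Task.CompletedTask;
    }
}

public static class ServerLoopSketch
{
    // The caller owns this loop and its lifetime via the cancellation token.
    // Returns the number of requests handled before the loop ended.
    public static async Task<int> RunAsync(
        IFrameTransport transport, Func<byte[], byte[]> handler, CancellationToken token)
    {
        int handled = 0;
        while (!token.IsCancellationRequested)
        {
            byte[] request;
            try
            {
                request = await transport.ReceiveAsync(token);
            }
            catch (OperationCanceledException)
            {
                break; // cancelled or transport closed
            }
            await transport.SendAsync(handler(request), token);
            handled++;
        }
        return handled;
    }
}
```

With an `InMemoryTransport` holding two frames and an echo handler, `RunAsync` handles both requests and then returns when the transport reports that it is closed.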

Procedural Code style

The implementation leans towards a more procedural style for modularity.

Writer schema resolution

This is not an official Apache Avro feature, but it should be possible to evolve writer schemas while retaining the target schema.

AvroGen extended

The AvroGen tool for code bindings is being extended to add more options in order to enhance user experience:

  • Add folder and style options to CLI syntax.
  • Option to read single file or files from a folder filtered by extension.
  • Option to create a .NET Core project.
  • Options that dictate code binding style (not currently implemented).

Features to be implemented

  • JSON serde
  • Code generation for protocol
  • Bring-Your-Own logical type
  • Attributes on properties to be used for resolution, allowing for mangling and selected casing of code-generated fields.
  • Block Reader enabling parallel reads of Object Containers.
  • Mangling for reserved words where appropriate.
  • SchemaBuilder API.
