Transformalize

Intro

This tool expedites mundane data processing tasks like reporting and denormalization.

It works with many data sources:

Provider Input Output
Microsoft SQL Server
MySql
PostgreSql
SQLite
Files
Web
Elasticsearch
SOLR
Lucene

Configuration

Jobs are designed in XML or JSON. They are executed with a provided CLI.

Hello World:

<add name="Process">
    <entities>
        <add name="Entity">
            <rows>
                <add Noun="World" />
            </rows>
            <fields>
                <add name="Noun" output="false" />
            </fields>
            <calculated-fields>
                <add name="Greeting" t="copy(Noun).format(Hello {0})" />
            </calculated-fields>
        </add>
    </entities>
</add>

Save the above in HelloWorld.xml and run like this:

> tfl -a HelloWorld.xml
Greeting
Hello World

The CSV output includes the column header, Greeting, and a single row, Hello World.
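To make the transform chain concrete, here is a rough Python sketch of what `copy(Noun).format(Hello {0})` appears to do for each row. It is not how Transformalize is implemented; the function names `greet` and `to_csv` are hypothetical, and the behavior is inferred only from the arrangement and output shown above.

```python
import csv
import io


def greet(rows):
    """Mimic copy(Noun).format(Hello {0}) for each input row (illustrative only)."""
    out = []
    for row in rows:
        noun = row["Noun"]                   # copy(Noun): take the field's value
        greeting = "Hello {0}".format(noun)  # format(Hello {0}): substitute it
        out.append({"Greeting": greeting})   # Noun has output="false", so it is omitted
    return out


def to_csv(rows, fieldnames):
    """Render rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # The inline <rows> element supplies a single row with Noun="World".
    print(to_csv(greet([{"Noun": "World"}]), ["Greeting"]), end="")
```

The sketch mirrors the arrangement's three parts: inline rows as input, a field that is read but not written, and a calculated field that becomes the only output column.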

Hello Planets

Hello Planets demonstrates reading from a file called Planets.csv. It contains:

Planet,Distance,Year,Mass,Day,Diameter,Gravity
Mercury,0.39,0.24,0.055,1407.6,3.04,0.37
Venus,0.72,0.61,0.815,5832.2,7.52,0.88
Earth,1,1,1,24.0,7.92,1
...

Here is the arrangement we start with:

<add name="Process">
    <connections>
        <add name="input" provider="file" file="c:\temp\Planets.csv" />
    </connections>
</add>

The arrangement above defines only the connection. It still needs an entity with fields, but I don't want to type those by hand, so I use the CLI:

c:\> tfl -a c:\temp\HelloPlanets.xml -m check

Check mode detects and returns the schema so I can add it to my arrangement.

<add name="Process">
    <connections>
        <add name="input" provider="file" file="c:\temp\Planets.csv" />
    </connections>
    <entities>
        <add name="input">
            <fields>
                <add name="Planet" length="8" />
                <add name="Distance" length="6" />
                <add name="Year" length="7" />
                <add name="Mass" length="7" />
                <add name="Day" length="7" />
                <add name="Diameter" length="5" />
                <add name="Gravity" length="5" />
            </fields>
        </add>
    </entities>
</add>
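For intuition about the kind of work check mode saves, the following Python sketch reads a CSV header and measures the widest value in each column, then prints `<add name="..." length="..." />` lines. This is an assumption about the general idea, not the tool's actual detection rule: `infer_fields` and `to_fields_xml` are hypothetical names, only the three rows shown in the excerpt are used, and the widths it computes are smaller than the lengths Transformalize reported above.

```python
import csv
import io

# Only the rows visible in the excerpt; the real file continues past Earth.
SAMPLE = """Planet,Distance,Year,Mass,Day,Diameter,Gravity
Mercury,0.39,0.24,0.055,1407.6,3.04,0.37
Venus,0.72,0.61,0.815,5832.2,7.52,0.88
Earth,1,1,1,24.0,7.92,1
"""


def infer_fields(text):
    """Return (name, max_width) pairs from CSV text whose first line is a header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    widths = [0] * len(header)
    for row in reader:
        # Ignore any extra cells beyond the header's column count.
        for i, value in enumerate(row[:len(header)]):
            widths[i] = max(widths[i], len(value))
    return list(zip(header, widths))


def to_fields_xml(fields):
    """Format the inferred fields as a <fields> fragment (names are not XML-escaped)."""
    lines = ["<fields>"]
    for name, width in fields:
        lines.append('    <add name="{0}" length="{1}" />'.format(name, width))
    lines.append("</fields>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_fields_xml(infer_fields(SAMPLE)))
```

Letting the tool generate the schema keeps the field names in step with the file's header and avoids hand-counting column widths.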

to be continued...

NOTE: This is the 2nd implementation. To find out more about Transformalize, read the article I posted to Code Project (based on the 1st implementation).
