Gravity

A reverse proxy and load balancer written in C# and running on OWIN

Status

Gravity has reached MVP and has received some testing. If you are interested in using this code, please beta test it and let me know what you find.

Features

This is an OWIN based service. The source code as-is will run on IIS but you can easily customize it to run on Apache, self host (.exe) etc as supported by OWIN.

I wrote this load-balancer because Nginx is not flexible or customizable enough and provides very little visibility into what's happening within the load balancer.

Gravity can do most of the things that Nginx can do. The big difference is that Gravity is a request pipeline comprising nodes that can be combined into a request processing graph. Requests enter at the left side of the graph, progress through the nodes and reach a server at the right hand side. Requests and responses stream through the graph in both directions simultaneously. Many different types of graph node are provided and each one is configurable, resulting in almost infinite possibilities.

Dashboard

The Gravity server contains dashboard pages that are refreshed periodically. With the test configuration that is checked into the source code, a dashboard looks like the screenshot below.

You can configure a graph of nodes that define how requests are processed and link these nodes together in arbitrary ways. Any node can have its disabled property set to temporarily turn it off. This can be useful, for example, when you want to remove a server from the load balancer for maintenance, or put up a maintenance message when the whole site is down for maintenance.

The nodes draw in a dimmed grey color when they are offline because health checks failed. Active nodes are color coded by their function to make the diagram easier to understand at a glance.

The lines on the drawing can be:

black - when the traffic is not being measured at this point
red - when there is no traffic on this link
thin green - when there is light traffic
medium green - when there is moderate traffic
thick green - when there is high traffic

You can make as many dashboards as you like and define what is shown on each one. The dashboard configuration also defines the thresholds for changing the thickness of lines, so that dashboards displaying low traffic areas can use lower threshold values.
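The mapping from measured traffic to line style can be pictured with a small Python sketch. This is only an illustration of the rules listed above, not Gravity's C# code; the function name `link_style`, the two threshold parameters and the use of requests per second as the unit are all assumptions made for the example.

```python
def link_style(requests_per_second, medium_threshold, heavy_threshold):
    """Pick a line style for one link on a dashboard.

    requests_per_second is None when traffic is not measured at this point.
    The two thresholds are hypothetical per-dashboard settings.
    """
    if requests_per_second is None:
        return "black"          # traffic is not being measured here
    if requests_per_second == 0:
        return "red"            # measured, but no traffic on this link
    if requests_per_second < medium_threshold:
        return "thin green"
    if requests_per_second < heavy_threshold:
        return "medium green"
    return "thick green"


# The same traffic drawn on a busy dashboard and on a low traffic dashboard.
samples = (None, 0, 40, 400, 4000)
busy_dashboard = [link_style(r, 100, 1000) for r in samples]
quiet_dashboard = [link_style(r, 5, 50) for r in samples]
```

With the lower thresholds, the link carrying 40 requests per second is drawn as medium green instead of thin green, which is the reason for making thresholds a per-dashboard setting.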

Listener Node

Defines an IP address and port number to process requests for. Also supports a wildcard that matches all IP addresses and/or all ports. If running on IIS these IP addresses and ports must be configured in the IIS bindings.

Each listener measures the request rate and average request processing time.

Server Node

Defines a server to send requests to. You can specify an IP address or DNS name, and when a DNS name is specified it can resolve to multiple IP addresses. The server health checks all IP addresses and distributes requests across all the healthy ones. This means that you can bind your web server to multiple IP addresses on different networks and if one network fails the traffic will all be routed to the one that still works.

Each server node measures the request rate and average request processing time for each IP address.

If all IP addresses for the server are unhealthy then the server is taken offline. When the server is offline all upstream nodes are also taken offline so that for example if you put a round robin node in front of multiple servers then any failing servers will not be sent requests by the round robin.
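As a rough Python sketch of the behaviour described above (not the actual C# implementation), a server node can be modelled as a set of IP addresses with a health flag each. The class name `ServerSketch`, its methods, and the choice of simple rotation across healthy addresses are assumptions for illustration; the text above only says that requests are distributed across the healthy addresses.

```python
import itertools


class ServerSketch:
    """Hypothetical model of a server node with several IP addresses."""

    def __init__(self, addresses):
        # Every address starts out healthy until a health check says otherwise.
        self.health = {address: True for address in addresses}
        self._counter = itertools.count()

    def mark(self, address, healthy):
        """Record the result of a health check for one address."""
        self.health[address] = healthy

    @property
    def online(self):
        # The server is offline only when every address is unhealthy.
        return any(self.health.values())

    def pick_address(self):
        """Return a healthy address, rotating between them, or None if offline."""
        healthy = [a for a, ok in self.health.items() if ok]
        if not healthy:
            return None
        return healthy[next(self._counter) % len(healthy)]


server = ServerSketch(["10.0.0.5", "192.168.0.5"])
server.mark("10.0.0.5", False)          # one network fails
remaining = server.pick_address()       # traffic goes to the other address
```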

Router Node

Matches requests and directs traffic to different paths through the load balancer. For example you can route requests to different pools of servers based on the host header, path, query string etc. You can also route GET requests to a larger pool of servers than PUT, POST and DELETE requests.

This node is also useful for segregating traffic from specific IP address ranges and handling it differently, for example more detailed logging or specific fixed responses. The router can match CIDR blocks and the list of CIDR blocks can be configured directly into the node or contained in a text file.

Note that this node matches requests, not responses. You cannot route traffic based on the response because the traffic must have already been routed to a server in order to have a response.

Transform Node

Transforms requests that pass through the node using a scripting language. The transformation script can be configured directly into the node or contained in a text file.

The transform supports multiple script languages. In all cases the script is compiled into a form that is very efficient for execution.

One script language was borrowed from the URL Rewrite project. It is a copy of the same code and provides the same features. See the readme file in my UrlRewrite.Net repo for a detailed definition of the rule syntax. Rules can be applied to the incoming request, the outbound response or both.

Another scripting language uses regular expressions to modify the body of the request. This script language is under development ... watch this space.

Response Node

Returns a fixed response. This is useful for serving static content, returning 404, putting up a maintenance message when your website is down for updates etc.

CORS Node

Implements the logic required to support CORS requests - for example if you have web services in sub-domains that need to be called from your main site domain.

Internal Node

Any requests routed to this node will be passed to the rest of the OWIN pipeline and handled within the load balancer. You can use this to expose the UI that the load balancer contains for checking the configuration and monitoring traffic flow through the load balancer.

Round Robin Node

These nodes are configured with a list of output nodes and forward requests in a round robin fashion. Any output whose downstream is offline will not have requests sent to it, so that requests are never sent to dead servers.
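A minimal Python sketch of round robin selection that skips offline outputs is shown here. It illustrates the idea only; the `Output` and `RoundRobinSketch` names are hypothetical and this is not how the Gravity source is organised.

```python
from dataclasses import dataclass


@dataclass
class Output:
    """Hypothetical stand-in for a downstream node."""
    name: str
    online: bool = True


class RoundRobinSketch:
    def __init__(self, outputs):
        self.outputs = list(outputs)
        self._next = 0  # index of the output to try first on the next request

    def choose(self):
        """Return the next online output in rotation, or None if all are offline."""
        for _ in range(len(self.outputs)):
            candidate = self.outputs[self._next]
            self._next = (self._next + 1) % len(self.outputs)
            if candidate.online:
                return candidate
        return None


node = RoundRobinSketch([Output("A"), Output("B", online=False), Output("C")])
picks = [node.choose().name for _ in range(4)]  # B is skipped on every pass
```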

Sticky Session Node

These nodes look for a session cookie in the request and always send requests with a specific session ID to the same server. When a new session ID is encountered the request is sent to the server with the least number of active connections. The session cookie that is set by the server is captured from the response so that this client will be routed to the same server again on the next request.
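The routing decision can be sketched in Python as follows. This is an illustrative model under stated assumptions: the class name `StickySessionSketch`, the in-memory dictionaries and the method names are invented for the example, and cookie parsing is left out entirely.

```python
class StickySessionSketch:
    """Hypothetical model of sticky session routing."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}  # active connections per server
        self.sessions = {}                           # session ID -> server name

    def choose(self, session_id):
        """Pick a server for a request carrying session_id (None if no cookie)."""
        if session_id in self.sessions:
            return self.sessions[session_id]
        # Unknown or missing session: use the server with the fewest connections.
        return min(self.active, key=self.active.get)

    def remember(self, session_id, server):
        """Called when a response sets the session cookie."""
        self.sessions[session_id] = server


node = StickySessionSketch(["S1", "S2"])
node.active.update({"S1": 3, "S2": 1})
first = node.choose(None)        # no cookie yet, so least connections wins
node.remember("abc", first)      # the server's response set session "abc"
node.active["S2"] = 10           # S2 is now the busiest server
again = node.choose("abc")       # still routed to the remembered server
newcomer = node.choose("xyz")    # a new session goes to the quieter server
```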

Least Connections Node

These nodes have a list of nodes to send requests to next and always send requests to the output that has the least number of active connections.

Log Level Node

These nodes change the level of detail in the log. This allows you to capture more detail for requests that are passing through a specific portion of the pipeline.

Custom Log Node

These nodes capture information about specific request types and write a log file to disk. You can specify which request methods to capture (POST, GET etc) or capture all methods, and you can specify the response codes to capture (503, 200, 404 etc) or capture all response codes. If you want to be more specific about what to capture please configure a Router Node in front of the Custom Log.
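The capture rule amounts to two independent filters, which can be sketched in Python like this. The function name `should_log` and its parameters are hypothetical, and `None` is used here to stand for "capture all"; the real configuration options are documented in the Gravity.Server\Configuration classes.

```python
def should_log(method, status, methods=None, statuses=None):
    """Decide whether a request/response pair is written to the custom log.

    methods and statuses are the configured filters; None means capture all.
    """
    method_ok = methods is None or method.upper() in methods
    status_ok = statuses is None or status in statuses
    return method_ok and status_ok


# Capture only failed writes.
failed_post = should_log("POST", 503, methods={"POST", "PUT"}, statuses={500, 503})
ok_get = should_log("GET", 200, methods={"POST", "PUT"}, statuses={500, 503})
```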

Compiling and running

Install Visual Studio version 2013 or later.
Download the source code.
Run the 'restore.cmd' command script.
Open the solution in Visual Studio.
Press F5.
You should see the dashboard.

Installing

To run the load balancer from the original source code you need to configure a website in IIS. If you want to handle HTTPS then you need to install the HTTPS certificate into IIS as well. To run on another hosting platform you will need to make some minor changes to the .Net project.

Copy files from the Gravity.Server folder to your production server as follows:

web.config - no changes are required in this file.
config.json - you need to put your load balancer configuration in this file.
bin\*.dll - you only need to copy the .dll files from the bin folder.

Point IIS to the root folder (where web.config is located).

Set the IIS AppPool to .Net 4.5 Integrated mode.

Configuring

I will write some detailed documentation later. For now please see the classes in the Gravity.Server\Configuration folder where there are documentation comments describing the configuration options.

Router Node

A simple Router Node configuration might look like this:

{
  "name": "N",
  "routes": [
     {
        "to": "B",
        "logic": "All",
        "groups": [
          {
            "logic": "All",
            "conditions": [
              { "condition": "{method} = GET" },
              { "condition": "{path[1]} = ui" },
              { "condition": "{ipv4} = loopback" }
            ]
          },
          {
            "logic": "Any",
            "conditions": [
              { "condition": "{header[host]} = localhost:52581" },
              { "condition": "{header[host]} = gravity.localhost" }
             ]
          }
        ]
     },
     {
       "to": "G",
       "logic": "All",
       "conditions": [
           { "condition": "{method} = GET" },
           { "condition": "{path[0]} = /favicon.ico" }
       ]
     },
     {
       "to": "C"
     }
  ]
}

The Router Node must have a name. The name should be kept short and should not contain any spaces or special characters. When the node is displayed on a dashboard you can choose a display label for the node, so this node name is not normally shown anywhere; it is used to connect nodes together.

The router must have a list of routes. These are the other nodes in the graph where the router can route requests. Each route must have a 'to' field that is the name of the node to route traffic to. Everything else is optional.

The router evaluates the routes in the order that they are written and finds the first one that matches the request and is online. Online means that there is a downstream path to a node that produces responses. If there is no way to get a response from a given route then that route will be skipped and the logic will move on to the next route in the list. This is useful, for example, for displaying maintenance pages when the system is down for maintenance.
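The "first route that matches and is online" rule can be sketched in a few lines of Python. This is an illustration only: representing each route as a dictionary with a `matches` callable and an `online` flag is an assumption made for the example, not the shape of Gravity's configuration or code.

```python
def choose_route(routes, request):
    """Return the 'to' node of the first route that matches and is online."""
    for route in routes:
        if route["matches"](request) and route["online"]:
            return route["to"]
    return None  # nothing matched, or every matching route is offline


# Both routes match every request, but the servers behind SITE are all down,
# so traffic falls through to the maintenance page.
routes = [
    {"to": "SITE", "matches": lambda request: True, "online": False},
    {"to": "MAINTENANCE", "matches": lambda request: True, "online": True},
]
target = choose_route(routes, {"method": "GET", "path": "/"})
```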

Groups

Groups are a way of combining expressions. Each group has a 'logic' property that defines how the expressions are combined. The options for the 'logic' property are:

All - meaning all expressions must be true for the group to have a true value.
Any - meaning at least one of the expressions must be true for the group to have a true value.
None - meaning that all expressions must be false for the group to have a true value.

The expressions inside the group can either be conditions or nested groups or both. The route is a special kind of group that also has logic, groups and conditions, but additionally has a to property.
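Because groups can nest, evaluation is naturally recursive. The Python sketch below illustrates this; it is not the Gravity implementation. The function name `evaluate_group` and the `test_condition` callback are hypothetical, and treating a missing 'logic' property as All is an assumption.

```python
def evaluate_group(group, test_condition):
    """Evaluate a group (or route) made of conditions and nested groups.

    test_condition is a callable that decides whether one condition is true.
    """
    results = [test_condition(c) for c in group.get("conditions", [])]
    results += [evaluate_group(g, test_condition) for g in group.get("groups", [])]

    logic = group.get("logic", "All")  # assumed default
    if logic == "All":
        return all(results)
    if logic == "Any":
        return any(results)
    if logic == "None":
        return not any(results)
    raise ValueError(f"unknown logic: {logic}")


route = {
    "to": "B",
    "logic": "All",
    "groups": [
        {"logic": "All", "conditions": [{"condition": "{method} = GET"},
                                        {"condition": "{path[1]} = ui"}]},
        {"logic": "Any", "conditions": [{"condition": "{header[host]} = localhost:52581"},
                                        {"condition": "{header[host]} = gravity.localhost"}]},
    ],
}

# Pretend these conditions are the ones that hold for the current request.
true_now = {"{method} = GET", "{path[1]} = ui", "{header[host]} = gravity.localhost"}
matched = evaluate_group(route, lambda c: c["condition"] in true_now)
```

In this sketch a group with no conditions evaluates to true under All, which is consistent with a route that has only a 'to' field acting as a catch-all.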

Conditions

Each condition is an expression and some flags, for example:

{ "condition": "{ipv4} = 192.168.0.0/16", "negate": false, "disabled": false }

Conditions can be inverted by setting the negate property to true, and they can be temporarily removed from the logic by setting their disabled property to true.
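One way to picture the two flags is shown in this Python sketch. It is only an illustration: the function name `apply_flags` and the use of `None` to mean "left out of the group logic" are assumptions for the example.

```python
def apply_flags(condition, raw_result):
    """Apply the negate and disabled flags to the raw result of a condition.

    Returns None for a disabled condition so the caller can leave it out
    of the group logic altogether.
    """
    if condition.get("disabled", False):
        return None
    if condition.get("negate", False):
        return not raw_result
    return raw_result


condition = {"condition": "{ipv4} = 192.168.0.0/16", "negate": True, "disabled": False}
outside_lan = apply_flags(condition, raw_result=False)  # address not in the block
```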

The condition property consists of two expressions separated by an = sign. The = sign can be prefixed with another symbol whose meaning varies with the type of value that is being compared as follows:

For string comparisons, the comparison operators are:

= meaning that the two expressions evaluate to the same case insensitive string.
<= meaning that the left expression matches the beginning of the right hand expression.
>= meaning that the left expression matches the end of the right hand expression.
!= meaning that the two expressions evaluate to different case insensitive strings.
~= meaning that the left expression is contained within the right hand expression.
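The string operators map onto familiar string functions, as this Python sketch shows. It is an illustration rather than Gravity's code: the function name `compare_strings` is hypothetical, and ignoring case for the <=, >= and ~= operators is an assumption, since the text above only states case insensitivity for = and !=.

```python
def compare_strings(left, operator, right):
    """Evaluate one string comparison using the operators listed above."""
    l, r = left.lower(), right.lower()  # case is ignored (assumed for all operators)
    if operator == "=":
        return l == r
    if operator == "!=":
        return l != r
    if operator == "<=":
        return r.startswith(l)   # left matches the beginning of right
    if operator == ">=":
        return r.endswith(l)     # left matches the end of right
    if operator == "~=":
        return l in r            # left is contained within right
    raise ValueError(f"unknown operator: {operator}")


checks = [
    compare_strings("GET", "=", "get"),
    compare_strings("/api", "<=", "/api/users"),
    compare_strings(".png", ">=", "logo.PNG"),
    compare_strings("admin", "~=", "/site/admin/login"),
    compare_strings("ui", "!=", "api"),
]
```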

For number comparisons, the comparison operators are:

= meaning that the two expressions evaluate to the same numeric value.
<= meaning that the left expression is numerically less than or equal to the right hand expression.
>= meaning that the left expression is numerically greater than or equal to the right hand expression.
!= meaning that the two expressions evaluate to different numeric values.

For IP address comparisons the prefix is ignored:

= meaning that the left expression is an IP address contained in the CIDR block of the right hand expression.

Expressions

The expressions in conditions are taken as literal values by default.

You can enclose the expression in curly braces to extract values from the incoming request as follows:

{path[0..n]} - retrieves an element from the path. {path[1]} is the first path element, {path[2]} is the second element etc. {path[-1]} is the last path element, {path[-2]} is the second to last etc.

{path} - retrieves the entire path including the leading / character.

{header[name]} - retrieves a header from the request, for example {header[Accept]} returns the Accept header from the request.

{query[name]} - retrieves a query string parameter, for example the expression {query[page]} for a request url of https://mydomain.com/news/current?page=3 would return the value 3.

{null} - retrieves a null value. This can be used to test for the absence of something. For example the condition {query[page]} = {null} is true for any request that does not have a page parameter in the query string.

{method} - retrieves the request method. The returned value will be POST, GET, PUT, DELETE, OPTIONS etc.

{ipv4} - retrieves the IPv4 address that the request came from.

{ipv6} - retrieves the IPv6 address that the request came from.

When using {ipv4} or {ipv6} you can use some special notation for the right side expression. The right side can be one of a small number of reserved words, an IPv4 or IPv6 address, or an IPv4 or IPv6 CIDR block. Note that you should not mix IPv4 and IPv6; in other words you cannot have {ipv4} as the left hand expression and then put an IPv6 CIDR block in the right hand expression.
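The {path[n]} and {query[name]} lookups can be sketched in Python as follows. This only illustrates the indexing rules described above and is not Gravity's parser; the function names are hypothetical, returning `None` plays the role of {null}, and index 0 is deliberately left undefined because its behaviour is not described in this list.

```python
from urllib.parse import urlsplit, parse_qs


def path_element(path, index):
    """Return a path element: 1 is the first, -1 is the last, None if absent."""
    elements = [e for e in path.split("/") if e]
    if index > 0:
        position = index - 1
    elif index < 0:
        position = len(elements) + index
    else:
        return None  # index 0 is not covered by this sketch
    if 0 <= position < len(elements):
        return elements[position]
    return None


def query_value(url, name):
    """Return the first value of a query string parameter, or None if absent."""
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else None


url = "https://mydomain.com/news/current?page=3"
first = path_element(urlsplit(url).path, 1)
last = path_element(urlsplit(url).path, -1)
page = query_value(url, "page")
missing = query_value(url, "sort")
```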

The reserved words are:

'loopback' means the IPv4 or IPv6 loopback address (which are 127.0.0.1 and ::1 respectively) as appropriate. For example the condition {ipv6} = loopback returns true if the source address for the request is ::1 or 127.0.0.1, and {ipv4} = loopback means exactly the same thing but is marginally more efficient for IPv4 addresses.

'link' means the IPv6 local link network. This only works for IPv6 addresses.

'site' means the local network. This works for both IPv4 and IPv6 addresses. For example the condition {ipv4} = site returns true for source addresses of 192.168.3.56 and 10.4.56.1.

An expression can also be a list of values separated by commas and enclosed in square brackets. For example the condition

{
    "condition": "{ipv4} = [192.168.3.0/24, 127.0.0.1]",
    "negate": false,
    "disabled": false
}

will match any IPv4 address in the range 192.168.3.0 to 192.168.3.255 or the loopback address 127.0.0.1.
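Matching an address against a list of addresses and CIDR blocks can be sketched with Python's standard ipaddress module. This illustrates the matching rule only and is not Gravity's implementation; the function names `parse_list` and `matches_any` are hypothetical.

```python
import ipaddress


def parse_list(expression):
    """Split a '[a, b, c]' list expression into its trimmed items."""
    return [item.strip() for item in expression.strip()[1:-1].split(",")]


def matches_any(address, entries):
    """True if address equals, or falls inside, any address or CIDR block."""
    ip = ipaddress.ip_address(address)
    # A bare address such as 127.0.0.1 is treated as a block of one address.
    return any(ip in ipaddress.ip_network(entry, strict=False) for entry in entries)


entries = parse_list("[192.168.3.0/24, 127.0.0.1]")
inside = matches_any("192.168.3.200", entries)
loopback = matches_any("127.0.0.1", entries)
outside = matches_any("192.168.4.1", entries)
```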

An expression can also be a file name enclosed in round brackets. This will load the file as a list of literal values and behaves in the same way as the square bracket case except that each value in the list is one line from the file.

When loading the file, blank lines are ignored and anything after a # symbol is discarded. Whitespace at the start or end of the line is also discarded so that you can indent lines for readability. If you want to include a # symbol in your data you must prefix it with a \ escape character, and if you want to include a \ character in your data you must escape it also by using \\.
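These file rules are easy to get subtly wrong, so here is a minimal Python sketch of a parser that follows them. It is an illustration under stated assumptions rather than the code Gravity uses; the function names are hypothetical, and a backslash followed by any other character is simply kept as-is because the text above does not say what should happen in that case.

```python
def parse_list_line(line):
    """Return the value on one line of a list file, or None if there is none."""
    kept = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line) and line[i + 1] in "#\\":
            kept.append(line[i + 1])  # escaped # or \ is kept as data
            i += 2
            continue
        if ch == "#":
            break                     # unescaped # starts a comment
        kept.append(ch)
        i += 1
    value = "".join(kept).strip()     # leading/trailing whitespace is discarded
    return value or None


def parse_list_text(text):
    """Return the list of values in a file, skipping blank and comment lines."""
    parsed = (parse_list_line(line) for line in text.splitlines())
    return [value for value in parsed if value is not None]


sample = """# This is a list of IPv4 addresses to block
123.4.5.6       # Caught spamming June 12 2019

    99.98.97.0/24   # Cloud provider hosting botnets
"""
values = parse_list_text(sample)
```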

For example you can create files called badguys_ipv4.txt, badguys_ipv6.txt, goodguys_ipv4.txt and goodguys_ipv6.txt in the root folder of the website, then route good and bad guys to different paths with a router configured like this:

"routers": [
  {
    "name": "R",
    "routes": [
      {
        "to": "GOOD",
        "logic": "Any",
        "conditions": [
          { "condition": "{ipv4} = (~\\goodguys_ipv4.txt)" },
          { "condition": "{ipv6} = (~\\goodguys_ipv6.txt)" }
        ]
      },
      {
        "to": "BAD",
        "logic": "Any",
        "conditions": [
          { "condition": "{ipv4} = (~\\badguys_ipv4.txt)" },
          { "condition": "{ipv6} = (~\\badguys_ipv6.txt)" }
        ]
      },
      {
        "to": "OTHER"
      }
    ]
  }
]

This router will route all requests from bad guys to the BAD node, all good guys to the GOOD node and everyone else to the OTHER node.

The badguys_ipv4.txt file could look like this:

# This is a list of IPv4 addresses to block
123.4.5.6       # Caught spamming June 12 2019
99.98.97.0/24   # Cloud provider hosting botnets
