Reimerei/port_experiments

Benchmarking Erlang ports

This is a quick and dirty approach for a rough estimate. Much of the code was taken from this howto.

Assumptions

  • pool of running executables
  • executables echo payload without performing any real computation
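
As an illustration of the second assumption, here is a minimal sketch of an echo executable written in Python 3. It is not the repository's priv/echo.py. It assumes the port is opened with a 4-byte big-endian length prefix (as with the {packet, 4} port option); the actual framing used by the benchmark may differ. The names read_exact and echo_loop are made up for this example.

```python
import struct

# Assumption: each message is preceded by a 4-byte big-endian length,
# which is what an Erlang port opened with {packet, 4} sends and expects.
HEADER = struct.Struct(">I")

def read_exact(stream, n):
    """Read exactly n bytes from stream, or return None on EOF."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            return None
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def echo_loop(inp, out):
    """Echo every length-prefixed message from inp to out unchanged.

    Returns the number of messages echoed once inp reaches EOF.
    """
    count = 0
    while True:
        header = read_exact(inp, HEADER.size)
        if header is None:
            return count
        (length,) = HEADER.unpack(header)
        payload = read_exact(inp, length)
        if payload is None:
            return count
        # No computation is performed: the payload goes straight back out.
        out.write(header + payload)
        out.flush()
        count += 1
```

To use it as a port program, one would call echo_loop(sys.stdin.buffer, sys.stdout.buffer) so that both streams are handled as raw bytes.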

Build and start

git clone git://github.com/odo/port_experiments

make start

Usage

The only function you need to call is benchmark/4:

benchmark(PoolSize, PayloadLength, Samples, Parallel)

  • PoolSize: number of executables to start
  • PayloadLength: number of bytes in the payload going in and out
  • Samples: number of samples to take
  • Parallel: number of clients accessing the executables
1> echo:benchmark(10, 13 * 1024, 10000, 5).
{{10,13312,10000,5},53217.815195814954,708435555.8866887}

The return value is a tuple of the arguments, the operations per second and the bytes per second.
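
In the example above, the two rates are related through the payload size: the bytes per second equal the operations per second times PayloadLength. A small Python check, using only the numbers from the example call (the helper name bytes_per_second is made up here):

```python
import math

# Result tuple copied from the example call above:
# ({PoolSize, PayloadLength, Samples, Parallel}, ops/s, bytes/s)
args = (10, 13312, 10000, 5)
ops_per_second = 53217.815195814954
reported_bytes_per_second = 708435555.8866887

def bytes_per_second(ops, payload_length):
    """Byte throughput implied by an operation rate and a payload size."""
    return ops * payload_length

computed = bytes_per_second(ops_per_second, args[1])
# True if the reported figure matches ops/s * PayloadLength
print(math.isclose(computed, reported_bytes_per_second, rel_tol=1e-9))
```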

If you get "exec: ./priv/echo.py: not found", check that your interpreter matches the one named in priv/echo.py.

Results

  • Erlang/OTP 18

  • python2.7

  • FreeBSD 10

  • 2 * Intel Xeon CPU E3-1230 v3 @ 3.30GHz = 8 cores

  • 32 GB RAM

  • 13 kB payload

  • 100 clients

  • 1000 samples

  • What is a good number of executables?

f(Res), Res = [ echo:benchmark(NoEx, 13 * 1024, 1000, 100) || NoEx <- lists:seq(1, 20)].
[{{1,13312,1000,100}, 15000.825045377496,199690983.00406522},
 {{2,13312,1000,100}, 20496.843486103142,272853980.487005},
 {{3,13312,1000,100}, 25366.546598346104,337679468.3171833},
 {{4,13312,1000,100}, 27126.00027126,    361101315.6110131},
 {{5,13312,1000,100}, 27899.450380827497,371397483.46957564},
 {{6,13312,1000,100}, 24541.684050359538,326698898.0783861},
 {{7,13312,1000,100}, 22465.85190510424, 299065420.5607476},
 {{8,13312,1000,100}, 19430.30350134069, 258656200.2098473},
 {{9,13312,1000,100}, 23129.944025535457,307905814.867928},
 {{10,13312,1000,100},19299.058205959547,256909062.8377335},
 {{11,13312,1000,100},20398.17232375979, 271540469.97389036},
 {{12,13312,1000,100},19801.9801980198,  263603960.39603958},
 {{13,13312,1000,100},17841.84984299172, 237510705.10990578},
 {{14,13312,1000,100},18253.504672897197,242990654.20560747},
 {{15,13312,1000,100},17320.816156857312,230574704.68008453},
 {{16,13312,1000,100},23150.291693675343,308176683.02620614},
 {{17,13312,1000,100},17031.422975389592,226722302.64838627},
 {{18,13312,1000,100},21585.68437412308, 287348630.38832647},
 {{19,13312,1000,100},13754.59059461095, 183101109.995461},
 {{20,13312,1000,100},17172.983462416927,228606755.85169414}]
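
The list above is easier to read once the peak is extracted. The following Python sketch copies the pool sizes and ops/s figures from that run (rounded to one decimal) and looks up the best pool size and its speedup over a single executable. The function name best_pool_size is made up for this example.

```python
# (PoolSize, ops/s) copied from the run above, ops/s rounded to one decimal
pool_results = [
    (1, 15000.8), (2, 20496.8), (3, 25366.5), (4, 27126.0), (5, 27899.5),
    (6, 24541.7), (7, 22465.9), (8, 19430.3), (9, 23129.9), (10, 19299.1),
    (11, 20398.2), (12, 19802.0), (13, 17841.8), (14, 18253.5), (15, 17320.8),
    (16, 23150.3), (17, 17031.4), (18, 21585.7), (19, 13754.6), (20, 17173.0),
]

def best_pool_size(results):
    """Return the (pool_size, ops_per_second) pair with the highest rate."""
    return max(results, key=lambda pair: pair[1])

best_size, best_ops = best_pool_size(pool_results)
single_ops = dict(pool_results)[1]
print(best_size, best_ops, round(best_ops / single_ops, 2))
```

Since each pool size was measured only once, neighbouring values differ by several thousand ops/s, so the exact position of the peak should be read as approximate.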

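In this run throughput peaks at about 27,900 ops/s (371 MB/s) with 5 executables. With 9 executables (roughly the number of CPU cores) it is about 23,100 ops/s (308 MB/s), and the results are noisy beyond that. Let's stay with 10 and see how payload size impacts performance: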

  • 10 executables
  • 100 clients
  • 1000 samples
  • variable payload
f(Res), Res = [ echo:benchmark(10, PayloadLength * 1240 + 1, 1000, 100) || PayloadLength <- [0, 1, 5, 10, 20, 50, 100, 200, 500]].

[{{10,1,1000,100},     24837.932490499494,24837.932490499494},
 {{10,1241,1000,100},  23617.770010155644,29309652.582603153},
 {{10,6201,1000,100},  21346.539725910432,132369892.84037058},
 {{10,12401,1000,100}, 24792.36395190281, 307450105.3675468},
 {{10,24801,1000,100}, 18201.674554058973,451419730.61521655},
 {{10,62001,1000,100}, 14088.674114879048,873511883.7966158},
 {{10,124001,1000,100},10358.400662937642,1284452040.6049306},
 {{10,248001,1000,100},5013.109280769212, 1243256114.7400453},
 {{10,620001,1000,100},1634.8497654808011,1013608489.4478621}]

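So we see that byte throughput peaks at around 124 kB payload size with ~1,280 MB/s (a 62 kB payload reaches ~870 MB/s), while operations per second drop off once the payload grows beyond about 12 kB.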
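
To make the payload sweep easier to compare, the sketch below converts each row to MB/s (taking 1 MB as 10^6 bytes, which is an assumption of this example) and picks the payload with the highest byte throughput. The ops/s figures are copied from the run above and rounded to one decimal; the helper name megabytes_per_second is made up here.

```python
# (PayloadLength in bytes, ops/s) copied from the sweep above, rounded
payload_results = [
    (1, 24837.9), (1241, 23617.8), (6201, 21346.5), (12401, 24792.4),
    (24801, 18201.7), (62001, 14088.7), (124001, 10358.4),
    (248001, 5013.1), (620001, 1634.8),
]

def megabytes_per_second(payload_length, ops):
    """Byte throughput in MB/s, with 1 MB taken as 10**6 bytes."""
    return payload_length * ops / 1e6

throughputs = [(size, megabytes_per_second(size, ops))
               for size, ops in payload_results]
peak_size, peak_mbps = max(throughputs, key=lambda pair: pair[1])

for size, mbps in throughputs:
    print(f"{size:>7} bytes: {mbps:8.1f} MB/s")
print("peak:", peak_size, "bytes")
```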

Comparing python to c#

Run on local machine (thinkpad t450s, core i7, 4 cores)

echo scripts, mock json string (22K)

  • c++, packets: {{5,1,3000,10}, 49 299.130691995466, 1 103 610 339.6710105}
  • c# binary_reader: {{5,1,100000,10}, 38 755.438841398405, 867 540 498.4647032}
  • c++: {{5,1,3000,10}, 18 898.233015213078, 423 055 844.27856}
  • python: {{5,1,3000,10}, 11 909.72428988269, 266 611 087.9533139}
  • c#: {{5,1,3000,10}, 5 696.253004773461, 127 516 319.76485868}

echo with serialisation and deserialisation of mock json (13K)

c#: {{5,1,3000,10}, 87.02312848494996, 1 948 099.75426409}

echo lz4 zipped data

Observations

  • The io with stdin/stdout is quite slow in c#, c++ is the quickest
  • The json parsing is a much bigger overhead than the pipe, almost 2 orders of magnitude
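
The second observation can be checked against the c# figures listed earlier. This Python sketch divides the plain echo rate by the rate with JSON serialisation and deserialisation. Note that the two runs used different mock payloads (22K and 13K), so the ratio is only a rough indication.

```python
import math

# ops/s copied from the c# results above
echo_ops = 5696.253004773461      # c# plain echo, 22K mock json string
json_echo_ops = 87.02312848494996 # c# echo with JSON (de)serialisation, 13K

slowdown = echo_ops / json_echo_ops
orders_of_magnitude = math.log10(slowdown)

print(f"slowdown: {slowdown:.1f}x")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```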

Flatbuffers

call c++ parser from erlang as port (1 CPU)

  • 22K: 330 ops/sec
  • 44K: 161 ops/sec
  • 88K: 81 ops/sec
  • 350K: 20 ops/sec

22k 1/50000 0.02ms

440k 0.4ms 0.02ms
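
The ops/sec figures for the parser port can be turned into a time per call and a data rate. The Python sketch below does this arithmetic with the four measurements given above, keeping the sizes in the same K unit as the text. The helper names are made up for this example.

```python
# (payload size in K, ops/sec) for the c++ parser port, as listed above
parser_results = [(22, 330), (44, 161), (88, 81), (350, 20)]

def milliseconds_per_op(ops_per_second):
    """Average wall-clock time of one call in milliseconds."""
    return 1000.0 / ops_per_second

def k_per_second(size_k, ops_per_second):
    """Amount of data handled per second, in the same K unit as size_k."""
    return size_k * ops_per_second

for size_k, ops in parser_results:
    print(f"{size_k:>3}K: {milliseconds_per_op(ops):6.2f} ms per op, "
          f"{k_per_second(size_k, ops)} K/s")
```

The data rate stays close to 7,000 K/s for all four sizes, which suggests that in these measurements the time per call grows roughly linearly with the payload size.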
