Tomdee

12 Jan 2016

Preparing GitHub release notes from GitHub pull requests

When doing releases for Project Calico, I like to include the highlights of what's been merged since the previous release.
Release notes on GitHub are written in Markdown, and references to issues or pull requests with a "#" are automagically turned into links; e.g. #662 becomes a link to https://github.com/projectcalico/calico-containers/pull/662
But GitHub doesn't fill in a title for the link, so I like to write my release notes with lines like "#705 pool should always come from client", which provide both a link and a title.

Rather than tediously copying and pasting all the text to create these links, I wrote a one-liner to do it for me.

PREVIOUS_RELEASE=v0.13.0
git log $PREVIOUS_RELEASE..master --merges --format="%s" |grep -P -o '#\d+' | grep -P -o '\d+' |xargs -I ^ -n 1 curl -s https://api.github.com/repos/projectcalico/calico-containers/pulls/^ | jq -r '"#" + (.number |tostring) + " " + .title'

 

git log $PREVIOUS_RELEASE..master --merges --format="%s"

  • Print the commit messages of the merges that have happened since the last release, e.g. Merge pull request #662 from tomdee/kubernetes-versioning

grep -P -o '#\d+' | grep -P -o '\d+'

  • Pull out just the number part of the #XXX PR number.
  • -o ensures that just the matched part of the line is output

xargs -I ^ -n 1 curl -s https://api.github.com/repos/projectcalico/calico-containers/pulls/^

  • Run curl for each of the PR numbers that were merged. -s makes curl silent.
  • xargs is run with -I to control the replace string and -n 1 ensures that curl is called once for each PR.

jq -r '"#" + (.number |tostring) + " " + .title'

  • Use jq to pull out the PR number and title and format it to get the desired output.
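
One caveat: unauthenticated requests to the GitHub API are rate limited (60 per hour), so a release with a lot of merged PRs can hit the limit. A hedged variant of the same one-liner, assuming a personal access token is available in $GITHUB_TOKEN:

git log $PREVIOUS_RELEASE..master --merges --format="%s" |grep -P -o '#\d+' | grep -P -o '\d+' |xargs -I ^ -n 1 curl -s -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/repos/projectcalico/calico-containers/pulls/^ | jq -r '"#" + (.number |tostring) + " " + .title'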
20 Nov 2015

First experience of using Metal as a Service (MAAS) from Ubuntu

After coming into a number of servers for Project Calico, I needed some way to set them up and provision them. MAAS from Canonical seemed like a good place to start, so I had a play.

I had a number of issues along the way (detailed below) but ultimately I got where I needed to be.

I started off with an existing Ubuntu server and decided to just install the packages on there. After realising how out of date (and messy) that server was, I scrapped that idea and decided to just do the MAAS install from an Ubuntu ISO. I just wanted this to work with minimal fuss so I went for the latest and greatest - Ubuntu 15.10. The installation went smoothly, but I found it confusing to learn about MAAS from the docs:

  • Region controller, clusters and cluster controllers - how do these fit into the single-server install I just did?
  • What are the stages that a "node" goes through? Where is that documented?

When trying to configure my interfaces, I hit this bug which was fixed in the latest RC. (LP: #1439476 - Internal Server Error when creating/editing cluster)

So I upgraded, got an interface configured and went to configure my DHCP server. I followed the docs and got a server to PXE boot. Success!! Or not... the server immediately failed, unable to find a file. Eventually I tracked this down to me missing the next-server parameter. It really should be mentioned in the docs at https://maas.ubuntu.com/docs1.7/install.html#configure-dhcp
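
For anyone else doing a manual DHCP config, the relevant dhcpd.conf stanza looks something like the sketch below (made-up addresses; the important line is next-server, which needs to point at the MAAS cluster controller so PXE clients know where to fetch their boot files from):

subnet 10.0.0.0 netmask 255.255.255.0 {
  range 10.0.0.100 10.0.0.200;
  option routers 10.0.0.1;
  filename "pxelinux.0";
  # Without next-server, clients PXE boot but then fail to find their files
  next-server 10.0.0.2;
}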

Once I got servers actually booting, I ran into endless problems during cloud-init with errors about not being able to find files or servers. I was seeing various different errors, but generally it was timing out trying to connect to servers. I was at a bit of a loss. Changing my interface settings had some effect on the IPs that the nodes were trying to connect to, but still didn't resolve the problem.

I got another VLAN set up and added another interface to my MAAS server. I allowed this interface to manage DHCP and DNS and tried PXE booting servers again. They booted and got IP addresses on this new subnet, but they were still failing with strange errors trying to connect to odd IPs.

I was now many hours into the process and I felt tantalizingly close but I was struggling to debug the problems I was seeing. MAAS was just too magic and the documentation didn't give me enough detail to diagnose what was going on.

I decided to take a step back and start again. This time I started from an Ubuntu 14.04.3 ISO, which took me back down to a 1.7.X release. The server I used had two interfaces - the "main" one on the main LAN and another interface on my new VLAN. After doing the initial set up I was getting errors about not being able to contact my cluster. This was my prompt to actually find the logs (under /var/log/maas), where I found that something was trying to use an old IP address. It would have been really useful to be able to get this detail from within the web UI, but now that I knew what was wrong I needed to work out how to fix it. After some googling I found that I needed to run

sudo dpkg-reconfigure maas-region-controller
sudo dpkg-reconfigure maas-cluster-controller

And that was enough to get things working. I still needed to spend some time getting power control working (some more guidance on this would be really nice too!).

MAAS seems like it's going to be a really useful tool. The initial set up was a challenge which would have been made much easier by a few simple docs improvements.

Summary of docs issues

  • Missing detail on manual DHCP config
  • General intro/orientation - different types of controller and node lifecycle.
  • IP configuration of servers - how to reconfigure IPs and how to tell if things aren't configured correctly.
  • Troubleshooting - linked to the orientation point above. What is the full flow, from start to end, of getting a server set up and a node provisioned? What packets should flow from where to where (e.g. DHCP, TFTP, cloud-init/metadata, etc.)? Where are the logs?
  • Overview of different power options - what to use when and how to configure.

 

21 Oct 2015

Logging temperature using a USR-HTW, forecast.io, Grafana, InfluxDB and of course Docker

I've previously posted about running InfluxDB and Grafana using Docker. This post covers getting data from a USR-HTW temperature sensor and weather data from forecast.io.

USR-HTW

The USR-HTW is an inexpensive, WiFi-enabled temperature and humidity sensor. Although it doesn't have a documented API, it's already been reverse engineered.

Data is retrieved from it by establishing a TCP connection to port 8899; packets then arrive periodically (roughly every 15 seconds) and can be decoded into temperature and humidity readings.
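
A quick way to poke at this (just a sketch for eyeballing the raw protocol, not part of the logging setup) is to point netcat at the same port and hex-dump whatever arrives:

nc <ADDRESS OF USR_HTW> 8899 | xxd

A new chunk of bytes should show up roughly every 15 seconds.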

I've written a Dockerfile which is just a minimal Alpine Linux container with Python and curl installed. I also add the scripts for retrieving the data from the USR-HTW sensor and from forecast.io.

Running the container

docker run --name usr-htw --restart=always -d --link influxdb:influxdb tomdee/usr-htw usr-htw.py <ADDRESS OF USR_HTW>

The container is configured to always restart (because it will die if it loses contact with the sensor).

It's also linked to the InfluxDB container so it can store its data there.

Script details

The script itself is just a simple "while True" loop that blocks on data coming over the TCP connection, then pumps it into InfluxDB using curl. Not super efficient, but it doesn't need to be:

subprocess.call("curl -sS -i -XPOST 'http://influxdb:8086/write?db=tomdee' --data-binary 'temperature,room=XXX,location=YYY value=%s'" % temp, shell=True)

Forecast.io

Forecast.io provides both forecast and historical data for a multitude of weather-related measurements. They have an excellent API that's simple to use and, crucially, free (for up to 1000 requests/day).

Running the container

The USR-HTW container has the forecast.io script in it too, so running it is very similar (though the container will need a different --name if the sensor one is already running).

docker run --name usr-htw --restart=always -d --link influxdb:influxdb tomdee/usr-htw forecast.py <FORECAST API KEY>

You can get an API key from https://developer.forecast.io/register

Script details

Another simple script that grabs the JSON from forecast.io, parses it, then curls the result to InfluxDB.

The forecast.io API call explicitly sets the units to be "si" and excludes all the unwanted forecast data.

https://api.forecast.io/forecast/%s?units=si&exclude=minutely,hourly,daily,alerts,flags
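
Roughly the same flow can be reproduced from the host with curl and jq. This is only an illustrative sketch, not the actual forecast.py: it assumes the API key plus latitude,longitude go in the URL path, that the current temperature is in the currently.temperature field of the response, and it writes to InfluxDB via the localhost port mapping rather than the container link:

curl -s "https://api.forecast.io/forecast/<FORECAST API KEY>/<LAT>,<LONG>?units=si&exclude=minutely,hourly,daily,alerts,flags" | jq '.currently.temperature' | xargs -I ^ curl -sS -XPOST 'http://127.0.0.1:8086/write?db=tomdee' --data-binary 'temperature,location=YYY value=^'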

 

Putting it together

I've been collecting data for a number of weeks now. I've created two dashboards in Grafana: one which doesn't aggregate the data, and one which aggregates it daily (since the weather follows a diurnal cycle).

[Dashboard screenshots: "Grafana - Weather - Vicksburg" and "Grafana - Weather - Vicksburg - Daily"]

12 Oct 2015

Using Grafana with InfluxDB and Docker

Now that I have InfluxDB running as part of my infrastructure, I need some way to display nice dashboards for the data I collect.

The best tool I can find for this job is Grafana. They have an active community, great graphing support, excellent support for InfluxDB and they even produce a good Docker image.

Running Grafana

docker run -d --restart=always -p 3000:3000 --link influxdb:influxdb -v /var/lib/grafana:/var/lib/grafana -v /var/log/grafana:/var/log/grafana -v /etc/grafana:/etc/grafana --name grafana grafana/grafana

Running Grafana is as simple as running the grafana/grafana container. The additional options are

  • To link to the influxdb container
  • To mount volumes for the data, logs and config
  • To expose the container on port 3000

Once the container is running, navigate to http://host:3000 and log in with admin/admin.

The Grafana documentation is great and it's very configurable, either by placing a config file in /etc/grafana or by setting environment variables.
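
For example (a sketch rather than something I actually use), settings can be overridden at container start using Grafana's GF_<section>_<key> environment variable convention; adding the following to the docker run command above would change the default admin password:

-e "GF_SECURITY_ADMIN_PASSWORD=something-better"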

Upgrading Grafana

Since the container is stateless (all state is in volumes outside of the container), it can just be stopped, a new image pulled and a new container created.
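
Something along these lines (a sketch of that flow, reusing the exact run command from above):

docker stop grafana
docker rm grafana
docker pull grafana/grafana
docker run -d --restart=always -p 3000:3000 --link influxdb:influxdb -v /var/lib/grafana:/var/lib/grafana -v /var/log/grafana:/var/log/grafana -v /etc/grafana:/etc/grafana --name grafana grafana/grafana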

Backing Up Grafana

Since I'm not using a custom config file, I just need to backup the /var/lib/grafana directory which contains the database storing the dashboards that I create.

e.g. tar -czvf grafana.$(date +%Y-%m-%d-%H.%M.%S).tgz /var/lib/grafana

 

10 Oct 2015

Deploying InfluxDB using Docker

There are a number of datastores aimed at the system monitoring space. These typically focus on storing time-series data. Currently, key players include Graphite, InfluxDB and Prometheus.

These different systems have different scopes; some focus entirely on the storage of data whilst others have additional features for graphing and alerting. The Prometheus docs have a good overview of how the alternatives compare.

I was looking for something that had a simple API for getting data in, and simple installation and operational requirements were a must. I wasn't too keen on something that included too many bells and whistles; rather, I preferred using the best tool for the job. If this meant using a different system for graphing or displaying the data then that was fine. Likewise for alerting.

Despite the wide support for Graphite, I settled on InfluxDB. It ticked the normal boxes of being modern, well supported and active. But crucially, I could dump data into it using curl, and there was a good Docker image for deploying it. The latest Tutum image lacks clustering and SSL support, but neither of these features is required.

Running InfluxDB

Creating a container running InfluxDB is as simple as

docker run -d --volume=/var/influxdb:/data --restart=always -e PRE_CREATE_DB="tomdee" -p 127.0.0.1:8083:8083 -p 127.0.0.1:8086:8086 --name influxdb  tutum/influxdb

This stores the data in a volume (/var/influxdb), ensures the container is restarted if it crashes or the server reboots, exposes the management APIs to the localhost and creates an initial database.

Managing InfluxDB

The influx CLI is available with docker exec -ti influxdb /opt/influxdb/influx

The management UI and REST API are available over HTTP on ports 8083 and 8086 respectively. To connect to them remotely, an SSH tunnel can be used

ssh -L 8086:localhost:8086 user@example.com

Upgrading InfluxDB

The original container can be stopped and removed (since the data is in a volume).

docker rm -f influxdb

Then a new container can be pulled and started (without the need to create the database since it already exists)

docker pull tutum/influxdb

docker run -d --volume=/var/influxdb:/data --restart=always --name influxdb  tutum/influxdb

Backing up InfluxDB

Although InfluxDB has a hot backup feature, I found it produced zero-length files. Not good. Instead, I'm happy just taking a copy of the data on disk. If I was worried about data consistency I could pause the Docker container whilst doing this.

e.g. tar -czvf influxdb.$(date +%Y-%m-%d-%H.%M.%S).tgz /var/influxdb
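
If consistency did matter, the copy could be wrapped in a pause/unpause of the container (untested sketch):

docker pause influxdb
tar -czvf influxdb.$(date +%Y-%m-%d-%H.%M.%S).tgz /var/influxdb
docker unpause influxdb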

 

Running InfluxDB on a single server for simple monitoring applications is easy and the operational requirements are low. Bugs in the backup code are a concern but at my scale I'm happy taking the risk.

23 Feb 2015

Simple benchmarking of etcd read and write performance

There's surprisingly little information on the web about the performance of CoreOS's distributed data store etcd. It's reasonable to assume that writes are slow (because they need to be replicated) and reads should be fast (because they can just come from RAM). But everything is being transported over HTTP and needs to be JSON encoded. I know that etcd hasn't been optimized for performance (yet) but it would be great to know what sort of ballpark performance is possible.

I ran a few simple tests against etcd version 2.0.0, on a single-node "cluster" in an Ubuntu 14.04 VM running on a slow Dell laptop. This isn't any kind of reliable benchmark - I'm just trying to get a ballpark estimate.

I started testing with boom (the python version) but it was way too slow, so I switched to ab.

Results

I tested 10000 requests, with 10 concurrent workers. I also turned on HTTP keepalives.

  • Read Performance - 6758 req/sec
  • Write Performance - 1388 writes/sec (but about 20% of the requests failed)

Not too shabby but the numbers should be taken with a large pinch of salt. I'm testing against localhost and only reading/writing a tiny amount of data. The write number is pretty meaningless given that I have a single node in my cluster and a lot of the requests failed (though I tried a single concurrent connection and got 0 failures and roughly 350 writes/sec - still pretty respectable)

Read details

ab -k -c 10 -n 10000 http://127.0.0.1:4001/v2/keys/key
This is ApacheBench, Version 2.3 <$Revision: 1528965 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests
Server Software:
Server Hostname: 127.0.0.1
Server Port: 4001

Document Path: /v2/keys/key
Document Length: 93 bytes

Concurrency Level: 10
Time taken for tests: 1.480 seconds
Complete requests: 10000
Failed requests: 0
Keep-Alive requests: 10000
Total transferred: 3220000 bytes
HTML transferred: 930000 bytes
Requests per second: 6758.10 [#/sec] (mean)
Time per request: 1.480 [ms] (mean)
Time per request: 0.148 [ms] (mean, across all concurrent requests)
Transfer rate: 2125.11 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       2
Processing:     0    1   1.0      1      13
Waiting:        0    1   1.0      1      13
Total:          0    1   1.0      1      13

Percentage of the requests served within a certain time (ms)
50% 1
66% 2
75% 2
80% 2
90% 2
95% 3
98% 4
99% 5
100% 13 (longest request)

Write details

ab -k -u data -c 10 -n 10000 http://127.0.0.1:4001/v2/keys/key
This is ApacheBench, Version 2.3 <$Revision: 1528965 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests
Server Software:
Server Hostname: 127.0.0.1
Server Port: 4001

Document Path: /v2/keys/key
Document Length: 169 bytes

Concurrency Level: 10
Time taken for tests: 7.202 seconds
Complete requests: 10000
Failed requests: 2035
(Connect: 0, Receive: 0, Length: 2035, Exceptions: 0)
Keep-Alive requests: 10000
Total transferred: 4000173 bytes
Total body sent: 1650000
HTML transferred: 1698138 bytes
Requests per second: 1388.59 [#/sec] (mean)
Time per request: 7.202 [ms] (mean)
Time per request: 0.720 [ms] (mean, across all concurrent requests)
Transfer rate: 542.44 [Kbytes/sec] received
223.75 kb/s sent
766.19 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       2
Processing:     3    7   1.9      7      30
Waiting:        3    7   1.9      7      30
Total:          3    7   1.9      7      32

Percentage of the requests served within a certain time (ms)
50% 7
66% 7
75% 8
80% 8
90% 9
95% 10
98% 12
99% 14
100% 32 (longest request)

20 Dec 2014

Getting started with the USR-HTW temperature/humidity sensor

I've always wanted an easy solution for logging temperature and humidity in my house. I've messed around in the past with Arduino- and Raspberry Pi-based solutions, but inspired by a recent post on Hackaday I finally took the easy option and bought a WiFi-enabled sensor. The Hackaday post refers to a blog post detailing the simple communication protocol that the unit uses, which seemed perfect for rolling my own simple data-logging solution.

I ordered a unit off eBay and it arrived from China only a week later. Powering it up was a breeze (I used a standard wall wart power supply that I had knocking around) but connecting it to my network took a bit longer because of the flakey web UI that the unit has. It acts as an access point, so I could connect to it to give it my WIFI network credentials. Unfortunately it doesn't cope well with SSIDs that have spaces in them. I took this opportunity to set up another DMZ guest network on my router and once that was working everything connected fine.

I cribbed the code from the OzoneJunkie blog post but changed it to not rely on NumPy. The code is available on GitHub.

 

22 Apr 2014

Getting rid of the "xxx@gmail.com on behalf of Tom Denham [xxx@tomdee.co.uk]" message when using your own domain with Gmail

If you have your own domain but use gmail.com for your email, you might have tried to use the "Send mail as:" feature in GMail to have your outgoing mail appear to be from your own domain without needing to pay Google.

This approach generally works well, with one exception: some mail clients (the biggest example being Outlook) will display that the mail is from your Gmail address.

xxx@gmail.com on behalf of Tom Denham [xxx@tomdee.co.uk]

Google do offer a way around this (without paying), but you need an SMTP server. I didn't fancy setting up my own, and after hunting around I couldn't find any cheap SMTP servers that directly met this use case.

Finally, a few days ago I came across https://postmarkapp.com/. Although it's aimed at "transactional email for webapps", it supports authenticated SMTP and they even give you 10,000 email credits for free when you sign up. Since these credits never expire, and I'm unlikely to send 10,000 emails in the foreseeable future, I'm hoping that this service will be free for life!

I've been using it for a few days now and I've not had any problems. It was quick and easy to sign up (no credit card details required) and the Gmail config is straightforward.

Simply edit your Gmail settings and use the following information (get the username and password from your PostMark account - use a server API key for both, not the main account API key).

[Screenshot: example Gmail "Send mail as" configuration]
20 May 2013

PlayRTP now supports real timestamps

I'm now grabbing the timestamp from the capture file and using that to pace the playback. This means that capture files can actually be played back through the jitter buffer like they would be on a real Jitsi client.

I've also tested this with capture files from tcpdump and it works fine.

18 May 2013

Testing a jitter buffer by presenting packet captures through a DatagramSocket interface

I've been working on the jitter buffer code in the FMJ project which is used by the Jitsi softclient. To know that the code is good, it's been handy to be able to try it out under real world conditions. To make this repeatable I wanted to be able to play packet captures through the jitter buffer and hear the results so I could then tweak the code and hear whether it improved things.

The easiest way to achieve this was to use the excellent libjitsi library from the Jitsi team. This allows me to call LibJitsi.start(); then I can use the MediaService to create a MediaDevice to play the audio, and a MediaStream connected to this device for dealing with the RTP. See here for the code.

The MediaStream is then connected to a new class I wrote which implements the StreamConnector interface. This class, PCapStreamConnector, is small; most of the interesting logic lives in another new class I wrote, PCapDatagramSocket. This presents a packet capture file through a DatagramSocket interface! It's fairly crude, and assumes the media file is written in the exact format that Jitsi uses for writing packet capture files, but that's all I need so there's no point doing any more.

It's a bit rough and ready at the moment. The fact that it doesn't actually use the RTP timings makes this borderline useless (!), but I will be adding that feature soon. At the same time, I'll be cleaning up the interface to allow the payload type to be passed in, or even better just try to detect it from the capture file. With a little more work, this could be a handy generic tool for playing media from RTP streams in any format that libjitsi supports - at least g711, SILK, Opus, g722, g723, iLBC and speex.

To really make this feature useful, I also enhanced the error reporting module so that users would have the chance to report a few minutes of media if they've experienced bad voice quality on a call.