Embedded to machine learning

Setting up a Raspberry Pi sensor environment and teaching myself and the machine with it

image

RRDtool is a high performance data logging and graphing system for time series data. It doesn’t have a query language such as SQL, nor is it a NoSQL document store. Instead, you can store floating point numbers and accompanying timestamps there, and configure some aggregates.

RRD in the name comes from Round Robin Database, which means that the size of the database is decided when it is created, and once filled completely, the values are overwritten from the beginning. RRDtool has accompanying software for graphing the data called rrdgraph.

Required hardware and software

  1. Raspberry Pi
  2. Ansible and clone of raspberry-ansible repository configured with correct IP addresses and SSH keys
  3. One or more DS18B20 or compatible temperature sensors, see previous blog post

Installing temperature reading script

Running the Ansible playbook tempreader-rrdtool.yml installs a Python script that reads all connected 1-Wire temperature sensors and stores the readings in RRDtool databases (one file per sensor; files are created if they don’t exist). It also adds a cron job that runs the script every five minutes.

$ ansible-playbook -i hosts tempreader-rrdtool.yml

After that, the pi user’s home directory on the Raspberry should contain files like these, and the crontab should have a new entry:

pi@raspberry1:~ $ ls
0000075f24dc.rrd  0000075f5202.rrd  current.png  readtemp-rrd.py

pi@raspberry1:~ $ crontab -l
...
#Ansible: read all temperature sensors to rrd
*/5 * * * * /usr/bin/python /home/pi/readtemp-rrd.py

The first two files are RRDtool databases for temperature sensors with ids 0000075f24dc and 0000075f5202. current.png is a graph of last day’s temperatures from each sensor generated with RRDtool’s graphing function. And readtemp-rrd.py is the python script.

The crontab schedule */5 * * * * means that the command /usr/bin/python /home/pi/readtemp-rrd.py is run every five minutes.

Setting up a RRDtool database for time series data

Diving into readtemp-rrd.py, the imports show that it uses the external library rrdtool by Christian Kröger for accessing RRDtool from Python.

The setting DATABASE_PATH defines where the database files are stored. Each sensor has its own database file.

DATABASE_PATH = '/home/pi/'

create_rrd_unless_exists creates an RRDtool database file with rrdcreate (through its Python bindings) unless one already exists. The data definition syntax is not simple. First, --step tells how frequently data is expected, in seconds: 300 means once every five minutes.

Then a data source (DS) called temp of type GAUGE is created. Gauges are for values that are stored as they are and can go up or down over time, such as temperature. The next number is the heartbeat: how many seconds may pass without a new value before the data source is regarded as unknown, here 900 for 15 minutes. -100 is the minimum and 100 the maximum value accepted for this data source.

RRA stands for round robin archive, which stores the data that has been read. Data is run through a consolidation function (CF), here AVERAGE. The 0.5 that follows is the xfiles factor: the fraction of data points in a consolidation interval that may be unknown while the average is still considered known. The average is taken over 10 minutes (2 data points), and the database has space for 10 years: 525600 of these 10-minute averages (10 × 365 × 24 × 6). As the name suggests, when this time has passed, the oldest values are overwritten from the beginning. This 10-year database for one sensor takes 4.1MB of space, so the Raspberry wouldn’t choke even if there were more than two sensors.

rrdtool.create(
    filename,
    '--step', '300',
    'DS:temp:GAUGE:900:-100:100',
    'RRA:AVERAGE:0.5:2:525600'
)
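
The body of create_rrd_unless_exists is not shown in this post, so the following is only a minimal sketch of how such a helper could look. The helper rrd_create_args and the create parameter are assumptions made for illustration; passing the create function in as an argument keeps the sketch runnable on a machine without RRDtool installed.

```python
import os


def rrd_create_args(filename):
    """Arguments for rrdtool.create, matching the definition shown above."""
    return [
        filename,
        '--step', '300',                # one reading expected every 300 s
        'DS:temp:GAUGE:900:-100:100',   # heartbeat 900 s, valid range -100..100
        'RRA:AVERAGE:0.5:2:525600',     # 10-minute averages, 525600 rows
    ]


def create_rrd_unless_exists(filename, create):
    """Call create(...) only if filename does not exist yet.

    Returns True if a database was created, False if the file was already there.
    """
    if os.path.exists(filename):
        return False
    create(*rrd_create_args(filename))
    return True
```

On the Raspberry Pi this sketch would be called as create_rrd_unless_exists(filename, rrdtool.create).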

Current stable RRDtool version 1.6.0 supports giving these time arguments with easier syntax so that you don’t have to calculate for example how many 10-minute periods there are in 10 years. Unfortunately Raspbian Jessie has version 1.4.8 which doesn’t.
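
With the older syntax the row count has to be calculated by hand. A small sketch of that arithmetic follows; the function name rra_rows is made up for this example, leap days are ignored, and the size estimate assumes 8 bytes per stored value and no file header.

```python
def rra_rows(years, step_seconds, steps_per_row, days_per_year=365):
    """Number of consolidated rows needed to cover the given number of years."""
    total_seconds = years * days_per_year * 24 * 60 * 60
    return total_seconds // (step_seconds * steps_per_row)


rows = rra_rows(years=10, step_seconds=300, steps_per_row=2)
print(rows)  # 525600

# Assumption: each stored value takes 8 bytes; file headers are ignored.
print(rows * 8)  # 4204800
```

The result of roughly 4.2 million bytes is in line with the file size mentioned earlier.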

Reading temperature sensor data with Python

The temperature reading part uses Timo Furrer’s w1thermsensor python library, which makes the readings really straightforward.

The code loops through all the sensors, creates an RRDtool database for each sensor unless it exists, reads the temperature value, writes it to the database, and prints it to standard output as well.

Writing to the database uses rrdupdate, which has a relatively easy syntax: the first parameter is a timestamp (N for now), and the following ones are values. %.2f is Python syntax for formatting a number as a string with two decimals.

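for sensor in W1ThermSensor.get_available_sensors():
    filename = sensor.id + '.rrd'
    create_rrd_unless_exists(filename)
    # Read once so that the stored and the printed value are the same measurement
    temperature = sensor.get_temperature()
    # 'N' means "now"; the value is stored with two decimals
    rrdtool.update(filename, 'N:%.2f' % temperature)
    print("Sensor %s has temperature %.2f" % (sensor.id, temperature))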

Graphing time-series data with RRDtool

RRDtool has functionality for generating graphs from databases included with command rrdgraph. Unfortunately its syntax is even more complex than database creation’s.

But first, the following static variables are used for configuring graph generation:

  • LAST_DAY_GRAPH_FILE is the path of the graph file for the previous 24 hours’ temperatures
  • COLORS are used when graphing: the first sensor’s line is drawn with the first color, the second sensor’s with the second color, etc.
  • SENSOR_NAMES can be used to give the sensors meaningful names, which are used in the graph legend. Sensor ids are used if names are not given.

LAST_DAY_GRAPH_FILE = '/home/pi/current.png'
COLORS = ('#AA3939', '#226666', '#AA6C39', '#2D882D')
SENSOR_NAMES = {
    '0000075f24dc': 'Living room'
}

The graph configuration has two main elements here: definitions (starting with DEF) and lines (starting with LINE1). Definitions specify the data that is used for graphing. Here temp:AVERAGE from each sensor’s database file is given the sensor’s id as its name. Then a line is drawn for each sensor in a different color, and the line is labeled with the sensor’s name if one is given, or its id otherwise.

defs.append('DEF:' + sensor.id + '=' +
            DATABASE_PATH + sensor.id + '.rrd:temp:AVERAGE')
lines.append('LINE1:' + sensor.id + color + ':' + sensor_name(sensor.id))
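
The surrounding loop and the sensor_name helper are not shown above, so here is a rough sketch of how the DEF and LINE1 elements could be assembled for a list of sensor ids. The function graph_elements, the body of sensor_name, and the cycling of colors when there are more sensors than colors are assumptions made for this example.

```python
DATABASE_PATH = '/home/pi/'
COLORS = ('#AA3939', '#226666', '#AA6C39', '#2D882D')
SENSOR_NAMES = {
    '0000075f24dc': 'Living room'
}


def sensor_name(sensor_id):
    """Return the configured name for a sensor, or its id if none is given."""
    return SENSOR_NAMES.get(sensor_id, sensor_id)


def graph_elements(sensor_ids):
    """Build the DEF and LINE1 arguments for the given sensor ids."""
    defs = []
    lines = []
    for index, sensor_id in enumerate(sensor_ids):
        # Assumption: reuse colors from the start if sensors outnumber colors.
        color = COLORS[index % len(COLORS)]
        defs.append('DEF:' + sensor_id + '=' +
                    DATABASE_PATH + sensor_id + '.rrd:temp:AVERAGE')
        lines.append('LINE1:' + sensor_id + color + ':' + sensor_name(sensor_id))
    # All definitions come first, then the lines that refer to them.
    return defs + lines


for element in graph_elements(['0000075f24dc', '0000075f5202']):
    print(element)
```

The resulting list would then be passed to the graphing call together with the output file name and the time range.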

The last value read from each sensor is also printed below the graph. Here I simply couldn’t get the texts to be right-justified and stay under the graph area. The initial COMMENT: \l is required so that the printed values start on a new line. Each GPRINT then reads the LAST value from the sensor.id data and prints it after the sensor’s name.

# Backslashes are doubled so that Python passes \l and \: to RRDtool unchanged
current_temps = ['COMMENT: \\l']
...
current_temps.append('GPRINT:' + sensor.id +
                     ':LAST:' + sensor_name(sensor.id) + '\\: %4.2lf\\l')

All this outputs a temperature graph for previous 24 hours:

RRDtool graph example

In conclusion

All in all, RRDtool does the job it is designed to do. The database file is created before any data is inserted, and it stays the same size no matter how much data is inserted. This could be beneficial for systems that have limited disk sizes: for example if RRDtool is used as a local datastore in sensors with unreliable connectivity. That way measurements can be done even if there is no connection to the master node, and data can be transferred afterwards.

Transferring RRDtool data between nodes would require another blog post. rrdxport supports exporting XML or JSON with specified time intervals, so that could be used.

Generating graphs with rrdgraph is good enough for simple use cases, but ideally one would want an interactive, zoomable graph with tunable parameters in the browser. I spent a lot of time getting the chart above, and am not that satisfied with the result.

image

Reading temperatures with a Raspberry Pi is a good exercise in electronics for programmer types. There are many instructions on how to use a DS18B20 1-Wire temperature sensor with a Raspberry Pi, but this post adds some Ansible magic and prettier photos.

Setting up Raspberry Pi devices with Ansible was explained in the previous blog post.

I experimented with this setup the first time following Adafruit’s Raspberry Pi Lesson 11. DS18B20 Temperature Sensing.

Required hardware and software

  1. Raspberry Pi
  2. Ansible and clone of raspberry-ansible repository configured with correct IP addresses and SSH keys
  3. DS18B20 digital temperature sensor (datasheet, Finnish electronics vendor)
  4. 4.7 kΩ or 10 kΩ resistor (Finnish electronics vendor: 4.7 kΩ, 10 kΩ)
  5. Breadboard (Wikipedia, Finnish electronics vendor)
  6. Jumper cables from Raspberry Pi to breadboard: male-female (Finnish electronics vendor)
  7. Jumper cables from and to breadboard: male-male (Finnish electronics vendor)

required parts

Raspberry Pi GPIO

GPIO stands for General Purpose Input/Output. These are ports that you can control with software reading and writing bits to and from external devices. GPIO pins are located in Raspberry’s top left corner in the image above (close to the number 1).

The pinout (i.e. which pin does what) is described in the document GPIO: Raspberry Pi models A and B. Pinout diagram is shown below, with a close-up photo of the pins. Labels P1 and L13 are used for reference to make sure that the pinout isn’t upside down.

GPIO photo

GPIO schema

where 5V and 3V3 stand for +5 V and +3.3 V DC, GND stands for ground, and all yellow boxes with numbers in them are GPIO data pins. The numbering isn’t in any logical order, but the numbers are used for identifying those ports in software.

A good reference with more detailed explanations for these pins is the Raspberry pinout site.

Making the connections

First of all, Raspberry Pi has to be powered off when making the GPIO connections!

circuit diagram

The circuit diagram is shown in the image above. The DS18B20 uses the 1-Wire data transfer protocol, and the Raspberry Pi driver for 1-Wire uses GPIO port 4, so the sensor’s middle pin has to be connected there.

A 4.7 kΩ or 10 kΩ resistor is used as a pull-up resistor: it holds the data line at the high level whenever no device is driving it low, which keeps the signal level in a valid range.

The part inside the dashed box is built on the breadboard, and the connections from the breadboard to the Raspberry Pi are made with the longer jumper cables (number 6 in the required hardware list).

Breadboard connections

The breadboard is connected internally so that the pins in each 5-pin slot are connected to each other (horizontally in the above image), and two connected lines run along each side of the board (vertically, marked with red and blue on the board).

I added the relevant “hidden” connections to the image above:

  • Ground (GND) comes from the Raspberry with the black wire to the vertically connected blue line, and is connected from there with a blue jumper wire to the top pin of the DS18B20 sensor.
  • Data (DQ) is the yellow wire that is connected to a 4.7 kΩ resistor and the middle pin of the DS18B20 sensor.
  • +3.3 V (VDD) is the orange wire that is first connected to the vertical red line, and from there with orange jumper cables to both the other end of the resistor and the bottom pin of the DS18B20 sensor.

Breadboard and Raspberry GPIO connections

The photo above also shows the connections from the breadboard to the Raspberry.

Enabling 1-Wire support

The ansible script repository raspberry-ansible has a role and playbook for setting up 1-Wire on the Raspberry Pi device.

It can be run with the following command (assuming you have SSH keys set as explained in the previous blog post):

$ ansible-playbook -i hosts onewire.yml

It will enable the w1-gpio device tree overlay on the Raspberry Pi, restart the device, load the kernel modules w1-gpio and w1-therm, and configure them to be loaded on every boot.
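
One way to verify that the modules are loaded is to look at the contents of /proc/modules. The sketch below is not part of the playbook; the function name missing_w1_modules and the sample text are made up for this example. It relies on the kernel listing module names with underscores (w1_gpio, w1_therm) and would report the modules as missing if they were compiled into the kernel instead of being loaded as modules.

```python
def missing_w1_modules(proc_modules_text, required=('w1_gpio', 'w1_therm')):
    """Return the required module names that are not listed as loaded."""
    loaded = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first field of each /proc/modules line is the module name.
            loaded.add(fields[0])
    return [name for name in required if name not in loaded]


# Made-up sample in the format of /proc/modules, for illustration only.
sample = (
    'w1_therm 16384 0 - Live 0x00000000\n'
    'w1_gpio 16384 0 - Live 0x00000000\n'
    'wire 36864 2 w1_therm,w1_gpio, Live 0x00000000\n'
)
print(missing_w1_modules(sample))  # []
```

On the Raspberry Pi the function would be given the text read from /proc/modules instead of the sample.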

Reading temperatures

Finally it’s time to read some sensor data. The w1-gpio and w1-therm drivers create a directory for each DS18B20 sensor under the following path:

pi@raspberry1:~ $ ls /sys/bus/w1/devices/
28-0000075f24dc  w1_bus_master1

All directories starting with 28- are DS18B20 sensors; the rest of the name is the sensor’s unique hardware id.

For each of those, you can read the temperature from the file w1_slave:

pi@raspberry1:~ $ cat /sys/bus/w1/devices/28-0000075f24dc/w1_slave
99 01 4b 46 7f ff 07 10 79 : crc=79 YES
99 01 4b 46 7f ff 07 10 79 t=25562

where t=25562 is the temperature in thousandths of a degree Celsius, i.e. in this case 25.562 °C.
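
The same value can be extracted in Python. This is a minimal sketch, assuming the two-line format shown above; the function name parse_w1_slave is made up for this example, and treating a first line that does not end in YES as a failed reading is my interpretation of the crc field.

```python
def parse_w1_slave(text):
    """Return the temperature in degrees Celsius from w1_slave contents.

    Raises ValueError if the CRC line does not end in YES or if the
    t= field is missing.
    """
    lines = text.strip().splitlines()
    if len(lines) < 2 or not lines[0].strip().endswith('YES'):
        raise ValueError('CRC check failed or output incomplete')
    marker = lines[1].rfind('t=')
    if marker == -1:
        raise ValueError('no temperature field found')
    # The value after t= is in thousandths of a degree Celsius.
    return int(lines[1][marker + 2:]) / 1000.0


sample = (
    '99 01 4b 46 7f ff 07 10 79 : crc=79 YES\n'
    '99 01 4b 46 7f ff 07 10 79 t=25562\n'
)
print(parse_w1_slave(sample))  # 25.562
```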

Multiple sensors

As a bonus, adding multiple sensors to the same 1-Wire bus is really easy. Just plug in another DS18B20 with pins connected the same way as the first one and everything just works!

pi@raspberry1:~ $ ls /sys/bus/w1/devices/
28-0000075f24dc  28-0000075f5202  w1_bus_master1
pi@raspberry1:~ $ cat /sys/bus/w1/devices/28-0000075f24dc/w1_slave
9d 01 4b 46 7f ff 03 10 57 : crc=57 YES
9d 01 4b 46 7f ff 03 10 57 t=25812
pi@raspberry1:~ $ cat /sys/bus/w1/devices/28-0000075f5202/w1_slave
a1 01 4b 46 7f ff 0f 10 d9 : crc=d9 YES
a1 01 4b 46 7f ff 0f 10 d9 t=26062

Here we can see that even though the sensors are 2 mm apart, the readings differ by 0.25 °C. DS18B20 accuracy is ±0.5 °C, so the readings 25.812 °C and 26.062 °C are well within tolerance.
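
Reading every sensor on the bus can be sketched in Python as follows. The function name read_all_sensors and the choice to skip readings whose first line does not end in YES are assumptions made for this example; the base directory is a parameter so that the function can be tried against a copy of the files on another machine.

```python
import glob
import os


def read_all_sensors(base_dir='/sys/bus/w1/devices'):
    """Return {sensor_id: temperature in degrees Celsius} for all 28-* devices."""
    readings = {}
    pattern = os.path.join(base_dir, '28-*', 'w1_slave')
    for path in sorted(glob.glob(pattern)):
        # Directory names look like 28-0000075f24dc; keep the part after 28-.
        device = os.path.basename(os.path.dirname(path))
        sensor_id = device.split('-', 1)[1]
        with open(path) as handle:
            lines = handle.read().strip().splitlines()
        # Skip incomplete output and readings that failed the CRC check.
        if len(lines) < 2 or not lines[0].strip().endswith('YES'):
            continue
        if 't=' not in lines[1]:
            continue
        readings[sensor_id] = int(lines[1].rsplit('t=', 1)[1]) / 1000.0
    return readings
```

On the Raspberry Pi, calling read_all_sensors() without arguments would return one entry per connected sensor.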