Install and Set Up InfluxDB on Ubuntu 16.04/18.04


We have already shown you in our tutorial on creating and deleting MySQL databases how to install and set up object-relational SQL databases on your Ubuntu server. We also showed you how to install object-relational NO-SQL databases like MongoDB, Redis and RethinkDB in our tutorials on Ubuntu, Apache Cassandra on Ubuntu, Redis on Ubuntu 16.04 LTS and RethinkDB on Ubuntu. This tutorial is about an open source product that is based on the concept of a database, but differs from the SQL and NO-SQL database models we have introduced so far. I will introduce you to InfluxDB in the following post.

New! – Autoscale MySQL-as-a-Service from gridscale

If you no longer want to deal with database administration yourself, you can now also use the new Platform Services from gridscale.

The advantage of Platform Services at gridscale: we take care of the secure configuration, reliable operation and installation of security updates for your Platform Services. In your gridscale panel you can now get started with the databases PostgreSQL, MySQL and Redis.
Just try it out!

More information about PaaS at gridscale and how our PaaS works can be found here:
Getting started with PaaS at gridscale
Platform Services by gridscale.

About InfluxDB

InfluxDB is a complete open-source platform designed for metrics, events and other time-based data from people, sensors or machines, and for their processing and analysis. Data can be collected, stored, visualized and turned into action in real time. It is, in other words, a modern time series platform. InfluxDB thus sets itself apart from the SQL and NoSQL database models: it functions as a platform and is not limited to an SQL or NoSQL store. As compute infrastructure and architectures evolve to meet new requirements, existing technologies are often no longer sufficient (think of big data and the emergence of HDFS and Hadoop). Because the established SQL and NoSQL stores did not meet the requirements of time series workloads, InfluxDB was developed: a convenient, modern platform for storing time series data in a time series database. Visibility and control over time are part of the age of digitization and instrumentation, in which real-time processing can give data-driven organizations insight and a competitive advantage. As an end-to-end platform, InfluxDB can be deployed in the cloud or via download; it is elegant and easy to use, free of external dependencies, yet open and flexible enough for complex implementations. Data is analyzed using a SQL-like query language.

You can even graphically display and visualize your data with the integrated open-source Chronograf project and thus carry out an investigation in the sense of data plotting. You will learn more about this in the penultimate chapter of this tutorial. InfluxDB also supports other visualization tools such as Grafana.

Use Case

To give you a feel for working with the InfluxDB database platform, I will simulate a real-life case with you, in which we will have the data points written to a database plotted visually. For this, we take the perspective of a company that wants to measure its sales activity in its online shop over time: how old the visitor is (INTEGER), whether they bought or not (Boolean TRUE/FALSE), and demographic data such as the gender and location of existing and potential customers. From the time series of tracked data points, potentially interesting information can be derived, and future online marketing strategies (e.g. newsletter, social media) and usability measures can be developed and/or optimized on that basis. The project should help to understand the psychology of online shopping by identifying shopping phases and the customers behind them in certain contexts. It is, as already mentioned, a case study based on reality that is only intended to serve as orientation. Of course, InfluxDB is not only very helpful for commercial purposes; it can also be used for private or educational projects of any kind, in order to make time-series-based correlations of data sets retrievable, storable, analyzable, evaluable and visualizable.

To simulate the scenario, I will write the data points into the respective measurement using the INSERT operation known from SQL. In a real-life case, you can imagine that the information contained in each data point was tracked directly on the website and communicated to the database automatically by software running in the background, with the personal information on gender and age collected via a form the visitor could not skip. I limit my time-lapse scenario to a window containing data points from 13:00 to 13:15 on 17.04.2018, tracked to the minute and stored in epoch time format.
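As a side note on the epoch format: epoch (Unix) timestamps count seconds since 1970-01-01 00:00 UTC. If you want to convert a local wall-clock time such as 13:00 on 17.04.2018 into an epoch value yourself, a short Python sketch (assuming the shop operates in the Europe/Berlin time zone) could look like this:

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

# 17.04.2018 13:00 local time in Europe/Berlin (CEST, i.e. UTC+2 in April)
local = datetime(2018, 4, 17, 13, 0, tzinfo=ZoneInfo("Europe/Berlin"))

# Epoch seconds, as used later for the data points in this tutorial
epoch_seconds = int(local.timestamp())
print(epoch_seconds)  # 1523962800
```

This is exactly the timestamp value used for the INSERT example later in this tutorial.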

Requirements

Running InfluxDB requires network ports 8086 and 8088 to be available on your system: TCP port 8086 is used for client-server communication via the InfluxDB HTTP API, and TCP port 8088 is used for the RPC backup and restore service.
In addition to these two ports, InfluxDB offers several plug-ins that may require custom ports. All port mappings can be changed via the configuration file, which for standard installations is located at /etc/influxdb/influxdb.conf. For information on how to edit this file, see the Configuring InfluxDB chapter.
The table below gives you an overview of the relevant ports.

Port 8086: The default port on which the InfluxDB HTTP service runs. You can configure this port in the configuration file by setting bind-address = "127.0.0.1:8086". Environment variable: INFLUXDB_HTTP_BIND_ADDRESS

Port 8088: The default port on which the RPC backup and restore service runs. You can configure this port in the configuration file by setting bind-address = "127.0.0.1:8088". Environment variable: INFLUXDB_BIND_ADDRESS

Port 2003: The default port on which the Graphite service runs. You can configure this port in the configuration file by setting bind-address = "127.0.0.1:2003". Environment variable: INFLUXDB_GRAPHITE_0_BIND_ADDRESS

Port 4242: The default port on which the OpenTSDB service runs. You can configure this port in the configuration file by setting bind-address = "127.0.0.1:4242". Environment variable: INFLUXDB_OPENTSDB_BIND_ADDRESS

Port 8089: The default port on which the UDP service runs. You can configure this port in the configuration file by setting bind-address = "127.0.0.1:8089". Environment variable: INFLUXDB_UDP_BIND_ADDRESS

Port 25826: The default port on which the collectd service runs. You can configure this port in the configuration file by setting bind-address = "127.0.0.1:25826". Environment variable: INFLUXDB_COLLECTD_BIND_ADDRESS

Of these, ports 8086 and 8088 are enabled by default; the plug-in ports 2003, 4242, 8089 and 25826 are disabled by default and only become relevant once you enable the corresponding service.

InfluxDB uses timestamps to coordinate and assign data, based on the local time of a host in UTC. To avoid inaccuracies in the timestamps of data written to InfluxDB, it is necessary to synchronize the time between hosts using the Network Time Protocol (NTP).

Install InfluxDB

First you need to add the InfluxDB repository with the following three commands.

> curl -sL https://repos.influxdata.com/influxdb.key | apt-key add -

> source /etc/lsb-release

> echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | tee /etc/apt/sources.list.d/influxdb.list

Regardless of whether your Ubuntu version is older or newer than 15.04, you install InfluxDB by running the command below in the terminal.

> apt-get update && apt-get install influxdb

After the installation you still need to start InfluxDB. The required command differs depending on whether your Ubuntu version is older or newer than 15.04. On versions older than Ubuntu 15.04, the command line shown in the code block below is the correct way to start it.

> service influxdb start

From Ubuntu 15.04 onwards, however, the init system systemd handles the start of InfluxDB, so the start command should look like the code snippet below.

> systemctl start influxdb

Set Up InfluxDB

InfluxDB ships with internal default values for every configuration setting. Using the command shown below, you can view and edit the settings in the nano editor.

> nano /etc/influxdb/influxdb.conf

Most of the settings in the local configuration file /etc/influxdb/influxdb.conf are commented out. All commented-out settings fall back to the internal defaults, while every uncommented setting in the local configuration file overrides them. Note that the local configuration file does not have to contain every configuration setting.
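As an illustration of how overriding works (the port 9086 below is an arbitrary example value, not a recommendation): remove the leading # from a setting and change its value, for instance to move the HTTP API to another port.

```toml
# /etc/influxdb/influxdb.conf (excerpt)
[http]
  # Uncommented: this value now overrides the internal default of ":8086"
  bind-address = ":9086"
```

After such a change, restart the service for the new value to take effect.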

Before you can work with InfluxDB, e.g. to create, delete or edit (a) database(s) or to analyze data, you have to start InfluxDB once with your configuration file.
There are two possibilities:

1. Point the process to the correct configuration file with the -config option.

> influxd -config /etc/influxdb/influxdb.conf


 8888888           .d888 888                   8888888b.  888888b.
   888            d88P"  888                   888  "Y88b 888  "88b
   888            888    888                   888    888 888  .88P
   888   88888b.  888888 888 888  888 888  888 888    888 8888888K.
   888   888 "88b 888    888 888  888  Y8bd8P' 888    888 888  "Y88b
   888   888  888 888    888 888  888   X88K   888    888 888    888
   888   888  888 888    888 Y88b 888 .d8""8b. 888  .d88P 888   d88P
 8888888 888  888 888    888  "Y88888 888  888 8888888P"  8888888P"

2018-04-04T23:03:51.641110Z	info	InfluxDB starting	{"log_id": "07GjHMMG000", "version": "1.5.1", "branch": "1.5", "commit": "cdae4ccde4c67c3390d8ae8a1a06bd3b4cdce5c5"}
2018-04-04T23:03:51.641142Z	info	Go runtime	{"log_id": "07GjHMMG000", "version": "go1.9.2", "maxprocs": 4}
run: open server: listen: listen tcp 127.0.0.1:8088: bind: address already in use

2. Set the environment variable INFLUXDB_CONFIG_PATH to the path of your configuration file and start the process.

> export INFLUXDB_CONFIG_PATH=/etc/influxdb/influxdb.conf

> influxd


 8888888           .d888 888                   8888888b.  888888b.
   888            d88P"  888                   888  "Y88b 888  "88b
   888            888    888                   888    888 888  .88P
   888   88888b.  888888 888 888  888 888  888 888    888 8888888K.
   888   888 "88b 888    888 888  888  Y8bd8P' 888    888 888  "Y88b
   888   888  888 888    888 888  888   X88K   888    888 888    888
   888   888  888 888    888 Y88b 888 .d8""8b. 888  .d88P 888   d88P
 8888888 888  888 888    888  "Y88888 888  888 8888888P"  8888888P"

2018-04-04T23:43:31.184743Z	info	InfluxDB starting	{"log_id": "07GlYaS0000", "version": "1.5.1", "branch": "1.5", "commit": "cdae4ccde4c67c3390d8ae8a1a06bd3b4cdce5c5"}
2018-04-04T23:43:31.184777Z	info	Go runtime	{"log_id": "07GlYaS0000", "version": "go1.9.2", "maxprocs": 4}
run: open server: listen: listen tcp 127.0.0.1:8088: bind: address already in use

InfluxDB first checks for the -config option and only then for the environment variable, so -config takes precedence. Incidentally, the error "bind: address already in use" in the sample output above simply means that an InfluxDB instance is already running, for example the one started via systemctl earlier; stop it first if you want to start influxd manually.
Furthermore, before running the InfluxDB service you must make sure that the directories in which the data and the write-ahead log (WAL) are stored exist and are writable for you. If the data and WAL directories are not writable, the InfluxDB service will not start. Both should be located in the /var/lib/influxdb directory. You can inspect the directories in the terminal with the following two commands and thus verify their writability.

> cd /var/lib/influxdb

> ls

Create, use or delete a database in InfluxDB

To work with InfluxDB, you can use the influx command line interface (CLI), which is included in all InfluxDB packages and provides a simple way to interact with the database. The CLI communicates with InfluxDB by making requests to the InfluxDB HTTP API; by default it connects to localhost on port 8086. If you need to change these defaults, run influx --help in the terminal.
You can start the CLI via the influx command, as I show in the code block below, and connect to the local InfluxDB instance. The output tells you that port 8086 was reached and also displays the InfluxDB version.
In my example I pass the argument -precision in addition to the influx command. This argument specifies the format/precision of returned timestamps. In the example below, rfc3339 tells InfluxDB to return timestamps in RFC3339 format, i.e. YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ.

> influx -precision rfc3339
Connected to http://localhost:8086 version 1.5.x
InfluxDB shell version: 1.5.x

If you prefer timestamps in Unix epoch format, -precision should instead be followed by an element from the list [h, m, s, ms, u, ns]. For example, you get epoch in seconds if you specify the command as shown in the code snippet below.

> influx -precision s
Connected to http://localhost:8086 version 1.5.x
InfluxDB shell version: 1.5.x

With the EXIT command you leave the CLI, and with it any database you are currently using.
To create a database you use the CREATE DATABASE command, as I show in the code snippet below, where I create the database OnlineMarketDevelopment.

> CREATE DATABASE OnlineMarketDevelopment

The SHOW DATABASES command allows you to output all existing databases.

> SHOW DATABASES
name: databases
name
----
_internal
OnlineMarketDevelopment

To delete a database, the DROP DATABASE command is used. If I had created the HumanResources database in the meantime and wanted to delete it again, my command would look like this.

> DROP DATABASE HumanResources

Finally, if you want to fill a specific database with content, or search or analyze its content, you must explicitly switch to that database using the USE command. Until you execute EXIT (or switch again), all operations are performed only on the selected database, in my example OnlineMarketDevelopment.

> USE OnlineMarketDevelopment
Using database OnlineMarketDevelopment

It is also important to note that you can use InfluxDB not only via the CLI, but also directly via raw HTTP requests using the curl application. In the terminal, a POST request targets the /query endpoint and follows the schema shown in the code block below: after q= you can execute any database command, except for the USE command, that I previously executed via the CLI in the code snippets above; XXX is to be read as a placeholder.

> curl -i -XPOST http://localhost:8086/query --data-urlencode "q=XXX"

Applied to the DROP command, the raw HTTP request looks as shown in the code block below.

> curl -i -XPOST http://localhost:8086/query --data-urlencode "q=DROP DATABASE HumanResources"

HTTP/1.1 200 OK
Content-Type: application/json
Request-Id: c53a00b6-3caa-11e8-8018-000000000000
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.5.1
X-Request-Id: c53a00b6-3caa-11e8-8018-000000000000
Date: Tue, 10 Apr 2018 10:34:38 GMT
Transfer-Encoding: chunked

{"results":[{"statement_id":0}]}

Fill a database with content and search for content in InfluxDB

Once we have jointly created the OnlineMarketDevelopment database and gained access to it via the USE command, InfluxDB is ready to accept requests and writes in it.
You can imagine data storage in InfluxDB as structured into time series, which form a time-dependent sequence of data points. The data points in turn belong to a measurement; in the application scenario I presented, this could be a measurement named SalesActivity. A time series can have zero to many data points, each data point representing one sample of the metric. A data point consists of a time (a point has a timestamp), at least one field key-value pair (the measured values of SalesActivity themselves, e.g. age=52i and buying=FALSE), and zero or more tag key-value pairs holding metadata about the measurement (e.g. gender="male", region="EMEA" and dc="koeln"). EMEA as the value of the tag key region stands for the economic area Europe, Middle East and Africa.
With regard to its place in the database platform, a measurement can be thought of as an SQL table in which the primary index (primary key) is always the time in the form of the timestamp. Tags and fields can be understood as columns in the table; tags are indexed, fields are not. Null values are not stored.

Points are written to InfluxDB using the line protocol. Its format follows the pattern in the code block below, where names in ()-brackets are placeholders and the []-brackets merely mark optional parts of the syntax. The pattern shows how tag key-value pairs are distinguished from field key-value pairs when assigned to a measurement: if the measurement name is followed by a comma, at least one tag is present; if you use no tag key-value pairs, the field key-value pairs follow directly after the measurement, separated only by a space. The first field key-value pair is never preceded by a comma. Multiple tag key-value pairs, like multiple field key-value pairs, are separated by commas; make sure there are no spaces between them, otherwise you will get an error message.

(measurement)[,(tag-key)=(tag-value),...] (field-key)=(field-value)[,(field2-key)=(field2-value),...] [unix-nanosecond-timestamp]

In the table below you will find all elements of this syntax, together with whether they are optional or required, their meaning and their data type.

Measurement (required): Name of the measurement; conceptually corresponds to the name an SQL table would be given. According to the syntax, exactly one measurement per data point is accepted when INSERTing a data set. Data type: String.

Tag collection (optional): All tag key-value pairs of the data point. Data type: both keys and values are stored as strings.

Field collection (required; a data point must contain at least one field): All field key-value pairs of the data point. Data type: field keys are always strings, field values are strings, integers, floats or Booleans.

Timestamp (optional; if it is not set at INSERT time, InfluxDB uses the server's local nanosecond timestamp in UTC): According to the syntax, exactly one timestamp per data point is accepted when INSERTing a data set. Data type: Unix nanosecond timestamp.
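The assembly rules above can be sketched as a small Python helper. This is a simplified illustration under my own naming: it does not implement the escaping rules of the full line protocol, only the comma/space structure described above:

```python
def to_line_protocol(measurement, fields, tags=None, timestamp=None):
    """Assemble a line protocol string:
    measurement[,tag=value,...] field=value[,field=value,...] [timestamp]"""
    line = measurement
    for key, value in (tags or {}).items():
        line += f",{key}={value}"  # tags follow the measurement, comma-separated
    # a space (never a comma) separates the first field from what precedes it
    line += " " + ",".join(f"{k}={v}" for k, v in fields.items())
    if timestamp is not None:
        line += f" {timestamp}"    # optional trailing timestamp
    return line

print(to_line_protocol(
    "SalesActivity",
    fields={"age": "52i", "buying": "FALSE"},
    tags={"region": '"EMEA"', "dc": '"koeln"', "gender": '"male"'},
    timestamp=1523962800,
))
# SalesActivity,region="EMEA",dc="koeln",gender="male" age=52i,buying=FALSE 1523962800
```

The printed line matches the INSERT used later in this tutorial.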

If you want to use the Boolean data type for certain field values, note that TRUE and FALSE can each be written in several ways in InfluxDB. The table below gives you a quick overview of the accepted spellings. As you can see, two more spellings per value are accepted when writing data points (via INSERT) than when reading them (e.g. via SELECT): t/T and f/F are valid only on write. They are still stored as the corresponding Boolean values, but in queries you have to use one of the spellings listed in the right-hand column.

TRUE: accepted for INSERT operations as t, T, true, True or TRUE; accepted for SELECT operations as true, True or TRUE
FALSE: accepted for INSERT operations as f, F, false, False or FALSE; accepted for SELECT operations as false, False or FALSE
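To make the write-side column of the table concrete, here is a tiny sketch (the helper name is my own) that interprets a boolean field value the way an INSERT would:

```python
# Spellings InfluxDB accepts for booleans when *writing* points
TRUE_WRITE = {"t", "T", "true", "True", "TRUE"}
FALSE_WRITE = {"f", "F", "false", "False", "FALSE"}

def parse_write_boolean(token: str) -> bool:
    """Interpret a boolean field value the way an INSERT would."""
    if token in TRUE_WRITE:
        return True
    if token in FALSE_WRITE:
        return False
    raise ValueError(f"not a boolean spelling: {token!r}")

print(parse_write_boolean("t"))      # True (valid on write only)
print(parse_write_boolean("FALSE"))  # False
```

Remember that on the query side only the full spellings (true/True/TRUE, false/False/FALSE) are valid.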

The INSERT operation known from SQL is also used in InfluxDB to write data points into a measurement, the conceptual SQL table. The code snippet below shows what such an INSERT into the measurement SalesActivity can look like. In this example, issued via the CLI, I set the timestamp of the data point myself, in epoch format in seconds. If you omit the timestamp, InfluxDB sets it automatically based on the current time. For whole numbers, i.e. values of data type INTEGER, note that an i must be appended when writing, otherwise they are interpreted as floating-point numbers. This convention applies only to integers; FLOAT values are unaffected and recognized directly as such. In the database, integer values are then stored as plain whole numbers without the trailing i; the suffix only matters at write time. That is why the field in the code snippet below is not age=52, but age=52i. You should also keep to the convention of putting all values intended as data type STRING in quotation marks in an INSERT, to make them explicitly distinguishable from other data types; otherwise a value could be picked up with the wrong type, for example the string "52i" instead of the number 52i. Note that this quoting convention concerns field values; tag values are always stored as strings anyway, so if you quote them, as I do below, the quotation marks become part of the stored tag value, as you will see in the query output later. The complete INSERT then looks as demonstrated in the code block below.

> INSERT SalesActivity,region="EMEA",dc="koeln",gender="male" age=52i,buying=FALSE 1523962800

Writing data, including the INSERT operation, is also possible via raw HTTP requests. In this case the POST request targets the /write endpoint rather than /query, and instead of the CLI's USE command the target database is selected directly after /write via ?db=. In the HTTP request below I could again set the timestamp myself, but I do not, so InfluxDB assigns it automatically.

> curl -i -XPOST 'http://localhost:8086/write?db=OnlineMarketDevelopment' --data-binary 'SalesActivity,region="EMEA",dc="koeln",gender="male" age=52i,buying=FALSE'

HTTP/1.1 200 Ok
Content-Type: application/json
Request-Id: 3afe35d0-3c14-11e8-8048-000000000000
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.5.1
X-Request-Id: 3afe35d0-3c14-11e8-8048-000000000000
Date: Mon, 09 Apr 2018 16:37:02 GMT

You can then query data records via the CLI using the SELECT statement. It works the same way as an SQL query: after SELECT you list the columns whose contents you want, and FROM specifies where they live, which conceptually corresponds to the relevant SQL table and in InfluxDB to the measurement SalesActivity. If the SalesActivity measurement already contained many data points, you could also narrow down the number of hits with a WHERE condition.

> SELECT "region", "gender", "age", "dc", "buying" FROM "SalesActivity"
name: SalesActivity
time       region gender age dc      buying
----       ------ ------ --- --      ------
1523962800 "EMEA" "male" 52  "koeln" false

If you use a raw HTTP request for the SELECT instead, the command looks like the one in the code block below: make a GET request to the /query endpoint, set the URL parameter db to the target database, and embed your query in q=.

curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode 'db=OnlineMarketDevelopment' --data-urlencode 'q=SELECT "region", "gender", "age", "dc", "buying" FROM "SalesActivity"'
{
    "results": [
        {
            "statement_id": 0,
            "series": [
                {
                    "name": "SalesActivity",
                    "columns": [
                        "time",
                        "region",
                        "gender",
                        "age",
                        "dc",
                        "buying"
                    ],
                    "values": [
                        [
                            "2018-04-17T13:00:00Z",
                            "\"EMEA\"",
                            "\"male\"",
                            52,
                            "\"koeln\"",
                            false
                        ]
                    ]
                }
            ]
        }
    ]
}

As you already know, timestamps in InfluxDB are returned by default in RFC3339 UTC with nanosecond precision. If you want Unix epoch timestamps instead, add the parameter epoch=x via a further --data-urlencode to your request, where the placeholder x must be an element from the list [h, m, s, ms, u, ns]. For example, you get epoch in seconds if your command looks like the code snippet below.

curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode 'db=OnlineMarketDevelopment' --data-urlencode 'epoch=s' --data-urlencode 'q=SELECT "region", "gender", "age", "dc", "buying" FROM "SalesActivity"'
{
    "results": [
        {
            "statement_id": 0,
            "series": [
                {
                    "name": "SalesActivity",
                    "columns": [
                        "time",
                        "region",
                        "gender",
                        "age",
                        "dc",
                        "buying"
                    ],
                    "values": [
                        [
                            1523962800,
                            "\"EMEA\"",
                            "\"male\"",
                            52,
                            "\"koeln\"",
                            false
                        ]
                    ]
                }
            ]
        }
    ]
}
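Conversely, if you want to convert a returned epoch value back into a human-readable RFC3339 timestamp on the client side, a short Python sketch (the helper name is my own):

```python
from datetime import datetime, timezone

def epoch_to_rfc3339(seconds: int) -> str:
    """Render epoch seconds as an RFC3339 UTC timestamp string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# The tutorial's timestamp: 11:00 UTC, which is 13:00 CEST local time
print(epoch_to_rfc3339(1523962800))  # 2018-04-17T11:00:00Z
```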

Having shown you how to create a database in InfluxDB and how to fill it with measurements and corresponding data points in the context of the application scenario presented at the beginning of this tutorial, I will finally turn to the part of the scenario that deals with visually processing and monitoring the stored data sets. For this I will use Chronograf, which, alongside InfluxDB, is part of the TICK stack.

Conclusion

You have now gained an insight into how time-series-based data storage and evaluation can be implemented with InfluxDB and managed with the associated stack components. This tutorial has given you a solid basis for understanding the InfluxDB database platform and its installation and setup, as well as an outlook on how to use its features. You can now apply what you have learned, ideally to exciting projects, be it for work, school, university or private purposes, and grow into the possibilities InfluxDB offers. I wish you success and, above all, a lot of fun working with InfluxDB.

