For detailed documentation of the Elastic Stack components, we suggest using the official Elastic Stack documentation.

Lab Layout

Objective: This part explains how the labs are structured and which snippet types are used for which environment.

How you should do the Lab

Each lab part has a theory section which is covered before beginning with the lab. The answer to each question is hidden behind a JavaScript button. Have a look at the following example.

  1. This will be a question
    Below you can see a JavaScript button with the hidden solution. Only open the solution when you are finished or struggling too much. Feel free to ask the teacher, who will be glad to help you out with hints.

    The solution will be shown here.

The code snippets and their representation

The code or commands are shown in snippets. There are different types of snippets which represent input in different areas (terminal, Kibana Dev Tools). Let us have a look at some.

Command line interface input
Command line interface output
Kibana devtool input
Kibana devtools output
File content
General or json data
This will be warnings
This will be informations or hints

Lab1: Setup Elastic Stack

Objective: In this lab the necessary system environment is provided, and Elasticsearch and Kibana are installed and configured on Node-1. Subsequently, Node-2 and Node-3 are added step by step as cluster members. Kibana is not installed on Node-2 and Node-3. In the lab we are working with Elastic Stack version 7.8.

System Environment

Since no DNS servers are available, we work in the lab with a local hosts file.
On the Linux servers, the file is already configured.

In the lab you have 3 Elastic Stack nodes at your disposal. We call them server1, server2 and server3.
To play with the Beats, a Linux server (linux1) and a Windows server (windows1) are also available.
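
The hosts file entries look roughly like the following sketch (the IP addresses are placeholders; the real addresses depend on your lab environment):
cat /etc/hosts
172.31.0.11   server1
172.31.0.12   server2
172.31.0.13   server3
172.31.0.21   linux1
172.31.0.22   windows1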

Configure Elasticsearch on (server1)

The instructions for the installation can be found here: Installing Elasticsearch
Information about discovery and cluster formation
Elasticsearch is already installed!
We configure first only the server1
  1. Configure Elasticsearch
    You need to customize the file elasticsearch.yml. Uncomment and customize the following variables
    cluster.name, node.name, network.host, discovery.seed_hosts and cluster.initial_master_nodes. The file is located in the folder /etc/elasticsearch.

    Use following values
    cluster name:  my-application
    node name:     ${HOSTNAME}
    network.host:  _site_
    
    vim /etc/elasticsearch/elasticsearch.yml
    cluster.name: my-application 
    node.name: ${HOSTNAME}
    network.host: _site_
    discovery.seed_hosts:
      - server1
      - server2
      - server3
    cluster.initial_master_nodes:
      - server1
      - server2
      - server3
  2. Setting JVM Options
    By default, Elasticsearch uses a heap with a minimum and maximum size of 1 GB. When moving to production, it is important to configure the heap size to ensure that Elasticsearch has enough heap available.

    Good rules of thumb are:
    Set Xms and Xmx equal to each other
    Set Xmx to no more than 50% of the physical RAM

    Adjust the heap size values for a system with 8 GB of RAM. (A quick way to verify the effective heap size once the node is running is sketched after this list.)

    vim /etc/elasticsearch/jvm.options
    Change the values of Xms and Xmx
    -Xms4g
    -Xmx4g
  3. Start Elasticsearch
    Our installation uses systemd

    systemctl start elasticsearch
    
  4. Check the Elasticsearch daemon
    Check whether the service is running

    systemctl status elasticsearch
    Your service should be => Active: active (running)
    elasticsearch.service - Elasticsearch
      Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
      Active: active (running) since Mon 2019-06-24 13:08:46 UTC; 2min 58s ago
        Docs: http://www.elastic.co
     Main PID: 21620 (java)
  5. Check elasticsearch
    Get the information about this Elasticsearch node. Use the curl -X GET command to get this information.

    Use Curl on the shell command line
    curl -X GET "server1:9200/"
    You should see the following information
    {
       "name" : "server1",
       "cluster_name" : "my-application",
       "cluster_uuid" : "_na_",
       "version" : {
         "number" : "7.9.3",
         "build_flavor" : "default",
         "build_type" : "rpm",
         "build_hash" : "7a013de",
         "build_date" : "2019-05-23T14:04:00.380842Z",
         "build_snapshot" : false,
         "lucene_version" : "8.0.0",
         "minimum_wire_compatibility_version" : "6.8.0",
         "minimum_index_compatibility_version" : "6.0.0-beta1"
       },
       "tagline" : "You Know, for Search"
     } 
    cluster_uuid is still "_na_" at the moment because the cluster has not yet been formed. We need more nodes!
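
Once the node is running you can also verify that the heap settings from the JVM options step took effect. heap.max is one of the optional columns of the _cat/nodes API; on an 8 GB machine configured as above it should report a value around 4 GB.
curl -X GET "server1:9200/_cat/nodes?v&h=name,heap.max"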

Configure Elasticsearch on (server2 and server3) to build the Elastic Cluster

In this section, Elasticsearch will be installed on the remaining two nodes (server2 and server3). The goal is that the Elastic Cluster (my-application) then runs on all 3 nodes and the Index shards are distributed accordingly.

Carry out the following steps first completely at server2 and then at server3
Elasticsearch is already installed!
All we have to do is adjust the configuration!
Cluster name should be: my-application
Node name: ${HOSTNAME}
For network.host: use the site-local address value (_site_), not the IP address
  1. Configure Elasticsearch
    You need to customize the file elasticsearch.yml. Uncomment and customize the following variables cluster.name, node.name, network.host, discovery.seed_hosts and cluster.initial_master_nodes. The file is located in the folder /etc/elasticsearch. For values, see information block above.

    1. Configuration for server2

      vim /etc/elasticsearch/elasticsearch.yml 
      cluster.name: my-application 
      node.name: ${HOSTNAME}
      network.host: _site_
      discovery.seed_hosts:
        - server1
        - server2
        - server3
      cluster.initial_master_nodes:
        - server1
        - server2
        - server3
    2. Configuration for server3

      vim /etc/elasticsearch/elasticsearch.yml 
      cluster.name: my-application 
      node.name: ${HOSTNAME}
      network.host: _site_
      discovery.seed_hosts:
        - server1
        - server2
        - server3
      cluster.initial_master_nodes:
        - server1
        - server2
        - server3
  2. Setting JVM Options
    Adjust the heap size values for a system with 8 GB RAM

    vim /etc/elasticsearch/jvm.options
    Change the values of Xms and Xmx
    -Xms4g
    -Xmx4g
  3. Start Elasticsearch
    Our installation uses systemd

    systemctl start elasticsearch
    
  4. Check elasticsearch
    Get the information about this Elasticsearch node. Use the curl -X GET command to get this information.

    1. server2

      Use Curl on the shell command line
      curl -X GET "server2:9200/"
      You should see the following information
      {
         "name" : "server2"
         "cluster_name" : "my-application",
         "cluster_uuid" : "lu4fLzHESyypRz9vHxPxOA",
         "version" : {
           "number" : "7.9.3",
           "build_flavor" : "default",
           "build_type" : "rpm",
           "build_hash" : "7a013de",
           "build_date" : "2019-05-23T14:04:00.380842Z",
           "build_snapshot" : false,
           "lucene_version" : "8.0.0",
           "minimum_wire_compatibility_version" : "6.8.0",
           "minimum_index_compatibility_version" : "6.0.0-beta1"
         },
         "tagline" : "You Know, for Search"
       } 
      The cluster has now been created and the cluster UUID has been generated.
    2. server3

      Use Curl on the shell command line
      curl -X GET "server3:9200/" 
      You should see the following information
      {
         "name" : "server3"
         "cluster_name" : "my-application",
         "cluster_uuid" : "lu4fLzHESyypRz9vHxPxOA",
         "version" : {
           "number" : "7.9.3",
           "build_flavor" : "default",
           "build_type" : "rpm",
           "build_hash" : "7a013de",
           "build_date" : "2019-05-23T14:04:00.380842Z",
           "build_snapshot" : false,
           "lucene_version" : "8.0.0",
           "minimum_wire_compatibility_version" : "6.8.0",
           "minimum_index_compatibility_version" : "6.0.0-beta1"
         },
         "tagline" : "You Know, for Search"
       } 
  5. Check Cluster Nodes
    We can get a list of nodes from our cluster as follows. This check can be executed on server1. Use the curl -X GET command and the _cat API to get this information.

    Use Curl on the shell command line
    curl -X GET "http://server1:9200/_cat/nodes?v"
    You should see the following information
     ip              cpu load_1m load_5m load_15m node.role master     name
     172.31.17.125   2   0.44    0.50    0.41     mdi       *          server1
     172.31.17.126   2   0.41    0.48    0.38     mdi       -          server2
    After setting up server3 you should see 3 nodes
  6. Check Cluster Health
    This is a basic health check which we can use to see how our cluster is doing. It can be executed on server1. Use the curl -X GET command and the _cat API to get this information.

    Use Curl on the shell command line
    curl -X GET "http://server1:9200/_cat/health?v"
    You should see the following information
    epoch      timestamp cluster      status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
    1561793849 07:37:29  my-application green           2         2      0 0    0    0        0             0                  -                100.0%
    After setting up server3, node.total and node.data should be 3. (A JSON alternative to this health check is sketched below.)
Now go back and set up and configure server3.
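
If you prefer JSON output over the _cat format, the cluster health API returns the same information; the node counts and the status field should match the _cat/health check above.
curl -X GET "http://server1:9200/_cluster/health?pretty"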

Configure Kibana

Kibana is provided in the following package formats: tar.gz, zip, deb, rpm, docker, brew. For our installation we used an RPM package. The instructions for the installation can be found here: Installing Kibana
Kibana is already installed!
We configure Kibana only on server1
  1. Setup Kibana configuration
    Update the Kibana configuration so that we get access via the browser. In our environment, the public DNS name is mapped to the internal address, so we enter the name of the node here, which is server1. For the Elasticsearch hosts, all Elasticsearch servers must be entered.

    The file is located in the folder /etc/kibana.

    Use following values
    server host:          server1
    server name:          server1
    elasticsearch hosts:  URL for server1, server2 and server3
    

    vim /etc/kibana/kibana.yml
    Uncomment and change the variables server.host, server.name and elasticsearch.hosts
    server.host: "server1"
    server.name: "server1"
    elasticsearch.hosts:
      - http://server1:9200
      - http://server2:9200
      - http://server3:9200
  2. Define Kibana Log Directory
    By default Kibana logs to /var/log/messages. We want to change that: Kibana should log to /var/log/kibana/kibana.log. Create the directory and set its ownership to kibana:kibana.

    mkdir /var/log/kibana
    chown kibana:kibana /var/log/kibana
    

    Update Kibana configuration and change the logging file to "/var/log/kibana/kibana.log"

    vim /etc/kibana/kibana.yml
    Uncomment and change the variable logging.dest
    logging.dest: /var/log/kibana/kibana.log
    Please adjust the log rotation so that the kibana.log file is also rotated! In our environment it is not necessary at the moment, but it is important on a production system. (A minimal logrotate sketch is given at the end of this lab section.)
  3. Start Kibana
    Our installation uses systemd

    systemctl start kibana
    
  4. Check the Kibana Daemon
    Check if the service is running

    systemctl status kibana
    Your service should be => Active: active (running)
     kibana.service - Kibana
      Loaded: loaded (/etc/systemd/system/kibana.service; enabled; vendor preset: disabled)
      Active: active (running) since Tue 2019-06-25 06:31:54 UTC; 11s ago
     Main PID: 13507 (node)
  5. Check Kibana GUI

    Use Port 5601 for access to the Kibana GUI

    Open a new browser tab and enter the URL (public DNS) of server1
    The machine info shows you the public DNS

    Machine Info

    for example: ec2-52-57-212-149.eu-central-1.compute.amazonaws.com

    Machine Info

    Access the Kibana GUI with the right Port 5601
    http://ec2-52-57-212-149.eu-central-1.compute.amazonaws.com:5601

    You must use the public DNS which you will find inside your Strigo lab.
    Machine Info

    Click on the Elastic icon on the top left, then on [Try our sample data]

    Then select "Sample web logs" and click on [Add data]

    Sample web logs
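
As mentioned in the Kibana log directory step, the kibana.log file should be rotated on a production system. A minimal logrotate sketch could look like the following (the policy values are only an example; adapt them to your environment):
vi /etc/logrotate.d/kibana
/var/log/kibana/kibana.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}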

Summary of important Curl Commands for Cluster Overview

Important shell commands to get an overview of the Elastic Cluster.

Show cluster nodes
curl -X GET "server1:9200/_cat/nodes?v"
Cluster nodes sorted by name
curl -X GET "http://server1:9200/_cat/nodes?v&s=name"
Show cluster health
curl -X GET "server1:9200/_cat/health?v"
Show cluster health only with cluster, status, node.total and node.data
curl -X GET "http://server1:9200/_cat/health?v&h=cluster,status,node.total,node.data"
Show elasticsearch indices
curl -X GET "server1:9200/_cat/indices?v"

Lab2: Security Configuration

Objective: In this lab, you will encrypt all communication. This concerns transport communication (TCP-9300) and REST API traffic (TCP-9200).

Clusters that do not have encryption enabled send all data in plain text including passwords. If the Elasticsearch security features are enabled, unless you have a trial license, you must configure SSL/TLS for internode-communication.

Current insecure Configuration

Not secure Infrastructure

NO authentication
NO transport encryption
NO REST API encryption
NO browser encryption

  1. Stop Elasticsearch
    Stop Elasticsearch on all 3 nodes server1, server2 and server3

    systemctl stop elasticsearch
    

Configuration Secure Transport Communication (transport.ssl)

Transport communication is used for internal communication between nodes within the cluster

Securing Elasticsearch-Elasticsearch communication

NO authentication
YES transport encryption
NO REST API encryption
NO browser encryption

The configuration has to be done on server1, server2 and server3
  1. Install Certificates
    All certificates are already created and available on the servers. They need to be copied to the elasticsearch config directory.

    1. Create /etc/elasticsearch/certs directory

      sudo mkdir /etc/elasticsearch/certs
      
    2. CA certificate

      sudo cp /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt /etc/elasticsearch/certs/
      
    3. Server key

      1. On server1

        sudo cp /etc/pki/tls/private/server1.key /etc/elasticsearch/certs
        
      2. On server2

        sudo cp /etc/pki/tls/private/server2.key /etc/elasticsearch/certs
        
      3. On server3

        sudo cp /etc/pki/tls/private/server3.key /etc/elasticsearch/certs
        
    4. Server certificate

      1. On server1

        sudo cp /etc/pki/tls/certs/server1.crt /etc/elasticsearch/certs
        
      2. On server2

        sudo cp /etc/pki/tls/certs/server2.crt /etc/elasticsearch/certs
        
      3. On server3

        sudo cp /etc/pki/tls/certs/server3.crt /etc/elasticsearch/certs
        
  2. Xpack Security Transport Setup
    Now we need to add some xpack configuration entries to the Elasticsearch configuration on all 3 nodes.
    Insert the following lines at the end of the configuration file.

    1. Only for server1

      vi /etc/elasticsearch/elasticsearch.yml
      # Encrypting communications between nodes in a cluster
      xpack.security.enabled: true
      xpack.security.transport.ssl.enabled: true
      xpack.security.transport.ssl.certificate: certs/server1.crt
      xpack.security.transport.ssl.key: certs/server1.key
      xpack.security.transport.ssl.certificate_authorities:
      - certs/labs.strigo.io.crt 
    2. Only for server2

      vi /etc/elasticsearch/elasticsearch.yml
      # Encrypting communications between nodes in a cluster
      xpack.security.enabled: true
      xpack.security.transport.ssl.enabled: true
      xpack.security.transport.ssl.certificate: certs/server2.crt
      xpack.security.transport.ssl.key: certs/server2.key
      xpack.security.transport.ssl.certificate_authorities:
      - certs/labs.strigo.io.crt 
    3. Only for server3

      vi /etc/elasticsearch/elasticsearch.yml
      # Encrypting communications between nodes in a cluster
      xpack.security.enabled: true
      xpack.security.transport.ssl.enabled: true
      xpack.security.transport.ssl.certificate: certs/server3.crt
      xpack.security.transport.ssl.key: certs/server3.key
      xpack.security.transport.ssl.certificate_authorities:
      - certs/labs.strigo.io.crt 
  3. Start Elasticsearch
    If steps 1 and 2 have been performed on all nodes, all nodes can be started step by step.

    1. Only for server1
      Start server1 and check the log file. The log file can be found in the directory /var/log/elasticsearch and has the name my-application.log.

      systemctl start elasticsearch 
      After about 1 minute the following entry should be visible in `/var/log/elasticsearch/my-application.log` on server1
      [2019-07-02T14:25:43,361][WARN ][o.e.c.c.ClusterFormationFailureHelper] [server1] master not discovered or elected yet, an election requires at least 2 nodes with ids from [kGjoIp-ESCu7cgeTaSp5Pg, qy1aF2xiQzCy5cQPDNOSmw, Z5IzSnbcRjCS1I4iJzK0sA], have discovered [] which is not a quorum; discovery will continue using [172.31.37.188:9300, 172.31.43.187:9300] from hosts providers and [{server1}{kGjoIp-ESCu7cgeTaSp5Pg}{pdP0VosIRIeD24SFdxakyw}{172.31.38.140}{172.31.38.140:9300}{ml.machine_memory=3971964928, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 4, last-accepted version 67 in term 4 
    2. Only for server2
      Start server2 and check the log files. The log file can be found in the directory /var/log/elasticsearch and has the name my-application.log.

      systemctl start elasticsearch 
      After about 1 minute the following entry should be visible in `/var/log/elasticsearch/my-application.log` on server1
      [2019-07-02T14:41:22,578][INFO ][o.e.c.s.ClusterApplierService] [server1] master node changed {previous [], current [{server2}{Z5IzSnbcRjCS1I4iJzK0sA}{l7yRtnBuS0WfWkyaIRzcXQ}{172.31.37.188}{172.31.37.188:9300}{ml.machine_memory=3971964928, ml.max_open_jobs=20, xpack.installed=true}]}, added {{server2}{Z5IzSnbcRjCS1I4iJzK0sA}{l7yRtnBuS0WfWkyaIRzcXQ}{172.31.37.188}{172.31.37.188:9300}{ml.machine_memory=3971964928, ml.max_open_jobs=20, xpack.installed=true},}, term: 5, version: 68, reason: ApplyCommitRequest{term=5, version=68, sourceNode={server2}{Z5IzSnbcRjCS1I4iJzK0sA}{l7yRtnBuS0WfWkyaIRzcXQ}{172.31.37.188}{172.31.37.188:9300}{ml.machine_memory=3971964928, ml.max_open_jobs=20, xpack.installed=true}}
      
      The following entry should be visible in `/var/log/elasticsearch/my-application.log` on server2
      [2019-07-02T14:41:22,375][INFO ][o.e.c.s.MasterService    ] [server2] elected-as-master ([2] nodes joined)[{server2}{Z5IzSnbcRjCS1I4iJzK0sA}{l7yRtnBuS0WfWkyaIRzcXQ}{172.31.37.188}{172.31.37.188:9300}{ml.machine_memory=3971964928, xpack.installed=true, ml.max_open_jobs=20} elect leader, {server1}{kGjoIp-ESCu7cgeTaSp5Pg}{qJwshrNeR_KJHNwOWFxQ8A}{172.31.38.140}{172.31.38.140:9300}{ml.machine_memory=3971964928, ml.max_open_jobs=20, xpack.installed=true} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 5, version: 68, reason: master node changed {previous [], current [{server2}{Z5IzSnbcRjCS1I4iJzK0sA}{l7yRtnBuS0WfWkyaIRzcXQ}{172.31.37.188}{172.31.37.188:9300}{ml.machine_memory=3971964928, xpack.installed=true, ml.max_open_jobs=20}]}, added {{server1}{kGjoIp-ESCu7cgeTaSp5Pg}{qJwshrNeR_KJHNwOWFxQ8A}{172.31.38.140}{172.31.38.140:9300}{ml.machine_memory=3971964928, ml.max_open_jobs=20, xpack.installed=true},}
      [2019-07-02T14:41:22,597][INFO ][o.e.c.s.ClusterApplierService] [server2] master node changed {previous [], current [{server2}{Z5IzSnbcRjCS1I4iJzK0sA}{l7yRtnBuS0WfWkyaIRzcXQ}{172.31.37.188}{172.31.37.188:9300}{ml.machine_memory=3971964928, xpack.installed=true, ml.max_open_jobs=20}]}, added {{server1}{kGjoIp-ESCu7cgeTaSp5Pg}{qJwshrNeR_KJHNwOWFxQ8A}{172.31.38.140}{172.31.38.140:9300}{ml.machine_memory=3971964928, ml.max_open_jobs=20, xpack.installed=true},}, term: 5, version: 68, reason: Publication{term=5, version=68}
      
    3. Only for server3
      Start server3 and check the log files. The log file can be found in the directory /var/log/elasticsearch and has the name my-application.log.

      systemctl start elasticsearch 
      After about 1 minute the following entry should be visible in `/var/log/elasticsearch/my-application.log` on server1
      [2019-07-02T14:48:37,171][INFO ][o.e.c.s.ClusterApplierService] [server1] added {{server3}{qy1aF2xiQzCy5cQPDNOSmw}{zfNG5c9zSbinIcSNBj6wjw}{172.31.43.187}{172.31.43.187:9300}{ml.machine_memory=3971964928, ml.max_open_jobs=20, xpack.installed=true},}, term: 5, version: 79, reason: ApplyCommitRequest{term=5, version=79, sourceNode={server2}{Z5IzSnbcRjCS1I4iJzK0sA}{l7yRtnBuS0WfWkyaIRzcXQ}{172.31.37.188}{172.31.37.188:9300}{ml.machine_memory=3971964928, ml.max_open_jobs=20, xpack.installed=true}}
      
      The following entry should be visible in `/var/log/elasticsearch/my-application.log` on server3
      [2019-07-17T08:47:54,719][INFO ][o.e.c.s.ClusterApplierService] [server3] master node changed {previous [], current [{server2}{4n5FLA0wS6WY6D8xt2AyaA}{lY4wg5UzS5iqFF7mtAnVhA}{172.31.30.254}{172.31.30.254:9300}{ml.machine_memory=8201240576, ml.max_open_jobs=20, xpack.installed=true}]}, added {{server2}{4n5FLA0wS6WY6D8xt2AyaA}{lY4wg5UzS5iqFF7mtAnVhA}{172.31.30.254}{172.31.30.254:9300}{ml.machine_memory=8201240576, ml.max_open_jobs=20, xpack.installed=true},{server1}{UkAGJTfCRaieSkZH1-vhJQ}{adHrdPR9T9G6ncXKeoo6Cw}{172.31.26.49}{172.31.26.49:9300}{ml.machine_memory=8201240576, ml.max_open_jobs=20, xpack.installed=true},}, term: 4, version: 52, reason: ApplyCommitRequest{term=4, version=52, sourceNode={server2}{4n5FLA0wS6WY6D8xt2AyaA}{lY4wg5UzS5iqFF7mtAnVhA}{172.31.30.254}{172.31.30.254:9300}{ml.machine_memory=8201240576, ml.max_open_jobs=20, xpack.installed=true}}
      
  4. Check Cluster Nodes again
    Try the cluster node check again. Use the curl -X GET command with the _cat API.

    Use Curl on the shell command line
    curl -X GET "http://server1:9200/_cat/nodes?v"
    You will get the following error
    {"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/_cat/nodes?v]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/_cat/nodes?v]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}
    The reason is that with curl we must now also authenticate ourselves.
    But first we have to set up the passwords; see the next section. (A quick check that the transport port itself now speaks TLS is sketched below.)
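
If you want to confirm that the transport port (9300) now really speaks TLS, you can run a quick handshake test with openssl against the CA used above (a sketch; any of the three servers works):
openssl s_client -connect server1:9300 -CAfile /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt </dev/null
The output should show the server certificate chain and end with Verify return code: 0 (ok).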

Elasticsearch Setup Password

The elasticsearch-setup-passwords command sets the passwords for the built-in users.

Authentication for Elasticsearch access

YES authentication
YES transport encryption
NO REST API encryption
NO browser encryption

  1. Elasticsearch Setup Password Interactive

    Use password as password every time
    The passwords only need to be set once; it does not matter on which node.

    You can find the script for setting the passwords in the directory /usr/share/elasticsearch/bin. Start the script elasticsearch-setup-passwords with the parameter interactive.

    /usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive
    
    Good luck entering correctly ;-)
    Initiating the setup of passwords for reserved users elastic,apm_system,kibana,logstash_system,beats_system,remote_monitoring_user.
    You will be prompted to enter passwords as the process progresses.
    Please confirm that you would like to continue [y/N] Y 
    Enter password for [elastic]: password
    Reenter password for [elastic]: password
    Enter password for [apm_system]: password
    Reenter password for [apm_system]: password
    Enter password for [kibana_system]: password
    Reenter password for [kibana_system]: password
    Enter password for [logstash_system]: password
    Reenter password for [logstash_system]: password
    Enter password for [beats_system]: password
    Reenter password for [beats_system]: password
    Enter password for [remote_monitoring_user]: password
    Reenter password for [remote_monitoring_user]: password
    Changed password for user [apm_system]
    Changed password for user [kibana_system]
    Changed password for user [logstash_system]
    Changed password for user [beats_system]
    Changed password for user [remote_monitoring_user]
    Changed password for user [elastic]
  2. Check Node configuration again
    The command can be executed on any node. Use the curl -X GET command with the _cat API.
    The user elastic with the password password must now be specified as credentials.

    curl -X GET "http://elastic:password@server1:9200/_cat/nodes?v" 
    ip            heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
    172.31.37.188            9          97   4    0.00    0.01     0.05 mdi       *      server2
    172.31.43.187           11          97  15    0.00    0.01     0.05 mdi       -      server3
    172.31.38.140           10          95   4    0.00    0.01     0.05 mdi       -      server1 

    Now we have to adjust Kibana. Otherwise we will no longer have access to the GUI.
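
Before moving on, as an additional check you can ask Elasticsearch which user you are authenticated as. The authenticate API is available once the security features are enabled and should report the elastic user and its roles:
curl -X GET "http://elastic:password@server1:9200/_security/_authenticate?pretty"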

Kibana Secure Setting (Authentication)

When the Elasticsearch security features are enabled, users must log in to Kibana with a valid user ID and password.

Configuring authentication credentials for Kibana to access Elasticsearch

YES authentication
YES transport encryption
NO REST API encryption
NO browser encryption

Some settings are sensitive, and relying on filesystem permissions to protect their values is not sufficient. For this use case, Kibana provides a keystore, and the kibana-keystore tool to manage the settings in the keystore. For example, authentication credentials for Elasticsearch.

To do only on the server where Kibana is installed; in our case on server1
  1. Use Kibana Keystore
    Some settings are sensitive, and relying on filesystem permissions to protect their values is not sufficient. For this use case, Kibana provides a keystore, and the kibana-keystore tool to manage the settings in the keystore. The kibana_system account is used for the communication between Kibana and Elasticsearch! You will find the necessary script in the directory /usr/share/kibana/bin. With kibana-keystore you can create a new keystore file and fill in its content.

    Have a look at the documentation on how to work with the Kibana keystore.
    Kibana keystore settings
    To run a shell command as a different user, you can use: su -s /bin/bash -c '<command>' <username>

    Create a new keystore file:

    We need this construct because kibana is a non-login user; here we simply run the commands as root with --allow-root. (An alternative that runs them as the kibana user is sketched at the end of this section.)
    /usr/share/kibana/bin/kibana-keystore create --allow-root 
    Created Kibana keystore in /var/lib/kibana/kibana.keystore
    

    Sensitive string settings, like the authentication credentials for Elasticsearch, can be added using the add command. Add the following to the keystore.

    elasticsearch.username : kibana_system
    
    /usr/share/kibana/bin/kibana-keystore add elasticsearch.username --allow-root 
    Enter value for elasticsearch.username: kibana_system
    

    elasticsearch.password : password
    
    /usr/share/kibana/bin/kibana-keystore add elasticsearch.password --allow-root 
    Enter value for elasticsearch.password: password
    
  2. Restart kibana
    Now we only need to restart the service and have access to the Kibana GUI again.

    systemctl restart kibana
    
  3. Login via GUI
    Reload the browser tab. After a short time the login screen should appear in the Kibana GUI.
    As login account we now use elastic with the password password. This account has all rights.

    If Internet access to the Kibana port 5601 is blocked, the following NAT rules must be set up on server1.
    iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 5601
    iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 443 -j REDIRECT --to-port 5601
    
    Setting NAT rules permanently
    /usr/libexec/iptables/iptables.init save
    

    Machine Info

    After login the Home screen should appear again.

    Machine Info
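
As an alternative to the --allow-root flag used in step 1, the keystore can also be created and filled as the kibana user, using the su construct from the hint above (a sketch of the same commands):
su -s /bin/bash -c '/usr/share/kibana/bin/kibana-keystore create' kibana
su -s /bin/bash -c '/usr/share/kibana/bin/kibana-keystore add elasticsearch.username' kibana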

Configure secure HTTP SSL Traffic for REST API (http.ssl)

The Elasticsearch REST APIs are exposed over HTTP and are used, for example, by Kibana and curl. Now we also want to encrypt this communication between the HTTP clients and the Elasticsearch servers.

Securing REST API access to Elasticsearch

YES authentication
YES transport encryption
YES REST API encryption
NO browser encryption

  1. Stop Elasticsearch
    Once more we have to stop Elasticsearch on all 3 nodes server1, server2 and server3

    systemctl stop elasticsearch
    
  2. Encrypting HTTP Client Communications
    When security features are enabled, you can optionally use TLS to ensure that communication between HTTP clients and the cluster is encrypted.

    Todo: This change must be made on all nodes (server1, server2 and server3)

    The following lines must be added at the end of the Elasticsearch configuration. On each node, reference that node's own certificate and key from /etc/elasticsearch/certs (for example server1.crt and server1.key on server1).

    vi /etc/elasticsearch/elasticsearch.yml 
    #Encrypting HTTP Client communications
    xpack.security.http.ssl.enabled: true
    xpack.security.http.ssl.certificate: certs/server.crt
    xpack.security.http.ssl.key: certs/server.key
    xpack.security.http.ssl.certificate_authorities:
    - certs/labs.strigo.io.crt
    
  3. Start Elasticsearch
    Start Elasticsearch on all 3 nodes server1, server2 and server3

    systemctl start elasticsearch
    
  4. Check Cluster Nodes
    Check if the connection is still working. The command can be executed on any node. The protocol is now https and no longer http. Use the curl -X GET command with the _cat API.

    curl  -X GET "https://elastic:password@server1:9200/_cat/nodes?v" 
    ip            heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
    172.31.16.113           39          94   8    0.00    0.12     0.16 mdi       -      server1
    172.31.25.111           17          97   0    0.00    0.01     0.05 mdi       *      server2
    172.31.18.54            26          97   0    0.00    0.01     0.05 mdi       -      server3
    
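curl works here without extra TLS options because the lab CA certificate lives in /etc/pki/ca-trust/source/anchors and is therefore presumably part of the system trust store on the servers. On a client where that is not the case, you would point curl at the CA explicitly, roughly like this:
curl --cacert /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt -X GET "https://elastic:password@server1:9200/_cat/nodes?v"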

Configure Kibana to connect to Elasticsearch

You also have to change the Kibana configuration, otherwise Kibana won’t have access anymore to Elasticsearch and you will see the following error message: {"statusCode":500,"error":"Internal Server Error","message":"An internal server error occurred"}

Kibana access to Elasticsearch

YES authentication
YES transport encryption
YES REST API encryption
NO browser encryption

To do only on the node where Kibana is installed; in our case on server1

We have to adjust the Kibana configuration so that Elasticsearch is called via https, and the path to the certificateAuthorities has to be entered.

vim /etc/kibana/kibana.yml 
elasticsearch.hosts:
- https://server1:9200
- https://server2:9200
- https://server3:9200
elasticsearch.ssl.certificateAuthorities:
- /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt
systemctl restart kibana 

Configure secure HTTP Client Communication (SSL over 5601)

Configure Kibana to encrypt communications between the browser and the Kibana server.

Securing Kibana-Browser communication

YES authentication
YES transport encryption
YES REST API encryption
YES browser encryption

You do not need to enable X-Pack security for this type of encryption.
To do only on the node where Kibana is installed; in our case on server1

Set the properties server.ssl.enabled, server.ssl.certificate and server.ssl.key in kibana.yml and then restart Kibana one last time.

vim /etc/kibana/kibana.yml 
server.ssl.enabled: true
server.ssl.certificate: /etc/pki/tls/certs/server1.crt
server.ssl.key: /etc/pki/tls/private/server1.key
systemctl restart kibana

Accept the Risk and Continue. Depending on the browser, the window may look different.

Browser Message

Communication now takes place via SSL over port 5601

Reporting settings in Kibana

You can configure xpack.reporting settings in your kibana.yml. Reporting is enabled by default. You have to set the xpack.reporting.encryptionKey too.

vi /etc/kibana/kibana.yml 
Add these lines to the end of the configuration file.
# General reporting settings
xpack.reporting.enabled: true
xpack.reporting.encryptionKey: password
Restart Kibana for the last time. Yupi! 😄.
systemctl restart kibana

Lab3: Elasticsearch

Lab Elasticsearch 3-1

Objective: Indices, Shards, cat API, Documents API

  1. Go to the Kibana Dev Tools GUI and enter the following query

    GET _cat/nodes?v
    
    You can get to the Dev Tools GUI as shown in the image below.
    You should get the list of all running Elasticsearch nodes. Note that the option v (?v) is for verbose output (Elasticdoc cat verbose). You can send a request with the green triangle button (or Ctrl+Enter / Cmd+Enter).
    ip            cpu node.role master name
    172.31.21.216   1 mdi       -      server2
    172.31.30.175   0 mdi       *      server3
    172.31.22.221   1 mdi       -      server1
    
  2. We are going to create an index and put a document in the index

    1. Find in the elasticsearch documentation how you can index a new document into an index

      The `Indices API` allows you to create an index (Elasticdoc create index).
      Among the `Document APIs` you can find the `Index API`, which allows you to perform index operations (Elasticdoc index a document).
      You can also index a document directly into a non-existing index; Elasticsearch will create the index if it does not exist (Elasticdoc auto index creation).
    2. Create a new index and index the following document

      You can use the auto creation of the index during indexing.
      Elasticdoc index document.
      index name: people
      id of document: my_id
      document: { "firstname":"alain", "lastname":"muster", "birthday":"18.05.1985" }
      The following request indexes a document with an id that you define (`my_id`). It also creates the index if it does not already exist.
      PUT people/_doc/my_id
      {
      	"firstname":"alain",
      	"lastname":"muster",
      	"birthday":"18.05.1985"
      }
      
      After sending the request you will get a response on the right side of Kibana Dev Tools.
      {
        "_index" : "people",
        "_type" : "_doc",
        "_id" : "my_id",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1
      }
      
    3. You can get the doc with the id you have given

      GET people/_doc/my_id
      
    4. For updating a document you can also address it by its id

      POST people/_doc/my_id
      {
      	"firstname":"othername"
      }
      

      The response to this command contains a field called result. This will have the value `updated` and the field `_version` will be incremented. But as you can see, you have replaced the whole document, so with this approach you need to supply the whole document.
      {
        "_index" : "people",
        "_type" : "_doc",
        "_id" : "my_id",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1
      }
      
    5. Update just one field of a document.
      Try to get the modified document; as you will see, the whole doc has been replaced. In this task we want to update only one field of the document.

      GET people/_doc/my_id
      
      Insert the first doc again
      PUT people/_doc/my_id
      {
      	"firstname":"alain",
      	"lastname":"muster",
      	"birthday":"18.05.1985"
      }
      
      Now update only the field firstname to the value "othername"
      POST people/_update/my_id
      {
        "doc":{"firstname":"othername"}
      }
      
    6. Use the cat indices api to get all indices from elasticsearch

      GET _cat/indices?v
      
      The result of the `_cat/indices?v` should list all indices. You should find the index `people` in the list.
    7. Can you tell me how many shards the created index people has?

      You should find this information with the cat indices api.
      Elasticdoc cat indices
      GET _cat/indices/people?v&h=health,status,pri,rep
      
      The columns pri (primary shards) and rep (replica shards) show this. If you don't explicitly define anything, Elasticsearch applies the default settings for the number of shards. So there should be 1 primary shard and 1 replica for the index people. The output should look like the following snippet; you may have some other columns.
      health status index  pri rep
      green open   people   1   1
      
  3. We are going to have a look at the settings of the index which we have created

    1. Which API can be used to get information about the settings of an index?

      There is a `get settings API` among the `Indices APIs`, which can be used to get the settings of an index (Elasticdoc get index settings).
    2. get the settings of the index people.

      GET people/_settings
      
      As with the `cat API`, you can see the number of shards, and if further settings were configured you should be able to see the other settings applied to this index. We will see this during the further exercises.
      The result should look similar to the following snippet
      {
        "people" : {
          "settings" : {
            "index" : {
              "creation_date" : "1564048120123",
              "number_of_shards" : "1",
              "number_of_replicas" : "1",
              "uuid" : "_KPaln-GSxSJGAtLI5x83A",
              "version" : {
                "created" : "7010199"
              },
              "provided_name" : "people"
            }
          }
        }
      }
      
  4. We are going to change the settings of the index we have created

    1. How can you change the number of primary shards of the index people?

      For updating the index settings you can use the `update index settings` API out of the `Indices APIs` collection. This is the same `_settings` endpoint you used to get the settings, but called with the method `PUT` instead of `GET` (Elasticdoc update index settings).
      As you may have found out, it is not possible to modify a static index setting (Elasticdoc index settings) such as the number of primary shards once the index holds documents; the number of primary shards is fixed at index creation.
      So if you need to change the number of primary shards of an index, you need to create a new index and set the number of primary shards before indexing any documents (Elasticdoc create index with predefined settings).
    2. Delete the existing index people and create a new index named people with the number of primary shards set to 3 and the number of replicas set to 1. Then index the following doc into the people index and let Elasticsearch define the id of the document.

      {
      	"firstname":"alain",
      	"lastname":"muster",
      	"birthday":"18.05.1985"
      }
      
      First delete the existing index.
      DELETE people
      

      Then create a new index with name `people` and the required settings
      PUT people
      {
          "settings" : {
              "number_of_shards" : 3,
              "number_of_replicas" : 1
          }
      }
      

      Then index the doc into the index.
      POST people/_doc
      {
      	"firstname":"alain",
      	"lastname":"muster",
      	"birthday":"18.05.1985"
      }
      
    3. How can you change the number of replicas for the index people?

      As this is a dynamic setting of the index, you can simply change it with the `update index settings API` Elasticdoc update index settings.
    4. Change the number of replicas to 2 for the index people.

      PUT people/_settings
      {
        "number_of_replicas": 2
      }
      
  5. We are going to have a detailed look at the shards of the index people

    1. How can you check on which nodes the shards are allocated?

      You should find a useful API inside the cat APIs collection.
      You can use the `cat shards API`. Elasticdoc cat shards
    2. Check the shard allocation of the index people and try to understand which shard is allocated on which node.

      You can use `cat/shards` followed by the name of the index (wildcards also work) to filter for all shards that match the index name.
      GET _cat/shards/peop*?v&h=index,shard,prirep,state,node
      
      As the current number of primary shards is 3 and the number of replicas is 2, you should have 9 shards in total.
      If you do not understand the columns which are returned, you can use the help option, which is accepted by all `cat` commands (Elasticdoc cat help option), to get the list of all column descriptions.
      GET _cat/shards?help
      

      So you should have all shards distributed over the 3 nodes, and all should have the state `STARTED`, like in the following snippet.
      index  shard prirep state      node
      people 1     p      STARTED    server1
      people 1     r      STARTED    server2
      people 1     r      STARTED    server3
      people 2     p      STARTED    server2
      people 2     r      STARTED    server1
      people 2     r      STARTED    server3
      people 0     p      STARTED    server3
      people 0     r      STARTED    server1
      people 0     r      STARTED    server2
      
    3. change the number of replicas to 3 for the index people.

      PUT people/_settings
      {
        "number_of_replicas": 3
      }
      
    4. Check the shard allocation of the index people again. Can you see anything which is not OK? Do you have an idea why that is?

      GET _cat/shards/people?v
      

      The result should look similar to the following snippet.
      index  shard prirep state      node
      people 1     p      STARTED    server1
      people 1     r      STARTED    server2
      people 1     r      STARTED    server3
      people 1     r      UNASSIGNED
      people 2     p      STARTED    server2
      people 2     r      STARTED    server1
      people 2     r      STARTED    server3
      people 2     r      UNASSIGNED
      people 0     p      STARTED    server3
      people 0     r      STARTED    server1
      people 0     r      STARTED    server2
      people 0     r      UNASSIGNED
      

      You should have 3 shards with the state `UNASSIGNED`. This is because you have defined the index to have 1 primary and 3 replicas per shard, so each shard should exist 4 times, but you only have 3 nodes. Elasticsearch will not allocate a copy of a shard to a node that already holds a primary or replica of that same shard.
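
If you want Elasticsearch itself to explain why a shard is unassigned, you can use the cluster allocation explain API. Called without a request body it picks the first unassigned shard it finds and describes why it cannot be allocated (here: every node already holds a copy of that shard).
GET _cluster/allocation/explain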

You have successfully finished the first Elasticsearch lab. If you have any questions or anything else you want to know, feel free to talk to the teacher.

Lab Elasticsearch 3-2

Objective: Index mapping, queries, filters, aggregations

  1. Let us start with index mapping

    1. We will create a new index called people2 for this lab. You can input the following code snippet into Kibana devtools

      POST people2/_doc
      {
      	"firstname":"alain",
      	"lastname":"muster",
      	"birthday":"18.05.1985"
      }
      
    2. Now we want to have a look at the mapping of the index "people2". How do you think you can do this?

      Among the Indices APIs there is a "get mapping API" which can be used to get the index mapping (Elasticdoc get mapping).
    3. Get the mapping of the index people2

      GET people2/_mapping
      

      The response looks similar to the following snippet.
      {
        "people2" : {
          "mappings" : {
            "properties" : {
              "birthday" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "firstname" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "lastname" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
            }
          }
        }
      }
      
    4. Which steps need to be done to change this mapping?

      As the mapping of an existing field can't be modified after a document has been indexed, you need to do the following steps:
      - Index the document into a test index, so that you don't have to write the whole mapping on your own.
      - Copy the mapping from the test index (copy only the properties object inside mappings).
      - Modify the mapping as requested.
      - Delete the existing index "people2".
      - Create the people2 index again.
      - Apply the modified mapping to the index "people2".
      - Delete the test index.
    5. Let us set the following mapping for the index people2

      firstname as keyword
      lastname as text
      birthday as date
      
      Index the document into a test index.
      POST test/_doc
      {
      	"firstname":"alain",
      	"lastname":"muster",
      	"birthday":"18.05.1985"
      }
      
      copy the mapping from the test index
      GET test/_mapping
      
      {
        "test" : {
          "mappings" : {
            "properties" : {
              "birthday" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "firstname" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "lastname" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
            }
          }
        }
      }
      
      Copy only the properties in the object mappings
      "properties" : {
      	"birthday" : {
      		"type" : "text",
      		"fields" : {
      			"keyword" : {
      				"type" : "keyword",
      				"ignore_above" : 256
      			}
      		}
      	},
      	"firstname" : {
      		"type" : "text",
      		"fields" : {
      			"keyword" : {
      				"type" : "keyword",
      				"ignore_above" : 256
      			}
      		}
      	},
      	"lastname" : {
      		"type" : "text",
      		"fields" : {
      			"keyword" : {
      				"type" : "keyword",
      				"ignore_above" : 256
      			}
      		}
      	}
      }
      
      Now modify the mapping as requested. (For the birthday field you can apply a format which defines how dates are formatted; Elasticdoc mapping parameter format.)
      "properties" : {
      	"birthday" : {
      		"type" : "date",
      		"format": "dd.MM.yyyy"
      	},
      	"firstname" : {
      		"type" : "keyword"
      	},
      	"lastname" : {
      		"type" : "text"
      	}
      }
      
      Delete the existing index "people2"
      DELETE people2
      
      Now create the people2 index.
      PUT people2
      
      Now apply the modified mapping to the index "people2"
      PUT people2/_mapping
      {
      	"properties" : {
      		"birthday" : {
      			"type" : "date",
      			"format": "dd.MM.yyyy"
      		},
      		"firstname" : {
      			"type" : "keyword"
      		},
      		"lastname" : {
      			"type" : "text"
      		}
      	}
      }
      
      Delete the test index
      DELETE test
      
      Get the mapping of the index `people2` to check if it is really as required
      GET people2/_mapping
      
  2. Let us have a look at how you can change the mapping of indices whose documents are valuable and should not get lost

    1. Delete the index people2

      DELETE people2
      
    2. Run this command to get an index called test with some documents indexed.

      POST _bulk
      { "index" : { "_index" : "test"} }
      { "firstname" : "alain","lastname": "muster","birthday":"18.05.1985" }
      { "index" : { "_index" : "test"} }
      { "firstname" : "nichtalein","lastname": "master","birthday":"08.02.1988" }
      { "index" : { "_index" : "test"} }
      { "firstname" : "john","lastname": "doe","birthday":"01.04.1988" }
      { "index" : { "_index" : "test"} }
      { "firstname" : "alice","lastname": "kinsley","birthday":"25.02.1970" }
      { "index" : { "_index" : "test"} }
      { "firstname" : "bob","lastname": "dilon","birthday":"09.06.1999" }
      { "index" : { "_index" : "test"} }
      { "firstname" : "malroy","lastname": "fahrer","birthday":"19.08.1956" }
      
    3. Check if the index was created and the documents were indexed. You can use the _search API for this.

      GET test/_search
      
    4. So the "test" index is your actual index with a mapping which is not suitable for your need. You want the documents to indexed with the following mapping. How can you achieve this without loosing the documents from the "test" index?

      {
      	"properties" : {
      		"birthday" : {
      			"type" : "date",
      			"format": "dd.MM.yyyy"
      		},
      		"firstname" : {
      			"type" : "keyword"
      		},
      		"lastname" : {
      			"type" : "text"
      		}
      	}
      }
      
      The Reindex API is what you need. You can do the following steps to achieve this:
      create a new index
      apply the new mapping to that index
      reindex the documents from the "test" index to the new index (Elasticdoc reindex API)
    5. Create a new index called people2 with the new mapping (delete the old "people2" index if it is still there) and reindex the documents to the new index.

      First create a new index with the requested mapping applied.
      PUT people2
      {
        "mappings": {
          "properties": {
            "birthday": {
              "type": "date",
              "format": "dd.MM.yyyy"
            },
            "firstname": {
              "type": "keyword"
            },
            "lastname": {
              "type": "text"
            }
          }
        }
      }
      

      Now reindex the documents from the "test" index to the "people2" index.
      POST _reindex
      {
      	"source":{
      		"index": "test"
      	},
      	"dest":{
      		"index": "people2"
      	}
      }
      

      Now you can check the mapping of the "people2" index and also run the search API to see if all documents are indexed.
      GET people2/_mapping
      
      GET people2/_search
      
  3. Let us have a look at how you can prevent new fields from being indexed into the index. We will be using the "people2" index for this lab section

    1. Which mapping parameter can be used so that a field that is not defined in the mapping won't get indexed into the index?

      By default, if Elasticsearch finds a new field inside a document, it will apply an automatic mapping for that new field and also index that field. You can use the mapping parameter "dynamic" to influence this behavior (Elasticdoc mapping parameter dynamic).
    2. To see the default behavior, get the actual mapping of the index "people2"

      GET people2/_mapping
      
    3. add the following document to the index people2

      {
      	"firstname":"arnold",
      	"lastname":"bacher",
      	"birthday":"18.05.1985",
      	"nationality": "Switzerland"
      }
      
      POST people2/_doc
      {
      	"firstname":"arnold",
      	"lastname":"bacher",
      	"birthday":"18.05.1985",
      	"nationality": "Switzerland"
      }
      
    4. Now get the mapping of the "people2" again and describe what happened

      You will see the new field nationality in the mapping properties. This has been done by elasticsearch.
      GET people2/_mapping
      
    5. Now disable this behavior on the index "people2", so that indexing a document with fields other than those already defined will throw an error

      POST people2/_mapping
      {
        "dynamic":"strict"
      }
      
      You can set the value of the dynamic mapping parameter to "strict". This will cause an exception when indexing a document with fields that are not already defined.
    6. Add the following document to the index people2 (you should get an error)

      {
      	"firstname":"new",
      	"lastname":"guy",
      	"birthday":"18.05.1985",
      	"nationality": "Switzerland",
      	"city": "bern"
      }
      
      POST people2/_doc
      {
      	"firstname":"new",
      	"lastname":"guy",
      	"birthday":"18.05.1985",
      	"nationality": "Switzerland",
      	"city": "bern"
      }
      
    7. Now change the dynamic parameter value to false

      POST people2/_mapping
      {
        "dynamic":"false"
      }
      
    8. Add the following document to the index people2 (you will see that this document gets indexed now). Can you tell the difference between the value "false" and the default value "true" for the dynamic parameter?

      {
      	"firstname":"newer",
      	"lastname":"guy",
      	"birthday":"18.05.1985",
      	"nationality": "Switzerland",
      	"city": "bern"
      }
      
      POST people2/_doc
      {
      	"firstname":"newer",
      	"lastname":"guy",
      	"birthday":"18.05.1985",
      	"nationality": "Switzerland",
      	"city": "bern"
      }
      
      This doc is indexed, but Elasticsearch did not add the new field city to the mapping of the index, so that field is also not indexed.
      GET people2/_mapping
      

      So if you run a query on the new field "city" you will not be successful.
      POST people2/_search
      {
      	"query":{
      		"match":{
      			"city":"bern"
      		}
      	}
      }
      
      If you run the same query on the field nationality, you will see that it is successful. It also returns the doc containing the field "city":"bern", but that field is not indexed and so it can't be searched.
      POST people2/_search
      {
      	"query":{
      		"match":{
      			"nationality":"switzerland"
      		}
      	}
      }
      
  4. (advanced) For the index people2, create a dynamic mapping template which applies the mapping type text to all fields whose names start with tex_

    POST people2/_mapping
    {
    	"dynamic": true,
    	"dynamic_templates":[
    		{
    			"texts":{
    				"path_match": "tex_*",
    				"mapping":{
    					"type":"text"
    				}
    			}
    		}
    	]
    }
    
    1. Test if it worked by indexing the following doc

      {
      	"firstname":"new",
      	"lastname":"guy",
      	"birthday":"18.05.1985",
      	"tex_field": "something"
      }
      
      If it did not work, check the value of the dynamic parameter.
      If it worked, the mapping of the tex_field should be just "text", like below.
      "properties" : {
        "birthday" : {
          "type" : "date",
          "format" : "dd.MM.yyyy"
        },
        "firstname" : {
          "type" : "keyword"
        },
        "lastname" : {
          "type" : "text"
        },
        "tex_field" : {
          "type" : "text"
        }
      }
      
  5. Let's do some queries
    To get some sample data, load the e-commerce sample from Kibana: add the data of "Sample eCommerce orders".

    Sample data e-commerce
    1. Get the mapping of the newly created index. The name of the index is kibana_sample_data_ecommerce

      GET kibana_sample_data_ecommerce/_mapping
      
    2. There is a field called customer_gender inside the index. How would you proceed to get all the documents with a female customer gender?

      You need to know more about the field `customer_gender`. What is its mapping?
      You should also have a look at some example values of that field. Run the following query to get some documents.
      GET kibana_sample_data_ecommerce/_search
      

      Now you can decide which query you can use to get the documents containing female genders.
    3. Write a query which returns all documents with a female gender inside the index "kibana_sample_data_ecommerce".
      The response of a query contains the field hits.total.value. The index contains 2433 documents with a female gender.

      To write the query you need to know which mapping the field uses. You can see that there is only a keyword mapping for the "customer_gender" field.
      GET kibana_sample_data_ecommerce/_mapping
      
      Then run a simple search on the index to get some example values. You can use the _source parameter to return only the customer_gender field, and the "size" parameter to define how many documents you want returned.
      GET kibana_sample_data_ecommerce/_search
      {
        "_source": "customer_gender"
      }
      
      Now you know that the field is mapped as keyword, so you can only run term-level queries on it. As you have seen that the value for female gender is "FEMALE", you have everything you need to write the query.
      As you are searching for a full term as value, you can use the term query.
      GET kibana_sample_data_ecommerce/_search
      {
        "query": {
          "term": {
            "customer_gender": {
              "value": "FEMALE"
            }
          }
        }
      }
      
    4. Write a query which returns all documents containing the value "Primemaster" or "Elitelligence" in the field manufacturer
      You should get 1435 documents from your search. Have a look at the term query and the terms query; one of them is well suited for this search request.

      You can use the terms query for this request.
      You need to query the field name manufacturer.keyword to get the right response, because the keyword mapping is defined as a multi-field.
      GET kibana_sample_data_ecommerce/_search
      {
        "query": {
          "terms": {
            "manufacturer.keyword": [
              "Primemaster",
              "Elitelligence"
            ]
          }
        }
      }
      
    5. (advanced) Write a query which searches the field "products.product_name" for containing the value "shirt"
      You should get 1160 results from your query.

      GET kibana_sample_data_ecommerce/_search
      {
        "query": {
          "match": {
            "products.product_name": "shirt"
          }
        }
      }
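      The match query works here because "products.product_name" is an analyzed text field, so the term "shirt" also matches product names like "Basic T-Shirt". If you are curious how a product name is tokenized, the _analyze API can show it; a small sketch (the sample text is just an example):
      GET kibana_sample_data_ecommerce/_analyze
      {
        "field": "products.product_name",
        "text": "Basic T-Shirt - dark blue"
      }
      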
      
    6. (advanced) Write a query to get all orders from female customers on Saturday or Sunday
      So you are searching for documents whose field "day_of_week" contains the value "Saturday" or "Sunday" and whose field "customer_gender" contains the value "FEMALE".

      Have a look at the Elasticdoc Bool query

      You should get 680 documents in return.

      GET kibana_sample_data_ecommerce/_search
      {
        "query": {
          "bool": {
            "must": [
              {
                "terms": {
                  "day_of_week": [
                    "Saturday",
                    "Sunday"
                  ]
                }
              },
              {
                "term": {
                  "customer_gender": {
                    "value": "FEMALE"
                  }
                }
              }
            ]
          }
        }
      }
      
  6. As we are finished with the e-commerce sample data, please remove it
    You can click on remove under the eCommerce sample data set.

    Sample data e-commerce
  7. Let's have a look at some aggregations
    This step should already have been done during the Elastic Stack installation lab. If the sample data set "Sample web logs" is not available yet, add it.
    After the import, the index will be named "kibana_sample_data_logs". Take your time to understand the index and the documents it contains.

    1. Run an aggregation which returns the biggest log document
      So you need to do a max aggregation on the field "bytes". As an aggregation is run inside the search API, you can use the parameter "size": 0 so that you don't receive any search hits (documents); during an aggregation we are interested in conclusions drawn from all documents.

      You should get as result "19986.0"

      GET kibana_sample_data_logs/_search
      {
        "size": 0,
        "aggs": {
          "biggest_log": {
            "max": {
              "field":"bytes"
            }
          }
        }
      }
      
    2. Run an aggregation which returns the smallest log document
      You should get as result "0.0"

      GET kibana_sample_data_logs/_search
      {
        "size": 0,
        "aggs": {
          "smallest_log": {
            "min": {
              "field":"bytes"
            }
          }
        }
      }
      
    3. Run the metric aggregation stats on the field "bytes"

      "aggregations" : {
      	"bytes_stats" : {
      		"count" : 14005,
      		"min" : 0.0,
      		"max" : 19986.0,
      		"avg" : 5686.65169582292,
      		"sum" : 7.9641557E7
      	}
      }
      
      GET kibana_sample_data_logs/_search
      {
        "size": 0,
        "aggs": {
          "bytes_stats": {
            "stats": {
              "field":"bytes"
            }
          }
        }
      }
      
    4. What does a terms aggregation do?

      This aggregation returns one bucket per term it finds. For example, if you run the terms aggregation on the field response, it returns a bucket for each distinct value of that field.
    5. Run the terms aggregation on the field "response"

      {
      	"key" : "200",
      	"doc_count" : 12872
      },
      {
      	"key" : "404",
      	"doc_count" : 687
      },
      {
      	"key" : "503",
      	"doc_count" : 446
      }
      
      GET kibana_sample_data_logs/_search
      {
        "size": 0,
        "aggs": {
          "responses": {
            "terms": {
              "field": "response.keyword"
            }
          }
        }
      }
      
    6. (advanced) Run an aggregation which buckets the documents per week

      GET kibana_sample_data_logs/_search
      {
        "size": 0,
        "aggs": {
          "weekly_buckets": {
            "date_histogram": {
              "field": "timestamp",
              "calendar_interval": "week",
              "format": "yyyy-MM-dd"
            }
          }
        }
      }
      
    7. (advanced) Run an aggregation which buckets the documents per week and get the max byte per week
      You already did the first part in the last task; now you need to aggregate, per week, the max of the field "bytes".

      GET kibana_sample_data_logs/_search
      {
        "size": 0,
        "aggs": {
          "weekly_buckets": {
            "date_histogram": {
              "field": "timestamp",
              "calendar_interval": "week",
              "format": "yyyy-MM-dd"
            },
            "aggs": {
              "max_byte": {
                "max": {
                  "field": "bytes"
                }
              }
            }
          }
        }
      }
      
    8. (advanced) Run an aggregation which buckets the documents per week and aggregates the sum of all bytes per bucket. Then apply a pipeline aggregation to get the max of the weekly sums

      Have a look at pipeline aggregation
      Elasticdoc max bucket aggregation
      GET kibana_sample_data_logs/_search
      {
        "size": 0,
        "aggs": {
          "weekly_buckets": {
            "date_histogram": {
              "field": "timestamp",
              "calendar_interval": "week",
              "format": "yyyy-MM-dd"
            },
            "aggs": {
              "total_bytes": {
                "sum": {
                  "field": "bytes"
                }
              }
            }
          },
          "weekly_sum_max_bytes":{
            "max_bucket": {
              "buckets_path": "weekly_buckets>total_bytes"
            }
          }
        }
      }
      
    9. (advanced) Run a date histogram aggregation over the full index, then instead of getting the max bytes value, get the document with the largest bytes value
      You have already done the date histogram per week. For getting a doc out of an aggregation you need to use the top_hits aggregation. Use the sort and size options inside top_hits to get the doc with the largest value for bytes.

      GET kibana_sample_data_logs/_search
      {
        "size": 0,
        "aggs": {
          "weekly_buckets": {
            "date_histogram": {
              "field": "timestamp",
              "calendar_interval": "week",
              "format": "yyyy-MM-dd"
            },
            "aggs": {
              "top_doc": {
                "top_hits": {
                  "size": 1,
                  "sort": [{
                    "bytes":{
                      "order": "desc"
                    }
                  }]
                }
              }
            }
          }
        }
      }
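      If you only care about a few fields of each top document, top_hits also accepts a "_source" filter; a variation of the query above (the selected fields are just an example):
      GET kibana_sample_data_logs/_search
      {
        "size": 0,
        "aggs": {
          "weekly_buckets": {
            "date_histogram": {
              "field": "timestamp",
              "calendar_interval": "week",
              "format": "yyyy-MM-dd"
            },
            "aggs": {
              "top_doc": {
                "top_hits": {
                  "size": 1,
                  "_source": ["bytes", "clientip", "timestamp"],
                  "sort": [{ "bytes": { "order": "desc" } }]
                }
              }
            }
          }
        }
      }
      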
      

Lab Elasticsearch 3-3

Objective: Index templates, Ingest pipeline, shard allocation, segments, index lifecycle management

  1. Let us start this lab with index templates

    1. Get a list of all existing index templates

      Use the `_cat/templates` API to list all index templates.
      GET _cat/templates?v
      
      You have not created any index templates yet, but some already exist. These are the system index templates which were created by the Elastic Stack components. A system index template is marked with a `.` at the beginning of its name.
      If you want to get the name of the columns then use the help option
      GET _cat/templates?help
      

      You can also use the get index template API (Elasticdoc get index template). This returns more than just a list; normally it is used to see a template's configuration.
      GET _template
      

      You can also append a template name to the `_template` API
      GET _template/.monitoring-kibana
      
    2. Create a new index template with the following requirements

      number of primary shards: 1
      number of replicas: 0
      indices whose names begin with `test-*` should apply these settings
      The name of the index-template : test-indices
      
      POST _template/test-indices
      {
      	"index_patterns": ["test-*"],
      	"settings":{
      		"number_of_shards": 1,
      		"number_of_replicas":0
      	}
      }
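      To verify the template, you can retrieve it again:
      GET _template/test-indices
      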
      
    3. Create the following indices and check whether all indices with the required name prefix have taken the template settings from test-indices

      testament
      test-firewall
      test-1
      test_1
      test_firewall
      
      PUT testament
      PUT test-firewall
      PUT test-1
      PUT test_1
      PUT test_firewall
      
      Check with the following command whether the index has taken the template settings. The default is 1 primary shard and 1 replica, while the index template defines 1 primary shard and 0 replicas. If everything is correct, the indices ["test-firewall","test-1"] should have 0 replicas.
      GET `indexname`
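      For example, for one of the matching indices (test-firewall is just one of them), the settings API shows the applied values:
      GET test-firewall/_settings
      
      In the response, index.number_of_shards should be "1" and index.number_of_replicas should be "0".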
      
    4. Apply the following settings to the template test-indices

      All indices beginning with test-* or test_* should take the settings of the template `test-indices`
      
      POST _template/test-indices
      {
      	"index_patterns": ["test-*","test_*"],
      	"settings":{
      		"number_of_shards": 1,
      		"number_of_replicas":0
      	}
      }
      
    5. Create a new index template with the following conditions

      number of primary shards: 1
      number of replicas: 2
      indices beginning with test-dev-* and test_dev_* should get these template settings applied (not the settings from test-indices)
      The name of the index-template : test-dev-indices
      
      As the patterns in `test-indices` also match these index names, you need to set the order to give the templates the correct prioritization. Elasticdoc template multiple index matching
      POST _template/test-dev-indices
      {
      	"index_patterns": ["test-dev-*","test_dev_*"],
      	"settings":{
      		"number_of_shards": 1,
      		"number_of_replicas":2
      	},
        "order": 5
      }
      
    6. Create the following indices to check whether the prioritization of the templates works correctly

      test-developer
      test-dev-modes
      test_dev_modes
      
      For creating the indices
      PUT `indexname`
      
      Check with the following command whether the indices have applied the template settings. The index `test-developer` should have 1 primary and 0 replicas. The indices `test-dev-modes` and `test_dev_modes` should have 1 primary and 2 replicas
      GET `indexname`
      
      As we have only changed the number of shards and replicas, you can also use the command GET _cat/indices/test*?v
  2. Let us have a look at some pipelines

    1. Create a pipeline which fulfils the following requirements

      The name of the pipeline: ingest_timestamp
      Add a new field named `ingested_at` to the document; its value should be the ingest timestamp.
      
      You can use the set processor (Elasticdoc set processor). As you need the ingest timestamp, you can access the ingest metadata (Elasticdoc access ingest metadata).
      PUT _ingest/pipeline/ingest_timestamp
      {
        "description" : "add the ingesting timestamp to a document",
        "processors" : [
          {
            "set": {
              "field": "ingested_at",
              "value": "{{_ingest.timestamp}}"
            }
          }
        ]
      }
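      Before using the pipeline you can try it out with the simulate API; a quick sketch with one test document:
      POST _ingest/pipeline/ingest_timestamp/_simulate
      {
        "docs": [
          { "_source": { "name": "john", "age": 18 } }
        ]
      }
      
      The response should show the document enriched with the ingested_at field.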
      
    2. Index a document into an index called test through the pipeline ingest_timestamp and check whether the document has the new field

      If the index already exists, delete it
      DELETE test
      
      index through the pipeline
      POST test/_doc?pipeline=ingest_timestamp
      {
        "name":"john",
        "age":18
      }
      
      Check if the document was correctly ingested with the new field
      GET test/_search
      
    3. (advanced) Create a new index called test and apply the ingest_timestamp pipeline as the default pipeline of this index

      You can use the index setting `index.default_pipeline` for this (Elasticdoc index default pipeline). Delete the existing test index first.
      DELETE test
      
      PUT test/
      {
      	"settings":{
      		"index.default_pipeline": "ingest_timestamp"
      	}
      }
      
    4. (advanced) Run the following command and check if the ingest pipeline was executed

      POST test/_doc
      {
        "name":"john",
        "age":18
      }
      
      If the document was ingested with the pipeline you should see the field ingested_at.
      GET test/_search
      
    5. (advanced) Let's create a conditional ingest pipeline with the following requirements

      The name of the pipeline: dropper_1
      if the field `age` in a document is bigger than 17, then drop the doc (Elasticdoc drop processor)
      
      As this is a conditional execution, you need to check how you can access a field inside a document (Elasticdoc conditional execution in pipeline).
      PUT _ingest/pipeline/dropper_1
      {
        "description" : "This pipeline will drop all document if it contains a field age with higher value than 17",
        "processors" : [
          {
            "drop" : {
              "if": "ctx.age > 17"
            }
          }
        ]
      }
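      You can check the condition with the simulate API before using the pipeline; in this sketch the first document should be dropped and the second one kept:
      POST _ingest/pipeline/dropper_1/_simulate
      {
        "docs": [
          { "_source": { "name": "adult", "age": 18 } },
          { "_source": { "name": "kid", "age": 12 } }
        ]
      }
      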
      
  3. Index Lifecycle Management
    There is a cycle which checks the state of the index lifecycle policies; by default it runs every 10m. Normally the rollover conditions are set to bigger numbers like a doc count of 100'000 or 50gb, but for testing ILM with lower values like "2 minutes" we need to run the check cycle much more often, e.g. every 10 seconds.

    Don't use this value in production. It is just for testing ILM; in a production environment it may cause unnecessary load.
    PUT /_cluster/settings
    {
        "transient" : {
            "indices.lifecycle.poll_interval": "10s"
        }
    }
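    When you are done testing, you can reset the check interval to its default by setting the value back to null:
    PUT /_cluster/settings
    {
        "transient" : {
            "indices.lifecycle.poll_interval": null
        }
    }
    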
    
  4. We will create a policy and apply this to an index template, so that each index created through this template will be managed by the policy.

    1. Which steps are needed to build an ILM manually?

      Create a policy. Create an index template and apply the policy and the rollover alias to it. Create the first index as write index of the alias. In short, follow the Elasticdoc getting started with ilm
    2. Create a policy with the following requirements (you cannot use the Kibana ILM policy GUI for this, as it does not support time conditions of less than days). Use the documentation to copy the JSON structure

      Name of policy: stream-policy
      A rollover should be applied after 3 minutes or 100 documents
      The delete of the index should be done after 10 minutes
      
      PUT _ilm/policy/stream-policy
      {
        "policy": {
          "phases": {
            "hot": {
              "actions": {
                "rollover": {
                  "max_docs": 100,
                  "max_age": "3m"
                }
              }
            },
            "delete": {
              "min_age": "10m",
              "actions": {
                "delete": {}
              }
            }
          }
        }
      }
      
    3. Create a template with the following requirements

      Name of template: stream-template
      patterns : ["stream-*"]
      ilm policy: stream-policy
      ilm alias: stream
      
      POST _template/stream-template
      {
      	"index_patterns": ["stream-*"],
      	"settings":{
      		"index.lifecycle.name": "stream-policy",
          "index.lifecycle.rollover_alias": "stream"
      	}
      }
      
    4. Create an index called stream-000001 as write index of an alias named stream

      PUT stream-000001
      {
        "aliases": {
          "stream": {
            "is_write_index": true
          }
        }
      }
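      You can verify the alias and its write index:
      GET _alias/stream
      
      The response should list stream-000001 with "is_write_index" : true.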
      
    5. Run the following command 101 times to index 101 documents (it does not matter that it is the same document), wait 11 seconds and index one more document. Now a rollover should have happened, and the response should show that the last document was indexed into the stream-000002 index.

      POST stream/_doc
      {
      	"some":"thing"
      }
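      To follow what is happening you can check the indices and their doc counts while you index:
      GET _cat/indices/stream*?v
      
      After the rollover you should see both stream-000001 and stream-000002 in the list.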
      
    6. What do you need to do if you want to update the policy and have it applied immediately to the indexing behavior?

      A policy change is only considered when entering a phase, so a modification will not take effect until the phase changes. To activate the new policy with a new index, you can do a manual rollover: 1) update the policy, 2) run the _rollover API on the rollover index alias, 3) set `index.lifecycle.indexing_complete` to true on the old index from the `_rollover` response.
    7. (advanced) Update the policy so that all indices older than 5 minutes get deleted

      PUT _ilm/policy/stream-policy
      {
        "policy": {
          "phases": {
            "hot": {
              "actions": {
                "rollover": {
                  "max_docs": 100,
                  "max_age": "3m"
                }
              }
            },
            "delete": {
              "min_age": "5m",
              "actions": {
                "delete": {}
              }
            }
          }
        }
      }
      
    8. (advanced) Run the rollover manually on the index alias stream and apply the necessary setting to the old index from the rollover response

      Run the rollover on the alias. (It has to be the alias)
      POST stream/_rollover
      
      Let us assume this is the response, then you need to set the `index.lifecycle.indexing_complete` of the index `stream-000014` to true. (Usually this is done by ilm after the rollover)
      {
        "acknowledged" : true,
        "shards_acknowledged" : true,
        "old_index" : "stream-000014",
        "new_index" : "stream-000015",
        "rolled_over" : true,
        "dry_run" : false,
        "conditions" : { }
      }
      
      manually changing the setting of the old index
      PUT stream-000014/_settings
      {
        "index.lifecycle.indexing_complete": true
      }
      
      Check with _ilm/explain whether anything is wrong. If yes, run the ILM retry on the index (if ILM fails, it does not retry on its own). Run the ILM explain to see if there are any errors
      GET stream-000014/_ilm/explain
      
      The following is ok
        "indices" : {
          "stream-000014" : {
            "index" : "stream-000014",
            "managed" : true,
            "policy" : "stream-policy",
            "lifecycle_date_millis" : 1564411403960,
            "phase" : "hot",
            "phase_time_millis" : 1564411394020,
            "action" : "complete",
            "action_time_millis" : 1564411404073,
            "step" : "complete",
            "step_time_millis" : 1564411404073,
            "phase_execution" : {
              "policy" : "stream-policy",
              "phase_definition" : {
                "min_age" : "0ms",
                "actions" : {
                  "rollover" : {
                    "max_age" : "3m",
                    "max_docs" : 100
                  }
                }
              },
              "version" : 2,
              "modified_date_in_millis" : 1564410126088
            }
          }
        }
      }
      
      The following snippet shows an error
      {
        "indices" : {
          "stream-000014" : {
            "index" : "stream-000014",
            "managed" : true,
            "policy" : "stream-policy",
            "lifecycle_date_millis" : 1564411543954,
            "phase" : "hot",
            "phase_time_millis" : 1564411544446,
            "action" : "rollover",
            "action_time_millis" : 1564411554305,
            "step" : "ERROR",
            "step_time_millis" : 1564411643814,
            "failed_step" : "check-rollover-ready",
            "step_info" : {
              "type" : "illegal_argument_exception",
              "reason" : "index [stream] is not the write index for alias [stream-000014]",
              "stack_trace" : """
      java.lang.IllegalArgumentException: index [stream] is not the write index for alias [stream-000014]
      	at org.elasticsearch.xpack.core.indexlifecycle.WaitForRolloverReadyStep.evaluateCondition(WaitForRolloverReadyStep.java:100)
      	at org.elasticsearch.xpack.indexlifecycle.IndexLifecycleRunner.runPeriodicStep(IndexLifecycleRunner.java:133)
      	at org.elasticsearch.xpack.indexlifecycle.IndexLifecycleService.triggerPolicies(IndexLifecycleService.java:270)
      	at org.elasticsearch.xpack.indexlifecycle.IndexLifecycleService.triggered(IndexLifecycleService.java:213)
      	at org.elasticsearch.xpack.core.scheduler.SchedulerEngine.notifyListeners(SchedulerEngine.java:168)
      	at org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule.run(SchedulerEngine.java:196)
      	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at java.base/java.lang.Thread.run(Thread.java:835)
      """
            },
            "phase_execution" : {
              "policy" : "stream-policy",
              "phase_definition" : {
                "min_age" : "0ms",
                "actions" : {
                  "rollover" : {
                    "max_age" : "3m",
                    "max_docs" : 100
                  }
                }
              },
              "version" : 2,
              "modified_date_in_millis" : 1564410126088
            }
          }
        }
      }
      
      If you have an error, check whether the setting [index.lifecycle.indexing_complete] is true and then run the ILM retry
      POST stream-000014/_ilm/retry
      
      Now run explain on the index again to see if the failure is gone.
      GET stream-000014/_ilm/explain
      
  5. Let us have a look at the backup and snapshot feature

Lab4: Kibana

Lab Kibana 4-1

Objective: This exercise covers the topic Indices, Filtering and Searching with the Discover Interface.

  1. Create a new Kibana Index Pattern
    Use the following parameters for the new index pattern.

    index pattern name: kibana_sample*
    Time Filter field name: timestamp
    Custom index pattern ID: kibana_sample
    

    Custom index pattern ID is important for visualizations

    
    
  2. Find out the difference between kibana_sample* and kibana_sample_data_logs
    There can be several differences. But one is very important!

    Definition of kibana_sample*
    Definition of kibana_sample_data_logs
    kibana_sample_data_logs contains a scripted field. This is missing in the new Kibana index kibana_sample*, so we still have to create it!
    
  3. Create Scripted Field
    Find the necessary information for the scripted field at the Kibana Index kibana_sample_data_logs. You will find the following settings.

    Name: hour_of_day
    Language: painless
    Type: number
    Format(Default: Number ): -Default-
    Popularity: 0
    Script: doc['timestamp'].value.getHour()
    
    Select "Add scripted Field"
    
    Fill in the necessary information
    We'll find this scripted field later.
  4. Find out for which time period documents are available
    We now use the Discover Interface and use the Kibana Index kibana_sample_data_logs.
    The documents in the sample index are only available for a certain period of time, and this period interests us.
    Use the "Time Range" function.

    Choose the right index.
    
    Use "Relative" with "2 Months ago"for start date and "Relative" with "2 Months from now" for end date
    test
  5. Find out the last and first document
    The last document should be found quickly.

    The top document in the Discover table is also the last entry!
    
    It is possible that your data are different. It depends on when the test data was loaded.

    For finding the first document, just sort the other way around.

    
    It is possible that your data are different. It depends on when the test data was loaded.
  6. Filter documents
    Reset the time range to "this week"

    

    See which data fields exist at all

    Click on ">" to expand the document

    We are now interested in all documents which contain the value win xp in the field machine.os
    We can use Filtering by Field for this purpose.

    Click to "machine.os" to get the possible values in this time range.
    test
    
    Now you can add the value "win xp" to the filter with the "positive filter" button. As a result you will now get all documents that contain "win xp"
    316 documents were found.

    Now we would like to extend the filter. We are now interested in all documents of windows systems. Use the Edit filter to do this.

    We edit the existing filter. Click on the filter "machine.os: win xp" and choose "Edit filter"
    
    Adjust the filter as follows: Operator: is one of, Value: win*
    As a result you will now get all documents that contain "win"
  7. Adding Field Columns to the Document Table
    We now want a tabular representation of these 945 documents. The table should contain machine.os, host, ip and geo.src. At the end we want to save this Discover document table and export it as a CSV file.

    Fields can be added to the table using the "Add" button. Click on this button.
    
    Now add the remaining fields host, ip and geo.src. The result now looks as follows; the filter is still active!
    The table can be adjusted and sorted as desired. If you want to process the data outside Kibana, you can export them as CSV. This Discover setting must be saved before the report can be created!
    Confirm Save, then choose Share and click on "CSV Report"
    Click on "Generate CSV"
    Pick it up from Management > Kibana > Reporting for download
    The saved CSV file can then be further processed with a Spreadsheet tool.

Lab Kibana 4-2

Objective: This exercise covers the topic Aggregations and Visualization.

Use the Kibana Index source kibana_sample_data_logs for all visualizations.
  1. Tag Cloud Aggregation
    Create a tag cloud visualization to display the operating system types. The following representation is desired for a time range of "Last 30 days".

    
    
    Visualization: Tag Cloud
    Buckets Field: machine.os.keyword
    Orientations: multiple
    Save it under: LAB-Tag-OS

    Data configuration

    
    
    

    Options configuration

    
    

    Save visualization

    
    
    Press Save
  2. Pie Chart Aggregation
    Create a Pie Chart visualization to display the operating systems types. The following representation is desired for a time range of "Last 30 days".

    
    
    Visualization: Pie
    Buckets Field: machine.os.keyword
    Legend Position: bottom
    Save it under: LAB-Pie_Chart-OS

    Data configuration

    
    
    

    Options configuration

    
    

    Save visualization

    
    
    Press Save
  3. Nested Terms Aggregation
    We are now interested in the distribution of the operating system according to geographical regions.
    Use a Nested Pie Chart visualization to display this. The following representation is desired for a time range of "Last 30 days".

    
    
    Visualization: Pie
    1st Buckets Field: machine.os.keyword (Split slices)
    2nd Buckets Field: geo.src (Split slices)
    Donut: disabled
    Legend Position: bottom
    Show Labels: enabled
    Show Top Level: disabled
    Save it under: LAB-Pie_Chart-OS_GEO

    Data configuration

    
    
    
    

    Options configuration

    
    

    Save visualization

    
    
    Press Save
  4. Map Aggregation
    So that we can integrate a Geo Map later in the Dashboard, we have to create it first. Create a Map as shown below.

    
    
    Visualization: Maps
    Layer: Documents
    Geospatial Field: geo.coordinates
    Fill color: By value of response.keyword (200 : green / 404 : blue / 503 : red)
    Map type: bottom
    Save it under: LAB-Map

    Howto: Maps

    Add Layer

    
    

    Configure Layer

    
    
    

    Save visualization

    
    
    Press Save
  5. Add Input Controls Element
    We need an interactive control element for easy dashboard manipulation. The goal is to be able to select the operating systems later in the dashboard simply on the basis of a menu. Create a control element as shown.

    
    
    Visualization: Controls
    Field Name: machine.os.keyword
    Enable: Multiselect and Dynamic Options
    Save it under: LAB-Control-OS

    Controls configuration

    
    

    Options configuration

    
    

    Save visualization

    
    
    Press Save
  6. Time Series Element
    We would now like to look at the behavior of the server OS data over time. The visualization Time Series is optimal for this. Create a Time Series as shown below.

    
    
    Visualization: Visual Builder
    Group by: Terms
    By: machine.os.keyword
    Chart type: Bar
    Stacked: Stacked
    Split color theme: Rainbow
    Save it under: LAB-Time_Series-OS

    Howto: TSVB

    Panel options configuration

    
    

    Data > Metrics configuration

    
    

    Data > Options configuration

    
    

    Save visualization

    
    
    Press Save

Lab Kibana 4-3

Objective: The goal of this lab is to create a dashboard with the previously created visualizations

Desired dashboard



  1. Create new Dashboard

    
    
    
    Now we have an empty dashboard to fill up with visualizations

  2. Fill up Dashboard with visualization LAB-Control-OS

    
    
    Select LAB-Control-OS and then click the X to close "Add panels"

  3. Customize Panel LAB-Control-OS and remove Panel Title

    
    

  4. Fill in the rest of the visualizations: LAB-Pie_Chart-OS, LAB-Pie_Chart-OS_GEO, LAB-Map and LAB-Time_Series-OS

    
    
    Select all visualizations and then click the X to close "Add panels"

  5. Now arrange the panels so that they correspond to your needs or the specifications above

  6. Save Dashboard with the name LAB-OS

    
    
    Press Save

Lab5: Beats

Lab Linux Beats 5-1

Objective: In this lab you will learn how to install Metricbeat, Filebeat and Heartbeat. Because installing and configuring Beats needs appropriate rights we assume you are root. If not already done, switch to user root.

We will install all beats on another server named linux1
Hostnames are already preconfigured in the local /etc/hosts file

Requirements for Beats

Because we are using security we need first to create a role and a user in Kibana.

  1. Create role beats_input
    Before starting we need a user for the Beats to send data to Elasticsearch. Go to Kibana and create a role with the following requirements.

    name: beats_input

    cluster privileges: manage_index_templates, monitor, manage_ingest_pipelines, manage_ilm

    index privileges: index, manage

    On the indices: metricbeat-*, filebeat-*, heartbeat-*, winlogbeat-*

    Add space privilege for the "Default" Space. The features "Dashboard" and "Saved Objects Management" should be enabled

    You can access these settings inside Kibana (Management -> Roles)

    Add space privilege
    Then click "Create space privilege" and "Create role"
  2. Create user beats_user

    username: beats_user

    roles: beats_input, beats_system

    password: password

    other fields can be ignored

    You can access these settings inside Kibana (Management -> Users)
    

You will use /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt for every beat.

Installation of Metricbeat

At first we will install Metricbeat. This one brings a lot of modules which are easy to configure.

Install Metricbeat on server linux1
Make sure that you are connected with server linux1 !!!!!
  1. Install Metricbeat
    Install the package directly from the Elastic download server.

    yum install https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.9.3-x86_64.rpm
    
  2. Configure Metricbeat
    Add the necessary configurations to Metricbeat.

    name
    setup.kibana.host
    setup.kibana.ssl.certificate_authorities
    output.elasticsearch.hosts
    output.elasticsearch.username
    output.elasticsearch.password
    output.elasticsearch.ssl.certificate_authorities
    
    vim /etc/metricbeat/metricbeat.yml
    name: linux1
    setup.kibana:
      host: https://server1:5601
      ssl.certificate_authorities:
      - /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt
    output.elasticsearch:
      hosts:
      - server1:9200
      - server2:9200
      - server3:9200
      protocol: https
      username: beats_user
      password: password
      ssl.certificate_authorities:
      - /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt
    You can also use a keystore to protect the username and password.
    https://www.elastic.co/guide/en/beats/metricbeat/7.9/keystore.html
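    A short sketch of how such a keystore could be used (ES_PWD is just an example key name): create the keystore, add a key, and reference it in metricbeat.yml.
    metricbeat keystore create
    metricbeat keystore add ES_PWD
    
    The password can then be referenced in metricbeat.yml as password: "${ES_PWD}".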
  3. Setup dashboards, index management, machine learning and ingest pipelines from metricbeat
    Index Management includes templates, ilm policy and rollover alias

    You need to run this command only once and only on one server.
    metricbeat setup --dashboards --index-management --machine-learning --pipelines
    or
    metricbeat setup *
    
  4. Metricbeat modules handling

    Get a list of all modules
    metricbeat modules list
    
    Enable Modules
    metricbeat modules enable module-name
    
    Disable Modules
    metricbeat modules disable module-name
    
  5. Let us configure some modules
    Check the system module configuration and add any missing elements.

    module: system
    metricsets: cpu, load, memory, network, process, process_summary, socket_summary
    top 5 processes : by_cpu, by_memory
    period: 10s
    
    module: system
    metricsets: filesystem, fsstat
    processors: drop_event.when.regexp: system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
    period: 1m
    
    module: system
    metricsets: uptime
    period: 15m
    vim /etc/metricbeat/modules.d/system.yml
    - module: system
      period: 10s
      metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
        - socket_summary
        #- entropy
        #- core
        #- diskio
        #- socket
      process.include_top_n:
        by_cpu: 5      # include top 5 processes by CPU
        by_memory: 5   # include top 5 processes by memory
    - module: system
      period: 1m
      metricsets:
        - filesystem
        - fsstat
      processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
    - module: system
      period: 15m
      metricsets:
        - uptime
    #- module: system
    #  period: 5m
    #  metricsets:
    #    - raid
    #  raid.mount_point: '/'
  6. Enable automatic start of metricbeat at system boot and start metricbeat
    To check if it worked, go to Kibana UI → Management → Index Management and check if there is an index starting with metricbeat. If you don’t have such an index something went wrong. To help debug, Metricbeat will write to /var/log/metricbeat/metricbeat and /var/log/messages.

    systemctl enable metricbeat
    systemctl start metricbeat
    
    After starting, Metricbeat adds templates and ILM settings to your Elasticsearch cluster. If you need to change this, please visit
    https://www.elastic.co/guide/en/beats/metricbeat/7.9/metricbeat-template.html
    https://www.elastic.co/guide/en/beats/metricbeat/7.9/ilm.html
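    If the index does not show up, the configuration can also be checked from the command line; a quick sketch:
    metricbeat test config
    metricbeat test output
    
    The first command validates metricbeat.yml, the second one tests the connection to the configured Elasticsearch output.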

Installation of Filebeat

In the previous chapter you learned how to install Metricbeat. It’s time to install Filebeat.

  1. Download and Installation

    yum install https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.9.3-x86_64.rpm
  2. Configure Filebeat
    Add the necessary configurations to Filebeat.

    setup.kibana.host
    setup.kibana.ssl.certificate_authorities
    output.elasticsearch.hosts
    output.elasticsearch.username
    output.elasticsearch.password
    output.elasticsearch.ssl.certificate_authorities
    
    vim /etc/filebeat/filebeat.yml
    setup.kibana:
      host: https://server1:5601
      ssl.certificate_authorities:
      - /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt
    output.elasticsearch:
      hosts:
      - server1:9200
      - server2:9200
      - server3:9200
      protocol: https
      username: beats_user
      password: password
      ssl.certificate_authorities:
      - /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt
    
    You can also use a keystore to protect the username and password.
    https://www.elastic.co/guide/en/beats/filebeat/7.9/keystore.html
  3. Setup dashboards, index management, machine learning and ingest pipelines from Filebeat
    Index Management includes templates, ilm policy and rollover alias

    You need to run this command only once and only on one server.
    filebeat setup *
    
  4. Enable and configure the module system
    After installing Filebeat it's time to enable the system module. You can use the Elastic documentation as help: Elasticdoc filebeat system module

    module: system/syslog
    var.convert_timezone: true
    
    module: system/auth var.convert_timezone: true
    filebeat modules enable system

    Check the system module configuration and add any missing elements. You also have to set var.convert_timezone: true.

    vim /etc/filebeat/modules.d/system.yml
    - module: system
      # Syslog
      syslog:
        enabled: true
        var.convert_timezone: true
      # Authorization logs
      auth:
        enabled: true
        var.convert_timezone: true
  5. Enable filebeat for auto startup at system boot, and start filebeat
    To check if it worked, go to Kibana UI → Management → Index Management and check if there is an index starting with filebeat. If you don’t have such an index something went wrong. To help debug, Filebeat will write to /var/log/filebeat/filebeat and /var/log/messages.

    systemctl enable filebeat
    systemctl start filebeat
    

    After starting, Filebeat adds templates and ILM settings to your Elasticsearch cluster. If you need to change this, please visit
    Filebeat index-template
    filebeat ilm
  6. What is the difference between Filebeat and Metricbeat?
    Think about configuring, installing and starting the beats.

    Most steps work the same way for every Beat. They differ in what they do, but all share the same base functions.

Installation of Heartbeat

We will install a third beat. This one is named Heartbeat and is used for monitoring availability.

  1. Download and installation

    yum install https://artifacts.elastic.co/downloads/beats/heartbeat/heartbeat-7.9.3-x86_64.rpm
  2. Configure Heartbeat
    Add the necessary configurations to Heartbeat.

    setup.kibana.host
    setup.kibana.ssl.certificate_authorities
    output.elasticsearch.hosts
    output.elasticsearch.username
    output.elasticsearch.password
    output.elasticsearch.ssl.certificate_authorities
    
    vim /etc/heartbeat/heartbeat.yml
    setup.kibana:
      host: https://server1:5601
      ssl.certificate_authorities:
      - /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt
    output.elasticsearch:
      hosts:
      - server1:9200
      - server2:9200
      - server3:9200
      protocol: https
      username: beats_user
      password: password
      ssl.certificate_authorities:
      - /etc/pki/ca-trust/source/anchors/labs.strigo.io.crt
    
    You can also use a keystore to protect the username and password.
    https://www.elastic.co/guide/en/beats/heartbeat/7.9/keystore.html
  3. Setup index management from Heartbeat
    Index Management includes templates, ilm policy and rollover alias

    You need to run this command only once and only on one server.
    heartbeat setup *
    
  4. Prepare Heartbeat for Monitors
    In /etc/heartbeat/monitors.d you can add Heartbeat monitors that check via tcp, http and icmp. First we prepare the prerequisites.

    Check and adapt in /etc/heartbeat/heartbeat.yml these lines:

    vim /etc/heartbeat/heartbeat.yml
    heartbeat.config.monitors:
      # Directory + glob pattern to search for configuration files
      path: ${path.config}/monitors.d/*.yml
      # If enabled, heartbeat will periodically check the config.monitors path for changes
      reload.enabled: true
      # How often to check for changes
      reload.period: 5s
    
    Disable the inline monitors by commenting out the following lines.
    #heartbeat.monitors:
    #- type: http
    #
    #  # List or urls to query
    #  urls: ["https://localhost:9200"]
    #
    #  # Configure task schedule
    #  schedule: '@every 10s'
    #
    #  # Total test connection and data exchange timeout
    #  #timeout: 16s
    #  check.response.status: 200
    
  5. Add http Monitor for www.google.ch
    Now we would like to add monitors. First we will add a simple http check for www.google.ch, scheduled every minute and checking for a response 200.
    Add this monitor in an external file so that Heartbeat does not need to be restarted. External file: /etc/heartbeat/monitors.d/google_ch.yml.

    vim /etc/heartbeat/monitors.d/google_ch.yml
    - type: http
      name: www.google.ch
      schedule: '@every 1m'
      urls: ["https://www.google.ch"]
      check.response.status: 200
    
  6. Add tcp and icmp Monitor for the elasticservers

    vim /etc/heartbeat/monitors.d/elastic_server.yml
    - type: icmp
      name: icmp
      enabled: true
      schedule: '@every 5s'
      hosts: ["server1", "server2", "server3"] 
    - type: tcp
      name: tcp
      enabled: true
      schedule: '@every 10s'
      hosts: ["server1:9200", "server2:9200", "server3:9200"]
  7. Start Heartbeat
    Now we can start Heartbeat. We also enable automatic start at system boot.
    To check if it worked, go to Kibana UI → Management → Index Management and check if there is an index starting with heartbeat. If you don’t have such an index something went wrong. To help debug, Heartbeat will write to /var/log/heartbeat/heartbeat and /var/log/messages.

    systemctl enable heartbeat-elastic
    systemctl start heartbeat-elastic
    

    After starting, Heartbeat adds templates and ILM settings to your Elasticsearch cluster. If you need to change this, please visit
    Heartbeat index-template
    Heartbeat ilm
  8. Heartbeat Dashboards
    After successfully installing Heartbeat on linux1 it's time to use the application called Uptime. Go to Kibana and visit Uptime.

    Uptime app

Changing ILM settings for Beats

Beats will automatically use ILM. But you have options to change this.

  1. Let us use the Metricbeat components

    1. Let us do something before we start playing with the policy
      As we are testing the ILM policy, do you remember what we could set on the cluster to make it easier to observe changes to the ILM policy?

      You need to set the poll interval for the index lifecycle to a lower value.
    2. Check the cluster settings, and set the poll interval for the index lifecycle policy to 10 seconds

      GET _cluster/settings
      
      Set the poll interval to 10 seconds
      PUT /_cluster/settings
      {
          "transient" : {
              "indices.lifecycle.poll_interval": "10s"
          }
      }
      
    3. What are the defaults?
      What are the default policies used by the Beats installed in the previous labs?

      In Kibana you will find the ILM Settings in Management / Index Lifecycle Management

      Only the hot phase is defined: rollover at a maximum index size of 50GB or a maximum age of 30 days.

    4. Change the ILM Policy for Metricbeat
      Set the max size of an index to 1GB and change the max_age to 1 day

      Go to Kibana / Management / Index Lifecycle Policies and adjust the metricbeat policy
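      If you prefer the Dev Tools over the GUI, the same check and change can be sketched with the ILM API (assuming the default policy name metricbeat; adjust it if your policy is named differently, and note that the PUT replaces the whole policy):
      GET _ilm/policy/metricbeat
      
      PUT _ilm/policy/metricbeat
      {
        "policy": {
          "phases": {
            "hot": {
              "actions": {
                "rollover": {
                  "max_size": "1gb",
                  "max_age": "1d"
                }
              }
            }
          }
        }
      }
      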

      Roll over the index manually. Attention, do you remember? You should run the rollover on the alias instead of the index.

      Go to Kibana / Dev Tools and execute

      POST /metricbeat-7.9.3/_rollover
      

      The response looks similar to the following snippet.

      {
        "acknowledged" : true,
        "shards_acknowledged" : true,
        "old_index" : "metricbeat-7.9.3-2019.07.25-000001",
        "new_index" : "metricbeat-7.9.3-2019.07.26-000002",
        "rolled_over" : true,
        "dry_run" : false,
        "conditions" : { }
      }
      

      Check in Kibana / Management / Index Management if there are errors. If you have errors you can retry the rollover by (You should replace the date and the number):

      POST /metricbeat-7.9.3-2019.07.25-000001/_ilm/retry
      

      If there is still a problem, set the key index.lifecycle.indexing_complete to true. This needs to be done on the index which was rolled over (example: after a rollover from 00001 to 00002, set index.lifecycle.indexing_complete to true on 00001). You can do this by going to Kibana / Management / Index Management / select index with error / Edit settings. After that, retry ILM for the specific index.


Processors

  1. Let us work with some processors on Metricbeat

    1. Let us add fields to the documents which are collected by Metricbeat

        field: "project.name"
        value: "RS_Alpha"
      
      Update metricbeat.yml file
      vim /etc/metricbeat/metricbeat.yml
      
      Don't delete or modify the existing processors, just add this snippet below the existing processors.
      Append this on linux1
        - add_fields:
            target: 'project'
            fields:
              name: 'RS_Alpha'
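      After saving the file, the change only takes effect once Metricbeat is restarted:
      systemctl restart metricbeat
      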
      

      Use the Kibana Discover feature to check if your configuration was successful.

      "The field has an exclamation mark. Reason: The field has not yet been updated in the Kibana Index Patterns" test
      Goto Kibana > Index Pattern, then select metricbeat-* and klick to "Refresh field list" test
      Now the field "project.name" also exists in the Index Pattern test

Lab Windows Beats 5-2

Objective: In this lab you will learn how to install Metricbeat, Filebeat and Winlogbeat on Windows. Because installing and configuring Beats needs appropriate rights, we assume you are working as Administrator.

Configure System Environment

Since there are no DNS servers available we work in the lab with a local hosts file.

  1. Setup local DNS using C:\Windows\System32\drivers\etc\hosts
    Use YOUR current IP addresses. These are different from the example here!

    Add the following lines to the file C:\Windows\System32\drivers\etc\hosts

    172.31.x.x windows1
    172.31.x.x server1
    172.31.x.x server2
    172.31.x.x server3

Requirements for Beats

  1. Prepare the CA certificate
    Because we enabled secure connections for elasticsearch and kibana we need to get the CA certificates. You can download the CA certificate from our website.

    Open a PowerShell prompt as an Administrator (right-click the PowerShell icon and select Run As Administrator)
    and enter the following commands

     [Net.ServicePointManager]::SecurityProtocol =[Net.SecurityProtocolType]::Tls12 
    Invoke-WebRequest https://rsmon1.realmon.ch/elastic/elastic-stack-ca.pem -OutFile Documents/elastic-stack-ca.pem
    You can use the VM Clipboard for Cut & Paste

    We need this CA certificate for every beat.

Installation of Metricbeat

At first we will install Metricbeat. This one brings a lot of modules which are easy to configure.

Install Metricbeat (64-BIT version) on server windows1
Make sure that you are connected with server windows1 !!!!!
  1. Download the Metricbeat Windows zip file from the downloads page
    Download Page: https://www.elastic.co/de/downloads/past-releases/metricbeat-7-8-1

  2. Extract the contents of the zip file into C:\Program Files

  3. Rename the metricbeat-<version>-windows directory to Metricbeat

  4. Open a PowerShell prompt as an Administrator (right click the PowerShell icon and select *Run As Administrator)

  5. From the PowerShell prompt, run the following commands to install Metricbeat as a Windows Service:

     cd 'C:\Program Files\Metricbeat'
     .\install-service-metricbeat.ps1
    If script execution is disabled on your system, you need to set the execution policy for the current session to allow the script to run. PowerShell.exe -ExecutionPolicy Unrestricted -File .\install-service-metricbeat.ps1 or choose [R] Run once
  6. Create Certs folder under C:\Program Files\Metricbeat and copy the CA certificate elastic-stack-ca.pem into this folder
    The CA Certificate you will find in the Documents folder.

     md -Path 'C:\Program Files\Metricbeat\Certs'
     Copy-Item -Path C:\Users\Administrator\Documents\elastic-stack-ca.pem -Destination 'C:\Program Files\Metricbeat\Certs\'
  7. Enable Metricbeat Windows Module
    Open a PowerShell prompt as an Administrator (right click the PowerShell icon and select Run As Administrator) unless it’s still open.
    From the PowerShell prompt, run the following commands to enable Metricbeat modules.

     cd 'C:\Program Files\Metricbeat'
     .\metricbeat.exe modules enable windows
  8. Configure Metricbeat
    Add the necessary configurations to Metricbeat YML file.

    name
    setup.kibana.host
    setup.kibana.ssl.certificate_authorities
    output.elasticsearch.hosts
    output.elasticsearch.username
    output.elasticsearch.password
    output.elasticsearch.ssl.certificate_authorities
    
    Wordpad or Visual Studio Code metricbeat.yml
    name: windows1
    setup.kibana:
      host: "https://server1:5601"
      ssl.certificate_authorities: 'C:\Program Files\metricbeat\certs\elastic-stack-ca.pem'
    output.elasticsearch:
      hosts: ["server1:9200","server2:9200","server3:9200"]
      protocol: "https"
      username: "beats_user"
      password: "password"
      ssl.certificate_authorities: 'C:\Program Files\metricbeat\certs\elastic-stack-ca.pem'
    Use Visual Studio Code for editing at Open with
  9. Deploy dashboards, index management, machine learning and ingest pipelines
    There are prepared dashboards for Metricbeat

     cd 'C:\Program Files\Metricbeat'
     .\metricbeat.exe setup *
  10. Start metricbeat Service
    You can start the service on PowerShell or Services Desktop App

     Start-Service -name metricbeat
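    You can verify that the service is running; a quick check in the same PowerShell session:
     Get-Service metricbeat
    The Status column should show "Running". If not, check the Metricbeat logs on windows1.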

Installation of Filebeat

Install Filebeat (64-BIT version) on server windows1
Make sure that you are connected with server windows1 !!!!!
  1. Download the Filebeat Windows zip file from the downloads page
    Download Page: https://www.elastic.co/de/downloads/past-releases/filebeat-7-8-1

  2. Extract the contents of the zip file into C:\Program Files

  3. Rename the filebeat-<version>-windows directory to Filebeat

  4. Open a PowerShell prompt as an Administrator (right click the PowerShell icon and select *Run As Administrator)

  5. From the PowerShell prompt, run the following commands to install Filebeat as a Windows Service:

     cd 'C:\Program Files\Filebeat'
     .\install-service-filebeat.ps1
    If script execution is disabled on your system, you need to set the execution policy for the current session to allow the script to run. PowerShell.exe -ExecutionPolicy Unrestricted -File .\install-service-filebeat.ps1 or choose [R] Run once
  6. Create Certs folder under C:\Program Files\Filebeat and copy the CA certificate elastic-stack-ca.pem into this folder
    The CA Certificate you will find in the Documents folder.

     md -Path 'C:\Program Files\Filebeat\Certs'
     Copy-Item -Path C:\Users\Administrator\Documents\elastic-stack-ca.pem -Destination 'C:\Program Files\Filebeat\Certs\'
  7. Configure Filebeat
    Add the necessary configurations to the Filebeat YML file.

    name
    setup.kibana.host
    setup.kibana.ssl.certificate_authorities
    output.elasticsearch.hosts
    output.elasticsearch.username
    output.elasticsearch.password
    output.elasticsearch.ssl.certificate_authorities
    
    C:\Program Files\filebeat\filebeat.yml
    name: windows1
    setup.kibana:
      host: "https://server1:5601"
      ssl.certificate_authorities: 'C:\Program Files\filebeat\certs\elastic-stack-ca.pem'
    output.elasticsearch:
      hosts: ["server1:9200","server2:9200","server3:9200"]
      protocol: "https"
      username: "beats_user"
      password: "password"
      ssl.certificate_authorities: 'C:\Program Files\filebeat\certs\elastic-stack-ca.pem'
    Use Visual Studio Code for editing at Open with
  8. Deploy dashboards, index management, machine learning and ingest pipelines
    There are prepared dashboards for Filebeat

     cd 'C:\Program Files\Filebeat'
     .\filebeat.exe setup *
  9. Start filebeat Service
    You can start the service on PowerShell or Services Desktop App

     Start-Service -name filebeat

Installation of Winlogbeat

Install Winlogbeat (64-BIT version) on server windows1
Make sure that you are connected with server windows1 !!!!!
  1. Download the Winlogbeat Windows zip file from the downloads page
    Download Page: https://www.elastic.co/de/downloads/past-releases/winlogbeat-7-9-3

  2. Extract the contents of the zip file into C:\Program Files

  3. Rename the winlogbeat-<version>-windows directory to Winlogbeat

  4. Open a PowerShell prompt as an Administrator (right click the PowerShell icon and select *Run As Administrator)

  5. From the PowerShell prompt, run the following commands to install Winlogbeat as a Windows Service:

     cd 'C:\Program Files\Winlogbeat'
     .\install-service-winlogbeat.ps1
    If script execution is disabled on your system, you need to set the execution policy for the current session to allow the script to run. PowerShell.exe -ExecutionPolicy Unrestricted -File .\install-service-winlogbeat.ps1 or choose [R] Run once
  6. Create Certs folder under C:\Program Files\Winlogbeat and copy the CA certificate elastic-stack-ca.pem into this folder
    The CA Certificate you will find in the Documents folder.

     md -Path 'C:\Program Files\Winlogbeat\Certs'
     Copy-Item -Path C:\Users\Administrator\Documents\elastic-stack-ca.pem -Destination 'C:\Program Files\Winlogbeat\Certs\'
  7. Configure Winlogbeat
    Add the necessary configurations to Winlogbeat YML file.

    name
    setup.kibana.host
    setup.kibana.ssl.certificate_authorities
    output.elasticsearch.hosts
    output.elasticsearch.username
    output.elasticsearch.password
    output.elasticsearch.ssl.certificate_authorities
    
    C:\Program Files\winlogbeat\winlogbeat.yml
    name: windows1
    setup.kibana:
      host: "https://server1:5601"
      ssl.certificate_authorities: 'C:\Program Files\winlogbeat\certs\elastic-stack-ca.pem'
    output.elasticsearch:
      hosts:
      - server1:9200
      - server2:9200
      - server3:9200
      protocol: "https"
      username: "beats_user"
      password: "password"
      ssl.certificate_authorities: 'C:\Program Files\winlogbeat\certs\elastic-stack-ca.pem'
    Use Visual Studio Code for editing (right-click the file and choose Open with)
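    As with Filebeat, you can optionally let Winlogbeat validate the configuration and the output connection before continuing (again, an optional check):

     cd 'C:\Program Files\Winlogbeat'
     .\winlogbeat.exe test config -c .\winlogbeat.yml
     .\winlogbeat.exe test output -c .\winlogbeat.yml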
  8. Deploy dashboards, index management, machine learning and ingest pipelines
    There are prepared dashboards for winlogbeat

     cd 'C:\Program Files\Winlogbeat'
     .\winlogbeat.exe setup
  9. Start winlogbeat Service
    You can start the service via PowerShell or the Services desktop app

     Start-Service -name winlogbeat
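    Once the service is running, you can check in the Kibana Dev Tools whether events are arriving (assuming the default winlogbeat-* index naming):

     GET winlogbeat-*/_count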

Lab6: Logstash

Objective: In this lab, you will learn how to install and configure Logstash, accept syslog messages from a Linux server, and configure a port NAT that forwards port 514 to 5014.

Logstash is pre-installed on all three nodes: server1, server2 and server3

We will send syslog messages from linux1 to server1. So we have to configure rsyslog on linux1.

Configure rsyslog to forward syslog message

Choose linux1 in Strigo and add the following lines at the end of rsyslog.conf.

vi /etc/rsyslog.conf
To forward messages to another host via UDP, prepend the hostname with the at sign ("@").
To forward it via plain tcp, prepend two at signs ("@@").
# remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
*.* @server1:5014
# ### end of the forwarding rule ##
systemctl restart rsyslog
All syslog messages are now forwarded to server1. Logstash will listen on UDP/5014.
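If you want to verify that the forwarded packets actually arrive before Logstash is configured, you can sniff the port on server1. This quick check assumes tcpdump is installed and that eth0 is the lab interface:

tcpdump -ni eth0 udp port 5014

Stop the capture with Ctrl+C once you see packets coming in, for example after running logger test on linux1.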

Requirements for Logstash

Because we are using security, we first need to create a role and a user in Kibana.

  1. Create role logstash_input
    Before starting we need a user that Logstash can use to send data to Elasticsearch. Go to Kibana and create a role with the following requirements.

    name: logstash_input
    cluster privileges: manage_index_templates, monitor, manage_ingest_pipelines, manage_ilm
    index privileges: index, manage, create, create_index
    On the indices: syslog-*
    You can access these settings inside Kibana (Management -> Roles)
    [Screenshot: Kibana role creation form]
    
    Then click "Create role"
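    As an alternative to the Kibana UI, the same role could also be created in the Dev Tools console via the security API (a sketch using exactly the privileges listed above):

     POST /_security/role/logstash_input
     {
       "cluster": [ "manage_index_templates", "monitor", "manage_ingest_pipelines", "manage_ilm" ],
       "indices": [
         {
           "names": [ "syslog-*" ],
           "privileges": [ "index", "manage", "create", "create_index" ]
         }
       ]
     }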
  2. Create user logstash_user

    username: logstash_user
    roles: logstash_input
    password: password
    other fields can be ignored
    You can access these settings inside Kibana (Management -> Users)
    [Screenshot: Kibana user creation form]
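    The user could likewise be created in the Dev Tools console instead of the UI (same values as above):

     POST /_security/user/logstash_user
     {
       "password": "password",
       "roles": [ "logstash_input" ]
     }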
    

Configuration of Logstash

  1. Prepare a proper pipeline structure

    cd /etc/logstash
    mkdir -p pipelines/syslog/conf.d
    mkdir -p pipelines/syslog/patterns
    mkdir -p pipelines/syslog/templates
    
  2. Configure your first pipeline
    We will configure the first pipeline for Syslog. The node linux1 is configured to send syslog messages to node server1.

    Configure the input plugins as follows

    vim /etc/logstash/pipelines/syslog/conf.d/10-syslog.conf
    input {
      tcp {
        host => "0.0.0.0"
        port => "5014"
        type => "syslog"
      }
      udp {
        host => "0.0.0.0"
        port => "5014"
        workers => 2
        type => "syslog"
      }
    }
    

    Configure the filter plugins as follows

    vim /etc/logstash/pipelines/syslog/conf.d/20-syslog.conf
    filter {
    }

    Configure the output plugins as follows

    vim /etc/logstash/pipelines/syslog/conf.d/30-syslog.conf
    output {
      elasticsearch {
        hosts => "https://server1:9200"
        user => "logstash_user"
        password => "password"
        cacert => "/etc/pki/ca-trust/source/anchors/labs.strigo.io.crt"
        manage_template => false
        index => "syslog-%{+YYYY.MM.dd}"
      }
    }

    Now we have to adjust the pipelines.yml file

    vim /etc/logstash/pipelines.yml
    - pipeline.id: syslog
      path.config: "/etc/logstash/pipelines/syslog/conf.d/*.conf"
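    Before starting the service, you can optionally let Logstash validate the pipeline configuration. The command below assumes the default package install path; with --path.settings it also reads pipelines.yml and therefore checks the syslog pipeline:

     /usr/share/logstash/bin/logstash --path.settings /etc/logstash -t

    Look for a "Configuration OK" validation result in the output; the check can take a while because the JVM has to start first.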
    
  3. Start Logstash
    Start Logstash and enable start on boot.

    systemctl enable logstash
    systemctl start logstash

    Check if logstash is started

    systemctl status logstash

    Does it load the pipeline? Check /var/log/logstash/logstash-plain.log for lines similar to the following.

    Sep 13 08:58:10 ip-172-31-23-153.eu-central-1.compute.internal logstash[18115]: [2019-09-13T08:58:10,702][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `...rsion=>7}
    Sep 13 08:58:10 ip-172-31-23-153.eu-central-1.compute.internal logstash[18115]: [2019-09-13T08:58:10,729][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogS...1:9200"]}
    Sep 13 08:58:10 ip-172-31-23-153.eu-central-1.compute.internal logstash[18115]: [2019-09-13T08:58:10,826][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge] A gauge metric of an ...
    Sep 13 08:58:10 ip-172-31-23-153.eu-central-1.compute.internal logstash[18115]: [2019-09-13T08:58:10,829][INFO ][logstash.javapipeline    ] Starting pipeline {:pipeline_id=>"syslog", "...16 run>"}
    Sep 13 08:58:11 ip-172-31-23-153.eu-central-1.compute.internal logstash[18115]: [2019-09-13T08:58:11,481][INFO ][logstash.javapipeline    ] Pipeline started {"pipeline.id"=>"syslog"}
    Sep 13 08:58:11 ip-172-31-23-153.eu-central-1.compute.internal logstash[18115]: [2019-09-13T08:58:11,575][INFO ][logstash.inputs.tcp      ] Starting tcp input listener {:address=>"0.0....>"false"}
    Sep 13 08:58:11 ip-172-31-23-153.eu-central-1.compute.internal logstash[18115]: [2019-09-13T08:58:11,745][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipel...ines=>[]}
    Sep 13 08:58:11 ip-172-31-23-153.eu-central-1.compute.internal logstash[18115]: [2019-09-13T08:58:11,797][INFO ][logstash.inputs.udp      ] Starting UDP listener {:address=>"0.0.0.0:5014"}
    Sep 13 08:58:11 ip-172-31-23-153.eu-central-1.compute.internal logstash[18115]: [2019-09-13T08:58:11,928][INFO ][logstash.inputs.udp      ] UDP listener started {:address=>"0.0.0.0:501...=>"2000"}
    Sep 13 08:58:12 ip-172-31-23-153.eu-central-1.compute.internal logstash[18115]: [2019-09-13T08:58:12,158][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
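    You can also confirm that the listeners are bound to the expected port (ss is part of iproute on CentOS 7):

     ss -lntup | grep 5014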
  4. Create Kibana Pattern for syslog and check syslog messages in the discover GUI
    Use the Kibana GUI: Management → Kibana (Index Patterns)

    Index pattern: syslog-*
    Time Filter field name: @timestamp
    

    Then click "Create index pattern"

    If no syslog message has been sent yet, no index is available. With the command logger test message a syslog message can be triggered on linux1 (see the quick check below).
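    A quick way to check whether the index exists is the cat API in the Dev Tools console:

     GET _cat/indices/syslog-*?v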
  5. Search syslog messages in the discover GUI

    If you're lucky, you'll find a syslog message like this
    [Screenshot: syslog message in the Discover view]
    

Filter plugins for processing syslog messages

  1. Replace existing filter plugins

    vim /etc/logstash/pipelines/syslog/conf.d/20-syslog.conf
    
    filter {
      if [type] == "syslog" {
        grok {
          match => { "message" => "<%{NUMBER:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
          add_field => [ "received_at", "%{@timestamp}" ]
          add_field => [ "received_from", "%{host}" ]
        }
        syslog_pri { }
        date {
          match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
        }
      }
    }
    
    systemctl restart logstash
    
    The Elasticsearch documents look like this now
    [Screenshot: parsed syslog document in Discover]
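    To make the grok pattern more tangible, here is an illustrative example (invented values) of how a raw syslog line is split into fields:

     <30>Sep 13 09:15:01 linux1 systemd[1]: Started Session 42 of user root.

     syslog_pri:       30
     syslog_timestamp: Sep 13 09:15:01
     syslog_hostname:  linux1
     syslog_program:   systemd
     syslog_pid:       1
     syslog_message:   Started Session 42 of user root.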
    
  2. Refresh Kibana Index
    The new fields are not yet known to the index pattern and are marked with an exclamation mark.
    Refresh this Kibana index pattern

    Kibana GUI: Management -> Kibana (Index Patterns) -> Select syslog-*
    Click "Refresh field list" button
    [Screenshot: Refresh field list button]
    

Send syslog messages on port UDP/514 to Logstash

Logstash runs under a non-privileged user and therefore cannot bind to ports below 1024. One possibility is to use iptables to NAT the incoming syslog port UDP/514 to a high port.

During the basic setup on server1, server2 and server3 firewalld was stopped and disabled. Instead iptables-services was installed.
  1. Configure rsyslog to forward syslog message now over Port UDP/514
    Choose linux1 in Strigo and change the following lines at the end of rsyslog.conf.

    vi /etc/rsyslog.conf
    
    # remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
    *.* @server1:514
    # ### end of the forwarding rule ##
    
    systemctl restart rsyslog
    
  2. Add iptables port forward / NAT rules
    Choose server1 in Strigo and add the NAT rules to iptables.
    Check the current firewall policy

    iptables -t nat -L
    
     Chain PREROUTING (policy ACCEPT)
     target     prot opt source               destination

     Chain INPUT (policy ACCEPT)
     target     prot opt source               destination

     Chain OUTPUT (policy ACCEPT)
     target     prot opt source               destination

     Chain POSTROUTING (policy ACCEPT)
     target     prot opt source               destination

    Add the new rules

    iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 514 -j REDIRECT --to-port 5014
    iptables -A PREROUTING -t nat -i eth0 -p udp --dport 514 -j REDIRECT --to-port 5014
    

    Check the new firewall policy

    iptables -t nat -L
    
     Chain PREROUTING (policy ACCEPT)
     target     prot opt source               destination
     REDIRECT   tcp  --  anywhere             anywhere             tcp dpt:shell redir ports 5014
     REDIRECT   udp  --  anywhere             anywhere             udp dpt:syslog redir ports 5014

     Chain INPUT (policy ACCEPT)
     target     prot opt source               destination

     Chain OUTPUT (policy ACCEPT)
     target     prot opt source               destination

     Chain POSTROUTING (policy ACCEPT)
     target     prot opt source               destination
    Make the NAT rules permanent
    /usr/libexec/iptables/iptables.init save
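    With the iptables-services package the rules are written to /etc/sysconfig/iptables; a quick check that the REDIRECT rules were saved (path assumed from the package default):

     grep 5014 /etc/sysconfig/iptables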
  3. Go to Kibana and check if this NAT Rule is working
    On the node linux1 force a syslog message

    logger "this is check syslog message"
    

    Do you get this message in the syslog index? Hopefully :-)

Configure Logstash on a second node, for example on server2

We will now send all syslog messages to two Logstash instances and see what happens.

  1. Configure Logstash on node server2 in the same way as on server1
    See above for the configuration. Necessary steps are:

    1. Prepare the CA certificate

    2. Configuration of Logstash

    3. Start logstash and check correct running

    4. Add iptable NAT rules

  2. On node linux1, add server2 to rsyslog.conf
    See above.

  3. Send a test syslog message

    logger "this is another syslog check message"
    
  4. Go to the Kibana Discover tool and check the syslog indices
    What do you see there?
    Yes exactly, the syslog message is available twice!
    Bad news!

  5. What’s the solution to this issue?
    We will use the fingerprint filter plugin to create a unique document_id.
    Add the following lines to 20-syslog.conf

    fingerprint {
       source => "message"
       target => "[@metadata][fingerprint]"
       method => "MURMUR3"
    }
    
    vim /etc/logstash/pipelines/syslog/conf.d/20-syslog.conf
    filter {
      if [type] == "syslog" {
        grok {
          match => { "message" => "<%{NUMBER:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
          add_field => [ "received_at", "%{@timestamp}" ]
          add_field => [ "received_from", "%{host}" ]
        }
        syslog_pri { }
        date {
          match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
        }
        fingerprint {
          source => "message"
          target => "[@metadata][fingerprint]"
          method => "MURMUR3"
        }
      }
    }
    

    Add the following line to 30-syslog.conf

    document_id => "%{[@metadata][fingerprint]}"
    
    vim /etc/logstash/pipelines/syslog/conf.d/30-syslog.conf
    output {
      elasticsearch {
        hosts => "https://server1:9200"
        user => "logstash_user"
        password => "password"
        cacert => "/etc/pki/ca-trust/source/anchors/labs.strigo.io.crt"
        manage_template => false
        document_id => "%{[@metadata][fingerprint]}"
        index => "syslog-%{+YYYY.MM.dd}"
      }
    }
    
  6. Send the syslog test message again
    How many documents are created now?
    Yes, just one. Great!

    Because both events get the same document ID, no second document is created; the existing document with that ID is simply updated.
    https://www.elastic.co/de/blog/logstash-lessons-handling-duplicates
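    If you want to double-check the deduplication in the Dev Tools console, a search for the test text should return exactly one hit (adjust the phrase to whatever you sent with logger):

     GET syslog-*/_search
     {
       "query": {
         "match_phrase": { "message": "another syslog check message" }
       }
     }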

Lab7: Logging Configuration

Objective: In this lab, you will reduce the logging noise

Elasticsearch logging configuration

Elasticsearch should only log error messages, and the JVM garbage collection log should no longer include trace-level age information and safepoints.

The configuration has to be done on server1, server2 and server3
The adjustments have to be done with the root account!

Set the following levels for Log4j 2 to error logging

logger.action.level = error
rootLogger.level = error
logger.index_indexing_slowlog.level = error
logger.index_search_slowlog_rolling.level = error
logger.xpack_security_audit_logfile.level = error
vim /etc/elasticsearch/log4j2.properties
#logger.action.level = debug
logger.action.level = error

#rootLogger.level = info
rootLogger.level = error

#logger.index_search_slowlog_rolling.level = trace
logger.index_search_slowlog_rolling.level = error

#logger.index_indexing_slowlog.level = trace
logger.index_indexing_slowlog.level = error

#logger.xpack_security_audit_logfile.level = info
logger.xpack_security_audit_logfile.level = error

Reduce Garbage Collection logging
This setting can be done inside jvm.options file. Remove the following part out of the file

,gc+age=trace,safepoint
vim /etc/elasticsearch/jvm.options
#9-:-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m 
9-:-Xlog:gc*:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m

Kibana logging configuration

Kibana should only log error messages, and we still have to set up log rotation for the kibana.log file

Set error level logging for Kibana

logging.quiet: true
vim /etc/kibana/kibana.yml
#logging.quiet: false
logging.quiet: true

Setup logrotation for kibana.log

The following instructions apply to CentOS 7. Create a new logrotate file for Kibana with the following content.

vim /etc/logrotate.d/kibana
/var/log/kibana/*.log {
  missingok
  daily
  size 10M
  create 0644 kibana kibana
  rotate 7
  notifempty
  sharedscripts
  compress
  postrotate
    /bin/kill -HUP $(ps aux | grep kibana.yml | grep -v grep | awk '{print $2}' 2>/dev/null) 2>/dev/null
  endscript
}

Check if logrotate still works correctly

/usr/sbin/logrotate -s /var/lib/logrotate/logrotate.status /etc/logrotate.conf

If no errors are issued, everything is okay.
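To dry-run only the Kibana rule without rotating anything, logrotate's debug mode can be used as well:

/usr/sbin/logrotate -d /etc/logrotate.d/kibana

Debug mode only prints what would be done; no log files are touched.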

Logstash logging configuration

Reduce the log Level to error

log.level: error
vim /etc/logstash/logstash.yml
#log.level: info
log.level: error

So now silence should return :-)
For the changes to take effect, all components must be restarted. Node by node.

Linux Beat logging configuration

Beats should only log error messages. The following instructions apply to all beats.

When a Beat is running on a Linux system with systemd, it uses the -e command line option by default, which makes it write all logging output to stderr so it can be captured by journald. Other outputs are disabled. See for example Metricbeat and systemd to learn more and how to change this.
logging.level: error
logging.to_files: true
logging path = /var/log/xxxbeat
logging file = xxxbeat
keepfiles = 7
permissions = -rw-r--r--

xxx = metric or file or audit etc.

vim /etc/xxxbeat/xxxbeat.yml
Update the logging section. Attention: do not forget the indentation in the YML file :-)
logging.level: error
logging.to_files: true
logging.files:
  path: /var/log/xxxbeat
  file: xxxbeat
  keepfiles: 7
  permissions: 0644

Now we have to adjust the default setting so that the beat no longer writes to stderr!
That means we’ll have to adjust the launch behavior.

xxx still applies for metric or file or audit etc.

Create a new directory xxxbeat.service.d in /etc/systemd/system.

mkdir /etc/systemd/system/xxxbeat.service.d

Now create a new file environment.conf with the following content.

vim /etc/systemd/system/xxxbeat.service.d/environment.conf
[Service]
Environment="BEAT_LOG_OPTS="

To apply your changes, reload the systemd configuration and restart the beat service.

systemctl daemon-reload
systemctl restart xxxbeat

Now check if a logfile has been created.

ls -al /var/log/xxxbeat/xxxbeat
tail /var/log/xxxbeat/xxxbeat
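To confirm that the systemd drop-in is actually picked up, systemctl can show the unit together with its override files (metricbeat is used here as the example beat):

systemctl cat metricbeat

The output should list /etc/systemd/system/metricbeat.service.d/environment.conf below the main unit file.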