Elasticsearch’s DevOps head start. Part 1. Operating system.

August 25, 2021 / in Elasticsearch / by Karol Bartnik

This post discusses the choices a DevOps engineer faces in a fresh project that can take advantage of the newest version of Elasticsearch (7.14), and how to grow such a cluster from a PoC up to a fully fledged production cluster.

Preface

As DevOps, our task is to provision infrastructure and keep the service up and running, constantly available and stable. There are manuals and official “getting started” guides you should be familiar with; here, however, we will focus on aspects that are beyond the scope of the official documents.

Before we dive into technical considerations, let me give you a baseline idea of what Elasticsearch is from a DevOps perspective. On a very simplistic level, it is just a program that serves JSON to applications over an HTTP API. Although often called a database, the proper technical term is a datastore. The differences run deep, but for now I’ll point out just a handful: a database stores all the information you feed it, and it can do more than that, such as creating views or executing stored procedures. While databases aggregate and modify data, a datastore is used for near real-time search. A database aims to be as complete as possible, while a datastore ensures data persistence only for a limited period, so old entries are deleted. More often than not you will want to weigh what is actually useful in the long term against what can be removed; this brings cost savings and faster searches.

We’ll kick things off today with a look at how to avoid early pitfalls.

Server spec

Our cluster will grow, but it can start out very modest – a machine with 2 CPUs and 8 GB of RAM; going lower than this will hinder the JVM. There won’t be much load early on, so a single server will do all the work. Later on we will specialize each machine to a specific role. In production, even in the early stages, there should be a minimum of three master nodes in different zones and plenty of data nodes.

Masters can be fairly small machines – 1 CPU, 2 GB RAM. For data nodes, the rule of thumb is:

  • at least 2 CPUs
  • 4 GB of RAM per core, but no more than 64 GB in total
  • at most 16 times the RAM capacity for storage – beyond that it is better to add more nodes. With a maximum of 64 GB * 16 = 1 TB per node, there is no need for inode64.

The kernel impacts performance a lot, both through improvements and occasional bugs, so consider kernel changes carefully. A common approach is to stay one release behind the latest kernel (latest - 1) if possible. You can run the newest one, but test it first on lower environments. One more thing: you will hear that “database servers shouldn’t be run on VMs or in the cloud” – by the time this becomes a real problem, there should be a revenue stream to address it.

The operating system is the foundation of everything running on it. At the same time, a database is not merely a “program” – it augments the system, so hosts will require the default settings to be customized. To have one less thing to worry about, use the .deb / .rpm packages, which apply most of them automatically.

Changes you’ll need to apply to the system yourself:

echo 'vm.swappiness=1' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_retries2=5' >> /etc/sysctl.conf

sysctl -p

Both changes will be discussed in upcoming parts.
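
If you would rather not append to the main /etc/sysctl.conf (for example, to ship the settings inside your own package), a drop-in file achieves the same thing; the file name below is only an example:

# /etc/sysctl.d/90-elasticsearch.conf (example file name)
vm.swappiness = 1
net.ipv4.tcp_retries2 = 5

Running sysctl --system afterwards reloads all drop-in files.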

CPU

The CPU model becomes important at some point. There are options to fine-tune multi-tiered JIT compilation for performance. However, since we’ll be starting with a rented VM just to observe how our cluster grows, don’t worry about it just yet.

When your organization starts to consider moving to a physical machine, you will want to look into compiling Elasticsearch on your own.

RAM

As long as we operate on a server with at most 64 GB of RAM, we can ignore the topic of huge pages. There are diminishing returns to nodes with massive RAM and storage.

Elasticsearch uses RAM not only for the JVM heap, which shouldn’t take more than 50% of available memory, and preferably less than 8 GB (more about this in part 2). Whatever memory is left serves the memory-mapped filesystem – files are kept in memory and read directly from it for speed.
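
As a concrete illustration, on 7.x the heap can be pinned with a drop-in file; the 4 GB value below is only an example for an 8 GB machine, not a recommendation from this article:

# /etc/elasticsearch/jvm.options.d/heap.options (example file name and value)
-Xms4g
-Xmx4g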

Networking

Once we have more than one node, and if the hosting provider allows it, we can activate jumbo frames:

ifconfig eth0 mtu 9000

The benefit of the above materializes when shards need to be rebalanced across nodes to avoid hitting the watermark threshold.
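
Keep in mind that an ifconfig change does not survive a reboot. A minimal sketch of making it persistent, assuming a netplan-managed Ubuntu host (the file name is hypothetical):

# /etc/netplan/99-jumbo.yaml (hypothetical file name)
network:
  version: 2
  ethernets:
    eth0:
      mtu: 9000

On systems without ifconfig, ip link set dev eth0 mtu 9000 does the same job.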

Data disk

Putting the database on the system disk is just asking for all kinds of trouble; we should always attach a dedicated disk. The most important metric of such a disk is IOPS. Disk usage will fluctuate, growing as data arrives, blooming even more during segment merges, and shrinking back once the merge is complete. Once any data disk crosses the default 85% watermark, Elasticsearch stops placing shards on that node, and at the higher watermarks indices can be forced read-only until shards get rebalanced between nodes. Keep at least 30% of the disk free as a reserve. Elasticsearch itself will ensure reliability, so RAID 0 is perfectly acceptable.
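
A quick way to keep an eye on disk usage per node is the allocation cat API (assuming Elasticsearch listens on localhost:9200):

curl 'localhost:9200/_cat/allocation?v'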

The preferred I/O scheduler is either deadline or mq-deadline. Bear in mind that on Azure, device names are not stable between restarts, but there is no downside to changing the I/O scheduler for every device:

# /etc/systemd/system/io-scheduler.service 
[Unit]
Description=Set I/O scheduler
After=local-fs.target

[Service]
Type=oneshot
# tee writes to every matching device; a plain redirect would only handle a single disk
ExecStart=/bin/bash -c 'echo deadline | tee /sys/block/sd*/queue/scheduler'
TimeoutSec=0
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
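
Once the unit file is in place, reload systemd and enable it:

systemctl daemon-reload && systemctl enable --now io-scheduler.service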

Filesystem

Elasticsearch allows the use of multiple data paths but, at the same time, doesn’t handle them well: shards won’t be balanced between path.data entries on a node, and if a single disk hits the 85% watermark, the whole node is treated as full. In AWS, EBS lets you resize a disk without detaching it; just remember to grow the filesystem afterwards. On-premise, use LVM and simply point path.data at the mount point. Use LVM in Azure as well, unless machine downtime is acceptable.

If you end up with more than 4 disks attached, it is smart to replace them with a single one via snapshot and restore. A backup does not contain indices marked for deletion, so the restored data will be faster to search through; a single disk will also have more IOPS; and, to top it all off, segments can be merged while the old infrastructure is still operating. The new cluster is ready once the data is restored to a point in time. If you are worried about wasting hours and hours, remember that we are still operating on a small volume of data, and the old cluster stays available until the new one is ready.

The end game is to mount a single 1 TB data disk, of which we will use about 70%. Remember to adjust the CPU count as RAM and storage grow. Since with capacity we also gain IOPS, we change the I/O scheduler to noop – both for a lower CPU load and an even speedier response (which is now beneficial since the IOPS are maxed out).
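
A minimal sketch of the “grow the filesystem afterwards” step, assuming the vg_1/lg_1 names from the script later in this post and a physical volume on /dev/sdb1 (growpart ships with cloud-guest-utils):

sudo growpart /dev/sdb 1
sudo pvresize /dev/sdb1
sudo lvextend -l +100%FREE /dev/vg_1/lg_1
sudo xfs_growfs /var/lib/elasticsearch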

Final remarks:

  • LVM does not impact performance much
  • XFS and EXT4 are both recommended
  • mount with noatime

Since LVM commands are not something people keep in their L1 cache – here is a script:

#!/usr/bin/env bash
# by Karol Bartnik, use and share freely.

mount_point=$1

# jq is needed to parse lsblk's JSON output
sudo snap install jq

devices_count=$(/bin/lsblk -o NAME -n --json | jq '.blockdevices | length')

for (( device_index=0; device_index<devices_count; device_index++ ))
do

  # -r strips the JSON quotes from the device name
  device_name=$(/bin/lsblk -o NAME -n --json | jq -r ".blockdevices[$device_index].name")
  if [[ $device_name =~ "sd" ]]; then
    devices_children_count=$(/bin/lsblk -o NAME -n --json | jq ".blockdevices[$device_index].children | length")
    if [[ $devices_children_count == 0 ]]; then
      partition_name="${device_name}1"
      # create a single LVM partition, a volume group and a logical volume,
      # format it with XFS and mount it via /etc/fstab
      printf "n\np\n\n\n\nt\n8e\nw\n" | sudo fdisk /dev/$device_name &&
      sudo pvcreate /dev/$partition_name &&
      sudo vgcreate -s 32M vg_1 /dev/$partition_name &&
      sudo lvcreate -l 100%FREE -n lg_1 vg_1 &&
      sudo mkfs.xfs /dev/vg_1/lg_1 &&
      sudo mkdir -p $mount_point &&
      echo /dev/mapper/vg_1-lg_1 $mount_point xfs defaults,discard,noatime 0 2 | sudo tee -a /etc/fstab &&
      sudo mount -a
    fi
  fi

done
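
Usage is as simple as the following (the script name is just an example; the first argument becomes the mount point):

sudo ./add_filesystem.sh /var/lib/elasticsearch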

Or the same thing as an Ansible playbook for the filesystem:

# add_filesystem.yml
---
- hosts: elasticsearch_vm
  become: true
  become_method: sudo
  gather_facts: true
  roles:
    - add_filesystem

# add_filesystem/vars/fs.yml
---
fs_type: xfs
mount_point: /var/lib/elasticsearch
owner: elasticsearch

# add_filesystem/tasks/main.yml
---
# by Karol Bartnik, use and share freely.
- name: vars inclusion
  include_vars:
    dir: vars

- name: get device name
  ansible.builtin.set_fact:
    disk_name: "{{ item }}"
    dev_disk: "/dev/{{ item }}"
    partition: "/dev/{{ item }}1"
  when: ansible_facts.devices[item].partitions == {}
  with_items: "{{ ansible_facts.devices.keys() |
                  select('match', '^sd(.*)$') |
                  list }}"

- name: prepare device if any
  include_tasks: prepare_device.yml
  when: disk_name is defined

# add_filesystem/tasks/prepare_device.yml
---
- name: create partition
  ansible.builtin.parted:
    device: "{{ dev_disk }}"
    number: 1
    flags: [ lvm ]
    state: present

- name: create volume group
  ansible.builtin.lvg:
    vg: vg1
    pvs: "{{ partition }}"
    pesize: "32"

- name: resize volume group
  ansible.builtin.lvg:
    vg: vg1
    pvs: "{{ partition }}"

- name: create logical volume
  community.general.lvol:
    vg: vg1
    lv: lv1
    size: 100%FREE

- name: create filesystem
  ansible.builtin.filesystem:
    fstype: "{{ fs_type }}"
    dev: /dev/mapper/vg1-lv1

- name: mount with mapper
  mount:
    path: "{{ mount_point }}"
    src: /dev/mapper/vg1-lv1
    fstype: "{{ fs_type }}"
    opts: defaults,discard,noatime
    state: mounted

- name: chown mount_point {{ mount_point }}
  file:
    recurse: true
    path: "{{ mount_point }}"
    owner: "{{ owner }}"
    group: "{{ owner }}"
    state: directory
    mode: '0755'
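
Running it is equally simple, assuming your inventory defines the elasticsearch_vm group used in the play above:

ansible-playbook -i inventory add_filesystem.yml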

Monitoring

If possible, add monitoring for CPU, I/O, networking (throughput), and memory usage.
Filesystem usage calls for particular attention, though how to go about that is beyond the scope of this article. Later on, you should also observe the JVM; we’ll look at which aspects in particular in the second part of this feature.
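
If a full monitoring stack is not in place yet, the cat nodes API already exposes the basics (assuming Elasticsearch listens on localhost:9200):

curl 'localhost:9200/_cat/nodes?v&h=name,cpu,load_1m,heap.percent,ram.percent,disk.used_percent'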
