
Netdata

Netdata is high-fidelity infrastructure monitoring and troubleshooting.
Open-source, free, preconfigured, opinionated, and always real-time.



---

Netdata's distributed, real-time monitoring Agent collects thousands of metrics from systems, hardware, containers, and applications with zero configuration. It runs permanently on all your physical/virtual servers, containers, cloud deployments, and edge/IoT devices, and is perfectly safe to install on your systems mid-incident without any preparation.

You can install Netdata on most Linux distributions (Ubuntu, Debian, CentOS, and more), container platforms (Kubernetes clusters, Docker), and many other operating systems (FreeBSD, macOS). No sudo required.

Netdata is designed by system administrators, DevOps engineers, and developers to collect everything, help you visualize metrics, troubleshoot complex performance problems, and make data interoperable with the rest of your monitoring stack.

People get addicted to Netdata. Once you use it on your systems, there's no going back! You've been warned...


Latest release: v1.30.0, March 31, 2021

The v1.30.0 release of Netdata brings major improvements to our packaging and completely replaces Google Analytics/GTM for product telemetry. We're also releasing the first changes in an upcoming overhaul to both our dashboard UI/UX and the suite of preconfigured alarms that comes with every installation.


Features


Here's what you can expect from Netdata:

  • 1s granularity: The highest possible resolution for all metrics.
  • Unlimited metrics: Netdata collects all the available metrics—the more, the better.
  • 1% CPU utilization of a single core: It's unbelievably optimized.
  • A few MB of RAM: The highly-efficient database engine stores per-second metrics in RAM and then "spills" historical metrics to disk for long-term storage.
  • Minimal disk I/O: While running, Netdata only writes historical metrics and reads error and access logs.
  • Zero configuration: Netdata auto-detects everything, and can collect up to 10,000 metrics per server out of the box.
  • Zero maintenance: You just run it. Netdata does the rest.
  • Stunningly fast, interactive visualizations: The dashboard responds to queries in less than 1ms per metric to synchronize charts as you pan through time, zoom in on anomalies, and more.
  • Visual anomaly detection: Our UI/UX emphasizes the relationships between charts to help you detect the root cause of anomalies.
  • Scales to infinity: You can install it on all your servers, containers, VMs, and IoT devices. Metrics are not centralized by default, so there is no limit.
  • Several operating modes: Autonomous host monitoring (the default), headless data collector, forwarding proxy, store and forward proxy, central multi-host monitoring, in all possible configurations. Use different metrics retention policies per node and run with or without health monitoring.

Netdata works with tons of applications, notifications platforms, and other time-series databases:

  • 300+ system, container, and application endpoints: Collectors autodetect metrics from default endpoints and immediately visualize them into meaningful charts designed for troubleshooting. See everything we support.
  • 20+ notification platforms: Netdata's health watchdog sends warning and critical alarms to your favorite platform to inform you of anomalies just seconds after they affect your node.
  • 30+ external time-series databases: Export resampled metrics as they're collected to other local- and Cloud-based databases for best-in-class interoperability.

💡 Want to leverage the monitoring power of Netdata across your entire infrastructure? View metrics from any number of distributed nodes in a single interface and unlock even more features with Netdata Cloud.

Get Netdata


To install Netdata from source on most Linux systems (physical, virtual, container, IoT, edge), run our one-line installation script. This script downloads and builds all dependencies, including those required to connect to Netdata Cloud if you choose, and enables automatic nightly updates and anonymous statistics.

bash <(curl -Ss https://my-netdata.io/kickstart.sh)

To view the Netdata dashboard, navigate to http://localhost:19999, or http://NODE:19999.

Docker

You can also try out Netdata's capabilities in a Docker container:

docker run -d --name=netdata \
  -p 19999:19999 \
  -v netdataconfig:/etc/netdata \
  -v netdatalib:/var/lib/netdata \
  -v netdatacache:/var/cache/netdata \
  -v /etc/passwd:/host/etc/passwd:ro \
  -v /etc/group:/host/etc/group:ro \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -v /etc/os-release:/host/etc/os-release:ro \
  --restart unless-stopped \
  --cap-add SYS_PTRACE \
  --security-opt apparmor=unconfined \
  netdata/netdata

To view the Netdata dashboard, navigate to http://localhost:19999, or http://NODE:19999.
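If you prefer Docker Compose, the `docker run` command above can be expressed as a Compose file. This is a sketch that mirrors the same volumes, port, and security options:

```yaml
version: '3'
services:
  netdata:
    image: netdata/netdata
    container_name: netdata
    ports:
      - 19999:19999
    restart: unless-stopped
    cap_add:
      - SYS_PTRACE
    security_opt:
      - apparmor:unconfined
    volumes:
      - netdataconfig:/etc/netdata
      - netdatalib:/var/lib/netdata
      - netdatacache:/var/cache/netdata
      - /etc/passwd:/host/etc/passwd:ro
      - /etc/group:/host/etc/group:ro
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /etc/os-release:/host/etc/os-release:ro

volumes:
  netdataconfig:
  netdatalib:
  netdatacache:
```

Run it with `docker-compose up -d` from the directory containing the file.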

Other operating systems

See our documentation for additional operating systems, including Kubernetes, .deb/.rpm packages, and more.

Post-installation

When you're finished with installation, check out our single-node or infrastructure monitoring quickstart guides based on your use case.

Or, skip straight to configuring the Netdata Agent.

Read through Netdata's documentation, which is structured based on actions and solutions, to enable features like health monitoring, alarm notifications, long-term metrics storage, exporting to external databases, and more.

How it works

Netdata is a highly efficient, highly modular, metrics management engine. Its lockless design makes it ideal for concurrent operations on the metrics.

Diagram of Netdata's core functionality

The result is a highly efficient, low-latency system, supporting multiple readers and one writer on each metric.

Infographic

This is a high-level overview of Netdata features and architecture. Click on it to view an interactive version, which has links to our documentation.

An infographic of how Netdata works

Documentation

Netdata's documentation is available at Netdata Learn.

This site also hosts a number of guides to help newer users better understand how to collect metrics, troubleshoot via charts, export to external databases, and more.

Community

Netdata is an inclusive open-source project and community. Please read our Code of Conduct.

Find most of the Netdata team in our community forums. It's the best place to ask questions, find resources, and engage with passionate professionals.

You can also find Netdata on:

Contribute

Contributions are the lifeblood of open-source projects. While we continue to invest in and improve Netdata, we need help to democratize monitoring!

  • Read our Contributing Guide, which contains all the information you need to contribute to Netdata, such as improving our documentation, engaging in the community, and developing new features. We've made it as frictionless as possible, but if you need help, just ping us on our community forums!
  • We have a whole category dedicated to contributing and extending Netdata on our community forums
  • Found a bug? Open a GitHub issue.
  • View our Security Policy.

Package maintainers should read the guide on building Netdata from source for instructions on building each Netdata component from source and preparing a package.

License

The Netdata Agent is GPLv3+. Netdata re-distributes other open-source tools and libraries. Please check the third party licenses.

Is it any good?

Yes.

When people first hear about a new product, they frequently ask if it is any good. A Hacker News user remarked:

Note to self: Starting immediately, all raganwald projects will have a "Is it any good?" section in the readme, and the answer shall be "yes."

Comments
  • what our users say about netdata?


    In this thread we collect interesting (or funny, or just plain) posts, blogs, reviews, articles, etc - about netdata.

    1. don't start discussions on this post
    2. if you want to post, post the link to the original post and a screenshot!
  • Prometheus Support


    Hey guys,

    I recently started using prometheus and I enjoy the simplicity. I want to begin to understand what it would take to implement prometheus support within Netdata. I think this is a great idea because it allows the distributed fashion of netdata to exist along with having persistence at prometheus. Centralized graphing (not monitoring) can now happen with grafana. Netdata is a treasure trove of metrics already, making this a worthwhile project.

    Prometheus expects a REST endpoint to exist which publishes a metric, labels, and values. It will poll this endpoint at a desired time frame and ingest the metrics during that poll.

    To get the ball rolling, how are you currently serving HTTP in Netdata? Is this an embedded socket server in C?

  • python.d enhancements


    @paulfantom I am writing here a TODO list for python.d based on my findings.

    • [x] DOCUMENTATION in wiki.

    • [x] log flood protection - it will require 2 parameters: logs_per_interval = 200 and log_interval = 3600. So, every hour (this_hour = int(now / log_interval)) it should reset the counter and allow up to logs_per_interval log entries until the next hour.

      This is how netdata does it: https://github.com/firehol/netdata/blob/d7b083430de1d39d0196b82035162b4483c08a3c/src/log.c#L33-L107

    • [x] support ipv6 for SocketService (currently redis and squid)

    • [x] netdata passes the environment variable NETDATA_HOST_PREFIX. cpufreq should use this to prefix sys_dir automatically. This variable is used when netdata runs in a container. The system directories /proc, /sys of the host should be exposed with this prefix.

    • [ ] the URLService should somehow support proxy configuration.

    • [ ] the URLService should support Connection: keep-alive.

    • [x] The service that runs external commands should be more descriptive. Example running exim plugin when exim is not installed:

      python.d ERROR: exim_local exim [Errno 2] No such file or directory
      python.d ERROR: exim_local exim [Errno 2] No such file or directory
      python.d ERROR: exim: is misbehaving. Reason:'NoneType' object has no attribute '__getitem__'
      
    • [x] This message should be a debug log No unix socket specified. Trying TCP/IP socket.

    • [x] This message could state where it tried to connect: [Errno 111] Connection refused

    • [x] This message could state the hostname it tried to resolve: [Errno -9] Address family for hostname not supported

    • [x] This should state the job name, not the name:

      python.d ERROR: redis/local: check() function reports failure.
      
    • [x] This should state what the problem is:

      # ./plugins.d/python.d.plugin debug cpufreq 1
      INFO: Using python v2
      python.d INFO: reading configuration file: /etc/netdata/python.d.conf
      python.d INFO: MODULES_DIR='/root/netdata/python.d/', CONFIG_DIR='/etc/netdata/', UPDATE_EVERY=1, ONLY_MODULES=['cpufreq']
      python.d DEBUG: cpufreq: loading module configuration: '/etc/netdata/python.d/cpufreq.conf'
      python.d DEBUG: cpufreq: reading configuration
      python.d DEBUG: cpufreq: job added
      python.d INFO: Disabled cpufreq/None
      python.d ERROR: cpufreq/None: check() function reports failure.
      python.d FATAL: no more jobs
      DISABLE
      
    • [x] ~~There should be a configuration entry in python.d.conf to set the PATH to be searched for commands. By default everything in /usr/sbin/ is not found.~~ Added #695 to do this at the netdata daemon for all its plugins.

    • [x] The default retries in the code, for all modules, is 5 or 10. I suggest making them 60 for all modules. There are many services that cannot be restarted within 5 seconds.

      Made it in #695

    • [x] When a service reports failure to collect data (during update()), there should be a log entry stating the reason for the failure.

    • [x] Handling of incremental dimensions in LogService

    • [x] Better autodetection of disk count in hddtemp.chart.py

    • [ ] Move logging mechanism to utilize logging module.

    more to come...
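The log flood protection item above can be sketched as follows. This is a minimal illustration using the parameter names from the TODO entry, not the actual python.d code:

```python
import time

# Proposed parameters from the TODO entry above.
logs_per_interval = 200   # max log entries allowed per interval
log_interval = 3600       # interval length in seconds

_counter = 0
_this_interval = None

def allowed_to_log(now=None):
    """Return True if a log entry may be emitted right now."""
    global _counter, _this_interval
    now = time.time() if now is None else now
    this_interval = int(now / log_interval)
    if this_interval != _this_interval:
        # A new interval started: reset the counter.
        _this_interval = this_interval
        _counter = 0
    _counter += 1
    return _counter <= logs_per_interval
```

Every log call goes through the check; at most logs_per_interval entries are emitted per interval, and the counter resets when the next interval begins.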

  • netdata package maintainers


    This issue has been converted to a wiki page

    For the latest info check it here: https://github.com/firehol/netdata/wiki/netdata-package-maintainers


    I think it would be useful to prepare a wiki page with information about the maintainers of netdata for the Linux distributions, automation systems, containers, etc.

    Let's see who is who:


    Official Linux Distributions

    | Linux Distribution | Netdata Version | Maintainer | Related URL |
    | :-: | :-: | :-: | :-- |
    | Arch Linux | Release | @svenstaro | netdata @ Arch Linux |
    | Arch Linux AUR | Git | @sanskritfritz | netdata @ AUR |
    | Gentoo Linux | Release + Git | @candrews | netdata @ gentoo |
    | Debian | Release | @lhw @FedericoCeratto | netdata @ debian |
    | Slackware | Release | @willysr | netdata @ slackbuilds |
    | Ubuntu | | | |
    | Red Hat / Fedora / CentOS | | | |
    | SuSe / openSuSe | | | |


    FreeBSD

    | System | Initial PR | Core Developer | Package Maintainer |
    | :-: | :-: | :-: | :-: |
    | FreeBSD | #1321 | @vlvkobal | @mmokhi |


    MacOS

    | System | URL | Core Developer | Package Maintainer |
    | :-: | :-: | :-: | :-: |
    | MacOS Homebrew Formula | link | @vlvkobal | @rickard-von-essen |


    Unofficial Linux Packages

    | Linux Distribution | Netdata Version | Maintainer | Related URL |
    | :-: | :-: | :-: | :-- |
    | Ubuntu | Release | @gslin | netdata @ gslin ppa https://github.com/firehol/netdata/issues/69#issuecomment-217458543 |


    Embedded Linux

    | Embedded Linux | Netdata Version | Maintainer | Related URL |
    | :-: | :-: | :-: | :-- |
    | ASUSTOR NAS | ? | William Lin | https://www.asustor.com/apps/app_detail?id=532 |
    | OpenWRT | Release | @nitroshift | openwrt package |
    | ReadyNAS | Release | @NAStools | https://github.com/nastools/netdata |
    | QNAP | Release | QNAP_Stephane | https://forum.qnap.com/viewtopic.php?t=121518 |
    | DietPi | Release | @Fourdee | https://github.com/Fourdee/DietPi |


    Linux Containers

    | Containers | Netdata Version | Maintainer | Related URL |
    | :-: | :-: | :-: | :-- |
    | Docker | Git | @titpetric | https://github.com/titpetric/netdata |


    Automation Systems

    | Automation Systems | Netdata Version | Maintainer | Related URL |
    | :-: | :-: | :-: | :-- |
    | Ansible | git | @jffz | https://galaxy.ansible.com/jffz/netdata/ |
    | Chef | ? | @sergiopena | https://github.com/sergiopena/netdata-cookbook |


    If you know other maintainers of distributions that should be mentioned, please help me complete the list...

    cc: @mcnewton @philwhineray @alonbl @simonnagl @paulfantom

  • new prometheus format


    Based on the recent discussion on #1497 with @brian-brazil, this PR changes the format in which netdata sends metrics to prometheus.

    One of the key differences between netdata and traditional time-series solutions is that it organises metrics into hosts, each having collections of metrics called charts.

    charts

    Each chart has several properties (common to all its metrics):

    chart_id - it serves 3 purposes: it defines the chart's application (e.g. mysql), the application instance (e.g. mysql_local or mysql_db2), and the chart type (e.g. mysql_local.io, mysql_db2.io). However, there is another format: disk_ops.sda (it should be disk_sda.ops). There is issue #807 to normalize these better, but until then, this is how netdata works today.

    chart_name - a more human friendly name for chart_id.

    context - this is the same as above with the application instance removed. So it is mysql.io or disk.ops. Alarm templates use this.

    family is the submenu of the dashboard. Unfortunately, this is again used differently in several cases. For example, for disks and network interfaces it is the disk or the network interface name. But mysql uses it just to group multiple charts together, and postgres uses both (it groups charts and provides different sections for different databases).

    units is the units for all the metrics attached to the chart.

    dimensions

    Then each chart contains metrics called dimensions. All the dimensions of a chart have the same units of measurement and should be contextually in the same category (i.e. the metrics for disk bandwidth are read and write, and they are both in the same chart).


    So, there are hosts (multiple netdata instances), each has its own charts, each with its own dimensions (metrics).
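    To make the naming concrete, here is a hypothetical mysql chart annotated with the properties above (illustrative values, not actual netdata output):

    chart_id:   mysql_local.io    # application "mysql", instance "local", chart type "io"
    chart_name: mysql_local.io
    context:    mysql.io          # chart_id with the instance removed; used by alarm templates
    family:     mysql local       # the dashboard submenu
    units:      kilobits/s
    dimensions: in, out           # the metrics; same units, same category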

    The new prometheus format

    The old format netdata used for prometheus was: CHART_DIMENSION{instance="HOST"}

    The new format depends on the data source requested. netdata supports the following data sources:

    • as collected or raw, to send the raw values collected
    • average, to send averages
    • sum or volume to send sums

    The default is the one defined in netdata.conf: [backend].data source = average (changing netdata.conf changes the format for prometheus too). However, prometheus may directly ask for a specific data source by appending &source=SOURCE to the URL (SOURCE being one of the options above).

    When the data source is as collected or raw, the format of the metrics is:

    CONTEXT_DIMENSION{chart="CHART",family="FAMILY",instance="HOSTNAME"}
    

    In all other cases (average, sum, volume), it is:

    CONTEXT{chart="CHART",family="FAMILY",dimension="DIMENSION",instance="HOSTNAME"}
    

    The above format fixes #1519
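    As an illustration of the two formats (hypothetical host and chart names), the same disk metric would be exported like this in each mode:

    # as collected / raw: CONTEXT_DIMENSION
    disk_io_reads{chart="disk_sda.io",family="sda",instance="myhost"} 150

    # average / sum / volume: CONTEXT with a dimension label
    disk_io{chart="disk_sda.io",family="sda",dimension="reads",instance="myhost"} 12.5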

    time range

    When the data source is average, sum or volume, netdata has to decide the time range over which it will calculate the average or the sum.

    The first time a prometheus server hits netdata, netdata will respond with the time frame defined in [backend].update every. But for all queries after the first, netdata remembers the last time it was accessed and responds with the time range since the last time prometheus asked for metrics.

    Each netdata server can respond to multiple prometheus servers. It remembers the last time it was accessed, for each prometheus IP requesting metrics. If the IP is not good enough to distinguish prometheus servers, each prometheus may append &server=PROMETHEUS_NAME to the URL. Then netdata will remember the last time it was accessed for each PROMETHEUS_NAME given.
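The per-requester bookkeeping described above can be sketched like this (a sketch with illustrative names, not the actual C implementation):

```python
import time

# Last access time per requester: keyed by (client IP, optional &server= name).
_last_access = {}

def query_time_range(client_ip, server_name="", update_every=10, now=None):
    """Return the time range (seconds) to average/sum over for this requester."""
    now = time.time() if now is None else now
    key = (client_ip, server_name)
    last = _last_access.get(key)
    _last_access[key] = now
    if last is None:
        # First query from this requester: use [backend].update every.
        return update_every
    # Afterwards: the time since this requester last asked for metrics.
    return now - last
```

Two prometheus servers behind the same IP would share an entry unless they pass distinct &server= names, which is exactly why the URL parameter exists.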

    instance="HOSTNAME"

    instance="HOSTNAME" is sent only if netdata is called with format=prometheus_all_hosts. If netdata is called with format=prometheus, the instance is not added to the metrics.

    host tags

    Host tags are configured in netdata.conf, like this:

    [backend]
        host tags = tag1="value1",tag2="value2",...
    

    Netdata includes this line at the top of the response:

    netdata_host_tags{tag1="value1",tag2="value2"} 1 1499541463610
    

    The tags are not processed by netdata. Anything set at the host tags config option is just copied. netdata propagates host tags to masters and proxies when streaming metrics.

    If the netdata response includes multiple hosts, netdata_host_tags also includes instance="HOSTNAME".

  • Redis python module + minor fixes


    1. Nginx is shown as nginx: local in dashboard while using python or bash module.
    2. NetSocketService changed name to SocketService, which now can use unix sockets as well as TCP/IP sockets
    3. changed and tested new python shebang (yes it works)
    4. fixed issue with wrong data parsing in exim.chart.py
    5. changed whitelisting method in ExecutableService. It is very probable that whitelisting is not needed, but I am not sure.
    6. Added redis.chart.py

    I have tested this and it works.

    After merging this I need to take a break from rewriting modules in python. There are only 3 modules left, but I don't have any data to create opensips.chart.py or nut.chart.py (so I cannot code the parsers). I also need to do some more research to create ap.chart.py, since using iw isn't the best method.

  • How to install openvpn plugin


    Question summary

    Hi, I'm new to servers and installed a Debian 9 server on a VPS for the first time. Then I installed OpenVPN with the openvpn-install script. I tried to install a few monitoring tools for my server but they always failed. Then I found netdata and it works like a charm. The install script is wonderful ;) To monitor my OpenVPN server, do I have to do something with these files: python.d.plugin, ovpn_status_log.chart.py, python.d/ovpn_status_log.conf? I don't see any tutorial, so can anyone guide me on what to do?

    OS / Environment

    Debian 9 64bit

    Component Name

    openvpn

    Expected results

    see openvpn traffic

    Regards, Przemek

  • Prototype: monitor disk space usage.


    This is just a prototype for discussing some questions at this point.

    This will fix issues #249 and #74 when implemented properly.

    Questions

    1. Should we really implement this in proc_diskstats.c? It does not get its values from proc. I implemented it there because the file system data is already there and it produces a graph in this section.
    2. Shall we use statvfs (only mounted filesystems) or statfs (every filesystem)? If we use statfs we have to query mountinfo.

    TODO

    • [x] Only add charts for filesystems where disk space is available
    • [x] Do not allocate and free the statvfs buffer all the time
    • [ ] Add this feature to the wiki
    • [x] Make unit more readable (TB, GB, MB depending on filesystem size)
    • [x] Do not display disk metrics for containers, only for disks


  • python.d modules configuration documentation


    I suggest to add this header in all python.d/*.conf files:

    # netdata python.d.plugin configuration for ${MODULE}
    #
    # This file is in YaML format. Generally the format is:
    #
    # name: value
    #
    # There are 2 sections:
    #  - global variables
    #  - one or more JOBS
    #
    # JOBS allow you to collect values from multiple sources.
    # Each source will have its own set of charts.
    #
    # JOB parameters have to be indented (example below).
    #
    # ----------------------------------------------------------------------
    # Global Variables
    # These variables set the defaults for all JOBs, however each JOB
    # may define its own, overriding the defaults.
    #
    # update_every sets the default data collection frequency.
    # If unset, the python.d.plugin default is used.
    # update_every: 1
    #
    # priority controls the order of charts at the netdata dashboard.
    # Lower numbers move the charts towards the top of the page.
    # If unset, the default for python.d.plugin is used.
    # priority: 60000
    #
    # retries sets the number of retries to be made in case of failures.
    # If unset, the default for python.d.plugin is used.
    # Attempts to restore the service are made once every update_every
    # and only if the module has collected values in the past.
    # retries: 10
    #
    # ----------------------------------------------------------------------
    # JOBS (data collection sources)
    #
    # The default JOBS share the same *name*. JOBS with the same name
    # are mutually exclusive. Only one of them will be allowed running at
    # any time. This allows autodetection to try several alternatives and
    # pick the one that works.
    #
    # Any number of jobs is supported.
    #
    # All python.d.plugin JOBS (for all its modules) support a set of
    # predefined parameters. These are:
    #
    # job_name:
    #     name: myname     # the JOB's name as it will appear at the
    #                      # dashboard (by default is the job_name)
    #                      # JOBs sharing a name are mutually exclusive
    #     update_every: 1  # the JOB's data collection frequency
    #     priority: 60000  # the JOB's order on the dashboard
    #     retries: 10      # the JOB's number of restoration attempts
    #
    # Additionally to the above, ${MODULE} also supports the following.
    #
    

    where ${MODULE} is the name of each module.

  • Major docker build refactor


    1. Unify Dockerfiles and move them from top-level dir to docker
    2. Add run.sh script as a container entrypoint
    3. Introduce docker builder stage (previously used only in alpine image)
    4. Removed Dockerfile parts from Makefile.am
    5. Allow passing custom options to netdata as a docker CMD parameter (bonus from using ENTRYPOINT script)
    6. Run netdata as user netdata with static UID of 201 and /usr/sbin/nologin as shell
    7. Use multiarch/alpine as a base for all images.
    8. One Dockerfile for all platforms

    Initially I got an uncompressed image size reduction from 276MB to 111MB, and also size reductions for the other images:

    $ docker image ls
    REPOSITORY    TAG       SIZE     COMPRESSED
    netdata       i386      112MB    42MB
    netdata       amd64     111MB    41MB
    netdata       armhf     104MB    39MB
    netdata       aarch64   107MB    39MB
    

    Images are built with ./docker/build.sh command

    Resolves #3972

  • python.d charts with gaps


    Check this:

    image

    Reporting timings has gaps too:

    image

    # cat /etc/os-release
    NAME="Ubuntu"
    VERSION="14.04.4 LTS, Trusty Tahr"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 14.04.4 LTS"
    VERSION_ID="14.04"
    HOME_URL="http://www.ubuntu.com/"
    SUPPORT_URL="http://help.ubuntu.com/"
    BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
    
    # python --version
    Python 2.7.6
    
  • Build judy even without dbengine


    Summary

    Build without dbengine fails. The reason is that recent big changes require libJudy regardless of dbengine.

    Fixes #13652

    Test Plan

    Run the installer with --disable-dbengine before and after this PR.

    Additional Information
    For users: How does this change affect me?
  • Cleanly reimplement system/edit-config.in.


    Summary

    This is a clean reimplementation of the edit-config script we install in the user config directory. It adds a few new features, and fixes a number of outstanding issues with the script.

    Overall changes from the existing script:

    • Error messages are now clearly prefixed with ERROR: instead of looking no different from other output from the script.
    • We now have proper support for command-line arguments. In particular, edit-config --help now properly returns usage information instead of throwing an error. Other supported options are --file for explicitly specifying the file to edit (using this is not required, but we should ideally encourage it), and --editor to specify an editor of choice on the command-line.
    • We now can handle editing configuration for a Docker container on the host side, instead of requiring it to be done in the container. This is done by copying the file out of the container itself. The script includes primitive auto-detection that should work in most common cases, but the user can also use the new --container option to bypass the auto-detection and explicitly specify a container ID or name to use. Supports both Docker and Podman.
    • Instead of templating in the user config directory at build time, the script now uses the directory it was run from as the target for copying stock config files to. This is required for the above-mentioned Docker support, and also makes it a bit easier to test the script without having to do a full build of Netdata.
    • We now do a quick check of the validity of the editor (either auto-detected or user-supplied) instead of just blindly trusting that it’s usable. This should not result in any user-visible changes, but will provide a more useful error message if the user mistypes the name of their editor of choice.
    • Instead of blindly excluding paths starting with / or ., we now do a proper prefix check for the supplied file path to make sure it’s under the user config directory. This provides two specific benefits:
      • We no longer blindly copy files into directories that are not ours. For example, with the existing script, you can do /etc/netdata/edit-config apps_groups.conf and it will blindly copy the stock apps_groups.conf file to the current directory. With the new script, this will throw an error instead.
      • Invoking the script using absolute paths that point under the user config directory will work properly. In particular, this means that you do not need to be in the user config directory when invoking the script, provided you use a proper path. Running netdata/edit-config netdata/apps_groups.conf when in /etc will now work, and /etc/netdata/edit-config /etc/netdata/apps_groups.conf will work from anywhere on the system.
    • If the requested file does not exist, and we do not provide a stock version of it, the script will now create an empty file instead of throwing an error. This is intended to allow it to behave better when dealing with configuration for third-party plugins (we may also want to define a standard location for third party plugins to store their stock configuration to improve this further, but that’s out of scope for this PR).
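The prefix check described above can be sketched in shell. This is an illustrative sketch with assumed names (CONFIG_DIR, is_under_config_dir), not the actual edit-config code:

```shell
#!/bin/sh
# Accept a path only if it resolves to a location under the user config directory.
CONFIG_DIR="/etc/netdata"

is_under_config_dir() {
    # Canonicalize if possible, fall back to the literal path.
    path="$(readlink -f "$1" 2>/dev/null || echo "$1")"
    case "${path}" in
        "${CONFIG_DIR}"/*) return 0 ;;
        *)                 return 1 ;;
    esac
}

is_under_config_dir "/etc/netdata/apps_groups.conf" && echo "accepted"
is_under_config_dir "/tmp/../etc/passwd" || echo "rejected"
```

Because the path is canonicalized before the prefix comparison, tricks like /tmp/../etc/... cannot escape the check, and absolute paths under the config directory work from anywhere on the system.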
    Test Plan

    Testing can be done by copying the script to a usable system and manually replacing @[email protected] near the top with the correct stock config directory.

    If testing the Docker container handling without building a Docker image from this branch, the following command must be manually run in the container being used for testing: echo ${HOSTNAME} > /etc/netdata/.container-hostname. This file is used by the container auto-detection code in the script, and will normally be created by the Docker entrypoint script once this PR is merged.

    Additional Information

    Fixes: https://github.com/netdata/netdata/issues/13495

    For users: How does this change affect me? The edit-config script will now work correctly when dealing with configuration for Docker containers from the Docker host. Additionally, a number of the other benefits described in the summary above apply.
  • [Feat]: Create a collector for Cassandra


    Problem

    Netdata does not currently monitor Cassandra metrics

    Description

    Create a new collector for Cassandra

    The following metrics should be collected and organized in the charts as per structure shown below:

    • Throughput (available via JMX, Jconsole)
      • Reads (Read requests/second)
      • Writes (Write requests/second)
    • Latency (available via nodetool, JMX, Jconsole)
      • Read latency (Read response time in microseconds)
      • Write latency (Write response time in microseconds)
    • Cache (available via nodetool, JMX, Jconsole)
      • Key cache hit rate (Percentage of read requests for which key location was found in cache)
    • Disk usage (available via nodetool, JMX, Jconsole)
      • Load (Disk space used on a node in bytes)
      • Total disk space used (Disk space used by column family, in bytes)
      • Compaction tasks completed (Total count of completed compaction tasks)
      • Compaction tasks in queue (Total count of pending compaction tasks in queue)
    • Garbage collection (available via nodetool, JMX, Jconsole)
      • ParNew count (Number of young-generation collections)
      • ParNew time (Elapsed time of young-generation collections in milliseconds)
      • ConcurrentMarkSweep count (Number of old-generation collections)
      • ConcurrentMarkSweep time (Elapsed time of old-generation collections in milliseconds)
    • Errors (available via JMX, Jconsole)
      • Exceptions (Requests for which Cassandra encountered an error)
      • Timeout exceptions (Requests not acknowledged within the timeout window)
      • Unavailable exceptions (Requests for which the required number of nodes was unavailable)
      • Pending tasks (Tasks in queue awaiting a thread for processing)
      • Blocked tasks (Tasks that have not yet been queued for processing)
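    As a sketch of how a collector might turn the metrics above into chart values, here is a hypothetical Python fragment (not the actual Netdata implementation; the JMX metric names and sampling scheme are assumptions) that derives per-second request rates and the key cache hit percentage from two successive counter samples:

```python
# Hypothetical sketch: convert two successive JMX counter samples into
# the per-second rates and cache hit percentage a Cassandra collector
# would chart. Metric names are illustrative assumptions.

def rates(prev, curr, interval_s):
    """Convert monotonically increasing counters into per-second rates."""
    return {name: (curr[name] - prev[name]) / interval_s for name in curr}

def key_cache_hit_rate(hits, requests):
    """Percentage of read requests whose key location was found in the cache."""
    return 100.0 * hits / requests if requests else 0.0

# Two illustrative samples taken 10 seconds apart.
prev = {"ReadLatency.Count": 1000, "WriteLatency.Count": 5000}
curr = {"ReadLatency.Count": 1500, "WriteLatency.Count": 7000}

print(rates(prev, curr, 10.0))        # {'ReadLatency.Count': 50.0, 'WriteLatency.Count': 200.0}
print(key_cache_hit_rate(950, 1000))  # 95.0
```

    The same delta-over-interval pattern applies to the compaction, garbage collection, and error counters listed above; gauges such as load in bytes would be charted directly.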

    Importance

    must have

    Value proposition

    1. Enable Cassandra users to rely on Netdata for their monitoring and troubleshooting needs.
    2. More users and connected nodes for Netdata.

    Proposed implementation

    Some useful links:

  • initial implementation of QUERY_TARGET

    initial implementation of QUERY_TARGET

    This is an attempt to detach the query engine from the RRDSET and RRDDIM.

    I am currently blocked because storage engines require an RRDDIM/RRDSET structure. So, both APIs have to change to eliminate any references to RRDDIM/RRDSET before proceeding with this.

  • [Bug]: STREAM_RECEIVER segfaults

    [Bug]: STREAM_RECEIVER segfaults

    Bug description

    STREAM_RECEIVER thread segfaults from time to time. I can see days without any segfault, and days when it segfaults every few minutes. Netdata crashes and is restarted after 2 minutes.

    Is there any way to debug this? All I see is this:

    syslog:

    kernel: [31678137.720256] STREAM_RECEIVER[8803]: segfault at 0 ip 00005601b143f837 sp 00007f0a0dc8e460 error 4 in netdata[5601b11cf000+769000]
    
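    The kernel line above already carries enough information to locate the crash site: subtracting the mapping base from the instruction pointer gives the offset inside the netdata binary, which can then be passed to addr2line (debug symbols required). A hedged sketch of that arithmetic, assuming only the log layout shown above:

```python
import re

# Hypothetical helper: extract the faulting offset inside the netdata
# binary from a kernel segfault line. The instruction pointer appears as
# "ip <address>" and the mapping as "in netdata[<base>+<size>]"; their
# difference is the offset to resolve with addr2line.
def fault_offset(syslog_line):
    ip = int(re.search(r"ip ([0-9a-f]+)", syslog_line).group(1), 16)
    base = int(re.search(r"in netdata\[([0-9a-f]+)\+", syslog_line).group(1), 16)
    return ip - base

line = ("STREAM_RECEIVER[8803]: segfault at 0 ip 00005601b143f837 "
        "sp 00007f0a0dc8e460 error 4 in netdata[5601b11cf000+769000]")
print(hex(fault_offset(line)))  # 0x270837
# Then, for example: addr2line -e /usr/sbin/netdata 0x270837
```
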

    netdata error.log

    Terminated
    Terminated
    Terminated
    Terminated
    

    Expected behavior

    No segfaults

    Steps to reproduce

    Unknown

    Installation method

    kickstart.sh

    System info

    Linux 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
    /etc/lsb-release:DISTRIB_ID=Ubuntu
    /etc/lsb-release:DISTRIB_RELEASE=18.04
    /etc/lsb-release:DISTRIB_CODENAME=bionic
    /etc/lsb-release:DISTRIB_DESCRIPTION="Ubuntu 18.04.6 LTS"
    /etc/os-release:NAME="Ubuntu"
    /etc/os-release:VERSION="18.04.6 LTS (Bionic Beaver)"
    /etc/os-release:ID=ubuntu
    /etc/os-release:ID_LIKE=debian
    /etc/os-release:PRETTY_NAME="Ubuntu 18.04.6 LTS"
    /etc/os-release:VERSION_ID="18.04"
    /etc/os-release:VERSION_CODENAME=bionic
    /etc/os-release:UBUNTU_CODENAME=bionic
    

    Netdata build info

    Version: netdata v1.36.1
    Configure options:  '--prefix=/usr' '--sysconfdir=/etc' '--localstatedir=/var' '--libexecdir=/usr/libexec' '--libdir=/usr/lib' '--with-zlib' '--with-math' '--with-user=netdata' '--with-bundled-protobuf' 'CFLAGS=-O2 -pipe' 'LDFLAGS='
    Install type: custom
    Features:
        dbengine:                   YES
        Native HTTPS:               YES
        Netdata Cloud:              YES 
        ACLK Next Generation:       YES
        ACLK-NG New Cloud Protocol: YES
        ACLK Legacy:                NO
        TLS Host Verification:      YES
        Machine Learning:           YES
        Stream Compression:         NO
    Libraries:
        protobuf:                YES (bundled)
        jemalloc:                NO
        JSON-C:                  YES
        libcap:                  NO
        libcrypto:               YES
        libm:                    YES
        tcalloc:                 NO
        zlib:                    YES
    Plugins:
        apps:                    YES
        cgroup Network Tracking: YES
        CUPS:                    NO
        EBPF:                    YES
        IPMI:                    NO
        NFACCT:                  NO
        perf:                    YES
        slabinfo:                YES
        Xen:                     NO
        Xen VBD Error Tracking:  NO
    Exporters:
        AWS Kinesis:             NO
        GCP PubSub:              NO
        MongoDB:                 NO
        Prometheus Remote Write: NO
    

    Additional info

    No response

  • Use CMake generated config.h also in out of tree CMake build

    Use CMake generated config.h also in out of tree CMake build

    Summary

    Fixes a problem identified during review of https://github.com/netdata/netdata/pull/13656

    The CMake build was still using the autotools-generated config.h in the case of an out-of-tree build.

    We cannot rely on autotools anymore (though we still have to rely on the installer).
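    As an illustration of the intended behaviour, an out-of-tree setup along these lines makes the generated header win over any stale in-tree copy (the file names here are generic CMake conventions, not the exact Netdata build files):

```cmake
# Hypothetical sketch: generate config.h into the build tree and search
# the build tree first, so a stale autotools-generated config.h left in
# the source tree can no longer shadow it in out-of-tree builds.
configure_file("${CMAKE_SOURCE_DIR}/config.h.in"
               "${CMAKE_BINARY_DIR}/config.h")
include_directories(BEFORE "${CMAKE_BINARY_DIR}")
```

    With a layout like this, a build configured via cmake -S . -B build compiles against build/config.h regardless of what the source tree contains.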

    Test Plan
    Additional Information
    For users: How does this change affect me?