Difference between revisions of "Netdata"
(How to get notifications) |
(add references section) |
||
(19 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
− | Netdata | + | Netdata is one of the [https://discourse.equality-tech.com/t/dashboards-in-qualitybox/107 QualityBox dashboards]. |
+ | |||
+ | See [{{SERVER}}:20000/ this website Live] | ||
+ | |||
+ | [{{SERVER}}:20000/netdata.conf Configuration] | ||
− | |||
<!-- netdata.conf --> | <!-- netdata.conf --> | ||
== System Locations == | == System Locations == | ||
+ | Depending on how you install netdata, it will be distributed in the normal system locations such as | ||
<pre> | <pre> | ||
- the daemon at /usr/sbin/netdata | - the daemon at /usr/sbin/netdata | ||
Line 16: | Line 20: | ||
- logrotate file at /etc/logrotate.d/netdata | - logrotate file at /etc/logrotate.d/netdata | ||
</pre> | </pre> | ||
+ | |||
+ | Or, if you use | ||
+ | <pre>bash <(curl -Ss https://my-netdata.io/kickstart-static64.sh)</pre> | ||
+ | to install, you'll get all of netdata installed into <code>/opt/netdata</code> | ||
+ | |||
== Host Modifications == | == Host Modifications == | ||
− | + | A Netdata role is available in [https://github.com/enterprisemediawiki/meza/blob/32.x/src/roles/netdata/tasks/main.yml the 32.x branch of Meza] | |
+ | |||
+ | Otherwise, you have to make room in HAProxy for netdata: | ||
+ | |||
+ | === HAProxy === | ||
+ | <source lang="python"> | ||
+ | frontend netdata | ||
+ | bind *:20000 ssl crt /etc/haproxy/certs/wiki.freephile.org.pem | ||
+ | mode http | ||
+ | default_backend netdata-back | ||
+ | |||
+ | backend netdata-back | ||
+ | server nd1 127.0.0.1:19999 | ||
+ | </source> | ||
+ | |||
+ | === Kernel === | ||
+ | You have kernel memory de-duper (called [https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/chap-ksm Kernel Same-page Merging], or KSM) available, but it is not currently enabled. | ||
Memory de-duplication instructions | Memory de-duplication instructions | ||
− | |||
− | |||
− | |||
To enable it run: | To enable it run: | ||
Line 47: | Line 69: | ||
<code>systemctl start netdata</code> | <code>systemctl start netdata</code> | ||
− | == | + | To reload configuration: |
− | The configuration will send messages to 'root' so be sure to either edit the conf <code>sudo vim /etc/netdata/health_alarm_notify.conf</code>, or set <code>vim /etc/aliases && newaliases</code> | + | <code>killall -USR2 netdata</code> <ref>https://docs.netdata.cloud/health/quickstart/#reload-health-configuration</ref> |
+ | |||
+ | == Notifications == | ||
+ | |||
+ | The default configuration will send messages to 'root' so be sure to either edit the conf <code>sudo vim /etc/netdata/health_alarm_notify.conf</code>, or set <code>vim /etc/aliases && newaliases</code> | ||
+ | |||
+ | === Turn off alarm === | ||
+ | |||
+ | |||
+ | <pre> | ||
+ | to: silent # silence notification; still see on website | ||
+ | enabled: no # disable alarm | ||
+ | </pre> | ||
+ | more details in the [https://docs.netdata.cloud/health/tutorials/stop-notifications-alarms/ netdata docs]. | ||
+ | |||
+ | |||
+ | == Issues == | ||
+ | |||
+ | You'll probably receive alarms for 'tcp listen drops'. This is likely bot-related (sending INVALID packets) and NOT due to your application dropping legitimate packets. There is a good discussion on how to identify the source of the problem and how to mitigate or resolve it [https://github.com/firehol/netdata/issues/3234 Issue #3234][https://github.com/firehol/netdata/issues/3826 Issue #3826] TLDR; increase the threshold to 1 (<code>/etc/netdata/health.d/tcp_listen.conf</code>) so you don't get bogus alerts. | ||
+ | |||
+ | Also, you should modify your firewall to drop invalid packets before they're either counted (by netstats) or dropped (by the kernel). | ||
+ | |||
+ | <source lang="bash"> | ||
+ | iptables -A INPUT -m conntrack --ctstate INVALID -j DROP | ||
+ | ip6tables -A INPUT -m conntrack --ctstate INVALID -j DROP | ||
+ | iptables -A INPUT -m tcp -p tcp ! --tcp-flags FIN,SYN,RST,ACK SYN -m conntrack --ctstate NEW -j DROP | ||
+ | ip6tables -A INPUT -m tcp -p tcp ! --tcp-flags FIN,SYN,RST,ACK SYN -m conntrack --ctstate NEW -j DROP | ||
+ | </source> | ||
+ | |||
+ | Following the advice from NASA at https://wiki.earthdata.nasa.gov/display/HDD/SOMAXCONN, I increased my somaxconn kernel parameter to 1024 from 128 | ||
+ | <source lang="bash"> | ||
+ | cat /proc/sys/net/core/somaxconn | ||
+ | 128 | ||
+ | sysctl -w net.core.somaxconn=1024 | ||
+ | </source> | ||
+ | |||
+ | [[File:Tcp state diagram fixed.svg|600px|TCP State diagram]] | ||
+ | |||
+ | |||
+ | |||
+ | == Updates == | ||
+ | Netdata will [https://github.com/firehol/netdata/wiki/Updating-Netdata update itself], and puts a script into cron: | ||
+ | <code> | ||
+ | ln -s /root/netdata/netdata-updater.sh /etc/cron.daily/netdata-updater | ||
+ | </code> | ||
+ | |||
+ | |||
+ | {{References}} | ||
+ | |||
+ | [[Category:QualityBox]] | ||
+ | [[Category:Monitoring]] |
Latest revision as of 21:46, 29 March 2020
Netdata is one of the QualityBox dashboards.
System Locations
Depending on how you install netdata, its files will be placed in the normal system locations, such as:
- the daemon at /usr/sbin/netdata
- config files in /etc/netdata
- web files in /usr/share/netdata
- plugins in /usr/libexec/netdata
- cache files in /var/cache/netdata
- db files in /var/lib/netdata
- log files in /var/log/netdata
- pid file at /var/run/netdata.pid
- logrotate file at /etc/logrotate.d/netdata
Or, if you use
bash <(curl -Ss https://my-netdata.io/kickstart-static64.sh)
to install, you'll get all of netdata installed into /opt/netdata
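To see which of the two layouts a host ended up with, a small check along these lines may help. It is only a sketch: the function name netdata_layout and its optional root-prefix argument are made up for illustration, and it tests nothing beyond the two paths mentioned above.

```shell
# Report which install layout is present on this host.
# $1 is an optional root prefix (hypothetical; useful for a chroot or a test directory).
netdata_layout() (
    root="${1:-}"
    if [ -d "$root/opt/netdata" ]; then
        echo "static install under /opt/netdata"
    elif [ -x "$root/usr/sbin/netdata" ]; then
        echo "system install (daemon at /usr/sbin/netdata)"
    else
        echo "netdata not found"
    fi
)

# Usage: netdata_layout
```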
Host Modifications
A Netdata role is available in the 32.x branch of Meza (https://github.com/enterprisemediawiki/meza/blob/32.x/src/roles/netdata/tasks/main.yml).
Otherwise, you have to make room in HAProxy for netdata:
HAProxy
frontend netdata
    bind *:20000 ssl crt /etc/haproxy/certs/wiki.freephile.org.pem
    mode http
    default_backend netdata-back

backend netdata-back
    server nd1 127.0.0.1:19999
Kernel
You have a kernel memory de-duper (called Kernel Same-page Merging, or KSM) available, but it is not currently enabled.
Memory de-duplication instructions
To enable it run:
echo 1 >/sys/kernel/mm/ksm/run
echo 1000 >/sys/kernel/mm/ksm/sleep_millisecs
If you enable it, you will save 40-60% of netdata memory.
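Before and after running the two echo commands, it can be handy to confirm what state KSM is actually in. The following is a minimal sketch; the function name ksm_status is hypothetical, and the directory argument exists only so the check can be pointed somewhere other than the sysfs path used above.

```shell
# Print the current KSM state by reading the "run" file.
# $1 is the KSM sysfs directory; it defaults to the path used in the commands above.
ksm_status() (
    dir="${1:-/sys/kernel/mm/ksm}"
    if [ ! -r "$dir/run" ]; then
        echo "KSM is not available"
    else
        case "$(cat "$dir/run")" in
            1) echo "KSM is enabled" ;;
            0) echo "KSM is disabled" ;;
            *) echo "KSM is in another state" ;;
        esac
    fi
)

# Usage: ksm_status
```

Note that values written under /sys do not survive a reboot, so the echo commands would need to be reapplied at boot by whatever mechanism the host already uses.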
Ports
By default, netdata listens on all IPs on port 19999. We add a rule to firewalld to allow port 20000, and the HAProxy config passes that port through to the netdata backend.
http://this.machine.ip:20000/ => http://127.0.0.1:19999
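The firewalld rule itself is not shown on this page. A sketch of what it might look like follows, assuming the default zone is the right one for this host. The open_port wrapper and its RUN switch are illustrative only: by default the function just prints the commands so they can be reviewed first.

```shell
# Print (or, with RUN=1, execute) the firewalld commands that open a TCP port.
# The function name and the RUN switch are hypothetical helpers for this sketch.
open_port() (
    port="${1:-20000}"
    for cmd in \
        "firewall-cmd --permanent --add-port=${port}/tcp" \
        "firewall-cmd --reload"
    do
        if [ "${RUN:-0}" = "1" ]; then
            $cmd            # run as root
        else
            echo "$cmd"     # dry run: show what would be executed
        fi
    done
)

# Usage: open_port 20000          (review the commands)
#        RUN=1 open_port 20000    (apply them)
```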
Start/Stop
To stop netdata run:
systemctl stop netdata
To start netdata run:
systemctl start netdata
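To reload the health configuration without restarting netdata:
killall -USR2 netdata
(see https://docs.netdata.cloud/health/quickstart/#reload-health-configuration)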
Notifications
The default configuration will send messages to 'root', so be sure to either edit the notification config (sudo vim /etc/netdata/health_alarm_notify.conf) or set an alias for root (vim /etc/aliases && newaliases).
Turn off alarm
to: silent   # silence notification; still see on website
enabled: no  # disable alarm
More details are in the netdata docs (https://docs.netdata.cloud/health/tutorials/stop-notifications-alarms/).
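As a rough illustration of the "to: silent" approach, the sketch below prints a copy of a health config file with every "to:" line rewritten. It assumes the file uses the "key: value" layout of the files under /etc/netdata/health.d; the function name silence_alarms is made up, and it silences every alarm in the file rather than a single one, so review the output before putting it in place.

```shell
# Print a copy of a health config file with every "to:" line set to "silent".
# Assumption: one "key: value" pair per line, optionally indented.
# The original file is not modified.
silence_alarms() (
    sed -E 's/^([[:space:]]*to:[[:space:]]*).*$/\1silent/' "$1"
)

# Usage: silence_alarms /etc/netdata/health.d/tcp_listen.conf | less
```

After installing an edited file, the health configuration still has to be reloaded with killall -USR2 netdata.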
Issues
You'll probably receive alarms for 'tcp listen drops'. This is likely bot-related (bots sending INVALID packets) and NOT due to your application dropping legitimate packets. There is a good discussion of how to identify the source of the problem, and how to mitigate or resolve it, in Issue #3234 (https://github.com/firehol/netdata/issues/3234) and Issue #3826 (https://github.com/firehol/netdata/issues/3826). TL;DR: increase the threshold to 1 in /etc/netdata/health.d/tcp_listen.conf so you don't get bogus alerts.
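To look at the raw numbers behind these alarms, the kernel counters can be read directly. This sketch assumes the alarm is driven by the ListenOverflows and ListenDrops counters in the TcpExt rows of /proc/net/netstat; the function name listen_drops is hypothetical, and the file argument exists only so a saved copy can be inspected.

```shell
# Print the ListenOverflows and ListenDrops counters from a netstat-style file.
# The file holds pairs of lines per prefix: a header row of names, then a row of values.
listen_drops() (
    file="${1:-/proc/net/netstat}"
    awk '
        # First TcpExt row: remember the column position of every counter name.
        $1 == "TcpExt:" && !header_seen {
            for (i = 2; i <= NF; i++) col[$i] = i
            header_seen = 1
            next
        }
        # Second TcpExt row: print the two values, if both columns exist.
        $1 == "TcpExt:" && ("ListenOverflows" in col) && ("ListenDrops" in col) {
            print "ListenOverflows=" $(col["ListenOverflows"]), "ListenDrops=" $(col["ListenDrops"])
        }
    ' "$file"
)

# Usage: listen_drops
```

Running it twice a few minutes apart shows whether the counters are still climbing after the firewall or threshold changes.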
Also, you should modify your firewall to drop invalid packets before they're either counted (by netstats) or dropped (by the kernel).
iptables -A INPUT -m conntrack --ctstate INVALID -j DROP
ip6tables -A INPUT -m conntrack --ctstate INVALID -j DROP
iptables -A INPUT -m tcp -p tcp ! --tcp-flags FIN,SYN,RST,ACK SYN -m conntrack --ctstate NEW -j DROP
ip6tables -A INPUT -m tcp -p tcp ! --tcp-flags FIN,SYN,RST,ACK SYN -m conntrack --ctstate NEW -j DROP
Following the advice from NASA at https://wiki.earthdata.nasa.gov/display/HDD/SOMAXCONN, I increased my somaxconn kernel parameter from 128 to 1024:
cat /proc/sys/net/core/somaxconn
128
sysctl -w net.core.somaxconn=1024
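A value set with sysctl -w lasts only until the next reboot. One way to keep it, sketched here, is a drop-in file under /etc/sysctl.d. The file name 99-somaxconn.conf and the function name persist_somaxconn are assumptions chosen for illustration, not something this setup prescribes.

```shell
# Write a sysctl drop-in so the somaxconn setting survives a reboot.
# $1 is the value, $2 the target file (the default name is hypothetical).
persist_somaxconn() (
    value="${1:-1024}"
    conf="${2:-/etc/sysctl.d/99-somaxconn.conf}"
    printf 'net.core.somaxconn = %s\n' "$value" > "$conf"
    echo "wrote $conf; apply now with: sysctl --system"
)

# Usage (as root): persist_somaxconn 1024
```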
Updates
Netdata will update itself, and puts a script into cron:
ln -s /root/netdata/netdata-updater.sh /etc/cron.daily/netdata-updater
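If updates seem to have stopped, a first thing to check is whether that symlink is still there and still points at an executable script. A minimal sketch follows; the function name updater_linked is hypothetical, and the default path is the one from the ln command above.

```shell
# Check that the cron.daily entry is a symlink that resolves to an executable file.
# $1 is the link path; it defaults to the one created by the ln command above.
updater_linked() (
    link="${1:-/etc/cron.daily/netdata-updater}"
    if [ -L "$link" ] && [ -x "$link" ]; then
        echo "updater linked: $link -> $(readlink "$link")"
    else
        echo "updater link missing or broken: $link"
    fi
)

# Usage: updater_linked
```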