slideshow 1 slideshow 2 slideshow 3

Xen randomly crashing server - part 2

Some weeks ago I blogged about "Xen randomly crashing server". The problem back then was that I couldn't get any information why the server reboots. Using a netconsole was not possible, because netconsole refused to work with the bridge that is used for Xen networking. Luckily my colocation partner rrbone.net connected the second network port of my server to the network so that I could use eth1 instead of the bridged eth0 for netconsole.

Today the server crashed several times and I was able to collect some more information than just the screenshots from IPMI/KVM console as shown in my last blog entry (full netconsole output is attached as a file): 

May 12 11:56:39 31.172.31.251 [829681.040596] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2
May 12 11:56:39 31.172.31.251 [829681.040647] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 3.0a 01/03/2014
May 12 11:56:39 31.172.31.251 [829681.040701] task: ffffffff8181a460 ti: ffffffff81800000 task.ti: ffffffff81800000
May 12 11:56:39 31.172.31.251 [829681.040749] RIP: e030:[<ffffffff812b7e56>]
May 12 11:56:39 31.172.31.251  [<ffffffff812b7e56>] memcpy+0x6/0x110
May 12 11:56:39 31.172.31.251 [829681.040802] RSP: e02b:ffff880280e03a58  EFLAGS: 00010286
May 12 11:56:39 31.172.31.251 [829681.040834] RAX: ffff88026eec9070 RBX: ffff88023c8f6b00 RCX: 00000000000000ee
May 12 11:56:39 31.172.31.251 [829681.040880] RDX: 00000000000004a0 RSI: ffff88006cd1f000 RDI: ffff88026eec9422
May 12 11:56:39 31.172.31.251 [829681.040927] RBP: ffff880280e03b38 R08: 00000000000006c0 R09: ffff88026eec9062
May 12 11:56:39 31.172.31.251 [829681.040973] R10: 0100000000000000 R11: 00000000af9a2116 R12: ffff88023f440d00
May 12 11:56:39 31.172.31.251 [829681.041020] R13: ffff88006cd1ec66 R14: ffff88025dcf1cc0 R15: 00000000000004a8
May 12 11:56:39 31.172.31.251 [829681.041075] FS:  0000000000000000(0000) GS:ffff880280e00000(0000) knlGS:ffff880280e00000
May 12 11:56:39 31.172.31.251 [829681.041124] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
May 12 11:56:39 31.172.31.251 [829681.041153] CR2: ffff88006cd1f000 CR3: 0000000271ae8000 CR4: 0000000000042660
May 12 11:56:39 31.172.31.251 [829681.041202] Stack:
May 12 11:56:39 31.172.31.251 [829681.041225]  ffffffff814d38ff
May 12 11:56:39 31.172.31.251  ffff88025b5fa400
May 12 11:56:39 31.172.31.251  ffff880280e03aa8
May 12 11:56:39 31.172.31.251  9401294600a7012a
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.041287]  0100000000000000
May 12 11:56:39 31.172.31.251  ffffffff814a000a
May 12 11:56:39 31.172.31.251  000000008181a460
May 12 11:56:39 31.172.31.251  00000000000080fe
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.041346]  1ad902feff7ac40e
May 12 11:56:39 31.172.31.251  ffff88006c5fd980
May 12 11:56:39 31.172.31.251  ffff224afc3e1600
May 12 11:56:39 31.172.31.251  ffff88023f440d00
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.041407] Call Trace:
May 12 11:56:39 31.172.31.251 [829681.041435]  <IRQ>
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.041441]
May 12 11:56:39 31.172.31.251  [<ffffffff814d38ff>] ? ndisc_send_redirect+0x3bf/0x410
May 12 11:56:39 31.172.31.251 [829681.041506]  [<ffffffff814a000a>] ? ipmr_device_event+0x7a/0xd0
May 12 11:56:39 31.172.31.251 [829681.041548]  [<ffffffff814bc74c>] ? ip6_forward+0x71c/0x850
May 12 11:56:39 31.172.31.251 [829681.041585]  [<ffffffff814c9e54>] ? ip6_route_input+0xa4/0xd0
May 12 11:56:39 31.172.31.251 [829681.041621]  [<ffffffff8141f1a3>] ? __netif_receive_skb_core+0x543/0x750
May 12 11:56:39 31.172.31.251 [829681.041729]  [<ffffffff8141f42f>] ? netif_receive_skb_internal+0x1f/0x80
May 12 11:56:39 31.172.31.251 [829681.041771]  [<ffffffffa0585eb2>] ? br_handle_frame_finish+0x1c2/0x3c0 [bridge]
May 12 11:56:39 31.172.31.251 [829681.041821]  [<ffffffffa058c757>] ? br_nf_pre_routing_finish_ipv6+0xc7/0x160 [bridge]
May 12 11:56:39 31.172.31.251 [829681.041872]  [<ffffffffa058d0e2>] ? br_nf_pre_routing+0x562/0x630 [bridge]
May 12 11:56:39 31.172.31.251 [829681.041907]  [<ffffffffa0585cf0>] ? br_handle_local_finish+0x80/0x80 [bridge]
May 12 11:56:39 31.172.31.251 [829681.041955]  [<ffffffff8144fb65>] ? nf_iterate+0x65/0xa0
May 12 11:56:39 31.172.31.251 [829681.041987]  [<ffffffffa0585cf0>] ? br_handle_local_finish+0x80/0x80 [bridge]
May 12 11:56:39 31.172.31.251 [829681.042035]  [<ffffffff8144fc16>] ? nf_hook_slow+0x76/0x130
May 12 11:56:39 31.172.31.251 [829681.042067]  [<ffffffffa0585cf0>] ? br_handle_local_finish+0x80/0x80 [bridge]
May 12 11:56:39 31.172.31.251 [829681.042116]  [<ffffffffa0586220>] ? br_handle_frame+0x170/0x240 [bridge]
May 12 11:56:39 31.172.31.251 [829681.042148]  [<ffffffff8141ee24>] ? __netif_receive_skb_core+0x1c4/0x750
May 12 11:56:39 31.172.31.251 [829681.042185]  [<ffffffff81009f9c>] ? xen_clocksource_get_cycles+0x1c/0x20
May 12 11:56:39 31.172.31.251 [829681.042217]  [<ffffffff8141f42f>] ? netif_receive_skb_internal+0x1f/0x80
May 12 11:56:39 31.172.31.251 [829681.042251]  [<ffffffffa063f50f>] ? xenvif_tx_action+0x49f/0x920 [xen_netback]
May 12 11:56:39 31.172.31.251 [829681.042299]  [<ffffffffa06422f8>] ? xenvif_poll+0x28/0x70 [xen_netback]
May 12 11:56:39 31.172.31.251 [829681.042331]  [<ffffffff8141f7b0>] ? net_rx_action+0x140/0x240
May 12 11:56:39 31.172.31.251 [829681.042367]  [<ffffffff8106c6a1>] ? __do_softirq+0xf1/0x290
May 12 11:56:39 31.172.31.251 [829681.042397]  [<ffffffff8106ca75>] ? irq_exit+0x95/0xa0
May 12 11:56:39 31.172.31.251 [829681.042432]  [<ffffffff8135a285>] ? xen_evtchn_do_upcall+0x35/0x50
May 12 11:56:39 31.172.31.251 [829681.042469]  [<ffffffff8151669e>] ? xen_do_hypervisor_callback+0x1e/0x30
May 12 11:56:39 31.172.31.251 [829681.042499]  <EOI>
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.042506]
May 12 11:56:39 31.172.31.251  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
May 12 11:56:39 31.172.31.251 [829681.042561]  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
May 12 11:56:39 31.172.31.251 [829681.042592]  [<ffffffff81009e7c>] ? xen_safe_halt+0xc/0x20
May 12 11:56:39 31.172.31.251 [829681.042627]  [<ffffffff8101c8c9>] ? default_idle+0x19/0xb0
May 12 11:56:39 31.172.31.251 [829681.042666]  [<ffffffff810a83e0>] ? cpu_startup_entry+0x340/0x400
May 12 11:56:39 31.172.31.251 [829681.042705]  [<ffffffff81903076>] ? start_kernel+0x497/0x4a2
May 12 11:56:39 31.172.31.251 [829681.042735]  [<ffffffff81902a04>] ? set_init_arg+0x4e/0x4e
May 12 11:56:39 31.172.31.251 [829681.042767]  [<ffffffff81904f69>] ? xen_start_kernel+0x569/0x573
May 12 11:56:39 31.172.31.251 [829681.042797] Code:
May 12 11:56:39 31.172.31.251  <f3>
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.043113] RIP
May 12 11:56:39 31.172.31.251  [<ffffffff812b7e56>] memcpy+0x6/0x110
May 12 11:56:39 31.172.31.251 [829681.043145]  RSP <ffff880280e03a58>
May 12 11:56:39 31.172.31.251 [829681.043170] CR2: ffff88006cd1f000
May 12 11:56:39 31.172.31.251 [829681.043488] ---[ end trace 1838cb62fe32daad ]---
May 12 11:56:39 31.172.31.251 [829681.048905] Kernel panic - not syncing: Fatal exception in interrupt
May 12 11:56:39 31.172.31.251 [829681.048978] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)

I'm not that good at reading this kind of output, but to me it seems that ndisc_send_redirect is at fault. When googling for "ndisc_send_redirect" you can find a patch on lkml.org and Debian bug #804079, both seem to be related to IPv6.

When looking at the linux kernel source mentioned in the lkml patch I see that this patch is already applied (line 1510): 

        if (ha) 
                ndisc_fill_addr_option(buff, ND_OPT_TARGET_LL_ADDR, ha);

So, when the patch was intended to prevent "leading to data corruption or in the worst case a panic when the skb_put failed" it does not help in my case or in the case of #804079.

Any tips are appreciated!

PS: I'll contribute to that bug in the BTS, of course!

Kategorie: 
 
AttachmentSize
syslog-xen-crash.txt24.27 KB

Avoiding Gated Communities with Diaspora, Friendica and others

At the Chaos Communication Congress 32c3 in Hamburg last year, there was a talk by Katharina Nocun named "A New Kid on the Block - Conditions for a Successful Market Entry of Decentralized Social Networks". The short abstract is this: 

The leading social networks are the powerful new gatekeepers of the digital age. Proprietary de facto standards of the dominant companies have lead to the emergence of virtual “information silos” that can barely communicate with one another. Has Diaspora really lost the war? Or is there still a chance to succeed?

Maybe some of you attended that talk or have already seen the recording. For those who haven't, here it is for your convenience: 

It's all about Social Networks and Gated Communities vs. open communities. It's like Facebook on the Gated Community side and Diaspora as an example on the other, the open side.

At timecode 17:20 Katharina mentions that the Top10 of Diaspora pods have more than half a million users. But when you look more closely at the statistics from the-federation.info you can spot a different result that is most likely true for marketing statistic of Facebook as well: there is a difference between total users and current active users. Whereas indeed the total users are easily surpassing the half million users mark, it's a total different issue for the active users count of the last month: 15488 active users in total versus 546783 total users of the Top10 Diaspora sites. That's only 2.83% of active users. A quite awful turnaround rate. 

Many users are just quick lurkers, that came passing by, looking at Diaspora (and other alternative networks), get a quick login and a first try-out and never come back after a few days. I can confirm this from my own Friendica node at Nerdica.net where I currently have a total of 13 users: 7 users never posted any content, 1 user is already automatically set to expired because of this, and 8 users never came back after first day of registration. 

Therefor I cannot confirm with Katharinas conclusion that Diaspora "is not dead, it's pretty alive". All these alternative Social Networks are pretty much dead or - to put it in more friendly words - are alive in a rather small niche or small communities like data/privacy aware peoples.

Am I happy about this?

No, definitely not, because I am one of these data/privacy aware activits. I'm no big fan of such monolithic and centralized networks like Facebook. I'm a enthusiastic advocate of self-hosting and decentralized platforms and communication protocols, such as XMPP.

So, what can be done about these kind of Gated Communities like Facebook? Are you still on Facebook, because most of your family and friends are over there and not on Diaspora/Friendica? Are you still using Skype instead of XMPP? Why are you doing this? I'm really interested in this, because I don't understand it.

PS: please watch the video in full length! Katharina has some other good points as well! :)

Kategorie: 
 

Xen randomly crashing server

It's a long story... an oddessey of almost two years...

But to start from the beginning: Back then I rented a server at Hetzner until they decided to bill for every IP address you got from them. I got a /26 in the past and so I would have to pay for every IP address of that subnet in addition to the server rent of 79.- EUR/month. That would have meant nearly doubling the monthly costs. So I moved with my server from Hetzner to rrbone Net, which offered me a /26 on a rented Cisco C200 M2 server for a competitve price.

After migrating the VMs from Hetzner to rrbone with the same setup that was running just fine at Hetzner I experienced spontaneous reboots of the server, sometimes several times per day and in short time frame. The hosting provider was very, very helpful in debugging this like exchanging the memory, setting up a remote logging service for the CIMC and such. But in the end we found no root cause for this. The CIMC logs showed that the OS was rebooting the machine.

Anyway, I then bought my own server and exchanged the Cisco C200 by my own hardware, but the reboots still happen as before. Sometimes the servers runs for weeks, sometimes the server crashes 4-6 times a day, but usually it's like a pattern: when it crashes and reboots, it will do that again within a few hours and after the second reboot the chances are high that the server will run for several days without a reboot - or even weeks.

The strange thing is, that there are absolutely no hints in the logs, neither syslog or in the Xen logs, so I assume that's something quite deep in the kernel that causes the reboot. Another hint is, that the reboots fairly often happened, when I used my Squid proxy on one of the VMs to access the net. I'm connecting for example by SSH with portforwarding to one VM, whereas the proxy runs on another VM, which led to network traffic between the VMs. Sometimes the server crashed on the very firsts proxy requests. So, I exchanged Squid by tinyproxy or other proxies, moved the proxy from one VM to that VM I connect to using SSH, because I thought that the inter-VM traffic may cause the machine to reboot. Moving the proxy to another virtual server I rented at another hosting provider to host my secondary nameserver did help a little bit, but with no real hard proof and statistics, just an impression of mine.

I moved from xm toolstack to xl toolstack as well, but didn't help either. The reboots are still happening and in the last few days very frequent. Even with the new server I exchanged the memory, used memory mirroring as well, because I thought that it might be a faulty memory module or something, but still rebooting out of the blue.

During the last weekend I configured grub to include "noreboot" command line and then got my first proof that somehow the Xen network stack is causing the reboots: 

This is a screenshot of the IPMI console, so it's not showing the full information of that kernel oops, but as you can see, there are most likely such parts involved like bridge, netif, xenvif and the physical igb NIC.

Here's another screenshot of a crash from this night: 

Slightly different information, but still somehow network involved as you can see in the first line (net_rx_action).

So the big question is: is this a bug Xen or with my setup? I'm using xl toolstack, the xl.conf is basically the default, I think: 

## Global XL config file ##

# automatically balloon down dom0 when xen doesn't have enough free
# memory to create a domain
autoballoon=0

# full path of the lockfile used by xl during domain creation
#lockfile="/var/lock/xl"

# default vif script
#vif.default.script="vif-bridge"

With this the default network scripts of the distribution (i.e. Debian stable) should be used. The network setup consists of two brdiges: 

auto xenbr0
iface xenbr0 inet static
        address 31.172.31.193
        netmask 255.255.255.192
        gateway 31.172.31.254
        bridge_ports eth0
        pre-up brctl addbr xenbr0

auto xenbr1
iface xenbr1 inet static
        address 192.168.254.254
        netmask 255.255.255.0
        pre-up brctl addbr xenbr1

There are some more lines to that config like setting up some iptables rules with up commands and such. But as you can see my eth0 NIC is part of the "main" xen bridge with all the IP addresses that are reachable from the outside. The second bridge is used for internal networking like database connections and such.

I would rather like to use a netconsole to capture the full debug output in case of a new crash, but unfortunately this only works until the bridge is brought up and in place: 

[    0.000000] Command line: placeholder root=UUID=c3....22 ro debug ignore_loglevel loglevel=7 netconsole=port@31.172.31.193/eth0,514@5.45.x.y/e0:ac:f1:4c:y:x
[   32.565624] netpoll: netconsole: local port $port
[   32.565683] netpoll: netconsole: local IPv4 address 31.172.31.193
[   32.565742] netpoll: netconsole: interface 'eth0'
[   32.565799] netpoll: netconsole: remote port 514
[   32.565855] netpoll: netconsole: remote IPv4 address 5.45.x.y
[   32.565914] netpoll: netconsole: remote ethernet address e0:ac:f1:4c:y:x
[   32.565982] netpoll: netconsole: device eth0 not up yet, forcing it
[   36.126294] netconsole: network logging started
[   49.802600] netconsole: network logging stopped on interface eth0 as it is joining a master device

So, the first question is: how to use netconsole with an interface that is used on a bridge?

The second question is: is the setup with two bridges with Xen ok? I've been using this setup for years now and it worked fairly well on the Hetzner server as well, although I used there xm toolstack with a mix of bridge and routed setup, because Hetzner didn't like to see the MAC addresses of the other VMs on the switch and shut the port down if that happens.

Kategorie: 
 

Letsencrypt: challenging challenges solved

A few weeks ago I was wondering in Letsencrypt: challenging challenges about how to setup Letsencrypt when a domain is spread across several virtual machines (VM). One of the possible solutions would be to consolidate everything on one single VM, which is nothing I would like to do. The second option would need to generate the Letsencrypt certs on the webserver and copy over the certs to the appropriate VM on a regular basis or event driven. The third option is to use a network share - and this is what I'm using right now.

So, my setup is as following after I solved the GlusterFS issue with rpcbind binding to all interfaces, although it has been configured to only listen to certain interfaces (solution was: simply remove all NFS related stuff):

On Dom0 (or the host machine) I run GlusterFS as a server on a small 1 GB LVM as part of a replicate with the VM that will do the actual Letsencrypt work: 

Volume Name: le
Type: Replicate
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.x.254:/srv/gfs/le
Brick2: 192.168.x.1:/srv/gfs/le

This is to ensure that on reboot of the machine every other VM using Letsencrypt certs can mount the GlusterFS share, because the host machine will be there for sure whereas the other VM generating the certs with the letsencrypt.sh script might still be booting. And when the GlusterFS share is missing services will not start on the other VMs because of the missing certs, of course. So, the replica on the virtualization host (Dom0) is only acting as some kind of always-being-available network share, because, well, the other VM will not always be there... for example during a kernel update when a reboot is required.

The same setup is on my mailserver, acting as the second GlusterFS brick of that replica drive. The mailserver hosts the bind9 nameserver as well and I might do something that new domains with Letsencrypt certs get added to my DNSSEC setup as well. Of course, when the letsencrypt.sh script creates or updates the certs, it needs the certs being mounted in that configured location, so I needed to add a line to /etc/fstab: 

192.168.x.254:/le /etc/letsencrypt.sh/certs glusterfs noexec,nodev,_netdev 0 0

Basically the same needs to be done on the other VMs where you want to use the certs as well, but you may want to mount the share as read-only there.

The next step was a little more tricky. When letsencrypt.sh generates new certs, Letsencrypt will contact the webserver for that domain to respond to the ACME challenge. This requires that on each VM you want to use letsencrypt you have to run a webserver. Well, actually at least that there is somewhere a webserver that can answer these requests for that specific domain...

Now, the setup of the webserver (Apache in my case) is like this: 

I'm using the Apache macro module to make it more easy, so I generated two small configs in /etc/apache/conf-available and enabled them bei a2enconf: letsencrypt-proxy.conf to do some setup for proxying the ACME challenges to a common website called acme.example.org. And then letsencrypt-sslredir.conf to setup SSL redirection when everything is in place and the domain can be switched over to HTTPS-only.

letsencrypt-proxy.conf: 

<Macro le_proxy>
     ProxyRequests Off
     <Proxy *>
            Order deny,allow
            Allow from all
     </Proxy>
     ProxyPass /.well-known/acme-challenge/ http://acme.windfluechter.net/
     ProxyPassReverse / http://%{HTTP_HOST}/.well-known/acme-challenge/
</Macro>

letsencrypt-sslredir.conf:

<Macro le_sslredir>
    RewriteEngine on
    RewriteCond %{HTTPS} !=on
    RewriteRule . https://%{HTTP_HOST}%{REQUEST_URI}  [L]
</Macro>

So, after all the setup of a virtual host for Apache looks like this: 

<Macro example.org>
(lots of setup stuff)
</Macro>
<VirtualHost 31.172.31.x:443 [2a01:a700:4629:x::1]:443>
        Header always set Strict-Transport-Security "max-age=31556926; includeSubDomains"
        SSLEngine on
        # letsencrypt certs:
        SSLCertificateFile /etc/letsencrypt.sh/certs/example.org/fullchain.pem
        SSLCertificateKeyFile /etc/letsencrypt.sh/certs/example.org/privkey.pem
        SSLHonorCipherOrder On
    Use example.org
    Use le_proxy
</VirtualHost>
<VirtualHost 31.172.31.x:80 [2a01:a700:4629:x::1]:80>
    Use example.org
    Use le_proxy
    Use le_sslredir
</VirtualHost>

le_sslredir is only needed when you are sure that you want all traffic being redirected to HTTPS. For example when your blog is listed on planet.debian.org or other Planets you might want to omit this from your HTTP config because bug #813313 is not yet solved. 

In the end, when creating a new Letsencrypt cert, you need to add the le_proxy macro to your website, add the domain to letsencrypt.sh config in /etc/letsencrypt.sh/domains.txt and then the scripts will request a new cert from Letsencrypt, handling the ACME challenge stuff via the URL redirection in le_proxy being redirected to your acme.exmaple.org site and finally writes your new cert to the GlusterFS share. From that share you can then use the new cert on all needed VMs, be it your mailserver, webserver or XMPP/SIP server VMs. 

At least this works for me.

UPDATE:
Of course you should be careful about your file permissions on that GlusterFS share, so that the automatic key renewal works, but also without too many permissions granted that everyone can obtain your private keys.

Kategorie: 
 

Letsencrypt - when your blog entries don't show up on Planet Debian

Recently there is much talk on Planet Debian about LetsEncrypt certs. This is great, because using HTTPS everywhere improves security and gives the NSA some more work to decrypt the traffic.

However, when you enabled your blog with a LetsEncrypt cert, you might run into the same problem as I: your new article won't show up on Planet Debian after changing your feed URI to HTTPS. The reason seems to be quite simple: planet-venus, which is the software behind Planet Debian seems to have problems with SNI enabled websites.

When following the steps outlined in the Debian Wiki, you can check this by yourself: 

INFO:planet.runner:Fetching https://blog.windfluechter.net/taxonomy/term/2/feed via 5
ERROR:planet.runner:HttpLib2Error: Server presented certificate that does not match host blog.windfluechter.net: {'subjectAltName': (('DNS', 'abi94oesede.de'), ('DNS', 'www.abi94oesede.de')), 'notBefore': u'Jan 26 18:05:00 2016 GMT', 'caIssuers': (u'http://cert.int-x1.letsencrypt.org/',), 'OCSP': (u'http://ocsp.int-x1.letsencrypt.org/',), 'serialNumber': u'01839A051BF9D2873C0A3BAA9FD0227C54D1', 'notAfter': 'Apr 25 18:05:00 2016 GMT', 'version': 3L, 'subject': ((('commonName', u'abi94oesede.de'),),), 'issuer': ((('countryName', u'US'),), (('organizationName', u"Let's Encrypt"),), (('commonName', u"Let's Encrypt Authority X1"),))} via 5

I've filed bug #813313 for this. So, this might explain why your blog post doesn't appear on Planet Debian. Currently there seem 18 sites to be affected by this cert mismatch.

Kategorie: 
 

AKtiVCongrEZ 2016 in Hattingen

Letzte Woche war es wieder soweit: der AKtiVCongrEZ in Hattingen fand statt. Das ist eine Veranstaltung, die von Digitalcourage veranstaltet und von der Bundeszentrale für politische Bildung und dem DGB Bildungswerk gefördert wird und bei der sich viele tolle Menschen, die sich rund um Datenschutz und Menschenrechte engagieren, zusammenkommen und Akionen planen.

Auch dieses Mal war die Veranstaltung ausgebucht, was ungefähr ca. 60 Leute bedeutet, die von Donnerstag abends bis Sonntag nachmittags selber ein volles Programm fertigstellen und abarbeiten. Auf Einzelheiten kann und werde ich natürlich nicht eingehen, aber ein paar Sachen fielen mir dennoch auf, die ich hier gerne ansprechen möchte: 

  • Es ist fast wie ein Familientreffen. Viele der Anwesenden sind altbekannte Aktivisten und jedes Jahr in Hattingen, auch wenn ab und zu natürlich jemand mal verhindert ist, ist es eine große Freude und Ehre, diese außergewöhnlichen Menschen zu treffen.
  • Auch wenn so viele Teilnehmer jedes Jahr wiederkommen, waren dieses Jahr ungewöhnlich viele Neue dabei. Es stimmt zuversichtlich, daß soviele Menschen den Bedarf sehen, aktiv zu werden und etwas gegen den ausufernden Überwachungsstaat zu unternehmen.
  • Auch wenn in den vielen Workshops sehr produktive Arbeit geleistet wird, sind eigentlich die Pausen und die Abende das eigentlich Wichtige an der ganzen Veranstaltung: das Kennenlernen der Aktivisten untereinander und der Austausch mit ihnen miteinander in vielen tollen und interessanten Diskussionen, die häufig bis in die späte Nacht bei einem Bier oder anderem Getränk stattfinden.
Kategorie: 
 

rpcbind listening on all interfaces

Currently I'm testing GlusterFS as a replicating network filesystem. GlusterFS depends on rpcbind package. No problem with that, but I usually want to have the services that run on my machines to only listen on those addresses/interfaces that are needed to fulfill the task. This is especially important, because rpcbind can be abused by remote attackers for rpc amplification attacks (dDoS). So, the rpcbind man page states: 

-h : Specify specific IP addresses to bind to for UDP requests. This option may be specified multiple times and is typically necessary when running on a multi-homed host. If no -h option is specified, rpcbind will bind to INADDR_ANY, which could lead to problems on a multi-homed host due to rpcbind returning a UDP packet from a different IP address than it was sent to. Note that when specifying IP addresses with -h, rpcbind will automatically add 127.0.0.1 and if IPv6 is enabled, ::1 to the list.

Ok, although there is neither a /etc/default/rpcbind.conf nor a /etc/rpcbind.conf nor a sample-rpcbind.conf under /usr/share/doc/rpcbind, some quick websearch revealed a sample config file. I'm using this one: 

# /etc/init.d/rpcbind
OPTIONS=""

# Cause rpcbind to do a "warm start" utilizing a state file (default)
# OPTIONS="-w "

# Uncomment the following line to restrict rpcbind to localhost only for UDP requests
OPTIONS="${OPTIONS} -h 192.168.1.254"
#127.0.0.1 -h ::1"

# Uncomment the following line to enable libwrap TCP-Wrapper connection logging
OPTIONS="${OPTIONS} -l "

As you can see, I want to bind to 192.168.1.254. After a /etc/init.d/rpcbind restart verifying that everything works as desired with netstat is showing...

tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 0 2084266 30777/rpcbind
tcp6 0 0 :::111 :::* LISTEN 0 2084272 30777/rpcbind
udp 0 0 0.0.0.0:848 0.0.0.0:* 0 2084265 30777/rpcbind
udp 0 0 192.168.1.254:111 0.0.0.0:* 0 2084264 30777/rpcbind
udp 0 0 127.0.0.1:111 0.0.0.0:* 0 2084260 30777/rpcbind
udp6 0 0 :::848 :::* 0 2084271 30777/rpcbind
udp6 0 0 ::1:111 :::* 0 2084267 30777/rpcbind

Whoooops! Although I've specified that rpcbind should only listen to 192.168.1.254 (and localhost as described by the man page) rpcbind is still listening on all addresses. Quick check if the process is using the correct options: 

root     30777  0.0  0.0  37228  2360 ?        Ss   16:11   0:00 /sbin/rpcbind -h 192.168.1.254 -l

Hmmm, yes, -h 192.168.1.254 is specified. Ok, something is going wrong here...

According to an entry in Ubuntus Launchpad I'm not the only one that experienced this problem. However this Launchpad entry mentioned that upstream seems to have a fix in version 0.2.3, but I experienced the same behaviour in stable as well as in unstable, where the package version is 0.2.3-0.2. Apparently the problem still exists in Debian unstable.

I'm somewhat undecided whether to file a normal bug against rpcbind or if I should label it as a security bug, because it opens a service to the public that can be abused for amplification attacks, although you might have configured rpcbind to just listen on internal addresses.

Kategorie: 
 

Diskussion um Obergrenzen

In Medien und Politik wird derzeit hitzig über etwaige Obergrenzen für Flüchtlinge diskutiert.

Was soll ich sagen? 

Die Diskussion ist reiner Populismus, der nur den Parteien am äußeren rechten Rand nutzt. Die Diskussion schürt zudem den Haß auf Fremde. Und alle, die sich für Obergrenzen stark machen, sei es nun von der CDU, CSU, SPD oder anderen Parteien, machen sich im Prinzip mitschuldig, wenn irgendwo in Deutschland Flüchtlingsheime brennen.

Das Grundrecht auf Asyl ist ein persönliches Grundrecht, d.h. es steht jedem Menschen zu. Das läßt sich nicht regelmentieren oder ignorieren. Es ist ein ganz einfacher Fakt. Alles andere ist bloß widerlicher rechtsgerichteter Populismus.

Flüchtlinge, die oftmals lebensbedrohliche Strapazen auf sich nehmen, um in Deutschland oder der EU ihre Grundrecht auf Asyl ausüben zu können, verdienen unsere Hilfe und nicht diese grundrechtswidrige und menschenunwürdige Diskussion um Obergrenzen für Flüchtlinge. Wenn man die Anzahl der Flüchtlinge reduzieren will, dann muss man die Ursache der Flucht beheben: Krieg.

Doch genau da haben wir Deutschen als einer der weltweit größten Waffenexporteure eine enorme Mitschuld zu tragen. Wir beteiligen uns mit der Bundeswehr an fragwürdigen Kriegseinsätzen und wundern uns dann trotzdem noch, daß letztes Jahr nach offiziellen des UNHCR weltweit 60 Mio. Menschen auf der Flucht waren?

Bevor man nach Obergrenzen bei Flüchtlingen schreit, sollte man vielleicht mal in seiner eigenen Familiengeschichte nachforschen. Nach dem 2. Weltkrieg waren auch Millionen Deutsche auf der Flucht, so daß es nicht unwahrscheinlich ist, daß ein Großteil der Bevölkerung selber mittelbar von Flüchtlingen abstammen. Was wäre gewesen, wenn es schon damals Obergrenzen gegeben hätte? Welche Familienmitglieder, Nachbarn oder Kollegen gäbe es dann heute wohl nicht?

Jeder Mensch kann früher oder später zur Flucht gezwungen sein. Auch Du, lieber Leser. Oder ich. Das kann manchmal schneller gehen als man denkt. Würde man dann wollen, daß man an einer Grenze zu einem sicheren Land, in dem man Zuflucht suchen und Asyl beantragen möchte, abgewiesen zu werden, weil eine aus reinem rechtsgerichteten Populismus gezogene Obergrenze erreicht worden ist?

Nein, als Deutsche haben wir den Anspruch, quasi überall ungehindert einreisen zu können. Aber wir sind offenbar auch so egoistisch, daß wir das, was wir für uns selber einfordern, anderen nicht zugestehen wollen. Das ist nicht nur falsch oder peinlich, sondern auch in höchstem Maße unmoralisch und Unrecht.

 

Kategorie: 
 

Letsencrypt: challenging challenges

On December 3rd 2015 the Letsencrypt project went to public beta and this is a good thing! Having more and more websites running on good and valid SSL certificates for their HTTPS is a good thing, especially because Letsencrypt takes care of renewing the certs every now and then. But there are still some issues with Letsencrypt. Some people criticize the Python client needing root priviledges and such. Others complain that Letsencrypt currently only supports webservers.

Well, I think for a public beta this is what we could have expected from the start: the Letsencrypt project focussed on a reference implementation and there are already other implementations being available. But one thing seems to a problem within the design of how Letsencrypt works as it uses a challenge response method to verify that the requesting user is controlling the domain for which the certificate shall be issued. This might work well in simple deployments, but what about a little more complex setups like multiple virtual machines and different protocols involved.

For example: you're using domain A for your communication like user@example.net for your mail, XMPP and SIP. Your mailserver runs on one virtual machine, whereas the webserver is running on a different virtual machine. The same for XMPP and SIP: a seperate VM as well. 

Usually the Letsencrypt approach would be that you configure your webserver (by configure /.well-known/acme-challenge/* location or use a standalone server on port 443) to handle the challenge response requests. This would give you a SSL cert for your webserver example.net. Of course you could copy this cert to your mail-, XMPP- and SIP-server, but then again you have to do this everytime the SSL cert gets renewed.

Another challenge is of course that you are not only have one or two domains, but a bunch of domains. In my case I host approximately >60 domains. Basically the mail for all domains are handled by my mailserver running on its own virtual machine. The webserver is located on a different VM. For some domains I offer XMPP accounts on a third VM.

What is the best way to solve this problem? Moving everything to just one virtual machine? Naaah! Writing some scripts to copy the certs as needed? Not very smart as well. Using a network share for the certs between all VMs? Hmmm... would that work?

And what about TLSA entries of your DNSSEC setup? When a SSL cert is renewed than the fingerprint might need an update in your DNS zone - for several protocols like mail, XMPP, SIP and HTTPS. At least the Bash implementation of Letsencrypt offers a "hook" which is called after the SSL cert has been issued.

What are you ways to deal with this kind of handling the ACME protocol challenges and multi-domain, multi-VM setup?

Kategorie: 
 

Guerillero vs. Terrorist

Der Guerillero besetzt das Land,
der Terrorist besetzt das Denken.

aus: "Verdächtig" von Heribert Prantl, S. 19.

'nuff said...

Kategorie: 
 

Pages

Theme by Danetsoft and Danang Probo Sayekti inspired by Maksimer