
May 2016

Xen randomly crashing server - part 2

Some weeks ago I blogged about "Xen randomly crashing server". The problem back then was that I couldn't get any information about why the server was rebooting. Using netconsole was not possible, because netconsole refused to work with the bridge that is used for Xen networking. Luckily my colocation partner rrbone.net connected the second network port of my server to the network, so that I could use eth1 instead of the bridged eth0 for netconsole.

Today the server crashed several times and I was able to collect more information than just the screenshots from the IPMI/KVM console shown in my last blog entry (the full netconsole output is attached as a file):

May 12 11:56:39 31.172.31.251 [829681.040596] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2
May 12 11:56:39 31.172.31.251 [829681.040647] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 3.0a 01/03/2014
May 12 11:56:39 31.172.31.251 [829681.040701] task: ffffffff8181a460 ti: ffffffff81800000 task.ti: ffffffff81800000
May 12 11:56:39 31.172.31.251 [829681.040749] RIP: e030:[<ffffffff812b7e56>]
May 12 11:56:39 31.172.31.251  [<ffffffff812b7e56>] memcpy+0x6/0x110
May 12 11:56:39 31.172.31.251 [829681.040802] RSP: e02b:ffff880280e03a58  EFLAGS: 00010286
May 12 11:56:39 31.172.31.251 [829681.040834] RAX: ffff88026eec9070 RBX: ffff88023c8f6b00 RCX: 00000000000000ee
May 12 11:56:39 31.172.31.251 [829681.040880] RDX: 00000000000004a0 RSI: ffff88006cd1f000 RDI: ffff88026eec9422
May 12 11:56:39 31.172.31.251 [829681.040927] RBP: ffff880280e03b38 R08: 00000000000006c0 R09: ffff88026eec9062
May 12 11:56:39 31.172.31.251 [829681.040973] R10: 0100000000000000 R11: 00000000af9a2116 R12: ffff88023f440d00
May 12 11:56:39 31.172.31.251 [829681.041020] R13: ffff88006cd1ec66 R14: ffff88025dcf1cc0 R15: 00000000000004a8
May 12 11:56:39 31.172.31.251 [829681.041075] FS:  0000000000000000(0000) GS:ffff880280e00000(0000) knlGS:ffff880280e00000
May 12 11:56:39 31.172.31.251 [829681.041124] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
May 12 11:56:39 31.172.31.251 [829681.041153] CR2: ffff88006cd1f000 CR3: 0000000271ae8000 CR4: 0000000000042660
May 12 11:56:39 31.172.31.251 [829681.041202] Stack:
May 12 11:56:39 31.172.31.251 [829681.041225]  ffffffff814d38ff
May 12 11:56:39 31.172.31.251  ffff88025b5fa400
May 12 11:56:39 31.172.31.251  ffff880280e03aa8
May 12 11:56:39 31.172.31.251  9401294600a7012a
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.041287]  0100000000000000
May 12 11:56:39 31.172.31.251  ffffffff814a000a
May 12 11:56:39 31.172.31.251  000000008181a460
May 12 11:56:39 31.172.31.251  00000000000080fe
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.041346]  1ad902feff7ac40e
May 12 11:56:39 31.172.31.251  ffff88006c5fd980
May 12 11:56:39 31.172.31.251  ffff224afc3e1600
May 12 11:56:39 31.172.31.251  ffff88023f440d00
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.041407] Call Trace:
May 12 11:56:39 31.172.31.251 [829681.041435]  <IRQ>
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.041441]
May 12 11:56:39 31.172.31.251  [<ffffffff814d38ff>] ? ndisc_send_redirect+0x3bf/0x410
May 12 11:56:39 31.172.31.251 [829681.041506]  [<ffffffff814a000a>] ? ipmr_device_event+0x7a/0xd0
May 12 11:56:39 31.172.31.251 [829681.041548]  [<ffffffff814bc74c>] ? ip6_forward+0x71c/0x850
May 12 11:56:39 31.172.31.251 [829681.041585]  [<ffffffff814c9e54>] ? ip6_route_input+0xa4/0xd0
May 12 11:56:39 31.172.31.251 [829681.041621]  [<ffffffff8141f1a3>] ? __netif_receive_skb_core+0x543/0x750
May 12 11:56:39 31.172.31.251 [829681.041729]  [<ffffffff8141f42f>] ? netif_receive_skb_internal+0x1f/0x80
May 12 11:56:39 31.172.31.251 [829681.041771]  [<ffffffffa0585eb2>] ? br_handle_frame_finish+0x1c2/0x3c0 [bridge]
May 12 11:56:39 31.172.31.251 [829681.041821]  [<ffffffffa058c757>] ? br_nf_pre_routing_finish_ipv6+0xc7/0x160 [bridge]
May 12 11:56:39 31.172.31.251 [829681.041872]  [<ffffffffa058d0e2>] ? br_nf_pre_routing+0x562/0x630 [bridge]
May 12 11:56:39 31.172.31.251 [829681.041907]  [<ffffffffa0585cf0>] ? br_handle_local_finish+0x80/0x80 [bridge]
May 12 11:56:39 31.172.31.251 [829681.041955]  [<ffffffff8144fb65>] ? nf_iterate+0x65/0xa0
May 12 11:56:39 31.172.31.251 [829681.041987]  [<ffffffffa0585cf0>] ? br_handle_local_finish+0x80/0x80 [bridge]
May 12 11:56:39 31.172.31.251 [829681.042035]  [<ffffffff8144fc16>] ? nf_hook_slow+0x76/0x130
May 12 11:56:39 31.172.31.251 [829681.042067]  [<ffffffffa0585cf0>] ? br_handle_local_finish+0x80/0x80 [bridge]
May 12 11:56:39 31.172.31.251 [829681.042116]  [<ffffffffa0586220>] ? br_handle_frame+0x170/0x240 [bridge]
May 12 11:56:39 31.172.31.251 [829681.042148]  [<ffffffff8141ee24>] ? __netif_receive_skb_core+0x1c4/0x750
May 12 11:56:39 31.172.31.251 [829681.042185]  [<ffffffff81009f9c>] ? xen_clocksource_get_cycles+0x1c/0x20
May 12 11:56:39 31.172.31.251 [829681.042217]  [<ffffffff8141f42f>] ? netif_receive_skb_internal+0x1f/0x80
May 12 11:56:39 31.172.31.251 [829681.042251]  [<ffffffffa063f50f>] ? xenvif_tx_action+0x49f/0x920 [xen_netback]
May 12 11:56:39 31.172.31.251 [829681.042299]  [<ffffffffa06422f8>] ? xenvif_poll+0x28/0x70 [xen_netback]
May 12 11:56:39 31.172.31.251 [829681.042331]  [<ffffffff8141f7b0>] ? net_rx_action+0x140/0x240
May 12 11:56:39 31.172.31.251 [829681.042367]  [<ffffffff8106c6a1>] ? __do_softirq+0xf1/0x290
May 12 11:56:39 31.172.31.251 [829681.042397]  [<ffffffff8106ca75>] ? irq_exit+0x95/0xa0
May 12 11:56:39 31.172.31.251 [829681.042432]  [<ffffffff8135a285>] ? xen_evtchn_do_upcall+0x35/0x50
May 12 11:56:39 31.172.31.251 [829681.042469]  [<ffffffff8151669e>] ? xen_do_hypervisor_callback+0x1e/0x30
May 12 11:56:39 31.172.31.251 [829681.042499]  <EOI>
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.042506]
May 12 11:56:39 31.172.31.251  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
May 12 11:56:39 31.172.31.251 [829681.042561]  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
May 12 11:56:39 31.172.31.251 [829681.042592]  [<ffffffff81009e7c>] ? xen_safe_halt+0xc/0x20
May 12 11:56:39 31.172.31.251 [829681.042627]  [<ffffffff8101c8c9>] ? default_idle+0x19/0xb0
May 12 11:56:39 31.172.31.251 [829681.042666]  [<ffffffff810a83e0>] ? cpu_startup_entry+0x340/0x400
May 12 11:56:39 31.172.31.251 [829681.042705]  [<ffffffff81903076>] ? start_kernel+0x497/0x4a2
May 12 11:56:39 31.172.31.251 [829681.042735]  [<ffffffff81902a04>] ? set_init_arg+0x4e/0x4e
May 12 11:56:39 31.172.31.251 [829681.042767]  [<ffffffff81904f69>] ? xen_start_kernel+0x569/0x573
May 12 11:56:39 31.172.31.251 [829681.042797] Code:
May 12 11:56:39 31.172.31.251  <f3>
May 12 11:56:39 31.172.31.251 
May 12 11:56:39 31.172.31.251 [829681.043113] RIP
May 12 11:56:39 31.172.31.251  [<ffffffff812b7e56>] memcpy+0x6/0x110
May 12 11:56:39 31.172.31.251 [829681.043145]  RSP <ffff880280e03a58>
May 12 11:56:39 31.172.31.251 [829681.043170] CR2: ffff88006cd1f000
May 12 11:56:39 31.172.31.251 [829681.043488] ---[ end trace 1838cb62fe32daad ]---
May 12 11:56:39 31.172.31.251 [829681.048905] Kernel panic - not syncing: Fatal exception in interrupt
May 12 11:56:39 31.172.31.251 [829681.048978] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)

I'm not that good at reading this kind of output, but to me it seems that ndisc_send_redirect is at fault. Googling for "ndisc_send_redirect" turns up a patch on lkml.org and Debian bug #804079; both seem to be related to IPv6.
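To make a trace like the one above easier to read, the frames can be pulled out with a small script. This is only a rough sketch: the regular expression, the `parse_frames` helper and the `REGS` dictionary are my own illustration, not part of any kernel tooling, and the sample lines and register values are copied from the log above. Frames marked with "?" are addresses the kernel found on the stack without being able to verify them, so they are hints rather than proof.

```python
import re

# A few lines copied from the netconsole output above.
SAMPLE = """\
May 12 11:56:39 31.172.31.251  [<ffffffff812b7e56>] memcpy+0x6/0x110
May 12 11:56:39 31.172.31.251  [<ffffffff814d38ff>] ? ndisc_send_redirect+0x3bf/0x410
May 12 11:56:39 31.172.31.251 [829681.041548]  [<ffffffff814bc74c>] ? ip6_forward+0x71c/0x850
May 12 11:56:39 31.172.31.251 [829681.041821]  [<ffffffffa058c757>] ? br_nf_pre_routing_finish_ipv6+0xc7/0x160 [bridge]
May 12 11:56:39 31.172.31.251 [829681.042251]  [<ffffffffa063f50f>] ? xenvif_tx_action+0x49f/0x920 [xen_netback]
"""

# Hypothetical pattern for "[<address>] [?] symbol+0xoffset/0xsize [module]".
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(text):
    """Return one dict per stack frame found in the given log text."""
    frames = []
    for line in text.splitlines():
        match = FRAME.search(line)
        if match:
            frames.append({
                "symbol": match.group("sym"),
                "offset": int(match.group("off"), 16),
                "module": match.group("module") or "kernel",
                "reliable": match.group("unsure") is None,
            })
    return frames


frames = parse_frames(SAMPLE)
for frame in frames:
    marker = " " if frame["reliable"] else "?"
    print(f'{marker} {frame["symbol"]}+{frame["offset"]:#x} ({frame["module"]})')

# Register values copied from the dump. On x86-64, memcpy reads from RSI
# and CR2 holds the address that caused the page fault.
REGS = {"RSI": 0xffff88006cd1f000, "CR2": 0xffff88006cd1f000}
fault_on_source = REGS["CR2"] == REGS["RSI"]
fault_page_aligned = REGS["CR2"] % 4096 == 0
print("fault on memcpy source:", fault_on_source)
print("fault address page-aligned:", fault_page_aligned)
```

In this dump the faulting address equals the memcpy source pointer and sits exactly on a page boundary. That would be consistent with the copy reading past the end of its source buffer, but this is only my reading of the registers, not a confirmed diagnosis.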

When looking at the Linux kernel source mentioned in the lkml patch, I see that this patch is already applied (line 1510):

        if (ha) 
                ndisc_fill_addr_option(buff, ND_OPT_TARGET_LL_ADDR, ha);

So although the patch was intended to prevent a problem "leading to data corruption or in the worst case a panic when the skb_put failed", it does not help in my case or in the case of #804079.

Any tips are appreciated!

PS: I'll contribute to that bug in the BTS, of course!

Attachment: syslog-xen-crash.txt (24.27 KB)

Avoiding Gated Communities with Diaspora, Friendica and others

At the Chaos Communication Congress 32c3 in Hamburg last year, there was a talk by Katharina Nocun named "A New Kid on the Block - Conditions for a Successful Market Entry of Decentralized Social Networks". The short abstract is this: 

The leading social networks are the powerful new gatekeepers of the digital age. Proprietary de facto standards of the dominant companies have led to the emergence of virtual “information silos” that can barely communicate with one another. Has Diaspora really lost the war? Or is there still a chance to succeed?

Maybe some of you attended that talk or have already seen the recording. For those who haven't, here it is for your convenience: 

It's all about Social Networks and Gated Communities vs. open communities, with Facebook on the Gated Community side and Diaspora as an example on the other, open side.

At timecode 17:20 Katharina mentions that the Top 10 Diaspora pods have more than half a million users. But when you look more closely at the statistics from the-federation.info, you can spot a different picture, one that most likely holds for Facebook's marketing statistics as well: there is a difference between total users and currently active users. While the total user count indeed easily surpasses the half-million mark, it's a totally different story for the number of users active in the last month: 15488 active users versus 546783 total users of the Top 10 Diaspora sites. That's only 2.83% active users. A quite awful retention rate.
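For anyone who wants to redo the calculation with fresher numbers from the-federation.info, here is a minimal sketch of the arithmetic. The variable names are my own; the two figures are the ones quoted above.

```python
# Figures for the Top 10 Diaspora pods as quoted above.
active_users = 15488   # users active in the last month
total_users = 546783   # all registered users

active_share = active_users / total_users
inactive_users = total_users - active_users

print(f"active share: {active_share:.2%}")   # active share: 2.83%
print(f"inactive users: {inactive_users}")   # inactive users: 531295
```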

Many users are just quick lurkers who come passing by, look at Diaspora (and other alternative networks), get a quick login for a first try-out and never come back after a few days. I can confirm this from my own Friendica node at Nerdica.net, where I currently have a total of 13 users: 7 users never posted any content, 1 user has already been automatically set to expired because of this, and 8 users never came back after their first day of registration.

Therefore I cannot agree with Katharina's conclusion that Diaspora "is not dead, it's pretty alive". All these alternative Social Networks are pretty much dead or, to put it in more friendly words, are alive in a rather small niche or in small communities such as data/privacy-aware people.

Am I happy about this?

No, definitely not, because I am one of these data/privacy-aware activists. I'm no big fan of monolithic and centralized networks like Facebook. I'm an enthusiastic advocate of self-hosting and of decentralized platforms and communication protocols, such as XMPP.

So, what can be done about Gated Communities like Facebook? Are you still on Facebook because most of your family and friends are over there and not on Diaspora/Friendica? Are you still using Skype instead of XMPP? Why are you doing this? I'm really interested in this, because I don't understand it.

PS: please watch the video in full length! Katharina has some other good points as well! :)
