July 2008

No wanna-build access since DSA-1571 for m68k

Dear lazyweb,

it's well known that PostgreSQL is not multi-threaded, but runs multiple instances that can run on different CPUs/cores.
We recently bought a dual quad-core system as our new database server, and everything is neat and fine as long as I'm testing the database locally via socket or TCP on either IP. In that case the database scales across all 8 cores:

postgres@dhcp-140:~$ /usr/lib/postgresql/8.3/bin/pgbench -h -c 8 -t 30000 testdb

Cpu0 : 24.1%us, 3.1%sy, 0.0%ni, 70.9%id, 0.0%wa, 0.0%hi, 1.9%si, 0.0%st
Cpu1 : 24.5%us, 4.8%sy, 0.0%ni, 69.0%id, 0.0%wa, 0.0%hi, 1.8%si, 0.0%st
Cpu2 : 17.7%us, 3.2%sy, 0.0%ni, 77.3%id, 0.0%wa, 0.0%hi, 1.9%si, 0.0%st
Cpu3 : 13.5%us, 2.3%sy, 0.0%ni, 83.0%id, 0.0%wa, 0.0%hi, 1.3%si, 0.0%st
Cpu4 : 22.0%us, 5.3%sy, 0.0%ni, 70.9%id, 0.0%wa, 0.0%hi, 1.9%si, 0.0%st
Cpu5 : 23.5%us, 4.2%sy, 0.0%ni, 70.0%id, 0.0%wa, 0.0%hi, 2.3%si, 0.0%st
Cpu6 : 28.3%us, 4.3%sy, 0.0%ni, 63.7%id, 0.0%wa, 0.0%hi, 3.7%si, 0.0%st
Cpu7 : 23.9%us, 4.3%sy, 0.0%ni, 69.1%id, 0.0%wa, 0.0%hi, 2.8%si, 0.0%st

3551 postgres 20 0 98.2m 18m 16m R 11 1 0.2 0:05.30 postgres
3553 postgres 20 0 98.2m 18m 16m R 11 3 0.2 0:05.38 postgres
3546 postgres 20 0 98.2m 18m 16m S 7 7 0.2 0:05.52 postgres
3547 postgres 20 0 98.2m 18m 16m S 7 0 0.2 0:04.22 postgres
3548 postgres 20 0 98.2m 18m 16m S 7 2 0.2 0:04.62 postgres
3549 postgres 20 0 98.2m 18m 16m S 7 4 0.2 0:04.92 postgres
3550 postgres 20 0 98.2m 18m 16m S 7 0 0.2 0:05.84 postgres
3552 postgres 20 0 98.2m 18m 16m S 5 6 0.2 0:04.96 postgres

Column P shows the processor each process last ran on. As you can see, postgres runs on basically all cores. But when I try to access the database from a remote host, something strange happens:

Cpu0 : 22.5%us, 3.7%sy, 0.0%ni, 72.8%id, 0.0%wa, 0.0%hi, 0.9%si, 0.0%st
Cpu1 : 17.8%us, 2.8%sy, 0.0%ni, 79.1%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu2 : 5.6%us, 1.3%sy, 0.0%ni, 92.7%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu3 : 18.7%us, 3.2%sy, 0.0%ni, 76.5%id, 0.0%wa, 0.0%hi, 1.6%si, 0.0%st
Cpu4 : 1.0%us, 0.3%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu5 : 27.0%us, 2.5%sy, 0.0%ni, 69.8%id, 0.0%wa, 0.0%hi, 0.6%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 34.1%us, 3.7%sy, 0.0%ni, 57.0%id, 0.0%wa, 0.6%hi, 4.6%si, 0.0%st

3580 postgres 20 0 98.2m 18m 16m S 13 7 0.2 0:03.56 postgres
3582 postgres 20 0 98.2m 18m 16m S 12 1 0.2 0:03.54 postgres
3586 postgres 20 0 98.2m 18m 16m S 11 7 0.2 0:03.34 postgres
3587 postgres 20 0 98.2m 18m 16m S 11 5 0.2 0:03.36 postgres
3581 postgres 20 0 98.2m 18m 16m S 11 5 0.2 0:03.46 postgres
3583 postgres 20 0 98.2m 18m 16m S 11 7 0.2 0:03.40 postgres
3584 postgres 20 0 98.2m 18m 16m R 11 3 0.2 0:03.36 postgres
3585 postgres 20 0 98.2m 18m 16m S 11 1 0.2 0:03.34 postgres

Postgres runs only on the cores with an odd CPU ID, that is, cores 1, 3, 5 and 7. This is reproducible and doesn't change when I connect more than one remote client or raise the -c parameter. I'm running Lenny:

dhcp-140:~# dpkg -l | grep postgres
ii postgresql-8.3 8.3.3-1 object-relational SQL database, version 8.3
ii postgresql-client-8.3 8.3.3-1 front-end programs for PostgreSQL 8.3
ii postgresql-client-common 88 manager for multiple PostgreSQL client versi
ii postgresql-common 88 PostgreSQL database-cluster manager
ii postgresql-contrib-8.3 8.3.3-1 additional facilities for PostgreSQL

So, dear lazyweb, if you have some tips to debug or solve the issue, please comment! I don't know yet whether this is a bug and, if so, whether it's in Postgres or maybe in the kernel (2.6.25-2-amd64).
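
To check the odd-core observation without eyeballing top, the P column can be tallied with a few lines of Python. This is only a sketch: it assumes the 13-column top layout shown above (PID USER PR NI VIRT RES SHR S %CPU P %MEM TIME+ COMMAND), and the helper name `cores_used` is made up for illustration. The input lines are the ones from the remote run above.

```python
# Sketch: collect the distinct CPU ids from the "P" column of top's process lines.
# Assumption: the 13-column layout from this post, where "P" is the 10th field.
P_COLUMN = 9  # zero-based index of "P" in that layout


def cores_used(top_lines):
    """Return the sorted list of distinct CPU ids found in the P column."""
    cores = set()
    for line in top_lines:
        fields = line.split()
        if len(fields) <= P_COLUMN:
            continue  # skip blank or truncated lines
        cores.add(int(fields[P_COLUMN]))
    return sorted(cores)


# Process lines copied from the remote pgbench run shown above.
remote_run = [
    "3580 postgres 20 0 98.2m 18m 16m S 13 7 0.2 0:03.56 postgres",
    "3582 postgres 20 0 98.2m 18m 16m S 12 1 0.2 0:03.54 postgres",
    "3586 postgres 20 0 98.2m 18m 16m S 11 7 0.2 0:03.34 postgres",
    "3587 postgres 20 0 98.2m 18m 16m S 11 5 0.2 0:03.36 postgres",
    "3581 postgres 20 0 98.2m 18m 16m S 11 5 0.2 0:03.46 postgres",
    "3583 postgres 20 0 98.2m 18m 16m S 11 7 0.2 0:03.40 postgres",
    "3584 postgres 20 0 98.2m 18m 16m R 11 3 0.2 0:03.36 postgres",
    "3585 postgres 20 0 98.2m 18m 16m S 11 1 0.2 0:03.34 postgres",
]

print(cores_used(remote_run))  # prints [1, 3, 5, 7]
```

A single top snapshot only shows where each process ran last, so it is worth repeating this over several samples before drawing conclusions.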


Google StreetView in Berlin

Most people will know that it's hard for the m68k port to keep up with unstable. This is mostly because the hardware is not the fastest anymore, but usually we could work around this problem by throwing more hardware at it. Sometimes, though, there are non-m68k problems that prevent the port from keeping up, like, let's say, no wanna-build access anymore, because all ssh pubkeys have been revoked due to DSA-1571 (OpenSSL).
The problem was mentioned on the debian-68k mailing list, starting a discussion about the implications for m68k.
Some weeks after the incident, the problem still existed. There was more discussion about how to proceed, and another attempt was made to get the keys added again. Luckily, there was some progress.

But that didn't last long, and finally there was yet another attempt to get wanna-build working again for m68k.
As of this writing, there hasn't been any success in getting wanna-build access back. Two months after the DSA-1571 incident, m68k is still not allowed to build packages from wanna-build, because somebody didn't feel like adding some ssh pubkeys.

Maybe that person is overloaded with other work or the like, but this means that this person puts extra workload on other people, who have been scheduling packages on multiple buildds for about 2 months now, although the porters involved have other work to do than acting as a human wanna-build.

I'm very disappointed by those who have failed to update the wanna-build ACLs in a timely manner for this long. And I'm very thankful to everyone who tried to help, and especially to Stephen Marenka, who acted as a human wanna-build in the meantime!


Font-Rendering: to file or not to fle!

Within the last few days, Spiegel Online (German) reported that Google is taking pictures in Berlin for their StreetView service. Today I spotted from my balcony this one:

[Images: Google StreetView car; Google StreetView car (zoomed)]

So, if you see these cars driving around, it might be a good idea to hide to avoid being photographed and put onto Google StreetView. Yes, I'm no big fan of Google and its giant data collection. I would rather see Google stop StreetView in Germany to protect people's privacy.


Xen and NFS performance

When I started reading today, I realized that Russell Coker was speaking of an unknown tool to print the UUID, but that the Debian version doesn't seem to do that.
[Image: broken font rendering]
It's called fle, apparently.
Sadly, apt-cache showed that there's no package named fle. Then I realized that he was actually speaking of the good old tool called file, with an I between F and L!
Really, I hope the font rendering will improve in Lenny, as this was on Etch... *sigh*


UPS - it's always nice to have one!

Today I discovered that one of my domUs at work is performing slowly on its mounted NFS share. Bonnie++ and dd tests showed a network throughput of just 300 kB/s, whereas the throughput from dom0 to the NFS server was up to 110 MB/s. Some Google searches revealed that Xen has problems with NFS performance with non-standard rsize and wsize settings, and especially with NFS over UDP.
Reducing the rsize and wsize settings didn't help at all. The performance was still awful. After remounting the NFS share via TCP, the performance was as expected. Such a huge difference in performance surprised me. Still, a strange bug in Xen, at least in Etch. Maybe it's already fixed in Sid?
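
For a quick before/after comparison when switching the mount from UDP to TCP, a rough dd-style write test can be sketched in Python. This is not what was used above (that was Bonnie++ and dd) and it is no replacement for either. The function name `write_throughput_mb_s` and the default sizes are assumptions chosen for illustration. The example runs against the local temp directory; to test the share, pass the NFS mount point instead.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, total_bytes=8 * 1024 * 1024, block_size=1024 * 1024):
    """Write total_bytes of zeros to a temp file in `directory` and return MB/s.

    The file is fsync'ed before the clock stops, so the result is less
    distorted by the page cache. The temp file is removed afterwards.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            written = 0
            while written < total_bytes:
                f.write(block)
                written += block_size
            f.flush()
            os.fsync(f.fileno())
        elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    finally:
        os.remove(path)
    return written / elapsed / 1e6


# Runs against the local temp directory; replace it with the NFS mount point to test the share.
print(f"{write_throughput_mb_s(tempfile.gettempdir()):.1f} MB/s")
```

Running it once on the domU and once on dom0 against the same share gives two numbers that can be compared in the same way as the 300 kB/s and 110 MB/s figures above.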

