September 25, 2017

Brain dump – what fascinates me

By Holger "zecke" Freyther

A small brain dump of topics that currently fascinate me. These are mostly pointers, and maybe they are interesting to follow.

Books/Reading:

My Kobo ebook reader has the Site Reliability Engineering book on it and I am now mostly done. It is kind of a revelation and explains my interest in writing code but also in operating infrastructure (like struggling with ruby, rmagick, nginx…). I have been interested in backends since… well, ever. The first time I noticed it was when we talked about Kolab at LinuxTag and I was more interested in the backend than in the KDE client. At sysmocom we built an IoT product and the backend was quite some fun, especially the scale of one instance serving many devices/users, capacity planning and disk commissioning, and lossless upgrades.

That interest can also be seen in my non-FOSS SS7 MAP work on traffic masquerading and transparent rewriting. It also makes clear which parts of engineering are needed for scale (instead of just installing and restarting servers).

Lang VM design

One technology that made Java fast (HotSpot) and has found its way into JavaScript is dynamic optimization. Most just-in-time compilers start by generating native code per method, either directly or after the first couple of calls once the method's size is significant enough. The VM records which call paths are hot and which types are used, and can then generate optimized code (e.g. specialized for integers, with type checks removed). A technique pioneered at Sun for the "Self" language (then implemented for Strongtalk and later brought to Java) is "adaptive optimization and deoptimization", the PhD topic of Urs Hoelzle (Google's VP of Engineering). One of the key aspects is inlining across method boundaries, as this removes method look-up and call-stack handling and opens the way for code optimization across method boundaries (at the cost of RAM usage).

In OpenJDK, V8 and JavaScriptCore this adaptive optimization is typically implemented in C++ and requires quite some code. The code is complicated, as it needs to optimize but also to fall back to the baseline code (deoptimize, e.g. if a method changed or the types passed don't match anymore), possibly in the middle of a for loop with tons of inlined code (think of Array.map being inlined and then needing to be de-inlined). A nice and long JSC blog post describing On-Stack Replacement (OSR) can be found here.

Long introduction, and now to the new thing. In the OpenSmalltalk VM a new approach called Sista has been picked, and I find it genius. As with many problems, the point of view and the approach really matter. Instead of writing a lot of code in the VM, the optimizer runs next to the application code. The key parts seem to be:

  • Using branches taken/not-taken as an indicator of how hot a path is. The overhead of counting these seems to be lower than counting method calls/instructions/loops.
  • Using the Inline Caches for type information on call sites (is that mono-, poly- or megamorphic?)
  • Optimizing from one set of bytecodes to another set of bytecodes.

The revelation is the last part. By just optimizing from bytecode to bytecode, the VM remains in charge of creating and managing machine code. The next part is that tooling in the higher-level language is better, or at least the roundtrip is quicker (edit code and just compile the new method instead of running make, c++, ld). The productivity, thanks to the abstraction and tooling, is likely higher.
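
To make the branch-counting idea concrete, here is a toy sketch (not Sista itself; the opcodes, the threshold and the trivial interpreter are invented for illustration) of an interpreter that profiles taken/not-taken branches and hands hot code to a bytecode-to-bytecode optimizer:

HOT = 1000

def identity_optimizer(code, profile):
    # a real optimizer would use the profile to inline, specialize
    # and re-emit bytecode here; this stub only proves the hook fires
    print("hot branch profile:", profile)
    return code

def execute(code, env, optimize=identity_optimizer):
    profile, pc = {}, 0
    while True:
        op, arg = code[pc]
        if op == "halt":
            return env
        elif op == "dec":
            env[arg] -= 1
            pc += 1
        elif op == "jump":
            pc = arg
        elif op == "branch_if_pos":
            taken = env["n"] > 0
            profile[(pc, taken)] = profile.get((pc, taken), 0) + 1
            if profile[(pc, taken)] == HOT:      # hot path found:
                code = optimize(code, profile)   # bytecode in, bytecode out
            pc = arg if taken else pc + 1

# count n down to zero; the loop branch gets hot along the way
execute([("branch_if_pos", 2), ("halt", None), ("dec", "n"), ("jump", 0)],
        {"n": 5000})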

Finally, OSR is easier as well. In Smalltalk, thisContext (the current stack frame, the activation record) is an object too. At the right point (when the JIT has either written variables back from registers to the stack or at least knows where each value lives), one can just manipulate thisContext, create and link new ones, and then resume execution without all the magic other VMs need.
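
Python gives a faint echo of that idea: activation records are reachable as objects there too, though only for inspection, not for the relinking and resumption Smalltalk allows. A minimal illustration:

import inspect

def snapshot():
    # the caller's activation record, reified as an object –
    # a read-only cousin of Smalltalk's thisContext
    frame = inspect.currentframe().f_back
    return frame.f_code.co_name, dict(frame.f_locals)

def demo():
    x = 42
    print(snapshot())    # prints ('demo', {'x': 42})

demo()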

Go, Go and escape analysis

Ken Thompson and Rob Pike are well-known persons, and their Go programming language is a very interesting systems programming language. As with all new languages, I try to get real-world experience with the language, the tooling, and the kinds of problems that can be solved with it. I have debugged and patched some bigger projects and written two small applications with it.

There is plenty I like. The escape analysis of the compiler is fun (especially now that I know it was translated from the Plan 9 C compiler from C to Go), the concurrency model is good (though it allows shared state), the module system makes sense (but makes forking harder than necessary), and being able to cross-compile to any target from any system is great.

Knowing a bit of Erlang (and continuing to read Joe Armstrong's PhD thesis) and being a heavy Smalltalk user, I find plenty of things missing. It starts with vague runtime error messages (e.g. panicslice not having parameters) and extends to runtime and post-runtime inspection. In Smalltalk, thanks to the abstraction, a lot of hard things are easy, and I would have wished for some of them in Go. Serialize all unrecovered panics? Debugging someone else's code feels like pre-1980…

So for many developers Go is a big improvement, but to some people with a wider view it might look like a lost opportunity. That, however, can only be felt by developers who have experienced higher abstraction and productivity.

 

Unsupervised machine learning

but that is for another dump…

September 02, 2017

Purism Librem 5 campaign

By Harald "LaF0rge" Welte

There's a new project currently undergoing crowd funding that might be of interest to the former Openmoko community: The Purism Librem 5 campaign.

Similar to Openmoko a decade ago, they are aiming to build a FOSS smartphone based on GNU/Linux without any proprietary drivers/blobs on the application processor, from bootloader to userspace.

Furthermore (just like Openmoko) the baseband processor is fully isolated, with no shared memory and with the Linux-running application processor being in full control.

They go beyond what we wanted to do at Openmoko in offering hardware kill switches for camera/phone/baseband/bluetooth. During Openmoko days we assumed it was sufficient to simply control all those bits from the trusted Linux domain, but of course once that domain is compromised, a physical kill switch provides a completely different level of security.

I wish them all the best, and hope they can leave a better track record than Openmoko. Sure, we sold some thousands of phones, but the company quickly died, and the state of software was far from end-user-ready. I think the primary obstacles/complexities are verification of the hardware design as well as the software stack all the way up to the UI.

The budget of ~ 1.5 million seems extremely tight from my point of view, but then I have no information about how much Puri.sm is able to invest from other sources outside of the campaign.

If you're a FOSS developer with a strong interest in a Free/Open privacy-first smartphone, please note that they have several job openings, from Kernel Developer to OS Developer to UI Developer. I'd love to see some talents at work in that area.

It's a bit of a pity that almost all of the actual technical details are unspecified at this point (except RAM/flash/main-cpu). No details on the cellular modem/chipset used, no details on the camera, neither on the bluetooth chipset, wifi chipset, etc. This might be an indication of the early stage of their planning. I would have expected that one has ironed out those questions before looking for funding - but then, it's their campaign and they can run it as they see fit!

I for my part have just put in a pledge for one phone. Let's see what will come of it. In case you feel motivated by this post to join in: Please keep in mind that any crowdfunding campaign bears significant financial risks. So please make sure you made up your mind and don't blame my blog post for luring you into spending money :)

September 01, 2017

The sad state of voice support in cellular modems

By Harald "LaF0rge" Welte

Cellular modems have existed for decades and come in many shapes and kinds. They contain the cellular baseband processor, RF frontend, protocol stack software and anything else required to communicate with a cellular network. Basically a phone without display or input.

During the last decade or so, the vast majority of cellular modems come as LGA modules, i.e. a small PCB with all components on the top side (and a shielding can), which has contact pads on the bottom so you can solder it onto your mainboard. You can obtain them from vendors such as Sierra Wireless, u-blox, Quectel, ZTE, Huawei, Telit, Gemalto, and many others.

In most cases, the vendors now also solder those modules to small adapter boards to offer the same product in mPCIe form-factor. Other modems are directly manufactured in mPCIe or NGFF aka m.2 form-factor.

As long as those modems were still 2G / 2.5G / 2.75G, the main interconnection with the host (often some embedded system) was a serial UART. The audio input/output for voice calls was made available as analog signals, ready to connect a microphone and speaker, as that's what the cellular chipsets were designed for in smartphones. In the Openmoko phones we also interfaced the audio of the cellular modem in analog, exactly for that reason.

From 3G onwards, the primary interface towards the host is now USB, with the modem running as a USB device. If your laptop contains a cellular modem, you will see it show up in the lsusb output.

From that point onwards, it would have made a lot of sense to simply expose the audio also via USB. Simply offer a multi-function USB device that has both the virtual serial ports for AT commands and a network device for IP, and add a USB audio device to it. It would simply show up as a "USB sound card" to the host, with all standard drivers working as expected. Sadly, nobody seems to have implemented this, at least not in a supported production version of their product.

Instead, what some modem vendors have implemented as an ugly hack is the transport of 8 kHz 16-bit PCM samples over one of the UARTs. See for example the Quectel UC-20 or the Simcom SIM7100, which implement such a method.
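
Assuming such a modem has already been switched into its vendor-specific voice-over-UART mode (via whatever AT command the vendor documents), capturing the stream is then trivial; here's a sketch using pyserial, where the device node and baud rate are assumptions:

import wave
import serial  # pyserial

port = serial.Serial("/dev/ttyUSB1", 921600, timeout=1)  # node/baud: assumptions
with wave.open("call.wav", "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 16-bit samples
    wav.setframerate(8000)   # 8 kHz
    for _ in range(500):     # capture roughly 10 seconds
        wav.writeframes(port.read(320))  # 320 bytes = 20 ms of audio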

All the others largely ignore software access to the audio stream. One wonders why that is; from a software and systems architecture perspective it would be super easy. Instead, what most vendors do is expose a digital PCM interface. This is suboptimal in many ways:

  • there is no mPCIe standard on which pins PCM should be exposed
  • no standard product (like laptop, router, ...) with mPCIe slot will have anything connected to those PCM pins

Furthermore, each manufacturer / modem seems to support a different subset or dialect of the PCM interface (see the sketch after this list) in terms of

  • voltage (almost all of them are 1.8V, while mPCIe signals normally are 3.3V logic level)
  • master/slave (almost all of them insist on being a clock master)
  • sample format (alaw/ulaw/linear)
  • clock/bit rate (mostly 2.048 MHz, but can be as low as 128kHz)
  • frame sync (mostly short frame sync that ends before the first bit of the sample)
  • endianness (mostly MSB first)
  • clock phase (mostly change signals at rising edge; sample at falling edge)
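A sketch of that configuration matrix as a data structure; the field names and defaults are mine, with the defaults mirroring the "mostly" values from the list above:

from dataclasses import dataclass
from enum import Enum

class SampleFormat(Enum):
    ALAW = "alaw"
    ULAW = "ulaw"
    LINEAR = "linear"

@dataclass
class PcmDialect:
    io_voltage: float = 1.8            # volts; mPCIe logic is normally 3.3 V
    clock_master: bool = True          # almost all modems insist on mastering
    sample_format: SampleFormat = SampleFormat.LINEAR
    bit_clock_hz: int = 2_048_000      # can be as low as 128 kHz
    short_frame_sync: bool = True      # sync ends before the first data bit
    msb_first: bool = True
    drive_on_rising_edge: bool = True  # sample on the falling edge

# every modem then needs its own instance, e.g. a hypothetical slow part:
some_modem = PcmDialect(bit_clock_hz=128_000)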

It's a real nightmare when it could be so simple. If they implemented USB audio, you could plug a cellular modem into any board with an mPCIe slot and it would simply work. As they don't, you need a specially designed mainboard that implements exactly the specific dialect/version of PCM of the given modem.

By the way, the most "amazing" vendor seems to be u-blox. Their modems support PCM audio, but only in the solder-type version: they simply didn't route those signals to the mPCIe slot, making audio impossible to use with a connectorized modem. How inconvenient.

Summary

If you want to access the audio signals of a cellular modem from software, then you either

  • have standard hardware, pick one very specific modem model, and hope it remains available sufficiently long for your application, or
  • build your own hardware implementing a PCM slave interface and then pick + choose your cellular modem

On the Osmocom mpcie-breakout board and the sysmocom QMOD board we have exposed the PCM-related pins on 2.54mm headers to allow for a separate board to pick up that PCM and offer it to the host system. However, such a separate board hasn't been developed so far.

First actual XMOS / XCORE project

By Harald "LaF0rge" Welte

For many years I've been fascinated by the XMOS XCore architecture. It offers a surprisingly refreshing alternative to virtually any other classic microcontroller architecture out there. However, despite reading a lot about it years ago, being fascinated by it, and even giving a short informal presentation about it once, I've so far never used it. Too much "real" work imposes a high barrier to spending time learning about new architectures, languages, toolchains and the like.

Introduction into XCore

Rather than having lots of fixed-purpose built-in "hard core" peripherals for interfaces such as SPI, I2C, I2S, etc., the XCore controllers have a combination of:

  • I/O ports for 1/4/8/16/32 bit wide signals, with SERDES, FIFO, hardware strobe generation, etc
  • Clock blocks for using/dividing internal or external clocks
  • hardware multi-threading that presents 8 logical threads on each core
  • xCONNECT links that can be used to connect multiple processors over 2 or 5 wires per direction
  • channels as a means of communication (similar to sockets) between threads, whether on the same xCORE or a remote core via xCONNECT
  • an extended C (xC) programming language to make use of parallelism, channels and the I/O ports

In spirit, it is like a 21st century implementation of some of the concepts established first with Transputers.
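Not xC, but the communication pattern may feel familiar from other CSP-inspired environments. As a loose software analogy (with none of the XCore timing guarantees), two Python threads talking over a queue behave a bit like two hardware threads joined by a channel:

import threading
import queue

chan = queue.Queue()          # stands in for an xC channel

def producer():
    for value in range(4):
        chan.put(value)       # like an output on one channel end
    chan.put(None)            # end-of-stream marker

threading.Thread(target=producer).start()
while (value := chan.get()) is not None:   # like an input on the other end
    print("received", value)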

My main interest in XMOS has been the flexibility you get when implementing not-so-standard electronics interfaces. For regular I2C, UART, SPI, etc. there is of course no such need. But every so often one encounters an interface that's very rarely found (like the output of an E1/T1 Line Interface Unit).

Also, quite often I run into use cases where it's simply impossible to find a microcontroller with a sufficient number of the related peripherals built-in. Try finding a microcontroller with 8 UARTs, for example. Or one with four different PCM/I2S interfaces, which all can run in different clock domains.

The existing options for solving such problems basically boil down to either implementing it in hard-wired logic (unrealistic, complex, expensive) or going to programmable logic with CPLDs or FPGAs. While the latter is certainly also quite interesting, the learning curve is steep, the tools are anything but easy to use, and the synthesis times (and thus development cycles) are long. Furthermore, your board design becomes more complex, as you have both that FPGA/CPLD and a microcontroller and need to interface the two, etc. (yes, in high-end use cases there's the Zynq, but I'm thinking of designs several orders of magnitude less complex).

Of course one can also take a "pure software" approach and go for high-speed bit-banging. There are some ARM SoCs that can toggle their pins fast enough; people have reported rates like 14 MHz being possible on a Raspberry Pi. However, when running a general-purpose OS in parallel, this kind of speed is hard to sustain reliably over the long term, and the related software implementations are going to be anything but nice to write.

So the XCore looks like a nice alternative for a lot of those use cases where you want a microcontroller with more programmability in terms of its I/O capabilities, but don't want to go full-on into FPGA/CPLD development in Verilog or VHDL.

My current use case

My current use case is to implement a board that can accept four independent PCM inputs (all in slave mode, i.e. clock provided by external master) and present them via USB to a host PC. The final goal is to have a board that can be combined with the sysmoQMOD and which can interface the PCM audio of four cellular modems concurrently.

While XMOS is quite strong in the Audio field and you can find existing examples and app notes for I2S and S/PDIF, I couldn't find any existing code for a PCM slave of the given requirements (short frame sync, 8kHz sample rate, 16bit samples, 2.048 MHz bit clock, MSB first).
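Those numbers fix the framing completely; a quick back-of-the-envelope check (the assumption that the 16 data bits sit right after the frame sync is mine, but it matches the usual short-frame-sync layout):

BIT_CLOCK_HZ = 2_048_000
FRAME_RATE_HZ = 8_000
BITS_PER_SAMPLE = 16

slots_per_frame = BIT_CLOCK_HZ // FRAME_RATE_HZ   # 256 bit-slots per frame
idle_slots = slots_per_frame - BITS_PER_SAMPLE    # 240 slots carry nothing
print(slots_per_frame, idle_slots)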

I wanted to get a feeling for how well one can implement the related PCM slave. In order to test the slave, I decided to develop the matching PCM master and run the two against each other. Despite never having written any code for XMOS before, nor having used any of the toolchain, I was able to implement the PCM master and PCM slave within something like ~6 hours, including simulation and verification. Sure, one can certainly do that in much less time once familiar with the tools, programming environment, language, etc., but I think it's not bad.

The biggest problem was that the clock phase for a clocked output port cannot be configured, i.e. the XCore insists on always clocking out a new bit at the falling edge, while my use case of course required the opposite: clocking out new signals at the rising edge. I had to use a second clock block to generate an inverted clock in order to achieve that goal.

Beyond that 4xPCM use case, I also have other ideas like finally putting the osmo-e1-xcvr to use by combining it with an XMOS device to build a portable E1-to-USB adapter. I have no clue if and when I'll find time for that, but if somebody wants to join in: Let me know!

The good parts

Documentation excellent

I found the various pieces of documentation extremely useful and very well written.

Fast progress

I was able to make fast progress in solving the first task using the XMOS / Xcore approach.

Soft Cores developed in public, with commit log

You can find plenty of soft cores that XMOS has been developing on github at https://github.com/xcore, including the full commit history.

This type of development is a big improvement over what most vendors of smaller microcontrollers like Atmel are doing (infrequent tar-ball code-drops without commit history). And in the case of the classic uC vendors, we're talking about drivers only. In the XMOS case it's about the entire logic of the peripheral!

You can for example see that for their I2C core, the very active commit history goes back to January 2011.

xSIM simulation extremely helpful

The xTIMEcomposer IDE (based on Eclipse) contains extensive tracing support and an extensible, near-cycle-accurate simulator (xSIM). I've implemented a PCM master and a PCM slave in xC and was able to simulate the program while looking at the waveforms of the logic signals between the two.

The bad parts

Unfortunately, my extremely enthusiastic reception of XMOS has suffered quite a bit over time. Let me explain why:

Hard to get XCore chips

While the product portfolio on the XMOS website looks extremely comprehensive, the vast majority of the parts is not available from stock at distributors. You won't even get samples, and lead times are 12 weeks (!). If you check at Digikey, they have listed a total of 302 different XMOS controllers, but only 35 of them are in stock, of which 15 are USB-capable. With other distributors like Farnell it's even worse.

I've seen this with other semiconductor vendors before, but never to such a large extent. Sure, some packages/configurations are not standard products, but having only 11% of the portfolio actually available is pretty bad.

In such situations, where it's difficult to convince distributors to stock parts, it would be a good idea for XMOS to stock parts themselves and provide samples / low quantities directly. Not everyone is able to order large trays and/or able to wait 12 weeks, especially during the R&D phase of a board.

Extremely limited number of single-bit ports

In the smaller / lower-pin-count parts, like the XU[F]-208 series in QFN/LQFP-64, the number of usable, exposed single-bit ports is ridiculously low. Out of the 33 I/O lines available in total, only 7 can be used as single-bit I/O ports. All other lines can only be used for 4-, 8-, or 16-bit ports. If you're dealing primarily with serial interfaces like I2C, SPI, I2S, UART/USART and the like, those parallel ports are of no use, and you have to go for a mechanically much larger part (like the XU[F]-216 in TQFP-128) in order to have a decent number of single-bit ports exposed. Those parts also come with twice the number of cores, memory, etc. – which you don't need for slow-speed serial interfaces...

Change to a non-FOSS License

XMOS deserved a lot of praise for releasing all their soft IP cores as Free / Open Source Software on github at https://github.com/xcore. The License has basically been a 3-clause BSD license. This was a good move, as it meant that anyone could create derivative versions, whether proprietary or FOSS, and there would be virtually no license incompatibilities with whatever code people wanted to write.

However, to my very big disappointment, XMOS seems to have recently changed its policy on this. New soft cores (released at https://github.com/xmos, as opposed to the old https://github.com/xcore) are made available under a non-free license. This license is nothing like the BSD 3-clause license or any other Free Software or Open Source license: it restricts use of the code to XMOS products, requires the user to contribute fixes back to XMOS, and contains references to import and export control. This license is incompatible with probably any FOSS license in existence, making it impossible to write FOSS code for XMOS while using any of the new soft cores released by XMOS.

But even beyond that license change, not even all code is provided in source code format anymore. The new USB library (lib_usb) is provided as binary-only library, for example.

If you know anyone at XMOS management or XMOS legal with whom I could raise this topic of license change when transitioning from older sc_* software to later lib_* code, I would appreciate this a lot.

Proprietary Compiler

While a lot of the toolchain and IDE is based on open source (Eclipse, LLVM, ...), the actual xC compiler is proprietary.

August 19, 2017

Osmocom jenkins test suite execution

By Harald "LaF0rge" Welte

Automatic Testing in Osmocom

So far, in many Osmocom projects we have unit tests next to the code. Those unit tests exercise individual C functions, typically calling the respective function directly from a small test program executed at make check time. The actual main program (like OsmoBSC or OsmoBTS) is not executed at that time.

We also have VTY testing, which specifically tests that the VTY has proper documentation for all nodes of all commands.

Then there's a big gap, and we have osmo-gsm-tester for testing a full cellular network end-to-end. It includes physical GSM modems, a coaxial distribution network, attenuators, splitters/combiners, real BTS hardware and the logic to run the full network, from OsmoBTS to the core - both for OsmoNITB and OsmoMSC+OsmoHLR based networks.

However, I think a lot of testing falls somewhere in between, where you want to run the program under test (e.g. OsmoBSC), but you don't want to run the MS, BTS and MSC that normally surround it. You want to test it by emulating the BTS on the Abis side and the MSC on the A side, and just test Abis and A interface transactions.

For this kind of testing, I have recently started to investigate available options and tools.

OsmoSTP (M3UA/SUA)

Several months ago, during the development of OsmoSTP, I discovered that the Network Programming Lab of Münster University of Applied Sciences, led by Michael Tuexen, had released implementations of the ETSI test suites for the M3UA and SUA members of the SIGTRAN protocol family.

The somewhat difficult part is that they are implemented in Scheme, using the Guile interpreter/compiler, together with a C-language execution wrapper, which in turn is called by another Guile wrapper script.

I've reimplemented the test executor in Python and added JUnitXML output to it. This means it can feed the test results directly into Jenkins.
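
For the curious: JUnitXML is a very small format, which is what makes this integration cheap. A minimal sketch of the output side (not the actual executor) using only the Python standard library; the test names shown are hypothetical:

import xml.etree.ElementTree as ET

def junit_xml(suite_name, results):
    """results: list of (test_name, error_message_or_None) tuples."""
    suite = ET.Element("testsuite", name=suite_name,
                       tests=str(len(results)),
                       failures=str(sum(1 for _, err in results if err)))
    for name, err in results:
        case = ET.SubElement(suite, "testcase", name=name)
        if err:
            ET.SubElement(case, "failure", message=err)
    return ET.tostring(suite, encoding="unicode")

# shape of the data Jenkins consumes
print(junit_xml("m3ua", [("asp_up_down", None), ("beat_timeout", "timeout")]))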

I've also cleaned up the Dockerfiles and related image generation for the osmo-stp-master, m3ua-test and sua-test images, as well as some scripts to actually execute them on one of the builders. You can find the related Dockerfiles as well as associated Makefiles in http://git.osmocom.org/docker-playground

The end result after integration with the Osmocom Jenkins can be seen in the following examples on jenkins.osmocom.org, for M3UA and for SUA.

Triggering the builds is currently periodic once per night, but we could of course also trigger them automatically at some later point.

OpenGGSN (GTP)

For OpenGGSN, during the development of IPv6 PDP context support, I wrote some test infrastructure and test cases in TTCN-3. Those test cases can be found at http://git.osmocom.org/osmo-ttcn3-hacks/tree/ggsn_tests

I've also packaged the GGSN and the test cases into separate Docker containers called osmo-ggsn-latest and ggsn-test. Related Dockerfiles and Makefiles can again be found in http://git.osmocom.org/docker-playground - together with an Eclipse TITAN Docker base image using Debian Stretch, called debian-stretch-titan

Using those TTCN-3 test cases with the TITAN JUnitXML logger plugin we can again integrate the results directly into Jenkins, whose results you can see at https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-ggsn-test/14/testReport/(root)/GGSN_Tests/

Further Work

I've built some infrastructure for Gb (NS/BSSGP), VirtualUm and other testing, but have yet to build Docker images and related Jenkins integration for it. Stay tuned. Also, lots more actual test cases are required; I'm very much looking forward to any contributions.

August 18, 2017

Creating a chroot for CentOS 7.3

By Holger "zecke" Freyther

I have recently written some RPM spec files (and to be honest, it feels nicer than creating Debian packages) and could test installing the resulting packages on a cloud-based CentOS 6.8 VM. After that worked, I wanted to test the package on a CentOS 7 system as well. To my surprise, my cloud platform didn’t have CentOS 7 images. There was RHEL7 with extra computing costs, and several CentOS images with extra packages (or “hardening”) that also incurred extra cost.

Being a Debian user for many, many years, I thought of using something like debootstrap. I initially remembered something called yumbootstrap, but the packages/Google hits I found didn’t provide much. I mostly followed another guide and will write down some differences.

$ mkdir -p chroot/var/lib/rpm
$ rpm --rebuilddb --root=$PWD/chroot
$ rpm -i --root=$PWD/chroot --nodeps centos-release-7-3.1611.el7.centos.x86_64.rpm
$ wget -O /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 http://mirror.centos.org/centos/7/os/x86_64/RPM-GPG-KEY-CentOS-Testing-7

# Create base7 repo
$ echo "
[base7]
name=CentOS7
baseurl=http://mirror.centos.org/centos/7/os/x86_64/
gpgcheck=1
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7" > /etc/yum.repos.d/CentOs7.repo

$ yum --disablerepo=\* --enablerepo=base7 --installroot=$PWD/chroot --noplugins install -y rpm-build yum

At that point one can chroot into the new directory; these steps were enough for me. I am running this on a CentOS 6.8 system, so some binaries might fail with the older kernel, but I haven’t run into such an issue yet.

August 08, 2017

IPv6 User Plane support in Osmocom

By Harald "LaF0rge" Welte

Preface

Cellular systems ever since GPRS have used a tunnel-based architecture to provide IP connectivity to cellular terminals such as phones, modems, and M2M/IoT devices. The MS/UE establishes a PDP context between itself and the GGSN at the other end of the cellular network. The GGSN is then the first IP-level router, and the entire cellular network is abstracted away from the user-IP point of view.

This architecture didn't change with EGPRS or UMTS/HSxPA, and even survived conceptually in LTE/4G.

While the concept of a PDP context / tunnel exists to de-couple the transport layer from the structure and type of the tunneled data, the primary user plane so far has been IPv4.

In Osmocom, we made sure that there are no impairments / assumptions about the contents of the tunnel, so OsmoPCU and OsmoSGSN do not care at all what bits and bytes are transmitted in the tunnel.

The only Osmocom component dealing with the type of tunnel and its payload structure is OpenGGSN. The GGSN must allocate the address/prefix assigned to each individual MS/UE, perform routing between the external IP network and the cellular network and hence is at the heart of this. Sadly, OpenGGSN was an abandoned project for many years until Osmocom adopted it, and it only implemented IPv4.

This is actually a big surprise to me. Many of the users of the Osmocom stack are from the IT security area. They use the Osmocom stack to test mobile phones for vulnerabilities, analyze mobile malware and the like. As any penetration tester should be interested in analyzing the whole attack surface exposed by a given device-under-test, I would have assumed that testing just IPv4 would be insufficient, and that over the past 9 years somebody would have come around and implemented the missing bits for IPv6, so they could test on IPv6, too.

In reality, it seems nobody shared that line of thinking and invested a bit of time in growing the tools used. Or if they did, they didn't share the related code.

In June 2017, Gerrie Roos submitted a patch for OpenGGSN IPv6 support that raised hopes about soon being able to close that gap. However, at closer sight it turned out that the code was written against a more than 7-year-old version of OpenGGSN, and that it primarily focused on IPv6 on the outer (transport) layer, rather than on the inner (user) layer.

OpenGGSN IPv6 PDP Context Support

So in July 2017, I started to work on IPv6 PDP support in OpenGGSN.

Initially I thought: How hard can it be? It's not like IPv6 is new to me (I joined the 6bone under 3ffe prefixes back in the 1990s and worked on IPv6 support in ip6tables ages ago). And aside from allocating/matching longer addresses, what kind of complexity does one expect?

After my initial attempt at an implementation, partially misled by the patch that was contributed against that 2010-or-older version of OpenGGSN, I'm surprised how wrong I was.

In IPv4 PDP contexts, the process of establishing a PDP context is simple:

  • Request establishment of a PDP context, set the type to IETF IPv4
  • Receive an allocated IPv4 End User Address
  • Optionally use IPCP (part of PPP) to request and receive DNS server IP addresses

So I implemented the identical approach for IPv6: maintain a pool of IPv6 addresses, allocate one, and use IPCP for DNS. And nothing worked.

  • IPv6 PDP contexts assign a /64 prefix, not a single address or a smaller prefix
  • The End User Address that's part of the signalling plane of Layer 3 Session Management and GTP is not the actual address; it just serves to generate the interface identifier portion of a link-local IPv6 address (see the sketch after this list)
  • IPv6 stateless autoconfiguration is used with this link-local IPv6 address inside the User Plane, after the control-plane signaling to establish the PDP context has completed. This means the GGSN needs to parse ICMPv6 router solicitations and generate ICMPv6 router advertisements.
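
So the EUA only seeds the link-local address. A sketch of that derivation as I understand the behaviour described above: take the 8-byte interface identifier signalled in the EUA and prepend the fe80::/64 prefix:

import ipaddress

def link_local_from_iid(iid: bytes) -> ipaddress.IPv6Address:
    assert len(iid) == 8                  # interface identifier from the EUA
    return ipaddress.IPv6Address(b"\xfe\x80" + b"\x00" * 6 + iid)

print(link_local_from_iid(bytes.fromhex("0000000000000001")))  # fe80::1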

To make things worse, the stateless autoconfiguration is modified in some subtle ways that make it different from the normal SLAAC used on Ethernet and other media (a sketch of the resulting router advertisement follows this list):

  • the timers / lifetimes are different
  • only one prefix is permitted
  • only a prefix length of 64 is permitted
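
To illustrate what the GGSN has to emit, here is a sketch of such a router advertisement with a single /64 prefix information option, packed per the RFC 4861 layout (the lifetime values are placeholders, and the ICMPv6 checksum is left at zero since it is computed over a pseudo-header later):

import struct
import ipaddress

def build_ra(prefix: str, valid: int = 0xFFFFFFFF, preferred: int = 0xFFFFFFFF) -> bytes:
    net = ipaddress.IPv6Network(prefix)
    assert net.prefixlen == 64                 # the only length permitted here
    header = struct.pack("!BBHBBHII",
                         134, 0, 0,            # type=RA, code, checksum (0 for now)
                         64, 0, 0xFFFF,        # hop limit, flags, router lifetime
                         0, 0)                 # reachable time, retrans timer
    pio = struct.pack("!BBBBIII",
                      3, 4,                    # option: prefix info, len = 4*8 bytes
                      net.prefixlen, 0xC0,     # prefix length, L+A flags set
                      valid, preferred, 0)     # lifetimes, reserved
    return header + pio + net.network_address.packed

ra = build_ra("2001:db8:1:2::/64")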

A few days later I had implemented all of that, but it still didn't work. The problem was with DNS server addresses. In IPv4, the 3GPP protocols simply tunnel IPCP frames for this. This makes a lot of sense, as IPCP is designed for point-to-point interfaces, and that's exactly what a PDP context is.

In IPv6, the corresponding IPV6CP protocol does not have the capability to provision DNS server addresses to a PPP client. WTF? The IETF seriously requires implementations to do DHCPv6 over PPP, after establishing a point-to-point connection, only to get DNS server information?!? Some people suggested an IETF draft to change this, but the draft expired in 2011 and we're still stuck.

While 3GPP permits the use of DHCPv6 in some scenarios, support for it in phones/modems is not mandatory. Rather, 3GPP has come up with its own mechanism for communicating DNS server IPv6 addresses during PDP context activation: the use of containers as part of the PCO Information Element used in L3-SM and GTP (see Section 10.5.6.3 of 3GPP TS 24.008). By the way, they also specified the same mechanism for IPv4, so there are now two competing methods for provisioning IPv4 DNS server information: IPCP and the new method.
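
A sketch of what such a container looks like on the wire, from my reading of TS 24.008: a config-protocol octet followed by (ID, length, payload) containers. The container ID 0x0003 for "DNS Server IPv6 Address" and the leading octet are to the best of my understanding; verify against the spec before relying on this:

import struct
import ipaddress

DNS_SERVER_IPV6 = 0x0003   # PCO container ID, network-to-MS direction

def pco_with_dns_v6(dns: str) -> bytes:
    addr = ipaddress.IPv6Address(dns).packed                  # 16 bytes
    container = struct.pack("!HB", DNS_SERVER_IPV6, len(addr)) + addr
    return b"\x80" + container   # 0x80: extension bit set, protocol = PPP

pco = pco_with_dns_v6("2001:db8::53")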

In any case, after some more hacking, OpenGGSN can now also provide DNS server information to the MS/UE. And once that was implemented, I had actual live user IPv6 data over a full Osmocom cellular stack!

Summary

We now have working IPv6 user IP in OpenGGSN. Together with the rest of the Osmocom stack you can operate a private GPRS, EGPRS, UMTS or HSPA network that provides end-to-end transparent, routed IPv6 connectivity to mobile devices.

All in all, it took much longer than needed, and the following questions remain in my mind:

  • why did the IETF not specify IP6CP capabilities to configure DNS servers?
  • why the complex two-stage address configuration with PDP EUA allocation for the link-local address first and then stateless autoconfiguration?
  • why don’t we simply allocate the entire prefix via the End User Address information element on the signaling plane? Surely, next to the 16-byte address we could have put one byte for the prefix length?
  • why do I see duplicate-address-detection style neighbour solicitations from Qualcomm based phones on what is a point-to-point link with exactly two devices: the UE and the GGSN?
  • why do I see link-layer source address options inside the ICMPv6 neighbor and router solicitation from mobile phones, when that option is specifically not to be used on point-to-point links?
  • why is the smallest prefix that can be allocated a /64? That's such a waste for a point-to-point link with a single device on the other end, and in times of billions of connected IoT devices it will just encourage the use of non-public IPv6 space (i.e. SNAT/MASQUERADING) while wasting large parts of the address space

Some of those choices would have made sense if one had made it fully compatible with normal IPv6, like e.g. on Ethernet. But implementing ICMPv6 router and neighbor solicitation without getting any benefit, such as the ability to have multiple prefixes or prefixes of different lengths – I just don't understand why anyone ever thought this was a good idea. You can find the code at http://git.osmocom.org/openggsn/log/?h=laforge/ipv6 and the related ticket at https://osmocom.org/issues/2418

July 24, 2017

OpenMoko: 10 Years After (Mickey’s Story)

By Michael "mickeyl" Lauer

For the 10th anniversary since the legendary OpenMoko announcement at the „Open Source in Mobile“ (7th of November 2006 in Amsterdam), I’ve been meaning to write an anthology or – as Paul Fertser suggested on #openmoko-cdevel – an obituary. I’ve been thinking about objectively describing the motivation, the momentum, how it all began and – sadly – ended. I did even plan to include interviews with Sean, Harald, Werner, and some of the other veterans. But as with oh so many projects of (too) wide scope this would probably never be completed.

As November 2016 passed without any progress, I decided to do something different instead. Something way more limited in scope, but something I can actually finish. My subjective view of the project, my participation, and what I think is left behind: My story, as OpenMoko employee #2. On top of that you will see a bunch of previously unreleased photos (bear with me, I’m not a good photographer and the camera sucked as well).

Prehistoric

I’ve always been a programmer. Well… not always, but for a quite some time. I got into computer science when my dad brought a Commodore PET from work home for some days. That was around 1980/1981, with me being 8 years old and massively impressed by the green text scrolling down on a black monitor, depending on what you typed on the keyboard. He showed me how to do small programs in BASIC. It was cool.

Commodore PET

Unfortunately he had to bring it back after a few weeks, but I was already infected and begged for a computer of my own. In 1982 – a few months before his sudden and completely unexpected death – he bought me a Commodore 64, which opened up a whole world for me. I learned to program in BASIC, SYSed, POKEed and PEEKed my way through the hardware registers, and had way more fun than with any other toy I possessed.

Naturally, not all of that was programming. The Commodore 64 was an excellent gaming machine, in particular due to the massive amount of cough „free“ games available. The first game I actually bought was The Hobbit, a text adventure with the occasional graphic here and there. Rendering an image back then took minutes, and you could watch the computer painting line by line, and sometimes pixel by pixel. But I was patient. I was young and time seemed basically unlimited.

Commodore C64

Fast forward to 1985, when the Commodore 64 suddenly seemed somewhat obsolete as the AMIGA was announced. This machine looked so much more powerful that it found its way into my dreams… until that one great day in 1986 when my mother surprised me with lots of white boxes, labelled Commodore AMIGA, in the hallway.

Apart from the natural amount of gaming (which was even more fun on the AMIGA), I learned Motorola 68K assembler and became part of the early demo scene – here’s more about my AMIGA history, if you’re interested.

Back in those days, it was completely normal for a demo to shut down the operating system (OS) and take over the whole hardware. Usually you had to reboot after quitting the demo. I did some early experiments with working on my own OS. Alas, my knowledge back then was not sufficient to really make one, though I quite liked the idea of being able to hack every part of a system.

Commodore AMIGA

This was one thing I started missing when I migrated to a shock, horror DOS/Windows machine in the early 90s. I found my way doing C++, MS Foundation Classes, the Win32 API, and the lot, but it never felt the same as in the good ‚ole days of the AMIGA. (A bit of that came back when, a few years later, I installed Linux on a PC and got to know a lot of different UNIX flavors as part of my computer science studies.)

After completing my diploma thesis, I was asked by a professor (Prof. Drobnik at the institute of telematics, whom I deeply admire and thank for mentoring me!) whether I was interested in pursuing an academic career. I felt my computer science knowledge wasn’t complete enough for the „world outside“ yet, hence I agreed to work on a Ph.D. in his department.

Since by then I had quite a bit of Linux experience, one of my first tasks was to help a colleague flash Linux onto his COMPAQ IPAQ. His work involved routing algorithms and handover strategies, so they had a WiFi test network with a bunch of laptops and PDAs equipped with 802.11b PCMCIA cards. Since he ran into lots of problems with the locked-down Windows Mobile on the IPAQ, he wanted to give Linux a try.

So I learned about Familiar Linux, got into GPE and Opie, became maintainer of Opie, helped with OpenZaurus (which was an open source distribution for the soon-to-appear-on-the-stage SHARP ZAURUS), co-founded OpenEmbedded, and eventually received my Ph.D. for µMiddle, a component-based adaptive middleware for ad-hoc networks.

The birth of a project

Some months before my graduation though, in mid-2006, I was working on OpenEZX, an alternative Linux distribution for Motorola’s EZX series of Linux phones. Smartphones had just begun to render PDAs obsolete and were the new hot thing. I was adding EZX support to OpenEmbedded and Opie when Linux hacker Harald ‚LaF0rge‘ Welte (who worked on OpenEZX kernel drivers) asked me one day whether I wanted to work on a new Linux-based smartphone project with a completely open distribution right from the start – as opposed to OpenEZX, which was based on a lot of reverse engineering due to the closed nature of that hardware platform (Motorola promised a proper EZX SDK for almost a decade, but never delivered…).

MOTOROLA A780

The mastermind behind the project was Sean Moss-Pultz, an American living in Taiwan, working as a product designer for First International Computer (FIC).

Naturally I was excited and started to work on it. I was supposed to be responsible for the Linux distribution build system aspects (e.g. OpenEmbedded integration) and some UI tasks, in particular creating something the Chinese engineers could use to base their applications on. It had already been decided that we would base the display subsystem on X11 and the UI on GTK+.

While I wasn’t happy with that decision, at that point in time I was not strong enough to question and discuss it. In hindsight I view this as my earliest mistake in the project (unfortunately not the last one, though).

While doing the first work on some GObjects, Sean Moss-Pultz came over to Germany and we met for one week to design the basic human interface guidelines & streamline his interface mockups. We decided that we wanted to distinguish two basic types of applications, so-called „finger apps“ and „stylus apps“. Here are some of the results of this phase. Note that these were designed before the actual dimensions of the device (hence the display) were finalized. Working on HIGs without being able to create actual paper prototypes (to check finger distances and gesture dimensions) must have been my 2nd mistake.

Dialer mockups: dialing, popup, in-call, history, and further dialing variants

While (in my opinion) these mockups look beautiful – even by today’s standards – from the viewpoint of a developer who needs to implement them, they’re just crazy: non-rectangular widgets, semi-transparency, shadows, gradients, etc. everywhere. Have fun doing that with GTK+ 2.6 in 2006. I’m not sure it’s even possible today with version 4 – let alone the necessary hardware requirements for blending and compositing.

Alas, I tried my best to come up with a UI framework which came close to the renderings, to give our Shanghai team (which was supposed to create the actual applications) the necessary tools. I even created a bunch of demo applications. Here’s a guitar toolkit application I programmed using an ARM development board:

ChordMaster stylus app

Speaking about development boards… as I’ve mentioned before, a lot of the early UI concepts and prototyping code were done while the final device specifics and capabilities (and the housing!) were still unknown. This led to a series of expectations which were nowhere near met by the actual hardware.

Back then, my idea of the ideal software stack for Openmoko looked like this:

Openmoko 2007 software stack

If you’re curious about the code for the Openmoko Application Framework libraries, feel free to browse the Openmoko SVN.

Here is the first successful run of kdrive and matchbox on the 3rd development hardware revision of the Neo1973. The picture was taken on the 4th of November 2006 on my IKEA desk. Next to the PCB you can see a glimpse of a SHARP ZAURUS stylus 😉

X on GTA01

The Announcement

In November 2006, the OpenMoko Core Team (which at that time consisted of Sean Moss-Pultz, Harald Welte, and me) flew to Amsterdam where Sean made the legendary announcement of the Neo1973 as „mystery guest speaker“ on the „Open Source in Mobile“ conference. More details about that (including the slides of the presentation) can be found at the linuxdevices article „Cheap, hackable Linux smartphone due soon“.

After the announcement we received a lot of publicity. The mailing lists were flooded with many great ideas (many of them still waiting to be realized). That one small company had the balls to create something hackable from the start, against the ongoing trend of locking down mobile embedded devices, was very well received. Many interesting leads were made: universities and labs contacted us, hardware vendors approached us, etc.

I remember one great quote I picked up from Harald during a presentation that pretty much summed up our approach:

WARRANTY VOID WHEN NOT OPENED – Harald Welte

However, we had yet to deliver, and when the first hardware prototypes reached me, I was devastated. Things didn’t look good. The device was tiny, much slower than I had expected (the PXA270 in the MOTOROLA EZX series ran circles around our S3C2410), the resistive touchscreen needed too much pressure, and the display was framed by a massive bezel:

Neo1973 prototypes

We were already struggling on many software layers (in particular to get anywhere near 80% of the mockups with GTK+), but the hardware constraints killed most of our early HIG ideas.

At this point, we brought the London-based OpenedHand (later acquired by Intel) on board to work on the launcher and the PIM applications. They already had the Matchbox window manager and a set of applications based around an embedded port of the Evolution Data Server running on ARM, and they were quite experienced with GTK+. They came up with a massively reduced version of our mockups, but at least something that worked on the Neo1973.

To handle various low-level device aspects (buttons, power management), I wrote neod.

Phase 0

One of the OpenMoko special features was the so-called phase 0, in which we sent out a dozen Neo1973 pre-versions (for free, without any obligations) to well-known people with a history in open source. The idea was to get some early feedback, perhaps even some contributions, and to have those people spread the news, hence acting as a kind of multiplier.

On the 14th of February 2007, the OpenMoko.org website went live, and with it all our source code and the tools necessary to build the current flash images. The mailing lists and IRC channels were populated as well. For most of its lifetime, OpenMoko was really run much more like an open source project (with all implications, good and bad) than like a proprietary corporate project.

On the 25th of February 2007, we shipped the phase 0 developer devices – and even though we didn’t get as much feedback as we would have liked, we felt it was good practice to do this.

The phase 0 developer devices shipped with Openmoko 2007.1, a pretty bare-bones Linux distribution based on an OpenEmbedded matchbox+kdrive image with the Openmoko GTK+ theme and some rudimentary applications.

Neo1973 is shipping

We originally planned to ship the Neo1973 in January 2007, but both hardware and software issues made us postpone shipping until July 2007. By then we had decided that we would need a faster design to target the general audience, so the Neo1973 was repurposed as a developer’s device and the Openmoko (by then someone had decided that the capital ‚M‘ had to be dropped) „Freerunner“ was announced for early 2008. Apart from a faster display subsystem, it was supposed to add WiFi (which was missing in the Neo1973) and LED buttons, and to swap the proprietary GPS chip for something that spoke NMEA. Unfortunately it had also been decided that there could be no change to the casing, hence we had to continue living with the bezel and the physical display size.

In June 2007 – shortly before the official launch of the Neo1973 – I had the opportunity to join Harald in visiting the OpenMoko office in Taipei, Taiwan. This is a picture from Computex 2007, where OpenMoko presented the Neo1973 with its slogan „Free Your Phone“:

Openmoko at Computex 2007

This is our room in the FIC building, where Harald and I worked on the Neo1973 phase 1 (general availability through the webshop) release code:

Hacking in Taipeh 2007

The Hacker’s Lunchbox, an enhanced version of the Neo1973 release package with an additional battery, debug board, and some more goodies:

Hacking in Taipeh 2007

If for nothing else, this first visit to Taipei alone made joining the project worth it. I had never been further away from home, yet still felt so welcome.

Back in Germany, work continued – with a bunch of hackathons and presentations on conferences. Here’s the OpenMoko tent from Chaos Communication Camp 2007:

Chaos Communication Camp 2007

Although we were hired by OpenMoko to make sure the software we wrote worked great on their devices, we always tried to do our work in the most generic way possible. We still had a sweet spot for OpenEZX and tried to make the OpenMoko software work there:

OpenMoko on A780

On the 15th of September 2007, all the produced Neo1973s were sold out. Since we had already announced the Freerunner for 2008, we had nothing to sell for almost a whole year – and this exactly in the phase where Google and Apple had the opportunity to sell lots of devices. This was another major fault in the project.

Another OpenMoko speciality was the release of the CAD data. On the 4th of March 2008, we released the full CAD data to allow for 3rd party cases.

freesmartphone.org

By the time the Neo1973 was in the hands of developers, I was getting more and more frustrated with the non-UI part of the system: gsmd was unstable, there was still only my prototype neod to handle low-level device specifics, and I felt there could be much more experimentation if only there were a solid and uniform set of APIs available. I asked Sean whether I could switch my focus to that and he agreed, allowing me to work with my colleagues in Braunschweig – Stefan Schmidt, Daniel Willmann, and Jan Lübbe – as an independent unit.

By that time, D-Bus was spreading more and more and looked like a good choice for inter-process communication, hence I decided to specify a set of D-Bus APIs that would allow people to come up with all kinds of cool applications using whatever language they wanted. Since I did not want to tie it to the Openmoko devices, I created the FSO project (in analogy to the freedesktop.org project, which did great work in standardizing desktop APIs before).

If there is anything I learned from the TCP/IP vs. ISO/OSI debate of the 1980s, it is that APIs without a reference implementation are worthless, and that it is often more important to create things that work (de facto) than to have huge committees negotiate (de jure) standards. Which meant I had to also come up with code. In order to get things up and running quickly (and to allow hacking directly on the device), I chose one of my favorite languages, Python. The choice of an interpreted (comparatively slow) language on an already slow device may sound odd, but back then I wanted to give people something to build on as fast as possible, hence development efficiency seemed way more important than runtime efficiency.

Within only a few months, we created a working set of APIs and a Python-based reference implementation that finally allowed us to a) make dozens of calls in a row without the modem hanging up, and b) build a slick EFL-based proof-of-concept-and-testing UI to evaluate our APIs (eating our own dogfood was very important to us).

Here’s the server code for an SMS echo service, which illustrates how simple it was to access the telephony parts:

SMS Echo Service
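
(The original post embedded the code as a screenshot. From memory it boiled down to roughly the following Python; the D-Bus bus name, object path, interface and signal/method names are written down as I remember the FSO API and should be treated as assumptions:)

import dbus
from dbus.mainloop.glib import DBusGMainLoop
from gi.repository import GLib

DBusGMainLoop(set_as_default=True)
bus = dbus.SystemBus()
device = bus.get_object("org.freesmartphone.ogsmd",
                        "/org/freesmartphone/GSM/Device")
sms = dbus.Interface(device, "org.freesmartphone.GSM.SMS")

def on_incoming_message(sender, contents, properties):
    sms.SendTextMessage(sender, contents, False)   # echo it straight back

bus.add_signal_receiver(on_incoming_message,
                        dbus_interface="org.freesmartphone.GSM.SMS",
                        signal_name="IncomingTextMessage")
GLib.MainLoop().run()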

For a while, I did regular status updates (e.g. Status Update 5) to keep the community informed about how the framework project went on.

Some of this work was motivated by a very energetic group led by Michael ‚Emdete‘ Dietrich, which I dubbed the „Openmoko Underground“ effort. They showed me that a) Python was indeed fast enough to handle things like AT, NMEA, and echoing characters to sysfs files, and they also convinced me to give the Enlightenment Foundation Libraries a go.

Freerunner is shipping

By summer 2008, we finally shipped the Freerunner. In the meantime, we had been joined by Carsten ‚Rasterman‘ Heitzler, who worked on optimizing the Enlightenment (Foundation Libraries) for embedded platforms. Unfortunately the plan to make the Freerunner a much faster version of the Neo1973 didn’t quite work out: the SMEDIA gfx chip which was supposed to be a display accelerator turned out to be an actual decelerator. The chip could (under pressure) handle VGA, but ran much better with QVGA – hence its framebuffer interface was even slower than the Neo1973’s.

And there were more hardware problems. As the frameworks and APIs increased in stability, they exposed more and more – partly long-standing – issues with the hardware design, in particular the infamous bug #1024 (which yours truly originally found). This bug had the effect of the TI Calypso GSM chip losing network connection when in deep sleep mode. Although we later found a workaround, it massively damaged the reputation of the Openmoko devices, since it made them lose calls.

In September 2008, I visited Taipei again, joined by the FSO team plus Harald and Carsten, to present the current status, fix bugs, and teach the local engineers how to best use the framework APIs. For two weeks, we lived in an apartment rented by Openmoko and coded almost 24/7:

Taipeh 2008

Here’s a view of the Openmoko office plus the room where we experimented with different case color & material combinations:

Openmoko Office 2008

Trials and Tribulations

2008 was one of the most intense years in the project’s lifetime. With more Freerunner devices in the wild, a lot of people (finally) started building alternatives to the „official“ Openmoko distribution. As always, this was both good and bad. While I personally was satisfied that people tried out different ideas on all layers of the system – after all, this kind of freedom was one of the very reasons why I had joined the project and created FSO in the first place – the multitude of possibilities scared many non-technical, but interested, buyers away. (The kind of people who don’t want to actively work on open source projects, but want to support them – in the same way as choosing Firefox over IE, or OpenOffice over Word.)

Here is a bunch of screenshots taken from a presentation I gave at FOSDEM 2010, where we had an Openmoko developer room:

Openmoko Screenshots

Here’s a photo (courtesy Josch) from the first Openmoko user meeting in Karlsruhe, which also shows a glimpse of the Openmoko devices‘ diversity:

Diversity

Freedom of choice truly is a double-edged sword: many people would have preferred one solid software stack over ten half-done ones, each of them lacking in different areas. In- and outside Openmoko, tensions arose over whether to continue with the Enlightenment-based route, switch to a Qtopia-based system, revive the GTK+-based stack, or just stop doing any work above the kernel and move to Android.

Personally, I think we should have limited our software contributions to a kernel, FSO, and a monolithic application covering voice & messages and super-stable day-to-day operation.

Here are some more screenshots of 3rd-party distributions (Thanks, Walter!).

Openmoko Screenshots

One distribution I particularly liked was the one from the SHR project, a community of skilled hackers working closely together with the freesmartphone.org framework team. We held FSOSHRUDCON (the FSO+SHR Users and Developers Conference) in May 2009 at the LinuxHotel in Essen, Germany. Even though it was officially „past-Openmoko“, we had an incredible time there. If anyone of you still has a group photo which shows all of us, please send it to me!

With a much smaller audience, we came back to the LinuxHotel in 2011 for the 2nd – and unfortunately also the last – FSOSHR conference. By the way… this was some weeks after my daughter Lara-Marie was born, and I enjoyed sleeping a bit longer than usual 🙂

Freerunner in Space

One of the most remarkable, nah… frickin‘ coolest things ever done with an Openmoko device was to mount a Freerunner inside a rocket and send it into space. In early 2009, the German space agency (DLR: Deutsches Zentrum für Luft- und Raumfahrt e.V.) carried out an experiment to measure how accelerometers (and other parts of consumer electronics) react to massive changes in velocity. One of the DLR directors, Prof. Dr. Felix Huber, himself wrote the Freerunner software for this experiment. He based it on an FSO image (and also submitted a bunch of patches to FSO and Zhone, but that’s another story)! According to him, the Freerunner was the only smartphone where he trusted application software to have control over the GSM part, hence no other device could be used for this experiment. Images (C) DLR e.V.:

Freerunner in Space

Afterwards, Prof. Huber invited me to the DLR in Oberpfaffenhofen and gave me a tour. It was a great experience which I’ll never forget. If you want to read more about the MAPHEUS experiments, please also see this paper.

GTA03 – Our last Hope for Freedom

After it was clear that the Freerunner was not the silver bullet we had hoped for, plans to design a successor were launched. This one – codenamed GTA03 (and later 3D7K… for a reason I don’t want to disclose) – was supposed to be designed free of any existing legacy we had inherited from FIC’s stock.

GTA03 was supposed to contain a new S3C chip, a Cinterion modem, a capacitive (!) touchscreen, and a slick round design with a semitransparent cutout where you could see a part of the PCB (an elegant self-reference to the project’s transparency & openness). This device would have marked a significant change and put Openmoko into another league.

Here’s a plastic prototype of the device:

Openmoko GTA03 Prototype

Here’s the PCB:

Openmoko GTA03 PCB

Here’s me hacking FSO and OE to incorporate the GTA03 specifics using development boards:

Openmoko GTA03 Hacking

Alas, due to a number of circumstances, the device was cancelled – although it was already 80% done.

Nails in the coffin

At OpenExpo 2009 in Switzerland, Sean announced that Openmoko was quitting smartphone development – to me, at the very point when, framework-wise, things finally started to look good. The 2nd reference implementation of the freesmartphone.org middleware had just begun, and there were many promising side-projects using the FSO API.

Here’s a screenshot of the freesmartphone.org website back then (thanks to the Wayback Machine):

freesmartphone.org

For two more years, I continued to work on FSO in my spare time, trying to find an alternative hardware reference platform to run on, but nothing convincing showed up. All reverse-engineering-based efforts to replace other operating systems with our stack failed due to short hardware lifetimes.

When I dropped out in June 2011 due to the birth of my daughter Lara-Marie, those projects more or less came to a full stop.

It’s not easy to pinpoint exactly what went wrong with the project. I think I can mention a couple of nails in the coffin though:

  • Transparency – if you design hardware in the open, every single bug (that can perhaps be worked around in software) is immediately revealed and talked to death. This scares potential buyers.
  • The financial crisis of 2008 – Openmoko’s venture capital dried up when some of the investors had serious cash problems.
  • The competition – With Apple and Google, two major players entered the scene very soon after our initial announcement. Apple’s iPhone made it tough to compete hardware-wise, and Google’s seemingly open Android drew a lot of people – those who perceived it as open enough – out of our community.
  • Not enough focus, not enough structure – As mentioned before, the basics (phone, messages, power management) were never stable enough to make the devices really work well as your main phone. I’m afraid we wanted too much too fast.

What’s left behind

During 2007 – 2011, I travelled a lot and presented at conferences; if I recall correctly, I’ve been to Aalborg (Denmark), Bern (Switzerland), Berlin, Brussels (Belgium), Birmingham (UK), Chemnitz, St.Augustin, Munich, Paris (France), Porto de Galinhas (Brazil) [Summerville Beach Resort, best conference venue ever], Taipei (Taiwan), Vienna (Austria), and Zürich (Switzerland). It was an incredible time, and I rediscovered a spirit that I had first experienced 20 years ago during the early days of the C64 and AMIGA demo scene. I’m glad to have met so many great people. I learned more about (prototype) hardware than I would have ever wanted.

Over the two years of operation, Openmoko sold ~13000 phones (3000 Neo1973, 10000 Freerunner). Openmoko Inc. grew from 4 people (Sean, Harald, me, Werner) to about 50 at its peak.

Alas, this concludes the positive aspects. I wish we could have left a bigger footprint in history, but as things stand, the mobile soft- and hardware landscape in 2017 is way more closed than it was ten years ago. I had really hoped for the opposite. Many projects we started are now either obsolete or on hiatus, waiting (forever?) for an open hardware platform to run on. However, they are still there and could be revived if there was enough interest.

Two notable active projects are Dr. Nikolaus Schaller’s GTA04 – a completely new design fitting into the Freerunner’s case – and the Neo900 by ex-Openmoko engineer Jörg Reisenweber, which attempts to do the same with the Nokia N900.

Both of these projects share the approach of building upon an existing case and retrofitting most of the innards with a newer PCB (think „second life“). While the Neo900 is still in the conceptual phase (they just announced the next prototype PCB a couple of days ago), the GTA04 has already seen a number of shipping board revisions, and there is still a small, die-hard community following tireless Nikolaus‘ progress to keep the Openmoko spirit alive.

Basic support for the GTA04 has already been added to FSO, and there are people actively working on kernel support as well as porting various userlands like QtMoko and Replicant to it. I really suggest browsing the archives of gta04-owner to understand the incredible number of problems a small series of custom smartphone hardware brings – including component sourcing, CAD programs, fighting against Linux mainline (people who apparently are not really interested in smartphone code), defending a relatively high price, etc. It’s a hell of a ride.

While I applaud all these efforts, something broke inside me when Openmoko shut down – and that’s one of the reasons why I find it pretty hard to motivate myself to work on FSO again. Another one is that Vala – the language of the 2nd reference implementation – had its own set of problems, with parents and maintainers losing interest… although it just recently seems to have found new love.

The Future?

By now, Android and iOS have conquered the mobile world. Blackberry is dead, HP killed WebOS and the Palm Pre, Windows Phone is fading into oblivion and even big players such as Ubuntu or Mozilla are having a hard time coming up with an alternative to platforms that contain millions of apps running on a wide variety of the finest hardware.

IF YOU’RE SERIOUS ABOUT SOFTWARE, YOU HAVE TO DESIGN YOUR OWN HARDWARE – Alan Kay

Right now my main occupation is writing software for Apple’s platforms – and while it’s nice to work on apps using a massive set of luxury frameworks and APIs, you’re locked in and sandboxed within the software layers Apple allows you to use. I’d love to be able to work on an open source Linux-based middleware again.

However, the sad truth is that it looks like there is no business case anymore for a truly open platform based on custom-designed hardware, since people refuse to spend extra money for tweakability, freedom, and security – despite us living in times where privacy is massively endangered.

If anyone out there thinks differently and plans a project, please holler and get me on board!

Acknowledgements

Thanks to you for reading this far! Thanks to Sean Moss-Pultz for his crazy and ambitious idea. Thanks to Harald Welte for getting me on board. Thanks to Daniel, Jan, and Stefan for working with me on FSO. Thanks to the countless organizers and helpers at the conferences where we presented our work. Thanks to Nikolaus and Walter for commenting on an early draft of this. And finally… thanks to all enthusiasts who used a Neo1973 and/or a Freerunner in the past, present, or future.

The post OpenMoko: 10 Years After (Mickey’s Story) first appeared on Vanille.de.

July 18, 2017

Virtual Um interface between OsmoBTS and OsmocomBB

By Harald "LaF0rge" Welte

During the last couple of days, I've been working on completing, cleaning up and merging a Virtual Um interface (i.e. virtual radio layer) between OsmoBTS and OsmocomBB. After I started the implementation and left it at an early stage in January 2016, Sebastian Stumpf completed it around early 2017, with some subsequent fixes and improvements by me. The combined result allows us to run a complete GSM network with 1-N BTSs and 1-M MSs without any actual radio hardware, which is of course excellent for all kinds of testing scenarios.

The Virtual Um layer is based on sending L2 frames (blocks) encapsulated in GSMTAP UDP multicast packets. There are two separate multicast groups, one for uplink and one for downlink. The multicast nature simulates the shared medium and enables any simulated phone to receive the signal from multiple BTSs via the downlink multicast group.

/images/osmocom-virtum.png
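
To make the framing concrete, here is a rough Python sketch (the real implementation is C inside OsmoBTS/OsmocomBB) of what a single downlink block looks like in this scheme: a 16-byte GSMTAP v2 header followed by the L2 payload, sent to a multicast group on the IANA-registered GSMTAP port 4729. The multicast group address below is a placeholder; the real group addresses are documented on the Osmocom Virtual Um wiki page.

import socket
import struct

GSMTAP_PORT = 4729            # IANA-registered GSMTAP UDP port
GSMTAP_VERSION = 0x02
GSMTAP_TYPE_UM = 0x01         # GSM Um air interface

DOWNLINK_GROUP = "239.193.23.1"   # placeholder multicast group

def gsmtap_hdr(timeslot, arfcn, frame_number, sub_type=0):
    # 16-byte GSMTAP v2 header: version, header length (in 32-bit
    # words), type, timeslot, ARFCN, signal dBm, SNR, frame number,
    # sub-type, antenna, sub-slot, reserved
    return struct.pack("!BBBBHbbIBBBB",
                       GSMTAP_VERSION, 4, GSMTAP_TYPE_UM, timeslot,
                       arfcn, 0, 0, frame_number, sub_type, 0, 0, 0)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
payload = b"\x2b" * 23        # dummy 23-byte MAC block
frame = gsmtap_hdr(timeslot=2, arfcn=871, frame_number=42) + payload
sock.sendto(frame, (DOWNLINK_GROUP, GSMTAP_PORT))

A receiver (a virtual MS) would join the downlink group via IP_ADD_MEMBERSHIP and thus see the blocks of every virtual BTS on that group, which is exactly the shared-medium property described above.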

In OsmoBTS, this is implemented via the new osmo-bts-virtual BTS model.

In OsmocomBB, this is realized by adding virtphy, a virtual L1, which speaks the same L1CTL protocol that is used between the real OsmocomBB Layer 1 and the Layer2/3 programs such as mobile and the like.

Now many people would argue that GSM without the radio and actual handsets is no fun. I tend to agree, as I'm a hardware person at heart and I am not a big fan of simulation.

Nevertheless, this forms the basis of all kinds of possibilities for automated (regression) testing, in ways and for layers/interfaces that osmo-gsm-tester cannot cover as it uses a black-box proprietary mobile phone (modem). It is also pretty useful if you're traveling a lot and don't want to carry around a BTS and phones all the time, or want to get some development done in airplanes or other places where operating a radio transmitter is not really a (viable) option.

If you're curious and want to give it a shot, I've put together some setup instructions at the Virtual Um page of the Osmocom Wiki.

July 09, 2017

Ten years after first shipping Openmoko Neo1973

By Harald "LaF0rge" Welte

Exactly 10 years ago, on July 9th, 2007, we started to sell+ship the first Openmoko Neo1973. To be more precise, the webshop actually opened a few hours early, depending on your time zone. Sean announced the availability in this mailing list post.

I don't really have to add much to my ten years [of starting to work on] Openmoko anniversary blog post a year ago, but still thought it's worthwhile to point out the tenth anniversary.

Those were exciting times, and there was a lot of pioneering spirit: Building a Linux based smartphone with a 100% FOSS software stack on the application processor, including all drivers, userland, applications - at a time before Android was known or announced. As history shows, we'd been working in parallel with Apple on the iPhone, and Google on Android. Of course there's little chance that a small Taiwanese company can compete with the endless resources of the big industry giants, and the many Neo1973 delays meant we had missed the window of opportunity to be the first on the market.

It's sad that Openmoko (or similar projects) have not survived even as a special-interest project for FOSS enthusiasts. Today, virtually all smartphone options are encumbered with way more proprietary blobs than we could ever have imagined back then.

In any case, the tenth anniversary of trying to change the amount of Free Software in the smartphone world is worth some celebration. I'm reaching out to old friends and colleagues, and I guess we'll have somewhat of a celebration party both in Germany and in Taiwan (where I'll be for my holidays from mid-September to mid-October).

July 06, 2017

Funding the Osmocom Cellular project

By Holger "zecke" Freyther

My friend and business partner has recently blogged about funding of the Osmocom Cellular Infrastructure Projects, and while I want to write about the history of sysmocom s.f.m.c. GmbH, here I will focus on getting contributions (or, as a replacement, monetary support) for the project.

First of all, I think the existence of Osmocom and Osmocom Cellular has made a significant difference. It is used to provide connectivity to those previously ignored (thank you, everyone involved with Rhizomatica!) and we enabled mobile communication security research. This ranges from breaking ciphering and hijacking calls to easily fuzzing phones, plus the whole set of GSM MAP/CAP hacks, which led to real improvements in security and privacy for end users. We took the black out of the mobile black box and want to continue to do so.

My big question is: how do we sustain such development (beyond personal sacrifice)? How do we get significant contributions to remove more black boxes and extend to 4G and beyond? If getting contributions is difficult, the second best thing seems to be money. This allows us to pay and hire new developers that want to spend their work hours on improving Free Software. So where can these contributions come from?

The research/security community

While OsmocomBB and OpenBSC opened the door for university and corporate researchers to explore networks and offer penetration tests, the project didn’t get much in return. Part of the problem seems to be that for research a sloppy modification is enough, and once the researcher has published their paper, they are too ashamed to release the hack and move on.

Universities and Students

Universities used to buy full GSM BTSs but recently seem more interested in SDR platforms. While an SDR is not a BTS, the promise of running a GSM and LTE network with the same universal radio peripheral is tempting. Fewer BTSs sold means less funding for OpenBSC/osmo-bts, but this could easily be compensated by increased contributions to osmo-bts and osmo-trx by students and university staff. For some reason this is not happening, and I think there are plenty of things to improve!

Vendors using OpenBSC and osmo-bts

In general I would expect that BTS vendors that integrate our software with their hardware would have an interest in the longevity of the project and either buy software support or have their staff maintain and contribute fixes. Sadly it seems that with the current state of the industry not contributing is seen as a commercial advantage…

Research grants

The first time I heard of a Free Software project receiving significant funding was when the PyPy project was initiated. Today there are various funds that support Free Software initiatives (NLnet, Mozilla Grants and more), and last year my proposal to NLnet was selected, so sysmocom could begin work on 3G support in Osmocom. While this is great, the amount of funding is not enough to keep a company focused on removing black boxes from mobile communication going for too long. So more and bigger funds are needed.

I tried to get funds from Opentech, but they didn’t seem to be interested in projects like replacing proprietary Qualcomm components of modules like the EC20/EC25, or building tools for 2G/3G/4G to educate users on the privacy impacts of using cellular technology and to understand how a phone behaves. My first research question would have been to explore what really happens when 2G is disabled in a phone and a network tries to force a downgrade. But the proposal would have enabled much more. The proposals were rejected; maybe my proposal was just bad, maybe there is no interest in financing work on cellular technology (even though most data usage seems to come from mobile devices these days). The rejection doesn’t contain feedback, so it is hard to tell which of the above is more true.

How can you help?

Maybe there is not enough interest and we should focus our time and energy somewhere else, but if you consider our work as important as we do, maybe you can help us? We are looking for:

  • contributions: fix a bug, add a feature, improve existing work and make sure it gets integrated
  • Help us to write project proposals for funds like the Opentech fund…
  • Buy sysmocom hardware?
  • Buy a moral license if your company can/want to do that?
  • Sponsor me (or someone else) and send bitcoin (?)?
  • Propose your idea?

June 28, 2017

Goodbye Mozilla

By Chris Lord

Today is effectively my last day at Mozilla, before I start at Impossible on Monday. I’ve been here for 6 years and a bit and it’s been quite an experience. I think it’s worth reflecting on, so here we go; fair warning, if you have no interest in me or Mozilla, this is going to make pretty boring reading.

I started on June 6th 2011, several months before the (then new, since moved) London office opened. Although my skills lay (lie?) in user interface implementation, I was hired mainly for my graphics and systems knowledge. Mozilla was in the region of 500 or so employees then I think, and it was an interesting time. I’d been working on the code-base for several years prior at Intel, on a headless backend that we used to build a Clutter-based browser for Moblin netbooks. I wasn’t completely unfamiliar with the code-base, but it still took a long time to get to grips with. We’re talking several million lines of code with several years of legacy, in a language I still consider myself to be pretty novice at (C++).

I started on the mobile platform team, and I would consider this to be my most enjoyable time at the company. The mobile platform team was a multi-discipline team that did general low-level platform work for the mobile (Android and Meego) browser. When we started, the browser was based on XUL and was multi-process. Mobile was often the breeding ground for new technologies that would later go on to desktop. It wasn’t long before we started developing a new browser based on a native Android UI, removing XUL and relegating Gecko to page rendering. At the time this felt like a disappointing move. The reason the XUL-based browser wasn’t quite satisfactory was mainly due to performance issues, and as a platform guy, I wanted to see those issues fixed, rather than worked around. In retrospect, this was absolutely the right decision and led to what I’d still consider to be one of Android’s best browsers.

Despite performance issues being one of the major driving forces for making this move, we did a lot of platform work at the time too. As well as being multi-process, the XUL browser had a compositor system for rendering the page, but this wasn’t easily portable. We ended up rewriting this, first almost entirely in Java (which was interesting), then with the rendering part of the compositor in native code. The input handling remained in Java for several years (pretty much until FirefoxOS, where we rewrote that part in native code, then later, switched Android over).

Most of my work during this period was based around improving performance (both perceived and real) and fluidity of the browser. Benoit Girard had written an excellent tiled rendering framework that I polished and got working with mobile. On top of that, I worked on progressive rendering and low precision rendering, which combined are probably the largest body of original work I’ve contributed to the Mozilla code-base. Neither of them are really active in the code-base at the moment, which shows how good a job I didn’t do maintaining them, I suppose.

Although most of my work was graphics-focused on the platform team, I also got to do some layout work. I worked on some over-invalidation issues before Matt Woodrow’s DLBI work landed (which nullified that, but I think that work existed in at least one release). I also worked a lot on fixed position elements staying fixed to the correct positions during scrolling and zooming, another piece of work I was quite proud of (and probably my second-biggest contribution). There was also the opportunity for some UI work, when it intersected with platform. I implemented Firefox for Android’s dynamic toolbar, and made sure it interacted well with fixed position elements (some of this work has unfortunately been undone with the move from the partially Java-based input manager to the native one). During this period, I was also regularly attending and presenting at FOSDEM.

I would consider my time on the mobile platform team a pretty happy and productive time. Unfortunately for me, those of us with graphics specialities on the mobile platform team were taken off that team and put on the graphics team. I think this was the start of a steady decline in my engagement with the company. At the time this move was made, Mozilla was apparently trying to consolidate teams around products, and this was the exact opposite happening. The move was never really explained to me and I know I wasn’t the only one that wasn’t happy about it. The graphics team was very different to the mobile platform team and I didn’t feel I fit in as well. It felt more boisterous and less democratic than the mobile platform team, and as someone that generally shies away from arguments and just wants to get work done, it was hard not to feel sidelined slightly. I was also quite disappointed that people didn’t seem particularly familiar with the graphics work I had already been doing and that I was tasked, at least initially, with working on some very different (and very boring) desktop Linux work, rather than my speciality of mobile.

I think my time on the graphics team was pretty unproductive, with the exception of the work I did on b2g, improving tiled rendering and getting graphics memory-mapped tiles working. This was particularly hard as the interface was basically undocumented, and its implementation details could vary wildly depending on the graphics driver. Though I made a huge contribution to this work, you won’t see me credited in the tree unfortunately. I’m still a little bit sore about that. It wasn’t long after this that I requested to move to the FirefoxOS systems front-end team. I’d been doing some work there already and I’d long wanted to go back to doing UI. It felt like I either needed a dramatic change or I needed to leave. I’m glad I didn’t leave at this point.

Working on FirefoxOS was a blast. We had lots of new, very talented people, a clear and worthwhile mission, and a new code-base to work with. I worked mainly on the home-screen, first with performance improvements, then with added features (app-grouping being the major one), then with a hugely controversial and probably mismanaged (on my part, not my manager – who was excellent) rewrite. The rewrite was good and fixed many of the performance problems of what it was replacing, but unfortunately also removed features, at least initially. Turns out people really liked the app-grouping feature.

I really enjoyed my time working on FirefoxOS, and getting a nice clean break from platform work, but it was always bitter-sweet. Everyone working on the project was very enthusiastic to see it through and do a good job, but it never felt like upper management’s focus was in the correct place. We spent far too much time kowtowing to the desires of phone carriers and trying to copy Android and not nearly enough time on basic features and polish. Up until around v2.0 and maybe even 2.2, the experience of using FirefoxOS was very rough. Unfortunately, as soon as it started to show some promise and as soon as we had freedom from carriers to actually do what we set out to do in the first place, the project was cancelled, in favour of the whole Connected Devices IoT debacle.

If there was anything that killed morale for me more than my unfortunate time on the graphics team, and more than having FirefoxOS prematurely cancelled, it would have to be the Connected Devices experience. I appreciate it as an opportunity to work on random semi-interesting things for a year or so, and to get some entrepreneurship training, but the mismanagement of that whole situation was pretty epic. To take a group of hundreds of UI-focused engineers and tell them that, with very little help, they should organise themselves into small teams and create IoT products still strikes me as an idea so crazy that it definitely won’t work. Certainly not the way we did it anyway. The idea, I think, was that we’d be running several internal start-ups and we’d hopefully get some marketable products out of it. What business a not-for-profit company, based primarily on doing open-source, web-based engineering has making physical, commercial products is questionable, but it failed long before that could be considered.

The process involved coming up with an idea, presenting it and getting approval to run with it. You would then repeat this approval process at various stages during development. It was, however, very hard to get approval for enough resources (both time and people) to finesse an idea long enough to make it obviously a good or bad idea. That aside, I found it very demoralising to not have the opportunity to write code that people could use. I did manage it a few times, in spite of what was happening, but none of this work I would consider myself particularly proud of. Lots of very talented people left during this period, and then at the end of it, everyone else was laid off. Not a good time.

Luckily for me and the team I was on, we were moved under the umbrella of Emerging Technologies before the lay-offs happened, and this also allowed us to refocus away from trying to make an under-featured and pointless shopping-list assistant and back onto the underlying speech-recognition technology. This brings us almost to present day now.

The DeepSpeech speech recognition project is an extremely worthwhile project, with a clear mission, great promise and interesting underlying technology. So why would I leave? Well, I’ve practically ended up on this team by a series of accidents and random happenstance. It’s been very interesting so far, I’ve learnt a lot and I think I’ve made a reasonable contribution to the code-base. I also rewrote python_speech_features in C for a pretty large performance boost, which I’m pretty pleased with. But at the end of the day, it doesn’t feel like this team will miss me. I too often spend my time finding work to do, and to be honest, I’m just not interested enough in the subject matter to make that work long-term. Most of my time on this project has been spent pushing to open it up and make it more transparent to people outside of the company. I’ve added model exporting, better default behaviour, a client library, a native client, Python bindings (+ example client) and most recently, Node.js bindings (+ example client). We’re starting to get noticed and starting to get external contributions, but I worry that we still aren’t transparent enough and still aren’t truly treating this as the open-source project it is and should be. I hope the team can push further towards this direction without me. I think it’ll be one to watch.

Next week, I start working at a new job doing a new thing. It’s odd to say goodbye to Mozilla after 6 years. It’s not easy, but many of my peers and colleagues have already made the jump, so it feels like the right time. One of the big reasons I’m moving, and moving to Impossible specifically, is that I want to get back to doing impressive work again. This is the largest regret I have about my time at Mozilla. I used to blog regularly when I worked at OpenedHand and Intel, because I was excited about the work we were doing and I thought it was impressive. This wasn’t just youthful exuberance (he says, realising how ridiculous that sounds at 32), I still consider much of the work we did to be impressive, even now. I want to be doing things like that again, and it feels like Impossible is a great opportunity to make that happen. Wish me luck!

June 15, 2017

How the Osmocom GSM stack is funded

By Harald "LaF0rge" Welte

As the topic has been raised on twitter, I thought I might share a bit of insight into the funding of the Osmocom Cellular Infrastructure Projects.

Keep in mind: Osmocom is a much larger umbrella project, and beyond the network-side cellular stack it is home to many different community-based projects around open source mobile communications. All of those started more or less as just-for-fun projects – nothing serious, just a hobby [1].

The projects implementing the network-side protocol stacks and network elements of GSM/GPRS/EGPRS/UMTS cellular networks are somewhat the exception to that, as they have to some extent become professionalized. We call those projects collectively the Cellular Infrastructure projects inside Osmocom. This post is about that part of Osmocom only.

History

From late 2008 through 2009, people like Holger and me were working on bs11-abis and later OpenBSC only in our spare time. The name Osmocom didn't even exist back then. There was a strong technical community with contributions from Sylvain Munaut, Andreas Eversberg, Daniel Willmann, Jan Luebbe and a few others. None of this would have been possible if it wasn't for all the help we got from Dieter Spaar with the BS-11 [2]. We all had our dayjobs in other places, and OpenBSC work was really just a hobby. People were working on it because it was where no FOSS hacker had gone before. It was cool. It was a big and pleasant challenge to enter the closed telecom space as pure autodidacts.

Holger and I were doing freelance contract development work on Open Source projects for many years before. I was mostly doing Linux related contracting, while Holger has been active in all kinds of areas throughout the FOSS software stack.

In 2010, Holger and I saw some first interest by companies into OpenBSC, including Netzing AG and On-Waves ehf. So we were able to spend at least some of our paid time on OpenBSC/Osmocom related contract work, and were thus able to do less other work. We also continued to spend tons of spare time in bringing Osmocom forward. Also, the amount of contract work we did was only a fraction of the many more hours of spare time.

In 2011, Holger and I decided to start the company sysmocom in order to generate more funding for the Osmocom GSM projects by means of financing software development by product sales. So rather than doing freelance work for companies who bought their BTS hardware from other places (and spent huge amounts of cash on that), we decided that we wanted to be a full solution supplier, who can offer a complete product based on all hardware and software required to run small GSM networks.

The only problem was: We still needed an actual BTS for that. Through some reverse engineering of existing products we figured out who one of the ODM suppliers for the hardware + PHY layer was, and decided to develop the OsmoBTS software to run on it. We inherited some of the early code from work done by Andreas Eversberg on the jolly/bts branch of OsmocomBB (thanks), but much was missing at the time.

What followed was Holger and me working for several years for free [3], without any salary, in order to complete the OsmoBTS software, build an embedded Linux distribution around it based on OE/poky, write documentation, etc. and complete the first sysmocom product: the sysmoBTS 1002.

We did that not because we want to get rich, or because we want to run a business. We did it simply because we saw an opportunity to generate funding for the Osmocom projects and make them more sustainable and successful. And because we believe there is a big, gaping, huge vacuum in terms of absence of FOSS in the cellular telecom sphere.

Funding by means of sysmocom product sales

Once we started to sell the sysmoBTS products, we were able to fund Osmocom related development from the profits made on hardware / full-system product sales. Every single unit sold made a big contribution towards funding both the maintenance as well as the ongoing development on new features.

This source of funding continues to be an important factor today.

Funding by means of R&D contracts

The probably best and most welcome method of funding Osmocom related work is by means of R&D projects in which a customer funds our work to extend the Osmocom GSM stack in one particular area where he has a particular need that the existing code cannot fulfill yet.

This kind of project is the ideal match, as it shows where the true strength of FOSS is: Each of those customers did not have to fund the development of a GSM stack from scratch. Rather, they only had to fund those bits that were missing for their particular application.

Our reference for this is and has been On-Waves, who have been funding development of their required features (and bug fixing etc.) since 2010.

We've of course had many other projects from a variety of customers over the years. Last, but not least, we had a customer who willingly co-funded (together with funds from the NLnet foundation and lots of unpaid effort by sysmocom) the 3G/3.5G support in the Osmocom stack.

The problem here is:

  • we have not been able to secure anywhere near as many of those R&D projects within the cellular industry as we would like, despite believing we have a very good foundation upon which we can build. I've been writing many exciting technical project proposals
  • you almost exclusively get funding only for new features. But it's very hard to get funding for the core maintenance work: the bug-fixing, code review, code refactoring, testing, etc.

So as a result, the profit margin you have on selling R&D projects is basically used to (badly) fund those bits and pieces that nobody wants to pay for.

Funding by means of customer support

There is a way to generate funding for development by providing support services. We've had some success with this, but primarily alongside the actual hardware/system sales - not so much in terms of pure software-only support.

Also, providing support services from a R&D company means:

  • either you distract your developers by handling support inquiries. This means they will have less time to work on actual code, and likely get sidetracked by too many issues that make it hard to focus
  • or you have to hire separate support staff. This of course means that the size of the support business has to be sufficiently large to not only cover the costs of hiring + training support staff, but also still generate funding for the actual software R&D.

We tried the second option briefly, but have fallen back to the first for now. There's simply not sufficient user/admin type support business to justify dedicated staff for that.

Funding by means of cross-subsidizing from other business areas

sysmocom also started to do some non-Osmocom projects in order to generate revenue that we can feed again into Osmocom projects. I'm not at liberty to discuss them in detail, but basically we've been doing pretty much anything from:

  • custom embedded Linux board designs
  • M2M devices with GSM modems
  • consulting gigs
  • publicly tendered research projects

Profits from all those areas went again into Osmocom development.

Last, but not least, we also operate the sysmocom webshop. The profit we make on those products also is again immediately re-invested into Osmocom development.

Funding by grants

We've had some success in securing funding from the NLnet Foundation for specific features. While this is useful, the size of their project grants of up to EUR 30k is not a good fit for the scale of the tasks we have at hand inside Osmocom. You may think that's a considerable amount of money? Well, that translates to 2-3 man-months of work at a bare cost-covering rate. At a team size of 6 developers, you would theoretically have churned through that in two weeks. Also, their focus is (understandably) on Internet and IT security, and not so much on cellular communications.

There are of course other options for grants, such as government research grants and the like. However, they require long-term planning, they require you to match (i.e. pay yourself) a significant portion, and basically mandate that you hire one extra person for doing all the required paperwork and reporting. So all in all, not a particularly attractive option for a very small company consisting of die hard engineers.

Funding by more BTS ports

At sysmocom, we've been doing some ports of the OsmoBTS + OsmoPCU software to other hardware, and supporting those other BTS vendors with porting, R&D and support services.

If sysmocom was a classic BTS vendor, we would not help our "competition". However, we are not. sysmocom exists to help Osmocom, and we strongly believe in open systems and architectures, without a single point of failure, a single supplier for any component or any type of vendor lock-in.

So we happily help third parties to get Osmocom running on their hardware, either with a proprietary PHY or with OsmoTRX.

However, we expect that those BTS vendors also understand their responsibility to share the development and maintenance effort of the stack. Preferably by dedicating some of their own staff to work in the Osmocom community. Alternatively, sysmocom can perform that work as paid service. But that's a double-edged sword: We don't want to be a single point of failure.

Osmocom funding outside of sysmocom

Osmocom is of course more than sysmocom. Even for the cellular infrastructure projects inside Osmocom it holds true: They are true, community-based, open, collaborative development projects. Anyone can contribute.

Over the years, there have been code contributions by e.g. Fairwaves. They, too, build GSM base station hardware and use that as a means to not only recover the R&D on the hardware, but also to contribute to Osmocom. At some point a few years ago, there was a lot of work from them in the area of OsmoTRX, OsmoBTS and OsmoPCU. Unfortunately, in more recent years, they have not been able to keep up the level of contributions.

There are other companies engaged in activities with and around Osmocom. There's Rhizomatica, an NGO helping indigenous communities to run their own cellular networks. They have been funding some of our efforts, but being an NGO helping rural regions in developing countries, they of course also don't have deep pockets. Ideally, we'd want to be the ones contributing to them, not the other way around.

State of funding

During recent years, we've been making some progress in securing funding from players we cannot name [4]. We're also making occasional progress in convincing BTS suppliers to chip in their share. Unfortunately there are more who don't live up to their responsibility than those who do. I might start calling them out by name one day. The wider community and the public actually deserve to know who plays by FOSS rules and who doesn't. That's not shaming, it's just stating bare facts.

Which brings us to:

  • sysmocom is in an office that's actually too small for the team, equipment and stock. But we certainly cannot afford more space.
  • we cannot pay our employees what they could earn working at similar positions in other companies. So working at sysmocom requires dedication to the cause :)
  • Holger and I have invested way more time than we have ever paid ourselves, even more so considering the opportunity cost of what we would have earned if we'd continued on our freelance Open Source hacker path
  • we're [just barely] managing to pay for 6 developers dedicated to Osmocom development on our payroll based on the various funding sources indicated above

Nevertheless, I doubt that any team this small has ever implemented an end-to-end GSM/GPRS/EGPRS network from RAN to Core with a comparable feature set. My deepest respects to everyone involved. The big task now is to make it sustainable.

Summary

So as you can see, there's quite a bit of funding around. However, it always falls short of what's needed to implement all parts properly, and is not even quite sufficient to keep maintaining the status quo in a proper and tested way. That can often be frustrating (mostly to us, but sometimes also to users who run into regressions and other bugs). There's so much more potential. So many things we have wanted to add or clean up for a long time, but too few people interested in joining in and helping out - financially or by writing code.

One thing that is often a challenge when dealing with traditional customers: We are not developing a product and then selling a ready-made product. In fact, in FOSS this would be more or less suicidal: We'd have to invest man-years upfront, but then once it is finished, everyone can use it without having to partake in that investment.

So instead, the FOSS model requires the customers/users to chip in early during the R&D phase, in order to then subsequently harvest the fruits of that.

I think the lack of a FOSS mindset across the cellular / telecom industry is the biggest constraining factor here. I've seen the same thing some 15-20 years ago in the Linux world. Trust me, it takes a lot of dedication to the cause to endure this lack of comprehension so many years later.

[1]just like Linux has started out.
[2]while you will not find a lot of commits from Dieter in the code, he has been playing a key role in doing a lot of prototyping, reverse engineering and debugging!
[3]sysmocom is 100% privately held by Holger and me, we intentionally have no external investors and are proud to have never had to take a bank loan. So all we could invest was our own money and, most of all, time.
[4]contrary to the FOSS world, a lot of aspects are confidential in business, and we're not at liberty to disclose the identities of all our customers

FOSS misconceptions, still in 2017

By Harald "LaF0rge" Welte

The lack of basic FOSS understanding in Telecom

Given that the Free and Open Source movement has been around at least since the 1980s, it puzzles me that people still seem to have such fundamental misconceptions about it.

Something that really triggered me was an article at LightReading [1] which quotes Ulf Ewaldsson, a leading Ericsson executive, with:

"I have yet to understand why we would open source something we think is really good software"

This completely misses the point. FOSS is not about making a charity donation of a finished product to the planet.

FOSS is about sharing the development costs among multiple players, and avoiding that everyone has to reimplement the wheel. Macro-economically, it is complete and utter nonsense that each 3GPP specification gets implemented two dozen times, by at least a dozen different entities. As a result, products are way more expensive than needed.

If large Telco players (whether operators or equipment manufacturers) were to collaboratively develop code just as much as they collaboratively develop the protocol specifications, there would be no need for replicating all of this work.

As a result, everyone could produce cellular network elements at reduced cost, sharing the R&D expenses, and competing in key areas, such as who can come up with the most energy-efficient implementation, or can produce the most reliable hardware, the best receiver sensitivity, the best and most fair scheduling implementation, or whatever else. But some 80% of the code could probably be shared, as e.g. encoding and decoding messages according to a given publicly released 3GPP specification document is not where those equipment suppliers actually compete.

So my dear cellular operator executives: Next time you're cursing about the prohibitively expensive pricing that your equipment suppliers quote you: You only have to pay that much because everyone is reimplementing the wheel over and over again.

Equally, my dear cellular infrastructure suppliers: You are all dying one by one, as it's hard to develop everything from scratch. Over the years, many of you have died. One wonders if we might still have more players left if some of you had started to cooperate in developing FOSS, at least in those areas where you're not competing. You could replicate what Linux is doing in the operating system market. There's no need to have a phalanx of different proprietary flavors of Unix-like OSs. It's way too expensive, and it's not an area in which most companies need to or want to compete anyway.

Management Summary

You don't first develop an entire product until it is finished and then release it as open source. This makes little economic sense in a lot of cases, as you've already invested into developing 100% of it. Instead, you actually develop a new product collaboratively as FOSS in order to not have to invest 100% but maybe only 30% or even less. You get a multiple of your R&D investment back, because you're not only getting your own code, but all the other code that other community members implemented. You of course also get other benefits, such as peer review of the code, more ideas (not all bright people work inside one given company), etc.

[1]that article is actually a heavily opinionated post by somebody who appears to have been pushing his own anti-FOSS agenda for some time. The author is misinformed about the fact that the TIP has always included projects under both FRAND and FOSS terms. As a TIP member I can attest to that fact. I'm only referencing it here for the purpose of that Ericsson quote.

May 28, 2017

Playing back GSM RTP streams, RTP-HR bugs

By Harald "LaF0rge" Welte

Chapter 0: Problem Statement

In an all-IP GSM network, where we use Abis, A and other interfaces within the cellular network over IP transport, the audio of voice calls is transported inside RTP frames. The codec payload in those RTP frames is the actual codec frame of the respective cellular voice codec. In GSM, there are four relevant codecs: FR, HR, EFR and AMR.

Every so often during the (meanwhile many years of) development of Osmocom cellular infrastructure software, it would have been useful to be able to quickly play back the audio for analysis of given issues.

However, until now we didn't have that capability. The reason is relatively simple: In Osmocom, we generally don't do transcoding but simply pass the voice codec frames from left to right. They're only transcoded inside the phones or inside some external media gateway (in the case of larger networks).

Chapter 1: GSM Audio Pocket Knife

Back in 2010, when we were very actively working on OsmocomBB, the telephone-side GSM protocol stack implementation, Sylvain Munaut wrote the GSM Audio Pocket Knife (gapk) in order to be able to convert between different formats (representations) of codec frames. In cellular communications, everyone is coming up with their own representation for the codec frames: The way they look on E1 as a TRAU frame is completely different from how the RTP payload looks, or what the TI Calypso DSP uses internally, or what a GSM tester like the Racal 61x3 uses. The differences are mostly about the data types used, bit-endianness, as well as padding and headers. And of course those different formats exist for each of the four codecs :/

In 2013 I first added simplistic RTP support for FR-GSM to gapk, which was sufficient for my debugging needs back then. Still, you had to save the decoded PCM output to a file and play that back, or use a pipe into aplay.

Last week, I picked up this subject again and added a long series of patches to gapk:

  • support for variable-length codec frames (required for AMR support)
  • support for AMR codec encode/decode using libopencore-amrnb
  • support of all known RTP payload formats for all four codecs
  • support for direct live playback to a sound card via ALSA

All of the above can now be combined to make GAPK bind to a specified UDP port and play back the RTP codec frames that anyone sends to that port using a command like this:

$ gapk -I 0.0.0.0/30000 -f rtp-amr -A default -g rawpcm-s16le

I've also merged a change to OsmoBSC/OsmoNITB which allows the administrator to re-direct the voice of any active voice channel towards a user-specified IP address and port. Using that, you can simply disconnect the voice stream from its normal destination and play back the audio via your sound card.

Chapter 2: Bugs in OsmoBTS GSM-HR

While going through the exercise of implementing the above extension to gapk, I had lots of trouble getting it to work for GSM-HR.

After some more digging, it seems there are two conflicting specifications on how to format the RTP payload for half-rate GSM:

In Osmocom, we claim to implement RFC5993, but it turned out that (at least) osmo-bts-sysmo (for sysmoBTS) was actually implementing the ETSI format instead.

And even worse, osmo-bts-sysmo gets even the ETSI format wrong. Each of the codec parameters (which are unaligned bit-fields) is in the wrong bit-endianness :(
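
To illustrate the kind of fix needed, here is a minimal Python sketch that reverses the bit order inside each codec parameter of a 112-bit GSM-HR frame. The field table below is a made-up placeholder, not the real HR parameter layout; the real offsets and lengths come from the ETSI codec parameter tables, and the actual fix lives in C inside osmo-bts-sysmo.

def bytes_to_bits(data):
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]

def bits_to_bytes(bits):
    return bytes(sum(b << (7 - i) for i, b in enumerate(bits[o:o + 8]))
                 for o in range(0, len(bits), 8))

def reverse_field_bits(frame_bits, fields):
    # fields: list of (offset, length) of each unaligned codec parameter
    out = list(frame_bits)
    for off, length in fields:
        out[off:off + length] = reversed(frame_bits[off:off + length])
    return out

# placeholder layout only -- NOT the real GSM-HR parameter table
HR_FIELDS = [(0, 5), (5, 11), (16, 9), (25, 28), (53, 28), (81, 31)]

def fix_hr_frame(frame):       # frame: 14 bytes == 112 bits
    bits = bytes_to_bits(frame)
    return bits_to_bytes(reverse_field_bits(bits, HR_FIELDS))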

Both of the above were coincidentally also discovered by Sylvain Munaut during operation of the 32C3 GSM network in December 2015 and resulted in the two following "work around" patches: * HACK for HR * HACK: Fix the bit order in HR frames

Those merely worked around those issues in the rtp_proxy of OsmoNITB, rather than addressing the real issue. That's ok, they were "quick" hacks to get something working at all during a four-day conference. I'm now working on "real" fixes in osmo-bts-sysmo. The devil is of course in the details, when people upgrade one BTS but not the other and want to inter-operate, ...

It yet remains to be investigated how osmo-bts-trx and other osmo-bts ports behave in this regard.

Chapter 3: Conclusions

Most definitely it is once again a very clear sign that more testing is required. It's tricky to see even with osmo-gsm-tester, as GSM-HR works between two phones or even two instances of osmo-bts-sysmo, because both sides of the implementation have the same (wrong) understanding of the spec.

Given that we can only catch this kind of bug together with the hardware (the DSP runs the PHY code), pure unit tests wouldn't catch it. And the end-to-end test is also not very well suited to it. It seems to call for something in between. Something like an A-bis interface level test.

We need more (automatic) testing. I cannot say that often enough. The big challenge is how to convince contributors and customers that they should invest their time and money there, rather than in yet another (not automatically tested) feature.

May 27, 2017

Free Ideas for UI Frameworks, or How To Achieve Polished UI

By Chris Lord

Ever since the original iPhone came out, I’ve had several ideas about how they managed to achieve such fluidity with relatively mediocre hardware. I mean, it was good at the time, but Android still struggles on hardware that makes that look like a 486… It’s absolutely my fault that none of these have been implemented in any open-source framework I’m aware of, so instead of sitting on these ideas and trotting them out at the pub every few months as we reminisce over what could have been, I’m writing about them here. I’m hoping that either someone takes them and runs with them, or that they get thoroughly debunked and I’m made to look like an idiot. The third option is of course that they’re ignored, which I think would be a shame, but given I’ve not managed to get the opportunity to implement them over the last decade, that would hardly be surprising. I feel I should clarify that these aren’t all my ideas, but include a mix of observation of and conjecture about contemporary software. This somewhat follows on from the post I made 6 years ago(!). So let’s begin.

1. No main-thread UI

The UI should always be able to start drawing when necessary. As careful as you may be, it’s practically impossible to write software that will remain perfectly fluid when the UI can be blocked by arbitrary processing. This seems like an obvious one to me, but I suppose the problem is that legacy makes it very difficult to adopt this at a later date. That said, difficult but not impossible. All the major web browsers have adopted this policy, with caveats here and there. The trick is to switch from the idea of ‘painting’ to the idea of ‘assembling’ and then using a compositor to do the painting. Easier said than done of course, most frameworks include the ability to extend painting in a way that would make it impossible to switch to a different thread without breaking things. But as long as it’s possible to block UI, it will inevitably happen.
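
As a toy illustration of the ‘assemble, then composite’ split (all names invented for this sketch, in Python rather than anything a real toolkit would use): the main thread only publishes display lists, while a compositor thread keeps repainting at its own cadence, no matter how long application-side work takes.

import threading
import queue
import time

display_lists = queue.Queue(maxsize=1)   # only the newest scene matters

def paint(scene):
    pass                      # stand-in for the actual compositing work

def compositor():
    scene = []
    while True:
        try:
            scene = display_lists.get_nowait()   # pick up a new scene
        except queue.Empty:
            pass                                 # else redraw the old one
        paint(scene)          # painting never waits on application code
        time.sleep(1 / 60)

def publish(scene):
    try:
        display_lists.get_nowait()               # drop any stale scene
    except queue.Empty:
        pass
    display_lists.put(scene)

threading.Thread(target=compositor, daemon=True).start()
time.sleep(0.5)               # simulate slow main-thread layout work
publish(["toolbar", "page"])  # hand the assembled display list over
time.sleep(1)                 # let the compositor run a few frames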

2. Contextually-aware compositor

This follows on from the first point; what’s the use of having non-blocking UI if it can’t respond? Input needs to be handled away from the main thread also, and the compositor (or whatever you want to call the thread that is handling painting) needs to have enough context available that the first response to user input doesn’t need to travel to the main thread. Things like hover states, active states, animations, pinch-to-zoom and scrolling all need to be initiated without interaction on the main thread. Of course, main thread interaction will likely eventually be required to update the view, but that initial response needs to be able to happen without it. This is another seemingly obvious one – how can you guarantee a response rate unless you have a thread dedicated to responding within that time? Most browsers are doing this, but not going far enough in my opinion. Scrolling and zooming are often catered for, but not hover/active states, or initialising animations (note; initialising animations. Once they’ve been initialised, they are indeed run on the compositor, usually).
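
Continuing the same toy model, a sketch of compositor-side input handling: a scroll event updates the composited offset immediately on the compositor thread, and the main thread is merely notified so it can update the real view later. Again, every name here is illustrative.

import queue

input_events = queue.Queue()          # fed by the OS input thread
main_thread_mailbox = queue.Queue()   # how the main thread finds out

scroll_offset = 0.0

def composite_at(offset):
    pass   # stand-in for drawing the retained layer tree at `offset`

def compositor_frame():
    global scroll_offset
    while not input_events.empty():
        event = input_events.get()
        if event["type"] == "scroll":
            scroll_offset += event["dy"]              # instant response
            main_thread_mailbox.put(("scrolled", scroll_offset))
    composite_at(scroll_offset)       # paint with the new offset now

input_events.put({"type": "scroll", "dy": 12.0})
compositor_frame()                    # responds without the main thread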

3. Memory bandwidth budget

This is one of the less obvious ideas and something I’ve really wanted to have a go at implementing, but never had the opportunity. A problem I saw a lot while working on the platform for both Firefox for Android and FirefoxOS is that given the work-load of a web browser (which is not entirely dissimilar to the work-load of any information-heavy UI), it was very easy to saturate memory bandwidth. And once you saturate memory bandwidth, you end up having to block somewhere, and painting gets delayed. We’re assuming UI updates are asynchronous (because of course – otherwise we’re blocking on the main thread). I suggest that it’s worth tracking frame time, and only allowing large asynchronous transfers (e.g. texture upload, scaling, format transforms) to take a certain amount of time. After that time has expired, it should wait on the next frame to be composited before resuming (assuming there is a composite scheduled). If the composited frame was delayed to the point that it skipped a frame compared to the last unladen composite, the amount of time dedicated to transfers should be reduced, or the transfer should be delayed until some arbitrary time (i.e. it should only be considered ok to skip a frame every X ms).
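
Here is a sketch of what such a budget could look like, with made-up numbers: each frame grants a fixed time allowance for asynchronous transfers, and the allowance shrinks whenever a composite was late enough to skip a frame.

FRAME_MS = 16.6                  # assuming a 60 Hz composite loop

class TransferBudget:
    def __init__(self, budget_ms=4.0):
        self.max_ms = budget_ms
        self.budget_ms = budget_ms
        self.spent_ms = 0.0

    def try_start(self, estimated_ms):
        # Defer the transfer if it would blow this frame's budget;
        # the caller re-queues it for the next composited frame.
        if self.spent_ms + estimated_ms > self.budget_ms:
            return False
        self.spent_ms += estimated_ms
        return True

    def end_of_frame(self, frame_took_ms):
        # Skipped a frame? Spend less on transfers next time.
        if frame_took_ms > 2 * FRAME_MS:
            self.budget_ms = max(1.0, self.budget_ms * 0.5)
        else:
            self.budget_ms = min(self.max_ms, self.budget_ms * 1.25)
        self.spent_ms = 0.0

budget = TransferBudget()
if budget.try_start(estimated_ms=3.0):
    pass                         # do the texture upload now
budget.end_of_frame(frame_took_ms=18.0)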

It’s interesting that you can see something very similar to this happening in early versions of iOS (I don’t know if it still happens or not) – when scrolling long lists with images that load in dynamically, none of the images will load while the list is animating. The user response was paramount, to the point that it was considered more important to present consistent response than it was to present complete UI. This priority, I think, is a lot of the reason the iPhone feels ‘magic’ and Android phones felt like junk up until around 4.0 (where it’s better, but still not as good as iOS).

4. Level-of-detail

This is something that I did get to partially implement while working on Firefox for Android, though I didn’t do such a great job of it so its current implementation is heavily compromised from how I wanted it to work. This is another idea stolen from game development. There will be times, during certain interactions, where processing time will be necessarily limited. Quite often though, during these times, a user’s view of the UI will be compromised in some fashion. It’s important to understand that you don’t always need to present the full-detail view of a UI. In Firefox for Android, this took the form that when scrolling fast enough that rendering couldn’t keep up, we would render at half the resolution. This let us render more, and faster, giving the impression of a consistent UI even when the hardware wasn’t quite capable of it. I notice Microsoft doing similar things since Windows 8; notice how the quality of image scaling reduces markedly while scrolling or animations are in progress. This idea is very implementation-specific. What can be dropped and what you want to drop will differ between platforms, form-factors, hardware, etc. Generally though, some things you can consider dropping: Sub-pixel anti-aliasing, high-quality image scaling, render resolution, colour-depth, animations. You may also want to consider showing partial UI if you know that it will very quickly be updated. The Android web-browser during the Honeycomb years did this, and I attempted (with limited success, because it’s hard…) to do this with Firefox for Android many years ago.
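
A minimal sketch of the decision logic, with invented thresholds; what the real values should be depends entirely on the platform and hardware:

def pick_render_scale(scroll_px_per_s, last_frame_ms):
    # Degrade render resolution when content moves too fast to keep up.
    if scroll_px_per_s > 3000 or last_frame_ms > 33:
        return 0.5       # half resolution: fast scroll or dropped frame
    if scroll_px_per_s > 1000:
        return 0.75      # mild degradation during moderate scrolling
    return 1.0           # idle or slow: full detail

The same function could also gate sub-pixel anti-aliasing or image scaling quality instead of (or in addition to) resolution.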

Pitfalls

I think it’s easy to read ideas like this and think it boils down to “do everything asynchronously”. Unfortunately, if you take a naïve approach to that, you just end up with something that can be inexplicably slow sometimes and the only way to fix it is via profiling and micro-optimisations. It’s very hard to guarantee a consistent experience if you don’t manage when things happen. Yes, do everything asynchronously, but make sure you do your book-keeping and you manage when it’s done. It’s not only about splitting work up, it’s about making sure it’s done when it’s smart to do so.

You also need to be careful about how you measure these improvements, and to be aware that sometimes results in synthetic tests will even correlate to the opposite of the experience you want. A great example of this, in my opinion, is page-load speed on desktop browsers. All the major desktop browsers concentrate on prioritising the I/O and computation required to get the page to 100%. For heavy desktop sites, however, this means the browser is often very clunky to use while pages are loading (yes, even with out-of-process tabs – see the point about bandwidth above). I highlight this specifically on desktop, because you’re quite likely to not only be browsing much heavier sites that trigger this behaviour, but also to have multiple tabs open. So as soon as you load a couple of heavy sites, your entire browsing experience is compromised. I wouldn’t mind the site taking a little longer to load if it didn’t make the whole browser chug while doing so.

Don’t lose sight of your goals. Don’t compromise. Things might take longer to complete, deadlines might be missed… But polish can’t be overrated. Polish is what people feel and what they remember, and the lack of it can have a devastating effect on someone’s perception. It’s not always conscious or obvious either, even when you’re the developer. Ask yourself “Am I fully satisfied with this?” before marking something as complete. You might still be able to ship if the answer is “No”, but make sure you don’t lose sight of that and make sure it gets the priority it deserves.

One last point I’ll make; I think to really execute on all of this, it requires buy-in from everyone. Not just engineers, not just engineers and managers, but visual designers, user experience, leadership… Everyone. It’s too easy to do a job that’s good enough and it’s too much responsibility to put it all on one person’s shoulders. You really need to be on the ball to produce the kind of software that Apple does almost routinely, but as much as they’d say otherwise, it isn’t magic.

May 23, 2017

Power-cycling a USB port should be simple, right?

By Harald "LaF0rge" Welte

Every so often I happen to be involved in designing electronics equipment that's supposed to run reliably and remotely in inaccessible locations, without any ability for "remote hands" to perform things like power-cycling or the like. I'm talking about really remote locations, possibly with no or only limited back-haul, and a very high cost of ever sending somebody there for remote maintenance.

Given that a lot of computer peripherals (chips, modules, ...) use USB these days, this is often some kind of an embedded ARM (rarely x86) SoM or SBC, which is hooked up to a custom board that contains a USB hub chip as well as a line of peripherals.

One of the most important lessons I've learned from experience is: never trust reset signals/lines, always include power-switching capability. There are many chips and electronics modules available on the market that either have no RESET at all, or claim to have a hardware RESET line which you later (painfully) discover to be just a GPIO polled by software that can get stuck, leaving no way to really hard-reset the given component.

In the case of a USB-attached device (even if the USB only exists on a circuit board between two ICs), this is typically rather easy: a USB hub is generally capable of switching the power of its downstream ports. Many cheap USB hubs don't implement this at all, or implement only ganged switching, but if you carefully select your USB hub (or, in the case of a custom PCB, your hub chip), you can make sure that it supports individual port power switching.

Now the next step is how to actually use this from your (embedded) Linux system. It turns out to be harder than expected. After all, we're talking about a standard feature that has been present in the USB specifications since USB 1.x in the late 1990s. So the expectation is that it should be straight-forward to do with any decent operating system.

I don't know how things are on other operating systems, but on Linux I couldn't really find a clean, proper way to do this. For more details, please read my post to the linux-usb mailing list.
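For reference, the closest thing to a clean user-space approach I'm aware of is issuing the hub-class SetPortFeature/ClearPortFeature(PORT_POWER) control requests directly to the hub, which is what tools like uhubctl do. Below is a minimal, hedged sketch in Go using github.com/google/gousb; the VID/PID and the port number are placeholders for whatever hub sits on your board:

package main

import (
    "log"
    "time"

    "github.com/google/gousb"
)

const (
    reqClearFeature = 0x01 // standard CLEAR_FEATURE request
    reqSetFeature   = 0x03 // standard SET_FEATURE request
    featPortPower   = 8    // PORT_POWER feature selector (USB 2.0 spec)
    rtHubPort       = 0x23 // class request, recipient: other (hub port)
)

func main() {
    ctx := gousb.NewContext()
    defer ctx.Close()

    // Hypothetical VID/PID of the hub on the board; adjust to your hardware.
    hub, err := ctx.OpenDeviceWithVIDPID(0x05e3, 0x0608)
    if err != nil || hub == nil {
        log.Fatalf("hub not found: %v", err)
    }
    defer hub.Close()

    port := uint16(2) // downstream port the peripheral hangs off

    // Drop VBUS on the port, let the peripheral lose power, then re-enable.
    if _, err := hub.Control(rtHubPort, reqClearFeature, featPortPower, port, nil); err != nil {
        log.Fatalf("port power off: %v", err)
    }
    time.Sleep(2 * time.Second)
    if _, err := hub.Control(rtHubPort, reqSetFeature, featPortPower, port, nil); err != nil {
        log.Fatalf("port power on: %v", err)
    }
}

Of course this only really cuts VBUS if the hub implements individual (rather than ganged) port power switching, as described above.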

Why am I running into this now? Is it such a strange idea? I mean, power-cycling a device should be the most simple and straight-forward thing to do in order to recover from any kind of "stuck state" or other related issue. Logical enabling/disabling of the port, resetting the USB device via USB protocol, etc. are all just "soft" forms of a reset which at best help with USB related issues, but not with any other part of a USB device.

And in the case of e.g. a USB-attached cellular modem, we're actually talking about a multi-processor system with multiple built-in micro-controllers, at least one DSP, an ARM core that might run another Linux itself (to implement the USB gadget), ... - certainly complex enough software that you would want to be able to power-cycle it...

I'm curious what the response of the Linux USB gurus is.

May 17, 2017

CAMEL and protocol design

By Holger "zecke" Freyther

Today I want to share the pain of running a production 3GPP TCAP/MAP/CAP system and of network protocol design in general. The excellent Free Software ASN.1/TCAP/MAP/CAP stack (which is made possible by the Pharo live programming environment) I helped create is in heavy production usage (powering standard off-the-shelf components like an SGSN or an AuC, as well as non-standard components to enable new business cases) and sees roaming traffic from a lot of networks. From time to time something odd comes up.

In TCAP/MAP/CAP, messages as well as Requests/Responses and the possible Errors are defined using ASN.1. Over the last decades ETSI and 3GPP have made various major versions and minor releases (e.g. adding new optional attributes to requests/responses/errors). The biggest standard is CAMEL, and it is so big and complicated that it was specified in four phases (each phase with its own versions of the ApplicationContext; think of it as a versioned entry point into the definitions of all messages and RPC calls).

One issue in supporting a specific module version (application-context-name) is to find the right minor release of 3GPP (either the newest or oldest for that ACN). Then it is a matter of copying and pasting the ASN.1 definitions from either a PDF or a Word document into individual files… and after that is done, one can fix the broken imports (or modify the ASN.1 parser to do a global look-up) and the typos in elements.

This artificial barrier creates two issues for people implementing components that use MAP/CAP. Some use inferior ASN.1 tools or can’t be bothered to create the input files and decide to hardcode the message content (after all, BER/DER is more or less just nested TLV entries). The second issue is related to time/effort as well. When creating the CAMEL ASN.1 files I didn’t want to do the work four times (once for each phase) and searched for shortcuts too.
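To make the “just nested TLV entries” remark concrete, here is a tiny illustrative walker in Go — single-byte tags and definite lengths only, nowhere near a real codec — that shows how little structure BER/DER carries without the ASN.1 schema:

package main

import (
    "errors"
    "fmt"
)

// walkTLV walks one BER/DER encoding, printing tag/length pairs and
// recursing into constructed types. Purely illustrative: it handles only
// single-byte tags and definite lengths.
func walkTLV(b []byte, depth int) error {
    for len(b) > 0 {
        if len(b) < 2 {
            return errors.New("truncated TLV")
        }
        tag := b[0]
        length := int(b[1])
        rest := b[2:]
        if length&0x80 != 0 { // long form: low bits count the length octets
            n := length & 0x7f
            if n == 0 || n > len(rest) {
                return errors.New("bad length encoding")
            }
            length = 0
            for _, c := range rest[:n] {
                length = length<<8 | int(c)
            }
            rest = rest[n:]
        }
        if length > len(rest) {
            return errors.New("value overruns buffer")
        }
        fmt.Printf("%*stag 0x%02x, %d byte(s)\n", depth*2, "", tag, length)
        if tag&0x20 != 0 { // constructed: recurse into the nested TLVs
            if err := walkTLV(rest[:length], depth+1); err != nil {
                return err
            }
        }
        b = rest[length:]
    }
    return nil
}

func main() {
    // SEQUENCE { INTEGER 5 } in DER: 30 03 02 01 05
    if err := walkTLV([]byte{0x30, 0x03, 0x02, 0x01, 0x05}, 0); err != nil {
        fmt.Println(err)
    }
}

This is also why hardcoding message content is so tempting — and so fragile once tags get renumbered, as we are about to see.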

The first issue materialized as equipment sending completely broken messages or omitting mandatory(!) elements. So what happens when a big telco sends you a message the stack can’t decode, and you look up the oldest and youngest release defining this ACN only to see that the element that failed to parse has always been mandatory? Right, one adds an OPTIONAL modifier to be able to move forward…

The second issue is on me though. I started with a set of CAMEL phase3 files and assumed that only the operations (and their arguments/responses) would differ across CAMEL phases, but that the supporting structs they use would stay the same. My assumption (and this brings us to protocol design) was that, besides versioning the module, they would be conservative and extend supporting types in a forward-compatible way, so I integrated phase2 and phase1 into the same set of files.

And then reality set in: the logs of the system showed a message that caused an exception during parsing (normally this only happens for the first kind of issue). An extension to the Request structure had been changed in a non-forward-compatible way. Let’s have a look:

InitialDPArgExtension ::= SEQUENCE {
-   naCarrierInformation            [0] NACarrierInformation OPTIONAL,
-   gmscAddress                     [1] ISDN-AddressString OPTIONAL,
-   ...
+   gmscAddress                     [0] ISDN-AddressString OPTIONAL,
    *more new optional elements*
+   ...,
+   enhancedDialledServicesAllowed  [11] NULL OPTIONAL,
    *more elements after the extension marker*
}

So one element (naCarrierInformation) was removed, every following element was renumbered, and the extension marker was moved further down. In theory the InitialDPArgExtension name binding exists once in the phase2 definition and once in phase3, and 3GPP had every right to define a new binding with a different layout. The engineering question is: was this a good decision?

A change in application-context allows removing some old cruft and making room for new elements. The tag space might be considered a scarce resource, and making room conserves that resource. On the other hand, in the history of GSM no other struct has ever run out of tags, and there are various other approaches to the problem. The above is already an extension to an extension, and the step to an extension of an extension of an extension doesn’t seem so absurd anymore.

So please think of forward compatibility when designing protocols, think of the implementor and make the definition machine readable and please get the imports right so one doesn’t need to resort to a global symbol search. If you are having interesting core network issues related to TCAP, MAP and CAP consider contacting me.

May 06, 2017

MariaDB Galera and custom health probe for Azure LoadBalancer

By Holger "zecke" Freyther

My Galera set-up on Kubernetes and the Azure LoadBalancer in front of it seem to work nicely, but one big TODO is to implement proper health checks. If a node is down, in maintenance or split from the network, it should not be part of the LoadBalancer. The Azure LoadBalancer has support for custom HTTP probes, and I wanted to write something very simple that handles the HTTP GET, opens a MySQL connection to the destination and checks if it is connected to a primary. As this is about health checks, the code should be small and reliable.

To improve my Go(-lang) skills I decided to write my healthcheck in Go. And it seemed like a good idea: Go has a powerful HTTP package, a SQL API package and two MySQL implementations. The entire prototype is just about 72 lines (with comments and empty lines) and I think that qualifies as small. Prototyping the MySQL code took some iterations but in general it went quite quickly. But how reliable is it? Go introduced the nice concept of a context.Context: any operation should be associated with a context, which is passed as an argument from one method to another. One can create a child context and associate it with a deadline (absolute time) or timeout (relative), and there is a way to cancel it.

I grabbed the Context from the HTTP Request, added a timeout and called a function to do the MySQL check. Wow, that was easy. Some polish to parse the parameters from the CLI and I am ready to deploy it! But let’s see how reliable it is.
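That first iteration looked roughly like the following sketch (a reconstruction, not the actual 72 lines; the credentials and probe address are placeholders, and it assumes a context-aware version of github.com/go-sql-driver/mysql):

package main

import (
    "context"
    "database/sql"
    "fmt"
    "net/http"
    "time"

    _ "github.com/go-sql-driver/mysql"
)

// checkPrimary connects to the node and verifies that it is part of a
// primary component, using Galera's wsrep_cluster_status variable.
func checkPrimary(ctx context.Context, dsn string) error {
    db, err := sql.Open("mysql", dsn)
    if err != nil {
        return err
    }
    defer db.Close()

    var name, status string
    row := db.QueryRowContext(ctx, "SHOW STATUS LIKE 'wsrep_cluster_status'")
    if err := row.Scan(&name, &status); err != nil {
        return err
    }
    if status != "Primary" {
        return fmt.Errorf("node is in state %q, not Primary", status)
    }
    return nil
}

func main() {
    dsn := "monitor:secret@tcp(10.0.0.5:3306)/" // placeholder credentials/address
    http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
        // Child context: the whole check must finish within two seconds.
        ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
        defer cancel()
        if err := checkPrimary(ctx, dsn); err != nil {
            http.Error(w, err.Error(), http.StatusServiceUnavailable)
            return
        }
        w.Write([]byte("OK"))
    })
    http.ListenAndServe(":8080", nil)
}

As the rest of this post shows, the driver available at the time did not actually honour the context while connecting — which is exactly where things get interesting.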

I imagined the following error conditions:

  1. The destination IP is reachable but nothing is listening on the port. The TCP connection will fail quickly (SYN -> RST,ACK)
  2. The destination IP ends in a blackhole (no RST,ACK is ever received). One runs into a long connect timeout
  3. The Galera node (or machine hosting it) is overloaded. While the connect succeeds the authentication or a query might stall
  4. The Galera node is split and not a master

The first and fourth error conditions are easy to test/simulate and trivial to implement properly. I then moved to the third one. My first choice was to simulate an infinitely slow Galera node, which I did by using nc -l 3006 to accept a TCP connection and then send nothing. I started a health probe and waited… and waited… no timeout. Not after 2s as programmed in the context, not after 2min and not after… (okay, I gave up after 30min). Pretty discouraging!

After some reading and browsing I saw an open PR to add context.Context support to the MySQL backend. I modified my import, ran go get to fetch it, go build and retested. Okay that didn’t work either. So let’s try the other MySQL implementation, again change the package imports, go get and go build and retest. I picked the wrong package name but even after picking the right package this driver failed to parse the Database URL. At that point I decided to go back to the first implementation and have a deeper look.

So while many of the SQL API methods take a Context as an argument, Open does not. Open says it might or might not connect to the database, and in the case of MySQL it does connect to it. Is there a workaround? I could spawn a goroutine and have a selective receive on either the result or a timeout. While this would make it possible to respond to the HTTP request, it creates two issues: first, one can’t cancel goroutines, so I would leak memory; worse, I might run into a connection limit on the Galera node. What about other workarounds? It seems I can play with the custom readTimeout and writeTimeout parameters and at least limit the timeout per I/O operation. I guess it takes a bit of tuning to find good values for a busy system, and let’s hope that context.Context will be used in more places in the future.
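A self-contained sketch of that workaround, assuming the DSN parameters documented by go-sql-driver/mysql (timeout bounds the TCP connect, readTimeout/writeTimeout each individual I/O operation; credentials and address are placeholders):

package main

import (
    "database/sql"
    "log"

    _ "github.com/go-sql-driver/mysql"
)

func main() {
    // timeout bounds the TCP connect; readTimeout/writeTimeout bound each
    // individual I/O operation, so even a silent node can't stall us forever.
    dsn := "monitor:secret@tcp(10.0.0.5:3306)/?timeout=2s&readTimeout=2s&writeTimeout=2s"
    db, err := sql.Open("mysql", dsn)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()
    if err := db.Ping(); err != nil {
        log.Fatal(err) // connecting or I/O exceeded the configured limits
    }
    log.Println("node answered within the configured limits")
}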

May 02, 2017

OsmoDevCon 2017 Review

By Harald "LaF0rge" Welte

After the public user-oriented OsmoCon 2017, we also recently had the 6th incarnation of our annual contributors-only Osmocom Developer Conference: The OsmoDevCon 2017.

This is a much smaller group, typically about 20 people, and is limited to actual developers who have a past record of contributing to any of the many Osmocom projects.

We had a large number of presentations and discussions. In fact, so many that the schedule of talks extended from 10am to midnight on some days. While this is great, it also means that there was definitely too little time for more informal conversations, chatting or even actual work on code.

We also have such a wide range of topics and scope inside Osmocom that the traditional ad-hoc scheduling approach no longer seems to work as it used to. Not everyone is interested in (or has time for) all the topics, so we should group them according to their topic/subject on a given day or half-day. This would enable people to attend only those days that are relevant to them, and spend the remaining days in an adjacent room hacking away on code.

It's sad that we only have OsmoDevCon once per year. Maybe that's actually also something to think about. Rather than having 4 days once per year, maybe have two weekends per year.

Always in motion the future is.

Overhyped Docker

By Harald "LaF0rge" Welte

I've always been extremely skeptical of suddenly emerging over-hyped technologies, particularly if they advertise to solve problems by adding yet another layer to systems that are already sufficiently complex themselves.

There are of course many issues with containers, ranging from replicated system libraries to the basic underlying statement that you're giving up on the system package manager to properly deal with dependencies.

I'm also highly skeptical of FOSS projects that are primarily driven by one (VC-funded?) company. Especially if their offering includes a so-called cloud service which they can stop operating at any given point in time, or (more realistically) first get everybody to use and then start charging for.

But well, despite all the bad things I had read about it over the years, one day in May 2017 I finally thought: let's give it a try. My problem to solve as a test balloon is fairly simple.

My basic use case

The plan is to start OsmoSTP, the m3ua-testtool and the sua-testtool, which both connect to OsmoSTP. By running this setup inside containers on an internal network, we could then execute the entire testsuite e.g. during a Jenkins test run without IP address or port number conflicts. It could even run multiple times in parallel on one buildhost, verifying different patches as part of the continuous integration setup.

This application is not so complex. All it needs is three containers, an internal network and some connections in between. Should be a piece of cake, right?

But enter the world of buzzword-fueled web-4000.0 software-defined virtualised and orchestrated container NFV + SDN voodoo: it turns out to be impossible, at least with the preferred tools they advertise.

Dockerfiles

The part that worked relatively easily was writing a few Dockerfiles to build the actual containers. All based on debian:jessie from the library.

As m3ua-testsuite is written in guile, and needs to build some guile plugin/extension, I had to actually include guile-2.0-dev and other packages in the container, making it a bit bloated.

I couldn't immediately find a nice example Dockerfile recipe that would allow me to build stuff from source outside of the container, and then install the resulting binaries into the container. This seems to be a somewhat weak spot, where more support/infrastructure would be helpful. I guess the idea is that you simply install applications via package feeds and apt-get. But I digress.

So after some tinkering, I ended up with three docker containers:

  • one running OsmoSTP
  • one running m3ua-testtool
  • one running sua-testtool

I also managed to create an internal bridged network between the containers, so the containers could talk to one another.

However, I have to manually start each of the containers with ugly long command line arguments, such as docker run --network sigtran --ip 172.18.0.200 -it osmo-stp-master. This is of course sub-optimal, and what Docker Services + Stacks should resolve.

Services + Stacks

The idea seems good: A service defines how a given container is run, and a stack defines multiple containers and their relation to each other. So it should be simple to define a stack with three services, right?

Well, it turns out that it is not. Docker documents that you can configure a static ipv4_address [1] for each service/container, but it seems related configuration statements are simply silently ignored/discarded [2], [3], [4].

This seems to be related to the fact that, for some strange reason, stacks can (at least in later versions of Docker) only use overlay-type networks, rather than the much simpler bridge networks. And while bridge networks support static IP address allocation, overlay apparently doesn't.

I still have a hard time grasping that something that considers itself a serious product for production use (by a company with an estimated value of over a billion USD, not by a few hobbyists) has no support for running containers on static IP addresses. How many applications out there have I seen that require static IP address configuration? How much simpler do setups get if you don't have to rely on things like dynamic DNS updates (or DNS availability at all)?

So I'm stuck with having to manually configure the network between my containers, and manually starting them by clumsy shell scripts, rather than having a proper abstraction for all of that. Well done :/

Exposing Ports

Unrelated to all of the above: If you run some software inside containers, you will pretty soon want to expose some network services from containers. This should also be the most basic task on the planet.

However, it seems that the creators of Docker live in the early 1980s, when only the TCP and UDP transport protocols existed. They seem to have missed that by the late 1990s to early 2000s, protocols like SCTP and DCCP had been invented.

But yet, in 2017, Docker chooses to support only TCP and UDP when exposing ports, leaving no way to expose an SCTP service from a container.

Now some of the readers may think 'who uses SCTP anyway'. I will give you a straight answer: Everyone who has a mobile phone uses SCTP. This is due to the fact that pretty much all the connections inside cellular networks (at least for 3G/4G networks, and in reality also for many 2G networks) are using SCTP as underlying transport protocol, from the radio access network into the core network. So every time you switch your phone on, or do anything with it, you are using SCTP. Not on your phone itself, but by all the systems that form the network that you're using. And with the drive to C-RAN, NFV, SDN and all the other buzzwords also appearing in the Cellular Telecom field, people should actually worry about it, if they want to be a part of the software stack that is used in future cellular telecom systems.

Summary

After spending the better part of a day trying to do something that seemed like the most basic use case for running three networked containers using Docker, I'm back to step one: most likely inventing some custom scripts based on unshare to run my three test programs in a separate network namespace for isolated test suite execution as part of a Jenkins CI setup :/
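A sketch of what that fallback could look like in Go, using clone flags instead of shelling out to unshare (the binary path is a placeholder; this is Linux-only and needs root or CAP_SYS_ADMIN):

package main

import (
    "log"
    "os"
    "os/exec"
    "syscall"
)

func main() {
    // Start one test program in a fresh network namespace, the same effect
    // as wrapping it with unshare --net.
    cmd := exec.Command("/usr/local/bin/osmo-stp")
    cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWNET, // isolated network stack, no host conflicts
    }
    if err := cmd.Run(); err != nil {
        log.Fatal(err)
    }
}

One would still have to create veth pairs inside the namespace so the three programs can reach each other, but at least there are no port or address conflicts with the host.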

It's also clear that the Docker folks apparently don't care much about playing a role in the Cellular Telecom world, which is increasingly moving away from proprietary and hardware-based systems (like STPs) to virtualised, software-based systems.

[1]https://docs.docker.com/compose/compose-file/#ipv4address-ipv6address
[2]https://forums.docker.com/t/docker-swarm-1-13-static-ips-for-containers/28060
[3]https://github.com/moby/moby/issues/31860
[4]https://github.com/moby/moby/issues/24170

May 01, 2017

Book on Practical GPL Compliance

By Harald "LaF0rge" Welte

My former gpl-violations.org colleague Armijn Hemel and Shane Coughlan (former coordinator of the FSFE Legal Network) have written a book on practical GPL compliance issues.

I've read through it (in the bath tub of course, what better place to read technical literature), and I can agree wholeheartedly with its contents. For those who have been involved in GPL compliance engineering there shouldn't be much new - but for the vast majority of developers out there who have had little exposure to the bread-and-butter work of providing complete and corresponding source code, it makes an excellent introductory text.

The book focuses on compliance with GPLv2, which is probably not too surprising given that it's published by the Linux Foundation and that Linux is licensed under GPLv2.

You can download an electronic copy of the book from https://www.linuxfoundation.org/news-media/research/practical-gpl-compliance

Given that the subject matter is Free Software, and that the book is written by long-time community members, I cannot help but notice with a bit of surprise that the book is released under classic copyright, All rights reserved, with no freedoms for the user.

Considering the sensitive legal topics touched, I can understand the possible motivation by the authors to not permit derivative works. But then, there still are licenses such as CC-BY-ND which prevent derivative works but still permit users to make and distribute copies of the work itself. I've made that recommendation / request to Shane, let's see if they can arrange for some more freedom for their readers.

April 30, 2017

OsmoCon 2017 Review

By Harald "LaF0rge" Welte

It's already one week past the event, so I really have to sit down and write a review of the first public Osmocom Conference ever: OsmoCon 2017.

The event was a huge success, by all accounts.

  • We've not only been sold out, but we also had to turn down some last minute registrations due to the venue being beyond capacity (60 seats). People traveled from Japan, India, the US, Mexico and many other places to attend.
  • We've had an amazing audience ranging from commercial operators to community cellular operators to professional developers doing work related to Osmocom, academia, IT security crowds and last but not least enthusiasts/hobbyists, with whom the project[s] started.
  • I've received exclusively positive feedback from many attendees
  • We've had a great programme. Some part of it was of introductory nature and probably not too interesting if you've been in Osmocom for a few years. However, the work on 3G as well as the current roadmap was probably not as widely known yet. Also, I really loved to see Roch's talk about Running a commercial cellular network with Osmocom software as well as the talk on Facebook's OpenCellular BTS hardware and the Community Cellular Manager.
  • We had very professional live streaming + video recordings courtesy of the C3VOC team. Thanks a lot for your support and for having the video recordings of all talks online already the day after the event.

We also received some requests for improvements, many of which we will hopefully consider before the next Osmocom Conference:

  • have a multi-day event. Particularly if you're traveling long-distance, it is a lot of overhead for a single-day event. We of course fully understand that. On the other hand, it was the first Osmocom Conference, and hence it was a test balloon where it was initially unclear if we'd be able to get a reasonable number of attendees interested at all. And organizing an event with venue and talks for multiple days if in the end only 10 people attend would have been a lot of effort and financial risk. But now that we know there are interested folks, we can definitely think of a multi-day event next time.
  • Signs indicating venue details on the last meters. I agree, this could have been better. The address of the venue was published, but we could have had some signs/posters at the door pointing you to the right meeting room inside the venue. Sorry for that.
  • Better internet connectivity. This is a double-edged sword. Of course we want our audience to be primarily focused on the talks and not distracted :P I would hope that most people are able to survive a one day event without good connectivity, but for sure we will have to improve in case of a multiple-day event in the future

In terms of my requests to the attendees, I only have two:

  • Participate in the discussions on the schedule/programme while it is still possible to influence it. When we started to put together the programme, I posted about it on the openbsc mailing list and invited feedback. Still, most people seem to have missed the time window during which talks could have been submitted and the schedule still influenced before finalizing it
  • Register in time. We had almost no registrations until about two weeks ahead of the event (and I was considering cancelling it), and then suddenly were sold out in the week before the event. We've had people who had already booked their travel, only to learn that the tickets were sold out. I guess we will introduce early bird pricing and add a very expensive last minute ticket option next year in order to increase the motivation to register early and thus give us flexibility regarding venue planning.

Thanks again to everyone involved in OsmoCon 2017!

Ok, now, all of you who missed the event: Go to https://media.ccc.de/c/osmocon17 and check out the recordings. Have fun!