SANTA CLARA, Calif. - FPGA vendors and users will meet next month in an effort to define a standard software interface for accelerators. The meeting is being convened by Red Hat’s chief ARM architect, who on Wednesday gave an update on efforts to establish ARM servers.
“There’s a trend towards high-level synthesis so an FPGA programmer can write in OpenCL up front but the little piece that’s been ignored is how OpenCL talks to Linux,” said Jon Masters, speaking at the Linley Data Center event here.
OS companies don’t ship drivers for OpenCL, so software developers need to understand the intimate details of the FPGA as well as the Linux kernel to make the link. Often that work also involves developing a custom direct-memory access engine and fine-tuning Java libraries.
Masters did just that as part of a test board called Trilby that ran a simple search algorithm on an FPGA mounted on a PCI Express card. “Ninety percent of the effort is interface to the FPGA,” he said.
To fix the problem, Masters has called a meeting of interested parties in March. It will be hosted by a neutral organization. He hopes to have “all the right players” involved, including major FPGA vendors.
If the meeting is successful, the group will hammer out “in the open” one or more interfaces for standard OS drivers so users can load and configure an FPGA bitstream. It’s a significant hole, and not the only one on the road to taking FPGA accelerators into mainstream markets, according to Masters.
FPGAs also need to become full citizens in the software world of virtualized functions, where telcos in particular are rallying around new standards for network functions virtualization. Separately, programmers are using high-level synthesis, especially with OpenCL, to write code for FPGAs. However, experts are still needed to map and optimize the results of synthesis to the underlying hardware, he said.
Graphics processors present their own challenges. “GPU [vendors] are not open to having an open interface so you can write a patent-free OS driver, so people have to develop their own interfaces,” he said.
On ARM-based servers, the focus of Masters’ day job, he said, “We have all the software pieces, it’s a matter of quality engineering and the out-of-box experience.”
Specifically, current servers have some hardware glitches such as non-standard mobile PCI Express blocks in their SoCs or a few lines of tweaked code in their UEFI (Unified Extensible Firmware Interface) firmware. The anomalies prevent systems from running Linux without some fussy fiddling around — the kind of thing data center operators shouldn’t have to do. In that regard, “one of my goals is to make ARM servers boring,” he said.
“Take a look at what Qualcomm is doing with its developer system — they have a phenomenally good out-of-box experience,” Masters said.
A variety of boards are available using Applied Micro, AMD and Cavium SoCs. However, many are not the standard dual-socket systems popular in the x86 world, and none carry a tier-one brand.
So Red Hat Linux 7.2 is currently only available in a developer preview for ARM. But a standard Linux OS tested on multiple, shipping servers “is very close now,” Masters said.
After a couple of years of hard work, most of the standard open-source server software is running on ARM, including a LAMP stack, two Java virtual machines, the Xen and KVM hypervisors and Linux variants from SUSE and Canonical. Work is still in progress to polish off Docker containers and the Ceph storage stack as well as Armband, a version of open-source Network Functions Virtualization.
Bloomberg is trying out a version of OpenStack on ARM. Last year, Masters ran a version of Apache Spark, a big data analytics engine, on ARM. “You have to show people this works. We did this in a weekend,” he said, pointing to an online demo.
As if he didn’t have enough to do, Masters also needs to keep pace with his adversary, the world’s biggest semiconductor company.
Two years ago, Intel coined the term “microserver…to keep ARM in a box,” he said. “Every single design I’ve seen is highly performant, they may not always win every speed race but they are good enough,” he added.
These days he gets questions from analysts about Intel’s acquisition of Altera and its pending new 3D XPoint memory for servers. He claims integrated FPGAs and a new server memory tier tied to the x86 won’t diminish the “disruptive opportunity” for ARM servers, but the question is a good one.
On the other side of the fight, China is engaged with ARM servers. The province of Guizhou put up half the $280 million behind a new venture with Qualcomm to establish its ARM server SoC in China. HiSilicon, the chip division of Huawei, demoed its PhosphorV660 SoC packing 16 to 32 Cortex-A57 cores running at up to 2.1 GHz with 1 MB of L2 cache and 32 MB of L3 cache. And an ambitious Chinese startup entered the fray last year.
It’s a marathon Masters has been running for five years so far. He helped get Linux running on Applied’s first FPGA simulation of X-Gene. He rallied for the start of the Linaro Enterprise Group working on ARM server software.
In 2012, Masters created a demo of an HP Moonshot system powered by a bike to show its low-power prowess. Later he helped develop a Java virtual machine for ARM. A true geek, he claims he has as many as 20 ARM server boards at home.