
Facebook is Working on Its Own Custom AI Silicon

Facebook is the latest firm to discuss its own efforts to build custom AI processors. Is the era of the general purpose CPU drawing to a close?
By Joel Hruska

Facebook's chief AI researcher, Yann LeCun, has stated that the company is working on its own custom AI silicon, with the goal of processing neural networks in hardware far more efficiently, boosting performance and energy efficiency and expanding the range of problems the chips can address.

"We don’t want to leave any stone unturned, particularly if no one else is turning them over," he said in an interview before presenting a paper on the history and future of machine learning at ISSCC (International Solid State Circuits Conference) in San Francisco. Exact details on what Facebook is building remain vague, though Intel announced an AI-focused partnership with the company at CES this year.

Fortune, however, learned some of the themes of the presentation LeCun intended to give: expanding the role of AI from language translation to content policing, creating smarter devices that can differentiate between, say, weeds and roses, and giving computers what we typically call "common sense." Fortune uses the example of an elephant, noting that it's much easier to teach a small child what an elephant is than to present the same example to a computer.

Bloomberg chimes in with information from the other side of the equation. According to its reporting, LeCun is focused on creating chips that don't have to break data sets into small batches for processing, but can instead work with much larger amounts of information in one pass. This would seem to dovetail with the goal of teaching an AI-powered lawn maintenance device to differentiate between weeds and roses. If you want to mow an area (or vacuum a carpet), you don't need to teach the device nearly as much about what to mow or clean as you would if you wanted it to specifically avoid non-weed plants. The literal definition of a weed is "a wild plant growing where it is not wanted." The implication of a lawn mower that can target weeds but avoid roses is a lawn mower that understands which plants are wanted in a given geographical context. This is a task that even humans can fail at, as my own miserable gardening efforts would attest.
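To make the batching point concrete, here's a minimal sketch (plain NumPy, purely illustrative -- none of this reflects Facebook's actual designs) of the kind of mini-batch chunking that today's hardware typically forces on large data sets. The chips LeCun describes would, in effect, let the hardware skip this loop and operate on much larger spans of data at once.

```python
import numpy as np

# Illustrative only: a large data set pushed through a toy single-layer model.
data = np.random.rand(10_000, 128).astype(np.float32)   # 10,000 samples, 128 features
weights = np.random.rand(128, 10).astype(np.float32)    # toy model parameters

# Conventional approach: split the data into small mini-batches so each
# chunk fits comfortably in the accelerator's memory, then stitch the
# results back together afterward.
batch_size = 64
outputs = []
for start in range(0, len(data), batch_size):
    batch = data[start:start + batch_size]   # one small slice at a time
    outputs.append(batch @ weights)          # process it independently
result = np.concatenate(outputs)

# Hardware that can ingest far larger blocks of data would collapse this
# loop -- conceptually, just: result = data @ weights
```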

More broadly, the growth in all this AI silicon -- and there are now leading efforts at multiple major companies and a veritable swarm of smaller firms that have launched in this space -- is part of an effort to replace the conventional general-purpose scaling of Moore's law with domain-specific architectures that offer larger improvements in specialized workloads.

Google's TPU is one example of a domain-specific architecture.

To understand why this is happening now, you first need to know the catastrophic damage Moore's law, Dennard scaling, and economies of scale delivered to the specialty microprocessor market in the first place. In the early days of computing, specialty architectures were just called "architectures," because every computer was a virtual island unto itself, with its own operating system, software libraries, and compatible hardware peripherals. Over time, manufacturers began to emphasize compatibility between hardware families, with common software building blocks and peripherals. Even into the 1980s, it was common for third-party companies to design FPUs compatible with Intel's desktop parts of the day, for example.

The problem with specialty microprocessor architectures, historically speaking, is that even if you had an idea for a particularly clever way to execute a specific type of instruction, the speed of general-purpose computation was accelerating quickly enough to eat most of your market advantage before your product could be built. Imagine starting a company in 1990 with a chip 5x faster in a particular workload than anything Intel was shipping. In 1990, the fastest CPU from Intel was the 33MHz 486DX. If it took three years to bring your part to market, you'd be up against the 66MHz Pentium, a CPU more than 2x faster than your initial comparison point thanks to clock and instruction set improvements. If it took four years, you'd have been up against the 100MHz Pentium. Intel, meanwhile, enjoyed economies of scale that no custom architecture vendor could match.
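A hypothetical back-of-the-envelope calculation (the multipliers below are illustrative assumptions, not benchmark data) shows how quickly that kind of advantage evaporates:

```python
# Illustrative arithmetic only: how a 5x specialty-chip advantage over the
# 33MHz 486DX erodes while the part is still being brought to market.
specialty_advantage = 5.0  # vs. the 1990 baseline, in one specific workload

# Rough relative general-purpose performance (clock speed plus per-clock
# gains) of Intel's fastest shipping part in each year -- assumed figures.
intel_relative_performance = {
    1990: 1.0,  # 33MHz 486DX (baseline)
    1993: 2.2,  # 66MHz Pentium: more than 2x via clock and architecture
    1994: 3.3,  # 100MHz Pentium
}

for year, gain in intel_relative_performance.items():
    remaining_edge = specialty_advantage / gain
    print(f"{year}: specialty chip's remaining edge ~ {remaining_edge:.1f}x")
```

By the time the specialty part ships, its headline 5x advantage has been cut by more than half -- and it only ever applied to one workload, against a competitor with vastly better economies of scale.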

This one-two punch of unbeatable economies of scale and rapid-fire compute improvements explains why general-purpose computation took over the market from specialty architectures and why it has maintained its lock on the market ever since. GPUs are the major exception to this trend. The reason they're such an exception is that the nature of a graphics workload is so different from a general-purpose computational workload that you'd never build a GPU to handle the tasks of a serial CPU or vice versa. The closest we ever saw to a commercial architecture intended to handle both was Sony's Cell Broadband Engine, and Cell was, by every account, miserably hard to program if you actually wanted good CPU performance.

But CPU performance scaling has been stuck in the doldrums since Sandy Bridge, with Intel's best efforts wringing out a few percent per year. This, more than anything, explains why Google, Facebook, and other companies are seriously considering their own architectures for specific workloads. So long as Intel (or AMD, IBM, or any other general-purpose CPU vendor) could kick out double-digit performance improvements every 12-18 months, investing in a 3-5 year architectural research project was too risky to justify. Now that these vendors can no longer deliver such improvements, their customers are reevaluating their own best interests.

GPUs are, to be clear, expected to power the AI and ML revolution for at least part of the foreseeable future. This undoubtedly pleases Nvidia, which effectively owns the market for these products today. But domain-specific architectures like Google's TPU aren't going to go away.

Intel is already moving to address these concerns. Many of the firm's major acquisitions in recent years are at least tangentially related to the AI market, including Altera and Movidius. AMD has mostly focused on regaining previously lost market share -- its 7nm GPUs are theoretically capable of running AI and ML workloads, but Nvidia dominates this space with CUDA, and OpenCL support for AI/ML is very thin on the ground. AMD is not seen as a contender in these markets, at least not according to anyone I've spoken to who actually works in the field. Given that CUDA is an Nvidia-specific language, it's not clear what AMD can do to change this; its efforts to provide compatibility via a CUDA wrapper do not seem to have yielded the hoped-for results thus far.

Facebook's goal of improving AI energy usage and expanding the types of problems it can solve dovetails with the research we're seeing from other firms. Collectively, these efforts are a significant threat to profits in the x86 CPU market, not because CPUs will be replaced -- you'll always need a general-purpose machine of some sort, be it ARM or x86 -- but because the high-margin markets that CPUs currently sell into could find those needs covered by other products.
