Our Machines Now Have Knowledge We’ll Never Understand

Artificial intelligence is making the limits of human knowledge painfully obvious.
Alien Knowledge: When Machines Justify Knowledge

The new availability of huge amounts of data, along with the statistical tools to crunch these numbers, offers a whole new way of understanding the world. Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.

So wrote Wired’s Chris Anderson in 2008. It kicked up a little storm at the time, as Anderson, the magazine’s editor, undoubtedly intended. For example, an article in a journal of molecular biology asked, “…if we stop looking for models and hypotheses, are we still really doing science?” The answer clearly was supposed to be: “No.”

But today — not even a decade since Anderson’s article — the controversy sounds quaint. Advances in computer software, enabled by our newly capacious, networked hardware, are enabling computers not only to start without models — rule sets that express how the elements of a system affect one another — but to generate their own, albeit ones that may not look much like what humans would create. It’s even becoming a standard method, as any self-respecting tech company has now adopted a “machine-learning first” ethic.

We are increasingly relying on machines that derive conclusions from models that they themselves have created, models that are often beyond human comprehension, models that “think” about the world differently than we do.

But this comes with a price. This infusion of alien intelligence is bringing into question the assumptions embedded in our long Western tradition. We thought knowledge was about finding the order hidden in the chaos. We thought it was about simplifying the world. It looks like we were wrong. Knowing the world may require giving up on understanding it.

Models Beyond Understanding

In a series on machine learning, Adam Geitgey explains the basics, from which this new way of “thinking” is emerging:

[T]here are generic algorithms that can tell you something interesting about a set of data without you having to write any custom code specific to the problem. Instead of writing code, you feed data to the generic algorithm and it builds its own logic based on the data.

For example, you give a machine learning system thousands of scans of sloppy, handwritten 8s and it will learn to identify 8s in a new scan. It does so, not by deriving a recognizable rule, such as “An 8 is two circles stacked vertically,” but by looking for complex patterns of darker and lighter pixels, expressed as matrices of numbers — a task that would stymie humans. In a recent agricultural example, the same technique of numerical patterns taught a computer how to sort cucumbers.
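
To make that concrete, here is a minimal sketch of feeding data to a generic algorithm instead of writing problem-specific code. It uses scikit-learn’s bundled 8x8 digit scans and a small off-the-shelf neural network classifier; the dataset, classifier, and layer size are stand-ins chosen for illustration, not the systems described in this article.

```python
# A minimal sketch of feeding data to a generic algorithm rather than
# writing rules by hand. The classifier learns pixel-intensity patterns;
# nothing in this code says "an 8 is two circles stacked vertically."
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()  # each scan is flattened into 64 pixel values
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
model.fit(X_train, y_train)  # whatever "logic" it builds ends up in learned weights

print("accuracy on unseen scans:", model.score(X_test, y_test))
print("the model itself is just matrices of numbers:",
      [w.shape for w in model.coefs_])  # e.g. [(64, 32), (32, 10)]
```

No line of this code describes what an 8 looks like; whatever the program “knows” about 8s is encoded entirely in those matrices of learned numbers.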

Then you can take machine learning further by creating an artificial neural network that models in software how the human brain processes signals.[1] Nodes in an irregular mesh turn on or off depending on the data coming to them from the nodes connected to them; those connections have different weights, so some are more likely to flip their neighbors than others. Although artificial neural networks date back to the 1950s, they are truly coming into their own only now because of advances in computing power, storage, and mathematics. The results from this increasingly sophisticated branch of computer science can be deep learning that produces outcomes based on so many different variables under so many different conditions being transformed by so many layers of neural networks that humans simply cannot comprehend the model the computer has built for itself.
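
A toy forward pass can make the idea of layered, weighted nodes tangible. The two layers and random weights below are invented purely for illustration; a real deep network has many more layers and learns its weights from data rather than conjuring them at random.

```python
# A toy forward pass through a two-layer network: the "model" is nothing
# but these weight matrices. Sizes and values are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(64, 32))   # weights from 64 inputs to 32 hidden nodes
W2 = rng.normal(size=(32, 10))   # weights from hidden nodes to 10 outputs

def forward(inputs):
    hidden = np.maximum(0, inputs @ W1)  # a node "turns on" when its weighted input is positive
    return hidden @ W2                   # scores for each possible output

scores = forward(rng.normal(size=64))
print("chosen output:", scores.argmax())
# Every number in W1 and W2 contributed to that answer; no single weight,
# read on its own, resembles anything a human would call a reason.
```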

Yet it works. It’s how Google’s AlphaGo program came to defeat the third-highest ranked Go player in the world. Programming a machine to play Go is more than a little more daunting than sorting cukes, given that the game has 10^350 possible moves; there are 10^123 possible moves in chess, and 10^80 atoms in the universe. Google’s hardware wasn’t even as ridiculously overpowered as it might have been: It had only 48 processors, plus eight graphics processors that happen to be well-suited for the required calculations.

AlphaGo was trained on thirty million board positions that occurred in 160,000 real-life games, noting the moves taken by actual players, along with an understanding of what constitutes a legal move and some other basics of play. Using deep learning techniques in which each layer of the neural network refines the patterns recognized by the layer before it, the system trained itself on which moves were most likely to succeed.

Although AlphaGo has proven itself to be a world class player, it can’t spit out practical maxims from which a human player can learn. The program works not by developing generalized rules of play — e.g., “Never have more than four sets of unconnected stones on the board” — but by analyzing which play has the best chance of succeeding given a precise board configuration. In contrast, Deep Blue, the dedicated IBM chess-playing computer, has been programmed with some general principles of good play. As Christof Koch writes in Scientific American, AlphaGo’s intelligence is in the weights of all those billions of connections among its simulated neurons. It creates a model that enables it to make decisions, but that model is ineffably complex and conditional. Nothing emerges from this mass of contingencies, except victory against humans.

As a consequence, if you, with your puny human brain, want to understand why AlphaGo chose a particular move, the “explanation” may well consist of the networks of weighted connections that then pass their outcomes to the next layer of the neural network. Your brain can’t remember all those weights, and even if it could, it couldn’t then perform the calculation that resulted in the next state of the neural network. And even if it could, you would have learned nothing about how to play Go, or, in truth, how AlphaGo plays Go—just as internalizing a schematic of the neural states of a human player would not constitute understanding how she came to make any particular move.

Go is just a game, so it may not seem to matter that we can’t follow AlphaGo’s decision path. But what do we say about the neural networks that are enabling us to analyze the interactions of genes in two-locus genetic diseases? How about the use of neural networks to discriminate the decay pattern of single and multiple particles at the Large Hadron Collider? How about the use of machine learning to help identify which of the 20 climate change models tracked by the Intergovernmental Panel on Climate Change is most accurate at any point? Such machines give us good results — for example: “Congratulations! You just found a Higgs boson!” — but we cannot follow their “reasoning.”

Clearly our computers have surpassed us in their power to discriminate, find patterns, and draw conclusions. That’s one reason we use them. Rather than reducing phenomena to fit a relatively simple model, we can now let our computers make models as big as they need to. But this also seems to mean that what we know depends upon the output of machines the functioning of which we cannot follow, explain, or understand.

Since we first started carving notches in sticks, we have used things in the world to help us to know that world. But never before have we relied on things that did not mirror human patterns of reasoning — we knew what each notch represented — and that we could not later check to see how our non-sentient partners in knowing came up with those answers. If knowing has always entailed being able to explain and justify our true beliefs — Plato’s notion, which has persisted for over two thousand years — what are we to make of a new type of knowledge, in which that task of justification is not just difficult or daunting but impossible?

Two Models of Models

In 1943, the US Army Corps of Engineers set Italian and German prisoners of war to work building the largest scale model in history: 200 acres representing the 41 percent of the United States that drains into the Mississippi River. By 1949 it was being used to run simulations to determine what would happen to cities and towns along the way if water flooded in from this point or that. It’s credited with preventing flooding in Omaha in 1952 that could have caused $65 million in damage.[2] In fact, some claim its simulations are more accurate than the existing digital models.

Water was at the heart of another famous physical model: the MONIAC (Monetary National Income Analogue Computer) economic simulator built in 1949 by the New Zealand economist Alban William Housego Phillips.[3] The MONIAC used colored water in transparent pipes to simulate the effects of Keynesian economic policies. It was, alas, not as reliable as the Mississippi River simulator, presumably because it did not account for all the variables that influence the state of a national economy. But the flow of water through a river the size of the Mississippi is also affected by more variables than humans can list. So how could the Mississippi model get predictions right within fractions of an inch in the real world?

You don’t have to understand everything about fluid dynamics if you want to predict what will happen if you place a boulder on the edge of a rapids: You can just build a scale model that puts a small rock into a small flow. So long as scale doesn’t matter, your model will give you your answer. As Stanford Gibson, a senior hydraulic engineer with the Army Corps of Engineers, said about the Mississippi Basin project, “The physical model will simulate the processes on its own.”

MONIAC used water flow to model an economic theory with “tanks representing households, business, government, exporting and importing sectors of the economy,” measuring income, spending, and GDP. The number of variables it considered was limited by the number of valves, tubes, and tanks that could fit in a device about the size of a refrigerator.[4]

The Mississippi River basin model seems to make no assumptions about the factors that affect floods, other than that floods won’t occur unless you put more water into the system. But of course that’s not quite true. The model assumes that what happens at full scale also happens at 1/2000 scale. In fact, the model was built at 1/2000 horizontally but on a vertical scale of 1/100 to “ensure that topographic shifts would be apparent,” resulting in the Rockies rising out of scale, 50 feet above the ground. The model makers assumed, apparently correctly, that the height of the mountains would not affect the outcomes of their experiments. Likewise, they did not simulate the position of the moon or grow miniature crops in the fields because they assumed those factors were not relevant.

So, the “theory-free” model of the Mississippi works not simply because “the physical model will simulate the processes on its own” but because the physical model has assumptions built into it about what counts, and those assumptions provide accurate results for the purposes for which it was built. Using the Mississippi model to simulate the effects of climate change or the effect of paddle wheelers on algae growth probably won’t give reliable results, for those effects are likely affected by other factors not in the model and because the effects are sensitive to scale.

Even where the Mississippi model does work, we don’t understand exactly why or how. It wasn’t constructed based on a mathematical model of the Mississippi River basin and it works without yielding such a model. Indeed, it works because it doesn’t require us to understand it: It lets the physics of the simulation do its job without imposing the limitations of human reason on it. The result is a model that is more accurate than one like MONIAC that was constructed based on human theory and understanding.

Until machine learning, we’ve had no choice but to manually design the models that computers then implement. We assumed that the path to increased predictive power meant making the models more detailed and accurate, while accumulating more and better data for those handcrafted models to operate on. Because the models came from human minds, knowledge and understanding would go hand in hand.

But it turns out that that assumption was based on an unexpressed premise.

The Knowability Premise

In the Galileo Museum in Florence, a beautiful armillary from 1593 looms large in its room. It consists of metal and gilded wooden gears nested within gears nested within gears nested within an external layer of circles. Set its outer meridian ring to be “perpendicular to the horizon, and parallel to the actual meridian” and orient it by sighting the sun or a known star, and it will accurately show the position of celestial bodies. This is a model that produces reliable knowledge about where objects show up in the Earth’s skies, but it does so using a model that is thoroughly wrong.

Such armillaries hew to the ancient Greek understanding that the Earth is the center of the universe and that the celestial bodies move in perfect circles. To model the non-circular, eccentric movement of the planets across our sky, circular gears had to be connected in complex ways to other circular gears.

The ancient understanding strikes us now as charming, like the Music of the Spheres that understanding also supposed. But its most fundamental assumption is still very much with us: The condition for knowing the world is that the world be knowable. If there were no similarities among entities, no laws that hold across all instances, no meaningful categories of object, no way of finding a simplicity beneath the differences, then we would be left with an unknowable chaos.

Fortunately, we do not find ourselves in such a world. Thanks to the likes of Kepler, Copernicus, Galileo, and Newton, we can not only predict the positions of heavenly bodies better than the best armillary can, but also know the world more than ever: There is a handful of laws that are simple enough for us to be able to discover and understand them. These laws apply everywhere and to everything. They represent truths about the universe.

It has been important to us that the model that produces knowledge also accurately reflect how the world works. Even if the armillary yielded precisely the same results as Newton’s laws, we would insist that the pre-Newton model was simply wrong. The ancients, we’d insist, didn’t understand how the world works because the model that expresses the relationships among the parts does not reflect the actual state of affairs.

We have insisted that the model reflect the world because we have assumed that the world the model reflects is knowable.

But now we have a different sort of model. Like traditional models, these new ones enable us to make predictions that are true. Like traditional models, they advance knowledge. But some of the new models are incomprehensible. They can exist only in the weights of countless digital triggers networked together and feeding successive layers of networked, weighted triggers representing huge quantities of variables that affect one another in ways so particular that we cannot derive general principles from them.

The success of these models may be showing us an uncomfortable truth uncontemplated by the ancients and the tradition that sprang from them.

Post-Scarcity Computing

For their first fifty years, computers assumed scarcity. They were famous for it. The minimal information required for a purpose was gathered and structured into records that were the same for each instance. That limitation was built into computers’ initial ingestion medium: punch cards. These cards turned information into a spatial array that could be read because the array and its encoding were uniform. That uniformity squeezed out differences, peculiarities, exceptions, and idiosyncrasies…the stuff of life, as beatniks and other malcontents recognized from the start.

Of course, one could ask why punch cards were the chosen mechanism. At least part of the answer is historical: Herman Hollerith founded the company that became IBM on his use of punch cards to automate the tallying of the 1890 census. A census by its nature reduces citizens to a relative handful of shared categories — democracy at work. Punch cards had in turn been developed in the early 19th century as a way of controlling the patterns woven by Jacquard looms. They were not designed to carry information any more than a gear is, and they carried into the computer age the reductive, repetitive parsimony of Industrial Age machine design.

Over the years computers have scaled up the amount of information they can manage, but the transformative change occurred once we plugged them into a global, public network. A computer’s capacity now includes all the information available on the internet. That information includes not just the contents of vast data repositories, but the staccato input of sensors distributed across the land, seas, and heavens. The blessedly unregulated structure of all that information has necessitated the development of standards and protocols for dealing with data outside of the formats one has anticipated, preserving the differences rather than sacrificing them on the altar of uniformity. For example, NoSQL databases allow records to vary from object to object in terms of which fields they capture. And Tim Berners-Lee, the creator of the World Wide Web, coined the name “Linked Data” for a way of representing information that ignores the concept of records entirely, enabling every nuance about a topic to be expressed in reusable form.
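
As a rough sketch of both ideas, and not of any particular product’s interface: the first structure below mimics a document store in which records are free to differ field by field, while the second breaks the same sort of information into Linked-Data-style statements that dispense with records altogether. The field names and statements are invented for the example.

```python
# Records that vary from object to object, in the spirit of a document store.
records = [
    {"id": 1, "name": "Mississippi Basin Model", "horizontal_scale": "1:2000"},
    {"id": 2, "name": "MONIAC", "medium": "colored water", "built": 1949},
]

# The same kind of information as subject-predicate-object statements,
# in the spirit of Linked Data: no record boundaries, every nuance reusable.
statements = [
    ("MONIAC", "builtBy", "A. W. H. Phillips"),
    ("MONIAC", "models", "Keynesian economic policy"),
    ("A. W. H. Phillips", "workedIn", "economics"),
]

for record in records:
    print(sorted(record.keys()))          # each record captures different fields
for subject, predicate, obj in statements:
    print(f"{subject} --{predicate}--> {obj}")
```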

These new capacities have led the Network Age to a very different conception of information than the Computer Age held. These days we talk about information less as a resource held in storage containers than as streams, a metaphor just about perfectly opposed to the embodiment of information in punch cards. Information flows over and around us with all the orderliness of water cascading around boulders in a rapids. Our computers can map the cascade of interactions caused by the presence of chemicals at a cell wall, but the causal chains may be so twisty that the human brain cannot recall them or predict them.

This is what the world looks like when the medium is a global network, not a stack of eighty-column punch cards. With prep work done by Chaos Theory, the capaciousness and connectedness of our machines is letting us acknowledge just how complex and contingent our world is.

For example, Kevin Heng, in an excellent article in American Scientist, points to multiscale problems: “Small disturbances in a system” that “show up as big effects across myriad sizes and time scales.” He gives an instance:

Planet formation is an inherently multiscale problem: Uncertainties on microscopic scales, such as how turbulence and the seed particles of dust grains are created, hinder our ability to predict the outcome on celestial scales. Likewise, to understand the appearance of … exoplanets’ atmospheres requires approximating them as fluids and understanding the macroscopic manifestations of the quantum mechanical properties of the individual molecules: their absorption and scattering properties.

The behavior of each molecule is law-like, but, Heng points out, “there will always be phenomena operating on scales smaller than the size of one’s simulation pixel.” That is, no matter what scale you choose to make your digital or analog model of the Mississippi River Basin, it will gloss over what’s happening at the next smaller scale — and, one might add, maybe at the next scale up.

Models are always reductive: They confine the investigation to the factors we can observe and follow. For thousands of years we acted as if the simplicity of our models reflected the simplicity — the elegance, the beauty, the pure rationality — of the universe. Now our machines are letting us see that even if the rules are simple, elegant, beautiful and rational, the domain they govern is so granular, so intricate, so interrelated, with everything causing everything else all at once and forever, that our brains and our knowledge cannot begin to comprehend it. It takes a network of humans and computers to know a world so thoroughly governed by contingency — one in which it’s chaos all the way down. And up.

Foreswearing Knowledge

Back at the beginning of Western culture’s discovery of knowledge, Plato told us that it’s not enough for a belief to be true because then your uninformed, lucky guess about which horse will win the Preakness would have to count as knowledge. That’s why knowledge in the West has consisted of justifiable true beliefs — opinions we hold for a good reason.

Our new reliance on inscrutable models as the source of the justification of our beliefs puts us in an odd position. If knowledge includes the justification of our beliefs, then knowledge cannot be a class of mental content, because the justification now consists of models that exist in machines, models that human mentality cannot comprehend.

One reaction to this could be to back off from relying upon computer models that are unintelligible to us so that knowledge continues to work the way that it has since Plato. This would mean foreswearing some types of knowledge. We foreswear some types of knowledge already: The courts forbid some evidence because allowing it would give police an incentive for gathering it illegally. Likewise, most research institutions require proposed projects to go through an institutional review board to forestall otherwise worthy programs that might harm the wellbeing of their test subjects.

We have already begun to define the realms in which machine justification has too high a social cost. For example, Andrew Jennings, the Senior VP of Scores and Analytics at FICO, the credit scoring company, told me: “There are a number of long standing rules and regulations around credit scoring in the US and elsewhere as a result of legislation that require people who build credit scores to manage the tradeoff between things that are predictively useful and legally permitted.” Machine learning algorithms might discover, for example, that Baptists generally are good credit risks but Episcopalians are not. Even if this example were true, that knowledge could not be used in computing a credit score because U.S. law prevents discrimination on the basis of religion or other protected classes. Credit score companies are also prohibited from using data that is a surrogate for these attributes, such as subscribing to Baptist Week.

There are additional constraints on the model credit score companies can use to calculate credit risk. If a lender declines a credit application, the lender has to provide the reasons why the applicant’s score was not higher. To meet this requirement, FICO makes the explanations as actionable as possible by the consumer. For example, Jennings explained, an applicant might be told, “Your score was low because you’ve been late paying off your credit cards eight times in the past year.”
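
A toy scorecard, which is emphatically not FICO’s actual model or point values, shows why a manually built model makes that kind of reason-giving straightforward: every factor contributes explicit points, so the reasons a score is low can be read directly off the model.

```python
# A toy scorecard in the spirit of a manually constructed credit model.
# The factors, point values, and base score are invented for illustration.
SCORECARD = {
    "late_payments_past_year": lambda n: -15 * n,
    "years_of_credit_history": lambda y: 5 * min(y, 10),
    "credit_utilization_pct":  lambda u: -max(0, u - 30),
}
BASE = 700

def score(applicant):
    contributions = {name: f(applicant[name]) for name, f in SCORECARD.items()}
    total = BASE + sum(contributions.values())
    reasons = sorted(contributions, key=contributions.get)[:2]  # biggest drags on the score
    return total, reasons

print(score({"late_payments_past_year": 8,
             "years_of_credit_history": 3,
             "credit_utilization_pct": 80}))
# -> (545, ['late_payments_past_year', 'credit_utilization_pct'])
# A deep network trained on the same data has no such per-factor ledger to point to.
```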

But suppose FICO’s manually created models turned out to be less predictive of credit risk than a neural network would be. In fact, Jennings says that they recently compared prototype FICO Scores derived via machine learning techniques with the results from the manual model, and found that the difference between those scores, using the same superset of input variables, was insignificant. But the promise of machine learning is that there are times when the machine’s inscrutable models will be far more predictive than the manually constructed, human-intelligible ones. In those cases, our knowledge — if we choose to use it — will depend on justifications that we simply cannot understand.

But, for all the success of machine learning models, we are now learning to be skeptical as well. The paradigmatic failures seem to be ones in which the machine justification has not escaped its human origins enough.

For example, a system that was trained to evaluate the risks posed by individuals up for bail let hardened white criminals out while keeping in jail African Americans with less of a criminal record. The system was learning from the biases of the humans whose decisions were part of the data. The system the CIA uses to identify targets for drone strikes initially suggested a well-known Al Jazeera journalist because the system was trained on a tiny set of known terrorists. Human oversight is obviously still required, especially when we’re talking about drone strikes instead of categorizing cucumbers.

Mike Williams, a research engineer at Fast Forward Labs, a data analytics company, said in a phone interview that we need to be especially vigilant about the prejudices that often, and perhaps always, make their way into which data sets are considered important and how those data are gathered. For example, a recent paper discusses a project that used neural networks to predict the probability of death for patients with pneumonia, so that low-risk patients could be treated as outpatients. The results were generally more accurate than those that came from handcrafted models that applied known rules to the data. But the neural network clearly indicated that asthmatic pneumonia patients are at low risk of dying and thus should be treated as outpatients. This contradicts what caregivers know, as well as common sense. It turns out that the finding was caused by the fact that asthmatic patients with pneumonia are immediately put into intensive care units, resulting in excellent survival rates. But obviously that does not mean they should be sent home. On the contrary. It takes a human eye to spot this sort of error.
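
A stylized example, with invented numbers rather than the study’s data, shows how a treatment effect hidden in the training data can teach a model exactly the wrong lesson.

```python
# Invented outcomes: asthmatic pneumonia patients were rushed to intensive
# care, so few of them died. A model trained on outcomes alone "learns"
# that asthma lowers risk, and would send exactly the wrong patients home.
def death_rate(group):
    return sum(1 for _, died in group if died) / len(group)

# (has_asthma, died) pairs
patients = ([(True, False)] * 95 + [(True, True)] * 5 +
            [(False, False)] * 80 + [(False, True)] * 20)

asthmatic = [p for p in patients if p[0]]
others = [p for p in patients if not p[0]]
print("death rate, asthmatic patients:", death_rate(asthmatic))  # 0.05
print("death rate, other patients:    ", death_rate(others))     # 0.20
```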

Cathy O’Neil, author of the recent book Weapons of Math Destruction, points to implicit biases in the values that determine which data sets we use to train a computer. When I spoke to her, she gave me an example of someone looking for the best person to fill a job, with one desired trait being “someone who stays for years and gets promotions.” Using machine learning algorithms for this, you might end up always hiring men, since women tend to stay at jobs for shorter intervals. The same is true, she says, about blithely using machine learning to identify bad teachers in public school systems. What constitutes a bad teacher? Her class’s average score on standardized tests? How many students go on to graduate? Attend college? Make money? Live happy and fulfilled lives? Humans might work this out, but machine learning algorithms may well reinstitute biases implicit in the data we’ve chosen to equip them with.

So, we are likely to go down both tracks simultaneously. On the one hand, we will continue our tradition of forbidding some types of justification in order to avoid undesirable social consequences. Simultaneously, we are likely to continue to rely ever more heavily on justifications that we simply cannot fathom.

And the issue is not simply that we cannot fathom them, the way a lay person can’t fathom a string theorist’s ideas. Rather, it’s that the nature of computer-based justification is not at all like human justification. It is alien.

But “alien” doesn’t mean “wrong.” When it comes to understanding how things are, the machines may be closer to the truth than we humans ever could be.

Alien Justification

Somewhere there is a worm more curious than the rest of its breed. It has slowly traveled through the soil tasting every new patch it comes to, always looking for the next new sample, for it believes that the worm’s highest calling is to know its world, and taste is its means of knowledge. Because of this particular worm’s wide experience and its superior powers of categorization and expression, it becomes revered among worms as a sage who can impart wisdom about what the planet Earth really tastes like.

But taste is not a property of the Earth or any of its parts. What our exalted worm tastes is the result of the encounter of its gustatory senses and the chemical composition of the soil. The worm’s apparatus only lets it know the world via a quality that correlates with some properties of the stuff of the Earth, but that is not actually like that stuff. As the cognitive scientist Donald Hoffman argues, realistic perceptions are unlikely to make a creature more evolutionarily fit.

Human knowledge, we have believed, is different from the sage worm’s. We discern the order, the rules, that bring unity and predictability to the blooming, buzzing confusion that the senses convey. We’re not worms.

Very true. But the more details we cram into our global network of computers, the less the world looks like a well-oiled armillary. Our machines now are letting us see that even if the rules the universe plays by are not all that much more complicated than Go’s, the interplay of everything all at once makes the place more contingent than Aristotle, Newton, Einstein, or even some Chaos theorists thought. It only looked orderly because our instruments were gross, because our conception of knowledge imposes order by simplifying matters until we find it, and because our needs were satisfied with approximations.

That’s fine if you just want to put the 8-ball in the corner pocket. But if you want to know the real path that ball will take, you have to look at the friction created at the molecular level as it passes over each fiber of the felt, at the pull of the moon and the moment’s variation in the Earth’s wobble, at the unequal impact of the photons emitted from the light fixture above the table and the lamp off to the side, and at the change in the air current as your opponent holds her breath. Not to mention the indeterminacy of the quanta. None of that may affect whether you sink the ball, but it is the truth of what’s going on. Even if the universe is governed by rules simple enough for us to understand them, the simplest of events in that universe is not understandable except through gross acts of simplification.

Our machines are letting us see this now that they do not require us to strip information down to what fits into a pile of punch cards. With this new capacity we now lean toward including everything and asking questions later.

What we capture will, of course, remain a tiny fragment of what the universe offers, and is highly subject to our biases and assumptions. Even so, data at this new scale is making manifest the truth we’ve worked around for thousands of years: Knowledge is bigger than we are.

In the mid 1990s the World Wide Web began to move us past our old strategy of knowing the world by reducing what we’re responsible for knowing. Knowledge immediately fled its paper prison and took up residency on the net. Now if you want to know about, say, King Lear, you’ll go out on the web where what we know about the play exists in the links among countless sites created by scholars of literature, historians, linguists, digital humanists, actors, directors, and members of the audience at last night’s local production. It will include professionals and amateurs, idiots and savants. The borders of our knowledge about King Lear are drawn by each person’s interest and current project. The networked knowledge within those evanescent borders is huge, connected, and often inconsistent. That’s what knowledge looks like when it scales.

The rise of machine learning is further hammering home the inadequacy of human understanding to the task it has set for itself. It’s not simply that to know that the Higgs boson exists requires a network of hardware, software, scientists, engineers, and mathematicians, as Michael Nielsen pointed out in a discussion of his excellent 2011 book, Reinventing Discovery: The New Era of Networked Science, at Harvard’s Berkman Klein Center. After all, the traditional justification of knowledge permits us to rely on sources worthy of our trust. In part that’s because we know we could in theory interview each of the people involved and decide whether they are a solid part of the Higgs boson justification story.

Not so when a neural network produces results through processes that are alien to human ways of justifying knowledge. We can verify that what comes out of the machines is very likely knowledge by noting that AlphaGo wins games and that mobile networks of autonomous cars produce fewer accidents, if that’s what happens. But we can’t necessarily follow why AlphaGo placed a piece on this square and not that one, or why the autonomous car swerved left even though I would have swerved right. There are too many inputs, and the decisions are based on complexes of dependencies that exceed the competency of the finest brains natural selection has produced.

As this sinks in, we are beginning to undergo a paradigm shift in our pervasive, everyday idea not only of knowledge, but of how the world works. Where once we saw simple laws operating on relatively predictable data, we are now becoming acutely aware of the overwhelming complexity of even the simplest of situations. Where once the regularity of the movement of the heavenly bodies was our paradigm, and life’s constant unpredictable events were anomalies — mere “accidents,” a fine Aristotelian concept that differentiates them from a thing’s “essential” properties — now the contingency of all that happens is becoming our paradigmatic example.

This is bringing us to locate knowledge outside of our heads. We can only know what we know because we are deeply in league with alien tools of our own devising. Our mental stuff is not enough.

Philosophical pragmatism from one hundred years ago has helped to intellectually prepare us for this shift by limiting our ambitions: Knowledge is less a reflection of the world than a tool for operating in it. Martin Heidegger’s phenomenology provides a different sort of correction by pointing to the historical artificiality of the idea that knowledge is a mental representation of the world, an idea that arose thanks to a history of metaphysical errors.

The extended mind theory of Andy Clark and David Chalmers, which is at least to some extent grounded in Heidegger’s thought, provides a more direct reformulation of knowledge. In his 1996 book Being There: Putting Brain, Body, and World Together Again, Clark argues that knowing is something we’ve always done out in the world with tools. An innumerate shepherd four thousand years ago needed a handful of pebbles to be sure he was returning with the same number of sheep as he set out with, and a physicist today may well need a white board to do her cognitive work. Architects need big sheets of paper, straight edges, and sometimes 3D models to think a building through. Watson and Crick needed homemade Tinker Toys to figure out the structure of DNA. Now workers in such fields have transitioned to using computers, perhaps even the shepherd outfitted with the latest iSheep app.[5] Nevertheless the situation is the same: We think out in the world with tools. Taking knowledge as a type of mental content — a justified, true opinion — obscures that simple phenomenological truth.

As long as our computer models instantiated our own ideas, we could preserve the illusion that the world works the way our knowledge —and our models — do. Once computers started to make their own models, and those models surpassed our mental capacity, we lost that comforting assumption. Our machines have made obvious our epistemological limitations, and by providing a corrective, have revealed a truth about the universe.

The world didn’t happen to be designed, by God or by coincidence, to be knowable by human brains. The nature of the world is closer to the way our network of computers and sensors represent it than how the human mind perceives it. Now that machines are acting independently, we are losing the illusion that the world just happens to be simple enough for us wee creatures to comprehend.

It has taken a network of machines that we ourselves created to let us see that we are the aliens.

Notes

1. More specifically, it models how the human visual cortex processes signals, according to Natalie Wolchover, “A Common Logic to Seeing Cats and Cosmos,” Quanta Magazine, Dec. 4, 2014, https://www.quantamagazine.org/20141204-a-common-logic-to-seeing-cats-and-cosmos/

2. Dylan, “The Mississippi River Basin Model,” Atlas Obscura, http://www.atlasobscura.com/places/the-mississippi-river-basin-model. The source of the $65 million figure seems to be Foster, J. E., “History and Description of the Mississippi Basin Model,” Mississippi Basin Model Report 1–6, Vicksburg, Mississippi: U.S. Army Engineer Waterways Experiment Station, 1971, p. 2; this is according to Kristi Dykema Cheramie’s “The Scale of Nature: Modeling the Mississippi River,” Places, March 2011, https://placesjournal.org/article/the-scale-of-nature-modeling-the-mississippi-river/. If we assume, based on nothing, that the $65 million figure had already been translated into 1971 dollars, the current equivalent would be $386 million. ↩

3. A.G. Gleeman, “The Phillips Curve: A Rushed Job?,” Journal of Economic Perspectives, vol. 25, no. 1, Winter 2011, pp. 223–238, p. 225. ↩

4. Gleeman, op. cit., p. 225 ↩

5. That’s made-up, but Bing is touting its Big Data approach to managing herds of cattle via sensors that can, for example, tell via machine learning when a cow is in heat by its pace, and the optimal times to inseminate a cow if one wants a male or a female calf. See Sean Gallagher, “The Internet of Cows: Azure-powered pedometers get dairies mooovin’”