The Social Dimension of Testing

I’ve talked in the past about my perception that specialist testers need to be cross-discipline associative. And while I’ve implicitly given some ideas about what that means in various posts, here I want to be a bit more explicit.

Here I want to show how knowledge of history, combined with being cross-discipline associative, can help deal with very present day problems. One of those problems in the testing industry is how testing itself is perceived in the wider technology industry.

This is one more element in my broader idea of framing a certain narrative or, at the very least, providing the basis for how a narrative can be framed.

Social Dimensions of Testing

Here the context is actually a book called Reliable Knowledge by John Ziman, which was published in 1978. As I was going through that, it led me to one of his earlier works, which is Public Knowledge from 1968. And this book is a perfect example of allowing yourself to be cross-discipline associative while also allowing the past to inform the present.

In the book, Ziman presents various conceptions of science. And when you read them — and when you appropriately abstract them — you can see the relevance to testing as a discipline but, beyond even that lofty height, you can see the relevance more simply with testing as a practical endeavor that we perform to provide knowledge. That knowledge is often in the form of assessments about the state of things and the risks that exist based on that state.

I tried something very similar with my A History of Automated Testing article. And just as I’m reframing Ziman’s book here, you’ll see that I reframed Sabine Hossenfelder’s book in Getting Lost in Test.

I’m actually going to present this here in the context of the so-called Cambridge Tripos. This refers to an academic structure or framework for examinations. The term “Tripos” actually originates from the three-legged stool used by students during examinations.

Specifically I’m going to take three conceptions that Ziman talks about, framing them as false conceptions. I’m taking this approach because testers often — correctly, in my view — refer to false conceptions they see promoted in the industry. What testers are less good at doing is framing a counter-narrative to those false conceptions. So I want to show how Ziman tackled that same problem in the context of his book and why it has relevance to our current testing industry.

In doing this, I’m going to quote some of the book at length. While doing so I’ll intersperse what I take away from those quoted portions and how I’ve related these thoughts to my career as well as how I use that framing as a springboard towards providing a bridge for the emic and etic viewpoints.

False Conception 1

Science is the mastery of man’s environment.

About that conception, Ziman says this:

This is, I think, the vulgar conception. It identifies Science with its products. It points to penicillin or to an artificial satellite and tells us of all the wonderful further powers that man will soon acquire by the same agency.

Clearly here we can draw a parallel with those who identify testing with artifacts like test cases. Or who identify testing with the use of certain tools, such as automation tools.

This definition enshrines two separate errors. In the first place it confounds Science with Technology. It puts all its emphasis on the applications of scientific knowledge and gives no hint as to the intellectual procedures by which that knowledge may be successfully obtained.

Indeed. And this is similar to how it’s possible to confound testing with tests. Or testing with the idea of quality assurance. Sometimes people put a lot of emphasis on the applications, or base mechanics, of carrying out or executing tests, rather than on the “intellectual procedures” that go into testing.

It does not really discriminate between Science and Magic, and gives us no reason for studies such as Cosmology and Pure Mathematics, which seem entirely remote from practical use.

This is actually an interesting point that bears wider discussion in the scientific community but a key point here is that in my career I see many who don’t discriminate between the theory of testing and the application of various techniques of testing.

Demonstrably, or so I would argue, there is not just a convenience but a necessity to separate a purely theoretical, abstract and speculative line of thought or argument from the empirical material which that thought or argument exemplifies and from which that empirical material is derived. In other words, calling out the specific phenomena is one thing. Calling out a general theory regarding the phenomena is another. There is value in calling out both separately but making sure there are good hooks to each for reference.

What the false conception does here is foster a focus on the mechanics of testing, which really means a focus on artifacts (like tests), which in turn leads to a focus on tools. These might be tools that store those tests or that execute those tests or that parse those tests. This comes at the expense of the broader ambit of understanding testing as a fundamental discipline.

It also confuses ideas with things. Penicillin is not Science, any more than a cathedral is Religion or a witness box is Law. The material manifestations and powers of Science, however beneficial, awe-inspiring, monstrous, or beautiful, are not even symbolic; they belong in a different logical realm, just as a building is not equivalent to or symbolic of the architect’s blueprints. A meal is not the same thing as a recipe.

Indeed so. A test is not the same thing as the thinking that goes into creating the test or even conceptualizing the ways something can be tested. Just as penicillin is not science, I can say that Cypress is not testing. Or, more broadly, automation is not testing. This is how you can also frame the argument that test cases are not testing.

You’ll notice that the arguments I’m providing give a basis for how to frame this narrative, which I fear many vocal testers who express these same sentiments are not doing. I believe they’re saying the right things. But I also believe they’re saying them in the wrong way.

Let’s look at another conception.

False Conception 2

Science is the experimental method.

This one is really interesting to me because I often say — and I do believe — that one of the key attributes of good testers is the ability to think and act experimentally. I do also talk often about the basis of the scientific method. But there’s nuance to that.

The recognition of the importance of experiment was the key event in the history of Science. The Baconian thesis was sound; we can often do no better today than to follow it.

The “Baconian thesis” refers to the philosophical approach to scientific inquiry advocated by Francis Bacon. This thesis asserts that the most reliable path to understanding natural phenomena involves systematic and methodical experimentation rather than relying solely on traditional reasoning or abstract philosophy. Bacon believed that by systematically observing and experimenting, scientists could uncover the underlying laws governing nature. His approach emphasized the collection of data through observation and experimentation, leading to the formulation of general principles or laws.

Sounds reasonable, right?

Yet this definition is incomplete in several respects. It arbitrarily excludes pure mathematics, and needs to be supplemented to take cognisance of those perfectly respectable sciences such as Astronomy or Geology where we can only observe the consequences of events and circumstances over which we have no control.

That “over which we have no control” part is important.

You eventually learn that as a quality and test specialist, you are helping delivery teams create a context for often relatively marginal gains. There is thus a slow and constant aggregation of small efforts to produce a greater possibility of good outcomes. But you’re not solely responsible for those good outcomes.

You learn that you’re helping delivery teams uncover information about the risks that exist where humans and technology intersect. Thus we try to make sure the modes of failure are known, managed, and mitigated. Of those three, however, we mainly only have direct control over the “making things known” part. We don’t have control over the managed and mitigated part.

What we do is provide demonstrable and interrogable evidence that allows people to make more and better informed decisions about risks, which allows them to take actions that balance the need to make progress alongside the need to consider sensible constraints so that we, collectively, do far more good than we do harm with whatever platform, service, or product we deliver.

It also fails to give due credit to the strong theoretical and logical sinews that are needed to hold the results of experiments and observations together and give them force. Scientists do not in fact work in the way that operationalists suggest; they tend to look for, and find, in Nature little more than they believe to be there, and yet they construct airier theoretical systems than their actual observations warrant. Experiment distinguishes Science from the older, more speculative ways to knowledge but it does not fully characterize the scientific method.

This is crucial. There are so many test pundits out there that rail against confirmation and then suggest that scientists of any stripe don’t use this technique to any great degree. And that’s flat out false. It’s why we have a current “crisis in physics” that’s been building since the mid 1970s and has been talked about in numerous publications with growing alarm.

And, again, that’s crucial. The “way of doing science” — and thus experiment — does not fully allow one to understand or even practice the scientific method. And it has nothing to do with reliance on confirmation or verification. It has everything to do with how far theory and experiment are allowed to drift apart.

This is something I see happen quite a bit in the testing industry. There is plenty of experimenting of various types, but it’s often not guided by good theory, so non-testers struggle to see the wider relevance. Alternatively, I see a lot of theory, but it’s often not articulated clearly enough for non-testers to see how to turn the theory into practical experiment.

False Conception 3

Science arrives at “Truth” by logical inferences from empirical observations.

This is another interesting one.

This is the standard type of definition favoured by most serious philosophers. It is usually based upon the principle of induction — that what has been seen to happen a great many times is almost sure to happen invariably and may be treated as a basic fact or Law upon which a firm structure of theory can be erected.

Okay, that, by itself, sounds not too bad, right?

There is no doubt that this is the official philosophy by which most practical scientists work. From it one can deduce a number of practical procedures, such as the testing of theory by ‘predictions’ of the results of future observations, and their subsequent confirmation.

Note once again: confirmation. But most scientists do know that falsification is inherent in the process. Whether or not the predictions pan out is what we’re looking for. No one is really sitting there arguing about verification or falsification. Yet testers are most often doing exactly that, usually framed in the context of “don’t show that it’s working, instead show that it isn’t.”

Someone truly practicing the scientific method — as opposed to just “doing Science” in Ziman’s conception — does both of these automatically. And if we equate testing with the scientific method, then good testers are doing both automatically as well.

The importance of speculative thinking is recognized, provided that it is curbed by conformity to facts. There is no restriction of a metaphysical kind upon the subject matter of Science, except that it must be amenable to observations and inference.

And this is a good point. We can consider in our context various qualities that may amplify or degrade. Some are going to be more objective than others, like performance. I see the response time going up or I see the throughput going down. Others may be more subjective, like usability. But we can reason about them all and all are amenable to observations and inference, which gives us the basis for thinking about risks and thus communicating about those risks.

And since those risks are often framed around various qualities, that does mean testing has quite a bit to say about quality. It doesn’t “assure quality” but it very much asserts where, when and to what extent various qualities are being amplified or degraded.

But the attempt to make these principles logically watertight does not seem to have succeeded. What may be called the positivist programme, which would assign the label ‘True’ to statements that satisfy these criteria, is plausible but not finally compelling. Many philosophers have now sadly come to the conclusion that there is no ultimate procedure which will wring the last drops of uncertainty from what scientists call their knowledge.

Keep in mind that Ziman wrote his book in the late 1960s when “logical positivism” was coming out of its peak period, which lasted from the 1920s to the 1950s.

In our context, certainly we can’t fully tame the uncertainty around what we test and provide assessments on. There will always be risks. There will always be things we can’t account for. The very notion of being “bug free” is impossible. And even if it was possible, the ability to prove it would be impossible.

And although working scientists would probably state that this is the Rule of their Order, and the only safe principle upon which their discoveries may be based, they do not always obey it in practice. We often find complex theories — quite good theories — that really depend on very few observations. It is extraordinary, for example, how long and complicated the chains of inference are in the physics of elementary particles; a few clicks per month in an enormous assembly of glass tubes, magnetic fields, scintillator fluids and electronic circuits becomes a new ‘particle’, which in its turn provokes a flurry of theoretical papers and ingenious interpretations.

And this is very correct. Those who worship at the altar of science and who promote the practices of its practitioners as the way to think often don’t understand the practices of those practitioners in the slightest. This, again, speaks to the broad topic of the current “crisis in physics” that I won’t elaborate on here.

I do not mean to say that the physicists are not correct; but no one can say that all the possible alternative schemes of explanation are carefully checked by innumerable experiments before the discovery is acclaimed and becomes part of the scientific canon. There is far more faith, and reliance upon personal experience and intellectual authority, than the official doctrine will allow.

Exactly right. We can’t possibly test everything. Yet that doesn’t mean we can’t acclaim some discoveries, such as “there is a sufficient level of quality present to provide a valuable experience for the majority of people likely to use our product.” This often gets framed around the “good enough” argument.

A simple way of putting it is that the logico-inductive scheme does not leave enough room for genuine scientific error. It is too black and white. Our experience, both as individual scientists and historically, is that we only arrive at partial and incomplete truths; we never achieve the precision and finality that seem required by the definition.

And this is how certain very vocal test practitioners often sound to me: very black and white. Usually in the form of “I’m right and everyone else is wrong” often followed by “everyone who disagrees with me is thus either misinformed or a fool.” It’s a great way to stoke ego but not a great way to recognize reality and certainly not a great way to promote testing to a wider industry nor to inculcate a good narrative among test practitioners.

Testing is only ever going to deliver partial and incomplete truths and that’s because there’s this little thing called time. And time leads to evolution and evolution leads to changing circumstances. In our context, those changing circumstances can be technology shifts just as much as they can be the changing tide of user tolerance for problematic software.

Thus, nothing we do in the laboratory or study is ‘really’ scientific, however honestly we may aspire to the ideal. Surely, it is going too far to have to say, for example, that it was ‘unscientific’ to continue to believe in Newtonian dynamics as soon as it had been observed and calculated that the rotation of the perihelion of Mercury did not conform to its predictions.

And that’s exactly right. We don’t toss the Newtonian universe out and say it was all garbage just because the inverse square law of gravitation is not, as it turns out, perfectly true in a universe subject to Einstein’s relativity.

The Narrative Builds Itself

By taking some of Ziman’s points above, I hope you can see that (1) some of the challenges we face in the testing industry are the same ones Ziman was writing about and (2) we can extrapolate those points to see how Ziman, and others, attempted to combat this and whether they were successful in doing so. Going back to my A History of Automated Testing article mentioned earlier, there I showed how teachers quite successfully combatted the “automation will replace us” thoughts and effectively provided a counter-narrative.

One more thought from Ziman resonates for me.

One can be zealous for Science, and a splendidly successful research worker, without pretending to a clear and certain notion of what Science really is. In practice it does not seem to matter.

Can the same be said of testing? Yes, in fact it can. As Ziman goes on to say:

The average scientist will say that he knows from experience and common sense what he is doing, and so long as he is not striking too deeply into the foundations of knowledge he is content to leave the highly technical discussion of the nature of Science to those self-appointed authorities the Philosophers of Science. A rough and ready conventional wisdom will see him through.

The same could be said of the “average test practitioner.”

And yet … there’s an odd interplay at work here. What you end up with are practitioners that don’t seem to understand — and in some cases seemingly don’t want to understand — the deeper implications of their discipline.

So consider yourself acting as a scientist, trying to make a scientific discovery. How will you know what to do to make scientific discoveries if you haven’t been taught the distinction between a scientific theory and a non-scientific one? Similarly, think about what this means when we talk about the distinction between a bad test and a good test. Or how about the much wider distinction between bad testing and good testing?

The overall point of Ziman’s book is that scientific research is a social activity. And I would argue testing is very much the same.

There are intellectual connections between the ideas of various practitioners who engage in research. But what about looking at the social relations through which those connections are established? In order to understand how people in a given discipline interact socially, we need to have a clear conception of what those people are actually trying to do. What are physicists actually trying to do? What are theologians actually trying to do? What are economists actually trying to do? What are philosophers actually trying to do? As a final thought from Ziman:

How do scientists teach, communicate with, promote, criticize, honour, give ear to, give patronage to, one another? What is the nature of the community to which they adhere?

Replace the word “scientists” in that sentence with “testers” and I believe the sentiment is just as relevant. It matters very much who speaks for the wider testing community. The most vocal of those people will have outsized impact. Thus there is an ethic to consider what that impact is and what kind of community we want people to adhere to.

And this is why, just as Ziman felt it necessary to write a book on the social dimension of science, I’ve found it necessary to have a blog largely devoted to the social dimension of testing. My effect on the wider community is unclear at best but I’m comfortable with the ethic I’ve tried to maintain.

Stories from a Software Tester

Twice upon a time, in another space, no distance in any direction from here …