The Tetrad Automated Causal Discovery Platform, a software and text project developed by Peter Spirtes, Clark Glymour, Richard Scheines and Joe Ramsey of Carnegie Mellon University’s Department of Philosophy, earned the “Leader” Award at the 2020 World Artificial Intelligence Conference this past July.
The Leader Award is one of four awards presented at the conference that aim to recognize “the best in terms of impact and innovation in AI”. There were over 800 nominees for the awards, including projects by Amazon, Bosch, Huawei, Nvidia, Open AI Lab, and Siemens, among others.
The Tetrad Automated Causal Discovery Platform is a tool for discovering “valid, novel, and significant causal relationships” in data. A press release from CMU provides further information about the project:
The Tetrad project was started nearly 40 years ago by Glymour, then a professor of history and philosophy of science at the University of Pittsburgh and now Alumni University Professor Emeritus of Philosophy at CMU, and his doctoral students, Richard Scheines, now Bess Family Dean of the Dietrich College of Humanities and Social Sciences and a professor of philosophy at CMU, and Kevin Kelly, now professor of philosophy at CMU.
Glymour was fascinated by English psychologist Charles Spearman’s argument for a single “general intelligence,” proposed in the early 20th century, and later work by Hubert Blalock, a sociologist. Both researchers explored the possibility of distinguishing causal models by patterns of constraints they implied on the data. Glymour and his students undertook to generalize that idea, turn it into a computer algorithm and explore related mathematical properties.
The first version of the Tetrad program became the basis of Scheines’ doctoral research, which required him to learn as much computer science and statistics as philosophy, an interdisciplinary approach that was encouraged at CMU.
Peter Spirtes joined the project while studying for a master’s degree in computer science at Pitt following his doctoral work. A number of doctoral students at CMU have based their work around the Tetrad project.
Fundamental to the work was providing a set of general principles, or axioms, for deriving testable predictions from any causal structure. For example, consider the coronavirus. Exposure to the virus causes infection, which in turn causes symptoms (Exposure → Infection → Symptoms). Since not all exposures result in infections, and not all infections result in symptoms, these relations are probabilistic. But if we assume that exposure can cause symptoms only through infection, the testable prediction from the axiom is that Exposure and Symptoms are independent given Infection. That is, although knowing whether someone was exposed is informative about whether they will develop symptoms, once we already know whether someone is infected, knowing whether they were exposed adds no extra information. This is a claim that can be tested statistically with data.
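To make the prediction concrete, here is a minimal simulation sketch in Python. It is not part of the press release or of Tetrad. The probabilities (0.5, 0.6, 0.05, 0.7, 0.1) and the function names are invented for illustration. The simulation generates data from the chain Exposure → Infection → Symptoms, then compares symptom rates between exposed and unexposed people, first overall and then within groups that share the same infection status.

```python
import random


def simulate(n, seed=0):
    """Draw n people from the chain Exposure -> Infection -> Symptoms.

    All probabilities below are illustrative assumptions, not real estimates.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        exposed = rng.random() < 0.5
        # Infection depends only on exposure.
        infected = rng.random() < (0.6 if exposed else 0.05)
        # Symptoms depend only on infection, never directly on exposure.
        symptoms = rng.random() < (0.7 if infected else 0.1)
        rows.append((exposed, infected, symptoms))
    return rows


def symptom_rate(rows, exposed, infected=None):
    """Fraction with symptoms among people matching the given conditions."""
    selected = [
        s for e, i, s in rows
        if e == exposed and (infected is None or i == infected)
    ]
    return sum(selected) / len(selected)


rows = simulate(200_000)

# Without conditioning, exposure is informative about symptoms.
marginal_gap = abs(symptom_rate(rows, True) - symptom_rate(rows, False))

# Within a fixed infection status, exposure should add almost nothing.
gap_infected = abs(
    symptom_rate(rows, True, True) - symptom_rate(rows, False, True)
)
gap_uninfected = abs(
    symptom_rate(rows, True, False) - symptom_rate(rows, False, False)
)

print(f"gap in symptom rate, ignoring infection: {marginal_gap:.3f}")
print(f"gap among the infected:                  {gap_infected:.3f}")
print(f"gap among the uninfected:                {gap_uninfected:.3f}")
```

Under these assumptions the first gap should be large, while the two conditional gaps should differ from zero only by sampling noise. A real analysis would replace the eyeballed comparison with a formal statistical test of conditional independence.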
Spirtes, Glymour and Scheines then turned this kind of reasoning on its head and extended it to massively complex causal systems. They developed algorithms that take measured data and background knowledge as input, and then compute the set of underlying causal systems that might have produced specific patterns in the measured data. What can the algorithms tell us about the causal system that underlies the measured data? According to Scheines, “not everything, but in some cases quite a lot.” Spirtes led the effort to prove that the algorithms were theoretically reliable. This approach to causal discovery constituted a breakthrough in fundamental methods in AI.
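The reversed direction of reasoning can be sketched for the three-variable example. The Python sketch below is a brute-force illustration, not the method Tetrad uses, whose algorithms are far more efficient and general. It enumerates every directed acyclic graph over Exposure, Infection and Symptoms and keeps those whose implied independencies match the observed pattern exactly. That pattern is assumed to be the single independence of Exposure and Symptoms given Infection. Requiring an exact match assumes that the data show no independencies beyond those the graph implies. Background knowledge is then modeled as a hypothetical time ordering that forbids edges pointing backward in time.

```python
from itertools import product

NODES = ("Exposure", "Infection", "Symptoms")
PAIRS = [(a, b) for i, a in enumerate(NODES) for b in NODES[i + 1:]]


def is_acyclic(edges):
    """Repeatedly strip nodes with no incoming edges; failure means a cycle."""
    remaining = set(NODES)
    while remaining:
        sources = [
            n for n in remaining
            if not any((m, n) in edges for m in remaining)
        ]
        if not sources:
            return False
        remaining -= set(sources)
    return True


def all_dags():
    """Yield every directed acyclic graph over NODES as a set of edges."""
    for choice in product((None, "forward", "backward"), repeat=len(PAIRS)):
        edges = set()
        for (a, b), direction in zip(PAIRS, choice):
            if direction == "forward":
                edges.add((a, b))
            elif direction == "backward":
                edges.add((b, a))
        if is_acyclic(edges):
            yield frozenset(edges)


def d_separated(a, b, given, edges):
    """d-separation specialised to three nodes (at most one indirect path)."""
    if (a, b) in edges or (b, a) in edges:
        return False  # adjacent variables are never separated
    (c,) = [n for n in NODES if n not in (a, b)]
    a_c = (a, c) in edges or (c, a) in edges
    c_b = (c, b) in edges or (b, c) in edges
    if not (a_c and c_b):
        return True  # no path connects a and b at all
    collider = (a, c) in edges and (b, c) in edges
    # A collider blocks the path unless conditioned on; otherwise the reverse.
    return (c not in given) if collider else (c in given)


def implied_independencies(edges):
    """All (a, b, conditioning set) independencies the graph implies."""
    found = set()
    for a, b in PAIRS:
        (c,) = [n for n in NODES if n not in (a, b)]
        for given in ((), (c,)):
            if d_separated(a, b, given, edges):
                found.add((a, b, given))
    return found


# Observed pattern (assumed): Exposure independent of Symptoms given Infection.
observed = {("Exposure", "Symptoms", ("Infection",))}
candidates = [g for g in all_dags() if implied_independencies(g) == observed]

# Hypothetical background knowledge: causes must precede effects in time.
order = {name: position for position, name in enumerate(NODES)}
consistent = [
    g for g in candidates if all(order[a] < order[b] for a, b in g)
]

for graph in candidates:
    print("candidate:", sorted(graph))
for graph in consistent:
    print("after background knowledge:", sorted(graph))
```

In this toy case the data alone leave several causal structures standing, and the time ordering is what narrows them down. This mirrors Scheines’ remark that the algorithms reveal “not everything, but in some cases quite a lot.”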
The next step was to make the work practical, which required efficient algorithms and massive amounts of simulation and real scientific testing. In the late 1990s, Joe Ramsey joined the team as a systems developer. He has since developed several important algorithms and made many others dramatically more efficient. With help from Spirtes, Glymour, Scheines and many others, Ramsey developed the Java-based Tetrad platform, which supports model building and testing, provides full simulation, and implements dozens of causal discovery algorithms that can be executed on one’s laptop or on the Bridges Pittsburgh Supercomputer.
Over the last 15 to 20 years, the free, open-source software platform has been successfully applied to scientific problems from economics to psychology to educational research to neuroscience by the original team and by researchers around the world.