Paper deep dive
The Computational Complexity of Circuit Discovery for Inner Interpretability
Federico Adolfi, Martina G. Vilas, Todd Wareham
Abstract
Many proposed applications of neural networks in machine learning, cognitive/brain science, and society hinge on the feasibility of inner interpretability via circuit discovery. This calls for empirical and theoretical explorations of viable algorithmic options. Despite advances in the design and testing of heuristics, there are concerns about their scalability and faithfulness at a time when we lack understanding of the complexity properties of the problems they are deployed to solve. To address this, we study circuit discovery with classical and parameterized computational complexity theory: (1) we describe a conceptual scaffolding to reason about circuit finding queries in terms of affordances for description, explanation, prediction and control; (2) we formalize a comprehensive set of queries for mechanistic explanation, and propose a formal framework for their analysis; (3) we use it to settle the complexity of many query variants and relaxations of practical interest on multi-layer perceptrons. Our findings reveal a challenging complexity landscape. Many queries are intractable, remain fixed-parameter intractable relative to model/circuit features, and inapproximable under additive, multiplicative, and probabilistic approximation schemes. To navigate this landscape, we prove there exist transformations to tackle some of these hard problems with better-understood heuristics, and prove the tractability or fixed-parameter tractability of more modest queries which retain useful affordances. This framework allows us to understand the scope and limits of interpretability queries, explore viable options, and compare their resource demands on existing and future architectures.
Tags
Links
- Source: https://arxiv.org/abs/2410.08025
- Canonical: https://arxiv.org/abs/2410.08025
Intelligence
Status: succeeded | Model: google/gemini-3.1-flash-lite-preview | Prompt: intel-v1 | Confidence: 98%
Last extracted: 3/12/2026, 6:54:52 PM
Summary
This paper provides a formal computational complexity analysis of 'circuit discovery' in neural networks, specifically multi-layer perceptrons. It establishes a conceptual framework for interpretability queries (description, explanation, prediction, control), proves that many such queries are intractable (NP-hard, W[1]-hard, or inapproximable), and identifies specific tractable subsets and transformations to guide future interpretability research.
Entities (6)
Relation Signals (3)
Federico Adolfi → authored → The Computational Complexity of Circuit Discovery for Inner Interpretability
confidence 100% · The Computational Complexity of Circuit Discovery for Inner Interpretability Federico Adolfi
Circuit Discovery → is analyzed using → Computational Complexity Theory
confidence 100% · we study circuit discovery with classical and parameterized computational complexity theory
Multi-Layer Perceptron → is subject of → Circuit Discovery
confidence 95% · we use it to settle the complexity of many query variants and relaxations of practical interest on multi-layer perceptrons.
Cypher Suggestions (2)
Identify architectures studied in the context of circuit discovery. · confidence 95% · unvalidated
MATCH (a:Architecture)-[:IS_SUBJECT_OF]->(c:Methodology {name: 'Circuit Discovery'}) RETURN a.name
Find all research methodologies used to analyze neural network interpretability. · confidence 90% · unvalidated
MATCH (m:Methodology)-[:IS_ANALYZED_USING]->(f:Framework) RETURN m.name, f.name
Full Text
492,494 characters extracted from source content.
The Computational Complexity of Circuit Discovery for Inner Interpretability
Federico Adolfi (ESI Neuroscience, Max-Planck Society & University of Bristol; fede.adolfi@bristol.ac.uk), Martina G. Vilas (Department of Computer Science, Goethe University Frankfurt; martinagvilas@em.uni-frankfurt.de), Todd Wareham (Department of Computer Science, Memorial University of Newfoundland; harold@mun.ca)
Abstract
Many proposed applications of neural networks in machine learning, cognitive/brain science, and society hinge on the feasibility of inner interpretability via circuit discovery. This calls for empirical and theoretical explorations of viable algorithmic options. Despite advances in the design and testing of heuristics, there are concerns about their scalability and faithfulness at a time when we lack understanding of the complexity properties of the problems they are deployed to solve. To address this, we study circuit discovery with classical and parameterized computational complexity theory: (1) we describe a conceptual scaffolding to reason about circuit finding queries in terms of affordances for description, explanation, prediction and control; (2) we formalize a comprehensive set of queries for mechanistic explanation, and propose a formal framework for their analysis; (3) we use it to settle the complexity of many query variants and relaxations of practical interest on multi-layer perceptrons. Our findings reveal a challenging complexity landscape. Many queries are intractable, remain fixed-parameter intractable relative to model/circuit features, and inapproximable under additive, multiplicative, and probabilistic approximation schemes. To navigate this landscape, we prove there exist transformations to tackle some of these hard problems with better-understood heuristics, and prove the tractability or fixed-parameter tractability of more modest queries which retain useful affordances.
This framework allows us to understand the scope and limits of interpretability queries, explore viable options, and compare their resource demands on existing and future architectures.

1 Introduction

As artificial neural networks (ANNs) grow in size and capabilities, Inner Interpretability — an emerging field tasked with explaining their inner workings (Räuker et al., 2023; Vilas et al., 2024a) — attempts to devise scalable, automated procedures to understand systems mechanistically. Many proposed applications of neural networks in machine learning, cognitive and brain sciences, and society, hinge on the feasibility of inner interpretability. For instance, we might have to rely on interpretability methods to improve system safety (Bereska & Gavves, 2024), detect and control vulnerabilities (García-Carrasco et al., 2024), prune for efficiency (Hooker et al., 2021), find and use task subnetworks (Zhang et al., 2024), explain internal concepts underlying decisions (Lee et al., 2023), experiment with neuro-cognitive models of language, vision, etc. (Lindsay, 2024; Lindsay & Bau, 2023; Pavlick, 2023), describe determinants of ANN-brain alignment (Feghhi et al., 2024; Oota et al., 2023), improve architectures, and extract domain insights (Räuker et al., 2023). We will have to solve different instances of these interpretability problems, ideally automatically, for increasingly large models. We therefore need efficient interpretability procedures, and this requires empirical and theoretical explorations of viable algorithmic options.

Circuit discovery and its challenges. Since top-down approaches to inner interpretability (see Vilas et al., 2024a) work their way down from high-level concepts or algorithmic hypotheses (Lieberum et al., 2023), there is interest in a complementary bottom-up methodology: circuit discovery (see Shi et al., 2024; Tigges et al., 2024).
It starts from neuron- and circuit-level isolation or description (e.g., Hoang-Xuan et al., 2024; Lepori et al., 2023) and attempts to build up higher-level abstractions. The motivation is the circuit hypothesis: models might implement their capabilities via small subnetworks (Shi et al., 2024). Advances in the design and testing of interpretability heuristics (see Shi et al., 2024; Tigges et al., 2024) come alongside interest in the automation of circuit discovery (e.g., Conmy et al., 2023; Ferrando & Voita, 2024; Syed et al., 2023) and concerns about its feasibility (Voss et al., 2021; Räuker et al., 2023). One challenge is scaling up methods to larger networks, more naturalistic datasets, and more complex tasks (e.g., Lieberum et al., 2023; Marks et al., 2024), given their manual-intensive search over large spaces (Voss et al., 2021). A related issue is that current heuristics, though sometimes promising (e.g., Merullo et al., 2024), often yield discrepant results (see e.g., Shi et al., 2024; Niu et al., 2023; Zhang & Nanda, 2023). They often find circuits that are not functionally faithful (Yu et al., 2024a) or lack the expected affordances (e.g., effects on behavior; Shi et al., 2024). This calls into question whether certain localization methods yield results that inform editing (Hase et al., 2023), and vice versa (Wang & Veitch, 2024). More broadly, we run into 'interpretability illusions' (Friedman et al., 2024) when our simplifications (e.g., circuits) mimic the local input-output behavior of the system but lack global faithfulness (Jacovi & Goldberg, 2020).

Exploring viable algorithmic options. These challenges come at a time when, despite emerging theoretical frameworks (e.g., Vilas et al., 2024a; Geiger et al., 2024), there are notable gaps in the formalization and analysis of the computational problems that interpretability heuristics attempt to solve (see Wang & Veitch, 2024, §8).
Issues around scalability of circuit discovery and faithfulness have a natural formulation in the language of Computational Complexity Theory (Arora & Barak, 2009; Downey & Fellows, 2013). A fundamental source of breakdown of scalability — which lack of faithfulness is one manifestation of — is the intrinsic resource demands of interpretability problems. In order to design efficient and effective solutions, we need to understand the complexity properties of circuit discovery queries and the constraints that might be leveraged to yield the desired results. Although experimental efforts have made promising inroads, the complexity-theoretic properties that naturally impact scalability and faithfulness remain open questions (see e.g., Subercaseaux, 2020, §6C). We settle them here by complementing these efforts with a systematic study of the computational complexity of circuit discovery for inner interpretability. We present a framework that allows us to (a) understand the scope and limits of interpretability queries for description/explanation and prediction/control, (b) explore viable options, and (c) compare their resource demands among existing and future architectures.

1.1 Contributions

• We present a conceptual scaffolding to reason about circuit finding queries in terms of affordances for description, explanation, prediction and control.
• We formalize a comprehensive set of queries that capture mechanistic explanation, and propose a formal framework for their analysis.
• We use this framework to settle the complexity of many query variants, parameterizations, approximation schemes and relaxations of practical interest on multi-layer perceptrons, relevant to various architectures such as transformers.
• We demonstrate how our proof techniques can also be useful to draw links between interpretability and explainability by using them to improve existing results on the latter.
1.2 Overview of results

• We uncover a challenging complexity landscape (see Table 4) where many queries are intractable (NP-hard, Σ₂ᵖ-hard), remain fixed-parameter intractable (W[1]-hard) when constraining model/circuit features (e.g., depth), and are inapproximable under additive, multiplicative, and probabilistic approximation schemes.
• We prove there exist transformations to potentially tackle some hard problems (NP- vs. Σ₂ᵖ-complete) with better-understood heuristics, and prove the tractability (PTIME) or fixed-parameter tractability (FPT) of other queries of interest, and we identify open problems.
• We describe a quasi-minimality property of ANN circuits and exploit it to generate tractable queries which retain useful affordances as well as efficient algorithms to compute them.
• We establish a separation between local and global query complexity. Together with quasi-minimality, this explains interpretability illusions of faithfulness observed in experiments.

1.3 Related work

This paper gives the first systematic exploration of the computational complexity of inner interpretability problems (this work expands on FA's PhD dissertation at University of Bristol; Adolfi, 2023; Adolfi et al., 2024). An adjacent area is the complexity analysis of explainability problems (Bassan & Katz, 2023; Ordyniak et al., 2023). It differs from our work in its focus on input queries — aspects of the input that explain model decisions — as we look at the inner workings of neural networks via circuit queries. Barceló et al. (2020) study the explainability of multi-layer perceptrons compared to simpler models through a set of input queries. Bassan et al. (2024) extend this idea with a comparison between local and global explainability.
None of these works formalize or analyze circuit queries (although Subercaseaux, 2020, identifies it as an open problem); we adapt the local versus global distinction in our framework and show how our proof techniques can tighten some results on explainability queries. Ramaswamy (2019) and Adolfi & van Rooij (2023) explore a small set of circuit queries and only on abstract biological networks modeled as general graphs, which cannot inform circuit discovery in ANNs. Efforts in characterizing the complexity of learning neural networks (e.g., Song et al., 2017; Chen et al., 2020; Livni et al., 2014) might eventually connect to our work, although a number of differences between the formalizations make results in one area difficult to predict from those in the other. Likewise, efforts to settle the complexity of finding small circuits consistent with a truth table (Hitchcock & Pavan, 2015) are currently too general to be applicable to interpretability problems. More generally, we join efforts to build a solid theoretical foundation for interpretability (Bassan & Katz, 2023; Geiger et al., 2024; Vilas et al., 2024a).

2 Mechanistic understanding of neural networks

Mechanistic understanding is a contentious topic (Ross & Bassett, 2024), but for our purposes it will suffice to adopt a pragmatic perspective. In many cases of practical interest, we want our interpretability methods to output objects that allow us to, in some limited sense, (1) describe or explain succinctly, and (2) control or predict precisely. Such objects (e.g., circuits) should be 'efficiently queriable'; they are often referred to as "a way of making an explanation tractable" (Cao & Yamins, 2023). Roughly, this means that we would like short descriptions (e.g., small circuits) with useful affordances (e.g., to readily answer questions and perform interventions of interest). Circuits have the potential to fulfill these criteria (Olah et al., 2020).
Here we preview some special circuits with useful properties which we formalize and analyze later on. Table 1 maps the main circuits we study to their corresponding affordances for description, explanation, prediction and control. Formal definitions of circuit queries are given alongside results in Section 4 (see also Appendix: Definitions, Theorems and Proofs).

Table 1: Circuit affordances for description, explanation, prediction, and control.
- Sufficient Circuit: Which neurons suffice in isolation to cause a behavior? Minimum: shortest description. Inference in isolation. Minimal: ablating any neuron breaks behavior of the circuit.
- Quasi-minimal Sufficient Circuit: Which neurons suffice in isolation to cause a behavior and which is a breaking point? Ablating the breaking point breaks behavior of the circuit.
- Necessary Circuit: Which neurons are part of all circuits for a behavior? Key subcomputations? Ablating the neurons breaks behavior of any sufficient circuit in the network.
- Circuit Ablation & Clamping: Which neurons are necessary in the current configuration of the network? Ablating/Clamping the neurons breaks behavior of the network.
- Circuit Robustness: How much redundancy supports a behavior? Resilience to perturbations. Ablating any set of neurons of size below threshold does not break behavior.
- Patched Circuit: Which neurons drive a behavior in a given input context, i.e., are control nodes? Patching neurons changes network behavior for inputs of interest. Steering; Editing.
- Quasi-minimal Patched Circuit: Which neurons can drive a behavior in a given input context and which neuron is a breaking point? Patching neurons causes target behavior for inputs of interest; unpatching the breaking point breaks target behavior.
- Gnostic Neurons: Which neurons respond preferentially to a certain concept? Concept editing; guided synthesis.
3 Inner interpretability queries as computational problems

We model post-hoc interpretability queries on neural networks as computational problems in order to analyze their intrinsic complexity properties. These circuit queries also formalize criteria for desired circuits, including those appearing in the literature, such as 'faithfulness', 'completeness', and 'minimality' (Wang et al., 2022; Yu et al., 2024a).

Query variants: coverage, size and minimality. The coverage of a circuit is the domain over which it behaves in a certain way (e.g., faithful to the model's prediction). Local circuits do so over a finite set of known inputs and global circuits do so over all possible inputs. The size of a circuit is the number of neurons. Some circuit queries require circuits of bounded size whereas others leave the size unbounded. A circuit with a certain property (e.g., local sufficiency) is minimal if there is no subset of its neurons that also has that property (cf. minimum size among all such circuits present in the network; see Figure 1).

Figure 1: Relationships between circuit types. Sufficient Circuits (SCs) are faithful to the model. The entire network is a trivial SC. Necessary Circuits (NCs) are units shared by all minimal SCs. Quasi-minimal SCs contain a known breaking point (here, NC) and unknown superfluous units.

To fit our comprehensive suite of problems, we explain how to generate problem variants and later on only present one representative definition of each.

Problem 0 (ProblemName (PN))
Input: A multi-layer perceptron ℳ, Coverage_IN, Size_IN.
Output: A Property circuit C of ℳ, Size_OUT, s.t. Coverage_OUT C(x) = ℳ(x), Suffix.
Problem 0 and Table 2 illustrate how to generate problem variants using a template, with ProblemName = Sufficient Circuit as an example (e.g., the Coverage[IN/OUT] variables specify parts of the input/output description that vary according to whether the requested circuit must have global or local faithfulness). Problem definitions will be given for search (return specified circuits) or decision (answer yes/no circuit queries) versions. Others, including optimization (return maximum/minimum-size circuits), can be generated by assigning variables. Problems presented later on are obtained similarly. We also explore various parameterizations, approximation schemes, and relaxations that we explain in the following sections as needed.

Table 2: Generating query variants from problem templates. Local and Global queries each come in Bounded, Unbounded, and Optimal variants; "__" marks a description variable left empty.
- Coverage_IN: Local (all variants): an input x; Global: "__".
- Coverage_OUT: Local: "__"; Global (all variants): ∀x.
- Size_IN: Bounded: an integer u ≤ |ℳ|; Unbounded and Optimal: "__".
- Size_OUT: Bounded: size |C| ≤ u; Unbounded: "__"; Optimal: minimum size.
- Property: Bounded and Unbounded: minimal / "__"; Optimal: "__".
- Suffix: Bounded: "if it exists, otherwise ⊥"; Unbounded and Optimal: "__".

3.1 Complexity analyses

Classical and parameterized complexity. We prove theorems about interpretability queries building on techniques from classical (Garey & Johnson, 1979) and parameterized complexity (Downey & Fellows, 2013).
Given our limited knowledge of the problem space of interpretability, worst-case analysis is appropriate to explore which problems might be solvable without requiring any additional assumptions (e.g., Bassan et al., 2024; Barceló et al., 2020), and experimental results suggest it captures a lower bound on real-world complexity (e.g., Friedman et al., 2024; Shi et al., 2024; Yu et al., 2024a). Here we give a brief, informal overview of the main concepts underlying our analyses (see Appendix: Definitions, Theorems and Proofs for extensive formal definitions). We will explore beyond classical polynomial-time tractability (PTIME) by studying fixed-parameter tractability (FPT), a more novel and finer-grained look at the sources of complexity of problems to test aspects that possibly make interpretability feasible in practice. NP-hard queries are considered intractable because they cannot be computed by polynomial-time algorithms. A relaxation is to allow unreasonable (e.g., exponential) resource demands to be confined to problem parameters that can be kept small in practice. Parameterizing a given ANN and requested circuit leads to parameterized problems (see Table 3 for problem parameters we study later). Parameterized queries in the class FPT admit fixed-parameter tractable algorithms. W-hard queries (related to FPT as NP-hard is to PTIME), however, do not. We study counting problems via the analogous classes #P and #W[1]. We also investigate completeness for NP and classes higher up the polynomial hierarchy such as Σ₂ᵖ and Π₂ᵖ to identify aspects of hard problems that make them even harder, and to explore the possibility to tackle hard interpretability problems with better-understood methods for well-known NP-complete problems (de Haan & Szeider, 2017).
Most proofs involve reductions between computational problems which establish the complexity status of interpretability queries based on the known complexity of canonical problems in other areas.

Table 3: Model and circuit parameterizations (model parameters are given; circuit parameters are requested).
- Number of layers (depth): model L̄; circuit l̄.
- Maximum layer width: model L̄_w; circuit l̄_w.
- Total number of units: model Ū = |ℳ| ≤ L̄ · L̄_w; circuit |C| = ū.
- Number of input units: model Ū_I; circuit ū_I.
- Number of output units: model Ū_O; circuit ū_O.
- Maximum weight: model W̄; circuit w̄.
- Maximum bias: model B̄; circuit b̄.

Approximation. Although sometimes computing optimal solutions is intractable, it is conceivable we could devise tractable interpretability procedures to obtain approximate solutions that are useful in practice. We consider 5 notions of approximation: additive, multiplicative, and three probabilistic schemes (𝒜 = c, PTAS, 3PA; see Appendix: Definitions, Theorems and Proofs for formal definitions). Additive approximation algorithms return solutions at most a fixed distance c away from optimal (e.g., from the minimum-sized circuit), ensuring that errors cannot get impractically large (c-approximability). Multiplicative approximation returns solutions at most a factor of optimal away. Some hard problems allow for polynomial-time multiplicative approximation schemes (PTAS) where we can get arbitrarily close to optimal solutions as long as we expend increasing compute time (Ausiello et al., 1999).
Finally, we consider three types of probabilistic polynomial-time approximability (henceforth 3PA) that may be acceptable in situations where always getting the correct output for an input is not required: algorithms that (1) always run in polynomial time and produce the correct output for a given input in all but a small number of cases (Hemaspaandra & Williams, 2012); (2) always run in polynomial time and produce the correct output for a given input with high probability (Motwani & Raghavan, 1995); and (3) run in polynomial time with high probability but are always correct (Gill, 1977).

Model architecture. The Multi-Layer Perceptron (MLP) is a natural first step in our exploration because (a) it is proving useful as a stepping stone in current experimental (e.g., Lampinen et al., 2024) and theoretical work (e.g., Rossem & Saxe, 2024; McInerney & Burke, 2023); (b) it exists as a leading standalone architecture (Yu et al., 2024b), as the central element of all-MLP architectures (Tolstikhin et al., 2021), and as a key component of state-of-the-art models such as transformers (Vaswani et al., 2017); (c) it is of active interest to the interpretability community (e.g., Geva et al., 2022; 2021; Dai et al., 2022; Meng et al., 2024; 2022; Niu et al., 2023; Vilas et al., 2024b; Hanna et al., 2023); and (d) we can relate our findings in inner interpretability to those in explainability, which also begins with MLPs (e.g., Barceló et al., 2020; Bassan et al., 2024). Although MLP blocks can be taken as units to simplify search, it is recommended to investigate MLPs by treating each neuron as a unit (e.g., Gurnee et al., 2023; Cammarata et al., 2020; Olah et al., 2017), as it better reflects the semantics of computations in ANNs (Lieberum et al., 2023, sec. 2.3.1). We adopt this perspective in our analyses. We write ℳ for an MLP model and ℳ(x) for its output on input vector x. Its size |ℳ| is the number of neurons.
A circuit C is a subset of |C| neurons which induces a (possibly end-to-end) subgraph of ℳ (see Appendix: Definitions, Theorems and Proofs for formal definitions).

4 Results & Discussion: the complexity of circuit queries

In this section we present each circuit query with its computational problem and a discussion of the complexity profile we obtain across variants, relaxations, and parameterizations. For an overview of the results for all queries, see Table 4. Proofs of the theorems can be found in the Appendix: Definitions, Theorems and Proofs.

4.1 Sufficient Circuit

Sufficient circuits (SCs) are sets of neurons connected end-to-end that suffice, in isolation, to reproduce some model behavior over an input domain (see faithfulness; Wang et al., 2022; Yu et al., 2024a). They are conceptually related to the desired outcome of zero-ablating components that do not contribute to the behavior of interest (small, parameter-efficient subnetworks). Zero-ablation as a method (e.g., to find sufficient circuits) has been criticized on the grounds that the patched value (zero) is somewhat arbitrary and therefore can mischaracterize the functioning of the neuron/circuit when operating in the context of the rest of the network during inference. This gives rise to alternative methods such as activation patching with activation means or specific input activations, which we study later. SCs remain relevant as, despite valid criticisms of zero-ablation (e.g., Conmy et al., 2023), circuit discovery through pruning might be justified at least in some cases (Yu et al., 2024a).

Problem 1 (Bounded Local Sufficient Circuit (BLSC))
Input: A multi-layer perceptron ℳ, an input vector x, and an integer u ≤ |ℳ|.
Output: A circuit C in ℳ of size |C| ≤ u, such that C(x) = ℳ(x), if it exists, otherwise ⊥.

We find that many variants of SC are NP-hard (see Table 4).
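Checking whether a given candidate circuit satisfies the BLSC output condition only costs forward passes; the hardness lies in the search over exponentially many subsets. A minimal sketch, assuming a plain ReLU MLP stored as nested Python lists (this representation and the helper names are ours, not the paper's):

```python
def mlp_forward(weights, biases, x, keep=None):
    """ReLU MLP forward pass over nested lists.
    weights[i][j] is the incoming weight row for neuron j of layer i.
    keep[i] is a set of neuron indices of layer i retained in the circuit;
    neurons outside it are zero-ablated (keep=None runs the full model M)."""
    a = list(x)
    for i, (W, b) in enumerate(zip(weights, biases)):
        a = [max(sum(w * v for w, v in zip(row, a)) + bj, 0.0)
             for row, bj in zip(W, b)]
        if keep is not None:
            a = [v if j in keep[i] else 0.0 for j, v in enumerate(a)]
    return a

def is_local_sufficient(weights, biases, x, keep, u):
    """BLSC verification: circuit size |C| <= u and C(x) = M(x)."""
    size = sum(len(s) for s in keep)
    return (size <= u and
            mlp_forward(weights, biases, x, keep) ==
            mlp_forward(weights, biases, x))
```

Enumerating all neuron subsets to find a smallest sufficient circuit is exponential in |ℳ|, which is consistent with the NP-hardness results above: verification is cheap, search is not.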
Counterintuitively, this intractability does not depend straightforwardly on parameters such as network depth (W[1]-hard relative to the parameters in Table 3). Therefore, hardness is not mitigated by keeping models shallow. Given this barrier, we explore the possibility of obtaining approximate solutions but find that hard SC variants are inapproximable relative to all approximation schemes we consider. An alternative is to consider the membership of these problems in a well-studied class whose solvers are better understood than interpretability heuristics (de Haan & Szeider, 2017). We prove that local versions of SC are NP-complete. This implies there exist efficient transformations from instances of SC to those of the satisfiability problem (SAT; Biere et al., 2021), opening up the possibility to borrow techniques that work reasonably well in practice for SAT that might be suitable for some versions of neural network problems. Interestingly, this is not possible for the global version, which we prove is complete for a class higher up the complexity hierarchy (Σ₂ᵖ-complete). This result establishes a formal separation between local and global query complexity that partly explains 'interpretability illusions' (Friedman et al., 2024; Yu et al., 2024a), which we conjecture holds for other queries we investigate later. These illusions come about when an interpretability abstraction (e.g., a circuit) seems empirically faithful to its target (e.g., model behavior) by some criterion (e.g., 'local' tests on a dataset), but actually lacks faithfulness in the way it generalizes to other criteria (e.g., tests of its 'global' behavior outside the original distribution). Next we explore whether we could diagnose if SCs with some desired property (e.g., minimality) are abundant, which would be informative of the ability of heuristic search to stumble upon one of them. We analyze various queries where the output is a count of SCs (i.e., counting problems).
We find that both local and global, bounded and unbounded variants are #P-complete and remain intractable (#W[1]-hard) when parameterized by many network features including depth (Table 3). The hardness profile of SC over all these variants calls for exploring more substantial relaxations. We introduce the notion of quasi-minimality for this purpose (similar to Ramaswamy, 2019) and later demonstrate its usefulness beyond this particular problem. Any neuron in a minimal/minimum SC is a breaking point in the sense that removing it will break the target behavior. In quasi-minimal SCs we are merely guaranteed to know at least one neuron that causes this breakdown. By introducing this relaxation, which gives up some affordances but retains others of interest, we get a feasible interpretability query.

Problem 2 (Unbounded Quasi-minimal Local Sufficient Circuit (UQLSC))
Input: A multi-layer perceptron ℳ, and an input vector x.
Output: A circuit C in ℳ and a neuron v ∈ C s.t. C(x) = ℳ(x) and [C ∖ v](x) ≠ ℳ(x).

UQLSC is in PTIME. We describe an efficient algorithm to compute it which can be heuristically biased towards finding smaller circuits and combined with techniques that exploit weights and gradients (see Appendix: Definitions, Theorems and Proofs).

4.2 Gnostic Neuron

Gnostic neurons, sometimes called 'grandmother neurons' in neuroscience (Gale et al., 2020) and 'concept neurons' or 'object detectors' in AI (e.g., Bau et al., 2020), are one of the oldest and still current interpretability queries of interest (see also 'knowledge neurons'; Niu et al., 2023).

Problem 3 (Bounded Gnostic Neurons (BGN))
Input: A multi-layer perceptron ℳ and two sets of input vectors X and Y, an integer k, and an activation threshold t.
Output: A set of neurons V in ℳ of size |V| ≥ k such that for every v ∈ V: for all x ∈ X, ℳ(x) produces activations A_x^v ≥ t, and for all y ∈ Y, A_y^v < t; if it exists, else ⊥.

BGN is in PTIME. Alternatives might require GNs to have some behavioral effect when intervened on; such variants would remain tractable.

4.3 Circuit Ablation and Clamping

The idea that some neurons perform key subcomputations for certain tasks naturally leads to the hypothesis that ablating them should have downstream effects on the corresponding model behaviors. Searching for neuron sets with this property has been one strategy (i.e., zero-ablation) to get at important circuits (Wang & Veitch, 2024). The circuit ablation (CA) problem formalizes this idea.

Problem 4 (Bounded Local Circuit Ablation (BLCA))
Input: A multi-layer perceptron ℳ, an input vector x, and an integer u ≤ |ℳ|.
Output: A neuron set C in ℳ of size |C| ≤ u, s.t. [ℳ ∖ C](x) ≠ ℳ(x), if it exists, else ⊥.

A difference between CAs and minimal SCs is that the former can be interpreted as a possibly non-minimal breaking set in the context of the whole network whereas the latter is by default a minimal breaking set when the SC is taken in isolation. In this sense, CA can be seen as a less stringent criterion for circuit affordances. A related idea is circuit clamping (C): fixing the activations of certain neurons to a level that produces a change in the behavior of interest.

Problem 5 (Bounded Local Circuit Clamping (BLCC))
Input: A multi-layer perceptron ℳ, vector x, value r, and an integer u s.t. 1 < u ≤ |ℳ|.
Output: A subset of neurons C in ℳ of size |C| ≤ u such that for the ℳ* induced by clamping all c ∈ C to value r, ℳ*(x) ≠ ℳ(x); if it exists, otherwise ⊥.

Despite these more modest criteria, we find that both the local and global variants of CA and C are NP-hard, fixed-parameter intractable (W[1]-hard) relative to various parameters, and inapproximable in all five senses studied. However, we prove the local variants are NP-complete, which opens up practical options not available for other problems we study (see remarks in Section 4.1).

4.4 Circuit Patching

A critique of zero-ablation is the arbitrariness of the value, leading to alternatives such as mean-ablation (e.g., Wang et al., 2022). This contrasts studying circuits in isolation versus embedded in surrounding subnetworks. Activation patching (Ghandeharioun et al., 2024; Zhang & Nanda, 2023; Hanna et al., 2024) and path patching (Goldowsky-Dill et al., 2023) try to pinpoint which activations play an in-context role in model behavior, which inspires the circuit patching (CP) problem.

Problem 6 (Bounded Local Circuit Patching (BLCP))
Input: A multi-layer perceptron ℳ, an integer k, an input vector y, and a vector set X.
Output: A subset C in ℳ of size |C| ≤ k such that for the ℳ* induced by patching C with activations from ℳ(y) and ℳ ∖ C with activations from ℳ(x), ℳ*(x) = ℳ(y) for all x ∈ X; if it exists, otherwise ⊥.

We find that the local/global variants are intractable (NP-hard) in a way that does not depend on parameters such as network depth or the size of the patched circuit (W[1]-hard), and are inapproximable (c-, PTAS-, 3PA-inapprox.).
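To make the patching operation in Problem 6 concrete, here is a minimal sketch on a toy 2-4-1 ReLU network. All weights, inputs, and the patched set are invented for illustration; the hard question the problem poses is whether a small C with this effect exists for every x ∈ X.

```python
# Toy illustration of circuit patching (Problem 6, BLCP) on a
# 1-hidden-layer ReLU MLP. Weights and the circuit are made up.

def relu(z):
    return max(0.0, z)

# Hypothetical 2-4-1 MLP: W1[h][i], b1[h], W2[h], b2.
W1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0], [2.0, 0.0]]
b1 = [0.0, 0.0, 0.0, -1.0]
W2 = [1.0, -2.0, 1.0, 3.0]
b2 = 0.0

def forward(x, clamp=None):
    """Run the MLP; `clamp` maps hidden-neuron index -> fixed activation."""
    h = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    if clamp:
        for j, a in clamp.items():
            h[j] = a  # patch: overwrite with donor activations
    return sum(w * hj for w, hj in zip(W2, h)) + b2

def patched_output(x, y, C):
    """Patch neurons in C with activations recorded from the run on y."""
    donor = [relu(sum(w * yi for w, yi in zip(row, y)) + b)
             for row, b in zip(W1, b1)]
    return forward(x, clamp={j: donor[j] for j in C})

x, y = [1.0, 0.0], [0.0, 1.0]
full = set(range(4))
# Patching every hidden neuron trivially reproduces M(y); BLCP asks
# whether some small C already suffices.
assert patched_output(x, y, full) == forward(y)
```

The sketch treats a "circuit" as a set of hidden-neuron indices and patching as overwriting their activations with those recorded from a donor run, which mirrors the problem statement's induced model ℳ*.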
Although we also prove the local variant of CP is NP-complete, and therefore approachable in practice with solvers for hard problems in ways not available for the global variants (see remarks in Section 4.1), these complexity barriers motivate exploring further relaxations. With some modifications, the idea of quasi-minimality can be repurposed to do useful work here.

Problem 7 (Unbounded Quasi-minimal Local Circuit Patching (UQLCP))
Input: A multi-layer perceptron ℳ, an input vector y, and a set X of input vectors.
Output: A subset C in ℳ and a neuron v ∈ C such that for the ℳ* induced by patching C with activations from ℳ(y) and ℳ ∖ C with activations from ℳ(x), ∀x ∈ X: ℳ*(x) = ℳ(y); and for the ℳ′ induced by patching identically except for v ∈ C, ∃x ∈ X: ℳ′(x) ≠ ℳ(y).

In this way we obtain a tractable query (PTIME) for quasi-minimal patching, sidestepping barriers while retaining some useful affordances (see Table 1). We present an algorithm to compute UQLCP efficiently that can be combined with strategies exploiting weights and gradients (see Appendix: Definitions, Theorems and Proofs).

Table 4: Classical and parameterized complexity results by problem variant. Circuits are bounded-size unless otherwise stated. Each cell contains the complexity of the problem variant in terms of classical and FP (in)tractability, membership in complexity classes, and (in)approximability (A = {c, PTAS, 3PA}). ‘?’ marks potentially fruitful open problems. ‘N/A’ stands for not applicable.
Parameters: P = P_ℳ ∪ P_C, where P_ℳ = {L̂, Û_I, Û_O, Ŵ, B̂} (model parameters) and P_C = {l̂, l̂_w, û, û_I, û_O, ŵ, b̂} (circuit parameters).

| Problem variant | Local: Decision/Search | Local: Optimization | Global: Decision/Search | Global: Optimization |
|---|---|---|---|---|
| Sufficient Circuit (SC) | NP-complete | A-inapprox. | Σ₂ᵖ-complete | A-inapprox. |
| P-SC | W[1]-hard | A-inapprox. | W[1]-hard | A-inapprox. |
| Minimal SC | NP-complete | ? | ∈ Σ₂ᵖ; NP-hard | ? |
| P-Minimal SC | W[1]-hard | ? | W[1]-hard | ? |
| Unbounded Minimal SC | ? | N/A | ? | N/A |
| P-Unbounded Minimal SC | ? | N/A | ? | N/A |
| Unbounded Quasi-Minimal SC | PTIME | N/A | ? | N/A |
| Count SC | #P-complete | N/A | #P-hard | N/A |
| P-Count SC | #W[1]-hard | N/A | #W[1]-hard | N/A |
| Count Minimal SC | #P-complete | N/A | #P-hard | N/A |
| P-Count Minimal SC | #W[1]-hard | N/A | #W[1]-hard | N/A |
| Count Unbounded Minimal SC | #P-complete | N/A | #P-hard | N/A |
| Gnostic Neuron (GN) | PTIME | N/A | ? | N/A |
| Circuit Ablation (CA) | NP-complete | A-inapprox. | ∈ Σ₂ᵖ; NP-hard | A-inapprox. |
| {L̂, Û_I, Û_O, Ŵ, B̂, û}-CA | W[1]-hard | A-inapprox. | W[1]-hard | A-inapprox. |
| Circuit Clamping (C) | NP-complete | A-inapprox. | ∈ Σ₂ᵖ; NP-hard | A-inapprox. |
| {L̂, Û_O, Ŵ, B̂, û}-C | W[1]-hard | A-inapprox. | W[1]-hard | A-inapprox. |
| Circuit Patching (CP) | NP-complete | A-inapprox. | ∈ Σ₂ᵖ; NP-hard | A-inapprox. |
| {L̂, Û_O, Ŵ, B̂, û}-CP | W[2]-hard | A-inapprox. | W[2]-hard | A-inapprox. |
| Unbounded Quasi-Minimal CP | PTIME | N/A | ? | N/A |
| Necessary Circuit (NC) | ∈ Σ₂ᵖ; NP-hard | A-inapprox. | ∈ Σ₂ᵖ; NP-hard | A-inapprox. |
| {L̂, Û_I, Û_O, Ŵ, û}-NC | W[1]-hard | A-inapprox. | W[1]-hard | A-inapprox. |
| Circuit Robustness (CR) | coNP-complete | ? | ∈ Π₂ᵖ; coNP-hard | ? |
| {L̂, Û_I, Û_O, Ŵ, B̂, û}-CR | coW[1]-hard | ? | coW[1]-hard | ? |
| \|H\|-CR | FPT | FPT | ? | ? |
| {\|H\|, Û_I}-CR | FPT | FPT | FPT | FPT |
| Sufficient Reasons (SR) | ∈ Σ₂ᵖ; NP-hard | 3PA-inapprox. | N/A | N/A |
| {L̂, Û_O, Ŵ, B̂, û}-SR | W[1]-hard | 3PA-inapprox. | N/A | N/A |

4.5 Necessary Circuit

The criterion of necessity is a stringent one, and consequently necessary circuits (NCs) carry powerful affordances (see Table 1).
Since neurons in NCs collectively interact with all possible sufficient circuits for a target behavior, they are candidates to describe key task subcomputations, and intervening on them is guaranteed to have effects even in the presence of high redundancy. This relates to the notion of circuit overlap and therefore to efforts to identify circuits shared by various tasks (e.g., Merullo et al., 2024), and to the link between overlap and faithfulness (e.g., Hanna et al., 2024).

Problem 8 (Bounded Global Necessary Circuit (BGNC))
Input: A multi-layer perceptron ℳ, and an integer k.
Output: A subset S of neurons in ℳ of size |S| ≤ k such that S ∩ C ≠ ∅ for every circuit C in ℳ that is sufficient relative to all possible input vectors, if it exists, otherwise ⊥.

Unfortunately, both local and global versions of NC are NP-hard (and in Σ₂ᵖ; Table 4), remain intractable even when parameters such as network depth and the number of input and output neurons are kept small (Table 3), and do not admit any of the approximation schemes studied (Table 3). Tractable versions of NC are unlikely unless substantial restrictions or relaxations are introduced.

4.6 Circuit Robustness

A behavior of interest might be over-determined or resilient in the sense that many circuits in the model implement it and one can take over when another breaks down. This is related to the notion of redundancy used in neuroscience (e.g., Nanda et al., 2023). Intuitively, when a model implements a task in this way, the behavior should be more robust to perturbations. The possibility of verifying this property experimentally motivates the circuit robustness (CR) problem; a related interpretability effort is diagnosing nodes that are excluded by circuit discovery procedures but still have an impact on behavior (false negatives; Kramár et al., 2024).
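The experimental check motivated here, which Problem 9 below formalizes, can be sketched by brute force: enumerate every ablation of up to k neurons from a candidate set H and test whether the output on x survives. The model oracle `run` and the redundant-pair behavior are hypothetical stand-ins; note the subset enumeration over H, whose 2^|H| cost is why parameterizing by |H| yields fixed-parameter tractability.

```python
# Brute-force check of circuit robustness: no ablation of up to k
# neurons from H may change the output. `run(x, removed)` is a
# hypothetical oracle evaluating the model with a neuron set removed.
from itertools import combinations

def is_robust(run, x, H, k):
    """True iff M(x) = [M \\ H'](x) for every H' ⊆ H with |H'| <= k."""
    base = run(x, removed=frozenset())
    for size in range(1, k + 1):
        for Hp in combinations(H, size):
            if run(x, removed=frozenset(Hp)) != base:
                return False
    return True

# Toy behavior: neurons 0 and 1 redundantly support the output, so the
# behavior survives any single ablation but not ablating both.
run = lambda x, removed: int(bool({0, 1} - removed))
assert is_robust(run, x=None, H=[0, 1], k=1)
assert not is_robust(run, x=None, H=[0, 1], k=2)
```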
Problem 9 (Bounded Local Circuit Robustness (BLCR))
Input: A multi-layer perceptron ℳ, a subset H of ℳ, an input vector x, and an integer k with 1 ≤ k ≤ |H|.
Output: <YES> if ℳ(x) = [ℳ ∖ H′](x) for each H′ ⊆ H with |H′| ≤ k; otherwise <NO>.

We find that Local CR is coNP-complete, while Global CR is in Π₂ᵖ and coNP-hard. It remains fixed-parameter intractable (coW[1]-hard) relative to model parameters (Table 3). Pushing further, we explore parameterizing CR by |H| and prove the fixed-parameter tractability of |H|-CR, which holds for both the local and global versions. There exist algorithms for CR that scale well as long as |H| is reasonably small; a scenario that might be useful for probing robustness in practice. This wraps up our results for circuit queries. We briefly digress into explainability before discussing some implications.

4.7 Sufficient Reasons

Understanding the sufficient reasons (SR) for a model decision in terms of input features means knowing values of input components that are enough to determine the output. Given a model decision on an input, the most interesting reasons are those with the fewest components.

Problem 10 (Bounded Local Sufficient Reasons (BLSR))
Input: A multi-layer perceptron ℳ, an input vector x of length |x| = û_I, and an integer k with 1 ≤ k ≤ û_I.
Output: A subset xˢ of x of size |xˢ| = k such that ℳ(xᶜ) = ℳ(x) for every possible completion xᶜ of xˢ, if it exists, otherwise ⊥.

To demonstrate the usefulness of our framework beyond inner interpretability, we show how it links to explainability.
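The universal quantifier in Problem 10's output condition is what a verifier must discharge: every completion of the fixed components has to yield the same output. A minimal brute-force sketch over binary inputs follows; the model f and input are toy stand-ins, and the enumeration is exponential in the number of free components, in line with the hardness results.

```python
# Brute-force verifier for sufficient reasons (Problem 10, BLSR) over
# binary inputs: fixing x's components at indices S must determine f's output.
from itertools import product

def is_sufficient_reason(f, x, S):
    """True iff every completion of the components outside S preserves f(x)."""
    free = [i for i in range(len(x)) if i not in S]
    target = f(x)
    for bits in product([0, 1], repeat=len(free)):
        x2 = list(x)
        for i, b in zip(free, bits):
            x2[i] = b
        if f(x2) != target:
            return False
    return True

# Hypothetical model: output is 1 iff the first input bit is 1.
f = lambda x: int(x[0] == 1)
x = [1, 0, 1]
assert is_sufficient_reason(f, x, S={0})         # x[0] alone fixes the output
assert not is_sufficient_reason(f, x, S={1, 2})  # freeing x[0] can flip it
```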
Using our techniques for circuit queries, we significantly tighten existing results for SR (Barceló et al., 2020; Wäldchen et al., 2021) by proving that hardness (NP-hard, W[1]-hard, 3PA-inapprox.) holds even when the model has only one hidden layer.

5 Implications, limitations, and future directions

We presented a framework based on parameterized complexity to accompany experiments on inner interpretability with theoretical explorations of viable algorithms. With this grasp of circuit query complexity, we can understand the challenges of scalability and the mixed outcomes of experiments with heuristics for circuit discovery. There is ample complexity-theoretic evidence that there is a limit (often underestimated) to how good the performance of heuristics on intractable problems can be (Hemaspaandra & Williams, 2012). We can explain ‘interpretability illusions’ (Friedman et al., 2024) due to lack of faithfulness, minimality (e.g., Shi et al., 2024; Yu et al., 2024a), and other affordances (Wang & Veitch, 2024; Hase et al., 2023) in terms of the kinds of circuits that our current heuristics are well-equipped to discover. For instance, consider the algorithm for automated circuit discovery proposed by Conmy et al. (2023), which eliminates one network component at a time if the consequence for behavior is reasonably small. Since this algorithm runs in polynomial time, it is unlikely to solve the problems proven hard here, such as Minimal Sufficient Circuit. However, one reason we observe interesting results in some cases is that it is well-equipped to solve quasi-minimal circuit problems. As our conceptual and formal analyses show, quasi-minimal circuits can mimic various desirable aspects of sufficient circuits (Table 1), and the former can be found tractably (our results for UQLSC and UQLCP).
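The one-component-at-a-time elimination scheme just described can be sketched as greedy pruning against a behavior oracle. Here `behaves` is a hypothetical stand-in for the behavior-preservation test a real implementation would run on model outputs; the result is quasi-minimal in the paper's sense, since each failed removal witnesses a breaking point (provided at least one removal fails).

```python
# Greedy, one-at-a-time pruning toward a quasi-minimal circuit.
# `behaves(circuit)` answers: does the circuit still reproduce the behavior?

def greedy_prune(neurons, behaves):
    circuit = set(neurons)
    breaking = None
    for v in list(neurons):      # heuristic order; could be gradient-guided
        trial = circuit - {v}
        if behaves(trial):
            circuit = trial      # removal is safe: keep it
        else:
            breaking = v         # witnessed breaking point stays in circuit
    return circuit, breaking

# Toy behavior: the circuit works iff it still contains neuron 2 or neuron 4.
behaves = lambda c: 2 in c or 4 in c
circuit, breaking = greedy_prune(range(5), behaves)
```

On this toy oracle the loop strips neurons 0-3 (neuron 4 keeps the behavior alive) and then fails to remove neuron 4, returning it as the witnessed breaking point; nothing guarantees global minimality, only that one member is known to break the behavior when removed.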
At the same time, understanding these properties of circuit discovery heuristics helps us explain observed discrepancies: why we often see (1) lack of faithfulness (i.e., global coverage is out of reach for quasi-minimal circuit algorithms), (2) non-minimality (i.e., quasi-minimal circuits can have many non-breaking points), and (3) large variability in performance across tasks and analysis parameters (e.g., Shi et al., 2024; Conmy et al., 2023). Although we find that many queries of interest are intractable in the general case (and empirical results are in line with this characterization), this should not paralyze efforts to interpret neural network models. As our exploration of the current complexity landscape shows, reasonable relaxations, restrictions, and problem variants can yield tractable queries for circuits with useful properties. Consider a few out of many possible avenues to continue these explorations. (i) Study query parameters. Faced with an intractable query, we can investigate which parameters of the problem (e.g., network or circuit aspects) might be responsible for its core hardness. If these problematic parameters can be kept small in real-world applications, this yields a fixed-parameter tractable query. We have explored some, but more are possible, as any aspect of the problem can be parameterized. A close dialogue between theorists and experimentalists is important here, as empirical regularities suggest which parameters might be fruitful to explore theoretically, and experiments test whether theoretically conjectured parameters can be kept small in practice. (ii) Generate novel queries. Our formalization of quasi-minimal circuit problems illustrates the search for viable algorithmic options with examples of tractable problems for inner interpretability. When the use case is well defined, efficient queries that return circuits with useful affordances for applications can be designed.
Alternative circuits might also mimic the affordances for prediction/control of ideal circuits while avoiding intractability. (iii) Explore network output as an axis of approximation. Some of our constructions use binary input/output (following previous work; e.g., Bassan et al., 2024; Barceló et al., 2020). Although continuous output does not necessarily matter complexity-wise (see Appendix: Definitions, Theorems and Proofs for [counter]examples), this is an interesting direction for future work, as it opens the door to studying the network output as an axis of approximation, which in turn might be a useful relaxation. (iv) Design more abstract queries. A different path is to design queries that partially rely on mid-level abstractions (Vilas et al., 2024a) to bridge the gap between circuits and human-intelligible algorithms (e.g., key-value mechanisms; Geva et al., 2022; Vilas et al., 2024b). (v) Characterize actual network structure. It is in principle possible that some real-world, trained neural networks possess internal structure that is benevolent to general (ideal) circuit queries (e.g., redundancy; see Appendix: Definitions, Theorems and Proofs). In such optimistic scenarios, general-purpose heuristics might work well. The empirical evidence available to date, however, speaks against this. In any case, it will always be important to characterize any such structure so it can be used explicitly to design algorithms with useful guarantees. (vi) Compare resource demands of interpretability/explainability across architectures. Our results for inner interpretability complement those of explainability (e.g., Barceló et al., 2020; Bassan et al., 2024; Wäldchen et al., 2021). These aspects can be studied together for different architectures to assess their intrinsic interpretability. To some extent our results already transfer to some cases of interest.
Since the transformer architecture contains MLPs, the complexity status of our circuit queries bears on neuron-level circuit discovery efforts in transformers (e.g., Large Language/Vision/Audio models). Acknowledgments We thank Ronald de Haan for comments on proving membership using alternating quantifier formulas. References Adolfi (2023) Federico Adolfi. Computational Meta-Theory in Cognitive Science: A Theoretical Computer Science Framework. PhD thesis, University of Bristol, Bristol, UK, 2023. Adolfi & van Rooij (2023) Federico Adolfi and Iris van Rooij. Resource demands of an implementationist approach to cognition. In Proceedings of the 21st International Conference on Cognitive Modeling, 2023. Adolfi et al. (2024) Federico Adolfi, Martina G. Vilas, and Todd Wareham. Complexity-Theoretic Limits on the Promises of Artificial Neural Network Reverse-Engineering. Proceedings of the Annual Meeting of the Cognitive Science Society, 46(0), 2024. Arora & Barak (2009) Sanjeev Arora and Boaz Barak. Computational Complexity: A Modern Approach. Cambridge University Press, Cambridge ; New York, 2009. ISBN 978-0-521-42426-4. Arora et al. (1998) Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. Journal of the ACM (JACM), 45(3):501–555, 1998. Ausiello et al. (1999) Giorgio Ausiello, Alberto Marchetti-Spaccamela, Pierluigi Crescenzi, Giorgio Gambosi, Marco Protasi, and Viggo Kann. Complexity and Approximation. Springer Berlin Heidelberg, Berlin, Heidelberg, 1999. ISBN 978-3-642-63581-6 978-3-642-58412-1. Barceló et al. (2020) Pablo Barceló, Mikaël Monet, Jorge Pérez, and Bernardo Subercaseaux. Model Interpretability through the lens of Computational Complexity. In Advances in Neural Information Processing Systems, volume 33, p. 15487–15498. Curran Associates, Inc., 2020. Bassan & Katz (2023) Shahaf Bassan and Guy Katz. 
Towards Formal XAI: Formally Approximate Minimal Explanations of Neural Networks. In Tools and Algorithms for the Construction and Analysis of Systems: 29th International Conference, TACAS 2023, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2023, Paris, France, April 22–27, 2023, Proceedings, Part I, p. 187–207, Berlin, Heidelberg, 2023. Springer-Verlag. ISBN 978-3-031-30822-2. Bassan et al. (2024) Shahaf Bassan, Guy Amir, and Guy Katz. Local vs. Global Interpretability: A Computational Complexity Perspective. In Proceedings of the 41st International Conference on Machine Learning, p. 3133–3167. PMLR, 2024. Bau et al. (2020) David Bau, Jun-Yan Zhu, Hendrik Strobelt, Agata Lapedriza, Bolei Zhou, and Antonio Torralba. Understanding the role of individual units in a deep neural network. Proceedings of the National Academy of Sciences, 117(48):30071–30078, 2020. Bereska & Gavves (2024) Leonard Bereska and Stratis Gavves. Mechanistic interpretability for AI safety - a review. Transactions on Machine Learning Research, 2024. ISSN 2835-8856. Survey Certification, Expert Certification. Biere et al. (2021) Armin Biere, Marijn J. H. Heule, Hans van Maaren, and Toby Walsh (eds.). Handbook of Satisfiability. Number Volume 336,1 in Frontiers in Artificial Intelligence and Applications. IOS Press, Amsterdam Berlin Washington, DC, second edition edition, 2021. ISBN 978-1-64368-160-3. Cammarata et al. (2020) Nick Cammarata, Shan Carter, Gabriel Goh, Chris Olah, Michael Petrov, Ludwig Schubert, Chelsea Voss, Ben Egan, and Swee Kiat Lim. Thread: Circuits. Distill, 5(3):e24, 2020. Cao & Yamins (2023) Rosa Cao and Daniel Yamins. Explanatory models in neuroscience, Part 2: Functional intelligibility and the contravariance principle. Cognitive Systems Research, p. 101200, 2023. Chen et al. (2020) Sitan Chen, Adam R. Klivans, and Raghu Meka. Learning Deep ReLU Networks Is Fixed-Parameter Tractable. (arXiv:2009.13512), 2020. 
Chen & Lin (2019) Yijia Chen and Bingkai Lin. The Constant Inapproximability of the Parameterized Dominating Set Problem. SIAM Journal on Computing, 48(2):513–533, 2019. Conmy et al. (2023) Arthur Conmy, Augustine Mavor-Parker, Aengus Lynch, Stefan Heimersheim, and Adrià Garriga-Alonso. Towards Automated Circuit Discovery for Mechanistic Interpretability. Advances in Neural Information Processing Systems, 36:16318–16352, 2023. Dai et al. (2022) Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, and Furu Wei. Knowledge Neurons in Pretrained Transformers. In Smaranda Muresan, Preslav Nakov, and Aline Villavicencio (eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), p. 8493–8502, Dublin, Ireland, 2022. Association for Computational Linguistics. de Haan & Szeider (2017) Ronald de Haan and Stefan Szeider. Parameterized complexity classes beyond para-NP. Journal of Computer and System Sciences, 87:16–57, 2017. Downey & Fellows (1999) R. G. Downey and M. R. Fellows. Parameterized Complexity. Monographs in Computer Science. Springer New York, New York, NY, 1999. ISBN 978-1-4612-6798-0 978-1-4612-0515-9. Downey & Fellows (2013) Rod G. Downey and Michael R. Fellows. Fundamentals of Parameterized Complexity. Texts in Computer Science. Springer, London [u.a.], 2013. ISBN 978-1-4471-5559-1. Feghhi et al. (2024) Ebrahim Feghhi, Nima Hadidi, Bryan Song, Idan A. Blank, and Jonathan C. Kao. What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores. (arXiv:2406.01538), 2024. Ferrando & Voita (2024) Javier Ferrando and Elena Voita. Information Flow Routes: Automatically Interpreting Language Models at Scale. (arXiv:2403.00824), 2024. Flum & Grohe (2006) Jörg Flum and Martin Grohe. Parameterized Complexity Theory. Texts in Theoretical Computer Science. Springer, Berlin Heidelberg New York, 2006. ISBN 978-3-540-29953-0. Fortnow (2009) Lance Fortnow. 
The status of the P versus NP problem. Communications of the ACM, 52(9):78–86, 2009. Friedman et al. (2024) Dan Friedman, Andrew Lampinen, Lucas Dixon, Danqi Chen, and Asma Ghandeharioun. Interpretability Illusions in the Generalization of Simplified Models. (arXiv:2312.03656), 2024. Gale et al. (2020) Ella M. Gale, Nicholas Martin, Ryan Blything, Anh Nguyen, and Jeffrey S. Bowers. Are there any ‘object detectors’ in the hidden layers of CNNs trained to identify objects or scenes? Vision Research, 176:60–71, 2020. García-Carrasco et al. (2024) Jorge García-Carrasco, Alejandro Maté, and Juan Trujillo. Detecting and Understanding Vulnerabilities in Language Models via Mechanistic Interpretability. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, p. 385–393, 2024. Garey & Johnson (1979) Michael R Garey and David S Johnson. Computers and intractability. W.H. Freeman, 1979. Gasarch et al. (1995) William I. Gasarch, Mark W. Krentel, and Kevin J. Rappoport. OptP as the normal behavior of NP-complete problems. Mathematical Systems Theory, 28(6):487–514, 1995. Geiger et al. (2024) Atticus Geiger, Duligur Ibeling, Amir Zur, Maheep Chaudhary, Sonakshi Chauhan, Jing Huang, Aryaman Arora, Zhengxuan Wu, Noah Goodman, Christopher Potts, and Thomas Icard. Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability. (arXiv:2301.04709), 2024. Geva et al. (2021) Mor Geva, Roei Schuster, Jonathan Berant, and Omer Levy. Transformer Feed-Forward Layers Are Key-Value Memories. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (eds.), Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, p. 5484–5495, Online and Punta Cana, Dominican Republic, 2021. Association for Computational Linguistics. Geva et al. (2022) Mor Geva, Avi Caciularu, Kevin Wang, and Yoav Goldberg. Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space. 
In Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang (eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, p. 30–45, Abu Dhabi, United Arab Emirates, 2022. Association for Computational Linguistics. Ghandeharioun et al. (2024) Asma Ghandeharioun, Avi Caciularu, Adam Pearce, Lucas Dixon, and Mor Geva. Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models. In Proceedings of the 41st International Conference on Machine Learning, p. 15466–15490. PMLR, 2024. Gill (1977) John Gill. Computational Complexity of Probabilistic Turing Machines. SIAM Journal on Computing, 6(4):675–695, 1977. Goldowsky-Dill et al. (2023) Nicholas Goldowsky-Dill, Chris MacLeod, Lucas Sato, and Aryaman Arora. Localizing Model Behavior with Path Patching. (arXiv:2304.05969), 2023. Gurnee et al. (2023) Wes Gurnee, Neel Nanda, Matthew Pauly, Katherine Harvey, Dmitrii Troitskii, and Dimitris Bertsimas. Finding Neurons in a Haystack: Case Studies with Sparse Probing. Transactions on Machine Learning Research, 2023. Hanna et al. (2023) Michael Hanna, Ollie Liu, and Alexandre Variengien. How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model. In Thirty-Seventh Conference on Neural Information Processing Systems, 2023. Hanna et al. (2024) Michael Hanna, Sandro Pezzelle, and Yonatan Belinkov. Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms. (arXiv:2403.17806), 2024. Hase et al. (2023) Peter Hase, Mohit Bansal, Been Kim, and Asma Ghandeharioun. Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models. Advances in Neural Information Processing Systems, 36:17643–17668, 2023. Hemaspaandra & Williams (2012) Lane A. Hemaspaandra and Ryan Williams. SIGACT News Complexity Theory Column 76: An atypical survey of typical-case heuristic algorithms. 
ACM SIGACT News, 43(4):70–89, 2012. Hitchcock & Pavan (2015) John M. Hitchcock and A. Pavan. On the NP-Completeness of the Minimum Circuit Size Problem. LIPIcs, Volume 45, FSTTCS 2015, 45:236–245, 2015. Hoang-Xuan et al. (2024) Nhat Hoang-Xuan, Minh Vu, and My T. Thai. LLM-assisted Concept Discovery: Automatically Identifying and Explaining Neuron Functions. (arXiv:2406.08572), 2024. Hooker et al. (2021) Sara Hooker, Aaron Courville, Gregory Clark, Yann Dauphin, and Andrea Frome. What Do Compressed Deep Neural Networks Forget? (arXiv:1911.05248), 2021. Jacovi & Goldberg (2020) Alon Jacovi and Yoav Goldberg. Towards Faithfully Interpretable NLP Systems: How Should We Define and Evaluate Faithfulness? In Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault (eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, p. 4198–4205, Online, 2020. Association for Computational Linguistics. Karmarkar (1984) N. Karmarkar. A new polynomial-time algorithm for linear programming. Combinatorica, 4(4):373–395, 1984. Kramár et al. (2024) János Kramár, Tom Lieberum, Rohin Shah, and Neel Nanda. AtP*: An efficient and scalable method for localizing LLM behaviour to components. (arXiv:2403.00745), 2024. Krentel (1988) MW Krentel. The complexity of optimization functions. Journal of Computer and System Sciences, 36(3):490–509, 1988. Lampinen et al. (2024) Andrew Kyle Lampinen, Stephanie C. Y. Chan, and Katherine Hermann. Learned feature representations are biased by complexity, learning order, position, and more. Transactions on Machine Learning Research, 2024. Lee et al. (2023) Jae Hee Lee, Sergio Lanza, and Stefan Wermter. From Neural Activations to Concepts: A Survey on Explaining Concepts in Neural Networks. (arXiv:2310.11884), 2023. Lepori et al. (2023) Michael A. Lepori, Thomas Serre, and Ellie Pavlick. Uncovering Causal Variables in Transformers using Circuit Probing. (arXiv:2311.04354), 2023. Lieberum et al. 
(2023) Tom Lieberum, Matthew Rahtz, János Kramár, Neel Nanda, Geoffrey Irving, Rohin Shah, and Vladimir Mikulik. Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla. (arXiv:2307.09458), 2023. Lindsay (2024) Grace W Lindsay. Grounding neuroscience in behavioral changes using artificial neural networks. Current Opinion in Neurobiology, 2024. Lindsay & Bau (2023) Grace W. Lindsay and David Bau. Testing methods of neural systems understanding. Cognitive Systems Research, 82:101156, 2023. Livni et al. (2014) Roi Livni, Shai Shalev-Shwartz, and Ohad Shamir. On the Computational Efficiency of Training Neural Networks. In Advances in Neural Information Processing Systems, volume 27. Curran Associates, Inc., 2014. Marks et al. (2024) Samuel Marks, Can Rager, Eric J. Michaud, Yonatan Belinkov, David Bau, and Aaron Mueller. Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models. (arXiv:2403.19647), 2024. McInerney & Burke (2023) Andrew McInerney and Kevin Burke. Feedforward neural networks as statistical models: Improving interpretability through uncertainty quantification. (arXiv:2311.08139), 2023. Meng et al. (2022) Kevin Meng, Arnab Sen Sharma, Alex J. Andonian, Yonatan Belinkov, and David Bau. Mass-Editing Memory in a Transformer. In The Eleventh International Conference on Learning Representations, 2022. Meng et al. (2024) Kevin Meng, David Bau, Alex Andonian, and Yonatan Belinkov. Locating and editing factual associations in GPT. In Proceedings of the 36th International Conference on Neural Information Processing Systems, NIPS ’22, p. 17359–17372, Red Hook, NY, USA, 2024. Curran Associates Inc. ISBN 978-1-71387-108-8. Merullo et al. (2024) Jack Merullo, Carsten Eickhoff, and Ellie Pavlick. Circuit Component Reuse Across Tasks in Transformer Language Models. In The Twelfth International Conference on Learning Representations, 2024. 
Motwani & Raghavan (1995) Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. Cambridge University Press, Cambridge; New York, 1995. ISBN 978-0-521-47465-8.
Nanda et al. (2023) Vedant Nanda, Till Speicher, John Dickerson, Krishna Gummadi, Soheil Feizi, and Adrian Weller. Diffused redundancy in pre-trained representations. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine (eds.), Advances in Neural Information Processing Systems, volume 36, pp. 4055–4079. Curran Associates, Inc., 2023.
Niu et al. (2023) Jingcheng Niu, Andrew Liu, Zining Zhu, and Gerald Penn. What does the Knowledge Neuron Thesis Have to do with Knowledge? In The Twelfth International Conference on Learning Representations, 2023.
Olah et al. (2017) Chris Olah, Alexander Mordvintsev, and Ludwig Schubert. Feature Visualization. Distill, 2(11):e7, 2017.
Olah et al. (2020) Chris Olah, Nick Cammarata, Ludwig Schubert, Gabriel Goh, Michael Petrov, and Shan Carter. Zoom In: An Introduction to Circuits. Distill, 5(3):e00024.001, 2020.
Oota et al. (2023) Subba Reddy Oota, Emin Çelik, Fatma Deniz, and Mariya Toneva. Speech language models lack important brain-relevant semantics. arXiv:2311.04664, 2023.
Ordyniak et al. (2023) Sebastian Ordyniak, Giacomo Paesani, and Stefan Szeider. The Parameterized Complexity of Finding Concise Local Explanations. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pp. 3312–3320, Macau, SAR China, 2023. International Joint Conferences on Artificial Intelligence Organization. ISBN 978-1-956792-03-4.
Papadimitriou & Yannakakis (1991) C. H. Papadimitriou and M. Yannakakis. Optimization, approximation, and complexity classes. Journal of Computer and System Sciences, 43:425–440, 1991.
Pavlick (2023) Ellie Pavlick. Symbols and grounding in large language models. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 381(2251):20220041, 2023.
Provan & Ball (1983) J. Scott Provan and Michael O. Ball. The Complexity of Counting Cuts and of Computing the Probability that a Graph is Connected. SIAM Journal on Computing, 12(4):777–788, 1983.
Ramaswamy (2019) Venkatakrishnan Ramaswamy. An Algorithmic Barrier to Neural Circuit Understanding. bioRxiv, 2019. doi: 10.1101/639724.
Räuker et al. (2023) Tilman Räuker, Anson Ho, Stephen Casper, and Dylan Hadfield-Menell. Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks. In 2023 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pp. 464–483. IEEE Computer Society, 2023. ISBN 978-1-66546-299-0.
Ross & Bassett (2024) Lauren N. Ross and Dani S. Bassett. Causation in neuroscience: Keeping mechanism meaningful. Nature Reviews Neuroscience, 25(2):81–90, 2024.
Rossem & Saxe (2024) Loek Van Rossem and Andrew M. Saxe. When Representations Align: Universality in Representation Learning Dynamics. In Proceedings of the 41st International Conference on Machine Learning, pp. 49098–49121. PMLR, 2024.
Schaefer & Umans (2002) Marcus Schaefer and Christopher Umans. Completeness in the polynomial-time hierarchy: A compendium. SIGACT News, 33(3):32–49, 2002.
Shi et al. (2024) Claudia Shi, Nicolas Beltran-Velez, Achille Nazaret, Carolina Zheng, Adrià Garriga-Alonso, Andrew Jesson, Maggie Makar, and David Blei. Hypothesis Testing the Circuit Hypothesis in LLMs. In ICML 2024 Workshop on Mechanistic Interpretability, 2024.
Song et al. (2017) Le Song, Santosh Vempala, John Wilmes, and Bo Xie. On the Complexity of Learning Neural Networks. In Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017.
Subercaseaux (2020) Bernardo Aníbal Subercaseaux. Model Interpretability through the Lens of Computational Complexity. PhD thesis, Universidad de Chile, 2020.
Syed et al. (2023) Aaquib Syed, Can Rager, and Arthur Conmy. Attribution Patching Outperforms Automated Circuit Discovery. In NeurIPS Workshop on Attributing Model Behavior at Scale, 2023.
Tigges et al. (2024) Curt Tigges, Michael Hanna, Qinan Yu, and Stella Biderman. LLM Circuit Analyses Are Consistent Across Training and Scale. arXiv:2407.10827, 2024.
Toda (1991) Seinosuke Toda. PP is as Hard as the Polynomial-Time Hierarchy. SIAM Journal on Computing, 20(5):865–877, 1991.
Tolstikhin et al. (2021) Ilya O. Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, and Alexey Dosovitskiy. MLP-Mixer: An all-MLP Architecture for Vision. In Advances in Neural Information Processing Systems (NeurIPS 2021), volume 34, pp. 24261–24272. Curran Associates, Inc., 2021.
Valiant (1979) Leslie G. Valiant. The Complexity of Enumeration and Reliability Problems. SIAM Journal on Computing, 8(3):410–421, 1979.
Vaswani et al. (2017) Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is All you Need. In Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017.
Vilas et al. (2024a) Martina G. Vilas, Federico Adolfi, David Poeppel, and Gemma Roig. Position: An Inner Interpretability Framework for AI Inspired by Lessons from Cognitive Neuroscience. In Proceedings of the 41st International Conference on Machine Learning, pp. 49506–49522. PMLR, 2024a.
Vilas et al. (2024b) Martina G. Vilas, Timothy Schaumlöffel, and Gemma Roig. Analyzing vision transformers for image classification in class embedding space. In Proceedings of the 37th International Conference on Neural Information Processing Systems (NIPS '23), pp. 40030–40041, Red Hook, NY, USA, 2024b. Curran Associates Inc.
Voss et al. (2021) Chelsea Voss, Nick Cammarata, Gabriel Goh, Michael Petrov, Ludwig Schubert, Ben Egan, Swee Kiat Lim, and Chris Olah. Visualizing Weights. Distill, 6(2):e00024.007, 2021.
Wäldchen et al. (2021) Stephan Wäldchen, Jan Macdonald, Sascha Hauch, and Gitta Kutyniok. The computational complexity of understanding binary classifier decisions. Journal of Artificial Intelligence Research, 70:351–387, 2021.
Wang et al. (2022) Kevin Ro Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, and Jacob Steinhardt. Interpretability in the Wild: A Circuit for Indirect Object Identification in GPT-2 Small. In The Eleventh International Conference on Learning Representations, 2022.
Wang & Veitch (2024) Zihao Wang and Victor Veitch. Does Editing Provide Evidence for Localization? In ICML 2024 Workshop on Mechanistic Interpretability, 2024.
Wareham (1999) Harold T. Wareham. Systematic Parameterized Complexity Analysis in Computational Phonology. PhD thesis, University of Victoria, Canada, 1999.
Wareham (2022) Todd Wareham. Creating teams of simple agents for specified tasks: A computational complexity perspective. arXiv:2205.02061, 2022.
Yu et al. (2024a) Lei Yu, Jingcheng Niu, Zining Zhu, and Gerald Penn. Functional Faithfulness in the Wild: Circuit Discovery with Differentiable Computation Graph Pruning. arXiv:2407.03779, 2024a.
Yu et al. (2024b) Runpeng Yu, Weihao Yu, and Xinchao Wang. KAN or MLP: A Fairer Comparison. arXiv:2407.16674, 2024b.
Zhang et al. (2024) Enyan Zhang, Michael A. Lepori, and Ellie Pavlick. Instilling Inductive Biases with Subnetworks. arXiv:2310.10899, 2024.
Zhang & Nanda (2023) Fred Zhang and Neel Nanda. Towards Best Practices of Activation Patching in Language Models: Metrics and Methods. In The Twelfth International Conference on Learning Representations, 2023.

Appendix: Definitions, Theorems and Proofs

Appendix A Preliminaries

Each section of this appendix is self-contained except for the following definitions. We re-state interpretability query definitions in a more detailed form in each section for convenience.
To err on the side of rigor, we use here more cumbersome notation than in the main manuscript, where we favored succinctness.

A.1 Computational problems of known complexity

Some of our proofs construct reductions from the following computational problems.

Clique (Garey & Johnson, 1979, Problem GT19)
Input: An undirected graph G = (V, E) and a positive integer k.
Question: Does G have a clique of size at least k, i.e., a subset V′ ⊆ V, |V′| ≥ k, such that for all pairs v, v′ ∈ V′, (v, v′) ∈ E?

Vertex cover (VC) (Garey & Johnson, 1979, Problem GT1)
Input: An undirected graph G = (V, E) and a positive integer k.
Question: Does G contain a vertex cover of size at most k, i.e., a subset V′ ⊆ V, |V′| ≤ k, such that for each edge (u, v) ∈ E, at least one of u and v is in V′?

Dominating set (DS) (Garey & Johnson, 1979, Problem GT2)
Input: An undirected graph G = (V, E) and a positive integer k.
Question: Does G contain a dominating set of size at most k, i.e., a subset V′ ⊆ V, |V′| ≤ k, such that for every v ∈ V, either v ∈ V′ or there is at least one v′ ∈ V′ such that (v, v′) ∈ E?

Hitting set (HS) (Garey & Johnson, 1979, Problem SP8)
Input: A collection C of subsets of a finite set S and a positive integer k.
Question: Is there a subset S′ ⊆ S, |S′| ≤ k, such that S′ has a non-empty intersection with each set in C?

Minimum DNF Tautology (3DT) (Schaefer & Umans, 2002, Problem L7)
Input: A 3-DNF tautology ϕ with T terms over a set of variables V and a positive integer k.
Question: Is there a 3-DNF formula ϕ′ made up of at most k of the terms in ϕ that is also a tautology?
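For concreteness, the first of these problems admits an obvious brute-force decision procedure, sketched below (our own illustration; the function names are ours). Note the asymmetry the hardness proofs below exploit: verifying a candidate clique takes polynomial time, while the space of candidate subsets is exponential.

```python
from itertools import combinations

def is_clique(edges, subset):
    # Verifying a certificate is polynomial: check that every pair is adjacent.
    return all((u, v) in edges or (v, u) in edges
               for u, v in combinations(subset, 2))

def has_clique(V, E, k):
    # Brute-force search over all size-k subsets: exponential in general.
    E = set(E)
    return any(is_clique(E, S) for S in combinations(V, k))

V = [1, 2, 3, 4]
E = [(1, 2), (1, 3), (2, 3), (3, 4)]
print(has_clique(V, E, 3))  # True: {1, 2, 3} is a triangle
print(has_clique(V, E, 4))  # False
```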
A.2 Classical and parameterized complexity

Definition 1 (Polynomial-time tractability). An algorithm is said to run in polynomial time if the number of steps it performs is O(n^c), where n is a measure of the input size and c is some constant. A problem Π is said to be tractable if it has a polynomial-time algorithm. P denotes the class of such problems.

Consider a more fine-grained look at the sources of complexity of problems. The following is a relaxation of the notion of tractability, where unreasonable resource demands are allowed as long as they are constrained to a set of problem parameters.

Definition 2 (Fixed-parameter tractability). Let 𝒫 be a set of problem parameters. A problem 𝒫-Π is fixed-parameter tractable relative to 𝒫 if there exists an algorithm that computes solutions to instances of 𝒫-Π of any size n in time f(𝒫) · n^c, where c is a constant and f(·) is some computable function. FPT denotes the class of such problems and includes all problems in P.

A.3 Hardness and reductions

Most proof techniques in this work involve reductions between computational problems.

Definition 3 (Reducibility). A problem Π_1 is polynomial-time reducible to Π_2 if there exists a polynomial-time algorithm (reduction) that transforms instances of Π_1 into instances of Π_2 such that solutions for Π_2 can be transformed in polynomial time into solutions for Π_1. This implies that if a tractable algorithm for Π_2 exists, it can be used to solve Π_1 tractably.
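Definition 2 can be made concrete with the textbook bounded search tree for Vertex cover (as defined in A.1): branching on an uncovered edge yields a runtime of the form O(2^k · poly(n)), i.e., f(k) · n^c. The sketch below is our own minimal rendering of this standard algorithm, not code from the paper.

```python
def vc_branch(edges, k):
    """Bounded search tree for Vertex cover: is there a cover of size <= k?
    Branching on an uncovered edge gives a search tree of depth k with at
    most 2^k leaves, so the runtime has the FPT form f(k) * poly(n)."""
    if not edges:
        return True   # every edge is covered
    if k == 0:
        return False  # edges remain but the budget is exhausted
    u, v = next(iter(edges))
    # Any cover must contain u or v, so branch on both choices.
    without_u = [e for e in edges if u not in e]
    without_v = [e for e in edges if v not in e]
    return vc_branch(without_u, k - 1) or vc_branch(without_v, k - 1)

# Path 1-2-3-4 has the vertex cover {2, 3} of size 2, but no cover of size 1.
edges = [(1, 2), (2, 3), (3, 4)]
print(vc_branch(edges, 2))  # True
print(vc_branch(edges, 1))  # False
```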
Fpt-reductions transform an instance (x, k) of some problem parameterized by k into an instance (x′, k′) of another problem, with k′ ≤ g(k), in time f(k) · p(|x|), where p is a polynomial and g(·) is an arbitrary function. These reductions analogously transfer fixed-parameter tractability results between problems.

Hardness results are generally conditional on two conjectures with extensive theoretical and empirical support. Intractability statements build on these as follows.

Conjecture 1. P ≠ NP.

Definition 4 (Polynomial-time intractability). The class NP contains all problems in P and more. Assuming Conjecture 1, NP-hard problems lie outside P. These problems are considered intractable because they cannot be solved in polynomial time (unless Conjecture 1 is false; see Fortnow, 2009).

Conjecture 2. FPT ≠ W[1].

Definition 5 (Fixed-parameter intractability). The class W[1] contains all problems in the class FPT and more. Assuming Conjecture 2, W[1]-hard parameterized problems lie outside FPT. These problems are considered fixed-parameter intractable, relative to a given parameter set, because no fixed-parameter tractable algorithm can exist to solve them (unless Conjecture 2 is false; see Downey & Fellows, 2013).

The following two easily proven lemmas will be useful in our parameterized complexity proofs.

Lemma 1. (Wareham, 1999, Lemma 2.1.30) If problem Π is fp-tractable relative to aspect-set K, then Π is fp-tractable relative to any aspect-set K′ such that K ⊂ K′.

Lemma 2. (Wareham, 1999, Lemma 2.1.31) If problem Π is fp-intractable relative to aspect-set K, then Π is fp-intractable relative to any aspect-set K′ such that K′ ⊂ K.

A.4 Approximation

Although computing optimal solutions might sometimes be intractable, it is still conceivable that we could devise tractable procedures to obtain approximate solutions that are useful in practice.
We consider two natural notions of additive and multiplicative approximation and three probabilistic schemes.

A.4.1 Multiplicative approximation

For a minimization problem Π, let OPT_Π(I) be an optimal solution for Π on instance I, let A_Π(I) be a solution for Π on I returned by an algorithm A, and let m(OPT_Π(I)) and m(A_Π(I)) be the values of these solutions.

Definition 6 (Multiplicative approximation algorithm). [Ausiello et al., 1999, Def. 3.5]. Given a minimization problem Π, an algorithm A is a multiplicative ϵ-approximation algorithm for Π if for each instance I of Π, m(A_Π(I)) − m(OPT_Π(I)) ≤ ϵ × m(OPT_Π(I)).

It would be ideal if one could obtain approximate solutions for a problem Π that are arbitrarily close to optimal, provided one is willing to allow extra algorithm runtime.

Definition 7 (Multiplicative approximation scheme). [Adapted from Ausiello et al., 1999, Def. 3.10]. Given a minimization problem Π, a polynomial-time approximation scheme (PTAS) for Π is a set 𝒜 of algorithms such that for each integer k > 0, there is a (1/k)-approximation algorithm A^k_Π ∈ 𝒜 that runs in time polynomial in |I|.

A.4.2 Additive approximation

It would be useful to have guarantees that an approximation algorithm for our problems returns solutions at most a fixed distance away from optimal. This would ensure errors cannot get impractically large.

Definition 8 (Additive approximation algorithm). [Adapted from Ausiello et al., 1999, Def. 3.3].
An algorithm A_Π for a problem Π is a d-additive approximation algorithm (d-A) if there exists a constant d such that for all instances x of Π, the error between the value m(·) of an optimal solution optsol(x) and the output A_Π(x) satisfies |m(optsol(x)) − m(A_Π(x))| ≤ d.

A.4.3 Probabilistic approximation

Finally, consider three other types of probabilistic polynomial-time approximability (henceforth 3PA) that may be acceptable in situations where always getting the correct output for an input is not required: (1) algorithms that always run in polynomial time and produce the correct output for a given input in all but a small number of cases (Hemaspaandra & Williams, 2012); (2) algorithms that always run in polynomial time and produce the correct output for a given input with high probability (Motwani & Raghavan, 1995); and (3) algorithms that run in polynomial time with high probability but are always correct (Gill, 1977).

A.5 Model architecture

Definition 9 (Multi-Layer Perceptron). [Adapted from Barceló et al., 2020].
A multi-layer perceptron (MLP) is a neural network model M with L̂ layers, defined by a sequence of weight matrices (W_1, W_2, …, W_L̂), W_i ∈ Q^(d_{i−1} × d_i), a sequence of bias vectors (b_1, b_2, …, b_L̂), b_i ∈ Q^(d_i), and (element-wise) ReLU functions (f_1, f_2, …, f_{L̂−1}), f_i(x) := max(0, x). The final function is, without loss of generality, the binary step function f_L̂(x) := 1 if x ≥ 0, and 0 otherwise. The computation rules for M are given by h_i := f_i(h_{i−1} W_i + b_i), h_0 := x, where x is the input. The output of M on x is defined as M(x) := h_L̂. The graph G_M = (V, E) of M has a vertex for each component of each h_i. All vertices in layer i are connected by edges to all vertices of layer i + 1, with no intra-layer connections. Edges carry weights according to W_i, and vertices carry the components of b_i as biases. The size of M is defined as |M| := |V|.
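Definition 9 translates almost line-for-line into code. The following sketch (our own rendering, not code from the paper; the 2-layer AND example at the end is ours as well) implements the computation rules h_i := f_i(h_{i−1} W_i + b_i), with ReLU on hidden layers and the binary step on the output layer.

```python
# Minimal MLP forward pass following Definition 9 (a sketch, not the
# authors' code): ReLU hidden layers, binary step on the final layer.

def relu(v):
    return [max(0.0, x) for x in v]

def step(v):
    return [1 if x >= 0 else 0 for x in v]

def matvec(h, W):
    # h has length d_{i-1}; W is a d_{i-1} x d_i matrix (list of rows).
    d_out = len(W[0])
    return [sum(h[j] * W[j][k] for j in range(len(h))) for k in range(d_out)]

def mlp_forward(x, weights, biases):
    """Compute M(x) = h_L for weight matrices W_1..W_L and biases b_1..b_L."""
    h = list(x)
    L = len(weights)
    for i, (W, b) in enumerate(zip(weights, biases), start=1):
        z = [a + c for a, c in zip(matvec(h, W), b)]
        h = step(z) if i == L else relu(z)  # binary step only at the output
    return h

# A 2-layer example computing AND of two Boolean inputs: the hidden layer
# passes the inputs through; the output neuron fires iff x1 + x2 - 1.5 >= 0.
weights = [[[1.0, 0.0], [0.0, 1.0]], [[1.0], [1.0]]]
biases = [[0.0, 0.0], [-1.5]]
print(mlp_forward([1, 1], weights, biases))  # -> [1]
print(mlp_forward([1, 0], weights, biases))  # -> [0]
```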
A.6 Preliminary remarks

As is the case for other work in this area (which might be called "Applied Complexity Theory"; Bassan et al., 2024; Barceló et al., 2020), we are not aiming at developing new mathematical techniques but rather at deploying existing mathematical tools to answer important questions that connect to applications. Part of the technical challenge we take up is to formalize problems of practical interest in simple (and, if possible, elegant) ways that are readily understandable, and to prove their complexity properties efficiently (i.e., obtaining a one-to-many relation between proof constructions and meaningful results). This allows us to gain insights into the sources of complexity of problems, an investigation in which the difficulty/complexity/intricacy of proofs is a liability.

We use 'input queries' to refer to computational problems in explainability and 'circuit queries' for circuit discovery in inner interpretability. We make no claims as to whether one or the other query relates more to intuitive ideas of explanation or interpretation; we merely use these terms as familiar pointers to the literature.

All of our proofs for local problem variants assume a particular input vector I, be it the all-0 or all-1 vector. Note that we can simulate these vectors by having zero weights on the input lines and putting appropriate 0 and 1 biases on the input neurons (a technique developed and used in our later-derived proofs but readily applicable to earlier ones). This causes the input to be 'ignored', which renders our proofs correct under both integer and continuous inputs. Note this construction is in line with previous work (e.g., Bassan et al., 2024; Barceló et al., 2020). Importantly, this highlights that the characteristics of the input are not what matters; rather, there is a combinatorial 'heart' beating at the center of our circuit problems, namely, the selection of a subcircuit from an exponential number of subcircuits.
This combinatorial core can, if not tamed by appropriate restrictions on network and input structure, give rise to non-polynomial worst-case algorithmic complexity. Proofs for global problem variants often employ constructions similar to those for local variants, with minor to medium (though crucial) differences. For completeness, the full construction is stated again in each case, to minimize errors and the need to check proofs other than those being examined.

On the issue of real-world structure and formal complexity. One example scenario where real-world statistics might act as a mitigating force with respect to the computational hardness (of the general problems) is the case of high redundancy (related to our Circuit Robustness problem). Redundancy can in some sense make circuit finding easier (as solutions are more abundant), but the benefit comes at a cost for interpretability through the identifiability issues it introduces. As circuits supporting a particular behavior become more numerous (i.e., as there is more redundancy), it might get easier to find them with heuristics; but precisely because they are more numerous, they potentially represent competing explanations, which leads to the issue of identifiability. This redundancy would be important to diagnose and characterize, an issue that our Circuit Robustness problem touches on.
Appendix B Local and Global Sufficient Circuit

Minimum locally sufficient circuit (MLSC)
Input: A multi-layer perceptron M of depth cd_g with #n_tot,g neurons and maximum layer width cw_g, connection-value matrices W_1, W_2, …, W_cd_g, neuron bias vector B, a Boolean input vector I of length #n_in,g, and integers d, w, and #n such that 1 ≤ d ≤ cd_g, 1 ≤ w ≤ cw_g, and 1 ≤ #n ≤ #n_tot,g.
Question: Is there a subcircuit C of M of depth cd_r ≤ d with #n_tot,r ≤ #n neurons and maximum layer width cw_r ≤ w that produces the same output on input I as M?

Minimum globally sufficient circuit (MGSC)
Input: A multi-layer perceptron M of depth cd_g with #n_tot,g neurons and maximum layer width cw_g, connection-value matrices W_1, W_2, …, W_cd_g, neuron bias vector B, and integers d, w, and #n such that 1 ≤ d ≤ cd_g, 1 ≤ w ≤ cw_g, and 1 ≤ #n ≤ #n_tot,g.
Question: Is there a subcircuit C of M of depth cd_r ≤ d with #n_tot,r ≤ #n neurons and maximum layer width cw_r ≤ w that produces the same output as M on every possible Boolean input vector of length #n_in,g?
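The difference between the two queries is precisely the quantification over inputs: MLSC checks behavioral equivalence on one given I, while MGSC checks it on all 2^#n_in,g Boolean inputs. A minimal sketch of the two checks (our own illustration, with circuits abstracted as callables rather than explicit MLPs) makes the gap explicit.

```python
from itertools import product

def locally_sufficient(circuit, model, I):
    # MLSC-style check: same output on the single given input I.
    return circuit(I) == model(I)

def globally_sufficient(circuit, model, n_in):
    # MGSC-style check: same output on all 2^n_in Boolean inputs.
    return all(circuit(x) == model(x) for x in product([0, 1], repeat=n_in))

# Toy example: the model computes OR of 3 bits; the candidate circuit
# looks only at x[0], so it agrees on (1, 0, 0) but not everywhere.
model = lambda x: int(any(x))
circuit = lambda x: x[0]
print(locally_sufficient(circuit, model, (1, 0, 0)))  # True
print(globally_sufficient(circuit, model, 3))         # False, e.g. on (0, 1, 0)
```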
Given a subset x of the neurons in M, the subcircuit C of M based on x has the neurons in x and all connections in M among these neurons. Note that in order for the output of C to be equal to the output of M on input I, the numbers #n_in,g and #n_out,g of input and output neurons in M must exactly equal the numbers #n_in,r and #n_out,r of input and output neurons in C; hence, no input or output neurons can be deleted from M in creating C. Following Barceló et al. (2020), page 4, all neurons in M use the ReLU activation function, and the output x of each output neuron is stepped as necessary to be Boolean, i.e., step(x) = 0 if x ≤ 0 and 1 otherwise.

For a graph G = (V, E), we shall assume an ordering on the vertices and edges in V and E, respectively. For each vertex v ∈ V, let the complete neighbourhood N_C(v) of v be the set composed of v and all vertices in G that are adjacent to v by a single edge, i.e., {v} ∪ {u | u ∈ V and (u, v) ∈ E}. Finally, let VC_B be the version of VC in which each vertex in G has degree at most B.

We will prove various classical and parameterized results for MLSC and MGSC using reductions from Clique (Theorems 1 and 7). These reductions are summarized in Figure 2, and the parameterized results are proved relative to the parameters in Table 5. Additional reductions from VC and DS (Theorems 5 and 10) use specialized ReLU logic gates described in Barceló et al. (2020), Lemma 13. These gates assume Boolean neuron input and output values of 0 and 1 and are structured as follows:

1. NOT ReLU gate: A ReLU gate with one input connection weight of value −1 and a bias of 1. This gate has output 1 if the input is 0 and 0 otherwise.
2. n-way AND ReLU gate: A ReLU gate with n input connection weights of value 1 and a bias of −(n − 1). This gate has output 1 if all inputs have value 1 and 0 otherwise.

3. n-way OR ReLU gate: A combination of an n-way AND ReLU gate with NOT ReLU gates on all of its inputs and a NOT ReLU gate on its output, using DeMorgan's Second Law to implement (x_1 ∨ x_2 ∨ … ∨ x_n) as ¬(¬x_1 ∧ ¬x_2 ∧ … ∧ ¬x_n). This gate has output 1 if any input has value 1 and 0 otherwise.

Figure 2: This figure summarizes the reduction from Clique.

Table 5: Parameters for the minimum sufficient subcircuit and reason problems.

Parameter | Description | Problem
cd_g | # layers in given MLP | All
cw_g | max # neurons in a layer in given MLP | All
#n_tot,g | total # neurons in given MLP | All
#n_in,g | # input neurons in given MLP | All
#n_out,g | # output neurons in given MLP | All
B_max,g | max neuron bias in given MLP | All
W_max,g | max connection weight in given MLP | All
cd_r | # layers in requested subcircuit | ML,GSC
cw_r | max # neurons in a layer in requested subcircuit | ML,GSC
#n_tot,r | total # neurons in requested subcircuit | ML,GSC
#n_in,r | # input neurons in requested subcircuit | ML,GSC
#n_out,r | # output neurons in requested subcircuit | ML,GSC
B_max,r | max neuron bias in requested subcircuit | ML,GSC
W_max,r | max connection weight in requested subcircuit | ML,GSC
k | size of requested subset of input vector | MSR

B.1 Results for MLSC

Towards proving NP-completeness, we first prove
membership and then follow up with hardness. Membership of MLSC in NP can be proven via the definition of the polynomial hierarchy and the following alternating-quantifier formula: ∃[C ⊆ M] : C(x) = M(x).

Theorem 1. If MLSC is polynomial-time tractable then P = NP.

Proof. Consider the following reduction from Clique to MLSC. Given an instance ⟨G = (V, E), k⟩ of Clique, construct the following instance ⟨M, I, d, w, #n⟩ of MLSC: Let M be an MLP based on #n_tot,g = |V| + |E| + 2 neurons spread across four layers:

1. Input layer: The single input neuron n_in (bias 0).
2. Hidden vertex layer: The vertex neurons nv_1, nv_2, …, nv_|V| (all with bias 0).
3. Hidden edge layer: The edge neurons ne_1, ne_2, …, ne_|E| (all with bias −1).
4. Output layer: The single output neuron n_out (bias −(k(k−1)/2 − 1)).

The non-zero-weight connections between adjacent layers are as follows:

• The input neuron n_in is connected to each vertex neuron with weight 1.
• Each vertex neuron nv_i, 1 ≤ i ≤ |V|, is connected with weight 1 to each edge neuron whose corresponding edge has v_i as an endpoint.
• Each edge neuron ne_i, 1 ≤ i ≤ |E|, is connected to the output neuron n_out with weight 1.

All other connections between neurons in adjacent layers have weight 0. Finally, let I = (1), d = 4, w = k(k−1)/2, and #n = k(k−1)/2 + k + 2.
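As an illustration (ours, not the authors' code; the helper names are hypothetical), the construction can be simulated directly: the sketch below evaluates the reduction network on I = (1) for a small graph, both in full and restricted to candidate subcircuits, assuming the step convention step(x) = 1 iff x > 0 for the output neuron. A clique of size k yields a subcircuit reproducing the full network's output 1, while a non-clique subset does not.

```python
def relu(x):
    return max(0.0, x)

def run_reduction_net(V, E, k, kept_vertices=None, kept_edges=None):
    """Evaluate the Theorem-1 reduction MLP on I = (1), keeping only the
    listed vertex/edge neurons (None means keep all of them)."""
    if kept_vertices is None:
        kept_vertices = set(V)
    if kept_edges is None:
        kept_edges = set(E)
    # timestep 2: surviving vertex neurons (bias 0) receive the input's 1
    hv = {v: relu(1.0) for v in kept_vertices}
    # timestep 3: edge neurons (bias -1) fire only if both endpoints fired
    he = {e: relu(hv.get(e[0], 0.0) + hv.get(e[1], 0.0) - 1.0)
          for e in kept_edges}
    # timestep 4: output neuron with bias -(k(k-1)/2 - 1), then step
    out = sum(he.values()) - (k * (k - 1) // 2 - 1)
    return 1 if out > 0 else 0

V = [1, 2, 3, 4]
E = [(1, 2), (1, 3), (2, 3), (3, 4)]  # contains the triangle {1, 2, 3}
full = run_reduction_net(V, E, 3)
clique_sub = run_reduction_net(V, E, 3, {1, 2, 3}, {(1, 2), (1, 3), (2, 3)})
bad_sub = run_reduction_net(V, E, 3, {1, 2, 4}, {(1, 2)})
print(full, clique_sub, bad_sub)  # -> 1 1 0
```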
Observe that this instance of MLSC can be created in time polynomial in the size of the given instance of Clique. Moreover, the output behaviour of the neurons in M from the presentation of input I until the output is generated is as follows:

timestep | neurons (outputs)
0 | —
1 | n_in(1)
2 | nv_1(1), nv_2(1), …, nv_|V|(1)
3 | ne_1(1), ne_2(1), …, ne_|E|(1)
4 | n_out(|E| − (k(k−1)/2 − 1))

Note that it is the stepped output of n_out in timestep 4 that yields output 1. We now need to show the correctness of this reduction by proving that the answer for the given instance of Clique is "Yes" if and only if the answer for the constructed instance of MLSC is "Yes". We prove the two directions of this if and only if separately as follows:

⇒: Let V′ = {v′_1, v′_2, …, v′_k} ⊆ V be a clique in G of size k. Consider the subcircuit C based on neurons n_in, n_out, {nv′ | v′ ∈ V′}, and {ne′ | e′ = (x, y) and v_x, v_y ∈ V′}. Observe that in this subcircuit, cd_r = d = 4, cw_r = w = k(k−1)/2, and #n_tot,r = #n = k(k−1)/2 + k + 2.
The output behaviour of the neurons in C from the presentation of input I until the output is generated is as follows:

timestep | neurons (outputs)
0 | —
1 | n_in(1)
2 | nv′_1(1), nv′_2(1), …, nv′_|V′|(1)
3 | ne′_1(1), ne′_2(1), …, ne′_{k(k−1)/2}(1)
4 | n_out(1)

The output of n_out in timestep 4 is stepped to 1, which means that C is behaviorally equivalent to M on I.

⇐: Let C be a subcircuit of M that is behaviorally equivalent to M on input I and has #n_tot,r ≤ #n = k(k−1)/2 + k + 2 neurons. As neurons in all four layers of M must be present in C to produce the required output, cd_r = d = cd_g and both n_in and n_out are in C. In order for n_out to produce a non-zero output, there must be at least k(k−1)/2 edge neurons in C, each of which must be activated by the inclusion of the vertex neurons corresponding to both of its endpoint vertices. This requires the inclusion of at least k vertex neurons in C, as a set V′ of vertices in a graph can have at most |V′|(|V′|−1)/2 distinct edges between them (with this maximum occurring if all pairs of vertices in V′ have an edge between them).
As #n_tot,r ≤ k(k−1)/2 + k + 2, all of the above implies that there must be exactly k(k−1)/2 edge neurons and exactly k vertex neurons in C, and the vertices in G corresponding to these vertex neurons must form a clique of size k in G.

As Clique is NP-hard (Garey & Johnson, 1979), the reduction above establishes that MLSC is also NP-hard. The result follows from the definition of NP-hardness. ∎

Theorem 2. If ⟨cd_g, #n_in,g, #n_out,g, B_max,g, W_max,g, cd_r, cw_r, #n_in,r, #n_out,r, #n_tot,r, B_max,r, W_max,r⟩-MLSC is fixed-parameter tractable then FPT = W[1].

Proof. Observe that in the instance of MLSC constructed in the reduction in the proof of Theorem 1, cd_g = cd_r = 4; #n_in,g = #n_in,r = #n_out,g = #n_out,r = W_max,g = W_max,r = 1; and B_max,g, B_max,r, #n_tot,r, and cw_r are all functions of k in the given instance of Clique. The result then follows from the fact that ⟨k⟩-Clique is W[1]-hard (Downey & Fellows, 1999). ∎

Theorem 3. ⟨#n_tot,g⟩-MLSC is fixed-parameter tractable.

Proof.
Consider the algorithm that generates each possible subcircuit of $M$ and checks if that subcircuit is behaviorally equivalent to $M$ on input $I$. If such a subcircuit is found, return "Yes"; otherwise, return "No". As each such subcircuit can be run on $I$ in time polynomial in the size of the given instance of MLSC and the total number of subcircuits that need to be checked is at most $2^{\#n_{tot,g}}$, the above is a fixed-parameter tractable algorithm for MLSC relative to parameter-set $\{\#n_{tot,g}\}$. ∎

Theorem 4. $\langle cd_g, cw_g \rangle$-MLSC is fixed-parameter tractable.

Proof. Follows from the observation that $\#n_{tot,g} \leq cd_g \times cw_g$ and the algorithm in the proof of Theorem 3. ∎

Though we have already proved the polynomial-time intractability of MLSC in Theorem 1, the reduction in the proof of the following theorem will be useful in proving a certain type of polynomial-time inapproximability for MLSC (see Figure 3).

Theorem 5. If MLSC is polynomial-time tractable then $P = NP$.

Proof. Consider the following reduction from VC to MLSC. Given an instance $\langle G = (V,E), k \rangle$ of VC, construct the following instance $\langle M, I, d, w, \#n \rangle$ of MLSC: Let $M$ be an MLP based on $\#n_{tot,g} = |V| + 2|E| + 2$ neurons spread across five layers:

1. Input layer: The single input neuron $n_{in}$ (bias 0).
2. Hidden vertex layer: The vertex NOT neurons $nvN_1, nvN_2, \ldots, nvN_{|V|}$, all of which are NOT ReLU gates.
3. Hidden edge layer I: The edge AND neurons $neA_1, neA_2, \ldots, neA_{|E|}$, all of which are 2-way AND ReLU gates.
4. Hidden edge layer II: The edge NOT neurons $neN_1, neN_2, \ldots, neN_{|E|}$, all of which are NOT ReLU gates.
5. Output layer: The single output neuron $n_{out}$, which is an $|E|$-way AND ReLU gate.

The non-zero weight connections between adjacent layers are as follows:

• The input neuron $n_{in}$ is connected to each vertex NOT neuron with weight 1.
• Each vertex NOT neuron $nvN_i$, $1 \leq i \leq |V|$, is connected with weight 1 to each edge AND neuron whose corresponding edge has $v_i$ as an endpoint.
• Each edge AND neuron $neA_i$, $1 \leq i \leq |E|$, is connected to its corresponding edge NOT neuron $neN_i$ with weight 1.
• Each edge NOT neuron $neN_i$, $1 \leq i \leq |E|$, is connected to the output neuron $n_{out}$ with weight 1.

All other connections between neurons in adjacent layers have weight 0. Finally, let $I = (1)$, $d = 5$, $w = |E|$, and $\#n = 2|E| + k + 2$.
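To make the construction concrete, here is a small Boolean-level sketch of the MLP $M$ built from a VC instance. The function names are ours, and the NOT/AND ReLU gates are modelled directly by their Boolean truth tables rather than by explicit weights and biases, so this is an illustration of the construction's logic, not a literal implementation.

```python
def run_m(V, E, x, cover=None):
    """Boolean-level sketch of the MLP M from the VC -> MLSC
    construction, on one-bit input x. If `cover` is given, only
    those vertex NOT neurons are kept (a candidate subcircuit);
    n_in, n_out, and all edge neurons are retained, and an absent
    vertex neuron contributes 0 to its edge AND gates."""
    kept = set(V) if cover is None else set(cover)
    v_not = {v: 1 - x for v in kept}              # vertex NOT gates
    e_and = [v_not.get(u, 0) * v_not.get(w, 0)    # 2-way AND gates
             for (u, w) in E]
    e_not = [1 - a for a in e_and]                # edge NOT gates
    return int(all(e_not))                        # |E|-way AND output

def budget(E, k):
    """Size budget of the constructed MLSC instance: #n = 2|E| + k + 2."""
    return 2 * len(E) + k + 2
```

On input $I = (1)$ every vertex NOT neuron outputs 0, so $M$ outputs 1, and any subcircuit keeping a subset of vertex neurons reproduces this behaviour, matching the tables in the proof.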
Observe that this instance of MLSC can be created in time polynomial in the size of the given instance of VC. Moreover, the output behaviour of the neurons in $M$ from the presentation of input $I$ until the output is generated is as follows:

timestep   neurons (outputs)
0          —
1          $n_{in}(1)$
2          $nvN_1(0), nvN_2(0), \ldots, nvN_{|V|}(0)$
3          $neA_1(0), neA_2(0), \ldots, neA_{|E|}(0)$
4          $neN_1(1), neN_2(1), \ldots, neN_{|E|}(1)$
5          $n_{out}(1)$

We now need to show the correctness of this reduction by proving that the answer for the given instance of VC is "Yes" if and only if the answer for the constructed instance of MLSC is "Yes". We prove the two directions of this if and only if separately as follows:

$\Rightarrow$: Let $V' = \{v'_1, v'_2, \ldots, v'_k\} \subseteq V$ be a vertex cover in $G$ of size $k$. Consider the subcircuit $C$ based on neurons $n_{in}$, $n_{out}$, $\{nv'N \mid v' \in V'\}$, $\{neA_1, neA_2, \ldots, neA_{|E|}\}$, and $\{neN_1, neN_2, \ldots, neN_{|E|}\}$. Observe that in this subcircuit, $cd_r = d = 5$, $cw_r = w = |E|$, and $\#n_{tot,r} = \#n = 2|E| + k + 2$.
The output behaviour of the neurons in $C$ from the presentation of input $I$ until the output is generated is as follows:

timestep   neurons (outputs)
0          —
1          $n_{in}(1)$
2          $nvN'_1(0), nvN'_2(0), \ldots, nvN'_{|V'|}(0)$
3          $neA_1(0), neA_2(0), \ldots, neA_{|E|}(0)$
4          $neN_1(1), neN_2(1), \ldots, neN_{|E|}(1)$
5          $n_{out}(1)$

This means that $C$ is behaviorally equivalent to $M$ on $I$.

$\Leftarrow$: Let $C$ be a subcircuit of $M$ that is behaviorally equivalent to $M$ on input $I$ and has $\#n_{tot,r} \leq \#n = 2|E| + k + 2$ neurons. As neurons in all five layers in $M$ must be present in $C$ to produce the required output, $cd_r = cd_g$ and both $n_{in}$ and $n_{out}$ are in $C$. In order for $n_{out}$ to produce a non-zero output, there must be at least $|E|$ edge NOT neurons and $|E|$ edge AND neurons in $C$, and each of the latter must be connected to at least one of the vertex NOT neurons corresponding to their endpoint vertices. As $\#n_{tot,r} \leq 2|E| + k + 2$, there must be exactly $|E|$ edge NOT neurons, $|E|$ edge AND neurons, and $k$ vertex NOT neurons in $C$, and the vertices in $G$ corresponding to these vertex NOT neurons must form a vertex cover of size $k$ in $G$. As VC is $NP$-hard (Garey & Johnson, 1979), the reduction above establishes that MLSC is also $NP$-hard. The result follows from the definition of $NP$-hardness.
∎

Figure 3: This figure summarizes the reduction from Vertex Cover.

We now define our two notions of polynomial-time approximation. For a minimization problem $\Pi$, let $OPT_\Pi(I)$ be an optimal solution for $\Pi$ on instance $I$, $A_\Pi(I)$ be a solution for $\Pi$ returned by an algorithm $A$, and $m(OPT_\Pi(I))$ and $m(A_\Pi(I))$ be the values of these solutions. Consider the following alternative to approximation algorithms that give solutions that are within an additive factor of optimal.

Definition 10. (Ausiello et al., 1999, Definition 3.5) Given a minimization problem $\Pi$, an algorithm $A$ is a (multiplicative) $\epsilon$-approximation algorithm for $\Pi$ if for each instance $I$ of $\Pi$, $m(A_\Pi(I)) - m(OPT_\Pi(I)) \leq \epsilon \times m(OPT_\Pi(I))$.

It would be ideal if one could obtain approximate solutions for a problem $\Pi$ that are arbitrarily close to optimal if one is willing to allow extra algorithm runtime. This is encoded in the following entity.

Definition 11. (Adapted from Definition 3.10 in Ausiello et al. 1999) Given a minimization problem $\Pi$, a polynomial-time approximation scheme (PTAS) for $\Pi$ is a set $\mathcal{A}$ of algorithms such that for each integer $k > 0$, there is a $\frac{1}{k}$-approximation algorithm $A^k_\Pi \in \mathcal{A}$ that runs in time polynomial in $|I|$.

The question of whether or not a problem has a PTAS can be answered using the following type of approximation-preserving reducibility.

Definition 12.
(Papadimitriou & Yannakakis, 1991, page 427) Given two minimization problems $\Pi$ and $\Pi'$, $\Pi$ L-reduces to $\Pi'$, i.e., $\Pi \leq_L \Pi'$, if there are polynomial-time algorithms $f$ and $g$ and constants $\alpha, \beta > 0$ such that for each instance $I$ of $\Pi$:

(L1) Algorithm $f$ produces an instance $I'$ of $\Pi'$ such that $m(OPT_{\Pi'}(I')) \leq \alpha \times m(OPT_\Pi(I))$; and
(L2) For any solution for $I'$ with value $v'$, algorithm $g$ produces a solution for $I$ of value $v$ such that $v - m(OPT_\Pi(I)) \leq \beta \times (v' - m(OPT_{\Pi'}(I')))$.

Lemma 3. (Arora et al., 1998, Theorem 1.2.2) If an optimization problem that is MAX SNP-hard under L-reductions has a PTAS then $P = NP$.

Theorem 6. If MLSC has a PTAS then $P = NP$.

Proof. We prove that the reduction from VC to MLSC in the proof of Theorem 5 is also an L-reduction from $VC_B$ to MLSC as follows:

• Observe that $m(OPT_{VC_B}(I)) \geq |E|/B$ (the best case, in which $G$ is a collection of $B$-star subgraphs such that each edge is uniquely covered by the central vertex of its associated star) and $m(OPT_{MLSC}(I')) \leq 2|E| + 2|E| + 2 = 4|E| + 2$ (the worst case, in which the vertex neurons corresponding to the two endpoints of every edge in $G$ are selected).
This gives us

$m(OPT_{MLSC}(I')) \leq 4B \cdot m(OPT_{VC_B}(I)) + 2 \leq 4B \cdot m(OPT_{VC_B}(I)) + 2B \cdot m(OPT_{VC_B}(I)) \leq 6B \cdot m(OPT_{VC_B}(I))$

which satisfies condition L1 with $\alpha = 6B$.

• Observe that any solution $S'$ for the constructed instance $I'$ of MLSC of value $k + 2|E| + 2$ implies a solution $S$ for the given instance $I$ of $VC_B$ of size $k$, i.e., the vertices in $V$ corresponding to the selected vertex neurons in $S'$. Hence, it is the case that $m(S) - m(OPT_{VC_B}(I)) = m(S') - m(OPT_{MLSC}(I'))$, which satisfies condition L2 with $\beta = 1$.

As $VC_B$ is MAX SNP-hard under L-reductions (Papadimitriou & Yannakakis, 1991, Theorem 2(d)), the L-reduction above proves that MLSC is also MAX SNP-hard under L-reductions. The result follows from Lemma 3. ∎

B.2 Results for MGSC

Theorem 7. If MGSC is polynomial-time tractable then $P = NP$.

Proof. Consider the following reduction from Clique to MGSC. Given an instance $\langle G = (V,E), k \rangle$ of Clique, construct an instance $\langle M, d, w, \#n \rangle$ of MGSC as in the reduction in the proof of Theorem 1, omitting input vector $I$. Observe that this instance of MGSC can be created in time polynomial in the size of the given instance of Clique.
As $\#n_{in,g} = 1$, there are only two possible Boolean input vectors, $(0)$ and $(1)$. Given input vector $(1)$, as MLP $M$ in this reduction is the same as $M$ in the proof of Theorem 1, the output in timestep 4 is once again 1; moreover, given input vector $(0)$, no vertex or edge neurons can have output 1 and hence the output in timestep 4 is 0. We now need to show the correctness of this reduction by proving that the answer for the given instance of Clique is "Yes" if and only if the answer for the constructed instance of MGSC is "Yes". We prove the two directions of this if and only if separately as follows:

$\Rightarrow$: Let $V' = \{v'_1, v'_2, \ldots, v'_k\} \subseteq V$ be a clique in $G$ of size $k$. Consider the subcircuit $C$ based on neurons $n_{in}$, $n_{out}$, $\{nv' \mid v' \in V'\}$, and $\{ne' \mid e' = (x,y) \text{ and } v_x, v_y \in V'\}$. Observe that in this subcircuit, $cd_r = 4$, $cw_r = k(k-1)/2$, and $\#n_{tot,r} = k(k-1)/2 + k + 2$.
Given input $(1)$, the output behaviour of the neurons in $C$ from the presentation of input until the output is generated is as follows:

timestep   neurons (outputs)
0          —
1          $n_{in}(1)$
2          $nv'_1(1), nv'_2(1), \ldots, nv'_{|V'|}(1)$
3          $ne'_1(1), ne'_2(1), \ldots, ne'_{k(k-1)/2}(1)$
4          $n_{out}(1)$

Moreover, given input $(0)$, no vertex or edge neurons in $C$ can have output 1 and the output of $C$ at timestep 4 is 0. This means that $C$ is behaviorally equivalent to $M$ on all possible Boolean input vectors.

$\Leftarrow$: Let $C$ be a subcircuit of $M$ that is behaviorally equivalent to $M$ on all possible Boolean input vectors and has $\#n_{tot,r} \leq \#n = k(k-1)/2 + k + 2$ neurons. Consider the case of input vector $(1)$. This vector must cause $C$ to generate output 1 at timestep 4 as $C$ is behaviorally equivalent to $M$ on all Boolean input vectors. As neurons in all four layers in $M$ must be present in $C$ to produce the required output, $cd_r = cd_g$ and both $n_{in}$ and $n_{out}$ are in $C$. In order for $n_{out}$ to produce a non-zero output, there must be at least $k(k-1)/2$ edge neurons in $C$, each of which must be activated by the inclusion of the vertex neurons corresponding to both of its endpoint vertices.
This requires the inclusion of at least $k$ vertex neurons in $C$, as a set $V'$ of vertices in a graph can have at most $|V'|(|V'|-1)/2$ distinct edges between them (with this maximum occurring when every pair of vertices in $V'$ has an edge between them). As $\#n_{tot,r} \leq k(k-1)/2 + k + 2$, all of the above implies that there must be exactly $k(k-1)/2$ edge neurons and exactly $k$ vertex neurons in $C$, and the vertices in $G$ corresponding to these vertex neurons must form a clique of size $k$ in $G$. As Clique is $NP$-hard (Garey & Johnson, 1979), the reduction above establishes that MGSC is also $NP$-hard. The result follows from the definition of $NP$-hardness. ∎

Theorem 8. If $\langle cd_g, \#n_{in,g}, \#n_{out,g}, B_{\max,g}, W_{\max,g}, cd_r, cw_r, \#n_{in,r}, \#n_{out,r}, \#n_{tot,r}, B_{\max,r}, W_{\max,r} \rangle$-MGSC is fixed-parameter tractable then $FPT = W[1]$.

Proof. Observe that in the instance of MGSC constructed in the reduction in the proof of Theorem 7, $cd_g = cd_r = 4$, $\#n_{in,g} = \#n_{in,r} = \#n_{out,g} = \#n_{out,r} = W_{\max,g} = W_{\max,r} = 1$, and $B_{\max,g}$, $B_{\max,r}$, $\#n_{tot,r}$, and $cw_r$ are all functions of $k$ in the given instance of Clique.
The result then follows from the fact that $\langle k \rangle$-Clique is $W[1]$-hard (Downey & Fellows, 1999). ∎

Theorem 9. $\langle \#n_{tot,g} \rangle$-MGSC is fixed-parameter tractable.

Proof. Consider the algorithm that generates each possible subcircuit of $M$ and checks if that subcircuit is behaviorally equivalent to $M$ on all possible Boolean input vectors of length $\#n_{in,g}$. If such a subcircuit is found, return "Yes"; otherwise, return "No". There are $2^{\#n_{in,g}}$ possible Boolean input vectors and the total number of subcircuits that need to be checked is at most $2^{\#n_{tot,g}}$. As $\#n_{in,g} \leq \#n_{tot,g}$ and each such subcircuit can be run on an input vector in time polynomial in the size of the given instance of MGSC, the above is a fixed-parameter tractable algorithm for MGSC relative to parameter-set $\{\#n_{tot,g}\}$. ∎

Theorem 10. $\langle cd_g, cw_g \rangle$-MGSC is fixed-parameter tractable.

Proof. Follows from the observation that $\#n_{tot,g} \leq cd_g \times cw_g$ and the algorithm in the proof of Theorem 9. ∎

Let us now consider the PTAS-approximability of MGSC. A first thought would be to reuse the reduction in the proof of Theorem 5, which would work if the given MLP $M$ and the VC subcircuit were behaviorally equivalent under both possible input vectors, $(1)$ and $(0)$. We already know the former is true.
With respect to the latter, observe that the output behaviour of the neurons in $M$ from the presentation of input $(0)$ until the output is generated is as follows:

timestep   neurons (outputs)
0          —
1          $n_{in}(0)$
2          $nvN_1(1), nvN_2(1), \ldots, nvN_{|V|}(1)$
3          $neA_1(1), neA_2(1), \ldots, neA_{|E|}(1)$
4          $neN_1(0), neN_2(0), \ldots, neN_{|E|}(0)$
5          $n_{out}(0)$

However, in the VC subcircuit, we are no longer guaranteed that both endpoint vertex NOT neurons for any edge AND neuron (let alone the endpoint vertex NOT neurons for all edge AND neurons) will be in the vertex cover encoded in the subcircuit. This means that all edge AND neurons could potentially output 0, which would cause the subcircuit to output 1 at timestep 5. This problem can be fixed if we can modify the given VC graph $G$ to create a graph $G'$ such that

1. we can guarantee that both endpoint vertex NOT neurons for at least one edge AND neuron are present in a VC subcircuit $C$ constructed for $G'$ (which would make at least one edge AND neuron output 1 and cause $C$ to output 0 at timestep 5); and
2. we can easily extract a vertex cover of size at most $k$ for $G$ from any vertex cover of a particular size for $G'$.

Figure 4: The c-way Bowtie Graph $B_c$.

To do this, we shall use the c-way bowtie graph $B_c$.
For $c > 0$, $B_c$ consists of a central edge $e_B$ between vertices $v_{B1}$ and $v_{B2}$ such that $c$ edges radiate outwards from $v_{B1}$ and $v_{B2}$ to the $c$-sized vertex-sets $V_{B1} = \{v_{B1,1}, v_{B1,2}, \ldots, v_{B1,c}\}$ and $V_{B2} = \{v_{B2,1}, v_{B2,2}, \ldots, v_{B2,c}\}$, respectively (see Figure 4). Note that such a graph has $2c+2$ vertices and $2c+1$ edges. Given a graph $G = (V,E)$ with no isolated vertices, we have $|V| \leq 2|E|$ (with this maximum attained by a graph consisting of $|E|$ endpoint-disjoint edges); let $Bow(G) = B_{4|E|} \cup G$. This graph has the following useful property.

Lemma 4. Given a graph $G = (V,E)$ and a positive integer $k \leq |V|$, if $Bow(G)$ has a vertex cover $V'$ of size at most $k+2$ then (1) $v_{B1}, v_{B2} \in V'$ and (2) $G$ has a vertex cover of size at most $k$.

Proof. Let us prove the two consequent clauses as follows:

1. Any vertex cover $V'$ of $Bow(G)$ must cover all the edges in both $B_{4|E|}$ and $G$. Suppose $v_{B1}, v_{B2} \notin V'$. In order to cover the edges in $B_{4|E|}$, all $8|E|$ vertices in $V_{B1} \cup V_{B2}$ must be in $V'$.
This is, however, impossible as $|V'| \leq k+2 \leq |V|+2 \leq 2|E|+2 \leq 4|E| < 8|E|$. Similarly, suppose only one of $v_{B1}$ and $v_{B2}$ is in $V'$; let us assume it is $v_{B1}$. In that case, all vertices in $V_{B2}$ must be in $V'$. However, this too is impossible as $|V' - \{v_{B1}\}| \leq k+1 \leq |V|+1 \leq 2|E|+1 \leq 3|E| < 4|E| = |V_{B2}|$. Hence, both $v_{B1}$ and $v_{B2}$ must be in $V'$.

2. Given (1), $k' \leq k$ vertices remain in $V'$ to cover $G$. All $k'$ of these vertices need not be in $G$, e.g., some may be scattered over $V_{B1}$ and $V_{B2}$. That being said, it is still the case that $G$ must have a vertex cover of size at most $k$.

This concludes the proof. ∎

Theorem 11. If MGSC is polynomial-time tractable then $P = NP$.

Proof. Consider the following reduction from VC to MGSC. Given an instance $\langle G = (V,E), k \rangle$ of VC, construct the following instance $\langle M, d, w, \#n \rangle$ of MGSC based on $G' = (V', E') = Bow(G)$: Let $M$ be an MLP based on $\#n_{tot,g} = |V'| + 2|E'| + 2$ neurons spread across five layers:

1. Input layer: The single input neuron $n_{in}$ (bias 0).
2. Hidden vertex layer: The vertex NOT neurons $nvN_1, nvN_2, \ldots, nvN_{|V'|}$, all of which are NOT ReLU gates.
3. Hidden edge layer I: The edge AND neurons $neA_1, neA_2, \ldots, neA_{|E'|}$, all of which are 2-way AND ReLU gates.
4. Hidden edge layer II: The edge NOT neurons $neN_1, neN_2, \ldots, neN_{|E'|}$, all of which are NOT ReLU gates.
5. Output layer: The single output neuron $n_{out}$, which is an $|E'|$-way AND ReLU gate.

The non-zero weight connections between adjacent layers are as follows:

• The input neuron $n_{in}$ is connected to each vertex NOT neuron with weight 1.
• Each vertex NOT neuron $nvN_i$, $1 \leq i \leq |V'|$, is connected with weight 1 to each edge AND neuron whose corresponding edge has $v'_i$ as an endpoint.
• Each edge AND neuron $neA_i$, $1 \leq i \leq |E'|$, is connected to its corresponding edge NOT neuron $neN_i$ with weight 1.
• Each edge NOT neuron $neN_i$, $1 \leq i \leq |E'|$, is connected to the output neuron $n_{out}$ with weight 1.

All other connections between neurons in adjacent layers have weight 0. Finally, let $d = 5$, $w = |E'|$, and $\#n = 2|E'| + (k+2) + 2 = 2|E'| + k + 4$.
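The bowtie gadget is straightforward to build explicitly. The sketch below uses our own helper names; the gadget's vertex labels are fresh strings, so its union with $G$ is assumed to be disjoint. It also confirms the edge count used later in Theorem 12, namely $|E'| = 9|E| + 1$.

```python
def bowtie(c):
    """The c-way bowtie graph B_c: a central edge (b1, b2) with c
    pendant edges radiating from each of its two endpoints.
    Has 2c + 2 vertices and 2c + 1 edges."""
    b1, b2 = "b1", "b2"
    edges = [(b1, b2)]
    edges += [(b1, f"b1_{i}") for i in range(c)]
    edges += [(b2, f"b2_{i}") for i in range(c)]
    vertices = {v for e in edges for v in e}
    return vertices, edges

def bow(V, E):
    """Bow(G) = B_{4|E|} ∪ G (disjoint union of vertex/edge sets),
    assuming G's vertex labels are disjoint from the gadget's."""
    bv, be = bowtie(4 * len(E))
    return bv | set(V), be + list(E)
```

Since $B_{4|E|}$ contributes $8|E| + 1$ edges, $Bow(G)$ has $(8|E| + 1) + |E| = 9|E| + 1$ edges in total.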
Observe that this instance of MGSC can be created in time polynomial in the size of the given instance of VC. Moreover, the output behaviour of the neurons in $M$ from the presentation of input $(1)$ until the output is generated is as follows:

timestep   neurons (outputs)
0          —
1          $n_{in}(1)$
2          $nvN_1(0), nvN_2(0), \ldots, nvN_{|V'|}(0)$
3          $neA_1(0), neA_2(0), \ldots, neA_{|E'|}(0)$
4          $neN_1(1), neN_2(1), \ldots, neN_{|E'|}(1)$
5          $n_{out}(1)$

and the output behaviour of the neurons in $M$ from the presentation of input $(0)$ until the output is generated is as follows:

timestep   neurons (outputs)
0          —
1          $n_{in}(0)$
2          $nvN_1(1), nvN_2(1), \ldots, nvN_{|V'|}(1)$
3          $neA_1(1), neA_2(1), \ldots, neA_{|E'|}(1)$
4          $neN_1(0), neN_2(0), \ldots, neN_{|E'|}(0)$
5          $n_{out}(0)$

We now need to show the correctness of this reduction by proving that the answer for the given instance of VC is "Yes" if and only if the answer for the constructed instance of MGSC is "Yes".
We prove the two directions of this if and only if separately as follows:

$\Rightarrow$: Let $V'' = \{v''_1, v''_2, \ldots, v''_k\} \subseteq V$ be a vertex cover in $G$ of size $k$. Consider the subcircuit $C$ based on neurons $n_{in}$, $n_{out}$, $\{nv''N \mid v'' \in V''\} \cup \{nv_{B1}N, nv_{B2}N\}$, $\{neA_1, neA_2, \ldots, neA_{|E'|}\}$, and $\{neN_1, neN_2, \ldots, neN_{|E'|}\}$. Observe that in this subcircuit, $cd_r = d = 5$, $cw_r = w = |E'|$, and $\#n_{tot,r} = \#n = 2|E'| + k + 4$.
The output behaviour of the neurons in $C$ from the presentation of input $(1)$ until the output is generated is as follows:

timestep   neurons (outputs)
0          —
1          $n_{in}(1)$
2          All vertex NOT neurons (0)
3          $neA_1(0), neA_2(0), \ldots, neA_{|E'|}(0)$
4          $neN_1(1), neN_2(1), \ldots, neN_{|E'|}(1)$
5          $n_{out}(1)$

Moreover, the output behaviour of the neurons in $C$ from the presentation of input $(0)$ until the output is generated is as follows:

timestep   neurons (outputs)
0          —
1          $n_{in}(0)$
2          All vertex NOT neurons (1)
3          At least one edge AND neuron has output 1, e.g., the one for the central bowtie edge $e_B$
4          At least one edge NOT neuron has output 0, e.g., the one for the central bowtie edge $e_B$
5          $n_{out}(0)$

This means that $C$ is behaviorally equivalent to $M$ on all possible Boolean input vectors.

$\Leftarrow$: Let $C$ be a subcircuit of $M$ that is behaviorally equivalent to $M$ on all possible Boolean input vectors and has $\#n_{tot,r} \leq \#n = 2|E'| + k + 4$ neurons. As neurons in all five layers in $M$ must be present in $C$ to produce the required output, $cd_r = cd_g$ and both $n_{in}$ and $n_{out}$ are in $C$. In order for $n_{out}$ to produce a non-zero output, there must be at least $|E'|$ edge NOT neurons and $|E'|$ edge AND neurons in $C$, and each of the latter must be connected to at least one of the vertex NOT neurons corresponding to their endpoint vertices.
As $\#n_{tot,r} \leq 2|E'| + k + 4$, there must be exactly $|E'|$ edge NOT neurons, $|E'|$ edge AND neurons, and $k+2$ vertex NOT neurons in $C$, and the vertices in $G'$ corresponding to these vertex NOT neurons must form a vertex cover $V''$ of size $k+2$ in $G'$. However, as $G' = Bow(G)$, Lemma 4 implies not only that $v_{B1}, v_{B2} \in V''$ (and hence that $nv_{B1}N$ and $nv_{B2}N$ are in $C$) but also that $G$ has a vertex cover of size at most $k$. As VC is $NP$-hard (Garey & Johnson, 1979), the reduction above establishes that MGSC is also $NP$-hard. The result follows from the definition of $NP$-hardness. ∎

Theorem 12. If MGSC has a PTAS then $P = NP$.

Proof. We prove that the reduction from VC to MGSC in the proof of Theorem 11 is also an L-reduction from $VC_B$ to MGSC as follows:

• Observe that $m(OPT_{VC_B}(I)) \geq |E|/B$ (the best case, in which $G$ is a collection of $B$-star subgraphs such that each edge is uniquely covered by the central vertex of its associated star) and $m(OPT_{MGSC}(I')) \leq 2|E'| + 2|E'| + 2 = 4|E'| + 2$ (the worst case, in which the vertex neurons corresponding to the two endpoints of every edge in $G'$ are selected).
As |E′| = 9|E| + 1, this gives us

m(OPT_MGSC(I′)) ≤ 36|E| + 4 + 2
≤ 36B · m(OPT_VC_B(I)) + 6
≤ 36B · m(OPT_VC_B(I)) + 6B · m(OPT_VC_B(I))
≤ 42B · m(OPT_VC_B(I)),

which satisfies condition L1 with α = 42B.

- Observe that any solution S′ for the constructed instance I′ of MGSC of value k + 2|E′| + 4 implies a solution S for the given instance I of VC_B of size k. Hence, it is the case that m(S) − m(OPT_VC_B(I)) = m(S′) − m(OPT_MGSC(I′)), which satisfies condition L2 with β = 1.

As VC_B is MAX SNP-hard under L-reductions (Papadimitriou & Yannakakis, 1991, Theorem 2(d)), the L-reduction above proves that MGSC is also MAX SNP-hard under L-reductions. The result follows from Lemma 3. ∎

Appendix C: Sufficient Circuit Search and Counting Problems

Definition 13. An entity x with property P is minimal if there is no non-empty subset of elements in x that can be deleted to create an entity x′ with property P. We shall assume here that all subcircuits are non-trivial, i.e., each subcircuit is of size < |M|.
Consider the following search problem templates:

Name local sufficient circuit (AccLSC)
Input: A multi-layer perceptron M of depth cd_g with #n_tot,g neurons and maximum layer width cw_g, connection-value matrices W_1, W_2, …, W_cd_g, neuron bias vector B, a Boolean input vector I of length #n_in,g, and integers d and w such that 1 ≤ d ≤ cd_g and 1 ≤ w ≤ cw_g PrmAdd.
Output: A CType subcircuit C of M of depth cd_r ≤ d with maximum layer width cw_r ≤ w such that C(I) = M(I), if such a subcircuit exists, and special symbol ⊥ otherwise.

Name global sufficient circuit (AccGSC)
Input: A multi-layer perceptron M of depth cd_g with #n_tot,g neurons and maximum layer width cw_g, connection-value matrices W_1, W_2, …, W_cd_g, neuron bias vector B, and integers d and w such that 1 ≤ d ≤ cd_g and 1 ≤ w ≤ cw_g PrmAdd.
Output: A CType subcircuit C of M of depth cd_r ≤ d with maximum layer width cw_r ≤ w such that C(I) = M(I) for every possible Boolean input vector I of length #n_in,g, if such a subcircuit exists, and special symbol ⊥ otherwise.
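For intuition about what these queries ask, the global sufficient circuit query with a size bound (the bounded-size variant defined below) can be brute-forced on toy MLPs. The sketch below is our own illustration, not an algorithm from the paper: the encoding (per-layer weight matrices and bias vectors, ReLU units, a stepped Boolean output layer, ablation by forcing a neuron's activation to 0) and all names are assumptions. Its running time is exponential in both the number of hidden neurons and the number of inputs, in line with the hardness results that follow.

```python
from itertools import combinations, product

def run_mlp(Ws, bs, x, keep):
    """Forward pass that keeps only neurons whose (layer, index) is in
    `keep`; a removed neuron's activation is forced to 0, which is
    equivalent to deleting it and its connections. The last layer is
    stepped to be Boolean."""
    a = list(map(float, x))
    for l, (W, b) in enumerate(zip(Ws, bs)):
        a = [max(0.0, sum(w * v for w, v in zip(W[j], a)) + b[j])
             if (l, j) in keep else 0.0
             for j in range(len(b))]
    return tuple(1 if v > 0 else 0 for v in a)

def bounded_gsc(Ws, bs, n_in, k):
    """Smallest-first search for a set of at most k hidden neurons whose
    induced subcircuit matches the full MLP on all 2**n_in Boolean
    inputs; input/output neurons are always kept. Returns None if no
    such subcircuit exists."""
    hidden = [(l, j) for l in range(len(Ws) - 1) for j in range(len(bs[l]))]
    outs = {(len(Ws) - 1, j) for j in range(len(bs[-1]))}
    xs = list(product([0, 1], repeat=n_in))
    full = {x: run_mlp(Ws, bs, x, set(hidden) | outs) for x in xs}
    for size in range(k + 1):
        for sub in combinations(hidden, size):
            keep = set(sub) | outs
            if all(run_mlp(Ws, bs, x, keep) == full[x] for x in xs):
                return keep
    return None
```

For example, on an MLP computing OR of two inputs through three hidden ReLU units h1 = ReLU(x1), h2 = ReLU(x2), h3 = ReLU(x1 + x2), the single redundant unit h3 already yields a global sufficient circuit, so the query succeeds with k = 1 but fails with k = 0.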
Name local necessary circuit (AccLNC)
Input: A multi-layer perceptron M of depth cd with #n_tot neurons and maximum layer width cw, connection-value matrices W_1, W_2, …, W_cd, neuron bias vector B, and a Boolean input vector I of length #n_in PrmAdd.
Output: A CType subcircuit C of M such that C ∩ C′ ≠ ∅ for every sufficient circuit C′ of M relative to I, if such a subcircuit exists, and special symbol ⊥ otherwise.

Name global necessary circuit (AccGNC)
Input: A multi-layer perceptron M of depth cd with #n_tot neurons and maximum layer width cw, connection-value matrices W_1, W_2, …, W_cd, and neuron bias vector B PrmAdd.
Output: A CType subcircuit C of M such that, for every possible Boolean input vector I of length #n_in, C ∩ C′ ≠ ∅ for every sufficient circuit C′ of M relative to I, if such a subcircuit exists, and special symbol ⊥ otherwise.

Name local circuit ablation (AccLCA)
Input: A multi-layer perceptron M of depth cd with #n_tot neurons and maximum layer width cw, connection-value matrices W_1, W_2, …, W_cd, neuron bias vector B, and a Boolean input vector I of length #n_in PrmAdd.
Output: A CType subcircuit C of M such that (M/C)(I) ≠ M(I), if such a subcircuit exists, and special symbol ⊥ otherwise.
Name global circuit ablation (AccGCA)
Input: A multi-layer perceptron M of depth cd with #n_tot neurons and maximum layer width cw, connection-value matrices W_1, W_2, …, W_cd, and neuron bias vector B PrmAdd.
Output: A CType subcircuit C of M such that (M/C)(I) ≠ M(I) for every possible Boolean input vector I of length #n_in, if such a subcircuit exists, and special symbol ⊥ otherwise.

Each of these templates can be filled out to create six search problem variants as follows:

1. Exact-size Problem: Name = "Exact"; Acc = "Ex"; PrmAdd = ", and a positive integer k < |M|"; CType = "size-k"
2. Bounded-size Problem: Name = "Bounded"; Acc = "B"; PrmAdd = ", and a positive integer k < |M|"; CType = "size-≤ k"
3. Minimal Problem: Name = "Minimal"; Acc = "Mnl"; PrmAdd = ""; CType = "minimal"
4. Minimal Exact-size Problem: Name = "Minimal exact"; Acc = "MnlEx"; PrmAdd = ", and a positive integer k < |M|"; CType = "minimal size-k"
5. Minimal Bounded-size Problem: Name = "Minimal bounded"; Acc = "MnlB"; PrmAdd = ", and a positive integer k < |M|"; CType = "minimal size-≤ k"
6. Minimum Problem: Name = "Minimum"; Acc = "Min"; PrmAdd = ""; CType = "minimum"

We will use previous results for the following problems to prove our results for the problems above.

Exact clique (ExClique)
Input: An undirected graph G = (V, E) and a positive integer k ≤ |V|.
Output: A k-size vertex subset V′ of G that is a clique of size k in G, if such a V′ exists, and special symbol ⊥ otherwise.

Exact vertex cover (ExVC)
Input: An undirected graph G = (V, E) and a positive integer k ≤ |V|.
Output: A k-size vertex subset V′ of G that is a vertex cover of size k in G, if such a V′ exists, and special symbol ⊥ otherwise.

Minimal vertex cover (MnlVC) (Valiant, 1979, Problem 4)
Input: An undirected graph G = (V, E).
Output: A minimal subset V′ of the vertices in G that is a vertex cover of G.

Given any search problem X above, let #X be the problem that returns the number of solution outputs. To assess the complexity of these counting problems, we will use the following definitions from Garey & Johnson 1979, Section 7.3, adapted from those originally given in Valiant 1979.

Definition 14. (Garey & Johnson, 1979, p. 168) A counting problem #Π is in #P if there is a nondeterministic algorithm such that, for each input I of Π, (1) the number of distinct "guesses" that lead to acceptance of I exactly equals the number of solutions of Π for input I and (2) the length of the longest accepting computation is bounded by a polynomial in |I|.

#P contains very hard problems, as it is known that every class in the Polynomial Hierarchy (which includes P and NP as its lowest members) Turing reduces to #P, i.e., PH ⊆ P^#P (Toda, 1991). We use the following type of reduction to isolate problems that are the hardest (#P-complete) and at least as hard as the hardest (#P-hard) in #P.

Definition 15. (Garey & Johnson, 1979, p. 168-169) Given two search problems Π and Π′, a (polynomial time) parsimonious reduction from Π to Π′ is a function f : I_Π → I_Π′ that can be computed in polynomial time such that, for every I ∈ I_Π, the number of solutions of Π for input I is exactly equal to the number of solutions of Π′ for input f(I).

We will also derive parameterized counting results using the framework given in Flum & Grohe 2006, Chapter 14.
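To make the counting problems concrete, here is a small brute-force enumerator for MnlVC's solution set; its size is exactly the answer to #MnlVC, the problem Valiant showed #P-complete and which is used in the #MnlLSC result below. This is our own illustrative sketch, with names and graph encoding as assumptions.

```python
from itertools import combinations

def is_cover(edges, s):
    """True iff vertex set `s` touches every edge."""
    return all(u in s or v in s for u, v in edges)

def minimal_vertex_covers(n, edges):
    """All minimal vertex covers of a graph on vertices 0..n-1. Because
    any superset of a cover is a cover, a cover is minimal exactly when
    removing any single one of its vertices breaks it, so the deleted-
    subset condition of Definition 13 reduces to a single-vertex check."""
    return [set(sub)
            for size in range(n + 1)
            for sub in combinations(range(n), size)
            if is_cover(edges, set(sub))
            and all(not is_cover(edges, set(sub) - {v}) for v in sub)]
```

For instance, the triangle K3 has three minimal vertex covers (every 2-subset of its vertices), while the path 0-1-2 has two ({1} and {0, 2}); `len(minimal_vertex_covers(...))` is then the #MnlVC answer.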
The definition of class #W[1] (Flum & Grohe, 2006, Definition 14.11) is rather intricate and need not concern us here. We will use the following type of reduction to isolate problems that are at least as hard as the hardest (#W[1]-hard) in #W[1].

Definition 16. (Adapted from Flum & Grohe 2006, Definition 14.10.a) Given two parameterized search problems ⟨k⟩-Π and ⟨K⟩-Π′, an (fpt) parsimonious reduction from ⟨k⟩-Π to ⟨K⟩-Π′ is a function f : I_Π → I_Π′ computable in fixed-parameter time relative to parameter k such that, for every I ∈ I_Π, (1) the number of solutions of Π for input I is exactly equal to the number of solutions of Π′ for input f(I) and (2) for every parameter k′ ∈ K, k′ ≤ g_k′(k) for some function g_k′().

Reductions are often established to be parsimonious by proving bijections between the solution-sets for I and f(I), i.e., each solution to I corresponds to exactly one solution for f(I) and vice versa.

We will prove various parameterized results for our problems using reductions from ExClique. The parameterized results are proved relative to the parameters in Table 6. Lemmas 1 and 2 will be useful in deriving additional parameterized results from proved ones.

Table 6: Parameters for the sufficient circuit, necessary circuit, and circuit ablation problems. Note that for sufficient circuit problems, there are two versions of each parameter, namely those describing the given MLP M and the derived subcircuit C (distinguished by g- and r-subscripts, respectively).
| Parameter | Description |
|---|---|
| cd | # layers in given MLP |
| cw | max # neurons in a layer in given MLP |
| #n_tot | total # neurons in given MLP |
| #n_in | # input neurons in given MLP |
| #n_out | # output neurons in given MLP |
| B_max | max neuron bias in given MLP |
| W_max | max connection weight in given MLP |
| k | size of requested neuron subset |

C.1 Results for Sufficient Circuit Problems

Theorem 13. For Π ∈ L = {VLSC, VGSC | V ∈ {Ex, B, MnlEx, MnlB}}, ExClique polynomial-time parsimoniously reduces to Π.

Proof. Consider first the local sufficient circuit problem variants. Observe that for the reduction from Clique to MLSC in the proof of Theorem 1, (1) the reduction is also a reduction from ExClique, (2) each clique of size k in the given instance of ExClique has exactly one corresponding sufficient circuit of size k′ = k + k(k − 1)/2 + 2 in the constructed instance of MLSC and vice versa, and (3) courtesy of the bias in neuron n_out and the structure of the MLP M in the constructed instance of MLSC, no sufficient circuit can have size < k′, and hence problem variants ExLSC, BLSC, MnlExLSC, and MnlBLSC (when k′ = k + k(k − 1)/2 + 2) have the same set of sufficient circuit solutions. Hence, this reduction is also a polynomial-time parsimonious reduction from ExClique to each local sufficient circuit problem variant in L.

As for the global sufficient circuit problem variants, it was pointed out in the proof of Theorem 7 that the reduction above is also a reduction from Clique to MGSC; moreover, all three properties above also hold modulo MGSC, ExGSC, BGSC, MnlExGSC, and MnlBGSC.
Hence, this reduction is also a polynomial-time parsimonious reduction from ExClique to each global sufficient circuit problem variant in L. ∎

Theorem 14. For Π ∈ L = {VLSC, VGSC | V ∈ {Ex, B, MnlEx, MnlB}}, ⟨k⟩-ExClique fpt parsimoniously reduces to ⟨cd_g, #n_in,g, #n_out,g, B_max,g, W_max,g, cd_r, cw_r, #n_in,r, #n_out,r, #n_tot,r, B_max,r, W_max,r⟩-Π.

Proof. Observe that in the instance of MLSC constructed in the reduction in the proof of Theorem 1, cd_g = cd_r = 4, #n_in,g = #n_in,r = #n_out,g = #n_out,r = W_max,g = W_max,r = 1, and B_max,g, B_max,r, #n_tot,r, and cw_r are all functions of k in the given instance of Clique. The result then follows by the reasoning in the proof of Theorem 13. ∎

Theorem 15. For Π ∈ L = {VLSC, VGSC | V ∈ {Ex, B, MnlEx, MnlB}}, if Π is polynomial-time solvable then P = NP.

Proof. Suppose there is a polynomial-time algorithm A for some Π ∈ L. Let R be the polynomial-time algorithm underlying the polynomial-time parsimonious reduction from ExClique to Π specified in the proof of Theorem 13.
Construct an algorithm A′ for the decision version of ExClique as follows: Given an input I for ExCliqueD, create input I′ for Π using R, and apply A to I′ to create solution S. If S = ⊥, return "No"; otherwise, return "Yes". Algorithm A′ is a polynomial-time algorithm for ExCliqueD; however, as ExCliqueD is NP-complete (Garey & Johnson, 1979, Problem GT19), this implies that P = NP, giving the result. ∎

Theorem 16. If MinLSC or MinGSC is polynomial-time solvable then P = NP.

Proof. Suppose there is a polynomial-time algorithm A for MinLSC (MinGSC). Let R be the polynomial-time algorithm underlying the polynomial-time parsimonious reduction from ExClique to BLSC (BGSC) specified in the proof of Theorem 13. Construct an algorithm A′ for the decision version of ExClique as follows: Given an input I for ExCliqueD, create input I′ for BLSC (BGSC) using R, and apply A to I′ to create solution S. If |S| ≤ k, return "Yes"; otherwise, return "No". Algorithm A′ is a polynomial-time algorithm for ExCliqueD; however, as ExCliqueD is NP-complete (Garey & Johnson, 1979, Problem GT19), this implies that P = NP, giving the result. ∎

Theorem 17. For Π ∈ L = {VLSC, VGSC | V ∈ {Ex, B, MnlEx, MnlB}} and K = ⟨cd_g, #n_in,g, #n_out,g, B_max,g, W_max,g, cd_r, cw_r, #n_in,r, #n_out,r, #n_tot,r, B_max,r, W_max,r⟩, if ⟨K⟩-Π is fixed-parameter tractable then FPT = W[1].

Proof. Suppose there is a fixed-parameter tractable algorithm A for ⟨K⟩-Π for some Π ∈ L.
Let R be the fixed-parameter algorithm underlying the fpt parsimonious reduction from ⟨k⟩-ExClique to ⟨K⟩-Π specified in the proof of Theorem 14. Construct an algorithm A′ for the decision version of ⟨k⟩-ExClique as follows: Given an input I for ⟨k⟩-ExCliqueD, create input I′ for Π using R, and apply A to I′ to create solution S. If S = ⊥, return "No"; otherwise, return "Yes". Algorithm A′ is a fixed-parameter tractable algorithm for ⟨k⟩-ExCliqueD; however, as ⟨k⟩-ExCliqueD is W[1]-complete (Downey & Fellows, 1999), this implies that FPT = W[1], giving the result. ∎

Theorem 18. For K = ⟨cd_g, #n_in,g, #n_out,g, B_max,g, W_max,g, cd_r, cw_r, #n_in,r, #n_out,r, #n_tot,r, B_max,r, W_max,r⟩, if ⟨K⟩-MinLSC or ⟨K⟩-MinGSC is fixed-parameter tractable then FPT = W[1].

Proof. Suppose there is a fixed-parameter tractable algorithm A for ⟨K⟩-MinLSC (⟨K⟩-MinGSC). Let R be the fixed-parameter algorithm underlying the fpt parsimonious reduction from ⟨k⟩-ExClique to ⟨K⟩-BLSC (⟨K⟩-BGSC) specified in the proof of Theorem 14. Construct an algorithm A′ for the decision version of ⟨k⟩-ExClique as follows: Given an input I for ⟨k⟩-ExCliqueD, create input I′ for BLSC (BGSC) using R, and apply A to I′ to create solution S. If |S| ≤ k, return "Yes"; otherwise, return "No".
Algorithm A′ is a fixed-parameter tractable algorithm for ⟨k⟩-ExCliqueD; however, as ⟨k⟩-ExCliqueD is W[1]-complete (Downey & Fellows, 1999), this implies that FPT = W[1], giving the result. ∎

Theorem 19. For Π ∈ L = {VLSC | V ∈ {Ex, B, MnlEx, MnlB}}, #Π is #P-complete.

Proof. As #ExVC is #P-complete (Provan & Ball 1983, Page 781; see also Garey & Johnson 1979, Page 169), #ExClique is #P-hard by the polynomial-time parsimonious reduction from ExVC to ExClique implicit in Garey & Johnson 1979, Lemma 3.1. The #P-hardness of #Π then follows from the appropriate polynomial-time parsimonious reduction from ExClique to Π specified in the proof of Theorem 13. Membership of #Π in #P (and hence the result) follows from the nondeterministic algorithm for #Π that, on each computation path, guesses a subcircuit C of M and then verifies that C satisfies the properties required by Π relative to MLP M and input I. ∎

Theorem 20. For Π ∈ L = {VGSC | V ∈ {Ex, B, MnlEx, MnlB}}, #Π is #P-hard.

Proof. The result follows from the #P-hardness of #ExClique noted in the proof of Theorem 19 and the appropriate polynomial-time parsimonious reduction from ExClique to Π specified in the proof of Theorem 13. ∎

Theorem 21. #MnlLSC is #P-complete.

Proof. Consider the reduction from MnlVC to MnlLSC created by modifying the reduction from VC to MLSC given in the proof of Theorem 5 such that the bias and input-line weight of input neuron n_in are changed to 0 and 1, respectively, to ensure that all possible input vectors (namely, (0) and (1)) cause n_in to output 1. Observe that this modified reduction runs in time polynomial in the size of the given instance of MnlVC.
We now need to show that this reduction is parsimonious, i.e., that it creates a bijection between the solution-sets of the given instance of MnlVC and the constructed instance of MnlLSC. We prove the two directions of this bijection separately as follows:

(⇒): Let V′ = {v′_1, v′_2, …, v′_k} ⊆ V be a minimal vertex cover in G. Consider the subcircuit C based on neurons n_in, n_out, {nv′N | v′ ∈ V′}, {neA_1, neA_2, …, neA_|E|}, and {neN_1, neN_2, …, neN_|E|}. As shown in the (⇒)-portion of the proof of correctness of the reduction in the proof of Theorem 5, C is behaviorally equivalent to M on I. As V′ is minimal and only vertex NOT neurons can be deleted from M to create C, C must itself be minimal. Moreover, note that any such set V′ in G is associated with exactly one set of vertex NOT neurons in (and thus exactly one sufficient circuit of) M.

(⇐): Let C be a minimal subcircuit of M that is behaviorally equivalent to M on input I. As neurons in all five layers of M must be present in C to produce the required output, both n_in and n_out are in C. In order for n_out to produce a non-zero output, all |E| edge NOT neurons and all |E| edge AND neurons must also be in C, and each of the latter must be connected to at least one of the vertex NOT neurons corresponding to its endpoint vertices. Hence, the vertices in G corresponding to the vertex NOT neurons in C must form a vertex cover in G.
As C is minimal and only vertex NOT neurons can be deleted from M to create C, this vertex cover must itself be minimal. Moreover, note that any such set of vertex NOT neurons in M is associated with exactly one set of vertices in (and hence exactly one vertex cover of) G.

As #MnlVC is #P-complete (Valiant, 1979, Theorem 1(4)), the reduction above establishes that #MnlLSC is #P-hard. Membership of #MnlLSC in #P (and hence the result) follows from the nondeterministic algorithm for #MnlLSC that, on each computation path, guesses a subcircuit C of M and then verifies that C is minimal and that C(I) = M(I). ∎

Theorem 22. #MnlGSC is #P-hard.

Proof. Recall that the parsimonious reduction from MnlVC to MnlLSC in the proof of Theorem 21 creates an instance of MnlLSC whose MLP M has the same output for every possible input vector; hence, this reduction is also a parsimonious reduction from MnlVC to MnlGSC. As #MnlVC is #P-complete (Valiant, 1979, Theorem 1(4)), this reduction establishes that #MnlGSC is #P-hard, giving the result. ∎

Theorem 23. For Π ∈ L = {VLSC, VGSC | V ∈ {Ex, B, MnlEx, MnlB}}, if #Π is polynomial-time solvable then P = NP.

Proof. Suppose there is a polynomial-time algorithm A for #Π for some Π ∈ L. Let R be the polynomial-time algorithm underlying the polynomial-time parsimonious reduction from ExClique to Π specified in the proof of Theorem 13. Construct an algorithm A′ for the decision version of ExClique as follows: Given an input I for ExCliqueD, create input I′ for Π using R, and apply A to I′ to create solution S. If S = 0, return "No"; otherwise, return "Yes". Algorithm A′ is a polynomial-time algorithm for ExCliqueD; however, as ExCliqueD is NP-complete (Garey & Johnson, 1979, Problem GT19), this implies that P = NP, giving the result. ∎

Theorem 24.
For Π ∈ L = {VLSC, VGSC | V ∈ {Ex, B, MnlEx, MnlB}} and K = ⟨cd_g, #n_in,g, #n_out,g, B_max,g, W_max,g, cd_r, cw_r, #n_in,r, #n_out,r, #n_tot,r, B_max,r, W_max,r⟩, ⟨K⟩-#Π is #W[1]-hard.

Proof. The result follows from the #W[1]-hardness of ⟨k⟩-#ExClique (Flum & Grohe, 2006, Theorem 14.18) and the appropriate fpt parsimonious reduction from ⟨k⟩-ExClique to ⟨K⟩-Π specified in the proof of Theorem 14. ∎

Theorem 25. For Π ∈ L = {VLSC, VGSC | V ∈ {Ex, B, MnlEx, MnlB}} and K = ⟨cd_g, #n_in,g, #n_out,g, B_max,g, W_max,g, cd_r, cw_r, #n_in,r, #n_out,r, #n_tot,r, B_max,r, W_max,r⟩, if ⟨K⟩-#Π is fixed-parameter tractable then FPT = #W[1].

Proof. Suppose there is a fixed-parameter tractable algorithm A for ⟨K⟩-#Π for some Π ∈ L.
Let R be the fixed-parameter algorithm underlying the fpt parsimonious reduction from ⟨k⟩-ExClique to ⟨K⟩-Π specified in the proof of Theorem 14. Construct an algorithm A′ for ⟨k⟩-#ExClique as follows: Given an input I for ⟨k⟩-#ExClique, create input I′ for #Π using R, and apply A to I′ to create solution S; as R is parsimonious, return S. Algorithm A′ is a fixed-parameter tractable algorithm for ⟨k⟩-#ExClique; however, as ⟨k⟩-#ExClique is #W[1]-complete (Flum & Grohe, 2006, Theorem 14.18), this implies that FPT = #W[1], giving the result. ∎

Appendix D: Global Sufficient Circuit Problem (Σ₂ᵖ-completeness)

Minimum global sufficient circuit (MGSC)
Input: A multi-layer perceptron M of depth cd with #n_tot neurons and maximum layer width cw, connection-value matrices W_1, W_2, …, W_cd, neuron bias vector B, and a positive integer k.
Question: Is there a subcircuit C of M based on ≤ k neurons from M such that, for every possible input I of M, C(I) = M(I)?

Given a subset N of the neurons in M, the subcircuit C of M based on N has the neurons in N and all connections in M among these neurons. Note that in order for the output of C to be equal to the output of M on input I, the numbers #n_in and #n_out of input and output neurons in M must exactly equal the numbers of input and output neurons in C; hence, no input or output neurons can be deleted from M in creating C. Following Barceló et al.
2020, page 4, all neurons in M use the ReLU activation function and the output x of each output neuron is stepped as necessary to be Boolean, i.e., step(x) = 0 if x ≤ 0 and 1 otherwise.

We will prove our result for MGSC using a polynomial-time reduction from the problem Minimum DNF Tautology (3DT). Given a DNF formula ϕ over a set V of variables, ϕ is a tautology if ϕ evaluates to True for every possible truth-assignment to the variables in V. Our reduction will use the specialized ReLU logic gates described in Barceló et al. 2020, Lemma 13. These gates assume Boolean neuron input and output values of 0 and 1 and are structured as follows:

1. NOT ReLU gate: A ReLU gate with one input connection weight of value −1 and a bias of 1. This gate has output 1 if the input is 0 and 0 otherwise.
2. n-way AND ReLU gate: A ReLU gate with n input connection weights of value 1 and a bias of −(n − 1). This gate has output 1 if all inputs have value 1 and 0 otherwise.
3. n-way OR ReLU gate: A combination of an n-way AND ReLU gate with NOT ReLU gates on all of its inputs and a NOT ReLU gate on its output, which uses DeMorgan's Second Law to implement (x_1 ∨ x_2 ∨ … ∨ x_n) as ¬(¬x_1 ∧ ¬x_2 ∧ … ∧ ¬x_n). This gate has output 1 if any input has value 1 and 0 otherwise.

Theorem 26. MGSC is Σ₂ᵖ-complete.

Proof. Let us first show the membership of MGSC in Σ₂ᵖ.
Using the alternating-quantifier definition of classes in the polynomial hierarchy, membership of a decision problem Π in Σ₂ᵖ can be proved by showing that solving Π for input I is equivalent to solving a quantified formula of the form ∃(x) ∀(y) : p(x, y), where both the sizes of x and y and the evaluation time of predicate formula p() are upper-bounded by polynomials in |I|. Such a formula for MGSC is

∃(C ⊆ M) ∀(x ∈ {0,1}^#n_in) : C(x) = M(x)

We now show the Σ₂ᵖ-hardness of MGSC. Consider the following reduction from 3DT to MGSC. Given an instance ⟨ϕ, T, V, k⟩ of 3DT, construct the following instance ⟨M, k′⟩ of MGSC: Let M be an MLP based on 3|V| + 2|T| + 2 neurons spread across five layers:

1. Input neuron layer: The input neurons ni_1, ni_2, …, ni_|V| (all with bias 0).
2. Hidden layer I: The unnegated variable identity neurons nvU_1, nvU_2, …, nvU_|V| (all with bias 0) and the negated variable NOT neurons nvN_1, nvN_2, …, nvN_|V| (all with bias 1).
3. Hidden layer II: (a) The term 3-way AND neurons nT_1, nT_2, …, nT_|T| (all with bias −2). (b) The gadget neuron n_g (with bias −(|V| − 1)).
4. Hidden layer III: The modified term 2-way AND neurons nTm_1, nTm_2, …, nTm_|T| (all with bias −1).
5.
Output layer: The stepped output neuron n_out. The non-zero weight connections between adjacent layers are as follows: • Each input neuron ni_i, 1 ≤ i ≤ |V|, is input-connected with weight 1 to its corresponding input line and output-connected with weights 1 and −1 to the unnegated and negated variable neurons nvU_i and nvN_i, respectively. • Each term neuron nT_i, 1 ≤ i ≤ |T|, is input-connected with weight 1 to each of the 3 variable neurons corresponding to that term's literals. • The gadget neuron n_g is input-connected with weight 1 to all of the unnegated and negated variable neurons. • Each modified term neuron nTm_i, 1 ≤ i ≤ |T|, is input-connected with weight 1 to the term neuron nT_i and the gadget neuron n_g, and output-connected with weight 1 to the output neuron n_out. All other connections between neurons in adjacent layers have weight 0. Finally, let k′ = 3|V| + 2k + 2. Observe that this instance of MGSC can be constructed in time polynomial in the size of the given instance of 3DT. The following observations about the MLP M constructed above will be of use: • The input to M is exactly that of the 3-DNF formula ϕ. • The output neuron of M outputs 1 if and only if one or more of the modified term neurons output 1. • Modified term neuron nTm_i outputs 1 if and only if both the term neuron nT_i and the gadget neuron n_g output 1.
• The gadget neuron n_g outputs 1 for input I to a subcircuit C of M if and only if the negated and unnegated variable neurons corresponding to I each output 1; hence, n_g outputs 1 for all possible inputs to C if and only if all negated and unnegated variable neurons in hidden layer I are part of C. As ϕ is a tautology, the above implies that (1) M outputs 1 for every possible input and (2) every global sufficient circuit of M must include all input and variable neurons, the gadget and output neurons, and at least one term / modified term neuron-pair. We now need to show the correctness of this reduction by proving that the answer for the given instance of 3DT is "Yes" if and only if the answer for the constructed instance of MGSC is "Yes". We prove the two directions of this if and only if separately as follows: ⇒: Let T′, |T′| = k, be a subset of the terms in ϕ that is a tautology. As noted above, any global sufficient circuit C for the constructed MLP M must include all input and variable neurons, the gadget and output neurons, and at least one term / modified term neuron-pair. Let C contain the term / modified term neuron-pairs corresponding to the terms in T′. Such a C is therefore a global sufficient circuit for M of size k′ = 3|V| + 2k + 2. ⇐: Let C be a global sufficient circuit for M of size k′ = 3|V| + 2k + 2. As noted above, C must include all input and variable neurons, the gadget and output neurons, and k term / modified term neuron-pairs. Let T′ be the subset of k terms in ϕ corresponding to these neuron-pairs. Given the input-output equivalence of ϕ and M (and hence of any sufficient circuit for M), the disjunction of the k terms in T′ must be a tautology.
As 3DT is Σ^p_2-hard (Schaefer & Umans, 2002, Problem L7), the reduction above establishes that MGSC is also Σ^p_2-hard. The result then follows from the membership of MGSC in Σ^p_2 shown at the beginning of this proof. ∎ Appendix E Quasi-Minimal Sufficient Circuit Problem Quasi-Minimal Sufficient Circuit (QMSC) Input: A multi-layer perceptron M of depth c_d with #n_tot neurons and maximum layer width c_w, connection-value matrices W_1, W_2, …, W_{c_d}, neuron bias vector B, and a set X of input vectors of length #n_in. Output: A circuit C in M and a neuron v ∈ C such that C(x) = M(x) and [C ∖ v](x) ≠ M(x). Theorem 27. QMSC is in PTIME (i.e., polynomial-time tractable). Proof. Consider the following algorithm for QMSC. Build a sequence of MLPs by taking M with all neurons labeled 1, and generating each subsequent M_i in the sequence by labeling an additional neuron with 0 each time (this choice can be based on any heuristic strategy, for instance, one based on gradients). The first MLP, M_1, obtained by removing all neurons labeled 0 (i.e., none), is such that M_1(x) = M(x), and the last, M_n, is guaranteed to give M_n(x) ≠ M(x) because all neurons are removed. Label the first MLP YES and the last NO. Perform a variant of binary search on the sequence as follows. Evaluate the M_i halfway between YES and NO after removing all its neurons labeled 0. If it satisfies the condition, label it YES, and repeat the same strategy with the sequence starting from the YES just labeled until the last M_n.
If it does not satisfy the condition, label it NO and repeat the same strategy with the sequence starting from the YES at the beginning of the original sequence until the NO just labeled. This iterative procedure halves the search interval each time. Halt when you find two adjacent ⟨YES, NO⟩ circuits (guaranteed to exist), and return the circuit set V of the YES network together with the single-neuron difference between YES and NO (the breaking point), v ∈ V. The complexity of this algorithm is roughly O(n log n). ∎ Appendix F Gnostic Neurons Problem Gnostic Neurons (GN) Input: A multi-layer perceptron M of depth c_d with #n_tot neurons and maximum layer width c_w, connection-value matrices W_1, W_2, …, W_{c_d}, neuron bias vector B, two sets X and Y of input vectors of length #n_in, and a positive integer k such that 1 ≤ k ≤ #n_tot. Output: A subset of neurons V in M of size |V| ≥ k such that ∀v ∈ V it is the case that ∀x ∈ X, computing M(x) produces activations A^v_x ≥ t, and ∀y ∈ Y : A^v_y < t. Theorem 28. GN is in PTIME (i.e., polynomial-time tractable). Proof. Consider the complexity of the following subroutines of an algorithm for GN. Computing the activations of all neurons of M for all x ∈ X and all y ∈ Y takes time polynomial in |M|, |X| and |Y|. Labeling neurons according to whether or not they pass the activation threshold takes time polynomial in |M|, |X| and |Y|. Finally, checking whether the set of neurons that fulfils the condition is of size at least k can be done in polynomial time in |M|.
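These subroutines can be illustrated with a small sketch (ours, not the paper's; the helper names and the explicit threshold argument t are our additions, and the MLP is represented as nested Python lists):

```python
def relu(x):
    return max(0.0, x)

def forward_activations(weights, biases, x):
    """Post-ReLU activations of a fully connected MLP, layer by layer.
    weights[l] is a list of weight rows; biases[l] a list of biases."""
    acts, a = [], list(x)
    for W, b in zip(weights, biases):
        a = [relu(sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b_i)
             for row, b_i in zip(W, b)]
        acts.append(a)
    return acts

def gnostic_neurons(weights, biases, X, Y, t, k):
    """Neurons with activation >= t on every x in X and < t on every y in Y;
    returns the set if it has size >= k, else None."""
    good = set()
    for l, b in enumerate(biases):
        for i in range(len(b)):
            over = all(forward_activations(weights, biases, x)[l][i] >= t for x in X)
            under = all(forward_activations(weights, biases, y)[l][i] < t for y in Y)
            if over and under:
                good.add((l, i))
    return good if len(good) >= k else None
```

Each candidate neuron is checked with one forward pass per input, so the total work is polynomial in |M|, |X| and |Y|, matching the argument above.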
These subroutines can be put together to yield a polynomial-time algorithm for GN. ∎ Remark 1. One could also add to the output of the computational problem the requirement that silencing (or activating) the neuron should elicit (or abolish) a behavior. Note that checking these effects can be done in time polynomial in all of the input parts given above and also in the size of the behavior set (which should be added to the input in these variants). Appendix G Necessary Circuit Problem Minimum Local Necessary Circuit (MLNC) Input: A multi-layer perceptron M of depth c_d with #n_tot neurons and maximum layer width c_w, connection-value matrices W_1, W_2, …, W_{c_d}, neuron bias vector B, a Boolean input vector I of length #n_in, and a positive integer k such that 1 ≤ k ≤ #n_tot. Question: Is there a subset N′, |N′| ≤ k, of the |N| neurons in M such that N′ ∩ C ≠ ∅ for every sufficient circuit C of M relative to I? Minimum Global Necessary Circuit (MGNC) Input: A multi-layer perceptron M of depth c_d with #n_tot neurons and maximum layer width c_w, connection-value matrices W_1, W_2, …, W_{c_d}, neuron bias vector B, and a positive integer k such that 1 ≤ k ≤ #n_tot. Question: Is there a subset N′, |N′| ≤ k, of the |N| neurons in M such that, for every possible Boolean input vector I of length #n_in, N′ ∩ C ≠ ∅ for every sufficient circuit C of M relative to I? We will use reductions from the Hitting Set problem to prove our results for the problems above.
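For reference, Hitting Set (HS) asks for a set of at most k elements that intersects every set in a given collection. A brute-force sketch (ours, for illustration only; HS is NP-hard, so this enumeration is exponential in k):

```python
from itertools import combinations

def hitting_set(collection, k):
    """Return a set S' with |S'| <= k intersecting every set in `collection`,
    or None if no such set exists."""
    universe = sorted(set().union(*collection))
    for size in range(1, k + 1):
        for cand in combinations(universe, size):
            # A candidate is a hitting set iff it meets every set in the collection.
            if all(set(cand) & c for c in collection):
                return set(cand)
    return None
```

For example, `hitting_set([{1, 2}, {2, 3}, {3, 4}], 2)` finds the 2-element set {1, 3}, while three disjoint singletons admit no hitting set of size 2.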
Regarding the Hitting Set problem, we shall assume an ordering on the sets and elements in C and S, respectively. Our parameterized results are proved relative to the parameters in Table 7. Lemmas 1 and 2 will be useful in deriving additional parameterized results from proved ones. Our reductions will use specialized ReLU logic gates described in Barceló et al. 2020, Lemma 13. These gates assume Boolean neuron input and output values of 0 and 1 and are structured as follows: 1. NOT ReLU gate: A ReLU gate with one input connection weight of value −1 and a bias of 1. This gate has output 1 if the input is 0 and 0 otherwise. 2. n-way AND ReLU gate: A ReLU gate with n input connection weights of value 1 and a bias of −(n−1). This gate has output 1 if all inputs have value 1 and 0 otherwise. 3. n-way OR ReLU gate: A combination of an n-way AND ReLU gate with NOT ReLU gates on all of its inputs and a NOT ReLU gate on its output, using De Morgan's second law to implement (x_1 ∨ x_2 ∨ … ∨ x_n) as ¬(¬x_1 ∧ ¬x_2 ∧ … ∧ ¬x_n). This gate has output 1 if any input has value 1 and 0 otherwise. Table 7: Parameters for the minimum necessary circuit problem.
Parameter | Description
c_d | # layers in given MLP
c_w | max # neurons in a layer in given MLP
#n_tot | total # neurons in given MLP
#n_in | # input neurons in given MLP
#n_out | # output neurons in given MLP
B_max | max neuron bias in given MLP
W_max | max connection weight in given MLP
k | size of requested neuron subset

G.1 Results for MLNC Membership of MLNC in Σ^p_2 can be proven via the definition of the polynomial hierarchy and the following alternating-quantifier formula: ∃[N ⊆ M] ∀[C ⊆ M] : [C(x) = M(x)] ⟹ N ∩ C ≠ ∅. Theorem 29. If MLNC is polynomial-time tractable then P = NP. Proof. Consider the following reduction from HS to MLNC. Given an instance ⟨C, S, k⟩ of HS, construct the following instance ⟨M, I, k′⟩ of MLNC: Let M be an MLP based on #n_tot = |S| + |C| + 2 neurons spread across four layers: 1. Input neuron layer: The single input neuron n_in (bias +1). 2. Hidden element layer: The element neurons ns_1, ns_2, …, ns_|S| (all with bias 0). 3. Hidden set layer: The set AND neurons nc_1, nc_2, …, nc_|C| (such that neuron nc_i has bias −(|c_i| − 1)). 4. Output layer: The single stepped output neuron n_out (bias 0). The non-zero weight connections between adjacent layers are as follows: • The input neuron has an edge of weight 0 coming from its input and is in turn connected to each of the element neurons with weight 1.
• Each element neuron ns_i, 1 ≤ i ≤ |S|, is connected with weight 1 to each set neuron nc_j, 1 ≤ j ≤ |C|, such that s_i ∈ c_j. • Each set neuron nc_i, 1 ≤ i ≤ |C|, is connected to the output neuron with weight 1. All other connections between neurons in adjacent layers have weight 0. Finally, let I = (0) and k′ = k. Observe that this instance of MLNC can be created in time polynomial in the size of the given instance of HS. Moreover, the output behaviour of the neurons in M from the presentation of input I until the output is generated is as follows:

timestep | neurons (outputs)
0 | —
1 | n_in (1)
2 | ns_1, ns_2, …, ns_|S| (all 1)
3 | nc_1, nc_2, …, nc_|C| (all 1)
4 | n_out (1)

Note the following about the behavior of M: Observation 1. For any set neuron nc_i to output 1, it must receive input 1 from all of its incoming element neurons connected with weight 1. Observation 2. For the output neuron to output 1, it is sufficient to get input 1 from any of its incoming set neurons with weight 1. Observations 1 and 2 imply that any sufficient circuit for M must contain at least one set neuron and all of its associated element neurons. We now need to show the correctness of this reduction by proving that the answer for the given instance of HS is "Yes" if and only if the answer for the constructed instance of MLNC is "Yes".
We prove the two directions of this if and only if separately as follows: ⇒: Let S′ = {s′_1, s′_2, …, s′_k} ⊆ S be a hitting set of size k for C. By the construction above, the k element neurons corresponding to the elements in S′ collectively connect with weight 1 to all set neurons in M. By Observations 1 and 2, this means that the set N of these element neurons has a non-empty intersection with every sufficient circuit for M, and hence that N is a necessary circuit for M of size k = k′. ⇐: Let N be a necessary circuit for M of size k′. Let N_S and N_C be the subsets of N that are element and set neurons, respectively. We can create a set N′ consisting only of element neurons by replacing each set neuron n_c in N_C with an arbitrary element neuron that is not already in N_S and is connected to n_c with weight 1. Observe that N′ (whose size may be less than k′ if any n_c already had an associated element neuron in N_S) remains a necessary circuit for M. Moreover, as the element neurons in N′ by definition have a non-empty intersection with each sufficient circuit for M, by Observations 1 and 2 above, the set S′ of elements in S corresponding to the element neurons in N′ has a non-empty intersection with each set in C and hence is a hitting set of size |N′| ≤ k′ = k. As HS is NP-hard (Garey & Johnson, 1979), the reduction above establishes that MLNC is also NP-hard. The result follows from the definition of NP-hardness. ∎ Theorem 30.
If ⟨c_d, #n_in, #n_out, W_max, k⟩-MLNC is fixed-parameter tractable then FPT = W[1]. Proof. Observe that in the instance of MLNC constructed in the reduction in the proof of Theorem 29, #n_in = #n_out = W_max = 1, c_d = 4, and k′ is a function of k in the given instance of HS. The result then follows from the facts that ⟨k⟩-HS is W[2]-hard (by a reduction from ⟨k⟩-Dominating Set; Downey & Fellows 1999) and W[1] ⊆ W[2]. ∎ Theorem 31. ⟨#n_tot⟩-MLNC is fixed-parameter tractable. Proof. Consider the algorithm that generates every possible subset N′ of size at most k of the neurons in MLP M and, for each such subset, generates every possible subset N′′ of M, checks if N′′ is a sufficient circuit for M relative to I and, if so, checks if N′ has a non-empty intersection with N′′. If an N′ is found that has a non-empty intersection with each sufficient circuit for M relative to I, return "Yes"; otherwise, return "No". The numbers of possible subsets N′ and N′′ are both at most 2^{#n_tot}. As all subsequent checking operations can be done in time polynomial in the size of the given instance of MLNC, the above is a fixed-parameter tractable algorithm for MLNC relative to parameter-set {#n_tot}. ∎ Theorem 32. ⟨c_w, c_d⟩-MLNC is fixed-parameter tractable. Proof. Follows from the algorithm in the proof of Theorem 31 and the observation that #n_tot ≤ c_w × c_d.
∎ Observe that the results in Theorems 30–32 in combination with Lemmas 1 and 2 suffice to establish the parameterized complexity status of MLNC relative to many subsets of the parameters listed in Table 7. Let us now consider the polynomial-time cost approximability of MLNC. Theorem 33. If MLNC has a polynomial-time c-approximation algorithm for any constant c > 0 then P = NP. Proof. Recall from the proof of correctness of the reduction in the proof of Theorem 29 that a given instance of HS has a hitting set of size k if and only if the constructed instance of MLNC has a necessary circuit of size k′ = k. This implies that, given a polynomial-time c-approximation algorithm A for MLNC for some constant c > 0, we can create a polynomial-time c-approximation algorithm for HS by applying the reduction to the given instance x of HS to construct an instance x′ of MLNC, applying A to x′ to create an approximate solution y′, and then using y′ to create an approximate solution y for x that has the same cost as y′. The result then follows from Ausiello et al. 1999, Problem SP7, which states that if HS has a polynomial-time c-approximation algorithm for any constant c > 0 (and is hence in the approximation problem class APX) then P = NP. ∎ Note that this theorem also renders MLNC PTAS-inapproximable unless FPT = W[1]. G.2 Results for MGNC Membership of MGNC in Σ^p_2 can be proven via the definition of the polynomial hierarchy and the following alternating-quantifier formula: ∃[N ⊆ M] ∀[C ⊆ M s.t. ∀(x ∈ {0,1}^{#n_in}) : C(x) = M(x)] : N ∩ C ≠ ∅. Theorem 34. If MGNC is polynomial-time tractable then P = NP. Proof.
Observe that in the instance of MLNC constructed by the reduction in the proof of Theorem 29, the input-connection weight 0 and bias 1 of the input neuron force this neuron to output 1 for both of the possible input vectors (1) and (0). Hence, with slight modifications to the proof of reduction correctness, this reduction also establishes the NP-hardness of MGNC. ∎ Theorem 35. If ⟨c_d, #n_in, #n_out, W_max, k⟩-MGNC is fixed-parameter tractable then FPT = W[1]. Proof. Observe that in the instance of MGNC constructed in the reduction in the proof of Theorem 34, #n_in = #n_out = W_max = 1, c_d = 4, and k′ is a function of k in the given instance of HS. The result then follows from the facts that ⟨k⟩-HS is W[2]-hard (by a reduction from ⟨k⟩-Dominating Set; Downey & Fellows 1999) and W[1] ⊆ W[2]. ∎ Theorem 36. ⟨#n_tot⟩-MGNC is fixed-parameter tractable. Proof. Modify the algorithm in the proof of Theorem 31 such that each potential sufficient circuit N′′ is checked to ensure that M(I) = N′′(I) for every possible Boolean input vector I of length #n_in. As the number of such vectors is 2^{#n_in} < 2^{#n_tot}, the above is a fixed-parameter tractable algorithm for MGNC relative to parameter-set {#n_tot}. ∎ Theorem 37. ⟨c_w, c_d⟩-MGNC is fixed-parameter tractable. Proof.
Follows from the algorithm in the proof of Theorem 36 and the observation that #n_tot ≤ c_w × c_d. ∎ Observe that the results in Theorems 35–37 in combination with Lemmas 1 and 2 suffice to establish the parameterized complexity status of MGNC relative to many subsets of the parameters listed in Table 7. Let us now consider the polynomial-time cost approximability of MGNC. Theorem 38. If MGNC has a polynomial-time c-approximation algorithm for any constant c > 0 then P = NP. Proof. As the reduction in the proof of Theorem 34 is essentially the same as the reduction in the proof of Theorem 29, the result follows by the same reasoning as given in the proof of Theorem 33. ∎ Note that this theorem also renders MGNC PTAS-inapproximable unless FPT = W[1]. Appendix H Circuit Ablation and Clamping Problems Given an MLP M and a subset N of the neurons in M, the MLP M′ induced by N is said to be active if there is at least one path between the input and output neurons in M′; otherwise, M′ is inactive. As we are interested in inductions that preserve or violate output behaviour, all output neurons of M must be preserved in M′; however, we only require that at least one input neuron be so preserved. Unless otherwise stated, all inductions discussed wrt MLCA and MGCA below will be assumed to result in an active MLP. Minimum Local Circuit Ablation (MLCA) Input: A multi-layer perceptron M of depth c_d with #n_tot neurons and maximum layer width c_w, connection-value matrices W_1, W_2, …, W_{c_d}, neuron bias vector B, a Boolean input vector I of length #n_in, and a positive integer k such that 1 ≤ k ≤ #n_tot.
Question: Is there a subset N′, |N′| ≤ k, of the |N| neurons in M such that M(I) ≠ M′(I) for the MLP M′ induced by N ∖ N′? Minimum Global Circuit Ablation (MGCA) Input: A multi-layer perceptron M of depth c_d with #n_tot neurons and maximum layer width c_w, connection-value matrices W_1, W_2, …, W_{c_d}, neuron bias vector B, and a positive integer k such that 1 ≤ k ≤ #n_tot. Question: Is there a subset N′, |N′| ≤ k, of the |N| neurons in M such that, for the MLP M′ induced by N ∖ N′, M(I) ≠ M′(I) for every possible Boolean input vector I of length #n_in? Given an MLP M and a neuron v in M, v is clamped to value val if the output of v is always val regardless of the inputs to v. As one can trivially change the output of an MLP by clamping one or more of its output neurons, we shall not allow the clamping of output neurons in the problems below. Minimum Local Circuit Clamping (MLCC) Input: A multi-layer perceptron M of depth c_d with #n_tot neurons and maximum layer width c_w, connection-value matrices W_1, W_2, …, W_{c_d}, neuron bias vector B, a Boolean input vector I of length #n_in, a Boolean value val, and a positive integer k such that 1 ≤ k ≤ #n_tot. Question: Is there a subset N′, |N′| ≤ k, of the |N| neurons in M such that M(I) ≠ M′(I) for the MLP M′ in which all neurons in N′ are clamped to value val?
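The clamping operation can be made concrete on a toy ReLU MLP (a minimal sketch with our own helper names, not the paper's formalism; clamped neurons emit val regardless of their inputs):

```python
def relu(x):
    return max(0.0, x)

def forward(weights, biases, x, clamp=frozenset(), val=1.0):
    """Forward pass; any neuron identified by (layer, index) in `clamp`
    has its output overridden to `val`."""
    a = list(x)
    for layer, (W, b) in enumerate(zip(weights, biases)):
        a = [relu(sum(w * aj for w, aj in zip(row, a)) + b_i)
             for row, b_i in zip(W, b)]
        for (l, i) in clamp:
            if l == layer:
                a[i] = val
    return a

# Toy network: one input, one hidden neuron, one output neuron.
W = [[[1.0]], [[1.0]]]
B = [[0.0], [0.0]]
```

On input (0) the unclamped network outputs 0, while clamping the hidden neuron (layer 0, index 0) to val = 1 flips the output to 1, i.e. M(I) ≠ M′(I) in the sense of MLCC.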
Minimum Global Circuit Clamping (MGCC) Input: A multi-layer perceptron M of depth c_d with #n_tot neurons and maximum layer width c_w, connection-value matrices W_1, W_2, …, W_{c_d}, neuron bias vector B, a Boolean value val, and a positive integer k such that 1 ≤ k ≤ #n_tot. Question: Is there a subset N′, |N′| ≤ k, of the |N| neurons in M such that, for the MLP M′ in which all neurons in N′ are clamped to value val, M(I) ≠ M′(I) for every possible Boolean input vector I of length #n_in? Following Barceló et al. 2020, page 4, all neurons in M use the ReLU activation function and the output x of each output neuron is stepped as necessary to be Boolean, i.e., step(x) = 0 if x ≤ 0 and step(x) = 1 otherwise. For a graph G = (V, E), we shall assume an ordering on the vertices and edges in V and E, respectively. For each vertex v ∈ V, let the complete neighbourhood N_C(v) of v be the set composed of v and all vertices in G adjacent to v by a single edge, i.e., v ∪ {u | u ∈ V and (u, v) ∈ E}. We will prove various classical and parameterized results for MLCA, MGCA, MLCC, and MGCC using reductions from Clique. The parameterized results are proved relative to the parameters in Table 8. Lemmas 1 and 2 will be useful in deriving additional parameterized results from proved ones. Additional reductions from DS used to prove polynomial-time cost inapproximability use specialized ReLU logic gates described in Barceló et al. 2020, Lemma 13. These gates assume Boolean neuron input and output values of 0 and 1 and are structured as follows: 1.
NOT ReLU gate: A ReLU gate with one input connection weight of value −1 and a bias of 1. This gate has output 1 if the input is 0 and 0 otherwise. 2. n-way AND ReLU gate: A ReLU gate with n input connection weights of value 1 and a bias of −(n−1). This gate has output 1 if all inputs have value 1 and 0 otherwise. 3. n-way OR ReLU gate: A combination of an n-way AND ReLU gate with NOT ReLU gates on all of its inputs and a NOT ReLU gate on its output, using De Morgan's second law to implement (x_1 ∨ x_2 ∨ … ∨ x_n) as ¬(¬x_1 ∧ ¬x_2 ∧ … ∧ ¬x_n). This gate has output 1 if any input has value 1 and 0 otherwise. Table 8: Parameters for the minimum circuit ablation and clamping problems.

Parameter | Description
c_d | # layers in given MLP
c_w | max # neurons in a layer in given MLP
#n_tot | total # neurons in given MLP
#n_in | # input neurons in given MLP
#n_out | # output neurons in given MLP
B_max | max neuron bias in given MLP
W_max | max connection weight in given MLP
k | size of requested neuron subset

H.1 Results for Minimal Circuit Ablation The following hardness and inapproximability results are notable for holding when the given MLP M has three hidden layers. H.1.1 Results for MLCA Towards proving NP-completeness, we first prove membership and then follow up with hardness. Membership in NP can be proven via the definition of the polynomial hierarchy and the following quantified formula: ∃[S ⊆ M] : [M ∖ S](x) ≠ M(x). Theorem 39. If MLCA is polynomial-time tractable then P = NP. Proof. Consider the following reduction from Clique to MLCA.
Given an instance ⟨G = (V, E), k⟩ of Clique, construct the following instance ⟨M, I, k′⟩ of MLCA: Let M be an MLP based on #n_tot = 3|V| + |E| + 2 neurons spread across five layers: 1. Input neuron layer: The single input neuron n_in (bias +1). 2. Hidden vertex pair layer: The vertex neurons nvP1_1, nvP1_2, …, nvP1_|V| and nvP2_1, nvP2_2, …, nvP2_|V| (all with bias 0). 3. Hidden vertex regulator layer: The vertex neurons nvR_1, nvR_2, …, nvR_|V| (all with bias 0). 4. Hidden edge layer: The edge neurons ne_1, ne_2, …, ne_|E| (all with bias −1). 5. Output layer: The single output neuron n_out (bias −(k(k−1)/2 − 1)). The non-zero weight connections between adjacent layers are as follows: • The input neuron has an edge of weight 0 coming from its corresponding input and is in turn connected to each of the vertex pair neurons with weight 1. • Each P1 (P2) vertex pair neuron nvP1_i (nvP2_i), 1 ≤ i ≤ |V|, is connected to vertex regulator neuron nvR_i with weight −2 (respectively, 1). • Each vertex regulator neuron nvR_i, 1 ≤ i ≤ |V|, is connected with weight 1 to each edge neuron whose corresponding edge has endpoint v_i. • Each edge neuron ne_i, 1 ≤ i ≤ |E|, is connected to the output neuron n_out with weight 1.
All other connections between neurons in adjacent layers have weight 0. Finally, let I = (1) and k′ = k. Observe that this instance of MLCA can be created in time polynomial in the size of the given instance of Clique. Moreover, the output behaviour of the neurons in M from the presentation of input I until the output is generated is as follows:

timestep | neurons (outputs)
0 | —
1 | n_in (1)
2 | nvP1_1, …, nvP1_|V|, nvP2_1, …, nvP2_|V| (all 1)
3 | nvR_1, nvR_2, …, nvR_|V| (all 0)
4 | ne_1, ne_2, …, ne_|E| (all 0)
5 | n_out (0)

We now need to show the correctness of this reduction by proving that the answer for the given instance of Clique is "Yes" if and only if the answer for the constructed instance of MLCA is "Yes". We prove the two directions of this if and only if separately as follows: ⇒: Let V′ = {v′_1, v′_2, …, v′_{k′′}} ⊆ V be a clique in G of size k′′ ≥ k and let N′ be the (k′′ ≥ k′ = k)-sized subset of the P1 vertex pair neurons corresponding to the vertices in V′. Let M′ be the version of M in which all neurons in N′ are ablated.
As each of these vertex pair neurons previously forced their associated vertex regulator neurons to output 0 courtesy of their connection weight of −2, their ablation now allows these k vertex regulator neurons to output 1. As V′ is a clique of size k, exactly k(k−1)/2 edge neurons in M′ receive the requisite inputs of 1 on both of their endpoints from the vertex regulator neurons associated with the P1 vertex pair neurons in N′. This in turn ensures that the output neuron produces output 1. Hence, M(I) = 0 ≠ 1 = M′(I).

⇐: Let N′ be a subset of N of size at most k′ = k such that for the MLP M′ induced by ablating all neurons in N′, M(I) ≠ M′(I). As M(I) = 0 and circuit outputs are stepped to be Boolean, M′(I) = 1. Given the bias of the output neuron, this can only occur if at least k(k−1)/2 edge neurons in M′ have output 1 on input I, which requires that each of these neurons receives 1 from both of its endpoint vertex regulator neurons. These vertex regulator neurons can only output 1 if all of their associated P1 vertex pair neurons have been ablated; moreover, there must be exactly k such neurons. This means that the vertices in G corresponding to the P1 vertex pair neurons in N′ must form a clique of size k in G.

As Clique is NP-hard (Garey & Johnson, 1979), the reduction above establishes that MLCA is also NP-hard. The result follows from the definition of NP-hardness. ∎

Theorem 40. If ⟨cd, #n_in, #n_out, W_max, B_max, k⟩-MLCA is fixed-parameter tractable then FPT = W[1].

Proof.
Observe that in the instance of MLCA constructed in the reduction in the proof of Theorem 39, #n_in = #n_out = W_max = 1, cd = 5, and B_max and k′ are functions of k in the given instance of Clique. The result then follows from the fact that ⟨k⟩-Clique is W[1]-hard (Downey & Fellows, 1999). ∎

Theorem 41. ⟨#n_tot⟩-MLCA is fixed-parameter tractable.

Proof. Consider the algorithm that generates every possible subset N′ of size at most k of the neurons N in MLP M and, for each such subset, creates the MLP M′ induced from M by ablating the neurons in N′ and (assuming M′ is active) checks whether M′(I) ≠ M(I). If such a subset is found, return “Yes”; otherwise, return “No”. The number of possible subsets N′ is at most k × #n_tot^k ≤ #n_tot × #n_tot^{#n_tot}. As any such M′ can be generated from M, checked for activity, and run on I in time polynomial in the size of the given instance of MLCA, the above is a fixed-parameter tractable algorithm for MLCA relative to parameter-set {#n_tot}. ∎

Theorem 42. ⟨cw, cd⟩-MLCA is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 41 and the observation that #n_tot ≤ cw × cd. ∎

Observe that the results in Theorems 40–42 in combination with Lemmas 1 and 2 suffice to establish the parameterized complexity status of MLCA relative to many subsets of the parameters listed in Table 8. Let us now consider the polynomial-time cost approximability of MLCA.
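As a concrete illustration of the subset-enumeration algorithm from the proof of Theorem 41, the sketch below runs a ReLU MLP with stepped Boolean outputs and tests every ablation set of size at most k. The nested-list weight encoding and the toy two-layer network are our own illustrative assumptions, not the paper's notation.

```python
from itertools import combinations

def relu(v):
    return max(0.0, v)

def forward(weights, biases, x, ablated=frozenset()):
    """Run the MLP on input x. Non-input neurons are identified as
    (layer, index) pairs; ablated neurons output 0. The final layer
    is stepped to Boolean, as in the paper."""
    act = list(x)
    for layer, (W, b) in enumerate(zip(weights, biases), start=1):
        act = [0.0 if (layer, j) in ablated
               else relu(sum(a * W[i][j] for i, a in enumerate(act)) + b[j])
               for j in range(len(b))]
    return tuple(1 if v > 0 else 0 for v in act)

def mlca_brute_force(weights, biases, x, k):
    """Enumerate every subset of at most k internal (non-output) neurons
    and test whether ablating it changes the output on x."""
    base = forward(weights, biases, x)
    internal = [(layer, j)                      # exclude the output layer
                for layer in range(1, len(weights))
                for j in range(len(biases[layer - 1]))]
    for size in range(1, k + 1):
        for subset in combinations(internal, size):
            if forward(weights, biases, x, frozenset(subset)) != base:
                return True
    return False

# Toy MLP: h1 = AND(x1, x2), h2 = x1, output = step(h1 + h2).
weights = [[[1.0, 1.0], [1.0, 0.0]], [[1.0], [1.0]]]
biases = [[-1.0, 0.0], [0.0]]
```

On input (1, 0), ablating the pass-through neuron h2 changes the output, so the query answers “Yes” for k = 1; on input (1, 1), both hidden neurons must be ablated, so the answer is “No” for k = 1 and “Yes” for k = 2. The number of candidate subsets matches the k × #n_tot^k bound in the proof.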
As MLCA is a minimization problem, we cannot do this using reductions from a maximization problem like Clique. Hence, we will instead use a reduction from another minimization problem, namely DS.

Theorem 43. If MLCA is polynomial-time tractable then P = NP.

Proof. Consider the following reduction from DS to MLCA. Given an instance ⟨G = (V, E), k⟩ of DS, construct the following instance ⟨M, I, k′⟩ of MLCA: Let M be an MLP based on #n_tot = 3|V| + 1 neurons spread across four layers:

1. Input layer: The input vertex neurons nv_1, nv_2, …, nv_|V|, all of which have bias 1.
2. Hidden vertex neighbourhood layer I: The vertex neighbourhood AND neurons nvnA_1, nvnA_2, …, nvnA_|V|, where nvnA_i is an x-way AND ReLU gate such that x = |N_C(v_i)|.
3. Hidden vertex neighbourhood layer II: The vertex neighbourhood NOT neurons nvnN_1, nvnN_2, …, nvnN_|V|, all of which are NOT ReLU gates.
4. Output layer: The single output neuron n_out, which is a |V|-way AND ReLU gate.

The non-zero weight connections between adjacent layers are as follows:

• Each input vertex neuron nv_i, 1 ≤ i ≤ |V|, is connected to its input line with weight 0 and to each vertex neighbourhood AND neuron nvnA_j such that v_i ∈ N_C(v_j) with weight 1.
• Each vertex neighbourhood AND neuron nvnA_i, 1 ≤ i ≤ |V|, is connected to its corresponding vertex neighbourhood NOT neuron nvnN_i with weight 1.
• Each vertex neighbourhood NOT neuron nvnN_i, 1 ≤ i ≤ |V|, is connected to the output neuron n_out with weight 1.

All other connections between neurons in adjacent layers have weight 0. Finally, let I be the |V|-length one-vector and k′ = k. Observe that this instance of MLCA can be created in time polynomial in the size of the given instance of DS. Moreover, the output behaviour of the neurons in M from the presentation of input I until the output is generated is as follows:

timestep: neurons (outputs)
0: —
1: nv_1(1), nv_2(1), …, nv_|V|(1)
2: nvnA_1(1), nvnA_2(1), …, nvnA_|V|(1)
3: nvnN_1(0), nvnN_2(0), …, nvnN_|V|(0)
4: n_out(0)

We now need to show the correctness of this reduction by proving that the answer for the given instance of DS is “Yes” if and only if the answer for the constructed instance of MLCA is “Yes”. We prove the two directions of this if and only if separately as follows:

⇒: Let V′ = {v′_1, v′_2, …, v′_k} ⊆ V be a dominating set in G of size k and N′ be the k′ = k-sized subset of the input vertex neurons in M corresponding to the vertices in V′. Create MLP M′ by ablating in M the neurons in N′.
As V′ is a dominating set, each vertex neighbourhood AND neuron in M′ is missing a 1-input from at least one input vertex neuron in N′, which in turn ensures that each vertex neighbourhood AND neuron in M′ has output 0. This in turn ensures that M′ produces output 1 on input I, such that M(I) = 0 ≠ 1 = M′(I).

⇐: Let N′ be a k″ ≤ k′ = k-sized subset of the set N of neurons in M whose ablation in M creates an MLP M′ such that M(I) = 0 ≠ M′(I). As all MLP outputs are stepped to be Boolean, this implies that M′(I) = 1. This can only happen if all vertex neighbourhood NOT neurons output 1, which in turn can happen only if all vertex neighbourhood AND gates output 0. As I = 1^|V|, this can only happen if, for each vertex neighbourhood AND neuron, at least one input vertex neuron previously producing a 1-input to that vertex neighbourhood AND neuron has been ablated in creating M′. This in turn implies that the k″ vertices in G corresponding to the elements of N′ form a dominating set of size k″ ≤ k for G.

As DS is NP-hard (Garey & Johnson, 1979), the reduction above establishes that MLCA is also NP-hard. The result follows from the definition of NP-hardness. ∎

Theorem 44. If MLCA has a polynomial-time c-approximation algorithm for any constant c > 0 then FPT = W[1].

Proof. Recall from the proof of correctness of the reduction in the proof of Theorem 43 that a given instance of DS has a dominating set of size k if and only if the constructed instance of MLCA has a subset N′ of size k′ = k of the neurons in the given MLP M such that the ablation in M of the neurons in N′ creates an MLP M′ such that M(I) ≠ M′(I).
This implies that, given a polynomial-time c-approximation algorithm A for MLCA for some constant c > 0, we can create a polynomial-time c-approximation algorithm for DS by applying the reduction to the given instance x of DS to construct an instance x′ of MLCA, applying A to x′ to create an approximate solution y′, and then using y′ to create an approximate solution y for x that has the same cost as y′. The result then follows from Chen & Lin (2019, Corollary 2), which implies that if DS has a polynomial-time c-approximation algorithm for any constant c > 0 then FPT = W[1]. ∎

Note that this theorem also renders MLCA PTAS-inapproximable unless FPT = W[1].

H.1.2 Results for MGCA

Membership in Σ^p_2 can be proven via the definition of the polynomial hierarchy and the following alternating quantifier formula:

∃[S ⊆ M] ∀[x ∈ {0,1}^{#n_in}] : [M ∖ S](x) ≠ M(x)

Theorem 45. If MGCA is polynomial-time tractable then P = NP.

Proof. Observe that in the instance of MLCA constructed by the reduction in the proof of Theorem 39, the input-connection weight 0 and bias 1 of the input neuron force this neuron to output 1 for both of the possible input vectors (1) and (0). Hence, with slight modifications to the proof of reduction correctness, this reduction also establishes the NP-hardness of MGCA. ∎

Theorem 46. If ⟨cd, #n_in, #n_out, W_max, B_max, k⟩-MGCA is fixed-parameter tractable then FPT = W[1].

Proof.
Observe that in the instance of MGCA constructed in the reduction in the proof of Theorem 45, #n_in = #n_out = W_max = 1, cd = 5, and B_max and k′ are functions of k in the given instance of Clique. The result then follows from the fact that ⟨k⟩-Clique is W[1]-hard (Downey & Fellows, 1999). ∎

Theorem 47. ⟨#n_tot⟩-MGCA is fixed-parameter tractable.

Proof. Modify the algorithm in the proof of Theorem 41 such that each created MLP M′ is checked to ensure that M(x) ≠ M′(x) for every possible Boolean input vector x of length #n_in. As the number of such vectors is 2^{#n_in} ≤ 2^{#n_tot}, the above is a fixed-parameter tractable algorithm for MGCA relative to parameter-set {#n_tot}. ∎

Theorem 48. ⟨cw, cd⟩-MGCA is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 47 and the observation that #n_tot ≤ cw × cd. ∎

Observe that the results in Theorems 46–48 in combination with Lemmas 1 and 2 suffice to establish the parameterized complexity status of MGCA relative to many subsets of the parameters listed in Table 8. Let us now consider the polynomial-time cost approximability of MGCA. As MGCA is a minimization problem, we cannot do this using reductions from a maximization problem like Clique. Hence, we will instead use a reduction from another minimization problem, namely DS.

Theorem 49. If MGCA is polynomial-time tractable then P = NP.

Proof.
Observe that in the instance of MLCA constructed by the reduction in the proof of Theorem 43, the input-connection weight 0 and bias 1 of the vertex input neurons force each such neuron to output 1 for input value 0 or 1. Hence, with slight modifications to the proof of reduction correctness, this reduction also establishes the NPNPN P-hardness of MGCA. ∎ Theorem 50. If MGCA has a polynomial-time c-approximation algorithm for any constant c>00c>0c > 0 then FPT=W[1]delimited-[]1FPT=W[1]F P T = W [ 1 ]. Proof. As the reduction in the proof of Theorem 49 is essentially the same as the reduction in the proof of Theorem 43, the result follows by the same reasoning as given in the proof of Theorem 44. ∎ Note that this theorem also renders MGCA PTAS-inapproximable unless FPT=W[1]delimited-[]1FPT=W[1]F P T = W [ 1 ]. H.2 Results for Minimal Circuit Clamping Towards proving NP-completeness, we first prove membership and then follow up with hardness. Membership in NP can be proven via the definition of the polynomial hierarchy and the following alternating quantifier formula: ∃[⊆ℳ]:ℳ()≠ℳ():delimited-[]ℳsubscriptℳ∃[S ]:M_S(x)% (x)∃ [ S ⊆ M ] : Mcaligraphic_S ( x ) ≠ M ( x ) The following hardness results are notable for holding when the given MLP M has only one hidden layer. H.2.1 Results for MLCC Theorem 51. If MLCC is polynomial-time tractable then P=NP=NPP = N P. Proof. Consider the following reduction from Clique to MLCC. Given an instance ⟨G=(V,E),k⟩delimited-⟨⟩ G=(V,E),k ⟨ G = ( V , E ) , k ⟩ of Clique, construct the following instance ⟨M,I,val,k′⟩superscript′ M,I,val,k ⟨ M , I , v a l , k′ ⟩ of MLCC: Let M be an MLP based on #ntot=|V|+|E|+1#subscript1\#n_tot=|V|+|E|+1# nitalic_t o t = | V | + | E | + 1 neurons spread across three layers: 1. Input vertex layer: The vertex neurons nv1,nv2,…nv|V|subscript1subscript2…subscriptnv_1,nv_2,… nv_|V|n v1 , n v2 , … n v| V | (all with bias −22-2- 2). 2. 
Hidden edge layer: The edge neurons ne1,ne2,…ne|E|subscript1subscript2…subscriptne_1,ne_2,… ne_|E|n e1 , n e2 , … n e| E | (all with bias −11-1- 1). 3. Output layer: The single output neuron noutsubscriptn_outnitalic_o u t (bias −(k(k−1)/2−1)121-(k(k-1)/2-1)- ( k ( k - 1 ) / 2 - 1 )). Note that this MLP has only one hidden layer. The non-zero weight connections between adjacent layers are as follows: • Each vertex neuron nvisubscriptnv_in vitalic_i, 1≤i≤|V|11≤ i≤|V|1 ≤ i ≤ | V |, is connected to each edge neuron whose corresponding edge has an endpoint visubscriptv_ivitalic_i with weight 1. • Each edge neuron neisubscriptne_in eitalic_i, 1≤i≤|E|11≤ i≤|E|1 ≤ i ≤ | E |, is connected to the output neuron noutsubscriptn_outnitalic_o u t with weight 1. All other connections between neurons in adjacent layers have weight 0. Finally, let I=0#nin)I=0^\#n_in)I = 0# nitalic_i n ), val=11val=1v a l = 1, and k′=ksuperscript′k =k′ = k. Observe that this instance of MLCC can be created in time polynomial in the size of the given instance of Clique. Moreover, the output behaviour of the neurons in M from the presentation of input I until the output is generated is as follows: timestep neurons (outputs) 0 — 1 nv1(0),nv2(0),…nv|V|(0)subscript10subscript20…subscript0nv_1(0),nv_2(0),… nv_|V|(0)n v1 ( 0 ) , n v2 ( 0 ) , … n v| V | ( 0 ) 2 ne1(0),ne2(0),…ne|E|(0)subscript10subscript20…subscript0ne_1(0),ne_2(0),… ne_|E|(0)n e1 ( 0 ) , n e2 ( 0 ) , … n e| E | ( 0 ) 3 nout(0)subscript0n_out(0)nitalic_o u t ( 0 ) We now need to show the correctness of this reduction by proving that the answer for the given instance of Clique is “Yes” if and only if the answer for the constructed instance of MLCC is “Yes”. 
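Before giving the correctness proof, the behaviour of this construction can be checked concretely. The sketch below evaluates the reduced MLP directly under the step/ReLU semantics described in this appendix; the function name and the edge-list graph encoding are our own illustrative choices, not the paper's.

```python
def relu(v):
    return max(0.0, v)

def mlcc_reduction_output(V, E, k, clamped):
    """Evaluate the MLP built from <G=(V,E), k> by the Clique-to-MLCC
    reduction, with the vertex neurons in `clamped` clamped to val = 1."""
    # Input vertex layer: input lines carry 0 and biases are -2, so every
    # unclamped vertex neuron outputs relu(0 - 2) = 0.
    vert = {v: (1.0 if v in clamped else relu(0.0 - 2.0)) for v in V}
    # Hidden edge layer: bias -1, weight-1 connections from both endpoints,
    # so an edge neuron fires only if both endpoint neurons output 1.
    edges = [relu(vert[u] + vert[w] - 1.0) for (u, w) in E]
    # Output neuron: bias -(k(k-1)/2 - 1), stepped to Boolean.
    out = relu(sum(edges) - (k * (k - 1) / 2 - 1))
    return 1 if out > 0 else 0

# Triangle graph: its three vertices form a clique of size k = 3.
V = ["a", "b", "c"]
E = [("a", "b"), ("b", "c"), ("a", "c")]
```

On the triangle, `mlcc_reduction_output(V, E, 3, set())` yields 0, clamping all three vertex neurons yields 1, and clamping only two of them yields 0; on the path graph with edge (a, c) removed, clamping all three vertices still yields 0, since no clique of size 3 exists.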
We prove the two directions of this if and only if separately as follows:

⇒: Let V′ = {v′_1, v′_2, …, v′_k} ⊆ V be a clique of size k in G (if G has a clique of size greater than k, any k of its vertices also form a clique of size k) and let N′ be the k′ = k-sized subset of the input vertex neurons corresponding to the vertices in V′. Let M′ be the version of M in which all neurons in N′ are clamped to value val = 1. As V′ is a clique of size k, exactly k(k−1)/2 edge neurons in M′ receive the requisite inputs of 1 on both of their endpoints from the vertex neurons in N′. This in turn ensures that the output neuron produces output 1. Hence, M(I) = 0 ≠ 1 = M′(I).

⇐: Let N′ be a subset of N of size at most k′ = k such that for the MLP M′ induced by clamping all neurons in N′ to value val = 1, M(I) ≠ M′(I). As M(I) = 0 and circuit outputs are stepped to be Boolean, M′(I) = 1. Given the bias of the output neuron, this can only occur if at least k(k−1)/2 edge neurons in M′ have output 1 on input I, which requires that each of these neurons receives 1 from both of its endpoint vertex neurons. As I = 0^{#n_in}, these 1-inputs could only have come from the clamped vertex neurons in N′; moreover, there must be exactly k such neurons. This means that the vertices in G corresponding to the vertex neurons in N′ must form a clique of size k in G.

As Clique is NP-hard (Garey & Johnson, 1979), the reduction above establishes that MLCC is also NP-hard. The result follows from the definition of NP-hardness. ∎

Theorem 52.
If ⟨cd, #n_out, W_max, B_max, k⟩-MLCC is fixed-parameter tractable then FPT = W[1].

Proof. Observe that in the instance of MLCC constructed in the reduction in the proof of Theorem 51, #n_out = W_max = 1, cd = 3, and B_max and k′ are functions of k in the given instance of Clique. The result then follows from the fact that ⟨k⟩-Clique is W[1]-hard (Downey & Fellows, 1999). ∎

Theorem 53. ⟨#n_tot⟩-MLCC is fixed-parameter tractable.

Proof. Consider the algorithm that generates every possible subset N′ of size at most k of the neurons N in MLP M and, for each such subset, creates the MLP M′ induced from M by clamping the neurons in N′ to val and checks whether M′(I) ≠ M(I). If such a subset is found, return “Yes”; otherwise, return “No”. The number of possible subsets N′ is at most k × #n_tot^k ≤ #n_tot × #n_tot^{#n_tot}. As any such M′ can be generated from M and run on I in time polynomial in the size of the given instance of MLCC, the above is a fixed-parameter tractable algorithm for MLCC relative to parameter-set {#n_tot}. ∎

Theorem 54. ⟨cw, cd⟩-MLCC is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 53 and the observation that #n_tot ≤ cw × cd.
∎

Observe that the results in Theorems 52–54 in combination with Lemmas 1 and 2 suffice to establish the parameterized complexity status of MLCC relative to many subsets of the parameters listed in Table 8. Let us now consider the polynomial-time cost approximability of MLCC. As MLCC is a minimization problem, we cannot do this using reductions from a maximization problem like Clique. Hence, we will instead use a reduction from another minimization problem, namely DS.

Theorem 55. If MLCC is polynomial-time tractable then P = NP.

Proof. Consider the following reduction from DS to MLCC. Given an instance ⟨G = (V, E), k⟩ of DS, construct the following instance ⟨M, I, val, k′⟩ of MLCC: Let M be an MLP based on #n_tot = 3|V| + 1 neurons spread across four layers:

1. Input layer: The input vertex neurons nv_1, nv_2, …, nv_|V|, all of which have bias 1.
2. Hidden vertex neighbourhood layer I: The vertex neighbourhood AND neurons nvnA_1, nvnA_2, …, nvnA_|V|, where nvnA_i is an x-way AND ReLU gate such that x = |N_C(v_i)|.
3. Hidden vertex neighbourhood layer II: The vertex neighbourhood NOT neurons nvnN_1, nvnN_2, …, nvnN_|V|, all of which are NOT ReLU gates.
4. Output layer: The single output neuron n_out, which is a |V|-way AND ReLU gate.
The non-zero weight connections between adjacent layers are as follows:

• Each input vertex neuron nv_i, 1 ≤ i ≤ |V|, is connected to its input line with weight 0 and to each vertex neighbourhood AND neuron nvnA_j such that v_i ∈ N_C(v_j) with weight 1.
• Each vertex neighbourhood AND neuron nvnA_i, 1 ≤ i ≤ |V|, is connected to its corresponding vertex neighbourhood NOT neuron nvnN_i with weight 1.
• Each vertex neighbourhood NOT neuron nvnN_i, 1 ≤ i ≤ |V|, is connected to the output neuron n_out with weight 1.

All other connections between neurons in adjacent layers have weight 0. Finally, let I be the |V|-length one-vector, val = 0, and k′ = k. Observe that this instance of MLCC can be created in time polynomial in the size of the given instance of DS. Moreover, the output behaviour of the neurons in M from the presentation of input I until the output is generated is as follows:

timestep: neurons (outputs)
0: —
1: nv_1(1), nv_2(1), …, nv_|V|(1)
2: nvnA_1(1), nvnA_2(1), …, nvnA_|V|(1)
3: nvnN_1(0), nvnN_2(0), …, nvnN_|V|(0)
4: n_out(0)

We now need to show the correctness of this reduction by proving that the answer for the given instance of DS is “Yes” if and only if the answer for the constructed instance of MLCC is “Yes”.
We prove the two directions of this if and only if separately as follows:

⇒: Let V′ = {v′_1, v′_2, …, v′_k} ⊆ V be a dominating set in G of size k and N′ be the k′ = k-sized subset of the input vertex neurons in M corresponding to the vertices in V′. Create MLP M′ by clamping in M the neurons in N′ to val = 0. As V′ is a dominating set, each vertex neighbourhood AND neuron in M′ is now missing a 1-input from at least one input vertex neuron in N′, which in turn ensures that each vertex neighbourhood AND neuron in M′ has output 0. This in turn ensures that M′ produces output 1 on input I, such that M(I) = 0 ≠ 1 = M′(I).

⇐: Let N′ be a k″ ≤ k′ = k-sized subset of the set N of neurons in M whose clamping to val = 0 in M creates an MLP M′ such that M(I) = 0 ≠ M′(I). As all MLP outputs are stepped to be Boolean, this implies that M′(I) = 1. This can only happen if all vertex neighbourhood NOT neurons output 1, which in turn can happen only if all vertex neighbourhood AND gates output 0. As I = 1^|V|, this can only happen if, for each vertex neighbourhood AND neuron, at least one input vertex neuron previously producing a 1-input to that vertex neighbourhood AND neuron has been clamped to 0 in creating M′. This in turn implies that the k″ vertices in G corresponding to the elements of N′ form a dominating set of size k″ ≤ k for G.

As DS is NP-hard (Garey & Johnson, 1979), the reduction above establishes that MLCC is also NP-hard. The result follows from the definition of NP-hardness. ∎

Theorem 56.
If MLCC has a polynomial-time c-approximation algorithm for any constant c > 0 then FPT = W[1].

Proof. Recall from the proof of correctness of the reduction in the proof of Theorem 55 that a given instance of DS has a dominating set of size k if and only if the constructed instance of MLCC has a subset N′ of size k′ = k of the neurons in the given MLP M such that the clamping to val = 0 in M of the neurons in N′ creates an MLP M′ such that M(I) ≠ M′(I). This implies that, given a polynomial-time c-approximation algorithm A for MLCC for some constant c > 0, we can create a polynomial-time c-approximation algorithm for DS by applying the reduction to the given instance x of DS to construct an instance x′ of MLCC, applying A to x′ to create an approximate solution y′, and then using y′ to create an approximate solution y for x that has the same cost as y′. The result then follows from Chen & Lin (2019, Corollary 2), which implies that if DS has a polynomial-time c-approximation algorithm for any constant c > 0 then FPT = W[1]. ∎

Note that this theorem also renders MLCC PTAS-inapproximable unless FPT = W[1].

H.2.2 Results for MGCC

Membership in Σ^p_2 can be proven via the definition of the polynomial hierarchy and the following alternating quantifier formula:

∃[S ⊆ M] ∀[x ∈ {0,1}^{#n_in}] : M_S(x) ≠ M(x)

Theorem 57. If MGCC is polynomial-time tractable then P = NP.

Proof.
Observe that in the instance of MLCC constructed by the reduction in the proof of Theorem 51, the biases of −2 in the input vertex neurons force these neurons to map any given Boolean input vector onto 0^{#n_in}. Hence, with slight modifications to the proof of reduction correctness, this reduction also establishes the NP-hardness of MGCC. ∎

Theorem 58. If ⟨cd, #n_out, W_max, B_max, k⟩-MGCC is fixed-parameter tractable then FPT = W[1].

Proof. Observe that in the instance of MGCC constructed in the reduction in the proof of Theorem 57, #n_out = W_max = 1, cd = 3, and B_max and k′ are functions of k in the given instance of Clique. The result then follows from the fact that ⟨k⟩-Clique is W[1]-hard (Downey & Fellows, 1999). ∎

Theorem 59. ⟨#n_tot⟩-MGCC is fixed-parameter tractable.

Proof. Modify the algorithm in the proof of Theorem 53 such that each created MLP M′ is checked to ensure that M(x) ≠ M′(x) for every possible Boolean input vector x of length #n_in. As the number of such vectors is 2^{#n_in} ≤ 2^{#n_tot}, the above is a fixed-parameter tractable algorithm for MGCC relative to parameter-set {#n_tot}. ∎

Theorem 60. ⟨cw, cd⟩-MGCC is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 59 and the observation that #n_tot ≤ cw × cd.
∎

Observe that the results in Theorems 58–60 in combination with Lemmas 1 and 2 suffice to establish the parameterized complexity status of MGCC relative to many subsets of the parameters listed in Table 8. Let us now consider the polynomial-time cost approximability of MGCC. As MGCC is a minimization problem, we cannot do this using reductions from a maximization problem like Clique. Hence, we will instead use a reduction from another minimization problem, namely DS.

Theorem 61. If MGCC is polynomial-time tractable then P = NP.

Proof. Observe that in the instance of MLCC constructed by the reduction in the proof of Theorem 55, the input-connection weight 0 and bias 1 of the vertex input neurons force each such neuron to output 1 for input value 0 or 1. Hence, with slight modifications to the proof of reduction correctness, this reduction also establishes the NP-hardness of MGCC. ∎

Theorem 62. If MGCC has a polynomial-time c-approximation algorithm for any constant c > 0 then FPT = W[1].

Proof. As the reduction in the proof of Theorem 61 is essentially the same as the reduction in the proof of Theorem 55, the result follows by the same reasoning as given in the proof of Theorem 56. ∎

Note that this theorem also renders MGCC PTAS-inapproximable unless FPT = W[1].

Appendix I Circuit Patching Problem

Minimum Local Circuit Patching (MLCP)

Input: A multi-layer perceptron M of depth cd with #n_tot neurons and maximum layer width cw, connection-value matrices W_1, W_2, …, W_cd, neuron bias vector B, Boolean input vectors x and y of length #n_in, and a positive integer k such that 1 ≤ k ≤ (#n_tot − (#n_in + #n_out)).
Question: Is there a subset $C$, $|C| \leq k$, of the internal neurons in $M$ such that, for the MLP $M'$ created when $M$ is $y$-patched wrt $C$ (i.e., $M'$ is created when $M/C$ is patched with activations from $M(x)$ and $C$ is patched with activations from $M(y)$), $M'(x) = M(y)$?

Minimum Global Circuit Patching (MGCP)
Input: A multi-layer perceptron $M$ of depth $cd$ with $\#n_{tot}$ neurons and maximum layer width $cw$, connection-value matrices $W_1, W_2, \ldots, W_{cd}$, neuron bias vector $B$, Boolean input vector $y$ of length $\#n_{in}$, and a positive integer $k$ such that $1 \leq k \leq (\#n_{tot} - (\#n_{in} + \#n_{out}))$.
Question: Is there a subset $C$, $|C| \leq k$, of the internal neurons in $M$ such that, for all possible input vectors $x$, for the MLP $M'$ created when $M$ is $y$-patched wrt $C$ (i.e., $M'$ is created when $M/C$ is patched with activations from $M(x)$ and $C$ is patched with activations from $M(y)$), $M'(x) = M(y)$?

Following Barceló et al. 2020, page 4, all neurons in $M$ use the ReLU activation function and the output $x$ of each output neuron is stepped as necessary to be Boolean, i.e., $step(x) = 0$ if $x \leq 0$ and $1$ otherwise.

For a graph $G = (V, E)$, we shall assume an ordering on the vertices and edges in $V$ and $E$, respectively. For each vertex $v \in V$, let the complete neighbourhood $N_C(v)$ of $v$ be the set composed of $v$ and all vertices in $G$ that are adjacent to $v$ by a single edge, i.e., $\{v\} \cup \{u \mid u \in V \text{ and } (u, v) \in E\}$.
We will prove various classical and parameterized results for MLCP and MGCP using reductions from Dominating Set. The parameterized results are proved relative to the parameters in Table 9. Our reductions (Theorems 63 and 67) use specialized ReLU logic gates described in Barceló et al. 2020, Lemma 13. These gates assume Boolean neuron input and output values of 0 and 1 and are structured as follows:
1. NOT ReLU gate: A ReLU gate with one input connection weight of value $-1$ and a bias of $1$. This gate has output 1 if the input is 0 and 0 otherwise.
2. $n$-way AND ReLU gate: A ReLU gate with $n$ input connection weights of value $1$ and a bias of $-(n-1)$. This gate has output 1 if all inputs have value 1 and 0 otherwise.
3. $n$-way OR ReLU gate: A combination of an $n$-way AND ReLU gate with NOT ReLU gates on all of its inputs and a NOT ReLU gate on its output, using De Morgan's second law to implement $(x_1 \lor x_2 \lor \ldots \lor x_n)$ as $\lnot(\lnot x_1 \land \lnot x_2 \land \ldots \land \lnot x_n)$. This gate has output 1 if any input has value 1 and 0 otherwise.

Table 9: Parameters for the minimum circuit patching problem.

Parameter | Description
$cd$ | # layers in given MLP
$cw$ | max # neurons in a layer in given MLP
$\#n_{tot}$ | total # neurons in given MLP
$\#n_{in}$ | # input neurons in given MLP
$\#n_{out}$ | # output neurons in given MLP
$B_{max}$ | max neuron bias in given MLP
$W_{max}$ | max connection weight in given MLP
$k$ | size of requested patching subset of $M$

I.1 Results for MLCP

Towards proving NP-completeness, we first prove membership and then follow up with hardness.
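The three gate constructions above can be verified exhaustively; the following is a minimal sketch (ours, not from the paper's materials) of the gates and their truth tables:

```python
# Sketch of the ReLU logic gates from Barcelo et al. 2020, Lemma 13,
# assuming Boolean inputs and outputs in {0, 1}.
from itertools import product

def relu(z):
    return max(0.0, z)

def not_gate(x):
    # one input connection weight of -1, bias 1
    return relu(-1 * x + 1)

def and_gate(xs):
    # n input connection weights of 1, bias -(n - 1)
    return relu(sum(xs) - (len(xs) - 1))

def or_gate(xs):
    # NOT(AND(NOT x_1, ..., NOT x_n)), by De Morgan's second law
    return not_gate(and_gate([not_gate(x) for x in xs]))

# Exhaustively verify the gates on all Boolean inputs of width 3.
for bits in product([0, 1], repeat=3):
    assert and_gate(list(bits)) == (1 if all(bits) else 0)
    assert or_gate(list(bits)) == (1 if any(bits) else 0)
assert not_gate(0) == 1 and not_gate(1) == 0
```

Because each gate is a single ReLU unit (plus NOT units for OR), the reductions below can wire Boolean circuits directly into MLP layers without any loss of fidelity on Boolean inputs.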
Membership in NP can be proven via the definition of the polynomial hierarchy and the following alternating quantifier formula:

$\exists [S \subseteq M] : M_S(x) = M(y)$

Theorem 63. If MLCP is polynomial-time tractable then $P = NP$.

Proof. Consider the following reduction from DS to MLCP, adapted from the reduction from DS to MSR in Theorem 103. Given an instance $\langle G = (V, E), k \rangle$ of DS, construct the following instance $\langle M, x, y, k' \rangle$ of MLCP: Let $M$ be an MLP based on $\#n_{tot} = 4|V| + 1$ neurons spread across five layers:
1. Input layer: The input vertex neurons $nv_1, nv_2, \ldots, nv_{|V|}$, all of which have bias 0.
2. Hidden vertex layer: The hidden vertex neurons $nhv_1, nhv_2, \ldots, nhv_{|V|}$, all of which are identity ReLU gates with bias 0.
3. Hidden vertex neighbourhood layer I: The vertex neighbourhood AND neurons $nvnA_1, nvnA_2, \ldots, nvnA_{|V|}$, where $nvnA_i$ is an $x$-way AND ReLU gate such that $x = |N_C(v_i)|$.
4. Hidden vertex neighbourhood layer II: The vertex neighbourhood NOT neurons $nvnN_1, nvnN_2, \ldots, nvnN_{|V|}$, all of which are NOT ReLU gates.
5. Output layer: The single output neuron $n_{out}$, which is a $|V|$-way AND ReLU gate.
The non-zero weight connections between adjacent layers are as follows:
• Each input vertex neuron $nv_i$, $1 \leq i \leq |V|$, is connected to its associated hidden vertex neuron $nhv_i$ with weight 1.
• Each hidden vertex neuron $nhv_i$, $1 \leq i \leq |V|$, is connected with weight 1 to each vertex neighbourhood AND neuron $nvnA_j$ such that $v_i \in N_C(v_j)$.
• Each vertex neighbourhood AND neuron $nvnA_i$, $1 \leq i \leq |V|$, is connected to its corresponding vertex neighbourhood NOT neuron $nvnN_i$ with weight 1.
• Each vertex neighbourhood NOT neuron $nvnN_i$, $1 \leq i \leq |V|$, is connected to the output neuron $n_{out}$ with weight 1.
All other connections between neurons in adjacent layers have weight 0. Finally, let $x$ and $y$ be the $|V|$-length one- and zero-vectors and $k' = k$.
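To make the construction concrete, the following sketch (ours, not the authors' code; the path graph 1-2-3 is a hypothetical example whose minimum dominating set is {2}) simulates $M$ on $x$ and $y$ and $y$-patches subsets of the hidden vertex layer:

```python
# Sketch of the DS -> MLCP reduction of Theorem 63 on a small example graph.
def relu(z):
    return max(0.0, z)

V = [1, 2, 3]
E = [(1, 2), (2, 3)]
# Complete neighbourhood N_C(v): v plus all vertices adjacent to v.
NC = {v: {v} | {u for (a, b) in E for u in (a, b) if v in (a, b) and u != v}
      for v in V}

def run(inp, patched=frozenset(), patch_val=None):
    """Forward pass through M; hidden vertex neurons in `patched` have their
    activations replaced by `patch_val` (modelling y-patching wrt C)."""
    nhv = {v: patch_val if v in patched else relu(inp[v]) for v in V}       # identity gates
    nvnA = {v: relu(sum(nhv[u] for u in NC[v]) - (len(NC[v]) - 1)) for v in V}  # AND gates
    nvnN = {v: relu(-nvnA[v] + 1) for v in V}                               # NOT gates
    out = relu(sum(nvnN.values()) - (len(V) - 1))                           # |V|-way AND
    return 1 if out > 0 else 0                                              # step

x = {v: 1 for v in V}  # all-ones vector
y = {v: 0 for v in V}  # all-zeros vector
assert run(x) == 0 and run(y) == 1
# y-patching the hidden vertex neuron of the dominating set {2} gives M'(x) = M(y):
assert run(x, patched=frozenset({2}), patch_val=0) == 1
# whereas the non-dominating singleton {1} leaves vertex 3's AND neuron at 1:
assert run(x, patched=frozenset({1}), patch_val=0) == 0
```

Patching the hidden vertex neurons of a dominating set forces every vertex neighbourhood AND neuron to receive at least one input 0, reproducing the forward direction of the correctness proof below.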
Observe that this instance of MLCP can be created in time polynomial in the size of the given instance of DS. Moreover, the output behaviour of the neurons in $M$ from the presentation of input $x$ until the output is generated is

timestep | neurons (outputs)
0 | —
1 | $nv_1(1), nv_2(1), \ldots, nv_{|V|}(1)$
2 | $nhv_1(1), nhv_2(1), \ldots, nhv_{|V|}(1)$
3 | $nvnA_1(1), nvnA_2(1), \ldots, nvnA_{|V|}(1)$
4 | $nvnN_1(0), nvnN_2(0), \ldots, nvnN_{|V|}(0)$
5 | $n_{out}(0)$

and the output behaviour of the neurons in $M$ from the presentation of input $y$ until the output is generated is

timestep | neurons (outputs)
0 | —
1 | $nv_1(0), nv_2(0), \ldots, nv_{|V|}(0)$
2 | $nhv_1(0), nhv_2(0), \ldots, nhv_{|V|}(0)$
3 | $nvnA_1(0), nvnA_2(0), \ldots, nvnA_{|V|}(0)$
4 | $nvnN_1(1), nvnN_2(1), \ldots, nvnN_{|V|}(1)$
5 | $n_{out}(1)$

We now need to show the correctness of this reduction by proving that the answer for the given instance of DS is "Yes" if and only if the answer for the constructed instance of MLCP is "Yes".
We prove the two directions of this if and only if separately as follows:

⇒: Let $V' = \{v'_1, v'_2, \ldots, v'_k\} \subseteq V$ be a dominating set in $G$ of size $k$ and $C$ be the $k' = k$-sized subset of the hidden vertex neurons in $M$ corresponding to the vertices in $V'$. As $V'$ is a dominating set, each vertex neighbourhood AND neuron receives input 0 from at least one hidden vertex neuron in $C$ when $M$ is $y$-patched wrt $C$. This ensures that each vertex neighbourhood AND neuron has output 0, which in turn ensures that each vertex neighbourhood NOT neuron has output 1 and, for $M'$ created by $y$-patching $M$ wrt $C$, $M'(x) = M(y) = 1$.

⇐: Let $C$ be a $k' = k$-sized subset of the internal neurons of $M$ such that when $M'$ is created by $y$-patching $M$ wrt $C$, $M'(x) = M(y) = 1$. The output of $M'$ on $x$ can be 1 (and hence equal to the output of $M$ on $y$) only if all vertex neighbourhood NOT neurons output 1, which in turn can happen only if all vertex neighbourhood AND gates output 0. However, as all elements of $x$ have value 1, this means that each vertex neighbourhood AND neuron must be connected to at least one patched hidden vertex neuron (all of which have output 0 courtesy of $y$), which in turn implies that the $k' = k$ vertices in $G$ corresponding to the patched hidden vertex neurons in $C$ form a dominating set of size $k$ for $G$.

As DS is $NP$-hard (Garey & Johnson, 1979), the reduction above establishes that MLCP is also $NP$-hard. The result follows from the definition of $NP$-hardness. ∎

Theorem 64. If $\langle cd, \#n_{out}, W_{max}, B_{max}, k \rangle$-MLCP is fixed-parameter tractable then $FPT = W[1]$.

Proof.
Observe that in the instance of MLCP constructed in the reduction in the proof of Theorem 63, $\#n_{out} = W_{max} = 1$, $cd = 4$, and $B_{max}$ and $k$ are functions of $k$ in the given instance of DS. The result then follows from the facts that $\langle k \rangle$-DS is $W[2]$-hard (Downey & Fellows, 1999) and $W[1] \subseteq W[2]$. ∎

Theorem 65. $\langle \#n_{tot} \rangle$-MLCP is fixed-parameter tractable.

Proof. Consider the algorithm that generates all possible subsets $C$ of the internal neurons in $M$ and, for each such subset, checks whether $M'$ created by $y$-patching $M$ wrt $C$ is such that $M'(x) = M(y)$. If such a $C$ is found, return "Yes"; otherwise, return "No". The number of possible subsets $C$ is at most $2^{\#n_{tot}}$. Given this, as $M$ can be patched relative to $C$ and run on $x$ and $y$ in time polynomial in the size of the given instance of MLCP, the above is a fixed-parameter tractable algorithm for MLCP relative to parameter-set $\{\#n_{tot}\}$. ∎

Observe that the results in Theorems 64 and 65 in combination with Lemmas 1 and 2 suffice to establish the parameterized complexity status of MLCP relative to many subsets of the parameters listed in Table 9. Let us now consider the polynomial-time cost approximability of MLCP.

Theorem 66. If MLCP has a polynomial-time $c$-approximation algorithm for any constant $c > 0$ then $FPT = W[1]$.

Proof.
Recall from the proof of correctness of the reduction in the proof of Theorem 63 that a given instance of DS has a dominating set of size $k$ if and only if the constructed instance of MLCP has a subset $C$ of the internal neurons in $M$ of size $k' = k$ such that for the MLP $M'$ created from $M$ by $y$-patching $M$ wrt $C$, $M'(x) = M(y)$. This implies that, given a polynomial-time $c$-approximation algorithm $A$ for MLCP for some constant $c > 0$, we can create a polynomial-time $c$-approximation algorithm for DS by applying the reduction to the given instance $I$ of DS to construct an instance $I'$ of MLCP, applying $A$ to $I'$ to create an approximate solution $S'$, and then using $S'$ to create an approximate solution $S$ for $I$ that has the same cost as $S'$. The result then follows from Chen & Lin 2019, Corollary 2, which implies that if DS has a polynomial-time $c$-approximation algorithm for any constant $c > 0$ then $FPT = W[1]$. ∎

Note that this theorem also renders MLCP PTAS-inapproximable unless $FPT = W[1]$.

I.2 Results for MGCP

Membership in $\Sigma^p_2$ can be proven via the definition of the polynomial hierarchy and the following alternating quantifier formula:

$\exists [S \subseteq M] \; \forall [x \in \{0,1\}^{\#n_{in}}] : M_S(x) = M(y)$

Theorem 67. If MGCP is polynomial-time tractable then $P = NP$.

Proof. Modify the reduction in the proof of Theorem 63 such that each hidden vertex neuron has input weight 0 and bias 1; this will force all hidden vertex neurons to output 1 for all input vectors $x$ instead of just when $x$ is the all-one vector.
Hence, with slight modifications to the proof of reduction correctness for the reduction in the proof of Theorem 63, this modified reduction establishes the $NP$-hardness of MGCP. ∎

Theorem 68. If $\langle cd, \#n_{out}, W_{max}, B_{max}, k \rangle$-MGCP is fixed-parameter tractable then $FPT = W[1]$.

Proof. Observe that in the instance of MGCP constructed in the reduction in the proof of Theorem 67, $\#n_{out} = W_{max} = 1$, $cd = 4$, and $B_{max}$ and $k$ are functions of $k$ in the given instance of DS. The result then follows from the facts that $\langle k \rangle$-DS is $W[2]$-hard (Downey & Fellows, 1999) and $W[1] \subseteq W[2]$. ∎

Theorem 69. $\langle \#n_{tot} \rangle$-MGCP is fixed-parameter tractable.

Proof. Modify the algorithm in the proof of Theorem 65 such that each circuit $M'$ created by $y$-patching $M$ is checked to ensure that $M'(x) = M(y)$ for every possible Boolean input vector $x$ of length $\#n_{in}$. As the number of such vectors is $2^{\#n_{in}} < 2^{\#n_{tot}}$, the above is a fixed-parameter tractable algorithm for MGCP relative to parameter-set $\{\#n_{tot}\}$. ∎

Observe that the results in Theorems 68 and 69 in combination with Lemmas 1 and 2 suffice to establish the parameterized complexity status of MGCP relative to many subsets of the parameters listed in Table 9. Let us now consider the polynomial-time cost approximability of MGCP.

Theorem 70. If MGCP has a polynomial-time $c$-approximation algorithm for any constant $c > 0$ then $FPT = W[1]$.

Proof.
As the reduction in the proof of Theorem 67 is essentially the same as the reduction in the proof of Theorem 63, the result follows by the same reasoning as given in the proof of Theorem 66. ∎

Note that this theorem also renders MGCP PTAS-inapproximable unless $FPT = W[1]$.

Appendix J Quasi-Minimal Circuit Patching Problem

Quasi-Minimal Circuit Patching (QMCP)
Input: A multi-layer perceptron $M$ of depth $cd$ with $\#n_{tot}$ neurons and maximum layer width $cw$, connection-value matrices $W_1, W_2, \ldots, W_{cd}$, neuron bias vector $B$, an input vector $y$, and a set $X$ of input vectors of length $\#n_{in}$.
Output: A subset $C$ of the neurons in $M$ and a neuron $v \in C$ such that, for the MLP $M^*$ induced by patching $C$ with activations from $M(y)$ and $M \setminus C$ with activations from $M(x)$, $\forall x \in X : M^*(x) = M(y)$, and for the MLP $M'$ induced by patching identically except for $v \in C$, $\exists x \in X : M'(x) \neq M(y)$.

Theorem 71. QMCP is in PTIME (i.e., polynomial-time tractable).

Proof. Consider the following algorithm for QMCP. Build a sequence of MLPs by taking $M$ with all neurons labeled 0, and generating each subsequent $M_i$ in the sequence by labeling an additional neuron with 1 each time (this choice can be based on any heuristic strategy, for instance, one based on gradients). The first MLP, $M_1$, obtained by patching all neurons labeled 1 (i.e., none), is such that $M_1(x) \neq M(y)$, and the last, $M_n$, is guaranteed to give $M_n(x) = M(y)$ because all neurons are patched. Label the first MLP NO and the last YES. Perform a variant of binary search on the sequence as follows.
Evaluate the $M_i$ halfway between NO and YES while patching all of its neurons labeled 1. If it satisfies the condition, label it YES and repeat the same strategy on the subsequence from the first $M_i$ up to the YES just labeled. If it does not satisfy the condition, label it NO and repeat the same strategy on the subsequence from the NO just labeled up to the YES at the end of the original sequence. This iterative procedure halves the sequence each time. Halt upon finding two adjacent $\langle$NO, YES$\rangle$ patched networks (guaranteed to exist), and return the patched neuron set $V$ of the YES network and the single-neuron difference between YES and NO (the breaking point), $v \in V$. The complexity of this algorithm is roughly $O(n \log n)$. ∎

Appendix K Circuit Robustness Problem

Definition 17. Given an MLP $M$, a subset $H$ of the elements in $M$, an integer $k \leq |H|$, and an input $I$ to $M$, $M$ is $k$-robust relative to $H$ for $I$ if for each subset $H' \subseteq H$, $|H'| \leq k$, $M(I) = (M/H')(I)$.

Maximum Local Circuit Robustness (MLCR)
Input: A multi-layer perceptron $M$ of depth $cd$ with $\#n_{tot}$ neurons and maximum layer width $cw$, connection-value matrices $W_1, W_2, \ldots, W_{cd}$, neuron bias vector $B$, a subset $H$ of the neurons in $M$, a Boolean input vector $I$ of length $\#n_{in}$, and a positive integer $k$ such that $1 \leq k \leq |H|$.
Question: Is $M$ $k$-robust relative to $H$ for $I$?
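Returning briefly to Appendix J: the binary-search procedure in the proof of Theorem 71 can be sketched as follows (our sketch, not the authors' code; `check(i)` is a hypothetical stand-in for evaluating the i-th patched network on every input in X).

```python
# Sketch of the Theorem 71 search. check(0) is False (nothing patched) and
# check(n) is True (everything patched), so an adjacent <NO, YES> pair exists.
def quasi_minimal_breakpoint(check, n):
    """Return i such that check(i) is True and check(i - 1) is False.

    check(i): does patching the first i neurons, in some fixed heuristic
    order, satisfy M*(x) = M(y) for all x in X?
    Invariant maintained: check(lo) is False and check(hi) is True.
    """
    lo, hi = 0, n
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if check(mid):
            hi = mid   # label mid YES; continue on the lower subsequence
        else:
            lo = mid   # label mid NO; continue on the upper subsequence
    return hi          # (lo, hi) is the adjacent <NO, YES> pair

# Hypothetical example: patching becomes sufficient at 5 patched neurons.
calls = []
def check(i):
    calls.append(i)
    return i >= 5

i = quasi_minimal_breakpoint(check, 12)
assert i == 5 and not check(4)
assert len(calls) <= 6  # only O(log n) patched-network evaluations
```

Each iteration evaluates one patched network, so the search needs only $O(\log n)$ evaluations on top of the per-evaluation forward-pass cost, consistent with the roughly $O(n \log n)$ bound claimed in the proof.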
Restricted Maximum Local Circuit Robustness (MLCR∗)
Input: A multi-layer perceptron $M$ of depth $cd$ with $\#n_{tot}$ neurons and maximum layer width $cw$, connection-value matrices $W_1, W_2, \ldots, W_{cd}$, neuron bias vector $B$, a Boolean input vector $I$ of length $\#n_{in}$, and a positive integer $k$ such that $1 \leq k \leq |M|$.
Question: Is $M$ $k$-robust relative to $H = M$ for $I$?

Maximum Global Circuit Robustness (MGCR)
Input: A multi-layer perceptron $M$ of depth $cd$ with $\#n_{tot}$ neurons and maximum layer width $cw$, connection-value matrices $W_1, W_2, \ldots, W_{cd}$, neuron bias vector $B$, a subset $H$ of the neurons in $M$, and a positive integer $k$ such that $1 \leq k \leq |H|$.
Question: Is $M$ $k$-robust relative to $H$ for every possible Boolean input vector $I$ of length $\#n_{in}$?

Restricted Maximum Global Circuit Robustness (MGCR∗)
Input: A multi-layer perceptron $M$ of depth $cd$ with $\#n_{tot}$ neurons and maximum layer width $cw$, connection-value matrices $W_1, W_2, \ldots, W_{cd}$, neuron bias vector $B$, and a positive integer $k$ such that $1 \leq k \leq |M|$.
Question: Is $M$ $k$-robust relative to $H = M$ for every possible Boolean input vector $I$ of length $\#n_{in}$?

Following Barceló et al. 2020, page 4, all neurons in $M$ use the ReLU activation function and the output $x$ of each output neuron is stepped as necessary to be Boolean, i.e., $step(x) = 0$ if $x \leq 0$ and $1$ otherwise.

We will use previous results for Minimum Local/Global Circuit Ablation, Clique, and Vertex Cover to prove our results for the problems above.
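The $k$-robustness condition of Definition 17 admits a direct brute-force check, which makes the role of the parameter $k$ concrete. A minimal sketch of ours (`run` is a hypothetical forward pass taking the set of ablated neurons):

```python
# Direct check of Definition 17: M is k-robust relative to H for input I iff
# ablating every subset H' of H with |H'| <= k leaves M(I) unchanged.
from itertools import combinations

def is_k_robust(run, H, k, I):
    baseline = run(I, ablated=frozenset())
    for size in range(1, k + 1):
        for Hp in combinations(sorted(H), size):
            if run(I, ablated=frozenset(Hp)) != baseline:
                return False
    return True

# Toy model: output 1 iff at least 2 of the 3 hidden neurons survive ablation.
def toy_run(I, ablated):
    return 1 if len({"a", "b", "c"} - ablated) >= 2 else 0

assert is_k_robust(toy_run, {"a", "b", "c"}, 1, I=None)      # 1-robust
assert not is_k_robust(toy_run, {"a", "b", "c"}, 2, I=None)  # not 2-robust
```

The check enumerates $\sum_{i \leq k} \binom{|H|}{i}$ subsets and is thus exponential in $k$ in general; the complement relationship to MLCA (Observation 3 below) and the ensuing hardness results indicate that this cost is essentially unavoidable in the worst case.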
For a graph $G = (V, E)$, we shall assume an ordering on the vertices and edges in $V$ and $E$, respectively. We will prove various classical and parameterized results for MLCR, MLCR∗, MGCR, and MGCR∗. The parameterized results are proved relative to the parameters in Table 10. Lemmas 1 and 2 will be useful in deriving additional parameterized results from proved ones.

Table 10: Parameters for the minimum circuit robustness problem.

Parameter | Description | Appl.
$cd$ | # layers in given MLP | All
$cw$ | max # neurons in a layer in given MLP | All
$\#n_{tot}$ | total # neurons in given MLP | All
$\#n_{in}$ | # input neurons in given MLP | All
$\#n_{out}$ | # output neurons in given MLP | All
$B_{max}$ | max neuron bias in given MLP | All
$W_{max}$ | max connection weight in given MLP | All
$k$ | requested level of robustness | All
$|H|$ | size of investigated region | M{L,G}CR

Several of our proofs involve problems that are hard for $coW[1]$ and $coNP$, the complement classes of $W[1]$ and $NP$ (Garey & Johnson, 1979, Section 7.1). Given a decision problem $X$, let co-$X$ be the complement problem in which all instance answers are switched.

Observation 3. MLCR∗ is the complement of MLCA.

Proof. Let $X = \langle M, I, k \rangle$ be a shared input of MLCR∗ and MLCA. If the answer to MLCA on input $X$ is "Yes", this means that there is a subset $H$ of size $\leq k$ of the neurons of $M$ that can be ablated such that $M(I) \neq (M/H)(I)$. This implies that the answer to MLCR∗ on input $X$ is "No". Conversely, if the answer to MLCA on input $X$ is "No", this means that there is no subset $H$ of size $\leq k$ of the neurons of $M$ that can be ablated such that $M(I) \neq (M/H)(I)$. This implies that the answer to MLCR∗ on input $X$ is "Yes".
The observation follows from the definition of complement problem. ∎

The following lemmas will be of use in the derivation and interpretation of results involving complement problems and classes.

Lemma 5. (Garey & Johnson, 1979, Section 7.1) Given a decision problem $X$, if $X$ is $NP$-hard then co-$X$ is $coNP$-hard.

Lemma 6. Given a decision problem $X$, if $X$ is $coNP$-hard and $X$ is polynomial-time solvable then $P = NP$.

Proof. Suppose decision problem $X$ is $coNP$-hard and solvable in polynomial time by algorithm $A$. By the definition of problem class hardness, for every problem $Y$ in $coNP$ there is a polynomial-time many-one reduction $\Pi$ from $Y$ to $X$; let $A_\Pi$ be the polynomial-time algorithm encoded in $\Pi$. We can create a polynomial-time algorithm $A'$ for co-$Y$ by running $A_\Pi$ on a given input, running $A$, and then complementing the produced output, i.e., "Yes" ⇒ "No" and "No" ⇒ "Yes". However, as co-$X$ is $NP$-hard by Lemma 5, this implies that $P = NP$. ∎

Lemma 7. (Flum & Grohe, 2006, Lemma 8.23) Let $C$ be a parameterized complexity class. Given a parameterized decision problem $X$, if $X$ is $C$-hard then co-$X$ is $coC$-hard.

Lemma 8. Given a parameterized decision problem $X$, if $X$ is fixed-parameter tractable then co-$X$ is fixed-parameter tractable.

Proof. Given a fixed-parameter tractable algorithm $A$ for $X$, we can create a fixed-parameter tractable algorithm $A'$ for co-$X$ by running $A$ on a given input and then complementing the produced output, i.e., "Yes" ⇒ "No" and "No" ⇒ "Yes". ∎

Lemma 9. Given a parameterized decision problem $X$, if $X$ is $coW[1]$-hard and fixed-parameter tractable then $FPT = W[1]$.

Proof. Suppose decision problem $X$ is $coW[1]$-hard and fixed-parameter tractable via algorithm $A$.
By the definition of problem class hardness, for every problem $Y$ in $coW[1]$ there is a parameterized reduction $\Pi$ from $Y$ to $X$; let $A_\Pi$ be the fixed-parameter tractable algorithm encoded in $\Pi$. We can create a fixed-parameter tractable algorithm $A'$ for co-$Y$ by running $A_\Pi$ on a given input, running $A$, and then complementing the produced output, i.e., "Yes" ⇒ "No" and "No" ⇒ "Yes". However, as co-$X$ is $W[1]$-hard by Lemma 7, this implies that $FPT = W[1]$. ∎

We will also be deriving polynomial-time inapproximability results for optimization versions of MLCR, MLCR∗, MGCR, and MGCR∗, i.e.,
• Max-MLCR, which asks for the maximum value $k$ such that $M$ is $k$-robust relative to $H$ for $I$.
• Max-MLCR∗, which asks for the maximum value $k$ such that $M$ is $k$-robust relative to $H = M$ for $I$.
• Max-MGCR, which asks for the maximum value $k$ such that $M$ is $k$-robust relative to $H$ for every possible Boolean input vector $I$ of length $\#n_{in}$.
• Max-MGCR∗, which asks for the maximum value $k$ such that $M$ is $k$-robust relative to $H = M$ for every possible Boolean input vector $I$ of length $\#n_{in}$.

The derivation of such results is complicated by both the $coNP$-hardness of the decision versions of these problems (and the scarcity of inapproximability results for $coNP$-hard problems which one could transfer to our problems by L-reductions) and the fact that the optimization versions of our problems are evaluation problems that return numbers rather than graph structures (which makes the use of instance-copy and gap inapproximability proof techniques (Garey & Johnson, 1979, Chapter 6) extremely difficult). We shall sidestep many of these issues by deriving our results within the $OptP$ framework for analyzing evaluation problems developed in Krentel 1988 and Gasarch et al. 1995. In particular, we shall find the following of use.
Definition 18. (Adapted from Krentel, 1988, page 493) Let $f, g : \Sigma^* \to \mathbb{Z}$. A metric reduction from $f$ to $g$ is a pair $(T_1, T_2)$ of polynomial-time computable functions where $T_1 : \Sigma^* \to \Sigma^*$ and $T_2 : \Sigma^* \times \mathbb{Z} \to \mathbb{Z}$ such that $f(x) = T_2(x, g(T_1(x)))$ for all $x \in \Sigma^*$.

Lemma 10. (Corollary of Krentel, 1988, Theorem 4.3) Given an evaluation problem $\Pi$ that is $OptP[O(\log n)]$-hard under metric reductions, if $\Pi$ has a $c$-additive approximation algorithm for some $c \in o(poly)$, then $P = NP$. (Here $o(poly)$ is the set of all functions $f$ that are strictly upper bounded by all polynomials of $n$, i.e., $f(n) \leq c \times g(n)$ for $n \geq n_0$ for all $c > 0$ and $g(n) \in \cup_k n^k = n^{O(1)}$.)

Some of our metric reductions use specialized ReLU logic gates described in Barceló et al. 2020, Lemma 13. These gates assume Boolean neuron input and output values of 0 and 1 and are structured as follows:
1. NOT ReLU gate: A ReLU gate with one input connection weight of value $-1$ and a bias of $1$. This gate has output 1 if the input is 0 and 0 otherwise.
2. $n$-way AND ReLU gate: A ReLU gate with $n$ input connection weights of value $1$ and a bias of $-(n-1)$. This gate has output 1 if all inputs have value 1 and 0 otherwise.
3.
$n$-way OR ReLU gate: A combination of an $n$-way AND ReLU gate with NOT ReLU gates on all of its inputs and a NOT ReLU gate on its output, using De Morgan's second law to implement $(x_1 \lor x_2 \lor \ldots \lor x_n)$ as $\lnot(\lnot x_1 \land \lnot x_2 \land \ldots \land \lnot x_n)$. This gate has output 1 if any input has value 1 and 0 otherwise.

The hardness (inapproximability) results in this section hold when the given MLP $M$ has three (six) hidden layers.

K.1 Results for MLCR∗ and MLCR

Let us first consider problem MLCR∗.

Theorem 72. If MLCR∗ is polynomial-time tractable then $P = NP$.

Proof. As MLCR∗ is the complement of MLCA by Observation 3 and MLCA is $NP$-hard by the proof of Theorem 39, Lemma 5 implies that MLCR∗ is $coNP$-hard. The result then follows from Lemma 6. ∎

Theorem 73. If $\langle cd, \#n_{in}, \#n_{out}, W_{max}, B_{max}, k \rangle$-MLCR∗ is fixed-parameter tractable then $FPT = W[1]$.

Proof. The $coW[1]$-hardness of $\langle cd, \#n_{in}, \#n_{out}, W_{max}, B_{max}, k \rangle$-MLCR∗ follows from the $W[1]$-hardness of $\langle cd, \#n_{in}, \#n_{out}, W_{max}, B_{max}, k \rangle$-MLCA (Theorem 40), Observation 3, and Lemma 7. The result then follows from Lemma 9. ∎

Theorem 74. $\langle \#n_{tot} \rangle$-MLCR∗ is fixed-parameter tractable.

Proof. The result follows from the fixed-parameter tractability of $\langle \#n_{tot} \rangle$-MLCA (Theorem 41), Observation 3, and Lemma 8. ∎

Theorem 75.
$\langle cw, cd \rangle$-MLCR∗ is fixed-parameter tractable.

Proof. The result follows from the fixed-parameter tractability of $\langle cw, cd \rangle$-MLCA (Theorem 42), Observation 3, and Lemma 8. ∎

Observe that the results in Theorems 73–75 in combination with Lemmas 1 and 2 suffice to establish the parameterized complexity status of MLCR∗ relative to many subsets of the parameters listed in Table 10. We can derive a polynomial-time additive inapproximability result for Max-MLCR∗ using the following chain of metric reductions based on two evaluation problems:
• Min-VC, which asks for the minimum value $k$ such that $G$ has a vertex cover of size $k$.
• Min-MLCA, which asks for the minimum value $k$ such that there is a $k$-size subset $N'$ of the $|N|$ neurons in $M$ such that $M(I) \neq (M/N')(I)$.

Lemma 11. Min-VC metric reduces to Min-MLCA.

Proof. Consider the following reduction from Min-VC to Min-MLCA. Given an instance $X = \langle G = (V, E) \rangle$ of Min-VC, construct the following instance $X' = \langle M, I \rangle$ of Min-MLCA: Let $M$ be an MLP based on $\#n_{tot} = 3|V| + 2|E| + 2$ neurons spread across six layers:
1. Input neuron layer: The single input neuron $n_{in}$ (bias $+1$).
2. Hidden vertex pair layer: The vertex neurons $nvP1_1, nvP1_2, \ldots, nvP1_{|V|}$ and $nvP2_1, nvP2_2, \ldots, nvP2_{|V|}$ (all with bias 0).
3. Hidden vertex AND layer: The vertex neurons $nvA_1, nvA_2, \ldots, nvA_{|V|}$, all of which are 2-way AND ReLU gates.
4.
Hidden edge AND layer: The edge neurons neA_1, neA_2, …, neA_|E|, all of which are 2-way AND ReLU gates.
5. Hidden edge NOT layer: The edge neurons neN_1, neN_2, …, neN_|E|, all of which are NOT ReLU gates.
6. Output layer: The single output neuron n_out, which is an |E|-way AND ReLU gate.

The non-zero-weight connections between adjacent layers are as follows:

- The input neuron has an edge of weight 0 coming from its corresponding input and is in turn connected to each of the vertex pair neurons with weight 1.
- Each vertex pair neuron nvP1_i (nvP2_i), 1 ≤ i ≤ |V|, is connected to vertex AND neuron nvA_i with weight 2 (0).
- Each vertex AND neuron nvA_i, 1 ≤ i ≤ |V|, is connected with weight 1 to each edge AND neuron whose corresponding edge has v_i as an endpoint.
- Each edge AND neuron neA_i, 1 ≤ i ≤ |E|, is connected to edge NOT neuron neN_i with weight 1.
- Each edge NOT neuron neN_i, 1 ≤ i ≤ |E|, is connected to the output neuron n_out with weight 1.

All other connections between neurons in adjacent layers have weight 0. Finally, let I = (1). Observe that this instance of Min-MLCA can be created in time polynomial in the size of the given instance of Min-VC.
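For concreteness, the gadget can be simulated functionally. The sketch below is our illustration, not part of the paper's formal apparatus: it hard-codes the ReLU gate equations above and restricts ablation to the P1 vertex pair neurons, which is the case used in the correctness argument.

```python
def relu(x):
    return max(0, x)

def gadget_output(vertices, edges, ablated_p1):
    """Stepped output of the Lemma 11 gadget on I = (1), with the P1
    vertex pair neurons in ablated_p1 removed (an ablated neuron
    contributes 0 downstream)."""
    # Input neuron: weight-0 incoming edge, bias +1, so it always outputs 1.
    n_in = relu(0 * 1 + 1)
    # P1/P2 vertex pair neurons (bias 0, weight 1 from the input neuron).
    p1 = {v: 0 if v in ablated_p1 else relu(n_in) for v in vertices}
    p2 = {v: relu(n_in) for v in vertices}
    # Vertex AND neurons: 2-way AND with weight 2 from P1 and weight 0 from P2.
    vA = {v: relu(2 * p1[v] + 0 * p2[v] - 1) for v in vertices}
    # Edge AND neurons: 1 iff both endpoint vertex AND neurons output 1.
    eA = [relu(vA[u] + vA[v] - 1) for (u, v) in edges]
    # Edge NOT neurons: 1 iff the corresponding edge AND neuron outputs 0.
    eN = [relu(-a + 1) for a in eA]
    # Output neuron: |E|-way AND of the edge NOT neurons.
    return relu(sum(eN) - (len(edges) - 1))

# Triangle {0, 1, 2} plus pendant edge (2, 3): minimum vertex cover {1, 2}.
V, E = [0, 1, 2, 3], [(0, 1), (1, 2), (0, 2), (2, 3)]
assert gadget_output(V, E, set()) == 0    # M(I) = 0
assert gadget_output(V, E, {1, 2}) == 1   # ablating a cover flips the output
assert gadget_output(V, E, {3}) == 0      # ablating a non-cover does not
```

The output flips from 0 to 1 exactly when every edge AND neuron loses an endpoint, i.e., when the ablated P1 neurons correspond to a vertex cover.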
Moreover, the output behaviour of the neurons in M from the presentation of input I until the output is generated is as follows:

timestep | neurons (outputs)
0 | —
1 | n_in(1)
2 | nvP1_1(1), nvP1_2(1), …, nvP1_|V|(1), nvP2_1(1), nvP2_2(1), …, nvP2_|V|(1)
3 | nvA_1(1), nvA_2(1), …, nvA_|V|(1)
4 | neA_1(1), neA_2(1), …, neA_|E|(1)
5 | neN_1(0), neN_2(0), …, neN_|E|(0)
6 | n_out(0)

Note that, given the 0 (2) connection weights of the P2 (P1) vertex pair neurons to the vertex AND neurons, it is the outputs of the P1 vertex pair neurons in timestep 2 that enable the vertex AND neurons to output 1 in timestep 3. We now need to show the correctness of this reduction by proving that the answer for the given instance of Min-VC is k if and only if the answer for the constructed instance of Min-MLCA is k. We prove the two directions of this if and only if separately as follows:

⇒: Let V′ = {v′_1, v′_2, …, v′_k} ⊆ V be a minimum-size vertex cover in G of size k and N′ be the k-sized subset of the P1 vertex pair neurons corresponding to the vertices in V′. Let M′ be the version of M in which all neurons in N′ are ablated.
As each of these vertex pair neurons previously allowed its associated vertex AND neuron to output 1, their ablation now causes these k vertex AND neurons to output 0. As V′ is a vertex cover of size k, every edge AND neuron in M′ receives input 0 from at least one of the vertex AND neurons associated with the P1 vertex pair neurons in N′. This in turn ensures that all of the edge NOT neurons and the output neuron produce output 1. Hence, M(I) = 0 ≠ 1 = M′(I).

⇐: Let N′ be a minimum-sized subset of N of size k such that, for the MLP M′ induced by ablating all neurons in N′, M(I) ≠ M′(I). As M(I) = 0 and circuit outputs are stepped to be Boolean, M′(I) = 1. Given the bias of the output neuron, this can only occur if all |E| edge NOT (AND) neurons in M′ have output 1 (0) on input I, the latter of which requires that each edge AND neuron receives 0 from at least one of its endpoint vertex AND neurons. These vertex AND neurons can only output 0 if all of their associated P1 vertex pair neurons have been ablated. This means that the vertices in G corresponding to the P1 vertex pair neurons in N′ must form a vertex cover of size k in G.

As this proves that Min-VC(X) = Min-MLCA(X′), the reduction above is a metric reduction from Min-VC to Min-MLCA. ∎

Lemma 12. Min-MLCA metric reduces to Max-MLCR∗.

Proof. As MLCA is the complement problem of MLCR∗ (Observation 3), we already have a trivial reduction from MLCA to MLCR∗ on their common input X = ⟨M, I⟩.
We can then show that k is the minimum value such that M has a k-sized circuit ablation relative to I if and only if k − 1 is the maximum value such that M is (k − 1)-robust relative to I:

⇒: If k is the minimum value such that M has a k-sized circuit ablation relative to I, then no subset of M of size k − 1 can be a circuit ablation of M relative to I, and M is (k − 1)-robust relative to I. Moreover, M cannot be k-robust relative to I, as that would contradict the existence of a k-sized circuit ablation for M relative to I. Hence, k − 1 is the maximum robustness value for M relative to I.

⇐: If k − 1 is the maximum value such that M is (k − 1)-robust relative to I, then there must be a subset H of M of size k that ensures M is not k-robust, i.e., (M/H)(I) ≠ M(I). Such an H is a k-sized circuit ablation of M relative to I. Moreover, there cannot be a (k − 1)-sized circuit ablation of M relative to I, as that would contradict the (k − 1)-robustness of M relative to I. Hence, k is the minimum size of circuit ablations for M relative to I.

As this proves that Min-MLCA(X) = Max-MLCR∗(X) + 1, the reduction above is a metric reduction from Min-MLCA to Max-MLCR∗. ∎

Theorem 76. If Max-MLCR∗ has a c-additive approximation algorithm for some c ∈ o(poly) then P = NP.

Proof. The OptP[O(log n)]-hardness of Max-MLCR∗ follows from the OptP[O(log n)]-hardness of Min-VC (Gasarch et al., 1995, Theorem 3.3) and the metric reductions in Lemmas 11 and 12. The result then follows from Lemma 4. ∎

Let us now consider problem MLCR.

Lemma 13. MLCR∗ many-one polynomial-time reduces to MLCR.

Proof.
Follows from the trivial reduction in which an instance ⟨M, I, k⟩ of MLCR∗ is transformed into an instance ⟨M, H = M, I, k⟩ of MLCR. ∎

Theorem 77. If MLCR is polynomial-time tractable then P = NP.

Proof. Follows from the coNP-hardness of MLCR∗ (Theorem 72), the reduction in Lemma 13, and Lemma 6. ∎

Theorem 78. If ⟨cd, #n_in, #n_out, W_max, B_max, k⟩-MLCR is fixed-parameter tractable then FPT = W[1].

Proof. Follows from the coW[1]-hardness of ⟨cd, #n_in, #n_out, W_max, B_max, k⟩-MLCR∗ (Theorem 73), the reduction in Lemma 13, and Lemma 9. ∎

Theorem 79. ⟨|H|⟩-MLCR is fixed-parameter tractable.

Proof. Consider the algorithm that generates every possible subset H′ of size at most k of the neurons in H and, for each such subset (assuming M/H′ is active), checks whether (M/H′)(I) ≠ M(I). If such a subset is found, return "No"; otherwise, return "Yes". The number of possible subsets H′ is at most k × |H|^k ≤ |H| × |H|^|H|. As any such M/H′ can be generated from M, checked for activity, and run on I in time polynomial in the size of the given instance of MLCR, the above is a fixed-parameter tractable algorithm for MLCR relative to parameter set {|H|}. ∎

Theorem 80. ⟨cw, cd⟩-MLCR is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 79 and the observation that |H| ≤ cw × cd. ∎

Theorem 81.
⟨#n_tot⟩-MLCR is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 79 and the observation that |H| ≤ #n_tot. ∎

Theorem 82. If Max-MLCR has a c-additive approximation algorithm for some c ∈ o(poly) then P = NP.

Proof. As Max-MLCR∗ is a special case of Max-MLCR, if Max-MLCR has a c-additive approximation algorithm for some c then so does Max-MLCR∗. The result then follows from Theorem 76. ∎

K.2 Results for MGCR-special and MGCR

Consider the following variant of MGCA:

Special Circuit Ablation (SCA)
Input: A multi-layer perceptron M of depth cd with #n_tot neurons and maximum layer width cw, connection-value matrices W_1, W_2, …, W_cd, neuron bias vector B, and a positive integer k such that 1 ≤ k ≤ #n_tot.
Question: Is there a subset N′, |N′| ≤ k, of the |N| neurons in M such that, for the MLP M′ induced by N ∖ N′, M(I) ≠ M′(I) for some Boolean input vector I of length #n_in?

Theorem 83. If SCA is polynomial-time tractable then P = NP.

Proof. Observe that in the instance of MLCA constructed by the reduction from Clique in the proof of Theorem 40, the input-connection weight 0 and bias 1 of the input neuron force this neuron to output 1 for both of the possible input vectors (1) and (0). This means that the answer to the given instance of Clique is "Yes" if and only if the answer to the constructed instance of MLCA relative to any of its possible input vectors is "Yes". Hence, with slight modifications to the proof of reduction correctness, this reduction also establishes the NP-hardness of SCA. ∎

Theorem 84.
If ⟨cd, #n_in, #n_out, W_max, B_max, k⟩-SCA is fixed-parameter tractable then FPT = W[1].

Proof. Observe that in the instance of SCA constructed in the reduction in the proof of Theorem 83, #n_in = #n_out = W_max = 1, cd = 5, and B_max and k are functions of k in the given instance of Clique. The result then follows from the fact that ⟨k⟩-Clique is W[1]-hard (Downey & Fellows, 1999). ∎

Theorem 85. ⟨#n_tot⟩-SCA is fixed-parameter tractable.

Proof. Modify the algorithm in the proof of Theorem 41 such that each created MLP M′ is checked against every possible Boolean input vector I of length #n_in to determine whether M(I) ≠ M′(I) for some such I. As the number of such vectors is 2^#n_in ≤ 2^#n_tot, the above is a fixed-parameter tractable algorithm for SCA relative to parameter set {#n_tot}. ∎

Theorem 86. ⟨cw, cd⟩-SCA is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 85 and the observation that #n_tot ≤ cw × cd. ∎

Our results for SCA gain importance here courtesy of the following observation.

Observation 4. MGCR∗ is the complement of SCA.

Proof. Let X = ⟨M, k⟩ be a shared input of MGCR∗ and SCA.
If the answer to SCA on input X is "Yes", this means that there is a subset H of size ≤ k of the neurons of M that can be ablated such that M(I) ≠ (M/H)(I) for some input vector I. This implies that the answer to MGCR∗ on input X is "No". Conversely, if the answer to SCA on input X is "No", this means that there is no subset H of size ≤ k of the neurons of M that can be ablated such that M(I) ≠ (M/H)(I) for any input vector I. This implies that the answer to MGCR∗ on input X is "Yes". The observation follows from the definition of complement problem. ∎

Theorem 87. If MGCR∗ is polynomial-time tractable then P = NP.

Proof. As MGCR∗ is the complement of SCA by Observation 4 and SCA is NP-hard (Theorem 83), Lemma 5 implies that MGCR∗ is coNP-hard. The result then follows from Lemma 6. ∎

Theorem 88. If ⟨cd, #n_in, #n_out, W_max, B_max, k⟩-MGCR∗ is fixed-parameter tractable then FPT = W[1].

Proof. The coW[1]-hardness of ⟨cd, #n_in, #n_out, W_max, B_max, k⟩-MGCR∗ follows from the W[1]-hardness of ⟨cd, #n_in, #n_out, W_max, B_max, k⟩-SCA (Theorem 84), Observation 4, and Lemma 7. The result then follows from Lemma 9. ∎

Theorem 89. ⟨#n_tot⟩-MGCR∗ is fixed-parameter tractable.

Proof.
The result follows from the fixed-parameter tractability of ⟨#n_tot⟩-SCA (Theorem 85), Observation 4, and Lemma 8. ∎

Theorem 90. ⟨cw, cd⟩-MGCR∗ is fixed-parameter tractable.

Proof. The result follows from the fixed-parameter tractability of ⟨cw, cd⟩-SCA (Theorem 86), Observation 4, and Lemma 8. ∎

Observe that the results in Theorems 88–90, in combination with Lemmas 1 and 2, suffice to establish the parameterized complexity status of MGCR∗ relative to many subsets of the parameters listed in Table 10. We can derive a polynomial-time additive inapproximability result for Max-MGCR∗ using the following chain of metric reductions based on two evaluation problems:

- Min-VC, which asks for the minimum value k such that G has a vertex cover of size k.
- Min-SCA, which asks for the minimum value k such that there is a k-sized subset N′ of the |N| neurons in M such that M(I) ≠ (M/N′)(I) for some Boolean input vector I of length #n_in.

Lemma 14. Min-VC metric reduces to Min-SCA.

Proof. Observe that in the instance of Min-MLCA constructed by the reduction from Min-VC in the proof of Lemma 11, the input-connection weight 0 and bias 1 of the input neuron force this neuron to output 1 for both of the possible input vectors (1) and (0). This means that the answer to the constructed instance of Min-MLCA is the same relative to any of its possible input vectors. Hence, with slight modifications to the proof of reduction correctness, this reduction is also a metric reduction from Min-VC to Min-SCA such that, for given instance X of Min-VC and constructed instance X′ of Min-SCA, Min-VC(X) = Min-SCA(X′). ∎

Lemma 15. Min-SCA metric reduces to Max-MGCR∗.

Proof.
As SCA is the complement problem of MGCR∗ (Observation 4), we already have a trivial reduction from SCA to MGCR∗ on their common input X = ⟨M⟩. We can then show that k is the minimum value such that M has a k-sized circuit ablation relative to some possible I if and only if k − 1 is the maximum value such that M is (k − 1)-robust relative to all possible I:

⇒: If k is the minimum value such that M has a k-sized circuit ablation relative to some possible I, then no subset of M of size k − 1 can be a circuit ablation of M relative to any possible I, and M is (k − 1)-robust relative to all possible I. Moreover, M cannot be k-robust relative to all possible I, as that would contradict the existence of a k-sized circuit ablation for M relative to some possible I. Hence, k − 1 is the maximum robustness value for M relative to all possible I.

⇐: If k − 1 is the maximum value such that M is (k − 1)-robust relative to all possible I, then there must be a subset H of M of size k that ensures M is not k-robust relative to all possible I, i.e., (M/H)(I) ≠ M(I) for some possible I. Such an H is a k-sized circuit ablation of M relative to that I. Moreover, there cannot be a (k − 1)-sized circuit ablation of M relative to some possible I, as that would contradict the (k − 1)-robustness of M relative to all possible I. Hence, k is the minimum size of circuit ablations for M relative to some possible I.

As this proves that Min-SCA(X) = Max-MGCR∗(X) + 1, the reduction above is a metric reduction from Min-SCA to Max-MGCR∗. ∎

Theorem 91. If Max-MGCR∗ has a c-additive approximation algorithm for some c ∈ o(poly) then P = NP.

Proof.
The OptP[O(log n)]-hardness of Max-MGCR∗ follows from the OptP[O(log n)]-hardness of Min-VC (Gasarch et al., 1995, Theorem 3.3) and the metric reductions in Lemmas 14 and 15. The result then follows from Lemma 4. ∎

Let us now consider problem MGCR.

Lemma 16. MGCR∗ many-one polynomial-time reduces to MGCR.

Proof. Follows from the trivial reduction in which an instance ⟨M, k⟩ of MGCR∗ is transformed into an instance ⟨M, H = M, k⟩ of MGCR. ∎

Theorem 92. If MGCR is polynomial-time tractable then P = NP.

Proof. Follows from the coNP-hardness of MGCR∗ (Theorem 87), the reduction in Lemma 16, and Lemma 6. ∎

Theorem 93. If ⟨cd, #n_in, #n_out, W_max, B_max, k⟩-MGCR is fixed-parameter tractable then FPT = W[1].

Proof. Follows from the coW[1]-hardness of ⟨cd, #n_in, #n_out, W_max, B_max, k⟩-MGCR∗ (Theorem 88), the reduction in Lemma 16, and Lemma 9. ∎

Theorem 94. ⟨#n_in, |H|⟩-MGCR is fixed-parameter tractable.

Proof. Modify the algorithm in the proof of Theorem 79 such that each created MLP M/H′ is checked against every possible Boolean input vector I of length #n_in to determine whether M(I) ≠ (M/H′)(I) for some such I. As the number of such vectors is 2^#n_in, the above is a fixed-parameter tractable algorithm for MGCR relative to parameter set {#n_in, |H|}. ∎

Theorem 95.
⟨cw, cd⟩-MGCR is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 94 and the observations that #n_in ≤ #n_tot ≤ cw × cd and |H| ≤ cw × cd. ∎

Theorem 96. ⟨#n_tot⟩-MGCR is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 94 and the observations that #n_in ≤ #n_tot and |H| ≤ #n_tot. ∎

Theorem 97. If Max-MGCR has a c-additive approximation algorithm for some c ∈ o(poly) then P = NP.

Proof. As Max-MGCR∗ is a special case of Max-MGCR, if Max-MGCR has a c-additive approximation algorithm for some c then so does Max-MGCR∗. The result then follows from Theorem 91. ∎

Appendix L Sufficient Reasons Problem

Minimum sufficient reason (MSR)
Input: A multi-layer perceptron M of depth cd with #n_tot neurons and maximum layer width cw, connection-value matrices W_1, W_2, …, W_cd, neuron bias vector B, a Boolean input vector I of length #n_in, and a positive integer k such that 1 ≤ k ≤ #n_in.
Question: Is there a k-sized subset I′ of I such that, for each possible completion I″ of I′, I and I″ are behaviorally equivalent with respect to M?

For a graph G = (V, E), we shall assume an ordering on the vertices and edges in V and E, respectively. For each vertex v ∈ V, let the complete neighbourhood N_C(v) of v be the set composed of v and all vertices in G that are adjacent to v by a single edge, i.e., {v} ∪ {u | u ∈ V and (u, v) ∈ E}.
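The MSR question can be decided by brute force over subsets and completions, in the spirit of the fixed-parameter algorithms proved below. The sketch that follows is our illustration: it treats M as a black-box Boolean function f rather than as an explicit ReLU network.

```python
from itertools import combinations, product

def has_sufficient_reason(f, I, k):
    """Is there a k-sized set of positions of I whose values, held fixed,
    force f to its value on I under every completion of the remaining
    positions?"""
    n = len(I)
    target = f(I)
    for S in combinations(range(n), k):
        free = [i for i in range(n) if i not in S]
        ok = True
        for bits in product([0, 1], repeat=len(free)):
            comp = list(I)
            for i, b in zip(free, bits):
                comp[i] = b
            if f(tuple(comp)) != target:
                ok = False
                break
        if ok:
            return True
    return False

# Example: f is 1 iff the first two bits are both 1, so positions {0, 1}
# form a sufficient reason of size 2 for I = (1, 1, 0), and no single
# position suffices.
f = lambda x: int(x[0] == 1 and x[1] == 1)
assert has_sufficient_reason(f, (1, 1, 0), 2)
assert not has_sufficient_reason(f, (1, 1, 0), 1)
```

The loop structure mirrors the (#n_in)^k × 2^#n_in counting used in the fixed-parameter tractability arguments for MSR.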
We will prove various classical and parameterized results for MSR using reductions from Clique. The parameterized results are proved relative to the parameters in Table 11. An additional reduction from DS (Theorem 103) uses specialized ReLU logic gates described in Barceló et al. (2020, Lemma 13). These gates assume Boolean neuron input and output values of 0 and 1 and are structured as follows:

1. NOT ReLU gate: A ReLU gate with one input connection weight of value −1 and a bias of 1. This gate has output 1 if the input is 0 and 0 otherwise.
2. n-way AND ReLU gate: A ReLU gate with n input connection weights of value 1 and a bias of −(n − 1). This gate has output 1 if all inputs have value 1 and 0 otherwise.
3. n-way OR ReLU gate: A combination of an n-way AND ReLU gate with NOT ReLU gates on all of its inputs and a NOT ReLU gate on its output, which uses De Morgan's second law to implement (x_1 ∨ x_2 ∨ … ∨ x_n) as ¬(¬x_1 ∧ ¬x_2 ∧ … ∧ ¬x_n). This gate has output 1 if any input has value 1 and 0 otherwise.

Table 11: Parameters for the minimum sufficient reason problem.

Parameter | Description
cd | # layers in given MLP
cw | max # neurons in a layer in given MLP
#n_tot | total # neurons in given MLP
#n_in | # input neurons in given MLP
#n_out | # output neurons in given MLP
B_max | max neuron bias in given MLP
W_max | max connection weight in given MLP
k | size of requested subset of input vector

L.1 Results for MSR

The following hardness results are notable for holding when the given MLP M has only one hidden layer. As such, they complement and significantly tighten results given in Barceló et al. (2020) and Wäldchen et al. (2021), respectively.

Theorem 98.
If MSR is polynomial-time tractable then P = NP.

Proof. Consider the following reduction from Clique to MSR. Given an instance ⟨G = (V, E), k⟩ of Clique, construct the following instance ⟨M, I, k′⟩ of MSR: Let M be an MLP based on #n_tot = |V| + |E| + 1 neurons spread across three layers:

1. Input vertex layer: The vertex neurons nv_1, nv_2, …, nv_|V| (all with bias 0).
2. Hidden edge layer: The edge neurons ne_1, ne_2, …, ne_|E| (all with bias −1).
3. Output layer: The single output neuron n_out (bias −(k(k − 1)/2 − 1)).

Note that this MLP has only one hidden layer. The non-zero-weight connections between adjacent layers are as follows:

- Each vertex neuron nv_i, 1 ≤ i ≤ |V|, is connected with weight 1 to each edge neuron whose corresponding edge has v_i as an endpoint.
- Each edge neuron ne_i, 1 ≤ i ≤ |E|, is connected to the output neuron n_out with weight 1.

All other connections between neurons in adjacent layers have weight 0. Finally, let I be the |V|-length all-ones vector and k′ = k. Observe that this instance of MSR can be created in time polynomial in the size of the given instance of Clique.
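A functional simulation of this gadget (our sketch; the network is evaluated directly from the gate equations rather than from explicit weight matrices) illustrates the intended behaviour on a triangle with a pendant vertex:

```python
def relu(x):
    return max(0, x)

def msr_gadget(edges, k, x):
    """Stepped Boolean output of the Theorem 98 gadget on input bits x."""
    # Edge neurons: bias -1, weight-1 connections from both endpoints,
    # so ne outputs 1 iff both endpoints of its edge are set to 1.
    e_out = [relu(x[u] + x[v] - 1) for (u, v) in edges]
    # Output neuron: bias -(k(k-1)/2 - 1); its output is stepped to
    # Boolean (modelled here as a > 0 step).
    return int(relu(sum(e_out) - (k * (k - 1) // 2 - 1)) > 0)

# Triangle {0, 1, 2} plus pendant vertex 3: a 3-clique exists.
edges, k = [(0, 1), (1, 2), (0, 2), (2, 3)], 3
assert msr_gadget(edges, k, [1, 1, 1, 1]) == 1   # M(I) = 1 on the all-ones I
# Fixing the clique bits to 1 forces output 1 under every completion of bit 3:
assert all(msr_gadget(edges, k, [1, 1, 1, b]) == 1 for b in (0, 1))
# With only two bits fixed to 1, the worst completion falls below threshold:
assert msr_gadget(edges, k, [1, 1, 0, 0]) == 0
```

Fixing the input bits of a k-clique to 1 guarantees at least k(k − 1)/2 active edge neurons, which is exactly the threshold the output bias encodes.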
Moreover, the output behaviour of the neurons in M from the presentation of input I until the output is generated is as follows:

timestep | neurons (outputs)
0 | —
1 | nv_1(1), nv_2(1), …, nv_|V|(1)
2 | ne_1(1), ne_2(1), …, ne_|E|(1)
3 | n_out(|E| − (k(k − 1)/2 − 1))

Note that it is the stepped output of n_out in timestep 3 that yields output 1. We now need to show the correctness of this reduction by proving that the answer for the given instance of Clique is "Yes" if and only if the answer for the constructed instance of MSR is "Yes". We prove the two directions of this if and only if separately as follows:

⇒: Let V′ = {v′_1, v′_2, …, v′_k} ⊆ V be a clique in G of size k and I′ be the k′ = k-sized subset of I corresponding to the vertices in V′. As V′ is a clique of size k, exactly k(k − 1)/2 edge neurons in the constructed MLP M receive the requisite inputs of 1 on both of their endpoints from the vertex neurons associated with I′. This in turn ensures that the output neuron produces output 1. No other possible inputs to the vertex neurons not corresponding to elements of I′ can change the outputs of these activated edge neurons (and hence the output neuron as well) from 1 to 0. Hence, all completions of I′ cause M to output 1 and are behaviorally equivalent to I with respect to M.

⇐: Let I′ be a k′ = k-sized subset of I such that all possible completions of I′ are behaviorally equivalent to I with respect to M, i.e., all such completions cause M to output 1.
Consider the completion I″ of I′ in which all non-I′ elements have value 0. The output of M on I″ can be 1 (and hence equal to the output of M on I) only if at least k(k − 1)/2 edge neurons have output 1. As all non-I′ elements of I″ have value 0, this means that both endpoints of each of these edge neurons must be connected to elements of I′ with output 1, which in turn implies that the k′ = k vertices in G corresponding to the elements of I′ form a clique of size k in G.

As Clique is NP-hard (Garey & Johnson, 1979), the reduction above establishes that MSR is also NP-hard. The result follows from the definition of NP-hardness. ∎

Theorem 99. If ⟨cd, #n_out, W_max, B_max, k⟩-MSR is fixed-parameter tractable then FPT = W[1].

Proof. Observe that in the instance of MSR constructed in the reduction in the proof of Theorem 98, #n_out = W_max = 1, cd = 3, and B_max and k are functions of k in the given instance of Clique. The result then follows from the fact that ⟨k⟩-Clique is W[1]-hard (Downey & Fellows, 1999). ∎

Theorem 100. ⟨#n_in⟩-MSR is fixed-parameter tractable.

Proof. Consider the algorithm that generates each possible subset I′ of I of size k and, for each such I′, checks whether all possible completions of I′ are behaviorally equivalent to I with respect to M. If such an I′ is found, return "Yes"; otherwise, return "No".
The number of possible subsets I′ is at most (#n_in)^k ≤ (#n_in)^#n_in and the number of possible completions of any such I′ is less than 2^#n_in. Given this, as M can be run on each completion of I′ in time polynomial in the size of the given instance of MSR, the above is a fixed-parameter tractable algorithm for MSR relative to parameter set {#n_in}. ∎

Theorem 101. ⟨#n_tot⟩-MSR is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 100 and the observation that #n_in ≤ #n_tot. ∎

Theorem 102. ⟨cw⟩-MSR is fixed-parameter tractable.

Proof. Follows from the algorithm in the proof of Theorem 100 and the observation that #n_in ≤ cw. ∎

Observe that the results in Theorems 99–102, in combination with Lemmas 1 and 2, suffice to establish the parameterized complexity status of MSR relative to every subset of the parameters listed in Table 11.

Let us now consider the polynomial-time cost approximability of MSR. As MSR is a minimization problem, we cannot do this using reductions from a maximization problem like Clique. Hence, we will instead use a reduction from another minimization problem, namely DS.

Theorem 103. If MSR is polynomial-time tractable then P = NP.

Proof. Consider the following reduction from DS to MSR. Given an instance ⟨G = (V, E), k⟩ of DS, construct the following instance ⟨M, I, k′⟩ of MSR: Let M be an MLP based on #n_tot = 3|V| + 1 neurons spread across four layers:

1.
Input layer: The input vertex neurons nv_1, nv_2, …, nv_|V|, all of which have bias 0.
2. Hidden vertex neighbourhood layer I: The vertex neighbourhood AND neurons nvnA_1, nvnA_2, …, nvnA_|V|, where nvnA_i is an x-way AND ReLU gate such that x = |N_C(v_i)|.
3. Hidden vertex neighbourhood layer II: The vertex neighbourhood NOT neurons nvnN_1, nvnN_2, …, nvnN_|V|, all of which are NOT ReLU gates.
4. Output layer: The single output neuron n_out, which is a |V|-way AND ReLU gate.

The non-zero-weight connections between adjacent layers are as follows:

- Each input vertex neuron nv_i, 1 ≤ i ≤ |V|, is connected with weight 1 to each vertex neighbourhood AND neuron nvnA_j such that v_i ∈ N_C(v_j).
- Each vertex neighbourhood AND neuron nvnA_i, 1 ≤ i ≤ |V|, is connected to its corresponding vertex neighbourhood NOT neuron nvnN_i with weight 1.
- Each vertex neighbourhood NOT neuron nvnN_i, 1 ≤ i ≤ |V|, is connected to the output neuron n_out with weight 1.

All other connections between neurons in adjacent layers have weight 0. Finally, let I be the |V|-length zero vector and k′ = k.
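As with the Clique gadget, this construction can be simulated functionally (our sketch, evaluated from the gate equations rather than from explicit weight matrices) on a small path graph:

```python
def relu(x):
    return max(0, x)

def ds_gadget(vertices, edges, x):
    """Output of the Theorem 103 gadget on input bits x."""
    # Complete neighbourhoods N_C(v): v together with its adjacent vertices.
    nbhd = {v: {v} for v in vertices}
    for (u, v) in edges:
        nbhd[u].add(v)
        nbhd[v].add(u)
    # Neighbourhood AND neurons: 1 iff every input bit in N_C(v) is 1.
    vnA = {v: relu(sum(x[u] for u in nbhd[v]) - (len(nbhd[v]) - 1))
           for v in vertices}
    # Neighbourhood NOT neurons, then a |V|-way AND output neuron.
    vnN = [relu(-vnA[v] + 1) for v in vertices]
    return relu(sum(vnN) - (len(vertices) - 1))

# Path 0-1-2-3: {1, 3} is a dominating set.
V, E = [0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)]
assert ds_gadget(V, E, [0, 0, 0, 0]) == 1    # M(I) = 1 on the zero vector
# Fixing bits 1 and 3 to 0 forces output 1 under every completion:
assert all(ds_gadget(V, E, [a, 0, b, 0]) == 1 for a in (0, 1) for b in (0, 1))
# Fixing only bit 0 to 0 admits a completion with output 0:
assert ds_gadget(V, E, [0, 1, 1, 1]) == 0
```

Holding the bits of a dominating set at 0 pins every neighbourhood AND neuron at 0, and hence the output at 1, regardless of the completion.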
Observe that this instance of MSR can be created in time polynomial in the size of the given instance of DS. Moreover, the output behaviour of the neurons in $M$ from the presentation of input $I$ until the output is generated is as follows:

timestep 0: —
timestep 1: $nv_1(0), nv_2(0), \ldots, nv_{|V|}(0)$
timestep 2: $nvnA_1(0), nvnA_2(0), \ldots, nvnA_{|V|}(0)$
timestep 3: $nvnN_1(1), nvnN_2(1), \ldots, nvnN_{|V|}(1)$
timestep 4: $n_{out}(1)$

We now need to show the correctness of this reduction by proving that the answer for the given instance of DS is "Yes" if and only if the answer for the constructed instance of MSR is "Yes". We prove the two directions of this if and only if separately as follows:

⇒: Let $V' = \{v'_1, v'_2, \ldots, v'_k\} \subseteq V$ be a dominating set in $G$ of size $k$ and $I'$ be the $k' = k$-sized subset of $I$ corresponding to the vertices in $V'$. As $V'$ is a dominating set, each vertex neighbourhood AND neuron receives input 0 from at least one input vertex neuron in the set of input vertex neurons associated with $I'$, which in turn ensures that each vertex neighbourhood AND neuron has output 0. This in turn ensures that $M$ produces output 1. No other possible inputs to the vertex neighbourhood AND neurons can change the output of these neurons from 0 to 1. Hence, all completions of $I'$ cause $M$ to output 1 and are behaviorally equivalent to $I$ with respect to $M$.
⇐: Let $I'$ be a $k' = k$-sized subset of $I$ such that all possible completions of $I'$ are behaviorally equivalent to $I$ with respect to $M$, i.e., all such completions cause $M$ to output 1. Consider the completion $I''$ of $I'$ in which all non-$I'$ elements have value 1. The output of $M$ on $I''$ can be 1 (and hence equal to the output of $M$ on $I$) only if all vertex neighbourhood NOT neurons output 1, which in turn can happen only if all vertex neighbourhood AND gates output 0. However, as all non-$I'$ elements of $I''$ have value 1, this means that each vertex neighbourhood AND neuron must be connected to at least one element of $I'$, which in turn implies that the $k' = k$ vertices in $G$ corresponding to the elements of $I'$ form a dominating set of size $k$ for $G$.

As DS is NP-hard (Garey & Johnson, 1979), the reduction above establishes that MSR is also NP-hard. The result follows from the definition of NP-hardness. ∎

Theorem 104. If MSR has a polynomial-time $c$-approximation algorithm for any constant $c > 0$ then $FPT = W[1]$.

Proof. Recall from the proof of correctness of the reduction in the proof of Theorem 103 that a given instance of DS has a dominating set of size $k$ if and only if the constructed instance of MSR has a subset $I'$ of $I$ of size $k' = k$ such that every possible completion of $I'$ is behaviorally equivalent to $I$ with respect to $M$. This implies that, given a polynomial-time $c$-approximation algorithm $A$ for MSR for some constant $c > 0$, we can create a polynomial-time $c$-approximation algorithm for DS by applying the reduction to the given instance $x$ of DS to construct an instance $x'$ of MSR, applying $A$ to $x'$ to create an approximate solution $y'$, and then using $y'$ to create an approximate solution $y$ for $x$ that has the same cost as $y'$.
The result then follows from Chen & Lin (2019, Corollary 2), which implies that if DS has a polynomial-time $c$-approximation algorithm for any constant $c > 0$ then $FPT = W[1]$. ∎

Note that this theorem also renders MSR PTAS-inapproximable unless $FPT = W[1]$.

Appendix M Probabilistic approximation schemes

Let us now consider three other types of polynomial-time approximability that may be acceptable in situations where always getting the correct output for an input is not required:

1. algorithms that always run in polynomial time but are frequently correct in that they produce the correct output for a given input in all but a small number of cases (i.e., the number of errors for input size $n$ is bounded by a function $err(n)$) (Hemaspaandra & Williams, 2012);

2. algorithms that always run in polynomial time but are frequently correct in that they produce the correct output for a given input with high probability (Motwani & Raghavan, 1995); and

3. algorithms that run in polynomial time with high probability but are always correct (Gill, 1977).

Unfortunately, none of these options are in general open to us, courtesy of the following result.

Theorem 105. None of the hard problems in Table 4 are polynomial-time approximable in senses (1–3).

Proof. (Sketch) Holds relative to several strongly-believed or established complexity-class relation conjectures, courtesy of the NP-hardness of the problems and the reasoning in the proof of (Wareham, 2022, Result E). ∎

Appendix N Supplementary discussion

Search space size versus intrinsic complexity. Some of our hardness results can be surprising (e.g., fixed-parameter intractability indicating that taming intuitive network and circuit parameters is not enough to make queries feasible). Other findings might be unsurprising/surprising for the wrong reasons.
Often intractability is assumed based on observing that a problem of interest has an exponential search space. But this is not a sufficient condition for intractability. For instance, although the Minimum Spanning Tree problem has an exponential search space, it contains enough structure that can be exploited to obtain optimal solutions tractably. A more directly relevant example is our Quasi-Minimal Circuit problems, which also have exponential search spaces. This is a scenario where the typical reasoning in the literature would lead us astray. Given our tractability results, jumping to intractability conclusions would miss valuable opportunities to design tractable algorithms with guarantees.

Worst-case analysis. Given our limited knowledge of the problem space of interpretability, worst-case analysis is appropriate to explore what problems might be solvable without requiring any additional assumptions (e.g., Bassan et al., 2024; Barceló et al., 2020), and experimental results suggest it captures a lower bound on real-world complexity (e.g., Friedman et al., 2024; Shi et al., 2024; Yu et al., 2024a). One possibly fruitful avenue would be to conduct an empirical and formal characterization of learned weights in search of structure that could potentially distinguish conditions of (in)tractability. This could inform future average-case analyses under plausible distributional assumptions.

Strategies for exploring the viability of interpretability queries. Although we find that many queries of interest are intractable in the general case (and empirical results are in line with this characterization), this should not paralyze real-world efforts to interpret models. As our exploration of the current complexity landscape shows, reasonable relaxations, restrictions and problem variants can yield tractable queries for circuits with useful properties. Consider a few out of many possible avenues to continue these explorations.
(i) Faced with an intractable query, we can investigate which parameters of the problem (e.g., network and circuit aspects) might be responsible for the core hardness of the general problem. If these problematic parameters can be kept small in real-world applications, this can yield a fixed-parameter tractable query which can be answered efficiently in practice. We have explored some of these parameters, but many more could be, as any aspect of the problem can be parameterized. For this, a close dialogue between theorists and experimentalists will be crucial, as empirical regularities often suggest which parameters might be fruitful to explore theoretically, and experiments can test whether theoretically conjectured parameters are or can be kept small in practice.

(ii) Generating altogether different circuit query variants is another way of making interpretability feasible. Our formalization of quasi-minimal circuit problems illustrates the search for viable algorithmic options with examples of tractable problems for inner interpretability. When the use case is well defined, efficient queries that return circuits with useful affordances for applications can be designed. Some circuits (e.g., quasi-minimal circuits, but likely others) might mimic the affordances for prediction/control that ideal circuits have, while shedding the intractability that plagues the latter.

(iii) It could be fruitful to investigate properties of the network output. Although for some problems our constructions use step functions in the output layer (following the literature; Bassan et al., 2024; Barceló et al., 2020), for many problems we do not, or we provide alternative proofs without them. This suggests this is not likely a significant source of complexity. Another aspect could be the binary input/output, although continuous input/output does not necessarily matter complexity-wise.
Sometimes it does, as in the case of Linear Programming (PTIME; Karmarkar, 1984) versus 0-1 Integer Programming (NP-complete; Garey & Johnson, 1979), and sometimes it does not, as in Euclidean Steiner Tree (NP-hard, not in NP for technical reasons; Garey & Johnson, 1979) versus Rectilinear Steiner Tree (NP-complete; Garey & Johnson, 1979). Still, this is an interesting direction for future work, as it suggests studying the output as an axis of approximation.

(iv) A different path is to design queries that partially rely on mid-level abstractions (Vilas et al., 2024a) to bridge the gap between circuits and human-intelligible algorithms (e.g., key-value mechanisms; Geva et al., 2022; Vilas et al., 2024b).

(v) It is in principle possible that real-world trained neural networks possess an internal structure that is somehow benevolent to general (ideal) circuit queries (e.g., redundancy). In such optimistic scenarios, general-purpose heuristics might work well. The empirical evidence available, however, speaks against this possibility. In any case, it will always be important to characterize any 'benevolent structure' in the problems such that we can leverage it explicitly to design algorithms with useful guarantees.