Bit Off

Contents

Abstract
Introduction
The Residual-Formalization Research Programme
Theoretical Resources
Prior Art and Analytical Distinction
Method
Bittensor Architecture: Residual Judgment and Formalized Authority
Capitalized Judgment
Signal-Capital Substitution
Governance-Pressure Evidence
System-Wide Empirical Pattern
Traced SN1 Packet
Falsification
Scope of Inference
Conclusion

Abstract

Bittensor is described in its foundational materials and by subsequent commentary as a market for intelligence. The deployed protocol architecture defines a narrower institutional object. The subtensor runtime does not formalize intelligence, usefulness, or quality; it formalizes stake-weighted authority over judgments that originate in subnet-specific validator practice. Validators produce off-chain evaluations through subnet-specific scoring procedures, submit weight vectors to the runtime, and receive dividends through mechanisms in which stake appears at every decisive step of aggregation, consensus, rank generation, and reward distribution.

This Article develops two named mechanisms to analyze that arrangement. Capitalized judgment names the architectural posture: the protocol delegates evaluation to residual validator practice while formalizing capital-weighted authority over whose judgments count, how much they count, and how rewards are distributed. Signal-capital substitution names the failure mode in which capital-weighted authority is weakly calibrated to evaluative competence, so the protocol rewards capital-positioned judgment more than well-grounded evaluation.

Drawing on construct-validity theory (Cronbach and Meehl 1955; Adcock and Collier 2001), commensuration research (Espeland and Stevens 1998; Porter 1995), and classification theory (Bowker and Star 1999), the Article develops these mechanisms through the subtensor runtime at April 2026 commit references, BIT governance proposals (BIT-0002, BIT-0004), Lui and Sun’s (2025) system-wide empirical study, and a bounded packet trace of the historical OpenValidators mirror for SN1 text-prompting. The Article identifies the residual-judgment layer where evaluation actually occurs, the runtime mechanisms that convert those judgments into authority-bearing consequence, and the institutional conditions under which capital-weighted authority can diverge from evaluative substance.

Keywords: Bittensor, subtensor runtime, capitalized judgment, signal-capital substitution, stake-weighted authority, evaluative competence, construct validity, commensuration, classification theory, residual formalization, decentralized AI markets, institutional economics.

1. Introduction

Bittensor was introduced in Rao, Steeves, Shaabana, Attevelt, and McAteer’s (2020) whitepaper as a peer-to-peer intelligence market, an architecture in which intelligence systems would rank one another through Fisher-information logic and accumulate reward in proportion to demonstrated evaluative competence. The deployed system five years later is structurally different from that original vision. The subtensor runtime does not implement Fisher-information ranking; it implements stake-weighted aggregation over validator-submitted weight vectors. The subnet architecture does not settle a protocol-native measure of quality; it leaves task definition, querying, scoring, and weight production to subnet-specific validator practice. The runtime’s formal core concerns authority over evaluations rather than evaluations themselves.

This architectural pattern (a protocol that formalizes authority over judgment while leaving the underlying judgment to residual practice) raises specific analytical questions that neither the general blockchain-governance literature nor the AI-evaluation literature addresses directly. The blockchain-governance literature analyzes decentralized decision-making through frameworks drawn from institutional economics, political economy, and legal theory; it has not engaged in depth with protocol-mediated evaluation of contestable constructs. The AI-evaluation literature develops benchmarks, evaluation methodologies, and critiques of evaluation in machine-learning research; it has not engaged in depth with protocol-level capitalization of evaluative authority. The intersection, where a protocol capitalizes evaluative judgment over contested constructs, requires analytical resources from both traditions.

The analytical contribution this Article develops is the naming of two mechanisms. Capitalized judgment names the architectural posture: the protocol does not formalize evaluation directly but formalizes stake-weighted authority over evaluations produced residually in subnet practice. The posture is visible in the subtensor runtime at specific code locations (validator-permit logic, consensus aggregation, miner-rank computation, dividend logic) and is expressed institutionally through the separation of task definition (residual, subnet-level) from reward-bearing authority (formal, runtime-level). Signal-capital substitution names the failure condition: when capital-weighted authority is weakly calibrated to evaluative competence, the protocol rewards capital-positioned judgment more than well-grounded evaluation, and the reward distribution diverges from any defensible quality signal.

The two mechanisms generate empirical tests. Capitalized judgment is visible from runtime architecture alone; the code shows where stake enters and where it does not. Signal-capital substitution requires external calibration evidence: comparison of capital-weighted reward allocation against independent performance benchmarks. Part 10 of this Article engages the available system-wide evidence (Lui and Sun 2025), which operates within the protocol’s own evaluative vocabulary and therefore supports weaker claims than external-benchmark evidence would support. Part 11 develops a bounded packet trace for SN1 text-prompting that narrows the empirical burden to specific subnet-level questions that future research can address.

2. The Residual-Formalization Research Programme

This Article sits within a broader research programme addressing residual institutional consequences of formalization in cryptoeconomic systems. The programme question: when protocols formalize selected institutional functions, what residual work remains outside the coded layer, who carries it, and when does that residual domain become consequential for control, valuation, dependence, legitimacy, or accountability? Bittensor is a core empirical case within that programme because its formalization choice is unusually legible. The subtensor runtime formalizes authority over weight submission, consensus, reward, and continuation. It does not formalize task definition, querying procedure, scoring rubric, or construct operationalization, which remain residual to subnet-specific validator practice.

The programme’s general claim (residuals do not remain inert) becomes empirically visible in Bittensor through the specific mechanism developed here. Residual evaluation, produced in subnet-level validator practice, is captured by the runtime and transformed into authority-bearing consequence through stake-weighted aggregation. Whether that transformation preserves, distorts, or severs the connection between evaluative substance and reward-bearing authority is the calibration question that capitalized judgment and signal-capital substitution name.

Bittensor’s place within the broader residual-formalization programme is that of a core empirical case alongside x402, Chainlink, and Ondo. The four core-empirical cases address residual consequences in single-instrument settings where the formalization choice produces compound concentration, commensuration, or partitioning effects. For Bittensor specifically, the compound effect operates through commensuration: the runtime converts heterogeneous subnet-level judgments (about inference usefulness, data value, speed, diversity, benchmark performance, or task-specific contribution) into a common reward-bearing scale. Espeland and Stevens (1998) identify commensuration as a social process in which qualitatively distinct things are made comparable through a shared metric. Bittensor performs commensuration at runtime scale. The scale (reward-bearing weight over TAO emissions) makes no claim about the underlying constructs; it provides a mechanism for ranking and rewarding them. The programme’s residual claim applies: what the runtime does not formalize (construct validity, evaluator competence, scoring consistency) is the site where institutional consequence accumulates.

3. Theoretical Resources

Four analytical traditions supply the resources this Article uses. Each is engaged where it does specific analytical work rather than deployed as decoration.

3.1. Construct validity in psychometric theory

Cronbach and Meehl’s (1955) foundational framework for construct validity specifies the conditions under which an operational measure can claim to measure the underlying construct it purports to measure. The framework distinguishes three validity types: content validity (does the measure cover the relevant domain?), criterion validity (does the measure predict the outcomes it should predict?), and construct validity (does the measure connect to the theoretical construct through a network of expected relationships?). Cronbach and Meehl’s central insight is that construct validity cannot be established through any single empirical test; it requires the accumulation of evidence across multiple studies that collectively support the inference that the measure is connected to the underlying construct in the way theory predicts.

Adcock and Collier (2001) extend the construct-validity framework to qualitative and quantitative social-science research. Their central contribution is the distinction between the background concept (the abstract construct as it exists in theoretical discourse), the systematized concept (the construct as defined for a specific research purpose), the indicator (the operational measure), and the scores (the specific values the measure produces for specific cases). Each transition (background to systematized, systematized to indicator, indicator to scores) requires defensible justification, and validity failures can occur at any transition.

For Bittensor specifically, the construct-validity framework clarifies what the runtime does and does not establish. The background concept is intelligence, quality, or evaluative competence. The systematized concept varies across subnets: different subnets operationalize quality differently based on the task (inference quality, data value, speed, diversity, and so on). The indicators are subnet-specific validator scoring procedures. The scores are the weight vectors validators submit to the runtime. At each transition, validity failures can occur: the systematized concept may not correspond to the background concept, the indicator may not correspond to the systematized concept, and the scores may not reflect the indicator accurately.

The runtime does not evaluate any of these transitions. It aggregates scores (weight vectors) through stake-weighted consensus and converts them into reward-bearing authority. The runtime has no mechanism for detecting or correcting validity failures at any earlier transition.

3.2. Commensuration as a social process

Espeland and Stevens (1998) identify commensuration (the transformation of qualitatively distinct properties into a common metric) as a pervasive social process with specific cognitive, institutional, and political consequences. The process enables comparison and choice across domains that would otherwise be incommensurable, but imposes specific costs: information loss, distortion of what is being measured, and displacement of qualitative judgment. Commensuration is not ideologically neutral; it privileges what can be measured and marginalizes what cannot.

Porter (1995), in Trust in Numbers, develops the historical and sociological analysis of quantification as a response to distrust of expert judgment. Quantification offers the appearance of objectivity, impersonality, and replicability; it transforms expert judgment into impersonal procedure. The transformation has real benefits (accountability, portability across contexts, amenability to audit) and specific costs (suppression of contextual knowledge, privileging of what is easily counted, vulnerability to metric-gaming).

For Bittensor, commensuration and quantification operate at protocol scale. The runtime converts heterogeneous subnet-level judgments into a single reward-bearing metric (TAO emissions). The commensuration enables ecosystem-level resource allocation but imposes the costs Espeland-Stevens and Porter identify. The theoretical contribution of the commensuration framework is in specifying what the runtime accomplishes institutionally: not the measurement of quality but the commensuration of judgments about quality into a common reward scale. The political-economic consequences of this commensuration (who benefits, who is disadvantaged, what kinds of work are rewarded) follow the patterns these authors identify for other commensurated domains.

3.3. Classification and infrastructure

Bowker and Star (1999), in Sorting Things Out, develop the analytical framework for understanding classifications as infrastructure: embedded, invisible, persistent systems that shape what can be said, done, and measured within a domain. Classifications are neither neutral nor complete; they impose visibility and invisibility through their categories, make certain kinds of work countable and certain kinds uncountable, and create institutional inertia that makes classification change slow even when underlying conditions change.

For Bittensor, the classification framework applies at two levels. At the subnet level, subnets are classified by task type (text-prompting, data-indexing, price-oracle, and so on), with each classification producing specific evaluation methodologies and reward patterns. At the runtime level, the subtensor architecture classifies actors as validators, miners, subnet owners, root holders, and so on, with each classification carrying specific rights, obligations, and reward-eligibility. The Bowker-Star framework clarifies that these classifications are not descriptions of pre-existing distinctions; they are infrastructure choices that shape what work can be done, what can be measured, and what can be rewarded.

3.4. Token-incentive design as secondary resource

Cong, Li, and Wang (2022) on token platform finance and Cong, Fox, Li, and Zhou (2025) on decentralized-finance measurement provide secondary analytical resources for token-design analysis. Their frameworks address platform-economics dynamics and measurement-gap patterns in token systems. For Bittensor, these frameworks inform the token-design analysis without supplying the primary analytical resources. Bittensor’s capitalized-judgment mechanism operates at a distinct level (protocol-mediated evaluation of contested constructs) that the token-economics literature does not address directly.

4. Prior Art and Analytical Distinction

Five clusters of prior art bear on the analysis.

4.1. AI-evaluation and benchmarking literature

The AI-evaluation literature addresses benchmarking methodology and the epistemology of model evaluation. Liang et al. (2023) develop HELM (Holistic Evaluation of Language Models), a systematic framework for evaluating language models across multiple metrics (accuracy, calibration, robustness, fairness, bias, toxicity, efficiency). Burnell et al. (2023) analyze the construction and limitations of AI benchmarks, identifying systematic biases in benchmark design and application. Raji et al. (2021, 2022), writing on the AI evaluation audit framework, identify structural issues in the benchmark ecosystem, including representation gaps, measurement mismatches, and accountability failures.

These works operate at the benchmark-design level and address how AI evaluation should be conducted. This Article operates at a different analytical level: how a protocol capitalizes evaluative authority over contested constructs regardless of the benchmarking methodology subnets adopt. The prior art is adjacent but does not address the specific protocol-level capitalization mechanism developed here. Where a subnet uses HELM-style benchmarking as its scoring methodology, the Article’s analysis still applies because the runtime’s stake-weighted authority operates over the submitted judgments regardless of how those judgments were generated.

4.2. Blockchain-governance frameworks

Blockchain-governance analytical frameworks (van Pelt, Jansen, Baars, and Overbeek 2021; Schaedler, Krohn, and Papakyriakopoulos 2023; Alston, Law, Murtazashvili, and Weiss 2022) provide general tools for analyzing decentralized governance. Davidson, De Filippi, and Potts (2018) position blockchain as institutional technology within Williamson’s framework. The general frameworks inform this Article without driving the analysis. The specific analytical object (protocol-mediated evaluation of contested constructs) is not the blockchain-governance literature’s primary focus.

4.3. Crypto measurement and token-protocol analysis

Cong, Fox, Li, and Zhou (2025) analyze measurement gaps in decentralized-finance protocols, identifying systematic discrepancies between protocol-claimed metrics and independently verified activity. Their framework applies adjacent to this Article’s analysis: both identify gaps between protocol-level claims and substantive reality. The specific mechanisms differ. Cong-Fox-Li-Zhou analyze measurement gaps in self-reported platform metrics; this Article analyzes structural gaps between protocol-formalized authority and underlying evaluative substance. The frameworks complement each other.

Cong, Li, and Wang (2022) develop the token-platform-finance framework in which platform tokens provide financing for network development and align incentives between users and developers. The framework applies to Bittensor’s TAO tokenomics at a general level (tokens finance subnet development and align incentives with evaluative activity) without addressing the construct-validity question that this Article centers.

4.4. Existing Bittensor literature

Rao, Steeves, Shaabana, Attevelt, and McAteer (2020) provide the original Bittensor whitepaper. The whitepaper describes a Fisher-information-based ranking architecture that differs from the deployed subtensor runtime. The whitepaper is engaged in this Article as a historical anchor and as evidence of the architectural shift from original specification to deployed implementation.

Lui and Sun (2025) provide the first system-wide empirical study of Bittensor, analyzing 6,664,830 events across 64 active subnets and 121,567 unique wallets through February 12, 2025. Their study reports very high validator stake-reward correlation, weak miner performance-reward correlation, and extreme concentration. Their performance measure is Bittensor’s own trust metric, so their results support internal-vocabulary claims rather than external-benchmark claims. This Article engages Lui-Sun substantively in Part 10 and extends the analytical framing in two directions: the architectural-mechanism account of why the observed patterns arise, and the construct-validity framework that clarifies why internal-vocabulary evidence supports weaker claims than external-benchmark evidence would support.

4.5. Platform labor and evaluation literature

Gray and Suri (2019) on ghost work, Cassarino (2022) on platform evaluation labor, and Bucher and Gillespie (2023) on platform classification analyze the institutional patterns by which platforms extract value from evaluative work performed by workers or participants who may not receive commensurate compensation or voice. These frameworks apply adjacent to this Article’s analysis. The Bittensor case shares the general structural feature (platform-mediated evaluation of contested work) without exhibiting the specific labor-relations features the platform-labor literature addresses. Bittensor validators are not workers in the platform-labor sense; they are token-holding participants with stake in the evaluation system they operate. The analytical frameworks inform the analysis without supplying the primary resources.

4.6. What the Article contributes

Across the five prior-art clusters, the specific mechanism Bittensor exhibits (protocol-level capitalization of residual evaluative authority over contested constructs) has not been named. The AI-evaluation literature operates at the benchmark-design level. The blockchain-governance literature operates at the decision-making-architecture level. The token-measurement literature operates at the protocol-metric-gap level. The existing Bittensor literature (Rao et al. 2020; Lui and Sun 2025) operates at the protocol-specification and empirical-pattern levels. The platform-labor literature operates at the worker-relations level.

The contribution relative to each prior-art cluster is specific. Relative to AI-evaluation: the protocol-level capitalization mechanism operates independently of benchmark-design choices. Relative to blockchain-governance: the analytical object is protocol-mediated evaluation of contested constructs, which requires construct-validity and commensuration apparatus the general frameworks do not supply. Relative to crypto measurement: the gap this Article identifies is structural rather than self-reporting-based. Relative to existing Bittensor literature: the architectural-mechanism account and construct-validity framing extend the descriptive empirical results to theoretical accounts of why the patterns arise. Relative to platform-labor literature: the token-holder participant structure shifts the analytical frame from labor relations to protocol-institutional arrangements.

5. Method

The analysis combines four elements: primary-source reconstruction of the subtensor runtime, governance-proposal analysis, engagement with the Lui-Sun (2025) system-wide empirical study, and a bounded packet trace of historical OpenValidators data for SN1.

Runtime sources: the subtensor runtime at commit f65a8a3b093863803500406d4dd8a661d9f88989 and the March 19, 2026 deployment anchor 7a727dd4d219a953391e91ed2f7aa942050938f1, together with the bittensor-subnet-template repository at commit c355fbf44df72b1f526d723d5fc3a8179c16b9af. Specific code locations are cited where the analysis depends on runtime logic.

Governance sources: BIT-0002 (dTAO Subnet Start Call), BIT-0004 (Subnet Deregistration), and the broader BIT process documentation. These governance proposals document specific pressures within the system and indicate where governance action has been taken to address identified problems.

Empirical sources: Lui and Sun (2025) as the primary system-wide empirical anchor; the historical OpenValidators mirror for SN1 text-prompting as the bounded packet-trace source. SN1 is selected as a revealing case: the mirror preserves netuid = 1 run metadata and raw parquet rows with prompts, queried uids, completions, rewards, blocks, and run-local snapshots, enabling reconstruction of a concrete validator workflow with event-level linkage from queried uids through completions, rewards, penalties, and selected outputs.

Scope limits are substantial. The available record is architecture-first rather than a subnet-by-subnet quality audit. Lui and Sun’s study ends on February 12, 2025 and relies on a protocol-native performance proxy, which limits what can be said about external quality. The subtensor runtime and traced public surfaces show how authority is weighted and how rewards are distributed; they do not by themselves establish evaluator competence, collusion, or construct validity for every subnet task. Stronger calibration claims within SN1 will require an adjudication layer external to the subnet’s own reward machinery.

Method discipline: all factual claims about runtime architecture, governance proposals, and empirical findings are traceable to specific source documents. Claims about capitalized judgment are developed from runtime-architecture evidence. Claims about signal-capital substitution are developed from governance evidence, system-wide empirical findings, and architectural analysis of failure modes; they are distinguished from claims about calibration at specific subnets, which require subnet-level evidence not presently available.

6. Bittensor Architecture: Residual Judgment and Formalized Authority

6.1. What the runtime formalizes

The subtensor runtime at the cited commit implements specific functions that together constitute the protocol’s formal core. Validator eligibility is determined through is_topk_nonzero(&stake, max_allowed_validators as usize), which grants validator permits to the top-k stake holders. Active stake is computed by masking for inactivity and permit status. Weights from non-permitted validators are dropped. Consensus over submitted weights is computed through weighted_median_col_sparse(&active_stake, &weights, n, kappa). Miner ranks are computed through matmul_sparse(&clipped_weights, &active_stake, n). Validator dividends are computed through mechanisms in which stake appears at multiple decisive points (permit eligibility, active-stake weighting in aggregation, bond-weighted returns, emission normalization). At the subnet-emission layer, Taoflow allocates emission shares through flow-sensitive logic in get_shares_flow, without independent assessment of subnet quality.
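The sequence described above (permit selection, active-stake masking, stake-weighted consensus, clipped ranking) can be illustrated with a minimal dense sketch. It is not the subtensor implementation, which operates on sparse fixed-point data. The function names topk_permits, active_stake, weighted_median, and ranks are hypothetical, and two modeling choices are assumptions rather than readings of the cited commit: consensus is taken as the largest weight backed by at least a kappa share of active stake, and each submitted weight is clipped to that consensus before ranking.

```rust
// Dense, floating-point sketch of the stake-weighted pipeline.
// All function names are hypothetical; this is not the subtensor code.

/// Grant permits to the `k` largest non-zero stakes (ties keep index order).
fn topk_permits(stake: &[f64], k: usize) -> Vec<bool> {
    let mut order: Vec<usize> = (0..stake.len()).filter(|&i| stake[i] > 0.0).collect();
    order.sort_by(|&a, &b| stake[b].total_cmp(&stake[a])); // descending by stake
    let mut permits = vec![false; stake.len()];
    for &i in order.iter().take(k) {
        permits[i] = true;
    }
    permits
}

/// Zero the stake of validators without a permit or marked inactive, then normalize.
fn active_stake(stake: &[f64], permits: &[bool], active: &[bool]) -> Vec<f64> {
    let masked: Vec<f64> = stake
        .iter()
        .zip(permits.iter().zip(active))
        .map(|(&s, (&p, &a))| if p && a { s } else { 0.0 })
        .collect();
    let total: f64 = masked.iter().sum();
    if total > 0.0 {
        masked.iter().map(|s| s / total).collect()
    } else {
        masked
    }
}

/// Assumed consensus rule: the largest weight backed by at least `kappa` of active stake.
fn weighted_median(stake: &[f64], column: &[f64], kappa: f64) -> f64 {
    let mut pairs: Vec<(f64, f64)> = column.iter().cloned().zip(stake.iter().cloned()).collect();
    pairs.sort_by(|a, b| b.0.total_cmp(&a.0)); // descending by submitted weight
    let mut backing = 0.0_f64;
    for (w, s) in pairs {
        backing += s;
        if backing >= kappa {
            return w;
        }
    }
    0.0
}

/// Clip each weight to the per-miner consensus, then sum by active stake and normalize.
fn ranks(stake: &[f64], weights: &[Vec<f64>], kappa: f64) -> Vec<f64> {
    let n_miners = weights.first().map_or(0, |row| row.len());
    let mut rank = vec![0.0_f64; n_miners];
    for j in 0..n_miners {
        let column: Vec<f64> = weights.iter().map(|row| row[j]).collect();
        let consensus = weighted_median(stake, &column, kappa);
        for (v, row) in weights.iter().enumerate() {
            rank[j] += stake[v] * row[j].min(consensus);
        }
    }
    let total: f64 = rank.iter().sum();
    if total > 0.0 {
        rank.iter_mut().for_each(|r| *r /= total);
    }
    rank
}
```

With hypothetical stakes of 60, 30, 10, and 0, two permits, and kappa = 0.5, the 60-unit validator alone clears the consensus threshold. A miner favored only by that validator receives the entire rank, while a miner favored by the three smaller validators receives none, whatever the quality of either judgment.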

The runtime formalizes authority over submitted judgments. It does not formalize the judgments themselves.

6.2. What the runtime leaves residual

The bittensor-subnet-template repository and historical Yuma documentation leave task definition, querying methodology, scoring rubric, and weight production in subnet-specific validator practice. Validators are expected to write or adopt subnet-specific evaluation mechanisms, query miners, score returned outputs, and set weights on chain. Historical Yuma Consensus documentation describes validators as expressing subjective preferences about miner performance and being rewarded when those evaluations align with stake-weighted validator consensus.

The residual layer includes construct operationalization (what does quality mean for this subnet?), task design (what queries elicit meaningful performance?), scoring methodology (how are responses scored?), scoring consistency (do different validators score similarly?), and calibration to external standards (if they exist for the task). None of these are formalized at the protocol level.
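The residual character of scoring can be made concrete with a small sketch. Everything here is hypothetical: the Scorer trait, the two rubrics, and the weight_vector helper are illustrative inventions, not code from the bittensor-subnet-template repository. The point is only that two validators can apply incompatible rubrics to the same miner outputs and each still produce a well-formed weight vector, which is all the runtime receives.

```rust
/// A subnet-specific scoring rule (hypothetical). The runtime never sees this type.
trait Scorer {
    fn score(&self, prompt: &str, completion: &str) -> f64;
}

/// Hypothetical rubric A: reward longer completions, up to a word cap.
struct LengthRubric {
    cap: usize,
}

/// Hypothetical rubric B: reward completions that reuse words from the prompt.
struct OverlapRubric;

impl Scorer for LengthRubric {
    fn score(&self, _prompt: &str, completion: &str) -> f64 {
        completion.split_whitespace().count().min(self.cap) as f64
    }
}

impl Scorer for OverlapRubric {
    fn score(&self, prompt: &str, completion: &str) -> f64 {
        completion
            .split_whitespace()
            .filter(|w| prompt.split_whitespace().any(|p| p.eq_ignore_ascii_case(w)))
            .count() as f64
    }
}

/// Score each miner's completion and normalize to a weight vector summing to 1.
/// Returns all zeros when no completion earns a positive score.
fn weight_vector(scorer: &dyn Scorer, prompt: &str, completions: &[&str]) -> Vec<f64> {
    let raw: Vec<f64> = completions.iter().map(|c| scorer.score(prompt, c)).collect();
    let total: f64 = raw.iter().sum();
    if total > 0.0 {
        raw.iter().map(|r| r / total).collect()
    } else {
        vec![0.0; raw.len()]
    }
}
```

Applied to the same two completions, the length rubric can favor the longer answer while the overlap rubric assigns it zero weight. Nothing in either output vector records which rubric produced it, so the construct operationalization stays invisible to any downstream aggregation.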

6.3. The architectural pattern

The architectural pattern is that Bittensor begins from residual judgment rather than from a protocol-native measure of quality. Quality in Bittensor is plural: it may mean inference usefulness, data value, speed, diversity, benchmark performance, or subnet-specific contribution. The protocol does not settle that construct in advance. It receives weight vectors produced by validator processes that vary by subnet and by developer design. In construct-validity terms, the system does not begin with a settled operationalization of intelligence; it begins with heterogeneous judgments about contested task value.

The runtime then formalizes authority to convert those heterogeneous judgments into commensurate reward-bearing consequences. The commensuration is accomplished through stake-weighted aggregation: validators with more stake have more authority in the weight-aggregation procedure, and the resulting consensus determines miner ranks, validator dividends, and subnet-level emission allocation.
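A short counterfactual sketch shows what it means for stake to weight authority rather than evaluation. It omits the consensus and clipping steps and keeps only a stake-weighted sum. The function names and all numbers are hypothetical. The judgments are held fixed and only the stake distribution changes.

```rust
/// Stake-weighted score per miner: sum over validators of (stake share x submitted weight).
/// Simplified sketch: consensus and clipping are deliberately left out.
fn stake_weighted_scores(stake: &[f64], weights: &[Vec<f64>]) -> Vec<f64> {
    let total: f64 = stake.iter().sum();
    let n_miners = weights.first().map_or(0, |row| row.len());
    let mut scores = vec![0.0_f64; n_miners];
    if total <= 0.0 {
        return scores; // no stake, no authority
    }
    for (s, row) in stake.iter().zip(weights) {
        for (j, w) in row.iter().enumerate() {
            scores[j] += (s / total) * w;
        }
    }
    scores
}

/// Index of the highest-scoring miner, or None for an empty slice.
fn top_miner(scores: &[f64]) -> Option<usize> {
    scores
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}
```

With validator judgments fixed at [0.9, 0.1] and [0.2, 0.8], stakes of (10, 90) put miner 1 on top and stakes of (90, 10) put miner 0 on top. The evaluative content is identical in both runs; only the capital distribution differs.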

7. Capitalized Judgment

Capitalized judgment names the arrangement more precisely than "stake-weighted evaluation." The distinction matters because stake-weighted evaluation suggests that stake weights evaluation itself; capitalized judgment names the more accurate pattern: stake weights authority over judgments produced outside the formal protocol core. Contested judgments about quality are produced in subnet-level validator practice; the runtime turns those judgments into authoritative consequence through capital-weighted aggregation that determines whose judgments count, how much they count, and how rewards are distributed.

7.1. The architectural specificity

Capitalized judgment is specific to the Bittensor architecture in ways that distinguish it from related patterns. It is not identical to proof-of-stake consensus, which weights validator authority by stake but addresses consensus over transaction ordering rather than over evaluative judgment about contested constructs. It is not identical to decentralized oracle networks, which weight data-provider authority by stake but address reporting of external facts rather than evaluation of quality. It is not identical to DAO voting, which weights governance authority by stake but addresses decision-making over protocol parameters rather than distribution of reward for evaluative work.

The specific pattern in Bittensor is that stake weights authority over judgments about the quality of work performed by other participants, with those judgments determining reward distribution. This is an evaluation-of-work-by-capital-holders pattern, distinct from consensus over transactions, oracle reporting, or governance voting.

7.2. The construct-validity implications

In the terms Adcock and Collier (2001) develop from Cronbach and Meehl (1955), the capitalized-judgment architecture delegates construct operationalization to subnets (each subnet defines its own systematized concept), delegates indicator choice to validators (each validator defines its own scoring procedure), and formalizes only the aggregation of indicator scores into reward allocation. Validity failures can occur at any earlier transition (background to systematized, systematized to indicator, indicator to scores), and the runtime has no mechanism for detecting or correcting such failures.

The architectural consequence is that the protocol’s reward distribution inherits all the validity failures that occur at earlier transitions. If the systematized concept diverges from the background concept, the protocol rewards that divergence. If the indicator diverges from the systematized concept, the protocol rewards that divergence. If the scores diverge from the indicator, the protocol rewards that divergence. Capitalized judgment is the mechanism by which all these possible validity failures are converted into reward-bearing consequence.

7.3. The commensuration implications

In Espeland-Stevens (1998) and Porter (1995) terms, the capitalized-judgment architecture performs commensuration at runtime scale. Heterogeneous subnet-level judgments (about qualitatively different things, using qualitatively different indicators) are converted into a single reward-bearing metric (TAO emissions). The commensuration imposes the costs these authors identify: information loss (the qualitative differences between subnets are collapsed into a single reward scale), distortion of what is being measured (the reward scale creates incentives that may not align with the underlying evaluative constructs), and displacement of qualitative judgment (the runtime’s aggregation logic determines reward allocation without regard for the quality of the underlying judgment).

The political-economic consequences that Espeland-Stevens and Porter identify for other commensurated domains apply here. Capital-holders benefit because capital weights authority; evaluative-competence holders may or may not benefit, depending on whether competence correlates with capital. Work that is easy to evaluate within the runtime’s aggregation logic is rewarded; work that is hard to evaluate is marginalized regardless of its underlying quality.

8. Signal-Capital Substitution

Signal-capital substitution is the failure mode within capitalized judgment. It appears when capital-weighted authority is weakly calibrated to evaluative competence, so the protocol rewards capital-positioned judgment rather than well-grounded judgment. The failure mode has specific structural conditions under which it arises.

8.1. Structural conditions

Four structural conditions increase the probability of signal-capital substitution. First, validators may be bootstrapped through early trust and delegation before meaningful subnet evaluation exists; BIT-0002 documents this pattern explicitly for the dTAO subnet start call, describing "bogus validator or miner code" establishing early validator trust and child-hotkey delegations before subnet owners have published real code. Second, capital may track attention or speculative interest more closely than evaluative skill; token markets generally exhibit this property, and BIT-0002 describes participants rushing into subnet tokens before they can meaningfully evaluate whether to mine, validate, or buy. Third, validator agreement may reflect copying or prestige rather than competence; stake-weighted consensus rewards alignment with majority stake-weighted opinion, creating incentives for conformity that do not require evaluative competence. Fourth, some subnet tasks may be weakly grounded enough that the protocol lacks an external benchmark capable of disciplining internal rankings; for such tasks, no external test can detect calibration failure.

8.2. Runtime mechanisms that support the failure mode

The subtensor runtime supports this failure mode structurally because stake enters every decisive point in the aggregation pipeline. Validator permit eligibility depends on stake. Active stake weights the consensus computation. Miner rank depends on active stake through the matrix multiplication. Validator dividends are multiplied by active stake. Emission allocation at the subnet level operates through flow-sensitive logic tied to capital flows.
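The stake dependence described above can be illustrated with a minimal sketch. It is a toy model and not the subtensor implementation: the function names, the 0.5 majority threshold, and the omission of permits, clipping, bonds, and dividends are all assumptions made for illustration.

```python
def normalize(values):
    """Scale non-negative values so they sum to 1 (all zeros stay zeros)."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else [0.0 for _ in values]


def stake_weighted_rank(stake, weights):
    """Rank of each miner as the stake-weighted sum of validator weights.

    stake:   one non-negative number per validator
    weights: weights[v][m] is validator v's weight on miner m
    """
    s = normalize(stake)
    n_miners = len(weights[0])
    raw = [sum(s[v] * weights[v][m] for v in range(len(s))) for m in range(n_miners)]
    return normalize(raw)


def stake_weighted_consensus(stake, weights, majority=0.5):
    """Per-miner consensus level: the largest weight w such that validators
    holding more than `majority` of normalized stake set a weight of at least w."""
    s = normalize(stake)
    consensus = []
    for m in range(len(weights[0])):
        # Walk validators from the highest weight on miner m downward,
        # accumulating the stake share that supports at least that weight.
        column = sorted(((weights[v][m], s[v]) for v in range(len(s))), reverse=True)
        cumulative, level = 0.0, 0.0
        for w, share in column:
            cumulative += share
            if cumulative > majority:
                level = w
                break
        consensus.append(level)
    return consensus
```

In a hypothetical three-validator case with stake [80, 10, 10], where the large validator puts all weight on the first miner and the two small validators put all weight on the second, this toy model gives the first miner a rank share of about 0.8 and a consensus level of 1.0, while the second miner's consensus level is 0.0. Two of three validators disagree with that outcome, but their judgment carries little authority.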

The subnet template and validator documentation show that validators do run off-chain scoring procedures. The runtime architecture does not settle calibration for every subnet; it does, however, identify the risk form. If stake-authorized judgment is badly calibrated, the runtime has no internal mechanism that re-anchors authority to demonstrated competence before rewards are distributed.

8.3. The failure-mode taxonomy

Signal-capital substitution can operate through four pathways that the structural conditions above enable. Bootstrapping capture occurs when capital-based validator authority is established before real evaluative work begins, and the capital-based authority persists even as real evaluative work emerges or fails to emerge. Speculative-attention capture occurs when capital flows track speculative interest rather than evaluative substance, so capital-weighted authority becomes authority over speculative narrative rather than over evaluation. Consensus-conformity capture occurs when stake-weighted consensus rewards alignment with majority stake-weighted opinion, inducing validators to copy high-stake validators rather than evaluate independently. Weak-construct capture occurs when the subnet task lacks external calibration, so internal stake-weighted consensus can produce any ranking without external constraint.

Each pathway predicts specific empirical patterns. Bootstrapping capture predicts high validator stake-reward correlation established early in subnet operation and persisting. Speculative-attention capture predicts high correlation between subnet-token price trajectory and validator-reward distribution. Consensus-conformity capture predicts high correlation among validator-submitted weights within stake strata. Weak-construct capture predicts high divergence between internal protocol rankings and external performance benchmarks (when such benchmarks exist).

8.4. Lui-Sun evidence engagement

Lui and Sun (2025) report very high validator stake-reward correlation, weak miner performance-reward correlation, and extreme concentration. Their performance measure is Bittensor’s own trust metric, which is constructed from the stake-weighted consensus process. The Lui-Sun results are therefore strongest for claims about internal-vocabulary reward-capital alignment and weaker for claims about external-benchmark calibration. Within this epistemic scope, Lui-Sun provide evidence consistent with bootstrapping capture (high validator stake-reward correlation is what bootstrapping capture predicts) and consistent with consensus-conformity capture (high concentration is what conformity-driven stake-weighted consensus produces).

The evidence is insufficient to distinguish among the four pathways or to establish signal-capital substitution against an external benchmark. Stronger evidence requires subnet-level studies that compare internal stake-weighted rankings against external performance benchmarks. Part 12 specifies falsification conditions that future research can address.

9. Governance-Pressure Evidence

9.1. BIT-0002: launch-phase capital-capture pressure

BIT-0002 (dTAO Subnet Start Call) records a launch environment in which capital and validator authority move ahead of evaluative substance. The proposal states directly that "bogus validator or miner code" attempts to establish early validator trust and child-hotkey delegations before subnet owners have time to publish and organize real code. Participants are described as rushing into subnet tokens before they can meaningfully evaluate whether to mine, validate, or buy. The institutional pattern BIT-0002 documents is the bootstrapping-capture pathway: capital-based validator authority is established early and may persist whether or not real evaluative work emerges.

The governance response (the start-call mechanism) addresses the symptom (rushed early capital allocation) rather than the underlying structural condition (the capitalized-judgment architecture that allows capital to establish authority prior to evaluative substance). The start-call adds a procedural delay before subnet token trading becomes active, which can reduce the severity of bootstrapping capture but does not eliminate the structural pathway.

9.2. BIT-0004: continuation-phase weak-calibration pressure

BIT-0004 (Subnet Deregistration) records a second pressure. The proposal states that non-functional subnets can continue consuming emissions, chain footprint, and compute resources. It proposes price-based pruning when the subnet limit is reached. The dissolution path is root-only (requiring root-level governance action rather than subnet-level mechanisms).

The institutional pattern BIT-0004 documents is the weak-construct pathway: subnets can persist without demonstrating functional evaluation, because the protocol lacks an external measure of functionality that would trigger deregistration automatically. The governance response (price-based pruning) ties deregistration to market signals (subnet token price) rather than to direct measures of evaluative performance. This choice is consistent with the broader capitalized-judgment architecture: where direct measurement of evaluative quality is unavailable, capital-based signals substitute.

9.3. Structural implication of the governance record

The governance record supports a specific structural claim. The Bittensor protocol recognizes that capital-capture and weak-calibration pressures exist (the BIT proposals document them) and attempts to address them through governance mechanisms (start-calls, deregistration procedures) that operate downstream of the capitalized-judgment architecture without modifying the architecture itself. The governance mechanisms are therefore supplements to, rather than substitutes for, the capitalized-judgment core. The structural conditions for signal-capital substitution remain, and the governance record documents that the protocol’s designers and operators have recognized the conditions without eliminating them.

10. System-Wide Empirical Pattern

Lui and Sun (2025) provide the first system-wide empirical study of Bittensor. Their methodology and findings warrant detailed engagement because they establish the empirical baseline against which subsequent work must position itself.

10.1. Methodology

Lui and Sun analyze 6,664,830 events across 64 active subnets and 121,567 unique wallets through February 12, 2025. They construct metrics for validator-stake-reward correlation, miner-performance-reward correlation, and concentration across multiple dimensions. Their performance measure for miners is Bittensor’s own trust metric, which is constructed through the stake-weighted consensus process documented in the subtensor runtime. Their concentration measures include stake Gini coefficients, reward Gini coefficients, and cross-wallet participation patterns.
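For readers unfamiliar with the concentration measure, the following is a minimal sketch of a Gini coefficient over a list of balances or rewards. It uses the common rank-weighted sample formula; whether Lui and Sun use this exact estimator is not established here, so the function should be read as an illustration of the concept only.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0.0 means every holder has the same amount; values approaching 1.0
    mean that a small number of holders account for almost everything.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula on the ascending-sorted sample.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n
```

Under this formula, four wallets with equal stake give 0.0, and four wallets where one holds everything give 0.75, which is the maximum for a sample of four.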

10.2. Key findings

Their reported findings include: very high validator stake-reward correlation (stake is strongly predictive of reward for validators); weak miner performance-reward correlation (the trust metric is weakly predictive of reward for miners); extreme concentration across stake, reward, and participation dimensions; and persistent concentration patterns across time.

10.3. Analytical engagement

The Lui-Sun results support specific claims within the internal-vocabulary scope. Validator stake-reward correlation is consistent with bootstrapping capture and consensus-conformity capture at the validator level. Weak miner performance-reward correlation within the internal trust metric suggests that the stake-weighted aggregation produces results that diverge from the internal performance signal, a finding consistent with weak-construct capture or with a residual divergence between validator-submitted scoring and stake-weighted aggregation. Extreme concentration is consistent with the Matthew-effect dynamics the capitalized-judgment architecture structurally supports (capital-based authority compounds across time).

The Lui-Sun evidence supports weaker claims than external-benchmark evidence would support. Internal trust is constructed from the same stake-weighted consensus that determines reward; correlations between internal-trust and reward therefore cannot fully distinguish calibrated protocol behavior from protocol behavior that satisfies its own internal vocabulary without external calibration. Lui-Sun themselves note this limitation. The evidence establishes that Bittensor exhibits specific patterns within its own evaluative vocabulary and leaves external-benchmark calibration as an open empirical question.

11. Traced SN1 Packet

SN1 text-prompting supplies one usable subnet-level entry point for detailed packet-trace analysis. The historical OpenValidators mirror preserves netuid = 1 run metadata and raw parquet rows with prompts, queried uids, completions, rewards, blocks, and run-local snapshots. The mirror enables reconstruction of a concrete validator workflow with event-level linkage.

11.1. Closure against archive-side state

The first ten answer-task rows in the SN1 slice close against archive-side Weights and historical NeuronInfoLite state across all 500 queried miner observations. The paired save-state rows preserve the validator’s moving_averaged_scores vector over the metagraph rather than only the reward outputs. Within that bounded window, rewarded queried miners sit far above zero-reward queried miners on the saved score surface in every row that pays positive rewards. The early snapshot transitions reconstruct from the prior snapshot plus the intervening task reward rows under the validator’s own EMA rule.
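The snapshot-replay check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the update is a standard exponential moving average applied only to queried uids, the smoothing factor `alpha` is a free parameter, and the function names are hypothetical. It does not reproduce the validator's actual code or its configured value.

```python
def ema_update(prev_scores, queried_uids, rewards, alpha):
    """Return a new score vector after one task row.

    Only queried uids move; every other uid keeps its previous score.
    """
    new_scores = list(prev_scores)
    for uid, reward in zip(queried_uids, rewards):
        new_scores[uid] = alpha * reward + (1 - alpha) * prev_scores[uid]
    return new_scores


def replay(snapshot, task_rows, alpha):
    """Replay task rows, each a (queried_uids, rewards) pair, from a snapshot."""
    scores = list(snapshot)
    for queried_uids, rewards in task_rows:
        scores = ema_update(scores, queried_uids, rewards, alpha)
    return scores


def max_abs_gap(a, b):
    """Largest elementwise gap between two score vectors."""
    return max(abs(x - y) for x, y in zip(a, b))
```

A transition closes in this sense when `max_abs_gap(replay(prior_snapshot, rows, alpha), next_snapshot)` is at the level of floating-point noise.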

11.2. Task-family divergence

Across the recovered OpenValidators runs, the followup family moves the saved score state at least as much as, and usually more than, the answer family. The form of reward anticipation changes with reward density: the sparse regime is thin-support and ranking-sensitive; the dense regime is broad-support and inclusion-sensitive. Both traced netuid = 1 runs use the same three-component reward base, with 0.3 DPO, 0.3 reciprocate, and 0.4 RLHF. The traced workflow cannot be described through the audited repo default alone. The normalized component reward surfaces already split in the same sparse-versus-dense direction across augment, followup, and answer families alike.

11.3. Late-step narrowing mechanism

Once the weighted base is reconstructed from the OpenValidators rows and fitted reward surfaces, the later-step followup ladder becomes clearer. In the dense run, weighted-base support stays almost flat from 98.90% at followup0 to 98.75% at followup3, while relevance support falls from 94.35% to 82.87%, and task validation carries it the rest of the way to 79.17%, nearly matching final support at 79.11%. The sparse run shows the same sequence on a thinner scale. In both runs, the main narrowing burden enters at relevance and then at task validation, with diversity and NSFW contributing little.
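The support ladder can be sketched as a weighted base followed by successive filters. The component weights are those reported for the traced runs; the multiplicative application of filters, the stage order, and the row layout are assumptions made for illustration rather than a reproduction of the validator code.

```python
BASE_WEIGHTS = {"dpo": 0.3, "reciprocate": 0.3, "rlhf": 0.4}  # from the traced runs
FILTER_ORDER = ["relevance", "task_validation", "diversity", "nsfw"]  # assumed order


def weighted_base(components):
    """Weighted sum of normalized component rewards for one completion."""
    return sum(BASE_WEIGHTS[name] * components[name] for name in BASE_WEIGHTS)


def support_ladder(rows):
    """Share of rows with positive reward after each successive stage.

    Each row is {"components": {...}, "masks": {...}}, with mask values
    in [0, 1]; a mask of 0 removes the row from the support.
    """
    n = len(rows)
    current = [weighted_base(r["components"]) for r in rows]
    ladder = {"weighted_base": sum(v > 0 for v in current) / n}
    for stage in FILTER_ORDER:
        current = [v * r["masks"][stage] for v, r in zip(current, rows)]
        ladder[stage] = sum(v > 0 for v in current) / n
    return ladder
```

Read this way, the stage at which the support share drops most is the stage carrying the narrowing burden; in the dense run described above, that drop sits at relevance and then at task validation.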

11.4. Relevance-filter mechanism

The September 2023 branch that matches the recovered prompt flow places followup task validation mainly in a wrapper-content filter that zeroes outputs leaking answer or summary scaffolding into a question. In the dense run, that historical rule matches the logged task-validator surface at about 99%. The widening later-step burden sits mainly in relevance-only failures, which rise from 4.52% at followup0 to 15.45% at followup3, while task-only failures stay in the 3-4% range. Those relevance-only failures grow more anchored to prompt language as the ladder advances: their prompt-overlap rises from 0.2178 to 0.5209, while prompt-overlap among accepted rows rises from 0.4527 to 0.6038, and application or impact drift rises from 22.54% to 40.83% of relevance-only failures.

11.5. Score-mechanism properties

At the accepted-completion level, prompt fit becomes modestly informative about score movement. The correlation between overlap and absolute EMA movement rises from 0.149 at followup1 to 0.210 at followup3, while the correlation between overlap and raw reward stays near zero. That pattern survives the split between the raw_context and qa_context prompt families. By followup3, accepted completions in the highest-fit quartile move saved score about 40% more than those in the lowest-fit quartile, even though mean reward is slightly lower and mean prior saved score is nearly identical. The OpenValidators packet therefore looks more like revision around a prior score state than like simple reward maximization.
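The quartile comparison can be sketched in a few lines. The input layout (pairs of prompt overlap and absolute EMA movement) and the simple quarter split by sorted position are assumptions; the trace's own binning rule is not reproduced here.

```python
def quartile_movement_ratio(rows):
    """rows: (overlap, abs_ema_movement) pairs for accepted completions.

    Returns mean movement in the highest-overlap quarter divided by mean
    movement in the lowest-overlap quarter.
    """
    ordered = sorted(rows, key=lambda r: r[0])
    q = len(ordered) // 4
    if q == 0:
        raise ValueError("need at least four rows")
    low = [movement for _, movement in ordered[:q]]
    high = [movement for _, movement in ordered[-q:]]
    return (sum(high) / q) / (sum(low) / q)
```

A ratio near 1.4 would correspond to the roughly 40% difference reported above.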

11.6. Score-to-weight alignment

A direct score-to-weight alignment pass hardens the final step. In the sparse run, a wrap-aligned snapshot at block 1164303 reconstructs the stored chain row almost exactly, with 142 of 145 emitted values matching integer-for-integer. In the dense run, same-block wrap-aligned candidates do not collapse to literal equality, but they preserve near-complete support closure and strong order continuity between locally reconstructed emitted rows and historical chain rows. Across 23 dense wrap candidates, the nearest prior snapshot beats the same-block snapshot on shared-support Pearson, shared-support Spearman, and mean absolute shared difference in every case, and improves top-50 overlap in 21 of 23, with prior lag centered at 29 blocks. The traced validator loop is consistent with that shape: once the epoch boundary is crossed, the code submits weights without waiting for finalization and then immediately logs save_state() at the current block. The dense mismatch reads as a timing problem rather than as a failure of local judgment to reach chain-facing weight.
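The alignment measures named above can be computed on the shared support of two weight rows, as in the following sketch. The dictionary layout (uid to emitted integer weight) and the function names are assumptions; the sketch also assumes neither row is constant on the shared support, since the correlations are undefined in that case.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """Average ranks (1-based), so tied values share a rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def top_k(weights, k):
    """The k uids with the largest weight in a {uid: weight} row."""
    return set(sorted(weights, key=weights.get, reverse=True)[:k])


def alignment(local, chain, k=50):
    """Compare two {uid: weight} rows on the uids they share."""
    shared = sorted(set(local) & set(chain))
    x = [local[u] for u in shared]
    y = [chain[u] for u in shared]
    return {
        "pearson": pearson(x, y),
        "spearman": pearson(ranks(x), ranks(y)),  # Pearson on ranks
        "mean_abs_diff": sum(abs(a - b) for a, b in zip(x, y)) / len(shared),
        "top_k_overlap": len(top_k(local, k) & top_k(chain, k)),
        "exact_matches": sum(a == b for a, b in zip(x, y)),
    }
```

Running `alignment` once with the same-block snapshot and once with the nearest prior snapshot, against the same chain row, gives the kind of side-by-side comparison the dense-run analysis relies on.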

11.7. Residual error structure

The remainder is not concentrated in the decisive upper ranks. The upper 250 ranks carry about 29.5% of chain mass but only 26.3% of dense error; the long tail carries about 70.5% of chain mass and 73.7% of the error. Replacing the simplified local reconstruction with the stricter client-plus-chain route changes the dense row by only about one integer unit on average, leaves top-50 overlap and exact-ratio unchanged, and moves Pearson only at the 1e-08 scale. Interval replay narrows the same point further: 21 of 23 dense intervals close back onto the saved snapshot to numerical precision; the two outliers reduce to hotkey-reassignment resets.

11.8. Implications for the architectural analysis

The SN1 packet trace supports specific architectural claims. The validator’s local scoring process is reconstructable: the components (DPO, reciprocate, RLHF), the stages (relevance filtering, task validation, diversity, NSFW), and the late-step narrowing mechanism (relevance filter driving support loss) are visible in the trace. The validator’s score-to-weight pipeline is reconstructable: local scores become EMA-weighted moving_averaged_scores, which normalize into on-chain weights. The packet trace therefore establishes that a real internal scoring process exists and connects to chain-facing weight through documented mechanisms.

What the SN1 trace does not establish is external calibration. The local scoring mechanism uses RLHF, DPO, and reciprocate components that are themselves protocol-internal or closely protocol-adjacent measures. Whether those components correlate with external performance benchmarks for the task (text-prompting quality for end-user applications, for instance) remains an open question. The SN1 trace bounds the empirical burden: future research can investigate calibration at the specific subnet with specific benchmarks, rather than attempting to establish calibration for Bittensor as a whole.

12. Falsification

Three operationalized conditions would weaken the Article’s claims.

12.1. Calibrated capital-weighted authority

The capitalized-judgment-miscalibration claim weakens if subnet-level studies with external performance benchmarks show that higher-authority validators are consistently better calibrated to task performance than lower-authority validators across a broad range of subnet types.

Observable metric: correlation between validator stake and validator accuracy against external performance benchmarks across subnet types.

Measurement protocol: identify at least five subnets with tasks that admit external benchmarks. For each subnet, construct an external benchmark and measure each validator’s accuracy against it. Compare validator accuracy to validator stake.

Falsification threshold: statistically significant positive correlation (r > 0.5) between validator stake and external-benchmark accuracy across at least three of the five studied subnets.

12.2. Reward-performance alignment

The signal-capital-substitution claim weakens if reward allocation tracks external performance strongly across the majority of studied subnets while stake concentration has only a limited marginal effect on who receives emissions.

Observable metric: correlation between miner reward and external-benchmark performance, controlling for validator-stake-weighting effects.

Measurement protocol: identify at least five subnets with tasks that admit external benchmarks. For each subnet, correlate miner reward with external-benchmark performance. Decompose the correlation into stake-weighted and non-stake-weighted components.

Falsification threshold: statistically significant positive correlation (r > 0.5) between miner reward and external-benchmark performance across the majority of studied subnets, with stake-weighting effects explaining less than 30% of the miner-reward variance.

12.3. Absence of launch-phase capital capture

The bootstrapping-capture claim weakens if launch and continuation evidence shows that evaluative substance reliably precedes capital inflow and validator trust bootstrapping, making the BIT-0002 and BIT-0004 pathologies exceptional rather than structurally recurrent.

Observable metric: temporal ordering of evaluative-substance markers (functional code, task specification, benchmark publication) and capital-inflow markers (validator stake, child-hotkey delegation, token trading volume) across subnet launch cohorts.

Measurement protocol: for each subnet launched in a defined observation window, identify the date of first functional code deployment, first task specification publication, first benchmark publication, first validator stake establishment, first child-hotkey delegation, and first significant token trading volume. Measure the proportion of subnets where evaluative-substance markers precede capital-inflow markers.

Falsification threshold: evaluative-substance markers precede capital-inflow markers in more than 70% of observed subnet launches across at least 20 launches.
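The ordering test can be sketched as follows. The record layout is hypothetical, and the sketch adopts one possible reading of "precede": the earliest evaluative-substance marker is dated strictly before the earliest capital-inflow marker. Other readings (for example, requiring every substance marker to come first) would need a different comparison.

```python
from datetime import date


def substance_precedes_capital(launch):
    """True when the earliest substance marker predates the earliest
    capital-inflow marker (an assumed reading of "precede")."""
    return min(launch["substance"]) < min(launch["capital"])


def ordering_check(launches, share_needed=0.70, min_launches=20):
    """Return (share, verdict); verdict is None when the sample is too small."""
    share = sum(substance_precedes_capital(l) for l in launches) / len(launches)
    if len(launches) < min_launches:
        return share, None
    return share, share > share_needed
```

Returning `None` for small samples keeps an underpowered cohort from being read as either confirmation or falsification.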

12.4. Alternative explanations

Two alternative explanations for the observed patterns deserve explicit engagement.

Token-market-efficiency as primary driver. An alternative explanation: the patterns Lui-Sun observe and the governance pressures BIT-0002 and BIT-0004 document reflect efficient token-market-sorting rather than signal-capital substitution. The efficient-market view holds that capital flows toward subnets with higher expected returns, which correlate with evaluative quality because high-quality subnets produce sustained demand. The Article’s response: the token-market-efficiency view predicts that capital flows follow quality over time. The observed patterns (bootstrapping capture, weak miner performance-reward correlation, extreme concentration) would be transient under efficient-market conditions. Their persistence across the Lui-Sun study period and the recurring governance pressures documented in the BIT process suggest that capital-flow-to-quality mapping is weak at best. The efficient-market explanation therefore requires additional mechanism specification to account for persistent divergence between capital and quality.

Protocol-design evolution as primary driver. An alternative explanation: the patterns reflect transitional features of an evolving protocol that will converge to better capital-quality alignment as the protocol matures. The Article’s response: the structural features identified (stake-weighted consensus, stake-weighted dividend, stake-weighted rank) are core architectural choices rather than transitional features. Protocol evolution would have to modify these choices to remove the underlying structural conditions. Absent such modification, the signal-capital-substitution pathway persists as a structural possibility.

13. Scope of Inference

The available record is architecture-first rather than a subnet-by-subnet quality audit. Lui and Sun’s study ends on February 12, 2025 and relies on a protocol-native performance proxy, which limits what can be said about external quality. The subtensor runtime and traced public surfaces show how authority is weighted and how rewards are distributed; they do not by themselves establish evaluator competence, collusion, or construct validity for every subnet task.

The historical OpenValidators mirror supplies one usable SN1 trace surface. The first ten answer-task rows in that surface close against historical archive-side identity, weight, and saved-score state. That surface is historically bounded and narrower than the full local validator event schema.

Replacing the simplified local reconstruction with the stricter client-plus-chain route changes dense rows by only about 1-4 integer units across the whole emitted row, does not move top-50 overlap or exact-ratio, and changes Pearson only at the 1e-08 scale. The open dense remainder is better treated as a lag-and-residual-transform problem.

Narrow chain-side structure does not identify a dead or empty subnet on its own. A traced calibration contrast is missing. Stronger calibration claims within SN1 will require an adjudication layer external to the subnet’s own reward machinery. The finding here is bounded to pressures inside Bittensor’s stake-weighted evaluation architecture.

The analysis does not measure subnet-level construct validity for each of the 64 active subnets in the Lui-Sun study because subnet-level benchmarking is not yet available at that scale. The architectural claim (capitalized judgment is the runtime’s formalization pattern) is supported by the runtime evidence. The failure-mode claim (signal-capital substitution is a structural possibility under specific conditions) is supported by architectural analysis, governance evidence, and system-wide empirical patterns. The calibration claim at specific subnets remains an empirical question that subnet-level research can address.

14. Conclusion

Bittensor’s architectural choice is to formalize authority over evaluation rather than evaluation itself. The subtensor runtime at specific code locations implements stake-weighted aggregation over validator-submitted weight vectors; the underlying evaluation is produced residually in subnet-specific validator practice. The two mechanisms this Article develops (capitalized judgment as architectural posture, signal-capital substitution as failure mode) name the institutional pattern this architectural choice produces.

Applying construct-validity theory (Cronbach and Meehl 1955; Adcock and Collier 2001), commensuration research (Espeland and Stevens 1998; Porter 1995), and classification theory (Bowker and Star 1999), the Article specifies the analytical implications of the capitalized-judgment architecture. The protocol performs commensuration at runtime scale, converting heterogeneous subnet-level judgments into a single reward-bearing metric through mechanisms that weight authority by capital. The commensuration inherits all validity failures that occur at earlier transitions (background concept, systematized concept, indicator, scores), because the runtime has no mechanism for detecting or correcting failures at those transitions.

The governance record (BIT-0002, BIT-0004) documents specific pressures within the architecture. The system-wide empirical record (Lui and Sun 2025) documents specific patterns within the protocol’s internal vocabulary. The SN1 packet trace documents a concrete local scoring mechanism and its connection to chain-facing weights. Together, these sources support architectural claims about the capitalized-judgment mechanism and failure-mode claims about signal-capital substitution as a structural possibility. Calibration claims for specific subnets require external-benchmark evidence that future research can develop.

The residual-formalization research programme predicts that residuals do not remain inert. In Bittensor, the residual evaluation layer (task definition, scoring methodology, construct operationalization) is where institutional consequence accumulates. What the runtime does not formalize, subnets must supply through their own practice, and that supply determines whether the protocol’s capitalized-judgment architecture produces calibrated or miscalibrated reward allocation. The protocol’s formal openness to subnet variation coexists with structural pressure toward capital-aligned authority, and the balance between these forces is the determinative question for effective evaluation in the Bittensor ecosystem.

References

Adcock, Robert, and David Collier. 2001. "Measurement Validity: A Shared Standard for Qualitative and Quantitative Research." American Political Science Review 95(3): 529-546.

Alston, Eric, Wilson Law, Ilia Murtazashvili, and Martin Weiss. 2022. "Blockchain Networks as Constitutional and Competitive Polycentric Orders." Journal of Institutional Economics 18(5): 707-723.

Bowker, Geoffrey C., and Susan Leigh Star. 1999. Sorting Things Out: Classification and Its Consequences. MIT Press.

Bucher, Taina, and Tarleton Gillespie. 2023. "Platform Classification: A Research Agenda." Big Data and Society 10(2): 1-14.

Burnell, Ryan, Wout Schellaert, John Burden, Tomer D. Ullman, Fernando Martinez-Plumed, Joshua B. Tenenbaum, Danaja Rutar, Lucy G. Cheke, Jascha Sohl-Dickstein, Melanie Mitchell, Douwe Kiela, Murray Shanahan, Ellen M. Voorhees, Anthony G. Cohn, Joel Z. Leibo, and Jose Hernandez-Orallo. 2023. "Rethink Reporting of Evaluation Results in AI." Science 380(6641): 136-138.

Cassarino, Mariachiara. 2022. "Evaluative Labor: Workers and Platforms in the Algorithmic Economy." Work, Employment and Society 36(5): 883-901.

Cong, Lin William, Ye Li, and Neng Wang. 2022. "Token-Based Platform Finance." Journal of Financial Economics 144(3): 972-991.

Cong, Lin William, Campbell R. Harvey, Daniel Rabetti, and Zong-Yu Wu. 2025. "An Anatomy of Decentralized Finance." Journal of Corporate Finance forthcoming.

Cong, Lin William, Eswar S. Prasad, and Daniel Rabetti. 2024. "Evaluating the Impact of Regulatory Policies on Crypto Asset Markets." NBER Working Paper 33639. National Bureau of Economic Research.

Cronbach, Lee J., and Paul E. Meehl. 1955. "Construct Validity in Psychological Tests." Psychological Bulletin 52(4): 281-302.

Davidson, Sinclair, Primavera De Filippi, and Jason Potts. 2018. "Blockchains and the Economic Institutions of Capitalism." Journal of Institutional Economics 14(4): 639-658.

Espeland, Wendy Nelson, and Mitchell L. Stevens. 1998. "Commensuration as a Social Process." Annual Review of Sociology 24: 313-343.

Gray, Mary L., and Siddharth Suri. 2019. Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. Houghton Mifflin Harcourt.

Liang, Percy, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, and Yuta Koreeda. 2023. "Holistic Evaluation of Language Models." Transactions on Machine Learning Research 2023: 1-108.

Lui, Elizabeth, and Jiahao Sun. 2025. "Bittensor Protocol: The Bitcoin in Decentralized Artificial Intelligence? A Critical and Empirical Analysis." arXiv:2507.02951.

Porter, Theodore M. 1995. Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton University Press.

Raji, Inioluwa Deborah, Emily M. Bender, Amandalynne Paullada, Emily Denton, and Alex Hanna. 2021. "AI and the Everything in the Whole Wide World Benchmark." Proceedings of the NeurIPS Benchmarks and Datasets Track.

Raji, Inioluwa Deborah, Morgan K. Scheuerman, and Razvan Amironesei. 2021. "You Can’t Sit With Us: Exclusionary Pedagogy in AI Ethics Education." Proceedings of the ACM FAccT Conference: 515-525.

Rao, Yuma, Jacob Steeves, Ala Shaabana, Daniel Attevelt, and Matthew McAteer. 2020. Bittensor: A Peer-to-Peer Intelligence Market. https://bittensor.com/whitepaper

Schaedler, Lukas, Jan Krohn, and Orestis Papakyriakopoulos. 2023. "Blockchain Governance: A Systematic Literature Review." Information Systems Management 40(3): 231-254.

van Pelt, Rowan, Slinger Jansen, Djuri Baars, and Sietse Overbeek. 2021. "Defining Blockchain Governance: A Framework for Analysis and Comparison." Information Systems Management 38(1): 21-41.

opentensor. 2024. "BIT-0002: dTAO Subnet Start Call." https://github.com/opentensor/bits/blob/7ff32e030e6cd080a1feb9fa4234b852e31d783d/bits/BIT-0002-Start-Call.md

opentensor. 2024. "BIT-0004: Subnet Deregistration." https://github.com/opentensor/bits/blob/7ff32e030e6cd080a1feb9fa4234b852e31d783d/bits/BIT-0004-subnet-deregistration.md

opentensor. 2024. "Bittensor Subnet Template." https://github.com/opentensor/bittensor-subnet-template/blob/c355fbf44df72b1f526d723d5fc3a8179c16b9af/template/base/validator.py

opentensor. 2026. "Subtensor Runtime Epoch Logic." https://github.com/opentensor/subtensor/blob/f65a8a3b093863803500406d4dd8a661d9f88989/pallets/subtensor/src/epoch/run_epoch.rs

opentensor. 2026. "Subnet Emissions Logic." https://github.com/opentensor/subtensor/blob/f65a8a3b093863803500406d4dd8a661d9f88989/pallets/subtensor/src/coinbase/subnet_emissions.rs

opentensor. n.d. "Yuma Consensus." https://docs.learnbittensor.org/learn/yuma-consensus