
Cocoon by Durov: How Private GPU Compute on TON Shifts AI Costs and Risks

Pavel Durov launched Cocoon.org, a decentralized confidential compute network on TON for private AI inference. GPU owners earn TON while developers get market-rate compute protected by TEEs. For business, this offers a crucial alternative to cloud giants regarding cost, privacy, and compliance strategies.

Technical Context

Cocoon.org serves as the public entry point for Cocoon (Confidential Compute Open Network), a network announced and launched by Pavel Durov in late 2025. Essentially, it is a marketplace for AI inference compute (and potentially other tasks) where GPU owners lease their power, and developers purchase it using Toncoin (TON). The core focus is privacy: processing must occur in a way that neither the node operator nor the platform can access the data or requests.

This is a fresh development by industry standards: launched in November 2025, and as of February 2026 the project is in an early-growth phase, testing its viability under real load. A crucial market signal is that Telegram is the first major consumer: the network is processing real requests for AI features within the ecosystem.

What is actually "new" in Cocoon's architecture

  • Decentralized GPU Pool: Compute is supplied by independent hardware owners who receive rewards in TON.
  • Confidential Computing: A declared model where data remains protected during execution using Trusted Execution Environments (TEE)—hardware-isolated execution environments.
  • Payments and Settlement via TON: The blockchain acts as the settlement and incentive layer for compute providers.
  • Focus on AI Inference: Not just generic cloud computing, but primarily executing AI requests (e.g., summarization, generation, classification).
  • Documentation and Open Source: Cocoon.org highlights the availability of docs and source code, increasing engineering trust, though not eliminating the need for audits and threat verification.

Technical nuances businesses must not overlook

  • TEE is not magic: Trust shifts from the cloud provider to the hardware supply chain, firmware, drivers, and TEE implementation. Practically, this means the threat model must be rebuilt.
  • Inference ≠ Training: The economic model, latency, and network requirements for inference are often simpler than for training. Cocoon logically starts with inference, where predictability is easier to ensure.
  • Data Confidentiality vs. Metadata Confidentiality: Even with payload encryption, questions remain regarding what can be inferred from timing, volume, request frequency, routing, and billing.
  • Quality of Service and SLA: Decentralization almost always complicates guarantees regarding response time, stability, and performance reproducibility (especially for production user functions).
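The QoS point above has a direct engineering consequence: in a decentralized pool there is no single-vendor SLA, so the client layer has to own latency budgets and fallback itself. A minimal sketch of that pattern, with illustrative timeout values and stand-in "node" functions (real calls would hit network endpoints):

```python
import time

def call_with_fallback(providers, timeout_s=0.5):
    """Try independent nodes in order until one answers within budget.

    In a decentralized GPU pool, per-node timeouts plus client-side
    fallback replace the single-provider SLA a big cloud would offer.
    """
    for call in providers:
        start = time.monotonic()
        try:
            result = call()
        except Exception:
            continue  # node failed outright; try the next one
        if time.monotonic() - start <= timeout_s:
            return result
        # Node answered too slowly for the SLO; keep trying.
    raise RuntimeError("no provider met the latency budget")

# Stand-in providers for illustration only.
def flaky_node():
    raise RuntimeError("node unavailable")

def healthy_node():
    return "ok"
```

The design choice here is deliberate: reproducible performance becomes a property of your client logic, not a promise from the network.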

Business & Automation Impact

For business, Cocoon is interesting not as "just another crypto project," but as an attempt to alter the basic economics and risk profile of AI inference. If the network can truly scale and maintain acceptable service quality, a third path emerges between "expensive but reliable Big Cloud" and "cheap but complex on-prem."

Where this offers direct value

  • Private Inference for Sensitive Data: Finance, legal documents, internal knowledge bases, personal data—areas where leaks are particularly costly.
  • Lowering Compute Costs: As network-side GPU supply grows, inference prices may become more competitive than centralized providers (not always, but a window of opportunity exists).
  • Fast Start for Telegram Products: If you are building Mini Apps/bots and AI features in Telegram, infrastructure "proximity" can reduce integration friction.
  • Global Availability: The "any GPU owner can be a provider" model potentially expands capacity geography and reduces dependence on specific regions.

Who wins and who is at risk

Winners: Teams that need private inference and have the engineering maturity to match it: key management, access policies, observability, quality A/B testing, and the architectural competence to design AI solutions around the new risks.

At Risk: Providers selling "just GPUs" without added value, and solutions where privacy is a "checkbox" option. Cocoon markets privacy as the base layer, raising market expectations.

How AI adoption architecture changes

Previously, the typical scheme was "app → Cloud API → model → response." Now, a variant appears: "app → encryption/attestation → decentralized compute → response," where the critical elements become:

  • Cryptographic Wrapper: Key management, rotation, storage policies, role separation.
  • Remote Attestation: Proof that code is truly executing in a trusted environment, not on a "swapped" node.
  • Observability without Leaks: Logs, metrics, tracing—must be built so prompts/data fragments do not leak.
  • Hybrid Schemes: Keeping some requests in classic clouds (non-critical data) and sending others to confidential compute. This is often more practical than an "all or nothing" approach.
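The hybrid scheme above can be sketched as a simple routing layer: classify each request's sensitivity, then pick the execution path. The endpoint URLs and sensitivity tiers below are illustrative assumptions, not actual Cocoon API surfaces:

```python
from dataclasses import dataclass
from enum import Enum

class Sensitivity(Enum):
    PUBLIC = 1        # safe for a classic cloud API
    INTERNAL = 2      # confidential compute preferred
    CONFIDENTIAL = 3  # confidential compute required

@dataclass
class InferenceRequest:
    prompt: str
    sensitivity: Sensitivity

# Hypothetical endpoints for illustration; not real Cocoon URLs.
CLOUD_ENDPOINT = "https://cloud.example.com/v1/infer"
CONFIDENTIAL_ENDPOINT = "https://tee.example.com/v1/infer"

def route(req: InferenceRequest) -> str:
    """Send non-sensitive traffic to the classic cloud; everything
    else goes to the confidential path (attestation check not shown)."""
    if req.sensitivity is Sensitivity.PUBLIC:
        return CLOUD_ENDPOINT
    return CONFIDENTIAL_ENDPOINT
```

In production, the routing decision would also consult the remote attestation result before releasing any payload to the confidential path.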

In practice, companies often stumble not on the API itself, but on the end-to-end process design: from data classification and compliance to cost monitoring and quality degradation. This is where AI implementation turns from an experiment into an engineering product—requiring architects, not just developers.

Risks to evaluate before a pilot

  • Regulation and Token Accounting: Paying in TON may require a separate legal and financial setup (accounting, taxes, treasury policy).
  • Ecosystem Vendor Risks: Although the network is "decentralized," Telegram drives demand. Any changes in product strategy can impact the market.
  • SLA and Support: Businesses need guarantees. Decentralized networks often evolve faster than corporate support procedures form.
  • TEE Security: TEEs have had and will have vulnerabilities. You need to design "for compromise": data minimization, segmentation, limits, anomaly detection.
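"Designing for compromise" starts with data minimization: strip obvious identifiers before a prompt ever leaves the trusted perimeter, so a TEE breach exposes less. A minimal redaction pass is sketched below; the regex patterns are illustrative only and nowhere near exhaustive:

```python
import re

# Illustrative patterns; a real deployment would use a vetted
# PII-detection library and cover many more identifier types.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def minimize(prompt: str) -> str:
    """Replace obvious identifiers with placeholders before the
    prompt is sent to external (even confidential) compute."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    prompt = CARD.sub("[CARD]", prompt)
    return prompt
```

Minimization composes with the other measures listed above: even if segmentation and anomaly detection fail, redacted payloads cap the blast radius.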

Expert Opinion: Vadym Nahornyi

Cocoon is not an "AWS replacement," but an attempt to shift the trust point in AI inference. If the project maintains its pace, the market will see a new category: confidential compute as a commodity, accessible to developers as easily as model APIs are today.

At Nahornyi AI Lab, we regularly hear the same request: "we want AI automation, but data cannot be sent to the public cloud." Usually there are two options: expensive on-prem or privacy compromises. Cocoon is interesting because it offers a third scenario: outsourced inference with technically declared confidentiality. At the implementation level, however, I would highlight three practical lessons.

1) Start with data classification, not the platform

Before choosing Cocoon (or any confidential platform), we create a matrix: what data can be processed outside the perimeter, what cannot, and what can be processed after redaction/masking. This directly impacts architecture, budget, and timelines. Otherwise, the pilot "takes off," but production hits a wall of security and compliance.
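A toy version of that classification matrix, mapping each data class to a handling decision made before any platform choice (the class names and policies are hypothetical examples):

```python
# Hypothetical data classes and policies for illustration.
MATRIX = {
    "public_docs":       "outside_ok",             # may leave the perimeter
    "internal_kb":       "outside_after_masking",  # only after redaction
    "customer_pii":      "perimeter_only",         # never leaves
    "financial_records": "perimeter_only",
}

def handling(data_class: str) -> str:
    """Look up the handling policy; unknown classes default to the
    most restrictive option, which is the safe failure mode."""
    return MATRIX.get(data_class, "perimeter_only")
```

The restrictive default is the important design choice: new, unclassified data sources should never silently flow to external compute.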

2) Privacy does not cancel out quality and cost control

For business, it matters not only that "no one sees the prompt," but also that answers are stable. This means you need: test sets, quality metrics, drift control, token/latency limits, and observability. Without this, AI automation becomes an expensive and unpredictable service.
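The token and latency limits mentioned above can be enforced with a small guardrail check; the budget numbers here are placeholder assumptions, to be tuned per use case:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Guardrails:
    # Illustrative budgets; tune these against your own SLOs and costs.
    max_tokens: int = 1024
    max_latency_s: float = 2.0

def within_budget(tokens_used: int, latency_s: float,
                  g: Optional[Guardrails] = None) -> bool:
    """Flag responses that blow the token or latency budget so the
    caller can fall back, retry elsewhere, or raise an alert."""
    g = g or Guardrails()
    return tokens_used <= g.max_tokens and latency_s <= g.max_latency_s
```

Paired with test sets and drift monitoring, checks like this turn "the answers seem fine" into measurable, alertable properties.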

3) The main intrigue is scaling and standardization

The network can quickly gain demand (especially with Telegram as an anchor), but GPU supply and node quality are harder. If Cocoon can standardize hardware requirements, ensure attestation, transparent billing, and predictable performance, the chance of "taking off" is high. If not, it will remain a niche solution for enthusiasts and specific scenarios.

My forecast is pragmatic: there will be hype, but real value will emerge for companies that use Cocoon as part of a hybrid strategy—and carefully build AI integration into processes, rather than just "moving" requests to a new network.

Theory is good, but results require practice. If you want to understand if the Cocoon/TON approach fits your company, and how to safely embed private inference into your product or internal processes, let's discuss the task at Nahornyi AI Lab. I, Vadym Nahornyi, am responsible for architecture and implementation quality—from pilot to industrial operation.
