Perplexity launches hybrid inference for PC and cloud
At Computex 2026 Perplexity introduced a hybrid agentic inference orchestrator that splits AI work between Windows PCs and cloud models; it will ship in Perplexity Computer in July.
Perplexity announced a hybrid agentic inference orchestrator at Computex 2026 in Taipei on June 2. The feature will arrive in the Perplexity Computer app for Windows in July. It automatically divides a single AI task between a compact local model on the user’s PC and more powerful models in the cloud.
Perplexity CEO Aravind Srinivas demonstrated the system on stage alongside Intel CEO Lip-Bu Tan. The live demo ran on an Intel Core Ultra Series 3 processor. The company described the local component as a traffic controller that determines what data should stay on the device for privacy and which task elements require a cloud frontier model for greater capability.
Simple operations such as formatting text, summarizing a document stored on the machine, or lightweight classification will be handled locally. More complex reasoning or tasks that need a frontier model’s capacity will be routed to Perplexity’s cloud servers. Routing can occur in the middle of a task and is intended to happen automatically without manual mode selection by the user.
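The split described above can be pictured with a minimal sketch. Perplexity has not published its routing logic, so everything here is an assumption for illustration: the step names, the `LIGHT_KINDS` set, and the `route` and `plan` functions are hypothetical, and the rule simply mirrors the examples in the announcement (light steps stay local, everything else goes to the cloud). Because the decision is made per step, a single task can change destination partway through.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    LOCAL = "local"   # compact model on the user's PC
    CLOUD = "cloud"   # frontier model on remote servers


# Hypothetical step kinds; these three mirror the "simple operations"
# named in the article. The real system's categories are not public.
LIGHT_KINDS = {"format_text", "summarize_local_document", "classify"}


@dataclass
class Step:
    kind: str


def route(step: Step) -> Target:
    """Choose where one step runs (illustrative rule, not Perplexity's)."""
    return Target.LOCAL if step.kind in LIGHT_KINDS else Target.CLOUD


def plan(task: list[Step]) -> list[tuple[str, Target]]:
    """Route every step independently, so the target can change mid-task."""
    return [(step.kind, route(step)) for step in task]


if __name__ == "__main__":
    task = [
        Step("summarize_local_document"),
        Step("multi_step_reasoning"),
        Step("format_text"),
    ]
    for kind, target in plan(task):
        print(f"{kind} -> {target.value}")
```

A production router would presumably weigh more than a fixed category list, for example data sensitivity or the estimated difficulty of a step, but the company has not described those signals.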
Perplexity framed the orchestrator as a way to balance model accuracy, user privacy and compute cost. The company has described the objective as maximizing ‘token value per watt’ for each user: shifting some inference workload onto users’ hardware reduces its centralized compute bills. Perplexity also reported revenue growth from $100 million to $500 million while headcount increased about 34 percent.
The local component is a compact model packaged with the Perplexity app rather than an open-source, self-hosted model. Cloud processing continues to route through Perplexity’s servers, so fully offline self-hosted inference is not included in this release.
Perplexity said the orchestrator is chip-agnostic. The Computex demo used Intel silicon, and the company plans to support processors from other vendors, including Nvidia. The feature is currently exclusive to the Windows PC app, with no announced timeline for macOS or Linux releases.
The offering joins a broader industry trend toward on-device and hybrid inference. Hardware and software vendors have released options for running models locally and for combining local and server-side execution. Perplexity positions its orchestration layer as the mechanism that decides, in real time, where each part of a task should run.
Perplexity intends the July rollout to be the first large-scale test of the system’s automatic routing decisions in real-world use. The company has not published support dates for additional platforms or detailed performance metrics for the public release.