BipHoo UK

Why the future of AI is on-premises - business advice from Dell Tech World 2026

May 30, 2026  Twila Rosenbaum

Nearly every technology conference today has a focus on artificial intelligence, and this week's Dell Technologies World was no exception. But what stood out was the focus on how businesses can actually execute on AI, especially by building more AI capabilities into their infrastructure. As the industry shifts from experimentation to production, enterprises are increasingly recognizing that the public cloud is not the only—or even the best—place to run AI workloads. The message from Dell Tech World 2026 was clear: the future of enterprise AI is on-premises, or at least hybrid, and companies that ignore this reality risk falling behind in cost management, data governance, and operational control.

The trends at Dell Tech World 2026 that are pushing businesses to expand their on-premises AI capabilities include growing demand for data and AI sovereignty, the need for tighter governance (especially for agents), and more direct control over these critical systems. These themes ran through keynotes, breakout sessions, and product announcements. The conference made it clear that the era of simply calling an API for a large language model (LLM) and hoping for the best is giving way to a more mature approach in which infrastructure is tailored to specific business needs.

One of the most compelling drivers of on-premises AI is the economics of token consumption. In his keynote, Dell chairman and CEO Michael Dell introduced the concept of "tokenomics," a term for the rapidly rising costs of LLM usage in the cloud. Dell Technologies vice chairman and COO Jeff Clarke reinforced the point in his Day 2 keynote, saying that token usage for AI has risen 320-fold and that global token consumption is predicted to grow by 3,400% by 2030. For an enterprise running millions of transactions per day, especially transactions involving agents that execute tasks autonomously, the cost of cloud-based tokens can quickly spiral out of control. One case study presented at the show described a company that consumed an entire year's token budget in just three months after deploying agentic systems.
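The arithmetic behind those figures is easy to sketch. The snippet below is illustrative only: it assumes the 3,400% figure is a percentage increase over today's level (not a multiple of it), and the function names are made up for this example.

```python
def growth_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into a multiple of the starting value."""
    return 1 + percent_increase / 100


def burn_rate_ratio(budget_months: float, months_to_exhaust: float) -> float:
    """How many times faster than planned a budget is being consumed."""
    return budget_months / months_to_exhaust


# A 3,400% increase means consumption ends up at 35 times the starting level.
print(growth_multiplier(3400))  # 35.0

# A 12-month token budget spent in 3 months is a 4x overrun of the planned rate.
print(burn_rate_ratio(12, 3))  # 4.0
```

Seen this way, the case study describes a company spending tokens at four times the rate it had budgeted for.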

The solution, Dell presenters argued, is to move AI inference and even training closer to the data. This can be done using local workstations, dedicated data center racks, or edge devices, each of which reduces reliance on expensive cloud APIs. Dell's portfolio, including PowerEdge servers, Precision workstations, and the newly announced Dell AI Data Platform, is designed to support this shift. By processing tokens internally, companies can cut costs substantially while also improving latency and data privacy. For latency-sensitive applications, such as real-time customer service agents or autonomous manufacturing systems, on-premises processing removes the round trip to a remote cloud region, so response times are lower and more predictable than a cloud API can offer.

Intelligence is becoming infrastructure

In the opening keynote, Michael Dell said the company is working to move AI closer to the data and infrastructure. "Abundant intelligence is here," he told attendees. "Intelligence is becoming infrastructure." This statement encapsulates a fundamental change in how enterprises view AI: no longer as a standalone service accessed through the internet, but as a core component of their IT environment—much like compute, storage, and networking. When intelligence is embedded in the infrastructure, it becomes more reliable, secure, and controllable.

Enterprises are realizing that piloting AI through a public cloud API is simple, but moving that pilot into large-scale production requires internal, dedicated server and compute resources. Without on-premises or hybrid architecture, enterprises face hurdles around data capacity and latency, especially as they transition from traditional AI (e.g., classification, summarization) to agentic systems (autonomous agents that plan, execute, and learn). Agentic AI demands low latency and high throughput to function effectively, as each agent may need to invoke multiple inference calls per second. In a cloud-only setup, network delays and bandwidth constraints can degrade agent performance to the point of unusability.
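A rough model shows why the network round trip matters more for agents than for single requests. This is a minimal sketch with hypothetical numbers (20 sequential calls, 50 ms of inference, 80 ms versus 1 ms of round-trip time); none of these values come from the conference.

```python
def task_latency_ms(calls: int, inference_ms: int, round_trip_ms: int) -> int:
    """Total time for an agent task whose inference calls run one after another.

    Each call pays the model's inference time plus one network round trip.
    """
    return calls * (inference_ms + round_trip_ms)


# Hypothetical agent task: 20 sequential inference calls at 50 ms each.
cloud = task_latency_ms(calls=20, inference_ms=50, round_trip_ms=80)
local = task_latency_ms(calls=20, inference_ms=50, round_trip_ms=1)

print(cloud, local)  # 2600 1020
```

Because the calls are sequential, every millisecond of round-trip time is multiplied by the number of steps the agent takes.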

Moreover, data sovereignty is increasingly a regulatory issue. In the European Union, for example, the General Data Protection Regulation (GDPR) restricts transfers of personal data to jurisdictions without adequate data protection, and the EU AI Act adds obligations for high-risk AI systems. Similarly, industries like healthcare and finance have strict compliance requirements that can make cloud-based AI risky or, in some cases, non-compliant. Research from Aberdeen shows that companies across all sectors are putting a high value on keeping data and AI training out of the cloud and protected in their own data centers. This push for sovereignty is not just about compliance; it is also about strategic independence. Businesses that control their own AI infrastructure are less vulnerable to vendor lock-in, price hikes, or service discontinuation from cloud providers.

Growing requirements for sovereign AI

At Dell Technologies World, several Dell speakers discussed the growing requirements for sovereign AI and how Dell can help customers meet these needs. The introduction of the Dell AI Data Platform is a key part of this strategy. The platform combines Dell's storage and server hardware with Nvidia's accelerated computing and a new software layer that manages data pipelines, model versioning, and governance policies. It allows organizations to build private AI clouds that feel as flexible as public clouds but are entirely under their control.

Requirements for sovereign AI become even more important as businesses begin to adopt agents and agentic systems. With agents, not only do costs around tokens see significant growth, but the need for strict security, governance, and control becomes vital to prevent unintended consequences. An agent that makes a wrong decision—whether it's leaking customer data, approving a fraudulent transaction, or sending an inappropriate email—can cause irreparable damage. "When an agent takes an action on your behalf, you need to know what it did, why it did it, and how to undo it if it got it wrong," Jeff Clarke said in his keynote. This level of auditing and control is far easier to implement in an on-premises environment where the organization owns the full stack.
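Clarke's three questions (what the agent did, why, and how to undo it) map naturally onto an audit record. The following is a minimal sketch of that idea, not a Dell or Nvidia API; `AuditRecord`, `AgentAuditLog`, and the email outbox are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Optional


@dataclass
class AuditRecord:
    """One agent action: what it did, why, and how to reverse it."""
    action: str
    reason: str
    undo: Callable[[], None]
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    undone: bool = False


class AgentAuditLog:
    """Append-only log of agent actions with a simple undo for the latest one."""

    def __init__(self) -> None:
        self.records: List[AuditRecord] = []

    def record(self, action: str, reason: str, undo: Callable[[], None]) -> AuditRecord:
        rec = AuditRecord(action=action, reason=reason, undo=undo)
        self.records.append(rec)
        return rec

    def undo_last(self) -> Optional[AuditRecord]:
        """Reverse the most recent action that has not already been undone."""
        for rec in reversed(self.records):
            if not rec.undone:
                rec.undo()
                rec.undone = True
                return rec
        return None


# Hypothetical agent action: queue an email, then roll it back.
outbox: List[str] = []
log = AgentAuditLog()

outbox.append("refund-confirmation")
log.record(
    action="queued email 'refund-confirmation'",
    reason="customer asked for refund status",
    undo=lambda: outbox.remove("refund-confirmation"),
)

log.undo_last()
print(outbox)  # []
```

A real deployment would also need durable storage and handling for actions that cannot be reversed, such as an email that has already been sent.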

Announcements from Dell designed to help businesses address these concerns included Dell Deskside Agentic AI, a development offering that includes workstations, Nvidia NemoClaw software, and Dell services. This package allows developers to build and test agents locally before deploying them to production. Also announced was support for Nvidia OpenShell, a sandboxed environment for building agents and enforcing corporate governance and privacy policies. These tools reflect a broader trend: the enterprise AI stack is becoming more modular and controllable, with a strong emphasis on safety and compliance from the start.

In addition to sovereignty and governance, the conference highlighted the need for businesses to adapt their IT architectures to handle AI workloads without disrupting existing operations. Hybrid cloud—where some workloads run on-premises and some in the cloud—is emerging as the pragmatic middle ground. Dell's offerings, such as the new APEX portfolio, combined with its bare-metal provisioning capabilities, make it easier for IT teams to allocate resources dynamically. For example, a company might run its most sensitive AI inference on-premises while leveraging the cloud for burst capacity during peak demand. This flexibility is crucial because AI workloads are notoriously unpredictable. They can spike during model training, seasonal promotions, or when a new agent goes live.
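The routing rule in that example can be sketched in a few lines. This is an assumption-laden illustration, not a description of APEX or any Dell product: the `Request` type, the `route` function, and the capacity numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    sensitive: bool  # True if the data must stay on-premises


def route(req: Request, on_prem_in_flight: int, on_prem_capacity: int) -> str:
    """Pick a target for an inference request.

    Sensitive requests always stay on-premises, even if they have to queue.
    Other requests use on-premises capacity when there is headroom and
    burst to the cloud when there is not.
    """
    if req.sensitive:
        return "on_prem"
    if on_prem_in_flight < on_prem_capacity:
        return "on_prem"
    return "cloud"


# With the local cluster full (8 of 8 slots busy), only the
# non-sensitive request bursts to the cloud.
print(route(Request("patient-summary", sensitive=True), 8, 8))    # on_prem
print(route(Request("marketing-copy", sensitive=False), 8, 8))    # cloud
print(route(Request("marketing-copy", sensitive=False), 3, 8))    # on_prem
```

The key design choice is that sensitivity is checked before capacity, so a demand spike can never push regulated data off-premises.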

Conflicting advice: Move fast versus go slow

Many of the announcements, sessions, and discussions at Dell Technologies World highlighted one of the main balancing acts of today's AI infrastructures. There were often seemingly conflicting statements, with talk about helping businesses "move fast" and "not be left behind" contrasting with practical sessions that highlighted going slow, ensuring security and governance, and starting small with AI and agents. This tension is not a flaw; it reflects the reality that AI adoption is not a binary decision. Companies must move quickly to capture competitive advantages, but they must also build responsibly to avoid reputational or legal disasters.

Another aspect of this balancing act is the maturity of the software tools themselves. Many of the offerings touted as solutions to AI and agent hurdles are still in beta or even alpha. Dell acknowledged this, often recommending that companies not use these tools in production without thorough testing. For risk-averse industries like financial services or healthcare, this caution is wise. The cost of a bug in an AI agent could be catastrophic. Therefore, enterprises need to evaluate whether the software is mature and secure enough to meet their requirements—and if not, whether they have the internal expertise to harden it.

Beyond the technical challenges, there is also a cultural shift underway. IT teams that were once focused on infrastructure maintenance are now becoming AI operations specialists. They need to understand model lifecycle management, token optimization, and agent orchestration. The conference offered numerous sessions on upskilling and change management, emphasizing that the human element is just as critical as the technology. Without skilled personnel, even the best on-premises AI infrastructure will underperform.

Finally, the broader economic environment is pushing companies toward on-premises AI. Cloud costs have risen sharply over the past two years, driven by the AI boom and increasing demand for GPU instances. Many organizations have been shocked by their cloud bills, especially after moving from pilot to production. Dell's message—that on-premises solutions can cut token costs by 50% or more—resonated strongly with attendees. While the upfront capital expenditure for on-premises hardware is higher, the total cost of ownership (TCO) over three to five years often favors private infrastructure, especially when token consumption grows exponentially.
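The TCO argument comes down to a break-even calculation: on-premises costs start high and grow slowly, while cloud costs start low and grow with usage. The sketch below uses hypothetical figures (500,000 in hardware, 10,000 a month to run it, 30,000 a month in cloud spend); they are placeholders, not numbers from Dell.

```python
from typing import Optional


def cumulative_on_prem(months: int, capex: float, monthly_opex: float) -> float:
    """Upfront hardware cost plus running costs to date."""
    return capex + months * monthly_opex


def cumulative_cloud(months: int, first_month_cost: float, monthly_growth: float) -> float:
    """Total cloud spend when the monthly bill grows by a fixed rate."""
    total = 0.0
    cost = first_month_cost
    for _ in range(months):
        total += cost
        cost *= 1 + monthly_growth
    return total


def break_even_month(
    capex: float,
    monthly_opex: float,
    first_month_cost: float,
    monthly_growth: float,
    horizon: int = 60,
) -> Optional[int]:
    """First month in which on-premises has cost no more than cloud, if any."""
    for month in range(1, horizon + 1):
        on_prem = cumulative_on_prem(month, capex, monthly_opex)
        cloud = cumulative_cloud(month, first_month_cost, monthly_growth)
        if on_prem <= cloud:
            return month
    return None


# Flat cloud usage: on-premises pays for itself after about two years.
print(break_even_month(500_000, 10_000, 30_000, monthly_growth=0.0))  # 25

# Cloud bill growing 5% a month: the break-even point arrives sooner.
print(break_even_month(500_000, 10_000, 30_000, monthly_growth=0.05))
```

The faster token consumption grows, the earlier the break-even month falls inside the three-to-five-year window.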

In summary, Dell Tech World 2026 painted a clear picture of where enterprise AI is heading: toward hybrid and on-premises architectures that offer cost control, data sovereignty, and governance. The conference provided practical guidance on how to get there, from evaluating workload types to selecting the right hardware and software stacks. Companies that fail to prepare for this shift may find themselves locked into expensive cloud contracts, unable to scale AI adoption safely and efficiently. The future of AI is not in the cloud alone—it's in the smart combination of cloud and on-premises, with a strong bias toward keeping intelligence close to the data.


Source: ZDNET News

