Agentic AI for Beginners: My Journey into Building with Claude

There’s been a lot of buzz around Agentic AI lately, especially around how powerful Claude can be when used beyond simple prompting. Naturally, I got curious.


As an architect, I wanted to understand what “agentic” really means in practice. What changes when we move from prompts to agents? And what does that mean for how we design systems? As I started exploring, it became clear that this isn’t just about smarter chatbots. It is a shift in who orchestrates the work.

From Prompts to Agents: What’s the Difference?

Before diving in, let’s distinguish between Generative AI and Agentic AI.

  • Generative AI (Reactive) – Deals with Prompts, where we provide an input and the model provides a one-time response. We are the orchestrator.


  • Agentic AI (Proactive) – Deals with Agents, where we provide a goal and the model determines the steps, uses tools and iterates until the goal is met. The model is the orchestrator.

“Agentic” means moving from chatting to delegating. It’s the difference between asking for instructions and having the task completed for you, like getting a recipe vs hiring a chef or asking for directions vs being driven there.
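To make the difference concrete, here is a minimal sketch of an agent loop. It is not Claude's API: `fake_model`, `TOOLS` and `run_agent` are hypothetical names, and the "model" is a hard-coded stand-in so the example runs offline. What it shows is the shape of delegation: the model picks the next step, a tool runs, and the result is fed back until the model decides the goal is met.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical tools: plain functions the agent is allowed to call.
TOOLS: dict[str, Callable[[str], str]] = {
    "check_endpoint": lambda url: "503" if "broken" in url else "200",
    "create_ticket": lambda summary: f"TICKET-1: {summary}",
}


@dataclass
class Step:
    tool: Optional[str]  # None means "the goal is met, stop"
    argument: str = ""


def fake_model(target: str, history: list[str]) -> Step:
    """Stand-in for an LLM call: chooses the next step from the results so far."""
    if not history:
        return Step("check_endpoint", target)
    if history[-1] == "503":
        return Step("create_ticket", f"{target} returned 503")
    return Step(None)


def run_agent(target: str, max_steps: int = 5) -> list[str]:
    """The agent loop: ask the model for a step, run the tool, feed the result back."""
    history: list[str] = []
    for _ in range(max_steps):  # hard cap so the loop cannot run forever
        step = fake_model(target, history)
        if step.tool is None:
            break
        history.append(TOOLS[step.tool](step.argument))
    return history
```

With a plain prompt, we would call `check_endpoint` and `create_ticket` ourselves. Here we only hand over the target, and the loop decides which tools to call and when to stop.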

The “Agentic” Starter Pack

I started with the basics to see how Claude handles the “plumbing” of a real-world project. My exploration focused on three of the most talked-about items:

  • Agentic Implementation: I moved away from “one-off” prompts and built loops where Claude runs a Plan -> Execute -> Test -> Fix cycle autonomously.


  • Model Context Protocol (MCP): I hooked Claude up to my local filesystem, Slack & GitHub. This was to see how the agent “reaches out” and queries the data it needs directly.


  • Role-Based Division: I experimented with “Agent Teams” by giving different Claude instances specific roles: one as the Architect to handle planning and another as the Developer to handle implementation. I also tried assigning multiple hats, to make work distribution and decision making clearer for the agent.
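The Plan -> Execute -> Test -> Fix cycle from the first item can be sketched as a bounded loop. This is only an illustration under my own assumptions: the four callables are hypothetical stand-ins for what the agent would do at each stage, and `max_rounds` is an arbitrary safety limit.

```python
from typing import Callable


def plan_execute_test_fix(
    plan: Callable[[], list[str]],
    execute: Callable[[list[str]], str],
    test: Callable[[str], list[str]],
    fix: Callable[[list[str], list[str]], list[str]],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Run the cycle until the tests report no failures or the round limit is hit."""
    steps = plan()
    failures: list[str] = []
    for round_no in range(1, max_rounds + 1):
        artifact = execute(steps)
        failures = test(artifact)
        if not failures:
            return artifact, round_no
        steps = fix(steps, failures)  # revise the plan using the failures
    raise RuntimeError(f"still failing after {max_rounds} rounds: {failures}")


# Toy stand-ins for the four stages (all hypothetical).
artifact, rounds = plan_execute_test_fix(
    plan=lambda: ["write endpoint check"],
    execute=lambda steps: "; ".join(steps),
    test=lambda out: [] if "timeout" in out else ["missing timeout handling"],
    fix=lambda steps, failures: steps + ["add timeout handling"],
)
```

In this toy run the first round fails the test, the fix stage amends the plan, and the second round passes.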

My Learning Project: Endpoint Watch Agent (EWA)

The goal of this project was to build a hands-on learning kit for agentic systems. Endpoint Watch Agent (EWA) is a Python-based agent that continuously monitors configurable endpoints (websites or APIs). When an endpoint is down or unhealthy, the agent autonomously evaluates the incident, avoids duplicate alerts, creates a ticket and sends a contextual Slack notification.
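As a rough sketch of the "senses" part, an endpoint check could look like the following. The `Endpoint` and `CheckResult` shapes, the latency threshold and the "anything outside 2xx is unhealthy" rule are my assumptions for illustration, not necessarily what the EWA repository does. Only the standard library is used.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    name: str
    url: str
    timeout_s: float = 5.0
    max_latency_ms: float = 2000.0  # hypothetical "too slow" threshold


@dataclass(frozen=True)
class CheckResult:
    endpoint: str
    status: Optional[int]  # None when no HTTP response was received
    latency_ms: float
    error: str = ""


def check(endpoint: Endpoint) -> CheckResult:
    """Perform one HTTP GET and record status, latency and any error."""
    started = time.monotonic()
    status, error = None, ""
    try:
        with urllib.request.urlopen(endpoint.url, timeout=endpoint.timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:  # the server answered with 4xx/5xx
        status, error = exc.code, str(exc)
    except (urllib.error.URLError, OSError) as exc:  # DNS failure, refused, timeout
        error = str(exc)
    latency_ms = (time.monotonic() - started) * 1000
    return CheckResult(endpoint.name, status, latency_ms, error)


def is_unhealthy(result: CheckResult, endpoint: Endpoint) -> bool:
    """Down (no response or non-2xx) or unhealthy (slower than the threshold)."""
    if result.status is None or not 200 <= result.status < 300:
        return True
    return result.latency_ms > endpoint.max_latency_ms
```

Keeping `check` (I/O) apart from `is_unhealthy` (a plain function) means the health rule can be tested without touching the network.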

Flow Diagram Plan

[Figure: EWA flow diagram (ewa-flow-diagram.png)]

Structuring the Workflow

Starting from nothing, I worked with Claude itself to set up the structure and separate the components so that each has a single responsibility. To keep things simple, I created a single agent (the Orchestrator) that runs one sequential loop: Check Endpoint 1 → Decide → Act → Check Endpoint 2 → Decide → Act...

The PolicyEngine is not an agent but a pure function called by the agent. The tools are interfaces that the agent dispatches, while the MCP servers are external services.
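A minimal sketch of that split, assuming hypothetical names (`decide`, `run_cycle`, `Action`, `Observation`) rather than the ones used in the project: the policy is a pure function with no I/O, and the orchestrator only sequences Check → Decide → Act for each endpoint.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable


class Action(Enum):
    NONE = "none"
    OPEN_INCIDENT = "open_incident"
    SUPPRESS_DUPLICATE = "suppress_duplicate"


@dataclass(frozen=True)
class Observation:
    endpoint: str
    healthy: bool


def decide(obs: Observation, open_incidents: frozenset[str]) -> Action:
    """Pure policy: the same inputs always give the same decision, with no I/O."""
    if obs.healthy:
        return Action.NONE
    if obs.endpoint in open_incidents:
        return Action.SUPPRESS_DUPLICATE
    return Action.OPEN_INCIDENT


def run_cycle(
    endpoints: Iterable[str],
    observe: Callable[[str], Observation],  # tool: check one endpoint
    act: Callable[[Observation], None],     # tool: ticket + notification
    open_incidents: set[str],               # memory shared across cycles
) -> list[Action]:
    """One sequential pass: Check -> Decide -> Act for each endpoint in turn."""
    decisions = []
    for name in endpoints:
        obs = observe(name)
        action = decide(obs, frozenset(open_incidents))
        if action is Action.OPEN_INCIDENT:
            act(obs)
            open_incidents.add(name)
        decisions.append(action)
    return decisions
```

Because `decide` takes everything it needs as arguments, policy rules can be unit-tested in isolation, and the tools behind `observe` and `act` can be swapped without touching it.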

Explore or build on the Project available here: [Github Link]

What I Learnt: The “Pro” Framework

The real breakthrough wasn’t the model itself, but how I structured the project to guide it. The resulting project structure can serve as a good baseline architectural template for any agentic development. The pattern supports a clean separation of concerns, where we can add new tools, policy rules or tests without needing to restructure the entire system.


As a production-ready baseline though, it has gaps: no tests, single-threaded endpoint checking, no metrics, no graceful shutdown. These are solvable without rethinking the architecture, but they’d need to be added before shipping anything real.

I found that the following four pillars are essential for any agentic workflow:

CLAUDE.md (The Project Brain)

This file lives in the root of your repo as the AI’s operating manual. It tells the Claude agent who it is and how it should behave in this specific codebase. That way, each session starts from shared context instead of inferring everything from scratch.

# Project Context: Endpoint Watch Agent (EWA)

## Role & Mission
You are the **EWA Specialist**. Your goal is to maintain a high-availability monitoring system. You prioritize accuracy in incident detection and clarity in Slack notifications.

## Tech Stack
- **Runtime:** Python 3.12
- **Logic:** Policy-based reasoning (PolicyEngine)
- **Integrations:** Slack (Alerts), Jira (Tickets), GitHub (MCP)

## Architecture Rules
- **Separation of Concerns:** Keep tools in '/tools', logic in '/engine'.
- **Async First:** Use 'asyncio' for all network-bound endpoint checks.
- **No Deletions:** Never delete incident logs, only archive or update status.

## Dev Commands
- **Run:** 'python main.py'

CLAUDE.md is the interface between the human who designed the system and the AI that extends it. It is not documentation for users of the tool; it is documentation for the next builder, human or AI.
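To illustrate what a rule like “Async First” asks of the code, here is a small sketch of concurrent endpoint checks with 'asyncio'. The names `check_all` and `fake_check` are hypothetical, and `fake_check` only sleeps instead of making a real HTTP call, so the example runs without a network.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def check_all(
    urls: list[str],
    check_one: Callable[[str], Awaitable[int]],
    max_concurrency: int = 10,
    timeout_s: float = 5.0,
) -> dict[str, Optional[int]]:
    """Check every URL concurrently; a check that times out maps to None."""
    limiter = asyncio.Semaphore(max_concurrency)

    async def guarded(url: str) -> tuple[str, Optional[int]]:
        async with limiter:  # cap the number of in-flight checks
            try:
                return url, await asyncio.wait_for(check_one(url), timeout_s)
            except asyncio.TimeoutError:
                return url, None

    return dict(await asyncio.gather(*(guarded(url) for url in urls)))


async def fake_check(url: str) -> int:
    """Hypothetical stand-in for a real HTTP call."""
    await asyncio.sleep(1.0 if "slow" in url else 0.01)
    return 200


results = asyncio.run(
    check_all(["https://a.example", "https://slow.example"], fake_check, timeout_s=0.1)
)
```

One slow endpoint no longer delays the others, and the per-check timeout turns a hang into an explicit `None` the policy can reason about.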

SKILLS.md (The Capability Manual)

While CLAUDE.md is about the project, SKILLS.md is about what the agent is capable of doing. It provides pre-verified “recipes” for complex tasks, stopping the agent from hallucinating its own (often broken) logic.

# Agent Skills

## Skill: Incident Evaluation
- **When:** An endpoint returns a non-200 status.
- **Action:** 
  1. Check 'storage/incidents.json' for active tickets.
  2. If new, invoke the 'JiraTool' to create a "Critical" task.

## Skill: Slack Formatting
- **Constraint:** Always include the Status Code, Response Time, and the "Runbook Link" from the configuration file.
- **Tone:** Professional and urgent.

These are the procedural instructions or documentation that teach the agent how to use a tool effectively in a specific context.
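The two skills above could translate into code roughly like this. It is a sketch under assumptions: the JSON layout of 'storage/incidents.json' (a dict keyed by endpoint name) and the `create_ticket` callable standing in for the 'JiraTool' are mine, chosen only to show the de-duplication step and the required Slack fields.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def load_active(path: Path) -> dict[str, dict]:
    """Read active incidents; a missing file means there are none."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def evaluate_incident(
    endpoint: str,
    status: int,
    path: Path,
    create_ticket: Callable[[str, int], str],  # stand-in for the JiraTool
) -> Optional[str]:
    """Return the ticket id for a failing endpoint, creating one only if new."""
    if status == 200:
        return None
    active = load_active(path)
    if endpoint in active:  # duplicate: reuse the existing ticket
        return active[endpoint]["ticket"]
    ticket = create_ticket(endpoint, status)
    active[endpoint] = {"ticket": ticket, "status": status, "state": "open"}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(active, indent=2), encoding="utf-8")
    return ticket


def format_slack_alert(
    endpoint: str, status: int, response_ms: float, runbook_url: str
) -> str:
    """Always include Status Code, Response Time and the Runbook Link."""
    return (
        f"{endpoint} is unhealthy\n"
        f"Status Code: {status}\n"
        f"Response Time: {response_ms:.0f} ms\n"
        f"Runbook Link: {runbook_url}"
    )
```

The skill text tells the agent when and why to take these steps; code like this is the verified recipe it should reach for instead of improvising.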

“Plan, then Execute” Workflow

I stopped asking Claude to “just do it”. Instead, I enforced a mandatory approval gate with three steps:

  1. The Plan: Claude must output a step-by-step technical plan first.
  2. The Approval: I review the plan for architectural alignment.
  3. The Execution: Only after approval does the agent start writing code. In my experience, this eliminates roughly 90% of the “rabbit holes” agents often fall into.
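If you want the gate enforced by code rather than by discipline, a small state machine is enough. This is a hypothetical sketch (`GatedTask` and `Stage` are names I made up); the point is simply that `execute` refuses to run until a human has called `approve`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Stage(Enum):
    DRAFT = "draft"
    PLANNED = "planned"
    APPROVED = "approved"
    EXECUTED = "executed"


@dataclass
class GatedTask:
    goal: str
    plan: list[str] = field(default_factory=list)
    stage: Stage = Stage.DRAFT
    approved_by: str = ""

    def submit_plan(self, steps: list[str]) -> None:
        if not steps:
            raise ValueError("a plan needs at least one step")
        self.plan = list(steps)
        self.stage = Stage.PLANNED
        self.approved_by = ""  # a new plan always needs a new approval

    def approve(self, reviewer: str) -> None:
        if self.stage is not Stage.PLANNED:
            raise PermissionError("nothing to approve: submit a plan first")
        self.approved_by = reviewer
        self.stage = Stage.APPROVED

    def execute(self, run_step: Callable[[str], str]) -> list[str]:
        if self.stage is not Stage.APPROVED:
            raise PermissionError("plan has not been approved")
        results = [run_step(step) for step in self.plan]
        self.stage = Stage.EXECUTED
        return results
```

Resubmitting a plan resets the approval, so a changed plan cannot ride on an earlier sign-off.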

Verification Criteria

Never ask an agent to “fix a bug”. Instead, ask it to “Fix the bug and provide the specific CLI command or test case to verify the fix”. It seems an agent that knows it has to prove its work is significantly more accurate and less likely to hallucinate a “done” state!
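One way to hold the agent to that is to run the verification command it hands back and trust only the exit code. The sketch below assumes a hypothetical `FixReport` shape; the command shown is a trivial placeholder, where a real report might carry something like a test-runner invocation. Review any agent-supplied command before running it.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class FixReport:
    summary: str
    verify_command: list[str]  # argv list, so no shell quoting is involved


def verify(report: FixReport, timeout_s: float = 60.0) -> bool:
    """Run the agent-supplied command; only exit code 0 counts as verified."""
    try:
        completed = subprocess.run(
            report.verify_command, capture_output=True, text=True, timeout=timeout_s
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # a missing binary or a hang is treated as "not verified"
    return completed.returncode == 0


# Hypothetical report with a placeholder command that always succeeds.
report = FixReport(
    summary="Handle timeouts in the endpoint checker",
    verify_command=[sys.executable, "-c", "assert 1 + 1 == 2"],
)
```

A “done” claim then becomes something we can check mechanically: `verify(report)` either passes or it does not.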

What I Learnt: Behavioral System Design

EWA is built like a Claude agent: it has a brain (orchestrator), reasoning (policy engine), senses (endpoint checker), hands (Jira + Slack tools) and memory (incident store).

Moving beyond simple monitoring, this creates an agentic closed loop: the system observes, reasons, decides, acts and remembers, closing the gap between detection and autonomous resolution. This is what differentiates a single prompt from a system that operates.

If designed properly, the orchestrator never does anything directly. It asks tools to observe, asks the policy engine to reason, then dispatches to tools based on the decision. Every component has one job and knows nothing about the others.

Thus, with agentic systems, we start to define goals, shape decision boundaries, orchestrate tools and design workflows. The unit of design has moved from “What does this function do?” to “How does this system behave over time?”. This is very different and is a significant mindset shift.

What I Learnt: The Operational Reality

This is where Agentic AI gets interesting and, at the same time, risky. Agents are not just more capable; they are also more complex to reason about.

What’s Exciting (The Wins)

  • Self-Healing Workflows: Automation of operational tasks where systems can adapt to minor changes instead of simply breaking


  • Engineering Velocity: Drastic reduction in manual intervention for complex, multi-file refactors

What’s Hard (The Risks)

  • Observability & Non-Linear Debugging: Traditional logs don’t help much when an agent enters a logic loop. It becomes difficult to answer: “Why did the agent choose this specific tool at this specific time?” Tracking these non-linear flows requires a completely different observability stack.


  • Guardrails & Cost: Without structural “circuit breakers”, agents can enter recursive loops that transform a technical logic error into a financial one. In an agentic world, unguided autonomy doesn’t just crash a service, it can drain token budgets in minutes.
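A structural “circuit breaker” does not need to be elaborate. Here is a minimal sketch, assuming hypothetical limits and names (`CircuitBreaker`, `BudgetExceeded`): it caps total steps and tokens, and trips when the agent repeats the exact same tool call several times in a row, which is a cheap signal for a logic loop.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when the agent must stop and hand control back to a human."""


@dataclass
class CircuitBreaker:
    max_steps: int = 20              # hypothetical limits; tune per workload
    max_tokens: int = 50_000
    max_identical_calls: int = 3
    steps: int = 0
    tokens: int = 0
    _last_call: tuple[str, str] = field(default=("", ""), repr=False)
    _streak: int = field(default=0, repr=False)

    def record(self, tool: str, argument: str, tokens_used: int) -> None:
        """Call once per agent step, before acting on the tool result."""
        self.steps += 1
        self.tokens += tokens_used
        call = (tool, argument)
        self._streak = self._streak + 1 if call == self._last_call else 1
        self._last_call = call
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"token budget {self.max_tokens} exceeded")
        if self._streak >= self.max_identical_calls:
            raise BudgetExceeded(f"{tool}({argument!r}) repeated {self._streak} times")
```

The exception message doubles as a first observability hook: it records which limit tripped and on which call.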

What I Learnt: The Shift to “Specification of Judgment”

The biggest realization was the shift in our roles: The engineer’s job is becoming the specification of judgment.

We are moving away from writing line-by-line code and towards translating domain knowledge (e.g., don’t auto-close the Jira ticket on recovery; leave that to humans), operational experience (e.g., what if the MCP server subprocess hangs instead of failing?) and trust calibrations (e.g., trust the agent to send Slack alerts without human review: yes) into rules the agent can follow.
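Those three examples can be written down as explicit, reviewable rules. The sketch below is my own illustration: `JudgmentRules` and its field names are hypothetical, and the 10-second timeout is an arbitrary value. Each field captures one judgment call, and the small functions show how the rest of the system would consult it.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass(frozen=True)
class JudgmentRules:
    auto_close_ticket_on_recovery: bool = False   # domain knowledge: humans close
    mcp_call_timeout_s: float = 10.0              # operational experience (value assumed)
    slack_alert_needs_human_review: bool = False  # trust calibration: send directly


def on_recovery(rules: JudgmentRules, ticket: str) -> list[str]:
    """Always comment on recovery; close only if the rules allow it."""
    actions = [f"comment:{ticket}:endpoint recovered"]
    if rules.auto_close_ticket_on_recovery:
        actions.append(f"close:{ticket}")
    return actions


def alert_route(rules: JudgmentRules) -> str:
    return "queue_for_review" if rules.slack_alert_needs_human_review else "send"


async def call_mcp(
    rules: JudgmentRules, call: Callable[[], Awaitable[str]]
) -> Optional[str]:
    """Treat a hanging MCP call like a failed one instead of waiting forever."""
    try:
        return await asyncio.wait_for(call(), rules.mcp_call_timeout_s)
    except asyncio.TimeoutError:
        return None
```

Changing the system's behavior then becomes a one-line change to a rule, with the reasoning recorded next to it.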

Claude handles the execution, but its success depends on our ability to articulate why a system should behave a certain way, not just what it should do. This requires architectural experience to anticipate what could go wrong and the clarity to express those constraints precisely.

Final Thoughts: The Evolution of How We Build

It’s only a matter of time. While the technical risks are real today, the pace of advancement is blistering. We are witnessing a total paradigm shift: we aren’t just writing code anymore; we are managing a digital workforce.

For architects, this means rethinking system boundaries. For developers, it means thinking in workflows. I am excited to adapt! This isn’t the evolution of standard coding but the evolution of how we build.

Sandeep Mewara Github

News Update
Tech Explore
Trend
samples GitHub Profile Readme
Learn Machine Learning with Examples
Machine Learning workflow
What is Dynamic Programming
The Great Inversion: Why AI is Moving from Cloud to Desktop

Why ‘Service as Software’ is the Industry’s Next Big Bet

I recently caught a presentation by Intuit CTO Alex Balazs, where he described their evolution from a “Do-It-Yourself” software company to an AI-driven expert platform. During the talk, he used a phrase that immediately clicked for me: “Service as Software”.


It was one of those “aha” moments that forced me to pause and re-evaluate the trajectory of the entire SaaS industry. We’ve spent the last twenty years perfecting Software as a Service, but flipping that phrase to Service as Software implies a much deeper shift in how we deliver value. It provoked me to dig into why this isn’t just a trend, but a directional necessity for the next generation of tech.

The Shift: From Passive Tools to Active Experts

For years, the gold standard has been the “System of Record”. We built beautiful digital filing cabinets and powerful calculators, but they were ultimately passive tools. Whether it was an accounting suite or a CRM, the software only provided value if a human expert sat behind the keyboard to drive it. In that model, the value only scales as fast as the person at the controls.

Now, “Service as Software” represents a move toward a “System of Action”. With the rise of agentic AI, software is moving from being the “medium” to being the “expert”. Recent 2025 research from Capgemini highlights that we are moving beyond “Copilots” to “Agents”: AI that doesn’t just suggest actions but possesses the autonomy to execute end-to-end business processes.

  • SaaS (The Tool): The software provides the interface where the user performs the labor.
  • Service as Software (The Outcome): The software acts as an autonomous agent navigating complexity, identifying optimizations and executing tasks on the user’s behalf.

Why this is the Industry’s Directional Need

As I look at the landscape from a leadership perspective, this shift feels inevitable. We are hitting a ceiling with traditional models for a few key reasons:

  • Solving for “SaaS Fatigue”: The “per-seat” model is under pressure. According to 2026 SaaS pricing forecasts, nearly 60% of enterprise SaaS solutions are shifting toward hybrid or outcome-based pricing. Customers are tired of managing dozens of tools that require constant human attention. They want problems solved, not more licenses to manage.


  • Bridging the Expertise Gap: We are facing a documented global shortage of human experts in complex fields like finance, specialized engineering and data science. By “coding” that expertise directly into the software, we make high-level results accessible at a scale that human labor simply cannot match.


  • Accelerating Time-to-Value: Traditional software often has a long “time-to-value” during onboarding, a period where 63% of customers are already deciding whether to churn. A service-oriented model flips this. By having the software perform the initial heavy lifting for the user, you deliver the “aha moment” almost instantly.

Navigating the Transition: A Technical Leader’s View

Transitioning to this model is an architectural marathon. You don’t just “add AI” and call it a service. It requires a fundamental rethink of the stack.

  • The “Human-in-the-Loop” Bridge: Trust is the primary hurdle. Successful transitions will likely use a hybrid model where AI performs 80% of the work, but human experts remain available for the “gray areas”. This builds the user’s confidence in the system’s autonomy while maintaining a safety net.


  • Codifying Logic, Not Just Features: We have to shift from building “buttons” to building “agents”. This requires robust reasoning engines that can handle exceptions and ambiguity without breaking.


  • The Observability Mandate: If the software is performing the service, it cannot be a black box. As architects, we must build in deep transparency, providing “reasoning logs” so users can always audit why a specific decision was made.
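The first and last points can be combined in a small sketch: every decision is routed either to the agent or to a human reviewer based on a confidence score, and each routing is written out as an auditable “reasoning log” line. All names here (`DecisionRecord`, `route`, `to_log_line`) and the 0.8 threshold are hypothetical, and where the confidence score comes from is left open.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class DecisionRecord:
    task: str
    decision: str
    confidence: float
    reasons: tuple[str, ...]
    routed_to: str  # "agent" or "human_review"
    at: str         # ISO 8601 timestamp


def route(
    task: str,
    decision: str,
    confidence: float,
    reasons: list[str],
    threshold: float = 0.8,     # hypothetical cut-off for the "gray areas"
    now: Optional[str] = None,  # injectable timestamp, handy for tests
) -> DecisionRecord:
    """Let the agent act on confident decisions; escalate the rest to a human."""
    routed_to = "agent" if confidence >= threshold else "human_review"
    at = now or datetime.now(timezone.utc).isoformat()
    return DecisionRecord(task, decision, confidence, tuple(reasons), routed_to, at)


def to_log_line(record: DecisionRecord) -> str:
    """One JSON object per line, so the log is easy to search and audit."""
    return json.dumps(asdict(record), sort_keys=True)
```

Because the reasons are stored with the decision, a user can later ask why something happened and get an answer from the log rather than from guesswork.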

Closing Thoughts

We are moving away from providing digital tools and toward providing digital results. The most successful companies of the next decade won’t just be selling software; they’ll be selling outcomes and confidence.

The transition from being a vendor of tools to being a partner in results is a massive challenge, but for those of us in technical leadership, it’s easily the most exciting problem to solve in a long time. It’s no longer about what our users can do with our software; it’s about what our software can do for our users.



The Great Inversion: Why AI is Moving from Cloud to Desktop

For the better part of a decade, the desktop was largely relegated to a passive terminal, a mere high-resolution viewport for remote cloud services. As the industry mantra shifted to “Cloud-First”, local hardware was often treated as an underutilized abstraction.

However, we are now witnessing The Great Inversion. As AI workloads navigate the practical limits of cloud latency, data privacy and operational costs, the center of gravity is visibly shifting back to the local system. We are moving towards the era of the AI-Native Desktop, where the local machine is no longer just a window to the cloud, but is increasingly becoming the primary engine of intelligence.

The Evolution of the “SaaS Margin”

A primary driver of this shift appears to be fundamental economics. Throughout 2024 and 2025, as software providers integrated Large Language Models (LLMs) into their web platforms, it became clear that inference costs could significantly erode margins. This “Token Tax” has encouraged a strategic reckoning across the industry.

  • The Data: According to early 2025 fiscal reports from major SaaS players, AI-related compute costs increased OpEx by an average of 25-30% year-over-year.

     

  • The Cost Shift: Industry analysis from Deloitte and various independent reports suggests that local NPU inference can reduce AI operational costs by up to 90% (Medium/Vygha, 2025). By migrating specific compute tasks to the desktop, we can transition from a variable OpEx model towards a more sustainable fixed hardware model.
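The variable-versus-fixed trade-off is easy to put into numbers. The following is a back-of-the-envelope sketch only: the function names are mine and every figure in the example (request volume, tokens per request, price per million tokens, one-time local cost) is hypothetical, not taken from the reports cited above.

```python
from typing import Optional


def monthly_cloud_cost(
    requests_per_month: int, tokens_per_request: int, usd_per_million_tokens: float
) -> float:
    """Variable OpEx: cost grows linearly with usage."""
    return requests_per_month * tokens_per_request * usd_per_million_tokens / 1_000_000


def breakeven_months(
    one_time_local_cost: float, monthly_cloud: float, monthly_local_running: float = 0.0
) -> Optional[float]:
    """Months until a fixed local investment is paid back; None if it never is."""
    saving = monthly_cloud - monthly_local_running
    if saving <= 0:
        return None
    return one_time_local_cost / saving


# Hypothetical workload: 200,000 requests/month at 1,500 tokens each, $2 per 1M tokens.
cloud = monthly_cloud_cost(200_000, 1_500, 2.0)   # 300M tokens -> 600.0 USD/month
months = breakeven_months(12_000.0, cloud)        # 12,000 / 600 -> 20.0 months
```

The useful part is the shape of the calculation: the cloud side scales with usage, while the local side is a one-time figure plus running costs, so heavier usage shortens the payback period.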

The Proliferation of the AI PC

The “Inversion” is physically supported by a massive hardware refresh. We are no longer designing for underpowered machines. As of Q1 2026, the “AI PC” has moved from a premium category to the industry baseline.

  • The Benchmark: The AI PC has evolved from a niche offering into an enterprise standard. Gartner reports that AI PCs now account for over 55% of all shipments, with nearly 100% of new enterprise purchases featuring dedicated NPUs (Gartner, 2025).


    Microsoft introduced “Copilot+ PCs” as a new Windows category built around local AI acceleration (NPUs) and has continued to expand AI features across this category (some generally available, some in preview), emphasizing on-device experiences.
     

  • Silicon Supremacy: Standard workstations now ship with 40+ TOPS (Trillions of Operations Per Second) capability. This allows for real-time local inference that was previously technically out of reach (Microsoft Learn, 2025).


    Chip vendors are also directly pushing the “on-device inference” narrative as a foundational shift (cost, latency, privacy, reliability).

Compliance and the “Privacy Moat”

Regulatory considerations are making the cloud a complex environment for sensitive data. With the EU AI Act entering its critical enforcement phase in August 2026, there is a clear directional pull toward “Zero-Export” AI solutions (EU AI Act Guide, 2026).

  • Apple’s Blueprint: Apple has helped standardize this approach with Apple Intelligence and Private Cloud Compute. Their architecture ensures that if a task can be processed on-device (via the M4’s 38-TOPS Neural Engine), it remains local. Only when necessary does it move to “stateless” servers designed to process data without storing it (Apple Privacy, 2025).

     

  • Data Sovereignty: Modern desktop apps can index a user’s local files to provide personalized AI insights (Local RAG, i.e. Retrieval-Augmented Generation) without ever exposing that intellectual property to a third-party cloud provider. Local-first patterns are re-emerging because they improve resilience and user trust (data control, offline capability, graceful sync).
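The retrieval half of Local RAG can be sketched in a few lines. This is a deliberately simplified illustration: a bag-of-words count stands in for a real embedding model, and the function names and file contents are hypothetical. The point it makes is that indexing and search can run entirely in-process, so nothing leaves the machine.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real system would use an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: dict[str, str], top_k: int = 2) -> list[str]:
    """Return the names of the local documents most similar to the query."""
    q = embed(query)
    ranked = sorted(
        documents, key=lambda name: cosine(q, embed(documents[name])), reverse=True
    )
    return ranked[:top_k]


# Hypothetical local files, already read from disk.
local_docs = {
    "notes.txt": "quarterly revenue forecast and budget",
    "todo.txt": "buy milk and eggs",
    "design.md": "latency budget for the inference service",
}
best = retrieve("inference latency", local_docs, top_k=1)
```

The retrieved passages would then be passed to a local model as context, which is the “augmented generation” step.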

Performance: Breaking the Latency Wall

The browser is naturally limited by the “spinning wheel” of network latency. For the next generation of Agentic AI (tools that actively assist by observing screen context and reacting in real time), the network round-trip is often a bottleneck.

| Feature | Web App (Cloud AI) | AI-Native Desktop App (NPU) |
| --- | --- | --- |
| Response Latency | 200ms – 500ms lag | <20ms (Instant) |
| Data Privacy | Encrypted in Transit | Zero-Export (Stays on Disk) |
| Offline Capability | Non-existent | Full Functionality |
| Operational Cost | Per-token / Monthly | One-time Development |
| System Access | Sandboxed/Limited | Deep File & OS Integration |

Moving Forward: The Architect’s Blueprint

To remain competitive in 2026 and beyond, a forward-thinking desktop strategy should aim to capitalize on this hardware-rich environment. While the web remains vital, relying solely on the browser may now carry missed opportunities. A prepared strategy should consider:

  1. Framework Modernization: Exploring lightweight native cores. This involves moving toward Rust-based frameworks like Tauri that interface directly with the local NPU via DirectML or CoreML, rather than relying on memory-heavy wrappers.

     

  2. Hybrid Model Deployment: Integrating Small Language Models (SLMs) like Phi-4 or Llama 3 8B inside the desktop installer. These can handle the majority of daily tasks, reserving the cloud for “Heavy Reasoning” only.

     

  3. Local Vector Databases: Utilizing local databases (e.g., LanceDB) for hyper-personalized, privacy-first “Long-Term Memory” of the user’s local files, all without requiring a cloud sync.
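The hybrid deployment in point 2 implies a routing decision for every request. A minimal sketch of such a router follows, with hypothetical names and rules: private data and offline operation force the local model, while heavy reasoning or prompts too long for the local model's context go to the cloud. The word-count token estimate and the 4,096-token limit are rough assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    prompt: str
    contains_private_data: bool = False
    needs_heavy_reasoning: bool = False


def choose_backend(
    req: Request, local_context_tokens: int = 4096, online: bool = True
) -> str:
    """Return "local" or "cloud" for a request (illustrative rules only)."""
    approx_tokens = len(req.prompt.split())  # crude estimate: one token per word
    if req.contains_private_data or not online:
        return "local"  # privacy and offline use take priority over capability
    if req.needs_heavy_reasoning or approx_tokens > local_context_tokens:
        return "cloud"
    return "local"
```

Putting the privacy check first is a design choice: a request that is both private and hard stays local and gets a weaker answer rather than leaving the device.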

Conclusion: Towards a Structural Evolution

The current landscape suggests we are moving towards more than just a passing trend. We appear to be entering a structural shift in how software is delivered. There seems to be a renewed potential for the desktop to reclaim its significance, as it offers a compelling intersection where Performance, Privacy and Profit can uniquely align.

However, the most promising products in this new era likely won’t be “desktop-only” in the traditional sense. Instead, there is a clear path for the emergence of desktop-first AI workspaces which will act as platforms that leverage cloud augmentation, sophisticated model-routing and seamless OS integration to redefine the modern workflow.

Final Thought: In 2016, we asked, “Why build a desktop app when you can build a website?” In 2026, the question is increasingly, “Why would a user trust a website with their data when their desktop can do it better, faster and more securely?”

AI seems to be shifting software architecture toward hybrid local-cloud models, which is beginning to elevate the strategic importance of the desktop environment once again.


Disclaimer: The views and opinions expressed in this article are strictly my own and reflect my personal belief in current market directions. They do not constitute professional or investment advice. Technology landscapes change rapidly, therefore, readers should perform their own due diligence and assess their specific needs before making any architectural or business decisions. I shall not be held responsible for any actions taken based on the contents of this post.