The Hidden Cost of Code Blindness in the Age of AI

June 14, 2026June 14, 2026Sandeep Mewara Leave a comment

Last month, I was looking over the shoulder of one of our engineers as they worked with an AI coding assistant. They asked a question that should have been entirely straightforward: “Who calls the validate_user function in our codebase?” The answer eventually came back. But watching them get there required a familiar and surprisingly expensive loop: reading multiple files, tracing imports, reconstructing call paths and inferring relationships that already existed inside the system.

As we stood there brainstorming around their screen, a realization struck us. What broke the workflow wasn’t the token count. It was the repetition. If anyone on the team opened a new session tomorrow and asked the exact same question, the model would perform much of the same work all over again. The relationship hadn’t changed. The code hadn’t changed. Only the cost and our collective time had.

The Cost of Rediscovery

That moment exposed a fundamental flaw in how we approach AI-assisted development. The problem isn’t that AI is inherently expensive. The problem is that AI keeps paying a premium to repeatedly rediscover the same foundational knowledge. What looks like a token limitation is actually a structural understanding problem. And that’s ultimately why our engineering team set out to build Infigraph.

AI Has Context. It Doesn’t Have Structure.

The last few years have been dominated by a single, brute-force idea: give AI more context. Bigger context windows, more capable models, better reasoning and smarter agents. All of those advances matter. But many of the questions engineers ask every day aren’t really code-understanding questions. They are system-understanding questions.

Engineers ask questions like: Who calls this function? What breaks if I change this API? Which services depend on this component? What is the blast radius of this change?

These are not primarily language problems. They are relationship problems. They are graph problems. A model can read raw text files incredibly well, but what it lacks is a persistent understanding of the architecture that connects those files together. Software systems are not just collections of files, they are collections of relationships. The industry has spent years teaching machines how to read code, but we are only beginning to teach them how to understand systems.

The Economics of Reconstructing Knowledge

Every engineering organization already possesses a vast amount of implicit structural knowledge. The system already knows which modules depend on each other, which symbols are reachable, which services communicate and which changes create downstream impact. Yet, most AI workflows require that knowledge to be rediscovered from first principles, repeatedly.

When you ask who calls validate_user, the model reads files and reconstructs relationships. Open a new session tomorrow, ask the same question and the model performs much of the same work again. The relationship didn’t change, but the cost did.

We don’t rebuild database schemas every time a SQL query executes and we don’t rebuild search indexes every time a user types a keyword. We persist structure because persistence is more efficient than rediscovery. Software systems deserve the same treatment:

Persist the knowledge once. Query it many times.

The Shift I Think We’re Entering

I don’t pretend to have all the answers for how AI and complex architectures will evolve together. But as an architect looking at how our workflows are changing, I know where the responsibility is moving. Historically, our primary effort as developers was spent translating intent into syntax. Increasingly, AI handles that translation smoothly. As that happens, the bottleneck shifts away from writing code and toward understanding architecture, change impact, dependency boundaries and system behavior.

The better AI becomes at generating code, the more critical structural understanding becomes. Generated code is only an asset if it fits correctly inside the system around it. Otherwise, it’s just technical debt written at supersonic speed. We would never build an application that rediscovered its data schema for every transaction, yet that is effectively how many AI-assisted workflows approach codebases today.

Why We Built Infigraph

As we discussed this pattern internally, a simple question emerged: If structural knowledge is repeatedly rediscovered, why aren’t we persisting it? Instead of parsing relationships from raw source files every time a question is asked, what if those relationships were represented directly? What if structural understanding became infrastructure?

That idea became Infigraph. Infigraph creates a persistent representation of codebase structure that AI agents can query directly. Rather than repeatedly reading files to discover relationships, agents can ask questions about relationships that already exist. The goal was never to replace AI reasoning; the goal was to make AI contextually aware of the broader systems it operates within.

Same Question. Same Codebase. Different Architecture.

Three principles shaped our approach:

Structure First: Code contains explicit relationships. Those relationships deserve first-class, deterministic representation.
Local First: Code intelligence should be private, fast, and fully available even when disconnected from the cloud.
Polyglot Reality: Real systems span many languages, frameworks, technologies, and internal platforms. Infigraph currently supports 63 languages out of the box because the tool should adapt to your system—not the other way around.

The Byproducts of Structural Awareness

Cost is simply the easiest metric to measure, but it isn’t the most important outcome. The more important outcome is quality. When structural relationships are treated as a foundational layer, the system answers questions with greater consistency and more complete coverage than transient inference from raw files can reliably provide.

A cheaper answer is useful, but a more complete answer is transformative. Architects care about correctness, engineering leaders care about confidence and developers care about understanding impact before making a change. Structural awareness improves all three.

When we stopped asking, “How do we slash our token bill?” and started asking, “Why are we repeatedly paying to rediscover the same relationships?” the economics fell into place naturally. Fewer files needed to be pulled into context, tool call chains became shorter, latency dropped and cost followed. Cost savings are not the primary innovation but they are a consequence of eliminating redundant engineering work.

Why We Open-Sourced It

We originally built Infigraph to solve systemic problems inside our own development workflows. But as more engineers and teams began using it, we realized that this challenge isn’t unique to us. The entire industry is moving aggressively toward AI-assisted development while software systems continue growing larger and more interconnected.

Those two trends collide around a simple question: How do we help machines understand software systems, not just individual files? We know the current trajectory: repeatedly paying to rediscover knowledge that already exists within our own codebases. That model isn’t sustainable. We believe the next step deserves community participation, scrutiny and collective engineering.

That’s why we released Infigraph as an open-source project under the Apache 2.0 license. Not because we think it’s finished, but because we believe this is a direction worth building together.

What’s Next

This article focused on the core problem. The next article (in upcoming week) will focus entirely on the engineering decisions behind our approach from graph-based representations and retrieval strategies to the tradeoffs we encountered while building local-first code intelligence.

But you don’t have to wait for that deep dive to start exploring.

⭐ Star the repository on GitHub and follow the project.
👥 Assess, Contribute and raise PR.
🚀 Install it and try it now against your own codebase.

If you hit issues, open a GitHub issue. If you want to contribute, whether that’s a new language parser, search improvements or new MCP integrations, we’d love to collaborate.

Thanks for reading. And, a special thanks to the engineers on our team who transformed a whiteboard conversation into a tool we can now share with the broader community.

. Sandeep Mewara Github
Tech Explore
Trend
samples GitHub Profile Readme
Learn Machine Learning with Examples
Machine Learning workflow

Architecting Guardrails: The Control Plane for Agentic AI

May 3, 2026May 9, 2026Sandeep Mewara Leave a comment

We are entering a new architectural phase and navigating a meaningful shift. AI systems are moving beyond static responses and into systems that can take actions like triggering workflows, calling APIs and making decisions within production environments. This is transformative.

At the same time, this shift is happening faster than most teams can fully operationalize or standardize. Across industry conversations, early implementations and emerging case studies, I believe a pattern is starting to become clear:

Most AI failures are not model failures – they are control failures.

Not necessarily because systems are poorly designed, but because:

boundaries are still evolving
failure modes are not fully understood
recovery paths are often under-defined

As we move toward more autonomous systems, we are effectively taking cautious steps into production – without always knowing how and when things might surface as unintended outcomes.

When they do, the impact is rarely isolated:

it can affect multiple customers
it can impact trust and brand perception
it can translate into real cost

When these systems scale, we don’t just scale capability. We scale uncertainty and potentially, mistakes.

I believe we are collectively building the playbook as we go and this is my attempt to make sense of what that might look like.

Guardrails: More Than Just a Safety Feature

Guardrails are no longer a theoretical concept or something that can be deferred for later. Increasingly, they are becoming a real and necessary part of building agentic AI systems.

What I still observe, though, is that in many implementations, guardrails are treated as an add-on introduced after the core system is already designed or applied post-facto to fill gaps.

Even when guardrails are considered early, they can sometimes become a checkbox exercise that makes us feel the system is “covered”, while important aspects may still be missing.

Part of the challenge is that we are still learning what “complete” actually looks like. As AI systems continue to evolve, new behaviors, edge cases and failure modes emerge – often faster than teams can fully anticipate.

This is where I have found it useful to shift how I think about guardrails. Instead of treating them as isolated checks, it helps to think of guardrails as the control plane of agentic AI.

https://learnbyinsight.com/wp-content/uploads/2026/05/gaurdrail-control-plane.png

Just as modern systems separate execution (data plane) from governance and coordination (control plane), agentic AI needs a layer that defines:

what the system can do
what it should do
how it behaves under uncertainty or failure

Without this Control Plane, we’re not really building systems – we’re simply reacting to them.

Three Questions Every Architect Should Ask

To make the idea of a control plane more practical, I have found it useful to step back and ask a few simple questions – often before writing a single prompt.

1. Can it do this? (Capability & Access)

Does the agent have the right permissions?
Are tool calls constrained?
Are access boundaries clearly defined?

Example – Billing Agent
An agent generating invoices should not have unrestricted access to pricing configuration.

2. Should it do this? (Policy & Context)

Is the action aligned with business rules?
Does it respect compliance and intent?
Is context being interpreted correctly?

Example – Support AI
Issuing refunds requires understanding policy thresholds and not just user sentiment.

3. What if it goes wrong? (Resiliency & Recovery)

Can actions be rolled back?
Is there an audit trail?
Is there a clear escalation path?

Example – Workflow Agent
Deleting or modifying customer data should always be recoverable.

If these questions are unclear, the agentic system will eventually surface that ambiguity – usually in production.

A Practical Framework for Control

Building on the idea of Guardrails as a Control Plane, it helps to think of them not as a single gate, but as a distributed system of controls.

One way to reason about this is across a few key areas:

Category	Focus	Example
Technical	Validation & thresholds	Prevent hallucinated financial metrics
Security	Access & abuse prevention	Mask PII based on user roles
Ethical	Bias & responsible behavior	Ensure fair hiring recommendations
Operational	Runtime control	Rate limits and kill switches
Infrastructure	Platform safety	Sandboxing, isolation and cost boundaries
Business	Alignment & compliance	Enforce pricing rules and customer tiers

Note: These are not independent layers – they interact continuously across the system lifecycle.

Where Guardrails Actually Live

To make this more concrete, it helps to think about where guardrails show up within a system.

They exist across the lifecycle and surface at different points as the system processes inputs, makes decisions and produces outcomes.

In practice, this often looks like:

Input validation & policy enforcement
Orchestration decisions & tool execution controls
Model grounding & memory handling
Output validation, monitoring and feedback

Each of these points represents a place where control can be applied or missed.

Hard-Earned Realities of Scaling

For engineers and architects building these systems, the gap between theory and production is where most learning happens.

https://learnbyinsight.com/wp-content/uploads/2026/05/hard-earned-reality-scaling.jpg

Here are a few patterns I have seen emerge across implementations and industry discussions:

1. The Trap of Human-in-the-Loop (HITL)

HITL is often used as a safety net. In many cases today, it’s a necessary part of deploying AI systems responsibly. At the same time, as systems begin to scale, it’s worth being mindful of how it’s used.

In practice:

humans can become bottlenecks
alert fatigue can set in
approvals can turn into routine “rubber-stamping”

The shift is not to remove HITL, but to use it more intentionally.

Design systems to be safe by default and rely on human intervention primarily for:

high-risk actions
policy exceptions
low-confidence scenarios

If every decision requires human approval, I believe the system isn’t truly autonomous instead it’s closer to a complex UI with an approval layer.

2. The Latency Tax

Safety introduces latency where every validation adds a cost in time.

Rather than forcing everything into synchronous checks, it helps to distribute controls across the lifecycle:

Pre-execution: Prevent obvious failures
In-line: Enforce business logic
Asynchronous: Audit and reconcile

3. Policy-as-Code vs. Prompt Engineering

Prompts are flexible, but brittle. Policies are enforceable. Decoupling rules from the model (using tools like Open Policy Agent (OPA) or similar approaches) allows for version control, auditability and model independence. For example, instead of encoding refund limits inside prompts, define them as policies that can be updated independently as business rules evolve.

In many ways, this becomes a key part of the control plane:

Prompts guide behaviour
Policies enforce behaviour

4. Guardrails Break Silently

A guardrail that works with one model may behave differently with another. Different models interpret constraints differently and edge cases surface in unexpected ways. For example, switching models can silently weaken compliance checks by a Contract Review Agent.

The Takeaway: Maintain a guardrail testing suite. Test adversarial cases, edge scenarios and validate across model versions. If guardrails aren’t tested, they’re just assumptions.

Two Often Overlooked Risks

As systems mature, a couple of areas tend to surface as more “silent” failure modes. They don’t always show up immediately but can have significant impact over time.

1. Economic Guardrails

Agents can loop recursively or call expensive APIs repeatedly, leading to what can effectively become a “Financial Denial of Service”.

In practice, this makes it important to introduce controls such as:

session-level budgets
token or usage limits
execution caps

Cost, in this context, becomes a control boundary – not just a metric.

2. Memory & State Management

Agents don’t just act, they remember. Over time, this introduces challenges around PII retention, long-term context storage and unintended persistence of sensitive data.

Mitigation often involves:

retention policies
PII filtering
memory scrubbing workflows

Memory becomes a liability if not managed intentionally.

The Strategic Bottom Line

To build production-grade agentic AI systems, it becomes important to think in terms of controlling:

What the system does (actions)
What it spends (economics)
What it remembers (state)

Guardrails are not just about safety – they are about sustainability and trust.

Here’s a consolidated view of how these guardrails come together:

https://learnbyinsight.com/wp-content/uploads/2026/05/poster-agent-gaurdrails-dark-v2.png

* This is still evolving but having a structured way to think about it helps in designing systems that scale.

Final Thought

Autonomy is the promise of agentic AI. But autonomy without control isn’t innovation – it’s risk.

As architects, our goal isn’t just to make AI systems work but to make them predictable, controllable and trustworthy over time.

The model is the engine.
Guardrails are the steering, the brakes and the dashboard.

.Sandeep Mewara Github
Tech Explore
Trend
samples GitHub Profile Readme
Learn Machine Learning with Examples
Machine Learning workflow

Agentic AI for Existing Codebases: A Practical Path to Getting Started

April 26, 2026April 27, 2026Sandeep Mewara Leave a comment

In the current engineering landscape, there is an unrelenting pressure to chase the “new”. Our LinkedIn feeds are dominated by AI-native learnings, startups and autonomous agents building entire applications from a single prompt in days. For many of us, this creates a strange disconnect.

Most engineers aren’t working on greenfield AI experiments. They are responsible for systems that have been running for five, ten or even fifteen years. These are the stable, revenue-generating engines that form the backbone of successful businesses. They are battle-tested, high-stakes and complex.

If you are maintaining one of these systems, it is easy to assume the Agentic AI Wave isn’t meant for you. You might look at your unique architectural patterns or your “legacy” constraints and conclude that an AI agent simply wouldn’t understand.

I’d offer a different perspective: These tools are most transformative in the systems you already understand deeply. You haven’t missed the wave instead you are simply waiting for the right entry point.

From Manual Assistance to Actual Leverage

You might not have integrated AI into your workflow yet. Many teams have already begun doing so and those who have started likely use it for tactical tasks: explaining an obscure regex, generating a unit test for a utility function or writing a quick bash script.

This is a significant step forward, but it remains manual and reactive. Using AI this way is like hiring a brilliant senior consultant but refusing to give them a badge, documentation or context. You spend half your mental energy explaining the “why” before they can even start on the “how”.

When you attempt to move toward Agentic AI – you allow an agent to navigate your repository and suggest multi-file changes. This lack of context becomes a technical liability. Without a “Project Constitution”, the agent is forced to make guesses. Usually, it will:

Default to modern “generic” patterns that are incompatible with your specific tech stack.
Miss hidden architectural constraints decided years ago for specific performance or security reasons.
Suggest “best practice” refactors that look correct in isolation but break your production logic.

The result isn’t just a failed task but it’s wasted time and unnecessary token burn.

The Missing Piece: Contextual Onboarding

Agentic AI doesn’t fail because it lacks power. It fails because it lacks context. Much of your system’s “source of truth” doesn’t actually live in the code. It lives in your head, in tribal memory, in wikis or buried in old Jira or PR descriptions.

The goal isn’t to “teach” the AI everything. It is to provide a minimalist, structured map that allows the agent to operate safely within your boundaries.

The same idea applies to any work with structured systems of any kind like operations workflows, data pipelines, internal tools, etc. Whether it’s code, processes or documentation, the moment you define the rules clearly, the quality of output improves dramatically.

A Practical Starting Point: The `claude.md`

You don’t need a massive infrastructure change to begin. You can start by creating a claude.md file in your project root. This is your “Project Constitution” – a system guide. It should be precise, technical and grounded in reality.

Start simple, example claude.md:

# Project Guidelines

## Tech Stack
- Node.js 16
- Express
- MongoDB

## Rules
- Do not upgrade dependencies unless asked
- Follow the existing folder structure
- Write tests using Jest

## Notes
- This is a legacy system, avoid large refactors

That’s it. No perfection needed to start. By spending fifteen minutes defining these boundaries, you give the agent more leverage than 90% of teams currently provide. You can refine it over time.

Expanding the Framework: Skills

Once your “Constitution” is set, you can begin defining Skills via a skills.md file. While the claude.md is global, Skills are modular playbooks for recurring workflows.

For example, if you frequently ask the agent to “Add a new API endpoint” or “Migrate a component to TypeScript”, you should document the exact steps those tasks require in your specific environment. These acts as a repeatable playbooks that reduces the back-and-forth and ensures the agent follows your team’s established SOPs (Standard Operating Procedures) when needed.

A Mentor in Your Pocket: Codex-Claude

As you begin to rely more on these agents, you’ll find that “Instruction Engineering” is a skill in itself. If your agent is still going off-track, the issue is almost always an ambiguity in your instructions.

This is why I have been developing Codex-Claude. Think of it as a Linter for your Agentic Strategy. Just as a code linter catches syntax errors, Codex-Claude analyzes your claude.md and skills.md to catch “intent errors”.

The tool helps you with:

Automated Architectural Audit: Instantly evaluates your files against best practices and provides a weighted score across structure, specificity and completeness
Precision Refactoring & Compaction: Identifies ambiguity and redundancy, rewriting instructions to be more concise and context-efficient
Intelligent Conflict Resolution: Detects contradictions and instruction drift, ensuring rules are placed correctly – either within global rules or specialized skill files
Progressive Learning Loop: Turns every optimization into a learning opportunity by explaining the “why” behind changes by linking changes to official documentation

You don’t need this to get started, but it helps once you begin refining your setup for more complex tasks.

You can explore and try it out LIVE here: https://sandeep-mewara.github.io/codex-claude/

Watchouts

As you start this journey, keep these three principles in mind:

Be precise, not verbose: Every line is context the agent must process. Clear constraints beat long explanations
Use tests as safety rails: The agent provides speed. Your test suite provides safety. Never accept changes that have not passed your CI/CD baseline
Iterate on Instructions: If an agent fails a task, it likely misunderstood something. Treat it as a bug in your claude.md and fix the instruction

The Architect’s Path Forward

The expectation for delivery speed in our industry is fundamentally shifting. However, adopting Agentic AI isn’t about “coding faster” but it’s about reducing the mental tax of working with mature, complex systems.

You don’t need a new project or deep AI expertise to benefit from this. You just need to start small:

Select one module or one feature
Draft a simple claude.md that defines that module’s rules
Run one task with an agent and observe the difference

The systems that power today’s businesses don’t need to be replaced. They just need the right leverage to move into the future.

. Sandeep Mewara Github
Tech Explore
Trend
samples GitHub Profile Readme
Learn Machine Learning with Examples
Machine Learning workflow

The Lifecycle Is the Product: AI Development Engine

April 19, 2026April 19, 2026Sandeep Mewara Leave a comment

Every team eventually finds itself rebuilding the same foundational setup in every project. In many organizations, this is still a manual struggle. We write “how we work” docs, define naming conventions and establish review gates that live only in our wikis. For teams already leveraging AI, this setup often exists in isolated pockets like a collection of disconnected prompts telling an assistant to “act as a PM” or “design like an architect”.

In both cases, the expertise remains trapped in silos. For those working manually, the immediate opportunity is to use AI skills to jump-start their specific tasks. But once you do, you quickly reach a plateau – while individual skills and prompts have become portable, the lifecycle around them has not.

That gap is what the Lifecycle Agent Orchestrator (LAO) tries to close. It’s a plugin for Claude Code and Cursor that uses multi-agent orchestration to ship the development process itself as a versioned, overridable artifact. Not just the individual role skills. The stitching between them.

The Problem: Skills Are Portable, Process Is Not

Current AI tools offer impressive specialized skills – performing architecture reviews or enforcing testing conventions with high precision. However, these tools still operate in isolation.

Modern software delivery doesn’t happen in a vacuum. It flows through a series of high-stakes handoffs. This is where even the most advanced teams hit a wall. Despite the promise of automation, the broader lifecycle remains stubbornly manual:

The Cognitive Load of Fragmentation: Engineers must still manually orchestrate which tools to run and when, creating a massive tax on context.
Traceability Decay (Drift): We lose the “intent” of a feature as it travels from a product ticket to a design mock and finally into the codebase.
Simulated Handoffs: We still rely on manual “persona-switching” – manually checking if a design works for a developer or if an architecture suits a product goal.
The Proof Problem: At the point of release, we still rely on assumptions rather than programmatic proof that we’ve satisfied every original requirement.

LAO moves beyond single-prompt interactions by employing multi-agent orchestration to bridge the high-stakes handoffs between roles.

The skills are the actors. The lifecycle is the director.

Step 1: Jump-Start Your Work with AI Skills

The quickest way to see value isn’t by changing your entire workflow. It’s by using individual skills to improve the work you’re already doing.

Each role in the plugin is independently usable. You don’t need the full pipeline to get an immediate win:

Engineers can use the Intake skill to turn a messy Jira ticket into a clean scope with real acceptance criteria.
PMs can use the Product Management skill to draft a structured PRD.
Architects can define a technical design running the Architecture skill against a requirement.

# Direct use of individual skills
Invoke code-review skill to review these changes
Invoke intake skill to extract scope from PROJ-5678

This “Step 1” approach pays back immediately.

You get a senior-level assistant for specific tasks without committing to a new workflow.

Step 2: The Orchestrator as the Director

Over time, this pattern exposes a critical bottleneck. We find ourselves questioning: Which skill comes next? Did we skip a step? Are we aligning roles or just checking boxes? Here, we hit the ceiling of isolated tools.

Once you trust the individual skills, the orchestrator stitches them together into a cohesive system.

# command inside a Claude Code or Cursor session
# Direct Jira story ingestion
/lao Work on PROJ-1234

# An ungrounded requirement
/lao Add a user notification preferences API endpoint

# a tire-kick before committing to anything
/lao-dry-run

Through multi-agent orchestration, LAO ensures that the PM, Designer and Architect personas actively review and challenge each other’s outputs.

The goal isn’t just automation – it’s coordination.

Core Internals

The following are a few key design decisions that power the LAO.

The Nine-Phase Engine

The pipeline is nine phases, structured into two halves with different personalities:

Alignment (Phases 1–3): Product, Design and Architecture align early. They cross-review every output to catch gaps before engineers write a single line of code.
Execution (Phases 4–9): Once the team establishes alignment, the system drives the project through scope, design, planning, implementation, validation and shipping.

The key shift is simple: Alignment happens once, upfront. Execution happens without rework.

Project-Specific Infrastructure

Overlays let you define how your specific system works – ensuring your project remains the domain authority. Under the hood, each phase composes up to three layers of knowledge:

Layer	Lives in	Contains
Base	Plugin	Universal rules for the role
Overlay	Project	Project-specific patterns, stack, conventions
Domain	Project	Cross-cutting domain knowledge (auth, payments, compliance)

A project looks like this once it’s connected:

If a project already has its architecture docs scattered across docs/, there’s no need to move anything. A lao.config.yaml at the project root maps existing files into the engine:

project_name: my-app
languages: [python, react]

overlays:
  architecture: docs/architecture/standards.md
  coding-standards: .cursor/rules/coding.md

domain:
  - docs/domain/*.md
  - src/payments/DESIGN.md

extra_roles:
  compliance-review: tools/compliance/SKILL.md

There are two discovery paths – the convention directory or the config file. If both exist, the config file wins because project-specific overlays take priority.

This is the project respecting itself as the domain authority.

Preview, Then Execute

Every run begins in simulation – a preview of the nine-phase pipeline that writes no files, creates no branches and posts no Jira comments. The orchestrator walks through Phases 1–6, simulating execution to produce realistic PhaseOutput objects and checkpoints for your iteration. It then summarizes Phases 7–9 as projected outcomes, as these require real code execution.

When you’re ready, you say proceed and the pipeline replays – but with the preview’s decisions carried forward instead of regenerated:

The system eliminates both upfront cost and the risk of committing to a flawed plan.

Acceptance Criteria, Tracked Across Phases

The system captures acceptance criteria during Intake and tracks them through to Validation, where you must prove each one with recorded evidence to unlock the “Ship” gate. The CLI renders this data as text today, but a dashboard could render it visually tomorrow without requiring any changes to the engine. Every phase emits a PhaseOutput – a structured object with a defined schema.

--- Phase: Tech Design (Phase 5 of 9) ---
Status: Needs Approval

SUMMARY:
  Add rate limiting middleware to API gateway.
  No new dependencies, config-driven thresholds.

ARTIFACTS:
  - [design_doc] docs/design/rate-limiting.md

ACCEPTANCE CRITERIA (tracked):
  AC1: Rate limit of 100 req/min/user ...... pending
  AC2: Returns 429 with retry header ....... pending
  AC3: Configurable per environment ........ pending

→ Approve to proceed to Plan or request changes.

No claims without fresh proof – that’s the whole point of the validation gate.

Multi-Language, Without a Fork Per Language

Four skills need to know what language they’re looking at: coding-standards, testing-conventions, code-review and security. Each has a universal base and a language pack for the specifics:

Currently, plugin supports Python, Java, C# and React. Detection runs once at pipeline start: if lao.config.yaml lists languages, use them, otherwise scan for pyproject.toml, pom.xml, *.csproj, package.json with a React dep and collect every match. A full-stack repo auto-detects as [python, react] and both packs get loaded. The agent applies each to the right file types.

Adding a new language – Go, Rust, anything – means creating a references/<language>/directory in those four skills with the expected files, plus a couple of lines in detection and validation scripts.

No change to the universal base. That separation is worth preserving.

Role vs. Workflow Split

The design deliberately separates Phases 1–5 (Role-based) from Phases 6–9 (Execution-based).

Phases 1–5 (Roles): These phases use individual skill files (PM, XD, Architecture, Intake) through multi-agent orchestration because judgment varies by project. A fintech audit requires different logic than a game engine pipeline, so project overlays merge with these base skills to provide local context.

Phases 6–9 (Workflows): These phases power the orchestrator’s core engine (TDD, validation, shipping) and maintain tight coupling for continuity. Unlike roles, workflows use substitutions. If you override a workflow, such as swapping TDD for BDD, the new logic replaces the built-in engine entirely rather than layering on top of it.

# Override the workflow for a single phase
workflows:
  
  # BDD instead of TDD
  implement: docs/workflows/our-bdd-process.md

  # custom release flow
  ship: docs/workflows/our-release-process.md

Separate judgment from execution to protect flexible strategy without sacrificing delivery.

When to Use This and When Not

The Sweet Spot: Use this if you pair Claude Code or Cursor with Jira/PRD-driven intake. It excels for teams that front-load design and track ACs to the finish line. The engine treats multi-language and monorepos as first-class citizens, using config-based discovery to navigate complex structures.

The Breaking Point: Avoid this for ad-hoc work lacking tickets or defined ACs. The fit weakens if your “ship” phase involves unmodeled complexity – like mobile store submissions or if you require unattended, autonomous execution. The plugin is designed as a human-in-the-loop engine. It doesn’t chase full autonomy – yet.

What Changes When You Adopt This?

The most immediate change is practical: Your development process leaves the wiki and enters your repository as a versioned artifact.

But adoption doesn’t have to be a cliff. You start by using individual skills (single agent) to improve local tasks. As you build trust, you let the orchestrator (multi-agent) handle the parts that are hardest to do manually – the handoffs, the alignment and the validation.

Over time, the shift becomes structural:

Handoffs become explicit gates
Requirements become traceable
Validation becomes evidence-driven
The lifecycle becomes consistent

I expect the next iteration of the tool to automate this entire flow. For now, we must build it by hand – or, more precisely, install it.

Closing Thought

This isn’t about replacing how teams work. It’s about making how they work explicit and reliable.

The lifecycle stops being something you document. It becomes something you execute and once that happens, it’s no longer just process.

It’s part of the product.

. Sandeep Mewara Github
Tech Explore
Trend

Repository & Contribution

The Lifecycle Multi-Agent Orchestrator is available as an open-source project. I encourage you to explore the repository, use the individual skills to jump-start your own work and contribute to the evolution of portable development engine.

GitHub: sandeep-mewara/lifecycle-agent-orchestrator
Documentation: Detailed design specs and phase contracts are included in the repo.

In practice, this kind of artifact only gets better when it’s applied across different projects and constraints.

.
Machine Learning workflow

Mastering the SKILL.md File in Agentic AI: A Complete Guide

April 5, 2026April 5, 2026Sandeep Mewara Leave a comment

In modern Agentic AI architectures, the primary engineering challenge is no longer generating language, but bridging the gap between conversational intent and reliable, repeatable and unambiguous execution. To achieve this, we must treat agent capabilities not as conversational shortcuts, but as well-defined engineering assets.

This requires a standardized contract for capability execution. That’s where SKILL.md comes in. A formal, machine-parsable definition file that acts as a Standard Interoperability Definition (SID) contract for systematic task execution within an agentic framework.

In this blog, I’ll dive deep into SKILL.md and share how it serves as a single source of truth for both conceptual planning (roles) and procedural execution (workflows) that power an automated, engineering-grade SDLC.

The Architectural Blueprint: The SKILL.md

A SKILL.md is structured as an engineering specification, designed for zero-ambiguity parsing by an LLM like Claude. It defines the contract for interoperability, forcing teams to move from conversational requests to precise capability definitions.

Anatomy of an Engineering Contract

The specification consists of five required metadata fields that are immutable and machine-parsable:

Name: An immutable, unique, system-wide identifier for the capability (e.g., internal-token-manager-v1, exec-raise-github-pr-v1, or sdlc-pm-v1). This is the system’s handle for the skill.
Description: Critically, this is not a summary. It is the definitive Trigger Event Definition. It must be written from the perspective of an event, user query or internal signal that activates this capability, allowing the framework to perform accurate skill matching. Example: “Triggers automatically after a successful code analysis scan…”
Commands: A list of executable operations or prompts defined by the contract. For procedural skills, these map to API endpoints or internal function calls. For conceptual skills, these map to defined prompt sequences. Example: get-linter-report(timestamp) or refresh-token(service_id).
Constraints: A critical safety and resource management section. It defines the limits, rules and error conditions of the contract. Example: “Internal authentication tokens must expire after 1 hour.”
Examples: These are not suggestions but are the gold standard of Expected Behavior. They define the intended output for specific input scenarios, providing the LLM with a definitive blueprint for successful execution and reducing non-deterministic output.

# Code Snippet 1: Sample Procedural SKILL.md (Raise GitHub PR)
---
# REQUIRED METADATA FIELDS (SID CONTRACT)

name: exec-raise-github-pr-v1
description: Triggers automatically after a successful 'exec-linter-code-analyzer-v1' scan or upon user request to systematically raise a new pull request on GitHub for reviewed code.
commands:
  - create-pr(repository_url, head_branch, base_branch, title, body)
constraints:
  - Must use a valid GitHub API token with 'repo' scope.
  - Head branch must differ from the base branch.
---

### Expected Behavior (Examples)

When this skill is matched against a standard JavaScript repository:
  - Input: create-pr("https://github.com/org/repo.git", "feat/new-api", "main", "Feat: Add API v2", "This PR introduces...")
  - Execution: Loads 'scripts/create_pr.py'.
  - Output: New PR URL.

Directory Structure & Progressive Disclosure

The SKILL.md is packaged within a defined directory structure, ensuring all supporting assets are decoupled and version-controlled alongside the specification.

.Sandeep Mewara Github

📄 SKILL.md (The only required asset, containing the definitions and contract).
📁 scripts/ (Optional: Decoupled logic – Python, Bash, Node.js, etc. The implementation details of the contract).
📁 references/ (Optional: Docs, checklists, design patterns or standards the skill must adhere to).
📁 assets/ (Optional: Templates or sample data).

This decoupled architecture enables the Progressive Disclosure Pattern, which is critical for system efficiency and managing token constraints. A high-performance agentic system should not load every asset for every skill simultaneously. Progressive disclosure ensures assets are loaded only when necessary.

Agents don’t load everything at once. They discover and expand context only when needed.

Architecting the Automated SDLC

The standardization offered by SKILL.md allows us to architect and separate the dynamic pillars of an automated SDLC, managing all capabilities via this single specification. In a professional lifecycle, conceptual setup (Defining Roles) always precedes procedural execution (Executing Workflows).

Conceptual Role-Based Skills: Defining the Contract for a Persona (Planning & Setup)

To initiate any SDLC phase (e.g., Requirements), we must first define the conceptual frameworks, knowledge bases and systematic planning workflows of specific roles that help organise content by domain (behaviour-driven). We apply the identical SKILL.md standard to define a persona’s “mindset”.

WHAT: SKILL.md definitions for Product Manager Persona or Lead Developer Persona.
APPLICATION: During the “Requirements” and “Design” phases of the SDLC.
ARCHITECTURAL FLOW: During planning, you activate the Product Manager Persona (Code Snippet 2). Claude adopts this mindset and leverages knowledge references (e.g., Agile standards) and the command contract (draft-prd(user_stories)) to provide focused, high-quality requirements.

Code Snippet 2: Sample Conceptual SKILL.md (Product Manager)
---
# REQUIRED METADATA FIELDS (SID CONTRACT)

name: sdlc-pm-v1
description: Triggers during project initiation to define the persona, responsibilities, knowledge base and systematic planning workflows of a senior Product Manager.
commands:
  - draft-prd(user_stories, acceptance_criteria)
  - run-feature-prioritization(prd_document)
constraints:
  - Must reference files in the optional 'references/' directory (e.g., 'references/agile-standards.md') for all Agile terminology.
---

### Expected Behavior (Examples)

When this skill is matched to a new project request:
  - Input: draft-prd(user_stories, acceptance_criteria)
  - Execution: Loads 'references/agile-standards.md' to define terminology.
  - Output: A structured PRD document based on the internal persona.

External Workflow Execution Skills: Defining the Contract for the Workflow to ‘Do’

Once the groundwork is established and the build begins, the agent’s focus shifts to user-triggered workflows (e.g., after a commit). These skills are guides that help perform specific, measurable steps in the automated pipeline, providing the user with domain-specific results (task-driven).

WHAT:SKILL.md definitions for exec-linter-code-analyzer, exec-raise-github-pr, or jira-ticket-update.
APPLICATION: During the “Build,” “Test” and “Deploy” phases of the SDLC, typically automated by CI/CD events.
ARCHITECTURAL FLOW: After a successful code implementation event, the framework activates the exec-linter-code-analyzer-v1 (Code Snippet 3). Claude reads the inputs and expected behavior. The framework executes the decoupled logic (scripts/) to systematically create the pull request, ensuring a reliable result (the PR URL) is provided back to the user’s workflow or CI/CD pipeline.

Code Snippet 3: Sample Procedural SKILL.md (Code Analyzer Workflow)
---
# REQUIRED METADATA FIELDS (SID CONTRACT)
name: exec-linter-code-analyzer-v1
description: Triggers automatically after a code commit event to execute a static analysis and linter scan on the modified files in a specific repository, providing a systematic JSON report.
commands:
  - run-analysis(repository_url, branch)
constraints:
  - Must use a valid GitHub API token with 'repo' scope.
---

### Expected Behavior (Examples)
When this skill is matched following a code commit:
  - Input: run-analysis("https://github.com/org/repo.git", "main")
  - Execution: Loads 'scripts/run_analysis.py'.
  - Output: Linter report JSON.

Internal Agent Operational Skills: Defining the Contract for the Software to ‘Be’

To ensure system stability, the agent software itself requires precise, standardized contracts for core operational tasks (like authentication, state, error handling, api-call, etc). These skills are operational and invisible to the SDLC workflow itself. They focus on the agent’s internal robustness and platform integrity.

WHAT: SKILL.md definitions for internal-token-manager or agent-state-historian.
APPLICATION: Triggered automatically by the agent’s orchestration layer during defined lifecycle events (e.g., establishing a session state, refreshing an expired 401 token).
ARCHITECTURAL FLOW: When any skill requires access to a restricted API, it activates the internal-token-manager (Code Snippet 4). Claude reads the command contract (refresh-token(service_id)). The framework executes the decoupled logic (scripts/) to refresh the secure token, ensuring the agent software can authenticate without creating brittle, direct credential dependencies in the domain-level skills. This internal complexity is hidden from the user but critical for security and robustness.

Code Snippet 4: Sample Procedural SKILL.md (Token Manager)
---
# REQUIRED METADATA FIELDS (SID CONTRACT)
name: internal-token-manager-v1
description: An internal operational skill that triggers throughout a workflow when the agent detects it requires a secure token to authenticate against an external service (e.g., GitHub, Slack, Splunk).
commands:
  - refresh-token(service_id)
constraints:
  - Must use a valid agent credential secret (e.g., 'agent_platform_secret').
  - Tokens must expire after 1 hour.
---

### Expected Behavior (Examples)

When this skill is matched when a GitHub operation requires auth:
  - Input: refresh-token("github_api")
  - Execution: Loads 'scripts/refresh_token.py'.
  - Output: New OAuth token JSON.

The Boundary of Autonomy and the Expertise Gap

While standardizing capabilities via SKILL.md is essential, I believe it is critical for architects to also define where SKILL.md is not the right tool. My own perspective, based on recent project implementation, is that a common architectural failure is expecting SKILL.md to easily encode true Domain Expertise and Heuristic Judgment.

Offloading Heuristics vs. Offloading Wisdom

A well-defined SKILL.md is designed to be precise, measurable and standardized. It excels at offloading common known items, standard checklists and systematic patterns into reliable workflows (as seen in our Code Snippets 3 & 4). In my recent project, this precision made the skills function as excellent fixed checklists, significantly reducing operational ambiguity.

This same precision, however, means it can appear only as a checklist. A procedural skill like exec-linter-code-analyzer can identify a syntax error based on a rule, but I found it often lacked the domain wisdom to understand the conceptual design decision that led to that error.

Assisting Expertise, Not Replacing It

Based on the experience so far, I believe that you cannot easily encode a senior engineer’s years of nuanced design thinking into a SKILL.md description. The true architectural value of a standardized specification is that it offloads the reliable execution complexity, allowing the Human Expert (or a high-level Agentic Persona) to focus entirely on core domain and design reasoning.

For now, I believe following a model where three distinct pillars of knowledge are defined will work out:

Systematic Workflows (Procedural Skills): Handled perfectly by SKILL.md. (The “What to Do”)
Conceptual Frameworks (Persona Mindsets): Setup by SKILL.md. (How Claude “Thinks”)
Domain Wisdom & Design Reasoning: Passed as the problem context in the main prompt. (Why Claude “Decides”)

Engineering Best Practices for SKILL.md Mastery

Achieving systematic capability definition requires adhering to these foundational best practices:

Strict Decoupling: Never place the execution logic (e.g., Python code) directly within the SKILL.md file. The SKILL.md is the specification & the scripts/ directory is the implementation.
Immutability: Once a skill is deployed, treat its metadata (Name, Description, Commands) as immutable. Any significant change requires a new version (e.g., exec-raise-github-pr-v2). Brittleness often stems from changing definitions in place.
Description as a Trigger: Never write a summary description (e.g., “This skill runs a linter”). It must be written as a trigger definition (e.g., “Triggers automatically after a context save event…”). Skill matching depends entirely on this accuracy.
Token Economy: Adhere to strict size constraints: < 500 lines and < 5k tokens for the SKILL.md. The Progressive Disclosure pattern will handle heavier assets, keeping the SID itself focused and parseable.
Git-Managed Context: Treat SKILL.md files as code. They must be version-controlled in Git, promoting discoverability, reuse and providing a traceable history of how capabilities have evolved throughout the lifecycle.

Final Thought: A Standard for Scaling Autonomy

By adopting the SKILL.md specification, we move from fuzzy conversational AI to a structured engineering discipline, where all agent capabilities, whether they are internal operational requirements, external user workflows or conceptual roles framework – all are defined by precise, version-controlled contracts.

This foundation standardizes reliable execution complexity, not only making your automated SDLC predictable and robust but also ensuring that precious domain expertise remains focused on main design decisions, not common patterns. Mastering the SKILL.md standard is the definitive, interoperable foundation for building scalable, maintainable and engineering-grade AgenticAI architectures.

. Sandeep Mewara Github
News Update
Tech Explore
Trend
samples GitHub Profile Readme
Learn Machine Learning with Examples
Machine Learning workflow

[DOWNLOAD: skill.md Quick Reference Guide]

Agentic AI for Beginners: My Journey into Building with Claude

March 30, 2026March 30, 2026Sandeep Mewara Leave a comment

There’s been a lot of buzz around Agentic AI lately, especially around how powerful Claude can be when used beyond simple prompting. Naturally, I got curious.

As an architect, I wanted to understand what “agentic” really means in practice. What changes when we move from prompts to agents? And what does that mean for how we design systems? As I started exploring, it became clear this isn’t just about smarter chatbots, it’s something more.

From Prompts to Agents: What’s the Difference?

Before diving in, let’s distinguish between Generative AI and Agentic AI.

Generative AI (Reactive) – Deals with Prompts, where we provide an input and the model provides a one-time response. We are the orchestrator.
Agentic AI (Proactive) – Deals with Agents, where we provide a goal and the model determines the steps, uses tools and iterates until the goal is met. The model is the orchestrator.

“Agentic” means moving from chatting to delegating. It’s the difference between asking for instructions and having the task completed for you, like getting a recipe vs hiring a chef or asking for directions vs being driven there.

The “Agentic” Starter Pack

I started with the basics to see how Claude handles the “plumbing” of a real-world project. My exploration focused on three core hyped items:

Agentic Implementation: I moved away from “one-off” prompts and built loops where Claude runs a Plan -> Execute -> Test -> Fix cycle autonomously.
Model Context Protocol (MCP): I hooked Claude up to my local filesystem, Slack & GitHub. This was to see how the agent “reaches out” and queries the data it needs directly.
Role-Based Division: I experimented with “Agent Teams” by giving different Claude instances specific roles: one as the Architect to handle planning and another as the Developer to handle implementation. Further, tried to put multiple hats for the clarity of work distribution and decision making for the agent.

My Learning Project: Endpoint Watch Agent (EWA)

The goal of this project was to build a hands-on learning kit for agentic systems. Endpoint Watch Agent (EWA) is a Python-based agent that continuously monitors configurable endpoints (websites or APIs). When an endpoint is down or unhealthy, the agent autonomously evaluates the incident, avoids duplicate alerts, creates a ticket and sends a contextual Slack notification.

Flow Diagram Plan

Structuring the Workflow

Starting from nothing, I worked with Claude itself to set up the structure and segregation of components, defining single responsibility. To keep things simple, created a single agent (Orchestrator) that runs one sequential loop: Check Endpoint 1 → Decide → Act → Check Endpoint 2 → Decide → Act...

The PolicyEngine is not an agent but a pure function called by the agent. The tools are interfaces that the agent dispatches, while the MCP servers are external services.

Explore or build on the Project available here: [Github Link]

What I Learnt: The “Pro” Framework

The real breakthrough wasn’t the model itself, but how I structured the project to guide it. Below project structure can be considered a good architectural template as a baseline start for any agentic development. The architectural pattern supports a clean separation of concerns, where we can add new tools, policy rules or tests without needing to restructure the entire system.

As a production-ready baseline though, it has gaps: no tests, single-threaded endpoint checking, no metrics, no graceful shutdown. These are solvable without rethinking the architecture, but they’d need to be added before shipping anything real.

I found that following four pillars are essential for any agentic workflow:

CLAUDE.md (The Project Brain)

This file lives in the root of your repo as the AI’s operating manual. It tells Claude agent who it is and how it should behave in this specific codebase. Thus, it helps to start with shared context instead of inferring everything from scratch each session.

# Project Context: Endpoint Watch Agent (EWA)

## Role & Mission
You are the **EWA Specialist**. Your goal is to maintain a high-availability monitoring system. You prioritize accuracy in incident detection and clarity in Slack notifications.

## Tech Stack
- **Runtime:** Python 3.12
- **Logic:** Policy-based reasoning (PolicyEngine)
- **Integrations:** Slack (Alerts), Jira (Tickets), GitHub (MCP)

## Architecture Rules
- **Separation of Concerns:** Keep tools in '/tools', logic in '/engine'.
- **Async First:** Use 'asyncio' for all network-bound endpoint checks.
- **No Deletions:** Never delete incident logs, only archive or update status.

## Dev Commands
- **Run:** 'python main.py'

CLAUDE.md is the interface between the human who designed the system and the AI that extends it. It’s not a documentation for users of the tool instead is a documentation for the next builder, human or AI.

SKILLS.md (The Capability Manual)

While CLAUDE.md is about the project, SKILLS.md is about what the agent is capable of doing. It provides pre-verified “recipes” for complex tasks, stopping the agent from hallucinating its own (often broken) logic.

# Agent Skills

## Skill: Incident Evaluation
- **When:** An endpoint returns a non-200 status.
- **Action:** 
  1. Check 'storage/incidents.json' for active tickets.
  2. If new, invoke the 'JiraTool' to create a "Critical" task.

## Skill: Slack Formatting
- **Constraint:** Always include the Status Code, Response Time, and the "Runbook Link" from the configuration file.
- **Tone:** Professional and urgent.

These are the procedural instructions or documentation that teach the agent how to use a tool effectively in a specific context.

“Plan, then Execute” Workflow

I stopped asking Claude to “just do it”. Instead, I enforced a mandatory two-step gate:

The Plan: Claude must output a step-by-step technical plan first.
The Approval: I review the plan for architectural alignment.
The Execution: Only after approval does the agent start writing code. This eliminates 90% of the “rabbit holes” agents often fall into.

Verification Criteria

Never ask an agent to “fix a bug”. Instead, ask it to “Fix the bug and provide the specific CLI command or test case to verify the fix”. It seems an agent that knows it has to prove its work is significantly more accurate and less likely to hallucinate a “done” state!

What I Learnt: Behavioral System Design

EWA is built like a Claude agent where it has a brain (orchestrator), reasoning (policy engine), senses (endpoint checker), hands (Jira + Slack tools) and memory (incident store).

Thus, moving beyond simple monitoring, this system creates a truly agentic closed loop: it observes, reasons, decides, acts and remembers, closing the gap between detection and autonomous resolution. This is what differentiates a single prompt from a system that operates.

If designed properly, the orchestrator never does anything directly. It asks tools to observe, asks the policy engine to reason, then dispatches to tools based on the decision. Every component has one job and knows nothing about the others.

Thus, with agentic systems, we start to define goals, shape decision boundaries, orchestrate tools and design workflows. The unit of design has moved from “What does this function do?” to “How does this system behave over time?”. This is very different and is a significant mindset shift.

What I Learnt: The Operational Reality

This is where Agentic AI gets interesting and at the same time risky. They are not just capable but are also more complex to reason about.

What’s Exciting (The Wins)

Self-Healing Workflows: Automation of operational tasks where systems can adapt to minor changes instead of simply breaking
Engineering Velocity: Drastic reduction in manual intervention for complex, multi-file refactors

What’s Hard (The Risks)

Observability & Non-Linear Debugging: Traditional logs don’t help much when an agent enters a logic loop. It becomes difficult to answer: “Why did the agent choose this specific tool at this specific time?” Tracking these non-linear flows requires a completely different observability stack.
Guardrails & Cost: Without structural “circuit breakers”, agents can enter recursive loops that transform a technical logic error into a financial one. In an agentic world, unguided autonomy doesn’t just crash a service, it can drain token budgets in minutes.

What I Learnt: The Shift to “Specification of Judgment”

The biggest realization was the shift in our roles: The engineer’s job is becoming the specification of judgment.

We are moving away from writing line-by-line code and towards translating domain knowledge (e.g., Don’t auto-close the Jira ticket on recovery instead leave that to humans), operational experience (e.g., What if the MCP server subprocess hangs instead of failing?) and trust calibrations (e.g., Trust the agent to send Slack alerts without human review: yes) into rules the agent can follow.

Claude handles the execution, but its success depends on our ability to articulate why a system should behave a certain way, not just what it should do. This requires architectural experience to anticipate what could go wrong and the clarity to express those constraints precisely.

Final Thoughts: The Evolution of How We Build

It’s only a matter of time. While the technical risks are real today, the pace of advancement is blistering. We are witnessing a total paradigm shift: we aren’t just writing code anymore, instead we are managing a digital workforce.

For architects, this means rethinking system boundaries. For developers, it means thinking in workflows. I am excited to adapt! This isn’t the evolution of standard coding but the evolution of how we build.

.Sandeep Mewara Github

News Update
Tech Explore
Trend
samples GitHub Profile Readme
Learn Machine Learning with Examples
Machine Learning workflow
What is Dynamic Programming

Explore or build on the Project available here: [Github Link]

Microsoft WebView2 – A new friend for native apps!

March 7, 2021May 9, 2021Sandeep Mewara Leave a comment

Now, we can embed web content (HTML, CSS, and JavaScript) in our native applications with Microsoft Edge WebView2. Earlier, it was announced only for Win32 C/C++ apps but later, it was announced available for use in .NET 5, .NET Core, and .NET Framework Windows Forms and WPF applications.

It uses the modern Microsoft Edge (Chromium) platform to host web content within native Windows applications.

In the future, the Evergreen WebView2 Runtime plans to ship with future releases of Windows. Deploy the Runtime with your production app until the Runtime becomes more universally available.
Microsoft recommendation

Power of Native

With growing presence of online world, desktop applications are pushed to use more of online like capabilities. Further, web solution also helps reuse most of the code across different platforms and easy change delivery. This is leading desktop applications more and more towards hybrid approach where best of both (native and web) can be leveraged.

Microsoft WebView2 comes to rescue. It helps build powerful applications with controlled access to native capabilities.

WebView2 uses the same process model as the Microsoft Edge browser. A browser process is associated with only one user data folder. A request process that specifies more than one user data folder is associated with the same number of browser processes.

More details about browser process model can be read here.

WebView2 apps create a user data folder to store data such as cookies, credentials, permissions, and so on. After creating the folder, your app is responsible for managing the lifetime of the user data folder, including clean up when the app is uninstalled
Microsoft – Managing user data folder

Microsoft has laid here some best practices for developing secure WebView2 application.

Distribution

When distributing your WebView2 app, ensure the backing web platform, the WebView2 Runtime, is present before the app starts.

By default, WebView2 is evergreen and receives automatic updates to stay on the latest and most secure platform.

Evergreen Bootstrapper – a tiny installer that downloads the Evergreen Runtime matching device architecture and installs it locally.
Evergreen Standalone Installer – a full-blown installer that can install the Evergreen Runtime in offline environment.
Fixed Version – to select and package a specific version of the WebView2 Runtime with your application.

The WebView2 Runtime is a redistributable runtime and serves as the backing web platform for WebView2 apps.

Download the runtime from here. Supported platforms are mentioned here.

Sample Application

I built a sample WPF application (runs on .NET Framework and not Core) to try WebView2. This was to evaluate how comparatively older .NET applications would work out.

I tried to display my blog in the WPF application using a WebView2 control. Sample application had capabilities to post message to host application and back as well as hook events as per need.

public MainWindow()
{
    InitializeComponent();

    // NavigationEvents
    webView.NavigationStarting += WebView_NavigationStarting; ;
    webView.SourceChanged += WebView_SourceChanged;
    webView.ContentLoading += WebView_ContentLoading;
    webView.NavigationCompleted += WebView_NavigationCompleted;

    // Embedded at CoreWebView2 level
    InitializeOnceCoreWebView2Intialized();
}

/// <summary>
/// initialization of CoreWebView2 is asynchronous.
/// </summary>
async private void InitializeOnceCoreWebView2Intialized()
{
    await webView.EnsureCoreWebView2Async(null);

    // Hook other events
    webView.CoreWebView2.FrameNavigationStarting += CoreWebView2_FrameNavigationStarting;
    webView.CoreWebView2.HistoryChanged += CoreWebView2_HistoryChanged;

    // For communication host to webview & vice versa
    webView.CoreWebView2.WebMessageReceived += CoreWebView2_WebMessageReceived;
    await webView.CoreWebView2.AddScriptToExecuteOnDocumentCreatedAsync("window.chrome.webview.postMessage(window.document.URL);");
    await webView.CoreWebView2.AddScriptToExecuteOnDocumentCreatedAsync("window.chrome.webview.addEventListener(\'message\', event => alert(\'Message from App to WebView2 on navigation!\'));");
}

/// <summary>
/// Web content in a WebView2 control may post a message to the host 
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
private void CoreWebView2_WebMessageReceived(object sender, CoreWebView2WebMessageReceivedEventArgs e)
{
    // Retrieve message from Webview2
    String uri = e.TryGetWebMessageAsString();
    addressBar.Text = uri;

    // Send message to Webview2
    webView.CoreWebView2.PostWebMessageAsString(uri);
    log.Content = $"Address bar updated ({uri}) based on WebView2 message!";
}

/// <summary>
/// Execute URL
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
private void ButtonGo_Click(object sender, RoutedEventArgs e)
{
    try
    {
        Uri uri = new Uri(addressBar.Text);

        if (webView != null && webView.CoreWebView2 != null)
        {
            webView.CoreWebView2.Navigate(uri.OriginalString);
        }
    }
    catch (UriFormatException)
    {
        MessageBox.Show("Please enter correct format of url!");
    }
}

/// <summary>
/// Allow only HTTPS calls
/// WebView2 starts to navigate and the navigation results in a network request. 
/// The host may disallow the request during the event.
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
private void WebView_NavigationStarting(object sender, CoreWebView2NavigationStartingEventArgs e)
{
    String uri = e.Uri;
    if (!uri.StartsWith("https://"))
    {
        e.Cancel = true;
        //MessageBox.Show("Only HTTPS allowed!");

        // Inject JavaScript code into WebView2 controls at runtime
        webView.CoreWebView2.ExecuteScriptAsync($"alert('{uri} is not safe, try an https link please.')");
    }
}

Various events run when specific asynchronous actions occur to the content displayed in a WebView2 instance.

`NavigationStarting`	WebView2 starts to navigate and the navigation results in a network request. The host may disallow the request during the event.
`SourceChanged`	The source of WebView2 changes to a new URL. The event may result from a navigation action that does not cause a network request such as a fragment navigation.
`ContentLoading`	WebView starts loading content for the new page.
`HistoryChanged`	The navigation causes the history of WebView2 to update.
`NavigationCompleted`	WebView2 completes loading content on the new page.
`ProcessFailed`	To react to crashes and hangs in the browser and renderer processes
`Close`	To safely shut down associated browser and renderer processes

Key events

Working with the sample application, I was able to display a webpage, intercept calls both ways and embed message/code to my need. It provides all the capabilities that seems to be needed for a stable web app display control.

Complete sample application can be downloaded from here: https://github.com/sandeep-mewara/WebView2WpfBrowserApp

Reference

https://developer.microsoft.com/en-us/microsoft-edge/webview2/
https://docs.microsoft.com/en-us/microsoft-edge/webview2/gettingstarted/wpf
https://docs.microsoft.com/en-us/microsoft-edge/webview2/concepts/distribution

Sandeep Mewara Github
News Update
Tech Explore
Data Explore
samples GitHub Profile Readme
Learn Machine Learning with Examples
Machine Learning workflow
What is Data Science
Word Ladder solution
What is Dynamic Programming

Find missing number from 1 to N?

November 8, 2020November 17, 2020Sandeep Mewara Leave a comment

Last week, there was a discussion in my team on the problem of finding missing number(s). We had different thoughts and approaches and thus I thought to share it across.

Problem statement was something like:

– An array of size (n) has numbers from 1 to (n+1). Find the missing one number.
– An array of size (n) has numbers from 1 to (n+2). Find the missing two numbers.

First thought …

Keep track of numbers found while traversing. At the end, use it to find the missing number. So kind of brute force approach.

We can maintain a hash or a boolean array of n size and keep on updating the hash or the array index location based on number found while traversing. Use it now to find the missing number. It would cover both one as well as two missing numbers case.

This would have two traversals of n (one for filling in the structure and another to find the missing one). Thus overall, time complexity of O(n). This would need an extra space to keep track of all numbers found and thus a space complexity of O(n).

Q: Now, can we avoid extra space or two times traversal?

Second thought …

We know how to calculate the sum of n natural numbers, i.e.: n*(n+1)/2. With it, we can traverse the given array and keep a sum of all numbers. Difference of the sum from formula to sum found would give us the missing number. Nice!

# Keep track of sum
def sumOfGivenNumbers(nos, n):
    sum = 0
    # calculate sum
    for i in range(0, n):
        sum += nos[i]
    return sum

# Input
numbers = [4, 2, 1, 6, 5, 7] 

# number range 
n = len(numbers) + 1
expectedSum = n*(n+1)/2
numbersSum = sumOfGivenNumbers(numbers,len(numbers))

print('Missing number:', expectedSum - numbersSum)

# Output
# Missing number: 3.0

This would help is solve one missing number in single traversal, thus time complexity of O(n). No extra space was used and thus space complexity of O(1).

Q: Can we extend this to two missing numbers now?

Yes, we can extend it. Along with sum, we can also use the product of n natural number as an expression. With it, we will have two equations and two numbers to find:

Missing1 = x1
Missing2 = x2
Sum of provided numbers = N1
Sum of n Natural numbers = N
Product of provided numbers = P1
Product of n Natural numbers = P

x1 + x2 + N1 = N
x1 * x2 * P1 = P

We can solve it to find the two missing numbers. It does have the quadratic flavor associated though. It maintains the time complexity as O(n) and space complexity as O(1). Nice!

Q: Does the solution help with large integers? Think of possible overflow?

Third thought …

Let’s look at possible way for 1 missing number first.

We will traverse through all the numbers of the array. While doing so, maintain a number that would be sum of all numbers traversed so far reduced by sum of all the indexes traversed (+1 if index starts from 0). It is still making use of n natural numbers (in form of indexes) to keep a check on sum to a defined limit.

# Keep track of sum
def getMissingNumber(nos, n):
    sum = 0
    # calculate sum
    for i in range(0, n):
        sum += (i+1)
        sum -= nos[i]

    # last number to add from n+1 natural nos.
    return sum+n+1

# Input
numbers = [4, 2, 1, 6, 5, 7] 

missingNumber =getMissingNumber(numbers,len(numbers))

print('Missing number:', missingNumber)

# Output
# Missing number: 3.0

This looks good and we maintain the same complexities along with solving for overflow.

We can probably try a similar thing for two missing numbers where we keep on multiple and divide the traversed number by index but it still could have overflow issues in worst case. Further, there could be round off issues.

Fourth thought …

Looking more, it seems we can make use of XOR operation to find the missing numbers. We can make use of XOR’s property to nullify the duplicate pair. We will take XOR of provided numbers and XOR of natural numbers. Combining both again with XOR will leave with missing numbers XOR output.

For one missing number, this would be easy and covers all the hurdles discussed earlier keeping same performance.

# Keep track of XOR data
def getMissingNumber(nos, n):
    x1 = nos[0]
    xn = 1

    # start from second
    for i in range(1, n):
        x1 = x1 ^ nos[i]
        xn = xn ^ (i+1)
    
    # last number to XOR
    xn = xn ^ (n+1)

    # find the missing number
    return x1 ^ xn

# Input
numbers = [4, 2, 1, 6, 5, 7] 

missingNumber =getMissingNumber(numbers,len(numbers))

print('Missing number:', missingNumber)

# Output
# Missing number: 3.0

For two missing numbers, using a similar logic of XOR above, we will have an output of XOR value of both missing numbers. Now, given the XOR value will not be zero, the XOR corresponding valid bit in missing1 and missing2 must be different to make it “1”.

# Keep track of XOR data
def getTwoMissingNumber(nos, n):
    x1 = nos[0]
    xn = 1

    # start from second
    for i in range(1, n-2):
        x1 = x1 ^ nos[i]
        xn = xn ^ (i+1)
    
    # last numbers to XOR
    xn = xn ^ (n-1) ^ (n)

    # XOR of two missing numbers
    # Any set bit in it must be 
    # set in one missing and 
    # unset in other missing number 
    XOR = x1 ^ xn

    # Get a rightmost set bit of XOR  
    set_bit_no = XOR & ~(XOR-1) 
  
    # Divide elements in two sets 
    # by comparing rightmost set bit of XOR 
    # with bit at same position in each element. 
    x = 0
    y = 0 
    for i in range(0,n-2): 
        if nos[i] & set_bit_no:    
            # XOR of first set in nos[]  
            x = x ^ nos[i]   
        else: 
            # XOR of second set in nos[]  
            y = y ^ nos[i]   

    for i in range(1,n+1): 
        if i & set_bit_no: 
            # XOR of first set in nos[]  
            x = x ^ i        
        else: 
            # XOR of second set in nos[]  
            y = y ^ i
    
    print ("Missing Numbers: %d %d"%(x,y)) 
    return

# Input
numbers = [4, 2, 1, 6, 7, 5] 

# total length will be provided count+2 missing ones
getTwoMissingNumber(numbers, len(numbers) + 2)

# Output
# Missing Numbers: 3 8

This overcomes the overflow issue and was easier to solve (compared to solving a quadratic equation). Though it took more than one traversal, overall it maintains the time complexity as O(n) and space complexity as O(1). Nice!

Closure …

There could be multiple ways to solve for one or more missing numbers. One can look at it based on ease and need.

Keep solving!

How to solve Word Ladder Problem?

October 18, 2020November 22, 2020Sandeep Mewara Leave a comment

Sometime back, a colleague of mine asked me about the word ladder problem. She was looking for a change. So, I believe she stumbled across this while preparing for data structures and algorithms.

Problem Statement

Typically, the puzzle shared is a flavor of below:

Find the smallest number of transformations needed to change an initial word to a target word of same length. In every transformation, change only one character and make sure word exists in the given dictionary.

Explanation

Assuming all these 4 letter words are there in the dictionary provided, it takes minimum 4 transitions to convert word from SAIL to RUIN, i.e.
SAIL -> MAIL -> MAIN -> RAIN -> RUIN

Intent here is to know about Graph algorithm. So, what are graphs in context of algorithms and how do we apply them to solve such problems?

Graph Data Structure

Graphs are flow structure that represents entities connection with each other. Visually, they are represented with help of a Node (Vertex) & an Edge (Connector).

A tree is an undirected graph in which any two nodes are connected by only one path. In it, each node (except the root node) comprises exactly one parent node.

Most common way to represent a graph is using an Adjacency matrix. In it, Element A[i][j] is 1 if there is an edge from node i to node j or else it is 0. For example, adjacency matrix of above unidirected graph is:

  | 1 2 3 4
------------
1 | 0 1 0 1
2 | 1 0 1 0
3 | 0 1 0 1
4 | 1 0 1 0

Another common way is via Adjacency list. (List format of the data instead of a matrix.)

Related Algorithms

Graphs are applied in search algorithms. Traversing the nodes and edges in a defined order helps in optimizing search. There are two specific approaches to traverse graph:

Breadth First Search (BFS)

Given a graph G and a starting node s, search proceeds by exploring edges in the graph to find all the nodes in G for which there is a path from s. With this approach, it finds all the nodes that are at a distance k from s before it finds any nodes that are at a distance k+1.

For easy visualization, think of it as, in a tree, finding all the child nodes for a parent node as first step. Post it, find all the grandchildren and hence forth.

Depth First Search (DFS)

Given a graph G and a starting node s, search proceeds by exploring edges in the graph to find all the nodes in G traversed from s through it’s edges. With this approach, we go deep in graph connecting as many nodes in the graph as possible and branch where necessary.

For easy visualization, think of it as, in a tree, finding all the family nodes for a parent node. With this, for a given node, we connect its children, grand children, grand grand children and so on before moving to next node of same level.

Thus, with DFS approach, we can have multiple deduced trees.

Knight’s tour is a classic example that leverages Depth First Search algorithm.

Shortest Path First OR Dijkstra’s Algorithm (SPF)

Given a graph G and a starting node s, search the shortest path to reach node d. It uses a concept of weights. It’s an iterative algorithm similar to results of BFS.

Many real world example fits in here, e.g. what would be shortest path from home to office.

With BFS (a simple queue), we visit one node at a time whereas in SPF (a priority queue), we visit a node at any level with lowest cost. In a sense, BFS follows Dijkstra's algorithm, a step at a time with all edge weights equal to 1. The process for exploring the graph is structurally the same in both cases. at times, BFS is preferred with equal weight graphs. This is because, operations on a priority queue are O(log n) compared to operations on a regular queue which is O(1).

Code

I will be using a breadth first graph algorithm here based on the problem need:

import collections
from collections import deque 

class Solution(object):
    # method that will help find the path
    def ladderLength(self, beginWord, 
                        endWord, wordList):
        """
        :type beginWord: str
        :type endWord: str
        :type wordList: Set[str]
        :returntype: int
        """

        # Queue for BFS
        queue = deque()

        # start by adding begin word
        queue.append((beginWord, [beginWord]))

        while queue:
            # let's keep a watch at active queue
            print('Current queue:',queue)

            # get the current node and 
            # path how it came
            node, path = queue.popleft()

            # let's keep track of path length 
            # traversed so far
            print('Current transformation count:',
                                        len(path))

            # find possible next set of 
            # child nodes, 1 diff
            for next in self.next_nodes(node, 
                            wordList) - set(path):
                # traversing through all child nodes
                # if any of the child matches, 
                # we are good               
                if next == endWord:
                    print('found endword at path:',
                                            path)
                    return len(path)
                else:
                    # keep record of next 
                    # possible paths
                    queue.append((next, 
                                path + [next]))
        return 0

    def next_nodes(self, word, word_list):
        # start with empty collection
        possiblenodes = set()

        # all the words are of fixed length
        wl_word_length = len(word)

        # loop through all the words in 
        # the word list
        for wl_word in word_list:
            mismatch_count = 0

            # find all the words that are 
            # only a letter different from 
            # current word those are the 
            # possible next child nodes
            for i in range(wl_word_length):
                if wl_word[i] != word[i]:
                    mismatch_count += 1
            if mismatch_count == 1:
                # only one alphabet different-yes
                possiblenodes.add(wl_word)
        
        # lets see the set of next possible nodes 
        print('possible next nodes:',possiblenodes)
        return possiblenodes

# Setup
beginWord = "SAIL"
endWord = "RUIN"
wordList = ["SAIL","RAIN","REST","BAIL","MAIL",
                                    "MAIN","RUIN"]

# Call
print('Transformations needed: ',
    Solution().ladderLength(beginWord, 
                            endWord, wordList))

# Transformation expected == 4
# One possible shortes path with 4 transformation:
# SAIL -> MAIL -> MAIN -> RAIN -> RUIN

Used deque (doubly ended queue) of Python

deque helps with quicker append and pop operations from both the ends. It has O(1) time complexity for append and pop operations. In comparison, list provides it in O(n) time complexity.

A quick look at the code workflow to validate if all nodes at a particular distance was traversed first and then moved to next level:

Current queue: deque([('SAIL', ['SAIL'])])

Current transformation count: 1
possible next nodes: {'BAIL', 'MAIL'}
Current queue: deque([('BAIL', ['SAIL', 'BAIL']), 
                      ('MAIL', ['SAIL', 'MAIL'])])

Current transformation count: 2
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIL', ['SAIL', 'MAIL']), 
                      ('MAIL', ['SAIL', 'BAIL', 
                       'MAIL'])])

Current transformation count: 2
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('MAIL', ['SAIL', 'BAIL', 
                                'MAIL']), 
                      ('BAIL', ['SAIL', 'MAIL', 
                                'BAIL']), 
                      ('MAIN', ['SAIL', 'MAIL', 
                                'MAIN'])])

Current transformation count: 3
possible next nodes: {'BAIL', 'MAIN', 'SAIL'}
Current queue: deque([('BAIL', ['SAIL', 'MAIL', 
                                'BAIL']), 
                      ('MAIN', ['SAIL', 'MAIL', 
                                'MAIN']), 
                      ('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN'])])

Current transformation count: 3
possible next nodes: {'SAIL', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'MAIL', 
                                'MAIN']), 
                      ('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN'])])

Current transformation count: 3
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('MAIN', ['SAIL', 'BAIL', 
                                'MAIL', 'MAIN']), 
                      ('RAIN', ['SAIL', 'MAIL', 
                                'MAIN', 'RAIN'])])

Current transformation count: 4
possible next nodes: {'RAIN', 'MAIL'}
Current queue: deque([('RAIN', ['SAIL', 'MAIL', 
                                'MAIN', 'RAIN']), 
                      ('RAIN', ['SAIL', 'BAIL', 
                        'MAIL', 'MAIN', 'RAIN'])])

Current transformation count: 4
possible next nodes: {'MAIN', 'RUIN'}
found endword at path: ['SAIL', 'MAIL', 'MAIN', 
                                        'RAIN']

Transformations needed:  4
Overall path: ['SAIL', 'MAIL', 'MAIN', 
                               'RAIN', 'RUIN']

Complexity

For above code that I used to find the shortest path for transformation:

Time

In next_nodes, for each word in the word list, we iterated over its length to find all the intermediate words corresponding to it. Thus we did M×N iterations, where M is the length of each word and N is the total number of words in the input word list. Further, to form an intermediate word, it takes O(M) time. This adds up to O(M²×N).

In ladderLength, BFS can go to each of the N words and for each word, we need to examine M possible intermediate words. This adds up to O(M²×N).

Overall, it adds up to O2(M²×N) which would be called O(M²×N).

Space

In next_nodes, each word in the word list would have M intermediate combinations. For every word we need a space of M² to save all the transformations corresponding to it. Thus, it would need a total space of O(M²×N).

In ladderLength, BFS queue would need a space of O(M×N)

Overall, it adds up to O(M²×N) + O(M×N) which would be called O(M²×N)

Wrap Up

It could be little tricky and thus would need some practice to visualize the graph as well to write code for it.

Great, so now we know how to solve problems like word ladder problem. It also touch based other related common graph algorithms that we can refer to.

I had a read of the following reference and it has much more details if needed.

Keep problem solving!

samples GitHub Profile Readme

Sandeep Mewara Github
Sandeep Mewara Learn By Insight
Matplotlib plot samples
Sandeep Mewara Github Repositories

Just reverse alphabets in a string?

October 11, 2020October 16, 2020Sandeep Mewara Leave a comment

Last week it was a simple problem that we discussed to tangle our brain.

Reverse only alphabets in a provided string in a most efficient way. (Special characters, numbers, etc should continue to stay at their original place)

First thought …

Okay, so first thought was to make use of an extra space and then we can get our desired result with two traversals of the string. Yeah, but let’s optimize.

Updated thought …

We can make the swaps in place in a single traversal. A good stopping criteria seems to be when index while moving from front crosses the index moving backwards from end.

Let’s write code:

static void Main(string[] args)
{
    Console.WriteLine($"Please enter a string:");

    // Ignore casing
    var inputString = Console.ReadLine().ToLower();
    char[] inputArray = inputString.ToCharArray();
    ReverseAlphabetsOnly(inputArray);        
    Console.WriteLine(
            $"Reversed: {new String(inputArray)}");
}

static void ReverseAlphabetsOnly(char[] inputArray)
{
    int frontIndex = 0;
    int endIndex = inputArray.Length-1;
    char temp;

    while(frontIndex < endIndex)
    {
        if(!IsAlphabet(inputArray[frontIndex]))
            frontIndex++;
        else if(!IsAlphabet(inputArray[endIndex]))
            endIndex--;    
        else
        {
            temp = inputArray[frontIndex];
            inputArray[frontIndex] 
                 = inputArray[endIndex];
            inputArray[endIndex] = temp;

            frontIndex++;
            endIndex--;
        }
    }
}

static bool IsAlphabet(char x) 
{ 
    return ( (x >= 'a' && x <= 'z') 
            || (x >= 'A' && x <= 'Z') ); 
}

// Input:  Le@rn By In$ig#t...
// Output: tg@in Iy Bn$re#L...

Closure …

Approach looks good, as it would be at maximum a single traversal with no extra space used. Thus we are able to solve it with an overall Order of Time complexity O(n) & Space complexity O(1).

Happy solving …

samples GitHub Profile Readme

Sandeep Mewara Github
Sandeep Mewara Learn By Insight
Matplotlib plot samples
Sandeep Mewara Github Repositories

The Cost of Rediscovery

AI Has Context. It Doesn’t Have Structure.

The Economics of Reconstructing Knowledge

The Shift I Think We’re Entering

Why We Built Infigraph

The Byproducts of Structural Awareness

Why We Open-Sourced It

What’s Next

Guardrails: More Than Just a Safety Feature

Three Questions Every Architect Should Ask

1. Can it do this? (Capability & Access)

2. Should it do this? (Policy & Context)

3. What if it goes wrong? (Resiliency & Recovery)

A Practical Framework for Control

Where Guardrails Actually Live

Hard-Earned Realities of Scaling

1. The Trap of Human-in-the-Loop (HITL)

2. The Latency Tax

3. Policy-as-Code vs. Prompt Engineering

4. Guardrails Break Silently

Two Often Overlooked Risks

1. Economic Guardrails

2. Memory & State Management

The Strategic Bottom Line

Final Thought

From Manual Assistance to Actual Leverage

The Missing Piece: Contextual Onboarding

A Practical Starting Point: The claude.md

Expanding the Framework: Skills

A Mentor in Your Pocket: Codex-Claude

Watchouts

The Architect’s Path Forward

The Problem: Skills Are Portable, Process Is Not

Step 1: Jump-Start Your Work with AI Skills

Step 2: The Orchestrator as the Director

Core Internals

The Nine-Phase Engine

Project-Specific Infrastructure

Preview, Then Execute

Acceptance Criteria, Tracked Across Phases

Multi-Language, Without a Fork Per Language

Role vs. Workflow Split

When to Use This and When Not

What Changes When You Adopt This?

Closing Thought

Repository & Contribution

The Architectural Blueprint: The SKILL.md

Anatomy of an Engineering Contract

Directory Structure & Progressive Disclosure

Architecting the Automated SDLC

Conceptual Role-Based Skills: Defining the Contract for a Persona (Planning & Setup)

External Workflow Execution Skills: Defining the Contract for the Workflow to ‘Do’

Internal Agent Operational Skills: Defining the Contract for the Software to ‘Be’

The Boundary of Autonomy and the Expertise Gap

Offloading Heuristics vs. Offloading Wisdom

Assisting Expertise, Not Replacing It

Engineering Best Practices for SKILL.md Mastery

Final Thought: A Standard for Scaling Autonomy

From Prompts to Agents: What’s the Difference?

The “Agentic” Starter Pack

My Learning Project: Endpoint Watch Agent (EWA)

What I Learnt: The “Pro” Framework

What I Learnt: Behavioral System Design

What I Learnt: The Operational Reality

What I Learnt: The Shift to “Specification of Judgment”

Final Thoughts: The Evolution of How We Build

Power of Native

Distribution

Sample Application

Reference

First thought …

Second thought …

Third thought …

Fourth thought …

Closure …

Problem Statement

Explanation

Graph Data Structure

Related Algorithms

Breadth First Search (BFS)

A Practical Starting Point: The `claude.md`