AI / LLM Security Testing
Security testing for production AI / LLM systems — prompt injection, jailbreaks, data exfiltration via context windows, model supply-chain risks, RAG pipeline poisoning, agentic tool-call abuse and ML training-data integrity. Mapped to OWASP LLM Top 10 and MITRE ATLAS, with deliverables your AI safety team and your CISO both accept.
- Quote SLA: 48 hours
- Typical engagement: 5–15 working days
- Retest: Free within 30 days
- Reporting format: CERT-In + ISO + SOC 2 ready
- Team: 100% in-house · OSCP / OSWE / OSEP
An AI Pentest engagement, in plain language.
AI security is not a generic pentest with the word 'AI' added. A Macksofy AI engagement tests the live model behind your chatbot, the RAG pipeline that retrieves context, the tools your agent can invoke, the training data your fine-tune ingests, and the supply chain (Hugging Face weights, embedding models, vector DBs) underneath. We chain findings into full attack paths: prompt injection → tool-call abuse → data exfiltration. We test for membership inference, model inversion, and PII leakage from training corpora. And we ship rules (guardrail prompts, output validators, sandboxing patterns) that your platform team can deploy on Monday.
- De-risk customer-facing LLM products before regulator or media exposure
- Satisfy emerging AI-governance frameworks: EU AI Act, India DPDP Act AI-system controls, NIST AI RMF, ISO/IEC 42001
- Catch RAG-pipeline data leakage before it becomes a customer-data incident
- Validate agentic systems (function-calling, tool-use, MCP) for unintended actions
Phased delivery — every step documented.
How we run an AI Pentest engagement, phase by phase.
1 · Threat model & scope
- Architecture review: model, RAG, agent tools, fine-tune pipeline, deployment surface
- Data-flow review: training data, embeddings, vector store, output channels
- Threat model aligned to OWASP LLM Top 10 + MITRE ATLAS
Industry-standard + custom.
We use the same tooling top BFSI red teams operate — combined with Macksofy in-house extensions and proprietary scripts where commercial tools fall short.
What you get
- OWASP LLM Top 10 + MITRE ATLAS findings inventory
- Reproducer prompts for every finding (copy-pasteable)
- Recommended guardrail prompts + output-validator rules
- RAG pipeline + vector-store hardening checklist
- Agent tool-use sandboxing patterns
- Training-data integrity + supply-chain risk register
- Free retest within 30 days of guardrail deployment
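To make the "output-validator rules" deliverable concrete, here is a minimal sketch of what such a rule could look like. It assumes a simple regex-based validator that sits between the model and the user; the rule names, patterns, and the `validate_output` function are hypothetical illustrations, not the rule pack shipped in an engagement.

```python
import re

# Hypothetical rule set: the names and patterns are illustrative only.
RULES = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "card_like_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def validate_output(text: str) -> tuple[str, list[str]]:
    """Redact every rule match and return (safe_text, names_of_rules_that_fired)."""
    fired = []
    for name, pattern in RULES.items():
        # subn returns the rewritten text and the number of replacements made.
        text, count = pattern.subn(f"[REDACTED:{name}]", text)
        if count:
            fired.append(name)
    return text, fired


if __name__ == "__main__":
    safe, fired = validate_output(
        "Reach alice@example.com, card 4111 1111 1111 1111."
    )
    print(safe)
    print(fired)
```

A validator like this is a last line of defence: it limits what leaks when a guardrail prompt is bypassed, and the list of fired rules gives the platform team a signal to alert on.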
Anonymized engagement snapshots.
Scope · Customer-facing GPT-4o-powered support agent + RAG over support docs
Finding: Indirect prompt injection via a poisoned support article let an attacker exfiltrate other tenants' chat history via the agent's retrieval tool
Critical — patched via guardrail prompt + retrieval-tenancy enforcement; cross-tenant leakage closed before launch
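As a rough illustration of what retrieval-tenancy enforcement can mean in practice, the sketch below filters retrievable chunks by tenant before any ranking happens. It assumes an in-memory store and a naive keyword score; `Chunk`, `retrieve`, and `session_tenant_id` are hypothetical names, and this is not the client's actual fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str


def retrieve(store: list[Chunk], query: str, session_tenant_id: str, k: int = 3) -> list[Chunk]:
    """Return up to k matching chunks, restricted to the caller's tenant.

    session_tenant_id must come from the authenticated session, never from
    model output or retrieved text, so injected instructions cannot widen scope.
    """
    terms = query.lower().split()
    # Tenancy filter is applied first, so other tenants' data is never scored.
    allowed = [c for c in store if c.tenant_id == session_tenant_id]
    scored = [(sum(t in c.text.lower() for t in terms), c) for c in allowed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored if score > 0][:k]


if __name__ == "__main__":
    store = [
        Chunk("tenant-a", "Refund policy for tenant A customers"),
        Chunk("tenant-b", "Refund chat history for tenant B customers"),
    ]
    print(retrieve(store, "refund history", session_tenant_id="tenant-a"))
```

The design point is that the tenant boundary lives in the retrieval code path rather than in the prompt, so a poisoned article can change what the model asks for but not what the retriever is willing to return.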
Scope · Patient-symptom triage LLM with EMR tool-call access
Finding: Tool-call confused-deputy: a craftily-phrased patient prompt caused the agent to retrieve another patient's record via the EMR lookup tool
Critical: remediated before clinical rollout by enforcing authorization at the tool layer rather than the agent layer
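The confused-deputy pattern above can be sketched as follows, assuming a patient-facing session in which the lookup tool should only ever return the caller's own record. `RECORDS`, `emr_lookup`, and `AuthorizationError` are hypothetical names for illustration; a real EMR integration would call its own access-control service.

```python
class AuthorizationError(Exception):
    """Raised when a tool call asks for data the session is not entitled to."""


# Hypothetical in-memory records standing in for an EMR backend.
RECORDS = {
    "p-100": {"name": "Patient A", "allergies": ["penicillin"]},
    "p-200": {"name": "Patient B", "allergies": []},
}


def emr_lookup(requested_patient_id: str, session_patient_id: str) -> dict:
    """EMR lookup tool that checks authorization itself instead of trusting the agent.

    requested_patient_id is chosen by the model and is therefore untrusted;
    session_patient_id is bound by the application from the authenticated session.
    """
    if requested_patient_id != session_patient_id:
        raise AuthorizationError(
            f"session {session_patient_id} may not read {requested_patient_id}"
        )
    return RECORDS[requested_patient_id]


if __name__ == "__main__":
    print(emr_lookup("p-100", session_patient_id="p-100"))
    try:
        emr_lookup("p-200", session_patient_id="p-100")
    except AuthorizationError as exc:
        print("blocked:", exc)
```

Because the check runs inside the tool, a cleverly phrased prompt can still persuade the agent to request another patient's ID, but the request fails regardless of what the agent was convinced to do.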
Transparent tiers. No surprises at quote time.
Indicative price ranges based on typical Indian engagements. Final fixed-price quote within 72 hours of the discovery call.
Focused
- Manual + tooled testing
- CERT-In format report
- Free 30-day retest
Stack
- Everything in Focused
- Web + API + mobile coverage
- Executive + technical briefings
Programme
- Everything in Stack
- Quarterly cycles + post-release retests
- Same consultants throughout
Note · Indicative pricing in INR. Final quote depends on scope, asset count and engagement window. Fixed-price proposal within 72 hours.
Rated 4.9 ★ from 612 client reviews.
“We've worked with three Big 4 firms before Macksofy. None found what their team did in our payments stack. The most actionable report we've received in a decade.”
“The CHFI training Macksofy delivered for our cyber cell raised investigation quality measurably. Practical, India-context-aware, and respectful of our operational realities.”
“Came in with zero security background. 5 weeks later I was running Burp Suite and Metasploit confidently. Cleared CEH on the first attempt.”
Get a fixed-price proposal in 48 hours.
Tell us about your security need — pentest, audit, training or a wider engagement. A senior consultant will reply within a few business hours.
- CERT-In Empanelled
- EC-Council ATC · CompTIA Authorized
- 20,000+ professionals trained
- India + UAE engagements
