Skip to main content
News May 08, 2026 7 min read 348 views

CyberSecQwen-4B: The Local AI Cybersecurity Model That Beats Cisco's 8B Model (2026 Guide)

Eric Samuels - AI Herald Author Avatar
Eric Samuels Updated: Jun 11, 2026
CyberSecQwen-4B local AI cybersecurity cybersecurity LLM small language model security air-gapped AI SOC AI model defensive cybersecurity AI CTI-Bench HuggingFace cybersecurity open source security AI Qwen cybersecurity local LLM security CVE CWE mapping AI private AI security AMD MI300X AI
CyberSecQwen-4B: The Local AI Cybersecurity Model That Beats Cisco's 8B Model (2026 Guide)
CyberSecQwen-4B is a 4B open source cybersecurity AI model that outperforms Cisco Foundation-Sec-8B on CTI-MCQ benchmarks. Runs locally, air-gap ready

CyberSecQwen-4B is a 4-billion-parameter open source cybersecurity AI model fine-tuned from Qwen3-4B-Instruct that runs entirely on local hardware — no cloud API required. Built at the AMD Developer Hackathon on a single AMD Instinct MI300X GPU, it outperforms Cisco's Foundation-Sec-8B (an 8B model) on CTI-MCQ benchmarks while running at half the parameter count. Here is everything security teams, SOC analysts, and AI developers need to know.

What Is CyberSecQwen-4B?

CyberSecQwen-4B is a defensive cybersecurity language model created by Samuel Mulia and published on HuggingFace under an Apache 2.0 license. It is fine-tuned from Qwen3-4B-Instruct-2507 — the highest-performing 4B instruction-tuned model available at training time — using LoRA (Low-Rank Adaptation) on a curated dataset of CVE-to-CWE mappings derived from public MITRE and NVD records, plus synthetic CVE/CTI question-answer pairs.

The entire training pipeline — corpus assembly, LoRA fine-tuning, adapter merging, and CTI-Bench evaluation — was completed on a single AMD Instinct MI300X 192GB GPU instance. This is significant: it proves that a production-grade, specialized cybersecurity AI model can be trained without a multi-GPU cluster or cloud spend in the hundreds of thousands of dollars.

CyberSecQwen-4B Benchmark Results: How It Compares

The model was evaluated under the published protocol for Cisco's Foundation-Sec-Instruct-8B (arXiv:2504.21039) on the CTI-Bench evaluation suite. Results are means of 5 independent trials at temperature 0.3:

ModelParametersCTI-MCQ ScoreCTI-RCM ScoreRuns Locally
CyberSecQwen-4B4B0.58680.6664✅ Yes
Cisco Foundation-Sec-8B8B0.49960.6850⚠️ Requires more VRAM
Gemma4Defense-2B (sister)2B~0.577~0.657✅ Yes

CyberSecQwen-4B scores +8.7 percentage points higher than Cisco Foundation-Sec-8B on CTI-MCQ (multiple choice cyber threat intelligence) while retaining 97.3% of its CTI-RCM accuracy. A 4B model beating an 8B model from a major vendor on the same benchmark is a meaningful result — it suggests the fine-tuning recipe matters more than raw parameter count for narrow cybersecurity tasks.

Why Local AI Models Matter for Defensive Cybersecurity

Enterprise security teams face a fundamental problem: the most sensitive data — incident write-ups, internal log excerpts, attacker infrastructure reports, memory dumps, vulnerability disclosure drafts, and reverse-engineering notes — is exactly the data they cannot send to a third-party cloud API.

HIPAA, SOC 2, GDPR, and FedRAMP regulations frequently prohibit sending this data outside the organization's controlled infrastructure. A local cybersecurity AI model eliminates this constraint entirely. CyberSecQwen-4B is designed specifically for:

  • Air-gapped deployments — classified environments, government SOCs, defense contractors, and critical infrastructure where the network is the threat boundary
  • Private security operations centers (SOCs) — where log data, SIEM alerts, and threat intelligence cannot leave the perimeter
  • Cost-sensitive teams — no per-call API fees; only electricity and inference hardware costs
  • High-throughput screening — processing thousands of alerts per minute without API rate limits or latency

What CyberSecQwen-4B Is Designed For

The model is explicitly a defensive cybersecurity specialist. Its training was designed for narrow utility, not breadth. Specific use cases it handles well:

  • CVE-to-CWE classification and mapping
  • Cyber Threat Intelligence (CTI) question answering
  • Security incident summarization from log data
  • Malware pattern identification and categorization
  • Threat actor technique identification (MITRE ATT&CK mapping)
  • Vulnerability triage and prioritization

It is explicitly not designed for: generating exploit code or weaponized proof-of-concept scripts, auto-executing security decisions without human review, legal or medical advice contexts, or general-purpose code generation outside cybersecurity domains.

Technical Specifications and How to Run It

CyberSecQwen-4B is available on HuggingFace at athena129/CyberSecQwen-4B. Key technical details:

  • Base model: Qwen3-4B-Instruct-2507 (Apache 2.0)
  • Fine-tuning method: LoRA on AMD Instinct MI300X 192GB
  • Training data: 2021 CVE-to-CWE mappings from MITRE/NVD + synthetic CTI Q&A pairs
  • License: Apache 2.0 (model weights and code)
  • Step time: ~7.85 seconds/step on MI300X with FlashAttention-2
  • Evaluation: CTI-Bench, 5-trial means at temperature 0.3

To run locally with Ollama:

# Pull the model from HuggingFace
huggingface-cli download athena129/CyberSecQwen-4B

# Run with transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("athena129/CyberSecQwen-4B")
tokenizer = AutoTokenizer.from_pretrained("athena129/CyberSecQwen-4B")

A quantized GGUF release (Q4_K_M, Q5_K_M) is planned — approximately 2.5GB at Q4_K_M, which would make the model runnable on ARM laptops and edge devices without a dedicated GPU.

CyberSecQwen-4B vs Larger Cloud Models: Real Tradeoffs

For SOC teams evaluating whether to deploy CyberSecQwen-4B vs. using cloud APIs like GPT-4o or Claude, the tradeoffs are practical:

FactorCyberSecQwen-4B (Local)GPT-4o / Claude (Cloud)
Data privacy✅ Full — data never leaves network❌ Data sent to third-party
Air-gap compatible✅ Yes❌ No
Per-query cost✅ ~$0 (electricity only)❌ $0.01–$0.06 per query
Cybersecurity accuracy✅ Better on CTI-MCQ than 8B models✅ Strong but general-purpose
Throughput✅ No rate limits❌ Rate limited
Setup complexity⚠️ Requires GPU hardware✅ API key only
Model updates⚠️ Manual✅ Automatic

The recommended approach for most enterprise security teams is hybrid: use CyberSecQwen-4B for high-volume, sensitive triage work (log analysis, alert classification, CVE mapping) and reserve cloud APIs for complex, non-sensitive investigative tasks. This reduces cloud AI spend by 60-80% on typical SOC workloads.

The Broader Trend: Small Specialized LLMs for Security

CyberSecQwen-4B is part of a growing movement in 2026 away from one-model-for-everything and toward specialized, locally-runnable models for professional domains. The pattern is consistent: a well fine-tuned 4B model on a narrow domain regularly outperforms general-purpose 13B+ models on that domain's benchmarks.

In cybersecurity specifically, local inference solves a problem cloud models cannot: the data that is most valuable for AI analysis is exactly the data security teams are least able to share externally. Attack telemetry, zero-day research, classified threat intelligence, and customer incident data all fall into this category.

The sister model Gemma4Defense-2B — trained on the exact same corpus with only the base model swapped to Google's Gemma-4-E2B — achieves within 0.9 CTI-RCM points of CyberSecQwen-4B. This confirms the result is recipe-driven, not architecture-specific. The fine-tuning methodology travels across model families.

Roadmap: What's Coming for CyberSecQwen

The team has published a public roadmap with the following priorities:

  1. 1B variant — targeting Qwen2.5-1.5B or Llama-3.2-1B as base, aiming for ≥0.55 CTI-RCM (within 6 percentage points of the 4B). Designed for laptop-class deployment without a GPU.
  2. Quantized GGUF release — Q4_K_M and Q5_K_M formats so the model runs on phones and edge boxes at approximately 2.5GB
  3. Continual evaluation — tracking new CVE-to-CWE mappings as NVD publishes them, beyond the 2021 training cohort
  4. Adversarial resilience testing — red-teaming the model against prompt injection and adversarial examples

What CISOs and Security Leaders Should Do Now

For Chief Information Security Officers evaluating local AI for their security operations:

  1. Download and test CyberSecQwen-4B on your internal CTI dataset — the HuggingFace weights are freely available under Apache 2.0
  2. Benchmark against your current workflow — run it against your existing SIEM alert classification pipeline and measure F1 against analyst ground truth
  3. Run a LoRA fine-tune on your own data — the training recipe is published and runs on a single GPU. Fine-tuning on your organization's historical incident data will dramatically improve accuracy on your specific threat landscape
  4. Start with CVE triage — this is where the model performs best and where cloud API costs accumulate fastest
  5. Plan the hybrid architecture — local model for volume triage, cloud API for complex investigations

In an era where compute costs are rising, data privacy regulations are tightening, and attack surfaces are expanding faster than security teams can hire, a 4B model that runs locally, stays private, and outperforms a larger vendor model on the benchmarks that matter may be the most practical AI investment a security team can make in 2026.

Related: How a Small AI Tool from Pakistan’s Hackathon Is Redefining Local Safety Apps

Sources: HuggingFace Blog — CyberSecQwen-4B · GitHub Repository · Model Weights on HuggingFace. This article was produced with AI assistance and reviewed for accuracy. Editorial standards.

Avatar photo of Eric Samuels, contributing writer at AI Herald

About Eric Samuels

Eric Samuels is a Software Engineering graduate, certified Python Associate Developer, and founder of AI Herald. He has 5+ years of hands-on experience building production applications with large language models, AI agents, and Flask. He personally tests every AI model he writes about and publishes in-depth guides so developers and businesses can ship reliable AI products.

Related articles