---
license: apache-2.0
tags:
- devsecops
- llm
- sft
- lora
- tulu-3
- kubernetes
- terraform
---
# DevSecOps Model Platform
> Train a secure model on the best data, then deploy it securely.
## Start Here: Train Your Model
| Dataset | Size (examples) | What It Gives You | Command |
|---------|-----------------|-------------------|---------|
| **tulu-3-sft-mixture** | 940K | Math, code, safety, chat (BEST) | `python model/train_tulu3.py` |
| **OpenThoughts-114k** | 114K | Reasoning, chain-of-thought | `python model/train_openthoughts.py` |
**allenai/tulu-3-sft-mixture** is the SFT data behind Allen AI's Tulu 3, a state-of-the-art open instruction-tuned model family. Reported results on Llama-3.1-8B: MMLU 53.5, GSM8K 79.9, HumanEval 76.8.

The LoRA config follows *LoRA Without Regret* (Schulman 2025): `r=256`, `alpha=16`, applied to all linear layers. That setup is reported to match full fine-tuning at 67% of the compute.
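
To make the LoRA settings concrete, here is a minimal arithmetic sketch of what `r=256`, `alpha=16` on all linear layers implies for an 8B Llama-style model. The layer shapes are assumptions taken from the published Llama-3.1-8B configuration (hidden size 4096, MLP size 14336, 32 blocks, 8 KV heads of dimension 128), and the output head is assumed to be left out of the adapter set. Nothing here is read from `train_tulu3.py`.

```python
# Rough LoRA sizing for r=256, alpha=16 on all linear layers.
# ASSUMPTION: shapes follow the published Llama-3.1-8B config; the training
# script may target a different base model or a different set of modules.

R = 256          # LoRA rank
ALPHA = 16       # LoRA alpha

HIDDEN = 4096    # model width
MLP = 14336      # feed-forward width
KV_DIM = 8 * 128 # 8 key/value heads of dimension 128
NUM_LAYERS = 32  # transformer blocks

# (input features, output features) of each linear layer in one block.
LINEAR_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(d_in, d_out, r=R):
    """LoRA adds A (r x d_in) and B (d_out x r) beside a frozen weight."""
    return r * (d_in + d_out)


per_block = sum(lora_params(d_in, d_out) for d_in, d_out in LINEAR_SHAPES.values())
total_trainable = per_block * NUM_LAYERS
scaling = ALPHA / R  # standard LoRA scaling applied to the low-rank update

print(f"LoRA scaling (alpha / r): {scaling}")
print(f"Trainable LoRA parameters: {total_trainable:,}")
```

Under these assumptions the adapters add roughly 671M trainable parameters and the low-rank update is scaled by 0.0625.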
## Repository Structure
```
model/                        THE MODEL - train, serve, enhance
  train_tulu3.py              Primary: 940K best data (zero preprocessing)
  train_openthoughts.py       Reasoning: 114K CoT traces
  finetune_configurable.py    Multi-dataset configurable trainer
  rag_pipeline.py             RAG for DevSecOps knowledge
  DATASETS.md                 Why these datasets, proven recipes
deployment/                   SERVE IT - Kubernetes + Docker + vLLM
  deployment.yaml             ML inference K8s manifest
  mlflow-deployment.yaml      Experiment tracking
  Dockerfile.ml-inference     Hardened multi-stage image
security/                     PROTECT IT - scanning + policies
  scanning/                   Trivy, Semgrep, Checkov, SBOM
  policies/                   Kyverno, OPA Gatekeeper
infrastructure/               RUN IT - Terraform + monitoring + CI/CD
  terraform/                  VPC, EKS, RDS, S3, IAM, KMS, GuardDuty, Macie
  monitoring/                 Prometheus, Alertmanager, OTEL, Grafana
  ci-cd/                      GitHub Actions DevSecOps pipeline
compliance/                   CERTIFY IT - SOC2, NIST, CIS
  controls-mapping.yaml       SOC2 Type II
  nist-800-53-mapping.yaml    NIST 800-53 Rev5
  cis-eks-k8s.yaml            CIS Benchmarks
```
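
A quick way to confirm a checkout matches this layout is a small path check. The sketch below is not part of the repository. It assumes each file sits directly under the directory it is listed beside in the tree, and the helper name `missing_paths` is hypothetical.

```python
from pathlib import Path

# Paths read off the tree above. ASSUMPTION: each file or subdirectory lives
# directly under the top-level directory it is listed beneath.
EXPECTED_PATHS = [
    "model/train_tulu3.py",
    "model/train_openthoughts.py",
    "model/finetune_configurable.py",
    "model/rag_pipeline.py",
    "model/DATASETS.md",
    "deployment/deployment.yaml",
    "deployment/mlflow-deployment.yaml",
    "deployment/Dockerfile.ml-inference",
    "security/scanning",
    "security/policies",
    "infrastructure/terraform",
    "infrastructure/monitoring",
    "infrastructure/ci-cd",
    "compliance/controls-mapping.yaml",
    "compliance/nist-800-53-mapping.yaml",
    "compliance/cis-eks-k8s.yaml",
]


def missing_paths(repo_root):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing from this checkout:")
        for path in missing:
            print(f"  {path}")
    else:
        print("Layout matches the README.")
```

Run it from the repository root; an empty result means every listed path is present.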
## Quick Commands
```bash
# Train on best data (A100, ~6h)
python model/train_tulu3.py
# Quick test (any GPU)
python model/train_tulu3.py --max_steps 100 --no_push
# Security scan
python security/scanning/security_audit.py
# Deploy model to K8s
kubectl apply -f deployment/deployment.yaml
# Infrastructure (Terraform)
cd infrastructure/terraform/environments/prod && terraform apply
```
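
The quick-test command passes two flags, `--max_steps` and `--no_push`. How `train_tulu3.py` actually parses them is not shown here, so the following is only a hypothetical sketch of flags with those names using `argparse`. The default of `-1` for "no step limit" and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical CLI mirroring the flags used in the quick-test command."""
    parser = argparse.ArgumentParser(description="Training CLI sketch (illustrative only)")
    parser.add_argument(
        "--max_steps",
        type=int,
        default=-1,  # ASSUMPTION: -1 means "run the full schedule"
        help="stop after this many optimizer steps",
    )
    parser.add_argument(
        "--no_push",
        action="store_true",  # ASSUMPTION: presence of the flag disables uploading
        help="skip uploading the trained model",
    )
    return parser


# Parse the same arguments the quick-test command passes.
args = build_parser().parse_args(["--max_steps", "100", "--no_push"])
print(args.max_steps, args.no_push)  # prints: 100 True
```

With no flags, this sketch falls back to an unlimited run that pushes the result, which mirrors the difference between the full training command and the quick test.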