	---
	license: apache-2.0
	tags:
	- devsecops
	- llm
	- sft
	- lora
	- tulu-3
	- kubernetes
	- terraform
	---

	# DevSecOps Model Platform

	> Train a secure model on the best data, then deploy it securely.

	## Start Here: Train Your Model

	| Dataset | Size | What It Gives You | Command |
	|---------|------|-------------------|---------|
	| tulu-3-sft-mixture | 940K | Math, code, safety, chat (BEST) | `python model/train_tulu3.py` |
	| OpenThoughts-114k | 114K | Reasoning, chain-of-thought | `python model/train_openthoughts.py` |

	`allenai/tulu-3-sft-mixture` is the SFT data mixture behind Allen AI's Tulu 3, which this project treats as the current state-of-the-art open instruction-tuned model. Reported results on Llama-3.1-8B: MMLU 53.5, GSM8K 79.9, HumanEval 76.8.

	The LoRA config follows "LoRA Without Regret" (Schulman 2025): r=256, alpha=16, applied to all linear layers. This setup is reported to match full fine-tuning at 67% of the compute.
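To make the numbers in this config concrete, here is a minimal sketch of what r=256 and alpha=16 imply for a single linear layer. It assumes the standard LoRA parameterization (update scaled by alpha / r, with factors A and B) and a square 4096 x 4096 projection, the hidden size of Llama-3.1-8B. `LoraSettings` is a hypothetical helper for illustration and is not part of this repository or of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the two numeric values quoted in the README."""

    r: int = 256      # adapter rank
    alpha: int = 16   # LoRA alpha

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / r.
        return self.alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.r * (d_in + d_out)


cfg = LoraSettings()
d = 4096                      # assumed: square projection at Llama-3.1-8B hidden size
full = d * d                  # weights in the frozen base layer
lora = cfg.adapter_params(d, d)

print(f"scaling = {cfg.scaling}")
print(f"adapter params = {lora:,} ({lora / full:.1%} of the base layer)")
```

Under these assumptions the adapter adds 2,097,152 trainable weights to a layer with 16,777,216 frozen ones. This counts parameters only; it says nothing about the compute figure quoted above.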

	## Repository Structure

	```
	model/                          THE MODEL - train, serve, enhance
	  train_tulu3.py                Primary: 940K best data (zero preprocessing)
	  train_openthoughts.py         Reasoning: 114K CoT traces
	  finetune_configurable.py      Multi-dataset configurable trainer
	  rag_pipeline.py               RAG for DevSecOps knowledge
	  DATASETS.md                   Why these datasets, proven recipes

	deployment/                     SERVE IT - Kubernetes + Docker + vLLM
	  deployment.yaml               ML inference K8s manifest
	  mlflow-deployment.yaml        Experiment tracking
	  Dockerfile.ml-inference       Hardened multi-stage image

	security/                       PROTECT IT - scanning + policies
	  scanning/                     Trivy, Semgrep, Checkov, SBOM
	  policies/                     Kyverno, OPA Gatekeeper

	infrastructure/                 RUN IT - Terraform + monitoring + CI/CD
	  terraform/                    VPC, EKS, RDS, S3, IAM, KMS, GuardDuty, Macie
	  monitoring/                   Prometheus, Alertmanager, OTEL, Grafana
	  ci-cd/                        GitHub Actions DevSecOps pipeline

	compliance/                     CERTIFY IT - SOC2, NIST, CIS
	  controls-mapping.yaml         SOC2 Type II
	  nist-800-53-mapping.yaml      NIST 800-53 Rev5
	  cis-eks-k8s.yaml              CIS Benchmarks
	```
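As a quick sanity check after cloning, the layout above can be verified with a few lines of Python. This is a sketch under one assumption: that each file listed under a directory heading sits directly inside that directory. `EXPECTED_PATHS` and `missing_paths` are hypothetical names and do not exist in the repository.

```python
from pathlib import Path

# Paths taken from the tree above; the nesting is assumed, not confirmed.
EXPECTED_PATHS = [
    "model/train_tulu3.py",
    "model/train_openthoughts.py",
    "model/finetune_configurable.py",
    "model/rag_pipeline.py",
    "model/DATASETS.md",
    "deployment/deployment.yaml",
    "deployment/mlflow-deployment.yaml",
    "deployment/Dockerfile.ml-inference",
    "security/scanning",
    "security/policies",
    "infrastructure/terraform",
    "infrastructure/monitoring",
    "infrastructure/ci-cd",
    "compliance/controls-mapping.yaml",
    "compliance/nist-800-53-mapping.yaml",
    "compliance/cis-eks-k8s.yaml",
]


def missing_paths(root: Path) -> list[str]:
    """Return the expected paths that do not exist under root."""
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(Path("."))
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("All expected paths are present.")
```

Run it from the repository root; an empty result means the checkout matches the documented structure.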

	## Quick Commands

	```bash
	# Train on best data (A100, ~6h)
	python model/train_tulu3.py

	# Quick test (any GPU)
	python model/train_tulu3.py --max_steps 100 --no_push

	# Security scan
	python security/scanning/security_audit.py

	# Deploy model to K8s
	kubectl apply -f deployment/deployment.yaml

	# Infrastructure (Terraform)
	cd infrastructure/terraform/environments/prod && terraform apply
	```
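For a sense of how little data the `--max_steps 100` quick test covers compared with a full pass, the arithmetic can be sketched as follows. The batch settings are illustrative assumptions only; the README does not state what `train_tulu3.py` actually uses, and the 940K row count is the rounded figure from the dataset table.

```python
import math

DATASET_SIZE = 940_000  # approximate row count of tulu-3-sft-mixture, per the table


def steps_per_epoch(n_examples: int, per_device_batch: int,
                    grad_accum: int, n_gpus: int = 1) -> int:
    """Optimizer steps needed to see every example once."""
    effective_batch = per_device_batch * grad_accum * n_gpus
    return math.ceil(n_examples / effective_batch)


def fraction_seen(max_steps: int, n_examples: int, per_device_batch: int,
                  grad_accum: int, n_gpus: int = 1) -> float:
    """Share of the dataset consumed after max_steps optimizer steps."""
    effective_batch = per_device_batch * grad_accum * n_gpus
    return min(1.0, max_steps * effective_batch / n_examples)


# Assumed settings: batch 4 per device, 4 accumulation steps, 1 GPU.
full_epoch = steps_per_epoch(DATASET_SIZE, per_device_batch=4, grad_accum=4)
quick = fraction_seen(100, DATASET_SIZE, per_device_batch=4, grad_accum=4)

print(f"steps for one epoch: {full_epoch:,}")
print(f"quick test covers {quick:.2%} of the data")
```

With these assumed settings the quick test touches well under 1% of the mixture, so it is a check that the pipeline runs, not a measure of model quality.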