Practical Tips, Pitfalls & Config Checks

Tokenizers & special tokens: If your base Gemma model ships a tokenizer different from the one Unsloth expects, check that the tokenizer returned by FastLanguageModel.from_pretrained is compatible with the model. You may need to convert a SentencePiece model or supply the tokenizer files yourself. If you get tokenizer mismatch errors, point to the tokenizer directory explicitly.
LoRA merge mismatch: If Unsloth fails to save GGUF after LoRA training, make sure the installed version of Unsloth supports .save_pretrained_gguf(); upgrade to the latest unsloth if needed.
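VRAM: If you run out of memory, lower per_device_train_batch_size to 1 and increase gradient_accumulation_steps so the effective batch size stays the same. If you still have issues, make sure load_in_4bit=True is set (load_in_4bit=False loads unquantized weights and needs more memory, not less), or fall back to CPU offloading.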
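The batch-size trade-off above is simple arithmetic. Here is a minimal sketch, assuming a single GPU and a target effective batch size of 8; both the helper name and the numbers are illustrative and not taken from your config:

```python
import math

def accumulation_steps(effective_batch_size, per_device_batch_size=1, num_devices=1):
    """Return the gradient_accumulation_steps needed to reach a target effective batch size."""
    if min(effective_batch_size, per_device_batch_size, num_devices) < 1:
        raise ValueError("all arguments must be positive integers")
    samples_per_step = per_device_batch_size * num_devices
    # Round up so the effective batch is never smaller than the target.
    return math.ceil(effective_batch_size / samples_per_step)

# Hypothetical case: a run tuned for an effective batch of 8 that no longer fits in VRAM.
training_overrides = {
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": accumulation_steps(8, per_device_batch_size=1),
}
print(training_overrides)
```

The two keys match the training-argument names mentioned above, so the dict can be merged into whatever arguments you already pass to the trainer.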
Small dataset overfitting: Fine-tuning on a single resume will cause overfitting. To keep the model useful:
- keep the LoRA rank low,
- train for only a few epochs,
- or combine multiple resumes or synthetic augmentations (paraphrases, Q/A variants) so the model generalizes.
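As a rough illustration of the Q/A-variant idea, the sketch below expands a few resume facts into several differently worded training examples. The facts, the templates, and the instruction/output field names are all placeholders; swap in whatever schema your training script expects:

```python
from itertools import product

# Hypothetical structured facts pulled from one resume.
resume_facts = {
    "current role": "Data Engineer at Acme Corp",
    "top skills": "Python, SQL, Airflow",
}

# Different phrasings of the same question, so the model sees varied prompts.
QUESTION_TEMPLATES = [
    "What is the candidate's {field}?",
    "Tell me about the candidate's {field}.",
    "Summarize the candidate's {field} in one line.",
]

def build_qa_variants(facts, templates):
    """Pair every fact with every question template."""
    examples = []
    for (field, answer), template in product(facts.items(), templates):
        examples.append({"instruction": template.format(field=field), "output": answer})
    return examples

examples = build_qa_variants(resume_facts, QUESTION_TEMPLATES)
print(len(examples), "examples")
print(examples[0])
```

Template-based variants only change the question wording; paraphrasing the answers as well would need a separate step.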
Evaluation: After training, manually check outputs for hallucinations. For a resume-specific agent, consider using fixed prompt templates and temperature=0.0 at inference so the outputs are deterministic.
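A manual check can be backed by a crude automated one. The sketch below flags comma-separated items in a model answer that never appear in the source resume; it is a plain substring match, an assumption that suits skill lists and will miss paraphrased claims. The sample texts are made up:

```python
def unsupported_items(model_output, resume_text):
    """Return comma-separated items from model_output that never appear in resume_text."""
    source = resume_text.lower()
    items = [item.strip() for item in model_output.split(",")]
    # Keep only non-empty items with no case-insensitive match in the resume.
    return [item for item in items if item and item.lower() not in source]

# Hypothetical resume snippet and model answer to "List the candidate's skills".
resume_text = "Skills: Python, SQL, Airflow. Data Engineer at Acme Corp, 2021-2024."
model_output = "Python, SQL, Kubernetes"

print(unsupported_items(model_output, resume_text))
```

Anything this returns is worth a closer look by hand; an empty list does not prove the answer is correct.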
Ollama compatibility: Ollama supports GGUF import. If you plan to share the model, include a short Modelfile and instructions.
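For the Modelfile, something like the following could generate a starting point. FROM, PARAMETER and SYSTEM are standard Modelfile instructions; the GGUF path and the system prompt here are placeholders, not values from your project:

```python
def render_modelfile(gguf_path, system_prompt, temperature=0.0):
    """Build the text of a minimal Ollama Modelfile for a local GGUF file."""
    lines = [
        f"FROM {gguf_path}",
        f"PARAMETER temperature {temperature}",
        f'SYSTEM """{system_prompt}"""',
    ]
    return "\n".join(lines) + "\n"

# Hypothetical file name and prompt; replace with your exported GGUF and wording.
modelfile_text = render_modelfile(
    "./resume-model.gguf",
    "Answer only from the resume you were trained on.",
)
print(modelfile_text)
```

Save the printed text as a file named Modelfile next to the GGUF, then register it with `ollama create <name> -f Modelfile`.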

To substantially improve quality, create ground-truth outputs for each supervision example (i.e., manually crafted short summaries, an exact skill list, cleaned role/date pairs). If you prefer automation, you can prompt a reliable large model (e.g., an upstream GPT model) to produce those supervised outputs from the raw resume and use them as labels, but that requires an external LLM call. Good labels make fine-tuning far more effective than naive outputs.
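Once the labels exist, they need to be stored in a form the training script can read. A minimal sketch, assuming an instruction/output JSONL layout (a common convention, not something your pipeline necessarily requires) and placeholder label text:

```python
import json

# Hand-written labels go in "output"; the bracketed strings are placeholders.
labeled_examples = [
    {"instruction": "Summarize the resume in two sentences.", "output": "<hand-written summary>"},
    {"instruction": "List the candidate's skills.", "output": "<exact skill list>"},
    {"instruction": "List each role with its dates.", "output": "<cleaned role/date pairs>"},
]

def to_jsonl(records):
    """Validate records and serialize them as one JSON object per line."""
    lines = []
    for index, record in enumerate(records):
        missing = {"instruction", "output"} - record.keys()
        if missing:
            raise ValueError(f"record {index} is missing {sorted(missing)}")
        if not record["output"].strip():
            raise ValueError(f"record {index} has an empty label")
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"

jsonl_text = to_jsonl(labeled_examples)
print(jsonl_text)
```

Rejecting empty labels up front is cheaper than discovering them after a training run.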

If FastLanguageModel.from_pretrained fails to load the model by HF ID (note that gemma3:270m is an Ollama tag, not a Hugging Face ID), try pointing to the local directory that holds the model files, or convert them to HF format first. If the model came from Ollama's pack (internal hashed names), you may need to convert it to .safetensors or full HF format first; Unsloth expects a transformers-style model or a compatible local folder. (If you run into this, tell me the exact error and I'll give the conversion steps.)
If you want, I can produce a small script to automatically create richer synthetic supervised targets (paraphrases + Q/A) from the resume. I recommend that step if you want a useful fine-tuned assistant rather than a model that only parrots the resume.
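Before passing a local folder to the loader, it can help to check what is actually in it. The sketch below looks for the files a transformers-style folder usually contains (config.json, weight files, tokenizer files); the exact set is an assumption and varies by model. The demo uses a throwaway temp folder, not your real model path:

```python
import tempfile
from pathlib import Path

def inspect_model_dir(path):
    """Report which Hugging Face-style files a local model folder contains."""
    folder = Path(path)
    names = {p.name for p in folder.iterdir()} if folder.is_dir() else set()
    report = {
        "has_config": "config.json" in names,
        "has_weights": any(n.endswith((".safetensors", ".bin")) for n in names),
        "has_tokenizer": bool(names & {"tokenizer.json", "tokenizer.model"}),
        "has_gguf": any(n.endswith(".gguf") for n in names),
    }
    report["looks_like_hf_folder"] = (
        report["has_config"] and report["has_weights"] and report["has_tokenizer"]
    )
    return report

# Demo on a temporary folder populated with empty stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("config.json", "model.safetensors", "tokenizer.json"):
        (Path(tmp) / name).touch()
    print(inspect_model_dir(tmp))
```

A folder that reports only has_gguf, or nothing at all, is the case where a conversion step is needed first.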