Dolphin-2.2-Yi-34b's training was sponsored by a16z.
This model is based on Yi and is subject to the Yi license.
I used the Llama-compatible chargoddard/Yi-34B-Llama as the base model.
Trained with 16k context. You can load it as follows:
from transformers import LlamaForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("ehartford/dolphin-2.2-yi-34b", trust_remote_code=True)
model = LlamaForCausalLM.from_pretrained("ehartford/dolphin-2.2-yi-34b")
New in 2.2: conversation and empathy. With an infusion of curated Samantha and WizardLM DNA, plus extra training in long multi-turn conversation, Dolphin can now give you personal advice and will care about your feelings.
This model is uncensored. I have filtered the dataset to remove alignment and bias, which makes the model more compliant. You are advised to implement your own alignment layer before exposing the model as a service, because it will be highly compliant with any request, even unethical ones. Please read my blog post about uncensored models (https://erichartford.com/uncensored-models). You are responsible for any content you create using this model. Enjoy responsibly.
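As a rough illustration of what an "alignment layer" in front of the model could look like, here is a minimal sketch that checks both the prompt and the reply before anything is returned to the user. Everything in it is an assumption made for illustration: `BLOCKED_TERMS`, `REFUSAL`, `is_allowed`, and `guarded_generate` are hypothetical names, and a plain substring deny-list stands in for whatever moderation model or policy engine a real service would use.

```python
from typing import Callable, Iterable

# Hypothetical deny-list, for illustration only. A real service would likely
# rely on a dedicated moderation model or policy engine instead.
BLOCKED_TERMS = ("build a bomb", "credit card dump")

# Hypothetical canned reply returned whenever a check fails.
REFUSAL = "Sorry, I can't help with that request."


def is_allowed(text: str, blocked_terms: Iterable[str] = BLOCKED_TERMS) -> bool:
    """Return True if none of the blocked terms appear in the text."""
    lowered = text.lower()
    return not any(term in lowered for term in blocked_terms)


def guarded_generate(prompt: str, generate: Callable[[str], str]) -> str:
    """Wrap a text-generation callable with input and output checks.

    `generate` is any function that maps a prompt string to a reply string,
    for example a thin wrapper around the model's own generation call.
    """
    if not is_allowed(prompt):
        return REFUSAL  # reject the request before it reaches the model
    reply = generate(prompt)
    if not is_allowed(reply):
        return REFUSAL  # suppress a reply that fails the same check
    return reply
```

Because `generate` is passed in as a callable, the wrapper can be exercised with a stub function before it is wired to the actual model.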
Dataset

This dataset is Dolphin, an open-source implementation of Microsoft's Orca.
I modified the dataset for uncensoring, deduping, cleaning, and quality.
I added Jon Durbin's excellent Airoboros dataset to increase creativity.
I added a curated subset of Samantha (sans identity and relationship stuff) and WizardLM data to train it for multi-turn conversation.
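The deduping and filtering steps mentioned above are not spelled out in this card, so the following is only a sketch of how such a pass might look, not the pipeline that was actually used. The record fields `instruction` and `response`, the `REFUSAL_MARKERS` list, and the function names are all hypothetical.

```python
import hashlib
import re

# Hypothetical markers of refusal boilerplate. The actual filter list used
# for this dataset is not given in this card.
REFUSAL_MARKERS = ("as an ai language model", "i cannot fulfill")


def normalize(text):
    """Collapse whitespace and lowercase so near-identical rows compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def clean_dataset(records):
    """Drop empty rows, rows containing refusal markers, and duplicates.

    Each record is assumed to be a dict with "instruction" and "response" keys.
    The first occurrence of each duplicate is kept, in its original form.
    """
    seen = set()
    kept = []
    for rec in records:
        instruction = normalize(rec["instruction"])
        response = normalize(rec["response"])
        if not instruction or not response:
            continue  # cleaning: skip rows with an empty field
        if any(marker in response for marker in REFUSAL_MARKERS):
            continue  # filtering: skip refusal-style responses
        key = hashlib.sha256(f"{instruction}\n{response}".encode("utf-8")).hexdigest()
        if key in seen:
            continue  # deduping: skip rows already seen after normalization
        seen.add(key)
        kept.append(rec)
    return kept
```

Hashing the normalized text keeps memory use small on large datasets, at the cost of catching only exact duplicates after normalization; fuzzy near-duplicate detection would need a different technique.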