
Available AI Models and Providers

Below, you will find the current list of AI models and providers available through our platform. We strive to keep this list up to date with the latest advancements in AI technology, so please be aware that it may change over time as we incorporate cutting-edge innovations.

To ensure you always have access to the most recent information, we recommend checking this list frequently. Alternatively, you can subscribe to our notification service to receive updates whenever new models or providers are added. This way, you can stay informed about the latest developments and take advantage of the newest AI solutions for your business needs.

Anthropic

Anthropic is an AI safety and research company based in San Francisco. They concentrate on developing reliable and beneficial AI systems through research and partnerships; their mission is to build AI systems that are safe, reliable, and beneficial. Anthropic has developed a family of large language models named Claude and has gained attention and support from large investors. They prioritize practical engineering solutions to challenges in AI reliability and interpretability, with the goal of building advanced, accountable AI systems for the welfare of society.

Claude 3 Haiku

Claude 3 Haiku, part of Anthropic's Claude 3 AI model collection, stands out as the smallest and swiftest model. It has been specifically designed to cater to tasks that prioritize efficiency and speed. While not as robust as its larger siblings, Sonnet and Opus, Haiku still manages to deliver remarkable performance.

Key Features

  • Impressive Speed: Haiku processes approximately 21,000 tokens (about 30 pages) per second for prompts under 32,000 tokens, making it roughly three times faster than its peer models on most workloads and ideal for applications like chatbots that require near-instant responses.
  • Cost-Effective: Haiku offers one of the best benefit-cost ratios in its category.
  • Ample Context Window: Despite its compact size, Haiku has a 200,000-token context window, allowing for efficient analysis of extensive datasets and documents.
  • Vision Capabilities: In addition to its language skills, Haiku supports image inputs and can process up to 20 images per request, giving it state-of-the-art computer vision capabilities (see the sketch below).
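
For illustration, here is a minimal sketch of sending an image to Claude 3 Haiku through Anthropic's Python SDK; the model name matches the endpoint table at the end of this section, while the file name and prompt are placeholders:

```python
# Minimal sketch: sending an image to Claude 3 Haiku with Anthropic's Python SDK.
# Assumes ANTHROPIC_API_KEY is set in the environment and "photo.png" exists.
import base64

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("photo.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64",
                        "media_type": "image/png",
                        "data": image_data}},
            {"type": "text", "text": "Describe what is in this image."},
        ],
    }],
)
print(message.content[0].text)
```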

Pros

  • Exceptionally fast inference speeds
  • Highly affordable pricing
  • Efficient handling of large context windows
  • Supports both text and image inputs

Cons

  • Less powerful compared to Sonnet and Opus models

In conclusion, Claude 3 Haiku strikes a balance between performance, speed, and cost, making it a valuable asset for enterprise applications that require responsive AI assistance.

Claude 3 Sonnet

Claude 3 Sonnet is a powerful language model developed by Anthropic. It's suited for several tasks, including information processing, code generation, and large-scale information analysis.

Key Features:

  • Accuracy: Improved accuracy compared to previous models, with fewer wrong answers and a willingness to admit uncertainty when necessary.
  • Code Generation: The model can generate code from natural language prompts (see the sketch below).
  • Large Context Window: With a 200,000-token context window, Claude 3 Sonnet can manage extensive datasets and perform in-depth analysis.
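
As a small illustration of the code-generation feature, here is a minimal sketch using Anthropic's Python SDK directly; the prompt is a placeholder, and the model name matches the endpoint table at the end of this section:

```python
# Minimal sketch: code generation with Claude 3 Sonnet via Anthropic's Python SDK.
# Assumes ANTHROPIC_API_KEY is set in the environment; the prompt is a placeholder.
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-sonnet-20240229",  # endpoint name from the table below
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Write a Python function that reverses the words in a sentence.",
    }],
)
print(message.content[0].text)
```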

Pros:

  • Accuracy: Claude 3 Sonnet demonstrates high-level reasoning capabilities, providing accurate and contextually relevant output responses.
  • Fast: Compared to its predecessors, Claude 3 Sonnet is twice as fast, enabling efficient processing of large-scale and complex inputs.
  • Cost-Effective: Claude 3 Sonnet offers robust performance at a low price in its category, making it an attractive alternative for enterprise workloads.

Cons:

  • Less Powerful than Opus: While Claude 3 Sonnet is a capable model, it is not as powerful as Claude 3 Opus, which offers even higher intelligence and performance.

In summary, Claude 3 Sonnet keeps the same balance between intelligence and speed while offering more robust performance than Haiku, making it a compelling option for applications such as data processing, code generation, analysis of extensive datasets, sales automation, documentation generation, and more.

Claude 3 Opus

The Claude 3 Opus LLM, provided by Anthropic, stands out as one of the most powerful language models available. Its capabilities exceed those of many other models, making it a formidable tool for a wide range of applications.

Key Features:

  • Built using constitutional AI, which means it's designed to be safe, ethical, and aligned with human values
  • Trained on a massive dataset, allowing it to engage in diverse conversations and tackle a wide range of tasks
  • Supports multiple languages, making it accessible to users worldwide
  • Improved over time through successive model releases informed by user feedback

Pros:

  • Impressive language understanding and generation capabilities, maintaining context well for coherent and engaging conversations
  • Strong reasoning skills, providing detailed, well-structured responses
  • A good grasp of humor, enabling witty exchanges

Cons:

  • Like all AI models, it may occasionally generate incorrect or biased information
  • Its knowledge cutoff date means it may not be aware of the most recent events or developments
  • Some users might find its responses a bit lengthy at times
  • Not a complete replacement for human interaction and decision-making

Overall, Claude 3 Opus is a powerful tool that offers impressive accuracy, making it a desirable asset for producing creative content, accurate summarization, question answering, research, education, and journalism. Combining Opus with custom APIs allows the LLM to retrieve real-time data, opening up opportunities for industries like transportation, finance, health care, and more.

Endpoints

Model Name      | Endpoint
Claude 3 Haiku  | claude-3-haiku-20240307
Claude 3 Sonnet | claude-3-sonnet-20240229
Claude 3 Opus   | claude-3-opus-20240229

Azure

When it comes to AI in the cloud, Microsoft Azure is definitely one of the top players out there. Its Azure AI platform is packed with services and tools that help developers and businesses build smart, innovative apps quickly, aiming to empower them to rapidly create intelligent, cutting-edge applications using pre-built APIs, customizable models, and end-to-end machine learning capabilities.

Azure Chat GPT-4

GPT-4 is a next-generation language model from OpenAI, now available through the Azure OpenAI Service. This powerful AI system builds upon the capabilities of its predecessor, GPT-3.5, offering enhanced performance and new features for developing cutting-edge language applications.

Key Features:

  • Vision capabilities and improved instruction following
  • JSON mode and reproducible output, which means you can get more consistent results across multiple runs of the model with the same input
  • Parallel function calling
  • Longer context window of 128,000 tokens (GPT-4 Turbo) and up to 4,096 output tokens

Compared to previous versions, GPT-4 is better at following instructions, provides more consistent results, and can handle more complex tasks.
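As a rough sketch, here is how JSON mode and the reproducibility seed might be used through the Azure OpenAI Python SDK; the environment variables, API version, and the deployment name chatgpt4 (taken from the endpoint table below) are assumptions to adapt to your own Azure resource:

```python
# Minimal sketch: JSON mode and reproducible output with Azure OpenAI's Python SDK.
# The endpoint URL, API version, and deployment name are assumptions; substitute
# the values from your own Azure resource.
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="chatgpt4",                          # your GPT-4 deployment name
    response_format={"type": "json_object"},   # JSON mode
    seed=42,                                   # best-effort reproducible output
    messages=[
        {"role": "system", "content": "Reply in JSON."},
        {"role": "user", "content": "List three uses of GPT-4 as a JSON array."},
    ],
)
print(response.choices[0].message.content)
```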

Pros:

  • Cutting-edge performance on natural language tasks
  • Enterprise capabilities of Azure like security, compliance, scalability
  • Customization options like fine-tuning on your own data
  • Flexible integration into your own applications via APIs

Cons:

  • Higher cost than GPT-3.5 models
  • A common drawback is that it requires learning new APIs, such as Chat Completions, but this concern is largely addressed by Serenity AI HUB.
  • Some content filtering restrictions, and as with every LLM, response quality depends heavily on how current the training data is.

Endpoints

Model Name       | Endpoint
Azure Chat GPT-4 | chatgpt4

Groq

GROQ is an artificial intelligence company based in Mountain View, California, founded by Jonathan Ross, who previously worked at Google and led the development of the TPU (Tensor Processing Unit). The company focuses on developing high-performance AI chips and software solutions to accelerate machine learning workloads. GROQ has made significant strides in the AI industry, notably by developing a unique architecture for their AI chips that can deliver up to 1,000 TOPS (Tera Operations Per Second) performance, outperforming many competitors in the market.

GROQ has partnered with industry leaders like Argonne National Laboratory to advance AI research and development in fields such as healthcare, climate science, and autonomous systems, aiming to push the boundaries of AI applications and contribute to groundbreaking discoveries.

The company aspires to become a major player in the AI hardware market by offering powerful, energy-efficient solutions for data centers, edge devices, and other computing environments. GROQ's future goals include expanding its product portfolio to cater to a wider range of AI applications and securing more partnerships to drive innovation. With its cutting-edge technology and strong financial backing, GROQ is well-positioned to make significant contributions to the advancement of AI in the coming years.

Gemma 7B by Groq

Gemma 7B is a strong model originally developed by Google and now offered through GROQ's platform, with performance comparable to the best models in the 7B weight class, such as Mistral 7B.

One of the most remarkable aspects of running Gemma 7B on GROQ's platform is the exceptional inference speed. GROQ's Language Processing Unit (LPU) Inference Engine allows Gemma 7B to run at approximately 814 tokens per second, which is 5-15 times faster than other measured API providers. This impressive speed is crucial for real-world applications of large language models (LLMs).

In terms of latency, defined as the time to receive the first token after sending an API request, GROQ offers a latency of just 0.3 seconds while maintaining a throughput of around 814 tokens per second.
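
A minimal sketch of how one might measure time-to-first-token for Gemma 7B by streaming through Groq's Python SDK; the prompt is a placeholder, and counting chunks is only a rough proxy for token throughput (the figures above are GROQ's published measurements, not guaranteed results of this script):

```python
# Minimal sketch: measuring time-to-first-token for Gemma 7B on Groq by streaming.
# Assumes GROQ_API_KEY is set in the environment.
import time

from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

start = time.perf_counter()
first_token_at = None
n_chunks = 0

stream = client.chat.completions.create(
    model="gemma-7b-it",  # endpoint name from the table below
    messages=[{"role": "user", "content": "Explain inference latency briefly."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        if first_token_at is None:
            first_token_at = time.perf_counter()
        n_chunks += 1  # counts streamed chunks, a rough proxy for tokens

elapsed = time.perf_counter() - start
print(f"time to first token: {first_token_at - start:.2f}s")
print(f"~{n_chunks / elapsed:.0f} chunks/s over {elapsed:.2f}s")
```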

Gemma 7B's strong performance and open-source nature make it an attractive option for various applications and industries. For example, in customer service, Gemma 7B can be used to develop chatbots that provide quick and accurate responses to customer inquiries, improving overall customer satisfaction. In the healthcare industry, Gemma 7B can be employed to create virtual assistants that help patients navigate complex medical information and provide personalized recommendations.

Pros:

  • Strong performance compared to other models in the 7B weight class
  • High inference speed when run on GROQ's Language Processing Unit (LPU) Inference Engine
  • Low latency combined with high throughput on GROQ's platform
  • Competitive pricing for input/output tokens

Cons:

  • Smaller language model compared to larger models like Gemini or OpenAI's ChatGPT
  • May not be as versatile or capable as larger models for certain tasks
  • Requires specialized hardware (GROQ's LPU chips) to achieve optimal performance

The high inference speed and low latency offered by GROQ's platform make Gemma 7B suitable for real-time applications, such as voice assistants and real-time translation services. The competitive pricing for input/output tokens also makes it an economically viable choice for small businesses looking to integrate language models into their products or services.

Llama 2 70B by Groq

Llama 2 70B, offered by GROQ, is a powerful large language model (LLM) that has been making waves in the AI industry. Developed originally by Meta in collaboration with Microsoft, this model boasts an impressive 70 billion parameters, placing it among the largest and most capable language models available today. With its extensive training data and advanced architecture, Llama 2 70B offers a wide range of natural language processing capabilities, including text generation, question answering, and language translation.

Pros:

  • Extensive knowledge base: Llama 2 70B has been trained on a vast amount of data, allowing it to possess a broad understanding of various topics and domains.
  • High-quality output: The model generates coherent, contextually relevant, and fluent text, making it suitable for a wide range of applications.
  • Multilingual support: Llama 2 70B can handle multiple languages, making it a valuable tool for businesses operating in global markets.
  • Efficient processing: GROQ's hardware architecture enables fast and efficient processing of the model, reducing latency and improving performance.

Cons:

  • High computational requirements: Due to its size, Llama 2 70B requires significant computational resources, which may be a barrier for some businesses because of the increased cost.
  • Potential biases: Like all LLMs, Llama 2 70B may inherit biases from its training data, which could lead to unintended consequences if not properly addressed.

Industries and applications that could benefit from this LLM include healthcare, where it can generate medical reports, analyze patient data, and assist in drug discovery; finance, for forecasting and risk assessment; and customer support, where it could power chatbots, product recommendations, and personalized marketing content, among many others.

Llama 3 70B and Llama 3 8B by Groq

Groq offers two versions of Meta AI's Llama 3 language models: Llama 3 70B and Llama 3 8B. Both models are part of the Llama 3 family, which includes pretrained and instruction-tuned generative text models optimized for dialogue use cases.

Pros

  • High performance: Groq's LPU Inference Engine enables fast response generation, with Llama 3 70B achieving 284 tokens per second and Llama 3 8B reaching 877 tokens per second, outperforming other providers.
  • Cost-competitive: Groq's pricing for both models is at or below other providers, making them an attractive option for various use cases.
  • Improved context window: Llama 3 models have an increased context window size of 8k tokens compared to the previous generation's 4k tokens, allowing for better context understanding.
  • Multilingual support.

Cons

  • Computational requirements: The larger Llama 3 70B model requires significant computational resources, and the associated cost may be prohibitive for some businesses.
  • Potential biases: As with all large language models, Llama 3 models may inherit biases from their training data, which should be considered when deploying them in real-world applications.

Differences between Llama 3 70B and Llama 3 8B

  • Model size: Llama 3 70B is larger than Llama 3 8B; the larger model generally offers better performance at the cost of increased computational requirements.
  • Training data: Llama 3 70B was trained on data up to December 2023, while Llama 3 8B's training data cutoff is March 2023.
  • Performance: Llama 3 70B outperforms popular models like Claude 3 Sonnet and Gemini Pro 1.5 on various benchmarks, while Llama 3 8B surpasses rival models like Gemma 7B and Mistral 7B.

The Llama 3 70B and Llama 3 8B models offered by Groq could be used in various industries and applications, including but not limited to:

  • Customer service and support: Chatbots and virtual assistants, answer queries, resolve issues, and offer personalized recommendations.
  • Healthcare: Assist medical professionals in patient triage, symptom analysis, and medical record summarization, making information more accessible and understandable.
  • Education: Interactive and engaging learning experiences, generating personalized content, providing instant feedback, and serving as virtual tutors.
  • Finance: Enhance risk assessment, fraud detection, and investment analysis by processing and analyzing vast amounts of financial data.
  • Creative industries: Content creators in media, entertainment, and advertising can leverage Llama 3 to generate compelling articles, scripts, and marketing copy.

Mixtral 8x7B by Groq

Mixtral 8x7B is a large language model originally developed by Mistral AI; Groq offers its own version on their platform. The model is designed to be efficient and adaptable, offering impressive performance while maintaining a relatively small size compared to other state-of-the-art language models. It is based on an architecture called "Mixture of Experts" (abbreviated MoE), which provides faster pre-training, faster inference, and greater flexibility.

Pros:

  • High performance: Mixtral 8x7B delivers excellent results in various natural language processing tasks, such as text generation, translation, and question-answering.
  • Efficiency: The model is optimized to run on Groq's custom AI accelerators, enabling faster processing and lower power consumption compared to traditional hardware.
  • Scalability: Mixtral 8x7B can be easily scaled up or down depending on the task requirements, making it adaptable to different use cases.
  • Cost-effective: The model's efficiency and scalability make it a cost-effective solution for businesses and organizations looking to implement AI-powered applications.

Cons:

  • The MoE architecture can face certain difficulties in fine-tuning, but these concerns are being addressed, and promising results are emerging.
  • Requires more VRAM because the experts need to be pre-loaded
  • As with every other LLM, it can have biases depending on the training data

The Mixtral 8x7B LLM offers significant potential across various industries, including healthcare, finance, e-commerce, and education. Its efficient processing capabilities, combined with the power of the Mixture of Experts architecture, make it ideal for tasks such as patient interaction, market analysis, personalized recommendations, and tailored learning experiences. The open-source nature of Mixtral allows for customization, making it a versatile tool for businesses and organizations seeking to leverage the power of language models.

Endpoints

Model Name        | Endpoint
Gemma 7b it       | gemma-7b-it
LLaMa 3 70b Groq  | llama3-70b-8192
Llama 3 8B Groq   | llama3-8b-8192
Mixtral 8x7b Groq | mixtral-8x7b-32768

Mistral

Mistral AI is a French artificial intelligence startup founded in 2023 by Arthur Mensch (a former Google DeepMind researcher), Timothée Lacroix, and Guillaume Lample (both former Meta employees). The company, named after a strong wind that blows in France, aims to create open-source, efficient, and adaptable AI models. Among the company's achievements are the publication of Mistral 7B in September 2023 under the Apache 2.0 license and the development of other smaller models that are less compute-intensive than competitors'. In 2024, the company debuted a ChatGPT-alternative chatbot called Le Chat and announced partnerships with Microsoft and Amazon.

Mistral Small

The compact version of the Mistral LLM offers strong performance despite its size and can even handle instruction-following tasks.

Pros:

  • Good benefit-cost ratio, making it a sensible choice for small businesses and industries.
  • Shows good instruction-following behaviour for its category
  • Includes a JSON mode, which simplifies capturing outputs and enhances usability (see the sketch at the end of this subsection).

Cons:

  • Highly complex tasks require more computing power, resulting in longer waiting times and higher costs
  • As with every LLM, there may be biases due to the training data

Overall, Mistral Small is a powerful and versatile LLM suitable for a wide range of functions, including customer support, summarization, and virtual assistants. It offers competent performance, cost-effectiveness, and adaptability, making it a valuable choice for various industries and use cases. While it may not be as powerful as its larger siblings, it still delivers strong performance and represents a compelling option in terms of both capability and price.
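
Below is a minimal sketch of the JSON mode mentioned above, calling Mistral Small over the raw HTTP API with the requests library; the prompt and field names are placeholders, and the model name comes from the endpoint table at the end of this section:

```python
# Minimal sketch: calling Mistral Small in JSON mode over the raw HTTP API.
# Assumes MISTRAL_API_KEY is set in the environment.
import os

import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-small-latest",
        "response_format": {"type": "json_object"},  # JSON mode
        "messages": [
            {"role": "user",
             "content": "Return a JSON object with fields 'task' and 'status'."},
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```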

Mistral Medium

Mistral AI has introduced a range of language models, including the Mistral Medium LLM, which is designed to cater to a variety of tasks requiring moderate reasoning and multilingual capabilities. Its key features include multilingual capacities, a larger context window, and the aforementioned JSON mode.

Pros:

  • Multilingual Support: Mistral-Medium is proficient in multiple languages, including English, French, Spanish, German, and Italian
  • Quality and performance: In some benchmarks, Mistral Medium has outperformed other models in its category and is claimed to rival the performance of leading LLMs in the field, offering a strong alternative to more established models
  • Specialized Capabilities: The model is equipped with features like precise instruction-following and function calling, which are essential for broad application development

Cons:

  • Limited Track Record: As a relatively new entrant in the LLM space, Mistral AI lacks the extensive track record of more established players
  • Biases due to the training data as in every LLM
  • Not particularly good for complex math applications

Mistral Medium can be useful for industrial applications and educational tasks such as custom virtual assistants, writing and composing text or songs, travel-planning advice, culinary assistance, and much more.

Mistral Large

Mistral Large is the most advanced language model developed by Mistral. It is a powerful tool for natural language processing and generation.

Pros:

  • Exceptional language understanding and generation capabilities
  • Maintains coherence and consistency in lengthy texts
  • Adaptable to various writing styles and formats
  • Handles complex writing tasks with ease
  • Extensive knowledge base covering a wide range of topics
  • JSON mode included

Cons:

  • Requires significant computational resources, leading to increased costs that may be prohibitive for small businesses
  • May occasionally generate biased or inconsistent content as any LLM
  • Lacks real-time knowledge beyond its training data cutoff date
  • Potential for misuse if not properly controlled or monitored

Mistral Large has numerous potential applications across various industries. In the field of content creation, it can assist in generating articles, blog posts, product descriptions, and marketing copy. Within customer service, Mistral Large can be employed to create chatbots and virtual assistants that provide human-like responses to customer inquiries. In the education sector, the model can be used to develop personalized learning content and assessments. Additionally, Mistral Large can aid in research and analysis by summarizing large volumes of text and extracting key insights.

Open Mixtral 8x22B

Open Mixtral 8x22B is a large-scale language model built on a sparse Mixture-of-Experts architecture with eight 22-billion-parameter experts, enabling it to understand and generate human-like text with remarkable accuracy. Its open-source nature allows for greater transparency and collaboration within the AI community. The model's architecture is optimized for efficiency, making it accessible to a wider range of users and organizations. Open Mixtral 8x22B excels at capturing context and generating coherent, relevant responses across various fields.

Pros:

  • Highly efficient architecture, making it more accessible
  • Excellent at understanding context and generating relevant responses
  • Capable of handling a wide range of natural language processing tasks
  • Continuously improving through community contributions and updates

Cons:

  • Limited knowledge cutoff date due to training data constraints
  • May struggle with highly specialized or niche domains
  • Biases in responses due to the training data, as with any other LLM

Open Mixtral 8x22B can be used in a ton of different ways across many industries. For customer service, it can help create smart chatbots and virtual assistants that give accurate and relevant answers to customer questions. In content creation, it can assist in writing high-quality articles, blog posts, product descriptions, and advertising copy. When it comes to research and analysis, this model can summarize big chunks of text, pull out important points, and generate reports. It can also be used to create personalized educational content, smart tutoring systems, and automated grading tools. Because it's open-source and adaptable, it's a valuable tool for businesses and people who want to use advanced language models for all sorts of text-related tasks.

Endpoints

Model Name         | Endpoint
Mistral Small      | mistral-small-latest
Mistral Medium     | mistral-medium-latest
Mistral Large      | mistral-large-latest
Open Mixtral 8x22b | open-mixtral-8x22b-2404

OpenAI

OpenAI is an artificial intelligence research laboratory founded in December 2015 by Elon Musk, Sam Altman, Greg Brockman, Ilya Sutskever, and Wojciech Zaremba. The company, headquartered in San Francisco, California, aims to promote and develop friendly AI in a way that benefits humanity as a whole. OpenAI began as a non-profit research organization with a collective pledge of $1 billion from its founders and investors. In 2019, OpenAI transitioned into a "capped-profit" company to attract investments while maintaining its mission. Microsoft has invested over $10 billion in OpenAI, valuing the company at around $29 billion.

Being the original developers of the widely known ChatGPT should be enough of an introduction for this company, but they have also released various cutting-edge innovations such as DALL-E (1, 2, and 3), a system that can create realistic images from text prompts; Whisper, an automatic speech recognition system; and Jukebox, a neural network that creates music. The company aims to conduct research in a transparent and open manner, collaborating with other institutions and carefully considering the ethical implications of AI development. With its cutting-edge research, high-profile partnerships, and substantial funding, OpenAI is at the forefront of shaping the future of artificial intelligence and its impact on society.

GPT 3.5 Turbo

ChatGPT 3.5 Turbo is an advanced conversational AI model built on the GPT-3.5 series architecture. It has been fine-tuned using reinforcement learning from human feedback to provide engaging, context-aware responses. The model excels at understanding and generating human-like text across a wide range of topics, thanks to its very large training dataset of around 400 billion tokens (570 GB) of processed text from sources like Wikipedia, books, and web pages.
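
A minimal sketch of a basic call to GPT 3.5 Turbo with OpenAI's Python SDK, assuming OPENAI_API_KEY is set in the environment; the prompts are placeholders:

```python
# Minimal sketch: a basic GPT-3.5 Turbo chat completion with OpenAI's Python SDK.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # endpoint name from the table at the end of this section
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what a context window is."},
    ],
)
print(response.choices[0].message.content)
```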

Pros:

  • Highly capable at understanding context and providing relevant, coherent responses
  • Can handle complex instructions and engage in multi-turn conversations
  • Continuously improving based on user feedback and iterations
  • Includes safety measures to reduce harmful or untruthful outputs

Cons:

  • Due to its computational demand, may have some latency compared to other models
  • May occasionally generate incorrect or biased information
  • Limited knowledge cutoff date based on training data (2021)
  • As any LLM, it may lack deeper understanding and reasoning capabilities

ChatGPT 3.5 Turbo is a versatile language model suitable for customer support, content creation, education, research, and creative writing applications. It enables the development of intelligent chatbots, assists in generating written content, supports interactive learning experiences, aids in research and data analysis, and collaborates with writers to spark creativity.

GPT 4

ChatGPT-4 is the latest iteration of OpenAI's powerful language model, building upon the successes of its predecessor, GPT-3.5. This state-of-the-art AI system boasts enhanced capabilities in natural language processing, comprehension, and generation. With its expanded knowledge base and improved reasoning abilities, ChatGPT-4 delivers more accurate, contextually relevant, and coherent responses compared to previous versions.

One of the key features of ChatGPT-4 is its ability to engage in multi-turn conversations, maintaining context and providing consistent, well-structured answers. It can handle complex queries, generate creative content, and even assist with problem-solving tasks. Additionally, ChatGPT-4 demonstrates better common sense reasoning and a deeper understanding of abstract concepts.
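
Multi-turn context is maintained by resending the accumulated message history on each request. Here is a minimal sketch with OpenAI's Python SDK; the helper function and prompts are illustrative, not part of the API:

```python
# Minimal sketch: maintaining context across turns with GPT-4. Context is kept
# by resending the growing message history on every call.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
history = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(user_text: str) -> str:
    """Append the user turn, call GPT-4 with the full history, store the reply."""
    history.append({"role": "user", "content": user_text})
    reply = client.chat.completions.create(model="gpt-4", messages=history)
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer

print(ask("Pick a programming language and say why."))
print(ask("Now show a hello-world in that language."))  # relies on the prior turn
```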

Pros:

  • Enhanced language understanding and generation capabilities
  • Improved context retention and multi-turn conversation handling
  • Expanded knowledge base covering a wide range of topics
  • Better common sense reasoning and problem-solving abilities
  • Increased safety measures to mitigate harmful or biased outputs

Cons:

  • Requires significant computational resources, which may limit accessibility
  • May occasionally produce inconsistent or irrelevant responses
  • Potential for misuse or generation of misleading information if not properly controlled
  • Lacks true understanding and emotional intelligence compared to humans
  • May perpetuate biases present in its training data

Its advanced capabilities make it a valuable tool across various industries and applications. In customer service, it can power highly efficient and personalized chatbots, handling complex inquiries and providing 24/7 support. In content creation, ChatGPT-4 can assist in generating articles, product descriptions, marketing copy, and educational materials.

GPT 4 turbo

ChatGPT-4 Turbo from OpenAI is a cutting-edge language model that pushes the boundaries of natural language processing. With its advanced architecture and vast training data, ChatGPT-4 Turbo delivers highly coherent, contextually relevant, and engaging responses.

Pros:

  • Highly coherent and contextually relevant responses
  • Maintains context over long conversations
  • Capable of understanding and generating text across various topics and styles
  • Faster response times compared to Chat GPT-4

Cons:

  • Potential for generating biased or misleading information if not properly fine-tuned
  • Requires significant computational resources to train and run
  • May struggle with highly specialized or domain-specific knowledge
  • Can have some difficulties with certain complex reasoning tasks compared to Chat GPT-4

This language model can be used in customer service to provide quick, accurate, and personalized responses to customer inquiries. It can also help in content creation, assisting writers in generating articles, stories, and even poetry. In education, ChatGPT-4 Turbo can serve as an intelligent tutoring system, providing students with instant feedback and explanations. Furthermore, the model can be applied in research, helping scientists and analysts process and summarize large volumes of text data.

Endpoints

Model Name    | Endpoint
GPT 3.5 Turbo | gpt-3.5-turbo
GPT 4         | gpt-4
GPT 4 turbo   | gpt-4-1106-preview
GPT 4 Omni    | gpt-4o

Together AI

Together AI was founded in 2022 by a team of experienced AI researchers and entrepreneurs, including CEO Vipul Ved Prakash and CTO Ce Zhang. The company is based in San Francisco, and its mission is to develop advanced AI systems that can collaborate effectively with humans to solve complex problems. It is backed by major investors such as Nvidia and has been partnering with large healthcare providers to deploy AI systems that assist doctors in diagnosis and treatment planning. Together AI's medical AI has demonstrated the potential to significantly improve patient outcomes.

Since Together offers so many LLMs, we've organized them in a table with the most relevant information, such as context length and key features:

Model Name | Organization | Context Length | Pros | Cons | Uses | Endpoint
01-ai Yi Chat (34B) | 01.AI | 4096 | Large model, Multilingual | Resource-intensive, Potential biases | Chatbots, Customer support | zero-one-ai/Yi-34B-Chat
Alpaca (7B) | Stanford | 2048 | Efficient, Open-source | Limited context, Smaller model | General-purpose, Research | togethercomputer/alpaca-7b
Chronos Hermes (13B) | Austism | 2048 | Efficient, Mythological theme | Limited context, Smaller model | Creative writing, Storytelling | Austism/chronos-hermes-13b
Code Llama Instruct (13B) | Meta | 16384 | Code-focused, Large context | Resource-intensive, Narrow domain | Programming assistance, Code generation | togethercomputer/CodeLlama-13b-Instruct
Code Llama Instruct (34B) | Meta | 16384 | Code-focused, Very large model | Resource-intensive, Narrow domain | Advanced programming, Code analysis | togethercomputer/CodeLlama-34b-Instruct
Code Llama Instruct (7B) | Meta | 16384 | Code-focused, Efficient | Smaller model, Narrow domain | Basic programming, Code snippets | togethercomputer/CodeLlama-7b-Instruct
LLaMA-2 Chat (13B) | Meta | 4096 | Efficient, Conversational | Limited context, Smaller model | Chatbots, Customer support | togethercomputer/llama-2-13b-chat
LLaMA-2 Chat (70B) | Meta | 4096 | Very large model, Conversational | Resource-intensive, Potential biases | Advanced chatbots, Virtual agents | togethercomputer/llama-2-70b-chat
LLaMA-2 Chat (7B) | Meta | 4096 | Efficient, Conversational | Limited context, Smaller model | Chatbots, Customer support | togethercomputer/llama-2-7b-chat
LLaMA-2-7B-32K-Instruct (7B) | Together | 32768 | Large context, Instruction-following | Resource-intensive, Smaller model | Long-form content, Research | togethercomputer/Llama-2-7B-32K-Instruct
LLaMA-3 Chat (70B) | Meta | 8000 | Very large model, Large context | Resource-intensive, Potential biases | Advanced chatbots, Long-form content | META-LLAMA/LLAMA-3-70B-CHAT-HF
LLaMA-3 Chat (8B) | Meta | 8000 | Efficient, Large context | Smaller model, Potential limitations | Chatbots, Q&A | META-LLAMA/LLAMA-3-8B-CHAT-HF
Mistral (7B) Instruct | mistralai | 8192 | Efficient, Instruction-following | Smaller model, Potential limitations | Virtual assistants, Q&A | mistralai/Mistral-7B-Instruct-v0.1
Mistral (7B) Instruct v0.2 | mistralai | 32768 | Large context, Instruction-following | Resource-intensive, Smaller model | Long-form content, Research | mistralai/Mistral-7B-Instruct-v0.2
Mixtral-8x7B Instruct (46.7B) | mistralai | 32768 | Large context, Ensemble model | Resource-intensive, Potential inconsistencies | Complex tasks, Research | mistralai/Mixtral-8x7B-Instruct-v0.1
MythoMax-L2 (13B) | Gryphe | 4096 | Mythological theme, Efficient | Limited context, Smaller model | Creative writing, Storytelling | Gryphe/MythoMax-L2-13b
Nous Capybara v1.9 (7B) | NousResearch | 8192 | Efficient, Mythological theme | Smaller model, Potential limitations | Creative writing, Storytelling | NousResearch/Nous-Capybara-7B-V1p9
Nous Hermes 2 - Mixtral 8x7B-DPO (46.7B) | NousResearch | 32768 | Large context, Ensemble model | Resource-intensive, Potential inconsistencies | Complex tasks, Research | NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO
Nous Hermes 2 - Mixtral 8x7B-SFT (46.7B) | NousResearch | 32768 | Large context, Ensemble model | Resource-intensive, Potential inconsistencies | Complex tasks, Research | NousResearch/Nous-Hermes-2-Mixtral-8x7B-SFT
Nous Hermes Llama-2 (13B) | NousResearch | 4096 | Efficient, Mythological theme | Limited context, Smaller model | Creative writing, Storytelling | NousResearch/Nous-Hermes-Llama2-13b
Nous Hermes LLaMA-2 (7B) | NousResearch | 4096 | Efficient, Mythological theme | Limited context, Smaller model | Creative writing, Storytelling | NousResearch/Nous-Hermes-llama-2-7b
Nous Hermes-2 Yi (34B) | NousResearch | 4096 | Large model, Mythological theme | Resource-intensive, Potential biases | Advanced creative writing, Storytelling | NousResearch/Nous-Hermes-2-Yi-34B
OpenChat 3.5 (7B) | OpenChat | 8192 | Efficient, Open-source | Smaller model, Potential limitations | General-purpose, Research | openchat/openchat-3.5-1210
OpenHermes-2-Mistral (7B) | Teknium | 8192 | Efficient, Mythological theme | Smaller model, Potential limitations | Creative writing, Storytelling | teknium/OpenHermes-2-Mistral-7B
OpenHermes-2.5-Mistral (7B) | Teknium | 8192 | Efficient, Mythological theme | Smaller model, Potential limitations | Creative writing, Storytelling | teknium/OpenHermes-2p5-Mistral-7B
OpenOrca Mistral (7B) 8K | OpenOrca | 8192 | Efficient, Large context | Smaller model, Potential limitations | Chatbots, Q&A | Open-Orca/Mistral-7B-OpenOrca
Platypus2 Instruct (70B) | garage-bAInd | 4096 | Very large model, Instruction-following | Resource-intensive, Potential biases | Diverse applications, Research | garage-bAInd/Platypus2-70B-instruct
RedPajama-INCITE Chat (7B) | Together | 2048 | Efficient, Open-source | Limited context, Smaller model | Chatbots, Q&A | togethercomputer/RedPajama-INCITE-7B-Chat
StripedHyena Nous (7B) | Together | 32768 | Large context, Mythological theme | Resource-intensive, Smaller model | Long-form content, Creative writing | togethercomputer/StripedHyena-Nous-7B
Upstage SOLAR Instruct v1 (11B) | upstage | 4096 | Efficient, Instruction-following | Limited context, Smaller model | Virtual assistants, Q&A | upstage/SOLAR-10.7B-Instruct-v1.0
Vicuna v1.5 (13B) | LM Sys | 4096 | Efficient, Open-source | Limited context, Smaller model | General-purpose, Research | lmsys/vicuna-13b-v1.5
WizardLM v1.2 (13B) | WizardLM | 4096 | Efficient, Instruction-following | Limited context, Smaller model | Virtual assistants, Q&A | WizardLM/WizardLM-13B-V1.2
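
Since Together exposes an OpenAI-compatible API, any model in the table above can be called by pointing the OpenAI Python SDK at Together's base URL. A minimal sketch, assuming TOGETHER_API_KEY is set in the environment and using one endpoint from the table:

```python
# Minimal sketch: calling a Together AI model through its OpenAI-compatible API.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.together.xyz/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
)

response = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # endpoint from the table above
    messages=[{"role": "user", "content": "Give one use case for an MoE model."}],
)
print(response.choices[0].message.content)
```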