Student Support

The manuals on this site offer you, as a student, instructions on the use of WUR education systems for your study and courses. Is the answer to your question not listed?
• Consult the manual of the system in question for support options.
• For account-related problems or information about (re)setting your password/WUR Passcode or about the use of Office 365, go to: WUR Support Portal
• Consult the WUR Current Students for information on enrolment, guidance during your study, student counseling, minors and exchange programmes, things to do besides your study, your graduation, and beyond.

Please note! Starting in the academic year 2025–2026, you can use the new student app myWUR for all study and campus information. The myWURtoday app will no longer be available after 1 September. Students will receive an email and a notification in the myWURtoday app once the new app is ready to download. This website will also be updated with a new manual.

LM Studio

Downloadable from: https://lmstudio.ai/

Page contents

Ratings
Pros/cons
Description
Finding models
- How to find suitable models
- Overview of models and benchmarks
About the developers
- Llama
- Qwen
- Mistral
- Gemma
- DeepSeek
Alternatives

Ratings

Open source model(s)

Accuracy / Quality: ★★★☆☆

Flexibility / Features: ★★★★☆

Data security / Privacy: ★★★★☆

Click here to go directly to our recommended models

Pros/cons

Pros

Customizable for many different (language) models.
Easily updated to use newly released models.
Data-secure and private.
Low technical difficulty.
Model parameters can be (partially) modified.

Cons

Typically lower in quality than commercially hosted models.
Slower than commercially hosted models due to lower hardware quality.
Fewer features than commercial models.
At least 5-10 GB local storage space required (smaller models).

Description

General

Using commercially hosted models (e.g. ChatGPT, Gemini or Claude) can have some significant drawbacks. Whenever you enter information into these models there is a risk that the information provided ends up in the training data of the model. This means that there can be no safe usage of these tools when working with either personal sensitive information (e.g. names, phone numbers, (mail) addresses, etc.), but also not with academically sensitive information (research questions, hypotheses, new methodologies, funding information, etc.). Unless a formal (written) agreement is made between the company in question and the WUR, using an open-source model on your own device can be a solution to this problem.

LM Studio (or alternatively: Ollama) is an example of software that allows the user to install language models on their own device. Doing so would allow the model to run on the hardware of the user, rather than the large servers of external companies. Any data you enter into models that are operating locally (on your own device) is not sent to any external servers, and is therefore data secure. Even if the internet connection would fall away, these models would continue to be usable.

Quantization

When installing models in LM Studio (or Ollama) you do so via GGUF-files. These are quantized versions of released open source models. The information in the model resulting from the training of the model is stored as large numbers, often in 32- or 64-bit lengths, such that the information is maintained with high precision. In the quantization process these lengths are reduced to e.g. 8-bit lengths. This shrinks down the size of the model to a more manageable volume for local devices, but comes at the cost of the accuracy of the model. This does not necessarily need to be a problem for the functioning of the model, as the accuracy provided by the high precision of the data is not always required. But if the precision is affected too much, the model will increasingly hallucinate or be unable to answer. In simple terms this process can be compared to the value for Pi: Having more decimals makes the calculation more precise, but not necessarily better.
The level of quantization can be found in the model name, with Q8 (8-bit) referring to a model in which a large degree of precision has been kept, whereas a Q2 (2-bit) version has lost a lot of precision but is significantly smaller in size. Conventionally the Q4 or Q5 quantized versions of models are an acceptable balance between size and accuracy for most tasks.

Model parameters

An advantage of using locally installed models is that parameters which are usually hidden in the background are available for modification. We will highlight the most relevant parameters.

Temperature

A core behavior of language models is determined via the 'Temperature' parameter. This parameter can vary from 0 to 1. A value of 0 represents a very strict and deterministic model, in which the prediction of the next token (word) is based on the largest statistical probability. It will result in the most likely answer being generated, and for the model to respond with the same answer consistently. A value of 1 instead results in more variation being introduced in the model, and less likely answers being generated. This can be beneficial when writing more creative texts such as poems, or when asking for more unique suggestions for the improvement of your text. The default temperature set for the model upon installing it is an indicative value of an acceptable balance between these two extremes. In LM Studio the Temperature setting can be found in the Advanced Configuration menu (top-right) under the 'Sampling' tab.

Context length

When reading the information you provide (your prompt and/or uploaded documents) and writing your answers the model is dependent on its context length. This parameter can be seen as the 'attention span' of the model. The greater the context length, the slower the model risks becoming, but the more information it can memorize as part of its answering. In LM Studio the context length can be found by clicking on the cogwheel next to the dropdown menu for the model selection.

When uploading documents to the model, bear in mind that the available context length can affect the quality of the answer. With a context length too short to fit the full document LM Studio uses the 'retrieval' method. This means that the language model looks for more exact matches of information in the file and is more of a 'searcher'. When the context length is large enough to fit the full size of the document (and prompt), LM Studio uses the 'full-injection' method, which means that the document is used as contextual information, allowing for more in-depth and related answers, rather than exact matches from the text.

System prompt

After loading a model, it can be used straight away. However, you can also insert a 'System prompt'. These are instructions which the model always needs to consider, and which are considered more important than the prompts you enter into your chat. These can be compared to the Custom Instructions feature available in ChatGPT. The system prompt can be used to specify the behavior of the model for all chats it is used in, until the setting is disabled again. For examples on how to write a system prompt, you can take inspiration from the system prompts of Claude, which are available via this link. You can find the System prompt feature in LM Studio under the Advanced Configuration menu on the top-right.

Finding models

How to find suitable models

To find GGUF-versions of open source models the best place to start is the Huggingface website. This website serves as a repository of open source models developed by both large companies as well as fine-tuned versions of those models released by individuals. To find suitable models for LM Studio you can look on the Models page. On this page, apply the 'Text generation' filter in the 'Tasks' tab, and the 'GGUF' filter in the 'Libraries' tab. What you are left with are all potentially usable open source language models.
You will rarely find GGUF-versions of models together with the originally released models. Hence we recommend looking at the versions released by Bartowski or the LM Studio Community. Inside the model page you can download the version of the model you desire via the 'Files' tab. Alternatively, you can also use the 'Discover' functionality within LM Studio itself, though this is a more limited list of available model versions.

Next, you will need to find a suitable model from the list of available ones. When comparing the available models, you will find that the larger the model (indicated by the B-value in the name, referring to the billions of parameters in the model) the higher the accuracy. However, this also means that the model is significantly larger to install and run. In this guide we will list a handful of versions, but will leave out the largest versions of models as these are unlikely to be usable on commonly used devices. We will only list 'Instruct' or 'It' versions of models, as these have been specifically fine-tuned to operate as a chatbot, making them more suitable for the application in LM Studio.

Each model has its own strengths and weaknesses, and their own sizes and quality of the training data. Luckily this can be assessed via independent benchmarks or via blind tests. Benchmarks are datasets containing questions (and answers) which can be posed to a language model to test its capabilities or accuracy on various tasks. These questions are (ideally) not present in the training data of the model, allowing the ranking of the models by the percentage of questions correctly answered. You can test against established benchmarks (e.g. Livebench). Alternatively, models can also be tested blindly by users by presenting users with the output of two or more models to a prompt they submitted, after which the user indicates which model (without knowing the name of the model) performed best. Based on this information a ranking can be established from real user experience. In this guide we use the LM Arena website for this ranking. Note that both the benchmark and blind test rankings were done on the full versions of the models, not the quantized versions. Hence a small decrease in quality compared to the listed ranking can be expected when using the model locally via LM Studio.

You can check if the model you wish to download will adequately run on your computer by looking it up in the Discover function of LM studio. This search will warn you when a model will (likely) use too much of your computer memory.

Overview of models and benchmarks

Recommended models

A large variety of models is available, but finding the most relevant one for your task may prove to be challenging. Hence, we will list a smaller selection of models here for specific tasks/purposes to get you started.

Model	Recommended for	Comments	Link
Qwen3-4B	Text summary and small-scale "reasoning"	Small model, limited inherent content knowledge.	Huggingface
Qwen3-0.6B	Spelling and grammar checking	Mini model, sufficient for spelling/grammar checking, not for content-related questions.	Huggingface
Devstral-Small-2507	Coding	Medium size model, requires above-average device to run	Huggingface

Detailed model list

The table below contains a list of (potentially) relevant models, is by no means complete, and prioritizes the more commonly used models. Note that not all model versions have a listed benchmark score. In these cases, know that smaller models generally have a lower performance than larger models.

If a model is not able to run, check that you have recently updated LM Studio to the latest version, and any relevant dependencies. Dependencies (also called Runtimes in LM Studio) can be found in the Discover tab.

Model description				LM Arena	Model page
Model family	Developer	Version	Size	Text score	Model page
LlaMa	Meta	3.2-3B-Instruct	3B	1173	Huggingface
		3.1-8B-Instruct	8B	1213	Huggingface
		3.3-70B-Instruct	70B	1315	Huggingface
Gemma	Google	Gemma-3-4b-it	4B	1301	Huggingface
		Gemma-3-12b-it	12B	1342	Huggingface
		Gemma-3-27b-it	27B	1365	Huggingface
Qwen	Alibaba	Qwen3-0.6B	0.6B	N/A	Huggingface
		Qwen3-4B	4B	N/A	Huggingface
		Qwen3-8B	8B	N/A	Huggingface
		Qwen3-32B	32B	1347	Huggingface
Mistral	Mistral	Large-Instruct-2411	123B	1301	Huggingface
		Small-3.2-24B-2506	24B	1343	Huggingface
		Nemo-Instruct-2407	12B	N/A	Huggingface
		Devstrall-Small-2507	24B	N/A	Huggingface
DeepSeek	DeepSeek	V3	685B	1396	Huggingface
		R1-Distill-Qwen-7B	7B	N/A	Huggingface
		R1-Distill-Qwen-14B	14B	N/A	Huggingface
		R1-Distill-LLaMa-8B	8B	N/A	Huggingface

About the developers

When using language models you should be aware of the biases inherent in them, and the limitations of the models. Some of these are tied to the developers of the models. Hence we will briefly go over each developer and provide some background information on them to provide context for the use of their respective models.

LLaMa

The Llama models are developed by Meta, an American technology company that is also known for social media platforms like Facebook, Instagram and WhatsApp. Apart from these social media apps and websites, Meta is also one of the biggest AI companies, most known for its online Meta AI model (not available in the Netherlands) and the offline Llama model. In total, all versions of the offline model are downloaded more than 350 million times. The newest version of the model is Llama 3.2. Usage of the LLaMa models is allowed under the LLaMa License for research and commercial purposes, but is more limited for commercial purposes when it concerns models of 'small' sizes (less than 8B parameters) due to potential large-scale use in commercial devices such as smartphones.

Qwen

At more than 40 million downloads over more than 100 different versions, Alibaba Cloud’s Tongyi Qianwen (Qwen) is another popular open source large language model. Alibaba Cloud is the digital technology part of the Chinese Alibaba group, also known for the web shops AliExpress and Alibaba. The model is very adequate at mathematics and programming tasks. Contrary to American and European model developers, the training data used for Qwen contains a proportionally larger body of data from Chinese origin. This may result in different model behavior compared to Western models, especially when social or political subjects are concerned.

Mistral

Mistral is an independent AI company from France founded in 2023. There are several Mistral applications like the online assistant LeChat, and the open source models Mistral, Codestral and Mathstral. Mistral made a name for itself by being the first commercial company to implement the 'Mixture of Experts' model strategy, in which various smaller models were each trained on a specific set of subjects, and then brought together to form a larger combined model. The resulting output was significantly more accurate than that of its competitors at the time.

Gemma

Another prominent AI company is Google DeepMind. Google is renowned for (among others) YouTube, Android and of course the search engine. Since the introduction of chat assistant Gemini (formerly known as Bard), Google also has also released a series of open source AI tools named Gemma. Apart from the Gemma 2 tool specialized in text generation, there are specialized model for code generation (CodeGemma) and images (PaliGemma).

DeepSeek

DeepSeek is a China-based AI developer. Their newest model (Deepseek-R1) is praised for high benchmark results and a strong resemblance in functionality to OpenAI's o1 reasoning model. Until recently the DeepSeek Coder was also considered the most accurate open source AI coding assistant. Similar to Qwen, the training data used to train the DeepSeek models is less American and Western Europe focused, and as such may give different results than Western models on social or political topics. When using the DeepSeek Distill models (see table) a chat template needs to be adhered to for the model to respond, like in the example below.

Example

<｜begin▁of▁sentence｜>
<｜User｜>
What is your opinion of Wageningen University?
<｜end▁of▁sentence｜>

Click to copy

Alternatives

Ollama: https://ollama.com/
Installing the full model and operating it via programming language (commonly Python). This requires more technical knowledge and is less user-friendly.