
I installed an AI locally to reduce my ecological impact
Computer in the dark (illustration image). © Pexels
Artificial intelligence, and more particularly language models like ChatGPT, has become an almost indispensable tool for millions of people. Capable of generating text, translating, summarizing or writing code, it rests on an infrastructure as powerful as it is energy-hungry: data centers. These computing factories, running 24/7, are veritable energy sinks, requiring not only a constant power supply but also massive quantities of water for cooling.
Faced with this observation, the idea of running an AI model locally, on one's own computer, looks like an attractive alternative. The objective is not to recreate a monster the size of GPT-4, but to use smaller models, open source and optimized for personal use. The promise is twofold: regain control of our data and, potentially, reduce our ecological impact.
Which software to install an AI locally?
Several programs exist to run an AI model directly on a computer. Among the best known are LM Studio, Jan and Msty. For this article's example, I chose LM Studio, which is free and available on Windows, macOS (Apple Silicon chips) and Linux. Keep in mind that all the steps described here can also be carried out with other tools. That said, one of LM Studio's advantages is that it offers an intuitive graphical interface to download, configure and interact with a multitude of open source language models.
Prerequisites: the power needed to run a local AI model
By definition, running an artificial intelligence directly on your machine, without going through the cloud, is rather demanding on hardware. However, you don't need a war machine to get there. Each AI model has its own requirements in terms of the resources it needs to run properly.
Overall, I recommend a fairly recent computer with at least 16 GB of RAM, a capable processor and, ideally, a dedicated graphics card. For Mac users, Apple's "M" chips are perfectly suited, thanks to a design that is very efficient at running local models. SSD storage with enough free space is essential, since a model can weigh from 3 GB to more than 40 GB.
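If you are unsure whether your machine fits the bill, a few lines of Python can check it. This is a minimal sketch, assuming the third-party psutil package is installed; it simply reports total RAM and free disk space against the recommendations above.

```python
# Minimal sketch: check a machine against the rough prerequisites above.
# Requires the third-party psutil package (pip install psutil).
import shutil
import psutil

ram_gb = psutil.virtual_memory().total / 1024**3
disk_gb = shutil.disk_usage("/").free / 1024**3  # on Windows, use e.g. "C:\\"

print(f"RAM: {ram_gb:.1f} GB (16 GB recommended)")
print(f"Free disk space: {disk_gb:.1f} GB (models weigh 3 GB to 40+ GB)")
```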
Installing your own local "ChatGPT" takes a few simple steps and requires no advanced computer skills.
Install a model in LM Studio
Download the LM Studio version that matches your operating system. The installation is standard and presents no particular difficulty.
Once LM Studio is launched, the home screen invites you to search for models. This is where the choice becomes crucial and must be guided by your hardware and your needs.
Searching for a model in LM Studio (screenshot). © Florent Lanne / Numériques
To start, it is advisable to aim for popular and versatile models. Meta's Llama 3 series or Mistral's models are excellent starting points. In the search bar, type for example "Llama 3 8B". The "8B" refers to the number of billions of parameters in the model, an indicator of its size and complexity. A model with 7 or 8 billion parameters offers an excellent compromise between performance and the resources required on a consumer machine.
The importance of quantization
Once your search is done, LM Studio will offer several versions of the same model. You will notice labels like Q4_K_M, Q5_K_M or Q8_0. These are quantization levels.
Quantization is a compression method that reduces the model's size and memory footprint with minimal loss of precision. It is a key element for efficient local use; the list below sums up the common levels, and a rough size comparison follows it.
- Q8 (8-bit): very high quality, almost indistinguishable from the uncompressed version, but heavier on resources.
- Q5 (5-bit): considered an excellent balance between quality and performance.
- Q4 (4-bit): the most popular. It delivers good response quality for a much smaller file size and RAM footprint. Q4_K_M is often recommended as a good starting point.
- Q2/Q3 (2/3-bit): clearly degraded quality, to be used only on very low-powered machines.
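To give a concrete idea of what these levels change, here is a back-of-envelope size estimate for an 8-billion-parameter model (size ≈ parameters × bits per weight ÷ 8). It is only a sketch: real GGUF files come out somewhat larger, since K-quant schemes mix precisions and add metadata.

```python
# Back-of-envelope file sizes for an 8B model at various quantization levels.
# Rule of thumb: size ≈ parameters * bits_per_weight / 8 bytes.
PARAMS = 8e9  # 8 billion parameters, as in "Llama 3 8B"

for label, bits in [("FP16 (uncompressed)", 16), ("Q8", 8), ("Q5", 5), ("Q4", 4)]:
    size_gb = PARAMS * bits / 8 / 1024**3
    print(f"{label:>20}: ~{size_gb:.1f} GB")

# FP16 (uncompressed): ~14.9 GB
#                  Q8: ~7.5 GB
#                  Q5: ~4.7 GB
#                  Q4: ~3.7 GB
```

This is why a Q4 or Q5 version of an 8B model fits comfortably in 16 GB of RAM, while the uncompressed version does not.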
Downloading a model in LM Studio (screenshot). © Florent Lanne / Numériques
For a first installation on a computer with 16 GB of RAM, a model in Q4_K_M or Q5_K_M is a wise choice. Just click the Download button next to the version that interests you.
Start a conversation
Once the download is finished, go to the Chat tab (the speech-bubble icon on the left). At the top of the screen, click "Select a model to load".
Launching a conversation in LM Studio (screenshot). © Florent Lanne / Numériques
Choose the model you have just downloaded: LM Studio will analyze it and load it into memory. This operation can take anywhere from a few seconds to a minute.
Once the model is loaded, the conversation area becomes active. You can now ask your questions in French, just as you would with ChatGPT.
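Note that LM Studio is not limited to its chat window: it can also expose the loaded model through a local server compatible with the OpenAI API, which is convenient for scripting. Here is a minimal sketch using the official openai Python client; the port is LM Studio's default, and the model identifier is only an example to replace with the one displayed in the app.

```python
# Minimal sketch: query a model served locally by LM Studio
# (server started from the app, default address http://localhost:1234).
from openai import OpenAI  # pip install openai

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="llama-3-8b-instruct",  # example identifier: copy yours from LM Studio
    messages=[{"role": "user", "content": "Summarize the benefits of local AI."}],
)
print(response.choices[0].message.content)
```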
Using AI locally: my assessment
The central question remains: is using a local AI really more ecological? The answer is nuanced, but leans towards yes.
When the computer runs a model like Llama 3 8B, its electricity consumption increases. However, that consumption is occasional and limited to the duration of the interaction with the model.
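To put a rough number on that occasional consumption, here is an illustrative calculation; both figures are assumptions chosen for the example, not measurements.

```python
# Illustrative back-of-envelope: energy used by one locally generated answer.
POWER_DRAW_W = 150      # assumed extra CPU/GPU draw while generating
GENERATION_TIME_S = 20  # assumed time to produce one answer

energy_wh = POWER_DRAW_W * GENERATION_TIME_S / 3600
print(f"~{energy_wh:.2f} Wh per answer")  # ~0.83 Wh with these assumptions
```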
For comparison, a request to a cloud service like ChatGPT, however well optimized, involves an entire energy chain: the user's device, the network (antennas, submarine cables) and, above all, the data center. The latter's consumption is permanent (cooling, idle servers) and, above all, massive compared to local use.
A second advantage of using AI locally is that it frees you from paid subscriptions, which keep multiplying in the era of cloud services. And let's not forget the protection of private data: running a local model avoids transferring too many personal details to an external server.
That said, beyond the ecological, privacy and financial benefits, relying exclusively on local AI is not a universal solution. The size of the available models and, above all, your computer's ability to run them narrow the field of possibilities. Just consider the immense amount of data ingested by far larger models such as GPT-4o or Gemini 2.5 Flash. Local artificial intelligence alone will struggle to cover a generic, all-purpose need. On the other hand, for a very specific use case, a model built for that purpose can do the job.
In addition, depending on the computer's power and the nature of the prompt, response times can be much longer than with a cloud solution such as Google's Gemini or OpenAI's ChatGPT.
Ultimately, adopting AI locally can be tried out without too many technical hurdles, and it gives purpose to the computing power lying dormant on our desks.




