This Python Code Could Save You From Spending Too Much on Your Next Laptop


If you are thinking about buying a new laptop, you may wonder how much you should pay for one. You can browse websites, but a little Python code and a simple linear regression can make the job easier.

Why build a laptop price predictor?

You can search through thousands of web pages and travel to physical stores, but that would take a long time. I like computers, but I don’t have the time or inclination for this task. I have other things I prefer to do. I would like a program where I can enter specifications, such as the amount of RAM I want or the screen resolution I need, and get back a price.

There are so many machines on the market that it would be difficult for me to work through all this information by hand. I also thought that sharing it could be useful to other people who are in the market for a new laptop. Who wants to pay too much for a machine? I don’t, and I can guess you probably don’t either.


With my knowledge of linear regression from statistics, I realized that I could easily build a model to answer these questions. Python is a great language, and I already had some basic familiarity with it. It has become popular in data analysis because it is simple enough for people without a computing background to pick up, yet it offers powerful libraries for analyzing data.

For this project, I used pieces of the Python statistical ecosystem with which I was already comfortable.

I had already created a Mamba environment with these tools. While many systems, including Linux, ship with Python, that copy is intended to support the system rather than user programs. If you upgrade the system Python, you may find that the scripts that depend on it break. There are tools for setting up custom environments, such as virtualenv.

The first component is NumPy. It is a popular library for all kinds of numerical operations, in particular the statistical and linear algebra calculations that will happen in the background.

The next library you will need is pandas, which lets you import the data set and display it in columns as a “data frame”. It is a bit like a cross between a relational database and a spreadsheet. You can also perform powerful manipulations on your data.

Seaborn is a library for plotting statistical data. I use it to view data distributions in histograms, scatter plots and linear regressions.


Finally, Pingouin lets me easily carry out many statistical tests without having to memorize all the formulas that I forgot after my university statistics course years ago. This is the library that will create the model, using a multiple linear regression of the retail price against the laptop attributes.

Installing all of this is simple in most Unix-like environments, including Windows using the Windows Subsystem for Linux. You can follow the instructions on the web page to install it.

Jupyter notebooks provide a relatively user-friendly way to run Python commands and display the results, as well as store the results for later, but they are strictly optional. I created a Jupyter notebook and will use it to demonstrate code examples. I published it on my GitHub, so that you can see the code and some examples that I could not cover in this article.

With Mamba installed, you can create the environment you need. Like on a cooking show, I already had one prepared. To activate it, I type this at the Linux shell:

        
mamba activate stats

Acquire laptop data

To build the data set for the regression model, I could browse online stores and build a full database of laptops. It would take a long time to gather the data, as well as to clean it so that it is consistent. Fortunately, someone has already done it.

There is a laptop database available on Kaggle with hardware specifications such as processor speed, amount of RAM, amount of storage, and horizontal and vertical screen resolution.

The laptop prices are in euros, but a quick check on XE.com in July 2025 showed that the euro and the US dollar were fairly close in value.

Build the regression model

With the environment set up and the data acquired, it is now time to build the model. First of all, I have to import the libraries that I will use.

        
import numpy as np
import pandas as pd
import seaborn as sns
%matplotlib inline
import pingouin as pg

These lines import the NumPy, pandas, Seaborn and Pingouin libraries, shortened to “np”, “pd”, “sns” and “pg”. The line that starts with “%” is intended for use in a Jupyter notebook. It tells Jupyter to display the plots drawn by the Matplotlib library inside the notebook. Otherwise, they will be displayed in a separate window.

Then we will import the data with Pandas:

        
laptops = pd.read_csv("data/laptop_prices.csv")

This will create a pandas DataFrame. We can see how the data is laid out with the head() method:

        
laptops.head()
Output of laptops.head() in a Jupyter notebook.

We can also see basic descriptive statistics for all numeric columns with the describe() method.

        
laptops.describe()
Descriptive statistics of the numeric columns in the laptop data set.

This will show the count, the mean, the standard deviation, the minimum value, the lower quartile or 25th percentile, the median, the upper quartile or 75th percentile, and the maximum value of each column.

I also like to view data distributions with histograms. Seaborn’s displot function does this.
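To make those rows concrete, here is a minimal sketch that builds the same kind of summary for a single column using only Python's standard library. The function name describe_column and the sample prices are hypothetical and are not taken from the Kaggle data; the "inclusive" quantile method is chosen on the assumption that it matches the linear interpolation pandas uses by default.

```python
import statistics

def describe_column(values):
    """Summarize one numeric column in the same layout describe() uses."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "25%": q1,
        "50%": median,  # the median
        "75%": q3,
        "max": max(values),
    }

# Hypothetical prices in euros, for illustration only
sample_prices = [500, 700, 900, 1100, 2300]
summary = describe_column(sample_prices)
for name, value in summary.items():
    print(f"{name:>5}: {value}")
```

In this made-up sample the mean sits above the median, which is the usual sign that a few expensive machines are pulling the average upward.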

To see how prices are distributed:

        sns.displot(x='Price_euros',data=laptops)
    
Histogram of laptop prices in a Jupyter notebook.

This tells Seaborn to plot prices along the x-axis and to use the laptops DataFrame as the source. The distribution is noticeably skewed to the right.
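If you want a number to go with the picture, skewness can be computed directly. The following is a rough sketch using the plain moment-based formula and made-up prices; the function name sample_skewness is hypothetical, and library routines such as pandas' skew() apply a bias correction, so their values will differ slightly.

```python
import statistics

def sample_skewness(values):
    """Moment-based skewness: positive values mean a longer right tail."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical prices: mostly mid-range machines plus one expensive outlier
prices = [400, 500, 600, 700, 800, 3000]
skew = sample_skewness(prices)
print(f"skewness: {skew:.2f}")
```

A result above zero corresponds to the shape in the histogram: most laptops cluster at lower prices, with a thin tail of expensive ones.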

We will build a model that uses various specifications. It will look like this:

Price = A (CPU speed) + B (RAM) + C (size in inches) …

The letters are stand-ins for the coefficients determined by the regression. It is similar to a simple linear regression that you may have seen, but instead of fitting a line to a scatter plot, you fit a plane. Since there are more than three dimensions in this model, it is actually a hyperplane.

To regress the price in euros on the laptop’s size in inches, RAM, weight, screen width and height, CPU speed, primary storage and secondary storage, use Pingouin’s linear_regression function:
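To see what fitting a plane means in practice, here is a small sketch that uses NumPy's least-squares solver on made-up, noise-free data. It is not the method Pingouin uses internally as far as this article is concerned; the variable names, the two predictors and the "true" coefficients (100, 50 and 200) are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical specs with a known relationship:
# price = 100 + 50 * ram + 200 * cpu_ghz
ram = np.array([4.0, 8.0, 16.0, 8.0, 32.0])
cpu_ghz = np.array([1.6, 2.5, 2.8, 3.0, 2.2])
price = 100 + 50 * ram + 200 * cpu_ghz

# Design matrix: a column of ones for the intercept, then one column per predictor
X = np.column_stack([np.ones_like(ram), ram, cpu_ghz])

# Least squares picks the coefficients that minimize the squared prediction errors
coefficients, *_ = np.linalg.lstsq(X, price, rcond=None)
intercept, ram_coef, cpu_coef = coefficients
print(intercept, ram_coef, cpu_coef)
```

Because the made-up data has no noise, the solver should recover the coefficients that generated it. With real laptop data the fit is only approximate, which is why the quality of the fit needs to be checked.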

pg.linear_regression(laptops[['Inches','Ram','Weight','ScreenW','ScreenH','CPU_freq','PrimaryStorage','SecondaryStorage']],laptops['Price_euros'],relimp=True)
Laptop multiple regression model in a Jupyter notebook.

This will give us the coefficients of the regression equation. The relimp=True option tells Pingouin to calculate how much each variable contributes to predicting the price. The coefficients are displayed in the “coef” column near the left of the table, and the columns on the far right tell us that RAM is the greatest predictor of the price. The number to watch when judging how good the fit is is the square of the correlation coefficient, labeled “r2” in this table. It is around .66, which means that the fit is fairly good.

With the estimated coefficients, we can now plug values into the equation to predict the price. Here is a function that does exactly that:
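The R-squared value can be read as the share of the variation in price that the model accounts for. As a rough illustration, here is how it can be computed from actual and predicted prices; the function name r_squared and the numbers are hypothetical and do not come from the laptop model.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` that the predictions account for."""
    mean_actual = sum(actual) / len(actual)
    # Squared error left over after using the model's predictions
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error from always guessing the average price
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical actual and predicted prices in euros
actual = [600, 900, 1200, 1500]
predicted = [650, 850, 1250, 1450]
score = r_squared(actual, predicted)
print(f"R squared: {score:.3f}")
```

A value of 1 would mean perfect predictions, while 0 would mean the model does no better than always guessing the average price. By that yardstick, a value around .66 leaves about a third of the price variation unexplained by the specifications.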

def price(inches,ram,weight,screenw,screenh,cpu_freq,primary_storage,secondary_storage):
    # Intercept plus each coefficient multiplied by its specification
    return 77.11 + -69.81*inches + 77.89*ram + 92.04*weight + 0.04*screenw + 0.59*screenh + 284.51*cpu_freq - 0.21*primary_storage + -0.04*secondary_storage

The return line must be indented, because it is the body of the function.
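As a usage sketch, here is the same equation written out with one term per line and applied to a made-up machine. The name predict_price and the example specifications are hypothetical. It also assumes, from the order of the columns passed to the regression, that 92.04 is the weight coefficient and 284.51 is the CPU speed coefficient, and that the data set measures RAM and storage in GB, weight in kg and CPU speed in GHz.

```python
def predict_price(inches, ram, weight, screenw, screenh, cpu_freq,
                  primary_storage, secondary_storage):
    """Estimate a laptop price in euros from the rounded regression coefficients."""
    return (77.11                      # intercept
            - 69.81 * inches
            + 77.89 * ram
            + 92.04 * weight           # assumption: 92.04 pairs with weight
            + 0.04 * screenw
            + 0.59 * screenh
            + 284.51 * cpu_freq        # assumption: 284.51 pairs with CPU speed
            - 0.21 * primary_storage
            - 0.04 * secondary_storage)

# Hypothetical machine: 15.6 inches, 8 GB RAM, 2 kg, 1920x1080,
# 2.5 GHz CPU, 256 GB primary storage, no secondary drive
estimate = predict_price(15.6, 8, 2.0, 1920, 1080, 2.5, 256, 0)
print(f"Estimated price: {estimate:.2f} euros")
```

Treat the result as a ballpark figure. The coefficients are rounded, and the model explains only part of the variation in price.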

Do prices really differ between brands?

This regression model only looks at the specifications. You may be wondering whether the brand is really a predictor of price. We can use analysis of variance, or ANOVA, to determine whether the differences between brands are significant. Because the price data is skewed, as the histogram showed, a nonparametric test will be more appropriate. Pingouin has a Kruskal-Wallis test that does this.

This will test the null hypothesis that there is no relationship between the price and the brand:

        pg.kruskal(data=laptops,dv='Price_euros',between='Company').round(2)
    
Kruskal-Wallis test results for laptop brands in a Jupyter notebook.

The p-value is displayed as 0, which means that the difference is indeed significant. The rounding was done to make the p-value easier to read. Otherwise, it would be shown in scientific notation. This means that we can reject the null hypothesis and conclude that the brand is a price predictor.
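For the curious, the idea behind the Kruskal-Wallis test is to replace the prices with their ranks and compare the rank totals of each group. The sketch below computes the H statistic by hand for three made-up brands. The function names and data are hypothetical, the correction for tied values is left out, and this is not a description of Pingouin's internals.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def kruskal_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)

# Hypothetical prices for three brands with no overlap at all
budget, midrange, premium = [1, 2, 3], [4, 5, 6], [7, 8, 9]
h_stat = kruskal_h([budget, midrange, premium])
print(f"H = {h_stat:.2f}")
```

The more the groups' ranks separate, the larger H becomes, and the smaller the resulting p-value. When every group covers the same range of ranks, H falls to zero.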


I was able to build a price predictor to help me decide what a fair price for a machine would be based on its specifications, and to run another test to determine whether the brand matters. This shows the power of Python and its libraries: something that would have been difficult to do by hand is reduced to a few lines of code.
