# Task: 1. Take a close look at the lin_reg.py file. There are four empty functions:…

Task:

1. Take a close look at the lin_reg.py file. There are four empty functions: least_sq(file_name), mat_least_sq(file_name), predict(file_name, x), and plot_reg(file_name, using_matrix). Read through all of their descriptions carefully. Remember, you will lose points if you do not follow the instructions. We are using a grading script.

Summary of function tasks

least_sq(file_name): Given the csv file_name, find the slope and y-intercept of the data using algebraic least squares (the first linear regression presented). You need to return the slope and y-intercept IN THAT ORDER. Round the slope and y-intercept to four decimal places.

mat_least_sq(file_name): Given the csv file_name, find the slope and y-intercept of the data using linear algebraic least squares using matrices (the second linear regression presented). You need to return the slope and y-intercept IN THAT ORDER. Round the slope and y-intercept to four decimal places.

predict(file_name, x): Given the csv file_name and an input value x, predict what the output would be using the equation that is derived from mat_least_sq(). This means that you should be calling mat_least_sq() in this function. Round the predicted output to four decimal places before returning the value.

plot_reg(file_name, using_matrix): Given the csv file_name and an indicator of which linear regression method to use (using_matrix), output a graph of the data points and the line of best fit. plot_reg() should not return anything.
• If using_matrix=False, then you should be plotting your results from least_sq. You should be using red for everything in the graph, with X markers for the data points.
• If using_matrix=True, then you should be plotting your results from mat_least_sq. You can use any color but the default blue and red. You can use any data point marker except for the default dot and X.

Your graphs should also contain the following:
• Labeled x axis
• Labeled y axis
• Graph Title
• Legend (see example for details)

Some important notes:
• For consistency’s sake, do not round until the very end. Meaning you should not round anything until you return your answers.
• Hint: to plot the best fit line, find the smallest and largest x-coordinate. Plug these x-coordinates into the linear equation and plot them.
• If you want to create extra functions/methods to assist you, feel free to do so. However, we will only be testing the three functions that are originally in the file.
• If you use any library’s linear regression or least squares method function, you will get an automatic zero. You must implement this on your own!

2. Your job is to implement all four of these functions so that it passes all test cases. We provide one csv file for you to test on (data.csv), but we will be using other data sets and csv files to check if your work is correct.

3. By running the test case provided (data.csv), you should get the following results:

    Slope using algebraic least squares: 1.0022
    y-intercept using algebraic least squares: 0.0533

    Slope using linear algebra least squares: 1.0022
    y-intercept using linear algebra least squares: 0.0533

    Extrapolation: 100.2733
    Interpolation: 38.1369

[Figure: "Using Algebra Least Squares". Axes: x, y. Legend: y=1.0022x+0.0533 (line), data points (X markers).]

[Figure: "Using Matrix Least Squares". Legend: y=1.0022x+0.0533 (line), data points.]

Note: your “matrix using least squares” graph may have different colors and markers from mine.

[Figure: "Using Matrix Least Squares". Legend: y=1.0022x+0.0533 (line), data points. Drawn with the default blue color and dot markers.]

In NO CASE should your graphs have the dot marker or the blue color shown above!
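
The two fitting methods and the prediction step described in the task can be sketched as follows. This is a minimal illustration under stated assumptions, not the course's reference solution. It assumes that "algebraic least squares" means the closed-form summation formulas and that "matrix least squares" means solving the normal equations. The helper names algebraic_fit, matrix_fit and predict_from are hypothetical, and the sketch works on small in-memory lists instead of a csv file. np.linalg.inv is a general matrix inverse and not a least-squares routine, but whether it is acceptable under the no-library rule is for the course to decide.

```python
import numpy as np


def algebraic_fit(xs, ys):
    """Hypothetical helper: slope and intercept from the summation formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b  # left unrounded; the task rounds only when returning answers


def matrix_fit(xs, ys):
    """Hypothetical helper: solve (A^T A) beta = A^T y, with beta = [m, b]."""
    x = np.asarray(xs, dtype=float)
    y = np.asarray(ys, dtype=float)
    A = np.column_stack([x, np.ones_like(x)])  # first column x, second column 1
    beta = np.linalg.inv(A.T @ A) @ A.T @ y
    return float(beta[0]), float(beta[1])


def predict_from(m, b, x):
    """Hypothetical helper: evaluate y = m*x + b, rounded to 4 decimal places."""
    return round(m * x + b, 4)


# Toy data chosen for illustration only (not from data.csv).
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]

m1, b1 = algebraic_fit(xs, ys)
m2, b2 = matrix_fit(xs, ys)
print("algebraic:", round(m1, 4), round(b1, 4))
print("matrix:   ", round(m2, 4), round(b2, 4))
print("prediction at x=10:", predict_from(round(m2, 4), round(b2, 4), 10))
```

As a consistency check against the expected results quoted in the task, plugging the rounded slope 1.0022 and intercept 0.0533 into y = m*x + b gives 100.2733 at x = 100 and 38.1369 at x = 38, which matches the listed extrapolation and interpolation values.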

lin_reg.py

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # function name: least_sq
    # inputs: file_name- name of the csv file
    # output: m (slope), b (y-intercept) (IN THAT EXACT ORDER!!!)
    #         LITERALLY return m, b (both rounded 4 decimal places)
    #         YOU HAVE BEEN WARNED! YOU WILL GET IT WRONG IF YOU DO NOT RETURN THE CORRECT THINGS IN THE CORRECT ORDER!!!!
    # assumptions: The csv file will always have headers in the order of: x, y
    def least_sq(file_name):
        pass

    # function name: mat_least_sq
    # inputs: file_name- name of the csv file
    # output: m (slope), b (y-intercept) (IN THAT EXACT ORDER!!!)
    #         LITERALLY return m, b (both rounded 4 decimal places)
    #         YOU HAVE BEEN WARNED! YOU WILL GET IT WRONG IF YOU DO NOT RETURN THE CORRECT THINGS IN THE CORRECT ORDER!
    # assumptions: The csv file will always have headers in the order of: x, y
    def mat_least_sq(file_name):
        pass

    # function name: predict
    # inputs: file_name- name of the csv file
    #         x- input value that you will interpolate or extrapolate using mat_least_sq
    # output: the predicted value based on the linear regression equation found using mat_least_sq
    #         The output should be rounded to 4 decimal places
    # assumptions: The csv file will always have headers in the order of: x, y
    def predict(file_name, x):
        pass

    # function name: plot_reg
    # inputs: file_name- name of the csv file
    #         using_matrix: True if you are plotting the linear equation from mat_least_sq
    #                       False if you are plotting the linear equation from least_sq
    # output: nothing is returned
    # task: given file_name, compute the linear equation using least_sq or mat_least_sq and graph results
    #       your graph should have the following: labeled x and y axes, title, legend
    #       if using_matrix is False (using least_sq), use X's and red in your graph
    #       if using_matrix is True (using mat_least_sq), you can use any color except for the default blue and red
    #       you can use any marker except for the default dot and X
    # assumptions: The csv file will always have headers in the order of: x, y
    def plot_reg(file_name, using_matrix):
        pass

    ######## TEST CASES ########
    # this test case is the same as the one in
    csv_file = "data.csv"

    m1, b1 = least_sq(csv_file)
    print("Slope using algebraic least squares:", m1)
    print("y-intercept using algebraic least squares:", b1)
    print()

    m2, b2 = mat_least_sq(csv_file)
    print("Slope using linear algebra least squares:", m2)
    print("y-intercept using linear algebra least squares:", b2)
    print()

    y1 = predict(csv_file, 100)  # extrapolation
    print("Extrapolation:", y1)
    y2 = predict(csv_file, 38)  # interpolation
    print("Interpolation:", y2)

    plot_reg(csv_file, False)
    plot_reg(csv_file, True)
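
The plotting requirements in the plot_reg comments, together with the hint about using the smallest and largest x-coordinate, might be sketched as below. This is only an illustration under assumptions: draw_fit is a hypothetical helper, the toy data and the slope and intercept passed in are made up, and the green color with triangle markers is just one choice that avoids the default blue, red, the dot and the X. The helper returns the Axes object purely so the result can be inspected; plot_reg itself is required to return nothing.

```python
import matplotlib.pyplot as plt


def draw_fit(xs, ys, m, b, color, marker, title):
    """Hypothetical helper: scatter the data and draw y = m*x + b."""
    fig, ax = plt.subplots()
    ax.scatter(xs, ys, color=color, marker=marker, label="data points")

    # Best-fit line: evaluate the equation at the smallest and largest x only.
    x_lo, x_hi = min(xs), max(xs)
    ax.plot([x_lo, x_hi], [m * x_lo + b, m * x_hi + b],
            color=color, label=f"y={m}x+{b}")

    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_title(title)
    ax.legend()
    return ax


# Toy data and coefficients for illustration only (not from data.csv).
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]

# using_matrix=False style: red everywhere, X markers for the data points.
ax_alg = draw_fit(xs, ys, 1.94, 0.15, "red", "x",
                  "Using Algebra Least Squares")

# using_matrix=True style: any color and marker other than the excluded ones.
ax_mat = draw_fit(xs, ys, 1.94, 0.15, "green", "^",
                  "Using Matrix Least Squares")
```

Calling plt.show() after building the figures displays them. Note that the legend label format used here prints "+-" when the intercept is negative, so a real implementation may want to format the sign separately.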