Hello Stack Overflow community,
I came across an error while data mining in Python, which I will describe here.
I have two arrays with the following values:
print(ConnectionTimeHours)
print(kWh)
[24.0, 13.0, 12.0, 22.0, 21.0, 10.0, 9.0, 12.0, 7.0, 14.0, 18.0, 1.0, 18.0, 15.0, 13.0, 13.0, 12.0, 19.0, 13.0]
[10.0, 9.0, 22.0, 7.0, 4.0, 7.0, 56.0, 5.0, 24.0, 25.0, 11.0, 2.0, 9.0, 1.0, 9.0, 12.0, 9.0, 4.0, 2.0]
Next, I use the scipy library to fit and plot these values, with the goal of calculating r-squared:
from scipy.interpolate import *
p1 = polyfit(ConnectionTimeHours, kWh, 1)
from matplotlib.pyplot import *
%matplotlib inline
plot(ConnectionTimeHours ,kWh,'o')
show()
plot(ConnectionTimeHours ,kWhRounded,'o')
plot(ConnectionTimeHoursRounded,polyval(p1,ConnectionTimeHours),'r-')
p2 = polyfit(ConnectionTimeHours,kWhRounded,2)
When I run the code above, I get a nice plot and no errors. However, when I run the code below:
yfit = p1[0] * ConnectionTimeHours+ p1[1]
print(yfit)
print(kWh)
The kWh array is printed successfully, but the yfit array is empty and I get the following warning:
C:\Users\Michiel\Anaconda3\lib\site-packages\ipykernel\__main__.py:1: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  if __name__ == '__main__':
I can't figure out what the problem is. I have been stuck for hours, trying different data formats and rounded integers instead of doubles, but nothing seems to help. I followed this tutorial: https://www.youtube.com/watch?v=ro5ftxuD6is. In its example, the following data structure is used:
x = array([0,1,2,3,4,5])
y = array([0,0.8,0.9,0.1,-0.8,-1])
When I test my code with the tutorial data above, it works. So the difference in data structure might be the cause, if that is even the problem.
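To make the data-structure difference concrete, here is a minimal sketch. It assumes that ConnectionTimeHours and kWh are plain Python lists (they print with commas, as lists do) and that polyfit is NumPy's np.polyfit; both are assumptions, not something confirmed above. On a list, * means repetition rather than element-wise scaling, which would plausibly explain the empty yfit together with the "non-integer number instead of an integer" warning. Converting to NumPy arrays, as the tutorial does with array(...), makes the yfit line element-wise. The names x and y are introduced here only for illustration.

```python
import numpy as np

# The two lists from the question, copied verbatim.
ConnectionTimeHours = [24.0, 13.0, 12.0, 22.0, 21.0, 10.0, 9.0, 12.0, 7.0, 14.0,
                       18.0, 1.0, 18.0, 15.0, 13.0, 13.0, 12.0, 19.0, 13.0]
kWh = [10.0, 9.0, 22.0, 7.0, 4.0, 7.0, 56.0, 5.0, 24.0, 25.0,
       11.0, 2.0, 9.0, 1.0, 9.0, 12.0, 9.0, 4.0, 2.0]

# On a plain list, * repeats the list instead of scaling its elements.
print([1.0, 2.0] * 2)  # [1.0, 2.0, 1.0, 2.0]

# Assumption: converting to NumPy arrays is acceptable here.
# x and y are illustrative names, not from the original code.
x = np.array(ConnectionTimeHours)
y = np.array(kWh)

# Degree-1 least-squares fit: p1[0] is the slope, p1[1] the intercept.
p1 = np.polyfit(x, y, 1)

# With arrays, * and + operate element by element.
yfit = p1[0] * x + p1[1]
print(yfit)
```

With arrays, yfit has one fitted value per input point and matches what polyval(p1, x) returns.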
After this I try to calculate r-squared, but then I get this error:
Does anyone know how to solve this?
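For reference, this is a minimal sketch of the r-squared calculation I am aiming for. It assumes the usual definition R^2 = 1 - SS_res / SS_tot, that the data are held in NumPy arrays, and that np.polyfit and np.polyval are the functions doing the fit. The names x, y, ss_res, ss_tot and r_squared are illustrative and not taken from the tutorial.

```python
import numpy as np

# Data from the question, held as NumPy arrays (an assumption of this sketch).
x = np.array([24.0, 13.0, 12.0, 22.0, 21.0, 10.0, 9.0, 12.0, 7.0, 14.0,
              18.0, 1.0, 18.0, 15.0, 13.0, 13.0, 12.0, 19.0, 13.0])
y = np.array([10.0, 9.0, 22.0, 7.0, 4.0, 7.0, 56.0, 5.0, 24.0, 25.0,
              11.0, 2.0, 9.0, 1.0, 9.0, 12.0, 9.0, 4.0, 2.0])

# Linear least-squares fit and the fitted values at each x.
p1 = np.polyfit(x, y, 1)
yfit = np.polyval(p1, x)

# Residual sum of squares and total sum of squares.
ss_res = np.sum((y - yfit) ** 2)
ss_tot = np.sum((y - np.mean(y)) ** 2)

# Coefficient of determination.
r_squared = 1.0 - ss_res / ss_tot
print(r_squared)
```

For a straight-line fit with an intercept, this value should equal the squared correlation coefficient between x and y, which gives a quick sanity check.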
Thank you for reading all the way to here!