I'm trying to make a dot plot of two datasets. To simplify those two datasets, I'll use letters as gene names:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([['a',1],['b',3],['c',4],['d',5],['e',6],['f',3]])
y = np.array([['c',3],['e',2],['b',6],['a',5],['h',5],['f',2]])
# In reality, those two arrays would be imported from two CSV files with np.genfromtxt()...
xticks = x[0:5,0]
yticks = y[0:5,0]
x0 = np.array(range(1,6))
y0 = np.array(range(1,6))
plt.xticks(x0, xticks)
plt.yticks(y0, yticks)
# The dot plot should go here...
plt.show()
By dot plot I mean that I'm comparing two gene samples: the first column of each array is the gene name and the second is the value associated with that gene in that sample. In each array the genes appear in a fixed order and cannot be reordered.
So, what I'm trying to do is a plot where each coincidence ('b' with 'b' in both arrays, etc.) is shown as a dot. Moreover, I would like to compare the two numbers from each sample (for instance, (b1+b2)/abs(b1-b2) for each coincidence), so that coincidences whose numbers are more alike are drawn as darker spots (and those less alike as lighter ones, or something like that).
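To make the matching concrete, here is a minimal sketch of what I mean, assuming gene names are unique within each sample. The names `sample_x`, `sample_y`, `similarity` and `dots` are placeholders for illustration only, and treating identical values as infinitely similar is just one possible way to avoid dividing by zero.

```python
import math

# Toy samples in their fixed order: (gene name, value).
sample_x = [('a', 1), ('b', 3), ('c', 4), ('d', 5), ('e', 6), ('f', 3)]
sample_y = [('c', 3), ('e', 2), ('b', 6), ('a', 5), ('h', 5), ('f', 2)]


def similarity(v1, v2):
    """Example score (v1 + v2) / |v1 - v2|; larger means more alike."""
    if v1 == v2:
        # Assumption: identical values count as maximally similar.
        return math.inf
    return (v1 + v2) / abs(v1 - v2)


# Map each gene in sample_y to its row position and value.
y_lookup = {gene: (row, value) for row, (gene, value) in enumerate(sample_y)}

# One dot per gene present in both samples:
# (gene, column in sample_x, row in sample_y, similarity score).
dots = []
for col, (gene, x_value) in enumerate(sample_x):
    if gene in y_lookup:
        row, y_value = y_lookup[gene]
        dots.append((gene, col, row, similarity(x_value, y_value)))

for dot in dots:
    print(dot)
```

The columns, rows and scores collected in `dots` could then be handed to something like `plt.scatter(cols, rows, c=scores)` so that the colour of each dot reflects the score.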
In fact, I managed to do this by iterating over each element in both arrays and building an array that holds the dot plot (here is the original code, in case you are interested):
for fila in range(1, n):
    for columna in range(1, n):
        if tabla_final[fila,0] == tabla_final[0, columna]:
            y = np.log((float(tabla_A[fila,2])*float(tabla_B[fila,2]))/abs((float(tabla_A[fila,2])-float(tabla_B[fila,2]))))
            tabla_final[fila,columna] = y
        else:
            continue
The result I obtain (the dot plot) looks like this (the dot plot is exported to a CSV file):
This is a frame of the values for the comparison:
This would be the dot plot (greener values are better associations and redder values are worse):
This would be the case for two identical samples:
Last but not least, as I will be comparing multiple samples pairwise, I would like to obtain some sort of linear regression from this plot, with Pearson's r coefficient as a way to assess the similarity between the two samples.
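As a rough idea of the kind of summary I am after, here is a sketch that computes Pearson's r and a least-squares line over the genes shared by both toy samples. The dictionaries and the helper `pearson_and_fit` are hypothetical names made up for this example, and it assumes at least two shared genes whose values are not all identical.

```python
import math


def pearson_and_fit(xs, ys):
    """Return (r, slope, intercept) for paired values xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)       # Pearson correlation coefficient
    slope = sxy / sxx                    # least-squares slope of y on x
    intercept = mean_y - slope * mean_x  # least-squares intercept
    return r, slope, intercept


# Toy samples as gene -> value mappings (same values as the arrays above).
sample_x = {'a': 1, 'b': 3, 'c': 4, 'd': 5, 'e': 6, 'f': 3}
sample_y = {'c': 3, 'e': 2, 'b': 6, 'a': 5, 'h': 5, 'f': 2}

# Only genes present in both samples can be paired.
shared = [gene for gene in sample_x if gene in sample_y]
xs = [sample_x[gene] for gene in shared]
ys = [sample_y[gene] for gene in shared]

r, slope, intercept = pearson_and_fit(xs, ys)
print(f"r = {r:.3f}, fit: y = {slope:.3f} * x + {intercept:.3f}")
```

With NumPy, `np.corrcoef(xs, ys)[0, 1]` and `np.polyfit(xs, ys, 1)` should give the same coefficient and line; the hand-written version is only there to show what is being computed.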
Thank you for your advice.