Creating Confusion Matrix From Multiple .csv Files
I have a lot of .csv files with the following format:
    338,800
    338,550
    339,670
    340,600
    327,500
    301,430
    299,350
    284,339
    284,338
    283,335
    283,330
    283,310
    282,310
    282,300
    282,300
    283
Solution 1:
Add a function get_predict(filename):
    def get_predict(filename):
        if 'Alex' in filename:
            return 'Alexander'
        else:
            return filename[0]
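As a quick check of how this helper behaves, here is a small sketch. The file names are hypothetical and only serve as an illustration. Presumably the helper is meant to stand in for the plain filename[0] lookup, so that a file such as Alex_01.csv maps to its full label instead of colliding with "A".

```python
def get_predict(filename):
    # Files containing 'Alex' map to the full label; all others use
    # the first character of the file name as the predicted label.
    if 'Alex' in filename:
        return 'Alexander'
    return filename[0]

# Hypothetical file names, for illustration only
for name in ['A1.csv', 'B_run2.csv', 'Alex_01.csv']:
    print(name, '->', get_predict(name))
```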
Read n files and compute the confusion matrix using pandas crosstab:
    import csv
    import os
    import pandas as pd

    def get_category(filepath):
        # Map the averaged ratio to a category label
        def category(val):
            print('predict({}; abs({})'.format(val, abs(val)))
            if 0.8 < val <= 0.9:
                return "A"
            if abs(val - 0.7) < 1e-10:
                return "B"
            if 0.5 < val < 0.7:
                return "C"
            if abs(val - 0.5) < 1e-10:
                return "E"
            return "D"

        with open(filepath, "r") as csvfile:
            ff = csv.reader(csvfile)
            results = []
            previous_value = 0
            for col1, col2 in ff:
                value = int(col1)
                if value >= previous_value:
                    previous_value = value
                else:
                    # The value dropped: record the ratio to the previous value
                    results.append(value / previous_value)
                    previous_value = value
        return category(sum(results) / len(results))
    matrix = {'actual': [], 'predict': []}
    path = 'test/confusion'
    for filename in os.listdir(path):
        # The first Char in filename is Predict Key
        matrix['predict'].append(filename[0])
        matrix['actual'].append(get_category(os.path.join(path, filename)))

    df = pd.crosstab(pd.Series(matrix['actual'], name='Actual'),
                     pd.Series(matrix['predict'], name='Predicted'))
    print(df)
Output (reading "A.csv", "B.csv", "C.csv" with the given example data three times):
    Predicted  A  B  C
    Actual
    A          3  0  0
    B          0  3  0
    C          0  0  3
Tested with Python:3.4.2 - pandas:0.19.2
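To make the per-file step easier to follow, here is a minimal sketch of the ratio that get_category averages before mapping it to a label. It reads from an in-memory string instead of a file. The helper name mean_drop_ratio, the sample rows, and the None return for files without any decrease are assumptions added for illustration, not part of the original answer.

```python
import csv
import io

def mean_drop_ratio(csvfile):
    """Average value/previous ratio over the rows where column 1 decreases."""
    ratios = []
    previous = 0
    for col1, _col2 in csv.reader(csvfile):
        value = int(col1)
        if value < previous:
            ratios.append(value / previous)
        previous = value
    # None signals "no decrease found" instead of dividing by zero
    return sum(ratios) / len(ratios) if ratios else None

# Hypothetical data: 100 -> 90 (ratio 0.9), 90 -> 95 (rise, ignored),
# 95 -> 76 (ratio 0.8)
sample = io.StringIO("100,1\n90,2\n95,3\n76,4\n")
print(mean_drop_ratio(sample))  # about 0.85, which category() maps to "A"
```

Note that the original code divides by len(results) unconditionally, so a file whose first column never decreases would raise a ZeroDivisionError there.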
Solution 2:
Scikit-Learn is the best option in your case, as it provides a confusion_matrix function. Here is an approach you can easily extend.
    from sklearn.metrics import confusion_matrix

    # Read your csv files
    with open('A1.csv', 'r') as readFile:
        true_values = [int(ff) for ff in readFile]
    with open('B1.csv', 'r') as readFile:
        predictions = [int(ff) for ff in readFile]

    # Produce the confusion matrix
    confusionMatrix = confusion_matrix(true_values, predictions)
    print(confusionMatrix)
This is the output you would expect.
    [[0 2]
     [0 2]]
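To read that matrix: scikit-learn's confusion_matrix puts the true labels on the rows and the predicted labels on the columns, and it accepts a labels argument to fix their order. The dependency-free sketch below performs the same tally. The function name tally_confusion and the input lists are hypothetical; the inputs were chosen only to reproduce the matrix above, since the contents of A1.csv and B1.csv are not shown.

```python
def tally_confusion(true_values, predictions, labels):
    """Rows follow the true label, columns the predicted label."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(true_values, predictions):
        matrix[index[actual]][index[predicted]] += 1
    return matrix

# Hypothetical inputs: two true 0s and two true 1s, all predicted as 1
true_values = [0, 0, 1, 1]
predictions = [1, 1, 1, 1]
for row in tally_confusion(true_values, predictions, labels=[0, 1]):
    print(row)
```

Here the first row says that both true 0s were predicted as 1, and the second row says that both true 1s were predicted correctly.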