Monday, March 29, 2021

Question about cross-validation and what's next

Maybe it's a very simple question, but I'm just starting out. I have a list of feature vectors on which I want to run cross-validation, but I don't quite understand how to do it. This is my code:

# scikit-learn k-fold cross-validation
from numpy import array
from sklearn.model_selection import KFold
import texture_wavelets as text_wav
import os
import cv2

# data sample
directory_images = 'D:/images'

results = []

for image_name in os.listdir(directory_images):
    image = cv2.imread(directory_images + "/" + image_name)
    mask = text_wav.TextureWavelets().create_mask_plaque(image, 'b&w')
    results.append(text_wav.TextureWavelets().waveletdescr(mask, maxlevel=2))

data = results

# prepare cross validation
kfold = KFold(n_splits=3, shuffle=True, random_state=1)
# enumerate splits
for train, test in kfold.split(data):
    print('train: %s, test: %s' % (data[train], data[test]))

This is the data list:

[array([    0.        ,  2044.61238098,     0.        ,     0.        ,
        2618.09565353,     0.        , 39819.78557968]), array([    0.        ,  4071.92074585,     0.        ,     0.        ,
        2776.18331909,     0.        , 43219.63778687]), array([    0.        ,  3076.86044312,     0.        ,     0.        ,
        2464.76063919,     0.        , 44498.27956009]), array([   0.        , 5871.45904541,    0.        ,    0.        ,
       1783.31578445,    0.        , 5319.52641678]), array([   0.        , 4213.01197815,    0.        ,    0.        ,
       3044.87182617,    0.        , 5253.39610291]), array([   0.        , 4855.08622742,    0.        ,    0.        ,
       1976.97391891,    0.        , 6974.81827927]), array([    0.        ,  4719.39257812,     0.        ,     0.        ,
        3474.63452911,     0.        , 38802.29157257]), array([    0.        ,  5773.23097229,     0.        ,     0.        ,
        4237.98572159,     0.        , 17283.86447525]), array([    0.        ,  2585.32319641,     0.        ,     0.        ,
        2866.66228867,     0.        , 18270.41167831]), array([    0.        ,  2533.72865295,     0.        ,     0.        ,
        3004.23120117,     0.        , 43447.09034729])]

I get the following error:

print('train: %s, test: %s' % (data[train], data[test]))
TypeError: only integer scalar arrays can be converted to a scalar index
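
A likely cause of this error is that `data` is a plain Python list, while `kfold.split` yields NumPy arrays of row positions; a list cannot be indexed with such an array. The sketch below reproduces the situation with made-up vectors (the random `results` list is only a stand-in for the wavelet descriptors) and shows that stacking the list into a 2-D array makes the indexing work.

```python
import numpy as np
from sklearn.model_selection import KFold

# Illustrative stand-in for `results`: a plain list of 1-D feature vectors.
rng = np.random.default_rng(0)
results = [rng.random(7) for _ in range(10)]

kfold = KFold(n_splits=3, shuffle=True, random_state=1)
train_idx, test_idx = next(kfold.split(results))

# Indexing a list with an array of positions raises TypeError.
try:
    results[train_idx]
    list_indexing_failed = False
except TypeError:
    list_indexing_failed = True

# Stacking into an array of shape (n_samples, n_features) allows
# integer-array indexing, so each fold can be selected directly.
data = np.array(results)
train_rows = data[train_idx]
test_rows = data[test_idx]
print(train_rows.shape, test_rows.shape)
```

In the original script, the equivalent change would be `data = array(results)` placed after the image loop.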

My question is, can I apply this code to my data? And how can I use the information afterwards? What kind of information will I get? Thank you very much!
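
As for what comes next: `KFold` itself only produces train and test index arrays, so the useful information comes from fitting a model on each training part and scoring it on the held-out part. The sketch below assumes every image has a class label, which the question does not show; the label array `y`, the random features `X`, and the choice of `KNeighborsClassifier` are all hypothetical and only illustrate the pattern.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: 10 feature vectors of length 7 and one label per image.
rng = np.random.default_rng(0)
X = rng.random((10, 7))
y = np.array([0, 1] * 5)  # assumed labels, not part of the original post

kfold = KFold(n_splits=3, shuffle=True, random_state=1)
model = KNeighborsClassifier(n_neighbors=3)  # any scikit-learn estimator would do

# Fits the model on each training fold and returns one accuracy per fold.
scores = cross_val_score(model, X, y, cv=kfold)
print("accuracy per fold:", scores)
print("mean accuracy:", scores.mean())
```

The per-fold scores and their mean give an estimate of how well the chosen model generalizes to unseen images, which is usually what cross-validation is used for.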
