Monday, April 27, 2020

KNN: "no missing values are allowed" -> I do not have missing values

I am working on a group project for a class. One of my group members ran the normalization and created the test/train sets so that we all work from the same data (each of us is using a different algorithm). I was assigned the KNN algorithm.

We had multiple columns with NAs, so those columns were dropped (<- NULL). When I try to run KNN, I keep getting this error:

Error in knn(train = trainsetne, test = testsetne, cl = ne_train_target,  :
  no missing values are allowed

I ran which(is.na(dataset$col)) and found:

which(is.na(testsetne$median_days_on_market))
# [1] 8038 8097 8098 8100 8293 8304

When I look through the dataset, those cells do not appear to be missing.

I would appreciate help with either finding and fixing whatever triggers the "no missing values are allowed" error, or with a workaround (if there is one).
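One thing that may be worth checking here: which() returns row positions within testsetne, not the row names carried over from ne, so looking up 8038 in the full dataset can land on a different row. The sketch below uses a made-up data frame (`toy`, `train_idx`, and its values are hypothetical stand-ins, not the real data) to show how the offending rows could be inspected in the subset itself.

```r
# Toy data standing in for `ne`; names and values are made up for illustration.
toy <- data.frame(
  id = 1:10,
  median_days_on_market = c(30, NA, 45, 50, NA, 60, 70, 80, NA, 90)
)

train_idx <- c(1, 3, 4)         # hypothetical training rows
toy_test  <- toy[-train_idx, ]  # same subsetting pattern as testsetne

# Positions of the NAs *within the subset*
na_pos <- which(is.na(toy_test$median_days_on_market))

# Row names inherited from the full data frame; these differ from na_pos
na_rows <- rownames(toy_test)[na_pos]

# Look at the offending rows in the subset, not in the full data
print(toy_test[na_pos, ])

# Count NAs per column to see every affected column at once
print(colSums(is.na(toy_test)))
```

In this toy case the positions (1, 2, 6) and the original row names ("2", "5", "9") disagree, which is the kind of mismatch that could make the cells look populated when checked against the full dataset.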

I am sorry if I am missing something simple. Any help is appreciated.

I have listed the code that we have below:

ne$pending_ratio_yy <- ne$total_listing_count_yy <- ne$average_listing_price_yy <- ne$median_square_feet_yy <- ne$median_listing_price_per_square_feet_yy <- ne$pending_listing_count_yy <- ne$price_reduced_count_yy <- ne$median_days_on_market_yy <- ne$new_listing_count_yy <- ne$price_increased_count_yy <- ne$active_listing_count_yy <- ne$median_listing_price_yy <- ne$flag <- NULL

ne$pending_ratio_mm <- ne$total_listing_count_mm <- ne$average_listing_price_mm <- ne$median_square_feet_mm <- ne$median_listing_price_per_square_feet_mm <- ne$pending_listing_count_mm  <- ne$price_reduced_count_mm <- ne$price_increased_count_mm <- ne$new_listing_count_mm <- ne$median_days_on_market_mm <- ne$active_listing_count_mm <- ne$median_listing_price_mm <- NULL

ne$factor_month_date <- as.factor(ne$month_date_yyyymm)
ne$factor_median_days_on_market <- as.factor(ne$median_days_on_market)

train20ne <- sample(1:20893, 4179)

trainsetne <- ne[train20ne, 1:10]
testsetne <- ne[-train20ne, 1:10]

# This is where my part starts

ne_train_target <- ne[train20ne, 3]
ne_test_target <- ne[-train20ne, 3]

library(class)  # provides knn()
predict_1 <- knn(train = trainsetne, test = testsetne, cl = ne_train_target, k = 145)
# Error in knn(train = trainsetne, test = testsetne, cl = ne_train_target,  : 
#  no missing values are allowed
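As a possible workaround, the rows containing NA could be dropped before the split, so that the train set, the test set, and the class vector all stay aligned. This is only a sketch on made-up data: `toy`, `x1`, `x2`, and `label` are hypothetical names, and it assumes the predictors passed to knn() are numeric and do not include the target column (whether that holds for columns 1:10 of ne is not shown above).

```r
library(class)  # provides knn()

set.seed(1)

# Made-up numeric data; all names here are hypothetical.
toy <- data.frame(
  x1 = rnorm(60),
  x2 = rnorm(60),
  label = factor(rep(c("a", "b"), 30))
)
toy$x1[c(5, 17)] <- NA  # inject two missing values

# Keep only rows with no NA in any column
toy_complete <- toy[complete.cases(toy), ]

# Split after cleaning so the indices refer to the cleaned data
train_idx  <- sample(nrow(toy_complete), 40)
predictors <- c("x1", "x2")  # numeric predictors only, target excluded

train_x <- toy_complete[train_idx, predictors]
test_x  <- toy_complete[-train_idx, predictors]
train_y <- toy_complete$label[train_idx]  # a factor, not a data frame

pred <- knn(train = train_x, test = test_x, cl = train_y, k = 5)
```

Dropping rows loses data, so imputing the missing values instead may be preferable if many rows are affected; that choice depends on the dataset.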
