How to resolve ValueError: Input contains NaN, infinity or a value too large for dtype('float64')


By : Vsweb
Date : October 17 2020, 03:08 PM
There are probably NaN values in the input data. Before scaling, set every NaN to the average of its column, or to zero.
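The column-average idea could look roughly like this. It is only a sketch with NumPy: the helper name fill_nan_with_column_mean and the sample array are made up for illustration, and it assumes every column has at least one non-NaN value (otherwise the column mean is itself NaN).

```python
import numpy as np

def fill_nan_with_column_mean(X):
    """Return a float copy of X with each NaN replaced by its column's mean."""
    X = np.array(X, dtype=float)          # copy, so the caller's data is untouched
    col_means = np.nanmean(X, axis=0)     # per-column mean, ignoring NaNs
    rows, cols = np.where(np.isnan(X))    # positions of the missing values
    X[rows, cols] = col_means[cols]       # fill each gap with its column's mean
    return X

# Hypothetical sample data with one gap per column
data = np.array([[1.0, 10.0],
                 [np.nan, 20.0],
                 [3.0, np.nan]])

filled = fill_nan_with_column_mean(data)
print(filled)
```

The filled array can then be passed to the scaler. Replacing with zero instead is just np.nan_to_num(data, nan=0.0), but zero is only a sensible stand-in when it is a plausible value for the column.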

StandardScaler -ValueError: Input contains NaN, infinity or a value too large for dtype('float64')


By : Mustapha EL FEDDI
Date : March 29 2020, 07:55 AM
NumPy provides element-wise logical tests for this sort of thing.
In your particular case, you will want to use np.isinf and np.isnan.
code :
import numpy as np

test = np.array([0.1, 0.3, float("Inf"), 0.2])

# Indices of the infinite entries; np.isnan works the same way for NaNs
bad_indices = np.where(np.isinf(test))

print(bad_indices)
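If you want to catch NaN and both infinities in one pass, np.isfinite covers all of them at once. A small sketch, with a made-up test array:

```python
import numpy as np

# Hypothetical array containing NaN, +inf and -inf
test = np.array([0.1, np.nan, np.inf, 0.2, -np.inf])

bad_mask = ~np.isfinite(test)           # True wherever the value is NaN or +/-inf
bad_indices = np.flatnonzero(bad_mask)  # positions of the offending values
clean = test[~bad_mask]                 # only the finite values

print(bad_indices)
print(clean)
```

Whether you drop the bad entries (as here) or replace them depends on the data; dropping is only safe when the matching target rows are dropped too.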

ValueError: Input contains NaN, infinity or a value too large for dtype('float64')


By : Dreamless Bithy
Date : March 29 2020, 07:55 AM
You can use fillna, either one column at a time, on several columns at once, or with a dict of per-column fill values:
code :
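import numpy as np
import pandas as pd

# Sample data with a missing value in every column
df = pd.DataFrame({'Gender':['m','f',np.nan],
                   'Married':[np.nan,'yes','no'],
                   'credit history':[1.,np.nan,0]})
print (df)
#   Gender Married  credit history
# 0      m     NaN             1.0
# 1      f     yes             NaN
# 2    NaN      no             0.0

# Option 1: fill one column at a time (assign the result back
# instead of using inplace=True on a selected column)
df['Gender'] = df['Gender'].fillna('no data')
df['Married'] = df['Married'].fillna('no data')

# Option 2: fill several columns at once
cols = ['Gender','Married']
df[cols] = df[cols].fillna('no data')

# Option 3: pass a dict to use a different fill value per column
d = {'Gender':'no data', 'Married':'no data', 'credit history':0}
df = df.fillna(d)
print (df)
#     Gender  Married  credit history
# 0        m  no data             1.0
# 1        f      yes             0.0
# 2  no data       no             0.0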
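When there are many columns, listing each one by hand gets tedious. One possible shortcut, sketched below, is to pick the fill value by dtype with select_dtypes: column mean for numeric columns, a placeholder string for everything else. The mean and the 'no data' placeholder are assumptions here; choose whatever makes sense for your data.

```python
import numpy as np
import pandas as pd

# Same sample frame as above
df = pd.DataFrame({'Gender': ['m', 'f', np.nan],
                   'Married': [np.nan, 'yes', 'no'],
                   'credit history': [1., np.nan, 0]})

# Split the columns by dtype
num_cols = df.select_dtypes(include='number').columns
other_cols = df.select_dtypes(exclude='number').columns

# Numeric columns: fill with the column mean (NaNs are ignored by mean())
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())
# Remaining columns: fill with a placeholder label
df[other_cols] = df[other_cols].fillna('no data')

# Confirm nothing is missing any more
print(df.isna().sum())
```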

ValueError: Input contains NaN, infinity or a value too large for dtype('float64') using fit from KNeighborsRegressor


By : Uli R
Date : March 29 2020, 07:55 AM
The problem seems to come from the permutation step. Commenting out these two lines avoids the error:
code :
# np.random.seed(1)
# df = df.loc[np.random.permutation(len(df))]

# Reshape each Series into a 2-D column before passing it to the model
series.values.reshape(-1, 1)

    #print(train_columns, k_value)
    # Randomly resorts the DataFrame to mitigate sampling bias
    #np.random.seed(1)
    #df = df.loc[np.random.permutation(len(df))]

    # Split the DataFrame into ~75% train / 25% test data sets
    split_integer = round(len(df) * 0.75)
    train_df = df.iloc[0:split_integer]
    test_df = df.iloc[split_integer:]

    train_features = train_df[train_columns].values.reshape(-1, 1)
    train_target = train_df[predict_feature].values.reshape(-1, 1)

    # Trains the model
    knn = KNeighborsRegressor(n_neighbors=k_value)
    knn.fit(train_features, train_target)

    # Test the model and return the mean squared error
    predictions = knn.predict(test_df[train_columns].values.reshape(-1, 1))
    print("predictions")
    mse = mean_squared_error(y_true=test_df[predict_feature], y_pred=predictions)
    return mse
predictions
{'normalized_losses': [100210405.34, 116919980.22444445, 88928383.280000001, 62378305.931836732, 65695537.133086421], 'wheel_base': [10942945.5, 31106845.595555563, 34758670.590399988, 29302177.901632652, 25464306.165925924], 'length': [71007156.219999999, 37635782.111111119, 33676038.287999995, 29868192.295918364, 22553474.111604933], 'width': [42519394.439999998, 25956086.771111108, 15199079.0744, 10443175.389795918, 8440465.6864197534], 'height': [117942530.56, 62910880.079999998, 41771068.588, 33511475.561224483, 31537852.588641971], 'curb_weight': [14514970.42, 6103365.4644444454, 6223489.0728000011, 7282828.3632653067, 6884187.4446913591], 'bore': [57147986.359999999, 88529631.346666679, 68063251.098399997, 58753168.154285707, 42950965.435555562], 'stroke': [145522819.16, 98024560.913333327, 61229681.429599993, 36452809.841224492, 25989788.846172832], 'compression_ratio': [93309449.939999998, 18108906.400000002, 30175663.952, 44964197.869387761, 39926111.747407407], 'horsepower': [25158775.920000002, 17656603.506666664, 13804482.193600001, 15772395.163265305, 14689078.471851852], 'peak_rpm': [169310760.66, 86360741.248888895, 51905953.367999993, 46999120.435102046, 45218343.222716056], 'city_mpg': [15467849.460000001, 12237327.542222224, 10855581.140000001, 11479257.790612245, 11047557.746419754], 'highway_mpg': [17384289.579999998, 15877936.197777782, 7720502.6856000004, 6315372.4963265313, 7118970.4081481481]}
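If you still want the shuffle, here is a guess at why it misbehaves and a way around it. .loc selects by index label, so when the index has gaps (for example after rows were dropped), the labels produced by np.random.permutation(len(df)) do not all exist. Older pandas versions returned all-NaN rows for such labels and newer ones raise a KeyError. Selecting by position with .iloc, or using df.sample, sidesteps that. The small frame below is made up, and this is a sketch rather than a confirmed diagnosis of the code in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index has gaps, as after dropping rows
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]}, index=[0, 2, 5, 7])

# Shuffle by position: every position 0..len(df)-1 exists, whatever the labels are
rng = np.random.RandomState(1)
shuffled = df.iloc[rng.permutation(len(df))]

# Alternative: let pandas shuffle, then rebuild a clean 0..n-1 index
shuffled_alt = df.sample(frac=1, random_state=1).reset_index(drop=True)

print(shuffled)
print(shuffled_alt)
```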

KNN ValueError: Input contains NaN, infinity or a value too large for dtype('float64')


By : matt046
Date : March 29 2020, 07:55 AM
Have you checked for NaNs (not a number) in dataset2, e.g. with dataset2.isnull().values.any()?
Another possible cause of your error: the samples you predict on must be treated the same way as your training data:
code :
knn.predict(dataset2.loc[:, 1:12].values)
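To see where the NaNs are, and not only whether there are any, something along these lines may help. The dataset2 frame here is a made-up stand-in for the real one.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dataset2
dataset2 = pd.DataFrame({"a": [1.0, np.nan, 3.0],
                         "b": [4.0, 5.0, 6.0]})

has_nan = dataset2.isnull().values.any()                  # True if any cell is NaN
nan_per_column = dataset2.isnull().sum()                  # NaN count for each column
rows_with_nan = dataset2[dataset2.isnull().any(axis=1)]   # the affected rows

print(has_nan)
print(nan_per_column)
print(rows_with_nan)
```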

SVM ValueError: Input contains NaN, infinity or a value too large for dtype('float64')


By : user132411
Date : March 29 2020, 07:55 AM
Check whether your data has nulls using your_data.isnull().any(). If it does, use your_data = your_data.dropna().
Check if your data contains inf using np.isfinite(your_data). If there are inf values, you can use your_data.replace([np.inf, -np.inf], np.nan) and then your_data = your_data.dropna() to delete them.
code :
import numpy as np
from sklearn.model_selection import train_test_split
# Add these lines
X = X.replace([np.inf, -np.inf], np.nan)
y = y.replace([np.inf, -np.inf], np.nan)
X = X.dropna()
y = y.dropna()

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform(X_train)

from sklearn.feature_extraction.text import TfidfTransformer
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)

from sklearn.feature_extraction.text import TfidfVectorizer
vectorizer = TfidfVectorizer()
X_train_tfidf = vectorizer.fit_transform(X_train)

from sklearn.svm import LinearSVC
clf = LinearSVC()
clf.fit(X_train_tfidf,y_train)

if request.method == 'POST':
    message = request.form['message']
    data = [message]
    vect = vectorizer.transform(data).toarray()
    my_prediction = clf.predict(vect)

return render_template('result.html',prediction = my_prediction)
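One caveat with calling dropna() on X and y separately: if the bad values sit in different rows, X and y end up with different lengths and train_test_split will complain about inconsistent numbers of samples. A possible way to keep them aligned is to build a single mask and apply it to both. This sketch assumes X and y are pandas Series sharing the same index; the sample values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical text features and numeric labels sharing the same index
X = pd.Series(["free prize", None, "see you at noon", "call now"])
y = pd.Series([1.0, 0.0, np.inf, 1.0])

# Turn infinities into NaN so one test covers both kinds of bad value
y = y.replace([np.inf, -np.inf], np.nan)

# Keep only the rows where both the feature and the label are present
keep = X.notna() & y.notna()
X_clean, y_clean = X[keep], y[keep]

print(X_clean)
print(y_clean)
```

X_clean and y_clean can then go into train_test_split in place of X and y.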