
`ngrams` must be None, an integer, or a tuple of integers. Got %s

Package: keras
GitHub stars: 52268
Exception class: ValueError

Raise code

  allow_none=True)

    # 'ngrams' must be one of (None, int, tuple(int))
    if not (ngrams is None or
            isinstance(ngrams, int) or
            isinstance(ngrams, tuple) and
            all(isinstance(item, int) for item in ngrams)):
      raise ValueError(("`ngrams` must be None, an integer, or a tuple of "
                        "integers. Got %s") % (ngrams,))

    # 'output_sequence_length' must be one of (None, int) and is only
    # set if output_mode is INT.
    if (output_mode == INT and not (isinstance(output_sequence_length, int) or
                                    (output_sequence_length is None))):
      raise ValueError("`output_sequence_length` must be either None or an "
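
The condition in the raise code above can be tried out without installing TensorFlow. The following is a minimal sketch that copies the quoted check into a standalone function; the name ngrams_is_valid and the list of sample values are assumptions made for illustration and are not part of Keras.

```python
def ngrams_is_valid(ngrams):
    """Hypothetical helper mirroring the `ngrams` check quoted above."""
    # `and` binds tighter than `or`, so the expression reads as:
    #   None  OR  int  OR  (tuple AND every item is an int)
    return (ngrams is None or
            isinstance(ngrams, int) or
            isinstance(ngrams, tuple) and
            all(isinstance(item, int) for item in ngrams))


# Sample values chosen for illustration only.
candidates = [None, 3, (1, 2), 3.0, [1, 2], (1, 2.0), "3"]
for value in candidates:
    verdict = "accepted" if ngrams_is_valid(value) else "raises ValueError"
    print(repr(value), "->", verdict)
```

With the check as quoted here, a list of integers such as [1, 2] is rejected as well, because only a tuple is tested for; other Keras versions may behave differently.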
      

Ways to fix


The error is raised because a float (3.) was passed to the ngrams parameter of the TextVectorization class. The parameter accepts only None, an integer, or a tuple of integers, so pass an integer such as 3 instead.

Reproducing the error:

pipenv install tensorflow

from tensorflow.keras.layers.experimental.preprocessing import TextVectorization
vocab_data = ["earth", "wind", "and", "fire"]
max_len = 4
max_features = 5000
vectorize_layer = TextVectorization(max_tokens=max_features,
                                    output_mode='int',
                                    output_sequence_length=max_len,
                                    vocabulary=vocab_data,
                                    ngrams=3.)
print(vectorize_layer.get_vocabulary())

The error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-6e4a94518f79> in <module>()
      7                                     output_sequence_length=max_len,
      8                                     vocabulary=vocab_data,
----> 9                                     ngrams=3.)
     10 print(vectorize_layer.get_vocabulary())

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/layers/preprocessing/text_vectorization.py in __init__(self, max_tokens, standardize, split, ngrams, output_mode, output_sequence_length, pad_to_max_tokens, vocabulary, **kwargs)
    291             all(isinstance(item, int) for item in ngrams)):
    292       raise ValueError(("`ngrams` must be None, an integer, or a tuple of "
--> 293                         "integers. Got %s") % (ngrams,))
    294 
    295     # 'output_sequence_length' must be one of (None, int) and is only

ValueError: `ngrams` must be None, an integer, or a tuple of integers. Got 3.0


Fixed version of the code:

from tensorflow.keras.layers.experimental.preprocessing import TextVectorization
vocab_data = ["earth", "wind", "and", "fire"]
max_len = 4
max_features = 5000
vectorize_layer = TextVectorization(max_tokens=max_features,
                                    output_mode='int',
                                    output_sequence_length=max_len,
                                    vocabulary=vocab_data,
                                    ngrams=3)
print(vectorize_layer.get_vocabulary())

Output:

['', '[UNK]', 'earth', 'wind', 'and', 'fire']
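
If the value for ngrams comes from a config file or a computation and may arrive as a float or a list, it can be converted before the layer is built. The sketch below is one possible approach, not something Keras provides: normalize_ngrams and _to_int are hypothetical helper names, and the rule of accepting only floats with an integral value is an assumption.

```python
def _to_int(item):
    """Hypothetical helper: turn one n-gram size into an int, or fail loudly."""
    if isinstance(item, int):
        return item
    # Accept floats such as 3.0, but refuse values like 2.5.
    if isinstance(item, float) and item.is_integer():
        return int(item)
    raise ValueError("cannot use %r as an n-gram size" % (item,))


def normalize_ngrams(value):
    """Hypothetical helper: coerce `value` to None, an int, or a tuple of ints."""
    if value is None:
        return None
    if isinstance(value, (list, tuple)):
        return tuple(_to_int(item) for item in value)
    return _to_int(value)


print(normalize_ngrams(3.))        # float with an integral value
print(normalize_ngrams([1, 2.0]))  # list converted to a tuple of ints
```

The result can then be passed as ngrams=normalize_ngrams(3.) in the TextVectorization call above.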

Answered Jul 15, 2021 by kellemnegasi
