
Names should be an ordered collection.

Package:
pandas
github stars 30911
Exception Class:
ValueError

Raise code

""" 
    """
    if names is not None:
        if len(names) != len(set(names)):
            raise ValueError("Duplicate names are not allowed.")
        if not (
            is_list_like(names, allow_sets=False) or isinstance(names, abc.KeysView)
        ):
            raise ValueError("Names should be an ordered collection.")


def _read(filepath_or_buffer: FilePathOrBuffer, kwds):
    """Generic reader of line files."""
    if kwds.get("date_parser", None) is not None:
        if isinstance(kwds["parse_dates"], bool):
            kwds["parse_dates"] = True
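
To see which kinds of input the check in the excerpt accepts, here is a minimal sketch that repeats the same condition using the public pandas.api.types.is_list_like function. The helper name passes_names_check is made up for this illustration and is not part of pandas.

```python
from collections import abc

from pandas.api.types import is_list_like


def passes_names_check(names):
    """Illustrative helper (not a pandas API): same condition as the excerpt."""
    return is_list_like(names, allow_sets=False) or isinstance(names, abc.KeysView)


candidates = {
    "list": ["a", "b"],
    "tuple": ("a", "b"),
    "dict keys": {"a": 1, "b": 2}.keys(),
    "string": "ab",
    "set": {"a", "b"},
}

# Evaluate the condition for each candidate and report the outcome.
results = {label: passes_names_check(value) for label, value in candidates.items()}
for label, ok in results.items():
    print(f"{label:10s} -> {'accepted' if ok else 'rejected'}")
```

Lists, tuples and dict key views pass, while a plain string and a set do not. A dict key view counts as a set for is_list_like, which appears to be why the excerpt checks abc.KeysView separately.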

Ways to fix


pandas.read_csv reads a comma-separated values (CSV) file into a DataFrame.

Usage:

df = pd.read_csv('data.csv')  

If the saved CSV file has no header row, or if the existing column names need to be overridden, the names parameter supplies the new column names. It must be an ordered, list-like collection of names, such as a list or a tuple. A plain string or a set is rejected.
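
As a small illustration of the first case (a file with no header row), the sketch below reads CSV text from an in-memory buffer instead of a file on disk. The sample rows and the column names are only examples.

```python
import io

import pandas as pd

# Two data rows and no header row.
csv_text = "Foreign Cinema,50,289.0\nLiho Liho,45,224.0\n"

# header=None says the file has no header row; names supplies the labels.
df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["name", "num_customers", "AvgBill"],
)
print(df.columns.tolist())
print(len(df))
```

Both rows are kept as data, and the columns take the labels given in names.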


How to reproduce the error:

pipenv install pandas

import tempfile
import pandas as pd
df = pd.DataFrame([('Foreign Cinema', 50, 289.0),
                   ('Liho Liho', 45, 224.0),
                   ('500 Club', 102, 80.5),
                   ('The Square', 65, 25.30)])
print("\t\tOriginal data")
print("\t\t-------------\n")
print(df)
with tempfile.TemporaryFile() as fp:
  df.to_csv('data.csv', index=False)
  print("\n\n\n")
  print("\t\tLoaded data")
  print("\t\t-------------\n")
  new_df = pd.read_csv('data.csv', header=0, names="columns")  # a plain string is not list-like, so this raises ValueError
  print(new_df)

Error output:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-21-c40342f9dddb> in <module>()
      9   df.to_csv('data.csv', index=False)
     10   print("\n\n\n")
---> 11   new_df = pd.read_csv('data.csv',header=0,names = "columns")
     12   print(new_df)

/usr/local/lib/python3.7/dist-packages/pandas/io/parsers.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
    686     )
    687 
--> 688     return _read(filepath_or_buffer, kwds)
    689 
    690 

/usr/local/lib/python3.7/dist-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    449 
    450     # Check for duplicates in names.
--> 451     _validate_names(kwds.get("names", None))
    452 
    453     # Create the parser.

/usr/local/lib/python3.7/dist-packages/pandas/io/parsers.py in _validate_names(names)
    417             is_list_like(names, allow_sets=False) or isinstance(names, abc.KeysView)
    418         ):
--> 419             raise ValueError("Names should be an ordered collection.")
    420 
    421 

ValueError: Names should be an ordered collection.
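
If the names value comes from user input or a config file, one possible guard is to wrap a bare string in a list before calling read_csv. The sketch below assumes a single-column file, and the helper as_name_list is hypothetical, not something pandas provides.

```python
import io

import pandas as pd


def as_name_list(names):
    """Hypothetical helper: wrap a bare string so it becomes a one-element list."""
    if isinstance(names, str):
        return [names]
    return list(names)


csv_text = "value\n1\n2\n"

# Passing the string directly reproduces the error.
try:
    pd.read_csv(io.StringIO(csv_text), header=0, names="columns")
    message = None
except ValueError as exc:
    message = str(exc)
print(message)

# Wrapping the string first gives a single column called "columns".
df = pd.read_csv(io.StringIO(csv_text), header=0, names=as_name_list("columns"))
print(df.columns.tolist())
```

Whether wrapping is the right behaviour depends on what the caller meant. Raising a clearer error of your own may be the better choice when a string almost certainly signals a mistake.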

Fixed version of the code:

Pass names a list (or another ordered, list-like collection) containing one name for each column in the file.

import tempfile
import pandas as pd
df = pd.DataFrame([('Foreign Cinema', 50, 289.0),
                   ('Liho Liho', 45, 224.0),
                   ('500 Club', 102, 80.5),
                   ('The Square', 65, 25.30)])
print("\t\tOriginal data")
print("\t\t-------------\n")
print(df)
with tempfile.TemporaryFile() as fp:
  df.to_csv('data.csv', index=False)
  print("\n\n\n")
  print("\t\tLoaded data")
  print("\t\t-------------\n")
  new_df = pd.read_csv('data.csv',header=0,names = ['name', 'num_customers', 'AvgBill']) 
  print(new_df)

Correct output:

		Original data
		-------------

                0    1      2
0  Foreign Cinema   50  289.0
1       Liho Liho   45  224.0
2        500 Club  102   80.5
3      The Square   65   25.3




		Loaded data
		-------------

             name  num_customers  AvgBill
0  Foreign Cinema             50    289.0
1       Liho Liho             45    224.0
2        500 Club            102     80.5
3      The Square             65     25.3
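
The fixed code also passes header=0, and that matters because the saved file already has a header row (0,1,2). The sketch below, which uses an in-memory buffer and a shortened copy of the data, compares reading with and without header=0.

```python
import io

import pandas as pd

# The first line is the header row that to_csv wrote for the unnamed columns.
csv_text = "0,1,2\nForeign Cinema,50,289.0\nLiho Liho,45,224.0\n"
names = ["name", "num_customers", "AvgBill"]

# header=0: the existing header row is discarded and replaced by names.
replaced = pd.read_csv(io.StringIO(csv_text), header=0, names=names)

# No header argument: with names given, the first line is treated as data.
kept = pd.read_csv(io.StringIO(csv_text), names=names)

print(len(replaced))
print(len(kept))
print(kept["name"].iloc[0])
```

Without header=0 the old header line shows up as an extra first data row, so the result has one row more than expected.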

Jun 29, 2021 kellemnegasi answer
