Save pandas dataframe to a csv file

Other topics

Create random DataFrame and write to .csv

Create a simple DataFrame.

import numpy as np
import pandas as pd

# Set the seed so that the numbers can be reproduced.
np.random.seed(0)  

df = pd.DataFrame(np.random.randn(5, 3), columns=list('ABC'))

# Another way to set column names is "columns=['column_1_name','column_2_name','column_3_name']"

df

      A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
2  0.950088 -0.151357 -0.103219
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863

Now, write to a CSV file:

df.to_csv('example.csv', index=False)

Contents of example.csv:

A,B,C
1.76405234597,0.400157208367,0.978737984106
2.2408931992,1.86755799015,-0.977277879876
0.950088417526,-0.151357208298,-0.103218851794
0.410598501938,0.144043571161,1.45427350696
0.761037725147,0.121675016493,0.443863232745

Note that we specify index=False so that the auto-generated indices (row #s 0,1,2,3,4) are not included in the CSV file. Include it if you need the index column, like so:

df.to_csv('example.csv', index=True)  # Or just leave off the index param; default is True

Contents of example.csv:

,A,B,C
0,1.76405234597,0.400157208367,0.978737984106
1,2.2408931992,1.86755799015,-0.977277879876
2,0.950088417526,-0.151357208298,-0.103218851794
3,0.410598501938,0.144043571161,1.45427350696
4,0.761037725147,0.121675016493,0.443863232745

Also note that you can remove the header if it's not needed with header=False. This is the simplest output:

df.to_csv('example.csv', index=False, header=False)

Contents of example.csv:

1.76405234597,0.400157208367,0.978737984106
2.2408931992,1.86755799015,-0.977277879876
0.950088417526,-0.151357208298,-0.103218851794
0.410598501938,0.144043571161,1.45427350696
0.761037725147,0.121675016493,0.443863232745

The delimiter can be set by sep= argument, although the standard separator for csv files is ',' .

df.to_csv('example.csv', index=False, header=False, sep='\t')

1.76405234597    0.400157208367    0.978737984106
2.2408931992    1.86755799015    -0.977277879876
0.950088417526    -0.151357208298    -0.103218851794
0.410598501938    0.144043571161    1.45427350696
0.761037725147    0.121675016493    0.443863232745

Save Pandas DataFrame from list to dicts to csv with no index and with data encoding

import pandas as pd
data = [
    {'name': 'Daniel', 'country': 'Uganda'},
    {'name': 'Yao', 'country': 'China'},
    {'name': 'James', 'country': 'Colombia'},
]
df = pd.DataFrame(data)
filename = 'people.csv'
df.to_csv(filename, index=False, encoding='utf-8')

Parameters:

ParameterDescription
path_or_bufstring or file handle, default None File path or object, if None is provided the result is returned as a string.
sepcharacter, default ‘,’ Field delimiter for the output file.
na_repstring, default ‘’ Missing data representation
float_formatstring, default None Format string for floating point numbers
columnssequence, optional Columns to write
headerboolean or list of string, default True Write out column names. If a list of string is given it is assumed to be aliases for the column names
indexboolean, default True Write row names (index)
index_labelstring or sequence, or False, default None Column label for index column(s) if desired. If None is given, and header and index are True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex. If False do not print fields for index names. Use index_label=False for easier importing in R
nanRepNone deprecated, use na_rep
modestr Python write mode, default ‘w’
encodingstring, optional A string representing the encoding to use in the output file, defaults to ‘ascii’ on Python 2 and ‘utf-8’ on Python 3.
compressionstring, optional a string representing the compression to use in the output file, allowed values are ‘gzip’, ‘bz2’, ‘xz’, only used when the first argument is a filename
line_terminatorstring, default ‘n’ The newline character or character sequence to use in the output file
quotingoptional constant from csv module defaults to csv.QUOTE_MINIMAL
quotecharstring (length 1), default ‘”’ character used to quote fields
doublequoteboolean, default True Control quoting of quotechar inside a field
escapecharstring (length 1), default None character used to escape sep and quotechar when appropriate
chunksizeint or None rows to write at a time
tupleize_colsboolean, default False write multi_index columns as a list of tuples (if True) or new (expanded format) if False)
date_formatstring, default None Format string for datetime objects
decimalstring, default ‘.’ Character recognized as decimal separator. E.g. use ‘,’ for European data

Contributors

Topic Id: 1558

Example Ids: 19502,21916

This site is not affiliated with any of the contributors.