
Formatting Thousand Separator For Integers In A Pandas Dataframe

I'm trying to use '{:,}'.format(number) like the example below to format a number in a pandas dataframe: # This works for floats and integers print '{:,}'.format(20000) # 20,000 pr

Solution 1:

pandas (as of 0.20.1) does not offer an easy way to override the default integer format. It is hard-coded in pandas.io.formats.format.IntArrayFormatter (the lambda function):

class IntArrayFormatter(GenericArrayFormatter):

    def _format_strings(self):
        formatter = self.formatter or (lambda x: '% d' % x)
        fmt_values = [formatter(x) for x in self.values]
        return fmt_values

I'm assuming what you're actually asking is how to override the format for all integers: replace ("monkey patch") IntArrayFormatter so that it prints integer values with a comma as the thousands separator, as follows:

import pandas

class _IntArrayFormatter(pandas.io.formats.format.GenericArrayFormatter):

    def _format_strings(self):
        formatter = self.formatter or (lambda x: ' {:,}'.format(x))
        fmt_values = [formatter(x) for x in self.values]
        return fmt_values

pandas.io.formats.format.IntArrayFormatter = _IntArrayFormatter
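The leading space in both lambdas above is easy to miss, so here is a small plain-Python sketch of what each format string produces. The `sign_aware` variant is my own suggestion, not part of the patch above: it uses the space sign flag of `str.format` so that negative numbers do not pick up an extra character.

```python
# '% d' reserves one character for the sign: a space for positive numbers,
# a minus sign for negative ones.
default_positive = "% d" % 20000
default_negative = "% d" % -20000

# The patched lambda hard-codes the space, so negatives become one character wider.
patched_positive = " {:,}".format(20000)
patched_negative = " {:,}".format(-20000)

# Suggested alternative (assumption, not from the answer): space sign flag plus comma grouping.
sign_aware_positive = "{: ,}".format(20000)
sign_aware_negative = "{: ,}".format(-20000)

print(repr(default_positive), repr(default_negative))        # ' 20000' '-20000'
print(repr(patched_positive), repr(patched_negative))        # ' 20,000' ' -20,000'
print(repr(sign_aware_positive), repr(sign_aware_negative))  # ' 20,000' '-20,000'
```

If column alignment with negative values matters, swapping `' {:,}'` for `'{: ,}'` in the patched lambda should keep the original width behaviour.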

Note:

  • before 0.20.0, the formatters were in pandas.formats.format.
  • before 0.18.1, the formatters were in pandas.core.format.

Aside

For floats you do not need to jump through those hoops since there is a configuration option for it:

display.float_format: The callable should accept a floating point number and return a string with the desired format of the number. This is used in some places like SeriesFormatter. See core.format.EngFormatter for an example.
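As a minimal sketch of that option (the frame `df_float` and its values are made up for illustration), the callable can simply be the bound `format` method of a format string. `pd.option_context` is used here so the setting only applies inside the `with` block; `pd.set_option` would apply it globally.

```python
import pandas as pd

# Hypothetical example data, not from the question.
df_float = pd.DataFrame({"A": [20000.5, 10000.25]})

# Temporarily render floats with a thousands separator and two decimals.
with pd.option_context("display.float_format", "{:,.2f}".format):
    rendered = df_float.to_string()

print(rendered)
```

Note that this option only affects float columns; integer columns are still rendered by the integer formatter discussed above.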


Solution 2:

The formatters parameter of to_html takes a dictionary that maps column names to formatting functions. Below is an example of a function that builds such a dict, mapping the same formatting function to every int64 and float64 column.

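In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: df_int = pd.DataFrame({'A': [20000, 10000]})  # example frame matching the output below

In [4]: num_format = lambda x: '{:,}'.format(x)

In [5]: def build_formatters(df, fmt):
   ...:     # Map every int64/float64 column name to the same formatting function
   ...:     return {column: fmt
   ...:             for (column, dtype) in df.dtypes.items()
   ...:             if dtype in [np.dtype('int64'), np.dtype('float64')]}
   ...:

In [6]: formatters = build_formatters(df_int, num_format)

In [7]: print(df_int.to_html(formatters=formatters))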
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>A</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>20,000</td>
    </tr>
    <tr>
      <th>1</th>
      <td>10,000</td>
    </tr>
  </tbody>
</table>
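The dtype check above only matches int64 and float64, so columns with other widths (int32, float32, and so on) would be skipped. A possible variant, sketched here with a hypothetical helper name `build_number_formatters` and made-up data, lets `select_dtypes` pick out every numeric column instead:

```python
import pandas as pd


def build_number_formatters(df, fmt="{:,}".format):
    """Map every numeric column of df to the same formatting function."""
    numeric_columns = df.select_dtypes(include="number").columns
    return {column: fmt for column in numeric_columns}


# Hypothetical frame with a non-default integer width and a string column.
df_mixed = pd.DataFrame({
    "A": pd.Series([20000, 10000], dtype="int32"),
    "B": [1234567.5, 2.0],
    "C": ["x", "y"],
})

formatters = build_number_formatters(df_mixed)
html = df_mixed.to_html(formatters=formatters)
print(html)
```

The string column C is left out of the dict, so it is rendered with the default formatting.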

Solution 3:

You can always cast your table to float64 and then use float_format as you like, especially if you are constructing a small table for viewing purposes. Instead of dealing with ints and floats separately, this gives a quick solution.

df.astype('float64', errors='ignore').to_html(float_format=lambda x: format(x, ',.2f'))

errors='ignore' prevents an exception from being raised when a column, such as one holding strings, cannot be converted to float.
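One thing to watch with this approach is that ',.2f' makes former integers show two decimals. A short sketch, assuming a made-up all-integer frame `df_counts`, compares it with ',.0f', which drops the decimals again:

```python
import pandas as pd

# Hypothetical integer-only frame for illustration.
df_counts = pd.DataFrame({"A": [20000, 10000]})
as_float = df_counts.astype("float64")

# Two decimals, as in the one-liner above.
html_two_decimals = as_float.to_html(float_format=lambda x: format(x, ",.2f"))

# No decimals, which looks like the original integers plus separators.
html_no_decimals = as_float.to_html(float_format=lambda x: format(x, ",.0f"))

print(html_two_decimals)
print(html_no_decimals)
```

Keep in mind that ',.0f' rounds, so it is only a faithful display for columns that really hold whole numbers.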

