Conversions module

Contains functions that help convert between types.

class pandas_extras.conversions.NativeDict(*args, **kwargs)

Bases: dict

Helper class to ensure that only native types are in the dicts produced by to_dict()

>>> df.to_dict(orient='records', into=NativeDict)

Note

Needed until #21256 is resolved.

static convert_if_needed(value)

Converts value to native python type.

Warning

Only Timestamp and numpy dtypes are converted.
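The idea behind NativeDict can be illustrated with a minimal sketch. It is not the library's implementation: the names to_native and NativeDictSketch are hypothetical, and the conversion rules (Timestamp to datetime, numpy scalars to Python scalars) are an assumption based on the warning above.

```python
import numpy as np
import pandas as pd


def to_native(value):
    """Hypothetical stand-in for convert_if_needed()."""
    if isinstance(value, pd.Timestamp):
        # pandas Timestamp -> datetime.datetime
        return value.to_pydatetime()
    if isinstance(value, np.generic):
        # numpy scalar (np.int64, np.float32, ...) -> Python scalar
        return value.item()
    return value


class NativeDictSketch(dict):
    """Illustrative dict subclass that stores only native Python values."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Convert whatever the plain dict constructor stored.
        for key, value in list(self.items()):
            super().__setitem__(key, to_native(value))

    def __setitem__(self, key, value):
        super().__setitem__(key, to_native(value))


record = NativeDictSketch({
    'count': np.int64(3),
    'when': pd.Timestamp('2018-05-06'),
})
print(type(record['count']).__name__)  # int
print(type(record['when']).__name__)   # datetime
```

Values that are neither a Timestamp nor a numpy scalar pass through unchanged in this sketch, which matches the scope stated in the warning.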

pandas_extras.conversions.clear_nan(dataframe)

Replaces pandas.NaT and NaN elements with None.

Parameters: dataframe – The pandas.DataFrame object to transform.
Returns: The modified dataframe.
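One way to get this effect with plain pandas is sketched below. It is an illustration under assumptions, not the source of clear_nan(): the helper name clear_nan_sketch is hypothetical, and casting to object dtype first is a choice made here so that None is stored as-is rather than being coerced back to NaN or NaT.

```python
import numpy as np
import pandas as pd


def clear_nan_sketch(dataframe):
    """Hypothetical helper: replace missing values (NaN, NaT) with None."""
    # Object dtype can hold None; numeric and datetime dtypes cannot.
    as_object = dataframe.astype(object)
    # Keep values where they are present, put None everywhere else.
    return as_object.where(dataframe.notna(), None)


df = pd.DataFrame({
    'value': [1.5, np.nan],
    'when': [pd.Timestamp('2018-05-06'), pd.NaT],
})
cleaned = clear_nan_sketch(df)
print(cleaned.loc[1, 'value'] is None)  # True
print(cleaned.loc[1, 'when'] is None)   # True
```

The result uses object dtype for every column, which is the trade-off for being able to store None.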
pandas_extras.conversions.convert_to_type(dataframe, mapper, *types, kwargs_map=None)

Converts columns to types specified by the mapper. In case of integer, float, signed and unsigned typecasting, the smallest possible type will be chosen. See more details at to_numeric().

>>> df = pd.DataFrame({
...     'date': ['05/06/2018', '05/04/2018'],
...     'datetime': [156879000, 156879650],
...     'number': ['1', '2.34'],
...     'int': [4, 8103],
...     'float': [4.0, 8103.0],
...     'object': ['just some', 'strings']
... })
>>> mapper = {
...     'number': 'number', 'integer': 'int', 'float': 'float',
...     'date': ['date', 'datetime']
... }
>>> kwargs_map = {'datetime': {'unit': 'ms'}}
>>> df.pipe(
...    convert_to_type, mapper, 'integer', 'date',
...    'number', 'float', kwargs_map=kwargs_map
... ).dtypes
date        datetime64[ns]
datetime    datetime64[ns]
number             float64
int                  int64
float              float32
object              object
dtype: object
Parameters:
  • dataframe (DataFrame) – The DataFrame object to work on.
  • mapper (dict) – Dict with column names as values and any of the following keys: number, integer, float, signed, unsigned, date and datetime.
  • *types (str) – any number of keys from the mapper. If omitted, all keys from mapper will be used.
  • kwargs_map (dict) – Dict of keyword arguments to apply to to_datetime() or to_numeric(). Keys must be the column names, values are the kwargs dict.
Returns:

The converted dataframe

Return type:

DataFrame
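The pandas functions that the description refers to can be tried directly. The snippet below only demonstrates to_numeric() downcasting and a to_datetime() keyword argument in plain pandas; that convert_to_type passes exactly these arguments is an assumption, and its results may differ.

```python
import pandas as pd

# Downcasting picks the smallest dtype that can hold every value.
small_int = pd.to_numeric(pd.Series([4, 8103]), downcast='integer')
small_float = pd.to_numeric(pd.Series([4.0, 8103.0]), downcast='float')

# Numeric strings are parsed; no downcast is requested here.
parsed = pd.to_numeric(pd.Series(['1', '2.34']))

# Extra keyword arguments such as unit= are what a kwargs_map entry
# like {'datetime': {'unit': 'ms'}} would supply to to_datetime().
stamps = pd.to_datetime(pd.Series([156879000, 156879650]), unit='ms')

print(small_int.dtype)    # int16
print(small_float.dtype)  # float32
print(parsed.dtype)       # float64
print(stamps.iloc[0])     # 1970-01-02 19:34:39
```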

pandas_extras.conversions.truncate_strings(dataframe, length_mapping)

Truncates strings in the given columns to the specified length.

>>> df = pd.DataFrame({
...    'strings': [
...        'foo',
...        'baz',
...    ],
...    'long_strings': [
...        'foofoofoofoofoo',
...        'bazbazbazbazbaz',
...    ],
...    'even_longer_strings': [
...        'foofoofoofoofoofoofoofoo',
...        'bazbazbazbazbazbazbazbaz',
...    ]
... })
>>> df.pipe(truncate_strings, {'long_strings': 6, 'even_longer_strings': 9})
    strings  long_strings  even_longer_strings
0       foo        foofoo            foofoofoo
1       baz        bazbaz            bazbazbaz
Parameters:
  • dataframe (DataFrame) – The DataFrame object to work on.
  • length_mapping (dict) – Dict mapping column names to the desired maximum length. Columns not listed are left unchanged.
Returns:

The converted dataframe

Return type:

DataFrame
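A comparable result can be produced with the pandas string accessor. This is a minimal sketch, not the library's code: the name truncate_sketch is hypothetical, and it assumes the mapped columns contain only strings.

```python
import pandas as pd


def truncate_sketch(dataframe, length_mapping):
    """Hypothetical helper: cut strings in the mapped columns to a length."""
    result = dataframe.copy()  # leave the caller's frame untouched
    for column, length in length_mapping.items():
        # .str.slice(stop=n) keeps at most the first n characters.
        result[column] = result[column].str.slice(stop=length)
    return result


df = pd.DataFrame({
    'strings': ['foo', 'baz'],
    'long_strings': ['foofoofoofoofoo', 'bazbazbazbazbaz'],
})
shortened = df.pipe(truncate_sketch, {'long_strings': 6})
print(shortened['long_strings'].tolist())  # ['foofoo', 'bazbaz']
```

Strings that are already shorter than the requested length are returned as they are.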