I have a DataFrame with a column of timedeltas (on inspection, the dtype is actually timedelta64[ns]). Example:

    import pandas as pd
    import numpy as np
    pd.__version__
    Out[3]: '0.13.0rc1'
    np.__version__
    Out[4]: '1.8.0'
    data = pd.DataFrame(np.random.rand(10, 3), columns=['f1', 'f2', 'td'])
    data['td'] *= 10000000
    data['td'] = pd.Series(data['td'], dtype='timedelta64[ns]')

Or, forcing pandas to try the operation on the 'td' column:

    data.groupby(data.index < 5)['td'].mean()
    ---------------------------------------------------------------------------
    DataError                                 Traceback (most recent call last)
    <ipython-input-...> in <module>()
    ----> 1 data.groupby(data.index < 5)['td'].mean()

    /path/to/lib/python3.3/site-packages/pandas-0.13.0rc1-py3.3-linux-x86_64.egg/pandas/core/groupby.py in mean(self)
        417         """
        418         try:
    --> 419             return self._cython_agg_general('mean')
        420         except GroupByError:
        421             raise

    /path/to/lib/python3.3/site-packages/pandas-0.13.0rc1-py3.3-linux-x86_64.egg/pandas/core/groupby.py in _cython_agg_general(self, how, numeric_only)
        669
        670         if len(output) == 0:
    --> 671             raise DataError('No numeric types to aggregate')
        672
        673         return self._wrap_aggregated_output(output, names)

    DataError: No numeric types to aggregate

However, taking the mean of the column directly works fine, so numerical operations on it should be possible:

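    data['td'].mean()
    Out[11]:
    0   00:00:00.003734
    dtype: timedelta64[ns]

Obviously it's easy enough to cast to float before grouping, but I figured I might as well try to understand the problem I'm running into.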
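
For reference, the cast-before-grouping workaround mentioned above might look roughly like the sketch below. It is only an illustration, not a statement about how pandas should behave: the names `td_ns`, `group_key`, `mean_ns` and `mean_td` are made up for this example, the column is built with `pd.to_timedelta` rather than the `pd.Series(..., dtype=...)` call from the question, and integer nanoseconds are used instead of floats so the result can be converted back to timedeltas. It has not been checked against 0.13.0rc1.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (seeded for repeatability).
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.random((10, 3)), columns=["f1", "f2", "td"])
data["td"] = pd.to_timedelta((data["td"] * 10_000_000).astype("int64"), unit="ns")

# Workaround: aggregate plain integer nanoseconds instead of timedeltas.
td_ns = data["td"].astype("int64")   # hypothetical helper name
group_key = data.index < 5           # same grouping key as in the question
mean_ns = td_ns.groupby(group_key).mean()

# Convert the per-group means back to timedeltas (rounded to whole nanoseconds).
mean_td = pd.to_timedelta(mean_ns.round().astype("int64"), unit="ns")
print(mean_td)
```

The result is a two-row Series indexed by the group key (False, True), holding one mean timedelta per group.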

Edit: see https://github.com/pydata/pandas/issues/5724