pandas - Pandas 0.15.2 MultiIndex vs 0.14.1 (datetime.date vs pandas.tslib.Timestamp)


Something broke in my code, and I traced it to a MultiIndex assignment that now returns a pandas.tslib.Timestamp where it previously returned a datetime.date.

Has anyone else run into this? Is it intended behavior, or a bug in 0.15.2? Any recommended fixes?


import datetime as dt
import pandas as pd
i = [dt.date(2015, 1, 1), dt.date(2015, 1, 2), dt.date(2015, 1, 3)]
idx = pd.MultiIndex.from_product([['a', 'b'], i])

>>> idx
MultiIndex(levels=[[u'a', u'b'], [2015-01-01 00:00:00, 2015-01-02 00:00:00, 2015-01-03 00:00:00]],
 labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]])

>>> type(idx[0][1])
pandas.tslib.Timestamp

>>> idx.levels[1]
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-11-23, ..., 2015-03-06]
Length: 834, Freq: None, Timezone: None

>>> type(idx.levels[1][0])
Out[29]: pandas.tslib.Timestamp

Running the following statement raises the error below:


df2.merge(df, left_on=['identifier', 'date'],
 right_index=True,
 how='left',
 suffixes=['', '_dup'])

  File "/Users/user4589964/anaconda/envs/madrone_dev/lib/python2.7/site-packages/pandas/core/frame.py", line 3919, in merge
    suffixes=suffixes, copy=copy)
  File "/Users/user4589964/anaconda/envs/madrone_dev/lib/python2.7/site-packages/pandas/tools/merge.py", line 39, in merge
    return op.get_result()
  File "/Users/user4589964/anaconda/envs/madrone_dev/lib/python2.7/site-packages/pandas/tools/merge.py", line 187, in get_result
    join_index, left_indexer, right_indexer = self._get_join_info()
  File "/Users/user4589964/anaconda/envs/madrone_dev/lib/python2.7/site-packages/pandas/tools/merge.py", line 264, in _get_join_info
    sort=self.sort)
  File "/Users/user4589964/anaconda/envs/madrone_dev/lib/python2.7/site-packages/pandas/tools/merge.py", line 582, in _left_join_on_index
    _get_multiindex_indexer(join_keys, right_ax, sort=sort)
  File "/Users/user4589964/anaconda/envs/madrone_dev/lib/python2.7/site-packages/pandas/tools/merge.py", line 542, in _get_multiindex_indexer
    llab, rlab, count = _factorize_keys(level, key, sort=False)
  File "/Users/user4589964/anaconda/envs/madrone_dev/lib/python2.7/site-packages/pandas/tools/merge.py", line 622, in _factorize_keys
    llab = rizer.factorize(lk)
TypeError: Argument 'values' has incorrect type (expected numpy.ndarray, got Index)


This is a bug in index construction; see here.

Here is an example of how to work with actual datetime.date objects:


In [8]: pd.MultiIndex.from_arrays([pd.Index([datetime.date(2013,1,1)]),['a']])
Out[8]: 
MultiIndex(levels=[[2013-01-01], [u'a']],
 labels=[[0], [0]])
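
To check which representation a level actually ended up with, it can help to inspect the level's dtype and the type of one element. The following is a minimal sketch, assuming a reasonably recent pandas; the variable names are illustrative, and passing dtype=object explicitly is an assumption made here to rule out any datetime inference rather than something the example above requires.

```python
import datetime as dt
import pandas as pd

# Illustrative dates, not taken from the original data.
dates = [dt.date(2015, 1, 1), dt.date(2015, 1, 2)]

# Build the level as an explicit object-dtype Index so the dates are
# stored as plain Python objects.
date_level = pd.Index(dates, dtype=object)
idx = pd.MultiIndex.from_arrays([date_level, ["a", "b"]])

# Inspect how the level is stored and what an element looks like.
level_dtype = idx.levels[0].dtype
element_type = type(idx[0][0])
print(level_dtype)
print(element_type)
```

If the level dtype is a datetime64 dtype and the element is a Timestamp, the dates were converted during construction; an object dtype with datetime.date elements means they were kept as-is.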

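Keep in mind that datetime.date objects are really second-class citizens: they are represented with object dtype, so they will not be very efficient. In general, just use Timestamps.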
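
The advice to use Timestamps can be sketched as follows: convert the dates on both sides with pd.to_datetime before building the MultiIndex and before merging, so that the key column and the index level share a datetime dtype. This is a hedged illustration, not the original poster's code; the frames, the "value" column, and the sample rows are hypothetical, and only the "identifier" and "date" key names come from the merge call in the question.

```python
import datetime as dt
import pandas as pd

# Hypothetical sample data standing in for the question's frames.
raw_dates = [dt.date(2015, 1, 1), dt.date(2015, 1, 2), dt.date(2015, 1, 3)]

# Convert the datetime.date objects to Timestamps up front.
dates = pd.to_datetime(raw_dates)

idx = pd.MultiIndex.from_product([["a", "b"], dates],
                                 names=["identifier", "date"])
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# The left frame's key column is converted the same way, so both sides
# of the join hold Timestamps.
df2 = pd.DataFrame({
    "identifier": ["a", "b"],
    "date": pd.to_datetime([dt.date(2015, 1, 2), dt.date(2015, 1, 3)]),
})

merged = df2.merge(df, left_on=["identifier", "date"],
                   right_index=True, how="left")
print(merged)
```

If code downstream still needs datetime.date values, they can be recovered from a datetime column with .dt.date after the merge, at the cost of going back to object dtype.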
