pandas - python - Reading space-delimited string data

I have two columns of data in a text file, as shown below.


Balkrishna Industries Ltd. Auto Ancillaries 3.54


Aurobindo Pharma Ltd. Pharmaceuticals 3.36


NIIT Technologies Ltd. Software 3.31


Sonata Software Ltd. Software 3.21



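When I try to read this file with pandas I get an error, because the space used as the delimiter also appears inside the names. How can I change the code so that the data is split into two columns, one holding the name and one holding the number?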


import numpy as np


import pandas as pd



data = pd.read_csv('file.txt', sep=" ", header=None)


data.columns = ["Name","Fraction"]



print(data)



Answers:

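Use a regular expression with a lookbehind and a lookahead as the separator: sep=r"(?<=\w) (?=\d)". It matches only a space that follows a word character and precedes a digit, so the spaces inside the name are left alone. Regex separators need engine="python".

For example: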


import pandas as pd



df = pd.read_csv('file.txt', sep=r"(?<=\w) (?=\d)", engine="python", names=["Name","Fraction"])


print(df)



Output:


 Name Fraction


0 Balkrishna Industries Ltd. Auto Ancillaries 3.54


1 Aurobindo Pharma Ltd. Pharmaceuticals 3.36


2 NIIT Technologies Ltd. Software 3.31


3 Sonata Software Ltd. Software 3.21
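The lookaround separator can be tried without a file on disk. The following is a minimal sketch that assumes the four sample rows from the question and uses io.StringIO as a stand-in for file.txt; the variable name raw is only illustrative.

```python
import io

import pandas as pd

# Sample rows copied from the question; StringIO stands in for file.txt.
raw = io.StringIO(
    "Balkrishna Industries Ltd. Auto Ancillaries 3.54\n"
    "Aurobindo Pharma Ltd. Pharmaceuticals 3.36\n"
    "NIIT Technologies Ltd. Software 3.31\n"
    "Sonata Software Ltd. Software 3.21\n"
)

# Split only on a space that follows a word character and precedes a digit.
# The lookarounds are zero-width, so no characters of the name or number are lost.
df = pd.read_csv(
    raw,
    sep=r"(?<=\w) (?=\d)",
    engine="python",  # regex separators are handled by the Python engine
    names=["Name", "Fraction"],
)

print(df)
```

This relies on every number starting with a digit and every name ending in a word character; a name ending in "." directly before the number would not be split.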



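Another approach: read the file as a single column (by using a sep character that does not occur in the file, such as |), then split each line on its last space.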


df = pd.read_csv('file.txt', sep='|', header=None)



df = df[0].str.rsplit(' ', n=1, expand=True)


df.columns = ["Name","Fraction"]



[out]


 Name Fraction


0 Balkrishna Industries Ltd. Auto Ancillaries 3.54


1 Aurobindo Pharma Ltd. Pharmaceuticals 3.36


2 NIIT Technologies Ltd. Software 3.31


3 Sonata Software Ltd. Software 3.21
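One detail worth noting about the rsplit approach is that both resulting columns hold strings. The sketch below assumes the same four sample rows (fed through io.StringIO instead of file.txt) and adds a pd.to_numeric step, which is not part of the answer above, to turn the Fraction column into floats.

```python
import io

import pandas as pd

# Sample rows copied from the question; StringIO stands in for file.txt.
raw = io.StringIO(
    "Balkrishna Industries Ltd. Auto Ancillaries 3.54\n"
    "Aurobindo Pharma Ltd. Pharmaceuticals 3.36\n"
    "NIIT Technologies Ltd. Software 3.31\n"
    "Sonata Software Ltd. Software 3.21\n"
)

# "|" does not occur in the data, so each whole line lands in column 0.
df = pd.read_csv(raw, sep="|", header=None)

# Split once, starting from the right, so only the last space is used.
parts = df[0].str.rsplit(" ", n=1, expand=True)
parts.columns = ["Name", "Fraction"]

# rsplit returns strings; convert the numeric column explicitly (added step).
parts["Fraction"] = pd.to_numeric(parts["Fraction"])

print(parts.dtypes)
```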



Simply read it as a single-column DataFrame, as in the following example:


df:


 name


0 Balkrishna Industries Ltd. Auto Ancillaries 3.54


1 Aurobindo Pharma Ltd. Pharmaceuticals 3.36


2 NIIT Technologies Ltd. Software 3.31


3 Sonata Software Ltd. Software 3.21



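Then call str.rpartition on df["name"], which splits each string at its last space into three columns (head, separator, tail), and drop the separator column as follows:


df["name"].str.rpartition().drop(columns=1).set_axis(["Name","Fraction"], axis=1)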



Out[1594]:


 Name Fraction


0 Balkrishna Industries Ltd. Auto Ancillaries 3.54


1 Aurobindo Pharma Ltd. Pharmaceuticals 3.36


2 NIIT Technologies Ltd. Software 3.31


3 Sonata Software Ltd. Software 3.21
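To see what rpartition produces before the separator column is dropped, here is a small sketch that builds the single-column data directly as a Series (the variable names are illustrative, and the data is the sample from the question).

```python
import pandas as pd

# The single "name" column from the example above, built by hand.
names = pd.Series(
    [
        "Balkrishna Industries Ltd. Auto Ancillaries 3.54",
        "Aurobindo Pharma Ltd. Pharmaceuticals 3.36",
        "NIIT Technologies Ltd. Software 3.31",
        "Sonata Software Ltd. Software 3.21",
    ],
    name="name",
)

# rpartition splits at the LAST occurrence of the separator and returns
# three columns: 0 = text before, 1 = the separator itself, 2 = text after.
parts = names.str.rpartition(" ")

# Column 1 only ever contains the space, so it can be dropped.
result = parts.drop(columns=1).set_axis(["Name", "Fraction"], axis=1)

print(result)
```

As with rsplit, the Fraction column still holds strings at this point.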



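Use a "char-space-digit" separator. Note that this pattern consumes the characters it matches, so the last character of each name and the first digit of each number are lost, as the output shows:

import pandas as pd

df = pd.read_csv("mycsv.txt", sep=r"\w\s\d", engine="python", names=["Name","Fraction"])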


print(df)



 Name Fraction


0 Balkrishna Industries Ltd. Auto Ancillarie 0.54


1 Aurobindo Pharma Ltd. Pharmaceutical 0.36


2 NIIT Technologies Ltd. Softwar 0.31


3 Sonata Software Ltd. Softwar 0.21
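The truncated names and the 0.xx values in this output come from the separator itself. The following sketch uses only the standard re module on one sample line to compare the consuming pattern with a zero-width lookaround version; it is an illustration of the regex behavior, not of pandas internals.

```python
import re

line = "Sonata Software Ltd. Software 3.21"

# \w\s\d matches "e 3": the letter and the digit become part of the separator
# and are therefore removed from the resulting fields.
consumed = re.split(r"\w\s\d", line)

# Lookbehind/lookahead only assert their neighbors, so just the space is removed.
kept = re.split(r"(?<=\w)\s(?=\d)", line)

print(consumed)  # ['Sonata Software Ltd. Softwar', '.21']
print(kept)      # ['Sonata Software Ltd. Software', '3.21']
```

The leftover ".21" is then parsed as the float 0.21, which matches the Fraction values shown above.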




...