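I'm working with large CSV files that contain almost nothing but strings. I'd like to run some statistical analyses, such as identifying clusters, but for that I need to convert my strings to ints. (I'm also completely new to Python, pandas, and scikit-learn.)

Here is my code: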

# Replace string labels with ints
df.WORK_TYPE[df.WORK_TYPE == 'aaa']=1
df.WORK_TYPE[df.WORK_TYPE == 'bbb']=2
df.WORK_TYPE[df.WORK_TYPE == 'ccc']=3
df.WORK_TYPE[df.WORK_TYPE == 'ddd']=4
print(df)

Here is my error message:

C:\Users\ishemf64\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame 

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.
C:\Users\ishemf64\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

C:\Users\ishemf64\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:3: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  This is separate from the ipykernel package so we can avoid doing imports until
C:\Users\ishemf64\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:4: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  after removing the cwd from sys.path.

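I don't understand why I'm getting this error. If I want to run my analysis, could you tell me whether there is another way to do this and/or a way to force-convert the text?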

1 Answer

This looks like a warning rather than an error. This article explains it better than I can here: https://www.dataquest.io/blog/settingwithcopywarning/
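For context, the warning comes from chained indexing: `df.WORK_TYPE[mask] = 1` first selects the column and then assigns into that intermediate result, which pandas cannot guarantee is the original DataFrame. A minimal sketch of two ways to avoid it is below. The sample data and the new column names `WORK_TYPE_CODE` and `WORK_TYPE_CODE_LOC` are made up for illustration; only `WORK_TYPE` and its four values come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV.
df = pd.DataFrame({"WORK_TYPE": ["aaa", "bbb", "ccc", "ddd", "aaa"]})

# Option 1: map every label to its code in one step.
# Labels missing from the dict would become NaN.
codes = {"aaa": 1, "bbb": 2, "ccc": 3, "ddd": 4}
df["WORK_TYPE_CODE"] = df["WORK_TYPE"].map(codes)

# Option 2: a single .loc call that names the rows and the column together,
# so the assignment is applied to df itself rather than to a temporary object.
df["WORK_TYPE_CODE_LOC"] = 0
for label, code in codes.items():
    df.loc[df["WORK_TYPE"] == label, "WORK_TYPE_CODE_LOC"] = code

print(df)
```

Writing the codes to a new column keeps the original strings available for checking the mapping. Note that integer codes like these imply an ordering (1 < 2 < 3 < 4) that a clustering algorithm will treat as meaningful.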

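Since you seem to have only a few categories, would you consider using get_dummies? It takes a pd.Series of categorical data and converts it into dummy variables (1 if the category is present, 0 if not). See: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.get_dummies.html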
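As a rough illustration of the get_dummies suggestion, here is a small sketch. The sample data is invented, and the `prefix` and `dtype=int` arguments are choices made for this example, not something from the question.

```python
import pandas as pd

# Hypothetical sample data; only the WORK_TYPE column name and values come from the question.
df = pd.DataFrame({"WORK_TYPE": ["aaa", "bbb", "ccc", "ddd", "aaa"]})

# One indicator column per category; dtype=int gives 0/1 instead of booleans.
dummies = pd.get_dummies(df["WORK_TYPE"], prefix="WORK_TYPE", dtype=int)

# Attach the indicator columns next to the original data.
df_encoded = df.join(dummies)

print(df_encoded)
```

Unlike integer codes, the indicator columns do not impose an order on the categories, which usually suits distance-based methods such as clustering better.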

Answered 2018-11-10T00:38:56.123