javaee论坛

Posted on 2019-11-07 13:11:51

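Contents
Columns of the data
Data types
Number of elements, dimensions, and shape
Accessing multiple rows of a single column
Accessing the head and tail of the data with head and tail
Slicing with loc and iloc
Adding data to a DataFrame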

Importing the data

    import pandas as pd

    # Load the order-detail spreadsheet into a DataFrame
    date = pd.read_excel('meal_order_detail.xlsx')
    print('All values in the data:\n', date.values)

Output:

    All values in the data:
     [[2956 417 610062 ... nan 'caipu/104001.jpg' 1442]
     [2958 417 609957 ... nan 'caipu/202003.jpg' 1442]
     [2961 417 609950 ... nan 'caipu/303001.jpg' 1442]
     ...
     [6756 774 609949 ... nan 'caipu/404005.jpg' 1138]
     [6763 774 610014 ... nan 'caipu/302003.jpg' 1138]
     [6764 774 610017 ... nan 'caipu/302006.jpg' 1138]]

Columns of the data

    print('Column names:\n', date.columns)

Output:

    Column names:
     Index(['detail_id', 'order_id', 'dishes_id', 'logicprn_name',
           'parent_class_name', 'dishes_name', 'itemis_add', 'counts', 'amounts',
           'cost', 'place_order_time', 'discount_amt', 'discount_reason',
           'kick_back', 'add_inprice', 'add_info', 'bar_code', 'picture_file',
           'emp_id'],
          dtype='object')

Data types

    print('Data types of all columns:\n', date.dtypes)

Output:

    Data types of all columns:
     detail_id                     int64
    order_id                      int64
    dishes_id                     int64
    logicprn_name               float64
    parent_class_name           float64
    dishes_name                  object
    itemis_add                    int64
    counts                        int64
    amounts                       int64
    cost                        float64
    place_order_time     datetime64[ns]
    discount_amt                float64
    discount_reason             float64
    kick_back                   float64
    add_inprice                   int64
    add_info                    float64
    bar_code                    float64
    picture_file                 object
    emp_id                        int64
    dtype: object

Number of elements, dimensions, and shape

    print('Number of elements:\n', date.size)
    print('Number of dimensions:\n', date.ndim)
    print('Shape:\n', date.shape)

Output:

    Number of elements:
     52801
    Number of dimensions:
     2
    Shape:
     (2779, 19)

The element count is rows times columns: 2779 * 19 = 52801.

Accessing multiple rows of a single column

    # Select one column, then take its first five rows
    dishes_name5 = date['dishes_name'][:5]
    print('First five elements of dishes_name:\n', dishes_name5)

Output:

    First five elements of dishes_name:
     0                   蒜蓉生蚝
    1    蒙古烤羊腿\r\n\r\n\r\n
    2                   大蒜苋菜
    3                  芝麻烤紫菜
    4                    蒜香包
    Name: dishes_name, dtype: object

    # Select all columns, then take the first six rows
    order5 = date[:][:6]
    print('All columns for the first six rows:\n', order5)

Output:

    All columns for the first six rows:
        detail_id  order_id  dishes_id  logicprn_name  parent_class_name  \
    0       2956       417     610062            NaN                NaN
    1       2958       417     609957            NaN                NaN
    2       2961       417     609950            NaN                NaN
    3       2966       417     610038            NaN                NaN
    4       2968       417     610003            NaN                NaN
    5       1899       301     610019            NaN                NaN

              dishes_name  itemis_add  counts  amounts  cost    place_order_time  \
    0                蒜蓉生蚝           0       1       49   NaN 2016-08-01 11:05:36
    1  蒙古烤羊腿\r\n\r\n\r\n           0       1       48   NaN 2016-08-01 11:07:07
    2                大蒜苋菜           0       1       30   NaN 2016-08-01 11:07:40
    3               芝麻烤紫菜           0       1       25   NaN 2016-08-01 11:11:11
    4                 蒜香包           0       1       13   NaN 2016-08-01 11:11:30
    5                 白斩鸡           0       1       88   NaN 2016-08-01 11:15:57

       discount_amt  discount_reason  kick_back  add_inprice  add_info  bar_code  \
    0           NaN              NaN        NaN            0       NaN       NaN
    1           NaN              NaN        NaN            0       NaN       NaN
    2           NaN              NaN        NaN            0       NaN       NaN
    3           NaN              NaN        NaN            0       NaN       NaN
    4           NaN              NaN        NaN            0       NaN       NaN
    5           NaN              NaN        NaN            0       NaN       NaN

           picture_file  emp_id
    0  caipu/104001.jpg    1442
    1  caipu/202003.jpg    1442
    2  caipu/303001.jpg    1442
    3  caipu/105002.jpg    1442
    4  caipu/503002.jpg    1442
    5  caipu/204002.jpg    1095

Accessing the head and tail of the data with head and tail

    # head and tail return the first and last rows; both default to five rows
    print('First five rows:\n', date.head())
    print('Last five rows:\n', date.tail())

Output:

    First five rows:
        detail_id  order_id  dishes_id  logicprn_name  parent_class_name  \
    0       2956       417     610062            NaN                NaN
    1       2958       417     609957            NaN                NaN
    2       2961       417     609950            NaN                NaN
    3       2966       417     610038            NaN                NaN
    4       2968       417     610003            NaN                NaN

              dishes_name  itemis_add  counts  amounts  cost    place_order_time  \
    0                蒜蓉生蚝           0       1       49   NaN 2016-08-01 11:05:36
    1  蒙古烤羊腿\r\n\r\n\r\n           0       1       48   NaN 2016-08-01 11:07:07
    2                大蒜苋菜           0       1       30   NaN 2016-08-01 11:07:40
    3               芝麻烤紫菜           0       1       25   NaN 2016-08-01 11:11:11
    4                 蒜香包           0       1       13   NaN 2016-08-01 11:11:30

       discount_amt  discount_reason  kick_back  add_inprice  add_info  bar_code  \
    0           NaN              NaN        NaN            0       NaN       NaN
    1           NaN              NaN        NaN            0       NaN       NaN
    2           NaN              NaN        NaN            0       NaN       NaN
    3           NaN              NaN        NaN            0       NaN       NaN
    4           NaN              NaN        NaN            0       NaN       NaN

           picture_file  emp_id
    0  caipu/104001.jpg    1442
    1  caipu/202003.jpg    1442
    2  caipu/303001.jpg    1442
    3  caipu/105002.jpg    1442
    4  caipu/503002.jpg    1442

    Last five rows:
           detail_id  order_id  dishes_id  logicprn_name  parent_class_name  \
    2774       6750       774     610011            NaN                NaN
    2775       6742       774     609996            NaN                NaN
    2776       6756       774     609949            NaN                NaN
    2777       6763       774     610014            NaN                NaN
    2778       6764       774     610017            NaN                NaN

         dishes_name  itemis_add  counts  amounts  cost    place_order_time  \
    2774       白饭/大碗           0       1       10   NaN 2016-08-10 21:56:24
    2775         牛尾汤           0       1       40   NaN 2016-08-10 21:56:48
    2776       意文柠檬汁           0       1       13   NaN 2016-08-10 22:01:52
    2777        金玉良缘           0       1       30   NaN 2016-08-10 22:03:58
    2778        酸辣藕丁           0       1       33   NaN 2016-08-10 22:04:30

          discount_amt  discount_reason  kick_back  add_inprice  add_info  \
    2774           NaN              NaN        NaN            0       NaN
    2775           NaN              NaN        NaN            0       NaN
    2776           NaN              NaN        NaN            0       NaN
    2777           NaN              NaN        NaN            0       NaN
    2778           NaN              NaN        NaN            0       NaN

          bar_code      picture_file  emp_id
    2774       NaN  caipu/601005.jpg    1138
    2775       NaN  caipu/201006.jpg    1138
    2776       NaN  caipu/404005.jpg    1138
    2777       NaN  caipu/302003.jpg    1138
    2778       NaN  caipu/302006.jpg    1138

Slicing with loc and iloc

loc slices by index label and iloc slices by integer position. loc covers most use cases and makes the program easier to read.

    # Multi-column slicing with loc and iloc
    order_loc = date.loc[:, ['detail_id', 'order_id']]
    print('Shape of the loc multi-column slice:\n', order_loc.shape)
    # Note: positions 1 and 2 are order_id and dishes_id, so this selects
    # different columns than the loc call above; only the shapes match
    order_iloc = date.iloc[:, [1, 2]]
    print('Shape of the iloc multi-column slice:\n', order_iloc.shape)

Output:

    Shape of the loc multi-column slice:
     (2779, 2)
    Shape of the iloc multi-column slice:
     (2779, 2)

    # Conditional slicing with loc: a boolean Series selects the rows
    order_id_417 = date.loc[date['order_id'] == 417, ['order_id', 'dishes_name']]
    print('Dishes with order_id equal to 417, filtered with loc:\n', order_id_417)

Output:

    Dishes with order_id equal to 417, filtered with loc:
        order_id           dishes_name
    0       417                蒜蓉生蚝
    1       417  蒙古烤羊腿\r\n\r\n\r\n
    2       417                大蒜苋菜
    3       417               芝麻烤紫菜
    4       417                 蒜香包

    # Conditional slicing with iloc: pass the mask as a plain boolean array (.values)
    order_id_417 = date.iloc[(date['order_id'] == 417).values, [1, 5]]
    print('Dishes with order_id equal to 417, filtered with iloc:\n', order_id_417)

Output:

    Dishes with order_id equal to 417, filtered with iloc:
        order_id           dishes_name
    0       417                蒜蓉生蚝
    1       417  蒙古烤羊腿\r\n\r\n\r\n
    2       417                大蒜苋菜
    3       417               芝麻烤紫菜
    4       417                 蒜香包

    # Change values in the DataFrame by assigning through a loc selection
    date.loc[date['order_id'] == 417, 'order_id'] = 666
    print('Rows with id=417 after changing 417 to 666:\n',
          date.loc[date['order_id'] == 417, 'order_id'])
    print('Rows with id=666 after changing 417 to 666:\n',
          date.loc[date['order_id'] == 666, 'order_id'])

Output:

    Rows with id=417 after changing 417 to 666:
     Series([], Name: order_id, dtype: int64)
    Rows with id=666 after changing 417 to 666:
     0    666
    1    666
    2    666
    3    666
    4    666
    Name: order_id, dtype: int64

Adding data to a DataFrame

    # Add a new column computed from two existing columns
    print('First five rows of counts and amounts:\n', date[['counts', 'amounts']].head())
    date['payment'] = date['counts'] * date['amounts']
    print('After adding the payment column:\n', date[['counts', 'amounts', 'payment']].head())

Output:

    First five rows of counts and amounts:
        counts  amounts
    0       1       49
    1       1       48
    2       1       30
    3       1       25
    4       1       13
    After adding the payment column:
        counts  amounts  payment
    0       1       49       49
    1       1       48       48
    2       1       30       30
    3       1       25       25
    4       1       13       13
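For readers without meal_order_detail.xlsx, the loc/iloc operations from this post can be tried on a small in-memory table. This is only a sketch: the `orders` frame, its index labels 10 to 13, and the counts and amounts values are made up for illustration and do not come from the spreadsheet. It shows that loc slices by label and includes the end label, that iloc slices by position and excludes the end position, and that both can filter with a boolean mask.

```python
import pandas as pd

# Hypothetical stand-in for the order table; the numbers are illustrative only.
orders = pd.DataFrame(
    {
        "order_id": [417, 417, 301, 774],
        "dishes_name": ["蒜蓉生蚝", "大蒜苋菜", "白斩鸡", "牛尾汤"],
        "counts": [1, 2, 1, 3],
        "amounts": [49, 48, 88, 10],
    },
    index=[10, 11, 12, 13],  # non-default labels so label and position differ
)

# loc slices by label, and the end label (12) is included: three rows.
by_label = orders.loc[10:12, ["order_id", "counts"]]

# iloc slices by integer position, and the end position (2) is excluded: two rows.
by_position = orders.iloc[0:2, [0, 2]]

# Boolean filtering: loc takes the boolean Series directly,
# iloc takes the underlying boolean array.
mask = orders["order_id"] == 417
filtered_loc = orders.loc[mask, ["order_id", "dishes_name"]]
filtered_iloc = orders.iloc[mask.values, [0, 1]]

# Assign through loc on a copy so the original frame stays unchanged,
# then derive a new column from two existing ones.
updated = orders.copy()
updated.loc[updated["order_id"] == 417, "order_id"] = 666
updated["payment"] = updated["counts"] * updated["amounts"]

print(by_label.shape)                      # (3, 2)
print(by_position.shape)                   # (2, 2)
print(filtered_loc.equals(filtered_iloc))  # True
print(updated["payment"].tolist())         # [49, 96, 88, 30]
```

Working on a copy is a choice made for this sketch; the post itself modifies the frame in place, which is also valid when the original values are no longer needed.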
