
Python: mapping the elements of a pandas column against another pandas DataFrame

CocoaChina 10-23

I have a pandas DataFrame A with a column keywords:

    keywords
    ['loans', 'mercedez', 'bugatti', 'a4']
    ['trump', 'usa', 'election', 'president']
    ['galaxy', '7s', 'canon', 'macbook']
    ['beiber', 'spiderman', 'marvels', 'ironmen']
    .........................................

I also have another pandas DataFrame B with the columns category and words, where words is a comma-separated string, like this:

    category         words
    audi             audi a4,audi a6
    bugatti          bugatti veyron, bugatti chiron
    mercedez         mercedez s-class, mercedez e-class
    dslr             canon, nikon
    apple            iphone 7s,iphone 6s,iphone 5
    finance          sales,loans,sales price
    politics         donald trump, election, votes
    entertainment    spiderman,captain america, ironmen
    music            justin beiber, rihana,drake
    ........         .......................

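I want to map the keywords column of DataFrame A against the words column of DataFrame B and assign the corresponding categories. Each keyword should be matched against every individual word in the strings of the words column. For example, the keyword a4 should be compared with both words of the string audi a4 in the words column. The expected result is: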

    keywords                                         matched_category
    ['loans', 'mercedez', 'bugatti', 'a4']           ['finance', 'mercedez', 'mercedez', 'bugatti', 'bugatti', 'audi']
    ['trump', 'usa', 'election', 'president']        ['politics', 'politics']
    ['galaxy', '7s', 'canon', 'macbook']             ['apple', 'dslr']
    ['beiber', 'spiderman', 'marvels', 'ironmen']    ['music', 'entertainment', 'entertainment', 'entertainment']
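The matching rule described in the question can be sketched as a word-level lookup table. This is only an illustration under stated assumptions: the helper names build_word_index and categories_for are hypothetical, only three rows of B are reproduced, and duplicates are dropped, whereas the expected result in the question repeats some categories.

```python
import pandas as pd

# Small sample of DataFrame B from the question (not the full table).
B = pd.DataFrame({
    "category": ["audi", "finance", "politics"],
    "words": ["audi a4,audi a6", "sales,loans,sales price",
              "donald trump, election, votes"],
})


def build_word_index(frame):
    """Map every single word found in `words` to the categories that contain it."""
    index = {}
    for category, phrases in zip(frame["category"], frame["words"]):
        for phrase in phrases.split(","):
            # str.split() with no argument also strips the stray spaces after commas.
            for word in phrase.split():
                index.setdefault(word, set()).add(category)
    return index


def categories_for(keywords, index):
    """Collect categories in keyword order, skipping ones already collected."""
    matched = []
    for kw in keywords:
        for category in sorted(index.get(kw, ())):
            if category not in matched:
                matched.append(category)
    return matched


word_index = build_word_index(B)
result = categories_for(["loans", "mercedez", "bugatti", "a4"], word_index)
print(result)
```

With only these three rows of B, mercedez and bugatti find no match because their categories are not in the sample; against the full table they would.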

One approach is to use the transform method of a pandas Series:

    import pandas as pd

    A = pd.DataFrame({'keywords': [['loans', 'mercedez', 'bugatti', 'a4'],
                                   ['trump', 'usa', 'election', 'president']]})
    B = pd.DataFrame({'category': ['audi', 'finance'],
                      'words': ['audi a4,audi a6', 'sales,loans,sales price']})

    def match_category_to_keywords(kws):
        ret = []
        for kw in kws:
            # Boolean mask: True for rows of B where any comma-separated
            # phrase contains the keyword as a substring.
            m = B['words'].transform(
                lambda words: any(kw in w for w in words.split(',')))
            ret.extend(B['category'].loc[m].tolist())
        # Drop duplicate categories and return them sorted.
        return sorted(set(ret))

    A['matched_category'] = A['keywords'].transform(match_category_to_keywords)
    print(A)

Output:

                                keywords matched_category
    0     [loans, mercedez, bugatti, a4]  [audi, finance]
    1  [trump, usa, election, president]               []
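As a possible alternative, the same two sample frames can be matched without a Python-level loop over B for every keyword, by exploding both sides to one word per row and joining them. This is a sketch rather than part of the original answer; the intermediate names words, kw, merged and matched are illustrative, and it matches whole words only, whereas the transform version above matches substrings.

```python
import pandas as pd

A = pd.DataFrame({"keywords": [["loans", "mercedez", "bugatti", "a4"],
                               ["trump", "usa", "election", "president"]]})
B = pd.DataFrame({"category": ["audi", "finance"],
                  "words": ["audi a4,audi a6", "sales,loans,sales price"]})

# One row per (category, single word): split on commas and whitespace.
words = (B.assign(word=B["words"].str.split(r"[,\s]+"))
           .explode("word")[["category", "word"]]
           .drop_duplicates())
words = words[words["word"] != ""]  # guard against leading/trailing separators

# One row per (row of A, keyword); the original row label lands in column "index".
kw = A["keywords"].explode().rename("word").reset_index()

# Left join keeps every keyword; unmatched ones get NaN as category.
merged = kw.merge(words, on="word", how="left")
matched = (merged.dropna(subset=["category"])
                 .groupby("index")["category"]
                 .agg(list))

# Rows of A without any match come back as NaN after reindexing; turn them into [].
A["matched_category"] = matched.reindex(A.index).apply(
    lambda v: sorted(set(v)) if isinstance(v, list) else [])
print(A)
```

For these two rows the result agrees with the output above. On the full table the two approaches can differ, because a keyword such as 7s would match iphone 7s by whole word here but would also match any longer string containing 7s under the substring rule.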
