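I have looked at many answers whose goal is to find the occurrences of every word in a file, a large string, or even an array. That is not what I want to do, and my strings do not come from a text file.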
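Given a large string, say one the size of a file, how do I count the frequency of each array element (including elements that contain spaces between words) within that large string?

import string
from collections import Counter

def calculate_commonness(context, links):
    c = Counter()
    # Strip punctuation, then split on whitespace into single words
    content = context.translate(str.maketrans("", "", string.punctuation)).split()
    for word in content:
        if word in links:
            c[word] += 1
    print(c)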
context = "It was November. Although it was November November November Passage not yet late, the sky was dark when I turned into Laundress Passage. Father had finished for the day, switched off the shop lights and closed the shutters; but so I would not come home to darkness he had left on the light over the stairs to the flat. Through the glass in the door it cast a foolscap rectangle of paleness onto the wet pavement, and it was while I was standing in that rectangle, about to turn my key in the door, that I first saw the letter. Another white rectangle, it was on the fifth step from the bottom, where I couldn\'t miss it."
links = ['November', 'Laundress', 'Passage', 'Father had']
# My output should look (something) like this:
# November = 4
# Laundress = 1
# Passage = 2
# Father had = 1
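This finds 'November', 'Laundress', and 'Passage', but not 'Father had'. I need to be able to find string elements that contain spaces. I know this happens because splitting the context on whitespace returns "Father" and "had" as separate words. So how do I split the context appropriately, or should I use regex findall instead?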
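To make the regex findall idea concrete, here is a minimal sketch (Python 3) that searches the unsplit string once per phrase instead of splitting it into words first. The function name `count_phrases` and the `sample` / `sample_links` values are made up for illustration. The sketch assumes each phrase begins and ends with a word character, so the `\b` anchors behave as expected, and that non-overlapping, case-sensitive matches are what is wanted.

```python
import re
from collections import Counter

def count_phrases(text, phrases):
    """Count non-overlapping whole-word occurrences of each phrase in text."""
    counts = Counter()
    for phrase in phrases:
        # re.escape protects any regex metacharacters inside the phrase;
        # the \b anchors stop 'Passage' from matching inside 'Passages'.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        counts[phrase] = len(re.findall(pattern, text))
    return counts

# Hypothetical sample data, shortened from the question's text
sample = ("It was November. Although it was November November, "
          "Father had left. Father had gone down Laundress Passage.")
sample_links = ["November", "Passage", "Father had"]

print(count_phrases(sample, sample_links))
```

Because the text is never split, a multi-word element such as 'Father had' is matched as a unit. Called with the `context` and `links` from the question, this approach should give the counts listed in the desired output (November 4, Laundress 1, Passage 2, Father had 1), though it scans the string once per phrase, which may matter for very long phrase lists.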
Edit:
Using the context as one big string, I have:
^{pr2}$
Returns: Counter({'Laundress': 0, 'November': 0, 'Father had': 0, 'Passage': 0})