To remove duplicate words from a string in Python, you can use any of the following approaches:
1. Using a set:

```python
def remove_duplicate_words(sentence):
    words = sentence.split()
    unique_words = set(words)  # a set discards duplicates but not in any guaranteed order
    return ' '.join(unique_words)

sentence = "Python is great and Java is also great"
print(remove_duplicate_words(sentence))  # word order in the output is arbitrary
```
2. Using `dict.fromkeys()` (preserves order):

```python
def remove_duplicate_words_ordered(sentence):
    words = sentence.split()
    # dicts preserve insertion order (Python 3.7+), so duplicates are dropped
    # while the first occurrence of each word keeps its position
    unique_words = list(dict.fromkeys(words))
    return ' '.join(unique_words)

sentence = "Python is great and Java is also great"
print(remove_duplicate_words_ordered(sentence))  # Python is great and Java also
```
3. Using a list comprehension:

```python
def remove_duplicate_words_list_comprehension(sentence):
    words = sentence.split()
    # keep a word only if it has not already appeared earlier in the list
    unique_words = [word for i, word in enumerate(words) if word not in words[:i]]
    return ' '.join(unique_words)

sentence = "Python is great and Java is also great"
print(remove_duplicate_words_list_comprehension(sentence))  # Python is great and Java also
```
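Note that the comprehension above rescans the prefix `words[:i]` for every word, which is O(n²) overall. A common alternative (a sketch, not part of the original article) keeps an auxiliary `seen` set for O(1) membership tests while still preserving order:

```python
def remove_duplicate_words_seen(sentence):
    """Order-preserving deduplication with constant-time membership checks."""
    seen = set()
    unique_words = []
    for word in sentence.split():
        if word not in seen:
            seen.add(word)          # remember the word for future checks
            unique_words.append(word)  # keep its first occurrence in order
    return ' '.join(unique_words)

sentence = "Python is great and Java is also great"
print(remove_duplicate_words_seen(sentence))  # Python is great and Java also
```

This is the classic "seen set" idiom and scales linearly with the number of words.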
4. Using the `nltk` library for tokenization and deduplication (preserves order):

```python
import nltk

nltk.download('punkt')  # download the tokenizer data once, not on every call

def remove_duplicate_words_nltk(sentence):
    tokens = nltk.word_tokenize(sentence)  # smarter than split(): separates punctuation
    unique_tokens = list(dict.fromkeys(tokens))
    return ' '.join(unique_tokens)

sentence = "The Sky is blue also the ocean is blue also Rainbow has a blue colour."
print(remove_duplicate_words_nltk(sentence))
```
Each approach has trade-offs, so choose according to your needs. A set discards the original word order, while `dict.fromkeys()` and the list comprehension preserve it; the list comprehension is also quadratic in the number of words because of its repeated prefix scans. For more sophisticated handling, such as case-insensitive matching or lemmatization, consider a natural-language-processing library such as `nltk` or `spaCy`.
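As a minimal sketch of the case-insensitive variant mentioned above (the function name is invented for illustration), the seen-set pattern extends naturally by comparing casefolded keys while keeping each word's original spelling:

```python
def remove_duplicate_words_casefold(sentence):
    """Case-insensitive, order-preserving dedup; keeps each word's first spelling."""
    seen = set()
    unique_words = []
    for word in sentence.split():
        key = word.casefold()  # casefold() handles Unicode casing more robustly than lower()
        if key not in seen:
            seen.add(key)
            unique_words.append(word)
    return ' '.join(unique_words)

print(remove_duplicate_words_casefold("The Sky is blue also the ocean is blue"))
# The Sky is blue also ocean
```

Here "The" and "the" count as the same word, and the first spelling encountered is the one kept.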