mirror of https://github.com/fxsjy/jieba.git (synced 2025-07-10 00:01:33 +08:00)
commit 89c0659e0b
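@@ -56,7 +56,7 @@ Output:
 
 * Developers can specify their own custom dictionary to include words that are not in the jieba lexicon. Although jieba can recognize new words, adding them yourself ensures higher accuracy
 * Usage: jieba.load_userdict(file_name) # file_name is the path to the custom dictionary
-* The dictionary format is the same as dict.txt: one word per line; each line has two parts, the word and its frequency, separated by a space
+* The dictionary format is the same as `analyse/idf.txt`: one word per line; each line has two parts, the word and its frequency, separated by a space
 * Example:
 
 云计算 5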
@@ -163,7 +163,7 @@ Function 2): Add a custom dictionary
 
 * Developers can specify their own custom dictionary to include in the jieba thesaurus. jieba has the ability to identify new words, but adding your own new words can ensure a higher rate of correct segmentation.
 * Usage: `jieba.load_userdict(file_name) # file_name is a custom dictionary path`
-* The dictionary format is the same as that of `dict.txt`: one word per line; each line is divided into two parts, the first is the word itself, the other is the word frequency, separated by a space
+* The dictionary format is the same as that of `analyse/idf.txt`: one word per line; each line is divided into two parts, the first is the word itself, the other is the word frequency, separated by a space
 * Example:
 
 云计算 5
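The usage and file format described in the hunks above can be sketched as follows. This is a minimal illustration, not part of the commit: the helper names `write_userdict` and `parse_userdict` and the temporary file location are hypothetical, and the only jieba calls used are `jieba.load_userdict` (named in the README text) and `jieba.cut`. The sample entry is the one from the README example. The jieba step is skipped if the package is not installed.

```python
import os
import tempfile


def write_userdict(entries, path):
    """Write (word, frequency) pairs, one per line, separated by a space."""
    with open(path, "w", encoding="utf-8") as f:
        for word, freq in entries:
            f.write(f"{word} {freq}\n")


def parse_userdict(path):
    """Read the file back, checking that each line is 'word frequency'."""
    entries = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            parts = line.split(" ")
            if len(parts) != 2 or not parts[1].isdigit():
                raise ValueError(f"line {lineno}: expected 'word frequency', got {line!r}")
            entries.append((parts[0], int(parts[1])))
    return entries


# Hypothetical file location; the entry itself comes from the README example.
path = os.path.join(tempfile.mkdtemp(), "userdict.txt")
write_userdict([("云计算", 5)], path)
parsed = parse_userdict(path)

try:
    import jieba
except ImportError:
    jieba = None  # jieba is not installed; only the format check above runs

if jieba is not None:
    jieba.load_userdict(path)  # file_name is the path to the custom dictionary
    print("/".join(jieba.cut("云计算很重要")))
```

Checking the file before loading it is a design choice for the sketch only; the README does not say how jieba itself reacts to malformed lines.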