10 Commits

Author SHA1 Message Date
Dingyuan Wang
ceb5c26be4 fix self.FREQ in cut_for_search; make pair object iterable 2015-06-01 14:36:38 +08:00
Dingyuan Wang
94840a734c wraps most globals in classes
API changes:
* class jieba.Tokenizer, jieba.posseg.POSTokenizer
* class jieba.analyse.TFIDF, jieba.analyse.TextRank
* global functions are mapped to jieba.(posseg.)dt, the default (POS)Tokenizer
* multiprocessing only works with jieba.(posseg.)dt
* new lcut, lcut_for_search functions that returns a list
* jieba.analyse.textrank now returns 20 items by default

Tests:
* added test_lock.py to test multithread locking
* demo.py now contains most of the examples in README
2015-05-09 21:29:05 +08:00
Dingyuan Wang
22bcf8be7a Merge master and jieba3k, make the code Python 2/3 compatible 2015-02-10 20:54:55 +08:00
Dingyuan Wang
51df77831b use prefix dict instead of trie, add a command line interface, and a few small improvements 2014-10-18 22:23:26 +08:00
Dingyuan Wang
6fad5fbb2c update to v0.33 2014-09-06 23:28:47 +08:00
cloudaice
e0434871eb 修改demo.py的代码格式,使得符合pep8规范 2013-05-11 17:40:42 +02:00
Sun Junyi
9c07d80edb first py3k version of jieba 2012-11-28 10:50:40 +08:00
Sun Junyi
e0bd9a6a50 version chage; doc update 2012-11-27 14:06:46 +08:00
Sun Junyi
15a5a2d50e add a sample script about tags extraction 2012-10-16 13:25:35 +08:00
fxsjy
64b3c0d0e0 add one more example 2012-10-06 14:50:10 +08:00