186 Commits

Author SHA1 Message Date
Sun Junyi
0f4f9067c3 fix bugs in jieba for py3k 2013-03-21 11:10:57 +08:00
Sun Junyi
d58402c8f6 for issue 26 2013-02-18 10:31:20 +08:00
Sun Junyi
981d58e106 for issue 26 2013-02-18 10:20:17 +08:00
Sun Junyi
182289c2eb for issue 25 2013-02-17 17:25:40 +08:00
Sun Junyi
13e3850ba8 try to solve this issue: https://github.com/fxsjy/jieba/issues/25 2013-02-17 17:06:47 +08:00
Sun Junyi
1edc1651ee try to fix this issue: https://github.com/fxsjy/jieba/issues/26 2013-02-17 16:04:51 +08:00
Sun Junyi
fd20cbbd4b use addition of logarithms instead of multiplication to avoid the bad case in issue 19 2012-12-28 11:29:51 +08:00
Sun Junyi
06ebc6f71c mixed English-Chinese words in POS tagging 2012-12-12 14:24:44 +08:00
Sun Junyi
379cd4933a support mixed English-Chinese words, like B超 2012-12-12 11:03:29 +08:00
Sun Junyi
9c07d80edb first py3k version of jieba 2012-11-28 10:50:40 +08:00
Sun Junyi
5ce72e76b1 add new method: cut_for_search(sentence), which gives a better recall rate for a search engine's inverted index 2012-11-27 13:37:40 +08:00
Sun Junyi
80bf2fec30 Merge branch 'master' of https://github.com/fxsjy/jieba 2012-11-23 16:01:25 +08:00
Sun Junyi
400889b25c enhance cut_all=True mode 2012-11-23 15:59:15 +08:00
Felix Yan
085b09c3ea add file-like object support 2012-11-21 18:07:19 +08:00
Sun Junyi
ddc48d792f remove near_char_tab.txt 2012-11-06 14:09:22 +08:00
Sun Junyi
7fad14a61c remove tags.txt 2012-11-06 12:48:36 +08:00
Sun Junyi
193bfee1d4 use only one dictionary 2012-11-06 11:01:31 +08:00
Sun Junyi
59c3efeb2f improve speed of tagging 2012-11-06 10:32:00 +08:00
fxsjy
1a2a64a13f one more example of POS tagging 2012-11-06 07:44:39 +08:00
fxsjy
90cd4b3014 improve POS tagging 2012-11-06 07:17:26 +08:00
Sun Junyi
7612a62115 remove useless data & code 2012-11-05 16:16:06 +08:00
Sun Junyi
051f43c1d7 Part of Speech Tagging 2012-11-05 16:09:41 +08:00
Sun Junyi
d040e92987 new interface: load_userdict(file_name) 2012-10-25 17:06:39 +08:00
Sun Junyi
14faea710b use a file cache to speed up loading after the first import 2012-10-25 12:18:33 +08:00
Sun Junyi
3fe92f8520 new feature: tag extraction 2012-10-16 12:54:48 +08:00
Sun Junyi
c5e5bbc9c7 sorted dictionary makes loading 10% faster 2012-10-12 09:58:39 +08:00
Sun Junyi
7acba8cd54 improve Chinese name recognition 2012-10-09 14:04:16 +08:00
fxsjy
ef0c0284ff improve speed 2012-10-09 06:37:01 +08:00
fxsjy
cd94e69241 fix a bug 2012-10-08 21:27:01 +08:00
fxsjy
c8b1cb0c88 remove a bug-prone role 2012-10-08 20:52:35 +08:00
fxsjy
9180b90ae3 make model loading faster 2012-10-06 18:28:52 +08:00
fxsjy
7b2439afed merge some new words 2012-10-05 07:34:05 +08:00
fxsjy
445c935b57 optimize dictionary 2012-10-04 22:54:56 +08:00
fxsjy
164b782c4e improve the speed 2012-10-04 13:10:56 +08:00
fxsjy
51765aa6dd first commit 2012-10-01 15:25:06 +08:00
Sun Junyi
6f6e812afb first commit 2012-09-29 15:54:04 +08:00