| Author | Commit | Message | Date |
|---|---|---|---|
| Sun Junyi | 0f4f9067c3 | fix bugs in jieba for py3k | 2013-03-21 11:10:57 +08:00 |
| Sun Junyi | d58402c8f6 | for issue 26 | 2013-02-18 10:31:20 +08:00 |
| Sun Junyi | 981d58e106 | for issue 26 | 2013-02-18 10:20:17 +08:00 |
| Sun Junyi | 182289c2eb | for issue 25 | 2013-02-17 17:25:40 +08:00 |
| Sun Junyi | 13e3850ba8 | try to solve this issue: https://github.com/fxsjy/jieba/issues/25 | 2013-02-17 17:06:47 +08:00 |
| Sun Junyi | 1edc1651ee | try to fix this issue: https://github.com/fxsjy/jieba/issues/26 | 2013-02-17 16:04:51 +08:00 |
| Sun Junyi | fd20cbbd4b | use logarithmic addition instead of multiplication, to avoid bad case in issue19 | 2012-12-28 11:29:51 +08:00 |
| Sun Junyi | 06ebc6f71c | en-chn mix words in POS | 2012-12-12 14:24:44 +08:00 |
| Sun Junyi | 379cd4933a | support en-chn mixed words, like B超 | 2012-12-12 11:03:29 +08:00 |
| Sun Junyi | 9c07d80edb | first py3k version of jieba | 2012-11-28 10:50:40 +08:00 |
| Sun Junyi | 5ce72e76b1 | add new method: cut_for_search(sentence), which can get better recall rate for search engine's reverse index | 2012-11-27 13:37:40 +08:00 |
| Sun Junyi | 80bf2fec30 | Merge branch 'master' of https://github.com/fxsjy/jieba | 2012-11-23 16:01:25 +08:00 |
| Sun Junyi | 400889b25c | enhance cut_all=True mode | 2012-11-23 15:59:15 +08:00 |
| Felix Yan | 085b09c3ea | add file-like object support | 2012-11-21 18:07:19 +08:00 |
| Sun Junyi | ddc48d792f | remove near_char_tab.txt | 2012-11-06 14:09:22 +08:00 |
| Sun Junyi | 7fad14a61c | remove tags.txt | 2012-11-06 12:48:36 +08:00 |
| Sun Junyi | 193bfee1d4 | use only one dictionary | 2012-11-06 11:01:31 +08:00 |
| Sun Junyi | 59c3efeb2f | improve speed of tagging | 2012-11-06 10:32:00 +08:00 |
| fxsjy | 1a2a64a13f | one more example of POS tagging | 2012-11-06 07:44:39 +08:00 |
| fxsjy | 90cd4b3014 | improve POS tagging | 2012-11-06 07:17:26 +08:00 |
| Sun Junyi | 7612a62115 | remove useless data & code | 2012-11-05 16:16:06 +08:00 |
| Sun Junyi | 051f43c1d7 | Part of Speech Tagging | 2012-11-05 16:09:41 +08:00 |
| Sun Junyi | d040e92987 | new interface: load_userdict(file_name) | 2012-10-25 17:06:39 +08:00 |
| Sun Junyi | 14faea710b | use file cache to improve the loading speed after the first time of importing | 2012-10-25 12:18:33 +08:00 |
| Sun Junyi | 3fe92f8520 | new feature: tag extraction | 2012-10-16 12:54:48 +08:00 |
| Sun Junyi | c5e5bbc9c7 | sorted dictionary make the loading 10% faster | 2012-10-12 09:58:39 +08:00 |
| Sun Junyi | 7acba8cd54 | improve chinese name recognition | 2012-10-09 14:04:16 +08:00 |
| fxsjy | ef0c0284ff | improve speed | 2012-10-09 06:37:01 +08:00 |
| fxsjy | cd94e69241 | fix a bug | 2012-10-08 21:27:01 +08:00 |
| fxsjy | c8b1cb0c88 | remove a bug prone role | 2012-10-08 20:52:35 +08:00 |
| fxsjy | 9180b90ae3 | make model loading more faster | 2012-10-06 18:28:52 +08:00 |
| fxsjy | 7b2439afed | merge some new words | 2012-10-05 07:34:05 +08:00 |
| fxsjy | 445c935b57 | optimize dictionary | 2012-10-04 22:54:56 +08:00 |
| fxsjy | 164b782c4e | improve the speed | 2012-10-04 13:10:56 +08:00 |
| fxsjy | 51765aa6dd | first commit | 2012-10-01 15:25:06 +08:00 |
| Sun Junyi | 6f6e812afb | first commit | 2012-09-29 15:54:04 +08:00 |
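
Commit fd20cbbd4b mentions using logarithmic addition instead of multiplication. The log does not show the change itself, so the following is only a minimal sketch of the general technique, not the code from that commit. It assumes the scores being combined are word probabilities along a segmentation path; the function names and sample values are hypothetical.

```python
import math

# Hypothetical helpers for illustration; not taken from the jieba source.

def path_prob_product(probs):
    """Combine probabilities by multiplication (prone to underflow)."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def path_score_log_sum(probs):
    """Combine probabilities by summing their logarithms."""
    return sum(math.log(p) for p in probs)

# Assumed sample data: 100 words, each with probability 1e-5.
word_probs = [1e-5] * 100

print(path_prob_product(word_probs))   # underflows to 0.0
print(path_score_log_sum(word_probs))  # finite negative score
```

Because the logarithm is monotonic, comparing log sums ranks candidate paths the same way as comparing products would, while staying within floating-point range for long inputs.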