73 Commits

Author SHA1 Message Date
Sun Junyi
a8f902545c fix some bad cases 2013-05-15 18:21:08 +08:00
fxsjy
aae91b6fb6 merge change from master to jieba3k 2013-04-27 16:04:16 +08:00
fxsjy
3f003e2f29 new method: jieba.disable_parallel, which is the inverse operation of jieba.enable_parallel 2013-04-22 12:35:17 +08:00
fxsjy
b46166f768 use CRLF as separator to make chunks in parallel mode 2013-04-20 18:46:04 +08:00
fxsjy
62cf22121f new feature: parallel segment with multiprocessing 2013-04-20 14:11:31 +08:00
Sun Junyi
6da857b554 merge changes from master branch 2013-04-19 10:21:34 +08:00
Sun Junyi
012fddf13f ignore white space 2013-04-12 22:37:53 +08:00
fxsjy
45591bb9ab support flag '_'; ignore white space 2013-04-12 21:53:03 +08:00
Sun Junyi
c77823aa1d merge improvement to Py3k branch 2013-04-12 14:58:25 +08:00
Sun Junyi
94ad7e7035 support decimal point 2013-04-08 09:53:04 +08:00
Sun Junyi
72fff6c8e2 support decimal point 2013-04-08 09:40:32 +08:00
Sun Junyi
659326c4e1 punctuation; improve keywords extraction 2013-04-06 14:02:11 +08:00
Sun Junyi
7d227da5c4 punctuation 2013-04-05 22:49:16 +08:00
Sun Junyi
58c363655c support user defined word tag 2013-03-25 17:28:37 +08:00
Sun Junyi
0f4f9067c3 fix bugs in jieba for py3k 2013-03-21 11:10:57 +08:00
Sun Junyi
06ebc6f71c en-chn mix words in POS 2012-12-12 14:24:44 +08:00
Sun Junyi
9c07d80edb first py3k version of jieba 2012-11-28 10:50:40 +08:00
Sun Junyi
193bfee1d4 use only one dictionary 2012-11-06 11:01:31 +08:00
Sun Junyi
59c3efeb2f improve speed of tagging 2012-11-06 10:32:00 +08:00
fxsjy
1a2a64a13f one more example of POS tagging 2012-11-06 07:44:39 +08:00
fxsjy
90cd4b3014 improve POS tagging 2012-11-06 07:17:26 +08:00
Sun Junyi
7612a62115 remove useless data & code 2012-11-05 16:16:06 +08:00
Sun Junyi
051f43c1d7 Part of Speech Tagging 2012-11-05 16:09:41 +08:00