211 Commits

Author SHA1 Message Date
Sun Junyi
d4ede0fee6 hold the backward compatibility, let jython use a special loading workflow 2013-07-25 10:08:58 +08:00
piaolignxue
aea8496b1f serialize model to file so that it can support jython. 2013-07-24 22:50:48 +08:00
Sun Junyi
6549deabbd merge change from master 2013-07-16 11:06:41 +08:00
Sun Junyi
d691d91674 fix a bug about ImportError 2013-07-15 09:32:52 +08:00
Sun Junyi
d63140fe5e make a series of white spaces separated 2013-07-10 17:27:47 +08:00
Richard Wong
c2ded83ead Refactor: fix line indent to 4.
* jieba/__init__.py (cut):
2013-07-10 16:22:49 +08:00
Richard Wong
99d2492d67 Add re.U flag to re variable. 2013-07-10 16:22:17 +08:00
Richard Wong
fbfaac2eaa Reindent function
* jieba/__init__.py (require_initialized):
2013-07-08 13:54:36 +08:00
Richard Wong
7bfd432fc5 Remove the unused imports. 2013-07-08 13:51:39 +08:00
Cheng wei
27cf9cfd62 fix invalid syntax
* python3.2 does not support unicode literals
* unicode regex as normal
2013-07-06 02:51:13 +08:00
Sun Junyi
9d0ea771a5 fix bug; decimals & digit-english mixed 2013-07-05 16:16:49 +08:00
Sun Junyi
b9b1f1a418 fix conflict of merging 2013-07-03 17:47:45 +08:00
Sun Junyi
c01680c6a8 merge the new file 2013-07-03 17:29:33 +08:00
Sun Junyi
b62f052927 PEP8 2013-07-03 17:21:21 +08:00
Sun Junyi
9ea14a8a54 merge change from chao78787 2013-07-03 17:07:16 +08:00
Sun Junyi
45daf561c7 follow PEP8: change tab to 4 white spaces 2013-07-03 16:58:22 +08:00
Richard Wong
3246236133 Separate cal and IO process. 2013-07-03 15:03:45 +08:00
Sun Junyi
efc784312c add ChineseAnalyzer for whoosh search engine 2013-07-01 10:53:39 +08:00
Sun Junyi
f08690a2df add 'search mode' for jieba.tokenize 2013-06-28 12:04:16 +08:00
Sun Junyi
237dc6625e add mix words to extra_dict/dict.txt.big 2013-06-26 09:36:41 +08:00
Sun Junyi
11a3b10755 new method: jieba.tokenize 2013-06-24 16:14:11 +08:00
Sun Junyi
1a3be67691 make cache dumping more robust 2013-06-24 13:48:16 +08:00
Sun Junyi
ca97b19951 merge change from master 2013-06-23 22:28:32 +08:00
Sun Junyi
38b6bcd54e remove some words 2013-06-23 21:52:22 +08:00
fxsjy
e1afafe353 fix a bug of cxfree support 2013-06-23 12:50:28 +08:00
fxsjy
a9f53e9c85 don't separate CRLF 2013-06-22 21:56:39 +08:00
fxsjy
c015f4e297 support cxfree py2exe; keep white space 2013-06-22 21:24:45 +08:00
fxsjy
7343679ba8 fix a bug in parallel mode 2013-06-21 15:09:27 +08:00
Sun Junyi
c0816b9bb0 more mixed words 2013-06-18 18:09:55 +08:00
Sun Junyi
c9e8da9e63 add more mix words to dict.txt 2013-06-18 14:10:36 +08:00
Sun Junyi
9d1e23ce6f speed up the viterbi 2013-06-16 13:21:43 +08:00
Sun Junyi
b050bfe946 remove some useless words 2013-06-08 15:40:01 +08:00
fxsjy
be1686654d merge master to jieba3k 2013-06-08 11:18:56 +08:00
fxsjy
e12e176d17 rollback, seems no obvious speed up by the previous change 2013-06-07 15:51:48 +08:00
fxsjy
d3531f197d rollback, seems no obvious speed up by the previous change 2013-06-07 15:51:13 +08:00
fxsjy
f2d6abf063 speed up of viterbi 2013-06-07 14:41:55 +08:00
fxsjy
0087a4e7e3 adjust prob_trans for better support of named entities; fix some bad cases 2013-06-07 13:59:36 +08:00
cloudaice
dfc807e65b Don't lose information about a function when using a decorator 2013-05-23 00:25:45 +02:00
Sun Junyi
a8f902545c fix some bad cases 2013-05-15 18:21:08 +08:00
cloudaice
9b0f60df93 Catch explicit errors 2013-05-10 11:26:27 +02:00
cloudaice
8ba8735f46 Use a more explicit expression 2013-05-10 11:09:41 +02:00
Sun Junyi
ff4ea5d882 fix a bug of file leak 2013-05-02 11:24:22 +08:00
Sun Junyi
35aa38ed12 fix a bug caused by default argument binding 2013-04-28 12:04:16 +08:00
fxsjy
aae91b6fb6 merge change from master to jieba3k 2013-04-27 16:04:16 +08:00
Sun Junyi
94d455b079 hot fix of cut_all=True 2013-04-27 10:23:01 +08:00
Sun Junyi
59d5d3b811 fix bug and change version 2013-04-27 09:45:39 +08:00
fxsjy
c8df565981 more log trace for troubleshooting 2013-04-26 17:43:24 +08:00
fxsjy
04eb4f08cf fix a bug of changing dictionary 2013-04-26 16:48:46 +08:00
fxsjy
bc049090a5 make lazy load thread safe 2013-04-26 12:54:05 +08:00
fxsjy
d2460029d5 merge lazy load 2013-04-26 09:57:06 +08:00