Dingyuan Wang
|
6fad5fbb2c
|
update to v0.33
|
2014-09-06 23:28:47 +08:00 |
|
Dingyuan Wang
|
c04ccd0d12
|
Update to v0.32 according to the master branch.
|
2014-06-14 22:31:13 +08:00 |
|
ZoeyYoung
|
d49542c06e
|
fix bug
|
2013-08-21 19:31:12 +08:00 |
|
ZoeyYoung
|
dce353f88b
|
merge from master
|
2013-08-21 15:32:46 +08:00 |
|
ZoeyYoung
|
2857ae45cc
|
Merge branch 'master' into jieba3k
Conflicts:
Changelog
jieba/__init__.py
jieba/finalseg/__init__.py
jieba/posseg/__init__.py
setup.py
test/parallel/test_file.py
test/test_file.py
|
2013-08-21 13:55:21 +08:00 |
|
fxsjy
|
b77645b3aa
|
modify test_file.py; use less memory
|
2013-07-29 10:17:39 +08:00 |
|
Linker Lin
|
5d83855088
|
Automatically detect the number of CPUs and start an appropriate number of processes.
|
2013-07-28 00:12:00 +08:00 |
|
Linker Lin
|
2ceb981da0
|
Automatically detect the number of CPUs and start an appropriate number of processes.
|
2013-07-28 00:07:29 +08:00 |
|
Sun Junyi
|
6549deabbd
|
merge change from master
|
2013-07-16 11:06:41 +08:00 |
|
Cheng wei
|
6035bb6320
|
fix invalid syntax for python3
|
2013-07-06 02:52:17 +08:00 |
|
Sun Junyi
|
9d0ea771a5
|
fix bug; decimals & digit-english mixed
|
2013-07-05 16:16:49 +08:00 |
|
Sun Junyi
|
ba5114dc95
|
update whoosh example
|
2013-07-04 09:31:09 +08:00 |
|
Sun Junyi
|
f424862222
|
clean the files in tmp
|
2013-07-03 17:55:01 +08:00 |
|
Sun Junyi
|
b18d56d2a3
|
Merge pull request #72 from linkerlin/master
Add a tmp directory so that test_whoosh.py can run.
|
2013-07-03 02:52:46 -07:00 |
|
Sun Junyi
|
b9b1f1a418
|
fix conflict of merging
|
2013-07-03 17:47:45 +08:00 |
|
miao.lin
|
becd32b178
|
made test_whoosh.py happy.
Add a tmp directory so that test_whoosh.py can run.
|
2013-07-03 17:32:35 +08:00 |
|
Sun Junyi
|
c01680c6a8
|
merge the new file
|
2013-07-03 17:29:33 +08:00 |
|
Sun Junyi
|
b62f052927
|
PEP8
|
2013-07-03 17:21:21 +08:00 |
|
Sun Junyi
|
45daf561c7
|
follow PEP8: change tab to 4 white spaces
|
2013-07-03 16:58:22 +08:00 |
|
Sun Junyi
|
dbec3ad9df
|
add some comments
|
2013-07-01 11:20:56 +08:00 |
|
Sun Junyi
|
efc784312c
|
add ChineseAnalyzer for whoosh search engine
|
2013-07-01 10:53:39 +08:00 |
|
Sun Junyi
|
f08690a2df
|
add 'search mode' for jieba.tokenize
|
2013-06-28 12:04:16 +08:00 |
|
Sun Junyi
|
cb1b0499f7
|
unittest for jieba.tokenize
|
2013-06-24 16:20:04 +08:00 |
|
Sun Junyi
|
11a3b10755
|
new method: jieba.tokenize
|
2013-06-24 16:14:11 +08:00 |
|
Sun Junyi
|
ca97b19951
|
merge change from master
|
2013-06-23 22:28:32 +08:00 |
|
Sun Junyi
|
c0816b9bb0
|
more mixed words
|
2013-06-18 18:09:55 +08:00 |
|
Sun Junyi
|
c9e8da9e63
|
add more mix words to dict.txt
|
2013-06-18 14:10:36 +08:00 |
|
fxsjy
|
08bfabb9d7
|
Merge branch 'jieba3k' of https://github.com/fxsjy/jieba into jieba3k
|
2013-06-08 11:30:07 +08:00 |
|
fxsjy
|
be1686654d
|
merge master to jieba3k
|
2013-06-08 11:18:56 +08:00 |
|
fxsjy
|
0087a4e7e3
|
adjust prob_trans for better support of named entities; fix some bad cases
|
2013-06-07 13:59:36 +08:00 |
|
Sun Junyi
|
4300f79788
|
add an example of using sklearn+jieba
|
2013-05-17 09:35:12 +08:00 |
|
Sun Junyi
|
a8f902545c
|
fix some bad cases
|
2013-05-15 18:21:08 +08:00 |
|
cloudaice
|
9ee20a5293
|
add generator test
|
2013-05-11 22:50:30 +02:00 |
|
cloudaice
|
0c050b5eb2
|
add jieba.posseg test case
|
2013-05-11 17:40:43 +02:00 |
|
cloudaice
|
b0f9e6721e
|
add cutall test case
|
2013-05-11 17:40:43 +02:00 |
|
cloudaice
|
a7ff398edc
|
add three test cases: cut, set_dictionary, cut_for_search
|
2013-05-11 17:40:43 +02:00 |
|
cloudaice
|
667203a9ae
|
replace tabs with spaces; use join instead of a loop
|
2013-05-11 17:40:43 +02:00 |
|
cloudaice
|
a2d2078465
|
replace tabs with spaces; use is to check whether an object is None
|
2013-05-11 17:40:42 +02:00 |
|
cloudaice
|
e0434871eb
|
reformat demo.py to comply with PEP8
|
2013-05-11 17:40:42 +02:00 |
|
Sun Junyi
|
c1bf815343
|
update test case
|
2013-05-02 17:01:16 +08:00 |
|
Sun Junyi
|
0e833cd441
|
fix a bug in py3k test case
|
2013-04-28 19:40:24 +08:00 |
|
Sun Junyi
|
273996f7d4
|
fix a test script in jieba3k
|
2013-04-27 16:18:40 +08:00 |
|
fxsjy
|
aae91b6fb6
|
merge change from master to jieba3k
|
2013-04-27 16:04:16 +08:00 |
|
Sun Junyi
|
94d455b079
|
hot fix of cut_all=True
|
2013-04-27 10:23:01 +08:00 |
|
Sun Junyi
|
59d5d3b811
|
fix bug and change version
|
2013-04-27 09:45:39 +08:00 |
|
fxsjy
|
8666428fb0
|
fix a bug of changing dictionary
|
2013-04-26 16:47:00 +08:00 |
|
fxsjy
|
9bebe6120b
|
utf-8 output is more friendly to Linux
|
2013-04-26 16:19:00 +08:00 |
|
Sun Junyi
|
d3339633d5
|
in the speed test: initialize first to ignore the time of dict loading
|
2013-04-26 14:51:58 +08:00 |
|
fxsjy
|
bc049090a5
|
make lazy load thread safe
|
2013-04-26 12:54:05 +08:00 |
|
fxsjy
|
b46166f768
|
use CRLF as separator to make chunks in parallel mode
|
2013-04-20 18:46:04 +08:00 |
|