Dingyuan Wang
|
872a7039f2
|
Merge branch 'master' of https://github.com/fxsjy/jieba
|
2015-02-12 10:33:56 +08:00 |
|
Dingyuan Wang
|
f808ea0ebb
|
use only one dict to store words and prefixes
|
2015-02-12 10:31:52 +08:00 |
|
fxsjy
|
5bfa43a781
|
fix test scripts
|
2015-02-11 20:46:48 +08:00 |
|
Dingyuan Wang
|
f3a53dd2da
|
fix print() in tests
|
2015-02-11 20:45:55 +08:00 |
|
fxsjy
|
8cbb26a7b6
|
fix test_file.py
|
2015-02-11 16:47:57 +08:00 |
|
Dingyuan Wang
|
22bcf8be7a
|
Merge master and jieba3k, make the code Python 2/3 compatible
|
2015-02-10 20:54:55 +08:00 |
|
Dingyuan Wang
|
3dad899ec8
|
backport 2to3 scripts and changelog
|
2014-11-29 16:12:25 +08:00 |
|
Dingyuan Wang
|
c6b386f65b
|
update jieba3k
|
2014-11-29 16:06:20 +08:00 |
|
Dingyuan Wang
|
a5ecf70f71
|
update to v0.35
|
2014-11-14 20:59:54 +08:00 |
|
Dingyuan Wang
|
4a6140081e
|
fix problems in auto2to3
|
2014-11-07 23:47:57 +08:00 |
|
Dingyuan Wang
|
7a6caa0c3c
|
port extract_tags, etc to jieba3k; add auto2to3 script
|
2014-11-07 23:33:31 +08:00 |
|
walkskyer
|
6772f0282e
|
Fix wrong call order when printing results in the weighted test script
|
2014-11-06 22:24:43 +08:00 |
|
Dingyuan Wang
|
fd9f1f2c0e
|
update README, textrank, etc.
|
2014-10-25 14:23:37 +08:00 |
|
fxsjy
|
f5ca87e088
|
merge change of @fukuball
|
2014-10-23 15:59:08 +08:00 |
|
Dingyuan Wang
|
bb1e6000c6
|
fix version; fix spaces at end of line
|
2014-10-19 10:57:46 +08:00 |
|
Dingyuan Wang
|
51df77831b
|
use prefix dict instead of trie, add a command line interface, and a few small improvements
|
2014-10-18 22:23:26 +08:00 |
|
Dingyuan Wang
|
6fad5fbb2c
|
update to v0.33
|
2014-09-06 23:28:47 +08:00 |
|
Fukuball Lin
|
b658ee69cb
|
Let jieba add its own stop words corpus
1. Add a sample stop words corpus
2. To let jieba switch stop words corpora, add the set_stop_words method and rewrite extract_tags
3. Add the extract_tags_stop_words.py test example to test
|
2014-08-06 03:35:16 +08:00 |
|
Fukuball Lin
|
7198d562f1
|
Let jieba switch idf corpora
1. Add a Traditional Chinese idf corpus
2. To let jieba switch idf corpora, add the get_idf and set_idf_path methods and rewrite extract_tags
3. Add extract_tags_idfpath to test
|
2014-08-05 22:55:13 +08:00 |
|
Dingyuan Wang
|
c04ccd0d12
|
Update to v0.32 according to the master branch.
|
2014-06-14 22:31:13 +08:00 |
|
fxsjy
|
18678d50c6
|
fix bug issue #132
|
2014-01-28 13:48:03 +08:00 |
|
gan
|
31d5845535
|
add better support for English, e.g. input 'this is interesting and interested me' --> output 'this interest interest', where 'interest' matches both 'interesting' and 'interested'
|
2013-09-09 11:54:30 +08:00 |
|
Sun Junyi
|
7e7fcc1184
|
add an option to disable HMM
|
2013-09-05 17:09:27 +08:00 |
|
ZoeyYoung
|
d49542c06e
|
fix bug
|
2013-08-21 19:31:12 +08:00 |
|
ZoeyYoung
|
dce353f88b
|
merge from master
|
2013-08-21 15:32:46 +08:00 |
|
ZoeyYoung
|
2857ae45cc
|
Merge branch 'master' into jieba3k
Conflicts:
Changelog
jieba/__init__.py
jieba/finalseg/__init__.py
jieba/posseg/__init__.py
setup.py
test/parallel/test_file.py
test/test_file.py
|
2013-08-21 13:55:21 +08:00 |
|
Sun Junyi
|
81390a2d23
|
test_file.py: close the file object
|
2013-08-02 15:51:33 +08:00 |
|
fxsjy
|
b77645b3aa
|
modify test_file.py; use less memory
|
2013-07-29 10:17:39 +08:00 |
|
Linker Lin
|
5d83855088
|
Automatically detect the number of CPUs and start an appropriate number of processes.
|
2013-07-28 00:12:00 +08:00 |
|
Linker Lin
|
2ceb981da0
|
Automatically detect the number of CPUs and start an appropriate number of processes.
|
2013-07-28 00:07:29 +08:00 |
|
Sun Junyi
|
6549deabbd
|
merge change from master
|
2013-07-16 11:06:41 +08:00 |
|
Cheng wei
|
6035bb6320
|
fix invalid syntax for python3
|
2013-07-06 02:52:17 +08:00 |
|
Sun Junyi
|
9d0ea771a5
|
fix bug; decimals & digit-english mixed
|
2013-07-05 16:16:49 +08:00 |
|
Sun Junyi
|
ba5114dc95
|
update whoosh example
|
2013-07-04 09:31:09 +08:00 |
|
Sun Junyi
|
f424862222
|
clean the files in tmp
|
2013-07-03 17:55:01 +08:00 |
|
Sun Junyi
|
b18d56d2a3
|
Merge pull request #72 from linkerlin/master
Add a tmp directory so that test_whoosh.py can run.
|
2013-07-03 02:52:46 -07:00 |
|
Sun Junyi
|
b9b1f1a418
|
fix conflict of merging
|
2013-07-03 17:47:45 +08:00 |
|
miao.lin
|
becd32b178
|
made test_whoosh.py happy.
Add a tmp directory so that test_whoosh.py can run.
|
2013-07-03 17:32:35 +08:00 |
|
Sun Junyi
|
c01680c6a8
|
merge the new file
|
2013-07-03 17:29:33 +08:00 |
|
Sun Junyi
|
b62f052927
|
PEP8
|
2013-07-03 17:21:21 +08:00 |
|
Sun Junyi
|
45daf561c7
|
follow PEP8: change tab to 4 white spaces
|
2013-07-03 16:58:22 +08:00 |
|
Sun Junyi
|
dbec3ad9df
|
add some comments
|
2013-07-01 11:20:56 +08:00 |
|
Sun Junyi
|
efc784312c
|
add ChineseAnalyzer for whoosh search engine
|
2013-07-01 10:53:39 +08:00 |
|
Sun Junyi
|
f08690a2df
|
add 'search mode' for jieba.tokenize
|
2013-06-28 12:04:16 +08:00 |
|
Sun Junyi
|
cb1b0499f7
|
unittest for jieba.tokenize
|
2013-06-24 16:20:04 +08:00 |
|
Sun Junyi
|
11a3b10755
|
new method: jieba.tokenize
|
2013-06-24 16:14:11 +08:00 |
|
Sun Junyi
|
ca97b19951
|
merge change from master
|
2013-06-23 22:28:32 +08:00 |
|
Sun Junyi
|
c0816b9bb0
|
more mixed words
|
2013-06-18 18:09:55 +08:00 |
|
Sun Junyi
|
c9e8da9e63
|
add more mix words to dict.txt
|
2013-06-18 14:10:36 +08:00 |
|
fxsjy
|
08bfabb9d7
|
Merge branch 'jieba3k' of https://github.com/fxsjy/jieba into jieba3k
|
2013-06-08 11:30:07 +08:00 |
|