Dingyuan Wang
|
f2b7183a71
|
use str.splitlines to avoid losing line breaks
|
2015-02-12 12:39:14 +08:00 |
|
Dingyuan Wang
|
f808ea0ebb
|
use only one dict to store words and prefixes
|
2015-02-12 10:31:52 +08:00 |
|
Dingyuan Wang
|
32a0e92a09
|
don't compile re every time; autopep8
|
2015-02-10 21:22:34 +08:00 |
|
Dingyuan Wang
|
22bcf8be7a
|
Merge master and jieba3k, make the code Python 2/3 compatible
|
2015-02-10 20:54:55 +08:00 |
|
Dingyuan Wang
|
4197dfb8fa
|
store int directly in FREQ; small improvements
|
2015-02-09 16:26:00 +08:00 |
|
Dingyuan Wang
|
765fd6b7f0
|
store int directly in FREQ; small improvements
|
2015-02-09 16:14:12 +08:00 |
|
Dingyuan Wang
|
c6b386f65b
|
update jieba3k
|
2014-11-29 16:06:20 +08:00 |
|
Dingyuan Wang
|
7b7c6955a9
|
complete the setup.py, fix #202 problem in posseg
|
2014-11-29 15:33:42 +08:00 |
|
Nomaka
|
9cb76dd8b9
|
Update __init__.py
The idx parameter of calc is unused
|
2014-11-18 16:00:49 +08:00 |
|
fxsjy
|
447c1ded8c
|
fix problem for python3.2
|
2014-11-15 13:44:30 +08:00 |
|
Dingyuan Wang
|
7a6caa0c3c
|
port extract_tags, etc to jieba3k; add auto2to3 script
|
2014-11-07 23:33:31 +08:00 |
|
Dingyuan Wang
|
e3f3dcccba
|
improve the loading and caching process
|
2014-10-31 21:56:09 +08:00 |
|
fxsjy
|
ba87fcb01f
|
remove trie, use prefix set instead
|
2014-10-20 14:08:09 +08:00 |
|
fxsjy
|
82bfffb6ed
|
version update to 0.34
|
2014-10-20 13:35:13 +08:00 |
|
Dingyuan Wang
|
b367690eeb
|
use prefix dict instead of trie, add a command line interface, and a few small improvements
|
2014-10-19 10:32:23 +08:00 |
|
Dingyuan Wang
|
51df77831b
|
use prefix dict instead of trie, add a command line interface, and a few small improvements
|
2014-10-18 22:23:26 +08:00 |
|
Dingyuan Wang
|
626b415152
|
fix dict.itervalues mistake
|
2014-09-07 19:21:13 +08:00 |
|
Dingyuan Wang
|
6a3f228c72
|
fix python3 stuff
|
2014-09-07 18:50:10 +08:00 |
|
Dingyuan Wang
|
6fad5fbb2c
|
update to v0.33
|
2014-09-06 23:28:47 +08:00 |
|
Fukuball Lin
|
7198d562f1
|
Allow jieba to switch IDF corpora
1. Add a Traditional Chinese IDF corpus
2. To let jieba switch IDF corpora, add the get_idf and set_idf_path methods and rewrite extract_tags
3. Add extract_tags_idfpath to the tests
|
2014-08-05 22:55:13 +08:00 |
|
Dingyuan Wang
|
c04ccd0d12
|
Update to v0.32 according to the master branch.
|
2014-06-14 22:31:13 +08:00 |
|
Dingyuan Wang
|
81f77d7a08
|
Fix the re in enable_parallel.
|
2014-06-14 15:22:13 +08:00 |
|
davidlihm
|
5b2ec920ed
|
Update __init__.py
|
2014-05-15 07:55:11 +08:00 |
|
jagt
|
7f3513edb7
|
close cache file to avoid warning message.
|
2014-04-24 00:35:09 +08:00 |
|
wind
|
7488b114e7
|
use logging instead of print in init file
|
2014-03-20 13:48:33 +13:00 |
|
Sun Junyi
|
3e430e9769
|
Update __init__.py
|
2014-02-16 20:09:57 +08:00 |
|
fxsjy
|
5e6a2c4661
|
fix a bug of add_word
|
2013-12-05 13:35:40 +08:00 |
|
fxsjy
|
136676381a
|
fix a bug of add_word
|
2013-12-05 13:33:24 +08:00 |
|
Herman Schaaf
|
95286b8887
|
Fix typo in error message
|
2013-10-21 22:21:09 +09:00 |
|
fxsjy
|
759e1029c8
|
add an API to control log level: jieba.setLogLevel
|
2013-09-22 10:26:33 +08:00 |
|
Mozillazg
|
1cf3f0d00b
|
use logging instead of print
|
2013-09-19 10:31:44 +08:00 |
|
Sun Junyi
|
7e7fcc1184
|
add an option to disable HMM
|
2013-09-05 17:09:27 +08:00 |
|
fxsjy
|
c5bd9773d1
|
fix bug in issue #103
|
2013-08-30 18:26:53 +08:00 |
|
ZoeyYoung
|
dce353f88b
|
merge from master
|
2013-08-21 15:32:46 +08:00 |
|
ZoeyYoung
|
2857ae45cc
|
Merge branch 'master' into jieba3k
Conflicts:
Changelog
jieba/__init__.py
jieba/finalseg/__init__.py
jieba/posseg/__init__.py
setup.py
test/parallel/test_file.py
test/test_file.py
|
2013-08-21 13:55:21 +08:00 |
|
gwdwyy
|
cc81135429
|
sed -i 's/not \(.*\) in/\1 not in/g' ...
|
2013-08-20 20:08:03 +08:00 |
|
Sun Junyi
|
90ab511deb
|
fix the bug about issue: #92
|
2013-08-09 13:59:02 +08:00 |
|
fxsjy
|
b77645b3aa
|
modify test_file.py; use less memory
|
2013-07-29 10:17:39 +08:00 |
|
fxsjy
|
ed1fa64e27
|
fix a bug: sys.version_info.major can't be used in Python 2.5
|
2013-07-29 10:07:55 +08:00 |
|
Sun Junyi
|
0f972df0ac
|
raise exception in case of lower version
|
2013-07-29 10:01:47 +08:00 |
|
Sun Junyi
|
e68bb5a28e
|
fix a compatibility problem;python2.5 has no 'multiprocessing';
|
2013-07-29 09:57:09 +08:00 |
|
Sun Junyi
|
689e27280a
|
Merge branch 'master' of https://github.com/fxsjy/jieba
|
2013-07-29 09:49:10 +08:00 |
|
Sun Junyi
|
9d87e798fd
|
0.31 release
|
2013-07-29 09:48:53 +08:00 |
|
Linker Lin
|
1dbc525dff
|
Automatically detect the number of CPUs and start an appropriate number of processes.
|
2013-07-28 00:10:27 +08:00 |
|
Sun Junyi
|
6549deabbd
|
merge change from master
|
2013-07-16 11:06:41 +08:00 |
|
Sun Junyi
|
d63140fe5e
|
make a series of white spaces separated
|
2013-07-10 17:27:47 +08:00 |
|
Richard Wong
|
c2ded83ead
|
Refactor: fix line indent to 4.
* jieba/__init__.py (cut):
|
2013-07-10 16:22:49 +08:00 |
|
Richard Wong
|
99d2492d67
|
Add re.U flag to re variable.
|
2013-07-10 16:22:17 +08:00 |
|
Richard Wong
|
fbfaac2eaa
|
Reindent function
* jieba/__init__.py (require_initialized):
|
2013-07-08 13:54:36 +08:00 |
|
Richard Wong
|
7bfd432fc5
|
Remove the unused imports.
|
2013-07-08 13:51:39 +08:00 |
|