Sun Junyi
|
753c1be49c
|
Merge pull request #248 from wangbin/master
exclude word fragments from FREQ in posseg.cut
|
2015-04-02 15:32:41 +08:00 |
|
Wang Bin
|
84ffa0d4bf
|
exclude word fragments from FREQ
|
2015-04-02 11:06:55 +08:00 |
|
Sun Junyi
|
885417aed1
|
Merge pull request #247 from gumblex/master
Update docs
v0.36
|
2015-03-21 17:05:05 +08:00 |
|
Dingyuan Wang
|
eeaab012bf
|
update docs
|
2015-03-21 10:53:42 +08:00 |
|
fxsjy
|
89481cfd84
|
version update 0.36
|
2015-03-20 11:00:55 +08:00 |
|
Sun Junyi
|
59aa8b69b1
|
Merge pull request #246 from gumblex/master
Add automatic word frequency
|
2015-03-16 10:10:53 +08:00 |
|
Dingyuan Wang
|
4fa2728fb6
|
update README about new features
|
2015-03-14 12:44:49 +08:00 |
|
Dingyuan Wang
|
4a552ca94f
|
suggest word frequency, support passing str to add_word
|
2015-03-14 12:44:19 +08:00 |
|
Sun Junyi
|
1b4721ebb8
|
Merge pull request #179 from changyy/master
Add a customizable directory location for the generated cache_file, so that jieba can run on a read-only file system such as Embedded Linux, Google App Engine, or Heroku
|
2015-02-28 10:05:52 +08:00 |
|
Yuan-Yi Chang
|
62433a3205
|
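Let jieba specify the directory where cache_file is generated, so that jieba can run in a read-only file system environment
1. Before calling jieba.cut() or related operations, set the directory location through jieba.tmp_dir
2. When the application environment is a read-only file system, jieba can run normally if the cache_file is generated in advance
3. Real-world cases are Google App Engine and Heroku: the former's free tier has only 128MB of memory and cannot run jieba, while the latter's free environment has 512MB and runs it normally. Before deploying, generate the cache_file locally, then deploy it together with the application to Google App Engine or Heroku.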
|
2015-02-27 17:25:59 +08:00 |
|
Sun Junyi
|
4b4aff6d89
|
Merge pull request #242 from gumblex/master
textrank detail fixes; docs update
|
2015-02-17 14:57:27 +08:00 |
|
Dingyuan Wang
|
f29430f49e
|
details in textrank; update README
|
2015-02-16 21:25:55 +08:00 |
|
Sun Junyi
|
a4fb439070
|
Merge pull request #241 from sing1ee/master
improve some details based on other committers' advice
|
2015-02-16 20:41:06 +08:00 |
|
zhangcheng
|
01b7f6efcf
|
improve some details based on other committers' advice
|
2015-02-16 20:35:45 +08:00 |
|
Sun Junyi
|
4e05cde07e
|
Merge pull request #240 from sing1ee/master
build stable sort for graph iteration
|
2015-02-16 20:28:22 +08:00 |
|
zhangcheng
|
8b8c6c85d0
|
remove unused import
|
2015-02-16 15:51:05 +08:00 |
|
zhangcheng
|
a6d1b2479e
|
build stable sort for graph iteration, so that we get stable results, and adapt details for Python 3
|
2015-02-16 15:49:10 +08:00 |
|
zhangcheng
|
1152db7736
|
build stable sort for graph iteration, then we can get stable result.
|
2015-02-16 15:46:36 +08:00 |
|
fxsjy
|
49657c976d
|
make extract_tags behavior compatible with previous version
|
2015-02-14 21:23:58 +08:00 |
|
fxsjy
|
abcaf3e475
|
fix bug: load_userdict
|
2015-02-14 19:56:38 +08:00 |
|
Jack
|
a06b7d388e
|
fix bug in __main__.py
|
2015-02-12 14:08:39 +08:00 |
|
Sun Junyi
|
9ca5b69907
|
Merge pull request #238 from gumblex/master
use str.splitlines to avoid losing line breaks
|
2015-02-12 13:35:52 +08:00 |
|
Dingyuan Wang
|
f2b7183a71
|
use str.splitlines to avoid losing line breaks
|
2015-02-12 12:39:14 +08:00 |
|
Sun Junyi
|
b14eb329e3
|
Merge pull request #237 from gumblex/master
Store prefixes directly in the word frequency dict
|
2015-02-12 11:27:25 +08:00 |
|
Dingyuan Wang
|
872a7039f2
|
Merge branch 'master' of https://github.com/fxsjy/jieba
|
2015-02-12 10:33:56 +08:00 |
|
Dingyuan Wang
|
f808ea0ebb
|
use only one dict to store words and prefixes
|
2015-02-12 10:31:52 +08:00 |
|
fxsjy
|
4d7b515801
|
Merge branch 'master' of https://github.com/fxsjy/jieba
|
2015-02-11 20:57:35 +08:00 |
|
fxsjy
|
5bfa43a781
|
fix test scripts
|
2015-02-11 20:46:48 +08:00 |
|
Dingyuan Wang
|
f3a53dd2da
|
fix print() in tests
|
2015-02-11 20:45:55 +08:00 |
|
Sun Junyi
|
a229041e58
|
Merge pull request #234 from yanyiwu/patch-2
Update README.md
|
2015-02-11 18:48:47 +08:00 |
|
Yanyi Wu
|
5d321cbccd
|
Update README.md
|
2015-02-11 17:37:32 +08:00 |
|
fxsjy
|
8cbb26a7b6
|
fix test_file.py
|
2015-02-11 16:47:57 +08:00 |
|
Sun Junyi
|
41b47b0593
|
Merge pull request #233 from gumblex/master
Merge jieba3k; compatible with Python 2/3
|
2015-02-11 15:44:22 +08:00 |
|
Dingyuan Wang
|
32a0e92a09
|
don't compile re every time; autopep8
|
2015-02-10 21:22:34 +08:00 |
|
Dingyuan Wang
|
22bcf8be7a
|
Merge master and jieba3k, make the code Python 2/3 compatible
|
2015-02-10 20:54:55 +08:00 |
|
Sun Junyi
|
caae26fbfa
|
Merge pull request #231 from gumblex/master
Store frequency counts directly in FREQ
|
2015-02-09 16:50:43 +08:00 |
|
Dingyuan Wang
|
4197dfb8fa
|
store int directly in FREQ; small improvements
|
2015-02-09 16:26:00 +08:00 |
|
Dingyuan Wang
|
765fd6b7f0
|
store int directly in FREQ; small improvements
|
2015-02-09 16:14:12 +08:00 |
|
Sun Junyi
|
c95f402e2b
|
Merge pull request #214 from aszxqw/master
add iosjieba
|
2014-12-25 10:09:35 +08:00 |
|
yanyiwu
|
1d91072498
|
add iosjieba
|
2014-12-24 23:02:06 +08:00 |
|
Sun Junyi
|
852a07c4f2
|
Merge pull request #211 from gumblex/jieba3k
Fix the repr return value of the pair class in posseg (jieba3k)
|
2014-12-20 18:35:43 +08:00 |
|
Dingyuan Wang
|
7bcb128f5f
|
fix textrank division by zero; fix posseg.pair.__repr__
|
2014-12-20 00:12:42 +08:00 |
|
Sun Junyi
|
b08c3f8ed7
|
Merge pull request #205 from lynschinzer/master
Fix division-by-zero issue when words are not found in dict.
|
2014-12-05 20:13:51 +08:00 |
|
Lin
|
fea3aec6bd
|
Fix division-by-zero issue when words are not found in dict.
|
2014-12-05 17:13:12 +08:00 |
|
Sun Junyi
|
8be082017a
|
Merge pull request #204 from gumblex/jieba3k
Improve setup.py and other files for the corresponding py3k update
|
2014-11-29 18:28:48 +08:00 |
|
Sun Junyi
|
293dbbc390
|
Merge pull request #203 from gumblex/master
Fix posseg; improve setup.py
|
2014-11-29 18:28:23 +08:00 |
|
Dingyuan Wang
|
3dad899ec8
|
backport 2to3 scripts and changelog
|
2014-11-29 16:12:25 +08:00 |
|
Dingyuan Wang
|
c6b386f65b
|
update jieba3k
|
2014-11-29 16:06:20 +08:00 |
|
Dingyuan Wang
|
7b7c6955a9
|
complete the setup.py, fix #202 problem in posseg
|
2014-11-29 15:33:42 +08:00 |
|
Sun Junyi
|
8a2e7f0e7e
|
Merge pull request #202 from nomaka/patch-1
Update __init__.py
|
2014-11-18 16:46:59 +08:00 |
|