483 Commits

Author SHA1 Message Date
Yanyi Wu
294755fab1 build: refine CMakeLists.txt by removing unnecessary conditions and options
- Eliminated the default installation prefix condition to streamline the configuration.
- Simplified the test build logic by ensuring tests are enabled only for top-level projects.
- Cleaned up redundant code for better readability and maintainability.
2025-05-03 07:43:25 +08:00
Yanyi Wu
714a297823 build: update CMakeLists.txt to include additional directories for test configuration
- Added include directories for the current binary and test directories to improve test file accessibility.
- Ensured proper configuration for test paths in the build process.
2025-05-02 23:47:37 +08:00
Yanyi Wu
c14131e3e2 refactor: clean up load_test.cpp by removing unused dependencies and tests
- Removed unused Jieba test and associated includes from load_test.cpp.
- Simplified main function to focus on essential operations.
- Ensured consistent exit handling by returning EXIT_SUCCESS.
2025-05-02 23:41:53 +08:00
Yanyi Wu
9cd64a1694 build: enhance test configuration and path management
- Added configuration for test paths in CMake to simplify file references.
- Updated load_test.cpp and various unit tests to use defined path macros for dictionary and test data files.
- Introduced test_paths.h.in to manage directory paths consistently across tests.
2025-05-02 23:33:18 +08:00
Yanyi Wu
aa410a69bb build: simplify test configuration in CMakeLists.txt
- Removed conditional check for MSVC when adding test commands.
- Ensured that test commands are always added regardless of the compiler.
2025-05-02 21:39:18 +08:00
Yanyi Wu
b5dc8e7a35 build: update .gitignore and CMakeLists for test configuration
- Added entries to .gitignore for temporary test files.
- Included a message to display MSVC value during build.
- Added UTF-8 compile option for MSVC in unittest CMakeLists.
2025-05-02 21:28:28 +08:00
Yanyi Wu
8141d8f434
Merge pull request #200 from yanyiwu/dev
fix: remove outdated entry from jieba dictionary
2025-05-02 17:31:29 +08:00
yanyiwu
9d8af2116e build: update CI workflow to include latest OS versions 2025-05-02 11:53:33 +08:00
yanyiwu
2185315643 fix: remove outdated entry from jieba dictionary 2025-05-02 11:38:31 +08:00
yanyiwu
340de007f9 docs: update README.md 2025-04-13 18:59:44 +08:00
yanyiwu
940ea02eb4 deps: upgrade limonp from v1.0.0 to v1.0.1 2025-04-12 17:54:01 +08:00
yanyiwu
3732abc0e5 docs: update CHANGELOG for v5.5.0 2025-04-12 10:07:40 +08:00
yanyiwu
9cda7f33e8 build: upgrade googletest from 1.11.0 to 1.12.1 2025-04-12 10:02:10 +08:00
Yanyi Wu
338603b676
Merge pull request #196 from ahmadov/ahmadov/fix-ns-2
avoid implicit namespaces
2025-04-11 08:59:41 +08:00
Elmi Ahmadov
d93dda397c avoid implicit namespaces
This PR fixes the ambiguous `partial_sort` in KeywordExtractor.hpp.
We also have a definition for it and the compiler is confused about which
implementation should be used. To fix it, we can use the `std` namespace
explicitly.

Also, use the `std` namespace for the other data structures and include
their headers.
2025-04-10 19:10:05 +02:00
Yanyi Wu
7730deee52
Merge pull request #195 from ahmadov/ahmadov/fix-ns
fix missing includes and make namespaces explicit
2025-04-10 23:01:18 +08:00
Elmi Ahmadov
588860b5b6 fix missing includes and make namespaces explicit 2025-04-10 16:11:20 +02:00
Yanyi Wu
0523949aa8
Update stale-issues.yml 2025-04-05 17:26:58 +08:00
Yanyi Wu
b11fd29697
Update README.md 2025-03-08 17:33:48 +08:00
yanyiwu
15b8086a2a Add CMake workflow for Windows ARM64 builds
This commit introduces a new GitHub Actions workflow for building and testing CMake projects on Windows ARM64. The workflow includes steps for checking out the repository, configuring CMake with multiple C++ standards, building the project, and running tests. This enhancement supports continuous integration for ARM64 architecture, improving the project's build versatility.
2025-01-18 20:58:17 +08:00
yanyiwu
1d74caf705 Update CMake minimum version requirement to 3.10 2025-01-18 20:47:06 +08:00
Yanyi Wu
0c7c5228d0
Update README.md 2025-01-17 23:47:09 +08:00
yanyiwu
016fc17575 Improve error logging for UTF-8 decoding failures across cppjieba components. Updated error messages in DictTrie, PosTagger, PreFilter, and SegmentBase to provide clearer context on the specific input causing the failure. This change enhances the debugging experience when handling UTF-8 encoded strings. 2024-12-08 17:26:28 +08:00
yanyiwu
39fc58f081 Remove macOS 12 from CI workflow in cmake.yml 2024-12-08 17:03:39 +08:00
yanyiwu
42a93a4b98 Refactor decoding functions to use UTF-8 compliant methods
Updated multiple files to replace instances of DecodeRunesInString with DecodeUTF8RunesInString, ensuring proper handling of UTF-8 encoded strings. This change enhances the robustness of string decoding across the cppjieba library, including updates in DictTrie, HMMModel, PosTagger, PreFilter, SegmentBase, and Unicode files. Additionally, corresponding unit tests have been modified to reflect these changes.
2024-12-08 16:46:24 +08:00
yanyiwu
5ee74d788e [stale-issues] Monthly on the 3rd day of the month at midnight 2024-11-03 17:22:28 +08:00
yanyiwu
9b45e084a3 v5.4.0 2024-09-22 10:02:53 +08:00
yanyiwu
aa1def5ddb class Jieba unittest add default argument input 2024-09-22 09:43:04 +08:00
yanyiwu
732812cdfb class Jieba: support default dictpath 2024-09-22 09:38:31 +08:00
yanyiwu
6e167a30dd cmake: avoid testing when FetchContent by other project 2024-09-22 00:25:23 +08:00
yanyiwu
5ef74f335a Revert "cmake: enable windows/msvc test"
This reverts commit 63392627552b018ea018848c82965c263b0030fa.
2024-09-21 23:58:59 +08:00
yanyiwu
6339262755 cmake: enable windows/msvc test 2024-09-21 21:49:56 +08:00
yanyiwu
cc58d4f858 DictTrie: removed unused var 2024-09-21 21:29:55 +08:00
yanyiwu
dbebc7cacb cmake: enable windows/msvc test 2024-09-21 21:10:53 +08:00
yanyiwu
e5b98af199 v5.3.2 2024-09-21 20:45:46 +08:00
yanyiwu
e521f26456 removed test/demo.cpp and linked https://github.com/yanyiwu/cppjieba-demo 2024-09-21 17:26:19 +08:00
Yanyi Wu
30aaf7b9ad
Update Demo Link in README.md 2024-09-21 17:21:54 +08:00
yanyiwu
84bca4bc50 [github/actions] stale 1 year ago issues 2024-09-14 21:49:46 +08:00
yanyiwu
3c8663472b [github/actions] stale 3 years ago issues 2024-09-14 21:37:25 +08:00
yanyiwu
12341a2f21 [stale issues] Run weekly on Sunday at midnight 2024-09-11 21:41:15 +08:00
yanyiwu
165bee901c [github/actions] stale issues 2024-09-07 20:58:52 +08:00
yanyiwu
e691b631b2 limonp v0.9.0 -> v1.0.0 2024-09-07 17:21:59 +08:00
yanyiwu
31dfe0f9d0 v5.3.1 2024-08-17 17:21:44 +08:00
yanyiwu
a110ab10cc [cmake] fetch googletest 2024-08-16 10:13:07 +08:00
yanyiwu
fe88bd29ac [submodules] rm test/googletest 2024-08-16 10:08:36 +08:00
yanyiwu
00c8f8fa84 v5.3.0 2024-08-10 22:00:50 +08:00
yanyiwu
90174da597 [c++17,c++20] compatibility 2024-08-10 21:50:01 +08:00
yanyiwu
a7adc22a6e limonp version 0.6.7 -> 0.9.0 2024-08-10 21:47:27 +08:00
yanyiwu
30ab2c3860 v5.2.0 2024-07-28 23:25:58 +08:00
yanyiwu
3748b56928 [README] platform updated 2024-07-28 23:18:53 +08:00
yanyiwu
79a235223d [CI] windows-[2019,2022] 2024-07-28 23:16:11 +08:00
yanyiwu
c39fd30f93 [googletest] v1.6.0->v1.10.0 2024-07-28 22:41:20 +08:00
yanyiwu
f4c87c2ff4 [CI] ubuntu version from 20 to 22, macos version from 12 to 14 2024-07-28 22:32:46 +08:00
yanyiwu
f8d063101c [CMake] mini_required 2.6->3.5 and fix CXX_VERSION variable passed from cmd 2024-07-27 19:24:57 +08:00
yanyiwu
732fec41e6 [CI] matrix and multi cpp version 2024-07-27 18:58:36 +08:00
yanyiwu
bc162dbd84 v5.1.3 2024-07-22 22:53:53 +08:00
yanyiwu
8aad517375 git submodule add googletest-1.6.0 2024-07-22 22:36:45 +08:00
yanyiwu
4ec3204280 [Changelog] v5.1.2 2024-07-16 07:27:24 +08:00
yanyiwu
c83c7111ab README fix typo 2024-07-15 22:56:01 +08:00
yanyiwu
e334fc2ce0 [submodule:deps/limonp] upgrade to v0.6.7 2024-07-14 22:58:39 +08:00
yanyiwu
bc90a8276e rm useless code 2024-07-14 09:47:34 +08:00
yanyiwu
4b2f257a6a README_EN.md is useless for AI-age 2024-07-14 00:09:49 +08:00
unknown
7cf4502e01 commit from vscode 2024-07-13 20:03:58 +08:00
Yanyi Wu
a4e2b67017
Update README.md for windows env 2024-07-13 17:15:12 +08:00
Yanyi Wu
c22b6843d3
Update ChangeLog.md 2024-06-07 17:19:23 +08:00
Yanyi Wu
f4145fd08e
Merge pull request #186 from appotry/master
Fix compilation error
2024-05-01 17:05:12 +08:00
夜法之书(appotry)
f91776997b
Fix compilation error 2024-05-01 16:06:53 +08:00
wuyanyi
391121d5db release v5.1.0 2022-10-16 13:25:50 +08:00
Yanyi Wu
d869831996
Merge pull request #172 from yanyiwu/wyy
feature: add RemoveWord api from gojieba/pull/99
2022-10-16 13:24:40 +08:00
wuyanyi
03cc7c39ff feature: add RemoveWord api from https://github.com/yanyiwu/gojieba/pull/99 2022-10-16 13:17:19 +08:00
wuyanyi
302a367338 release v5.0.5 2022-10-16 12:43:01 +08:00
Yanyi Wu
1bf8e11833
Merge pull request #171 from yanyiwu/wyy
[submodule] update limonp to v0.6.6
2022-10-16 12:26:00 +08:00
wuyanyi
fc6e3f4294 [submodule] update limonp to v0.6.6 2022-10-16 12:10:35 +08:00
wuyanyi
db9b4b6813 [release v5.0.4] 2022-10-16 11:37:16 +08:00
Yanyi Wu
99b496d871
Merge pull request #168 from playgithub/limonp-as-submodule
limonp as submodule
2022-07-31 22:21:03 +08:00
abc
3269637644 remove .travis.yml 2022-07-31 21:25:29 +08:00
abc
fe9901858c remove appveyor.yml 2022-07-31 21:17:06 +08:00
abc
6a1d49d99b Update github workflows to checkout code with submodules 2022-07-22 10:41:50 +08:00
abc
8c93e0978d Update CMakeLists.txt to use limonp as a submodule 2022-07-22 10:36:35 +08:00
abc
01aba1d85d add submodule limonp 2022-07-22 10:34:52 +08:00
Yanyi Wu
194c144d8b
Merge pull request #166 from playgithub/add-license
Add MIT license
2022-07-10 21:51:23 +08:00
abc
1da8e7cde7 Fix end of line sequence, use LF instead of CRLF 2022-07-10 16:38:27 +08:00
abc
23ce5c7050 Add MIT license 2022-07-10 16:28:09 +08:00
Yanyi Wu
ef2b8f8b1b
Update README.md 2022-01-31 16:44:50 +08:00
Yanyi Wu
e81930b7c2
Update README.md 2022-01-09 21:07:50 +08:00
Yanyi Wu
466419dda0
Create cmake.yml 2022-01-09 18:25:10 +08:00
Yanyi Wu
acb2ecc125
Merge pull request #151 from wangfenjin/patch-2
add simple: sqlite3 fts5 tokenizer
2021-02-21 11:35:57 +08:00
Wang Fenjin
677b471d9f
add simple: sqlite3 fts5 tokenizer 2021-02-20 18:48:23 +08:00
Yanyi Wu
7a61f14b6b
Update README.md 2020-11-21 22:22:36 +08:00
Yanyi Wu
b799936d2f
Update README.md 2020-06-23 09:34:28 +08:00
Yanyi Wu
f66c9d1184
Merge pull request #145 from yanyiwu/yanyiwu-patch-1-1
add sponsorship
2020-06-21 01:15:59 +08:00
Yanyi Wu
1d4ffd4b3d
add sponsorship 2020-06-21 00:56:57 +08:00
yanyiwu
32fd1ef010 [release] v5.0.3 2020-03-11 09:30:52 +08:00
yanyiwu
7c046e393f Upgrade [limonp](https://github.com/yanyiwu/limonp) -> v0.6.3 2020-03-11 09:23:22 +08:00
yanyiwu
b6a1f5f21c [release] v5.0.2 2020-01-13 21:33:53 +08:00
yanyiwu
dfb9c1f010 Upgrade [limonp](https://github.com/yanyiwu/limonp) -> v0.6.1 2020-01-13 21:29:53 +08:00
Yanyi Wu
79ffd00979
Update README.md 2019-10-24 23:05:33 +08:00
yanyiwu
caf4c43ad6 release tag v5.0.1 2019-09-21 12:01:15 +08:00
Yanyi Wu
9aa7537096
Update README.md 2019-09-21 11:57:15 +08:00
Yanyi Wu
28b37b9cba
Merge pull request #134 from shove70/patch-1
Explanation for adding D language bindings.
2019-09-21 09:51:00 +08:00
shove70
4f04293261 Explanation for adding D language bindings. 2019-09-21 09:28:06 +08:00
Yanyi Wu
be6e91e6c3
Update README.md 2019-09-15 18:14:02 +08:00
Yanyi Wu
8a258dfaf4
Merge pull request #127 from byronhe/patch-2
remove duplicate #include
2019-09-15 16:54:42 +08:00
Yanyi Wu
f39192e983
Merge pull request #133 from byronhe/patch-5
fix typo
2019-09-04 22:54:46 +08:00
byronhe
55a94b417c
fix typo 2019-09-04 20:50:11 +08:00
Yanyi Wu
866d0e83b0
Merge pull request #129 from byronhe/patch-4
fix compile warning
2019-04-29 12:27:41 +08:00
byronhe
6444f4b226
fix compile warning 2019-04-29 12:18:03 +08:00
Yanyi Wu
7fc865760b
Merge pull request #128 from byronhe/patch-3
Causes other code containing the symbol print to fail to compile
2019-03-16 13:46:52 +08:00
byronhe
f55b591968
Causes other code containing the symbol print to fail to compile
2019-03-15 22:02:04 +08:00
byronhe
798b7b81c9
remove duplicate #include
2019-03-15 15:48:09 +08:00
Yanyi Wu
8fca7300a4
Merge pull request #126 from maliubiao/master
Fix compatibility issue in the C++ version check
2019-03-02 23:42:53 +08:00
maliubiao
07382b9cb1
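Fix compatibility issue in the C++ version check
With C++11 and above, this branch jumps to the wrong place and mistakenly uses using std::tr1::unordered_map, resulting in an undefined symbol.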
2019-03-02 16:02:02 +08:00
Yanyi Wu
31eed03518
Update README.md 2018-10-05 11:33:48 +08:00
Yanyi Wu
3dfdc426f0
Update README.md 2018-10-05 11:27:35 +08:00
Yanyi Wu
7b2fdc41a2
Merge pull request #113 from bung87/exposes_InsertUserWord_and_Find
Exposes insert user word and find
2018-06-09 19:51:12 +08:00
zhoupeng
985ccd646c Merge branch 'master' of https://github.com/yanyiwu/cppjieba into HEAD 2018-06-09 16:23:49 +08:00
zhoupeng
111fb007cf exposes InsertUserWord Find 2018-06-09 16:21:13 +08:00
Yanyi Wu
e6fdd1c98b
Merge pull request #112 from bung87/master
Consistent and complete interfaces
2018-06-08 23:13:44 +08:00
zhoupeng
1e1e585194 LoadUserDict by set,vector 2018-06-08 14:23:01 +08:00
zhoupeng
1066bc085e fix input type, expose to Jieba 2018-06-08 01:32:47 +08:00
zhoupeng
d56e5c0659 InsertUserWord with freq arg, expose InserUserDictNode with vector<string> arg 2018-06-08 00:44:33 +08:00
Yanyi Wu
36be7fb900
Merge pull request #111 from bung87/master
Add a note on the cppjieba-py extension
2018-06-07 23:19:06 +08:00
Yanyi Wu
bd368bc04d
Merge pull request #105 from Silencezjl/patch-2
Update demo.cpp
2018-06-07 23:18:35 +08:00
zhoupeng
cb4011ac56 Add a note on the cppjieba-py extension 2018-06-07 16:52:06 +08:00
张家麟
1089dcdcd3
Update demo.cpp
There was an extra semicolon, though it has little impact.
2018-01-29 10:12:38 +00:00
Yanyi Wu
6aff1f637c Merge pull request #96 from wangzhe258369/master
Reduce Visual Studio compiler warnings
2017-06-28 00:02:16 +08:00
Wangzhe
e7602afaac Reduce Visual Studio compiler warnings 2017-06-27 23:00:31 +08:00
yanyiwu
dabe502bb4 fix travis compiler 2017-04-03 23:14:59 +08:00
Yanyi Wu
d42602c12d Merge pull request #88 from stphnlyd/readme
mention the Perl 5 binding for CppJieba
2017-04-03 22:56:58 +08:00
Stephan Loyd
3d04caa1b1 mention Perl 5 binding for CppJieba 2017-04-03 22:37:04 +08:00
Yanyi Wu
472a584487 Merge pull request #87 from jonnywang/patch-2
Add link to the PHP extension version
2017-03-31 10:38:40 +08:00
星期八
27dbfb8146 Add link to the PHP extension version 2017-03-31 10:24:39 +08:00
Yanyi Wu
e5d9eb8816 Merge pull request #79 from royguo/master
Add Unicode offset/length support for `Word`
2016-10-18 23:02:01 +08:00
Roy Guo
f74d716570 Add Unicode offset/length support for Word 2016-10-16 13:05:56 +08:00
Roy Guo
a2f75a00d3 Add Unicode offset/length support for Word 2016-10-16 12:52:50 +08:00
yanyiwu
45809955f5 v5.0.0 2016-09-11 21:44:51 +08:00
yanyiwu
74c70c70cd create keyword_extract in Jieba 2016-09-11 21:42:53 +08:00
yanyiwu
4a755dff6a may be more friendly for compiler 2016-08-11 00:00:20 +08:00
yanyiwu
53bc279dea fix compiler warning 2016-07-23 20:49:27 +08:00
yanyiwu
91b7f9af63 v4.8.1 2016-07-23 00:11:02 +08:00
yanyiwu
0984c9ed3f update user dict loading method about word weight, and add unit tests 2016-07-22 23:53:49 +08:00
Yanyi Wu
e45ac012cb Merge pull request #74 from npes87184/master
fix second element parse error in dict
2016-07-22 13:40:55 +08:00
npes87184
0c3cf04b43 fix second element parse error in dict 2016-07-22 10:19:28 +08:00
Yanyi Wu
e3e5f93ca3 Merge pull request #73 from bigelephant29/user-dict-tag-bug-fix
fix user dict tag bug : wrong buf index assigned
2016-07-21 12:26:16 +08:00
bigelephant29
986106a553 change stoi to atoi 2016-07-21 10:54:08 +08:00
bigelephant29
2e1b6e0443 user dict support user weight and user tag 2016-07-21 10:38:46 +08:00
bigelephant29
b82acaf71e fix user dict tag bug : wrong buf index assigned 2016-07-21 10:06:24 +08:00
Yanyi Wu
8b75bf14a3 Merge pull request #72 from t-k-/master
Add LookupTag function to look up the tag of a single token
2016-07-07 11:15:59 +08:00
t-k-
e40270ca86 Avoid using `initializer lists' from C++0x. 2016-07-06 13:48:18 -06:00
t-k-
5775a40bee Add LookupTag function for single token tag lookup. 2016-07-06 02:44:56 -06:00
Yanyi Wu
667acdeb7b Merge pull request #71 from jaiminpan/master
add tag capability for each segment
2016-07-03 20:10:49 +08:00
Jaimin Pan
ce8cafe54a add tag capability for each segment 2016-06-27 18:10:42 +08:00
yanyiwu
ec848581b2 fix issue #70 2016-06-10 21:49:31 +08:00
Yanyi Wu
0bf9341dd6 Merge pull request #69 from vsooda/master
fix unittest cmake macro bug
2016-06-08 11:01:47 +08:00
sooda
7d503e4b13 fix unittest cmake macro bug 2016-06-08 10:38:20 +08:00
yanyiwu
c0afac2598 update changelog 2016-05-09 22:52:42 +08:00
yanyiwu
c425bcc49f add Jieba::ResetSeparators api and unittest 2016-05-09 22:49:51 +08:00
yanyiwu
6e3ecec599 improve readability 2016-05-09 22:09:57 +08:00
yanyiwu
e4e1b4e953 update readme 2016-05-09 21:23:05 +08:00
Yanyi Wu
02df433f73 Merge pull request #65 from questionfish/master
Add TextRank keyword extraction
2016-05-04 20:02:07 +08:00
Yanyi Wu
00b2eb13c6 Merge pull request #2 from yanyiwu/patch-1
Patch 1
2016-05-04 19:33:37 +08:00
yanyiwu
b355e9f487 update unittest to pass 'make test' 2016-05-04 19:33:05 +08:00
yanyiwu
0a23d6b268 merge questionfish/master 2016-05-04 19:27:05 +08:00
mayunyun
d5a52a8e7b 1. remove stopword from span windows
2. update unittest
2016-05-04 17:52:30 +08:00
yanyiwu
5c739484ae merge the latest codes in master branch, and update unittest cases to pass ci 2016-05-03 23:20:03 +08:00
questionfish
04c176de08 Merge pull request #1 from yanyiwu/patch-1
Update TextRankExtractor.hpp: use yanyiwu's correction
2016-05-03 21:46:01 +08:00
yanyiwu
f253db0133 use map/set instead of unordered_map/unordered_set to make result stable 2016-05-03 21:24:40 +08:00
yanyiwu
39316114c5 correct unittest case 2016-05-03 20:49:47 +08:00
yanyiwu
a1ea1d0757 add textrank unittest into cmake 2016-05-03 20:01:44 +08:00
Yanyi Wu
6d105a864d Update TextRankExtractor.hpp
remove unused function that uses the C++11 keyword `auto`
2016-05-03 19:53:40 +08:00
mayunyun
0f66a923b3 1. Add unit tests
2. Add constructor overloads and extraction function overloads
2016-05-03 18:06:14 +08:00
mayunyun
f2de41c15e code layout change: tab -> space 2016-05-03 09:03:16 +08:00
yanyiwu
a778d47046 v4.8.0 2016-05-02 17:15:38 +08:00
yanyiwu
5ac9e48eb0 rewrite QuerySegment, make Jieba::CutForSearch behave the same as [jieba] cut_for_search api
remove Jieba::SetQuerySegmentThreshold
2016-05-02 16:18:36 +08:00
yanyiwu
3f0faec14b windows ci test 2016-04-27 20:22:05 +08:00
Yanyi Wu
4d8d793da5 Merge pull request #63 from qinwf/windows-appveyor
add Windows CI with MSVC
2016-04-27 19:13:49 +08:00
qinwf
c84594f620 add Windows CI with MSVC 2016-04-27 17:45:48 +08:00
yanyiwu
e6074eecb9 add cppjieba-server link 2016-04-27 16:24:13 +08:00
mayunyun
1aa0a32d90 code format check 2016-04-25 20:28:47 +08:00
mayunyun
669e971e3e new file: include/cppjieba/TextRankExtractor.hpp
Add TextRank Keyword Extractor to JiebaCpp
Add TextRank keyword extraction
2016-04-25 20:20:50 +08:00
yanyiwu
d9e8cdac36 v4.7.0 2016-04-21 14:28:02 +08:00
yanyiwu
9ebc906d3f update README 2016-04-19 16:04:44 +08:00
yanyiwu
3befc42697 update KeywordExtractor::Word's printing format to json format 2016-04-19 16:00:53 +08:00
yanyiwu
a9301facde upgrade limonp -> v0.6.1 2016-04-19 15:24:56 +08:00
yanyiwu
29e085904d add log and unittest 2016-04-18 14:55:42 +08:00
yanyiwu
63e9c94fb7 add unicode decoding unittest 2016-04-18 14:37:17 +08:00
yanyiwu
6fa843b527 override Cut functions, add location information into Word results; 2016-04-17 23:39:57 +08:00
yanyiwu
b6703aba90 use offset instead of str in RuneStr 2016-04-17 22:50:32 +08:00
yanyiwu
e7a45d2dde remove LevelSegment 2016-04-17 22:23:00 +08:00
yanyiwu
42a73eeb64 make compiler happy 2016-04-17 22:11:58 +08:00
yanyiwu
dcced8561e remove namespace unicode 2016-04-17 21:59:10 +08:00
yanyiwu
6ff6fe1430 WordRange construct 2016-04-17 21:57:36 +08:00
yanyiwu
339e3ca772 big change: add RuneStr for the position of word in string 2016-04-17 17:30:05 +08:00
yanyiwu
abcc0af034 update readme 2016-03-30 00:41:44 +08:00
Yanyi Wu
1fb5a7c66f Update README.md 2016-03-29 23:50:59 +08:00
Yanyi Wu
82feba693c Merge pull request #59 from bitdeli-chef/master
Add a Bitdeli Badge to README
2016-03-29 00:52:05 -05:00
Bitdeli Chef
627a514b7f Add a Bitdeli badge to README 2016-03-29 06:04:47 +00:00
yanyiwu
81cd435f2a prettify demo output 2016-03-28 01:22:24 +08:00
yanyiwu
500af453e1 add new case: sqljieba 2016-03-27 23:46:14 +08:00
yanyiwu
4b97c57bb2 v4.6.0 2016-03-26 23:34:40 +08:00
yanyiwu
c19736995c Add KeywordExtractor::Word and add more overloaded KeywordExtractor::Extract 2016-03-26 22:12:40 +08:00
yanyiwu
e6a2b47b87 Change the return value of KeywordExtractor::Extract from bool to void 2016-03-26 01:16:44 +08:00
yanyiwu
5102b8a5c3 Change Jieba::Locate to be static function. 2016-03-26 01:14:48 +08:00
yanyiwu
7db3f87b5f remove info log for dict loading 2016-03-22 10:45:20 +08:00
yanyiwu
5a8a0fae7a v4.5.3 2016-03-18 16:17:57 +08:00
yanyiwu
3ef005275a Upgrade limonp to v0.6.0 2016-03-18 16:14:48 +08:00
yanyiwu
81c35dde01 v4.5.2 2016-03-18 14:32:52 +08:00
yanyiwu
92fdf009cb Upgrade limonp to v0.5.6 to fix hidden trouble. 2016-03-18 14:05:18 +08:00
yanyiwu
643148edf5 platform 2016-02-26 22:33:51 +08:00
yanyiwu
f446ecf2ed v4.5.1 2016-02-19 16:29:59 +08:00
yanyiwu
3e28b4bcb1 adjust code for limonp v0.5.5 to solve macro name conflicts 2016-02-19 16:15:23 +08:00
yanyiwu
fc04cf750a upgrade limonp to v0.5.5 2016-02-19 16:14:33 +08:00
yanyiwu
9d7da4864a v4.5.0 2016-02-18 16:21:15 +08:00
yanyiwu
0a7b6e62f3 add Unicode32 cases for cut testing 2016-02-18 15:18:35 +08:00
yanyiwu
14e09290c2 change Rune type from uint16_t to uint32_t to support more chinese word 2016-02-18 14:54:03 +08:00
yanyiwu
8d66b1f1fa upgrade limonp to v0.5.4 2016-02-18 14:48:26 +08:00
yanyiwu
239d025cd8 delete HashMap, use unordered_map instead 2016-02-16 20:24:28 +08:00
yanyiwu
e6454fef77 use HashMap in Trie, and remove the base array of trie root node, see details in Changelog 2016-02-12 01:37:39 +08:00
yanyiwu
2d3c51dba7 upgrade limonp and use limonp::HashMap in Trie 2016-02-04 23:43:26 +08:00
yanyiwu
6f303ee843 v4.4.1 2016-01-29 10:14:40 +08:00
yanyiwu
8496f41e5d update changelog.md 2016-01-29 00:49:29 +08:00
yanyiwu
721b34f1bd fix bug, see details in ChangeLog.md 2016-01-29 00:30:38 +08:00
Yanyi Wu
8ca338d75a Update README.md 2016-01-22 21:38:53 +08:00
yanyiwu
446c21851d v4.4.0 2016-01-21 22:19:10 +08:00
Yanyi Wu
550ac2ab61 Merge pull request #52 from yanyiwu/remove_server
remove server, see details in ChangeLog.md
2016-01-21 19:12:54 +08:00
yanyiwu
34668aa379 remove server, see details in ChangeLog.md 2016-01-21 01:07:31 +08:00
yanyiwu
c1a6726bcc update readme.md 2016-01-20 20:47:21 +08:00
yanyiwu
963bf516a6 v4.3.3 2016-01-20 12:21:08 +08:00
yanyiwu
4493c604b9 Yet Another Incompatibility Problem Repair: Upgrade [limonp] to version v0.5.3, fix incompatibility problem in Windows 2016-01-20 12:06:56 +08:00
yanyiwu
c34c8f3082 v4.3.2 2016-01-16 01:44:28 +08:00
yanyiwu
0482ec2b6c [limonp] to version v0.5.2, fix incompatibility problem in Windows 2016-01-16 01:34:18 +08:00
yanyiwu
eb12813194 v4.3.1 2016-01-13 00:43:11 +08:00
yanyiwu
193e717d22 override constructor in KeywordExtractor 2016-01-13 00:40:46 +08:00
yanyiwu
a6c6e8df8c v4.3.0 2016-01-11 15:02:09 +08:00
yanyiwu
b41cb0e2ee fix compile error 2016-01-11 14:50:14 +08:00
yanyiwu
d92d3f194d upgrade husky to version v0.2.2 2016-01-11 14:33:43 +08:00
yanyiwu
3ab9a34909 upgrade limonp to version v0.5.1 2016-01-11 14:30:38 +08:00
yanyiwu
3c5ad24260 source code layout change:
1. src/ -> include/cppjieba/
2. src/limonp/ -> deps/limonp/
3. server/husky -> deps/husky/
4. test/unittest/gtest -> deps/gtest
2016-01-11 14:25:02 +08:00
yanyiwu
a07a22e9c4 update README 2016-01-10 19:58:47 +08:00
yanyiwu
a740fca866 add english readme 2015-12-24 21:35:06 +08:00
yanyiwu
29306c977f add badge 2015-12-24 21:09:10 +08:00
yanyiwu
fb5d989dc6 v4.2.1 2015-12-12 21:26:45 +08:00
yanyiwu
bcb112a4b1 upgrade basic functions 2015-12-12 21:25:57 +08:00
yanyiwu
8bf70127c2 upgrade limonp to version v0.4.1 2015-12-12 21:02:40 +08:00
yanyiwu
484ce39d36 update husky to version v0.2.0 2015-12-12 19:43:49 +08:00
yanyiwu
194550823f update limonp to version v0.4.0 2015-12-12 19:42:30 +08:00
yanyiwu
c38015d0ee v4.2.0 2015-12-09 00:24:27 +08:00
yanyiwu
1d33dcfdd7 add demo into 'make test' and update readme.md about dict path separator 2015-12-09 00:23:17 +08:00
yanyiwu
8482bef442 change multi user dicts separator from ':' to '|;' 2015-12-09 00:01:27 +08:00
yanyiwu
0989dcb2c9 gitbook-plugin-search-pro 2015-12-04 00:52:29 +08:00
yanyiwu
b3868cdf78 v4.1.2 2015-12-02 01:20:18 +08:00
yanyiwu
8dc01ae614 add Jieba::Locate function to get word location of cut sentence 2015-12-02 01:19:23 +08:00
yanyiwu
fb63e78ed2 update cases 2015-12-01 13:46:29 +08:00
Yanyi Wu
1bdedf84ec Merge pull request #49 from jaiminpan/bugfix
Avoid log fatal
2015-11-28 20:51:40 +08:00
Jaimin Pan
8a956642c3 fix crash if there is blank line in dictionary 2015-11-28 20:30:40 +08:00
yanyiwu
a7df45df70 v4.1.1 2015-11-26 00:53:31 +08:00
yanyiwu
60ca5093a9 add Jieba::Tag 2015-11-26 00:47:16 +08:00
yanyiwu
c27d89c60d update contact in readme.md 2015-11-10 17:16:14 +08:00
yanyiwu
c6ae23ec1f update changelog.md, version v4.1.0 2015-10-29 15:31:17 +08:00
yanyiwu
8fe4de404e add SetQuerySegmentThreshold in Jieba 2015-10-29 15:28:10 +08:00
yanyiwu
c3fd357a6d [QuerySegment] add SetMaxWordLen,GetMaxWordLen, and filter the english sentence in secondary Cut 2015-10-29 14:23:01 +08:00
yanyiwu
087f3248f8 update changelog.md 2015-10-29 12:40:38 +08:00
yanyiwu
83cc67cb15 [code style] uppercase function name 2015-10-29 12:39:10 +08:00
yanyiwu
f17c2d10e2 [code style] uppercase function name 2015-10-29 12:30:47 +08:00
yanyiwu
1a9a37aa64 update changelog 2015-10-29 12:27:37 +08:00
yanyiwu
6f51373280 support optional user word freq weight 2015-10-09 11:20:06 +08:00
yanyiwu
ecacf118e6 [code style] lower case namespace 2015-10-08 21:13:11 +08:00
yanyiwu
16b69e35c1 delete Application.hpp, use Jieba.hpp instead 2015-10-08 21:03:09 +08:00
yanyiwu
4d56be920b support optional user word freq weight 2015-10-08 20:05:27 +08:00
yanyiwu
98345d6aed add SetStaticWordWeights UserWordWeightOption 2015-10-08 17:36:52 +08:00
yanyiwu
b28d6db574 code style 2015-10-08 17:08:57 +08:00
yanyiwu
9b60537b40 update changelog.md 2015-09-25 16:25:11 +08:00
yanyiwu
9de513f1d5 new feature: loading multi user dict, path is split by : 2015-09-25 16:20:06 +08:00
yanyiwu
e55d0bf95c update limonp 2015-09-25 16:11:27 +08:00
yanyiwu
5bf7454ad2 add multi user dict unittest 2015-09-25 16:07:01 +08:00
yanyiwu
9f359f3783 v3.2.1 2015-09-24 12:03:04 +08:00
yanyiwu
c70dcdd2a9 fix bug about header file including protection 2015-09-24 11:48:50 +08:00
yanyiwu
ea4d81cde7 add segment cut case 2015-09-18 14:28:34 +08:00
yanyiwu
fbd9f51b0a updated about make install 2015-09-16 11:19:19 +08:00
yanyiwu
b68afb0db2 v3.2.0 2015-09-14 12:44:49 +08:00
yanyiwu
ec6a12a021 add gojieba into README.md 2015-09-14 12:03:18 +08:00
yanyiwu
eb6f47b6b0 refactor unittest 2015-09-13 18:09:56 +08:00
yanyiwu
8eef9a13a8 fix bug about optional argument hmm 2015-09-13 18:06:44 +08:00
yanyiwu
f517601c29 changelog 2015-09-13 17:38:14 +08:00
yanyiwu
f98e94869c add optional argument: hmm 2015-09-13 17:28:49 +08:00
yanyiwu
14974d51b4 abandon ISegment 2015-09-13 17:02:04 +08:00
yanyiwu
6d69363145 refactor, simplify SegmentBase 2015-09-13 16:29:35 +08:00
yanyiwu
e9241d9025 fixed the bug in the last commit 2015-09-13 16:18:48 +08:00
yanyiwu
28bcb3bf57 use PreFilter in SegmentBase 2015-09-13 16:05:17 +08:00
yanyiwu
0542dd1cfd add PreFilter 2015-09-13 15:10:10 +08:00
yanyiwu
710ddacd38 add Jieba.hpp 2015-09-13 00:28:40 +08:00
yanyiwu
63ca914176 update before_install for mac 2015-09-11 18:08:21 +08:00
yanyiwu
0ffc0f8079 make test 2015-09-11 18:06:58 +08:00
yanyiwu
19bb124b3e [enhancement issue]: https://github.com/yanyiwu/nodejieba/issues/39 2015-09-11 17:30:23 +08:00
yanyiwu
1babe57ebc Fine-grained segmentation feature 2015-08-30 16:35:21 +08:00
yanyiwu
3c60c35906 Fix bug where FullSegment produced no output for some single characters 2015-08-30 13:09:37 +08:00
yanyiwu
001a69d8c6 Add fine-grained segmentation to MPSegment. 2015-08-30 01:04:30 +08:00
yanyiwu
fae951a95d Unify the naming style of private functions 2015-08-28 11:17:38 +08:00
yanyiwu
0e0318f6ad Integrate LevelSegment into Application 2015-08-11 11:57:58 +08:00
yanyiwu
0a6b01c374 update changelog.md 2015-08-11 00:53:43 +08:00
yanyiwu
41e4300c9a LevelSegment 2015-08-11 00:53:06 +08:00
yanyiwu
efd029c20b namespace husky; namespace limonp; 2015-08-08 12:30:14 +08:00
yanyiwu
8a3ced2b27 Remove some unnecessary return-value checks to simplify the code 2015-07-24 14:39:03 +08:00
yanyiwu
0f79fa6c24 Handle all Unicode/string transcoding in SegmentBase 2015-07-24 13:42:24 +08:00
yanyiwu
4d86abb001 Add findByLimit function 2015-07-23 21:10:56 +08:00
yanyiwu
78e41e5fd0 Standardize Unicode-related naming; use Rune to represent a single Chinese character 2015-07-21 14:54:50 +08:00
yanyiwu
0e16e000ea Resolve some legacy issues 2015-07-21 14:32:05 +08:00
yanyiwu
620d276887 Tidy up commonly used low-level structures 2015-07-21 12:11:43 +08:00
yanyiwu
83222918cc Update ChangeLog 2015-07-21 11:26:33 +08:00
Yanyi Wu
5296a83823 Merge pull request #44 from aholic/master
Improve Trie efficiency
2015-07-21 11:15:26 +08:00
aholic
f5e74a3f46 replace old trie 2015-07-21 00:29:49 +08:00
aholic
f5d824043c Merge branch 'master' of https://github.com/aholic/cppjieba 2015-07-21 00:17:02 +08:00
aholic
791ee25295 pull upstream 2015-07-21 00:16:49 +08:00
xuangong
cf9cc45c19 astyle 2015-07-21 00:11:13 +08:00
xuangong
931db7d1e5 astyle 2015-07-20 23:54:20 +08:00
yanyiwu
6e723c2c58 v3.1.0 2015-06-27 13:19:26 +08:00
yanyiwu
2ae6eba3a7 Update the insertUserWord example program 2015-06-27 13:16:25 +08:00
yanyiwu
d33c09d74a Add unit tests 2015-06-27 12:34:27 +08:00
yanyiwu
64d073d194 Support the insertUserWord interface 2015-06-27 11:39:43 +08:00
yanyiwu
c5f7d4d670 Check in before refactoring the trie 2015-06-26 14:29:44 +08:00
yanyiwu
e0db070529 Expose the insertUserWord interface; add a default argument to cut, with Mix as the default segmentation algorithm 2015-06-26 12:22:11 +08:00
yanyiwu
1d27559209 refactor DictTrie, and expose function: insertUserWord 2015-06-26 11:49:35 +08:00
yanyiwu
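ee255baf56 v3.0.1 Improve compatibility; fix compilation errors in certain environments. 2015-06-24 16:01:41 +08:00
yanyiwu
9284fe1872 Performance benchmark 2015-06-14 12:21:09 +08:00
yanyiwu
389914ae1b Fix code that failed to compile on Windows; improve compatibility. 2015-06-09 15:31:43 +08:00
yanyiwu
e3c57c0ba1 Improve compatibility; fix compilation errors in certain environments. 2015-06-08 15:01:59 +08:00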
yanyiwu
67cc5941be update demo 2015-06-07 11:13:33 +08:00
yanyiwu
acd01bda99 v3.0.0 2015-06-06 11:47:04 +08:00
yanyiwu
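3528b6296a Modify the cjserver service so that different segmentation algorithms can be selected via an HTTP parameter.
Change the make install directory so that everything is installed into a single directory, /usr/local/cppjieba
2015-06-05 21:59:16 +08:00
yanyiwu
8ce2af9706 Update the demo example file; the demo only needs a single Application instance. 2015-06-05 18:12:27 +08:00
yanyiwu
e5d1ac7bc8 Move dict/{extra_dict,gbk_dict} into test/testdata 2015-06-05 16:31:43 +08:00
yanyiwu
a3d9b40c2a Change the parameter order of the QuerySegment constructor 2015-06-05 16:23:51 +08:00
yanyiwu
45588b75cc Add the Application class, which integrates all CppJieba functionality; from now on users only need this class. 2015-06-05 16:00:32 +08:00
yanyiwu
d56bf2cc68 Refactor: add constructors to each segmenter class in preparation for a bigger change to come. 2015-06-04 22:38:55 +08:00
yanyiwu
b99d0698f0 Extract the model-file data in HMMSegment into a separate HMMModel 2015-06-04 17:52:18 +08:00
yanyiwu
d3b34b73c6 Update the instructions on how to change the segmentation algorithm in the segmentation service. 2015-06-04 14:40:34 +08:00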
yanyiwu
d34ed79b03 more flexible 2015-06-04 14:39:40 +08:00
yanyiwu
9218ccb9c9 set default argument in QuerySegment: size_t maxWordLen = 4 2015-06-04 14:37:09 +08:00
yanyiwu
aed1c8f4a6 Remove some unnecessary error checks 2015-05-21 16:04:41 +08:00
yanyiwu
954100dc3d use LogFatal for more human-readable 2015-05-20 16:50:12 +08:00
yanyiwu
6e3bb7d057 use reverse_iterator 2015-05-18 23:57:13 +08:00
yanyiwu
c04b2dd0d4 Add more detailed error logs; use LogFatal appropriately during initialization. 2015-05-07 20:03:19 +08:00
yanyiwu
31400cee17 update changelog 2015-05-06 23:02:57 +08:00
yanyiwu
2b18a582fc code style 2015-05-06 23:02:03 +08:00
yanyiwu
bb32234654 astyle --style=google --indent=spaces=2 2015-05-06 17:53:20 +08:00
yanyiwu
b70875f412 update LogFatal, print more readable error message when errors happened 2015-05-06 17:20:15 +08:00
yanyiwu
56c524f7a8 yanyiwu.mit-license.org 2015-04-25 12:19:24 +08:00
aholic
d1a112c0c4 improve efficiency for trie tree in ugly way 2015-04-19 21:44:50 +08:00
aholic
ea0d464519 Merge https://github.com/yanyiwu/cppjieba 2015-03-19 22:57:04 +08:00
yanyiwu
5121bf675e __APPLE__ 2015-02-28 12:49:07 +08:00
yanyiwu
b3d928a450 rename aszxqw -> yanyiwu 2015-02-11 17:11:37 +08:00
Yanyi Wu
8fe97fc898 Merge pull request #39 from qinwf/patch-test
Add segmentation rule for English letters + digits qinwf/jiebaR#7
2015-02-06 10:59:43 +08:00
qinwf
c0bdef74fb 添加英文+数字分词规则 qinwf/jiebaR#7 2015-02-06 10:19:43 +08:00
yanyiwu
10e9b32258 little adjustment 2015-01-31 12:58:49 +08:00
yanyiwu
00f738a617 update husky for server 2015-01-31 10:14:16 +08:00
yanyiwu
660cd9d93e upload limonp for Colors.hpp and use ColorPrintln in load_test.cpp 2015-01-28 21:27:46 +08:00
yanyiwu
8c23da4332 remove debug log in hmm 2015-01-28 20:29:38 +08:00
yanyiwu
2488738b55 update unittest 2015-01-24 15:51:24 +08:00
yanyiwu
4e72d4a06f KeywordExtractor supports a custom dictionary (optional parameter). 2015-01-24 15:34:34 +08:00
yanyiwu
269bc0fd0d make QuerySegment support user.dict.utf8 2015-01-23 01:10:12 +08:00
yanyiwu
a406c0f8cc 2.4.4 2015-01-06 15:29:21 +08:00
yanyiwu
51e4583fd1 update email 2015-01-06 15:28:10 +08:00
yanyiwu
7304ccb854 add iosjieba into readme.md 2014-12-24 22:57:33 +08:00
yanyiwu
dc41c9eeb9 update jieba_rb 2014-12-24 19:13:09 +08:00
wyy
5858fe29a2 update changelog.md 2014-12-16 12:45:29 +08:00
wyy
e0e0a6b976 Fix typename compatibility issues across compiler versions 2014-12-16 12:44:48 +08:00
yanyiwu
0edb2b13cc cjieba 2014-12-16 01:30:14 +08:00
wyy
e84d57426d fix warnings 2014-11-30 01:13:25 +08:00
wyy
a63fe809b1 rm unused file 2014-11-30 00:34:17 +08:00
Yanyi Wu
de962ec97b Merge pull request #37 from qinwf/master
Remove duplicate header include in MPSegment.hpp, plus UBSAN testing
2014-11-30 00:11:42 +08:00
Qin Wenfeng
2b522b20ff Use uint8_t to pass UBSAN testing 2014-11-29 19:41:12 +08:00
Qin Wenfeng
61f2031e4b Remove duplicate header include in MPSegment.hpp 2014-11-29 19:36:55 +08:00
wyy
e9cbec02c2 Add two POS tagging rules for consecutive English letters and digits. 2014-11-29 12:45:11 +08:00
aholic
7791290473 Merge https://github.com/aszxqw/cppjieba 2014-11-14 13:20:04 +08:00
wyy
9d5359fc34 update changelog.md 2014-11-13 01:32:38 +08:00
wyy
7868f7cdff Remove some template code 2014-11-13 01:16:38 +08:00
wyy
c119dc0a93 use localvector in dag 2014-11-12 21:18:30 +08:00
wyy
99c3405e13 move flag 2014-11-12 20:03:32 +08:00
wyy
75367a20c9 little modification 2014-11-12 19:45:20 +08:00
wyy
3ced451212 use automation 2014-11-12 18:55:17 +08:00
wyy
b9736ee132 update trie and dag , make cut faster . see details in changelog.md 2014-11-05 15:31:09 +08:00
wyy
11b041ed52 make load_test test time longer 2014-11-05 14:57:34 +08:00
aholic
283c65db0a fetch ahead 2014-11-05 11:13:00 +08:00
aholic
c2125b5371 Merge https://github.com/aszxqw/cppjieba 2014-11-05 11:12:33 +08:00
Yanyi Wu
a3671ab252 Merge pull request #36 from qinwf/master
Add jiebaR to README
2014-11-04 12:11:06 +08:00
Qin Wenfeng
7bf2bceee4 Add jiebaR to README
Added jiebaR, the R language wrapper for CppJieba.
2014-11-04 12:07:33 +08:00
wyy
471a68e08e Add tests 2014-11-03 11:30:45 +08:00
wyy
107638f7d8 Update test data, etc. 2014-11-03 11:19:00 +08:00
wyy
fbae0f6075 Add two segmentation rules 2014-11-03 10:54:53 +08:00
wyy
b68a76e63a Improve some tests 2014-10-26 12:21:10 +08:00
aholic
e85a3ef8d3 fix bug for map.erase 2014-10-25 18:29:04 +08:00
wyy
11de561332 Support docker 2014-10-25 14:47:20 +08:00
wyy
22f5e06715 docker 2014-10-25 11:21:27 +08:00
wyy
6ac7a8c85c add dockerfile 2014-10-25 00:58:31 +08:00
Yanyi Wu
82d8a23ab9 Update README.md
Update wiki URL
2014-10-22 23:01:27 +08:00
wyy
ad02d2d43e Better support for Mac OS X 2014-10-16 00:08:21 +08:00
wyy
b572597777 Share dictionaries 2014-10-15 21:22:46 +08:00
wyy
0fd68846af update travis-ci for operating system osx 2014-10-12 16:22:56 +08:00
wyy
020aeaeeb0 update tagging_demo.cpp 2014-09-28 14:13:02 +08:00
wyy
ef5766904a Change the custom POS tag format to: word tag 2014-09-28 13:43:30 +08:00
wyy
6a8ebae344 Support custom POS tags 2014-09-28 13:22:37 +08:00
wyy
28246fba5d Remove some currently unused parameters from the PosTagger constructor and add unit tests for PosTagger. 2014-09-28 11:59:30 +08:00
wyy
da1b9e0c1c update limonp 2014-09-18 00:05:43 +08:00
wyy
23aee266c3 update changelog.md 2014-09-16 23:41:43 +08:00
wyy
49e3a1760f interrupt socket receive when header is too long. 2014-09-16 21:53:33 +08:00
wyy
198c483c66 update husky for become more stable 2014-09-15 23:03:58 +08:00
wyy
eb113acfbe update test/servertest 2014-09-15 22:21:37 +08:00
wyy
38af4a5fb6 update receive 2014-09-15 19:01:04 +08:00
wyy
fbbcfbdec7 update limonp and husky for threadpool using 2014-09-15 17:52:33 +08:00
wyy
e25828e0a9 update readme.md 2014-09-12 23:40:35 +08:00
wyy
698bde3c85 add ngx_http_cppjieba_module in readme.md 2014-09-06 20:52:23 +08:00
wyy
12befefe4e update changelog.md 2014-08-16 00:14:20 +08:00
wyy
269fee6f2c v2.4.2 2014-08-16 00:10:16 +08:00
wyy
4d686edb7f update unittest for compiling ok in mac 2014-08-15 22:30:52 +08:00
wyy
e317f25d94 update changelog.md 2014-08-15 22:12:02 +08:00
wyy
40eb40288d compatiable with -std=c++0x 2014-08-15 22:09:21 +08:00
wyy
9571a4d0d5 remove InitOnOff to make code lighter 2014-08-12 00:34:37 +08:00
wyy
5bfd3d0c49 update fullsegment for reducing memory cost 2014-08-11 23:34:29 +08:00
wyy
f6762e07ae update testing in readme.md 2014-07-28 20:39:01 +08:00
wyy
2113df1344 update readme.md 2014-07-23 00:36:49 +08:00
wyy
d6f114cd73 update changelog.md 2014-07-08 23:39:02 -07:00
wyy
8df0a1c89e fix max probability segmentor's bug : result is imcomplete while speical symbol in sentence 2014-07-08 23:38:06 -07:00
wyy
5b0ac64bc2 add unittest 2014-07-08 23:07:27 -07:00
wyy
007649494d avoid warning in cmake about Loggger.hpp 2014-07-05 19:18:39 +08:00
wyy
3c95ee686a update changelog.md 2014-06-13 00:36:34 +08:00
wyy
fc621ce856 add user_dict_path for server 2014-06-13 00:26:37 +08:00
wyy
c9c1ff5ac6 update readme.md 2014-06-12 23:58:55 +08:00
wyy
0ee13c8c06 fix bug about space in httpstr 2014-06-12 23:58:47 +08:00
wyy
8f5d08b7ae update readme.md 2014-06-11 19:49:14 +08:00
wyy
4a8f63fcd2 make segments NonCopyable 2014-06-11 16:18:09 +08:00
wyy
12d3741562 avoid warning in g++ 2014-06-05 19:29:57 +08:00
wyy
16e6ac0819 update changelog.md 2014-06-05 18:36:41 +08:00
wyy
a8f83dd6f0 update localvector 2014-06-05 18:30:08 +08:00
wyy
189b2725a0 add localvector 2014-06-05 01:00:17 +08:00
wyy
76dd93051e add localvector 2014-06-05 00:48:49 +08:00
wyy
014bea02ba update readme.md 2014-05-31 18:08:04 +08:00
wyy
c46980c17c minor change 2014-05-30 00:21:11 +08:00
wyy
e96885c38e update limonp/codeconverter.hpp 2014-05-29 23:57:32 +08:00
wyy
059f05c25d update limonp : add CodeConverter and delete some unused files 2014-05-29 22:39:22 +08:00
wyy
fb608627c9 update limonp 2014-05-26 17:15:52 +08:00
wyy
51ae3ffb87 update changelog.md 2014-05-24 16:14:19 +08:00
wyy
75581495b4 use vector's reserve 2014-05-24 16:09:00 +08:00
wyy
bc6ed2368d use vector's reserve 2014-05-24 15:37:31 +08:00
wyy
1a314d4b4c use vector's reserve 2014-05-24 13:44:55 +08:00
wyy
7eb896529f update .travis.yml 2014-05-24 13:36:12 +08:00
wyy
28cdc2e86b finished v2.4.1 2014-05-24 13:28:42 +08:00
wyy
ac49986592 little modification in readme.md 2014-05-22 15:18:20 +08:00
wyy
5a7f8fea95 Merge branch 'master' of github.com:aszxqw/cppjieba 2014-05-20 19:32:44 +08:00
wyy
0869568f4a modify blog url in readme.md 2014-05-20 19:31:06 +08:00
wyy
dd2e08f1e5 update EpollServer for cjserver 2014-05-17 21:21:28 -05:00
wyy
f0a0731b74 add server.conf into testdata for testing 2014-05-17 21:20:09 -05:00
wyy
f7108ce693 modify changelog.md 2014-05-17 16:28:59 +08:00
wyy
5b654f66db make single one chinese word in userdict will not be ignored in mixsegment.hpp 2014-05-17 16:22:54 +08:00
wyy
5174ac098a corrected word spell in script 2014-05-15 12:15:32 +08:00
wyy
fb25d4640c add some notices about gbk 2014-05-08 18:04:35 +08:00
wyy
2479bb1927 modify reamde.md 2014-04-27 18:44:03 +08:00
wyy
932bcc96db use travis 2014-04-26 18:27:03 +08:00
wyy
af01164c7f modify .travis.yml 2014-04-26 18:12:08 +08:00
wyy
4819d307e9 aded .travis.yml 2014-04-26 18:04:34 +08:00
wyy
376750a518 mofy cmakelists.txt for mac 2014-04-25 22:57:04 +08:00
wyy
ac6207635f modify changelog and readme 2014-04-25 22:34:22 +08:00
wyy
57ef504d9b modify test/segment_demo.cpp 2014-04-25 22:09:55 +08:00
wyy
f8487fd9cf remove src/segment and mv server.cpp into server/server.cpp and modify readme.md 2014-04-25 21:48:29 +08:00
wyy
94ae4bdd6f rm unused server in test 2014-04-25 21:21:05 +08:00
wyy
3e0aaf73a5 adding user dict interface and test ok 2014-04-25 19:30:26 +08:00
wyy
566187a49c add userdict.utf8 2014-04-25 19:22:32 +08:00
wyy
2937985243 adding user dict interface 2014-04-25 18:47:22 +08:00
wyy
dc96bb3795 add userdict loader 2014-04-25 17:29:42 +08:00
wyy
2f314ffdb1 mv *.gbk to gbk_dict 2014-04-25 17:13:14 +08:00
wyy
bea6174316 modify changelog.md 2014-04-20 00:24:30 +08:00
wyy
be3773920a modify keyword_demo 2014-04-20 00:23:42 +08:00
wyy
ae3e0a1b6a make keywordextractor faster 2014-04-20 00:20:25 +08:00
wyy
2645a4e837 add keyword extrator into load_test 2014-04-19 23:56:43 +08:00
wyy
cbe9642972 ci readme.md 2014-04-19 13:07:58 +08:00
wyy
884aa89009 add test case 2014-04-19 13:01:31 +08:00
wyy
9f100121f8 ci changelogmd 2014-04-19 12:45:43 +08:00
wyy
d6bf7cd10c modify test demo 2014-04-19 12:41:09 +08:00
wyy
e225c8c722 and modify some test case 2014-04-19 12:35:19 +08:00
wyy
a585471e76 rewrite cut for chinese special symbol 2014-04-19 11:25:13 +08:00
wyy
3d6bade24f Merge branch 'master' into for_09az 2014-04-16 20:46:15 +08:00
wyy
084bd91093 modify readme 2014-04-16 20:41:47 +08:00
wyy
d61d694ac7 do some rename 2014-04-16 19:12:24 +08:00
wyy
76d640b26e use filterSpecialChars in segmentbase.hpp 2014-04-14 22:21:09 +08:00
134 changed files with 3873 additions and 36206 deletions

40
.github/workflows/cmake-arm64.yml vendored Normal file

@ -0,0 +1,40 @@
name: CMake Windows ARM64
on:
  push:
  pull_request:
  workflow_dispatch:
env:
  BUILD_TYPE: Release
jobs:
  build-windows-arm64:
    runs-on: windows-2022
    strategy:
      matrix:
        cpp_version: [11, 14, 17, 20]
    steps:
      - name: Check out repository code
        uses: actions/checkout@v2
        with:
          submodules: recursive
      - name: Configure CMake
        # Configure CMake in a 'build' subdirectory. `CMAKE_BUILD_TYPE` is only required if you are using a single-configuration generator such as make.
        # See https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html?highlight=cmake_build_type
        # run: cmake -B ${{github.workspace}}/build -DCMAKE_BUILD_TYPE=${{env.BUILD_TYPE}}
        run: cmake -B ${{github.workspace}}/build -DBUILD_TESTING=ON -DCMAKE_CXX_STANDARD=${{matrix.cpp_version}} -DCMAKE_BUILD_TYPE=${{env.BUILD_TYPE}}
      - name: Build
        # Build your program with the given configuration
        # run: cmake --build ${{github.workspace}}/build --config ${{env.BUILD_TYPE}}
        run: cmake --build ${{github.workspace}}/build --config ${{env.BUILD_TYPE}}
      - name: Test
        working-directory: ${{github.workspace}}/build
        # Execute tests defined by the CMake configuration.
        # See https://cmake.org/cmake/help/latest/manual/ctest.1.html for more detail
        run: ctest -C ${{env.BUILD_TYPE}} --verbose

53
.github/workflows/cmake.yml vendored Normal file

@ -0,0 +1,53 @@
name: CMake
on:
  push:
  pull_request:
env:
  # Customize the CMake build type here (Release, Debug, RelWithDebInfo, etc.)
  BUILD_TYPE: Release
jobs:
  build:
    # The CMake configure and build commands are platform agnostic and should work equally well on Windows or Mac.
    # You can convert this to a matrix build if you need cross-platform coverage.
    # See: https://docs.github.com/en/free-pro-team@latest/actions/learn-github-actions/managing-complex-workflows#using-a-build-matrix
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [
          ubuntu-22.04,
          ubuntu-latest,
          macos-13,
          macos-14,
          macos-latest,
          windows-2019,
          windows-2022,
          windows-latest,
        ]
        cpp_version: [11, 14, 17, 20]
    steps:
      - name: Check out repository code
        uses: actions/checkout@v2
        with:
          submodules: recursive
      - name: Configure CMake
        # Configure CMake in a 'build' subdirectory. `CMAKE_BUILD_TYPE` is only required if you are using a single-configuration generator such as make.
        # See https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html?highlight=cmake_build_type
        # run: cmake -B ${{github.workspace}}/build -DCMAKE_BUILD_TYPE=${{env.BUILD_TYPE}}
        run: cmake -B ${{github.workspace}}/build -DBUILD_TESTING=ON -DCMAKE_CXX_STANDARD=${{matrix.cpp_version}} -DCMAKE_BUILD_TYPE=${{env.BUILD_TYPE}}
      - name: Build
        # Build your program with the given configuration
        # run: cmake --build ${{github.workspace}}/build --config ${{env.BUILD_TYPE}}
        run: cmake --build ${{github.workspace}}/build --config ${{env.BUILD_TYPE}}
      - name: Test
        working-directory: ${{github.workspace}}/build
        # Execute tests defined by the CMake configuration.
        # See https://cmake.org/cmake/help/latest/manual/ctest.1.html for more detail
        run: ctest -C ${{env.BUILD_TYPE}} --verbose

25
.github/workflows/stale-issues.yml vendored Normal file

@ -0,0 +1,25 @@
name: Close Stale Issues
on:
  schedule:
    - cron: '0 0 3 */3 *' # Every three months on the 3rd day at midnight
jobs:
  stale:
    runs-on: ubuntu-latest
    permissions:
      issues: write
      pull-requests: write
    steps:
      - uses: actions/stale@v5
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          stale-issue-message: 'This issue has not been updated for over 1 year and will be marked as stale. If the issue still exists, please comment or update the issue, otherwise it will be closed after 7 days.'
          close-issue-message: 'This issue has been automatically closed due to inactivity. If the issue still exists, please reopen it.'
          days-before-issue-stale: 365
          days-before-issue-close: 7
          stale-issue-label: 'Stale'
          exempt-issue-labels: 'pinned,security'
          operations-per-run: 100

2
.gitignore vendored

@ -15,3 +15,5 @@ tmp
t.*
*.pid
build
Testing/Temporary/CTestCostData.txt
Testing/Temporary/LastTest.log

3
.gitmodules vendored Normal file

@ -0,0 +1,3 @@
[submodule "deps/limonp"]
path = deps/limonp
url = https://github.com/yanyiwu/limonp.git

313
CHANGELOG.md Normal file

@ -0,0 +1,313 @@
# CHANGELOG
## v5.5.0
+ feat: add Windows ARM64 build support
+ build: upgrade googletest from 1.11.0 to 1.12.1
+ build: update CMake minimum version requirement to 3.10
+ fix: make namespaces explicit and fix missing includes
+ ci: update stale-issues workflow configuration
## v5.4.0
+ unittest: class Jieba add default argument input
+ class Jieba: support default dictpath
+ cmake: avoid testing when FetchContent by other project
+ class DictTrie: removed unused var
## v5.3.2
+ removed test/demo.cpp and linked https://github.com/yanyiwu/cppjieba-demo
+ Update Demo Link in README.md
+ [github/actions] stale 1 year ago issues
+ limonp v0.9.0 -> v1.0.0
## v5.3.1
+ [cmake] fetch googletest
+ [submodules] rm test/googletest
## v5.3.0
+ [c++17,c++20] compatibility
+ limonp version 0.6.7 -> 0.9.0
## v5.2.0
+ [CI] windows-[2019,2022]
+ [googletest] v1.6.0->v1.10.0
+ [CI] ubuntu version from 20 to 22, macos version from 12 to 14
+ [CMake] mini_required 2.6->3.5 and fix CXX_VERSION variable passed from cmd
+ [CI] matrix and multi cpp version [11, 14]
## v5.1.3
+ [googletest] git submodule add googletest-1.6.0
## v5.1.2
+ [submodule:deps/limonp] upgrade to v0.6.7
## v5.1.1
+ Merged [pr-186](https://github.com/yanyiwu/cppjieba/pull/186)
## v5.1.0
+ Merged [feature: add RemoveWord api from gojieba/pull/99 #172](https://github.com/yanyiwu/cppjieba/pull/172)
## v5.0.5
+ Merged [pr-171 submodule update limonp to v0.6.6 #171](https://github.com/yanyiwu/cppjieba/pull/171)
## v5.0.4
+ Merged [pr-168 limonp as submodule #168](https://github.com/yanyiwu/cppjieba/pull/168)
## v5.0.3
+ Upgrade [limonp](https://github.com/yanyiwu/limonp) -> v0.6.3
## v5.0.2
+ Upgrade [limonp](https://github.com/yanyiwu/limonp) -> v0.6.1
## v5.0.1
+ Make Compiler Happier.
+ Add PHP, DLang Links.
## v5.0.0
+ Notice(**api changed**) : Jieba class 3 arguments -> 5 arguments, and use KeywordExtractor in Jieba
## v4.8.1
+ add TextRankExtractor by [@questionfish] in [pull request 65](https://github.com/yanyiwu/cppjieba/pull/65)
+ add Jieba::ResetSeparators api for some special situation, for example in [issue67](https://github.com/yanyiwu/cppjieba/issues/67)
+ fix [issue70](https://github.com/yanyiwu/cppjieba/issues/70)
+ support (word, freq, tag) format in user_dict, see details in [pr74](https://github.com/yanyiwu/cppjieba/pull/74)
## v4.8.0
+ rewrite QuerySegment, make `Jieba::CutForSearch` behave the same as the [jieba] `cut_for_search` api
+ remove Jieba::SetQuerySegmentThreshold
## v4.7.0
api changes:
+ override Cut functions, add location information into Word results;
+ remove LevelSegment;
+ remove Jieba::Locate;
upgrade:
+ limonp -> v0.6.1
## v4.6.0
+ Change Jieba::Locate(deprecated) to be static function.
+ Change the return value of KeywordExtractor::Extract from bool to void.
+ Add KeywordExtractor::Word and add more overloaded KeywordExtractor::Extract
## v4.5.3
+ Upgrade limonp to v0.6.0
## v4.5.2
+ Upgrade limonp to v0.5.6 to fix hidden trouble.
## v4.5.1
+ Upgrade limonp to v0.5.5 to solve macro name conflicts in some special cases.
## v4.5.0
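+ Remove the earlier, poorly designed uint16-specific optimization in Trie that used an array in place of a map.
That design assumed every Unicode character fits in a uint16, which made broader support for Unicode characters from other languages impossible.
+ Change the Rune type from 16 bits to 32 bits to support more Unicode characters, including some rare Chinese characters.
## v4.4.1
+ Checking for memory leaks with valgrind located a bug in HMM model initialization that leaked memory. The leak is not critical,
because it only occurs while dictionaries are loaded, and dictionary loading normally runs only once, so it does not cause serious problems.
+ Many thanks to [qinwf] for helping me find this bug.
## v4.4.0
+ Adding code is easy and deleting it is hard. After much thought, I decided to split the Server source code out of this project.
+ Let [cppjieba] return to its original simplicity: remove the non-essential server code so the project travels light and focuses on the core segmentation code.
+ By the way, if you really need the old server code, you can find it in the new repository [cppjieba-server](https://github.com/yanyiwu/cppjieba-server).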
## v4.3.3
+ Yet Another Incompatibility Problem Repair: Upgrade [limonp] to version v0.5.3, fix incompatibility problem in Windows
## v4.3.2
+ Upgrade [limonp] to version v0.5.2, fix incompatibility problem in Windows
## v4.3.1
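+ Overload the KeywordExtractor constructor so that a Jieba instance can be passed in to construct the dictionary and model.
## v4.3.0
Source directory layout changes:
1. src/ -> include/cppjieba/
2. src/limonp/ -> deps/limonp/
3. server/husky -> deps/husky/
4. test/unittest/gtest -> deps/gtest
Dependency upgrades:
1. [limonp] to version v0.5.1
2. [husky] to version v0.2.0
## v4.2.1
1. Upgrade [limonp] to version v0.4.1, [husky] to version v0.2.0
## v4.2.0
1. Fix the multi-dictionary separator problem on Windows described in [issue50]: the separator changes from ':' to '|' or ';'.
## v4.1.2
1. Add the Jieba::Locate function, which computes the positions of words in a segmentation result; useful in some scenarios, such as highlighting search results.
## v4.1.1
1. Add a POS tagging function, Jieba::Tag, to class Jieba.
## v4.1.0
1. QuerySegment adds a check when cutting: a long word that satisfies IsAllAscii (for example, an English word) is not segmented further at fine granularity.
2. QuerySegment adds SetMaxWordLen and GetMaxWordLen to set the word-length threshold that triggers the second segmentation pass.
3. Jieba adds SetQuerySegmentThreshold to set the word-length threshold of CutForSearch.
## v4.0.0
1. Support loading multiple user dictionaries; multiple dictionary paths are separated by a colon (:), as a tribute to the PATH environment variable.
2. User dictionary entries carry no weight. New user words previously defaulted to the maximum frequency weight; this is now configurable and defaults to the median.
3. [Compatibility warning] Change some code style, such as lowercasing the namespace: CppJieba becomes cppjieba.
4. [Compatibility warning] Deprecate Application.hpp in favor of Jieba.hpp. The interface has also changed substantially; the function style is more uniform and closer to the Python version of Jieba.
## v3.2.1
1. Fix a bug caused by an incorrect include guard in Jieba.hpp.
## v3.2.0
1. Use a Trie optimization that is rather tricky from an engineering standpoint. Drop the earlier `Aho-Corasick-Automation` implementation; the result is more readable and faster.
2. Add a hierarchical segmenter: LevelSegment.
3. Add fine-grained segmentation to MPSegment.
4. Add class Jieba, which provides a more readable interface.
5. Drop the unified ISegment interface, because a unified interface limited the flexibility of the segmentation methods and blocked some new features.
6. Add an optional parameter hmm that controls new-word discovery (on by default), so that both MixSegment and QuerySegment can switch new-word discovery on or off.
## v3.1.0
1. Add an API for adding words to the dictionary at run time: insertUserWord
2. Add a default argument to the cut function; the Mix algorithm is used by default. See README.md for details on the segmentation algorithms.
## v3.0.1
1. Improve compatibility; fix compile errors in some specific environments.
## v3.0.0
1. Make QuerySegment support a custom dictionary (optional parameter).
2. Make KeywordExtractor support a custom dictionary (optional parameter).
3. Change the code style to follow the Google code style.
4. Add more detailed error logs; use LogFatal where appropriate during initialization.
5. Add the Application class, which bundles all CppJieba functionality; from now on users only need this class.
6. Update the cjserver service so that the segmentation algorithm can be selected via an HTTP parameter.
7. Change the make install destination so that everything is installed under a single directory, /usr/local/cppjieba.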
## v2.4.4
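1. Change two finer-grained special filtering rules so that consecutive digits (including floating-point numbers) and consecutive letters are cut out separately (rather than mixed together).
2. Change the DAG data structure used by the dynamic programming step of the maximum probability method (and the Trie's DAG lookup function), improving segmentation speed by 8%.
3. Use the `Aho-Corasick-Automation` algorithm to speed up Trie lookup, among other optimizations, for better performance.
4. Add two special rules for POS tagging.
## v2.4.3
1. Update the [husky] server code; the new [husky] is a simple thread-pool-based server framework. Also fix possible data loss when the body of an HTTP POST request is too long.
2. Change the PosTagger parameters, removing currently unused ones, and add a parameter for a custom dictionary, that is, support **custom POS tags**.
3. Better support for `mac osx` (forgive the author for being so broke that he bought a `mac` this late).
4. Support `Docker`; see the `Dockerfile` for details.
## v2.4.2
1. On top of appropriate use of `vector`, use `limonp/LocalVector.hpp` as the `Unicode` type, among other optimizations, improving performance by about `30%`.
2. Make `cjserver` support a user-defined dictionary, configured via `user_dict_path` in `conf/server.conf`.
3. Fix incomplete segmentation results from `MPSegment` when the sentence contains special characters.
4. Change `FullSegment` to reduce memory usage.
5. Fix compile failures with `-std=c++0x` or `-std=c++11`.
## v2.4.1
1. Improve segmentation of some special characters and letter strings.
2. Speed up keyword extraction.
3. Provide an interface for user-defined dictionaries.
4. Move the server-related code into its own `server/` directory.
5. Fix single-character words in the user dictionary being ignored by MixSegment's new-word discovery. The dictionary priority is now: user dictionary first, then the built-in dictionary, then words found by new-word discovery.
## v2.4.0
1. Support older versions of `g++` and `cmake`; tested with `g++ 4.1.2` and `cmake 2.6`.
2. Change some test case files to reduce compile time for tests.
3. Fix problems related to `make install`.
4. Add a POST request interface to the HTTP service.
5. Split `Trie.hpp` into `DictTrie.hpp` and `Trie.hpp`, abstracting the trie data structure, fixing latent bugs in the Trie class, and improving the unit tests.
6. Rewrite cjserver start and stop; see README.md for the new start and stop procedure.
## v2.3.4
1. Fix a design problem by removing the `TrieManager` class, to avoid potential hazards.
2. Add the `stop_words.utf8` dictionary and change the `KeywordExtractor` initialization function to use it.
3. Improve the code structure of the `Trie`-related parts.
## v2.3.3
1. Fix results differing across machines because of unordered_map.
2. Change some data structures from unordered_map to map, improving segmentation speed by about 1/6. (unordered_map has fast lookup but is less efficient for range iteration.)
## v2.3.2
1. Fix unit test problems; some cases produced different results on x86 and x64.
2. Merge in a simple version of POS tagging.
## v2.3.1
1. Fix the service startup problem at install time. Installing the segmentation service is only an add-on for Linux and does not affect the core code.
## v2.3.0
1. Add `KeywordExtractor.hpp` for keyword extraction.
2. Use `gtest` for unit testing.
## v2.2.0
1. Performance optimization: segmentation is about 6 times faster.
2. Nothing else comes to mind for now.
## v2.1.1 (everything before v2.1.1 is recorded here as well)
1. Implement the __maximum probability segmentation algorithm__ and the __HMM segmentation algorithm__, and combine them into `MixSegment`, which gives the best results.
2. Refactor extensively; the main functional code is now written as hpp files.
3. Use `cmake` to manage the project.
4. Use [limonp] as the utility library for logging, string operations, and other common functions.
5. Use [husky] as a simple server framework for the segmentation service.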
[limonp]:http://github.com/yanyiwu/limonp.git
[husky]:http://github.com/yanyiwu/husky.git
[issue50]:https://github.com/yanyiwu/cppjieba/issues/50
[qinwf]:https://github.com/yanyiwu/cppjieba/pull/53#issuecomment-176264929
[jieba]:https://github.com/fxsjy/jieba
[@questionfish]:https://github.com/questionfish
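
The v4.2.0 entry above changes the multi-dictionary separator from ':' to '|' or ';'. The changelog only cites issue50 for the reason; one plausible explanation (an assumption here) is that ':' also appears in Windows drive-letter paths such as `C:\dict`. The sketch below shows how such a path list could be split. `SplitDictPaths` is a hypothetical helper written for illustration with the standard library only; it is not cppjieba's own code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of cppjieba): splits a list of dictionary
// paths on '|' or ';' and skips empty pieces.
std::vector<std::string> SplitDictPaths(const std::string& paths) {
  std::vector<std::string> result;
  std::string::size_type start = 0;
  while (start <= paths.size()) {
    // Find the next separator; treat end-of-string as the final boundary.
    std::string::size_type end = paths.find_first_of("|;", start);
    if (end == std::string::npos) {
      end = paths.size();
    }
    if (end > start) {
      result.push_back(paths.substr(start, end - start));
    }
    start = end + 1;
  }
  return result;
}
```

With this splitting rule, a Windows-style list such as `C:\dict\user.utf8;D:\extra.utf8` yields two paths, whereas splitting on ':' would have cut each path at its drive letter.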


@ -1,24 +1,31 @@
CMAKE_MINIMUM_REQUIRED (VERSION 3.10)
PROJECT(CPPJIEBA)
CMAKE_MINIMUM_REQUIRED (VERSION 2.6)
INCLUDE_DIRECTORIES(${PROJECT_SOURCE_DIR}/deps/limonp/include
${PROJECT_SOURCE_DIR}/include)
if (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT)
set (CMAKE_INSTALL_PREFIX "/usr" CACHE PATH "default install path" FORCE )
if(NOT DEFINED CMAKE_CXX_STANDARD)
set(CMAKE_CXX_STANDARD 11)
endif()
ADD_DEFINITIONS(-O3 -Wall -g)
IF (DEFINED ENC)
ADD_DEFINITIONS(-DCPPJIEBA_${ENC})
ENDIF()
#ADD_DEFINITIONS(-DNO_FILTER)
ADD_SUBDIRECTORY(src)
ADD_SUBDIRECTORY(dict)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)
IF("${CMAKE_SYSTEM}" MATCHES "Linux")
ADD_SUBDIRECTORY(script)
ADD_SUBDIRECTORY(conf)
ADD_DEFINITIONS(-O3 -g)
# Define a variable to check if this is the top-level project
if(NOT DEFINED CPPJIEBA_TOP_LEVEL_PROJECT)
if(CMAKE_CURRENT_SOURCE_DIR STREQUAL CMAKE_SOURCE_DIR)
set(CPPJIEBA_TOP_LEVEL_PROJECT ON)
else()
set(CPPJIEBA_TOP_LEVEL_PROJECT OFF)
endif()
endif()
ADD_SUBDIRECTORY(test)
if(CPPJIEBA_TOP_LEVEL_PROJECT)
ENABLE_TESTING()
ENABLE_TESTING()
ADD_TEST(NAME test.run COMMAND test.run)
message(STATUS "MSVC value: ${MSVC}")
ADD_SUBDIRECTORY(test)
ADD_TEST(NAME ./test/test.run COMMAND ./test/test.run)
ADD_TEST(NAME ./load_test COMMAND ./load_test)
endif()


@ -1,49 +0,0 @@
# CppJieba ChangeLog
## v2.4.0
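1. Support older versions of `g++` and `cmake`; tested with `g++ 4.1.2` and `cmake 2.6`.
2. Change some test case files to reduce compile time for tests.
3. Fix problems related to `make install`.
4. Add a POST request interface to the HTTP service.
5. Split `Trie.hpp` into `DictTrie.hpp` and `Trie.hpp`, abstracting the trie data structure, fixing latent bugs in the Trie class, and improving the unit tests.
6. Rewrite cjserver start and stop; see README.md for the new start and stop procedure.
## v2.3.4
1. Fix a design problem by removing the `TrieManager` class, to avoid potential hazards.
2. Add the `stop_words.utf8` dictionary and change the `KeywordExtractor` initialization function to use it.
3. Improve the code structure of the `Trie`-related parts.
## v2.3.3
1. Fix results differing across machines because of unordered_map.
2. Change some data structures from unordered_map to map, improving segmentation speed by about 1/6. (unordered_map has fast lookup but is less efficient for range iteration.)
## v2.3.2
1. Fix unit test problems; some cases produced different results on x86 and x64.
2. Merge in a simple version of POS tagging.
## v2.3.1
1. Fix the service startup problem at install time. Installing the segmentation service is only an add-on for Linux and does not affect the core code.
## v2.3.0
1. Add `KeywordExtractor.hpp` for keyword extraction.
2. Use `gtest` for unit testing.
## v2.2.0
1. Performance optimization: segmentation is about 6 times faster.
2. Nothing else comes to mind for now.
## v2.1.1 (everything before v2.1.1 is recorded here as well)
1. Implement the __maximum probability segmentation algorithm__ and the __HMM segmentation algorithm__, and combine them into `MixSegment`, which gives the best results.
2. Refactor extensively; the main functional code is now written as hpp files.
3. Use `cmake` to manage the project.
4. Use `Limonp` as the utility library for logging, string operations, and other common functions.
5. Use `Husky` as a simple server framework for the segmentation service.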


@ -1,6 +1,6 @@
The MIT License (MIT)
Copyright (c) 2013 Yanyi Wu
Copyright (c) 2013
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in

260
README.md

@ -1,103 +1,85 @@
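#CppJieba is the C++ version of the "Jieba" Chinese word segmenter
# CppJieba
All functional code is written as hpp files. File dependencies have always been a nuisance, and shipping everything as hpp headers is meant to eliminate link-time dependencies.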
[![CMake](https://github.com/yanyiwu/cppjieba/actions/workflows/cmake.yml/badge.svg)](https://github.com/yanyiwu/cppjieba/actions/workflows/cmake.yml)
[![Author](https://img.shields.io/badge/author-@yanyiwu-blue.svg?style=flat)](http://yanyiwu.com/)
[![Platform](https://img.shields.io/badge/platform-Linux,macOS,Windows-green.svg?style=flat)](https://github.com/yanyiwu/cppjieba)
[![Performance](https://img.shields.io/badge/performance-excellent-brightgreen.svg?style=flat)](http://yanyiwu.com/work/2015/06/14/jieba-series-performance-test.html)
[![Tag](https://img.shields.io/github/v/tag/yanyiwu/cppjieba.svg)](https://github.com/yanyiwu/cppjieba/releases)
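**No dependencies, no harm.**
## Introduction
In practice, hpp-only code has proved very pleasant to use: both the iOS usage mentioned later and the `Node.js` extension wrapper [NodeJieba] went very smoothly.
CppJieba is the C++ version of the "Jieba" Chinese word segmenter
If you are interested in the code details, see the [code walkthrough][代码详解]
### Key features
## Chinese encoding
- 🚀 High performance: stability and performance proven in production
- 📦 Easy to integrate: the source is provided as header files (`include/cppjieba/*.hpp`); just include them
- 🔍 Multiple segmentation modes: precise mode, full mode, search engine mode, and more
- 📚 Custom dictionaries: supports user-defined dictionaries and multiple dictionary paths (separated by '|' or ';')
- 💻 Cross-platform: supports Linux, macOS, and Windows
- 🌈 UTF-8 encoding: native support for UTF-8 encoded Chinese text
Segmentation of utf8 and gbk encoded text is currently supported.
## Quick start
## Installation and usage
### Requirements
### Dependencies
- C++ compiler:
- g++ (4.1 or later recommended)
- or clang++
- cmake (2.6 or later recommended)
* g++ (version >= 4.1 recommended);
* cmake (version >= 2.6 recommended);
### Download and install
### Installation steps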
```sh
wget https://github.com/aszxqw/cppjieba/archive/master.zip -O cppjieba-master.zip
unzip cppjieba-master.zip
cd cppjieba-master
git clone https://github.com/yanyiwu/cppjieba.git
cd cppjieba
git submodule init
git submodule update
mkdir build
cd build
cmake ..
# utf8 is the default encoding; to use gbk, use the following cmake command instead
# cmake .. -DENC=GBK
make
sudo make install
make test
```
#### Testing
```sh
make test
```
### Starting the service
Running the service in the background requires `start-stop-daemon`, which ships with Ubuntu but must be installed manually on CentOS.
## Usage example
```
#Usage: /etc/init.d/cjserver {start|stop|restart|force-reload}
#Start
/etc/init.d/cjserver.start
#Stop
/etc/init.d/cjserver.stop
./demo
```
#### Testing the service
Then open `http://127.0.0.1:11200/?key=南京市长江大桥` in Chrome
(Chrome is used because its default encoding is utf-8)
Or run `curl "http://127.0.0.1:11200/?key=南京市长江大桥"` (on Ubuntu, install curl with `sudo apt-get install curl`)
You will see the following result (in JSON format):
Sample output:
```
["南京市", "长江大桥"]
[demo] Cut With HMM
他/来到/了/网易/杭研/大厦
[demo] Cut Without HMM
他/来到/了/网易/杭/研/大厦
我来到北京清华大学
[demo] CutAll
我/来到/北京/清华/清华大学/华大/大学
小明硕士毕业于中国科学院计算所,后在日本京都大学深造
[demo] CutForSearch
小明/硕士/毕业/于/中国/科学/学院/科学院/中国科学院/计算/计算所//后/在/日本/京都/大学/日本京都大学/深造
[demo] Insert User Word
男默/女泪
男默女泪
[demo] CutForSearch Word With Offset
[{"word": "小明", "offset": 0}, {"word": "硕士", "offset": 6}, {"word": "毕业", "offset": 12}, {"word": "于", "offset": 18}, {"word": "中国", "offset": 21}, {"word": "科学", "offset": 27}, {"word": "学院", "offset": 30}, {"word": "科学院", "offset": 27}, {"word": "中国科学院", "offset": 21}, {"word": "计算", "offset": 36}, {"word": "计算所", "offset": 36}, {"word": "", "offset": 45}, {"word": "后", "offset": 48}, {"word": "在", "offset": 51}, {"word": "日本", "offset": 54}, {"word": "京都", "offset": 60}, {"word": "大学", "offset": 66}, {"word": "日本京都大学", "offset": 54}, {"word": "深造", "offset": 72}]
[demo] Tagging
我是拖拉机学院手扶拖拉机专业的。不用多久我就会升职加薪当上CEO走上人生巅峰。
[我:r, 是:v, 拖拉机:n, 学院:n, 手扶拖拉机:n, 专业:n, 的:uj, 。:x, 不用:v, 多久:m, :x, 我:r, 就:d, 会:v, 升职:v, 加薪:nr, :x, 当上:t, CEO:eng, :x, 走上:v, 人生:n, 巅峰:n, 。:x]
[demo] Keyword Extraction
我是拖拉机学院手扶拖拉机专业的。不用多久我就会升职加薪当上CEO走上人生巅峰。
[{"word": "CEO", "offset": [93], "weight": 11.7392}, {"word": "升职", "offset": [72], "weight": 10.8562}, {"word": "加薪", "offset": [78], "weight": 10.6426}, {"word": "手扶拖拉机", "offset": [21], "weight": 10.0089}, {"word": "巅峰", "offset": [111], "weight": 9.49396}]
```
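
The `offset` values in the sample output above advance by 3 per Chinese character (0, 6, 12, 18, 21, ...), which is consistent with byte offsets into the UTF-8 input rather than character indices. The sketch below illustrates that reading. `ByteOffsets` is a hypothetical helper, not part of cppjieba's API, and it only covers consecutive, non-overlapping words (the search-mode output above also contains overlapping words, which this sketch does not model).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of cppjieba): given consecutive,
// non-overlapping words, returns each word's byte offset within their
// concatenation.
std::vector<std::size_t> ByteOffsets(const std::vector<std::string>& words) {
  std::vector<std::size_t> offsets;
  std::size_t pos = 0;
  for (const std::string& word : words) {
    offsets.push_back(pos);
    // std::string::size() counts bytes, so a Chinese character encoded as
    // three UTF-8 bytes advances the offset by 3.
    pos += word.size();
  }
  return offsets;
}
```

Callers that need character positions instead (for example, to highlight text in a UI that indexes by code point) would have to convert these byte offsets themselves.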
If you call it as follows:
For more details, please see [demo](https://github.com/yanyiwu/cppjieba-demo).
```
curl "http://127.0.0.1:11200/?key=南京市长江大桥&format=simple"
```
### Segmentation examples
the result is as follows (separated by spaces):
```
南京市 长江大桥
```
HTTP POST is also supported; call it as follows:
```
curl -d "南京市长江大桥" "http://127.0.0.1:11200/"
```
The result is:
```
["南京市", "长江大桥"]
```
### Uninstall
```sh
cd build/
cat install_manifest.txt | sudo xargs rm -rf
```
## Segmentation results
### MPSegment's demo
**MPSegment**
Output:
```
@ -112,9 +94,8 @@ Output:
```
### HMMSegment's demo
**HMMSegment**
Output:
```
我来到北京清华大学
我来/到/北京/清华大学
@ -127,9 +108,8 @@ Output:
```
### MixSegment's demo
**MixSegment**
Output:
```
我来到北京清华大学
我/来到/北京/清华大学
@ -142,9 +122,8 @@ Output:
```
### FullSegment's demo
**FullSegment**
Output:
```
我来到北京清华大学
我/来到/北京/清华/清华大学/华大/大学
@ -157,9 +136,8 @@ Output:
```
### QuerySegment's demo
**QuerySegment**
Output:
```
我来到北京清华大学
我/来到/北京/清华/清华大学/华大/大学
@ -172,8 +150,6 @@ Output:
```
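### Analysis
Shown above, in order, are the results of the MP, HMM, and Mix methods.
Mix, the algorithm that combines MP and HMM, gives the best results: it accurately cuts words that are in the dictionary and also cuts out-of-vocabulary words such as "杭研".
@ -182,72 +158,88 @@ The Full method cuts out every word found in the dictionary.
The Query method first segments with Mix, then applies Full to the longer words that result.
### Custom user dictionary
For a sample custom dictionary, see `dict/user.dict.utf8`.
Result without a custom user dictionary: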
```
令狐冲/是/云/计算/行业/的/专家
```
Result with a custom user dictionary:
```
令狐冲/是/云计算/行业/的/专家
```
### Keyword extraction
```
make && ./test/keyword.demo
我是拖拉机学院手扶拖拉机专业的。不用多久我就会升职加薪当上CEO走上人生巅峰。
["CEO:11.7392", "升职:10.8562", "加薪:10.6426", "手扶拖拉机:10.0089", "巅峰:9.49396"]
```
you will see:
```
我是蓝翔技工拖拉机学院手扶拖拉机专业的。不用多久我就会升职加薪当上总经理出任CEO迎娶白富美走上人生巅峰。
->
["CEO:11.7392", "蓝翔:11.7392", "白富美:11.7392", "升职:10.8562", "加薪:10.6426"]
```
For the keyword extraction demo code, see `test/keyword_demo.cpp`.
For more details, please see [demo](https://github.com/yanyiwu/cppjieba-demo).
### POS tagging
```
make && ./test/tagging_demo
我是蓝翔技工拖拉机学院手扶拖拉机专业的。不用多久我就会升职加薪当上总经理出任CEO迎娶白富美走上人生巅峰。
["我:r", "是:v", "拖拉机:n", "学院:n", "手扶拖拉机:n", "专业:n", "的:uj", "。:x", "不用:v", "多久:m", ":x", "我:r", "就:d", "会:v", "升职:v", "加薪:nr", ":x", "当上:t", "CEO:eng", ":x", "走上:v", "人生:n", "巅峰:n", "。:x"]
```
For more details, please see [demo](https://github.com/yanyiwu/cppjieba-demo).
Custom POS tags are supported.
For example, add this line to `dict/user.dict.utf8`:
```
["我:r", "是:v", "蓝翔:x", "技工:n", "拖拉机:n", "学院:n", "手扶拖拉机:n", "专业:n", "的:uj", "。:x", "不用:v", "多久:m", ":x", "我:r", "就:d", "会:v", "升职:v", "加薪:nr", ":x", "当:t", "上:f", "总经理:n", ":x", "出任:v", "CEO:x", ":x", "迎娶:v", "白富美:x", ":x", "走上:v", "人生:n", "巅峰:n", "。:x"]
蓝翔 nz
```
__POS tagging is unfinished; the current version is a simple one.__
The result is:
```
["我:r", "是:v", "蓝翔:nz", "技工:n", "拖拉机:n", "学院:n", "手扶拖拉机:n", "专业:n", "的:uj", "。:x", "不用:v", "多久:m", ":x", "我:r", "就:d", "会:v", "升职:v", "加薪:nr", ":x", "当:t", "上:f", "总经理:n", ":x", "出任:v", "CEO:eng", ":x", "迎娶:v", "白富美:x", ":x", "走上:v", "人生:n", "巅峰:n", "。:x"]
```
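## Other shared dictionary resources
+ [dict.367W.utf8] iLife(562193561 at qq.com)
## Ecosystem
CppJieba is widely used as the basis of segmentation implementations in many programming languages:
- [GoJieba](https://github.com/yanyiwu/gojieba) - Go version
- [NodeJieba](https://github.com/yanyiwu/nodejieba) - Node.js version
- [CJieba](https://github.com/yanyiwu/cjieba) - C version
- [jiebaR](https://github.com/qinwf/jiebaR) - R version
- [exjieba](https://github.com/falood/exjieba) - Erlang version
- [jieba_rb](https://github.com/altkatz/jieba_rb) - Ruby version
- [iosjieba](https://github.com/yanyiwu/iosjieba) - iOS version
- [phpjieba](https://github.com/jonnywang/phpjieba) - PHP version
- [perl5-jieba](https://metacpan.org/pod/distribution/Lingua-ZH-Jieba/lib/Lingua/ZH/Jieba.pod) - Perl version
### Projects using CppJieba
- [simhash](https://github.com/yanyiwu/simhash) - Chinese document similarity computation
- [pg_jieba](https://github.com/jaiminpan/pg_jieba) - PostgreSQL segmentation plugin
- [gitbook-plugin-search-pro](https://plugins.gitbook.com/plugin/search-pro) - Gitbook Chinese search plugin
- [ngx_http_cppjieba_module](https://github.com/yanyiwu/ngx_http_cppjieba_module) - Nginx segmentation plugin
## Contributing
We welcome contributions of all kinds, including but not limited to:
- Reporting issues and making suggestions
- Improving documentation
- Submitting code fixes
- Adding new features
If you find CppJieba helpful, please star ⭐️ the project to show your support!
## Related applications
### Using CppJieba through cross-language wrappers
I have received emails asking about cross-language wrapping (iOS app development). I have no experience in this area; I suggest looking at the following project, which uses cppjieba from Python:
[cppjiebapy], a project developed by [jannson] for use from Python modules, and the related discussion [cppjiebapy_discussion].
### NodeJieba
If you need segmentation in `node.js`, try [NodeJieba].
### simhash
If you need to compute similarity between Chinese documents, try [simhash].
## Demo
http://cppjieba-webdemo.herokuapp.com/
(Chrome is recommended)
## Support
If you run into problems or have any questions, feel free to contact: wuyanyi09@gmail.com
## Acknowledgments
Author of the "Jieba" Chinese word segmenter: SunJunyi
https://github.com/fxsjy/jieba
As the name suggests, CppJieba was written with reference to the Python program Jieba, so credit where it is due: thanks again to SunJunyi.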
[CppJieba]:https://github.com/aszxqw/cppjieba
[jannson]:https://github.com/jannson
[cppjiebapy]:https://github.com/jannson/cppjiebapy
[cppjiebapy_discussion]:https://github.com/aszxqw/cppjieba/issues/1
[NodeJieba]:https://github.com/aszxqw/nodejieba
[simhash]:https://github.com/aszxqw/simhash
[代码详解]:http://aszxqw.github.io/jekyll/update/2014/02/10/cppjieba-dai-ma-xiang-jie.html


@ -1 +0,0 @@
INSTALL(FILES server.conf DESTINATION /etc/CppJieba)


@ -1,11 +0,0 @@
# config
#socket listen port
port=11200
#dict path
dict_path=/usr/share/CppJieba/dict/jieba.dict.utf8
#model path
model_path=/usr/share/CppJieba/dict/hmm_model.utf8

1
deps/limonp vendored Submodule

@ -0,0 +1 @@
Subproject commit 5c82a3f17e4e0adc6a5decfe245054b0ed533d1a


@ -1 +0,0 @@
INSTALL(FILES hmm_model.utf8 jieba.dict.utf8 DESTINATION share/CppJieba/dict)


@ -312698,7 +312698,6 @@ T恤 4 n
部属 1126 n
部属工作 3 n
部属院校 3 n
部手机 33 n
部族 643 n
部标 4 n
部省级 2 n

4
dict/user.dict.utf8 Normal file

@ -0,0 +1,4 @@
云计算
韩玉鉴赏
蓝翔 nz
区块链 10 nz
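
The four entries above cover the three line shapes a user dictionary accepts: a word alone, a word with a tag, and a word with a frequency and a tag. The following is a minimal sketch of how such a line could be interpreted. `UserEntry` and `ParseUserDictLine` are hypothetical names that are not part of cppjieba, and the sketch splits on any whitespace, which is an assumption rather than the library's exact splitting rule.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for one parsed user-dictionary line (not a cppjieba type).
struct UserEntry {
  std::string word;
  int freq;         // 0 means "no frequency given"
  std::string tag;  // empty means "no tag given"
};

// Splits a line on whitespace and maps 1, 2 or 3 columns to a UserEntry.
// Returns false for empty lines and for lines with more than three columns.
inline bool ParseUserDictLine(const std::string& line, UserEntry& out) {
  std::istringstream iss(line);
  std::vector<std::string> cols;
  std::string col;
  while (iss >> col) {
    cols.push_back(col);
  }
  if (cols.empty() || cols.size() > 3) {
    return false;
  }
  out.word = cols[0];
  out.freq = 0;
  out.tag.clear();
  if (cols.size() == 2) {
    out.tag = cols[1];  // "word tag"
  } else if (cols.size() == 3) {
    out.freq = std::atoi(cols[1].c_str());  // "word freq tag"
    out.tag = cols[2];
  }
  return true;
}
```

With this sketch, a line such as `区块链 10 nz` would yield the word, a frequency of 10 and the tag `nz`, while a bare word keeps the frequency at 0 so that a caller can fall back to a default weight.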


@ -0,0 +1,280 @@
#ifndef CPPJIEBA_DICT_TRIE_HPP
#define CPPJIEBA_DICT_TRIE_HPP
#include <algorithm>
#include <cassert>
#include <vector>
#include <fstream>
#include <cstring>
#include <cstdlib>
#include <cmath>
#include <deque>
#include <set>
#include <string>
#include <unordered_set>
#include "limonp/StringUtil.hpp"
#include "limonp/Logging.hpp"
#include "Unicode.hpp"
#include "Trie.hpp"
namespace cppjieba {
const double MIN_DOUBLE = -3.14e+100;
const double MAX_DOUBLE = 3.14e+100;
const size_t DICT_COLUMN_NUM = 3;
const char* const UNKNOWN_TAG = "";
class DictTrie {
public:
enum UserWordWeightOption {
WordWeightMin,
WordWeightMedian,
WordWeightMax,
}; // enum UserWordWeightOption
DictTrie(const std::string& dict_path, const std::string& user_dict_paths = "", UserWordWeightOption user_word_weight_opt = WordWeightMedian) {
Init(dict_path, user_dict_paths, user_word_weight_opt);
}
~DictTrie() {
delete trie_;
}
bool InsertUserWord(const std::string& word, const std::string& tag = UNKNOWN_TAG) {
DictUnit node_info;
if (!MakeNodeInfo(node_info, word, user_word_default_weight_, tag)) {
return false;
}
active_node_infos_.push_back(node_info);
trie_->InsertNode(node_info.word, &active_node_infos_.back());
return true;
}
bool InsertUserWord(const std::string& word,int freq, const std::string& tag = UNKNOWN_TAG) {
DictUnit node_info;
double weight = freq ? log(1.0 * freq / freq_sum_) : user_word_default_weight_ ;
if (!MakeNodeInfo(node_info, word, weight , tag)) {
return false;
}
active_node_infos_.push_back(node_info);
trie_->InsertNode(node_info.word, &active_node_infos_.back());
return true;
}
bool DeleteUserWord(const std::string& word, const std::string& tag = UNKNOWN_TAG) {
DictUnit node_info;
if (!MakeNodeInfo(node_info, word, user_word_default_weight_, tag)) {
return false;
}
trie_->DeleteNode(node_info.word, &node_info);
return true;
}
const DictUnit* Find(RuneStrArray::const_iterator begin, RuneStrArray::const_iterator end) const {
return trie_->Find(begin, end);
}
void Find(RuneStrArray::const_iterator begin,
RuneStrArray::const_iterator end,
std::vector<struct Dag>&res,
size_t max_word_len = MAX_WORD_LENGTH) const {
trie_->Find(begin, end, res, max_word_len);
}
bool Find(const std::string& word) const {
RuneStrArray runes;
if (!DecodeUTF8RunesInString(word, runes)) {
// a word that cannot be decoded cannot be in the dictionary
XLOG(ERROR) << "UTF-8 decode failed for word: " << word;
return false;
}
return Find(runes.begin(), runes.end()) != NULL;
}
bool IsUserDictSingleChineseWord(const Rune& word) const {
return IsIn(user_dict_single_chinese_word_, word);
}
double GetMinWeight() const {
return min_weight_;
}
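void InserUserDictNode(const std::string& line) {
std::vector<std::string> buf;
DictUnit node_info;
bool ok = false;
limonp::Split(line, buf, " ");
if (buf.size() == 1) {
// word only
ok = MakeNodeInfo(node_info, buf[0], user_word_default_weight_, UNKNOWN_TAG);
} else if (buf.size() == 2) {
// word + tag
ok = MakeNodeInfo(node_info, buf[0], user_word_default_weight_, buf[1]);
} else if (buf.size() == 3) {
// word + frequency + tag
int freq = atoi(buf[1].c_str());
assert(freq_sum_ > 0.0);
// a non-positive frequency would make log() undefined, so use the default weight
double weight = freq > 0 ? log(1.0 * freq / freq_sum_) : user_word_default_weight_;
ok = MakeNodeInfo(node_info, buf[0], weight, buf[2]);
}
if (!ok) {
// skip malformed lines instead of storing an empty entry
XLOG(ERROR) << "illegal user dict line skipped: " << line;
return;
}
static_node_infos_.push_back(node_info);
if (node_info.word.size() == 1) {
user_dict_single_chinese_word_.insert(node_info.word[0]);
}
}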
void LoadUserDict(const std::vector<std::string>& buf) {
for (size_t i = 0; i < buf.size(); i++) {
InserUserDictNode(buf[i]);
}
}
void LoadUserDict(const std::set<std::string>& buf) {
std::set<std::string>::const_iterator iter;
for (iter = buf.begin(); iter != buf.end(); iter++){
InserUserDictNode(*iter);
}
}
void LoadUserDict(const std::string& filePaths) {
std::vector<std::string> files = limonp::Split(filePaths, "|;");
for (size_t i = 0; i < files.size(); i++) {
std::ifstream ifs(files[i].c_str());
XCHECK(ifs.is_open()) << "open " << files[i] << " failed";
std::string line;
while(getline(ifs, line)) {
if (line.size() == 0) {
continue;
}
InserUserDictNode(line);
}
}
}
private:
void Init(const std::string& dict_path, const std::string& user_dict_paths, UserWordWeightOption user_word_weight_opt) {
LoadDict(dict_path);
freq_sum_ = CalcFreqSum(static_node_infos_);
CalculateWeight(static_node_infos_, freq_sum_);
SetStaticWordWeights(user_word_weight_opt);
if (user_dict_paths.size()) {
LoadUserDict(user_dict_paths);
}
Shrink(static_node_infos_);
CreateTrie(static_node_infos_);
}
void CreateTrie(const std::vector<DictUnit>& dictUnits) {
assert(dictUnits.size());
std::vector<Unicode> words;
std::vector<const DictUnit*> valuePointers;
for (size_t i = 0 ; i < dictUnits.size(); i ++) {
words.push_back(dictUnits[i].word);
valuePointers.push_back(&dictUnits[i]);
}
trie_ = new Trie(words, valuePointers);
}
bool MakeNodeInfo(DictUnit& node_info,
const std::string& word,
double weight,
const std::string& tag) {
if (!DecodeUTF8RunesInString(word, node_info.word)) {
XLOG(ERROR) << "UTF-8 decode failed for dict word: " << word;
return false;
}
node_info.weight = weight;
node_info.tag = tag;
return true;
}
void LoadDict(const std::string& filePath) {
std::ifstream ifs(filePath.c_str());
XCHECK(ifs.is_open()) << "open " << filePath << " failed.";
std::string line;
std::vector<std::string> buf;
DictUnit node_info;
while (getline(ifs, line)) {
limonp::Split(line, buf, " ");
XCHECK(buf.size() == DICT_COLUMN_NUM) << "split result illegal, line:" << line;
MakeNodeInfo(node_info,
buf[0],
atof(buf[1].c_str()),
buf[2]);
static_node_infos_.push_back(node_info);
}
}
static bool WeightCompare(const DictUnit& lhs, const DictUnit& rhs) {
return lhs.weight < rhs.weight;
}
void SetStaticWordWeights(UserWordWeightOption option) {
XCHECK(!static_node_infos_.empty());
std::vector<DictUnit> x = static_node_infos_;
std::sort(x.begin(), x.end(), WeightCompare);
min_weight_ = x[0].weight;
max_weight_ = x[x.size() - 1].weight;
median_weight_ = x[x.size() / 2].weight;
switch (option) {
case WordWeightMin:
user_word_default_weight_ = min_weight_;
break;
case WordWeightMedian:
user_word_default_weight_ = median_weight_;
break;
default:
user_word_default_weight_ = max_weight_;
break;
}
}
double CalcFreqSum(const std::vector<DictUnit>& node_infos) const {
double sum = 0.0;
for (size_t i = 0; i < node_infos.size(); i++) {
sum += node_infos[i].weight;
}
return sum;
}
void CalculateWeight(std::vector<DictUnit>& node_infos, double sum) const {
assert(sum > 0.0);
for (size_t i = 0; i < node_infos.size(); i++) {
DictUnit& node_info = node_infos[i];
assert(node_info.weight > 0.0);
node_info.weight = log(double(node_info.weight)/sum);
}
}
void Shrink(std::vector<DictUnit>& units) const {
std::vector<DictUnit>(units.begin(), units.end()).swap(units);
}
std::vector<DictUnit> static_node_infos_;
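// Must not be std::vector: the trie keeps pointers to these elements, and
// std::deque::push_back does not invalidate pointers to existing elements.
std::deque<DictUnit> active_node_infos_;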
Trie * trie_;
double freq_sum_;
double min_weight_;
double max_weight_;
double median_weight_;
double user_word_default_weight_;
std::unordered_set<Rune> user_dict_single_chinese_word_;
};
}
#endif
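
`DictTrie` turns each raw frequency into a log-probability, `log(freq / sum)`, and then derives the default user-word weight from the minimum, median or maximum of those values. The arithmetic can be sketched in isolation as follows. `ToLogWeights` and `MedianWeight` are hypothetical helper names introduced for illustration only, and the median here is the element at index `size / 2` of the sorted weights, matching `SetStaticWordWeights` above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Converts raw frequencies into log-probabilities: log(freq / sum of all freqs).
inline std::vector<double> ToLogWeights(const std::vector<double>& freqs) {
  double sum = 0.0;
  for (std::size_t i = 0; i < freqs.size(); i++) {
    sum += freqs[i];
  }
  assert(sum > 0.0);
  std::vector<double> weights;
  weights.reserve(freqs.size());
  for (std::size_t i = 0; i < freqs.size(); i++) {
    weights.push_back(std::log(freqs[i] / sum));
  }
  return weights;
}

// Returns the element at index size/2 of the sorted weights.
// The vector is taken by value so the caller's order is left untouched.
inline double MedianWeight(std::vector<double> weights) {
  assert(!weights.empty());
  std::sort(weights.begin(), weights.end());
  return weights[weights.size() / 2];
}
```

Because every weight is the logarithm of a probability below 1, all values are negative, and a larger (less negative) weight means a more frequent word.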


@ -0,0 +1,93 @@
#ifndef CPPJIEBA_FULLSEGMENT_H
#define CPPJIEBA_FULLSEGMENT_H
#include <algorithm>
#include <set>
#include <cassert>
#include "limonp/Logging.hpp"
#include "DictTrie.hpp"
#include "SegmentBase.hpp"
#include "Unicode.hpp"
namespace cppjieba {
class FullSegment: public SegmentBase {
public:
FullSegment(const string& dictPath) {
dictTrie_ = new DictTrie(dictPath);
isNeedDestroy_ = true;
}
FullSegment(const DictTrie* dictTrie)
: dictTrie_(dictTrie), isNeedDestroy_(false) {
assert(dictTrie_);
}
~FullSegment() {
if (isNeedDestroy_) {
delete dictTrie_;
}
}
void Cut(const string& sentence,
vector<string>& words) const {
vector<Word> tmp;
Cut(sentence, tmp);
GetStringsFromWords(tmp, words);
}
void Cut(const string& sentence,
vector<Word>& words) const {
PreFilter pre_filter(symbols_, sentence);
PreFilter::Range range;
vector<WordRange> wrs;
wrs.reserve(sentence.size()/2);
while (pre_filter.HasNext()) {
range = pre_filter.Next();
Cut(range.begin, range.end, wrs);
}
words.clear();
words.reserve(wrs.size());
GetWordsFromWordRanges(sentence, wrs, words);
}
void Cut(RuneStrArray::const_iterator begin,
RuneStrArray::const_iterator end,
vector<WordRange>& res) const {
// result of searching in trie tree
LocalVector<pair<size_t, const DictUnit*> > tRes;
// max index of res's words
size_t maxIdx = 0;
// always equals to (uItr - begin)
size_t uIdx = 0;
// tmp variables
size_t wordLen = 0;
assert(dictTrie_);
vector<struct Dag> dags;
dictTrie_->Find(begin, end, dags);
for (size_t i = 0; i < dags.size(); i++) {
for (size_t j = 0; j < dags[i].nexts.size(); j++) {
size_t nextoffset = dags[i].nexts[j].first;
assert(nextoffset < dags.size());
const DictUnit* du = dags[i].nexts[j].second;
if (du == NULL) {
if (dags[i].nexts.size() == 1 && maxIdx <= uIdx) {
WordRange wr(begin + i, begin + nextoffset);
res.push_back(wr);
}
} else {
wordLen = du->word.size();
if (wordLen >= 2 || (dags[i].nexts.size() == 1 && maxIdx <= uIdx)) {
WordRange wr(begin + i, begin + nextoffset);
res.push_back(wr);
}
}
maxIdx = uIdx + wordLen > maxIdx ? uIdx + wordLen : maxIdx;
}
uIdx++;
}
}
private:
const DictTrie* dictTrie_;
bool isNeedDestroy_;
};
}
#endif
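
Full mode emits every dictionary word it finds at every start position, and falls back to a single character only where no emitted word covers that position. The sketch below shows that idea on single-byte text with a plain `std::set` as the dictionary. It is a simplification under stated assumptions: `FullCut` is a hypothetical name, the real class works on decoded runes through a trie, and the exact bookkeeping of `maxIdx` in the header above is not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Simplified full-mode cut over single-byte characters.
// Emits every dictionary word of length >= 2 starting at each position, and a
// single character only when no word starts there and no earlier word covers it.
inline std::vector<std::string> FullCut(const std::string& text,
                                        const std::set<std::string>& dict) {
  std::vector<std::string> out;
  std::size_t covered_end = 0;  // one past the last position covered by an emitted word
  for (std::size_t i = 0; i < text.size(); i++) {
    bool found = false;
    for (std::size_t len = 2; i + len <= text.size(); len++) {
      const std::string cand = text.substr(i, len);
      if (dict.count(cand) != 0) {
        out.push_back(cand);
        found = true;
        if (i + len > covered_end) {
          covered_end = i + len;
        }
      }
    }
    if (!found && covered_end <= i) {
      out.push_back(text.substr(i, 1));
    }
  }
  return out;
}
```

Note that the output overlaps by design: with `ab`, `abc` and `bc` in the dictionary, all three are reported for the input `abcd`, which is what makes this mode useful for recall-oriented indexing.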


@ -0,0 +1,129 @@
#ifndef CPPJIEBA_HMMMODEL_H
#define CPPJIEBA_HMMMODEL_H
#include <cstdlib>
#include <cstring>
#include <fstream>
#include "limonp/StringUtil.hpp"
#include "Trie.hpp"
namespace cppjieba {
using namespace limonp;
typedef unordered_map<Rune, double> EmitProbMap;
struct HMMModel {
/*
* STATUS:
* 0: HMMModel::B, 1: HMMModel::E, 2: HMMModel::M, 3:HMMModel::S
* */
enum {B = 0, E = 1, M = 2, S = 3, STATUS_SUM = 4};
HMMModel(const string& modelPath) {
memset(startProb, 0, sizeof(startProb));
memset(transProb, 0, sizeof(transProb));
statMap[0] = 'B';
statMap[1] = 'E';
statMap[2] = 'M';
statMap[3] = 'S';
emitProbVec.push_back(&emitProbB);
emitProbVec.push_back(&emitProbE);
emitProbVec.push_back(&emitProbM);
emitProbVec.push_back(&emitProbS);
LoadModel(modelPath);
}
~HMMModel() {
}
void LoadModel(const string& filePath) {
ifstream ifile(filePath.c_str());
XCHECK(ifile.is_open()) << "open " << filePath << " failed";
string line;
vector<string> tmp;
vector<string> tmp2;
//Load startProb
XCHECK(GetLine(ifile, line));
Split(line, tmp, " ");
XCHECK(tmp.size() == STATUS_SUM);
for (size_t j = 0; j< tmp.size(); j++) {
startProb[j] = atof(tmp[j].c_str());
}
//Load transProb
for (size_t i = 0; i < STATUS_SUM; i++) {
XCHECK(GetLine(ifile, line));
Split(line, tmp, " ");
XCHECK(tmp.size() == STATUS_SUM);
for (size_t j =0; j < STATUS_SUM; j++) {
transProb[i][j] = atof(tmp[j].c_str());
}
}
//Load emitProbB
XCHECK(GetLine(ifile, line));
XCHECK(LoadEmitProb(line, emitProbB));
//Load emitProbE
XCHECK(GetLine(ifile, line));
XCHECK(LoadEmitProb(line, emitProbE));
//Load emitProbM
XCHECK(GetLine(ifile, line));
XCHECK(LoadEmitProb(line, emitProbM));
//Load emitProbS
XCHECK(GetLine(ifile, line));
XCHECK(LoadEmitProb(line, emitProbS));
}
double GetEmitProb(const EmitProbMap* ptMp, Rune key,
double defVal)const {
EmitProbMap::const_iterator cit = ptMp->find(key);
if (cit == ptMp->end()) {
return defVal;
}
return cit->second;
}
bool GetLine(ifstream& ifile, string& line) {
while (getline(ifile, line)) {
Trim(line);
if (line.empty()) {
continue;
}
if (StartsWith(line, "#")) {
continue;
}
return true;
}
return false;
}
bool LoadEmitProb(const string& line, EmitProbMap& mp) {
if (line.empty()) {
return false;
}
vector<string> tmp, tmp2;
Unicode unicode;
Split(line, tmp, ",");
for (size_t i = 0; i < tmp.size(); i++) {
Split(tmp[i], tmp2, ":");
if (2 != tmp2.size()) {
XLOG(ERROR) << "emitProb illegal.";
return false;
}
if (!DecodeUTF8RunesInString(tmp2[0], unicode) || unicode.size() != 1) {
XLOG(ERROR) << "TransCode failed.";
return false;
}
mp[unicode[0]] = atof(tmp2[1].c_str());
}
return true;
}
char statMap[STATUS_SUM];
double startProb[STATUS_SUM];
double transProb[STATUS_SUM][STATUS_SUM];
EmitProbMap emitProbB;
EmitProbMap emitProbE;
EmitProbMap emitProbM;
EmitProbMap emitProbS;
vector<EmitProbMap* > emitProbVec;
}; // struct HMMModel
} // namespace cppjieba
#endif
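
`LoadEmitProb` reads one line per state, where each line is a comma-separated list of `character:probability` pairs. A minimal sketch of that parsing step is shown below. `ParseEmitLine` is a hypothetical name, and as a simplifying assumption the key is kept as a raw UTF-8 string instead of being decoded to a single rune as the header does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>

// Parses "key:value,key:value,..." into probs.
// Returns false on an empty line or on an item that is not exactly "key:value".
inline bool ParseEmitLine(const std::string& line,
                          std::map<std::string, double>& probs) {
  if (line.empty()) {
    return false;
  }
  std::istringstream iss(line);
  std::string item;
  while (std::getline(iss, item, ',')) {
    const std::size_t colon = item.find(':');
    const bool malformed = colon == std::string::npos ||
                           colon == 0 ||
                           colon + 1 == item.size() ||
                           item.find(':', colon + 1) != std::string::npos;
    if (malformed) {
      return false;
    }
    probs[item.substr(0, colon)] = std::atof(item.substr(colon + 1).c_str());
  }
  return true;
}
```

The values in the model file are log-probabilities, which is why the segmenter later adds them instead of multiplying.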


@ -0,0 +1,190 @@
#ifndef CPPJIEBA_HMMSEGMENT_H
#define CPPJIEBA_HMMSEGMENT_H
#include <iostream>
#include <fstream>
#include <cstring>
#include <cassert>
#include "HMMModel.hpp"
#include "SegmentBase.hpp"
namespace cppjieba {
class HMMSegment: public SegmentBase {
public:
HMMSegment(const string& filePath)
: model_(new HMMModel(filePath)), isNeedDestroy_(true) {
}
HMMSegment(const HMMModel* model)
: model_(model), isNeedDestroy_(false) {
}
~HMMSegment() {
if (isNeedDestroy_) {
delete model_;
}
}
void Cut(const string& sentence,
vector<string>& words) const {
vector<Word> tmp;
Cut(sentence, tmp);
GetStringsFromWords(tmp, words);
}
void Cut(const string& sentence,
vector<Word>& words) const {
PreFilter pre_filter(symbols_, sentence);
PreFilter::Range range;
vector<WordRange> wrs;
wrs.reserve(sentence.size()/2);
while (pre_filter.HasNext()) {
range = pre_filter.Next();
Cut(range.begin, range.end, wrs);
}
words.clear();
words.reserve(wrs.size());
GetWordsFromWordRanges(sentence, wrs, words);
}
void Cut(RuneStrArray::const_iterator begin, RuneStrArray::const_iterator end, vector<WordRange>& res) const {
RuneStrArray::const_iterator left = begin;
RuneStrArray::const_iterator right = begin;
while (right != end) {
if (right->rune < 0x80) {
if (left != right) {
InternalCut(left, right, res);
}
left = right;
do {
right = SequentialLetterRule(left, end);
if (right != left) {
break;
}
right = NumbersRule(left, end);
if (right != left) {
break;
}
right ++;
} while (false);
WordRange wr(left, right - 1);
res.push_back(wr);
left = right;
} else {
right++;
}
}
if (left != right) {
InternalCut(left, right, res);
}
}
private:
// sequential letters rule
RuneStrArray::const_iterator SequentialLetterRule(RuneStrArray::const_iterator begin, RuneStrArray::const_iterator end) const {
Rune x = begin->rune;
if (('a' <= x && x <= 'z') || ('A' <= x && x <= 'Z')) {
begin ++;
} else {
return begin;
}
while (begin != end) {
x = begin->rune;
if (('a' <= x && x <= 'z') || ('A' <= x && x <= 'Z') || ('0' <= x && x <= '9')) {
begin ++;
} else {
break;
}
}
return begin;
}
// numbers rule: a digit followed by any run of digits or '.'
RuneStrArray::const_iterator NumbersRule(RuneStrArray::const_iterator begin, RuneStrArray::const_iterator end) const {
Rune x = begin->rune;
if ('0' <= x && x <= '9') {
begin ++;
} else {
return begin;
}
while (begin != end) {
x = begin->rune;
if ( ('0' <= x && x <= '9') || x == '.') {
begin++;
} else {
break;
}
}
return begin;
}
void InternalCut(RuneStrArray::const_iterator begin, RuneStrArray::const_iterator end, vector<WordRange>& res) const {
vector<size_t> status;
Viterbi(begin, end, status);
RuneStrArray::const_iterator left = begin;
RuneStrArray::const_iterator right;
for (size_t i = 0; i < status.size(); i++) {
if (status[i] % 2) { //if (HMMModel::E == status[i] || HMMModel::S == status[i])
right = begin + i + 1;
WordRange wr(left, right - 1);
res.push_back(wr);
left = right;
}
}
}
void Viterbi(RuneStrArray::const_iterator begin,
RuneStrArray::const_iterator end,
vector<size_t>& status) const {
size_t Y = HMMModel::STATUS_SUM;
size_t X = end - begin;
size_t XYSize = X * Y;
size_t now, old, stat;
double tmp, endE, endS;
vector<int> path(XYSize);
vector<double> weight(XYSize);
//start
for (size_t y = 0; y < Y; y++) {
weight[0 + y * X] = model_->startProb[y] + model_->GetEmitProb(model_->emitProbVec[y], begin->rune, MIN_DOUBLE);
path[0 + y * X] = -1;
}
double emitProb;
for (size_t x = 1; x < X; x++) {
for (size_t y = 0; y < Y; y++) {
now = x + y*X;
weight[now] = MIN_DOUBLE;
path[now] = HMMModel::E; // warning
emitProb = model_->GetEmitProb(model_->emitProbVec[y], (begin+x)->rune, MIN_DOUBLE);
for (size_t preY = 0; preY < Y; preY++) {
old = x - 1 + preY * X;
tmp = weight[old] + model_->transProb[preY][y] + emitProb;
if (tmp > weight[now]) {
weight[now] = tmp;
path[now] = preY;
}
}
}
}
endE = weight[X-1+HMMModel::E*X];
endS = weight[X-1+HMMModel::S*X];
stat = 0;
if (endE >= endS) {
stat = HMMModel::E;
} else {
stat = HMMModel::S;
}
status.resize(X);
for (int x = X -1 ; x >= 0; x--) {
status[x] = stat;
stat = path[x + stat*X];
}
}
const HMMModel* model_;
bool isNeedDestroy_;
}; // class HMMSegment
} // namespace cppjieba
#endif
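
`HMMSegment` labels each character with one of four states (B, E, M, S) using Viterbi decoding over log-probabilities, and then closes a word after every E or S. The header tests this with `status[i] % 2`, which works because E and S are the odd values 1 and 3. The sketch below restates both steps on plain vectors. `Viterbi` and `SplitByStatus` here are illustrative free functions with assumed signatures, not the class members above, and the text is treated as single-byte characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum Status { B = 0, E = 1, M = 2, S = 3, STATUS_SUM = 4 };

// start[y], trans[prev][y] and emit[t][y] are log-scores.
// Returns the best state sequence, constrained to end in E or S.
inline std::vector<int> Viterbi(const std::vector<double>& start,
                                const std::vector<std::vector<double> >& trans,
                                const std::vector<std::vector<double> >& emit) {
  const std::size_t n = emit.size();
  std::vector<int> status(n);
  if (n == 0) {
    return status;
  }
  std::vector<std::vector<double> > weight(n, std::vector<double>(STATUS_SUM));
  std::vector<std::vector<int> > path(n, std::vector<int>(STATUS_SUM, -1));
  for (int y = 0; y < STATUS_SUM; y++) {
    weight[0][y] = start[y] + emit[0][y];
  }
  for (std::size_t t = 1; t < n; t++) {
    for (int y = 0; y < STATUS_SUM; y++) {
      int best = 0;
      double best_score = weight[t - 1][0] + trans[0][y];
      for (int prev = 1; prev < STATUS_SUM; prev++) {
        const double score = weight[t - 1][prev] + trans[prev][y];
        if (score > best_score) {
          best_score = score;
          best = prev;
        }
      }
      weight[t][y] = best_score + emit[t][y];
      path[t][y] = best;
    }
  }
  // A sentence can only end on a word end (E) or a single-character word (S).
  int stat = weight[n - 1][E] >= weight[n - 1][S] ? E : S;
  for (std::size_t t = n; t-- > 0;) {
    status[t] = stat;
    if (t > 0) {
      stat = path[t][stat];
    }
  }
  return status;
}

// Closes a word after every E or S state.
inline std::vector<std::string> SplitByStatus(const std::string& text,
                                              const std::vector<int>& status) {
  std::vector<std::string> words;
  std::size_t left = 0;
  for (std::size_t i = 0; i < status.size() && i < text.size(); i++) {
    if (status[i] == E || status[i] == S) {
      words.push_back(text.substr(left, i + 1 - left));
      left = i + 1;
    }
  }
  return words;
}
```

The sketch stores the tables as nested vectors for readability, whereas the header flattens them into one array indexed by `x + y * X`. That difference is a layout choice and does not change the recurrence.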

169
include/cppjieba/Jieba.hpp Normal file

@ -0,0 +1,169 @@
#ifndef CPPJIEAB_JIEBA_H
#define CPPJIEAB_JIEBA_H
#include "QuerySegment.hpp"
#include "KeywordExtractor.hpp"
namespace cppjieba {
class Jieba {
public:
Jieba(const string& dict_path = "",
const string& model_path = "",
const string& user_dict_path = "",
const string& idf_path = "",
const string& stop_word_path = "")
: dict_trie_(getPath(dict_path, "jieba.dict.utf8"), getPath(user_dict_path, "user.dict.utf8")),
model_(getPath(model_path, "hmm_model.utf8")),
mp_seg_(&dict_trie_),
hmm_seg_(&model_),
mix_seg_(&dict_trie_, &model_),
full_seg_(&dict_trie_),
query_seg_(&dict_trie_, &model_),
extractor(&dict_trie_, &model_,
getPath(idf_path, "idf.utf8"),
getPath(stop_word_path, "stop_words.utf8")) {
}
~Jieba() {
}
struct LocWord {
string word;
size_t begin;
size_t end;
}; // struct LocWord
void Cut(const string& sentence, vector<string>& words, bool hmm = true) const {
mix_seg_.Cut(sentence, words, hmm);
}
void Cut(const string& sentence, vector<Word>& words, bool hmm = true) const {
mix_seg_.Cut(sentence, words, hmm);
}
void CutAll(const string& sentence, vector<string>& words) const {
full_seg_.Cut(sentence, words);
}
void CutAll(const string& sentence, vector<Word>& words) const {
full_seg_.Cut(sentence, words);
}
void CutForSearch(const string& sentence, vector<string>& words, bool hmm = true) const {
query_seg_.Cut(sentence, words, hmm);
}
void CutForSearch(const string& sentence, vector<Word>& words, bool hmm = true) const {
query_seg_.Cut(sentence, words, hmm);
}
void CutHMM(const string& sentence, vector<string>& words) const {
hmm_seg_.Cut(sentence, words);
}
void CutHMM(const string& sentence, vector<Word>& words) const {
hmm_seg_.Cut(sentence, words);
}
void CutSmall(const string& sentence, vector<string>& words, size_t max_word_len) const {
mp_seg_.Cut(sentence, words, max_word_len);
}
void CutSmall(const string& sentence, vector<Word>& words, size_t max_word_len) const {
mp_seg_.Cut(sentence, words, max_word_len);
}
void Tag(const string& sentence, vector<pair<string, string> >& words) const {
mix_seg_.Tag(sentence, words);
}
string LookupTag(const string &str) const {
return mix_seg_.LookupTag(str);
}
bool InsertUserWord(const string& word, const string& tag = UNKNOWN_TAG) {
return dict_trie_.InsertUserWord(word, tag);
}
bool InsertUserWord(const string& word,int freq, const string& tag = UNKNOWN_TAG) {
return dict_trie_.InsertUserWord(word,freq, tag);
}
bool DeleteUserWord(const string& word, const string& tag = UNKNOWN_TAG) {
return dict_trie_.DeleteUserWord(word, tag);
}
bool Find(const string& word)
{
return dict_trie_.Find(word);
}
void ResetSeparators(const string& s) {
//TODO
mp_seg_.ResetSeparators(s);
hmm_seg_.ResetSeparators(s);
mix_seg_.ResetSeparators(s);
full_seg_.ResetSeparators(s);
query_seg_.ResetSeparators(s);
}
const DictTrie* GetDictTrie() const {
return &dict_trie_;
}
const HMMModel* GetHMMModel() const {
return &model_;
}
void LoadUserDict(const vector<string>& buf) {
dict_trie_.LoadUserDict(buf);
}
void LoadUserDict(const set<string>& buf) {
dict_trie_.LoadUserDict(buf);
}
void LoadUserDict(const string& path) {
dict_trie_.LoadUserDict(path);
}
private:
static string pathJoin(const string& dir, const string& filename) {
if (dir.empty()) {
return filename;
}
char last_char = dir[dir.length() - 1];
if (last_char == '/' || last_char == '\\') {
return dir + filename;
} else {
#ifdef _WIN32
return dir + '\\' + filename;
#else
return dir + '/' + filename;
#endif
}
}
static string getCurrentDirectory() {
string path(__FILE__);
size_t pos = path.find_last_of("/\\");
return (pos == string::npos) ? "" : path.substr(0, pos);
}
static string getPath(const string& path, const string& default_file) {
if (path.empty()) {
string current_dir = getCurrentDirectory();
string parent_dir = current_dir.substr(0, current_dir.find_last_of("/\\"));
string grandparent_dir = parent_dir.substr(0, parent_dir.find_last_of("/\\"));
return pathJoin(pathJoin(grandparent_dir, "dict"), default_file);
}
return path;
}
DictTrie dict_trie_;
HMMModel model_;
// They share the same dict trie and model
MPSegment mp_seg_;
HMMSegment hmm_seg_;
MixSegment mix_seg_;
FullSegment full_seg_;
QuerySegment query_seg_;
public:
KeywordExtractor extractor;
}; // class Jieba
} // namespace cppjieba
#endif // CPPJIEAB_JIEBA_H
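
When a path argument is empty, `Jieba` derives a default from the location of the header itself: it drops the file name, walks up two directories, and appends `dict/<file>`. The sketch below mirrors that resolution with three small helpers. `ParentDir`, `PathJoin` and `DefaultDictPath` are hypothetical names, the header path is passed in as a parameter instead of being read from `__FILE__`, and the sketch always joins with `/` where the original chooses `\` on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns everything before the last '/' or '\\', or "" if there is none.
inline std::string ParentDir(const std::string& path) {
  const std::size_t pos = path.find_last_of("/\\");
  return pos == std::string::npos ? std::string() : path.substr(0, pos);
}

// Joins dir and name with a single separator; an empty dir yields name unchanged.
inline std::string PathJoin(const std::string& dir, const std::string& name) {
  if (dir.empty()) {
    return name;
  }
  const char last = dir[dir.size() - 1];
  if (last == '/' || last == '\\') {
    return dir + name;
  }
  return dir + '/' + name;
}

// header_path plays the role of __FILE__ for include/cppjieba/Jieba.hpp:
// strip the file name, go up two directories, then descend into dict/.
inline std::string DefaultDictPath(const std::string& header_path,
                                   const std::string& file) {
  const std::string root = ParentDir(ParentDir(ParentDir(header_path)));
  return PathJoin(PathJoin(root, "dict"), file);
}
```

Since `__FILE__` is fixed when the translation unit is compiled, defaults resolved this way presumably only work while the source tree stays where it was at build time. Passing explicit paths to the constructor avoids that dependency.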


@ -0,0 +1,149 @@
#ifndef CPPJIEBA_KEYWORD_EXTRACTOR_H
#define CPPJIEBA_KEYWORD_EXTRACTOR_H
#include <algorithm>
#include <cassert>
#include <fstream>
#include <map>
#include <unordered_map>
#include <unordered_set>
#include "MixSegment.hpp"
namespace cppjieba {
/*utf8*/
class KeywordExtractor {
public:
struct Word {
std::string word;
std::vector<size_t> offsets;
double weight;
}; // struct Word
KeywordExtractor(const std::string& dictPath,
const std::string& hmmFilePath,
const std::string& idfPath,
const std::string& stopWordPath,
const std::string& userDict = "")
: segment_(dictPath, hmmFilePath, userDict) {
LoadIdfDict(idfPath);
LoadStopWordDict(stopWordPath);
}
KeywordExtractor(const DictTrie* dictTrie,
const HMMModel* model,
const std::string& idfPath,
const std::string& stopWordPath)
: segment_(dictTrie, model) {
LoadIdfDict(idfPath);
LoadStopWordDict(stopWordPath);
}
~KeywordExtractor() {
}
void Extract(const std::string& sentence, std::vector<std::string>& keywords, size_t topN) const {
std::vector<Word> topWords;
Extract(sentence, topWords, topN);
for (size_t i = 0; i < topWords.size(); i++) {
keywords.push_back(topWords[i].word);
}
}
void Extract(const std::string& sentence, std::vector<pair<std::string, double> >& keywords, size_t topN) const {
std::vector<Word> topWords;
Extract(sentence, topWords, topN);
for (size_t i = 0; i < topWords.size(); i++) {
keywords.push_back(pair<std::string, double>(topWords[i].word, topWords[i].weight));
}
}
void Extract(const std::string& sentence, std::vector<Word>& keywords, size_t topN) const {
std::vector<std::string> words;
segment_.Cut(sentence, words);
std::map<std::string, Word> wordmap;
size_t offset = 0;
for (size_t i = 0; i < words.size(); ++i) {
size_t t = offset;
offset += words[i].size();
if (IsSingleWord(words[i]) || stopWords_.find(words[i]) != stopWords_.end()) {
continue;
}
wordmap[words[i]].offsets.push_back(t);
wordmap[words[i]].weight += 1.0;
}
if (offset != sentence.size()) {
XLOG(ERROR) << "words illegal";
return;
}
keywords.clear();
keywords.reserve(wordmap.size());
for (std::map<std::string, Word>::iterator itr = wordmap.begin(); itr != wordmap.end(); ++itr) {
std::unordered_map<std::string, double>::const_iterator cit = idfMap_.find(itr->first);
if (cit != idfMap_.end()) {
itr->second.weight *= cit->second;
} else {
itr->second.weight *= idfAverage_;
}
itr->second.word = itr->first;
keywords.push_back(itr->second);
}
topN = min(topN, keywords.size());
std::partial_sort(keywords.begin(), keywords.begin() + topN, keywords.end(), Compare);
keywords.resize(topN);
}
private:
void LoadIdfDict(const std::string& idfPath) {
std::ifstream ifs(idfPath.c_str());
XCHECK(ifs.is_open()) << "open " << idfPath << " failed";
std::string line ;
std::vector<std::string> buf;
double idf = 0.0;
double idfSum = 0.0;
size_t lineno = 0;
for (; getline(ifs, line); lineno++) {
buf.clear();
if (line.empty()) {
XLOG(ERROR) << "lineno: " << lineno << " empty. skipped.";
continue;
}
limonp::Split(line, buf, " ");
if (buf.size() != 2) {
XLOG(ERROR) << "line: " << line << ", lineno: " << lineno << " has an illegal format. skipped.";
continue;
}
idf = atof(buf[1].c_str());
idfMap_[buf[0]] = idf;
idfSum += idf;
}
assert(lineno);
idfAverage_ = idfSum / lineno;
assert(idfAverage_ > 0.0);
}
void LoadStopWordDict(const std::string& filePath) {
std::ifstream ifs(filePath.c_str());
XCHECK(ifs.is_open()) << "open " << filePath << " failed";
std::string line ;
while (getline(ifs, line)) {
stopWords_.insert(line);
}
assert(stopWords_.size());
}
static bool Compare(const Word& lhs, const Word& rhs) {
return lhs.weight > rhs.weight;
}
MixSegment segment_;
std::unordered_map<std::string, double> idfMap_;
double idfAverage_;
std::unordered_set<std::string> stopWords_;
}; // class KeywordExtractor
inline std::ostream& operator << (std::ostream& os, const KeywordExtractor::Word& word) {
return os << "{\"word\": \"" << word.word << "\", \"offset\": " << word.offsets << ", \"weight\": " << word.weight << "}";
}
} // namespace cppjieba
#endif
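
`KeywordExtractor::Extract` scores each word as its term frequency multiplied by its IDF, falls back to the average IDF for unknown words, and keeps the top N with `std::partial_sort`. The following sketch isolates that scoring step. `TopKeywords`, `Scored` and `ByWeightDesc` are hypothetical names, the input is assumed to be already segmented, and stop-word and single-character filtering are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, double> Scored;

inline bool ByWeightDesc(const Scored& lhs, const Scored& rhs) {
  return lhs.second > rhs.second;
}

// Scores each distinct word as count * idf (or count * idf_average when the
// word has no IDF entry) and returns the top_n highest-scoring words.
inline std::vector<Scored> TopKeywords(const std::vector<std::string>& words,
                                       const std::map<std::string, double>& idf,
                                       double idf_average,
                                       std::size_t top_n) {
  std::map<std::string, double> tf;
  for (std::size_t i = 0; i < words.size(); i++) {
    tf[words[i]] += 1.0;
  }
  std::vector<Scored> scored;
  scored.reserve(tf.size());
  for (std::map<std::string, double>::const_iterator it = tf.begin(); it != tf.end(); ++it) {
    std::map<std::string, double>::const_iterator hit = idf.find(it->first);
    const double weight = it->second * (hit != idf.end() ? hit->second : idf_average);
    scored.push_back(Scored(it->first, weight));
  }
  top_n = std::min(top_n, scored.size());
  // Only the first top_n elements need to be in order.
  std::partial_sort(scored.begin(), scored.begin() + top_n, scored.end(), ByWeightDesc);
  scored.resize(top_n);
  return scored;
}
```

`std::partial_sort` is a reasonable fit here because only the best N entries have to be ordered, which avoids fully sorting a large vocabulary when N is small.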


@ -0,0 +1,137 @@
#ifndef CPPJIEBA_MPSEGMENT_H
#define CPPJIEBA_MPSEGMENT_H
#include <algorithm>
#include <set>
#include <cassert>
#include "limonp/Logging.hpp"
#include "DictTrie.hpp"
#include "SegmentTagged.hpp"
#include "PosTagger.hpp"
namespace cppjieba {
class MPSegment: public SegmentTagged {
public:
MPSegment(const string& dictPath, const string& userDictPath = "")
: dictTrie_(new DictTrie(dictPath, userDictPath)), isNeedDestroy_(true) {
}
MPSegment(const DictTrie* dictTrie)
: dictTrie_(dictTrie), isNeedDestroy_(false) {
assert(dictTrie_);
}
~MPSegment() {
if (isNeedDestroy_) {
delete dictTrie_;
}
}
void Cut(const string& sentence, vector<string>& words) const {
Cut(sentence, words, MAX_WORD_LENGTH);
}
void Cut(const string& sentence,
vector<string>& words,
size_t max_word_len) const {
vector<Word> tmp;
Cut(sentence, tmp, max_word_len);
GetStringsFromWords(tmp, words);
}
void Cut(const string& sentence,
vector<Word>& words,
size_t max_word_len = MAX_WORD_LENGTH) const {
PreFilter pre_filter(symbols_, sentence);
PreFilter::Range range;
vector<WordRange> wrs;
wrs.reserve(sentence.size()/2);
while (pre_filter.HasNext()) {
range = pre_filter.Next();
Cut(range.begin, range.end, wrs, max_word_len);
}
words.clear();
words.reserve(wrs.size());
GetWordsFromWordRanges(sentence, wrs, words);
}
void Cut(RuneStrArray::const_iterator begin,
RuneStrArray::const_iterator end,
vector<WordRange>& words,
size_t max_word_len = MAX_WORD_LENGTH) const {
vector<Dag> dags;
dictTrie_->Find(begin,
end,
dags,
max_word_len);
CalcDP(dags);
CutByDag(begin, end, dags, words);
}
const DictTrie* GetDictTrie() const {
return dictTrie_;
}
bool Tag(const string& src, vector<pair<string, string> >& res) const {
return tagger_.Tag(src, res, *this);
}
bool IsUserDictSingleChineseWord(const Rune& value) const {
return dictTrie_->IsUserDictSingleChineseWord(value);
}
private:
void CalcDP(vector<Dag>& dags) const {
size_t nextPos;
const DictUnit* p;
double val;
for (vector<Dag>::reverse_iterator rit = dags.rbegin(); rit != dags.rend(); rit++) {
rit->pInfo = NULL;
rit->weight = MIN_DOUBLE;
assert(!rit->nexts.empty());
for (LocalVector<pair<size_t, const DictUnit*> >::const_iterator it = rit->nexts.begin(); it != rit->nexts.end(); it++) {
nextPos = it->first;
p = it->second;
val = 0.0;
if (nextPos + 1 < dags.size()) {
val += dags[nextPos + 1].weight;
}
if (p) {
val += p->weight;
} else {
val += dictTrie_->GetMinWeight();
}
if (val > rit->weight) {
rit->pInfo = p;
rit->weight = val;
}
}
}
}
void CutByDag(RuneStrArray::const_iterator begin,
RuneStrArray::const_iterator end,
const vector<Dag>& dags,
vector<WordRange>& words) const {
size_t i = 0;
while (i < dags.size()) {
const DictUnit* p = dags[i].pInfo;
if (p) {
assert(p->word.size() >= 1);
WordRange wr(begin + i, begin + i + p->word.size() - 1);
words.push_back(wr);
i += p->word.size();
} else { //single chinese word
WordRange wr(begin + i, begin + i);
words.push_back(wr);
i++;
}
}
}
const DictTrie* dictTrie_;
bool isNeedDestroy_;
PosTagger tagger_;
}; // class MPSegment
} // namespace cppjieba
#endif
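
`MPSegment` picks the segmentation with the highest total log-probability by filling a table from the end of the sentence backwards (`CalcDP`) and then walking it forwards (`CutByDag`). The sketch below shows the same dynamic program on single-byte text with a `std::map` dictionary. `MaxProbCut` is a hypothetical name, and scoring an unknown single character with `min_weight` follows the `GetMinWeight()` fallback in the header.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// dict maps a word to its log-probability; min_weight scores unknown characters.
inline std::vector<std::string> MaxProbCut(const std::string& text,
                                           const std::map<std::string, double>& dict,
                                           double min_weight) {
  const std::size_t n = text.size();
  std::vector<double> best(n + 1, 0.0);  // best[i]: best score for text[i..n)
  std::vector<std::size_t> step(n, 1);   // step[i]: length of the word chosen at i
  for (std::size_t i = n; i-- > 0;) {
    // Fallback: treat text[i] as an unknown single character.
    best[i] = min_weight + best[i + 1];
    step[i] = 1;
    for (std::size_t len = 1; i + len <= n; len++) {
      std::map<std::string, double>::const_iterator it = dict.find(text.substr(i, len));
      if (it == dict.end()) {
        continue;
      }
      const double score = it->second + best[i + len];
      if (score > best[i]) {
        best[i] = score;
        step[i] = len;
      }
    }
  }
  std::vector<std::string> words;
  for (std::size_t i = 0; i < n; i += step[i]) {
    words.push_back(text.substr(i, step[i]));
  }
  return words;
}
```

Working right to left means every position already knows the best score of its suffix, so each candidate word costs one lookup plus one addition. The header gets its candidates from the trie's DAG instead of probing every substring as this sketch does.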


@ -0,0 +1,109 @@
#ifndef CPPJIEBA_MIXSEGMENT_H
#define CPPJIEBA_MIXSEGMENT_H
#include <cassert>
#include "MPSegment.hpp"
#include "HMMSegment.hpp"
#include "limonp/StringUtil.hpp"
#include "PosTagger.hpp"
namespace cppjieba {
class MixSegment: public SegmentTagged {
public:
MixSegment(const string& mpSegDict, const string& hmmSegDict,
const string& userDict = "")
: mpSeg_(mpSegDict, userDict),
hmmSeg_(hmmSegDict) {
}
MixSegment(const DictTrie* dictTrie, const HMMModel* model)
: mpSeg_(dictTrie), hmmSeg_(model) {
}
~MixSegment() {
}
void Cut(const string& sentence, vector<string>& words) const {
Cut(sentence, words, true);
}
void Cut(const string& sentence, vector<string>& words, bool hmm) const {
vector<Word> tmp;
Cut(sentence, tmp, hmm);
GetStringsFromWords(tmp, words);
}
void Cut(const string& sentence, vector<Word>& words, bool hmm = true) const {
PreFilter pre_filter(symbols_, sentence);
PreFilter::Range range;
vector<WordRange> wrs;
wrs.reserve(sentence.size() / 2);
while (pre_filter.HasNext()) {
range = pre_filter.Next();
Cut(range.begin, range.end, wrs, hmm);
}
words.clear();
words.reserve(wrs.size());
GetWordsFromWordRanges(sentence, wrs, words);
}
void Cut(RuneStrArray::const_iterator begin, RuneStrArray::const_iterator end, vector<WordRange>& res, bool hmm) const {
if (!hmm) {
mpSeg_.Cut(begin, end, res);
return;
}
vector<WordRange> words;
assert(end >= begin);
words.reserve(end - begin);
mpSeg_.Cut(begin, end, words);
vector<WordRange> hmmRes;
hmmRes.reserve(end - begin);
for (size_t i = 0; i < words.size(); i++) {
// if MP produced a multi-character word, or a single character from the user dict, keep it as is
if (words[i].left != words[i].right || (words[i].left == words[i].right && mpSeg_.IsUserDictSingleChineseWord(words[i].left->rune))) {
res.push_back(words[i]);
continue;
}
// otherwise collect the run of consecutive single characters that are not in the user dict
size_t j = i;
while (j < words.size() && words[j].left == words[j].right && !mpSeg_.IsUserDictSingleChineseWord(words[j].left->rune)) {
j++;
}
// Cut the sequence with hmm
assert(j - 1 >= i);
// TODO
hmmSeg_.Cut(words[i].left, words[j - 1].left + 1, hmmRes);
//put hmm result to result
for (size_t k = 0; k < hmmRes.size(); k++) {
res.push_back(hmmRes[k]);
}
//clear tmp vars
hmmRes.clear();
//let i jump over this piece
i = j - 1;
}
}
const DictTrie* GetDictTrie() const {
return mpSeg_.GetDictTrie();
}
bool Tag(const string& src, vector<pair<string, string> >& res) const {
return tagger_.Tag(src, res, *this);
}
string LookupTag(const string &str) const {
return tagger_.LookupTag(str, *this);
}
private:
MPSegment mpSeg_;
HMMSegment hmmSeg_;
PosTagger tagger_;
}; // class MixSegment
} // namespace cppjieba
#endif
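
`MixSegment` first cuts with the dictionary-based MP segmenter, then gathers each run of consecutive single-character results and hands the whole run to the HMM segmenter for a second attempt. The grouping logic can be sketched as follows. `MixCut`, `Recut` and `KeepWhole` are hypothetical names, characters are assumed to be single bytes, and the exception for single characters listed in the user dictionary is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Signature of the second-pass segmenter applied to a run of single characters.
typedef std::vector<std::string> (*Recut)(const std::string&);

// Stand-in for the HMM pass: returns the whole run as one word.
inline std::vector<std::string> KeepWhole(const std::string& run) {
  return std::vector<std::string>(1, run);
}

// Keeps multi-character words as they are and re-cuts every maximal run of
// consecutive single-character words with recut.
inline std::vector<std::string> MixCut(const std::vector<std::string>& mp_words,
                                       Recut recut) {
  std::vector<std::string> out;
  std::size_t i = 0;
  while (i < mp_words.size()) {
    if (mp_words[i].size() != 1) {
      out.push_back(mp_words[i]);
      i++;
      continue;
    }
    std::string run;
    while (i < mp_words.size() && mp_words[i].size() == 1) {
      run += mp_words[i];
      i++;
    }
    const std::vector<std::string> pieces = recut(run);
    out.insert(out.end(), pieces.begin(), pieces.end());
  }
  return out;
}
```

The point of the two-pass design is that single characters left over by the dictionary are likely to be out-of-vocabulary words, which is where a statistical model can still recover a boundary.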


@ -0,0 +1,77 @@
#ifndef CPPJIEBA_POS_TAGGING_H
#define CPPJIEBA_POS_TAGGING_H
#include "limonp/StringUtil.hpp"
#include "SegmentTagged.hpp"
#include "DictTrie.hpp"
namespace cppjieba {
using namespace limonp;
static const char* const POS_M = "m";
static const char* const POS_ENG = "eng";
static const char* const POS_X = "x";
class PosTagger {
public:
PosTagger() {
}
~PosTagger() {
}
bool Tag(const string& src, vector<pair<string, string> >& res, const SegmentTagged& segment) const {
vector<string> CutRes;
segment.Cut(src, CutRes);
for (vector<string>::iterator itr = CutRes.begin(); itr != CutRes.end(); ++itr) {
res.push_back(make_pair(*itr, LookupTag(*itr, segment)));
}
return !res.empty();
}
string LookupTag(const string &str, const SegmentTagged& segment) const {
const DictUnit *tmp = NULL;
RuneStrArray runes;
const DictTrie * dict = segment.GetDictTrie();
assert(dict != NULL);
if (!DecodeUTF8RunesInString(str, runes)) {
XLOG(ERROR) << "UTF-8 decode failed for word: " << str;
return POS_X;
}
tmp = dict->Find(runes.begin(), runes.end());
if (tmp == NULL || tmp->tag.empty()) {
return SpecialRule(runes);
} else {
return tmp->tag;
}
}
private:
const char* SpecialRule(const RuneStrArray& unicode) const {
size_t m = 0;
size_t eng = 0;
for (size_t i = 0; i < unicode.size() && eng < unicode.size() / 2; i++) {
if (unicode[i].rune < 0x80) {
eng ++;
if ('0' <= unicode[i].rune && unicode[i].rune <= '9') {
m++;
}
}
}
// no ASCII char was counted
if (eng == 0) {
return POS_X;
}
// every counted ASCII char is a digit
if (m == eng) {
return POS_M;
}
// the counted ASCII chars include at least one non-digit
return POS_ENG;
}
}; // class PosTagger
} // namespace cppjieba
#endif
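
The fallback in `SpecialRule` is easier to follow in isolation. The block below is a minimal sketch, not the library's code: `FallbackTag` is a hypothetical name, and it takes a plain `std::vector<uint32_t>` of code points instead of a `RuneStrArray`. It keeps the same loop bound as the source, so counting stops once the ASCII count reaches half the word length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical standalone version of the fallback rule: tag a word that has
// no dictionary tag by counting its ASCII code points.
inline std::string FallbackTag(const std::vector<uint32_t>& runes) {
  size_t ascii = 0;   // ASCII code points seen so far
  size_t digits = 0;  // how many of those are '0'..'9'
  // Same bound as the source loop: stop once ascii reaches size / 2.
  for (size_t i = 0; i < runes.size() && ascii < runes.size() / 2; ++i) {
    if (runes[i] < 0x80) {
      ++ascii;
      if (runes[i] >= 0x30 && runes[i] <= 0x39) {
        ++digits;
      }
    }
  }
  if (ascii == 0) {
    return "x";    // no ASCII counted
  }
  if (digits == ascii) {
    return "m";    // only digits counted
  }
  return "eng";    // at least one non-digit ASCII char counted
}
```

One consequence of the `size / 2` bound is that a one-rune word never enters the loop, so in this sketch a single ASCII character is tagged `x` rather than `eng` or `m`.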


@ -0,0 +1,54 @@
#ifndef CPPJIEBA_PRE_FILTER_H
#define CPPJIEBA_PRE_FILTER_H
#include "Trie.hpp"
#include "limonp/Logging.hpp"
namespace cppjieba {
class PreFilter {
public:
//TODO use WordRange instead of Range
struct Range {
RuneStrArray::const_iterator begin;
RuneStrArray::const_iterator end;
}; // struct Range
PreFilter(const unordered_set<Rune>& symbols,
const string& sentence)
: symbols_(symbols) {
if (!DecodeUTF8RunesInString(sentence, sentence_)) {
XLOG(ERROR) << "UTF-8 decode failed for input sentence";
}
cursor_ = sentence_.begin();
}
~PreFilter() {
}
bool HasNext() const {
return cursor_ != sentence_.end();
}
Range Next() {
Range range;
range.begin = cursor_;
while (cursor_ != sentence_.end()) {
if (IsIn(symbols_, cursor_->rune)) {
if (range.begin == cursor_) {
cursor_ ++;
}
range.end = cursor_;
return range;
}
cursor_ ++;
}
range.end = sentence_.end();
return range;
}
private:
RuneStrArray::const_iterator cursor_;
RuneStrArray sentence_;
const unordered_set<Rune>& symbols_;
}; // class PreFilter
} // namespace cppjieba
#endif // CPPJIEBA_PRE_FILTER_H
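
`PreFilter::Next()` yields each separator rune as its own range and each run of non-separator runes as one range. The sketch below reproduces that behaviour on plain index pairs so it can be tried without the rest of the library. `SplitOnSeparators` is a hypothetical helper, and the half-open `[begin, end)` index pairs stand in for the iterator-based `Range`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper: split a rune sequence the way PreFilter walks it.
// Returns half-open [begin, end) index ranges.
inline std::vector<std::pair<size_t, size_t> > SplitOnSeparators(
    const std::vector<uint32_t>& runes,
    const std::unordered_set<uint32_t>& separators) {
  std::vector<std::pair<size_t, size_t> > ranges;
  size_t cursor = 0;
  while (cursor != runes.size()) {       // corresponds to HasNext()
    const size_t begin = cursor;         // corresponds to one Next() call
    while (cursor != runes.size()) {
      if (separators.count(runes[cursor]) != 0) {
        if (begin == cursor) {
          ++cursor;                      // a separator forms a range by itself
        }
        break;                           // otherwise stop just before it
      }
      ++cursor;
    }
    ranges.push_back(std::make_pair(begin, cursor));
  }
  return ranges;
}
```

For the runes of `"ab c"` with a space as the only separator, this produces three ranges: `[0, 2)`, `[2, 3)` and `[3, 4)`.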


@ -0,0 +1,89 @@
#ifndef CPPJIEBA_QUERYSEGMENT_H
#define CPPJIEBA_QUERYSEGMENT_H
#include <algorithm>
#include <set>
#include <cassert>
#include "limonp/Logging.hpp"
#include "DictTrie.hpp"
#include "SegmentBase.hpp"
#include "FullSegment.hpp"
#include "MixSegment.hpp"
#include "Unicode.hpp"
namespace cppjieba {
class QuerySegment: public SegmentBase {
public:
QuerySegment(const string& dict, const string& model, const string& userDict = "")
: mixSeg_(dict, model, userDict),
trie_(mixSeg_.GetDictTrie()) {
}
QuerySegment(const DictTrie* dictTrie, const HMMModel* model)
: mixSeg_(dictTrie, model), trie_(dictTrie) {
}
~QuerySegment() {
}
void Cut(const string& sentence, vector<string>& words) const {
Cut(sentence, words, true);
}
void Cut(const string& sentence, vector<string>& words, bool hmm) const {
vector<Word> tmp;
Cut(sentence, tmp, hmm);
GetStringsFromWords(tmp, words);
}
void Cut(const string& sentence, vector<Word>& words, bool hmm = true) const {
PreFilter pre_filter(symbols_, sentence);
PreFilter::Range range;
vector<WordRange> wrs;
wrs.reserve(sentence.size()/2);
while (pre_filter.HasNext()) {
range = pre_filter.Next();
Cut(range.begin, range.end, wrs, hmm);
}
words.clear();
words.reserve(wrs.size());
GetWordsFromWordRanges(sentence, wrs, words);
}
void Cut(RuneStrArray::const_iterator begin, RuneStrArray::const_iterator end, vector<WordRange>& res, bool hmm) const {
//use mix Cut first
vector<WordRange> mixRes;
mixSeg_.Cut(begin, end, mixRes, hmm);
vector<WordRange> fullRes;
for (vector<WordRange>::const_iterator mixResItr = mixRes.begin(); mixResItr != mixRes.end(); mixResItr++) {
if (mixResItr->Length() > 2) {
for (size_t i = 0; i + 1 < mixResItr->Length(); i++) {
WordRange wr(mixResItr->left + i, mixResItr->left + i + 1);
if (trie_->Find(wr.left, wr.right + 1) != NULL) {
res.push_back(wr);
}
}
}
if (mixResItr->Length() > 3) {
for (size_t i = 0; i + 2 < mixResItr->Length(); i++) {
WordRange wr(mixResItr->left + i, mixResItr->left + i + 2);
if (trie_->Find(wr.left, wr.right + 1) != NULL) {
res.push_back(wr);
}
}
}
res.push_back(*mixResItr);
}
}
private:
bool IsAllAscii(const Unicode& s) const {
for(size_t i = 0; i < s.size(); i++) {
if (s[i] >= 0x80) {
return false;
}
}
return true;
}
MixSegment mixSeg_;
const DictTrie* trie_;
}; // QuerySegment
} // namespace cppjieba
#endif
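
The inner `Cut` of `QuerySegment` adds dictionary sub-words in front of each word from the mixed segmenter: every 2-rune window for words longer than 2 runes, then every 3-rune window for words longer than 3 runes, then the word itself. Because `WordRange` is a closed interval `[left, right]`, the source passes `wr.right + 1` to `Find`. The following is a minimal sketch of the same enumeration under simplifying assumptions: `ExpandForQuery` is a hypothetical name, `std::u32string` replaces the rune iterators, and a `std::set` replaces the `DictTrie`.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: list the dictionary sub-words of `word` in the order
// used by the query-mode cut, followed by the word itself.
inline std::vector<std::u32string> ExpandForQuery(
    const std::u32string& word, const std::set<std::u32string>& dict) {
  std::vector<std::u32string> out;
  if (word.size() > 2) {
    // every 2-rune window that is a dictionary entry
    for (size_t i = 0; i + 1 < word.size(); ++i) {
      const std::u32string sub = word.substr(i, 2);
      if (dict.count(sub) != 0) {
        out.push_back(sub);
      }
    }
  }
  if (word.size() > 3) {
    // every 3-rune window that is a dictionary entry
    for (size_t i = 0; i + 2 < word.size(); ++i) {
      const std::u32string sub = word.substr(i, 3);
      if (dict.count(sub) != 0) {
        out.push_back(sub);
      }
    }
  }
  out.push_back(word);  // the full word always comes last
  return out;
}
```

With a dictionary holding `ab`, `cd` and `bcd`, the word `abcd` expands to `ab`, `cd`, `bcd`, `abcd`, while a 2-rune word is returned unchanged.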


@ -0,0 +1,46 @@
#ifndef CPPJIEBA_SEGMENTBASE_H
#define CPPJIEBA_SEGMENTBASE_H
#include "limonp/Logging.hpp"
#include "PreFilter.hpp"
#include <cassert>
namespace cppjieba {
const char* const SPECIAL_SEPARATORS = " \t\n\xEF\xBC\x8C\xE3\x80\x82";
using namespace limonp;
class SegmentBase {
public:
SegmentBase() {
XCHECK(ResetSeparators(SPECIAL_SEPARATORS));
}
virtual ~SegmentBase() {
}
virtual void Cut(const string& sentence, vector<string>& words) const = 0;
bool ResetSeparators(const string& s) {
symbols_.clear();
RuneStrArray runes;
if (!DecodeUTF8RunesInString(s, runes)) {
XLOG(ERROR) << "UTF-8 decode failed for separators: " << s;
return false;
}
for (size_t i = 0; i < runes.size(); i++) {
if (!symbols_.insert(runes[i].rune).second) {
XLOG(ERROR) << s.substr(runes[i].offset, runes[i].len) << " already exists";
return false;
}
}
return true;
}
protected:
unordered_set<Rune> symbols_;
}; // class SegmentBase
} // cppjieba
#endif


@ -0,0 +1,23 @@
#ifndef CPPJIEBA_SEGMENTTAGGED_H
#define CPPJIEBA_SEGMENTTAGGED_H
#include "SegmentBase.hpp"
namespace cppjieba {
class SegmentTagged : public SegmentBase{
public:
SegmentTagged() {
}
virtual ~SegmentTagged() {
}
virtual bool Tag(const string& src, vector<pair<string, string> >& res) const = 0;
virtual const DictTrie* GetDictTrie() const = 0;
}; // class SegmentTagged
} // cppjieba
#endif


@ -0,0 +1,190 @@
#ifndef CPPJIEBA_TEXTRANK_EXTRACTOR_H
#define CPPJIEBA_TEXTRANK_EXTRACTOR_H
#include <cmath>
#include "Jieba.hpp"
namespace cppjieba {
using namespace limonp;
using namespace std;
class TextRankExtractor {
public:
typedef struct _Word {string word;vector<size_t> offsets;double weight;} Word; // struct Word
private:
typedef std::map<string,Word> WordMap;
class WordGraph{
private:
typedef double Score;
typedef string Node;
typedef std::set<Node> NodeSet;
typedef std::map<Node,double> Edges;
typedef std::map<Node,Edges> Graph;
//typedef std::unordered_map<Node,double> Edges;
//typedef std::unordered_map<Node,Edges> Graph;
double d;
Graph graph;
NodeSet nodeSet;
public:
WordGraph(): d(0.85) {};
WordGraph(double in_d): d(in_d) {};
void addEdge(Node start,Node end,double weight){
Edges temp;
Edges::iterator gotEdges;
nodeSet.insert(start);
nodeSet.insert(end);
graph[start][end]+=weight;
graph[end][start]+=weight;
}
void rank(WordMap &ws,size_t rankTime=10){
WordMap outSum;
Score wsdef, min_rank, max_rank;
if( graph.size() == 0)
return;
wsdef = 1.0 / graph.size();
for(Graph::iterator edges=graph.begin();edges!=graph.end();++edges){
// edges->first: start node; edge->first: end node; edge->second: weight
ws[edges->first].word=edges->first;
ws[edges->first].weight=wsdef;
outSum[edges->first].weight=0;
for(Edges::iterator edge=edges->second.begin();edge!=edges->second.end();++edge){
outSum[edges->first].weight+=edge->second;
}
}
//sort(nodeSet.begin(),nodeSet.end()); is sorting needed here?
for( size_t i=0; i<rankTime; i++ ){
for(NodeSet::iterator node = nodeSet.begin(); node != nodeSet.end(); node++ ){
double s = 0;
for( Edges::iterator edge= graph[*node].begin(); edge != graph[*node].end(); edge++ )
// edge->first: end node; edge->second: weight
s += edge->second / outSum[edge->first].weight * ws[edge->first].weight;
ws[*node].weight = (1 - d) + d * s;
}
}
min_rank=max_rank=ws.begin()->second.weight;
for(WordMap::iterator i = ws.begin(); i != ws.end(); i ++){
if( i->second.weight < min_rank ){
min_rank = i->second.weight;
}
if( i->second.weight > max_rank ){
max_rank = i->second.weight;
}
}
for(WordMap::iterator i = ws.begin(); i != ws.end(); i ++){
ws[i->first].weight = (i->second.weight - min_rank / 10.0) / (max_rank - min_rank / 10.0);
}
}
};
public:
TextRankExtractor(const string& dictPath,
const string& hmmFilePath,
const string& stopWordPath,
const string& userDict = "")
: segment_(dictPath, hmmFilePath, userDict) {
LoadStopWordDict(stopWordPath);
}
TextRankExtractor(const DictTrie* dictTrie,
const HMMModel* model,
const string& stopWordPath)
: segment_(dictTrie, model) {
LoadStopWordDict(stopWordPath);
}
TextRankExtractor(const Jieba& jieba, const string& stopWordPath) : segment_(jieba.GetDictTrie(), jieba.GetHMMModel()) {
LoadStopWordDict(stopWordPath);
}
~TextRankExtractor() {
}
void Extract(const string& sentence, vector<string>& keywords, size_t topN) const {
vector<Word> topWords;
Extract(sentence, topWords, topN);
for (size_t i = 0; i < topWords.size(); i++) {
keywords.push_back(topWords[i].word);
}
}
void Extract(const string& sentence, vector<pair<string, double> >& keywords, size_t topN) const {
vector<Word> topWords;
Extract(sentence, topWords, topN);
for (size_t i = 0; i < topWords.size(); i++) {
keywords.push_back(pair<string, double>(topWords[i].word, topWords[i].weight));
}
}
void Extract(const string& sentence, vector<Word>& keywords, size_t topN, size_t span=5,size_t rankTime=10) const {
vector<string> words;
segment_.Cut(sentence, words);
TextRankExtractor::WordGraph graph;
WordMap wordmap;
size_t offset = 0;
for(size_t i=0; i < words.size(); i++){
size_t t = offset;
offset += words[i].size();
if (IsSingleWord(words[i]) || stopWords_.find(words[i]) != stopWords_.end()) {
continue;
}
for(size_t j=i+1,skip=0;j<i+span+skip && j<words.size();j++){
if (IsSingleWord(words[j]) || stopWords_.find(words[j]) != stopWords_.end()) {
skip++;
continue;
}
graph.addEdge(words[i],words[j],1);
}
wordmap[words[i]].offsets.push_back(t);
}
if (offset != sentence.size()) {
XLOG(ERROR) << "words illegal";
return;
}
graph.rank(wordmap,rankTime);
keywords.clear();
keywords.reserve(wordmap.size());
for (WordMap::iterator itr = wordmap.begin(); itr != wordmap.end(); ++itr) {
keywords.push_back(itr->second);
}
topN = min(topN, keywords.size());
partial_sort(keywords.begin(), keywords.begin() + topN, keywords.end(), Compare);
keywords.resize(topN);
}
private:
void LoadStopWordDict(const string& filePath) {
ifstream ifs(filePath.c_str());
XCHECK(ifs.is_open()) << "open " << filePath << " failed";
string line ;
while (getline(ifs, line)) {
stopWords_.insert(line);
}
assert(stopWords_.size());
}
static bool Compare(const Word &x,const Word &y){
return x.weight > y.weight;
}
MixSegment segment_;
unordered_set<string> stopWords_;
}; // class TextRankExtractor
inline ostream& operator << (ostream& os, const TextRankExtractor::Word& word) {
return os << "{\"word\": \"" << word.word << "\", \"offset\": " << word.offsets << ", \"weight\": " << word.weight << "}";
}
} // namespace cppjieba
#endif
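
The core of `WordGraph::rank` is the update `ws[node] = (1 - d) + d * s`, where `s` sums, over each neighbour, the edge weight divided by the neighbour's total outgoing weight times the neighbour's current score. Scores are updated in place, so later nodes in one pass already see the new values of earlier ones. The block below is a minimal sketch of that loop only. `AddEdge` and `Rank` are hypothetical free functions, and the final min/max rescaling done by the source is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>

typedef std::map<std::string, double> Edges;
typedef std::map<std::string, Edges> Graph;

// Undirected edge: weight is accumulated in both directions.
inline void AddEdge(Graph& g, const std::string& a, const std::string& b, double w) {
  g[a][b] += w;
  g[b][a] += w;
}

// Hypothetical sketch of the iteration; returns the unnormalised scores.
inline std::map<std::string, double> Rank(const Graph& g, double d = 0.85,
                                          size_t iterations = 10) {
  std::map<std::string, double> score;
  std::map<std::string, double> outSum;
  if (g.empty()) {
    return score;
  }
  const double init = 1.0 / g.size();
  for (Graph::const_iterator n = g.begin(); n != g.end(); ++n) {
    score[n->first] = init;
    double sum = 0.0;  // total weight leaving this node
    for (Edges::const_iterator e = n->second.begin(); e != n->second.end(); ++e) {
      sum += e->second;
    }
    outSum[n->first] = sum;
  }
  for (size_t it = 0; it < iterations; ++it) {
    for (Graph::const_iterator n = g.begin(); n != g.end(); ++n) {
      double s = 0.0;
      for (Edges::const_iterator e = n->second.begin(); e != n->second.end(); ++e) {
        // neighbour's score, shared out in proportion to the edge weight
        s += e->second / outSum[e->first] * score[e->first];
      }
      score[n->first] = (1 - d) + d * s;  // in-place update
    }
  }
  return score;
}
```

On a star-shaped graph, the hub ends up with a higher score than any leaf, which matches the intuition that a word co-occurring with many others ranks higher.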

include/cppjieba/Trie.hpp Normal file

@ -0,0 +1,200 @@
#ifndef CPPJIEBA_TRIE_HPP
#define CPPJIEBA_TRIE_HPP
#include <vector>
#include <queue>
#include "limonp/StdExtension.hpp"
#include "Unicode.hpp"
namespace cppjieba {
using namespace std;
const size_t MAX_WORD_LENGTH = 512;
struct DictUnit {
Unicode word;
double weight;
string tag;
}; // struct DictUnit
// for debugging
// inline ostream & operator << (ostream& os, const DictUnit& unit) {
// string s;
// s << unit.word;
// return os << StringFormat("%s %s %.3lf", s.c_str(), unit.tag.c_str(), unit.weight);
// }
struct Dag {
RuneStr runestr;
// [offset, nexts.first]
limonp::LocalVector<pair<size_t, const DictUnit*> > nexts;
const DictUnit * pInfo;
double weight;
size_t nextPos; // TODO
Dag():runestr(), pInfo(NULL), weight(0.0), nextPos(0) {
}
}; // struct Dag
typedef Rune TrieKey;
class TrieNode {
public :
TrieNode(): next(NULL), ptValue(NULL) {
}
public:
typedef unordered_map<TrieKey, TrieNode*> NextMap;
NextMap *next;
const DictUnit *ptValue;
};
class Trie {
public:
Trie(const vector<Unicode>& keys, const vector<const DictUnit*>& valuePointers)
: root_(new TrieNode) {
CreateTrie(keys, valuePointers);
}
~Trie() {
DeleteNode(root_);
}
const DictUnit* Find(RuneStrArray::const_iterator begin, RuneStrArray::const_iterator end) const {
if (begin == end) {
return NULL;
}
const TrieNode* ptNode = root_;
TrieNode::NextMap::const_iterator citer;
for (RuneStrArray::const_iterator it = begin; it != end; it++) {
if (NULL == ptNode->next) {
return NULL;
}
citer = ptNode->next->find(it->rune);
if (ptNode->next->end() == citer) {
return NULL;
}
ptNode = citer->second;
}
return ptNode->ptValue;
}
void Find(RuneStrArray::const_iterator begin,
RuneStrArray::const_iterator end,
vector<struct Dag>&res,
size_t max_word_len = MAX_WORD_LENGTH) const {
assert(root_ != NULL);
res.resize(end - begin);
const TrieNode *ptNode = NULL;
TrieNode::NextMap::const_iterator citer;
for (size_t i = 0; i < size_t(end - begin); i++) {
res[i].runestr = *(begin + i);
if (root_->next != NULL && root_->next->end() != (citer = root_->next->find(res[i].runestr.rune))) {
ptNode = citer->second;
} else {
ptNode = NULL;
}
if (ptNode != NULL) {
res[i].nexts.push_back(pair<size_t, const DictUnit*>(i, ptNode->ptValue));
} else {
res[i].nexts.push_back(pair<size_t, const DictUnit*>(i, static_cast<const DictUnit*>(NULL)));
}
for (size_t j = i + 1; j < size_t(end - begin) && (j - i + 1) <= max_word_len; j++) {
if (ptNode == NULL || ptNode->next == NULL) {
break;
}
citer = ptNode->next->find((begin + j)->rune);
if (ptNode->next->end() == citer) {
break;
}
ptNode = citer->second;
if (NULL != ptNode->ptValue) {
res[i].nexts.push_back(pair<size_t, const DictUnit*>(j, ptNode->ptValue));
}
}
}
}
void InsertNode(const Unicode& key, const DictUnit* ptValue) {
if (key.begin() == key.end()) {
return;
}
TrieNode::NextMap::const_iterator kmIter;
TrieNode *ptNode = root_;
for (Unicode::const_iterator citer = key.begin(); citer != key.end(); ++citer) {
if (NULL == ptNode->next) {
ptNode->next = new TrieNode::NextMap;
}
kmIter = ptNode->next->find(*citer);
if (ptNode->next->end() == kmIter) {
TrieNode *nextNode = new TrieNode;
ptNode->next->insert(make_pair(*citer, nextNode));
ptNode = nextNode;
} else {
ptNode = kmIter->second;
}
}
assert(ptNode != NULL);
ptNode->ptValue = ptValue;
}
void DeleteNode(const Unicode& key, const DictUnit* ptValue) {
if (key.begin() == key.end()) {
return;
}
// declare a NextMap iterator
TrieNode::NextMap::const_iterator kmIter;
// declare a TrieNode pointer that starts at the root
TrieNode *ptNode = root_;
for (Unicode::const_iterator citer = key.begin(); citer != key.end(); ++citer) {
// this node has no children
if (NULL == ptNode->next) {
return;
}
kmIter = ptNode->next->find(*citer);
// the key is not in the map, so leave the loop
if (ptNode->next->end() == kmIter) {
break;
}
// save the child pointer first: erasing the entry invalidates kmIter
TrieNode *child = kmIter->second;
// erase this entry from the unordered_map
ptNode->next->erase(*citer);
// delete this node
delete child;
break;
}
return;
}
private:
void CreateTrie(const vector<Unicode>& keys, const vector<const DictUnit*>& valuePointers) {
if (valuePointers.empty() || keys.empty()) {
return;
}
assert(keys.size() == valuePointers.size());
for (size_t i = 0; i < keys.size(); i++) {
InsertNode(keys[i], valuePointers[i]);
}
}
void DeleteNode(TrieNode* node) {
if (NULL == node) {
return;
}
if (NULL != node->next) {
for (TrieNode::NextMap::iterator it = node->next->begin(); it != node->next->end(); ++it) {
DeleteNode(it->second);
}
delete node->next;
}
delete node;
}
TrieNode* root_;
}; // class Trie
} // namespace cppjieba
#endif // CPPJIEBA_TRIE_HPP
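
The exact-match `Find` and `InsertNode` above walk one hash map per rune and store the value pointer only on the node where a key ends. The sketch below shows the same shape in a compact form. It is an illustration under stated assumptions, not the library's class: `MiniTrie` and `Node` are hypothetical names, `const int*` stands in for `const DictUnit*`, and `std::unique_ptr` replaces the manual `new`/`delete` of the source.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical node type: children keyed by rune, value set only at key ends.
struct Node {
  std::unordered_map<uint32_t, std::unique_ptr<Node> > next;
  const int* value = nullptr;  // stands in for const DictUnit*
};

class MiniTrie {
 public:
  void Insert(const std::vector<uint32_t>& key, const int* value) {
    if (key.empty()) {
      return;  // empty keys are ignored, as in InsertNode
    }
    Node* node = &root_;
    for (size_t i = 0; i < key.size(); ++i) {
      std::unique_ptr<Node>& child = node->next[key[i]];
      if (!child) {
        child.reset(new Node);  // create the missing child
      }
      node = child.get();
    }
    node->value = value;
  }

  // Returns the stored pointer, or nullptr if the key is absent or is only
  // a prefix of longer keys.
  const int* Find(const std::vector<uint32_t>& key) const {
    if (key.empty()) {
      return nullptr;
    }
    const Node* node = &root_;
    for (size_t i = 0; i < key.size(); ++i) {
      auto it = node->next.find(key[i]);
      if (it == node->next.end()) {
        return nullptr;
      }
      node = it->second.get();
    }
    return node->value;
  }

 private:
  Node root_;
};
```

Owning children through `std::unique_ptr` means the whole tree is released when the root goes out of scope, which is the job the recursive private `DeleteNode(TrieNode*)` does in the source.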


@ -0,0 +1,227 @@
#ifndef CPPJIEBA_UNICODE_H
#define CPPJIEBA_UNICODE_H
#include <stdint.h>
#include <stdlib.h>
#include <string>
#include <vector>
#include <ostream>
#include "limonp/LocalVector.hpp"
namespace cppjieba {
using std::string;
using std::vector;
typedef uint32_t Rune;
struct Word {
string word;
uint32_t offset;
uint32_t unicode_offset;
uint32_t unicode_length;
Word(const string& w, uint32_t o)
: word(w), offset(o), unicode_offset(0), unicode_length(0) {
}
Word(const string& w, uint32_t o, uint32_t unicode_offset, uint32_t unicode_length)
: word(w), offset(o), unicode_offset(unicode_offset), unicode_length(unicode_length) {
}
}; // struct Word
inline std::ostream& operator << (std::ostream& os, const Word& w) {
return os << "{\"word\": \"" << w.word << "\", \"offset\": " << w.offset << "}";
}
struct RuneStr {
Rune rune;
uint32_t offset;
uint32_t len;
uint32_t unicode_offset;
uint32_t unicode_length;
RuneStr(): rune(0), offset(0), len(0), unicode_offset(0), unicode_length(0) {
}
RuneStr(Rune r, uint32_t o, uint32_t l)
: rune(r), offset(o), len(l), unicode_offset(0), unicode_length(0) {
}
RuneStr(Rune r, uint32_t o, uint32_t l, uint32_t unicode_offset, uint32_t unicode_length)
: rune(r), offset(o), len(l), unicode_offset(unicode_offset), unicode_length(unicode_length) {
}
}; // struct RuneStr
inline std::ostream& operator << (std::ostream& os, const RuneStr& r) {
return os << "{\"rune\": \"" << r.rune << "\", \"offset\": " << r.offset << ", \"len\": " << r.len << "}";
}
typedef limonp::LocalVector<Rune> Unicode;
typedef limonp::LocalVector<struct RuneStr> RuneStrArray;
// [left, right]
struct WordRange {
RuneStrArray::const_iterator left;
RuneStrArray::const_iterator right;
WordRange(RuneStrArray::const_iterator l, RuneStrArray::const_iterator r)
: left(l), right(r) {
}
size_t Length() const {
return right - left + 1;
}
bool IsAllAscii() const {
for (RuneStrArray::const_iterator iter = left; iter <= right; ++iter) {
if (iter->rune >= 0x80) {
return false;
}
}
return true;
}
}; // struct WordRange
struct RuneStrLite {
uint32_t rune;
uint32_t len;
RuneStrLite(): rune(0), len(0) {
}
RuneStrLite(uint32_t r, uint32_t l): rune(r), len(l) {
}
}; // struct RuneStrLite
inline RuneStrLite DecodeUTF8ToRune(const char* str, size_t len) {
RuneStrLite rp(0, 0);
if (str == NULL || len == 0) {
return rp;
}
if (!(str[0] & 0x80)) { // 0xxxxxxx
// 7bit, total 7bit
rp.rune = (uint8_t)(str[0]) & 0x7f;
rp.len = 1;
} else if ((uint8_t)str[0] <= 0xdf && 1 < len) {
// 110xxxxx
// 5bit, total 5bit
rp.rune = (uint8_t)(str[0]) & 0x1f;
// 6bit, total 11bit
rp.rune <<= 6;
rp.rune |= (uint8_t)(str[1]) & 0x3f;
rp.len = 2;
} else if((uint8_t)str[0] <= 0xef && 2 < len) { // 1110xxxx
// 4bit, total 4bit
rp.rune = (uint8_t)(str[0]) & 0x0f;
// 6bit, total 10bit
rp.rune <<= 6;
rp.rune |= (uint8_t)(str[1]) & 0x3f;
// 6bit, total 16bit
rp.rune <<= 6;
rp.rune |= (uint8_t)(str[2]) & 0x3f;
rp.len = 3;
} else if((uint8_t)str[0] <= 0xf7 && 3 < len) { // 11110xxx
// 3bit, total 3bit
rp.rune = (uint8_t)(str[0]) & 0x07;
// 6bit, total 9bit
rp.rune <<= 6;
rp.rune |= (uint8_t)(str[1]) & 0x3f;
// 6bit, total 15bit
rp.rune <<= 6;
rp.rune |= (uint8_t)(str[2]) & 0x3f;
// 6bit, total 21bit
rp.rune <<= 6;
rp.rune |= (uint8_t)(str[3]) & 0x3f;
rp.len = 4;
} else {
rp.rune = 0;
rp.len = 0;
}
return rp;
}
inline bool DecodeUTF8RunesInString(const char* s, size_t len, RuneStrArray& runes) {
runes.clear();
runes.reserve(len / 2);
for (uint32_t i = 0, j = 0; i < len;) {
RuneStrLite rp = DecodeUTF8ToRune(s + i, len - i);
if (rp.len == 0) {
runes.clear();
return false;
}
RuneStr x(rp.rune, i, rp.len, j, 1);
runes.push_back(x);
i += rp.len;
++j;
}
return true;
}
inline bool DecodeUTF8RunesInString(const string& s, RuneStrArray& runes) {
return DecodeUTF8RunesInString(s.c_str(), s.size(), runes);
}
inline bool DecodeUTF8RunesInString(const char* s, size_t len, Unicode& unicode) {
unicode.clear();
RuneStrArray runes;
if (!DecodeUTF8RunesInString(s, len, runes)) {
return false;
}
unicode.reserve(runes.size());
for (size_t i = 0; i < runes.size(); i++) {
unicode.push_back(runes[i].rune);
}
return true;
}
inline bool IsSingleWord(const string& str) {
RuneStrLite rp = DecodeUTF8ToRune(str.c_str(), str.size());
return rp.len == str.size();
}
inline bool DecodeUTF8RunesInString(const string& s, Unicode& unicode) {
return DecodeUTF8RunesInString(s.c_str(), s.size(), unicode);
}
inline Unicode DecodeUTF8RunesInString(const string& s) {
Unicode result;
DecodeUTF8RunesInString(s, result);
return result;
}
// [left, right]
inline Word GetWordFromRunes(const string& s, RuneStrArray::const_iterator left, RuneStrArray::const_iterator right) {
assert(right->offset >= left->offset);
uint32_t len = right->offset - left->offset + right->len;
uint32_t unicode_length = right->unicode_offset - left->unicode_offset + right->unicode_length;
return Word(s.substr(left->offset, len), left->offset, left->unicode_offset, unicode_length);
}
inline string GetStringFromRunes(const string& s, RuneStrArray::const_iterator left, RuneStrArray::const_iterator right) {
assert(right->offset >= left->offset);
uint32_t len = right->offset - left->offset + right->len;
return s.substr(left->offset, len);
}
inline void GetWordsFromWordRanges(const string& s, const vector<WordRange>& wrs, vector<Word>& words) {
for (size_t i = 0; i < wrs.size(); i++) {
words.push_back(GetWordFromRunes(s, wrs[i].left, wrs[i].right));
}
}
inline vector<Word> GetWordsFromWordRanges(const string& s, const vector<WordRange>& wrs) {
vector<Word> result;
GetWordsFromWordRanges(s, wrs, result);
return result;
}
inline void GetStringsFromWords(const vector<Word>& words, vector<string>& strs) {
strs.resize(words.size());
for (size_t i = 0; i < words.size(); ++i) {
strs[i] = words[i].word;
}
}
} // namespace cppjieba
#endif // CPPJIEBA_UNICODE_H
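
`DecodeUTF8RunesInString` turns a byte string into runes that remember their byte offset and byte length, which is what later lets `GetWordFromRunes` slice the original string. The block below is a minimal sketch of that idea, not a copy of the header: `Decoded` and `DecodeUtf8` are hypothetical names. It is also stricter than `DecodeUTF8ToRune` as written above, which classifies the lead byte only by upper bounds and does not inspect continuation bytes. Here the lead-byte bit pattern and the `10xxxxxx` form of every continuation byte are both checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical counterpart of RuneStr: code point plus its byte position.
struct Decoded {
  uint32_t rune;
  size_t offset;  // byte offset in the input
  size_t len;     // number of bytes used
};

// Decodes `s` into `out`; returns false and clears `out` on malformed input.
inline bool DecodeUtf8(const std::string& s, std::vector<Decoded>& out) {
  out.clear();
  for (size_t i = 0; i < s.size();) {
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    size_t len = 0;
    uint32_t rune = 0;
    if (lead < 0x80) {                   // 0xxxxxxx
      len = 1; rune = lead;
    } else if ((lead & 0xE0) == 0xC0) {  // 110xxxxx
      len = 2; rune = lead & 0x1F;
    } else if ((lead & 0xF0) == 0xE0) {  // 1110xxxx
      len = 3; rune = lead & 0x0F;
    } else if ((lead & 0xF8) == 0xF0) {  // 11110xxx
      len = 4; rune = lead & 0x07;
    } else {
      out.clear();
      return false;                      // stray continuation or invalid lead
    }
    if (i + len > s.size()) {
      out.clear();
      return false;                      // sequence is cut off
    }
    for (size_t k = 1; k < len; ++k) {
      const unsigned char cont = static_cast<unsigned char>(s[i + k]);
      if ((cont & 0xC0) != 0x80) {       // must be 10xxxxxx
        out.clear();
        return false;
      }
      rune = (rune << 6) | (cont & 0x3F);
    }
    const Decoded d = {rune, i, len};
    out.push_back(d);
    i += len;
  }
  return true;
}
```

For the bytes `61 E4 B8 AD`, this yields two runes: `U+0061` at offset 0 with length 1, and `U+4E2D` at offset 1 with length 3.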


@ -1,2 +0,0 @@
INSTALL(PROGRAMS cjserver.start cjserver.stop DESTINATION /etc/init.d/)
INSTALL(PROGRAMS cjseg.sh DESTINATION bin)


@ -1,5 +0,0 @@
if [ $# -lt 1 ]; then
echo "usage: $0 <file>"
exit 1
fi
cjsegment --dictpath /usr/share/CppJieba/dict/jieba.dict.utf8 --modelpath /usr/share/CppJieba/dict/hmm_model.utf8 $1


@ -1,12 +0,0 @@
#!/bin/sh
PATH=/usr/bin/:/usr/local/bin/:/sbin/:$PATH
PID=`pidof cjserver`
if [ ! -z "${PID}" ]
then
echo "please stop cjserver first."
else
cjserver /etc/CppJieba/server.conf &
echo "service started."
fi


@ -1,13 +0,0 @@
#!/bin/sh
PATH=/usr/bin/:/usr/local/bin/:/sbin/:$PATH
PID=`pidof cjserver`
if [ ! -z "${PID}" ]
then
kill ${PID}
sleep 1
echo "service stop ok."
else
echo "cjserver is not running."
fi


@ -1,12 +0,0 @@
SET(EXECUTABLE_OUTPUT_PATH ${PROJECT_BINARY_DIR}/bin)
SET(LIBRARY_OUTPUT_PATH ${PROJECT_BINARY_DIR}/lib)
INCLUDE_DIRECTORIES(${PROJECT_SOURCE_DIR}/src)
ADD_EXECUTABLE(cjsegment segment.cpp)
ADD_EXECUTABLE(cjserver server.cpp)
TARGET_LINK_LIBRARIES(cjserver pthread)
INSTALL(TARGETS cjsegment RUNTIME DESTINATION bin)
INSTALL(TARGETS cjserver RUNTIME DESTINATION bin)


@ -1,186 +0,0 @@
#ifndef CPPJIEBA_DICT_TRIE_HPP
#define CPPJIEBA_DICT_TRIE_HPP
#include <iostream>
#include <fstream>
#include <map>
#include <cstring>
#include <stdint.h>
#include <cmath>
#include <limits>
#include "Limonp/str_functs.hpp"
#include "Limonp/logger.hpp"
#include "Limonp/InitOnOff.hpp"
#include "TransCode.hpp"
#include "Trie.hpp"
namespace CppJieba
{
using namespace Limonp;
const double MIN_DOUBLE = -3.14e+100;
const double MAX_DOUBLE = 3.14e+100;
const size_t DICT_COLUMN_NUM = 3;
struct DictUnit
{
Unicode word;
size_t freq;
string tag;
double logFreq; //logFreq = log(freq/sum(freq));
};
inline ostream & operator << (ostream& os, const DictUnit& unit)
{
string s;
s << unit.word;
return os << string_format("%s %u %s %.3lf", s.c_str(), unit.freq, unit.tag.c_str(), unit.logFreq);
}
typedef map<size_t, const DictUnit*> DagType;
class DictTrie: public InitOnOff
{
public:
typedef Trie<Unicode::value_type, DictUnit> TrieType;
private:
vector<DictUnit> _nodeInfos;
TrieType * _trie;
size_t _freqSum;
double _minLogFreq;
public:
DictTrie()
{
_trie = NULL;
_freqSum = 0;
_minLogFreq = MAX_DOUBLE;
_setInitFlag(false);
}
DictTrie(const string& filePath)
{
new (this) DictTrie();
_setInitFlag(init(filePath));
}
~DictTrie()
{
if(_trie)
{
delete _trie;
}
}
private:
public:
bool init(const string& filePath)
{
assert(!_getInitFlag());
_loadDict(filePath, _nodeInfos);
_shrink(_nodeInfos);
_freqSum = _calculateFreqSum(_nodeInfos);
assert(_freqSum);
_minLogFreq = _calculateLogFreqAndGetMinValue(_nodeInfos, _freqSum);
_trie = _creatTrie(_nodeInfos);
return _setInitFlag(_trie);
}
public:
const DictUnit* find(Unicode::const_iterator begin, Unicode::const_iterator end) const
{
return _trie->find(begin, end);
}
bool find(Unicode::const_iterator begin, Unicode::const_iterator end, DagType& dag, size_t offset = 0) const
{
return _trie->find(begin, end, dag, offset);
}
public:
double getMinLogFreq() const {return _minLogFreq;};
private:
TrieType * _creatTrie(const vector<DictUnit>& dictUnits)
{
if(dictUnits.empty())
{
return NULL;
}
vector<Unicode> words;
vector<const DictUnit*> valuePointers;
for(size_t i = 0 ; i < dictUnits.size(); i ++)
{
words.push_back(dictUnits[i].word);
valuePointers.push_back(&dictUnits[i]);
}
TrieType * trie = new TrieType(words, valuePointers);
return trie;
}
void _loadDict(const string& filePath, vector<DictUnit>& nodeInfos) const
{
ifstream ifs(filePath.c_str());
if(!ifs)
{
LogFatal("open %s failed.", filePath.c_str());
exit(1);
}
string line;
vector<string> buf;
nodeInfos.clear();
DictUnit nodeInfo;
for(size_t lineno = 0 ; getline(ifs, line); lineno++)
{
split(line, buf, " ");
assert(buf.size() == DICT_COLUMN_NUM);
if(!TransCode::decode(buf[0], nodeInfo.word))
{
LogError("line[%u:%s] illegal.", lineno, line.c_str());
continue;
}
nodeInfo.freq = atoi(buf[1].c_str());
nodeInfo.tag = buf[2];
nodeInfos.push_back(nodeInfo);
}
}
size_t _calculateFreqSum(const vector<DictUnit>& nodeInfos) const
{
size_t freqSum = 0;
for(size_t i = 0; i < nodeInfos.size(); i++)
{
freqSum += nodeInfos[i].freq;
}
return freqSum;
}
double _calculateLogFreqAndGetMinValue(vector<DictUnit>& nodeInfos, size_t freqSum) const
{
assert(freqSum);
double minLogFreq = MAX_DOUBLE;
for(size_t i = 0; i < nodeInfos.size(); i++)
{
DictUnit& nodeInfo = nodeInfos[i];
assert(nodeInfo.freq);
nodeInfo.logFreq = log(double(nodeInfo.freq)/double(freqSum));
if(minLogFreq > nodeInfo.logFreq)
{
minLogFreq = nodeInfo.logFreq;
}
}
return minLogFreq;
}
void _shrink(vector<DictUnit>& units) const
{
vector<DictUnit>(units.begin(), units.end()).swap(units);
}
};
}
#endif


@ -1,129 +0,0 @@
#ifndef CPPJIEBA_FULLSEGMENT_H
#define CPPJIEBA_FULLSEGMENT_H
#include <algorithm>
#include <set>
#include <cassert>
#include "Limonp/logger.hpp"
#include "DictTrie.hpp"
#include "ISegment.hpp"
#include "SegmentBase.hpp"
#include "TransCode.hpp"
namespace CppJieba
{
class FullSegment: public SegmentBase
{
private:
DictTrie _dictTrie;
public:
FullSegment(){_setInitFlag(false);};
explicit FullSegment(const string& dictPath){_setInitFlag(init(dictPath));}
virtual ~FullSegment(){};
public:
bool init(const string& dictPath)
{
if(_getInitFlag())
{
LogError("already inited before now.");
return false;
}
_dictTrie.init(dictPath.c_str());
assert(_dictTrie);
return _setInitFlag(true);
}
public:
using SegmentBase::cut;
public:
bool cut(Unicode::const_iterator begin, Unicode::const_iterator end, vector<Unicode>& res) const
{
assert(_getInitFlag());
if (begin >= end)
{
LogError("begin >= end");
return false;
}
//result of searching in trie tree
DagType tRes;
//max index of res's words
int maxIdx = 0;
// always equals to (uItr - begin)
int uIdx = 0;
//tmp variables
int wordLen = 0;
for (Unicode::const_iterator uItr = begin; uItr != end; uItr++)
{
//find word start from uItr
if (_dictTrie.find(uItr, end, tRes, 0))
{
for(DagType::const_iterator itr = tRes.begin(); itr != tRes.end(); itr++)
//for (vector<pair<size_t, const DictUnit*> >::const_iterator itr = tRes.begin(); itr != tRes.end(); itr++)
{
wordLen = itr->second->word.size();
if (wordLen >= 2 || (tRes.size() == 1 && maxIdx <= uIdx))
{
res.push_back(itr->second->word);
}
maxIdx = uIdx+wordLen > maxIdx ? uIdx+wordLen : maxIdx;
}
tRes.clear();
}
else // not found word start from uItr
{
if (maxIdx <= uIdx) // never exist in prev results
{
//put itr itself in res
res.push_back(Unicode(1, *uItr));
//mark it as existing
++maxIdx;
}
}
++uIdx;
}
return true;
}
bool cut(Unicode::const_iterator begin, Unicode::const_iterator end, vector<string>& res) const
{
assert(_getInitFlag());
if (begin >= end)
{
LogError("begin >= end");
return false;
}
vector<Unicode> uRes;
if (!cut(begin, end, uRes))
{
LogError("get unicode cut result error.");
return false;
}
string tmp;
for (vector<Unicode>::const_iterator uItr = uRes.begin(); uItr != uRes.end(); uItr++)
{
if (TransCode::encode(*uItr, tmp))
{
res.push_back(tmp);
}
else
{
LogError("encode failed.");
}
}
return true;
}
};
}
#endif


@ -1,332 +0,0 @@
#ifndef CPPJIBEA_HMMSEGMENT_H
#define CPPJIBEA_HMMSEGMENT_H
#include <iostream>
#include <fstream>
#include <memory.h>
#include <cassert>
#include "Limonp/str_functs.hpp"
#include "Limonp/logger.hpp"
#include "TransCode.hpp"
#include "ISegment.hpp"
#include "SegmentBase.hpp"
#include "DictTrie.hpp"
namespace CppJieba
{
using namespace Limonp;
typedef unordered_map<uint16_t, double> EmitProbMap;
class HMMSegment: public SegmentBase
{
public:
/*
* STATUS:
* 0:B, 1:E, 2:M, 3:S
* */
enum {B = 0, E = 1, M = 2, S = 3, STATUS_SUM = 4};
private:
char _statMap[STATUS_SUM];
double _startProb[STATUS_SUM];
double _transProb[STATUS_SUM][STATUS_SUM];
EmitProbMap _emitProbB;
EmitProbMap _emitProbE;
EmitProbMap _emitProbM;
EmitProbMap _emitProbS;
vector<EmitProbMap* > _emitProbVec;
public:
HMMSegment(){_setInitFlag(false);}
explicit HMMSegment(const string& filePath)
{
_setInitFlag(init(filePath));
}
virtual ~HMMSegment(){}
public:
bool init(const string& filePath)
{
if(_getInitFlag())
{
LogError("inited already.");
return false;
}
memset(_startProb, 0, sizeof(_startProb));
memset(_transProb, 0, sizeof(_transProb));
_statMap[0] = 'B';
_statMap[1] = 'E';
_statMap[2] = 'M';
_statMap[3] = 'S';
_emitProbVec.push_back(&_emitProbB);
_emitProbVec.push_back(&_emitProbE);
_emitProbVec.push_back(&_emitProbM);
_emitProbVec.push_back(&_emitProbS);
if(!_setInitFlag(_loadModel(filePath.c_str())))
{
LogError("_loadModel(%s) failed.", filePath.c_str());
return false;
}
LogInfo("HMMSegment init(%s) ok.", filePath.c_str());
return true;
}
public:
using SegmentBase::cut;
bool cut(Unicode::const_iterator begin, Unicode::const_iterator end, vector<Unicode>& res)const
{
if(!_getInitFlag())
{
LogError("not inited.");
return false;
}
vector<size_t> status;
if(!_viterbi(begin, end, status))
{
LogError("_viterbi failed.");
return false;
}
Unicode::const_iterator left = begin;
Unicode::const_iterator right;
for(size_t i =0; i< status.size(); i++)
{
if(status[i] % 2) //if(E == status[i] || S == status[i])
{
right = begin + i + 1;
res.push_back(Unicode(left, right));
left = right;
}
}
return true;
}
public:
virtual bool cut(Unicode::const_iterator begin, Unicode::const_iterator end, vector<string>& res)const
{
assert(_getInitFlag());
if(begin == end)
{
return false;
}
vector<Unicode> words;
if(!cut(begin, end, words))
{
return false;
}
string tmp;
for(size_t i = 0; i < words.size(); i++)
{
if(TransCode::encode(words[i], tmp))
{
res.push_back(tmp);
}
}
return true;
}
private:
bool _viterbi(Unicode::const_iterator begin, Unicode::const_iterator end, vector<size_t>& status)const
{
if(begin == end)
{
return false;
}
size_t Y = STATUS_SUM;
size_t X = end - begin;
size_t XYSize = X * Y;
size_t now, old, stat;
double tmp, endE, endS;
vector<int> path(XYSize);
vector<double> weight(XYSize);
//start
for(size_t y = 0; y < Y; y++)
{
weight[0 + y * X] = _startProb[y] + _getEmitProb(_emitProbVec[y], *begin, MIN_DOUBLE);
path[0 + y * X] = -1;
}
double emitProb;
for(size_t x = 1; x < X; x++)
{
for(size_t y = 0; y < Y; y++)
{
now = x + y*X;
weight[now] = MIN_DOUBLE;
path[now] = E; // warning
emitProb = _getEmitProb(_emitProbVec[y], *(begin+x), MIN_DOUBLE);
for(size_t preY = 0; preY < Y; preY++)
{
old = x - 1 + preY * X;
tmp = weight[old] + _transProb[preY][y] + emitProb;
if(tmp > weight[now])
{
weight[now] = tmp;
path[now] = preY;
}
}
}
}
endE = weight[X-1+E*X];
endS = weight[X-1+S*X];
stat = 0;
if(endE > endS)
{
stat = E;
}
else
{
stat = S;
}
status.assign(X, 0);
for(int x = X -1 ; x >= 0; x--)
{
status[x] = stat;
stat = path[x + stat*X];
}
return true;
}
bool _loadModel(const char* const filePath)
{
LogDebug("loadModel [%s] start ...", filePath);
ifstream ifile(filePath);
string line;
vector<string> tmp;
vector<string> tmp2;
//load _startProb
if(!_getLine(ifile, line))
{
return false;
}
split(line, tmp, " ");
if(tmp.size() != STATUS_SUM)
{
LogError("start_p illegal");
return false;
}
for(size_t j = 0; j< tmp.size(); j++)
{
_startProb[j] = atof(tmp[j].c_str());
//cout<<_startProb[j]<<endl;
}
//load _transProb
for(size_t i = 0; i < STATUS_SUM; i++)
{
if(!_getLine(ifile, line))
{
return false;
}
split(line, tmp, " ");
if(tmp.size() != STATUS_SUM)
{
LogError("trans_p illegal");
return false;
}
for(size_t j =0; j < STATUS_SUM; j++)
{
_transProb[i][j] = atof(tmp[j].c_str());
//cout<<_transProb[i][j]<<endl;
}
}
//load _emitProbB
if(!_getLine(ifile, line) || !_loadEmitProb(line, _emitProbB))
{
return false;
}
//load _emitProbE
if(!_getLine(ifile, line) || !_loadEmitProb(line, _emitProbE))
{
return false;
}
//load _emitProbM
if(!_getLine(ifile, line) || !_loadEmitProb(line, _emitProbM))
{
return false;
}
//load _emitProbS
if(!_getLine(ifile, line) || !_loadEmitProb(line, _emitProbS))
{
return false;
}
LogDebug("loadModel [%s] end.", filePath);
return true;
}
bool _getLine(ifstream& ifile, string& line)
{
while(getline(ifile, line))
{
trim(line);
if(line.empty())
{
continue;
}
if(startsWith(line, "#"))
{
continue;
}
return true;
}
return false;
}
bool _loadEmitProb(const string& line, EmitProbMap& mp)
{
if(line.empty())
{
return false;
}
vector<string> tmp, tmp2;
uint16_t unico = 0;
split(line, tmp, ",");
for(size_t i = 0; i < tmp.size(); i++)
{
split(tmp[i], tmp2, ":");
if(2 != tmp2.size())
{
LogError("_emitProb illegal.");
return false;
}
if(!_decodeOne(tmp2[0], unico))
{
LogError("TransCode failed.");
return false;
}
mp[unico] = atof(tmp2[1].c_str());
}
return true;
}
bool _decodeOne(const string& str, uint16_t& res)
{
Unicode ui16;
if(!TransCode::decode(str, ui16) || ui16.size() != 1)
{
return false;
}
res = ui16[0];
return true;
}
double _getEmitProb(const EmitProbMap* ptMp, uint16_t key, double defVal)const
{
EmitProbMap::const_iterator cit = ptMp->find(key);
if(cit == ptMp->end())
{
return defVal;
}
return cit->second;
}
};
}
#endif
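
The `_viterbi` routine above fills a weight table position by position and then backtracks through the stored predecessors. A minimal generic sketch of that dynamic program follows. The names `viterbiDecode`, `Row` and `Matrix` are hypothetical, all scores are assumed to be log-probabilities, and, unlike the deleted code (which only compares the E and S end states), this version picks the best final state among all states.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<double> Row;
typedef std::vector<Row> Matrix;

// Generic Viterbi decoding in log space (illustrative, not the deleted class).
//   startLog[y]    : log-probability that the first state is y
//   transLog[p][y] : log-probability of moving from state p to state y
//   emitLog[x][y]  : log-probability of the observation at position x in state y
// Returns the highest-scoring state sequence, or an empty vector for empty input.
std::vector<std::size_t> viterbiDecode(const Row& startLog,
                                       const Matrix& transLog,
                                       const Matrix& emitLog)
{
    const std::size_t X = emitLog.size();   // number of observations
    const std::size_t Y = startLog.size();  // number of states
    std::vector<std::size_t> states;
    if (X == 0 || Y == 0) {
        return states;
    }

    Matrix weight(X, Row(Y, 0.0));
    std::vector<std::vector<std::size_t> > path(X, std::vector<std::size_t>(Y, 0));

    // Initialisation: start score plus emission score of the first observation.
    for (std::size_t y = 0; y < Y; ++y) {
        weight[0][y] = startLog[y] + emitLog[0][y];
    }

    // Recurrence: keep the best predecessor for every (position, state) cell.
    for (std::size_t x = 1; x < X; ++x) {
        for (std::size_t y = 0; y < Y; ++y) {
            std::size_t bestPrev = 0;
            double best = weight[x - 1][0] + transLog[0][y];
            for (std::size_t p = 1; p < Y; ++p) {
                const double candidate = weight[x - 1][p] + transLog[p][y];
                if (candidate > best) {
                    best = candidate;
                    bestPrev = p;
                }
            }
            weight[x][y] = best + emitLog[x][y];
            path[x][y] = bestPrev;
        }
    }

    // Termination: pick the best final state, then walk the predecessors back.
    std::size_t stat = 0;
    for (std::size_t y = 1; y < Y; ++y) {
        if (weight[X - 1][y] > weight[X - 1][stat]) {
            stat = y;
        }
    }
    states.assign(X, 0);
    for (std::size_t x = X; x-- > 0;) {
        states[x] = stat;
        stat = path[x][stat];
    }
    return states;
}
```

Using a two-dimensional table instead of the flat `x + y*X` indexing costs a few allocations but keeps the recurrence easier to read.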


@ -1,297 +0,0 @@
#ifndef HUSKY_EPOLLSERVER_H
#define HUSKY_EPOLLSERVER_H
#include <stdio.h>
#include <string.h>
#include <cassert>
#include <sys/socket.h>
#include <sys/types.h>
#include <arpa/inet.h>
#include <stdlib.h>
#include <pthread.h>
#include <errno.h>
#include <unistd.h>
#include <vector>
#include <sys/epoll.h>
#include <fcntl.h>
#include "HttpReqInfo.hpp"
namespace Husky
{
using namespace Limonp;
const char* const HTTP_FORMAT = "HTTP/1.1 200 OK\r\nConnection: close\r\nServer: HuskyServer/1.0.0\r\nContent-Type: text/json; charset=%s\r\nContent-Length: %d\r\n\r\n%s";
const char* const CHARSET_UTF8 = "UTF-8";
const char* const CLIENT_IP_K = "CLIENT_IP";
const struct linger LNG = {1, 1};
const struct timeval SOCKET_TIMEOUT = {5, 0};
class IRequestHandler
{
public:
virtual ~IRequestHandler(){};
public:
virtual bool do_GET(const HttpReqInfo& httpReq, string& res) const = 0;
virtual bool do_POST(const HttpReqInfo& httpReq, string& res) const = 0;
};
class EpollServer
{
private:
static const size_t LISTEN_QUEUE_LEN = 1024;
static const size_t RECV_BUFFER_SIZE = 1024*4;
static const int MAXEPOLLSIZE = 512;
private:
const IRequestHandler* _reqHandler;
int _host_socket;
int _epoll_fd;
bool _isShutDown;
int _epollSize;
unordered_map<int, string> _sockIpMap;
private:
bool _isInited;
bool _getInitFlag() const {return _isInited;}
bool _setInitFlag(bool flag) {return _isInited = flag;}
public:
explicit EpollServer(uint port, const IRequestHandler* pHandler): _reqHandler(pHandler), _host_socket(-1), _isShutDown(false), _epollSize(0)
{
assert(_reqHandler);
_setInitFlag(_init_epoll(port));
};
~EpollServer(){};// unfinished;
public:
operator bool() const
{
return _getInitFlag();
}
public:
bool start()
{
//int clientSock;
sockaddr_in clientaddr;
socklen_t nSize = sizeof(clientaddr);
struct epoll_event events[MAXEPOLLSIZE];
int nfds, clientSock;
while(!_isShutDown)
{
if(-1 == (nfds = epoll_wait(_epoll_fd, events, MAXEPOLLSIZE, -1))) // maxevents must not exceed the size of events[]
{
LogFatal(strerror(errno));
return false;
}
//LogDebug("epoll_wait return event sum[%d]", nfds);
for(int i = 0; i < nfds; i++)
{
if(events[i].data.fd == _host_socket) /*new connect coming.*/
{
if(-1 == (clientSock = accept(_host_socket, (struct sockaddr*) &clientaddr, &nSize)))
{
LogError(strerror(errno));
continue;
}
if(!_epoll_add(clientSock, EPOLLIN | EPOLLET))
{
LogError("_epoll_add(%d, EPOLLIN | EPOLLET)", clientSock);
_closesocket(clientSock);
continue;
}
//LogInfo("connecting from: %d:%d client socket: %d\n", inet_ntoa(clientaddr.sin_addr), ntohs(clientaddr.sin_port), clientSock);
/* inet_ntoa is not thread-safe in some versions */
//_sockIpMap[clientSock] = inet_ntoa(clientaddr.sin_addr);
}
else /*client socket data to be received*/
{
_response(events[i].data.fd);
/* closing the socket causes it to be removed from epoll automatically */
_closesocket(events[i].data.fd);
}
}
}
return true;
}
void stop()
{
_isShutDown = true;
if(-1 == close(_host_socket))
{
LogError(strerror(errno));
return;
}
int sockfd;
struct sockaddr_in dest;
if((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
{
LogError(strerror(errno));
return;
}
bzero(&dest, sizeof(dest));
dest.sin_family = AF_INET;
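// FIXME: _host_socket is a file descriptor, not the listening port; the port passed to _init_epoll should be stored and used here.
dest.sin_port = htons(_host_socket);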
if(0 == inet_aton("127.0.0.1", (struct in_addr *) &dest.sin_addr.s_addr))
{
LogError(strerror(errno));
return;
}
if(connect(sockfd, (struct sockaddr *) &dest, sizeof(dest)) < 0)
{
LogError(strerror(errno));
}
_closesocket(sockfd);
}
private:
bool _epoll_add(int sockfd, uint32_t events)
{
if (!_setNonBLock(sockfd))
{
LogError(strerror(errno));
return false;
}
struct epoll_event ev;
ev.data.fd = sockfd;
ev.events = events;
if(epoll_ctl(_epoll_fd, EPOLL_CTL_ADD, sockfd, &ev) < 0)
{
LogError("insert socket '%d' into epoll failed: %s", sockfd, strerror(errno));
return false;
}
_epollSize ++;
return true;
}
bool _response(int sockfd) const
{
if(-1 == setsockopt(sockfd, SOL_SOCKET, SO_LINGER, (const char*)&LNG, sizeof(LNG)))
{
LogError(strerror(errno));
return false;
}
if(-1 == setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, (const char*)&SOCKET_TIMEOUT, sizeof(SOCKET_TIMEOUT)))
{
LogError(strerror(errno));
return false;
}
if(-1 == setsockopt(sockfd, SOL_SOCKET, SO_SNDTIMEO, (const char*)&SOCKET_TIMEOUT, sizeof(SOCKET_TIMEOUT)))
{
LogError(strerror(errno));
return false;
}
string strRec, strSnd, strRetByHandler;
char recvBuf[RECV_BUFFER_SIZE];
int nRetCode = -1;
while(true)
{
memset(recvBuf, 0, sizeof(recvBuf));
nRetCode = recv(sockfd, recvBuf, sizeof(recvBuf) - 1, 0);
if(-1 == nRetCode)
{
LogDebug(strerror(errno));
return false;
}
if(0 == nRetCode)
{
LogDebug("client socket orderly shut down");
return false;
}
strRec += recvBuf;
if(nRetCode != sizeof(recvBuf) - 1)
{
break;
}
}
HttpReqInfo httpReq(strRec);
if("GET" == httpReq.getMethod() && !_reqHandler->do_GET(httpReq, strRetByHandler))
{
LogError("do_GET failed.");
return false;
}
if("POST" == httpReq.getMethod() && !_reqHandler->do_POST(httpReq, strRetByHandler))
{
LogError("do_POST failed.");
return false;
}
string_format(strSnd, HTTP_FORMAT, CHARSET_UTF8, (int)strRetByHandler.length(), strRetByHandler.c_str()); // %d expects an int, not a size_t
if(-1 == send(sockfd, strSnd.c_str(), strSnd.length(), 0))
{
LogError(strerror(errno));
return false;
}
LogInfo("{response:%s, epollsize:%d}", strRetByHandler.c_str(), _epollSize);
return true;
}
bool _init_epoll(uint port)
{
_host_socket = socket(AF_INET, SOCK_STREAM, 0);
if(-1 == _host_socket)
{
LogError(strerror(errno));
return false;
}
int nRet = 1;
if(-1 == setsockopt(_host_socket, SOL_SOCKET, SO_REUSEADDR, (char*)&nRet, sizeof(nRet)))
{
LogError(strerror(errno));
return false;
}
struct sockaddr_in addrSock;
addrSock.sin_family = AF_INET;
addrSock.sin_port = htons(port);
addrSock.sin_addr.s_addr = htonl(INADDR_ANY);
if(-1 == ::bind(_host_socket, (sockaddr*)&addrSock, sizeof(sockaddr)))
{
LogError(strerror(errno));
_closesocket(_host_socket);
return false;
}
if(-1 == listen(_host_socket, LISTEN_QUEUE_LEN))
{
LogError(strerror(errno));
return false;
}
if(-1 == (_epoll_fd = epoll_create(MAXEPOLLSIZE)))
{
LogError(strerror(errno));
return false;
}
if(!_epoll_add(_host_socket, EPOLLIN))
{
LogError("_epoll_add(%d, EPOLLIN) failed.", _host_socket);
return false;
}
LogInfo("create socket listening port[%u], epoll{size:%d} init ok", port, _epollSize);
return true;
}
void _closesocket(int sockfd)
{
if(-1 == close(sockfd))
{
LogError(strerror(errno));
return;
}
_epollSize--;
}
static bool _setNonBLock(int sockfd)
{
return -1 != fcntl(sockfd, F_SETFL, fcntl(sockfd, F_GETFL, 0)|O_NONBLOCK); // read the status flags with F_GETFL, not the descriptor flags (F_GETFD)
}
};
}
#endif
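
Edge-triggered registration (`EPOLLET`), as used for client sockets above, is normally paired with non-blocking descriptors. A minimal sketch of switching a descriptor to non-blocking mode follows, assuming a POSIX system. The helper names `setNonBlocking` and `isNonBlocking` are hypothetical and not part of the deleted header.

```cpp
#include <fcntl.h>

// Adds O_NONBLOCK to the file status flags of fd.
// The current flags are read with F_GETFL so that existing flags are preserved.
bool setNonBlocking(int fd)
{
    const int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return false;  // e.g. EBADF for an invalid descriptor
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Reports whether O_NONBLOCK is currently set on fd.
bool isNonBlocking(int fd)
{
    const int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && (flags & O_NONBLOCK) != 0;
}
```

Checking the result of the first `fcntl` call separately avoids OR-ing `-1` into the new flag set when the descriptor is invalid.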


@ -1,233 +0,0 @@
#ifndef HUSKY_HTTP_REQINFO_H
#define HUSKY_HTTP_REQINFO_H
#include <iostream>
#include <string>
#include "Limonp/logger.hpp"
#include "Limonp/str_functs.hpp"
namespace Husky
{
using namespace Limonp;
using namespace std;
static const char* const KEY_METHOD = "METHOD";
static const char* const KEY_PATH = "PATH";
static const char* const KEY_PROTOCOL = "PROTOCOL";
typedef unsigned char BYTE;
inline BYTE toHex(BYTE x)
{
return x > 9 ? x -10 + 'A': x + '0';
}
inline BYTE fromHex(BYTE x)
{
return isdigit(x) ? x-'0' : toupper(x)-'A'+10; // accept lowercase hex digits as well
}
inline void URLEncode(const string &sIn, string& sOut)
{
for( size_t ix = 0; ix < sIn.size(); ix++ )
{
BYTE buf[4];
memset( buf, 0, 4 );
if( isalnum( (BYTE)sIn[ix] ) )
{
buf[0] = sIn[ix];
}
//else if ( isspace( (BYTE)sIn[ix] ) ) // a space can apparently be encoded as either %20 or '+'
//{
// buf[0] = '+';
//}
else
{
buf[0] = '%';
buf[1] = toHex( (BYTE)sIn[ix] >> 4 );
buf[2] = toHex( (BYTE)sIn[ix] % 16);
}
sOut += (char *)buf;
}
}
inline void URLDecode(const string &sIn, string& sOut)
{
for( size_t ix = 0; ix < sIn.size(); ix++ )
{
BYTE ch = 0;
if(sIn[ix]=='%' && ix + 2 < sIn.size()) // only decode when two more characters follow
{
ch = (fromHex(sIn[ix+1])<<4);
ch |= fromHex(sIn[ix+2]);
ix += 2;
}
else if(sIn[ix] == '+')
{
ch = ' ';
}
else
{
ch = sIn[ix];
}
sOut += (char)ch;
}
}
class HttpReqInfo
{
public:
HttpReqInfo(const string& headerStr)
{
size_t lpos = 0, rpos = 0;
vector<string> buf;
rpos = headerStr.find("\n", lpos);
if(string::npos == rpos)
{
LogError("headerStr illegal.");
return;
}
string firstline(headerStr, lpos, rpos - lpos);
trim(firstline);
if(!split(firstline, buf, " ") || 3 != buf.size())
{
LogError("parse header first line failed.");
return;
}
_headerMap[KEY_METHOD] = trim(buf[0]);
_headerMap[KEY_PATH] = trim(buf[1]);
_headerMap[KEY_PROTOCOL] = trim(buf[2]);
//first request line end
//parse path to _methodGetMap
if("GET" == _headerMap[KEY_METHOD])
{
_parseUrl(firstline, _methodGetMap);
}
lpos = rpos + 1;
if(lpos >= headerStr.size())
{
LogError("headerStr illegal");
return;
}
//message header begin
while(lpos < headerStr.size() && string::npos != (rpos = headerStr.find('\n', lpos)) && rpos > lpos)
{
string s(headerStr, lpos, rpos - lpos);
size_t p = s.find(':');
if(string::npos == p)
{
break;//encounter empty line
}
string k(s, 0, p);
string v(s, p+1);
trim(k);
trim(v);
if(k.empty()||v.empty())
{
LogError("headerStr illegal.");
return;
}
upper(k);
_headerMap[k] = v;
lpos = rpos + 1;
}
//message header end
//body begin
if(string::npos != rpos) // substr(npos) would throw std::out_of_range
{
_body.assign(headerStr.substr(rpos));
trim(_body);
}
}
public:
string& operator[] (const string& key)
{
return _headerMap[key];
}
bool find(const string& key, string& res)const
{
return _find(_headerMap, key, res);
}
bool GET(const string& argKey, string& res)const
{
return _find(_methodGetMap, argKey, res);
}
//bool POST(const string& argKey, string& res)const
//{
// return _find(_methodPostMap, argKey, res);
//}
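const string& getMethod() const
{
static const string empty;
std::unordered_map<string, string>::const_iterator it = _headerMap.find(KEY_METHOD);
return it == _headerMap.end() ? empty : it->second; // avoid dereferencing end() when parsing failed
}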
const string& getBody() const
{
return _body;
}
private:
std::unordered_map<string, string> _headerMap;
std::unordered_map<string, string> _methodGetMap;
//std::unordered_map<string, string> _methodPostMap;
string _body;
//public:
friend ostream& operator<<(ostream& os, const HttpReqInfo& obj);
private:
bool _find(const std::unordered_map<string, string>& mp, const string& key, string& res)const
{
std::unordered_map<string, string>::const_iterator it = mp.find(key);
if(it == mp.end())
{
return false;
}
res = it->second;
return true;
}
private:
bool _parseUrl(const string& url, std::unordered_map<string, string>& mp)
{
if(url.empty())
{
return false;
}
size_t pos = url.find('?'); // size_t, so that the comparison with string::npos is reliable
if(string::npos == pos)
{
return false;
}
size_t kleft = 0, kright = 0;
size_t vleft = 0, vright = 0;
for(size_t i = pos + 1; i < url.size();)
{
kleft = i;
while(i < url.size() && url[i] != '=')
{
i++;
}
if(i >= url.size())
{
break;
}
kright = i;
i++;
vleft = i;
while(i < url.size() && url[i] != '&' && url[i] != ' ')
{
i++;
}
vright = i;
mp[url.substr(kleft, kright - kleft)] = url.substr(vleft, vright - vleft);
i++;
}
return true;
}
};
inline std::ostream& operator << (std::ostream& os, const Husky::HttpReqInfo& obj)
{
return os << obj._headerMap << obj._methodGetMap/* << obj._methodPostMap*/ << obj._body;
}
}
#endif
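
Percent-decoding, as done by `URLDecode` above, reads two characters after every `%`, so truncated or malformed escapes need an explicit check. A minimal sketch under that assumption follows. The names `urlDecode` and `hexValue` are hypothetical, and invalid escapes are copied through unchanged, which is one possible policy rather than the behaviour of the deleted code.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns the value of a hexadecimal digit (either case), or -1 if c is not one.
static int hexValue(unsigned char c)
{
    if (std::isdigit(c)) {
        return c - '0';
    }
    if (std::isxdigit(c)) {
        return std::toupper(c) - 'A' + 10;
    }
    return -1;
}

// Decodes %XX escapes and '+' (as space). Malformed or truncated escapes
// are kept literally instead of reading past the end of the input.
std::string urlDecode(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            const int hi = hexValue(static_cast<unsigned char>(in[i + 1]));
            const int lo = hexValue(static_cast<unsigned char>(in[i + 2]));
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;  // skip the two hex digits just consumed
                continue;
            }
        }
        out += (in[i] == '+') ? ' ' : in[i];
    }
    return out;
}
```

For example, `urlDecode("a%20b+c")` yields `a b c`, while a trailing lone `%` is left as it is.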


@ -1,17 +0,0 @@
#ifndef CPPJIEBA_SEGMENTINTERFACE_H
#define CPPJIEBA_SEGMENTINTERFACE_H
namespace CppJieba
{
class ISegment
{
public:
virtual ~ISegment(){};
public:
virtual bool cut(Unicode::const_iterator begin , Unicode::const_iterator end, vector<string>& res) const = 0;
virtual bool cut(const string& str, vector<string>& res) const = 0;
};
}
#endif


@ -1,184 +0,0 @@
#ifndef CPPJIEBA_KEYWORD_EXTRACTOR_H
#define CPPJIEBA_KEYWORD_EXTRACTOR_H
#include "MixSegment.hpp"
#include <cmath>
#include <set>
#define MIN(X,Y) ((X) < (Y) ? (X) : (Y))
namespace CppJieba
{
using namespace Limonp;
/*utf8*/
class KeywordExtractor: public InitOnOff
{
private:
MixSegment _segment;
private:
unordered_map<string, double> _idfMap;
double _idfAverage;
unordered_set<string> _stopWords;
public:
KeywordExtractor(){_setInitFlag(false);};
explicit KeywordExtractor(const string& dictPath, const string& hmmFilePath, const string& idfPath, const string& stopWordPath)
{
_setInitFlag(init(dictPath, hmmFilePath, idfPath, stopWordPath));
};
~KeywordExtractor(){};
public:
bool init(const string& dictPath, const string& hmmFilePath, const string& idfPath, const string& stopWordPath)
{
_loadIdfDict(idfPath);
_loadStopWordDict(stopWordPath);
return _setInitFlag(_segment.init(dictPath, hmmFilePath));
};
public:
bool extract(const string& str, vector<string>& keywords, size_t topN) const
{
assert(_getInitFlag());
vector<pair<string, double> > topWords;
if(!extract(str, topWords, topN))
{
return false;
}
for(size_t i = 0; i < topWords.size(); i++)
{
keywords.push_back(topWords[i].first);
}
return true;
}
bool extract(const string& str, vector<pair<string, double> >& keywords, size_t topN) const
{
vector<string> words;
if(!_segment.cut(str, words))
{
LogError("segment cut(%s) failed.", str.c_str());
return false;
}
// filtering single word.
for(vector<string>::iterator iter = words.begin(); iter != words.end(); )
{
if(_isSingleWord(*iter))
{
iter = words.erase(iter);
}
else
{
iter++;
}
}
map<string, double> wordmap;
for(size_t i = 0; i < words.size(); i ++)
{
wordmap[ words[i] ] += 1.0;
}
for(map<string, double>::iterator itr = wordmap.begin(); itr != wordmap.end(); )
{
if(_stopWords.end() != _stopWords.find(itr->first))
{
wordmap.erase(itr++);
continue;
}
unordered_map<string, double>::const_iterator cit = _idfMap.find(itr->first);
if(cit != _idfMap.end())
{
itr->second *= cit->second;
}
else
{
itr->second *= _idfAverage;
}
itr ++;
}
keywords.clear();
std::copy(wordmap.begin(), wordmap.end(), std::inserter(keywords, keywords.begin()));
topN = MIN(topN, keywords.size());
partial_sort(keywords.begin(), keywords.begin() + topN, keywords.end(), _cmp);
keywords.resize(topN);
return true;
}
private:
void _loadIdfDict(const string& idfPath)
{
ifstream ifs(idfPath.c_str());
if(!ifs)
{
LogError("open %s failed.", idfPath.c_str());
assert(false);
}
string line ;
vector<string> buf;
double idf = 0.0;
double idfSum = 0.0;
size_t lineno = 0;
for(;getline(ifs, line); lineno++)
{
buf.clear();
if(line.empty())
{
LogError("line[%lu] empty. skipped.", (unsigned long)lineno);
continue;
}
if(!split(line, buf, " ") || buf.size() != 2)
{
LogError("line %lu [%s] illegal. skipped.", (unsigned long)lineno, line.c_str());
continue;
}
idf = atof(buf[1].c_str());
_idfMap[buf[0]] = idf;
idfSum += idf;
}
assert(lineno);
_idfAverage = idfSum / lineno;
assert(_idfAverage > 0.0);
}
void _loadStopWordDict(const string& filePath)
{
ifstream ifs(filePath.c_str());
if(!ifs)
{
LogError("open %s failed.", filePath.c_str());
assert(false);
}
string line ;
while(getline(ifs, line))
{
_stopWords.insert(line);
}
assert(_stopWords.size());
}
private:
bool _isSingleWord(const string& str) const
{
Unicode unicode;
TransCode::decode(str, unicode);
if(unicode.size() == 1)
return true;
return false;
}
private:
static bool _cmp(const pair<string, double>& lhs, const pair<string, double>& rhs)
{
return lhs.second > rhs.second;
}
};
}
#endif
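
The `extract` method above scores each word as term frequency times IDF (falling back to the average IDF for unknown words) and then keeps the top N with `partial_sort`. A minimal sketch of just that scoring step follows, assuming the words are already segmented and filtered. The names `topKeywords`, `ScoredWord` and `higherScore` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, double> ScoredWord;

// Orders candidates by descending score.
static bool higherScore(const ScoredWord& lhs, const ScoredWord& rhs)
{
    return lhs.second > rhs.second;
}

// Scores each distinct word as (occurrence count) * IDF, using idfAverage
// for words missing from the IDF table, and returns the topN best candidates.
std::vector<ScoredWord> topKeywords(const std::vector<std::string>& words,
                                    const std::map<std::string, double>& idf,
                                    double idfAverage,
                                    std::size_t topN)
{
    std::map<std::string, double> score;
    for (std::size_t i = 0; i < words.size(); ++i) {
        score[words[i]] += 1.0;  // term frequency
    }
    for (std::map<std::string, double>::iterator it = score.begin(); it != score.end(); ++it) {
        std::map<std::string, double>::const_iterator found = idf.find(it->first);
        it->second *= (found != idf.end()) ? found->second : idfAverage;
    }

    std::vector<ScoredWord> ranked(score.begin(), score.end());
    topN = std::min(topN, ranked.size());
    // Only the first topN elements need to be in order.
    std::partial_sort(ranked.begin(), ranked.begin() + topN, ranked.end(), higherScore);
    ranked.resize(topN);
    return ranked;
}
```

`std::min` replaces the `MIN` macro here, which avoids evaluating an argument twice.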


@ -1,89 +0,0 @@
/************************************
* file enc : ascii
* author : wuyanyi09@gmail.com
************************************/
#ifndef LIMONP_ARGV_FUNCTS_H
#define LIMONP_ARGV_FUNCTS_H
#include <set>
#include <sstream>
#include "str_functs.hpp"
namespace Limonp
{
using namespace std;
class ArgvContext
{
public :
ArgvContext(int argc, const char* const * argv)
{
for(int i = 0; i < argc; i++)
{
if(startsWith(argv[i], "-"))
{
if(i + 1 < argc && !startsWith(argv[i + 1], "-"))
{
_mpss[argv[i]] = argv[i+1];
i++;
}
else
{
_sset.insert(argv[i]);
}
}
else
{
_args.push_back(argv[i]);
}
}
}
~ArgvContext(){};
public:
friend ostream& operator << (ostream& os, const ArgvContext& args);
string operator [](uint i) const
{
if(i < _args.size())
{
return _args[i];
}
return "";
}
string operator [](const string& key) const
{
map<string, string>::const_iterator it = _mpss.find(key);
if(it != _mpss.end())
{
return it->second;
}
return "";
}
public:
bool hasKey(const string& key) const
{
if(_mpss.find(key) != _mpss.end() || _sset.find(key) != _sset.end())
{
return true;
}
return false;
}
private:
vector<string> _args;
map<string, string> _mpss;
set<string> _sset;
};
inline ostream& operator << (ostream& os, const ArgvContext& args)
{
return os<<args._args<<args._mpss<<args._sset;
}
//string toString()
//{
// stringstream ss;
// return ss.str();
//}
}
#endif
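
`ArgvContext` above sorts arguments into three buckets: positional arguments, `-key value` pairs, and bare switches. A minimal sketch of the same classification rule follows. `ParsedArgs`, `parseArgs` and `isOption` are hypothetical names. Note the consequence of the rule: a switch that is directly followed by a positional argument takes that argument as its value.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

struct ParsedArgs {
    std::vector<std::string> positional;         // arguments without a leading '-'
    std::map<std::string, std::string> options;  // "-key value" pairs
    std::set<std::string> flags;                 // "-switch" with no value
};

static bool isOption(const char* s)
{
    return s[0] == '-';
}

// Classifies argv entries: an option followed by a non-option becomes a
// key/value pair, any other option becomes a flag, the rest are positional.
ParsedArgs parseArgs(int argc, const char* const* argv)
{
    ParsedArgs out;
    for (int i = 0; i < argc; ++i) {
        if (!isOption(argv[i])) {
            out.positional.push_back(argv[i]);
        } else if (i + 1 < argc && !isOption(argv[i + 1])) {
            out.options[argv[i]] = argv[i + 1];
            ++i;  // the value has been consumed
        } else {
            out.flags.insert(argv[i]);
        }
    }
    return out;
}
```

With the arguments `prog -n 5 input.txt --verbose`, `-n` maps to `5`, `--verbose` is a flag, and `prog` and `input.txt` are positional.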


@ -1,93 +0,0 @@
/************************************
* file enc : utf8
* author : wuyanyi09@gmail.com
************************************/
#ifndef LIMONP_CONFIG_H
#define LIMONP_CONFIG_H
#include <map>
#include <fstream>
#include <iostream>
#include "logger.hpp"
#include "str_functs.hpp"
namespace Limonp
{
using namespace std;
class Config
{
public:
Config(const char * const filePath)
{
_loadFile(filePath);
}
public:
operator bool ()
{
return !_map.empty();
}
private:
bool _loadFile(const char * const filePath)
{
ifstream ifs(filePath);
if(!ifs)
{
LogFatal("open file[%s] failed.", filePath);
return false;
}
string line;
vector<string> vecBuf;
uint lineno = 0;
while(getline(ifs, line))
{
lineno ++;
trim(line);
if(line.empty() || startsWith(line, "#"))
{
continue;
}
vecBuf.clear();
if(!split(line, vecBuf, "=") || 2 != vecBuf.size())
{
LogFatal("line[%d:%s] is illegal.", lineno, line.c_str());
return false;
}
string& key = vecBuf[0];
string& value = vecBuf[1];
trim(key);
trim(value);
if(_map.end() != _map.find(key))
{
LogFatal("key[%s] already exists.", key.c_str());
return false;
}
_map[key] = value;
}
ifs.close();
return true;
}
public:
bool get(const string& key, string& value) const
{
map<string, string>::const_iterator it = _map.find(key);
if(_map.end() != it)
{
value = it->second;
return true;
}
return false;
}
private:
map<string, string> _map;
private:
friend ostream& operator << (ostream& os, const Config& config);
};
inline ostream& operator << (ostream& os, const Config& config)
{
return os << config._map;
}
}
#endif
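
The `Config` loader above reads `key=value` lines, skips blank lines and `#` comments, and rejects duplicate keys. A minimal sketch of that parsing loop over an in-memory string follows. `parseConfigText` and `trimCopy` are hypothetical names, and, as an assumption that differs from the deleted code, the line is split at the first `=` only, so values may themselves contain `=`.

```cpp
#include <map>
#include <sstream>
#include <string>

// Returns s without leading and trailing whitespace.
static std::string trimCopy(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) {
        return std::string();
    }
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Parses "key = value" lines into out. Blank lines and lines starting with
// '#' are skipped. Returns false on a line without '=', an empty key, or a
// duplicate key.
bool parseConfigText(const std::string& text, std::map<std::string, std::string>& out)
{
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        line = trimCopy(line);
        if (line.empty() || line[0] == '#') {
            continue;
        }
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) {
            return false;
        }
        const std::string key = trimCopy(line.substr(0, eq));
        const std::string value = trimCopy(line.substr(eq + 1));
        if (key.empty() || out.count(key) != 0) {
            return false;
        }
        out[key] = value;
    }
    return true;
}
```

Returning `false` on the first malformed line mirrors the fail-fast behaviour of `_loadFile`. Entries parsed before the error remain in `out`.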


@ -1,21 +0,0 @@
#ifndef LIMONP_INITONOFF_H
#define LIMONP_INITONOFF_H
namespace Limonp
{
class InitOnOff
{
public:
InitOnOff(){_setInitFlag(false);};
~InitOnOff(){};
protected:
bool _isInited;
bool _getInitFlag()const{return _isInited;};
bool _setInitFlag(bool flag){return _isInited = flag;};
public:
operator bool(){return _getInitFlag();};
};
}
#endif


@ -1,126 +0,0 @@
#ifndef LIMONP_MYSQLCLIENT_H
#define LIMONP_MYSQLCLIENT_H
#include <mysql.h>
#include <iostream>
#include <vector>
#include <string>
#include "logger.hpp"
namespace Limonp
{
using namespace std;
class MysqlClient
{
public:
typedef vector< vector<string> > RowsType;
private:
const char * const HOST;
const unsigned int PORT;
const char * const USER;
const char * const PASSWD;
const char * const DB;
const char * const CHARSET;
public:
MysqlClient(const char* host, uint port, const char* user, const char* passwd, const char* db, const char* charset = "utf8"): HOST(host), PORT(port), USER(user), PASSWD(passwd), DB(db), CHARSET(charset){ _conn = NULL;};
~MysqlClient(){dispose();};
public:
bool init()
{
//cout<<mysql_get_client_info()<<endl;
if(NULL == (_conn = mysql_init(NULL)))
{
LogError("mysql_init failed."); // _conn is NULL here, so mysql_error(_conn) cannot be used
return false;
}
if (mysql_real_connect(_conn, HOST, USER, PASSWD, DB, PORT, NULL, 0) == NULL)
{
LogError("mysql_real_connect failed. %s", mysql_error(_conn));
mysql_close(_conn);
_conn = NULL;
return false;
}
if(mysql_set_character_set(_conn, CHARSET))
{
LogError("mysql_set_character_set [%s] failed.", CHARSET);
return false;
}
//enable automatic reconnect
char value = 1;
mysql_options(_conn, MYSQL_OPT_RECONNECT, &value);
LogInfo("MysqlClient {host: %s, port:%d, database:%s, charset:%s}", HOST, PORT, DB, CHARSET);
return true;
}
bool dispose()
{
if(NULL != _conn)
{
mysql_close(_conn);
}
_conn = NULL;
return true;
}
bool executeSql(const char* sql)
{
if(NULL == _conn)
{
LogError("_conn is NULL");
return false;
}
if(mysql_query(_conn, sql))
{
LogError("mysql_query failed. %s", mysql_error(_conn));
return false;
}
return true;
}
uint insert(const char* tb_name, const char* keys, const vector<string>& vals)
{
uint retn = 0;
string sql;
for(uint i = 0; i < vals.size(); i ++)
{
sql.clear();
string_format(sql, "insert into %s (%s) values %s", tb_name, keys, vals[i].c_str());
retn += executeSql(sql.c_str());
}
return retn;
}
bool select(const char* sql, RowsType& rows)
{
if(!executeSql(sql))
{
LogError("executeSql failed. [%s]", sql);
return false;
}
MYSQL_RES * result = mysql_store_result(_conn);
if(NULL == result)
{
LogError("mysql_store_result failed. [%s]", mysql_error(_conn));
return false;
}
uint num_fields = mysql_num_fields(result);
MYSQL_ROW row;
while((row = mysql_fetch_row(result)))
{
vector<string> vec;
for(uint i = 0; i < num_fields; i ++)
{
row[i] ? vec.push_back(row[i]) : vec.push_back("NULL");
}
rows.push_back(vec);
}
mysql_free_result(result);
return true;
}
private:
MYSQL * _conn;
};
}
#endif


@ -1,22 +0,0 @@
/************************************
************************************/
#ifndef LIMONP_NONCOPYABLE_H
#define LIMONP_NONCOPYABLE_H
#include <iostream>
#include <string>
namespace Limonp
{
class NonCopyable
{
protected:
NonCopyable(){};
~NonCopyable(){};
private:
NonCopyable(const NonCopyable& );
const NonCopyable& operator=(const NonCopyable& );
};
}
#endif


@ -1,87 +0,0 @@
#ifndef LIMONP_CAST_FUNCTS_H
#define LIMONP_CAST_FUNCTS_H
namespace Limonp
{
//logical and or
static const int sign_32 = 0xC0000000;
static const int exponent_32 = 0x07800000;
static const int mantissa_32 = 0x007FE000;
static const int sign_exponent_32 = 0x40000000;
static const int loss_32 = 0x38000000;
static const short sign_16 = (short)0xC000;
static const short exponent_16 = (short)0x3C00;
static const short mantissa_16 = (short)0x03FF;
static const short sign_exponent_16 = (short)0x4000;
static const int exponent_fill_32 = 0x38000000;
//infinite
static const short infinite_16 = (short) 0x7FFF;
static const short infinitesmall_16 = (short) 0x0000;
inline float intBitsToFloat(unsigned int x)
{
union
{
float f;
int i;
}u;
u.i = x;
return u.f;
}
inline int floatToIntBits(float f)
{
union
{
float f;
int i ;
}u;
u.f = f;
return u.i;
}
inline short floatToShortBits(float f)
{
int fi = floatToIntBits(f);
// Extract the key fields
short sign = (short) ((unsigned int)(fi & sign_32) >> 16);
short exponent = (short) ((unsigned int)(fi & exponent_32) >> 13);
short mantissa = (short) ((unsigned int)(fi & mantissa_32) >> 13);
// Build the encoded result
short code = (short) (sign | exponent | mantissa);
// Handle infinitely large and infinitesimally small values
if ((fi & loss_32) > 0 && (fi & sign_exponent_32) > 0) {
// When the exponent sign bit is 1 (positive exponent) and bits 2-4 from the left are set, return the infinity code
return (short) (code | infinite_16);
}
if (((fi & loss_32) ^ loss_32) > 0 && (fi & sign_exponent_32) == 0) {
// When the exponent sign bit is 0 (negative exponent) and bits 2-4 from the left are 0 (XOR with 111 is > 0), return the infinitesimal code
return infinitesmall_16;
}
return code;
}
inline float shortBitsToFloat(short s)
{
/*
* 31001 0(13)
*/
int sign = ((int) (s & sign_16)) << 16;
int exponent = ((int) (s & exponent_16)) << 13;
// When the exponent sign bit is 0, fill bits 2-4 with 1s
if ((s & sign_exponent_16) == 0 && s != 0) {
exponent |= exponent_fill_32;
}
int mantissa = ((int) (s & mantissa_16)) << 13;
// Build the decoded result
int code = sign | exponent | mantissa;
return intBitsToFloat(code);
}
}
#endif
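
`intBitsToFloat` and `floatToIntBits` above reinterpret a float's bit pattern through a union. A minimal alternative sketch using `std::memcpy`, which is well-defined in standard C++, follows. It assumes a 32-bit IEEE-754 `float`, and the names `floatToBits` and `bitsToFloat` are hypothetical.

```cpp
#include <cstring>
#include <stdint.h>

// Compile-time check of the assumption that float is 32 bits wide.
typedef char float_must_be_32_bits[sizeof(float) == sizeof(uint32_t) ? 1 : -1];

// Returns the raw bit pattern of f.
uint32_t floatToBits(float f)
{
    uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

// Returns the float whose raw bit pattern is bits.
float bitsToFloat(uint32_t bits)
{
    float f = 0.0f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
```

Compilers typically turn these fixed-size copies into a single register move, so this form usually costs nothing compared with the union.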


@ -1,82 +0,0 @@
/************************************
* file enc : utf8
* author : wuyanyi09@gmail.com
************************************/
#ifndef LIMONP_IO_FUNCTS_H
#define LIMONP_IO_FUNCTS_H
#include <fstream>
#include <iostream>
#include <stdlib.h>
namespace Limonp
{
using namespace std;
inline bool loadFile2Str(const char * const filepath, string& res)
{
ifstream in(filepath);
if(!in)
{
return false;
}
istreambuf_iterator<char> beg(in), end;
res.assign(beg, end);
in.close();
return true;
}
inline void loadStr2File(const char * const filename, ios_base::openmode mode, const string& str)
{
ofstream out(filename, mode);
ostreambuf_iterator<char> itr (out);
copy(str.begin(), str.end(), itr);
out.close();
}
inline int ReadFromFile(const char * fileName, char* buf, int maxCount, const char* mode)
{
FILE* fp = fopen(fileName, mode);
if (!fp)
return 0;
int ret = fgets(buf, maxCount, fp) ? 1 : 0;
fclose(fp);
return ret;
}
inline int WriteStr2File(const char* fileName, const char* buf, const char* mode)
{
FILE* fp = fopen(fileName, mode);
if (!fp)
return 0;
int n = fprintf(fp, "%s", buf);
fclose(fp);
return n;
}
inline bool checkFileExist(const string& filePath)
{
fstream _file;
_file.open(filePath.c_str(), ios::in);
if(_file)
return true;
return false;
}
inline bool createDir(const string& dirPath, bool p = true)
{
string dir_str(dirPath);
string cmd = "mkdir";
if(p)
{
cmd += " -p";
}
cmd += " " + dir_str;
int res = system(cmd.c_str());
return 0 == res; // system() returns 0 when the command succeeds
}
inline bool checkDirExist(const string& dirPath)
{
return checkFileExist(dirPath);
}
}
#endif


@ -1,73 +0,0 @@
/************************************
* file enc : utf8
* author : wuyanyi09@gmail.com
************************************/
#ifndef LIMONP_LOGGER_H
#define LIMONP_LOGGER_H
#include <vector>
#include <iostream>
#include <fstream>
#include <string>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <stdarg.h>
#include <cassert>
#define FILE_BASENAME strrchr(__FILE__, '/') ? strrchr(__FILE__, '/') + 1 : __FILE__
#define LogDebug(fmt, ...) Limonp::Logger::LoggingF(Limonp::LL_DEBUG, FILE_BASENAME, __LINE__, fmt, ## __VA_ARGS__)
#define LogInfo(fmt, ...) Limonp::Logger::LoggingF(Limonp::LL_INFO, FILE_BASENAME, __LINE__, fmt, ## __VA_ARGS__)
#define LogWarn(fmt, ...) Limonp::Logger::LoggingF(Limonp::LL_WARN, FILE_BASENAME, __LINE__, fmt, ## __VA_ARGS__)
#define LogError(fmt, ...) Limonp::Logger::LoggingF(Limonp::LL_ERROR, FILE_BASENAME, __LINE__, fmt, ## __VA_ARGS__)
#define LogFatal(fmt, ...) Limonp::Logger::LoggingF(Limonp::LL_FATAL, FILE_BASENAME, __LINE__, fmt, ## __VA_ARGS__)
namespace Limonp
{
using namespace std;
enum {LL_DEBUG = 0, LL_INFO = 1, LL_WARN = 2, LL_ERROR = 3, LL_FATAL = 4, LEVEL_ARRAY_SIZE = 5, CSTR_BUFFER_SIZE = 32};
static const char * LOG_LEVEL_ARRAY[LEVEL_ARRAY_SIZE]= {"DEBUG","INFO","WARN","ERROR","FATAL"};
static const char * LOG_FORMAT = "%s %s:%d %s %s\n";
static const char * LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S";
class Logger
{
public:
static void Logging(size_t level, const string& msg, const char* fileName, int lineno)
{
assert(level <= LL_FATAL);
char buf[CSTR_BUFFER_SIZE];
time_t timeNow;
time(&timeNow);
strftime(buf, sizeof(buf), LOG_TIME_FORMAT, localtime(&timeNow));
fprintf(stderr, LOG_FORMAT, buf, fileName, lineno,LOG_LEVEL_ARRAY[level], msg.c_str());
}
static void LoggingF(size_t level, const char* fileName, int lineno, const char* fmt, ...) // va_start must not be applied to a reference parameter
{
#ifdef LOGGER_LEVEL
if(level < LOGGER_LEVEL) return;
#endif
int size = 256;
string msg;
va_list ap;
while (1) {
msg.resize(size);
va_start(ap, fmt);
int n = vsnprintf(&msg[0], size, fmt, ap);
va_end(ap);
if (n > -1 && n < size) {
msg.resize(n);
break;
}
if (n > -1)
size = n + 1;
else
size *= 2;
}
Logging(level, msg, fileName, lineno);
}
};
}
#endif
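
`LoggingF` above formats its message with `vsnprintf`, growing the buffer until the output fits. A minimal two-pass sketch of printf-style formatting into a `std::string` follows, assuming a C99/C++11 `vsnprintf` that returns the required length when given a zero-sized buffer. The name `formatString` is hypothetical.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Formats like printf and returns the result as a std::string.
// The format parameter is a plain pointer because va_start has undefined
// behaviour when its last named parameter is a reference.
std::string formatString(const char* fmt, ...)
{
    va_list ap;

    // First pass: ask vsnprintf how many characters are needed.
    va_start(ap, fmt);
    const int n = std::vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (n < 0) {
        return std::string();  // encoding error
    }

    // Second pass: format into a buffer of exactly the right size.
    std::vector<char> buf(static_cast<std::size_t>(n) + 1);
    va_start(ap, fmt);
    std::vsnprintf(&buf[0], buf.size(), fmt, ap);
    va_end(ap);
    return std::string(&buf[0], static_cast<std::size_t>(n));
}
```

Measuring first removes the retry loop at the cost of formatting twice. Either approach handles messages longer than the initial buffer.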


@ -1,22 +0,0 @@
#ifndef LIMONP_MACRO_DEF_H
#define LIMONP_MACRO_DEF_H
#define XX_GET_SET(varType, varName, funName)\
private: varType varName;\
public: inline varType get##funName(void) const {return varName;}\
public: inline void set##funName(varType var) {varName = var;}
#define XX_GET(varType, varName, funName)\
private: varType varName;\
public: inline varType get##funName(void) const {return varName;}
#define XX_SET(varType, varName, funName)\
private: varType varName;\
public: inline void set##funName(varType var) {varName = var;}
#define XX_GET_SET_BY_REF(varType, varName, funName)\
private: varType varName;\
public: inline const varType& get##funName(void) const {return varName;}\
public: inline void set##funName(const varType& var){varName = var;}
#endif


@ -1,435 +0,0 @@
#ifndef __MD5_H__
#define __MD5_H__
// Copyright (C) 1991-2, RSA Data Security, Inc. Created 1991. All
// rights reserved.
// License to copy and use this software is granted provided that it
// is identified as the "RSA Data Security, Inc. MD5 Message-Digest
// Algorithm" in all material mentioning or referencing this software
// or this function.
//
// License is also granted to make and use derivative works provided
// that such works are identified as "derived from the RSA Data
// Security, Inc. MD5 Message-Digest Algorithm" in all material
// mentioning or referencing the derived work.
//
// RSA Data Security, Inc. makes no representations concerning either
// the merchantability of this software or the suitability of this
// software for any particular purpose. It is provided "as is"
// without express or implied warranty of any kind.
//
// These notices must be retained in any copies of any part of this
// documentation and/or software.
// The original md5 implementation avoids external libraries.
// This version has dependency on stdio.h for file input and
// string.h for memcpy.
#include <cstdio>
#include <cstring>
#include <iostream>
namespace Limonp
{
//#pragma region MD5 defines
// Constants for MD5Transform routine.
#define S11 7
#define S12 12
#define S13 17
#define S14 22
#define S21 5
#define S22 9
#define S23 14
#define S24 20
#define S31 4
#define S32 11
#define S33 16
#define S34 23
#define S41 6
#define S42 10
#define S43 15
#define S44 21
// F, G, H and I are basic MD5 functions.
#define F(x, y, z) (((x) & (y)) | ((~x) & (z)))
#define G(x, y, z) (((x) & (z)) | ((y) & (~z)))
#define H(x, y, z) ((x) ^ (y) ^ (z))
#define I(x, y, z) ((y) ^ ((x) | (~z)))
// ROTATE_LEFT rotates x left n bits.
#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32-(n))))
// FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4.
// Rotation is separate from addition to prevent recomputation.
#define FF(a, b, c, d, x, s, ac) { \
(a) += F ((b), (c), (d)) + (x) + (UINT4)(ac); \
(a) = ROTATE_LEFT ((a), (s)); \
(a) += (b); \
}
#define GG(a, b, c, d, x, s, ac) { \
(a) += G ((b), (c), (d)) + (x) + (UINT4)(ac); \
(a) = ROTATE_LEFT ((a), (s)); \
(a) += (b); \
}
#define HH(a, b, c, d, x, s, ac) { \
(a) += H ((b), (c), (d)) + (x) + (UINT4)(ac); \
(a) = ROTATE_LEFT ((a), (s)); \
(a) += (b); \
}
#define II(a, b, c, d, x, s, ac) { \
(a) += I ((b), (c), (d)) + (x) + (UINT4)(ac); \
(a) = ROTATE_LEFT ((a), (s)); \
(a) += (b); \
}
//#pragma endregion
typedef unsigned char BYTE ;
// POINTER defines a generic pointer type
typedef unsigned char *POINTER;
// UINT2 defines a two byte word
typedef unsigned short int UINT2;
// UINT4 defines a four byte word
typedef unsigned int UINT4;
static unsigned char PADDING[64] = {
0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
};
// convenient object that wraps
// the C-functions for use in C++ only
class MD5
{
private:
struct __context_t {
UINT4 state[4]; /* state (ABCD) */
UINT4 count[2]; /* number of bits, modulo 2^64 (lsb first) */
unsigned char buffer[64]; /* input buffer */
} context ;
//#pragma region static helper functions
// The core of the MD5 algorithm is here.
// MD5 basic transformation. Transforms state based on block.
static void MD5Transform( UINT4 state[4], unsigned char block[64] )
{
UINT4 a = state[0], b = state[1], c = state[2], d = state[3], x[16];
Decode (x, block, 64);
/* Round 1 */
FF (a, b, c, d, x[ 0], S11, 0xd76aa478); /* 1 */
FF (d, a, b, c, x[ 1], S12, 0xe8c7b756); /* 2 */
FF (c, d, a, b, x[ 2], S13, 0x242070db); /* 3 */
FF (b, c, d, a, x[ 3], S14, 0xc1bdceee); /* 4 */
FF (a, b, c, d, x[ 4], S11, 0xf57c0faf); /* 5 */
FF (d, a, b, c, x[ 5], S12, 0x4787c62a); /* 6 */
FF (c, d, a, b, x[ 6], S13, 0xa8304613); /* 7 */
FF (b, c, d, a, x[ 7], S14, 0xfd469501); /* 8 */
FF (a, b, c, d, x[ 8], S11, 0x698098d8); /* 9 */
FF (d, a, b, c, x[ 9], S12, 0x8b44f7af); /* 10 */
FF (c, d, a, b, x[10], S13, 0xffff5bb1); /* 11 */
FF (b, c, d, a, x[11], S14, 0x895cd7be); /* 12 */
FF (a, b, c, d, x[12], S11, 0x6b901122); /* 13 */
FF (d, a, b, c, x[13], S12, 0xfd987193); /* 14 */
FF (c, d, a, b, x[14], S13, 0xa679438e); /* 15 */
FF (b, c, d, a, x[15], S14, 0x49b40821); /* 16 */
/* Round 2 */
GG (a, b, c, d, x[ 1], S21, 0xf61e2562); /* 17 */
GG (d, a, b, c, x[ 6], S22, 0xc040b340); /* 18 */
GG (c, d, a, b, x[11], S23, 0x265e5a51); /* 19 */
GG (b, c, d, a, x[ 0], S24, 0xe9b6c7aa); /* 20 */
GG (a, b, c, d, x[ 5], S21, 0xd62f105d); /* 21 */
GG (d, a, b, c, x[10], S22, 0x2441453); /* 22 */
GG (c, d, a, b, x[15], S23, 0xd8a1e681); /* 23 */
GG (b, c, d, a, x[ 4], S24, 0xe7d3fbc8); /* 24 */
GG (a, b, c, d, x[ 9], S21, 0x21e1cde6); /* 25 */
GG (d, a, b, c, x[14], S22, 0xc33707d6); /* 26 */
GG (c, d, a, b, x[ 3], S23, 0xf4d50d87); /* 27 */
GG (b, c, d, a, x[ 8], S24, 0x455a14ed); /* 28 */
GG (a, b, c, d, x[13], S21, 0xa9e3e905); /* 29 */
GG (d, a, b, c, x[ 2], S22, 0xfcefa3f8); /* 30 */
GG (c, d, a, b, x[ 7], S23, 0x676f02d9); /* 31 */
GG (b, c, d, a, x[12], S24, 0x8d2a4c8a); /* 32 */
/* Round 3 */
HH (a, b, c, d, x[ 5], S31, 0xfffa3942); /* 33 */
HH (d, a, b, c, x[ 8], S32, 0x8771f681); /* 34 */
HH (c, d, a, b, x[11], S33, 0x6d9d6122); /* 35 */
HH (b, c, d, a, x[14], S34, 0xfde5380c); /* 36 */
HH (a, b, c, d, x[ 1], S31, 0xa4beea44); /* 37 */
HH (d, a, b, c, x[ 4], S32, 0x4bdecfa9); /* 38 */
HH (c, d, a, b, x[ 7], S33, 0xf6bb4b60); /* 39 */
HH (b, c, d, a, x[10], S34, 0xbebfbc70); /* 40 */
HH (a, b, c, d, x[13], S31, 0x289b7ec6); /* 41 */
HH (d, a, b, c, x[ 0], S32, 0xeaa127fa); /* 42 */
HH (c, d, a, b, x[ 3], S33, 0xd4ef3085); /* 43 */
HH (b, c, d, a, x[ 6], S34, 0x4881d05); /* 44 */
HH (a, b, c, d, x[ 9], S31, 0xd9d4d039); /* 45 */
HH (d, a, b, c, x[12], S32, 0xe6db99e5); /* 46 */
HH (c, d, a, b, x[15], S33, 0x1fa27cf8); /* 47 */
HH (b, c, d, a, x[ 2], S34, 0xc4ac5665); /* 48 */
/* Round 4 */
II (a, b, c, d, x[ 0], S41, 0xf4292244); /* 49 */
II (d, a, b, c, x[ 7], S42, 0x432aff97); /* 50 */
II (c, d, a, b, x[14], S43, 0xab9423a7); /* 51 */
II (b, c, d, a, x[ 5], S44, 0xfc93a039); /* 52 */
II (a, b, c, d, x[12], S41, 0x655b59c3); /* 53 */
II (d, a, b, c, x[ 3], S42, 0x8f0ccc92); /* 54 */
II (c, d, a, b, x[10], S43, 0xffeff47d); /* 55 */
II (b, c, d, a, x[ 1], S44, 0x85845dd1); /* 56 */
II (a, b, c, d, x[ 8], S41, 0x6fa87e4f); /* 57 */
II (d, a, b, c, x[15], S42, 0xfe2ce6e0); /* 58 */
II (c, d, a, b, x[ 6], S43, 0xa3014314); /* 59 */
II (b, c, d, a, x[13], S44, 0x4e0811a1); /* 60 */
II (a, b, c, d, x[ 4], S41, 0xf7537e82); /* 61 */
II (d, a, b, c, x[11], S42, 0xbd3af235); /* 62 */
II (c, d, a, b, x[ 2], S43, 0x2ad7d2bb); /* 63 */
II (b, c, d, a, x[ 9], S44, 0xeb86d391); /* 64 */
state[0] += a;
state[1] += b;
state[2] += c;
state[3] += d;
// Zeroize sensitive information.
memset((POINTER)x, 0, sizeof (x));
}
// Encodes input (UINT4) into output (unsigned char). Assumes len is
// a multiple of 4.
static void Encode( unsigned char *output, UINT4 *input, unsigned int len )
{
unsigned int i, j;
for (i = 0, j = 0; j < len; i++, j += 4) {
output[j] = (unsigned char)(input[i] & 0xff);
output[j+1] = (unsigned char)((input[i] >> 8) & 0xff);
output[j+2] = (unsigned char)((input[i] >> 16) & 0xff);
output[j+3] = (unsigned char)((input[i] >> 24) & 0xff);
}
}
// Decodes input (unsigned char) into output (UINT4). Assumes len is
// a multiple of 4.
static void Decode( UINT4 *output, unsigned char *input, unsigned int len )
{
unsigned int i, j;
for (i = 0, j = 0; j < len; i++, j += 4)
output[i] = ((UINT4)input[j]) | (((UINT4)input[j+1]) << 8) |
(((UINT4)input[j+2]) << 16) | (((UINT4)input[j+3]) << 24);
}
//#pragma endregion
public:
// MAIN FUNCTIONS
MD5()
{
Init() ;
}
// MD5 initialization. Begins an MD5 operation, writing a new context.
void Init()
{
context.count[0] = context.count[1] = 0;
// Load magic initialization constants.
context.state[0] = 0x67452301;
context.state[1] = 0xefcdab89;
context.state[2] = 0x98badcfe;
context.state[3] = 0x10325476;
}
// MD5 block update operation. Continues an MD5 message-digest
// operation, processing another message block, and updating the
// context.
void Update(
unsigned char *input, // input block
unsigned int inputLen ) // length of input block
{
unsigned int i, index, partLen;
// Compute number of bytes mod 64
index = (unsigned int)((context.count[0] >> 3) & 0x3F);
// Update number of bits
if ((context.count[0] += ((UINT4)inputLen << 3))
< ((UINT4)inputLen << 3))
context.count[1]++;
context.count[1] += ((UINT4)inputLen >> 29);
partLen = 64 - index;
// Transform as many times as possible.
if (inputLen >= partLen) {
memcpy((POINTER)&context.buffer[index], (POINTER)input, partLen);
MD5Transform (context.state, context.buffer);
for (i = partLen; i + 63 < inputLen; i += 64)
MD5Transform (context.state, &input[i]);
index = 0;
}
else
i = 0;
/* Buffer remaining input */
memcpy((POINTER)&context.buffer[index], (POINTER)&input[i], inputLen-i);
}
// MD5 finalization. Ends an MD5 message-digest operation, writing the
// message digest to digestRaw (and its hex form to digestChars) and
// zeroizing the context.
void Final()
{
unsigned char bits[8];
unsigned int index, padLen;
// Save number of bits
Encode( bits, context.count, 8 );
// Pad out to 56 mod 64.
index = (unsigned int)((context.count[0] >> 3) & 0x3f);
padLen = (index < 56) ? (56 - index) : (120 - index);
Update( PADDING, padLen );
// Append length (before padding)
Update( bits, 8 );
// Store state in digest
Encode( digestRaw, context.state, 16);
// Zeroize sensitive information.
memset((POINTER)&context, 0, sizeof (context));
writeToString() ;
}
// Formats digestRaw as 32 lowercase hex digits into digestChars,
// which must hold at least 32 + 1 (NUL) = 33 chars.
void writeToString()
{
int pos ;
for( pos = 0 ; pos < 16 ; pos++ )
sprintf( digestChars+(pos*2), "%02x", digestRaw[pos] ) ;
}
public:
// an MD5 digest is a 16-byte number (32 hex digits)
BYTE digestRaw[ 16 ] ;
// This version of the digest is actually
// a "printf'd" version of the digest.
char digestChars[ 33 ] ;
// Loads a file from disk, digests it, and returns the hex digest
// (NULL if the filename is empty or the file cannot be opened).
const char* digestFile( const char *filename )
{
if (NULL == filename || strcmp(filename, "") == 0)
return NULL;
Init() ;
FILE *file;
size_t len; // fread returns size_t
unsigned char buffer[1024] ;
if((file = fopen (filename, "rb")) == NULL)
{
return NULL;
}
else
{
while( (len = fread( buffer, 1, 1024, file )) )
Update( buffer, len ) ;
Final();
fclose( file );
}
return digestChars ;
}
/// Digests a byte-array already in memory
const char* digestMemory( BYTE *memchunk, int len )
{
if (NULL == memchunk)
return NULL;
Init() ;
Update( memchunk, len ) ;
Final() ;
return digestChars ;
}
// Digests a NUL-terminated string and returns the hex digest.
const char* digestString(const char *string )
{
if (string == NULL)
return NULL;
Init() ;
Update( (unsigned char*)string, strlen(string) ) ;
Final() ;
return digestChars ;
}
};
inline bool md5String(const char* str, std::string& res)
{
if (NULL == str)
{
res = "";
return false;
}
MD5 md5;
const char *pRes = md5.digestString(str);
if (NULL == pRes)
{
res = "";
return false;
}
res = pRes;
return true;
}
inline bool md5File(const char* filepath, std::string& res)
{
if (NULL == filepath || strcmp(filepath, "") == 0)
{
res = "";
return false;
}
MD5 md5;
const char *pRes = md5.digestFile(filepath);
if (NULL == pRes)
{
res = "";
return false;
}
res = pRes;
return true;
}
}
#endif
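
The padding rule used by `Final()` above (`padLen = (index < 56) ? (56 - index) : (120 - index)`) can be checked in isolation. The following is a minimal sketch, not part of the deleted header: `md5PadLen` and `md5PaddedLengthIsBlockAligned` are hypothetical helper names introduced here for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (assumption, not in md5.hpp): number of padding bytes
// that Final() would append for a message of messageBytes bytes.
inline std::size_t md5PadLen(std::size_t messageBytes)
{
    // Bytes already occupying the last 64-byte block.
    const std::size_t index = messageBytes % 64;
    // Pad so that the length becomes 56 mod 64, leaving 8 bytes for the bit count.
    return (index < 56) ? (56 - index) : (120 - index);
}

// After padding and the 8-byte length field, the total is a whole number of blocks.
inline bool md5PaddedLengthIsBlockAligned(std::size_t messageBytes)
{
    return (messageBytes + md5PadLen(messageBytes) + 8) % 64 == 0;
}
```

Padding is never zero bytes long: a message that already ends at 56 mod 64 still receives a full 64 bytes of padding, because the leading `0x80` marker byte is always appended.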

@ -1,130 +0,0 @@
#ifndef LIMONP_STD_OUTBOUND_H
#define LIMONP_STD_OUTBOUND_H
#include <map>
#if(__cplusplus >= 201103L)
#include <unordered_map>
#include <unordered_set>
#else
#include <tr1/unordered_map>
#include <tr1/unordered_set>
namespace std
{
using std::tr1::unordered_map;
using std::tr1::unordered_set;
}
#endif
#include <set>
#include <vector>
#include <fstream>
#include <sstream>
namespace std
{
template<typename T>
ostream& operator << (ostream& os, const vector<T>& vec)
{
if(vec.empty())
{
return os << "[]";
}
os<<"[\""<<vec[0];
for(size_t i = 1; i < vec.size(); i++)
{
os<<"\", \""<<vec[i];
}
os<<"\"]";
return os;
}
template<class T1, class T2>
ostream& operator << (ostream& os, const pair<T1, T2>& pr)
{
os << pr.first << ":" << pr.second ;
return os;
}
template<class T>
string& operator << (string& str, const T& obj)
{
stringstream ss;
ss << obj; // uses the ostream operator<< overload that matches T
return str = ss.str();
}
template<class T1, class T2>
ostream& operator << (ostream& os, const map<T1, T2>& mp)
{
if(mp.empty())
{
os<<"{}";
return os;
}
os<<'{';
typename map<T1, T2>::const_iterator it = mp.begin();
os<<*it;
it++;
while(it != mp.end())
{
os<<", "<<*it;
it++;
}
os<<'}';
return os;
}
template<class T1, class T2>
ostream& operator << (ostream& os, const std::unordered_map<T1, T2>& mp)
{
if(mp.empty())
{
return os << "{}";
}
os<<'{';
typename std::unordered_map<T1, T2>::const_iterator it = mp.begin();
os<<*it;
it++;
while(it != mp.end())
{
os<<", "<<*it++;
}
return os<<'}';
}
template<class T>
ostream& operator << (ostream& os, const set<T>& st)
{
if(st.empty())
{
os << "{}";
return os;
}
os<<'{';
typename set<T>::const_iterator it = st.begin();
os<<*it;
it++;
while(it != st.end())
{
os<<", "<<*it;
it++;
}
os<<'}';
return os;
}
template<class KeyType, class ContainType>
bool isIn(const ContainType& contain, const KeyType& key)
{
return contain.end() != contain.find(key);
}
template<class T>
basic_string<T> & operator << (basic_string<T> & s, ifstream & ifs)
{
return s.assign((istreambuf_iterator<T>(ifs)), istreambuf_iterator<T>());
}
}
#endif
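
The `vector` overload above quotes every element and separates them with `", "`. Because adding overloads to namespace `std` is undefined behavior, the sketch below reproduces the same output format as a free function instead. `formatVector` is a hypothetical name used here for illustration, not something the header provides.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical free-function equivalent of the vector operator<< above.
// Produces [] for an empty vector, otherwise ["e0", "e1", ...].
template <class T>
std::string formatVector(const std::vector<T>& vec)
{
    if (vec.empty())
    {
        return "[]";
    }
    std::ostringstream os;
    os << "[\"" << vec[0];
    for (std::size_t i = 1; i < vec.size(); i++)
    {
        os << "\", \"" << vec[i];
    }
    os << "\"]";
    return os.str();
}
```

Note that elements are quoted regardless of their type, so a vector of integers is also printed with quotes around each value.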

@ -1,363 +0,0 @@
/************************************
* file enc : ascii
* author : wuyanyi09@gmail.com
************************************/
#ifndef LIMONP_STR_FUNCTS_H
#define LIMONP_STR_FUNCTS_H
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <cctype>
#include <map>
#include <stdint.h>
#include <stdio.h>
#include <stdarg.h>
#include <memory.h>
#include <functional>
#include <locale>
#include <sstream>
#include <sys/types.h>
#include <iterator>
#include "std_outbound.hpp"
#define print(x) cout<< #x": " << x <<endl
namespace Limonp
{
using namespace std;
inline string string_format(const char* fmt, ...)
{
int size = 256;
std::string str;
va_list ap;
while (1) {
str.resize(size);
va_start(ap, fmt);
int n = vsnprintf(&str[0], size, fmt, ap); // write into the string's own buffer, not through c_str()
va_end(ap);
if (n > -1 && n < size) {
str.resize(n);
return str;
}
if (n > -1)
size = n + 1;
else
size *= 2;
}
return str;
}
inline void string_format(string& res, const char* fmt, ...)
{
int size = 256;
va_list ap;
res.clear();
while (1) {
res.resize(size);
va_start(ap, fmt);
int n = vsnprintf(&res[0], size, fmt, ap); // write into the string's own buffer, not through c_str()
va_end(ap);
if (n > -1 && n < size) {
res.resize(n);
return;
}
if (n > -1)
size = n + 1;
else
size *= 2;
}
}
template<class T>
void join(T begin, T end, string& res, const string& connector)
{
if(begin == end)
{
return;
}
stringstream ss;
ss<<*begin;
begin++;
while(begin != end)
{
ss << connector << *begin;
begin ++;
}
res = ss.str();
}
template<class T>
string join(T begin, T end, const string& connector)
{
string res;
join(begin ,end, res, connector);
return res;
}
inline bool split(const string& src, vector<string>& res, const string& pattern, size_t offset = 0, size_t len = string::npos)
{
if(src.empty())
{
return false;
}
res.clear();
size_t start = 0;
size_t end = 0;
size_t cnt = 0;
while(start < src.size() && res.size() < len)
{
end = src.find_first_of(pattern, start);
if(string::npos == end)
{
if(cnt >= offset)
{
res.push_back(src.substr(start));
}
return true;
}
//if(end == src.size() - 1)
//{
// res.push_back("");
// return true;
//}
if(cnt >= offset)
{
res.push_back(src.substr(start, end - start));
}
cnt ++;
start = end + 1;
}
return true;
}
inline string& upper(string& str)
{
transform(str.begin(), str.end(), str.begin(), (int (*)(int))toupper);
return str;
}
inline string& lower(string& str)
{
transform(str.begin(), str.end(), str.begin(), (int (*)(int))tolower);
return str;
}
inline std::string &ltrim(std::string &s)
{
s.erase(s.begin(), std::find_if(s.begin(), s.end(), std::not1(std::ptr_fun<int, int>(std::isspace))));
return s;
}
inline std::string &rtrim(std::string &s)
{
s.erase(std::find_if(s.rbegin(), s.rend(), std::not1(std::ptr_fun<int, int>(std::isspace))).base(), s.end());
return s;
}
inline std::string &trim(std::string &s)
{
return ltrim(rtrim(s));
}
inline std::string & ltrim(std::string & s, char x)
{
s.erase(s.begin(), std::find_if(s.begin(), s.end(), std::not1(std::bind2nd(std::equal_to<char>(), x))));
return s;
}
inline std::string & rtrim(std::string & s, char x)
{
s.erase(std::find_if(s.rbegin(), s.rend(), std::not1(std::bind2nd(std::equal_to<char>(), x))).base(), s.end());
return s;
}
inline std::string &trim(std::string &s, char x)
{
return ltrim(rtrim(s, x), x);
}
inline bool startsWith(const string& str, const string& prefix)
{
if(prefix.length() > str.length())
{
return false;
}
return 0 == str.compare(0, prefix.length(), prefix);
}
inline bool endsWith(const string& str, const string& suffix)
{
if(suffix.length() > str.length())
{
return false;
}
return 0 == str.compare(str.length() - suffix.length(), suffix.length(), suffix);
}
inline bool isInStr(const string& str, char ch)
{
return str.find(ch) != string::npos;
}
inline uint16_t twocharToUint16(char high, char low)
{
return (((uint16_t(high) & 0x00ff ) << 8) | (uint16_t(low) & 0x00ff));
}
inline bool utf8ToUnicode(const char * const str, uint len, vector<uint16_t>& vec)
{
if(!str)
{
return false;
}
char ch1, ch2;
uint16_t tmp;
vec.clear();
for(uint i = 0;i < len;)
{
if(!(str[i] & 0x80)) // 0xxxxxxx
{
vec.push_back(str[i]);
i++;
}
else if ((unsigned char)str[i] <= 0xdf && i + 1 < len) // 110xxxxx 10xxxxxx (two-byte sequence)
{
ch1 = (str[i] >> 2) & 0x07;
ch2 = (str[i+1] & 0x3f) | ((str[i] & 0x03) << 6 );
tmp = (((uint16_t(ch1) & 0x00ff ) << 8) | (uint16_t(ch2) & 0x00ff));
vec.push_back(tmp);
i += 2;
}
else if((unsigned char)str[i] <= 0xef && i + 2 < len) // 1110xxxx 10xxxxxx 10xxxxxx (three-byte sequence)
{
ch1 = (str[i] << 4) | ((str[i+1] >> 2) & 0x0f );
ch2 = ((str[i+1]<<6) & 0xc0) | (str[i+2] & 0x3f);
tmp = (((uint16_t(ch1) & 0x00ff ) << 8) | (uint16_t(ch2) & 0x00ff));
vec.push_back(tmp);
i += 3;
}
else
{
return false;
}
}
return true;
}
inline bool utf8ToUnicode(const string& str, vector<uint16_t>& vec)
{
return utf8ToUnicode(str.c_str(), str.size(), vec);
}
inline bool unicodeToUtf8(vector<uint16_t>::const_iterator begin, vector<uint16_t>::const_iterator end, string& res)
{
if(begin >= end)
{
return false;
}
res.clear();
uint16_t ui;
while(begin != end)
{
ui = *begin;
if(ui <= 0x7f)
{
res += char(ui);
}
else if(ui <= 0x7ff)
{
res += char(((ui>>6) & 0x1f) | 0xc0);
res += char((ui & 0x3f) | 0x80);
}
else
{
res += char(((ui >> 12) & 0x0f )| 0xe0);
res += char(((ui>>6) & 0x3f )| 0x80 );
res += char((ui & 0x3f) | 0x80);
}
begin ++;
}
return true;
}
inline bool gbkTrans(const char* const str, uint len, vector<uint16_t>& vec)
{
vec.clear();
if(!str)
{
return false;
}
uint i = 0;
while(i < len)
{
if(0 == (str[i] & 0x80))
{
vec.push_back(uint16_t(str[i]));
i++;
}
else
{
if(i + 1 < len) //&& (str[i+1] & 0x80))
{
uint16_t tmp = (((uint16_t(str[i]) & 0x00ff ) << 8) | (uint16_t(str[i+1]) & 0x00ff));
vec.push_back(tmp);
i += 2;
}
else
{
return false;
}
}
}
return true;
}
inline bool gbkTrans(const string& str, vector<uint16_t>& vec)
{
return gbkTrans(str.c_str(), str.size(), vec);
}
//inline pair<char, char> uint16ToChar2(uint16_t in)
//{
// pair<char, char> res;
// res.first = (in>>8) & 0x00ff; //high
// res.second = (in) & 0x00ff; //low
// return res;
//}
inline bool gbkTrans(vector<uint16_t>::const_iterator begin, vector<uint16_t>::const_iterator end, string& res)
{
if(begin >= end)
{
return false;
}
res.clear();
//pair<char, char> pa;
char first, second;
while(begin != end)
{
//pa = uint16ToChar2(*begin);
first = ((*begin)>>8) & 0x00ff;
second = (*begin) & 0x00ff;
if(first & 0x80)
{
res += first;
res += second;
}
else
{
res += second;
}
begin++;
}
return true;
}
}
#endif
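
`utf8ToUnicode` above assembles each 16-bit value from a high and a low byte. The same mapping can be written directly from the UTF-8 bit layout. This is a minimal sketch under stated assumptions: `decodeUtf8Bmp` and `decodeOrEmpty` are hypothetical names, only code points up to U+FFFF are handled (as in the original), and unlike the original it also checks that continuation bytes have the form `10xxxxxx`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <string>
#include <vector>

// Hypothetical standalone decoder: UTF-8 bytes -> 16-bit code points (BMP only).
inline bool decodeUtf8Bmp(const std::string& s, std::vector<uint16_t>& out)
{
    out.clear();
    for (std::size_t i = 0; i < s.size();)
    {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) // 0xxxxxxx
        {
            out.push_back(b0);
            i += 1;
        }
        else if ((b0 & 0xE0) == 0xC0 && i + 1 < s.size()) // 110xxxxx 10xxxxxx
        {
            const unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
            if ((b1 & 0xC0) != 0x80) return false;
            out.push_back(uint16_t(((b0 & 0x1F) << 6) | (b1 & 0x3F)));
            i += 2;
        }
        else if ((b0 & 0xF0) == 0xE0 && i + 2 < s.size()) // 1110xxxx 10xxxxxx 10xxxxxx
        {
            const unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
            const unsigned char b2 = static_cast<unsigned char>(s[i + 2]);
            if ((b1 & 0xC0) != 0x80 || (b2 & 0xC0) != 0x80) return false;
            out.push_back(uint16_t(((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)));
            i += 3;
        }
        else // four-byte sequences, stray continuation bytes, truncated input
        {
            return false;
        }
    }
    return true;
}

// Convenience wrapper: returns an empty vector when decoding fails.
inline std::vector<uint16_t> decodeOrEmpty(const std::string& s)
{
    std::vector<uint16_t> out;
    if (!decodeUtf8Bmp(s, out)) out.clear();
    return out;
}
```

Storing code points in `uint16_t` means characters outside the Basic Multilingual Plane cannot be represented, which is why four-byte sequences are rejected rather than truncated.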

@ -1,204 +0,0 @@
/************************************
* file enc : ASCII
* author : wuyanyi09@gmail.com
************************************/
#ifndef CPPJIEBA_MPSEGMENT_H
#define CPPJIEBA_MPSEGMENT_H
#include <algorithm>
#include <set>
#include <cassert>
#include "Limonp/logger.hpp"
#include "DictTrie.hpp"
#include "ISegment.hpp"
#include "SegmentBase.hpp"
namespace CppJieba
{
struct SegmentChar
{
uint16_t uniCh;
DagType dag;
const DictUnit * pInfo;
double weight;
SegmentChar():uniCh(0), pInfo(NULL), weight(0.0)
{}
};
typedef vector<SegmentChar> SegmentContext;
class MPSegment: public SegmentBase
{
protected:
DictTrie _dictTrie;
public:
MPSegment(){_setInitFlag(false);};
explicit MPSegment(const string& dictPath)
{
_setInitFlag(init(dictPath));
};
virtual ~MPSegment(){};
public:
bool init(const string& dictPath)
{
if(_getInitFlag())
{
LogError("already inited before now.");
return false;
}
_dictTrie.init(dictPath);
assert(_dictTrie);
LogInfo("MPSegment init(%s) ok", dictPath.c_str());
return _setInitFlag(true);
}
public:
using SegmentBase::cut;
virtual bool cut(Unicode::const_iterator begin, Unicode::const_iterator end, vector<string>& res)const
{
assert(_getInitFlag());
if(begin == end)
{
return false;
}
vector<Unicode> words;
if(!cut(begin, end, words))
{
return false;
}
string word;
for(size_t i = 0; i < words.size(); i++)
{
if(TransCode::encode(words[i], word))
{
res.push_back(word);
}
else
{
LogError("encode failed.");
}
}
return true;
}
bool cut(Unicode::const_iterator begin , Unicode::const_iterator end, vector<Unicode>& res) const
{
if(!_getInitFlag())
{
LogError("not inited.");
return false;
}
SegmentContext segContext;
//calc DAG
if(!_calcDAG(begin, end, segContext))
{
LogError("_calcDAG failed.");
return false;
}
if(!_calcDP(segContext))
{
LogError("_calcDP failed.");
return false;
}
if(!_cut(segContext, res))
{
LogError("_cut failed.");
return false;
}
return true;
}
private:
bool _calcDAG(Unicode::const_iterator begin, Unicode::const_iterator end, SegmentContext& segContext) const
{
SegmentChar schar;
size_t offset;
for(Unicode::const_iterator it = begin; it != end; it++)
{
schar.uniCh = *it;
offset = it - begin;
schar.dag.clear();
_dictTrie.find(it, end, schar.dag, offset);
if(!isIn(schar.dag, offset))
{
schar.dag[offset] = NULL;
}
segContext.push_back(schar);
}
return true;
}
bool _calcDP(SegmentContext& segContext)const
{
if(segContext.empty())
{
LogError("segContext empty");
return false;
}
size_t nextPos;
const DictUnit* p;
double val;
for(int i = segContext.size() - 1; i >= 0; i--)
{
segContext[i].pInfo = NULL;
segContext[i].weight = MIN_DOUBLE;
for(DagType::const_iterator it = segContext[i].dag.begin(); it != segContext[i].dag.end(); it++)
{
nextPos = it->first;
p = it->second;
val = 0.0;
if(nextPos + 1 < segContext.size())
{
val += segContext[nextPos + 1].weight;
}
if(p)
{
val += p->logFreq;
}
else
{
val += _dictTrie.getMinLogFreq();
}
if(val > segContext[i].weight)
{
segContext[i].pInfo = p;
segContext[i].weight = val;
}
}
}
return true;
}
bool _cut(SegmentContext& segContext, vector<Unicode>& res)const
{
size_t i = 0;
while(i < segContext.size())
{
const DictUnit* p = segContext[i].pInfo;
if(p)
{
res.push_back(p->word);
i += p->word.size();
}
else // no dictionary word was chosen here: emit a single character
{
res.push_back(Unicode(1, segContext[i].uniCh));
i++;
}
}
return true;
}
};
}
#endif
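
`_calcDP` above walks the DAG from right to left and keeps, for every start position, the edge with the highest accumulated log-frequency; `_cut` then follows those choices from left to right. The sketch below shows the same dynamic program on plain indices. It is an illustration under assumptions, not the library's code: `Edges`, `bestWordLengths`, and `demoBestWordLengths` are hypothetical names, and the scores in the demo are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for DagType: for a start position i, maps the inclusive
// end index j of a dictionary word to that word's log-frequency.
typedef std::map<std::size_t, double> Edges;

// Returns the lengths of the words on the highest-scoring path.
// minLogFreq is the score assumed for a character without a dictionary entry.
inline std::vector<std::size_t> bestWordLengths(const std::vector<Edges>& dag, double minLogFreq)
{
    const std::size_t n = dag.size();
    std::vector<double> weight(n + 1, 0.0); // weight[n] == 0: empty suffix
    std::vector<std::size_t> bestEnd(n, 0);
    for (std::size_t i = n; i-- > 0;)
    {
        // Fallback: treat position i as a one-character word.
        bestEnd[i] = i;
        weight[i] = minLogFreq + weight[i + 1];
        for (Edges::const_iterator it = dag[i].begin(); it != dag[i].end(); ++it)
        {
            if (it->first < i || it->first >= n) continue; // ignore malformed edges
            const double val = it->second + weight[it->first + 1];
            if (val > weight[i])
            {
                weight[i] = val;
                bestEnd[i] = it->first;
            }
        }
    }
    // Follow the chosen edges from left to right.
    std::vector<std::size_t> lengths;
    for (std::size_t i = 0; i < n; i = bestEnd[i] + 1)
    {
        lengths.push_back(bestEnd[i] - i + 1);
    }
    return lengths;
}

// Three characters; a two-character word at positions 0..1 beats two single characters.
inline std::vector<std::size_t> demoBestWordLengths()
{
    std::vector<Edges> dag(3);
    dag[0][0] = -5.0;
    dag[0][1] = -3.0;
    dag[1][1] = -5.0;
    dag[2][2] = -4.0;
    return bestWordLengths(dag, -10.0);
}
```

Working right to left means each suffix score is computed once, so the cost is proportional to the number of DAG edges rather than the number of possible segmentations.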

@ -1,130 +0,0 @@
#ifndef CPPJIEBA_MIXSEGMENT_H
#define CPPJIEBA_MIXSEGMENT_H
#include <cassert>
#include "MPSegment.hpp"
#include "HMMSegment.hpp"
#include "Limonp/str_functs.hpp"
namespace CppJieba
{
class MixSegment: public SegmentBase
{
private:
MPSegment _mpSeg;
HMMSegment _hmmSeg;
public:
MixSegment(){_setInitFlag(false);};
explicit MixSegment(const string& mpSegDict, const string& hmmSegDict)
{
_setInitFlag(init(mpSegDict, hmmSegDict));
assert(_getInitFlag());
}
virtual ~MixSegment(){}
public:
bool init(const string& mpSegDict, const string& hmmSegDict)
{
assert(!_getInitFlag());
if(!_mpSeg.init(mpSegDict))
{
LogError("_mpSeg init");
return false;
}
if(!_hmmSeg.init(hmmSegDict))
{
LogError("_hmmSeg init");
return false;
}
LogInfo("MixSegment init(%s, %s)", mpSegDict.c_str(), hmmSegDict.c_str());
return _setInitFlag(true);
}
public:
using SegmentBase::cut;
public:
virtual bool cut(Unicode::const_iterator begin, Unicode::const_iterator end, vector<Unicode>& res) const
{
assert(_getInitFlag());
vector<Unicode> words;
if(!_mpSeg.cut(begin, end, words))
{
LogError("mpSeg cutDAG failed.");
return false;
}
vector<Unicode> hmmRes;
Unicode piece;
for (size_t i = 0, j = 0; i < words.size(); i++)
{
// if MP produced a multi-character word, accept it as-is
if (1 != words[i].size())
{
res.push_back(words[i]);
continue;
}
// otherwise collect the run of consecutive single-character words
j = i;
while (j < words.size() && words[j].size() == 1)
{
piece.push_back(words[j][0]);
j++;
}
// cut the sequence with hmm
if (!_hmmSeg.cut(piece.begin(), piece.end(), hmmRes))
{
LogError("_hmmSeg cut failed.");
return false;
}
// append the HMM result to the output
for (size_t k = 0; k < hmmRes.size(); k++)
{
res.push_back(hmmRes[k]);
}
//clear tmp vars
piece.clear();
hmmRes.clear();
//let i jump over this piece
i = j - 1;
}
return true;
}
virtual bool cut(Unicode::const_iterator begin, Unicode::const_iterator end, vector<string>& res)const
{
assert(_getInitFlag());
if(begin >= end)
{
LogError("begin >= end");
return false;
}
vector<Unicode> uRes;
if (!cut(begin, end, uRes))
{
LogError("get unicode cut result error.");
return false;
}
string tmp;
for (vector<Unicode>::const_iterator uItr = uRes.begin(); uItr != uRes.end(); uItr++)
{
if (TransCode::encode(*uItr, tmp))
{
res.push_back(tmp);
}
else
{
LogError("encode failed.");
}
}
return true;
}
};
}
#endif
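
The core of `MixSegment::cut` above is the grouping step: multi-character words from the MP pass are kept, while each maximal run of single-character words is collected into one piece and handed to the HMM segmenter. The sketch below isolates only that grouping. It assumes one byte per character for simplicity, and `mergeSingleCharRuns` and `demoMergeSingleCharRuns` are hypothetical names; the HMM step itself is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Merge each maximal run of one-character words into a single piece.
// In the real segmenter each such piece would be re-cut by the HMM model.
inline std::vector<std::string> mergeSingleCharRuns(const std::vector<std::string>& words)
{
    std::vector<std::string> out;
    for (std::size_t i = 0; i < words.size();)
    {
        if (words[i].size() != 1)
        {
            out.push_back(words[i]); // dictionary word: keep unchanged
            ++i;
            continue;
        }
        std::string piece;
        while (i < words.size() && words[i].size() == 1)
        {
            piece += words[i];
            ++i;
        }
        out.push_back(piece);
    }
    return out;
}

// Demo input: the run "c", "d" is merged; the isolated "g" stays on its own.
inline std::vector<std::string> demoMergeSingleCharRuns()
{
    const char* raw[] = {"ab", "c", "d", "ef", "g"};
    const std::vector<std::string> words(raw, raw + 5);
    return mergeSingleCharRuns(words);
}
```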

@ -1,63 +0,0 @@
#ifndef CPPJIEBA_POS_TAGGING_H
#define CPPJIEBA_POS_TAGGING_H
#include "MixSegment.hpp"
#include "Limonp/str_functs.hpp"
#include "DictTrie.hpp"
namespace CppJieba
{
using namespace Limonp;
class PosTagger: public InitOnOff
{
private:
MixSegment _segment;
DictTrie _dictTrie;
public:
PosTagger(){_setInitFlag(false);};
explicit PosTagger(const string& dictPath, const string& hmmFilePath, const string& charStatus, const string& startProb, const string& emitProb, const string& endProb, const string& transProb)
{
_setInitFlag(init(dictPath, hmmFilePath, charStatus, startProb, emitProb, endProb, transProb));
};
~PosTagger(){};
public:
bool init(const string& dictPath, const string& hmmFilePath, const string& charStatus, const string& startProb, const string& emitProb, const string& endProb, const string& transProb)
{
assert(!_getInitFlag());
_dictTrie.init(dictPath);
assert(_dictTrie);
return _setInitFlag(_segment.init(dictPath, hmmFilePath));
};
bool tag(const string& src, vector<pair<string, string> >& res)
{
assert(_getInitFlag());
vector<string> cutRes;
if (!_segment.cut(src, cutRes))
{
LogError("_mixSegment cut failed");
return false;
}
const DictUnit *tmp = NULL;
Unicode unico;
for (vector<string>::iterator itr = cutRes.begin(); itr != cutRes.end(); ++itr)
{
if (!TransCode::decode(*itr, unico))
{
LogError("decode failed.");
return false;
}
tmp = _dictTrie.find(unico.begin(), unico.end());
res.push_back(make_pair(*itr, tmp == NULL ? "x" : tmp->tag));
}
tmp = NULL;
return !res.empty();
}
};
}
#endif

@ -1,137 +0,0 @@
#ifndef CPPJIEBA_QUERYSEGMENT_H
#define CPPJIEBA_QUERYSEGMENT_H
#include <algorithm>
#include <set>
#include <cassert>
#include "Limonp/logger.hpp"
#include "DictTrie.hpp"
#include "ISegment.hpp"
#include "SegmentBase.hpp"
#include "FullSegment.hpp"
#include "MixSegment.hpp"
#include "TransCode.hpp"
namespace CppJieba
{
class QuerySegment: public SegmentBase
{
private:
MixSegment _mixSeg;
FullSegment _fullSeg;
size_t _maxWordLen;
public:
QuerySegment(){_setInitFlag(false);};
explicit QuerySegment(const string& dict, const string& model, size_t maxWordLen)
{
_setInitFlag(init(dict, model, maxWordLen));
};
virtual ~QuerySegment(){};
public:
bool init(const string& dict, const string& model, size_t maxWordLen)
{
if (_getInitFlag())
{
LogError("inited already.");
return false;
}
if (!_mixSeg.init(dict, model))
{
LogError("_mixSeg init");
return false;
}
if (!_fullSeg.init(dict))
{
LogError("_fullSeg init");
return false;
}
_maxWordLen = maxWordLen;
return _setInitFlag(true);
}
public:
using SegmentBase::cut;
public:
bool cut(Unicode::const_iterator begin, Unicode::const_iterator end, vector<Unicode>& res) const
{
assert(_getInitFlag());
if (begin >= end)
{
LogError("begin >= end");
return false;
}
//use mix cut first
vector<Unicode> mixRes;
if (!_mixSeg.cut(begin, end, mixRes))
{
LogError("_mixSeg cut failed.");
return false;
}
vector<Unicode> fullRes;
for (vector<Unicode>::const_iterator mixResItr = mixRes.begin(); mixResItr != mixRes.end(); mixResItr++)
{
// if it's too long, cut with _fullSeg, put fullRes in res
if (mixResItr->size() > _maxWordLen)
{
if (_fullSeg.cut(mixResItr->begin(), mixResItr->end(), fullRes))
{
for (vector<Unicode>::const_iterator fullResItr = fullRes.begin(); fullResItr != fullRes.end(); fullResItr++)
{
res.push_back(*fullResItr);
}
//clear tmp res
fullRes.clear();
}
}
else // just use the mix result
{
res.push_back(*mixResItr);
}
}
return true;
}
bool cut(Unicode::const_iterator begin, Unicode::const_iterator end, vector<string>& res) const
{
assert(_getInitFlag());
if (begin >= end)
{
LogError("begin >= end");
return false;
}
vector<Unicode> uRes;
if (!cut(begin, end, uRes))
{
LogError("get unicode cut result error.");
return false;
}
string tmp;
for (vector<Unicode>::const_iterator uItr = uRes.begin(); uItr != uRes.end(); uItr++)
{
if (TransCode::encode(*uItr, tmp))
{
res.push_back(tmp);
}
else
{
LogError("encode failed.");
}
}
return true;
}
};
}
#endif

@ -1,114 +0,0 @@
#ifndef CPPJIEBA_SEGMENTBASE_H
#define CPPJIEBA_SEGMENTBASE_H
#include "TransCode.hpp"
#include "Limonp/logger.hpp"
#include "Limonp/InitOnOff.hpp"
#include "ISegment.hpp"
#include <cassert>
namespace CppJieba
{
using namespace Limonp;
class SegmentBase: public ISegment, public InitOnOff
{
public:
SegmentBase(){};
virtual ~SegmentBase(){};
public:
virtual bool cut(Unicode::const_iterator begin, Unicode::const_iterator end, vector<string>& res)const = 0;
virtual bool cut(const string& str, vector<string>& res)const
{
assert(_getInitFlag());
Unicode unico;
res.clear();
#ifdef NO_FILTER
if(!TransCode::decode(str, unico))
{
LogFatal("str[%s] decode failed.", str.c_str());
return false;
}
return cut(unico.begin(), unico.end(), res);
#else
const char * const cstr = str.c_str();
size_t size = str.size();
size_t offset = 0;
string subs;
int ret;
size_t len;
while(offset < size)
{
const char * const nstr = cstr + offset;
size_t nsize = size - offset;
if(-1 == (ret = filterAscii(nstr, nsize, len)) || 0 == len || len > nsize)
{
LogFatal("str[%s] illegal.", cstr);
return false;
}
subs.assign(nstr, len);
if(!ret)
{
res.push_back(subs);
}
else
{
unico.clear();
if(!TransCode::decode(subs, unico))
{
LogFatal("str[%s] decode failed.", subs.c_str());
return false;
}
cut(unico.begin(), unico.end(), res);
}
offset += len;
}
return true;
#endif
}
public:
/*
* If the first byte is ASCII, set resLen to the length of the leading ASCII run and return 0;
* otherwise set resLen to the length of the leading non-ASCII run and return 1.
* On error, return -1.
* */
static int filterAscii(const char* str, size_t len, size_t& resLen)
{
if(!str || !len)
{
return -1;
}
char x = 0x80;
int resFlag = (str[0] & x ? 1 : 0);
resLen = 0;
if(!resFlag)
{
while(resLen < len && !(str[resLen] & x))
{
resLen ++;
}
}
else
{
while(resLen < len && (str[resLen] & x))
{
#ifdef CPPJIEBA_GBK
resLen += 2;
#else
resLen ++;
#endif
}
}
if(resLen > len)
{
return -1;
}
return resFlag;
}
};
}
#endif
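
`SegmentBase::cut` above repeatedly calls `filterAscii` to peel off alternating ASCII and non-ASCII runs, passing ASCII runs through unchanged and segmenting the rest. A minimal sketch of that splitting step for the UTF-8 (non-GBK) case follows; `splitByAsciiRuns` is a hypothetical name introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Splits s into maximal runs of bytes that are all ASCII or all non-ASCII.
// Each result pair is (isAscii, run).
inline std::vector<std::pair<bool, std::string> > splitByAsciiRuns(const std::string& s)
{
    std::vector<std::pair<bool, std::string> > runs;
    for (std::size_t i = 0; i < s.size();)
    {
        // A byte is ASCII when its high bit is clear.
        const bool ascii = (static_cast<unsigned char>(s[i]) & 0x80) == 0;
        std::size_t j = i;
        while (j < s.size() && ((static_cast<unsigned char>(s[j]) & 0x80) == 0) == ascii)
        {
            ++j;
        }
        runs.push_back(std::make_pair(ascii, s.substr(i, j - i)));
        i = j;
    }
    return runs;
}
```

Testing only the high bit works for UTF-8 because every byte of a multi-byte sequence has that bit set, so a run boundary never falls inside a character.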

@ -1,43 +0,0 @@
/************************************
* file enc : utf-8
* author : wuyanyi09@gmail.com
************************************/
#ifndef CPPJIEBA_TRANSCODE_H
#define CPPJIEBA_TRANSCODE_H
#include "Limonp/str_functs.hpp"
namespace CppJieba
{
using namespace Limonp;
typedef std::vector<uint16_t> Unicode;
namespace TransCode
{
inline bool decode(const string& str, Unicode& vec)
{
#ifdef CPPJIEBA_GBK
return gbkTrans(str, vec);
#else
return utf8ToUnicode(str, vec);
#endif
}
inline bool encode(Unicode::const_iterator begin, Unicode::const_iterator end, string& res)
{
#ifdef CPPJIEBA_GBK
return gbkTrans(begin, end, res);
#else
return unicodeToUtf8(begin, end, res);
#endif
}
inline bool encode(const Unicode& uni, string& res)
{
return encode(uni.begin(), uni.end(), res);
}
}
}
#endif

@ -1,143 +0,0 @@
#ifndef CPPJIEBA_TRIE_HPP
#define CPPJIEBA_TRIE_HPP
#include "Limonp/std_outbound.hpp"
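#include <vector>
#include <map>     // std::map is used by the second find() overload
#include <cassert> // assert is used in find() and _createTrie()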
namespace CppJieba
{
using namespace std;
template <class KeyType, class ValueType>
class TrieNode
{
public:
typedef unordered_map<KeyType, TrieNode<KeyType, ValueType>* > KeyMapType;
public:
KeyMapType * ptKeyMap;
const ValueType * ptValue;
};
template <class KeyType, class ValueType>
class Trie
{
public:
typedef TrieNode<KeyType, ValueType> TrieNodeType;
private:
TrieNodeType* _root;
public:
Trie(const vector<vector<KeyType> >& keys, const vector<const ValueType* >& valuePointers)
{
_root = new TrieNodeType;
_root->ptKeyMap = NULL;
_root->ptValue = NULL;
_createTrie(keys, valuePointers);
}
~Trie()
{
if(_root)
{
_deleteNode(_root);
}
}
public:
const ValueType* find(typename vector<KeyType>::const_iterator begin, typename vector<KeyType>::const_iterator end) const
{
typename TrieNodeType::KeyMapType::const_iterator citer;
const TrieNodeType* ptNode = _root;
for(typename vector<KeyType>::const_iterator it = begin; it != end; it++)
{
assert(ptNode);
if(NULL == ptNode->ptKeyMap || ptNode->ptKeyMap->end() == (citer = ptNode->ptKeyMap->find(*it)))
{
return NULL;
}
ptNode = citer->second;
}
return ptNode->ptValue;
}
bool find(typename vector<KeyType>::const_iterator begin, typename vector<KeyType> ::const_iterator end, map<typename vector<KeyType>::size_type, const ValueType* >& ordererMap, size_t offset = 0) const
{
const TrieNodeType * ptNode = _root;
typename TrieNodeType::KeyMapType::const_iterator citer;
ordererMap.clear();
for(typename vector<KeyType>::const_iterator itr = begin; itr != end ; itr++)
{
assert(ptNode);
if(NULL == ptNode->ptKeyMap || ptNode->ptKeyMap->end() == (citer = ptNode->ptKeyMap->find(*itr)))
{
break;
}
ptNode = citer->second;
if(ptNode->ptValue)
{
ordererMap[itr - begin + offset] = ptNode->ptValue;
}
}
return !ordererMap.empty();
}
private:
void _createTrie(const vector<vector<KeyType> >& keys, const vector<const ValueType*>& valuePointers)
{
if(valuePointers.empty() || keys.empty())
{
return;
}
assert(keys.size() == valuePointers.size());
for(size_t i = 0; i < keys.size(); i++)
{
_insertNode(keys[i], valuePointers[i]);
}
}
private:
void _insertNode(const vector<KeyType>& key, const ValueType* ptValue)
{
TrieNodeType* ptNode = _root;
typename TrieNodeType::KeyMapType::const_iterator kmIter;
for(typename vector<KeyType>::const_iterator citer = key.begin(); citer != key.end(); citer++)
{
if(NULL == ptNode->ptKeyMap)
{
ptNode->ptKeyMap = new typename TrieNodeType::KeyMapType;
}
kmIter = ptNode->ptKeyMap->find(*citer);
if(ptNode->ptKeyMap->end() == kmIter)
{
TrieNodeType * nextNode = new TrieNodeType;
nextNode->ptKeyMap = NULL;
nextNode->ptValue = NULL;
(*ptNode->ptKeyMap)[*citer] = nextNode;
ptNode = nextNode;
}
else
{
ptNode = kmIter->second;
}
}
ptNode->ptValue = ptValue;
}
void _deleteNode(TrieNodeType* node)
{
if(!node)
{
return;
}
if(node->ptKeyMap)
{
typename TrieNodeType::KeyMapType::iterator it;
for(it = node->ptKeyMap->begin(); it != node->ptKeyMap->end(); it++)
{
_deleteNode(it->second);
}
delete node->ptKeyMap;
}
delete node;
}
};
}
#endif
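
The removed Trie header above offers two lookups: an exact match, and a prefix scan that records every dictionary word starting at the beginning of the input, keyed by the index of its last element. The following is a minimal sketch of both, not the original class. `SimpleTrie`, `Insert`, `Find` and `FindPrefixes` are hypothetical names, keys are plain `std::string` bytes rather than decoded Unicode vectors, and values are stored by copy instead of by pointer.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Simplified stand-in for the Trie shown above (illustrative names only).
class SimpleTrie {
 public:
  // Walks the key one character at a time, creating nodes on demand.
  void Insert(const std::string& key, int value) {
    Node* node = &root_;
    for (char c : key) {
      std::unique_ptr<Node>& child = node->children[c];
      if (!child) {
        child.reset(new Node());
      }
      node = child.get();
    }
    node->has_value = true;
    node->value = value;
  }

  // Exact match: pointer to the stored value, or nullptr if the key is absent.
  const int* Find(const std::string& key) const {
    const Node* node = &root_;
    for (char c : key) {
      auto it = node->children.find(c);
      if (it == node->children.end()) {
        return nullptr;
      }
      node = it->second.get();
    }
    return node->has_value ? &node->value : nullptr;
  }

  // Prefix scan: maps (index of last character + offset) to the value of
  // every stored key that is a prefix of `text`. Stops at the first miss.
  std::map<std::size_t, int> FindPrefixes(const std::string& text,
                                          std::size_t offset = 0) const {
    std::map<std::size_t, int> hits;
    const Node* node = &root_;
    for (std::size_t i = 0; i < text.size(); ++i) {
      auto it = node->children.find(text[i]);
      if (it == node->children.end()) {
        break;
      }
      node = it->second.get();
      if (node->has_value) {
        hits[i + offset] = node->value;
      }
    }
    return hits;
  }

 private:
  struct Node {
    std::map<char, std::unique_ptr<Node>> children;
    bool has_value = false;
    int value = 0;
  };
  Node root_;
};
```

Owning children through `std::unique_ptr` removes the need for a hand-written recursive delete such as `_deleteNode`; that is a design choice of this sketch, not something the original code does.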


@ -1,114 +0,0 @@
#include <iostream>
#include <fstream>
#include <cstdlib>
#include <cstdio>
#include "Limonp/ArgvContext.hpp"
#include "MPSegment.hpp"
#include "HMMSegment.hpp"
#include "MixSegment.hpp"
#include "FullSegment.hpp"
#include "QuerySegment.hpp"
using namespace CppJieba;
void cut(const ISegment * seg, const char * const filePath)
{
ifstream ifile(filePath);
vector<string> res;
string line;
while(getline(ifile, line))
{
if(!line.empty())
{
cout << line << endl;
res.clear();
if(!seg->cut(line, res))
{
LogError("seg cut failed.");
}
else
{
print(join(res.begin(), res.end(), "/"));
}
}
}
}
int main(int argc, char ** argv)
{
if(argc < 2)
{
cout<<"usage: \n\t"<<argv[0]<<" [options] <filename>\n"
<<"options:\n"
<<"\t--algorithm\tSupported methods are [cutDAG, cutHMM, cutFull, cutQuery, cutMix] for now. \n\t\t\tIf not specified, the default is cutMix\n"
<<"\t--dictpath\tsee example\n"
<<"\t--modelpath\tsee example\n"
<<"\t--maxlen\tspecify the granularity of cut used in cutQuery. \n\t\t\tIf not specified, the default is 3\n"
<<"example:\n"
<<"\t"<<argv[0]<<" ../test/testdata/testlines.utf8 --dictpath ../dict/jieba.dict.utf8 --algorithm cutDAG\n"
<<"\t"<<argv[0]<<" ../test/testdata/testlines.utf8 --dictpath ../dict/jieba.dict.utf8 --algorithm cutFull\n"
<<"\t"<<argv[0]<<" ../test/testdata/testlines.utf8 --modelpath ../dict/hmm_model.utf8 --algorithm cutHMM\n"
<<"\t"<<argv[0]<<" ../test/testdata/testlines.utf8 --dictpath ../dict/jieba.dict.utf8 --modelpath ../dict/hmm_model.utf8 --algorithm cutMix\n"
<<"\t"<<argv[0]<<" ../test/testdata/testlines.utf8 --dictpath ../dict/jieba.dict.utf8 --modelpath ../dict/hmm_model.utf8 --algorithm cutQuery --maxlen 3\n"
<<endl;
return EXIT_FAILURE;
}
ArgvContext arg(argc, argv);
string dictPath = arg["--dictpath"];
string modelPath = arg["--modelpath"];
string algorithm = arg["--algorithm"];
int maxLen = atoi(arg["--maxlen"] == "" ? "3" : arg["--maxlen"].c_str());
if("cutHMM" == algorithm)
{
HMMSegment seg(modelPath.c_str());
if(!seg)
{
cout<<"seg init failed."<<endl;
return EXIT_FAILURE;
}
cut(&seg, arg[1].c_str());
}
else if("cutDAG" == algorithm)
{
MPSegment seg(dictPath.c_str());
if(!seg)
{
cout<<"seg init failed."<<endl;
return EXIT_FAILURE;
}
cut(&seg, arg[1].c_str());
}
else if ("cutFull" == algorithm)
{
FullSegment seg(dictPath.c_str());
if (!seg)
{
cout << "seg init failed" << endl;
return EXIT_FAILURE;
}
cut(&seg, arg[1].c_str());
}
else if ("cutQuery" == algorithm)
{
QuerySegment seg(dictPath.c_str(), modelPath.c_str(), maxLen);
if (!seg)
{
cout << "seg init failed" << endl;
return EXIT_FAILURE;
}
cut(&seg, arg[1].c_str());
}
else
{
MixSegment seg(dictPath.c_str(), modelPath.c_str());
if(!seg)
{
cout<<"seg init failed."<<endl;
return EXIT_FAILURE;
}
cut(&seg, arg[1].c_str());
}
return EXIT_SUCCESS;
}


@ -1,95 +0,0 @@
#include <unistd.h>
#include <algorithm>
#include <string>
#include <ctype.h>
#include <string.h>
#include "Limonp/Config.hpp"
#include "Limonp/io_functs.hpp"
#include "Husky/EpollServer.hpp"
#include "MPSegment.hpp"
#include "HMMSegment.hpp"
#include "MixSegment.hpp"
using namespace Husky;
using namespace CppJieba;
class ReqHandler: public IRequestHandler
{
public:
ReqHandler(const string& dictPath, const string& modelPath): _segment(dictPath, modelPath){};
virtual ~ReqHandler(){};
public:
virtual bool do_GET(const HttpReqInfo& httpReq, string& strSnd) const
{
string sentence, tmp;
vector<string> words;
httpReq.GET("key", tmp);
URLDecode(tmp, sentence);
_segment.cut(sentence, words);
if(httpReq.GET("format", tmp) && tmp == "simple")
{
join(words.begin(), words.end(), strSnd, " ");
return true;
}
strSnd << words;
return true;
}
virtual bool do_POST(const HttpReqInfo& httpReq, string& strSnd) const
{
vector<string> words;
_segment.cut(httpReq.getBody(), words);
strSnd << words;
return true;
}
private:
MixSegment _segment;
};
bool run(int argc, char** argv)
{
if(argc < 2)
{
return false;
}
Config conf(argv[1]);
if(!conf)
{
return false;
}
unsigned int port = 0;
string dictPath;
string modelPath;
string val;
if(!conf.get("port", val))
{
LogFatal("conf get port failed.");
return false;
}
port = atoi(val.c_str());
if(!conf.get("dict_path", dictPath))
{
LogFatal("conf get dict_path failed.");
return false;
}
if(!conf.get("model_path", modelPath))
{
LogFatal("conf get model_path failed.");
return false;
}
ReqHandler reqHandler(dictPath, modelPath);
EpollServer sf(port, &reqHandler);
return sf.start();
}
int main(int argc, char* argv[])
{
if(!run(argc, argv))
{
printf("usage: %s <config_file>\n", argv[0]);
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}


@ -1,7 +1,12 @@
SET(EXECUTABLE_OUTPUT_PATH ${PROJECT_BINARY_DIR})
ADD_EXECUTABLE(segment.demo segment.cpp)
ADD_EXECUTABLE(keyword.demo keyword_demo.cpp)
ADD_EXECUTABLE(tagging.demo tagging_demo.cpp)
# Configure test paths
configure_file("${CMAKE_CURRENT_SOURCE_DIR}/test_paths.h.in" "${CMAKE_BINARY_DIR}/test/test_paths.h")
INCLUDE_DIRECTORIES(
${CMAKE_CURRENT_BINARY_DIR}
${CMAKE_BINARY_DIR}/test
)
ADD_EXECUTABLE(load_test load_test.cpp)
ADD_SUBDIRECTORY(unittest)
ADD_SUBDIRECTORY(unittest)


@ -1,13 +0,0 @@
#include "../src/KeywordExtractor.hpp"
using namespace CppJieba;
int main(int argc, char ** argv)
{
KeywordExtractor extractor("../dict/jieba.dict.utf8", "../dict/hmm_model.utf8", "../dict/idf.utf8", "../dict/stop_words.utf8");
string s("我是蓝翔技工拖拉机学院手扶拖拉机专业的。不用多久我就会升职加薪当上总经理出任CEO迎娶白富美走上人生巅峰。");
vector<pair<string, double> > wordweights;
size_t topN = 5;
extractor.extract(s, wordweights, topN);
cout<< s << "\n -> \n" << wordweights << endl;
return EXIT_SUCCESS;
}


@ -1,47 +1,58 @@
#include <iostream>
#include <ctime>
#include <fstream>
#include "../src/Limonp/ArgvContext.hpp"
#include "../src/Limonp/io_functs.hpp"
#include "../src/MPSegment.hpp"
#include "../src/HMMSegment.hpp"
#include "../src/MixSegment.hpp"
#include "cppjieba/MPSegment.hpp"
#include "cppjieba/HMMSegment.hpp"
#include "cppjieba/MixSegment.hpp"
#include "cppjieba/KeywordExtractor.hpp"
#include "limonp/Colors.hpp"
#include "test_paths.h"
using namespace CppJieba;
using namespace cppjieba;
void cut(const ISegment * seg, const char * const filePath, size_t times = 30)
{
ifstream ifile(filePath);
if(!ifile)
{
LogFatal("open file[%s] failed.", filePath);
return;
}
LogInfo("open file[%s].", filePath);
vector<string> res;
string doc;
loadFile2Str(filePath, doc);
for(uint i = 0; i < times; i ++)
{
printf("process [%3.0lf %%]\r", 100.0*(i+1)/times);
fflush(stdout);
res.clear();
seg->cut(doc, res);
}
printf("\n");
void Cut(size_t times = 50) {
MixSegment seg(DICT_DIR "/jieba.dict.utf8", DICT_DIR "/hmm_model.utf8");
vector<string> res;
string doc;
ifstream ifs(TEST_DATA_DIR "/weicheng.utf8");
assert(ifs);
doc << ifs;
long beginTime = clock();
for (size_t i = 0; i < times; i ++) {
printf("process [%3.0lf %%]\r", 100.0*(i+1)/times);
fflush(stdout);
res.clear();
seg.Cut(doc, res);
}
printf("\n");
long endTime = clock();
ColorPrintln(GREEN, "Cut: [%.3lf seconds]time consumed.", double(endTime - beginTime)/CLOCKS_PER_SEC);
}
int main(int argc, char ** argv)
{
MixSegment seg("../dict/jieba.dict.utf8", "../dict/hmm_model.utf8");
if(!seg)
{
cout<<"seg init failed."<<endl;
return EXIT_FAILURE;
}
long beginTime = clock();
cut(&seg, "../test/testdata/weicheng.utf8");
long endTime = clock();
printf("[%.3lf seconds]time consumed.\n", double(endTime - beginTime)/CLOCKS_PER_SEC);
return EXIT_SUCCESS;
void Extract(size_t times = 400) {
KeywordExtractor Extractor(DICT_DIR "/jieba.dict.utf8",
DICT_DIR "/hmm_model.utf8",
DICT_DIR "/idf.utf8",
DICT_DIR "/stop_words.utf8");
vector<string> words;
string doc;
ifstream ifs(TEST_DATA_DIR "/review.100");
assert(ifs);
doc << ifs;
long beginTime = clock();
for (size_t i = 0; i < times; i ++) {
printf("process [%3.0lf %%]\r", 100.0*(i+1)/times);
fflush(stdout);
words.clear();
Extractor.Extract(doc, words, 5);
}
printf("\n");
long endTime = clock();
ColorPrintln(GREEN, "Extract: [%.3lf seconds]time consumed.", double(endTime - beginTime)/CLOCKS_PER_SEC);
}
int main(int argc, char ** argv) {
Cut();
Extract();
return EXIT_SUCCESS;
}


@ -1,60 +0,0 @@
#include <iostream>
#include <fstream>
#include "../src/MPSegment.hpp"
#include "../src/HMMSegment.hpp"
#include "../src/MixSegment.hpp"
using namespace CppJieba;
void cut(const ISegment * seg, const char * const filePath)
{
ifstream ifile(filePath);
vector<string> res;
string line;
while(getline(ifile, line))
{
if(!line.empty())
{
res.clear();
seg->cut(line, res);
cout<<join(res.begin(), res.end(),"/")<<endl;
}
}
}
const char * const TEST_FILE = "../test/testdata/testlines.utf8";
const char * const JIEBA_DICT_FILE = "../dict/jieba.dict.utf8";
const char * const HMM_DICT_FILE = "../dict/hmm_model.utf8";
int main(int argc, char ** argv)
{
//demo
{
HMMSegment seg(HMM_DICT_FILE);
if(!seg)
{
cout<<"seg init failed."<<endl;
return EXIT_FAILURE;
}
cut(&seg, TEST_FILE);
}
{
MixSegment seg(JIEBA_DICT_FILE, HMM_DICT_FILE);
if(!seg)
{
cout<<"seg init failed."<<endl;
return EXIT_FAILURE;
}
cut(&seg, TEST_FILE);
}
{
MPSegment seg(JIEBA_DICT_FILE);
if(!seg)
{
cout<<"seg init failed."<<endl;
return EXIT_FAILURE;
}
cut(&seg, TEST_FILE);
}
return EXIT_SUCCESS;
}


@ -1,58 +0,0 @@
#include <CppJieba/Husky/ServerFrame.h>
#include <CppJieba/Husky/Daemon.h>
#include <CppJieba/Limonp/ArgvContext.hpp>
#include <CppJieba/MPSegment.h>
#include <CppJieba/HMMSegment.h>
#include <CppJieba/MixSegment.h>
using namespace Husky;
using namespace CppJieba;
const char * const DEFAULT_DICTPATH = "../dict/jieba.dict.utf8";
const char * const DEFAULT_MODELPATH = "../dict/hmm_model.utf8";
class ServerDemo: public IRequestHandler
{
public:
ServerDemo(){};
virtual ~ServerDemo(){};
virtual bool init(){return _segment.init(DEFAULT_DICTPATH, DEFAULT_MODELPATH);};
virtual bool dispose(){return _segment.dispose();};
public:
virtual bool do_GET(const HttpReqInfo& httpReq, string& strSnd)
{
string sentence, tmp;
vector<string> words;
httpReq.GET("key", tmp);
URLDecode(tmp, sentence);
_segment.cut(sentence, words);
strSnd << words;
return true;
}
private:
MixSegment _segment;
};
int main(int argc,char* argv[])
{
if(argc != 7)
{
printf("usage: %s -n THREAD_NUMBER -p LISTEN_PORT -k start|stop\n",argv[0]);
return -1;
}
ArgvContext arg(argc, argv);
unsigned int port = atoi(arg["-p"].c_str());
unsigned int threadNum = atoi(arg["-n"].c_str());
ServerDemo s;
Daemon daemon(&s);
if(arg["-k"] == "start")
{
return !daemon.Start(port, threadNum);
}
else
{
return !daemon.Stop();
}
}


@ -1,91 +0,0 @@
#!/usr/bin/python
# coding:utf-8
import time
import urllib2
import threading
from Queue import Queue
from time import sleep
import sys
# Performance test page
#PERF_TEST_URL = "http://10.2.66.38/?yyid=-1&suv=1309231700203264&callback=xxxxx"
URLS = [line for line in open("../testdata/load_test.urls", "r")]
# Configuration: stress test
THREAD_NUM = 10 # total number of concurrent threads
ONE_WORKER_NUM = 500 # loop iterations per thread
LOOP_SLEEP = 0.01 # interval between requests (seconds)
# Configuration: simulated normal operation
#THREAD_NUM = 10 # total number of concurrent threads
#ONE_WORKER_NUM = 10 # loop iterations per thread
#LOOP_SLEEP = 0 # interval between requests (seconds)
# Error count
ERROR_NUM = 0
# Handler function: processes a single task
def doWork(index, url):
    t = threading.currentThread()
    #print "["+t.name+" "+str(index)+"] "+PERF_TEST_URL
    try:
        html = urllib2.urlopen(url).read()
    except urllib2.URLError, e:
        print "["+t.name+" "+str(index)+"] "
        print e
        global ERROR_NUM
        ERROR_NUM += 1
# Worker: repeatedly picks the next URL and processes it
def working():
    t = threading.currentThread()
    print "["+t.name+"] Sub Thread Begin"
    i = 0
    while i < ONE_WORKER_NUM:
        i += 1
        doWork(i, URLS[i % len(URLS)])
        sleep(LOOP_SLEEP)
    print "["+t.name+"] Sub Thread End"
def main():
    #doWork(0)
    #return
    t1 = time.time()
    Threads = []
    # Create the threads
    for i in range(THREAD_NUM):
        t = threading.Thread(target=working, name="T"+str(i))
        t.setDaemon(True)
        Threads.append(t)
    for t in Threads:
        t.start()
    for t in Threads:
        t.join()
    print "main thread end"
    t2 = time.time()
    print "========================================"
    #print "URL:", PERF_TEST_URL
    print "Number of tasks:", THREAD_NUM, "*", ONE_WORKER_NUM, "=", THREAD_NUM*ONE_WORKER_NUM
    print "Total time (s):", t2-t1
    print "Time per request (s):", (t2-t1) / (THREAD_NUM*ONE_WORKER_NUM)
    print "Requests handled per second:", 1 / ((t2-t1) / (THREAD_NUM*ONE_WORKER_NUM))
    print "Error count:", ERROR_NUM
if __name__ == "__main__":
    main()


@ -1,11 +0,0 @@
CURL_RES=../testdata/curl.res
TMP=curl.res.tmp
curl -s "http://127.0.0.1:11200/?key=南京市长江大桥" >> $TMP
if diff $TMP $CURL_RES >> /dev/null
then
echo "ok";
else
echo "failed."
fi
rm $TMP


@ -1,12 +0,0 @@
#include "../src/PosTagger.hpp"
using namespace CppJieba;
int main(int argc, char ** argv)
{
PosTagger tagger("../dict/jieba.dict.utf8", "../dict/hmm_model.utf8", "", "", "", "", "");
string s("我是蓝翔技工拖拉机学院手扶拖拉机专业的。不用多久我就会升职加薪当上总经理出任CEO迎娶白富美走上人生巅峰。");
vector<pair<string, string> > res;
tagger.tag(s, res);
cout << res << endl;
return EXIT_SUCCESS;
}

test/test_paths.h.in Normal file

@ -0,0 +1,7 @@
#ifndef TEST_PATHS_H
#define TEST_PATHS_H
#define TEST_DATA_DIR "@CMAKE_CURRENT_SOURCE_DIR@/testdata"
#define DICT_DIR "@CMAKE_SOURCE_DIR@/dict"
#endif // TEST_PATHS_H
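
The macros in this template are meant to be pasted directly in front of another string literal, as in `DICT_DIR "/jieba.dict.utf8"` in load_test.cpp; the compiler joins adjacent literals into one string. A minimal sketch of that mechanism follows. The two directory values are placeholders chosen for illustration (in the real build `configure_file` substitutes the actual CMake directories), and `DictPath` and `TestDataPath` are hypothetical helper names.

```cpp
#include <string>

// Placeholder values for illustration only; the generated test_paths.h
// contains the real absolute paths filled in by CMake.
#define TEST_DATA_DIR "/path/to/cppjieba/test/testdata"
#define DICT_DIR "/path/to/cppjieba/dict"

// Adjacent string literals are concatenated at compile time, so
// DICT_DIR "/jieba.dict.utf8" is a single literal.
inline std::string DictPath() {
  return DICT_DIR "/jieba.dict.utf8";
}

inline std::string TestDataPath() {
  return TEST_DATA_DIR "/weicheng.utf8";
}
```

Because the concatenation happens between literals, this only works while the macros expand to quoted strings; a `std::string` variable would need `+` instead.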


@ -1 +1,2 @@
http://127.0.0.1:11200/?key=南京市长江大桥
http://127.0.0.1:11200/?key=长春市长春药店


@ -1,169 +1,169 @@
标&#12288;&#12288;签:保湿还不错比商场便宜补水效果好乳液很好用是正品心&#12288;&#12288;得:感觉还蛮好吸收的,不错啦
["标", "&#12288;&#12288;", "签", "", "保湿", "还", "不错", "比", "商场", "便宜", "补水", "效果", "好", "乳液", "很", "好", "用", "是", "正品", "心", "&#12288;&#12288;", "得", "", "感觉", "还", "蛮", "好", "吸收", "的", "", "不错", "啦"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "保湿", "还", "不错", "比", "商场", "便宜", "补水", "效果", "好", "乳液", "很", "好", "用", "是", "正品", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "感觉", "还", "蛮", "好", "吸收", "的", "", "不错", "啦"]
标&#12288;&#12288;签:还可以心&#12288;&#12288;得:不错~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
["标", "&#12288;&#12288;", "签", "", "还", "可以", "心", "&#12288;&#12288;", "得", "", "不错", "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "还", "可以", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "不错", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~"]
标&#12288;&#12288;签:是正品心&#12288;&#12288;得:下次我还要咋京东这里买不错
["标", "&#12288;&#12288;", "签", "", "是", "正品", "心", "&#12288;&#12288;", "得", "", "下次", "我", "还要", "咋", "京东", "这里", "买", "不错"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "是", "正品", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "下次", "我", "还要", "咋", "京东", "这里", "买", "不错"]
标&#12288;&#12288;签:挺保湿的心&#12288;&#12288;得:价格实惠,适合夏天用,很轻薄
["标", "&#12288;&#12288;", "签", "", "挺", "保湿", "的", "心", "&#12288;&#12288;", "得", "", "价格", "实惠", "", "适合", "夏天", "用", "", "很", "轻薄"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "挺", "保湿", "的", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "价格", "实惠", "", "适合", "夏天", "用", "", "很", "轻薄"]
标&#12288;&#12288;签:皮肤滑滑的味道不错挺保湿的很好用物流速度快心&#12288;&#12288;得:使用的挺好的一直用着这个的
["标", "&#12288;&#12288;", "签", "", "皮肤", "滑", "滑", "的", "味道", "不错", "挺", "保湿", "的", "很", "好", "用", "物流", "速度", "快", "心", "&#12288;&#12288;", "得", "", "使用", "的", "挺", "好", "的", "一直", "用", "着", "这个", "的"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "皮肤", "滑", "滑", "的", "味道", "不错", "挺", "保湿", "的", "很", "好", "用", "物流", "速度", "快", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "使用", "的", "挺", "好", "的", "一直", "用", "着", "这个", "的"]
标&#12288;&#12288;签:价格实惠比商场便宜心&#12288;&#12288;得:不错不错,活动买的很划算
["标", "&#12288;&#12288;", "签", "", "价格", "实惠", "比", "商场", "便宜", "心", "&#12288;&#12288;", "得", "", "不错", "不错", "", "活动", "买", "的", "很", "划算"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "价格", "实惠", "比", "商场", "便宜", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "不错", "不错", "", "活动", "买", "的", "很", "划算"]
标&#12288;&#12288;签:吸收快品牌好是正品挺保湿的心&#12288;&#12288;得一直使用3年值得信赖好用
["标", "&#12288;&#12288;", "签", "", "吸收", "快", "品牌", "好", "是", "正品", "挺", "保湿", "的", "心", "&#12288;&#12288;", "得", "", "一直", "使用", "3", "年", "", "值得", "信赖", "", "好", "用"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "吸收", "快", "品牌", "好", "是", "正品", "挺", "保湿", "的", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "一直", "使用", "3", "年", "", "值得", "信赖", "", "好", "用"]
标&#12288;&#12288;签:是正品皮肤滑滑的补水效果好乳液很好用心&#12288;&#12288;得:不错不错老婆很喜欢我值
["标", "&#12288;&#12288;", "签", "", "是", "正品", "皮肤", "滑", "滑", "的", "补水", "效果", "好", "乳液", "很", "好", "用心", "&#12288;&#12288;", "得", "", "不错", "不错", "老婆", "很", "喜欢", "我", "值"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "是", "正品", "皮肤", "滑", "滑", "的", "补水", "效果", "好", "乳液", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "不错", "不错", "老婆", "很", "喜欢", "我", "值"]
标&#12288;&#12288;签:保湿还不错心&#12288;&#12288;得:挺好的。。。。。。。。。。
["标", "&#12288;&#12288;", "签", "", "保湿", "还", "不错", "心", "&#12288;&#12288;", "得", "", "挺", "好", "的", "。", "。", "。", "。", "。", "。", "。", "。", "。", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "保湿", "还", "不错", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "挺", "好", "的", "。", "。", "。", "。", "。", "。", "。", "。", "。", "。"]
标&#12288;&#12288;签:是正品很好用心&#12288;&#12288;得:一直在京东买,可以信赖
["标", "&#12288;&#12288;", "签", "", "是", "正品", "很", "好", "用心", "&#12288;&#12288;", "得", "", "一直", "在", "京东", "买", "", "可以", "信赖"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "是", "正品", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "一直", "在", "京东", "买", "", "可以", "信赖"]
标&#12288;&#12288;签:是正品挺保湿的效果不错心&#12288;&#12288;得:送货快!是正品,大品牌的用的放心!
["标", "&#12288;&#12288;", "签", "", "是", "正品", "挺", "保湿", "的", "效果", "不错", "心", "&#12288;&#12288;", "得", "", "送货", "快", "", "是", "正品", "", "大", "品牌", "的", "用", "的", "放心", ""]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "是", "正品", "挺", "保湿", "的", "效果", "不错", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "送货", "快", "", "是", "正品", "", "大", "品牌", "的", "用", "的", "放心", ""]
标&#12288;&#12288;签:乳液很好用心&#12288;&#12288;得:很好的东东,下次还会买
["标", "&#12288;&#12288;", "签", "", "乳液", "很", "好", "用心", "&#12288;&#12288;", "得", "", "很", "好", "的", "东东", "", "下次", "还", "会", "买"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "乳液", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "很", "好", "的", "东东", "", "下次", "还", "会", "买"]
心&#12288;&#12288;得:送同学的,希望她喜欢
["心", "&#12288;&#12288;", "得", "", "送", "同学", "的", "", "希望", "她", "喜欢"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "送", "同学", "的", "", "希望", "她", "喜欢"]
标&#12288;&#12288;签:价格实惠心&#12288;&#12288;得:一直用,还可以吧,性价比高
["标", "&#12288;&#12288;", "签", "", "价格", "实惠", "心", "&#12288;&#12288;", "得", "", "一直", "用", "", "还", "可以", "吧", "", "性价比", "高"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "价格", "实惠", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "一直", "用", "", "还", "可以", "吧", "", "性价比", "高"]
心&#12288;&#12288;得:不错够速度,效果也不错,希望大家用着也一样,顶顶顶
["心", "&#12288;&#12288;", "得", "", "不错", "够", "速度", "", "效果", "也", "不错", "", "希望", "大家", "用", "着", "也", "一样", "", "顶", "顶", "顶"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "不错", "够", "速度", "", "效果", "也", "不错", "", "希望", "大家", "用", "着", "也", "一样", "", "顶", "顶", "顶"]
标&#12288;&#12288;签:挺保湿的心&#12288;&#12288;得:用着还不错。挺好的。
["标", "&#12288;&#12288;", "签", "", "挺", "保湿", "的", "心", "&#12288;&#12288;", "得", "", "用", "着", "还", "不错", "。", "挺", "好", "的", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "挺", "保湿", "的", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "用", "着", "还", "不错", "。", "挺", "好", "的", "。"]
优&#12288;&#12288;点:东西很好哦!不&#12288;&#12288;足:暂时还没有发现缺点哦!心&#12288;&#12288;得:很好,也很划算
["优", "&#12288;&#12288;", "点", "", "东西", "很", "好", "哦", "!", "不", "&#12288;&#12288;", "足", "", "暂时", "还", "没有", "发现", "缺点", "哦", "", "心", "&#12288;&#12288;", "得", "", "很", "好", "", "也", "很", "划算"]
["优", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "点", "", "东西", "很", "好", "哦", "!", "不", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "足", "", "暂时", "还", "没有", "发现", "缺点", "哦", "", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "很", "好", "", "也", "很", "划算"]
标&#12288;&#12288;签:脸上很舒服是正品心&#12288;&#12288;得:哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈和
["标", "&#12288;&#12288;", "签", "", "脸上", "很", "舒服", "是", "正品", "心", "&#12288;&#12288;", "得", "", "哈哈哈", "哈哈哈", "哈哈哈", "哈哈哈", "哈哈哈", "和"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "脸上", "很", "舒服", "是", "正品", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "哈哈哈", "哈哈哈", "哈哈哈", "哈哈哈", "哈哈哈", "和"]
优&#12288;&#12288;点:用了一下,感觉还不错不&#12288;&#12288;足:暂时还没有发现缺点哦!心&#12288;&#12288;得:用了一下,还可以
["优", "&#12288;&#12288;", "点", "", "用", "了", "一下", "", "感觉", "还", "不错", "不", "&#12288;&#12288;", "足", "", "暂时", "还", "没有", "发现", "缺点", "哦", "", "心", "&#12288;&#12288;", "得", "", "用", "了", "一下", "", "还", "可以"]
["优", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "点", "", "用", "了", "一下", "", "感觉", "还", "不错", "不", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "足", "", "暂时", "还", "没有", "发现", "缺点", "哦", "", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "用", "了", "一下", "", "还", "可以"]
标&#12288;&#12288;签:品牌好心&#12288;&#12288;得:东西还行,就是线太少了
["标", "&#12288;&#12288;", "签", "", "品牌", "好心", "&#12288;&#12288;", "得", "", "东西", "还", "行", "", "就是", "线", "太", "少", "了"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "品牌", "好心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "东西", "还", "行", "", "就是", "线", "太", "少", "了"]
标&#12288;&#12288;签:还可以老婆买的心&#12288;&#12288;得:代买的,据说还不错,搞优惠屯着。
["标", "&#12288;&#12288;", "签", "", "还", "可以", "老婆", "买", "的", "心", "&#12288;&#12288;", "得", "", "代", "买", "的", "", "据说", "还", "不错", "", "搞", "优惠", "屯", "着", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "还", "可以", "老婆", "买", "的", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "代", "买", "的", "", "据说", "还", "不错", "", "搞", "优惠", "屯", "着", "。"]
标&#12288;&#12288;签:保湿还不错很好用心&#12288;&#12288;得:一直在用这个,现在继续。
["标", "&#12288;&#12288;", "签", "", "保湿", "还", "不错", "很", "好", "用心", "&#12288;&#12288;", "得", "", "一直", "在", "用", "这个", "", "现在", "继续", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "保湿", "还", "不错", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "一直", "在", "用", "这个", "", "现在", "继续", "。"]
标&#12288;&#12288;签:很好用心&#12288;&#12288;得:正品,方便好用,比店里便宜
["标", "&#12288;&#12288;", "签", "", "很", "好", "用心", "&#12288;&#12288;", "得", "", "正品", "", "方便", "好", "用", "", "比", "店里", "便宜"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "正品", "", "方便", "好", "用", "", "比", "店里", "便宜"]
标&#12288;&#12288;签:保湿还不错妈妈买的比商场便宜挺保湿的吸收快心&#12288;&#12288;得:可以先去专柜试试~然后再京东上购买,由京东的发票,还是比较放心的~
["标", "&#12288;&#12288;", "签", "", "保湿", "还", "不错", "妈妈", "买", "的", "比", "商场", "便宜", "挺", "保湿", "的", "吸收", "快", "心", "&#12288;&#12288;", "得", "", "可以", "先", "去", "专柜", "试试", "~", "然后", "再", "京东", "上", "购买", "", "由", "京东", "的", "发票", "", "还是", "比较", "放心", "的", "~"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "保湿", "还", "不错", "妈妈", "买", "的", "比", "商场", "便宜", "挺", "保湿", "的", "吸收", "快", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "可以", "先", "去", "专柜", "试试", "~", "然后", "再", "京东", "上", "购买", "", "由", "京东", "的", "发票", "", "还是", "比较", "放心", "的", "~"]
心&#12288;&#12288;得:很好很滋润又不油
["心", "&#12288;&#12288;", "得", "", "很", "好", "很", "滋润", "又", "不", "油"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "很", "好", "很", "滋润", "又", "不", "油"]
标&#12288;&#12288;签:吸收快脸上很舒服保湿还不错很好用比商场便宜心&#12288;&#12288;得用过几瓶了http://club.jd.com/JdVote/TradeComment.aspx?ruleid=586763684&ot=0#none感觉很不错不油腻吸收快还保湿。
["标", "&#12288;&#12288;", "签", "", "吸收", "快", "脸上", "很", "舒服", "保湿", "还", "不错", "很", "好", "用", "比", "商场", "便宜", "心", "&#12288;&#12288;", "得", "", "用", "过", "几瓶", "了", "http://club.jd.com/JdVote/TradeComment.aspx?ruleid=586763684&ot=0#none", "", "感觉", "很", "不错", "", "不", "油腻", "", "吸收", "快", "", "还", "保湿", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "吸收", "快", "脸上", "很", "舒服", "保湿", "还", "不错", "很", "好", "用", "比", "商场", "便宜", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "用", "过", "几瓶", "了", "h", "t", "t", "p", ":", "/", "/", "c", "l", "u", "b", ".", "j", "d", ".", "c", "o", "m", "/", "J", "d", "V", "o", "t", "e", "/", "T", "r", "a", "d", "e", "C", "o", "m", "m", "e", "n", "t", ".", "a", "s", "p", "x", "?", "r", "u", "l", "e", "i", "d", "=", "5", "8", "6", "7", "6", "3", "6", "8", "4", "&", "o", "t", "=", "0", "#", "n", "o", "n", "e", "", "感觉", "很", "不错", "", "不", "油腻", "", "吸收", "快", "", "还", "保湿", "。"]
标&#12288;&#12288;签:还可以心&#12288;&#12288;得:一般吧,还没怎么用。现在不知道效果。
["标", "&#12288;&#12288;", "签", "", "还", "可以", "心", "&#12288;&#12288;", "得", "", "一般", "吧", "", "还", "没", "怎么", "用", "。", "现在", "不", "知道", "效果", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "还", "可以", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "一般", "吧", "", "还", "没", "怎么", "用", "。", "现在", "不", "知道", "效果", "。"]
标&#12288;&#12288;签:还可以心&#12288;&#12288;得:东西很不错。很好。很喜欢!
["标", "&#12288;&#12288;", "签", "", "还", "可以", "心", "&#12288;&#12288;", "得", "", "东西", "很", "不错", "。", "很", "好", "。", "很", "喜欢", ""]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "还", "可以", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "东西", "很", "不错", "。", "很", "好", "。", "很", "喜欢", ""]
标&#12288;&#12288;签:比商场便宜价格实惠心&#12288;&#12288;得:一直都在用,没有刺激,很舒服,价格合适
["标", "&#12288;&#12288;", "签", "", "比", "商场", "便宜", "价格", "实惠", "心", "&#12288;&#12288;", "得", "", "一直", "都", "在", "用", "", "没有", "刺激", "", "很", "舒服", "", "价格", "合适"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "比", "商场", "便宜", "价格", "实惠", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "一直", "都", "在", "用", "", "没有", "刺激", "", "很", "舒服", "", "价格", "合适"]
标&#12288;&#12288;签:包装好服务好比商场便宜皮肤滑滑的很好用心&#12288;&#12288;得:送货速度也很快!非常好,质量不错,推荐购买!包装很好!
["标", "&#12288;&#12288;", "签", "", "包装", "好", "服务", "好比", "商场", "便宜", "皮肤", "滑", "滑", "的", "很", "好", "用心", "&#12288;&#12288;", "得", "", "送货", "速度", "也", "很快", "", "非常", "好", "", "质量", "不错", "", "推荐", "购买", "", "包装", "很", "好", ""]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "包装", "好", "服务", "好比", "商场", "便宜", "皮肤", "滑", "滑", "的", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "送货", "速度", "也", "很快", "", "非常", "好", "", "质量", "不错", "", "推荐", "购买", "", "包装", "很", "好", ""]
标&#12288;&#12288;签:吸收快服务好心&#12288;&#12288;得:质量不错,值得信赖,网购上京东,放心又轻松!
["标", "&#12288;&#12288;", "签", "", "吸收", "快", "服务", "好心", "&#12288;&#12288;", "得", "", "质量", "不错", "", "值得", "信赖", "", "网", "购", "上", "京东", "", "放心", "又", "轻松", ""]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "吸收", "快", "服务", "好心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "质量", "不错", "", "值得", "信赖", "", "网", "购", "上", "京东", "", "放心", "又", "轻松", ""]
标&#12288;&#12288;签:味道不错吸收快心&#12288;&#12288;得:不油腻,味道也不错,美白效果嘛暂时没有,毕竟只用了几次而已。
["标", "&#12288;&#12288;", "签", "", "味道", "不错", "吸收", "快", "心", "&#12288;&#12288;", "得", "", "不", "油腻", "", "味道", "也", "不错", "", "美", "白", "效果", "嘛", "暂时", "没有", "", "毕竟", "只用", "了", "几次", "而已", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "味道", "不错", "吸收", "快", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "不", "油腻", "", "味道", "也", "不错", "", "美", "白", "效果", "嘛", "暂时", "没有", "", "毕竟", "只用", "了", "几次", "而已", "。"]
标&#12288;&#12288;签:还可以价格实惠心&#12288;&#12288;得:还不错,促销活动买的.........
["标", "&#12288;&#12288;", "签", "", "还", "可以", "价格", "实惠", "心", "&#12288;&#12288;", "得", "", "还", "不错", "", "促销", "活动", "买", "的", "........."]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "还", "可以", "价格", "实惠", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "还", "不错", "", "促销", "活动", "买", "的", ".", ".", ".", ".", ".", ".", ".", ".", "."]
标&#12288;&#12288;签:挺保湿的效果不错脸上很舒服很好用心&#12288;&#12288;得帮朋友买的她觉得非常不错继续关注ZA
["标", "&#12288;&#12288;", "签", "", "挺", "保湿", "的", "效果", "不错", "脸上", "很", "舒服", "很", "好", "用心", "&#12288;&#12288;", "得", "", "帮", "朋友", "买", "的", "", "她", "觉得", "非常", "不错", "", "继续", "关注", "ZA"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "挺", "保湿", "的", "效果", "不错", "脸上", "很", "舒服", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "帮", "朋友", "买", "的", "", "她", "觉得", "非常", "不错", "", "继续", "关注", "Z", "A"]
标&#12288;&#12288;签:乳液很好用心&#12288;&#12288;得:比较清爽,补水效果并不是很好,夏天用用吧
["标", "&#12288;&#12288;", "签", "", "乳液", "很", "好", "用心", "&#12288;&#12288;", "得", "", "比较", "清爽", "", "补水", "效果", "并", "不是", "很", "好", "", "夏天", "用", "用", "吧"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "乳液", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "比较", "清爽", "", "补水", "效果", "并", "不是", "很", "好", "", "夏天", "用", "用", "吧"]
标&#12288;&#12288;签:是正品补水效果好还可以心&#12288;&#12288;得:补水效果不错,很好用
["标", "&#12288;&#12288;", "签", "", "是", "正品", "补水", "效果", "好", "还", "可以", "心", "&#12288;&#12288;", "得", "", "补水", "效果", "不错", "", "很", "好", "用"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "是", "正品", "补水", "效果", "好", "还", "可以", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "补水", "效果", "不错", "", "很", "好", "用"]
优&#12288;&#12288;点:东西很好哦!不&#12288;&#12288;足:暂时还没有发现缺点哦!心&#12288;&#12288;得:一直在用,信任京东,感觉不错,下次再来。。
["优", "&#12288;&#12288;", "点", "", "东西", "很", "好", "哦", "", "不", "&#12288;&#12288;", "足", "", "暂时", "还", "没有", "发现", "缺点", "哦", "", "心", "&#12288;&#12288;", "得", "", "一直", "在", "用", "", "信任", "京东", "", "感觉", "不错", "", "下次", "再", "来", "。", "。"]
["优", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "点", "", "东西", "很", "好", "哦", "", "不", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "足", "", "暂时", "还", "没有", "发现", "缺点", "哦", "", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "一直", "在", "用", "", "信任", "京东", "", "感觉", "不错", "", "下次", "再", "来", "。", "。"]
标&#12288;&#12288;签:皮肤滑滑的味道不错价格实惠保湿还不错乳液很好用心&#12288;&#12288;得:用的很好的下次还会购买
["标", "&#12288;&#12288;", "签", "", "皮肤", "滑", "滑", "的", "味道", "不错", "价格", "实惠", "保湿", "还", "不错", "乳液", "很", "好", "用心", "&#12288;&#12288;", "得", "", "用", "的", "很", "好", "的", "下次", "还", "会", "购买"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "皮肤", "滑", "滑", "的", "味道", "不错", "价格", "实惠", "保湿", "还", "不错", "乳液", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "用", "的", "很", "好", "的", "下次", "还", "会", "购买"]
标&#12288;&#12288;签:很好用皮肤滑滑的心&#12288;&#12288;得:好用啊,一如既往的好用
["标", "&#12288;&#12288;", "签", "", "很", "好", "用", "皮肤", "滑", "滑", "的", "心", "&#12288;&#12288;", "得", "", "好", "用", "啊", "", "一如既往", "的", "好", "用"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "很", "好", "用", "皮肤", "滑", "滑", "的", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "好", "用", "啊", "", "一如既往", "的", "好", "用"]
心&#12288;&#12288;得:买了以后就知道不后悔的呢
["心", "&#12288;&#12288;", "得", "", "买", "了", "以后", "就", "知道", "不", "后悔", "的", "呢"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "买", "了", "以后", "就", "知道", "不", "后悔", "的", "呢"]
心&#12288;&#12288;得:非常满意,五星
["心", "&#12288;&#12288;", "得", "", "非常", "满意", "", "五星"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "非常", "满意", "", "五星"]
心&#12288;&#12288;得:非常满意,五星
["心", "&#12288;&#12288;", "得", "", "非常", "满意", "", "五星"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "非常", "满意", "", "五星"]
心&#12288;&#12288;得:非常满意,五星
["心", "&#12288;&#12288;", "得", "", "非常", "满意", "", "五星"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "非常", "满意", "", "五星"]
心&#12288;&#12288;得:宝贝很喜欢,连作业都不肯做,在那儿看呢,呵呵
["心", "&#12288;&#12288;", "得", "", "宝贝", "很", "喜欢", "", "连", "作业", "都", "不肯", "做", "", "在", "那儿", "看", "呢", "", "呵呵"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "宝贝", "很", "喜欢", "", "连", "作业", "都", "不肯", "做", "", "在", "那儿", "看", "呢", "", "呵呵"]
心&#12288;&#12288;得:非常满意,五星
["心", "&#12288;&#12288;", "得", "", "非常", "满意", "", "五星"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "非常", "满意", "", "五星"]
心&#12288;&#12288;得:非常满意,五星
["心", "&#12288;&#12288;", "得", "", "非常", "满意", "", "五星"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "非常", "满意", "", "五星"]
心&#12288;&#12288;得:非常满意,五星
["心", "&#12288;&#12288;", "得", "", "非常", "满意", "", "五星"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "非常", "满意", "", "五星"]
心&#12288;&#12288;得:非常满意,五星
["心", "&#12288;&#12288;", "得", "", "非常", "满意", "", "五星"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "非常", "满意", "", "五星"]
标&#12288;&#12288;签:服务好很好用心&#12288;&#12288;得:不错,正品,还会继续关注
["标", "&#12288;&#12288;", "签", "", "服务", "好", "很", "好", "用心", "&#12288;&#12288;", "得", "", "不错", "", "正品", "", "还", "会", "继续", "关注"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "服务", "好", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "不错", "", "正品", "", "还", "会", "继续", "关注"]
标&#12288;&#12288;签:乳液很好用心&#12288;&#12288;得:比较滋润还不错。。。。。。。。。。
["标", "&#12288;&#12288;", "签", "", "乳液", "很", "好", "用心", "&#12288;&#12288;", "得", "", "比较", "滋润", "还", "不错", "。", "。", "。", "。", "。", "。", "。", "。", "。", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "乳液", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "比较", "滋润", "还", "不错", "。", "。", "。", "。", "。", "。", "。", "。", "。", "。"]
标&#12288;&#12288;签:品牌好心&#12288;&#12288;得:送货快,还没有用,具体效果还不清楚
["标", "&#12288;&#12288;", "签", "", "品牌", "好心", "&#12288;&#12288;", "得", "", "送货", "快", "", "还", "没有", "用", "", "具体", "效果", "还", "不", "清楚"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "品牌", "好心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "送货", "快", "", "还", "没有", "用", "", "具体", "效果", "还", "不", "清楚"]
标&#12288;&#12288;签:很好用心&#12288;&#12288;得:一直用这个,在京东买方便。
["标", "&#12288;&#12288;", "签", "", "很", "好", "用心", "&#12288;&#12288;", "得", "", "一直", "用", "这个", "", "在", "京东", "买", "方便", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "一直", "用", "这个", "", "在", "京东", "买", "方便", "。"]
标&#12288;&#12288;签:保湿还不错包装好脸上很舒服吸收快物流速度快心&#12288;&#12288;得:必须要说的是,这是我老婆自己买的。
["标", "&#12288;&#12288;", "签", "", "保湿", "还", "不错", "包装", "好", "脸上", "很", "舒服", "吸收", "快", "物流", "速度", "快", "心", "&#12288;&#12288;", "得", "", "必须", "要说", "的", "是", "", "这", "是", "我", "老婆", "自己", "买", "的", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "保湿", "还", "不错", "包装", "好", "脸上", "很", "舒服", "吸收", "快", "物流", "速度", "快", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "必须", "要说", "的", "是", "", "这", "是", "我", "老婆", "自己", "买", "的", "。"]
标&#12288;&#12288;签:效果不错心&#12288;&#12288;得:一直用这个存货中**************
["标", "&#12288;&#12288;", "签", "", "效果", "不错", "心", "&#12288;&#12288;", "得", "", "一直", "用", "这个", "存货", "中", "**************"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "效果", "不错", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "一直", "用", "这个", "存货", "中", "*", "*", "*", "*", "*", "*", "*", "*", "*", "*", "*", "*", "*", "*"]
标&#12288;&#12288;签:很好用心&#12288;&#12288;得:还可以,常规的东东。.
["标", "&#12288;&#12288;", "签", "", "很", "好", "用心", "&#12288;&#12288;", "得", "", "还", "可以", "", "常规", "的", "东东", "。", "."]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "还", "可以", "", "常规", "的", "东东", "。", "."]
标&#12288;&#12288;签:包装好乳液很好用补水效果好物流速度快价格实惠心&#12288;&#12288;得:挺好的,脸上不紧绷,舒服
["标", "&#12288;&#12288;", "签", "", "包装", "好", "乳液", "很", "好", "用", "补水", "效果", "好", "物流", "速度", "快", "价格", "实惠", "心", "&#12288;&#12288;", "得", "", "挺", "好", "的", "", "脸上", "不", "紧", "绷", "", "舒服"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "包装", "好", "乳液", "很", "好", "用", "补水", "效果", "好", "物流", "速度", "快", "价格", "实惠", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "挺", "好", "的", "", "脸上", "不", "紧", "绷", "", "舒服"]
标&#12288;&#12288;签:物流速度快价格实惠心&#12288;&#12288;得:应该是正品吧,价格比超市便宜些。正在使用中
["标", "&#12288;&#12288;", "签", "", "物流", "速度", "快", "价格", "实惠", "心", "&#12288;&#12288;", "得", "", "应该", "是", "正品", "吧", "", "价格比", "超市", "便宜", "些", "。", "正在", "使用", "中"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "物流", "速度", "快", "价格", "实惠", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "应该", "是", "正品", "吧", "", "价格比", "超市", "便宜", "些", "。", "正在", "使用", "中"]
标&#12288;&#12288;签:还可以心&#12288;&#12288;得:挺滋润的,价钱也合适!
["标", "&#12288;&#12288;", "签", "", "还", "可以", "心", "&#12288;&#12288;", "得", "", "挺", "滋润", "的", "", "价钱", "也", "合适", ""]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "还", "可以", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "挺", "滋润", "的", "", "价钱", "也", "合适", ""]
标&#12288;&#12288;签:是正品效果不错心&#12288;&#12288;得:用过以后效果挺好的,不错是正品
["标", "&#12288;&#12288;", "签", "", "是", "正品", "效果", "不错", "心", "&#12288;&#12288;", "得", "", "用", "过", "以后", "效果", "挺", "好", "的", "", "不错", "是", "正品"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "是", "正品", "效果", "不错", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "用", "过", "以后", "效果", "挺", "好", "的", "", "不错", "是", "正品"]
标&#12288;&#12288;签:很好用比商场便宜心&#12288;&#12288;得:用这个产品一年了,比较认可。
["标", "&#12288;&#12288;", "签", "", "很", "好", "用", "比", "商场", "便宜", "心", "&#12288;&#12288;", "得", "", "用", "这个", "产品", "一年", "了", "", "比较", "认可", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "很", "好", "用", "比", "商场", "便宜", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "用", "这个", "产品", "一年", "了", "", "比较", "认可", "。"]
标&#12288;&#12288;签:保湿还不错心&#12288;&#12288;得:第一次用乳液,感觉还不错
["标", "&#12288;&#12288;", "签", "", "保湿", "还", "不错", "心", "&#12288;&#12288;", "得", "", "第一次", "用", "乳液", "", "感觉", "还", "不错"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "保湿", "还", "不错", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "第一次", "用", "乳液", "", "感觉", "还", "不错"]
标&#12288;&#12288;签:价格实惠心&#12288;&#12288;得:便宜,东西还行吧,用着不习惯,感觉有酒精
["标", "&#12288;&#12288;", "签", "", "价格", "实惠", "心", "&#12288;&#12288;", "得", "", "便宜", "", "东西", "还", "行", "吧", "", "用", "着", "不", "习惯", "", "感觉", "有", "酒精"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "价格", "实惠", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "便宜", "", "东西", "还", "行", "吧", "", "用", "着", "不", "习惯", "", "感觉", "有", "酒精"]
标&#12288;&#12288;签:价格实惠包装好心&#12288;&#12288;得:看牌子买的,先试着用用看效果
["标", "&#12288;&#12288;", "签", "", "价格", "实惠", "包装", "好心", "&#12288;&#12288;", "得", "", "看", "牌子", "买", "的", "", "先", "试", "着", "用", "用", "看", "效果"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "价格", "实惠", "包装", "好心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "看", "牌子", "买", "的", "", "先", "试", "着", "用", "用", "看", "效果"]
心&#12288;&#12288;得:配套用的不错个人觉得
["心", "&#12288;&#12288;", "得", "", "配套", "用", "的", "不错", "个人", "觉得"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "配套", "用", "的", "不错", "个人", "觉得"]
标&#12288;&#12288;签:味道刺激心&#12288;&#12288;得:不怎么样,用后脸上会起红点
["标", "&#12288;&#12288;", "签", "", "味道", "刺激", "心", "&#12288;&#12288;", "得", "", "不怎么样", "", "用", "后", "脸上", "会", "起", "红", "点"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "味道", "刺激", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "不怎么样", "", "用", "后", "脸上", "会", "起", "红", "点"]
标&#12288;&#12288;签:挺保湿的物流速度快比商场便宜品牌好心&#12288;&#12288;得:正品,平价,比商场便宜,物流很快。
["标", "&#12288;&#12288;", "签", "", "挺", "保湿", "的", "物流", "速度", "快", "比", "商场", "便宜", "品牌", "好心", "&#12288;&#12288;", "得", "", "正品", "", "平价", "", "比", "商场", "便宜", "", "物流", "很快", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "挺", "保湿", "的", "物流", "速度", "快", "比", "商场", "便宜", "品牌", "好心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "正品", "", "平价", "", "比", "商场", "便宜", "", "物流", "很快", "。"]
标&#12288;&#12288;签:服务好心&#12288;&#12288;得还没有使用过就发现YMX只要79元我哭为什么京东价格拼不过YMX呀~~~
["标", "&#12288;&#12288;", "签", "", "服务", "好心", "&#12288;&#12288;", "得", "", "还", "没有", "使用", "过", "", "就", "发现", "YMX", "只要", "79", "元", "", "我", "哭", "", "为什么", "京东", "价格", "拼", "不过", "YMX", "呀", "~~~"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "服务", "好心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "还", "没有", "使用", "过", "", "就", "发现", "Y", "M", "X", "只要", "7", "9", "元", "", "我", "哭", "", "为什么", "京东", "价格", "拼", "不过", "Y", "M", "X", "呀", "~", "~", "~"]
标&#12288;&#12288;签:挺保湿的心&#12288;&#12288;得:第一次购买,用了感觉还不错
["标", "&#12288;&#12288;", "签", "", "挺", "保湿", "的", "心", "&#12288;&#12288;", "得", "", "第一次", "购买", ",", "用", "了", "感觉", "还", "不错"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "挺", "保湿", "的", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "第一次", "购买", ",", "用", "了", "感觉", "还", "不错"]
标&#12288;&#12288;签:服务好物流速度快脸上很舒服心&#12288;&#12288;得:刚送到家。。用用在发表好坏。
["标", "&#12288;&#12288;", "签", "", "服务", "好", "物流", "速度", "快", "脸上", "很", "舒服", "心", "&#12288;&#12288;", "得", "", "刚", "送到", "家", "。", "。", "用", "用", "在", "发表", "好坏", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "服务", "好", "物流", "速度", "快", "脸上", "很", "舒服", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "刚", "送到", "家", "。", "。", "用", "用", "在", "发表", "好坏", "。"]
心&#12288;&#12288;得:还没用看看包装蛮好的晒&#12288;&#12288;单共3张图片查看晒单>
["心", "&#12288;&#12288;", "得", "", "还", "没用", "看看", "包装", "蛮", "好", "的", "晒", "&#12288;&#12288;", "单", "", "共", "3", "张", "图片", "查看", "晒", "单", ">"]
["心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "还", "没用", "看看", "包装", "蛮", "好", "的", "晒", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "单", "", "共", "3", "张", "图片", "查看", "晒", "单", ">"]
标&#12288;&#12288;签:品牌好价格实惠脸上很舒服味道不错心&#12288;&#12288;得:防晒,不油腻,还可以使皮肤稍稍增白些,
["标", "&#12288;&#12288;", "签", "", "品牌", "好", "价格", "实惠", "脸上", "很", "舒服", "味道", "不错", "心", "&#12288;&#12288;", "得", "", "防晒", "", "不", "油腻", "", "还", "可以", "使", "皮肤", "稍稍", "增白", "些", ""]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "品牌", "好", "价格", "实惠", "脸上", "很", "舒服", "味道", "不错", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "防晒", "", "不", "油腻", "", "还", "可以", "使", "皮肤", "稍稍", "增白", "些", ""]
标&#12288;&#12288;签:价格实惠保湿还不错心&#12288;&#12288;得:东西好用,分不清楚是不是正品。
["标", "&#12288;&#12288;", "签", "", "价格", "实惠", "保湿", "还", "不错", "心", "&#12288;&#12288;", "得", "", "东西", "好", "用", "", "分", "不", "清楚", "是不是", "正品", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "价格", "实惠", "保湿", "还", "不错", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "东西", "好", "用", "", "分", "不", "清楚", "是不是", "正品", "。"]
标&#12288;&#12288;签:服务好乳液很好用心&#12288;&#12288;得:乳液还是不错的用用不错的
["标", "&#12288;&#12288;", "签", "", "服务", "好", "乳液", "很", "好", "用心", "&#12288;&#12288;", "得", "", "乳液", "还是", "不错", "的", "用", "用", "不错", "的"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "服务", "好", "乳液", "很", "好", "用心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "乳液", "还是", "不错", "的", "用", "用", "不错", "的"]
标&#12288;&#12288;签:物流速度快效果不错心&#12288;&#12288;得:常用这个,夏天用,美白效果还好
["标", "&#12288;&#12288;", "签", "", "物流", "速度", "快", "效果", "不错", "心", "&#12288;&#12288;", "得", "", "常用", "这个", "", "夏天", "用", "", "美", "白", "效果", "还好"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "物流", "速度", "快", "效果", "不错", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "常用", "这个", "", "夏天", "用", "", "美", "白", "效果", "还好"]
标&#12288;&#12288;签:还可以心&#12288;&#12288;得:不错
["标", "&#12288;&#12288;", "签", "", "还", "可以", "心", "&#12288;&#12288;", "得", "", "不错"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "还", "可以", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "不错"]
标&#12288;&#12288;签:价格实惠比商场便宜服务好心&#12288;&#12288;得:真的还不错而且价格也实惠快递速度
["标", "&#12288;&#12288;", "签", "", "价格", "实惠", "比", "商场", "便宜", "服务", "好心", "&#12288;&#12288;", "得", "", "真的", "还", "不错", "而且", "价格", "也", "实惠", "快递", "速度"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "价格", "实惠", "比", "商场", "便宜", "服务", "好心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "真的", "还", "不错", "而且", "价格", "也", "实惠", "快递", "速度"]
标&#12288;&#12288;签:比商场便宜脸上很舒服很好用物流速度快是正品心&#12288;&#12288;得:京东就是好一日既往的好
["标", "&#12288;&#12288;", "签", "", "比", "商场", "便宜", "脸上", "很", "舒服", "很", "好", "用", "物流", "速度", "快", "是", "正品", "心", "&#12288;&#12288;", "得", "", "京东", "就是", "好", "一日", "既往", "的", "好"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "比", "商场", "便宜", "脸上", "很", "舒服", "很", "好", "用", "物流", "速度", "快", "是", "正品", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "京东", "就是", "好", "一日", "既往", "的", "好"]
活动时购买的很划算,用下来觉得还可以吧,等用完了才能知道有没有效果吧。反正很划算,随便用用看
["活动", "时", "购买", "的", "很", "划算", "", "用", "下来", "觉得", "还", "可以", "吧", "", "等", "用", "完", "了", "才能", "知道", "有没有", "效果", "吧", "。", "反正", "很", "划算", "", "随便", "用", "用", "看"]
新能真皙美白乳液很好用,有美白的效果,吸收也很快,搞活动买的,比外面便宜好多~~~~~
["新", "能", "真", "皙", "美", "白", "乳液", "很", "好", "用", "", "有", "美", "白", "的", "效果", "", "吸收", "也", "很快", "", "搞", "活动", "买", "的", "", "比", "外面", "便宜", "好多", "~~~~~"]
["新", "能", "真", "皙", "美", "白", "乳液", "很", "好", "用", "", "有", "美", "白", "的", "效果", "", "吸收", "也", "很快", "", "搞", "活动", "买", "的", "", "比", "外面", "便宜", "好多", "~", "~", "~", "~", "~"]
三八妇女节买的Z的产品随便用用可以的。女人要对自己好一点。
["三八妇女节", "买", "的", "", "Z", "的", "产品", "随便", "用", "用", "可以", "的", "。", "女人", "要", "对", "自己", "好", "一点", "。"]
标&#12288;&#12288;签:是正品挺保湿的心&#12288;&#12288;得好东东ZA我的最爱。
["标", "&#12288;&#12288;", "签", "", "是", "正品", "挺", "保湿", "的", "心", "&#12288;&#12288;", "得", "", "好", "东东", "", "ZA", "我", "的", "最", "爱", "。"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "是", "正品", "挺", "保湿", "的", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "好", "东东", "", "Z", "A", "我", "的", "最", "爱", "。"]
优&#12288;&#12288;点:没有让这次的尝试失望不&#12288;&#12288;足:货运慢,慢,慢心&#12288;&#12288;得:很舒适,用的不错
["优", "&#12288;&#12288;", "点", "", "没有", "让", "这次", "的", "尝试", "失望", "不", "&#12288;&#12288;", "足", "", "货运", "慢", "", "慢", "", "慢", "心", "&#12288;&#12288;", "得", "", "很", "舒适", "", "用", "的", "不错"]
["优", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "点", "", "没有", "让", "这次", "的", "尝试", "失望", "不", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "足", "", "货运", "慢", "", "慢", "", "慢", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "很", "舒适", "", "用", "的", "不错"]
标&#12288;&#12288;签:挺保湿的心&#12288;&#12288;得:一直用还可以~~~~~~~~~~~~~~~~
["标", "&#12288;&#12288;", "签", "", "挺", "保湿", "的", "心", "&#12288;&#12288;", "得", "", "一直", "用", "还", "可以", "~~~~~~~~~~~~~~~~"]
["标", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "签", "", "挺", "保湿", "的", "心", "&", "#", "1", "2", "2", "8", "8", ";", "&", "#", "1", "2", "2", "8", "8", ";", "得", "", "一直", "用", "还", "可以", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~", "~"]
很滋润效果好味道接受
["很", "滋润", "效果", "好", "味道", "接受"]
朋友推荐,醇润型,有点稠,我是混合型皮肤,很好吸收,不粘腻
@@ -173,13 +173,13 @@
效果挺好的滋润保湿了味道清淡
["效果", "挺", "好", "的", "滋润", "保湿", "了", "味道", "清淡"]
瓶子盖子都有刮痕了是不是都用过了啊。以前也在卓越买过za的其他化妆品都还算满意。这一次真觉得很恶心以后不会在这买了
["瓶子", "盖子", "都", "有", "刮", "痕", "了", "", "是不是", "都", "用", "过", "了", "啊", "。", "以前", "也", "在", "卓越", "买", "过", "za", "的", "其他", "化妆品", "", "都", "还", "算", "满意", "。", "这", "一次", "真", "觉得", "很", "恶心", "", "以后", "不会", "在", "这", "买", "了"]
["瓶子", "盖子", "都", "有", "刮", "痕", "了", "", "是不是", "都", "用", "过", "了", "啊", "。", "以前", "也", "在", "卓越", "买", "过", "z", "a", "的", "其他", "化妆品", "", "都", "还", "算", "满意", "。", "这", "一次", "真", "觉得", "很", "恶心", "", "以后", "不会", "在", "这", "买", "了"]
好用不知道是不是正品啊
["好", "用", "不", "知道", "是不是", "正品", "啊"]
很好用
["很", "好", "用"]
za乳液不够滋润全新但是怎么没有密封
["za", "乳液", "不够", "滋润", "", "全新", "但是", "怎么", "没有", "密封", ""]
["z", "a", "乳液", "不够", "滋润", "", "全新", "但是", "怎么", "没有", "密封", ""]
还不错,一直在用
["还", "不错", "", "一直", "在", "用"]
妈妈收到了
@@ -193,7 +193,7 @@ za乳液不够滋润全新但是怎么没有密封
还可以
["还", "可以"]
挺好的,这个用上也不是很油腻..
["挺", "好", "的", "", "这个", "用", "上", "也", "不是", "很", "油腻", ".."]
["挺", "好", "的", "", "这个", "用", "上", "也", "不是", "很", "油腻", ".", "."]
纯度不够。
["纯度", "不够", "。"]
这个给婆婆买的,我就用过几次,但感觉挺滋润

test/testdata/server.conf (new file, 19 lines)

@@ -0,0 +1,19 @@
# config
#socket listen port
port=11200
thread_number=4
queue_max_size=4096
#dict path
dict_path=../dict/jieba.dict.utf8
#model path
model_path=../dict/hmm_model.utf8
user_dict_path=../dict/user.dict.utf8
idf_path=../dict/idf.utf8
stop_words_path=../dict/stop_words.utf8
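
Note: the fixture above is a plain `key=value` file with `#` comment lines. A minimal sketch of reading such a file follows. The names `Trim`, `ParseConfig` and `ParseConfigText` are hypothetical helpers written for illustration and are not part of cppjieba; whitespace trimming and skipping malformed lines are assumptions, not behaviour confirmed by this diff.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Trim ASCII whitespace from both ends of a string.
static std::string Trim(const std::string& s) {
  const char* ws = " \t\r\n";
  const std::string::size_type b = s.find_first_not_of(ws);
  if (b == std::string::npos) return "";
  const std::string::size_type e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

// Parse "key=value" lines. Blank lines and lines starting with '#' are
// skipped; lines without '=' are ignored (an assumption of this sketch).
std::map<std::string, std::string> ParseConfig(std::istream& in) {
  std::map<std::string, std::string> conf;
  std::string line;
  while (std::getline(in, line)) {
    line = Trim(line);
    if (line.empty() || line[0] == '#') continue;
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos) continue;
    conf[Trim(line.substr(0, eq))] = Trim(line.substr(eq + 1));
  }
  return conf;
}

// Convenience wrapper for in-memory text.
std::map<std::string, std::string> ParseConfigText(const std::string& text) {
  std::istringstream in(text);
  return ParseConfig(in);
}
```

With the fixture above, `ParseConfig` on an `std::ifstream` opened on `test/testdata/server.conf` would map `port` to `11200` and `dict_path` to `../dict/jieba.dict.utf8`. Values stay strings, so numeric conversion is left to the caller.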

@@ -1,9 +1,8 @@
我来到北京清华大学
他来到了网易杭研大厦
杭研
小明硕士毕业于中国科学院计算所,后在日本京都大学深造
我来自北京邮电大学。。。 学号 091111xx。。。
来这里看看别人正在搜索什么吧
我来到南京市长江大桥
请在一米线外等候
人事处女干事
去医院做B超编号123
令狐冲是云计算行业的专家
IBM,3.14

test/testdata/userdict.2.utf8 (new file, 1 line)

@@ -0,0 +1 @@
千树万树梨花开

test/testdata/userdict.english (new file, 2 lines)

@@ -0,0 +1,2 @@
in
internal

test/testdata/userdict.utf8 (new file, 8 lines)

@@ -0,0 +1,8 @@
云计算
韩玉鉴赏
A
B
iPhone6
蓝翔 nz
忽如一夜春风来
区块链 10 nz
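
Note: the entries above appear in three shapes: a bare word (`云计算`), a word with a tag (`蓝翔 nz`), and a word with a frequency and a tag (`区块链 10 nz`). A minimal sketch of splitting one line into those fields follows. `UserDictEntry` and `ParseUserDictLine` are hypothetical names, and reading a two-field line as "word tag" is inferred from this fixture rather than taken from the library source.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for illustration; not cppjieba's own structure.
struct UserDictEntry {
  std::string word;
  bool has_freq;    // true only for three-field lines
  int freq;         // meaningful only when has_freq is true
  std::string tag;  // empty when the line carries no tag
};

// Split a user-dictionary line on ASCII whitespace and fill `out`.
// Accepts 1 field (word), 2 fields (word tag) or 3 fields (word freq tag).
// Returns false for empty lines, extra fields, or a non-numeric frequency.
bool ParseUserDictLine(const std::string& line, UserDictEntry* out) {
  std::istringstream in(line);
  std::vector<std::string> fields;
  std::string field;
  while (in >> field) fields.push_back(field);
  if (fields.empty() || fields.size() > 3) return false;

  out->word = fields[0];
  out->has_freq = false;
  out->freq = 0;
  out->tag.clear();

  if (fields.size() == 2) {
    out->tag = fields[1];
  } else if (fields.size() == 3) {
    std::istringstream num(fields[1]);
    // Require the whole field to be consumed as an integer.
    if (!(num >> out->freq) || !num.eof()) return false;
    out->has_freq = true;
    out->tag = fields[2];
  }
  return true;
}
```

Applied to `区块链 10 nz`, the sketch yields the word `区块链`, frequency `10` and tag `nz`. For `蓝翔 nz` it leaves `has_freq` false and sets the tag to `nz`.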

@@ -1,12 +1,41 @@
SET(EXECUTABLE_OUTPUT_PATH ${PROJECT_BINARY_DIR})
message(STATUS "MSVC value: ${MSVC}")
if (MSVC)
set(CMAKE_MSVC_RUNTIME_LIBRARY "MultiThreadedDebugDLL")
set(gtest_force_shared_crt ON CACHE BOOL "" FORCE)
add_compile_options(/utf-8)
endif()
include(FetchContent)
FetchContent_Declare(
googletest
GIT_REPOSITORY https://github.com/google/googletest.git
GIT_TAG release-1.12.1
)
FetchContent_MakeAvailable(googletest)
SET(EXECUTABLE_OUTPUT_PATH ${PROJECT_BINARY_DIR}/test)
SET(LIBRARY_OUTPUT_PATH ${PROJECT_BINARY_DIR}/lib)
SET(GTEST_ROOT_DIR gtest-1.6.0)
ADD_DEFINITIONS(-DLOGGING_LEVEL=LL_WARNING)
ADD_DEFINITIONS(-DLOGGER_LEVEL=LL_WARN)
INCLUDE_DIRECTORIES(${GTEST_ROOT_DIR} ${GTEST_ROOT_DIR}/include ${PROJECT_SOURCE_DIR})
ADD_LIBRARY(gtest STATIC ${GTEST_ROOT_DIR}/src/gtest-all.cc)
ADD_EXECUTABLE(test.run gtest_main.cpp TKeywordExtractor.cpp TTrie.cpp TSegments.cpp )
TARGET_LINK_LIBRARIES(gtest pthread)
TARGET_LINK_LIBRARIES(test.run gtest pthread)
# Add include directories
INCLUDE_DIRECTORIES(
${CMAKE_CURRENT_BINARY_DIR}
${CMAKE_BINARY_DIR}/test
)
ADD_EXECUTABLE(test.run
gtest_main.cpp
keyword_extractor_test.cpp
trie_test.cpp
segments_test.cpp
pos_tagger_test.cpp
jieba_test.cpp
pre_filter_test.cpp
unicode_test.cpp
textrank_test.cpp
)
TARGET_LINK_LIBRARIES(test.run gtest)

@@ -1,18 +0,0 @@
#include "src/KeywordExtractor.hpp"
#include "gtest/gtest.h"
using namespace CppJieba;
TEST(KeywordExtractorTest, Test1)
{
KeywordExtractor extractor("../dict/extra_dict/jieba.dict.small.utf8", "../dict/hmm_model.utf8", "../dict/idf.utf8", "../dict/stop_words.utf8");
string s("我是拖拉机学院手扶拖拉机专业的。不用多久,我就会升职加薪,当上总经理,迎娶白富美,走上人生巅峰。");
string res;
vector<pair<string, double> > wordweights;
size_t topN = 5;
extractor.extract(s, wordweights, topN);
res << wordweights;
ASSERT_EQ(res, "[\"白富美:11.7392\", \"升职:10.8562\", \"加薪:10.6426\", \"迎娶:10.0505\", \"手扶拖拉机:10.0089\"]");
}

@@ -1,134 +0,0 @@
#include "src/SegmentBase.hpp"
#include "src/MixSegment.hpp"
#include "src/MPSegment.hpp"
#include "src/HMMSegment.hpp"
#include "src/Limonp/io_functs.hpp"
#include "src/FullSegment.hpp"
#include "src/QuerySegment.hpp"
#include "gtest/gtest.h"
using namespace CppJieba;
TEST(SegmentBaseTest, Test1)
{
const char* str = "heheh你好...hh";
string s;
vector<string> buf;
buf.push_back("heheh");
buf.push_back("你好");
buf.push_back("...hh");
vector<string> res;
size_t size = strlen(str);
size_t offset = 0;
while(offset < size)
{
size_t len = 0;
const char* t = str + offset;
SegmentBase::filterAscii(t, size - offset, len);
s.assign(t, len);
res.push_back(s);
//cout<<s<<","<<ret<<","<<len<<endl;
//cout<<str<<endl;
offset += len;
}
EXPECT_EQ(res, buf);
}
//int main(int argc, char** argv)
//{
// //ChineseFilter chFilter;
// return 0;
//}
TEST(MixSegmentTest, Test1)
{
MixSegment segment("../dict/extra_dict/jieba.dict.small.utf8", "../dict/hmm_model.utf8");
const char* str = "我来自北京邮电大学。。。 学号 123456";
const char* res[] = {"我", "来自", "北京邮电大学", "。","。","。"," ","学号", " 123456"};
vector<string> words;
ASSERT_TRUE(segment);
ASSERT_TRUE(segment.cut(str, words));
EXPECT_EQ(words, vector<string>(res, res + sizeof(res)/sizeof(res[0])));
}
TEST(MPSegmentTest, Test1)
{
MPSegment segment("../dict/extra_dict/jieba.dict.small.utf8");
const char* str = "我来自北京邮电大学。。。 学号 123456";
const char* res[] = {"我", "来自", "北京邮电大学", "。","。","。"," ","学","号", " 123456"};
vector<string> words;
ASSERT_TRUE(segment);
ASSERT_TRUE(segment.cut(str, words));
//print(words);
EXPECT_EQ(words, vector<string>(res, res + sizeof(res)/sizeof(res[0])));
}
TEST(MPSegmentTest, Test2)
{
MPSegment segment("../dict/extra_dict/jieba.dict.small.utf8");
string line;
ifstream ifs("../test/testdata/review.100");
vector<string> words;
string eRes;
loadFile2Str("../test/testdata/review.100.res", eRes);
string res;
while(getline(ifs, line))
{
res += line;
res += '\n';
segment.cut(line, words);
string s;
s << words;
res += s;
res += '\n';
}
WriteStr2File("../test/testdata/review.100.res", res.c_str(), "w");
//ASSERT_EQ(res, eRes);
}
TEST(HMMSegmentTest, Test1)
{
HMMSegment segment("../dict/hmm_model.utf8");
const char* str = "我来自北京邮电大学。。。 学号 123456";
const char* res[] = {"我来", "自北京", "邮电大学", "。", "。", "。", " ", "学号", " 123456"};
//string s;
//vector<string> buf(res, res + sizeof(res)/sizeof(res[0]));
vector<string> words;
ASSERT_TRUE(segment);
ASSERT_TRUE(segment.cut(str, words));
//print(words);
EXPECT_EQ(words, vector<string>(res, res + sizeof(res)/sizeof(res[0])));
}
TEST(FullSegment, Test1)
{
FullSegment segment("../dict/extra_dict/jieba.dict.small.utf8");
const char* str = "我来自北京邮电大学。。。 学号 123456";
vector<string> words;
ASSERT_EQ(segment.cut(str, words), true);
string s;
s << words;
ASSERT_EQ(s, "[\"我\", \"来自\", \"北京\", \"北京邮电大学\", \"邮电\", \"电大\", \"大学\", \"。\", \"。\", \"。\", \" \", \"学\", \"号\", \" 123456\"]");
}
TEST(QuerySegment, Test1)
{
QuerySegment segment("../dict/extra_dict/jieba.dict.small.utf8", "../dict/hmm_model.utf8", 3);
const char* str = "小明硕士毕业于中国科学院计算所,后在日本京都大学深造";
vector<string> words;
ASSERT_TRUE(segment.cut(str, words));
string s1, s2;
s1 << words;
s2 = "[\"小明\", \"硕士\", \"毕业\", \"于\", \"中国\", \"中国科学院\", \"科学\", \"科学院\", \"学院\", \"计算所\", \",\", \"后\", \"在\", \"日本\", \"京都\", \"京都大学\", \"大学\", \"深造\"]";
ASSERT_EQ(s1, s2);
}

@@ -1,57 +0,0 @@
#include "src/DictTrie.hpp"
#include "gtest/gtest.h"
using namespace CppJieba;
static const char* const DICT_FILE = "../dict/extra_dict/jieba.dict.small.utf8";
TEST(DictTrieTest, NewAndDelete)
{
DictTrie * trie;
trie = new DictTrie(DICT_FILE);
delete trie;
trie = new DictTrie();
delete trie;
}
TEST(DictTrieTest, Test1)
{
string s1, s2;
DictTrie trie;
ASSERT_TRUE(trie.init(DICT_FILE));
ASSERT_LT(trie.getMinLogFreq() + 15.6479, 0.001);
string word("来到");
Unicode uni;
ASSERT_TRUE(TransCode::decode(word, uni));
DictUnit nodeInfo;
nodeInfo.word = uni;
nodeInfo.freq = 8779;
nodeInfo.tag = "v";
nodeInfo.logFreq = -8.87033;
s1 << nodeInfo;
s2 << (*trie.find(uni.begin(), uni.end()));
EXPECT_EQ("[\"26469\", \"21040\"] 8779 v -8.870", s2);
word = "清华大学";
vector<pair<size_t, const DictUnit*> > res;
map<size_t, const DictUnit* > resMap;
map<size_t, const DictUnit* > mp;
const char * words[] = {"清", "清华", "清华大学"};
for(size_t i = 0; i < sizeof(words)/sizeof(words[0]); i++)
{
ASSERT_TRUE(TransCode::decode(words[i], uni));
res.push_back(make_pair(uni.size() - 1, trie.find(uni.begin(), uni.end())));
resMap[uni.size() - 1] = trie.find(uni.begin(), uni.end());
}
//DictUnit
//res.push_back(make_pair(0, ))
vector<pair<size_t, const DictUnit*> > vec;
ASSERT_TRUE(TransCode::decode(word, uni));
//print(uni);
ASSERT_TRUE(trie.find(uni.begin(), uni.end(), mp, 0));
ASSERT_EQ(mp, resMap);
// print(vec);
}

@@ -1,283 +0,0 @@
// Copyright 2005, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
//
// The Google C++ Testing Framework (Google Test)
//
// This header file defines the public API for death tests. It is
// #included by gtest.h so a user doesn't need to include this
// directly.
#ifndef GTEST_INCLUDE_GTEST_GTEST_DEATH_TEST_H_
#define GTEST_INCLUDE_GTEST_GTEST_DEATH_TEST_H_
#include "gtest/internal/gtest-death-test-internal.h"
namespace testing {
// This flag controls the style of death tests. Valid values are "threadsafe",
// meaning that the death test child process will re-execute the test binary
// from the start, running only a single death test, or "fast",
// meaning that the child process will execute the test logic immediately
// after forking.
GTEST_DECLARE_string_(death_test_style);
#if GTEST_HAS_DEATH_TEST
// The following macros are useful for writing death tests.
// Here's what happens when an ASSERT_DEATH* or EXPECT_DEATH* is
// executed:
//
// 1. It generates a warning if there is more than one active
// thread. This is because it's safe to fork() or clone() only
// when there is a single thread.
//
// 2. The parent process clone()s a sub-process and runs the death
// test in it; the sub-process exits with code 0 at the end of the
// death test, if it hasn't exited already.
//
// 3. The parent process waits for the sub-process to terminate.
//
// 4. The parent process checks the exit code and error message of
// the sub-process.
//
// Examples:
//
// ASSERT_DEATH(server.SendMessage(56, "Hello"), "Invalid port number");
// for (int i = 0; i < 5; i++) {
// EXPECT_DEATH(server.ProcessRequest(i),
// "Invalid request .* in ProcessRequest()")
// << "Failed to die on request " << i;
// }
//
// ASSERT_EXIT(server.ExitNow(), ::testing::ExitedWithCode(0), "Exiting");
//
// bool KilledBySIGHUP(int exit_code) {
// return WIFSIGNALED(exit_code) && WTERMSIG(exit_code) == SIGHUP;
// }
//
// ASSERT_EXIT(client.HangUpServer(), KilledBySIGHUP, "Hanging up!");
//
// On the regular expressions used in death tests:
//
// On POSIX-compliant systems (*nix), we use the <regex.h> library,
// which uses the POSIX extended regex syntax.
//
// On other platforms (e.g. Windows), we only support a simple regex
// syntax implemented as part of Google Test. This limited
// implementation should be enough most of the time when writing
// death tests; though it lacks many features you can find in PCRE
// or POSIX extended regex syntax. For example, we don't support
// union ("x|y"), grouping ("(xy)"), brackets ("[xy]"), and
// repetition count ("x{5,7}"), among others.
//
// Below is the syntax that we do support. We chose it to be a
// subset of both PCRE and POSIX extended regex, so it's easy to
// learn wherever you come from. In the following: 'A' denotes a
// literal character, period (.), or a single \\ escape sequence;
// 'x' and 'y' denote regular expressions; 'm' and 'n' are for
// natural numbers.
//
// c matches any literal character c
// \\d matches any decimal digit
// \\D matches any character that's not a decimal digit
// \\f matches \f
// \\n matches \n
// \\r matches \r
// \\s matches any ASCII whitespace, including \n
// \\S matches any character that's not a whitespace
// \\t matches \t
// \\v matches \v
// \\w matches any letter, _, or decimal digit
// \\W matches any character that \\w doesn't match
// \\c matches any literal character c, which must be a punctuation
// . matches any single character except \n
// A? matches 0 or 1 occurrences of A
// A* matches 0 or many occurrences of A
// A+ matches 1 or many occurrences of A
// ^ matches the beginning of a string (not that of each line)
// $ matches the end of a string (not that of each line)
// xy matches x followed by y
//
// If you accidentally use PCRE or POSIX extended regex features
// not implemented by us, you will get a run-time failure. In that
// case, please try to rewrite your regular expression within the
// above syntax.
//
// This implementation is *not* meant to be as highly tuned or robust
// as a compiled regex library, but should perform well enough for a
// death test, which already incurs significant overhead by launching
// a child process.
//
// Known caveats:
//
// A "threadsafe" style death test obtains the path to the test
// program from argv[0] and re-executes it in the sub-process. For
// simplicity, the current implementation doesn't search the PATH
// when launching the sub-process. This means that the user must
// invoke the test program via a path that contains at least one
// path separator (e.g. path/to/foo_test and
// /absolute/path/to/bar_test are fine, but foo_test is not). This
// is rarely a problem as people usually don't put the test binary
// directory in PATH.
//
// TODO(wan@google.com): make thread-safe death tests search the PATH.
// Asserts that a given statement causes the program to exit, with an
// integer exit status that satisfies predicate, and emitting error output
// that matches regex.
# define ASSERT_EXIT(statement, predicate, regex) \
GTEST_DEATH_TEST_(statement, predicate, regex, GTEST_FATAL_FAILURE_)
// Like ASSERT_EXIT, but continues on to successive tests in the
// test case, if any:
# define EXPECT_EXIT(statement, predicate, regex) \
GTEST_DEATH_TEST_(statement, predicate, regex, GTEST_NONFATAL_FAILURE_)
// Asserts that a given statement causes the program to exit, either by
// explicitly exiting with a nonzero exit code or being killed by a
// signal, and emitting error output that matches regex.
# define ASSERT_DEATH(statement, regex) \
ASSERT_EXIT(statement, ::testing::internal::ExitedUnsuccessfully, regex)
// Like ASSERT_DEATH, but continues on to successive tests in the
// test case, if any:
# define EXPECT_DEATH(statement, regex) \
EXPECT_EXIT(statement, ::testing::internal::ExitedUnsuccessfully, regex)
// Two predicate classes that can be used in {ASSERT,EXPECT}_EXIT*:
// Tests that an exit code describes a normal exit with a given exit code.
class GTEST_API_ ExitedWithCode {
public:
explicit ExitedWithCode(int exit_code);
bool operator()(int exit_status) const;
private:
// No implementation - assignment is unsupported.
void operator=(const ExitedWithCode& other);
const int exit_code_;
};
# if !GTEST_OS_WINDOWS
// Tests that an exit code describes an exit due to termination by a
// given signal.
class GTEST_API_ KilledBySignal {
public:
explicit KilledBySignal(int signum);
bool operator()(int exit_status) const;
private:
const int signum_;
};
# endif // !GTEST_OS_WINDOWS
// EXPECT_DEBUG_DEATH asserts that the given statements die in debug mode.
// The death testing framework causes this to have interesting semantics,
// since the sideeffects of the call are only visible in opt mode, and not
// in debug mode.
//
// In practice, this can be used to test functions that utilize the
// LOG(DFATAL) macro using the following style:
//
// int DieInDebugOr12(int* sideeffect) {
// if (sideeffect) {
// *sideeffect = 12;
// }
// LOG(DFATAL) << "death";
// return 12;
// }
//
// TEST(TestCase, TestDieOr12WorksInDbgAndOpt) {
// int sideeffect = 0;
// // Only asserts in dbg.
// EXPECT_DEBUG_DEATH(DieInDebugOr12(&sideeffect), "death");
//
// #ifdef NDEBUG
// // opt-mode has sideeffect visible.
// EXPECT_EQ(12, sideeffect);
// #else
// // dbg-mode no visible sideeffect.
// EXPECT_EQ(0, sideeffect);
// #endif
// }
//
// This will assert that DieInDebugOr12() crashes in debug
// mode, usually due to a DCHECK or LOG(DFATAL), but returns the
// appropriate fallback value (12 in this case) in opt mode. If you
// need to test that a function has appropriate side-effects in opt
// mode, include assertions against the side-effects. A general
// pattern for this is:
//
// EXPECT_DEBUG_DEATH({
// // Side-effects here will have an effect after this statement in
// // opt mode, but none in debug mode.
// EXPECT_EQ(12, DieInDebugOr12(&sideeffect));
// }, "death");
//
# ifdef NDEBUG
# define EXPECT_DEBUG_DEATH(statement, regex) \
do { statement; } while (::testing::internal::AlwaysFalse())
# define ASSERT_DEBUG_DEATH(statement, regex) \
do { statement; } while (::testing::internal::AlwaysFalse())
# else
# define EXPECT_DEBUG_DEATH(statement, regex) \
EXPECT_DEATH(statement, regex)
# define ASSERT_DEBUG_DEATH(statement, regex) \
ASSERT_DEATH(statement, regex)
# endif // NDEBUG for EXPECT_DEBUG_DEATH
#endif // GTEST_HAS_DEATH_TEST
// EXPECT_DEATH_IF_SUPPORTED(statement, regex) and
// ASSERT_DEATH_IF_SUPPORTED(statement, regex) expand to real death tests if
// death tests are supported; otherwise they just issue a warning. This is
// useful when you are combining death test assertions with normal test
// assertions in one test.
#if GTEST_HAS_DEATH_TEST
# define EXPECT_DEATH_IF_SUPPORTED(statement, regex) \
EXPECT_DEATH(statement, regex)
# define ASSERT_DEATH_IF_SUPPORTED(statement, regex) \
ASSERT_DEATH(statement, regex)
#else
# define EXPECT_DEATH_IF_SUPPORTED(statement, regex) \
GTEST_UNSUPPORTED_DEATH_TEST_(statement, regex, )
# define ASSERT_DEATH_IF_SUPPORTED(statement, regex) \
GTEST_UNSUPPORTED_DEATH_TEST_(statement, regex, return)
#endif
} // namespace testing
#endif // GTEST_INCLUDE_GTEST_GTEST_DEATH_TEST_H_
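
The limited regex syntax documented in the death-test header above can be illustrated with a small backtracking matcher. This is a minimal sketch under stated assumptions, not Google Test's own implementation: the namespace `regex_sketch` and the functions `AtomMatches`, `MatchRepeat`, `MatchHere`, and `Matches` are hypothetical names, the pattern is assumed to be well-formed (unsupported features are not detected or reported), and `$` is treated as an anchor only when it is the last pattern character. As in death tests, a pattern without `^` may match anywhere in the text.

```cpp
#include <cctype>

namespace regex_sketch {

// Returns true if one pattern atom matches the single character c.
// An atom is a literal character, '.', or a backslash escape such as \d.
bool AtomMatches(bool escaped, char pattern_ch, char c) {
  const unsigned char u = static_cast<unsigned char>(c);
  if (!escaped) return pattern_ch == '.' ? c != '\n' : pattern_ch == c;
  const bool word = std::isalnum(u) != 0 || c == '_';
  switch (pattern_ch) {
    case 'd': return std::isdigit(u) != 0;
    case 'D': return std::isdigit(u) == 0;
    case 's': return std::isspace(u) != 0;
    case 'S': return std::isspace(u) == 0;
    case 'w': return word;
    case 'W': return !word;
    case 'f': return c == '\f';
    case 'n': return c == '\n';
    case 'r': return c == '\r';
    case 't': return c == '\t';
    case 'v': return c == '\v';
    default:  return pattern_ch == c;  // \c: escaped punctuation
  }
}

bool MatchHere(const char* re, const char* s);

// Matches an atom repeated according to op ('?', '*' or '+'),
// followed by the rest of the pattern. Shorter repetitions are tried first.
bool MatchRepeat(bool escaped, char ch, char op, const char* rest,
                 const char* s) {
  const int min_count = (op == '+') ? 1 : 0;
  for (int count = 0;; ++count) {
    if (count >= min_count && MatchHere(rest, s + count)) return true;
    if (op == '?' && count == 1) return false;  // at most one occurrence
    if (s[count] == '\0' || !AtomMatches(escaped, ch, s[count])) return false;
  }
}

// Matches the pattern re against the text starting exactly at s.
bool MatchHere(const char* re, const char* s) {
  if (*re == '\0') return true;
  if (re[0] == '$' && re[1] == '\0') return *s == '\0';
  const bool escaped = (re[0] == '\\' && re[1] != '\0');
  const char ch = escaped ? re[1] : re[0];
  const char* next = re + (escaped ? 2 : 1);
  if (*next == '?' || *next == '*' || *next == '+') {
    return MatchRepeat(escaped, ch, *next, next + 1, s);
  }
  return *s != '\0' && AtomMatches(escaped, ch, *s) && MatchHere(next, s + 1);
}

// Partial match: succeeds if re matches some substring of s.
bool Matches(const char* re, const char* s) {
  if (*re == '^') return MatchHere(re + 1, s);
  do {
    if (MatchHere(re, s)) return true;
  } while (*s++ != '\0');
  return false;
}

}  // namespace regex_sketch
```

For instance, `regex_sketch::Matches("Invalid request .* in", text)` would accept any `text` that contains "Invalid request ", some characters other than newline, and then " in". Because only single atoms can be repeated, features such as `(xy)*` or `[xy]` have no representation here, which mirrors the restrictions listed in the header comment.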


@ -1,230 +0,0 @@
// Copyright 2005, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
//
// The Google C++ Testing Framework (Google Test)
//
// This header file defines the Message class.
//
// IMPORTANT NOTE: Due to limitation of the C++ language, we have to
// leave some internal implementation details in this header file.
// They are clearly marked by comments like this:
//
// // INTERNAL IMPLEMENTATION - DO NOT USE IN A USER PROGRAM.
//
// Such code is NOT meant to be used by a user directly, and is subject
// to CHANGE WITHOUT NOTICE. Therefore DO NOT DEPEND ON IT in a user
// program!
#ifndef GTEST_INCLUDE_GTEST_GTEST_MESSAGE_H_
#define GTEST_INCLUDE_GTEST_GTEST_MESSAGE_H_
#include <limits>
#include "gtest/internal/gtest-string.h"
#include "gtest/internal/gtest-internal.h"
namespace testing {
// The Message class works like an ostream repeater.
//
// Typical usage:
//
// 1. You stream a bunch of values to a Message object.
// It will remember the text in a stringstream.
// 2. Then you stream the Message object to an ostream.
// This causes the text in the Message to be streamed
// to the ostream.
//
// For example:
//
// testing::Message foo;
// foo << 1 << " != " << 2;
// std::cout << foo;
//
// will print "1 != 2".
//
// Message is not intended to be inherited from. In particular, its
// destructor is not virtual.
//
// Note that stringstream behaves differently in gcc and in MSVC. You
// can stream a NULL char pointer to it in the former, but not in the
// latter (it causes an access violation if you do). The Message
// class hides this difference by treating a NULL char pointer as
// "(null)".
class GTEST_API_ Message {
private:
// The type of basic IO manipulators (endl, ends, and flush) for
// narrow streams.
typedef std::ostream& (*BasicNarrowIoManip)(std::ostream&);
public:
// Constructs an empty Message.
// We allocate the stringstream separately because otherwise each use of
// ASSERT/EXPECT in a procedure adds over 200 bytes to the procedure's
// stack frame leading to huge stack frames in some cases; gcc does not reuse
// the stack space.
Message() : ss_(new ::std::stringstream) {
// By default, we want there to be enough precision when printing
// a double to a Message.
*ss_ << std::setprecision(std::numeric_limits<double>::digits10 + 2);
}
// Copy constructor.
Message(const Message& msg) : ss_(new ::std::stringstream) { // NOLINT
*ss_ << msg.GetString();
}
// Constructs a Message from a C-string.
explicit Message(const char* str) : ss_(new ::std::stringstream) {
*ss_ << str;
}
#if GTEST_OS_SYMBIAN
// Streams a value (either a pointer or not) to this object.
template <typename T>
inline Message& operator <<(const T& value) {
StreamHelper(typename internal::is_pointer<T>::type(), value);
return *this;
}
#else
// Streams a non-pointer value to this object.
template <typename T>
inline Message& operator <<(const T& val) {
::GTestStreamToHelper(ss_.get(), val);
return *this;
}
// Streams a pointer value to this object.
//
// This function is an overload of the previous one. When you
// stream a pointer to a Message, this definition will be used as it
// is more specialized. (The C++ Standard, section
// [temp.func.order].) If you stream a non-pointer, then the
// previous definition will be used.
//
// The reason for this overload is that streaming a NULL pointer to
// ostream is undefined behavior. Depending on the compiler, you
// may get "0", "(nil)", "(null)", or an access violation. To
// ensure consistent result across compilers, we always treat NULL
// as "(null)".
template <typename T>
inline Message& operator <<(T* const& pointer) { // NOLINT
if (pointer == NULL) {
*ss_ << "(null)";
} else {
::GTestStreamToHelper(ss_.get(), pointer);
}
return *this;
}
#endif // GTEST_OS_SYMBIAN
// Since the basic IO manipulators are overloaded for both narrow
// and wide streams, we have to provide this specialized definition
// of operator <<, even though its body is the same as the
// templatized version above. Without this definition, streaming
// endl or other basic IO manipulators to Message will confuse the
// compiler.
Message& operator <<(BasicNarrowIoManip val) {
*ss_ << val;
return *this;
}
// Instead of 1/0, we want to see true/false for bool values.
Message& operator <<(bool b) {
return *this << (b ? "true" : "false");
}
// These two overloads allow streaming a wide C string to a Message
// using the UTF-8 encoding.
Message& operator <<(const wchar_t* wide_c_str) {
return *this << internal::String::ShowWideCString(wide_c_str);
}
Message& operator <<(wchar_t* wide_c_str) {
return *this << internal::String::ShowWideCString(wide_c_str);
}
#if GTEST_HAS_STD_WSTRING
// Converts the given wide string to a narrow string using the UTF-8
// encoding, and streams the result to this Message object.
Message& operator <<(const ::std::wstring& wstr);
#endif // GTEST_HAS_STD_WSTRING
#if GTEST_HAS_GLOBAL_WSTRING
// Converts the given wide string to a narrow string using the UTF-8
// encoding, and streams the result to this Message object.
Message& operator <<(const ::wstring& wstr);
#endif // GTEST_HAS_GLOBAL_WSTRING
// Gets the text streamed to this object so far as a String.
// Each '\0' character in the buffer is replaced with "\\0".
//
// INTERNAL IMPLEMENTATION - DO NOT USE IN A USER PROGRAM.
internal::String GetString() const {
return internal::StringStreamToString(ss_.get());
}
private:
#if GTEST_OS_SYMBIAN
// These are needed as the Nokia Symbian Compiler cannot decide between
// const T& and const T* in a function template. The Nokia compiler _can_
// decide between class template specializations for T and T*, so a
// tr1::type_traits-like is_pointer works, and we can overload on that.
template <typename T>
inline void StreamHelper(internal::true_type /*dummy*/, T* pointer) {
if (pointer == NULL) {
*ss_ << "(null)";
} else {
::GTestStreamToHelper(ss_.get(), pointer);
}
}
template <typename T>
inline void StreamHelper(internal::false_type /*dummy*/, const T& value) {
::GTestStreamToHelper(ss_.get(), value);
}
#endif // GTEST_OS_SYMBIAN
// We'll hold the text streamed to this object here.
const internal::scoped_ptr< ::std::stringstream> ss_;
// We declare (but don't implement) this to prevent the compiler
// from implementing the assignment operator.
void operator=(const Message&);
};
// Streams a Message to an ostream.
inline std::ostream& operator <<(std::ostream& os, const Message& sb) {
return os << sb.GetString();
}
} // namespace testing
#endif // GTEST_INCLUDE_GTEST_GTEST_MESSAGE_H_
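
The overload set that the Message header above describes (a generic `operator<<`, a more specialized pointer overload that prints "(null)", and a `bool` overload) can be sketched in a few lines. This is an illustrative reduction, assuming C++11 or later: `message_sketch::MiniMessage` and its `str()` accessor are hypothetical names, `std::unique_ptr` stands in for the internal `scoped_ptr`, and the wide-string and IO-manipulator overloads are omitted.

```cpp
#include <iomanip>
#include <limits>
#include <memory>
#include <sstream>
#include <string>

namespace message_sketch {

class MiniMessage {
 public:
  // The stream lives on the heap so the object itself stays small.
  MiniMessage() : ss_(new std::stringstream) {
    // Enough precision to print a double without losing digits.
    *ss_ << std::setprecision(std::numeric_limits<double>::digits10 + 2);
  }

  // Streams a non-pointer value.
  template <typename T>
  MiniMessage& operator<<(const T& value) {
    *ss_ << value;
    return *this;
  }

  // Streams a pointer. Partial ordering of function templates picks this
  // overload for pointers, so a null pointer is never passed to the stream.
  template <typename T>
  MiniMessage& operator<<(T* const& pointer) {
    if (pointer == nullptr) {
      *ss_ << "(null)";
    } else {
      *ss_ << pointer;
    }
    return *this;
  }

  // Prints true/false instead of 1/0.
  MiniMessage& operator<<(bool b) { return *this << (b ? "true" : "false"); }

  // Returns the text streamed so far.
  std::string str() const { return ss_->str(); }

 private:
  std::unique_ptr<std::stringstream> ss_;
};

}  // namespace message_sketch
```

A string literal such as `" != "` is deduced as an array by the first overload, while a `const char*` variable goes through the pointer overload, which is why a null C string ends up as "(null)" rather than reaching the underlying stream.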

File diff suppressed because it is too large


@ -1,487 +0,0 @@
$$ -*- mode: c++; -*-
$var n = 50 $$ Maximum length of Values arguments we want to support.
$var maxtuple = 10 $$ Maximum number of Combine arguments we want to support.
// Copyright 2008, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Authors: vladl@google.com (Vlad Losev)
//
// Macros and functions for implementing parameterized tests
// in Google C++ Testing Framework (Google Test)
//
// This file is generated by a SCRIPT. DO NOT EDIT BY HAND!
//
#ifndef GTEST_INCLUDE_GTEST_GTEST_PARAM_TEST_H_
#define GTEST_INCLUDE_GTEST_GTEST_PARAM_TEST_H_
// Value-parameterized tests allow you to test your code with different
// parameters without writing multiple copies of the same test.
//
// Here is how you use value-parameterized tests:
#if 0
// To write value-parameterized tests, first you should define a fixture
// class. It is usually derived from testing::TestWithParam<T> (see below for
// another inheritance scheme that's sometimes useful in more complicated
// class hierarchies), where T is the type of your parameter values.
// TestWithParam<T> is itself derived from testing::Test. T can be any
// copyable type. If it's a raw pointer, you are responsible for managing the
// lifespan of the pointed values.
class FooTest : public ::testing::TestWithParam<const char*> {
// You can implement all the usual class fixture members here.
};
// Then, use the TEST_P macro to define as many parameterized tests
// for this fixture as you want. The _P suffix is for "parameterized"
// or "pattern", whichever you prefer to think.
TEST_P(FooTest, DoesBlah) {
// Inside a test, access the test parameter with the GetParam() method
// of the TestWithParam<T> class:
EXPECT_TRUE(foo.Blah(GetParam()));
...
}
TEST_P(FooTest, HasBlahBlah) {
...
}
// Finally, you can use INSTANTIATE_TEST_CASE_P to instantiate the test
// case with any set of parameters you want. Google Test defines a number
// of functions for generating test parameters. They return what we call
// (surprise!) parameter generators. Here is a summary of them, which
// are all in the testing namespace:
//
//
// Range(begin, end [, step]) - Yields values {begin, begin+step,
// begin+step+step, ...}. The values do not
// include end. step defaults to 1.
// Values(v1, v2, ..., vN) - Yields values {v1, v2, ..., vN}.
// ValuesIn(container) - Yields values from a C-style array, an STL
// ValuesIn(begin,end) container, or an iterator range [begin, end).
// Bool() - Yields sequence {false, true}.
// Combine(g1, g2, ..., gN) - Yields all combinations (the Cartesian product
// for the math savvy) of the values generated
// by the N generators.
//
// For more details, see comments at the definitions of these functions below
// in this file.
//
// The following statement will instantiate tests from the FooTest test case
// each with parameter values "meeny", "miny", and "moe".
INSTANTIATE_TEST_CASE_P(InstantiationName,
FooTest,
Values("meeny", "miny", "moe"));
// To distinguish different instances of the pattern, (yes, you
// can instantiate it more than once) the first argument to the
// INSTANTIATE_TEST_CASE_P macro is a prefix that will be added to the
// actual test case name. Remember to pick unique prefixes for different
// instantiations. The tests from the instantiation above will have
// these names:
//
// * InstantiationName/FooTest.DoesBlah/0 for "meeny"
// * InstantiationName/FooTest.DoesBlah/1 for "miny"
// * InstantiationName/FooTest.DoesBlah/2 for "moe"
// * InstantiationName/FooTest.HasBlahBlah/0 for "meeny"
// * InstantiationName/FooTest.HasBlahBlah/1 for "miny"
// * InstantiationName/FooTest.HasBlahBlah/2 for "moe"
//
// You can use these names in --gtest_filter.
//
// This statement will instantiate all tests from FooTest again, each
// with parameter values "cat" and "dog":
const char* pets[] = {"cat", "dog"};
INSTANTIATE_TEST_CASE_P(AnotherInstantiationName, FooTest, ValuesIn(pets));
// The tests from the instantiation above will have these names:
//
// * AnotherInstantiationName/FooTest.DoesBlah/0 for "cat"
// * AnotherInstantiationName/FooTest.DoesBlah/1 for "dog"
// * AnotherInstantiationName/FooTest.HasBlahBlah/0 for "cat"
// * AnotherInstantiationName/FooTest.HasBlahBlah/1 for "dog"
//
// Please note that INSTANTIATE_TEST_CASE_P will instantiate all tests
// in the given test case, whether their definitions come before or
// AFTER the INSTANTIATE_TEST_CASE_P statement.
//
// Please also note that generator expressions (including parameters to the
// generators) are evaluated in InitGoogleTest(), after main() has started.
// This allows the user on one hand, to adjust generator parameters in order
// to dynamically determine a set of tests to run and on the other hand,
// give the user a chance to inspect the generated tests with Google Test
// reflection API before RUN_ALL_TESTS() is executed.
//
// You can see samples/sample7_unittest.cc and samples/sample8_unittest.cc
// for more examples.
//
// In the future, we plan to publish the API for defining new parameter
// generators. But for now this interface remains part of the internal
// implementation and is subject to change.
//
//
// A parameterized test fixture must be derived from testing::Test and from
// testing::WithParamInterface<T>, where T is the type of the parameter
// values. Inheriting from TestWithParam<T> satisfies that requirement because
// TestWithParam<T> inherits from both Test and WithParamInterface. In more
// complicated hierarchies, however, it is occasionally useful to inherit
// separately from Test and WithParamInterface. For example:
class BaseTest : public ::testing::Test {
// You can inherit all the usual members for a non-parameterized test
// fixture here.
};
class DerivedTest : public BaseTest, public ::testing::WithParamInterface<int> {
// The usual test fixture members go here too.
};
TEST_F(BaseTest, HasFoo) {
// This is an ordinary non-parameterized test.
}
TEST_P(DerivedTest, DoesBlah) {
// GetParam works just the same here as if you inherit from TestWithParam.
EXPECT_TRUE(foo.Blah(GetParam()));
}
#endif // 0
#include "gtest/internal/gtest-port.h"
#if !GTEST_OS_SYMBIAN
# include <utility>
#endif
// scripts/fuse_gtest.py depends on gtest's own header being #included
// *unconditionally*. Therefore these #includes cannot be moved
// inside #if GTEST_HAS_PARAM_TEST.
#include "gtest/internal/gtest-internal.h"
#include "gtest/internal/gtest-param-util.h"
#include "gtest/internal/gtest-param-util-generated.h"
#if GTEST_HAS_PARAM_TEST
namespace testing {
// Functions producing parameter generators.
//
// Google Test uses these generators to produce parameters for value-
// parameterized tests. When a parameterized test case is instantiated
// with a particular generator, Google Test creates and runs tests
// for each element in the sequence produced by the generator.
//
// In the following sample, tests from test case FooTest are instantiated
// each three times with parameter values 3, 5, and 8:
//
// class FooTest : public TestWithParam<int> { ... };
//
// TEST_P(FooTest, TestThis) {
// }
// TEST_P(FooTest, TestThat) {
// }
// INSTANTIATE_TEST_CASE_P(TestSequence, FooTest, Values(3, 5, 8));
//
// Range() returns generators providing sequences of values in a range.
//
// Synopsis:
// Range(start, end)
// - returns a generator producing a sequence of values {start, start+1,
// start+2, ..., }.
// Range(start, end, step)
// - returns a generator producing a sequence of values {start, start+step,
// start+step+step, ..., }.
// Notes:
// * The generated sequences never include end. For example, Range(1, 5)
// returns a generator producing a sequence {1, 2, 3, 4}. Range(1, 9, 2)
// returns a generator producing {1, 3, 5, 7}.
// * start and end must have the same type. That type may be any integral or
// floating-point type or a user defined type satisfying these conditions:
// * It must be assignable (have operator=() defined).
// * It must have operator+() (operator+(int-compatible type) for
// two-operand version).
// * It must have operator<() defined.
// Elements in the resulting sequences will also have that type.
// * Condition start < end must be satisfied in order for resulting sequences
// to contain any elements.
//
template <typename T, typename IncrementT>
internal::ParamGenerator<T> Range(T start, T end, IncrementT step) {
return internal::ParamGenerator<T>(
new internal::RangeGenerator<T, IncrementT>(start, end, step));
}
template <typename T>
internal::ParamGenerator<T> Range(T start, T end) {
return Range(start, end, 1);
}
// ValuesIn() function allows generation of tests with parameters coming from
// a container.
//
// Synopsis:
// ValuesIn(const T (&array)[N])
// - returns a generator producing sequences with elements from
// a C-style array.
// ValuesIn(const Container& container)
// - returns a generator producing sequences with elements from
// an STL-style container.
// ValuesIn(Iterator begin, Iterator end)
// - returns a generator producing sequences with elements from
// a range [begin, end) defined by a pair of STL-style iterators. These
// iterators can also be plain C pointers.
//
// Please note that ValuesIn copies the values from the containers
// passed in and keeps them to generate tests in RUN_ALL_TESTS().
//
// Examples:
//
// This instantiates tests from test case StringTest
// each with C-string values of "foo", "bar", and "baz":
//
// const char* strings[] = {"foo", "bar", "baz"};
// INSTANTIATE_TEST_CASE_P(StringSequence, StringTest, ValuesIn(strings));
//
// This instantiates tests from test case StlStringTest
// each with STL strings with values "a" and "b":
//
// ::std::vector< ::std::string> GetParameterStrings() {
// ::std::vector< ::std::string> v;
// v.push_back("a");
// v.push_back("b");
// return v;
// }
//
// INSTANTIATE_TEST_CASE_P(CharSequence,
// StlStringTest,
// ValuesIn(GetParameterStrings()));
//
//
// This will also instantiate tests from CharTest
// each with parameter values 'a' and 'b':
//
// ::std::list<char> GetParameterChars() {
// ::std::list<char> list;
// list.push_back('a');
// list.push_back('b');
// return list;
// }
// ::std::list<char> l = GetParameterChars();
// INSTANTIATE_TEST_CASE_P(CharSequence2,
// CharTest,
// ValuesIn(l.begin(), l.end()));
//
template <typename ForwardIterator>
internal::ParamGenerator<
typename ::testing::internal::IteratorTraits<ForwardIterator>::value_type>
ValuesIn(ForwardIterator begin, ForwardIterator end) {
typedef typename ::testing::internal::IteratorTraits<ForwardIterator>
::value_type ParamType;
return internal::ParamGenerator<ParamType>(
new internal::ValuesInIteratorRangeGenerator<ParamType>(begin, end));
}
template <typename T, size_t N>
internal::ParamGenerator<T> ValuesIn(const T (&array)[N]) {
return ValuesIn(array, array + N);
}
template <class Container>
internal::ParamGenerator<typename Container::value_type> ValuesIn(
const Container& container) {
return ValuesIn(container.begin(), container.end());
}
// Values() allows generating tests from explicitly specified list of
// parameters.
//
// Synopsis:
// Values(T v1, T v2, ..., T vN)
// - returns a generator producing sequences with elements v1, v2, ..., vN.
//
// For example, this instantiates tests from test case BarTest each
// with values "one", "two", and "three":
//
// INSTANTIATE_TEST_CASE_P(NumSequence, BarTest, Values("one", "two", "three"));
//
// This instantiates tests from test case BazTest each with values 1, 2, 3.5.
// The exact type of values will depend on the type of parameter in BazTest.
//
// INSTANTIATE_TEST_CASE_P(FloatingNumbers, BazTest, Values(1, 2, 3.5));
//
// Currently, Values() supports from 1 to $n parameters.
//
$range i 1..n
$for i [[
$range j 1..i
template <$for j, [[typename T$j]]>
internal::ValueArray$i<$for j, [[T$j]]> Values($for j, [[T$j v$j]]) {
return internal::ValueArray$i<$for j, [[T$j]]>($for j, [[v$j]]);
}
]]
// Bool() allows generating tests with parameters in a set of (false, true).
//
// Synopsis:
// Bool()
// - returns a generator producing sequences with elements {false, true}.
//
// It is useful when testing code that depends on Boolean flags. Combinations
// of multiple flags can be tested when several Bool()'s are combined using
// Combine() function.
//
// In the following example all tests in the test case FlagDependentTest
// will be instantiated twice with parameters false and true.
//
// class FlagDependentTest : public testing::TestWithParam<bool> {
// virtual void SetUp() {
// external_flag = GetParam();
// }
// };
// INSTANTIATE_TEST_CASE_P(BoolSequence, FlagDependentTest, Bool());
//
inline internal::ParamGenerator<bool> Bool() {
return Values(false, true);
}
# if GTEST_HAS_COMBINE
// Combine() allows the user to combine two or more sequences to produce
// values of a Cartesian product of those sequences' elements.
//
// Synopsis:
// Combine(gen1, gen2, ..., genN)
// - returns a generator producing sequences with elements coming from
// the Cartesian product of elements from the sequences generated by
// gen1, gen2, ..., genN. The sequence elements will have a type of
// tuple<T1, T2, ..., TN> where T1, T2, ..., TN are the types
// of elements from sequences produced by gen1, gen2, ..., genN.
//
// Combine can have up to $maxtuple arguments. This number is currently limited
// by the maximum number of elements in the tuple implementation used by Google
// Test.
//
// Example:
//
// This will instantiate tests in test case AnimalTest each one with
// the parameter values tuple("cat", BLACK), tuple("cat", WHITE),
// tuple("dog", BLACK), and tuple("dog", WHITE):
//
// enum Color { BLACK, GRAY, WHITE };
// class AnimalTest
// : public testing::TestWithParam<tuple<const char*, Color> > {...};
//
// TEST_P(AnimalTest, AnimalLooksNice) {...}
//
// INSTANTIATE_TEST_CASE_P(AnimalVariations, AnimalTest,
// Combine(Values("cat", "dog"),
// Values(BLACK, WHITE)));
//
// This will instantiate tests in FlagDependentTest with all variations of two
// Boolean flags:
//
// class FlagDependentTest
// : public testing::TestWithParam<tuple<bool, bool> > {
// virtual void SetUp() {
// // Assigns external_flag_1 and external_flag_2 values from the tuple.
// tie(external_flag_1, external_flag_2) = GetParam();
// }
// };
//
// TEST_P(FlagDependentTest, TestFeature1) {
// // Test your code using external_flag_1 and external_flag_2 here.
// }
// INSTANTIATE_TEST_CASE_P(TwoBoolSequence, FlagDependentTest,
// Combine(Bool(), Bool()));
//
$range i 2..maxtuple
$for i [[
$range j 1..i
template <$for j, [[typename Generator$j]]>
internal::CartesianProductHolder$i<$for j, [[Generator$j]]> Combine(
$for j, [[const Generator$j& g$j]]) {
return internal::CartesianProductHolder$i<$for j, [[Generator$j]]>(
$for j, [[g$j]]);
}
]]
# endif // GTEST_HAS_COMBINE
# define TEST_P(test_case_name, test_name) \
class GTEST_TEST_CLASS_NAME_(test_case_name, test_name) \
: public test_case_name { \
public: \
GTEST_TEST_CLASS_NAME_(test_case_name, test_name)() {} \
virtual void TestBody(); \
private: \
static int AddToRegistry() { \
::testing::UnitTest::GetInstance()->parameterized_test_registry(). \
GetTestCasePatternHolder<test_case_name>(\
#test_case_name, __FILE__, __LINE__)->AddTestPattern(\
#test_case_name, \
#test_name, \
new ::testing::internal::TestMetaFactory< \
GTEST_TEST_CLASS_NAME_(test_case_name, test_name)>()); \
return 0; \
} \
static int gtest_registering_dummy_; \
GTEST_DISALLOW_COPY_AND_ASSIGN_(\
GTEST_TEST_CLASS_NAME_(test_case_name, test_name)); \
}; \
int GTEST_TEST_CLASS_NAME_(test_case_name, \
test_name)::gtest_registering_dummy_ = \
GTEST_TEST_CLASS_NAME_(test_case_name, test_name)::AddToRegistry(); \
void GTEST_TEST_CLASS_NAME_(test_case_name, test_name)::TestBody()
# define INSTANTIATE_TEST_CASE_P(prefix, test_case_name, generator) \
::testing::internal::ParamGenerator<test_case_name::ParamType> \
gtest_##prefix##test_case_name##_EvalGenerator_() { return generator; } \
int gtest_##prefix##test_case_name##_dummy_ = \
::testing::UnitTest::GetInstance()->parameterized_test_registry(). \
GetTestCasePatternHolder<test_case_name>(\
#test_case_name, __FILE__, __LINE__)->AddTestCaseInstantiation(\
#prefix, \
&gtest_##prefix##test_case_name##_EvalGenerator_, \
__FILE__, __LINE__)
} // namespace testing
#endif // GTEST_HAS_PARAM_TEST
#endif // GTEST_INCLUDE_GTEST_GTEST_PARAM_TEST_H_
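
The Combine() comment in the header above describes a Cartesian product of generator sequences. As a rough illustration only, the two-sequence case can be sketched with standard containers. CartesianProduct is a hypothetical helper written for this note, not part of Google Test, and it assumes the ordering given in the AnimalTest example (the last sequence varies fastest).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical helper (not a Google Test API): pairs every element of
// `first` with every element of `second`. The second sequence varies
// fastest, matching the order listed in the AnimalTest example.
template <typename T1, typename T2>
std::vector<std::tuple<T1, T2> > CartesianProduct(
    const std::vector<T1>& first, const std::vector<T2>& second) {
  std::vector<std::tuple<T1, T2> > result;
  result.reserve(first.size() * second.size());
  for (const T1& a : first) {
    for (const T2& b : second) {
      result.push_back(std::make_tuple(a, b));
    }
  }
  // Postcondition: one tuple per (a, b) combination.
  assert(result.size() == first.size() * second.size());
  return result;
}
```

Calling CartesianProduct with {"cat", "dog"} and two color values yields four tuples, one per test instantiation in the AnimalTest example. The real Combine() returns a lazy generator holder rather than a materialized vector.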


@ -1,796 +0,0 @@
// Copyright 2007, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
// Google Test - The Google C++ Testing Framework
//
// This file implements a universal value printer that can print a
// value of any type T:
//
// void ::testing::internal::UniversalPrinter<T>::Print(value, ostream_ptr);
//
// A user can teach this function how to print a class type T by
// defining either operator<<() or PrintTo() in the namespace that
// defines T. More specifically, the FIRST defined function in the
// following list will be used (assuming T is defined in namespace
// foo):
//
// 1. foo::PrintTo(const T&, ostream*)
// 2. operator<<(ostream&, const T&) defined in either foo or the
// global namespace.
//
// If none of the above is defined, it will print the debug string of
// the value if it is a protocol buffer, or print the raw bytes in the
// value otherwise.
//
// To aid debugging: when T is a reference type, the address of the
// value is also printed; when T is a (const) char pointer, both the
// pointer value and the NUL-terminated string it points to are
// printed.
//
// We also provide some convenient wrappers:
//
// // Prints a value to a string. For a (const or not) char
// // pointer, the NUL-terminated string (but not the pointer) is
// // printed.
// std::string ::testing::PrintToString(const T& value);
//
// // Prints a value tersely: for a reference type, the referenced
// // value (but not the address) is printed; for a (const or not) char
// // pointer, the NUL-terminated string (but not the pointer) is
// // printed.
// void ::testing::internal::UniversalTersePrint(const T& value, ostream*);
//
// // Prints value using the type inferred by the compiler. The difference
// // from UniversalTersePrint() is that this function prints both the
// // pointer and the NUL-terminated string for a (const or not) char pointer.
// void ::testing::internal::UniversalPrint(const T& value, ostream*);
//
// // Prints the fields of a tuple tersely to a string vector, one
// // element for each field. Tuple support must be enabled in
// // gtest-port.h.
// std::vector<string> UniversalTersePrintTupleFieldsToStrings(
// const Tuple& value);
//
// Known limitation:
//
// The print primitives print the elements of an STL-style container
// using the compiler-inferred type of *iter where iter is a
// const_iterator of the container. When const_iterator is an input
// iterator but not a forward iterator, this inferred type may not
// match value_type, and the print output may be incorrect. In
// practice, this is rarely a problem as for most containers
// const_iterator is a forward iterator. We'll fix this if there's an
// actual need for it. Note that this fix cannot rely on value_type
// being defined as many user-defined container types don't have
// value_type.
#ifndef GTEST_INCLUDE_GTEST_GTEST_PRINTERS_H_
#define GTEST_INCLUDE_GTEST_GTEST_PRINTERS_H_
#include <ostream> // NOLINT
#include <sstream>
#include <string>
#include <utility>
#include <vector>
#include "gtest/internal/gtest-port.h"
#include "gtest/internal/gtest-internal.h"
namespace testing {
// Definitions in the 'internal' and 'internal2' namespaces are
// subject to change without notice. DO NOT USE THEM IN USER CODE!
namespace internal2 {
// Prints the given number of bytes in the given object to the given
// ostream.
GTEST_API_ void PrintBytesInObjectTo(const unsigned char* obj_bytes,
size_t count,
::std::ostream* os);
// For selecting which printer to use when a given type has neither <<
// nor PrintTo().
enum TypeKind {
kProtobuf, // a protobuf type
kConvertibleToInteger, // a type implicitly convertible to BiggestInt
// (e.g. a named or unnamed enum type)
kOtherType // anything else
};
// TypeWithoutFormatter<T, kTypeKind>::PrintValue(value, os) is called
// by the universal printer to print a value of type T when neither
// operator<< nor PrintTo() is defined for T, where kTypeKind is the
// "kind" of T as defined by enum TypeKind.
template <typename T, TypeKind kTypeKind>
class TypeWithoutFormatter {
public:
// This default version is called when kTypeKind is kOtherType.
static void PrintValue(const T& value, ::std::ostream* os) {
PrintBytesInObjectTo(reinterpret_cast<const unsigned char*>(&value),
sizeof(value), os);
}
};
// We print a protobuf using its ShortDebugString() when the string
// doesn't exceed this many characters; otherwise we print it using
// DebugString() for better readability.
const size_t kProtobufOneLinerMaxLength = 50;
template <typename T>
class TypeWithoutFormatter<T, kProtobuf> {
public:
static void PrintValue(const T& value, ::std::ostream* os) {
const ::testing::internal::string short_str = value.ShortDebugString();
const ::testing::internal::string pretty_str =
short_str.length() <= kProtobufOneLinerMaxLength ?
short_str : ("\n" + value.DebugString());
*os << ("<" + pretty_str + ">");
}
};
template <typename T>
class TypeWithoutFormatter<T, kConvertibleToInteger> {
public:
// Since T has no << operator or PrintTo() but can be implicitly
// converted to BiggestInt, we print it as a BiggestInt.
//
// Most likely T is an enum type (either named or unnamed), in which
// case printing it as an integer is the desired behavior. In case
// T is not an enum, printing it as an integer is the best we can do
// given that it has no user-defined printer.
static void PrintValue(const T& value, ::std::ostream* os) {
const internal::BiggestInt kBigInt = value;
*os << kBigInt;
}
};
// Prints the given value to the given ostream. If the value is a
// protocol message, its debug string is printed; if it's an enum or
// of a type implicitly convertible to BiggestInt, it's printed as an
// integer; otherwise the bytes in the value are printed. This is
// what UniversalPrinter<T>::Print() does when it knows nothing about
// type T and T has neither << operator nor PrintTo().
//
// A user can override this behavior for a class type Foo by defining
// a << operator in the namespace where Foo is defined.
//
// We put this operator in namespace 'internal2' instead of 'internal'
// to simplify the implementation, as much code in 'internal' needs to
// use << in STL, which would conflict with our own << were it defined
// in 'internal'.
//
// Note that this operator<< takes a generic std::basic_ostream<Char,
// CharTraits> type instead of the more restricted std::ostream. If
// we define it to take an std::ostream instead, we'll get an
// "ambiguous overloads" compiler error when trying to print a type
// Foo that supports streaming to std::basic_ostream<Char,
// CharTraits>, as the compiler cannot tell whether
// operator<<(std::ostream&, const T&) or
// operator<<(std::basic_ostream<Char, CharTraits>&, const Foo&) is more
// specific.
template <typename Char, typename CharTraits, typename T>
::std::basic_ostream<Char, CharTraits>& operator<<(
::std::basic_ostream<Char, CharTraits>& os, const T& x) {
TypeWithoutFormatter<T,
(internal::IsAProtocolMessage<T>::value ? kProtobuf :
internal::ImplicitlyConvertible<const T&, internal::BiggestInt>::value ?
kConvertibleToInteger : kOtherType)>::PrintValue(x, &os);
return os;
}
} // namespace internal2
} // namespace testing
// This namespace MUST NOT BE NESTED IN ::testing, or the name look-up
// magic needed for implementing UniversalPrinter won't work.
namespace testing_internal {
// Used to print a value that is not an STL-style container when the
// user doesn't define PrintTo() for it.
template <typename T>
void DefaultPrintNonContainerTo(const T& value, ::std::ostream* os) {
// With the following statement, during unqualified name lookup,
// testing::internal2::operator<< appears as if it was declared in
// the nearest enclosing namespace that contains both
// ::testing_internal and ::testing::internal2, i.e. the global
// namespace. For more details, refer to the C++ Standard section
// 7.3.4-1 [namespace.udir]. This allows us to fall back onto
// testing::internal2::operator<< in case T doesn't come with a <<
// operator.
//
// We cannot write 'using ::testing::internal2::operator<<;', which
// gcc 3.3 fails to compile due to a compiler bug.
using namespace ::testing::internal2; // NOLINT
// Assuming T is defined in namespace foo, in the next statement,
// the compiler will consider all of:
//
// 1. foo::operator<< (thanks to Koenig look-up),
// 2. ::operator<< (as the current namespace is enclosed in ::),
// 3. testing::internal2::operator<< (thanks to the using statement above).
//
// The operator<< whose type matches T best will be picked.
//
// We deliberately allow #2 to be a candidate, as sometimes it's
// impossible to define #1 (e.g. when foo is ::std, defining
// anything in it is undefined behavior unless you are a compiler
// vendor).
*os << value;
}
} // namespace testing_internal
namespace testing {
namespace internal {
// UniversalPrinter<T>::Print(value, ostream_ptr) prints the given
// value to the given ostream. The caller must ensure that
// 'ostream_ptr' is not NULL, or the behavior is undefined.
//
// We define UniversalPrinter as a class template (as opposed to a
// function template), as we need to partially specialize it for
// reference types, which cannot be done with function templates.
template <typename T>
class UniversalPrinter;
template <typename T>
void UniversalPrint(const T& value, ::std::ostream* os);
// Used to print an STL-style container when the user doesn't define
// a PrintTo() for it.
template <typename C>
void DefaultPrintTo(IsContainer /* dummy */,
false_type /* is not a pointer */,
const C& container, ::std::ostream* os) {
const size_t kMaxCount = 32; // The maximum number of elements to print.
*os << '{';
size_t count = 0;
for (typename C::const_iterator it = container.begin();
it != container.end(); ++it, ++count) {
if (count > 0) {
*os << ',';
if (count == kMaxCount) { // Enough has been printed.
*os << " ...";
break;
}
}
*os << ' ';
// We cannot call PrintTo(*it, os) here as PrintTo() doesn't
// handle *it being a native array.
internal::UniversalPrint(*it, os);
}
if (count > 0) {
*os << ' ';
}
*os << '}';
}
// Used to print a pointer that is neither a char pointer nor a member
// pointer, when the user doesn't define PrintTo() for it. (A member
// variable pointer or member function pointer doesn't really point to
// a location in the address space. Their representation is
// implementation-defined. Therefore they will be printed as raw
// bytes.)
template <typename T>
void DefaultPrintTo(IsNotContainer /* dummy */,
true_type /* is a pointer */,
T* p, ::std::ostream* os) {
if (p == NULL) {
*os << "NULL";
} else {
// C++ doesn't allow casting from a function pointer to any object
// pointer.
//
// IsTrue() silences warnings: "Condition is always true",
// "unreachable code".
if (IsTrue(ImplicitlyConvertible<T*, const void*>::value)) {
// T is not a function type. We just call << to print p,
// relying on ADL to pick up user-defined << for their pointer
// types, if any.
*os << p;
} else {
// T is a function type, so '*os << p' doesn't do what we want
// (it just prints p as bool). We want to print p as a const
// void*. However, we cannot cast it to const void* directly,
// even using reinterpret_cast, as earlier versions of gcc
// (e.g. 3.4.5) cannot compile the cast when p is a function
// pointer. Casting to UInt64 first solves the problem.
*os << reinterpret_cast<const void*>(
reinterpret_cast<internal::UInt64>(p));
}
}
}
// Used to print a non-container, non-pointer value when the user
// doesn't define PrintTo() for it.
template <typename T>
void DefaultPrintTo(IsNotContainer /* dummy */,
false_type /* is not a pointer */,
const T& value, ::std::ostream* os) {
::testing_internal::DefaultPrintNonContainerTo(value, os);
}
// Prints the given value using the << operator if it has one;
// otherwise prints the bytes in it. This is what
// UniversalPrinter<T>::Print() does when PrintTo() is not specialized
// or overloaded for type T.
//
// A user can override this behavior for a class type Foo by defining
// an overload of PrintTo() in the namespace where Foo is defined. We
// give the user this option as sometimes defining a << operator for
// Foo is not desirable (e.g. the coding style may prevent doing it,
// or there is already a << operator but it doesn't do what the user
// wants).
template <typename T>
void PrintTo(const T& value, ::std::ostream* os) {
// DefaultPrintTo() is overloaded. The types of its first two
// arguments determine which version will be picked. If T is an
// STL-style container, the version for container will be called; if
// T is a pointer, the pointer version will be called; otherwise the
// generic version will be called.
//
// Note that we check for container types here, before we check
// for protocol message types in our operator<<. The rationale is:
//
// For protocol messages, we want to give people a chance to
// override Google Mock's format by defining a PrintTo() or
// operator<<. For STL containers, other formats can be
// incompatible with Google Mock's format for the container
// elements; therefore we check for container types here to ensure
// that our format is used.
//
// The second argument of DefaultPrintTo() is needed to bypass a bug
// in Symbian's C++ compiler that prevents it from picking the right
// overload between:
//
// PrintTo(const T& x, ...);
// PrintTo(T* x, ...);
DefaultPrintTo(IsContainerTest<T>(0), is_pointer<T>(), value, os);
}
// The following list of PrintTo() overloads tells
// UniversalPrinter<T>::Print() how to print standard types (built-in
// types, strings, plain arrays, and pointers).
// Overloads for various char types.
GTEST_API_ void PrintTo(unsigned char c, ::std::ostream* os);
GTEST_API_ void PrintTo(signed char c, ::std::ostream* os);
inline void PrintTo(char c, ::std::ostream* os) {
// When printing a plain char, we always treat it as unsigned. This
// way, the output won't be affected by whether the compiler thinks
// char is signed or not.
PrintTo(static_cast<unsigned char>(c), os);
}
// Overloads for other simple built-in types.
inline void PrintTo(bool x, ::std::ostream* os) {
*os << (x ? "true" : "false");
}
// Overload for wchar_t type.
// Prints a wchar_t as a symbol if it is printable or as its internal
// code otherwise and also as its decimal code (except for L'\0').
// The L'\0' char is printed as "L'\\0'". The decimal code is printed
// as signed integer when wchar_t is implemented by the compiler
// as a signed type and is printed as an unsigned integer when wchar_t
// is implemented as an unsigned type.
GTEST_API_ void PrintTo(wchar_t wc, ::std::ostream* os);
// Overloads for C strings.
GTEST_API_ void PrintTo(const char* s, ::std::ostream* os);
inline void PrintTo(char* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const char*>(s), os);
}
// signed/unsigned char is often used for representing binary data, so
// we print pointers to it as void* to be safe.
inline void PrintTo(const signed char* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const void*>(s), os);
}
inline void PrintTo(signed char* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const void*>(s), os);
}
inline void PrintTo(const unsigned char* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const void*>(s), os);
}
inline void PrintTo(unsigned char* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const void*>(s), os);
}
// MSVC can be configured to define wchar_t as a typedef of unsigned
// short. It defines _NATIVE_WCHAR_T_DEFINED when wchar_t is a native
// type. When wchar_t is a typedef, defining an overload for const
// wchar_t* would cause unsigned short* to be printed as a wide string,
// possibly causing invalid memory accesses.
#if !defined(_MSC_VER) || defined(_NATIVE_WCHAR_T_DEFINED)
// Overloads for wide C strings
GTEST_API_ void PrintTo(const wchar_t* s, ::std::ostream* os);
inline void PrintTo(wchar_t* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const wchar_t*>(s), os);
}
#endif
// Overload for C arrays. Multi-dimensional arrays are printed
// properly.
// Prints the given number of elements in an array, without printing
// the curly braces.
template <typename T>
void PrintRawArrayTo(const T a[], size_t count, ::std::ostream* os) {
UniversalPrint(a[0], os);
for (size_t i = 1; i != count; i++) {
*os << ", ";
UniversalPrint(a[i], os);
}
}
// Overloads for ::string and ::std::string.
#if GTEST_HAS_GLOBAL_STRING
GTEST_API_ void PrintStringTo(const ::string&s, ::std::ostream* os);
inline void PrintTo(const ::string& s, ::std::ostream* os) {
PrintStringTo(s, os);
}
#endif // GTEST_HAS_GLOBAL_STRING
GTEST_API_ void PrintStringTo(const ::std::string&s, ::std::ostream* os);
inline void PrintTo(const ::std::string& s, ::std::ostream* os) {
PrintStringTo(s, os);
}
// Overloads for ::wstring and ::std::wstring.
#if GTEST_HAS_GLOBAL_WSTRING
GTEST_API_ void PrintWideStringTo(const ::wstring&s, ::std::ostream* os);
inline void PrintTo(const ::wstring& s, ::std::ostream* os) {
PrintWideStringTo(s, os);
}
#endif // GTEST_HAS_GLOBAL_WSTRING
#if GTEST_HAS_STD_WSTRING
GTEST_API_ void PrintWideStringTo(const ::std::wstring&s, ::std::ostream* os);
inline void PrintTo(const ::std::wstring& s, ::std::ostream* os) {
PrintWideStringTo(s, os);
}
#endif // GTEST_HAS_STD_WSTRING
#if GTEST_HAS_TR1_TUPLE
// Overload for ::std::tr1::tuple. Needed for printing function arguments,
// which are packed as tuples.
// Helper function for printing a tuple. T must be instantiated with
// a tuple type.
template <typename T>
void PrintTupleTo(const T& t, ::std::ostream* os);
// Overloaded PrintTo() for tuples of various arities. We support
// tuples of up to 10 fields. The following implementation works
// regardless of whether tr1::tuple is implemented using the
// non-standard variadic template feature or not.
inline void PrintTo(const ::std::tr1::tuple<>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1>
void PrintTo(const ::std::tr1::tuple<T1>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2>
void PrintTo(const ::std::tr1::tuple<T1, T2>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4, T5>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5,
typename T6>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4, T5, T6>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5,
typename T6, typename T7>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4, T5, T6, T7>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5,
typename T6, typename T7, typename T8>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4, T5, T6, T7, T8>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5,
typename T6, typename T7, typename T8, typename T9>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4, T5, T6, T7, T8, T9>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5,
typename T6, typename T7, typename T8, typename T9, typename T10>
void PrintTo(
const ::std::tr1::tuple<T1, T2, T3, T4, T5, T6, T7, T8, T9, T10>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
#endif // GTEST_HAS_TR1_TUPLE
// Overload for std::pair.
template <typename T1, typename T2>
void PrintTo(const ::std::pair<T1, T2>& value, ::std::ostream* os) {
*os << '(';
// We cannot use UniversalPrint(value.first, os) here, as T1 may be
// a reference type. The same for printing value.second.
UniversalPrinter<T1>::Print(value.first, os);
*os << ", ";
UniversalPrinter<T2>::Print(value.second, os);
*os << ')';
}
// Implements printing a non-reference type T by letting the compiler
// pick the right overload of PrintTo() for T.
template <typename T>
class UniversalPrinter {
public:
// MSVC warns about adding const to a function type, so we want to
// disable the warning.
#ifdef _MSC_VER
# pragma warning(push) // Saves the current warning state.
# pragma warning(disable:4180) // Temporarily disables warning 4180.
#endif // _MSC_VER
// Note: we deliberately don't call this PrintTo(), as that name
// conflicts with ::testing::internal::PrintTo in the body of the
// function.
static void Print(const T& value, ::std::ostream* os) {
// By default, ::testing::internal::PrintTo() is used for printing
// the value.
//
// Thanks to Koenig look-up, if T is a class and has its own
// PrintTo() function defined in its namespace, that function will
// be visible here. Since it is more specific than the generic ones
// in ::testing::internal, it will be picked by the compiler in the
// following statement - exactly what we want.
PrintTo(value, os);
}
#ifdef _MSC_VER
# pragma warning(pop) // Restores the warning state.
#endif // _MSC_VER
};
// UniversalPrintArray(begin, len, os) prints an array of 'len'
// elements, starting at address 'begin'.
template <typename T>
void UniversalPrintArray(const T* begin, size_t len, ::std::ostream* os) {
if (len == 0) {
*os << "{}";
} else {
*os << "{ ";
const size_t kThreshold = 18;
const size_t kChunkSize = 8;
// If the array has more than kThreshold elements, we'll have to
// omit some details by printing only the first and the last
// kChunkSize elements.
// TODO(wan@google.com): let the user control the threshold using a flag.
if (len <= kThreshold) {
PrintRawArrayTo(begin, len, os);
} else {
PrintRawArrayTo(begin, kChunkSize, os);
*os << ", ..., ";
PrintRawArrayTo(begin + len - kChunkSize, kChunkSize, os);
}
*os << " }";
}
}
// This overload prints a (const) char array compactly.
GTEST_API_ void UniversalPrintArray(const char* begin,
size_t len,
::std::ostream* os);
// Implements printing an array type T[N].
template <typename T, size_t N>
class UniversalPrinter<T[N]> {
public:
// Prints the given array, omitting some elements when there are too
// many.
static void Print(const T (&a)[N], ::std::ostream* os) {
UniversalPrintArray(a, N, os);
}
};
// Implements printing a reference type T&.
template <typename T>
class UniversalPrinter<T&> {
public:
// MSVC warns about adding const to a function type, so we want to
// disable the warning.
#ifdef _MSC_VER
# pragma warning(push) // Saves the current warning state.
# pragma warning(disable:4180) // Temporarily disables warning 4180.
#endif // _MSC_VER
static void Print(const T& value, ::std::ostream* os) {
// Prints the address of the value. We use reinterpret_cast here
// as static_cast doesn't compile when T is a function type.
*os << "@" << reinterpret_cast<const void*>(&value) << " ";
// Then prints the value itself.
UniversalPrint(value, os);
}
#ifdef _MSC_VER
# pragma warning(pop) // Restores the warning state.
#endif // _MSC_VER
};
// Prints a value tersely: for a reference type, the referenced value
// (but not the address) is printed; for a (const) char pointer, the
// NUL-terminated string (but not the pointer) is printed.
template <typename T>
void UniversalTersePrint(const T& value, ::std::ostream* os) {
UniversalPrint(value, os);
}
inline void UniversalTersePrint(const char* str, ::std::ostream* os) {
if (str == NULL) {
*os << "NULL";
} else {
UniversalPrint(string(str), os);
}
}
inline void UniversalTersePrint(char* str, ::std::ostream* os) {
UniversalTersePrint(static_cast<const char*>(str), os);
}
// Prints a value using the type inferred by the compiler. The
// difference between this and UniversalTersePrint() is that for a
// (const) char pointer, this prints both the pointer and the
// NUL-terminated string.
template <typename T>
void UniversalPrint(const T& value, ::std::ostream* os) {
UniversalPrinter<T>::Print(value, os);
}
#if GTEST_HAS_TR1_TUPLE
typedef ::std::vector<string> Strings;
// This helper template allows PrintTo() for tuples and
// UniversalTersePrintTupleFieldsToStrings() to be defined by
// induction on the number of tuple fields. The idea is that
// TuplePrefixPrinter<N>::PrintPrefixTo(t, os) prints the first N
// fields in tuple t, and can be defined in terms of
// TuplePrefixPrinter<N - 1>.
// The inductive case.
template <size_t N>
struct TuplePrefixPrinter {
// Prints the first N fields of a tuple.
template <typename Tuple>
static void PrintPrefixTo(const Tuple& t, ::std::ostream* os) {
TuplePrefixPrinter<N - 1>::PrintPrefixTo(t, os);
*os << ", ";
UniversalPrinter<typename ::std::tr1::tuple_element<N - 1, Tuple>::type>
::Print(::std::tr1::get<N - 1>(t), os);
}
// Tersely prints the first N fields of a tuple to a string vector,
// one element for each field.
template <typename Tuple>
static void TersePrintPrefixToStrings(const Tuple& t, Strings* strings) {
TuplePrefixPrinter<N - 1>::TersePrintPrefixToStrings(t, strings);
::std::stringstream ss;
UniversalTersePrint(::std::tr1::get<N - 1>(t), &ss);
strings->push_back(ss.str());
}
};
// Base cases.
template <>
struct TuplePrefixPrinter<0> {
template <typename Tuple>
static void PrintPrefixTo(const Tuple&, ::std::ostream*) {}
template <typename Tuple>
static void TersePrintPrefixToStrings(const Tuple&, Strings*) {}
};
// We have to specialize the entire TuplePrefixPrinter<> class
// template here, even though the definition of
// TersePrintPrefixToStrings() is the same as the generic version, as
// Embarcadero (formerly CodeGear, formerly Borland) C++ doesn't
// support specializing a method template of a class template.
template <>
struct TuplePrefixPrinter<1> {
template <typename Tuple>
static void PrintPrefixTo(const Tuple& t, ::std::ostream* os) {
UniversalPrinter<typename ::std::tr1::tuple_element<0, Tuple>::type>::
Print(::std::tr1::get<0>(t), os);
}
template <typename Tuple>
static void TersePrintPrefixToStrings(const Tuple& t, Strings* strings) {
::std::stringstream ss;
UniversalTersePrint(::std::tr1::get<0>(t), &ss);
strings->push_back(ss.str());
}
};
// Helper function for printing a tuple. T must be instantiated with
// a tuple type.
template <typename T>
void PrintTupleTo(const T& t, ::std::ostream* os) {
*os << "(";
TuplePrefixPrinter< ::std::tr1::tuple_size<T>::value>::
PrintPrefixTo(t, os);
*os << ")";
}
// Prints the fields of a tuple tersely to a string vector, one
// element for each field. See the comment before
// UniversalTersePrint() for how we define "tersely".
template <typename Tuple>
Strings UniversalTersePrintTupleFieldsToStrings(const Tuple& value) {
Strings result;
TuplePrefixPrinter< ::std::tr1::tuple_size<Tuple>::value>::
TersePrintPrefixToStrings(value, &result);
return result;
}
#endif // GTEST_HAS_TR1_TUPLE
} // namespace internal
template <typename T>
::std::string PrintToString(const T& value) {
::std::stringstream ss;
internal::UniversalTersePrint(value, &ss);
return ss.str();
}
} // namespace testing
#endif // GTEST_INCLUDE_GTEST_GTEST_PRINTERS_H_
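
UniversalPrintArray() in the header above omits the middle of long arrays: more than kThreshold (18) elements are reduced to the first and last kChunkSize (8). The sketch below reproduces that rule in isolation. FormatArrayWithElision and AppendElements are hypothetical names introduced for this note, and elements are streamed with plain operator<< instead of UniversalPrint(), so the per-element formatting may differ from what Google Test prints.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes `count` elements separated by ", ".
template <typename T>
void AppendElements(const T* a, std::size_t count, std::ostream* os) {
  for (std::size_t i = 0; i != count; ++i) {
    if (i != 0) *os << ", ";
    *os << a[i];
  }
}

// Hypothetical stand-in for the header's array printer. It applies the
// same thresholds (18 and 8) but returns a string instead of writing to
// a caller-supplied stream.
template <typename T>
std::string FormatArrayWithElision(const T* begin, std::size_t len) {
  if (len == 0) return "{}";
  assert(begin != nullptr);
  const std::size_t kThreshold = 18;
  const std::size_t kChunkSize = 8;
  std::ostringstream os;
  os << "{ ";
  if (len <= kThreshold) {
    AppendElements(begin, len, &os);
  } else {
    // Too long: keep only the first and last kChunkSize elements.
    AppendElements(begin, kChunkSize, &os);
    os << ", ..., ";
    AppendElements(begin + len - kChunkSize, kChunkSize, &os);
  }
  os << " }";
  return os.str();
}
```

For a 20-element array holding 0 through 19, this keeps elements 0 to 7 and 12 to 19 and drops 8 to 11. An array of exactly 18 elements is printed in full.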


@ -1,232 +0,0 @@
// Copyright 2007, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
//
// Utilities for testing Google Test itself and code that uses Google Test
// (e.g. frameworks built on top of Google Test).
#ifndef GTEST_INCLUDE_GTEST_GTEST_SPI_H_
#define GTEST_INCLUDE_GTEST_GTEST_SPI_H_
#include "gtest/gtest.h"
namespace testing {
// This helper class can be used to mock out Google Test failure reporting
// so that we can test Google Test or code that builds on Google Test.
//
// An object of this class appends a TestPartResult object to the
// TestPartResultArray object given in the constructor whenever a Google Test
// failure is reported. It can either intercept only failures that are
// generated in the same thread that created this object or it can intercept
// all generated failures. The scope of this mock object is controlled by
// the first argument (intercept_mode) of the two-argument constructor.
class GTEST_API_ ScopedFakeTestPartResultReporter
: public TestPartResultReporterInterface {
public:
// The two possible mocking modes of this object.
enum InterceptMode {
INTERCEPT_ONLY_CURRENT_THREAD, // Intercepts only thread local failures.
INTERCEPT_ALL_THREADS // Intercepts all failures.
};
// The c'tor sets this object as the test part result reporter used
// by Google Test. The 'result' parameter specifies where to report the
// results. This reporter will only catch failures generated in the current
// thread. DEPRECATED
explicit ScopedFakeTestPartResultReporter(TestPartResultArray* result);
// Same as above, but you can choose the interception scope of this object.
ScopedFakeTestPartResultReporter(InterceptMode intercept_mode,
TestPartResultArray* result);
// The d'tor restores the previous test part result reporter.
virtual ~ScopedFakeTestPartResultReporter();
// Appends the TestPartResult object to the TestPartResultArray
// received in the constructor.
//
// This method is from the TestPartResultReporterInterface
// interface.
virtual void ReportTestPartResult(const TestPartResult& result);
private:
void Init();
const InterceptMode intercept_mode_;
TestPartResultReporterInterface* old_reporter_;
TestPartResultArray* const result_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(ScopedFakeTestPartResultReporter);
};
namespace internal {
// A helper class for implementing EXPECT_FATAL_FAILURE() and
// EXPECT_NONFATAL_FAILURE(). Its destructor verifies that the given
// TestPartResultArray contains exactly one failure that has the given
// type and contains the given substring. If that's not the case, a
// non-fatal failure will be generated.
class GTEST_API_ SingleFailureChecker {
public:
// The constructor remembers the arguments.
SingleFailureChecker(const TestPartResultArray* results,
TestPartResult::Type type,
const string& substr);
~SingleFailureChecker();
private:
const TestPartResultArray* const results_;
const TestPartResult::Type type_;
const string substr_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(SingleFailureChecker);
};
} // namespace internal
} // namespace testing
// A set of macros for testing Google Test assertions or code that's expected
// to generate Google Test fatal failures. It verifies that the given
// statement will cause exactly one fatal Google Test failure with 'substr'
// being part of the failure message.
//
// There are two different versions of this macro. EXPECT_FATAL_FAILURE only
// affects and considers failures generated in the current thread and
// EXPECT_FATAL_FAILURE_ON_ALL_THREADS does the same but for all threads.
//
// The verification of the assertion is done correctly even when the statement
// throws an exception or aborts the current function.
//
// Known restrictions:
// - 'statement' cannot reference local non-static variables or
// non-static members of the current object.
// - 'statement' cannot return a value.
// - You cannot stream a failure message to this macro.
//
// Note that even though the implementations of the following two
// macros are much alike, we cannot refactor them to use a common
// helper macro, due to some peculiarity in how the preprocessor
// works. The AcceptsMacroThatExpandsToUnprotectedComma test in
// gtest_unittest.cc will fail to compile if we do that.
#define EXPECT_FATAL_FAILURE(statement, substr) \
do { \
class GTestExpectFatalFailureHelper {\
public:\
static void Execute() { statement; }\
};\
::testing::TestPartResultArray gtest_failures;\
::testing::internal::SingleFailureChecker gtest_checker(\
&gtest_failures, ::testing::TestPartResult::kFatalFailure, (substr));\
{\
::testing::ScopedFakeTestPartResultReporter gtest_reporter(\
::testing::ScopedFakeTestPartResultReporter:: \
INTERCEPT_ONLY_CURRENT_THREAD, &gtest_failures);\
GTestExpectFatalFailureHelper::Execute();\
}\
} while (::testing::internal::AlwaysFalse())
#define EXPECT_FATAL_FAILURE_ON_ALL_THREADS(statement, substr) \
do { \
class GTestExpectFatalFailureHelper {\
public:\
static void Execute() { statement; }\
};\
::testing::TestPartResultArray gtest_failures;\
::testing::internal::SingleFailureChecker gtest_checker(\
&gtest_failures, ::testing::TestPartResult::kFatalFailure, (substr));\
{\
::testing::ScopedFakeTestPartResultReporter gtest_reporter(\
::testing::ScopedFakeTestPartResultReporter:: \
INTERCEPT_ALL_THREADS, &gtest_failures);\
GTestExpectFatalFailureHelper::Execute();\
}\
} while (::testing::internal::AlwaysFalse())
// A macro for testing Google Test assertions or code that's expected to
// generate Google Test non-fatal failures. It asserts that the given
// statement will cause exactly one non-fatal Google Test failure with 'substr'
// being part of the failure message.
//
// There are two different versions of this macro. EXPECT_NONFATAL_FAILURE only
// affects and considers failures generated in the current thread and
// EXPECT_NONFATAL_FAILURE_ON_ALL_THREADS does the same but for all threads.
//
// 'statement' is allowed to reference local variables and members of
// the current object.
//
// The verification of the assertion is done correctly even when the statement
// throws an exception or aborts the current function.
//
// Known restrictions:
// - You cannot stream a failure message to this macro.
//
// Note that even though the implementations of the following two
// macros are much alike, we cannot refactor them to use a common
// helper macro, due to some peculiarity in how the preprocessor
// works. If we do that, the code won't compile when the user gives
// EXPECT_NONFATAL_FAILURE() a statement that contains a macro that
// expands to code containing an unprotected comma. The
// AcceptsMacroThatExpandsToUnprotectedComma test in gtest_unittest.cc
// catches that.
//
// For the same reason, we have to write
// if (::testing::internal::AlwaysTrue()) { statement; }
// instead of
// GTEST_SUPPRESS_UNREACHABLE_CODE_WARNING_BELOW_(statement)
// to avoid an MSVC warning on unreachable code.
#define EXPECT_NONFATAL_FAILURE(statement, substr) \
do {\
::testing::TestPartResultArray gtest_failures;\
::testing::internal::SingleFailureChecker gtest_checker(\
&gtest_failures, ::testing::TestPartResult::kNonFatalFailure, \
(substr));\
{\
::testing::ScopedFakeTestPartResultReporter gtest_reporter(\
::testing::ScopedFakeTestPartResultReporter:: \
INTERCEPT_ONLY_CURRENT_THREAD, &gtest_failures);\
if (::testing::internal::AlwaysTrue()) { statement; }\
}\
} while (::testing::internal::AlwaysFalse())
#define EXPECT_NONFATAL_FAILURE_ON_ALL_THREADS(statement, substr) \
do {\
::testing::TestPartResultArray gtest_failures;\
::testing::internal::SingleFailureChecker gtest_checker(\
&gtest_failures, ::testing::TestPartResult::kNonFatalFailure, \
(substr));\
{\
::testing::ScopedFakeTestPartResultReporter gtest_reporter(\
::testing::ScopedFakeTestPartResultReporter::INTERCEPT_ALL_THREADS,\
&gtest_failures);\
if (::testing::internal::AlwaysTrue()) { statement; }\
}\
} while (::testing::internal::AlwaysFalse())
#endif // GTEST_INCLUDE_GTEST_GTEST_SPI_H_
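// The scoped-interception idea described in gtest-spi.h can be sketched
// without Google Test itself. The block below is a minimal illustration
// under stated assumptions: Failure, Reporter, CountingReporter,
// CurrentReporter, ScopedCapture, ReportFailure and ExactlyOneFailure are
// hypothetical names invented for this sketch, not Google Test APIs, and
// the sketch ignores the per-thread versus all-threads distinction.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a reported test part result.
struct Failure {
  bool fatal;
  std::string message;
};

// Hypothetical stand-in for a result reporter interface.
class Reporter {
 public:
  virtual ~Reporter() {}
  virtual void Report(const Failure& failure) = 0;
};

// The "real" reporter: counts failures that reach the test result.
class CountingReporter : public Reporter {
 public:
  CountingReporter() : count(0) {}
  virtual void Report(const Failure&) { ++count; }
  int count;
};

// Single global slot holding the reporter currently in effect.
inline Reporter*& CurrentReporter() {
  static Reporter* reporter = 0;
  return reporter;
}

// RAII object: installs itself as the current reporter on construction,
// appends every failure to the given vector, and restores the previous
// reporter on destruction.
class ScopedCapture : public Reporter {
 public:
  explicit ScopedCapture(std::vector<Failure>* out)
      : old_(CurrentReporter()), out_(out) {
    CurrentReporter() = this;
  }
  virtual ~ScopedCapture() { CurrentReporter() = old_; }
  virtual void Report(const Failure& failure) { out_->push_back(failure); }

 private:
  Reporter* const old_;
  std::vector<Failure>* const out_;
};

// Routes a failure to whichever reporter is currently installed.
inline void ReportFailure(bool fatal, const std::string& message) {
  Failure failure = {fatal, message};
  if (CurrentReporter() != 0) CurrentReporter()->Report(failure);
}

// The check a single-failure checker would perform: exactly one captured
// failure, of the expected kind, whose message contains the substring.
inline bool ExactlyOneFailure(const std::vector<Failure>& failures,
                              bool fatal, const std::string& substr) {
  return failures.size() == 1 && failures[0].fatal == fatal &&
         failures[0].message.find(substr) != std::string::npos;
}
```

// While a ScopedCapture is alive, failures land in the caller's vector
// instead of the real reporter; once it goes out of scope, reporting
// returns to the previous reporter.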

@@ -1,176 +0,0 @@
// Copyright 2008, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: mheule@google.com (Markus Heule)
//
#ifndef GTEST_INCLUDE_GTEST_GTEST_TEST_PART_H_
#define GTEST_INCLUDE_GTEST_GTEST_TEST_PART_H_
#include <iosfwd>
#include <vector>
#include "gtest/internal/gtest-internal.h"
#include "gtest/internal/gtest-string.h"
namespace testing {
// A copyable object representing the result of a test part (i.e. an
// assertion or an explicit FAIL(), ADD_FAILURE(), or SUCCEED()).
//
// Don't inherit from TestPartResult as its destructor is not virtual.
class GTEST_API_ TestPartResult {
public:
// The possible outcomes of a test part (i.e. an assertion or an
// explicit SUCCEED(), FAIL(), or ADD_FAILURE()).
enum Type {
kSuccess, // Succeeded.
kNonFatalFailure, // Failed but the test can continue.
kFatalFailure // Failed and the test should be terminated.
};
// C'tor. TestPartResult does NOT have a default constructor.
// Always use this constructor (with parameters) to create a
// TestPartResult object.
TestPartResult(Type a_type,
const char* a_file_name,
int a_line_number,
const char* a_message)
: type_(a_type),
file_name_(a_file_name),
line_number_(a_line_number),
summary_(ExtractSummary(a_message)),
message_(a_message) {
}
// Gets the outcome of the test part.
Type type() const { return type_; }
// Gets the name of the source file where the test part took place, or
// NULL if it's unknown.
const char* file_name() const { return file_name_.c_str(); }
// Gets the line in the source file where the test part took place,
// or -1 if it's unknown.
int line_number() const { return line_number_; }
// Gets the summary of the failure message.
const char* summary() const { return summary_.c_str(); }
// Gets the message associated with the test part.
const char* message() const { return message_.c_str(); }
// Returns true iff the test part passed.
bool passed() const { return type_ == kSuccess; }
// Returns true iff the test part failed.
bool failed() const { return type_ != kSuccess; }
// Returns true iff the test part non-fatally failed.
bool nonfatally_failed() const { return type_ == kNonFatalFailure; }
// Returns true iff the test part fatally failed.
bool fatally_failed() const { return type_ == kFatalFailure; }
private:
Type type_;
// Gets the summary of the failure message by omitting the stack
// trace in it.
static internal::String ExtractSummary(const char* message);
// The name of the source file where the test part took place, or
// NULL if the source file is unknown.
internal::String file_name_;
// The line in the source file where the test part took place, or -1
// if the line number is unknown.
int line_number_;
internal::String summary_; // The test failure summary.
internal::String message_; // The test failure message.
};
// Prints a TestPartResult object.
std::ostream& operator<<(std::ostream& os, const TestPartResult& result);
// An array of TestPartResult objects.
//
// Don't inherit from TestPartResultArray as its destructor is not
// virtual.
class GTEST_API_ TestPartResultArray {
public:
TestPartResultArray() {}
// Appends the given TestPartResult to the array.
void Append(const TestPartResult& result);
// Returns the TestPartResult at the given index (0-based).
const TestPartResult& GetTestPartResult(int index) const;
// Returns the number of TestPartResult objects in the array.
int size() const;
private:
std::vector<TestPartResult> array_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(TestPartResultArray);
};
// This interface knows how to report a test part result.
class TestPartResultReporterInterface {
public:
virtual ~TestPartResultReporterInterface() {}
virtual void ReportTestPartResult(const TestPartResult& result) = 0;
};
namespace internal {
// This helper class is used by {ASSERT|EXPECT}_NO_FATAL_FAILURE to check if a
// statement generates new fatal failures. To do so it registers itself as the
// current test part result reporter. Besides checking if fatal failures were
// reported, it only delegates the reporting to the former result reporter.
// The original result reporter is restored in the destructor.
// INTERNAL IMPLEMENTATION - DO NOT USE IN A USER PROGRAM.
class GTEST_API_ HasNewFatalFailureHelper
: public TestPartResultReporterInterface {
public:
HasNewFatalFailureHelper();
virtual ~HasNewFatalFailureHelper();
virtual void ReportTestPartResult(const TestPartResult& result);
bool has_new_fatal_failure() const { return has_new_fatal_failure_; }
private:
bool has_new_fatal_failure_;
TestPartResultReporterInterface* original_reporter_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(HasNewFatalFailureHelper);
};
} // namespace internal
} // namespace testing
#endif // GTEST_INCLUDE_GTEST_GTEST_TEST_PART_H_
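// The shape of a test part result (an outcome type, a source location, the
// full message, and a summary with the stack trace omitted) can be sketched
// as follows. This is an illustration only: PartType, PartResult and
// kTraceMarker are hypothetical names, std::string stands in for
// internal::String, and the marker text that separates the summary from the
// stack trace is an assumption of this sketch.

```cpp
#include <string>

// Hypothetical mirror of the three outcomes listed in the header.
enum PartType { kSuccess, kNonFatalFailure, kFatalFailure };

// ASSUMPTION: the text that introduces the stack trace in a message.
const char kTraceMarker[] = "\nStack trace:\n";

// Returns the message up to (not including) the stack trace marker, or the
// whole message if no marker is present.
inline std::string ExtractSummary(const std::string& message) {
  const std::string::size_type pos = message.find(kTraceMarker);
  return pos == std::string::npos ? message : message.substr(0, pos);
}

// Copyable value type; no default constructor, as in the header above.
class PartResult {
 public:
  PartResult(PartType type, const std::string& file_name, int line_number,
             const std::string& message)
      : type_(type),
        file_name_(file_name),
        line_number_(line_number),
        summary_(ExtractSummary(message)),
        message_(message) {}

  PartType type() const { return type_; }
  const std::string& file_name() const { return file_name_; }
  int line_number() const { return line_number_; }
  const std::string& summary() const { return summary_; }
  const std::string& message() const { return message_; }

  bool passed() const { return type_ == kSuccess; }
  bool failed() const { return type_ != kSuccess; }
  bool nonfatally_failed() const { return type_ == kNonFatalFailure; }
  bool fatally_failed() const { return type_ == kFatalFailure; }

 private:
  PartType type_;
  std::string file_name_;
  int line_number_;
  std::string summary_;
  std::string message_;
};
```

// Computing the summary once in the constructor keeps the accessors trivial
// and lets the object be copied freely into a result array.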

@@ -1,259 +0,0 @@
// Copyright 2008 Google Inc.
// All Rights Reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
#ifndef GTEST_INCLUDE_GTEST_GTEST_TYPED_TEST_H_
#define GTEST_INCLUDE_GTEST_GTEST_TYPED_TEST_H_
// This header implements typed tests and type-parameterized tests.
// Typed (aka type-driven) tests repeat the same test for types in a
// list. You must know which types you want to test with when writing
// typed tests. Here's how you do it:
#if 0
// First, define a fixture class template. It should be parameterized
// by a type. Remember to derive it from testing::Test.
template <typename T>
class FooTest : public testing::Test {
public:
...
typedef std::list<T> List;
static T shared_;
T value_;
};
// Next, associate a list of types with the test case, which will be
// repeated for each type in the list. The typedef is necessary for
// the macro to parse correctly.
typedef testing::Types<char, int, unsigned int> MyTypes;
TYPED_TEST_CASE(FooTest, MyTypes);
// If the type list contains only one type, you can write that type
// directly without Types<...>:
// TYPED_TEST_CASE(FooTest, int);
// Then, use TYPED_TEST() instead of TEST_F() to define as many typed
// tests for this test case as you want.
TYPED_TEST(FooTest, DoesBlah) {
// Inside a test, refer to TypeParam to get the type parameter.
// Since we are inside a derived class template, C++ requires us to
// access the members of FooTest via 'this'.
TypeParam n = this->value_;
// To visit static members of the fixture, add the TestFixture::
// prefix.
n += TestFixture::shared_;
// To refer to typedefs in the fixture, add the "typename
// TestFixture::" prefix.
typename TestFixture::List values;
values.push_back(n);
...
}
TYPED_TEST(FooTest, HasPropertyA) { ... }
#endif // 0
// Type-parameterized tests are abstract test patterns parameterized
// by a type. Compared with typed tests, type-parameterized tests
// allow you to define the test pattern without knowing what the type
// parameters are. The defined pattern can be instantiated with
// different types any number of times, in any number of translation
// units.
//
// If you are designing an interface or concept, you can define a
// suite of type-parameterized tests to verify properties that any
// valid implementation of the interface/concept should have. Then,
// each implementation can easily instantiate the test suite to verify
// that it conforms to the requirements, without having to write
// similar tests repeatedly. Here's an example:
#if 0
// First, define a fixture class template. It should be parameterized
// by a type. Remember to derive it from testing::Test.
template <typename T>
class FooTest : public testing::Test {
...
};
// Next, declare that you will define a type-parameterized test case
// (the _P suffix is for "parameterized" or "pattern", whichever you
// prefer):
TYPED_TEST_CASE_P(FooTest);
// Then, use TYPED_TEST_P() to define as many type-parameterized tests
// for this type-parameterized test case as you want.
TYPED_TEST_P(FooTest, DoesBlah) {
// Inside a test, refer to TypeParam to get the type parameter.
TypeParam n = 0;
...
}
TYPED_TEST_P(FooTest, HasPropertyA) { ... }
// Now the tricky part: you need to register all test patterns before
// you can instantiate them. The first argument of the macro is the
// test case name; the rest are the names of the tests in this test
// case.
REGISTER_TYPED_TEST_CASE_P(FooTest,
DoesBlah, HasPropertyA);
// Finally, you are free to instantiate the pattern with the types you
// want. If you put the above code in a header file, you can #include
// it in multiple C++ source files and instantiate it multiple times.
//
// To distinguish different instances of the pattern, the first
// argument to the INSTANTIATE_* macro is a prefix that will be added
// to the actual test case name. Remember to pick unique prefixes for
// different instances.
typedef testing::Types<char, int, unsigned int> MyTypes;
INSTANTIATE_TYPED_TEST_CASE_P(My, FooTest, MyTypes);
// If the type list contains only one type, you can write that type
// directly without Types<...>:
// INSTANTIATE_TYPED_TEST_CASE_P(My, FooTest, int);
#endif // 0
#include "gtest/internal/gtest-port.h"
#include "gtest/internal/gtest-type-util.h"
// Implements typed tests.
#if GTEST_HAS_TYPED_TEST
// INTERNAL IMPLEMENTATION - DO NOT USE IN USER CODE.
//
// Expands to the name of the typedef for the type parameters of the
// given test case.
# define GTEST_TYPE_PARAMS_(TestCaseName) gtest_type_params_##TestCaseName##_
// The 'Types' template argument below must have spaces around it
// since some compilers may choke on '>>' when passing a template
// instance (e.g. Types<int>)
# define TYPED_TEST_CASE(CaseName, Types) \
typedef ::testing::internal::TypeList< Types >::type \
GTEST_TYPE_PARAMS_(CaseName)
# define TYPED_TEST(CaseName, TestName) \
template <typename gtest_TypeParam_> \
class GTEST_TEST_CLASS_NAME_(CaseName, TestName) \
: public CaseName<gtest_TypeParam_> { \
private: \
typedef CaseName<gtest_TypeParam_> TestFixture; \
typedef gtest_TypeParam_ TypeParam; \
virtual void TestBody(); \
}; \
bool gtest_##CaseName##_##TestName##_registered_ GTEST_ATTRIBUTE_UNUSED_ = \
::testing::internal::TypeParameterizedTest< \
CaseName, \
::testing::internal::TemplateSel< \
GTEST_TEST_CLASS_NAME_(CaseName, TestName)>, \
GTEST_TYPE_PARAMS_(CaseName)>::Register(\
"", #CaseName, #TestName, 0); \
template <typename gtest_TypeParam_> \
void GTEST_TEST_CLASS_NAME_(CaseName, TestName)<gtest_TypeParam_>::TestBody()
#endif // GTEST_HAS_TYPED_TEST
// Implements type-parameterized tests.
#if GTEST_HAS_TYPED_TEST_P
// INTERNAL IMPLEMENTATION - DO NOT USE IN USER CODE.
//
// Expands to the namespace name that the type-parameterized tests for
// the given type-parameterized test case are defined in. The exact
// name of the namespace is subject to change without notice.
# define GTEST_CASE_NAMESPACE_(TestCaseName) \
gtest_case_##TestCaseName##_
// INTERNAL IMPLEMENTATION - DO NOT USE IN USER CODE.
//
// Expands to the name of the variable used to remember the names of
// the defined tests in the given test case.
# define GTEST_TYPED_TEST_CASE_P_STATE_(TestCaseName) \
gtest_typed_test_case_p_state_##TestCaseName##_
// INTERNAL IMPLEMENTATION - DO NOT USE IN USER CODE DIRECTLY.
//
// Expands to the name of the variable used to remember the names of
// the registered tests in the given test case.
# define GTEST_REGISTERED_TEST_NAMES_(TestCaseName) \
gtest_registered_test_names_##TestCaseName##_
// The variables defined in the type-parameterized test macros are
// static as typically these macros are used in a .h file that can be
// #included in multiple translation units linked together.
# define TYPED_TEST_CASE_P(CaseName) \
static ::testing::internal::TypedTestCasePState \
GTEST_TYPED_TEST_CASE_P_STATE_(CaseName)
# define TYPED_TEST_P(CaseName, TestName) \
namespace GTEST_CASE_NAMESPACE_(CaseName) { \
template <typename gtest_TypeParam_> \
class TestName : public CaseName<gtest_TypeParam_> { \
private: \
typedef CaseName<gtest_TypeParam_> TestFixture; \
typedef gtest_TypeParam_ TypeParam; \
virtual void TestBody(); \
}; \
static bool gtest_##TestName##_defined_ GTEST_ATTRIBUTE_UNUSED_ = \
GTEST_TYPED_TEST_CASE_P_STATE_(CaseName).AddTestName(\
__FILE__, __LINE__, #CaseName, #TestName); \
} \
template <typename gtest_TypeParam_> \
void GTEST_CASE_NAMESPACE_(CaseName)::TestName<gtest_TypeParam_>::TestBody()
# define REGISTER_TYPED_TEST_CASE_P(CaseName, ...) \
namespace GTEST_CASE_NAMESPACE_(CaseName) { \
typedef ::testing::internal::Templates<__VA_ARGS__>::type gtest_AllTests_; \
} \
static const char* const GTEST_REGISTERED_TEST_NAMES_(CaseName) = \
GTEST_TYPED_TEST_CASE_P_STATE_(CaseName).VerifyRegisteredTestNames(\
__FILE__, __LINE__, #__VA_ARGS__)
// The 'Types' template argument below must have spaces around it
// since some compilers may choke on '>>' when passing a template
// instance (e.g. Types<int>)
# define INSTANTIATE_TYPED_TEST_CASE_P(Prefix, CaseName, Types) \
bool gtest_##Prefix##_##CaseName GTEST_ATTRIBUTE_UNUSED_ = \
::testing::internal::TypeParameterizedTestCase<CaseName, \
GTEST_CASE_NAMESPACE_(CaseName)::gtest_AllTests_, \
::testing::internal::TypeList< Types >::type>::Register(\
#Prefix, #CaseName, GTEST_REGISTERED_TEST_NAMES_(CaseName))
#endif // GTEST_HAS_TYPED_TEST_P
#endif // GTEST_INCLUDE_GTEST_GTEST_TYPED_TEST_H_
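// The core idea behind typed tests (write the test body once, then repeat
// it for every type in a list) can be sketched with plain templates. This
// is not how the macros above are implemented; it is a minimal C++11
// illustration, and TypeList, ValueInitIsZero, IsSigned and RunForEach are
// hypothetical names chosen for this sketch.

```cpp
#include <limits>

// Hypothetical stand-in for a list of types to test with.
template <typename... Ts>
struct TypeList {};

// Two "test bodies", each written once and parameterized by a type.
template <typename T>
struct ValueInitIsZero {
  static bool Run() { return T() == T(0); }
};

template <typename T>
struct IsSigned {
  static bool Run() { return std::numeric_limits<T>::is_signed; }
};

// Base case: the list is empty, so there is nothing left to run.
template <template <typename> class Check>
bool RunForEach(TypeList<>, int*) {
  return true;
}

// Recursive case: run the check for the first type, then for the rest.
// Every type is visited even if an earlier one fails; *runs counts them.
template <template <typename> class Check, typename Head, typename... Tail>
bool RunForEach(TypeList<Head, Tail...>, int* runs) {
  ++*runs;
  const bool head_ok = Check<Head>::Run();
  const bool tail_ok = RunForEach<Check>(TypeList<Tail...>(), runs);
  return head_ok && tail_ok;
}
```

// For example, RunForEach<ValueInitIsZero>(TypeList<char, int>(), &runs)
// instantiates and runs the same body for char and for int.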

File diff suppressed because it is too large

@@ -1,358 +0,0 @@
// Copyright 2006, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// This file is AUTOMATICALLY GENERATED on 09/24/2010 by command
// 'gen_gtest_pred_impl.py 5'. DO NOT EDIT BY HAND!
//
// Implements a family of generic predicate assertion macros.
#ifndef GTEST_INCLUDE_GTEST_GTEST_PRED_IMPL_H_
#define GTEST_INCLUDE_GTEST_GTEST_PRED_IMPL_H_
// Makes sure this header is not included before gtest.h.
#ifndef GTEST_INCLUDE_GTEST_GTEST_H_
# error Do not include gtest_pred_impl.h directly. Include gtest.h instead.
#endif // GTEST_INCLUDE_GTEST_GTEST_H_
// This header implements a family of generic predicate assertion
// macros:
//
// ASSERT_PRED_FORMAT1(pred_format, v1)
// ASSERT_PRED_FORMAT2(pred_format, v1, v2)
// ...
//
// where pred_format is a function or functor that takes n (in the
// case of ASSERT_PRED_FORMATn) values and their source expression
// text, and returns a testing::AssertionResult. See the definition
// of ASSERT_EQ in gtest.h for an example.
//
// If you don't care about formatting, you can use the more
// restrictive version:
//
// ASSERT_PRED1(pred, v1)
// ASSERT_PRED2(pred, v1, v2)
// ...
//
// where pred is an n-ary function or functor that returns bool,
// and the values v1, v2, ..., must support the << operator for
// streaming to std::ostream.
//
// We also define the EXPECT_* variations.
//
// For now we only support predicates whose arity is at most 5.
// Please email googletestframework@googlegroups.com if you need
// support for higher arities.
// GTEST_ASSERT_ is the basic statement to which all of the assertions
// in this file reduce. Don't use this in your code.
#define GTEST_ASSERT_(expression, on_failure) \
GTEST_AMBIGUOUS_ELSE_BLOCKER_ \
if (const ::testing::AssertionResult gtest_ar = (expression)) \
; \
else \
on_failure(gtest_ar.failure_message())
// Helper function for implementing {EXPECT|ASSERT}_PRED1. Don't use
// this in your code.
template <typename Pred,
typename T1>
AssertionResult AssertPred1Helper(const char* pred_text,
const char* e1,
Pred pred,
const T1& v1) {
if (pred(v1)) return AssertionSuccess();
return AssertionFailure() << pred_text << "("
<< e1 << ") evaluates to false, where"
<< "\n" << e1 << " evaluates to " << v1;
}
// Internal macro for implementing {EXPECT|ASSERT}_PRED_FORMAT1.
// Don't use this in your code.
#define GTEST_PRED_FORMAT1_(pred_format, v1, on_failure)\
GTEST_ASSERT_(pred_format(#v1, v1),\
on_failure)
// Internal macro for implementing {EXPECT|ASSERT}_PRED1. Don't use
// this in your code.
#define GTEST_PRED1_(pred, v1, on_failure)\
GTEST_ASSERT_(::testing::AssertPred1Helper(#pred, \
#v1, \
pred, \
v1), on_failure)
// Unary predicate assertion macros.
#define EXPECT_PRED_FORMAT1(pred_format, v1) \
GTEST_PRED_FORMAT1_(pred_format, v1, GTEST_NONFATAL_FAILURE_)
#define EXPECT_PRED1(pred, v1) \
GTEST_PRED1_(pred, v1, GTEST_NONFATAL_FAILURE_)
#define ASSERT_PRED_FORMAT1(pred_format, v1) \
GTEST_PRED_FORMAT1_(pred_format, v1, GTEST_FATAL_FAILURE_)
#define ASSERT_PRED1(pred, v1) \
GTEST_PRED1_(pred, v1, GTEST_FATAL_FAILURE_)
// Helper function for implementing {EXPECT|ASSERT}_PRED2. Don't use
// this in your code.
template <typename Pred,
typename T1,
typename T2>
AssertionResult AssertPred2Helper(const char* pred_text,
const char* e1,
const char* e2,
Pred pred,
const T1& v1,
const T2& v2) {
if (pred(v1, v2)) return AssertionSuccess();
return AssertionFailure() << pred_text << "("
<< e1 << ", "
<< e2 << ") evaluates to false, where"
<< "\n" << e1 << " evaluates to " << v1
<< "\n" << e2 << " evaluates to " << v2;
}
// Internal macro for implementing {EXPECT|ASSERT}_PRED_FORMAT2.
// Don't use this in your code.
#define GTEST_PRED_FORMAT2_(pred_format, v1, v2, on_failure)\
GTEST_ASSERT_(pred_format(#v1, #v2, v1, v2),\
on_failure)
// Internal macro for implementing {EXPECT|ASSERT}_PRED2. Don't use
// this in your code.
#define GTEST_PRED2_(pred, v1, v2, on_failure)\
GTEST_ASSERT_(::testing::AssertPred2Helper(#pred, \
#v1, \
#v2, \
pred, \
v1, \
v2), on_failure)
// Binary predicate assertion macros.
#define EXPECT_PRED_FORMAT2(pred_format, v1, v2) \
GTEST_PRED_FORMAT2_(pred_format, v1, v2, GTEST_NONFATAL_FAILURE_)
#define EXPECT_PRED2(pred, v1, v2) \
GTEST_PRED2_(pred, v1, v2, GTEST_NONFATAL_FAILURE_)
#define ASSERT_PRED_FORMAT2(pred_format, v1, v2) \
GTEST_PRED_FORMAT2_(pred_format, v1, v2, GTEST_FATAL_FAILURE_)
#define ASSERT_PRED2(pred, v1, v2) \
GTEST_PRED2_(pred, v1, v2, GTEST_FATAL_FAILURE_)
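// How the helper functions and macros above cooperate can be sketched
// without Google Test: the macro stringifies the predicate and argument
// expressions, and the helper builds the failure text only when the
// predicate returns false. DescribePred2, SKETCH_PRED2 and MutuallyPrime
// are hypothetical names for this illustration; an empty string stands in
// for AssertionSuccess().

```cpp
#include <sstream>
#include <string>

// Returns an empty string on success, otherwise a message in the same
// layout as the AssertPred2Helper text above.
template <typename Pred, typename T1, typename T2>
std::string DescribePred2(const char* pred_text, const char* e1,
                          const char* e2, Pred pred, const T1& v1,
                          const T2& v2) {
  if (pred(v1, v2)) return std::string();
  std::ostringstream os;
  os << pred_text << "(" << e1 << ", " << e2
     << ") evaluates to false, where"
     << "\n" << e1 << " evaluates to " << v1
     << "\n" << e2 << " evaluates to " << v2;
  return os.str();
}

// Captures both the source text (#pred, #v1, #v2) and the values.
#define SKETCH_PRED2(pred, v1, v2) \
  DescribePred2(#pred, #v1, #v2, pred, v1, v2)

// Example predicate: true when the greatest common divisor of a and b is 1.
inline bool MutuallyPrime(int a, int b) {
  while (b != 0) {
    const int t = a % b;
    a = b;
    b = t;
  }
  return a == 1;
}
```

// Because the expression text is captured by the preprocessor, the message
// can name the variables as they were written at the call site.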
// Helper function for implementing {EXPECT|ASSERT}_PRED3. Don't use
// this in your code.
template <typename Pred,
typename T1,
typename T2,
typename T3>
AssertionResult AssertPred3Helper(const char* pred_text,
const char* e1,
const char* e2,
const char* e3,
Pred pred,
const T1& v1,
const T2& v2,
const T3& v3) {
if (pred(v1, v2, v3)) return AssertionSuccess();
return AssertionFailure() << pred_text << "("
<< e1 << ", "
<< e2 << ", "
<< e3 << ") evaluates to false, where"
<< "\n" << e1 << " evaluates to " << v1
<< "\n" << e2 << " evaluates to " << v2
<< "\n" << e3 << " evaluates to " << v3;
}
// Internal macro for implementing {EXPECT|ASSERT}_PRED_FORMAT3.
// Don't use this in your code.
#define GTEST_PRED_FORMAT3_(pred_format, v1, v2, v3, on_failure)\
GTEST_ASSERT_(pred_format(#v1, #v2, #v3, v1, v2, v3),\
on_failure)
// Internal macro for implementing {EXPECT|ASSERT}_PRED3. Don't use
// this in your code.
#define GTEST_PRED3_(pred, v1, v2, v3, on_failure)\
GTEST_ASSERT_(::testing::AssertPred3Helper(#pred, \
#v1, \
#v2, \
#v3, \
pred, \
v1, \
v2, \
v3), on_failure)
// Ternary predicate assertion macros.
#define EXPECT_PRED_FORMAT3(pred_format, v1, v2, v3) \
GTEST_PRED_FORMAT3_(pred_format, v1, v2, v3, GTEST_NONFATAL_FAILURE_)
#define EXPECT_PRED3(pred, v1, v2, v3) \
GTEST_PRED3_(pred, v1, v2, v3, GTEST_NONFATAL_FAILURE_)
#define ASSERT_PRED_FORMAT3(pred_format, v1, v2, v3) \
GTEST_PRED_FORMAT3_(pred_format, v1, v2, v3, GTEST_FATAL_FAILURE_)
#define ASSERT_PRED3(pred, v1, v2, v3) \
GTEST_PRED3_(pred, v1, v2, v3, GTEST_FATAL_FAILURE_)
// Helper function for implementing {EXPECT|ASSERT}_PRED4. Don't use
// this in your code.
template <typename Pred,
typename T1,
typename T2,
typename T3,
typename T4>
AssertionResult AssertPred4Helper(const char* pred_text,
const char* e1,
const char* e2,
const char* e3,
const char* e4,
Pred pred,
const T1& v1,
const T2& v2,
const T3& v3,
const T4& v4) {
if (pred(v1, v2, v3, v4)) return AssertionSuccess();
return AssertionFailure() << pred_text << "("
<< e1 << ", "
<< e2 << ", "
<< e3 << ", "
<< e4 << ") evaluates to false, where"
<< "\n" << e1 << " evaluates to " << v1
<< "\n" << e2 << " evaluates to " << v2
<< "\n" << e3 << " evaluates to " << v3
<< "\n" << e4 << " evaluates to " << v4;
}
// Internal macro for implementing {EXPECT|ASSERT}_PRED_FORMAT4.
// Don't use this in your code.
#define GTEST_PRED_FORMAT4_(pred_format, v1, v2, v3, v4, on_failure)\
GTEST_ASSERT_(pred_format(#v1, #v2, #v3, #v4, v1, v2, v3, v4),\
on_failure)
// Internal macro for implementing {EXPECT|ASSERT}_PRED4. Don't use
// this in your code.
#define GTEST_PRED4_(pred, v1, v2, v3, v4, on_failure)\
GTEST_ASSERT_(::testing::AssertPred4Helper(#pred, \
#v1, \
#v2, \
#v3, \
#v4, \
pred, \
v1, \
v2, \
v3, \
v4), on_failure)
// 4-ary predicate assertion macros.
#define EXPECT_PRED_FORMAT4(pred_format, v1, v2, v3, v4) \
GTEST_PRED_FORMAT4_(pred_format, v1, v2, v3, v4, GTEST_NONFATAL_FAILURE_)
#define EXPECT_PRED4(pred, v1, v2, v3, v4) \
GTEST_PRED4_(pred, v1, v2, v3, v4, GTEST_NONFATAL_FAILURE_)
#define ASSERT_PRED_FORMAT4(pred_format, v1, v2, v3, v4) \
GTEST_PRED_FORMAT4_(pred_format, v1, v2, v3, v4, GTEST_FATAL_FAILURE_)
#define ASSERT_PRED4(pred, v1, v2, v3, v4) \
GTEST_PRED4_(pred, v1, v2, v3, v4, GTEST_FATAL_FAILURE_)
// Helper function for implementing {EXPECT|ASSERT}_PRED5. Don't use
// this in your code.
template <typename Pred,
typename T1,
typename T2,
typename T3,
typename T4,
typename T5>
AssertionResult AssertPred5Helper(const char* pred_text,
const char* e1,
const char* e2,
const char* e3,
const char* e4,
const char* e5,
Pred pred,
const T1& v1,
const T2& v2,
const T3& v3,
const T4& v4,
const T5& v5) {
if (pred(v1, v2, v3, v4, v5)) return AssertionSuccess();
return AssertionFailure() << pred_text << "("
<< e1 << ", "
<< e2 << ", "
<< e3 << ", "
<< e4 << ", "
<< e5 << ") evaluates to false, where"
<< "\n" << e1 << " evaluates to " << v1
<< "\n" << e2 << " evaluates to " << v2
<< "\n" << e3 << " evaluates to " << v3
<< "\n" << e4 << " evaluates to " << v4
<< "\n" << e5 << " evaluates to " << v5;
}
// Internal macro for implementing {EXPECT|ASSERT}_PRED_FORMAT5.
// Don't use this in your code.
#define GTEST_PRED_FORMAT5_(pred_format, v1, v2, v3, v4, v5, on_failure)\
GTEST_ASSERT_(pred_format(#v1, #v2, #v3, #v4, #v5, v1, v2, v3, v4, v5),\
on_failure)
// Internal macro for implementing {EXPECT|ASSERT}_PRED5. Don't use
// this in your code.
#define GTEST_PRED5_(pred, v1, v2, v3, v4, v5, on_failure)\
GTEST_ASSERT_(::testing::AssertPred5Helper(#pred, \
#v1, \
#v2, \
#v3, \
#v4, \
#v5, \
pred, \
v1, \
v2, \
v3, \
v4, \
v5), on_failure)
// 5-ary predicate assertion macros.
#define EXPECT_PRED_FORMAT5(pred_format, v1, v2, v3, v4, v5) \
GTEST_PRED_FORMAT5_(pred_format, v1, v2, v3, v4, v5, GTEST_NONFATAL_FAILURE_)
#define EXPECT_PRED5(pred, v1, v2, v3, v4, v5) \
GTEST_PRED5_(pred, v1, v2, v3, v4, v5, GTEST_NONFATAL_FAILURE_)
#define ASSERT_PRED_FORMAT5(pred_format, v1, v2, v3, v4, v5) \
GTEST_PRED_FORMAT5_(pred_format, v1, v2, v3, v4, v5, GTEST_FATAL_FAILURE_)
#define ASSERT_PRED5(pred, v1, v2, v3, v4, v5) \
GTEST_PRED5_(pred, v1, v2, v3, v4, v5, GTEST_FATAL_FAILURE_)
#endif // GTEST_INCLUDE_GTEST_GTEST_PRED_IMPL_H_
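
The removed helpers above all follow one pattern: a macro stringifies the predicate and each argument with `#`, and a helper template either reports success or builds a message listing every expression beside its value. The sketch below reproduces that pattern for the ternary case without depending on Google Test. `SketchResult`, `SketchPred3Helper`, `SKETCH_PRED3`, and `IsBetween` are hypothetical names invented for illustration, and `SketchResult` is only a simplified stand-in for `AssertionResult`.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for ::testing::AssertionResult:
// a pass/fail flag plus the failure text.
struct SketchResult {
  bool ok;
  std::string message;
};

// Mirrors the shape of AssertPred3Helper: evaluate the predicate and, on
// failure, list each argument expression next to its runtime value.
template <typename Pred, typename T1, typename T2, typename T3>
SketchResult SketchPred3Helper(const char* pred_text, const char* e1,
                               const char* e2, const char* e3, Pred pred,
                               const T1& v1, const T2& v2, const T3& v3) {
  if (pred(v1, v2, v3)) return SketchResult{true, ""};
  std::ostringstream out;
  out << pred_text << "(" << e1 << ", " << e2 << ", " << e3
      << ") evaluates to false, where"
      << "\n" << e1 << " evaluates to " << v1
      << "\n" << e2 << " evaluates to " << v2
      << "\n" << e3 << " evaluates to " << v3;
  return SketchResult{false, out.str()};
}

// The macro layer only stringifies its arguments and forwards them.
#define SKETCH_PRED3(pred, v1, v2, v3) \
  SketchPred3Helper(#pred, #v1, #v2, #v3, pred, v1, v2, v3)

// Example predicate (an assumption for this sketch).
bool IsBetween(int lo, int x, int hi) { return lo <= x && x <= hi; }

// Usage: a passing check carries no message; a failing one names `n`.
void SketchSelfCheck() {
  int n = 42;
  assert(SKETCH_PRED3(IsBetween, 1, 5, 10).ok);
  assert(!SKETCH_PRED3(IsBetween, 1, n, 10).ok);
}
```

Keeping the formatting in a function template and the stringification in a macro means the message logic is compiled once per argument-type combination, while the macro stays a thin forwarding layer.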


@@ -1,58 +0,0 @@
// Copyright 2006, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
//
// Google C++ Testing Framework definitions useful in production code.
#ifndef GTEST_INCLUDE_GTEST_GTEST_PROD_H_
#define GTEST_INCLUDE_GTEST_GTEST_PROD_H_
// When you need to test the private or protected members of a class,
// use the FRIEND_TEST macro to declare your tests as friends of the
// class. For example:
//
// class MyClass {
// private:
// void MyMethod();
// FRIEND_TEST(MyClassTest, MyMethod);
// };
//
// class MyClassTest : public testing::Test {
// // ...
// };
//
// TEST_F(MyClassTest, MyMethod) {
// // Can call MyClass::MyMethod() here.
// }
#define FRIEND_TEST(test_case_name, test_name)\
friend class test_case_name##_##test_name##_Test
#endif // GTEST_INCLUDE_GTEST_GTEST_PROD_H_
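
`FRIEND_TEST` works purely by token pasting: it befriends a class named `test_case_name_test_name_Test`, which is the name Google Test gives the class generated for the matching `TEST_F`. The sketch below shows the mechanism on its own, without Google Test. `SKETCH_FRIEND_TEST`, `Counter`, and the hand-written `CounterTest_Increment_Test` class are hypothetical and stand in for what the real macros would produce.

```cpp
#include <cassert>

// Same expansion as the FRIEND_TEST macro above, under a hypothetical name.
#define SKETCH_FRIEND_TEST(test_case_name, test_name) \
  friend class test_case_name##_##test_name##_Test

class Counter {
 public:
  int value() const { return value_; }

 private:
  void Increment() { ++value_; }
  int value_ = 0;
  // Expands to: friend class CounterTest_Increment_Test;
  SKETCH_FRIEND_TEST(CounterTest, Increment);
};

// Hand-written stand-in for the class that TEST_F(CounterTest, Increment)
// would generate; because of the friend declaration it may call the
// private Counter::Increment().
class CounterTest_Increment_Test {
 public:
  static int Run() {
    Counter c;
    c.Increment();
    c.Increment();
    return c.value();
  }
};
```

Because the friend name is derived from both macro arguments, the test case name and test name passed to `FRIEND_TEST` must match the `TEST_F` arguments exactly, and the test must live in the same namespace as the class under test.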


@@ -1,308 +0,0 @@
// Copyright 2005, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Authors: wan@google.com (Zhanyong Wan), eefacm@gmail.com (Sean Mcafee)
//
// The Google C++ Testing Framework (Google Test)
//
// This header file defines internal utilities needed for implementing
// death tests. They are subject to change without notice.
#ifndef GTEST_INCLUDE_GTEST_INTERNAL_GTEST_DEATH_TEST_INTERNAL_H_
#define GTEST_INCLUDE_GTEST_INTERNAL_GTEST_DEATH_TEST_INTERNAL_H_
#include "gtest/internal/gtest-internal.h"
#include <stdio.h>
namespace testing {
namespace internal {
GTEST_DECLARE_string_(internal_run_death_test);
// Names of the flags (needed for parsing Google Test flags).
const char kDeathTestStyleFlag[] = "death_test_style";
const char kDeathTestUseFork[] = "death_test_use_fork";
const char kInternalRunDeathTestFlag[] = "internal_run_death_test";
#if GTEST_HAS_DEATH_TEST
// DeathTest is a class that hides much of the complexity of the
// GTEST_DEATH_TEST_ macro. It is abstract; its static Create method
// returns a concrete class that depends on the prevailing death test
// style, as defined by the --gtest_death_test_style and/or
// --gtest_internal_run_death_test flags.
// In describing the results of death tests, these terms are used with
// the corresponding definitions:
//
// exit status: The integer exit information in the format specified
// by wait(2)
// exit code: The integer code passed to exit(3), _exit(2), or
// returned from main()
class GTEST_API_ DeathTest {
public:
// Create returns false if there was an error determining the
// appropriate action to take for the current death test; for example,
// if the gtest_death_test_style flag is set to an invalid value.
// The LastMessage method will return a more detailed message in that
// case. Otherwise, the DeathTest pointer pointed to by the "test"
// argument is set. If the death test should be skipped, the pointer
// is set to NULL; otherwise, it is set to the address of a new concrete
// DeathTest object that controls the execution of the current test.
static bool Create(const char* statement, const RE* regex,
const char* file, int line, DeathTest** test);
DeathTest();
virtual ~DeathTest() { }
// A helper class that aborts a death test when it's deleted.
class ReturnSentinel {
public:
explicit ReturnSentinel(DeathTest* test) : test_(test) { }
~ReturnSentinel() { test_->Abort(TEST_ENCOUNTERED_RETURN_STATEMENT); }
private:
DeathTest* const test_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(ReturnSentinel);
} GTEST_ATTRIBUTE_UNUSED_;
// An enumeration of possible roles that may be taken when a death
// test is encountered. EXECUTE means that the death test logic should
// be executed immediately. OVERSEE means that the program should prepare
// the appropriate environment for a child process to execute the death
// test, then wait for it to complete.
enum TestRole { OVERSEE_TEST, EXECUTE_TEST };
// An enumeration of the three reasons that a test might be aborted.
enum AbortReason {
TEST_ENCOUNTERED_RETURN_STATEMENT,
TEST_THREW_EXCEPTION,
TEST_DID_NOT_DIE
};
// Assumes one of the above roles.
virtual TestRole AssumeRole() = 0;
// Waits for the death test to finish and returns its status.
virtual int Wait() = 0;
// Returns true if the death test passed; that is, the test process
// exited during the test, its exit status matches a user-supplied
// predicate, and its stderr output matches a user-supplied regular
// expression.
// Wait and Passed are kept separate because the user-supplied
// predicate may be a macro expression rather than a function pointer
// or functor, so it has to be applied to Wait's result at the call site.
virtual bool Passed(bool exit_status_ok) = 0;
// Signals that the death test did not die as expected.
virtual void Abort(AbortReason reason) = 0;
// Returns a human-readable message describing the outcome of
// the last death test.
static const char* LastMessage();
static void set_last_death_test_message(const String& message);
private:
// A string containing a description of the outcome of the last death test.
static String last_death_test_message_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(DeathTest);
};
// Factory interface for death tests. May be mocked out for testing.
class DeathTestFactory {
public:
virtual ~DeathTestFactory() { }
virtual bool Create(const char* statement, const RE* regex,
const char* file, int line, DeathTest** test) = 0;
};
// A concrete DeathTestFactory implementation for normal use.
class DefaultDeathTestFactory : public DeathTestFactory {
public:
virtual bool Create(const char* statement, const RE* regex,
const char* file, int line, DeathTest** test);
};
// Returns true if exit_status describes a process that was terminated
// by a signal, or exited normally with a nonzero exit code.
GTEST_API_ bool ExitedUnsuccessfully(int exit_status);
// Traps C++ exceptions escaping statement and reports them as test
// failures. Note that trapping SEH exceptions is not implemented here.
# if GTEST_HAS_EXCEPTIONS
# define GTEST_EXECUTE_DEATH_TEST_STATEMENT_(statement, death_test) \
try { \
GTEST_SUPPRESS_UNREACHABLE_CODE_WARNING_BELOW_(statement); \
} catch (const ::std::exception& gtest_exception) { \
fprintf(\
stderr, \
"\n%s: Caught std::exception-derived exception escaping the " \
"death test statement. Exception message: %s\n", \
::testing::internal::FormatFileLocation(__FILE__, __LINE__).c_str(), \
gtest_exception.what()); \
fflush(stderr); \
death_test->Abort(::testing::internal::DeathTest::TEST_THREW_EXCEPTION); \
} catch (...) { \
death_test->Abort(::testing::internal::DeathTest::TEST_THREW_EXCEPTION); \
}
# else
# define GTEST_EXECUTE_DEATH_TEST_STATEMENT_(statement, death_test) \
GTEST_SUPPRESS_UNREACHABLE_CODE_WARNING_BELOW_(statement)
# endif
// This macro is for implementing ASSERT_DEATH*, EXPECT_DEATH*,
// ASSERT_EXIT*, and EXPECT_EXIT*.
# define GTEST_DEATH_TEST_(statement, predicate, regex, fail) \
GTEST_AMBIGUOUS_ELSE_BLOCKER_ \
if (::testing::internal::AlwaysTrue()) { \
const ::testing::internal::RE& gtest_regex = (regex); \
::testing::internal::DeathTest* gtest_dt; \
if (!::testing::internal::DeathTest::Create(#statement, &gtest_regex, \
__FILE__, __LINE__, &gtest_dt)) { \
goto GTEST_CONCAT_TOKEN_(gtest_label_, __LINE__); \
} \
if (gtest_dt != NULL) { \
::testing::internal::scoped_ptr< ::testing::internal::DeathTest> \
gtest_dt_ptr(gtest_dt); \
switch (gtest_dt->AssumeRole()) { \
case ::testing::internal::DeathTest::OVERSEE_TEST: \
if (!gtest_dt->Passed(predicate(gtest_dt->Wait()))) { \
goto GTEST_CONCAT_TOKEN_(gtest_label_, __LINE__); \
} \
break; \
case ::testing::internal::DeathTest::EXECUTE_TEST: { \
::testing::internal::DeathTest::ReturnSentinel \
gtest_sentinel(gtest_dt); \
GTEST_EXECUTE_DEATH_TEST_STATEMENT_(statement, gtest_dt); \
gtest_dt->Abort(::testing::internal::DeathTest::TEST_DID_NOT_DIE); \
break; \
} \
default: \
break; \
} \
} \
} else \
GTEST_CONCAT_TOKEN_(gtest_label_, __LINE__): \
fail(::testing::internal::DeathTest::LastMessage())
// The symbol "fail" here expands to something into which a message
// can be streamed.
// A class representing the parsed contents of the
// --gtest_internal_run_death_test flag, as it existed when
// RUN_ALL_TESTS was called.
class InternalRunDeathTestFlag {
public:
InternalRunDeathTestFlag(const String& a_file,
int a_line,
int an_index,
int a_write_fd)
: file_(a_file), line_(a_line), index_(an_index),
write_fd_(a_write_fd) {}
~InternalRunDeathTestFlag() {
if (write_fd_ >= 0)
posix::Close(write_fd_);
}
String file() const { return file_; }
int line() const { return line_; }
int index() const { return index_; }
int write_fd() const { return write_fd_; }
private:
String file_;
int line_;
int index_;
int write_fd_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(InternalRunDeathTestFlag);
};
// Returns a newly created InternalRunDeathTestFlag object with fields
// initialized from the GTEST_FLAG(internal_run_death_test) flag if
// the flag is specified; otherwise returns NULL.
InternalRunDeathTestFlag* ParseInternalRunDeathTestFlag();
#else // GTEST_HAS_DEATH_TEST
// This macro is used for implementing macros such as
// EXPECT_DEATH_IF_SUPPORTED and ASSERT_DEATH_IF_SUPPORTED on systems where
// death tests are not supported. Those macros must compile on such systems
// iff EXPECT_DEATH and ASSERT_DEATH compile with the same parameters on
// systems that support death tests. This allows one to write such a macro
// on a system that does not support death tests and be sure that it will
// compile on a death-test supporting system.
//
// Parameters:
// statement - A statement that a macro such as EXPECT_DEATH would test
// for program termination. This macro has to make sure this
// statement is compiled but not executed, to ensure that
// EXPECT_DEATH_IF_SUPPORTED compiles with a certain
// parameter iff EXPECT_DEATH compiles with it.
// regex - A regex that a macro such as EXPECT_DEATH would use to test
// the output of statement. This parameter has to be
// compiled but not evaluated by this macro, to ensure that
// this macro only accepts expressions that a macro such as
// EXPECT_DEATH would accept.
// terminator - Must be an empty statement for EXPECT_DEATH_IF_SUPPORTED
// and a return statement for ASSERT_DEATH_IF_SUPPORTED.
// This ensures that ASSERT_DEATH_IF_SUPPORTED will not
// compile inside functions where ASSERT_DEATH doesn't
// compile.
//
// The branch that has an always false condition is used to ensure that
// statement and regex are compiled (and thus syntactically correct) but
// never executed. The unreachable code macro protects the terminator
// statement from generating an 'unreachable code' warning in case
// statement unconditionally returns or throws. The Message constructor at
// the end allows the syntax of streaming additional messages into the
// macro, for compile-time compatibility with EXPECT_DEATH/ASSERT_DEATH.
# define GTEST_UNSUPPORTED_DEATH_TEST_(statement, regex, terminator) \
GTEST_AMBIGUOUS_ELSE_BLOCKER_ \
if (::testing::internal::AlwaysTrue()) { \
GTEST_LOG_(WARNING) \
<< "Death tests are not supported on this platform.\n" \
<< "Statement '" #statement "' cannot be verified."; \
} else if (::testing::internal::AlwaysFalse()) { \
::testing::internal::RE::PartialMatch(".*", (regex)); \
GTEST_SUPPRESS_UNREACHABLE_CODE_WARNING_BELOW_(statement); \
terminator; \
} else \
::testing::Message()
#endif // GTEST_HAS_DEATH_TEST
} // namespace internal
} // namespace testing
#endif // GTEST_INCLUDE_GTEST_INTERNAL_GTEST_DEATH_TEST_INTERNAL_H_
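
The header above distinguishes an exit status (the `wait(2)` encoding) from an exit code (the value passed to `exit(3)`), and splits a death test into an overseeing process and an executing process. A minimal POSIX-only sketch of that split follows. It is not the removed implementation: `SketchRunInChild`, `SketchExitedUnsuccessfully`, `DiesWithCode3`, and `ReturnsNormally` are hypothetical names, the child here exits with code 0 when the statement returns instead of calling `Abort(TEST_DID_NOT_DIE)`, and stderr capture and regex matching are omitted.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Analogue of the documented ExitedUnsuccessfully contract: true if the
// process was killed by a signal or exited normally with a nonzero code.
bool SketchExitedUnsuccessfully(int exit_status) {
  return WIFSIGNALED(exit_status) ||
         (WIFEXITED(exit_status) && WEXITSTATUS(exit_status) != 0);
}

// Overseer role: fork, let the child execute `statement`, then wait for it.
// On success stores the raw wait(2) status in *exit_status and returns true.
bool SketchRunInChild(void (*statement)(), int* exit_status) {
  pid_t pid = fork();
  if (pid < 0) return false;  // fork failed
  if (pid == 0) {
    // Executor role: run the statement that is expected to die.
    statement();
    _exit(0);  // Simplification: the statement returned, so it did not die.
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return false;
  *exit_status = status;
  return true;
}

// Example statements (assumptions for this sketch).
void DiesWithCode3() { _exit(3); }
void ReturnsNormally() {}
```

The overseer inspects the status with the `W*` macros rather than comparing it to the exit code directly, which is the distinction the header's terminology is drawing.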

Some files were not shown because too many files have changed in this diff.