CppJieba 简体中文


Introduction

CppJieba is a C++ implementation of the Jieba Chinese word segmentation library.

Usage

Dependencies

  • g++ (version >= 4.1 is recommended) or clang++;
  • cmake (version >= 2.6 is recommended);

Download & Compile

git clone --depth=10 --branch=master https://github.com/yanyiwu/cppjieba.git
cd cppjieba
mkdir build
cd build
cmake ..
make

Unit Testing

make test

Demo

./demo

Output:

[demo] Cut With HMM
我/是/拖拉机/学院/手扶拖拉机/专业/的/。/不用/多久/，/我/就/会/升职/加薪/，/当上/CEO/，/走上/人生/巅峰/。
[demo] Cut Without HMM
我/是/拖拉机/学院/手扶拖拉机/专业/的/。/不用/多久/，/我/就/会/升职/加薪/，/当/上/C/E/O/，/走上/人生/巅峰/。
[demo] CutAll
我/是/拖拉/拖拉机/学院/手扶/手扶拖拉机/拖拉/拖拉机/专业/的/。/不用/多久/，/我/就/会升/升职/加薪/，/当上/C/E/O/，/走上/人生/巅峰/。
[demo] CutForSearch
我/是/拖拉机/学院/手扶/手扶拖拉机/拖拉/拖拉机/专业/的/。/不用/多久/，/我/就/会/升职/加薪/，/当上/CEO/，/走上/人生/巅峰/。
[demo] Insert User Word
男默/女泪
男默女泪
[demo] Locate Words
南京市, 0, 3
长江大桥, 3, 7
[demo] TAGGING
我是拖拉机学院手扶拖拉机专业的。不用多久，我就会升职加薪，当上CEO，走上人生巅峰。
["我:r", "是:v", "拖拉机:n", "学院:n", "手扶拖拉机:n", "专业:n", "的:uj", "。:x", "不用:v", "多久:m", "，:x", "我:r", "就:d", "会:v", "升职:v", "加薪:nr", "，:x", "当上:t", "CEO:eng", "，:x", "走上:v", "人生:n", "巅峰:n", "。:x"]

See test/demo.cpp for the code that produces this output.
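
The difference between the Cut and CutAll outputs above can be illustrated with a toy, self-contained sketch. It does not use the CppJieba API: ToyDict, CutToy, CutAllToy, SplitUtf8, and every weight below are hypothetical names and values made up for illustration. The approach (dictionary lookup plus a dynamic-programming search for the highest-scoring path) is only assumed to resemble the general idea of dictionary-based segmentation and is not taken from the library's source. Unlike the demo, the toy full mode reports dictionary words only and never emits single characters.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy dictionary type: word -> log-probability-like weight.
typedef std::map<std::string, double> Dict;

// Split a UTF-8 string into one string per code point (assumes valid UTF-8).
std::vector<std::string> SplitUtf8(const std::string& s) {
  std::vector<std::string> chars;
  for (std::size_t i = 0; i < s.size();) {
    unsigned char c = static_cast<unsigned char>(s[i]);
    std::size_t len = c < 0x80 ? 1 : (c >> 5) == 0x6 ? 2 : (c >> 4) == 0xE ? 3 : 4;
    chars.push_back(s.substr(i, len));
    i += len;
  }
  return chars;
}

// Made-up entries and weights, chosen only to make the example interesting.
Dict ToyDict() {
  Dict d;
  d["南京"] = -7.0;
  d["南京市"] = -8.0;
  d["市长"] = -7.0;
  d["长江"] = -7.0;
  d["长江大桥"] = -10.0;
  d["大桥"] = -7.0;
  return d;
}

// "Full" mode: report every dictionary word found at any start position.
std::vector<std::string> CutAllToy(const std::string& text, const Dict& dict) {
  const std::vector<std::string> chars = SplitUtf8(text);
  std::vector<std::string> words;
  for (std::size_t i = 0; i < chars.size(); ++i) {
    std::string candidate;
    for (std::size_t j = i; j < chars.size(); ++j) {
      candidate += chars[j];
      if (dict.count(candidate)) words.push_back(candidate);
    }
  }
  return words;
}

// "Precise" mode: pick the single segmentation with the highest total weight.
// Characters missing from the dictionary cost unknown_weight each.
std::vector<std::string> CutToy(const std::string& text, const Dict& dict,
                                double unknown_weight = -15.0) {
  const std::vector<std::string> chars = SplitUtf8(text);
  const std::size_t n = chars.size();
  std::vector<double> best(n + 1, 0.0);    // best[i]: best score for chars[i..n)
  std::vector<std::size_t> next(n + 1, n); // next[i]: end of the word starting at i
  for (std::size_t i = n; i-- > 0;) {
    // Start with the single-character fallback, then try longer words.
    std::string candidate = chars[i];
    Dict::const_iterator it = dict.find(candidate);
    best[i] = (it != dict.end() ? it->second : unknown_weight) + best[i + 1];
    next[i] = i + 1;
    for (std::size_t j = i + 1; j < n; ++j) {
      candidate += chars[j];
      it = dict.find(candidate);
      if (it != dict.end() && it->second + best[j + 1] > best[i]) {
        best[i] = it->second + best[j + 1];
        next[i] = j + 1;
      }
    }
  }
  // Follow the recorded word boundaries from the start of the text.
  std::vector<std::string> words;
  for (std::size_t i = 0; i < n; i = next[i]) {
    std::string word;
    for (std::size_t k = i; k < next[i]; ++k) word += chars[k];
    words.push_back(word);
  }
  return words;
}
```

With this toy dictionary, CutToy("南京市长江大桥", ToyDict()) returns 南京市 and 长江大桥, while CutAllToy on the same input also reports the overlapping entries 南京, 市长, 长江, and 大桥.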

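The Locate Words output above (南京市, 0, 3 and 长江大桥, 3, 7) reads as half-open ranges counted in characters rather than bytes. That is an interpretation of the demo output, not a statement about the library's API. Under that assumption, the sketch below shows how such offsets could be derived from UTF-8 segments; LocatedWord, CountCodePoints, and LocateWords are hypothetical names that do not come from CppJieba.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: a word plus its [begin, end) range in code points.
struct LocatedWord {
  std::string word;
  std::size_t begin;
  std::size_t end;
};

// Count code points in a UTF-8 string by skipping continuation bytes (10xxxxxx).
// Assumes the input is valid UTF-8.
std::size_t CountCodePoints(const std::string& s) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < s.size(); ++i) {
    unsigned char c = static_cast<unsigned char>(s[i]);
    if ((c & 0xC0) != 0x80) ++count;
  }
  return count;
}

// Given consecutive segments of a sentence, assign each one a character range.
std::vector<LocatedWord> LocateWords(const std::vector<std::string>& words) {
  std::vector<LocatedWord> located;
  std::size_t pos = 0;
  for (std::size_t i = 0; i < words.size(); ++i) {
    LocatedWord lw;
    lw.word = words[i];
    lw.begin = pos;
    lw.end = pos + CountCodePoints(words[i]);
    pos = lw.end;
    located.push_back(lw);
  }
  return located;
}
```

Passing the segments 南京市 and 长江大桥 to LocateWords gives the ranges 0 to 3 and 3 to 7, matching the demo output, even though each Chinese character occupies three bytes in UTF-8.
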
Cases

Contact

  • Email: i@yanyiwu.com
  • QQ: 64162451