[ Source: htmlcxx ]
Package: libhtmlcxx3v5 (0.87-4 and others)
simple HTML parser library for C++
htmlcxx is a simple non-validating CSS1 and HTML parser for C++. Although several other HTML parsers are available, htmlcxx has some characteristics that make it unique:
* STL-like navigation of the DOM tree, using the excellent tree.hh library from Kasper Peeters
* It is possible to reproduce exactly, character by character, the original document from the parse tree
* Bundled CSS parser
* Optional parsing of attributes
* C++ code that looks like C++ (not so true anymore)
* Offsets of tags/elements in the original document are stored in the nodes of the DOM tree
The parsing policies of htmlcxx were designed to mimic the behavior of Mozilla Firefox (http://www.mozilla.org), so you should expect parse trees similar to those created by Firefox. Unlike Firefox, however, htmlcxx does not insert elements that are absent from your HTML. Therefore, serializing the DOM tree yields exactly the same bytes as the original HTML document.
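The round-trip property can be illustrated without the library itself. The following is a minimal, self-contained sketch and not htmlcxx's implementation or API: `Span`, `split_spans`, and `serialize` are hypothetical names chosen for this example, and the splitter only separates tags from text rather than building a DOM tree. It assumes that each piece records only its offset and length in the original document and that nothing is inserted for unclosed tags.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy token: a tag or a text run, identified only by where it sits in the
// original document. Hypothetical type; this is NOT htmlcxx's node class.
struct Span {
    bool is_tag;         // true for "<...>", false for text between tags
    std::size_t offset;  // byte offset in the original document
    std::size_t length;  // byte length of the span
};

// Split the input into tag and text spans. No validation and no repair:
// a '<' with no matching '>' simply runs to the end of the input.
std::vector<Span> split_spans(const std::string& html) {
    std::vector<Span> spans;
    std::size_t pos = 0;
    while (pos < html.size()) {
        std::size_t end;
        bool is_tag = (html[pos] == '<');
        if (is_tag) {
            std::size_t close = html.find('>', pos);
            end = (close == std::string::npos) ? html.size() : close + 1;
        } else {
            std::size_t open = html.find('<', pos);
            end = (open == std::string::npos) ? html.size() : open;
        }
        spans.push_back({is_tag, pos, end - pos});
        pos = end;
    }
    return spans;
}

// Rebuild the document by copying each span's bytes from the original.
std::string serialize(const std::string& html, const std::vector<Span>& spans) {
    std::string out;
    for (const Span& s : spans) {
        assert(s.offset + s.length <= html.size());
        out.append(html, s.offset, s.length);
    }
    return out;
}
```

Because the spans cover the input contiguously and no closing tag is synthesized, calling `serialize(src, split_spans(src))` on an input such as `<p>Hi <b>there</p>` returns the same string, unclosed `<b>` included.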
Other packages related to libhtmlcxx3v5
- dep: libc6 (>= 2.38)
  GNU C Library: Shared libraries
  also a virtual package provided by libc6-udeb
- dep: libgcc-s1 (>= 3.0)
  GCC support library
- dep: libstdc++6 (>= 13.1)
  GNU Standard C++ Library v3