Java HTML Parser

Any good ones?

Have you tried every one of the parsers that ship with Java SE yet?

javax.swing.text.html.parser
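
For anyone who wants to try the Java SE parser, here is a minimal sketch of collecting the href attribute of every a element with javax.swing.text.html.parser.ParserDelegator. The class name SwingLinkExtractor and the method extractHrefs are made up for illustration, and the sample HTML is only a placeholder.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of any library.
public class SwingLinkExtractor {

    // Returns the href value of each <a> start tag, in document order.
    public static List<String> extractHrefs(String html) {
        final List<String> hrefs = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        hrefs.add(href.toString());
                    }
                }
            }
        };
        try (StringReader reader = new StringReader(html)) {
            // true = ignore any charset declared inside the document
            new ParserDelegator().parse(reader, callback, true);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return hrefs;
    }

    public static void main(String[] args) {
        String html = "<html><body><a href=\"xx\">aa</a><a href=\"yy\">bb</a></body></html>";
        for (String href : extractHrefs(html)) {
            System.out.println(href);
        }
    }
}
```

The callback style means no DOM tree is built, so it is only a good fit when you need a few attributes rather than the whole document structure.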


Originally posted by 阿水 on 2008-12-22 17:02
Any good ones?


nekoHTML


jtidy


Originally posted by thinkpanda on 2008-12-23 12:51 AM


nekoHTML


I'm trying this parser, but I ran into a problem:

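import java.io.StringReader;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
// (the DOMParser import is not shown in the post)

// Inner double quotes must be escaped inside a Java string literal
String html = "<html><head><title>test</title></head><body><a href=\"xx\">aa</a></body>";
InputSource i = new InputSource(new StringReader(html));
DOMParser parser = new DOMParser();
try {
        parser.parse(i);
} catch (Exception ex) {
        ex.printStackTrace();
}
Document document = parser.getDocument();
// getElementsByTagName returns a NodeList, not a Node
NodeList links = document.getElementsByTagName("A");
System.out.println(links.getLength());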

But the result I get is 0,
even though there should be hyperlinks.


(deleted)

[ Last edited by chiefumpire on 2008-12-26 18:56 ]


Is it because the HTML is invalid?
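
One way to check that guess: the test string never closes its html element, so a strict XML parser will reject it, whereas an HTML parser such as nekoHTML is meant to tolerate it. Below is a minimal sketch using the JDK's own XML parser (not nekoHTML); the class name StrictXmlCheck and the method isWellFormedXml are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
public class StrictXmlCheck {

    // Returns true only if the markup parses as well-formed XML.
    public static boolean isWellFormedXml(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (Exception ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        String unclosed = "<html><head><title>test</title></head><body><a href=\"xx\">aa</a></body>";
        String closed = unclosed + "</html>";
        System.out.println(isWellFormedXml(unclosed)); // false: <html> is never closed
        System.out.println(isWellFormedXml(closed));   // true
    }
}
```

If a strict XML parser is used on real-world HTML by accident, a parse failure like this is the likely outcome, so it is worth checking which parser class is actually being instantiated.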



I had imported the wrong DOMParser by mistake.


Originally posted by thinkpanda on 2008-12-23 12:51 AM


nekoHTML


HttpURL url = new HttpURL("http://www.kmb.hk/english.php?bus_type=A&page=search&prog=bus_type.php");

But it doesn't do its job perfectly.

I'm trying to get the full bus list from KMB (I'm interested in the href attribute of the a elements).

[ Last edited by 阿水 on 2008-12-26 21:06 ]
