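uses Windows, ActiveX, oleacc; // oleacc.pas: http://files.codes-sources.com/fichier.aspx?id=28611&f=oleacc.pas

// Walks the Firefox window hierarchy down to the document window.
// Returns 0 if any window in the chain is missing.
function ffdoc: THandle;
const
  // Class names from the top-level window down to the document window;
  // this chain may differ between Firefox versions.
  A_szClassName: array[0..6] of PChar = (
    'MozillaUIWindowClass', 'MozillaWindowClass',
    'MozillaWindowClass', 'MozillaWindowClass', 'MozillaContentWindowClass',
    'MozillaWindowClass', 'MozillaWindowClass'
  );
var
  i: Integer;
begin
  Result := 0;
  for i := Low(A_szClassName) to High(A_szClassName) do
  begin
    Result := FindWindowEx(Result, 0, A_szClassName[i], nil);
    // Stop on failure: a parent of 0 would restart the search at the desktop.
    if Result = 0 then
      Break;
  end;
end;

// Returns the URL of the current Firefox document, or '' on failure.
function ffurl: string;
var
  acc: IAccessible;
  pw: PWChar;
begin
  Result := '';
  pw := nil;
  if AccessibleObjectFromWindow(ffdoc, OBJID_CLIENT, IID_IAccessible, Pointer(acc)) = S_OK then
  begin
    acc.get_accValue(CHILDID_SELF, pw);
    if pw <> nil then
    begin
      Result := pw;
      SysFreeString(pw); // the value is a BSTR that the caller must free
    end;
  end;
end;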
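Abstract: This paper studies techniques for capturing and analyzing web page content in browsers based on the IE kernel (such as IE and Maxthon) and on the Gecko kernel (such as Firefox). Using Visual C++ 6.0 as the platform, and building on COM and Microsoft's MSAA technology, it implements several ways of capturing web page content from browsers of both kernel types and compares the strengths and weaknesses of these approaches.
Keywords: COM; DOM; MSAA; IE; Gecko; Windows programming
CLC number: TP393    Document code: A    Article ID: 1009-3044(2008)23-936-04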
The Research of the Technique of Capturing and Analyzing the Web Page Contents Based on the Kernel of IE and Gecko
ZHOU Zhou
(College of Software Engineering, Beijing Jiao Tong University, Beijing 100044, China)
Abstract: With the rapid development of the World Wide Web, it has become important to capture and analyze web page content in order to present information to users in a better-organized form. This article investigates several techniques for capturing and analyzing web pages using Visual C++ 6.0, on the basis of COM and MSAA. Finally, it compares the techniques and lists their advantages and disadvantages.
Key words: COM; DOM; MSAA; IE; Gecko; Windows programming