Foxtable (狐表) forum, 专家坐堂 (expert help) board

Subject: How to scrape data from this kind of web page

有点甜 (moderator) | Post 1 | Posted 2018/7/22 19:02:00

Reference code; read it through and adjust the details yourself.

' Load the page in an off-screen WebBrowser control and wait until Table1 exists.
Dim web As New System.Windows.Forms.WebBrowser()
web.Navigate("http://www.hzctc.cn/OpenBidRecord/Index?id=36177CC9-5F91-473F-84E6-A2EFA35D6DD9&tenderID=969B1A8D-1A57-4A21-864F-A5E98F8288FB&ModuleID=486")
Do Until web.ReadyState = 4 AndAlso web.Document.GetElementById("Table1") IsNot Nothing
    Application.DoEvents
Loop

' For each <div class="row cl"> whose text mentions 工程编号 or 建设单位, show its first <span>.
Dim elems As Object = web.Document.GetElementsByTagName("div")
For Each elem As Object In elems
    If elem.GetAttribute("className") = "row cl" Then
        If elem.InnerText.Contains("工程编号") OrElse elem.InnerText.Contains("建设单位") Then
            MsgBox(elem.GetElementsByTagName("span")(0).InnerText)
        End If
    End If
Next

' Table rows: indexes 1 to Count - 2 skip the first row and the last row.
elems = web.Document.GetElementById("Table1").GetElementsByTagName("tr")
For i As Integer = 1 To elems.Count - 2
    Dim tds = elems(i).GetElementsByTagName("td")
    MsgBox(tds(0).InnerText & " " & tds(1).InnerText)
Next


有点甜 (moderator) | Post 2 | Posted 2018/7/23 0:34:00

Quoting 小美菜, 2018/7/22 23:05:00:
有点甜, suppose the list below spans 5 pages, with Previous page and Next page buttons. How can I loop through all five pages?

See http://www.foxtable.com/bbs/dispbbs.asp?BoardID=2&ID=109179&skin=0
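
For the paging question, one possible approach is sketched below in the same style as the code in post 1. It is only a sketch under assumptions that do not come from this thread: the URL is a placeholder, the pager is assumed to be an `<a>` element whose text is exactly "下一页", clicking it is assumed to refresh `Table1`, and the 5-page limit and 15-second timeout are arbitrary. The linked thread may use a different method.

```vbnet
' Sketch only. Assumed, not taken from the thread: the placeholder URL,
' a pager link whose text is "下一页", and a table with id "Table1".
Dim web As New System.Windows.Forms.WebBrowser()
web.ScriptErrorsSuppressed = True
web.Navigate("http://example.com/list") ' hypothetical URL
Do Until web.ReadyState = 4 AndAlso web.Document.GetElementById("Table1") IsNot Nothing
    Application.DoEvents
Loop

Dim lines As New List(Of String)
For page As Integer = 1 To 5
    ' Collect the first two cells of every row that has at least two <td> cells.
    Dim table As Object = web.Document.GetElementById("Table1")
    Dim rows As Object = table.GetElementsByTagName("tr")
    For r As Integer = 1 To rows.Count - 1
        Dim tds As Object = rows(r).GetElementsByTagName("td")
        If tds.Count >= 2 Then
            lines.Add(tds(0).InnerText & " " & tds(1).InnerText)
        End If
    Next
    If page = 5 Then Exit For

    ' Look for the "next page" link; stop if the page has none.
    Dim nextLink As Object = Nothing
    For Each a As Object In web.Document.GetElementsByTagName("a")
        If a.InnerText = "下一页" Then
            nextLink = a
            Exit For
        End If
    Next
    If nextLink Is Nothing Then Exit For

    ' Click the link, then wait up to 15 seconds for Table1's text to change.
    Dim before As String = table.InnerText
    Dim changed As Boolean = False
    nextLink.InvokeMember("click")
    Dim deadline As Date = Date.Now.AddSeconds(15)
    Do While Date.Now < deadline
        Application.DoEvents
        If web.ReadyState = 4 AndAlso web.Document IsNot Nothing Then
            Dim t As Object = web.Document.GetElementById("Table1")
            If t IsNot Nothing AndAlso t.InnerText <> before Then
                changed = True
                Exit Do
            End If
        End If
    Loop
    ' If nothing changed, assume the last page was reached.
    If Not changed Then Exit For
Next
MsgBox(String.Join(vbCrLf, lines.ToArray()))
```

Comparing the table text before and after the click is a simple way to detect that the next page has loaded, whether the site reloads the whole page or only replaces the table. If two consecutive pages can hold identical rows, a different signal (for example a page-number element) would be needed.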

 


有点甜 (moderator) | Post 3 | Posted 2018/10/15 15:25:00

How many times does this need to be said? The code is identical to the code in post 1.


有点甜 (moderator) | Post 4 | Posted 2018/10/15 16:13:00

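' Same code as post 1, plus ScriptErrorsSuppressed = True so that
' script-error dialogs raised by the page do not pop up.
Dim web As New System.Windows.Forms.WebBrowser()
web.ScriptErrorsSuppressed = True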
web.Navigate("http://www.hzctc.cn/OpenBidRecord/Index?id=36177CC9-5F91-473F-84E6-A2EFA35D6DD9&tenderID=969B1A8D-1A57-4A21-864F-A5E98F8288FB&ModuleID=486")
Do Until web.ReadyState = 4 AndAlso web.Document.GetElementById("Table1") IsNot Nothing
    Application.DoEvents
Loop

Dim elems As object = web.Document.GetElementsByTagName("div")
For Each elem As object In elems
    If elem.getattribute("classname") = "row cl" Then
        If elem.InnerText.contains("工程编号") Then
            msgbox(elem.GetElementsByTagName("span")(0).innerText)
        ElseIf elem.InnerText.contains("建设单位") Then
            msgbox(elem.GetElementsByTagName("span")(0).innerText)
        End If
    End If
Next

elems = web.Document.GetElementById("Table1").GetElementsByTagName("tr")
For i As Integer = 1 To elems.count-2
    Dim tds = elems(i).getelementsbytagname("td")
    msgbox(tds(0).InnerText & " " & tds(1).InnerText)
Next

 


有点甜 (moderator) | Post 5 | Posted 2018/10/17 8:55:00

' Load the page and wait until Table1 is present.
Dim web As New System.Windows.Forms.WebBrowser()
web.ScriptErrorsSuppressed = True
web.Navigate("http://www.hzctc.cn/OpenBidRecord/Index?id=111E6F37-5AB7-4F3F-B56D-E355701A68E9&tenderID=416E3F38-00E8-4CC4-8D72-8DE2EDA078AC&ModuleID=486")
Do Until web.ReadyState = 4 AndAlso web.Document.GetElementById("Table1") IsNot Nothing
    Application.DoEvents
Loop

' Build a temporary table named 录入表 with one string column per <th> header cell.
Dim elems As Object = web.Document.GetElementById("Table1").GetElementsByTagName("th")
Dim dtb As New DataTableBuilder("录入表")
For i As Integer = 0 To elems.Count - 1
    dtb.AddDef(elems(i).InnerText, GetType(String), 250)
Next
dtb.Build()
MainTable = Tables("录入表")

' Copy the <td> cells of each row into a new record, skipping the first and last <tr>.
elems = web.Document.GetElementById("Table1").GetElementsByTagName("tr")
Dim ndr As Row
For n As Integer = 1 To elems.Count - 2
    Dim tds = elems(n).GetElementsByTagName("td")
    ndr = Tables("录入表").AddNew()
    For tn As Integer = 0 To tds.Count - 1
        ndr(tn) = tds(tn).InnerText
    Next
Next


有点甜 (moderator) | Post 6 | Posted 2018/10/17 11:23:00

Dim web As New System.Windows.Forms.WebBrowser()
web.scripterrorssuppressed = True
web.Navigate("http://www.hzctc.cn/OpenBidRecord/Index?id=111E6F37-5AB7-4F3F-B56D-E355701A68E9&tenderID=416E3F38-00E8-4CC4-8D72-8DE2EDA078AC&ModuleID=486")
Do Until web.ReadyState = 4 AndAlso web.Document.GetElementById("Table1") IsNot Nothing
    Application.DoEvents
Loop


' For the "row cl" divs that mention 工程编号 or 建设单位, show the first two <span> texts.
Dim elems As Object = web.Document.GetElementsByTagName("div")
For Each elem As Object In elems
    If elem.GetAttribute("className") = "row cl" Then
        If elem.InnerText.Contains("工程编号") Then
            MsgBox(elem.GetElementsByTagName("span")(0).InnerText)
            MsgBox(elem.GetElementsByTagName("span")(1).InnerText)
        End If
        If elem.InnerText.Contains("建设单位") Then
            MsgBox(elem.GetElementsByTagName("span")(0).InnerText)
            MsgBox(elem.GetElementsByTagName("span")(1).InnerText)
        End If
    End If
Next


有点甜 (moderator) | Post 7 | Posted 2018/10/17 15:09:00

Dim web As New System.Windows.Forms.WebBrowser()
web.scripterrorssuppressed = True
web.Navigate("http://www.hzctc.cn/OpenBidRecord/Index?id=111E6F37-5AB7-4F3F-B56D-E355701A68E9&tenderID=416E3F38-00E8-4CC4-8D72-8DE2EDA078AC&ModuleID=486")
Do Until web.ReadyState = 4 AndAlso web.Document.GetElementById("Table1") IsNot Nothing
    Application.DoEvents
Loop


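' Field labels, assumed to be in the same order as the "input-group" spans on the page.
Dim nms() As String = {"工程编号","工程名称","建设单位","代理机构","开标时间","开标地点","代理人员"}
Dim elems As Object = web.Document.GetElementsByTagName("span")
Dim i As Integer = 0
For Each elem As Object In elems
    If elem.GetAttribute("className") = "input-group" Then
        ' Prefix each value with its label for as long as labels remain.
        If i < nms.Length Then
            MsgBox(nms(i) & ": " & elem.InnerText)
        Else
            MsgBox(elem.InnerText)
        End If
        i += 1
    End If
Next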


有点甜 (moderator) | Post 8 | Posted 2018/10/17 16:33:00

Dim web As New System.Windows.Forms.WebBrowser()
web.scripterrorssuppressed = True
web.Navigate("http://www.hzctc.cn/AfficheShow/Home?AfficheID=1192f417-f6a2-4a92-9cd7-47f0c1eb3133&IsInner=0&ModuleID=22")
Do Until web.ReadyState = 4
    Application.DoEvents
Loop

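' Walk the rows of the first <table> on the page and show the first two cells of each row.
Dim elems As Object = web.Document.GetElementsByTagName("table")(0).GetElementsByTagName("tr")
For Each elem As Object In elems
    Dim tds = elem.GetElementsByTagName("td")
    ' Skip rows with fewer than two <td> cells, which would otherwise raise an index error.
    If tds.Count >= 2 Then
        MsgBox(tds(0).InnerText)
        MsgBox(tds(1).InnerText)
    End If
Next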

