Another wget download question

wget can be used to download a web page or even mirror a web site.
But wget only gets the default content of the downloaded page.

Some web pages, however, can show different content through a dropdown list.

My question is:
How can I use wget to download the non-default content that is shown when the user selects a different value in a dropdown list on the web page? Thanks!

P.S.
When the user selects a different item in the dropdown list, the address in the browser's address bar does not change; the update is done with AJAX.

I think you want to build a web spider.
That kind of thing requires programming.
Try searching Google for information.


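Originally posted by netter on 2009-8-31 11:04
I think you want to build a web spider.
That kind of thing requires programming.
Try searching Google for information.

A web spider is what search engines use.
I only want to automatically fetch a few things that I'm interested in.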


You could try curl.
It can submit forms (both POST and GET work) and it can also handle cookies. But first you need to understand how the web site you want to fetch from works (form values, cookies, etc.).
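
To make "submitting a form" concrete, here is a minimal sketch of the same idea in Python's standard library rather than curl. The URL and field names used with it are placeholders (assumptions, not taken from any real site); the real ones have to be read from the page's form. It only shows how a GET and a POST submission differ and how cookies can be kept between requests.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def build_get(url, fields):
    """Return a GET request with the form fields in the query string."""
    return urllib.request.Request(url + "?" + urllib.parse.urlencode(fields))


def build_post(url, fields):
    """Return a POST request with the form fields URL-encoded in the body."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")


def make_opener():
    """Return (opener, jar); the opener stores cookies and resends them."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch(opener, request, timeout=30):
    """Send the request through the opener and return the response body."""
    with opener.open(request, timeout=timeout) as resp:
        return resp.read()
```

With a hypothetical form at `http://example.com/search`, something like `fetch(opener, build_post("http://example.com/search", {"q": "test"}))` would submit it; reusing the same opener for later requests carries over any cookies the server set.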


The web page uses the following code to read the user's selection and show the related content.
Item 7 is currently selected.
  <select name='applicationId' onchange="reloadStatistics(this, 22, 'itemFamilyId')" size='1'>
    <option value='1'>item1</option>
    <option value='2'>item2</option>
    <option value='3'>item3</option>
    <option value='4'>item4</option>
    <option value='5'>item5</option>
    <option value='6'>item6</option>
    <option value='7' selected='selected'>item7</option>
    <option value='8'>item8</option>
    <option value='9'>item9</option>
    <option value='10'>item10</option>
  </select>
The page source also shows which scripts are used:
  <script language="javascript" type="text/javascript" src="/js/aforms.js"></script>
  <script type="text/javascript" src="/js/btype.js"></script>
  <script type="text/javascript" src="/js/ajaxtags-1.2.js"></script>
Is it possible to use wget or curl to download the non-default content on this web page?
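
Before trying to request the non-default content, it helps to know which values the dropdown can send. The following is a small sketch, using only Python's standard library, that lists the options of a named `<select>` element in a downloaded page. `SAMPLE_HTML` is a shortened stand-in for the markup quoted above, and the class and function names are made up for this illustration.

```python
from html.parser import HTMLParser

# Shortened stand-in for the <select> quoted in the post.
SAMPLE_HTML = (
    "<select name='applicationId' size='1'>"
    "<option value='1'>item1</option>"
    "<option value='7' selected='selected'>item7</option>"
    "<option value='8'>item8</option>"
    "</select>"
)


class SelectOptions(HTMLParser):
    """Collect (value, label, selected) for each <option> of one <select>."""

    def __init__(self, select_name):
        super().__init__()
        self.select_name = select_name
        self.options = []
        self._in_select = False
        self._current = None  # [value, label, selected] while inside <option>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select" and attrs.get("name") == self.select_name:
            self._in_select = True
        elif tag == "option" and self._in_select:
            self._current = [attrs.get("value", ""), "", "selected" in attrs]

    def handle_data(self, data):
        if self._current is not None:
            self._current[1] += data

    def handle_endtag(self, tag):
        if tag == "option" and self._current is not None:
            self.options.append(tuple(self._current))
            self._current = None
        elif tag == "select":
            self._in_select = False


def list_options(html, select_name):
    """Return the options of the <select> whose name is select_name."""
    parser = SelectOptions(select_name)
    parser.feed(html)
    parser.close()
    return parser.options
```

Calling `list_options(SAMPLE_HTML, "applicationId")` gives one tuple per option; the `value` entries are what the page's JavaScript would send when the user picks that item.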


Do you need to log in before filling in the form?
Do you need to download a file after submitting the form?

If the answer to both questions is no, go for curl.


Originally posted by corvus on 2009-8-31 16:45
Do you need to log in before filling in the form?
Do you need to download a file after submitting the form?

If the answer to both questions is no, go for curl.


The answer to the first one is no; I'm not sure about the second.
I want to submit the form and, once the web page content has changed according to the submitted form, download the whole web page.

curl's -F option seems to let the user emulate a form submission.


I have read "Using cURL to automate HTTP jobs" but haven't found an approach that answers my question.

According to this article, the most difficult issues relate to accessing web pages that require a login: the user must handle cookies, hidden form fields, and URL encoding.

My situation is different. I only want to access and automatically download a web page that is completely open to the public, with no login needed.

The page has a big table that cannot show all of its fields by default, so it offers a dropdown list, implemented with AJAX, that lets the user select which field to show.

Please give me some hints on how to handle this kind of issue.


Can you tell me what you can't do with curl? (I have forgotten why I switched to AutoIt on Windows with IE.)


Originally posted by corvus on 2009-9-2 17:41
Can you tell me what you can't do with curl? (I have forgotten why I switched to AutoIt on Windows with IE.)


I often visit http://www.hwbot.org to check CPU benchmarks.

For example, take this web page:
http://www.hwbot.org/browseHardw ... ?cpuSubFamilyId=122
When you visit it, it shows the default content, which includes the SuperPi test results.
If you are interested in another benchmark such as wPrime, you need to choose it from the dropdown list to get what you want. But when you make that selection, the address in the browser's address bar doesn't change.

When I use curl to download this web page, I only get the default page, i.e. the SuperPi test results. I would like to download the results for the other benchmarks as well, but I found that curl cannot emulate a manual click on a dropdown item, which is handled with AJAX through JavaScript.
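
A command-line downloader does not run JavaScript, so it cannot "click" anything. The dropdown's onchange handler does, however, end up sending an ordinary HTTP request, and that request can usually be reproduced directly once its URL and parameters are known (for example from the browser's network monitor or by reading the page's scripts). The sketch below, in Python, shows the idea. The endpoint is a placeholder and the parameter names are only guesses based on the `<select>` name and the page URL quoted in this thread; the real request may look different.

```python
import urllib.parse
import urllib.request

# ASSUMPTION: placeholder endpoint. The real URL must be taken from the
# browser's network monitor or from the page's JavaScript.
AJAX_URL = "http://example.com/ajax/statistics"


def build_ajax_url(base_url, application_id, sub_family_id):
    """Build the URL the onchange handler is assumed to request."""
    query = urllib.parse.urlencode({
        "applicationId": application_id,   # guessed from the <select> name
        "cpuSubFamilyId": sub_family_id,   # guessed from the page URL
    })
    return base_url + "?" + query


def urls_for_all_items(base_url, sub_family_id, item_ids=range(1, 11)):
    """Return one URL per dropdown value (1..10 in the quoted markup)."""
    return [build_ajax_url(base_url, i, sub_family_id) for i in item_ids]


def fetch_fragment(url, timeout=30):
    """Download one response; the header mimics what many AJAX libraries send."""
    req = urllib.request.Request(
        url, headers={"X-Requested-With": "XMLHttpRequest"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

Once the real endpoint is known, looping over `urls_for_all_items(...)` and saving each `fetch_fragment(url)` result would collect every benchmark view. The same URLs could equally be passed to curl or wget, since the request is plain HTTP.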
