
A workaround after Google's search results API was shut down

    Blog category:
  • java
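    A while ago I needed Google search results for a project, but every attempt failed with a connection timeout. When I finally checked the official site, I found to my dismay that Google had shut down its SOAP-based web service API on November 1, 2010, and now offers only AJAX access. Here is a way around the problem: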

package com.zzs.search;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;

public class GoogleSearchTest {

	/**
	 * Queries the Google AJAX web search endpoint and prints the raw JSON response.
	 *
	 * @param args unused
	 */
	public static void main(String[] args) {
		String keywords = "abc";
		String start = "1";
		StringBuilder builder = new StringBuilder();
		try {
			// URL-encode the keywords so spaces and non-ASCII characters survive in the query string
			URL url = new URL(
					"http://ajax.googleapis.com/ajax/services/search/web?v=1.0&hl=zh-CN&rsz=large&q="
							+ URLEncoder.encode(keywords, "UTF-8") + "&start=" + start);
			// Send the request and read the search results
			URLConnection connection = url.openConnection();
			// connection.addRequestProperty("Referer",
			// "http://www.mysite.com/index.html");
			// try-with-resources closes the reader even if reading fails
			try (BufferedReader reader = new BufferedReader(new InputStreamReader(
					connection.getInputStream(), "utf-8"))) {
				String line;
				while ((line = reader.readLine()) != null) {
					builder.append(line);
				}
			}
			String builderStr = builder.toString();
			System.out.println(builderStr);
		} catch (IOException e1) {
			// Also covers MalformedURLException and UnsupportedEncodingException
			e1.printStackTrace();
		}
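		/*
		 * Request parameters (a trailing ? marks an optional parameter):
		 * q q=Paris%20Hilton The query or search expression passed to the searcher.
		 * v v=1.0 The protocol version number. The only valid value at present is 1.0.
		 * rsz? rsz=small Optional. The number of results the application wants to receive. small means a small result set of 4 results; large means a larger result set of 8 results. If omitted, small is assumed.
		 * hl? hl=fr Optional. The host language of the requesting application. If omitted, a value is chosen from the Accept-Language HTTP header. If that header is absent, en is assumed.
		 * key? key=your-key Optional. The application's key. If specified, it must be a valid key associated with your site (verified against the Referer header that is passed). The advantage of supplying a key is that Google can identify and contact you if your application misbehaves. Without a key the same appropriate action is still taken, but Google cannot contact you. Supplying a key is strongly recommended.
		 * start? start=4 Optional. The start index of the first search result. Every successful response contains a cursor object (see below) that includes a pages array. The start property of a page can be used as a valid value for this parameter.
		 */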
	}

}
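
The parameters listed in the comment above can also be assembled programmatically instead of by string concatenation. The following is only a sketch: the class name SearchUrlBuilder and the method buildUrl are hypothetical names introduced for illustration, and the endpoint is simply the one used in the example above. Every value is URL-encoded, so a query such as "Paris Hilton" is sent safely (URLEncoder writes the space as + rather than %20, which is also valid in a query string).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original post
public class SearchUrlBuilder {

	// Endpoint copied from the example above
	private static final String ENDPOINT =
			"http://ajax.googleapis.com/ajax/services/search/web";

	/** Appends each parameter to the endpoint, URL-encoding every value. */
	public static String buildUrl(Map<String, String> params) {
		StringBuilder sb = new StringBuilder(ENDPOINT);
		char separator = '?';
		try {
			for (Map.Entry<String, String> entry : params.entrySet()) {
				sb.append(separator)
						.append(entry.getKey())
						.append('=')
						.append(URLEncoder.encode(entry.getValue(), "UTF-8"));
				separator = '&';
			}
		} catch (UnsupportedEncodingException e) {
			// UTF-8 is required on every Java platform, so this should not happen
			throw new IllegalStateException(e);
		}
		return sb.toString();
	}

	public static void main(String[] args) {
		// LinkedHashMap keeps the parameters in insertion order
		Map<String, String> params = new LinkedHashMap<String, String>();
		params.put("v", "1.0");
		params.put("q", "Paris Hilton");
		params.put("rsz", "large");
		params.put("hl", "zh-CN");
		params.put("start", "8");
		System.out.println(buildUrl(params));
	}
}
```

The optional key parameter could be added to the map in the same way if you have one.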

The search returns the following result:
{"responseData": {"results":[{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://abcnews.go.com/","url":"http://abcnews.go.com/","visibleUrl":"abcnews.go.com","cacheUrl":"","title":"\u003cb\u003eABC\u003c/b\u003e News","titleNoFormatting":"ABC News","content":"ABCNews.com: Breaking News, Politics, World News, Good Morning America,   Exclusive Interviews."},{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://abc.go.com/watch","url":"http://abc.go.com/watch","visibleUrl":"abc.go.com","cacheUrl":"http://www.google.com/search?q\u003dcache:poVfr19iF9UJ:abc.go.com","title":"Watch Full Episodes - \u003cb\u003eABC\u003c/b\u003e","titleNoFormatting":"Watch Full Episodes - ABC","content":"Watch full episodes from your favorite \u003cb\u003eABC\u003c/b\u003e programs online. The official \u003cb\u003eABC\u003c/b\u003e \u003cb\u003e...\u003c/b\u003e"},{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://abc.go.com/shows/desperate-housewives","url":"http://abc.go.com/shows/desperate-housewives","visibleUrl":"abc.go.com","cacheUrl":"http://www.google.com/search?q\u003dcache:H0P7mX35mMQJ:abc.go.com","title":"Desperate Housewives: Watch Full Episodes for Free Online - \u003cb\u003eABC\u003c/b\u003e.com","titleNoFormatting":"Desperate Housewives: Watch Full Episodes for Free Online - ABC.com","content":"The official Desperate Housewivespage on \u003cb\u003eABC\u003c/b\u003e offers a deeper look at the hit \u003cb\u003e...\u003c/b\u003e"},{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://en.wikipedia.org/wiki/American_Broadcasting_Company","url":"http://en.wikipedia.org/wiki/American_Broadcasting_Company","visibleUrl":"en.wikipedia.org","cacheUrl":"http://www.google.com/search?q\u003dcache:DI7iYqHzygwJ:en.wikipedia.org","title":"\u003cb\u003eAmerican Broadcasting Company\u003c/b\u003e - Wikipedia, the free encyclopedia","titleNoFormatting":"American Broadcasting Company - Wikipedia, the free encyclopedia","content":"The \u003cb\u003eAmerican Broadcasting 
Company\u003c/b\u003e (\u003cb\u003eABC\u003c/b\u003e) is an American television network.   Created in 1943 from the former NBC Blue radio network, \u003cb\u003eABC\u003c/b\u003e is owned by The Walt   \u003cb\u003e...\u003c/b\u003e"},{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://twitter.com/abc","url":"http://twitter.com/abc","visibleUrl":"twitter.com","cacheUrl":"http://www.google.com/search?q\u003dcache:DHkYr_pKNK8J:twitter.com","title":"\u003cb\u003eABC\u003c/b\u003e News (\u003cb\u003eABC\u003c/b\u003e) on Twitter","titleNoFormatting":"ABC News (ABC) on Twitter","content":"\u003cb\u003eABC\u003c/b\u003e News (\u003cb\u003eABC\u003c/b\u003e) is on Twitter. Sign up for Twitter to follow \u003cb\u003eABC\u003c/b\u003e News (\u003cb\u003eABC\u003c/b\u003e) and   get their latest updates."},{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://www.abc.org/","url":"http://www.abc.org/","visibleUrl":"www.abc.org","cacheUrl":"http://www.google.com/search?q\u003dcache:hoLnleEslnwJ:www.abc.org","title":"\u003cb\u003eABC\u003c/b\u003e - Associated Builders \u0026amp; Contractors, Inc - Home","titleNoFormatting":"ABC - Associated Builders \u0026amp; Contractors, Inc - Home","content":"National trade association representing merit shop contractors, subcontractors,   material suppliers and related firms in the United States."},{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://pingpong-abc.sourceforge.net/","url":"http://pingpong-abc.sourceforge.net/","visibleUrl":"pingpong-abc.sourceforge.net","cacheUrl":"http://www.google.com/search?q\u003dcache:OIiCTLrardcJ:pingpong-abc.sourceforge.net","title":"\u003cb\u003eABC\u003c/b\u003e [Yet \u003cb\u003eAnother Bittorrent Client\u003c/b\u003e]","titleNoFormatting":"ABC [Yet Another Bittorrent Client]","content":"Forked from Bittornado, handles multiple 
torrents."},{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://www.accessabc.com/","url":"http://www.accessabc.com/","visibleUrl":"www.accessabc.com","cacheUrl":"http://www.google.com/search?q\u003dcache:tcxxMtiqZFIJ:www.accessabc.com","title":"\u003cb\u003eAudit Bureau of Circulations\u003c/b\u003e","titleNoFormatting":"Audit Bureau of Circulations","content":"Non-profit association of advertisers, ad agencies and publishers. Provides   media audits, publisher statements, news bulletins, press and conference \u003cb\u003e...\u003c/b\u003e"}],"cursor":{"pages":[{"start":"0","label":1},{"start":"8","label":2},{"start":"16","label":3},{"start":"24","label":4},{"start":"32","label":5},{"start":"40","label":6},{"start":"48","label":7},{"start":"56","label":8}],"estimatedResultCount":"29000000","currentPageIndex":0,"moreResultsUrl":"http://www.google.com/search?oe\u003dutf8\u0026ie\u003dutf8\u0026source\u003duds\u0026start\u003d1\u0026hl\u003dzh-CN\u0026q\u003dabc"}}, "responseDetails": null, "responseStatus": 200}

The pattern in this output is easy to see, so writing your own parser for it is all that remains.
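
As a rough illustration of such a parser, the sketch below pulls string fields out of the response with regular expressions, using only the JDK. It is an assumption-laden shortcut, not a real JSON parser: the class SearchResultParser and its methods are hypothetical names, and it relies on the flat "key":"value" layout seen in the output above. For anything beyond a quick experiment, a proper JSON library would be the safer choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post
public class SearchResultParser {

	private static final Pattern UNICODE_ESCAPE =
			Pattern.compile("\\\\u([0-9a-fA-F]{4})");

	/** Returns every string value stored under the given key, in document order. */
	public static List<String> extract(String json, String key) {
		// Matches "key":"value", where the value may contain backslash escapes
		Pattern field = Pattern.compile(
				"\"" + Pattern.quote(key) + "\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");
		List<String> values = new ArrayList<String>();
		Matcher m = field.matcher(json);
		while (m.find()) {
			values.add(m.group(1));
		}
		return values;
	}

	/** Replaces four-digit hex escapes (backslash, u, four hex digits) with the character they encode. */
	public static String unescapeUnicode(String s) {
		Matcher m = UNICODE_ESCAPE.matcher(s);
		StringBuffer out = new StringBuffer();
		while (m.find()) {
			String ch = String.valueOf((char) Integer.parseInt(m.group(1), 16));
			m.appendReplacement(out, Matcher.quoteReplacement(ch));
		}
		m.appendTail(out);
		return out.toString();
	}

	public static void main(String[] args) {
		// Shortened sample built from fields that appear in the response above
		String json = "{\"responseData\": {\"results\":[{\"unescapedUrl\":\"http://abcnews.go.com/\","
				+ "\"url\":\"http://abcnews.go.com/\",\"titleNoFormatting\":\"ABC News\"}],"
				+ "\"cursor\":{\"pages\":[{\"start\":\"0\",\"label\":1},{\"start\":\"8\",\"label\":2}],"
				+ "\"estimatedResultCount\":\"29000000\"}}, \"responseStatus\": 200}";
		List<String> urls = extract(json, "url");
		List<String> titles = extract(json, "titleNoFormatting");
		for (int i = 0; i < urls.size() && i < titles.size(); i++) {
			System.out.println(unescapeUnicode(titles.get(i)) + " -> " + urls.get(i));
		}
		System.out.println("estimated: " + extract(json, "estimatedResultCount"));
		System.out.println("page starts: " + extract(json, "start"));
	}
}
```

The values extracted for start come from the cursor's pages array, so each of them can be passed back as the start parameter to fetch the corresponding page.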

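Comments
#3 zcyjava 2011-10-20
This method returns a result count of 29000000 (read from estimatedResultCount),
but searching from Google's own page by clicking the search button reports 686000000 results. Why are the two different? Is Google restricting the results?
#2 xiaoyangok 2011-08-25
Have you noticed this too, or do you know of a better approach?
#1 xiaoyangok 2011-08-25
The number of requests is limited!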
