
Exploring the HttpComponents Components: HttpClient


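In the Java world, the first things that come to mind when network programming is mentioned are probably excellent open-source frameworks such as MINA, Netty, and Grizzly. Fair enough, but before digging into those frameworks we should start from the most basic techniques (assuming, of course, that you are already familiar with the java.net.* class library). Here I want to share some features of a few components of the HttpComponents project. You have most likely used HttpClient already. The lineage between HttpComponents and HttpClient is somewhat like that between Guava and google-collections. HttpComponents is now an Apache top-level project whose goal is to provide a toolset for the HTTP protocol on the Java platform. Its code is neatly organized into two parts: a core toolset (including HttpCore-bio, HttpCore-nio, HttpClient, HttpMime, HttpCookie, and so on) and an extension toolset (currently mainly SSL).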

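HttpClient mainly covers three areas: connection management, state management, and authentication management. Below is a wrapper I built on top of it. It has been verified in production for close to half a year (this applies to the HttpClient 3 version; the HttpClient 4 version still needs to be proven), so it can be regarded as a reasonably high-performance client wrapper. If you are interested, you can tune some of the parameters to match the MPM I/O model of your Apache server.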

First, the HttpClient 4 wrapper:

    

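import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.zip.GZIPInputStream;

// Validate is assumed to come from commons-lang 2.x;
// with commons-lang3 use org.apache.commons.lang3.Validate instead.
import org.apache.commons.lang.Validate;
import org.apache.http.Header;
import org.apache.http.HttpResponse;
import org.apache.http.NoHttpResponseException;
import org.apache.http.StatusLine;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.methods.HttpDelete;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpHead;
import org.apache.http.client.methods.HttpOptions;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.methods.HttpPut;
import org.apache.http.client.methods.HttpTrace;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.http.conn.scheme.PlainSocketFactory;
import org.apache.http.conn.scheme.Scheme;
import org.apache.http.conn.scheme.SchemeRegistry;
import org.apache.http.conn.ssl.SSLSocketFactory;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager;
import org.apache.http.params.CoreConnectionPNames;
import org.springframework.beans.factory.DisposableBean;

/**
 * @author von gosling 2012-3-2
 */
public class HttpComponentsClientExecutor implements DisposableBean {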
    private static final int    DEFAULT_MAX_TOTAL_CONNECTIONS     = 100;

    private static final int    DEFAULT_MAX_CONNECTIONS_PER_ROUTE = 5;                 //notice IE 6,7,8

    private static final int    DEFAULT_CONN_TIMEOUT_MILLISECONDS = 5 * 1000;

    private static final int    DEFAULT_READ_TIMEOUT_MILLISECONDS = 60 * 1000;

    private static final String HTTP_HEADER_CONTENT_ENCODING      = "Content-Encoding";
    private static final String ENCODING_GZIP                     = "gzip";

    private HttpClient          httpClient;

    /**
     * Create a new instance of the HttpComponentsClient with a default
     * {@link HttpClient} that uses a default
     * {@link org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager}.
     */
    public HttpComponentsClientExecutor() {
        SchemeRegistry schemeRegistry = new SchemeRegistry();
        schemeRegistry.register(new Scheme("http", 80, PlainSocketFactory.getSocketFactory()));
        schemeRegistry.register(new Scheme("https", 443, SSLSocketFactory.getSocketFactory()));

        ThreadSafeClientConnManager connectionManager = new ThreadSafeClientConnManager(
                schemeRegistry);
        connectionManager.setMaxTotal(DEFAULT_MAX_TOTAL_CONNECTIONS);
        connectionManager.setDefaultMaxPerRoute(DEFAULT_MAX_CONNECTIONS_PER_ROUTE);
        this.httpClient = new DefaultHttpClient(connectionManager);

        setConnectTimeout(DEFAULT_CONN_TIMEOUT_MILLISECONDS);
        setReadTimeout(DEFAULT_READ_TIMEOUT_MILLISECONDS);
    }

    /**
     * Create a new instance of the HttpComponentsClient with the given
     * {@link HttpClient} instance.
     * 
     * @param httpClient the HttpClient instance to use for this request
     */
    public HttpComponentsClientExecutor(HttpClient httpClient) {
        Validate.notNull(httpClient, "HttpClient must not be null");
        //notice: if you want to custom exception recovery mechanism 
        //you should provide an implementation of the HttpRequestRetryHandler interface.
        this.httpClient = httpClient;
    }

    /**
     * Set the {@code HttpClient} used by this request.
     */
    public void setHttpClient(HttpClient httpClient) {
        this.httpClient = httpClient;
    }

    /**
     * Return the {@code HttpClient} used by this request.
     */
    public HttpClient getHttpClient() {
        return this.httpClient;
    }

    /**
     * Set the connection timeout for the underlying HttpClient. A timeout value
     * of 0 specifies an infinite timeout.
     * 
     * @param timeout the timeout value in milliseconds
     */
    public void setConnectTimeout(int timeout) {
        Validate.isTrue(timeout >= 0, "Timeout must be a non-negative value");
        getHttpClient().getParams().setIntParameter(CoreConnectionPNames.CONNECTION_TIMEOUT,
                timeout);
    }

    /**
     * Set the socket timeout (SO_TIMEOUT) in milliseconds, which is the timeout
     * for waiting for data or, put differently, a maximum period inactivity
     * between two consecutive data packets.A timeout value of 0 specifies an
     * infinite timeout.
     * 
     * @param timeout the timeout value in milliseconds
     */
    public void setReadTimeout(int timeout) {
        Validate.isTrue(timeout >= 0, "Timeout must be a non-negative value");
        getHttpClient().getParams().setIntParameter(CoreConnectionPNames.SO_TIMEOUT, timeout);
    }

    /**
     * Create an HttpUriRequest object for the given HTTP method and URI
     * specification.
     * 
     * @param httpMethod the HTTP method
     * @param uri the URI
     * @return the HttpUriRequest object
     */
    protected HttpUriRequest createHttpUriRequest(HttpMethod httpMethod, URI uri) {
        switch (httpMethod) {
            case GET:
                return new HttpGet(uri);
            case DELETE:
                return new HttpDelete(uri);
            case HEAD:
                return new HttpHead(uri);
            case OPTIONS:
                return new HttpOptions(uri);
            case POST:
                return new HttpPost(uri);
            case PUT:
                return new HttpPut(uri);
            case TRACE:
                return new HttpTrace(uri);
            default:
                throw new IllegalArgumentException("Invalid HTTP method: " + httpMethod);
        }
    }

    /**
     * Execute the given method on the provided URI.
     * 
     * @param httpMethod the HTTP method to execute (GET, POST, etc.)
     * @param uri the fully-expanded URI to connect to
     * @param responseHandler converts the response to a String; when a
     *            response handler is used, HttpClient automatically releases
     *            the connection back to the connection manager, whether the
     *            request execution succeeds or causes an exception
     * @return the string representation of the response
     * @throws IOException
     * @throws ClientProtocolException
     */
    public String doExecuteRequest(HttpMethod httpMethod, URI uri,
                                   ResponseHandler<String> responseHandler)
            throws ClientProtocolException, IOException {
        return httpClient.execute(createHttpUriRequest(httpMethod, uri), responseHandler);
    }

    public InputStream doExecuteRequest(HttpMethod httpMethod, URI uri)
            throws ClientProtocolException, IOException {
        //1.
        HttpUriRequest httpUriRequest = createHttpUriRequest(httpMethod, uri);
        //2.
        HttpResponse response = httpClient.execute(httpUriRequest);
        //3.
        validateResponse(response);
        //4.
        return getResponseBody(response);
    }

    /**
     * Validate the given response, throwing an exception if it does not
     * correspond to a successful HTTP response.
     * <p>
     * Default implementation rejects any HTTP status code beyond 2xx, to avoid
     * parsing the response body and trying to deserialize from a corrupted
     * stream.
     * 
     * @param response the resulting HttpResponse to validate
     * @throws NoHttpResponseException
     * @throws java.io.IOException if validation failed
     */
    protected void validateResponse(HttpResponse response) throws IOException {
        StatusLine status = response.getStatusLine();
        if (status.getStatusCode() >= 300) {
            throw new NoHttpResponseException(
                    "Did not receive successful HTTP response: status code = "
                            + status.getStatusCode() + ", status message = ["
                            + status.getReasonPhrase() + "]");
        }
    }

    /**
     * Extract the response body
     * <p>
     * The default implementation simply fetches the response body stream. If
     * the response is recognized as GZIP response, the InputStream will get
     * wrapped in a GZIPInputStream.
     * 
     * @param httpResponse the resulting HttpResponse to read the response body
     *            from
     * @return an InputStream for the response body
     * @throws java.io.IOException if thrown by I/O methods
     * @see #isGzipResponse
     * @see java.util.zip.GZIPInputStream
     */
    protected InputStream getResponseBody(HttpResponse httpResponse) throws IOException {
        if (isGzipResponse(httpResponse)) {
            return new GZIPInputStream(httpResponse.getEntity().getContent());
        } else {
            return httpResponse.getEntity().getContent();
        }
    }

    /**
     * Determine whether the given response indicates a GZIP response.
     * <p>
     * The default implementation checks whether the HTTP "Content-Encoding"
     * header contains "gzip" (in any casing).
     * 
     * @param httpResponse the resulting HttpResponse to check
     * @return whether the given response indicates a GZIP response
     */
    protected boolean isGzipResponse(HttpResponse httpResponse) {
        Header encodingHeader = httpResponse.getFirstHeader(HTTP_HEADER_CONTENT_ENCODING);
        return (encodingHeader != null && encodingHeader.getValue() != null && encodingHeader
                .getValue().toLowerCase().contains(ENCODING_GZIP));
    }

    /**
     * Shutdown hook that closes the underlying
     * {@link org.apache.http.conn.ClientConnectionManager
     * ClientConnectionManager}'s connection pool, if any.
     */
    public void destroy() {
        getHttpClient().getConnectionManager().shutdown();
    }

    enum HttpMethod {
        GET, POST, HEAD, OPTIONS, PUT, DELETE, TRACE
    }
}
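The GZIP handling in the wrapper can be tried out without any HTTP traffic. The following is a minimal sketch using only the JDK; the class name `GzipBodySketch` and its helper methods are made up for this illustration and are not part of HttpClient. It mirrors the two steps of the wrapper: check the Content-Encoding value case-insensitively, then wrap the raw stream in a GZIPInputStream only when the check passes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBodySketch {

    /** Mirrors isGzipResponse: true if the header value mentions gzip, in any casing. */
    static boolean isGzip(String contentEncoding) {
        return contentEncoding != null
                && contentEncoding.toLowerCase(Locale.ROOT).contains("gzip");
    }

    /** Mirrors getResponseBody: wrap the raw stream only when the header says gzip. */
    static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        return isGzip(contentEncoding) ? new GZIPInputStream(raw) : raw;
    }

    /** Demo helper (hypothetical): gzip-compress a string as a server would. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    /** Demo helper (hypothetical): drain a stream into a UTF-8 string. */
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("hello");
        // The caller owns the returned stream and must close it, just as a
        // caller of doExecuteRequest(HttpMethod, URI) must close the body.
        try (InputStream body = decode(new ByteArrayInputStream(compressed), "GZIP")) {
            System.out.println(readAll(body)); // prints "hello"
        }
    }
}
```

Note that whoever receives the InputStream is responsible for closing it; with a pooled connection manager, an unclosed body stream keeps the connection from going back to the pool.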

Below is the battle-tested HttpClient 3 wrapper:

    

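import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.SocketTimeoutException;
import java.util.Collections;
import java.util.List;

// The third-party imports below are inferred from the calls used in this
// class (fastjson, commons-io, commons-lang 2.x, Guava, slf4j).
import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.NameValuePair;
import org.apache.commons.httpclient.cookie.CookiePolicy;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.params.HttpClientParams;
import org.apache.commons.httpclient.params.HttpConnectionManagerParams;
import org.apache.commons.httpclient.params.HttpMethodParams;
import org.apache.commons.io.IOUtils;
import org.apache.commons.lang.time.StopWatch;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.alibaba.fastjson.JSONObject;
import com.google.common.collect.Lists;

/**
 * @author von gosling 2011-12-12
 */
public class HttpClientUtils {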

    private static final Logger log                 = LoggerFactory
                                                            .getLogger(HttpClientUtils.class);

    private static int          timeOut             = 100;
    private static int          retryCount          = 1;
    private static int          connectionTimeout   = 100;
    private static int          maxHostConnections  = 32;                                     // set according to the Apache worker MPM
    private static int          maxTotalConnections = 512;                                    // same as above
    private static String       charsetName         = "UTF-8";

    public static JSONObject executeMethod(HttpClient httpClient, HttpMethod method) {

        JSONObject result = new JSONObject();
        StopWatch watch = new StopWatch();
        int status = -1;
        try {
            log.info("Execute method({}) begin...", method.getURI());

            watch.start();
            status = httpClient.executeMethod(method);
            watch.stop();

            if (status == HttpStatus.SC_OK) {
                InputStream inputStream = method.getResponseBodyAsStream();
                ByteArrayOutputStream baos = new ByteArrayOutputStream();
                IOUtils.copy(inputStream, baos);
                String response = new String(baos.toByteArray(), charsetName);

                log.info("Response is:{}", response);

                result = JSONObject.parseObject(response);
            } else {
                log.error("Http request failure! status is {}", status);
            }
        } catch (SocketTimeoutException e) {
            log.error("Request time out!"); // only the request (read) timeout is handled specifically; the other two kinds of timeout fall through to the generic catch
        } catch (Exception e) {
            log.error("Error occur!", e);
        } finally {
            method.releaseConnection();
            log.info("Method {},statusCode {},consuming {} ms", new Object[] { method.getName(),
                    status, watch.getTime() });
        }
        return result;
    }

    /**
     * @param uri
     * @param nameValuePairs
     * @return
     */
    public static PostMethod createPostMethod(String uri, NameValuePair[] nameValuePairs) {
        PostMethod method = new PostMethod(uri);
        method.addParameters(nameValuePairs);
        method.getParams().setContentCharset(charsetName);
        return method;
    }

    /**
     * @param uri
     * @param nameValuePairs
     * @return
     */
    public static GetMethod createGetMethod(String uri, NameValuePair[] nameValuePairs) {
        GetMethod method = new GetMethod(uri);
        List<NameValuePair> list = Lists.newArrayList();
        if (nameValuePairs != null) {
            Collections.addAll(list, nameValuePairs);
            method.setQueryString(list.toArray(new NameValuePair[nameValuePairs.length]));
        }
        method.getParams().setContentCharset(charsetName);
        return method;
    }

    public static HttpClient createHttpClient() {
        //1.
        HttpClient httpClient = new HttpClient(new MultiThreadedHttpConnectionManager());

        //2.
        HttpConnectionManagerParams httpConnectionManagerParams = httpClient
                .getHttpConnectionManager().getParams();
        httpConnectionManagerParams.setConnectionTimeout(connectionTimeout);
        httpConnectionManagerParams.setTcpNoDelay(true);//Nagle's algorithm
        httpConnectionManagerParams.setSoTimeout(timeOut);
        httpConnectionManagerParams.setDefaultMaxConnectionsPerHost(maxHostConnections);
        httpConnectionManagerParams.setMaxTotalConnections(maxTotalConnections);

        //3.
        HttpClientParams httpClientParam = httpClient.getParams();
        //httpClientParam.setConnectionManagerTimeout(connectionTimeout); // ignoring this timeout for now; revisit later depending on performance
        httpClientParam.setParameter(HttpMethodParams.RETRY_HANDLER,
                new DefaultHttpMethodRetryHandler(retryCount, false));
        httpClientParam.setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);

        return httpClient;
    }

    public static JSONObject doGet(String url, NameValuePair[] params) {
        return executeMethod(createHttpClient(), createGetMethod(url, params));
    }

    public static JSONObject doPost(String url, NameValuePair[] params) {
        return executeMethod(createHttpClient(), createPostMethod(url, params));
    }

    protected HttpClientUtils() {

    }

    public void setTimeOut(int timeOut) {
        HttpClientUtils.timeOut = timeOut;
    }

    public static int getTimeOut() {
        return timeOut;
    }

    public static int getRetryCount() {
        return retryCount;
    }

    public void setRetryCount(int retryCount) {
        HttpClientUtils.retryCount = retryCount;
    }

    public static int getConnectionTimeout() {
        return connectionTimeout;
    }

    public void setConnectionTimeout(int connectionTimeout) {
        HttpClientUtils.connectionTimeout = connectionTimeout;
    }

    public static int getMaxHostConnections() {
        return maxHostConnections;
    }

    public void setMaxHostConnections(int maxHostConnections) {
        HttpClientUtils.maxHostConnections = maxHostConnections;
    }

    public static int getMaxTotalConnections() {
        return maxTotalConnections;
    }

    public void setMaxTotalConnections(int maxTotalConnections) {
        HttpClientUtils.maxTotalConnections = maxTotalConnections;
    }

    public static String getCharsetName() {
        return charsetName;
    }

    public void setCharsetName(String charsetName) {
        HttpClientUtils.charsetName = charsetName;
    }
}
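The SocketTimeoutException caught in executeMethod is what a socket read throws when SO_TIMEOUT expires, which is the value configured through setSoTimeout. Here is a minimal JDK-only sketch of that behavior, assuming a local server that accepts the TCP connection but never sends a byte. The class `ReadTimeoutSketch` and the method `readTimesOut` are hypothetical names for this illustration; no HttpClient API is involved.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutSketch {

    /**
     * Connects to a silent local server and tries to read one byte.
     * Returns true if the read gave up with SocketTimeoutException.
     */
    static boolean readTimesOut(int connectTimeoutMillis, int readTimeoutMillis)
            throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the OS pick a free port; the server never calls accept()
        // and never writes, so the client has nothing to read.
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            // Connect timeout: how long to wait for the TCP handshake.
            client.connect(new InetSocketAddress(loopback, silentServer.getLocalPort()),
                    connectTimeoutMillis);
            // Read timeout (SO_TIMEOUT): maximum idle time while waiting for data.
            client.setSoTimeout(readTimeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read();
                return false; // data or end-of-stream arrived, no timeout
            } catch (SocketTimeoutException e) {
                return true;  // the case executeMethod logs as "Request time out!"
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readTimesOut(1000, 200));
    }
}
```

The connect timeout and the read timeout are independent settings. The wrapper sets both to 100 ms by default, which is worth revisiting for slow backends.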

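Now that we have real code in front of us, let's summarize a few things to watch for when wrapping HttpClient. Most of them concern safety and performance:

(1) The multithreading model, and in particular releasing the connection in finally. Beyond that, consider exception handling for pooled connections, which is one of the three kinds of exceptions this post singles out for special attention;

(2) Idempotency in the retry mechanism. In HttpClient 4 in particular, PUT and POST operations are not handled the way the HTTP specification describes, so they need extra care;

(3) Customizing SSL and TLS handling;

(4) Handling of the concurrency markers. The concurrency annotations from Java Concurrency in Practice are used here. What are they good for? If you are curious, take a look at SureLogic (http://www.surelogic.com/concurrency-tools.html); just don't ask me for a license, since I am not a developer in the Apache open-source community either;

(5) Header handling in interceptors;

(6) The stale connection check mechanism;

(7) Choosing a cookie specification, or implementing a custom one;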
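Point (2) can be sketched independently of any HTTP library. The example below is an illustration only and does not show how HttpClient's own retry handler works: it assumes that only I/O failures are worth retrying and that a request may be re-sent only when its method is idempotent under RFC 2616 section 9.1.2, that is, every method in the wrapper's enum except POST. The class `IdempotentRetrySketch` and its `execute` method are hypothetical.

```java
import java.io.IOException;
import java.util.EnumSet;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotentRetrySketch {

    enum Method { GET, POST, HEAD, OPTIONS, PUT, DELETE, TRACE }

    // RFC 2616 section 9.1.2: GET, HEAD, PUT, DELETE, OPTIONS and TRACE are
    // idempotent; POST is not.
    static final Set<Method> IDEMPOTENT = EnumSet.complementOf(EnumSet.of(Method.POST));

    /**
     * Runs the request, retrying up to maxRetries times after an IOException,
     * but only when the method is idempotent. Other failures propagate at once.
     */
    static <T> T execute(Method method, int maxRetries, Callable<T> request) throws Exception {
        int attempts = 0;
        while (true) {
            try {
                attempts++;
                return request.call();
            } catch (IOException e) {
                boolean mayRetry = IDEMPOTENT.contains(method) && attempts <= maxRetries;
                if (!mayRetry) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Simulated request: fails twice with an I/O error, then succeeds.
        String body = execute(Method.GET, 2, () -> {
            if (calls.incrementAndGet() < 3) {
                throw new IOException("simulated I/O failure");
            }
            return "ok";
        });
        System.out.println(body + " after " + calls.get() + " attempts"); // prints "ok after 3 attempts"
    }
}
```

With the same simulated failure, a POST would fail on the first IOException without a second attempt, since re-sending it could apply the operation twice.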

That's all for today. Thanks for reading, and if anything is unclear, feel free to leave a comment.

References:

1.http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html

2.http://hc.apache.org/httpcomponents-client-ga/tutorial/pdf/httpclient-tutorial.pdf

