Streaming large result sets with MySQL

I'm developing a Spring application that uses large MySQL tables. When loading large tables, I get an OutOfMemoryException, since the driver tries to load the entire table into application memory.

I tried using

statement.setFetchSize(Integer.MIN_VALUE);

but then every ResultSet I open hangs on close(). Looking online, I found that this happens because the driver tries to load any unread rows before closing the ResultSet, but that is not the case here, since I do this:

ResultSet existingRecords = getTableData(tablename);
try {
    while (existingRecords.next()) {
        // ...
    }
} finally {
    existingRecords.close(); // this line is hanging, and there was no exception in the try clause
}

The hangs happen for small tables (3 rows) as well, and if I don't close the ResultSet (which happened in one method), then connection.close() hangs.

Stack trace of the hang:

SocketInputStream.socketRead0(FileDescriptor, byte[], int, int, int) line: not available [native method]
SocketInputStream.read(byte[], int, int) line: 129
ReadAheadInputStream.fill(int) line: 113
ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(byte[], int, int) line: 160
ReadAheadInputStream.read(byte[], int, int) line: 188
MysqlIO.readFully(InputStream, byte[], int, int) line: 2428
MysqlIO.reuseAndReadPacket(Buffer, int) line: 2882
MysqlIO.reuseAndReadPacket(Buffer) line: 2871
MysqlIO.checkErrorPacket(int) line: 3414
MysqlIO.checkErrorPacket() line: 910
MysqlIO.nextRow(Field[], int, boolean, int, boolean, boolean, boolean, Buffer) line: 1405
RowDataDynamic.nextRecord() line: 413
RowDataDynamic.next() line: 392
RowDataDynamic.close() line: 170
JDBC4ResultSet(ResultSetImpl).realClose(boolean) line: 7473
JDBC4ResultSet(ResultSetImpl).close() line: 881
DelegatingResultSet.close() line: 152
DelegatingResultSet.close() line: 152
DelegatingPreparedStatement(DelegatingStatement).close() line: 163
(This is my class) Database.close() line: 84

6 Solutions

#1


52  

Only setting the fetch size is not the correct approach. The javadoc of Statement#setFetchSize() already states the following:

Gives the JDBC driver a hint as to the number of rows that should be fetched from the database

The driver is actually free to apply or ignore the hint. Some drivers ignore it, some drivers apply it directly, some drivers need more parameters. The MySQL JDBC driver falls in the last category. If you check the MySQL JDBC driver documentation, you'll see the following information (scroll about 2/3 down until header ResultSet):

To enable this functionality, you need to create a Statement instance in the following manner:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

Please read the entire section of the document; it describes the caveats of this approach as well. Here's a relevant quote:

There are some caveats with this approach. You will have to read all of the rows in the result set (or close it) before you can issue any other queries on the connection, or an exception will be thrown.

(...)

If the statement is within scope of a transaction, then locks are released when the transaction completes (which implies that the statement needs to complete first). As with most other databases, statements are not complete until all the results pending on the statement are read or the active result set for the statement is closed.

If that doesn't fix the OutOfMemoryError (not Exception), then the problem is likely that you're storing all the data in Java's memory instead of processing it as soon as it comes in. Fixing that would require more changes in your code, maybe a complete rewrite. I've answered a similar question before here.
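
A minimal sketch of both points, assuming MySQL Connector/J and a Connection that the caller has already opened: the Statement is created forward-only and read-only with a fetch size of Integer.MIN_VALUE, and each row is handed to a callback as it arrives instead of being collected in a list. The StreamingQuery class, the RowHandler interface, and the forEachRow method are hypothetical names used only for illustration.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper for illustration; not part of any library. */
final class StreamingQuery {

    /** Hypothetical callback, invoked once per row while the cursor is positioned on it. */
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    /** Runs sql, passes every row to handler as it arrives, and returns the row count. */
    static long forEachRow(Connection conn, String sql, RowHandler handler) throws SQLException {
        long count = 0;
        // Forward-only + read-only + Integer.MIN_VALUE is the combination the
        // Connector/J documentation quoted above asks for.
        try (Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            stmt.setFetchSize(Integer.MIN_VALUE);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    handler.handle(rs); // process immediately; nothing is accumulated here
                    count++;
                }
            }
        }
        return count;
    }
}
```

A caller might use it as StreamingQuery.forEachRow(conn, "SELECT id FROM some_table", row -> process(row.getLong(1))), where some_table and process are placeholders. The loop reads every row before the ResultSet is closed, which is what the caveat quoted above requires.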

#2


12  

Don't close your ResultSets twice.

Apparently, when closing a Statement it attempts to close the corresponding ResultSet, as you can see in these two lines from the stack trace:

DelegatingResultSet.close() line: 152
DelegatingPreparedStatement(DelegatingStatement).close() line: 163

I had thought the hang was in ResultSet.close() but it was actually in Statement.close() which calls ResultSet.close(). Since the ResultSet was already closed, it just hung.

We've replaced all ResultSet.close() with results.getStatement().close() and removed all Statement.close()s, and the problem is now solved.
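
A rough sketch of that replacement as a single helper, so that each call site closes things once. JdbcCleanup and closeViaStatement are hypothetical names. The null checks are an addition of mine, since ResultSet.getStatement() can return null for result sets that were not produced by a Statement (for example, some DatabaseMetaData results).

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper for illustration; not part of any library. */
final class JdbcCleanup {

    /**
     * Closes the Statement that produced results. Per the JDBC API, closing a
     * Statement also closes its current ResultSet, so the ResultSet is not
     * closed separately here.
     */
    static void closeViaStatement(ResultSet results) throws SQLException {
        if (results == null) {
            return; // nothing was opened
        }
        Statement owner = results.getStatement();
        if (owner != null) {
            owner.close();
        } else {
            results.close(); // no owning Statement; close the ResultSet directly
        }
    }
}
```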

#3


4  

In case someone has the same problem, I resolved it by using the LIMIT clause in my query.

This issue was reported to MySQL as a bug (find it here: http://bugs.mysql.com/bug.php?id=42929), which now has a status of "not a bug". The most pertinent part is:

There's no way currently to close a result set "midstream"

Since you have to read ALL rows, you will have to limit your query results using a clause like WHERE or LIMIT. Alternatively, try the following:

ResultSet rs = ...
while(rs.next()) {
   ...
   if(bailOut == true) { break; }
}

while(rs.next()); // This will deplete the remaining rows on the stream

rs.close();

It may not be ideal, but at least it gets you past the hang on close.
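
The LIMIT route could look roughly like the sketch below, which reads a table in fixed-size pages so that every result set is small and fully consumed before it is closed. The table name records, its numeric id column, and the PagedReader class are hypothetical. Paging by "WHERE id > ?" (keyset paging) is my substitution for a plain LIMIT/OFFSET, and it assumes ids are positive and unique.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.function.LongConsumer;

/** Hypothetical helper for illustration; not part of any library. */
final class PagedReader {

    /** Reads all ids from the hypothetical "records" table, pageSize rows per query. */
    static long readAll(Connection conn, int pageSize, LongConsumer idHandler) throws SQLException {
        String sql = "SELECT id FROM records WHERE id > ? ORDER BY id LIMIT ?";
        long lastId = 0; // assumption: ids start above 0
        long total = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            while (true) {
                ps.setLong(1, lastId);
                ps.setInt(2, pageSize);
                int rowsInPage = 0;
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) { // every row of the page is read before close()
                        lastId = rs.getLong(1);
                        idHandler.accept(lastId);
                        rowsInPage++;
                    }
                }
                total += rowsInPage;
                if (rowsInPage < pageSize) {
                    return total; // a short page means the table is exhausted
                }
            }
        }
    }
}
```

When the row count is an exact multiple of pageSize, this issues one extra query that returns an empty page before stopping.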

#4


1  

If you are using Spring JDBC, then you need to use a PreparedStatementCreator in conjunction with SimpleJdbcTemplate to set the fetchSize to Integer.MIN_VALUE. It's described here: http://neopatel.blogspot.com/2012/02/mysql-jdbc-driver-and-streaming-large.html

#5


0  

It hangs because even if you stop listening, the request still goes on. In order to close the ResultSet and Statement in the right order, try calling statement.cancel() first:

public void close() {
    try {
        statement.cancel();
        if (resultSet != null)
            resultSet.close();
    } catch (SQLException e) {
        // ignore errors on closing
    } finally {
        try {
            statement.close();
        } catch (SQLException e) {
            // ignore errors on closing
        } finally {
            resultSet = null;
            statement = null;
        }
    }
}

#6


0  

A scrollable ResultSet ignores fetchSize and fetches all the rows at once, causing an out-of-memory error.

For me it worked properly when setting useCursors=true. Otherwise the scrollable ResultSet ignores the fetch size entirely: in my case it was 5000, but the scrollable ResultSet fetched millions of records at once, causing excessive memory usage. The underlying DB is MS SQL Server.

jdbc:jtds:sqlserver://localhost:1433/ACS;TDS=8.0;useCursors=true
