I'm unable to load an Excel file in the older Office XML format (Office 2002 or 2003) into Java. I tried JXL and Apache POI (version 3.7). POI doesn't work because it appears to expect the newer Office .xlsx format.
Here's an example of the older Office XML format.
One can generate a similar XML file from MS Excel 2010 by saving the workbook in the "XML Spreadsheet 2003" format.
Are there any open-source Java libraries that will load the XMLSS format? Otherwise I have no choice but to write a custom parser: read the XML file, then interpret the cell tags to build out the cell matrix. In this XML format, empty cells are skipped, and the next cell that has data is positioned with an index attribute that acts like a column offset, I assume to save space in the XML file.
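For reference, the custom-parser route described above can be sketched with the JDK's built-in DOM parser. This is a minimal sketch, not a complete reader: the class name `SpreadsheetMlDomReader` is made up for illustration, it assumes the standard SpreadsheetML namespace `urn:schemas-microsoft-com:office:spreadsheet`, it treats `ss:Index` as a 1-based position on both `Row` and `Cell`, it reads only the first worksheet, and it ignores merged cells and cell types.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of any library.
final class SpreadsheetMlDomReader {
    static final String SS = "urn:schemas-microsoft-com:office:spreadsheet";

    /** Reads the first worksheet; skipped rows and cells come back as empty entries. */
    static List<List<String>> readFirstSheet(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for the getElementsByTagNameNS lookups
        Document doc = factory.newDocumentBuilder().parse(in);

        List<List<String>> rows = new ArrayList<>();
        Element sheet = (Element) doc.getElementsByTagNameNS(SS, "Worksheet").item(0);
        if (sheet == null) {
            return rows;
        }
        NodeList rowNodes = sheet.getElementsByTagNameNS(SS, "Row");
        for (int r = 0; r < rowNodes.getLength(); r++) {
            Element row = (Element) rowNodes.item(r);
            // Pad with empty rows until this row lands at position (index - 1).
            int rowIndex = indexOf(row, rows.size() + 1);
            while (rows.size() < rowIndex - 1) {
                rows.add(new ArrayList<>());
            }
            List<String> cells = new ArrayList<>();
            NodeList cellNodes = row.getElementsByTagNameNS(SS, "Cell");
            for (int c = 0; c < cellNodes.getLength(); c++) {
                Element cell = (Element) cellNodes.item(c);
                // Pad with empty strings for the columns the file skipped.
                int colIndex = indexOf(cell, cells.size() + 1);
                while (cells.size() < colIndex - 1) {
                    cells.add("");
                }
                NodeList data = cell.getElementsByTagNameNS(SS, "Data");
                cells.add(data.getLength() == 0 ? "" : data.item(0).getTextContent());
            }
            rows.add(cells);
        }
        return rows;
    }

    /** Returns the 1-based ss:Index attribute, or the fallback when it is absent. */
    private static int indexOf(Element element, int fallback) {
        String value = element.getAttributeNS(SS, "Index");
        return value.isEmpty() ? fallback : Integer.parseInt(value);
    }

    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (List<String> row : readFirstSheet(in)) {
                System.out.println(row);
            }
        }
    }
}
```

A DOM parse loads the whole file into memory, so for very large exports a streaming parser would likely be the better fit.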
3
The format is called SpreadsheetML (not to be confused with .xlsx, which is also XML-based). A library called Xelem can handle it:
import nl.fountain.xelem.excel.Workbook;
import nl.fountain.xelem.lex.ExcelReader;
//...
ExcelReader reader = new ExcelReader();
Workbook xlWorkbook = reader.getWorkbook("c:\\my\\spreadsheet.xml");
System.out.println(xlWorkbook.getSheetNames());
2
Copying Mark Beardsley's answer from the POI team (http://apache-poi.1045710.n5.nabble.com/How-to-convert-xml-to-xls-td2306602.html):
You have got an Office 2003 xml file there, not an OpenXML file; it is an early attempt by Microsoft to create an xml based file format for Excel and it is in that sense a 'valid' Office file format.
Sadly, POI cannot interpret this file at all and that is why you saw the exception when you tried to wrap it up in the InputStream and pass it to WorkbookFactory(s) constructor. You do however have a number of options;
1
After a lot of pain I've found a solution to this. JODConverter uses the OpenOffice.org/LibreOffice API and can convert SpreadsheetML to whatever formats OpenOffice.org supports.
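The same conversion idea can be illustrated without JODConverter by calling LibreOffice's command line directly. Note that this is a different technique from the one in this answer, shown only as a sketch: it assumes a `soffice` executable is on the PATH, uses LibreOffice's `--headless`, `--convert-to`, and `--outdir` options, and the class name `SofficeConvert` is made up. Whether LibreOffice recognizes a given SpreadsheetML file automatically is worth verifying on a real sample.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; it shells out to LibreOffice rather than using JODConverter.
final class SofficeConvert {

    /** Builds the command line; assumes "soffice" resolves via the PATH. */
    static List<String> buildCommand(Path input, Path outDir, String targetFormat) {
        return Arrays.asList(
                "soffice", "--headless",
                "--convert-to", targetFormat,
                "--outdir", outDir.toString(),
                input.toString());
    }

    /** Runs the conversion and returns the exit code of the soffice process. */
    static int convert(Path input, Path outDir, String targetFormat)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, outDir, targetFormat))
                .inheritIO() // show LibreOffice's own messages on the console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // args[0] = input SpreadsheetML file, args[1] = output directory
        int exitCode = convert(Paths.get(args[0]), Paths.get(args[1]), "xlsx");
        System.out.println("soffice exited with " + exitCode);
    }
}
```

Once the file has been converted to .xls or .xlsx, a library such as POI should be able to read the result.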
0
You might get some results using the OpenOffice API. If it can't read the file directly, you could probably convert it to a 'supported' format. Otherwise, the schema for the Office 2003 'SpreadsheetML' isn't very complicated. I have successfully created an XSLT scenario to convert a result set (database query) to a (simple yet effective) Excel 2003 document (XML format). The other way around should not be very hard to achieve.
Cheers, Wim
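As a rough illustration of "the other way around", the sketch below runs a small XSLT 1.0 stylesheet through the JDK's `javax.xml.transform` API to flatten the first worksheet into tab-separated text. It is not the stylesheet mentioned in this answer; the class name `SpreadsheetMlToTsv` and the stylesheet are assumptions written for this example. It deliberately ignores the `ss:Index` attribute, so skipped cells and rows will shift the columns; handling them would need extra logic.

```java
import java.io.InputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; the stylesheet is an illustration only.
final class SpreadsheetMlToTsv {

    // One output line per Row, one tab-separated field per Cell (ss:Index is ignored).
    private static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:ss='urn:schemas-microsoft-com:office:spreadsheet'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='//ss:Worksheet[1]//ss:Row'>"
        + "<xsl:for-each select='ss:Cell'>"
        + "<xsl:if test='position() &gt; 1'><xsl:text>&#9;</xsl:text></xsl:if>"
        + "<xsl:value-of select='ss:Data'/>"
        + "</xsl:for-each>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms a SpreadsheetML stream into tab-separated text. */
    static String toTsv(InputStream spreadsheetMl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(spreadsheetMl), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.print(toTsv(in));
        }
    }
}
```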
0
The answer today was to ask the vendor to change their Excel file format to an Excel binary rather than the old Office XML. Doing so allowed me to use Apache POI 3.7 to read the file with no issues. I appreciate the answers, as I had no idea there was no direct support in the Java-based open source libraries for this old Office XML format. Now I know next time to check earlier to see what format the Excel files are in before committing to a timeline.
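Checking the format up front can be automated by looking at the first bytes of the file. The sketch below is an assumption-laden illustration, not something from this thread: the class name `ExcelFormatSniffer` is made up, the OLE2 and ZIP signatures only identify the container (other Office documents and ordinary archives share them), and the XML check assumes a UTF-8 or ASCII-compatible file whose SpreadsheetML namespace or `mso-application` processing instruction appears within the first 512 bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class for a quick pre-flight check.
final class ExcelFormatSniffer {

    enum Format {
        OLE2_BINARY,          // OLE2 container, likely a binary .xls
        ZIP_OOXML,            // ZIP container, likely an .xlsx
        XML_SPREADSHEET_2003, // SpreadsheetML, the "XML Spreadsheet 2003" format
        UNKNOWN
    }

    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04}; // "PK\3\4"

    /** Guesses the format from up to the first 512 bytes of the stream. */
    static Format sniff(InputStream in) throws IOException {
        byte[] head = new byte[512];
        int length = 0;
        int read;
        while (length < head.length
                && (read = in.read(head, length, head.length - length)) > 0) {
            length += read;
        }
        if (startsWith(head, length, OLE2_MAGIC)) {
            return Format.OLE2_BINARY;
        }
        if (startsWith(head, length, ZIP_MAGIC)) {
            return Format.ZIP_OOXML;
        }
        String text = new String(head, 0, length, StandardCharsets.UTF_8);
        if (text.contains("urn:schemas-microsoft-com:office:spreadsheet")
                || text.contains("progid=\"Excel.Sheet\"")) {
            return Format.XML_SPREADSHEET_2003;
        }
        return Format.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(sniff(in));
        }
    }
}
```

Running a check like this on a vendor's sample file before choosing a library would flag an XML Spreadsheet 2003 export early.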
0
I had the same problem some time ago and ended up writing a SAX parser to read the XML file. I wrote a blog post about it here.
You can find the sample project to parse the file on GitHub.