Merge Two XML Files in Java

I have two XML files of similar structure that I wish to merge into one file. Currently I am using EL4J XML Merge, which I came across in a tutorial. However, it does not merge as I expect. The main problem is that it does not merge the results from both files into one element, i.e. one that contains results 1, 2, 3 and 4. Instead, it discards either 1 and 2 or 3 and 4, depending on which file is merged first.

I would be grateful if anyone with XML Merge experience could tell me what I might be doing wrong. Alternatively, does anyone know of a good XML API for Java that can merge the files as I require?

Many thanks in advance for your help.

Edit:

I could really do with some good suggestions on this, so I have added a bounty. I tried jdigital's suggestion but am still having issues with XML Merge.

Below is a sample of the structure of the XML files that I am trying to merge.

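[The two sample XML documents were stripped from this copy. According to the question and the answers, each has a results element under /run/host; the first file contains results 1 and 2, the second contains results 3 and 4.]
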
Expected output

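[The expected output was also stripped: a single document whose results element contains results 1, 2, 3 and 4.]
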
11 Answers

#1


11  

Not very elegant, but you could do this with the DOM parser and XPath:

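import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class MergeXmlDemo {

  public static void main(String[] args) throws Exception {
    // proper error/exception handling omitted for brevity
    File file1 = new File("merge1.xml");
    File file2 = new File("merge2.xml");
    Document doc = merge("/run/host/results", file1, file2);
    print(doc);
  }

  private static Document merge(String expression, File... files) throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    return merge(xpath.compile(expression), files);
  }

  private static Document merge(XPathExpression expression, File... files) throws Exception {
    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
    docBuilderFactory.setIgnoringElementContentWhitespace(true);
    DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
    Document base = docBuilder.parse(files[0]);

    // The node in the first document that receives the merged children.
    Node results = (Node) expression.evaluate(base, XPathConstants.NODE);
    if (results == null) {
      throw new IOException(files[0] + ": expression does not evaluate to node");
    }

    // NOTE: the listing is cut off at this loop in this copy; the loop body
    // and print() below are a reconstruction of the apparent intent.
    for (int i = 1; i < files.length; i++) {
      Document merge = docBuilder.parse(files[i]);
      Node nextResults = (Node) expression.evaluate(merge, XPathConstants.NODE);
      if (nextResults == null) {
        throw new IOException(files[i] + ": expression does not evaluate to node");
      }
      // Copy each child into the base document and append it to the target node.
      for (Node kid = nextResults.getFirstChild(); kid != null; kid = kid.getNextSibling()) {
        results.appendChild(base.importNode(kid, true));
      }
    }
    return base;
  }

  private static void print(Document doc) throws Exception {
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.transform(new DOMSource(doc), new StreamResult(System.out));
  }
}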

This assumes that you can hold at least two of the documents in RAM simultaneously.

#2


6  

I use XSLT to merge XML files. It lets me tune the merge, from simply concatenating the content to merging at a specific level. It is a little more work (and XSLT syntax takes some getting used to), but it is very flexible. You need three things:

a) Include the additional file
b) Copy the original file 1:1
c) Design your merge point, with or without duplicate avoidance

a) At the beginning of the stylesheet I have a declaration whose content is the name of the second file:

yoursecondfile.xml

This makes it possible to refer to the second file as $mDoc.

b) The instructions to copy a source tree 1:1 are two templates:

[stylesheet templates not preserved in this copy]

With nothing else, you get a 1:1 copy of your first source file; this works with any type of XML. The merging part is file-specific. Let's presume you have event elements with an event ID attribute and you do not want duplicate IDs. The template would look like this:

[merge template not preserved in this copy]

Of course, you can compare other things, such as tag names. It is also up to you how deep the merge goes. If you don't have a key to compare, the construct becomes simpler, e.g. for log:

[template not preserved in this copy]

To run XSLT in Java use this:

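    // Requires java.util.Map, javax.xml.transform.* and javax.xml.transform.stream.*.
    // xmlFile, xsltFile and resultFile are java.io.File objects; parameterMap is a
    // Map<String, Object> of stylesheet parameters and may be null.
    Source xmlSource = new StreamSource(xmlFile);
    Source xsltSource = new StreamSource(xsltFile);
    Result xmlResult = new StreamResult(resultFile);
    TransformerFactory transFact = TransformerFactory.newInstance();
    Transformer trans = transFact.newTransformer(xsltSource);
    // Load parameters if we have any
    if (parameterMap != null) {
       for (Map.Entry<String, Object> curParam : parameterMap.entrySet()) {
            trans.setParameter(curParam.getKey(), curParam.getValue());
       }
    }
    trans.transform(xmlSource, xmlResult);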
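
Putting steps a) to c) together, a minimal sketch might look like the following. The stylesheet is an assumption, not the one from this answer: it uses the standard XSLT 1.0 identity template, an mDoc parameter holding the URI of the second file, and a merge point at results elements that pulls in /run/host/results/result from the second document (the path used in the other answers). The class name XsltMergeSketch and the sample documents are hypothetical, and no duplicate avoidance is attempted.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltMergeSketch {

  // Assumed stylesheet: identity copy plus one merge point at <results>.
  static final String XSLT =
      "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:param name='mDoc'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='results'>"
      + "<xsl:copy>"
      + "<xsl:apply-templates select='@*|node()'/>"
      + "<xsl:copy-of select='document($mDoc)/run/host/results/result'/>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

  // Returns the first document with the second document's results appended.
  static String merge(String firstXml, String secondXml) {
    try {
      // document() loads by URI, so the second document goes to a temp file.
      Path second = Files.createTempFile("merge2", ".xml");
      try {
        Files.write(second, secondXml.getBytes(StandardCharsets.UTF_8));
        Transformer trans = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        trans.setParameter("mDoc", second.toUri().toString());
        StringWriter out = new StringWriter();
        trans.transform(new StreamSource(new StringReader(firstXml)), new StreamResult(out));
        return out.toString();
      } finally {
        Files.deleteIfExists(second);
      }
    } catch (Exception e) {
      throw new IllegalStateException("merge failed", e);
    }
  }

  public static void main(String[] args) {
    String first = "<run><host><results><result id='1'/><result id='2'/></results></host></run>";
    String second = "<run><host><results><result id='3'/><result id='4'/></results></host></run>";
    System.out.println(merge(first, second));
  }
}
```

The results template wins over the identity template because a name test has a higher default priority than node(), so everything else in the first file is still copied unchanged.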

Alternatively, download Saxon (an XSLT processor) and run the transformation from the command line (Linux shell example):

#!/bin/bash
notify-send -t 500 -u low -i gtk-dialog-info "Transforming $1 with $2 into $3 ..."
# That's actually the only relevant line below
java -cp saxon9he.jar net.sf.saxon.Transform -t -s:"$1" -xsl:"$2" -o:"$3"
notify-send -t 1000 -u low -i gtk-dialog-info "Extraction into $3 done!"

YMMV

#3


3  

Thanks to everyone for their suggestions. Unfortunately, none of the suggested methods turned out to be suitable in the end, as I needed rules governing how different nodes of the structure were merged.

So what I did was take the DTD for the XML files I was merging and create a number of classes reflecting its structure. I then used XStream to deserialize the XML files into those classes.

I then annotated my classes, so that merging became a matter of combining the rules assigned through annotations with some reflection, merging the objects rather than the actual XML structure.
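
To illustrate the idea of annotation-driven merge rules, here is a rough sketch. It is not the code from the linked archive: the MergeRule annotation, the Rule enum, and the Host class are hypothetical names invented for this example, and the XStream deserialization step is left out so that only the reflective merge is shown.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AnnotationMergeSketch {

  // Hypothetical merge rules; the real project may define different ones.
  enum Rule { KEEP_FIRST, APPEND }

  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface MergeRule {
    Rule value();
  }

  // Hypothetical stand-in for a class derived from the DTD.
  static class Host {
    @MergeRule(Rule.KEEP_FIRST) String status;
    @MergeRule(Rule.APPEND) List<String> results = new ArrayList<>();

    Host(String status, String... results) {
      this.status = status;
      this.results.addAll(Arrays.asList(results));
    }
  }

  // Merges 'other' into 'base' field by field, following each field's rule.
  @SuppressWarnings("unchecked")
  static <T> T merge(T base, T other) {
    try {
      for (Field field : base.getClass().getDeclaredFields()) {
        MergeRule rule = field.getAnnotation(MergeRule.class);
        if (rule == null) {
          continue; // fields without a rule are left untouched
        }
        field.setAccessible(true);
        if (rule.value() == Rule.APPEND) {
          ((List<Object>) field.get(base)).addAll((List<Object>) field.get(other));
        } else if (field.get(base) == null) {
          field.set(base, field.get(other)); // KEEP_FIRST: only fill a missing value
        }
      }
      return base;
    } catch (IllegalAccessException e) {
      throw new IllegalStateException("cannot access field", e);
    }
  }

  public static void main(String[] args) {
    Host merged = merge(new Host("up", "1", "2"), new Host("down", "3", "4"));
    System.out.println(merged.status + " " + merged.results); // prints "up [1, 2, 3, 4]"
  }
}
```

Keeping the rule on the field means a new node type only needs an annotation, not a change to the merge routine.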

If anyone is interested in the code, which in this case merges Nmap XML files, please see http://fluxnetworks.co.uk/NmapXMLMerge.tar.gz. The code is not perfect, and I will admit it is not massively flexible, but it definitely works. I'm planning to reimplement the system so that it parses the DTD automatically when I have some free time.

#4


2  

This is how it should look using XML Merge:

action.default=MERGE

xpath.info=/run/info
action.info=PRESERVE

xpath.result=/run/host/results/result
action.result=MERGE
matcher.result=ID

You have to set the ID matcher for the //result node and the PRESERVE action for the //info node. Also beware that the property names XML Merge reads from .properties files are case-sensitive: you have to use "xpath", not "XPath".

Don't forget to pass the -config parameter, like this:

java -cp lib\xmlmerge-full.jar; ch.elca.el4j.services.xmlmerge.tool.XmlMergeTool -config xmlmerge.properties example1.xml example2.xml 

#6


1  

I took a look at the referenced link; it's odd that XML Merge would not work as expected, since your example seems straightforward. Did you read the section entitled "Using XPath declarations with XmlMerge"? Following that example, try setting up an XPath for results and setting its action to merge. If I'm reading the doc correctly, it would look something like this:

XPath.resultsNode=results
action.resultsNode=MERGE

#7


0  

You might be able to write a Java app that deserializes the XML documents into objects, then "merges" the individual objects programmatically into a collection. You can then serialize the collection object back out to an XML file with everything "merged."

The JAXB API has some tools that can convert an XML document/schema into Java classes. The "xjc" tool might be able to do this, although I can't remember whether you can create classes directly from the XML doc or have to generate a schema first. There are tools out there that can generate a schema from an XML doc.

Hope this helps... not sure if this is what you were looking for.

#8


0  

In addition to using StAX (which does make sense), it would probably be easier with StaxMate (http://staxmate.codehaus.org/Tutorial). Just create two SMInputCursors, plus child cursors if need be, and then do a typical merge-sort-style pass with the two cursors. This is similar to traversing DOM documents in a recursive-descent manner.
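
As a rough illustration of the streaming approach, the sketch below uses the JDK's plain StAX event API (javax.xml.stream) rather than StaxMate, and it simply appends the second document's children instead of doing a sorted two-cursor merge. The class name StaxMergeSketch and the container parameter are hypothetical; the sketch assumes the container element name occurs exactly once in each document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class StaxMergeSketch {

  // Streams 'first' to the output and, just before the container element
  // closes, streams in the children of the same container from 'second'.
  static String merge(String first, String second, String container) {
    try {
      XMLInputFactory inputFactory = XMLInputFactory.newInstance();
      StringWriter out = new StringWriter();
      XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
      XMLEventReader reader = inputFactory.createXMLEventReader(new StringReader(first));
      while (reader.hasNext()) {
        XMLEvent event = reader.nextEvent();
        if (event.isEndElement()
            && event.asEndElement().getName().getLocalPart().equals(container)) {
          XMLEventReader other = inputFactory.createXMLEventReader(new StringReader(second));
          copyChildren(other, writer, container);
          other.close();
        }
        writer.add(event);
      }
      writer.close();
      return out.toString();
    } catch (XMLStreamException e) {
      throw new IllegalStateException("merge failed", e);
    }
  }

  // Writes every event found strictly inside the first 'container' element.
  static void copyChildren(XMLEventReader reader, XMLEventWriter writer, String container)
      throws XMLStreamException {
    int depth = 0; // 0 = outside the container, 1 = directly inside it
    while (reader.hasNext()) {
      XMLEvent event = reader.nextEvent();
      if (depth == 0) {
        if (event.isStartElement()
            && event.asStartElement().getName().getLocalPart().equals(container)) {
          depth = 1;
        }
        continue;
      }
      if (event.isStartElement()) {
        depth++;
      } else if (event.isEndElement() && --depth == 0) {
        return; // reached the container's own end tag
      }
      writer.add(event);
    }
  }

  public static void main(String[] args) {
    String first = "<run><host><results><result id='1'/><result id='2'/></results></host></run>";
    String second = "<run><host><results><result id='3'/><result id='4'/></results></host></run>";
    System.out.println(merge(first, second, "results"));
  }
}
```

Because neither document is held as a tree, memory use stays roughly constant regardless of file size, which is the main reason to prefer this over a DOM merge.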

#9


0  

So you're only interested in merging the 'results' elements, and everything else is ignored? The fact that input0 has an [...], input1 has an [...], and the expected result has an [...] seems to suggest this.

If you're not worried about scaling and you want to solve this problem quickly then I would suggest writing a problem-specific bit of code that uses a simple library like JDOM to consider the inputs and write the output result.

Attempting to write a generic tool 'smart' enough to handle all possible merge cases would be pretty time-consuming, since you would have to expose a configuration capability for defining merge rules. If you know exactly what your data is going to look like and exactly how the merge needs to be executed, then I would imagine your algorithm walking each XML input and writing to a single XML output.
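
A problem-specific merge of this kind could be sketched as follows. This version uses the JDK's built-in DOM API rather than JDOM, and it assumes (based on the other answers, not on the stripped samples) that each file has one results element whose result children carry an id attribute; results whose id is already present are skipped. The class name ResultsMergeSketch is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ResultsMergeSketch {

  // Appends the <result> children of the second document's <results> to the
  // first document's <results>, skipping ids that are already present.
  static Document merge(String first, String second) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document base = builder.parse(new InputSource(new StringReader(first)));
      Document extra = builder.parse(new InputSource(new StringReader(second)));
      Element target = firstResults(base);

      // Remember the ids the base document already contains.
      Set<String> seen = new HashSet<>();
      for (Node n = target.getFirstChild(); n != null; n = n.getNextSibling()) {
        if (isResult(n)) {
          seen.add(((Element) n).getAttribute("id"));
        }
      }
      // Import only results with an id not seen so far.
      for (Node n = firstResults(extra).getFirstChild(); n != null; n = n.getNextSibling()) {
        if (isResult(n) && seen.add(((Element) n).getAttribute("id"))) {
          target.appendChild(base.importNode(n, true));
        }
      }
      return base;
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalStateException("merge failed", e);
    }
  }

  static Element firstResults(Document doc) {
    Element results = (Element) doc.getElementsByTagName("results").item(0);
    if (results == null) {
      throw new IllegalArgumentException("no <results> element");
    }
    return results;
  }

  static boolean isResult(Node n) {
    return n.getNodeType() == Node.ELEMENT_NODE && "result".equals(n.getNodeName());
  }

  public static void main(String[] args) {
    Document merged = merge(
        "<run><host><results><result id='1'/><result id='2'/></results></host></run>",
        "<run><host><results><result id='2'/><result id='3'/></results></host></run>");
    System.out.println(merged.getElementsByTagName("result").getLength()); // prints 3
  }
}
```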

#10


0  

You can try Dom4J, which provides a very good means of extracting information using XPath queries and also lets you write XML very easily. You just need to play around with the API for a while to get the job done.

#11


-6  

Have you considered just not bothering with parsing the XML "properly" and just treating the files as big long strings and using boring old things such as hash maps and regular expressions...? This could be one of those cases where the fancy acronyms with X in them just make the job fiddlier than it needs to be.

Obviously this does depend a bit on how much data you actually need to parse out while doing the merge. But by the sound of things, the answer to that is not much.
