Parsing Large XML Files with DOM4J

Posted anonymously, 2021-6-1 16:17
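<p style="color:rgb(51,51,51); font-family:Arial; font-size:14px; line-height:26px"> dom4j itself offers two ways to parse XML: DOM parsing and SAX parsing. The trade-offs between the two are not repeated here. The one point worth stressing is that more and more applications run into large-data scenarios, and SAX parsing suits them well because it streams through the document instead of loading all of it into memory. "Parsing large data with DOM4J" therefore comes down to "how to parse large data in SAX mode" (the SAX parser in JAXP follows the same approach). This article summarizes an implementation, drawn from real work, of parsing large XML files with DOM4J.</p>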
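<p style="color:rgb(51,51,51); font-family:Arial; font-size:14px; line-height:26px"> To make the streaming idea concrete, here is a minimal sketch. It uses the JDK's JAXP SAX parser, which the paragraph above describes as following the same approach, rather than DOM4J itself. The class name <code>UrlRecordStreamer</code> and the callback are hypothetical, and the element names <code>urlset</code> and <code>url</code> are taken from the sample structure in this article. The child elements of <code>url</code> are not assumed: whatever leaf elements appear are collected by tag name.</p>

```java
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: streams <url> records out of a <urlset> document. */
class UrlRecordStreamer {

    /**
     * Parses the stream with SAX and hands each completed <url> record to the
     * callback. Only the record currently being read is held by the parser
     * code, so memory use does not grow with file size (as long as the
     * callback itself does not keep every record).
     */
    static void stream(InputStream xml, Consumer<Map<String, String>> onRecord) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(xml, new DefaultHandler() {
            // Non-null only while the parser is inside a <url> element.
            private Map<String, String> current;
            // Accumulates character data; SAX may deliver text in several chunks.
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("url".equals(qName)) {
                    current = new LinkedHashMap<>();
                }
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("url".equals(qName)) {
                    // Record finished: pass it on and drop our reference.
                    onRecord.accept(current);
                    current = null;
                } else if (current != null) {
                    // Keep only elements that carry text, keyed by tag name.
                    String value = text.toString().trim();
                    if (!value.isEmpty()) {
                        current.put(qName, value);
                    }
                }
                text.setLength(0);
            }
        });
    }
}
```

<p style="color:rgb(51,51,51); font-family:Arial; font-size:14px; line-height:26px"> A caller would pass a <code>FileInputStream</code> for the large file together with a callback that writes each record to its destination. Nested elements with the same tag name overwrite each other in this flat map, so a real feed may need a richer record type.</p>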
<p style="color:rgb(51,51,51); font-family:Arial; font-size:14px; line-height:26px"> The XML file structure, using Dida group-buying (滴答团购) data as an example:</p>
<p style="color:rgb(51,51,51); font-family:Arial; font-size:14px; line-height:26px"> </p>
<div class="dp-highlighter bg_html" style="font-family:Consolas,&#39;Courier New&#39;,Courier,mono,serif; background-color:rgb(231,229,220); width:700.90625px; overflow:auto; padding-top:1px; color:rgb(51,51,51); line-height:26px; margin:18px 0px!important">
<ol class="dp-xml" start="1" style="padding:0px; border:none; list-style-position:initial; background-color:rgb(255,255,255); color:rgb(92,92,92); margin:0px 0px 1px 45px!important">
<li class="alt" style="border-style:none none none solid; border-left-width:3px; border-left-color:rgb(108,226,108); list-style:decimal-leading-zero outside; color:inherit; line-height:18px; margin:0px!important; padding:0px 3px 0px 10px!important"> <span style="margin:0px; padding:0px; border:none; color:black; background-color:inherit"><span class="tag" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold">&lt;</span><span class="tag-name" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold">urlset</span><span class="tag" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold">&gt;</span><span style="margin:0px; padding:0px; border:none; background-color:inherit">  </span></span></li>
<li style="border-style:none none none solid; border-left-width:3px; border-left-color:rgb(108,226,108); list-style:decimal-leading-zero outside; background-color:rgb(248,248,248); line-height:18px; margin:0px!important; padding:0px 3px 0px 10px!important"> <span style="margin:0px; padding:0px; border:none; color:black; background-color:inherit"><span class="tag" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold">&lt;</span><span class="tag-name" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold">url</span><span class="tag" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold">&gt;</span><span style="margin:0px; padding:0px; border:none; background-color:inherit">  </span></span></li>
</ol>
</div>