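<p style="color:rgb(51,51,51); font-family:Arial; font-size:14px; line-height:26px"> dom4j provides two ways to parse XML: DOM parsing and SAX parsing. Their respective trade-offs are not repeated here. The one point worth stressing is that more and more applications have to deal with very large inputs, and SAX parsing suits that case well because it processes the document as a stream of events instead of building the whole tree in memory. So "parsing large data with DOM4J" really means "parsing large data the SAX way" (SAX parsing in JAXP follows the same approach). This article summarizes an implementation, taken from real project work, of parsing large XML files with DOM4J.</p>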
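<p style="color:rgb(51,51,51); font-family:Arial; font-size:14px; line-height:26px"> To make the event-driven idea concrete, here is a minimal sketch that uses the JAXP SAX parser from the JDK rather than dom4j, so that it runs without extra dependencies. It is not the implementation from this article. The class name <code>UrlSetSaxSketch</code>, the method <code>forEachUrl</code>, and the <code>loc</code> child element in the sample input are assumptions made for illustration; only <code>urlset</code> and <code>url</code> come from the data shown in this article. Each <code>url</code> record is handed to a callback and then dropped, so memory use stays bounded by one record. dom4j offers the same pattern through <code>SAXReader.addHandler</code> with an <code>ElementHandler</code> that detaches each element once it has been processed.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams <url> records out of a <urlset> document.
class UrlSetSaxSketch {

    /**
     * Parses the stream with SAX and calls onRecord once per <url> element.
     * The map holds leaf-element name -> trimmed text for that record only.
     */
    static void forEachUrl(InputStream in, Consumer<Map<String, String>> onRecord) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            private Map<String, String> record;   // non-null only while inside a <url>
            private final StringBuilder text = new StringBuilder();
            private int depth;                    // nesting depth below the current <url>
            private boolean leaf;                 // true until the current element opens a child

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (record == null) {
                    if ("url".equals(qName)) {
                        record = new LinkedHashMap<>();
                        depth = 0;
                    }
                    return;
                }
                depth++;
                leaf = true;
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node, so append.
                if (record != null && depth > 0) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (record == null) {
                    return;
                }
                if (depth == 0) {
                    // Closing </url>: hand the record over, then forget it.
                    onRecord.accept(record);
                    record = null;
                    return;
                }
                if (leaf) {
                    record.put(qName, text.toString().trim());
                }
                leaf = false;
                depth--;
            }
        });
    }

    public static void main(String[] args) throws Exception {
        // Tiny sample input; the <loc> child is an assumed field name.
        String xml = "<urlset><url><loc>http://example.com/a</loc></url>"
                   + "<url><loc>http://example.com/b</loc></url></urlset>";
        forEachUrl(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                   record -> System.out.println(record.get("loc")));
    }
}
```

<p style="color:rgb(51,51,51); font-family:Arial; font-size:14px; line-height:26px"> For a real file, a <code>FileInputStream</code> would be passed in place of the in-memory sample. Leaf elements with the same name inside one record overwrite each other in this sketch, which is acceptable for a demonstration but would need handling in production code.</p>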
<p style="color:rgb(51,51,51); font-family:Arial; font-size:14px; line-height:26px"> The XML file structure is illustrated with group-buying data from 滴答团 (Dida Tuan):</p>
<div class="dp-highlighter bg_html" style="font-family:Consolas,'Courier New',Courier,mono,serif; background-color:rgb(231,229,220); width:700.90625px; overflow:auto; padding-top:1px; color:rgb(51,51,51); line-height:26px; margin:18px 0px!important">
<ol class="dp-xml" start="1" style="padding:0px; border:none; list-style-position:initial; background-color:rgb(255,255,255); color:rgb(92,92,92); margin:0px 0px 1px 45px!important"><li class="alt" style="border-style:none none none solid; border-left-width:3px; border-left-color:rgb(108,226,108); list-style:decimal-leading-zero outside; color:inherit; line-height:18px; margin:0px!important; padding:0px 3px 0px 10px!important"> <span style="margin:0px; padding:0px; border:none; color:black; background-color:inherit"><span class="tag" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold"><</span><span class="tag-name" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold">urlset</span><span class="tag" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold">></span><span style="margin:0px; padding:0px; border:none; background-color:inherit"> </span></span></li><li style="border-style:none none none solid; border-left-width:3px; border-left-color:rgb(108,226,108); list-style:decimal-leading-zero outside; background-color:rgb(248,248,248); line-height:18px; margin:0px!important; padding:0px 3px 0px 10px!important"> <span style="margin:0px; padding:0px; border:none; color:black; background-color:inherit"><span class="tag" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold"><</span><span class="tag-name" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold">url</span><span class="tag" style="margin:0px; padding:0px; border:none; color:rgb(153,51,0); background-color:inherit; font-weight:bold">></span><span style="margin:0px; padding:0px; border:none; background-color:inherit"> </span></span></li><li class="alt" style="border-style:none none none solid; border-left-width:3px; 
border-left-color:rgb(108,226,108); list-style:decimal-leading-zero outside; color:inherit; line-height:18px; margin:0px!imp |
|