Empty list returned from ElementTree findall

Posted by 下个路口见的 on 2023-02-12 12:45

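I'm new to XML parsing and to Python, so please bear with me. I'm using lxml to parse a wiki dump, but all I want from each page is its title and text.

This is what I have so far: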

from xml.etree import ElementTree as etree

def parser(file_name):
    # Parse the whole dump, then search for every <title> element.
    document = etree.parse(file_name)
    titles = document.findall('.//title')
    print(titles)

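At the moment, titles comes back empty. I've looked at earlier answers such as "ElementTree findall() returning empty list" and at the lxml documentation, but most of what I found seems geared toward parsing HTML.

Here is part of my XML: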


  
  Wikipedia
http://en.wikipedia.org/wiki/Main_Page
MediaWiki 1.20wmf9
first-letter

  Media
  Special
  
  Talk
  User
  User talk
  Wikipedia
  Wikipedia talk
  File
  File talk
  MediaWiki
  MediaWiki talk
  Template
  Template talk
  Help
  Help talk
  Category
  Category talk
  Portal
  Portal talk
  Book
  Book talk

  
  
    Aratrum
    0
    65741
    
  349931990
  225434394
  2010-03-15T02:55:02Z
  
    143.105.193.119
  
  /* Sources */
  2zkdnl9nsd1fbopv0fpwu2j5gdf0haw
  '''Aratrum''' is the Latin word for  [[plough]], and "arotron" (???????) is the [[Greek language|Greek]] word. The   [[Ancient Greece|Greeks]] appear to have had diverse kinds of plough from the earliest  historical records. [[Hesiod]] advised the farmer to have always two ploughs, so that if  one broke the other might be ready for use. These ploughs should be of two kinds, the one  called "autoguos" (????????, "self-limbed"), in which the plough-tail  was of the same piece of timber as the share-beam and the pole; and the other called  "pekton" (??????, "fixed"), because in it, three parts, which were of  three kinds of timber, were adjusted to one another, and fastened together by nails.

The ''autoguos'' plough was made from a [[sapling]] with two branches growing from its   trunk in opposite directions. In ploughing, the trunk served as the pole, one of the two     branches stood upwards and became the tail, and the other penetrated the ground and,    sometimes shod with bronze or iron, acted as the [[ploughshare]]. 

==Sources==
Based on an article from ''A Dictionary of Greek and Roman Antiquities,'' John Murray,     London, 1875.
???????

==External links==
*[http://penelope.uchicago.edu/Thayer/E/Roman/Texts/secondary/SMIGRA*/Aratrum.html Smith's     Dictionary article], with diagrams, further details, sources.
[[Category:Agricultural machinery]]
[[Category:Ancient Greece]]
[[Category:Animal equipment]]
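
One possible cause of the empty list, offered as a sketch rather than a confirmed diagnosis: MediaWiki dumps normally declare a default XML namespace on the root element, and ElementTree then stores every tag as '{uri}title', so a plain './/title' path matches nothing. The example below assumes that is what is happening here. The sample document, its namespace URI, and the helper page_titles_and_text are hypothetical stand-ins, not taken from the dump above.

```python
from xml.etree import ElementTree as etree

# Hypothetical miniature dump; the namespace URI is a placeholder, not the real one.
SAMPLE = """<mediawiki xmlns="http://example.org/export/">
  <page>
    <title>Aratrum</title>
    <revision>
      <text>'''Aratrum''' is the Latin word for [[plough]].</text>
    </revision>
  </page>
</mediawiki>"""

def page_titles_and_text(root):
    """Return (title, text) pairs, honoring the root's default namespace if any."""
    # A namespaced tag looks like '{uri}mediawiki'; keep the '{uri}' part as a prefix.
    prefix = root.tag[:root.tag.index('}') + 1] if root.tag.startswith('{') else ''
    pages = []
    for page in root.iter(prefix + 'page'):
        title = page.findtext(prefix + 'title')        # direct child of <page>
        text = page.findtext('.//' + prefix + 'text')  # nested under <revision>
        pages.append((title, text))
    return pages

root = etree.fromstring(SAMPLE)
unqualified = root.findall('.//title')  # path without the namespace: finds nothing
pages = page_titles_and_text(root)
print(unqualified)
print(pages)
```

Reading the prefix from root.tag avoids hard-coding the URI, which differs between export versions. With a real file, the same helper could be called as page_titles_and_text(etree.parse(file_name).getroot()).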


I also tried iterparse, printing the tag of each element it found:

for e in etree.iterparse(file_name):
    print(e.tag)

But it complains that there is no tag attribute.
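
That complaint is consistent with how iterparse behaves: it yields (event, element) tuples rather than bare elements, so e in the loop above is a tuple and has no tag. A minimal sketch of the unpacking follows. The in-memory sample and the helper name iter_tags are hypothetical, and the sample has no namespace, so its tags come out unqualified.

```python
import io
from xml.etree import ElementTree as etree

# Hypothetical in-memory stand-in for the dump file.
SAMPLE = b"<mediawiki><page><title>Aratrum</title></page></mediawiki>"

def iter_tags(source):
    """Yield each element's tag once iterparse has finished reading it."""
    # iterparse yields (event, element) pairs; by default only 'end' events.
    for event, elem in etree.iterparse(source):
        yield elem.tag
        # Drop the element's children and text so memory stays flat on large dumps.
        elem.clear()

tags = list(iter_tags(io.BytesIO(SAMPLE)))
print(tags)
```

Because only 'end' events are reported by default, inner elements arrive before their parents. The same function accepts a file name in place of the BytesIO object.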

