I have an element that I got from
target="_blank">
5548U
I am trying to extract "5548U Power La Vaca(8025K)Linux 4.2.x.x" from this element (including the spaces).
import lxml.etree as ET
td_html = """
"""
td_elem = ET.fromstring(td_html)
fail_1 = td_elem.find('a').text + td_elem.text
print "FAIL_1", fail_1
print "FAIL_2"
for elem in td_elem.iterchildren():
    print elem.tag, elem.text
Result
$ python textxml.py
FAIL_1
FAIL_2
a
br None
$
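Question
It is embarrassing to have to ask this, because it seems like it should not be hard.
How do I extract "Power La Vaca(8025K)Linux 4.2.x.x" from the td_elem element (including the spaces)?
Please, no regex solutions.
Solution
Explicit solution (using Finn's itertext() suggestion):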
print "SUCCESS", ' '.join([txt.strip() for txt in td_elem.itertext()])
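For anyone wondering why the two failed attempts print nothing useful, here is a minimal sketch of what is probably going on. In the ElementTree model, text that follows a closing tag is stored in that child's .tail, not in the parent's .text. The markup in SAMPLE_TD is hypothetical (the original td_html is not shown above), the helper name element_text is made up for illustration, and the standard-library xml.etree.ElementTree stands in for lxml.etree so the snippet runs without lxml; both expose .text, .tail and itertext(). It is written for Python 3.

```python
# Standard-library ElementTree is used so the sketch runs without lxml;
# lxml.etree offers the same .text, .tail and itertext() members.
import xml.etree.ElementTree as ET

# Hypothetical markup: the real td_html is not shown in the question.
SAMPLE_TD = (
    '<td>\n'
    '  <a href="#" target="_blank">5548U</a>\n'
    '  Power La Vaca(8025K)Linux 4.2.x.x\n'
    '  <br/>\n'
    '</td>'
)


def element_text(elem):
    """Join every non-empty text fragment under elem with single spaces."""
    parts = (fragment.strip() for fragment in elem.itertext())
    return ' '.join(part for part in parts if part)


td_elem = ET.fromstring(SAMPLE_TD)
link = td_elem.find('a')

# The text after </a> belongs to the <a> element's tail, not to td_elem.text,
# which here holds only the whitespace before <a>.
tail_only = link.tail.strip()

# itertext() walks .text and .tail of every descendant in document order.
full_text = element_text(td_elem)

print(tail_only)
print(full_text)
```

Unlike the one-liner above, this version drops whitespace-only fragments before joining, so the result has no leading or doubled spaces. If only the part after the link is wanted, link.tail on its own is enough.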