python


5. Extract the data from the corresponding URL

<pre><code class="language-python">import urllib.request

from lxml import etree

# Blog post to scrape, and the article-list page it belongs to (url_2 is not used below)
url = 'http://blog.sina.com.cn/s/blog_c3b6050b0102xeks.html'
url_2 = 'http://blog.sina.com.cn/s/articlelist_3283485963_4_1.html'

# Download the page and decode it as UTF-8
page = urllib.request.urlopen(url)
html = page.read().decode("utf-8")

# Parse the HTML and collect the text of every span nested under a/font
selector = etree.HTML(html)
result_content = selector.xpath('//a/font/span/text()')
print(result_content[0])

# XPath of the target element:
# //*[@id="sina_keyword_ad_area2"]/div/p[7]/a/font/span
</code></pre>
<p>Output</p>
<pre><code>"D:\Program Files\Anaconda3\python.exe" D:/Android/Python/PycharmProjects/spyder_project/pdf_get_url.py
https://cangshuzhe.ctfile.com/fs/3990681-242718587

Process finished with exit code 0
</code></pre>
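The script above indexes `result_content[0]` directly, which raises an `IndexError` whenever the XPath matches nothing (for example, if the page layout changes). A minimal sketch of a more defensive variant follows. The helper name `first_text`, the `SAMPLE_HTML` snippet, and the `https://example.com/download` link are hypothetical, made up only so the example runs without network access; they are not taken from the real blog page.

```python
from lxml import etree


def first_text(html, xpath):
    """Return the first text node matched by xpath, or None if nothing matches.

    Hypothetical helper, not part of lxml.
    """
    selector = etree.HTML(html)
    matches = selector.xpath(xpath)
    return matches[0].strip() if matches else None


# Hypothetical stand-in for the downloaded page; it only mimics the
# div/p/a/font/span nesting described by the XPath in the post.
SAMPLE_HTML = """
<div id="sina_keyword_ad_area2">
  <div>
    <p>intro</p>
    <p><a href="#"><font><span>https://example.com/download</span></font></a></p>
  </div>
</div>
"""

# Same relative XPath as in the post
link = first_text(SAMPLE_HTML, '//a/font/span/text()')

# Scoped to the container id, which avoids matching links elsewhere on the page
scoped = first_text(
    SAMPLE_HTML, '//*[@id="sina_keyword_ad_area2"]//a/font/span/text()'
)

# An XPath that matches nothing returns None instead of raising
missing = first_text(SAMPLE_HTML, '//table/tr/td/text()')

print(link)
print(scoped)
print(missing)
```

With a real page, `html` from `page.read().decode("utf-8")` could be passed in place of `SAMPLE_HTML`, and the caller can test the result for `None` before using it.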

Page list
