5. Extract data from the corresponding URL
<pre><code class="language-python">import urllib.request

from lxml import etree

# Blog post to scrape
url = 'http://blog.sina.com.cn/s/blog_c3b6050b0102xeks.html'
# Article list page (not used in this step)
url_2 = 'http://blog.sina.com.cn/s/articlelist_3283485963_4_1.html'

# Download the page and decode it as UTF-8
page = urllib.request.urlopen(url)
html = page.read().decode("utf-8")

# Parse the HTML and collect the text of every a/font/span node
selector = etree.HTML(html)
result_content = selector.xpath('//a/font/span/text()')
print(result_content[0])
'''
XPath of the target element:
//*[@id="sina_keyword_ad_area2"]/div/p[7]/a/font/span
'''</code></pre>
<p>Output:</p>
<pre><code>
"D:\Program Files\Anaconda3\python.exe" D:/Android/Python/PycharmProjects/spyder_project/pdf_get_url.py
https://cangshuzhe.ctfile.com/fs/3990681-242718587
Process finished with exit code 0
</code></pre>
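<p>The expression <code>//a/font/span/text()</code> matches every such link on the page, so <code>result_content[0]</code> is simply the first one found, and indexing fails if nothing matches. A minimal sketch of a narrower query follows, assuming the target link sits inside the element with id <code>sina_keyword_ad_area2</code> as the XPath note suggests. The sample markup, the <code>example.com</code> URL, and the helper name <code>extract_link_texts</code> are hypothetical and stand in for the real page so that the snippet runs without network access.</p>

```python
from lxml import etree

# Hypothetical sample markup that mimics the structure named in the XPath note.
SAMPLE_HTML = """
<html><body>
  <a href="/other"><font><span>unrelated link</span></font></a>
  <div id="sina_keyword_ad_area2">
    <div>
      <p><a href="https://example.com/file"><font><span>https://example.com/file</span></font></a></p>
    </div>
  </div>
</body></html>
"""


def extract_link_texts(html, container_id="sina_keyword_ad_area2"):
    """Return the text of every a/font/span node inside the given container."""
    selector = etree.HTML(html)
    # $cid is an XPath variable; lxml fills it from the keyword argument.
    return selector.xpath('//*[@id=$cid]//a/font/span/text()', cid=container_id)


texts = extract_link_texts(SAMPLE_HTML)
# Guard against an empty result instead of indexing blindly.
print(texts[0] if texts else "no match")
```

<p>Scoping the query to the container skips links elsewhere on the page, and checking for an empty list avoids an <code>IndexError</code> if the page layout changes.</p>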