Question

How do I find specific video HTML tags using Beautiful Soup?

西格咒_779 posted on 2023-02-13 08:21

python

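Does anyone know how to use BeautifulSoup in Python?

I have a search engine with a list of different URLs.

I only want to get the HTML tags that contain a video embed URL, and then extract the link.

Example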

import BeautifulSoup

html = '''https://archive.org/details/20070519_detroit2'''
    #or this.. html = '''http://www.kumby.com/avatar-the-last-airbender-book-3-chapter-5/'''
    #or this... html = '''https://www.youtube.com/watch?v=fI3zBtE_S_k'''

soup = BeautifulSoup.BeautifulSoup(html)

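What should I do next to get the HTML tag for the video or object, or the exact link to the video?

I need to put it in my iframe. I am integrating Python into my PHP, so the plan is to get the video's link, output it with Python, and then echo it into my iframe.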

1 Answer

You need to fetch the page's HTML, not just the URL.

Use the built-in urllib library like this:
```
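from urllib.request import urlopen  # Python 3; in Python 2 this was urllib.urlopen
from bs4 import BeautifulSoup as BS

url = '''https://archive.org/details/20070519_detroit2'''
# open and read the page
page = urlopen(url)
html = page.read()
# create a BeautifulSoup parse-able "soup"; name the parser explicitly
soup = BS(html, "html.parser")
# get the src attribute from the video tag
# (soup.find returns None if the page has no <video> tag)
video = soup.find("video").get("src")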
```
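The snippet above raises an AttributeError when the page has no video tag, and some pages put the URL on a nested source tag instead of on the video tag itself. A minimal sketch that covers both cases, run against an inline HTML string rather than a live page. The function name find_video_src and the sample markup are assumptions made for illustration, not something taken from the sites in the question:

```python
from bs4 import BeautifulSoup


def find_video_src(html):
    """Return the first URL found on a <video> tag or its <source> child, or None."""
    soup = BeautifulSoup(html, "html.parser")
    video = soup.find("video")
    if video is None:
        # No <video> tag on the page at all
        return None
    # The URL may sit on the <video> tag itself ...
    if video.get("src"):
        return video["src"]
    # ... or on a nested <source> tag
    source = video.find("source", src=True)
    return source["src"] if source else None


# Hypothetical markup, used only to exercise the function
sample = '<video controls><source src="movie.mp4" type="video/mp4"></video>'
print(find_video_src(sample))
print(find_video_src("<p>no video here</p>"))
```

Returning None instead of raising lets the caller decide what to do with pages that have no video, which is likely to matter when looping over a list of search results.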
Also, on the site you are working with, I noticed that to get the embed link you only need to change details in the URL to embed, so it looks like this:
```
https://archive.org/embed/20070519_detroit2
```
So if you want to do this for multiple URLs without parsing each page, just do the following:
```
url = '''https://archive.org/details/20070519_detroit2'''
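spl = url.split('/')  # ['https:', '', 'archive.org', 'details', '20070519_detroit2']
spl[3] = 'embed'      # index 3 is the 'details' path segment
embed = "/".join(spl)
print(embed)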
```
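For a whole list of URLs, the same rewrite can be wrapped in a function that only touches URLs whose first path segment is details and leaves everything else alone. This is a sketch using the standard library's urllib.parse (Python 3). The name to_embed_url and the example.com URL are hypothetical, and the details-to-embed pattern is only the one observed above for archive.org:

```python
from urllib.parse import urlsplit, urlunsplit


def to_embed_url(details_url):
    """Rewrite a /details/<id> URL to /embed/<id>; return other URLs unchanged."""
    parts = urlsplit(details_url)
    segments = parts.path.split("/")  # e.g. ['', 'details', '<id>']
    if len(segments) > 2 and segments[1] == "details":
        segments[1] = "embed"
    return urlunsplit(parts._replace(path="/".join(segments)))


urls = [
    "https://archive.org/details/20070519_detroit2",
    "https://example.com/watch/123",  # hypothetical non-matching URL, left as-is
]
for url in urls:
    print(to_embed_url(url))
```

Checking the segment by name rather than by list index avoids rewriting a URL that happens to have something else in that position.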
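Edit

To get the embed links for the other URLs you added in your edit, look through the HTML of the page you are parsing until you find the link, then note which tag it is in and which attribute holds it.

For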
```
'''http://www.kumby.com/avatar-the-last-airbender-book-3-chapter-5/'''
```
just do
```
soup.find("iframe").get("src")
```
Use iframe because the link is in the iframe tag, and .get("src") because the link is stored in the src attribute.
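Since different sites put the link in different tags, one option is to check several tag and attribute pairs in turn and collect whatever is present. The sketch below does that on an inline HTML string. The function name find_embed_links, the CANDIDATES list, and the sample markup are assumptions made for illustration, and the list is not exhaustive:

```python
from bs4 import BeautifulSoup

# Tag names to check and the attribute that usually carries the URL.
# This list is an assumption for illustration, not an exhaustive rule.
CANDIDATES = [("iframe", "src"), ("video", "src"), ("embed", "src"), ("object", "data")]


def find_embed_links(html):
    """Collect (tag name, URL) pairs from common embed tags, in CANDIDATES order."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for tag_name, attr in CANDIDATES:
        for tag in soup.find_all(tag_name):
            value = tag.get(attr)
            if value:  # skip tags that lack the attribute
                links.append((tag_name, value))
    return links


# Hypothetical markup, used only to exercise the function
sample = '''
<div><iframe src="https://example.com/player/1"></iframe></div>
<object data="clip.swf"></object>
<iframe></iframe>
'''
for tag_name, link in find_embed_links(sample):
    print(tag_name, link)
```

You would still need to inspect each site's HTML to confirm which of the returned links is the actual video, since pages often contain iframes for ads or widgets as well.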

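You can try the next one yourself, because you should learn how to do this if you want to be able to do it in the future :)

Good luck!
Answered 2023-02-13 08:27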

单莼de笑脸
