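In this post we write a program that scrapes the songs listed on the music chart at https://www.onenzb.com/. If it helps you, buddy, feel free to bookmark and follow! Without further ado, here is the code.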
1 Required packages
import requests
from lxml import html,etree
from bs4 import BeautifulSoup
import re
import pandas as pd
2 Get the song URLs and song names
url = 'https://www.onenzb.com/'
header = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36'
}
response = requests.get(url=url, headers=header)
soup = BeautifulSoup(response.text, 'html.parser')
print(soup)
url_list = []
song_name = []
for link in soup.find_all('a', href=lambda x: x and x.startswith('/music/')):
    # Extract the href and title attributes
    href = link.get('href')
    title = link.get('title')
    url_ = 'https://www.1nzb.com' + href  # full URL
    url_list.append(str(url_))
    song_name.append(str(title))
# Strip substrings that should not end up in the saved file names
song_name = [name.replace('/', '').replace('CV', '').replace('砂狼白子:安雪璃早濑优香:小敢', '') for name in song_name]
print(song_name)
print(url_list)
3 Parse each song's URL and attach the song name
for url, name in dict(zip(url_list, song_name)).items():
    print(url, name)
    print(name)
    header = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36'}
    response = requests.get(url=url, headers=header)
    soup = BeautifulSoup(response.text, 'html.parser')  # 'html.parser' or 'lxml'
    mp3_links = [a['href'] for a in soup.find_all('a', href=True) if a['href'].endswith('.mp3')]
    # Print each MP3 URL that was found and download it
    for mp3_url in mp3_links:
        print(mp3_url)
        # File name to save the song under
        filename = 'E:/学习/项目/歌曲爬虫/歌曲2/{}.mp3'.format(name)
        # Send the GET request
        response = requests.get(mp3_url, stream=True)
        # Make sure the request succeeded
        response.raise_for_status()
        # Write the file in chunks
        with open(filename, 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        print('MP3 file downloaded and saved as:', filename)
Partial results
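The name cleanup and URL assembly from step 2 could also be done with a couple of standard-library helpers. This is only a rough sketch, not part of the original script: the function names `safe_filename` and `absolute_url` are made up for illustration, the sample path `/music/123.html` is hypothetical, and the set of characters removed is an assumption based on what Windows rejects in file names.

```python
import re
from urllib.parse import urljoin

# Characters that Windows does not allow in file names (assumed target OS,
# since the script saves under an E:/ path).
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')


def safe_filename(title: str) -> str:
    """Return the title with characters that are invalid in file names removed."""
    cleaned = _INVALID_CHARS.sub('', title).strip()
    # Fall back to a placeholder so the file name is never empty.
    return cleaned or 'untitled'


def absolute_url(base: str, href: str) -> str:
    """Resolve a relative href against the base URL."""
    return urljoin(base, href)


# Hypothetical sample values, only to show how the helpers are called.
print(safe_filename('Song A / Song B: live?'))
print(absolute_url('https://www.1nzb.com', '/music/123.html'))
```

Compared with chaining `.replace()` calls, a single regular expression covers every problematic character at once, and `urljoin` avoids double or missing slashes when the base URL and the href are concatenated.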