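Open the Shenzhen Stock Exchange (SZSE) public REITs prospectus page, press F12, and watch the Network tab to find the real request URL: https://reits.szse.cn/api/disc/announcement/annList?random=0.3555675437003616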
![](http://image.uc.cn/s/wemedia/s/upload/2024/a14c4a8a5a7f5d65fb803961c8be6b02.jpg)
{
  "announceCount": 39,
  "data": [
    {
      "id": "80bc99a7-8a04-4803-b42a-d9cca1e6c5d5",
      "annId": 1220300147,
      "title": "華夏華潤商業REIT:華夏華潤商業資産封閉式基礎設施證券投資基金招募說明書更新",
      "content": null,
      "publishTime": "2024-06-08 00:00:00",
      "attachPath": "/disc/disk03/finalpage/2024-06-08/a77d6a34-c4eb-4dcf-9b16-7c2ce856ebdd.PDF",
      "attachFormat": "PDF",
      "attachSize": 6265,
      "secCode": [
        "180601"
      ],
      "secName": [
        "華夏華潤商業REIT"
      ],
      "bondType": null,
      "bigIndustryCode": null,
      "bigCategoryId": null,
      "smallCategoryId": null,
      "channelCode": null,
      "_index": "ows_disclosure-20180825"
    },
The response is JSON, and the PDF path is in the "attachPath" field: "/disc/disk03/finalpage/2024-06-08/a77d6a34-c4eb-4dcf-9b16-7c2ce856ebdd.PDF"
![](http://image.uc.cn/s/wemedia/s/upload/2024/52db2deba09546b21b324bbc43559415.jpg)
Open the download page and check its URL: https://disc.static.szse.cn/disc/disk03/finalpage/2024-06-08/a77d6a34-c4eb-4dcf-9b16-7c2ce856ebdd.PDF
So the prefix to prepend to the "attachPath" value is "https://disc.static.szse.cn".
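
To make the mapping from "attachPath" to a download link concrete, here is a minimal sketch that works on a trimmed copy of the response shown above. The names `PDF_HOST`, `build_pdf_url`, and `list_pdfs` are made up for illustration, and the sample keeps only the two fields used here; it makes no network request.

```python
import json

# Host observed on the download page in this article.
PDF_HOST = "https://disc.static.szse.cn"

# Trimmed copy of the response shown above (only the fields used here).
SAMPLE_RESPONSE = """
{
  "announceCount": 39,
  "data": [
    {
      "title": "華夏華潤商業REIT:華夏華潤商業資産封閉式基礎設施證券投資基金招募說明書更新",
      "attachPath": "/disc/disk03/finalpage/2024-06-08/a77d6a34-c4eb-4dcf-9b16-7c2ce856ebdd.PDF"
    }
  ]
}
"""


def build_pdf_url(attach_path: str) -> str:
    """Join the static host and the relative attachPath with exactly one slash."""
    return PDF_HOST + "/" + attach_path.lstrip("/")


def list_pdfs(response_text: str) -> list[tuple[str, str]]:
    """Return (title, full_url) pairs for every entry that has an attachPath."""
    body = json.loads(response_text)
    pairs = []
    for item in body.get("data") or []:
        path = item.get("attachPath")
        if path:
            title = item.get("title") or "unknown_title"
            pairs.append((title, build_pdf_url(path)))
    return pairs


if __name__ == "__main__":
    for title, url in list_pdfs(SAMPLE_RESPONSE):
        print(title)
        print(url)
```

Stripping the leading slash before joining means the helper gives the same result whether or not the path starts with "/".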
Enter the following prompt in DeepSeek:
You are a Python programming expert. Write a Python script that follows these steps:
Request URL:
https://reits.szse.cn/api/disc/announcement/annList?random=0.3555675437003616
Request Method:
POST
Status Code:
200 OK
Remote Address:
58.251.50.138:443
Referrer Policy:
strict-origin-when-cross-origin
Request Payload:
{"seDate":["",""],"channelCode":["reits-xxpl"],"bigCategoryId":["directions"],"pageSize":50,"pageNum":1}
Request Headers:
Accept:
application/json, text/javascript, */*; q=0.01
Accept-Encoding:
gzip, deflate, br, zstd
Accept-Language:
zh-CN,zh;q=0.9,en;q=0.8
Connection:
keep-alive
Content-Length:
104
Content-Type:
application/json
Host:
reits.szse.cn
Origin:
https://reits.szse.cn
Referer:
https://reits.szse.cn/disclosure/index.html
Sec-Ch-Ua:
"Google Chrome";v="125", "Chromium";v="125", "Not.A/Brand";v="24"
Sec-Ch-Ua-Mobile:
?0
Sec-Ch-Ua-Platform:
"Windows"
Sec-Fetch-Dest:
empty
Sec-Fetch-Mode:
cors
Sec-Fetch-Site:
same-origin
User-Agent:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36
X-Request-Type:
ajax
X-Requested-With:
XMLHttpRequest
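Get the response returned by the page; it is nested JSON data;
Locate the value of the "title" key under the "data" key; this is the title of the PDF file;
Locate the value of the "attachPath" key under the "data" key; this is the PDF file path. Prepend "https://disc.static.szse.cn" to form the complete PDF download URL;
Download the PDF file and save it to the folder: F:\AI自媒體內容\AI炒股\REITs
Note: print progress information at every step;
PDF titles may contain special characters that Windows does not allow in file names; handle them before naming the PDF file;
After each PDF download, pause for a random 3 to 6 seconds;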
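
The file-name requirement in the prompt is worth spelling out, because Windows rejects more than a few punctuation marks: it also refuses ASCII control characters, names ending in a dot or space, and reserved device names such as CON or NUL. The sketch below is one way to handle these cases. The function name `safe_filename` and the `max_len` limit of 150 characters are assumptions chosen for illustration, not part of the original prompt.

```python
import re

# Characters Windows forbids in file names, plus ASCII control characters.
_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Device names Windows reserves, with or without an extension.
_RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def safe_filename(title: str, max_len: int = 150) -> str:
    """Turn an announcement title into a name Windows will accept."""
    # Replace every forbidden character with an underscore.
    name = _ILLEGAL.sub("_", title).strip()
    # Truncate (max_len is an arbitrary illustrative limit), then drop
    # trailing dots and spaces, which Windows does not allow.
    name = name[:max_len].rstrip(" .")
    # Prefix empty or reserved names so they become ordinary file names.
    if not name or name.split(".")[0].upper() in _RESERVED:
        name = "_" + name
    return name


if __name__ == "__main__":
    print(safe_filename('REIT: prospectus "update" 2024/06?'))
```

The full-width colon that appears in the sample title is a legal character on Windows, so it passes through unchanged; only the ASCII colon is replaced.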
Source code:
import os
import random
import re
import time

import requests
# Define the request URL and request headers
url = "https://reits.szse.cn/api/disc/announcement/annList?random=0.3555675437003616"
headers = {
    "Accept": "application/json, text/javascript, */*; q=0.01",
    # Advertise only encodings that requests decodes without extra packages
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    "Connection": "keep-alive",
    "Content-Type": "application/json",
    # Host must be a bare hostname, without the scheme
    "Host": "reits.szse.cn",
    "Origin": "https://reits.szse.cn",
    "Referer": "https://reits.szse.cn/disclosure/index.html",
    "Sec-Ch-Ua": '"Google Chrome";v="125", "Chromium";v="125", "Not.A/Brand";v="24"',
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Ch-Ua-Platform": '"Windows"',
    "Sec-Fetch-Dest": "empty",
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Site": "same-origin",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "X-Request-Type": "ajax",
    "X-Requested-With": "XMLHttpRequest"
}
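# Define the request payload
payload = {
    "seDate": ["", ""],
    "channelCode": ["reits-xxpl"],
    "bigCategoryId": ["directions"],
    "pageSize": 50,
    "pageNum": 1
}
# Send the POST request (the timeout keeps the script from hanging forever)
response = requests.post(url, headers=headers, json=payload, timeout=30)
# Check the response status code
if response.status_code == 200:
    print("Request succeeded, status code: 200 OK")
else:
    print(f"Request failed, status code: {response.status_code}")
    raise SystemExit(1)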
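# Parse the JSON response
data = response.json()
# Make sure the save folder exists before writing any file
save_dir = "F:\\AI自媒體內容\\AI炒股\\REITs"
os.makedirs(save_dir, exist_ok=True)
# Check whether the response contains data
if "data" in data and isinstance(data["data"], list):
    for item in data["data"]:
        # Get the PDF title (fall back when the field is missing or null)
        pdf_title = item.get("title") or "unknown_title"
        print(f"PDF title: {pdf_title}")
        # Get the PDF URL
        pdf_url = item.get("attachPath", "")
        if pdf_url:
            pdf_url = "https://disc.static.szse.cn" + pdf_url
            print(f"PDF URL: {pdf_url}")
            # Replace characters that are illegal in Windows file names
            pdf_title = re.sub(r'[<>:"/\\|?*]', '_', pdf_title)
            # Define the save path
            save_path = os.path.join(save_dir, f"{pdf_title}.pdf")
            # Download the PDF file
            pdf_response = requests.get(pdf_url, timeout=60)
            if pdf_response.status_code == 200:
                with open(save_path, 'wb') as f:
                    f.write(pdf_response.content)
                print(f"PDF file saved to: {save_path}")
            else:
                print(f"Failed to download the PDF file, status code: {pdf_response.status_code}")
            # Pause for a random 3 to 6 seconds
            time.sleep(random.uniform(3, 6))
else:
    print("No data found")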
![](http://image.uc.cn/s/wemedia/s/upload/2024/03620164fd8cd7c0a3bc5c71427c4797.jpg)
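
The script requests only the first page. With "announceCount" at 39 and "pageSize" at 50 that is enough, but the list may grow beyond one page. The sketch below shows how the payloads for every page could be generated. It assumes that "announceCount" is the total number of matching announcements and that "pageNum" starts at 1; the sample request and response suggest both, but neither is confirmed by any API documentation. The helper names `page_count` and `iter_payloads` are hypothetical.

```python
import math
from typing import Iterator


def page_count(announce_count: int, page_size: int) -> int:
    """Number of pages needed to cover announce_count items."""
    if announce_count <= 0:
        return 0
    return math.ceil(announce_count / page_size)


def iter_payloads(announce_count: int, page_size: int = 50) -> Iterator[dict]:
    """Yield one request payload per page, reusing the filters from the article."""
    for page_num in range(1, page_count(announce_count, page_size) + 1):
        yield {
            "seDate": ["", ""],
            "channelCode": ["reits-xxpl"],
            "bigCategoryId": ["directions"],
            "pageSize": page_size,
            "pageNum": page_num,  # assumed to be 1-based
        }


if __name__ == "__main__":
    # With the values from the sample response, a single page is enough.
    for payload in iter_payloads(announce_count=39, page_size=50):
        print(payload["pageNum"], payload["pageSize"])
```

In practice the first request would supply "announceCount", and each further payload would be posted to the same URL with the same headers, keeping the random pause between requests.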