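Hello, and first of all thank you for sharing your write-up; it helped me a lot.
I noticed that the code you use at the end to extract the real URL is as follows:
url_text = re.findall("\'(\S+?)\';", second_url, re.S)
best_url = ''.join(url_text)
The extracted result looks like this: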
&from=innerhttp://mp.weixin.qq.com/s?src=11&timestamp=1578541951&ver=2085&signature=1c3e2o2NzgWJffH0bchXLv21TsvnPpio-R65LSusRchiIxZ3kMOnANDzGYIoTJRhPzNluorh-Dmgd*B6pbHxHSOjqjSKdwjHI4cH4Tiio-SBtTrDpU9BK7cGAiS1qo1b&new=1
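This actually leaves some noise in the result (the leading &from=inner fragment), so my personal suggestion is the following:
url_text = re.findall(r"\+= '(.*?)';", second_url, re.S)
best_url = ''.join(url_text)
The result is then: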
http://mp.weixin.qq.com/s?src=11&timestamp=1578541951&ver=2085&signature=1c3e2o2NzgWJffH0bchXLv21TsvnPpio-R65LSusRchiIxZ3kMOnANDzGYIoTJRhPzNluorh-Dmgd*B6pbHxHSOjqjSKdwjHI4cH4Tiio-SBtTrDpU9BK7cGAiS1qo1b&new=1
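
To make the difference between the two patterns easier to see, here is a minimal sketch that runs both against a small inline script. The SAMPLE_JS text is a hypothetical stand-in that I wrote for illustration (I am assuming the page builds the URL with repeated `url += '...'` statements and also contains an unrelated string ending in `'&from=inner';`); it is not copied from the real page, and the function names are my own.

```python
import re

# Hypothetical stand-in for the redirect page's inline script (not the real page).
# Assumption: the URL is assembled piece by piece with `url += '...';` statements,
# and another statement on the page happens to end with a quoted string and `;`.
SAMPLE_JS = """
(new Image()).src = 'https://example.invalid/approve?uuid=' + uuid + '&from=inner';
var url = '';
url += 'http://mp.w';
url += 'eixin.qq.com/s?src=11&timest';
url += 'amp=1578541951&new=1';
url.replace("@", "");
"""


def extract_loose(js: str) -> str:
    """Original approach: grab every quoted, whitespace-free run followed by `;`."""
    return "".join(re.findall(r"'(\S+?)';", js, re.S))


def extract_anchored(js: str) -> str:
    """Suggested approach: only accept quoted strings that follow `+= `."""
    return "".join(re.findall(r"\+= '(.*?)';", js, re.S))


if __name__ == "__main__":
    print(extract_loose(SAMPLE_JS))     # starts with the stray "&from=inner"
    print(extract_anchored(SAMPLE_JS))  # only the concatenated URL pieces
```

On this sample the loose pattern also picks up the `'&from=inner';` fragment from the unrelated statement, while the pattern anchored on `+= ` only collects the pieces appended to `url`. If the real page changes how it builds the URL, the anchored pattern would need to be adjusted accordingly.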