Fetching web page content in PHP with curl_init

<p>What can you do when file_get_contents or a bare curl request fails to fetch a page?</p><p>Use curl_init and send the same request headers a real browser sends:</p><pre class="brush:php;toolbar:false">function getCurl($url)
{
    // Request headers copied from the browser (see the steps below)
    $headers = array(
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
        'Accept-Language: zh-CN,zh;q=0.9',
        'Cache-Control: max-age=0',
        'Connection: keep-alive',
        'Cookie: BAIDUID=762D167F1A19D476B5F8B1E7E87B788A:FG=1;BIDUPSID=762D167F1A19D476B5F8B1E7E87B788A;PSTM=1510715646;BDUSS=3RwVlc0UVJoa3hGc3lBLU02bko4T3pHYUo3NFpYTVdDNHJRLS1XZkNSOGxNMjFhQUFBQUFBJCQAAAAAAAAAAAEAAAB8exUWeXViaW5fa2V0eTEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACWmRVolpkVaT;MCITY=-340%3A119%3A;BDORZ=B490B5EBF6F3CD402E515D22BCDA1598;PSINO=7;BDRCVFR[feWj1Vr5u3D]=I67x6TjHwwYf0;pgv_pvi=9686848512;pgv_si=s9426284544;H_PS_PSSID=1435_21114_20692_26350_20930',
        'Host: sp0.baidu.com',
        'Upgrade-Insecure-Requests: 1',
        'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36',
    );

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    // Send Accept-Encoding through cURL so the compressed response is decoded automatically
    curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
    // Skip certificate and host name verification for HTTPS requests (insecure, only for sites you trust)
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    // Execute the request and get the response body
    $result = curl_exec($ch);
    // Release the cURL handle
    curl_close($ch);
    if ($result === false) {
        return null; // the request failed
    }
    // Convert the GBK response to UTF-8, then decode it as JSON
    $data = @iconv("GBK", "UTF-8//IGNORE", $result);
    return json_decode($data, true);
}

$url = 'https://sp0.baidu.com/';
// Called from inside a class that defines getCurl() as a method;
// for a standalone function, call getCurl($url) directly
$str = $this-&gt;getCurl($url);</pre><p>Where do the headers come from?</p><p>1. Open the URL in Google Chrome.</p><p>2. Press F12.</p><p>3. Open the Network tab, reload the page, and select the request. Its Request Headers section lists the headers to copy.</p><p>As shown in the figure:</p><p><img src="/up_pic/201805/220530202624.png" title="220530202624.png" alt="1.png"/></p>
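<p>Copying headers from the Network tab one at a time is tedious, so here is a minimal sketch of a helper that turns the pasted Request Headers text into the array that CURLOPT_HTTPHEADER expects. The function name parseRawHeaders and the sample header text are assumptions made for illustration; they are not part of the original code.</p>

```php
<?php
// Hypothetical helper (not from the original post): converts the raw
// "Request Headers" text copied from the DevTools Network tab into the
// list of "Name: value" strings that CURLOPT_HTTPHEADER expects.
function parseRawHeaders(string $raw): array
{
    $headers = [];
    foreach (preg_split('/\r\n|\r|\n/', $raw) as $line) {
        $line = trim($line);
        // Keep only "Name: value" lines. This skips blank lines, the request
        // line ("GET / HTTP/1.1") and HTTP/2 pseudo-headers like ":authority".
        if (!preg_match('/^([A-Za-z0-9_-]+):\s*(.*)$/', $line, $m)) {
            continue;
        }
        // Leave Accept-Encoding to CURLOPT_ENCODING so that cURL can
        // decompress the response itself.
        if (strcasecmp($m[1], 'Accept-Encoding') === 0) {
            continue;
        }
        $headers[] = $m[1] . ': ' . $m[2];
    }
    return $headers;
}

// Illustrative sample of text pasted from DevTools.
$raw = <<<TXT
GET / HTTP/1.1
Host: sp0.baidu.com
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9
Upgrade-Insecure-Requests: 1
TXT;

$headers = parseRawHeaders($raw);
print_r($headers);
```

<p>The resulting array can be passed to curl_setopt($ch, CURLOPT_HTTPHEADER, $headers). Setting curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate') alongside it lets cURL request and decode compressed responses.</p>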