Author: mengziwudao | Source: Internet | 2023-09-10 12:50
Contents
1. Preface
2. Full code
3. WKWebView
4. WKNavigationDelegate
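1. Preface
At some point I wanted to pull the data out of a web page and reassemble it in a table view inside my own app. I later dropped the idea, but I had already learned part of what it takes, so I am recording it here. I also plan to add more notes on WKWebView later.
As of publication, the structure of the target site has not changed. I will include the page structure I scraped in the demo so it can be used for analysis; otherwise the demo would become useless once the site's structure changes.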
2. Full code
Demo: click to download
3. WKWebView
3.1 Importing the framework
#import <WebKit/WebKit.h>
@property (nonatomic, strong) WKWebView *webView;
3.2 Loading a page
Let's use Weibo as the example.
// Create the web view, register self as its navigation delegate, and add it to the view hierarchy.
self.webView = [[WKWebView alloc] initWithFrame:self.view.frame];
self.webView.navigationDelegate = self;
[self.view addSubview:self.webView];
// Start loading the target page.
[self.webView loadRequest:[NSURLRequest requestWithURL:[NSURL URLWithString:@"https://weibo.com/"]]];
4. WKNavigationDelegate
4.1 Methods
@protocol WKNavigationDelegate <NSObject>
- (void)webView:(WKWebView *)webView decidePolicyForNavigationAction:(WKNavigationAction *)navigationAction decisionHandler:(void (^)(WKNavigationActionPolicy))decisionHandler;
- (void)webView:(WKWebView *)webView decidePolicyForNavigationResponse:(WKNavigationResponse *)navigationResponse decisionHandler:(void (^)(WKNavigationResponsePolicy))decisionHandler;
- (void)webView:(WKWebView *)webView didStartProvisionalNavigation:(null_unspecified WKNavigation *)navigation;
- (void)webView:(WKWebView *)webView didReceiveServerRedirectForProvisionalNavigation:(null_unspecified WKNavigation *)navigation;
- (void)webView:(WKWebView *)webView didFailProvisionalNavigation:(null_unspecified WKNavigation *)navigation withError:(NSError *)error;
- (void)webView:(WKWebView *)webView didCommitNavigation:(null_unspecified WKNavigation *)navigation;
- (void)webView:(WKWebView *)webView didFinishNavigation:(null_unspecified WKNavigation *)navigation;
- (void)webView:(WKWebView *)webView didFailNavigation:(null_unspecified WKNavigation *)navigation withError:(NSError *)error;
- (void)webView:(WKWebView *)webView didReceiveAuthenticationChallenge:(NSURLAuthenticationChallenge *)challenge completionHandler:(void (^)(NSURLSessionAuthChallengeDisposition disposition, NSURLCredential * _Nullable credential))completionHandler;
- (void)webViewWebContentProcessDidTerminate:(WKWebView *)webView API_AVAILABLE(macosx(10.11), ios(9.0));
@end
4.2 The method this article mainly uses
- (void)webView:(WKWebView *)webView didFinishNavigation:(null_unspecified WKNavigation *)navigation {
    // Called once the page has finished loading; each script below runs against the loaded DOM.
    // Note the selector spelling: evaluateJavaScript:completionHandler: (capital S).

    // Full body HTML, logged and written to a file for later analysis.
    [self.webView evaluateJavaScript:@"document.body.innerHTML" completionHandler:^(id _Nullable result, NSError * _Nullable error) {
        NSLog(@"Page scrape result: %@", result);
        [self writeToFileWithTxt:result];
    }];

    // Link of the first post: first <a> inside the first 'weibo-text' element.
    NSString *titleSrcString = @"document.getElementsByClassName('weibo-text')[0].getElementsByTagName('a')[0].href";
    [self.webView evaluateJavaScript:titleSrcString completionHandler:^(id _Nullable result, NSError * _Nullable error) {
        NSLog(@"Title link scrape result: %@", result);
    }];

    // Text of the first post.
    NSString *titleString = @"document.getElementsByClassName('weibo-text')[0].textContent";
    [self.webView evaluateJavaScript:titleString completionHandler:^(id _Nullable result, NSError * _Nullable error) {
        NSLog(@"Title scrape result: %@", result);
    }];

    // Avatar URL: first <img> inside the first 'm-img-box' element.
    NSString *imageSrcString = @"document.getElementsByClassName('m-img-box')[0].getElementsByTagName('img')[0].src";
    [self.webView evaluateJavaScript:imageSrcString completionHandler:^(id _Nullable result, NSError * _Nullable error) {
        NSLog(@"Avatar scrape result: %@", result);
    }];

    // Account name: text of the first 'm-text-cut' element.
    NSString *authorString = @"document.getElementsByClassName('m-text-cut')[0].textContent";
    [self.webView evaluateJavaScript:authorString completionHandler:^(id _Nullable result, NSError * _Nullable error) {
        NSLog(@"Account name scrape result: %@", result);
    }];
}
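4.2.1 Analysis
Each script has to locate its field by the element's class, type, and related attributes. Work the selectors out against the page structure included in the demo.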
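As a rough sketch of how the class-based lookups used in 4.2 could be folded into a single script: the function below reads the same class names (weibo-text, m-img-box, m-text-cut) and returns one JSON string, with null for any element that is missing. The function name extractFirstPost and the field names title, titleLink, avatar, and author are made up for illustration and are not part of the demo.

```javascript
// Hypothetical helper (not from the demo): collects the four fields in one pass.
// `doc` is the page's `document`; it is passed in as a parameter so the function
// can also be tried outside a browser with a stub object.
function extractFirstPost(doc) {
  // First element with the given class, or undefined when the page has none.
  const first = (root, cls) => root.getElementsByClassName(cls)[0];
  // First descendant with the given tag, or undefined when the parent is missing.
  const firstTag = (el, tag) => (el ? el.getElementsByTagName(tag)[0] : undefined);

  const text = first(doc, "weibo-text");
  const imgBox = first(doc, "m-img-box");
  const author = first(doc, "m-text-cut");
  const link = firstTag(text, "a");
  const img = firstTag(imgBox, "img");

  // Missing elements become null instead of raising a TypeError in the page.
  return JSON.stringify({
    title: text ? text.textContent : null,
    titleLink: link ? link.href : null,
    avatar: img ? img.src : null,
    author: author ? author.textContent : null,
  });
}
```

In the app, one way to use this would be to pass the function source followed by extractFirstPost(document) as the script string to evaluateJavaScript:completionHandler:, then parse the returned string with NSJSONSerialization. Returning null for missing elements means a changed page structure shows up as empty fields rather than as a JavaScript error.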