A Quick Taste of CefSharp

CefSharp

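CEF stands for Chromium Embedded Framework.
What is CefSharp? The official site describes it this way: CefSharp is the easiest way to embed a full-featured, standards-compliant web browser into your C# or VB.NET app. CefSharp has browser controls for WinForms and WPF apps, and a headless (offscreen) version for automation projects. CefSharp is based on Chromium Embedded Framework, the open source version of Google Chrome.
Put simply, it is a programmable browser for C# or VB.NET (CEF also has bindings for other languages, such as Java and Go).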

Environment used in this article:
* CefSharp version: 75.1.143
* Visual Studio version: 2015
* Operating system: Windows 10 Pro

Adding CefSharp to a WPF project

CefSharp is available as a ready-made NuGet package. Add it to the project first, then add the corresponding control in XAML:

    <cefSharp:ChromiumWebBrowser Name="myChrome" Loaded="myChrome_Loaded"/>

Declare the cefSharp namespace:

    xmlns:cefSharp="clr-namespace:CefSharp.Wpf;assembly=CefSharp.Wpf"

In the myChrome_Loaded event handler, we tell the browser to open the Baidu home page:

    private void myChrome_Loaded(object sender, RoutedEventArgs e)
    {
        String url = "https://www.baidu.com";
        myChrome.Load(url);
    }

Run the program and the Baidu home page appears.

Intercepting requests

According to the documentation, the GetResourceRequestHandler method of the RequestHandler class is called before every request is sent:
> GetResourceRequestHandler
> Called on the CEF IO thread before a resource request is initiated.

RequestHandler is the default implementation of the IRequestHandler interface, so a custom request handler can inherit from it:
> Default implementation of IRequestHandler.
> This class provides default implementations of the methods from IRequestHandler, therefore providing a convenience base class for any custom request handler.

So we can create a class that inherits from RequestHandler:

    using CefSharp;
    using CefSharp.Handler;

    class CustomRequestHandler : RequestHandler
    {
        // Called before each resource request; returns the handler that will process it.
        protected override IResourceRequestHandler GetResourceRequestHandler(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, bool isNavigation, bool isDownload, string requestInitiator, ref bool disableDefaultHandling)
        {
            return new CustomResourceRequestHandler();
        }
    }

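GetResourceRequestHandler is the method to focus on. It returns an instance of another class, and that class is where the request is customized.
Newer versions of CefSharp (from version 75 on) moved the OnBeforeResourceLoad method into the IResourceRequestHandler interface (see the documentation). CefSharp also provides a default implementation of this interface, ResourceRequestHandler, so we need one more class that inherits from ResourceRequestHandler (the CustomResourceRequestHandler class used in the code above):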

    public class CustomResourceRequestHandler : ResourceRequestHandler
    {
        protected override CefReturnValue OnBeforeResourceLoad(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IRequestCallback callback)
        {
            // Read the header collection, modify it, then assign it back to the request.
            var headers = request.Headers;
            headers["Custom-Header"] = "My Custom Header";
            request.Headers = headers;

            return CefReturnValue.Continue;
        }
    }
Finally, assign the custom request handler to the CefSharp instance:

    myChrome.RequestHandler = new CustomRequestHandler();

A traffic capture tool such as Fiddler shows that the Custom-Header header has been added.

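Adding a custom query parameter

The example above added a custom header. What if we want to rewrite the URL to add a custom query parameter, such as name=foo? There is a pitfall here: simply writing request.Url += "?name=foo" causes an infinite redirect loop, because changing the URL triggers a redirect and the redirected request then has its URL changed again. The fix is simple: check whether the query parameter is already in the URL: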

    protected override CefReturnValue OnBeforeResourceLoad(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IRequestCallback callback)
    {
        var headers = request.Headers;
        headers["Custom-Header"] = "My Custom Header";
        request.Headers = headers;

        // Only rewrite the URL once, otherwise every redirect adds the parameter again.
        if (!request.Url.Contains("name=foo"))
        {
            // Use '&' when the URL already has a query string, '?' otherwise.
            string separator = request.Url.Contains("?") ? "&" : "?";
            request.Url += separator + "name=foo";
        }

        return CefReturnValue.Continue;
    }
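
The substring check above can misfire (for example on a URL that contains othername=foo) and does not account for fragments. As a minimal sketch, assuming you are free to add a small helper of your own, the same idea can be expressed with UriBuilder from the base class library. UrlHelper and EnsureQueryParameter are hypothetical names, not part of CefSharp:

```csharp
using System;

public static class UrlHelper
{
    // Appends name=value to the query string unless a parameter with that name
    // is already present. Returns the original URL unchanged in that case, so
    // calling it repeatedly never alters the URL a second time.
    public static string EnsureQueryParameter(string url, string name, string value)
    {
        var builder = new UriBuilder(url);

        // UriBuilder.Query includes the leading '?' when it is non-empty.
        string query = builder.Query.TrimStart('?');

        foreach (string pair in query.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string key = pair.Split('=')[0];
            if (string.Equals(Uri.UnescapeDataString(key), name, StringComparison.Ordinal))
            {
                return url;
            }
        }

        string addition = Uri.EscapeDataString(name) + "=" + Uri.EscapeDataString(value);

        // Assign the query without a leading '?'; the setter adds it.
        builder.Query = query.Length == 0 ? addition : query + "&" + addition;

        // Go through the Uri property so a default port is not written into the result.
        return builder.Uri.AbsoluteUri;
    }
}
```

Inside OnBeforeResourceLoad this would be used as request.Url = UrlHelper.EnsureQueryParameter(request.Url, "name", "foo"); guarded by a comparison with the old value if you want to avoid assigning an unchanged URL.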

Adding a custom body

According to the IRequest documentation, we can use the PostData property:

    protected override CefReturnValue OnBeforeResourceLoad(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IRequestCallback callback)
    {
        var headers = request.Headers;
        headers["Custom-Header"] = "My Custom Header";
        request.Headers = headers;

        string body = "name=foo";
        byte[] byteArray = System.Text.Encoding.UTF8.GetBytes(body);

        // Create the PostData object, then attach the body as a single element.
        request.InitializePostData();
        var element = request.PostData.CreatePostDataElement();
        element.Bytes = byteArray;
        request.PostData.AddElement(element);

        return CefReturnValue.Continue;
    }

A traffic capture tool such as Fiddler shows that the POST data has been added.
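
The hard-coded "name=foo" body needs no escaping, but real values often contain spaces, '&' or '='. The following is a minimal sketch of building the byte array from key/value pairs, assuming the server expects a URL-encoded form body (the article does not say which content type is used). FormBody and Encode are hypothetical helper names:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FormBody
{
    // Builds "k1=v1&k2=v2" with every key and value percent-encoded,
    // then returns the UTF-8 bytes ready to assign to a post data element.
    public static byte[] Encode(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var sb = new StringBuilder();
        foreach (var field in fields)
        {
            if (sb.Length > 0)
            {
                sb.Append('&');
            }
            sb.Append(Uri.EscapeDataString(field.Key));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(field.Value));
        }
        return Encoding.UTF8.GetBytes(sb.ToString());
    }
}
```

The result can replace byteArray in the handler above, for example FormBody.Encode(new[] { new KeyValuePair<string, string>("name", "foo") }).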

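Loading a local HTML string

Sometimes we need to render an HTML string that is held in memory. CefSharp provides an API for this as well, and the code is simple: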

    private void myChrome_Loaded(object sender, RoutedEventArgs e)
    {
        string html = @"<!DOCTYPE html>
    <html>
    <head>
    <title>This is a title</title>
    <meta charset='utf-8' />
    <meta name='viewport' content='width=device-width, initial-scale=1' />
    </head>
    <body>
    <h1>This is a simple HTML page</h1>
    <p>Hello World!</p>
    </body>
    </html>";
        // The content comes from the string; the URL is the address it is loaded under.
        String url = "https://www.baidu.com";
        myChrome.LoadHtml(html, url);
    }

Intercepting responses

The key here is the GetResourceResponseFilter method, whose signature is:

    IResponseFilter GetResourceResponseFilter(
        IWebBrowser chromiumWebBrowser,
        IBrowser browser,
        IFrame frame,
        IRequest request,
        IResponse response
    )

It returns an IResponseFilter, through which we can capture the content of the response. In the latest version of CefSharp, GetResourceResponseFilter has been moved into the IResourceRequestHandler interface (see the latest documentation).
The following example intercepts the XHR requests made by a page:
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text;
    using CefSharp;
    using CefSharp.Handler;

    // Passes the response through unchanged while keeping a copy of every byte.
    public class TestJsonFilter : IResponseFilter
    {
        public List<byte> DataAll = new List<byte>();

        public FilterStatus Filter(Stream dataIn, out long dataInRead, Stream dataOut, out long dataOutWritten)
        {
            if (dataIn == null || dataIn.Length == 0)
            {
                dataInRead = 0;
                dataOutWritten = 0;

                return FilterStatus.Done;
            }

            // dataIn can be larger than dataOut, so copy only what fits.
            dataInRead = Math.Min(dataIn.Length, dataOut.Length);
            dataOutWritten = dataInRead;

            byte[] bs = new byte[dataInRead];
            dataIn.Read(bs, 0, bs.Length);
            dataOut.Write(bs, 0, bs.Length);
            DataAll.AddRange(bs);

            // Ask to be called again if part of the input is still unread.
            return dataInRead < dataIn.Length ? FilterStatus.NeedMoreData : FilterStatus.Done;
        }

        public bool InitFilter()
        {
            return true;
        }

        public void Dispose()
        {
        }
    }

    // Keeps track of the filter created for each request, keyed by request identifier.
    public class FilterManager
    {
        private static Dictionary<string, IResponseFilter> dataList = new Dictionary<string, IResponseFilter>();

        public static IResponseFilter CreateFilter(string guid)
        {
            lock (dataList)
            {
                var filter = new TestJsonFilter();
                dataList[guid] = filter; // the indexer does not throw if the key already exists

                return filter;
            }
        }

        public static IResponseFilter GetFilter(string guid)
        {
            lock (dataList)
            {
                // Check that the key exists first; otherwise an exception is thrown,
                // which leads to a ContextSwitchDeadlock.
                IResponseFilter filter;
                return dataList.TryGetValue(guid, out filter) ? filter : null;
            }
        }

        public static void RemoveFilter(string guid)
        {
            lock (dataList)
            {
                dataList.Remove(guid); // release the captured bytes once they have been used
            }
        }
    }

    public class CustomResourceRequestHandler : ResourceRequestHandler
    {
        protected override CefReturnValue OnBeforeResourceLoad(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IRequestCallback callback)
        {
            // Request interception code goes here...
            return CefReturnValue.Continue;
        }

        protected override IResponseFilter GetResourceResponseFilter(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IResponse response)
        {
            if (request.ResourceType != ResourceType.Xhr) // only filter XHR requests
            {
                return null;
            }
            return FilterManager.CreateFilter(request.Identifier.ToString());
        }

        protected override void OnResourceLoadComplete(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IResponse response, UrlRequestStatus status, long receivedContentLength)
        {
            string id = request.Identifier.ToString();
            var filter = FilterManager.GetFilter(id) as TestJsonFilter;
            if (filter != null)
            {
                Console.WriteLine(Encoding.UTF8.GetString(filter.DataAll.ToArray())); // print the body
                FilterManager.RemoveFilter(id);
            }
        }
    }

    // In the window's code-behind:
    private void myChrome_Loaded(object sender, RoutedEventArgs e)
    {
        String url = "https://github.com/salamander-mh"; // this GitHub page makes Ajax requests, so the effect is visible
        myChrome.Load(url);
    }

Run the program, and the response bodies of the Ajax requests appear in the Output window.
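
The part of a response filter that is easiest to get wrong is the buffer arithmetic: the input chunk may be larger than the output buffer, so only part of it can be copied per call. The sketch below isolates that logic using plain streams so it can be exercised without a browser. It is an illustration under that assumption, not CefSharp code; StreamPump and CopyAndCapture are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class StreamPump
{
    // Copies as many unread bytes from dataIn as fit into the remaining space
    // of dataOut, and appends the copied bytes to captured.
    // Returns true when all of dataIn has been consumed, false when the caller
    // must call again with a fresh output buffer.
    public static bool CopyAndCapture(Stream dataIn, Stream dataOut, List<byte> captured,
                                      out long dataInRead, out long dataOutWritten)
    {
        long remaining = dataIn.Length - dataIn.Position;
        long capacity = dataOut.Length - dataOut.Position;
        int count = (int)Math.Min(remaining, capacity);

        var buffer = new byte[count];
        int read = dataIn.Read(buffer, 0, count);
        dataOut.Write(buffer, 0, read);

        for (int i = 0; i < read; i++)
        {
            captured.Add(buffer[i]);
        }

        dataInRead = read;
        dataOutWritten = read;
        return read == remaining;
    }
}
```

In a filter, a false return corresponds to asking for another call because input is still pending, and a true return corresponds to the chunk being fully written.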

Capturing cookies

Create a cookie reader class that implements the ICookieVisitor interface:

    public class CookieVisitor : CefSharp.ICookieVisitor
    {
        public event Action<CefSharp.Cookie> SendCookie;

        // Called once per cookie; return true to continue with the next cookie.
        public bool Visit(Cookie cookie, int count, int total, ref bool deleteCookie)
        {
            deleteCookie = false;
            if (SendCookie != null)
            {
                SendCookie(cookie);
            }

            return true;
        }

        public void Dispose()
        {
        }
    }

Handle it in the browser's FrameLoadEnd event:

    private void browser_FrameLoadEnd(object sender, CefSharp.FrameLoadEndEventArgs e)
    {
        var cookieManager = myChrome.GetCookieManager();

        CookieVisitor visitor = new CookieVisitor();
        visitor.SendCookie += visitor_SendCookie;
        cookieManager.VisitAllCookies(visitor);
    }

The callback:

    private void visitor_SendCookie(CefSharp.Cookie obj)
    {
        Console.WriteLine("Got cookie: " + obj.Domain.TrimStart('.') + "^" + obj.Name + "^" + obj.Value + "$");
    }
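
Hook up the event on the CefSharp instance:

    private void myChrome_Loaded(object sender, RoutedEventArgs e)
    {
        String url = "https://www.baidu.com";
        // Subscribe before loading so the first FrameLoadEnd is not missed.
        myChrome.FrameLoadEnd += browser_FrameLoadEnd;
        myChrome.Load(url);
    }

Run the program, and the cookie data appears in the Output window.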
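
Printing each cookie is enough for a demo, but in practice you often want to collect them and look them up by domain later. The following is a minimal sketch of such a collector using only base class library types. CookieCollector and its members are hypothetical names, and the lock is there on the assumption that the visitor callback does not run on the UI thread:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class CookieCollector
{
    private readonly object gate = new object();
    private readonly Dictionary<string, Dictionary<string, string>> byDomain =
        new Dictionary<string, Dictionary<string, string>>(StringComparer.OrdinalIgnoreCase);

    // Records one cookie. A leading '.' on the domain is dropped, as in the callback above.
    public void Add(string domain, string name, string value)
    {
        string key = domain.TrimStart('.');
        lock (gate)
        {
            Dictionary<string, string> cookies;
            if (!byDomain.TryGetValue(key, out cookies))
            {
                cookies = new Dictionary<string, string>();
                byDomain[key] = cookies;
            }
            cookies[name] = value; // a later value replaces an earlier one with the same name
        }
    }

    // Formats the cookies recorded for a domain as "a=1; b=2", sorted by name.
    public string ToCookieHeader(string domain)
    {
        lock (gate)
        {
            Dictionary<string, string> cookies;
            if (!byDomain.TryGetValue(domain.TrimStart('.'), out cookies))
            {
                return string.Empty;
            }
            return string.Join("; ", cookies
                .OrderBy(c => c.Key, StringComparer.Ordinal)
                .Select(c => c.Key + "=" + c.Value));
        }
    }
}
```

The SendCookie callback could then call collector.Add(obj.Domain, obj.Name, obj.Value) instead of writing to the console.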

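JavaScript interaction

Executing JavaScript from C#

    myChrome.GetBrowser().MainFrame.ExecuteJavaScriptAsync("document.getElementById('testid').click();");

The code above triggers the click event of the element whose id is testid.
Note: scripts are executed at the frame level, and a page always has at least one frame (the MainFrame).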
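
When the element id or any other value comes from a variable, concatenating it straight into the script breaks as soon as it contains a quote or a backslash. The following is a minimal sketch of escaping a value into a JavaScript string literal before building the script. JsScript, ToStringLiteral, and BuildClickScript are hypothetical helper names, not part of CefSharp:

```csharp
using System.Globalization;
using System.Text;

public static class JsScript
{
    // Encodes a .NET string as a double-quoted JavaScript string literal.
    public static string ToStringLiteral(string value)
    {
        var sb = new StringBuilder("\"");
        foreach (char c in value)
        {
            switch (c)
            {
                case '\\': sb.Append("\\\\"); break;
                case '"': sb.Append("\\\""); break;
                case '\n': sb.Append("\\n"); break;
                case '\r': sb.Append("\\r"); break;
                default:
                    // Control characters and the JS line separators are written as \uXXXX.
                    if (c < ' ' || c == '\u2028' || c == '\u2029')
                    {
                        sb.Append("\\u").Append(((int)c).ToString("x4", CultureInfo.InvariantCulture));
                    }
                    else
                    {
                        sb.Append(c);
                    }
                    break;
            }
        }
        return sb.Append('"').ToString();
    }

    // Builds the click script from the example above for an arbitrary element id.
    public static string BuildClickScript(string elementId)
    {
        return "document.getElementById(" + ToStringLiteral(elementId) + ").click();";
    }
}
```

The resulting string can be passed to ExecuteJavaScriptAsync in place of the hard-coded script.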

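Getting the result of a JavaScript call

This requires the Task<JavascriptResponse> EvaluateScriptAsync(string script, TimeSpan? timeout) method. JavaScript code is executed asynchronously, so a .NET Task is used to return a response that contains an error message, the result, and a success (bool) flag.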

    // Get Document Height
    var frame = myChrome.GetBrowser().MainFrame;
    var task = frame.EvaluateScriptAsync("(function() { var body = document.body, html = document.documentElement; return Math.max( body.scrollHeight, body.offsetHeight, html.clientHeight, html.scrollHeight, html.offsetHeight ); })();");

    task.ContinueWith(t =>
    {
        if (!t.IsFaulted)
        {
            var response = t.Result;
            // EvaluateJavaScriptResult is a field or property you define yourself to hold the outcome.
            EvaluateJavaScriptResult = response.Success ? (response.Result ?? "null") : response.Message;
        }
    }, TaskScheduler.FromCurrentSynchronizationContext());

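Cleaning up resources

After closing the application, you may find that the CefSharp.BrowserSubprocess.exe process is still running. To avoid this, call Cef.Shutdown() in the application's exit handler: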

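    try
    {
        // browser is the ChromiumWebBrowser instance (myChrome in the examples above).
        if (browser != null)
        {
            browser.Dispose();
            Cef.Shutdown();
        }
    }
    catch { } // errors during shutdown are deliberately ignored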

Download the sample code

References:
* StackOverflow
* How to read the JSON response content from a XMLHttpRequest?
* CefSharp documentation (Chinese translation)