I recently posted about consuming SQL Server Reporting Services and how you may need the XmlUrlResolver to access the external source. You may also need to get behind a credentialed page.
There is a lot of material out there on how to accomplish this using C#, but one factor that gets overlooked is that many ASP.NET MVC sites use request verification tokens to curtail cross-site request forgery (CSRF). This is a pattern Microsoft built into the System.Web.Mvc API, and it has to be taken into account if we are to gain access to our desired site.
Fortunately, if we have a legitimate set of user credentials, this becomes a trivial matter for most basic installations once you have a general idea of what you need to do.
#1 – Set Up a Cookie Class
To get started, we need a class that can manage our cookies. It wraps a CookieContainer and attaches it to every WebRequest it creates.
    using System;
    using System.Net;

    // A WebClient that attaches a shared CookieContainer to every request it creates.
    public class CookieAwareWebClient : WebClient
    {
        public CookieAwareWebClient(CookieContainer container)
        {
            CookieContainer = container;
        }

        public CookieAwareWebClient() : this(new CookieContainer()) { }

        public CookieContainer CookieContainer { get; private set; }

        protected override WebRequest GetWebRequest(Uri address)
        {
            var request = (HttpWebRequest)base.GetWebRequest(address);
            request.CookieContainer = CookieContainer;
            return request;
        }
    }
#2 – Establish a GET Request
Personally, I found it easier to think next about what my final GET request would need, since I would be calling it repeatedly on my access-restricted pages. In other words, I first worked out how I would use the solution, then figured out how to build up to it.
    // Requires: using System; using System.IO; using System.Net; using HtmlAgilityPack;
    public HtmlDocument GetPage(string url, CookieContainer cookieContainer)
    {
        Uri absoluteUri = new Uri(url);
        var cookies = cookieContainer.GetCookies(absoluteUri);

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(absoluteUri);
        request.CookieContainer = new CookieContainer();
        // Carry over the cookies from the earlier (login) request.
        foreach (Cookie cookie in cookies)
        {
            request.CookieContainer.Add(cookie);
        }
        request.Method = "GET";

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            string html = reader.ReadToEnd();
            var doc = new HtmlDocument();
            doc.LoadHtml(html);
            return doc;
        }
    }
What this does beyond a standard GET request is receive a CookieContainer from a previous request, iterate through its cookies, and add them to the CookieContainer of the current request. The function returns an HtmlDocument (the HtmlAgilityPack type) for your requested page, which is exactly what it sounds like.
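The cookie hand-off can be tried on its own, without touching the network. The sketch below is only an illustration and not part of the original code: CookieCopyDemo, CopyCookiesFor, the example.com address, and the "session" cookie are all made-up names.

```csharp
using System;
using System.Net;

public static class CookieCopyDemo
{
    // Copies the cookies that apply to `uri` from `source` into a fresh container.
    public static CookieContainer CopyCookiesFor(Uri uri, CookieContainer source)
    {
        var target = new CookieContainer();
        foreach (Cookie cookie in source.GetCookies(uri))
        {
            target.Add(cookie);
        }
        return target;
    }

    public static void Main()
    {
        // Hypothetical site and cookie, standing in for a real login session.
        var uri = new Uri("https://example.com/reports");
        var source = new CookieContainer();
        source.Add(new Cookie("session", "abc123", "/", "example.com"));

        var copy = CopyCookiesFor(uri, source);
        Console.WriteLine(copy.GetCookies(uri).Count);
    }
}
```

GetCookies only returns the cookies whose domain and path match the URI you pass in, so cookies belonging to other sites are not carried along.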
#3 – Create the Login Procedure
This leaves us with the code required to actually manipulate the login form.
To understand how to make it all work, you have to understand how the anti-CSRF feature works in ASP.NET MVC. When you make a standard render request to the URL, MVC generates a RequestVerificationToken that the server later checks against a matching cookie it issued with the same response. Think of it as a short-term, auto-generated handshake with the server, loosely like a private key/public key interaction. To make this feature seamless, the token is added to the form as a hidden input value and submitted along with the rest of the form information without the user ever being the wiser. You have probably used this feature hundreds of times without ever noticing, unless you were inclined to look at the DOM.
That being said, we first need to perform a GET request on the page with the login form, with a CookieContainer enabled for that request. For functional reasons, I have set things up to receive the form values as a NameValueCollection. We then need to add the __RequestVerificationToken to that collection, and its value can be found in the form on the page itself.
This led to one very weird quirk. I used HtmlAgilityPack to navigate the HtmlDocument with XPath, but as it turns out the package returns zero children for form elements by default. So I had to explicitly change the ElementsFlags, removing the "form" entry before loading the document.
Once that is fixed, you can reliably target the __RequestVerificationToken input node and get its value. Once you have it, you can add it to the NameValueCollection that contains the rest of your form input values.
    public void Login(string loginPageAddress, NameValueCollection loginData)
    {
        CookieContainer container;
        var request = (HttpWebRequest)WebRequest.Create(loginPageAddress);
        request.Method = "GET";
        container = request.CookieContainer = new CookieContainer();

        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        var stream = response.GetResponseStream();
        using (var reader = new StreamReader(stream))
        {
            // Let HtmlAgilityPack see the children of <form> elements.
            HtmlNode.ElementsFlags.Remove("form");
            string html = reader.ReadToEnd();
            var doc = new HtmlDocument();
            doc.LoadHtml(html);

            var input = doc.DocumentNode.SelectSingleNode("//*[@name='__RequestVerificationToken']");
            var token = input.Attributes["value"].Value;
            loginData.Add("__RequestVerificationToken", token);
            loginData.Add("returnURL", "");
        }
        ...
    }
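The token lookup can also be exercised in isolation against a literal HTML string, which is handy for checking the XPath before pointing it at a real site. This is a rough sketch that assumes the HtmlAgilityPack NuGet package is installed. TokenDemo, ExtractToken, and the sample markup are hypothetical, and unlike the code above it returns null instead of throwing when the field is missing.

```csharp
using System;
using HtmlAgilityPack; // NuGet package, not part of the .NET base library

public static class TokenDemo
{
    // Returns the anti-forgery token found in `html`, or null when no such field exists.
    public static string ExtractToken(string html)
    {
        // Per the quirk described in the post, <form> may be parsed as
        // childless unless its entry is removed from ElementsFlags first.
        HtmlNode.ElementsFlags.Remove("form");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var input = doc.DocumentNode.SelectSingleNode("//input[@name='__RequestVerificationToken']");
        return input == null ? null : input.GetAttributeValue("value", (string)null);
    }

    public static void Main()
    {
        // Hypothetical markup, shaped like a typical MVC login form.
        string html =
            "<form action='/Account/Login' method='post'>" +
            "<input name='__RequestVerificationToken' type='hidden' value='sample-token' />" +
            "<input name='UserName' type='text' />" +
            "</form>";

        Console.WriteLine(ExtractToken(html) ?? "(no token found)");
    }
}
```

Returning null for a missing field makes it easier to tell "the page has no token" apart from a parsing failure further down the line.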
We now have everything we need, including the token. But this token was issued for the previous request, which is fortunately tied to the CookieContainer that you created earlier. As with GetPage(), if we iterate through the cookies from the previous response and add them to the current request, the form token stays matched with the cookie the server issued alongside it, which satisfies the MVC anti-CSRF check. This all needs to be packaged into a POST request that includes your NameValueCollection and stores that request's CookieContainer for future consumption by GetPage().
    public void Login(string loginPageAddress, NameValueCollection loginData)
    {
        ...
        var request2 = (HttpWebRequest)WebRequest.Create(loginPageAddress);
        request2.CookieContainer = new CookieContainer();
        // Reuse the cookies issued with the login form so the token still matches.
        foreach (Cookie cookie in response.Cookies)
        {
            request2.CookieContainer.Add(cookie);
        }
        request2.Method = "POST";
        request2.ContentType = "application/x-www-form-urlencoded";

        var buffer = Encoding.ASCII.GetBytes(GenerateQueryString(loginData));
        request2.ContentLength = buffer.Length;
        using (var requestStream = request2.GetRequestStream())
        {
            requestStream.Write(buffer, 0, buffer.Length);
        }

        var response2 = request2.GetResponse();
        response2.Close();

        // Keep the authenticated cookies for later calls to GetPage().
        CookieContainer = request2.CookieContainer;
    }
    // Requires: using System.Collections.Specialized; using System.Linq; using System.Net;
    public string GenerateQueryString(NameValueCollection collection)
    {
        var array = (from key in collection.AllKeys
                     from value in collection.GetValues(key)
                     select string.Format("{0}={1}", WebUtility.UrlEncode(key), WebUtility.UrlEncode(value))).ToArray();
        return string.Join("&", array);
    }
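To see what actually goes into the POST body, the query-string helper can be run against a small collection. This is a standalone sketch: QueryStringDemo is a made-up wrapper class, and the UserName and Password field names and values are assumptions, so substitute whatever input names your login form really uses.

```csharp
using System;
using System.Collections.Specialized;
using System.Linq;
using System.Net;

public static class QueryStringDemo
{
    // Same idea as GenerateQueryString above: URL-encode every key/value pair
    // and join the pairs with '&'.
    public static string GenerateQueryString(NameValueCollection collection)
    {
        var pairs = from key in collection.AllKeys
                    from value in collection.GetValues(key)
                    select WebUtility.UrlEncode(key) + "=" + WebUtility.UrlEncode(value);
        return string.Join("&", pairs);
    }

    public static void Main()
    {
        // Hypothetical form fields for illustration only.
        var loginData = new NameValueCollection();
        loginData.Add("UserName", "jane");
        loginData.Add("Password", "p@ss word");

        Console.WriteLine(GenerateQueryString(loginData));
    }
}
```

Because every value is URL-encoded, characters such as '@' and spaces in a password arrive at the server intact rather than breaking the form body.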
Hello Matthew, I was wondering if you have the full code for this. I am stuck at HtmlDocument.
using HtmlAgilityPack;
@Puzzle, you need to get the HtmlAgilityPack for the HtmlDocument. Try Nuget: https://www.nuget.org/packages/HtmlAgilityPack/