Chapter 14
Networking

Having moved twice with my family in the last year, I’ve been struck by the similarities between that process and how data moves around on our networks. Packing a truck for the first move—from Oregon to California—was certainly an exercise in compression! At times I didn’t think we’d manage to get everything into our van and the 20-foot truck we’d rented, but somehow it all fit with just inches to spare. Then we had the long drive south before decompressing the data, so to speak, into a house where we lived for a time while the builders were finishing up our permanent home.

Moving into the new house was a different experience, mainly because it was only about 200 meters away and we were able to move bits and pieces at different times. When the final inspection was signed off and we could (legally) move all the furniture and boxes, the task was accomplished with the help of many friends and many random vehicles, with very little compression of our stuff along the way!

These two moves illustrate, in some loose way, the character of different networking transports, such as datagram sockets and stream sockets, one of the topics we’ll visit here. In the first case we see a large (compressed) data packet moving in toto from one endpoint to another. In the second we instead have more of a data streaming experience, with some compression to optimize each trip between the endpoints but very different in nature from the first experience.

Along these same lines, consider how you might hire movers to do all this work—you show them your stuff, give them the destination address, write them a large check, and magically all that stuff shows up in another location. Such a process is reminiscent of the background transfer APIs in WinRT that will be one of the first subjects of this chapter. In moving one’s household, there is also the concept of renting and packing “pods,” in which case you do the packing yourself but the transportation (and possibly storage) is handled separately. But whether this has anything whatsoever to do with writing a Windows 8 app, I’m not so sure!

Lest I dig too deep a hole with such analogies, let’s just say that networking is a rich and varied topic, all of which is based on the need to get data from one place to another in a variety of ways. The goal of this chapter, then, is to provide at least an overview of the networking capabilities in Windows 8. Topics will range from XmlHttpRequests, background transfers, authentication and credentials, and syndication, to connectivity and network information, offline functionality, and sockets. We’ll focus most on areas of common concern for most apps, touching briefly on other areas that are more specific to certain scenarios and for which there are many additional resources to draw upon. One such resource is Developing connected apps, which serves as a good overview of networking in general.

In any case, we’ll begin here with the subject of connectivity, because without it, there isn’t much to speak about with networking at all.

Network Information and Connectivity

In the previous chapter, I mentioned in a footnote how at the very time I was writing on the subject of live tiles and all the connections that Windows 8 apps can have to the Internet, my home and many thousands of others in Northern California were completely disconnected due to a fiber optic breakdown. The outage lasted for what seemed like an eternity by present standards—36 hours! Although I wasn’t personally at a loss for how to keep myself busy, there was a time when I opened one of my laptops, found that our service was still down, and wondered for a moment just what the computer was really good for without connectivity! Clearly I’ve grown, as I suspect you have too, to take constant connectivity completely for granted.

As developers of great apps, however, we cannot afford to be so complacent. It’s always important to handle errors when trying to make connections and draw from online resources, because any number of problems can arise within the span of a single operation. But it goes much deeper than that. It’s our job to make our apps as useful as they can be when connectivity is lost, perhaps just because our customers got on an airplane and switched on airplane mode. That is, don’t give customers a reason to wonder about the usefulness of their device in such situations! A great app will prove its worth through a great user experience even if it lacks connectivity.

Connectivity can also vary throughout an app session, where an app can often be suspended and resumed, or suspended for a long time. With mobile devices especially, one might switch between any number of networks without necessarily knowing about it. Windows 8, in fact, tries to make the transition between networks as transparent as possible, except where it’s important to inform the user that there may be costs associated with the current network. It’s required by Windows Store policy, in fact, for apps to be aware of data transfer costs on metered networks and to prevent “bill shock” from not-always-generous mobile broadband providers. Just as there are certain things an app can’t always do when the device is offline, the characteristics of the current network might also cause it to defer or avoid certain operations.

Let’s now see how to retrieve and work with connectivity details, starting with the different types of networks represented in the manifest, followed by obtaining network information, dealing with metered networks, and providing for an offline experience.

Network Types in the Manifest

Nearly every sample we’ve worked with so far in this book has had the Internet (Client) capability declared in its manifest, thanks to Visual Studio turning that on by default. Once before I mentioned how this wasn’t always the case: early app builders within Microsoft would occasionally scratch their heads wondering just why something really obvious—like making a simple XmlHttpRequest to a blog—failed outright. Without this capability, there just isn’t any Internet!

Still, Internet (Client) isn’t the only player in the capabilities game. Some networking apps will also want to act as a server to receive incoming traffic from the Internet, and not just make requests to other servers. In those cases—such as file sharing, media servers, VoIP, chat, multiplayer/multicast games, and other bi-directional scenarios involving incoming network traffic, as with sockets—the app must declare the Internet (Client & Server) capability, as shown in Figure 14-1. This lets such traffic through the inbound firewall, though critical ports are always blocked.

There is also network traffic that occurs on a private network, as in a home or business, where the Internet isn’t involved at all. For this there is also the Private Networks (Client & Server) capability, also shown in Figure 14-1, which is good for file or media sharing, line-of-business apps, HTTP client apps, multiplayer games on a LAN, and so on. What makes any given IP address part of this private network depends on many factors, all of which are described on How to configure network isolation capabilities. For example, IPv4 addresses in the ranges of 10.0.0.0–10.255.255.255, 172.16.0.0–172.31.255.255, and 192.168.0.0–192.168.255.255 are considered private. Users can flag a network as trusted, and the presence of a domain controller makes the network private as well. Whatever the case, if a device’s network endpoint falls into this category, the behavior of apps on that device is governed by this capability rather than those related to the Internet.
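
To make the IPv4 ranges above concrete, here is a minimal sketch of a range check. The isPrivateIPv4 helper is hypothetical (it is not a WinRT API), and it covers only the three ranges just listed; as noted, Windows weighs many other factors when deciding whether an endpoint belongs to a private network.

```javascript
// Illustrative helper (not a WinRT API): tests an IPv4 string against the
// three private ranges listed in the text. Windows itself considers more factors.
function isPrivateIPv4(address) {
    var parts = address.split(".");
    if (parts.length !== 4) { return false; }

    var octets = [];
    for (var i = 0; i < 4; i++) {
        // Reject anything that is not a plain decimal octet (0-255)
        if (!/^\d{1,3}$/.test(parts[i])) { return false; }
        var n = parseInt(parts[i], 10);
        if (n > 255) { return false; }
        octets.push(n);
    }

    return octets[0] === 10 ||                                        // 10.0.0.0 to 10.255.255.255
        (octets[0] === 172 && octets[1] >= 16 && octets[1] <= 31) ||  // 172.16.0.0 to 172.31.255.255
        (octets[0] === 192 && octets[1] === 168);                     // 192.168.0.0 to 192.168.255.255
}
```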

FIGURE 14-1 Additional network capabilities in the manifest.

Regardless of the capabilities declared in the manifest, local loopback—that is, using http://localhost URIs—is blocked for Windows Store apps. An exception is made for machines on which a developer license has been installed, as described in Chapter 13, “Tiles, Notifications, the Lock Screen, and Background Tasks,” in the section “Using the Localhost.” This exception exists only to simplify debugging apps and services together, because they can all be running on a single machine during development.

Network Information (the Network Object Roster)

Regardless of the network involved, everything you want to know about that network is available through the Windows.Networking.Connectivity.NetworkInformation object. Besides a single networkstatuschanged event that we’ll discuss in “Connectivity Events” a little later, the interface of this object is made up of methods to retrieve more specific details in other objects.

Fulfilling my earlier promise to just touch on some specifics, below is the roster of the methods in NetworkInformation and the contents of the objects obtained through them. You can exercise the most common of these APIs through the indicated scenarios of the Network information sample:

getHostNames Returns a vector of Windows.Networking.HostName objects, one for each connection, that provides various name strings (displayName, canonicalName, and rawName), the name’s type (from HostNameType, with values of domainName, ipv4, ipv6, and bluetooth), and an ipInformation property (of type IPInformation) containing prefixLength and networkAdapter properties for IPv4 and IPv6 hosts. (The latter is a NetworkAdapter object with various low-level details.) The HostName class is used in various networking APIs to identify a server or some other endpoint.

getConnectionProfiles (Scenario 3) Returns a vector of ConnectionProfile objects, one for each connection, among which will be the active Internet connection as returned by getInternetConnectionProfile. Also included are any wireless connections you’ve made in the past for which you indicated Connect Automatically. (In this way the sample will show you some details of where you’ve been recently!) See the next section for more on ConnectionProfile.

getInternetConnectionProfile (Scenario 1) Returns a single ConnectionProfile object for the currently active Internet connection. If there is more than one connection, this method returns the profile of the preferred one that is most likely to be used for Internet traffic.

getLanIdentifiers (Scenario 4) Returns a vector of LanIdentifier objects, each of which contains an infrastructureId (LanIdentifierData containing a type and value), a networkAdapterId (a GUID), and a portId (LanIdentifierData).

getProxyConfigurationAsync Returns a ProxyConfiguration object for a given URI and the current user. The properties of this object are canConnectDirectly (a Boolean) and proxyUris (a vector of Windows.Foundation.Uri objects for the configuration).

getSortedEndpointPairs Sorts an array of EndpointPair objects according to HostNameSortOptions. An EndpointPair contains a host and service name for local and remote endpoints, typically obtained when you set up specific connections like sockets. The two sort options are none and optimizeForLongConnections, which vary connection behaviors based on whether the app is making short- or long-duration connections. See the documentation for EndpointPair and HostNameSortOptions for more details.
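
As a rough illustration of the roster above, the following sketch gathers a few of these values into a text summary. The summarizeNetwork function is a hypothetical helper; it takes the NetworkInformation object as an argument and assumes the returned vectors can be indexed like arrays with a length property, as WinRT collections generally can be in JavaScript.

```javascript
// Builds a short text summary from a NetworkInformation-like object.
// In an app, pass Windows.Networking.Connectivity.NetworkInformation.
function summarizeNetwork(networkInfo) {
    var lines = [];

    var hostNames = networkInfo.getHostNames();
    for (var i = 0; i < hostNames.length; i++) {
        lines.push("Host: " + hostNames[i].displayName);
    }

    var profiles = networkInfo.getConnectionProfiles();
    for (var j = 0; j < profiles.length; j++) {
        lines.push("Profile: " + profiles[j].profileName);
    }

    // The Internet profile is null when there is no active Internet connection
    var internetProfile = networkInfo.getInternetConnectionProfile();
    lines.push("Internet: " + (internetProfile ? internetProfile.profileName : "(none)"));

    return lines.join("\r\n");
}
```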

The ConnectionProfile Object

Of all the information available through the NetworkInformation object, the most important for apps is found in ConnectionProfile, most frequently that returned by getInternetConnectionProfile because that’s the one through which an app’s Internet traffic will flow. The profile is what contains all the information you need to make decisions about how you’re using the network, especially for cost awareness. It’s also what you’ll typically check when there’s a change in network status. Scenarios 1 and 3 of the Network information sample retrieve and display most of these details.

Each profile has a profileName property (a string), such as “Ethernet” or the SSID of your wireless access point, plus a getNetworkNames method that returns a vector of friendly names for the endpoint. The networkAdapter property contains a NetworkAdapter object for low-level details, should you want them, and the networkSecuritySettings property contains a NetworkSecuritySettings object with properties describing authentication and encryption types.

More generally interesting is the getNetworkConnectivityLevel method, which returns a value from the NetworkConnectivityLevel enumeration: none (no connectivity), localAccess (the level you hate to see when you’re trying to get a good connection!), constrainedInternetAccess (captive portal connectivity, typically requiring further credentials as is often encountered in hotels, airports, etc.), and internetAccess (the state you’re almost always trying to achieve). The connectivity level is often a factor in your app logic and something you typically watch with network status changes.
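
A minimal sketch of how app logic might act on the connectivity level follows. The getConnectivityState function and the state names it returns are hypothetical, and the fallback enumeration object exists only so that the sketch can run outside a Windows Store app.

```javascript
// Use the real WinRT enumeration in an app; the fallback values are a
// stand-in so the sketch also runs where the Windows namespace is absent.
var ConnectivityLevel = (typeof Windows !== "undefined")
    ? Windows.Networking.Connectivity.NetworkConnectivityLevel
    : { none: 0, localAccess: 1, constrainedInternetAccess: 2, internetAccess: 3 };

// Maps a ConnectionProfile (or null) to a state the app can act on.
function getConnectivityState(profile) {
    if (!profile) { return "offline"; }

    switch (profile.getNetworkConnectivityLevel()) {
        case ConnectivityLevel.internetAccess:
            return "online";
        case ConnectivityLevel.constrainedInternetAccess:
            return "needsSignIn";   // for example, a captive portal
        default:
            return "offline";       // none or localAccess
    }
}
```

In an app you would typically call this with the result of NetworkInformation.getInternetConnectionProfile().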

To track the inbound and outbound traffic on a connection, the getLocalUsage method returns a DataUsage object that contains bytesReceived and bytesSent, either for the lifetime of the connection or for a specific time period. Similarly, the getConnectionCost and getDataPlanStatus methods provide the information an app needs to be aware of how much network traffic is happening and how much it might cost the user. We’ll come back to this in “Cost Awareness” shortly, including how to see per-app usage in Task Manager.

Connectivity Events

It is very common for a running app to want to know when connectivity changes. This way it can take appropriate steps to disable or enable certain functionality, alert the user, synchronize data after being offline, and so on. For this, apps need only watch the NetworkInformation.onnetworkstatuschanged event, which is fired whenever there’s a significant change within the hierarchy of objects we’ve just seen (and be mindful that this event comes from a WinRT object). For example, the event will be fired if the connectivity level of a profile changes. It will also be fired if the Internet profile itself changes, as when a device roams between different networks, or when a metered data plan is approaching or has exceeded its limit, at which point the user will start worrying about every megabyte of traffic. In short, you’ll generally want to listen for this event to refresh any internal state of your app that’s dependent on network characteristics and set whatever flags you use to configure the app’s networking behavior. This is especially important for transitioning between online and offline and between unlimited and metered networks; Windows, for its part, also watches this event to adjust its own behavior, as with the Background Transfer APIs.

Note Windows Store apps written in JavaScript can also use the basic window.ononline and window.onoffline events to track connectivity. The window.navigator.onLine property is also true or false accordingly. These events, however, will not alert you to changes in connection profiles, cost, or other aspects that aren’t related to the basic availability of an Internet connection.

You can play with networkstatuschanged in Scenario 5 of the Network information sample. As you connect and disconnect networks or make other changes, the sample will update its details output for the current Internet profile if one is available (code condensed from js/network-status-change.js):

var networkInfo = Windows.Networking.Connectivity.NetworkInformation;
// Remember to removeEventListener for this event from WinRT as needed
networkInfo.addEventListener("networkstatuschanged", onNetworkStatusChange);

function onNetworkStatusChange(sender) {
    // Build the output locally so that nothing leaks into the global scope
    var internetProfileInfo = "Network Status Changed:\r\n";
    var internetProfile = networkInfo.getInternetConnectionProfile();

    if (internetProfile === null) {
        // Error message (no Internet connection profile is available)
    } else {
        // getConnectionProfileInfo is a helper defined in the sample
        internetProfileInfo += getConnectionProfileInfo(internetProfile) + "\r\n";
        // display info
    }
}

Of course, listening for this event is useful only if the app is actually running, but what if it isn’t? In that case an app needs to register a background task, as discussed at the end of Chapter 13, for the networkStateChange trigger, typically applying the internetAvailable or internetNotAvailable conditions as needed. The Network status background sample provides a demonstration of this, declaring a background task in its manifest with a C# entry point of NetworkStatusTask.NetworkStatusBackgroundTask. The task is registered in js/network-status-with-internet-present.js (using helpers in js/global.js as typical for the background task samples):

BackgroundTaskSample.registerBackgroundTask(BackgroundTaskSample.sampleBackgroundTaskEntryPoint,
    BackgroundTaskSample.sampleBackgroundTaskWithConditionName,
    new Windows.ApplicationModel.Background.SystemTrigger(
        Windows.ApplicationModel.Background.SystemTriggerType.networkStateChange, false),
    new Windows.ApplicationModel.Background.SystemCondition(
        Windows.ApplicationModel.Background.SystemConditionType.internetAvailable));

The background task in BackgroundTask.cs simply writes the Internet profile name and network adapter id to local app data in response to the trigger. These values are output to the display within the completeHandler in js/global.js. A real app would clearly take more meaningful action, such as activating background transfers for data synchronization when connectivity is restored. The basic structure is there in the sample nonetheless.

It’s also very important to remember that network status might have changed while the app was suspended. Apps that watch the networkstatuschanged event should also refresh their connectivity-related state within their resuming handler.
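
One way to keep both paths consistent is to route them through a single refresh function, as in this minimal sketch. The watchConnectivity helper and the refreshState callback are hypothetical names; in an app you would pass Windows.Networking.Connectivity.NetworkInformation and Windows.UI.WebUI.WebUIApplication as the first two arguments.

```javascript
// Wires one refresh callback to both the networkstatuschanged and resuming
// events. refreshState is the app's own function and receives the current
// Internet ConnectionProfile (or null).
function watchConnectivity(networkInfo, webUIApp, refreshState) {
    function onChange() {
        refreshState(networkInfo.getInternetConnectionProfile());
    }

    networkInfo.addEventListener("networkstatuschanged", onChange);
    webUIApp.addEventListener("resuming", onChange);

    onChange();   // establish the initial state on startup

    // Return a function that removes both listeners when no longer needed
    return function stopWatching() {
        networkInfo.removeEventListener("networkstatuschanged", onChange);
        webUIApp.removeEventListener("resuming", onChange);
    };
}
```

Returning a cleanup function is a design choice that makes it easy to honor the earlier reminder about removing listeners on WinRT objects.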

As a final note, check out the Troubleshooting and debugging network connections topic in the documentation, which has a little more guidance on responding to network changes as well as network errors.

Cost Awareness

If you ever crossed between 3G roaming territories with a smartphone that’s set to automatically download email, you probably learned the hard way to disable syncing in such circumstances. I once drove from Washington State into Canada without realizing that I would suddenly be paying $15/megabyte for the privilege of downloading large email attachments. Of course, since I’m a law-abiding citizen I did not look at my phone while driving (wink-wink!) to notice the roaming network. Well, a few weeks later I knew what “bill shock” was all about!

The point here is that if users conclude that your app is responsible for similar behavior, regardless of whether it’s actually true, the kinds of ratings and reviews you’ll receive in the Windows Store won’t be good! It’s vital, then, to pay attention to changes in the cost of the connection profiles you’re using, typically the Internet profile. Always check these details on startup, within your networkstatuschanged event handler, and within your resuming handler.

You—and all of your customers, I might add—can track your app’s network usage in the App History tab of Task Manager, as shown below. Make sure you’ve expanded the view by tapping More Details on the bottom left if you don’t see this view. You can see that it shows Network and Metered Network usage along with the traffic due to tile updates:

Image

Programmatically, as noted before, the profile provides usage information through its getConnectionCost and getDataPlanStatus methods. The first method returns a ConnectionCost object with four properties:

networkCostType A NetworkCostType value, one of unknown, unrestricted (no extra charges), fixed (unrestricted up to a limit), and variable (charged on a per-byte or per-megabyte basis).

roaming A Boolean indicating whether the connection is to a network outside of your provider’s normal coverage area, meaning that extra costs are likely involved. An app should be very conservative with network activity when this is true.

approachingDataLimit A Boolean that indicates that data usage on a fixed type network (see networkCostType) is getting close to the limit of the data plan.

overDataLimit A Boolean indicating that a fixed data plan’s limit has been exceeded and overage charges are definitely in effect. When this is true, an app should be very conservative with network activity, as when roaming is true.

The second method, getDataPlanStatus, returns a DataPlanStatus object with these properties:

dataPlanLimitInMegabytes The maximum data transfer allowed for the connection in each billing cycle.

dataPlanUsage A DataPlanUsage object with an all-important megabytesUsed property and a lastSyncTime (UTC) indicating when megabytesUsed was last updated.

maxTransferSizeInMegabytes The maximum recommended size of a single network operation. This property reflects not so much the capacities of the metered network itself (as its documentation suggests), but rather an appropriate upper limit to transfers on that network.

nextBillingCycle The UTC date and time when the next billing cycle on the plan kicks in and resets dataPlanUsage to zero.

inboundBitsPerSecond and outboundBitsPerSecond Indicate the nominal transfer speed of the connection.

With all these properties you can make intelligent decisions about your app’s network activity and/or warn the user about possible overage charges. Clearly, when the networkCostType is unrestricted, you can really do whatever you want. On the other hand, when the type is variable and the user is paying for every byte, especially when roaming is true, you’ll want to inform the user of that status and provide settings through which the user can limit the app’s network activity, if not halt that activity entirely. After all, the user might decide that certain kinds of data are worth having. For example, they should be able to set the quality of a streaming movie, indicate whether to download email messages or just headers, indicate whether to download images, specify whether caching of online data should occur, turn off background streaming audio, and so on.

Such settings, by the way, might include tile, badge, and other notification activities that you might have established, as those can generate network traffic. If you’re also using background transfers, you can set the cost policies for downloads and uploads as well.

An app can, of course, ask the user’s permission for any given network operation. It’s up to you and your designers to decide when to ask and how often. Windows Store policy, for its part (section 4.5), requires that you ask the user for any transfer exceeding one megabyte when roaming and overDataLimit are both true, and when performing any transfer over maxTransferSizeInMegabytes.

On a fixed type network, where data is unrestricted up to dataPlanLimitInMegabytes, we find cases where a number of the other properties become interesting. For example, if overDataLimit is already true, you can ask the user to confirm additional network traffic or just defer certain operations until the nextBillingCycle. Or, if approachingDataLimit is true (or even when it’s not), you can determine whether a given operation might exceed that limit. This is where the connection profile’s getLocalUsage method comes in handy to obtain a DataUsage object for a given period (see How to retrieve connection usage information for a specific time period). Call getLocalUsage with the time period between lastSyncTime and the current time (new Date() in JavaScript). Convert the returned byte counts to megabytes, add that value to megabytesUsed, and subtract the result from dataPlanLimitInMegabytes. This tells you how much more data you can transfer before incurring extra costs, thereby providing the basis for asking the user, “Downloading this file will exceed your data plan limit. Do you want to proceed?”
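
The arithmetic just described might be sketched as follows. The getRemainingMegabytes function is a hypothetical helper, and treating a megabyte as 1024 × 1024 bytes is my assumption; the function returns null when the profile doesn’t report enough plan information to make the calculation.

```javascript
var BYTES_PER_MEGABYTE = 1024 * 1024;   // assumption: binary megabytes

// Estimates how many megabytes remain in the data plan, or returns null
// when the profile does not report a limit or usage figures.
function getRemainingMegabytes(profile, now) {
    var plan = profile.getDataPlanStatus();
    if (!plan || !plan.dataPlanUsage || plan.dataPlanLimitInMegabytes == null) {
        return null;
    }

    // Traffic since the provider last updated megabytesUsed
    var recent = profile.getLocalUsage(plan.dataPlanUsage.lastSyncTime, now);
    var recentMegabytes = (recent.bytesReceived + recent.bytesSent) / BYTES_PER_MEGABYTE;

    var used = plan.dataPlanUsage.megabytesUsed + recentMegabytes;
    return plan.dataPlanLimitInMegabytes - used;
}
```

A caller could compare the result against the size of a pending download, for example getRemainingMegabytes(profile, new Date()), before deciding whether to prompt the user.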

For simplicity’s sake, you can think of cost awareness in terms of three behaviors: normal, conservative, and opt-in, which are described on Managing connections on metered networks and, more broadly, on Developing connected apps. Both topics provide additional guidance on making the kinds of decisions described here already. In the end, saving the user from bill shock—and designing a great user experience around network costs—is definitely an essential investment.
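
As a minimal sketch of those three behaviors, the following function maps a ConnectionCost object to one of them. The getCostBehavior name, the returned strings, and the exact mapping are my assumptions about one reasonable policy rather than a prescribed one; the fallback enumeration object exists only so the sketch can run outside a Windows Store app.

```javascript
// Use the real WinRT enumeration in an app; the fallback values are a
// stand-in so the sketch also runs where the Windows namespace is absent.
var CostType = (typeof Windows !== "undefined")
    ? Windows.Networking.Connectivity.NetworkCostType
    : { unknown: 0, unrestricted: 1, fixed: 2, variable: 3 };

// Classifies a ConnectionCost as "normal", "conservative", or "optIn".
function getCostBehavior(cost) {
    // Roaming or over the limit: extra charges are likely, so ask first
    if (cost.roaming || cost.overDataLimit) {
        return "optIn";
    }
    // Metered but within the plan: limit traffic to what matters
    if (cost.networkCostType === CostType.fixed ||
        cost.networkCostType === CostType.variable) {
        return "conservative";
    }
    // Unrestricted or unknown
    return "normal";
}
```

In an app, the argument would come from the Internet profile’s getConnectionCost method, and the result would be refreshed on startup, on networkstatuschanged, and on resuming.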

You may be thinking, “OK, so I get the need for my app to behave properly with metered networks, but how do I test such conditions without signing up with some provider and paying them a bunch of money (including roaming fees) while I’m doing my testing?” The simple answer is that you can simulate the behavior of metered networks with any Wi-Fi connection. First, invoke the Settings charm and tap on your network connection near the bottom (see below left, specifically the upper left icon, shown here as “Nuthatch”). In the Networks pane that then opens up (below right), right-click a wireless connection to open the menu and then select Set As Metered Connection:

Image

Although this option will not set up DataUsage properties and all that a real metered network might provide, it will return a networkCostType of fixed, which allows you to see how your app responds. You can also use the Show Estimated Data Usage menu item to watch how much traffic your app generates during its normal operation, and you can reset the counter so that you can take some accurate readings:

Image

Running Offline

The other user experience that is sure to earn your app a better reputation is how it behaves when there is no connectivity or when there’s a change in connectivity. Ask yourself the following questions:

• What happens if your app starts without connectivity, both from tiles (primary and secondary) and through contracts such as search, share, and the file picker?

• What happens if your app runs the first time without connectivity?

• What happens if connectivity is lost while the app is running?

• What happens when connectivity comes back?

As described above in the “Connectivity Events” section, you can use the networkstatuschanged event to handle these situations while running and your resuming handler to check if connection status changed while the app was suspended. If you have a background task tied to the networkStateChange trigger, it would primarily save state that your resuming handler would then check.

It’s perfectly understood that some apps just can’t run without connectivity, in which case it’s appropriate to inform the user of that situation when the app is launched or when connectivity is lost while the app is running. In other situations, an app might be partially usable, in which case you should inform the user more on a case-by-case basis, allowing them to use unaffected parts of the app. Better still is to cache data that might make the app even more useful when connectivity is lost. Such data might even be built into the app package so that it’s always available on first launch.

Consider the case of an ebook reader app that would generally acquire new titles from an online catalog. For offline use it would do well to cache copies of the user’s titles locally, rather than rely solely on having a good Internet connection. The app’s publisher might also include a number of popular free titles directly in the app package such that a user could install the app while waiting to board a plane and have at least those books ready to go when the app is first launched at 30,000 feet. Other apps might include some set of preinstalled data at first and then add to that data over time (perhaps through in-app purchases) when unrestricted networks are available. By following network costs closely, such an app might defer downloading a large data set until either the user confirms the action or a different connection is available.

How and when to cache data from online resources is probably one of the fine arts of software development. When do you download it? How much do you acquire? Where do you store it? Should you place an upper limit on the cache? Do you allow changes to cached data that would need to be synchronized with a service when connectivity is restored? These are all good questions to ask, and certainly there are others to ask as well. Let me at least offer a few thoughts and suggestions.

First, you can use any network transport to acquire data to cache, such as WinJS.xhr, the background transfer API, or the HTML5 AppCache mechanism, which works well for web content you load up in iframe elements. Note that using the AppCache requires that the URIs in question are declared in the manifest as ApplicationContentUris (see Chapter 3, “App Anatomy and Page Navigation”). Separately, other content acquired from remote resources, such as images, is also cached automatically like typical temporary Internet files. Even remote script downloaded within a web context iframe is cached this way. Both caching mechanisms are subject to the storage limits defined by Internet Explorer.

How much data you cache depends, certainly, on the type of connection you have and the relative importance of the data. On an unrestricted network, feel free to acquire everything you feel the user might want offline, but it would be a good idea to provide settings to control that behavior, such as overall cache size or the amount of data to acquire per day. I mention the latter because even though my own Internet connection appears to the system as unrestricted, I’m charged more as my usage reaches certain tiers (on the order of gigabytes). As a user, I would appreciate having a say in matters that involve significant network traffic.

Even so, if caching specific data will greatly enhance the user experience, separate that option to give the user control over the decision. For example, an ebook reader might automatically download a whole title while the reader is perhaps just browsing the first few pages. Of course, this would also mean consuming more storage space. Letting users control this behavior as a setting, or even on a per-book basis, lets them decide what’s best. For smaller data, on the other hand—say, in the range of several hundred kilobytes—if you know from analytics that a user that views one set of data is highly likely to view another, automatically acquiring and caching those additional data sets could be the right design.

The best place to store cached data is your app data folders, specifically the LocalFolder and TemporaryFolder. Avoid using the RoamingFolder to cache data acquired from online sources: besides running the risk of exceeding the roaming quota (see Chapter 8, “State, Settings, Files, and Documents”), it’s also quite pointless. Because the system would have to roam such data over the network anyway, it’s better to just have the app re-acquire it when it needs to. The same applies to in-app purchases: because the user can easily acquire those purchases through the Windows Store on another machine (where the app on that machine would find that those purchases are already paid for), they need not be roamed.

Whether you use the LocalFolder or TemporaryFolder depends on how essential the data is to the operation of the app. If the app cannot run without the cache—such as the cookbook app I mentioned earlier—use local app data. If the cache is just an optimization such that the user could reclaim that space with the Disk Cleanup tool, store the cache in the TemporaryFolder and rebuild it again later on. (Be aware once again that IndexedDB, as described in Chapter 8, has a per-app limit and an overall system limit. If this is a potential problem, you might want to choose a different storage mechanism.)
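
The choice between the two folders can be reduced to a single decision point, as in this small sketch. The getCacheFolder helper and its isEssential flag are hypothetical; in an app you would pass Windows.Storage.ApplicationData.current as the appData argument.

```javascript
// Picks the app data folder for a cache: local app data when the app cannot
// run without the cache, temporary app data when the cache is only an
// optimization that can be rebuilt later.
function getCacheFolder(appData, isEssential) {
    return isEssential ? appData.localFolder : appData.temporaryFolder;
}
```

Keeping the decision in one place means the rest of the caching code doesn’t need to know which folder is in use, and that the cache must be treated as possibly empty whenever the temporary folder is chosen.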

In all of this, also consider that what you’re caching really might be user data that you’d want to store outside of your app data folders. That is, be sure to think through the distinction between app data and user data!

Finally, you might again have the kind of app that allows offline activity (like processing email) where you will have been caching the results of that activity for later synchronization with an online resource. When connectivity is restored, then, check if the network cost is suitable before starting your sync process.

In Chapter 13 we looked at how an app appears “alive with activity” through features such as live tiles and notifications. Clearly, periodic updates and push notifications are completely dependent on connectivity and will not operate without it; a running app, on the other hand, can still issue updates when the device is offline. Under such a circumstance, the app must avoid referencing remote images in the update, because these will not be resolved without connectivity and the tile and toast systems do not presently support the use of local fallback images. An app should thus check connectivity status before issuing an update and should make sure to use local (ms-appx:/// or ms-appdata:///) images instead of remote ones or opt for text-only tile and toast templates.

XmlHttpRequest

As we’ve seen a number of times already in this book, transferring data to and from web services with XmlHttpRequest is a common activity for Windows Store apps, especially those written in JavaScript for which handling XML and/or JSON is simple and straightforward. This is especially true when using the WinJS.xhr wrapper that turns the whole process into a simple promise.

To build on what we already covered in Chapter 3, in the section “Data from Services and WinJS.xhr,” there are a few other points to make where such requests are concerned, most of which come from the section in the documentation entitled Connecting to a web service.

First, Downloading different types of content provides the details of the different content types supported by XHR for Windows Store apps. These are summarized here:

[Table: content types supported by XHR for Windows Store apps]

Second, know that XHR responses can be automatically cached, meaning that later requests to the same URI might return old data. To resend the request despite the cache, add an If-Modified-Since HTTP header as shown on How to ensure that WinJS.xhr resends requests.
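As a minimal sketch of that technique, the helper below builds a WinJS.xhr options object carrying an If-Modified-Since header with a date well in the past, so the request is sent again rather than answered from the cache. The helper name and the example URI are hypothetical; only the url and headers option names belong to WinJS.xhr.

```javascript
// Hypothetical helper: returns WinJS.xhr options that bypass the response cache.
function uncachedRequestOptions(url) {
    return {
        url: url,
        headers: {
            // Any date safely in the past will do; this particular value is arbitrary.
            "If-Modified-Since": "Mon, 27 Mar 1972 00:00:00 GMT"
        }
    };
}

// In an app (guarded here because WinJS exists only inside a Windows Store app):
if (typeof WinJS !== "undefined") {
    WinJS.xhr(uncachedRequestOptions("http://localhost/data.json")).done(
        function (request) { console.log(request.responseText); },
        function (error) { console.log("Request failed: " + error.statusText); }
    );
}
```

Keeping the header in one helper means every request that must be fresh goes through the same code path.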

Along similar lines, you can wrap a WinJS.xhr operation in another promise to encapsulate automatic retries if there is an error in any given request. That is, build your retry logic around the core XHR operation, with the result stored in some variable. Then place that whole block of code within WinJS.Promise.wrap (or a new WinJS.Promise) and use that elsewhere in the app.
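One way to sketch such retry logic is shown below. Rather than constructing a new WinJS.Promise by hand, this version leans on promise chaining: each failed attempt returns the promise for the next attempt, so the caller still sees a single promise. The function name and the maxAttempts parameter are my own inventions, and the request function is passed in as an argument (WinJS.xhr in an app) so that the sketch isn't tied to any one environment.

```javascript
// Hypothetical helper. xhr is any function that takes an options object and
// returns a promise (WinJS.xhr in an app); maxAttempts is the total number of tries.
function xhrWithRetry(xhr, options, maxAttempts) {
    function attempt(remaining) {
        return xhr(options).then(null, function (error) {
            if (remaining <= 1) {
                // Out of attempts: rethrow so the caller's error handler runs.
                throw error;
            }
            // Otherwise chain another attempt onto the same promise.
            return attempt(remaining - 1);
        });
    }

    return attempt(maxAttempts);
}

// In an app:
//   xhrWithRetry(WinJS.xhr, { url: "http://localhost/data.json" }, 3)
//       .done(processData, reportError);
```

A real app would probably also wait a little between attempts and give up immediately on errors that a retry cannot fix, such as a 404.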

In each XHR attempt, remember that you can also use WinJS.Promise.timeout in conjunction with WinJS.xhr, as described on Setting timeout values with WinJS.xhr, because WinJS.xhr doesn’t have a timeout notion directly. You can, of course, set a timeout in the raw request, but that would mean rebuilding everything that WinJS.xhr already does.
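The combination is short enough to show in full. In this sketch the winjs parameter simply stands in for the global WinJS object (an assumption made so that the function can be exercised outside an app), and the function name is hypothetical.

```javascript
// Hypothetical helper. Pass the global WinJS object as the first argument in real code.
function xhrWithTimeout(winjs, options, milliseconds) {
    // WinJS.Promise.timeout(ms, promise) cancels the given promise if it has
    // not completed within the specified number of milliseconds.
    return winjs.Promise.timeout(milliseconds, winjs.xhr(options));
}

// In an app:
//   xhrWithTimeout(WinJS, { url: "http://localhost/data.json" }, 1500)
//       .done(processData, reportErrorOrTimeout);
```

When the timeout fires, the request is canceled and the error handler is called, so the handler should be prepared for a cancellation as well as an HTTP failure.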

Generally speaking, XHR headers are accessible to the app with the exception of cookies (the set-cookie and set-cookie2 headers), as these are filtered out by design for XHR done from a local context. They are not filtered for XHR from the web context, so if you need cookies, try acquiring them in a web context iframe and pass them to a local context using postMessage.

Finally, avoid using XHR for large file transfers because such operations will be suspended when the app is suspended. Use the Background Transfer API instead (see the next section), which uses XHR under the covers, so your web services won’t know the difference anyway!

And on that note, let’s now look at that Background Transfer API in detail.

If you’re interested in watching the HTTP(S) traffic between your computer and the Internet—something that can be invaluable when working with XmlHttpRequests—check out the freeware tool known as Fiddler (http://www.fiddler2.com/fiddler2/). In addition to inspecting traffic, you can also set breakpoints on various events and fiddle with (that is, modify) incoming and outgoing data. It supports traffic from any app or browser, including Windows Store apps.

Background Transfer

One need with user data, especially, is to transfer potentially large files to and from an online repository. For even moderately sized files, however, this presents a challenge: very few users typically want to stare at their display to watch file transfer progress, so it’s highly likely that they’ll switch to another app to do something far more interesting while the transfer is taking place. In doing so, the app that’s doing the transfer will be suspended in the background and possibly even terminated. This does not bode well for trying to complete such operations using a mechanism like WinJS.xhr!

One solution would be to provide a background task for this purpose, which was a common request with early previews of Windows 8 before this API was ready. However, there’s little need to run app code for this common purpose. WinRT thus provides a specific background transfer API, Windows.Networking.BackgroundTransfer, supporting up to 500 scheduled transfers systemwide. It offers built-in cost awareness and resiliency to changes in connectivity, relieving apps from needing to worry about such concerns themselves. Transfers continue when an app is suspended and will be paused if the app is terminated. When the app is resumed or launched again, it can then check the status of background transfers it previously initiated and take further action as necessary: processing downloaded information, noting successful uploads in its UI, and enumerating pending transfers, which will restart any that were paused or otherwise interrupted. On the other hand, if the user directly closes the app (through a gesture, Alt+F4, or Task Manager), all pending transfers for that app are canceled. This is also true if you stop debugging an app in Visual Studio.

Generally speaking, then, it’s highly recommended that you use the background transfer API whenever you expect the operation to exceed your customer’s tolerance for waiting. This clearly depends on the network’s connection speed, and whether you think the user will switch away from your app while such a transfer is taking place. For example, if you initiate a transfer operation but the user can continue to be productive (or entertained) in your app while that’s happening, then using WinJS.xhr with HTTP GET and POST/PUT might be a possibility, though you’ll still be responsible for cost awareness and handling connectivity. If, on the other hand, the user cannot do anything more until the transfer is complete, you might choose to use background transfer for perhaps any data larger than 10K or some other amount based on the current network speed.

In any case, when you’re ready to employ background transfer in your app, the BackgroundDownloader and BackgroundUploader objects in the Windows.Networking.BackgroundTransfer namespace will become your fast friends. Both objects have methods and properties through which you can enumerate pending transfers as well as perform general configuration of credentials, HTTP request headers, transfer method, cost policy (for metered networks), and grouping. Each individual operation is then represented by a DownloadOperation or UploadOperation object, through which you can control the operation (pause, cancel, etc.) and retrieve status. With each operation you can also set its particular credentials, cost policy, and so forth, overriding the general settings in the BackgroundDownloader and BackgroundUploader classes.

Note In both download and upload cases, the connection request will be aborted if a new connection is not established within five minutes. After that, any other HTTP request involved with the transfer times out after two minutes. Background transfer will retry an operation up to three times if there’s connectivity.

To see the basics of this API in action, let’s start by looking at the Background transfer sample. To run this sample you must first set up a localhost server along with a data file and an upload target page. So make sure you have Internet Information Services installed on your machine, as described in Chapter 13 in the section “Using the Localhost.” Then, from an administrator command prompt, navigate to the sample’s Server folder and run the command powershell -file serversetup.ps1. This will install the necessary server-side files for the sample on the localhost, and allow you to run an additional example in this chapter’s companion content.

Basic Downloads

Scenario 1 of the Background transfer sample (js/downloadFile.js) lets you download an image file from the localhost server and save it to the Pictures library. By default the URI entry field is set to a specific localhost URI and the control is disabled. This is because the sample doesn’t perform any validation on the URI, a process that you should always perform in your own app. If you’d like to enter other URIs in the sample, of course, just remove disabled="disabled" from the serverAddressField element in html/downloadFile.html. To see the downloader in action, it’s also helpful to locate some large image files that will take a while to transfer; your favorite search engine can help you out, or you can copy one of your own to the localhost server.

The sample’s UI also provides a handful of buttons to start, cancel, pause, and resume the async operation, an essential feature for apps with background transfers. Within its progress handler, which the transfer operations support, the sample demonstrates how to display as much of the image as has been transferred. You can also start multiple transfers to observe how they are all managed simultaneously.

Starting a download transfer happens as follows. First create a StorageFile to receive the data (though this is not required as we’ll see later in this section). Then create a DownloadOperation object for the transfer using BackgroundDownloader.createDownload, at which point you can set its method, costPolicy, and group properties to override the defaults supplied by the BackgroundDownloader. The method is a string that identifies the type of transfer being used (normally GET for HTTP or RETR for FTP). We’ll come back to the other two properties later in the “Setting Cost Policy” and “Grouping Transfers” sections.

Once the operation is configured as needed, the last step is to call its startAsync method with your completed, error, and progress handlers:

// Asynchronously create the file in the pictures folder (capability declaration required).
Windows.Storage.KnownFolders.picturesLibrary.createFileAsync(fileName,
    Windows.Storage.CreationCollisionOption.generateUniqueName)
    .done(function (newFile) {
        // Assume uriString is the text URI of the file to download
        var uri = new Windows.Foundation.Uri(uriString);
        var downloader = new Windows.Networking.BackgroundTransfer.BackgroundDownloader();

        // Create a new download operation.
        var download = downloader.createDownload(uri, newFile);

        // Start the download
        download.startAsync().then(complete, error, progress);
    });

While the operation is underway, the following properties provide additional information on the transfer:

requestedUri and resultFile The same as those passed to createDownload.

guid A unique identifier assigned to the operation.

progress A BackgroundDownloadProgress structure with bytesReceived, totalBytesToReceive, hasResponseChanged (a Boolean, see the getResponseInformation method below), hasRestarted (a Boolean set to true if the download had to be restarted), and status (a BackgroundTransferStatus value: idle, running, pausedByApplication, pausedCostedNetwork, pausedNoNetwork, canceled, error, and completed).

A few methods of DownloadOperation can also be used with the transfer:

pause and resume Control the download in progress. We’ll talk more of these in the “Suspend, Resume, and Restart with Background Transfers” section below.

getResponseInformation Returns a ResponseInformation object with properties named headers (a collection of response headers from the server), actualUri, isResumable, and statusCode (from the server). Repeated calls to this method will return the same information until the hasResponseChanged property is set to true.

getResultStreamAt Returns an IInputStream for the content downloaded so far or the whole of the data once the operation is complete.

In Scenario 1 of the sample, the progress function—which is given to the promise returned by startAsync—uses getResponseInformation and getResultStreamAt to show a partially downloaded image:

var currentProgress = download.progress;

// ...

// Get Content-Type response header.
var contentType = download.getResponseInformation().headers.lookup("Content-Type");

// Check the stream is an image.
if (contentType.indexOf("image/") === 0) {
    // Get the stream starting from byte 0.
    imageStream = download.getResultStreamAt(0);

    // Wrap the WinRT input stream in an MSStream that the HTML engine can consume.
    var msStream = MSApp.createStreamFromInputStream(contentType, imageStream);
    var imageUrl = URL.createObjectURL(msStream);

    // Pass the stream URL to the HTML image tag.
    id("imageHolder").src = imageUrl;

    // Close the stream once the image is displayed.
    id("imageHolder").onload = function () {
       if (imageStream) {
           imageStream.close();
           imageStream = null;
       }
    };
}

All of this works because the background transfer API is saving the downloaded data into a temporary file and providing a stream on top of that, hence a function like URL.createObjectURL does the same job as if we provided it with a StorageFile object directly. Once the DownloadOperation object goes out of scope and is garbage collected, however, that temporary file will be deleted.

The existence of this temporary file is also why, as I noted earlier, it’s not actually necessary to provide a StorageFile object in which to place the downloaded data. That is, you can pass null as the second argument to createDownload and work with the data through DownloadOperation.getResultStreamAt. This is entirely appropriate if the ultimate destination of the data in your app isn’t a separate file.

There is also a variation of createDownload that takes a second StorageFile argument whose contents provide the body of the HTTP GET or FTP RETR request that will be sent to the server URI before the download is started. This accommodates some web sites that require you to fill out a form to start the download.

You might have already noticed that neither DownloadOperation nor UploadOperation has a cancellation method. So how is this accomplished? You cancel the transfer by canceling the startAsync operation; that is, call the cancel method of the promise returned by startAsync. This means that you need to hold onto the promise for each transfer you initiate.
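How you hold onto those promises is up to you. As one possible sketch, the code below keeps them in a plain object keyed by each operation’s guid property. The registry and both function names are hypothetical; the only parts that come from the API are startAsync, guid, and the promise’s cancel method.

```javascript
// Hypothetical registry that remembers the promise for each started transfer,
// keyed by the operation's guid, so that the transfer can be canceled later.
var activeTransfers = {};

function startTracked(operation, complete, error, progress) {
    var promise = operation.startAsync().then(complete, error, progress);
    activeTransfers[operation.guid] = promise;
    return promise;
}

function cancelTracked(guid) {
    var promise = activeTransfers[guid];

    if (!promise) {
        return false;   // Unknown guid, or the entry was already removed.
    }

    promise.cancel();   // Canceling the promise cancels the transfer itself.
    delete activeTransfers[guid];
    return true;
}
```

A complete implementation would also remove an entry from the registry inside the completed and error handlers so that finished transfers don’t accumulate.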

Basic Uploads

Scenario 2 of the Background transfer sample (js/uploadFile.js) exercises the background upload capability, specifically sending some file (chosen through the file picker) to a URI that can receive it. By default the URI points to http://localhost/BackgroundTransferSample/upload.aspx, a page installed with the PowerShell script that sets up the server. As with Scenario 1, the URI entry control is disabled because the sample performs no validation, as you would again always want to do if you accepted any URI from an untrusted source (user input in this case). For testing purposes, of course, you can remove disabled="disabled" from the serverAddressField element in html/uploadFile.html and enter other URIs that will exercise your own upload services. This is especially handy if you run the server part of the sample in Visual Studio 2012 Express for Web where the URI will need a localhost port number as assigned by the debugger.

In addition to a button to start an upload and to cancel it, the sample provides another button to start a multipart upload; we’ll talk more of this in the “Multipart Uploads” section below.

In code, an upload happens very much like a download. Assuming you have a StorageFile with the contents to upload, create an UploadOperation object for the transfer with BackgroundUploader.createUpload. If, on the other hand, you have data in a stream (IInputStream), create the operation object with BackgroundUploader.createUploadFromStreamAsync instead. This can also be used to break up a large file into discrete chunks, if the server can accommodate it; see “Breaking Up Large Files” below.

With the operation object in hand you can customize a few properties of the transfer, overriding the defaults provided by the BackgroundUploader. These are method (HTTP POST or PUT, or FTP STOR), costPolicy, and group. For the latter two, again see “Setting Cost Policy” and “Grouping Transfers” below.

Once you’re ready, then, calling the operation’s startAsync will proceed with the upload:


// Assume uri is a Windows.Foundation.Uri object and file is the StorageFile to upload
var uploader = new Windows.Networking.BackgroundTransfer.BackgroundUploader();
var upload = uploader.createUpload(uri, file);
promise = upload.startAsync().then(complete, error, progress);

While the operation is underway, the following properties provide additional information on the transfer:

requestedUri and sourceFile The same as those passed to createUpload (an operation created with createUploadFromStreamAsync supports only requestedUri).

guid A unique identifier assigned to the operation.

progress A BackgroundUploadProgress structure with bytesReceived, totalBytesToReceive, bytesSent, totalBytesToSend, hasResponseChanged (a Boolean, see the getResponseInformation method below), hasRestarted (a Boolean set to true if the upload had to be restarted), and status (a BackgroundTransferStatus value, again with values of idle, running, pausedByApplication, pausedCostedNetwork, pausedNoNetwork, canceled, error, and completed).

Unlike a download, an UploadOperation does not have pause or resume methods but does have the same getResponseInformation and getResultStreamAt methods. In the upload case, the response from the server is less interesting because it doesn’t contain the transferred data, just headers, status, and whatever body contents the upload page cares to return. If that page returns some interesting HTML, though, you might use the results as part of your app’s output for the upload.

As noted before, to cancel an UploadOperation, call the cancel method of the promise returned from startAsync.

Breaking Up Large Files

Because the outbound (upload) transfer rates of most broadband connections are significantly slower than the inbound (download) rates and might have other limitations, uploading a large file to a server is typically a riskier business than a large download. If an error occurs during the upload, it can invalidate the entire transfer, a very frustrating occurrence if you’ve already been waiting an hour for that upload to complete!

For this reason, a cloud service might allow a large file to be transferred in discrete chunks, each of which is sent as a separate HTTP request with the server reassembling the single file from those requests. This minimizes or at least reduces the overall impact of connectivity hiccups.

From the client’s point of view, each piece would be transferred with an individual UploadOperation; that much is obvious. The tricky part is breaking up a large file in the first place. With a lot of elbow grease (and what would likely end up being a complex chain of nested async operations), it is possible to create a bunch of temporary files from the single source. If you’re up to a challenge, I invite you to write such a routine and post it somewhere for the rest of us to see!

But there is an easier path using createUploadFromStreamAsync, through which you can create separate UploadOperation objects for different segments of the stream. Given a StorageFile for the source, start by calling its openReadAsync method, the result of which is an IRandomAccessStreamWithContentType object. Through its getInputStreamAt method you then obtain an IInputStream for each starting point in the stream (that is, at each offset depending on your segment size). You then create an UploadOperation with each input stream by using createUploadFromStreamAsync. The last requirement is to tell that operation to consume only some portion of that stream. You do this by calling its setRequestHeader("content-length", <length>) where <length> is the size of the segment plus the size of other data in the request; you’ll also want to add a header to identify the segment for that particular upload. After all this, call each operation’s startAsync method to begin its transfer.
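The arithmetic for those offsets is the one piece that is easy to show in isolation. The sketch below, with a hypothetical function name and segment size, computes the starting offset and byte count of each segment; the comment afterward outlines where those values would be used, assuming stream is the object obtained from openReadAsync and that uploader and uri are already set up.

```javascript
// Hypothetical helper: splits totalSize bytes into segments of at most
// segmentSize bytes, returning the index, offset, and length of each one.
function computeSegments(totalSize, segmentSize) {
    if (segmentSize <= 0) {
        throw new Error("segmentSize must be positive");
    }

    var segments = [];

    for (var offset = 0, index = 0; offset < totalSize; offset += segmentSize, index++) {
        segments.push({
            index: index,
            offset: offset,
            length: Math.min(segmentSize, totalSize - offset)   // The last segment may be short.
        });
    }

    return segments;
}

// Outline of use in an app:
//   computeSegments(stream.size, 1024 * 1024).forEach(function (segment) {
//       var input = stream.getInputStreamAt(segment.offset);
//       uploader.createUploadFromStreamAsync(uri, input).then(function (upload) {
//           // Set the headers described above from segment.length and segment.index,
//           // and then call upload.startAsync().
//       });
//   });
```

The segment size that works best, and the header that identifies each segment, depend entirely on what your server expects.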

Multipart Uploads

In addition to the createUpload and createUploadFromStreamAsync methods, the BackgroundUploader provides another method called createUploadAsync (with three variants) that handles what are called multipart uploads.

From the server’s point of view, a multipart upload is a single HTTP request that contains various pieces of information (the parts), such as app identifiers, authorization tokens, and so forth, along with file content, where each part is possibly separated by a specific boundary string. Such uploads are used by online services like Flickr and YouTube, each of which accepts a request with a multipart Content-Type. (See Content-type: multipart for a reference.) For example, as shown on Uploading Photos – POST Example, Flickr wants a request with the content type of multipart/form-data, followed by parts for api_key, auth_token, api_sig, photo, and finally the file contents. With YouTube, as described on YouTube API v2.0 – Direct Uploading, it wants a content type of multipart/related with parts containing the XML request data, the video content type, and then the binary file data.

The background uploader supports all this through the BackgroundUploader.createUploadAsync method. (Note the Async suffix that separates these from the synchronous createUpload.) There are three variants of this method. The first takes the server URI to receive the upload and an array of BackgroundTransferContentPart objects, each of which represents one part of the upload. The resulting operation will send a request with a content type of multipart/form-data with a random GUID for a boundary string. The second variation of createUploadAsync allows you to specify the content type directly (through the sub-type, such as related), and the third variation then adds the boundary string. That is, assuming parts is the array of parts, the methods look like this:

var uploadOpPromise1 = uploader.createUploadAsync(uri, parts);
var uploadOpPromise2 = uploader.createUploadAsync(uri, parts, "related");
var uploadOpPromise3 = uploader.createUploadAsync(uri, parts, "form-data", "-------123456");

To create each part, first create a BackgroundTransferContentPart using one of its three constructors:

new BackgroundTransferContentPart() Creates a default part.

new BackgroundTransferContentPart(<name>) Creates a part with a given name.

new BackgroundTransferContentPart(<name>, <file>) Creates a part with a given name and a local filename.

In each case you further initialize the part with a call to its setText, setHeader, and setFile methods. The first, setText, assigns a value to that part. The second, setHeader, can be called multiple times to supply header values for the part. The third, setFile, is how you provide the StorageFile to a part created with the third variant above.

Now, Scenario 2 of the original Background transfer sample shows the latter using an array of selected files, but probably few services would accept a request of this nature. Let’s instead look at how we’d create the multipart request shown on Uploading Photos – POST Example. For this purpose I’ve created the Multipart Upload example in this chapter’s companion content. Here’s the code from js/uploadMultipart.js that creates all the necessary parts using the tinyimage.jpg file in the app package:

// The file and uri variables are already set by this time. bt is a namespace shortcut
var bt = Windows.Networking.BackgroundTransfer;
var uploader = new bt.BackgroundUploader();
var contentParts = [];

// Instead of sending multiple files (as in the original sample), we'll create those parts that
// match the POST example for Flickr on http://www.flickr.com/services/api/upload.example.html
var part;

part = new bt.BackgroundTransferContentPart();
part.setHeader("Content-Disposition", "form-data; name=\"api_key\"");
part.setText("3632623532453245");
contentParts.push(part);

part = new bt.BackgroundTransferContentPart();
part.setHeader("Content-Disposition", "form-data; name=\"auth_token\"");
part.setText("436436545");
contentParts.push(part);

part = new bt.BackgroundTransferContentPart();
part.setHeader("Content-Disposition", "form-data; name=\"api_sig\"");
part.setText("43732850932746573245");
contentParts.push(part);

part = new bt.BackgroundTransferContentPart();
part.setHeader("Content-Disposition", "form-data; name=\"photo\"; filename=\"" + file.name + "\"");
part.setHeader("Content-Type", "image/jpeg");
part.setFile(file);
contentParts.push(part);

// Create a new upload operation specifying a boundary string.
uploader.createUploadAsync(uri, contentParts,
   "form-data", "-----------------------------7d44e178b0434")
   .then(function (uploadOperation) {
       // Start the upload and persist the promise
       upload = uploadOperation;
       promise = uploadOperation.startAsync().then(complete, error, progress);
   }
);

The resulting request will look like this, very similar to what’s shown on the Flickr page (just with some extra headers):

POST /website/multipartupload.aspx HTTP/1.1
Cache-Control=no-cache
Connection=Keep-Alive
Content-Length=1328
Content-Type=multipart/form-data; boundary="-----------------------------7d44e178b0434"
Accept=*/*
Accept-Encoding=gzip, deflate
Host=localhost:60355
User-Agent=Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; Win64; x64; Trident/6.0; Touch)
UA-CPU=AMD64
-------------------------------7d44e178b0434
Content-Disposition: form-data; name="api_key"

3632623532453245
-------------------------------7d44e178b0434
Content-Disposition: form-data; name="auth_token"

436436545
-------------------------------7d44e178b0434
Content-Disposition: form-data; name="api_sig"

43732850932746573245
-------------------------------7d44e178b0434
Content-Disposition: form-data; name="photo"; filename="tinysquare.jpg"
Content-Type: image/jpeg

{RAW JFIF DATA}
-------------------------------7d44e178b0434--

To run the sample and also see how this request is received, go to the MultipartUploadServer folder in this chapter’s companion content. Load website.sln into Visual Studio 2012 Express for Web, open MultipartUploadServer.aspx, and set a breakpoint on the first if statement inside the Page_Load method. Then start the site in Internet Explorer to open that page on a localhost debugging port. Copy that page’s URI for the next step.

In the Multipart Upload example, paste that URI into the URI field and click the Start Multipart Transfer button. When the upload operation’s startAsync is called, you should hit the server page breakpoint in Visual Studio for Web. You can step through that code if you want and examine the Request object; in the end, the code will copy the request into a file named multipart-request.txt on that server. This will contain the request contents as above, where you can see the relationship between how you set up the parts in the client and how they are received by the server.

Providing Headers and Credentials

Within the BackgroundDownloader and BackgroundUploader you have the ability to set values for individual HTTP headers by using their setRequestHeader methods. Both take a header name and a value, and you call them multiple times if you have more than one header to set.

Similarly, both the downloader and uploader objects have two properties for credentials: serverCredential and proxyCredential, depending on the needs of your server URI. Both properties are Windows.Security.Credentials.PasswordCredential objects. As the purpose in a background transfer operation is to provide credentials to the server, you’d typically create a PasswordCredential as follows:

var cred = new Windows.Security.Credentials.PasswordCredential(resource, userName, password);

where the resource in this case is just a string that identifies the resource to which the credentials apply. This is used to manage credentials in the credential locker, as we’ll see in the “Authentication, Credentials, and the User Profile” section later. For now, just creating a credential in this way is all you need to authenticate with your server when doing a transfer.
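Putting headers and credentials together, a small helper like the following (hypothetical, as are the header and resource names in the usage comment) can apply both to either a downloader or an uploader, because the two objects share setRequestHeader and serverCredential.

```javascript
// Hypothetical helper. transfer is a BackgroundDownloader or BackgroundUploader,
// headers is a plain object of header names and values, and credential is an
// optional PasswordCredential.
function configureTransfer(transfer, headers, credential) {
    Object.keys(headers).forEach(function (name) {
        // One setRequestHeader call per header.
        transfer.setRequestHeader(name, headers[name]);
    });

    if (credential) {
        transfer.serverCredential = credential;
    }

    return transfer;
}

// In an app:
//   var cred = new Windows.Security.Credentials.PasswordCredential(
//       "myService", userName, password);
//   var downloader = configureTransfer(
//       new Windows.Networking.BackgroundTransfer.BackgroundDownloader(),
//       { "Accept-Language": "en-US" }, cred);
```

Because the settings are applied to the downloader or uploader, they become the defaults for every operation subsequently created from it.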

Note At present, setting the serverCredential property doesn’t work with URIs that specify an FTP server. To work around this, include the credentials directly in the URI with the form ftp://<user>:<password>@server.com/file.ext (for example, ftp://admin:password1@server.com/file.bin).

Setting Cost Policy

As mentioned earlier in the section “Cost Awareness,” the Windows Store policy requires that apps be careful about performing large data transfers on metered networks. The Background Transfer API takes this into account, based on values from the BackgroundTransferCostPolicy enumeration:

default Allow transfers on costed networks.

unrestrictedOnly Do not allow transfers on costed networks.

always Always download regardless of network cost.

To apply a policy to subsequent transfers, set the value of BackgroundDownloader.costPolicy and/or BackgroundUploader.costPolicy. The policy for individual operations can be set through the DownloadOperation.costPolicy and UploadOperation.costPolicy properties.

Basically, you would change the policy if you’ve prompted the user accordingly or allow them to set behavior through your settings. For example, if the user has a setting to disallow downloads or uploads on a metered network, your app would set the general costPolicy to unrestrictedOnly. If you know you’re on a network where roaming charges would apply and the user has consented to a transfer, you’d want to change the costPolicy of that individual operation to always. Otherwise the API would not perform the transfer because doing so on a roaming network is disallowed by default.
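That decision can be captured in a few lines. In this sketch the policies argument stands in for the BackgroundTransferCostPolicy enumeration, and the two settings fields are names invented for the example; your own app would draw them from wherever it stores the user’s choices.

```javascript
// Hypothetical helper. policies stands in for
// Windows.Networking.BackgroundTransfer.BackgroundTransferCostPolicy;
// the settings field names are invented for this example.
function chooseCostPolicy(policies, settings) {
    if (settings.blockMeteredTransfers) {
        // The user never wants transfers on a metered network.
        return policies.unrestrictedOnly;
    }

    if (settings.userConsentedToRoaming) {
        // The user explicitly approved this transfer despite the cost.
        return policies.always;
    }

    return policies.default;
}

// In an app:
//   var bt = Windows.Networking.BackgroundTransfer;
//   downloader.costPolicy = chooseCostPolicy(bt.BackgroundTransferCostPolicy, userSettings);
```

Note that the blocking setting wins when both are present, which errs on the side of not spending the user’s money.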

When a transfer is blocked by policy, the operation’s progress.status property will contain BackgroundTransferStatus.pausedCostedNetwork.

Grouping Transfers

The group property that’s found in BackgroundDownloader, BackgroundUploader, DownloadOperation, and UploadOperation is a simple string that tags a transfer as belonging to a particular group. The property can be set only through BackgroundDownloader and BackgroundUploader; you would set this prior to creating a series of individual operations. In those operations, the group property is available but read-only.

The purpose of grouping is so that you can selectively enumerate and control related transfers, as we’ll see in the next section. For example, a photo app that organizes pictures into albums or album pages can present a UI through which the user can pause, resume, or cancel the transfer of an entire album, rather than working on the level of individual files. The group property makes the implementation of this kind of experience much easier, as the app doesn’t need to maintain its own grouping structures.

The group has no bearing on the transfers themselves; it is not communicated to the server upload page.
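To make the album example a little more concrete, here is a rough sketch of pausing every running download in one group. The function name and group name are hypothetical, and the two class arguments stand in for BackgroundDownloader and BackgroundTransferStatus so that the logic can be read (and exercised) apart from WinRT; the group-specific form of getCurrentDownloadsAsync is described in the next section.

```javascript
// Hypothetical helper. downloaderClass stands in for
// Windows.Networking.BackgroundTransfer.BackgroundDownloader and statusEnum for
// Windows.Networking.BackgroundTransfer.BackgroundTransferStatus.
function pauseGroup(downloaderClass, statusEnum, groupName) {
    return downloaderClass.getCurrentDownloadsAsync(groupName).then(function (downloads) {
        var paused = 0;

        for (var i = 0; i < downloads.size; i++) {
            // Only running downloads can be paused.
            if (downloads[i].progress.status === statusEnum.running) {
                downloads[i].pause();
                paused++;
            }
        }

        return paused;   // The number of downloads that were paused.
    });
}

// In an app:
//   var bt = Windows.Networking.BackgroundTransfer;
//   pauseGroup(bt.BackgroundDownloader, bt.BackgroundTransferStatus, "album1")
//       .done(function (count) { console.log("Paused " + count + " downloads"); });
```

Resuming or canceling a whole group would follow the same pattern with a different test and action inside the loop.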

Suspend, Resume, and Restart with Background Transfers

At the beginning of this section I mentioned that background transfers will continue while an app is suspended and paused if the app is terminated by the system. Because apps will be terminated only in low-memory conditions, it’s appropriate to also pause background transfers.

When an app is resumed from the suspended state, it can check on the status of pending transfers by using the BackgroundDownloader.getCurrentDownloadsAsync and BackgroundUploader.getCurrentUploadsAsync methods. In both cases two variants of the methods exist: one that enumerates all transfers, and one that enumerates those belonging to a specific group (as matches the group properties in the operations).

The list that comes back from these methods is a vector of DownloadOperation and UploadOperation objects, and, as always, the vector can be addressed as an array. Code to iterate over the list looks like this:


Windows.Networking.BackgroundTransfer.BackgroundDownloader.getCurrentDownloadsAsync()
    .done(function (downloads) {
        for (var i = 0; i < downloads.size; i++) {
            var download = downloads[i];
            // Inspect download.progress or reattach handlers here
        }
    });

Windows.Networking.BackgroundTransfer.BackgroundUploader.getCurrentUploadsAsync()
    .done(function (uploads) {
        for (var i = 0; i < uploads.size; i++) {
            var upload = uploads[i];
            // Inspect upload.progress or reattach handlers here
        }
    });

In each case, the progress property of each operation will tell you how far the transfer has come along. The progress.status property is especially important. Again, status is a BackgroundTransferStatus value and will be one of idle, running, pausedByApplication, pausedCostedNetwork, pausedNoNetwork, canceled, error, and completed. These are clearly necessary to inform the user, as appropriate, and to give her the ability to restart transfers that are paused or have experienced an error, to pause running transfers, and to act on completed transfers.
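One way to turn a status into UI is a small lookup like the sketch below. The action names ("pause", "resume", and so on) and the mapping itself are my own assumptions about what a transfer list might offer, not something the API prescribes; statusEnum is expected to be the BackgroundTransferStatus enumeration.

```javascript
// Returns the (hypothetical) commands an app might offer for a transfer in
// the given state. statusEnum is expected to be
// Windows.Networking.BackgroundTransfer.BackgroundTransferStatus.
function actionsForStatus(status, statusEnum) {
    switch (status) {
        case statusEnum.running:
            return ["pause", "cancel"];     // pausing applies to downloads only
        case statusEnum.pausedByApplication:
        case statusEnum.pausedCostedNetwork:
        case statusEnum.pausedNoNetwork:
            return ["resume", "cancel"];
        case statusEnum.error:
            return ["restart"];
        case statusEnum.completed:
            return ["open"];
        default:
            return [];                      // idle or canceled: nothing to offer
    }
}
```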

Speaking of which, when using the background transfer API, an app should always give the user control over pending transfers. Downloads can be paused through the DownloadOperation.pause method and resumed through DownloadOperation.resume. (There are no equivalents for uploads.) Download and upload operations are canceled by canceling the promises returned from startAsync.

This brings up an interesting situation: if your app has been terminated and later restarted, how do you restart transfers that were paused? The answer is quite simple. By enumerating transfers through getCurrentDownloadsAsync and getCurrentUploadsAsync, incomplete transfers are automatically restarted. But then how do you get back to the promises originally returned by the startAsync methods? Those are not values that you can save in your app state and reload on startup, and yet you need them to be able to cancel those operations, if necessary, and also to attach your completed, error, and progress handlers.

For this reason, both DownloadOperation and UploadOperation provide a method called attachAsync, which returns a promise for the operation just like startAsync did originally. You can then call the promise’s then or done methods to provide your handlers:

promise = download.attachAsync().then(complete, error, progress);

and call promise.cancel() if needed. In short, when Windows restarts a background transfer and essentially calls startAsync on your app’s behalf, it holds that promise internally. The attachAsync methods simply return that new promise.
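Putting enumeration and attachAsync together, a startup routine could look something like this sketch. The reattachTransfers name and the handlers object are hypothetical; getCurrentAsync is whatever function returns the promise from getCurrentDownloadsAsync or getCurrentUploadsAsync.

```javascript
// getCurrentAsync: a function returning the promise from
//   BackgroundDownloader.getCurrentDownloadsAsync (or the upload equivalent).
// handlers: a hypothetical { complete, error, progress } set of app callbacks.
// Produces a promise for an array of attached promises, one per transfer,
// which the app can keep around in order to cancel transfers later.
function reattachTransfers(getCurrentAsync, handlers) {
    return getCurrentAsync().then(function (operations) {
        // WinRT vectors expose size; plain arrays expose length.
        var count = (operations.size !== undefined) ? operations.size : operations.length;
        var promises = [];

        for (var i = 0; i < count; i++) {
            promises.push(operations[i].attachAsync()
                .then(handlers.complete, handlers.error, handlers.progress));
        }

        return promises;
    });
}

// Possible usage in an app (not executed here):
// reattachTransfers(function () {
//     return Windows.Networking.BackgroundTransfer.BackgroundDownloader
//         .getCurrentDownloadsAsync();
// }, { complete: onComplete, error: onError, progress: onProgress });
```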

A final question is whether a suspended app can be notified when a transfer is complete, perhaps to issue a toast to inform the user. Such a feature isn’t supported in Windows 8 as there is no background task available for this purpose. At present, then, the user needs to switch back to the app to check on transfer progress.

Authentication, Credentials, and the User Profile

If you think about it, just about every online resource in the world has some kind of credentials or authentication associated with it. Sure, we can read many of those resources without credentials, but having permission to upload data to a website is more tightly controlled, as is access to one’s account or profile in a database managed by a website. In many scenarios, then, apps need to manage credentials and handle other authentication processes, perhaps for accounts that you manage but perhaps also when using accounts from other sources such as Facebook, Twitter, and so on.

There are two basic approaches for dealing with credentials. First, you can collect credentials directly through the Credential Picker UI or a UI of your own. In either case, though, the next question is how to store those credentials securely, for which we have the Credential Locker API. The locker allows an app to retrieve those credentials in subsequent sessions such that it doesn’t need to ask the user to enter those credentials again (which gets tiresome, as I’m sure you know).

It’s very important to understand here that whenever an app acquires credentials as plain text, either from its own UI or from the Credential Picker with certain options, the app is fully responsible for protecting those credentials. For one thing, the app must always store and transmit those credentials with full encryption, but there are many subtleties here that are typically far more complicated than apps should worry about themselves.

For this reason it’s a good idea to delegate those details to others. For example, the Credential Picker UI will, by default, encrypt passwords before they ever get back to your app. Or you can use the second approach to credentials where the app authenticates users through another provider altogether, such as Microsoft Live Connect, Facebook, Flickr, Yahoo, and so forth. In doing so, the provider does the heavy lifting of authentication and the app needs only to store the appropriate tokens or other access keys for these services. A primary benefit to this kind of integrated authorization is that the app never touches those credentials itself and thus does not need to concern itself with their security. (An app should still encrypt tokens or access keys if it stores them. The credential locker can also be used for this purpose.)

In most cases this process involves an agent called the Web Authentication Broker, which specifically works with OAuth/OpenID protocols and providers as generally found on the web. Microsoft Live Connect is a special case because the Microsoft account used with it might also be the one used to log into Windows itself. (Authenticating through Live Connect also gives an app access to other data from Live services including Calendar, Messenger, and SkyDrive.)

One of the other significant benefits of this second approach is the ability to provide a single sign on experience. This means that once a user has signed in through a particular OAuth provider in one app, they often don’t need to sign into other apps that use the same provider (unless the app deems it necessary). In the case of Live Connect, apps might never need to request credentials at all if that same Microsoft account is used to log in to Windows or is linked to the user’s domain login.

In this section we’ll also take a brief look—which is all that’s needed—at the user profile information available through WinRT APIs, along with the API for encryption and decryption. Beyond this, I’ll mention two other resources on the subject. The first is How to secure connections and authenticate requests; the second is the Banking with strong authentication sample, which demonstrates secure authentication and communication over the Internet. A full writeup on this sample is found on Tailored banking app code walkthrough, so we won’t be specifically looking at it here.

Design tip There are a number of design guidelines for different login scenarios, such as when an app requires a login to be useful and when a login is simply optional. These topics as well as where to place login and account/profile management UI are discussed in Guidelines and checklist for login controls.

The Credential Picker UI

Just as WinRT provides a built-in UI for picking files, it also has a built-in UI for entering credentials: Windows.Security.Credentials.UI.CredentialPicker. This is provided as a convenience; again, you’re free to implement your own UI if that works better for your app, but many features make the credential picker attractive.

When you instantiate this object and call its pickAsync method, as with the Credential Picker sample, you’ll see the UI shown in Figure 14-2. This UI provides for domain logins, supports smart cards (as you can see—I have two smart card readers on my machine), and allows for various options such as authentication protocols and automatic saving of the credential (see the next section).

Image

FIGURE 14-2 The credential picker UI appears like a message dialog over the app.

The result from pickAsync, as given to your completed handler, is a CredentialPickerResults object with the following properties—when you enter some credentials in the sample here, you’ll see these values reflected in the sample’s output:

credentialUserName A string containing the entered username.

credentialPassword A string containing the password (typically encrypted depending on the authentication protocol option).

credentialDomainName A string containing a domain if entered with the username (as in <domain>\<username>).

credentialSaved A Boolean indicating whether the credential was saved automatically; this depends on picker options, as discussed below.

credentialSaveOption A CredentialSaveOption value indicating the state of the Remember My Credentials check box: unselected, selected, or hidden. We’ll see how to handle this shortly as well.

errorCode Contains zero if there is no error, otherwise an error code.

credential An IBuffer containing the credential as an opaque byte array. This is what you can save in your own persistent state if need be and pass back to the picker at a later time; we’ll see how shortly.

The three scenarios in the sample demonstrate the different options you can use to invoke the credential picker. For this there are three separate variants of pickAsync. The first variant accepts a target name (ignored) and a message string that appears in the place of “Please enter your credentials” in Figure 14-2:

Windows.Security.Credentials.UI.CredentialPicker.pickAsync(targetName, message)
    .done(function (results) {
        // results is a CredentialPickerResults object
    });

The second variant accepts the same arguments plus a caption string that appears in the place of “Credential Picker Sample” in Figure 14-2:

Windows.Security.Credentials.UI.CredentialPicker.pickAsync(targetName, message, caption)
    .done(function (results) {
        // results is a CredentialPickerResults object
    });

The third variant accepts a CredentialPickerOptions object that has properties for the same targetName, message, and caption strings along with the following:

previousCredential An IBuffer with the opaque credential information as provided by a previous invocation of the picker (see CredentialPickerResults.credential above).

alwaysDisplayDialog A Boolean indicating whether the dialog box is displayed. The default value is false, but this applies only if you also populate previousCredential (with an exception for domain-joined machines—see table below). The purpose here is to show the dialog when a stored credential might be incorrect and the user is expected to provide a new one.

errorCode The numerical value of a Win32 error code (default is ERROR_SUCCESS) that will be formatted and displayed in the dialog box. You would use this when you obtain credentials from the picker initially but find that those credentials don’t work and need to invoke the dialog again. Instead of providing your own message, you just choose an error code and let the system do the rest. The most common values for this are 1326 (login failure), 1330 (password expired), 2202 (bad username), 1907 or 1938 (password must change/password change required), 1351 (can’t access domain info), and 1355 (no such domain). There are, in fact, over 15,000 Win32 error codes, but that means you’ll have to search the reference linked here (or search within the winerror.h file typically found in your Program Files (x86)\Windows Kits\8.0\Include\shared folder). Happy hunting!

callerSavesCredential A Boolean indicating that the app will save the credential and that the picker should not. The default value is false. When set to false, credentials are saved to a secure system location (not the credential locker) if the app has the Enterprise Authentication capability (see below).

credentialSaveOption A value from the CredentialSaveOption enumeration indicating the initial state of the Remember My Credentials check box: unselected, selected, or hidden.

authenticationProtocol A value from the AuthenticationProtocol enumeration: basic, digest, ntlm, kerberos, negotiate (the default), credSsp, and custom (in which case you must supply a string in the customAuthenticationProtocol property). Note that with basic and digest, the CredentialPickerResults.credentialPassword will not be encrypted and is subject to the same security needs as a plain text password you collect from your own UI.

Here’s an example of invoking the picker with an errorCode indicating a previous failed login:

var options = new Windows.Security.Credentials.UI.CredentialPickerOptions();
options.message = "Please enter your credentials";
options.caption = "Sample App";
options.targetName = "Target";
options.alwaysDisplayDialog = true;
options.errorCode = 1326;  // Shows "The username or password is incorrect."
options.callerSavesCredential = true;
options.authenticationProtocol =
    Windows.Security.Credentials.UI.AuthenticationProtocol.negotiate;
options.credentialSaveOption = Windows.Security.Credentials.UI.CredentialSaveOption.selected;

Windows.Security.Credentials.UI.CredentialPicker.pickAsync(options)
    .done(function (results) {
        // results is a CredentialPickerResults object
    });
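When the credentials from a first attempt turn out to be wrong, you can feed the earlier result back through previousCredential along with an errorCode. The helper below is only a sketch of that retry step: its name is hypothetical, the strings simply repeat those in the example above, and ui stands in for the Windows.Security.Credentials.UI namespace.

```javascript
// ui is expected to be Windows.Security.Credentials.UI. previousResults is
// the CredentialPickerResults from an earlier pickAsync call whose
// credentials were rejected; win32Error is the code to display (e.g. 1326).
function buildRetryOptions(ui, previousResults, win32Error) {
    var options = new ui.CredentialPickerOptions();
    options.targetName = "Target";
    options.message = "Please enter your credentials";
    options.caption = "Sample App";
    options.previousCredential = previousResults.credential;
    options.alwaysDisplayDialog = true;   // show the dialog so the user can correct the entry
    options.errorCode = win32Error;       // the system formats the message for this code
    return options;
}

// Possible usage in an app (not executed here):
// var ui = Windows.Security.Credentials.UI;
// ui.CredentialPicker.pickAsync(buildRetryOptions(ui, results, 1326))
//     .done(function (newResults) { /* try the login again */ });
```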

To clarify the relationship between the callerSavesCredential, credentialSaveOption, and the credentialSaved properties, the following table enumerates the possibilities:

Image

The first column refers to the Enterprise Authentication capability in the app’s manifest, which indicates that the app can work with Intranet resources that require domain credentials (assuming that the app is running on the Enterprise Edition of Windows 8 as well). In such cases the credential picker has a separate secure location (apart from the credential locker) in which to store credentials, so the app need not save them itself. Furthermore, if the picker saves a credential and the app invokes the picker with alwaysDisplayDialog set to false, previousCredential can be empty because the credential will be loaded automatically. But without a domain-joined machine and this capability, the app must supply a previousCredential to avoid having the picker appear.

The Credential Locker

One of the reasons that apps might repeatedly ask a user for credentials is simply because they don’t have a truly secure place to store and retrieve those credentials that’s also isolated from all other apps. This is entirely the purpose of the credential locker, a function that’s also immediately clear from the name of this particular API: Windows.Security.Credentials.PasswordVault.

With the locker, any given credential itself is represented by a Windows.Security.Credentials.PasswordCredential object, as we saw briefly with the background transfer API. You can create an initialized credential as follows:

var cred = new Windows.Security.Credentials.PasswordCredential(resource, userName, password);

Another option is to create an uninitialized credential and populate its properties individually:

var cred = new Windows.Security.Credentials.PasswordCredential();
cred.resource = "userLogin";
cred.userName = "username";
cred.password = "password";

A credential object also contains an IPropertySet value named properties, through which the same information can be managed.

In any case, when you collect credentials from a user and want to save them, create a PasswordCredential and pass it to PasswordVault.add:

var vault = new Windows.Security.Credentials.PasswordVault();
vault.add(cred);

Note that if you add a credential to the locker with a resource and userName that already exist, the new credential will replace the old. And if at any point you want to delete a credential from the locker, call the PasswordVault.remove method with that credential.

Furthermore, even though a PasswordCredential object sees the world in terms of usernames and passwords, that password can be anything else you need to store securely, such as an access token. As we’ll see in the next section, authentication through OAuth providers might return such a token, in which case you might store something like “Facebook_Token” in the credential’s resource property, your app name in userName, and the token in password. This is a perfectly legitimate and expected use.
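Here is a small sketch of that token pattern. The saveAccessToken and loadAccessToken names are hypothetical, and credentials stands in for the Windows.Security.Credentials namespace. I'm also assuming that retrieve throws when no matching entry exists, so the lookup is wrapped in a try/catch.

```javascript
// credentials is expected to be Windows.Security.Credentials.
// Stores a token using the resource/userName/password slots of a credential.
function saveAccessToken(credentials, resource, appName, token) {
    var vault = new credentials.PasswordVault();
    vault.add(new credentials.PasswordCredential(resource, appName, token));
}

// Returns the stored token, or null if nothing is stored for this
// resource and app name.
function loadAccessToken(credentials, resource, appName) {
    var vault = new credentials.PasswordVault();
    try {
        return vault.retrieve(resource, appName).password;
    } catch (e) {
        // Assumption: a missing entry surfaces as an exception.
        return null;
    }
}
```

For example, saveAccessToken(Windows.Security.Credentials, "Facebook_Token", "MyApp", token) would store a token under the resource name suggested above, with "MyApp" as a placeholder app name.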

Once a credential is in the locker, it will remain there for subsequent launches of the app until you call the remove method or the user explicitly deletes it through Control Panel > User Accounts and Family Safety > Credential Manager. On a trusted PC (which requires sign-in with a Microsoft account), Windows will also automatically and securely roam the contents of the locker to the user’s other devices (which can be turned off in PC Settings > Sync Your Settings > Passwords). This helps to create a seamless experience with your app as the user moves between devices.

So, when you launch an app—even when launching it for the first time—always check if the locker contains saved credentials. There are several methods of the PasswordVault class for doing this:

findAllByResource Returns an array (vector) of credential objects for a given resource identifier. This is how you can obtain the username and password that’s been roamed from another device, because the app would have stored those credentials in the locker on the other machine under the same resource.

findAllByUserName Returns an array (vector) of credential objects for a given username. This is useful if you know the username and want to retrieve all the credentials for multiple resources that the app connects to.

retrieve Returns a single credential given a resource identifier and a username. Again, there will only ever be a single credential in the locker for any given resource and username.

retrieveAll Returns a vector of all credentials in the locker for this app. The vector contains a snapshot of the locker and will not be updated with later changes to credentials in the locker.

There is one subtle difference between the findAll and retrieve methods in the list above. The retrieve method will provide you with fully populated credentials objects. The findAll methods, on the other hand, will give you objects in which the password properties are still empty. This avoids performing password decryption on what is potentially a large number of credentials. To populate that property for any individual credential, call the PasswordCredential.retrievePassword method.
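A startup check along these lines might be sketched as follows. The loadSavedLogin name is hypothetical, and the try/catch reflects my assumption that the lookup can throw when nothing is stored under the resource.

```javascript
// vault is expected to be a Windows.Security.Credentials.PasswordVault.
// Returns { userName, password } for the first credential stored under the
// resource, or null if there is none.
function loadSavedLogin(vault, resource) {
    var found;
    try {
        found = vault.findAllByResource(resource);
    } catch (e) {
        // Assumption: an empty result may surface as an exception.
        return null;
    }

    // WinRT vectors expose size; plain arrays expose length.
    var count = (found.size !== undefined) ? found.size : found.length;
    if (!count) {
        return null;
    }

    var cred = found[0];
    cred.retrievePassword();    // the password is empty until this call
    return { userName: cred.userName, password: cred.password };
}
```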

For further demonstrations of the credential locker—the code is very straightforward—refer to the Credential locker sample. This shows variations for single user/single resource (Scenario 1), single user/multiple resources (Scenario 2), multiple users/multiple resources (Scenario 3), and clearing out the locker entirely (Scenario 4).

The Web Authentication Broker

Although apps can acquire and manage user credentials of their own, supplying users perhaps with the ability to create app-specific or service-specific accounts (typically through the Settings charm, as discussed in Chapter 8 and also on Guidelines and checklist for login controls), you might want to simply leverage an account that the user has already created through another OAuth provider, especially when you want to use that provider’s resources. You’ve likely experienced this on many websites already, where you log in through another site like Facebook. Of course, that typically means navigating away from the original website to the provider’s site—a process that flows well enough in a web browser but isn’t quite so attractive in the context of an app!

For this purpose, Windows provides the Web Authentication Broker, which essentially does the same job without leaving the context of the app itself. An app provides the URI of the authenticating page of the external site (which must use the https:// URI scheme, otherwise you get an invalid parameter error). The broker then creates a new web host process in its own app container, into which it loads the indicated web page. The UI for that process is displayed as an overlay dialog on the app, as shown in Figure 14-3, for which I’m using Scenario 1 of the Web authentication broker sample.

Image

FIGURE 14-3 The Web authentication broker sample using a Facebook login page.

Note To run the sample you’ll need an app ID for each of the authentication providers in the various scenarios. For Facebook in Scenario 1, visit http://developers.facebook.com/setup and create an App ID/API Key for a test app.

In the case of Facebook, the authentication process involves more than just checking the user’s credentials. It also needs to obtain permission for other capabilities that the app wants to use (which the user might have independently revoked directly through Facebook). As a result, the authentication process might navigate to additional pages, each of which still appears within the web authentication broker, as shown in Figure 14-4. In this case the app identity, ProgrammingWin8_AuthTest, is just one that I created through the Facebook developer setup page for the purposes of this demonstration.

Image

FIGURE 14-4 Additional authentication steps for Facebook within the web authentication broker.

Within the web authentication broker UI, the user might be taken through multiple pages on the provider’s site (but note that the back button next to the “Connecting to a service” title dismisses the dialog entirely). But this raises a question: how does the broker know when authentication is actually complete? On the right side of Figure 14-4, clicking the Allow button is the last step in the process, after which Facebook would normally show a login success page. In the context of an app, however, we don’t need that page to appear—we’d rather have the broker’s UI taken down so that we return to the app with the results of the authentication. What’s more, many OAuth providers don’t even have such a page—so what do we do?

Fortunately, the broker takes this into account. As we’ll see in a moment, the app simply provides the URI of that final page of the provider’s process. When the broker detects that it’s navigated to that page, it removes its UI and gives the response to the app.

As part of this process, Facebook saves these various permissions in its own back end for each particular user, so even if the app started the authentication process again, the user would not see the same pages shown in Figure 14-4. The user can, of course, manage these permissions when visiting Facebook through a web browser; if the user deletes the app information there, these additional authentication steps would reappear.

In WinRT, the broker is represented by the Windows.Security.Authentication.Web.WebAuthenticationBroker class. Authentication happens through its authenticateAsync methods. I say “methods” here because there are two variations. We’ll look at one here and return to the second in the next section, “Single Sign On.”

The first variant of the authenticateAsync method takes three arguments:

options Any combination of values from the WebAuthenticationOptions enumeration (combined with bitwise OR). Values are none (the default), silentMode (no UI is shown), useTitle (returns the window title of the webpage in the results), useHttpPost (returns the body of the page with the results), and useCorporateNetwork (to render the web page in an app container with the Private Networks (Client & Server), Enterprise Authentication, and Shared User Certificates capabilities; the app must have also declared these).

requestUri The URI (Windows.Foundation.Uri) for the provider’s authentication page along with the parameters required by the service; again, this must use the https:// URI scheme.

callbackUri The URI (Windows.Foundation.Uri) of the provider’s final page in its authentication process. The broker again uses this to determine when to take down its UI.

The result given to the completed handler for authenticateAsync is a WebAuthenticationResult object. This contains properties named responseStatus (a WebAuthenticationStatus with either success, userCancel, or errorHttp), responseData (a string that will contain the page title and body if the useTitle and useHttpPost options are set, respectively), and responseErrorDetail (an HTTP response number).
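As a minimal sketch of calling this variant and checking the outcome, consider the helper below. Its name is hypothetical, and web stands in for the Windows.Security.Authentication.Web namespace; returning null for anything other than success is just one possible convention.

```javascript
// web is expected to be Windows.Security.Authentication.Web; requestUri and
// callbackUri are Windows.Foundation.Uri objects built by the caller.
// Produces a promise for the responseData string on success, or for null
// if the user canceled or an HTTP error occurred.
function authenticate(web, requestUri, callbackUri) {
    return web.WebAuthenticationBroker.authenticateAsync(
        web.WebAuthenticationOptions.none, requestUri, callbackUri)
        .then(function (result) {
            if (result.responseStatus === web.WebAuthenticationStatus.success) {
                return result.responseData;
            }
            // userCancel or errorHttp; for the latter, responseErrorDetail
            // holds the HTTP response number.
            return null;
        });
}
```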

Generally speaking, the app is most interested in the contents of responseData, because it will contain whatever tokens or other keys that might be necessary later on. Let’s look at this again in the context of Scenario 1 of the Web authentication broker sample. Set a breakpoint within the completed handler of authenticateAsync (line 59 or thereabouts), and then run the sample, enter an app ID you created earlier, and click Launch. (Note that the callbackUri parameter is set to https://www.facebook.com/connect/login_success.html, which is where the authentication process finishes up.)

In the case of Facebook, the responseData contains a string in this format:

https://www.facebook.com/connect/login_success.html#access_token=<token>&expires_in=<timeout>

where <token> is a bunch of alphanumeric gobbledygook and <timeout> is some period defined by Facebook. If you’re calling any Facebook APIs—which is likely because that’s why you’re authenticating through Facebook in the first place—the <token> is the real treasure you’re after because it’s how you authenticate the user when making later calls to that API.
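Pulling the token out of that string is a matter of splitting the part after the # character. The parseAuthResponse function below is a hypothetical helper written for this format; other providers may return their data differently.

```javascript
// Splits the fragment (the part after "#") of a response string into a
// simple name/value object. Returns an empty object if there is no fragment.
function parseAuthResponse(responseData) {
    var values = {};
    var hashIndex = responseData.indexOf("#");
    if (hashIndex < 0) {
        return values;
    }

    responseData.substring(hashIndex + 1).split("&").forEach(function (pair) {
        var eq = pair.indexOf("=");
        if (eq > 0) {
            values[decodeURIComponent(pair.substring(0, eq))] =
                decodeURIComponent(pair.substring(eq + 1));
        }
    });

    return values;
}
```

With the Facebook format shown above, parseAuthResponse(responseData).access_token gives you the token and expires_in gives you the timeout as a string.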

This token is what you then save in the credential locker for later use when the app is relaunched after being closed or terminated. (With Facebook, you don’t need to worry about the expiration of that token because the API generally reports that as an error and has a built-in renewal process.) You’d do something similar with other authentication providers, referring, of course, to their particular documentation on what information you’ll receive with the response and how to use and/or renew keys or tokens when necessary.

All in all, a key benefit to web authentication is that the user never actually gives credentials to an app—the user gives them only to a much more trusted provider. From the app’s point of view as well, it never needs to ask for or manage those credentials, only the tokens returned by the provider. For this same reason, invoking the broker as we’ve seen here will always show the login page with blank fields, irrespective of the Keep Me Logged In check box, because the calling app doesn’t retain any of that information, and any cookies and session state created within the broker’s hosting environment will have been discarded. So, if the app wants to have the user log in again with different credentials, it would just invoke the broker as before and replace whatever tokens or keys it saved from the last authentication.

Speaking of providers, the OAuth page on Wikipedia lists current authentication providers. The Web authentication broker sample, for its part, shows how to work specifically with Facebook (Scenario 1), Twitter (Scenario 2), Flickr (Scenario 3), and Google/Picasa (Scenario 4), and it also provides a generic interface for any other service (Scenario 5).

It’s instructive to look through these various scenarios. Because Facebook and Google use the OAuth 2.0 protocol, the requestUri for each is relatively simple (ignore the word wrapping):


https://www.facebook.com/dialog/oauth?client_id=<client_id>&redirect_uri=<redirectUri>&
scope=read_stream&display=popup&response_type=token

https://accounts.google.com/o/oauth2/auth?client_id=<client_id>&redirect_uri=<redirectUri>&
response_type=code&scope=http://picasaweb.google.com/data

where <client_id> and <redirectUri> are replaced with whatever is specific to the app. Twitter and Flickr, for their parts, use OAuth 1.0a protocol instead, so much more ceremony goes into creating the lengthy OAuth token to include with the requestUri argument to authenticateAsync. I’ll leave it to the sample code to show those details.
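For the OAuth 2.0 case, assembling the request string is simple enough to sketch here. The function name is hypothetical, and encoding the parameters with encodeURIComponent is my own choice for safety rather than something the provider's format above spells out.

```javascript
// Builds the Facebook request string from an app's client ID and the
// redirect (callback) address. Wrap the result in a Windows.Foundation.Uri
// before passing it to authenticateAsync.
function buildFacebookRequestUri(clientId, redirectUri) {
    return "https://www.facebook.com/dialog/oauth" +
        "?client_id=" + encodeURIComponent(clientId) +
        "&redirect_uri=" + encodeURIComponent(redirectUri) +
        "&scope=read_stream&display=popup&response_type=token";
}
```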

Tip Web authentication events are visible in the Event Viewer under Application and Services Logs > Microsoft > Windows > WebAuth > Operational. This can be helpful for debugging because it brings out information that is otherwise hidden behind the opaque layer of the broker.

Single Sign On

What we’ve seen so far with the credential locker and the web authentication broker works very well to minimize how often the app needs to pester the user for credentials. Where a single app is concerned, it would ideally only ask for credentials once until such time as the user explicitly logs out. But what about multiple apps? Imagine over time that you acquire some dozens, or even hundreds, of apps from the Windows Store that use external authentication providers. It could mean that you’d have to enter your Facebook, Twitter, Google, LinkedIn, Tumblr, Yahoo, or Yammer credentials in each app that uses them. Sure, you might need to do that only once in each individual app, but the compound effect is still tedious and annoying!

From the user’s point of view, once they’ve authenticated through a given provider in one app, it makes sense that other apps should benefit from that authentication if possible. Yes, some apps might need to prompt for additional permissions and some providers may not support the process, but the ideal is again to minimize the fuss and bother where we can.

The concept of single sign on is exactly this: authenticating the user in one app (or the system in the case of a Microsoft account) effectively logs the user in to other apps that use the same provider. At the same time, each app must often acquire its own access keys or tokens, because these should not be shared between apps. So the real trick is to effectively perform the same kind of authentication we’ve already seen, only to do it without showing any UI unless it’s really necessary.

This is provided for in the web authentication broker through the variation of authenticateAsync that takes only the options and requestUri arguments. In this case options is often set to WebAuthenticationOptions.silentMode to prevent the broker’s UI from appearing (but this isn’t required).

To make silentMode work, the broker still needs to know when the process is complete. So what callbackUri does it use for comparison, and how does the provider know that itself? It sounds like a situation where the broker would just sit there, forever hidden, while the provider patiently waits for input to a web page that’s equally invisible! What actually happens is that authenticateAsync watches for the provider to navigate to a special callbackUri that looks something like ms-app://<app_package>/<secret_sauce>, at which point it will pass the provider’s response data as the async result.

Of course, that URI won’t mean a thing to the provider…unless it’s told about it beforehand and is expecting such a URI to appear in its midst.

This brings us to the fact that single sign on will work only if a provider has a means (an API or such) through which an app can communicate its intent along these lines. To understand this, let’s follow the entire flow of the silent authentication process:

1. An app that wants to use single sign on obtains its particular ms-app:// URI—also called an SID URI—through one of two means. First is by calling the static method WebAuthenticationBroker.getCurrentApplicationCallbackUri. This returns a Windows.Foundation.Uri object whose absoluteUri property is the string you need. The second means is through the Windows Store Dashboard > Manage Your Cloud Services > Advanced Features > Application Authentication page, where you should see a string that looks like this:
ms-app://s-1-15-2-477157379-2961032073-432767880-3229792171-202870256-1369967874-2241987394/.

2. If necessary, the app then calls the provider’s API to register the SID URI (typically a provider will have a page to define an app where you’d enter this).

3. When constructing the requestUri argument for authenticateAsync, the app inserts its SID URI as the value of the &redirect_uri= parameter.

4. The app calls authenticateAsync with the silentMode option.

5. When the provider processes the requestUri parameters, it checks whether the redirect_uri value has been registered, responding with a failure if it hasn’t.

6. Having validated the app, the provider then silently authenticates (if possible) and navigates to the redirect_uri, making sure to include things like access keys and tokens in the response data.

7. The web authentication broker will detect this navigation and match it to its special callbackUri. Finding a match, the broker can complete the async operation and provide the response data to the app.

Again, the provider must have a way for the developer or app to register its SID URI, must check that SID URI when it appears in an authentication request, and must write appropriate response data to that page when authentication is complete. The developer or app is then responsible for registering that SID URI in the first place and including it in the requestUri. (Whew, that’s a lot of URIs!)

With all of this, it’s still possible that the authentication might fail for some other reason. For example, if the user has not set up permissions for the app in question (as with Facebook), it’s not possible to silently authenticate. So, an app attempting to use single sign on would call this form of authenticateAsync first and, failing that, would then revert to calling its longer form (with UI), as described in the previous section.
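To make this flow more concrete, here’s a minimal sketch of steps 1, 3, and 4 along with the fallback to the UI-based form. Treat it as an illustration under assumptions rather than any provider’s prescribed method: buildRequestUri and signIn are hypothetical helper names, and the client_id and response_type parameters are placeholders that will vary by provider. Reusing the SID URI as the callbackUri for the UI-based call is also just a choice made here so that both calls can share one requestUri.

```javascript
// Builds the request URI with the app's SID URI as the redirect_uri value (step 3).
// The client_id and response_type parameters are assumptions; providers differ.
function buildRequestUri(authorizeEndpoint, clientId, sidUri) {
    return authorizeEndpoint +
        "?client_id=" + encodeURIComponent(clientId) +
        "&redirect_uri=" + encodeURIComponent(sidUri) +
        "&response_type=token";
}

// Tries silent authentication first and falls back to the longer form with UI.
function signIn(authorizeEndpoint, clientId) {
    var web = Windows.Security.Authentication.Web;
    var broker = web.WebAuthenticationBroker;

    // Step 1: obtain the SID URI for this app.
    var sid = broker.getCurrentApplicationCallbackUri();
    var requestUri = new Windows.Foundation.Uri(
        buildRequestUri(authorizeEndpoint, clientId, sid.absoluteUri));

    function withUI() {
        // The longer form: options, requestUri, and an explicit callbackUri.
        return broker.authenticateAsync(web.WebAuthenticationOptions.none,
            requestUri, sid);
    }

    // Step 4: the two-argument form with silentMode shows no UI.
    return broker.authenticateAsync(web.WebAuthenticationOptions.silentMode, requestUri)
        .then(function (result) {
            var ok = result.responseStatus === web.WebAuthenticationStatus.success;
            return ok ? result : withUI();
        }, withUI);
}
```

An app would call something like signIn("https://example.com/oauth/authorize", "my-client-id") (both values hypothetical) and then examine the responseData of the result it eventually receives.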

Single Sign On with Live Connect

Because various Microsoft services, such as Hotmail, are OAuth providers, it is possible to use the web authentication broker with a Microsoft account (such as Hotmail, Live, and MSN accounts). (I still have the same @msn.com email account I’ve had since 1996!) Details can be found on the OAuth 2.0 page on the Live Connect Developer Center.

However, Live Connect accounts are in a somewhat more privileged position because they can also be used to sign in to Windows or can be connected to a domain account used for the same purpose. Many of the built-in apps such as Mail, Calendar, SkyDrive, People, and for that matter the Windows Store itself work with this same account. Thus, it’s something that many other apps might want to take advantage of as well, because such authentication provides access to the same Live services data that those built-in apps draw from themselves.

The Live Services API for signing in this way is called WL.login, which is available when you install the Live SDK and add the appropriate references to your project. To get started with that process, visit the Live Connect documentation and check out the following references:

Live Connect (Windows Store apps) home page

Live Connect Developer Center (Windows Store Apps)

Guidelines for single sign-on and connected accounts

Guidelines for the Microsoft account sign-in experience

Single sign-on with Microsoft accounts

Quickstart: Accessing Live services data

Windows account authorization sample

Bring single sign-on and SkyDrive to your Windows 8 apps with the Live SDK and Best Practices when adding single sign-on to your app with the Live SDK on the Windows 8 Developer Blog.

As you can imagine, working with Live Services is an extensive topic, so I’ll defer to the resources above. One key point, though, is that it’s possible for a user to log in to Windows with a domain account that has not been connected to a Microsoft account through PC Settings > Users. In this case, the first call to WL.login from any app will display the Microsoft account login dialog, as shown in Figure 14-5. Once the user enters credentials here, they’re logged in to all other apps that use the Microsoft account.


FIGURE 14-5 The Microsoft account login dialog.

The User Profile (and the Lock Screen Image)

Any discussion about user credentials brings up the question of accessing additional user information. What is available to Windows Store apps is provided through the Windows.System.UserProfile API, where we find three classes of interest.

The first is the LockScreen class through which you can get or set the lock screen image (and nothing more). The image is available through the originalImageFile property (returning a StorageFile) and the getImageStream method (returning an IRandomAccessStream). Setting the image can be accomplished through setImageFileAsync (using a StorageFile) and setImageStreamAsync (using an IRandomAccessStream). This would be utilized in a photo app that has a command to use a picture for the lock screen. See the Lock screen personalization sample for a demonstration.

The second is the GlobalizationPreferences object, which we’ll return to in Chapter 17, “Apps for Everyone.”

Third is the UserInformation class, whose capabilities are clearly exercised within PC Settings > Personalize > Account picture:

• User name If the nameAccessAllowed property is true, an app can then call getDisplayNameAsync, getFirstNameAsync, and getLastNameAsync, all of which provide a string to your completed handler. If nameAccessAllowed is false, these methods will complete but provide an empty result. Also note that the first and last names are available only from a Microsoft account.

• User picture Retrieved through getAccountPicture, which returns a StorageFile for the image. The method takes a value from AccountPictureKind: smallImage, largeImage, and video.

• If the accountPictureChangeEnabled property is true, you can use one of four methods to set the image(s): setAccountPictureAsync (for providing one image from a StorageFile), setAccountPicturesAsync (for providing small and large images as well as a video from StorageFile objects), and setAccountPictureFromStreamAsync and setAccountPicturesFromStreamAsync (which do the same given IRandomAccessStream objects instead). In each case the async result is a SetAccountPictureResult value: success, failure, changeDisabled (accountPictureChangeEnabled is false), largeOrDynamicError (the picture is too large), fileSizeError (file is too large), or videoFrameSizeError (video frame size is too large).

• The accountpicturechanged event signals when the user picture(s) have been altered. Remember that because this event originates within WinRT you should call removeEventListener if you aren’t listening for this event for the lifetime of the app.

These features are demonstrated in the Account picture name sample. Scenario 1 retrieves the display name, Scenario 2 retrieves the first and last name (if available), Scenario 3 retrieves the account pictures and video, and Scenario 4 changes the account pictures and video and listens for picture changes.
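As a quick illustration of the user name case, here’s a minimal sketch that checks nameAccessAllowed before asking for the display name. The fetchDisplayName helper, its fallbackName parameter, and its onName callback are hypothetical names of my own and aren’t taken from the sample.

```javascript
// Passes the user's display name to onName, or fallbackName when access to the
// name is not allowed or the name comes back empty.
function fetchDisplayName(fallbackName, onName) {
    var userInfo = Windows.System.UserProfile.UserInformation;

    if (!userInfo.nameAccessAllowed) {
        onName(fallbackName);
        return;
    }

    userInfo.getDisplayNameAsync().done(function (name) {
        // An empty string is possible, so fall back in that case too.
        onName(name || fallbackName);
    });
}
```

Because the name methods complete with an empty result when access is denied, the explicit nameAccessAllowed check isn’t strictly necessary; it simply avoids an async round trip that won’t yield anything useful.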

Tip To obtain the profile picture from Live Connect, the exact API call is as follows: https://apis.live.net/v5.0/me/picture?access_token=<ACCESS_TOKEN>.

One other bit that this sample demonstrates is the Account Picture Provider declaration in its manifest, which causes the app to appear within PC Settings > Personalize under Create an Account Picture:

Image

In this case the sample doesn’t actually provide a picture directly but activates into Scenario 4. A real app, like the Camera app that’s also in PC Settings by default, will automatically set the account picture when one is acquired through its UI. How does it know to do this? The answer lies in a special URI scheme through which the app is activated. That is, when you declare the Account Picture Provider declaration in the manifest, the app will be activated with the activation kind of protocol (see Chapter 12, “Contracts”), where the URI scheme specifically starts with ms-accountpictureprovider. You can see how this is handled in the sample’s js/default.js file:

if (eventObject.detail.kind === Windows.ApplicationModel.Activation.ActivationKind.protocol) {
    // Check if the protocol matches the "ms-accountpictureprovider" scheme
    if (eventObject.detail.uri.schemeName === "ms-accountpictureprovider") {
        // This app was activated via the Account picture apps section in PC Settings.
        // Here you would do app-specific logic for providing the user with account
        // picture selection UX
    }
}

Returning to the UserInformation class, it also provides a few more details for domain accounts provided that the app has declared the Enterprise Authentication capability in its manifest:

getDomainNameAsync Provides the user’s fully qualified domain name as a string in the form of <domain>\<user> where <domain> is the full name of the domain controller, such as mydomain.corp.ourcompany.com.

getPrincipalNameAsync Provides the principal name as a string. In Active Directory parlance, this is an Internet-style login name (known as a user principal name or UPN) that is shorter and simpler than the domain name, consolidating the email and login namespaces. Typically, this is an email address like user@ourcompany.com.

getSessionInitiationProtocolUriAsync Provides a session initiation protocol URI that will connect with this user; for background, see Session Initiation Protocol (Wikipedia).

The use of these methods is demonstrated in the User domain name sample.

Encryption, Decryption, Data Protection, and Certificates

Authorization and credentials are a security matter, so it’s appropriate to end this section with a quick rundown of the other APIs grouped under the Windows.Security namespace, where we found the web authentication broker already.

First is Windows.Security.Cryptography. Here you’ll find the CryptographicBuffer class that can encode and decode strings in hexadecimal and base64 (UTF-8 or UTF-16) and also provide random numbers and a byte array full of such randomness. Refer to Scenario 1 of the CryptoWinRT sample for some demonstrations, as well as Scenarios 2 and 3 of the Web authentication broker sample. WinRT’s base64 encoding is fully compatible with the JavaScript atob and btoa functions.
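To give a feel for CryptographicBuffer, here’s a minimal sketch of a base64 round trip and of generating random bytes. The helper names base64RoundTrip and randomHex are hypothetical, chosen only for illustration; the sample itself is the better reference for real usage.

```javascript
// Encodes a string as base64 and decodes it again, treating the text as UTF-8.
function base64RoundTrip(text) {
    var cryptography = Windows.Security.Cryptography;
    var cb = cryptography.CryptographicBuffer;
    var utf8 = cryptography.BinaryStringEncoding.utf8;

    // String -> IBuffer -> base64 string
    var buffer = cb.convertStringToBinary(text, utf8);
    var encoded = cb.encodeToBase64String(buffer);

    // base64 string -> IBuffer -> string
    var decoded = cb.convertBinaryToString(utf8, cb.decodeFromBase64String(encoded));

    return { encoded: encoded, decoded: decoded };
}

// Returns a hexadecimal string built from the requested number of random bytes.
function randomHex(byteCount) {
    var cb = Windows.Security.Cryptography.CryptographicBuffer;
    return cb.encodeToHexString(cb.generateRandom(byteCount));
}
```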

Next is Windows.Security.Cryptography.Core, which is truly about encryption and decryption according to various algorithms. See the Encryption topic, Scenarios 2-8 of the CryptoWinRT sample, and again Scenarios 2 and 3 of the Web authentication broker sample.

Third is Windows.Security.Cryptography.DataProtection, whose single class, DataProtectionProvider, deals with protecting and unprotecting both static data and a data stream. This applies only to apps that declare the Enterprise Authentication capability. For details, refer to Data protection API along with Scenarios 9 and 10 of the CryptoWinRT sample.

Fourth, Windows.Security.Cryptography.Certificates provides several classes through which you can create certificate requests and install certificate responses. Refer to Working with certificates and the Certificate enrollment sample for more.

And lastly it’s worth at least listing the API under Windows.Security.ExchangeActiveSyncProvisioning for which there is the EAS policies for mail clients sample. I’m assuming that if you know why you’d want to look into this, well, you’ll know!

Syndication

When we first looked at doing XmlHttpRequests with WinJS.xhr in Chapter 3, we grabbed the RSS feed from the Windows 8 Developer Blog with the URI http://blogs.msdn.com/b/windowsappdev/rss.aspx. We learned then that WinJS.xhr returned a promise, the result of which contained a responseXML property, which is itself a DomParser through which you can traverse the DOM structure and so forth.

Working with syndicated feeds like this is completely supported for Windows Store apps. In fact, the How to create a mashup topic in the documentation describes exactly this process, components of which are demonstrated in the Integrating content and controls from web services sample.

That said, WinRT offers additional APIs for dealing with syndicated content. One, Windows.Web.Syndication, offers a more structured way to work with RSS feeds. The other, Windows.Web.AtomPub, provides a means to publish and manage feed entries. Both are provided in WinRT for languages that don’t have another means of accomplishing the same ends, but as a developer working in JavaScript, you have the choice.

Reading RSS Feeds

The primary class within Windows.Web.Syndication is the SyndicationClient. To work with any given feed, you create an instance of this class and set any necessary properties. These are serverCredential (a PasswordCredential), proxyCredential (another PasswordCredential), timeout (in milliseconds; the default is 30000, or 30 seconds), maxResponseBufferSize (a means to protect from potentially malicious servers), and bypassCacheOnRetrieve (a Boolean to indicate whether to always obtain new data from the server). You can also make as many calls as you need to its setRequestHeader method (passing a name and value) to configure the XmlHttpRequest header.

The final step is to then call the SyndicationClient.retrieveFeedAsync method with the URI of the desired RSS feed (a Windows.Foundation.Uri). Here’s an example derived from the Syndication sample, which retrieves the RSS feed for the Building Windows 8 blog:

var uri = new Windows.Foundation.Uri("http://blogs.msdn.com/b/b8/rss.aspx");
var client = new Windows.Web.Syndication.SyndicationClient();
client.bypassCacheOnRetrieve = true;
client.setRequestHeader("User-Agent",
    "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0)");

client.retrieveFeedAsync(uri).done(function (feed) {
    // feed is a SyndicationFeed object
});

The result of retrieveFeedAsync is a Windows.Web.Syndication.SyndicationFeed object; that is, the SyndicationClient is what you use to talk to the service, and when you retrieve the feed you get an object through which you can then process the feed itself. If you take a look at SyndicationFeed using the link above, you’ll see that it’s wholly stocked with properties that represent all the parts of the feed, such as authors, categories, items, title, and so forth. Some of these are represented themselves by other classes in Windows.Web.Syndication, or collections of them, where simpler types aren’t sufficient: SyndicationAttribute, SyndicationCategory, SyndicationContent, SyndicationGenerator, SyndicationItem, SyndicationLink, SyndicationNode, SyndicationPerson, and SyndicationText. I’ll leave the many details to the documentation.

We can see some of this in the sample, picking up from inside the completed handler for retrieveFeedAsync. Let me offer a more annotated version of that code:

client.retrieveFeedAsync(uri).done(function (feed) {
    currentFeed = feed;

    var title = "(no title)";

    // currentFeed.title is a SyndicationText object
    if (currentFeed.title) {
       title = currentFeed.title.text;
    }

    // currentFeed.items is a SyndicationItem collection (array)
    currentItemIndex = 0;
    if (currentFeed.items.size > 0) {
       displayCurrentItem();
    }
});

// ...

function displayCurrentItem() {
    // item will be a SyndicationItem

    var item = currentFeed.items[currentItemIndex];

    // Display item number.
    document.getElementById("scenario1Index").innerText = (currentItemIndex + 1) + " of "
        + currentFeed.items.size;

    // Display title (item.title is another SyndicationText).
    var title = "(no title)";
    if (item.title) {
        title = item.title.text;
    }
    document.getElementById("scenario1ItemTitle").innerText = title;

    // Display the main link (item.links is a collection of SyndicationLink objects).
    var link = "";
    if (item.links.size > 0) {
        link = item.links[0].uri.absoluteUri;

    }

    var scenario1Link = document.getElementById("scenario1Link");
    scenario1Link.innerText = link;
    scenario1Link.href = link;

    // Display the body as HTML (item.content is a SyndicationContent object, item.summary is
    // a SyndicationText object).
    var content = "(no content)";
    if (item.content) {
        content = item.content.text;
    }
    else if (item.summary) {
        content = item.summary.text;
    }
    document.getElementById("scenario1WebView").innerHTML = window.toStaticHTML(content);

    // Display element extensions. The elementExtensions collection contains all the additional
    // child elements within the current element that do not belong to the Atom or RSS standards
    // (e.g., Dublin Core extension elements). By creating an array of these, we can create a
    // WinJS.Binding.List that's easily displayed in a ListView.
    var bindableNodes = [];
    for (var i = 0; i < item.elementExtensions.size; i++) {
        var bindableNode = {
            nodeName: item.elementExtensions[i].nodeName,
            nodeNamespace: item.elementExtensions[i].nodeNamespace,
            nodeValue: item.elementExtensions[i].nodeValue,
        };
        bindableNodes.push(bindableNode);
    }
    var dataList = new WinJS.Binding.List(bindableNodes);
    var listView = document.getElementById("extensionsListView").winControl;
    WinJS.UI.setOptions(listView, { itemDataSource: dataList.dataSource });
}

It’s probably obvious that the API, under the covers, is just using the XmlDocument API to retrieve all of these properties from the feed. In fact, the SyndicationFeed.getXmlDocument method returns that XmlDocument if you want to access it yourself.

You can also create a SyndicationFeed object around the XML for a feed you might already have. For example, if you obtain the feed contents by using WinJS.xhr, you can create a new SyndicationFeed object and call its load method with the XHR responseXML. Then you can work with the feed through the class hierarchy. When using the Windows.Web.AtomPub API to manage a feed, you also create a new or updated SyndicationItem to send across the wire, setting its values through the other objects in its hierarchy. We’ll see this shortly.

One last note: if retrieveFeedAsync throws an exception, which would be picked up by an error handler you provide to the promise’s done method, you can turn the error code into a SyndicationErrorStatus value. Here’s how it’s used in the sample’s error handler:


function onError(err) {
    // Match error number with a SyndicationErrorStatus value. Use
    // Windows.Web.WebErrorStatus.getStatus() to retrieve HTTP error status codes.
    var errorStatus = Windows.Web.Syndication.SyndicationError.getStatus(err.number);
    if (errorStatus === Windows.Web.Syndication.SyndicationErrorStatus.invalidXml) {
        displayLog("An invalid XML exception was thrown. Please make sure to use a URI that "
            + "points to an RSS or Atom feed.");
    }
}

Using AtomPub

On the flip side of reading an RSS feed, as we’ve just seen, is the need to possibly manage entries on a feed: adding, removing, and editing entries. This would be used for an app that lets the user maintain a specific blog, not just read entries from others.

The API for this is found in Windows.Web.AtomPub and demonstrated in the AtomPub sample. The main class is the AtomPubClient that encapsulates all the operations of the AtomPub protocol. It has methods like createResourceAsync, retrieveResourceAsync, updateResourceAsync, and deleteResourceAsync for working with those entries, where each resource is identified with a URI and a SyndicationItem object, as appropriate. Media resources for entries are managed through createMediaResourceAsync and similarly named methods, where the resource is provided as an IInputStream.

The AtomPubClient also has retrieveFeedAsync and setRequestHeader methods that do the same as the SyndicationClient methods of the same names, along with a few similar properties like serverCredential, timeout, and bypassCacheOnRetrieve. Another method, retrieveServiceDocumentAsync, provides the workspaces/service documents for the feed (in the form of a Windows.Web.AtomPub.ServiceDocument object).

Again, the AtomPub sample demonstrates the different operations: retrieve (Scenario 1), create (Scenario 2), delete (Scenario 3), and update (Scenario 4). Here’s how it first creates the AtomPubClient object (see js/common.js), assuming there are credentials:

function createClient() {
    client = new Windows.Web.AtomPub.AtomPubClient();
    client.bypassCacheOnRetrieve = true;

    var credential = new Windows.Security.Credentials.PasswordCredential();
    credential.userName = document.getElementById("userNameField").value;
    credential.password = document.getElementById("passwordField").value;
    client.serverCredential = credential;
}

Updating an entry (js/update.js) then looks like this, where the update is represented by a newly created SyndicationItem:

function getCurrentItem() {
    if (currentFeed) {
        return currentFeed.items[currentItemIndex];
    }
    return null;
}

var resourceUri = new Windows.Foundation.Uri( /* service address */ );
createClient();

var currentItem = getCurrentItem();

if (!currentItem) {
    return;
}

// Update the item
var updatedItem = new Windows.Web.Syndication.SyndicationItem();
var title = document.getElementById("titleField").value;
updatedItem.title = new Windows.Web.Syndication.SyndicationText(title,
    Windows.Web.Syndication.SyndicationTextType.text);
var content = document.getElementById("bodyField").value;
updatedItem.content = new Windows.Web.Syndication.SyndicationContent(content,
    Windows.Web.Syndication.SyndicationTextType.html);

client.updateResourceAsync(currentItem.editUri, updatedItem).done(function () {
    displayStatus("Updating item completed.");
}, onError);

Error handling in this case works with the Windows.Web.WebError class (see js/common.js):

function onError(err) {
   displayError(err);

   // Match error number with a WebErrorStatus value, in order to deal with a specific error.
   var errorStatus = Windows.Web.WebError.getStatus(err.number);
   if (errorStatus === Windows.Web.WebErrorStatus.unauthorized) {
       displayLog("Wrong username or password!");
   }
}
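Creating and deleting entries follow much the same pattern as updating. Here’s a minimal sketch of both, assuming a client like the one built by createClient above. The createEntry and deleteEntry helpers are hypothetical names rather than code from the sample, and passing the title as the description argument of createResourceAsync is an arbitrary choice made for illustration.

```javascript
// Creates a new entry on the feed at resourceUri (a Windows.Foundation.Uri).
function createEntry(client, resourceUri, title, body) {
    var syndication = Windows.Web.Syndication;

    var item = new syndication.SyndicationItem();
    item.title = new syndication.SyndicationText(title,
        syndication.SyndicationTextType.text);
    item.content = new syndication.SyndicationContent(body,
        syndication.SyndicationTextType.html);

    // The second argument is a description of the new resource.
    return client.createResourceAsync(resourceUri, title, item);
}

// Deletes an existing entry (a SyndicationItem), identified by its editUri.
function deleteEntry(client, item) {
    return client.deleteResourceAsync(item.editUri);
}
```

Both helpers return the async operation, so a caller can attach completed and error handlers with done just as the update code does, for example deleteEntry(client, getCurrentItem()).done(onDeleted, onError), where onDeleted is whatever completed handler the app provides.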

Sockets

Sockets are a fundamental network transport. Unlike HTTP requests, where a client sends a request to a server and the server responds—essentially an isolated transaction—sockets are a connection between client and server IP ports such that either one can send information to the other at any time. Certainly we’ve seen a mechanism like this earlier—namely, using the Windows Push Notification Service (WNS). WNS, however, is limited to notifications and is specifically designed to issue tile updates or notifications for apps that aren’t running. Sockets, on the other hand, are for data exchange between a server and a running client.

Sockets are generally used when there isn’t a higher-level API or other abstraction for your particular scenario, when there’s a custom protocol involved, when you need two-way communication, or when it makes sense to minimize the overhead of each exchange. Consider HTTP, a protocol that is itself built on lower-level sockets. A single HTTP request generally includes headers and lots of other information beyond just the bit of data involved, so it’s an inefficient transport when you need to send lots of little bits. It’s better to connect directly with the server and exchange data with a minimized custom protocol. VoIP is another example where sockets work well, as are multicast scenarios like multiplayer games. In the latter, one player’s machine, acting as a server within a local subnet, can broadcast a message to all the other players, and vice versa, again with minimal overhead.

In the world of sockets, exchanging data can happen two ways: as discrete packets/messages (like water balloons) or as a continuous stream (like water running through a hose). These are called datagram and stream sockets, respectively, and both are supported through the WinRT API. WinRT also supports both forms of exchange through the WebSocket protocol, a technology originally created for web browsers and web servers that has become increasingly interesting for general purpose use within apps. All of the applicable classes can be found in the Windows.Networking.Sockets API, as we’ll see in the following sections. Note that because there is some overlap between the different types of sockets, these sections are meant to be read in sequence so that I don’t have to repeat myself too much!

Datagram Sockets

In the language of sockets, a water balloon is called a datagram, a bundle of information sent from one end of the socket to the other—even without a prior connection—according to the User Datagram Protocol (UDP) standard. UDP, as I summarize here from its description on Wikipedia, is simple, stateless, unidirectional, and transaction-oriented. It has minimal overhead and lacks retransmission delays, and for these reasons it cannot guarantee that a datagram will actually be delivered. Thus, it’s used where error checking and correction isn’t necessary or is done by the apps involved rather than at the network interface level. In a VoIP scenario, for example, this allows data packets to just be dropped if they cannot be delivered, rather than having everything involved wait for a delayed packet. As a result, the quality of the audio might suffer, but it won’t start stuttering or make your friends and colleagues sound like they’re from another galaxy. In short, UDP might be unreliable, but it minimizes latency. Higher-level protocols like the Real-time Transport Protocol (RTP) and the Real Time Streaming Protocol (RTSP) are built on UDP.

A Windows Store app works with this transport—either as a client or a server—using the Windows.Networking.Sockets.DatagramSocket class, an object that you need to instantiate with the new operator to set up a specific connection and listen for messages:

var listener = new Windows.Networking.Sockets.DatagramSocket();

On either side of the conversation, the next step is to listen for the object’s messagereceived event:

// Event from WinRT: remember to call removeEventListener as needed
listener.addEventListener("messagereceived", onMessageReceived);

When data arrives, the handler receives a—wait for it!—DatagramSocketMessageReceivedEventArgs object (that’s a mouthful). This contains localAddress and remoteAddress properties, both of which are a Windows.Networking.HostName that contains the IP address, a display name, and a few other bits. See the “Network Information (the Network Object Roster)” section earlier in this chapter for details. The event args also contain a remotePort string. More important, though, are the two methods through which you extract the data. One is getDataStream, which returns an IInputStream through which you can read sequential bytes. The other is getDataReader, which returns a Windows.Storage.Streams.DataReader object, a higher-level abstraction built on top of the IInputStream that helps you read specific data types directly. Clearly, if you know the data structure you expect to receive in the message, using the DataReader will relieve you from doing type conversions yourself.

Of course, to get any kind of data from a socket, you need to connect it to something. For this purpose there are a few methods in DatagramSocket for establishing and managing a connection:

connectAsync Starts a connection operation given a HostName object and a service name (or UDP port, a string) of the remote network destination. This is used to create a one-way client to server connection.

• Another form of connectAsync takes a Windows.Networking.EndpointPair object that specifies host and service names for both local and remote endpoints. This is used to create a two-way client/server connection, as the local endpoint implies a call to bindEndpointAsync as below.

bindEndpointAsync For a one-way server connection—that is, to only listen to but not send data on the socket—this method just binds a local endpoint given a HostName and a service name/port. Binding the service name by itself can be done with bindServiceNameAsync.

joinMulticastGroup Given a HostName, connects the Datagram socket to a multicast group.

close Terminates the connection and aborts any pending operations.

Tip To open a socket to a localhost port for debugging purposes, use connectAsync as follows:

  var socket = new Windows.Networking.Sockets.DatagramSocket();
  socket.connectAsync(new Windows.Networking.HostName("localhost"), "12345")
      .done(function () {
          // ...
      }, onError);

Note that any given socket can be connected to any number of endpoints—you can call connectAsync multiple times, join multiple multicast groups, and bind multiple local endpoints with bindEndpointAsync and bindServiceNameAsync. The close method, mind you, closes everything at once!

Once the socket has one or more connections, connection information can be retrieved with the DatagramSocket.information property (a DatagramSocketInformation). Also, note that the static DatagramSocket.getEndpointPairsAsync method provides (as the async result) a vector of available EndpointPair objects for a given remote hostname and service name. You can optionally indicate that you’d like the endpoints sorted according to the optimizeForLongConnections flag. See the documentation page linked here for details, but it basically lets you control which endpoint is preferred over others based on whether you want to optimize for a high-quality and long-duration connection that might take longer to connect to initially (as for video streaming) or for connections that are easiest to acquire (the default).

Control data can also be set through the DatagramSocket.control property, a DatagramSocketControl object with qualityOfService and outputUnicastHopLimit properties.

All this work, of course, is just a preamble to sending data on the socket connection. This is done through the DatagramSocket.outputStream property, an IOutputStream to which you can write whatever data you need using its writeAsync and flushAsync methods. This will send the data on every connection within the socket. Alternately, you can use one of the variations of getOutputStreamAsync to specify a specific EndpointPair or HostName/port to which to send the data. The result of both of these async operations is again an IOutputStream. And in all cases you can create a higher-level DataWriter object around that stream:

var dataWriter = new Windows.Storage.Streams.DataWriter(socket.outputStream);
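With a DataWriter in hand, sending typed data is a matter of writing into its buffer and then committing that buffer to the stream. Here’s a minimal sketch; the sendString helper is a hypothetical name of my own, not something from the sample.

```javascript
// Writes a string into the DataWriter's buffer and commits it to the
// underlying output stream. Returns the async operation from storeAsync.
function sendString(dataWriter, text) {
    dataWriter.writeString(text);
    return dataWriter.storeAsync();
}
```

A call such as sendString(dataWriter, "Hello").done(onSent, onError) would then send the message, where onSent and onError are whatever handlers the app provides.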

Here’s how it’s all demonstrated in the DatagramSocket sample, a little app in which you need to run each of the scenarios in turn. Scenario 1, for starters, sets up the server-side listener of the relationship on the localhost, using port number 22112 (the service name) by default. To do this, it creates the socket, adds the listener, and calls bindServiceNameAsync (js/startListener.js):

socketsSample.listener = new Windows.Networking.Sockets.DatagramSocket();
// Reminder: call removeEventListener as needed; this can be common with socket relationships
// that can come and go through the lifetime of the app.
socketsSample.listener.addEventListener("messagereceived", onServerMessageReceived);

socketsSample.listener.bindServiceNameAsync(serviceName).done(function () {
    // ...
}, onError);

When a message is received, this server-side component takes the contents of the message and writes it to the socket’s output stream so that it’s reflected in the client side. This looks a little confusing in the code, so I’ll show the core code path of this process with added comments:

function onServerMessageReceived(eventArgument) {
    // [Code here checks if we already got an output stream]

    socketsSample.listener.getOutputStreamAsync(eventArgument.remoteAddress,
        eventArgument.remotePort).done(function (outputStream) {
            // [Save the output stream with some other info, omitted]
            socketsSample.listenerOutputStream = outputStream;

            // This is a helper function
            echoMessage(socketsSample.listenerOutputStream, eventArgument);
        });
}

// eventArgument here is a DatagramSocketMessageReceivedEventArgs with a getDataReader method
function echoMessage(outputStream, eventArgument) {
    // [Some display code omitted]

    // Get the message stream from the DataReader and send it to the output stream
    outputStream.writeAsync(eventArgument.getDataReader().detachBuffer()).done(function () {
        // Do nothing - client will print out a message when data is received.
    });
}

In most apps using sockets, the server side would do something more creative with the data than just send it back to the client! But this just changes what you do with the data in the input stream.

Scenario 2 sets up the client side, which connects to the localhost on the same port. On this side, we also create a DatagramSocket and set up a listener for messagereceived. Those messages (such as the one written to the output stream on the server side, as we’ve just seen) are picked up in the event handler below (js/connectToListener.js), which uses the DataReader to extract and display the message:

function onMessageReceived(eventArgument) {
    try {
        var messageLength = eventArgument.getDataReader().unconsumedBufferLength;
        var message = eventArgument.getDataReader().readString(messageLength);
        socketsSample.displayStatus("Client: receive message from server \"" + message + "\"");
    } catch (exception) {
        var status = Windows.Networking.Sockets.SocketError.getStatus(exception.number);
        // [Display error details]
    }
}

Note in the code above that when an error occurs on a socket connection, you can pass the error number to the getStatus method of the Windows.Networking.Sockets.SocketError object and get back a more actionable SocketErrorStatus value. There are many possible errors here, so see its reference page for details.

Even with all the work we’ve done so far, nothing has yet happened because we’ve sent no data! So switching to Scenario 3, pressing its Send ‘Hello’ Now button does the honors from the client side (js/sendData.js):

// [This comes after a check on the socket's validity]
socketsSample.clientDataWriter =
    new Windows.Storage.Streams.DataWriter(socketsSample.clientSocket.outputStream);

var string = "Hello World";
socketsSample.clientDataWriter.writeString(string);

socketsSample.clientDataWriter.storeAsync().done(function () {

    socketsSample.displayStatus("Client sent: " + string + ".");
}, onError);

The DataWriter.storeAsync call is what actually writes the data to the stream in the socket. If you set a breakpoint here and on both messagereceived event handlers, you’ll then see that storeAsync generates a message to the server side, hitting onServerMessageReceived in js/startListener.js. This will then write the message back to the socket, which will hit onMessageReceived in js/connectToListener.js, which displays the message. (And to complete the process, Scenario 4 gives you a button to call the socket’s close method.)

The sample does everything with the same app on localhost to make it easier to see how the process works. Typically, of course, the server will be running on another machine entirely, but the steps of setting up a listener apply just the same. As noted in Chapter 13, localhost connections work only on a machine with a developer license and will not work for apps acquired through the Windows Store.

Stream Sockets

In contrast to datagram sockets, streaming data over sockets uses the Transmission Control Protocol (TCP). The hallmark of TCP is accurate and reliable delivery—it guarantees that the bytes received are the same as the bytes that were sent: when a packet is sent across the network, TCP will attempt to retransmit the packet if there are problems along the way. This is why it’s part of TCP/IP, which gives us the World Wide Web, email, file transfers, and lots more. HTTP, SMTP, and the Session Initiation Protocol (SIP) are also built on TCP. In all cases, clients and servers just see a nice reliable stream of data flowing from one end to the other.

Unlike datagram sockets, for which we have a single class in WinRT for both sides of the relationship, stream sockets are more distinctive to match the unique needs of the client and server roles. On the client side is Windows.Networking.Sockets.StreamSocket; on the server it’s StreamSocketListener.

Starting with the latter, the StreamSocketListener object looks quite similar to the DatagramSocket we’ve just covered in the previous section, with these methods, properties, and events:

information Provides a StreamSocketListenerInformation object containing a localPort string.

control Provides a StreamSocketListenerControl object with a qualityOfService property.

connectionreceived An event that’s fired when a connection is made to the listener. Its event arguments are a StreamSocketListenerConnectionReceivedEventArgs that contains a single property, socket. This is the StreamSocket for the client, which has an inputStream property from which the listener can read the incoming data stream.

bindEndpointAsync and bindServiceNameAsync Binds the listener to a HostName and service name, or binds just a service name.

close Terminates connections and aborts pending operations.

On the client side, StreamSocket again looks like parts of the DatagramSocket. In addition to the control (StreamSocketControl) and information properties (StreamSocketInformation) and the ubiquitous close method, we find a few other usual suspects and one unusual one:

connectAsync Connects to a HostName/service name or to an EndpointPair. In each case you can also provide an optional SocketProtectionLevel object that can be plainSocket, ssl, or sslAllowNullEncryption. There are, in other words, four variations of this method.

inputStream The IInputStream that’s being received over the connection.

outputStream The IOutputStream into which data is written.

upgradeToSslAsync Upgrades a plainSocket connection (created through connectAsync) to use SSL as specified by either SocketProtectionLevel.ssl or sslAllowNullEncryption. This method also requires a HostName that validates the connection.

For more details on using SSL, see How to secure socket connections with TLS/SSL.

In any case, you can see that for one-way communications over TCP, an app creates either a StreamSocket or a StreamSocketListener, depending on its role. For two-way communications an app will create both.

The StreamSocket sample, like the DatagramSocket sample, has four scenarios that are meant to be run in sequence on the localhost: first to create a listener (to receive a message from a client, Scenario 1), then to create the StreamSocket (Scenario 2) and send a message (Scenario 3), and then to close the socket (Scenario 4). With streamed data, the app implements a custom protocol for how the data should appear, as we’ll see.

Starting in Scenario 1 (js/startListener.js), here’s how we create the listener and event handler. Processing the incoming stream data is trickier than with a datagram because we need to make sure the data we need is all there. This code shows a good pattern of waiting for one async operation to finish before the function calls itself recursively. Also note how it creates a DataReader on the input stream for convenience:

socketsSample.listener = new Windows.Networking.Sockets.StreamSocketListener(serviceName);
// Match with removeEventListener as needed
socketsSample.listener.addEventListener("connectionreceived", onServerAccept);

socketsSample.listener.bindServiceNameAsync(serviceName).done(function () {
    // ...
}, onError);

// Handles each incoming connection: saves the client's socket, wraps its input stream in a
// DataReader, and starts the read loop in startServerRead.
function onServerAccept(eventArgument) {
    socketsSample.serverSocket = eventArgument.socket;
    socketsSample.serverReader =
         new Windows.Storage.Streams.DataReader(socketsSample.serverSocket.inputStream);
    startServerRead();

}

// The protocol here is simple: a four-byte 'network byte order' (big-endian) integer that
// says how long a string is, and then a string that is that long. We wait for exactly 4 bytes,
// read in the count value, and then wait for count bytes, and then display them.
function startServerRead() {
    socketsSample.serverReader.loadAsync(4).done(function (sizeBytesRead) {
        // Make sure 4 bytes were read.
        if (sizeBytesRead !== 4) { /* [Show message] */ }

        // Read in the 4 bytes count and then read in that many bytes.
        var count = socketsSample.serverReader.readInt32();
        return socketsSample.serverReader.loadAsync(count).then(function (stringBytesRead) {
            // Make sure the whole string was read.
            if (stringBytesRead !== count) { /* [Show message] */ }

            // Read in the string.
            var string = socketsSample.serverReader.readString(count);
            socketsSample.displayOutput("Server read: " + string);

            // Restart the read for more bytes. We could just call startServerRead() but in
            // the case subsequent read operations complete synchronously we start building
            // up the stack and potentially crash. We use WinJS.Promise.timeout() to invoke
            // this function after the stack for current call unwinds.
            WinJS.Promise.timeout().done(function () { return startServerRead(); });
        }); // End of "read in rest of string" function.
    }, onError);
}

This code is structured to wait for incoming data that isn’t ready yet, but you might have situations in which you want to know if there’s more data available that you haven’t read. This value can be obtained through the DataReader.unconsumedBufferLength property.
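To see the framing logic on its own, here is a minimal sketch in plain JavaScript that runs outside of WinRT. It assumes the same protocol described in the comments above (a four-byte big-endian count followed by that many bytes of UTF-8 text) but uses standard typed arrays, DataView, and TextDecoder as stand-ins for DataReader. The FrameParser name and its push method are hypothetical and not part of the sample; the point is only to show why a stream reader has to hold on to incomplete data until the rest arrives.

```javascript
// Hypothetical helper (not from the sample): accumulates incoming chunks and
// extracts every complete [4-byte big-endian length][UTF-8 string] frame.
function FrameParser() {
    this.pending = new Uint8Array(0);   // Bytes received but not yet consumed
}

// chunk is a Uint8Array of any size; returns an array of decoded strings.
FrameParser.prototype.push = function (chunk) {
    // Append the new chunk to whatever was left over from earlier calls.
    var merged = new Uint8Array(this.pending.length + chunk.length);
    merged.set(this.pending, 0);
    merged.set(chunk, this.pending.length);

    var view = new DataView(merged.buffer);
    var decoder = new TextDecoder("utf-8");
    var messages = [];
    var offset = 0;

    // Continue while a full length prefix and its full payload are available.
    while (merged.length - offset >= 4) {
        var count = view.getUint32(offset, false);   // false = big-endian
        if (merged.length - offset - 4 < count) {
            break;   // Payload is incomplete; wait for more data.
        }
        messages.push(decoder.decode(merged.subarray(offset + 4, offset + 4 + count)));
        offset += 4 + count;
    }

    // Keep any trailing partial frame for the next call.
    this.pending = merged.slice(offset);
    return messages;
};

// Usage: one message split across two chunks.
var parser = new FrameParser();
var first = parser.push(new Uint8Array([0, 0, 0, 5, 72, 101]));   // length 5 + "He"
var second = parser.push(new Uint8Array([108, 108, 111]));        // "llo"
```

In this usage, the first call yields no messages because the payload is incomplete, and the second call completes the string. The sample achieves the same effect differently: it asks loadAsync for exactly the number of bytes it needs and lets WinRT do the waiting.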

In Scenario 2, the data-sending side of the relationship is simple: create a StreamSocket and call connectAsync (js/connectToListener.js; note that onError uses SocketError.getStatus again):

socketsSample.clientSocket = new Windows.Networking.Sockets.StreamSocket();
socketsSample.clientSocket.connectAsync(hostName, serviceName).done(function () {
    // ...
}, onError);

Sending data in Scenario 3 takes advantage of a DataWriter built on the socket’s output stream (js/sendData.js):

var writer = new Windows.Storage.Streams.DataWriter(socketsSample.clientSocket.outputStream);
var string = "Hello World";
var len = writer.measureString(string); // Gets the UTF-8 string length.
writer.writeInt32(len);
writer.writeString(string);

writer.storeAsync().done(function () {
    writer.detachStream();
}, onError);

And closing the socket in Scenario 4 is again just a call to StreamSocket.close.
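One detail in the Scenario 3 code is easy to miss: the count written ahead of the string is its UTF-8 byte length (from measureString), not its character count. The following sketch, in plain JavaScript that runs outside of WinRT, builds the same kind of frame by hand to make that visible. The encodeFrame helper is hypothetical, and TextEncoder and DataView are used here as stand-ins for DataWriter, so treat it as an illustration of the wire format rather than as what the sample does internally.

```javascript
// Hypothetical helper: builds [4-byte big-endian length][UTF-8 bytes] for one string.
function encodeFrame(text) {
    var payload = new TextEncoder().encode(text);   // UTF-8 bytes of the string
    var frame = new Uint8Array(4 + payload.length);

    // Write the byte count (not the character count) in big-endian order.
    new DataView(frame.buffer).setUint32(0, payload.length, false);
    frame.set(payload, 4);
    return frame;
}

var ascii = encodeFrame("Hello World");     // 11 characters, 11 bytes of payload
var accented = encodeFrame("h\u00e9llo");   // 5 characters, but 6 bytes of payload
```

For plain ASCII text the two lengths coincide, which is why a mistake here can go unnoticed until a user types a non-ASCII character and the server reads one byte too few.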

As with the DatagramSocket sample, setting breakpoints within openClient (js/connectToListener.js), onServerAccept (js/startListener.js), and sendHello (js/sendData.js) will let you see what’s happening at each step of the process.

Web Sockets: MessageWebSocket and StreamWebSocket

Having now seen both Datagram and Stream sockets in action, we can look at their equivalents on the WebSocket side. As you might already know, WebSockets is a standard created to use HTTP (and thus TCP) to set up an initial connection after which the data exchange happens through sockets over TCP. This provides the simplicity of using HTTP requests for the first stages of communication and the efficiency of sockets afterwards.

As with regular sockets, the WebSocket side of WinRT supports both water balloons and water hoses: the MessageWebSocket class provides for discrete packets as with datagram sockets (though it uses TCP and not UDP), and StreamWebSocket clearly provides for stream sockets. Both classes are very similar to their respective DatagramSocket and StreamSocket counterparts, so much so that their interfaces are very much the same (with distinct secondary types like MessageWebSocketControl):

• Like DatagramSocket, MessageWebSocket has control, information, and outputStream properties, a messagereceived event, and methods of connectAsync and close. It adds a closed event along with a setRequestHeader method.

• Like StreamSocket, StreamWebSocket has control, information, inputStream, and outputStream properties, and methods of connectAsync and close. It adds a closed event and a setRequestHeader method.

You’ll notice that there isn’t an equivalent to StreamSocketListener here. This is because the process of establishing that connection is handled through HTTP requests, so such a distinct listener class isn’t necessary. This is also why we have setRequestHeader methods on the classes above: so that you can configure those HTTP requests. Along these same lines, you’ll find that the connectAsync methods take a Windows.Foundation.Uri rather than hostnames and service names. But otherwise we see the same kind of activity going on once the connection is established, with streams, DataReader, and DataWriter.

Standard WebSockets, as they’re defined in the W3C API, are entirely supported for Windows Store apps. However, they support only a transaction-based (message) model like DatagramSocket, though carried over TCP rather than UDP, and only text content. The MessageWebSocket in WinRT supports both text and binary, and you can use the StreamWebSocket for a streaming model as well. The WinRT APIs also emit more detailed error information and so are generally preferred over the W3C API.

Let’s look more closely at these in the context of the Connecting with WebSockets sample. This sample is dependent upon an ASP.NET server page running in the localhost, so you must first go into its Server folder and run powershell.exe -ExecutionPolicy unrestricted -file setupserver.ps1 from an Administrator command prompt. (For more on setting up Internet Information Services and the localhost, refer to the “Using the localhost” section in Chapter 13.) If the script succeeds, you’ll see a WebSocketSample folder in c:\inetpub\wwwroot that contains an EchoWebService.ashx file. Also, as suggested in Chapter 13, you can run the Web platform installer to install Visual Studio 2012 Express for Web that will allow you to run the server page in a debugger. Always a handy capability!

Within EchoWebService.ashx you’ll find an EchoWebSocket class written in C#. It basically has one method, ProcessRequest, that handles the initial HTTP request from the web socket client. With this request it acquires the socket, writes an announcement message to the socket’s stream when the socket is opened, and then waits to receive other messages. If it receives a text message, it echoes that text back through the socket with “You said” prepended. If it receives a binary message, it echoes back a message indicating the amount of data received.

Going to Scenario 1 of the Connecting with WebSockets sample, we can send a message to that server page, using MessageWebSocket, and get back a message of our own; see Figure 14-6. In this case the output in the sample reflects information known to the app and nothing from the service itself.


FIGURE 14-6 Output of Scenario 1 of the Connecting with WebSockets sample.

In the sample, we first create a MessageWebSocket, call its connectAsync, and then use a DataWriter to write some data to the socket. It also listens for the messagereceived event to output the result of the send, and it listens to the closed event from the server so that it can do the close from its end. The code here is simplified from js/scenario1.js:

var messageWebSocket;
var messageWriter;

var webSocket = new Windows.Networking.Sockets.MessageWebSocket();
webSocket.control.messageType = Windows.Networking.Sockets.SocketMessageType.utf8;
webSocket.onmessagereceived = onMessageReceived;
webSocket.onclosed = onClosed;

// The server URI is obtained and validated here, and stored in a variable named uri.

webSocket.connectAsync(uri).done(function () {
    messageWebSocket = webSocket;
    // The default DataWriter encoding is utf8.
    messageWriter = new Windows.Storage.Streams.DataWriter(webSocket.outputStream);
    sendMessage();    // Helper function, see below
}, function (error)  {
   var errorStatus = Windows.Networking.Sockets.WebSocketError.getStatus(error.number);
   // [Output error message]
});

function onMessageReceived(args) {
    var dataReader = args.getDataReader();
    // [Output message contents]
}

function sendMessage() {
    // Write message in the input field to the socket
    messageWriter.writeString(document.getElementById("inputField").value);
    messageWriter.storeAsync().done(null, sendError);
}

function onClosed(args) {
    // Close our socket if the server closes [simplified from actual sample; it also closes
    // the DataWriter it might have opened.]
    messageWebSocket.close();
}

Similar to what we saw in previous sections, when an error occurs you can turn the error number into a more actionable status value. In the case of WebSockets you do this with the getStatus method of Windows.Networking.Sockets.WebSocketError, which returns a WebErrorStatus value. Again, see its reference page for details.

Scenario 2, for its part, uses a StreamWebSocket to send a continuous stream of data packets, a process that will continue until you close the connection; see Figure 14-7.


FIGURE 14-7 Output of Scenario 2 of the Connecting with WebSockets sample (cropped).

Here’s the process in code, simplified from js/scenario2.js, where we see a similar pattern to what we just saw for MessageWebSocket, only sending a continuous stream of data:

var streamWebSocket;
var dataWriter;
var dataReader;
var data = "Hello World";
var countOfDataSent;
var countOfDataReceived;

var webSocket = new Windows.Networking.Sockets.StreamWebSocket();
webSocket.onclosed = onClosed;

// The server URI is obtained and validated here, and stored in a variable named uri.

webSocket.connectAsync(uri).done(function () {
    streamWebSocket = webSocket;
    dataWriter = new Windows.Storage.Streams.DataWriter(webSocket.outputStream);
    dataReader = new Windows.Storage.Streams.DataReader(webSocket.inputStream);
    // When buffering, return as soon as any data is available.
    dataReader.inputStreamOptions = Windows.Storage.Streams.InputStreamOptions.partial;
    countOfDataSent = 0;
    countOfDataReceived = 0;

    // Continuously send data to the server
    writeOutgoing();

    // Continuously listen for a response
    readIncoming();
}, function (error) {
   var errorStatus = Windows.Networking.Sockets.WebSocketError.getStatus(error.number);
   // [Output error message]
});

function writeOutgoing() {
    try {
        var size = dataWriter.measureString(data);
        countOfDataSent += size;

        dataWriter.writeString(data);
        dataWriter.storeAsync().done(function () {
            // Add a 1 second delay so the user can see what's going on.
            setTimeout(writeOutgoing, 1000);
        }, writeError);
    }
    catch (error) {
        // [Output error message]
    }
}

function readIncoming(args) {
    // Buffer as much data as you require for your protocol.
    dataReader.loadAsync(100).done(function (sizeBytesRead) {
        countOfDataReceived += sizeBytesRead;
        // [Output count]

        var incomingBytes = new Array(sizeBytesRead);
        dataReader.readBytes(incomingBytes);

        // Do something with the data. Alternatively you can use DataReader to
        // read out individual booleans, ints, strings, etc.

        // Start another read.
        readIncoming();
    }, readError);
}

function onClosed(args) {
    // [Other code omitted, including closure of DataReader and DataWriter]
    streamWebSocket.close();
}

As with regular sockets, you can exercise additional controls with WebSockets, including setting credentials and indicating supported protocols through the control property of both MessageWebSocket and StreamWebSocket. For details, see How to use advanced WebSocket controls in the documentation. Similarly, you can set up a secure/encrypted connection by using the wss:// URI scheme instead of ws:// as used in the sample. For more, see How to secure WebSocket connections with TLS/SSL.
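Because the only difference between a plain and a secure connection is the URI scheme, it can be worth checking the scheme before calling connectAsync. Here is a minimal sketch of such a check in plain JavaScript. The checkWebSocketUri helper and the shape of its result are hypothetical, the URIs in the usage lines are assumed examples, and the standard URL object stands in for Windows.Foundation.Uri so that the code runs outside of an app.

```javascript
// Hypothetical helper: accepts only absolute ws:// or wss:// URIs and reports
// whether the connection would be encrypted.
function checkWebSocketUri(text) {
    var parsed;
    try {
        parsed = new URL(text.trim());   // Throws for relative or malformed input
    } catch (e) {
        return { valid: false, secure: false, reason: "Not an absolute URI" };
    }

    if (parsed.protocol !== "ws:" && parsed.protocol !== "wss:") {
        return { valid: false, secure: false, reason: "Scheme must be ws or wss" };
    }

    return { valid: true, secure: parsed.protocol === "wss:", reason: "" };
}

// Usage with assumed example URIs:
var plain = checkWebSocketUri("ws://localhost/WebSocketSample/EchoWebService.ashx");
var encrypted = checkWebSocketUri("wss://example.com/echo");
var rejected = checkWebSocketUri("http://example.com/echo");
```

An app might use the secure flag to warn the user, or simply refuse ws:// for anything other than the localhost during development.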

The ControlChannelTrigger Background Task

In Chapter 13, in the “Lock Screen Dependent Tasks and Triggers” section we took a brief look at the Windows.Networking.Sockets.ControlChannelTrigger class that can be used to set up a background task for real-time notifications as would be used by VoIP, IM, email, and other “always reachable” scenarios. To repeat, working with the control channel is not something that can be done from JavaScript, so refer to How to set background connectivity options in the documentation along with the following C#/C++ samples:

ControlChannelTrigger StreamSocket sample

ControlChannelTrigger XmlHttpRequest sample

ControlChannelTrigger StreamWebSocket sample

ControlChannelTrigger HTTP client sample

Loose Ends (or Some Samples To Go)

Although we’ve covered quite a bit of territory in this chapter, you might find some additional samples helpful in your networking efforts. I won’t address these topics further in this book, but this list will at least help you be aware of their existence.

Image

What We’ve Just Learned

• Networks come in a number of different forms, and separate capabilities in the manifest specifically call out Internet (Client), Internet (Client & Server), and Private Networks (Client & Server). Local loopback within these is normally blocked for apps but may be used for debugging purposes on machines with a developer license.

• Rich network information is available through the Windows.Networking.Connectivity.NetworkInformation API, including the ability to track connectivity, be aware of network costs, and obtain connection profile details.

• Connectivity can be monitored from a background task by using the networkStateChange trigger and conditions such as internetAvailable and internetNotAvailable.

• The ability to run offline can be an important consideration that can make an app much more attractive to customers. Apps need to design and implement such features themselves, using local or temporary app data folders to store the necessary caches.

• Windows.Networking.BackgroundTransfer provides for cost-aware downloads and uploads that continue to run while an app is suspended and that are easy to resume if an app is restarted after termination. Using this API is highly recommended over doing the same with XmlHttpRequests; the API supports credentials, multipart uploads, cost policy, and grouping.

• The Credential Picker UI provides a built-in UI for collecting credentials, and the credential locker provides a secure means for storing and retrieving those credentials (that can also be roamed to the user’s other trusted devices if they allow it).

• Apps can authenticate through OAuth providers using the web authentication broker API. This allows apps to obtain necessary access keys and tokens to work with those providers while never having to manage user credentials directly.

• For authentication providers that support it, apps can use single sign on so that authenticating the user in one app will authenticate them in all others using the same provider. The Live SDK/Live Connect provides for this with the user’s Microsoft account.

• Apps can obtain and manage some of the user’s profile data, including the user image and the lock screen image.

• WinRT provides APIs for encryption and decryption, along with certificates.

• The Windows.Web.Syndication API provides a structured way to consume RSS feeds, and Windows.Web.AtomPub provides a structured way to post, edit, and manage entries.

• Socket support in WinRT includes datagram and stream sockets, as well as message and stream WebSockets. The capabilities of the latter expand on the capabilities of W3C WebSockets by supporting both a streaming (TCP) model and binary content.