To say that media is important to apps—and to culture in general—is a gross understatement. Ever since the likes of Edison made it possible to record a performance for later enjoyment, and the likes of Marconi made it possible to widely broadcast and distribute such performances, humanity’s worldwide appetite for media—graphics, audio, and video—has probably outpaced the appetite for automobiles, electricity, and even junk food. In the early days of the Internet, graphics and images easily accounted for the bulk of network traffic. Today, streaming video even from a single source like Netflix holds top honors for pushing the capabilities of our broadband infrastructure! (It certainly holds true in my own household with my young son’s love of Curious George, Bob the Builder, Dinosaur Train, and other such shows.)
Incorporating some form of media is likely a central concern for most Windows Store apps. Even simple ones probably use at least a few graphics to brand the app and present an attractive UI, as we’ve already seen on a number of occasions. Many others, especially games, will certainly use graphics, video, and audio together. In the context of this book, all of this means using the img, svg (Scalable Vector Graphics), canvas, audio, and video elements of HTML5.
Of course, working with media goes well beyond just presentation because apps might also provide any of the following capabilities:
• Organize and edit media files, including those in the pictures, music, and videos media libraries.
• Transcode (convert) media files, possibly applying various filters and custom codecs.
• Organize and edit playlists.
• Capture audio and video from available devices.
• Stream media from a server to a device, or from a device to a PlayTo target, perhaps also applying DRM.
These capabilities, for which many WinRT APIs exist, along with the media elements of HTML5 and their particular capabilities within the Windows 8 environment, will be our focus for this chapter.
Note A complete list of the audio and video formats supported for WinRT apps, as relevant to this chapter, can be found on Supported audio and video formats.
Some of the recommendations in this chapter come from a great talk by Jason Weber, the Performance Lead for Internet Explorer, called 50 Performance Tricks to Make Your Windows 8 Apps Using HTML5 Faster. While some of these tricks are specifically for web applications running in a browser, many of them are wholly applicable to Windows Store apps written in JavaScript as they run on top of the same infrastructure as Internet Explorer.
Certainly the easiest means to incorporate media into an app is what we’ve already been doing for years: simply use the appropriate HTML element in your layout and voila! there you have it. With img, audio, and video elements, in fact, you’re completely free to use content from just about any location. That is, the src attributes of these elements can be assigned URIs that point to in-package content (using relative paths, ms-appx:/// URIs, or paths based on Windows.ApplicationModel.Package.current.installedLocation that you then pass to URL.createObjectURL), files in your app data folders (using ms-appdata:/// URIs or paths based on Windows.Storage.ApplicationData.current, again using URL.createObjectURL), and remote files with http:// and other URIs. With the img element, this includes using SVG files as the source.
There are three ways to create a media element in a page or page control.
First is to include the element directly in declarative HTML. Here it’s often useful to use the preload="auto" attribute for remote audio and video to increase the responsiveness of controls and other UI that depend on those elements. (Doing so isn’t really important for local media files since they are, well, already local!) Oftentimes, media elements are placed near the top of the HTML file, in order of priority, so that downloading can begin while the rest of the document is being parsed.
On the flip side, if the user can wait a short time to start a video, use a preview image in place of the video and don’t start the download until it’s actually necessary. Code for this is shown later in this chapter in the “Video Playback and Deferred Loading” section.
Playback for a declarative element can be automatically started with the autoplay attribute, through the built-in UI if the element has the controls attribute, or by calling <element>.play() from JavaScript.
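For illustration, here’s a minimal sketch of this first, declarative approach (the file name and dimensions are hypothetical); the preload attribute matters most when the src is a remote URI:

<!-- A sketch: a declarative video with built-in controls; preload="auto"
     starts downloading the (hypothetical) remote file right away. -->
<video id="video1" src="http://www.example.com/media/intro.mp4"
    preload="auto" controls width="640" height="360">
</video>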
The second method is to create an HTML element in JavaScript via document.createElement and add it to the DOM with <parent>.appendChild and similar methods. Here’s an example using media files in this chapter’s companion content, though you’ll need to drop the code into a new project of your own:
//Create elements and add to DOM, which will trigger layout
var picture = document.createElement("img");
picture.src = "media/wildflowers.jpg";
picture.width = 300;
picture.height = 450;
document.getElementById("divShow").appendChild(picture);

var movie = document.createElement("video");
movie.src = "media/ModelRocket1.mp4";
movie.autoplay = false;
movie.controls = true;
document.getElementById("divShow").appendChild(movie);

var sound = document.createElement("audio");
sound.src = "media/SpringyBoing.mp3";
sound.autoplay = true;    //Play as soon as element is added to DOM
sound.controls = true;    //If false, audio plays but does not affect layout
document.getElementById("divShow").appendChild(sound);
Unless otherwise hidden by styles, image and video elements, plus audio elements with the controls attribute, will trigger re-rendering of the document layout. An audio element without that attribute will not cause re-rendering. As with declarative HTML, setting autoplay to true will cause video and audio to start playing as soon as the element is added to the DOM.
Finally, for audio, apps can create an Audio object in JavaScript to play sounds or music without any effect on UI. More on this later. JavaScript also has the Image class, and the Audio class can be used to load video:
//Create objects (pre-loading), then set other DOM object sources accordingly
var picture = new Image(300, 450);
picture.src = "http://www.kraigbrockschmidt.com/downloads/media/wildflowers.jpg";
document.getElementById("image1").src = picture.src;

//Audio object can be used to pre-load (but not render) video
var movie = new Audio("http://www.kraigbrockschmidt.com/downloads/media/ModelRocket1.mp4");
document.getElementById("video1").src = movie.src;

var sound = new Audio("http://www.kraigbrockschmidt.com/downloads/media/SpringyBoing.mp3");
document.getElementById("audio1").src = sound.src;
Creating an Image or Audio object from code does not create elements in the DOM, which can be a useful trait. The Image object, for instance, has been used for years to preload an array of image sources for use with things like image rotators and popup menus. Preloading in this case only means that the images have been downloaded and cached. This way, assigning the same URI to the src attribute of an element that is in the DOM, as shown above, will have that image appear immediately. The same is true for preloading video and audio, but again, this is primarily helpful with remote media, as files on the local file system will load relatively quickly as-is. Still, if you have large local images and want them to appear quickly when needed, preloading them into memory is a useful strategy.
Of course, you might want to load media only when it’s needed, in which case the same type of code can be used with existing elements, or you can just create an element and add it to the DOM as shown earlier.
I know you’re probably excited to get to the sections of this chapter on video and audio, but we cannot forget that images have been the backbone of web applications since the beginning and remain a huge part of any app’s user experience. Indeed, it’s helpful to remember that video itself is conceptually just a series of static images sequenced over time! Fortunately, HTML5 has greatly expanded an app’s ability to incorporate image data by adding SVG support and the canvas element to the tried-and-true img element. Furthermore, applying CSS animations and transitions (covered in detail in Chapter 11, “Purposeful Animations”) to otherwise static image elements can make them appear very dynamic.
Speaking of CSS, it’s worth noting that many graphical effects that once required the use of static images can be achieved with just CSS, especially CSS3:
• Borders, background colors, and background images
• Folder tabs, menus, and toolbars
• Rounded border corners, multiple backgrounds/borders, and image borders
• Transparency
• Embeddable fonts
• Box shadows
• Text shadows
• Gradients
In short, if you’ve ever used img elements to create small visual effects, gradient backgrounds, nonstandard fonts, or graphical navigation structures, there’s probably a way to do it in pure CSS. For details, see the great overview of CSS3 by Smashing Magazine as well as the CSS specs at http://w3.org/. CSS also provides the ability to declaratively handle some events and visual states using the hover, visited, active, focus, target, enabled, disabled, and checked pseudo-selectors. For more, see http://css-tricks.com/ as well as another Smashing Magazine tutorial on pseudo-classes.
That said, let’s review the three primary HTML5 elements for graphics:
• img is used for raster data. The PNG format is generally preferred over other formats, especially for text and line art, though JPEG makes smaller files for photographs. GIF is generally considered outdated, as the primary scenarios where GIF produced a smaller file size can probably be achieved with CSS directly. Where scaling is concerned, Windows Store apps need to consider pixel density, as we saw in Chapter 6, “Layout,” and provide separate image files for each scale the app might encounter. This is where the smaller size of JPEGs can reduce the overall size of your app package in the Windows Store.
• SVGs are best used for smooth scaling across display sizes and pixel densities. SVGs can be declared inline, created dynamically in the DOM, or maintained as separate files and used as a source for an img element (in which case all the scaling characteristics are maintained). An SVG file can also be used as an iframe source, which has the added benefit that the SVG’s child elements are accessible in the DOM. As we saw in Chapter 6, preserving the aspect ratio of an SVG is often important, for which you employ the viewBox and preserveAspectRatio attributes of the svg tag (see the sketch after this list).
• The canvas element provides a drawing surface and API for creating graphics with lines, rectangles, arcs, text, and so forth. The canvas ultimately generates raster data, which means that once created, a canvas scales like a bitmap. (An app, of course, will typically redraw a canvas with scaled coordinates when necessary to avoid pixelation.) The canvas is also very useful for performing pixel manipulation, even on individual frames of a video while it’s playing.
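Picking up the aspect-ratio point from the SVG bullet above, here’s a minimal sketch (the shapes and dimensions are hypothetical) of the viewBox and preserveAspectRatio attributes at work:

<!-- A sketch with hypothetical shapes: viewBox defines the drawing's own
     coordinate system; preserveAspectRatio keeps it undistorted when the
     element is scaled to fit its container. -->
<svg width="100%" height="100%" viewBox="0 0 300 200"
    preserveAspectRatio="xMidYMid meet">
    <rect x="10" y="10" width="280" height="180" fill="none" stroke="gray" />
    <circle cx="150" cy="100" r="80" fill="orange" />
</svg>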
Apps often use all three of these elements, drawing on their various strengths. I say this because when canvas first became available, developers seemed so enamored with it that they forgot how to use img elements and ignored the fact that SVGs are often a better choice altogether! (And did I already say that CSS can accomplish a great deal by itself as well?)
In the end, it’s helpful to think of all the HTML5 graphics elements as ultimately producing a bitmap that the app host simply renders to the display. You can, of course, programmatically animate the internal contents of these elements in JavaScript, as we’ll see in Chapter 11, but for our purposes here it’s helpful to simply think of these as essentially static.
What differs between the elements is how image data gets into the element to begin with. Img elements are loaded from a source file, svg’s are defined in markup, and canvas elements are filled through procedural code. But in the end, as demonstrated in Scenario 1 in the HTML Graphics example for this chapter and shown in Figure 10-1, each can produce identical results.
FIGURE 10-1 Image, canvas, and svg elements showing identical results.
In short, there are no fundamental differences as to what can be rendered through each type of element. However, they do have differences that become apparent when we begin to manipulate those elements, as with CSS. Each element is just a node in the DOM, plain and simple, and is treated like all other nongraphic elements: CSS doesn’t affect the internals of the element, just how it ultimately appears on the page. Individual parts of SVGs declared in markup can, in fact, be separately styled so long as they can be identified with a CSS selector. In any case, such styling affects only presentation, so if new styles are applied, they are applied to the original contents of the element.
What’s also true is that graphics elements can overlap with each other and with nongraphic elements (as well as video), and the rendering engine automatically manages transparency according to the z-index of those elements. Each graphic element can have clear or transparent areas, as is built into image formats like PNG. In a canvas, any areas cleared with the clearRect method that aren’t otherwise affected by other API calls will be transparent. Similarly, any area in an SVG’s rectangle that’s not affected by its individual parts will be transparent.
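As a quick illustration of that layering (a sketch with hypothetical element ids and image paths), a canvas positioned above an img via z-index can punch a transparent window through its own fill with clearRect so that the image shows through:

//A sketch with hypothetical ids and paths. Assumed markup:
//  <img id="photo" src="media/wildflowers.jpg" style="position: absolute; z-index: 1" />
//  <canvas id="overlay" width="300" height="200"
//      style="position: absolute; z-index: 2"></canvas>
var overlay = document.getElementById("overlay");
var ctx = overlay.getContext("2d");

ctx.fillStyle = "rgba(0, 0, 128, 0.8)";              //Mostly opaque wash over the image
ctx.fillRect(0, 0, overlay.width, overlay.height);
ctx.clearRect(50, 50, 150, 100);                     //Fully transparent area: the img shows through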
Scenario 2 in the HTML Graphics example allows you to toggle a few styles (with a check box) on the same elements shown earlier. In this case, I’ve left the background of the canvas element transparent so that we can see areas that show through. When the styles are applied, the img element is rotated and translated, the canvas gets scaled, and individual parts of the svg are styled with new colors, as shown in Figure 10-2.
FIGURE 10-2 Styles applied to graphic elements; individual parts of the SVG can be styled if they are accessible through the DOM.
The styles in css/scenario2.css are simple:
.transformImage {
    transform: rotate(30deg) translateX(120px);
}

.scaleCanvas {
    transform: scale(1.5, 2);
}
as is the code in js/scenario2.js that applies them:
function toggleStyles() {
    var applyStyles = document.getElementById("check1").checked;

    document.getElementById("image1").className = applyStyles ? "transformImage" : "";
    document.getElementById("canvas1").className = applyStyles ? "scaleCanvas" : "";
    document.getElementById("r").style.fill = applyStyles ? "purple" : "";
    document.getElementById("l").style.stroke = applyStyles ? "green" : "";
    document.getElementById("c").style.fill = applyStyles ? "red" : "";
    document.getElementById("t").style.fontStyle = applyStyles ? "normal" : "";
    document.getElementById("t").style.textDecoration = applyStyles ? "underline" : "";
}
The other thing you might have noticed when the styles are applied is that the scaled-up canvas looks rasterized, like a bitmap would typically be. This is expected behavior, as shown in the following table of scaling characteristics. These are demonstrated in Scenarios 3 and 4 of the HTML Graphics example.
There are a few additional characteristics to be aware of with graphics elements. First, different kinds of operations will trigger a re-rendering of the element in the document. Second is the mode of operation of each element. Third are the relative strengths of each element. These are summarized in the following table:
Because SVGs generate elements in the DOM, those elements can be individually styled. You can use this fact with media queries to hide different parts of the SVG depending on its size. To do this, add different classes to those SVG elements. Then, in CSS, add or remove the display: none style for those classes within media queries like @media (min-width: 300px) and (max-width: 499px). You may need to account for the size of the SVG relative to the app window, but it means that you can effectively remove detail from an SVG rather than allowing those parts to be rendered with too few pixels to be useful.
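A minimal sketch of the idea (the class name and breakpoint are hypothetical): shapes in the SVG that carry a “fine detail” class are simply hidden when the window, and therefore the SVG, is narrow.

/* A sketch with a hypothetical class name and breakpoint: hide detailed
   parts of an inline SVG when the window (and thus the SVG) is small. */
@media (max-width: 499px) {
    .svgFineDetail {
        display: none;
    }
}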
In the end, the reason why HTML5 has all three of these elements is because all three are really needed. All of them benefit from full hardware acceleration, just as they do in Internet Explorer, since apps written in HTML and JavaScript run on the same rendering engine as the browser.
The best practice in app design is to really explore the appropriate use of each type of element. Each element can have transparent areas, so you can easily achieve some very fun effects. For example, if you have data that maps video timings to caption or other text, you can simply use an interval handler (with the interval set to the necessary granularity, such as a half-second) to take the video’s currentTime property, retrieve the appropriate text for that segment, and render the text to an otherwise transparent canvas that sits on top of the video. Titles and credits can be done in a similar manner, eliminating the need to reencode the video.
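Here’s a minimal sketch of that captioning approach (the element ids, timing data, and half-second interval are hypothetical):

//A sketch with hypothetical ids and caption data: every half-second, look up
//the caption for the video's current position and draw it on a transparent
//canvas layered over the video.
var video = document.getElementById("video1");
var ctx = document.getElementById("captionCanvas").getContext("2d");
var captions = [
    { start: 0, end: 5, text: "Countdown..." },
    { start: 5, end: 12, text: "Liftoff!" }
];

setInterval(function () {
    var t = video.currentTime;

    ctx.clearRect(0, 0, 640, 80);    //Back to fully transparent

    captions.forEach(function (c) {
        if (t >= c.start && t < c.end) {
            ctx.font = "24px Segoe UI";
            ctx.fillStyle = "white";
            ctx.fillText(c.text, 10, 50);
        }
    });
}, 500);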
Working with the HTML graphics elements is generally straightforward, but knowing some details can help when working with them inside a Windows Store app.
• Use the title attribute of img for tooltips, not the alt attribute. You can also use a WinJS.UI.Tooltip control as described in Chapter 4, “Controls, Control Styling, and Data Binding.”
• To create an image from an in-memory stream, see MSApp.createBlobFromRandomAccessStream, the result of which can then be given to URL.createObjectURL to create an appropriate URI for a src attribute (a sketch appears after this list). We’ll encounter this elsewhere in this chapter, and we’ll need it when working with the Share contract in Chapter 12, “Contracts.” The same technique also works for audio and video streams.
• When loading images from http:// or other remote sources, you run the risk of having the element show a red X placeholder image. To prevent this, catch the img.onerror event and supply your own placeholder:
var myImage = document.getElementById('image');
myImage.onerror = function () { onImageError(this); };

function onImageError(source) {
    source.src = "placeholder.png";
    source.onerror = "";
}
• <script> tags are not supported within <svg>.
• If you have an SVG file, you can load it into an img element by pointing at the file with the src attribute, but this doesn’t let you traverse the SVG in the DOM. If you want the latter behavior, load the SVG in an iframe instead. The SVG contents will then be within that element’s contentDocument.documentElement property:
<!-- in HTML-->
<iframe id="Mysvg" src="myFolder/mySVGFile.svg" />

// in JavaScript
var svg = document.getElementById("Mysvg").contentDocument.documentElement;
• PNGs and JPEGs generally perform better than SVGs, so if you don’t technically need an SVG or have a high-performance scenario, consider using scaled raster graphics. Or you can dynamically create a scaled static image from an SVG so as to use the image for faster rendering later:
<!-- in HTML-->
<img id="svg" src="somesvg.svg" style="display: none;" />
<canvas id="canvas" style="display: none;" />

// in JavaScript
var c = document.getElementById("canvas").getContext("2d");
c.drawImage(document.getElementById("svg"), 0, 0);
var imageURLToUse = document.getElementById("canvas").toDataURL();
• Two helpful SVG references (JavaScript examples): http://www.carto.net/papers/svg/samples/ and http://srufaculty.sru.edu/david.dailey/svg/.
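As promised in the stream-related item above, here’s a minimal sketch of that technique; the stream variable is a hypothetical IRandomAccessStream obtained from some WinRT API (a file, a thumbnail, a capture, and so on):

//A sketch: 'stream' is a hypothetical IRandomAccessStream from a WinRT API.
var blob = MSApp.createBlobFromRandomAccessStream("image/png", stream);
var url = URL.createObjectURL(blob, { oneTimeOnly: true });
document.getElementById("image1").src = url;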
All the function names mentioned here are methods of a canvas’s context object:
• Remember that a canvas element needs specific width and height attributes (in JavaScript, canvas.width and canvas.height), not styles. It does not accept px, em, %, or other units.
• Despite its name, the closePath method is not a direct complement to beginPath. beginPath is used to start a new path that can be stroked, clearing any previous path. closePath, on the other hand, simply connects the two endpoints of the current path, as if you did a lineTo between those points. It does not clear the path or start a new one. This seems to confuse programmers quite often, which is why you sometimes see a circle drawn with a line to the center!
• A call to stroke is necessary to render a path; until that time, think of paths as a pencil sketch of something that’s not been inked in. Note also that stroking implies a call to beginPath.
• When animating on a canvas, doing clearRect on the entire canvas and redrawing every frame is generally easier to work with than clearing many small areas and redrawing individual parts of the canvas. The app host eventually has to render the entire canvas in its entirety with every frame anyway to manage transparency, so trying to optimize performance by clearing small rectangles isn’t an effective strategy except when you’re only doing a small number of API calls for each frame.
• Rendering canvas API calls is accomplished by converting them to the equivalent Direct2D calls in the GPU. This draws shapes with automatic antialiasing. As a result, drawing a shape like a circle in a color and drawing the same circle with the background color does not erase every pixel. To effectively erase a shape, use clearRect on an area that’s slightly larger than the shape itself. This is one reason why clearing the entire canvas and redrawing every frame often ends up being easier.
• To set a background image in a canvas (so you don’t have to draw it each time), you can use the canvas.style.backgroundImage property with an appropriate URI to the image.
• Use the msToBlob method on a canvas object to obtain a blob for the canvas contents.
• When using drawImage, you may need to wait for the source image to load, using code such as:
var myImg = new Image();
myImg.onload = function () { myContext.drawImage(myImg, 0, 0); };
myImg.src = "myImageFile.png";
• Although other graphics APIs see a circle as a special case of an ellipse (with x and y radii being the same), the canvas arc function works with circles only. Fortunately, a little use of scaling makes it easy to draw ellipses, as shown in the utility function below. Note that we use save and restore so that the scale call applies only to the arc; it does not affect the stroke that’s used from main. This is important, because if the scaling factors are still in effect when you call stroke, the line width will vary instead of remaining constant.
function arcEllipse(ctx, x, y, radiusX, radiusY, startAngle, endAngle, anticlockwise) {
    //Use the smaller radius as the basis and stretch the other
    var radius = Math.min(radiusX, radiusY);
    var scaleX = radiusX / radius;
    var scaleY = radiusY / radius;

    ctx.save();
    ctx.scale(scaleX, scaleY);
    //Note that centerpoint must take the scale into account
    ctx.arc(x / scaleX, y / scaleY, radius, startAngle, endAngle, anticlockwise);
    ctx.restore();
}
• By copying pixel data from a video, it’s possible with the canvas to dynamically manipulate a video (without affecting the source, of course). This is a useful technique, even if it’s processor-intensive; for this latter reason, though, it might not work well on low-power devices.
Here’s an example of frame-by-frame video manipulation, the technique for which is nicely outlined in a Windows team blog post, Canvas Direct Pixel Manipulation. In the VideoEdit example for this chapter, default.html contains a video and canvas element in its main body:
<video id="video1" src="ModelRocket1.mp4" muted style="display: none"></video> <canvas id="canvas1" width="640" height="480"></canvas>
In code (js/default.js), we call startVideo from within the activated handler. This function starts the video and uses requestAnimationFrame to do the pixel manipulation for every video frame:
var video1, canvas1, ctx;
var colorOffset = { red: 0, green: 1, blue: 2, alpha: 3 };

function startVideo() {
    video1 = document.getElementById("video1");
    canvas1 = document.getElementById("canvas1");
    ctx = canvas1.getContext("2d");

    video1.play();
    requestAnimationFrame(renderVideo);
}

function renderVideo() {
    //Copy a frame from the video to the canvas
    ctx.drawImage(video1, 0, 0, canvas1.width, canvas1.height);

    //Retrieve that frame as pixel data
    var imgData = ctx.getImageData(0, 0, canvas1.width, canvas1.height);
    var pixels = imgData.data;

    //Loop through the pixels, manipulate as needed
    var r, g, b;

    for (var i = 0; i < pixels.length; i += 4) {
        r = pixels[i + colorOffset.red];
        g = pixels[i + colorOffset.green];
        b = pixels[i + colorOffset.blue];

        //This creates a negative image
        pixels[i + colorOffset.red] = 255 - r;
        pixels[i + colorOffset.green] = 255 - g;
        pixels[i + colorOffset.blue] = 255 - b;
    }

    //Copy the manipulated pixels to the canvas
    ctx.putImageData(imgData, 0, 0);

    //Request the next frame
    requestAnimationFrame(renderVideo);
}
Here the page contains a hidden video element (style="display: none") that is told to start playing once the document is loaded (video1.play()). In a requestAnimationFrame loop, the current frame of the video is copied to the canvas (drawImage) and the pixels for the frame are copied (getImageData) into the imgData buffer. We then go through that buffer and negate the color values, thereby producing a photographically negative image (an alternate formula to change to grayscale is also shown in the code comments, omitted above). We then copy those pixels back to the canvas (putImageData) so that when we return, those negated pixels are rendered to the display.
Again, this is processor-intensive as it’s not generally a GPU-accelerated process and might perform poorly on lower-power devices (be sure, however, to run a Release build outside the debugger when evaluating performance). It’s much better to write a video effect DLL where possible, as discussed in “Applying a Video Effect” later on. Nevertheless, it is a useful technique to know. What’s really happening is that instead of drawing each frame with API calls, we’re simply using the video as a data source. So we could, if we like, embellish the canvas in any other way we want before returning from the renderVideo function. An example of this that I really enjoy is shown in Manipulating video using canvas on Mozilla’s developer site, which dynamically makes green-screen background pixels transparent so that an img element placed underneath the video shows through as a background. The same could even be used to layer two videos so that a background video is used instead of a static image. Again, be mindful of performance on low-power devices; you might consider providing a setting through which the user can disable such extra effects.
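A rough sketch of that chroma-key idea (the green threshold is an arbitrary assumption), which would run on each frame’s pixel buffer much like the negation loop in renderVideo:

//A sketch (hypothetical threshold): make "green enough" pixels transparent so
//that an img layered beneath the canvas shows through as the background.
function applyChromaKey(ctx, width, height) {
    var frame = ctx.getImageData(0, 0, width, height);
    var p = frame.data;

    for (var i = 0; i < p.length; i += 4) {
        var r = p[i], g = p[i + 1], b = p[i + 2];

        if (g > 100 && r < 100 && b < 100) {   //Crude green-screen test
            p[i + 3] = 0;                      //Zero alpha = transparent pixel
        }
    }

    ctx.putImageData(frame, 0, 0);
}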
Let’s now talk a little more about video playback itself. As we’ve already seen, simply including a video element in your HTML, or creating an element on the fly, gives you playback ability. In the code below, the video is sourced from a local file, starts playing by itself, loops continually, and provides controls:
<video src="media/ModelRocket1.mp4" controls loop autoplay></video>
As we’ve been doing in this book, we’re not going to rehash the details that are available in the W3C spec for the video and audio tags, found at http://www.w3.org/TR/html5/video.html. This spec will give you all the properties, methods, and events for these elements; especially note the event summary in section 4.8.10.15, and that most of the properties and methods for both are found in the Media elements section, 4.8.10. Note that the track element is supported for both video and audio; you can find an example of using it in Scenario 4 (demonstrating subtitles) of the HTML media playback sample. We won’t be covering it more here.
It’s also helpful to understand that video and audio are closely related, since they’re part of the same spec. In fact, if you want to play just the audio portion of a video, you can use the Audio object in JavaScript:
//Play just the audio of a video
var movieAudio = new Audio("http://www.kraigbrockschmidt.com/downloads/media/ModelRocket1.mp4");
movieAudio.load();
movieAudio.play();
For any given video element, you can set the width and height to control the playback size (such as 100% for full screen). This is important when your app switches between view states, and you’ll likely have CSS styles for video elements in your various media queries. Also, if you have a control to play full screen, simply make the video the size of the viewport (after also calling Windows.UI.ViewManagement.ApplicationView.tryUnsnap if you’re in the snapped view). In addition, when you create a video element with the controls attribute, it will automatically have a full-screen control on the far right that does exactly what you expect within a Windows Store app:
In short, you don’t need to do anything special to make this work. When the video is full screen, a similar button (or the ESC key) returns to the normal app view.
Note In case you’re wondering, the audio and video elements don’t provide any CSS pseudo-selectors for styling the controls bar. As my son’s preschool teacher would say (in reference to handing out popsicles, but it works here too), “You get what you get and you don’t throw a fit and you’re happy with it.” If you’d like to do something different with these controls, you’ll need to turn off the defaults and provide controls of your own that would call the element methods appropriately.
When implementing your own controls, be sure to set a timeout to make the controls disappear (either hiding them or changing the z-index) when they’re not being used. This is especially important if you have a full-screen button for video like the built-in controls, where you would basically resize the element to match the screen dimensions. When you do this, Windows will automatically detect this full-screen video state and do some performance optimizations, but not if any other element is in front of the video. It’s also a good idea to disable any animations you might be running and any unnecessary background processes like web workers.
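A minimal sketch of the auto-hide behavior (the ids, event, and three-second timeout are hypothetical):

//A sketch with hypothetical ids: hide custom controls after a few seconds of
//inactivity; show them again when the pointer moves over the video.
var controls = document.getElementById("customControls");
var hideTimer = null;

function showControls() {
    controls.style.visibility = "visible";

    if (hideTimer) {
        clearTimeout(hideTimer);
    }

    hideTimer = setTimeout(function () {
        controls.style.visibility = "hidden";
    }, 3000);
}

document.getElementById("video1").addEventListener("mousemove", showControls);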
You can use the various events of the video element to know when the video is played and paused through the controls, among other things (though there is not an event for going full-screen), but you should also respond appropriately when hardware buttons for media control are used. For this purpose, listen for events coming from the Windows.Media.MediaControl object, such as playpressed, pausepressed, and so on. (These are WinRT object events, so call removeEventListener as needed.) Refer to the Configure keys for media sample for a demonstration, but adding the listeners generally looks like this:
mediaControl = Windows.Media.MediaControl;

mediaControl.addEventListener("soundlevelchanged", soundLevelChanged, false);
mediaControl.addEventListener("playpausetogglepressed", playpause, false);
mediaControl.addEventListener("playpressed", play, false);
mediaControl.addEventListener("stoppressed", stop, false);
mediaControl.addEventListener("pausepressed", pause, false);
I also mentioned that you might want to defer loading a video until it’s needed and show a preview image in its place. This is accomplished with the poster attribute, whose value is the image to use:
<video id="video1" poster="media/rocket.png" width="640" height="480"></video> var video1 = document.getElementById("video1"); var clickListener = video1.addEventListener("click", function () { video1.src = "http://www.kraigbrockschmidt.com/downloads/media/ModelRocket1.mp4"; video1.load(); //Remove listener to prevent interference with video controls video1.removeEventListener("click", clickListener); video1.addEventListener("click", function () { video1.controls = true; video1.play(); }); });
In this case I’m not using preload="auto" or even providing a src value, so nothing is transferred until the video is tapped. When a tap occurs, that listener is removed, the video’s own controls are turned on, and playback is started. This, of course, is a more roundabout method; often you’ll use preload="auto" controls src="..." directly in the video element, as the poster attribute will handle the preview image.
When playing video, especially full-screen, it’s important to disable any automatic timeouts that would blank the display or lock the device. This is done through the Windows.System.Display.DisplayRequest object. Before starting playback, create an instance of this object and call its requestActive method:
var displayRequest = new Windows.System.Display.DisplayRequest();
displayRequest.requestActive();
If this call succeeds, you’ll be guaranteed that the screen will stay active despite user inactivity. When the video is complete, be sure to call requestRelease. Note that Windows will automatically deactivate such requests when your app is moved to the background, and it will reactivate them when the user switches back.
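A small sketch of that release (assuming the video1 and displayRequest variables from the snippets above):

//A sketch: release the display request when playback finishes so the normal
//power-saving timeouts apply again.
video1.addEventListener("ended", function () {
    displayRequest.requestRelease();
});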
Beyond the HTML5 standards for video elements, some additional properties and methods are added to them in Windows 8, as shown in the following table and documented on the video element page. Also note the references to the HTML media playback sample where you can find some examples of using these.
Video (and audio) elements can use the HTML5 source element. In web applications, multiple source elements are used to provide alternate video formats in case a client system doesn’t have the necessary codec for the primary source. Given that the list of supported formats in Windows is well known (refer again to Supported audio and video formats), this isn’t much of a concern for Windows Store apps. However, source is still useful because it can identify the specific codecs for the source:
<video controls loop autoplay>
    <source src="video1.vp8" type="video/webm" />
</video>
This is important when you need to provide a custom codec for your app through Windows.Media.MediaExtensionManager, outlined in the “Custom Decoders/Encoders and Scheme Handlers” section later in this chapter, as the codec identifies the extension to load for decoding. I show WebM as an example here because it’s not directly available to Store apps (though it is in Internet Explorer). When the app host running a Store app encounters the video element above, it will look for a matching decoder for the specified type.
The earlier table shows that video elements have msInsertVideoEffect and msInsertAudioEffect methods on them. WinRT provides a built-in video stabilization effect that is easily applied to an element. This is demonstrated in Scenario 3 of the Media extensions sample, which plays the same video with and without the effect, so the stabilized one is muted:
vidStab.msClearEffects();
vidStab.muted = true;
vidStab.msInsertVideoEffect(Windows.Media.VideoEffects.videoStabilization, true, null);
Custom effects, as demonstrated in Scenario 4 of the sample, are implemented as separate dynamic-link libraries (DLLs), typically written in C++ for best performance, and are included in the app package because a Store app can install a DLL only for its own use and not for systemwide access. With the sample you’ll find DLL projects for grayscale, invert, and geometric effects, where the latter has three options: fisheye, pinch, and warp. In the js/CustomEffect.js file you can see how these are applied, with the first parameter to msInsertVideoEffect being a string that identifies the effect as exported by the DLL (see, for instance, the InvertTransform.idl file in the InvertTransform project):
vid.msInsertVideoEffect("GrayscaleTransform.GrayscaleEffect", true, null); vid.msInsertVideoEffect("InvertTransform.InvertEffect", true, null);
The second parameter to msInsertVideoEffect, by the way, indicates whether the effect is required, so it’s typically true. The third is a parameter called config, which just contains additional information to pass to the effect. In the case of the geometric effects in the sample, this parameter specifies the particular variation:
var effect = new Windows.Foundation.Collections.PropertySet();
effect["effect"] = effectName;

vid.msClearEffects();
vid.msInsertVideoEffect("PolarTransform.PolarEffect", true, effect);
where effectName will be either “Fisheye”, “Pinch”, or “Warp”.
Audio effects, not shown in the sample, are applied the same way with msInsertAudioEffect (with the same parameters). Do note that each element can have at most two effects per media stream. A video element can have two video effects and two audio effects; an audio element can have two audio effects. If you try to add more, the methods will throw an exception. This is why it’s a good idea to call msClearEffects before inserting any others.
For additional discussion on effects and other media extensions, see Using media extensions.
Many households, including my own, have one or more media servers available on the local network from which apps can play media. Getting to these servers is the purpose of the one other property in Windows.Storage.KnownFolders that we haven’t mentioned yet: mediaServerDevices. As with other known folders, this is simply a StorageFolder object through which you can then enumerate or query additional folders and files. In this case, if you call its getFoldersAsync, you’ll receive back a list of available servers, each of which is represented by another StorageFolder. From there you can use file queries, as discussed in Chapter 8, “State, Settings, Libraries, and Documents,” to search for the types of media you’re interested in or apply user-provided search criteria. An example of this can be found in the Media Server client sample.
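Here’s a minimal sketch of that enumeration (the query options and file types are assumptions for illustration; the Media Server client sample is the authoritative reference):

//A sketch: list available media servers, then query the first one for music.
var servers = Windows.Storage.KnownFolders.mediaServerDevices;

servers.getFoldersAsync().done(function (folders) {
    folders.forEach(function (server) {
        console.log("Media server: " + server.displayName);
    });

    if (folders.size > 0) {
        //Hypothetical query: audio files on the first server, ordered by name
        var options = new Windows.Storage.Search.QueryOptions(
            Windows.Storage.Search.CommonFileQuery.orderByName, [".mp3", ".wma"]);
        var query = folders.getAt(0).createFileQueryWithOptions(options);

        query.getFilesAsync().done(function (files) {
            console.log("Found " + files.size + " tracks.");
        });
    }
});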
As with video, the audio element provides its own playback abilities, including controls, looping, and autoplay:
<audio src="media/SpringyBoing.mp3" controls loop autoplay></audio>
Again, as described earlier, the same W3C spec applies to both video and audio elements. The same code to play just the audio portion of a video is exactly what we use to play an audio file:
var sound = new Audio("media/SpringyBoing.mp3"); sound1.msAudioCategory = "SoundEffect"; sound1.load(); //For pre-loading media sound1.play(); //At any later time
As also mentioned before, creating an Audio object without controls and playing it has no effect on layout, so this is what’s generally used for sound effects in games and other apps.
As with video, it’s important for apps that do audio playback to respond appropriately to the events coming from the Windows.Media.MediaControl object, especially playpressed, pausepressed, stoppressed, and playpausetogglepressed. This lets the user control audio playback with hardware buttons, which you would use when playing music tracks, for instance. However, you would not apply these events to audio such as game sounds.
Speaking of which, an interesting aspect of audio is how to mix multiple sounds together, as games generally require. Here it’s important to understand that each audio element can play only one sound: it has one source file and one source file alone. However, multiple audio elements can be playing at the same time, with automatic intermixing depending on their assigned categories. (See “Playback Manager and Background Audio” below.) In this example, some background music plays continually (loop is set to true, and the volume is halved) while another sound is played in response to taps (see the AudioPlayback example with this chapter’s content):
var sound1 = new Audio("media/SpringyBoing.mp3"); sound1.load(); //For pre-loading media //Background music var sound2 = new Audio(); sound2.msAudioCategory = "ForegroundOnlyMedia"; //Set this before setting src sound2.src = "http://www.kraigbrockschmidt.com/mp3/WhoIsSylvia_PortlandOR_5-06.mp3"; sound2.loop = true; sound2.volume = 0.5; //50%; sound2.play(); document.getElementById("btnSound").addEventListener("click", function () { //Reset position in case we're already playing sound1.currentTime = 0; sound1.play(); });
By loading the tap sound when the object is created, we know we can play it at any time. When initiating playback, it’s a good idea to set the currentTime to 0 so that the sound always plays from the beginning.
The question with mixing, especially in games, really becomes how to manage many different sounds without knowing ahead of time how they will be combined. You may need, for instance, to overlap playback of the same sound with different starting times, but it’s impractical to declare three audio elements with the same source. The technique that’s emerged is to use “rotating channels,” as described on the Ajaxian website and sketched in code after the list below. To summarize:
1. Declare audio elements for each sound (with preload="auto").
2. Create a pool (array) of Audio objects for however many simultaneous channels you need.
3. To play a sound:
   a. Obtain an available Audio object from the pool.
   b. Set its src attribute to one that matches a preloaded audio element.
   c. Call that pool object’s play method.
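Here’s a minimal sketch of such a pool (the names, pool size, and “availability” test are hypothetical):

//A sketch of rotating channels (hypothetical names and pool size). Sounds are
//assumed to be preloaded via <audio preload="auto"> elements in markup.
var channels = [];
var maxChannels = 10;

for (var i = 0; i < maxChannels; i++) {
    channels.push(new Audio());
}

function playSound(audioElementId) {
    //Find a channel that isn't currently playing
    for (var i = 0; i < channels.length; i++) {
        var ch = channels[i];

        if (ch.paused || ch.ended) {
            ch.src = document.getElementById(audioElementId).src;
            ch.play();
            return;
        }
    }
    //If all channels are busy, the sound is simply dropped
}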
As sound designers in the movies have discovered, it is possible to have too much sound going on at the same time, because it gets really muddied. So you may not need more than a couple dozen channels at most.
Hint Need some sounds for your app? Check out http://www.freesound.org.
As with the video element, a few extensions are available on audio elements as well, namely those to do with effects (msInsertAudioEffect), DRM (msSetMediaProtectionManager), PlayTo (msPlayToSource, etc.), msRealtime, and msAudioTracks, as listed earlier in “Video Element Extension APIs.” In fact, every extension API for audio exists on video, but two of them have primary importance for audio:
• msAudioDeviceType Allows an app to determine which output device audio will render to: "Console" (the default) or "Communications". This way an app that knows it’s doing communication (like chat) doesn’t interfere with media audio.
• msAudioCategory Identifies the type of audio being played (see the table in the next section). This is very important for identifying whether audio should continue to play in the background (thereby preventing the app from being suspended), as described in the next section. Note that you should always set this property before setting the audio’s src and that setting this to "Communications" will also set the device type to "Communications" and force msRealtime to true.
Do note that despite the similarities between the values in these properties, msAudioDeviceType is for selecting an output device whereas msAudioCategory identifies the type of audio that’s being played through whatever device. A communications category audio could be playing through the console device, for instance, or a media category could be playing through the communications device. The two are separate concepts.
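A small sketch of setting the two together for a chat-style stream (the source file is hypothetical); note that both are set before src:

//A sketch (hypothetical source): a chat-style stream rendered to the
//communications device; both properties are set before src.
var chatAudio = document.createElement("audio");
chatAudio.msAudioDeviceType = "Communications";
chatAudio.msAudioCategory = "Communications";   //Also forces msRealtime to true
chatAudio.src = "voip-session.mp3";
chatAudio.play();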
To explore different kinds of audio playback, let’s turn our attention to the Playback Manager msAudioCategory sample. I won’t show a screen shot of this because, doing nothing but audio, there isn’t much to show! Instead, let me outline the behaviors of its different scenarios in the following table, as well as list those categories that aren’t represented in the sample but that can be used in your own app. In each scenario you need to first select an audio file through the file picker.
Where a single audio stream is concerned, there isn’t always a lot of difference between some of these categories. Yet as the table indicates, different categories have different effects on other simultaneous audio streams. For this purpose, the Windows SDK does an odd thing by providing a second sample identical to the first, the Playback Manager companion sample. This allows you to run these apps at the same time (one in snapped view, the other in filled view, or one or both in the background) and play audio with different category settings to see how they combine.
How different audio streams combine is a subject that’s discussed in the Audio Playback in a Windows 8 App whitepaper. However, what’s most important is that you assign the most appropriate category to any particular audio stream. These categories help the playback manager perform the right level of mixing between audio streams according to user expectations, both with multiple streams in the same app and with streams coming from multiple apps (with limits on how many background audio apps can be going at once). For example, users will expect that alarms, being an important form of notification, will temporarily attenuate other audio streams. Similarly, users will expect that an audio stream of a foreground app will take precedence over a stream of the same category of audio playing in the background. As a developer, then, avoid playing games with the categories. Just assign the most appropriate category to your audio stream so that the playback manager can do its job with audio from all sources and deliver a consistent experience for the entire system.
Setting an audio category for any given audio element is a simple matter of setting its msAudioCategory attribute. Every scenario in the sample does the same thing for this, making sure to set the category before setting the src attribute (shown here from js/backgroundcapablemedia.js):
audtag = document.createElement('audio');
audtag.setAttribute("msAudioCategory", "BackgroundCapableMedia");
audtag.setAttribute("src", fileLocation);
You could accomplish the same thing in markup, of course. Some examples:
<audio id="audio1" src="song.mp3" msAudioCategory="BackgroundCapableMedia"></audio> <audio id="audio2" src="voip.mp3" msAudioCategory="Communications"></audio> <audio id="audio3" src="lecture.mp3" msAudioCategory="Other"></audio>
With BackgroundCapableMedia and Communications, however, simply setting the category isn’t sufficient: you also need to declare an audio background task extension in your manifest. This is easily accomplished by going to the Declarations tab in the manifest designer:
First, select Background Tasks from the Available Declarations drop-down list. Then check Audio under Supported Task Types, and identify a Start page under App Settings. The start page isn’t really essential for background audio (because you’ll never be launched for this purpose), but you need to provide something to make the manifest editor happy.
These declarations appear as follows in the manifest XML, should you care to look:
<Application Id="App"StartPage="default.html"> <!-- ...--> <Extensions> <Extension Category="windows.backgroundTasks" StartPage="default.html"> <BackgroundTasks > <Task Type="audio" /> </BackgroundTasks> </Extension> </Extensions> </Application>
Furthermore, background audio apps must also add listeners for the Windows.Media.MediaControl events that we’ve already mentioned so that the user can control background audio playback through the media control UI (see the next section). They’re also required because they make it possible for the playback manager to control the audio streams as the user switches between apps. If you fail to provide these listeners, your audio will always be paused and muted when the app goes into the background.
How to do this is shown in the Playback Manager sample for all its scenarios; the following is from js/communications.js (some code omitted):
mediaControl = Windows.Media.MediaControl;

mediaControl.addEventListener("soundlevelchanged", soundLevelChanged, false);
mediaControl.addEventListener("playpausetogglepressed", playpause, false);
mediaControl.addEventListener("playpressed", play, false);
mediaControl.addEventListener("stoppressed", stop, false);
mediaControl.addEventListener("pausepressed", pause, false);

// audtag variable is the global audio element for the page
function playpause() {
    if (!audtag.paused) {
        audtag.pause();
    } else {
        audtag.play();
    }
}

function play() {
    audtag.play();
}

function stop() {
    // Nothing to do here
}

function pause() {
    audtag.pause();
}

function soundLevelChanged() {
    //Catch SoundLevel notifications and determine SoundLevel state.
    //If it's muted, we'll pause the player.
    var soundLevel = Windows.Media.MediaControl.soundLevel;

    //No actions are shown here, but the options are spelled out to show the enumeration
    switch (soundLevel) {
        case Windows.Media.SoundLevel.muted:
            break;
        case Windows.Media.SoundLevel.low:
            break;
        case Windows.Media.SoundLevel.full:
            break;
    }

    appMuted();
}

function appMuted() {
    if (audtag) {
        if (!audtag.paused) {
            audtag.pause();
        }
    }
}
Technically speaking, a handler for soundlevelchanged is not required here, but the other four are. Such a minimum implementation is part of the AudioPlayback example with this chapter, where the code also uses the MediaControl.isPlaying flag to set the play/pause button state in the media control UI (see the next section).
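A minimal sketch of using that flag (audio1 stands for whatever audio element you’re playing):

//A sketch: keep the media control UI's play/pause button in sync with the
//actual state of the audio element (audio1 is hypothetical).
var mediaControl = Windows.Media.MediaControl;

audio1.addEventListener("playing", function () {
    mediaControl.isPlaying = true;
});

audio1.addEventListener("pause", function () {
    mediaControl.isPlaying = false;
});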
A few additional notes about background audio:
• The reason for distinct playpressed, pausepressed, and playpausetogglepressed events is to support a variety of hardware, where some devices have separate play and pause buttons and others have a single button for both.
• If the audio is paused, a background audio app will be suspended like any other, but if the user presses a play button, the app will be resumed and audio will then continue playback.
• The use of background audio is carefully evaluated with apps submitted to the Windows Store. If you attempt to play an inaudible track as a means to avoid being suspended, the app will fail Store certification.
• A background audio app should be careful about how it uses the network for streaming media to support the low power state called connected standby. For details, refer to Writing a power savvy background media app.
Now let’s see the other important reason why you must implement the media control events: the UI that Windows displays in response to hardware buttons.
As mentioned in the previous section, providing handlers for the MediaControl object events is required for background audio so that the user can control the audio through hardware buttons (built into many devices, including keyboards and remote controls) without needing to switch to the app. This is especially important because background audio continues to play not only when the user switches to another app, but also when they switch to the Start screen, switch to the desktop, or lock the device.
The default media control UI appears as shown in Figure 10-3 in the upper left of the screen, regardless of what is on the screen at the time. Tapping the app name will switch to the app.
FIGURE 10-3 The media control UI appearing above the Start screen (left) and the desktop (right)
You can see in these images that the app’s Display Name from the manifest is what’s shown by default in the UI. Although this is an acceptable fallback, audio apps should ideally provide richer audio metadata to the MediaControl, specifically its albumArt, trackName, and artistName properties (the latter two of which must be less than 128 characters). This is done in the Configure keys for media sample, which demonstrates how to obtain album art for a track, a subject we’ll return to later.
With such metadata the media control UI will appear as follows; tapping the album art, track name or artist name will switch back to the audio app.
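A minimal sketch of supplying that metadata (the strings and art URI here are hypothetical; the Configure keys for media sample shows how to obtain real album art from a track’s thumbnail and save it in app data):

//A sketch with hypothetical values: set these when a track starts so the
//media control UI can display them.
var mediaControl = Windows.Media.MediaControl;

mediaControl.trackName = "Who Is Sylvia";         //Must be < 128 characters
mediaControl.artistName = "Kraig Brockschmidt";   //Must be < 128 characters
mediaControl.albumArt = new Windows.Foundation.Uri("ms-appdata:///local/albumart.png");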
You’ll notice in the images above that the forward and backward buttons are disabled. This is because the app does not have listeners for the nexttrackpressed and previoustrackpressed events of the MediaControl object; we’ll see how to use these in the next section. There are other events as well, such as channeldownpressed, channeluppressed, fastforwardpressed, rewindpressed, recordpressed, and stoppressed, though these aren’t represented in the media control UI.
An app that’s playing audio tracks (such as music, an audio book, or recorded lectures) will often have a list of tracks to play sequentially, especially while the app is running in the background. In this case it’s important to start the next track quickly, because otherwise Windows will suspend the app 10 seconds after the current audio finishes. For this purpose, listen for the audio element’s ended event and set audio.src to the next track. A good optimization in this case is to create a second Audio object and set its src attribute after the first track starts to play. This way that second track will be preloaded and ready to go immediately, thereby avoiding potential delays in playback between tracks. This is shown in the AudioPlayback example for this chapter, where I’ve split the one complete song into four segments for continuous playback. I’ve also shown here how to handle the next and previous button events, along with setting the segment number as the track name:
var mediaControl = Windows.Media.MediaControl;
var playlist = ["media/segment1.mp3", "media/segment2.mp3",
    "media/segment3.mp3", "media/segment4.mp3"];
var curSong = 0;
var audio1 = null;
var preload = null;

document.getElementById("btnSegments").addEventListener("click", playSegments);
audio1 = document.getElementById("audioSegments");
preload = document.createElement("audio");

function playSegments() {
    //Always reset WinRT object event listeners to prevent duplication and leaks
    mediaControl.removeEventListener("nexttrackpressed", nextHandler);
    mediaControl.removeEventListener("previoustrackpressed", prevHandler);

    curSong = 0;

    //Pause the other music
    document.getElementById("musicPlayback").pause();

    //Set up media control listeners
    setMediaControl(audio1);

    //Show the element (initially hidden) and start playback
    audio1.style.display = "";
    audio1.volume = 0.5;   //50%
    playCurrent();

    //Preload the next track in readiness for the switch
    preload.setAttribute("preload", "auto");
    preload.src = playlist[1];

    //Switch to the next track as soon as one has ended or the next button is pressed
    audio1.addEventListener("ended", nextHandler);
    mediaControl.addEventListener("nexttrackpressed", nextHandler);
}

function nextHandler() {
    curSong++;

    //Enable previous button if we have at least one previous track
    if (curSong > 0) {
        mediaControl.addEventListener("previoustrackpressed", prevHandler);
    }

    if (curSong < playlist.length) {
        //playlist[curSong] should already be loaded
        playCurrent();

        //Set up the next preload
        var nextTrack = curSong + 1;

        if (nextTrack < playlist.length) {
            preload.src = playlist[nextTrack];
        } else {
            preload.src = null;
            mediaControl.removeEventListener("nexttrackpressed", nextHandler);
        }
    }
}

function prevHandler() {
    //If we're already playing the last song, add the next button handler again
    if (curSong == playlist.length - 1) {
        mediaControl.addEventListener("nexttrackpressed", nextHandler);
    }

    curSong--;

    if (curSong == 0) {
        mediaControl.removeEventListener("previoustrackpressed", prevHandler);
    }

    playCurrent();
    preload.src = playlist[curSong + 1];   //This should always work
}

function playCurrent() {
    audio1.src = playlist[curSong];
    audio1.play();
    mediaControl.trackName = "Segment " + (curSong + 1);
}
When playing sequential tracks like this from an app written in JavaScript and HTML, you might notice brief gaps between the tracks, especially if the first track flows directly into the second. This is a present limitation of the platform, given the layers that exist between the HTML audio element and the low-level XAudio2 APIs that are ultimately doing the real work. You can mitigate the effects to some extent: for example, you can crossfade the two tracks or crossfade a third overlay track that contains a little of the first and a little of the second track. You can also use a negative time offset to start playing the next track slightly before the previous one ends. But if you want a truly seamless transition, you’ll need to bypass the audio element and use the XAudio2 APIs from a WinRT component for direct playback. How to do this is discussed in the Building your own Windows Runtime components to deliver great apps post on the Windows 8 developer blog.
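A rough sketch of the negative-offset idea only (the 300 ms overlap is an arbitrary assumption, and a real implementation would then hand playback duties over to the preloaded element):

//A sketch (arbitrary overlap value): begin the preloaded next track slightly
//before the current one ends; audio1 and preload are as in the example above.
var overlapSeconds = 0.3;

audio1.addEventListener("timeupdate", function () {
    if (preload.src && (audio1.duration - audio1.currentTime) < overlapSeconds) {
        preload.play();   //No-op if it's already playing
    }
});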
The multisegment playback example in the previous section is clearly contrived because an app wouldn’t typically have an in-memory playlist. More likely an app would load an existing playlist or create one from files that a user has selected.
WinRT supports these actions through a simple API in the Windows.Media.Playlists namespace, using the WPL (Windows Media Player), ZPL (Zune), and M3U formats. The Playlist sample in the Windows SDK (which almost wins the prize for the shortest sample name!) shows how to perform various tasks with the API. In Scenario 1 it lets you choose multiple files by using the file picker, creates a new Windows.Media.Playlists.Playlist object, adds those files to its files list (a vector of StorageFile objects), and saves the playlist with its saveAsAsync method (this code from create.js is simplified and reformatted a bit):
function pickAudio() {
    var picker = new Windows.Storage.Pickers.FileOpenPicker();
    picker.suggestedStartLocation = Windows.Storage.Pickers.PickerLocationId.musicLibrary;
    picker.fileTypeFilter.replaceAll(SdkSample.audioExtensions);

    picker.pickMultipleFilesAsync().done(function (files) {
        if (files.size > 0) {
            SdkSample.playlist = new Windows.Media.Playlists.Playlist();

            files.forEach(function (file) {
                SdkSample.playlist.files.append(file);
            });

            SdkSample.playlist.saveAsAsync(Windows.Storage.KnownFolders.musicLibrary,
                "Sample", Windows.Storage.NameCollisionOption.replaceExisting,
                Windows.Media.Playlists.PlaylistFormat.windowsMedia)
                .done();
        }
    });
}
Notice that saveAsAsync
takes a StorageFolder
and a name for the file (along with an optional format parameter). This accommodates a common use pattern for playlists where a music app has a single folder where it stores playlists and provides users with a simple means to name them and/or select them. In this way, playlists aren’t typically managed like other user data files where one always goes through a file picker to do a Save As into an arbitrary folder. You could use FileSavePicker
, get a StorageFile
, and use its path
property to get to the appropriate StorageFolder
, but more likely you’ll save playlists in one place and present them as entities that appear only within the app itself.
For example, the Music app that comes with Windows 8 allows you to create a new playlist when you’re viewing tracks of some album. The following commands appear on the app bar (left), and when you select New Playlist, a flyout appears (middle) requesting the name, after which the flyout appears on the app bar (right):
The playlist then appears within the app as another album. In other words, though playlists might be saved in discrete files, they aren’t necessarily presented that way to the user, and the API reflects that usage pattern.
Loading a playlist uses the Playlist.loadAsync
method given a StorageFile
for the playlist. This might be a StorageFile
obtained from a file picker or from the enumeration of the app’s private playlist folder. Scenario 2 of the Playlist sample (display.js) demonstrates the former, where it then goes through each file and requests their music properties:
function displayPlaylist() {
    var picker = new Windows.Storage.Pickers.FileOpenPicker();
    picker.suggestedStartLocation = Windows.Storage.Pickers.PickerLocationId.musicLibrary;
    picker.fileTypeFilter.replaceAll(SdkSample.playlistExtensions);

    var promiseCount = 0;

    picker.pickSingleFileAsync()
        .then(function (item) {
            if (item) {
                return Windows.Media.Playlists.Playlist.loadAsync(item);
            }
            return WinJS.Promise.wrapError("No file picked.");
        })
        .then(function (playlist) {
            SdkSample.playlist = playlist;
            var promises = {};

            // Request music properties for each file in the playlist.
            playlist.files.forEach(function (file) {
                promises[promiseCount++] = file.properties.getMusicPropertiesAsync();
            });

            // Print the music properties for each file. Due to the asynchronous
            // nature of the call to retrieve music properties, the data may appear
            // in an order different than the one specified in the original playlist.
            // To guarantee the ordering we use Promise.join with an associative array
            // passed as a parameter, containing an index for each individual promise.
            return WinJS.Promise.join(promises);
        })
        .done(function (results) {
            var output = "Playlist content:\n\n";
            var musicProperties;

            for (var resultIndex = 0; resultIndex < promiseCount; resultIndex++) {
                musicProperties = results[resultIndex];
                output += "Title: " + musicProperties.title + "\n";
                output += "Album: " + musicProperties.album + "\n";
                output += "Artist: " + musicProperties.artist + "\n\n";
            }

            if (resultIndex === 0) {
                output += "(playlist is empty)";
            }
        }, function (error) {
            // ...
        });
}
We’ll come back to working with these special properties in the next section, as the process also applies to other types of media.
The other method for managing a playlist is Playlist.saveAsync
, which takes a single StorageFile
. This is what you’d use if you’ve loaded and modified a playlist and simply want to save those changes (typically done automatically when the user adds or removes items from the playlist). This is demonstrated in Scenarios 3, 4, and 5 of the sample (add.js, js/remove.js, and js/clear.js), which just use methods of the Playlist.files
vector like append
, removeAtEnd
, and clear
, respectively.
Playback of a playlist depends, of course, on the type of media involved, but typically you’d load a playlist and sequentially take the next StorageFile
object from its files
vector, pass it to URL.createObjectURL
, and then assign that URI to the src
attribute of an audio
or video
element. You could also use playlists to manage lists of images for specific slide shows.
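To make that concrete, here’s a minimal sketch of sequential playback, assuming a currentPlaylist variable saved from Playlist.loadAsync and an audio element with the hypothetical id audioPlayback (both names are mine, not from the sample):

var currentPlaylist = null; //Assumed to be saved from Playlist.loadAsync
var trackIndex = 0;
var audio = document.getElementById("audioPlayback"); //Hypothetical element id

function playCurrentTrack() {
    if (!currentPlaylist || trackIndex >= currentPlaylist.files.size) {
        return; //Nothing to play, or we've reached the end of the playlist
    }

    var file = currentPlaylist.files.getAt(trackIndex);
    audio.src = URL.createObjectURL(file, { oneTimeOnly: true });
    audio.play();
}

//Advance to the next track whenever the current one finishes
audio.addEventListener("ended", function () {
    trackIndex++;
    playCurrentTrack();
});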
A user might store media files anywhere, but images, music, and videos are typically stored in the user’s Pictures, Music, and Videos libraries specifically. Simply said, these are the folders that media apps should use by default until the user indicates otherwise through a folder picker. As we saw in Chapter 8, apps can declare programmatic access to the pictures, music, and videos libraries in their manifests and acquire the StorageFolder
objects for these through Windows.Storage.KnownFolders
:
var picsLib = Windows.Storage.KnownFolders.picturesLibrary;
var musicLib = Windows.Storage.KnownFolders.musicLibrary;
var vidsLib = Windows.Storage.KnownFolders.videosLibrary;
A photos app will typically declare the capability for the Pictures Library and display those contents in a ListView. Music and video apps will do the same for their respective libraries, as you can see in the built-in Photos, Music, and Videos apps in Windows 8. Remember too that if you forget to declare the appropriate capabilities, the lines of code above will throw access denied exceptions. You’ll know right away if you forgot these important details.
I should warn you ahead of time that working with media can become very complicated and intricate. For that reason you’ll probably find it helpful to refer to some of the topics in the documentation, such as Processing image files, Transcoding, and Using media extensions.
With a StorageFolder
in hand for some media library or subset thereof, you can use, as we also saw in Chapter 8, its getItemsAsync
method to retrieve its contents. You can also use file queries to enumerate those files that match specific criteria. Whatever the case, you end up with a collection of StorageFile
objects that you can work with however you want.
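As a quick reminder of what that enumeration looks like in code, here’s a hedged sketch against the music library (it assumes the Music Library capability is declared; the file-type list is just an example):

var musicLib = Windows.Storage.KnownFolders.musicLibrary;

//A flat listing of the folder's immediate contents
musicLib.getFilesAsync().done(function (files) {
    files.forEach(function (file) {
        console.log(file.name);
    });
});

//Or a file query that flattens subfolders and orders the results by music metadata
var queryOptions = new Windows.Storage.Search.QueryOptions(
    Windows.Storage.Search.CommonFileQuery.orderByMusicProperties,
    [".mp3", ".wma", ".m4a"]);
var query = musicLib.createFileQueryWithOptions(queryOptions);

query.getFilesAsync().done(function (files) {
    //files is again a collection of StorageFile objects
});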
Now comes the interesting part. As I mentioned in Chapter 8, you can retrieve additional metadata for those files. This has a number of layers that you discover when you start opening some of the secret doors of the StorageFile
class, as illustrated in Figure 10-4. The following sections discuss these areas in turn.
FIGURE 10-4 Relationships between the StorageFile
object and others obtainable through it.
First, StorageFile.getThumbnailAsync
provides a thumbnail image appropriate for a particular “mode” from the Windows.Storage.FileProperties.ThumbnailMode
enumeration. Options here are picturesView
, videosView
, musicView
, documentsView
, listView
, and singleItem
. What you receive in your completed handler is a StorageItemThumbnail
object that provides thumbnail data as a stream, which you can conveniently pass to our old friend URL.createObjectURL
for display in an img
element and whatnot.
Examples of this are found throughout the File and folder thumbnail sample. Scenario 1, for instance (js/scenario1.js), obtains the thumbnail and displays it in an img
element:
file.getThumbnailAsync(thumbnailMode, requestedSize, thumbnailOptions).done(
    function (thumbnail) {
        if (thumbnail) {
            outputResult(file, thumbnail, modeNames[modeSelected], requestedSize);
        }
        // ...
    });

function outputResult(item, thumbnailImage, thumbnailMode, requestedSize) {
    document.getElementById("picture-thumb-imageHolder").src =
        URL.createObjectURL(thumbnailImage, { oneTimeOnly: true });
    // ...
}
Common file properties—those that exist on all files—are found in a number of different places. Very common properties are found on the StorageFile
object directly, like attributes
, contentType
, dateCreated
, displayName
, displayType
, fileType
, name
, and path
.
The next group is obtained through StorageFile.getBasicPropertiesAsync
. This gives you a Windows.Storage.FileProperties.BasicProperties
object that contains dateModified
, itemDate
, and size
properties. “That’s a snoozer!” you’re saying to yourself. Well, this object also has an additional method called retrievePropertiesAsync
that gives you an array of name-value pairs for all kinds of other stuff.
The trick to understand here is that you build an array of the property names you want and pass it to retrievePropertiesAsync
where each name is a string that comes from a very extensive list of Windows Properties, such as System.FileOwner and System.FileAttributes. An example of this is given in Scenario 5 of the Folder enumeration sample we saw in Chapter 8:
var dateAccessedProperty = "System.DateAccessed";
var fileOwnerProperty = "System.FileOwner";

SdkSample.sampleFile.getBasicPropertiesAsync().then(function (basicProperties) {
    outputDiv.innerHTML += "Size: " + basicProperties.size + " bytes<br />";
    outputDiv.innerHTML += "Date modified: " + basicProperties.dateModified + "<br />";

    // Get extra properties
    return SdkSample.sampleFile.properties.retrievePropertiesAsync(
        [fileOwnerProperty, dateAccessedProperty]);
}).done(function (extraProperties) {
    var propValue = extraProperties[dateAccessedProperty];

    if (propValue !== null) {
        outputDiv.innerHTML += "Date accessed: " + propValue + "<br />";
    }

    propValue = extraProperties[fileOwnerProperty];

    if (propValue !== null) {
        outputDiv.innerHTML += "File owner: " + propValue;
    }
});
What’s very useful about this is that you can get to just about any property you want (the list of properties has hundreds of options) and then modify the array and call BasicProperties.savePropertiesAsync
. Voila! You’ve just updated those properties on the file. A variation of savePropertiesAsync
also lets you pass a specific array of name-value pairs if you only want to change specific ones.
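Here’s a hedged sketch of that variation, reusing SdkSample.sampleFile from the snippet above; System.Comment is a real Windows property, but whether the write succeeds depends on the property handler for that file format:

SdkSample.sampleFile.getBasicPropertiesAsync().then(function (basicProperties) {
    var props = new Windows.Foundation.Collections.PropertySet();
    props.insert("System.Comment", "Reviewed and approved"); //Example value only

    return basicProperties.savePropertiesAsync(props);
}).done(function () {
    //The property is now written to the file
}, function (e) {
    //The property handler for this file type may not support the write
});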
The third set of properties is found by going through the secret door of StorageFile.properties
. This contains a StorageItemContentProperties
object whose retrievePropertiesAsync
and savePropertiesAsync
methods are like those we just saw for BasicProperties
. What’s more interesting is that it also has four other methods—getDocumentPropertiesAsync
, getImagePropertiesAsync
, getMusicPropertiesAsync
, and getVideoPropertiesAsync
—which are how you get to the really specific stuff for individual file types, as we’ll see next.
Alongside the BasicProperties
class in the Windows.Storage.FileProperties
namespace we also find those returned by the StorageFile.properties.get*PropertiesAsync
methods: ImageProperties
, VideoProperties
, MusicProperties
, and DocumentProperties
. Though we’ve had to dig deep to find these, they each contain deeper treasure troves of information—and I do mean deep! The tables below summarize each of these in turn. Note that each object type contains a retrievePropertiesAsync
method, like that of BasicProperties
, that lets you request additional properties by name that aren’t already included in the main properties object. Refer to the links at the top of the table for the references that identify the most relevant Windows properties.
Two notes about all this. First, the string vectors are, as we’ve seen before, instances of IVector
that provide manipulation methods like append
, insertAt
, removeAt
, and so forth. In JavaScript you can access members of the vector like an array with [ ]
’s; just remember that the available methods are more specific.
Second, the latitude
and longitude
properties for images and video are double
types but contain degrees, minutes, seconds, and a directional reference. The Simple imaging sample (in js/default.js) contains a helper function to extract the components of these values and convert them into a string:
"convertLatLongToString": function (latLong, isLatitude) { var reference; if (isLatitude) { reference = (latLong >= 0) ? "N" : "S"; } else { reference = (latLong >= 0) ? "E" : "W"; } latLong = Math.abs(latLong); var degrees = Math.floor(latLong); var minutes = Math.floor((latLong - degrees) * 60); var seconds = ((latLong - degrees - minutes / 60) * 3600).toFixed(2); return degrees + "°" + minutes + "\'" + seconds + "\"" + reference; }
To summarize, the sign of the value indicates direction. A positive value for latitude means North, negative means South; for longitude, positive means East, negative means West. The whole number portion of the value provides the degrees, and the fractional part contains the number of minutes expressed in base 60. Multiplying this value by 60 gives the whole minutes, with the remainder then containing the seconds. It’s odd, but that’s the kind of raw data you get from a GPS device that geolocation APIs normally convert for you directly.
A few of the samples in the Windows SDK show how to work with the specific properties described in the last section as well as with file properties more generally. The Simple imaging sample, in Scenario 1 (js/scenario1.js), provides the most complete demonstration because you can choose an image file and it will load and display various properties, as shown in Figure 10-5 (I’ve scrolled down to see all the properties). I can verify that the date, camera make/model, and exposure information are all accurate.
FIGURE 10-5 Image file properties in the Simple imaging sample.
The sample’s openHandler
method is what retrieves these properties from the file, specifically showing a call to StorageFile.properties.getImagePropertiesAsync
and the use of ImageProperties.retrievePropertiesAsync
for a couple of additional properties not already in ImageProperties
. Then getImagePropertiesForDisplay
coalesces these into a single object used by the sample’s UI. Some lines are omitted in the code shown here:
var ImageProperties = {};

function openHandler() {
    // Keep data in-scope across multiple asynchronous methods.
    var file = {};

    Helpers.getFileFromOpenPickerAsync().then(function (_file) {
        file = _file;
        return file.properties.getImagePropertiesAsync();
    }).then(function (imageProps) {
        ImageProperties = imageProps;

        var requests = [
            "System.Photo.ExposureTime", // In seconds
            "System.Photo.FNumber"       // F-stop values defined by EXIF spec
        ];

        return ImageProperties.retrievePropertiesAsync(requests);
    }).done(function (retrievedProps) {
        // Format the properties into text to display in the UI.
        displayImageUI(file, getImagePropertiesForDisplay(retrievedProps));
    });
}

function getImagePropertiesForDisplay(retrievedProps) {
    // If the specified property doesn't exist, its value will be null.
    var orientationText = Helpers.getOrientationString(ImageProperties.orientation);

    var exposureText = retrievedProps.lookup("System.Photo.ExposureTime") ?
        retrievedProps.lookup("System.Photo.ExposureTime") * 1000 + " ms" : "";

    var fNumberText = retrievedProps.lookup("System.Photo.FNumber") ?
        retrievedProps.lookup("System.Photo.FNumber").toFixed(1) : "";

    // Omitted: Code to convert ImageProperties.latitude and ImageProperties.longitude to
    // degrees, minutes, seconds, and direction

    return {
        "title": ImageProperties.title,
        "keywords": ImageProperties.keywords, // array of strings
        "rating": ImageProperties.rating,     // number
        "dateTaken": ImageProperties.dateTaken,
        "make": ImageProperties.cameraManufacturer,
        "model": ImageProperties.cameraModel,
        "orientation": orientationText,
        // Omitted: lat/long properties
        "exposure": exposureText,
        "fNumber": fNumberText
    };
}
Most of the displayImageUI
function to which these properties are passed just copies the data into various controls. It’s good to note again, though, that displaying the picture itself is easily accomplished with our good friend, URL.createObjectURL
:
function displayImageUI(file, propertyText) {
    id("outputImage").src = window.URL.createObjectURL(file, { oneTimeOnly: true });
    // ...
For MusicProperties
a small example can be found in the Playlist sample, as we already saw earlier in “Playlists.” You might go back now and look at the code listed in that section, as you should be able to understand what’s going on. And while the SDK lacks samples that use VideoProperties
and DocumentProperties
, working with these follows the same pattern as shown above for ImageProperties
, so it should be straightforward to write the necessary code.
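For instance, a hedged sketch of reading a few video properties (videoFile being whatever StorageFile you have in hand) would look like this:

videoFile.properties.getVideoPropertiesAsync().done(function (videoProps) {
    console.log("Title: " + videoProps.title);
    console.log("Dimensions: " + videoProps.width + "x" + videoProps.height);
    console.log("Duration (ms): " + videoProps.duration);
});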
Also take a look again at the Configure keys for media sample, as we saw earlier in “The Media Control UI.” It shows how to use the music properties to obtain album art.
As for saving properties, the Simple imaging sample delivers there as well, also in Scenario 1. Because the fields shown earlier in Figure 10-5 are editable, the sample provides an Apply button that invokes the applyHandler
function below to write them back to the file:
function applyHandler() {
    ImageProperties.title = id("propertiesTitle").value;

    // Keywords are stored as an array of strings. Split the textarea text by newlines.
    ImageProperties.keywords.clear();

    if (id("propertiesKeywords").value !== "") {
        var keywordsArray = id("propertiesKeywords").value.split("\n");

        keywordsArray.forEach(function (keyword) {
            ImageProperties.keywords.append(keyword);
        });
    }

    var properties = new Windows.Foundation.Collections.PropertySet();

    // When writing the rating, use the "System.Rating" property key.
    // ImageProperties.rating does not handle setting the value to 0 (no stars/unrated).
    properties.insert("System.Rating", Helpers.convertStarsToSystemRating(
        id("propertiesRatingControl").winControl.userRating
    ));

    // Code omitted: convert discrete latitude/longitude values from the UI into the
    // appropriate forms needed for the properties, and do some validation; the end result
    // is to store these in the properties list
    properties.insert("System.GPS.LatitudeRef", latitudeRef);
    properties.insert("System.GPS.LongitudeRef", longitudeRef);
    properties.insert("System.GPS.LatitudeNumerator", latNum);
    properties.insert("System.GPS.LongitudeNumerator", longNum);
    properties.insert("System.GPS.LatitudeDenominator", latDen);
    properties.insert("System.GPS.LongitudeDenominator", longDen);

    // Write the properties array to the file
    ImageProperties.savePropertiesAsync(properties).done(function () {
        // ...
    }, function (error) {
        // Some error handling as some properties may not be supported by all image formats.
    });
}
A few noteworthy features of this code include the following:
• It splits the keywords entered in the UI control and appends each one to the keywords
property vector.
• It creates a new collection of properties of type Windows.Foundation.Collections.PropertySet
and uses its insert
method to add properties to the list. This property set is what’s expected by the savePropertiesAsync
method.
• The Helpers.convertStarsToSystemRating
method (see js/default.js) converts the 1–5 stars used in the WinJS.UI.Rating
control to the System.Rating value, which uses a 1–99 range. The documentation for System.Rating specifically indicates this mapping.
In general, all the detailed information you want for any particular Windows property can be found on the reference page for that property. Again, start at the Windows Properties reference and drill down from there.
To do something more with an image than just loading and displaying it (where again you can apply various CSS transforms for effect), you need to get to the actual pixels by means of a decoder. This already happens under the covers when you assign a URI to an img.src
, but to have direct access to the pixels means decoding manually. On the flip side, saving pixels back out to an image file means using an encoder.
WinRT provides APIs for both in the Windows.Graphics.Imaging
namespace, namely in the BitmapDecoder
, BitmapTransform
, and BitmapEncoder
classes. Loading, manipulating, and saving an image file often involves these three classes in turn, though the BitmapTransform
object is focused on rotation and scaling so you won’t use it if you’re doing other manipulations.
One demonstration of this API can be found in Scenario 2 of the Simple imaging sample. I’ll leave it to you to look at the code directly, however, because it gets fairly involved—up to 11 chained promises to save a file! It also does all decoding, manipulation, and encoding within a single function such as saveHandler
(js/scenario2.js). Here’s the process it follows (a condensed sketch appears after the list):
• Open a file with StorageFile.openAsync
, which provides a stream.
• Pass that stream to the static method BitmapDecoder.createAsync
, which provides a specific instance of BitmapDecoder
for the stream.
• Pass that decoder to the static method BitmapEncoder.createForTranscodingAsync
, which provides a specific BitmapEncoder
instance. This encoder is created with an InMemoryRandomAccessStream
.
• Set properties in the encoder’s bitmapTransform
property (a BitmapTransform
object) to configure scaling and rotation. This creates the transformed graphic in the in-memory stream.
• Create a property set (Windows.Graphics.Imaging.BitmapPropertySet
) that includes System.Photo.Orientation and use the encoder’s bitmapProperties.setPropertiesAsync
to save it.
• Copy the in-memory stream to the output file stream by using Windows.Storage.Streams.RandomAccessStream.copyAsync
.
• Close both streams with their respective close
methods (this is what closes the file).
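Here’s that condensed sketch of the same chain (error handling, encoder options, and the sample’s UI work omitted); inputFile and outputFile are StorageFile objects assumed to be in hand, and the transform and orientation values are only examples:

var Imaging = Windows.Graphics.Imaging;
var decoder = null;
var memStream = new Windows.Storage.Streams.InMemoryRandomAccessStream();
var fileStream = null;

inputFile.openAsync(Windows.Storage.FileAccessMode.read).then(function (stream) {
    return Imaging.BitmapDecoder.createAsync(stream);
}).then(function (dec) {
    decoder = dec;
    return Imaging.BitmapEncoder.createForTranscodingAsync(memStream, decoder);
}).then(function (encoder) {
    //Example transform: scale to half size
    encoder.bitmapTransform.scaledWidth = Math.floor(decoder.pixelWidth / 2);
    encoder.bitmapTransform.scaledHeight = Math.floor(decoder.pixelHeight / 2);

    //Write an orientation property into the output as well (value 1 = normal)
    var properties = new Imaging.BitmapPropertySet();
    properties.insert("System.Photo.Orientation",
        new Imaging.BitmapTypedValue(1, Windows.Foundation.PropertyType.uint16));

    return encoder.bitmapProperties.setPropertiesAsync(properties).then(function () {
        return encoder.flushAsync();
    });
}).then(function () {
    return outputFile.openAsync(Windows.Storage.FileAccessMode.readWrite);
}).then(function (stream) {
    fileStream = stream;
    memStream.seek(0); //Copy from the beginning of the in-memory stream
    return Windows.Storage.Streams.RandomAccessStream.copyAsync(memStream, fileStream);
}).done(function () {
    memStream.close();
    fileStream.close();
});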
As comprehensive as this scenario is, it’s helpful to look at different stages of the process separately, for which purpose we have the ImageManipulation example in this chapter’s companion content. This lets you pick and load an image, convert it to grayscale, and save that converted image to a new file. Its output is shown in Figure 10-6. It also gives us an opportunity to see how we can send decoded image data to an HTML canvas
element and save that canvas’s contents to a file.
FIGURE 10-6 Output of the ImageManipulation example in the chapter’s companion content.
The handler for the Load Image button (loadImage
in js/default.js) provides the initial display. It lets you select an image with the file picker, displays the full-size image in an img
element with URL.createObjectURL
, calls StorageFile.properties.getImagePropertiesAsync
to retrieve the title
and dateTaken
properties, and uses StorageFile.getThumbnailAsync
to provide the thumbnail at the top. We’ve seen all of these APIs in action already.
When we click Grayscale we enter the setGrayscale
handler where the interesting work happens. We call StorageFile.openReadAsync
to get a stream, call BitmapDecoder.createAsync
with that to obtain a decoder, cache some details from the decoder in a local object (encoding
), and call BitmapDecoder.getPixelDataAsync
and copy those pixels to a canvas (and only three chained async operations here!):
var Imaging = Windows.Graphics.Imaging; //Shortcut
var imageFile;     //Saved from the file picker
var decoder;       //Saved from BitmapDecoder.createAsync
var encoding = {}; //To cache some details from the decoder

function setGrayscale() {
    //Decode the image file into pixel data for a canvas

    //Get an input stream for the file (StorageFile object saved from opening)
    imageFile.openReadAsync().then(function (stream) {
        //Create a decoder using static createAsync method and the file stream
        return Imaging.BitmapDecoder.createAsync(stream);
    }).then(function (decoderArg) {
        decoder = decoderArg;

        //Configure the decoder if desired. Default is BitmapPixelFormat.rgba8 and
        //BitmapAlphaMode.ignore. The parameterized version of getPixelDataAsync can also
        //control transform, ExifOrientationMode, and ColorManagementMode if needed.

        //Cache these settings for encoding later
        encoding.dpiX = decoder.dpiX;
        encoding.dpiY = decoder.dpiY;
        encoding.pixelFormat = decoder.bitmapPixelFormat;
        encoding.alphaMode = decoder.bitmapAlphaMode;
        encoding.width = decoder.pixelWidth;
        encoding.height = decoder.pixelHeight;

        return decoder.getPixelDataAsync();
    }).done(function (pixelProvider) {
        //detachPixelData gets the actual bits (array can't be returned from
        //an async operation)
        copyGrayscaleToCanvas(pixelProvider.detachPixelData(),
            decoder.pixelWidth, decoder.pixelHeight);
    });
}
The decoder’s getPixelDataAsync
method comes in two forms. The simple form, shown here, decodes using defaults. The full-control version lets you specify other parameters, as explained in the code comments above. A common use of this is doing a transform using a Windows.Graphics.Imaging.BitmapTransform
object (as mentioned before), which accommodates scaling (with different interpolation modes), rotation (90-degree increments), cropping, and flipping.
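Here’s a hedged sketch of that parameterized form, reusing the decoder variable from the code above; the particular transform and modes are only examples:

//Scale to half size and rotate 90 degrees during the decode itself
var transform = new Imaging.BitmapTransform();
transform.scaledWidth = Math.floor(decoder.pixelWidth / 2);
transform.scaledHeight = Math.floor(decoder.pixelHeight / 2);
transform.interpolationMode = Imaging.BitmapInterpolationMode.fant;
transform.rotation = Imaging.BitmapRotation.clockwise90Degrees;

decoder.getPixelDataAsync(
    Imaging.BitmapPixelFormat.rgba8,
    Imaging.BitmapAlphaMode.straight,
    transform,
    Imaging.ExifOrientationMode.respectExifOrientation,
    Imaging.ColorManagementMode.colorManageToSRgb)
    .done(function (pixelProvider) {
        var pixels = pixelProvider.detachPixelData();
        //pixels now reflect the scaled and rotated image
    });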
Either way, what you get back from the getPixelDataAsync
is not the actual pixel array, because of a limitation in the WinRT language projection mechanism whereby an asynchronous operation cannot return an array. Instead, the operation returns a PixelDataProvider
object whose singular super-exciting synchronous method called detachPixelData
gives you the array you want. (And that method can be called only once and will fail on subsequent calls, hence the “detach” name.) In the end, though, what we have is exactly the data we need to manipulate the pixels and display the result on a canvas, as the copyGrayscaleToCanvas
function demonstrates. You can, of course, replace this kind of function with any other manipulation routine:
function copyGrayscaleToCanvas(pixels, width, height) {
    //Set up the canvas context and get its pixel array
    var canvas = document.getElementById("canvas1");
    canvas.width = width;
    canvas.height = height;
    var ctx = canvas.getContext("2d");

    //Loop through and copy pixel values into the canvas after converting to grayscale
    var imgData = ctx.createImageData(canvas.width, canvas.height);
    var colorOffset = { red: 0, green: 1, blue: 2, alpha: 3 };
    var r, g, b, gray;
    var data = imgData.data; //Makes a huge perf difference!

    for (var i = 0; i < pixels.length; i += 4) {
        r = pixels[i + colorOffset.red];
        g = pixels[i + colorOffset.green];
        b = pixels[i + colorOffset.blue];

        //Assign each rgb value to brightness for gray
        gray = Math.floor(.3 * r + .55 * g + .11 * b);

        data[i + colorOffset.red] = gray;
        data[i + colorOffset.green] = gray;
        data[i + colorOffset.blue] = gray;
        data[i + colorOffset.alpha] = pixels[i + colorOffset.alpha];
    }

    //Show it on the canvas
    ctx.putImageData(imgData, 0, 0);

    //Enable save button
    document.getElementById("btnSave").disabled = false;
}
This is a great place to point out that JavaScript isn’t necessarily the best language for working over a pile of pixels like this, though in this case the performance of a Release build running outside the debugger is actually quite good. Such routines may be better implemented as a WinRT component in a language like C# or C++ and made callable by JavaScript. We’ll take the opportunity to do exactly this in Chapter 16, “WinRT Components,” where we’ll also see limitations of the canvas
element that require us to take a slightly different approach.
Saving this canvas data to a file then happens in the saveGrayscale
function, where we use the file picker to get a StorageFile
, open a stream, acquire the canvas
pixel data, and hand it off to a BitmapEncoder
:
function saveGrayscale() {
    var picker = new Windows.Storage.Pickers.FileSavePicker();
    picker.suggestedStartLocation = Windows.Storage.Pickers.PickerLocationId.picturesLibrary;
    picker.suggestedFileName = imageFile.name + " - grayscale";
    picker.fileTypeChoices.insert("PNG file", [".png"]);

    var imgData, fileStream = null;

    picker.pickSaveFileAsync().then(function (file) {
        if (file) {
            return file.openAsync(Windows.Storage.FileAccessMode.readWrite);
        } else {
            return WinJS.Promise.wrapError("No file selected");
        }
    }).then(function (stream) {
        fileStream = stream;

        var canvas = document.getElementById("canvas1");
        var ctx = canvas.getContext("2d");
        imgData = ctx.getImageData(0, 0, canvas.width, canvas.height);

        return Imaging.BitmapEncoder.createAsync(Imaging.BitmapEncoder.pngEncoderId, stream);
    }).then(function (encoder) {
        //Set the pixel data--assume "encoding" object has options from elsewhere.
        //Conversion from canvas data to Uint8Array is necessary because the array type
        //from the canvas doesn't match what WinRT needs here.
        encoder.setPixelData(encoding.pixelFormat, encoding.alphaMode,
            encoding.width, encoding.height, encoding.dpiX, encoding.dpiY,
            new Uint8Array(imgData.data));

        //Go do the encoding
        return encoder.flushAsync();
    }).done(function () {
        fileStream.close();
    }, function () {
        //Empty error handler (do nothing if the user canceled the picker)
    });
}
Note how the BitmapEncoder
takes a codec identifier in its first parameter. We’re using pngEncoderId
, which is, as you can see, defined as a static property of the Windows.Graphics.Imaging.BitmapEncoder
class; other values are bmpEncoderId
, gifEncoderId
, jpegEncoderId
, jpegXREncoderId
, and tiffEncoderId
. These are the formats supported by the API. You can set additional properties of the BitmapEncoder
before setting pixel data, such as its BitmapTransform
, which will then be applied during encoding.
One gotcha to be aware of here is that the pixel array obtained from a canvas
element (a DOM CanvasPixelArray
) is not directly compatible with the WinRT byte array required by the encoder. This is the reason for the new Uint8Array
call down there in the last parameter.
In the previous section we mostly saw the use of a BitmapEncoder
created with that class’s static createAsync
method to write a new file. That’s all well and good, but you might want to know about a few of the encoder’s other capabilities.
First is the BitmapEncoder.createForTranscodingAsync
method that was mentioned briefly in the context of the Simple imaging sample. This specifically creates a new encoder that is initialized from an existing BitmapDecoder
. This is primarily used to manipulate some aspects of the source image file while leaving the rest of the data intact. To be more specific, you can first change those aspects that are expressed through the encoder’s setPixelData
method: the pixel format (rgba8, rgba16, and bgra8, see BitmapPixelFormat
), the alpha mode (premultiplied, straight, or ignore, see BitmapAlphaMode
), the image dimensions, the image DPI, and, of course, the pixel data itself. Beyond that, you can change other properties through the encoder’s bitmapProperties.setPropertiesAsync
method. In fact, if all you need to do is change a few properties and you don’t want to affect the pixel data, you can use BitmapEncoder.createForInPlacePropertyEncodingAsync
instead (how’s that for a method name!). This encoder allows calls to only bitmapProperties.setPropertiesAsync
, bitmapProperties.getPropertiesAsync
, and flushAsync
, and since it can assume that the underlying data in the file will remain unchanged, it executes much faster than its more flexible counterparts and has less memory overhead.
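A hedged sketch of such an in-place update might look like the following, where imageFile is an assumed StorageFile and the orientation value is only an example (success also depends on the format supporting in-place metadata writes):

var Imaging = Windows.Graphics.Imaging;
var fileStream = null;

imageFile.openAsync(Windows.Storage.FileAccessMode.readWrite).then(function (stream) {
    fileStream = stream;
    return Imaging.BitmapDecoder.createAsync(stream);
}).then(function (decoder) {
    return Imaging.BitmapEncoder.createForInPlacePropertyEncodingAsync(decoder);
}).then(function (encoder) {
    //Value 6 = rotate 90 degrees clockwise, per the System.Photo.Orientation docs
    var properties = new Imaging.BitmapPropertySet();
    properties.insert("System.Photo.Orientation",
        new Imaging.BitmapTypedValue(6, Windows.Foundation.PropertyType.uint16));

    return encoder.bitmapProperties.setPropertiesAsync(properties).then(function () {
        return encoder.flushAsync();
    });
}).done(function () {
    fileStream.close(); //Only the metadata was rewritten; the pixel data is untouched
}, function (e) {
    if (fileStream) { fileStream.close(); }
});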
An encoder from createForTranscodingAsync
does not accommodate a change of image file format (e.g., JPEG to PNG); for that you need to use createAsync
wherein you can specify the specific kind of encoding. As we’ve already seen, the first argument to createAsync
is a codec identifier, for which you normally pass one of the static properties on Windows.Graphics.Imaging.BitmapEncoder
. What I haven’t mentioned is that you can also specify custom codecs in this first parameter and that the createAsync
call also supports an optional third argument in which you can provide options for the particular codec in question. However, there are complications and restrictions here.
Let me address options first. The present documentation for the BitmapEncoder
codec values (like pngEncoderId
) lacks any details about available options. For that you need to instead refer to the docs for the Windows Imaging Component (WIC), specifically the Native WIC Codecs that are what WinRT is surfacing to Store apps. If you go into the page for a specific codec, you’ll then see a section on “Encoder Options” that tells you what you can use. For example, the JPEG codec supports properties like ImageQuality
(a value between 0.0 and 1.0), as well as built-in rotations. The PNG codec supports properties like FilterOption
for various compression optimizations.
To provide these properties, you need to create a new BitmapPropertySet
and insert an entry in that set for each desired option. If, for example, you have a variable named quality
that you want to apply to a JPEG encoding, you’d create the encoder like this:
var options = new Windows.Graphics.Imaging.BitmapPropertySet();
options.insert("ImageQuality", quality);

var encoderPromise = Imaging.BitmapEncoder.createAsync(
    Imaging.BitmapEncoder.jpegEncoderId, stream, options);
You use the same BitmapPropertySet
for any properties you might pass to an encoder’s bitmapProperties.setPropertiesAsync
call. Here we’re just using the same mechanism for encoder options.
As for custom codecs, this simply means that the first argument to BitmapEncoder.createAsync
(as well as BitmapDecoder.createAsync
) is the GUID (a class identifier or CLSID) for that codec, the implementation of which must be provided by a DLL. Details on how to write one of these are provided in How to Write a WIC-Enabled Codec. The catch is that including custom image codecs in your package is not presently supported. If the codec is already on the system (that is, installed via the desktop), it will work. However, the Windows Store policies do not allow apps to be dependent on other apps, so it’s unlikely that you can even ship such an app unless it’s preinstalled on some specific OEM device and the DLL is part of the system image. (An app written in C++ can do more here, but that’s beyond the scope of this book.)
In short, for apps written in JavaScript and HTML, you’re really limited, for all practical purposes, to image formats that are inherently supported in the system.
Do note that these restrictions do not exist for custom audio and video codecs. The Media extensions sample shows how to do this with a custom video codec, as we’ll see in the next section.
As with images, if all we want to do is load the contents of a StorageFile
into an audio or video element, we can just pass that StorageFile
to URL.createObjectURL
and assign the result to a src
attribute. Similarly, if we want to get at the raw data, we can just use the StorageFile.openAsync
or openReadAsync
methods to obtain a file stream.
To be honest, opening the file is probably the farthest you’d ever go in JavaScript with raw audio or video, if even that. While chewing on an image is a marginally acceptable process in the JavaScript environment, churning on audio and especially video is really best done in a highly performant C++ DLL. In fact, many third-party, platform-neutral C/C++ libraries for such manipulations are readily available and can be incorporated directly into such a DLL. In this case you might as well just let the DLL open the file itself!
That said, WinRT does provide for transcoding (converting) between different media formats and provides an extensibility model for custom codecs, effects, and scheme handlers. In fact, we’ve already seen how to apply custom video effects through the Media extensions sample, and the same DLLs can also be used within an encoding process, where all that the JavaScript code really does is glue the right components together (which it’s very good at doing). Let’s see how this works with transcoding video first and then with custom codecs.
Transcoding both audio and video is accomplished through the Windows.Media.Transcoding.MediaTranscoder
class, which supports output formats of mp3, wma, and m4a for audio, and mp4 and wmv for video. The transcoding process also allows you to apply effects and to trim start and end times.
Transcoding happens either from one StorageFile
to another or one RandomAccessStream
to another, and in each case happens according to a MediaEncodingProfile
. To set up a transcoding operation you call the MediaTranscoder
prepareFileTranscodeAsync
or prepareStreamTranscodeAsync
method, which returns a PrepareTranscodeResult
object. This represents the operation that’s ready to go, but it won’t happen until you call that result’s transcodeAsync
method. In JavaScript, each result is a promise, allowing you to provide completed and progress handlers for a single operation but also allowing you to combine operations with WinJS.Promise.join
. This allows them to be set up and started later, which is useful for batch processing and doing automatic uploads to a service like YouTube while you’re sleeping! (And at times like these I’ve actually pulled ice packs from my freezer and placed them under my laptop as a poor-man’s cooling system….)
The Transcoding media sample provides us with a couple of transcoding scenarios. In Scenario 1 (js/presets.js) we can pick a video file, pick a target format, select a transcoding profile, and turn the machine loose to do the job (with progress being reported), as shown in Figure 10-7.
FIGURE 10-7 The Transcoding media sample cranking away on a video of my then two-year-old son discovering the joys of a tape measure.
The code that’s executed when you press the Transcode button is as follows (some bits omitted; this sample happens to use nested promises, which again isn’t recommended for proper error handling unless you want, as this code would show, to eat any exceptions that occur prior to the transcodeAsync
call):
function onTranscode() {
    // Create transcode object.
    var transcoder = null;
    transcoder = new Windows.Media.Transcoding.MediaTranscoder();

    // Get transcode profile.
    getPresetProfile(id("profileSelect"));

    // Create output file and transcode.
    var videoLib = Windows.Storage.KnownFolders.videosLibrary;
    var createFileOp = videoLib.createFileAsync(g_outputFileName,
        Windows.Storage.CreationCollisionOption.generateUniqueName);

    createFileOp.done(function (ofile) {
        g_outputFile = ofile;
        g_transcodeOp = null;

        var prepareOp = transcoder.prepareFileTranscodeAsync(g_inputFile, g_outputFile,
            g_profile);

        prepareOp.done(function (result) {
            if (result.canTranscode) {
                g_transcodeOp = result.transcodeAsync();
                g_transcodeOp.done(transcodeComplete, transcoderErrorHandler,
                    transcodeProgress);
            } else {
                transcodeFailure(result.failureReason);
            }
        }); // prepareOp.done

        id("cancel").disabled = false;
    }); // createFileOp.done
}
The getPresetProfile
method retrieves the appropriate profile object according to the option selected in the app. For the selections shown in Figure 10-7 (WMV and WVGA), we’d use these parts of that function:
function getPresetProfile(profileSelect) {
    g_profile = null;
    var mediaProperties = Windows.Media.MediaProperties;
    var videoEncodingProfile;

    switch (profileSelect.selectedIndex) {
        // other cases omitted
        case 2:
            videoEncodingProfile = mediaProperties.VideoEncodingQuality.wvga;
            break;
    }

    if (g_useMp4) {
        g_profile = mediaProperties.MediaEncodingProfile.createMp4(videoEncodingProfile);
    } else {
        g_profile = mediaProperties.MediaEncodingProfile.createWmv(videoEncodingProfile);
    }
}
In Scenario 2, the sample always uses the WVGA encoding but allows you to set specific values for the video dimensions, the frame rate, the audio and video bitrates, audio channels, and audio sampling. It applies these settings in getCustomProfile
(js/custom.js) simply by configuring the profile properties after the profile is created:
function getCustomProfile() {
    if (g_useMp4) {
        g_profile = Windows.Media.MediaProperties.MediaEncodingProfile.createMp4(
            Windows.Media.MediaProperties.VideoEncodingQuality.wvga);
    } else {
        g_profile = Windows.Media.MediaProperties.MediaEncodingProfile.createWmv(
            Windows.Media.MediaProperties.VideoEncodingQuality.wvga);
    }

    // Pull configuration values from the UI controls
    g_profile.audio.bitsPerSample = id("AudioBPS").value;
    g_profile.audio.channelCount = id("AudioCC").value;
    g_profile.audio.bitrate = id("AudioBR").value;
    g_profile.audio.sampleRate = id("AudioSR").value;
    g_profile.video.width = id("VideoW").value;
    g_profile.video.height = id("VideoH").value;
    g_profile.video.bitrate = id("VideoBR").value;
    g_profile.video.frameRate.numerator = id("VideoFR").value;
    g_profile.video.frameRate.denominator = 1;
}
And to finish off, Scenario 3 is like Scenario 1, but it lets you set start and end times that are then saved in the transcoder’s trimStartTime
and trimStopTime
properties (see js/trim.js):
transcoder = new Windows.Media.Transcoding.MediaTranscoder();
transcoder.trimStartTime = g_start;
transcoder.trimStopTime = g_stop;
Though not shown in the sample, you can apply effects to a transcoding operation by using the transcoder’s addAudioEffect
and addVideoEffect
methods.
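A hedged sketch of what that looks like, reusing the grayscale effect DLL from the Media extensions sample (the class ID string and the handler names here are assumptions on my part; the class ID must match an ActivatableClassId declared in your manifest):

var transcoder = new Windows.Media.Transcoding.MediaTranscoder();
transcoder.addVideoEffect("GrayscaleTransform.GrayscaleEffect");

//Then prepare and run the transcode exactly as before
transcoder.prepareFileTranscodeAsync(inputFile, outputFile, profile).done(function (result) {
    if (result.canTranscode) {
        result.transcodeAsync().done(onComplete, onError, onProgress);
    }
});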
Clearly, there are many more audio and video formats in the world than Windows can support in-box, so an extensibility mechanism is provided in WinRT to allow for custom bytestream objects, custom media sources, and custom codecs and effects. It’s important to note again that all such extensions are available only to the app itself and are not available to other apps on the system. Furthermore, Windows will always prefer in-box components over a custom one, which means don’t bother wasting your time creating a new mp3 decoder or such since it will never actually be used.
As suggested earlier with custom image formats, this subject will certainly take you into some pretty vast territory around the entire Windows Media Foundation (WMF) SDK. What’s in WinRT is really just a wrapper, so knowledge of WMF is essential and not for the faint of heart!
Audio and video extensions are declared in the app manifest where you’ll need to edit the XML directly. As seen in the Media extensions sample for all the DLLs in its overall solution, each declaration looks like this:
<Extension Category="windows.activatableClass.inProcessServer">
  <InProcessServer>
    <Path>MPEG1Decoder.dll</Path>
    <ActivatableClass ActivatableClassId="MPEG1Decoder.MPEG1Decoder" ThreadingModel="both" />
  </InProcessServer>
</Extension>
The ActivatableClassId
is how an extension is identified when calling the WinRT APIs; the manifest maps that class ID to the specific DLL that needs to be loaded.
Depending, then, on the use of the extension, you might need to register it with WinRT through the methods of Windows.Media.MediaExtensionManager
: registerAudio[Decoder | Encoder]
, registerByteStreamHandler
(media sources for file containers), registerSchemeHandler
(media sources for URI schemes), and registerVideo[Decoder | Encoder]
. In Scenario 1 of the Media extensions sample (js/LocalDecoder.js), we can see how to set up a custom decoder for video playback:
var page = WinJS.UI.Pages.define("/html/LocalDecoder.html", {
    extensions: null,
    MFVideoFormat_MPG1: { value: "{3147504d-0000-0010-8000-00aa00389b71}" },
    NULL_GUID: { value: "{00000000-0000-0000-0000-000000000000}" },

    ready: function (element, options) {
        if (!this.extensions) {
            // Add any initialization code here
            this.extensions = new Windows.Media.MediaExtensionManager();

            // Register custom ByteStreamHandler and custom decoder.
            this.extensions.registerByteStreamHandler("MPEG1Source.MPEG1ByteStreamHandler",
                ".mpg", null);
            this.extensions.registerVideoDecoder("MPEG1Decoder.MPEG1Decoder",
                this.MFVideoFormat_MPG1, this.NULL_GUID);
        }
        // ...
where the MPEG1Source.MPEG1ByteStreamHandler CLSID is implemented in one DLL (see the MPEG1Source C++ project in the sample’s solution) and the MPEG1Decoder.MPEG1Decoder CLSID is implemented in another (the MPEG1Decoder C++ project).
Scenario 2, for its part, shows the use of a custom scheme handler, where the handler (in the GeometricSource C++ project) generates video frames on the fly. Fascinating stuff, but again beyond the scope of this book.
Effects, as we’ve seen, are quite simple to use once you have one implemented: just pass their CLSID to methods like msInsertVideoEffect
and msInsertAudioEffect
on video
and audio
elements. You can also apply effects during the transcoding process in the MediaTranscoder
class’s addAudioEffect
and addVideoEffect
methods. The same is also true for media capture, as we’ll see shortly.
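To connect the dots, here’s a hedged sketch for a video element, again assuming the grayscale effect’s class ID from the Media extensions sample and a videoFile variable holding a StorageFile:

var video1 = document.getElementById("video1"); //Hypothetical element id

//Arguments: activatable class ID, whether the effect is required, optional configuration
video1.msInsertVideoEffect("GrayscaleTransform.GrayscaleEffect", true, null);

video1.src = URL.createObjectURL(videoFile, { oneTimeOnly: true });
video1.play();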
There are times when we can really appreciate the work that people have done to protect individual privacy, such as making sure I know when my computer’s camera is being used since I am often using it in the late evening, sitting in bed, or in the early pre-shower mornings when I have, in the words of my father-in-law, “pineapple head.”
And there are times when we want to turn on a camera or a microphone and record something: a picture, a video, or audio. Of course, an app cannot know ahead of time exactly what cameras and microphones might be on a system. A key step in capturing media, then, is determining which device to use—something that the Windows.Media.Capture
APIs provide for nicely, along with the process of doing the capture itself into a file, a stream, or some other custom “sink” depending on how an app wants to manipulate or process the capture.
Back in Chapter 2, “Quickstart,” we learned how to use WinRT to easily capture a photograph in the Here My Am! app. To quickly review, we only needed to declare the Webcam capability in the manifest and add a few lines of code:
function capturePhoto() {
    var that = this;
    var captureUI = new Windows.Media.Capture.CameraCaptureUI();

    //Indicate that we want to capture a PNG that's no bigger than our target element --
    //the UI will automatically show a crop box of this size
    captureUI.photoSettings.format = Windows.Media.Capture.CameraCaptureUIPhotoFormat.png;
    captureUI.photoSettings.croppedSizeInPixels =
        { width: this.clientWidth, height: this.clientHeight };

    captureUI.captureFileAsync(Windows.Media.Capture.CameraCaptureUIMode.photo)
        .done(function (capturedFile) {
            //Be sure to check validity of the item returned; could be null
            //if the user canceled.
            if (capturedFile) {
                lastCapture = capturedFile; //Save for Share
                that.src = URL.createObjectURL(capturedFile, { oneTimeOnly: true });
            }
        }, function (error) {
            console.log("Unable to invoke capture UI.");
        });
}
The UI that Windows brings up through this API provides for cropping, retakes, and adjusting camera settings. Another example of taking a photo can also be found in Scenario 1 of the CameraCaptureUI Sample, along with an example of capturing video in Scenario 2. In this latter case (js/capturevideo.js) we configure the capture UI object for a video format and indicate a video mode in the call to captureFileAsync
:
function captureVideo() {
    var dialog = new Windows.Media.Capture.CameraCaptureUI();
    dialog.videoSettings.format = Windows.Media.Capture.CameraCaptureUIVideoFormat.mp4;

    dialog.captureFileAsync(Windows.Media.Capture.CameraCaptureUIMode.video)
        .done(function (file) {
            if (file) {
                var videoBlobUrl = URL.createObjectURL(file, { oneTimeOnly: true });
            } else {
                //...
            }
        }, function (err) {
            //...
        });
}
It should be noted that the Webcam capability in the manifest applies only to the image or video side of camera capture. If you want to capture audio, be sure to also select the Microphone capability on the Capabilities tab of the manifest editor.
If you look in the Windows.Media.Capture.CameraCaptureUI
object, you’ll also see many other options you can configure. Its photoSettings
property, a CameraCaptureUIPhotoCaptureSettings
object, lets you indicate cropping size and aspect ratio, format, and maximum resolution. Its videoSettings
property, a CameraCaptureUIVideoCaptureSettings
object, lets you set the format, set the maximum duration and resolution, and indicate whether the UI should allow for trimming. All useful stuff! You can find discussions of some of these in the docs on Capturing or rendering audio, video, and images, including coverage of managing calls on a Bluetooth device.
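For example, here’s a hedged sketch of configuring video capture with a cap on duration and trimming enabled (the specific values are arbitrary):

var captureUI = new Windows.Media.Capture.CameraCaptureUI();

captureUI.videoSettings.format = Windows.Media.Capture.CameraCaptureUIVideoFormat.mp4;
captureUI.videoSettings.maxResolution =
    Windows.Media.Capture.CameraCaptureUIMaxVideoResolution.highDefinition;
captureUI.videoSettings.maxDurationInSeconds = 30;
captureUI.videoSettings.allowTrimming = true;

captureUI.captureFileAsync(Windows.Media.Capture.CameraCaptureUIMode.video)
    .done(function (file) {
        //file will be null if the user canceled
    });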
Of course, the default capture UI won’t necessarily suffice in every use case. For one, it always sends output to a file, but if you’re writing a communications app, for example, you’d rather send captured video to a stream or send it over a network without any files involved at all. You might also want to preview a video before any capture actually happens. Furthermore, you may want to add effects during the capture, apply rotation, and perhaps apply a custom encoding.
All of these capabilities are available through the Windows.Media.Capture.MediaCapture
class.
For a very simple demonstration of previewing video in a video
element, we can look at the CameraOptionsUI sample in js/showoptionsui.js. When you tap the Start Preview button, it creates and initializes a MediaCapture
object as follows:
function initializeMediaCapture() {
    mediaCaptureMgr = new Windows.Media.Capture.MediaCapture();
    mediaCaptureMgr.initializeAsync().done(initializeComplete, initializeError);
}
where the initializeComplete
handler calls into startPreview
:
function startPreview() {
    document.getElementById("previewTag").src = URL.createObjectURL(mediaCaptureMgr);
    document.getElementById("previewTag").play();

    startPreviewButton.disabled = true;
    showSettingsButton.style.visibility = "visible";
    previewStarted = true;
}
The other little bit shown in this sample is invoking the Windows.Media.Capture.CameraOptionsUI
, which happens when you tap its Show Settings button; see Figure 10-8. This is just a system-provided flyout with options that are relevant to the current media stream being captured:
function showSettings() {
    if (mediaCaptureMgr) {
        Windows.Media.Capture.CameraOptionsUI.show(mediaCaptureMgr);
    }
}
By the way, if you have trouble running a sample like this in the Visual Studio simulator—specifically, you see exceptions when trying to turn on the camera—try running on the local machine or a remote machine instead.
FIGURE 10-8 The Camera Options UI, as shown in the CameraOptionsUI sample (empty bottom is cropped).
More complex scenarios involving the MediaCapture
class (and a few others) can be found now in the Media capture using capture device sample, such as previewing and capturing video, changing properties dynamically (Scenario 1), selecting a specific media device (Scenario 2), and recording just audio (Scenario 3).
Starting with Scenario 3 (js/AudioCapture.js, the simplest), here’s the core code to create and initialize the MediaCapture
object for an audio stream (see the streamingCaptureMode
property in the initialization settings), where that stream is directed to a storage file (the sample creates it in the videos library) via startRecordToStorageFileAsync
(some code omitted for brevity):
var mediaCaptureMgr = null;
var captureInitSettings = null;
var encodingProfile = null;
var storageFile = null;

// This is called when the page is loaded
function initCaptureSettings() {
    captureInitSettings = new Windows.Media.Capture.MediaCaptureInitializationSettings();
    captureInitSettings.audioDeviceId = "";
    captureInitSettings.videoDeviceId = "";
    captureInitSettings.streamingCaptureMode =
        Windows.Media.Capture.StreamingCaptureMode.audio;
}

function startDevice() {
    mediaCaptureMgr = new Windows.Media.Capture.MediaCapture();

    mediaCaptureMgr.initializeAsync(captureInitSettings).done(function (result) {
        // ...
    });
}

function startRecord() {
    // ...
    // Start recording.
    Windows.Storage.KnownFolders.videosLibrary.createFileAsync("cameraCapture.m4a",
        Windows.Storage.CreationCollisionOption.generateUniqueName)
        .done(function (newFile) {
            storageFile = newFile;

            encodingProfile = Windows.Media.MediaProperties
                .MediaEncodingProfile.createM4a(Windows.Media.MediaProperties
                .AudioEncodingQuality.auto);

            mediaCaptureMgr.startRecordToStorageFileAsync(encodingProfile, storageFile)
                .done(function (result) {
                    // ...
                });
        });
}

function stopRecord() {
    mediaCaptureMgr.stopRecordAsync().done(function (result) {
        displayStatus("Record Stopped. File " + storageFile.path + " ");

        // Playback the recorded audio
        var audio = id("capturePlayback" + scenarioId);
        audio.src = URL.createObjectURL(storageFile, { oneTimeOnly: true });
        audio.play();
    });
}
Scenario 1 is essentially the same code but captures a video stream as well as photos, with results shown in Figure 10-9. This variation is enabled through these properties in the initialization settings (see js/BasicCapture.js within initCaptureSettings
):
captureInitSettings.photoCaptureSource =
    Windows.Media.Capture.PhotoCaptureSource.videoPreview;
captureInitSettings.streamingCaptureMode =
    Windows.Media.Capture.StreamingCaptureMode.audioAndVideo;
FIGURE 10-9 Previewing and recording video with the default device in the Media capture sample, Scenario 1. (The output is cropped because I needed to run the app using the Local Machine option in Visual Studio, and I didn’t think you needed to see a 1920x1200 screenshot with lots of whitespace!).
Notice the Contrast and Brightness controls in Figure 10-9. Changing these will change the preview video, along with the recorded video. The sample does this through the MediaCapture.videoDeviceController
object’s contrast
and brightness
properties, showing that these (and any others in the controller) can be adjusted dynamically. Refer to the getCameraSettings
function in js/BasicCapture.js that basically wires the slider change
events into a generic anonymous function to update the desired property.
Looking now at Scenario 2 (js/AdvancedCapture.js), it’s more or less like Scenario 1 but it allows you to select the specific input device. Until now, everything we’ve done has simply used the default device, but you’re not limited to that, of course. You can use the Windows.Devices.Enumeration
API to retrieve a list of devices within a particular device interface class; the sample uses the predefined videoCapture
class:
function enumerateCameras() {
    var cameraSelect = id("cameraSelect");
    deviceList = null;
    deviceList = new Array();

    while (cameraSelect.length > 0) {
        cameraSelect.remove(0);
    }

    //Enumerate webcams and add them to the list
    var deviceInfo = Windows.Devices.Enumeration.DeviceInformation;

    deviceInfo.findAllAsync(Windows.Devices.Enumeration.DeviceClass.videoCapture)
        .done(function (devices) {
            // Add the devices to deviceList
            if (devices.length > 0) {
                for (var i = 0; i < devices.length; i++) {
                    deviceList.push(devices[i]);
                    cameraSelect.add(new Option(deviceList[i].name), i);
                }

                //Select the first webcam
                cameraSelect.selectedIndex = 0;
                initCaptureSettings();
            } else {
                // disable buttons.
            }
        }, errorHandler);
}
The selected device’s ID is then copied within initCaptureSettings
to the MediaCaptureInitializationSettings.videoDeviceId
property:
var selectedIndex = id("cameraSelect").selectedIndex;
var deviceInfo = deviceList[selectedIndex];
captureInitSettings.videoDeviceId = deviceInfo.id;
By the way, you can retrieve the default device ID at any time through the methods of the Windows.Media.Devices.MediaDevice
object and listen to its events for changes in the default devices. It’s also important to note that DeviceInformation
(in the deviceInfo
variable above) includes a property called enclosureLocation
. This tells you whether a camera is front-facing or rear-facing, which you can use to rotate the video or photo as appropriate for the user’s perspective:
var cameraLocation = null;

if (deviceInfo.enclosureLocation) {
    cameraLocation = deviceInfo.enclosureLocation.panel;
}

if (cameraLocation === Windows.Devices.Enumeration.Panel.back) {
    rotateVideoOnOrientationChange = true;
    reverseVideoRotation = false;
} else if (cameraLocation === Windows.Devices.Enumeration.Panel.front) {
    rotateVideoOnOrientationChange = true;
    reverseVideoRotation = true;
} else {
    rotateVideoOnOrientationChange = false;
}
The other bit that Scenario 2 demonstrates is using the MediaCapture.addEffectAsync
with a grayscale effect, shown in Figure 10-10, that’s implemented in a DLL (the GrayscaleTransform project in the sample’s solution). This works exactly as it did with transcoding, and you can refer to the addRemoveEffect
and addEffectToImageStream
functions in js/AdvancedCapture.js for the details. You’ll notice there that these functions do a number of checks using the MediaCaptureSettings.videoDeviceCharacteristic
value to make sure that the effect is added in the right place.
FIGURE 10-10 Scenario 2 of the Media capture sample in which one can select a specific device and apply an effect. (The output here is again cropped from a larger screen shot.) Were you also paying attention enough to notice that I switched guitars?
To say that streaming media is popular is certainly a gross understatement. As mentioned in this chapter’s introduction, Netflix alone accounts for a large percentage of today’s Internet bandwidth (including that of my own home). YouTube certainly does its part as well—so your app might as well contribute to the cause!
Streaming media from a server to your app is easily the most common case, and it happens automatically when you set an audio or video src
attribute to a remote URI. To improve on this, Microsoft also has a Smooth Streaming SDK for Windows 8 Apps (in beta at the time of writing) that helps you build media apps with a number of rich features including live playback and PlayReady content protection. I won’t be covering that SDK in this book, so I wanted to make sure you were aware of it.
What we’ll focus on here, in the few pages we have left before my editors at Microsoft Press pull the plug on this chapter, are considerations for digital rights management (DRM) and streaming not from a network but to a network, for example, audio/video capture in a communications app, as well as streaming media from an app to a PlayTo device.
Again, streaming media from a server is what you already do whenever you’re using an audio or video element with a remote URI. The details just happen for you. Indeed, much of what a great media client app does is talk to web services, retrieve metadata and the catalog, help the user navigate all of that information, and ultimately arrive at a URI that can be dropped into the src attribute of a video or audio element. Then, once the app receives the canplay event, you can call the element’s play method to get everything going.
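In code, that last step is only a few lines. Here’s a minimal sketch without any DRM, assuming a video element with the id video1 and a hypothetical content URI:

var video1 = document.getElementById("video1");

// Start playback as soon as enough of the stream has buffered.
video1.addEventListener("canplay", function () {
    video1.play();
}, false);

video1.addEventListener("error", function () {
    // video1.error.msExtendedCode carries platform-specific detail (see the DRM discussion below).
    console.log("Playback error, code " + video1.error.code);
}, false);

// Assigning a remote URI to src is all it takes to start streaming.
video1.src = "http://media.example.com/show.mp4";   // hypothetical URI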
Of course, media is often protected with DRM; otherwise the content on paid services wouldn’t be generating much income for the owners of those rights! So there needs to be a mechanism to acquire and verify rights somewhere between setting the element’s src and receiving canplay. Fortunately, there’s a simple means to do exactly that:
• Before setting the src attribute, create an instance of Windows.Media.Protection.MediaProtectionManager and configure its properties.
• Listen to this object’s serviceRequested event, the handler for which performs the appropriate rights checks and sets a completed flag when all is well. (Two other events, just to mention them, are componentloadfailed and rebootneeded.)
• Assign the protection manager to the audio/video element with the msSetMediaProtectionManager extension method.
• Set the src attribute. This will trigger the serviceRequested event to start the DRM process, which will prevent canplay until the DRM checks complete successfully.
• In the event of an error, the media element’s error event will be fired. The element’s error property will then contain an msExtendedCode with more details.
You can refer to How to use pluggable DRM and How to handle DRM errors for additional details, but here’s a minimal and hypothetical example of all this in code:
var video1 = document.getElementById("video1");

video1.addEventListener('error', function () {
    var error = video1.error.msExtendedCode;
    //...
}, false);

video1.addEventListener('canplay', function () {
    video1.play();
}, false);

var cpm = new Windows.Media.Protection.MediaProtectionManager();
cpm.addEventListener('servicerequested', enableContent, false);

//Attach the protection manager before setting src
video1.msSetMediaProtectionManager(cpm);
video1.src = "http://some.content.server.url/protected.wmv";

function enableContent(e) {
    if (typeof (e.request) != 'undefined') {
        var req = e.request;
        var system = req.protectionSystem;
        var type = req.type;
        //Take necessary actions based on the system and type
    }

    if (typeof (e.completion) != 'undefined') {
        //Requested action completed
        var comp = e.completion;
        comp.complete(true);
    }
}
How you specifically check for rights, of course, is particular to the service you’re drawing from—and not something you’d want to publish in any case!
For a more complete demonstration of handling DRM, check out the Simple PlayReady sample, which will require that you download and install the Microsoft PlayReady Client SDK. PlayReady, if you aren’t familiar with it yet, is a license service that Microsoft provides so that you don’t have to create one from scratch. The PlayReady client SDK provides additional tools and framework support for apps wanting to implement both online and offline media scenarios, such as progressive download, download to own, rentals, and subscriptions. Plus, with the SDK you don’t need to submit your app for DRM Conformance testing. In any case, here’s how the Simple PlayReady sample sets up its content protection manager, just to give an idea of how the WinRT APIs are used with specific DRM service identifiers:
mediaProtectionManager = new Windows.Media.Protection.MediaProtectionManager();

mediaProtectionManager.properties["Windows.Media.Protection.MediaProtectionSystemId"] =
    '{F4637010-03C3-42CD-B932-B48ADF3A6A54}';

var cpsystems = new Windows.Foundation.Collections.PropertySet();
cpsystems["{F4637010-03C3-42CD-B932-B48ADF3A6A54}"] =
    "Microsoft.Media.PlayReadyClient.PlayReadyWinRTTrustedInput";

mediaProtectionManager.properties["Windows.Media.Protection.MediaProtectionSystemIdMapping"] =
    cpsystems;
The next case to consider is when an app is the source of streaming media rather than the consumer, which means that client apps elsewhere are acting in that capacity. In reality, in this scenario—audio or video communications and conferencing—it’s usually the case that the app plays both roles, streaming media to other clients and consuming media from them. This is the case with Windows Live Messenger, Skype, and other such utilities, along with apps like games that include chat capabilities.
Here’s how such apps generally work:
• Set up the necessary communication channels over the network, which could be a peer-to-peer system or could involve a central service of some kind.
• Capture audio or video to a stream using the WinRT APIs we’ve seen (specifically MediaCapture.startRecordToStreamAsync) or capture to a custom sink.
• Do any additional processing on the stream data. Note, however, that effects are plugged into the capture mechanism (MediaCapture.addEffectAsync) rather than being something you do in post-processing.
• Encode the stream for transmission however you need.
• Transmit the stream over the network channel.
• Receive transmissions from other connected apps.
• Decode transmitted streams and convert them to a blob by using MSApp.createBlobFromRandomAccessStream.
• Use URL.createObjectURL to hook an audio or video element to the stream.
To see such features in action, check out the Real-time communications sample that implements video chat in Scenario 2 and demonstrates working with different latency modes in Scenario 1. The latter two steps in the list above are also shown in the PlayToReceiver sample that is set up to receive a media stream from another source.
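To make the two ends of that pipeline concrete, here’s a minimal and hypothetical sketch of the capture and playback steps, assuming an initialized MediaCapture object named capture and a video element with the id remoteVideo; the encoding, transmission, and decoding in the middle are entirely up to your protocol:

// Sending side: record camera output into an in-memory stream (transport not shown).
var outStream = new Windows.Storage.Streams.InMemoryRandomAccessStream();
var profile = Windows.Media.MediaProperties.MediaEncodingProfile.createMp4(
    Windows.Media.MediaProperties.VideoEncodingQuality.vga);

capture.startRecordToStreamAsync(profile, outStream).done(function () {
    // Later, call capture.stopRecordAsync() and push outStream's contents over the network.
});

// Receiving side: given an IRandomAccessStream decoded from the network,
// hook it to a video element through a blob.
function playIncomingStream(inStream, contentType) {
    var blob = MSApp.createBlobFromRandomAccessStream(contentType, inStream);
    var video = document.getElementById("remoteVideo");   // hypothetical element id
    video.src = URL.createObjectURL(blob, { oneTimeOnly: true });
    video.play();
}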
The final case of streaming is centered on the PlayTo capabilities that were introduced in Windows 7. Simply said, PlayTo is a means through which an app can connect local playback/display for audio, video, and img elements to a remote device.
The details happen through the Windows.Media.PlayTo APIs along with the extension methods added to media elements. If, for example, you want to specifically start streaming to a PlayTo device, invoking the selection UI directly, you’d do the following (a short sketch appears after this list):
• Use Windows.Media.PlayTo.PlayToManager:
  • getForCurrentView returns the object.
  • showPlayToUI invokes the flyout UI where the user selects a receiver.
  • The sourceRequested event is fired when the user selects a receiver.
• In sourceRequested:
  • Get the PlayToSource object from the audio, video, or img element (its msPlayToSource property) and pass it to e.sourceRequest.setSource.
  • Set the PlayToSource.next property to the msPlayToSource of another element for continual playing.
• Pick up the media element’s ended event to stage additional media.
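And here’s that sketch in code, minimal and hypothetical, assuming a video element with the id video1 whose content you want to stream:

var playToManager = Windows.Media.PlayTo.PlayToManager.getForCurrentView();

playToManager.addEventListener("sourcerequested", function (e) {
    // Hand the element's PlayToSource to the request; PlayTo takes it from there.
    var video1 = document.getElementById("video1");
    e.sourceRequest.setSource(video1.msPlayToSource);
}, false);

// Programmatically invoke the same device picker the user sees in the Devices charm.
Windows.Media.PlayTo.PlayToManager.showPlayToUI();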
Another approach, as demonstrated in the Media Play To sample, is to go ahead and play media locally and then let the user choose a PlayTo device on the fly from the Devices charm. In this case you don’t need to do anything, because Windows will pick up the current playback element and direct it accordingly. But the app can listen to the statechanged event of the element’s msPlayToSource.connection object (a PlayToConnection), which will fire when the user selects a PlayTo device and when other changes happen.
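Here’s a minimal sketch of such a listener, assuming a video element with the id video1; the state names come from the Windows.Media.PlayTo.PlayToConnectionState enumeration:

var connection = document.getElementById("video1").msPlayToSource.connection;

connection.addEventListener("statechanged", function (e) {
    // e.previousState and e.currentState are PlayToConnectionState values.
    var states = Windows.Media.PlayTo.PlayToConnectionState;

    switch (e.currentState) {
        case states.connected:
            console.log("Connected to a PlayTo receiver");
            break;
        case states.rendering:
            console.log("Receiver is rendering the content");
            break;
        case states.disconnected:
            console.log("Back to local playback");
            break;
    }
}, false);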
Generally speaking, PlayTo is primarily intended for streaming to a media receiver device that’s probably connected to a TV or other large screen. This way you can select local content on a Windows 8 device and send it straight to that receiver. But it’s also possible to make a software receiver—that is, an app that can receive streamed content from a PlayTo source. The PlayToReceiver sample does exactly this, and when you run it on another device on your local network, it will show up as a target in the Devices charm on the source machine.
You can even run the app from your primary machine using the remote debugging tools of Visual Studio, allowing you to step through the code of both the source and receiver apps at the same time! (Another option is to run Windows Media Player on one machine and check its Stream > Allow Remote Control of My Player menu option. This should make that machine appear in the PlayTo target list.)
To be a receiver, an app will generally want to declare some additional networking capabilities in the manifest—namely, Internet (Client & Server) and Private Networks (Client & Server)—otherwise it won’t see much action! It then creates an instance of Windows.Media.PlayTo.PlayToReceiver, as shown in the PlayToReceiver sample’s startPlayToReceiver function (js/audiovideoptr.js):
function startPlayToReceiver() {
    if (!g_receiver) {
        g_receiver = new Windows.Media.PlayTo.PlayToReceiver();
    }
Next you’ll want to wire up handlers for the element that will play the media stream:
var dmrVideo = id("dmrVideo");

dmrVideo.addEventListener("volumechange", g_elementHandler.volumechange, false);
dmrVideo.addEventListener("ratechange", g_elementHandler.ratechange, false);
dmrVideo.addEventListener("loadedmetadata", g_elementHandler.loadedmetadata, false);
dmrVideo.addEventListener("durationchange", g_elementHandler.durationchange, false);
dmrVideo.addEventListener("seeking", g_elementHandler.seeking, false);
dmrVideo.addEventListener("seeked", g_elementHandler.seeked, false);
dmrVideo.addEventListener("playing", g_elementHandler.playing, false);
dmrVideo.addEventListener("pause", g_elementHandler.pause, false);
dmrVideo.addEventListener("ended", g_elementHandler.ended, false);
dmrVideo.addEventListener("error", g_elementHandler.error, false);
along with handlers for events that the receiver object will fire:
g_receiver.addEventListener("playrequested", g_receiverHandler.playrequested, false); g_receiver.addEventListener("pauserequested", g_receiverHandler.pauserequested, false); g_receiver.addEventListener("sourcechangerequested", g_receiverHandler.sourcechangerequested, false); g_receiver.addEventListener("playbackratechangerequested", g_receiverHandler.playbackratechangerequested, false); g_receiver.addEventListener("currenttimechangerequested", g_receiverHandler.currenttimechangerequested, false); g_receiver.addEventListener("mutechangerequested", g_receiverHandler.mutedchangerequested, false); g_receiver.addEventListener("volumechangerequested", g_receiverHandler.volumechangerequested, false); g_receiver.addEventListener("timeupdaterequested", g_receiverHandler.timeupdaterequested, false); g_receiver.addEventListeer("stoprequested", g_receiverHandler.stoprequested, false); g_receiver.supportsVideo = true; g_receiver.supportsAudio = true; g_receiver.supportsImage = false; g_receiver.friendlyName = 'SDK JS Sample PlayToReceiver';
The last line above sets the friendly name, which is the string that will show in the Devices charm for this receiver once it’s made available on the network. Advertising the receiver is done by calling startAsync:
// Advertise the receiver on the local network and start receiving commands
g_receiver.startAsync().then(function () {
    g_receiverStarted = true;

    // Prevent the screen from locking
    if (!g_displayRequest) {
        g_displayRequest = new Windows.System.Display.DisplayRequest();
    }

    g_displayRequest.requestActive();
});
Of all the receiver object’s events, the critical one is sourcechangerequested, where eventArgs.stream contains the media we want to play in whatever element we choose. This is easily accomplished by creating a blob from the stream and then a URI from the blob that we can assign to an element’s src attribute:
sourcechangerequested: function (eventIn) {
    if (!eventIn.stream) {
        id("dmrVideo").src = "";
    } else {
        var blob = MSApp.createBlobFromRandomAccessStream(
            eventIn.stream.contentType, eventIn.stream);
        id("dmrVideo").src = URL.createObjectURL(blob, { oneTimeOnly: true });
    }
}
All the other events, as you can imagine, are primarily for wiring together the source’s media controls to the receiver such that pressing a pause button, switching tracks, or acting on the media in some other way at the source will be reflected in the receiver. There may be a lot of events, but handling them is quite simple as you can see in the sample.
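For instance, here’s a minimal sketch of how the play and pause requests might be wired up, assuming the g_receiverHandler and g_elementHandler objects used above (the handler bodies are assumptions of mine, while notifyPlaying and notifyPaused are methods of PlayToReceiver):

// Requests arriving from the PlayTo source are applied to the local element...
g_receiverHandler.playrequested = function () {
    id("dmrVideo").play();
};

g_receiverHandler.pauserequested = function () {
    id("dmrVideo").pause();
};

// ...and the element's own events report state back to the source so its UI stays in sync.
g_elementHandler.playing = function () {
    g_receiver.notifyPlaying();
};

g_elementHandler.pause = function () {
    g_receiver.notifyPaused();
};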
• Creating media elements can be done in markup or code by using the standard img, svg, canvas, audio, and video elements.
• The three graphics elements (img, svg, and canvas) can all produce essentially the same output, only with different characteristics as to how they are generated and how they scale. All of them can be styled with CSS, however.
• The Windows.System.Display.DisplayRequest object allows for disabling screen savers and the lock screen during video playback (or any other appropriate scenario).
• Both the audio and video elements provide a number of extension APIs (properties, methods, and events) for working with various platform-specific capabilities in Windows 8, such as horizontal mirroring, zooming, playback optimization, 3D video, low-latency rendering, PlayTo, playback management of different audio types or categories, effects (generally provided as DLLs in the app package), and digital rights management.
• Background audio is supported for several categories, given the necessary declarations in the manifest and handlers for media control events (so the audio can be appropriately paused and played). Handling the media control events is also what keeps the media control UI in sync.
• Through the WinRT APIs, apps can manage very rich metadata and properties for media files, including thumbnails, album art, and properties specific to the media type, along with access to a very extensive list of Windows Properties.
• The WinRT APIs provide for decoding and encoding of media files and streams, through which the media can be converted or properties changed. This includes support for custom codecs.
• WinRT provides a rich API for media capture (photo, video, and audio), including a built-in capture UI, along with the ability to provide your own and yet still easily enumerate and access available devices.
• Streaming media is supported from a server (with and without digital rights management, including PlayReady), between apps (inbound and outbound), and from apps to PlayTo devices. An app can also be configured as a PlayTo receiver.