Monday, 20 September 2010

Metadata and Plex update

From the Plex Blog here

Plex » Metadata Update

So what exactly is metadata? Defined on Wikipedia as “data about data”, it turns our collection of media from a drab list of files into an interlinked web of facts and pictures, and imbues each item with a rich set of properties. Your files might have structure, but painting them with metadata converts that simple collection into a multi-dimensional universe of relationships. You can now set about answering complex questions like “Do I have any romantic comedies from the 1990s that I haven’t watched in at least a year, starring Julia Roberts, and not featuring Whoopi Goldberg?”
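To make that kind of question concrete, here is a minimal sketch of how it could be expressed as a filter over metadata. It is not how Plex stores or queries its library; the `Movie` record, its field names, and the `find_matches` helper are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Movie:
    # Hypothetical record; the field names are illustrative, not Plex's schema.
    title: str
    year: int
    genres: set[str]
    cast: set[str]
    last_watched: Optional[date] = None  # None means never watched


def matches(movie: Movie, today: date) -> bool:
    """Romantic comedy from the 1990s, unwatched for a year, with Julia
    Roberts in the cast and without Whoopi Goldberg."""
    one_year_ago = today - timedelta(days=365)
    not_watched_recently = (
        movie.last_watched is None or movie.last_watched < one_year_ago
    )
    return (
        "Romantic Comedy" in movie.genres
        and 1990 <= movie.year <= 1999
        and not_watched_recently
        and "Julia Roberts" in movie.cast
        and "Whoopi Goldberg" not in movie.cast
    )


def find_matches(library: list[Movie], today: date) -> list[str]:
    """Return the titles of every movie in the library that matches."""
    return [movie.title for movie in library if matches(movie, today)]
```

Each condition in the question maps to one property of the item, which is the point: without genres, cast, and watch history attached to each file, none of these tests could be written.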

Everyone loves metadata, and we didn’t anticipate the extreme load the Plex/Nine release would put on a number of sites when we launched. Tens of thousands of early downloaders, eagerly rescanning their huge personal media collections, contributed to massive amounts of traffic to multiple sites.

As a result, we’ve had to spend quite a bit of time since the release focusing on stabilizing our sources of metadata, optimizing the metadata agents (the bits of code that go out and get your metadata), and adding infrastructure to support all of our new users. Here’s a summary of what we’ve done:

  • We’ve brought up a massively powerful machine to serve as our TheTVDB proxy cache. All requests to TheTVDB go through this machine, and it serves over 99% of all bytes out of its cache, which means we’ve reduced the data load on the parent site by a factor of 100. At peak, we were serving over 500 requests/second, and sending out 320 Mbps. Darrin, one of our super-talented Plex engineers, worked literally day and night to get this running after the release, and we also appreciate the help and support from the TVDB guys!
  • For movies, we’ve moved to using data that is accessible with an API or through structured data dumps. Specifically, we’re using metadata from Freebase, Wikipedia, and TheMovieDB (as well as a few others for extra artwork, such as MoviePosterDB). This ensures the best availability and stability of the data.
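The "factor of 100" in the first point follows directly from the cache hit ratio: if 99% of bytes come out of the cache, only 1% still has to be fetched from the parent site. A small sketch of that arithmetic, using the figures quoted above (the function names are made up for illustration, and the bandwidth estimate assumes the 99% ratio also held at peak):

```python
from fractions import Fraction


def origin_load_reduction(hit_ratio: Fraction) -> Fraction:
    """Factor by which origin traffic shrinks when `hit_ratio` of all bytes
    are served from the cache instead of the origin."""
    if not 0 <= hit_ratio < 1:
        raise ValueError("hit_ratio must be in [0, 1)")
    return 1 / (1 - hit_ratio)


def origin_bandwidth(served_mbps: Fraction, hit_ratio: Fraction) -> Fraction:
    """Bandwidth the origin still has to supply for a given served bandwidth."""
    return served_mbps * (1 - hit_ratio)


hit_ratio = Fraction(99, 100)  # "over 99% of all bytes", so this is a lower bound
reduction = origin_load_reduction(hit_ratio)
peak_origin_mbps = origin_bandwidth(Fraction(320), hit_ratio)
```

With these inputs `reduction` is 100 and `peak_origin_mbps` is 16/5, about 3.2 Mbps. Since the post says "over 99%", the real reduction would be at least that large.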

If you’re not familiar with Freebase, you should check it out. It’s one of the few sites in recent memory that’s totally blown me away. The people who designed it are very, very smart people and the amount of data available is unbelievable. If you check out the page for the movie 300, you’ll see it links to 33 reviews of the movie, 6 other sites (such as Rotten Tomatoes), and then has a veritable cornucopia of data including cast, genres, subjects, filming locations, award lists, and more. All of that data is available via a sophisticated API, or via weekly database dumps.

We’ve processed the most recent Freebase data dump into a form that’s most suitable for our agent to consume. Additionally, we’ve enhanced the Wikipedia agent to support multiple different languages for the summaries. Finally, much more data from TheMovieDB is being pulled in by that agent.

In summary, massive amounts of data, all structured (no more “scraping” sites that can change at a moment’s notice), and all completely up to you as to how you use them. Like TheMovieDB summaries? Drag it to the top of the list of agents. Prefer your summaries in Swedish? Make sure Wikipedia is above TheMovieDB, so its internationalized summaries will take precedence. Have two French movies for your mother-in-law? You can manually set the language preference to French for just those two movies, and she’ll offer to babysit her grandkids while happily reading the summaries in French.

These agent changes have been pushed, and you will have them within the hour, unless you check sooner with Plex Online > More > Check for Updates.

Get your settings exactly how you like them, shift-click the refresh button to get new metadata for all your movies, and then sit back and watch the metadata flow in. (N.B. At this point in time, poster/art selections are “sticky”, so once set, they won’t change unless you rescan a section from scratch.)

Here’s a summary of what the different movie agents now provide, so as to allow you to prioritize them accordingly, through the settings option shown below:

Fullscreen.jpg

  • Freebase: Genres, content ratings, studio, directors, writers, actors, tag-lines.
  • Wikipedia: Multi-language summaries, directors, writers, actors, studio.
  • TheMovieDB: Summaries (more plot oriented), content ratings, directors, writers, actors, studio, tag-lines.
  • MoviePosterDB: Lots of movie posters, at lower resolution than TheMovieDB.

So as an example, if you hate the Wikipedia summaries, and prefer English plot summaries, drag TheMovieDB above Wikipedia. If you leave Wikipedia enabled, summaries that aren’t found from TheMovieDB will be filled in by Wikipedia.

If you want your summaries in Swedish, you’ll need to enable Wikipedia and have it higher in the list than TheMovieDB. Note that currently, in order to change languages, you’ll need to create a new section with the new language setting. Alternatively you can “fix match” on an individual item and manually set the language.
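The ordering rule described above (the highest agent in the list wins, and lower agents fill in whatever is missing) can be sketched in a few lines. This is only an illustration of the fallback idea, not Plex's actual agent code; the `pick_field` function and the shape of the `results` dictionary are assumptions.

```python
from typing import Optional


def pick_field(
    field_name: str,
    agent_order: list[str],
    results: dict[str, dict[str, Optional[str]]],
) -> Optional[str]:
    """Return the value from the first agent, in priority order, that
    actually has a non-empty value for `field_name`."""
    for agent in agent_order:
        value = results.get(agent, {}).get(field_name)
        if value:
            return value
    return None


# Placeholder data: what each agent returned for one movie.
results = {
    "TheMovieDB": {"summary": None, "tagline": "tagline from TheMovieDB"},
    "Wikipedia": {"summary": "summary from Wikipedia"},
}

# TheMovieDB is dragged above Wikipedia, but has no summary for this movie,
# so Wikipedia's summary fills the gap.
summary = pick_field("summary", ["TheMovieDB", "Wikipedia"], results)
```

Swapping the two names in the list is the equivalent of dragging Wikipedia above TheMovieDB in the settings.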

Lots of you have asked: How can we help? Luckily this is quite easy; let’s say you have a movie that’s missing data, or has incorrect data. You can head to one of those sites above and add the missing data, and then everyone in the community will benefit, including users of other apps that access those sites. This really is a case where each one of you has the power to help hundreds of thousands of other people!

The most immediate “turnaround” from this data would be through TheMovieDB, which we access through a well-designed API. We cache requests for 4 hours, so if you add data, it may take up to 4 hours before you see it. (Note that we are also working to improve TheTVDB refresh times, which are now between 24 and 48 hours.)
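Why "up to" 4 hours? A time-based cache keeps answering from its stored copy until that copy expires, so the delay depends on how recently the item was last fetched. The sketch below shows the general technique; the `TTLCache` class is hypothetical and says nothing about how Plex's own cache is implemented.

```python
import time
from typing import Any, Callable

FOUR_HOURS = 4 * 60 * 60  # cache lifetime in seconds


class TTLCache:
    """Tiny time-to-live cache: a stored value is reused until it is older
    than `ttl_seconds`, after which it is fetched again."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def get(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: the upstream site is not contacted
        value = fetch()  # missing or expired: ask the upstream site
        self._entries[key] = (now, value)
        return value
```

If a movie was fetched one minute before you edited it upstream, your change stays invisible for almost the full 4 hours; if it was fetched 3 hours 59 minutes earlier, you see the change almost immediately.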

Also, if you’re a developer, please check out our repository for agents. They are easy and fun to write, and we’re really looking forward to seeing the creative things you come up with. Oncleben31 has already written an agent for Allociné for French users, and the ever talented Sander wrote one for MovieMeter, for our Dutch users.

In the near future, we’ll allow you to fully customize any of the data for your media and lock it in place, so that it won’t be overwritten by new data from the Internet. So, for example, you can lock all your titles and summaries, but let the ratings and genres continue to expand and improve over time.
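Since that feature is not out yet, the following is purely a guess at the idea rather than a description of it: incoming metadata is merged field by field, and any field the user has locked is skipped. The `apply_update` function and the flat dictionary layout are assumptions.

```python
def apply_update(current: dict[str, object],
                 incoming: dict[str, object],
                 locked: set[str]) -> dict[str, object]:
    """Merge freshly downloaded metadata into the current metadata,
    leaving every locked field untouched."""
    merged = dict(current)  # copy, so the caller's data is not modified
    for name, value in incoming.items():
        if name not in locked:
            merged[name] = value
    return merged


# Placeholder values: title and summary are locked, genres are not.
current = {"title": "My Title", "summary": "My summary", "genres": ["Action"]}
incoming = {"title": "Other Title", "summary": "Other summary",
            "genres": ["Action", "Fantasy"]}
merged = apply_update(current, incoming, locked={"title", "summary"})
```

Here `merged` keeps the hand-edited title and summary while picking up the expanded genre list.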

Your media has a bright future inside Plex, and metadata is the key.

Friday, 3 September 2010

WOW - Plex is going to be embedded into 2011 LG TVs!

I'm at a loss - this is just beyond impressive, and came out of nowhere:

Plex and the Future of Television

This week has been a big one for TV-related announcements. Apple announced their revamped Apple TV, and Boxee was quick to reply with their own thoughts on the matter. It turns out that now is the perfect time for us to explain our vision for the future of Plex and television as well.

Today, Plex requires that you have a Mac connected to your TV. As sexy as it is, a Mac Mini costs $699. And let’s face it, you probably have a few televisions, so it becomes an expensive proposition to Plex-ify your house, especially now that you can stream your library all over with Plex/Nine. Of course, on the flip side, a Mac Mini is a powerful computer that can be used for other meaningful tasks like 3D modeling, genome sequencing, or World of Warcraft.

MacMini.png

Another approach is something like the Boxee Box. It’s $199, which is much better, but still prohibitive for many, and it’s completely specialized. No way you could send your kid off to college with a Boxee Box, although you have to admit, it would probably help his or her social life. Additionally, releasing a custom piece of hardware is not a trivial endeavor, if we even wanted to attempt such a thing. Embedded systems are hard, and the XBMC codebase from which Plex and Boxee are both derived is a large and complex one. And really, at the end of the day, do you want yet another specialized box sitting by your TV?

Boxee.png

Even if you get the design right, you have to be able to price it appropriately. With not outrageously different hardware from Boxee, the new Apple TV is half the price. Apple also has a two-fold advantage over Boxee: They are going to be selling their Apple TV in much higher volume (which means lower cost to produce), and – critically – they can subsidize the cost of the device because they make money every time you put your arm around your date and click “Watch” on a movie. Also, let’s face it, it’s a typical Apple product: it works perfectly as long as you don’t stray outside their ecosystem. Your files have to be in their limited range of supported formats, and you only get access to the online content they sanction. It’s not an open platform in any sense of the word, and trust me, I was the first person hoping to be able to run our Plex iOS app on it.

AppleTV.png

The optimal solution, of course, would be a box that was free, infinitely small, and required no cables. Well, we’re extremely proud to be able to introduce to you, for the first time, the Plex Box, with exactly those characteristics.

PlexBox.png

How is this possible? Well, we actually have one more “one more thing” to announce: We’re working with LG Electronics (the second largest TV manufacturer in the world) to integrate the Plex platform into their 2011 lineup of Netcast™ connected TVs and Blu-ray devices. So early next year, when you buy an LG Netcast™ TV or Blu-ray player, you will have Plex functionality built-in. Specifically, it will connect to a cloud version of the Plex platform for online content, and, if you happen to have a Plex Media Server running anywhere in your house (after all, who doesn’t have a computer in their house?), you can access your local and online content, in a rich interface, with full metadata. I’ve seen it, and it looks awesome.

I’ve been talking a lot about the importance of getting the architecture right for our platform, and this is a perfect example. Thin clients (LG TV, iOS devices), a smart media server, and plug-ins that can run in the cloud. A single integrated interface to access online content, local content, and personal content.

I can’t even begin to tell you how exciting this is to us. LG chose our platform in no small part because it is OPEN, and that is what makes it special. We have developers all over the world creating plug-ins, helping us evolve the platform, and using it creatively. We wouldn’t be here without them, and it’s been an absolute pleasure working with them over the years. I also have enormous respect for LG, who have great products, massively talented engineering, and forward-thinking management. I’ve been to Korea twice in the last year, and their engineers are super-smart, highly knowledgeable, and a delight to work with. They “get” where TV is going, and I have to make a confession – the first time I saw their Plex interface, talking to a remote Plex Media Server and flawlessly streaming content, I had to pretend I had something in my eye. This is a team completely committed to revolutionizing the way we enjoy content, and clearly willing to take chances in doing so, as evidenced by working with a small team like ours.

This is also a massive win for content providers. Yesterday, writing a Plex plug-in would make their content available on a Mac, or a television powered by a Mac. Yesterday, they could suddenly make their content available on 100 million iOS devices. And tomorrow (early next year, technically), they will be able to get their content onto millions of LG TVs and Blu-ray devices. This, friends, is an unprecedented time in history. The distance between content provider and consumer has never been this close or frictionless, and it’s incredible to be a part of.

So what does this mean to you, our dear users? You’ve been so supportive over the years, and this is great news for you as well. It means, first and foremost, that we’ll be able to focus more resources on development. This will be a full-time job for me and others on the team, which is – honestly – a dream come true. The Plex Media Server is the heart and lungs of the platform, and we’ll be making it rock solid and adding some really, really cool new features. We’ll be bringing it to more platforms, to make it available everywhere. There will be more content providers investing in writing Plex plug-ins, so your online content choices will grow. And next year, if you’re upgrading your TV, or buying an LG Blu-ray player, you’ll have the ability to get Plex, built in, at no additional cost. Fully integrated into killer consumer electronics gear, exactly as it should be.

And *that* is cool.

It’s been a long journey this past year. Now you finally know all of the cool stuff we’ve been working on, and it’s so great to be able to share it with you. We’ve re-architected our platform for the future, and thankfully, most of that work is behind us. Now we can focus on making Plex more stable, more usable, and overall more AWESOME.

Wednesday, 1 September 2010

Plex 9, the 24 hour status report from Plex

The official word from Plex on how the release is going as well as some minor issues being addressed:

State of the Release

Just over 24 hours since the release, and this has been an extremely exciting time for all of us. We simply can’t tell you how much we appreciate the outpouring of positive feedback on the release. Over Twitter, Facebook, email, and in the forums, the number of positive comments was astounding to us, and definitely made all the hard work worthwhile. We all want to say, collectively: Thank You.

A few salient things about the release:
  • The demand on our server was completely unprecedented. The OS load went over 30 for long periods of time, and there were HTTP and database issues. Isaac jumped in and quickly and skillfully spread our services out over a few spare slices and was able to return things to normal. Isaac, you rock.
  • Our mirrors (a big thanks to them!) were hammered badly as tens of thousands of downloaders tried to get the latest release, so in the morning we moved the main download site to S3 to ease the pressure on the mirrors and get the app into people’s hands faster.
  • At this point, all services should be performing well, and we’re continuing to monitor and make adjustments as needed.

At this point we are also tracking a number of issues with the release (nobody’s perfect, right?), and we wanted to give you a quick summary of the more common issues, along with workarounds or resolutions whenever possible. Note that these are not the only issues, just the ones at the top of the list at the moment:

  • CRASHES ON STARTUP: We’re tracking these here. The most common cause (fixed in the next release) is a computer that doesn’t have a name. You can easily work around this by going to System Preferences > Sharing > Computer Name and putting in a clever, well thought out name like “Macadamia”. There’s also a crash we’re seeing on Leopard, which will also be fixed in the next release. Please post your crash reports in that thread.
  • SCANNERS: First and foremost, we’d like to help you get all your media into the library. There were issues with M4V files stopping a scan, an issue parsing date-based episodes, and an issue with .AppleDouble folders, which we weren’t ignoring. We’ve pushed bug-fixes for these issues, and your Plex Media Server should automatically update the scanners within the hour. The good news is that you simply have to do another scan (assuming you don’t already have it set to automatically scan). You might be asleep at this point, and you might wake up with a bunch of missing episodes tucked comfortably into your library, and that, my friends, is magic. MAGIC! (If you’d like to track the progress on the scanners, you can follow our Github repository here.)
  • TV SHOW METADATA: Slightly less about magic, and more about laws of large numbers, I’m sorry to report that we, um, melted down TheTVDB today. My sincere apologies to them, and my apologies to you all, as we’ll be without metadata from them until we figure out how to reduce the load. We are working with them as we speak, and hopefully will be able to bring that back online shortly. The good news is that the Alexandria library system has been architected for this exact scenario, and you’ll still be able to scan your episodes into the library, and play them, there just won’t be show and season posters or summary data (until the next time you’re sleeping, when we’ll silently push another update and you’ll wake up with lots of posters). We really like magic, have I mentioned that?
  • LOCAL MEDIA AGENT: This is the one that picks up existing thumbs and fanart and such. There were a few bugs in it, and we think we’ve fixed them, but we need to test more. If you’d like to take the new agent for a spin and know what you’re doing, feel free to check it out here. (Note that you’ll need to remove and add the section to get the new art to “stick”; expect improvements here.)
  • AFP SHARE ISSUE: There seems to be an issue adding folders which live on a remote NAS. A bit baffling, but we’re looking into it.
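For the curious, two of the scanner problems above (not ignoring .AppleDouble folders, and parsing date-based episodes) can be illustrated with a short sketch. This is not the code from the Plex scanners; the function names, the ignore list, and the assumed filename style (a date such as 2010.09.01 embedded in the name) are all hypothetical.

```python
import os
import re
from datetime import date
from typing import Iterator, Optional

# Folders a scanner should skip; .AppleDouble holds resource-fork copies
# that share names with the real media files.
IGNORED_DIRS = {".AppleDouble"}

# Assumed naming style: a four-digit year, month, and day separated by
# ".", "-", "_" or a space, e.g. "Some.Show.2010.09.01.mkv".
DATE_PATTERN = re.compile(
    r"(?P<year>(?:19|20)\d{2})[.\-_ ](?P<month>\d{2})[.\-_ ](?P<day>\d{2})"
)


def media_files(root: str) -> Iterator[str]:
    """Yield every file under `root`, never descending into ignored folders."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to visit those folders.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in filenames:
            yield os.path.join(dirpath, name)


def parse_air_date(filename: str) -> Optional[date]:
    """Return the air date embedded in a filename, or None if there is no
    date or the digits do not form a real calendar date."""
    match = DATE_PATTERN.search(filename)
    if match is None:
        return None
    try:
        return date(int(match["year"]), int(match["month"]), int(match["day"]))
    except ValueError:
        return None
```

Returning None instead of raising on a bad date means one oddly named file cannot stop the rest of the scan, which is the failure mode the fixes above were meant to address.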

There are of course other issues, but these are the ones we’d like to get resolved as soon as possible. So, how can you help? Why, I’m glad you asked. If you’re having trouble getting your media into the library, please post in the forum with your exact directory and file layout (screenshot or ls -lR from the terminal).

A few other tips:

  • If you’re seeing pausing when playing video in the iOS app, there are a few simple things you can do. Don’t select “auto” quality setting, and make sure you select a quality that’s appropriate to your network speed. Also, make sure you’re not trying to play 720p content (highest quality) to an iPad/iPhone 4 if you have a slow server (less than 2.2GHz). Use this as an excuse to upgrade (“Honey, but don’t you want to watch So You Think You Can Dance without a pause every time she does a pirouette?”). And finally, if you take the wireless router and tape it to your chest while using the iOS app, this improves reception.
  • If you have a huge library, you might want to let the Media Server take a bit of “alone time” while scanning for the first time. I mean, come on, it’s walking through all your files, computing hashes, extracting thumbnails, generating automatic fanart, analyzing the media, getting it all into the database, talking to the Internet for metadata, downloading that metadata and getting it associated with your media. Multiple agents are working together to contribute data. There are hamsters running around everywhere. So just sit back, work on your golf swing or water the garden or something. Lotus position and staring at the Plex Media Manager works well.
  • A positive review on the App Store directly contributes to Barkley’s diet. Like every time we get a five star review, I walk into the kitchen, get him a delicious venison jerky treat and feed it to him. Literally. Just remember, only YOU can make Barkley gain 10 pounds.

And now, I really need some sleep.