Movie Lookup Web Service

by Chris 8. June 2008 14:37

I continue my series of useful Web Services for Windows Mobile (I Want More Mobile (Web) Services, Flight Lookup Web Service, and Package Lookup Web Service) with a service to check movie info. I often find myself in discussions about movies, and have often wished for a simple app that gives me some quick info about a movie. The code is part of a growing project on CodePlex called Windows Mobile Web Services.

As in previous posts, you find the UX on the right. Let's start on the client side with the handler for the right menu button (somewhat simplified)...

private void lookupMenuItem_Click(object sender, EventArgs e)
{
    if(lookupMenuItem.Text == "Lookup")
    {
        Movie[] movies = null;
        try
        {
            Service ws = new Service();
            //ws.Credentials = new NetworkCredential("uid", "pwd");
            movies = ws.MovieLookup(titleTextBox.Text);
        }
        catch(Exception ex)
        {
            MessageBox.Show(ex.Message);
        }
        if(movies != null && movies.Length > 0)
        {
            // Swap the text box for a combo box holding the search results
            titleComboBox.Items.Clear();
            foreach(Movie movie in movies)
                titleComboBox.Items.Add(movie);
            titleComboBox.Visible = true;
            titleTextBox.Visible = false;
            lookupMenuItem.Text = "Back";
        }
    }
    else
    {
        // "Back" was pressed: restore the empty text box for a new search
        titleTextBox.Text = string.Empty;
        titleTextBox.Visible = true;
        titleComboBox.Visible = false;
        lookupMenuItem.Text = "Lookup";
    }
}

...so when a title is entered in the text box and Lookup is pressed, a search is made and the text box is replaced by a combo box containing the search results. When an item in the combo box is selected...

private void titleComboBox_SelectedIndexChanged(object sender, EventArgs e)
{
    Movie movie = titleComboBox.SelectedItem as Movie;
    if(movie == null)
        return; // nothing (or a non-Movie item) is selected
    resultTextBox.Text = string.Format("Title: {0}\r\nYear: {1}\r\n\r\nPlot: {2}",
        movie.Title, movie.Year, movie.Description);
    if(string.IsNullOrEmpty(movie.ImageUrl))
        return; // no box shot was found for this title
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(movie.ImageUrl);
    HttpWebResponse resp = (HttpWebResponse)req.GetResponse();
    pictureBox.Image = new Bitmap(resp.GetResponseStream());
}

...the movie info is shown along with the box shot. On the server, the code looks like this...

[WebMethod]
public Movie[] MovieLookup(string movieTitle)
{
    // URL-encode the title so spaces and characters like '&' survive in the query string
    string url = string.Format("http://www.imdb.com/find?s=all&q={0}", HttpUtility.UrlEncode(movieTitle));
    HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
    StreamReader responseReader = new StreamReader(request.GetResponse().GetResponseStream());
    string responseData = responseReader.ReadToEnd();
    responseReader.Close();

    // Only the "Popular Titles" section of the search results is used
    int i = responseData.IndexOf("<b>Popular Titles</b>");
    if(i <= 0)
        return new Movie[] { };
    int j = responseData.IndexOf("</p>", i);
    if(j < 0)
        j = responseData.Length; // no closing </p>: read to the end of the page

    string table = Regex.Match(responseData.Substring(i, j - i), "<table.*?>(.*?)</table>").ToString();

   
    // Each link in the table is a candidate movie; links that only wrap an image are skipped
    var q = from Match match in Regex.Matches(table,
                "<a\\shref=[\"'](?<url>.*?)[\"'].*?>(?<title>.*?)</a>")
            where !match.Groups["title"].ToString().Contains("<img")
            select new Movie { ID = Regex.Match(match.Groups["url"].ToString(),
                "(?<=\\w\\w)\\d\\d\\d\\d\\d\\d\\d").ToString(),
                Title = HttpUtility.HtmlDecode(match.Groups["title"].ToString()) };

    Movie[] movies = q.ToArray<Movie>();

   
    foreach(Movie movie in movies)
    {
        // Fetch the details page for this movie
        url = string.Format("http://www.imdb.com/title/tt{0}", movie.ID.ToString());
        request = WebRequest.Create(url) as HttpWebRequest;
        responseReader = new StreamReader(request.GetResponse().GetResponseStream());
        responseData = responseReader.ReadToEnd();
        responseReader.Close();

        // The page title has the form "Title (Year)"; split it into its parts
        string title = Regex.Match(responseData, "(?<=<(title)>).*(?=<\\/\\1>)").ToString();
        movie.Title = HttpUtility.HtmlDecode(Regex.Match(title, ".*(?=\\s\\(\\d+.*?\\))").ToString()).Replace(
            "\"", string.Empty);
        movie.Year = Regex.Match(title, "(?<=\\()\\d+(?=.*\\))").ToString();
        movie.ImageUrl = Regex.Match(Regex.Match(responseData, "(?<=\\b(name=\"poster\")).*\\b[</a>]\\b").ToString(),
            "(?<=\\b(src=)).*\\b(?=[</a>])").ToString().Replace("\"", string.Empty).Replace("/></", string.Empty);

       
        try
        {
            // Check the raw page title: the regex above strips the "(year) ..." suffix
            // from movie.Title, so a trailing (VG) marker would be lost there
            if(title.Contains("(VG)"))
            {
                i = responseData.IndexOf("<h5>Plot Summary:</h5>") > 0 ?
                    responseData.IndexOf("<h5>Plot Summary:</h5>") :
                    responseData.IndexOf("<h5>Tagline:</h5>");
                if(i > 0) j = responseData.IndexOf("</div>", i);
            }
            else
            {
                i = responseData.IndexOf("<h5>Plot:</h5>") > 0 ? responseData.IndexOf("<h5>Plot:</h5>") :
                    responseData.IndexOf("<h5>Plot Summary:</h5>");
                if(i <= 0) i = responseData.IndexOf("<h5>Plot Synopsis:</h5>");
                if(i > 0) j = responseData.IndexOf("<a class=", i);
                if(j <= 0)
                    j = responseData.IndexOf("</div>", i);
            }
            // Strip whichever <h5> heading was matched, not only "<h5>Plot:</h5>"
            string plotOutline = Regex.Replace(responseData.Substring(i, j - i),
                "^<h5>.*?</h5>\\s*", string.Empty);
            plotOutline = HttpUtility.HtmlDecode(plotOutline);
            movie.Description = Regex.Replace(plotOutline.Contains("is empty") ||
                plotOutline.Contains("View full synopsis")
                ? string.Empty : plotOutline, "<a.*?href=[\"'](?<url>.*?)[\"'].*?>(?<name>.*?)</a>", string.Empty);
        }
        catch
        {
            // Any parsing failure (for example, no plot section found) leaves the description empty
            movie.Description = string.Empty;
        }
    }
    return movies;
}

...and as you can see, I'm using the excellent IMDb to get the movie info. Of course, there's a lot more info to be retrieved about a movie, its actors, etc. If you want to build further on this example, I recommend taking a look at the Imdb Service project on CodePlex.

Note that we save a lot of coding through the extensive use of regular expressions (regex) to extract the data from the web page. First, a request is made to search for the movie title, and from the search results the matching movies in the category "Popular Titles" are captured as a list of movies. Then, for each movie in the list, a request is made to the movie details page, and each movie object is updated with the data about the movie (as well as the image URL). Finally, everything is returned to the client. Note that the nicely typed movie data is kept in a small helper class...
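To make the regex step a bit more concrete, here is a minimal, stand-alone sketch of how a page title of the form "Title (Year)" can be split into its parts. The TitleParser class and its method names are hypothetical and not part of the project; the two lookaround patterns are the same ones used in MovieLookup above, while the pattern for the title tag is a simplified stand-in.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only (not part of the CodePlex project).
public static class TitleParser
{
    // Returns the text between <title> and </title>, or an empty string if there is none.
    public static string ExtractPageTitle(string html)
    {
        Match m = Regex.Match(html, "<title>(?<text>.*?)</title>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["text"].Value.Trim() : string.Empty;
    }

    // Everything before the last " (digits...)" group, e.g. "The Matrix".
    public static string ExtractName(string pageTitle)
    {
        return Regex.Match(pageTitle, @".*(?=\s\(\d+.*?\))").Value;
    }

    // The first run of digits that directly follows an opening parenthesis, e.g. "1999".
    public static string ExtractYear(string pageTitle)
    {
        return Regex.Match(pageTitle, @"(?<=\()\d+(?=.*\))").Value;
    }
}
```

For a page title such as "The Matrix (1999)", ExtractName gives "The Matrix" and ExtractYear gives "1999". When a pattern does not match, Match.Value is an empty string, so the helpers never return null.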

public class Movie
{
    public string ID { get; set; }
    public string Title { get; set; }
    public string Year { get; set; }
    public string ImageUrl { get; set; }
    public string Description { get; set; }
}

...which is transferred to the client via the Web Service proxy. For more details, check out the project on CodePlex.
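As a rough illustration of what happens on the wire, the sketch below round-trips a Movie through XmlSerializer, the serializer that ASP.NET (.asmx) Web Services use for their messages. It is not the actual SOAP envelope, and the MovieXml helper is hypothetical; it only shows why a class with public read/write properties and a default constructor is all the proxy needs.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Same shape as the helper class in the post.
public class Movie
{
    public string ID { get; set; }
    public string Title { get; set; }
    public string Year { get; set; }
    public string ImageUrl { get; set; }
    public string Description { get; set; }
}

// Hypothetical helper, for illustration only.
public static class MovieXml
{
    // Serializes a movie to an XML string.
    public static string ToXml(Movie movie)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Movie));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, movie);
            return writer.ToString();
        }
    }

    // Rebuilds a movie from XML produced by ToXml.
    public static Movie FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Movie));
        using (StringReader reader = new StringReader(xml))
        {
            return (Movie)serializer.Deserialize(reader);
        }
    }
}
```

Calling MovieXml.FromXml(MovieXml.ToXml(movie)) yields a new Movie with the same property values, which is essentially what the generated proxy does for each element of the returned array.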


Chris | Compact Framework | WM Web Services
