Index Project

IMDB indexing project

  1. Download actors.list.gz, actresses.list.gz, and movies.list.gz from the official website.
  2. Implement indexes for these text files. You may use any type of index covered in the textbook or in class. If you would like to use an index scheme that is not covered in the textbook or in class, you must contact the instructor. You may not use any DBMS.
  3. Provide a command-line interface that takes a text file of keywords as input. Given an actor or actress name, print all the movies that he or she starred in. Given a keyword, print all the movie titles that contain the keyword. Example executions are shown below, where names.txt contains the names of actors and actresses and keywords.txt contains the keywords of movie titles. If you plan to use Java, the command-line invocation would look something like "java program_name names.txt".
    Note that the name search only needs to support exact match (equality search), but the movie search must support substring match, i.e. it must return titles (e.g. Kingdom of Heaven) that contain the keyword (e.g. King). Valid keywords consist of uppercase and lowercase letters ([a-zA-Z]), commas, periods, and hyphens (-). All name keywords come in the same form as they are displayed in actors.list and actresses.list, all movie keywords start with a capital letter, and the search is case-sensitive.
  4. Your grade will be determined as follows: