Project 1: Elastic Array & Disk Usage Analyzer (v1.0)

Starter repository on GitHub: https://classroom.github.com/a/8GPH5Zwq

As storage densities continue to increase, so too will humanity’s ability to find new ways to generate more and more data. Storage space often seems unlimited… until it’s not! In this project, we will design a helpful command line utility for users, developers, and system administrators to analyze how their disk space is being used. Here’s a demonstration of the tool, da:

$ ./da -t 15 -s /usr
/usr/lib/valgrind/libvex-amd64-linux.a       9.1 MiB    21 Aug 2020
/usr/lib/libclang.so                        38.6 MiB    21 Aug 2020
/usr/lib/libclang.so.10                     38.6 MiB    21 Aug 2020
/usr/bin/containerd                         46.9 MiB    01 Feb 2021
/usr/lib/libclang-cpp.so                    47.2 MiB    15 Feb 2021
/usr/lib/libclang-cpp.so.10                 47.2 MiB    15 Feb 2021
/usr/lib/libgo.so                           52.4 MiB    09 Sep 2020
/usr/lib/libgo.so.16                        52.4 MiB    09 Sep 2020
/usr/lib/libgo.so.16.0.0                    52.4 MiB    09 Sep 2020
/usr/lib/docker/cli-plugins/docker-buildx   54.2 MiB    03 Dec 2020
/usr/bin/docker                             71.0 MiB    03 Dec 2020
/usr/lib/libLLVM-10.0.1.so                  83.7 MiB    15 Feb 2021
/usr/lib/libLLVM-10.so                      83.7 MiB    15 Feb 2021
/usr/lib/libLLVM.so                         83.7 MiB    15 Feb 2021
/usr/bin/dockerd                            84.4 MiB    01 Feb 2021

In this example, the user requested the top 15 files (-t 15), sorted by size (-s) from the /usr directory. If they’re really trying to save space on this machine, then maybe it’s time to remove docker? :-)

The output columns are the file path, the file size in human-readable units, and the date the file was last accessed.

To get a sense of the functionality we will implement, take a look at the help/usage information:

$ ./da -h
Disk Analyzer (da): analyzes disk space usage
Usage: ./da [-ahs] [-t limit] [directory]

If no directory is specified, the current working directory is used.

Options:
    * -a              Sort the files by time of last access
    * -h              Display help/usage information
    * -s              Sort the files by size (default)
    * -t limit        Limit the output to top N files (default=unlimited)
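
Parsing these flags is typically done with getopt(3). A minimal sketch is shown below; the option struct and variable names are illustrative and not part of the starter code.

/* Sketch: parsing the da options with getopt(3). The struct and
 * variable names here are illustrative, not part of the assignment. */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct da_options {
    bool sort_by_access;     /* -a: sort by access time instead of size */
    long limit;              /* -t: top N files; negative means unlimited */
    const char *directory;   /* positional argument; defaults to "." */
};

int main(int argc, char *argv[])
{
    struct da_options opts = { false, -1, "." };

    int c;
    while ((c = getopt(argc, argv, "ahst:")) != -1) {
        switch (c) {
            case 'a': opts.sort_by_access = true;  break;
            case 's': opts.sort_by_access = false; break;  /* size is the default */
            case 't': opts.limit = strtol(optarg, NULL, 10); break;
            case 'h': /* print the usage text shown above */ return 0;
            default:  return 1;
        }
    }

    if (optind < argc) {
        opts.directory = argv[optind];
    }

    printf("dir=%s, limit=%ld, sort_by_access=%d\n",
           opts.directory, opts.limit, (int) opts.sort_by_access);
    return 0;
}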

Your implementation will be split into two parts: (1) building an elastic data structure that can store an unbounded number of elements (memory permitting), and (2) directory traversal and disk usage analysis.

You can think of the elastic array as being somewhat analogous to the ArrayList in Java; it will automatically resize, allow a variety of retrieval operations, and provide utility functionality such as retrieving the number of elements, trimming the amount of heap space used to save memory, and sorting the elements. When you are finished, you’ll have produced a reusable library that may be helpful in future C projects.

The Elastic Array

While C has primitive array types, they must be dimensioned in advance and do not support convenience features like appending to the list or retrieving its size. Our goal for the elist library is to fill this gap in functionality. Your elist should support operations for creating and destroying a list; adding, setting, retrieving, and removing elements; querying the number of elements; adjusting the capacity; and sorting (a sketch of one possible interface follows the next paragraph).

Array elements will have a fixed size; i.e., the expected size of the elements will be provided to elist_create. This could be something like sizeof(int) or even sizeof(struct my_special_struct), but regardless, all elements will occupy the same number of bytes on the heap.
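
The sketch below shows one plausible shape for such an interface. Only elist_create, elist_add, elist_add_new, set, and set_capacity are named in this handout, so the remaining names and signatures are assumptions rather than requirements.

/* elist.h sketch. Only elist_create, elist_add, elist_add_new, set, and
 * set_capacity are named in this handout; everything else is illustrative. */
#include <stddef.h>
#include <sys/types.h>

struct elist;  /* opaque handle; the layout lives in elist.c */

struct elist *elist_create(size_t init_capacity, size_t item_sz);
void elist_destroy(struct elist *list);

ssize_t elist_add(struct elist *list, void *item);   /* copies *item into the list */
void *elist_add_new(struct elist *list);             /* returns an uninitialized slot */
int elist_set(struct elist *list, size_t idx, void *item);
void *elist_get(struct elist *list, size_t idx);
int elist_remove(struct elist *list, size_t idx);    /* shifts later elements down */

size_t elist_size(struct elist *list);               /* number of elements in use */
int elist_set_capacity(struct elist *list, size_t capacity);
void elist_sort(struct elist *list, int (*comparator)(const void *, const void *));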

Elements added to the list via add or set will be copied into the list’s heap storage; your array should not simply store pointers to the elements. This provides the most flexibility, since a user who wants pointer semantics can simply maintain a list of pointers. To simplify usage and avoid unnecessary copies, the add_new function instead returns a pointer to a new, uninitialized memory block inside the list that the caller populates directly:

// Option 1: build the element elsewhere, then copy it in
struct my_struct *s = malloc(sizeof(struct my_struct));
s->memb1 = 123;
s->memb2 = 456;
elist_add(list, s); // 's' is copied into the list
free(s);            // safe to free: the list holds its own copy

// vs.

// Option 2: write directly into the list's storage (no extra copy)
struct my_struct *s = elist_add_new(list);
s->memb1 = 123;
s->memb2 = 456;

The array will start with an initial capacity; once it is full, double the capacity (RESIZE_MULTIPLIER = 2) and realloc the array’s storage. Removing an element shifts the subsequent elements down to fill the gap; empty gaps are not allowed. The array will not be shrunk unless requested via set_capacity, and any elements beyond the requested new capacity are discarded.
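
To make the resizing policy concrete, here is a minimal sketch of how an append could grow the backing storage when the list is full. The struct layout and field names are assumptions used only for illustration, not a required design.

/* Sketch of the doubling growth policy inside an append. The struct
 * layout and field names are assumptions used only for illustration. */
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define RESIZE_MULTIPLIER 2

struct elist {
    size_t capacity;        /* number of slots currently allocated */
    size_t size;            /* number of slots currently in use */
    size_t item_sz;         /* bytes per element */
    void *element_storage;  /* one contiguous heap block */
};

ssize_t elist_add(struct elist *list, void *item)
{
    if (list->size >= list->capacity) {
        size_t new_capacity = list->capacity * RESIZE_MULTIPLIER;
        void *new_storage = realloc(list->element_storage, new_capacity * list->item_sz);
        if (new_storage == NULL) {
            return -1;  /* allocation failed; the list is left unchanged */
        }
        list->element_storage = new_storage;
        list->capacity = new_capacity;
    }

    /* copy the new element into the next free slot */
    char *dest = (char *) list->element_storage + list->size * list->item_sz;
    memcpy(dest, item, list->item_sz);
    return (ssize_t) list->size++;
}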

To allow sorting functionality, you can use qsort(3). The user will provide a comparator that your sort function passes to qsort.
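
For instance, continuing the illustrative struct from the sketch above, the sort function can forward the user’s comparator straight to qsort(3); the file_info struct and comparator below show what a client might pass in, and are not a prescribed design.

/* Sketch: forwarding a user-supplied comparator to qsort(3).
 * Uses the illustrative struct elist fields from the sketch above. */
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>

void elist_sort(struct elist *list, int (*comparator)(const void *, const void *))
{
    qsort(list->element_storage, list->size, list->item_sz, comparator);
}

/* Example client code: a comparator that orders file entries by size. */
struct file_info {
    char path[4096];
    off_t size;
    time_t access_time;
};

int compare_by_size(const void *a, const void *b)
{
    const struct file_info *fa = a;
    const struct file_info *fb = b;
    if (fa->size < fb->size) return -1;
    if (fa->size > fb->size) return 1;
    return 0;
}

/* usage: elist_sort(list, compare_by_size); */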

The Disk Usage Analyzer

The disk analyzer will traverse the file system recursively, locating all the files under a given directory. During traversal, each file’s full path, size, and last access time will be recorded in our elastic array for further inspection, sorting, and final formatting.

You will most likely want to use opendir(3) and readdir(3) to produce this listing, and stat(2) to retrieve access times and file sizes.
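
A minimal traversal sketch is shown below, reusing the illustrative file_info struct and elist functions from the earlier sketches; error handling is abbreviated, and none of the names are prescribed.

/* Traversal sketch using opendir(3), readdir(3), and stat(2). Reuses the
 * illustrative file_info struct and elist functions sketched earlier. */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

void traverse(const char *path, struct elist *list)
{
    DIR *dir = opendir(path);
    if (dir == NULL) {
        perror("opendir");
        return;
    }

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* skip the '.' and '..' entries to avoid infinite recursion */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;
        }

        char full_path[PATH_MAX];
        snprintf(full_path, sizeof(full_path), "%s/%s", path, entry->d_name);

        struct stat sb;
        if (stat(full_path, &sb) == -1) {  /* lstat(2) avoids following symlinks */
            continue;
        }

        if (S_ISDIR(sb.st_mode)) {
            traverse(full_path, list);          /* recurse into subdirectories */
        } else if (S_ISREG(sb.st_mode)) {
            struct file_info *info = elist_add_new(list);
            snprintf(info->path, sizeof(info->path), "%s", full_path);
            info->size = sb.st_size;
            info->access_time = sb.st_atime;
        }
    }

    closedir(dir);
}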

As part of your client code, you will need to write functions to perform unit conversions (bytes to human-readable units, like MiB, GiB, and so on) and format the date strings as shown in the demo above.
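
One possible shape for those helpers is sketched below; the function names and signatures are illustrative, and your exact output should be matched against the demonstration at the top of this document.

/* Sketch of the two formatting helpers; the names are illustrative. */
#include <stdio.h>
#include <time.h>

/* Convert a byte count into a human-readable string, e.g. "46.9 MiB". */
void human_readable_size(double size, char *buf, size_t buf_sz)
{
    const char *units[] = { "B", "KiB", "MiB", "GiB", "TiB", "PiB" };
    size_t unit = 0;
    while (size >= 1024.0 && unit < 5) {
        size /= 1024.0;
        unit++;
    }
    snprintf(buf, buf_sz, "%.1f %s", size, units[unit]);
}

/* Format a timestamp as "01 Feb 2021", matching the demo output. */
void format_date(time_t timestamp, char *buf, size_t buf_sz)
{
    struct tm *tm_info = localtime(&timestamp);
    strftime(buf, buf_sz, "%d %b %Y", tm_info);
}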

Implementation Restrictions

You may use any standard C library functionality. External libraries are not allowed unless permission is granted in advance; if in doubt, ask first. Your code must compile and run on your VM set up with Arch Linux as described in class; submissions that do not will receive a grade of 0.

Testing Your Code

Check your code against the provided test cases. We’ll have interactive grading for projects, where you will demonstrate program functionality and walk through your logic.

Submission: submit via GitHub by checking in your code before the project deadline.

Grading

Extra Credit

Changelog