Building Your Own wc Tool in C# Code Challenge

The Unix wc (word count) tool is a classic command-line utility that counts the lines, words, characters, and bytes in a text file. In this post, we’ll build a minimal version of this tool using C#.

For this project, let’s favor clarity over optimization so that even beginners can follow along.

Introduction

The wc tool is quite useful for analyzing text files in the terminal. It provides simple statistics:

  • Number of lines
  • Number of words
  • Number of characters
  • Number of bytes

This blog post is based on a code challenge where the goal is to recreate the functionality of the wc tool in your language of choice.

Below, we’ll build a minimal version of wc using C#.

Step-by-Step Implementation

First, let’s break down what we’ll need.

  1. Reading the file: We’ll use a stream to read the contents of the file.
  2. Counting lines, words, and characters: We’ll write simple methods to count these.
  3. Handling command-line arguments: We’ll add options to count specific things (lines, words, bytes, etc.).

Here’s the code for our minimal wc tool:

using System;
using System.IO;
using System.Linq;
using static System.Console;

if (args.Length == 0)
{
    PrintUsage();
    return;
}

var parsedArgs = ArgumentsParser.Parse(args);

try
{
    using var reader = new StreamReader(parsedArgs.FilePath);
    long lineCount = 0, wordCount = 0, charCount = 0, byteCount = 0;

    string? line;
    while ((line = reader.ReadLine()) != null)
    {
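        if (parsedArgs.CountLines) lineCount++;
        // ReadLine() strips the line ending, so we add 1 back for it. This assumes every
        // line ends in a single '\n': CRLF files are under-counted and a final line
        // without a newline is over-counted.
        if (parsedArgs.CountCharacters) charCount += line.Length + 1; // Length is in UTF-16 code units
        if (parsedArgs.CountWords) wordCount += CountWords(line);
        if (parsedArgs.CountBytes) byteCount += System.Text.Encoding.UTF8.GetByteCount(line) + 1;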
    }

    // Print in the order documented in the usage text: newline, word, character, byte.
    if (parsedArgs.CountLines) Write($"{lineCount} ");
    if (parsedArgs.CountWords) Write($"{wordCount} ");
    if (parsedArgs.CountCharacters) Write($"{charCount} ");
    if (parsedArgs.CountBytes) Write($"{byteCount} ");
    WriteLine(parsedArgs.FilePath);
}
catch (FileNotFoundException)
{
    WriteLine($"Error: The file '{parsedArgs.FilePath}' does not exist.");
}

static long CountWords(string line)
{
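    // Split on any whitespace (spaces, tabs, ...) and count the non-empty entries.
    // A null separator array tells Split to use whitespace as the delimiter.
    var words = line.Split((char[]?)null, StringSplitOptions.RemoveEmptyEntries);
    return words.Length;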
}

static void PrintUsage()
{
    const string usage_helper = @"Usage: ccwc [OPTION]... [FILE]...
  or:  ccwc [OPTION]... --files0-from=F
Print newline, word, and byte counts for each FILE, and a total line if
more than one FILE is specified.  A word is a non-zero-length sequence of
characters delimited by white space.

With no FILE, or when FILE is -, read standard input.

The options below may be used to select which counts are printed, always in
the following order: newline, word, character, byte, maximum line length.
  -c, --bytes            print the byte counts
  -m, --chars            print the character counts
  -l, --lines            print the newline counts
  -w, --words            print the word counts
      --help     display this help and exit
      --version  output version information and exit
      
!!!DISCLAIMER: This is a clone of the famous wc (word count) command-line tool. ";
    WriteLine(usage_helper);
}

The options are handled by a very simple ArgumentsParser, which treats the last argument as the file path:

public record class CWArgument(
    string FilePath, 
    bool CountBytes, 
    bool CountWords, 
    bool CountLines, 
    bool CountCharacters);

public static class ArgumentsParser
{
    /*
     * -c : number of bytes
     * -l : number of lines
     * -w : number of words
     * -m : number of characters
     * Default options - equivalent to -c -l -w
     */
    public static CWArgument Parse(string[] arguments)
    {
        var countB = arguments.Contains("-c") || arguments.Contains("--bytes");
        var countW = arguments.Contains("-w") || arguments.Contains("--words");
        var countL = arguments.Contains("-l") || arguments.Contains("--lines");
        var countM = arguments.Contains("-m") || arguments.Contains("--chars");
        var path = arguments[^1];

        var defaults = !countB && !countW && !countL && !countM;
        if (defaults) return new CWArgument(path, true, true, true, false);

        return new CWArgument(path, countB, countW, countL, countM);
    }
}
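
To see how the parser maps arguments to flags, here is a small sketch. It uses a condensed, tuple-returning stand-in named ParseFlags (a hypothetical name, not part of the code above) so that the example runs on its own, and notes.txt is just a placeholder path.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for ArgumentsParser.Parse that returns a tuple instead of the record.
static (string Path, bool Bytes, bool Words, bool Lines, bool Chars) ParseFlags(string[] a)
{
    bool b = a.Contains("-c") || a.Contains("--bytes");
    bool w = a.Contains("-w") || a.Contains("--words");
    bool l = a.Contains("-l") || a.Contains("--lines");
    bool m = a.Contains("-m") || a.Contains("--chars");
    string path = a[^1]; // the last argument is always taken as the file path

    // No flag given: fall back to the defaults (-c -l -w).
    return (!b && !w && !l && !m) ? (path, true, true, true, false) : (path, b, w, l, m);
}

// Only a path: bytes, words, and lines are enabled by default.
Console.WriteLine(ParseFlags(new[] { "notes.txt" }));

// Short and long option names can be mixed.
Console.WriteLine(ParseFlags(new[] { "-l", "--chars", "notes.txt" }));

// Limitation: with no path, the option itself ends up as the "file path".
Console.WriteLine(ParseFlags(new[] { "-l" }));
```

The last call shows the trade-off of this approach: because the final argument is always treated as the path, running the tool with only an option leads to a "file does not exist" error rather than a usage message.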

How It Works

  1. File Handling: We use a StreamReader to read the file line by line. This is a simple way to read a file in C# without loading all of it into memory.
  2. Counting Options: We handle the different counting options using the CWArgument record and the ArgumentsParser class. The parser processes the command-line arguments to determine what should be counted (bytes, words, lines, or characters). By default, it counts bytes, words, and lines, unless specified otherwise.
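  3. Counting Lines, Words, and Characters: For each line read, we update only the counters the user asked for. Words are the non-empty pieces left after splitting the line, and the character and byte counts add one per line for the newline that ReadLine() strips.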
  4. Command-Line Arguments: The file path and options (e.g., -c for bytes) are passed as command-line arguments. The program parses them to determine which statistics to display.
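
Because ReadLine() removes line endings, the line-based counts above are approximate for files with Windows line endings or without a trailing newline. As an alternative, here is a minimal sketch that counts over the raw bytes instead. It is not the approach used in this post: the helper name CountAll is hypothetical, the input is assumed to be UTF-8, and only ASCII whitespace separates words.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: counts newlines, words, characters, and bytes from raw UTF-8 bytes.
static (long Lines, long Words, long Chars, long Bytes) CountAll(Stream input)
{
    long lines = 0, words = 0, chars = 0, bytes = 0;
    bool inWord = false;
    var buffer = new byte[8192];
    int read;
    while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        bytes += read; // exact, since nothing is stripped or re-encoded
        for (int i = 0; i < read; i++)
        {
            byte b = buffer[i];
            if (b == (byte)'\n') lines++;

            // UTF-8 continuation bytes look like 10xxxxxx; every other byte starts a character.
            if ((b & 0xC0) != 0x80) chars++;

            // ASCII whitespace only: space, \t, \n, \v, \f, \r.
            bool isSpace = b == 0x20 || (b >= 0x09 && b <= 0x0D);
            if (isSpace) inWord = false;
            else if (!inWord) { inWord = true; words++; }
        }
    }
    return (lines, words, chars, bytes);
}

// "héllo wörld", a CRLF line ending, then "second<TAB>line" with no trailing newline.
var sample = Encoding.UTF8.GetBytes("h\u00E9llo w\u00F6rld\r\nsecond\tline");
using var stream = new MemoryStream(sample);
Console.WriteLine(CountAll(stream)); // (1, 4, 24, 26)
```

For the sample, this reports 1 newline, 4 words, 24 characters, and 26 bytes: the two accented letters take two bytes each, and the CRLF pair counts as two characters.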

Usage

To build and run the program, call dotnet run from the project directory and pass the options after the -- separator:

dotnet run -- -l -w -c <path_to_your_file>

For example:

dotnet run -- -l -w -c example.txt

The output shows only the statistics selected by the arguments, followed by the file name. For example:

5 15 78 example.txt

This means the file example.txt has 5 lines, 15 words, and 78 bytes.

Create a runnable executable

To do so, all you need is:

dotnet publish -c Release -o ../output

Then check the newly created output folder, which this command places next to the project directory.

Special Mention

This implementation is a minimalist solution to the Build Your Own wc Tool challenge.

If you’d like to see a more refined version, including optimizations and additional features, check out my GitHub repository: Rmauro.CommandLines.WordCount.

Conclusion

We’ve built a minimalist version of the Unix wc tool in C# using just a few lines of code.

By adding argument parsing, we made the tool customizable based on the user’s needs.

This implementation is a great starting point for building command-line utilities and getting comfortable with file handling and argument parsing in C#.

You can experiment further by adding features or improving performance, but this version should give you a solid foundation.
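
As one possible next step, the usage text already mentions reading standard input when FILE is -, which the minimal version does not do. Here is a rough sketch of how that could look. The helper name OpenInput is hypothetical, and the demo reads a temporary file so that it runs without piped input.

```csharp
using System;
using System.IO;

// Hypothetical helper: "-" selects standard input, anything else is treated as a file path.
static TextReader OpenInput(string path) =>
    path == "-" ? Console.In : new StreamReader(path);

// Demo with a temporary file so the sketch runs on its own.
string tmp = Path.GetTempFileName();
File.WriteAllText(tmp, "one two\nthree\n");

long lineCount = 0;
using (TextReader reader = OpenInput(tmp))
{
    // Same loop shape as the main program, but independent of where the text comes from.
    while (reader.ReadLine() != null) lineCount++;
}

Console.WriteLine($"{lineCount} {tmp}");
File.Delete(tmp);
```

Since both sources are exposed as a TextReader, the counting loop would not need to change. A real tool would probably also avoid disposing Console.In and skip printing a file name when reading from standard input.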

Happy coding! 😎