Building Your Own wc Tool in C# Code Challenge
The Unix wc (word count) tool is a classic command-line utility that counts the number of lines, words, and characters in a text file. In this post, we’ll work on a minimalistic version of this tool using C#.
For this project, let’s focus on clarity over optimization so that even beginners can follow along.
Introduction #
The wc tool is quite useful for analyzing text files in the terminal. It provides simple statistics:
- Number of lines
- Number of words
- Number of characters
- Number of bytes
This blog post is based on a code challenge where the goal is to recreate the functionality of the wc tool in your language of choice.
Below, we’ll build a minimal version of wc using C#.
Step-by-Step Implementation #
First, let’s break down what we’ll need.
- Reading the file: We’ll use a stream to read the contents of the file.
- Counting lines, words, and characters: We’ll write simple methods to count these.
- Handling command-line arguments: We’ll add options to count specific things (lines, words, bytes, etc.).
Here’s the code for our minimal wc tool:
    using System;
    using System.IO;
    using System.Linq;
    using static System.Console;

    // Show the usage text when no arguments are given.
    if (args.Length == 0)
    {
        PrintUsage();
        return;
    }

    var parsedArgs = ArgumentsParser.Parse(args);

    try
    {
        using var reader = new StreamReader(parsedArgs.FilePath);
        long lineCount = 0, wordCount = 0, charCount = 0, byteCount = 0;
        string? line;

        // ReadLine strips the line terminator, so the character and byte counts
        // add 1 back per line, assuming every line ends with a single '\n'.
        while ((line = reader.ReadLine()) != null)
        {
            if (parsedArgs.CountLines) lineCount++;
            if (parsedArgs.CountWords) wordCount += CountWords(line);
            if (parsedArgs.CountCharacters) charCount += line.Length + 1;
            if (parsedArgs.CountBytes) byteCount += System.Text.Encoding.UTF8.GetByteCount(line) + 1;
        }

        // Print in the order documented in the usage text: lines, words, characters, bytes.
        if (parsedArgs.CountLines) Write($"{lineCount} ");
        if (parsedArgs.CountWords) Write($"{wordCount} ");
        if (parsedArgs.CountCharacters) Write($"{charCount} ");
        if (parsedArgs.CountBytes) Write($"{byteCount} ");
        WriteLine(parsedArgs.FilePath);
    }
    catch (FileNotFoundException)
    {
        WriteLine($"Error: The file '{parsedArgs.FilePath}' does not exist.");
    }

    static long CountWords(string line)
    {
        // Split on any whitespace (spaces, tabs, ...) and count the non-empty entries.
        // The cast picks the char[] overload; a null separator means "all whitespace".
        var words = line.Split((char[]?)null, StringSplitOptions.RemoveEmptyEntries);
        return words.Length;
    }

    static void PrintUsage()
    {
        // Verbatim string: line breaks and leading spaces are printed as written.
        const string usage_helper = @"Usage: ccwc [OPTION]... [FILE]...
      or:  ccwc [OPTION]... --files0-from=F
    Print newline, word, and byte counts for each FILE, and a total line if
    more than one FILE is specified. A word is a non-zero-length sequence of
    characters delimited by white space.
    With no FILE, or when FILE is -, read standard input.
    The options below may be used to select which counts are printed, always in
    the following order: newline, word, character, byte, maximum line length.
      -c, --bytes    print the byte counts
      -m, --chars    print the character counts
      -l, --lines    print the newline counts
      -w, --words    print the word counts
          --help     display this help and exit
          --version  output version information and exit
    !!!DISCLAIMER: This is a clone of famous wc (Word Count) command line. ";
        WriteLine(usage_helper);
    }
A very simple ArgumentsParser.
    public record class CWArgument(
        string FilePath,
        bool CountBytes,
        bool CountWords,
        bool CountLines,
        bool CountCharacters);

    public static class ArgumentsParser
    {
        /*
         * -c : number of bytes
         * -l : number of lines
         * -w : number of words
         * -m : number of characters
         * Default options - equivalent to -c -l -w
         */
        public static CWArgument Parse(string[] arguments)
        {
            var countB = arguments.Contains("-c") || arguments.Contains("--bytes");
            var countW = arguments.Contains("-w") || arguments.Contains("--words");
            var countL = arguments.Contains("-l") || arguments.Contains("--lines");
            var countM = arguments.Contains("-m") || arguments.Contains("--chars");

            // The last argument is always treated as the file path.
            var path = arguments[^1];

            // With no options given, fall back to bytes, words, and lines.
            var defaults = !countB && !countW && !countL && !countM;
            if (defaults) return new CWArgument(path, true, true, true, false);

            return new CWArgument(path, countB, countW, countL, countM);
        }
    }
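The parser above always takes the last argument as the file path, so a call such as ccwc -l -w would try to open a file named -w. One possible guard is sketched below. TryGetFilePath is a hypothetical helper, not part of the original code, and treating a lone - as a valid path is an assumption made here only because the usage text mentions it.

```csharp
using System;

// Hypothetical helper (not in the original parser): returns the last argument
// as the file path, or null when there are no arguments or the last one looks
// like an option such as -l or --lines.
static string? TryGetFilePath(string[] arguments)
{
    if (arguments.Length == 0) return null;

    var last = arguments[^1];

    // Assumption: a lone "-" is kept as a path, since the usage text mentions it.
    var looksLikeOption = last.Length > 1 && last.StartsWith('-');
    return looksLikeOption ? null : last;
}

Console.WriteLine(TryGetFilePath(new[] { "-l", "notes.txt" }) ?? "(no file given)");
Console.WriteLine(TryGetFilePath(new[] { "-l", "-w" }) ?? "(no file given)");
```

With a check like this, the program could print the usage text instead of reporting that a file named after an option does not exist.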
How It Works #
- File Handling: We use a StreamReader to read the file line by line. This is the simplest way to handle file reading in C#.
- Counting Options: We handle the different counting options using the CWArgument record and the ArgumentsParser class. The parser processes the command-line arguments to determine what should be counted (bytes, words, lines, or characters). By default, it counts bytes, words, and lines, unless specified otherwise.
- Counting Lines, Words, and Characters: Depending on the options provided, we count the number of lines, words, characters, and bytes.
- Command-Line Arguments: The file path and options (e.g., -c for bytes) are passed as command-line arguments. The program parses them to determine which statistics to display.
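Reading line by line is easy to follow, but it hides the line terminators, so the counts rely on adding 1 per line. As an alternative, here is a minimal sketch that counts lines, words, and characters in a single pass over the characters. CountText is a hypothetical helper and the sample string is made up for illustration. It counts newline characters, as wc does, so a final line without a trailing newline is not counted as a line.

```csharp
using System;
using System.IO;

// Hypothetical helper: one pass over the text, one character at a time.
static (long Lines, long Words, long Chars) CountText(TextReader reader)
{
    long lines = 0, words = 0, chars = 0;
    var inWord = false;
    int next;

    while ((next = reader.Read()) != -1)
    {
        var c = (char)next;
        chars++;                     // UTF-16 code units, the same unit as string.Length
        if (c == '\n') lines++;      // count newline characters

        if (char.IsWhiteSpace(c))
        {
            inWord = false;          // whitespace ends the current word
        }
        else if (!inWord)
        {
            inWord = true;           // first character of a new word
            words++;
        }
    }

    return (lines, words, chars);
}

// Sample input: a tab between words and no trailing newline on the last line.
var sample = "one two\tthree\nlast line without newline";
var counts = CountText(new StringReader(sample));
Console.WriteLine($"{counts.Lines} {counts.Words} {counts.Chars}");
```

To use it on a file, pass a StreamReader instead of the StringReader. For the byte count, the file size reported by FileInfo.Length avoids re-encoding the text.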
Usage #
From the project directory, build and run the program with dotnet run. Everything after -- is passed to the program as its arguments:
dotnet run -- -l -w -c <path_to_your_file>
For example:
dotnet run -- -l -w -c example.txt
The output shows the selected statistics, followed by the file name:
5 15 78 example.txt
This means the file example.txt has 5 lines, 15 words, and 78 bytes.
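The -c and -m options can report different numbers for the same file, because one character may take more than one byte in UTF-8. The sketch below compares the measures for a few sample strings. Measure is a hypothetical helper, and the samples are chosen only for illustration. Note that string.Length, which the tool uses for -m, counts UTF-16 code units rather than Unicode code points.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper: three ways to measure the size of the same text.
static (int Utf16Units, int CodePoints, int Utf8Bytes) Measure(string text) =>
    (text.Length,                        // what line.Length reports
     text.EnumerateRunes().Count(),      // Unicode code points
     Encoding.UTF8.GetByteCount(text));  // bytes when encoded as UTF-8

// Samples: plain ASCII, a Latin letter with an accent, and an emoji.
foreach (var text in new[] { "hello", "h\u00E9llo", "\U0001F60E" })
{
    var m = Measure(text);
    Console.WriteLine($"{m.Utf16Units} UTF-16 units, {m.CodePoints} code points, {m.Utf8Bytes} bytes");
}
```

For plain ASCII text all three numbers match, which is why -c and -m often agree on simple files.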
Create a runnable executable #
To do so, publish the project in Release mode:
dotnet publish -c Release -o ../output
The executable will be in the newly created output folder, one level above the project directory.
Special Mention #
This implementation is a minimalist solution to the Build Your Own wc Tool challenge.
If you’d like to see a more refined version, including optimizations and additional features, check out my GitHub repository: Rmauro.CommandLines.WordCount.
Conclusion #
We’ve built a minimalist version of the Unix wc tool in C# using just a few lines of code.
By adding argument parsing, we made the tool customizable based on the user’s needs.
This implementation is a great starting point for building command-line utilities and getting comfortable with file handling and argument parsing in C#.
You can experiment further by adding features or improving performance, but this version should give you a solid foundation.
Happy coding! 😎