Building Your Own wc Tool in C# Code Challenge
The Unix wc (word count) tool is a classic command-line utility that counts the number of lines, words, and characters in a text file. In this post, we'll build a minimal version of this tool using C#.
For this project, let's focus on clarity over optimization so that even beginners can follow along.
Introduction
The wc tool is quite useful for analyzing text files in the terminal. It provides simple statistics:
- Number of lines
- Number of words
- Number of characters
- Number of bytes
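Characters and bytes only differ when the text contains non-ASCII characters. Here is a minimal sketch of the difference, assuming UTF-8 encoding; the sample string is made up for illustration:

```csharp
using System;
using System.Text;

// Hypothetical sample text: 'é' (\u00e9) and 'ö' (\u00f6) are one character
// each, but take two bytes each when encoded as UTF-8.
string sample = "h\u00e9llo w\u00f6rld\n";

int charCount = sample.Length;                      // UTF-16 code units
int byteCount = Encoding.UTF8.GetByteCount(sample); // bytes in UTF-8

Console.WriteLine($"chars: {charCount}, bytes: {byteCount}");
// prints "chars: 12, bytes: 14"
```

Note that string.Length counts UTF-16 code units, so characters outside the Basic Multilingual Plane (many emoji, for example) count as two.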
This blog post is based on a code challenge where the goal is to recreate the functionality of the wc tool in your language of choice.
Below, we’ll build a minimal version of wc using C#.
Step-by-Step Implementation
First, let's break down what we'll need.
- Reading the file: We’ll use a stream to read the contents of the file.
- Counting lines, words, and characters: We'll write simple methods to count these.
- Handling command-line arguments: We'll add options to count specific things (lines, words, bytes, etc.).
Here’s the code for our minimal wc tool:
using System;
using System.IO;
using System.Linq;
using static System.Console;

if (args.Length == 0)
{
    PrintUsage();
    return;
}

var parsedArgs = ArgumentsParser.Parse(args);

try
{
    using var reader = new StreamReader(parsedArgs.FilePath);
    long lineCount = 0, wordCount = 0, charCount = 0, byteCount = 0;
    string? line;

    while ((line = reader.ReadLine()) != null)
    {
        if (parsedArgs.CountLines) lineCount++;
        // ReadLine strips the line terminator, so add 1 back for it. This assumes
        // "\n" line endings and a trailing newline on the last line.
        if (parsedArgs.CountCharacters) charCount += line.Length + 1;
        if (parsedArgs.CountWords) wordCount += CountWords(line);
        if (parsedArgs.CountBytes) byteCount += System.Text.Encoding.UTF8.GetByteCount(line) + 1;
    }

    // Print in the order the usage text promises: lines, words, characters, bytes.
    if (parsedArgs.CountLines) Write($"{lineCount} ");
    if (parsedArgs.CountWords) Write($"{wordCount} ");
    if (parsedArgs.CountCharacters) Write($"{charCount} ");
    if (parsedArgs.CountBytes) Write($"{byteCount} ");
    WriteLine(parsedArgs.FilePath);
}
catch (FileNotFoundException)
{
    WriteLine($"Error: The file '{parsedArgs.FilePath}' does not exist.");
}
static long CountWords(string line)
{
    // A null separator splits on any white space (spaces, tabs, ...);
    // RemoveEmptyEntries drops the empty strings between repeated separators.
    var words = line.Split((char[]?)null, StringSplitOptions.RemoveEmptyEntries);
    return words.Length;
}
static void PrintUsage()
{
    const string usage_helper = @"Usage: ccwc [OPTION]... [FILE]...
  or: ccwc [OPTION]... --files0-from=F
Print newline, word, and byte counts for each FILE, and a total line if
more than one FILE is specified. A word is a non-zero-length sequence of
characters delimited by white space.
With no FILE, or when FILE is -, read standard input.
The options below may be used to select which counts are printed, always in
the following order: newline, word, character, byte, maximum line length.
  -c, --bytes print the byte counts
  -m, --chars print the character counts
  -l, --lines print the newline counts
  -w, --words print the word counts
  --help display this help and exit
  --version output version information and exit
!!!DISCLAIMER: This is a clone of the famous wc (word count) command-line tool.";
    WriteLine(usage_helper);
}
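Splitting a line allocates an array of substrings just to read its length. As an alternative sketch (not what the program above does), words can be counted by watching for transitions from white space to non-white space. The name CountWordsByScan is made up for this example:

```csharp
using System;

// Hypothetical alternative to CountWords: counts the points where a
// non-whitespace character follows whitespace (or the start of the line).
static long CountWordsByScan(string line)
{
    long count = 0;
    bool inWord = false;
    foreach (char ch in line)
    {
        if (char.IsWhiteSpace(ch))
        {
            inWord = false;
        }
        else if (!inWord)
        {
            inWord = true; // first character of a new word
            count++;
        }
    }
    return count;
}

Console.WriteLine(CountWordsByScan("  the quick\tbrown   fox "));
// prints "4"
```

Because it relies on char.IsWhiteSpace, this version also treats tabs and other white-space characters as delimiters.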
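The argument parser is deliberately simple: a record that holds the parsed options and a static class that fills it in. If you keep everything in one file, these type declarations must come after the top-level statements.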
public record class CWArgument(
    string FilePath,
    bool CountBytes,
    bool CountWords,
    bool CountLines,
    bool CountCharacters);

public static class ArgumentsParser
{
    /*
     * -c : number of bytes
     * -l : number of lines
     * -w : number of words
     * -m : number of characters
     * Default options - equivalent to -c -l -w
     */
    public static CWArgument Parse(string[] arguments)
    {
        var countB = arguments.Contains("-c") || arguments.Contains("--bytes");
        var countW = arguments.Contains("-w") || arguments.Contains("--words");
        var countL = arguments.Contains("-l") || arguments.Contains("--lines");
        var countM = arguments.Contains("-m") || arguments.Contains("--chars");
        var path = arguments[^1];
        var defaults = !countB && !countW && !countL && !countM;
        if (defaults) return new CWArgument(path, true, true, true, false);
        return new CWArgument(path, countB, countW, countL, countM);
    }
}
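Parse always takes the last argument as the file path, so running the tool with only options (for example, just -l) would try to open a file named -l. One possible guard, sketched here with a hypothetical helper named TryGetFilePath that is not part of the original code:

```csharp
using System;

// Hypothetical helper: accepts the last argument as the file path only if it
// does not look like an option.
static bool TryGetFilePath(string[] arguments, out string path)
{
    path = string.Empty;
    if (arguments.Length == 0) return false;

    string last = arguments[^1];
    if (last.StartsWith('-')) return false; // e.g. only "-l" was passed

    path = last;
    return true;
}

Console.WriteLine(TryGetFilePath(new[] { "-l", "notes.txt" }, out var p1) ? p1 : "(no file)");
Console.WriteLine(TryGetFilePath(new[] { "-l", "-w" }, out var p2) ? p2 : "(no file)");
// prints "notes.txt" then "(no file)"
```

This is a simplification: it also rejects files whose names genuinely start with a dash, and it does not treat a lone - as standard input the way the real wc does.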
How It Works
- File Handling: We use a StreamReader to read the file line by line. This is one of the simplest ways to read a file in C# without loading it all into memory.
- Counting Options: We handle the different counting options using the CWArgument record and the ArgumentsParser class. The parser processes the command-line arguments to determine what should be counted (bytes, words, lines, or characters). By default, it counts bytes, words, and lines, unless specified otherwise.
- Counting Lines, Words, and Characters: Depending on the options provided, we count the number of lines, words, characters, and bytes.
- Command-Line Arguments: The file path and options (e.g., -c for bytes) are passed as command-line arguments. The program parses them to determine which statistics to display.
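Reading line by line means the character and byte counts add 1 per line for the newline, which is only an approximation when the file uses \r\n endings or lacks a trailing newline. As a rough sketch of a more exact approach, assuming we read character by character instead, a single pass can count newline characters, words, and characters together. CountAll is a hypothetical name, not part of the code above:

```csharp
using System;
using System.IO;

// Hypothetical single-pass counter: reads characters instead of lines, so
// line terminators are counted exactly as they appear in the input.
static (long Lines, long Words, long Chars) CountAll(TextReader input)
{
    long lines = 0, words = 0, chars = 0;
    bool inWord = false;
    int next;
    while ((next = input.Read()) != -1)
    {
        char ch = (char)next;
        chars++;
        if (ch == '\n') lines++; // count newline characters, as wc does
        if (char.IsWhiteSpace(ch)) inWord = false;
        else if (!inWord) { inWord = true; words++; }
    }
    return (lines, words, chars);
}

// Sample input with a Windows line ending and no trailing newline.
using var text = new StringReader("one\r\ntwo");
var (lineTotal, wordTotal, charTotal) = CountAll(text);
Console.WriteLine($"{lineTotal} {wordTotal} {charTotal}");
// prints "1 2 8"
```

For the byte count of a file on disk, new FileInfo(path).Length returns the exact size without reading the contents.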
Usage
To run the program, compile it and run it from the command line:
dotnet run -- -l -w -c <path_to_your_file>
For example:
dotnet run -- -l -w -m example.txt
The output will show the selected statistics based on the arguments passed. For example:
5 15 78 example.txt
This means the file example.txt has 5 lines, 15 words, and 78 characters.
Create a runnable executable
To do so, all you need is:
dotnet publish -c Release -o ../output
Then check the newly created output folder.
Special Mention
This implementation is a minimalist solution to the Build Your Own wc Tool challenge.
If you'd like to see a more refined version, including optimizations and additional features, check out my GitHub repository: Rmauro.CommandLines.WordCount.
Conclusion
We’ve built a minimalist version of the Unix wc tool in C# using just a few lines of code.
By adding argument parsing, we made the tool customizable based on the user’s needs.
This implementation is a great starting point for building command-line utilities and getting comfortable with file handling and argument parsing in C#.
You can experiment further by adding features or improving performance, but this version should give you a solid foundation.
Happy coding! 😎