Haskell: Does a File Exist?

July 28, 2020

If you're going to write command-line tools, the first thing you need to know is how to read the arguments passed in. Shortly after that, you'll need to know how to determine whether a file exists.

For this, the System.Directory module (from the directory package) provides the function

doesFileExist :: FilePath -> IO Bool

where a FilePath is nothing more than a String:

type FilePath = String

So a program to read command-line arguments and determine whether or not a file exists is easy:

-- This allows access to the arguments (getArgs and getProgName). 
import System.Environment
  
-- This is for doesFileExist
import System.Directory

main :: IO ()
main = do
    args <- getArgs
    progName <- getProgName
    putStrLn "The arguments are:"
    -- mapM_ discards the list of results that mapM would build.
    mapM_ putStrLn args
    putStrLn "The program name is:"
    putStrLn progName

    if null args
        then putStrLn "Give me a file name!"
        else do
            -- Due to 'head', this only checks the first name given.
            b <- doesFileExist $ head args
            print b
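
One detail worth knowing: doesFileExist answers False for a path that names a directory, and System.Directory has a separate doesDirectoryExist for that case. The following is a minimal sketch of telling the three cases apart. The PathKind type and the classify function are hypothetical names invented for this illustration; they are not part of System.Directory.

```haskell
import System.Directory (doesDirectoryExist, doesFileExist)

-- Hypothetical type for this sketch: what a path turned out to be.
data PathKind = RegularFile | Directory | Missing
    deriving (Eq, Show)

-- Hypothetical helper: ask both questions and combine the answers.
classify :: FilePath -> IO PathKind
classify path = do
    isFile <- doesFileExist path
    isDir  <- doesDirectoryExist path
    pure $ case (isFile, isDir) of
        (True, _) -> RegularFile
        (_, True) -> Directory
        _         -> Missing

main :: IO ()
main = do
    -- "." is the current directory, so it is not reported as a file.
    classify "." >>= print
```

If a tool should accept directories as well as files, checking only doesFileExist would reject them, so it may be worth deciding up front which of the two questions you are really asking.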

Now suppose that you want to pass a list of files on the command line, and check whether they all exist. Compared with the first program, the version below imports MonadUtils and adds a doFilesExist function.

import System.Environment
import System.Directory

-- Needed for allM (but see below). In GHC 9.0 and later, this module
-- is named GHC.Utils.Monad.
import MonadUtils

main :: IO ()
main = do
    args <- getArgs

    if null args
        then putStrLn "Give me some file names!"
        else do
            b <- doFilesExist args
            print b

doFilesExist :: [String] -> IO Bool
doFilesExist names = allM doesFileExist names

To compile the first program, it suffices to run ghc exist.hs (or whatever you call the source file). The second program needs MonadUtils, and there are several ways to get it. You can use Cabal to make it available. You can run ghc -package ghc exist.hs, since MonadUtils belongs to the ghc package, which is part of the default installation but hidden by default. Or you can look on Hoogle for the source code of the desired function (allM), copy it into your own source, and compile without -package ghc. Of course, allM depends on additional functions, and they'll need to be copied too.

A tidier variant of the copying approach, if you don't want to mess with Cabal, is to put the necessary function definitions in a new source file. The ghc compiler is smart enough to detect the dependency and compile the additional source automatically. In the case above, instead of importing MonadUtils, create a new file called MyUtils.hs and import MyUtils instead.

module MyUtils where

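-- Copied from library source found via Hoogle. (These definitions appear
-- to match Control.Monad.Extra in the extra package rather than GHC's
-- own MonadUtils.)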

(&&^) :: Monad m => m Bool -> m Bool -> m Bool
(&&^) a b = ifM a b (pure False)
  
allM :: Monad m => (a -> m Bool) -> [a] -> m Bool
allM p = foldr ((&&^) . p) (pure True)

ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM b t f = do b <- b; if b then t else f

notM :: Functor m => m Bool -> m Bool
notM = fmap not
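
One property of these definitions that is easy to miss is that allM short-circuits: because (&&^) only runs its second argument when the first yields True, the predicate is not run on any element after the first failure. For file checks, that means no further file system queries once one file is found missing. The sketch below tries to make this visible by counting predicate calls. The countedAll helper is a hypothetical name made up for this demonstration, and ifM's bound variable is renamed to avoid shadowing; otherwise the definitions are the ones in MyUtils.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Same definitions as in MyUtils, repeated so the sketch is self-contained.
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM mb t f = do b <- mb; if b then t else f

(&&^) :: Monad m => m Bool -> m Bool -> m Bool
(&&^) a b = ifM a b (pure False)

allM :: Monad m => (a -> m Bool) -> [a] -> m Bool
allM p = foldr ((&&^) . p) (pure True)

-- Hypothetical helper: run allM with a predicate that counts its own calls,
-- and return the overall result together with the number of calls made.
countedAll :: (a -> Bool) -> [a] -> IO (Bool, Int)
countedAll p xs = do
    calls <- newIORef (0 :: Int)
    let check x = do
            modifyIORef' calls (+ 1)
            pure (p x)
    result <- allM check xs
    n <- readIORef calls
    pure (result, n)

main :: IO ()
main =
    -- 5 is the first odd element, so the predicate runs three times only.
    countedAll even [2, 4, 5, 6, 8 :: Int] >>= print
```

Run as written, this prints (False,3): the elements 6 and 8 are never examined.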

What if you want a program that takes exactly two file names on the command line? Testing the number of arguments directly in main would make it messy. Instead, define a function:

import System.Environment
import System.Directory
import MyUtils

main = do
    args <- getArgs
    
    -- ifM is a function, so 'then' and 'else' are not explicitly used.
    ifM (notM $ argsValid args)
        (putStrLn "Give me the names of two existing files!")
        (putStrLn "Congratulations. You can type.")

    
argsValid :: [String] -> IO Bool
argsValid names =
    -- An empty list has length 0, so no separate null check is needed.
    if length names /= 2
        then return False
        else doFilesExist names

doFilesExist :: [String] -> IO Bool
doFilesExist names = allM doesFileExist names
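
As an aside, the "exactly two arguments" test can also be written as a pattern match on the list, which avoids counting elements and gives each name its own variable. This is only a sketch of an alternative, not the approach used in the program above; the twoNames helper is a hypothetical name introduced for the illustration.

```haskell
import System.Directory (doesFileExist)
import System.Environment (getArgs)

-- Hypothetical helper: succeed only when exactly two names are supplied.
twoNames :: [String] -> Maybe (FilePath, FilePath)
twoNames [a, b] = Just (a, b)
twoNames _      = Nothing

main :: IO ()
main = do
    args <- getArgs
    case twoNames args of
        Nothing -> putStrLn "Give me the names of two existing files!"
        Just (a, b) -> do
            okA <- doesFileExist a
            -- Only look at the second file if the first one exists.
            okB <- if okA then doesFileExist b else pure False
            putStrLn $ if okB
                then "Congratulations. You can type."
                else "Give me the names of two existing files!"
```

The trade-off is that the pattern match fixes the count at two in the shape of the code, whereas the length test is easier to adapt if the required number of files later changes.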

The above feels natural for someone coming from an imperative background, but the use of ifM and notM seems contrary to the spirit of Haskell. The user isovector on Reddit suggested replacing

    ifM (notM $ argsValid args)
        (putStrLn "Give me the names of two existing files!")
        (putStrLn "Congratulations. You can type.")

with

    argsValid args >>= \case
        True  -> putStrLn "Congratulations. You can type."
        False -> putStrLn "Give me the names of two existing files!"

This feels more Haskellicious, with the data flowing along, kicking out side-effects as appropriate. It does require the LambdaCase language extension, which can be enabled with the -XLambdaCase flag when ghc is invoked, or you can include the line

{-# LANGUAGE LambdaCase #-}

at the head of the source file.
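
Putting the pieces together, a complete single-file version might look like the following sketch. To keep it self-contained it does not import MyUtils; instead it assumes a small local allM written with explicit recursion, which is a stand-in and not the definition shown earlier.

```haskell
{-# LANGUAGE LambdaCase #-}

import System.Directory (doesFileExist)
import System.Environment (getArgs)

-- Local stand-in for allM so that this sketch needs no extra module.
-- It stops at the first element for which the predicate yields False.
allM :: Monad m => (a -> m Bool) -> [a] -> m Bool
allM _ []       = pure True
allM p (x : xs) = p x >>= \case
    True  -> allM p xs
    False -> pure False

-- True only when exactly two names are given and both are existing files.
argsValid :: [String] -> IO Bool
argsValid names
    | length names /= 2 = pure False
    | otherwise         = allM doesFileExist names

main :: IO ()
main = do
    args <- getArgs
    argsValid args >>= \case
        True  -> putStrLn "Congratulations. You can type."
        False -> putStrLn "Give me the names of two existing files!"
```

Because the pragma is in the file itself, this should compile with a plain ghc invocation, with no -XLambdaCase on the command line.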
