
Traversing sub-directories

If you’re running models for a number of options, or a deposition sequence, you may need to go back and post-process the data at some point. If it’s a set of deposition runs, you might want to create a snapshot of the output data and add it to a PowerPoint presentation. Perhaps you just want to get the deposition details for each run. Or maybe something else.

You could manually set each directory to be the working directory and run a script, but that’ll get tedious if there are a whole lot of runs to process.

This tutorial will show some different ways to process a list of sub-directories using Muk3D’s Python scripting interface.

Python’s os.walk

In the Python Standard Library there is a module called os. This module is for interacting with the operating system in a way that is consistent across different platforms (Linux, Windows, Mac, etc). It contains some basic functions for dealing with files and directories.

For this example, this is the directory structure that we’re working with.

Root_directory
    -sub-1
    -sub-2
    -sub-3
    -sub-4
    -sub-5
        -sub-5-1
        -sub-5-2

To use a Python module in our script we need to first import it using the import statement.

import os

for root, dirs, files in os.walk('.'):
    print root, dirs, files

In line 1 we import the os module.  If we don’t import a module, we can’t use it.

In line 3 we iterate through the outputs of the os.walk function using a for loop. The argument to the os.walk function is '.', which means the current working directory. If we had a sub-directory called 'stuff' and wanted to walk through the sub-directories of stuff, we’d write os.walk('./stuff').

For each iteration of the loop, we get a tuple that contains 3 values:

  • root : the directory that os.walk  is currently visiting;
  • dirs : a list of sub-directories within the current directory; and
  • files : a list of files in the directory os.walk  is currently visiting.

If we run this script in the Root_directory, this is the output we get.

. ['sub-1', 'sub-2', 'sub-3', 'sub-4', 'sub-5'] ['traverse_subdirs.py']
.\sub-1 [] []
.\sub-2 [] []
.\sub-3 [] []
.\sub-4 [] []
.\sub-5 ['sub-5-1', 'sub-5-2'] []
.\sub-5\sub-5-1 [] []
.\sub-5\sub-5-2 [] []

The first directory that os.walk visits is the start directory that is passed as the argument to the function. In this case, it’s '.' (which corresponds to Root_directory in this example).

The first item in Line 1 is ‘.’  which is shorthand for the current directory (working directory).

The second item in Line 1 is a list of the sub-directories within the current directory (working directory).  The directories in this list will be what the os.walk  function visits after the starting directory.

The third entry in Line 1 is a list of files in the start directory.  Currently there is only 1 item – our script file.

In Line 2, os.walk is visiting the sub-directory sub-1. The root variable (the path relative to our current/working directory) is .\sub-1 (the output above is from Windows, which uses \ as the path separator; on Linux or Mac it would be ./sub-1). Items 2 and 3 on this line are empty lists because there are no child sub-directories or files in this directory.

In Line 6, the dirs variable contains the list ['sub-5-1', 'sub-5-2'], which are the sub-directories of .\sub-5.
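
Two things often come up once the basic loop is working: turning root and a file name into a usable path, and stopping os.walk from going into directories you don’t care about. The sketch below shows one possible way to do both. The function name list_files and its skip_dirs parameter are made up for this illustration, and the code is written so that it should run under either Python 2 or Python 3.

```python
import os

def list_files(start_dir, skip_dirs=()):
    """Return the path of every file found beneath start_dir.

    skip_dirs is an optional collection of directory names to leave out.
    (Both names are illustrative, not part of any library.)
    """
    found = []
    for root, dirs, files in os.walk(start_dir):
        # Slice assignment edits the same list that os.walk is holding,
        # so any directory removed here is never visited.
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for name in files:
            # root is the directory being visited; joining it with the
            # file name gives a path in the same form as start_dir.
            found.append(os.path.join(root, name))
    return found
```

Calling list_files('.', skip_dirs=['sub-5']) from Root_directory would walk sub-1 to sub-4 but never enter sub-5 or anything beneath it. This relies on os.walk visiting directories top-down, which is its default behaviour.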

Using os.listdir

If you don’t need to dive into every directory and just want to go into each sub-directory of the current directory, you can use another function in the os module called os.listdir. This function returns a list of the files and directories in the target directory.

import os

for entry in os.listdir('.'):
    print entry

The output from this is shown below.

file1.txt
file2.txt
sub-1
sub-2
sub-3
sub-4
sub-5

All entries (files and directories) are included in the os.listdir output. If we want just the directories, we can use the os.path.isdir function.

import os

for entry in os.listdir('.'):
    if not os.path.isdir(entry):
        continue
    
    print entry

In line 4, the entry is tested to see if it’s a directory. If it’s not a directory, then the continue keyword is used to skip to the next entry in the os.listdir output.

The output from this is below.

sub-1
sub-2
sub-3
sub-4
sub-5
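
The script above works because it lists '.', so each entry name is already a valid path from the working directory. If the same idea is pointed at some other directory, the names that os.listdir returns need to be joined to that directory before they are tested. A minimal sketch of that, using a made-up helper name list_subdirs, might look like this:

```python
import os

def list_subdirs(parent='.'):
    """Return the names of the directories directly inside parent, sorted.

    list_subdirs is an illustrative name, not a library function.
    """
    subdirs = []
    for entry in os.listdir(parent):
        # os.listdir returns bare names, so join each one to parent
        # before testing it; otherwise the test would be made against
        # the working directory instead.
        if os.path.isdir(os.path.join(parent, entry)):
            subdirs.append(entry)
    # os.listdir makes no promise about order, so sort for repeatability.
    return sorted(subdirs)
```

Called as list_subdirs('.') from Root_directory, this should return the same five names shown above.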

Next, we’re going to go into each of these directories, look for some tailings deposition results, load them, and grab the deposited volume. Not every sub-directory has tailings results in it, so the first thing we want to do is check that the results.py file exists.

import os

for entry in os.listdir('.'):
    if not os.path.isdir(entry):
        continue
    
    results_file = os.path.join(entry, 'results.py')    
    print results_file, os.path.exists(results_file)

In line 7 we create a variable called results_file, which combines the sub-directory (the entry variable) and the name of the results file (results.py). The os.path.join function takes any number of path components and creates a path string using the appropriate path separator (/ or \). Note that this does not mean the file indicated by that path exists. So in line 8, os.path.exists returns True if results_file exists, or False if it doesn’t.

The output is below. The results.py file was only found in sub-1.

sub-1\results.py True
sub-2\results.py False
sub-3\results.py False
sub-4\results.py False
sub-5\results.py False
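
The os.path.join and os.path.exists calls used above can be wrapped up for reuse. The sketch below is only one way to arrange it: results_path and has_results are hypothetical helper names, and it swaps os.path.exists for os.path.isfile, which is stricter because it is also False when the path turns out to be a directory.

```python
import os

def results_path(run_dir, results_name='results.py'):
    """Build the path to the results file inside run_dir.

    Only the string is built; nothing is checked on disk.
    """
    return os.path.join(run_dir, results_name)

def has_results(run_dir, results_name='results.py'):
    """Return True only if the results path exists and is a regular file."""
    return os.path.isfile(results_path(run_dir, results_name))

# os.path.join accepts more than two components, so a nested run
# directory can be described in a single call.
nested = os.path.join('sub-5', 'sub-5-1', 'results.py')
print(nested)
```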

So now we want to modify the script to just skip to the next entry in os.listdir if results.py doesn’t exist in a directory.  This is done by adding another conditional check that executes the continue keyword if results.py  doesn’t exist.

import os

for entry in os.listdir('.'):
    if not os.path.isdir(entry):
        continue
    
    results_file = os.path.join(entry, 'results.py')
    
    if not os.path.exists(results_file):
        continue
        
    print results_file

This produces a single line of output.

sub-1\results.py
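
From here, the remaining step is to do something in each directory that passed the checks. If the post-processing expects the run directory to be the working directory, one option is to switch into it with os.chdir and switch back afterwards. The sketch below assumes exactly that. run_in_directory and process_runs are made-up names, and task stands in for whatever post-processing you need; no Muk3D commands are shown because they depend on what you want to extract.

```python
import os

def run_in_directory(run_dir, task):
    """Call task() with run_dir as the working directory, then switch back."""
    original_dir = os.getcwd()
    os.chdir(run_dir)
    try:
        return task()
    finally:
        # Always return to the starting directory, even if task() fails.
        os.chdir(original_dir)

def process_runs(parent, task, results_name='results.py'):
    """Run task() inside each sub-directory of parent that has results_name.

    Returns a dict mapping each sub-directory name to what task() returned.
    """
    outcomes = {}
    for entry in sorted(os.listdir(parent)):
        run_dir = os.path.join(parent, entry)
        if not os.path.isdir(run_dir):
            continue
        if not os.path.exists(os.path.join(run_dir, results_name)):
            continue
        outcomes[entry] = run_in_directory(run_dir, task)
    return outcomes
```

The try/finally block is the important part of the design: without it, an error in one run would leave the script sitting in that run’s directory, and every path after that would be resolved from the wrong place.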
