Folder Backup Utility

This is an example of a folder backup utility, which also includes a simple command-line option parser. Although originally designed to back up Visual Studio project folders, it walks the subfolders of any set of directories and creates a zip file for each folder that contains one or more modified files. It uses the open source SharpZipLib to create the zip archives.

Features

 * Archives entire folders into .zip files.
 * Does not back up folders which have not changed.
 * Easy to set up, use, and configure.
 * Ability to skip certain folders.
 * Accepts command-line parameters.
 * Ability to keep one backup per minute, day, week, or year.
 * Great way to back up all those Microsoft Visual Studio projects without any hassle.

Code
The following files (BackupFolders.py, CmdLine.py, ICSharpCode.SharpZipLib.dll) must be placed in the same directory.

BackupFolders.py
"""Folder-to-Archive Backup Tool

This utility will back up the directories that you indicate into archives based on the directory names.

Version  Changes
0.5      Initial version
0.6      Encapsulated all functionality into a class declaration while maintaining command-line execution functionality.
0.9      Changed the CompilePreviousBackupTimes method to look at the creation time for the backup archive instead of parsing the file name with a regular expression. The __CreateBackup method signature was changed as a fix for when the archive file name already existed and the creation date was not modified. Added more validation to pyFolderBackup.__init__.
1.0      Fixed bugs and changed to an IronPython version of CmdLine.
"""
__author__ = 'Dag R. Calafell, III'
__date__    = '2007-05-31'  # yyyy-mm-dd
__module_name__ = "Folder-to-Archive Backup Tool"
__short_cright__= " Creative Commons License" # http://creativecommons.org/licenses/by/3.0/
__version__ = '1.0'     # Human-Readable Version number
version_info = (1,0,0)  # Easier format: if version_info > (1,2,5)

import re
import sys
import clr
clr.AddReference("ICSharpCode.SharpZipLib")
from ICSharpCode.SharpZipLib.Zip import FastZip
from ICSharpCode.SharpZipLib import SharpZipBaseException
from CmdLine import CmdArgParser
from System import Array, DateTime, Environment
from System.IO import Path, File, Directory, SearchOption

class pyFolderBackup:
    """Class encapsulates all the functionality needed to create and manage backups of directories.
    It creates new backups only when a file within the directory has changed."""

    def __init__(self, source_directories, destination_directory, skip_directories=None, verbose=False, target_format=None):
        # A single path may arrive as a plain string (for example from CmdArgParser); wrap it in a list
        # so that it is not iterated character by character.
        if type(source_directories) == str:
            source_directories = [source_directories]
        if type(skip_directories) == str:
            skip_directories = [skip_directories]

        # Validation
        if destination_directory == None:
            raise ValueError, 'Please specify an archive directory. %s' % destination_directory
        elif type(destination_directory) != str and hasattr(destination_directory, '__getitem__'):
            destination_directory = destination_directory[0]
            print 'Warning: Only one archive directory is supported. Script will use only the first directory in the list.'
        elif not Directory.Exists(str(destination_directory)):
            raise ValueError, 'The destination directory does not exist. %s' % destination_directory
        if not hasattr(source_directories, '__getitem__'):
            raise TypeError, 'The source_directories parameter must be enumerable. A \'%s\' was passed.' % type(source_directories)
        if skip_directories != None and not hasattr(skip_directories, '__getitem__'):
            raise TypeError, 'The skip_directories parameter must be enumerable. A \'%s\' was passed.' % type(skip_directories)
        if type(destination_directory) != str:
            raise TypeError, 'The destination_directory parameter must be a string. A \'%s\' was passed.' % type(destination_directory)
        if verbose == None:
            verbose = True
        if target_format != None:
            if type(target_format) != str:
                raise TypeError, 'The target_format parameter must be a string. A \'%s\' was passed.' % type(target_format)
            elif len(target_format.split('%s')) != 2:
                # The split test above accepts exactly one '%s' placeholder.
                raise ValueError, 'The target_format parameter must be a string with exactly one occurrence of \'%s\', where the project name is substituted. \'' + str(target_format) + '\' was passed'
        if source_directories == None or len(source_directories) == 0:
            raise ValueError, 'No source directories to process.'

        self.VerboseMode = bool(verbose)
        self.SourceDirs = source_directories
        self.DestDir = destination_directory
        self.__fz = FastZip()   # FastZip.CreateZip is an instance method
        if skip_directories == None:
            self.SkipDirs = []
        else:
            self.SkipDirs = skip_directories

        if target_format == None or len(target_format) == 0:
            # Default archive name: "<destination>\<project> yyyy.MM.dd.zip"
            slash = Path.DirectorySeparatorChar
            if destination_directory[-len(slash)] == slash:
                slash = ''
            self.TargetFormat = destination_directory + slash + '%s ' + DateTime.Now.ToString("yyyy.MM.dd") + '.zip'
        else:
            self.TargetFormat = target_format

    # Functions

    def GetMostRecent(self, directory):
        """Determines the most recent modified date within all files of a directory.

        Input:
            directory: The directory to walk.
        Output:
            The most recent date & time at which a file within the directory structure has changed.
        """
        mostRecent = DateTime(1990, 1, 1)
        for f in Directory.GetFiles(directory, '*.*', SearchOption.AllDirectories):
            last = File.GetLastWriteTime(f)
            if mostRecent < last:
                mostRecent = last
        return mostRecent

    def __CreateBackup(self, directory, creation):
        """Creates a zip backup of a directory.

        Input:
            directory: The directory to back up.
            creation: The date and time to set as the creation time of the archive.
        Output:
            None
        """
        if type(directory) != str:
            raise TypeError, 'The directory parameter must be a string. A \'%s\' was passed.' % type(directory)
        if type(creation) != DateTime:
            raise TypeError, 'The creation parameter must be of type System.DateTime. A \'%s\' was passed.' % type(creation)
        # Create the backup
        proj = directory.split('\\')[-1]
        target = self.TargetFormat % proj
        self.__fz.CreateZip(target, directory, True, '.*')
        # Stamp the archive so that later runs can compare it with file modification times.
        File.SetCreationTime(target, creation)
        print Path.GetFileName(target) + ' created.'

    def CompilePreviousBackupTimes(self):
        """Create a dictionary of backup times for each project.

        Input:
            None
        Output:
            A dictionary of the most recent backup date by directory name.
        """
        reBackupInfo = re.compile(r"([^\\]+)\s(\d{4}.\d{2}.\d{2})\.zip")
        previousBackups = {}
        try:
            files = Directory.GetFiles(self.DestDir, '*.zip')
        except Exception, e:
            print '%s' % e
            print ''
            print 'Archive Directory: %s' % self.DestDir
            return

        for f in files:
            m = reBackupInfo.search(f)
            if m:
                projName = m.group(1)
                dte = m.group(2)
                filenameDate = DateTime(int(dte[0:4]), int(dte[5:7]), int(dte[8:10]))
                creationDate = File.GetCreationTime(f)
                if self.VerboseMode and filenameDate.Day != creationDate.Day and filenameDate.Month != creationDate.Month and filenameDate.Year != creationDate.Year:
                    print 'Warning: The filename \'%s\' does not agree with the date that the file was created. (%s != %s)' % (Path.GetFileName(f), filenameDate, creationDate)
                if projName not in previousBackups or previousBackups[projName] < creationDate:
                    previousBackups[projName] = creationDate
        return previousBackups

    def RunBackup(self):
        """Runs a backup.

        Input:
            None
        Output:
            None
        """
        # Dictionary of backup times for each project.
        prevBackups = self.CompilePreviousBackupTimes()
        if self.VerboseMode:
            print ''
            print 'Most Recent Backups'
            if prevBackups != None:
                for k in prevBackups.keys():
                    print '  %-25s %s' % (k, prevBackups[k].ToString('MM/dd/yyyy'))
                print ''
            else:
                print '  None'
        # CompilePreviousBackupTimes returns None when the archive directory cannot be read;
        # treat that as "no previous backups" so the membership tests below do not fail.
        if prevBackups == None:
            prevBackups = {}
        # Loop through the projects to back up
        for x in self.SourceDirs:
            if Directory.Exists(x):
                for proj in Directory.GetDirectories(x):
                    if proj in self.SkipDirs:
                        continue
                    mostRecent = self.GetMostRecent(proj)
                    projName = proj.split('\\')[-1]
                    # Test if this project has already been backed up
                    if not projName in prevBackups or mostRecent > prevBackups[projName]:
                        # print '\nmostRecent > prevBackups[projName]\n%s > %s\n' % (mostRecent, prevBackups[projName])
                        self.__CreateBackup(proj, mostRecent)
                    elif self.VerboseMode:
                        print '%s backup created on %s is up to date.' % (projName, prevBackups[projName].ToString('MM/dd/yyyy'))

if __name__ == '__main__':
    if len(sys.argv) == 1:
        print 'Backup Visual Studio Projects'
        user = Environment.GetEnvironmentVariable('username')
        # Directories to be backed up.
        source = [r'C:\Documents and Settings\%s\My Documents\Visual Studio 2005\Projects' % user,
                  r'C:\Documents and Settings\%s\My Documents\Visual Studio 2005\Websites' % user]
        # Directories to skip (do not back up).
        skip = [r'C:\Documents and Settings\%s\My Documents\Visual Studio 2005\Projects\VSMacros80' % user]
        # All backups are stored in this backup directory.
        target_directory = r'C:\Documents and Settings\%s\My Documents\Visual Studio 2005\pyBackup' % user
        # The zip file name format.
        targetFmt = target_directory + '\\%s ' + DateTime.Now.ToString("yyyy.MM.dd") + '.zip'
        # Run the backup
        pyFolderBackup(source, target_directory, skip, True, targetFmt).RunBackup()
    else:
        parser = CmdArgParser(sys.argv)
        pyFolderBackup(parser['source'], parser['target'], parser['skip'], parser['verbose'], parser['format']).RunBackup()
    # This allows the command window in Windows to stay open until the user hits enter.
    print ''
    raw_input('Hit enter to exit.')
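The change-detection idea used by the listing above (find the newest file modification time under each subfolder and compare it with the time of the last backup) can be sketched with CPython's standard library alone. This is an illustrative stand-in, not the author's code: the names newest_mtime and folders_needing_backup are hypothetical, timestamps are plain seconds since the epoch rather than System.DateTime values, and previous_backups is assumed to map a folder name to the time of its last backup.

```python
import os


def newest_mtime(directory):
    """Return the newest modification time (seconds since the epoch) of any file under directory."""
    newest = 0.0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            newest = max(newest, os.path.getmtime(os.path.join(root, name)))
    return newest


def folders_needing_backup(source_dir, previous_backups, skip_dirs=()):
    """List the immediate subfolders of source_dir that changed since their last backup.

    previous_backups maps a folder name to the timestamp of its last backup
    (an assumption of this sketch). Folders without an entry are always listed.
    """
    stale = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if not os.path.isdir(path) or path in skip_dirs:
            continue
        last = previous_backups.get(name)
        if last is None or newest_mtime(path) > last:
            stale.append(path)
    return stale
```

As in the listing, a folder that has never been backed up is always selected, and skip_dirs is compared by exact path, so a skipped path must be written exactly as the walk produces it.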

CmdLine.py
This is a simple class that makes parsing command-line options easier without requiring the CPython standard library.

"""Simple Command-Line Option Parsing Tool

This utility class will parse command-line options without validation. It does not require installation of the standard CPython library.

Version  Changes
0.5      Initial version in C#
0.6      Converted to IronPython
1.0      Added some test cases.
"""
__author__ = 'Dag R. Calafell, III'
__date__    = '2007-06-18'  # yyyy-mm-dd
__module_name__ = "CmdLine"
__short_cright__= " Creative Commons License" # http://creativecommons.org/licenses/by/3.0/
__version__ = '1.0'     # Human-Readable Version number
version_info = (1,0,0)  # Easier format: if version_info > (1,2,5)

import System
from System import Array
from System.Text.RegularExpressions import Regex
from System.Collections.Generic import Dictionary, List

class CmdArgParser(object):
    '''Very simple command-line argument/option parser.

    Usage:
        from CmdLine import CmdArgParser
        import sys
        cmds = CmdArgParser(sys.argv)
        # cmds = CmdArgParser(System.Environment.CommandLine)
        # cmds = CmdArgParser(('/tEst','/teSt:"fancy"', '/tester:""','/go+'))
        if not cmds:
            # No command-line arguments
            sys.exit(1)
        # /target="C:\"
        if cmds['target'] is None:
            print 'Please specify the target variable.'
            sys.exit(2)
        # cmds.Keys = ['test', 'tester', 'go']
        # cmds['test'] = [True, 'fancy']
        # cmds['tester'] = ''
        # cmds['go'] = True

    Notes:
        * The initializer can accept a list, tuple, array, string, or any .NET class which implements the GetEnumerator method.
        * All keys are converted to lower case.
        * All options must be preceded by a switch. For example, "myprogram.exe dag.py" would return no command-line options whereas "myprogram.exe -f:dag.py" would contain the 'f' key.
        * Command-line options may contain any character except new line \n, carriage return \r, pipe |, colon :, and double quote ".
        * When the value of an option is specified more than once, then the value for that option is a list of the values.
    '''
    def __init__(self, args):
        self.__params = Dictionary[str, list]()
        self.CommandLine = None
        # The regex needs to parse a single string, not a list
        if args == None:
            return
        elif type(args) != str and (type(args) in (list, tuple, Array) or hasattr(args, 'GetEnumerator')):
            # Strings are excluded first so that they are never joined character by character.
            args = ' '.join(str(i) for i in args)
        else:
            args = str(args)

        # Remove any arguments at the beginning which do not start with a switch character
        while args.count(' ') != 0 and args[0] not in ('/', '-'):
            args = args[args.index(' ') + 1:]

        if len(args) == 0:
            return

        self.CommandLine = args
        args += ' '

        # This same regex failed when using the builtin 're' module under IronPython 1.1.
        for m in Regex.Matches(args, "[/-]-?([^\"\\r\\n|:]+?)(?:[:=](\"?)([^\"\\r\\n|]*?)\\2)?\\s+"):
            if m.Success:
                g = m.Groups
                key = g[1].Value.lower()

                if g[3].Success:
                    val = g[3].Value
                elif g[2].Success and g[2].Value == '"':
                    val = ''
                else:
                    val = None

                # Handle pluses or minuses after a specifier
                if val == None:
                    if key[-1] == '-':
                        val = False
                        key = key[:-1]
                    elif key[-1] == '+':
                        val = True
                        key = key[:-1]
                    else:
                        val = True

                # Each key maps to a list so that repeated options keep every value.
                if self.__params.ContainsKey(key):
                    self.__params[key].append(val)
                else:
                    if type(val) != list:
                        val = [val]
                    self.__params.Add(key, val)

    @property
    def Keys(self):
        return list(self.__params.Keys)

    def __getitem__(self, name):
        name = str(name)
        if not self.__params.ContainsKey(name):
            return None
        obj = self.__params[name]
        if len(obj) == 0:
            return  # [] converted to [], so convert back
        if len(obj) == 1:
            return obj[0]
        return obj

    def __nonzero__(self):
        '''Allows boolean conversion for use such as:
            cmds = CmdLine.CmdArgParser(sys.argv)
            if not cmds:
                # No command-line arguments
                sys.exit(1)
        '''
        return self.__params.Count != 0

    def __len__(self):
        '''Returns the number of parsed parameters.'''
        return self.__params.Count

    def __hash__(self):
        '''Allows this instance to be placed into a dictionary.'''
        return self.__params.GetHashCode()

if __name__ == '__main__':
    # Tests the above class
    assert len(CmdArgParser(('/verbose'))) == 1, 'Test 1'
    a = CmdArgParser(('/tEst','/teSt:"fancy"', '/tester:""','/go+','/stop-','/safemode'))
    assert a != None, 'Test 2'
    assert len(a) == 5, 'Test 3'
    assert a['test'] != None, 'Test 4'
    assert a['go'] == True, 'Test 5'
    assert a['safemode'] == True, 'Test 6'
    assert a['stop'] == False, 'Test 7'
    assert len(CmdArgParser((''))) == 0, 'Test 8'
    assert len(CmdArgParser('')) == 0, 'Test 9'
    assert len(CmdArgParser('/verbose')) == 1, 'Test 10'
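To see how the switch pattern in CmdArgParser behaves without a .NET runtime, the same regular expression can be tried with CPython's re module. The sketch below is an approximation under stated assumptions: parse_switches is a hypothetical helper that returns a plain dict of lists, it skips the listing's step of discarding leading tokens that do not start with a switch, and it relies on modern CPython's re handling the pattern (the listing notes that IronPython 1.1's re did not).

```python
import re

# Same pattern as in CmdArgParser: switch character, key, then an optional
# ":" or "=" followed by an optionally quoted value, then whitespace.
SWITCH_RE = re.compile(r'[/-]-?([^"\r\n|:]+?)(?:[:=]("?)([^"\r\n|]*?)\2)?\s+')


def parse_switches(args):
    """Parse switches from a string or a sequence of strings into {key: [values]}."""
    if not isinstance(args, str):
        args = ' '.join(str(a) for a in args)
    args += ' '  # the pattern expects trailing whitespace after every option
    params = {}
    for m in SWITCH_RE.finditer(args):
        key = m.group(1).lower()
        val = m.group(3)  # None when no ":value" or "=value" part was given
        if val is None:
            # A trailing "+" or "-" turns the option into an explicit flag.
            if key.endswith('-'):
                key, val = key[:-1], False
            elif key.endswith('+'):
                key, val = key[:-1], True
            else:
                val = True
        params.setdefault(key, []).append(val)
    return params
```

Called with the tuple from the listing's own tests, ('/tEst', '/teSt:"fancy"', '/tester:""', '/go+', '/stop-', '/safemode'), this yields five keys, with 'test' holding both True and 'fancy' because the key appears twice.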

ICSharpCode.SharpZipLib.dll
The #ziplib open source C# library is used to handle zip file creation. "ICSharpCode.SharpZipLib.dll" must be included in the same directory as the above code files. It can be downloaded from icsharpcode.net. More information about how to use the library can be found on the #ziplib wiki.
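For readers who want to experiment with the archiving step without IronPython or the DLL, a recursive folder-to-zip step can be approximated with CPython's zipfile module. This is a stand-in for illustration only and does not use or mirror the SharpZipLib API; zip_folder is a hypothetical helper, and it assumes the archive is written outside the folder being archived.

```python
import os
import zipfile


def zip_folder(source_dir, zip_path):
    """Write every file under source_dir into zip_path, with paths stored relative to source_dir."""
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store a relative name so the archive does not embed the absolute path.
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path
```

Empty subfolders are not recorded by this sketch, since only files are written.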

Output
The output reports which actions were taken during the backup process. Example output:

Most Recent Backups
  IconExplorer             05/31/2007
  pyInterfaceHelper        06/11/2007
  Utilities                06/08/2007

IconExplorer backup created on 05/31/2007 is up to date.
pyInterfaceHelper backup created on 06/11/2007 is up to date.
Utilities 2007.06.18.zip created.

Hit enter to exit.

Usage
The easiest way to use this utility is to create a Windows batch file or shortcut.

Shortcut
A sample shortcut may have the following properties set, assuming that the script is placed in the "C:\Program Files\IronPython-1.1\Lib" directory.

Target: "C:\Program Files\IronPython-1.1\ipy.exe" "C:\Program Files\IronPython-1.1\Lib\BackupFolders.py"

Start In: "C:\Program Files\IronPython-1.1"

Batch File
The following example batch file, "RunBackup.bat", runs the backup utility.

@echo off

"C:\Program Files\IronPython-1.1\ipy.exe" "C:\Program Files\IronPython-1.1\Lib\BackupFolders.py" /source:"C:\Documents and Settings\dcalafell\My Documents\Visual Studio 2005\Projects" /target:"C:\Documents and Settings\dcalafell\My Documents\Visual Studio 2005" /verbose

@echo.
pause

Configuration
The default configuration backs up the Visual Studio 2005 projects and websites folders for the current user and skips the 'VSMacros80' project.

The source_directories are the directories which the script walks to create backups.

The destination_directory is the directory where all backups are stored. This could be a network share or removable flash drive.

The skip_directories are the directories to not archive when walking the source directories. For example, these projects may already be part of a source control repository so there is no point in making another backup.

When the verbose option is True the script will print more information.

By default the utility keeps at most one backup per folder per day, because the default archive name contains only the date. Supplying a different target_format (a parameter of pyFolderBackup.__init__) changes this granularity, down to a millisecond if desired. Finer granularity means more archives to store, so there is a balance between the benefit of having backups and how many must be kept.
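How the archive name controls granularity can be illustrated with a small CPython sketch. The names make_target_format and STAMP_PATTERNS are hypothetical and not part of the utility; the only thing taken from the listing is the shape of target_format, a path containing exactly one '%s' for the folder name followed by a timestamp. Note that CompilePreviousBackupTimes in the listing only recognizes archive names ending in a "yyyy.MM.dd.zip" stamp, so it would likely need its pattern adjusted before other stamps are tracked as previous backups.

```python
import os
from datetime import datetime

# Hypothetical granularity names mapped to strftime patterns (assumptions of this sketch).
STAMP_PATTERNS = {
    'year': '%Y',
    'month': '%Y.%m',
    'day': '%Y.%m.%d',          # matches the utility's default "yyyy.MM.dd" stamp
    'minute': '%Y.%m.%d %H.%M',
}


def make_target_format(dest_dir, granularity='day', now=None):
    """Build a target_format string with exactly one '%s' placeholder for the folder name.

    dest_dir itself must not contain '%', or the later "% name" substitution would fail.
    """
    now = now or datetime.now()
    stamp = now.strftime(STAMP_PATTERNS[granularity])
    return os.path.join(dest_dir, '%s ' + stamp + '.zip')
```

Two runs within the same stamp period produce the same file name, which is what limits the utility to one archive per period; a coarser stamp therefore keeps fewer archives.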

Notes / Limitations
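The utility relies on the creation time of each backup archive, which __CreateBackup sets to the most recent file modification time in the backed-up folder, to decide whether a folder has changed since its last backup. It has not been tested on Mono.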

Version History
1.0	First publicly-available version - Fixed bugs and changed to an IronPython version of CmdLine.

0.9	Changed the CompilePreviousBackupTimes method to look at the creation time for the backup archive instead of parsing the file name with a regular expression. The __CreateBackup method signature was changed as a fix for when the archive file name already existed and the creation date was not modified. Added more validation to pyFolderBackup.__init__.

0.6	Encapsulated all functionality into a class declaration while maintaining command-line execution functionality.

0.5	Initial version
