Folder Backup Utility
From IronPython Cookbook
This is an example of a folder backup utility that also includes a command-line option parser. Although originally designed to back up Visual Studio project folders, it walks the subfolders of a set of directories and creates a zip file for each folder that contains one or more modified files. It uses the open source SharpZipLib to create the zip archives.
Features
- Archives entire folders into .zip files.
- Does not back up folders which have not changed.
- Easy to set up, use, and configure.
- Ability to skip certain folders.
- Accepts command-line parameters.
- Ability to keep one backup per minute, day, week, or year.
- Great way to back up all those Microsoft Visual Studio projects without any hassle.
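The change-detection rule behind the second feature can be sketched in a few lines. This is only an illustration that uses the standard Python library instead of the .NET calls in the script; the names newest_mtime, folders_to_back_up, and last_backup are hypothetical and do not appear in the utility itself.

```python
import os


def newest_mtime(folder):
    """Return the newest modification time (seconds since the epoch) of any
    file under folder, or 0.0 when the tree contains no files."""
    newest = 0.0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            newest = max(newest, os.path.getmtime(os.path.join(root, name)))
    return newest


def folders_to_back_up(source_dir, last_backup, skip=()):
    """List the immediate subfolders of source_dir that need a new archive.

    last_backup maps a folder name to the time of its latest archive
    (hypothetical bookkeeping; the script derives it from the .zip files).
    """
    pending = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if not os.path.isdir(path) or path in skip:
            continue
        # Back up when no archive exists yet or a file changed afterwards.
        if name not in last_backup or newest_mtime(path) > last_backup[name]:
            pending.append(name)
    return pending
```

A folder is archived again only when one of its files is newer than its latest archive, which is why unchanged folders are skipped.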
Code
The following code files (BackupFolders.py, CmdLine.py, ICSharpCode.SharpZipLib.dll) must be placed within the same directory.
BackupFolders.py
"""Folder-to-Archive Backup Tool

This utility will back up the directories that you indicate into archives
based on the directory names.

Version Changes
0.5 Initial version
0.6 Encapsulated all functionality into a class declaration while
    maintaining command-line execution functionality.
0.9 Changed the CompilePreviousBackupTimes method to look at the
    creation time for the backup archive instead of parsing the
    file name with a regular expression. Changed the __CreateBackup
    method signature to fix the case where the archive file name
    already existed and the creation date was not modified. Added
    more validation to pyFolderBackup.__init__.
1.0 Fixed bugs and changed to an IronPython version of CmdLine.
"""
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
__author__ = 'Dag R. Calafell, III'
__date__ = '2007-05-31' # yyyy-mm-dd
__module_name__ = "Folder-to-Archive Backup Tool"
__short_cright__= " Creative Commons License" # http://creativecommons.org/licenses/by/3.0/
__version__ = '1.0' # Human-Readable Version number
version_info = (1,0,0) # Easier format: if version_info > (1,2,5)

import re
import sys
import clr
clr.AddReference("ICSharpCode.SharpZipLib")
from ICSharpCode.SharpZipLib.Zip import FastZip
from ICSharpCode.SharpZipLib import SharpZipBaseException
from CmdLine import CmdArgParser
from System import Array, DateTime, Environment
from System.IO import Path, File, Directory, SearchOption

class pyFolderBackup:
    """Class encapsulates all the functionality needed to create and manage backups of directories.
    It creates new backups only when a file within the directory has changed."""

    def __init__(self, source_directories, destination_directory, skip_directories=None, verbose=False, target_format=None):
        # Validation
        if destination_directory == None:
            raise ValueError, 'Please specify an archive directory. %s' % destination_directory
        elif type(destination_directory) != str and hasattr(destination_directory, '__getitem__'):
            destination_directory = destination_directory[0]
            print 'Warning: Only one archive directory is supported. Script will use only the first directory in the list.'
        elif not Directory.Exists(str(destination_directory)):
            raise ValueError, 'The destination directory does not exist. %s' % destination_directory
        # A single directory passed as a string (for example, one /source
        # switch on the command line) is treated as a one-item list, so
        # that it is not iterated character by character later on.
        if type(source_directories) == str:
            source_directories = [source_directories]
        if type(skip_directories) == str:
            skip_directories = [skip_directories]
        if not hasattr(source_directories, '__getitem__'):
            raise TypeError, 'The source_directories parameter must be enumerable. A \'%s\' was passed.' % type(source_directories)
        if skip_directories != None and not hasattr(skip_directories, '__getitem__'):
            raise TypeError, 'The skip_directories parameter must be enumerable. A \'%s\' was passed.' % type(skip_directories)
        if type(destination_directory) != str:
            raise TypeError, 'The destination_directory parameter must be a string. A \'%s\' was passed.' % type(destination_directory)
        if verbose == None:
            verbose = True
        if target_format != None:
            if type(target_format) != str:
                raise TypeError, 'The target_format parameter must be a string. A \'%s\' was passed.' % type(target_format)
            elif len(target_format.split('%s')) != 2:
                raise ValueError, 'The target_format parameter must be a string with exactly one occurrence of \'%s\', where the project name is substituted. \'' + str(target_format) + '\' was passed.'
        if source_directories == None or len(source_directories) == 0:
            raise ValueError, 'No source directories to process.'

        self.VerboseMode = bool(verbose)
        self.SourceDirs = source_directories
        self.DestDir = destination_directory
        self.__fz = FastZip()
        if skip_directories == None:
            self.SkipDirs = []
        else:
            self.SkipDirs = skip_directories
        if target_format == None or len(target_format) == 0:
            # Default archive name: '<destination>\<project> yyyy.MM.dd.zip'
            slash = Path.DirectorySeparatorChar
            if destination_directory[-len(slash)] == slash:
                slash = ''
            self.TargetFormat = destination_directory + slash + '%s ' + DateTime.Now.ToString("yyyy.MM.dd") + '.zip'
        else:
            self.TargetFormat = target_format

    # -------------------- Functions --------------------
    def GetMostRecent(self, directory):
        """Determines the most recent modified date within all files of a directory.
        Input:
            directory: The directory to walk.
        Output:
            The most recent date & time at which a file within the directory structure has changed.
        """
        mostRecent = DateTime(1990, 1, 1)
        for f in Directory.GetFiles(directory, '*.*', SearchOption.AllDirectories):
            last = File.GetLastWriteTime(f)
            if mostRecent < last:
                mostRecent = last
        return mostRecent

    def __CreateBackup(self, directory, creation):
        """Creates a zip backup of a directory.
        Input:
            directory: The directory to back up.
            creation: The date and time to set as the creation time of the
                      archive. RunBackup passes the most recent modification
                      time found in the directory.
        Output:
            None
        """
        if type(directory) != str:
            raise TypeError, 'The directory parameter must be a string. A \'%s\' was passed.' % type(directory)
        if type(creation) != DateTime:
            raise TypeError, 'The creation parameter must be of type System.DateTime. A \'%s\' was passed.' % type(creation)
        # Create the backup
        proj = directory.split('\\')[-1]
        target = self.TargetFormat % proj
        self.__fz.CreateZip(target, directory, True, '.*')
        # Stamp the archive so later runs can compare it against file changes.
        File.SetCreationTime(target, creation)
        print Path.GetFileName(target) + ' created.'

    def CompilePreviousBackupTimes(self):
        """Create a dictionary of backup times for each project.
        Input:
            None
        Output:
            A dictionary of the most recent backup date by directory name.
        """
        reBackupInfo = re.compile(r"([^\\]+)\s(\d{4}.\d{2}.\d{2})\.zip")
        previousBackups = {}
        try:
            files = Directory.GetFiles(self.DestDir, '*.zip')
        except Exception, e:
            print '%s' % e
            print ''
            print 'Archive Directory: %s' % self.DestDir
            return
        for f in files:
            m = reBackupInfo.search(f)
            if m:
                projName = m.group(1)
                dte = m.group(2)
                filenameDate = DateTime(int(dte[0:4]), int(dte[5:7]), int(dte[8:10]))
                creationDate = File.GetCreationTime(f)
                if self.VerboseMode and filenameDate.Day != creationDate.Day and filenameDate.Month != creationDate.Month and filenameDate.Year != creationDate.Year:
                    print 'Warning: The filename \'%s\' does not agree with the date that the file was created. (%s != %s)' % (Path.GetFileName(f), filenameDate, creationDate)
                if projName not in previousBackups or previousBackups[projName] < creationDate:
                    previousBackups[projName] = creationDate
        return previousBackups

    def RunBackup(self):
        """Runs a backup.
        Input:
            None
        Output:
            None
        """
        # Dictionary of backup times for each project.
        prevBackups = self.CompilePreviousBackupTimes()
        if self.VerboseMode:
            print ''
            print 'Most Recent Backups'
            if prevBackups != None:
                for k in prevBackups.keys():
                    print ' %-25s %s' % (k,prevBackups[k].ToString('MM/dd/yyyy'))
                print ''
            else:
                print ' None'
        # CompilePreviousBackupTimes returns None when the archive directory
        # could not be read. Treat that as "no previous backups" so the
        # membership tests below do not fail.
        if prevBackups == None:
            prevBackups = {}
        # Loop through the projects to back up
        for x in self.SourceDirs:
            if Directory.Exists(x):
                for proj in Directory.GetDirectories(x):
                    if proj in self.SkipDirs:
                        continue
                    mostRecent = self.GetMostRecent(proj)
                    projName = proj.split('\\')[-1]
                    # Test if this project has already been backed up
                    if not projName in prevBackups or mostRecent > prevBackups[projName]:
                        # print '\nmostRecent > prevBackups[projName]\n%s > %s\n' % (mostRecent, prevBackups[projName])
                        self.__CreateBackup(proj, mostRecent)
                    elif self.VerboseMode:
                        print '%s backup created on %s is up to date.' % (projName, prevBackups[projName].ToString('MM/dd/yyyy'))

if __name__ == '__main__':
    if len(sys.argv) == 1:
        print 'Backup Visual Studio Projects'
        user = Environment.GetEnvironmentVariable('username')
        # Directories to be backed up.
        source = [r'C:\Documents and Settings\%s\My Documents\Visual Studio 2005\Projects' % user,
                  r'C:\Documents and Settings\%s\My Documents\Visual Studio 2005\Websites' % user]
        # Directories to skip (do not back up).
        skip = [r'C:\Documents and Settings\%s\My Documents\Visual Studio 2005\Projects\VSMacros80' % user]
        # All backups are stored in this backup directory.
        target_directory = r'C:\Documents and Settings\%s\My Documents\Visual Studio 2005\pyBackup' % user
        # The zip file name format.
        targetFmt = target_directory + '\\%s ' + DateTime.Now.ToString("yyyy.MM.dd") + '.zip'
        # Run the backup
        pyFolderBackup(source, target_directory, skip, True, targetFmt).RunBackup()
    else:
        parser = CmdArgParser(sys.argv)
        pyFolderBackup(parser['source'], parser['target'], parser['skip'], parser['verbose'], parser['format']).RunBackup()
    # This allows the command window in Windows to stay open until the user hits enter.
    print ''
    raw_input('Hit enter to exit.')
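The archive naming convention that CompilePreviousBackupTimes relies on can be tried out in isolation. The sketch below uses the standard re module and a pattern of the same shape as the one in the listing, with three assumed adjustments: the dots in the date are escaped, forward slashes are also treated as path separators, and the match is anchored at the end of the name. The function name parse_archive_name is hypothetical.

```python
import re
from datetime import date

# Same shape as the pattern in CompilePreviousBackupTimes; the escaped dots,
# the '/' in the character class and the '$' anchor are assumptions added here.
ARCHIVE_NAME = re.compile(r"([^\\/]+)\s(\d{4})\.(\d{2})\.(\d{2})\.zip$")


def parse_archive_name(path):
    """Return (project_name, date) for an archive path, or None if the
    file name does not follow the '<project> yyyy.MM.dd.zip' convention."""
    m = ARCHIVE_NAME.search(path)
    if m is None:
        return None
    year, month, day = (int(g) for g in m.groups()[1:])
    return m.group(1), date(year, month, day)
```

Archives whose names do not follow the convention are ignored, so a project without a matching archive is treated as never backed up.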
CmdLine.py
This is a simple class to allow for easier parsing of command-line options without installing CPython.
"""Simple Command-Line Option Parsing Tool

This utility class will parse command-line options without validation.
It does not require installation of the standard CPython library.

Version Changes
0.5 Initial version in C#
0.6 Converted to IronPython
1.0 Added some test cases.
"""
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
__author__ = 'Dag R. Calafell, III'
__date__ = '2007-06-18' # yyyy-mm-dd
__module_name__ = "CmdLine"
__short_cright__= " Creative Commons License" # http://creativecommons.org/licenses/by/3.0/
__version__ = '1.0' # Human-Readable Version number
version_info = (1,0,0) # Easier format: if version_info > (1,2,5)

import System
from System import Array
from System.Text.RegularExpressions import Regex
from System.Collections.Generic import Dictionary, List

class CmdArgParser(object):
    r'''Very simple command-line argument/option parser. Usage:

        from CmdLine import CmdArgParser
        import sys
        cmds = CmdArgParser(sys.argv)
        # cmds = CmdArgParser(System.Environment.CommandLine)
        # cmds = CmdArgParser(('/tEst','/teSt:"fancy"', '/tester:""','/go+'))
        if not cmds:
            # No command-line arguments
            sys.exit(1)
        # /target="C:\"
        if cmds['target'] is None:
            print 'Please specify the target variable.'
            sys.exit(2)
        # cmds.Keys = ['test', 'tester', 'go']
        # cmds['test'] = [True, 'fancy']
        # cmds['tester'] = ''
        # cmds['go'] = True

    Notes:
    * The initializer can accept a list, tuple, array, string, or any .NET class which implements
      the GetEnumerator method.
    * All keys are converted to lower case.
    * All options must be preceded by a switch. For example, "myprogram.exe dag.py" would
      return no command-line options whereas "myprogram.exe -f:dag.py" would contain the 'f' key.
    * Command-line options may contain any character except new line \n, carriage return \r,
      pipe |, colon :, and double quote ".
    * When the value of an option is specified more than once, then the value for that option
      is a list of the values.
    '''

    def __init__(self, args):
        self.__params = Dictionary[str, list]()
        # The regex needs to parse a single string, not a list
        if args == None:
            self.CommandLine = None
            return
        elif type(args) != str and (type(args) in (list, tuple, Array) or hasattr(args, 'GetEnumerator')):
            # Strings are excluded above so that they are never joined
            # character by character.
            args = ' '.join(str(i) for i in args)
        else:
            args = str(args)
        # Remove any arguments at the beginning which do not start with a switch character
        while args.count(' ') != 0 and args[0] not in ('/', '-'):
            args = args[args.index(' ') + 1:]
        if len(args) == 0:
            return
        self.CommandLine = args
        args += ' '
        # This same regex failed when using the builtin 're' module under IronPython 1.1.
        for m in Regex.Matches(args, "[/-]-?([^\"\\r\\n|:]+?)(?:[:=](\"?)([^\"\\r\\n|]*?)\\2)?\\s+"):
            if m.Success:
                g = m.Groups
                key = g[1].Value.lower()
                if g[3].Success:
                    val = g[3].Value
                elif g[2].Success and g[2].Value == '"':
                    val = ''
                else:
                    val = None
                # Handle pluses or minuses after a specifier
                if val == None:
                    if key[-1] == '-':
                        val = False
                        key = key[:-1]
                    elif key[-1] == '+':
                        val = True
                        key = key[:-1]
                    else:
                        val = True
                if self.__params.ContainsKey(key):
                    self.__params[key].append(val)
                else:
                    if type(val) != list:
                        val = [val]
                    self.__params.Add(key, val)

    @property
    def Keys(self):
        return list(self.__params.Keys)

    def __getitem__(self, name):
        name = str(name)
        if not self.__params.ContainsKey(name):
            return None
        obj = self.__params[name]
        if len(obj) == 0:
            return '' # [''] converted to [], so convert back
        if len(obj) == 1:
            return obj[0]
        return obj

    def __nonzero__(self):
        '''Allows boolean conversion for use such as:
            cmds = CmdLine.CmdArgParser(sys.argv)
            if not cmds:
                # No command-line arguments
                sys.exit(1)
        '''
        return self.__params.Count != 0

    def __len__(self):
        '''Returns the number of parsed parameters.'''
        return self.__params.Count

    def __hash__(self):
        '''Allows this instance to be placed into a dictionary.'''
        return self.__params.GetHashCode()

if __name__ == '__main__':
    # Tests the above class
    assert len(CmdArgParser(('/verbose'))) == 1, 'Test 1'
    a = CmdArgParser(('/tEst','/teSt:"fancy"', '/tester:""','/go+','/stop-','/safemode'))
    assert a != None, 'Test 2'
    assert len(a) == 5, 'Test 3'
    assert a['test'] != None, 'Test 4'
    assert a['go'] == True, 'Test 5'
    assert a['safemode'] == True, 'Test 6'
    assert a['stop'] == False, 'Test 7'
    assert len(CmdArgParser((''))) == 0, 'Test 8'
    assert len(CmdArgParser('')) == 0, 'Test 9'
    assert len(CmdArgParser('/verbose')) == 1, 'Test 10'
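For readers without IronPython at hand, the switch syntax accepted by CmdArgParser can be explored with the standard re module. The following is a rough sketch, not a port of the class: it reuses the regular expression from the listing, and the function name parse_switches and the plain dict it returns are assumptions made for illustration.

```python
import re

# The pattern from CmdLine.py, written as a raw string for the re module.
SWITCH = re.compile(r'[/-]-?([^"\r\n|:]+?)(?:[:=]("?)([^"\r\n|]*?)\2)?\s+')


def parse_switches(command_line):
    """Parse '/key', '/key+', '/key-' and '/key:value' switches into a dict.

    Keys are lower-cased; a key given more than once maps to a list.
    """
    options = {}
    # A trailing space lets the final switch satisfy the \s+ at the end.
    for m in SWITCH.finditer(command_line + ' '):
        key, value = m.group(1).lower(), m.group(3)
        if value is None:
            # Bare switch: a trailing '-' means False, anything else True.
            value = not key.endswith('-')
            if key[-1] in '+-':
                key = key[:-1]
        if key in options:
            previous = options[key]
            if not isinstance(previous, list):
                previous = [previous]
            options[key] = previous + [value]
        else:
            options[key] = value
    return options
```

Parsing the switches used in the tests above yields one key per distinct option name, with the repeated test option collected into a list.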
ICSharpCode.SharpZipLib.dll
The #ziplib open source C# library was used to handle zip file creation. "ICSharpCode.SharpZipLib.dll" must be included in the same directory as the above code files. It can be downloaded from icsharpcode.net. More information about how to use the library can be found on the #ziplib wiki.
Output
Example output specifies what actions were taken during the backup process.
Most Recent Backups
 IconExplorer              05/31/2007
 pyInterfaceHelper         06/11/2007
 Utilities                 06/08/2007

IconExplorer backup created on 05/31/2007 is up to date.
pyInterfaceHelper backup created on 06/11/2007 is up to date.
Utilities 2007.06.18.zip created.

Hit enter to exit.
Usage
The easiest way to use this utility is to create a Windows batch file or shortcut.
Shortcut
A sample shortcut may have the following properties set, assuming that the script is placed in the "C:\Program Files\IronPython-1.1\Lib" directory.
Target: "C:\Program Files\IronPython-1.1\ipy.exe" "C:\Program Files\IronPython-1.1\Lib\BackupFolders.py"
Start In: "C:\Program Files\IronPython-1.1"
Batch File
The following is an example batch file, "RunBackup.bat", which runs the backup utility.
@echo off
"C:\Program Files\IronPython-1.1\ipy.exe" "C:\Program Files\IronPython-1.1\Lib\BackupFolders.py" /source:"C:\Documents and Settings\dcalafell\My Documents\Visual Studio 2005\Projects" /target:"C:\Documents and Settings\dcalafell\My Documents\Visual Studio 2005" /verbose
@echo.
pause
Configuration
The default configuration backs up the Visual Studio 2005 projects and websites folders for the current user and ignores the 'VSMacros80' project.
The source_directories are the directories which the script walks to create backups.
The destination_directory is the directory where all backups are stored. This could be a network share or removable flash drive.
The skip_directories are the directories to not archive when walking the source directories. For example, these projects may already be part of a source control repository so there is no point in making another backup.
When the verbose option is True the script will print more information.
By default, the script keeps at most one backup of each project per day, because the default archive name contains only the date. By changing the target_format parameter of pyFolderBackup.__init__, you can make the backups as granular as a millisecond. There is a trade-off between the benefit of having more backups and the number of archives that must be stored.
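As a rough illustration of a finer-grained target_format, the sketch below builds a per-minute archive name with the standard Python library. The helper make_target_format and its stamp parameter are hypothetical. Note also that the regular expression in CompilePreviousBackupTimes expects names ending in yyyy.MM.dd.zip, so a different time stamp presumably requires adjusting that pattern as well.

```python
import os
from datetime import datetime


def make_target_format(destination_directory, when, stamp='%Y.%m.%d %H.%M'):
    """Build a target_format string with exactly one '%s' placeholder for
    the project name. stamp is a strftime pattern (hypothetical parameter)."""
    name = '%s ' + when.strftime(stamp) + '.zip'
    fmt = os.path.join(destination_directory, name)
    # Same check that pyFolderBackup.__init__ applies to target_format.
    if len(fmt.split('%s')) != 2:
        raise ValueError("target_format needs exactly one '%s': " + fmt)
    return fmt
```

The resulting string would be passed as the target_format argument, and the class substitutes the project name for the placeholder when it creates each archive.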
Notes / Limitations
The utility assumes that the modified date on the backup archive is when the backup was created. It has not been tested on Mono.
Version History
- 1.0 First publicly available version. Fixed bugs and changed to an IronPython version of CmdLine.
- 0.9 Changed the CompilePreviousBackupTimes method to look at the creation time for the backup archive instead of parsing the file name with a regular expression. Changed the __CreateBackup method signature to fix the case where the archive file name already existed and the creation date was not modified. Added more validation to pyFolderBackup.__init__.
- 0.6 Encapsulated all functionality into a class declaration while maintaining command-line execution functionality.
- 0.5 Initial version

