TZWorks LLC
System Analysis and Programming
www.tzworks.com
(Version 0.59)
TZWorks LLC software and related documentation ("Software") is governed by separate licenses issued from TZWorks LLC. The User Agreement, Disclaimer, and/or Software may change from time to time. By continuing to use the Software after those changes become effective, you agree to be bound by all such changes. Permission to use the Software is granted provided that (1) use of such Software is in accordance with the license issued to you and (2) the Software is not resold, transferred or distributed to any other person or entity. Refer to the specific EULA issued to you for the terms and conditions. There are 3 types of licenses available: (i) for educational purposes, (ii) for demonstration and testing purposes and (iii) for business and/or commercial purposes. Contact TZWorks LLC (info@tzworks.com) for more information regarding licensing and/or to obtain a license. To redistribute the Software, prior approval in writing is required from TZWorks LLC. The terms in your specific EULA do not give the user any rights in intellectual property or technology, but only a limited right to use the Software in accordance with the license issued to you. TZWorks LLC retains all rights to ownership of this Software.
The Software is subject to U.S. export control laws, including the U.S. Export Administration Act and its associated regulations. The Export Control Classification Number (ECCN) for the Software is 5D002, subparagraph C.1. The user shall not, directly or indirectly, export, re-export or release the Software to, or make the Software accessible from, any jurisdiction or country to which export, re-export or release is prohibited by law, rule or regulation. The user shall comply with all applicable U.S. federal laws, regulations and rules, and complete all required undertakings (including obtaining any necessary export license or other governmental approval), prior to exporting, re-exporting, releasing, or otherwise making the Software available outside the U.S.
The user agrees that this Software made available by TZWorks LLC is experimental in nature and use of the Software is at user's sole risk. The Software could include technical inaccuracies or errors. Changes are periodically added to the information herein, and TZWorks LLC may make improvements and/or changes to Software and related documentation at any time. TZWorks LLC makes no representations about the accuracy or usability of the Software for any purpose.
ALL SOFTWARE IS PROVIDED "AS IS" AND "WHERE IS" WITHOUT WARRANTY OF ANY KIND INCLUDING ALL IMPLIED WARRANTIES AND CONDITIONS OF MERCHANTABILITY, FITNESS FOR ANY PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT SHALL TZWORKS LLC BE LIABLE FOR ANY KIND OF DAMAGE RESULTING FROM ANY CAUSE OR REASON, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF INFORMATION AVAILABLE FROM THIS SOFTWARE, INCLUDING BUT NOT LIMITED TO ANY DAMAGES FROM ANY INACCURACIES, ERRORS, OR VIRUSES, FROM OR DURING THE USE OF THE SOFTWARE.
The Software is the original work of TZWorks LLC. However, to be in compliance with the Digital Millennium Copyright Act of 1998 ("DMCA"), we agree to investigate and disable any material found to infringe copyright. Contact TZWorks LLC at email address: info@tzworks.com, regarding any DMCA concerns.
wisp is a command line version of a Windows parser that targets NTFS index type attributes. The NTFS index attribute points to one or more INDX records. These records contain index entries that are used to account for each item in a directory. An index item represents either a file or a subdirectory, and includes enough metadata to contain the name, modified/access/MFT changed/birth (MACB) timestamps, size (if it is a file versus subdirectory), as well as MFT entry numbers of the item and its parent. The wisp tool, in its simplest form, is able to walk these structures, read the metadata, and report which index entries are present.
As a directory's contents are changed, the number of valid index entries grows or shrinks, as appropriate. As more directory entries are added, eventually they will exceed the existing INDX record allocation space. At this point, the operating system will allocate an additional INDX record in a 0x1000 byte chunk. Conversely, when entries are removed from the directory, the INDX record space is not necessarily deallocated. Thus, any time the number of index entries shrinks, the invalidated ones can potentially be harvested from the slack space. The slack space is defined to be the allocated, but unused, space. By comparing both the valid entries and those still in the slack space, one can make some inferences about whether a file (or subdirectory) was present in the past.
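To make the allocation mechanics concrete, the Python sketch below computes the slack region of a single 0x1000-byte INDX record from its index header. The field offsets follow the commonly documented NTFS INDX layout (three 32-bit values at offset 0x18: entries offset, used size, allocated size, each relative to the index header); the synthetic record and its sizes are fabricated purely for illustration:

```python
import struct

def indx_slack_region(record: bytes):
    """Given one INDX record, return (start, end) byte offsets of the
    slack region (allocated but unused space).

    Assumed layout (standard NTFS INDX record):
      0x00  magic 'INDX'
      0x18  index header: entries offset, used size, allocated size
            (u32 each), all relative to the index header at 0x18.
    """
    if record[0:4] != b"INDX":
        raise ValueError("not an INDX record")
    entries_off, used, allocated = struct.unpack_from("<III", record, 0x18)
    slack_start = 0x18 + used       # first byte past the valid entries
    slack_end = 0x18 + allocated    # end of the allocated space
    return slack_start, slack_end

# Synthetic record: entries use 0x200 bytes of the 0xFE8 bytes allocated
rec = bytearray(0x1000)
rec[0:4] = b"INDX"
struct.pack_into("<III", rec, 0x18, 0x28, 0x200, 0xFE8)
start, end = indx_slack_region(bytes(rec))
print(hex(start), hex(end))  # 0x218 0x1000
```

Everything between `start` and `end` is the slack space from which stale index entries may be carved.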
A good tutorial on harvesting index entries from INDX slack space can be found on Willi Ballenthin's webpage [4] and his DFIRonline presentation [5].
wisp uses the NTFS and index attribute parsing engine that is used in the ntfswalk tool [6] available from the TZWorks LLC website. Currently, there are compiled versions for Windows, Linux and Mac OS-X.
To use this tool, an authentication file is required to be in the same directory as the binary in order for the tool to run.
While the wisp tool doesn't require one to run with administrator privileges, not doing so restricts one to looking only at off-line 'dd' images. Therefore, to perform live processing of volumes, one needs to launch the command prompt with administrator privileges.
One can display the menu options by typing the executable name without parameters. Below is the menu with the various options; details of each option are discussed in the sections that follow.
Usage
 wisp [source of NTFS volume] [item to analyze] [output options]

 [source of NTFS volume options]
  -path <dir>                                 = analyze directory
  -image <file> [-offset <vol>] [-path <dir>] = src is dd image
  -partition <drv letter> [-path <dir>]
  -drivenum <#> [-offset <vol>] [-path <dir>] = phys drive#
  -vmdk "disk1 | ..." [-path <dir>]           = VMWare disk(s)
  -indxfile <indx datafile>                   = INDX file
  -vss <num>                                  = Vol Shadow

 [suboption for above src opts - cannot be used with (-indxfile)]
  -mft <dir inode to analyze>

 Basic options
  -csv       = output in comma separated value format
  -csvl2t    = log2timeline output
  -bodyfile  = sleuthkit output
  -valid     = output only valid entries (default)

 Additional options
  -nodups                 = output w/ no duplicate records
  -slack                  = output only slack entries
  -all                    = output both valid and slack entries
  -level <num>            = recurse # levels [default is 0 levels]
  -base10                 = output in base10 vice hex
  -username <name>        = output will contain this username
  -hostname <name>        = output will contain this hostname
  -dateformat mm/dd/yyyy  = "yyyy-mm-dd" is the default
  -timeformat hh:mm:ss    = "hh:mm:ss.xxx" is the default
  -pair_datetime          = combine date/time into 1 field for csv
  -no_whitespace          = remove whitespace between csv delimiter
  -quiet                  = don't display status during run
  -csv_separator "|"      = use a pipe char for csv separator
  -hexdump                = incl hex dump [not for -csv options]
From the available options, one can process NTFS INDX records with a handful of 'use-cases'. Specifically, wisp allows processing from any of these sources: (a) live volume, (b) 'dd' type image, (c) VMWare volume or (d) separately extracted INDX type record.
After selecting the source of the data, one can either: (a) process a single directory on the file system, (b) recursively process the subdirectories to some specified level, or (c) process all the index entries in an entire volume. Processing every directory in the entire volume is not explicitly shown in the above menu since it is the default option.
If one only wants a certain type of index entry, one can select: (a) just show valid index entries, (b) just show index entries in the slack space, or (c) both. For default output, one can select nothing. This will output the data in unstructured text. If parsable output is desired (or something that can be displayed in a spreadsheet application), one can select from 3 options that allow for structured output (CSV, log2timeline CSV, or SleuthKit body-file). The other useful option is the 'no duplicates' choice to minimize any redundancy in the output.
To parse INDX entries from a live NTFS volume (or partition), one has two choices: (a) specify the volume directly by using the -partition <drive letter> option or (b) specify the drive number and volume offset by using the -drivenum <num> -offset <volume offset> option. Either choice accomplishes the same task. The first choice is more straightforward and easier to use. The second choice, while more complex, allows one to target hidden NTFS partitions that do not have a drive letter.
The next step is to decide what you want to target. The choices are: (a) a specific directory on the file system (specified by either the -mft or -path options), or (b) a collection of subdirectories within a directory (how deep you wish to go is specified by the -level option). A couple examples are shown below:
wisp -path c:\$Recycle.Bin -level 2 -all -csv > results1.csv

wisp -partition c -all -csv > results2.csv

wisp -drivenum 0 -offset 0x100000 -all -csv > results3.csv
The first example targets the hidden directory of c:\$Recycle.Bin, and the -level 2 switch tells wisp to include any subdirectory in the analysis, up to 2 levels deep. The -all switch means both valid and invalid (slack) entries will be included in the output. Finally, the output is redirected to a file and the format is CSV.
The second example uses the same output options as the first, but now targets the 'c' partition. Since no -mft or -path option is explicitly listed, wisp will traverse the entire volume, parsing all INDX records associated with the volume.
The third example uses the same output options as the second, but now targets the first physical hard drive. The hex value 0x100000 is specified as the offset to the volume (or partition) to analyze. For this example, this happens to be the hidden partition created during a Windows 7 installation. Since no -mft or -path option is explicitly listed, wisp will traverse the entire volume, parsing all INDX records associated with the volume.
To process an image that has been already acquired and is in the 'dd' format, one uses the -image switch. This option can be used in two flavors. If the image is of an entire drive then one needs to explicitly specify the offset of the location of the volume you wish to target. On the other hand, if the image is only of a volume, then you do not need to specify the offset of the volume (since it is presumed to be at offset 0).
For the first case, where an offset needs to be explicitly specified, wisp will help the user locate the NTFS volume offsets. If one issues the -image command without an offset, and there is not an NTFS volume at offset 0 (i.e., the image is not a volume image), wisp will proceed to look at the master boot record contained in the image, determine where the NTFS partitions are, and report them to the user. This behavior is meant to be an aid to the user, so that one does not need to resort to other tools to determine where the offsets for the NTFS volumes are in an image.
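The MBR scan wisp performs can be approximated in a few lines of Python. This is only an illustrative sketch, assuming 512-byte sectors and a classic (non-GPT) partition table; the synthetic MBR below places one NTFS partition at LBA 2048, which yields the familiar 0x100000 byte offset used in the examples above:

```python
import struct

SECTOR = 512  # assumes 512-byte sectors

def ntfs_volume_offsets(mbr: bytes):
    """Scan the four partition-table slots of a classic MBR (at offset
    0x1BE) and return byte offsets of partitions typed 0x07 (NTFS)."""
    if mbr[510:512] != b"\x55\xaa":
        raise ValueError("missing MBR boot signature")
    offsets = []
    for i in range(4):
        base = 0x1BE + i * 16
        ptype = mbr[base + 4]                         # partition type byte
        lba_start = struct.unpack_from("<I", mbr, base + 8)[0]
        if ptype == 0x07 and lba_start:
            offsets.append(lba_start * SECTOR)        # convert LBA to bytes
    return offsets

# Synthetic MBR: one NTFS partition starting at LBA 2048
mbr = bytearray(512)
mbr[510:512] = b"\x55\xaa"
mbr[0x1BE + 4] = 0x07
struct.pack_into("<I", mbr, 0x1BE + 8, 2048)
print([hex(o) for o in ntfs_volume_offsets(bytes(mbr))])  # ['0x100000']
```

The reported byte offset is what one would pass to wisp's -offset option.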
Another nuance when using images as the source concerns specifying a path to a directory within the image via the -path option. Since the image is not mounted as a drive, one should not associate it with a drive letter when specifying the path. If one does, wisp will ignore the drive letter and proceed to try to find the path starting at the root directory, which is MFT entry 5 on NTFS volumes.
Below are two examples of processing 'dd' type images: (a) the first analyzes an entire volume at drive offset 0x100000 hex and (b) the second analyzes an image of a volume starting at the path "Users".
wisp -image c:\dump\my_image.dd -offset 0x100000 -all -csv > results1.csv

wisp -image c:\dump\vol_image.dd -path "\Users" -level 5 -all -csv > results2.csv
While the first example traverses the entire volume, the second starts at the "Users" directory and recursively processes the subdirectories up to 5 levels deep. Notice the second example does not specify an offset, since the image is of a volume (meaning the volume starts at offset 0) while the first is an image of a drive and the first NTFS volume starts at offset 0x100000 hex.
Both examples extract valid and invalid index entries as well as redirect their output to a file using CSV formatting.
For starters, to access Volume Shadow copies, one needs to be running with administrator privileges. Also, it is important to note that Volume Shadow copies, as discussed here, only apply to Windows Vista, Win7, Win8, and beyond; they do not apply to Windows XP.
To tell wisp to look at a Volume Shadow, one needs to use the -vss <index of volume shadow> option. This points wisp at the appropriate Volume Shadow and starts processing the desired directory.
Below are 2 examples. The first traverses the Users directory to a depth of 4 levels in the Volume Shadow copy specified by index 1. The second traverses all the directories in the Volume Shadow copy specified by index 2.
wisp -vss 1 -path \Users -level 4 -csv > out.csv
wisp -vss 2 -csv > out.csv
To determine which indexes are available from the various Volume Shadows, one can use the Windows built-in utility vssadmin, as follows:
vssadmin list shadows

or, to filter out extraneous detail:

vssadmin list shadows | find /i "volume"
While the amount of data from the above command can be voluminous, the keywords one needs to look for are names that look like this:
Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy1
Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy2
...
From the example above, notice the number after the word HarddiskVolumeShadowCopy. This is the number that is passed as an argument to the -vss option.
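If one wants to script this lookup, the shadow-copy index can be pulled out of the vssadmin output with a simple regular expression. A minimal Python sketch follows; the sample text below merely mimics the output shown above:

```python
import re

# Sample text imitating the relevant lines of `vssadmin list shadows`
sample = r"""Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy1
Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy2"""

# The trailing digits are the values to pass to wisp's -vss option
indexes = [int(n) for n in re.findall(r"HarddiskVolumeShadowCopy(\d+)", sample)]
print(indexes)  # [1, 2]
```

One could then invoke wisp once per discovered index (e.g., `wisp -vss 1 ...`, `wisp -vss 2 ...`).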
Sometimes one does not have a 'dd' image of a volume or drive, but instead has the physical hard drive available to analyze. If running wisp in Windows, one can mount the physical drive and follow the guidelines in the earlier section for parsing a live volume. If running wisp in Linux or Mac OS-X, one should also be able to mount the target drive. Once it is successfully mounted, one uses the -image <device name of drive or volume> -offset <offset to desired volume, if a drive> option to access the appropriate NTFS volume. Below is an example of how to do this using a Mac box.
Assuming one has the proper setup with write blocker and hard drive shuttle, after connecting the Windows drive to the Mac, one can issue the diskutil list command to enumerate all the drives and volumes mounted on the machine. For this example, let's also assume the drive that we mounted was labeled as /dev/disk1 and its NTFS partition was /dev/disk1s1. From this data one could issue the following command to wisp to analyze the partition.
sudo wisp -image /dev/rdisk1s1 -all -csv -nodups > out.csv
Notice the 'sudo' in front of the wisp command. This allows wisp to run with administrator privileges to access the raw drive. Also note that /dev/rdisk1s1 is used. The 'r' (which is unique to Mac, in this case) specifies that we want to access the drive as raw I/O as opposed to buffered I/O. Buffered I/O is nice for normal reads and/or writes, but it is much slower when traversing in chunks aligned on sector boundaries.
Linux is similar to Mac, but instead of using the diskutil tool, one would use the df tool to enumerate the mounted devices.
Occasionally it is useful to analyze a VMWare image containing a Windows volume, both from a forensics standpoint as well as from a testing standpoint. This option is still considered experimental since it has only been tested on a handful of configurations. Furthermore, this option is limited to monolithic type VMWare images versus split images. In VMWare, the term split image means the volume is separated into multiple files, while the term monolithic virtual disk is defined to be a virtual disk where everything is kept in one file. There may be more than one VMDK file in a monolithic architecture, where each monolithic VMDK file would represent a separate snapshot. More information about the monolithic virtual disk architecture can be obtained from the VMWare website [10].
When working with virtual machines, the capability to handle snapshot images is important. Thus, if processing a VMWare snapshot, one needs to include the snapshot/image as well as its inheritance chain.
To handle an inheritance chain, wisp can handle multiple VMDK files by listing all the vmdk files separated by a pipe character and enclosed in double quotes. (eg. -vmdk "<VMWare NTFS virtual disk-1> | ... | <VMWare NTFS virtual disk-x>").
Aside from the VMDK inheritance chain, everything else is the same when using this option to that of normal 'dd' type images discussed in the previous section.
At this point in the analysis, one may want to go deeper and determine whether the deleted file's data is still available, by locating the 'cluster run' data associated with the MFT entry. Since INDX records do not have any cluster run data associated with an index entry, one would need to take the MFT entry specified and use some other tool to read the file record associated with that MFT entry. One could extract the data either from the local volume or from the volume shadow copy store. If pulling from the local volume, one can use the ntfscopy utility [7] from TZWorks LLC. This tool will allow one to (a) input a volume (live or off-line), (b) specify the desired MFT entry to copy from and (c) output the extracted data associated with the MFT's cluster run as well as the metadata associated with that MFT entry. Below is an example of doing this with ntfscopy using the MFT entry number 645130, which is for the slack entry shown above.
ntfscopy -mft 645130 -dst c:\dump\645130.bin -partition c: -meta
For details on the ntfscopy syntax, refer to the ntfscopy readme file [7]. Briefly, the -mft option allows one to specify a source MFT entry to copy from. The -meta option says to create a separate file (in addition to the copied file) that contains the metadata information about the specified MFT entry. The metadata file will be created with the same name as the destination file with the appended suffix .meta.txt. Included in the metadata file are many of the NTFS attributes of the target source file (or MFT entry). This includes, amongst other things, the cluster run and MACB timestamps. From the metadata one can see if the MFT sequence number is the same or not (which would be the indication whether the MFT record was assigned to another file or not).
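For reference, the MFT entry and sequence numbers discussed above are packed together in NTFS as a single 64-bit file reference: the low 48 bits carry the entry number and the high 16 bits carry the sequence number. The small Python sketch below shows the split (the packed value is fabricated here purely for illustration):

```python
def split_mft_reference(ref: int):
    """Split a 64-bit NTFS file reference into its two parts:
    low 48 bits = MFT entry number, high 16 bits = sequence number."""
    entry = ref & 0xFFFFFFFFFFFF   # mask off the low 48 bits
    seq = ref >> 48                # shift down the high 16 bits
    return entry, seq

# Hypothetical reference: entry 645130 with sequence number 3
ref = (3 << 48) | 645130
entry, seq = split_mft_reference(ref)
print(entry, seq)  # 645130 3
```

If the sequence number recovered from the index entry differs from the one in the live MFT record, the record has since been reassigned to another file.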
For those INDX records that have many slack entries, it is not uncommon for there to be quite a few duplicate entries that are parsed and displayed in the output. Duplicate here means the filename and MACB timestamps are the same, however the location of the entry in the INDX record is different. For every unique location in the INDX record, wisp will happily parse the index entry and report its findings to the investigator. This can be quite annoying when some entries have more than a few duplicates and one is trying to wade through a lot of data; especially when carving out entries from slack data on all the directories in an entire volume.
To get rid of duplicates, one can invoke the -nodups switch. This tells wisp to analyze the data extracted and only report one instance of each entry. One thing to be aware of when using this option is that wisp will internally always analyze valid and slack entries, independent of what the user selects as input options. After all the data is extracted, wisp starts deciding whether entries are duplicates. It does this by comparing all slack entries with valid entries to see if there is a duplicate; if not, they are compared to any slack entries that have already been marked as non-duplicates. What this means is, if one runs wisp and only wants non-duplicate slack entries, some slack entries will not be reported if there are valid entries present that are the same.
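The comparison order just described can be sketched in a few lines of Python. The entry dictionaries and their field names below are hypothetical, chosen only to illustrate the logic (valid entries seed the seen-set first, then kept slack entries are added to it):

```python
def dedup_slack(valid_entries, slack_entries):
    """Sketch of the -nodups behavior described above: a slack entry is
    kept only if its (name, MACB) key matches neither a valid entry nor
    a slack entry that was already kept. Valid entries are always
    reported and are not returned here."""
    seen = {(e["name"], e["macb"]) for e in valid_entries}
    kept = []
    for e in slack_entries:
        key = (e["name"], e["macb"])
        if key not in seen:
            seen.add(key)   # later identical slack entries become dups
            kept.append(e)
    return kept

valid = [{"name": "a.txt", "macb": ("m1", "a1", "c1", "b1")}]
slack = [
    {"name": "a.txt", "macb": ("m1", "a1", "c1", "b1")},  # dup of valid -> dropped
    {"name": "b.txt", "macb": ("m2", "a2", "c2", "b2")},  # unique -> kept
    {"name": "b.txt", "macb": ("m2", "a2", "c2", "b2")},  # dup of kept slack -> dropped
]
print([e["name"] for e in dedup_slack(valid, slack)])  # ['b.txt']
```

This also makes the caveat above visible: the first slack entry is suppressed because an identical valid entry exists.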
The two output options that give all the metadata available are the default (unstructured) output and the CSV (-csv) output. The other two output options (-csvl2t and -bodyfile) are geared toward generating cross-artifact timelines. As a result, they are more restrictive in the output fields, and therefore, the metadata that is parsable from these options have some limitations. Specifically, wisp makes use of a couple of free-form fields in these last two output options to try to inject as much useful data as possible, but it makes the data in these fields unstructured, and therefore, difficult to parse if trying to post process the results. Therefore, the best option for metadata that needs to be parsed from wisp output comes from the -csv option.
The slack entries in the output have comments associated with them. The comments come in 2 categories: (a) entries that have not been deleted and (b) those that have been deleted.
In the first case, there are a valid and an invalid (slack) entry pointing to the same file (one can tell this since the MFT entry number and sequence numbers match up). The difference, aside from the fact that one is in slack space, is that something has changed in the metadata from one entry to the other. This could be that the MACB timestamps are different, or that the size of the file has changed. wisp annotates the modifications with a comment, denoted by [macb] and/or [size], depending on whether the MACB times are different or the size is different. In the case of timestamps changing, wisp may put the following: [m.c.], which translates to the modify and 'MFT change' timestamps, respectively, being different than the valid entry. From this, one can get some past data on a file that has gone through some revisions.
For the second case, wisp just annotates the slack entry as deleted. The term deleted here is only accurate in the sense that this index entry is no longer part of the directory containing the INDX record(s). Whether the item was moved to another subdirectory or actually deleted is unknown from the data presented here.
In cases where the slack data is obviously corrupted, wisp will either leave that field blank if using CSV output or annotate the word <corrupted> if using the default output.
Option | Description |
---|---|
-path | Analyze the index entries given a path. The syntax is: -path <directory to analyze>. |
-image | Extract artifacts from a volume specified by an image and volume offset. The syntax is -image <filename> -offset <volume offset> |
-partition | Extract artifacts from a mounted Windows volume. The syntax is -partition <drive letter>. |
-drivenum | Extract artifacts from a mounted disk specified by a drive number and volume offset. The syntax is -drivenum <#> -offset <volume offset> |
-vmdk | Extract artifacts from a VMWare monolithic NTFS formatted volume. The syntax is -vmdk <disk name>. For a collection of VMWare disks that include snapshots, one can use the following syntax: -vmdk "disk1 | disk2 | ..." |
-vss | Experimental. Extract artifacts from Volume Shadow. The syntax is -vss <index number of shadow copy>. Only applies to Windows Vista, Win7, Win8 and beyond. Does not apply to Windows XP. |
-csv | Outputs the data fields delimited by commas. Since filenames can have commas, to ensure the fields are uniquely separated, any commas in the filenames get converted to spaces. |
-csvl2t | Outputs the data fields in accordance with the log2timeline format. |
-bodyfile | Outputs the data fields in accordance with the 'body-file' version3 specified in the SleuthKit. The date/timestamp outputted to the body-file is in terms of UTC. So if using the body-file in conjunction with the mactime.pl utility, one needs to set the environment variable TZ=UTC. |
-valid | Extract the metadata from the valid index entries only. |
-mft | Extract index metadata given an inode. The syntax is: -mft <MFT entry to analyze>. |
-indxfile | Process INDX data that may have been extracted from another tool. The syntax is: -indxfile <datafile name>. |
-slack | Extract the metadata from the slack index entries only. |
-all | Pull every index entry found. This includes both valid and invalid (slack) index entries. |
-nodups | Output unique entries only. Duplicate entries are flagged if the name and MACB timestamps are the same. |
-base10 | Ensure all size/address outputs are displayed in base-10 format versus hexadecimal format. Default is hexadecimal format. |
-username | Option is used to populate the output records with a specified username. The syntax is -username <name to use> . |
-hostname | Option is used to populate the output records with a specified hostname. The syntax is -hostname <name to use>. |
-no_whitespace | Used in conjunction with -csv option to remove any whitespace between the field value and the CSV separator. |
-csv_separator | Used in conjunction with the -csv option to change the CSV separator from the default comma to something else. Syntax is -csv_separator "|" to change the CSV separator to the pipe character. To use the tab as a separator, one can use the -csv_separator "tab" OR -csv_separator "\t" options. |
-dateformat | Output the date using the specified format. Default behavior is -dateformat "yyyy-mm-dd". Using this option allows one to adjust the format to mm/dd/yy, dd/mm/yy, etc. The restriction with this option is the forward slash (/) or dash (-) symbol needs to separate month, day and year and the month is in digit (1-12) form versus abbreviated name form. |
-timeformat | Output the time using the specified format. Default behavior is -timeformat "hh:mm:ss.xxx" One can adjust the format to microseconds, via "hh:mm:ss.xxxxxx" or nanoseconds, via "hh:mm:ss.xxxxxxxxx", or no fractional seconds, via "hh:mm:ss". The restrictions with this option is that a colon (:) symbol needs to separate hours, minutes and seconds, a period (.) symbol needs to separate the seconds and fractional seconds, and the repeating symbol 'x' is used to represent number of fractional seconds. (Note: the fractional seconds applies only to those time formats that have the appropriate precision available. The Windows internal filetime has, for example, 100 nsec unit precision available. The DOS time format and the UNIX 'time_t' format, however, have no fractional seconds). Some of the times represented by this tool may use a time format without fractional seconds and therefore will not show a greater precision beyond seconds when using this option. |
-pair_datetime | Output the date/time as 1 field versus 2 for csv option |
-quiet | This option suppresses status output during processing. |
-hexdump | This option dovetails hex output after parsed data. Useful for verification of parsed data. Only available for unstructured default output (eg. not for -csv, csvl2t or -bodyfile outputs). |
-utf8_bom | All output is in Unicode UTF-8 format. If desired, one can prefix an UTF-8 byte order mark to the CSV output using this option. |
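To illustrate the precision note under the -timeformat option: Windows FILETIME values count 100-nanosecond intervals since 1601-01-01 UTC, so sub-second output is genuinely available for NTFS timestamps. The Python sketch below shows the conversion; the constant 116444736000000000 is the well-known FILETIME of the Unix epoch, used here as a sanity check:

```python
from datetime import datetime, timedelta, timezone

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_utc(ft: int) -> datetime:
    # FILETIME = number of 100-ns intervals since 1601-01-01 UTC;
    # integer-divide by 10 to get microseconds (datetime's finest unit)
    return EPOCH_1601 + timedelta(microseconds=ft // 10)

dt = filetime_to_utc(116444736000000000)  # FILETIME of the Unix epoch
print(dt.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3])  # 1970-01-01 00:00:00.000
```

The trimmed `%f` output matches wisp's default "hh:mm:ss.xxx" style; formats such as DOS time or UNIX time_t carry no fractional seconds, so the fraction would simply be zero there.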
Field | Definition |
---|---|
mft entry | Master File Table (MFT) entry used for this file/folder |
mft seq | MFT sequence number |
parent mft | Parent MFT of the MFT entry |
type | valid or deleted |
file mdate | File/Folder modify date |
mtime-UTC | File/Folder modify time in UTC |
file adate | File/Folder access date |
atime-UTC | File/Folder access time in UTC |
mftdate | MFT metadata modify date |
mfttime-UTC | MFT metadata modify time in UTC |
file cdate | File/Folder create date |
ctime-UTC | File/Folder create time in UTC |
dir/file | dir or file |
size resv | Space reserved/allocated |
size used | Space actually used |
flags | flags used in the INDX entry |
name | path/name of the file/folder |
slack comment | any additional comments |
For CSV (comma separated values) output, there are restrictions on the characters that are outputted. Since commas are used as a separator, any commas appearing in the data are changed to spaces. For the default (non-CSV) output, no changes are made to the data.
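A sketch of the comma-to-space substitution described above; the function name is illustrative and not wisp's actual implementation:

```python
def csv_safe(field: str) -> str:
    # Commas inside a filename would break the CSV field boundaries,
    # so they are replaced with spaces before the record is emitted.
    return field.replace(",", " ")

print(csv_safe("notes,draft,v2.txt"))  # notes draft v2.txt
```

If the -csv_separator option is used to switch to a different delimiter (e.g., a pipe or tab), commas in filenames no longer conflict with the field boundaries.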
(Windows only) When processing filenames with characters that are not ASCII, one option is to change the code page of the command window from the default code page to UTF-8. This can be done via the command:
chcp 65001
This tool has authentication built into the binary. The primary authentication mechanism is the digital X509 code signing certificate embedded into the binary (Windows and macOS).
The other mechanism is the runtime authentication, which applies to all the versions of the tools (Windows, Linux and macOS). The runtime authentication ensures that the tool has a valid license. The license needs to be in the same directory of the tool for it to authenticate. Furthermore, any modification to the license, either to its name or contents, will invalidate the license.
The tools from TZWorks will output header information about the tool's version and whether it is running in limited, demo or full mode. This is directly related to what version of a license the tool authenticates with. The limited and demo keywords indicate some functionality of the tool is not available, and the full keyword indicates all the functionality is available. The missing functionality in the limited or demo versions may mean one or all of the following: (a) certain options may not be available, (b) certain data may not be outputted in the parsed results, and (c) the license has a finite lifetime before expiring.