
NTFS Usnjrnl Rewind


Published by Yogesh Khatri, Director, Capability, Digital Forensics and Incident Response, 11 April 2024

 

CyberCX is routinely involved in increasingly complex investigations where threat actors are taking measures to hide their activities and increase the complexity of forensic analysis. Simply put, hackers know we can find out what they’ve done, so they are covering their tracks.

This blog post is about the NTFS filesystem journal, which tracks when a file or folder is created, deleted, renamed or moved. We discuss using the journal for recreating deleted file/folder full paths, issues with current methods and tools, and introduce an alternate method (and a new tool) to guarantee correct and complete path information.


When performing incident response, investigators often come across hosts where a threat actor has performed extensive activity, but the forensic traces and artefacts are all gone because the attackers cleaned up after themselves.

Fortunately, much of the file activity may still be captured in $UsnJrnl:$J, the NTFS filesystem journal. While it is possible to remove the journal, CyberCX has not seen any threat actor do this as yet. Using the journal, it is possible to recreate the full file/folder path, which may be an important IoC to hunt for elsewhere. This is especially useful if you can see files being created under specific user profile folders, which may hint at those users being compromised (but not always!).

Traditionally, recreating the full path for an entity (file/folder) in the journal is accomplished by linking the entity back to its parent folder via the Master File Table (MFT) record number and sequence number, both of which are tracked in the journal. However, the parent folders have often been deleted themselves, and their MFT entries reallocated once or, in some cases, several times over.

In such a scenario it is not possible to determine a file’s full path by simply matching parent references to existing MFT entries (because they don’t exist anymore). This is where a journal rewind helps.

 

NTFS refresher – MFT entries and paths

On an NTFS filesystem, the MFT is the filesystem database that keeps track of file and folder metadata. Each file or folder, which for convenience we will refer to as an entry, has its metadata stored in the MFT. However an entry’s path is not stored in this metadata.

Inside the MFT, every entry has a parent reference, an 8-byte value of which 6 bytes represent an entry number and 2 bytes represent a sequence number. The parent reference points to the MFT entry, which is its parent folder. Therefore to determine the full path to any entry, you need to read the parent reference, then go to that entry representing the parent folder and read its parent reference and so on. Keep repeating this process until you reach the root folder entry, i.e., entry number 5. The figure below shows how the file C:\My Files\One.txt is represented in the MFT.
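The parent-reference walk described above can be sketched in a few lines of Python. This is a minimal illustration only: the entry numbers 41 and 42, the `mft` dictionary layout and the `C:` root label are assumptions made for the example, not values taken from the figure or from any tool.

```python
from typing import Dict, Tuple

ROOT_ENTRY = 5  # the root folder lives in MFT entry number 5


def split_reference(raw: bytes) -> Tuple[int, int]:
    """Split an 8-byte reference into (entry number, sequence number)."""
    value = int.from_bytes(raw, "little")
    entry_number = value & 0xFFFFFFFFFFFF  # low 6 bytes
    sequence_number = value >> 48          # high 2 bytes
    return entry_number, sequence_number


def full_path(entry: int, mft: Dict[int, Tuple[str, int]], root: str = "C:") -> str:
    """Follow parent references until the root folder entry is reached."""
    parts = []
    while entry != ROOT_ENTRY:
        name, parent_entry = mft[entry]
        parts.append(name)
        entry = parent_entry
    parts.append(root)
    return "\\".join(reversed(parts))


# Hypothetical MFT: entry number -> (name, parent entry number)
mft = {41: ("My Files", 5), 42: ("One.txt", 41)}

print(split_reference(bytes.fromhex("2900000000000100")))
print(full_path(42, mft))
```

The sketch ignores sequence numbers while walking, which is only safe when every entry on the path is still allocated to the same folder.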

 

Figure 1 – MFT entries and parent references

 

Now when a file is deleted, the MFT entry corresponding to that file is marked as not-in-use, so it can be reused later. If an entry is not yet re-used, then all its metadata can easily be recovered as the rest of the metadata in that entry is unaltered.

But when it gets re-used, its metadata is overwritten with that of the new entry taking its place. NTFS makes it known that this entry is not the same as the previous one by updating the sequence number field in the MFT entry header. This sequence number increments by one every time an entry transitions from in-use to not-in-use, i.e., when the entry is deleted.

Fortunately, the journal tracks this sequence number for the entry as well as its parent. Therefore when reading the journal, if we try to determine the parent folder for an entity, we need to match both the parent entry number and the parent sequence number in the MFT. Essentially this means that one would know if the parent reference they are looking for in the current MFT is a valid one. If the MFT entry’s sequence number does not match, then the current entry is not the one you are looking for and has been removed.
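As a rough sketch of this check, the function below accepts a parent only when both the entry number and the sequence number match the current MFT. The `MftEntry` type, the `lookup_parent` helper and the folder name `unrelated_folder` are hypothetical and exist only for illustration.

```python
from typing import Dict, NamedTuple, Optional


class MftEntry(NamedTuple):
    sequence: int  # sequence number currently stored in the MFT entry header
    name: str      # name of whatever occupies the entry now


def lookup_parent(mft: Dict[int, MftEntry],
                  parent_entry: int,
                  parent_seq: int) -> Optional[MftEntry]:
    """Return the parent only if entry number AND sequence number both match."""
    current = mft.get(parent_entry)
    if current is None or current.sequence != parent_seq:
        return None  # the entry has been reused; this is a different folder
    return current


# Hypothetical current MFT: entry 983 has since been reused (sequence 6).
mft = {983: MftEntry(sequence=6, name="unrelated_folder")}

print(lookup_parent(mft, 983, 4))  # journal recorded 983-4: no longer valid
print(lookup_parent(mft, 983, 6))  # 983-6 is the current occupant
```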

The current method utilised by most forensic tools for recreating full paths takes into consideration the entry number and sequence number for both the entry and its parent, and tries to resolve them against the current state of the volume. Tools such as Eric Zimmerman’s MFTECmd can do the above matching and recreate the parent path wherever the MFT entries have still not been overwritten in the current state of the volume.

The problem is the current state of the volume may be very different now with entries being reused/overwritten. This results in many paths being shown incorrectly or just unable to be computed. Paths may be incorrect because we match an entry to its parent as seen in the current state of the volume (MFT). However in the specific journal entry, at the time, that folder may have resided in an alternate location or may have had a different name.

 

Rewind the Journal

We propose a different method that provides complete results: reading the journal records from the last recorded event to the earliest in sequence, which we call rewinding the journal.

As files and folders are being created, deleted or moved, reading the journal in reverse and keeping stateful information about every entry (and its parent) allows us to compute all the full paths to every entry (found in the journal). If the end of the journal corresponds to the current state of the volume (which it should!), then there should be no “Unknown” paths in the final output.
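A minimal sketch of the rewind idea follows. It is not the released script: the `Record` type, the `rewind` and `full_path` helpers, and the dictionary layout are assumptions made for illustration. The pairs 66-1 and 983-4 are taken from the example in this post; every other entry and sequence number is made up.

```python
from typing import Dict, List, NamedTuple, Tuple

Ref = Tuple[int, int]  # (entry number, sequence number)
ROOT_ENTRY = 5


class Record(NamedTuple):
    name: str
    ref: Ref     # entry-sequence of the file/folder in this journal record
    parent: Ref  # entry-sequence of its parent at the time of the event


def full_path(ref: Ref, state: Dict[Ref, Tuple[str, Ref]], root: str = "C:") -> str:
    """Walk parent references through the state map up to the root folder."""
    parts, seen = [], set()
    while ref[0] != ROOT_ENTRY:
        if ref not in state or ref in seen:  # missing parent or a loop
            parts.append("UNKNOWN")
            break
        seen.add(ref)
        name, ref = state[ref]
        parts.append(name)
    else:
        parts.append(root)
    return "\\".join(reversed(parts))


def rewind(journal: List[Record],
           current_mft: Dict[Ref, Tuple[str, Ref]]) -> List[Tuple[Record, str]]:
    """Read the journal last-to-first, restoring each entry's name and parent."""
    state = dict(current_mft)  # start from the current state of the volume
    results = []
    for rec in reversed(journal):
        state[rec.ref] = (rec.name, rec.parent)  # state as of this event
        results.append((rec, full_path(rec.ref, state)))
    results.reverse()  # hand back in chronological order
    return results


# Current volume: temp sits at the root; Drivers and ip_scanner are gone.
current_mft = {
    (60, 1): ("Intel", (5, 5)),
    (100, 1): ("Users", (5, 5)),
    (101, 1): ("bee", (100, 1)),
    (200, 3): ("temp", (5, 5)),
}
journal = [  # oldest event first
    Record("data.txt", (990, 2), (983, 4)),   # file deleted
    Record("ip_scanner", (983, 4), (66, 1)),  # folder deleted
    Record("Drivers", (66, 1), (60, 1)),      # folder deleted
    Record("1.exe", (300, 2), (200, 3)),      # file deleted
    Record("temp", (200, 3), (101, 1)),       # move: old location
    Record("temp", (200, 3), (5, 5)),         # move: new location
]

for rec, path in rewind(journal, current_mft):
    print(f"{rec.ref[0]}-{rec.ref[1]}  {path}")
```

Keying the state on the entry-sequence pair means a reused MFT entry never collides with its earlier occupant, while a rename or move simply overwrites the same key as the rewind passes each record.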

Consider the following example. At one point in time, there existed a file named data.txt whose full path was C:\Intel\Drivers\ip_scanner\data.txt. The entry and sequence numbers of all the entities in the full path are shown below. For convenience, we’ll just club them together in the format entry-sequence.

Currently these MFT entries are allocated as seen below:

 

Since the entry has been reused, the sequence number (4) of the parent entry (983) for file data.txt in the journal is different from the sequence number (6) on the same entry (983) on the current volume, hence the full path for that entry cannot be directly computed and is marked as “UNKNOWN” by most forensic tools that process the journal.

However, the journal did capture the reallocation of these entries, as well as the names corresponding to the entry-sequence pairs that existed prior. A snippet of the events captured in the journal is shown below. For ease of reading, entries that represent the same file/folder have been colour coded. The far-right column shows the full path computed by tools that resolve the parent entry and sequence number against data from the MFT.

 

 

As we can see, the original MFT entries are no longer present for several of the files/folders, hence paths cannot be computed for several of the entries. However, in order to construct the full path for all these entries, we only need the name, entry number and sequence number, all of which are present in the journal. Hence even without the original MFT information, we have the parts needed to re-create the full path.

Working backwards in the journal (from last to first), we come to the entry Drivers (66 – 1), a currently overwritten entry (66 – 2) and note down its name and parent info. The entry prior to it, ip_scanner, can now have its full path computed as we know its parent entry now (66 – 1).

Continuing our journal “rewind”, we will find the original name and full path of every file including data.txt. Our table with the far-right column populated with “rewind” computed paths now looks like this, with the bold highlighted text representing items different from the previous table.

 

 

Another item of note is the path to the file 1.exe. Previously computed as C:\temp\1.exe, its correct path is now computed as C:\Users\bee\temp\1.exe. The path was incorrect earlier as it was being computed based on the current state of the volume, where temp resides at the root (C:), but when 1.exe was deleted, temp resided elsewhere.

A Python script that recreates the paths using this method has been made available here. This proof of concept takes the CSV outputs produced by MFTECmd when processing the MFT and $UsnJrnl:$J files, and outputs a new USNJRNL CSV (with the complete path information) as well as an SQLite database containing the MFT and USNJRNL data.


In conclusion, we have demonstrated a better way to recreate full paths for journal entries to allow for better utilisation of this artefact. Since a lot of practitioners incorporate the MFTECmd tool into their workflow, we decided not to reinvent the wheel. Instead, we simply use that data as input and produce an output in the exact same format, only with more accurate data.

The CyberCX DFIR team are continuously looking for ways to improve the current state of the art. Hopefully you will find this tool (or methodology) a useful addition to your artefact analysis workflow.
