Azure File Sync is a popular service that lets you easily synchronize data between your on-premises data center and your Azure Storage Account. Similar to OneDrive, it can free up storage at the operating system level and keep only a reference to the original data in Azure. Whenever a user or application accesses such a reference, the data is either already cached locally or downloaded on demand, and it can be released again later to make storage available for other files that need to be accessed. There are several scenarios in which Azure File Sync is easy and convenient to use. A few examples are:
- Global Azure File Sync implementation across multiple sites to keep data in sync – similar to Distributed File System (DFS)
- Recover a standalone file server quickly by simply installing and registering the Azure File Sync agent on an alternative server, even without an existing local backup
- Hardware refresh and resizing of on-premises file server(s) to an absolute minimum of local storage
There are more examples and scenarios; however, in this blog post I want to focus on the third one.
File services grow over time and require more and more storage capacity, even though much of the stored data is no longer accessed regularly. In almost all cases, nobody cares about old data anymore. In many environments, only 5-10% (or even less) of the data stored on file servers is still accessed, and even that only irregularly, maybe once a week or once a month. This means that a lot of local storage could be reclaimed (if attached through SAN/NAS) or saved when you need to replace a file server, e.g. when the leasing period or hardware support ends.
If you plan to implement Azure File Sync, you need to know how much storage is required for your local cloud cache before you order replacement hardware. This could be 50 GB, 500 GB or even 5 TB.
How do you find out how much storage you need?
Microsoft does not explicitly tell you the best way to answer this question; there are sizing recommendations that cover IOPS, data throughput and more for the Azure file share, but not for local storage. So if you guess and size the local storage capacity too high, you pay too much for your locally attached storage, and you might even end up with a 2-rack-unit server that is more expensive than a 1-rack-unit server. On the other hand, if you size it too small, file sync will most likely evict cached files that are in regular use, which leads to higher network utilization and a potential impact on end users, because they have to wait until those files have been downloaded again.
To make it easier for you to answer the question of what the ideal local storage capacity is, I wrote a PowerShell script you can use to determine the local storage capacity needed for a specific time period, e.g. the last 7 days or, even better, 30 days. Before you can execute the script, you first need to enable NTFS last access time updates so that the file system updates the last access property of files and folders. This is required to identify when a file was last accessed; the PowerShell script evaluates exactly this property. If you do not enable this setting, you will not be able to identify when files were last accessed.
You can enable last access time updates (system managed) by executing this command: fsutil behavior set DisableLastAccess 2
Afterwards, you need to reboot your machine.
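If you want to verify the current setting before and after the reboot, you can query it, for example, via fsutil or read the registry value directly (the same value the script below checks; 2147483650 or 2 means that last access time updates are enabled):
# Query the effective setting via fsutil
fsutil behavior query DisableLastAccess
# Or read the registry value directly with PowerShell
(Get-ItemProperty HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem).NtfsDisableLastAccessUpdate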
PowerShell Script FileSync_CalculateStorageCapacity.ps1
########################
# Author: Tobias Massoth
# Date: 25.02.2023
# Version: 1.0
########################
# value=2147483650 --> Windows Server 2019 and newer
# value=2 --> Windows Server 2016 and older
# Read the NTFS last access update setting once and check whether last access time updates are enabled
$ntfsDisableLastAccessUpdate = (Get-ItemProperty HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem\).NtfsDisableLastAccessUpdate
if($ntfsDisableLastAccessUpdate -eq 2147483650 -OR $ntfsDisableLastAccessUpdate -eq 2) {
    [int]$days = Read-Host "How many days shall be considered for local storage analysis "
    $location = Read-Host "Which SMB file share (\\FQDN\share01 ) or location (C:\SharedFolder) shall be analyzed "
    $size = 0
    # Iterate through the folder structure and sum up the size of all files
    # that have been accessed within the given number of days
    Get-ChildItem $location -Recurse -File | ForEach-Object {
        if(((Get-Date) - $_.LastAccessTime).Days -lt $days) {
            $size += $_.Length
        }
    }
    # Output the calculated size in GB
    Write-Host ""
    Write-Host ""
    Write-Host ("Calculated required storage capacity: " + [Math]::Round(($size / 1024 / 1024 / 1024),2) + " GByte")
    Write-Host ""
    Write-Host ""
} else {
    Write-Host "NTFS last access time updates are not enabled on this system. Please execute the following command, reboot the machine and wait at least 7 days (better 30 days) before running this script again."
    Write-Host "Command to execute: fsutil behavior set DisableLastAccess 2"
}
pause
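To run the script, save it as FileSync_CalculateStorageCapacity.ps1 and execute it from a PowerShell session that has read access to the share or folder you want to analyze. A sample invocation could look like this (the answers 30 and D:\Shares are just illustrative values):
.\FileSync_CalculateStorageCapacity.ps1
How many days shall be considered for local storage analysis : 30
Which SMB file share (\\FQDN\share01 ) or location (C:\SharedFolder) shall be analyzed : D:\Shares
The reported number is the amount of data that was accessed within the chosen period and is therefore a good starting point for sizing the local cache volume.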