(and a sprinkle of Regex)
The Aim
They say that a good definition of madness is doing the same thing over and over and expecting different results. Surely, then, another definition of madness is doing a thing perfectly well once and then dreaming up 4 different ways to do it in slightly fewer lines of code. With this in mind, here are 5 ways to extract the computer name from a UNC filepath using PowerShell – a task I found surprisingly difficult.
- Given a UNC filepath e.g.
\\CODEBUCKETSERVER1\wwwroot\WebSite\ImageDir
I want a PowerShell script that returns
CODEBUCKETSERVER1
- If I pass in a local directory then I want an empty string
- I want it with as little code as possible. Really I want to see it on one line
- Practice some PowerShell and learn something – as always
Function 1 – splitting the string
function GetHostName_V1{ param ([string] $FilePath) return $FilePath -split "\\" | Where { $_ -ne "" } | Select -first 1 }
So working through it one piece at a time
$FilePath -split "\\"
splits the string into an array using \ as the delimiter (note it is written \\ because the backslash has to be escaped in a regex). The elements it splits the string into are …
the output is piped to
Where { $_ -ne "" }
Filters out any empty strings (the first two results will be empty, as the leading \\ is split into two empty parts)
Select -first 1
Return the first (non-empty) element in the array which is our hostname
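To make the three stages concrete, here is a small sketch that runs the same pipeline one step at a time. The variable names $parts and $nonEmpty are mine, purely for illustration; they are not part of the original function.

```powershell
# Illustrative only: the GetHostName_V1 pipeline, one stage at a time.
$FilePath = '\\CODEBUCKETSERVER1\wwwroot\WebSite\ImageDir'

# Stage 1: split on a literal backslash (written \\ because -split takes a regex).
$parts = $FilePath -split '\\'
$parts.Count    # 6: two empty strings from the leading \\, then the four path segments

# Stage 2: drop the empty strings.
$nonEmpty = $parts | Where-Object { $_ -ne '' }

# Stage 3: take the first remaining element.
$nonEmpty | Select-Object -First 1    # CODEBUCKETSERVER1
```

Where and Select in the original are just the aliases for Where-Object and Select-Object.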
Evaluation
Not good. If I pass in a local path e.g.
C:\inetpub\wwwroot\WebSite\ImageDir
Then I get
C:
as the output (the first non-empty string). Misleading – this isn’t a machine name, therefore I shouldn’t return it. Back to the drawing board
Function 2 – Regular expression
Really, this feels like a task for regular expressions. So the first regex pass is
function GetHostName_V2{
    param ([string] $FilePath)
    $FilePath -match "\\\\(.*?)\\" | Out-Null
    if($Matches.Count -ge 2) { return $Matches[1] }
}
The regex
The regex we are going to use is
\\\\(.*?)\\
It’s better to look at it without the escape characters so ..
\\(.*?)\
Breaking it down
\\
Is a straight character match of two backslashes
(.*?)
Matches any number of characters, BUT the ? makes it non-greedy, so it matches as few characters as it can while still making the overall match. Without the ? it would be greedy and match everything up to the last \ rather than stopping at the first.
\
Another character match
So it matches \\ then anything then \. The trick is that the anything (.*?) is in parentheses, so it is available to us as a group – that is what the parentheses do
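The difference the ? makes is easiest to see side by side. This is a minimal sketch using the .NET [regex]::Match method directly rather than anything from the functions in this post; $lazy and $greedy are illustrative names.

```powershell
$path = '\\CODEBUCKETSERVER1\wwwroot\WebSite\ImageDir'

# Non-greedy (.*?): the group stops at the first backslash after the host name.
$lazy = [regex]::Match($path, '\\\\(.*?)\\')
$lazy.Groups[1].Value      # CODEBUCKETSERVER1

# Greedy (.*): the group runs on to the last backslash in the path.
$greedy = [regex]::Match($path, '\\\\(.*)\\')
$greedy.Groups[1].Value    # CODEBUCKETSERVER1\wwwroot\WebSite
```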
The PowerShell function
So one step at a time
$FilePath -match "\\\\(.*?)\\"
Matches the file path against the regex. The -match operator then copies the result into a magic automatic variable called $Matches. This contains the overall match and all the groups.
So we can see the overall match
\\CODEBUCKETSERVER1\
And the group
CODEBUCKETSERVER1
To return the group we check that there are at least 2 elements in $Matches
$Matches.Count -ge 2
Then return the hostname which is in the 2nd position in the matches collection
return $Matches[1]
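For anyone who wants to poke at $Matches directly, here is a short sketch of what it holds after a successful match. The $isUnc variable is my own addition: as far as I understand it, $Matches is only refreshed when a match succeeds, so capturing the Boolean result is a safer test than inspecting $Matches alone.

```powershell
$FilePath = '\\CODEBUCKETSERVER1\wwwroot\WebSite\ImageDir'

# -match returns a Boolean and, on success, fills the automatic $Matches hashtable.
$isUnc = $FilePath -match '\\\\(.*?)\\'

$isUnc            # True
$Matches[0]       # \\CODEBUCKETSERVER1\   (the overall match)
$Matches[1]       # CODEBUCKETSERVER1      (the first group)
$Matches.Count    # 2
```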
What are we really returning?
PowerShell is odd about returning values from functions. It returns every value that hasn’t been captured or otherwise consumed, not just the one after return. The return keyword just signals the end of the function so…
if($Matches.Count -ge 2) { $Matches[1]; return }
Would work as would
if($Matches.Count -ge 2) { $Matches[1] }
There is additional weirdness though
$FilePath -match "\\\\(.*?)\\"
Returns True – we haven’t used it, so it would be returned too. Out-Null ‘uses it’ and stops it being returned, leaving just the single return value we want.
$FilePath -match "\\\\(.*?)\\" | Out-Null
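Here is a small sketch of that behaviour. Test-LeakyOutput and Test-CleanOutput are hypothetical names made up for this demonstration, and assigning to $null is just one alternative to piping to Out-Null.

```powershell
function Test-LeakyOutput {
    param([string] $FilePath)
    $FilePath -match '\\\\(.*?)\\'    # the Boolean is not captured, so it becomes output
    $Matches[1]
}

function Test-CleanOutput {
    param([string] $FilePath)
    $null = $FilePath -match '\\\\(.*?)\\'    # discard the Boolean
    $Matches[1]
}

$leaky = Test-LeakyOutput '\\CODEBUCKETSERVER1\wwwroot'
$clean = Test-CleanOutput '\\CODEBUCKETSERVER1\wwwroot'

$leaky.Count    # 2: True, then CODEBUCKETSERVER1
$clean          # CODEBUCKETSERVER1
```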
It’s so odd (to me) that I might write a separate blog post about it one day. Anyway digression over.
Evaluation
It works. Host name for UNC and Null for local paths. I hate it though (an extreme reaction to a PowerShell script admittedly).
- Magical variable called $Matches – what’s that about?
- Having to use Out-Null to monkey around with the return value
- Too many lines – I can do this in one line surely.
Function 3 – regex and split
Trying to get away from the magic $Matches variable I’ll combine the first two attempts to get
function GetHostName_V3{ param ([string] $FilePath) if ($FilePath -match "\\\\(.*?)\\" -eq $TRUE) { return $FilePath -split "\\" | Where { $_ -ne "" } | Select -first 1 } }
This one is fairly transparent so
$FilePath -match "\\\\(.*?)\\" -eq $TRUE
Checks to see if the input is in a UNC type format. If it is then
$FilePath -split "\\" | Where { $_ -ne "" } | Select -first 1
We split it. The return doesn’t need to be there, but for me it points out the intention. We don’t need Out-Null because we are using the result of the -match operator in the if condition, so it won’t be put on the pipeline and returned.
Evaluation
It’s OK. It returns empty for a local path which is good. It actually can be understood. In real life I would be happy with this – I’ve seen far worse PowerShell. But in my heart of hearts I know I can do better
Function 4 – Select-String
I’m abandoning -match now and using Select-String. Select-String will also pattern match a string against a regex, but it returns the results as a MatchInfo object which we can then consume by piping it to other commands. It gets us to the one liner that I want so…
function GetHostName_V4{ param ([string] $FilePath) $FilePath | select-string -pattern "\\\\(.*?)\\" -AllMatches | ForEach {$_.Matches} | ForEach {$_.Groups} | Select-Object -skip 1 -first 1 }
Examining this a piece at a time
$FilePath | select-string -pattern "\\\\(.*?)\\" -AllMatches
Matches the regex to the string and returns all matches in a collection of match info objects
So we have the match and then the group collection
| ForEach {$_.Matches}
Takes us through all the matches
| ForEach {$_.Groups}
Takes us through each group for each match. Our machine name is put in a regex group (remember (.*?)). The first group is the full match and the second group is the machine name so
| Select-Object -skip 1 -first 1
Skips the first and picks up the next one. It works.
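The objects flowing along that pipeline can be inspected with a sketch like the one below. $info and $group are illustrative names of my own. One thing worth knowing is that the last stage hands back a regex Group object rather than a plain string, so .Value is what holds the text.

```powershell
$FilePath = '\\CODEBUCKETSERVER1\wwwroot\WebSite\ImageDir'

# Select-String returns a MatchInfo object for the input string.
$info = $FilePath | Select-String -Pattern '\\\\(.*?)\\' -AllMatches

$info.Matches.Count               # 1: only one \\...\ sequence in this path
$info.Matches[0].Groups.Count     # 2: the full match plus the one capture group
$info.Matches[0].Groups[0].Value  # \\CODEBUCKETSERVER1\
$info.Matches[0].Groups[1].Value  # CODEBUCKETSERVER1

# The same walk as the function body, written with full cmdlet names.
$group = $info |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Groups } |
    Select-Object -Skip 1 -First 1

$group.Value                      # CODEBUCKETSERVER1
```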
Evaluation
Good. It’s one line with the output all flowing along the pipeline which I like. It works – I’m nearly done. But I’ve a tiny bit of disquiet – am I really doing it in the simplest way I can?
Note on aliases
To shorten this we can use the % alias instead of ForEach (which is itself an alias for ForEach-Object). So the main body of the function could become
$FilePath | select-string -pattern "\\\\(.*?)\\" -AllMatches | % {$_.Matches} | % {$_.Groups} | Select -skip 1 -first 1
Shorter still. Nice.
Function 5 – lookahead and lookbehind
Reflecting on this – a lot of the complexity comes from the use of groups in this regex. Do I need them? Well, no. I can use the zero-length regex assertions lookahead and lookbehind
function GetHostName_V5{ param ([string] $FilePath) $FilePath | select-string -pattern "(?<=\\\\)(.*?)(?=\\)" | Select -ExpandProperty Matches | Select -ExpandProperty Value }
The regex
Once again it’s got escape characters in it i.e.
(?<=\\\\)(.*?)(?=\\)
It’s easier to understand if we just remove them while we are dissecting it
(?<=\\)(.*?)(?=\)
So there are three parts
(?<=\\)
Looks behind the match to check for \\. It isn’t part of the match though
.*?
Matches the least amount of anything it can (remember the non-greedy stuff).
(?=\)
Looks ahead of the match to check for \. Again it isn’t part of the match. So the match is only the machine name i.e. the least amount of anything.
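To see that the lookarounds really are left out of the match, this sketch compares the two patterns using [regex]::Match. The names $grouped and $lookaround are mine, for illustration only.

```powershell
$path = '\\CODEBUCKETSERVER1\wwwroot\WebSite\ImageDir'

# Grouped version: the overall match includes the surrounding backslashes.
$grouped = [regex]::Match($path, '\\\\(.*?)\\')
$grouped.Value        # \\CODEBUCKETSERVER1\

# Lookaround version: the backslashes are checked but not consumed.
$lookaround = [regex]::Match($path, '(?<=\\\\)(.*?)(?=\\)')
$lookaround.Value     # CODEBUCKETSERVER1
$lookaround.Index     # 2: the match starts just after the leading \\
$lookaround.Length    # 17
```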
The function
function GetHostName_V5{ param ([string] $FilePath) $FilePath | select-string -pattern "(?<=\\\\)(.*?)(?=\\)" | Select -ExpandProperty Matches | Select -ExpandProperty Value }
So in parts
$FilePath | select-string -pattern "(?<=\\\\)(.*?)(?=\\)"
Returns a match info with just one match, and this time the value of the match itself is the machine name (no need to dig into the groups)
Select -ExpandProperty Matches
Selects the Matches property of the match info
Select -ExpandProperty Value
Selects the value property of the match object. This is the machine name. Done and Done!
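A quick way to check the function against both of the original aims is sketched below. The function body is the same as above, just laid out over several lines with full cmdlet names; $fromUnc and $fromLocal are illustrative names, and the local path is only an example input.

```powershell
function GetHostName_V5 {
    param([string] $FilePath)
    $FilePath |
        Select-String -Pattern '(?<=\\\\)(.*?)(?=\\)' |
        Select-Object -ExpandProperty Matches |
        Select-Object -ExpandProperty Value
}

$fromUnc   = GetHostName_V5 '\\CODEBUCKETSERVER1\wwwroot\WebSite\ImageDir'
$fromLocal = GetHostName_V5 'C:\inetpub\wwwroot\WebSite\ImageDir'

$fromUnc               # CODEBUCKETSERVER1
$null -eq $fromLocal   # True: no match, so nothing comes out of the pipeline
```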
The madness ends
It’s madness to write it all out – but then again there is a lot in even a simple task. To go all the way through we covered off
- PowerShell return keyword and Out-Null
- How regex groups work
- How regex look ahead and look behind work
- The PowerShell pipeline operator
- PowerShell select-string vs -match
So perhaps not quite as mad as all that.
Post Script. Function 6 – FQDN
As requested in comments here is function 5 amended to deal with fully qualified domain names
function GetHostName_V6{ param ([string] $FilePath) $FilePath | select-string -pattern "(?<=\\\\)(.*?)(?=(\\|[.]))" | Select -ExpandProperty Matches | Select -ExpandProperty Value }
The regex has been changed slightly to
(?<=\\\\)(.*?)(?=(\\|[.]))
So the lookahead (?=(\\|[.])) will stop the search if it finds \ or .
So testing with
\\server.domain.tld\share
gives the expected server and not server.domain.tld, so we get just the computer name as promised. It still works with UNC paths that are not FQDN.
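The amended pattern can be tried against a few inputs with a sketch like this. It uses [regex]::Match rather than the function itself, and the $pattern and $results names are mine. One caveat I would flag: because the lookahead stops at the first dot, a path that uses an IP address instead of a name would come back as just its first octet.

```powershell
$pattern = '(?<=\\\\)(.*?)(?=(\\|[.]))'

$results = foreach ($p in '\\server.domain.tld\share',
                          '\\CODEBUCKETSERVER1\wwwroot',
                          'C:\inetpub\wwwroot') {
    # A failed match has an empty Value, which suits the empty-string aim.
    [regex]::Match($p, $pattern).Value
}

$results[0]    # server
$results[1]    # CODEBUCKETSERVER1
$results[2]    # (empty string)
```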
Useful Links
https://en.wikipedia.org/wiki/Path_(computing)#Uniform_Naming_Convention
UNC means Uniform Naming Convention i.e. paths in the form of \\MYHOST\more\more1
https://mcpmag.com/articles/2015/09/30/regex-groups-with-powershell.aspx
Good description of the magic $Matches object
http://www.regular-expressions.info/lookaround.html
Lookahead and Lookbehind in regex. This site is so old but still really useful – I’ve been looking at it for years now
https://msdn.microsoft.com/en-us/powershell/reference/5.0/microsoft.powershell.utility/select-string
MSDN documentation for select-string. Useful
https://blog.mariusschulz.com/2014/06/03/why-using-in-regular-expressions-is-almost-never-what-you-actually-want
Good post on greedy vs non-greedy regex operators
https://code.visualstudio.com/
All PowerShell was written with Visual Studio Code. My IDE of choice for PowerShell. Nice debugging.
https://github.com/timbrownls20/Demo/tree/master/PowerShell/UNC%20FilePath
As ever the code is on my GitHub site
The Script works well for network path like \\server\share but not for \\server.domain.tld\share…there it returns server.domain.tld and not the plain computername.
Ditto, Rolf… I came across this article because I needed to see how to get just the hostname from a FQDN. Would be cool to see an update to this article that also included that function. Oh well!
Bit late to this but I’ve put in a function to deal with FQDN. Hope that helps you
How would you handle a replace for cases where you have both a UNC path or a FQDN path (you can’t be sure of the input).
$server = (Resolve-DNSName -Name $server).Name
$folder = $folder -Replace("(?<=\\\\)(.*?)(?=(\\|[.]))",("$server$1"))
$folder input:
\\xxx-xxx-xxx\yyyy\zzzz
or
\\xxx-xxx-xxx.on.ca\yyy\zzz
Expected output:
\\xxx-xxx-xxx.on.ca\yyy/zzz
Function 6 works for case #1, but for Case#2 I end up with (note the double "on.ca")
\\xxx-xxx-xxx.on.ca.on.ca\yyy\zzz