5 ways to extract the computer name from a network file path with PowerShell

 (and a sprinkle of Regex)

The Aim

They say that a good definition of madness is doing the same thing over and over and expecting different results. Surely then another definition of madness is do a thing perfectly well once then dream up 4 different ways to do it in slightly less lines of code. With this in mind, here are 5 ways to extract the computer name from a UNC filepath using PowerShell – a task I found surprisingly difficult.

  1. Given a UNC filepath i.e.

    I want a powershell script that returns
  2. If I pass in a local directory then I want an empty string
  3. I want it with as little code as possible. Really I want to see it on one line
  4. Practice some PowerShell and learn something – as always

Function 1- splitting the string

So working it through one at a time

splits the string into an array using \ as the delimiter (note it is \\ because it is escaped). The elements it splits the string in to are …

the output is piped to

Filter out any empty strings (our first result will be empty as \\ is split into 2 parts)

Return the first (non-empty) element in the array which is our hostname

Evaluation

Not good. If I pass in a local path i.e.

Then I get

as the output (the first non-empty string as the output). Misleading – this isn’t a machine name therefore I shouldn’t return it. Back to the drawing board

Function 2 – Regular expression

Really, this feels like a task for regular expressions. So the first regex pass is

The regex

The regex we are going to use is

It’s better to look at it without the escape characters so ..

Breaking it down

Is a straight character match of two backslashes

Is any number of character BUT the ? makes it non-greedy, So it will match the least amount of characters it can to still make the match. Without that it would be greedy and match everything it could up to the last \ rather than just matching to the first.

Another character match

So it matches \\ then anything then \. The trick is that the anything (*.?) is in parenthesis so it will be available to us as a group – the parenthesis does that

The PowerShell function

So step at a time

Matches the file path to the regex. The match function then copies the result into a magic global variable called $Matches. This contains the results of the match and all the groups.

So we can see the overall match

And the group

To return the group we check that there is at least 2 elements in $Matches

Then return the hostname which is in the 2nd position in the matches collection

What are we really returning?

Powershell is odd with returning values out of functions. It will return all values that haven’t been used. The return keyword just signals the end of the function so…

Would work as would

There is additional weirdness though

Returns true – we haven’t used it so that would be returned too. Out-Null ‘uses it’ and stops it returning so getting us just the single return value we want.

It’s so odd (to me) that I might write a separate blog post about it one day. Anyway digression over.

Evaluation

It works. Host name for UNC and Null for local paths. I hate it though (an extreme reaction to a PowerShell script admittedly).

  1. Magical variable called $Matches – what’s that about?
  2. Having to use Out-Null to monkey around with the return value
  3. Too many lines – I can do this in one line surely.

    Function 3 – regex and split

    Trying to get away from the magic $Matches variable I’ll combine the first two attempts to get

    This one is fairly transparent so

    Checks to see if the input is in a UNC type format. If it is then

    We split it. The return doesn’t need to be there but for me points out the intention. We don’t need Out-Null because we are using the return value of the –match function so it won’t be put on the pipeline and returned out.

    Evaluation

    It’s OK. It returns empty for a local path which is good. It actually can be understood. In real life I would be happy with this – I’ve seen far worse PowerShell. But in my heart of hearts I know I can do better

Function 4 – Select-String

I’m abandoning – match now and using Select-String. Select-String will also pattern match a string to regex but it returns out the results as a MatchInfo object which we can then consume by piping it to other operators. It gets us to the one liner that I want so…

Examining this a piece at a time

Matches the rfegex to the string and returns all matches in a collection of match info objects

So we have the match and then the group collection

Takes us through all the matches

Takes us through each group for each match. Our machine name is put in a regex group (remember (.*?)). The first group is the full match and the second group is the machine name so

Skips the first and picks up the next one. It works.

Evaluation

Good. It’s one line with the output all flowing along the pipeline which I like. It works – I’m nearly done. But I’ve a tiny bit of disquiet – am I really doing it in the simplest way I can?

Note on aliases

To shorten this we can use the % alias instead of Foreach (which is itself an alias for ForEach-Object). So the main body of the function could become

Shorter still. Nice.

Function  5 – lookahead and lookbehind

Reflecting on this – a lot of the complexity is the use of groups in this regex. Do I need them? Well no I can use the zero length regex assertions lookahead and lookbehind

The regex

Once again it’s got escape characters in i.e.

It’s easier to understand if we just remove them while we are dissecting it

So there is three parts

Looks behind the match to check for \\. It isn’t part of the match though

Matches the least amount of anything it can (remember the non-greedy stuff).

Looks ahead of the match to check for \. Again it isn’t part of the match. So the match is only the machine name i.e. the least amount of anything.

The function

So in parts

Returns a match info with just one match (and no extra groups)

Select the match property of the match info

Selects the value property of the match object. This is the machine name. Done and Done!

The madness ends

It’s madness to write it all out – but then again there is a lot even a simple task. To go all the way through we covered off

  • Powershell return keyword and Out-Null
  • How regex groups work
  • How regex look ahead and look behind work
  • The powershell pipeline operator
  • PowerShell select-string vs –match

So perhaps not quite as mad as all that.

Useful Links

https://en.wikipedia.org/wiki/Path_(computing)#Uniform_Naming_Convention
UNC means Uniform Naming Convention i.e. paths in the form of \\MYHOST\more\more1

https://mcpmag.com/articles/2015/09/30/regex-groups-with-powershell.aspx
Good description of the magic $Matches object

http://www.regular-expressions.info/lookaround.html
Lookahead and Lookbehind in regex. This site is so old but still really useful – I’ve been looking at it for years now

https://msdn.microsoft.com/en-us/powershell/reference/5.0/microsoft.powershell.utility/select-string
MSDN documentation for select-string. Useful

https://blog.mariusschulz.com/2014/06/03/why-using-in-regular-expressions-is-almost-never-what-you-actually-want
Good post on greedy vs non-greedy regex operators

https://code.visualstudio.com/
All PowerShell was written with Visual Studio Code. My IDE of choice for PowerShell, Nice debugging.

https://github.com/timbrownls20/Demo/tree/master/PowerShell/UNC%20FilePath
As ever the code is on my git hub site

Sorting Unknown Images With PowerShell

The Problem

You’ve got a large amount of binary files. Some are images but you’ve no idea which ones. Some will be gifs, some jpegs, some bmp and other strange formats. They’ve been dropped on you without file extensions. Perhaps they’ve been extracted from a database blob field. Perhaps they have partially retrieved from some backup tapes after a system crash. Perhaps they have been under earthed in an Anglo-Saxon burial mound just outside of Norfolk. However they arrived, it is now your task to sort them by file type.

The Solution

The broad principle here is that image types are identifiable from their first few bytes. So in hex

  • Jpg starts with “FFD8”
  • gif starts with “474946”
  • bmp starts with “424D”
  • png starts with “89504E470D0A1A0A”

We could use any programming language we choose to sort the images on this basis. I’m going to use PowerShell because

  1. This seems to me like a dev ops type of activity. PowerShell scripts are easy to hook into and run in continuous build and the like
  2. I don’t need to write any kind of UI
  3. I’m practising my PowerShell and trying to get better (the real reason)

The Script

This is the entire script with explanatory comments

The core part of the script is

Get-Content gets the content of the file, in this case the first 8 bytes. Each byte is then iterated through and changed into a hexadecimal string (ToString(“X2”)). This is appended to the $FileHeader variable which we use to compare against the known image headers. This allows us to identify which type of image this is. The rest of the script is moving the files around and sorting them into different directories.

If the script is saved into a file e.g. ImageSorter.ps1 it can then be run with the dot sourcing command.

So that’s it, image sorting in a nutshell. Hopefully the above script will be useful for someone, somewhere at some point. Happy unknown image sorting everyone.

Useful Links

File signatures

https://en.wikipedia.org/wiki/List_of_file_signatures
A useful, easy to read list of file signatures which could be used in identifying unknown files. Most are headers but some have an offset which is given. The above script could easily be extended to account for alternative types using this list.

http://www.fileformat.info/ gives a wealth of information on file structure and formats. For the (very) interested here is a detailed breakdown of bmp, gif, jpeg and png files including header information.

http://www.fileformat.info/format/bmp/corion.htm
http://www.fileformat.info/format/jpeg/egff.htm
http://www.fileformat.info/format/gif/egff.htm
http://www.fileformat.info/format/png/egff.htm

Other file types that I didn’t implement such as tiffs  are also detailed.

Alternative implementations

This Stack Overflow answer is a C# implementation of an image sorter should anyone require it. I did pinch the file header information from here (easier than fileformat.info) but the rest is my very own work crafted by my very own coding fingers – promise.

PowerShell Join-Path with multiple parameters

In my day to day job I use a fair amount of PowerShell. Some might say the deployment process is mountains of PowerShell glued together by Team Foundation Server. Cruel but accurate. I don’t make any claims to be a PowerShell expert (I really don’t) so I post this not to show my PS brilliance but rather as a way for me to practice and learn. I appreciate this has been written about elsewhere but as I say it’s just practice and I wrote this without peeking at anybody else’s solution (promise)

Standard Call

Within the mountains of PowerShell I use Join-Path quite a bit. It’s fine for two parameters

Output

But I’m always mildly surprised it can’t cope with more parameters

Output

Boom – unhappy code, unhappy coder.

Piped Call

The most straightforward solution is to pipe the output of one call to a send call and so on. So

Sadly doesn’t work

In order to get this to work the second parameter needs to be explicitly named

Which then works perfectly well

Using parameter array

It feels as if it should/must/ought to be possible to do this without piping – by passing in an array of parameters. Indeed it is. This function will do the trick

To call it it needs to be imported into the shell, There are a number of ways to do this but the easiest is by using the dot operator (dot-sourcing) i.e.

The function is then available to be called

And again works perfectly well

So there you have it, both original and insightful.  Sadly the original is not insightful and the insightful is not original. But I enjoyed writing it!

Note on script permissions

When importing the script about you may see this

scripts is disabled on this system. As part of Microsoft Trustworthy Computing initiative PowerShell is secure by default won’t allow scripts to be run. To bypass this run

This is just for demo purposes though. For production or anything other than quick demos then consider signing scripts (see links below).

Useful Links

Adding functions to Powershell sessions
http://mikepfeiffer.net/2010/06/how-to-add-functions-to-your-powershell-session/

Execution Policies
https://technet.microsoft.com/library/hh847748.aspx

Better ways to work with the PowerShell execution policy
https://technet.microsoft.com/en-us/magazine/2b6812f2-e6e5-41ad-a1c4-6b9e01977cc7