Friday, July 01, 2011

Alternation without Capture/Extraction/Selection

This drove me crazy for longer than I wanted (since I would want that for varying amounts of time) so I will note it here for other frustrated people to find.

I had a regular expression and wanted to capture part of the match in $1. The pattern also contained an alternation that needed grouping with parentheses. It kept capturing the alternation in $1, which I didn't want to capture at all.

This is an example of what I had at first that didn't do what I wanted:

$string =~ /bytes\s+=\s+\d+\.?\d*(K|M)?\s+\(\s*(\d+\.?\d*)\%/;

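The above code was giving me either 'K' or 'M' in $1 when one of them was present, instead of what I wanted, which was the second grouping (\d+\.?\d*). That value was landing in $2, because capture groups are numbered by their opening parenthesis from left to right. I just needed to know how to stop the first capture, since things like K|M? and K?|M? without parentheses didn't work right either: alternation has the lowest precedence in a regex, so a bare | splits the entire pattern into two alternatives rather than choosing between just K and M.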
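To see why dropping the parentheses fails, here is a minimal sketch. The sample input is my own invention (the real data isn't shown in this post), so treat the exact string as an assumption. It shows the bare | dividing the whole pattern in two:

```perl
use strict;
use warnings;

# Hypothetical input line; the real data is not shown in the post.
my $string = 'bytes = 12.5K ( 42.7%)';

# Without parentheses, | splits the whole pattern in two:
#   alternative 1: bytes\s+=\s+\d+\.?\d*K
#   alternative 2: M?\s+\(\s*(\d+\.?\d*)\%
my $matched  = $string =~ /bytes\s+=\s+\d+\.?\d*K|M?\s+\(\s*(\d+\.?\d*)\%/;
my $whole    = $&;   # the text that actually matched
my $captured = $1;   # undef: alternative 1 matched, and it contains no group

print "matched text: $whole\n";                                  # bytes = 12.5K
print "\$1: ", (defined $captured ? $captured : 'undef'), "\n";  # undef
```

The match succeeds, but only on the first alternative, so the percentage never reaches $1.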

After a lot of online searching using probably the wrong query terms, I found the concept of "non-capturing groupings", which are written (?:regex). They group just like ordinary parentheses but don't store anything in $1, $2, and so on.

I changed my code to this and finally got what I wanted:

$string =~ /bytes\s+=\s+\d+\.?\d*(?:K|M)?\s+\(\s*(\d+\.?\d*)\%/;

This way the 'K' or 'M' isn't captured and I get the second grouping (\d+\.?\d*) stored in $1.
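Here is a small side-by-side sketch of the two patterns. The sample string is made up for illustration, and the variable names are just placeholders, not what my real script used:

```perl
use strict;
use warnings;

# Hypothetical sample line; the real input is not shown in the post.
my $string = 'bytes = 12.5M ( 42.7%)';

# Capturing group around the alternation: the unit lands in $1,
# and the percentage is pushed to $2.
my ($unit, $pct_capturing);
if ($string =~ /bytes\s+=\s+\d+\.?\d*(K|M)?\s+\(\s*(\d+\.?\d*)\%/) {
    ($unit, $pct_capturing) = ($1, $2);
}

# Non-capturing group (?:K|M): the alternation is still grouped,
# but the percentage is now the first (and only) capture.
my $pct;
if ($string =~ /bytes\s+=\s+\d+\.?\d*(?:K|M)?\s+\(\s*(\d+\.?\d*)\%/) {
    $pct = $1;
}

print "capturing:     \$1=$unit \$2=$pct_capturing\n";  # $1=M $2=42.7
print "non-capturing: \$1=$pct\n";                      # $1=42.7
```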
