MACSCRPT Archives

September 2009

MACSCRPT@LISTSERV.DARTMOUTH.EDU

Content-Type: text/plain; charset=UTF-8
Sender: Macintosh Scripting Systems <[log in to unmask]>
Subject:
From: "Mark J. Reed" <[log in to unmask]>
Date: Thu, 10 Sep 2009 22:09:21 -0400
In-Reply-To:
MIME-Version: 1.0
Reply-To: Macintosh Scripting Systems <[log in to unmask]>
Parts/Attachments: text/plain (46 lines)

On Thu, Sep 10, 2009 at 8:31 PM, Ed Stockly <[log in to unmask]> wrote:
> Thanks for the help

You're welcome!

> It works, and I think I even know why.

Always a bonus. :)

> The \| is used by the shell to indicate a pipe is part of the string, right? With out it
> numbers like 696123| would also be returned (the \\ is needed to tell applescript that \ is
> part of the shell script)

Right.  Since we're talking about the egrep syntax, a pipe is a
special character meaning "or", so you have to escape it with a
backslash if you want it to just match a literal pipe.
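
To make that concrete, here is a small sketch using made-up sample data
(the three records below are hypothetical, not from Ed's file).  It uses
"grep -E", which is the same syntax as egrep:

```shell
# Hypothetical pipe-delimited sample records (an assumption for illustration).
sample='123|red
123456|red
456|blue'

# Escaped pipe: "\|" matches a literal "|" right after 123.
literal=$(printf '%s\n' "$sample" | grep -E '^123\|')

# Unescaped pipe: "|" means "or", so this reads as
# "starts with 123" OR "contains blue" and matches every line here.
alternation=$(printf '%s\n' "$sample" | grep -E '^123|blue')

printf 'escaped:\n%s\n\nunescaped:\n%s\n' "$literal" "$alternation"
```

Only the escaped version limits the match to the record whose first
field is exactly 123.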

The grep command is the other way around, which can be confusing.
That's because grep, unlike egrep, originally had no support for
alternation at all.  When it was later added as a feature by the GNU
project, they used backslashed versions of the special characters to
avoid breaking old shell scripts that relied on the fact that e.g. "|"
wasn't a special character to grep.
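
A quick side-by-side may help here.  This is only a sketch with
hypothetical sample lines, and the backslashed "\|" in plain grep is
assumed to be available as it is in GNU grep; other greps may differ:

```shell
# Hypothetical sample lines (an assumption for illustration).
sample='cat
dog
cat|dog'

# Plain grep: a bare "|" is an ordinary character.
bre_literal=$(printf '%s\n' "$sample" | grep 'cat|dog')

# Plain grep, GNU-style extension: backslashed "\|" means "or".
# -x requires the whole line to match.
bre_or=$(printf '%s\n' "$sample" | grep -x 'cat\|dog')

# Extended syntax (egrep / grep -E): the roles are reversed.
ere_or=$(printf '%s\n' "$sample" | grep -E -x 'cat|dog')
ere_literal=$(printf '%s\n' "$sample" | grep -E 'cat\|dog')

printf 'grep    |  : %s\n' "$bre_literal"
printf 'grep -E \\| : %s\n' "$ere_literal"
```

With the assumptions above, bre_or and ere_or pick out the same two
lines, and bre_literal and ere_literal both pick out only the line
containing an actual pipe.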

Perl 5's regex syntax attempts to make the backslash vs no-backslash
decision somewhat more consistent and less arbitrary: in general,
punctuation is special by itself, literal when backslashed; letters
and numbers are literal by themselves, special when backslashed.

> So now I will write the results to a new file, then grep a pattern from that file that
> matches the third delimited item.

As Shane said, you can just pipe the output of one egrep into another.
Or construct a pattern that matches only the lines you care about in
the first place, something like:

egrep '^(first1|first2|first3)\|[^|]*\|(third1|third2|third3)\|'  myfile


which matches, at the start of a line, any of the first set of things,
followed by a pipe, followed by any amount of text that doesn't include
a pipe, followed by another pipe, followed by any of the second set of
things and one more pipe.
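
Here is a rough sketch comparing the two approaches on hypothetical
data.  The records and the first1/third1 values are placeholders, and
"grep -E" stands in for egrep:

```shell
# Hypothetical records; fields are separated by "|" (an assumption).
sample='first1|x|third1|rest
first2|y|other|rest
nope|z|third2|rest
first3|w|third3|rest'

# One pass: first field, any second field, third field, all in one pattern.
one_pass=$(printf '%s\n' "$sample" |
  grep -E '^(first1|first2|first3)\|[^|]*\|(third1|third2|third3)\|')

# Two passes: filter on the first field, then pipe into a second
# grep that skips two fields and checks the third.
two_pass=$(printf '%s\n' "$sample" |
  grep -E '^(first1|first2|first3)\|' |
  grep -E '^[^|]*\|[^|]*\|(third1|third2|third3)\|')

printf '%s\n' "$one_pass"
```

Both variables should end up holding the same lines.  Note that the
second grep in the two-pass version still has to be anchored with "^"
and skip two fields, otherwise it could match the value in some later
field.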

-- 
Mark J. Reed <[log in to unmask]>
