MACSCRPT Archives

July 2009

MACSCRPT@LISTSERV.DARTMOUTH.EDU

Subject:
From:
Emmanuel LEVY <[log in to unmask]>
Reply To:
Macintosh Scripting Systems <[log in to unmask]>
Date:
Thu, 30 Jul 2009 11:26:53 +0200
Peter, did you try Satimage.osax's equivalent of offset, namely find
text?

Also, you may be interested in the following excerpt from the
Satimage.osax dictionary:

> normalize unicode v : normalize Unicode text (canonical composition  
> or decomposition)
>
> normalize unicode string
> [decomposition boolean] : want canonical decomposition. default:  
> false. For example, HFS Plus converts all file names to decomposed  
> Unicode, while Macintosh keyboards generally produce precomposed  
> Unicode.
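
To illustrate why normalizing first should help, here is a small sketch in Python (not AppleScript, and not using Satimage.osax). It assumes the file system hands back the name in decomposed form, as described in the quoted message below, and it searches by code point. The variable names are made up for the example.

```python
import unicodedata

# "Sjögren_" in decomposed form: "o" followed by U+0308 COMBINING DIAERESIS.
# Assumption: this is what a name read back from HFS Plus looks like.
decomposed = "Sjo\u0308gren_"

# The same name in precomposed form (NFC): a single U+00F6 for "ö".
precomposed = unicodedata.normalize("NFC", decomposed)

# str.find is 0-based, so add 1 to get a 1-based offset like AppleScript's.
offset_decomposed = decomposed.find("_") + 1
offset_precomposed = precomposed.find("_") + 1

print(offset_decomposed)   # 9: the combining mark is counted separately
print(offset_precomposed)  # 8: matches what a reader would count
```

With Satimage.osax installed, passing the file name through normalize unicode (leaving decomposition at its default of false) before calling offset would presumably have the same effect, although I have not verified this on 10.5.7.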


Best,
Emmanuel

On Jul 30, 2009, at 9:34 AM, Peter J. Hartmann wrote:

> Somewhere in the more recent OS updates, "offset of" seems to have
> broken (I'm running 10.5.7, PPC). Several of my scripts have recently
> started to exhibit the following problem.
>
> Try this:
> - In ScriptEditor create the following new script:
>
> offset of "_" in ""
>
> - In the Finder, create a new empty folder or file whose name
> contains a character with a diacritic and ends with an underscore.
> - Copy this file name and paste it between the empty quotation marks
> in your script.
> - Check the return value: every diacritic is counted as an extra
> character, so the offset is off by one for each.
> - To confirm, replace the characters that carry diacritics with their
> plain forms. Now the result is correct.
>
> The AppleScript Language Guide states on p. 144:
>
> offset compares text as the equals operator does, including  
> considering and ignoring conditions.
> The values returned are counted the same way character elements of  
> text are counted — for example,
> offset of "c" in "école" is always 2, regardless of whether "école"  
> is in Normalization Form
> C or D.
>
> This obviously does not describe reality.
>
> You won't see this if you type a string with diacritics directly
> into Script Editor.
> The problem is that the file system stores
>
> Sjögren
>
> as
>
> Sjo¨gren
>
> internally, and that there is currently no way to normalize strings
> coming from the file system in AppleScript itself. I know I can do it
> in Perl and Ruby 1.9. Ignoring diacriticals does not help either.
>
> Counting characters, however, yields correct results.
>
> Or am I missing something here?
>
> ___ Peter Hartmann ________
>
> mailto:[log in to unmask]
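
The behaviour the Language Guide describes (the same offset whether the text is in Normalization Form C or D) can be sketched as follows, again in Python rather than AppleScript. The function visible_offset is a hypothetical helper, and skipping code points with a non-zero combining class is only an approximation of how "characters" are counted; it does not implement full grapheme cluster rules.

```python
import unicodedata

def visible_offset(needle: str, haystack: str) -> int:
    """Return the 1-based position of needle in haystack, counting a base
    letter and its combining marks as one character. Returns 0 if the
    needle is absent."""
    index = haystack.find(needle)
    if index < 0:
        return 0
    # Count only the non-combining code points that precede the match.
    preceding = haystack[:index]
    return sum(1 for ch in preceding if unicodedata.combining(ch) == 0) + 1

nfc = unicodedata.normalize("NFC", "e\u0301cole")  # precomposed "école"
nfd = unicodedata.normalize("NFD", nfc)            # decomposed "école"

print(visible_offset("c", nfc))  # 2
print(visible_offset("c", nfd))  # 2
```

Counted this way, the position of "c" in "école" is 2 in both forms, which is what the guide promises and what the reported behaviour on 10.5.7 apparently fails to deliver.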
