Date: Thu, 30 Jul 2009 11:26:53 +0200
Peter, did you try Satimage.osax's equivalent of offset, namely find
text?
Also, you may be interested in the following excerpt from
Satimage.osax's dictionary:
> normalize unicode v : normalize Unicode text (canonical composition
> or decomposition)
>
> normalize unicode string
> [decomposition boolean] : want canonical decomposition. default:
> false. For example, HFS Plus converts all file names to decomposed
> Unicode, while Macintosh keyboards generally produce precomposed
> Unicode.
Best,
Emmanuel
On Jul 30, 2009, at 9:34 AM, Peter J. Hartmann wrote:
> Somewhere in the more recent OS updates, "offset of" seems to have
> been broken (I'm running 10.5.7, PPC). Some of my scripts have
> recently started to exhibit the following problem.
>
> Try this:
> - In ScriptEditor create the following new script:
>
> offset of "_" in ""
>
> - In the Finder, create a new empty folder or file with a name
> containing a character with a diacritical and a trailing underscore.
> - Copy this file name and paste it between the empty quotation marks
> in your script.
> - Check the return value: every diacritical is counted as an extra
> character, so the offset is off by one for each.
> - To prove it, replace the characters with diacriticals by their
> unaccented forms. Now the result is correct.
>
> The AppleScript Language Guide states on p. 144:
>
> offset compares text as the equals operator does, including
> considering and ignoring conditions.
> The values returned are counted the same way character elements of
> text are counted — for example,
> offset of "c" in "école" is always 2, regardless of whether "école"
> is in Normalization Form
> C or D.
>
> This obviously does not describe reality.
>
> You won't see this if you directly type a string with diacriticals
> etc. into ScriptEditor.
> The problem is that the file system saves
>
> Sjögren
>
> as
>
> Sjo¨gren
>
> internally, and that AS currently offers no way to normalize these
> strings coming from the FS. I know I can do it in Perl and
> Ruby 1.9. Ignoring diacriticals does not help either.
>
> Count characters, however, yields correct results.
>
> Or am I missing something here?
>
> ___ Peter Hartmann ________
>
> mailto:[log in to unmask]
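
For readers who want to see the effect outside AppleScript, here is a minimal sketch in Python (chosen only because its standard library ships a normalizer; it is not part of either message above). It assumes the file name reaches the script in decomposed form, as described for HFS Plus, and simulates that with `unicodedata.normalize`. The sample name "Sjögren_" and all variable names are illustrative assumptions.

```python
import unicodedata

# Simulate a name coming from the file system: HFS Plus stores names
# decomposed (NFD), so "ö" (U+00F6) arrives as "o" + U+0308.
# The sample name is an assumption made up for this illustration.
name_from_fs = unicodedata.normalize("NFD", "Sj\u00f6gren_")

# 1-based position of the underscore, counted per code point.
# The combining diaeresis counts as its own element here.
decomposed_offset = name_from_fs.index("_") + 1

# Canonical composition (NFC) folds "o" + U+0308 back into one "ö".
composed = unicodedata.normalize("NFC", name_from_fs)
composed_offset = composed.index("_") + 1

print(decomposed_offset, composed_offset)  # prints "9 8"
```

Normalizing to the composed form before searching is the same idea as calling a composing normalizer on the file name before taking the offset; whether a given scripting addition behaves exactly this way should be checked against its own dictionary.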