MACSCRPT Archives

January 2011

MACSCRPT@LISTSERV.DARTMOUTH.EDU

Subject:
From: John Delacour <[log in to unmask]>
Reply To: Macintosh Scripting Systems <[log in to unmask]>
Date: Tue, 11 Jan 2011 17:34:24 +0000

At 01:38 -0800 11/01/2011, Walter Ian Kaye wrote:

>Hey JD, you would probably know this procedure: I see a Unicode
>character on a Web page <http://www.ytn.co.kr/>, in this case a black
>triangle (like "[>"), and want to find out what code point it is. So
>I copied to the clipboard and wrote a script to try and convert it
>using something like this:
>
>item 1 of ((((theU as «class ut16») as string) as record) as list)
>
>and then convert the string to hex, but it didn't match the code
>point in the Character Palette (25B6, or possibly 25BA).
>
>Any ideas?


The best idea is almost certainly this:

<http://earthlingsoft.net/UnicodeChecker/>

UnicodeChecker is scriptable.  An invaluable tool.

Perl will do it easily as well.
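
For comparison, here is a minimal sketch of that kind of lookup in Python rather than Perl (the thread shows no Perl code, so this is a substitute, not the method meant above). The helper name describe is hypothetical; ord and unicodedata.name are standard-library calls.

```python
import unicodedata


def describe(text):
    """Return (code point as U+XXXX, Unicode name) for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


if __name__ == "__main__":
    # The two candidate characters mentioned in the question: 25B6 and 25BA.
    for code, name in describe("\u25b6\u25ba"):
        print(code, name)
```

Pasting the copied character into the string literal in place of the escapes should identify which of the two candidates the page actually uses.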


As to getting the individual bytes, this is as far as I have time to 
go at the moment, and it's a lot of typing for very little result:

set _clip to the clipboard
set f to (path to temporary items from user domain as string) & "tmp.txt"
try
   close access file f
end try
open for access file f with write permission
set eof file f to 0
write _clip as «class ut16» to file f
close access file f
set _s to read file f as string
set {_a, _b} to characters 3 through 4 of _s
-- (the first 2 chars are the BOM)
{ASCII number _a, ASCII number _b}
--=> {186, 37}
-- i.e. bytes 0xBA 0x25: low byte first, so presumably little-endian U+25BA
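
The two numbers still have to be combined into a code point. The following is a rough sketch in Python (not AppleScript, and not part of the script above) of how a byte pair from a UTF-16 file with a BOM could be turned back into a code point. The function name code_points_from_utf16 is hypothetical, and the big-endian fallback for data without a BOM is an assumption.

```python
import codecs


def code_points_from_utf16(data):
    """Decode UTF-16 bytes, honouring a leading BOM, and return code points."""
    if data.startswith(codecs.BOM_UTF16_LE):
        text = data[2:].decode("utf-16-le")
    elif data.startswith(codecs.BOM_UTF16_BE):
        text = data[2:].decode("utf-16-be")
    else:
        # Assumption: with no BOM present, treat the data as big-endian.
        text = data.decode("utf-16-be")
    return [f"U+{ord(ch):04X}" for ch in text]


if __name__ == "__main__":
    # A little-endian BOM (FF FE) followed by the bytes 186, 37 from the script.
    print(code_points_from_utf16(bytes([0xFF, 0xFE, 186, 37])))
```

With a little-endian BOM, the pair 186, 37 is read as 0x25BA, which is one of the two code points suggested in the question.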

JD
