LISTSERV - NISUS Archives - LISTSERV.DARTMOUTH.EDU

NISUS Archives

October 2015

NISUS@LISTSERV.DARTMOUTH.EDU

Subject:	Re: Indexing
From:	Nobumi Iyanaga <[log in to unmask]>
Reply To:	[log in to unmask]
Date:	Sun, 25 Oct 2015 13:30:47 +0900
Content-Type:	text/plain
Parts/Attachments:	text/plain (87 lines)

Hello Philip,

Thank you very much for your reply.

> On Oct 24, 2015, at 3:16 PM, spaelti <[log in to unmask]> wrote:
> 
>> ...
>> 
>> There are cases which can be confusing, for example, I have the name "Yamato" (old name for Japan), and "Yamato-takeru" (the name of a hero in Japanese mythology). For these names, I may have this table:
>> 
>> Yamato	Yamato (old name for Japan)
>> Yamato-takeru	Yamato-takeru (a hero in Japanese mythology)
>> 
>> Then, "Yamato" in "Yamato-takeru" would be indexed twice, both as "Yamato" and "Yamato-takeru". If in the generated index, the page for "Yamato" appears in the entry for "Yamato-takeru", that is not good. Is there any way to avoid this kind of situation…??
> 
> Yes, I just tried this, and this does seem to work this way. Apparently, since hyphen counts as a word boundary, the indexing finds “Yamato” inside “Yamato-takeru” and indexes it twice. I was hoping that by having it index “Yamato-takeru” first this might be avoided, but apparently this doesn’t work. It also doesn’t work to replace the hyphen with a non-breaking hyphen.
> 
> The simplest thing here is to avoid putting such hyphenated items in the word list. Then, when you have indexed using the word list, use Find to find all instances of “Yamato-takeru” and remove the indexing (which will remove their index entry as “Yamato”). Then you can index them again correctly.

Thank you for trying this. Yes, I will do as you indicated.
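
To illustrate the double indexing discussed above outside of Nisus, here is a minimal Python sketch. It is not Nisus macro code and it does not describe how Nisus indexes internally (per the test reported above, listing "Yamato-takeru" first does not help in Nisus). The names WORD_LIST, naive_index and longest_first_index are hypothetical. The sketch only shows why a word-boundary match finds "Yamato" inside "Yamato-takeru", and how a longest-match-first pass that skips overlapping matches would avoid it.

```python
import re

# Hypothetical word list mirroring the table in the message: term -> index topic.
WORD_LIST = {
    "Yamato": "Yamato (old name for Japan)",
    "Yamato-takeru": "Yamato-takeru (a hero in Japanese mythology)",
}


def naive_index(text, word_list):
    """Match every term independently; a hyphen counts as a word boundary."""
    hits = []
    for term, topic in word_list.items():
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", text):
            hits.append((m.start(), m.end(), topic))
    return sorted(hits)


def longest_first_index(text, word_list):
    """Match longer terms first and skip matches that overlap an earlier hit."""
    hits = []
    for term in sorted(word_list, key=len, reverse=True):
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", text):
            if all(m.end() <= s or m.start() >= e for s, e, _ in hits):
                hits.append((m.start(), m.end(), word_list[term]))
    return sorted(hits)


text = "Yamato-takeru left Yamato."
for start, end, topic in naive_index(text, WORD_LIST):
    print("naive:", text[start:end], "->", topic)
for start, end, topic in longest_first_index(text, WORD_LIST):
    print("longest-first:", text[start:end], "->", topic)
```

In the naive pass the first six characters of "Yamato-takeru" are also reported under the "Yamato" topic, which is the unwanted entry; the second pass reports each stretch of text under one topic only.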

> ...
>> 
>> In order to remove index marks from words, we first have to find the words to which index marks were added, but this does not seem easy. Indexed words can be given a color, but it is not very visible. And there is no command such as "Find Next Indexed Word". Is it possible to find these words with a regex?
> 
> I’m not sure if you can use PowerFind to find indexed items. My quick test didn’t work.
> It seems that indexing is handled a bit differently from other styles. The reason is probably that they need to deal with the “Index As” topics. Since you can index any bit of text for multiple topics, this can get quite complicated. Even writing macros to handle this is complicated. Below I’m attaching a macro that will select the next indexed bit of text and show you what it is indexed for.
> 

I tried your macro. It works very well. Thank you very much for it! I will use this macro to work on the indices.

One thing I found using this macro is that there are cases in which a string of words is indexed several times, piece by piece, and even the spaces between the words are indexed. This is perhaps because of a special setup in my table of words to be indexed. For example:

I have a title of an old book:

Kogo shūi 古語拾遺

I index it as "Kogo shūi 古語拾遺"; but as there may be cases in which the title appears only in Roman characters (without the kanji), I added another entry:

Kogo shūi

to be indexed as "Kogo shūi 古語拾遺". Now, when I find the indexed words with your macro, each of the following: "Kogo", " ", "shūi" and "古語拾遺" is indexed separately as "Kogo shūi 古語拾遺" (what is strange is that the first space is indexed, while the second one, between "shūi" and "古語拾遺", is not). In another similar case, it is not the first space but another one that is indexed, and so on. Actually, this makes no difference in the generated indices, so I can leave things as they are, but it is very strange...

Another thing that I find very annoying is that when I select an indexed word or string of words and choose "Index As...", the "indexed as" words don't appear, so it is impossible to know under which words it was indexed, or to edit them. I think this is a major problem, and I will post a feature request to Nisus about it.

Best regards,

Nobumi Iyanaga
Tokyo,
Japan

> ...
> 
> # Macro Select Next Indexed Item
> 
> $doc = Document.active
> $indexStyles = $doc.textIndexStyleNames
> 
> # Get the name of the index style
> $indexName = Active Text Index Name
> 
> # Go through the document and look for indexed items
> $sel = TextSelection.active
> $loc = $sel.bound
> $txt = $sel.text
> 
> while $loc < $txt.length
>     $attr = $txt.attributesAtIndex $loc
>     $range = $txt.rangeOfAttributesAtIndex $loc
>     $topic = $attr.textIndexTopicsForStyleName($indexName)
>     # If an indexed item is found select it and quit
>     if Defined $topic
>         $indexed = TextSelection.new $txt, $range
>         $doc.setSelection $indexed
>         Prompt $indexed.substring, 'Indexed as :' & $topic
>         Exit
>     end
>     $loc = $range.bound
> end
> 
> # If nothing is found
> Prompt 'No indexed items found.'
> 
> # end of macro
> 
>
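
The scan performed by the macro above can be modelled outside Nisus with a short Python sketch. This is only an illustration under stated assumptions: the text is represented as a hypothetical list of attribute runs (start, length, topics), and next_indexed_run is a made-up name, not a Nisus API. It shows why text whose attributes are split into several runs, as in the "Kogo shūi 古語拾遺" case described in the message, is reported piece by piece.

```python
from typing import List, Optional, Tuple

# Hypothetical model of an attribute run: (start, length, topics or None).
Run = Tuple[int, int, Optional[List[str]]]


def next_indexed_run(runs: List[Run], start: int) -> Optional[Run]:
    """Walk run by run from `start`; return the first run carrying topics."""
    end = runs[-1][0] + runs[-1][1] if runs else 0
    loc = start
    while loc < end:
        # Find the run containing the current location.
        run = next(r for r in runs if r[0] <= loc < r[0] + r[1])
        if run[2]:
            return run
        # Jump to the end of this run, as the macro does with $range.bound.
        loc = run[0] + run[1]
    return None


TOPIC = ["Kogo shūi 古語拾遺"]
# Assumed runs for the text "Kogo shūi 古語拾遺 is old", as observed in the message:
runs: List[Run] = [
    (0, 4, TOPIC),   # "Kogo"
    (4, 1, TOPIC),   # first space (indexed)
    (5, 4, TOPIC),   # "shūi"
    (9, 1, None),    # second space (not indexed)
    (10, 4, TOPIC),  # "古語拾遺"
    (14, 7, None),   # " is old"
]

loc = 0
while (hit := next_indexed_run(runs, loc)) is not None:
    print(hit)
    loc = hit[0] + hit[1]
```

Calling the function repeatedly from the end of each hit, as one does by re-running the macro, yields four separate hits for what is a single index entry in the generated index.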
