@kleneway
Created August 7, 2013 05:10
Revisions

  1. kleneway created this gist Aug 7, 2013.
    31 changes: 31 additions & 0 deletions gistfile1.m
    @@ -0,0 +1,31 @@
    // use the built-in iOS linguistics functionality to stem (lemmatize) the tags
    + (NSMutableArray *)stemTags:(NSMutableArray *)originalTags {
        NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc]
            initWithTagSchemes:[NSArray arrayWithObjects:NSLinguisticTagSchemeLemma, nil]
                       options:(NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerOmitPunctuation)];
        NSMutableArray *stemmedTags = [[NSMutableArray alloc] init];

        // join the tags into a single space-separated string for the tagger
        NSString *allTags = [originalTags componentsJoinedByString:@" "];
        [tagger setString:allTags];
        __block NSUInteger i = 0;
        // loop through each tag and stem it, documentation found here:
        // http://developer.apple.com/library/ios/#documentation/cocoa/reference/NSLinguisticTagger_Class/Reference/Reference.html
        [tagger enumerateTagsInRange:NSMakeRange(0, allTags.length)
                              scheme:NSLinguisticTagSchemeLemma
                             options:(NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerOmitPunctuation)
                          usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
            // tag has been stemmed, add the stemmed version to the list
            if (tag) {
                [stemmedTags addObject:tag];
            }
            // tag was not stemmed, add the original version (lookup by index assumes one token per tag)
            else {
                if (originalTags.count > i) {
                    [stemmedTags addObject:[originalTags objectAtIndex:i]];
                }
            }
            i++;
        }];
        return stemmedTags;
    }