Skip to content

Instantly share code, notes, and snippets.

@suribes
Forked from swayson/lsa_hack.r
Created May 4, 2016 21:52
Show Gist options
  • Save suribes/2c96e9a7520eb3b16102c3fbb421f04c to your computer and use it in GitHub Desktop.
Save suribes/2c96e9a7520eb3b16102c3fbb421f04c to your computer and use it in GitHub Desktop.

Revisions

  1. @swayson swayson revised this gist Apr 30, 2014. No changes.
  2. @rpietro rpietro revised this gist Dec 13, 2013. No changes.
  3. @rpietro rpietro revised this gist Dec 13, 2013. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion lsa_hack.r
    Original file line number Diff line number Diff line change
    @@ -4,10 +4,13 @@
    # install.packages("ggplot2")
    # install.packages("lsa")
    # install.packages("scatterplot3d")
    #install.packages("SnowballC")
    #if !(require('SnowballC')) then install.packages("SnowballC")
    library(tm)
    library(ggplot2)
    library(lsa)
    library(scatterplot3d)
    library(SnowballC)



    @@ -102,7 +105,7 @@ corpus <- tm_map(corpus, removePunctuation)
    corpus <- tm_map(corpus, function(x) removeWords(x, stopwords("english")))

    # error below, can't stem
    #corpus <- tm_map(corpus, stemDocument, language = "english")
    corpus <- tm_map(corpus, stemDocument, language = "english")
    corpus


  4. @rpietro rpietro revised this gist Sep 3, 2013. No changes.
  5. @rpietro rpietro revised this gist Sep 3, 2013. No changes.
  6. @rpietro rpietro revised this gist Sep 3, 2013. 1 changed file with 19 additions and 3 deletions.
    22 changes: 19 additions & 3 deletions lsa_hack.r
    Original file line number Diff line number Diff line change
    @@ -62,9 +62,25 @@ experiences just as we experience them. That is to say, we proceed from the
    experience. Thus, Husserl and Merleau-Ponty spoke of pure description of
    lived experience. (2) We interpret a type of experience by relating it to
    relevant features of context. In this vein, Heidegger and his followers
    spoke of hermeneutics, the art of interpretation in context, especially social and linguistic context. (3) We analyze the form of a type of experience. In the end, all the classical phenomenologists practiced analysis of experience, factoring out notable features for further elaboration.",

    "These traditional methods have been ramified in recent decades, expanding the methods available to phenomenology. Thus: (4) In a logico-semantic model of phenomenology, we specify the truth conditions for a type of thinking (say, where I think that dogs chase cats) or the satisfaction conditions for a type of intention (say, where I intend or will to jump that hurdle). (5) In the experimental paradigm of cognitive neuroscience, we design empirical experiments that tend to confirm or refute aspects of experience (say, where a brain scan shows electrochemical activity in a specific region of the brain thought to subserve a type of vision or emotion or motor control). This style of “neurophenomenology” assumes that conscious experience is grounded in neural activity in embodied action in appropriate surroundings — mixing pure phenomenology with biological and physical science in a way that was not wholly congenial to traditional phenomenologists.",
    spoke of hermeneutics, the art of interpretation in context, especially
    social and linguistic context. (3) We analyze the form of a type of experience.
    In the end, all the classical phenomenologists practiced analysis of
    experience, factoring out notable features for further elaboration.",

    "These traditional methods have been ramified in recent decades, expanding
    the methods available to phenomenology. Thus: (4) In a logico-semantic model
    of phenomenology, we specify the truth conditions for a type of thinking
    (say, where I think that dogs chase cats) or the satisfaction conditions for
    a type of intention (say, where I intend or will to jump that hurdle). (5)
    In the experimental paradigm of cognitive neuroscience, we design empirical
    experiments that tend to confirm or refute aspects of experience (say,
    where a brain scan shows electrochemical activity in a specific region
    of the brain thought to subserve a type of vision or emotion or motor
    control). This style of “neurophenomenology” assumes that conscious
    experience is grounded in neural activity in embodied action in appropriate
    surroundings — mixing pure phenomenology with biological and physical
    science in a way that was not wholly congenial to traditional
    phenomenologists.",

    "What makes an experience conscious is a certain awareness one has of the experience while living through or performing it. This form of inner awareness has been a topic of considerable debate, centuries after the issue arose with Locke's notion of self-consciousness on the heels of Descartes' sense of consciousness (conscience, co-knowledge). Does this awareness-of-experience consist in a kind of inner observation of the experience, as if one were doing two things at once? (Brentano argued no.) Is it a higher-order perception of one's mind's operation, or is it a higher-order thought about one's mental activity? (Recent theorists have proposed both.) Or is it a different form of inherent structure? (Sartre took this line, drawing on Brentano and Husserl.) These issues are beyond the scope of this article, but notice that these results of phenomenological analysis shape the characterization of the domain of study and the methodology appropriate to the domain. For awareness-of-experience is a defining trait of conscious experience, the trait that gives experience a first-person, lived character. It is that lived character of experience that allows a first-person perspective on the object of study, namely, experience, and that perspective is characteristic of the methodology of phenomenology.",

  7. @rpietro rpietro revised this gist Sep 3, 2013. 1 changed file with 62 additions and 19 deletions.
    81 changes: 62 additions & 19 deletions lsa_hack.r
    Original file line number Diff line number Diff line change
    @@ -1,4 +1,4 @@
    # script based on http://goo.gl/YbQyAQ
    # script stolen from http://goo.gl/YbQyAQ

    # install.packages("tm")
    # install.packages("ggplot2")
    @@ -14,17 +14,55 @@ library(scatterplot3d)
    #------------------------------------------------------------------------------

    # 1. Prepare data from http://goo.gl/1RB32f
    text <- c("The discipline of phenomenology is defined by its domain of study, its methods, and its main results.",

    "Phenomenology studies structures of conscious experience as experienced from the first-person point of view, along with relevant conditions of experience. The central structure of an experience is its intentionality, the way it is directed through its content or meaning toward a certain object in the world.",

    "We all experience various types of experience including perception, imagination, thought, emotion, desire, volition, and action. Thus, the domain of phenomenology is the range of experiences including these types (among others). Experience includes not only relatively passive experience as in vision or hearing, but also active experience as in walking or hammering a nail or kicking a ball. (The range will be specific to each species of being that enjoys consciousness; our focus is on our own, human, experience. Not all conscious beings will, or will be able to, practice phenomenology, as we do.)",

    "Conscious experiences have a unique feature: we experience them, we live through them or perform them. Other things in the world we may observe and engage. But we do not experience them, in the sense of living through or performing them. This experiential or first-person feature — that of being experienced — is an essential part of the nature or structure of conscious experience: as we say, “I see / think / desire / do …” This feature is both a phenomenological and an ontological feature of each experience: it is part of what it is for the experience to be experienced (phenomenological) and part of what it is for the experience to be (ontological).",

    "How shall we study conscious experience? We reflect on various types of experiences just as we experience them. That is to say, we proceed from the first-person point of view. However, we do not normally characterize an experience at the time we are performing it. In many cases we do not have that capability: a state of intense anger or fear, for example, consumes all of one's psychic focus at the time. Rather, we acquire a background of having lived through a given type of experience, and we look to our familiarity with that type of experience: hearing a song, seeing a sunset, thinking about love, intending to jump a hurdle. The practice of phenomenology assumes such familiarity with the type of experiences to be characterized. Importantly, also, it is types of experience that phenomenology pursues, rather than a particular fleeting experience — unless its type is what interests us.",

    "Classical phenomenologists practiced some three distinguishable methods. (1) We describe a type of experience just as we find it in our own (past) experience. Thus, Husserl and Merleau-Ponty spoke of pure description of lived experience. (2) We interpret a type of experience by relating it to relevant features of context. In this vein, Heidegger and his followers spoke of hermeneutics, the art of interpretation in context, especially social and linguistic context. (3) We analyze the form of a type of experience. In the end, all the classical phenomenologists practiced analysis of experience, factoring out notable features for further elaboration.",
    text <- c("The discipline of phenomenology is defined by its domain of study,
    its methods, and its main results.",

    "Phenomenology studies structures of conscious experience as experienced from
    the first-person point of view, along with relevant conditions of experience.
    The central structure of an experience is its intentionality, the way it is
    directed through its content or meaning toward a certain object in the world.",

    "We all experience various types of experience including perception,
    imagination, thought, emotion, desire, volition, and action. Thus, the domain
    of phenomenology is the range of experiences including these types (among
    others). Experience includes not only relatively passive experience as in
    vision or hearing, but also active experience as in walking or hammering a
    nail or kicking a ball. (The range will be specific to each species of being
    that enjoys consciousness; our focus is on our own, human, experience. Not
    all conscious beings will, or will be able to, practice phenomenology, as
    we do.)",

    "Conscious experiences have a unique feature: we experience them, we live
    through them or perform them. Other things in the world we may observe and
    engage. But we do not experience them, in the sense of living through or
    performing them. This experiential or first-person feature — that of being
    experienced — is an essential part of the nature or structure of conscious
    experience: as we say, “I see / think / desire / do …” This feature is both
    a phenomenological and an ontological feature of each experience: it is part
    of what it is for the experience to be experienced (phenomenological) and part
    of what it is for the experience to be (ontological).",

    "How shall we study conscious experience? We reflect on various types of
    experiences just as we experience them. That is to say, we proceed from the
    first-person point of view. However, we do not normally characterize an
    experience at the time we are performing it. In many cases we do not have
    that capability: a state of intense anger or fear, for example, consumes
    all of one's psychic focus at the time. Rather, we acquire a background
    of having lived through a given type of experience, and we look to our
    familiarity with that type of experience: hearing a song, seeing a sunset,
    thinking about love, intending to jump a hurdle. The practice of
    phenomenology assumes such familiarity with the type of experiences to
    be characterized. Importantly, also, it is types of experience that
    phenomenology pursues, rather than a particular fleeting experience
    — unless its type is what interests us.",

    "Classical phenomenologists practiced some three distinguishable methods.
    (1) We describe a type of experience just as we find it in our own (past)
    experience. Thus, Husserl and Merleau-Ponty spoke of pure description of
    lived experience. (2) We interpret a type of experience by relating it to
    relevant features of context. In this vein, Heidegger and his followers
    spoke of hermeneutics, the art of interpretation in context, especially social and linguistic context. (3) We analyze the form of a type of experience. In the end, all the classical phenomenologists practiced analysis of experience, factoring out notable features for further elaboration.",

    "These traditional methods have been ramified in recent decades, expanding the methods available to phenomenology. Thus: (4) In a logico-semantic model of phenomenology, we specify the truth conditions for a type of thinking (say, where I think that dogs chase cats) or the satisfaction conditions for a type of intention (say, where I intend or will to jump that hurdle). (5) In the experimental paradigm of cognitive neuroscience, we design empirical experiments that tend to confirm or refute aspects of experience (say, where a brain scan shows electrochemical activity in a specific region of the brain thought to subserve a type of vision or emotion or motor control). This style of “neurophenomenology” assumes that conscious experience is grounded in neural activity in embodied action in appropriate surroundings — mixing pure phenomenology with biological and physical science in a way that was not wholly congenial to traditional phenomenologists.",

    @@ -38,49 +76,54 @@ df <- data.frame(text, view, stringsAsFactors = FALSE)
    df



    #------------------------------------------------------------------------------

    # prepare corpus
    corpus <- Corpus(VectorSource(df$text))
    corpus <- tm_map(corpus, tolower)
    corpus <- tm_map(corpus, removePunctuation)
    corpus <- tm_map(corpus, function(x) removeWords(x, stopwords("english")))

    # error below
    #?stemDocument
    # error below, can't stem
    #corpus <- tm_map(corpus, stemDocument, language = "english")
    corpus




    #------------------------------------------------------------------------------

    # 2. MDS with raw term-document matrix compute distance matrix
    # MDS with raw term-document matrix compute distance matrix
    td.mat <- as.matrix(TermDocumentMatrix(corpus))
    td.mat
    dist.mat <- dist(t(as.matrix(td.mat)))
    dist.mat # check distance matrix



    #------------------------------------------------------------------------------

    # MDS
    fit <- cmdscale(dist.mat, eig = TRUE, k = 2)
    points <- data.frame(x = fit$points[, 1], y = fit$points[, 2])
    ggplot(points, aes(x = x, y = y)) + geom_point(data = points, aes(x = x, y = y,
    color = df$view)) + geom_text(data = points, aes(x = x, y = y - 0.2, label = row.names(df)))

    color = df$view)) + geom_text(data = points, aes(x = x, y = y - 0.2, label =
    row.names(df)))




    #------------------------------------------------------------------------------

    # 3. MDS with LSA
    # MDS with LSA
    td.mat.lsa <- lw_bintf(td.mat) * gw_idf(td.mat) # weighting
    lsaSpace <- lsa(td.mat.lsa) # create LSA space
    dist.mat.lsa <- dist(t(as.textmatrix(lsaSpace))) # compute distance matrix
    dist.mat.lsa # check distance mantrix


    #------------------------------------------------------------------------------

    # MDS
    fit <- cmdscale(dist.mat.lsa, eig = TRUE, k = 2)
    points <- data.frame(x = fit$points[, 1], y = fit$points[, 2])
    @@ -90,7 +133,7 @@ ggplot(points, aes(x = x, y = y)) + geom_point(data = points, aes(x = x, y = y,

    #------------------------------------------------------------------------------


    # plot
    fit <- cmdscale(dist.mat.lsa, eig = TRUE, k = 3)
    colors <- rep(c("blue", "green", "red"), each = 3)
    scatterplot3d(fit$points[, 1], fit$points[, 2], fit$points[, 3], color = colors,
  8. @rpietro rpietro revised this gist Sep 3, 2013. 1 changed file with 27 additions and 14 deletions.
    41 changes: 27 additions & 14 deletions lsa_hack.r
    Original file line number Diff line number Diff line change
    @@ -1,24 +1,37 @@
    # script from http://goo.gl/YbQyAQ
    # script based on http://goo.gl/YbQyAQ

    # load required libraries
    install.packages(c("tm", "lsa"))
    install.packages("scatterplot3d")
    # install.packages("tm")
    # install.packages("ggplot2")
    # install.packages("lsa")
    # install.packages("scatterplot3d")
    library(tm)
    library(ggplot2)
    library(lsa)
    library(scatterplot3d)



    # 1. Prepare mock data
    text <- c("transporting food by cars will cause global warming. so we should go local.",
    "we should try to convince our parents to stop using cars because it will cause global warming.",
    "some food, such as mongo, requires a warm weather to grow. so they have to be transported to canada.",
    "a typical electronic circuit can be built with a battery, a bulb, and a switch.",
    "electricity flows from batteries to the bulb, just like water flows through a tube.",
    "batteries have chemical energe in it. then electrons flow through a bulb to light it up.",
    "birds can fly because they have feather and they are light.", "why some birds like pigeon can fly while some others like chicken cannot?",
    "feather is important for birds' fly. if feather on a bird's wings is removed, this bird cannot fly.")
    #------------------------------------------------------------------------------

    # 1. Prepare data from http://goo.gl/1RB32f
    text <- c("The discipline of phenomenology is defined by its domain of study, its methods, and its main results.",

    "Phenomenology studies structures of conscious experience as experienced from the first-person point of view, along with relevant conditions of experience. The central structure of an experience is its intentionality, the way it is directed through its content or meaning toward a certain object in the world.",

    "We all experience various types of experience including perception, imagination, thought, emotion, desire, volition, and action. Thus, the domain of phenomenology is the range of experiences including these types (among others). Experience includes not only relatively passive experience as in vision or hearing, but also active experience as in walking or hammering a nail or kicking a ball. (The range will be specific to each species of being that enjoys consciousness; our focus is on our own, human, experience. Not all conscious beings will, or will be able to, practice phenomenology, as we do.)",

    "Conscious experiences have a unique feature: we experience them, we live through them or perform them. Other things in the world we may observe and engage. But we do not experience them, in the sense of living through or performing them. This experiential or first-person feature — that of being experienced — is an essential part of the nature or structure of conscious experience: as we say, “I see / think / desire / do …” This feature is both a phenomenological and an ontological feature of each experience: it is part of what it is for the experience to be experienced (phenomenological) and part of what it is for the experience to be (ontological).",

    "How shall we study conscious experience? We reflect on various types of experiences just as we experience them. That is to say, we proceed from the first-person point of view. However, we do not normally characterize an experience at the time we are performing it. In many cases we do not have that capability: a state of intense anger or fear, for example, consumes all of one's psychic focus at the time. Rather, we acquire a background of having lived through a given type of experience, and we look to our familiarity with that type of experience: hearing a song, seeing a sunset, thinking about love, intending to jump a hurdle. The practice of phenomenology assumes such familiarity with the type of experiences to be characterized. Importantly, also, it is types of experience that phenomenology pursues, rather than a particular fleeting experience — unless its type is what interests us.",

    "Classical phenomenologists practiced some three distinguishable methods. (1) We describe a type of experience just as we find it in our own (past) experience. Thus, Husserl and Merleau-Ponty spoke of pure description of lived experience. (2) We interpret a type of experience by relating it to relevant features of context. In this vein, Heidegger and his followers spoke of hermeneutics, the art of interpretation in context, especially social and linguistic context. (3) We analyze the form of a type of experience. In the end, all the classical phenomenologists practiced analysis of experience, factoring out notable features for further elaboration.",

    "These traditional methods have been ramified in recent decades, expanding the methods available to phenomenology. Thus: (4) In a logico-semantic model of phenomenology, we specify the truth conditions for a type of thinking (say, where I think that dogs chase cats) or the satisfaction conditions for a type of intention (say, where I intend or will to jump that hurdle). (5) In the experimental paradigm of cognitive neuroscience, we design empirical experiments that tend to confirm or refute aspects of experience (say, where a brain scan shows electrochemical activity in a specific region of the brain thought to subserve a type of vision or emotion or motor control). This style of “neurophenomenology” assumes that conscious experience is grounded in neural activity in embodied action in appropriate surroundings — mixing pure phenomenology with biological and physical science in a way that was not wholly congenial to traditional phenomenologists.",

    "What makes an experience conscious is a certain awareness one has of the experience while living through or performing it. This form of inner awareness has been a topic of considerable debate, centuries after the issue arose with Locke's notion of self-consciousness on the heels of Descartes' sense of consciousness (conscience, co-knowledge). Does this awareness-of-experience consist in a kind of inner observation of the experience, as if one were doing two things at once? (Brentano argued no.) Is it a higher-order perception of one's mind's operation, or is it a higher-order thought about one's mental activity? (Recent theorists have proposed both.) Or is it a different form of inherent structure? (Sartre took this line, drawing on Brentano and Husserl.) These issues are beyond the scope of this article, but notice that these results of phenomenological analysis shape the characterization of the domain of study and the methodology appropriate to the domain. For awareness-of-experience is a defining trait of conscious experience, the trait that gives experience a first-person, lived character. It is that lived character of experience that allows a first-person perspective on the object of study, namely, experience, and that perspective is characteristic of the methodology of phenomenology.",

    "Conscious experience is the starting point of phenomenology, but experience shades off into less overtly conscious phenomena. As Husserl and others stressed, we are only vaguely aware of things in the margin or periphery of attention, and we are only implicitly aware of the wider horizon of things in the world around us. Moreover, as Heidegger stressed, in practical activities like walking along, or hammering a nail, or speaking our native tongue, we are not explicitly conscious of our habitual patterns of action. Furthermore, as psychoanalysts have stressed, much of our intentional mental activity is not conscious at all, but may become conscious in the process of therapy or interrogation, as we come to realize how we feel or think about something. We should allow, then, that the domain of phenomenology — our own experience — spreads out from conscious experience into semi-conscious and even unconscious mental activity, along with relevant background conditions implicitly invoked in our experience. (These issues are subject to debate; the point here is to open the door to the question of where to draw the boundary of the domain of phenomenology.)")

    view <- factor(rep(c("view 1", "view 2", "view 3"), each = 3))
    view
    df <- data.frame(text, view, stringsAsFactors = FALSE)
    @@ -32,7 +45,7 @@ corpus <- tm_map(corpus, removePunctuation)
    corpus <- tm_map(corpus, function(x) removeWords(x, stopwords("english")))

    # error below
    ?stemDocument
    #?stemDocument
    #corpus <- tm_map(corpus, stemDocument, language = "english")
    corpus

  9. @rpietro rpietro created this gist Sep 3, 2013.
    86 changes: 86 additions & 0 deletions lsa_hack.r
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,86 @@
    # script from http://goo.gl/YbQyAQ

    # load required libraries
    install.packages(c("tm", "lsa"))
    install.packages("scatterplot3d")
    library(tm)
    library(ggplot2)
    library(lsa)
    library(scatterplot3d)



    # 1. Prepare mock data
    text <- c("transporting food by cars will cause global warming. so we should go local.",
    "we should try to convince our parents to stop using cars because it will cause global warming.",
    "some food, such as mongo, requires a warm weather to grow. so they have to be transported to canada.",
    "a typical electronic circuit can be built with a battery, a bulb, and a switch.",
    "electricity flows from batteries to the bulb, just like water flows through a tube.",
    "batteries have chemical energe in it. then electrons flow through a bulb to light it up.",
    "birds can fly because they have feather and they are light.", "why some birds like pigeon can fly while some others like chicken cannot?",
    "feather is important for birds' fly. if feather on a bird's wings is removed, this bird cannot fly.")
    view <- factor(rep(c("view 1", "view 2", "view 3"), each = 3))
    view
    df <- data.frame(text, view, stringsAsFactors = FALSE)
    df


    # prepare corpus
    corpus <- Corpus(VectorSource(df$text))
    corpus <- tm_map(corpus, tolower)
    corpus <- tm_map(corpus, removePunctuation)
    corpus <- tm_map(corpus, function(x) removeWords(x, stopwords("english")))

    # error below
    ?stemDocument
    #corpus <- tm_map(corpus, stemDocument, language = "english")
    corpus




    #------------------------------------------------------------------------------

    # 2. MDS with raw term-document matrix compute distance matrix
    td.mat <- as.matrix(TermDocumentMatrix(corpus))
    td.mat
    dist.mat <- dist(t(as.matrix(td.mat)))
    dist.mat # check distance matrix



    # MDS
    fit <- cmdscale(dist.mat, eig = TRUE, k = 2)
    points <- data.frame(x = fit$points[, 1], y = fit$points[, 2])
    ggplot(points, aes(x = x, y = y)) + geom_point(data = points, aes(x = x, y = y,
    color = df$view)) + geom_text(data = points, aes(x = x, y = y - 0.2, label = row.names(df)))





    #------------------------------------------------------------------------------

    # 3. MDS with LSA
    td.mat.lsa <- lw_bintf(td.mat) * gw_idf(td.mat) # weighting
    lsaSpace <- lsa(td.mat.lsa) # create LSA space
    dist.mat.lsa <- dist(t(as.textmatrix(lsaSpace))) # compute distance matrix
    dist.mat.lsa # check distance mantrix


    # MDS
    fit <- cmdscale(dist.mat.lsa, eig = TRUE, k = 2)
    points <- data.frame(x = fit$points[, 1], y = fit$points[, 2])
    ggplot(points, aes(x = x, y = y)) + geom_point(data = points, aes(x = x, y = y,
    color = df$view)) + geom_text(data = points, aes(x = x, y = y - 0.2, label = row.names(df)))


    #------------------------------------------------------------------------------


    fit <- cmdscale(dist.mat.lsa, eig = TRUE, k = 3)
    colors <- rep(c("blue", "green", "red"), each = 3)
    scatterplot3d(fit$points[, 1], fit$points[, 2], fit$points[, 3], color = colors,
    pch = 16, main = "Semantic Space Scaled to 3D", xlab = "x", ylab = "y",
    zlab = "z", type = "h")