juji.func.nlp

Built-in functions that do natural language processing on user input

complex-content?

(complex-content? rep input)

Check whether input contains more clauses than the :complex-content-threshold value set in the script config.

input is a vector of tokens. This function is usually used with captured content in REP rules.

contain-email

(contain-email rep)

Check if the current user input contains a valid email.

Example

;; Use this function with `extract-email`, as in the following rule.
;; The rule first checks whether the user input contains an email;
;; if it does, REP responds with a customized acknowledgement.
[(contain-email)]
[(str "Thank you. We have your email as " (extract-email) ".")
 (record-answer ?q (input-text))]

contain-name?

(contain-name? rep)
(contain-name? _ tokens)

Check if the current user input contains at least one person’s name. If a vector of tokens is passed in, check that vector instead.

contain-phone-number

(contain-phone-number rep)

Check if the current user input contains a valid US phone number.

Example

;; Use this function with `extract-phone-number`, as in the following rule.
;; The rule first checks whether the user input contains a phone number;
;; if it does, REP responds with a customized acknowledgement.
[(contain-phone-number)]
[(str "Thank you. We have your phone number as " (extract-phone-number) ".")
 (record-answer ?q (input-text))]

contains-multiple-questions?

(contains-multiple-questions? _ input)

Returns true iff the given input has more than one question.

input needs to contain at least 2 sentences for this to be true.

input is a vector of tokens. This function is usually used with captured content in REP rules.

contains-non-English?

(contains-non-English? _ input)

Checks if there are non-English words in the given input.

input is a vector of tokens. This function is usually used with captured content in REP rules.

Example

;; The trigger below would fire if the words captured in ?something contain non-English words
[I like to read (?something +) (contains-non-English? ?something)]

extract-age

(extract-age rep)

Extract age from the current user input.

extract-email

(extract-email rep)
(extract-email _ text)

Extract email from the current user input. Return nil if there’s no valid email found.

extract-first-hobby

(extract-first-hobby rep)

Extract the first hobby from the current user input. If the input has no activity, return nil.

extract-first-name

(extract-first-name rep)

Extract the first name from the current user input. Assume the format “first middle last”.
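To illustrate the “first middle last” assumption, here is a minimal plain-Clojure sketch. The helper name `first-name-sketch` is hypothetical and this is not how Juji implements extract-first-name; it only shows what taking the first whitespace-separated part of a full name looks like.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper, not part of juji.func.nlp.
;; Assumes the name is already isolated and written as "first middle last".
(defn first-name-sketch [full-name]
  (first (str/split (str/trim full-name) #"\s+")))

(first-name-sketch "Ada Marie Lovelace") ;; => "Ada"
```

Under this assumption, a name given in a different order (for example family name first) would yield the wrong part.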

extract-name

(extract-name rep)

Extract a name from the current user input.

extract-names

(extract-names rep)

Extract people’s names from the current user input. Return nil if there’s no name; otherwise return a list of strings, one per extracted name.

extract-phone-number

(extract-phone-number rep)

Extract phone number from the current user input. Return nil if there’s no valid US phone number found.

extract-why-u-here

(extract-why-u-here rep)

Extract information from the current user input, assuming it answers the greeting question about the visitor’s objective, e.g., “what brings you here?”

get-email-domain

(get-email-domain _ email-text)

Get the domain of the given email.
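As a rough illustration of what “domain” means here, the following plain-Clojure sketch takes everything after the last “@”. The helper name `email-domain-sketch` is hypothetical; Juji’s own implementation may validate or normalize the address differently.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper, not part of juji.func.nlp.
;; Returns the text after the last "@", or nil when there is no "@".
(defn email-domain-sketch [email-text]
  (when-let [idx (str/last-index-of email-text "@")]
    (subs email-text (inc idx))))

(email-domain-sketch "jane@example.com") ;; => "example.com"
(email-domain-sketch "not-an-email")     ;; => nil
```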

get-possible-company-from-domain

(get-possible-company-from-domain _ email-text)

Make an educated guess about the name of the company the given email is affiliated with.

get-top-ranked-faq-answer-from-index

deprecated

(get-top-ranked-faq-answer-from-index rep index-id)
(get-top-ranked-faq-answer-from-index rep index-id threshold)

Return the most relevant answer to the current user input from the given FAQ index. When threshold is not provided, it defaults to 0.85. When the relevance score of the top answer is less than threshold, return nil.

get-top-ranked-qna-answer-info-from-index

(get-top-ranked-qna-answer-info-from-index rep index-id)
(get-top-ranked-qna-answer-info-from-index rep index-id threshold)

Return the most relevant answer info for the current user input from the given FAQ index. When threshold is not provided, it defaults to 0.85. When the relevance score of the top answer is less than threshold, return nil.

get-top-ranked-qna-answer-text-from-index

(get-top-ranked-qna-answer-text-from-index rep index-id)
(get-top-ranked-qna-answer-text-from-index rep index-id threshold)

Return the most relevant answer text for the current user input from the given FAQ index. When threshold is not provided, it defaults to 0.85. When the relevance score of the top answer is less than threshold, return nil.
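The threshold behavior shared by the get-top-ranked functions can be sketched in plain Clojure. Everything here is an assumption for illustration: the helper name `top-answer-sketch`, the `:answer` and `:score` keys, and the sample data are hypothetical and do not reflect how Juji stores or ranks index entries.

```clojure
;; Hypothetical helper, not part of juji.func.nlp.
;; `ranked` is assumed to be a collection of {:answer ... :score ...} maps.
(defn top-answer-sketch
  ([ranked] (top-answer-sketch ranked 0.85))       ; default threshold
  ([ranked threshold]
   (let [best (first (sort-by :score > ranked))]   ; highest score first
     ;; nil when nothing matched or the top score is below the threshold
     (when (and best (>= (:score best) threshold))
       (:answer best)))))

(def sample
  [{:answer "We close at 5pm." :score 0.62}
   {:answer "We open at 9am."  :score 0.91}])

(top-answer-sketch sample)      ;; => "We open at 9am."
(top-answer-sketch sample 0.95) ;; => nil
```

Because a below-threshold result is nil, a rule that calls one of these functions should be prepared to handle nil, for example by falling back to another response.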

has-adj-definition

(has-adj-definition _ word)

Check whether the given word or phrase has an adjective definition.

word is a string or a vector of strings. However, when it is a vector, only the first item of the vector will be used.
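The “string or vector” argument convention can be pictured with a small plain-Clojure sketch. The helper name `first-word-sketch` is hypothetical and only mirrors the rule described above: a vector contributes its first item, a string is used as is.

```clojure
;; Hypothetical helper, not part of juji.func.nlp.
(defn first-word-sketch [word]
  (if (vector? word)
    (first word)   ; only the first item of a vector is used
    word))

(first-word-sketch "bright")         ;; => "bright"
(first-word-sketch ["bright" "red"]) ;; => "bright"
```

In practice this means that passing a multi-word capture checks only its first token.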

has-negative-sentiment

(has-negative-sentiment rep)

Check if current user input has negative sentiment.

has-noun-definition

(has-noun-definition _ word)

Check whether the given word or phrase has a noun definition.

word is a string or a vector of strings. However, when it is a vector, only the first item of the vector will be used.

has-positive-sentiment

(has-positive-sentiment rep)

Check if current user input has positive sentiment.

has-verb-definition

(has-verb-definition _ word)

Check whether the given word or phrase has a verb definition.

word is a string or a vector of strings. However, when it is a vector, only the first item of the vector will be used.

ignore-case-equals?

(ignore-case-equals? rep phrase1 phrase2)

Check if two phrases are equal when case is ignored.
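A case-insensitive comparison of this kind can be sketched in plain Clojure as follows. The helper name `ignore-case-equals-sketch?` is hypothetical, and the sketch assumes both phrases are strings; the actual function may accept other representations such as token vectors.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper, not part of juji.func.nlp.
;; Assumes both phrases are strings.
(defn ignore-case-equals-sketch? [phrase1 phrase2]
  (= (str/lower-case phrase1) (str/lower-case phrase2)))

(ignore-case-equals-sketch? "New York" "new york") ;; => true
(ignore-case-equals-sketch? "New York" "Newark")   ;; => false
```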

informativeness

(informativeness _ input)

Calculate the informativeness of the given input in terms of surprisal.

input is a vector of tokens. This function is usually used with captured content in REP rules.

is-gibberish

(is-gibberish rep)

Check if the current user input is considered gibberish (in English).

is-name

deprecated

(is-name _ input)

Check whether the given input is a name.

input is a vector of tokens. This function is usually used with captured content in REP rules.

Example

;; The trigger below would fire if the words captured in ?someone are a person's name
[I like to play with (?someone +) (is-name ?someone)]

is-name?

(is-name? _ input)

Check whether the given input is a name. This is stricter than the deprecated function is-name.

input is a vector of tokens. This function is usually used with captured content in REP rules.

Example

;; The trigger below would fire if the words captured in ?someone are a person's name
[I like to play with (?someone +) (is-name? ?someone)]

is-similar-to-any

deprecated

lemmatize

(lemmatize _ s)

list-extracted-hobby-info

(list-extracted-hobby-info rep)

Extract hobby info from the current user input and list it, separated by commas.

max-similarity-score

(max-similarity-score rep anchors)

Calculate the sentence similarity of user input against the anchors, and return the highest similarity score. The function returns 0 if there’s an exception.

anchors is a vector of sentence strings.

Example

;; say user input is "My favorite is Diablo"
;; the following rule will trigger, because the max-similarity-score function returns 0.85
;; thus the inequality will return true
[(> (max-similarity-score ["I like Diablo the most." "I usually play Diablo."]) 0.8)]

non-trivial-noun-verb-extract-from-input

(non-trivial-noun-verb-extract-from-input rep)

Extract non-trivial nouns and verbs from the current user input. Return a vector containing the extracted tokens.

Example

;; Say the current user input is "He likes to play basketball."
;; The following pattern stores `["likes" "play" "basketball"]` in variable `?extracted`
[(<- ?extracted (non-trivial-noun-verb-extract-from-input))]

present-qna-answer-in-context

(present-qna-answer-in-context rep matched-qna)
(present-qna-answer-in-context rep matched-qna in-context-response)

Present the Q&A answer, taking the current context into account. If the Q&A is the same as one in the current mini-agenda entrance context and in-context-response is supplied, in-context-response is used instead of the Q&A answer.

present-qna-answer-text

(present-qna-answer-text rep qna-answer-text)

Present the Q&A answer text. This also shows the multi-step Q&A topics.

similarity-exceeds-threshold

(similarity-exceeds-threshold rep anchors)
(similarity-exceeds-threshold rep anchors threshold)

Return true if the user input is similar to any of the anchors; return false otherwise.

This is a wrapper that applies a threshold to max-similarity-score. The function returns true if (max-similarity-score anchors) is greater than threshold. If threshold is not provided, 0.8 is used.
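The thresholding step described above can be sketched in plain Clojure. The helper name `exceeds-threshold-sketch?` is hypothetical, and it takes an already computed score rather than calling max-similarity-score, so it only illustrates the comparison and the default value.

```clojure
;; Hypothetical helper, not part of juji.func.nlp.
;; `score` stands in for the value max-similarity-score would return.
(defn exceeds-threshold-sketch?
  ([score] (exceeds-threshold-sketch? score 0.8)) ; default threshold
  ([score threshold] (> score threshold)))        ; strictly greater

(exceeds-threshold-sketch? 0.85)     ;; => true
(exceeds-threshold-sketch? 0.85 0.9) ;; => false
(exceeds-threshold-sketch? 0.8)      ;; => false
```

Note that the comparison is strict, so a score exactly equal to the threshold does not count as similar.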

user-question?

(user-question? rep)
(user-question? _ tokens)

Check if the current user input or the given tokens contain a question.